We wish to drop rows that contain missing values.
We cover the following scenarios:
We wish to drop rows where any column from a selected set of columns has a missing value.
In this example, we wish to drop any rows from the table table_1
where either the column col_1
or the column col_2
has a missing value.
SELECT *
FROM table_1
WHERE NOT (col_2 IS NULL OR col_3 IS NULL);
Here is how this works:
IS NULL
expression. It
returns TRUE
if the value of col_1
is missing and FALSE
otherwise.OR
to check if col_2
or col_3
has missing value. This returns TRUE
for rows
where neither of the columns has a missing value. We use WHERE NOT
to select rows that
complement this condition.
See Common Scenarios: Complement
and Multiple Filters: OR.table_1
but with any row,
where col_2
or col_3
has a missing value, dropped.col_2 IS NOT NULL AND col_3 IS NOT NULL
.Extension: Each Selected Column
We wish to drop rows if and only if all the values of the specified columns are missing.
SELECT *
FROM table_1
WHERE col_2 IS NULL
AND col_3 IS NULL;
Here is how this works:
IS NULL
expression. It
returns TRUE
if the value of col_1
is missing and FALSE
otherwise.AND
to check if both col_2
and col_3
has missing value. This returns TRUE
for
rows where both of the columns has a missing value. We use WHERE
to select rows that meet this
condition. See Multiple Filters: AND.table_1
but with any
row where all the specified columns, which here are col_1
and col_2
, are missing is dropped.We wish to drop rows where any column has a missing value.
SELECT *
FROM table_1
WHERE TO_JSON_STRING(table_1) NOT LIKE '%:null%';
Here is how this works:
TO_JSON_STRING
function is called on each row in table_1
, which converts the row to a JSON
string representation. For example in a table with 3 columns , the row will be repressed as
"{"col_1":"str value","col_2":null,"col_3": int_value}"
, where col_2
here has a missing value.NOT LIKE '%:null%'
to return TRUE if it
does not contain the substring :null,
.
See String Detecting: Complement.:null
to avoid including string columns that contain the
substring 'null'
.table_1
but with any row that has
one or more missing values dropped.We wish to drop rows where all columns have missing value (a value of NULL
).
SELECT *
FROM table_1
WHERE (SELECT COUNT(*)
FROM UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(table_1), ':null'))) < 3
Here is how this works:
TO_JSON_STRING()
function to convert each row
to a Json string.REGEXP_EXTRACT_ALL()
to extract all matches for the pattern ':null'
. We use the
substring :null
to avoid including string columns that contain the substring 'null'
.REGEXP_EXTRACT_ALL()
returns an array of matches, we UNNSET
the array and count the elements
which in this case will be the number of columns with missing values.table_1
but with any
row where all columns have missing values NULL
is dropped.We wish to drop rows that have more than a specified number of columns with missing values.
In this example, we wish to retain rows where the values of less than three columns are
missing NULL
.
SELECT *
FROM table_1
WHERE (SELECT COUNT(*)
FROM UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(table_1), ':null'))) < 3
Here is how this works:
This works similar to the Empty Rows scenario above except we filter using any number instead of the number of columns in the table.