We wish to carry out data transformation operations conditionally. In other words, the data transformation operation we wish to carry out is not the same for all rows, rather it is different based on one or more conditions.
We will cover two scenarios:
We wish to perform a data transformation operation that involves one condition.
In this example, we wish to create a new numerical column col_13
whose value is a function of the
value of the numerical column col_1
. The value of col_13
should be 50% higher than the
corresponding value of col_1
if the value of col_1
is greater than 1000. Otherwise, the value
of col_13
should be just 20% higher than the corresponding value of col_1
.
SELECT *,
IF(col_1 > 1000, col_1 * 1.5, col_1 * 1.2) AS col_13
FROM refcon.dataset.table_1;
Here is how this works:
IF()
conditional expression.IF()
three arguments as follows:col_1 > 1000
.TRUE
, which in this example is col_1 * 1.5
.FALSE
, which in this example is col_1 * 1.2
.col_13
will take the value produced by col_1 * 1.5
for rows where
the condition col_1 > 10
is TRUE
, and the values produced by col_1 * 1.2
elsewhere.IF()
and other conditional expressions.We wish to perform a data transformation operation that involves multiple conditions.
In this example, we wish to create a new numerical column col_13
whose value is a function of the
values of the columns col_1
and col_11
and where the function that maps col_1
and col_11
to col_13
is different for different ranges of values of col_1
and col_11
.
SELECT *,
CASE
WHEN col_1 > 1000 AND col_11 > 10 THEN col_1 * 1.5
WHEN col_1 > 1500 AND col_11 > 20 THEN col_1 * 1.2
WHEN col_1 > 2000 AND col_11 > 50 THEN col_1 * 1.1
ELSE col_1 END AS col_13
FROM refcon.dataset.table_1;
Here is how this works:
CASE
expression.IF()
(see “One Condition” above) to perform a
conditional data transformation operation that involves multiple conditions. The recommended
approach is to use CASE
expression.CASE
statement accepts any number of two-sided formulas of the
form < WHEN condition THEN expression >
wherecol_1 > 1000 AND col_11 > 10
.col_1 * 1.5
.ELSE
is a catch-all case that is applied if all prior cases do not match.CASE
and other conditional expressions.