The most common data transformations involve applying a function to one or more columns of a table to output a new column.
We will cover three common scenarios for creating new columns:
We wish to create a new column that is entirely composed of the same value.
In this example, we create a new column col_3
that has the value 2022 for all rows of the
table table_1
.
SELECT *, 2022 AS col_3
FROM table_1;
Here is how this works:
AS col_3
will add a new column col_3
to the table table_1
or will add it as col_3_1
column if a column named col_3
already exists.col_3
will be set to the scaler value 2022.We wish to create a new column by applying arithmetic or logical operations on existing columns.
In this example, we create a new column col_3
that is the ratio of two existing columns col_1
and col_2
.
SELECT *, SAFE_DIVIDE(col_1, col_2) AS col_3
FROM table_1;
Here is how this works:
SAFE_DIVIDE(col_1, col_2)
: The data transformation to be carried out which is to divide the
value of the column col_1
for each row by the value of the column col_2
of the same row.SAFE_DIVDE
is equivalent to the division operator ( X / Y ), but returns NULL if an error
occurs, such as a division by zero error.AS col_3
will add a new column col_3
to the table table_1
or will add it as col_3_1
column if a column named col_3
already exists.+
, -
, *
, and /
) or comparison operations (such as >
, <
, and ==
)
applied to numeric data which we cover
in Numeric Operations.Built-In Functions
We wish to create a new column by applying one or more built-in functions on existing columns.
In this example, we create a new column col_3
that is the log of an existing column col_2
.
SELECT *, LOG10(col_2) AS col_3
FROM table_1;
Here is how this works:
AS col_13
will add a new column col_3
to the table table_1
or will add it as col_3_1
column if a column named col_3
already exists.LOG10(col_2)
is the data transformation to be carried out which is to apply the LOG10
function to take the log base 10 of each value of the column col_2
.User-Defined Functions
We wish to create a new column by applying one or more user-defined functions (UDF) to existing columns.
In this example, we create a new column col_13
via a UDF that accepts a string
value and returns a string column that contains only digits.
CREATE FUNCTION IF NOT EXISTS
refcon.dataset.extract_digits(v_col_value STRING) RETURNS string
AS (
REGEXP_REPLACE(v_col_value, '[^0-9 ]', '')
);
SELECT *, refcon.dataset.extract_digits(col_9) AS col_13
FROM refcon.dataset.table_1;
Here is how this works:
extract_digits()
which accepts a string value as an argument.REGEXP_REPLACE(v_col_value, '[^0-9 ]', '')
, inside the function, use regular expressions to
extract digits from a string.SELECT
just
like we would a built-in function (see “Built-In Functions” above).