In some situations, we don’t merely wish to add new columns corresponding to some or all parts resulting from the split of a string column. Rather, we wish to apply some processing on the parts to generate the desired output.
In this section we cover two common scenarios of processing parts produced by string splitting:
This section is complemented by:
We wish to create new columns by concatenating some of the parts resulting from splitting the values of a string column.
In this example, we will pick the third and first parts resulting from splitting a string column and concatenate them into a new column.
SELECT col_1,
CONCAT(SPLIT(col_1, '_')[SAFE_OFFSET(2)],
'-',
SPLIT(col_1, '_')[SAFE_OFFSET(0)]) AS col_2
FROM table_1;
Here is how this works:
SPLIT()
to return an array of parts for each
element in col_1
.
See Split.SAFE_OFFSET(2)
and SAFE_OFFSET(0)
respectively.CONCAT()
to concatenate the selected vectors element-wise to produce a
new column as output and store that in a new column col_2
.We wish to set the data type of parts resulting from splitting the values of a string column.
SELECT col_1,
SAFE_CAST(SPLIT(col_1, ' ')[SAFE_OFFSET(1)] AS INT64) AS col_2,
SPLIT(col_1, ' ')[SAFE_OFFSET(2)] AS col_3
FROM table_1;
Here is how this works:
SAFE_CAST()
to covert second element of the array to a numeric data type.We wish to apply some arbitrary processing logic to the parts resulting from splitting the values of a string column. In other words, we wish to apply some custom logic to determine how to use the output of a string split operation.
In this example, the elements of column col_1
are strings of the form '4,3'
. We
wish to multiply the numbers within the comma-separated segments and add up the results. So for this
case, we would get 4 * 2 = 8
.
SELECT col_1,
SAFE_CAST(SPLIT(col_1, ',')[SAFE_OFFSET(0)] AS INT64) *
SAFE_CAST(SPLIT(col_1, ',')[SAFE_OFFSET(1)] AS INT64) AS col_2
FROM table_1;
Here is how this works:
str_sSPLITplit()
to break down each element of the column col_1
into parts
around the delimiter ‘,‘
. See Split.SAFE_OFFSET()
and we cast it to an integer.
See Select
and Set Data Type.