Split a string around a given delimiter executing as many splits as there are instances of the
delimiter. For instance, if we split ‘a, b, c, d’
around the ‘, ‘
as the delimiter, we would get
four sub strings a
, b
, c
and d
.
We then cover four extensions that may be applied to the above scenario:
‘-’
character or underscore ‘_’
character.This section is complemented by:
We wish to split a string around a given delimiter executing as many splits as there are instances of the delimiter.
In this example, we wish to split each element in the column col_1
around hyphen ‘-’
characters
and add a new column to the table df
for each part where the nth part goes into the nth new
column.
SELECT col_1,
SPLIT(col_1,'-') AS col_2
FROM table_1;
Here is how this works:
SPLIT()
to split each element in a string
column, which here is the column col_1
, into parts around a given delimiter, which here is the
hyphen character ‘-’
.table_1
with an
additional column added holding an array of the parts of the elements of the column col_1
.We wish to split a string into individual words.
SELECT col_1,
REGEXP_EXTRACT_ALL(col_1, r'\b\w+\b') AS col_2
FROM table_1;
Here is how this works:
This code works similarly to the Free Split scenario above except that we use the
regular expression '\W+'
as the delimiter which denotes possible separators between words.
We wish to split a string into individual characters.
SELECT col_1,
SPLIT(col_1,'') AS col_2
FROM table_1;
Here is how this works:
This code works similarly to the Split scenario above except that we use ''
as the
delimiter which denotes splitting around individual characters.
We wish to split a string around any of a set of delimiters
In this example, we wish to split each element of the column col_1 into its parts which we wish to
return as new columns. The delimiter can be a hyphen ‘-’
or an underscore ‘_’
.
SELECT col_1,
SPLIT(REGEXP_REPLACE(col_1, '-|_', '-'), '-') as col_2
FROM table_1;
Here is how this works:
REGEXP_REPLACE()
to replace multiple delimiters with one delimiter. In this
example we use '-|_'
regular expression to replace '-'
and '_'
with a '-'
.
See Replacing.SPLIT()
to split each element in a string column, which here is the
column col_1
, into parts around a given delimiter. See Split above.[-_]
.Given a column of strings to split and a column of equal size of delimiters, we wish to split each element in the first column around the corresponding delimiter in the second column. This is often a case when we wish to split the values of a given column of a table using delimiters provided by another column.
SELECT col_1,
col_2,
SPLIT(col_1,col_2) AS col_3
FROM table_1;
Here is how this works:
SPLIT()
to split each element in a string column into parts around a
given delimiter. See Split above.SPLIT()
takes a column as an input, which here are the values of the
column col_1
, and the delimiter pattern, which here are the values of the column col_2
.
Therefore, the function str_split()
will use the corresponding value of the column col_2
as
the delimiter when splitting a value in the column col_1
.