In this section, we cover the five common string combining scenarios that we come across in data manipulation contexts:
We wish to concatenate the values of one column to the corresponding values of another column.
In this example, we wish to concatenate the values of the column col_1
to the corresponding values
of the column col_2
with a comma as a separator into a new column col_3
.
SELECT col_1,
col_2,
CONCAT(col_1, ', ', col_2) AS col_3
FROM table_1;
Here is how this works:
CONCAT()
function to concatenate the values in the column col_1
with the
corresponding values in the column col_2
with a
separator ', ‘
in between.CONCAT()
function takes any number of values to concatenate.Alternative: Using concatenation operator
SELECT col_1,
col_2,
col_1 || ', ' || col_2 AS col_3
FROM table_1;
Here is how this works:
||
to concatenate the values in the column col_1
with the
corresponding values in the column col_2
with a
separator ', ‘
in between.Extension: Handling Missing Values
We wish to concatenate the values of two columns and substitute missing values with a given token ( string).
In this example, we wish to concatenate the values of the column col_1
to the corresponding values
of the column col_2
with a comma as a separator into a new column col_3
. We wish to substitute
missing values with ‘-’
.
SELECT col_1,
col_2,
CONCAT(IFNULL(col_1, '-'), ', ', IFNULL(col_2, '-')) AS col_3
FROM table_1;
Here is how this works:
CONCAT()
with columns that have missing values, any concatenation operation where any
of
the values being concatenated is a missing value would have returned a missing value as the result
of the operation.‘-’
as a replacement for missing values.IFNULL()
to replace missing values in the columns to be
concatenated, which here are col_1
and col_2
, with a particular replacement string, which here
is '-'
.We wish to concatenate all values of a string column into one string with an optional separator.
In this example, we wish to concatenate the values of the column col_2
into a single string for
each group where the groups are defined by the values of the column col_1
.
SELECT col_1,
STRING_AGG(col_2, ',') col_3
FROM table_1
GROUP BY col_1;
Here is how this works:
GROUP BY
clause groups the rows of a table by the values in one or more columns. The rows
are then assigned to a group based on the values in the grouping column(s). In this example, the
rows are being grouped by the values in the col_1 column.STRING_AGG()
function is used to concatenate a list of strings, separated by a specified
delimiter (','
in this example).We wish to create a new string column by inserting the value of one or more string variables into a string template.
In this example, we wish to create a new column col_3
from the values of the columns col_1
and col_2
where each value follows the template '{col_1} vs {col_2}'
for the corresponding rows.
SELECT col_1,
col_2,
FORMAT('%s vs %s', col_1, col_2) AS col_3
FROM table_1;
Here is how this works:
FORMAT()
function to create a new column col_3 by inserting values of the
columns col_1
and col_2
into the template '%s vs %s'
.FORMAT()
is quite convenient to work with. We use it as follows:'%s vs %s'
.We wish to create a new string column by repeating the values in an existing string column a set number of times.
In this example, we wish to create a new column col_3
where each value is the repetition (
concatenation to itself) of the corresponding value in the col_2
3 times.
SELECT col_1,
col_2,
REPEAT(col_2, 3) AS col_3
FROM table_1;
Here is how this works:
REPEAT()
function to create a new column col_3
by repeating (concatenating to
itself) each value in the column col_2
a
specified number of times.REPEAT()
function takes an integer argument, which in this case is 3, specifying the number
of times each value of the calling string column will be repeated.table_1
with an additional columncol_3
holding the values
from the column col_2
each repeated 3 times.We wish to pad a string with a specified character for up to a given width.
In this example, we wish to create a new column col_3
that is a padding of the values of the
column col_2
to get each to 4 characters.
SELECT col_1,
col_2,
LPAD(col_2, 4, '*') col_3
FROM table_1;
Here is how this works:
LPAD()
to pad the values of the column col_2
by adding the character *
to the
left
so each string value is 4
characters long.LPAD
method takes three arguments:original_value
which is the input we wish to pad in this case col_2
.return_length
: which is the desired length of the output padded string and which in this
case is 4
. If return_length
is less than or equal to the original_value
length, the
input will be truncated to match return_length
. For example, LPAD('hello world', 7)
returns 'hello w'.pattern
which specified the pattern to use for padding the input string to match the
given return_length
and which in this case is '*'
.RPAD()
achieves the same but by padding on the right hand side.table_1
with an additional columncol_3
holding the values
from the column col_2
with padding applied to the left side of each value.