By Location

We wish to extract a substring, from a subject string, given the locations of its start and end characters.

We will cover the following scenarios:

  • Character: Extract a single character given its index in the parent string.
  • Substring: Extract a substring given its start index in the parent string and its length.
  • Multiple Substrings: Extract multiple substrings given their start location in the parent string and their length.
  • Location Columns: The start and/or end locations are provided as values of columns in a table.

Character

We wish to extract a single character from a string given its index.

In this example, we wish to extract the first character from each value of the column col_1.

SELECT col_1,
       SUBSTRING(col_1, 1, 1) AS col_2
FROM table_1;

Here is how this works:

  • We use the function SUBSTRING() to extract the first character from the left from each element of the column col_1.
  • We pass to SUBSTRING() the following:
    • The column of strings, which in this case is the column col_1, from whose elements we wish to extract particular characters
    • The start location and the length of the substring we wish to extract. Since in this case, we wish to extract a single character the length is 1.
  • The output table will have the same number of rows as the input table with an added column col_2 which contains the first character of the corresponding element of the column col_1.

Extension: Right Indexing

We wish to extract a character from a string by indexing relative to the end (right side) of the string.

In this example, we wish to extract the last character from each value of the column col_1.

SELECT col_1,
       SUBSTRING(col_1, -1, 1) AS col_2
FROM table_1;

Here is how this works:

  • We can use negative indices to index relative to the right end of the string. In this case we use -1 to refer to the first character from the end i.e. the last character.

Substring

We wish to extract a substring from a string given its start location and length.

In this example, we wish to extract the 4 characters substring starting at the 2nd character from each value of the column col_1.

SELECT col_1,
       SUBSTRING(col_1, 2, 4) AS col_2
FROM table_1;

Here is how this works:

  • We use the function SUBSTRING() to extract a substring from each element of the column col_1.
  • We pass to SUBSTRING() the following:
    • The column of strings, which in this case is the column col_1, from whose elements we wish to extract particular characters.
    • The start location and the length of the substring we wish to extract.
  • The output table will have the same number of rows as the input table with an added column col_2 which contains the extracted substring (characters 2 through 5) from the corresponding element of the column col_1.

Extension: Start and End Indices

We wish to extract a substring from a string given its start and end location indices.

In this example, we wish to extract the substring starting at the 2nd character and ending at the 5th character from each value of the column col_1.

SELECT col_1,
       SUBSTRING(col_1, 2, (5-2)+1) AS col_2
FROM table_1;

Here is how this works:

  • The code is exactly the same as the primary solution, we just need to calculate the length using the start and end locations.
  • In this case (5-2)+1 = 4 gives us the length of the substring given the start and end positions.

Extension: nth to End

We wish to drop the first n characters and keep the rest.

SELECT col_1,
       SUBSTRING(col_1, 3) AS col_2
FROM table_1;

Here is how this works:

  • Our objective is to drop the first n characters (which in this example is 2) and keep the rest; i.e. to capture the characters between start=3 and the last character.
  • When length is not specified, the function produces a substring that starts at the specified position and ends at the last character.

Extension: Start to nth

We wish to drop the last n characters and keep the rest.

SELECT col_1,
       SUBSTRING(col_1, 1, LENGTH(col_1) - 2) AS col_2
FROM table_1;

Here is how this works:

  • Our objective is to drop the last n characters (which in this example is 2) and keep the rest;
  • We pass 1 as the start position, and we subtract the number of characters we wish to drop from the length of the string using LENGTH(col_1) - 2.

Extension: nth from End to End

We wish to keep the last n characters and drop the rest.

SELECT col_1,
       SUBSTRING(col_1, -3) AS col_2
FROM table_1;

Here is how this works:

  • Our objective is to keep the last n characters (which in this example is 3) and drop the rest;
  • We pass -3 as the start position, and we don't pass the length to get all characters after the start position.

Multiple Substrings

We wish to extract multiple substrings given their start and end location indices.

In this example, we wish to extract three substrings given their start and end locations from each value of the column col_1. We wish to obtain the extracted substrings as three new columns named a, b, and c.

SELECT col_1,
       SUBSTRING(col_1, 1, 2) AS a,
       SUBSTRING(col_1, 2, 3) AS b,
       SUBSTRING(col_1, 3, 4) AS c
FROM table_1;

Here is how this works:

  • We can make individual calls to SUBSTRING() with the start position and length for each substring.

Location Columns

We wish to extract substrings from each element of a string column by the location where the start and/or end locations are provided as values of columns in a table.

In this example, we wish to create a new column col_4 where each element is a substring extracted from the corresponding value of the column col_1 where the start and end location are provided by the corresponding values of the columns col_2 and col_3 respectively.

SELECT *,
       SUBSTRING(col_1, col_2, (col_3 - col_2) + 1) AS col_4
FROM table_1;

Here is how this works:

  • We use SUBSTRING() to extract a substring given its locations.
  • We pass col_2 as the start position and (col_3 - col_2) + 1 as the substring length.
SQL
I/O