Sorting by Implicitly Selected Columns

We wish to identify the columns to use for sorting, not by explicitly spelling out their names but by specifying criteria satisfied by the desired columns.

In this example, we wish to sort the rows of the table dummy_table by all the columns whose names start with col_1. Note: We are assuming that the order of the columns is appropriate for the task at hand.

BEGIN
    DECLARE v_table_name STRING;
    DECLARE v_columns STRING;

    SET v_table_name = 'dummy_table';
    SET v_columns = (WITH table_columns AS
                              (SELECT column_name
                               FROM refcon.examples.INFORMATION_SCHEMA.COLUMNS
                               WHERE table_name = v_table_name
                                 AND STARTS_WITH(column_name, 'col_1'))
                     SELECT STRING_AGG(column_name) AS columns
                     FROM table_columns);
    EXECUTE IMMEDIATE FORMAT("""SELECT *
                                FROM refcon.examples.%s
                                ORDER BY %s
                             """, v_table_name, v_columns);
END;

Here is how this works:

  • First we define and set our variables, so we can use this snippet with any table.
  • INFORMATION_SCHEMA.COLUMNS contains the columns info for each table in the dataset. We get the column names and keep only the columns that match our pattern.
  • STARTS_WITH(STRING,PREFIX) functions checks if the STRING starts with a given prefix ('col_1' in this example).
  • We apply STRING_AGG to get the column names as a comma separated string. i.e. col_1, col_11, col_12 and set v_columns variable.
  • We use EXECUTE IMMEDIATE to execute our dynamic query using the v_table and v_columns variables.
  • We achieve this using the FORMAT function which formats a data type expression as a string. %s will be replaced with the variable values in the order of listing the variables.
  • The output of FORMAT will be SELECT * FROM refcon.examples.dummy_table ORDER BY col_1, col_11, col_12.
SQL
I/O