Dynamic Column Specification

We wish to pass the names of the columns to be selected dynamically.

We will cover the following:

  1. As Environment Variable: The names of the columns to be selected are specified as a variable in the environment. This is useful for instance when structuring a script and wish to have column specification as part of a configuration section at the beginning of the script separate from the logic.
  2. As Procedure Argument: Column selection happens inside a procedure and the names of the columns to be selected are passed to the procedure as an argument. This is useful when we wish to write our data manipulation pipeline as a reusable procedure.

As Environment Variable

We wish to specify the names of the columns to be selected as a variable in the environment.

In this example, we specify the names of the columns we wish to select as a variable v_cols_to_select and then use that variable for column selection.

BEGIN
    DECLARE v_cols_to_select ARRAY <STRING>;
    SET v_cols_to_select = ['col_1','col_2','col_3'];
    EXECUTE IMMEDIATE FORMAT("""SELECT %s
                                FROM refcon.dataset.table_1
                             """, ARRAY_TO_STRING(v_cols_to_select, ','));
END;

Here is how this works:

  • We define our variable v_cols_to_select using a DECLARE statement as an array of strings.
  • We assign the value ['col_1','col_2','col_3'] to our variable using SET statement.
  • We use EXECUTE IMMEDIATE to execute our dynamic query using the v_table and v_cols_to_select variables.
  • We achieve this using the FORMAT function which formats a data type expression as a string. %s will be replaced with the variable values in the order of listing the variables.
  • The output of FORMAT will be SELECT col_1,col_2,col_3 FROM refcon.dataset.table_1.

As Procedure Argument

We wish to pass the names of the columns to be selected as an argument to a procedure. The actual column selection happens inside the procedure.

In this example, column selection happens inside the procedure select_columns_by_name which takes the table name and the names of the columns to be selected as an argument v_cols_to_select.

CREATE OR REPLACE PROCEDURE
    refcon.dataset.select_columns_by_name(v_table_name STRING,
                                           v_cols_to_select ARRAY <STRING>)
BEGIN
    EXECUTE IMMEDIATE FORMAT("""SELECT %s
                                FROM refcon.dataset.%s
                             """, ARRAY_TO_STRING(v_cols_to_select, ','),
                             v_table_name);
END;

CALL refcon.dataset.select_columns_by_name('table_1',
                                            ['col_1','col_2']);

Here is how this works:

  • We create select_columns_by_name procedure which takes two inputs; v_table_name and v_cols_to_select.
  • ARRAY_TO_STRING(v_cols_to_select, ',') will convert the array to a comma separated string.
  • EXECUTE IMMEDIATE works similarly to the scenario above.
  • We call the procedure using CALL statement and pass our input variables.
SQL
I/O