At times, we wish to exclude one or more columns and retain the rest. At other times the criteria to describe the columns that we wish to exclude are easier to formulate than their complement (the criteria for columns we wish to retain). In such situations we need to exclude some columns and return the rest. There are four common column exclusion scenarios which we cover in this section:
We wish to exclude one or more columns of a table given their names and to return the table’s remaining columns.
In this example, we wish to exclude the columns named col_1
and col_5
and return the remaining columns.
SELECT * EXCEPT (col_1,col_5)
FROM refcon.dataset.table_1;
Here is how this works:
SELECT *
means select all columns in the table .EXCEPT (col_1,col_5)
will instruct SELECT
to exclude the columns we specify by name.We wish to exclude one or more columns of a table given their positions (where 1 is the left most column) and to return the table’s remaining columns.
In this example, we wish to exclude the columns at the first and fifth positions and return the remaining columns.
BEGIN
DECLARE v_column_positions ARRAY <INT64>;
DECLARE v_table_name STRING;
DECLARE v_columns STRING;
SET v_column_positions = [1,5];
SET v_table_name = 'table_1';
SET v_columns =
(WITH table_columns AS
(SELECT column_name
FROM refcon.dataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = v_table_name
AND ordinal_position NOT IN UNNEST(v_column_positions))
SELECT STRING_AGG(column_name) AS columns
FROM table_columns);
EXECUTE IMMEDIATE FORMAT("""SELECT %s
FROM refcon.dataset.%s
""", v_columns, v_table_name);
END;
Here is how this works:
INFORMATION_SCHEMA.COLUMNS
contains the columns info for each table in the dataset.ordinal_position
that does not equal any of
the v_column_positions
array elements.IN
operator we need to UNNEST
the array first.STRING_AGG
to get the column names as a comma separated string. i.e. col_1,col_2,col_3,..
and
set v_columns
variable.EXECUTE IMMEDIATE
to execute our dynamic query using the v_table
and v_columns
variables.FORMAT
function which formats a data type expression as a string. %s
will be replaced
with the variable values in the order of listing the variables.We wish to exclude columns that match a condition and return the table’s remaining columns.
In this example, we exclude columns that end with 1
and return the remaining columns.
BEGIN
DECLARE v_table_name STRING;
DECLARE v_suffix STRING;
DECLARE v_columns STRING;
SET v_table_name = 'table_1';
SET v_suffix = '1';
SET v_columns = (WITH table_columns AS
(SELECT column_name
FROM refcon.dataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = v_table_name
AND NOT ENDS_WITH(column_name, v_suffix))
SELECT STRING_AGG(column_name) AS columns
FROM table_columns);
EXECUTE IMMEDIATE FORMAT("""SELECT %s
FROM refcon.dataset.%s
""", v_columns, v_table_name);
END;
Here is how this works:
INFORMATION_SCHEMA.COLUMNS
contains the columns info for each table in the dataset. We get the column names and
keep only the columns that match our pattern.ENDS_WITH(column_name, v_suffix)
will return TRUE
for any column that ends with '1'
.'1'
.STRING_AGG
to get the column names as a comma separated string. i.e. col_1, col_11
and
set v_columns
variable.EXECUTE IMMEDIATE
to execute our dynamic query using the v_table
and v_columns
variables.FORMAT
function which formats a data type expression as a string. %s
will be replaced
with the variable values in the order of listing the variables.We wish to exclude columns of one or more data types and return the data frame’s remaining columns.
In this example, we wish to exclude both columns of an integer data type and of a logical data type.
BEGIN
DECLARE v_table_name STRING;
DECLARE v_data_types ARRAY <STRING>;
DECLARE v_columns STRING;
SET v_table_name = 'table_1';
SET v_data_types = ['INT64','BOOL'];
SET v_columns = (WITH table_columns AS
(SELECT column_name
FROM refcon.dataset.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = v_table_name
AND data_type NOT IN UNNEST(v_data_types))
SELECT STRING_AGG(column_name) AS columns
FROM table_columns);
EXECUTE IMMEDIATE FORMAT("""SELECT %s
FROM refcon.dataset.%s
""", v_columns, v_table_name);
END;
Here is how this works:
This works similarly to the "By Position" scenario described above, except that we use filter our based on data types instead of position.