Shifting

We wish to shift the values of a column up or down by a given offset. Shifting allows us to access a column’s previous or following values enabling us to compute changes.

Previous

We wish to shift the value of a column down by one i.e. we wish to obtain the previous value based on an ordered column.

In this example, we wish to compute the difference and rate of change between the value of each element of the column attr_1 and the value of its previous element based on the column date.

SELECT date,
       attr_1,
       LAG(attr_1) OVER (ORDER BY date)              previous,
       attr_1 - LAG(attr_1) OVER (ORDER BY date)     delta,
       SAFE_DIVIDE((attr_1 - LAG(attr_1) OVER (ORDER BY date)),
                   LAG(attr_1) OVER (ORDER BY date)) change
FROM table_1
ORDER BY date;

Here is how this works:

  • We use the LAG() window function to shift the values of the column attr_1 down and store the resulting values in a new column previous.
  • By default LAG() shifts down by 1 position, i.e. an offset of 1. In the OVER() clause, we specify the column we wish to sort by, in this example, ORDER BY date.
  • In attr_1 - LAG(attr_1) OVER (ORDER BY date), we compute the difference between the current value of attr_1 and its previous value.
  • We compute the rate of change by dividing the delta by attr_1 and create a new column named change.
  • Finally, we sort the table by the values of the date column to see the ordered results. See Sorting.

Extension: Shift Per Group

We wish to shift the value of a column down by one i.e. we wish to obtain the previous value based on an ordered column per group, where the groups are specified by the column col_1

In this example, we wish to compute the difference between the value of each element of the column attr_1 and the value of its previous element based on the column date for each group in col_1.

SELECT col_1,
       date,
       attr_1,
       LAG(attr_1) OVER (PARTITION BY col_1 ORDER BY date)          previous,
       attr_1 - LAG(attr_1) OVER (PARTITION BY col_1 ORDER BY date) delta
FROM table_1
ORDER BY col_1, date;

Here is how this works:

This code is similar to the code above in Previous with one exception: We specify the group column in the OVER() clause using PARTITION BY. In this example PARTITION BY col_1.

Extension: Specify Offset Value

We wish to shift the value of a vector down by an arbitrary offset.

In this example, we wish to compute the difference and rate of change between the value of each element of the column attr_1 and its value one week prior (seven days earlier).

SELECT date,
       attr_1,
       LAG(attr_1, 7) OVER (ORDER BY date)              previous,
       attr_1 - LAG(attr_1, 7) OVER (ORDER BY date)     delta,
       SAFE_DIVIDE((attr_1 - LAG(attr_1, 7) OVER (ORDER BY date)),
                   LAG(attr_1, 7) OVER (ORDER BY date)) change
FROM table_1
ORDER BY date;

Here is how this works:

This code is similar to the code above in Previous with one exception: We pass the desired offset value to the LAG(), which here is 7.

Extension: Fill NULLs

When we shift a column, we inadvertently create missing values corresponding to the shift. By default, those missing values are filled as NULL. We can specify an alternative value.

In this example, we wish to fill the missing values resulting from the shift with 0.

SELECT date,
       attr_1,
       LAG(attr_1, 1, 0) OVER (ORDER BY date)          previous,
       attr_1 - LAG(attr_1, 1, 0) OVER (ORDER BY date) delta
FROM table_1
ORDER BY date;

Here is how this works:

  • This code is similar to the code above in Previous with one exception:
    • We pass the desired value to use to fill NULLs generated by shifting to the argument default of LAG(), which here is 0.
    • In order to do that, we need to pass the offset value first, which here is 1.
    • The final expression is LAG(attr_1, 1, 0) where 1 is the offset and 0 is the default value.
  • Note that this approach is preferable to filling NULLs after shifting because there may be other NULLs in the data that we do not necessarily wish to replace.

Next

We wish to shift the value of a column up by one i.e. we wish to obtain the next value.

In this example, we wish to compute the difference and rate of change between the value of each element of the column attr_1 and the value of its next element. The table is sorted by the value of the column date.

SELECT date,
       attr_1,
       LEAD(attr_1) OVER (ORDER BY date)          previous,
       LEAD(attr_1) OVER (ORDER BY date) - attr_1 delta,
       SAFE_DIVIDE((LEAD(attr_1) OVER (ORDER BY date) - attr_1),
                   attr_1)                        change
FROM table_1
ORDER BY date;

Here is how this works:

  • This code is similar to the code under Previous above except that we use LEAD() instead of LEAD() because we wish to shift up i.e. obtain the next value.
  • The same extensions covered under Previous above can be applied here.
SQL
I/O