Memory Use

When working with data, especially relatively large datasets, it is often important to keep an eye on the data’s memory consumption and, at times, to take action to reduce it, such as clearing unnecessary data frames from memory, changing the data types of certain columns, or working with a smaller subset of the data.

On this page, we look at how to get information about the memory consumption of datasets.

Data Frame

We wish to know how much memory a particular data frame consumes.

df %>% object.size()

Here is how this works:

  • We pass the data frame df to the function object.size().
  • object.size() returns an estimate of the memory consumed by the data frame df, in bytes.
  • object.size() can be applied to any object in the R environment to get an estimate of the memory consumed by that object.
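The raw byte count can be hard to read for larger data frames. The sketch below shows one way to print it in a friendlier unit using the format() method for object.size() results. The data frame and its column names are made up for illustration.

```r
library(dplyr)

# Hypothetical example data: 100,000 rows, one integer and one double column
df <- tibble(
  col_1 = sample.int(100, 100000, replace = TRUE),
  col_2 = rnorm(100000)
)

# object.size() reports its estimate in bytes
size <- df %>% object.size()

# format() converts the byte count to a readable unit
size %>% format(units = "MB")
size %>% format(units = "auto")
```

With units = "auto", R picks a unit that suits the magnitude of the value.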

Particular Column

We wish to know how much memory a particular column consumes.

df %>% pull(col_1) %>% object.size()

Here is how this works:

  • We use pull() to extract the column col_1 from the data frame df.
  • object.size() applied to a data frame’s column returns the memory consumed by that column (col_1 in this example).
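Measuring a single column is a handy way to check whether changing its data type would save memory. As a rough sketch, the example below stores the same whole numbers once as doubles and once as integers and compares the two columns. The data and column names are hypothetical.

```r
library(dplyr)

# Hypothetical data: the same whole numbers stored as double and as integer
df <- tibble(
  col_1 = as.numeric(sample.int(100, 100000, replace = TRUE)),
  col_2 = as.integer(col_1)
)

size_double  <- df %>% pull(col_1) %>% object.size()
size_integer <- df %>% pull(col_2) %>% object.size()

# The integer column should take roughly half the memory of the double column
size_double %>% format(units = "KB")
size_integer %>% format(units = "KB")
```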

Each Column

We wish to know how much memory each column of a data frame consumes.

df %>% map_dbl(object.size)

Here is how this works:

  • map_dbl() iterates over all columns of the data frame df and applies object.size() to each, returning a named numeric vector holding the memory consumed by each column in bytes.
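To spot the columns that take up the most memory, the per-column sizes can be sorted and converted to megabytes. This is a minimal sketch on made-up data; the column names id, value, and label are assumptions for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical example data with three columns of different types
df <- tibble(
  id    = sample.int(100, 100000, replace = TRUE),
  value = rnorm(100000),
  label = sample(c("a", "b", "c"), 100000, replace = TRUE)
)

# Named numeric vector of column sizes in bytes
col_sizes <- df %>% map_dbl(object.size)

# Largest columns first, converted from bytes to megabytes
sorted_sizes <- col_sizes %>% sort(decreasing = TRUE)
round(sorted_sizes / 1024^2, 2)
```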

Available Memory

We wish to know how much memory our system has in total, how much is used, and how much is free.

library(memuse)
Sys.meminfo()

Here is how this works:

  • We use the Sys.meminfo() function from the memuse package to get useful information about the current state of the system’s memory.
  • Sys.meminfo() returns basic memory information such as total memory, free memory, and memory used.
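The individual figures can also be pulled out of the result. The sketch below assumes that the object returned by Sys.meminfo() exposes totalram and freeram elements and that memuse values can be subtracted from one another, which is how recent versions of the memuse package appear to behave. Check the package documentation for your installed version.

```r
library(memuse)

mem <- Sys.meminfo()

# Assumed element names: totalram and freeram
mem$totalram
mem$freeram

# Memory in use, derived as total minus free
mem$totalram - mem$freeram
```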