Computes the recommended number of chunks to break a data.frame into. It accepts either a data.frame or a file size in bytes (as an integer).

recommend_nchunks(
  df,
  type = "csv",
  minchunks = data.table::getDTthreads(),
  conservatism = 8,
  ram_size = df_ram_size()
)

Arguments

df

a disk.frame or the file size in bytes of a CSV file holding the data

type

only "csv" is supported. It indicates the file type that the file size given in `df` refers to

minchunks

the minimum number of chunks. Defaults to the number of CPU cores (without hyper-threading)

conservatism

a multiplier applied to the recommended number of chunks. The more chunks there are, the smaller each chunk is and the more likely it is that each chunk fits into RAM

ram_size

the amount of RAM available. This is usually computed automatically, except on RStudio with R 3.6+
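
To make the trade-off behind conservatism concrete, the sketch below computes the average on-disk chunk size for a 1 GiB CSV file at a few chunk counts. The chunk counts are illustrative assumptions (6 matches the example output on this page; 12 and 24 are simply doublings), and the arithmetic is not the formula recommend_nchunks uses internally.

```r
# size of the CSV file in bytes (1 GiB, as in the file-size example)
file_bytes <- 1024^3

# hypothetical chunk counts: a baseline and two doublings of it
nchunks <- c(6, 12, 24)

# average chunk size in MiB: total size divided evenly across chunks
chunk_mib <- file_bytes / nchunks / 1024^2

# doubling the number of chunks halves the average chunk size
print(data.frame(nchunks = nchunks, chunk_mib = round(chunk_mib, 1)))
```

These figures describe the CSV bytes per chunk on disk; the in-memory size of a chunk once loaded can differ.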

Examples

# recommend nchunks based on data.frame
recommend_nchunks(cars)
#> [1] 6

# recommend nchunks based on file size; only CSV is implemented at the moment
recommend_nchunks(1024^3)
#> [1] 6
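
As a further sketch, the file size can be read from an actual CSV file with base R's file.size() and passed in directly. The temporary file and the variable names are hypothetical, and the call to recommend_nchunks is guarded so the snippet still runs when disk.frame is not installed. The result depends on the machine's cores and RAM, so no output is shown.

```r
# write a small data.frame to a temporary CSV file (hypothetical path)
csv_path <- tempfile(fileext = ".csv")
write.csv(cars, csv_path, row.names = FALSE)

# file.size() returns the size of the file in bytes as a numeric value
csv_bytes <- file.size(csv_path)

# only call into disk.frame when the package is available
if (requireNamespace("disk.frame", quietly = TRUE)) {
  n <- disk.frame::recommend_nchunks(csv_bytes, type = "csv")
  print(n)
}

# remove the temporary file
unlink(csv_path)
```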