All functions

add_chunk()

Add a chunk to the disk.frame

as.data.frame(<disk.frame>)

Convert disk.frame to data.frame by collecting all chunks

as.data.table(<disk.frame>)

Convert disk.frame to data.table by collecting all chunks

as.disk.frame()

Make a data.frame into a disk.frame

collect(<disk.frame>) collect_list()

Bring the disk.frame into R

colnames() names(<disk.frame>)

Return the column names of the disk.frame

compute(<disk.frame>)

Compute without writing

create_dplyr_mapper()

Create dplyr function for disk.frame

csv_to_disk.frame()

Convert CSV file(s) to disk.frame format

delete()

Delete a disk.frame

dfglm()

Fit generalized linear models (glm) with disk.frame

df_ram_size()

Get the size of RAM in gigabytes

disk.frame()

Create a disk.frame from a folder

select(<disk.frame>) rename(<disk.frame>) filter(<disk.frame>) filter_all.disk.frame() filter_if.disk.frame() filter_at.disk.frame() mutate(<disk.frame>) transmute(<disk.frame>) arrange(<disk.frame>) chunk_arrange() tally.disk.frame() count.disk.frame() add_count.disk.frame() add_tally.disk.frame() chunk_summarize() chunk_summarise() summarize(<disk.frame>) summarise(<disk.frame>) do(<disk.frame>) group_by_all.disk.frame() group_by_at.disk.frame() group_by_if.disk.frame() mutate_all.disk.frame() mutate_at.disk.frame() mutate_if.disk.frame() rename_all.disk.frame() rename_at.disk.frame() rename_if.disk.frame() select_all.disk.frame() select_at.disk.frame() select_if.disk.frame() chunk_summarise_all() chunk_summarise_at() chunk_summarize_all() chunk_summarize_at() chunk_summarize_if() distinct(<disk.frame>) chunk_distinct() glimpse(<disk.frame>)

The dplyr verbs implemented for disk.frame

evalparseglue()

Helper function to parse and evaluate a `glue::glue` string

foverlaps.disk.frame()

Apply data.table's foverlaps to the disk.frame

gen_datatable_synthetic()

Generate synthetic dataset for testing

get_chunk()

Obtain one chunk by chunk id

get_chunk_ids()

Get the chunk IDs and file names

groups(<disk.frame>)

The shard keys of the disk.frame

group_by(<disk.frame>) chunk_group_by()

Group by within each disk.frame

hard_arrange()

Perform a hard arrange

hard_group_by()

Perform a hard group-by

head(<disk.frame>) tail(<disk.frame>)

Head and tail of the disk.frame

is_disk.frame()

Checks if a folder is a disk.frame

anti_join(<disk.frame>) full_join(<disk.frame>) inner_join(<disk.frame>) left_join(<disk.frame>) semi_join(<disk.frame>)

Performs join/merge for disk.frames

make_glm_streaming_fn()

A streaming function for speedglm

map() map_dfr() imap() imap_dfr() lazy() delayed() chunk_lapply()

Apply the same function to all chunks

map2() map_by_chunk_id()

`map` a function to two disk.frames

merge(<disk.frame>)

Merge function for disk.frames

move_to() copy_df_to()

Move or copy a disk.frame to another location

nchunks() nchunk()

Returns the number of chunks in a disk.frame

nrow() ncol()

Number of rows or columns

overwrite_check()

Check if the outdir exists or not

print(<disk.frame>)

Print disk.frame

rbindlist.disk.frame()

rbindlist disk.frames together

rechunk()

Increase or decrease the number of chunks in the disk.frame

recommend_nchunks()

Recommend number of chunks based on input size

remove_chunk()

Removes a chunk from the disk.frame

sample_frac(<disk.frame>)

Sample a fraction of rows from a disk.frame

setup_disk.frame()

Set up disk.frame environment

shard() distribute()

Shard a data.frame/data.table or disk.frame into chunks and save them as a disk.frame

shardkey()

Returns the shardkey (not implemented yet)

shardkey_equal()

Compare two disk.frame shardkeys

show_ceremony() ceremony_text() show_boilerplate() insert_ceremony()

Show the code to set up disk.frame

srckeep()

Keep only the variables from the input listed in selections

`[`(<disk.frame>)

`[` interface for disk.frame using the fst backend

tbl_vars(<disk.frame>)

Column names for RStudio auto-complete

write_disk.frame() output_disk.frame()

Write disk.frame to disk

zip_to_disk.frame()

Read and convert every CSV file within a zip file to disk.frame format
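
To show how a few of the functions listed above fit together, here is a minimal sketch of one possible round trip: set up workers, convert a data.frame to a disk.frame, inspect it, bring it back into R, and delete it. The worker count, the chunk count, the output folder name `mtcars.df`, and the variable names are all assumptions made for illustration. Only the function names come from this index.

```r
library(disk.frame)

# Start background workers; 2 is an arbitrary choice for this sketch.
setup_disk.frame(workers = 2)

# Hypothetical output folder inside the session's temporary directory.
out_dir <- file.path(tempdir(), "mtcars.df")

# Write the built-in mtcars data.frame to disk as a chunked disk.frame.
cars_df <- as.disk.frame(mtcars, outdir = out_dir, nchunks = 4, overwrite = TRUE)

# Inspect the result without loading everything.
n_chunks    <- nchunks(cars_df)       # number of chunks on disk
first_chunk <- get_chunk(cars_df, 1)  # read a single chunk by its id

# Collect all chunks back into an ordinary in-memory data.frame.
all_rows <- as.data.frame(cars_df)

# Remove the on-disk folder once it is no longer needed.
delete(cars_df)
```

Row order after collecting follows chunk order, so it may be safer not to rely on it matching the original data.frame exactly.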