Add a chunk to the disk.frame — add_chunk

If no chunk_id is specified, the chunk is added at the end, as the file with the largest number, "n.fst".

add_chunk(df, chunk, chunk_id = NULL, full.names = FALSE, ...)

Arguments

df: the disk.frame to add a chunk to
chunk: a data.frame to be added as a chunk
chunk_id: a number giving the id of the chunk. If NULL, it is set to the largest existing chunk_id + 1
full.names: whether chunk_id should be matched against the full file path rather than just the file name
...: passed on to write_fst, e.g. compress
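
As a minimal sketch of how these arguments interact, the snippet below adds one chunk with an explicit chunk_id and one with the default, forwarding compress through `...`. The directory name "tmp_add_chunk_args" is made up for illustration, and the file names in the comments follow from the "largest chunk_id + 1" rule described above.

```r
library(disk.frame)

# hypothetical scratch directory, used only for this sketch
path <- file.path(tempdir(), "tmp_add_chunk_args")
df <- disk.frame(path)

# explicit chunk_id: the chunk is written as "2.fst"
add_chunk(df, cars, chunk_id = 2)

# chunk_id = NULL (the default): largest existing id + 1, i.e. "3.fst";
# compress is not an add_chunk argument, it is forwarded to write_fst via `...`
add_chunk(df, cars, compress = 100)

# list the chunk files that now back the disk.frame
chunk_files <- dir(attr(df, "path", exact = TRUE))
print(chunk_files)

# clean up
delete(df)
```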

Value

disk.frame

Details

This function is the preferred way to add a chunk to a disk.frame. It checks the column types of the new chunk to make sure they do not differ from those already in the disk.frame.
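
To illustrate the kind of check described here, the following base R sketch compares the column names and classes of a candidate chunk against a reference data.frame before it is added. The helper col_type_mismatches is hypothetical and is not part of disk.frame; it does not reproduce how add_chunk performs its own check or how it reports a mismatch.

```r
# Hypothetical helper, NOT part of disk.frame: report columns whose names
# or classes differ between a reference data.frame and a candidate chunk.
col_type_mismatches <- function(reference, chunk) {
  # first class of every column, as a named character vector
  ref_types <- vapply(reference, function(x) class(x)[1], character(1))
  new_types <- vapply(chunk, function(x) class(x)[1], character(1))

  # columns present in both, then those whose class differs
  common <- intersect(names(ref_types), names(new_types))
  differing <- common[ref_types[common] != new_types[common]]

  list(
    missing_in_chunk = setdiff(names(ref_types), names(new_types)),
    extra_in_chunk = setdiff(names(new_types), names(ref_types)),
    type_differs = differing
  )
}

# a chunk whose speed column has been turned into text
bad_chunk <- transform(cars, speed = as.character(speed))

res <- col_type_mismatches(cars, bad_chunk)
print(res$type_differs)
```

Running a pre-check like this on the data.frame side can make a type problem visible before any file is written.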

Examples

# create a disk.frame
df_path = file.path(tempdir(), "tmp_add_chunk")
diskf = disk.frame(df_path)

# add a chunk to diskf
add_chunk(diskf, cars)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk"
#> nchunks: 1
#> nrow (at source): 50
#> ncol (at source): 2
add_chunk(diskf, cars)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk"
#> nchunks: 2
#> nrow (at source): 100
#> ncol (at source): 2

nchunks(diskf) # 2
#> [1] 2

df2 = disk.frame(file.path(tempdir(), "tmp_add_chunk2"))

# add chunks by specifying the chunk_id number; this is especially useful if
# you wish to add multiple chunks in parallel

add_chunk(df2, data.frame(chunk=1), 1)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk2"
#> nchunks: 1
#> nrow (at source): 1
#> ncol (at source): 1
add_chunk(df2, data.frame(chunk=2), 3)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk2"
#> nchunks: 2
#> nrow (at source): 2
#> ncol (at source): 1

nchunks(df2) # 2
#> [1] 2

dir(attr(df2, "path", exact=TRUE))
#> [1] "1.fst" "3.fst"

# clean up
delete(diskf)
delete(df2)
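
The point about pre-assigned chunk_id values can be sketched as follows. The data is split into pieces and each piece is given a fixed id up front, so no call has to look up the current largest id. The loop runs sequentially here; handing the same loop body to a parallel backend is an assumption that is not shown or tested. The directory name "tmp_add_chunk_split" is made up for illustration.

```r
library(disk.frame)

# hypothetical scratch directory, used only for this sketch
df3 <- disk.frame(file.path(tempdir(), "tmp_add_chunk_split"))

# split cars (50 rows) into 5 pieces of 10 rows each
pieces <- split(cars, rep(1:5, each = 10))

# each piece gets a fixed chunk_id, so the calls do not depend on each other
for (i in seq_along(pieces)) {
  add_chunk(df3, pieces[[i]], chunk_id = i)
}

n_chunks <- nchunks(df3)
print(n_chunks)

# clean up
delete(df3)
```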