If no chunk_id is specified, the chunk is added at the end as the highest-numbered file, "n.fst".
add_chunk(df, chunk, chunk_id = NULL, full.names = FALSE, ...)
df: the disk.frame to add a chunk to
chunk: a data.frame to be added as a chunk
chunk_id: a number giving the id of the chunk. If NULL, it is set to the largest existing chunk_id + 1
full.names: whether chunk_id should be matched against the full file path rather than just the file name
...: passed on to write_fst, e.g. compress
disk.frame
This function is the preferred way to add a chunk to a disk.frame. It checks the types of the new chunk to make sure they do not differ from those already in the disk.frame.
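To make the idea of a type check concrete, here is a minimal base R sketch of the kind of comparison involved. It is not the package's implementation: `type_mismatches()` is a hypothetical helper introduced only for illustration, and it compares the first class of each shared column between a reference data.frame and a candidate chunk.

```r
# Hypothetical helper, not part of disk.frame: return the names of columns in
# `chunk` whose class differs from the same-named column in `reference`.
type_mismatches <- function(reference, chunk) {
  ref_types <- vapply(reference, function(col) class(col)[1], character(1))
  new_types <- vapply(chunk, function(col) class(col)[1], character(1))
  # Only columns present in both data.frames can be compared
  shared <- intersect(names(ref_types), names(new_types))
  shared[ref_types[shared] != new_types[shared]]
}

ok  <- data.frame(speed = c(4, 7), dist = c(2, 10))
bad <- data.frame(speed = c("4", "7"), dist = c(2, 10))

type_mismatches(cars, ok)   # no mismatching columns
type_mismatches(cars, bad)  # "speed" is not numeric in `bad`
```

Running a check like this before calling add_chunk can catch a mismatched column early, for instance a numeric column that was read in as text.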
# create a disk.frame
df_path = file.path(tempdir(), "tmp_add_chunk")
diskf = disk.frame(df_path)
# add a chunk to diskf
add_chunk(diskf, cars)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk"
#> nchunks: 1
#> nrow (at source): 50
#> ncol (at source): 2
add_chunk(diskf, cars)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk"
#> nchunks: 2
#> nrow (at source): 100
#> ncol (at source): 2
nchunks(diskf) # 2
#> [1] 2
df2 = disk.frame(file.path(tempdir(), "tmp_add_chunk2"))
# add chunks by specifying the chunk_id number; this is especially useful if
# you wish to add multiple chunks in parallel
add_chunk(df2, data.frame(chunk=1), 1)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk2"
#> nchunks: 1
#> nrow (at source): 1
#> ncol (at source): 1
add_chunk(df2, data.frame(chunk=2), 3)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpCo1OFr/tmp_add_chunk2"
#> nchunks: 2
#> nrow (at source): 2
#> ncol (at source): 1
nchunks(df2) # 2
#> [1] 2
dir(attr(df2, "path", exact=TRUE))
#> [1] "1.fst" "3.fst"
# clean up
delete(diskf)
delete(df2)
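
The default numbering rule described for chunk_id (largest chunk_id + 1) can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `next_chunk_id()` is a hypothetical helper, and it assumes chunk files are named "<integer>.fst", as in the directory listing above.

```r
# Hypothetical helper, not part of disk.frame: compute the next numeric chunk
# id from a vector of chunk file names such as "1.fst" and "3.fst".
next_chunk_id <- function(files) {
  stems <- sub("\\.fst$", "", basename(files))
  # Non-numeric file names become NA and are dropped
  ids <- suppressWarnings(as.integer(stems))
  ids <- ids[!is.na(ids)]
  if (length(ids) == 0L) 1L else max(ids) + 1L
}

next_chunk_id(character(0))         # 1: an empty directory starts at chunk 1
next_chunk_id(c("1.fst", "3.fst"))  # 4: gaps in the numbering are not filled
```

Under this rule, a directory holding "1.fst" and "3.fst" would receive "4.fst" next, so ids skipped by explicit chunk_id values stay unused unless they are requested explicitly.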