If no chunk_id is specified, the chunk is added at the end as the highest-numbered file, "n.fst", where n is the largest existing chunk_id + 1.

add_chunk(df, chunk, chunk_id = NULL, full.names = FALSE)

Arguments

df

the disk.frame to add a chunk to

chunk

a data.frame to be added as a chunk

chunk_id

a number giving the id of the chunk. If NULL, it is set to the largest existing chunk_id + 1

full.names

whether the chunk_id should be matched against the full file path rather than just the file name

Value

disk.frame

Details

This function is the preferred way to add a chunk to a disk.frame. It checks the types to make sure that the new chunk's types do not differ from those of the disk.frame.
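The kind of type check described above can be sketched in base R. This is only an illustration, not the package's implementation: `types_match` is a hypothetical helper, and how `add_chunk` itself reacts to a mismatch is not stated here.

```r
# Hypothetical helper (not part of disk.frame): compare the column classes of
# a new chunk against those of an existing chunk, for columns they share.
types_match <- function(existing, chunk) {
  # First class of every column, as a named character vector
  existing_types <- vapply(existing, function(col) class(col)[1], character(1))
  chunk_types <- vapply(chunk, function(col) class(col)[1], character(1))

  # Only columns present in both data.frames can be compared
  shared <- intersect(names(existing_types), names(chunk_types))
  mismatched <- shared[existing_types[shared] != chunk_types[shared]]

  if (length(mismatched) > 0) {
    # Assumption: a mismatch is reported as a warning in this sketch
    warning("columns with differing types: ", paste(mismatched, collapse = ", "))
    return(FALSE)
  }
  TRUE
}

# Same data, same types
ok <- types_match(cars, cars)

# `speed` stored as text instead of a number
bad_chunk <- transform(cars, speed = as.character(speed))
bad <- suppressWarnings(types_match(cars, bad_chunk))
```

Here `ok` is TRUE and `bad` is FALSE, because `speed` is numeric in `cars` but not in `bad_chunk`.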

Examples

# create a disk.frame
df_path = file.path(tempdir(), "tmp_add_chunk")
diskf = disk.frame(df_path)

# add a chunk to diskf
add_chunk(diskf, cars)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpInritK/tmp_add_chunk"
#> nchunks: 1
#> nrow (at source): 50
#> ncol (at source): 2
#> nrow (post operations): ???
#> ncol (post operations): ???
add_chunk(diskf, cars)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpInritK/tmp_add_chunk"
#> nchunks: 2
#> nrow (at source): 100
#> ncol (at source): 2
#> nrow (post operations): ???
#> ncol (post operations): ???
nchunks(diskf) # 2
#> [1] 2
df2 = disk.frame(file.path(tempdir(), "tmp_add_chunk2"))

# add chunks by specifying the chunk_id number; this is especially useful if
# you wish to add multiple chunks in parallel
add_chunk(df2, data.frame(chunk=1), 1)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpInritK/tmp_add_chunk2"
#> nchunks: 1
#> nrow (at source): 1
#> ncol (at source): 1
#> nrow (post operations): ???
#> ncol (post operations): ???
add_chunk(df2, data.frame(chunk=2), 3)
#> path: "C:\Users\RTX2080\AppData\Local\Temp\RtmpInritK/tmp_add_chunk2"
#> nchunks: 2
#> nrow (at source): 2
#> ncol (at source): 1
#> nrow (post operations): ???
#> ncol (post operations): ???
nchunks(df2) # 2
#> [1] 2
dir(attr(df2, "path", exact=TRUE))
#> [1] "1.fst" "3.fst"
# [1] "1.fst" "3.fst"

# clean up
delete(diskf)
delete(df2)
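
The default described for chunk_id (the largest chunk_id + 1) can be illustrated on a file listing like the one above. The following is a minimal base R sketch, not the package's own code: `next_chunk_id` is a hypothetical helper, and starting at 1 for an empty folder is an assumption.

```r
# Hypothetical helper (not part of disk.frame): derive the next chunk id from
# the "<id>.fst" file names found in a disk.frame folder.
next_chunk_id <- function(files) {
  # Keep only names of the form "<digits>.fst"
  chunk_files <- grep("^[0-9]+\\.fst$", basename(files), value = TRUE)
  ids <- as.integer(sub("\\.fst$", "", chunk_files))

  # Assumption: an empty disk.frame starts numbering at 1
  if (length(ids) == 0) {
    return(1L)
  }
  max(ids) + 1L
}

# With chunks 1 and 3 present, the gap at 2 is not filled
next_id <- next_chunk_id(c("1.fst", "3.fst"))
```

Under the documented rule the result is based on the largest existing id, so with "1.fst" and "3.fst" present `next_id` is 4 rather than 2.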