The disk.frame group-by operation performs the grouping WITHIN each chunk. This is often done for performance reasons. If the user wishes to perform a group-by across all chunks, they may choose to use the `hard_group_by` function, which is expensive as it reorganizes the chunks by the shard key.
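As a rough illustration of the difference between grouping within chunks and grouping after re-sharding, the sketch below uses base R only and does not call disk.frame itself. A list of small data frames stands in for the chunks, and the names `df`, `chunk_id`, `chunks`, `within_chunk`, `resharded` and `by_shard` are hypothetical.

```r
# Toy data: each key appears in every chunk.
df <- data.frame(
  key   = c("a", "b", "a", "b", "a", "b"),
  value = c(1, 2, 3, 4, 5, 6)
)

# Stand-in for a chunked dataset: three chunks of two rows each
# (an assumed layout, not how disk.frame stores data on disk).
chunk_id <- rep(1:3, each = 2)
chunks <- split(df, chunk_id)

# Grouping WITHIN each chunk: every chunk yields its own per-key sums,
# so a key that spans several chunks shows up several times.
within_chunk <- do.call(rbind, lapply(chunks, function(chunk) {
  aggregate(value ~ key, data = chunk, FUN = sum)
}))
rownames(within_chunk) <- NULL

# Re-sharding by key first (the idea behind a hard group-by):
# all rows for a key end up in the same chunk before aggregating.
resharded <- split(df, df$key)
by_shard <- do.call(rbind, lapply(resharded, function(chunk) {
  aggregate(value ~ key, data = chunk, FUN = sum)
}))
rownames(by_shard) <- NULL

print(within_chunk)  # one row per key per chunk
print(by_shard)      # one row per key
```

In this toy setup the within-chunk result has one row per key per chunk, while the re-sharded result has one row per key. The re-sharding step is the part that touches every row, which is why reorganizing by the shard key is described as expensive.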
# S3 method for grouped_disk.frame
summarise(.data, ...)
# S3 method for grouped_disk.frame
summarize(.data, ...)
# S3 method for disk.frame
group_by(
  .data,
  ...,
  .add = FALSE,
  .drop = stop("disk.frame does not support `.drop` in `group_by` at this stage")
)
# S3 method for disk.frame
summarize(.data, ...)
# S3 method for disk.frame
summarise(.data, ...)
.data: a disk.frame
...: same as the dplyr::group_by
.add: from dplyr
.drop: from dplyr
See also: hard_group_by