The disk.frame group-by operation performs grouping WITHIN each chunk. This is often done for performance reasons. If the user wishes to group across the whole data set, they may use the `hard_group_by` function, which is expensive because it reorganizes the chunks by the shard key.
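
The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the package's reference example: the toy data and the column names `g` and `x` are made up, and it assumes a disk.frame version that exports `chunk_group_by`, `chunk_summarize` and `hard_group_by`.

```r
library(disk.frame)
library(dplyr)

setup_disk.frame(workers = 1)  # one background worker is enough for a toy example

# Hypothetical toy data: two groups stored across two chunks
df  <- data.frame(g = c("a", "b", "a", "b"), x = 1:4)
dff <- as.disk.frame(df, nchunks = 2, overwrite = TRUE)

# Grouping WITHIN each chunk: a group whose rows sit in several chunks
# produces one partial result per chunk
per_chunk <- dff %>%
  chunk_group_by(g) %>%
  chunk_summarize(sum_x = sum(x)) %>%
  collect()

# Combine the partial results with a second, in-memory aggregation
combined <- per_chunk %>%
  group_by(g) %>%
  summarize(sum_x = sum(sum_x))

# Alternative: hard_group_by() first moves all rows of a group into the
# same chunk, so a per-chunk summary is already the complete result
hard <- dff %>%
  hard_group_by(g) %>%
  chunk_group_by(g) %>%
  chunk_summarize(sum_x = sum(x)) %>%
  collect()
```

The two-step version only works for aggregations that can be combined from partial results (sums, counts, minima, maxima). For anything else, the reshuffling done by `hard_group_by` is likely the safer choice despite its cost.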

# S3 method for disk.frame
group_by(.data, ..., add = FALSE, .drop = group_by_drop_default(.data))

chunk_group_by(.data, ...)

Arguments

.data

a disk.frame

...

the grouping variables, same as in dplyr::group_by

See also

hard_group_by
