Sample n rows from a disk.frame
# S3 method for disk.frame
sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = NULL, ...)
tbl: A data.frame.
size: <tidy-select> For sample_n(), the number of rows to select. For sample_frac(), the fraction of rows to select. If tbl is grouped, size applies to each group.
replace: Sample with or without replacement?
weight: <tidy-select> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.
.env: DEPRECATED.
...: Ignored.
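The meaning of size, replace and weight is easiest to see on a small in-memory table. The following is a minimal sketch that uses dplyr's sample_frac() on the plain cars data.frame rather than on a disk.frame, on the assumption that the arguments have the same meaning here. The variable names (half, boot, weighted, by_group) and the seed are illustrative only.

```r
library(dplyr)

set.seed(42)  # arbitrary seed so the random draws can be repeated

# size is a fraction of the rows: cars has 50 rows, so 0.5 selects 25
half <- sample_frac(cars, size = 0.5)
nrow(half)

# with replace = TRUE a row can be drawn more than once,
# so size may be larger than 1
boot <- sample_frac(cars, size = 1.5, replace = TRUE)
nrow(boot)

# weight is evaluated inside the data: here rows with a higher
# speed are more likely to be selected
weighted <- sample_frac(cars, size = 0.2, weight = speed)
nrow(weighted)

# on a grouped table, size applies to each group separately
by_group <- cars %>%
  group_by(fast = speed > 15) %>%
  sample_frac(size = 0.5)
count(by_group, fast)
```

Whether the disk.frame method accepts weight in the same way is not stated on this page, so it is worth checking before relying on it.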
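# convert the built-in cars data.frame to a disk.frame
cars.df = as.disk.frame(cars)
# draw half of the rows at random; the rows returned differ between runs
collect(sample_frac(cars.df, 0.5))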
#> speed dist
#> 1: 10 34
#> 2: 7 4
#> 3: 7 22
#> 4: 8 16
#> 5: 11 17
#> 6: 12 28
#> 7: 13 34
#> 8: 12 14
#> 9: 15 26
#> 10: 14 80
#> 11: 14 36
#> 12: 13 46
#> 13: 18 76
#> 14: 18 84
#> 15: 17 40
#> 16: 19 36
#> 17: 23 54
#> 18: 20 64
#> 19: 19 46
#> 20: 20 48
#> 21: 25 85
#> 22: 24 70
#> speed dist
# clean up cars.df
delete(cars.df)
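The output above has 22 rows, although half of the 50 rows in cars would be 25. One possible explanation, which is an assumption and not something this page confirms, is that the fraction is applied to each chunk separately and the per-chunk counts are rounded. The sketch below compares the row counts before and after sampling. The nchunks = 4 setting and the names cars4.df and sampled are chosen only for illustration.

```r
library(disk.frame)
library(dplyr)

# assumption: 4 chunks, picked only to make the chunking explicit
cars4.df <- as.disk.frame(cars, nchunks = 4)

# sample half of the rows and bring the result into memory
sampled <- collect(sample_frac(cars4.df, 0.5))

nrow(cars4.df)  # rows stored in the disk.frame
nrow(sampled)   # close to half; the exact count may depend on the chunking

# clean up cars4.df
delete(cars4.df)
```

If an exact sample size matters, one option is to collect first and then sample the in-memory table, at the cost of loading all rows.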