Sample n rows from a disk.frame

# S3 method for disk.frame
sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = NULL, ...)

Arguments

tbl

A disk.frame.

size

<tidy-select> For sample_n(), the number of rows to select. For sample_frac(), the fraction of rows to select. If tbl is grouped, size applies to each group.

replace

Sample with or without replacement?

weight

<tidy-select> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.

.env

DEPRECATED.

...

Ignored.

Examples

cars.df = as.disk.frame(cars)
collect(sample_frac(cars.df, 0.5))
#>    speed dist
#> 1     10   34
#> 2      8   16
#> 3      4   10
#> 4     10   26
#> 5     12   14
#> 6     13   26
#> 7     11   28
#> 8     13   34
#> 9     13   46
#> 10    15   26
#> 11    16   32
#> 12    15   54
#> 13    18   42
#> 14    17   32
#> 15    18   84
#> 16    17   40
#> 17    22   66
#> 18    19   68
#> 19    20   48
#> 20    20   52
#> 21    24  120
#> 22    25   85

# clean up cars.df
delete(cars.df)
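
For comparison, the size and weight arguments can be tried on an ordinary in-memory data.frame with dplyr. This is a minimal sketch, assuming dplyr is installed. It shows the generic's behaviour on a plain data.frame only and makes no claim about how the disk.frame method treats chunks or weights.

```r
library(dplyr)

set.seed(42)  # make the random sample reproducible

# size = 0.5 keeps half of the 50 rows in the built-in cars data set
half <- sample_frac(cars, size = 0.5)
nrow(half)

# weight is evaluated inside the data: rows with a larger dist
# are more likely to be drawn (weights are rescaled to sum to 1)
weighted <- sample_frac(cars, size = 0.5, weight = dist)
nrow(weighted)
```

Both calls sample without replacement, so no row of cars appears twice in either result.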