Sample a fraction of rows from a disk.frame

# S3 method for disk.frame
sample_frac(tbl, size = 1, replace = FALSE, weight = NULL, .env = NULL, ...)

Arguments

tbl

A disk.frame.

size

<tidy-select> For sample_n(), the number of rows to select. For sample_frac(), the fraction of rows to select. If tbl is grouped, size applies to each group.

replace

Sample with or without replacement?

weight

<tidy-select> Sampling weights. This must evaluate to a vector of non-negative numbers the same length as the input. Weights are automatically standardised to sum to 1.

.env

DEPRECATED.

...

Ignored.

Examples

library(disk.frame)

# convert the built-in cars data set to a disk.frame
cars.df = as.disk.frame(cars)

collect(sample_frac(cars.df, 0.5))
#>     speed dist
#>  1:    10   34
#>  2:     7    4
#>  3:     7   22
#>  4:     8   16
#>  5:    11   17
#>  6:    12   28
#>  7:    13   34
#>  8:    12   14
#>  9:    15   26
#> 10:    14   80
#> 11:    14   36
#> 12:    13   46
#> 13:    18   76
#> 14:    18   84
#> 15:    17   40
#> 16:    19   36
#> 17:    23   54
#> 18:    20   64
#> 19:    19   46
#> 20:    20   48
#> 21:    25   85
#> 22:    24   70
#>     speed dist

# clean up cars.df
delete(cars.df)
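
The size and weight arguments described above can be illustrated on a plain data.frame with base R alone. The following is only a sketch: sample_fraction is a hypothetical helper, not part of disk.frame or dplyr, and it assumes that the target row count is round(size * nrow) and that weights are divided by their sum before sampling. It does not show how the disk.frame method itself treats chunks or weights.

```r
# Hypothetical helper (not part of disk.frame): sample a fraction of rows
# from an in-memory data.frame using base R only.
sample_fraction <- function(df, size = 1, replace = FALSE, weight = NULL) {
  n <- nrow(df)
  k <- round(size * n)  # assumption: the target row count is rounded
  if (!is.null(weight)) {
    # weights must be non-negative and match the number of rows
    stopifnot(length(weight) == n, all(weight >= 0))
    weight <- weight / sum(weight)  # standardise weights to sum to 1
  }
  idx <- sample.int(n, size = k, replace = replace, prob = weight)
  df[idx, , drop = FALSE]
}

set.seed(1)

# half of the 50 rows in the built-in cars data set
half <- sample_fraction(cars, 0.5)
nrow(half)

# weight each row by its speed, so faster cars are more likely to be drawn
weighted <- sample_fraction(cars, 0.2, weight = cars$speed)
nrow(weighted)
```

Because the row indices are drawn at random, the selected rows change between runs unless a seed is set; only the number of rows returned is determined by size.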