This function is mainly used internally, such as for simulating missing data patterns, but is available in case anyone finds it useful.
Arguments
- data
the existing tbl
- ...
columns to calculate the joint distribution from; if none are chosen, all columns with 10 or fewer unique values are used
- n
the total number of observations to return
- empirical
Should the returned data have exactly the same distribution of conditions as the existing data, rather than being sampled from a population with this distribution?
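The difference between sampling from a joint distribution and reproducing it exactly can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data frame `dat` and the names `joint`, `sampled`, and `exact` are hypothetical, and the rounding step is a simplification.

```r
# Sketch only: illustrates the idea, NOT how sim_joint_dist() is implemented.
set.seed(1)

# Hypothetical toy data with two categorical columns
dat <- data.frame(
  cut   = c("Ideal", "Ideal", "Premium", "Premium", "Premium", "Good"),
  color = c("D", "E", "D", "D", "E", "E")
)

# Joint distribution: one row per observed combination, with its proportion
joint <- as.data.frame(table(dat), stringsAsFactors = FALSE)
joint <- joint[joint$Freq > 0, ]
joint$prob <- joint$Freq / sum(joint$Freq)

n <- 12

# Sampled: draw n combinations with probability equal to their proportion,
# so the observed proportions vary from draw to draw
idx <- sample(nrow(joint), size = n, replace = TRUE, prob = joint$prob)
sampled <- joint[idx, c("cut", "color")]

# Exact: repeat each combination in proportion to its frequency.
# Rounding works here because n is a multiple of nrow(dat); in general
# the rounded counts may not sum to n and would need adjusting.
reps <- round(joint$prob * n)
exact <- joint[rep(seq_len(nrow(joint)), times = reps), c("cut", "color")]
```

Here `exact` always contains each combination in its original proportion, while `sampled` only matches those proportions on average.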
Examples
sim_joint_dist(ggplot2::diamonds, cut, color, n = 10)
#> # A tibble: 10 × 2
#>    cut       color
#>    <ord>     <ord>
#>  1 Ideal     F
#>  2 Premium   H
#>  3 Ideal     D
#>  4 Premium   F
#>  5 Premium   H
#>  6 Very Good G
#>  7 Ideal     E
#>  8 Ideal     F
#>  9 Very Good E
#> 10 Premium   D