This function is mainly used internally, for example to simulate missing-data patterns, but it is exported in case it is useful elsewhere.
sim_joint_dist(data, ..., n = 100, empirical = FALSE)
data: the existing tbl
...: columns to calculate the joint distribution from; if none are chosen, all columns with 10 or fewer unique values are used
n: the total number of observations to return
empirical: should the returned data have exactly the same distribution of conditions (TRUE), or be sampled from a population with this distribution (FALSE)?
data table
sim_joint_dist(ggplot2::diamonds, cut, color, n = 10)
#> # A tibble: 10 × 2
#>    cut       color
#>    <ord>     <ord>
#>  1 Very Good F
#>  2 Very Good D
#>  3 Ideal     E
#>  4 Premium   G
#>  5 Ideal     E
#>  6 Good      D
#>  7 Good      G
#>  8 Very Good J
#>  9 Very Good E
#> 10 Premium   H
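To make the difference between sampling from the joint distribution and reproducing it exactly more concrete, here is a minimal base-R sketch of one plausible reading of the arguments described above. It is not the package's implementation: `sketch_joint_dist()` is a hypothetical helper, the `cols` argument stands in for `...`, and the largest-remainder rounding used for the exact case is an assumption made for illustration.

```r
# Hypothetical helper for illustration only; not the package's own code.
sketch_joint_dist <- function(data, cols = NULL, n = 100, empirical = FALSE) {
  # Assumed default: use every column with 10 or fewer unique values
  if (is.null(cols)) {
    few <- vapply(data, function(x) length(unique(x)) <= 10, logical(1))
    cols <- names(data)[few]
  }

  # Count each observed combination of the chosen columns
  cells <- as.data.frame(table(data[cols]))
  cells <- cells[cells$Freq > 0, , drop = FALSE]
  prob <- cells$Freq / sum(cells$Freq)

  if (empirical) {
    # Assumed approach: give each cell its share of n, then hand out any
    # leftover rows to the cells with the largest fractional remainders
    counts <- floor(prob * n)
    leftover <- n - sum(counts)
    if (leftover > 0) {
      top <- order(prob * n - counts, decreasing = TRUE)[seq_len(leftover)]
      counts[top] <- counts[top] + 1
    }
    idx <- rep(seq_len(nrow(cells)), times = counts)
  } else {
    # Draw n cells at random, weighted by their observed proportions
    idx <- sample(nrow(cells), size = n, replace = TRUE, prob = prob)
  }

  out <- cells[idx, cols, drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy data: half the rows are (x, u), a quarter (y, u), a quarter (y, v)
df <- data.frame(
  a = c("x", "x", "y", "y"),
  b = c("u", "u", "u", "v")
)

exact <- sketch_joint_dist(df, c("a", "b"), n = 8, empirical = TRUE)
drawn <- sketch_joint_dist(df, n = 8)
```

In this sketch, `exact` always contains the combinations in the original 2:1:1 ratio, whereas the counts in `drawn` vary from call to call. The sketch returns plain factors and does not preserve ordered-factor classes.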