This function is mainly used internally, such as for simulating missing data patterns, but is available in case anyone finds it useful.

sim_joint_dist(data, ..., n = 100, empirical = FALSE)



data: the existing tbl


...: columns to calculate the joint distribution from; if none are chosen, all columns with 10 or fewer unique values are used


n: the total number of observations to return


empirical: whether the returned data should have exactly the same distribution of conditions as the input (TRUE), or be sampled from a population with this distribution (FALSE)
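
The difference between the two modes, and the default column choice, can be sketched in base R. This is only an illustration under assumptions, not the package's implementation: sketch_joint_dist and its cols argument are hypothetical names, and the rounding in the exact-proportion branch is one possible way to get whole row counts.

```r
# Hypothetical helper (NOT the package's code): a base-R sketch of the two
# behaviours described for the `empirical` argument.
sketch_joint_dist <- function(data, cols = NULL, n = 100, empirical = FALSE) {
  if (is.null(cols)) {
    # Assumed default: keep columns with 10 or fewer unique values
    few <- vapply(data, function(x) length(unique(x)) <= 10, logical(1))
    cols <- names(data)[few]
  }

  # Count each observed combination of the chosen columns
  counts <- as.data.frame(table(data[cols]), stringsAsFactors = FALSE)
  counts <- counts[counts$Freq > 0, , drop = FALSE]
  prob <- counts$Freq / sum(counts$Freq)

  if (empirical) {
    # Repeat each combination in proportion to its observed frequency;
    # because of rounding, the total can differ slightly from n
    rows <- rep(seq_len(nrow(counts)), times = round(n * prob))
  } else {
    # Draw n combinations at random, weighted by observed frequency
    rows <- sample(seq_len(nrow(counts)), size = n, replace = TRUE, prob = prob)
  }

  out <- counts[rows, cols, drop = FALSE]
  rownames(out) <- NULL
  out
}

# mtcars has 32 rows, so n = 32 reproduces the observed counts exactly
exact <- sketch_joint_dist(mtcars, c("cyl", "gear"), n = 32, empirical = TRUE)
# Without empirical, the counts vary from one call to the next
drawn <- sketch_joint_dist(mtcars, c("cyl", "gear"), n = 32)
```

In this sketch, exact always contains the same number of rows per combination as the input, while drawn only matches those proportions on average. Values come back as character because the counts are built with table().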


data table


sim_joint_dist(ggplot2::diamonds, cut, color, n = 10)
#> # A tibble: 10 × 2
#>    cut       color
#>    <ord>     <ord>
#>  1 Very Good F    
#>  2 Very Good D    
#>  3 Ideal     E    
#>  4 Premium   G    
#>  5 Ideal     E    
#>  6 Good      D    
#>  7 Good      G    
#>  8 Very Good J    
#>  9 Very Good E    
#> 10 Premium   H