It is useful to be able to simulate data with a specified structure. The faux package provides some functions to make this process easier. See the vignettes for more details.

Installation

You can install the released version of faux from CRAN with:

install.packages("faux")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("debruine/faux")

Quick overview

Simulate data for a factorial design

See the Simulate by Design vignette for more details.

between <- list(pet = c(cat = "Cat Owners", 
                        dog = "Dog Owners"))
within <- list(time = c("morning", 
                        "noon", 
                        "evening", 
                        "night"))
mu <- data.frame(
  cat    = c(10, 12, 14, 16),
  dog    = c(10, 15, 20, 25),
  row.names = within$time
)
df <- sim_design(within, between, 
                 n = 100, mu = mu, 
                 sd = 5, r = .5)

Figure: Default design plot
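To make the `mu`, `sd` and `r` arguments in the call above more concrete, here is a minimal base-R sketch of the structure they describe for a single group (the dog owners). It is not how faux is implemented. It only assumes that `sd = 5` applies to every cell and that `r = .5` is the correlation between every pair of within-subject time points. The names `mu_dog`, `cor_mat`, `sigma` and `dog` are made up for this illustration.

```r
# Sketch only: a base-R analogue of one group's within-subject structure.
set.seed(1)                        # arbitrary seed, for reproducibility
times  <- c("morning", "noon", "evening", "night")
mu_dog <- c(10, 15, 20, 25)        # cell means for dog owners, from the design
sd <- 5; r <- 0.5; n <- 100

# Correlation matrix with 1 on the diagonal and r everywhere else
cor_mat <- matrix(r, nrow = 4, ncol = 4, dimnames = list(times, times))
diag(cor_mat) <- 1
sigma <- cor_mat * sd^2            # covariance matrix (all SDs equal)

# Draw correlated normal scores using the Cholesky factor of sigma
z   <- matrix(rnorm(n * 4), nrow = n)
dog <- sweep(z %*% chol(sigma), 2, mu_dog, "+")
colnames(dog) <- times

round(colMeans(dog), 1)            # sample means, should be near mu_dog
round(cor(dog), 2)                 # sample correlations, should be near r
```

Because the values are sampled, the means and correlations will vary around the specified parameters rather than match them exactly.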

p1 <- plot_design(df)
p2 <- plot_design(df, "pet", "time")

cowplot::plot_grid(p1, p2, nrow = 2, align = "v")

Figure: Plot the data with different visualisations.

Simulate new data from an existing data table

See the Simulate from Existing Data vignette for more details.

new_iris <- sim_df(iris, 50, between = "Species") 

Figure: Simulated iris dataset
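The general idea of simulating from an existing table can be sketched without faux: estimate the means and covariances of the numeric columns within each group, then draw new rows from a multivariate normal distribution with those parameters. The sketch below assumes that approach and uses `MASS::mvrnorm()`. `simulate_like()` is a hypothetical helper written for this example, not a faux function, and `sim_df()` may differ in its details.

```r
# Sketch only: draw new rows that mimic the numeric columns of a data frame.
set.seed(2)  # arbitrary seed, for reproducibility

simulate_like <- function(data, n) {   # hypothetical helper, not part of faux
  num <- data[sapply(data, is.numeric)]
  out <- MASS::mvrnorm(n, mu = colMeans(num), Sigma = cov(num))
  as.data.frame(out)
}

# Simulate 50 rows per species, mirroring between = "Species"
groups   <- split(iris, iris$Species)
sim_iris <- do.call(rbind, lapply(names(groups), function(g) {
  cbind(Species = g, simulate_like(groups[[g]], 50))
}))

table(sim_iris$Species)                 # 50 simulated rows per species
aggregate(. ~ Species, sim_iris, mean)  # group means, for comparison with iris
```

Splitting by species before estimating the parameters keeps each group's own means and correlations instead of pooling them across the whole dataset.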

Simulate data for a mixed design

You can build up a cross-classified or nested mixed effects design using piped functions. See the contrasts vignette for more details.

# simulate 20 classes with 20 to 30 students per class
data <- add_random(class = 20) %>%
  add_random(student = sample(20:30, 20, replace = TRUE), 
             .nested_in = "class") %>%
  add_between(.by = "class", 
              school_type = c("private","public"), 
              .prob = c(5, 15)) %>%
  add_between(.by = "student",
              gender = c("M", "F", "NB"),
              .prob = c(.49, .49, .02))

school_type  gender    n
private      M        66
private      F        58
private      NB        3
public       M       173
public       F       175
public       NB        7
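For readers who want to see the nesting spelled out, here is a rough base-R analogue of the same design. It is only a sketch and does not use faux's functions. It assumes that `.prob = c(5, 15)` is meant as exact counts of 5 private and 15 public classes, and that gender is sampled per student with the given probabilities. The object names (`classes`, `students`, `counts`) are illustrative.

```r
# Sketch only: students nested in classes, built with base R.
set.seed(3)  # arbitrary seed, for reproducibility
n_class    <- 20
class_size <- sample(20:30, n_class, replace = TRUE)

# One row per class: exactly 5 private and 15 public, in random order
classes <- data.frame(
  class       = sprintf("class%02d", seq_len(n_class)),
  school_type = sample(rep(c("private", "public"), times = c(5, 15)))
)

# One row per student: repeat each class row once per student in that class
students <- classes[rep(seq_len(n_class), times = class_size), ]
rownames(students) <- NULL
students$student <- sprintf("student%03d", seq_len(nrow(students)))
students$gender  <- sample(c("M", "F", "NB"), nrow(students),
                           replace = TRUE, prob = c(.49, .49, .02))

# Students per school type and gender, in the same layout as the table above
counts <- as.data.frame(
  table(school_type = students$school_type, gender = students$gender),
  responseName = "n"
)
counts
```

Because `school_type` is assigned at the class level, every student in a class shares it, while `gender` varies within classes.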

Other simulation packages

I started this project as a collection of functions I was writing to help with my own work. It’s one of many, many simulation packages in R; here are some others. I haven’t used most of them, so I can’t vouch for them, but if faux doesn’t meet your needs, one of these might.

  • SimDesign: generate, analyse and summarise data from models or probability density functions
  • simstudy: Simulation of Study Data
  • simr: Power Analysis of Generalised Linear Mixed Models by Simulation
  • simulator: streamlines the process of performing simulations by creating a common infrastructure that can be easily used and reused across projects
  • lsasim: Simulate large scale assessment data
  • simmer: Trajectory-based Discrete-Event Simulation (DES)
  • parSim: Parallel Simulation Studies