It is useful to be able to simulate data with a specified structure. The faux package provides some functions to make this process easier. See the vignettes for more details.


You can install the released version of faux (1.0.0) from CRAN with:

install.packages("faux")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("debruine/faux")

Quick overview

Simulate data for a factorial design

between <- list(pet = c(cat = "Cat Owners", 
                        dog = "Dog Owners"))
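# within-subject factor; the level names after "morning" are assumed here,
# four levels are needed to match the four rows of mu
within <- list(time = c("morning",
                        "noon",
                        "evening",
                        "night"))
# cell means: one column per between-subject level, one row per within-subject level
mu <- data.frame(
  cat    = c(10, 12, 14, 16),
  dog    = c(10, 15, 20, 25),
  row.names = within$time
)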
df <- sim_design(within, between, 
                 n = 100, mu = mu, 
                 sd = 5, r = .5)
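
To make the meaning of `mu`, `sd`, and `r` concrete, here is a minimal base-R sketch that draws correlated scores for a single group (the cat owners) from a multivariate normal distribution. It is not faux's implementation, only an illustration of the kind of structure being specified; the level names `noon`, `evening`, and `night` and the object name `cat_owners` are assumptions made for the example.

```r
set.seed(1)

n  <- 100                                                   # subjects in this group
mu <- c(morning = 10, noon = 12, evening = 14, night = 16)  # cell means (level names assumed)
sd <- 5                                                     # same SD in every cell
r  <- 0.5                                                   # same correlation between every pair of cells

# Build the covariance matrix implied by a common SD and a common correlation
k <- length(mu)
cor_mat <- matrix(r, nrow = k, ncol = k)
diag(cor_mat) <- 1
sigma <- cor_mat * sd^2

# Uncorrelated standard normal draws, then impose the covariance via the Cholesky factor
z <- matrix(rnorm(n * k), nrow = n)
x <- z %*% chol(sigma)

# Shift each column to its target mean
x <- sweep(x, 2, mu, "+")
colnames(x) <- names(mu)
cat_owners <- as.data.frame(x)

# Sample statistics should be close to, but not exactly, the requested values
round(colMeans(cat_owners), 1)
round(cor(cat_owners), 2)
```

Because the values are sampled, the observed means and correlations vary around the requested ones from run to run.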

Default design plot

p1 <- plot_design(df)
p2 <- plot_design(df, "pet", "time")

cowplot::plot_grid(p1, p2, nrow = 2, align = "v")

Plot the data with different visualisations.

Simulate new data from an existing data table

new_iris <- sim_df(iris, 50, between = "Species") 
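
The idea of simulating from an existing table can be sketched in base R: estimate the means and covariances of the numeric columns within each group, then draw new rows from a multivariate normal distribution with those parameters. This is an illustration under that assumption, not a description of how `sim_df` works internally; `simulate_group` and `sketch_iris` are hypothetical names used only here.

```r
set.seed(2)

# Numeric columns to simulate; Species is the grouping variable
num_cols <- names(iris)[sapply(iris, is.numeric)]

# Hypothetical helper: draw n rows matching the means and covariances of d
simulate_group <- function(d, n) {
  m <- colMeans(d)
  s <- cov(d)
  z <- matrix(rnorm(n * length(m)), nrow = n)
  x <- sweep(z %*% chol(s), 2, m, "+")
  colnames(x) <- names(m)
  as.data.frame(x)
}

# Simulate 50 new rows per species and stack the results
groups   <- split(iris[num_cols], iris$Species)
sim_list <- lapply(groups, simulate_group, n = 50)
sketch_iris <- do.call(
  rbind,
  Map(function(d, sp) cbind(Species = sp, d), sim_list, names(sim_list))
)

# Compare group means in the original and simulated data
aggregate(. ~ Species, data = iris, FUN = mean)
aggregate(. ~ Species, data = sketch_iris, FUN = mean)
```

The simulated rows share the group-level means and correlations of the original data but are new draws, so individual values will not match any original observation.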

Simulated iris dataset

Other simulation packages

I started this project as a collection of functions I was writing to help with my own work. It’s one of many, many simulation packages in R; here are some others. I haven’t used most of them, so I can’t vouch for them, but if faux doesn’t meet your needs, one of these might.

  • simstudy: Simulation of Study Data
  • simr: Power Analysis of Generalised Linear Mixed Models by Simulation
  • simulator: streamlines the process of performing simulations by creating a common infrastructure that can be easily used and reused across projects
  • lsasim: Simulate large scale assessment data
  • simmer: Trajectory-based Discrete-Event Simulation (DES)
  • parSim: Parallel Simulation Studies