It is useful to be able to simulate data with a specified structure. The faux package provides some functions to make this process easier. See the vignettes for more details.


You can install the released version of faux from CRAN with:

install.packages("faux")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("debruine/faux")

Quick overview

Simulate data for a factorial design

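library(faux)
# Between-subject factor: the names are level names, the values are display labels
between <- list(pet = c(cat = "Cat Owners", 
                        dog = "Dog Owners"))
# Four time levels, one per row of mu (the names after "morning" are assumed)
within <- list(time = c("morning", 
                        "noon", 
                        "evening", 
                        "night"))
# Cell means: one column per pet level, one row per time level
mu <- data.frame(
  cat    = c(10, 12, 14, 16),
  dog    = c(10, 15, 20, 25),
  row.names = within$time
)
# 100 subjects per between-subject cell, SD of 5, within-subject correlation of .5
df <- sim_design(within, between, 
                 n = 100, mu = mu, 
                 sd = 5, r = .5)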

Figure: Default design plot
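To check that the simulated data roughly match the requested parameters, something like the following sketch may help. It rebuilds the design so that it runs on its own; the time level names other than "morning" and the seed value are assumptions made for illustration, and `plot = FALSE` only suppresses the automatic design plot.

```r
library(faux)

set.seed(8675309)  # arbitrary seed, only so the sketch is reproducible

between <- list(pet = c(cat = "Cat Owners", dog = "Dog Owners"))
# Level names after "morning" are assumed for this sketch
within <- list(time = c("morning", "noon", "evening", "night"))
mu <- data.frame(
  cat = c(10, 12, 14, 16),
  dog = c(10, 15, 20, 25),
  row.names = within$time
)
df <- sim_design(within, between, n = 100, mu = mu,
                 sd = 5, r = .5, plot = FALSE)

# Observed cell means per pet group (base R); these should land near mu
cell_means <- aggregate(cbind(morning, noon, evening, night) ~ pet,
                        data = df, FUN = mean)
print(cell_means)

# faux's own summary of n, means, SDs, and correlations per group
print(get_params(df, between = "pet"))
```

Because the data are sampled, the observed means and correlations will differ slightly from the requested values unless the simulation is run with `empirical = TRUE`.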

p1 <- plot_design(df)
p2 <- plot_design(df, "pet", "time")

cowplot::plot_grid(p1, p2, nrow = 2, align = "v")

Figure: Plot the data with different visualisations.

Simulate new data from an existing data table

new_iris <- sim_df(iris, 50, between = "Species") 

Figure: Simulated iris dataset
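One way to see how closely the simulated table follows the original is to compare group means side by side. This is a minimal sketch using base R summaries; the seed and the helper names (`num_cols`, `orig_means`, `sim_means`) are introduced here for illustration only.

```r
library(faux)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# 50 simulated rows per Species, based on the means, SDs, and
# correlations of the numeric columns within each Species
new_iris <- sim_df(iris, 50, between = "Species")

num_cols <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Per-species means in the original and the simulated data
orig_means <- aggregate(iris[num_cols],
                        by = list(Species = iris$Species), FUN = mean)
sim_means <- aggregate(new_iris[num_cols],
                       by = list(Species = new_iris$Species), FUN = mean)

print(orig_means)
print(sim_means)
```

The two tables should be similar but not identical, since `sim_df` samples new values rather than copying rows.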

Other simulation packages

I started this project as a collection of functions I was writing to help with my own work. It’s one of many, many simulation packages in R; here are some others. I haven’t used most of them, so I can’t vouch for them, but if faux doesn’t meet your needs, one of these might.

  • simstudy: Simulation of Study Data
  • simr: Power Analysis of Generalised Linear Mixed Models by Simulation
  • simulator: streamlines the process of performing simulations by creating a common infrastructure that can be easily used and reused across projects
  • lsasim: Simulate large scale assessment data
  • simmer: Trajectory-based Discrete-Event Simulation (DES)
  • parSim: Parallel Simulation Studies