scale()

R
scale
Published

2017-06-22

You can use scale() to center and/or scale (i.e., Z-score) a vector of numbers.

Z-score a vector of numbers

x <- c(10, 12, 14, 16, 18)
scale(x)
           [,1]
[1,] -1.2649111
[2,] -0.6324555
[3,]  0.0000000
[4,]  0.6324555
[5,]  1.2649111
attr(,"scaled:center")
[1] 14
attr(,"scaled:scale")
[1] 3.162278

However, the result is a one-column matrix that carries the mean and SD as attributes. This can cause problems if you want to assign it to a new column in a data frame, which you can fix by wrapping the call in as.vector().

as.vector(scale(x))
[1] -1.2649111 -0.6324555  0.0000000  0.6324555  1.2649111
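To make the point about the stored mean and SD concrete, here is a minimal sketch that reads them back with attr() and uses them to undo the scaling. The variable names z, m, s and x_back are only illustrative.

```r
x <- c(10, 12, 14, 16, 18)
z <- scale(x)

# scale() returns a one-column matrix rather than a plain vector
is.matrix(z)
dim(z)

# The values used for centering and scaling are stored as attributes
m <- attr(z, "scaled:center")  # mean of x
s <- attr(z, "scaled:scale")   # SD of x

# Reverse the transformation to recover the original numbers
x_back <- as.vector(z) * s + m
all.equal(x_back, x)
```

Keeping the attributes around can be handy if you later need to put new data on the same scale or convert results back to the original units.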

I find it more straightforward to just use the equation for a Z-score.

( x - mean(x) ) / sd(x)
[1] -1.2649111 -0.6324555  0.0000000  0.6324555  1.2649111
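One difference worth knowing about: the two approaches treat missing values differently. scale() omits NAs when computing the mean and SD, while mean() and sd() return NA unless you pass na.rm = TRUE. The sketch below uses a hypothetical vector x_na to show this.

```r
x_na <- c(10, 12, NA, 16, 18)

# scale() ignores the NA when computing the mean and SD;
# the NA itself stays NA in the result
z_scale <- as.vector(scale(x_na))

# The plain equation returns all NA because mean() and sd() propagate NA
z_naive <- (x_na - mean(x_na)) / sd(x_na)

# Adding na.rm = TRUE reproduces the scale() result
z_narm <- (x_na - mean(x_na, na.rm = TRUE)) / sd(x_na, na.rm = TRUE)

all(is.na(z_naive))
all.equal(z_scale, z_narm)
```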

You can just center the numbers without scaling.

as.vector(scale(x, center=TRUE, scale=FALSE))
[1] -4 -2  0  2  4
( x - mean(x) )
[1] -4 -2  0  2  4

Scaling without centering divides the numbers by their root mean square. Note that scale() computes this with n - 1 in the denominator rather than n.

as.vector(scale(x, center=FALSE, scale=TRUE))
[1] 0.6262243 0.7514691 0.8767140 1.0019589 1.1272037
x / sqrt(sum(x^2)/(length(x)-1))
[1] 0.6262243 0.7514691 0.8767140 1.0019589 1.1272037
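The n - 1 denominator means the divisor is slightly larger than the conventional root mean square, which divides by n. A short sketch comparing the two, with rms_n1 and rms_n as illustrative names:

```r
x <- c(10, 12, 14, 16, 18)

# Divisor used by scale(x, center = FALSE): n - 1 in the denominator
rms_n1 <- sqrt(sum(x^2) / (length(x) - 1))

# Conventional root mean square: n in the denominator
rms_n <- sqrt(mean(x^2))

# scale() stores the divisor it used as an attribute
used <- attr(scale(x, center = FALSE), "scaled:scale")

all.equal(used, rms_n1)  # matches the n - 1 version
rms_n1 > rms_n           # so it is larger than the conventional RMS
```

If you want the conventional version, dividing by rms_n directly (x / rms_n) is the simplest route.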

Set scale to a number to divide by that number.

as.vector(scale(x, center=FALSE, scale=3))
[1] 3.333333 4.000000 4.666667 5.333333 6.000000
x / 3
[1] 3.333333 4.000000 4.666667 5.333333 6.000000
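The center argument accepts a number in the same way, which subtracts that number instead of the mean. A small sketch, where the value 10 is an arbitrary choice for illustration:

```r
x <- c(10, 12, 14, 16, 18)

# Subtract 10 from every value instead of subtracting the mean
x_c10 <- as.vector(scale(x, center = 10, scale = FALSE))

# Same thing written directly
x_direct <- x - 10

all.equal(x_c10, x_direct)
```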

Create new columns in a data frame with the scaled or centered variable.

suppressMessages( library(tidyverse) )
df <- data.frame(id = seq(1,5), x = x)
df.s <- df %>%
  mutate(
    x.s = as.vector(scale(x)),
    x.c = as.vector(scale(x, scale=FALSE)),
    x.z = (x - mean(x)) / sd(x)
  )
df.s
  id  x        x.s x.c        x.z
1  1 10 -1.2649111  -4 -1.2649111
2  2 12 -0.6324555  -2 -0.6324555
3  3 14  0.0000000   0  0.0000000
4  4 16  0.6324555   2  0.6324555
5  5 18  1.2649111   4  1.2649111
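To see why as.vector() matters here, the base R sketch below assigns the scale() result both with and without it. The data frame df2 and the column name x.m are made up for this illustration.

```r
x <- c(10, 12, 14, 16, 18)
df2 <- data.frame(id = seq(1, 5), x = x)

# Without as.vector(), the new column is stored as a one-column matrix
df2$x.m <- scale(df2$x)

# With as.vector(), the new column is an ordinary numeric vector
df2$x.s <- as.vector(scale(df2$x))

is.matrix(df2$x.m)
is.matrix(df2$x.s)

# The numbers themselves are the same in both columns
all.equal(as.vector(df2$x.m), df2$x.s)
```

The matrix column prints much like a normal one, which is why the problem can go unnoticed until a later step expects a plain vector.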