scale()
You can use scale() to center and/or scale (i.e., Z-score) a vector of numbers.
Z-score a vector of numbers
x <- c(10, 12, 14, 16, 18)
scale(x)
## [,1]
## [1,] -1.2649111
## [2,] -0.6324555
## [3,] 0.0000000
## [4,] 0.6324555
## [5,] 1.2649111
## attr(,"scaled:center")
## [1] 14
## attr(,"scaled:scale")
## [1] 3.162278
However, the result is a one-column matrix that also carries the mean and SD as attributes ("scaled:center" and "scaled:scale").
This can cause problems if you want to assign it to a new column in a data frame,
which you can fix using as.vector()
as.vector(scale(x))
## [1] -1.2649111 -0.6324555 0.0000000 0.6324555 1.2649111
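To see why the raw result can be awkward in a data frame, the following sketch inspects what scale() returns and what happens when it is assigned to a column. The names z and d are illustrative only and are not part of the original example.

```r
x <- c(10, 12, 14, 16, 18)

# scale() returns a one-column matrix, not a plain vector
z <- scale(x)
is.matrix(z)               # TRUE
dim(z)                     # 5 1

# the mean and SD used for scaling are stored as attributes
attr(z, "scaled:center")   # 14
attr(z, "scaled:scale")    # 3.162278

# assigning the matrix directly stores a matrix column in the data frame
d <- data.frame(x = x)
d$x.m <- z
is.matrix(d$x.m)           # TRUE

# as.vector() drops the dimensions and the attributes
d$x.v <- as.vector(z)
is.matrix(d$x.v)           # FALSE

# the stored attributes can be used to recover the original values
as.vector(z * attr(z, "scaled:scale") + attr(z, "scaled:center"))
```

If you need the mean and SD later (for example, to apply the same transformation to new data), it may be worth saving the attributes before calling as.vector().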
I find it more straightforward to just use the equation for a Z-score
( x - mean(x) ) / sd(x)
## [1] -1.2649111 -0.6324555 0.0000000 0.6324555 1.2649111
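One difference between the two approaches shows up with missing values: scale() leaves NAs out when it computes the mean and SD, while the plain formula returns NA for every element unless you pass na.rm = TRUE. The sketch below illustrates this. The vector y and the helper zscore() are hypothetical names introduced here for illustration.

```r
y <- c(10, 12, NA, 16, 18)

# the plain formula propagates the NA to every element
(y - mean(y)) / sd(y)

# scale() omits NAs when computing the mean and SD
as.vector(scale(y))

# zscore() is a hypothetical helper, not a base R function
zscore <- function(v, na.rm = TRUE) {
  (v - mean(v, na.rm = na.rm)) / sd(v, na.rm = na.rm)
}

zscore(y)
all.equal(zscore(y), as.vector(scale(y)))   # TRUE
```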
You can just center the numbers without scaling.
as.vector(scale(x, center=TRUE, scale=FALSE))
## [1] -4 -2 0 2 4
( x - mean(x) )
## [1] -4 -2 0 2 4
Scaling without centering divides the numbers by their root mean square, which scale() computes with n - 1 in the denominator rather than n.
as.vector(scale(x, center=FALSE, scale=TRUE))
## [1] 0.6262243 0.7514691 0.8767140 1.0019589 1.1272037
x / sqrt(sum(x^2)/(length(x)-1))
## [1] 0.6262243 0.7514691 0.8767140 1.0019589 1.1272037
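Note that the root mean square of uncentered numbers is not the same as their SD, so scale(x, center=FALSE) does not divide by sd(x). A minimal sketch of the difference follows. The variable rms is an illustrative name, and passing sd(x) as the scale argument is just one way to divide by the SD without centering.

```r
x <- c(10, 12, 14, 16, 18)

# root mean square as scale() defines it (n - 1 in the denominator)
rms <- sqrt(sum(x^2) / (length(x) - 1))
rms      # about 15.97
sd(x)    # about 3.16

# the default uncentered scaling matches division by rms, not by sd(x)
all.equal(as.vector(scale(x, center = FALSE)), x / rms)   # TRUE

# to divide by the SD without centering, supply it explicitly
all.equal(as.vector(scale(x, center = FALSE, scale = sd(x))), x / sd(x))   # TRUE
```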
Set the scale argument to a number to divide by that number
as.vector(scale(x, center=FALSE, scale=3))
## [1] 3.333333 4.000000 4.666667 5.333333 6.000000
x / 3
## [1] 3.333333 4.000000 4.666667 5.333333 6.000000
Create new columns in a data frame with the scaled or centered variable
suppressMessages( library(tidyverse) )
df <- data.frame(id = seq(1,5), x = x)
df.s <- df %>%
  mutate(
    x.s = as.vector(scale(x)),
    x.c = as.vector(scale(x, scale = FALSE)),
    x.z = (x - mean(x)) / sd(x)
  )
df.s
## id x x.s x.c x.z
## 1 1 10 -1.2649111 -4 -1.2649111
## 2 2 12 -0.6324555 -2 -0.6324555
## 3 3 14 0.0000000 0 0.0000000
## 4 4 16 0.6324555 2 0.6324555
## 5 5 18 1.2649111 4 1.2649111
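When several columns need the same treatment, scale() can also be given a numeric data frame or matrix, in which case each column is centered and scaled independently. The following is a base R sketch of that approach. The column y and the names d, num_cols, scaled, and d.s are made up for illustration.

```r
# y is a made-up second variable for illustration
d <- data.frame(id = 1:5, x = c(10, 12, 14, 16, 18), y = c(1, 3, 2, 5, 4))
num_cols <- c("x", "y")

# scale() treats each column separately and returns a matrix
scaled <- scale(d[, num_cols])
colnames(scaled) <- paste0(num_cols, ".s")

# convert to a data frame so the new columns are plain numeric vectors
d.s <- cbind(d, as.data.frame(scaled))
d.s

# each new column matches the Z-score equation applied to its source column
all.equal(d.s$x.s, (d$x - mean(d$x)) / sd(d$x))   # TRUE
all.equal(d.s$y.s, (d$y - mean(d$y)) / sd(d$y))   # TRUE
```

Leave identifier columns such as id out of the selection, otherwise they are scaled too.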