I’m a sucker for a silly correlation, and I’ve also been meaning to try out the ravelRy package for working with the ravelry API.
Setup
library(tidyverse) # for data wrangling
library(ravelRy)   # for searching ravelry
library(rvest)     # for web scraping
library(DT)        # for interactive table
library(showtext)  # for fonts

# add the font Ravelry uses on their webpage
font_add_google(name = "Jost", family = "Jost")
showtext_auto()

# increase penalty for scientific notation
options(scipen = 4)
13.1 Craft Shops
I wanted to find the country of all the users on Ravelry, but the ravelRy package doesn’t have a function for searching people and I don’t have time to write one today, so I just searched for yarn shops, which also have a country attribute.
The slowly() function is from purrr; I used it to retrieve the details of 1000 yarn shops every 10 seconds. There turned out to be 8072 shops. You get an error if you try to retrieve an empty page of shops, so I wrapped the search_shops() code in tryCatch() so the function didn’t fail on the last, empty page. I saved the result to RDS so I can set this code chunk to eval = FALSE and read the file in without calling the Ravelry API every time I knit this book.
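As a rough sketch of that pattern (not the actual chunk used for this chapter), the code below rate-limits a paged request with slowly() and swallows the empty-page error with tryCatch(). The fetch_page() function and the fake_shops vector are hypothetical stand-ins for the real search_shops() call and its results, and the delay is shortened from 10 seconds so the example runs quickly.

```r
library(purrr)

# Hypothetical stand-in for the API: 25 fake shop countries, served in pages.
# Like the behaviour described above, it errors when asked for an empty page.
fake_shops <- rep(c("Scotland", "Norway", "Japan"), length.out = 25)

fetch_page <- function(page, page_size = 10) {
  start <- (page - 1) * page_size + 1
  if (start > length(fake_shops)) stop("empty page")
  fake_shops[start:min(start + page_size - 1, length(fake_shops))]
}

# Return NULL instead of failing when a page is empty
safe_fetch <- function(page) {
  tryCatch(fetch_page(page), error = function(e) NULL)
}

# Rate-limit the wrapped call (the text uses 10 seconds; 0.1 here for speed)
slow_fetch <- slowly(safe_fetch, rate = rate_delay(0.1))

# Ask for more pages than exist; the NULL pages drop out in unlist()
pages <- map(1:5, slow_fetch)
shops <- unlist(pages)

# saveRDS(shops, "data/yarn_shops.rds")  # cache so later knits skip the API
```

Requesting a couple of pages past the end is wasteful but harmless here, because each empty page just contributes NULL.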
I had to fix some country names so they match with the other data sets. There is only one happiness value for the UK, so I reluctantly combined the four countries in the UK.
shops <- readRDS("data/yarn_shops.rds")

shop_countries <- tibble(country = shops) %>%
  mutate(country = recode(country,
                          Wales = "United Kingdom",
                          Scotland = "United Kingdom",
                          England = "United Kingdom",
                          "Northern Ireland" = "United Kingdom",
                          "Viet Nam" = "Vietnam",
                          "Bosnia and Herzegowina" = "Bosnia and Herzegovina",
                          "Korea, Republic of" = "South Korea",
                          .default = country)) %>%
  count(country, name = "yarn_shops")
13.2 Temperature
I feel like average temperature might have something to do with the number of yarn shops in a country, so I scraped that from a Wikipedia page using rvest. I had to edit the Wikipedia page first because the table formatting was borked. I also had to replace the HTML minus symbol with an ordinary hyphen, because it made the negative numbers read in as character strings.
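To illustrate the minus-sign problem, here is a minimal sketch that parses a tiny inline table rather than the real Wikipedia page. The country names and values are invented, and the column names are assumptions. The table cell contains the typographic minus (U+2212), so the column arrives as character until the symbol is swapped for a plain hyphen.

```r
library(rvest)

# Invented two-row table; &minus; is the HTML entity for U+2212
page <- minimal_html('
  <table>
    <tr><th>country</th><th>temperature</th></tr>
    <tr><td>A</td><td>&minus;5.3</td></tr>
    <tr><td>B</td><td>28.2</td></tr>
  </table>')

temp <- html_table(html_element(page, "table"))

# The temperature column is character at this point because of the U+2212 minus
was_character <- is.character(temp$temperature)

# Replace the typographic minus with a hyphen-minus, then convert
temp$temperature <- as.numeric(
  gsub("\u2212", "-", temp$temperature, fixed = TRUE)
)
```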
Then I decided that the average temperature for the coldest month might be better. I found it on a site with the rather upsetting name listfist, but I had problems reading it with rvest, so I gave up.
html <- read_html("https://listfist.com/list-of-countries-by-average-temperature")
13.3 Happiness
I got the happiness ratings from the World Happiness Report, which also included a population column in units of thousands.
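Because the population column is in thousands, the number of yarn shops per million people is yarn_shops / (population / 1000). A minimal sketch of the join and the unit conversion, using invented numbers and assumed column names (happiness, population, shops_per_million are not necessarily the names in the real data):

```r
library(dplyr)

# Invented toy values, for illustration only
shop_countries <- tibble(country = c("A", "B", "C"),
                         yarn_shops = c(50, 200, 3))

happy <- tibble(country = c("A", "B", "D"),
                happiness = c(7.1, 6.5, 5.0),
                population = c(5000, 60000, 1200))  # units of thousands

# Keep only countries present in both tables, then convert to per-million
shops_happy <- inner_join(shop_countries, happy, by = "country") %>%
  mutate(shops_per_million = yarn_shops / (population / 1000))
```

An inner join silently drops countries whose names don't match across the two sources, which is why the country names had to be harmonised first.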
Let’s just have a quick look at the correlations here. The number of yarn shops per million population is positively correlated with happiness and negatively correlated with mean annual temperature. However, temperature is also negatively correlated with happiness.
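A quick look like this can be done with base R’s cor() on the three numeric columns. The sketch below uses an invented five-row data frame whose values were chosen only to show the same signs as described above; the column names are assumptions.

```r
# Invented toy data, not the real country values
toy <- data.frame(
  shops_per_million = c(10.2, 3.3, 0.5, 6.1, 1.4),
  happiness         = c(7.4, 6.6, 4.9, 7.0, 5.5),
  temperature       = c(2.1, 9.5, 26.0, 6.3, 21.7)
)

# Pairwise Pearson correlations, ignoring missing values pair by pair
cors <- cor(toy, use = "pairwise.complete.obs")
round(cors, 2)
```

Since temperature is related to both of the other variables, the shops and happiness correlation on its own says nothing about which relationship is doing the work.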
First, I made a simple plot of the correlation between the number of yarn shops per million people and happiness score for each country. I set the size of the points relative to the population and the colour relative to the mean annual temperature.
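A plot along those lines might be sketched as follows. This is a guess at the structure, not the actual plotting code: the data are invented, the column names are assumptions, and the choice of which variable goes on which axis is mine.

```r
library(ggplot2)

# Invented toy data, for illustration only
toy <- data.frame(
  shops_per_million = c(10.2, 3.3, 0.5, 6.1, 1.4),
  happiness         = c(7.4, 6.6, 4.9, 7.0, 5.5),
  temperature       = c(2.1, 9.5, 26.0, 6.3, 21.7),
  population        = c(5000, 60000, 120000, 9000, 30000)  # thousands
)

p <- ggplot(toy, aes(x = shops_per_million, y = happiness,
                     size = population, colour = temperature)) +
  geom_point(alpha = 0.8) +
  scale_colour_viridis_c() +
  labs(x = "Yarn shops per million people",
       y = "Happiness score",
       size = "Population (thousands)",
       colour = "Mean annual temperature")

# print(p) to draw the plot
```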