library(tidyverse) # for data wranglinglibrary(readxl) # for reading data from WHOlibrary(patchwork) # for combining plotslibrary(ggtext) # for styled text in plotslibrary(showtext) # for fontsfont_add_google("Roboto", regular.wt =300, bold.wt =500)font_add_google("Roboto", family ="RobotoB", regular.wt =400, bold.wt =700)showtext_auto()
24.1 Examples
I browsed the Financial Times Data Visualisation webpage to find a plot I want to re-create. I don’t have a subscription, so I could only see the thumbnails, but it looks like this is a section for snarky commentary on plots.
I did find that you can hack the image URLs to increase the width and see bigger versions.
The Visual and data journalism section looks more fruitful, if I had a subscription. Then I searched twitter (congrats on getting a 2-letter handle, @FT!) and found a link to an article on Ten charts that tell the story of 2019 that I could access. It’s interesting to read some pre-pandemic news; the first two charts are on Brexit and anti-vax movements.
I always like these lollipop charts and haven’t had much practice making them, so I’m going to recreate this plot.
24.2 Data
First, I needed to find the data. The original chart said the data came from WHO and Unicef; I downloaded the WHO Data on MCV1.
I tried to get the list of countries shown on the original chart programatically, but they’re not just the biggest increases and decreases, so I had to extract them manually. I wrote a quick function to search for the code for any name or part of a name.
Some of the countries on the chart use different names or abbreviations, so I made a named vector with the country codes and names for the chart.
Code
countries <-c(Global ="Global", KAZ ="Kazakhstan", GBR ="UK", USA ="USA", SDN ="Sudan", UGA ="Uganda", COD ="Dem Rep of Congo", HTI ="Haiti", NGA ="Nigeria", ETH ="Ethiopia", AGO ="Angola", SOM ="Solmalia", TCD ="Chad", THA ="Thailand", UKR ="Ukraine", IDN ="Indonesia", VEN ="Venezuela", CMR ="Cameroon", PHL ="Philippines", YEM ="Yemen", SYR ="Syria", MDG ="Madagascar", SSD ="South Sudan", CAF ="Ctl African Rep", GIN ="Guinea")
I used that vector to filter the mcv1 table, make a new column called country and make it a factor so the countries will display in the right order. I also filtered the table to just the “WUENIC” coverage and made the table wide so there was a column for each year.
Time for the first plot! The data aren’t exactly the same as on the original chart. While the original chart showed an increase in vaccination in the Democratic Republic of the Congo, the data I’m using show a decrease. Similarly, Indonesia has a change in the opposite direction as the original chart. I’ll plot the data I have, but in the order of the original plot, so I’ll need to make two separate plots and combine them with patchwork.
First, I need to set up a few things for the plot theme. I extracted the colour for the dots for 2008 and 2018, plus the background and stripe colours, and the grey text colour, and set those to variables I can use later.
In the original chart, Kazakhstan doesn’t have much change at all, so the dots are nudged up and down vertically so they don’t completely overlap. The USA also doesn’t have discernible change and isn’t nudged in the original plot, but I’ll nudge it here. I’ll make a new variable called y, which is the numeric value for the country factor, but reversed, and y08 and y18 which are the y-axis values for each year. They’re all the same as y, except for Kazakhstan and the USA.
I’ll also add a new column called stripe to set the alpha for the stripe for each country, following the pattern from the original chart, with Global being a dark stripe, and alternating semi-transparent and no stripe after that.
This plot took a lot of trial and error with the annotations. The trick is to set coord_cartesian(clip = "off") so you can plot outside the axis limits, and add some plot margin with the theme().
Code
# define top panelrisen <-slice(figdat, 1:13) %>%ggplot(aes(y = y)) +geom_hline(aes(yintercept = y, alpha =I(stripe)),color = bg_tan, size =9) +geom_vline(xintercept =95, linetype =2, size =0.35) +geom_segment(aes(x = Y2008, xend = Y2018, yend = y), color ="black", size =0.25) +geom_point(aes(x = Y2008, y = y08), color = dot_08, size =3) +geom_point(aes(x = Y2018, y = y18), color = dot_18, size =3) +annotate("text", size =3,label ="The measles 'first dose' immunisation\noffers an individual 90% protections from\nthe disease. Twenty-three countries have\nyet to introduce the second dose, which\nwould increase this cover to 99%.", x =3, y =24.15, hjust =0, vjust =1) +annotate("text", size =3, label ="(95% threshold\nneeded to control\nspread in a community)", x =100, y =26, hjust =1, vjust =0, lineheight =1, color = text_color) +annotate("richtext", size =3,label ="<span style='color: #24559A; font-size: 22px;'>●</span> 2008 <span style='color: #DD6A8D; font-size: 22px;'>●</span> 2018",x =-23, y =26.5, hjust =0, vjust =0, color = text_color,label.colour ="transparent") +scale_x_continuous(breaks = (0:10)*10, expand =expansion(0)) +scale_y_continuous(breaks =1:25, labels =rev(figdat$country),expand =expansion(add = .6)) +coord_cartesian(xlim =c(0, 100), ylim =c(13, 25), clip ="off") +labs(x =NULL, y =NULL,title="Countries where the MCV1 coverage has been maintained or risen")# define themeft_theme <-theme_minimal(base_family ="Roboto", base_size =10) +theme(text =element_text(color = text_color),plot.background =element_rect(fill = bg_light, color ="transparent"),axis.line.y.left =element_line(color ="grey60", size =0.2),axis.line.y.right =element_line(color ="grey60", size =0.2),panel.grid.major.y =element_blank(),panel.grid.minor.x =element_blank(),panel.grid.minor.y =element_blank(),panel.grid.major.x =element_line(size =0.4, color ="grey90"),plot.title.position ="plot",plot.title =element_text(size =9, face ="bold") )# display top panel with themetop_panel <- risen + ft_theme +theme(plot.margin =unit(c(.4, .3, .1, .1), "inches"))top_panel
24.3.4 Fallen plot
The lower plot is a bit simpler, with fewer annotations.
Code
# define bottom panelfallen <-slice(figdat, 14:25) %>%ggplot(aes(y = y)) +geom_hline(aes(yintercept = y, alpha =I(stripe)),color = bg_tan, size =9) +geom_vline(xintercept =95, linetype =2, size =0.35) +geom_segment(aes(x = Y2008, xend = Y2018, yend = y), color ="black", size =0.25) +geom_point(aes(x = Y2008, y = y08), color = dot_08, size =3) +geom_point(aes(x = Y2018, y = y18), color = dot_18, size =3) +scale_x_continuous(breaks = (0:10)*10, expand =expansion(0)) +scale_y_continuous(breaks =1:25, labels =rev(figdat$country),expand =expansion(add = .6)) +coord_cartesian(xlim =c(0, 100), ylim =c(1, 12), clip ="off") +labs(x =NULL, y =NULL,title="Countries where the MCV1 coverage has fallen")# display bottom panel with themebottom_panel <- fallen + ft_theme +theme(plot.margin =unit(c(.1, .3, .1, .1), "inches"))bottom_panel
24.3.5 Combine Plots
Code
top_panel + bottom_panel +plot_annotation(title ="**Global health:** mixed fortunes countering the 'anti-vax' movement",subtitle ="Measles immunisation coverage (%) for the first dose (MCV1) among 1 year olds \nEstimates, as of Apr 23 2022", theme =theme(plot.background =element_rect(fill = bg_light, color ="transparent"),plot.title =element_markdown(size =12, family ="RobotoB", face ="plain", color ="black"),plot.subtitle =element_markdown(size =10, family ="Roboto", lineheight =1.5,face ="plain", color = text_color) ) ) +plot_layout(nrow =2)