The small things
How a tiny soloution for me helped the larger team.
By Millie O. Symns in R
February 21, 2022
Whenever I learn something new in R, I feel like a magician, especially visualizations.
Since I usually work on cross-functional teams, I am often learning about business problems and playing my part to help solve the issue with data. Someone recently asked me:
What project have you worked on that created a lasting impact?
I have written papers and completed research projects for program directors and leaders to make decisions (and I am proud of those). I think the story folks want to hear, but it doesn’t always have to be the big thing that makes all the difference.
Background
Working on a global team with schools in various countries meant dealing with varying school calendars. After a couple of months in, I was trying to wrap my head around academic years.
I wondered if there was looking for information on these school timelines. I asked around if there was any documentation or visuals about the school schedules, but no luck. Then I found the right colleague who kindly pointed to a mini-database I hadn’t seen that had many columns of information about the schools, and there it was - information on the school calendars for all schools across the globe! 🥳
I extracted the CSV and happily opened up RStudio to get in there.
NOTE: I created a sample dataset to show my process.
Process
At first, I thought I would create a clean table for reference, but knowing myself and dealing with a concept of time 🙃 it felt best to create a visual.
I did a quick search on google to find what techniques people have used to create Gantt charts and came across this page: https://cran.r-project.org/web/packages/vistime/vignettes/vistime-vignette.html#ex.-3-gantt-charts
I knew I wanted something quick and function, so this seemed to fit!
Based on this document, I figured out the steps I needed to take:
- Create a table of the calendar months to see frequencies and assign colors
- Create start and end dates from the months
- Format a data frame for the final visual
At first I started with a file that looked something like this:
id | geo | country | school_calendar_months | school_year_start_month | school_year_end_month |
---|---|---|---|---|---|
1 | APAC | Australia | February - November | February | November |
3 | APAC | New Zealand | February - December | February | December |
6 | EMEA | England | September - July | September | July |
16 | North America | United States | September - June | September | June |
My sample data is about 30 rows with 4 geos/regions and 15 countries with various start and end months
Then took a look at different academic years and assigned a color to each. I decided to input a gradient of green for this one so that the end result shows a darker color (meaning more schools in this country with that particular academic year schedule) to a lighter color.
school_calendar <- raw_school_info %>%
# Getting a count of the calendar month combos
group_by(school_calendar_months) %>%
summarise(sum = n()) %>%
arrange(desc(sum)) %>%
# hard coding exact colors I would like in the visual based on the sum of calendar month combos
mutate(color = case_when(sum == 9 ~ "#145a32",
sum == 5 ~ "#1e8449",
sum == 4 ~ "#27ae60",
sum == 2 ~ "#52be80",
sum == 1 ~ "#7dcea0",
TRUE ~ ""),
fontcolor = "white")
I needed a date column, so I just made a lookup table to join in later to save my brain. It felt the easiest at the time.
l_months <-tribble(
~month, ~date,
"January", "20220101",
"February", "20220201",
"March", "20220301",
"April", "20220401",
"May", "20220501",
"June", "20220601",
"July", "20220701",
"August", "20220801",
"September", "20220901",
"October", "20221001",
"November", "20221101",
"December", "20221201"
) %>%
mutate(date = as_date(date))
Then I started doing a series of left join to get the date columns where I needed them to be
#Add date lookup for the start date
school_dates <- left_join(raw_school_info, l_months, by = c("school_year_start_month" = "month")) %>%
rename(start_date = date)
#Add date lookup for the end date
school_dates <- left_join(school_dates, l_months, by = c("school_year_end_month" = "month")) %>%
rename(t_end_date = date)
#Add colors for calendar months
school_dates <- left_join(school_dates, school_calendar, by = "school_calendar_months")
Then for my last bit of formatting, I needed to correct the end dates so the Gantt chart would display correctly.
#Corrected end dates for academic years
school_dates %<>%
# Check to see which set of months are negative (i.e., start date is after the end date)
mutate(t_months = interval(start_date , t_end_date) %/% months(1),
# Correct the end date where needed
end_date = as_date(ifelse(t_months < 0, ymd(t_end_date) + years(1), ymd(t_end_date))),
school_months = interval(start_date , end_date) %/% months(1)) # may need to add +1 since end dates are the 1st of each month
I then made my final data frame by selecting the columns I needed and removing the duplicate rows, since I was still working on the school level. And finally, plot my Gantt chart.
#Removed duplicates to have unique file by geo and country
school_cal_timeline <- school_dates %>%
select(geo, country, start = start_date, end = end_date, school_months, color, fontcolor) %>%
distinct(geo, country, start, end, .keep_all = T)
#Plot school calendar
vistime(school_cal_timeline, col.group = "geo", col.event = "country", title = "Global School Calendars", optimize_y = F, linewidth = 20)
Final Visual
You can take a picture of the plot and save it as a png. This is about what it looked like:
Takeaways
Having this visual was super helpful at the time. It helped me see a concentration of schools by country with each particular school calendar. It was also great at reminding me what regions and countries would make sense to group for outreach if I wanted to send out pre and post-surveys.
I created this for myself but then thought this might be helpful for another colleague and shared it with them. Word got about, and other folks started asking for the chart I made from other divisions that also needed to deal with global school calendar schedules.
I didn’t go into creating this chart with that intention, but I am glad it had the impact that it did and helped make work a little easier for other folks. 🤗