Last January, when 55 centimeters of snow blanketed Toronto in just 15 hours, the city’s snow removal fleet seemed to be struggling to keep up. But was it really different from other storms, or did it just seem so?
For three students from the Faculty of Applied Science and Engineering at the University of Toronto who were taking “Data Science for Engineers”, a graduate course taught by Sebastien Goodfellow, assistant professor in the department of civil and mineral engineering, it was the perfect case study to test their new skills in numerical calculation.
“There was a lot of media coverage at the time saying the city had responded poorly,” says Katia Ossetchkina, a master’s candidate. “We wanted to see if there was a way to analyze the movement and distribution of snowplows and salt trucks across the city.”
Real-time data on the locations of Toronto’s more than 800 snow plows and salt trucks is publicly available during the winter months. There’s even a website that tracks this data on a map. But the team – which also included master’s candidates Thomas de Boer and Lucas Herzog – soon realized they needed more.
“There is no historical storage,” de Boer explains. “You can’t just download it as a file, so we had to create an algorithm that would ping that web server, download the data, and store it on our computer, which we could then use to create our own historical database and do our analysis from that.”
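The polling approach de Boer describes can be sketched roughly as follows. This is a minimal illustration, not the team's code: the article does not name the feed's URL or format, so `FEED_URL`, the JSON layout, and the field names `id`, `lat` and `lon` are all assumptions.

```python
import json
import sqlite3
import time
import urllib.request

# Hypothetical endpoint: the article does not give the real URL or data format.
FEED_URL = "https://example.org/plow-locations.json"


def fetch_positions(url=FEED_URL):
    """Download the current snapshot; assumes a JSON list of vehicle records."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def open_db(path=":memory:"):
    """Open (or create) the local historical database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS positions ("
        "polled_at REAL, vehicle_id TEXT, lat REAL, lon REAL)"
    )
    return conn


def store_snapshot(conn, records, polled_at=None):
    """Append one snapshot, stamped with the time it was polled."""
    polled_at = time.time() if polled_at is None else polled_at
    rows = [(polled_at, r["id"], r["lat"], r["lon"]) for r in records]
    conn.executemany("INSERT INTO positions VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def poll_forever(conn, interval_s=60, fetch=fetch_positions):
    """Ping the server at a fixed interval and keep every snapshot."""
    while True:
        try:
            store_snapshot(conn, fetch())
        except OSError as err:  # network errors; keep polling on the next tick
            print("poll failed:", err)
        time.sleep(interval_s)
```

Stamping each row with the polling time is what turns a feed that only shows the present into a history that can be queried later.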
By the time the team had set up their method, it was too late to collect data on the January storm. But by analyzing data from subsequent storms, and gleaning statistics about previous ones from city and local news reports, the students were able to verify that the city’s response was improving as winter progressed.
“We learned that Toronto had increased the number of snow plows on the road in February compared to January, and crews were faster in meeting certain benchmarks, such as the percentage of roads that had been cleared of snow at some point during the storm,” says de Boer.
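A benchmark like the percentage of roads cleared at some point during a storm could be computed along these lines. This is only a sketch: it assumes the vehicle positions have already been matched to road segments, a step the article does not describe, and the function and parameter names are hypothetical.

```python
def percent_cleared(all_segments, plow_passes, storm_start, storm_end):
    """Share of road segments with at least one plow pass during the storm.

    plow_passes: iterable of (segment_id, timestamp) pairs.
    """
    all_segments = set(all_segments)
    if not all_segments:
        return 0.0
    # A segment counts once, however many times a plow passed over it.
    cleared = {
        seg
        for seg, t in plow_passes
        if storm_start <= t <= storm_end and seg in all_segments
    }
    return 100.0 * len(cleared) / len(all_segments)
```

Running the same calculation over each storm's time window gives numbers that can be compared from one storm to the next.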
Herzog says the team also spotted other interesting trends.
“Of course they clear the arteries first, but we saw that they would stop clearing around 6 a.m., just before the morning commute,” Herzog says.
“And that’s where a lot of these Twitter complaints are coming from,” de Boer adds. “People were wondering how they’re supposed to get to a thoroughfare when the street in front of their entrance is blocked by two feet of snow.”
Spurred by these kinds of observations, the team decided to take the project a step further by applying their data analysis to Twitter messages. The team used Twitter’s Application Programming Interface (API) to gather feedback from those tweeting at Toronto’s 311 and the City of Toronto’s Winter Operations account. They were then able to carry out what is called a “sentiment analysis”, measuring whether the words used in the tweets were positive or negative.
This allowed the team to compare the public’s response to the January storm to one that occurred in February.
“We saw a lot of negative tweets in January with people complaining about not having service yet, and that also came with a lot of geographic information, so we could see the hardest hit areas,” Ossetchkina says.
“Then we saw this reverse trend in February where people were saying ‘thank you’ and saying the city was doing a good job in specific regions. That was a very interesting performance metric.”
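In its simplest form, a sentiment analysis of this kind counts positive and negative words in each tweet and averages the scores by period. The sketch below assumes a tiny hand-made word list and made-up tweets; the article does not say which lexicon or tool the team actually used.

```python
import re
from collections import defaultdict

# Tiny illustrative lexicon (an assumption, not the team's word list).
POSITIVE = {"thank", "thanks", "great", "good", "cleared", "fast"}
NEGATIVE = {"stuck", "blocked", "terrible", "slow", "unplowed", "bad"}


def score(text):
    """Positive-word count minus negative-word count, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def mean_score_by_month(tweets):
    """tweets: iterable of (month, text) pairs -> {month: mean score}."""
    scores = defaultdict(list)
    for month, text in tweets:
        scores[month].append(score(text))
    return {month: sum(s) / len(s) for month, s in scores.items()}


sample = [  # made-up tweets, for illustration only
    ("Jan", "Street still blocked, plows are so slow"),
    ("Jan", "Stuck at home, terrible service"),
    ("Feb", "Thank you, our road was cleared fast"),
]
print(mean_score_by_month(sample))
```

A negative monthly average points to mostly complaints and a positive one to mostly praise, which is the kind of January-to-February reversal the students describe.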
The team says this kind of data analysis could help other engineers on future projects. They made their historical database publicly available and even wrote detailed instructions so other teams could replicate their approach.
Goodfellow says he was very impressed with the students’ work.
“What I love about this project is that it’s very unique,” he says. “This is a new dataset that the students made public and can now be used by other engineers to investigate new questions or to hone their data science skills.
“Even better than that, it was a dataset of the city we all live in that provided a special motivation for students to really push beyond.”