Can I Rightfully Complain About My Bus?
Like most bus riders, I have a love-hate relationship with my route. In theory, I love its frequency and convenience; in practice, I often just give up and walk. I decided to investigate.
Intro
My daily commute to McGill for the past two and a half years has stayed the same: a trip on the 24 bus. The 24 is one of Montreal's "10 minute max" buses: that is, between 6am and 9pm, one should not have to wait more than 10 minutes, and on average no more than 5 minutes, for a bus. To see if the 24 is holding up its end of the deal, I scraped real-time arrivals at my stop from 6am to 9pm for a week. Here are the results.
Part 1: Wait Times
The main goal of this project was to see how often I need to wait more than ten (or five) minutes for a 24 to show up at my stop. Here's a visualization of the results, with the hour on the x axis and the day on the y axis.
Rush Hours

Most noticeable above is the near-constant stream of buses towards downtown during the morning rush. I'm much more likely to face a long wait during the afternoon rush. At first, this seems a banal observation: of course there would be more downtown-bound buses in the morning. But, although I don't have official ridership statistics, I have noticed that there is also a significant rush in the other direction.

Below are the scheduled times for 24 buses at my stop from 7am-9am and 5pm-7pm, the busiest periods on the 24 (by my eye). Italicized entries are buses that end their trip at the metro station after my stop, before entering downtown. These truncated 24s run during rush hour to reduce crowding on the route.

7a 08 15 21 30 35 42 47 50 53
8a 01 10 12 15 18 21 25 28 30 32 35 38 40 45 51 58
 
5p 02 12 18 22 32 42 47 52
6p 02 06 10 15 19 23 28 33 38 46 50 55

Given these schedules, it's safe to say that neither the 8am slot nor the 6pm slot should show any yellow in the visualization above. While this is generally true for the 8am slot, it is not at all the case for the 6pm slot. For example, let's look at the real bus arrivals during these hours on Wednesday (the scraping results unfortunately don't differentiate between the trips ending at the metro station and the rest):

8a 02 06 11 13 16 22 26 29 31 33 39 41 46 56 58 59
6p 02 16 23 29 39 48

Notice that the scheduled and real-time lists contain exactly the same number of arrivals for 8-9am (16 each), which is not true for the 6-7pm slot (12 scheduled versus 6 observed). It looks like, for some reason, the STM's real-time data shows arrivals for the extra trips in the morning rush but not during the afternoon rush.
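
To make that comparison concrete, here is a small Python sketch that counts the arrivals in each list and finds the longest gap between consecutive buses. The minutes are copied from the tables above; the function and variable names are illustrative, and gaps are only measured within each hour, not across the hour boundary.

```python
# Minutes past the hour, copied from the schedule and Wednesday tables above.
scheduled = {
    "8a": [1, 10, 12, 15, 18, 21, 25, 28, 30, 32, 35, 38, 40, 45, 51, 58],
    "6p": [2, 6, 10, 15, 19, 23, 28, 33, 38, 46, 50, 55],
}
observed_wed = {
    "8a": [2, 6, 11, 13, 16, 22, 26, 29, 31, 33, 39, 41, 46, 56, 58, 59],
    "6p": [2, 16, 23, 29, 39, 48],
}


def gaps(minutes):
    """Minutes between consecutive arrivals within a single hour."""
    return [later - earlier for earlier, later in zip(minutes, minutes[1:])]


def summarize(minutes):
    """Number of arrivals and the longest gap between two of them."""
    return {"arrivals": len(minutes), "longest_gap": max(gaps(minutes))}


for slot in scheduled:
    print(slot, "scheduled:", summarize(scheduled[slot]),
          "observed:", summarize(observed_wed[slot]))
```

On these numbers, the 6pm hour drops from 12 scheduled arrivals to 6 observed ones, and the longest gap grows from 8 minutes to 14.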

Off-Peak Periods

At other times, the results are mixed. Below is a simple chart of the probability of my wait time falling into each of the categories defined above, between 10am and 5pm.
The good news for the 24 is the figure in blue: at midday, you have about a 60% chance of waiting less than five minutes for a bus. This beats the "10 min max" standard by 10%. The bad news, however, is that there is still a sizeable percentage of the time where you'd have to wait for longer than 10 minutes.

Although 5-10% of the time might not seem like a high number, there are two main reasons why I'd argue that it should be much lower. First, the 24 generally runs by my stop at 5-7 minute frequencies during this time, meaning that a bus falling into this category is, almost by definition, bunched with another; further, if you've waited ten minutes for a 24, you've missed not one, but two buses. Second, a 10% figure means that, on average, someone relying on it every day, like me, ends up late one day every two weeks.
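
For readers who want to reproduce this kind of breakdown, here is one way the wait-time shares could be computed from a list of arrival times. This is a sketch, not necessarily the method used for the chart: it assumes a rider shows up at a uniformly random moment between the first and last arrival, and `wait_shares` is a made-up name. The example input is the Wednesday 6pm list from earlier, since it is the only observed hour printed in this post; it is not midday data.

```python
def wait_shares(arrivals, short=5, long=10):
    """Share of the time a rider arriving at a random moment would wait
    under `short` minutes, between `short` and `long`, or over `long`.

    `arrivals` is a sorted list of arrival times in minutes.
    """
    under = between = over = 0
    for earlier, later in zip(arrivals, arrivals[1:]):
        gap = later - earlier
        # The last `short` minutes before a bus are a short wait,
        # the `long - short` minutes before that a medium wait,
        # and anything earlier in the gap is a long wait.
        under += min(gap, short)
        between += min(max(gap - short, 0), long - short)
        over += max(gap - long, 0)
    total = arrivals[-1] - arrivals[0]
    return {
        "under": under / total,
        "between": between / total,
        "over": over / total,
    }


wednesday_6pm = [2, 16, 23, 29, 39, 48]
shares = wait_shares(wednesday_6pm)
print({name: round(value, 3) for name, value in shares.items()})
```

The three shares always add up to one, because every minute between the first and last arrival falls into exactly one category.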

Part 2: Real-Time Data Accuracy
A second goal of this project was to evaluate the accuracy of the real-time data itself. The STM was late to the real-time game (as I have previously ranted about on my blog), which raises the question: was it worth the wait?

Below is a simple bar chart of how accurate real-time data was on each day.
And here is another illustrating how often real-time data was more accurate than scheduled data:
These figures aren't particularly kind to the new real-time system. The first chart is inoffensive: a 10-20% rate of inaccuracy larger than two minutes isn't apocalyptic. It's the second chart that's damning: even given real-time data, you'd do better just to look at the schedule 30-40% of the time.
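The two measures discussed above could be computed along these lines. This is a sketch under assumptions: the `Observation` record and its field names are hypothetical, the sample values are invented purely to show the calculation (they are not scraped data), and "better" is read here as "strictly closer to the estimated real arrival".

```python
from dataclasses import dataclass


@dataclass
class Observation:
    actual: float     # estimated real arrival, in minutes after midnight
    realtime: float   # arrival predicted by the real-time feed
    scheduled: float  # arrival listed in the timetable


def accuracy_summary(observations, tolerance=2.0):
    """Share of real-time predictions off by more than `tolerance` minutes,
    and share of cases where the schedule was strictly closer."""
    n = len(observations)
    off = sum(abs(o.realtime - o.actual) > tolerance for o in observations)
    schedule_better = sum(
        abs(o.scheduled - o.actual) < abs(o.realtime - o.actual)
        for o in observations
    )
    return {
        "realtime_off_share": off / n,
        "schedule_better_share": schedule_better / n,
    }


# Invented values for illustration only.
sample = [
    Observation(actual=600, realtime=601, scheduled=598),
    Observation(actual=607, realtime=604, scheduled=606),
    Observation(actual=615, realtime=615, scheduled=612),
    Observation(actual=621, realtime=622, scheduled=621),
]
summary = accuracy_summary(sample)
print(summary)
```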
Conclusion: Improvements Necessary All-Round
I should preface the following by emphasizing my gratefulness: the 24 is the first bus that I've lived near with such a high frequency. In terms of transit accessibility, I'm as fortunate as I've ever been.

There are two final points I'd like to make. First, this data has limited reliability. By comparing real-time arrivals to themselves, I'm assuming that the closer the bus is to the stop, the more accurate the real-time arrivals are (I calculated the 'real' arrival times by interpolating from the scraped data).
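
To illustrate that assumption, here is one simple way a "real" arrival could be estimated from scraped countdowns: trust the last scrape in which the bus still appears, and measure earlier predictions against it. This is only one possible reading of the approach, not the exact interpolation used for this post; the function names and the sample scrapes are made up.

```python
def estimate_arrival(scrapes):
    """Estimate a bus's arrival from (scrape_time, minutes_until_arrival)
    pairs, taking the latest scrape as the most trustworthy one."""
    scrape_time, countdown = max(scrapes, key=lambda s: s[0])
    return scrape_time + countdown


def prediction_errors(scrapes):
    """For each scrape, how many minutes its predicted arrival was off
    from the estimate (positive means it predicted a later arrival)."""
    arrival = estimate_arrival(scrapes)
    return [(t, (t + countdown) - arrival) for t, countdown in sorted(scrapes)]


# Invented scrapes for one bus: times are minutes after midnight.
scrapes = [(480, 9), (483, 7), (486, 3), (488, 1)]
print(estimate_arrival(scrapes))
print(prediction_errors(scrapes))
```

The weakness is visible in the code: the final prediction is never checked against anything, so any error in it carries through to every other comparison.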

Second, even if the data has a medium-large margin of error, the results suggest there is a lot of room for improvement. Because of the 24's high ridership, even a four-minute wait for the 24 at rush hour can mean no place on the next bus for riders. Reliability and consistent intervals between buses are crucial to ensuring people along the route get where they need to go.