Two years ago, I was waiting at the corner bus shelter when I heard someone ask, rhetorically, "Where's my bus?"
Little did they know I was the one sitting behind a desk, Monday through Friday, collating GIS and GTFS data to enable the service that provides such information.
To many people, terms such as GIS (Geographic Information Systems) or GTFS (General Transit Feed Specification) sound like a foreign language. What they have in common is their purpose: both describe data formats and systems that shape real-time predictions for a seamless travel experience.
Enter NextBus.
GTFS, as we know it for NextBus, requires several plain-text files in Unix-based format: agency.txt, calendar.txt, calendar_dates.txt, routes.txt, shapes.txt, stops.txt, stop_times.txt, trips.txt.
At first glance, this sounds like a laundry list, but in fact, these files, when paired together, contain much of the data that informs the predictions made for a transit agency using GTFS format.
The calendar.txt file contains the calendar days, as well as days of the week, for which any given schedule is valid. The calendar_dates.txt file contains the special schedules and the exceptions to the rule, such as a special Thanksgiving schedule or a cancellation of services for Thanksgiving Day.
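As a rough illustration of how the two calendar files combine, here is a minimal Python sketch. The column names and the exception_type codes (1 adds service on a date, 2 removes it) come from the public GTFS reference; the service IDs, dates, and the active_services helper are made up for this example and are not how NextBus processes a feed.

```python
import csv
import io
from datetime import date, datetime

# Hypothetical sample rows; only the column names come from the GTFS reference.
CALENDAR_TXT = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
WEEKDAY,1,1,1,1,1,0,0,20240101,20241231
HOLIDAY,0,0,0,0,0,0,0,20240101,20241231
"""

CALENDAR_DATES_TXT = """service_id,date,exception_type
WEEKDAY,20241128,2
HOLIDAY,20241128,1
"""

# Order matches date.weekday(), where Monday is 0.
DAY_COLUMNS = ["monday", "tuesday", "wednesday", "thursday",
               "friday", "saturday", "sunday"]


def parse_gtfs_date(text):
    """GTFS dates are written as YYYYMMDD."""
    return datetime.strptime(text, "%Y%m%d").date()


def active_services(day, calendar_txt, calendar_dates_txt):
    """Return the set of service_ids running on the given day."""
    active = set()
    # Step 1: the regular weekly pattern from calendar.txt.
    for row in csv.DictReader(io.StringIO(calendar_txt)):
        start = parse_gtfs_date(row["start_date"])
        end = parse_gtfs_date(row["end_date"])
        if start <= day <= end and row[DAY_COLUMNS[day.weekday()]] == "1":
            active.add(row["service_id"])
    # Step 2: the exceptions from calendar_dates.txt.
    for row in csv.DictReader(io.StringIO(calendar_dates_txt)):
        if parse_gtfs_date(row["date"]) != day:
            continue
        if row["exception_type"] == "1":    # service added for this date
            active.add(row["service_id"])
        elif row["exception_type"] == "2":  # service removed for this date
            active.discard(row["service_id"])
    return active


thanksgiving = date(2024, 11, 28)
print(active_services(thanksgiving, CALENDAR_TXT, CALENDAR_DATES_TXT))
```

In this made-up feed, the regular weekday schedule is switched off for Thanksgiving Day and a special holiday schedule is switched on in its place.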
What’s the Formula?
To expand on this point, the shapes.txt file contains the shape of each route coded as a series of shapepoints meant to accurately capture the starts, stops, and turns that a vehicle will take to travel from the start of a trip to a terminal stop, and return to the trip’s origin. Each shapepoint is formatted in the WGS 1984 (World Geodetic System) geocoordinate system in decimal format. This is the same system that designates the location of the Capitol building in Austin, Texas as 30.2747 degrees North latitude and 97.7404 degrees West longitude, or 30.2747, -97.7404. The number of decimal places used for a coordinate determines how far a point can land on the map from its intended location. Now, that’s cool.
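To put numbers on the precision point, here is a small Python sketch. It assumes one degree of latitude is about 111,320 meters (an approximation) and that a degree of longitude shrinks by the cosine of the latitude; the function name is made up for this example.

```python
import math

# Approximate length of one degree of latitude; it varies slightly over the globe.
METERS_PER_DEGREE_LAT = 111_320


def max_rounding_error_m(decimals, latitude_deg):
    """Worst-case offset, in meters, from rounding a coordinate to `decimals` places.

    Returns (north_south, east_west). Rounding moves a value by at most
    half of the last decimal place that is kept.
    """
    half_step_deg = 0.5 * 10 ** (-decimals)
    north_south = half_step_deg * METERS_PER_DEGREE_LAT
    # A degree of longitude is shorter away from the equator.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Latitude of the Capitol example above.
for decimals in range(2, 7):
    ns, ew = max_rounding_error_m(decimals, 30.2747)
    print(f"{decimals} decimals: up to {ns:.2f} m north-south, {ew:.2f} m east-west")
```

Under these assumptions, four decimal places keep a point within roughly five to six meters of where it was meant to be, and each additional decimal place tightens that by a factor of ten.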
In its essence, coding the shapes.txt file is like building the model railroads many of us remember from childhood. Instead of using toy train tracks, we are using a series of coded latitude/longitude shapepoints to visualize where a vehicle is meant to go. This, in turn, informs the predictor, which predicts a vehicle’s arrival at a stop in part based on the shape of the route and the distance expected. It doesn’t get much cooler than this!
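For the curious, the distance a shape implies can be sketched in a few lines of Python. This example uses the haversine formula with a mean Earth radius of 6,371 km, which is a common approximation and not necessarily what the NextBus predictor uses; the shapepoints are invented for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, a common approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS 1984 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cumulative_distances(shape_points):
    """Running distance in meters at each shapepoint.

    shape_points is a list of (lat, lon) pairs already sorted in the
    order the vehicle travels them.
    """
    distances = [0.0]
    for (lat1, lon1), (lat2, lon2) in zip(shape_points, shape_points[1:]):
        distances.append(distances[-1] + haversine_m(lat1, lon1, lat2, lon2))
    return distances


# Hypothetical shapepoints: north for one block, then a right turn to the east.
shape = [(30.2747, -97.7404), (30.2757, -97.7404), (30.2757, -97.7394)]
for point, meters in zip(shape, cumulative_distances(shape)):
    print(point, f"{meters:.1f} m")
```

The more shapepoints used to trace a curve or turn, the closer the summed distance comes to the path the vehicle actually drives.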
Pick Me Up Where?
Another factor that can influence real-time predictions is the location of stops. These, too, are coded as latitude/longitude points in the WGS 1984 geocoordinate system and represent the location at which a vehicle is meant to stop to pick up travelers. The more accurate the location, the more accurately the predictor can determine when a vehicle is meant to arrive and, if the stop is a scheduled stop, whether the vehicle has adhered to its schedule. If, for example, a bus stop is coded on the map before a stoplight when the stop is actually after it, the arrival prediction suffers because it is affected by the stoplight's timing. My team works around the clock to identify such issues and correct them to improve prediction accuracy.
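One way to picture the stoplight example is to measure how far along the route each candidate stop location falls. The Python sketch below projects a stop onto the nearest segment of a route shape using a flat-earth approximation that is reasonable over a few city blocks. The coordinates, the intersection position, and the helper names are all invented for this example; this is not the NextBus method.

```python
import math

METERS_PER_DEGREE = 111_320  # approximate length of one degree of latitude


def to_local_xy(lat, lon, origin_lat, origin_lon):
    """Convert lat/lon to meters east (x) and north (y) of an origin point."""
    x = (lon - origin_lon) * METERS_PER_DEGREE * math.cos(math.radians(origin_lat))
    y = (lat - origin_lat) * METERS_PER_DEGREE
    return x, y


def distance_along_route(stop, shape_points):
    """Return (meters along the shape, meters off the shape) for a stop.

    The stop is snapped to the closest point on any segment of the shape.
    """
    origin_lat, origin_lon = shape_points[0]
    pts = [to_local_xy(lat, lon, origin_lat, origin_lon) for lat, lon in shape_points]
    sx, sy = to_local_xy(stop[0], stop[1], origin_lat, origin_lon)
    best = None
    travelled = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len = math.hypot(dx, dy)
        if seg_len == 0:
            continue  # skip duplicate shapepoints
        # Fraction of the way along this segment, clamped to its two ends.
        t = ((sx - x1) * dx + (sy - y1) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        off = math.hypot(sx - (x1 + t * dx), sy - (y1 + t * dy))
        if best is None or off < best[1]:
            best = (travelled + t * seg_len, off)
        travelled += seg_len
    return best


# Hypothetical northbound route with a stoplight at latitude 30.2750.
route = [(30.2740, -97.7404), (30.2760, -97.7404)]
stoplight_m, _ = distance_along_route((30.2750, -97.7404), route)
coded_m, _ = distance_along_route((30.2748, -97.7404), route)   # coded before the light
actual_m, _ = distance_along_route((30.2752, -97.7404), route)  # real stop, after the light
print(f"stoplight at {stoplight_m:.0f} m, coded stop at {coded_m:.0f} m, "
      f"actual stop at {actual_m:.0f} m")
```

The two candidate locations are only a few dozen meters apart, but they sit on opposite sides of the stoplight, so a vehicle waiting at a red light would be treated very differently in each case.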
What’s That Vehicle Up To?
Believe it or not, many hours are spent identifying how a vehicle hits a stop radius. Issues are often seen at transfer stations and rail stations where a bus loops around the station and passes through a stop radius twice while on its route. Both the location of the stop and how the vehicle hits a designated radius around the stop influence how the vehicle is tracked as arriving at and departing from the stop. Manually reducing a stop radius can mean that a vehicle will be tracked as arriving later, and expanding a stop radius can mean that a vehicle will be tracked as arriving earlier. A careful review of how a vehicle is operating and how it is being tracked is required to set parameters that best reflect an agency’s operations and, in turn, generate the best arrival and departure predictions.
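The effect of the radius is easy to see with a toy example. The Python sketch below takes a series of vehicle position reports, each reduced to a time and a distance from the stop, and records every pass through the radius. The reports, the radii, and the stop_visits function are hypothetical and greatly simplified compared with real vehicle tracking.

```python
def stop_visits(pings, radius_m):
    """Find each pass of a vehicle through a stop radius.

    pings is a time-ordered list of (seconds, meters_from_stop).
    Returns a list of (arrival_seconds, departure_seconds) pairs; the
    departure is None if the vehicle is still inside at the last ping.
    """
    visits = []
    arrival = None
    for seconds, meters in pings:
        inside = meters <= radius_m
        if inside and arrival is None:
            arrival = seconds                  # first ping inside the radius
        elif not inside and arrival is not None:
            visits.append((arrival, seconds))  # first ping back outside
            arrival = None
    if arrival is not None:
        visits.append((arrival, None))
    return visits


# Hypothetical bus that serves a station stop, loops around, and passes it again.
pings = [(0, 120), (10, 80), (20, 40), (30, 10),
         (40, 45), (50, 90), (60, 35), (70, 130)]

for radius in (25, 50, 100):
    print(f"radius {radius} m: {stop_visits(pings, radius)}")
```

With these made-up reports, the 25-meter radius records a single, later arrival; the 50-meter radius records an earlier arrival but also counts the loop as a second visit; and the 100-meter radius records the earliest arrival and merges both passes into one long visit.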
What’s Best For the Traveler?
Manual configuration involves another layer of accuracy. In this format, our team uses ArcGIS to create and draw routes and stops based on transit agencies’ published schedules or manuals. By working with an agency to gather as much information as possible, we can decide upon an assembly of routes, stops, stop orders and schedules that will generate the best predictions for the agency. Whether the format of the data is GTFS or another format altogether, the quality of the data greatly influences the predictions.
The time spent is worth it. Accurate data means accurate predictions.
Jessica has over 10 years of experience in GIS and urban planning. At Cubic, Jessica leads a team of Geo Data Analysts who perform GIS import processing of transportation schedules, routes, and stops within the NextBus database. By analyzing and resolving GIS and GTFS (General Transit Feed Specification) issues, issues with predictability for NextBus, and issues with schedule adherence, she and her team play a vital role in the accuracy of NextBus's predictor, enabling better transit experiences for travelers all over North America.
Jessica is an alumna of the University of California, Berkeley and holds an M.S. in GIS Technologies and an M.S. in Planning from the University of Arizona.