Analytics Proves Ride-Sharing Could Cut Taxis’ Road Time By 40 Percent

Data scientists say sharing cabs would have little effect on the quality of service, but it would ease congestion and reduce pollution

Ride-sharing could dramatically reduce road time for taxis, as well as pollution, say data scientists.

Researchers at MIT, Cornell University and the Italian National Research Council’s Institute for Informatics and Telematics have disclosed details of a project that had them analysing 150 million trip records collected from more than 13,000 New York City cabs over the course of a year.

Carbon dioxide

They found that if passengers had been willing to tolerate delays of up to five minutes per trip, almost 95 percent of the trips could have been shared. The optimal combination of trips would have reduced total travel time by 40 percent, with corresponding reductions in operational costs and carbon dioxide emissions.

The study was inspired by the proliferation of smartphone apps that find users a ride in real time, like the one made by the US start-up Uber.

“Of course, nobody should ever be forced to share a vehicle,” said Carlo Ratti, professor of the practice in MIT’s Department of Urban Studies and Planning (DUSP) and one of the paper’s co-authors. “However, our research shows what would happen if people have sharing as an option. This is more than a theoretical exercise, with services such as Uber Pool bringing these ideas into practice.”

Finding the optimal combination of trips does require foreknowledge of trips’ starting times. For example, a 30-minute trip the length of Manhattan might be combined with a 10-minute trip beginning 15 minutes later. But that kind of advance planning is unlikely if the passengers are using cellphone apps. So the researchers also analysed the data on the assumption that only trips starting within a minute of each other could be combined. Even then, they still found a 32 percent reduction in total travel time.

“We think that with the potential of a 30 percent reduction in operational costs, there is plenty of room for redistributing these benefits to customers, because we have to offer them lower fares; to drivers, because we have to incentivise them to belong to this system; to companies. And, of course, there is a benefit for the community,” said Paolo Santi, a visiting scientist in DUSP and first author on the paper.

In analysing taxi data for ride-sharing opportunities, “typically, the approach that was taken was a variation of the so-called ‘travelling-salesman problem,’” Santi explained. “This is the basic algorithmic framework, and then there are extensions for sharing.”

The travelling-salesman problem asks whether, given a set of cities and the travel times between them, there is a route that would allow a travelling salesman to reach all of them within some time limit. Unfortunately, the travelling-salesman problem is also an example – indeed, perhaps the most famous example – of an NP-complete problem, meaning that even for moderate-sized data sets, it can’t (as far as anyone knows) be solved in a reasonable amount of time.
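To make the question concrete, here is a minimal Python sketch of the brute-force way to answer it: try every ordering of the cities and check each against the time limit. The function name `route_within_limit`, the fixed starting city and the travel times are all invented for illustration; none of them comes from the study.

```python
from itertools import permutations

def route_within_limit(times, limit):
    """Return a route that starts at city 0, visits every other city once
    and takes at most `limit` minutes in total, or None if none exists."""
    n = len(times)
    for order in permutations(range(1, n)):
        route = (0,) + order
        # Add up the travel time of each consecutive leg of the route.
        total = sum(times[a][b] for a, b in zip(route, route[1:]))
        if total <= limit:
            return route
    return None

# Hypothetical travel times in minutes between four cities.
TIMES = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

print(route_within_limit(TIMES, 65))  # (0, 1, 3, 2)
print(route_within_limit(TIMES, 64))  # None
```

With n cities this search has to consider up to (n-1)! orderings. Four cities need only six, but 20 cities would need about 1.2 × 10^17, which is why brute force stops being practical so quickly.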

So Santi and his colleagues took a different approach. First, they characterised every taxi trip according to four measurements: the time and GPS coordinates of both the pick-up and the drop-off. Then, for each trip, their algorithm identifies the set of other trips that overlap with it – the ones that begin before it ends. It then determines whether the trip under examination can be combined with any of those other trips without exceeding the delay threshold. On average, any given trip is “shareable” with about 100 other trips.
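The pairwise test described above might be sketched in Python roughly as follows. This is an illustration under loud assumptions, not the researchers’ code: positions are flat (x, y) points in kilometres rather than GPS coordinates, the cab moves in a straight line at a constant assumed speed, and the names `Trip`, `overlaps` and `shareable` are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations
from math import hypot

SPEED_KM_PER_MIN = 0.5  # assumption: constant 30 km/h, straight-line travel

@dataclass(frozen=True)
class Trip:
    pickup_time: float   # minutes
    pickup: tuple        # (x, y) in km, a stand-in for GPS coordinates
    dropoff_time: float  # minutes
    dropoff: tuple

def travel_minutes(a, b):
    return hypot(a[0] - b[0], a[1] - b[1]) / SPEED_KM_PER_MIN

def overlaps(t1, t2):
    """Each trip begins before the other one ends."""
    return t1.pickup_time < t2.dropoff_time and t2.pickup_time < t1.dropoff_time

def shareable(t1, t2, max_delay=5.0):
    """True if one cab can serve both trips with no pick-up or drop-off
    more than `max_delay` minutes later than in the unshared trips."""
    if not overlaps(t1, t2):
        return False
    trips = (t1, t2)
    stops = [(0, "pick"), (0, "drop"), (1, "pick"), (1, "drop")]
    for order in permutations(stops):
        # A passenger must be picked up before being dropped off.
        if any(order.index((i, "pick")) > order.index((i, "drop")) for i in (0, 1)):
            continue
        clock, here, ok = 0.0, None, True
        for i, kind in order:
            trip = trips[i]
            place = trip.pickup if kind == "pick" else trip.dropoff
            wanted = trip.pickup_time if kind == "pick" else trip.dropoff_time
            clock = wanted if here is None else clock + travel_minutes(here, place)
            if kind == "pick":
                clock = max(clock, wanted)  # the cab waits if it arrives early
            if clock - wanted > max_delay:
                ok = False
                break
            here = place
        if ok:
            return True
    return False

# Two trips along the same street can share; a trip heading elsewhere cannot.
a = Trip(0.0, (0.0, 0.0), 20.0, (10.0, 0.0))
b = Trip(4.0, (2.0, 0.0), 16.0, (8.0, 0.0))
c = Trip(5.0, (0.0, 10.0), 25.0, (0.0, 20.0))
print(shareable(a, b), shareable(a, c))  # True False
```

Because only two trips are compared at a time, there are just a handful of stop orderings to try, so each check is cheap even though the full data set is large.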

150 million trips

Next, the algorithm represents the shareability of all 150 million trips in the database as a graph. A graph is a mathematical abstraction consisting of nodes, usually depicted as circles, and edges, usually depicted as lines between nodes. In this case, the nodes represent trips and the edges represent their shareability.

The graphical representation itself was the key to the researchers’ analysis. With that in hand, well-known algorithms can efficiently find the optimal matchings to either maximise sharing or minimise travel time.
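A toy version of the idea, in Python, shows how the two goals can lead to different pairings. The sketch assumes each shared ride combines exactly two trips and uses an exhaustive search that is only workable for a handful of trips; the well-known algorithms the article refers to are far more efficient. The trip labels, the minutes-saved figures and the function name `best_matching` are all made up for illustration.

```python
from functools import lru_cache

def best_matching(edges):
    """edges maps a pair of trips (u, v) to the benefit of sharing them.
    Returns (total_benefit, pairs), where no trip appears in two pairs.
    Exhaustive search: suitable only for very small graphs."""
    nodes = tuple(sorted({n for pair in edges for n in pair}))
    weight = {frozenset(pair): w for pair, w in edges.items()}

    @lru_cache(maxsize=None)
    def solve(remaining):
        if len(remaining) < 2:
            return 0, []
        first, rest = remaining[0], remaining[1:]
        best = solve(rest)  # option 1: the first trip stays unshared
        for partner in rest:  # option 2: pair it with a shareable trip
            w = weight.get(frozenset((first, partner)))
            if w is None:
                continue
            others = tuple(n for n in rest if n != partner)
            sub_total, sub_pairs = solve(others)
            if sub_total + w > best[0]:
                best = (sub_total + w, [(first, partner)] + sub_pairs)
        return best

    return solve(nodes)

# Hypothetical shareability graph: nodes are trips, edge values are
# the minutes of travel time saved if the two trips are combined.
saved = {("A", "B"): 3, ("B", "C"): 10, ("C", "D"): 3}

# Maximise sharing: every edge counts the same.
print(best_matching({pair: 1 for pair in saved}))  # (2, [('A', 'B'), ('C', 'D')])
# Minimise travel time: edges are weighted by minutes saved.
print(best_matching(saved))  # (10, [('B', 'C')])
```

In this example, maximising sharing pairs up all four trips, while minimising travel time combines only B and C, because that single pairing saves more minutes than the other two together.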

The researchers also conducted experiments to ensure that their matching algorithm would work in real time, if it ran on a server used to coordinate data from cellphones running a taxi-sharing app. They found that, even running on a single Linux box, it could find optimal matchings for about 100,000 trips in a tenth of a second, whereas the GPS data indicated that on average, about 300 new taxi trips were initiated in New York every minute.
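A few lines of Python put those two figures side by side. The calculation uses only the numbers quoted above and assumes trips arrive at a steady average rate; the variable names are illustrative.

```python
NEW_TRIPS_PER_MINUTE = 300   # average rate of new trips, from the GPS data
BATCH_SIZE = 100_000         # trips matched in a single run
BATCH_SOLVE_SECONDS = 0.1    # reported time to match that batch

# How long New York takes, on average, to generate one batch of trips.
minutes_to_fill_batch = BATCH_SIZE / NEW_TRIPS_PER_MINUTE
hours_to_fill_batch = minutes_to_fill_batch / 60

# How many new trips arrive, on average, while one batch is being matched.
trips_arriving_during_solve = NEW_TRIPS_PER_MINUTE / 60 * BATCH_SOLVE_SECONDS

print(f"{hours_to_fill_batch:.1f} hours of trips per batch")
print(f"{trips_arriving_during_solve:.1f} new trips during one solve")
```

At that average rate, 100,000 trips correspond to more than five hours of demand, yet the matching step is reported to take a tenth of a second, during which fewer than one new trip would be expected to start.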

Finally, an online application called HubCab, designed by Michael Szell, one of the paper’s co-authors, allows people to explore the taxi data themselves, using a map of New York as an interface.

David Mahfouda, the CEO of Bandwagon, a car- and taxi-hailing company whose business model is built specifically around ride-sharing, said that his company hired analysts to examine the same data set that Santi and his colleagues used.

He said: “We did analysis of rides from LaGuardia Airport and were able to build really detailed maps around where passengers were headed from that high-density departure point. We simplified the problem in order to focus on a particular real-world problem that we thought we could solve. Making the entire data set available on a queryable basis does seem like a significantly larger lift.”
