Wolt Delivery Network Analysis
The coronavirus pandemic had a substantial influence on urban mobility throughout the world. Telecommuting affected fundamental transportation parameters such as congestion, passenger/freight ratios, and greenhouse gas emissions, while last-mile delivery services became a crucial part of urban life. Although most of the restrictive measures have already been lifted, the habits and patterns developed during the pandemic have left their mark on consumer behaviour. In this article, we will analyse the Helsinki food delivery network during the summer of 2020.
Helsinki has a diverse delivery ecosystem, but when it comes to food delivery there are two major players: Wolt and Foodora. A comprehensive analysis of the food delivery system would require data from both delivery services. However, when dealing with private companies, data availability becomes a serious bottleneck. Therefore, only the limited data available from Wolt was used for this article.
Ordering Patterns
Before looking at the delivery network, let's take a brief look at ordering patterns. Daily and weekly cycles influence all human activities, and this is even more the case with food delivery services. Over the course of their lives, people develop persistent habits that sync with daily cycles. As we can see from the graph below, there are very distinct hourly ordering patterns for food deliveries.
- Each day seems to have two peaks in the number of orders.
- Peak ordering times are slightly different for workdays and weekends.
- On workdays, the number of orders peaks at 8 am and 4 pm, with a decrease in orders around lunchtime.
- Weekends exhibit similar behaviour, but with a higher overall number of orders and with peaks at 10–11 am and 3–4 pm.
If you're a food delivery company, this is great news: you can predict demand from these recurring trends and scale your delivery infrastructure up or down to match the hourly demand. Larger overall trends can affect the total number of orders, but the hourly ordering patterns should remain relatively unaltered.
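As a minimal sketch of how such an hourly profile could be computed, the snippet below counts orders per hour separately for workdays and weekends. The timestamps are made up for illustration and the name `hourly_profile` is hypothetical; the actual dataset and its column layout are not shown in this article.

```python
from collections import Counter
from datetime import datetime

# Hypothetical order timestamps (NOT taken from the Wolt dataset).
order_times = [
    "2020-08-03 08:15",  # Monday
    "2020-08-03 08:40",
    "2020-08-03 12:30",
    "2020-08-03 16:05",
    "2020-08-08 10:20",  # Saturday
    "2020-08-08 10:50",
    "2020-08-08 15:30",
    "2020-08-09 11:10",  # Sunday
]


def hourly_profile(timestamps):
    """Count orders per hour of day, split into workdays and weekends."""
    profile = {"workday": Counter(), "weekend": Counter()}
    for ts in timestamps:
        dt = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        # weekday() returns 0 for Monday ... 6 for Sunday.
        kind = "weekend" if dt.weekday() >= 5 else "workday"
        profile[kind][dt.hour] += 1
    return profile


profile = hourly_profile(order_times)
for kind, counts in profile.items():
    print(kind, sorted(counts.items()))
```

With real data, the peaks of each counter would correspond to the busiest ordering hours described above.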
If you’re interested in an in-depth exploration of ordering patterns, you can find a more detailed analysis and time series forecast here:
Wolt’s current business model uses a distance-based progressive pricing model for the delivery fees. Currently, the distance-based delivery fee rates are 1.90 € up to 1 km, 3.90 € up to 2 km, 5.90 € up to 3 km, and 7.90 € for deliveries over 3 km.
As we can see from the animation above, this pricing model affects ordering behaviour. The majority of deliveries are short-distance deliveries, with occasional longer ones.
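The fee tiers quoted above can be written down as a small lookup. This is only an illustrative sketch: the function name `delivery_fee` is hypothetical, and we assume that "up to N km" includes the boundary distance itself.

```python
from bisect import bisect_left

# Upper distance bounds (km) and fees (EUR) as quoted in the text.
DISTANCE_BOUNDS_KM = [1.0, 2.0, 3.0]
FEES_EUR = [1.90, 3.90, 5.90, 7.90]  # the last fee applies beyond 3 km


def delivery_fee(distance_km: float) -> float:
    """Return the distance-based delivery fee in euros.

    Assumption: a delivery of exactly 1, 2 or 3 km falls into the lower tier.
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    return FEES_EUR[bisect_left(DISTANCE_BOUNDS_KM, distance_km)]


for d in (0.4, 1.0, 1.5, 3.0, 4.2):
    print(f"{d:.1f} km -> {delivery_fee(d):.2f} EUR")
```

Each extra kilometre adds 2 € to the fee, which is one plausible reason why users favour nearby restaurants.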
Wolt Delivery Network
Now that we have some understanding of the ordering patterns, we can take a look at the delivery network. To generate the network, we regard restaurants as origin nodes and users as destination nodes. If a restaurant has delivered to a user, the two nodes are linked with a directed edge from the restaurant to the user. The graph below illustrates the generated delivery network. Origin (restaurant) nodes are marked in orange and destination (user) nodes in purple.
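A minimal sketch of this construction with networkx could look as follows. The order records and node names are hypothetical placeholders, not rows from the Wolt dataset.

```python
import networkx as nx

# Hypothetical (venue, user) pairs, one per order.
orders = [
    ("venue_1", "user_a"),
    ("venue_1", "user_b"),
    ("venue_2", "user_a"),
    ("venue_1", "user_a"),  # repeat order: adds no new edge
]

G = nx.DiGraph()
for venue, user in orders:
    G.add_node(venue, kind="restaurant")  # origin node
    G.add_node(user, kind="user")         # destination node
    G.add_edge(venue, user)               # a DiGraph stores each edge only once

print(G.number_of_nodes(), "nodes,", G.number_of_edges(), "edges")
```

Because a `DiGraph` keeps at most one edge per ordered pair, repeat orders collapse into a single link, so a node's degree counts unique partners rather than orders.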
The resulting network has the following basic properties:
DiGraph with 2098 nodes and 15901 edges
Average Degree: 15.158
Edge Density: 0.0036
Average Clustering Coefficient: 0.0941
Transitivity: 0.0110
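The figures above are consistent with the usual definitions for a directed graph with n nodes and m edges: the average degree is 2m/n (2 × 15901 / 2098 ≈ 15.158) and the edge density is m / (n(n − 1)) (15901 / (2098 × 2097) ≈ 0.0036). The sketch below shows one way to compute such a summary with networkx; the helper `basic_stats` is hypothetical, and computing transitivity on the undirected version of the graph is our assumption, not necessarily what was done for the numbers above.

```python
import networkx as nx


def basic_stats(G: nx.DiGraph) -> dict:
    """Summarise a directed graph with the statistics listed above."""
    n, m = G.number_of_nodes(), G.number_of_edges()
    return {
        "nodes": n,
        "edges": m,
        "average_degree": 2 * m / n,          # in-degree plus out-degree per node
        "edge_density": nx.density(G),        # m / (n * (n - 1)) for a DiGraph
        "average_clustering": nx.average_clustering(G),
        # Assumption: transitivity is measured on the undirected version.
        "transitivity": nx.transitivity(G.to_undirected()),
    }


# Tiny stand-in graph (NOT the Wolt network).
G = nx.DiGraph([("venue_1", "user_a"), ("venue_1", "user_b"), ("venue_2", "user_a")])
stats = basic_stats(G)
for name, value in stats.items():
    print(f"{name}: {value}")
```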
Our network has a unique structure. The low edge density and average clustering coefficient don't come as a surprise, since Wolt delivers only from venues to users and users are not connected to each other. This results in our network having two distinct kinds of nodes: origin (restaurant) nodes have only out-degree, while destination (user) nodes have only in-degree. A high in-degree means that the user orders from a diverse set of restaurants, while a high out-degree indicates a popular restaurant that delivers to many unique users. Below we can see the top 5 nodes by in-degree and out-degree.
Top 5 Users that have the most diverse diets (Node in-degree):
Node: 322, In-degree: 35
Node: 379, In-degree: 33
Node: 456, In-degree: 32
Node: 492, In-degree: 32
Node: 556, In-degree: 32
Top 5 Venues with most unique users (Node out-degree):
Node: 39, Out-degree: 362
Node: 21, Out-degree: 350
Node: 9, Out-degree: 337
Node: 90, Out-degree: 337
Node: 100, Out-degree: 274
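Rankings like these can be read straight off the degree views of a networkx `DiGraph`. Here is a small sketch on a toy graph; the helper name `top_k` and the node names are illustrative only.

```python
import networkx as nx


def top_k(degree_view, k=5):
    """Return the k (node, degree) pairs with the highest degree."""
    return sorted(degree_view, key=lambda pair: pair[1], reverse=True)[:k]


# Toy venue -> user graph (NOT the Wolt network).
G = nx.DiGraph([
    ("v1", "a"), ("v1", "b"), ("v1", "c"),
    ("v2", "a"), ("v2", "b"),
    ("v3", "a"),
])

top_users = top_k(G.in_degree(), k=2)    # users with the most diverse diets
top_venues = top_k(G.out_degree(), k=1)  # venues with the most unique users

for node, degree in top_users:
    print(f"Node: {node}, In-degree: {degree}")
for node, degree in top_venues:
    print(f"Node: {node}, Out-degree: {degree}")
```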
To understand the structure of deliveries a little better, we can plot the degree distribution of the network. The graph below shows the distribution of nodes by in-degree. We can see that on average users order from 8 different restaurants, while 50% of users order from up to 6 different restaurants (median).
The story is very different for restaurants. The histogram below shows that the majority of restaurants seem to deliver to only one user, while a few restaurants deliver to around 350 unique users.
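The mean and median quoted above are plain summary statistics of the degree sequences. Below is a hedged sketch on a toy graph, where we assume that users are exactly the nodes with a positive in-degree and restaurants the nodes with a positive out-degree.

```python
from statistics import mean, median

import networkx as nx

# Toy venue -> user graph with one dominant venue (NOT the Wolt network).
G = nx.DiGraph([
    ("v1", "a"), ("v1", "b"), ("v1", "c"), ("v1", "d"),
    ("v2", "a"),
    ("v3", "a"),
])

# Assumption: users are the nodes that receive deliveries,
# restaurants are the nodes that send them.
user_in_degrees = [d for _, d in G.in_degree() if d > 0]
venue_out_degrees = [d for _, d in G.out_degree() if d > 0]

print("users:  mean", mean(user_in_degrees), "median", median(user_in_degrees))
print("venues: mean", mean(venue_out_degrees), "median", median(venue_out_degrees))
```

A mean that sits above the median, as for both groups in this toy graph, is a quick sign of a right-skewed distribution: a few very active nodes pull the average up.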
We can also take a look at the network's in-degree centrality to determine where the users with the most diverse diets live. 🍔
As we can see, inhabitants of the Kamppi, Kallio and Töölö neighbourhoods prefer some diversity in their diets, while other areas are a little more conservative in their choices.
In fact, if we apply a community detection algorithm to our network, the same 3 neighbourhoods can be identified as distinct communities.
This kind of separation does not come as a surprise when we consider the geographic characteristics of Helsinki. The void in the middle of the map is the area where railway tracks split Helsinki into two. Naturally, this divide leads to increased delivery distances and increased fees associated with the delivery. Hence users often prefer ordering from local restaurants. This keeps the fees low and the 🍕 warm!
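The article does not depend on one particular community detection algorithm, so the sketch below simply uses greedy modularity maximisation from networkx as an example. It runs on a toy graph with two groups of venues and users joined by a single cross-town order; both the graph and the choice of algorithm are assumptions made for illustration.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy network (NOT the Wolt data): two neighbourhoods, each with two venues
# delivering to three local users, plus one cross-neighbourhood order.
G = nx.DiGraph()
for venue in ("v1", "v2"):
    for user in ("a", "b", "c"):
        G.add_edge(venue, user)
for venue in ("v3", "v4"):
    for user in ("d", "e", "f"):
        G.add_edge(venue, user)
G.add_edge("v2", "d")  # the occasional long-distance delivery

# Assumption: edge direction is ignored when looking for communities.
communities = greedy_modularity_communities(G.to_undirected())
for i, members in enumerate(communities):
    print(f"Community {i}: {sorted(members)}")
```

The single long-distance link is not enough to merge the two groups, which mirrors how a geographic barrier can show up as separate communities.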
Error & Attack Tolerance
Studies of the error and attack tolerance of networks are based on percolation theory, a sub-field of statistical physics that describes the formation and behaviour of connected clusters in random systems. Although percolation theory is rooted in the study of lattices and random systems, its concepts are also useful for studying the connectivity of empirical networks such as the delivery network studied here. Error and attack tolerance help us understand the robustness of a network. Many complex systems display a surprising degree of tolerance against errors. For instance, relatively simple organisms grow and reproduce despite drastic environmental interventions, which is attributed to the robustness of their underlying metabolic networks.
- Errors: Nodes/edges are removed randomly from the graph until the network disintegrates.
- Attack: Nodes/edges are removed based on their importance (degree, weight, centrality) until the size of the giant component of the network reaches 0.
The logic behind the naming is that errors are random and can happen to any element in the network, while attacks target the “important” elements of the network first.
Removing Links
The graph below shows the error and attack tolerance of the delivery network when links between the nodes are removed. In a real-life scenario, removing links simulates what would happen if orders stopped being made.
The red line illustrates how quickly the network would disintegrate if random links are removed, i.e. error tolerance. The green line illustrates the robustness of the network when the links connecting high-degree nodes are removed, i.e. attack tolerance. The blue line illustrates the removal of links between low-degree nodes.
The key takeaways are that if Wolt’s popularity declines and users stop ordering one by one, the delivery network will disintegrate gradually at first and then suddenly. Furthermore, the removal of users with diverse diets will disintegrate the network slightly faster.
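To make the procedure concrete, here is a minimal sketch of a link-removal experiment. It uses a synthetic scale-free graph as a stand-in for the delivery network, and it scores a link by the summed degree of its two endpoints; both choices are assumptions for illustration and may differ from the analysis behind the plot.

```python
import random

import networkx as nx


def largest_component_fraction(G, n_total):
    """Largest connected component, relative to the original node count."""
    return max(len(c) for c in nx.connected_components(G)) / n_total


def edge_removal_curve(G, ordered_edges, steps=10):
    """Remove edges in the given order and track the giant component."""
    H = G.copy()
    n_total = H.number_of_nodes()
    batch = max(1, len(ordered_edges) // steps)
    curve = [largest_component_fraction(H, n_total)]
    for start in range(0, len(ordered_edges), batch):
        H.remove_edges_from(ordered_edges[start:start + batch])
        curve.append(largest_component_fraction(H, n_total))
    return curve


# Stand-in graph (NOT the Wolt network).
G = nx.barabasi_albert_graph(200, 2, seed=1)
edges = list(G.edges())


def score(edge):
    """Assumed link importance: summed degree of both endpoints."""
    return G.degree(edge[0]) + G.degree(edge[1])


random_order = edges[:]
random.Random(0).shuffle(random_order)

curves = {
    "error (random links)": edge_removal_curve(G, random_order),
    "attack (high-degree links first)": edge_removal_curve(
        G, sorted(edges, key=score, reverse=True)
    ),
    "low-degree links first": edge_removal_curve(G, sorted(edges, key=score)),
}
for name, curve in curves.items():
    print(name, [round(x, 2) for x in curve])
```

Note that in this sketch the curve bottoms out at 1/n rather than 0, because isolated nodes remain in the graph after all links are gone.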
Removing Nodes
What if a new pandemic happens and the government forces restaurants to shut down completely? How will this affect the delivery network? We can remove nodes from our network and see how quickly it will disintegrate.
As we can see, removing high-degree nodes (popular restaurants) sharply decreases the size of the giant component of the network (green line). Removing only the top 10% of the high-degree nodes results in an 80% decrease in the size of the giant component. In other words, the delivery network will disintegrate sharply as the popular restaurants start to close down or leave the network. The same cannot be said about users or unpopular restaurants, where the relationship is close to linear.
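The "top 10%" comparison can be sketched in a few lines. The graph below is synthetic (20 venues with skewed popularity and 300 users, all names hypothetical), so the printed numbers only illustrate the mechanics and are not a reproduction of the result above.

```python
import random

import networkx as nx


def giant_size(G):
    """Number of nodes in the largest weakly connected component."""
    return max((len(c) for c in nx.weakly_connected_components(G)), default=0)


def giant_after_removal(G, nodes_to_remove):
    """Giant component size after deleting the given nodes from a copy of G."""
    H = G.copy()
    H.remove_nodes_from(nodes_to_remove)
    return giant_size(H)


# Synthetic stand-in (NOT the Wolt data): popular venues get most orders.
rng = random.Random(0)
venues = [f"venue_{i}" for i in range(20)]
popularity = [1 / (rank + 1) for rank in range(len(venues))]
G = nx.DiGraph()
for u in range(300):
    for venue in rng.choices(venues, weights=popularity, k=rng.randint(1, 3)):
        G.add_edge(venue, f"user_{u}")

nodes = list(G.nodes())
k = len(nodes) // 10  # 10% of all nodes

attack_nodes = sorted(nodes, key=lambda n: G.degree(n), reverse=True)[:k]
error_nodes = rng.sample(nodes, k)

print("giant component before removal:", giant_size(G))
print("after removing top 10% by degree:", giant_after_removal(G, attack_nodes))
print("after removing a random 10%:", giant_after_removal(G, error_nodes))
```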
Growing the Network
Suppose a new user joins Wolt’s customer base. Can we predict which restaurants the user will be ordering from?
We can assume that restaurants that are popular among existing users have a higher probability of attracting the new user. This phenomenon is known as the Matthew effect or preferential attachment, commonly referred to as “rich get richer”. Based on this, we can simulate the growth of the network by adding a single user at a time and connecting it to existing nodes with a certain probability.
In order to preserve the properties of the original network, we need two parameters in our growth simulation.
- New nodes should attach to existing restaurants based on the restaurant’s out-degree. This way popular restaurants will have a higher chance of getting a new customer.
- New nodes should have an in-degree between 1 and 35, drawn from the same probability distribution as in our current network. This way the behaviour of new consumers will mimic the behaviour of existing users.
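The two rules above can be sketched as a simple growth loop. This is an illustrative implementation under our own assumptions: the function name `grow` and the `new_user_*` labels are hypothetical, restaurant popularity is updated after every new user, and each new user picks distinct restaurants.

```python
import random

import networkx as nx


def grow(G, n_new, rng):
    """Add n_new users by preferential attachment to existing restaurants."""
    H = G.copy()
    venues = [v for v, d in G.out_degree() if d > 0]
    # Empirical in-degree distribution of the existing users.
    user_in_degrees = [d for _, d in G.in_degree() if d > 0]
    for i in range(n_new):
        k = min(rng.choice(user_in_degrees), len(venues))
        # Rule 1: the chance of picking a venue is proportional to its out-degree.
        weights = [H.out_degree(v) for v in venues]
        chosen = set()
        while len(chosen) < k:
            chosen.add(rng.choices(venues, weights=weights)[0])
        # Rule 2: the new user gets k incoming links, with k drawn as above.
        H.add_edges_from((venue, f"new_user_{i}") for venue in chosen)
    return H


# Toy seed network (NOT the Wolt data).
G = nx.DiGraph([
    ("v1", "u1"), ("v1", "u2"), ("v1", "u3"), ("v1", "u4"),
    ("v2", "u1"), ("v2", "u2"),
    ("v3", "u1"),
])
H = grow(G, n_new=50, rng=random.Random(42))
print(H.number_of_nodes(), "nodes,", H.number_of_edges(), "edges")
print(sorted(H.out_degree(["v1", "v2", "v3"]), key=lambda p: -p[1]))
```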
With this in mind, we can add 2000 new customers and see what happens to our network:
DiGraph with 5000 nodes and 18804 edges
Average Degree: 7.5216
Edge Density: 0.00075
Average Clustering Coefficient: 0.0392
Transitivity: 0.0081
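Top 5 Users that have the most diverse diets (Node in-degree):
Node: 322, In-degree: 35
Node: 379, In-degree: 33
Node: 456, In-degree: 32
Node: 492, In-degree: 32
Node: 556, In-degree: 32
Top 5 Venues with most unique users (Node out-degree):
Node: 21, Out-degree: 421
Node: 39, Out-degree: 413
Node: 90, Out-degree: 408
Node: 9, Out-degree: 393
Node: 100, Out-degree: 313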
As we can see from the basic network stats above, we now have 5000 nodes and 18,804 edges. The average degree has decreased from 15.2 to 7.5. The average clustering coefficient has also decreased, from 0.094 to 0.039, and so has transitivity (from 0.0110 to 0.0081). This is likely because new users connect only to restaurants and never to each other, so the added nodes dilute the clustering in the network.
Further development
While our dataset is limited, data on restaurants and their popularity is available from OpenStreetMap. By using our knowledge of the existing network, we could extrapolate and grow our network accordingly. This would allow us to model the actual larger delivery network that was trimmed from the data.
Conclusion
In this article, we analysed Helsinki’s food delivery network, primarily based on data from Wolt. We looked at some of the characteristics of the network, applied a community detection algorithm, analysed the error and attack tolerance of the network and simulated its growth. If you are interested in continuing this exploration, you can find the code behind the analysis here:
About the Author
I am a curious Data Scientist with a strong passion for finding and understanding patterns. My interests include Math, Computer Science, Architecture & Urbanism. You can connect with me on LinkedIn and GitHub.