Ever since the emergence of wireless mesh networks several decades ago, researchers and practitioners have hotly debated the question of its scalability. Proponents argue that since every new node adds capacity in excess of its own needs, mesh networks should be able to scale without limit. Opponents argue that multi-hopping, protocol overhead, and wireless interference limit scalability. Who is right?
Not surprisingly, as with many of these long-lasting debates, the answer is: it depends. But on what? In this two-part series, we shall discuss the factors on which scalability depends, the differing interpretations of scalability in theory and in practice, and how to make mesh networks scale to the extent possible. This first part will be on fundamental limits (the governing “laws of physics” as it were), and the second part will focus on practical aspects and protocols. Rather than just present the existing body of research on the topic, we will strive to intuitively explain the concepts so that the reader is equipped to reason about specific situations she may encounter.
To get an intuitive understanding of the fundamental issues in scalability, let us take an analogy: traffic flow in Manhattan (see figure). This is like a mesh network where nodes are intersections, and the links are the road segments between the intersections (we shall refer to the segments simply as blocks — that is, for this article, a block is the road between two successive intersections). Messages are cars, and the number of hops of a message corresponds to the number of blocks travelled by a car. Following most research on the topic of scalability, we shall consider point-to-point (unicast) communications, so cars have a single destination and cannot clone themselves!
Suppose that one car from each establishment on each block takes one trip per day at a random time to a random destination within Manhattan. Now, suspending disbelief for a moment, suppose Manhattan can grow uniformly and indefinitely on all sides. As it does so, since there are more roads, the car-carrying capacity of Manhattan increases. On the other hand, since there are more blocks, there are proportionately more establishments, and hence more cars starting out on trips. Thus, the capacity is being consumed faster as well. Can Manhattan grow indefinitely? Or will the increase in cars overwhelm the increase in capacity at some point? Which increase wins?
As an intermediate conceptual step toward the answer, let us consider two simplifications, or “special cases”. First, suppose every car only travels two blocks. Then, the set of cars on a given block between successive intersections is restricted to those cars that have originated within two blocks of the street. In this case, no matter how much Manhattan grows, the number of cars on each block will remain the same since the number of establishments per block is the same. Manhattan scales.
At the other extreme, suppose that each car’s trip takes it via every single block. Then, every block will be visited by an ever increasing number of cars that grows as the number of blocks. Clearly, there will come a point when the streets will get choked up. Manhattan does not scale.
Thus, whether or not Manhattan scales depends upon the traffic patterns — in particular, how localized it is. As traffic becomes less and less localized, there comes a point at which it transitions from being scalable to unscalable. But what is this point?
Suppose that each trip is to a randomly chosen destination. Does this scale? Unfortunately, not. To see why this is the case, consider a street segment, say S as shown in the figure. Cars from various origination points intended for various (randomly chosen) destinations make their way through S. How many cars would go through S? This is a complex calculation, but fortunately has been worked out as √ N for an N node square lattice1 — equivalent to a Manhattan grid with N intersections. Thus, the number of cars in S increases as √ N as Manhattan grows. Equivalently, this means that the capacity available per car decreases as C/√ N where C is the car-carrying capacity of a given block.
The above is a highly simplified and narrow version of a seminal paper in mesh network theory that came out at the turn of the millennium. Called the Gupta-Kumar result after its authors, then at the University of Illinois, it proves using rigorous techniques that the per-node transport capacity in a mesh network with uniformly randomly chosen source-destination pairs is at best O(W /√ N ) — where W is the node capacity, and the symbol “O” should be read as “of the order of”. As N increases, capacity decreases as √ N . In other words, there is no room for messages at some scale. Note that it is never zero but can come arbitrarily close to it. This is referred to as asymptotically (“in the limit”) tending to zero. They showed that this result holds regardless of the network topology. A subsequent paper showed that using directional antennas does not change the result either.
Does this mean the opponents are right and we should stop attempting to build large networks? Not necessarily. A key assumption in the above result is that the destination of a message is selected uniformly randomly over all destinations. This means that farther destinations are equally likely as nearer destinations. This appears to be unrealistic, especially considering the fact that there are more destinations farther away than nearer due to geometric laws, which suggests that you would pick farther destinations more frequently than nearby ones. This seems unlikely both for mesh networks and for transportation within Manhattan!
What happens to the capacity trend if we assume more localized traffic — be it messages or cars in Manhattan? The figure below contrasts how traffic would look with uniformly distributed traffic and more localized traffic. Assume there are 16 cars that leave a location bound for four representative destinations 1, 2, 3 and 4 blocks away respectively. With non-local traffic (left), in this case uniformly distributed in number of blocks, one-fourth of 16, or 4 cars would leave for each destination. With more localized traffic (right), more cars would travel one block, fewer two blocks and so on.
Notice that in the localized example, the number of cars going a certain number of blocks roughly halves as the block distance increases. This corresponds to a power-law distribution of traffic. The choice was deliberate — the power-law distribution2 captures localization elegantly using a parameter denoted by α. With power-law distribution, the probability that a car picks a destination at a distance d is inversely proportional to dα. In the example, d is the number of blocks and is slightly greater than 2. If α were 3, then the frequency of cars would fall by a third for each successively farther away block.
Soon after the celebrated Gupta-Kumar paper, a paper from MIT showed that if the distance between sources and destinations in a mesh network are power-law distributed with an exponent α > 2 (roughly at least as localized as the figure on the right in the equivalent mesh network), then the per-node capacity stays constant as the network increases. In other words, mesh networks with such traffic scale indefinitely, similar to the constant-hop/block traffic we talked about earlier!
Intuitively, this is because for sufficiently local traffic (characterized by α > 2), the contributions of traffic from farther away nodes decreases rapidly enough that they form a convergent series (specifically, the “arithmetico-geometric series” 1/2 + 2/4 + 3/8 + 4/16 + 5/32 + …. = 2). Note that it is not the case that messages be restricted to a certain number of hops, only that messages over more hops should be a lot less frequent than over fewer hops.
Mesh networks can scale without limit if the message destinations are locally biased as might be true for real-world applications
But is real-world mesh network traffic sufficiently local that it allows indefinite scaling? Unfortunately, there is little work on this. My own work on military traffic distribution indicated a power-law distribution with exponents greater than 2, which means that those specific networks will scale if the pattern holds as size increases. Moreover, in many scenarios, the messages may only need to go a few hops before being backhauled via higher-bandwidth technologies. It has been shown that a mesh network with strategically placed base stations can scale indefinitely as long as the number of base stations increases proportional to √ N . As for Manhattan transportation, it does appear that at least NYC taxi trips are heavily localized!
You may be surprised that the node capacity (radio bitrate), message size, routing protocols etc. do not affect scalability. This is because we are asking : is there a limit, rather than what is the limit. A 10 Gbps radio will certainly increase the limit compared to a 100 Kbps radio, but with uniformly randomly chosen source-destination traffic there will exist some point — however large — at which the network will choke up. Since the Gupta-Kumar and subsequent works cited above assume optimal paths and channel access, protocols cannot help.
But even if a mesh network is not scalable in this fundamental, theoretical sense, it may be scalable to the desired number of nodes. Consider a mesh network that works at 1 million nodes but not at 1 million + 1 nodes. In the fundamental sense, it is not scalable since there is a limit. Yet, such a network would be quite an achievement and would probably qualify as “scalable” for many. It appears that an alternate interpretation of scalability that captures the actual limit would be useful in practice.
Fundamentally unscalable networks may well scale in practice to the desired number of nodes depending on radio rate, message load, and protocols.
Thus, while the fundamental laws of scalability are certainly important, there are equally important questions in the context of implementing and deploying mesh networks: (1) can we tell, for a mesh network, what the limit is, if any, and what factors influence it (e.g. radio bitrate, message frequency etc.); (2) is the limit achievable, and if not what is achievable; and (3) what role do networking protocols play in scalability?
We will take up these questions in the upcoming Part II of this article, introducing the notion of in-practice scalability, and developing insights on how to design protocols to improve the scalability of real-world mesh networks.
But for now enjoy that drive up Fifth Avenue, but please don’t go too many blocks!