On Variable Length Addresses

Why Topologically Sensitive Names (i.e. Addresses)
Are Inherently Variable Length
in a Global-Scale Communication Network

Introduction and Terminology

The term address, in internetworking, usually indicates an identifier with topological significance. In other words, the address has some structure in it which helps you tell where it is in the internetwork's topology, which is the network's connectivity structure. Such addresses are thus not what are called flat addresses, where each address is a unique quantity, assigned with no particular relationship to where the entity it refers to is located.

The structure in topologically significant addresses is usually a hierarchy, a "tree"-shaped structure of larger and larger naming units, much like a person's street address, which gives a geographic address using a hierarchy of larger and larger geographic units: the house, street, city, region/province, country, etc. This hierarchy results in a sequence of elements in the address, those elements being the name of each unit, be it geographical or topological.

Ramifications of Hierarchical Addresses

This hierarchical structure has all sorts of useful consequences. For instance, looking at two hierarchical topologically significant addresses will usually tell you something about how close the places they identify are, by how much of their address they have in common. In exactly the same way, looking at two geographically significant addresses will tell you something about how geographically close they are. However, this is not the most important consequence.
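The "how much of their address they have in common" comparison can be sketched in a few lines. This is only an illustration: the addresses are invented, and representing an address as a tuple of element names, most significant first, is an assumption made for the example.

```python
def shared_prefix_len(a, b):
    """Count the leading elements that two hierarchical addresses share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


# Hypothetical geographic addresses, largest unit first.
home = ("us", "ma", "cambridge", "main-st", "12")
neighbour = ("us", "ma", "cambridge", "main-st", "14")
far_away = ("us", "ca", "palo-alto", "oak-ave", "3")

print(shared_prefix_len(home, neighbour))  # same street: long shared prefix
print(shared_prefix_len(home, far_away))   # same country only: short shared prefix
```

A longer shared prefix suggests, but does not guarantee, that the two places are close together.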

The real reason addresses need both hierarchy and topological significance is that in a very large network, like the Internet, without these properties the cost of finding paths through the network (from a given source to a given destination with which it wishes to communicate) rises to the point where it is infeasible. This is not a matter of poor engineering; it is implicit in the underlying mathematics of finding paths through networks, and cannot be avoided by clever design.
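A rough, idealised count gives a feel for the scale of the difference. The model below is an assumption made purely for illustration, not a description of any real routing protocol: every entity at every level contains exactly `fanout` children, a router using flat addresses keeps one entry per destination, and a router using hierarchical addresses keeps one entry per entity at each level of its own branch.

```python
def flat_entries(fanout, levels):
    """Entries needed when every destination is tracked individually."""
    return fanout ** levels


def hierarchical_entries(fanout, levels):
    """Entries needed when only the entities at each level of one's own
    branch are tracked (idealised; real networks are not this regular)."""
    return fanout * levels


# Hypothetical network: 3 levels, 100 entities inside each entity.
print(flat_entries(100, 3))          # one million destinations to track
print(hierarchical_entries(100, 3))  # a few hundred entries instead
```

The exact figures matter less than the shape: in this model the flat cost grows with the size of the network, while the hierarchical cost grows with its depth.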

Hierarchical Addresses in Networks

Recent internetworking designs have attempted to carry such hierarchical, topologically significant addresses in fixed length fields in places such as internetwork protocol packet headers. This is a non-functional design choice in a network which is intended to be global in size. Here's why.

The Address Abstraction Hierarchy

The address abstraction hierarchy is what we call the organization of hierarchical topological units that are used in topological addresses. To give you a concrete example of this concept, if you look at the geographic equivalent, the geographic address abstraction hierarchy is the set of hierarchical geographical units - countries, states/provinces, cities, etc, etc - organized in a way which shows the relationships between them. Obviously, not all of these boundaries are of equal importance; a given set of state/province boundaries are in one country, etc, etc, and have a more local meaning and utility.

So, you can draw a diagram of the "tree" of the geographic boundary hierarchy (much like the diagrams you may be familiar with, of tree-structured file systems), in which all the states/provinces which are part of one country are depicted underneath that country, connected to it, but "hanging" from it, and so on. The diagram starts from a base, the root, under which are all the highest level entities; countries in this case. The ends of the lowest level branches, the leaves, are individual geographic street addresses. The full hierarchical geographic address is given by concatenating the names of the geographic entities along the path from the root to the individual leaf.

In an essentially identical manner, one can draw the address abstraction hierarchy for the topology of a communication network, with lower level entities (i.e. part of the network) hanging from the higher level entity (i.e. a larger part of the network) of which they are a part. The hierarchical address of a given place in that network is given by the concatenation of the names of the topological entities along the path from the root to the individual leaf.
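The "concatenate the names along the path from the root to the leaf" rule can be sketched as a walk over a tree. The topology and every name in it are hypothetical, and the nested-dictionary representation is just a convenient assumption for the example.

```python
# Hypothetical abstraction hierarchy: each key names a topological entity,
# and an empty dict marks a leaf (an individual attachment point).
NETWORK = {
    "provider-a": {
        "region-east": {"site-1": {"host-1": {}, "host-2": {}}},
        "region-west": {"site-2": {"host-1": {}}},
    },
    "provider-b": {
        "region-north": {"site-3": {"host-1": {}}},
    },
}


def leaf_addresses(tree, prefix=()):
    """Yield the address of every leaf: the names from root to leaf."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_addresses(children, path)
        else:
            yield path


addresses = list(leaf_addresses(NETWORK))
for address in addresses:
    print(".".join(address))
```

Note that "host-1" appears several times; it is the full path, not the last element alone, that identifies a place.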

To make this more concrete, imagine drawing a map of the network (or at least a part of it) on a piece of paper. (Obviously, to draw a map of the whole network would take a very, very large piece of paper!) First, draw (probably in a separate color, to make it clearer) non-overlapping circles around small pieces of the network; these are the lowest level entities. Now, draw circles (again, in a different color) which include groups of the circles you drew previously. You have drawn a hierarchy of topological boundaries, just as state boundaries enclose groups of counties, etc.

Variability in the Address Abstraction Hierarchy

To turn again to the example of a hierarchical file system: in a typical one, the structure of the hierarchy is not uniform. That is, some filenames have more elements in them than others; in some places a top-level directory will have a relatively large number of layers of sub-directories underneath it; in other places, a top-level directory may have no sub-directories beneath it at all.

In a very similar manner, in a large scale network, the length of the address (i.e. the number of elements along the path from the root of the abstraction hierarchy to a given leaf) will vary, since the structure of the abstraction hierarchy is not uniform; no centralized authority controls it to ensure uniformity. Moreover, not only is it not uniform, but it is constantly changing as the network grows, which it will continue to do for decades to come.

The geographic address abstraction hierarchy does not show this kind of behaviour, but then again, geography is not as flexible as network topology. Also, many geographic boundaries are based on political arrangements, which again tend to change slowly. The situation in the network is completely different, and much, much more dynamic.

Ramifications of the Variable Address Abstraction Hierarchy

We can now see why a hierarchical address is inherently variable length: a hierarchical address is just the list of the topological entities in the address abstraction hierarchy, along the path from the root to the individual leaf, and the number of entries here can vary.

Clearly, it is chancy to try to store such an inherently variable length item in a fixed length field, as has been tried in many recent internetworking architectures, especially for a network which is intended to be global in size. Certainly, one can allocate a fixed length field, and for some time, at least, the inherently variable length data item can be stored in it without problems. Inevitably, though, instances will come when the desired address is too long for the fixed-length field, and the users will be prevented from doing what they want, although they may be able to find some alternative which is functional, if not preferable. From that point on, the fixed-length field will be more and more of a problem, as more and more instances appear in which the "natural" arrangement of the abstraction hierarchy cannot be achieved within the fixed field size; eventually, cripplingly so.
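The difference between the two kinds of field can be sketched as follows. This is a toy encoding invented for illustration, not any real packet format: each address element is assumed to fit in one byte, the fixed field is assumed to hold four elements with zero reserved as padding, and the variable form is assumed to carry a one-byte element count.

```python
def pack_fixed(address, slots=4):
    """Pack an address into exactly `slots` bytes, zero-padded.
    Fails once the hierarchy is deeper than the field allows."""
    if len(address) > slots:
        raise ValueError(f"address has {len(address)} elements, field holds {slots}")
    return bytes(address) + bytes(slots - len(address))


def pack_variable(address):
    """Pack an address as a one-byte element count followed by the elements."""
    if len(address) > 255:
        raise ValueError("count byte cannot describe more than 255 elements")
    return bytes([len(address)]) + bytes(address)


def unpack_variable(data):
    """Recover the address from the count-prefixed form."""
    count = data[0]
    return tuple(data[1:1 + count])


shallow = (10, 3, 7)        # hypothetical three-level address
deep = (10, 3, 7, 2, 9)     # the same region after two layers were added

print(pack_fixed(shallow))    # fits, with one padding byte
print(pack_variable(deep))    # fits, and costs only the bytes it needs
try:
    pack_fixed(deep)
except ValueError as error:
    print("fixed field:", error)
```

The count byte is of course itself a fixed-size field, but it bounds only the number of elements, at a limit far above any depth the sketch is concerned with, rather than baking one assumed depth into every packet.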

Surely one can estimate the maximum size needed, though? Not really. A global network, with consequent massive infrastructural investment, must of necessity be intended to last for many years. Exactly what the network will look like toward the end of that lifespan is anyone's guess. Any choice of a fixed field size is thus a risky bet.

Length Variability in Hierarchical Addresses

To explore in a bit more detail the variability in address lengths, which results from variability in the structure of the abstraction hierarchy: this variability is allowed in a large network because there is no global control over the structure of the abstraction hierarchy. There is no real need for such global control; it would be extremely difficult to exercise in any case; and it would unduly restrict the flexibility of address assignment.

For example, the world telephone system does not have a central control over what phone numbers in all the individual countries look like: how long they are, or how many segments they are split up into. Individual countries may decide to change the way their phone numbers look, perhaps because they run out of phone numbers, but this change is a local change in the telephone number abstraction hierarchy, and is not controlled by a central organization which controls all the telephone numbers in the world. Such an organization would be a practical impossibility.

As to why this kind of variability in the address abstraction hierarchy happens, it happens because it is useful. Each section of the tree is organized according to what seems most useful for that section. The reasons why differing organizational schemes appear in different parts of the abstraction hierarchy are many.

For example, a single site may grow so large that it decides it needs an extra layer of hierarchy to organize its addressing. Growth of an order of magnitude in any entity at a given layer can probably be handled without adding another layer of hierarchy, but growth larger than that seems likely to need another layer. One thing the history of the deployment of computers shows us is that the number of computers has been increasing exponentially for some time, and shows no signs of slowing down anytime soon.
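A small piece of arithmetic shows how growth turns into extra layers. The model is an assumption for illustration only: each layer is taken to be able to organize at most `fanout` entities comfortably, and the figure of 1000 is arbitrary.

```python
def layers_needed(entities, fanout):
    """Smallest number of layers such that fanout ** layers >= entities."""
    layers = 1
    capacity = fanout
    while capacity < entities:
        capacity *= fanout
        layers += 1
    return layers


# Hypothetical site, assuming about 1000 entities per layer is manageable.
for hosts in (50, 500, 50_000, 5_000_000):
    print(hosts, "hosts ->", layers_needed(hosts, 1000), "layer(s)")
```

Under these assumptions, growing from 50 to 500 hosts changes nothing, but sustained exponential growth keeps adding layers, and so keeps lengthening the site's addresses.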

Another reason for increasing the number of layers in the hierarchy is somewhat arcane at the moment, but will very probably come to be more common soon. That is the need for what is called policy routing, or quality of service routing, by which it is meant that the user wishes to exert some control over where their traffic goes. To use a simple example, when someone gets into their car, they make numerous decisions about which path to take to their destination, including such factors as whether they wish to pay a toll fee, etc, etc. The network is going to need to provide this same flexibility to the users of the network, to choose service providers, etc, etc.

To do this, the network needs to be able to provide ways of naming the transmission facilities used. To use our road example again, a road which charges a toll is usually given a special name (and color) on road-maps, so that people can recognize it as a distinct entity when they are deciding what path to take. In the same way, the network will need to be able to name regions of the topology about which statements about the kind of service, etc, they offer can be made. This also will increase the numbers of layers of hierarchy needed.

Conclusion

So, there are many reasons why a lot of local variability in the address abstraction hierarchy is needed, and this local variability makes the length of addresses (which are, remember, derived from this abstraction hierarchy) inherently variable. Storing such an inherently variable length item in a fixed length field is a chancy business.
