Evolutionary Possibilities for the Internetwork Layer

Background and Context

A project called Nimrod, which aims to produce a next-generation routing architecture for the Internet, has developed, as part of its work, a somewhat different perspective on the potential evolutionary path of the internetwork layer. This perspective is based on an established school of thought about how to design large-scale systems. This section explains that thinking to some degree, and uses it to describe what the future evolution of the internetwork layer might look like.

Basic Principles of Large-Scale System Design

The Nimrod routing architecture springs, in part, from a design vision that sees the entire internetwork layer, although distributed across all the hosts and routers of the internetwork, as a single system.

Approaching the internetwork layer from this direction, one would naturally take a typical system designer's point of view, and think of the modularization of the system: choosing the functional boundaries that divide the system up into functional units, and defining the interaction between these units. This modularization is the key part of the system design process. Thinking about the interaction is part of the modularization process, as it affects the placement of the functional boundaries. Poor placement leads to overly complex interaction, or to interaction that is needed but cannot be realized.

These issues are even more important in a system which is expected to have a long lifetime. Correct placement of the functional boundaries, to clearly and simply break up the system into its truly fundamental units, is a necessity if the system is to endure and serve the internetwork community well. It will be readily appreciated that a global communication system presents a particular challenge along these lines, since not only is the installed base substantial, but the inherent need for interoperability means that making changes, even in an evolutionary way, is more difficult than it would be with a similarly sized overall investment in many independent smaller systems.

The Internetwork Layer Service Model

Let us return to the view of the internetwork layer as a system. That system provides certain services to its clients; that is, it instantiates a service model. Without a definition of the service model that the internetwork layer is supposed to provide, it will be impossible to select the specific mechanisms at the internetwork level which are needed to provide that service model.

To answer the question of what the service model ought to be, one can view the internetwork layer itself as a subsystem of an even larger system: the entire internetwork itself. From that point of view, the issue of the service model of the internetwork layer is clearer. The services provided by the internetwork layer are no longer purely abstract, but can be thought of as the external module interface of the internetwork layer module. If it becomes clear what overall services the internetwork as a whole should provide, and then where to put the functional boundaries of the internetwork layer as a whole, the service model of the internetwork layer should be easier to define.

In general terms, it seems that the unreliable packet ought to remain the fundamental building block of the internetwork layer. The design principle which says that one can take any packet and discard it with no warning or other action, or take any router and turn it off with no warning, and have the system continue to work, seems very powerful. The simplicity of component design (since routers don't have to take extraordinary measures to retain a packet of which they have the only copy) and overall system robustness resulting from these two assumptions are absolutely necessary.

In detail, however, particularly in areas that are still the subject of research and experimentation (such as resource allocation, security, etc.), it is difficult to provide a finished definition of exactly what the service model of the internetwork layer ought to be. Even so, by viewing the internetwork layer as a large system, one starts to think about what subsystems are needed to implement it and provide its service model, and what the interaction among them should look like. Until the service model of the internetwork layer is more clearly visible, though, such a discussion is necessarily rather nebulous.

State and Flows in the Internetwork Layer

The internetwork layer as a whole contains a variety of information, of diverse lifetimes. Taken together, this information is the internetwork layer's state. Some of the information comprising the internetwork layer's state is stored in the routers, and some is stored in the packets flowing through the network, etc.

The internetwork layer's state can be classified in several different ways. For example, the term forwarding state is used here to refer to information in the packet which records something about the progress of this individual packet through the network (such as the hop count, or the pointer into a source route). Other kinds of state, some described below, contain various kinds of information about what service the user wants from the network (such as the destination of the packet, etc.).

User and Service State

User state is the term used here for state which reflects the service requests of the user. This is information which can be sent in each packet, or stored in the router and applied to multiple packets, depending on which makes the most engineering sense. It is still called user state, even when a copy is stored in the routers.

User state can be divided into two classes: critical (such as destination addresses), without which the packets cannot be forwarded at all, and non-critical (such as a resource allocation class), without which the packets can still be forwarded, although probably not quite in the way the user would most prefer. There is a range of possible mechanisms for getting this user state to the routers; it may be put in every packet, or installed in the routers by a setup procedure. In the latter case, there is a whole range of possibilities for how to get it back when it is lost, such as periodically placing a copy in a normal data packet.
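The recovery idea mentioned above, periodically placing a copy of the user state in a normal data packet, can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Sender`, `Router`, and `Packet` classes, the `refresh_every` policy, and the dictionary used for user state are all hypothetical and do not come from any existing protocol.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Packet:
    flow_id: int
    payload: str
    # Full copy of the user state; present only on refresh packets.
    user_state: Optional[dict] = None


class Sender:
    """Attaches a full copy of the user state to every `refresh_every`-th
    packet (a hypothetical refresh policy chosen for this example)."""

    def __init__(self, flow_id: int, user_state: dict, refresh_every: int) -> None:
        self.flow_id = flow_id
        self.user_state = user_state
        self.refresh_every = refresh_every
        self.sent = 0

    def next_packet(self, payload: str) -> Packet:
        refresh = self.sent % self.refresh_every == 0
        self.sent += 1
        copy = dict(self.user_state) if refresh else None
        return Packet(self.flow_id, payload, copy)


class Router:
    def __init__(self) -> None:
        self.installed: Dict[int, dict] = {}

    def crash(self) -> None:
        # Losing all installed state models a router restart.
        self.installed.clear()

    def handle(self, pkt: Packet) -> str:
        if pkt.user_state is not None:
            # A refresh packet (re)installs the state as a side effect.
            self.installed[pkt.flow_id] = pkt.user_state
        state = self.installed.get(pkt.flow_id)
        if state is None:
            # The destination is critical user state: without it the
            # packet cannot be forwarded at all.
            return "cannot forward: state lost"
        return f"forward to {state['destination']}"
```

In this sketch a router that has lost its state recovers as soon as the next refresh packet arrives, with no other entity involved; the price is that packets arriving in between cannot be forwarded.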

However, another kind of state, which cannot be stored in each packet, needs to be defined. Called service state here, it describes the longer term situation, across the life of many packets. In other words, the service state is inherently associated with a number of packets over some timeframe (for example, how much of a resource allocation has been used), and meaningless for a single packet.

The existence of service state apparently contradicts the commonly accepted stateless model of routers somewhat, but this contradiction is more apparent than real. The routers already contain state, such as routing table entries, without which it is virtually impossible to handle user traffic. All that is being changed is the amount, granularity, and lifetime of state in the routers, so the change is not really that radical.

Some of this service state may need to be installed in a fairly reliable fashion; for example, if there is service state related to routing, or to billing or allocation of resources for a critical application, one more or less needs to be guaranteed that this service state has been correctly installed.

To the extent that state is included in the routers (either as service state or as user state), one has to be able to associate that state with the proper packets. The fields in the packets used for this purpose are called tags here.
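As a rough illustration of these distinctions, the sketch below keeps both kinds of state in a router table indexed by a tag. All names (`UserState`, `ServiceState`, `Router`, the `resource_class` field, the byte counters) are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UserState:
    """Service requests of the user (hypothetical fields)."""
    destination: str                       # critical: no forwarding without it
    resource_class: Optional[str] = None   # non-critical: forwarding degrades without it


@dataclass
class ServiceState:
    """State that only has meaning across many packets."""
    allocation_bytes: int   # size of the resource allocation
    used_bytes: int = 0     # how much of it has been used so far


@dataclass
class RouterEntry:
    user: UserState
    service: ServiceState


class Router:
    def __init__(self) -> None:
        # The tag carried in each packet is the key into this table.
        self.table: Dict[int, RouterEntry] = {}

    def install(self, tag: int, user: UserState, allocation_bytes: int) -> None:
        self.table[tag] = RouterEntry(user, ServiceState(allocation_bytes))

    def forward(self, tag: int, length: int) -> str:
        entry = self.table[tag]
        # Service state is updated per packet but describes the whole sequence.
        entry.service.used_bytes += length
        # Missing non-critical user state falls back to a default treatment.
        queue = entry.user.resource_class or "best-effort"
        return f"{entry.user.destination} via {queue}"
```

The copy of the user state held in the table is still user state; only the usage counter is state that no single packet could carry.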


It is useful to step back a bit at this point, and think about the traffic in the network. Some of it will be from applications that are basically transactions; that is, they require only a single packet, or a very small number of them. (The term datagram refers here to such applications, and the term packet describes the unit of transmission through the network.) However, other packets are part of longer-lived communications, which have been termed flows [Clark].

From the user's point-of-view, a flow can be seen as a sequence of packets that are related, usually by virtue of having originated from a single application instance. In an internetwork layer with a more complex service model (for example, one which supports such features as resource allocation), the flow would have service requirements to pass on to some or all of the subsystems which provide those services.

To the internetwork layer, a flow can be seen as a sequence of packets that shares all the attributes that the internetwork layer cares about. This includes, but is not limited to: source/destination, path, resource allocation, accounting/authorization, authentication/security, etc.

There is not necessarily a one-to-one mapping from flows to anything else, be it a TCP connection or an application instance. A single flow might contain several TCP connections (for example, with the file transfer application, where one has the control connection and a number of data connections), or a single application might have several flows (for example, multimedia conferencing, with one flow for the audio, another for a graphic window, etc., each having different resource requirements in terms of bandwidth, delay, etc.).
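The lack of a one-to-one mapping can be illustrated by grouping application streams according to the attributes the internetwork layer cares about. This is a minimal sketch: the `Stream` type, the choice of three attributes, and the example values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Stream(NamedTuple):
    name: str            # label used only in this example
    application: str
    source: str
    destination: str
    resource_class: str  # stands in for bandwidth, delay, etc.


FlowKey = Tuple[str, str, str]


def group_into_flows(streams: List[Stream]) -> Dict[FlowKey, List[str]]:
    """Streams that agree on every attribute of interest form one flow."""
    flows: Dict[FlowKey, List[str]] = defaultdict(list)
    for stream in streams:
        key = (stream.source, stream.destination, stream.resource_class)
        flows[key].append(stream.name)
    return dict(flows)


EXAMPLE = [
    Stream("ftp-control", "file-transfer", "host-a", "host-b", "bulk"),
    Stream("ftp-data-1", "file-transfer", "host-a", "host-b", "bulk"),
    Stream("ftp-data-2", "file-transfer", "host-a", "host-b", "bulk"),
    Stream("audio", "conference", "host-a", "host-c", "low-delay"),
    Stream("graphics", "conference", "host-a", "host-c", "high-bandwidth"),
]
```

With these assumed values, the three file transfer connections collapse into a single flow, while the one conferencing application produces two flows.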

Flows are not inherently unicast. They may also be multicast constructs, having multiple sources and destinations. Multicast flows are somewhat more complex than unicast, as there is a larger pool of state distributed across the network which must be made coherent, but the concepts are similar.

Practical Details of Flows

There is an interesting architectural issue here. To begin, there will probably be many different internetwork level subsystems (such as routing, resource allocation, security/access-control, and accounting). Now, there are two choices: either each of these subsystems is free to define for itself what it considers a flow to be, or there is a single, common flow mechanism for the internetwork layer, which every subsystem uses.

The former option has the advantage of being a little easier to deploy incrementally, since there is no need to agree on a common flow mechanism. It may also save on replicated state (if there are three flows which are the same for subsystem X, but different for Y, there only needs to be one set of X state). This option also has a lot more flexibility. The latter option is simple and straightforward.
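The remark about replicated state can be made concrete with a small count. The sketch below assumes, purely for illustration, that a flow is described by a mapping from subsystem name to the requirement it places on that subsystem; the function names are hypothetical.

```python
from typing import Dict, List

Flow = Dict[str, str]  # subsystem name -> requirement the flow places on it


def entries_with_per_subsystem_flows(flows: List[Flow]) -> Dict[str, int]:
    """Each subsystem keeps one entry per distinct requirement it sees."""
    subsystems = sorted({name for flow in flows for name in flow})
    return {s: len({flow[s] for flow in flows if s in flow}) for s in subsystems}


def entries_with_single_flow_definition(flows: List[Flow]) -> Dict[str, int]:
    """Every subsystem keeps one entry per flow, whether or not it is shared."""
    subsystems = sorted({name for flow in flows for name in flow})
    return {s: sum(1 for flow in flows if s in flow) for s in subsystems}


# Three flows that look identical to subsystem X but differ for Y.
THREE_FLOWS = [
    {"X": "x1", "Y": "y1"},
    {"X": "x1", "Y": "y2"},
    {"X": "x1", "Y": "y3"},
]
```

For these three flows, per-subsystem flow definitions need one set of X state and three of Y, while a single shared definition needs three of each. How much this matters in practice depends on how often flows really do coincide within a subsystem.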

The engineering choice of which is the better option is not trivial; the analysis of the costs and benefits of each depends on conditions which cannot yet be determined, such as the percentage of flows that will need to share the same state in certain subsystems. In general, however, simple and straightforward seems to be better. This system is quite complex already, and the benefits of being able to mix and match may not be worth the added complexity. Wherever it is possible to make things simpler, that should be the preferred choice. So, for the moment, the assumption here is that there will be a single, systemwide definition of flows.

The packets which belong to a flow could be identified by a tag consisting of a number of fields (such as addresses, ports, etc.). However, it may be more straightforward and foolproof to simply identify which flow a packet belongs to by means of a specialized tag field (the flow-id) in the internetwork header. Given that one can seemingly always find situations where the existing fields alone don't do the job, and thus still need a separate field to do the job correctly, it seems best to take this simple, direct approach.

The simplicity of globally-unique flow-ids (or at least a flow-id which is unique along the path of the flow) is also desirable; although it may take more bits in the header, one doesn't have to worry about all the mechanism needed to remap locally-unique flow-ids. From the perspective of designing a widely deployed system with a long lifetime, simplicity and directness are the only way to go. That consideration translates into the strategy of having flows named solely by unique flow-ids, rather than by some complex semantics using existing fields.
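The extra mechanism that locally-unique flow-ids would require can be seen by comparing two toy routers. This is a sketch under assumptions: the class names, the string used for per-flow state, and the swap table are hypothetical and are not taken from any particular protocol.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    flow_id: int


class PathUniqueRouter:
    """Flow-id is unique along the path: look it up, leave the header alone."""

    def __init__(self, state: Dict[int, str]) -> None:
        self.state = dict(state)

    def handle(self, pkt: Packet) -> str:
        return self.state[pkt.flow_id]


class LocalIdRouter:
    """Flow-id is unique only on one hop: an additional swap table is needed,
    and the header must be rewritten before the packet moves on."""

    def __init__(self, state: Dict[int, str], swap: Dict[int, int]) -> None:
        self.state = dict(state)
        self.swap = dict(swap)

    def handle(self, pkt: Packet) -> str:
        info = self.state[pkt.flow_id]
        pkt.flow_id = self.swap[pkt.flow_id]  # remap for the next hop
        return info


def send(path: List, pkt: Packet) -> List[str]:
    """Pass one packet through each router in turn, collecting the state found."""
    return [router.handle(pkt) for router in path]
```

Both variants find the same per-flow state, but the second must also install, maintain, and repair a consistent chain of swap entries along the path, which is the mechanism the text prefers to avoid.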

However, the issue of how to recognize which packets belong to flows is somewhat orthogonal to the issue of whether the internetwork level should recognize flows at all. Is this a good idea?

Flows and State

To the extent that service state exists in the routers, one has to be able to associate that state with the related packets. In fact, this represents a fundamental reason for the explicit recognition of flows. However, while access to service state is one reason to explicitly recognize flows at the internetwork layer, it is not the only one.

If the user has requirements in a number of areas (for example, routing and access control), they can theoretically communicate these requirements to the routers by placing a copy of all the relevant information in each packet within the internetwork header. If many subsystems of the internetwork are involved and the requirements are complex, this information could take up many bits.

There are two schools of thought on how to proceed. The first says that for reasons of robustness and simplicity, all user state ought to be repeated in each packet. For efficiency reasons, the routers may cache such information, probably along with precomputed data derived from the user state. (It makes sense to store such cached information along with any applicable service state, of course.)

The second school says that if a client is going to generate many packets, it makes engineering sense to give all this information to the routers once, and from then on place a tag (the flow-id) in the packet which tells the routers where to find that information. It is simply going to be too inefficient to carry all the user state around all the time. This is purely an engineering efficiency reason, but it is a significant one. (There is clearly no point in storing in the routers any user state about packets that are providing datagram service; the datagram service has usually come and gone in the same packet, and this discussion is about state retention.)

There is a somewhat more fundamental line of thinking which argues that the routers will inevitably come to contain more and more information about the user state. (In fact, this is already starting to happen; most high-performance routers include caches of such information as the preferred means of gaining that speed.) The question is whether that information will be installed by means of an explicit mechanism, or whether the routers will simply infer it from watching the packets that pass through them. To the extent that retention of user state in the routers is inevitable, there are obvious benefits to be gained from recognizing that fact, and explicitly designing an installation mechanism for that purpose. This strategy is far more likely to give satisfactory results than a more ad-hoc mechanism.

It seems unlikely that there is any benefit to an intermediate position, in which one subsystem installs user state in the routers, and another carries a copy of its user state in each packet. It appears to make little sense to design a system in which one internetwork layer subsystem, such as resource allocation, carries user state in all packets (perhaps with a hint in the packets to help find potentially cached copies in the router), and another subsystem, such as routing, uses a direct installation technique with a flow-id tag in the packets. This seems to have the disadvantages of both, without the advantage in simplicity of having only one. We should do one or the other, based on a consideration of the efficiency/robustness/complexity tradeoffs.

There are also other intermediate positions that incorporate both mechanisms, but it is unclear at this point how useful they might be. In one scenario, a flow might use one of these techniques for all the subsystems it uses, and another flow might use the other technique for all of them. There is potentially some use to this, although the cost in complexity of supporting two separate mechanisms for handling user state may not be worth the benefits. In another scenario, a flow might use one mechanism with one particular router along its path, and another for a different router. A number of different reasons exist as to why one might do this, including the possibility that not all routers will support the same mechanisms simultaneously.

In addition, if there is a way of installing such flow-associated state, it makes sense to have only one, which all subsystems use, instead of building a separate one for each subsystem.

It is difficult to make this choice - between direct installation in the routers, and placing the information in each packet - without a better idea of exactly how much user state the network is likely to use in the future. For example, it might turn out that 500-byte headers are needed to include the full source route, resource reservation, and other user state in every header.
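The efficiency side of this choice is simple arithmetic. The sketch below uses the 500-byte figure from the example above; the 8-byte flow-id and the packet counts are assumptions chosen only to make the comparison concrete, and the cost of the installation exchange itself is reduced to sending the user state once.

```python
def header_bytes_in_every_packet(packets: int, user_state_bytes: int) -> int:
    """Total user-state bytes sent when every packet carries the full state."""
    return packets * user_state_bytes


def header_bytes_with_installation(
    packets: int,
    user_state_bytes: int,
    flow_id_bytes: int,
    installs: int = 1,
) -> int:
    """Total bytes when the state is installed `installs` times and each
    packet then carries only a flow-id. Setup signalling beyond the state
    itself is ignored in this sketch."""
    return installs * user_state_bytes + packets * flow_id_bytes
```

For a 1000-packet flow this gives 500,000 bytes against 8,500. For a single packet it gives 500 against 508, which is consistent with the earlier observation that there is no point in installing state for datagram traffic.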

It is also difficult to make the choice without consideration of the actual mechanisms involved. As a general principle, it is best if the process of recovering lost state is as local as possible, and the number of entities which have to become involved is fairly limited. For instance, currently, when a router crashes, traffic is rerouted around it without needing to open a new TCP connection. From this perspective, the installation option looks a lot more attractive if it is simple and relatively cheap to reinstall user state when a router crashes, without otherwise causing a lot of work.

However, given the likely growth in user state, the necessity for service state, the requirement for reliable installation, and a number of similar considerations, it seems that direct installation of the user state, and explicit recognition of flows, through a unified definition and tag mechanism in the packet, is the best design choice.

Ramifications of Flows

Once the idea of flows has been accepted as a basic element of the architecture (one that presumably will carry a large share of the traffic), one then has to examine the ramifications of that design decision. These begin with the basic internetwork packet format. For example, should there be separate packet formats for the two service modes, flow and datagram, and if only one format, what optimizations, if any, should it include for the flow mode?

Another ramification of explicit support for flows at the internetwork layer that needs to be considered is its impact on the internal organization of the internetwork layer. Given that the internetwork layer provides two distinctly different services, should those services be provided independently by the internetwork layer, or should one be built on top of the other, using the facilities provided by the base service?

There is an excellent case that one service should be built on top of the other, leading to a simpler internetwork layer than if completely separate mechanisms must be provided. If this is the case, then which service should be the base? It seems inescapable that datagram service should be built on top of flows rather than the reverse, since one can provide datagram service on top of flows, but not the other way around.
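One possible reading of "datagram service on top of flows" is that a datagram is simply a flow that lives for a single packet. The sketch below illustrates that reading only; the `FlowRouter` class, the reserved flow-id of 0, and the install/forward/remove sequence are assumptions made for the example, not a description of any specified mechanism.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Packet:
    flow_id: int
    payload: str


class FlowRouter:
    """Base service: forwards only packets whose flow has installed state."""

    def __init__(self) -> None:
        self.flows: Dict[int, str] = {}              # flow-id -> destination
        self.delivered: List[Tuple[str, str]] = []   # (destination, payload)

    def install(self, flow_id: int, destination: str) -> None:
        self.flows[flow_id] = destination

    def remove(self, flow_id: int) -> None:
        self.flows.pop(flow_id, None)

    def forward(self, pkt: Packet) -> None:
        self.delivered.append((self.flows[pkt.flow_id], pkt.payload))


DATAGRAM_FLOW_ID = 0  # hypothetical reserved id for one-packet flows


def send_datagram(router: FlowRouter, destination: str, payload: str) -> None:
    """Datagram service layered on the flow service: install, send one
    packet, and discard the state again."""
    router.install(DATAGRAM_FLOW_ID, destination)
    try:
        router.forward(Packet(DATAGRAM_FLOW_ID, payload))
    finally:
        router.remove(DATAGRAM_FLOW_ID)
```

Nothing outside the flow machinery is needed to deliver the datagram, and no state about it is retained afterwards.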

This discussion should show that, although to many people the current basic structure and services provided by the internetwork layer appear to approach some final ideal, it is in fact quite possible that future developments will include radical departures from past practice. In particular, the supposition that the basic internal mechanism of the internetwork layer should be flows is quite novel, and accepted far less widely than support for flows as a service provided to the user.

Hopefully, the analysis above will also have shown that such speculations are not merely "change for the sake of change", but are grounded in a hard look at inescapable fundamentals: how much information the system will need to operate - an amount which will have to grow over time in order to provide the increased capabilities needed both by the users and by new applications; where that information resides in the overall system; how it gets from where it is created to where it is needed; and how much overhead is involved in that transfer.


[Clark]     David D. Clark, "The Design Philosophy of the DARPA Internet
            Protocols", Proceedings of the 1988 SIGCOMM Symposium, pp.
            106-114, Stanford, California, August 1988.
[Nimrod]    Isidro Castineyra, Noel Chiappa, Martha Steenstrup, "The Nimrod
            Routing Architecture", RFC 1992, August 1996.
