The Internetwork Layer and the Nimrod Routing Architecture
Overview of Nimrod
Nimrod
is a project which aims, in part, to produce a next-generation
routing architecture for the Internet, but also, more generally, to try to
produce a basic design for routing in a single global-scale communication
substrate, a design which will prove sufficiently flexible and powerful to
serve into a future as yet unforeseeable.
Nimrod does this through the conjunction of two powerful basic mechanisms:
- distribution of maps, as opposed to distribution of routing tables; and
- selection of routes by clients of the network, not by the switches in the
network.
(This latter approach is sometimes called source routing, but
this term
can be a bit misleading, since in Nimrod the route does not have to be
chosen by the actual source, but can be the responsibility of an agent
working on the source's behalf. The terms unitary routing or
explicit routing
have therefore been used to describe this approach. The important
thing to realize is that the path is not selected in a fully distributed,
hop-by-hop manner, in which each switch has an equal role to play in
selecting the path.)
The actual operation is fairly simple, in principle. Maps of the network's
actual connectivity (maps which will usually include high-level
abstractions for large parts of that connectivity, in the same way road
maps of an area may not show all the roads, just the 'important' ones) are
made available to all the entities which need to select paths. Those
entities use these maps to compute paths, and those paths are passed to the
actual switches, along with the data, as directions on how to forward the
data.
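The map-and-directions operation described above can be sketched in a few lines of Python. Everything here (the map contents, node names, and the single cost metric) is invented for illustration; a real Nimrod client would weigh the policy and QOS attributes attached to each map element, not just one cost.

```python
import heapq

def select_route(link_map, src, dst):
    """Client-side route selection over a (possibly abstracted) map.

    link_map: {node: [(neighbor, cost), ...]} -- the distributed map.
    Returns the chosen path as an explicit list of node names, which
    would accompany the data as directions on how to forward it.
    """
    # Plain Dijkstra search; the switches play no part in this choice.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == dst:
            break
        for nbr, cost in link_map.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in prev and src != dst:
        return None
    # Reconstruct the explicit path from the predecessor links.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# An abstract map: high-level clusters A..D rather than every router,
# just as a road map shows only the 'important' roads.
abstract_map = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 1), ("D", 5)],
    "C": [("D", 1)],
}
print(select_route(abstract_map, "A", "D"))  # ['A', 'B', 'C', 'D']
```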
The rest of this section discusses how the Nimrod routing and addressing
architecture interacts with the rest of the internetwork layer, and what
requirements it has upon the internetwork layer protocol's packet format.
The Nimrod Subsystems of the Internetwork Layer
As mentioned elsewhere, Nimrod springs, in part, from a design vision which
sees the entire internetwork layer, even though it is distributed across
all the hosts and routers of the internetwork, as a single system; in such a
vision, one starts to think about what subsystems are needed, and what the
interactions among them should look like. (See
this
for more information.)
The Nimrod routing architecture is simply a number of the subsystems of
this larger system, the internetwork layer. It is not intended to be a
purely standalone set of subsystems, but rather, to work together in close
concert with the other subsystems of the internetwork layer (resource
allocation, security, charging, etc.) to provide the internetwork layer
service model.
One reason that Nimrod is not simply a monolithic subsystem is that some of
the interaction with the other subsystems of the internetwork layer, for
instance the resource allocation subsystem, is much clearer and easier to
manage if the routing is broken up into several subsystems, with the
interaction between them open.
It is important to realize that Nimrod was initially broken up into
separate subsystems for purely internal reasons. It so happens that,
considered as a separate problem, the fundamental boundary lines for
dividing routing up into subsystems are the same boundaries that make
interaction with other subsystems cleaner; this provides added evidence
that these boundaries are in fact the right ones.
The subsystems which comprise the functionality covered by Nimrod are
- Routing information distribution (in the case of Nimrod, topology map
distribution, along with the attributes [policy, QOS, etc.] of the topology
elements),
- Route selection (strictly speaking, not part of the Nimrod system
per se, but a local matter, although it is necessary), and
- User traffic handling.
The first can be defined fairly well without reference to other subsystems,
but the second and third (route selection and user traffic handling) are
necessarily more involved. For instance, route selection might involve
finding out which links have the resources available to handle some
required level of service. For user traffic handling, if a particular
application needs a resource reservation, getting that resource reservation
to the routers is as much a part of getting the routers ready as making
sure they have the correct routing information, so here too, routing is
tied in with other subsystems.
It is possible to talk about the relationship between the Nimrod
subsystems, and the other functional subsystems of the internetwork layer,
but until the service model of the internetwork layer is more clearly
visible, along with the functional boundaries within that layer, such a
discussion is necessarily rather nebulous.
Specific Interaction Issues
Here is an incomplete list of the things that Nimrod would like to see from
the internetwork layer as a whole:
- A unified definition of flows in the internetwork layer, and a unified
way of identifying, through a separate flow-id field, which packets belong
to a given flow.
- A unified mechanism (potentially distributed) for installing state about
flows (including multicast flows) in routers.
- A method for getting information about whether a given resource allocation
request has failed along a given path; this might be part of the unified flow
setup mechanism.
- An interface to a (potentially distributed) mechanism for maintaining the
membership in a multicast group.
- Support for multiple interfaces; that is, multi-homing. Nimrod does this
by decoupling transport level entity (TLE) identification from
interface identification (called locators in Nimrod).
For example, a packet with any valid destination locator
should be accepted by a TCP module, if the destination TLE identifier
(TLEI) is the one assigned to that entity.
- Support for multiple locators per network interface. This is
needed for a number of reasons, among them to allow for less painful
transitions in the locator abstraction hierarchy as the topology changes.
- Support for multiple TLEI's per TLE. This would definitely include both
multiple multicast TLEI's (i.e. TLEI's that refer to a group of TLE's), and
at least one unicast TLEI (the need for multiple unicast TLEI's per TLE is
not obvious).
- Support for distinction between a multicast group as a named entity, and
a multicast flow which may not reach all the members.
- A distributed, replicated, user name translation system (perhaps using
the existing Domain Name System - DNS) that maps such user names into (TLEI,
locator0, ... locatorN) bindings.
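The last item above, and the decoupling of TLE identification from interface identification, can be illustrated with a toy name-translation table. The names, TLEI values, and dotted locator syntax here are all invented for the sketch; a real system would be a distributed, replicated service, perhaps built on the DNS.

```python
# A toy name-translation table: user names map to a
# (TLEI, [locator, ...]) binding.  All values are invented.
NAME_TABLE = {
    "mail.example.net": ("tlei-0x2a",
                         ["us.east.site1.host7", "us.west.site4.host7"]),
}

def resolve(name):
    """Return the (tlei, locators) binding for a user name, mimicking a
    DNS-style distributed lookup with a local table."""
    return NAME_TABLE.get(name)

tlei, locators = resolve("mail.example.net")
# A multi-homed host has one TLEI but several locators; a packet
# arriving via any valid locator is accepted if the TLEI matches,
# since TLE identity is decoupled from interface identity.
assert tlei == "tlei-0x2a" and len(locators) == 2
```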
General Principles for Packet Formats
As a general rule, the design philosophy of Nimrod is 'maximize the
lifetime (and flexibility) of the architecture'. Design tradeoffs (i.e.
optimizations) that will adversely affect the flexibility, adaptability and
lifetime of the design are not necessarily wise choices; they may cost
more than they save. Such optimizations might be the correct choices in a
stand-alone system, where the replacement costs are relatively small; in
the global communication network, the replacement costs are very much
higher.
Providing the Nimrod functionality requires the carrying of certain
information in the packets. The design principle noted above has a number
of corollaries in specifying the fields to contain that information.
First, the design should be 'simple and straightforward', which means that
various functions should be handled by completely separate mechanisms, and
fields in the packets. It may seem that an opportunity exists to save space
by overloading two functions onto one mechanism or field, but general
experience is that, over time, this attempt at optimization costs more, by
restricting flexibility and adaptability.
Second, fixed field lengths should be specified to be somewhat larger than
can conceivably be used; the history of system architecture is replete with
examples (processor address size being the most notorious) where fields
became too short over the lifetime of the system. Below, for each item, an
indication is given as to what the smallest reasonable 'adequate' lengths are,
but this is more of a 'critical floor' than a recommendation. A
'recommended' length is also given, which is the length which corresponds
to the application of this principle, but note that these are not upper
bounds; modulo efficiency concerns, fields cannot be 'too long', only 'too
short'!
It is important to note that this does not mean that implementations must
support the maximum value possible in a field of that size. In practice,
system-wide administrative limits would be placed on the maximum values
which must be supported. Then, as the need arises, the administrative limit
can be increased. This allows an easy, and completely interoperable (with
no special mechanisms) path to upgrade the capability of the network. If
the maximum supported value of a field needs to be increased from M to N,
an announcement is made that this is coming; during the interim period, the
system continues to operate with M, but new implementations are deployed;
while this is happening, interoperation is automatic, with no transition
mechanisms of any kind needed. When things are 'ready' (i.e. the
proportion of old equipment is small enough), use of the larger value
commences.
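The separation between the fixed field capacity and the settable administrative limit can be made concrete with a short sketch; the field width and limit values here are illustrative, not a proposal.

```python
# A field sized well beyond conceivable need, with a separate,
# system-wide administrative limit.  Raising the limit from M to N
# never changes the wire format, only the configured maximum.
FIELD_BITS = 16                 # capacity fixed by the packet format
FIELD_MAX = (1 << FIELD_BITS) - 1

admin_limit = 255               # M: current supported maximum

def accept(value):
    """A router's check: reject values above the administrative limit,
    even though the field itself could carry far larger ones."""
    if value > FIELD_MAX:
        raise ValueError("does not fit the field at all")
    return value <= admin_limit

assert accept(200) and not accept(300)
admin_limit = 4095              # N: the announced increase takes effect
assert accept(300)              # same packet format, no transition mechanism
```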
Packet Format Fields
In considering the packet format, one can distinguish between the
host-router part of the packet's path, and the router-router part. It is
not absolutely necessary that they be the same (although it is simpler),
and if they are different, a format that is good for one may not be best
suited to the other. In addition, Nimrod has three forwarding modes
(flows, datagram, and source-routed packets), and some modes use fields
that are not used by other modes.
Bearing these points in mind, what Nimrod would like to see in the
internetworking packet is:
- Source and destination TLE identification. There are several
possibilities here, such as:
- 'Shortish', fixed length fields which appear in
each packet, in the internetwork header, which contain globally unique,
topologically insensitive identifiers for TLE's. For these 48 bits is
adequate, but at least 64 bits is recommended.
- Some shorter field which appears in all packets which contains a value
which is not globally unique, but only unique in mapping tables on each end,
tables which map from the small value to a globally unique value, such as
a DNS name. For these 8 bits is adequate, but at least 16 bits is recommended.
- Some overall design which
does not include any TLE identification in the packet at all, but
transfers it at the start of a communication, and infers it from then on.
This alternative would have to have some other means of telling which
TLE a given packet is for, if there are several TLE's at a
given destination. Some coordination on allocation of flow-ids, or
higher level port numbers, etc, might do this.
Any of these will do, but the exact choice of which is a question that
includes factors outside the routing, so a specific recommendation cannot
be made here.
- Flow identification. There are two basic approaches here, depending on
whether flows are aggregated (in intermediate switches) or not. (It
should be emphasized at this point that it is not yet known whether
flow aggregation will be needed. The only reason to do it is to control
the growth of state in intermediate routers, but there is no definite case
made that either this growth will be unmanageable, or that aggregating
flows will be feasible.)
For the non-aggregated case, a single flow-id field will suffice.
This must not use one of the two previous TLE fields, as in
datagram mode the flow-id will be over-written during transit of the network.
It is possible to use non-globally
unique flow-ids, (which would allow a shorter length to this field),
although this would mean that collisions would result, and have to be
dealt with. For a non-globally unique flow-id, 24 bits would be adequate,
32 are recommended.
For the aggregated case, three broad classes of mechanism are possible.
- The packet contains a sequence of flow-ids. Whenever aggregation
or deaggregation occurs, the next one along the list is used. This takes the
most space, but is otherwise the least work for the routers.
- The packet contains a stack of flow-ids, with the current one on the top.
On aggregation, push a new one on; on de-aggregation, take one off. This
takes more work, but less space in the packet. Encapsulating packets to do
aggregation does basically this, but entire headers are being stacked, not
just flow-ids. The clever way to do this flow-id stacking is to find out
from flow-setup how deep the stack will get, and allocate the necessary
space in the packet when it's created. That way, all that
ever has to happen is adding a new flow-id, or removing one; it is never
necessary to modify the header to make room for more flow-ids.
- The packet contains only the 'base' flow-id, and the current flow-id.
On aggregation, the latter is updated. The tricky part comes on
de-aggregation: the correct value must be restored. To do this, there has
to be state in the router at the end of the aggregated flow, which says
what the de-aggregated flow for each base flow is. The tradeoff
here is obvious: there is no need for individual flow state for each of
the constituent flows in the routers along the path of that
aggregated flow, except for the last one, which must hold such state.
Other than encapsulation, which incurs significant space overhead after
just a few layers of aggregation, there
appears to be no way to do it with just one flow-id in the packet header.
Even if there is no modification of the packets, but the aggregation is
done by mapping some number of 'base' flow-id's to a single aggregated flow
in the routers along the path of the aggregated flow, the table that does
the mapping is still going to have to have a number of entries directly
proportional to the number of base flows going through the switch.
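The stack-of-flow-ids mechanism (the second class above), with stack space pre-allocated at flow-setup time, might look like the following sketch. The flow-id values and the maximum depth are invented for illustration.

```python
class FlowIdStack:
    """Stack-of-flow-ids sketch: room for the maximum aggregation depth
    (learned at flow setup) is reserved when the packet is created, so
    aggregation only fills a slot and never reshapes the header."""

    def __init__(self, base_flow_id, max_depth):
        self.slots = [None] * max_depth   # pre-allocated at creation
        self.slots[0] = base_flow_id
        self.top = 0

    def aggregate(self, flow_id):
        """Push: a router folds this packet into an aggregated flow."""
        self.top += 1
        self.slots[self.top] = flow_id

    def deaggregate(self):
        """Pop: the far end of the aggregated flow restores the old id."""
        self.slots[self.top] = None
        self.top -= 1

    def current(self):
        return self.slots[self.top]

hdr = FlowIdStack(base_flow_id=0xAB01, max_depth=3)
hdr.aggregate(0xCC02)          # an intermediate router aggregates the flow
assert hdr.current() == 0xCC02
hdr.deaggregate()              # de-aggregation restores the base flow-id
assert hdr.current() == 0xAB01
```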
- A looping packet detector. This is any mechanism that will detect a
packet which is 'stuck' in the network; a timeout value in packets,
together with a check in routers, is an example. If this is a hop-count, it
has to be more than 8 bits; 12 bits would be adequate, and 16 bits is
recommended (which also makes the field easy to update). This is not to say that
networks with diameters larger than 256 are good, or that one should design
such nets, but limiting the maximum path through the network to 256 hops is
likely to provide a very painful limit, later on, just as making 'infinity'
16 in RIP did, eventually. When that ceiling is reached, it's going to be
very painful, and there won't be an easy fix. Note that path lengths of over
30 hops have already been seen in the Internet.
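A hop-count variant of the looping packet detector can be sketched as follows; the 12-bit field width follows the 'adequate' figure above, while the ceiling value is an invented administrative setting.

```python
# A hop-count looping-packet detector.  Each router increments the
# count and drops the packet once it exceeds the (administratively
# settable) ceiling; the field itself is far larger than the ceiling.
HOP_FIELD_BITS = 12
HOP_CEILING = 1024              # well above today's 30+ hop paths

def forward_hop(hop_count):
    """Return the incremented hop count, or None to signal a drop."""
    hop_count += 1
    assert hop_count < (1 << HOP_FIELD_BITS)  # field can always hold it
    if hop_count > HOP_CEILING:
        return None             # packet presumed stuck in a loop
    return hop_count

h = 0
for _ in range(40):             # a long but legitimate 40-hop path
    h = forward_hop(h)
assert h == 40
```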
- Optional source and destination locators. These are
structured, variable length
items which are topologically sensitive identifiers for the
place in the network from which the traffic originates or to which the
traffic is destined. The locator will probably contain internal
separators which divide up the fields, so that a particular field can be
enlarged without creating a great deal of upheaval. An adequate value
for maximum length supported would be up to 32 bytes per locator, and
longer would be even better; up to 256 bytes per locator is recommended.
- Perhaps (paired with the above), an optional pointer into the locators.
This is forwarding state
(i.e. state in the packet which records something
about its progress across the network)
which is used in the datagram forwarding mode to help ensure that the
packet does not loop. It can also improve the forwarding processing
efficiency. It is thus not absolutely essential, but is very desirable
from a real-world engineering view point. It needs to be large enough to
identify locations in either locator; e.g. if locators can be up to 256
bytes, it would need to be 9 bits.
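The combination of an internally separated locator and a pointer recording forwarding progress might look like the sketch below. The dotted locator syntax and element names are invented; only the pointer-advance idea comes from the text above.

```python
# A variable-length locator with internal separators, plus the
# forwarding-state pointer described above.  The syntax is invented.
def locator_elements(locator):
    """Split a locator into its hierarchy elements."""
    return locator.split(".")

def advance(locator, pointer):
    """An 'active' router consumes the element the pointer indicates
    and advances the pointer; since the pointer only moves forward,
    it helps ensure the packet does not loop."""
    elems = locator_elements(locator)
    assert pointer < len(elems)
    return pointer + 1, elems[pointer]

dest = "us.east.site1.host7"    # hypothetical destination locator
ptr = 0
ptr, elem = advance(dest, ptr)  # first active router consumes 'us'
assert (ptr, elem) == (1, "us")
ptr, elem = advance(dest, ptr)
assert (ptr, elem) == (2, "east")
```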
- An optional source route. This is used to support the source routed
packet forwarding mode. Although not designed in detail yet,
two possible approaches can be discussed.
- In one, used with semi-strict source routing
(in which a contiguous
series of entities is named, albeit perhaps at a high layer of
abstraction), the syntax will likely look much like source routes in PIP;
in Nimrod they would be a sequence of Nimrod entity identifiers (i.e.
locator elements, not complete locators), along with clues as to the
context in which each identifier is to be interpreted (e.g. up, down,
across, etc). Since those identifiers themselves are variable length
(although probably most will be two bytes or less, otherwise the routing
overhead inside the named object would be excessive), and the hop count
above contemplates the possibility of paths of over 256 hops, it would
seem that these might possibly some day exceed 512 bytes, if a lengthy
path was specified in terms of the actual physical assets used. An
adequate length would be 512 bytes; the recommended length would be 2^16
bytes (although this length would probably not be supported in practice;
rather, the field length would allow it).
- In the other, used with classical loose
source routes, the source route
consists of a number of locators. It is not yet clear if this mode will
be supported. If so, the header would need to be able to store a
sequence of locators (as described above). Space might be saved by
not repeating locator prefixes that match that of the previous locator
in the sequence; Nimrod will probably allow use of such "locally useful"
locators. It is hard to determine what an adequate length would be for
this case; the recommended length would be 2^16 bytes (again, with the
previous caveat).
- Perhaps (paired with the above), an optional pointer into the source
route. This is also optional forwarding state. It needs to be large
enough to identify locations anywhere in the source route; e.g. if the
source route can be up to 1024 bytes, it would need to be 10 bits.
- An internetwork header length. This is mentioned since the above fields
could easily exceed 256 bytes; if they are all to be carried in the
internetwork header (see comments below as to where to carry all this
information), the header length field needs to be more than 8 bits;
12 bits would be adequate, and 16 bits is recommended. The approach of
putting some of the data items above into an interior header, to limit
the size of the basic internetworking header, does not really seem optimal,
as this data is for use by the intermediate routers, and it needs to be
easily accessible.
- Authentication of some sort is needed. See the recent IAB document
which was produced as a result of the IAB architecture retreat on security,
section 4, and especially section 4.3. There is currently no set way of
doing "denial/theft of service" in Nimrod, but this topic is well explored in
that document; Nimrod would use whatever mechanism(s) seem appropriate to
those knowledgeable in this area.
- A version number. Future forwarding mechanisms might need other
information (i.e. fields) in the packet header; a version number would
allow the header to be modified to contain what's needed. (This would not
necessarily be information that is visible to the hosts, so this does
not necessarily mean that the hosts would need to know about this new
format.) 4 bits is adequate; it's not clear if a larger value needs to be
recommended.
Field Requirements and Addition Methods
It is possible to use Nimrod in a mode where needed information/fields are
added by the first-hop router. It is thus useful to ask 'which of the fields
must be present in the host-router header, and which could be added by the
router?' The only ones which are absolutely necessary in all packets are the
TLEI's (provided that some means is available to the routers to map them into
locators).
As to the others, if the user wishes to use flows, and wants to guarantee
that their packets are assigned to the correct flow, the flow-id field is
needed. If the user wishes efficient use of the datagram mode, it's
probably necessary to include the locators in the packet sent to the
router. If the user wishes to specify the route for the packets, and does
not wish to set up a flow, they need to include the source route.
How would additional information/fields be added to the packet, if
the packet is emitted from the host in incomplete form? (By this is meant the
simple question of how, mechanically, not the more complex issue of where
any needed information comes from. It is assumed here that the packet does
not contain empty fields for this information, but rather a format which
does not include those fields at all.)
This question is complex, since all the IPng candidates (and in fact, any
reasonable inter-networking protocol) are extensible protocols; those
extension mechanisms could be used. Also, it would be possible to carry some
of the required information as user data in the internetworking packet,
with the original user's data encapsulated further inside. Finally, a
private inter-router packet format could be defined.
It's not clear which choice is best, but it is possible to discuss which
fields the Nimrod routers need access to, and how often; less used ones
could be placed in harder-to-get-to locations (such as in an encapsulated
header). The fields to which the routers need access on every hop are the
flow-id and the looping packet detector. The locator/pointer fields are
only needed at intervals (in what Nimrod's datagram forwarding mode calls
active routers), as is the source route (the latter on entry to every
object which is named in the source route).
Depending on how access control is done, and which forwarding mode is used,
the TLEI's and/or locators might be examined for access control purposes,
wherever that function is performed.
This is not a complete exploration of the topic, but should give a rough
idea of the issues.
The Future of Routing
It is worth noting that although this section is specifically about the
requirements of the Nimrod routing architecture, there are reasons to
believe that any routing architecture for a large, ubiquitous global
network will have many of the same basic fundamental principles as the
Nimrod architecture, and the requirements that these generate.
While contemporary routing technologies do not yet have the characteristics
and capabilities that generate these requirements, they also do not seem to
be completely suited to routing in the next-generation Internet. As routing
technology moves towards what is needed for the next generation Internet,
the underlying fundamental laws and principles of routing will almost
inevitably drive the design, and hence the requirements, toward things
which look like the material presented here.
Therefore, even if Nimrod is not the routing architecture of the
next-generation Internet, the basic routing architecture of that Internet
will have requirements that, while differing in detail, will almost
inevitably be similar to these. This makes the requirements of Nimrod worth
looking at, even if the specific Nimrod design is destined to be nothing
more than a historical curiosity.
Bibliography
[Nimrod] Isidro Castineyra, Noel Chiappa, Martha Steenstrup, "The Nimrod
Routing Architecture", RFC 1992, August 1996.