Why is it a bad idea to do routing at the "network" layer, using "network" in the sense of a particular local technology, such as ATM or Frame Relay?
In brief, we are committing the 49-layer model error - the layer X
guys are trying to include functionality in their layer which properly ought
to be at layer Y - a process which inevitably leads to 7*7 layers, i.e. 49. I
reckon that eventually we will want to stop doing that, and move basically
all the routing functionality to the internetwork layer.
I understand that some people, such as common carriers, have reasons for
not wanting to get involved in the internetwork layers, but I think they're
wrong because they haven't thought it all the way through yet.
So, what is the argument against doing routing at the physical-network layer (hereinafter the "network layer", as opposed to the "internetwork layer"), in more detail?
This note has two parts. First, an argument as to why it's
absolutely the wrong thing in the long run. Second, I'll look at some reasons
why it might seem like doing network-level routing is reasonable in some
circumstances (especially in currently deploying common-carrier WAN networks
as an OK short-term kludge), and discuss them.
Well, let's start by looking at one of the key papers in networking, "End-to-End Arguments in System Design", by Saltzer, Reed, and Clark. (*Everyone* in the IETF should read this.) In it we find this thought:
"This paper presents a design principle that helps guide placement of functions among the [layers] of a distributed [information] system. The principle, called the end-to-end argument, suggests that functions placed at low levels of a system may be redundant, or of little value when compared with the cost of providing them at that low level."

In other words, if a function is provided at too low a layer, it will still be necessary to provide the same function at a higher layer anyway, so you've wasted your time providing it at the lower layer. The solution at the lower layer is *necessarily* an *incomplete* one. In many cases (the paper gives several), you've made negative progress by doing so. Depending on how this physical-layer routing is done, the same might well be true here.
Let's start by realizing that routing in an internet is not something you can do partially at the network layer, inside a WAN, and do well. When routing is done partially at a lower layer, a layer which is inaccessible to the internetwork level routing (which you *have* to have, I assume everyone already understands why), the flexibility and functionality of routing, as an overall function, will inevitably suffer.
Examples of this abounded in the early Internet. We tried to use the ARPANet as a backbone, together with a variety of specialized links (such as satellite) between various sites. We wanted to route from S across the ARPANet to nearby (in ARPANet terms) site X, which had a special link to Y, which was near D, where the traffic was headed. Since the ARPANet looked like an integral object to the internetwork routing, and the distance from S to X, Y and D was all the same (since the ARPANet level routing information, which *could* have shown the true variances, was *inaccessible* to the internetwork level routing), it was impossible to do this.
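The effect is easy to reproduce with a toy shortest-path computation. The sketch below (all node names and link costs are invented for illustration) runs plain Dijkstra twice: once over the full topology, where the WAN's internal switches and costs are visible, and once over a "collapsed" topology in which the whole WAN is a single node. The collapsed model makes the direct path through the WAN look cheapest, even though the full topology shows the route via the special link X--Y is really the better one.

```python
import heapq

def shortest_cost(graph, src, dst):
    """Plain Dijkstra over a dict-of-dicts adjacency map; returns path cost."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, node = heapq.heappop(pq)
        if node == dst:
            return d
        if d > dist.get(node, float("inf")):
            continue
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(pq, (nd, nbr))
    return float("inf")

# Full topology: the WAN's internal costs are visible, so the route
# S -> X -> Y -> D (via the special link X--Y) can be preferred.
full = {
    "S":  {"A1": 1},
    "A1": {"S": 1, "X": 1, "A2": 5},   # A1, A2: WAN-internal switches
    "A2": {"A1": 5, "D": 1},
    "X":  {"A1": 1, "Y": 2},           # X--Y: the special (e.g. satellite) link
    "Y":  {"X": 2, "D": 1},
    "D":  {"A2": 1, "Y": 1},
}

# Collapsed topology: the whole WAN looks like one object ("WAN"), so S, X
# and D all appear equidistant and the internal cost variance is lost.
collapsed = {
    "S":   {"WAN": 1},
    "WAN": {"S": 1, "X": 1, "D": 1},
    "X":   {"WAN": 1, "Y": 2},
    "Y":   {"X": 2, "D": 1},
    "D":   {"WAN": 1, "Y": 1},
}

print(shortest_cost(full, "S", "D"))       # → 5, via the cheap exit at X
print(shortest_cost(collapsed, "S", "D"))  # → 2, but the real cost of that
                                           #   path through the WAN is 7
```

The collapsed graph reports a "shortest" path that is actually the most expensive one in the real topology; the information needed to do better is exactly what was hidden.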
In general, the problem is that in doing routing at the network layer, you wish to insulate the internal complexity of the WAN from the internetwork layer. In other words, you wish to make the internal topological complexity of the WAN disappear, and have it appear like a simple object (e.g. a giant Ethernet) to the internetwork layer routing. However, this is often in direct conflict with the goals of the internetwork routing (and reality :-), if the WAN cannot reasonably be modeled this way. If the WAN does not have the service characteristics of a LAN, but a more complex pairwise behaviour, then you can't do anything about dealing with that behaviour if it *looks* like a LAN to the routing.
Worse, you're duplicating functionality. Any *reasonable* internetwork routing architecture has to have built-in abstraction tools to handle the large internets we see now. By creating, and using, other, incompatible, abstraction mechanisms, you're creating a complex, *useless*, nightmare.
To back off, and look at the "information wall" problem, you could create an interface between the network and internetwork level routing, and allow the internetwork routing to pick up and use routing information from the network layer, and control how traffic is routed, but you're just into the same situation as the abstraction thing. You've got two mechanisms in place of one. As the lower level one *cannot* be a *complete* solution, if you want to minimize your complexity/functionality ratio, you have to ditch the lower one.
In all but a few specialized cases, doing network layer routing in an
internetwork is just plain misguided, the same way providing reliable network
level delivery in an internetwork is misguided.
Parenthetically, the hopes of the ATM proponents notwithstanding, the Internet of the future is *not* going to be pure ATM, but rather a mix of technologies, which we will need an internetwork layer to tie together. The argument is pretty easy to make.
Back when I studied systems at MIT as an undergrad, we studied multi-level memories. It was explained to us that this was a valuable thing to study, as even though the technologies of the day (core and "washing-machine" disks at that point) would soon be obsolete, economics and physics would always dictate a mix of memory speeds and sizes, and thus automatic storage management problems would always be with us, and sure enough, they are.
In the same way, economics and physics will, I reckon, always dictate a range of speeds and maximum sizes for available network technologies. If nothing else, a technology that will run at X bits/second on a single channel will run at nX bits/second if run n wide, at a consequent increase in cost. Round trip delays over long distances will always be longer than over short, with consequences for channel access algorithms (until some *very* clever person figures out how to get around ol' Albert :-). Etc, etc, etc.
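As a back-of-the-envelope illustration of both points (the figures below are illustrative assumptions, not measurements): aggregate bandwidth scales with how many channels you run in parallel, but round-trip delay has a hard floor set by propagation speed.

```python
# Aggregate bandwidth scales with channel count; propagation delay does not.
def aggregate_bps(per_channel_bps, n_channels):
    return per_channel_bps * n_channels

def min_rtt_seconds(distance_m, propagation_speed=2e8):
    # ~2e8 m/s is a commonly quoted signal speed in fiber (~2/3 of c).
    return 2 * distance_m / propagation_speed

print(aggregate_bps(155_000_000, 4))      # 4 channels of 155 Mb/s → 620000000
print(min_rtt_seconds(5_000_000) * 1000)  # ~5000 km span → 50.0 (ms), minimum
```

No amount of channel-widening touches that 50 ms floor, which is why delay variance across a large network is not something a single "giant Ethernet" model can paper over.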
There may be some applications designed which *only* operate on ATM, and are closely tied to the service model of ATM. However, I reckon that these will be driven out by applications which can operate anywhere in the Internet; their ubiquity, and wider potential customer base, will be a killer.
So, we're always going to have an Internet with a mix of technologies,
and an internet layer. You can't escape that way.
I understand that the common carriers have reasons for not wanting to get involved in the internetwork layers. In general, the argument goes that due to the proliferation of internetworking layers, the carriers just want to provide a lower layer service that anyone can use, and i) avoid all the politics, ii) avoid all the complexity of handling N internetwork layers, and iii) maximize their potential customer base. These are good points, but they ignore two things.
First, in providing that lower layer service, they shouldn't attempt to provide functionality which is inappropriate at that layer. Most modern datagram services (SMDS, Frame Relay, etc) do *not* attempt to provide reliable delivery, since the vendors realize that they are going to be used in an internetworking environment, where provision of this functionality at the network layer is misguided. The same should apply to routing.
Second, and more importantly, to the degree that they provide these services on networks which include packet switches, you simply can't hide those switches, and expect to get away with it. A system of switches will have certain generic operating problems such as congestion, etc, which will have to be solved at the *internetwork* layer. Any mechanisms which solve these problems at the *network* layer are necessarily *duplicative* and *incomplete*.
Still, in the short run, as an expediency move (particularly in an age
with many different internetwork protocols), it may be reasonable for common
carriers to implement routing functionality. This is not necessarily bad, as long
as everyone realizes up front that it is a short-term kludge. In the long run,
they are going to have to bite the bullet and implement the internetworking
protocol on their switches.
Also, there are potentially cases in which it is useful to hide network layer routing. In general, they fit the following model: the variance of all service characteristics (delay, bandwidth, etc) among different pair-wise connection points is "small". Any network which needs internal routing, and which fits this criterion, can do internal network layer routing without a major problem.
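As a crude sketch of that criterion (the 10% threshold and the delay figures are invented for illustration, not drawn from any spec), one might compare the spread of pairwise delays against their mean:

```python
from statistics import mean, pstdev

def looks_like_a_lan(pairwise_delays_ms, tolerance=0.1):
    """Crude test of the criterion above: can the network reasonably be
    modeled as a single simple object?  True when pairwise delay varies
    little relative to its mean.  The 10% tolerance is an illustrative
    choice, not a standard."""
    return pstdev(pairwise_delays_ms) / mean(pairwise_delays_ms) <= tolerance

print(looks_like_a_lan([5.0, 5.2, 4.9, 5.1]))     # → True: OK to hide
print(looks_like_a_lan([5.0, 80.0, 12.0, 45.0]))  # → False: don't hide it
```

The same check could be applied to bandwidth or loss rate; the point is only that "small variance" is a measurable property, not a matter of taste.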
However, there are two questions. First, this mechanism is going to be
duplicative. Is the benefit (perhaps in a simpler local routing protocol, or
less traffic) worth the cost in extra mechanism? I expect it will be a rare
case where a hard-nosed appraisal says yes. Second, you still have the issue
of the generic problems of packet switches, mentioned above.
At the end of the day, you still come back to the same model: the internetwork is going to consist of transmission facilities, and switches, and the internetwork layer needs to manage *all* the switches.
(Note that I'm *not* saying that all the switches should directly parse
and handle internetwork datagrams, as routers do now; my model of the future
is that the internetwork layer will have a strong "flow" component, and the
actual switches will likely look like ATM switches, but be under the control
of associated "internetwork control nodes".)
Building a network layer routing mechanism may be an interim step in
getting stuff deployed here and now, but in the long run it's a temporary
kludge at best.