Date: 4 Aug 1982 1247-EDT
From: Larry Campbell
To: cbosgd!mark at UCB-C70, header-people at MIT-MC
Subject: Re: news vs mail
Message-ID: <"MS11(2225)+GLXLIB1(1056)" 11845159982.28.71.18348 at DEC-MARLBORO>
Regarding: Message from cbosgd!mark@Berkeley (Mark Horton) of 27-Jul-82 2337-EDT

Note that 32 character message-ID's are not sufficient to guarantee uniqueness, especially in the internet world. A clever mail system will allow users to reference messages by message-ID WITHOUT having to type it; for example, my mail system lets me say READ RELATED-MESSAGES, which shows all messages in a chain of replies (plus the original). Most mail systems assign small numbers to the messages in a given message file, so that users have a short identifier for each message in that file. But message-ID's should be unique across the universe.

P.S.: I also vote for making message-ID mandatory.
--------

Date: 4 August 1982 1659-EDT
From: Rudy.Nedved at CMU-10A
To: Header-People at MIT-MC
Subject: msg id
Message-Id: <04Aug82 165928 EN0C@CMU-10A>

I second the motion to standardize the message id. I would like to see the message id exported to the out of band information. Software could use it to reduce duplicates, notice loops, etc. If the mail receiving process maintained a cache of messages recently received, it could even say "don't bother sending it, I already have it" and deliver the cached copy to the recipient, reducing the amount of time used by uucp dialups, etc. [Duplicates to the same machine but different recipients will become more evident when mail is sent to Sam@A and Fred@B and Fred@B forwards to Fred@A.]

-Rudy

Date: 4-Aug-82 17:43:36-EDT (Wed)
From: cbosgd!mark@Berkeley (Mark Horton)
Subject: Re: news vs mail
Via: cbosgd.uucp (V3.94 [3/6/82]); 4-Aug-82 17:43:37-EDT (Wed)
To: header-people@mit-mc.arpa

Unless people have site names longer than about 25 characters, there should be no uniqueness problem.
And the number 32 is not hard and fast, it's just a reasonable handle on what people can reasonably type. If a site has a hierarchical name that's 50 characters long, nobody will ever send them any mail because it's too much work! (Having just converted to the user@host.arpa addressing format for the arpanet, I'm already annoyed at the extra .arpa on the end. I'm just going to have to get used to it, but there is a reasonable upper bound somewhere.)

Sure, a mail system can look things up, as long as you're in the mail system. Problem is that there are a lot of tools out there that were not intended for the mail system, and you'd like to be able to use them, too. A key example: I want to find all followups in our archived news directory to article 'cbosgd.1234'. There are several thousand messages in there, which is asking a lot of a mail system (also, each message is in a separate file, spanning several layers of subdirectories). On UNIX, I can use the 'grep' utility (which finds all lines containing a string in a list of files) and say

	grep cbosgd.1234 */*

and it will find it, without having ever written a special purpose tool for this purpose. The human mind needs unexpected things like this all the time - one of the key points of UNIX is to combine general purpose tools in ways that the authors had never thought of.

Now, if that string were 80 characters long and full of special characters, blanks, and upper and lower case, it would be a real pain to type. Ease in typing is the key - in fact, it would probably be a good idea to only use lower case letters, digits, and a few punctuation marks like period. Or at least make case not matter in the message id.

Mark

Date: 4-Aug-82 21:16:49-EDT (Wed)
From: cbosgd!mark@Berkeley (Mark Horton)
Subject: msg id
Via: cbosgd.uucp (V3.94 [3/6/82]); 4-Aug-82 21:16:51-EDT (Wed)
To: header-people@mit-mc.arpa

I like the notion of the message ID being out-of-band.
This makes it easier to find for the mail server, and allows it to refuse to accept something it's already seen. I'm not quite sure what it means for the message ID to be out of band (in the "envelope") since although people talk about it, I don't see any actual implementations, or the standard, keeping "envelope" information separate (or at least duplicated). Presumably the message id and other envelope information (not added by forwarders) would also be in the message itself.

Rudy Nedved raises a point that I hadn't thought of, however. News is all to one recipient per system - once the system has seen it, it can throw it away. This is not true of mail - it can go to several people on the same system. To cite Rudy's example, if mail is sent Sam@A, Fred@B; and Fred@B forwards to Fred@A, it's important to make sure Fred@A gets his copy instead of A throwing it away. I suppose the "recent message ID table" could contain a string made even more unique, msg-id || recipient, for each recipient. But this is an implementation problem.

One other big advantage to the message ID being out of band: I remember once a bug in a mailer caused me to get an infinite number of messages from "Graham@Shasta" with no headers and no body. Apparently it was connecting and immediately bombing out. Since no headers had been transmitted yet, there could have been no message ID transmitted, and it would have been almost impossible to detect the duplication. But if the msgid were uttered in the same breath as the recipient, there's no problem!

Mark

Date: 5 Aug 1982 0439-PDT
Sender: BILLW at SRI-KL
Subject: long addresses...
Subject: [menlo70!sri-unix!hplabs!menlo70!sytek!zehntel!teklabs!ucbc...]
From: BILLW at SRI-KL
To: geoff at CSL
Cc: header-people at MC
Message-ID: <[SRI-KL] 5-Aug-82 04:39:48.BILLW>

Really now. Through menlo-70 twice. crashes hermes on a survey command. can UUCP unix wizards mail go straight to sri-unix? This is getting really ridiculous...
WW

Begin forwarded message
Mail-From: ARPAnet host SRI-CSL rcvd at 5-Aug-82 0358-PDT
ARPANET host SRI-UNIX rcvd at 16-Jul-82 2314-PDT
Date: 15 Jul 82 15:12:31-PDT (Thu)
From: menlo70!sri-unix!hplabs!menlo70!sytek!zehntel!teklabs!ucbcad!ARPAVAX.CSVAX.decvax!utzoo!watmath!dmmartindale@BerkeSubject: SI 9700/STC tape experience?
To: Unix-Wizards@sri-unix
Subject: SI 9700/STC tape experience?
Article-I.D.: watmath.3037
Via: news.usenet; 16 Jul 82 23:08-PDT
Remailed-date: 4 Aug 1982 1942-PDT
Remailed-from: the tty of Geoffrey S. Goodfellow
Remailed-to: Unix-Wizards@SRI-CSL: ;

Does anyone have any experience with the Systems Industries 9700 tape controller and/or STC tri-density tape drive running on an 11/780? It looks like we will be buying this combination for a 780 that will be arriving in about a month. How much does the TM11 driver need to be modified to support the 9700 properly? Any comments about performance or reliability?

Dave Martindale, decvax!watmath!dmmartindale
--------------------
End forwarded message

Date: 5 Aug 1982 1158-EDT
Sender: MOOERS at BBNA
Subject: Re: long addresses......
From: MOOERS at BBNA
To: BILLW at SRI-KL
Cc: geoff at CSL, header-people at MC,
Cc: Dodds at BBNA, Mooers at BBNA
Message-ID: <[BBNA] 5-Aug-82 11:58:31.MOOERS>
In-Reply-To: <[SRI-KL] 5-Aug-82 04:39:48.BILLW>

The current Hermes crashes if the username is more than 91 characters long, and if the Hermes template includes the Source: item, which parses the From: field. We have now increased the possible length of the username to 511 characters. This fix will be included in the next version of Hermes, before the end of the month.

I will award a prize to the first person who sends in a valid username of more than 511 characters. Does anyone care to start a pool to guess how long it will be before one appears? Let us hope that TCP addressing intervenes and makes things more reasonable.
Meanwhile, Hermes users might want to edit the STEMPLATE temporarily to replace Source: with From:. This is strictly a band-aid fix, but it will keep you from crashing.

---Charlotte

Date: 5 August 1982 1733-EDT (Thursday)
From: Craig.Everhart at CMU-10A
To: MOOERS at BBNA
Subject: Re: long addresses......
CC: Header-People at MIT-MC
In-Reply-To: <[BBNA] 5-Aug-82 11:58:31.MOOERS>
Message-Id: <05Aug82 173326 CE10@CMU-10A>

Well, really. 511 characters is a band-aid fix, too, as is any seriously static limit. Why bomb out some poor Hermes user three years from now?

I also want to point out that the inclusion of an out-of-band Message-ID field will require a third kind of mail server reply code--i.e. one that indicates successful receipt even though the message wasn't transmitted! Unless you want the message to be transmitted anyway, and simply discarded at the receiving end. Seems like a waste to go that way.

But transmitting Message-ID out-of-band sounds like a good idea. Now let's get cracking on a standard for, or at least a discussion of, the inclusion of out-of-band text as commentary in the in-band mail text itself; programs that process mail will still want to see, and operate on, all this information newly being passed out-of-band. There are probably cases where it would do somebody good if the current out-of-band information (the recipient name) were being saved somewhere in the delivered mail text.

Craig Everhart

Date: 5 Aug 82 22:01:22 EDT (Thu)
From: Steve Bellovin
Subject: Re: long addresses...
To: BILLW at Sri-Kl, geoff at Sri-Csl
Cc: cbosgd!mark at Ucb-C70, header-people at Mit-Mc
In-Reply-To: Message of 5 Aug 1982 0439-PDT from BILLW at SRI-KL <[SRI-KL] 5-Aug-82 04:39:48.BILLW>
Via: UNC; 5 Aug 82 23:42-EDT

Isn't anarchy wonderful....?

USENET is organized (definitely too strong a word) as a multiply-connected graph; each site that receives a news item broadcasts it to several neighbors.
Depending on network propagation delays (most links are 1200-baud dialup), an article from point A might reach point B by any of several different routes; it's unpredictable, uncontrollable, and in general irrelevant anyway. One of the things SRI-UNIX does with "net.unix-wizards" articles is feed them into the ARPA mailing list; given the way news is distributed, there's no way to "mail it directly" to SRI.

Now -- there *is* duplicate article detection code in every version of news in the field. An article that had once passed through menlo70 should never have been sent there again (though there are some problems in that code because of the mixed-net addressing, they do not seem to apply to this message). If it did reach menlo70 again, it should have been rejected immediately as a duplicate. Only if the second copy arrived more than two or three weeks later would it be accepted, and that's almost impossible. I'm at a loss to explain how this note got in its current state, in fact, but it's too old now -- the log files from mid-July are likely long-gone.

What can be done in the future? Well, we're trying to move to the Internet addressing standard. It's hard, because the routing information is important -- Usenet comprises at least 250 machines, and there's no central registry of who talks to whom. So it won't happen soon, but we hope it will, eventually.

Date: 5 Aug 82 23:45:56 EDT (Thu)
From: Steve Bellovin
Subject: Re: long addresses......
To: Craig.Everhart at Cmu-10a, MOOERS at Bbna
Cc: Header-People at Mit-Mc
In-Reply-To: Message of 5 August 1982 1733-EDT (Thursday) from Craig.Everhart at CMU-10A <05Aug82 173326 CE10@CMU-10A>
References: <[BBNA] 5-Aug-82 11:58:31.MOOERS>
Via: UNC; 6 Aug 82 1:55-EDT

I'm starting to wonder about everything being classified as "out-of-band". As Craig Everhart pointed out, mail readers need some of this information.
For example, I'd like to be able to get at Message-Id, so that it can be included in "In-Reply-To" lines -- which in turn can be used to track conversations, a feature I sorely miss. We need return paths; we need envelope addresses; we need mailing-list names -- why move it out of band? We've already got a perfectly acceptable place to put them, and that's the header!

Seriously, there's no reason to maintain this information in multiple places. The actual delivery addresses have to be out-of-band, because they don't always appear in the header in sufficient detail (most of you aren't listed in this note, for example). And return-path information might be needed to reject a letter with a bad header, so it has to be out-of-band. But I see no reason to move anything else; if you don't want to see it on your screen, ask your mail-reader to delete it.

Nor do I think we should be trying to severely limit the number of defined headers. The purpose of header-lines is to improve communication between people; we're better off defining more lines, so that when two different sites decide to use it, they're sharing a common semantics. To cite an example I mentioned before, we'd like to implement a "Priority" field here; our mail receiver will beep a few extra times when a high-priority note arrives, and the mail-reader will display it first. But it's useless for netmail if we call our priorities "High", "Normal", and "Low", and someone else prefers "1", "2", and "3".

Date: 6 Aug 1982 1023-EDT
Sender: MOOERS at BBNA
Subject: Usernames over 511 characters long.
From: MOOERS at BBNA
To: Header-people at MIT-MC, smb.unc at UDEL-RELAY
Cc: Dodds at BBNA, Mooers at BBNA
Message-ID: <[BBNA] 6-Aug-82 10:23:32.MOOERS>
In-Reply-To: Your message of 5 Aug 82 21:18:12 EDT (Thu)

Actually, Hermes will not blow up on usernames of more than 512 characters.
It will (a) refuse gracefully to print more than 511 characters unless you use the LONG-PRINT-FORM template, which prints the message varbatim, and (b) do various strange things if you explode such a message. But no blowups on printing. [PHOTO: Recording initiated Fri 6-Aug-82 9:55AM] ... >get (message-file) testmsg >survey (messages) 1 (using template) survey-form (to file) 1 609 6 Aug 82 Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi >print (messages) 1 (using template) print-form (to file) Message 1 609 6 Aug 82 From: Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.A...[Very long line; print using LONG-PRINT-FORM.] To: Mooers Subject: 512 characters in username. >print (messages) 1 (using template) long-print-form (to file) Message 1; 609 chars Date: 6 Aug 82 From: Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcd efghi.Abcdefghi.Abcdefghi.Abcdefghi.Abcdefghi.Ab at SRI-CSL To: Mooers Subject: 512 characters in username. ... 
[PHOTO: Recording terminated Fri 6-Aug-82 10:00AM]

---Charlotte

Date: 6 Aug 1982 08:44:06-PDT
From: lblg!sventek#j at LBL-UNIX
Subject: Why "out-of-band" information
To: smb.unc at udel-relay
cc: header-people at mit-mc

The discussion which I tried to encourage with my last entry was to make a distinction between the header, which is similar to the memo header of hardcopy correspondence, and the out-of-band information, which is equivalent to the envelope in which the hardcopy memo is encapsulated when using the U.S. Postal Service. It is often the case that the information contained in the message header is sensitive, with the result that the sender would not appreciate some random trusted process, such as a mail delivery module, looking at that information; as a result, the sender encrypts the entire message, and informs the recipient as to the appropriate method for decrypting the sensitive information.

As was mentioned, the only out-of-band information currently kept by SMTP is the Return-Path: information and the actual addresses to which the mail is addressed. Because of this, delivery systems are currently required to scan the message header for additional information upon which to make routing decisions, thus preventing the sealing of sensitive information in an end-to-end fashion. If additional out-of-band information was permitted by SMTP, many of the header fields which message composition utilities currently bury in the header would be placed in the out-of-band compartment, since the purpose of those headers is to guide routing decisions, determine accounting, etc.

It is certainly the case that the accumulated out-of-band information should be made available with the message when it is actually delivered; such information may be required for particular end-user functions, such as message-id's for In-reply-to: fields and the like.
Obviously, the simplest policy for making the out-of-band information available with the least impact to existing mail-handling software is for the final delivery system to place the out-of-band information in the header when it is delivered to the user's mailbox. Recently developed systems may choose to merely associate the two pieces of information in the mailbox database structure, only presenting it to the user upon demand, but still having it available for end-user functions which require that information. Such a system would be required to adequately support the end-to-end encryption scenario of which I am so fond.

Joe Sventek

Date: 6 Aug 1982 1258-EDT
Sender: MOOERS at BBNA
Subject: Re: long addresses......
From: MOOERS at BBNA
To: Craig.Everhart at CMU-10A
Cc: Header-People at MIT-MC,
Cc: Dodds at BBNA, Mooers at BBNA
Message-ID: <[BBNA] 6-Aug-82 12:58:50.MOOERS>
In-Reply-To: <05Aug82 173326 CE10@CMU-10A>

Here's a bit more background on the "long username" problem in HERMES. The senior programmer for HERMES is Doug Dodds, and he sends the following comment:

"511 characters is not just an arbitrary limit, as the earlier 91 characters was. It is the longest string that can be represented in the BCPL string format, and is a fundamental limit to many things in HERMES. In particular, HERMES cannot create messages with lines longer than 511 characters. Any fixed number like this seems arbitrary, but it is not mindless. If anyone seriously needs longer lines, I'm afraid they will have to shop for another mailsystem. Cheers, Doug"

But the statement in the previous message still holds true. HERMES cannot create a line longer than 511 characters, but it will now print the first 511 characters of any 511-plus line that it receives in a message from some other source, and then truncate the rest of the line gracefully.
---Charlotte

Date: Sunday, 8 Aug 1982 00:42-PDT
To: MOOERS at BBNA
Cc: Craig.Everhart at CMU-10A, Header-People at MIT-MC
Subject: Re: Re: long addresses......
In-reply-to: Your message of 6 Aug 1982 1258-EDT. <[BBNA] 6-Aug-82 12:58:50.MOOERS>
From: greep at RAND-UNIX

I recommend that any further discussion along these lines be moved to info-hermes or whatever. I'm sure most of us don't really care to hear about problems with the BCPL compiler.

HNIJ@MIT-AI 08/09/82 18:22:29
To: header-people at MIT-MC

please add me to your mailing list. john.

Date: 11 Aug 82 11:06:45-EDT (Wed)
From: Dave Crocker
To: Header-People at Mit-Ai
Subject: Additional changes to the spec

I am putting the final touches to the spec today and thought that you might be interested in the important changes. I had hoped that this last round would not have such modifications, but some people pointed out significant problems:

All of the parameters in the "Received" field are optional. It was observed that given systems, in given cases, may not have access to things like the name of the host that sent the mail.

The "with" parameter may be used repeatedly, so that you can list the full set of protocols used over the link.

The and rules were removed, so that the "via" and "with" parameters are now specified as taking an atom. This is designed to remove specific names from the standard, so that the NIC will hold the ONLY list and people will not assume that the (original) list in the standard is correct.

The "id" parameter now takes a .

"In-Reply-To" and "References" have field-bodies which are sequences, but not lists. That is, you need not use commas to separate things.

In the section on Automatic Use of From / Sender / Reply-to, it is advised that Sender be used when the transport system is sending notices of transport and delivery problems.

The rule now, again, requires the semi-colon. It was observed that making it optional can lead to some very messy (almost ambiguous) parsing.
The recognition of "Postmaster" was clarified to require that it be recognized without regard to upper and lower case.

The rule for quoting individual characters, with backslash, was clarified. Quoting is only permitted within quoted-strings, comments, and domain-literals. It specifically is not permitted within atoms. This has confused some people. Were there less history to this specification, its use, and the amount of existing software, I believe that the quoting mechanism should have been fully general, but...

Dave

Date: 16 August 1982 17:37 edt
From: Frankston.SoftArts at MIT-MULTICS
Subject: Message-ids
Sender: COMSAT.SoftArts at MIT-MULTICS
To: header-people at MIT-MC
*from: BOB via SAI via MIT-Multics (Bob Frankston)
Original-date: MON AUG 16 1982 08:21:52
Original to: header-people at mit-mc

This discussion might be over by now, but I just want to point out that "host-name" is not unique since it is only unique within a given network. Something like an international phone number (11 digits in North America) is a good starting point as a replacement for host name.

Date: 16 Aug 82 21:05:12-EDT (Mon)
From: Dave Crocker
To: Header-People at Mit-Ai
Subject: 733->822 wrapup

This is mostly to thank all of you for your assistance with the new specification. Sound and fury notwithstanding, there were many good ideas and it was very helpful having the different sides argued.

There was one clarification that went in at the last minute. I would like to make sure that all of you note it:

The rule for domain abbreviation was re-worded and is, formally, somewhat different than in the earlier versions. Basically, the short-form now is only permitted a) if the sender's domain permits it, b) only while the message is within that domain, and c) only if the address has a domain reference which ends with ALL but the first of the sequence which is the sender's domain.
(That is, if the sending domain is @c.b.a, then the address must end with b.a, tho it is permitted to have additional sub-domain references to its left, such as d.c.b.a. The b.a may be left off.) Lastly, top-level domain names are reserved only within the domains that permit abbreviation.

A side comment is that you should note that the specification for Message-Id is intended to permit global uniqueness, where 'global' is defined to be the internet. The Id includes the full domain specification of the creating host and it is then expected that the host will generate a qualifier which is unique within that system.

Dave

Date: 16 Aug 82 21:48:00 EDT (Mon)
From: Steve Bellovin
Subject: Re: Message-ids
To: Frankston.SoftArts at Mit-Multics, header-people at Mit-Mc
In-Reply-To: Message of 16 August 1982 17:37 edt from COMSAT.SoftArts at MIT-MULTICS
Via: UNC; 16 Aug 82 22:31-EDT

Bob Frankston has a point -- the syntax specifies a "route-id" for the "Message-Id" field, but that doesn't work well in an Internet environment. If a message is forwarded to another net by one of the recipients addresses, it's easy (well, relatively easy) to adjust the "From", "To", etc., lines to contain the proper network name. But you can't mung a "Message-Id", because the sender will not know of a message with the altered identification field.

Date: 17 August 1982 16:20 edt
From: Charles Hornig at MIT-MULTICS
Subject: RFC 822 migration
Sender: Hornig.Multics at MIT-MULTICS
To: header-people at MIT-MC
Message-ID: <820817202025.386599 at MIT-MULTICS>

Due to an oversight, MIT-Multics was accidentally sending out mail with headers containing domain names for a while today. This pointed up a severe problem with RFC 822. RFC 733 header parsers cannot reply to mail with an RFC 822 header. This makes it very difficult to be the first to convert. Any ideas on how to deal with this?
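
[Editorial sketch, not part of the original traffic.] The Message-Id scheme Dave Crocker describes in his 16 Aug wrapup (a host-generated qualifier, unique within the host, joined to the host's full domain) and the per-recipient "recent message ID table" Mark Horton suggested on 4 Aug can be illustrated roughly as follows. The function and class names, the timestamp-plus-counter qualifier, and the example domain are all assumptions made for illustration; none of them comes from RFC 822 or from any mailer mentioned in this thread.

```python
import itertools
import time

# Hypothetical per-host sequence number; any qualifier that never repeats
# on this host would serve the same purpose.
_sequence = itertools.count(1)


def make_message_id(domain, now=None):
    """Return '<qualifier@domain>' with a qualifier unique on this host."""
    stamp = time.strftime("%y%m%d%H%M%S", time.gmtime(now))
    return "<%s.%d@%s>" % (stamp, next(_sequence), domain)


class SeenTable:
    """Recent-delivery table keyed on (message-id, recipient).

    Keying on the message-id alone would discard Fred@A's copy in Rudy
    Nedved's example, where one message reaches two users on host A.
    """

    def __init__(self):
        self._seen = set()

    def first_delivery(self, message_id, recipient):
        """True the first time this pair is seen, False for a duplicate."""
        key = (message_id, recipient)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

With this table, the same message-id delivered to Sam@A and then to Fred@A is accepted both times, while a second copy for Fred@A is flagged as a duplicate.
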

Date: 17 Aug 1982 1531-PDT
Sender: WESTINE at USC-ISIF
Subject: Message-ids
From: Postel@isif
To: Header-people at MIT-MC
Message-ID: <[USC-ISIF]17-Aug-82 15:31:23.WESTINE>

The domain name is to be an absolute global name for a host. The message-id argument is "msg-id" which is an "addr-spec" which is "local-part@domain". No route involved.

--jon.

Date: 17 Aug 82 18:52:28-EDT (Tue)
From: Dave Crocker
To: Charles.Hornig at Mit-Multics
cc: header-people at Mit-Mc
Subject: Re: RFC 822 migration

The domain abbreviation mechanism was included specifically for the purpose of trying to help existing software keep from breaking. As long as the address conforms to the abbreviation rule, send it out in short-form. Depending on what the 'sub-domain' reference is, the address may be reply-able by 733 parsers.

Dave

Date: 19 Aug 1982 1450-PDT
Sender: WESTINE at USC-ISIF
Subject: RFC 822 Migration
From: Postel@isif
To: Header-People at MIT-MC
Message-ID: <[USC-ISIF]19-Aug-82 14:50:48.WESTINE>

The conversion from the current mail procedures and formats will have to proceed through stages. First, people should change their programs to accept the new formats (as well as everything they accept now). When enough systems have made this change, people can change their programs to begin sending the new stuff. Finally, when enough systems send the new stuff, people can cut the old stuff out of their programs.

--jon.

Date: 25 August 1982 04:48-EDT
From: Frank J. Wancho
Subject: Header munging
To: HEADER-PEOPLE at MIT-MC
cc: ARPAVAX.dag at UCB-C70

Is this sort of apparent header munging really necessary? Are or should there be rules to handle this to present to the user? What should any Reply function do if given this as-is? Or - is this a remailed message by the originator that somehow lost the blank line between the original message and the new header?
Date: 25 Aug 1982 00:46:50-PDT
From: ARPAVAX.dag at Ucb-C70
Mail-From: UCBARPA received by UCBVAX at 25-Aug-82 00:55:21-PDT (Wed)
Date: 24-Aug-82 17:46:44-PDT (Tue)
From: ARPAVAX: dag (David Gewirtz) at Berkeley
Message-Id: <8207250046.3474.ARPAVAX@Berkeley>
Via: ucbvax.EtherNet (V3.147 [7/22/82]); 25-Aug-82 00:55:21-PDT (Wed)
Received: by UCBARPA (3.165 [8/23/82]) id a03474; 24-Aug-82 17:46:46-PDT (Tue)
To: info-cpm-request at BRL
Via: Ucb-C70; 25 Aug 82 3:57-EDT

Date: 25 Aug 1982 15:12:11-PDT
From: ARPAVAX.dag at Berkeley
Mail-From: UCBARPA received by UCBVAX at 25-Aug-82 15:20:33-PDT (Wed)
Date: 25-Aug-82 14:48:23-PDT (Wed)
From: ARPAVAX:dag (David Gewirtz)
Subject: Header munging
Message-Id: <8207252148.17151.ARPAVAX@Berkeley>
Via: ucbvax.EtherNet (V3.147 [7/22/82]); 25-Aug-82 15:20:33-PDT (Wed)
Received: by UCBARPA (3.165 [8/23/82]) id a17151; 25-Aug-82 14:48:25-PDT (Wed)
To: FJW@MIT-MC, HEADER-PEOPLE@MIT-MC
Cc: ARPAVAX.dag@UCB-C70

I think I understand your message as being a complaint about the verbosity in the header. Unfortunately, since I don't have the handy-dandy mailer sources for all of the machines a message goes through, I can't do much about it.

However, if you simply want to know how to get a message to me, on this machine (ARPAVAX), it is done as "ARPAVAX.DAG at UCB-C70" officially. I believe that "ARPAVAX.DAG at BERKELEY" and perhaps "ARPAVAX:DAG at BERKELEY" will also work. If that fails, or you are truly sick of life, you can simply mail to "DAG at BERKELEY" or "DAG at UCB-C70", which will put your randomnesses into my account there.

I guess that should do it.

David

P.S. You could always use the telephone, write a letter, or contact Pony Express (a subsidiary of Federal Express for those who MUST have over-year delivery)

Date: 26 Aug 1982 2250-PDT
Sender: GEOFF at SRI-CSL
Subject: Xerox Woes -- Long lines with out returns in them.
From: the tty of Geoffrey S.
Goodfellow
Reply-To: Geoff at SRI-CSL
To: MsgGroup at BRL, Header-People at MC
Message-ID: <[SRI-CSL]26-Aug-82 22:50:49.GEOFF>

The following poop is reputed to come from Xerox. It deals with the problem of letting long lines out to pollute our old outmoded terminals which are afflicted with inferior technology(?). Hopefully this is not a veiled scheme to induce us all to buy STARS!? Here is the Xerox statement of the problem, and the (unfortunate) disposition of the Xerox mail system maintainers on the issue.

======= Begin Xerox Poop ======

Description: Getting mail with long sentences without CRs poses serious problems for many ARPAnet recipients. It is not reasonable to require Xerox personnel to remember to manually apply the Chop hack to such msgs before sending. MailSend should incorporate the Chop feature, the way Laurel does.

Disposition: Rejected

"This is the wrong solution to the problem. As is stated in the RFC to replace 733 (the standard for ARPA messages) ... it is interesting to note that the receiver of a message can exercise an extraordinary amount of control over the message's appearance." By placing explicit CRs into the message (where just white space line breaking was intended by the user) on the way out, the formatting ability of the remote displayer is limited. As someone noted recently "In general, the sender cannot know the properties of the receiver's terminal; so it is inappropriate for the sender to insert automatic line breaks for formatting purposes. Indeed, such insertion can make the receiver's job much more difficult: it is far easier for the receiver to insert line breaks as they are needed than for it to distinguish CRLFs that were inserted by the sender's mail program from ones that truly represent breaks in the text."

====== End Xerox Poop ======

Does anyone out in ARPANET, or CSNET, or INTERNET, (other than Xerox StarNet) agree with this type of rationale? Does anyone like these long lines?
Is there anyone out there with a terminal or a system that can and does break long lines as the Xerox argument claims we can and do? I find the Xerox attitude on this issue rather anti-social myself....

Then of course, there is also the issue of Xerox Internet mail leaking out onto the ARPANET Internet without "at PARC-MAXC" on its from fields.
-------

Date: 26 Aug 1982 23:08:51-PDT
From: ARPAVAX.dag at Berkeley
Mail-From: UCBARPA received by UCBVAX at 26-Aug-82 23:18:44-PDT (Thu)
Date: 26-Aug-82 21:23:22-PDT (Thu)
From: ARPAVAX:dag (David Gewirtz)
Subject: Re: Header munging
Message-Id: <8207270423.1003.ARPAVAX@Berkeley>
Via: ucbvax.EtherNet (V3.147 [7/22/82]); 26-Aug-82 23:18:44-PDT (Thu)
Received: by UCBARPA (3.174 [8/25/82]) id a01003; 26-Aug-82 21:23:24-PDT (Thu)
To: KING@CMU-20C, dag.at.BERKELEY@cmu-20c, dag.at.UCB-C70@cmu-20c
Cc: header-people@mit-mc c fjw@mit-mc

You know, I really don't appreciate your lousy attitude. I have no control of the system headers, and if you don't like them, you do something about it. Personally, I don't see any need to put up with your garbage. The following message is just rude:

>From KING@CMU-20C Thu Aug 26 18:52:01 1982
Date: 26 Aug 1982 1032-EDT
From: Dave King
Subject: Re: Header munging
Message-Id: <820725103218KING@CMU-20C>
Received: from ucb by UCBARPA (3.174 [8/25/82]) id a06402; 26-Aug-82 18:52:00-PDT (Thu)
To: ARPAVAX.dag.at.BERKELEY , dag.at.BERKELEY , dag.at.UCB-C70
Regarding: Message from ARPAVAX.dag at Berkeley of 25-Aug-82 1812-EDT
In-Reply-To: <8207252148.17151.ARPAVAX@Berkeley>

What is this garbage you are putting in my mail file? If you can't send mail that a program (or a human) can parse, please keep quiet and use other means of communicating.
--------

Sorry to those of you who have to put up with this, but I would appreciate it if you would reevaluate your attitude and complaints. I do not wish to put up with any of this.
Yours truly, David  Mail-from: SU-NET host Shasta rcvd at 27-Aug-82 0123-PDT Date: Friday, 27 Aug 1982 01:23-PDT To: Geoff at Sri-Csl Cc: MsgGroup at BRL, Header-People at Mit-Mc Subject: Re: Xerox Woes -- Long lines with out returns in them. Reply-to: Hartwell at Score In-reply-to: Your message of 26 Aug 1982 2250-PDT. <[SRI-CSL]26-Aug-82 22:50:49.GEOFF> From: Steve Hartwell I agree with the Xerox position. Although I am sympathetic to those who do not have the easy capability to reformat incoming messages (I mean in their personal mailboxes -- It would be a gross error to implement linefolding at the input MTP level) I feel very strongly about interfering with the contents (body) of mail messages. For one, it is both conceivable and in practice on this machine for mail messages to be processed by programs rather than people. For example, on Shasta you can send mail to our Canon printer. (Remotely, from any machine). I would certainly NOT want the sending program to break lines at some arbitrary boundary for me, for every reason under the sun. If this seems like a trivial example, I can provide more if you'd like. Moreover, given that there are so many different terminal widths, it seems unfair to choose what would have to be the lowest common denominator of them all. For those of you who think this is 80, or even 72, think again. The number of users with Apple I 40-column monitors is not a small minority. No, the width of the output is the responsibility of the thing that is doing the output. I can no more insist that the sender respect my terminal width than I would insist that s/he include screen formatting control characters for my particular terminal type. It doesn't do any harm to ask contributors to consider the audience when sending a message; on that basis I would add my opinion that it is unwise to send messages with long lines unless there is some gainful purpose to doing so. (While I'm at it, could you Puh-LEEZE not right-justify text... !) 
Item 2: about not adding "at PARC-MAXC" Any message which does not contain a proper, valid, and attainable return address should be rejected. Mail headers are read and interpreted by programs routinely, and should conform to proper format for that reason (as compared to the message body, which has no such requirement, and shouldn't). Besides, it's not fair to force readers to guess how to reply to messages or edit the incoming messages so that the headers are in the format they should have been in the first place. I potentially hang myself here since Shasta is on the local Stanford ethernet and my return address by default would be indirected through Sumex or Score, and some mailers do not permit addresses of the form Hartwell@Shasta at Sumex (Notably, USC-ECLC, the home of our beloved Moderator....). Fortunately, I can provide a Reply-to field that forwards to me here at Shasta.  Date: 27 Aug 1982 0138-PDT From: Mark Crispin Subject: Re: Header munging Sender: ADMIN.MRC at SU-SCORE To: ARPAVAX.dag at UCB-C70 cc: KING at CMU-20C, header-people at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In-Reply-To: Your message of 26-Aug-82 2308-PDT David Gewirtz - There are standards established for the format of an electronic mail message sent over Internet. These standards have a purpose, in that Internet (including but not limited to the ARPANET component) is a production network and electronic mail is a production service of Internet. It is completely unreasonable to expect every mail receiver in the world to be able to understand everybody's idea of an electronic mail format. Perhaps you were being unfairly blamed for the incorrect format of your message headers. The fact is, however, that your headers are in an incorrect format and can cause (and from all reports have caused) great grief to the unfortunates who receive your mail. 
If you are the maintainer of the software which generated that illegal header, the blame lies squarely with you. If you are a manager at the facility, it also lies with you; you should not allow your programmers to install such broken software. Even if you are merely a user, you can still put pressure to bear upon your management to get the problem fixed. There have been too many excuses from Berkeley and/or UUCP-land about how it isn't possible to change the software, or how it is an imposition to force UUCP to use Internet standards. I have no intention of suggesting that UUCP should change its standards. However, if its electronic mail is interconnected with Internet then it must NOT transmit any electronic mail to Internet that does not conform with Internet standards. The direct responsibility of protocol translation lies upon the relay, Berkeley. It should either do a full and correct conversion, or it should bounce back as undeliverable mail that does not conform. The supposed multitude of UNIX sites who do not want to convert to Internet format electronic mail will convert if they want interconnectivity with Internet. If not, they won't be able to send mail to Internet and again that is no problem. -------  Date: 27-Aug-82 01:46-PDT From: KELLEY at OFFICE Subject: RE: Xerox Woes -- Long lines with out returns in them. To: Geoff at SRI-CSL Cc: MsgGroup at BRL, Header-People at MC Identifier: TYM-KIRK-13ASX In-reply-to: <[SRI-CSL]26-Aug-82 22:50:49.GEOFF> Length: 1 page(s)[estimate] Posted: 27-Aug-82 00:45-PDT As a Tymshare (Augment) Mail user, I like the official Xerox policy and plan to do it here for exactly the reasons given. It allows receiving tables intact and still "chopping" paragraphs as needed by Augment readers' window sizes. -- Kirk  Date: 27 August 1982 04:54 edt From: Barry Margolin at MIT-MULTICS Subject: Re: Xerox Woes -- Long lines with out returns in them. 
Sender: Margolin.Multics at MIT-MULTICS To: Hartwell at SU-SCORE cc: Geoff at SRI-CSL, MsgGroup at BRL, Header-People at MIT-MC In-Reply-To: Message of 27 August 1982 04:32 edt from Steve Hartwell is 1000 characters (but not counting the leading dot duplicated for transparency)." This translates to no more than 12 80-character lines; line 13 will violate the SMTP standard unless a line break is forced. If Xerox's MailSend maintainers really did send that message, it is an example of monumental egotism which reflects poorly on the entire Xerox organization, including the many individuals there who have worked long and hard to establish better communications between Xerox and the rest of the world (not to mention having contributed a good deal to the state of the art in general). I hope that the Xerox personnel on these lists will do us a favor by putting pressure to bear on the individuals who have this attitude. Xerox's software, as evidenced by the Alto/Dolphin software I am most familiar with, goes to some lengths to shield the user from the need to format text. In particular, it dynamically formats the text in real time as the user edits it, PROVIDED the user enters the text as a run-on, unformatted line. This is similar to what EMACS does with Auto Fill mode; unlike EMACS, the Xerox style formatting is done continuously (EMACS converts the last space into a CRLF if the line becomes too long). This relieves the Xerox user from issuing a formatting command (the EMACS M-Q command), but also lulls the Xerox user into a false sense of security that the message will look to the recipient the way it looks to the sender. The reason is that while EMACS actually converts the stored spaces into CRLF's, the Xerox software stores the text as specified by the user and only displays it in a formatted fashion. 
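The distinction drawn above, between a sender that stores its line breaks and one that only displays them, can be sketched in a few lines of Python. This is an illustration only, not the actual EMACS or Xerox code; the function name auto_fill and the 20-column width are assumptions made up for the example.

```python
def auto_fill(text, width):
    """Hard-fill: break at the last space that fits within `width`.

    Roughly the behaviour the message attributes to EMACS Auto Fill mode.
    The breaks become part of the stored text. (Hypothetical helper.)
    """
    lines = []
    current = ""
    for word in text.split():
        if current and len(current) + 1 + len(word) > width:
            lines.append(current)      # the space before `word` becomes a break
            current = word
        else:
            current = word if not current else current + " " + word
    if current:
        lines.append(current)
    return "\n".join(lines)


typed = "the quick brown fox jumps over the lazy dog"

stored_filled = auto_fill(typed, 20)   # EMACS-style: breaks are stored and sent
stored_unfilled = typed                # display-only style: one long line is sent

print(stored_filled)
print(repr(stored_unfilled))
```

In the first case the recipient receives the sender's breaks; in the second the recipient receives a single long line and must fold it locally.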
The argument that the sender does not know of the properties of the receiver's terminal is spurious, as is the claim that the receiver would have trouble interpreting between needed line breaks and true breaks in the text. Most Internet senders assume a "canonical terminal width" of between 65 and 72 characters. This length displays reasonably on most terminals; terminals with shorter widths generally have features to work around their limitation. For example, my Atari microcomputer's terminal emulator deals with its 40 character/line limit by doing Xerox- style dynamic formatting at display time; a 65 to 72 character line comes out just about right. The argument about line breaks vs. text breaks confuses me. A premature line break is obviously a text break. Paragraph breaks are preferably expressed as two consecutive line breaks; a combination pleasing to the eye on almost any medium displayed upon. -- Mark -- -------  Date: 27 Aug 1982 0510-PDT From: Mike Peeler Subject: RE: Xerox Woes -- Long lines with out returns in them. To: KELLEY at OFFICE-2 cc: Geoff at SRI-CSL, MsgGroup at BRL, Header-People at MIT-MC, Human-Nets at SU-SCORE In-Reply-To: Your message of 27-Aug-82 0146-PDT I agree with the 0X people. It's so easy to do something right about long lines, and so lazy to bomb out on them, that I just pity the poor people who have to work on systems with such brain damage. I have no sympathy, however, for those on micros with 40-character wide displays and no lowercase. I would opt to use my Smith-Corona and correcting cartridge rather than deal with such idiocy. I support 0X' stand on long lines even though I've always reformatted digests I put out to 70 columns, and even though I myself limit my text to columns 10 through 60. The former was my tribute to the reality that some of our readers do not live in the best of all possible worlds. The latter was the result of my own evaluation of the human factors involved in reading. 
Printed text of the readable variety is between eight and twelve words per line. A left margin prevents automatic text-processors from reformatting: given that I'm doing the formatting on the sending side, I don't want it messed up at the other end. So I chose left and right margins of ten each from columns 0 and 70, respectively, which puts about ten words on a line. With only 40 columns the right margin would get excessively ragged, and with 60, it would be centered on the screen instead of around reading position, which is normally somewhat to the left of real center and somewhat more so on a crt. For you word processor fans, EMACS makes all this happen automagically. Let it be known that even after 0X' chopper gets through, the lines remain longer than eighty characters, a fact I detected by the tell-tale warparound (sic) that messages from our Altos produce on our cheapo Heap... uh, Zenith's. (I'm even now on an Alto masquerading as a Datamedia. It's great!) That puts them more than 25% beyond optimal reading range. On another topic, current on Human-Nets, I have just switched back to two spaces after periods. I noticed that I do read faster when I can isolate the sentences visually. (Holistic parsing?) In most print the punctuation abuts tightly against the previous character, so that there is no real need for much extra distance. With a fixed-width font, though, I think an extra space does help. Regards, Mike -------  Date: Friday, 27 August 1982, 13:23-EDT From: Robert W. 
Kerns Subject: Re: Header munging To: ARPAVAX.dag at UCB-C70 Cc: KING at CMU-20C, dag.at.BERKELEY at cmu-20c, dag.at.UCB-C70 at cmu-20c, header-people at mit-mc In-reply-to: <8207270423.1003.ARPAVAX@Berkeley> Date: 26 Aug 1982 23:08:51-PDT From: ARPAVAX.dag at Berkeley Date: 26-Aug-82 21:23:22-PDT (Thu) From: ARPAVAX:dag (David Gewirtz) Via: ucbvax.EtherNet (V3.147 [7/22/82]); 26-Aug-82 23:18:44-PDT (Thu) Received: by UCBARPA (3.174 [8/25/82]) id a01003; 26-Aug-82 21:23:24-PDT (Thu) To: KING@CMU-20C, dag.at.BERKELEY@cmu-20c, dag.at.UCB-C70@cmu-20c Cc: header-people@mit-mc c fjw@mit-mc You know, I really don't appreciate your lousy attitude. Dear garbage-header: If this is not your doing, please stick it in the face of whoever maintains your mail software, and don't take it personally. I was once in your shoes, and my anger is only at whoever installed this crock. I really don't appreciate receiving your lousy mail. Your screwed up headers break my mail reader. This is not the fault of my mail reader, since it was designed in good faith to handle a number of different message header formats, including the standard to which yours is required to conform: RFC733. It is impossible for me to modify my mail reading program to understand your bizarre format because I can see no basis for choosing between your multiple dates, FROM: lines, etc, other than that one must be correct and the other is incorrect. (There is also the matter of there being an incredible amount of distasteful GARBAGE there, but I can forgive ugliness, although it reflects very poorly on the professionalism of the program's author). Even if I were to figure out how to choose between them, if we allow this flouting of community standards to continue, someone will just do it another way that is incompatible with yours. You speak of attitudes. A good part of my anger comes from the attitude that we keep hearing from Unix-land, where with one mouth we hear "Ah, UNIX is so wonderful. 
Anybody can go in and modify the software", and with another mouth (is this Hydra, maybe?), we get "We can't fix our software, there's too many of us". I'm tired of hearing Unix's faults sung as praises. It's a self-serving ego trip that has held back university computing by maintaining the defective status-quo. Please, if you can't fix your headers, quarantine them. Send your messages via someone who can put a decent header on them for you. Stop making my life difficult. In fact, I don't even demand that. Just assure me you'll DO something, be it start working on the software or ask your software maintainer to do something.  Date: 27 Aug 1982 12:43 PDT From: AHenderson at PARC-MAXC Subject: Re: Xerox Woes -- Long lines with out returns in them. In-reply-to: Admin.MDP's message of 27 Aug 1982 0510-PDT To: Mike Peeler cc: KELLEY at OFFICE-2, Geoff at SRI-CSL, MsgGroup at BRL, Header-People at MIT-MC, Human-Nets at SU-SCORE Mike: The other piece that you might want to factor into your arguments is that not all terminals measure line length in "characters". Some use "points" or "inches". What with variable pitch fonts and different type sizes, families and faces, breaking lines into "reasonable" lengths becomes even more problematic to handle at the sending end. In fact, this discussion reminds me of another in the field of graphics: resolution independence; there too, it is important to separate content from form in a way which still honors the creator's intent. All the best, Austin PS: Let me hasten to add that in this matter I speak for myself, not the company that I am pleased to work for. Were I to do the latter, I'd probably have to spend more time than I'd like with the lawyers.  Date: Friday, 27 August 1982, 15:54-EDT From: Robert W. 
Kerns Subject: Re: Xerox Woes -- Variable width fonts To: AHenderson at PARC-MAXC Cc: Mike Peeler , KELLEY at OFFICE-2, Geoff at SRI-CSL, MsgGroup at BRL, Header-People at MIT-MC, Human-Nets at SU-SCORE In-reply-to: The message of 27 Aug 82 15:43-EDT from AHenderson at PARC-MAXC Coming from a competing environment with variable width fonts, I'd like to point out that in order to transmit anything remotely tabular, there must be some sort of agreement on what the relative widths of characters are. Seems the only feasible way to do that currently is to use fixed-width fonts for displaying messages. That's what we do; and I can't say that fixed width fonts have caused me cancer of the eyeball, but they have allowed me to make sense of many messages that would have been rendered unreadable with variable-width fonts. To use variable-width fonts when composing a message meant for the outside world strikes me as elitist and isolationist, given the current state in the rest of the world. This doesn't mean I'm not willing to talk about improving the rest of the world so we could use variable-width fonts, graphics, sound, animation, or what have you in messages, but I do recognize the current limitations and stick to fixed-width ascii messages.  Date: 27 Aug 1982 1340-PDT From: POSTEL at USC-ISIF Subject: Effective Communication To: MsgGroup at BRL, Header-People at MIT-MC, To: Human-Nets at SU-SCORE When the sender or speaker cares about effectively communicating he or she adapts to the capabilities of the receiver or listener. When we want children to understand something we speak slowly and use common small words. When we send computer mail we adapt to an "average capability" receiver. There are of course differences of opinion as to what this "average capability" is, and even more differences about what it should be. 
I think that for some time the "average capability" has been reasonably assumed to be the ability to handle ASCII characters with up to 72 characters on a line. The assumption is that these will be fixed width. There is no real expectation that format effectors like back space and even tab will do reasonable things for the "average capability" receiver. I think it is pretty clear that receivers with 40 character lines know that they are below average, and don't complain about longer lines. When better than average senders communicate with better than average receivers they can make use of their common superior features, but when they want to include the average receiver in their discussion, they should come down to our level. --jon. -------  Mail-from: SU-NET host Shasta rcvd at 27-Aug-82 1422-PDT Date: Friday, 27 Aug 1982 14:22-PDT To: Geoff at SRI-CSL, MsgGroup at BRL, Header-People at MIT-MC, Human-Nets at SU-SCORE Subject: Re: Xerox Woes -- Variable width fonts In-reply-to: Your message of Friday, 27 August 1982, 15:54-EDT. From: Brian Reid Good grief; this has certainly degenerated into a lot of @i[ad hominem]. Let me, as an outsider to Xerox, explain what is going on before people get angrier in the absence of the facts. It is certainly true that some Xerox-origin mail violates the Arpanet format standards, but there is no malice or piggishness behind it. Xerox has an in-house mail system that handles many times as much mail as the Arpanet. That mail system is divided into two distinct parts, namely the delivery part and the sender/reader part. The delivery part is called "Grapevine"; it is documented in various public technical reports. It is uniformly used by everyone, and works beautifully; it is a much nicer mail system than anything I have seen on the Arpanet or in Unix-land. This mail system originated in the Xerox research lab in Palo Alto. 
Originally there was an Alto mail program, named Laurel (also documented in a public technical report) that sent, received, filed, and managed mail. Laurel is a masterpiece of human engineering, but not terribly functional. One of the properties of Laurel was that it automatically wrapped lines to fit on the screen, so that a message would appear properly regardless of whether or not the sender and the receiver had the same screen parameters. At some point in the design of their mail system, the folks at Xerox addressed the issue of export to alien networks, one of which is the Arpanet. They seem to have decided (correctly) that the transport mechanism (grapevine) should not mess with the contents of a mail message, but only move it around. If the requirements of an alien network were different, it was the responsibility of either the portal program (at the Xerox/Alien boundary) or of the user program (at the sender's fingertips) to meet those requirements. I am a little fuzzy here, but I believe that what happened next was that an engineering decision was made to put the formatting code in the user program (Laurel) and not in the portal program. I don't know the thinking behind this step, and from my vantage point it seems to have been the wrong place to put it. Nevertheless, Laurel was modified so that when it was sending a message to an Arpanet recipient, it carefully formatted the message to conform to Arpanet format standards. This was all happening at the time that RFC733 was being settled, and there was a certain amount of disagreement as to just what those standards were, but because Laurel was developed in the research community there and the Arpanet was used to communicate with other researchers, it happened. The Xerox mail system was such a success that it migrated out of the computer research lab and into all corners of the company. 
The Xerox internet connects all kinds of interesting places, thousands of miles apart, and they all exchange mail uniformly, without any need to know the physical or logical location of the recipients. In particular, it migrated into the development branches of the company, who had computers other than the Alto, and who therefore could not run Laurel. Lacking an Alto on which to run Laurel, somebody in a non-research division wrote a different program, which ran on his development computer, that handled mail and interfaced to Grapevine. Since I have never seen this different program documented in a public place, I shouldn't say more about it, but let me say that the person who wrote it had no interest in communicating with the Arpanet, and wrote a (fairly simplistic) program that would let people in the development branch send mail to each other. He failed to include any code for compatibility with alien networks, but the thing works just fine for sending mail inside Xerox, which is what he used it for. As more people got these non-alto computers, they began using this alternative mail program. Many of them are sufficiently removed from the Arpanet research community that they probably have no idea that their mail escaping to the Arpanet is violating its standards. There is certainly no sneering at the standards, just ignorance of them, and since only one message in many many thousand is an Arpanet message, this does not appear to the folks who maintain that program (if anybody maintains it) to be a big problem. I consider this kind of situation to be an inevitable social consequence of a fully distributed computing environment. My own solution to it has been to add some simple code to my mail-reading program to wrap lines for me as it displays the text on my screen. It leaves the file intact, in case I need to preserve the original line breaks for some reason. 
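A reader-side wrapper of the kind described in the last paragraph might look roughly like the following Python sketch. It is not the code from the message, which is not shown; the function name display_lines and the default width of 72 are assumptions. It uses the standard textwrap module, folds only lines that overflow the screen, and returns a new list so that the stored message keeps its original line breaks.

```python
import textwrap


def display_lines(body, width=72):
    """Return the lines to show on screen; `body` itself is left untouched.

    Hypothetical helper: lines that already fit pass through unchanged,
    so short lines, tables and blank lines keep their original layout.
    """
    shown = []
    for line in body.split("\n"):
        if len(line) <= width:
            shown.append(line)
        else:
            shown.extend(textwrap.wrap(line, width=width))
    return shown


message = "short line\n" + ("word " * 30).strip()
for screen_line in display_lines(message, width=40):
    print(screen_line)
```

Because the folding happens only at display time, the same stored message can be shown again at a different width without any loss.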
Brian Reid Stanford  Date: 27 Aug 1982 1522-PDT Sender: CERF at USC-ISI Subject: Re: Effective Communication From: CERF at USC-ISI To: POSTEL at USC-ISIF Cc: MsgGroup at BRL, Header-People at MIT-MC, Cc: Human-Nets at SU-SCORE Message-ID: <[USC-ISI]27-Aug-82 15:22:40.CERF> In-Reply-To: Your message of 27 Aug 1982 1340-PDT Jon, Basically I am in agreement with you except for one point. It appears that, as personal computers with bitmap displays become more prevalent, and along with them, the regular use of windows, that smaller width display formats may become somewhat more common. Should this be the case, one would typically expect the clever window system to aid in the display of information not specifically formatted for that purpose - for instance by automatically wrapping lines as needed. The 40 char displays (such as the awful APPLE display I occasionally use at home) usually automatically wrap. Most other hand-held terminals I have seen, with narrow displays (e.g. 16X32 on a TV screen) also do this. Consequently, it would appear that, no matter how carefully one has formatted a message, it may have to be displayed on a device not quite suitable, but that receiver needs to cope somehow. Perhaps someday we will be able to carry along some instructions or assumptions about the output display capability so that the receiver can do more than just wraparound as needed. In the meantime, I think I agree that some "typical" assumptions like 72 char wide or something will usually result in fairly regular displays. The deliberate avoidance of any CR's at all can only tend to irritate those receivers with limited capability in their mail reading programs. However, I am not sure I would insist that all mail be formatted in this fashion - depends a good deal on who is likely to receive it. 
Vint  Date: 27 Aug 1982 16:11:20-PDT From: mo at LBL-UNIX (Mike O'Dell [system]) To: CERF at USC-ISI cc: MsgGroup at BRL, Header-People at MIT-MC, Human-Nets at SU-SCORE, POSTEL at USC-ISIF Subject: Re: Effective Communication In-reply-to: Your message of 27 Aug 1982 1553-PDT (Friday). <[USC-ISI]27-Aug-82 15:22:40.CERF> To really do this "right" in the long run requires sending "structured documents" which record the original semantics intended. This could make the interchange format look something akin to Scribe input. While this is indeed most general and offers the greatest freedom in tailoring the presentation WITHOUT sacrificing the intended semantics, it means you probably couldn't read mail (with the intended typography) on a Fuzzball. There has to be some compromise between the power you need on occasion and the functionality you use routinely. Note that such a compromise might be accomplished by simply "typing" the documents as to the "presentation layer functionality" needed to display it. Then if readers-composers offered options as to whether to format-on-output, people could be reasonably happy. As for limits on line length, any number you pick is wrong, or will be if you wait long enough. It is just that some numbers are wrong sooner than others. -Mike  Date: Friday, 27 August 1982 16:40-PDT From: Jonathan Alan Solomon To: Header-People at MIT-MC, Msggroup at BRL, Human-Nets at Rutgers Address: 3737 South Hoover Street Room PHE 204 Los Angeles, California 90089-0273 Phone: (213) 202-1793 Subject: Effective Communication I'm a bit confused, I thought that the limit of 1000 characters was so that the mail transportation implementation could, if need be, impose a maximum limit. Presumably the """"best"""" implementation is one which negotiates the max limit, if any, in the contact. My receiver could TELL your sender that we only want 72 character lines. 
Your sender could either chop directly along those lines (hoping we patch all of it back together for the user), or do the form of autofilling some of us get in our mail senders. Anyway - where does restriction actually happen? At the sender? Receiver? The processing agents? Cheers, --JSol  Date: 27 August 1982 21:54-EDT From: Gail Zacharias Subject: Xerox Woes -- Long lines with out returns in them. To: Geoff at SRI-CSL, Admin.MDP at SU-SCORE cc: HEADER-PEOPLE at MIT-MC, Human-Nets at SU-SCORE, MsgGroup at BRL It seems to me that any half-decent mail reader has to have the capability to reformat a message at viewing time, if for no other reason than the fact that tastes vary, even if everybody used same-width terminal. Given that, totally unformatted messages are no worse than messages formatted in a way you dislike, and those will always exist. With regard to MDP's policy, putting a left margin in your message with the express purpose of foiling reformatting by the receiver strikes me as infinitely more anti-social than not formatting it at all. If I ask for reformatting it's because I want it. If I ask for it to happen automatically, it's because I want it to happen automatically, your appreciation of your own formatting notwithstanding. Anyhow, one way to deal with this problem would be a header field, e.g. Format: Rigid (tables, code -- do not touch, display in fixed font) or Format: Free-form (Xerox style, needs formatting) or Format: Typical (usual case, doesn't need reformatting, but ok to do so) or Format: Scribe (contains scribe formatting directives (to be discouraged....)) etc ....  Date: 27 Aug 1982 2203-PDT Sender: STEF at DARCOM-KA Subject: Re: Xerox Woes -- Long lines with out returns in them. From: STEF at DARCOM-KA To: GZ at MIT-MC Cc: MsgGroup at BRL, HEADER-PEOPLE at MIT-MC Message-ID: <[DARCOM-KA]27-Aug-82 22:03:16.STEF> In-Reply-To: Your message of 27 August 1982 21:54-EDT We should separate normative concepts from current operating problems. 
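As a rough illustration of how a mail reader might act on the Format: field proposed in Gail Zacharias's message, here is a minimal Python sketch. The field exists only as a suggestion in this thread, so the function name render, the dictionary representation of the headers, and the choice to treat unknown values like "Typical" are all assumptions.

```python
import textwrap


def render(headers, body, width=72):
    """Choose a display strategy from a hypothetical Format: header.

    "Free-form" bodies are filled one paragraph at a time; "Rigid",
    "Typical" and unrecognized values are shown exactly as received.
    """
    fmt = headers.get("Format", "Typical").strip().lower()
    if fmt == "free-form":
        paragraphs = body.split("\n\n")
        return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)
    return body


table = "host   users\nmaxc   120"
prose = ("word " * 30).strip()

print(render({"Format": "Rigid"}, table, width=40))
print(render({"Format": "Free-form"}, prose, width=40))
```

The value of such a field is that the sender states once which text may be refilled, and the reader no longer has to guess from the line breaks.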
I see no arguments against the idea that someday we will have messages that carry their format specifications with them, and I expect that within a very few years the great proportion of User Agents will be able to reasonably process such messages. But, we ain't there yet with our current stock of hardware and software, and I see little profit in acting as though that grand day is here now. So, we have the basis for two discussions here: 1. What should it be like when it is the way we would rather have it be? 2. What should we do till the baby comes? Hopefully, we should not pretend the baby is here before it is. Best - Stef  Date: 28 Aug 1982 0151-EDT From: Robert W. Kerns Subject: 1000 character limit To: JSol at USC-ECLC, Header-People at MIT-MC, Msggroup at BRL, Human-Nets at RUTGERS cc: RWK at SCRC-TENEX In-Reply-To: Your message of 27-Aug-82 1943-EDT Actually, the 1000 character limit can be necessary for VMS. Records (lines) are written in a single I/O call, and are limited in size to the program's quota for system buffer space. It's possible to get around this, especially in VMS 3.0, but not with a "normal" text file. I expect the 1000 character limit is imposed for situations like this, rather than having anything to do with presentation format. Anyway, I think this discussion is silly, particularly so since it is taking place on three mailing lists, with largely overlapping readership. If someone were to propose an experimental protocol with imbedded formatting, that might be interesting. The rest of the world can communicate with the tools they have now, and think about the future. But flames about how what we're doing now is not the wave of the future are simply too obvious to be worth sending to 3 large-distribution mailing lists. -------  Date: 28 Aug 1982 0701-PDT From: Mike Peeler Subject: Re: Xerox Woes -- Long lines with out returns in them. 
To: GZ at MIT-MC cc: Geoff at SRI-CSL, HEADER-PEOPLE at MIT-MC, Human-Nets at SU-SCORE, MsgGroup at BRL In-Reply-To: Your message of 27-Aug-82 1854-PDT Gail, I agree with you about the capabilities the mail-readers of the future ought to have. As Stef (Thank you, Stef) points out, we have to separate the normative concepts from the current operating problems. In principle, we agree, you and I. In practice, I chose to use a left margin to deal with a specific problem. Mail-readers which fill text only for the purpose of limiting its right margin are doing the wrong thing to mine, since I intentionally use a short right margin. If my format bothers your eyes, I happen to know it would only take you a few keystrokes to remove the margin and reformat this message. Not formatting the text at all is, given the current state of the world, far more anti-social. Most people do not have the kind of mail-reader that will fool around with the body of a message. This majority will have to live with the message in its original presentation. For that matter, what is wrong with my wanting my work presented in a particular way? Would anyone consider an artist anti-social if he took measures to make sure a museum displayed his painting in his particular notion of the proper lighting? Regards, Mike -------  Date: 28 August 1982 11:50 edt From: Barry Margolin at MIT-MULTICS Subject: Re: Xerox Woes -- Long lines with out returns in them. Sender: Margolin.Multics at MIT-MULTICS To: Mike Peeler cc: GZ at MIT-MC, Geoff at SRI-CSL, HEADER-PEOPLE at MIT-MC, Human-Nets at SU-SCORE, MsgGroup at BRL In-Reply-To: Message of 28 August 1982 10:01 edt from Mike Peeler I doubt that most mail reading programs format the text automatically, although many will do so on demand. This makes Xerox's message just more anti-social, because I will most likely have to read it twice - once to find out that it is not formatted, then with it reformatted. 
I will then have to hope that my reformatting did what was intended, not screwing up any tables or diagrams. I liked someone's idea about the "Format:" header field, but for now we should restrict ourselves to a few simple values like "Formatted", "Unfilled", etc., and leave things like "Scribe" or "RFC-1024" (the DoD standard text formatter, to be designed in five years) for the future. barmar  Date: 28 Aug 1982 1411-PDT From: Brian Harvey Subject: who should break long lines To: header-people at MIT-MC Let me tell you, when you spend a couple of weeks reading your mail at 1200 baud, you develop a new conception of what issues are really important and which aren't. I recommend it to all mailing list participants. P.S. -- Postel is right, of course, as usual.  Date: 27-Aug-82 19:33:11-PDT (Fri) From: ucbvax: (Eric Allman) Subject: Re: Header munging & lousy attitude Message-Id: <8207280233.19879.ARPAVAX@Berkeley> Received: by UCBARPA (3.177 [8/27/82]) id a19879; 27-Aug-82 19:33:17-PDT (Fri) Received: from UCBARPA by UCB-UCBVAX (3.177 [8/27/82]) id a00338; 28-Aug-82 13:07:58-PDT (Sat) Phone: (415) 548-3211 To: header-people@mit-mc Cc: ucbvax. (David Gewirtz) Subject: Re: Header munging Message-Id: <8207290405.28964.ARPAVAX@Berkeley> Received: by UCBARPA (3.177 [8/27/82]) id a28964; 28-Aug-82 21:05:09-PDT (Sat) Received: from UCBARPA by UCB-UCBVAX (3.177 [8/27/82]) id a02754; 29-Aug-82 00:53:00-PDT (Sun) To: ARPAVAX.dag@UCB-C70, Admin.MRC@SU-SCORE Cc: KING@CMU-20C, header-people@MIT-MC I have forwarded your message to those who may be able to make the changes you wish. I am sorry if I have caused any confusion, but (sigh...) I am only a user, and just moved to this machine. Thanks for your clear and civil message. 
David  Date: 28-Aug-82 21:19:31-PDT (Sat) From: ucbvax: (David Gewirtz) Subject: Re: Header munging & lousy attitude Message-Id: <8207290419.29076.ARPAVAX@Berkeley> Received: by UCBARPA (3.177 [8/27/82]) id a29076; 28-Aug-82 21:19:36-PDT (Sat) Received: from UCBARPA by UCB-UCBVAX (3.177 [8/27/82]) id a03029; 29-Aug-82 01:04:49-PDT (Sun) To: ucbvax. Subject: Xerox Woes -- Long lines with out returns in them. To: Geoff at SRI-CSL cc: HEADER-PEOPLE at MIT-MC, MsgGroup at BRL I don't think Xerox is winning, but thought I'd answer the question you raised: Is there anyone out there with a terminal or a system that can and does break long lines as the Xerox argument claims we can and do? The operating system here wrapped the lines with an overflow indication, so I had no trouble reading the message. So I didn't need to, but just for fun, I asked my mail reader to refill the message text as well. It didn't do this automatically (I'd consider that a bug), but it was a single simple command.  Date: 29 Aug 1982 15:20 PDT From: Taft at PARC-MAXC Subject: Line folding To: (MsgGroup@MIT-ML,) Header-People@MIT-MC It is a curious coincidence that this discussion erupted just one day after many of the responsible people within Xerox had a meeting to decide what we should do about RFC 822. Perhaps I can put this discussion to rest with a little bit of authoritative information about our position and our plans. 1. Commentary on the discussion so far The "Xerox position" has been reasonably well articulated in several messages, particularly the ones from Hartwell, Everhart, and Peeler; so I do not intend to repeat most of those arguments here. And there are certainly reasonable technical and pragmatic grounds for disagreement with the "Xerox position". On the other hand, insulting and inflammatory remarks about motives and attitudes serve no constructive purpose; they are inappropriate in a large public forum such as this one; and they do not deserve the dignity of a reply. 2. 
Elaboration on the "Xerox position" Our practice of making the recipient's mail software responsible for formatting messages for the recipient's terminal is a very pragmatic one. There is no "standard" line width anywhere in the Xerox Internet; variable-width fonts and variable-size display windows make this a meaningless concept. Though 80 x 24 display terminals may be more-or-less "standard" in the ARPA Internet right now, I am certain this will not be true for very much longer. The proposal, "Let the sender break lines as a courtesy to unsophisticated recipients, and smart recipients reformat to suit their own needs" breaks down because there is no reliable way for the recipient, no matter how smart, to distinguish CRs put in for line folding from CRs that truly represent breaks in the text. Making this distinction is possible only if there is either (a) a syntactic distinction between the two uses (i.e., two kinds of CRs), or (b) higher-level formatting information (a la Scribe, Tex, Pub, etc.) indicating which text may be reformatted and which is to be taken literally. Neither approach seems likely to be adopted by the ARPA Internet community any time soon. The proposal, "Let the sender and recipient negotiate over line length" doesn't work since (a) the sender may be sending to a distribution list and not know the identity of the ultimate recipients; (b) the recipient may wish to display the same message in different-size windows; and (c) in general, the sender and recipient don't communicate directly, but only via one or more store-and-forward intermediate agents. Breaking unfolded paragraphs into lines appropriate for the recipient's terminal is such a trivial procedure for the recipient's mail software to do that it seems inconceivable that even the tiniest microcomputer should be incapable of doing so.
And indeed, most of the messages so far suggest that the problem lies not with low-powered computers but with crufty old big-computer software, such as MSG, which nobody is maintaining any more. In an evolving network environment, I consider it an untenable position that a standard (whether official or de-facto) should have to cater to software that is no longer being maintained. If nobody is prepared to devote resources to maintenance, the software should be allowed to die. 3. History and plans Until now, we have permitted unfolded text to escape into the Arpanet simply because the mail forwarding software on PARC-MAXC (as well as all our other transport software) does not make ANY modifications to message content. This reflects our belief in the principle of rigid separation between envelope and content information. During the discussions that led to RFC 822, I (and several other people) argued strongly for having this principle written into the standard, along with a statement that line folding be the responsibility of the recipient. However, because of the short time available for issuing and implementing RFC 822, we lost on this issue. (As an aside, though, I should mention that RFC 822 still does NOT prescribe that the sender do line folding, except for one non-binding recommendation that HEADERS be folded to "65 or 72 characters".) Therefore, it is our intention to conform to RFC 822 and to de-facto usage by implementing a mail translation gateway that WILL translate names and WILL do line folding when forwarding mail between the Xerox and ARPA Internets. Though this violates the principles by which we believe message systems should work, we recognize that our adherence to these principles causes inconvenience to some ARPA Internet users. To those unsuspecting users whose mail software chokes on long lines from Xerox senders, we apologize.
To implementors of such software, we can only suggest that well-engineered mail-handling software should tolerate arbitrary message content, even if it can't handle all of it in an ideal fashion. If there are any chinks in the armor, the software will eventually break, and unsuspecting users will be inconvenienced. (Yes, there are messages that break Laurel!) It is likely that the mail translation gateway software will be done well before the January 1 scheduled cutover to RFC 822. Ed Taft Xerox PARC  Date: 29 August 1982 1919-EDT (Sunday) From: David Lamb at CMU-10A To: header-people at MIT-MC, msggroup at brl Subject: Re: Xerox Woes -- Long lines with out returns in them. Sender: David.Lamb at CMU-10A Reply-To: Rdmail at CMU-10A In-Reply-To: Earl A. Killian@MIT-MC's message of 27 Aug 82 01:14-EST Message-Id: <29Aug82 191931 RD00@CMU-10A> Shortly after this controversy about breaking long lines started, I added code to RdMail to break long lines. Whether this is done is under the control of the user, but is the default unless the user changes the templates that control printing on the terminal. This only affects output to the terminal (unless the user changes the template that controls mail forwarding). It took a total of about 7 hours, most of which was tracking down a really obscure bug that has nothing to do with breaking lines, but that I happened to tickle with this change. Thus there is at least one non-Xerox system that now breaks lines as Xerox suggested. I've had no user feedback yet, since only about 30 people use the experimental version.  Date: Sunday, 29 Aug 1982 18:40-PDT To: Taft at PARC-MAXC cc: MsgGroup at MIT-ML, Header-people at MIT-MC Subject: Re: Line folding In-reply-to: Your message of 29 Aug 1982 15:20 PDT. From: gaines at RAND-UNIX I don't think the numbers support your view that the 24x80 display will become less prevalent soon.
Enormous numbers of these are being sold, and the majority of the large volume (or expected large volume) workstations also have adopted this format. We all look forward to the day when a first-class bit map display costs no more than the equivalent, at that time, of today's $700 Televideo, but that won't be soon. On the matter of stuffing CRs: For most messages the formatting within paragraphs doesn't matter. For these, an appropriate header entry might be included. Then the recipient (program or person) would know that the CRs within paragraphs were inserted for the convenience of display on common 80 character terminals with unsophisticated software support, but that they weren't meaningful. Without such an entry, all formatting could be considered meaningful, and it would be up to the recipient to figure out how to deal with the message if the sender's formatting caused him problems.  Date: 29 Aug 1982 1907-PDT From: Brian Harvey Subject: long lines To: header-people at MIT-MC Let's don't invent a new header field for whether the message text is formatted or not. In order for that to do any good, everyone has to go rewrite their mail readers to understand it, a cure much worse than the disease. I suggest that mail system maintainers with a lot of energy to put into this small problem simply adopt the heuristic that if a line in a message is longer than the reader's screen is wide, you'd better reformat it one way or another. If you want to be extra hyper clever, if the lines happen to be longer than 80 but shorter than 132 and the terminal happens to be a VT100 you can... But we lazy bums who hardly ever have trouble reading people's messages anyway can ignore the problem. Whereas if you invent a new concept in headers, everyone will start playing with it and things will only get worse. When we design the Information System in the Sky, with a network standard representation for italics and all that, then yes.  
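The heuristic Brian Harvey suggests above (reformat only when some line is wider than the reader's screen) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not anything from the thread: the function names `needs_reflow` and `reflow_if_needed` and the 80-column default are hypothetical, and the standard-library `textwrap` module stands in for whatever line breaker a 1982 mail reader would have used.

```python
import textwrap


def needs_reflow(body: str, screen_width: int = 80) -> bool:
    """True if any line of the message is wider than the screen."""
    return any(len(line) > screen_width for line in body.splitlines())


def reflow_if_needed(body: str, screen_width: int = 80) -> str:
    """Re-break only the over-long lines; leave everything else alone.

    Lines that already fit are passed through untouched, so tables and
    hand-formatted text narrower than the screen keep their layout.
    """
    if not needs_reflow(body, screen_width):
        return body
    out = []
    for line in body.splitlines():
        if len(line) > screen_width:
            # Break at word boundaries; keep an empty entry for blank input.
            out.extend(textwrap.wrap(line, width=screen_width) or [""])
        else:
            out.append(line)
    return "\n".join(out)
```

A message whose lines all fit is returned unchanged, which matches the point that readers who "hardly ever have trouble" need do nothing. Over-long tables would still be damaged by this approach, which is the objection other participants raise elsewhere in the thread.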
Date: 30 August 1982 0106-EDT From: Rudy.Nedved at CMU-10A To: David.Lamb at CMU-10A Subject: Re: Xerox Woes -- Long lines with out returns in them. CC: Header-People at MIT-MC In-Reply-To: <29Aug82 191931 RD00@CMU-10A> Message-Id: <30Aug82 010652 EN0C@CMU-10A> Ah...David if you only knew the history of the rest of the CMU TOPS-10 mail software, you would know that Craig Everhart put in code to FTPSrv to break long lines at word boundaries. I don't think anyone has ever noticed. -Rudy  Date: 30 August 1982 0131-EDT From: Rudy.Nedved at CMU-10A To: Header-people at MIT-MC Subject: Re: Header munging & lousy attitude In-Reply-To: <8207280233.19879.ARPAVAX@Berkeley> Message-Id: <30Aug82 013140 EN0C@CMU-10A> For the record: Dave King is a system analyst for CMU's Computation Center which is the "general and administrative" computing facility for the university. This is a separate entity from the "research" computing facilities for Computer Science department and the Robotics Institute (which have overlapping personnel). CMU-20C is a CMU CS machine which he has a guest account on.
Sincerely, Rudy Nedved Research Systems Programmer Computer Science Department Carnegie-Mellon University  Date: 30 Aug 1982 1204-PDT Sender: POSTEL at USC-ISIF Subject: The 1000 Character Line From: POSTEL at USC-ISIF To: Header-People at MIT-MC Message-ID: <[USC-ISIF]30-Aug-82 12:04:44.POSTEL> The SMTP statement about the maximum line length needs to be interpreted in light of another very firm statement in the SMTP document. TO THE MAXIMUM EXTENT POSSIBLE IMPLEMENTATION TECHNIQUES WHICH IMPOSE NO LIMITS ON THE LENGTH OF THESE OBJECTS SHOULD BE USED. So if one is implementing an SMTP program and can't think of any way to do it without having some maximum line length, then the maximum had better be at least 1000 characters. SMTP receivers must handle at least 1000 character lines, SMTP senders can ensure that their messages will get through by limiting themselves to sending less than 1000 character lines (even if longer lines would work in some cases). --jon.  Date: Monday, 30 August 1982 14:44-PDT From: Jonathan Alan Solomon To: Mike Peeler Cc: Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, Human-Nets at SU-SCORE, MsgGroup at BRL Address: 3737 South Hoover Street Room PHE 204 Los Angeles, California 90089-0273 Phone: (213) 202-1793 Subject: Xerox Woes -- Long lines with out returns in them. For that matter, what is wrong with my wanting my work presented in a particular way? Would anyone consider an artist anti-social if he took measures to make sure a museum displayed his painting in his particular notion of the proper lighting? We've strayed off the discussion (as I see it), which was that Xerox wanted to send streams of data as mail without carriage returns. However, to answer your question, Yes. I think that I have a right to look at your work of art in the light (or point of view) that I think is best. That's my opinion. Electronic mail is defined (in the ARPA sense) to be several lines of text separated by newlines.
Xerox is ignoring this definition of Internet electronic mail if they insist on sending messages without carriage returns. In fact, it would imply that at least one carriage return is required to consider the text "mail", in the current sense. (Note: I am not talking about multi media mail). --JSol  Sender: Karlton at PARC-MAXC Date: 30-Aug-82 20:06:42 PDT (Monday) From: Phil Karlton Subject: Re: Effective Communication In-reply-to: POSTEL's message of 27 Aug 1982 1340-PDT To: MsgGroup at BRL, Header-People at MIT-MC Reply-To: Karlton.pa at PARC-MAXC As the individual responsible for rejecting the request for automatic insertion of NewLines into outgoing messages, I believe that I should explain in slightly more detail why. The mail composer for which I have responsibility is a very simple program. It represents about three days work on my part. (I owe most of the ease of producing the tool to the fact that a large number of software packages (communication with the mail servers, mail parsing, display, editing) are available.) It makes no edits of the messages other than to prepend a Date: field and a From: field if necessary. As Jon Postel said, "When the sender or speaker cares about effectively communicating he or she adapts to the capabilities of the receiver or listener." Since the overwhelming majority of the messages that get sent using this tool never get near the Arpanet, I felt (and still do) that it was up to the sender to make sure that the recipient would be able to understand the message. It is not a difficult task (in our environment) to break lines just before a message is sent. In fact, that is what I intend to do to this message. Some felt that this last second editing was too onerous, so a package (Chop) was written (not by me) to do long line breaking and was made available to the general community as an "official hack". 
Users can load Chop into their environment, and, then, with a SINGLE button push, get their messages into the form that they want. What the original change request was complaining about was that even this was too much for a user to have to do. I disagree. I was also aware that the Xerox mail gateway was going to be changed "someday" so that it would do line folding. I had no desire to write software that would soon be obsolete and add no function (in the eyes of most of the users, including those signing my paychecks) to the tool I was charged with releasing and then maintaining. If I had been more discreet and friendly when rejecting the original change request, this flap would not have occurred. I thought the individual to whom I sent the reply, had known most of the above. I suppose I will have to start taking diplomacy lessons. There is one point of misinformation that I saw that I should correct. I was not ignorant of RFC733 when I wrote the tool in question; I wrote what was probably the very first 733 conforming mail editor: RdMail at CMU. (I still feel slightly guilty about the other fire I started by having RdMail allow names with spaces in them in mail headers.) I was just trying to get a simple tool going that would allow our community of users to get its job done. I was pleased that they (and I) would be able to communicate with Arpanet recipients, I just had no idea that it would be so awful to merely follow the standard. I have no desire to fan any more flames. If you feel the need to enlighten my attitude, please send me a personal message rather than trying to convince me in a public forum that I am a fool. I should also add that I am speaking for myself and not as an official representative of Xerox. PK  Date: 30 Aug 1982 2119-PDT Sender: GEOFF at SRI-CSL Subject: Re: Effective Communication From: the tty of Geoffrey S. 
Goodfellow Reply-To: Geoff at SRI-CSL To: Karlton.pa at PARC-MAXC Cc: MsgGroup at BRL, Header-People at MIT-MC Message-ID: <[SRI-CSL]30-Aug-82 21:19:22.GEOFF> In-Reply-To: Your message of 30-Aug-82 20:06:42 PDT (Monday) Although I think your view (and Postel's) on "Effective Communication" is amiable in principle, it is not so in actual practice. Two actual examples come to mind: First, at the time a message is composed, it may be addressed to or include a Xerox Internet Distribution List (such as Movie.PA^, for example), which may have on it one or more ARPANET recipients. The user, composing the message, doesn't have any idea or notion about the ARPANET folk on the list, and that Chop should be invoked -- the result being, a non-chopped message leaks out to the ARPANET recipients. And secondly, for the same reason that people, by the virtue of human nature, accidentally reply to all recipients of a message when they only meant to have their reply go only to the sender. Or even worse, as some of us have admitted in times past, replied to all recipients of a message they were a BCC recipient on! My point is simple: Line-folding (aka Chopping) should be automatic (for those recipients that expect folded (Chopped) lines). It is altogether too easy for users to forget or overlook or not even KNOW, that they should invoke the "chop hack" because a message includes (or might include) some ARPANET recipients. I think the solution Ed Taft intends to implement in the PARC-MAXC gateway is an excellent one: The Xerox Internet folk get their non-folded lines, which your terminals will handle for you in the way you are USED TO, and we ARPANET folk will get our folded (chopped) lines, which we are USED TO. The best of both worlds, for each world's users.  Date: 31 Aug 1982 0210-PDT From: Mike Peeler Subject: Re: Xerox Woes -- Long lines with out returns in them.
To: JSol at USC-ECLC cc: Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, Human-Nets at SU-SCORE, MsgGroup at BRL In-Reply-To: Your message of 30-Aug-82 1451-PDT JSol, You ignored or misunderstood the issue when you said Electronic mail is defined...to be several lines of text separated by newlines... Xerox is ignoring this definition of Internet electronic mail if they insist on sending messages without carriage returns. They are not sending messages without newlines. Only do they omit SPURIOUS newlines. This means they send CRLF if and only if the sender claims the line MUST break there. The receiver, which could conceivably know something about the device being used for output, can then act accordingly, one possibility being to treat whitespace as optional line breaks and explicit newlines as mandatory. Or it can do whatever else it wants to do. Their policy may be inconvenient for those whose mail-readers cannot handle it, but let's get one thing straight: it does ADD INFORMATION by eliminating "noise". I realize that subsequent messages from Marvin Solomon and Rick.Gumpertz@Cmu-10a to MsgGroup have covered this topic in greater depth, but I want to make it clear that, while Xerox may indeed be guilty of obstinacy, it is not by any means for no reason. Regards, Mike -------  Date: 31 Aug 1982 0728-PDT Sender: BILLW at SRI-KL Subject: As things go from bad to worse... From: William "Chops" Westfield To: header-people at MC Message-ID: <[SRI-KL]31-Aug-82 07:28:22.BILLW> Will someone please send these people a copy of RFC733 (or 8xx...) ! (If only you could figure out who they were...) 
((at least someone is helping prevent bad headers from getting out onto the ARPANet, sort of, I guess...)) Mail-From: ARPAnet host BRL rcvd at 31-Aug-82 0323-PDT Date: 29-Aug-82 22:08:24-PDT (Sun) From: (BAD ADDRESS)ucbvax (BAD ADDRESS), ARPAVAX Received: by UCBARPA (3.177 [8/27/82]) id a13794; 29-Aug-82 22:08:26-PDT (Sun) from UCBARPA by UCB-UCBVAX (3.177 [8/27/82]) id a05931; 29-Aug-82 22:14:52-PDT (Sun) Via: Ucb-C70; 30 Aug 82 1:17-EDT Brl; 30 Aug 82 11:54-EDT Brl-Bmd; 30 Aug 82 12:44-EDT  Date: Tuesday, 31 August 1982 11:49-PDT From: Jonathan Alan Solomon To: Mike Peeler Cc: Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, Human-Nets at SU-SCORE, MsgGroup at BRL Address: 3737 South Hoover Street Room PHE 204 Los Angeles, California 90089-0273 Phone: (213) 202-1793 Subject: Xerox Woes -- Long lines with out returns in them. After reading Jon Postel's comments, I am inclined to retract my statement that Xerox would be violating protocol by sending mail which has infinite line length. I don't see within the SMTP spec any indication that the line length of 1000 characters was meant to be enforced unless implementations could not be "infinite". It was my misunderstanding that it was to be a hard limit. Also, in RFC822 (the son of RFC733), line folding to conform to the 65 to 72 characters "should" be done in the header, but is also not enforced by the standard. Nothing is said about the message text. However, I do find it annoying that I have seen quite a bit of Xerox mail which does NOT have *any* newlines in it (maybe I did not make myself clear enough before). I am complaining that Xerox software gives the impression to the end user that newlines exist when, in the text of the message, the newlines disappear. I have received messages from Xerox-land which have been without any newlines, while the user who composed them assured me that they saw separate lines when they sent the text. Somehow this seems unreasonable.
Some questions for Xerox: Does the machine allow users to insert newlines? If not sending them is the default case, then is the fact that newlines will be removed by the program well documented, as well as a method for turning off this behavior? Is it company wide policy (or some such nonsense) that the software should be UNABLE to add in newlines even if the sender wants them there (This would seem to violate Mike Peeler's statement of composer rights to formatting)? As a mailing list moderator, I get a lot of feedback from Xerox employees who may not be sophisticated enough to understand how to specifically add newlines; people keep telling me that they thought the newlines were there. *THIS* is what I am complaining about; note the spec doesn't seem to specifically prohibit Xerox from doing it. Cheers, --JSol  Date: 31-Aug-82 15:35:03 PDT (Tuesday) From: holbrook.ES at PARC-MAXC Subject: Background and info on the Xerox environment In-reply-to: JSol's message of Tuesday, 31 August 1982 11:49-PDT To: Jonathan Alan Solomon cc: Mike Peeler , Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, MsgGroup at BRL Various people have shed light on the Xerox side of this uproar: Brian Reid provided a bit of history on early roots of the problem in the Alto world, Phil Karlton gave us his story from the point of view of the implementor of the mail tools in question, and Ed Taft has now made the whole argument moot by telling us that Xerox has already decided to install line chopping software at the MAXC gateway. But Jon Solomon's message pointed out to me that there is still some misunderstanding floating around. I'll try to fill in some more of the gaps. Although Brian Reid was reluctant to say anything about the offending mail software, a paper has been submitted for publication that discusses the Development Environment [1], so I think I can safely shed some light without hurting anyone's eyes.
Let me pause to give the usual disclaimer: I do not speak as a representative of Xerox, but only as a (relatively new) employee. Brian pointed out that the old Alto/Laurel mail system was designed by researchers who were sensitive to the ARPANET gateway sitting in their lap. Some care was taken to be able to interface properly with the outside world, and the people using the system were more aware of the issues involved. Now the world has changed somewhat. Grapevine has opened up the world of mail to non-researchers in Xerox. Although there are still more Altos around than anything else, new hardware/software combinations have come into use. One such combination is the Dandelion and the Mesa Development environment. The Dandelion (or DLion as it's known here) is the machine used in the 8010 Star workstation. As someone else from Xerox noted, the Star product is not the problem: it doesn't communicate with the ARPANET or (for the moment) Grapevine. The software called into question runs in the Mesa Development Environment, which includes all of the tools used to develop Star and the other NS8000 products. A little history: the Mesa Development Environment was born out of the experience gained from the Alto and its tools. Although many of the tools in the Development Environment existed in the Alto world, the interaction between the Alto tools was clumsy. Each tool tended to have its own style of interaction; furthermore, when you were running one tool, you couldn't run anything else. The Mesa Development Environment sought to remedy those problems. Building on the style of interaction developed in Smalltalk, InterLisp-D, and other research efforts, the Mesa Environment was developed using the paradigm of multiple-overlapping windows. In the Development environment, each 'tool' (for example, the compiler, mail user agent, file transfer utility, and so on) lives in a separate window. Each window can be grown, shrunk, or stretched to fit the user's desires.
Unlike the Alto world, the Mesa environment tends not to use many different proportionally spaced fonts; typically, all tools use a single fixed-space font. The user can specify the default font he wants to use; individual tools may also choose to use different fonts. A mail tool exists; it is called Hardy. Hardy is loosely based on Laurel; it lacks many of Laurel's fancier features, but because of the power of the environment in which it is used (multiple windows, large screen, mouse), it is very usable. Hardy, however, is only a mail reading tool; to send mail, it communicates with a Mail Send Tool. The Mail Send Tool lives in a separate window, and multiple Mail Send Tools may be simultaneously in existence. I could, for example, move my mouse back into the Hardy window to read mail that I just received, and invoke a new Mail Send Tool to reply to a message without disturbing the message I was previously composing. Now we get to the heart of the problem. Since windows (including the Mail Send Tool) can be stretched in any manner desired, the software that underlies all windows that contain editable text automatically breaks lines that are wider than the current window. All breakage is done at word boundaries. This is similar to Emacs auto-fill mode, in that the cursor goes to the next 'line' during typein, but this word break mode is also active whenever text is displayed in a window. Thus, if I change the shape of the window that I'm reading my mail in, the text changes shape to fill out the window. When I'm reading mail, I often expand the Hardy tool window to fill the entire screen. (This can be done with one mouse click). If the text contains no extraneous newlines, I can get lots of information on my screen at once. Here's where the confusion comes in for users: the text may be split over many 'logical window lines', but may in fact be a single unbroken physical line. There is no 'company policy' against newlines.
The user is free to enter newline characters wherever he wants, but no one ever does unless he has to. But that's not the whole story. Phil Karlton noted that a hack exists that allows the user to chop up his message prior to sending with a single keypress. So why does mail come out unchopped? We could plead ignorance of how it looks to those out there, but that doesn't apply to everyone. I started at Xerox in mid July; prior to that, I was at UC Irvine living on 80 column CRTs. I correspond regularly with people in the net, yet people still receive unchopped messages from me on occasion. I confess: I am guilty. My problem is partially one of conditioning. This message notwithstanding, I typically don't send very long messages. I tend to quickly dash things off - and for me, the sooner I've got a message off the better. As soon as I type my name at the bottom of a message, I go into automatic mode. Typing my name at the end of a message and reaching up with the mouse and bugging 'Deliver' are one action. It takes a very conscious effort to break that routine. It's not natural. So, I will welcome the installation of line chopping software at the PARC gateway. Until then, I and others will occasionally be guilty of net pollution. Please bear with us; if you find someone who doesn't seem to be aware of what they are doing, let them know. But don't pound on us; some of us are doing the best that we can. Paul Holbrook [1] Eric Harslem and LeRoy Nelson. "A retrospective on the development of Star". submitted to 6th Conference on Software Engineering, Tokyo, Japan, September 1982.  
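The distinction Paul Holbrook draws above, between "logical window lines" produced at display time and the physical lines actually stored in the message, can be sketched as follows. This is a rough model only: `display_lines` and `chop` are hypothetical names, the 72-column default is an assumption, and nothing here is taken from the real Mesa window software or from the Chop package, whose internals the thread does not describe.

```python
import textwrap


def display_lines(text: str, window_width: int) -> list[str]:
    """Return the 'logical window lines' shown for `text` at this width.

    The stored text is not modified: breaks are computed at word
    boundaries each time the window is drawn or resized.
    """
    shown = []
    for physical in text.split("\n"):
        # An empty physical line still occupies one window line.
        shown.extend(textwrap.wrap(physical, width=window_width) or [""])
    return shown


def chop(text: str, width: int = 72) -> str:
    """Turn the soft display breaks into real newlines before sending.

    A hypothetical stand-in for what a line-chopping step might do.
    """
    return "\n".join(display_lines(text, width))


if __name__ == "__main__":
    message = "one two three four five six seven eight nine ten"
    print(display_lines(message, 20))   # several window lines
    print(display_lines(message, 80))   # one window line
    print("\n" in message)              # the stored text has no newline
    print(repr(chop(message, 20)))      # now the breaks are physical
```

The sketch shows why a composer window that happens to be about 80 columns wide can look correctly folded while the outgoing text is still one unbroken physical line: only `chop` (or a gateway doing the same job) changes what is actually sent.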
Date: Tuesday, 31 August 1982 16:40-PDT From: Jonathan Alan Solomon To: holbrook.ES at PARC-MAXC Cc: Mike Peeler , Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, MsgGroup at BRL Address: 3737 South Hoover Street Room PHE 204 Los Angeles, California 90089-0273 Phone: (213) 202-1793 Subject: Background and info on the Xerox environment Paul, Thanks very much for your explanation, and thank you for pointing out what I was essentially trying to say. I'm sorry if you felt that I was pounding at you. The conditioning problem you describe is the heart of the problem with Internet mail. Everybody is used to a different set of "conditions". You went from the UNIX/TOPS-20 environment into the Xerox environment and have to "unlearn" what you have built up mostly out of habit. This is the complaint of many EMACS users trying to get used to (for example) TVEDIT (and contrawise too!) Also, everyone would like to implement their own version of the best, rosiest, and most advanced computer world. I, for example, consider the new From: field in RFC822 to be a step backwards in computer technology, equal in magnitude to Xerox Alto's having to conform to IBM 1130's! We have to all come to grips with the reality that IBM1130's do exist, and if we want to talk to them, we have to talk their way (Please nobody pounce on me saying that there aren't any IBM 1130's on the ARPANET, I meant it only as an example.) You have also demonstrated an awareness for the problem and a personal solution, Xerox has a different solution, which will involve insuring that even if users forget, the internet mail gateway software will fill the lines to something "reasonable". I think your solution, i.e. to change your habits, is more useful in the long run and will make you more conscious of the recipients of your message. I am more concerned with the non-researchers (particularly secretaries and administrators) who may have a hard enough time learning how to log onto the machine. 
The responsibility to insure that the text appears (Mike Peeler's statement) the way they want it, or (my statement) a way I can read it, is on the sender of the message, not on some arbitrary agent between the sender and the receiver. How 'bout a novice mode in this software which, when you mouse the "deliver" key, that it explains that the text is not formatted as it appears on the screen, it could further offer to format the text of the message to look exactly like what is on the screen, experts may not need this mode, so it should be switchable, but it might even be a useful reminder for you to use until you get sick of it (your memory might improve). You will be fulfilling what I consider a responsibility to your readers by insuring that mail is formatted so we can read it. I consider the "chopper" software to be a compromise, but the real problem is in design of the software and education of the users. Cheers, --JSol  Date: 31 August 1982 20:23 edt From: Barry Margolin at MIT-MULTICS Subject: Re: Background and info on the Xerox environment Sender: Margolin.Multics at MIT-MULTICS To: Jonathan Alan Solomon cc: holbrook.ES at PARC-MAXC, Mike Peeler , Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, MsgGroup at BRL In-Reply-To: Message of 31 August 1982 19:54 edt from Jonathan Alan Solomon While I do not believe in recipients formatting the message, I have to disagree with JSol that the Xerox users should learn to format their messages themselves. Within the Xerox world, sending the messages without extraneous newlines is the preferred method. Someone several days ago pointed out that messages can make their way outside Xerox-land without the sender's explicit knowledge, perhaps because an ARPAnet user was added to a mailing list. The sender cannot be expected to know of this, and cannot be expected to suddenly change habits. 
Since the preferred message formats differ between the two networks, it is the job of the gateway program to do the appropriate reformatting, however much we all hate the idea of automatic text munging. At least an agent that is on the Xerox network can be expected to have a better idea of the meaning of a message than an agent on the receiver's computer. I have a question for the Xerox people. We now understand the interface that the mail sending programs present to the users. Lines are automatically wrapped to fit on the screen, and the user may insert explicit newlines anywhere he or she wishes. How, on the other hand, does one insert an explicit "don't put a newline here", in order to have an explicit over-length line, or does the concept not exist on the Xerox machines? On most systems I have used, a text line generally maps into a terminal line, but if the text line overflows the terminal line we get a continuation indicator in the margin and the text continues on the next line (breaking exactly at the margin, rather than at a word). This is not very aesthetic, but at least it retains the structure that its creator intended, and indicates that the reader should send it to a printer or wider terminal in order to see it appropriately. Tables very often come this way, and it would often not be appropriate to break these at word boundaries.  Date: 1-Sep-82 9:53:18 PDT (Wednesday) From: holbrook.ES at PARC-MAXC Subject: Re: Background and info on the Xerox environment In-reply-to: Margolin.Multics's message of 31 August 1982 20:23 edt To: Barry Margolin at MIT-MULTICS cc: Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, MsgGroup at BRL Remember that unlike Emacs auto fill mode, there are no physical newlines in the text where the text is broken by the window display software. With regard to the question about an explicit "don't put a newline here" indicator, no such thing exists.
However, if a line is wider than your window but contains a newline at the end of it, you get the same effect that you do on many terminals when lines overflow 80 columns: part of the text gets put on the next line alone. To put it another way, If I write text broken up into short lines like this, and then display it in a narrow window, it comes out looking like this:

| If I write text broken up |
| into                      |
| short lines like this, and|
| then display              |
| it in a narrow window, it |
| comes out looking like    |
| this:                     |

(Folks with proportionally spaced fonts won't like me for that, but then they probably are familiar with what I'm trying to get across...) As you can see, it is clear that the text contains physical lines that are wider than the window. This is obvious to the user, so all he does is increase his window size. Of course, this is what gets people into trouble when they are composing messages without newlines. If the width of the window happens to be close to 80 characters wide, it can look like it is properly broken up when it isn't. This was the point of Jon's message. Regarding the PARC gateway, I would hope that there might be a way to send mail across without having it formatted. If I wanted to send a table suitable for printing on a 132 column printer, I would not want the gateway to mess that up. Perhaps a field in the header could inform the gateway that the sender has taken responsibility for seeing that the message is formatted properly. Paul  Date: 1 September 1982 14:34 edt From: Charles Hornig at MIT-MULTICS Subject: someone is munging headers... Sender: Hornig.Multics at MIT-MULTICS To: header-people at MIT-MC Message-ID: <820901183416.684987 at MIT-MULTICS> I just received a message with this header in my mailbox.
Return-Path: Date: 31 August 1982 20:23 edt From: Barry.Margolin at Mit-Multics Subject: Re: Background and info on the Xerox environment Sender: Margolin.Multics at Mit-Multics To: Jonathan Alan Solomon cc: holbrook.ES at Parc-Maxc, Mike Peeler , Geoff at Sri-Csl, GZ at Mit-Mc, HEADER-PEOPLE at Mit-Mc, MsgGroup at BRL In-Reply-To: Message of 31 August 1982 19:54 edt from Jonathan Alan Solomon Via: Mit-Multics; 31 Aug 82 20:32-EDT Via: Brl; 31 Aug 82 20:35-EDT Via: Brl-Bmd; 1 Sep 82 9:47-EDT Note in particular the "From: Barry.Margolin at Mit-Multics". I believe (based on my knowledge of the Multics mailer) that this line read "From: Barry Margolin at MIT-MULTICS" when the mail was sent. This is important because "Barry Margolin" is a valid address at MIT-Multics, while "Barry.Margolin" is not. Any ideas on who did it? P.S.: The "From" field on this message should read: From: Charles Hornig at MIT-Multics  Date: 1-Sep-82 12:26:36 PDT (Wednesday) From: holbrook.ES at PARC-MAXC Subject: Re: someone is munging headers... In-reply-to: Hornig.Multics's message of 1 September 1982 14:34 edt To: Charles Hornig at MIT-MULTICS cc: header-people at MIT-MC Well, this is very strange. I sent out a reply to Margolin's message, copied to Header-People and MsgGroup.
I got back a copy from Header-People with the following header: Mail-from: Arpanet host MIT-MC rcvd at 1-SEP-82 1012-PDT Date: 1-Sep-82 9:53:18 PDT (Wednesday) From: holbrook.ES at PARC-MAXC Subject: Re: Background and info on the Xerox environment In-reply-to: Margolin.Multics's message of 31 August 1982 20:23 edt To: Barry Margolin at MIT-MULTICS cc: Geoff at SRI-CSL, GZ at MIT-MC, HEADER-PEOPLE at MIT-MC, MsgGroup at BRL However, I also got back a copy from MsgGroup, which came back this way: Mail-from: Arpanet host BRL-BMD rcvd at 1-SEP-82 1039-PDT Date: 1 Sep 82 13:26:03-EDT (Wed) From: Memo Service (MMDF) Subject: Failed mail To: holbrook.ES at Parc-Maxc Your message could not be delivered to 'msggroup' for the following reason: 'Unknown problem' Your message follows: Date: 1-Sep-82 9:53:18 PDT (Wednesday) From: holbrook.ES at Parc-Maxc Subject: Re: Background and info on the Xerox environment In-reply-to: Margolin.Multics's message of 31 August 1982 20:23 edt To: Barry.Margolin at Mit-Multics cc: Geoff at Sri-Csl, GZ at Mit-Mc, HEADER-PEOPLE at Mit-Mc, MsgGroup at BRL Via: Parc-Maxc; 1 Sep 82 12:56-EDT Via: Brl; 1 Sep 82 13:16-EDT Note that the copy I got back from BRL has the dot in the address: "Barry.Margolin". This strongly suggests that the MMDF gateway at BRL did it. Paul  Date: 1 Sep 1982 at 1503-PDT To: holbrook.ES at Parc-Maxc Cc: Charles.Hornig at Mit-Multics, header-people at Mit-Mc Subject: Re: someone is munging headers... In-reply-to: Your message of 1-Sep-82 12:26:36 PDT (Wednesday). From: knutsen at SRI-UNIX MMDF does do this... but isnt there an RFC somewhere saying "no spaces in mailbox names"?  Date: 1 Sep 1982 1537-PDT From: Mark Crispin Subject: someone is munging headers... Sender: ADMIN.MRC at SU-SCORE To: Header-People at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. 
#103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) Yes, the BRL mailer is both inserting the dots and doing the idiotic case hacking of the host names. I recommend that you bug dpk@BRL about fixing it. BRL isn't the only mailer that munges headers this way. The mailer running on the various VAX Unices on Stanford's Ethernet loves to append various things to the From line. Its favorites are "@Shasta at Sumex-Aim" or just " at Shasta". As I had it explained to me, that mailer doesn't know the difference between external-origin mail it is relaying and local-origin mail it is sending out on the network. Both are "net mail", and the mail delivery process not the mail composition process is charged with adding the host name. Lots of very amusing headers have been created when things like mail expansion lists have been involved. I think the lesson to be learned is that munging the header is generally a loss; that node name insertion is best done at the mail composition level. If the mail delivery process must mung headers, it must have a way of distinguishing local-origin mail from mail it is relaying, or all sorts of horrible things will happen to the headers a remote site has so carefully set up to be the right thing. -------  Date: 2 Sep 1982 1229-PDT From: POSTEL at USC-ISIF Subject: re: spaces in names & related RFCs To: header-people at MIT-MC Please check RFCs 805 and 808. --jon. -------  Date: 2 Sep 82 11:03:15 EDT (Thu) From: Steve Bellovin Subject: header-munging To: "Charles.Hornig" at Mit-Multics, Admin.MRC at Su-Score, Header-People at Mit-Mc Cc: dpk at Brl-Bmd Via: UNC; 3 Sep 82 2:00-EDT Unfortunately, we can't get away from munging headers, especially in an Internet environment. When you send a letter, you don't know where it's going to end up. I'm off on a CSnet host, which means that letters arriving here from ARPAland should (soon) be from "user@host.arpa", and say something like "To: Header-People@MIT-MC.arpa". 
I suspect that most folks wouldn't like a mailer that included the full domain qualifier on all letters (nor does RFC822 require that); unless they do, however, some sort of munging will be necessary. The trick is to do it properly...  Date: 6 Sep 1982 1558-PDT From: Feinler at SRI-NIC (Jake Feinler) Subject: Re: MSGGROUP#1831 Re: Xerox Woes -- Variable width fonts To: reid.Shasta at SUMEX-AIM, Geoff at SRI-CSL, To: MsgGroup at BRL, Header-People at MIT-MC, To: Human-Nets at SU-SCORE cc: FEINLER In response to the message sent 27 Aug 1982 14:22-PDT - Friday from reid.Shasta@Sumex-Aim Brian, I am curious about your statement that Grapevine 'handles many times as much mail as the Arpanet'. On what do you base this statement? To the best of my knowledge no one knows how much mail traffic is carried by the Arpanet. If you have any figures, I would be glad to get them. Jake -------  Mail-from: SU-NET host Shasta rcvd at 6-Sep-82 2219-PDT Date: Monday, 6 Sep 1982 22:16-PDT To: Feinler at SRI-NIC, (Jake, Feinler) Cc: Geoff at SRI-CSL, Header-People at MIT-MC, MsgGroup at BRL Subject: Re: MSGGROUP#1831 Re: Xerox Woes -- Variable width fonts In-reply-to: Your message of 6 Sep 1982 1558-PDT. From: Brian Reid Jake, My estimate of relative mail traffic over Grapevine and Arpanet is based on 3 things: (1) experience at a half-dozen or so ARPANET sites watching netmail activity in and out at various times, which gave me a very rough idea of the netmail patterns in and out of those sites. I have never seen more than 1000 messages a day be sent from one Arpanet site. (In this statistic a human-nets broadcast to half of the human race is considered to be one message because the same thing is sent to all recipients.) I would be very surprised if there were more than 5 machines on the Arpanet that had outgoing message traffic that was even CLOSE to 1000 messages a day. (2) second-hand experience (i.e. hearing Grapevine people talk about it) of mail volumes at Xerox.
I have no idea whether or not their numbers are considered secret, but I had better play safe and not repeat them, since Xerox considers practically everything to be secret. (3) considerations of locality. On the ARPANET machines that I have used the vast majority of the computer mail that is sent and received is local. At places like CMU and MIT, which have a zillion machines, there is a lot of "local" mail that actually moves from one machine to another but not via the Arpanet. By contrast, at Xerox all mail goes through Grapevine, because there is no such concept as "local" when everyone has his own machine. Brian  Date: 6 Sep 1982 2357-PDT Sender: STEF at DARCOM-KA Subject: Re: MSGGROUP#1831 Re: Xerox Woes -- Variable width fonts From: STEF at DARCOM-KA To: reid.Shasta at SUMEX-AIM Cc: Header-People at MIT-MC, MsgGroup at BRL Message-ID: <[DARCOM-KA] 6-Sep-82 23:57:36.STEF> In-Reply-To: Your message of Monday, 6 Sep 1982 22:16-PDT Hi Brian - All that analysis is interesting, but I find it a bit strained to lean on the technical distinction between Grapevine defining everything to be netmail, and then discounting local CMU mail. I tend to feel that mail is mail is mail, no matter how far it does (or does not) travel, or whether it goes out and back into its source host or not. In the end, it all has to be composed and sent, and then received and processed. However, I would tend to agree that Xerox users process more mail per user on the average. I base this on a hypothesis that the amount of mail a user will tolerate is proportional to the power of their processing tools. I believe that the Laurel/Grapevine tools are superior, and that those bit-mapped displays on ALTOs make for better tools. This does not imply, however, that all these Xerox users are more productive. I expect that much of the extra power goes into their many fascinating junk lists. But, of course, many are more productive. I expect that something like Parkinson's law applies here.
Mail volume expands to fill the available capacity to process it. The task for management then, is to find ways to keep people focused on being productive. I doubt that this focusing activity can be automated with additional technical features and functions. Anybody ever seen a FOCUS Command? Yes, I know this is veering off the original topic, but it seems important at this point to note that although these tools give users the power to do more, there is nothing in these tools that will automatically make people use that capability to produce more. Indeed, I find that these tools often help me procrastinate faster, better, and more convincingly, and with less effort too! Cheers - Stef  Date: 7 Sep 1982 1309-PDT From: Mark Crispin Subject: Re: number of messages Sender: ADMIN.MRC at SU-SCORE To: Header-People at MIT-MC, MsgGroup at BRL Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) I have seen SCORE process more than 1000 messages/day on several occasions. SCORE's mailer is a busy (not-so) little beaver; if it gets blocked for even as short an interval as an hour hundreds of messages will pile up in its queues. I don't have firm statistics, but I am fairly certain it has delivered more than 1000 pieces of netmail in a single day during busy days. I'll confess that much of the netmail is to other Stanford facilities. -------  Mail-from: SU-NET host Shasta rcvd at 7-Sep-82 1501-PDT Date: Tuesday, 7 Sep 1982 14:57-PDT To: STEF at Darcom-Ka Cc: Header-People at Mit-Mc, MsgGroup at BRL Subject: Stef's comment on my answer to Jake In-reply-to: Your message of 6 Sep 1982 2357-PDT. <[DARCOM-KA] 6-Sep-82 23:57:36.STEF> From: Brian Reid I think that for this issue it is quite appropriate to pay careful attention to the distinction between local mail and network mail, even if the network is 10 feet long and goes to the next room.
The essence of this discussion has been that there is a particular standard for the format of mail during transport, which is used by Xerox internally but which has been judged "losing", "too expensive", "wrong", "bogus", and so forth by a random collection of hackers in the Arpanet community. There have been two parts to the discussion: (1) Is the Xerox standard a reasonable thing to do, technically? (2) Given that Xerox has been gating mail from their internet out to the Arpanet without adjusting its format, which violates conventions but not standards on the Arpanet, should they be pressured into changing this behavior? My point at the time I sent the original message a couple of weeks ago is that the Xerox folks had a huge amount of mail traffic TRANSPORTED OVER A NETWORK, AND THEREFORE SUBJECT TO RULES AND CONVENTIONS OF NETWORK TRANSPORT. They needed to adopt a transport convention that was better than what the Arpanet uses, because they had a higher volume of mail that could be damaged by an inappropriate format. In summary: the Xerox people had a greater need for good network transport conventions than did Arpanet people, because all of the Xerox mail had to go through network transport.  Date: 7 Sep 1982 1641-PDT From: Mark Crispin Subject: local mail vs. network mail Sender: ADMIN.MRC at SU-SCORE To: MsgGroup at BRL, Header-People at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) I think the issue of transport conventions is irrelevant. Xerox comprises an essentially homogeneous community. When dealing with a heterogeneous community sacrifices at times have to be made to meet some sort of common denominator. I am interested in seeing some sort of automatic line-breaking available in the MM mailsystem for TOPS-20. I feel, though, that I want to do more than the half-way solution of merely breaking overly long lines.
I also want it to repair lines that are "just slightly" too long, such as messages using an 85 character margin on my 80 character/line terminal. I would want such messages presented to me in a pleasing format, much as M-G or repeated applications of M-Q in EMACS would do for me. As a mailsystem maintainer, I would like time to think about these issues. I'm also worried about the RFC 822, SMTP, and TCP/IP conversions. My mailsystem presently talks (perfectly valid) RFC 733, FTP, and NCP; in some cases my mailsystem's usage of RFC 733 is invalid under RFC 822 (e.g. multiple at's and allowing non-mailbox From's) and addressing these matters has top priority. Since Xerox has decided to apply line breaks in messages sent out over the Internet, let's get our TCP/IP transitions done first. Once that has been completed, we ought to talk somewhat about the structure of the text of an electronic mail message and means of dealing with alternative notions of proper structure. I at least would like to see something done in this area, but, hey, I don't know about the other software wizards, but my brain overloads when I have too many things to do. Please don't talk about forcing us mailsystem maintainers to deal with long-line messages now. At least with me (and perhaps others) it scares us and makes us react defensively when we already have the terrifying December 31 deadline hanging over us. After December 31, I think all of the ARPANET software maintainers should all get together and have a big party. Everybody else should wait a week (a month? six months?) for us to recover from our hangovers then give us the long lines. We'll be much more receptive. -- Mark -- -------  Date: 7 Sep 82 23:46:13 EDT (Tue) From: Steve Bellovin Subject: minor complaint To: MsgGroup at Brl, Header-People at Mit-Mc Via: UNC; 8 Sep 82 1:36-EDT Is there any chance we can get this discussion on just one of the two lists? 
I'm starting to feel like Captain Yossarian ("I see everything twice!"), and I suspect I'm not the only one. To avoid further debate on both lists simultaneously, I hereby suggest that we move the "long lines from Xerox" discussion, and all its children, to just MSGGROUP. --Steve  Date: 7 Sep 82 19:28:03-EDT (Tue) From: Michael Muuss To: MsgGroup at Brl cc: Header-People at Mit-Mc Subject: Messages per Day (BRL) BRL processes about 200 messages per day, with between 1 and 700 recipients per message (average of about 20). This implies delivery of about 4000 COPIES of messages per day. No real distinction is made between local and network(s) delivery in these statistics. We don't even mail most of the big digests any more! Who says the ArpaNet doesn't carry lots of mail? We sometimes have to run 5 or 6 network mailers just to push the stuff out in a timely fashion. -Mike  Date: 7 September 1982 20:36-EDT From: Earl A. Killian Subject: statistics To: HEADER-PEOPLE at MIT-MC, MsgGroup at BRL Here are some MC statistics for the 24 hours of 9/6/82. This is unfortunately a holiday, but older data is on tape. 933 new mail items were submitted, 601 from the network. 2353 icps were attempted, 1057 of which failed. 3439 recipients received mail (no idea how many local). Of the above, some of the stuff used the multiple recipient feature of mail delivery for network mail. 247 messages reaching 940 recipients were sent with this technique (i.e. 3.8 recipients per message).  Date: Wednesday, 8 September 1982, 03:58-EDT From: Robert W. Kerns Subject: Stef's comment on my answer to Jake To: reid at SU-SHASTA at SU-AI Cc: STEF at Darcom-Ka, Header-People at Mit-Mc, MsgGroup at BRL In-reply-to: The message of 7 Sep 82 17:57-EDT from Brian Reid My point at the time I sent the original message a couple of weeks ago is that the Xerox folks had a huge amount of mail traffic TRANSPORTED OVER A NETWORK, AND THEREFORE SUBJECT TO RULES AND CONVENTIONS OF NETWORK TRANSPORT.
Actually, it had to do with the rules and conventions of transport, not the fact that it went over a piece of cable or a phone line. The problem arises from the fact that the mail reader and the mail sender have to communicate. At TOPS-20 sites, for example, mail is composed by BABYL, MM, MSG, HERMES, etc. If one of these were to expect that the receiver breaks lines, you can be sure there would be an uproar from the users of all the other software! All messages are transported, whether they happen to pass through something called a network or not. In summary: the Xerox people had a greater need for good network transport conventions than did Arpanet people, because all of the Xerox mail had to go through network transport. This is nonsense, when you realize (from above) that the word "network" is completely spurious. And in order to communicate with US, their greatest need is transport protocols which are COMPATIBLE. They realized this, and have promised to fix it.  Return-path: @USC-ISID,@UCL-CS,steve@ucl-cs Via: USC-ISID ; Thursday, September 9, 1982 08:21:58-PDT Date: 9 Sep 82 9:17:35-BST (Thu) From: Steve Kille Reply-To: steve%ucl-cs at isid To: header-people at mit-mc, msggroup at brl cc: indra at UCL-CS Subject: UCL statistics For those interested, here are UCL-CS statistics for 6 September. 95 local letters processed 107 letters from various networks 692 local recipients (total) 81 network recipients 3.9 recipients per letter average Steve Kille --------  Date: Friday, 10 September 1982, 02:16-EDT From: Robert W. Kerns Subject: More header mangling To: MsgGroup at BRL, HEADER-PEOPLE at MIT-MC I wish whatever it is (at BRL, I think) that is mangling my headers I send to MsgGroup would stop it. It probably thinks it is doing everybody a service, but it is wrong. I can *NOT* be reached as RWK.SCRC-TENEX at MIT-MC. I *CAN* be reached as "RWK at SCRC-TENEX" at MIT-MC.
If this translator wants to "help" convert to the standard that was decreed without telling anyone, it could convert it to "RWK.SCRC-TENEX.MIT-MC" at BRL, and forward that to "RWK at SCRC-TENEX" at MIT-MC. It does no one any good to convert from a formerly legal address that still works to a currently legal (but stupid) address that does NOT work.  Date: 10 Sep 82 3:24:38-EDT (Fri) From: Doug Kingston To: Robert W Kerns cc: MsgGroup at Brl, HEADER-PEOPLE at Mit-Mc Subject: Re: More header mangling I will endeavor to see what I can do, but the MIT gateway is largely at fault for allowing ILLEGAL headers onto the network. The conversion of illegal headers is an undefined operation. I also have a more serious bug to fix which will take precedence. The person responsible for the address parser is currently unavailable for two weeks, so I may have difficulty unearthing the "problem" myself. -Doug-  Date: 10 Sep 1982 0111-PDT From: Mark Crispin Subject: header mangling Sender: ADMIN.MRC at SU-SCORE To: MsgGroup at BRL, Header-People at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In an attempt to work around the problem of multiple at's and relays, I am planning on having MMAILR use "%" instead of multiple " at "'s to indicate a routed address. In other words, the former RWK at SCRC-TENEX at MIT-MC will become RWK%SCRC-TENEX at MIT-MC I would like to ask the maintainers of the MIT I.T.S. FTP server to make the (simple) change to their software so that "%" is accepted as an alternative to "@". This will be done to the TOPS-20 FTP server shortly, in a release with some other bugfixes for TOPS-20 FTPSER in the XMAILR environment. Once the code in MMAILR is debugged, it will be retrofitted into XMAILR. For those folks wondering what MMAILR is, it is a TOPS-20 only descendant of XMAILR for the brave new world of SMTP and TCP/IP.
It supports the new standard for the format of queued mail, and has several incompatible changes in interface from XMAILR. MMAILR is still "under development" and isn't ready for any sites to run yet, although it will be in alpha test soon. You know, I just realized one loss with not having multiple at addresses. MM used to be smart enough to realize when a relay route was not needed. If it recognized any of the host names in the route list, it could flush the remainder of the route and compute the route itself. With the new scheme, MM cannot have any knowledge of what constitutes a relay-routed address and so cannot be smart about how to deliver it. If I receive a message for whom one of the recipients is FOO%SU-SCORE at MIT-XX, it would have to deliver to XX since it really cannot know that the FOO is a local mailbox; formerly it did. Any scheme where it would know that "%" is a relay character with certain hosts would remove generality and otherwise would not be trustworthy. -- Mark -- -------  Return-path: @USC-ISID,@UCL-CS,steve@ucl-cs Via: USC-ISID ; Friday, September 10, 1982 01:14:04-PDT Date: 10 Sep 82 9:09:25-BST (Fri) From: Steve Kille To: Robert W. Kerns cc: steve%ucl-cs at Usc-Isid, header-people at Mit-Mc, msggroup at BRL, indra.UCL-CS at BRL Subject: Re: UCL statistics There are about 70 local users. At a guess, there are 20 or so users processing 20-30 messages a day comprising most of the local load. These figures are probably not at all stable, as a usable return path from the ARPANET has only become available over the last few months. I will be compiling some rather more detailed statistics in the near future if anyone is interested. Steve --------  Date: 10 Sep 1982 0154-PDT From: Mark Crispin Subject: Re: More header mangling Sender: ADMIN.MRC at SU-SCORE To: dpk at BRL cc: RWK.SCRC-TENEX at MIT-MC, MsgGroup at BRL, HEADER-PEOPLE at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave.
#103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In-Reply-To: Your message of 10-Sep-82 0024-PDT This isn't completely fair. At the time the software which sends multiple-at headers was implemented, it was done as part of the official, published standard, RFC 733. Those who decided to invalidate multiple-at addresses, and did so without generally publishing the decision (much less consulting with the implementors of relays into the MIT, Stanford, CMU, and Rutgers local networks!), deserve the recriminations. Those of us who implemented a multiple-at address mailsystem in good faith, believing the published standards, do not. I am under the impression that even RFC 822 supports multiple at headers if the string with the extra at's is quoted, e.g. that "RWK@SCRC-TENEX" at MIT-MC is a perfectly valid address. So let's not talk about "illegal" headers; until I see somebody go to jail for generating a multiple-at address, let's say "invalid under the current standard" and recognize that these addresses were valid in the standard in effect up to less than a month ago. It has already been agreed that we'll change our mail software to use "%" instead of multiple at's, although we would rather move away from relays in favor of domains entirely. Allow us time to convert! It is clear to me, though, that a site relaying messages should not reformat the message header the way BRL is doing. If BRL's mailer perceives that the header is invalid, it should treat it as text and build a new header on top of it. Any reformatting of the header could actually destroy information, as in this case, where BRL's mailer destroyed information which would be meaningful to some sites at least in favor of information which is useless to everybody. At the same time, I recognize that BRL has other matters which are of greater priority to address. Hopefully we'll both have our mailsystems fixed in the not-too-distant future.
Peace, -- Mark -- -------  Date: 10 Sep 82 11:29:29-EDT (Fri) From: Dave Crocker To: rwk.src-tenex at Mit-Mc cc: MsgGroup at Brl, HEADER-PEOPLE at Mit-Mc Subject: Re: More header mangling Nice to have your attention. I have repeatedly reported the situation to people at MIT, to no effect. The problem, of course, is that you guys are sending out illegal headers. It is, therefore, not too surprising that MMDF's header-munging code mishandles it. If you sent out "mbox at foo" at bar, things would be fine. Unfortunately, the initial stuff is not quoted, so that the specification contains two parts. If you fix that and headers are still mangled, when relayed, we'll be glad to try to find the problem at our end. Dave  Date: Friday, 10 September 1982 14:36-PDT From: Jonathan Alan Solomon To: Dave Crocker Cc: HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl, RWK at SCRC-TENEX Address: 3737 South Hoover Street Room PHE 204 Los Angeles, California 90089-0273 Phone: (213) 202-1793 Subject: More header mangling 1) I believe RFC822 will be officially implemented in Jan 1983, which means that until then we follow RFC733, right? Mail system implementors need time to work, right? 2) During the conversion - we should EXPECT to find inconsistencies, and problems replying to mail. Messages which are so crucial should have instructions in the message as to how to reply (note I put my USPS address and my home telephone number in the headers if you can't get electronic mail to me). 3) It will do no good to throw flames at each other. According to ARPA, USER.HOST@FORWARDER is as invalid as USER@HOST@FORWARDER, so BRL is as much at fault as MIT. BRL may (if it feels compelled to) change USER@HOST@FORWARDER into USER%HOST@FORWARDER. 4) I don't consider it valid to change RWK at SCRC-TENEX into rwk.src-tenex at Mit-Mc, under *ANY* mail standard. a) First of all, rwk is not lower case; both rfc733 and rfc822 specify that mail gateways should not change case of mail.
b) src-tenex is not src-tenex, but SCRC-TENEX. You don't need an rfc to realize that the name is misspelled; even if the world knew how to forward to that address, src-tenex is not even a valid host. i) not only that, but scrc-tenex wouldn't be valid either since it is lower case. c) Mit-Mc is MIT-MC, not casified. 5) ECL will eventually support all forms of forwarding. Currently it allows you to send user%host@forwarder, and user@host@forwarder. 6) To the best of my knowledge, RFC822 doesn't allow "JSOL at MIT-MC" or "JSOL%MIT-OZ" at MIT-MC as valid, it seems to require (someone correct me if I am wrong, I'm about to make changes to XMAILR to support this) JSOL%MIT-OZ@MIT-MC, or JSOL@MIT-MC. I seem to recall much hot flammage about this particular topic on Header-People not too recently. Someone out there (in UNIX land?) complained that ,"a","t", was too hard to parse? Please - nobody take all of this flaming personally. I think it's horribly silly to argue about things which we have all committed to change. The secret word here is time, give us maintainers time to implement things!! Rwk - perhaps you just wanted to hear that MMDF will be changed to conform to Rfc822? Yes, well it will be changed, so will XMAILR. Anyone want to volunteer to change COMSAT (on ITS)? Peace, --JSol  Date: 10 September 1982 18:30 edt From: Charles Hornig at MIT-MULTICS Subject: Re: More header mangling Sender: Hornig.Multics at MIT-MULTICS To: header-people at MIT-MC In-Reply-To: Message of 10 September 1982 18:00 edt from Jonathan Alan Solomon Another point about standards and "illegal headers": For a brief time a few weeks ago, MIT-MULTICS was sending out RFC 822 headers. We were forced to stop because people's mail readers couldn't parse them. It seems that many mailers don't understand RFC 733 (which allows spaces in names and multiple at's) and that almost none understand RFC 822. Since we can't win, the least we can do is try to get by.
This means you should do as little as possible to an address you don't understand in the hope that someone else will. What BRL is doing makes it impossible for even the human user to figure out to whom to reply.  Date: Friday, 10 September 1982, 19:29-EDT From: Robert W. Kerns Subject: Re: More header mangling To: dcrocker at UDel-Relay Cc: MsgGroup at Brl, HEADER-PEOPLE at Mit-Mc In-reply-to: The message of 10 Sep 82 11:29-EDT from Dave Crocker Date: 10 Sep 82 11:29:29-EDT (Fri) From: Dave Crocker Nice to have your attention. I have repeatedly reported the situation to people at MIT, to no effect. The problem, of course, is that you guys are sending out illegal headers. It is, therefore, not too suprising that MMDF's header-munging code mishandles it. If you sent out "mbox at foo" at bar, things would be fine. Unfortunately, the initial stuff is not quoted, so that the specification contains two parts. As you well know, this is perfectly legal under RFC733. The decision to declare it illegal was made almost in secret much later, without MIT's participation, and not announced until after everybody maintaining MIT's mailer had left MIT. There is no hope of making the change in the immediate future. If the people trying to impose this change had been less secretive (or if they had paid attention to MIT's calls for structure in mailbox names at the time, which would provide an alternative to this technique), then this situation could have been avoided. In the meantime, we all have to live with the consequences, even those of us who had nothing to do with the mess. If you fix that and headers are still mangled, when relayed, we'll be glad to try to find the problem at our end. Dave Doing nothing at all when you don't understand the header is the ONLY reasonable course.
Of course, if you DID want to understand and convert the double syntax to fit the current decrees, (and you must have understood it to make this wrong conversion!), convert it to be "FOO@BAR" at BAZ, and all will be well. I can see no reason to believe that converting to "FOO.BAR" would EVER be understood, since if it were, then that syntax would have been generated in the first place. Since it now appears that MIT will be converted to TCP, there is hope of getting MIT's mailer changed as well, but I doubt it would happen until after the TCP conversion. Unless there's someone at MIT going to do it that I don't know about.  Date: Friday, 10 September 1982, 19:50-EDT From: Robert W. Kerns Subject: Re: More header mangling To: dpk at BRL Cc: header-people at MIT-MC, MsgGroup at BRL In-reply-to: The message of 10 Sep 82 03:24-EDT from Doug Kingston Date: 10 Sep 82 3:24:38-EDT (Fri) From: Doug Kingston I will endeavor to see what I can do, but the MIT gateway is largely at fault for allowing ILLEGAL headers onto the network. The conversion of illegal headers is undefined operation. I also have a more serious bug to fix which will take precedence. The person responsible for the address parser is currently unavailable for two weeks, so I may have difficulty unearthing the "problem" myself. Thanks for your attention. I'm very hostile to the idea that MIT is "at fault" in this matter. Mark Crispin's observations sum up the situation very well; this is something that was changed out from beneath MIT and other places that implemented relaying early on. It was done in such a way as to guarantee maximal disruption. I can sympathize with your needing time to do it; I certainly do not have time to fix MIT's software for them. Back when I worked for MIT that might have been a possibility. (Actually, I'm not sure who put on that header; come to think of it, it probably was XMAILR (Mark Crispin's work), rather than the MIT software, but MIT is in the same boat.)  
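A minimal sketch of the conversion Kerns recommends above (quote everything to the left of the last at-sign, so that only one unquoted at-sign remains). The function name `quote_route` is hypothetical, the output uses "@" rather than " at ", and the sketch assumes the input is not already quoted; it is an illustration of the rule, not the code of MMDF, XMAILR, or any other mailer named in this thread.

```python
import re


def quote_route(address: str) -> str:
    """Quote all but the last at-sign of a multi-at address (hypothetical helper)."""
    # Accept both the RFC 733 spelling " at " and a bare "@" as separators.
    parts = [p.strip() for p in re.split(r"\s+at\s+|@", address, flags=re.IGNORECASE)]
    if len(parts) <= 2:
        # Zero or one at-sign: nothing needs quoting.
        return "@".join(parts)
    # Everything before the final host becomes one quoted local part.
    local = "@".join(parts[:-1])
    return f'"{local}"@{parts[-1]}'


print(quote_route("FOO at BAR at BAZ"))
print(quote_route("RWK at SCRC-TENEX at MIT-MC"))
```

With this rule a relay that does not understand the inner route can pass the quoted local part through untouched, which is the behavior Kerns asks for.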
Date: 11 September 1982 0031-EDT (Saturday) From: Richard H. Gumpertz To: Jonathan Alan Solomon Subject: Re: More header mangling CC: HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl In-Reply-To: Jonathan Alan Solomon@USC-ECLC's message of 10 Sep 82 16:36-EST Message-Id: <11Sep82 003154 RG02@CMU-10A> If people want to maintain compatibility, it seems to me that multiple atsigns should be handled by just adding quotes around all but the last. Hence, FOO @ BAR @ GORP would become "FOO @ BAR"@GORP, not FOO.BAR@GORP or FOO%BAR@GORP or anything else as silly. Rick  Date: 13-Sep-82 19:40:55-PDT (Mon) From: UCBARPA.eric@Berkeley Subject: Re: Header Mangling. Message-Id: <8208140240.14786@UCBARPA.BERKELEY.ARPA> Received: by UCBARPA.BERKELEY.ARPA (3.198 [9/12/82]) id A14786; 13-Sep-82 19:40:57-PDT (Mon) Received: from UCBARPA.BERKELEY.ARPA by UCBVAX.BERKELEY.ARPA (3.198 [9/12/82]) id A11317; 13-Sep-82 19:44:14-PDT (Mon) Phone: (415) 548-3211 To: header-people@MIT-MC, msggroup@BRL It doesn't seem that "mysterious disappearing protocol" argument is about RFC 822, but about the rather silent modification of RFC 733. I agree that to change a standard silently and without notice can only be considered an extraordinary botch. I heard about it by word of mouth several years after the fact. In any case arguing about it seems like beating a dead horse. The people that caused this made a bad management error and presumably regret it. To reverse the decision now would probably break as many systems as it would fix. 822 however is published, and a conversion period has been specified. It isn't perfect, but sometimes it is necessary to just get something out. To imply that TCP has come out without "valid technical questions" or intense debate is simply naive. Many people feel it is being shoved down their throats. Long have I listened to the "discussions" of how "stupid" it is. I know a number of people who intend to do nothing because "DCA wouldn't DARE turn off NCP." Hah! 
(Or perhaps I should save my laughs for 1-Jan-83.) But in any case something had to happen to try to improve the previous situation. I don't normally participate in these discussions because I feel it is more important to move the UNIX world forward than argue that we should be allowed to stay the same. I find it ironic that people have claimed that 822 isn't a BIG enough step -- considering that a lot of the debaucheries in it were to make it easier for those of you who already have working mail systems. I accept that you all have valid points. Now, how about we move forward and stop cussing each other out? eric allman P.S. Before any of you start yelling at me about the bad headers Berkeley has been producing, let me say that I feel this is endemic to building new mail systems. Please be polite to me -- I intend to be polite to you when you undergo the painful effort of conversion.  Date: 18 Sep 1982 (Saturday) 0211-EDT From: HAGAN at Wharton-10 (John Hagan) Subject: Message munging To: header-people at MIT-MC I have a feeling someone's MMDF or such is adding spurious blank lines to the end of transferred messages. Mail coming from the BRL area seems to always have about 3 to 5 blank lines at the end, proportional to the number of Via: 's. Am I nuts? --Kid.  Date: 19 Sep 1982 (Sunday) 2043-EDT From: HAGAN at Wharton-10 (John Hagan) Subject: Appending of blank lines to messages To: header-people at MIT-MC Several people have told me that messages routinely get an extra blank line added to the end of a message. One suggested that at each "hop" on the UUCP net a blank line is appended. This cannot be strictly true since messages with ten or more "!" have come through cleanly to me without the extra blank lines. Only messages with Via's, such as those from BRL, seem to have extensive blank lines. Even if this is a feature, it should be a bug. Just as headers should be preserved in all their glory, the message body should especially be untouched by mail servers.
I blame MMDF. Any comments from MMDF wizards? --Kid.  Date: 19 Sep 1982 2330-PDT From: KLH at SRI-NIC Subject: Re: Appending of blank lines to messages To: HAGAN at WHARTON-10, header-people at MIT-MC In-Reply-To: Your message of 19-Sep-82 1858-PDT Well, I believe I can explain the mystery (and save MMDF people from a fruitless search). The culprit is an ambiguity in the mail transfer protocol with respect to end-of-message-text; FTP's MAIL command shares this problem with SMTP's DATA command. Basically the message text is being transmitted over the command connection(s) and requires some in-band marker to indicate when the text has been completely transferred. This marker is a line with nothing on it except a period, in other words the sequence <CRLF>.<CRLF>. The question is whether the first <CRLF> in there should be considered part of the message text or not. There are some mail-sending programs which don't bother to check the message text to see if it ends in a new-line; they just slap on a <CRLF>.<CRLF> sequence at the end to guarantee that the message will be terminated no matter what the last characters were. This is normally fine, except that most mail-receiving programs probably consider the <CRLF> before the "." to be part of the message text, and they don't bother checking the previous characters either. So every time the message is transmitted (via distribution sites, forwarders, relays, etc) you get another blank line at the end. This phenomenon is quite noticeable when forwarding loops exist; you can tell from the number of blank lines how many iterations of the loop the Flying Dutchmessage has gone through. The MLFL command in FTP does not have this problem, because the data transfer is over a separate connection and no marker is necessary.
However, there is a different problem associated with this command, namely there is no way to verify that the closing of the data connection (which indicates end of message) was deliberate rather than accidental, which causes the successful transmission of truncated messages. (Plus some sites have problems with their NCP related to shoving out the last data bits of a closed connection, but that's another story). One way to fix this is to explicitly specify that all mail sending programs must ensure that the message text ends in a <CRLF> sequence; they must add it if it isn't there, and they must NOT add it if it is already there. Then simply require that mail receiving programs consider the first <CRLF> of the <CRLF>.<CRLF> terminator to be part of the message text. Or, of course, one could require that the first <CRLF> be thrown away, which would make it possible to send messages which did not end in a newline, and the mail-sending process could blithely add on the <CRLF>.<CRLF> without checking. I don't really care which fix is specified, although I think there are entirely too many programs (especially on UNIX) which break fatally if something they are handling doesn't end in a newline. It is not true that all network messages end in newlines; they just look that way, owing to the transmission protocol. Hardly the most important issue on hand, but we might as well settle it now. --Ken -------  Date: 20 Sep 82 6:55:52-EDT (Mon) From: Dave Crocker To: John Hagan cc: header-people at Mit-Mc Subject: Re: Appending of blank lines to messages No version of MMDF adds blank lines. It does verify that the end of the message body is a newline and, if it isn't, it adds one, since that is the requirement for RFC733 (and RFC822), but this has never been observed to create blank ending lines. UUCP mail creates blank lines regularly. This was discussed at great length in Unix-Wizards (I think) quite some time ago.
The likely reason you have not observed it before is because some UUCP sites have version of the mail program that work correctly. Dave  Date: 20-Sep-82 21:32:05-PDT (Mon) From: UCBARPA.mark@Berkeley Subject: Appending of blank lines to messages Message-Id: <8208210432.10606@UCBARPA.BERKELEY.ARPA> Received: by UCBARPA.BERKELEY.ARPA (3.201 [9/18/82]) id A10606; 20-Sep-82 21:32:07-PDT (Mon) Received: from UCBARPA.BERKELEY.ARPA by UCBVAX.BERKELEY.ARPA (3.201 [9/18/82]) id A12934; 20-Sep-82 21:34:37-PDT (Mon) To: header-people@mit-mc The problem has nothing to do with MMDF, SMTP, or UUCP. It's a property of the V7 UNIX /bin/mail program that all mail must begin with "From " and end with a blank line. The simplest way to insure this is to stick a From line on the front and an extra newline on the end, ignoring the contents. This worked fine when all mail was locally generated and locally delivered. This property has the side-effect that if a message goes through 10 hops, there will be 10 From lines on the front (9 of which begin ">From " so they won't look like real from lines) and 10 newlines on the end. Believe me, the headers are much uglier than the blank lines. You think From foo!bar!mumble!joe@Berkeley Mon Sep 20 12:11:10 1982 is ugly? How'd you like to have to read From uucp Mon Sep 20 12:11:10 1982 >From nuucp Mon Sep 20 12:10:30 1982 remote from foo >From fred Mon Sep 20 12:08:55 1982 remote from bar >From joe Mon Sep 20 12:04:12 1982 remote from mumble on the tops of your messages? Half the UUCP world gets this nonsense! There is a Berkeley program that inputs the above nonsense and converts it to an ! notation so that at least replies work, which is run on all Berkeley systems. But the same program does not attempt to strip off multiple newlines from the end - you can't really tell what was added and what was originally there, and besides, as someone pointed out, messing with the body of a message is bad idea. 
As far as I know, all /bin/mail programs around which are descended from the V7 program still add a newline. (I'm not sure what happens on Berkeley UNIX systems, since a different program gets invoked, but in many cases the mail still goes through /bin/mail.) On non-Berkeley systems (including most of Bell Labs) people are still running a system where /bin/mail is the ONLY mail software they've got! (If you're on a UNIX system, read your binmail(1) page and then see if you still feel sorry for yourself because you have to read extra blank lines.) Of course, getting a change like this made to avoid extra blank lines is not just a matter of getting one hacker to fix one mail system. There are several hundred (nobody knows for sure because there's no list) sites on uucp, with no way of reaching more than about the 250 of them on USENET. Many of these sites inside Bell Labs are computer centers that maintain 20 or 30 or 40 machines, with a policy that they run the vanilla system that comes from USG (the UNIX Support Group inside Bell Labs) so they can point the finger when a bug is found. Getting USG to fix a newline bug is obviously low priority in the face of the internet syntax, their lack of manpower (and even monkeypower), and the obviously dramatic need for a reasonable user interface. Also, they are now entertaining ideas for what to put in the version of UNIX that will be out in Fall 1983. So don't hold your breath. Mark  Date: 22-Sep-82 17:15:55 PDT (Wednesday) From: holbrook.ES at PARC-MAXC Subject: Re: MSGGROUP DIGEST SUGGESTIONS In-reply-to: Reynolds' message of 22 SEP 1982 1549-PDT To: Reynolds at Ames-67 cc: Header-People@mit-mc So far, I've received the same message three times, with three different timestamps in the Date: field. What gives? Looks like the Ames-67 mail is spitting them out.. 
Mail-from: Arpanet host BRL rcvd at 22-SEP-82 1703-PDT Date: 22 SEP 1982 1549-PDT To: MSGGROUP at BRL From: Reynolds at Ames-67 Subject: MSGGROUP DIGEST SUGGESTIONS Via: Ames-67; 22 Sep 82 18:49-EDT Via: Brl; 22 Sep 82 19:00-EDT Via: Brl-Bmd; 22 Sep 82 19:08-EDT Mail-from: Arpanet host BRL rcvd at 22-SEP-82 1647-PDT Date: 22 SEP 1982 1548-PDT To: MSGGROUP at BRL From: Reynolds at Ames-67 Subject: MSGGROUP DIGEST SUGGESTIONS Via: Ames-67; 22 Sep 82 18:48-EDT Via: Brl; 22 Sep 82 19:00-EDT Via: Brl-Bmd; 22 Sep 82 19:06-EDT Mail-from: Arpanet host BRL rcvd at 22-SEP-82 1703-PDT Date: 22 SEP 1982 1549-PDT To: MSGGROUP at BRL From: Reynolds at Ames-67 Subject: MSGGROUP DIGEST SUGGESTIONS Via: Ames-67; 22 Sep 82 18:49-EDT Via: Brl; 22 Sep 82 19:00-EDT Via: Brl-Bmd; 22 Sep 82 19:08-EDT  Date: 22 September 1982 22:01-EDT From: David A. Moon Subject: header mangling To: Admin.MRC at SU-SCORE cc: HEADER-PEOPLE at MIT-MC, MsgGroup at BRL Date: 10 Sep 1982 0111-PDT From: Mark Crispin In an attempt to work around the problem of multiple at's and relays, I am planning on having MMAILR use "%" instead of multiple " at "'s to indicate a routed address. In other words, the former RWK at SCRC-TENEX at MIT-MC will become RWK%SCRC-TENEX at MIT-MC I would like to ask the maintainers of the MIT I.T.S. FTP server to make the (simple) change to their software so that "%" is accepted as an alternative to "@". The FTP server does not presume to reformat addresses that pass through it. I don't see how substituting one character from another buys anything, other than hiding the @'s from people and programs that are looking for multiple @'s to flame about, until they learn that % and @ are synonymous. However, in the interests of maximizing the usefulness of the mail system, I will make Comsat understand this syntax whenever you tell me that mail is being or going to be sent in it (plus a week or two for seeing my mail and having the half hour to do it, test it, and install it.)  
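A minimal sketch of the "%" convention discussed in the message above: every at-sign except the last is rewritten to "%", so RWK at SCRC-TENEX at MIT-MC becomes RWK%SCRC-TENEX@MIT-MC. Both function names are hypothetical. The second function, in which a relay drops its own host name and promotes the rightmost "%" back to "@", is an assumption about how a receiving host might treat "%" as an alternative to "@"; the thread does not specify that step in this detail.

```python
def to_percent_route(address: str) -> str:
    """Rewrite all but the last at-sign as '%' (hypothetical helper)."""
    # Normalise the RFC 733 spelling " at " to "@" before splitting.
    hops = [h.strip() for h in address.replace(" at ", "@").split("@")]
    if len(hops) < 2:
        return address
    return "%".join(hops[:-1]) + "@" + hops[-1]


def next_hop(address: str) -> str:
    """Assumed relay behaviour: drop our own host, promote the last '%' to '@'."""
    if "@" not in address:
        return address
    local, _, _own_host = address.rpartition("@")
    if "%" not in local:
        # No further routing: what remains is a local mailbox name.
        return local
    user, _, host = local.rpartition("%")
    return f"{user}@{host}"


routed = to_percent_route("RWK at SCRC-TENEX at MIT-MC")
print(routed)
print(next_hop(routed))
```

As Moon observes, this changes only the spelling of the route, not its structure: the hop list recovered from the "%" form is the same one the multiple-at form carried.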
Date: 23 Sep 82 6:28:29-EDT (Thu) From: Dave Crocker To: David A Moon cc: HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl, Cerf at Usc-Isi Subject: Re: header mangling With respect to the illegal transmission of multiple @/" at " in address headers, the basic requirement is that the relevant MIT, Stanford, and CMU sites stop sending them. Some of the affected sites, most notably Mark Crispin, were very much caught in the time-warp which failed to publish the ARPA decision to prohibit multiple at-signs. Some of the affected sites have known about the situation for close to two years and continue to refuse to make the change. The BRL MMDF behavior that started this brouhaha is the result of MMDF very carefully intercepting such extra tokens and converting them to a character which MMDF treats as lexically equivalent to at-sign. In our case, we chose period. This was done about 2 years ago. Mark Crispin is choosing percent sign; UCL has chosen it also. Rather than complain that some remote system, like BRL, should put quotes around text that it had no organizational responsibility for generating, the sending sites should make sure that the text conforms, before letting the text loose onto the Arpanet. Dave  Date: 23 Sep 82 8:32:08-EDT (Thu) From: Dave Crocker To: David A Moon cc: Admin.MRC at Su-Score, HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl Subject: Re: header mangling It is amazing how useful waking up can be. I reread your note and realized that, besides suggesting that the source(s) of the problem be fixed, there was a simple expedient to request, based on your last paragraph: Please make your ftp server accept period (".") as lexically equivalent to at-sign ("@"); the use of the period is effective immediately. Dave P.S. It occurs to me that it might be useful to note that the BRL MMDF is using a header-munger that was added to MMDF by SRI and is run by them, as well as UDel and will be run on Rand-Relay (for CSNet).
D/  Return-Path: Date: 23 September 1982 09:20 edt From: Charles Hornig at MIT-MULTICS Subject: Re: header mangling Sender: Hornig.Multics at MIT-MULTICS To: header-people at MIT-MC In-Reply-To: Message of 23 September 1982 09:06 edt from Dave Crocker Before people go off and make "." be a special character you should remember that there are systems in the world where "." is legal in user names. Certainly this is true on Multics, and I believe it is true for ITS as well. Furthermore, no one has ever proposed that you have to quote it. I would hate to have some mail addressed to "FOO.BAR at MC" get forwarded to "FOO at BAR" rather than being put in FOO.BAR's mailbox. As I understand what we will be supposed to do, "RWK at SCRC-TENEX at MIT-MC" becomes "@MIT-MC.ARPA:RWK@TENEX.SCRC". The real question is how to make COMSAT understand that.  Date: 23 Sep 1982 1153-EDT From: J. Noel Chiappa Subject: Re: header mangling To: Charles Hornig at MIT-MULTICS, header-people at MIT-MC cc: JNC at MIT-XX In-Reply-To: Your message of 23-Sep-82 0920-EDT Gee, wasn't '%' officially picked as THE character for that very reason? I seem to remember something about that.... -------  Mail-from: SU-NET host Shasta rcvd at 23-Sep-82 0954-PDT Date: Thursday, 23 Sep 1982 09:54-PDT To: Dave Crocker Cc: David A Moon , Admin.MRC at Su-Score, HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl Subject: NO! Don't propagate this madness! Stop the Bloody periods!!!! In-reply-to: Your message of 23 Sep 82 8:32:08-EDT (Thu). From: Brian Reid The period is the worst possible choice as a substitute for the @ sign, and nobody else should track this foolish mistake. Please don't go convincing other people to do this too! 
Almost every site on the Arpanet assigns some meaning to the period already: Tops-20 sites use periods for subdirectories: Admin.MRC Multics uses periods for account codes: Schauble.Multics Xerox uses periods for registry names: Schroeder.PA (Palo Alto registry) CMU uses periods as an equivalent for spaces: Craig.Everhart Berkeley uses prefix periods the way Crocker wants to use suffix periods: Kim.ouster means person "ouster" at site "Kim". Dave, you should pick almost any other character besides the period for this vigilante action. I have a nagging feeling that it would cause less grief for the collected Arpanet community if you used a lowercase "q" instead of a period: MOONqSCRC-TENEX@AI. There is maybe 1 instance in 50 of a net name that legitimately contains a "q"; enormous numbers of them contain periods. A % sign or an ampersand or a # sign or something that does not customarily show up in net names would be far superior. Brian Reid, a.k.a. Brian.Reid@CMUA, CSL.BKR@SU-SCORE, BReid.PA@Parc  Date: 23 Sep 1982 0957-PDT From: Mark Crispin Subject: Re: header mangling Sender: ADMIN.MRC at SU-SCORE To: MOON at MIT-MC cc: Header-People at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In-Reply-To: Your message of 22-Sep-82 1901-PDT No, Dave, I'm talking about merely changing the I.T.S. FTP server to recognize "%" as an alias for "@" in the MAIL or MLFL command. That has nothing to do with reformatting addresses that pass through it, which I agree is an incredibly bad idea. I suspect it should be a single PDP-10 machine instruction change. You might as well do it now, since it shouldn't ought to hurt anything. -------  Date: 23 Sep 82 13:08:03-EDT (Thu) From: Dave Crocker To: Brian Reid cc: David A Moon , Admin.MRC at Su-Score, HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl Subject: Re: NO! Don't propagate this madness! Stop the Bloody periods!!!! 
Perhaps my main point was not clear: We chose period over two years ago. I was merely responding to David Moon's offer to incorporate interpretation of whatever character was specified. I still believe that the correct resolution is for the ITS, CMU, and Stanford sites to remove/fix the symbols before letting messages out onto the Arpanet. Dave  Date: 23 Sep 82 14:18:12-EDT (Thu) From: Dave Crocker To: Brian Reid cc: David A Moon , Admin.MRC at Su-Score, HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl Subject: Re: Re: NO! Don't propagate this madness! Stop the Bloody periods!!!! The MMDF code in question is semantically unchanged from two years ago. To my knowledge, no recent changes took place. It just happens that the mapping behavior became highly visible, due to BRL's acting as a relay. There is a chance that some code DID get changed; if so, it was merely to make the address munging work 'correctly' (as we originally specified it). At any rate, per Crispin's note, I want to stress that hassling non-conforming sites was not on the agenda. In fact, they were not considered, at all. (It has been a tempting thought, for the past couple of years, but was deemed counter-productive.) The munging was being performed strictly for the purpose of following the dictum that says 'be generous in what you accept with your parser and conservative in what you feed others'. Postel has some version of this statement in one of his specifications. MMDF is generous in accepting the multiple at-signs but conservative in generating only one. Dave  Mail-from: SU-NET host Shasta rcvd at 23-Sep-82 1031-PDT Date: Thursday, 23 Sep 1982 10:31-PDT To: Dave Crocker Cc: David A Moon , Admin.MRC at Su-Score, HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl Subject: Re: Re: NO! Don't propagate this madness! Stop the Bloody periods!!!! In-reply-to: Your message of 23 Sep 82 13:08:03-EDT (Thu). 
From: Brian Reid You might have chosen period 2 years ago, just like the Multics people chose another meaning for it 15 years ago and the Tops-20 people 10 years ago. But you didn't start this vigilante substitution of other people's codes into your kind of codes until fairly recently. It is the subsitution, not your unfortunate choice of the period, that is the act of aggression. However, your basic position of vigilante action about the multiple @ signs would be cute instead of obnoxious if you had chosen some character besides the period to change them into.  Date: 23 Sep 1982 1048-PDT From: Mark Crispin Subject: cooling off the flames Sender: ADMIN.MRC at SU-SCORE To: MsgGroup at MIT-ML, Header-People at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) Folks, I talked with Dave Crocker during the Internet conference in Germany. Here are some facts which are not generally known, but should be: . Dave did not take vigilante action. What happened was that his mailsystem once generated multiple at's the way the mailsystem at MIT, Stanford, and CMU does. Being closer to the Internet working group at the time, he was aware of the decision to eliminate multiple at's. The way he did it was that his mail delivery system converts multiple at's into the dot format. This was done a few years ago. Nobody was aware of it at the time, until MsgGroup mailings started going via MMDF at BRL and multiple at MsgGroup mail got clobbered. . The way in which multiple at's are generated by the TOPS-20 mailsystem makes it moderately difficult to fix it. In particular, the address "FOO at SCRC-TENEX" is passed by the mailsystem with a special request to expand "SCRC-TENEX" to a routed address if necessary. The mail delivery process will then expand it, e.g. to "SCRC-TENEX at MIT-MC". 
This is only done at the explicit request of the mail composition process, and otherwise the mail delivery process will never modify a message header. Incoming network mail, not generated by the mail composition process, will not have the request and consequently the mail delivery process (XMAILR in this case) will not touch the header. The problem is that since the request is around "SCRC-TENEX", it is moderately difficult to change the " at " preceding it to any character, be it "%" or "." or "q". There are various complex cases which have to be considered to do it properly; it is not a trivial fix. I am unwilling to make any change to that part of the mailsystem without verifying its correctness, for fear of destroying possibly valuable mail. The change which will be made is that the previous " at " will be converted into a "%", and all subsequent at's in the routing address up to the last one will be likewise converted. The FTP servers will be fixed to treat "%" as equivalent to "@" in the MAIL and MLFL commands. This has already been agreed to. It is a dead issue; complaining about it further is valueless. . Period is completely unacceptable to us as a node delimiter to the left of the at-sign. Despite what the standards may say, we will not use it for that purpose, preferring "%" instead. However, the standard (as well as private conversations with Crocker) makes it clear that we are perfectly free to fantasize that "." is an ordinary character and "%" is special, just as Crocker is free to fantasize the other way. I wish the standard had not given such special meaning to ".", but it is irrelevant. This only refers to the "mailbox" part of the address. Period is a perfectly reasonable character in a domain address. Ultimately the address should become something like @MIT.ARPA:RWK@TENEX.SCRC but for now we'll use "RWK%SCRC-TENEX at MIT-MC". 
Surprisingly it looks like domains would be easier to implement for me than the "%" hack, but I wish to coddle older programs for just a little while longer. Given this, I propose the following rules: (1) NO FURTHER COMPLAINTS ABOUT MIT OR STANFORD MULTIPLE AT'S. (2) NO FURTHER DEFENSE FROM MIT OR STANFORD OF MULTIPLE AT'S. (3) NO FURTHER ARGUMENTS ABOUT THE MERITS OF "%" VS. "."; IT HAS BEEN MADE CLEAR THAT BOTH CAN BE USED AS NODE DELIMITERS OR ORDINARY CHARACTERS. Anybody who violates these rules should have their terminal taken away from them for a week. -- Mark -- -------  Date: 23 Sep 1982 1316-PDT From: KLH at SRI-NIC Subject: Using "%" instead of "@" To: header-people at MIT-MC, msggroup at MIT-ML cc: KLH at SRI-NIC In-Reply-To: Your message of 23-Sep-82 0957-PDT Moon is correct. The ITS FTP server has no business doing anything about the arguments to MAIL or MLFL; it is up to the actual mailer, COMSAT, to handle them in whatever fashion is deemed fit. I must confess I have totally forgotten why it is necessary to even have a substitute character at all. I think if you counted up the software that would have to be fixed to parse the hostname off by going "backwards", versus the programs that would have to be modified to understand "%", you might find that the latter number is much larger. Finally, I am tired of seeing people beat on MIT for being "uncooperative", refusing to recognize the requirements of the rest of the world. From my viewpoint it is the other way around; the rest of the world has never recognized MIT's needs, and RFC733 required a very significant compromising of MIT capabilities, for no clearly stated reason. This left a bitter taste which RFC822 does nothing to alleviate; the surprising thing is that people there are still willing to be accomodating, as Moon's message demonstrates. --Ken -------  Date: 23 Sep 82 16:08:03-EDT (Thu) From: Michael Muuss To: HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl Subject: Can we all agree? 
Why don't we all just agree to translate to the percent-sign (instead of the Dot), and be done with it? The change to MMDF is like a line or two of code; surely the same is true for all the other mailers that bother to change headers. -Mike  Date: Thursday, 23 September 1982 15:37-PDT From: Jonathan Alan Solomon To: Richard H Gumpertz Cc: Admin.MRC at Su-Score, Dave Crocker , HEADER-PEOPLE at Mit-Mc, David A Moon , MsgGroup at BRL, Brian Reid Address: 3737 So. Hoover St. LA, Cal. 90089-0273 Phone: (213) 743-6861 Subject: NO! Don't propagate this madness! Stop the Bloody periods!!!! Gee, I second that. If we can use either "." or "%", why not "@"? I get the feeling we have asked this before. Oh well... --Jsol  Date: 23 September 1982 1527-EDT (Thursday) From: Richard H. Gumpertz To: Brian Reid Subject: Re: NO! Don't propagate this madness! Stop the Bloody periods!!!! CC: David A Moon , Admin.MRC at Su-Score, HEADER-PEOPLE at Mit-Mc, MsgGroup at Brl, Dave Crocker In-Reply-To: Brian Reid@Shasta@SU-Score's message of 23 Sep 82 11:54-EST Message-Id: <23Sep82 152754 RG02@CMU-10A> I would like to make a radical suggestion: instead of ".", "%", or "#", why not use "@"? Is it really that hard to change the standard back to allow (but perhaps discourage) multiple "@"s? It surely is not difficult for mail-processing programs that don't want to understand them to just IGNORE AND PASS THROUGH all but the last (rightmost) @. After all, multiple @s do form a fairly straightforward syntax and have proven to be useful. Why build kludge upon kludge to translate old conventions when it would be easier to just accept the old convention as it was? Why not support BOTH domain naming AND route naming, with the former the preferred method? Who is getting any benefit out of eliminating multiple @s? Is that benefit greater than the cost? Rick  Date: 23 September 1982 2043-EDT (Thursday) From: Richard H. Gumpertz To: Header-People at MIT-MC, MsgGroup at BRL Subject: munging my name! 
Message-Id: <23Sep82 204303 RG02@CMU-10A> Why is BRL munging my FROM line? It originally reads Richard H. Gumpertz When it comes back, however, the "." after the H is gone (my middle name is not "H"!) and the capitalization of CMU-10A has been raped. THIS SHOULD NOT BE DONE!!!!! The only change I consider acceptable (though still annoying) is changing " at " to "@". If you can't get it right, LEAVE IT ALONE!!!!! Rick  Date: 23 September 1982 23:48 cdt From: Stachour.CSCswtec at HI-Multics Subject: Re: munging my name! To: Richard H Gumpertz cc: Header-People at MIT-MC, MsgGroup at BRL In-Reply-To: Msg of 09/23/82 21:39 from Richard H Gumpertz Gee, Rick, I don't know why your line of CMU-10A is getting changed by BRL. The mailer tables here at HI-Multics (which are carefully maintained, I'm told, in the 'right' capitalization from 'official network tables' (wherever they are) ) have the capitalization of your site as 'CMU-10a' (and I suspect that's what the capitalization is on this msg (CMU-10a) of your host-id). Always questing, ...Paul  Date: 24 Sep 82 0:32:01-EDT (Fri) From: Andrew Scott Beals To: Admin.MRC at Su-Score cc: MOON at Mit-Mc, Header-People at Mit-Mc Subject: Re: header mangling Via: UMCP-CS; 24 Sep 82 2:00-EDT from the user level, ITS *does* recognize % as equivalent to @ (for tip users' happiness). i don't know about mail (COMSAT), but I think it does.  Date: 24 Sep 1982 (Friday) 2056-EDT From: HAGAN at Wharton-10 (John Hagan) Subject: How about this: To: header-people at MIT-MC I like this (and played with it in an experimental mailer locally): To: FOO Path: at MIT-MC via Udel-Relay via My-Vax via CMU-Vax9764 Or some other English based addressing system. Then, all sites can be separated by a consistent character: "FOO".CMU-VAX9764.My-Vax.Udel-Relay@MIT-MC --Kid.  
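[Archive note: Hagan's message above shows a single example of turning an English-style "Path:" line into one dotted address. The Python sketch below reproduces only that one transformation. The function name path_to_address and the strict "at HOST via HOST ..." grammar are assumptions made for illustration; the message does not describe how his experimental mailer actually parsed the line.]

```python
def path_to_address(user: str, path: str) -> str:
    """Hypothetical helper: 'at A via B via C' becomes '"user".C.B@A'.

    The final host follows 'at'; each 'via' hop is listed outward from it,
    so the dotted form lists the hops in reverse order.
    """
    words = path.split()
    if len(words) < 2 or words[0] != "at":
        raise ValueError("path must start with 'at <host>'")
    final_host = words[1]
    rest = words[2:]
    # The remaining words must alternate: 'via', host, 'via', host, ...
    if len(rest) % 2 != 0 or any(word != "via" for word in rest[0::2]):
        raise ValueError("expected 'via <host>' pairs after the first host")
    hops = rest[1::2]
    parts = ['"' + user + '"'] + list(reversed(hops))
    return ".".join(parts) + "@" + final_host


if __name__ == "__main__":
    print(path_to_address(
        "FOO", "at MIT-MC via Udel-Relay via My-Vax via CMU-Vax9764"))
```

The sketch keeps host names exactly as typed; it does not attempt the case folding that the discussion above complains about.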
Date: 25 Sep 82 3:12:11-EDT (Sat) From: Michael Muuss To: header-people at Mit-Mc cc: Stef at Darcom-Ka Subject: [notesfiles article] Looks like the Bell folks are now tracking the Arpa MAIL formats! -Mike ***** brl-bmd:net.general / cbosgd!mark / 10:26 am Aug 20, 1982 Three new RFC's have been published on the ARPANET. They deal with the new mail standards. Their numbers are RFC 819 Internet Syntax (18 pages) RFC 821 SMTP (Simple Mail Transfer Protocol) (68 pages) RFC 822 Header Format (replaces RFC 733) (47 pages) I am posting the short one, RFC 819, to net.sources. If there is sufficient interest, I will post the others (although 821 probably does not apply to our environment - it would apply to a local net or a long-haul full duplex reliable net such as the arpanet) or send them to interested people if there is a small number of such people. There is one error in 819 you should be aware of. Several examples use names like "alpha!beta!gamma!john.UUCP" which do not contain an "@" sign. Since all internet addresses are of the form "user@host", the dot should be changed to an at. I'm not making this change since it's not in the document as published, but readers should be aware of the error. I hope to see a timetable for conversion shortly. My understanding is that it is to be phased in gradually - first sites are supposed to understand but not generate the new syntax, then sites are supposed to understand both and generate the new, and eventually only care about the new syntax. An estimate is two months at each phase. Software to support the new syntax exists. If you are in Bell Labs or otherwise licensed for UNIX 5.0 (presumably this means Bell System only) I have a version of the 5.0 /bin/mail command that understands it and generates new RFC 822 headers. (It's the same one that went out for testing a month or so ago - no bugs were found.) If you are running Berkeley UNIX, the Berkeley sendmail program supports this syntax. 
(Sendmail isn't available to the general public yet but will be before anyone seriously urges conversion.) Both pieces of software are, of course, free. If you are running something else, some minor conversion will probably be necessary, but no serious problems are expected unless you have a home-grown mail system. I understand that the authors of MMDF and MH plan to support the new syntax, but I have nothing to do with that. At this point I urge all UUCP sites to understand the new syntax and to plan for conversion, but not yet to undertake actual conversion. The intent is that UUCP will use the simplification "user@host.uucp", at least initially. Mark Horton ----------  Date: 25 September 1982 13:48-EDT From: Ken Harrenstien Subject: MsgGroup vs Header-People & @ % . To: STEF at DARCOM-KA cc: HEADER-PEOPLE at MIT-MC, MsgGroup at BRL Stef is correct. Header-people is for the discussion of nitty-gritty and should include all those people who actually implement mail software. Please try to ensure that MsgGroup is not on the CC list for future messages about the "@ % ." issue, as well as similar technical details. Thanks.  Date: 25 September 1982 2221-PDT (Saturday) From: v.wales at UCLA-Security (Rich Wales) Subject: multiple at-signs in addresses -- why not? To: Header-People at MIT-MC In an effort to pour some oil on troubled waters, I am going to propose an "upward-compatible" modification to the Internet address scheme of RFC819 and RFC822. (Horrors, some may say; everybody knows that RFC stands for Real Firm Concrete. Well, here I go with my chisel anyway; if you want to take away my terminal, you'll have to break down my office door to get at it!) Suppose we modify the syntax/semantics of an Internet address like so: After the FIRST at-sign in an address, periods and at-signs are equivalent and may be used interchangeably. I don't believe this rule would do any violence to Internet addressing. 
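[Archive note: the rule Rich Wales states above (after the FIRST at-sign, periods and at-signs are equivalent) can be sketched in a few lines of Python. This is an illustration only, not code from any mailer of the period. The function name split_wales and the sample address are assumptions chosen for the example.]

```python
def split_wales(address: str):
    """Hypothetical helper applying the proposed rule.

    Everything before the first '@' is the local part; everything after it
    is the domain, with any further '@' treated as if it were '.'.
    Returns (local_part, domain), or (address, None) for local mail.
    """
    local_part, at_sign, rest = address.partition("@")
    if not at_sign:
        return address, None          # no '@' at all: local mail
    return local_part, rest.replace("@", ".")


if __name__ == "__main__":
    print(split_wales("Fred@Flaky-VAX@Podunk.ARPA"))
    print(split_wales("Fred@Flaky-VAX.Podunk.ARPA"))
```

Both sample addresses normalize to the same pair, which is the sense in which the proposal makes the two spellings interchangeable.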
If I understand the way the new Internet addresses will work, each machine (or name server) which handles a message will have to scan the address string from right to left, taking off one or more components from the end and (if relaying is involved) passing what's left to the next machine or name server. This scanning and splitting can just as easily be done on the basis of "either '.' or '@'" as on periods alone. The "single at-sign" rule admittedly does have the advantage that, once an address parser sees the at-sign, it knows that there are no more domain specifications. However, an address parser is typically going to have to scan the address anyway to see if it has an at-sign at all; otherwise, it isn't going to be able to distinguish between net mail and local mail. If this scan is done left-to-right (as it probably would be anyway), locating the first at-sign in an address is trivial. I realize this proposed mod would make Internet addresses non-unique (because you could switch between periods and at-signs) and perhaps slightly less aesthetic. (Yes, I know, some people think they're awful already -- no flames, please!) However, it seems to me that this would be a quite acceptable way of accommodating the sites which have been using the at-sign for quite a long time already, and the amount of resentment and accusation currently flying around could be lessened considerably. -- Rich  Date: 26 Sep 1982 1153-PDT Sender: CERF at USC-ISI Subject: Re: multiple at-signs in addresses -- why not? From: CERF at USC-ISI To: v.wales at UCLA-SECURITY Cc: Header-People at MIT-MC, POSTEL at ISIF, Cc: DCROCKER at UDEL-RELAY Message-ID: <[USC-ISI]26-Sep-82 11:53:10.CERF> In-Reply-To: Your message of 25 September 1982 2221-PDT (Saturday) RICH, I THINK I UNDERSTAND YOUR MOTIVATION AND I AM SYMPATHETIC. HOWEVER, I WONDER IF BY DOING AS YOU SUGGEST, WE WOULD BE INADVERTENTLY PROPAGATING DIFFERENT SEMANTICS TO THE RIGHT OF THE "@" SIGN. 
CURRENTLY, THE RFC DEFINITION PERMITS ONLY THE ONE "@" SIGN. TO ITS RIGHT IS A DOMAIN SPECIFICATION WHICH MAY CONTAIN ONE OR MORE PERIODS. IN THE MULTIPLE-@ FORMULATION, AREN'T THE ENTRIES BETWEEN "@" SIGNS REPRESENTATIVE OF HOSTS - GIVING A SORT OF PATH OR ROUTE, AS IN UUCP "!" SIGNS? IF THAT IS THE CASE, THE ELEMENTS OF THE DOTTED DOMAIN SPECIFIER ARE DIFFERENT FROM THE ELEMENTS OF THE MULTIPLE-@ FORMAT AND THUS THE SEMANTICS OF THE TWO FIELDS DIFFER AS WELL. MIXING THE TWO FORMATS MIGHT CAUSE CONSIDERABLE TROUBLE AT A SITE TRYING TO FIND DOMAIN NAMES, FOR EXAMPLE. JON POSTEL OR DAVE CROCKER ARE PROBABLY IN A BETTER POSITION TO ARTICULATE THIS THAN I AM - AND PERHAPS I AM WRONG, IN WHICH CASE I WOULD BE HAPPY TO HAVE A LITTLE TUTORIAL ON THE SUBJECT. PEACE, VINT CERF  Date: 26 Sep 1982 14:22:03-PDT From: mo at LBL-UNIX (Mike O'Dell [system]) To: CERF at USC-ISI cc: Header-People at MIT-MC, POSTEL at ISIF, DCROCKER at UDEL-RELAY, v.wales at UCLA-SECURITY Subject: Re: multiple at-signs in addresses -- why not? In-reply-to: Your message of 26 Sep 1982 1316-PDT (Sunday). <[USC-ISI]26-Sep-82 11:53:10.CERF> There are several problems here, I will try to articulate some of them as I perceive them, as a person currently living with a lot of chewing gum and baling wire holding 3 mail systems together. 1) When there is more than one separator character in an address, you MUST know the precedence and the "handedness" of associativity; i.e., to the right or left. 2) Separator characters must be chosen with much care, because there are already many in use with various degrees of sacredness and immutability. 3) In any event, there MUST be semantic consistency - you can't say "interchange periods and at-signs, but don't expect it to work"!!! That is utterly amazing, not to mention violating (2). Let me propose a few ad hoc rules which might get us over the hump; I make no claim of perfection, only some success in using the scheme. 
Form of an address: (using new words to avoid overloading!) <left-part>@<right-part> with the following rules. <right-part> is the longest string, FROM THE RIGHT, containing NO at-signs. <left-part> is everything else, minus the at-sign. It is explicitly allowed to contain at-signs. <right-part> is the destination to which delivery should be attempted, supplying <left-part> as the "user name" in the mail transport protocol (NCP/FTP, for instance). This gives the current 733-inspired model allowing multiple atsigns with the explicit statement of precedence. What does this allow? It provides for user.localnethost@arpanethost or "Firstname Lastname"@arpanethost or person%localnethost@arpanethost or person@localnethost@arpanethost without doing discriminatory violence to any particular system. What about "domains"? They are covered too: If the delivery agent understands arpanethost.ARPA then person@arpanethost.ARPA is fine. As for the difference in semantics between components of structured domain names and multi-atsign routing, there may or may not be a difference. The interpretation is domain-specific and also possibly host-specific. As an example: postel@F.ISI.ARPA If your host table were so organized, that could be a flat address which maps directly to the host. But what if the example were mo@mo-sun.lbl-unix.ARPA It isn't reasonable for all the machines in the world to know about the SUN sitting on my desk, so what do we do? The name server doesn't help much, because then it needs to know all the machines in the world. I believe the problem at this level is much like the gateway problem at the IP level - we need ICMP messages!! I propose the following algorithm for a mail delivery agent when presented with an address like mo@mo-sun.lbl-unix.ARPA 1) Try to map the whole name; if that fails, drop the left-most component (up to a period) and try to remap. [N.B. - "try to map the name" might involve hitting the name server, but it needn't always.] 
When the name is reduced to one component and the domain name itself, [ foo.bar ] always try the name server IFF you know one for the specified domain!! If you don't know a server, you are stuck and you return the message. 2) Once you have a mapping, send the message. One of two things will happen - 1) You mapped the entire right-part and will deliver to the correct (i.e., the specified) host 2) You will have mapped only some right-substring of the domain and will be connecting to a "domain gateway". Domain gateways always perform an implicit forward if necessary, but when they do, they send a "routing redirect" message to the originating mail system, telling it of a better route (ideally, containing a real internet address for the ultimate destination). This way, the originating host can update its mapping cache if it wishes. In this scheme, it would be quite useful if ALL hosts in a domain would function as a forwarding gateway to improve the chances of getting mail to its destination, but they need not if they don't want to. While a particular host wouldn't like to forward mail all the time, the redirect messages would hopefully cause frequent mail to follow more direct paths. The name servers in domains form the top-level gateways in a sense, and I propose that name servers always forward mail. That way they can function as mapping servers, or forwarding servers. The mechanisms in SMTP provide sufficient mechanism for this implicit forwarding to work and be recorded such that replies will work, at least as far as I can tell (assuming they use the ). It will result in hierarchical routing unless the reply agent does path calculus based on external knowledge, but if it gets it back, maybe that is better than nothing.... Gee, this is longer than I ever intended, but this is what I have been thinking about rather hard. ANY insights are quite welcome! 
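[Archive note: the address split and the "drop the left-most component" mapping that Mike O'Dell describes above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not his implementation. The names split_address and map_right_part, the use of plain dictionaries to stand in for a host table and for name servers, and the sample table entry are all hypothetical. The redirect messages he proposes are not modeled.]

```python
def split_address(address: str):
    """Split at the RIGHTMOST '@'. The right part contains no at-signs;
    the left part is everything else and may itself contain at-signs."""
    left, at_sign, right = address.rpartition("@")
    if not at_sign:
        raise ValueError("no at-sign in address")
    return left, right


def map_right_part(right_part: str, host_table: dict, name_servers: dict):
    """Try the whole name, then drop left-most components one at a time.

    host_table maps names to delivery targets (assumed local knowledge).
    name_servers maps a domain to a dict standing in for that domain's
    name server. Returns (matched_name, target), or None when the
    message would have to be returned to the sender.
    """
    components = right_part.split(".")
    while len(components) > 2:
        candidate = ".".join(components)
        if candidate in host_table:
            return candidate, host_table[candidate]
        components = components[1:]   # drop the left-most component
    candidate = ".".join(components)
    if candidate in host_table:
        return candidate, host_table[candidate]
    # Down to 'foo.bar' (or a bare name): ask a name server only if one
    # is known for the domain; otherwise we are stuck.
    server = name_servers.get(components[-1])
    if server is None:
        return None
    target = server.get(candidate)
    return (candidate, target) if target is not None else None


if __name__ == "__main__":
    left, right = split_address("mo@mo-sun.lbl-unix.ARPA")
    table = {"lbl-unix.ARPA": "domain gateway (hypothetical entry)"}
    print(left, map_right_part(right, table, {}))
```

When the match is shorter than the full right part, as in the usage line, the target corresponds to what the message calls a "domain gateway" rather than the final host.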
-Mike  Date: 26-Sep-82 21:27:36-EDT (Sun) From: UCBVAX:cbosgd!mark@Berkeley (Mark Horton) Subject: mo's mail routing algorithm Message-Id: <8208270127.8228@UCBVAX.BERKELEY.ARPA> Received: by UCBVAX.BERKELEY.ARPA (3.207 [9/26/82]) id A08228; 26-Sep-82 18:27:33-PDT (Sun) Via: cbosgd.uucp (V3.94 [3/6/82]); 26-Sep-82 21:27:37-EDT (Sun) To: header-people@mit-mc Gee, it's nice to see something constructive come out of all this flaming! Mike's algorithm looks like something we can really use for the time being. One minor correction: When the name is reduced to one component and the domain name itself, [ foo.bar ] always try the name server IFF you know one for the specified domain!! If you don't know a server, you are stuck and you return the message. Actually, giving up is only necessary if you don't recognize the top level domain [ bar ]. You can't assume there is a name server for all domains. Thus, if you are sending mail to mark@cbosgd.uucp, you won't get a name server for uucp (there is no such thing), but instead the correct thing to do is to figure out a gateway/forwarder for that domain (in the case of uucp, it would be Berkeley [for now]) and pass the mail on to the gateway for forwarding. (E.g. connect to Berkeley and tell it to mail to mark@cbosgd.uucp). The gateway will forward the mail after it is sent to the gateway (e.g. not in real time - your mailer shouldn't hang around to wait for final delivery.) Hopefully all the name servers will be able to identify the gateway for each top level domain. If not, the locations of the gateways could be (ugh) hardwired into the mailers. But it would be better to use name servers anyway, since the name server might decide from the host information (cbosgd.uucp) that a better gateway would be seismo.arpa, or udel.arpa, rather than berkeley.arpa. 
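[Archive note: Mark Horton's correction above amounts to a fallback order: use a name server if one exists for the top-level domain, otherwise hand the mail to a known gateway for that domain, and give up only when the top-level domain is not recognized at all. A minimal Python sketch of that decision follows. The function name choose_next_hop, the dictionary-based tables, and the table contents in the usage line are assumptions for illustration.]

```python
def choose_next_hop(domain_name: str, name_servers: dict, gateways: dict):
    """Hypothetical helper returning (action, target) for a domain name.

    name_servers and gateways are keyed by lower-cased top-level domain.
    """
    top_level = domain_name.rsplit(".", 1)[-1].lower()
    if top_level in name_servers:
        return "name-server", name_servers[top_level]
    if top_level in gateways:
        # No name server (as with uucp in the message): pass the mail
        # to the gateway and let it forward later, not in real time.
        return "gateway", gateways[top_level]
    return "return-to-sender", None


if __name__ == "__main__":
    print(choose_next_hop("cbosgd.uucp", {}, {"uucp": "Berkeley"}))
```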
Mark  Date: 25 September 1982 22:10 edt From: Schauble.Multics at MIT-MULTICS Subject: MsgGroup vs Header-People To: Header-People at MIT-MC, MsgGroup at BRL May I suggest, then, that it is probably almost never correct to send a message to both Header-People and MsgGroup. What is better, if the discussion moves into territory covered by the other list is to move it, and provide a pointer to the archives for those interested in picking up the threads. I would like to keep the traffic unduplicated. Paul  Date: 26 Sep 1982 2148-PDT Sender: STEF at DARCOM-KA Subject: Re: MsgGroup vs Header-People From: STEF at DARCOM-KA To: Schauble at MIT-MULTICS Cc: Header-People at MIT-MC, MsgGroup at BRL Message-ID: <[DARCOM-KA]26-Sep-82 21:48:02.STEF> In-Reply-To: Your message of 25 September 1982 22:10 edt Hi Paul - Were the world perfect, with the boundary between Msggroup and Header-People as clear as black and white, and if everyone who ever contributed knew the distinction perfectly, then we might be blessed with no need to ever deal with a duplicate message. In the meantime, I would be happy to see it minimized, as best we can, without inhibiting the flow of ideas. To me, it is the flow of ideas that is paramount. Minimizing my use of the delete command can never compare as an objective function. But, the recent overlap was beyond reason, and was threatening the flow of ideas. Perhaps I should have realized it and stepped in sooner. Some things are only visible with hindsight, I fear. Cheers - Stef  Date: 27 September 1982 05:20-EDT From: David A. Moon Subject: @ vs % vs . vs # To: HEADER-PEOPLE at MIT-MC I have received approximately 100 messages on this topic. Since I cannot possibly spend the time to sort all this garbage out, and I do not even work for MIT any more and only maintain the mail system now and then in my spare time, I intend to do nothing about this. 
If someone sends me specific instructions as to what is the right thing, I will follow them (unless several people do this and they aren't consistent and it isn't obvious who is right.)  Date: 27 Sep 1982 0626-PDT From: Mark Crispin Subject: Re: @ vs % vs . vs # Sender: ADMIN.MRC at SU-SCORE To: MOON at MIT-MC cc: HEADER-PEOPLE at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In-Reply-To: Your message of 27-Sep-82 0220-PDT David - The absolute right thing is to implement SMTP with domain addresses. See the relevant RFC's on the topic. Given that this is a lot of work, an acceptable partial solution is to change all places where the I.T.S. mailer generates multiple at addresses to use "%" in all positions but the very right-most at. For example, an address referencing a Chaosnet site should be of the form "MRC%MIT-EE at MIT-MC" instead of "MRC at MIT-EE at MIT-MC". Additionally, whichever process on I.T.S. separates local-part from host address in incoming network mail (note this only happens with multiple at addresses) should allow "%" as a host delimiter as well as "@". On TOPS-20/Tenex, it is the FTP server which performs this task; I guess it's COMSAT on I.T.S. Of course, this has the problem of disallowing "%" as a legitimate character in a mailbox name on I.T.S. I don't think this is so serious as all the flaming mail we've been getting. I am working on an appropriate technical solution for the TOPS-20 mailer, and subsequently it will be portable to the Tenex mailer. -- Mark -- -------  Mail-From: ARPANET host Utah-20 received by CMU-10A at 25-Sep-82 00:01:36-EDT Date: 24 Sep 1982 2200-MDT From: Jay Lepreau Subject: Re: Programs to be changed To: Rick.Gumpertz at CMU-10A In-Reply-To: Your message of 23-Sep-82 1736-MDT Remailed-To: Header-People at MIT-MC Remailed-From: Richard H. 
Gumpertz Remailed-Date: Monday, 27 September 1982 1158-EDT We would have to change at least 2, maybe more. They are gross hacks that I hope to avoid messing much with till we get Berkeley's sendmail and move to the domain stuff... -------  Date: 27 Sep 82 12:19-PDT From: zsu at SRI-TSC To: Header-People at Mit-Mc CC: ZSu at SRI-TSC Subject: Re: multiple at-signs I believe one major reason for not having multiple at-signs is to prevent relative routing. If multiple at-signs were allowed, there would almost be nothing in the way to prevent the Internet naming convention degenerated into allowing relative routing. The discussion given in Vint's recent message has made that clear. Source routing is explicitly provided by "Relaying" in RFC821, the SMTP specification. However, we try to emphasize in RFC819 that only absolute routing should be allowed for relaying, that is the specification for each relay in the route must be universally interpretable. Cheers, Zaw-Sing  Date: 27 September 1982 2302-PDT (Monday) From: v.wales at UCLA-Security (Rich Wales) Subject: multiple at-signs in addresses -- a clarification To: Header-People at MIT-MC A couple of days ago, I proposed the following: After the FIRST at-sign in an address, periods and at-signs are equivalent and may be used interchangeably. A few people have raised the following objections to this proposal: (1) Mixing periods and at-signs as delimiters would create ambiguity because mailers wouldn't know which direction to parse in. (2) The semantics of "relative" routing, as is currently done by some sites via multiple at-signs, clashes with the "absolute" routing principle of Internet addressing as defined in RFC's 819 and 822. Either I didn't explain myself adequately, or else I am suffering from a fundamental misunderstanding of the whole Internet addressing idea. If the latter is true, I suspect I am probably not alone. 
Therefore, let me explain my position further -- and if it is obvious that I am missing some critical point, will someone please enlighten me and everyone else. First of all, my intent was that an address with multiple at-signs be treated EXACTLY as if all at-signs (after the first one) were changed to periods. Stated another way, everything after the first at-sign would still be treated as the "domain" string. Hence, the ambiguities we are all familiar with regarding periods as delimiters in "local parts" simply is not relevant to my suggestion. Second, I am assuming here that the sites which want to continue using multiple at-signs in their addresses will set up their Internet-domain name spaces so that the site names they are already using will be valid subdomain names as well. In other words, a current address like "Fred at Flaky-VAX at Podunk" -- which, under RFC819/822 might become "Fred@Flaky-VAX.Podunk.ARPA -- could, according to my suggestion, also be written in the form "Fred@Flaky-VAX@Podunk.ARPA". Even though I propose allowing this latter form, please keep in mind that I am still thinking of it as having a local-part equal to "Fred" and a domain equal to "Flaky-VAX@Podunk.ARPA" (or, if you wish, "Flaky-VAX.Podunk.ARPA"). While I do concede that the temptation exists for people to interpret the address as a local-part equal to "Fred@Flaky-VAX" and a domain of "Podunk.ARPA", I assert that this is purely a psychological objection, and not a valid technical one. -- Rich  Date: 27 September 1982 2319-PDT (Monday) From: v.wales at UCLA-Security (Rich Wales) Subject: another thought on multiple at's and the TOPS-20 mail system To: Header-People at MIT-MC I just thought of something else. Mark Crispin mentioned a while ago that the TOPS-20 mail system fixes up an address like "FOO at SCRC-TENEX" via a request to expand "SCRC-TENEX" into something global (namely, "SCRC-TENEX at MIT-MC"). 
Well, if the TOPS-20 software expanded "SCRC-TENEX" instead to "SCRC-TENEX.MIT-MC.ARPA" (or whatever the Internet domain really ends up being), then the whole problem would be solved. The solution then becomes not one of trying to patch up the local part by eliminating the first "at" (or at-sign), but rather substituting a "global", fully qualified domain to the right of the at-sign. -- Except, of course, that no one's mailer understands Internet domains yet. -- Rich  Date: 27 Sep 1982 23:48:40-PDT From: mo at LBL-UNIX (Mike O'Dell [system]) To: v.wales at UCLA-Security cc: Header-People at MIT-MC Subject: Re: multiple at-signs in addresses -- a clarification In-reply-to: Your message of 27 Sep 1982 2320-PDT (Monday). I understand your scheme much better now. If multiple atsigns are allowed, you still have to know how they associate - left or right. Your proposed scheme seems to be based on an implied duality between routed addresses and embedded domains. This is sometimes quite valid, but only within restricted sub-domains - those which wish to interpret subdomain nesting as implied routing. It is quite destination-specific. However, from the composer's point of view, without considerable specific information, it isn't possible to know whether the duality is valid. I really believe you have to come "outer-level in" like I proposed. The form you propose is certainly covered by my proposal, I just don't wish to make GENERAL statements about the route-subdomain duality. There will be domains which have true flat address spaces, the number of periods to the right of the "first" atsign notwithstanding (I assume you mean "first from the left"!), and allowing some random piece of software along the way to transform dots to atsigns or vice versa will only make matters more difficult. 
We are in complete agreement as to the necessity of admitting the address forms you proposed, but I am not willing to allow arbitrary agents freedom to reformat the address based on assumptions which are not necessarily true! Hope this all continues to make sense, -Mike O'Dell  Date: 28 Sep 1982 0553-PDT From: Mark Crispin Subject: Re: another thought on multiple at's and the TOPS-20 mail system Sender: ADMIN.MRC at SU-SCORE To: v.wales at UCLA-SECURITY cc: Header-People at MIT-MC Reply-To: Admin.MRC at SU-SCORE Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In-Reply-To: Your message of 27-Sep-82 2319-PDT Indeed. Your last line: "no one's mailer understands Internet domains yet" says it all. I was at the meeting which created the domain standard, and left very pleased indeed, as domains were precisely the thing that would be easiest to implement in the TOPS-20 mailsystem. I'm afraid, though, that if I were to unilaterally start sending domain-style mail out on ARPANET that my mailbox would be flooded with hundreds of messages protesting the breaking of other sites' mailsystems. -- Mark -- -------  Date: 2 October 1982 21:06-EDT From: David A. Moon Subject: Administration To: HEADER-PEOPLE at MIT-MC Please send future requests for additions or deletions to the list to Header-People-Request @ MIT-MC, not to me.  Date: 5 Oct 1982 16:39 PDT From: BollenG.ES at PARC-MAXC Subject: Re: Administration In-reply-to: MOON's message of 2 October 1982 21:06-EDT To: HEADER-PEOPLE at MIT-MC Please delete me from the list. Thanks.  Date: 11 October 1982 1215-PDT (Monday) From: v.wales at UCLA-Security (Rich Wales) Subject: Internet addressing and UUCP To: Header-People at MIT-MC I see some problems regarding how to incorporate UUCP addresses cleanly into the RFC822-style domain-addressing scheme, and I would like some suggestions or discussion on the matter. 
The big problem I envision concerns how to handle multiple-hop UUCP addresses. Suppose, just to give a concrete example, that I want to send a message to Spencer Thomas at the University of Utah. Currently, I could do this by sending the mail to the following address: decvax!harpo!utah-cs!utah-gr!thomas at Berkeley Once RFC822 becomes official in January, of course, the ARPANET hop in the above address will change syntax, and I will have to write decvax!harpo!utah-cs!utah-gr!thomas@Berkeley.ARPA Now, assuming that Berkeley has a node "UUCP" in their addressing domain, I could go one step further and write harpo!utah-cs!utah-gr!thomas@decvax.UUCP.Berkeley.ARPA In fact, I suppose I could even go "all the way" and use an address like the following: thomas@decvax!harpo!utah-cs!utah-gr.UUCP.Berkeley.ARPA Supposedly, once Berkeley got its hands on my message, their mailer would enqueue it via UUCP to "decvax", telling the latter machine to send my mail to "harpo!utah-cs!utah-gr!thomas". So far, so good. But now suppose that Utah decides to put a spiffy domain mechanism into their mail software (as they would be allowed -- nay, encouraged -- to do under RFC822). Then, instead of "utah-gr!thomas", I might use something like "thomas@GR", making the full address of my message into the following: thomas@GR.decvax!harpo!utah-cs.UUCP.Berkeley.ARPA In this case, once my message arrived at Berkeley, it would have to be sent off via UUCP to "decvax". The problem is that the address which "decvax" would try to send the message along to would be harpo!utah-cs!thomas@GR And now we end up right back at the age-old controversy: which comes first, the "!" or the "@"? Clearly, in this case I want "decvax" to send my mail via UUCP to "harpo" with the address "utah-cs!thomas@GR", then for "harpo" to send it again via UUCP to "utah-cs" with the address "thomas@GR". 
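[Archive note: the "!" versus "@" question raised above comes down to which separator a mailer splits on first. The Python sketch below shows both readings of the mixed address from the message. It is an illustration only; the function names split_at_first and split_bang_first are assumptions, and neither function is taken from any UUCP or ARPANET mailer.]

```python
def split_at_first(address: str):
    """'@' takes precedence: next hop is the host after the last '@'.
    Returns (next_hop, remaining_address)."""
    remaining, at_sign, host = address.rpartition("@")
    if not at_sign:
        return None, address
    return host, remaining


def split_bang_first(address: str):
    """'!' takes precedence: next hop is the host before the first '!'.
    Returns (next_hop, remaining_address)."""
    host, bang, remaining = address.partition("!")
    if not bang:
        return None, address
    return host, remaining


if __name__ == "__main__":
    mixed = "harpo!utah-cs!thomas@GR"
    print(split_at_first(mixed))     # mail would head for GR
    print(split_bang_first(mixed))   # mail would head for harpo
```

The second reading is the one the message says is wanted on the UUCP leg: "harpo" receives "utah-cs!thomas@GR" and repeats the same split.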
The problem is that lots of mail programs out there (including our own mail handler here at UCLA) take the "@" first in such mixed-network addresses. If our mailer, for instance, were told to send something to "harpo!utah-cs!thomas@GR", it would split the address at the "@" and try to send to "GR" rather than "harpo". You can't simply say that from now on, all mailers must split all mixed addresses at the "!" rather than at the "@", because this would close off many necessary mail paths which end in one or more UUCP hops. Whatever we might like to see happen, I don't think we can safely hold our breaths until all UUCP mailers everywhere speak fluent RFC822, so we'll have to continue to allow people to write addresses with "!" in the local-part. Some people might suggest introducing quotes or backslashes into the address passed to UUCP in order to resolve the ambiguity. I am hesitant to do this, though. Not only do I have no guarantee that other UUCP mailers will know what to do with quotes and backslashes, but the last paragraph of section 3.4.3 in RFC822 says that these characters should not be passed along to lower-level mailers anyway. Assuming that everyone in Internet-land implemented a domain named "UUCP", I could of course use something like thomas@GR.utah-cs.UUCP.harpo.UUCP.decvax.UUCP.Berkeley.ARPA I don't think I would be the only person, however, to rebel at having to write this mile-long monstrosity instead of the current "!" syntax. Or, if everyone were to implement the UUCP site names of sites they converse with directly at the top levels of their respective naming hierarchies, then I could drop all the "UUCP"s from the above and end up with the following much-more-reasonable address to send to: thomas@GR.utah-cs.harpo.decvax.Berkeley.ARPA However, it isn't clear that every community in the Internet world would be willing to let other sites dictate its choice of top-level domain names. 
(Of course, a site could implement an internal mapping of domain names to UUCP site names.) I could go into further detail on this whole issue, but I hope I've said enough already to indicate the source of the problem (or the source of my confusion). It seems to me that some universal solution needs to be worked out fairly soon; if it already has been, I would certainly like to know what it is so I can start implementing it here. -- Rich Wales (Wales at UCLA-Security)  Date: 18 Oct 1982 18:59:52-EST From: Chris Kent Reply-to: cak at Purdue To: header-people@mc Subject: RFC822 Cc: dcrocker@udel-relay, donn@uwisc, jtk@purdue Hello... I'm new to the list (it's not even clear that I've been added yet), so perhaps this point has been discussed before and I don't know about it. I recently got embroiled in a discussion about some of the headers that our mailer sends... we build a From: field that looks like From: Alfred Newman This is clearly OK under RFC822. However, some people have middle initials in the name database, so it would come out From: Alfred E. Newman which some people contend is incorrect, because there is white space around the '.' in the field. I spent some time reading the spec last night, and it isn't really clear; since there's something inside <>, the spec seems to say that everything seen up till then should be *ignored*; but other people have said that the '.' is confusing because the stuff out there is part of the address. I'm looking for an interpretation of the standard; I can't decide what was intended, and hope that some of you folks have thought about this more. If not, we need to think about it. Cheers, chris  Date: 27 Oct 1982 0757-EDT From: Andrew Scott Beals Subject: Re: SOMEBODY'S MUNGING HEADERS! To: dcrocker at UDEL-RELAY cc: header-people at MIT-MC In-Reply-To: Your message of 27-Oct-82 0140-EDT the problem here is that due to the fact that on ITS (and Twenex) dots in login names are perfectly legal, a monstrous ambiguity arises. 
suppose I was to get the login name zzz.oz on mc. would mail sent thru a header-munging host eventually getting to mc finally get received by "zzz.oz" at mc, or by "zzz" at oz? (zzz is a real person at oz) not to bring up an old subject, but who decided that double at's were illegal? personally, I think that two routing characters are really brain-damaged. -andy -------  Date: 27 Oct 1982 10:11:16-EST From: Tim Korb Reply-to: jtk@Purdue To: cak@Purdue cc: dcrocker@udel-relay, donn@uwisc, jtk@purdue, header-people@mc Subject: Re: RFC822 In-reply-to: Your message of 18 Oct 1982 1859-EST (Monday). Chris, After rereading RFC822 (specifically, the definitions of the syntactic units specials, atom, phrase, mailbox, and route-addr on pages 10, 11, and 27), I have to agree with Donn. The address Alfred E. Newman is not valid, because of the presence of a special character (".") in the phrase part (sequence of words) of a mailbox. However, I consider this a deficiency in the specification. Of what value is it to disallow such an address? I can think of two reasons: historical precedent ("there are just too many mailers out there that won't understand it...") and implementation difficulties ("it's too hard to parse"). The first reason is better than the second, but, given the nature of this problem, neither is particularly convincing. It's not clear that putting the name in quotes solves the problem. "Alfred E. Newman" The specification seems to imply that quoted strings are not comments, that they are a subordinate part of the address (aside: could the CSNET nameserver use this field for the CSNET-ID of the addressee?). I suggest we ignore the problem (and continue to allow phrases with dots in them) until this issue is resolved. Blue-sky suggestion for the next round of message standards: expand and standardize the contents of the envelope, not the human-readable text. 
Let user front ends translate their own definition of the header into the standard envelope when sending a message and reverse the process (generate a header from the standard envelope) when receiving a message. Tim  Date: 27 Oct 82 11:46:53-EDT (Wed) From: Dave Crocker To: jtk at Purdue cc: cak at Purdue, dcrocker at UDel-Relay, donn at Csnet-Sh, jtk at Purdue, header-people at Mit-Mc Subject: Re: RFC822 Tim -- A E. DuPont is entirely legal. However, after having been parsed, its reassembled form will be: A E.DuPont Quoting the string will preserve its spacing. The reason for having dot be special everywhere is simply so that the standard's lexical analyzer is globally specified, rather than requiring context sensitive lexical analysis. Dave P.S. The transformation approach by "front end" systems, is exactly what is being done by several existing transport services. It seems to be the only way to interface heterogenous networks, unless and until everyone adopts the same standard (no chance of that, I suspect). D/  Date: 1 Dec 82 10:26:24-EST (Wed) From: Dave Crocker To: list; at udel-relay Subject: Transition Ten years ago, the ArpaNet had its coming out party at the first International Conference on Computers and Communications, held in Washington, D.C. As part of my first full-time job, I served as a floor-walker during the demonstration, oblivious to the historical import of the event. The job was simply an interesting way to fill some time during an hiatus in my education. As such things go, it turned out to be the start of my professional life. Over the course of that life, I have had close contact with people on the net, while at UCLA, USC, Rand, and most recently at the University of Delaware. It has been an extraordinary learning and growing experience. In particular, I have appreciated people's openness to sharing and discussing their ideas. Even the occasional verbal flaming has not detracted. 
We focus on questions of system development; however, to the extent that we also are studying this technology's impact on the way people communicate, the flaming has been useful. Besides, families are expected to have squabbles. Consequently, it has been very difficult for me to decide to move in a new direction. After quite a lot of consideration, I have taken a position with MCI Communications in Washington, D.C., working in their Corporate Development Department with Vint Cerf. My last effective day at UDel will be 21 December, and I start at MCI on 5 January. The new UDel contact for CSNet operations questions is Brendan Reilly (reilly@udel-relay), with Dave Farber (farber@udel-relay) continuing as principal investigator. Since some of the Relay's tasks are handled by our distributed staff, the best address for general queries is mmdf@udel-relay.

Dave

From: Tim Finin
Subject: automatic mail sorters
To: msg-group at Mit-Mc, header-people at Mit-Mc
Date: 29 Jan 1983 21:57-EST
Via: UPenn; 30 Jan 83 3:41-EST

(if these mailing lists no longer exist and someone is reading this - please let me know. Thanks. TWF)

I'm working on a project involving user-constructed mail "filters". What we would like to build is a rule-driven expert system which will ORDER one's electronic mail on the basis of the message attributes. We don't want to actually filter out unimportant messages, just rank the current (or incoming) ones. The kinds of attributes we are imagining using are things like:

- has the message been read?, header seen?, answered?
- age of message
- sender's identity (e.g. RPG@SAIL), address (BBNA), local vs. network
- keywords in subject field and message body
- apparent "type" of message body (e.g. pascal code, lisp code, ...)
- manner we became a recipient (e.g. only addressee, one of several addressees, as a member of a mailing list, a carbon copy, forwarded...)
- size of message
- etc.
We expect to rank messages along several dimensions, such as INTEREST, URGENCY and IMPORTANCE, and then have rules which combine these rankings to produce an overall ordering of the messages. A crucial aspect to this project would be to provide an environment in which it would be easy for the USER to examine, understand, specify and modify the rules which drive the system. We are aiming for a class of users which includes those technically oriented but having no programming knowledge or experience. We might, for example, allow rules like:

    if the sender is TIM.UPENN@UDEL                           ; this fellow sends very
    then INTEREST is VERY LIKELY to be HIGH                   ; interesting messages.

    if the RECIPIENT is a MAILING LIST                        ; if the mail is not personal
    then URGENCY is LIKELY to be LOW                          ; then it's prob. not urgent.

    if the source is LOCAL                                    ; local (non network) mail has
    then the IMPORTANCE MAY be HIGH                           ; many important messages.

    if the SIZE is > 200 LINES or the TYPE is PASCAL          ; very big messages and
    then the URGENCY MAY be LOW                               ; programs aren't urgent.

    if the URGENCY is > MEDIUM and the IMPORTANCE is NOT LOW  ; rank from other
    then the RANK is VERY HIGH                                ; measures

I know that there has been a fair amount of work in the area of automatic mail filters, routers and the like. I'm interested in getting pointers to people, projects and relevant publications. I'd also like to talk to people who have to deal with a large number of incoming messages (e.g. > 20). I would like to know how they manage the task of reading their mail (old and new) already and what features they would like to see in a mail-sorting expert system. I would greatly appreciate any information, advice or ideas you could give me.

Thanks, Tim Finin

Date: 21 Feb 1983 0110-PST
From: Stuart M. Cracraft
Subject: a difficult message to respond to
To: header-people at MIT-MC

Received: from MIT-MC by SRI-CSL at 20-Feb-83 1724-PST
Date: Sun Feb 20 16:50:52 1983
Subject: convention listing
Message-Id: <8302210031.29950@UCBVAX.ARPA>
Received: by UCBVAX.ARPA (3.314/3.5) id AA29950; 20 Feb 83 16:31:59 PST (Sun)
To: sf-lovers-request@MIT-MC

Could you Please send me a listing of conventions.

Thank you, David Kushner

P.S. Please put the name David Kushner or DMK in the subject heading as this is a joint account. -thanks
-------

Date: Tue, 22 Feb 1983 19:32:21 PST
From: Rich Wales
To: Header-People@MIT-MC
CC: Jon Postel
Subject: "From:" lines in UUCP mail??

Some UUCP sites (most notably Berkeley) have recently started putting "From:" lines in the headers of UUCP mail. According to a discussion I recently had with the mail guru at Berkeley, the plan seems to be that each site should prepend its name to the address in the "From:" line as it passes the mail on to the next site in the chain. While I sympathize with those who feel UUCP "From" lines look icky, I am not at ALL enthusiastic about the idea of picking apart an arbitrary "From:" line in order to add our UUCP site name at the right place.
I am even less enthusiastic about assuming that umpteen zillion sites out there in UUCP-land will all independently (and correctly!) modify their mailers to do this new trick as well. What does the community in general think about this one? I think it really needs to get thrashed out before the concrete hardens.

-- Rich Wales } (take your pick; } they should all } be synonymous)

Date: 7 April 1983 22:53 EST
From: Ken Harrenstien
Subject: Recent logjam
To: HEADER-PEOPLE @ MIT-MC

I found that a few hosts which died or went away during the TCP transition managed to prevent all header-people mail from going out. I have removed the baddies, and unqueued the piled-up messages (which you should have just received), and fixed things so that such problems will not gum the works again. Thanks to Rich Wales for poking Header-People-Request about it.

Date: 8 Apr 1983 11:40:14-EST
From: Christopher A Kent
Reply-to: cak@purdue
To: bbn-tcp@bbn-unix, bugs@bbn-unix
Cc: header-people@mit-mc
Subject: RFC822 and " at "

It sure would be nice if BBN would move into the present and not send mail headers with "foo at BBN-UNIX", which is expressly forbidden in RFC822. They should be "foo@BBN-UNIX". My mail reader program no longer accepts the first form as a single address; rather it considers it to be three distinct recipients on the local machine, none of which exist (usually).
Cheers, chris  Date: Fri, 8 Apr 83 10:30:26 PST From: Rich Wales To: Header-People@MIT-MC CC: bbn-tcp@BBN-UNIX, bugs@BBN-UNIX, cak@Purdue Subject: cak@Purdue's message about RFC822 and "at" It should be noted that RFC822 permits spaces in recipient names and requires that multiple recipients be separated by commas, NOT spaces. I understand that there are places (e.g., CMU and MIT-MULTICS) which use spaces in their mailbox names. Hence, for example, if a message bears a return address of "John Doe@CMUA", this is a valid address under RFC822 and must NOT be split into two pieces at the space. Further, RFC822 permits spaces around punctuation marks such as "@", so "foo @ BBN-UNIX" is just as legal as "foo@BBN-UNIX". I would suggest the following strategy (which is what we do here in our own mail system): (1) Force people to use commas instead of spaces to separate multiple recipients. You may run into user resistance at first (old habits are hard to break), but I predict you will come to grief in the long run unless you follow the standard quite strictly in this respect. (2) Put a temporary hack in your mail system to treat the word "at" as equivalent to the symbol "@". While it is true that RFC822 has officially abolished "at", in actual practice it is liable to be a while yet before everyone out there stops using it. -- Rich  Date: Fri, 8 Apr 83 13:53:35 PST From: Rich Wales To: Header-People@MIT-MC CC: Karlton.PA@PARC-MAXC CC: bbn-tcp@BBN-UNIX, bugs@BBN-UNIX, cak@Purdue Subject: correction of misstatement about RFC822 It was just called to my attention that RFC822 in fact does NOT permit spaces in a mailbox name unless the name is a quoted string. My apologies for having (incorrectly) stated otherwise. As far as I am aware, however, the other points I made still hold -- namely, you can have spaces around "@", and recipients are supposed to be separated by commas rather than spaces. 
Another reason for not allowing spaces to be used as inter-recipient separators, by the way, is that the following kind of notation -- Rich Wales becomes hopelessly ambiguous if you treat spaces as separators between recipients. -- Rich  Received: by YALE-BULLDOG via CHAOS; Fri, 8 Apr 83 23:03:20 EST Date: Fri, 8 Apr 83 23:04:16 EST From: Nathaniel Mishkin Subject: Re: cak@Purdue's message about RFC822 and "at" To: Rich Wales Cc: Header-People@MIT-MC.ARPA, bbn-tcp@BBN-UNIX.ARPA, cak@PURDUE.ARPA In-Reply-To: Rich Wales , 8 Apr 1983 10:30:26 PST (2) Put a temporary hack in your mail system to treat the word "at" as equivalent to the symbol "@". While it is true that RFC822 has officially abolished "at", in actual practice it is liable to be a while yet before everyone out there stops using it. "temporary": Senders will never change their mail systems to the new standard as long as recipients continue to "accept" the old syntax. If recipients do not accept the old syntax, senders will be forced to fix their mail systems or put up with continuing complaints from recipients. -- Nat -------  Return-Path: Date: 9 April 1983 02:27 est From: Schauble.HIS_Guest at MIT-MULTICS Subject: Mailing list update To: Header-People at MIT-MC Please add PCO-disty at MIT-Multics. thanks, Paul  Date: 9 Apr 1983 0822-EST From: Robert W. Kerns Subject: Re: cak@Purdue's message about RFC822 and "at" To: cak@PURDUE, v.wales@UCLA-SECURITY cc: header-people@MIT-MC, bbn-tcp@BBN-UNIX In-Reply-To: The message of Fri, 8 Apr 83 23:04:16 EST from Nathaniel Mishkin Date: Fri, 8 Apr 83 23:04:16 EST From: Nathaniel Mishkin (2) Put a temporary hack in your mail system to treat the word "at" as equivalent to the symbol "@". While it is true that RFC822 has officially abolished "at", in actual practice it is liable to be a while yet before everyone out there stops using it. 
"temporary": Senders will never change their mail systems to the new standard as long as recipients continue to "accept" the old syntax. If recipients do not accept the old syntax, senders will be forced to fix their mail systems or put up with continuing complaints from recipients. Experience, (some of it with other BBN software) shows that senders will put up with such continuing complaints for many years, and that USERS of intolerant mail systems are the ones to suffer. USERS of the intolerant but correct mail system end up resenting the implementor of their system when it behaves badly, and USERS of the technically incorrect but probably not actively maintained end up resenting the complaints that they are powerless to do anything about. In short, it would seem to me that your attitude will only make life harder for everyone and not speed implementation of the still-very-new standard at all. -------  Date: 09-Apr-83 17:26:26-UT From: Mills at DCN6 Subject: RWK's grouse about the "@" hack To: Header-People at MIT-MC cc: BBN-TCP at BBN-UNIX Robert, Your position is untenable, short-sighted and just plain wrong. While trying to answer your message my point was reinforced. SCRC-TENEX is not in the NIC data base, so ISID refused to cooperate in answering your message. There are no domain qualifiers in the headers (except for the "Received:" line stuffed by ISID). MIT-MC forgot to stuff a "Received:" line for SCRC-whatever in the header (although it did good with the "Return-path:" line). Clearly, I shouldn't bother to reply to such trash. As a USER critically dependent on interactions with my peers using our mail system, rickety as it is, I cannot afford loss of connectivity while the implementers squabble over transition issues. During the transition period, which will probably last forever, we must provide something like the hack suggested by Chris. 
I propose a much better salve for Robert's arrogance: Fiddle the mail systems involved on the Header-People list to call the Mail Police for any violation on a message with "Header-People@MIT-MC" in the "From:" field. That even would have the pleasant effect of filtering out the junk mail from that source. Just for fun, I am sending this from a "fuzzball" PC whose name is in the NIC tables, but not in most BBN data bases. You get one point if your mail system can answer it. You get two points if your mail system can send it (SMTP). You get three points if your TCP can hack the 1200-bps tiny pipe connecting me to the world. The fuzzball mail system does (of course) not yet conform completely to RFC-821/822.

Regards, Dave
-------

Date: 09-Apr-83 19:43:35-UT
From: Mills at DCN6
Subject: Bent finger-pointer
To: Header-People at MIT-MC
cc: BBN-TCP at BBN-UNIX

Folks, I got a kink in my finger joint. Jay Lepreau pointed out that my pointy-finger should land on Nat Mishkin, not Robert Kearns (to whom my apologies). I failed to notice Robert's message contained a quote of a portion of Nat's message which contained a quote of a portion of Rich Wales' message suggesting the hack in response to a bleat from Chris Kent, whom I mistakenly labelled as the suggester of the hack in the first place. Now that that's all straightened out, my brain is bent.

Regards, Dave
-------

Date: Sat, 9 Apr 83 05:07:48 CST
From: Paul.Milazzo
Return-Path:
Subject: Re: cak@Purdue's message about RFC822 and "at"
To: Nathaniel Mishkin
Cc: Header-People@MIT-MC.ARPA
Message-Id: <1983.04.09.0507.360.01323@rice>
In-Reply-To: Nathaniel Mishkin's message of Fri, 8 Apr 83 23:04:16 EST
Via: rice; 9 Apr 83 5:25-CDT
Via: Rice; 9 Apr 83 4:32-PDT

"If recipients do not accept the old syntax, senders will be forced to fix their mail systems or put up with continuing complaints from recipients."
You have what is perhaps a good idea, but unfortunately most "senders" are not mail gurus, and tend to become so very discouraged with failed mail that after a short while they simply give up. This effect has worsened since the ARPANET TCP conversion, because now many users are used to failures. Besides, sometimes it's not the sender's fault. For example, this message had a perfectly wonderful RFC822 header when it left my terminal. By the time it reaches you, it will be filled with "Via:" fields, meaningless return paths, and other things that someone at CSNet thinks are a good idea. Sigh... Paul Milazzo Dept. of Mathematical Sciences Rice University, Houston, TX  Date: 10 Apr 1983 0200-EST From: Robert W. Kerns Subject: Re: RWK's grouse about the "@" hack To: Header-People@MIT-MC cc: BBN-TCP@BBN-UNIX, RWK@SCRC-TENEX In-Reply-To: The message of 09-Apr-83 17:26:26-UT from Mills at DCN6 Return-path: Received: from MIT-MC by SCRC-TENEX with CHAOS; Sat 9-Apr-83 13:20:24-EST Date: 09-Apr-83 17:26:26-UT From: Mills at DCN6 Subject: RWK's grouse about the "@" hack To: Header-People at MIT-MC cc: BBN-TCP at BBN-UNIX Robert, Your position is untenable, short-sighted and just plain wrong. While trying to answer your message my point was reinforced. SCRC-TENEX is not in the NIC data base, so ISID refused to cooperate in answering your message. There are no domain qualifiers in the headers (except for the "Received:" line stuffed by ISID). MIT-MC forgot to stuff a "Received:" line for SCRC-whatever in the header (although it did good with the "Return-path:" line). Clearly, I shouldn't bother to reply to such trash. You must have misunderstood my point, since by replying you have provided an excellent example. 
Since I'm sending this from home via an extremely kludgy TENEX system running on kludgy hardware, etc., and I do not hack TENEX software, I am a poor user trapped in the middle, and the fact that ISID refused to cooperate only screwed YOU and caused you to flame at ME, and did nothing at all to fix either SCRC-TENEX or MIT-MC. As a USER critically dependent on interactions with my peers using our mail system, rickety as it is, I cannot afford loss of connectivity while the implementers squabble over transition issues. During the transition period, which will probably last forever, we must provide something like the hack suggested by Chris. "Hack suggested by Chris"? The message I replied to suggested no hack, it promoted as a feature ELIMINATING BACKWARD COMPATABILITY to force the rest of the world to comply more quickly. My point is this screws the users, like you, and like me on this system. I propose a much better salve for Robert's arrogance: Arrogance? If we're going to get personal here, I think you are being arrogant, the more so since you don't seem to have looked very closely at what I was saying. If anything, I was complaining about the arrogance of demanding perfection from everyone instantly without concern for the effect on the users when some sites don't comply. Fiddle the mail systems involved on the Header-People list to call the Mail Police for any violation on a message with "Header-People@MIT-MC" in the "From:" field. That even would have the pleasant effect of filtering out the junk mail from that source. Just for fun, I am sending this from a "fuzzball" PC whose name is in the NIC tables, but not in most BBN data bases. You get one point if your mail system can answer it. You get two points if your mail system can send it (SMTP). You get three points if your TCP can hack the 1200-bps tiny pipe connecting me to the world. The fuzzball mail system does (of course) not yet conform completely to RFC-821/822. Somebody sure didn't comply. 
I'm not even in the ARPA domain (which specification isn't complete yet anyway), and neither are you, but there is no domain information in your header. I couldn't route it to you even by hand. MC didn't record where it got it from when it forwarded it to SCRC-TENEX. So even if I were at work using our winning mail software, I would still lose.

********

In any event, it should be obvious, even to those of us who like to write flames without reading any, that when things don't conform to standards, users are inconvenienced. You seem to be attacking me for somehow saying this was acceptable, or shouldn't be fixed, or something. Rather, I was condemning someone MAKING THE SITUATION WORSE by deliberately introducing unneeded incompatibility with sites that have yet to convert. If you go back and read my original message again, and think about it, I think you'll agree your rather personal attack is quite inappropriate.
-------

Date: 10 Apr 1983 0211-EST
From: Robert W. Kerns
Subject: Re: Bent finger-pointer
To: Header-People@MIT-MC
cc: BBN-TCP@BBN-UNIX, RWK@SCRC-TENEX
In-Reply-To: The message of 09-Apr-83 19:43:35-UT from Mills at DCN6

OK, apology accepted. (Except you spelled my name wrong this time...) Of course, I have to add to the volume of mail on these lists since I can't reply to you directly. (I think RWK%SCRC-TENEX@MIT-MC will get to me, by the way).
-------

Date: 10-Apr-83 16:16:13-UT
From: Mills at dcn6
Subject: Score SCRC-TENEX 1, me 0
To: Header-People at MIT-MC

Robert Kerns, (My apologies to the rest of you, since apparently only MIT-MC knows how to get to SCRC-TENEX, and it isn't telling anyone else).
10-Apr-83 16:00:46,2637;000000000001
Return-path:
Received: from MIT-MC.ARPA by DCN6.ARPA ; 10-Apr-83 15:58:46-UT
Date: 10 April 1983 10:45 EST
From: Communications Satellite
Subject: Msg of Sunday, 10 April 1983 10:45 EST
To: Mills @ DCN6

============ A copy of your message is being returned, because: ============
"RWK%SCRC-TENEX" at MIT-MC is an unknown recipient.
============ Failed message follows: ============
Received: from DCN6.ARPA by DCN1.ARPA ; 10-Apr-83 15:33:41-UT
Date: 10-Apr-83 15:31:22-UT
From: Mills at dcn6
Subject: Mail bakeoff
To: Romine at SEISMO, OBrien at RAND-UNIX, MRC at SU-SCORE, Rick.Gumpertz at CMU-CS-A
To: Haynes.PA at PARC-MAXC.ARPA, knutsen at Sri-Unix, TIHOR.CMCL1 at NYU.ARPA
To: Schauble.HIS_Guest at MIT-MULTICS, RWK%SCRC-TENEX at MIT-MC
cc: Mills at dcn6

Robert, As I said, the pointy finger writes crooked headers. I tried hard at ISID to reply to your message, but couldn't get SNDMSG even to eat SRC-TENEX and replace with RWK%SRC-TENEX@MIT-MC, a trick which it will do for mail forwarded by itself into the UK goo. I am also fighting the world from home, but my ammunition is different caliber: an LSI-11 "fuzzball" that I can beat up if it misbehaves, so I have nobody to blame for its foo paws except myself. So far eight mailgrubbers have responded to my challenge to boogie with DCN6 and they have accumulated 30.5 points. Of course, I don't know about the others who drowned in the attempt. It would be an interesting exercise if we held a Vint Cerf Memorial Bakeoff, where we each swiped a copy of the Header-People list and tried to send a message to all recipients from our own mail systems. In response to several questions about fuzzballs, see DCNET.DOC in the MILLS directory at ISID. Following are the winners to date in the DCN6 bakeoff: I have awarded another point to each and four points to myself if my mailer apparently succeeded in the reply. Just to be feisty I am relaying this via another fuzzball (DCN1).
(Fuzzballs are already known to play the mail game with ISI TOPS-20s, BBN VAXen, BBN C/70s, 3-COM VAXen, Multices and other fuzzies scattered around the world.) I am saving the messages for later postmortem and will report to the Mail Police. Romine@SEISMO, 4 points OBrien@RAND-UNIX, 4 points MRC@SU-SCORE, 3.5 points Rick.Gumpertz@CMU-CS-A, 4 points Haynes.PA@PARC-MAXC.ARPA, 3 points (this message is the reply test) knutsen@Sri-Unix, 3 points (") TIHOR.CMCL1@NYU.ARPA, 3 points (") Schauble.HIS_Guest@MIT-MULTICS, 3 points (") RWK%SCRC-TENEX@MIT-MC, 3 points (") Regards, Dave ------- -------  Date: 11 Apr 1983 02:52:00-EST From: Christopher A Kent Reply-to: cak@purdue To: Mills@DCN6 cc: BBN-TCP@BBN-UNIX, Header-People@MIT-MC Subject: Re: RWK's grouse about the "@" hack In-reply-to: Your message of 09-Apr-83 17:26:26-UT. Dave, I'd like to set the record straight. I did NOT suggest the hack about " at "; I believe that was Rich Wales. Also, accept this as my (late) entry into the mail bakeoff. The suggestion about not accepting mail that doesn't conform to the standards is not very tenable, for a completely different reason. My SMTP server log shows that BBN knows exactly how to be polite and produce RFC822 addresses, since they must to get the mail in the door. But the TEXT of the message (not checked by my server -- it just believes what the SMTP USER side tells it) has the garbage headers. So there's no way to refuse the mail by checking the header, without an awful lot of mucking about that the server has no business doing. I believe that Jon Postel has asked us to forget about domain qualifiers for the moment; the only valid one currently is .ARPA, and it was reported that there was strong sentiment at the last Internet meeting to drop that for the time being, since it is redundant and breaks many people's mailers.
I dropped it here because of this sentiment and the fact that my users were complaining about not being able to get mail through to TWENEX sites (among others) because they couldn't hack the headers. This gave me no end of internal glee, because I felt I was paying someone back for all the pain and agony I went through on Jan 3 when I returned from my Christmas break and had to fix the software I had from BBN that claimed to be implementing the standard ... and in one heck of a hurry. The fact that a number of sites (MIT-Multics in particular) got extremely picky on Jan 1, when they hadn't been at all before the big day, didn't help any (e.g. Multics started requiring HELO messages). If it were just me, I'd say that we should all play the game strictly by the rules, and blow the slowpokes out of the water. I'm willing to hack my code and respond to my users as quickly as I can; I think that's my job. But I realize it's not just me, and it's not all under my control. So I try to strike a compromise. That's why I decided to finally call the Mail Police on BBN; I got just one too many crummy headers. I think this is the route we have to take; keep pounding on them till they respond; but it won't do to cut off users. It's been less than four months since the big day; even though everyone was supposed to be ready on Jan 1, there are STILL sites running NCP; if the basic protocol conversion hasn't been made yet, how can we expect the higher level services to be intact? It's a shame, but that seems to be the way it is. Let's pound on the offending parties, 'cuz no one else will. Cheers, chris  Date: 11 Apr 1983 09:50:04-EST From: Christopher A Kent Reply-to: cak@purdue To: header-people@mit-mc Subject: BBN catches on I just received a letter from the BBN mail system that had "@" instead of " at ". Let's hear it for peer pressure! 
Cheers, chris  Received: by YALE-BULLDOG via CHAOS; Mon, 11 Apr 83 21:11:42 EST Date: Mon, 11 Apr 83 21:14:38 EST From: Nathaniel Mishkin Subject: Re: cak@Purdue's message about RFC822 and "at" To: RWK%SCRC-TENEX@MIT-MC.ARPA, Mills@DCN6.ARPA, header-people@MIT-MC.ARPA In-Reply-To: RWK%SCRC-TENEX@MIT-MC.ARPA, 9 Apr 1983 08:22 EST From: RWK%SCRC-TENEX@MIT-MC.ARPA -- Experience, (some of it with other BBN software) shows that senders will put up with such continuing complaints for many years, and that USERS of intolerant mail systems are the ones to suffer. From: Mills@DCN6.ARPA -- Your position is untenable, short-sighted and just plain wrong. First, I think that Kerns point is well-taken. The fact that users are suffering is a bad thing. I just believe that it is unfair to place the onus on a site that HAS attempted to implement a mail system meeting the standard. Are you opposed to a standard or just opposed to persnickity mailers not being cooperative amount non-standard mail? I take it the latter. If so, then how much hacking am I obliged to do to accomodate other mail systems? Don't I get at least as much consideration as the site that hasn't had the time to fix their mailer yet to be up to the standard? In any case, I don't see how anyone's lines of communications are going to be cut by virtue of the fact that his mail program says "invalid address syntax" when he attempts to reply to a message that has some ill-formed "From". Any reasonable program would just leave it to the user to get the address into some reasonable shape (e.g. the way I just edited the "Foo at Bar" I got in one of your messages into "Foo@Bar"). The point of the standard is to minimize the overall amount of work that must be expended by all implementors. I believe it is "short-sighted" to go for the quick hacks. It seems to me the 822 was an attempt to SIMPLIFY the mail syntax so that everyone could meet it with a minimum amount of work. 
However, it is hard enough to implement the syntax without adding hacks that may introduce ambiguity or make the tokenization/parsing more difficult. Call me obstinate but don't call me "plain wrong". -- Nat -------  Date: 13-Apr-83 17:27:39-UT From: Mills at dcn6 Subject: List deletion request To: Header-People at MIT-MC List Keeper: Please remove Agarwal@isid and OConnor@isid from this list. They are no longer associated with the project and their directories have been reclaimed. Thanks and regards, Dave -------  Date: 13 April 1983 20:03 EST From: Ken Harrenstien Subject: List deletion requests etc. To: HEADER-PEOPLE @ MIT-MC Just to remind people (sorry, Dave!) -- Additions/deletions/queries should go to Header-People-Request@MIT-MC rather than to the entire list.  Date: 14 Apr 83 22:58:31 PST (Thu) From: Einar Stefferud Return-Path: Subject: Re: cak@Purdue's message about RFC822 and "at" To: header-people@Mit-Mc, stef@UCI In-Reply-To: Your message of Mon, 11 Apr 83 21:14:38 EST. Sender: Stef.UCI@Rand-Relay Via: UCI; 14 Apr 83 23:16-PST I just want to recall that two most useful things to do are: 1. Be very conservative about sending mail out that meets the standard, and 2. Be very liberal about what is accepted from others. With these in mind, I want to ask that our various User Agent programs remain able to do reasonable things with mail that is dredged from archives which date back to the earlier days of now outdated standards. Who is to say that mail from before some arbitrary date should no longer be processable? Onward! Stef  Date: 17 Apr 1983 15:32:12-EST From: Christopher A Kent Reply-to: cak@purdue To: header-people@mit-mc Subject: unreplyable headers... Cc: sun!postmaster@berkeley, postmaster@berkeley, geoff5@sri-csl, sun!krypton!wnj@berkeley Check this one out! 
Date: 15 Apr 1983 2014-EST (Friday) From: sun!wnj@krypton Return-path: Received: from UCB-VAX by SRI-CSL at 2-Apr-83 2055-PST Date: 2 Apr 83 18:15:42 PST (Sat) From: sun!wnj@krypton (Bill Joy) Subject: fix to csh # processing bug Return-Path: Message-Id: <8304030215.AA04179@krypton.sun.uucp> Received: by krypton.sun.uucp (3.320/3.14) id AA04179; 2 Apr 83 18:15:42 PST (Sat) Received: from krypton.sun.uucp by sun.uucp (3.320/3.1) id AA09347; 2 Apr 83 18:26:22 PST (Sat) Received: by UCBARPA.ARPA (3.332/3.19) id AA13556; 2 Apr 83 20:57:12 PST (Sat) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.332/3.20) id AA13192; 2 Apr 83 20:57:09 PST (Sat) To: unix-wizards@sri-csl.ARPA Remailed-date: 12 Apr 1983 1119-PST Remailed-from: the tty of Geoffrey S. Goodfellow Remailed-to: Unix-Wizards: ; Not only is krypton not a valid site in the Internet (.ARPA domain), which is implied by the unqualified hostname in the Return-path line, that line hasn't been munged by succeeding mailers. Also, the .uucp domain isn't a valid domain (yet? ever?). The "sun!wnj@krypton" seems to imply that the ! operator has higher precedence than the @, which I have NEVER seen before. I happen to know a little bit of the topology of uucp around Berkeley, but how are my poor users (and those at other sites) supposed to deal with this kind of stuff? Cheers, chris  Date: 17 Apr 1983 1655-PST From: Henry W. Miller Subject: Re: unreplyable headers... To: cak at PURDUE, header-people at MIT-MC cc: sun!postmaster at UCB-VAX, postmaster at UCB-VAX, geoff5 at SRI-CSL, sun!krypton!wnj at UCB-VAX, Miller at SRI-NIC In-Reply-To: Your message of 17-Apr-83 1232-PST Chris, Thank you for bringing this up. I've been meaning to, but haven't had the chance. Yes, it is a real concern, and should be addressed. (No pun intended...)
-HWM -------  Date: Wed 20 Apr 83 15:04:05-PST From: Mark Crispin Subject: more on domains To: MsgGroup@BRL.ARPA, Header-People@MIT-MC.ARPA cc: Postel@USC-ISIF.ARPA, Hedrick@RUTGERS.ARPA, nethax@Diablo, Bug-MAIL@MIT-MC.ARPA Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) There is a problem with Jon's suggestion to use "STANFORD" or "MIT" as top-level domains meaning "the Stanford Pup Ethernet" or "the MIT Chaosnet". Both MIT and Stanford have more than one network. At Stanford, an Internet network [36.x.y.z] share the same backbone and some of the same members with a Pup Ethernet. This is the first part of my problem. The name "SU-Shasta.ARPA" refers unambiguously to Internet [36.40.0.192]. The question that pops up is how to refer to Pup 50#300. The present situation is this. My name/address lookup module (HSTNAM) prefers to use Internet registries whenever possible. SHASTA is listed in the NIC HOSTS.TXT file as a nickname for SU-SHASTA, but does not appear in HSTNAM.MULTINET (also distributed by the NIC) which is used by the TOPS-20 kernal and thus consulted by the HSTNAM module. Only SU-SHASTA is in HSTNAM.MULTINET. Therefore, SU-Shasta becomes SU-SHASTA.ARPA, Internet [36.40.0.192]. Shasta is unrecognized as an Internet name and stays as such, becoming Pup 50#300. The same thing happens for most other members on Stanford's Ethernet. This would only be a side note in the annuals of networking comedy, if it weren't for the fact that Pup mailing is presently preferred for most of the Stanford network! A way is therefore needed to unambiguously specify the name registry. Then, Shasta would mean "any registry that recognizes Shasta", but Shasta.ARPA would only be in the Internet domain and Shasta.SU-Pup would only be in the Stanford Pup domain. 
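[Editor's note, not part of the original message: the registry-qualified lookup described above can be sketched roughly as follows. This is a minimal illustration and not the HSTNAM module; the function name resolve and the table layout are hypothetical, and the two entries simply restate the SU-Shasta and Shasta addresses quoted in the message.]

```python
# Hypothetical registries. The ARPA table mirrors the situation described
# above, where only SU-SHASTA (not the nickname SHASTA) is known.
REGISTRIES = {
    "ARPA": {"SU-SHASTA": "Internet [36.40.0.192]"},
    "SU-PUP": {"SHASTA": "Pup 50#300"},
}


def resolve(name, registries=REGISTRIES):
    """Return a list of (registry, address) matches for a host name.

    A name ending in a known registry ("Shasta.SU-Pup") is looked up only
    in that registry; a bare name ("Shasta") is tried in every registry.
    """
    upper = name.upper()
    host, dot, suffix = upper.rpartition(".")
    if dot and suffix in registries:
        table = registries[suffix]
        return [(suffix, table[host])] if host in table else []
    return [(reg, table[upper]) for reg, table in registries.items()
            if upper in table]
```

[With these tables a bare "Shasta" matches only the Pup registry, "SU-Shasta.ARPA" matches only the Internet registry, and "Shasta.ARPA" matches nothing, which is the unambiguous behaviour the message asks for.]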
If you want to have automatic relaying to some network that you are not connected to, it is virtually impossible to do this today without some form of high-level name registry. Automatic relaying is a well-known and loved feature of the TOPS-20 mailer, and several sites have indicated stiff opposition to the prospect of losing it. It is much more feasible to inform the mailer that "mail for anything.MIT-Chaos should be relayed through MIT-MC.ARPA" than to make the futile attempt of keeping track of all the members of MIT-Chaos. It also makes it possible for an intelligent mailsystem to reply to a message from one of these sites without requiring local knowledge of the name. Since there are multiple name registries at multiple sites, I suggest names such as SU-Pup, MIT-Chaos, DEC-DECnet, CMU-CU-DECnet, CSnet, etc. It is wrong to consider "Stanford" and "Stanford Pup Ethernet" to be the same entity. More importantly, I want an official Internet registry of these non-Internet name registries. I am not (yet) asking that these names be recognized as alternative top-level domains to ARPA. I am asking that some list of these names be made and kept so that applications like mine which must have this now will use a common set of names. Mail sent out on the Internet would be of the form: to satisfy the constraints of absolute host naming and non-use of domains other than ARPA. But within the non-Internet networks, you would see addresses such as It is, of course, possible that such addresses will leak out on the Internet from time to time with about the same frequency as host names such as SCRC-TENEX or SU-TINY leak out now. My feeling is that it's a tough world out there and there isn't much that can be done until the restriction on non-ARPA domains is removed. The best we can do is make a good try. I propose as an initial implementation that the Stanford Pup Ethernet get the name SU-Pup and the MIT Chaosnet get the name MIT-Chaos. 
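[Editor's note, not part of the original message: the relaying rule quoted above ("mail for anything.MIT-Chaos should be relayed through MIT-MC.ARPA") amounts to one table entry per registry rather than one per host. A rough sketch follows, assuming a hypothetical next_hop function; the MIT-Chaos entry comes from the message, while the SU-Pup relay shown is only a placeholder assumption.]

```python
# One relay per non-Internet registry. MIT-CHAOS -> MIT-MC.ARPA is taken
# from the message; the SU-PUP relay is an assumed placeholder.
RELAYS = {
    "MIT-CHAOS": "MIT-MC.ARPA",
    "SU-PUP": "SU-SCORE.ARPA",
}


def next_hop(host, relays=RELAYS):
    """Pick the host that mail for `host` should be handed to."""
    name, dot, registry = host.rpartition(".")
    if not dot or registry.upper() == "ARPA":
        return host  # bare or .ARPA names are delivered directly
    relay = relays.get(registry.upper())
    if relay is None:
        raise LookupError(f"no relay known for registry {registry!r}")
    return relay
```

[The point of the design is that the sender never needs the membership list of the remote network: next_hop("anything.MIT-Chaos") yields MIT-MC.ARPA without knowing whether "anything" exists.]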
I would like to receive input from the network hackers at both the Stanford Pup net and the MIT Chaosnet as to whether this will impose a hardship upon them and if they object to those names and want something different. I imagine the Unix systems on the Stanford Pup net would get by if it ignored SU-Pup the way they ignore ARPA now; I would take care of the TOPS-20's. I do want to receive some input from the MIT Chaosnet people. I am willing to create such a registry for ultimate adoption by the NIC, provided that these proposals aren't met with outright hostility. -- Mark -- -------  Resent-Date: 20 Apr 83 21:04:48 PST (Wed) Date: 20 Apr 83 21:04:48 PST (Wed) Resent-From: mo@LBL-CSAM (Mike O'Dell [system]) From: mo@LBL-CSAM (Mike O'Dell [system]) Subject: Mark's note on Jon's note about routed addresses Resent-Message-Id: <8304210504.AA12755@LBL-CSAM.ARPA> Message-Id: <8304210504.AA12755@LBL-CSAM.ARPA> Received: by LBL-CSAM.ARPA (3.320/3.21) id AA12755; 20 Apr 83 21:04:48 PST (Wed) To: header-people@mit-mc.ARPA, msggroup@brl.ARPA Outlawing routed addresses will be as sucessful as outlawing Rain on Tuesday or the 55 mile per hour speed limit. We can either face the reality of relaying mail between dissimilar mail systems, or we will surely drown in an every increasing trickle of bizarre addresses. We can all bust our collective butts continuing to hide all of our internal hosts behind various address magic cookies (%, for instance), or we can work as hard trying to make some progress solving some admittedly non-trivial problems. Here at LBL, adopted SMTP and the entire RFC82* suite just so we could make this stuff work. Moreover, up to now, I have been lobbying VERY hard to convert the UUCP universe to an SMTP/82*-based scheme just so all this crap would stick together just a little better. But if now we are going to reinstitute Internet Xenophobia, why did we ever drop 733????? 
Hoping I misunderstand, -Mike  Date: 21 Apr 1983 1628-PST From: ROODE at SRI-NIC (David Roode) Subject: Re: host naming To: HEDRICK at RUTGERS cc: postel at USC-ISIF, mrc at SU-SCORE, Header-People at MIT-MC, NameDroppers at SRI-NIC Location: EJ296 Phone: (415) 859-2774 In-Reply-To: Your message of 21-Apr-83 1257-PST The role you suggest having Rutgers play (in your example) of resolving unknown addresses (eminating from Rutgers) seems to me the same role as expected of the elusive domain name server. If there is going to be a long delay in providing domain name server standards and servers, then it seems to me that your suggestion is a perfect transition tool. Both the % hack at SCORE and the . hack at UDEL-RELAY are just that--hacks, unsanctioned etc. There is a concerted effort to remove all traces of routing information from the destination lists in the mail headers. The return path is the exception, and it seems intended to act as a backup mechanism. I think its existence is inconsistent. If it is necessary then the sole reason is to glue together domains which do not have knowledge of each others' existence. In that case, the mail sender will have to make use of the return path in extracting addresses for replies, which is another thing you suggested but seemed to regard as a kludge. I do too. So my view of the problem is that operationally we need name servers before we can abolish routing, and abolishing the standard for the expressing of routing in destination lists caused at least two new standards to spring up. This is not likely to be what the abolishers of the previous standard intended. Letting hosts serve as defacto name servers might be a good way of transitioning. -------  Date: 21 Apr 1983 13:20-PST From: John Gilmore Subject: Re: unreplyable headers... 
Message-Id: <8304212224.AA06031@sun.uucp> Received: by sun.uucp (3.320/3.14) id AA06031; 21 Apr 83 14:24:48 PST (Thu) Received: by UCBARPA.ARPA (3.332/3.19) id AA02156; 22 Apr 83 23:08:48 PST (Fri) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.332/3.21) id AA02863; 24 Apr 83 23:08:50 PDT (Sun) In-Reply-To: cak's message of 17 Apr 1983 153213-EST To: cak@purdue.ARPA, header-people@mit-mc.ARPA Cc: postmaster@Berkeley, geoff5@sri-csl.ARPA, krypton!wnj@Berkeley Cc: sun!@Berkeley.arpa.postmaster (Note: as I typed it, my header said: From: John Gilmore To: cak@purdue.arpa, header-people@mit-mc.arpa CC: postmaster@BERKELEY.arpa, geoff5@sri-csl.arpa, wnj@krypton.sun.uucp CC: @Berkeley.arpa:postmaster@sun.uucp Just for fun, compare that with what you receive...) I'm "postmaster@sun.uucp". The message we sent as "sun!wnj@krypton" is clearly wrong. Krypton is a Sun system on our Ethernet; the proper internet address should have been "wnj@krypton.sun.uucp", which we will process correctly if we receive it, but somehow we aren't generating. (Note that it's in the message-id, though. Also note that we aren't the only site at fault, since some site added extra Date:, From:, and Return-path: fields -- apparently SRI-CSL, which also didn't keep the received lines in order. The second Date: is curious, too, since it is three days after the Remailed-date.) Much of the problem is that our only connection to the outside world is via UUCP links. When we have mail addressed to the .arpa domain, we forward it (via uucp) to ucbarpa, which is on the net. However, even thought Eric Allman's "sendmail" runs both at Sun and at Ucbarpa, uucp gets in the middle and screws things up. (That's where the "sun!" came from.) We've been dreaming about writing an IP driver for async serial lines so we can dump uucp for these purposes, but it isn't likely to happen soon. I detected quite a bit of "Arpanet chauvanism" in Chris's message -- eg "the .uucp domain isn't a valid domain (yet? 
ever?)" Excuuuuse me, but Internet sites outside the Arpanet DO exist, and we've having quite a bit of trouble talking new protocols over old transmission media (as well as continuing to talk to old-protocol sites), so please don't make our jobs any harder by declaring us outlaws. In particular, I seldom see fully-qualified hostnames coming from Arpanet sites -- they just blithely assume that if there's no domain, "it must be .arpa". That makes 90% of the Arpanet traffic that reaches here "unreplyable", since our assumed universe for unqualified names is .sun.uucp. As for the .uucp domain's nonexistence, I believe Jon Postel's group is the naming authority for top-level domains, and has assigned that .uucp -- if I'm wrong, please let me know. The mixing of ! and @ is guaranteed to give problems. The question is: what form should addresses on our Ethernet take when going into the outside world? I'd like to have traffic going to other Internet sites use fully-qualified addresses like "wnj@krypton.sun.uucp", but as Chris points out, the Arpanet community hasn't gotten around to supporting domains other than .arpa, so this won't help. Furthermore, it takes cooperation on both sides of a uucp link to strip out the uucp-generated information -- we MUST add "sun! to the fronts of our addresses (because that's the way uucp works), and the receiving end (ucbarpa) must delete it to return to a sane address. It's clear that in wnj's message, Berkeley didn't delete the "sun!", but perhaps this was because "krypton" was not fully qualified and it was hoping "sun!" would disambiguate it. Perhaps the best solution is to pretend that krypton is directly on the uucp network (i.e. rewrite to "sun!krypton!wnj" which Berkeley would then turn into "sun!krypton!wnj@Berkeley"). We accept incoming mail that way, tho it's a minor violation of the Spirit of Subdomains, since "krypton!wnj" (== wnj@krypton.uucp) is NOT the same thing as wnj@krypton.sun.uucp . 
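[Editor's note, not part of the original message: the trouble with mixing ! and @ is that the same string yields two different hop orders depending on which operator is split first. The sketch below is illustrative only; both function names are hypothetical and neither is taken from sendmail or uucp.]

```python
def hops_at_first(addr):
    """RFC 822 style reading: contact the host after the last '@' first,
    then follow the '!' path in the local part from left to right."""
    local, at, host = addr.rpartition("@")
    path = local.split("!") if at else addr.split("!")
    *relays, user = path
    return ([host] if at else []) + relays, user


def hops_bang_first(addr):
    """Reading implied by 'sun!wnj@krypton': follow the '!' hosts first,
    then treat whatever remains as user@host."""
    *relays, rest = addr.split("!")
    user, at, host = rest.partition("@")
    return relays + ([host] if at else []), user
```

[For "sun!wnj@krypton" the first reading gives the hop order krypton, sun and the second gives sun, krypton. For "sun!krypton!wnj@Berkeley" the first reading gives Berkeley, sun, krypton, which is the route the rewritten form is meant to take.]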
I'm not pleased with this, though, because I want to get away from uucp-based addressing as much and as soon as possible. John Gilmore, Sun Microsystems' postmaster  Date: Mon 25 Apr 83 00:20:40-PDT From: Mark Crispin Subject: UUCP domain To: sun!gnu@UCB-VAX.ARPA cc: Header-People@MIT-MC.ARPA Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) The gods of the ARPA protocols have decreed that there should be no other top-level domain than ARPA, and there should be no lower level domain other than the old-style host names. You are not the only person with this problem. I have the unenviable task of worrying about a TOPS-20 internet (with a small "i") mailer. I handle, or try to anyway, Internet, Pup Ethernet, MIT Chaosnet, DECnet, CSnet. I am getting absolutely NO cooperation from the protocol gods on any of these entities. What I get back ranges from silence to (what seems to me to be) a concerted effort to make my life more difficult. Every single one of these networks has its own addressing mechanism, and categorically refuses to do anything to make addressing global. Internet is the worst offender; they have a mechanism and refuse to allow it to be used. -------  Date: 25 Apr 1983 0322-PDT From: Henry W. Miller Subject: Re: unreplyable headers... To: sun!gnu at UCB-VAX, cak at PURDUE, header-people at MIT-MC cc: postmaster at UCB-VAX, geoff5 at SRI-CSL, krypton!wnj at UCB-VAX, sun! at UCB-VAX, Miller at SRI-NIC In-Reply-To: Your message of 21-Apr-83 1320-PST (Getting on my soapbox...) I think the subject of domains and addresses must be addressed. An INTERNET address should conform to the specs. I find the unix/(clones) way of addressing things confusing and disorderly. Likewise, the uucp domain "situation" should be resolved. Nuff said. I have no ideas and no plans, but I think things should be cleaned up. 
(Stepping off soapbox) -HWM -------  Date: 25 Apr 1983 11:35:46-EST From: Christopher A Kent Reply-to: cak@purdue.ARPA To: sun!gnu@Berkeley Cc: cak@purdue.ARPA, header-people@mit-mc.ARPA, postmaster@Berkeley, geoff5@sri-csl.ARPA, sun!krypton!wnj@Berkeley Subject: Re: unreplyable headers... In-reply-to: Your message of 21 Apr 1983 13:20-PST. <8304212224.AA06031@sun.uucp> For everyone's edification, here's the header I received with John's message: ------ Date: 21 Apr 1983 13:20-PST From: John Gilmore Subject: Re: unreplyable headers...
Message-Id: <8304212224.AA06031@sun.uucp> Received: by sun.uucp (3.320/3.14) id AA06031; 21 Apr 83 14:24:48 PST (Thu) Received: by UCBARPA.ARPA (3.332/3.19) id AA02156; 22 Apr 83 23:08:48 PST (Fri) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.332/3.21) id AA02863; 24 Apr 83 23:08:50 PDT (Sun) In-Reply-To: cak's message of 17 Apr 1983 153213-EST To: cak@purdue.ARPA, header-people@mit-mc.ARPA Cc: postmaster@Berkeley, geoff5@sri-csl.ARPA, krypton!wnj@Berkeley Cc: sun!@Berkeley.arpa.postmaster ----- If there was Arpanet/Internet chauvanism in my original message, it was subconcious; when wearing my Postmaster hat here, I have to deal with Arpanet, CSNET, uucp, and Berknet mail, so I know how much trouble there is. I sincerely hope that the uucp domain WILL be a valid domain someday, and that there will exist proper nameserver systems for reaching it from everywhere on the Internet, and that we'll get everyone that runs uucp to do the right thing and convert to it; but right now it isn't. Apparently the word just hasn't reached everyone yet. As Mark Crispin said, Postel has decreed that there exists only one domain, .ARPA, and the default for a non-fully-qualified name is .ARPA. I think that newer versions of sendmail (at least at Arpanet sites) understand this (but I can't vouch for it). I think that everyone wants to get away from uucp sorce-route addressing, but the world isn't quite ready for it yet. I'd say that you should be rewriting addresses to be real uucp@berkeley strings. Ugh. Since SMTP is a relatively new protocol, it is understandable that there are many varying interpretations, and that it will take a while to get everyone to talk to everyone. A watchful eye, friendly cooperation and nudging, and lots of late night hacking should get us all together soon. John, for what it's worth, I have a preliminary driver for IP on RS232 lines running under the BBN IP/TCP. You're more than welcome to take a look, if it would help with your implementation. 
Unfortunately, it doesn't know how to dial yet. Cheers, chris  Date: 25 Apr 1983 12:34:37-EST From: Christopher A Kent Reply-to: cak@purdue.ARPA To: MRC@SU-SCORE.ARPA Cc: MsgGroup@BRL.ARPA, Header-People@MIT-MC.ARPA, Postel@USC-ISIF.ARPA, Hedrick@RUTGERS.ARPA, nethax%Diablo@SCORE, Bug-MAIL@MIT-MC.ARPA Subject: Re: more on domains Personal reaction, no animosity or character assasination intended: Yuck! Bletch! Ugh! Foo! Domains are supposed to do away with exactly this mess. I don't want to have to know that you have four (or fourteen) different kinds of networks inside your domain. I don't want to have to know whether or not there are different naming conventions for each. I think that top-level domains are being used for two purposes, and perhaps this is the root of the problem. They seem to be used to diambiguate different physical networks within a computing environment, and to indicate a funding or administrative grouping of machines. These two purposes seem to be mutually exclusive in many cases. I would propose that something more like be the address that gets to Mark in his example, with the possible modification of dropping the .ARPA qualifier. There's no funny relaying going on, and all I as a user have to understand is how to get to the Stanford domain. Beyond that, it's "invisible". I really don't like the idea of having to know that SU-Tiny is on Pup; I think that should be hanlded by the gateway/nameserver, with protocol translation if so desired, or by having the nameserver return the Internet address of an appropriate relay host. I also feel that it is wrong to think about addresses such as , even within subdomains. The mailer should fully qualify these, even though the user might not have to type the whole string. There should be 0.00% chance of unqualified names leaking out of a subdomain. Should we get Namedroppers@NIC in on this? 
Along the same lines, there seems to be a recent trend for groups that are bringing up private internets to grab a bunch of class C numbers, rather than one class B number that they manage privately. I think this is wrong. The Internet is cluttered enough with routing packets between gateways; why should the rest of the world know that you have, say, a 10Mb Ethernet, 3Mb Ethernet, Proteon ring, serial line IP, fiber optics, and back-to-back parallel interfaces? I certainly don't care. It should all be invisible to the outside world. Of course, I understand that this is tough until the ICCB gets going and approves a standard way of doing subaddressing; but implementors should be willing to put out a little effort and invent their own, with an eye towards possibly having to do it over later. Comments welcome, as always... Cheers, chris  Date: 25 Apr 83 13:55 EST From: Stephen Tihor To: <@MIT-MC:sun!gnu@Berkeley> Subject: RE: unreplyable headers... Cc: header-people@MIT-MC.ARPA Message-ID: <1154C7449.0036002F;1983@CMCL1.NYU.ARPA> In-Reply-To: <8304212224.AA06031@sun.uucp> ; Message of 25-APR-1983 02:19 from John Gilmore It would seem that the obvious thing to do is to replace the UUCP mail programs used for the internet traffic (the UUCP communications protocol itself is ok) with your own program designed to handle shepherding Internet mail around the nauseating /bin/mail and uucp-mail interfaces. We did it to interface our local mail system to the net and although the code would not be useful to you in its current form it certainly is proof of a reasonable solution if you have cooperating systems at both ends of the communications line. -------  Date: Mon 25 Apr 83 13:40:09-PDT From: Mark Crispin Subject: Re: more on domains To: cak@PURDUE.ARPA cc: MsgGroup@BRL.ARPA, Header-People@MIT-MC.ARPA, Postel@USC-ISIF.ARPA, Hedrick@RUTGERS.ARPA, nethax%Diablo@SU-SCORE.ARPA, Bug-MAIL@MIT-MC.ARPA Postal-Address: 725 Mariposa Ave. 
#103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In-Reply-To: Your message of Mon 25 Apr 83 10:34:34-PDT I agree that having a Stanford name domain would be desirable. The problem is that it seems to require considerably more cooperation from various individuals in the Internet world, not to mention at Stanford, than seems to be forthcoming. A solution that says "you should do this", but "you can't do this yet" doesn't help me when I have to do something "now".
-------  Date: 25 Apr 1983 1700-EDT From: HEDRICK@RUTGERS (Mgr DEC-20s/Dir LCSR Comp Facility) Subject: Re: more on domains To: MRC@SU-SCORE cc: cak@PURDUE, MsgGroup@BRL, Header-People@MIT-MC, Postel@USC-ISIF, nethax%Diablo@SU-SCORE, Bug-MAIL@MIT-MC In-Reply-To: Your message of 25-Apr-83 1644-EDT I agree with Mark that something should be done to allow an interim form of domains to start up immediately. I don't much care what it is. My preference, as many of you may know, is to allow any Arpanet host name or nickname to be used as a penultimate subdomain, i.e. right before the .ARPA. Until domains are completely implemented, if a mailer didn't know what the host was, it would direct the mail to the corresponding host. The host name with domain would still be absolute, in principle. In our case we would have other ways of finding better routings in most cases. This would simply be a way of specifying a default routing until we get a complete set of domain servers. If Mark needs more information, my proposal would either require him to use a sub-sub-domain, e.g. SAIL.PUP.SCORE.ARPA. This is what I would prefer. Or if he insists, and NIC agrees, he could define STANFORD-PUP as a nickname for SCORE, and then use SAIL.STANFORD-PUP.ARPA. I think it is fairly important to insist that the penultimate subdomain be a real host name or nickname, since otherwise I am going to have to have a table that maps domain names into relay hosts. I think that is too much for an "interim" solution. Personally I think the whole idea of having to specify the network within Stanford is a violation of the concept of absolute addressing. That seems to be something that belongs in the routing. But I am not sure that is any of my business. If he wants to have two different hosts whose names are the same up to the first dot, I guess I don't care. I do think that something needs to be done quickly. It is clear that there is currently a gaping hole in the mail system. 
We are seeing various kludges come in to fill it. I would much rather see almost any interim domain system get started as soon as possible. I am going to assume that any reasonable interim system is going to have the same results for just about everybody except Stanford, namely HOST.SITE.ARPA, where site is likely to be fairly obvious (RUTGERS, STANFORD, BERKELEY, etc.) -------  Date: 26 Apr 1983 16:04:10-EST From: Christopher A Kent Reply-to: cak@purdue.ARPA To: HEDRICK@RUTGERS Cc: MRC@SU-SCORE, cak@PURDUE, MsgGroup@BRL, Header-People@MIT-MC, Postel@USC-ISIF, nethax%Diablo@SU-SCORE, Bug-MAIL@MIT-MC Subject: Re: more on domains In-reply-to: Your message of 25 Apr 1983 1700-EDT. I'll third the motion that we need something soon. All the confusion is going to give way to inertia for the incompatible hacks. chris  Date: 27 Apr 83 16:25:26 PDT (Wed) From: mo@LBL-CSAM (Mike O'Dell [system]) Subject: One man's view of the world (MOBY message) Return-Path: Message-Id: <8304272325.AA10000@LBL-CSAM.ARPA> Received: by LBL-CSAM.ARPA (3.320/3.21) id AA10000; 27 Apr 83 16:25:26 PDT (Wed) To: MsgGroup@brl.ARPA, Header-People@mit-mc.ARPA Cc: cak@purdue.ARPA, MRC@su-score.ARPA, Postel@usc-isif.ARPA, mo@LBL-CSAM Friends, What follows is an attempt to write down my model of how the SMTP World can/should work. While there are religious issues involved here, I am submitting this to the list as an attempt to provide one whole view of the problem (limited mightily by my powers of exposition), all in one place. Earlier today I got copies of notes describing different pieces of this model, but I decided that posting this to the list for the purpose of dissection would be valuable. Because some of these notes present similar ideas, some obvious details have been glossed over (probably wrongly). Again, these are just my views (but I know them to be shared by several others), and they are presented solely to promote understanding and interchange. So, with this in mind, here goes...
---------------------------------------------------------------- A Model for Mail Using SMTP Mike O'Dell Lawrence Berkeley Laboratory mo@lbl-csam This note describes one possible model for using the SMTP mail transport protocol to implement a Mail internet whose addressing is based upon the notion of Mail Domains. A basic philosophy for interconnecting mail systems will be described and its ramifications will be discussed. The reader is assumed to be intimately familiar with the RFC82* family of protocol specifications. 1. Assumptions and Definitions Assumption: There exist large collections of hosts which implement SMTP, and further, the hosts provide facilities whereby SMTP implementations can "directly converse" with some, but not necessarily all, peer SMTP implementations. The existence of such communications facilities plus the connectivity of such mechanisms induces a partitioning on the collections of hosts; these partitions are called Subnetworks, or Subnets. The connectivity is important because if there is no direct communication between two Subnets there must be an intermediary agent if traffic is to flow between them. Such intermediaries are called Mail Gateways. Example: The ARPA Internet is one such Subnetwork. It is a collection of hosts which implement SMTP; the communication facility is TCP virtual circuits. Moreover, the Subnet is fully connected. By this we mean that any SMTP can directly communicate with any other SMTP, subject to the usual problems of the node being up, etc. An alternate example would be a Subnet where hosts may only talk to some collection of neighbors. Such networks might require explicit address-to-route binding, but this can be accommodated, as we will show later. Assumption: The SMTP mail transport protocol implements a store-and-forward, hop-routed, negative acknowledgment, message switching network.
SMTP implementations, henceforth called Nodes, receive mail from Sources (e.g., mail composers), reliably pass the mail from one node to another, and finally deliver it to a Sink (local mailbox, delivery agent, etc.). Elaboration: The reliable transfer of mail is based on the notion of Responsibility. At any time, there is generally one Node which is said to be Responsible for a piece of mail. When one SMTP node transfers it to another SMTP node, the Responsibility is "tendered" when the sending node transfers the message to the receiving node, and Responsibility is "discharged" when the sending node receives the positive response from the receiving node. Until such a positive response is received, the node "last in possession" of a message is said to be Responsible. Responsibility implies that a node will continue to provide its best efforts at continued transfer, until such time as its policies decide a message is "undeliverable". When such a determination is made, the Responsible Node is charged with returning a negative acknowledgement to the Source of the message. This is done via the "Return-path:" field accumulated during message transfer. In cases where SMTP nodes communicate over high-delay channels (as with Batch SMTP over UUCP links, or TCP/SMTP over phone lines), there is a window when both sender and receiver node can believe they are Responsible for a message. This can cause a replication of a message if the ACK to the sender is lost, and could cause two negative acknowledgements to be returned in some failure cases. Solution of these problems would require a 3-way handshake be introduced into the model. At this time, this seems like excessive work when most connections are already quite reliable, and those that aren't will be dramatically improved by this scheme. It is, however, a topic to be explored further.
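The hand-off of Responsibility, and the window in which a lost ACK leaves both nodes believing they are Responsible, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the `Node` class, the `transfer` function, and the `ack_lost` flag are hypothetical names invented here and are not part of SMTP or of the model's text.

```python
class Node:
    """Hypothetical SMTP node that tracks the messages it is Responsible for."""

    def __init__(self, name):
        self.name = name
        self.responsible_for = set()  # ids of messages this node must deliver or NAK


def transfer(sender, receiver, msg_id, ack_lost=False):
    """Model one hop: the receiver accepts the message, the sender discharges on ACK."""
    # The receiver has taken the message and replied positively.
    receiver.responsible_for.add(msg_id)
    if not ack_lost:
        # The positive response arrived, so the sender's Responsibility is discharged.
        sender.responsible_for.discard(msg_id)
    # If the ACK is lost, the sender still believes it is Responsible and will
    # retry later, which is how a message can be replicated.


a, b = Node("a"), Node("b")
a.responsible_for.add("m1")
transfer(a, b, "m1")                  # normal case: only b is Responsible
a.responsible_for.add("m2")
transfer(a, b, "m2", ack_lost=True)   # lost ACK: both a and b hold m2
```

A 3-way handshake, as the text notes, would be one way to close the window shown in the second call.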
While end-to-end reliability of mail service is clearly a desirable goal, until the mandatory institution of "Return-receipt-requested:" and "registered mail", such guarantees are not possible. Therefore, the goal is to ensure a high likelihood that non-delivery will result in a negative acknowledgement being returned to the Source. It is still possible for a message to be lost in-transit because of catastrophic failure of a node, but all nodes must make best-efforts to return error messages (subject to suppression of error loops). In practice, within Subnets like the Internet, the likelihood of undelivered and not-NAKed mail is small because the number of hops is usually one. Assumption: Addresses are provided on the envelope of the message. In the case of messages coming from another SMTP, the envelope information is provided by the messages exchanged in the protocol. In the case of mail submitted for transfer by a local Source, local policy determines whether it is taken from the message itself, or provided out-of-band through argument lists, etc. Upon final delivery, a "return-path:" line is created from the return address information accumulated along the way. This is the only way to guarantee "reply" operations will always work in the face of arbitrary network topology. User agents may wish to perform heuristics for optimizing addresses for replies, but the correct inclusion of a "return-path:" line provides a sufficient mechanism by allowing routing back to the Source and then out again. Clearly, optimizations are useful, but we desire a mechanism which is sufficient in all cases. The issue of whether headers in mail are modified en route is religious, but this author's position is that this scheme will work without "header munging", and ought to be done that way. But, there are others who believe otherwise and are free to modify headers en route.
The only requirement is that the minimal correct information ("return-path:" and other required fields) be present when it reaches its final destination. 2. The Form of Addresses The addresses in this model are taken from RFC82*, i.e., domain addresses and routed addresses. The formal syntax is given in the appropriate RFC's, but a few examples will ensure conceptual alignment. A domain address is of the form leftpart@domain where "leftpart" is a string prohibited from containing any unescaped @ characters and whitespace. The "domain" is a sequence of at least two tokens containing no unescaped @ characters or whitespace, separated by periods (.). A few examples of valid domain addresses: mo@lbl-csam.arpa mo@mo-sun.lbl-csam.arpa "Mike O'Dell"@lbl-mail.arpa Routed addresses are of the form [@domain,]@domain:leftpart@domain where [@domain] can be repeated zero or more times, and "domain" and "leftpart" are as above. 3. The Meaning of Addresses Assigning meaning to addresses is one of the most subtle parts of this scenario so an algorithm will be provided describing the elaboration and routing decisions an SMTP node must perform when processing a message. One of the paramount requirements is that the return-path information be generated and propagated along each hop. Negative acknowledgements cannot be generated if this is not done. We assume the following algorithm is being performed for each destination address on the envelope of the message. This applies equally whether the Source of the message was another SMTP node or a local user agent program. Further, the algorithm assumes all addresses are syntactically compliant with (2) above. For a routed address, the "first hop" is the leftmost @domain clause of an address, punctuated by an unescaped comma or colon. In the case of normal domain addresses, the @domain excluding the left part is assumed to be the "first hop" (and "only hop"!). We introduce the notion of an "i-right-subclause" of a domain.
Remember that a domain is a clause of the form

    [xxx.]yyy.zzz

An "i-right-subclause" is a clause containing the "i" rightmost subclauses.

Example:

    Given: alpha.beta.gamma.lbl-csam.arpa

    1-right-subclause: arpa
    2-right-subclause: lbl-csam.arpa
    3-right-subclause: gamma.lbl-csam.arpa
    4-right-subclause: beta.gamma.lbl-csam.arpa

The function NumberOfSubclauses() applied to the Example clause returns the integer 5. We now give the Address Elaboration and Routing Algorithm in a pseudo-code resembling C. The pseudo-function relating to name lookups will be described below.

Address Elaboration and Routing Algorithm
------------------------------------------

    ImmediateDomain = ExtractFirstHop(address_to_be_resolved);
    NameServersToUse = KnownNameServers();
    i = 1;
    NofS = NumberOfSubclauses(ImmediateDomain);
    State = Unresolved;          /* success or failure indicator */
    AddressToUse = Undefined;    /* resolved destination */

    while (State == Unresolved) {
        if (i <= NofS) {
            CurrentTry = I_Right_Subclause(ImmediateDomain, i);
        } else {
            State = Unresolvable;
            break;               /* out of loop */
        }
        Response = NameLookup(CurrentTry, NameServersToUse);
        switch (Response.Type) {
        case UNKNOWN:            /* struck out */
            State = Unresolvable;
            break;
        case DOMAIN:
            /*
             * continue resolution with more components
             * using name servers returned
             */
            i = i + 1;
            NameServersToUse = Response.NameServers;
            continue;            /* at top of loop */
        case HOST:
        case MAILGATEWAY:
            /*
             * got a direct path to either a host
             * or a gateway
             */
            State = Resolved;
            AddressToUse = Response.HostAddresses;
            break;
        }
    }

    /*
     * Success of resolution is now in State
     * AddressToUse contains host identifiers
     * to use getting to the first hop
     */

    /*
     * If we don't know what to do, but know someone who might,
     * try sending it to him
     */
    if (State == Unresolvable) {
        if (KnownOracle()) {
            ForwardNextHop(KnownOracle());
        } else {
            ReturnToSender();
        }
    } else {
        ForwardNextHop(AddressToUse);    /* updates return-path: */
    }

------------------------------------------------------

Comments: This
algorithm works whether the NameLookUp function is implemented by really pinging a name server, or whether the name servers are really just files on the host. This is important for bootstrapping - not everyone will have nameservers immediately, and the algorithm allows hosts to serve as interim mail gateways with no prearrangement because any host will make best efforts to forward a message without regard to its source. This algorithm can also be improved by caching some information locally (not unlike implementing NameLookUp() with a file), but in no case can the caching, or any other optimization, violate the right-to-left precedence of clauses within the domain being resolved. It is critically important to allow pieces of the domain to be uninterpreted except by mail gateways. If nodes are allowed to evaluate any part of a name they choose, they can usurp gateway policy decisions regarding issues like centralized mailroom facilities. This problem is really one of binding an address to a route. A critical part of this model is that the binding can be done piecewise as with the algorithm above, or the algorithm can be embellished with "peephole" optimization to remove extraneous routes, etc. All such optimizations can be done with simple-to-formulate RULES for Address Algebra, and not random heuristics. Additionally, the binding scheme relies on a "most-likely-knowledgeable oracle". The KnownNameServers() provide an initial list of oracles, with the knowledge sources vectoring toward the final authority. This scheme allows small machines to have degenerate tables which cannot resolve any address, but have a KnownOracle() which can do mail routing for them. The oracle can do whatever Address Algebra it wants to implement an external address policy. Such a policy might show only "white pages" addresses in external mail, or decide whether to transform the routed address (resulting from such "hot potato routing") into a pure domain address of the form .
There are good reasons for keeping the routed address and transforming to the non-interpreted scheme, so that should be the purview of the local domain, and not a global decision. The important issue is that such policies can be concentrated in the oracle (mail gateways) for the domain and not every little machine on the subnet. This scheme strictly avoids "recursive name servers" whereby one server will query another on behalf of a request. This can cause serious bookkeeping problems, and I really think the "redirect" scheme is much cleaner, particularly in the case of networks with high delay. It is clear that this scheme works best if mail gateways for domains provide name service for at least that domain. 4. The Notion of Domains The notion of a domain is central to this model, but is a rather difficult notion to pin down. The word "domain" has already appeared in the above text, so this discussion is assumed to be retroactive. A domain is a recursive mechanism for collecting and containing details about naming. Domains are frequently composed of subdomains (also called domains) to divide the work of providing naming services for a large collection of nodes on a subnet. The reasons for such division can be either administrative or tactical, but generally, domains don't usually span subnets with radically different communication facilities (e.g., BSMTP over UUCP and SMTP over TCP). Additionally, "subnets" are recursive and subdomains don't usually extend across subnets "owned" by different parties. The notion of "ownership" is a ticklish matter, but is generally based on who has responsibility for arbitrating global names. As an example, DCA/ARPA has global naming authority within ARPANET, so the domain is called ARPA, and is a "top level domain", meaning it always appears only as a 1-right-subclause in a domain clause.
In the desirable case of multiple top level domains, mail gateways will have to cross-register with each other to allow a host with a .UUCP address to ask its .ARPA name server about who to see to get his .UUCP address resolved. (Again, the name server can certainly be implemented by a file on the host.) 5. Reverse addresses and Replies The problem of generating reverse addresses for replies is tricky. The only scheme which is sufficient, meaning it will *always* work, is reverse routing. This means that the address generated for a reply is a routed address of the form reverse-path:forward-address where "reverse-path" is taken from the envelope and "forward-address" is taken from the recipient fields (To:, CC:, etc.). This will result in replies going back to the source host and then being forwarded like the original mail. Clearly this is manifestly inefficient for most traffic, but in the case of general internet relaying, it is the only semantics which will *always* work, short of hosts having complete, or at least *very considerable* knowledge of global network topology and "externalizing" all addresses in the message at each hop (this nasty business is also known as Header Munging). It is possible to describe a set of rules for "optimizing" outgoing addresses which will reduce the number of hops in a deterministic way and preserve the semantics of the reverse route. Any optimization performed by a host must preserve the reverse-route semantics. 6. Conclusions "Mail is a lot harder than you think, and astonishingly harder than it looks."  
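The right-to-left resolution loop from section 3 of the note can be sketched in Python roughly as follows. This is a minimal sketch, not the author's implementation: the `Response` record, the `TABLE` contents, the gateway identifier `gw-1`, and the name server label `ns-arpa` are all hypothetical, the lookup is the "file on the host" variant rather than a real name server, and escaped characters in addresses are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    type: str  # "UNKNOWN", "DOMAIN", "HOST" or "MAILGATEWAY"
    name_servers: list = field(default_factory=list)
    host_addresses: list = field(default_factory=list)


def i_right_subclause(domain, i):
    """Return the i rightmost dot-separated subclauses of a domain."""
    return ".".join(domain.split(".")[-i:])


def extract_first_hop(address):
    """Leftmost @domain of a routed address, else the domain of a plain address."""
    if address.startswith("@"):
        # Routed form: @domain[,@domain...]:leftpart@domain
        end = min(p for p in (address.find(","), address.find(":")) if p != -1)
        return address[1:end]
    return address.rsplit("@", 1)[1]


def resolve(address, lookup, name_servers):
    """Resolve right to left; return host addresses, or None if unresolvable."""
    domain = extract_first_hop(address)
    for i in range(1, len(domain.split(".")) + 1):
        resp = lookup(i_right_subclause(domain, i), name_servers)
        if resp.type == "DOMAIN":
            name_servers = resp.name_servers  # redirect: ask these servers next
        elif resp.type in ("HOST", "MAILGATEWAY"):
            return resp.host_addresses
        else:  # UNKNOWN
            return None
    return None


# Hypothetical static table standing in for a name server.
TABLE = {
    "arpa": Response("DOMAIN", name_servers=["ns-arpa"]),
    "lbl-csam.arpa": Response("MAILGATEWAY", host_addresses=["gw-1"]),
}


def table_lookup(name, name_servers):
    return TABLE.get(name, Response("UNKNOWN"))


print(resolve("mo@mo-sun.lbl-csam.arpa", table_lookup, ["root"]))
```

With this table, resolution stops at the gateway for lbl-csam.arpa, so the `mo-sun` component is never interpreted by the sending node, which matches the requirement that parts of a domain stay uninterpreted except by mail gateways. A caller that gets `None` back would hand the message to its oracle or return it to the sender.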
Date: 27 Apr 83 21:20 EST From: Stephen Tihor To: mo@LBL-CSAM.ARPA Subject: RE: One man's view of the world (MOBY message) Cc: header-people@MIT-MC.ARPA, msggroup@MIT-MC.ARPA Message-ID: <117752466.01350029;1983@CMCL1.NYU.ARPA> In-Reply-To: <8304272325.AA10000@LBL-CSAM.ARPA> ; Message of 27-APR-1983 19:44 from mo@LBL-CSAM (Mike O'Dell [syste Although I can see that reflecting replies off the Sender is the only way to guarantee unambiguous reply routing for messages with relative addresses in them, one of the points of Absolute names is that regardless of where on an Internet you are a message containing TIHOR@CMCL1.NYU.ARPA as an address can always be used to send a message to me regardless of when you are since all mail servers must know someone somewhere who can deal with the .ARPA domain and thus can route a reply to me through there. From this I get the impression that you really intend to include the existing UUCP explicit routing as a real method of getting from here to there rather than as a temporary kludge that we can use until the Domain Name/Mail Servers are available. Did I miss something in your message? p.s. Gang: can we limit this discussion to either Header-People or MsgGroup? Seeing everything in stereo with several hour delay is beginning to get to me.
-------  Date: 27 Apr 83 21:28 EST From: Stephen Tihor Subject: RE: One man's view of the world (MOBY message) Cc: header-people@MIT-MC.ARPA, msggroup@MIT-MC.ARPA Message-ID: <11775F9B4.026D002B;1983@CMCL1.NYU.ARPA> In-Reply-To: <8304272325.AA10000@LBL-CSAM.ARPA> ; Message of 27-APR-1983 19:44 from mo@LBL-CSAM (Mike O'Dell [syste Rereading this massive missive again I guess it's just that the UUCP routing rules seem to have too solid a place, since one of the main points of Domain addressing is to get away from that. In effect your document describes a different syntax and specific Heuristic for the old multiple-@ style of routing which domains were supposed to have obsoleted.
-------  Date: 28 Apr 83 08:04:31 PDT (Thu) From: mo@LBL-CSAM (Mike O'Dell [system]) Subject: RE: One man's view of the world (MOBY message) Return-Path: Message-Id: <8304281504.AA19861@LBL-CSAM.ARPA> Received: by LBL-CSAM.ARPA (3.320/3.21) id AA19861; 28 Apr 83 08:04:31 PDT (Thu) To: TIHOR.CMCL1@NYU.ARPA Cc: header-people@MIT-MC.ARPA.msggroup@MIT-MC.ARPA In-Reply-To: Your message of 27 Apr 1983 1838-PDT (Wednesday). <117752466.01350029;1983@CMCL1.NYU.ARPA> The intent is to allow arbitrary interconnection of SMTP mail networks, even in the presence of indirect connections and non-pervasive name servers. What you propose requires any SMTP to make a direct communications connection to any other SMTP in the mail universe. This is simply not possible - there are different communications mechanisms in use and some exhibit too much delay to support real-time name servers. I don't see any way of outlawing some degree of source routing. Note that I do NOT want SMTP addresses to generally look like UUCP strings, but in reality, they occasionally might have to, and if we don't deal with that, the scheme is inadequate. In the case of the ARPA Internet domain, we don't really need domains at all. RFC733 would have been adequate because the Internet exhibits one uniform communications environment with the routing done by the Internet. The facilities of the RFC82* family ARE required to allow distributing responsibility for name binding (and simply believing in name servers isn't sufficient) and interconnecting mail universes.
-Mike  Date: 29 Apr 83 10:52 EST From: Stephen Tihor To: mo@LBL-CSAM.ARPA Subject: RE: One man's view of the world (MOBY message) Cc: header-people@MIT-MC.ARPA, msggroup@MIT-MC.ARPA Message-ID: <1193BAF1E.00DF0022;1983@CMCL1.NYU.ARPA> In-Reply-To: <8304281504.AA19861@LBL-CSAM.ARPA> ; Message of 28-APR-1983 11:06 from mo@LBL-CSAM (Mike O'Dell [syste [Note: I am still getting this from both header people and msggroup and I am afraid that my mail presenter isn't up to unifying messages that arrive days apart (one well after I have deleted the other) yet, even if they have the same message-id. Will the maintainers of these two lists please get their heads together and pick one to carry this discussion on?-SWT] { I think my point may have been more clearly made by the last few messages, so if you understand the source-routing vs. mail-forwarding/gatewaying flap ignore this message. - SWT} No I am not arguing that we need perfect instant SMTP links ... just that since the name server selection algorithm must, in the original design, know where to find a name-server (on a high-reliability direct connect internet route) or a mail-gateway to some Domain in ANY domain string since each SMTP transport program is required to know all the top level domains and how to get to them, then all fully qualified mail can be delivered directly to [or TOWARDS] that gateway if there is no better way and one can be certain that it is going the right way. Thus explicit paths are merely an optimization, or deliberate disoptimization, which you are providing to force mail away from this path. Thus while I can not dispute some of the ideas presented, UUCP-style SMTP routing, except for UUCP-style diagnostic purposes, is useful only if your mailer knows the state of the world better than the SMTPs between you and your desired gateway OR if you know of and wish to use a selective gateway.
It is not clear to me that this is at all reasonable in the four sets of multinets where I currently see SMTP being of interest: (1) core ARPA Internet. Nearly-instant SMTP, full service name servers -- No Problems. (2) adjacent local networks and their friends. SMTP services may be too slow or have too many steps to efficiently provide rapid name-server service -- however can quickly pass mail for major top level domains up to internet through local gateway, handle local domains locally, and have special cases specially routed directly to special local gateways based on local routing tables. (3) UUCP-land after the Great replacement of /bin/mail and UUCP explicit routing with SMTP and table driven routing algorithms. The major problem I see here is keeping track of low level adjacencies between multiply connected organizational domains: BTL, DEC, Tektronix, HP, Berkeley, etc. This is the piece of the general routing problem that Postel last stamped "UNDER STUDY" for normal IP/TCP gateways. Is the added inefficiency of routing all mail from BTL to DEC through one link (say the canonical and soon to be defunct harpo--decvax link) worth the simplicity? That is not a real question since as long as you control the mail tables (and within any organization that deserves the term ORGANIZED and is serious about networking mail this is not a problem) you can route mail "optimally" by default simply by putting the right paths in the forward-to for the correct target subdomains. (4) A purely chaotic network such as PCnet. This is a good case in that there are a number of small nodes which probably do not have a logical and direct path to some known mail server. A current problem is developing a reasonable routing method for such nodes. Certainly none of the existing proposals are reasonable enough to suggest using such nodes as a route through to another internet.
The strongest current proposal, longitude, latitude, and phone number, is certainly something which a PCnet-mail gateway could handle since each node currently is a single mailbox system. -- Here a number of unresolved issues reflect the lack of a clear underlying model and example of the network. Are these small systems without room for routing tables the sort of machines that are motivating this discussion? -------  Date: 4 May 1983 01:31:13-EST From: Christopher A Kent Reply-to: cak@purdue.ARPA To: namedroppers@nic, Header-People@mit-mc, MsgGroup@brl Subject: why are these separate lists? Hello.... I will receive this message twice. No doubt there are some of you that will receive it thrice. What I'm curious about is why. I have just spent a few most interesting hours catching up on about a week's worth of large messages dealing with SMTP/Internet addressing issues, relays, gateways, domain, nameservers, and the like. When it came time to file all this in an orderly fashion, I was faced with a dilemma; the mail was all centered around the same broad topic, but appeared in two very separate lists! I usually file Header-People mail one place, and NameDroppers another; tonight I felt like it should all go in the same place. I don't know what MsgGroup started out for, since I've never been a member. I believe that Header-People wants to deal with mail headers and their formats and evolution, i.e., the "Mail Police", as Dave Mills would put it. NameDroppers was formed (to the best of my recollection) to try to hammer out the issues of creating domains and dealing with them in getting everyone in the world to be able to send mail to everyone else. The groups seem to be bleeding all over one another. I don't find this objectionable, but am concerned that there are folks out there that should be seeing some of these messages that aren't, and we are therefore missing their valuable contributions.
Perhaps we should merge these lists into a "Mail-Concerns@foo" (who can support the traffic?) and eliminate this problem. Comments? Cheers, chris  Received: by YALE-BULLDOG via CHAOS; Wed, 4 May 83 13:30:36 EDT Date: Wed, 4 May 83 13:29:51 EDT From: Nathaniel Mishkin Subject: Re: why are these separate lists? To: cak@PURDUE.ARPA Cc: Header-People@MIT-MC.ARPA In-Reply-To: Mailer@SRI-NIC.ARPA, 4 May 1983 01:31:12-EST I will receive this message twice. No doubt there are some of you that will receive it thrice. What I'm curious about is why. I'm curious: do any of these mail systems that generate "Message-ID:" headers use them to eliminate (at least locally generated) duplicate mail? I've come to the conclusion that even if one's mail system is bug-free, you're still going to have problems with duplicate mail. If you want to use message IDs as a way of eliminating duplicates, you have to both (1) save all the "recent" message IDs received by a user and (2) come up with a standard format message ID that every mail system in the world would use. I recently thought of a way to eliminate both the problem of storing saved message IDs and the problem of non-uniformity of message IDs. In this scheme, there is no "Message-ID:" in the message. Associate a dictionary implemented as a hash table with each user's mail box. The key into the dictionary is the concatenated string of the header lines of the message. To check to see whether you've seen a message already, you look up the concatenated header string in the dictionary. If it's there, your mailer says that you've seen the message before. To deal with the random mailer munging of header lines, you'd probably have to be a bit careful: perhaps you'd just take the "Date:" "Subject:" and "From:" header lines as the key (you'd have to be careful with the "From:" though because of UUCP-style hacking). 
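The dictionary scheme just described, keyed on a few header lines rather than on a "Message-ID:", might look like the following. This is a rough sketch under assumptions: the names `header_key` and `SeenMessages` are hypothetical, a Python set stands in for the per-mailbox hash table, folded (continued) header lines are not handled, and no attempt is made to undo UUCP-style rewriting of "From:".

```python
KEY_FIELDS = ("date", "subject", "from")


def header_key(raw_headers):
    """Concatenate the selected header lines into a lookup key."""
    found = {}
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip().lower()  # header names compared case-insensitively
        if sep and name in KEY_FIELDS:
            # Collapse runs of whitespace so trivial reformatting keeps the key stable.
            found[name] = " ".join(value.split())
    return "\n".join(found.get(f, "") for f in KEY_FIELDS)


class SeenMessages:
    """Per-mailbox record of header keys that have already been shown."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, raw_headers):
        key = header_key(raw_headers)
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


box = SeenMessages()
first = ("Date: 4 May 1983 01:31:13-EST\nFrom: cak@purdue.ARPA\n"
         "Subject: why are these separate lists?\nTo: Header-People@mit-mc")
second = ("Received: by some relay\nDate:   4 May 1983 01:31:13-EST\n"
          "From: cak@purdue.ARPA\nSubject: why are these separate lists?\nTo: MsgGroup@brl")
print(box.is_duplicate(first), box.is_duplicate(second))
```

The two copies differ in their "To:" and "Received:" lines but share the three key fields, so the second one is reported as already seen.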
One space efficient implementation of this algorithm looks like a spelling correction algorithm written up a while ago in CACM and used in our local editor: first, you hash the concatenated header into a shorter key; then instead of saving the "value" (i.e. the message), you set the bit at the index pointed to by the hashed key; actually you apply several different hash functions to the concatenated header and turn on the bit at each of the places indicated by each hash function. The lookup operation for a message requires that you hash its headers with the same hash functions and see if all the bits at the hash locations are turned on. The probability that this method will say that a non-duplicate is a duplicate is very low and can be made as low as you want by increasing the number of hash functions or size of the hash bit table. One simple way to employ this scheme is in the mail user interface program. Just before it's about to show you the next message, it applies this scheme to see if you've seen it already; if you have, it skips it. If you're really hyper about making sure you don't lose anything, periodically you could peruse your allegedly duplicate messages. Just a thought. -- Nat -------  Date: 4 May 1983 12:43:43-EST From: Christopher A Kent Reply-to: cak@purdue.ARPA To: Mishkin@YALE, cak@PURDUE.ARPA Subject: Re: why are these separate lists? Cc: Header-People@MIT-MC.ARPA Sounds really nice! I would worry a lot about random header munging right now, though; many systems seem to change things just slightly, but enough to fool an algorithm. I think there are mail readers that use Message-Ids to remove duplicates, but they break for all the reasons you cited. I had a possibly heretical thought the other evening as I was contemplating my mailer's source code, and your letter brought it to mind once again, so I thought I'd pass it along. One of the reasons headers are so ugly is that you have to read, parse, understand, and possibly munge headers. 
This code is usually pretty ad hoc, which makes it hard to write, read, debug, and modify. Our headers are all describable by a grammar; why aren't header handling programs written like a compiler? You parse the message, store it in an internal parse tree, do some manipulations on the tree, and decompile it. It's all well known technology, and there need be nothing ad hoc about it. If there's a syntax problem, you just dump the message back to the user, preferably with a meaningful error message. Has anyone ever attempted this? Cheers, chris  Received: by YALE-BULLDOG via CHAOS; Wed, 4 May 83 14:47:47 EDT Date: Wed, 4 May 83 14:48:34 EDT From: Nathaniel Mishkin Subject: Re: why are these separate lists? To: cak@PURDUE.ARPA Cc: Header-People@MIT-MC.ARPA, Ellis@YALE.ARPA In-Reply-To: Christopher A Kent , 4 May 1983 12:43:44-EST One of the reasons headers are so ugly is that you have to read, parse, understand, and possibly munge headers. This code is usually pretty ad hoc, which makes it hard to write, read, debug, and modify. Our headers are all describable by a grammar; why aren't header handling programs written like a compiler? You parse the message, store it in an internal parse tree, do some manipulations on the tree, and decompile it. It's all well known technology, and there need be nothing ad hoc about it. If there's a syntax problem, you just dump the message back to the user, preferably with a meaningful error message. Has anyone ever attempted this? The Yale mail system running on our -20s does exactly this. It has a recursive descent parser that translates a message in RFC822 format into a parse tree. According to the person here who wrote the parser, the syntax specified in 822 is well-suited to such a parser. If it encounters a header line it can't parse, it flags it and leaves it in the message. The library containing the parser is used by both the mailer and the mail user interface program. It has been VERY useful to have the parser. 
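The parse, manipulate, and unparse cycle discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Yale parser: it splits a header block into fields rather than building a full RFC822 parse tree, and the names Field, parse_headers and unparse are hypothetical. Like the parser described above, it flags a line it cannot parse and leaves it in the message.

```python
from dataclasses import dataclass


@dataclass
class Field:
    raw: str          # original text, including folded continuation lines
    name: str = ""    # empty when the line could not be parsed
    value: str = ""   # unfolded field body
    ok: bool = True   # False marks a line the parser could not handle


def parse_headers(text):
    """Split a header block into Field records, keeping the original text."""
    fields = []
    for line in text.splitlines(keepends=True):
        if line[:1] in (" ", "\t") and fields:
            # A line starting with white space continues the previous field.
            last = fields[-1]
            last.raw += line
            if last.ok:
                last.value += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if sep and name and not any(c.isspace() for c in name):
            fields.append(Field(raw=line, name=name, value=value.strip()))
        else:
            # Flag the line but keep it so nothing is lost on output.
            fields.append(Field(raw=line, ok=False))
    return fields


def unparse(fields):
    """Reassemble the header block from the stored original text."""
    return "".join(f.raw for f in fields)
```

Because every record keeps its original text, unparsing an unmodified list of fields gives back the input unchanged.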
-- Nat -------  Date: 4 May 1983 1201-PDT (Wednesday) From: eric%UCBARPA@Berkeley (Eric Allman) Subject: Re: parsing headers Message-Id: <16502.31.420922882@ucbarpa> Received: by UCBARPA.ARPA (3.336/3.27) id AA16503; 4 May 83 12:01:25 PDT (Wed) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.339/3.27) id AA29443; 4 May 83 12:02:10 PDT (Wed) Phone: (415) 548-3211 To: cak@purdue.ARPA Cc: Header-People@MIT-MC.ARPA, Mishkin@YALE In-Reply-To: Your message of 4 May 1983 12:43:43-EST. <8305041845.AA29114@UCBVAX.ARPA> Fcc: mail I considered using a "real" parser when I was designing sendmail. I rejected the idea because there were too many ambiguous addresses and headers running loose in the world -- Arpanet addresses may be unambiguous, but the sum of Arpanet plus UUCP plus CSNET plus PUP plus Berknet plus BITNET plus DECNET ad infinitum is quite ambiguous. I ended up using a production system, which can assign arbitrary precedence to the ambiguous cases. However, if I were doing it again today I would take a very different approach. I would probably keep the production system, but only use it to put the bizarre addresses into a canonical form (i.e., domains). I would then use something else (possibly a grammar) to parse the canonical form. I suspect that there are just enough gotcha's in the syntax to make ad hoc code easier however. For another heretical idea, how about making the syntax of addresses in the message the same as the syntax of addresses in the envelope? It would be awfully nice if only one parser were needed. eric  Date: 4 May 1983 1234-PDT From: Henry W. Miller Subject: Re: why are these separate lists? To: Mishkin at YALE, cak at PURDUE cc: Header-People at MIT-MC, Ellis at YALE, Miller at SRI-NIC In-Reply-To: Your message of 4-May-83 1218-PDT Throwing in my two cents worth: yes, headers are not the easiest thing to parse. 
I had to hack our MM a few months back to ignore domains so REPLY would work, and not get confused while searching the HOSTSx.BIN table. I had to patch the code in three separate locations to ignore ".DOMAIN", depending on the context of the header name. A generalized parser would have been so much nicer. -HWM -------  Date: 4 May 1983 1535-EDT (Wednesday) From: Craig.Everhart@CMU-CS-A To: cak@PURDUE Subject: Re: why are these separate lists? CC: Mishkin@YALE, cak@PURDUE, Header-People@MIT-MC Sender: RdMail@CMU-CS-A In-Reply-To: "Christopher A Kent's message of 4 May 83 12:43-EST" Message-Id: <04May83.153524.RD00@CMU-CS-A> The CACM article was written to describe an algorithm that searched for messages containing keywords. The failure mode (induced by hash conflicts) was that messages without the keywords would occasionally be seen as having the keywords, so that searches would occasionally find slightly too many messages. Using the same algorithm to suppress duplicate messages trips over the same failure, but in this case the result isn't you seeing slightly too many messages, but is instead your missing a message because it was marked as ``duplicate,'' while it wasn't a duplicate at all. Message-IDs are supposed to be unique within the entire Internet, if not also within the entire internet. Why bother with Date: and From: and Subject: and thus have to worry about transformations induced by forwarding gateways? The duplicate elimination mechanism that we were going to install in RdMail but never got around to would work as follows. Using RdMail, people keep their mail for a few days before they delete it, as a matter of course. So, when a new message is incorporated into the file of messages (a job done by RdMail), we compare the internal-form date of the message to the stored internal-form dates of the rest of the messages in the message file.
If the results are sufficiently close, like within a few seconds or a minute, RdMail does a string comparison of the Message-ID fields of the relevant messages. If that comparison finds a match, RdMail marks one or the other of the messages with the system attribute Duplicate. (This is just another system attribute like, for example, Examined, Deleted, and Answered.) The user can then handle Duplicate mail any way he/she wants. The scheme relies heavily on the presence of internal-form versions of message Date: fields, and on the fact that message Date:'s aren't usually munged by forwarding gateways. RdMail doesn't maintain anything but dates in an internal form; everything else is maintained as text. But last summer David Nichols and Mike Horowitz here at CMU wrote a Unix program duplicating the RdMail user interface, and followed our recommendations of the time to store both the unparsed message header text and the parsed versions, in order to speed up access to the various fields. Their arrangement has worked quite well in practice. They've called the resultant system Mercury; send inquiries to Mercury@CMU-CS-G. Craig Everhart  Date: Wednesday, 4 May 1983 14:51-PDT To: Header-People at MIT-MC Subject: Re: why are these separate lists? In-reply-to: cak's message of 4 May 1983 12:43:43-EST. From: greep at SU-DSN For a long time I have been suggesting that standards that are intended to be implemented as programs (eg header parsing, FTP reply codes, etc) be described as program logic, thus (1) simplifying implementors' work and (2) helping to ensure a standard interpretation of the standard. (This kind of specification should be in addition to the kinds of documentation currently used.) RFC822 is so complicated that I would be surprised if different people reading it would invariably interpret all of it in the same way. (I seem to remember such differences of opinion coming up in the past but can't remember any examples offhand.)
Anyway, the response to this suggestion is generally of the form "sounds like a good idea but I don't want to take the time to do it". Admittedly 822 is less subject to varying interpretations than many other standards. (Of course there is also the problem of how it compares with what real mail systems use.)  Received: from Diablo by SCORE with Pup; Wed 4 May 83 15:40:39-PDT Date: Wednesday, 4 May 1983 15:40-PDT To: cak at Purdue.arpa Cc: namedroppers at Sri-nic.arpa, Header-People at Mit-mc.arpa, MsgGroup at Brl.arpa Subject: Re: why are these separate lists? In-reply-to: Your message of 4 May 1983 01:31:13-EST. From: Keith Lantz I have a similar problem, due to being on header-people and msggroup. I am actually more interested in being on namedroppers, now that I hear of it, so: (1) Could whoever add me to same? (2) Strikes me that it (namedroppers) is precisely the domain, as it were, for these discussions. Perhaps the interested parties from the other lists could simply make sure that they are now on namedroppers? Keith  Date: 4 May 1983 at 1636-PDT From: Andrew Knutsen To: eric @ Ucb-Vax (Eric Allman) cc: Header-People @ Mit-Mc Subject: Re: parsing headers In-reply-to: Your message of 4 May 1983 1201-PDT (Wednesday). <16502.31.420922882@ucbarpa> Sender: knutsen @ Sri-Unix There is one reason I can see for having two different (envelope and header) address formats: one is routed, and one is absolute. The routing info is occasionally useful to have, but the users shouldnt be exposed to it.  Date: 4 May 1983 at 1655-PDT From: Andrew Knutsen To: cak at PURDUE cc: Header-People at MIT-MC.ARPA Subject: Re: why are these separate lists? (header parsing, actually) Sender: knutsen at SRI-Unix There is also a header parser -- partly written by Dave Crocker, the rfc822 person -- which has been incorporated into the MMDF mailsystem for Unix. 
It does parse the header (well, the lines containing addresses) into a sort of token tree (more like a list in the version Im familiar with), then puts it back together. The main problem (again, in our version) is that its one-pass, so if someone has a bogus "From" line but makes up for it with a "sender" or "reply-to" line, it still flags an error. I once thought of re-doing it in YACC (the Unix compiler-compiler), but only for a little while.  Received: by YALE-BULLDOG via CHAOS; Wed, 4 May 83 22:24:01 EDT Date: Wed, 4 May 83 22:26:02 EDT From: Nathaniel Mishkin Subject: Re: why are these separate lists? To: Craig.Everhart@CMU-CS-A.ARPA Cc: cak@PURDUE.ARPA, Header-People@MIT-MC.ARPA In-Reply-To: Craig.Everhart@CMU-CS-A, 4 May 1983 1535-EDT (Wednesday) In-Reply-To: "Christopher A Kent's message of 4 May 83 12:43-EST" The CACM article was written to describe an algorithm that searched for messages containing keywords. We may be talking about different articles that had the same orientation. I'm talking about "Experience with a space efficient way to store a dictionary" by Bob Nix in the May 81 CACM. That paper references "Exact and approximate membership testers" by Carter, Floyd, Gill, Markowsky and Wegman which is in the 1978 SIGACT symposium. Using the same algorithm to suppress duplicate messages trips over the same failure, but in this case the result isn't you seeing slightly too many messages, but is instead your missing a message because it was marked as ``duplicate,'' while it wasn't a duplicate at all. In any event, this is true of the algorithm I suggested too. However, you can reduce the probability of calling a non-duplicate a duplicate by increasing the number of hash functions or the size of the hash table. My theorist hacker friend (Bob Nix) assures me that a reasonable sized table will get the chance of error down to 1 in 1000. Message-IDs are supposed to be unique within the entire Internet, if not also within the entire internet. 
Why bother with Date: and From: and Subject: and thus have to worry about transformations induced by forwarding gateways? Well, given that people are having a hard enough time getting up to RFC822, I don't really expect that a standard "Message-ID:" will be forthcoming real soon. Lacking such a standard, hashing the headers seems pretty reasonable. But perhaps you're right and this technique is an unnecessary space optimization. If you have enough disk space, you'd just store all the headers for the past week or so and do a table lookup on that. Of course, there are those of us that get enormous amounts of mail and might object to an expanding "saved headers" data base. -- Nat p.s. And then of course there's the nuisance that I'm going to get back a copy of this message from the remote mailing list. Have to save the headers of outgoing mail too I suppose. -------  Date: 4 May 1983 2225-EDT From: Andrew Scott Beals Subject: Re: why are these separate lists? (header parsing, actually) To: knutsen@SRI-UNIX cc: cak@PURDUE, header-people@MIT-MC In-Reply-To: Your message of 4-May-83 2220-EDT mmdf is full of bugs, including the header munging stuff...in fact, it's still in a state of transition... -------  Date: Wednesday, 4 May 1983 20:24-PDT To: Header-People at MIT-MC Subject: Re: why are these separate lists? In-reply-to: Mishkin's message of Wed, 4 May 83 22:26:02 EDT. From: greep at SU-DSN "Well, given that people are having a hard enough time getting up to RFC822, I don't really expect that a standard "Message-ID:" will be forthcoming real soon." Sounds like there might be a misunderstanding here. Message-ID's don't have to be in any particular form (except within the general constraints of all header fields). Just so every message-id issued is unique. Presumably each host will include its own name somewhere in the message-id so two hosts won't use the same id.
Generating usable message-id's is certainly a great deal easier than the parsing of all the crazy address fields that everyone has to do.  Date: 5 May 1983 0211-PDT From: Henry W. Miller Subject: Re: parsing headers To: eric%UCBARPA at UCB-VAX, cak at PURDUE cc: Header-People at MIT-MC, Mishkin at YALE, Miller at SRI-NIC In-Reply-To: Your message of 4-May-83 1201-PDT i have an idea: why don't we all adopt and adhere to the Internet standard, and send bitches accordingly? It just might make life easier... -HWM -------  Date: 5 May 83 10:06:53 EDT (Thu) From: cbosgd!mark@Berkeley (Mark Horton) Subject: Re: why are these separate lists? Message-Id: <8305051406.AA15594@cbosgd.UUCP> Received: by cbosgd.UUCP (3.320/3.7) id AA15594; 5 May 83 10:06:53 EDT (Thu) Received: by UCBVAX.ARPA (3.339/3.27) id AA11919; 5 May 83 07:32:20 PDT (Thu) To: cak@purdue.ARPA Cc: Header-People@mit-mc.ARPA, MsgGroup@brl.ARPA, namedroppers@nic.ARPA This debate about people getting 2 or 3 copies of everything is amusing. I can't resist pointing out that if you were using Usenet (=Netnews, and !=UUCP) instead of mailing lists, this problem never would have arisen. Only one copy of each message goes to each system, tagged with the appropriate newsgroups, e.g. this message might be tagged net.mail.namedroppers,net.mail.header-people,net.mail.msggroup (assuming the same names were used instead of something descriptive of the separate functions of the groups). Each reader then only sees it once, not twice or thrice. (We do get duplicates in Usenet, too, but most of them seem to be caused by arpanet mail that got half transmitted and then aborted before completion). I'll repeat my invitation to any sites, ARPANET or otherwise, who want to join Usenet - drop me a line and I'll point you at a nearby contact. If you run UNIX, the code is all written; if you run something else, you'll have some work to do. (Anybody want to produce a TWENEX implementation?) 
For all the comments I've heard about how the ARPANET mail technology is supposed to be better than Usenet for mass mailings (my opinion hasn't changed) I'm surprised that the very implementors of the mail systems are unable to solve a problem as simple as multiple copies of the same mail message. If nothing else, why not just have one mailing list MAIL@MC.ARPA to which we all belong? I haven't seen any qualitatively different discussions in MsgGroup or Header-People. Mark Horton  Received: by YALE-BULLDOG via CHAOS; Thu, 5 May 83 09:11:25 EDT Date: Thu, 5 May 83 09:10:05 EDT From: Nathaniel Mishkin Subject: Re: why are these separate lists? To: greep@SU-DSN.ARPA Cc: Header-People@MIT-MC.ARPA In-Reply-To: greep@SU-DSN.ARPA, Wednesday, 4 May 1983 20:24-PDT In-reply-to: Mishkin's message of Wed, 4 May 83 22:26:02 EDT. "Well, given that people are having a hard enough time getting up to RFC822, I don't really expect that a standard "Message-ID:" will be forthcoming real soon." Sounds like there might be a misunderstanding here. Message-ID's don't I guess I meant that even getting mailers to include ANY "Message-ID:" may be difficult given the rate at which things change. As for standardizing the contents of the "Message-ID:" -- I suppose it doesn't matter what the format is. However, my intuition is that a standard format might be a win. E.g. suppose the standard message format ID is a host name and a message number. The message number gets increased by the sender's mailer every time it sends a msg. Knowing this standard structure, I might have a more optimized storage scheme for received message IDs; e.g. I might save the 32 bit Internet address and a 64 bit message number instead of the actual text of the message ID. Note I'm NOT suggesting that this particular format is good -- I'm just saying that a fixed format could be helpful.
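To make the storage argument concrete, here is a small Python sketch of the compact form suggested above: a 32 bit Internet address followed by a 64 bit message number, 12 bytes in all. It is an illustration only. The function names are hypothetical, the dotted-decimal input and network byte order are assumptions, and it presumes the host name has already been resolved to an address.

```python
import ipaddress
import struct

# "!" selects network byte order with no padding: 4 + 8 = 12 bytes.
_PACKED_ID = struct.Struct("!IQ")


def pack_message_id(host_addr, msg_number):
    """Pack a dotted-decimal host address and a message number into 12 bytes."""
    addr = int(ipaddress.IPv4Address(host_addr))
    return _PACKED_ID.pack(addr, msg_number)


def unpack_message_id(packed):
    """Recover the (host address, message number) pair from the packed form."""
    addr, msg_number = _PACKED_ID.unpack(packed)
    return str(ipaddress.IPv4Address(addr)), msg_number
```

A set of these 12-byte values can then stand in for the full text of each received message ID.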
-- Nat -------  Received: by YALE-BULLDOG via CHAOS; Thu, 5 May 83 13:12:31 EDT Date: Thu, 5 May 83 13:14:12 EDT From: Nathaniel Mishkin Subject: Re: why are these separate lists? To: greep@SU-DSN.ARPA Cc: Header-People@MIT-MC.ARPA In-Reply-To: greep@SU-DSN.ARPA, Wednesday, 4 May 1983 14:51-PDT In-reply-to: cak's message of 4 May 1983 12:43:43-EST. For a long time I have been suggesting that standards that are intended to be implemented as programs (eg header parsing, FTP reply codes, etc) be described as program logic, ... Well, I suppose. But as far as I can tell, the grammar specified in 822 is just fine and not prone to multiple interpretation. In this respect, it is an improvement over the grammar given in the earlier mail standard RFC (7??). The grammatical (i.e. declarative) representation has advantages over the program (i.e. procedural) representation. If the program logic is simply going to be a recursive descent parser, you might as well just give the grammar and skip the program. -- Nat -------  Date: Thu, 5 May 83 11:01 PDT From: Taft.PA@PARC-MAXC.ARPA Subject: Re: parsing headers To: Header-People@MIT-MC.ARPA Let's hear it for "real" parsers. I expect a lot of the troubles people have with RFC 822 (or 733, for that matter) are artifacts of inadequate, ad-hoc parsers. There are some practical difficulties with writing a real 822 parser, however. First of all, RFC 822 is full of errors. I've noted half a dozen errors in the grammar in appendix D, as well as several "features" which I am virtually certain are mistakes. There are also some subtle issues that are subject to multiple interpretations (e.g., the semantics of quotes). I've had on my queue for a while writing a summary of the problems I've found; but I don't know when I will get around to this. Second, there are so many trashy headers floating around the ARPA world these days that a strict 822 parser is virtually useless.
I've found it necessary to use a substantially enlarged grammar that includes rules covering most of the headers that actually appear. Some of the rules make the grammar ambiguous, and have to be applied in a context-dependent fashion. The result is somewhat of a mess. One thing that has worked out extremely well is the introduction of a header translator in the Xerox-to-ARPA mail gateway software. It accepts headers conforming to the enlarged grammar and emits headers conforming to 822. This means that mail programs on one side of the gateway are insulated from deviations from 822 on the other side. In particular, Xerox mail programs don't need to cope with all the different kinds of headers that come from the ARPA world. This enables us to convert our mail programs to strict 822 parsers, a process that is now nearly completed. If anyone is interested in the parser used in our mail gateway, I will be delighted to make it available. It is written in Macro-10; but the interesting part of it is a table-driven push-down automaton that is manually derived from the 822 grammar and is fairly easy to change. It produces a complete parse tree which retains all the original text of the header, including comments and formatting; unparsing this would yield the original header. Syntactic modifications are made by fiddling with the tree. Ed Taft  Date: Thu, 5 May 83 12:17:27 PDT From: Rich Wales To: Header-People@MIT-MC.ARPA Subject: Header translation -- pro and con In-reply-to: Ed Taft's message of Thu, 5 May 83 11:01 PDT Ed Taft's idea of translating incoming headers into strict RFC822 has a lot to be said in its favor, but there are risks too: (1) You need to be VERY certain that you know what the foreign host meant by its non-standard header. 
Otherwise, in the process of "translating" it into orthodox RFC822, you could end up changing (or throwing away) the intent of the original header -- and you might never know that this had happened, because you would have long ago discarded the original. (2) There is (at least, there once was) a fairly strongly held belief in the mail-processing world that it was OK to add extra lines to a header, but that the original header should not be modified. Part of this feeling, I think, stemmed from some mail management systems "out there" which connected messages with replies by looking for such things as date strings in "In-reply-to:" lines -- and which got confused if another host had chewed up and spat out the date in a form not byte-for-byte identical with the original. Our solution in the UCLA CS Dep't has been to keep the incoming header as is and supply the mail-reading programs with a fairly smart parsing subroutine. This way, the user can override the program's interpretation of the header if necessary (which he couldn't do if the original header were long gone). We still run into trouble sometimes when we have to translate between UUCP headers and the RFC822-style headers we use internally. In particular, Berkeley has of late started to put both UUCP-style "From" and RFC822-style "From:" lines in their UUCP mail. This creates problems if the mail is to be forwarded beyond us to another UUCP site, because Berkeley apparently expects us to add our UUCP site name to the RFC822-style "From:" line before forwarding -- a form of surgery which our software does not yet do (and which I am not yet convinced we want to do anyway!). -- Rich  Date: 5 May 1983 1606-PDT Sender: OLE at SRI-CSL Subject: All that irrelevant information! From: Ole at SRI-CSL (Ole J. Jacobsen) To: MSGGROUP at MIT-MC Cc: HEADER-PEOPLE at MIT-MC Message-ID: <[SRI-CSL] 5-May-83 16:06:41.OLE> I sit here reading a huge backlog of mail at 300 baud and what do I see?
Every message has this strange "Received:" field saying stuff like: "from BRL by SRI-CSL via ARPANET/MILNET with TCP/SMTP; Fri 29 Apr 83 12:59-PDT From Brl-Bmd.ARPA by BRL via smtp; 27 Apr 83 22:22 EDT From brl-gateway2.ARPA by BRL-BMD via smtp; 27 Apr 83 22:15 EDT From Mit-Mc.ARPA by BRL via smtp; 27 Apr 83 22:10 EDT via:.....etc" Now, can anyone tell me what use this is (apart from those who gather statistics on mail and delivery times) and why it sits there as an apparently required field? And is it not the case that all this routing stuff will grow and grow forever as the network grows (domains or not)? Is it not enough that my mailer knows? (There is a field in every msg that says: "Return-Path: MAILER at SRI-CSL") Maybe this is a very naive question, enlighten me please!  Date: Thursday, 5 May 1983 17:41-PDT To: Ole at SRI-CSL (Ole J. Jacobsen) Cc: MSGGROUP at MIT-MC, HEADER-PEOPLE at MIT-MC Subject: Re: All that irrelevant information! In-reply-to: Your message of 5 May 1983 1606-PDT. <[SRI-CSL] 5-May-83 16:06:41.OLE> From: greep at SU-DSN It really is useful for mail system maintainers to have that information around. Certainly the average user should not have to see it if he doesn't want to. There is no reason why mail-reading programs cannot filter out that information, and I think some of them do. (Most people aren't interested in seeing message-id's either, but they can be useful too.) However, there is no place to put the information other than in the message itself, and the way messages are defined, header fields all come at the beginning (before the body of the message). So the moral is: (1) find out if your mail program has a way to hide the fields you don't want to see; (2) if not, find out why not and maybe try to get someone to add this feature. (I won't suggest doing it yourself for fear of being deluged by comments from the rest of the world about how it shouldn't be necessary for each user to do this himself. I agree.)
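The kind of filtering a mail-reading program might apply before display can be sketched as follows. This is a minimal Python illustration, not the behaviour of any particular mail program: the function name hide_fields and the default list of hidden fields are assumptions. It drops the named fields together with their folded continuation lines and leaves everything else untouched.

```python
def hide_fields(header_text, hidden=("received", "message-id", "return-path")):
    """Return the header block with the named fields removed for display."""
    kept = []
    skipping = False
    for line in header_text.splitlines(keepends=True):
        if line[:1] in (" ", "\t"):
            # Continuation line: it shares the fate of the field it belongs to.
            if not skipping:
                kept.append(line)
            continue
        name = line.partition(":")[0].strip().lower()
        skipping = name in hidden
        if not skipping:
            kept.append(line)
    return "".join(kept)
```

The stored message is not modified, so the hidden fields remain available to whoever maintains the mail system.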
(If you were running Unix, it would be very easy to do this with an existing program, "egrep".)  Date: Thu 5 May 83 17:45:33-PDT From: Mark Crispin Subject: Re: All that irrelevant information! To: Ole@SRI-CSL.ARPA cc: MSGGROUP@MIT-MC.ARPA, HEADER-PEOPLE@MIT-MC.ARPA Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) In-Reply-To: Your message of Thu 5 May 83 16:06:00-PDT Ole - From time to time various individuals ask "why has this message been delayed for three months before I received it?" The Received: lines are invaluable in answering that question. Without them, all that can be done is worthless finger-pointing. Most mail reading processes have modes to suppress display of excess headers. -- Mark -- -------  Date: 5 May 1983 2051-PDT (Thursday) From: eric%UCBARPA@Berkeley (Eric Allman) Subject: Re: Heretical ideas Message-Id: <8611.31.421041088@ucbarpa> Received: by UCBARPA.ARPA (3.336/3.27) id AA08612; 5 May 83 20:51:30 PDT (Thu) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.339/3.27) id AA09223; 5 May 83 20:54:12 PDT (Thu) Phone: (415) 548-3211 To: mo@LBL-CSAM Cc: Header-People@MIT-MC.ARPA In-Reply-To: Your message of 4 May 1983 1749-PDT (Wednesday). <8305050049.AA07717@LBL-CSAM.ARPA> I can't agree with you completely. As you know, I am a (reluctant) proponent of header munging. In the best of all possible worlds, every host (on the little-i-internet) would have a unique host name. Since I am a pragmatist, this host name could have sufficient structure as to be practical for environments that could not or would not support general name services. Domains give you this ability. For example, at Berkeley there is a domain called "CC" (for the Computer Center). Although this does not correspond to a host per se, it is possible to resolve this name trivially by sending to a specific host that is prepared to process this domain. I consider this fundamentally different than UUCP-style routing. 
You seem to feel that "all" we need to do is route replies back to the originator and then to the initial recipients. I suspect this would be an unacceptable burden. In general, addresses should (by definition) be unambiguous regardless of your position. Return paths may give you an escape mechanism, but they should be no more than that. On the other hand, I disagree with Jon also. The ability to explicitly route messages back to a particular site for interpretation by their host tables is necessary for a number of reasons, including access to hosts that are new to the internet and for debugging [note my implicit assumption that debugging is never complete.... sigh]. What amazes me is that we have such similarities to the phone system and the postal system, and yet we seem to be determined to make the same mistakes. I would urge everyone involved with mail systems to read "Notes on Distance Dialing" for a very instructive tutorial. On an unrelated topic, let me cast my vote against addresses based on geographical vagaries. The fact that I can communicate with you as easily as I can with Jon Postel in SoCal or Brendan Reilly in Delaware should demonstrate how pointless this is. Just where is FORD-WDL3 anyhow? Do I care? It seems far more important to me to know that the two CSNET relays are logically related than to know that they are on opposite sides of the continent. When people traveled by foot or by horseback, this was important. We travel by electron; in this day of multinationals, affiliation is far more important. eric  Date: 5 May 83 19:53:10 EDT (Thu) From: Ron Natalie To: header-people@Mit-Mc.ARPA Subject: MSG ID's and duplicate messages. The idea of deleting duplicate messages has some merit, especially (as what recently happened to INFO-MICRO and net.micro) somebody goes into a loop and mails the identical letter 38 times. This is a rarity. There is a lot of traffic on a list such as info-micro, and such loops don't often happen. 
Another problem is that frequently one of the sides of a TCP connection will die while transferring mail. Under our current implementation (a heavily modified MMDF), the addressees get a copy that is tagged as incomplete. If we were to suppress duplicate messages, either the incompletes would not be deleted (same old problem) or the later and more complete copies would be deleted (horrors). Deleting the incompletes is slightly dangerous since replacement copies may never come (the sender in most cases gets a "failed mail" message back, thankfully). I am willing to put up with the occasional duplication and the extremely rare crazed looping of someones mailer. These are easily taken care of by my mail reader (although others might not be so lucky). I'd rather see even fragments of mail that was coming to me. -Ron (INFO-MICRO-REQUEST@BRL)  Date: Friday, 6 May 1983, 00:01-EDT From: Christopher C. Stacy Subject: All that irrelevant information! To: Ole J. Jacobsen Cc: MSGGROUP at MIT-MC, HEADER-PEOPLE at MIT-MC In-reply-to: <[SRI-CSL] 5-May-83 16:06:41.OLE> The mail reader which I use allows me profile options to name any header fields which I dont want to see. There is also a "Delete Duplicate Messages" command which looks at various header fields in an attempt to win.  Date: Thursday, 5 May 1983 22:45-PDT To: Ron Natalie Cc: header-people at MIT-MC Subject: Re: MSG ID's and duplicate messages. In-reply-to: Your message of 5 May 83 19:53:10 EDT (Thu). From: greep at SU-DSN The particular problem you describe (eliminating duplicates when the original may have been only partially received) seems easy enough to solve -- when your mail server sees a transmission error, as opposed to a proper end-of-message, it can munge the header somehow, e.g. change the name of the "message-id:" field to "incomplete-message-id:", thus making it appear different from the same message received in its entirety.
(If you typically get more than one incomplete version and really want all of them kept, then it could stick the time received somewhere in there too.)  Date: 5 May 1983 2056-PDT (Thursday) From: eric%UCBARPA@Berkeley (Eric Allman) Subject: Re: parsing headers Message-Id: <8646.31.421041382@ucbarpa> Received: by UCBARPA.ARPA (3.336/3.27) id AA08647; 5 May 83 20:56:24 PDT (Thu) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.339/3.27) id AA09239; 5 May 83 20:59:05 PDT (Thu) Phone: (415) 548-3211 To: Andrew Knutsen Cc: Header-People@Mit-Mc In-Reply-To: Your message of 4 May 1983 at 1636-PDT. <8305050818.AA09266@UCBVAX.ARPA> Fcc: mail But note that both header and envelope support syntax for routing. The differences are far more gratuitous. eric  Date: 5 May 1983 2100-PDT (Thursday) From: eric%UCBARPA@Berkeley (Eric Allman) Subject: Re: parsing headers Message-Id: <8677.31.421041623@ucbarpa> Received: by UCBARPA.ARPA (3.336/3.27) id AA08678; 5 May 83 21:00:25 PDT (Thu) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.339/3.27) id AA09297; 5 May 83 21:03:08 PDT (Thu) Phone: (415) 548-3211 To: Henry W. Miller Cc: Header-People@MIT-MC In-Reply-To: Your message of 5 May 1983 0211-PDT. <8305050914.AA09741@UCBVAX.ARPA> I would love it if everyone would adopt the Internet standards, assuming they were extended to handle the internet. Do you propose to write the code and personally carry it to all 1200 UUCP sites, all Xerox sites, all PUP sites, all Chaos sites, all DECNET sites, all BITNET sites, etc., etc.? Don't forget the users manuals!!! Personally, I already find myself busy for most of the rest of this century. 
eric  Date: 5 May 1983 13:43-PDT From: John Gilmore Subject: Incorrect dates in message headers Message-Id: <8305060143.AA00591@sun.uucp> Received: by sun.uucp (3.320/3.14) id AA00591; 5 May 83 18:43:08 PDT (Thu) Received: by UCBARPA.ARPA (3.336/3.27) id AA10372; 6 May 83 00:54:39 PDT (Fri) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.339/3.27) id AA12059; 6 May 83 00:57:33 PDT (Fri) To: Postmaster@SU-SCORE.ARPA, Postmaster@SRI-NIC.ARPA, Postmaster@Berkeley Cc: Header-People@MIT-MC.ARPA I just noticed that in the following message header, none of the Received: or Date: lines has a correct date. RFC 822 specifies that the day-of-week, if present, is a 3-letter abbreviation preceding the date and separated from it by a comma. Here's what we got instead: Wednesday, 4 May 1983 15:40-PDT Wed 4 May 83 16:00-PDT Wed 4 May 83 15:40:29-PDT 4 May 83 16:03:01 PDT (Wed) 4 May 83 17:03:34 PDT (Wed) 5 May 83 00:38:04 PDT (Thu) The format with the day-of-week after the date and time, in parentheses, is technically OK since the parentheses designate a comment, but it would be better to use the defined format. John Gilmore, Postmaster@sun.uucp , sun!postmaster@Berkeley _________________________________ >Date: Wednesday, 4 May 1983 15:40-PDT From: Keith Lantz Subject: Re: why are these separate lists? Message-Id: <8305042303.AA02716@UCBVAX.ARPA> Received: from SU-SCORE by SRI-NIC via ARPANET/MILNET with TCP/SMTP; Wed 4 May 83 16:00-PDT Received: from Diablo by SCORE with Pup; Wed 4 May 83 15:40:29-PDT Received: from SRI-NIC (sri-nic.ARPA) by UCBVAX.ARPA (3.339/3.27) id AA02716; 4 May 83 16:03:01 PDT (Wed) Received: from UCBVAX.ARPA by UCBARPA.ARPA (3.336/3.27) id AA21576; 4 May 83 17:03:34 PDT (Wed) Received: by sun.uucp (3.320/3.14) id AA00413; 5 May 83 00:38:04 PDT (Thu) To: cak@Purdue.arpa Cc: namedroppers@Sri-nic.arpa, Header-People@Mit-mc.arpa, MsgGroup@Brl.arpa In-Reply-To: Your message of 4 May 1983 01:31:13-EST. 
(text deleted)  Date: 6 May 1983 at 0159-PDT From: Andrew Knutsen To: eric%UCBARPA @ Ucb-Vax (Eric Allman) cc: header-people @ Mit-Mc Subject: Re: parsing headers In-reply-to: Your message of 5 May 1983 2056-PDT (Thursday). <8646.31.421041382@ucbarpa> Sender: knutsen @ Sri-Unix As I understand it, the routed syntax is :, where is the routing-part and is a domain-address that could go in a header. I really think that routes should be kept out of headers, because they are confusing and also if theyre there they will be used, and the more efficient routings allowed by use of domains wont. Of course, in real life fudge factors will be required. Someone is going to send mail from some brand-new place that nobody knows about, and will want mail back. Thats one of the times when either the return-path or routing in the local-part (added by the gateway) could be used. Hopefully minimally. Here is the relevant piece of rfc822 for reference (and it is still a Request For Comments rather than a Requirement For Compliance, isnt it?). It seems to imply that the "route-addr" option in "mailbox" will be used only when the sender desires source-routing: 6. ADDRESS SPECIFICATION 6.1. SYNTAX mailbox = addr-spec ; simple address / phrase route-addr ; name & addr-spec route-addr = "<" [route] addr-spec ">" route = 1#("@" domain) ":" ; path-relative addr-spec = local-part "@" domain ; global address local-part = word *("." word) ; uninterpreted ; case-preserved domain = sub-domain *("." sub-domain) sub-domain = domain-ref / domain-literal domain-ref = atom ; symbolic reference domain-literal = "[" *(dtext / quoted-pair) "]" 6.2.7. EXPLICIT PATH SPECIFICATION At times, a message originator may wish to indicate the transmission path that a message should follow. This is called source routing. The normal addressing scheme, used in an addr-spec, is carefully separated from such information; the portion of a route-addr is provided for such occa- sions. 
It specifies the sequence of hosts and/or transmission services that are to be traversed. Both domain-refs and domain-literals may be used. Note: The use of source routing is discouraged. Unless the sender has special need of path restriction, the choice of transmission route should be left to the mail tran- sport service.  Date: 6 May 1983 0906-PDT (Friday) From: eric%UCBARPA@Berkeley (Eric Allman) Subject: Re: parsing headers Message-Id: <13316.31.421085212@ucbarpa> Received: by UCBARPA.ARPA (3.336/3.27) id AA13317; 6 May 83 09:06:54 PDT (Fri) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.339/3.28) id AA18844; 6 May 83 09:09:48 PDT (Fri) Phone: (415) 548-3211 To: Andrew Knutsen Cc: header-people@Mit-Mc In-Reply-To: Your message of 6 May 1983 at 0159-PDT. <8305060938.AA12888@UCBVAX.ARPA> Fcc: mail You miss my point entirely. I am not discussing whether or not it is a good idea to use source routing. However, both RFC821 and RFC822 define syntaxes that are very close to being semantically and syntactically identical -- but only very close. I want to be able to write one block of code that will manipulate (parse and generate) addresses, and have it work correctly for both my envelope and my header. eric  Date: 6 May 83 12:30 EST From: Stephen Tihor To: eric%UCBARPA@UCB-VAX.ARPA Subject: RE: parsing headers Cc: header-people@MIT-MC.ARPA Message-ID: <12644AD48.003A001D;1983@CMCL1.NYU.ARPA> In-Reply-To: <8677.31.421041623@ucbarpa> ; Message of 6-MAY-1983 02:31 from eric%UCBARPA@Berkeley (Eric Allm (Actually re: Universal Domain syntac mailers) Actually people are already working on Internet mailers of Unix (BSD), and that code can and certainly will be ported to USG Unix as well. The effort to slot in a new mail deliverer if the authors bother to debug it is within the level of support reasonably expected from Usenet sites. (See also gateway comments below). 
Xerox sites: ignoring 10s and 20s I can communicate with a friend off on the Grapevine with my 822 mailer without problems already. I don't know who is supporting the BITNET software but someone seems to do route translation on some messages I send to Bit-Net so that's just one change at the gateway site. PUP: beats me, I don't deal with PUP to the best of my knowledge. Given that I gather there are PUP nets hanging behind 10s and 20s on the Internet and given the high caliber of the people doing mailers for 10s and 20s I expect the Internet problem to be handled at the gateway trivially, and back into the PUP net if people there are really serious about wanting to talk to the rest of the world. I expect the Chaos-nets to convert with a swoosh of hacking from MIT as soon as MIT and Stanford become legal top level domains and subdomain registration authorities. Certainly it would be better than A%B%C%D which they have to use to talk through the current gateway to the Internet. Actually I don't expect too much from DECnets since DEC's corporate direction for non-LSG systems is towards an NBS compatible mailer although I certainly try to lobby for Internet compatibility capability. Fortunately DECnet supports arbitrary quoted strings in addresses so gatewaying is not that difficult, trivial if you are willing to use a dedicated gateway machine. Also I am sending this message from an RFC822 mailer which can, depending on which local machine I am using, send this message over the local DECnet, a UUCP link, or an ARPAnet link, OR SOME COMBINATION thereof. -------  Date: 6 May 1983 1009-PDT From: MILLS at USC-ISID Subject: Re: parsing headers To: eric%UCBARPA at BERKELEY, knutsen at SRI-UNIX cc: header-people at MIT-MC, MILLS at USC-ISID In response to the message sent 6 May 1983 0906-PDT (Friday) from eric%UCBARPA@Berkeley eric, I would like to see that, too. However, note that SMTP addresses can have escape characters, [...] addresses and such.
That's real nifty, but would probably bend a mailer or two. Regards, Dave -------  Date: 7 May 1983 0923-PDT (Saturday) From: eric%UCBARPA@Berkeley (Eric Allman) Subject: Re: Incorrect dates in message headers Message-Id: <12761.31.421172585@ucbarpa> Received: by UCBARPA.ARPA (3.336/3.27) id AA12763; 7 May 83 09:23:07 PDT (Sat) Received: from UCBARPA.ARPA by UCBVAX.ARPA (3.339/3.28) id AA08213; 7 May 83 09:25:06 PDT (Sat) Phone: (415) 548-3211 To: John Gilmore Cc: Header-People@MIT-MC.ARPA In-Reply-To: Your message of 5 May 1983 13:43-PDT. <8305060143.AA00591@sun.uucp> Fcc: mail John, please note that RFC821 specifies a different format for the date in the Received: line. I don't put the day-of-week before the rest of the date string because it would violate that standard. Amusingly enough, RFC822 specifies a different format for the same date in the same header field. Sigh. RFC821 doesn't spec comments at all, but it seemed better to do it that way than to leave out the day-of-week completely. Differences in the white space are acceptable according to the spec. On the other hand, the date in the Date: field is in RFC733 syntax. By the way, dashes are not legal between the time and the zone in 822. eric  Date: 7 May 1983 18:18 EDT (Sat) From: Howard D. Trachtman To: Header-people@MIT-MC Subject: interesting? header munging Date: 6 May 1983 15:42 EDT From: Richard P. Gabriel To: common-lisp at SU-AI Received: from SAIL by SCORE with Pup; Fri 6 May 83 13:05:52-PDT Received: from RANDOM-PLACE by SU-AI with TCP/SMTP; 6 May 83 12:57:17 PDT Received: from MIT-MC by USC-ECL; Fri 6 May 83 12:53:53-PDT This is a header I received asking for confirmation of receiving this message. Note the Received: from RANDOM-PLACE by SU-AI with TCP/SMTP line. While quite amusing, it really isn't providing a lot of information.
Is this just a host claiming to be a RANDOM-PLACE or is it more seriously a case of a host munging a header inaccurately or not believing what another host claims to be.  Date: Sat 7 May 83 17:01:54-PDT From: Mark Crispin Subject: "header munging" To: Header-People@MIT-MC.ARPA cc: HDT%MIT-OZ@MIT-MC.ARPA Postal-Address: 725 Mariposa Ave. #103; Mountain View, CA 94041 Phone: (415) 497-1407 (Stanford); (415) 968-1052 (residence) SAIL's network software uses a package for name lookups which is still HOSTS2-based, and I imagine has not been changed since the TCP transition. I believe SAIL is now using host addresses in a standard (as opposed to HOSTS2) format, it isn't too surprising that address lookups in old software don't work. I confess, several years ago when I wrote the SAIL HOSTS2 software I had the address-to-name translation routine return RANDOM-PLACE for a failing lookup instead of something sane such as [10.3.0.11]. People do learn their lessons over time and the TOPS-20 mailsystem software did not repeat that mistake. It's possible that the I.T.S. software still does the same sort of thing, since SAIL's HOSTS2 software was cloned off of the I.T.S. stuff. By the way, your header refers to your message as being from "HDT@MIT-OZ". The Internet doesn't know what an "MIT-OZ" is, because MIT-OZ is a relative name in a different address space. That's why we need something (almost anything!) to assign absolute addresses across the universe. I haven't heard from any of the MIT people about how Chaosnet sites could be absolute addressed... 
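The point about failing lookups can be illustrated with a short sketch. This is not the SAIL or TOPS-20 code; the function name `host_name_for`, the table `HOSTS`, and the entry in it are hypothetical, chosen only to show an address-to-name routine that falls back to a bracketed address (the RFC 822 domain-literal form) instead of a placeholder such as RANDOM-PLACE.

```python
# Hypothetical host table: dotted address -> registered host name.
HOSTS = {"10.0.0.6": "EXAMPLE-HOST"}


def host_name_for(address, table=HOSTS):
    """Return the registered name for an address, or a domain literal.

    When the lookup fails, the caller still gets something a human or
    another mailer can act on: the address itself in square brackets.
    """
    name = table.get(address)
    if name is not None:
        return name
    return "[" + address + "]"


print(host_name_for("10.0.0.6"))   # known address: the table's name
print(host_name_for("10.3.0.11"))  # unknown address: bracketed literal
```

With such a fallback, the Received: line would at least carry the peer's address in brackets when the name table is out of date.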
-------  Date: 8 May 83 00:03:35 PDT (Sun) From: wcwells%Topaz.CC@Berkeley Subject: Basic Message Format Message-Id: <8305080715.AA15794@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA15794; 8 May 83 00:15:50 PDT (Sun) To: MsgGroup@brl, header-people@mit-mc Cc: DCrocker@UDel-Relay, Postal@ISIF, cbosgd!mark@Berkeley Basic Message Format - 1 - "Structure of a Message" My conception of what the structure of an electronic mail message should be is based on the basic message format used by military communications. A basic military message consist of three parts: the "heading", "text", and "ending". HEADING Beginning Procedures Message Heading (Addresses, etc) TEXT Body of Message ENDING Ending Procedures The "beginning procedures" and "ending procedures" are dependent upon the transmission medium and change as the message is passed from one type of circuit to another. Applying the "postal letter" model and generalizing the above I have derived the following "basic message structure": HEADING Transmission Heading Envelope Heading Message Heading TEXT Message Text (Body of Message) ENDING Message Ending Envelope Ending Transmission Ending Using the analogy of a postal letter, the Message Heading, Message Text, and Message Ending are the "letter"; the Envelope Heading and Envelope Ending are the "envelope"; and the Transmission Heading and Transmission Ending are the "mailbag". The "basic message" consists of the Message Heading, Message Text and Message Ending. These parts are ones that the drafter and his mail formatting program create. A "posted message" consists of the Envelope Heading, Message Heading, Message Text, Message Ending, and Envelope Ending. The "message envelope" consisting of the Envelope Heading and Envelope Ending (if defined) contains elements of information added by the post office ("originating host") which accepts the message for transmission. This is where the first postmark ("Message-Id") is added. 
Post offices passing the message along ("relaying hosts") may or may not add their postmark ("Received"). The message may be put into different mailbags (Transmission Headings and Transmission Endings such as SMTP, UUCP) as it is passed along between different post offices. To ensure that the message is delivered, each post office handling the message takes the message out of one mailbag before putting it in another mailbag. (Even though it is possible to put one mailbag inside another, the next post office may not be able to read what instructions are written on the inner mailbag.) Applying the above model, I find that all messages have a text (a message has to say something); that military messages have a Transmission Heading, a Message Heading, a Text, and a Transmission Ending; and that Internet messages (RFC 822) have a heading that has a mix of Envelope Heading and Message Heading elements, a Text, and no ending. The Internet message format (RFC 822) defines elements of the Message Heading and some elements of the Envelope Heading. Amateur Radio messages have elements of the Transmission Heading, Message Heading, Message Ending, and Transmission Ending. Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720  Date: 08 May 83 01:48:36 PST (Sun) From: Stef.UCI@Rand-Relay Return-Path: Subject: Re: Basic Message Format To: wcwells%Topaz.CC@Ucb-Vax , Cc: MsgGroup@Brl, header-people@Mit-Mc, cbosgd!mark@Ucb-Vax stef.UCI@Rand-Relay In-Reply-To: Your message of 8 May 83 00:03:11 PDT <8305080711.AA15723@UCBVAX.ARPA> Via: UCI; 8 May 83 1:52-PDT I can see a great deal of value in separating between the "message" (header + text), the "envelope", and the "mailbag" but I am not sure what adding separate endings for each of these will do for us in our computer network mail systems.
I can see why the military might do that to help keep things separated in the TWX environment where the transmission facilities are not organized to bound these entities naturally. When all messages, envelopes, and transmission signals are carried "inband" on the same channel using otherwise undistinguishable text strings, all those "endings" must be very useful. But, we do use various delimiters to separate messages in files, after they have been transmitted, so I suppose we do find use for "endings" in that context. Stef  Received: by YALE-BULLDOG via CHAOS; Sun, 8 May 83 12:31:13 EDT Date: Sun, 8 May 83 11:52:22 EDT From: Nathaniel Mishkin Subject: Re: "header munging" To: Mark Crispin Cc: Header-People@MIT-MC.ARPA In-Reply-To: Mark Crispin , Sat 7 May 83 17:01:54-PDT By the way, your header refers to your message as being from "HDT@MIT-OZ". The Internet doesn't know what an "MIT-OZ" is, because MIT-OZ is a relative name in a different address space. That's why we need something (almost anything!) to assign absolute addresses across the universe. I haven't heard from any of the MIT people about how Chaosnet sites could be absolute addressed... We have a local Chaosnet too and we have done the illegal: we have created local domains and use the domain naming strategy internally. E.g. my expanded mailing address is "F.MISHKIN@YALE-RES.YALE.ARPA" (I know that should really be "@RES.YALE.ARPA", but let's ignore that). Fortunately for the outside world, before mail goes out to the Internet, it gets "normalized". We use the fact that there can be names (i.e. "local parts") in the "YALE.ARPA" domain as well as in the "YALE-xx.YALE.ARPA" domains. Names in the former are entries in our local data base; names in the latter sort of domain are user IDs on particular machines. In general, the mail system will substitute a YALE.ARPA name for a name lower in the tree if possible (the data base contains a local-net-wide user ID to person mapping). 
We don't support the "userid%localhost@YALE.ARPA" "convention". However, as soon as someone legitimizes some strategy for defining local domain names to the world, you'll be able to send to "userid@localhost.YALE.ARPA" since the local software is already dealing with such address. Right now, it's not such a big problem because everyone who's "supposed" to be able to get Internet mail is in the data base. The "wrong" but sensible thing to do at this transitional point would be to allow people to start using local domains but require that the name immediately under ARPA still be something that has an Internet address (or if you want to be fancy, a list of Internet addresses can be associated with the sub-ARPA domain). The strategy for a host trying to mail to "FOO.BAR.ZOT.ARPA" would be to try to contact "ZOT.ARPA". I believe I am not the first to make this suggestion. Certainly it is a hack but it is certainly no worse than the ever-growing "%" convention. In fact, it is probably better since while the implementation may be a crock, at least the syntax is "official". -- Nat -------  Date: 9 May 83 14:19:29 PDT (Mon) From: wcwells%Topaz.CC@Berkeley Subject: Re: Basic Message Format Message-Id: <8305092127.AA19481@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA19481; 9 May 83 14:27:08 PDT (Mon) To: header-people@mit-mc Basic Message Format - 1 - "Structure of a Message" - Remarks In reply to: Date: 08 May 83 01:48:36 PST (Sun) From: Stef.UCI@Rand-Relay I can see a great deal of value in separating between the "message" (header + text), the "envelope", and the "mailbag" but I am not sure what adding separate endings for each of these will do for us in our computer network mail systems. I agreed that some systems may not use or need more than one type of ending. One of the simplest endings I can think of is no "message ending" and an "end-of-file" for the "envelope ending" and "transmission ending". However, separate endings have their uses. 
The "message ending" is at the user application level. In some systems the "message ending" contains the signature. I have also seen it used to repeat critical parts of the text (for example, all numeric fields in a telegram) as a "confirmation" that those parts have not been garbled in transmission. One of the most significant uses of the ending part of a message is to ensure that the message has been completely transmitted. In some systems, a "message ending" string indicates to the reader that he has received the end of the message. For various reasons, hosts out there in netland like to periodically truncate messages. Clerks have also been known to lose the last page of a message when they reproduce it offline. If the end of message is broken off at a logical point, the reader may have no indication that he has an incomplete message. If you are interested in reliable communications, using some type of "message ending" indicator at the user level is a very good idea. The "envelope ending" is used by the mail forwarding program(s) of "relaying hosts". It may also provide useful information for the user when things go wrong. The "originating host" can use the "envelope ending" to indicate the end of a "posted message". When a "postmark" , containing unique identification information, is repeated at the beginning and end of a message, you can test to see if the end of the message just read matches the beginning of the message. Using an "envelope ending" in this manner becomes important when messages are spooled for storage in a single file or batched (stacked) to be sent as a single transmission. If the middle of a string of messages is lost, repeating a "postmark" at the beginning and end of message can tell your mail forwarder that you have lost the middle. (If you do not use message identification at the beginning and end of each message in a stack, how can you be sure that end-of-message is not the end-of-message of the following message?) 
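The begin/end postmark check described above can be sketched as follows. The framing is an assumption made purely for illustration: each posted message in a spool opens with a line `BEGIN <id>` and closes with `END <id>`. Neither those markers nor the function name `scan_batch` come from RFC 822 or from any mailer mentioned in this discussion.

```python
def scan_batch(lines):
    """Split a spooled batch into complete messages and damaged ids.

    A message counts as complete only when the id on its END line
    matches the id on the BEGIN line that opened it. Any mismatch means
    something was lost in between, so both ids are reported as damaged.
    """
    complete, damaged = [], []
    open_id, body = None, []
    for line in lines:
        if line.startswith("BEGIN "):
            if open_id is not None:
                damaged.append(open_id)  # previous message never closed
            open_id, body = line[len("BEGIN "):], []
        elif line.startswith("END "):
            end_id = line[len("END "):]
            if open_id == end_id:
                complete.append((open_id, body))
            else:
                if open_id is not None:
                    damaged.append(open_id)  # its real ending was lost
                damaged.append(end_id)       # its real beginning was lost
            open_id, body = None, []
        elif open_id is not None:
            body.append(line)
    if open_id is not None:
        damaged.append(open_id)  # batch ended before the message closed
    return complete, damaged


# The END of "b" and the BEGIN of "c" were lost in the middle of the batch.
spool = ["BEGIN a", "hello", "END a",
         "BEGIN b", "first half of b", "second half of c", "END c",
         "BEGIN d", "ok", "END d"]
complete, damaged = scan_batch(spool)
print([mid for mid, _ in complete])  # ids whose begin and end matched
print(damaged)                       # ids with a missing half
```

Without the repeated id, the text of "b" and "c" would have been accepted as one intact message. A body line that itself starts with `BEGIN ` or `END ` would confuse this sketch; a real format would need quoting.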
The "transmission ending" is used by the transport system. The simplest "transmission ending" is a delimiter or string to indicate the end-of-transmission. Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720  Date: 9 May 83 17:44:24 PDT (Mon) From: wcwells%Topaz.CC@Berkeley Subject: Re: Basic Message Format Message-Id: <8305100054.AA21995@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA21995; 9 May 83 17:54:17 PDT (Mon) To: DDEUTSCH@BBNA Cc: MsgGroup@brl, header-people@mit-mc Basic Message Format - 1 - "Structure of a Message" - Remarks In reply to: Date: 9 May 1983 1217-EDT From: DDEUTSCH@BBNA Stef is absolutely right when he points out that military message procedures have a lot to do with their Telex-like nature. For example, a single long message might be transmitted in several smaller pieces. In that case, it is necessary to mark the last piece as being the end. You also argue that "endings" are needed in order to detect transmission errors. However, the need for doing this goes away with the use of real OSI systems. If you believe in the OSI model, the session layer provides error-free data transmission. That means that message protocols, which reside above the session layer, do not have to do error checking. However, a message protocol must make sure that all the pieces of a message are transmitted and received. That does not require any "ending" in the message content or envelope data; it requires that the sending protocol entity be able to indicate when the last piece of data is being or has been sent and that the receiving protocol entity be able to indicate that it has received the last piece of data. I was not trying to address the problem of breaking a single long message into sections and transmitting each section as a separate "posted-message".
When restrictions on the maximum length of a basic message are required, the user can be told to break his narrative message into sections with each section identified in the text as "SECTION .... OF ..." (or a gateway mail forward might do that when entering a long message into that type of network). Though I think you are right in saying that when a message is transmitted in that manner, that "end-of-message" indicator in the Message Ending would come at in the Message Ending of the last section transmitted. The main point of my article on "basic message structure", is that there are three functional levels in transporting a message from user A to user B: basic message (the "letter") - User Level posted message (the "envelope") - Host/Mail Forwarder Level network message (the "mailbag") - Network Level (Yes, I am aware that the "network message" may be a lot of little packets being transmitted separately. In some systems (eg. UUCP) it could also be a set of "posted message" which have be batched to be transmitted as a single "network message".) I will admit that the Network Level should not be of concern to the mail system, however we should be aware that it exists and that some mail systems require users to work at this level. Thus, I have included it in my generalized model. I can believe in the OSI model, but until it is implemented everywhere, I do not trust our growing world of interconnected networks to function perfectly. Not all of these interconnected systems have hardware and software that function perfectly all the time. Even within the ideal OSI environment, software errors and hardware failures are going to occur. Also note that errors not only occur online, but may occur offline as well (for example, a clerk loses part of the message while reproducing it before it is delivered to the reader). 
If we are going to design a universal message format (which appears to be the goal of the Internet message format), then the basic message format should be designed to ensure that the message sent by the drafter is the same one that is received by the reader; or if it is not complete the reader should have some indication that something is missing. Military communications are particularly concerned about reliability. For example, I might be in a combat situation and order a strike on my current location as follows: Fire on my current position at 1800. Since I am planning to leave my current position before 1800, I would be in a very unhappy situation, if the missile I ordered was fired immediately because the reader of my message only received the first line: Fire on my current position Thus, until all systems and networks connected to the Internet World are functioning perfectly under the OSI model, I think I still have a good argument for using a user level end-of-message indicator in a Message Ending of my basic message structure. Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720  Date: Monday, 9 May 1983 18:58-PDT To: wcwells%Topaz.CC at UCB-VAX Cc: MsgGroup at BRL, header-people at MIT-MC Subject: Re: Basic Message Format In-reply-to: Your message of 9 May 83 17:44:24 PDT (Mon). <8305100054.AA21995@UCBVAX.ARPA> From: greep at SU-DSN I got two copies of your message that have identical bodies but different "date" and "message-id" fields. If you are really trying to send two copies, please don't. If not, maybe your message system is doing something you don't know about.  
Date: 9 May 83 18:52:07 PDT (Mon) From: wcwells%Topaz.CC@Berkeley Subject: Message Identification Message-Id: <8305100201.AA22505@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA22505; 9 May 83 19:01:49 PDT (Mon) To: DDEUTSCH@BBNA Cc: MsgGroup@brl, header-people@mit-mc In reply to: Date: 9 May 1983 1217-EDT From: DDEUTSCH@BBNA ..., I'd like to point out that a message-id is not a postmark, and that there are many kinds of message-ids. A message-id, associated with the content of a message, refers to a communication from the message's originator to its recipients. If I transmitted the exact same message more than once, it would still have the same message-id. On the other hand, the message transfer system uses identifiers for message envelopes. These ids are different from the ids in the message content. For instance, in the example where I transmitted the exact same message more than once, its message-id would remain the same while its envelope would have a new id every time the message was sent. The purpose of an id is to identify a unit of data; the purpose of a postmark is to record the path it has taken. I think we agree. Did I get lost in my use of words? I sometimes use the term "message identification" to mean "message identification in general" Here is how I think we should identify message: Basic message - User Level: Originator ("From:") and Date-Time ("Date:") Posted message - Host/Mail Forwarding Level Originating host postmark ("Message-ID:"). Transmission Identification - Network level: Whatever your network requires. Note that if the same basic message is retransmited by the user, a new "Message-ID" is assigned by the mailer. (See definition in RFC 822). Unfortunately, most mailers do not let the user retransmit the same message with the same date, but assign a new date. If they did, we could trap duplicates. 
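As a rough sketch of the duplicate trap suggested here, messages can be keyed on the user-level identification (originator plus date-time) rather than on the Message-ID. The dictionary representation of a message, the function name `trap_duplicates`, and the sample values below are hypothetical illustrations, not any mailer's actual interface.

```python
def trap_duplicates(messages):
    """Keep the first message seen for each (From, Date) pair.

    Message-Id is deliberately ignored: a retransmission of the same
    basic message gets a new Message-Id from the mailer, but would keep
    the same originator and date if the mailer allowed it.
    """
    seen = set()
    kept = []
    for msg in messages:
        key = (msg["From"].strip().lower(), msg["Date"].strip())
        if key in seen:
            continue  # same originator and date-time: treat as duplicate
        seen.add(key)
        kept.append(msg)
    return kept


inbox = [
    {"From": "sender@HOST-A", "Date": "9 May 83 17:44:24 PDT", "Message-Id": "<id-1>"},
    # Same basic message resent with the same date but a new Message-Id.
    {"From": "sender@HOST-A", "Date": "9 May 83 17:44:24 PDT", "Message-Id": "<id-2>"},
    # Resent through a mailer that stamped a fresh date: not trapped.
    {"From": "sender@HOST-A", "Date": "9 May 83 18:10:00 PDT", "Message-Id": "<id-3>"},
]
print([m["Message-Id"] for m in trap_duplicates(inbox)])
```

The third entry shows the limitation just described: once the mailer assigns a new date, this key no longer recognizes the copy.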
Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720 ~e  Date: 9 May 83 23:22:49 PDT (Mon) From: wcwells%Topaz.CC@Berkeley Subject: duplicate message Message-Id: <8305100632.AA25420@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA25420; 9 May 83 23:32:51 PDT (Mon) To: greep@SU-DSN Cc: MsgGroup@BRL, header-people@MIT-MC If you are on both MsgGroup and header-people you may have received two copies of the same text with different dates due to the fact that I mistyped the host name for the headers-people mailbox the first time the message was sent. To ensure delivery to people in the headers-people list who are not in the MsgGroup list the message was later resent to headers-people. Sorry, my mailer does not permit me to retransmit with the same date (dates are automatically generated). Here is another vote for combining these two mailing lists into one. Bill Wells  Date: 10 May 83 18:55:23 PDT (Tue) From: wcwells%Topaz.CC@Berkeley Subject: Basic Message Format Message-Id: <8305110209.AA01166@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA01166; 10 May 83 19:09:27 PDT (Tue) To: MsgGroup@brl, header-people@mit-mc Cc: DCrocker@UDel-Relay, POSTAL@ISIF, cbosgd!mark@Berkeley Basic Message Format - 1 - "Structure of a Message" Version 2 - 10 May 83 1. Introduction. The Internet Mail Protocols as published in November 1982 provided us with a standard for the format of electronic mail message (RFC 822 - Standard for the Format of ARPA Internet Text Messages) and a method of transport mail in a packet network transmission channel (RFC 821 - Simple Mail Transfer Protocol). In view of the the number of non-OSI networks that are connected to the Internet I perceive a need for message structure model that is not only consistent with Internet mail format, but also applicable to existing non-OSI network message systems. 
The message structure defined herein includes the concepts of "transmission envelopes" and "batch envelopes" in addition to "message envelope" and "message content". I have also noticed that, although the Internet mail format standard (RFC 822) says it is not concerned with defining the "message envelope", many of the header fields defined in RFC 822 are functionally "message envelope" fields. Thus I see a need to identify which fields are "message envelope" fields and which are "message content" fields. My conception of a generalized electronic mail message structure is based on the basic message format used by military communications. A basic military message consists of three parts: the "heading", "text", and "ending". HEADING Beginning Procedures Message Heading (Addresses, etc) TEXT Body of Message ENDING Ending Procedures The heading beginning procedures are a mix of message envelope information and transmission information. The ending procedures are a mix of user information, message envelope information, and transmission information. The "message content" consists of the message heading and the text. I have found that this structure is also applicable to non-military message systems including Amateur Radio, Domestic and International Telegrams, Telex and TWX. One significant feature of the military message format is that the "message content" and much of the "message envelope" format is the same for a variety of transmission media. That is, the message can be relayed via couriers, flashing light, semaphore, flag hoist, radiotelegraph, wire and radiotelewriter, paper tape relay, wire and radiotelephone, and computer networks. 2. Basic Message Structure.
HEADING Transmission Heading Batch Heading Envelope Heading Message Heading TEXT Message Text (Body of Message) ENDING Message Ending Envelope Ending Batch Ending Transmission Ending Using the analogy of a postal letter, the Message Heading, Message Text, and Message Ending are the "letter" (message content); the Envelope Heading and Envelope Ending are the "envelope" (message envelope); the Batch Heading and Batch Ending are the rubber band around several "envelopes" being sent to the same "post office"; and the Transmission Heading and Transmission Ending are the "mail bag" (transmission envelope). The "basic message" contains only "message content" information and consists of the Message Heading, Message Text and Message Ending. These parts are ones that the drafter and his mail formatting program create. A "posted message" contains "message envelope" and "message content" information and consists of the Envelope Heading, Message Heading, Message Text, Message Ending, and Envelope Ending. The "message envelope" consisting of the Envelope Heading and Envelope Ending (if used) contains elements of information added by the post office (Message Transfer Sublayer (MTSL) program on the "originating host") which accepts the message for transmission. This is where the first postmark ("Message-Id") is added. Post offices (MTSL mail forwarding programs on relaying hosts) passing the message along may or may not add their postmark ("Received"). A "batch of messages", sometime called a "bundle" or "string of messages", is a set of "posted messages" which have been bundled together for transmission to the same post office as a single message. The post office address on a "batch of messages" should be readable by all post offices. A "batch of messages" is not unbundled until it arrives at the post office to which it is being sent. Not all message systems bundle messages. 
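One way to picture the nesting described so far is as a set of record types, each layer wrapping the one inside it. This is only an illustrative sketch: the class names, field names, and the helper functions `post`, `relay`, and `bundle` are assumptions made here and are not defined by RFC 822 or by this proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BasicMessage:
    """The "letter": what the drafter and his formatting program create."""
    heading: Dict[str, str]
    text: str
    ending: str = ""


@dataclass
class PostedMessage:
    """The "envelope" around a letter, maintained by post offices."""
    message: BasicMessage
    message_id: str                                    # first postmark
    received: List[str] = field(default_factory=list)  # later postmarks
    envelope_ending: str = ""


@dataclass
class Batch:
    """The "rubber band": posted messages bound for one post office."""
    post_office: str
    posted: List[PostedMessage] = field(default_factory=list)


def post(message, message_id):
    """Originating host accepts a basic message and postmarks it."""
    return PostedMessage(message=message, message_id=message_id)


def relay(posted, host):
    """A relaying host may add its own postmark; the letter is untouched."""
    posted.received.append(host)
    return posted


def bundle(post_office, posted_messages):
    """Bind several posted messages for transmission to one post office."""
    return Batch(post_office=post_office, posted=list(posted_messages))


letter = BasicMessage({"From": "A", "To": "B"}, "Body of message")
env = relay(post(letter, "<id-1@host-a>"), "host-b")
batch = bundle("host-c", [env])
print(batch.posted[0].received, batch.posted[0].message.text)
```

The envelope and batch layers only wrap the letter; nothing in `post`, `relay`, or `bundle` modifies the `BasicMessage` itself.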
A "transmission message" may be either a "posted message" or a "batch of messages" surrounded by a "transmission envelope". That is, the "envelopes" (posted messages) and "bundles" (batches of messages) may be put into different mail bags (transmission envelopes) as they are passed along between different post offices. The format of the transmission envelope will vary with the type of network being used to transmit the message. "Post offices" are also known as "Message Transfer Agents (MTA)". To ensure that the message is delivered, each post office ("gateway MTA") that changes the mode of transportation (e.g., ARPANET to UUCP) takes the message out of one mail bag before putting it in another mail bag. (Even though it is possible to put one mail bag inside another, the next post office may not be able to read what instructions are written on the inner mail bag.) An alternative to changing "mail bags" is to "refile the message". The simplest way to refile a message from one system to another is to quote the "basic message" in the text of a message that has the correct type of heading (and ending) for the next network transporting the message. The refile method is used where the formats of two message systems are incompatible. It should be noted that the trend is towards mail programs that handle "posted messages" only. There are also message systems that have no "ending" or only use one type of ending. 3. Conclusions. Applying the above model, I find that all messages have a text (a message has to say something); that military messages have a Transmission Heading, a Message Heading, a Text, and a Transmission Ending; and that Internet messages (RFC 822) have a heading that has a mix of Envelope Heading and Message Heading fields, a Text, and no ending. Amateur Radio messages have some elements of the Transmission Heading, Message Heading, Message Ending, and Transmission Ending.
If we are going to make the Internet mail format a universal standard then we will have to define header fields for the Batch Heading and Envelope Heading in addition to the Message Heading. Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720  Date: 10 May 1983 2218-PDT Sender: ESTEFFERUD at USC-ECL Subject: Re: Basic Message Format From: ESTEFFERUD at USC-ECL To: wcwells%Topaz.CC at UCB-VAX Cc: MsgGroup at BRL, header-people at MIT-MC Cc: cbosgd!mark at UCB-VAX Message-ID: <[USC-ECL]10-May-83 22:18:26.ESTEFFERUD> In-Reply-To: <8305110206.AA01132@UCBVAX.ARPA> Hi Bill --- First, your topic is OK for MsgGroup, but not very appropriate for Header-People which has always focused more on implementation aspects rather than theoretical wool gathering. So, I suggest that you omit Header-People from your distribution. And Second, fresh ideas are always welcome, but all this is beginning to sound repetitious. Before we proceed further, may we ask for a statement of your purpose? Are you doing this as a term paper for a class at UCB? Are you seriously proposing that what we need are more endings? Are you suggesting that structured mail transfer systems (822/SMTP) should be analyzed as though they were actually flat, like TWX/TELEX where there is only one channel for both transmission and signalling? I am having trouble shaking the feeling that you have not yet come to grasp the important concepts of structured protocols. In any case, I think that you owe the MsgGroup community a bit more background about yourself and your purposes before you continue with your lectures. Not too long winded though. About one CRT screenful should do it.
Best regards - Stef  Date: 10 May 83 23:14:19 PDT (Tue) From: wcwells%Topaz.CC@Berkeley Subject: Basic Message Format Message-Id: <8305110626.AA03921@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA03921; 10 May 83 23:26:42 PDT (Tue) To: MsgGroup@brl, header-people@mit-mc Cc: DCrocker@UDel-Relay, POSTEL@ISIF, cbosgd!mark@Berkeley Basic Message Format - 2 - "Heading Components" 1. Introduction. In my first article I introduced a Basic Message Structure having the following parts.

    HEADING    Transmission Heading
               Batch Heading
               Envelope Heading
               Message Heading
    TEXT       Message Text (Body of Message)
    ENDING     Message Ending
               Envelope Ending
               Batch Ending
               Transmission Ending

This article is my first attempt to list the header fields in the "Standard for the Format of ARPA Internet Text Messages" (RFC 822, August 13, 1982) by sub-part and functional component membership. 2. Discussion. In reviewing RFC 822, I found two fields that I consider to be "message envelope" fields: Return-Path: Received: My definition of a "message envelope" field is: a heading field that is defined by a message forwarding program (message transfer agent) at the Message Transfer Sub-layer. RFC 822 calls programs at this level "transport system" or "transport service" (see articles 4.3.1 and 4.3.2). The intended use of "Message-ID" and "Resent-Message-ID" is less clear. Article 4.6.1 states that "this identifier is intended to be machine readable and not necessarily meaningful to humans". That statement alone would make me believe that these fields are not at the user level. I think this machine generated and readable field is supposed to be the originating post office's postmark. Thus I conclude that these are "message envelope" fields. If true, then why two different fields? If a "basic message" is resent, it should be given a new "message envelope" with a new "Message-ID" by the MTA posting program. If "Message-ID" is the original postmark, then we do not need the "Resent-Message-ID" field.
If these fields have some other use, then we need to define a machine readable "Postmark:" field for the "message envelope", and leave these fields in the "message content". Comments? 3. Header Fields by Sub-Part and Component. ENVELOPE HEADING SUB-PART COMPONENT FIELD ___________________________________________________ Envelope Postmark Message-ID: Heading Return Field Return-Path: Envelope Address (not defined) Relay Instructions (not defined) Trace Fields Received: ___________________________________________________ MESSAGE HEADING SUB-PART COMPONENT FIELD ___________________________________________________ Readdressal Precedence (not defined) Message Heading Date-Time Resent-Date: Originator's Resent-From: Address Resent-Sender: Resent-Reply-To: Receiver's Resent-To: Address Resent-cc: Resent-bcc: ___________________________________________________ Original Precedence (not defined) Message Heading Date-Time Date: Heading Originator's From: Address Sender: Reply-To: Receiver's To: Address cc: bcc: Classification (not defined) Content Subject: Information Keywords: In-Reply-To: References: Encryption Encrypted: Field ___________________________________________________ Heading Comments Comments: Comments Field ___________________________________________________ Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720  Date: 11 May 1983 1036-PDT From: POSTEL at USC-ISIF Subject: Errors-To ??? To: header-people at MIT-MC How about if we add an "Errors-To:" header field that would indicate where notification of message processing errors should be sent. The main use i would see for this field is to have mail-exploders (such as header-people) add the Errors-To field with the mailbox of the maintainer of the mailing list. 
Suppose i send a message to header-people, the header as i send it might look like this:

    Date: 11 May 83 10:20:30 PDT
    From: Postel@USC-ISIF.ARPA
    To: Header-People@MIT-MC.ARPA
    Subject: Error Reports

When the mail processing program at MC gets the message and forwards it to the mailing list, it adds an Errors-To field, like this:

    Date: 11 May 83 10:20:30 PDT
    From: Postel@USC-ISIF.ARPA
    To: Header-People@MIT-MC.ARPA
    Subject: Error Reports
    Errors-To: Header-People-Request@MIT-MC.ARPA

When the mail processing program at the destination for someone on the list finds it can't deliver the mail (the mailbox has been deleted or whatever) the error report is sent to the mailbox indicated in the Errors-To field. Comments? --jon. -------  Date: Wed, 11 May 83 11:22 PDT From: Taft.PA@PARC-MAXC.ARPA Subject: Re: Errors-To ??? In-reply-to: "POSTEL@USC-ISIF.ARPA's message of 11 May 83 10:36 PDT" To: POSTEL@USC-ISIF.ARPA cc: header-people@MIT-MC.ARPA Sounds like a good idea. However, I would argue that this information belongs on the envelope (i.e., among the properties transferred by SMTP), NOT in the header. Message transport software should not have to parse the header of a message in order to figure out what to do with it. Ed Taft  Date: 11 May 1983 1322-CDT From: Clive Dawson Subject: Re: Errors-To ??? To: POSTEL@USC-ISIF cc: header-people@MIT-MC In-Reply-To: Your message of 11-May-83 1236-CDT I like your proposal--it would help eliminate a lot of unnecessary mail traffic. In the absence of such a field, the error report should presumably be sent to the original sender of the message. Incidentally, with all of the ongoing confusion related to how to deal with strange addresses, it is often the case that these error reports themselves cannot be properly delivered. This causes another error report to be sent back to the original destination, only this time it's addressed to "Mailer" or "Postmaster" or whatever generated the first error report.
If a site happens not to have some of these addresses defined ("Postmaster" is *required*, according to the RFC) I imagine some of these error reports could cycle back and forth forever without being seen by human eyes... Clive -------  Date: 11 May 83 14:20 EST From: Stephen Tihor To: POSTEL@USC-ISIF.ARPA Subject: RE: Errors-To ??? Cc: header-people@MIT-MC.ARPA Message-ID: <1314ED1F3.00DF001E;1983@CMCL1.NYU.ARPA> In-Reply-To: Message of 11-MAY-1983 13:54 from POSTEL at USC-ISIF Wonderful. That would eliminate a lot of problems without requiring that the Sender: field be a valid return address of transmission messages. (Which was the only competing proposal I have heard with the current syntax.) -------  Date: 11 May 83 12:02:15 PDT (Wed) From: wcwells%Topaz.CC@Berkeley Subject: Re: Basic Message Format Message-Id: <8305111912.AA01705@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA01705; 11 May 83 12:12:23 PDT (Wed) To: ESTEFFERUD@USC-ECL Cc: MsgGroup@BRL, header-people@MIT-MC In reply to: Date: 10 May 1983 2218-PDT From: ESTEFFERUD@USC-ECL

    Before we proceed further, may we ask for a statement of your purpose?

Part of my purpose in putting "basic message format" on the net is to get a discussion going on how the Internet message format should be implemented in a non-structured protocol environment. I have not seen any information on "message envelope" header fields. I think the "header fields" list should be expanded to include information used at the "message transfer" layer. I am also concerned about current implementations of RFC822 that do not make a distinction between "message envelope" and "message content" header fields. (I would like the "message content" I send to be the same when the reader sees it.) In comparing US Govt. (military and non-military) message header fields I have found a number of fields which I believe should be part of the Internet message format standard if it is ever going to be adopted for official communications.

    Are you doing this as a term paper for a class at UCB?

No this is not a term paper, but it may become a RFC.

    Are you seriously proposing that what we need are more endings?

I think at least one user readable ending is needed at the "message content" or "message envelope" level.

    Are you suggesting that structured mail transfer systems (822/SMTP) should be analyzed as though they were actually flat, like TWX/TELEX where there is only one channel for both transmission and signalling?

No. But I believe that the "message envelope" should be upward compatible. I think we should have a "message envelope" standard that will work in both a structured and a non-structured environment.

    I am having trouble shaking the feeling that you have not yet come to grasp the important concepts of structured protocols.

I think I grasp the concepts. Just rusty on the details.

    In any case, I think that you owe the MsgGroup community a bit more background about yourself and your purposes before you continue with your lectures. Not too long winded though. About one CRT screenful should do it.

I am a Programming Consultant at the U.C. Berkeley Center where I write user documentation and assist users with using our systems. I am also a Radioman Chief (telecommunications specialist) in the Naval Reserve with about 12 years' experience in the use of military message systems. My military duties include telecommunications planning for Naval Reserve activities in the Western US. I am interested in evaluating the feasibility of using ARPANET as an emergency communications alternate to AUTODIN. Best regards Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720  Date: 11 May 1983 1431-EDT Sender: MOOERS@BBNA Subject: Re: Errors-To ???
From: MOOERS@BBNA To: POSTEL@USC-ISIF Cc: header-people@MIT-MC, Mooers@BBNA Message-ID: <[BBNA]11-May-83 14:31:37.MOOERS> In-Reply-To: Your message of 11 May 1983 1036-PDT This is an excellent idea, except that the concept of an Errors-To: field is too limited. The fact that you place the address of HEADER-PEOPLE-REQUEST shows that the function you want is essentially the same function as the Circulation Department of a magazine or newspaper. A little while ago, I suggested that it would be easier on people who don't know the mailing-list conventions to change: From HEADER-PEOPLE as the "broadcast" address, and HEADER-PEOPLE-REQUEST as the "circulation" address, to HEADER-PEOPLE-BROADCAST for the "broadcast" address, and HEADER-PEOPLE for requests and errors A number of people objected that this goes against tradition. (Tradition certainly accumulates quickly !!) Maybe the new field that Jon Postel suggests could be Requests-to: or Changes-to: since it would be very useful for people who simply want to get on or get off the list. ---Charlotte Mooers  Date: Wednesday, 11 May 1983 18:01:20 EDT From: Mike.Accetta@CMU-CS-CAD To: POSTEL@USC-ISIF cc: Header-People@mit-mc Subject: Re: Errors-To ??? Message-ID: <1983.5.11.21.39.55.Mike.Accetta@CMU-CS-CAD> Jon, How is the anticipated function of an "Errors-to:" field different from that of the "envelope" MAIL FROM address supplied by SMTP which is also supposed to be used to report errors? Do you achieve the same effect if mail exploders simply ensure that the appropriate address ("Header-People-Request@MIT-MC" in your example) is used as the return path of the mail they redistribute or are you trying to address a different problem? I bet you are more likely to get errors returned to this address (as part of the standard SMTP delivery processing) rather than hoping to get the servers changed to pick an "Errors-to:" field out of the message text.
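The envelope-based alternative raised in this message can be sketched as below. It is a minimal illustration under stated assumptions: the `Envelope` type, the `explode` and `error_report_address` helpers, and the member addresses are all hypothetical, and no real SMTP library is involved. The exploder rewrites only the envelope return path, so the destination never has to look inside the message text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Envelope:
    """Out-of-band delivery information, kept separate from the message text."""
    return_path: str  # corresponds to the SMTP MAIL FROM address
    recipient: str    # corresponds to one SMTP RCPT TO address


def explode(list_members: List[str], list_maintainer: str) -> List[Envelope]:
    """Build one envelope per list member, with errors routed to the maintainer."""
    return [Envelope(return_path=list_maintainer, recipient=member)
            for member in list_members]


def error_report_address(envelope: Envelope) -> str:
    """A destination that cannot deliver reports to the envelope return path."""
    return envelope.return_path


# Hypothetical list membership, for illustration only.
envelopes = explode(["Sam@A", "Fred@B"], "Header-People-Request@MIT-MC")
```

Under this arrangement the original author's From: line is left alone, and a failed delivery to any member is reported to the list maintainer rather than to the author.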
I certainly concur with the other comments that having this information in the "envelope" is clearly preferable to adding it to the message text. - Mike  Date: 11 May 1983 18:12 est From: DBrown.TSDC at HI-MULTICS Subject: ? To: wcwells%Topaz.CC at UCB-VAX cc: header-people at MIT-MC, msggroup at BRL Bravo, Mr. Wells. You gave the kind of answer that I see all too rarely on this net: succinct, to the point and polite, even in answer to an unwarranted "flame". I'll be saving this one away to recommend as a model, both for my own use and for my associates --dave brown DBrown.TSDC at HI-MULTICS.ARPA decvax!watmath!watbun!drbrown late of HM Canadian Forces  Date: 11 May 1983 1857-EDT (Wednesday) From: don.provan@CMU-CS-A To: POSTEL@USC-ISIF Subject: Re: Errors-To ??? CC: header-people@MIT-MC, In-Reply-To: "POSTEL@USC-ISIF's message of 11 May 83 12:36-EST" Message-Id: <11May83.185758.DP0N@CMU-CS-A> i thought that was the function of the "return-path:" line: "It is possible for the mailbox in the return path to be different from the actual sender's mailbox, for example, if error responses are to be delivered [to] a special error handling mailbox rather than the message sender's." rfc 821, p. 22 am i reading this wrong, or is this a different type of error? i'm glad this has come up, because i've been noticing mail that has illegal "from:" lines (like with " at " instead of "@"), but good "return-path:" lines. i've been tempted to use the return path for replies, but since the above paragraph indicates that this line should only be used for errors, i haven't. don  Date: 11 May 83 23:23:14 PDT (Wed) From: wcwells%Topaz.CC@Berkeley Subject: Re: Errors-To ??? Message-Id: <8305120633.AA07659@UCBVAX.ARPA> Received: by UCBVAX.ARPA (3.339/3.28) id AA07659; 11 May 83 23:33:05 PDT (Wed) To: header-people@MIT-MC Cc: POSTEL@USC-ISIF Jon, I like the idea. I do not think "Errors-To" is clear enough.
I prefer Send-Mail-Errors-To: but would settle for Send-Errors-To: However, I think this idea comes under the heading of what some message systems call "message handling instructions". Why not generalize the field name so you do not have to define a new field for each type of message handling instruction or communications service action? You already have a "message content" level field for the user to use, it's called "Comments:". For mail transport agent use (and communications operator use) I suggest using:

    Message-Handling: machine-handling-instructions
    Message-Handling: machine-handling-instructions (comments)
    Message-Handling: (comments)

where machine-handling-instructions are operating signals and plain language instructions registered with NIC. Operating signals are already defined for International, Military, and Amateur Radio communications. Since there are several types of operating signals, I suggest that the "machine-handling-instructions" be defined as

    signal-type, message-handling-signal

where signal-type is:

    ARRL    Amateur Radio Signals defined by the American Radio Relay League, Inc
    INET    Internet Signals (registered with NIC)
    QSIG    International "Q" Signals defined by International Telecommunications Conventions (listed in ACP 131)
    ZSIG    Military "Z" Signals defined in Allied Communications Publications: ACP-131, ACP-131 US SUPP-1

And now some suggestions for Internet defined message handling instructions:

    a. Message-Handling: INET, SEND MAIL ERRORS TO mailbox
    b. Message-Handling: INET, CANCELLED AT date-time
    c. Message-Handling: INET, BOOK
    d. Message-Handling: INET, MAY BOOK

Example (a) is self-explanatory. Example (b) indicates that attempts to effect delivery should be stopped at a certain time because the message contains time sensitive information which becomes obsolete at a certain time (e.g., weather forecasts). Since the message is self-cancelling, no message should be sent back to the originator indicating non-delivery.
Example (c) would be used to indicate that the message must be handled as a "book message". Example (d) indicates that the message may be handled as a "book message". A "book message" is one which is destined for two or more addressees but is of such a nature (e.g., announcements) that the drafter considers that no addressee need or should be informed of any other addressee. "Book messages" are very useful as a method of reducing the cost of transmitting the same short message to many addressees in message systems where there is a charge for every line transmitted. Regards. Bill Wells, RMC, USNR-R topaz.wcwells@BERKELEY.ARPA Computing Services, 297 Evans Hall, University of California, Berkeley CA 94720
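The three proposed forms of the "Message-Handling:" field value could be parsed along the following lines. This is a rough sketch, not a definition of the field: the `Handling` tuple, the `parse_message_handling` function, and the rule that a comment is a single trailing parenthesized group are assumptions made for illustration, and the signals themselves are treated as opaque text.

```python
import re
from typing import NamedTuple, Optional

# The four signal types named in the proposal.
SIGNAL_TYPES = {"ARRL", "INET", "QSIG", "ZSIG"}


class Handling(NamedTuple):
    signal_type: Optional[str]
    signal: Optional[str]
    comment: Optional[str]


# Assumed shape: optional instructions, then at most one trailing "(comment)".
_FIELD_VALUE = re.compile(r"^(?P<body>[^()]*)(?:\((?P<comment>[^()]*)\))?\s*$")


def parse_message_handling(value: str) -> Handling:
    """Split a Message-Handling field value into signal type, signal, and comment."""
    match = _FIELD_VALUE.match(value.strip())
    if match is None:
        raise ValueError("unrecognized Message-Handling value: %r" % value)
    body = match.group("body").strip()
    comment = match.group("comment")
    if comment is not None:
        comment = comment.strip()
    if not body:
        # Comment-only form: "Message-Handling: (comments)"
        if comment is None:
            raise ValueError("empty Message-Handling value")
        return Handling(None, None, comment)
    signal_type, comma, signal = body.partition(",")
    signal_type = signal_type.strip().upper()
    signal = signal.strip()
    if not comma or not signal or signal_type not in SIGNAL_TYPES:
        raise ValueError("expected 'signal-type, message-handling-signal': %r" % value)
    return Handling(signal_type, signal, comment)
```

For example, `parse_message_handling("INET, MAY BOOK (announcement)")` yields the signal type `INET`, the signal `MAY BOOK`, and the comment `announcement`. A transport agent that does not recognize the signal type could then ignore the instruction without having to understand it.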