Improving V6 Unix


Intro

The moderately early PDP-11 versions of Unix, such as V6 Unix, packed an incredible amount of power into an extremely small amount of space - for V6, a mere 20KB of code (not including device drivers) for the permanently resident kernel - a bang/buck ratio that will almost certainly never be exceeded.

It's relatively easy to bring up a V6 PDP-11 Unix under the Ersatz-11 PDP-11 emulator, as covered in the Bringing up V6 Unix on the Ersatz-11 PDP-11 Emulator page. This is an addendum/ancillary page, for those who wish to go further.

It covers a range of topics, including how to get the Standard I/O Library, and the 'tar' command (which, alas, needs system mods to really be supported 'right'), how to have the system default to more reasonable input editing characters, etc.


Other Unix stuff

Here are a bunch of other things that are useful for a PDP-11 V6 Unix.

Tim Shoppa's material

There's another set of RL02 V6 disk pack images lying around, from Tim Shoppa. Most of it's junk (old biology lab data), but the root pack here has a ton of treasures on it.

Most importantly, it has a newer C compiler, one with support for longs, unsigneds, and a bunch of other stuff. (You could hack unsigneds in the 'vanilla' V6 C compiler, by using a char *, but that didn't always work.) In addition to /bin/ncc, you need the entire /xlib directory - don't rename it, or move the files to /lib; just copy it over as is.

Warning: for return of long data items from procedures to work (long returns use R1 as well as R0), you need a new csv.s, since the one with vanilla V6 bashes R1 on a procedure return. (Hey, R1 isn't used there on return, so bashing it is fine.) It goes in the C library, /lib/libc.a, once you've assembled it.

To get access to the contents of that disk pack image, you need to add the RP driver to your Unix (see here for how to do that).

In addition to the new C compiler, it has a ton of new commands. Alas, all we have are the binaries... :-( Too bad a source disk didn't get saved, along with all those useless packs full of biology stuff!

I haven't fully explored what's there, but one thing that is there is a new cdb, with an extended command set. Note: Some of the commands use some new Unix system calls which aren't in 'vanilla' V6, so they may blow out when you try and use them.

There's also complete kernel code, including code to bring it up (mostly) to V7 compatibility, something else I haven't completely explored.



Standard I/O Library

One of the University of New South Wales tapes, here, has a copy of the source for the Standard I/O Library (which vanilla V6 does not include).

I haven't actually used it much, since there's a copy of the compiled library on the Shoppa disk, in /lib/libS.a. One hitch is that that library is in the so-called 'new archive' format, which the vanilla V6 tools don't grok. That disk has the new archive tool ('nar'), and you can either unpack the library with that, and then repack it with the old archive tool, or use a copy of the library which I did that to already, here.

(I think I got the files in there in the right order... but maybe not! So look out for unresolved references. If you see some, the hack fix is to specify that library twice in the command line [xxx -lS -lS], and that should kludge it for the moment.)

Another hitch is that some of the calls require capabilities the Unix V6 kernel doesn't have (and which can't be simulated). The lseek() system call is one example; one can simulate the call to it in the C library routine fseek(), by using block and byte seek() calls (and I did, see the code here), but the problem is that the lseek() system call also returns the file offset pointer (which ftell(), among others, uses), and there's no way to get that info in V6 (that I know of).

So fully supporting the Standard I/O Library requires a system mod (below).
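
To make the block-and-byte trick concrete, here is a minimal sketch of the arithmetic (it is not the fseek() replacement linked above). It assumes V6's seek(fd, offset, ptrname) convention, in which ptrname 3 means an absolute offset counted in 512-byte blocks and ptrname 1 means a relative offset in bytes. The names split_offset and seek_pair are made up for illustration, and the code is written in modern C rather than V6 C.

```c
#include <assert.h>

#define V6_BLOCK 512L   /* seek() ptrname 3-5 count in 512-byte blocks */

/* Hypothetical helper type: the two arguments needed to reach an
 * absolute file offset with a block seek followed by a byte seek. */
struct seek_pair {
    int blocks;   /* for seek(fd, blocks, 3): absolute, in blocks */
    int bytes;    /* for seek(fd, bytes, 1): relative, in bytes   */
};

/* Split a non-negative absolute offset into whole blocks plus the
 * leftover bytes within the last block. */
struct seek_pair split_offset(long off)
{
    struct seek_pair p;

    assert(off >= 0);                    /* sketch handles absolute offsets only */
    p.blocks = (int)(off / V6_BLOCK);
    p.bytes  = (int)(off % V6_BLOCK);
    return p;
}
```

For an offset of 70000 this gives 136 blocks and 368 bytes, i.e. seek(fd, 136, 3) followed by seek(fd, 368, 1). Note that it only covers the "set" half of lseek(); it does nothing about getting the current offset back, which is the part that needs the system mod.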


System mods for Stdio and others; additional library things


lseek() and smdate()

Adding lseek() is pretty easy. I have the code in a new file sys5.c, available here; it also needs an entry in sysent.c, an updated copy of which is here.

Note: That copy of sysent.c also has the entry for the smdate() system call un-commented-out. That's because 'tar' (below) kinda-sorta requires the ability to change file modified dates to work 'properly' (as in, how it usually works on other systems), and I needed smdate() for that (see the 'tar' entry for the details).

So you'll also have to un-comment-out the code for that call, in sys4.c; or if you don't feel like wrestling with it in 'ed' (or extracting it and editing it on your host machine), you can just download a fixed copy of it here.

Compile them all (don't forget the "-O" flag!), and add them to lib1, viz.:

ar r ../lib1 sys*.o
and then build a new system (see here for how to do that).

While doing 'tar', at one point I thought I needed the utime() system call, so I prepared a copy of it for V6 (available here if you're curious, including a V7 version of the iupdat() internal system routine), but since it turned out I didn't need it, I never bothered adding it.



C Library additions

The standard C library doesn't have an entry for lseek() (the routine C user code calls to do an lseek() system call), of course; there is one in the Standard I/O Library, or if you want to add one to the normal C library, the source is here.

I have yet to add the access() system call, but for the moment you can fake it with this library routine. (This only works for code you're compiling, of course; when trying to use an existing binary off the Shoppa pack, this won't help you.)
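
For readers wondering what such a fake might look like, here is a rough sketch of one possible approach: probe the file with stat() and open(). It is not the routine linked above; fake_access is a hypothetical name, the mode bits (4 = read, 2 = write, 1 = execute, 0 = existence) are the usual access() convention, and the code is modern C, so it would need adapting for V6.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical stand-in for access(): returns 0 if the requested
 * access looks possible, -1 otherwise. */
int fake_access(const char *path, int mode)
{
    struct stat sb;
    int fd, how;

    assert(path != NULL);
    if (stat(path, &sb) < 0)
        return -1;                 /* missing or unreachable */
    if (mode == 0)
        return 0;                  /* existence check only */
    if ((mode & 1) && (sb.st_mode & 0111) == 0)
        return -1;                 /* crude test: no execute bit set at all */
    if (mode & 6) {
        if ((mode & 6) == 6)
            how = O_RDWR;
        else if (mode & 2)
            how = O_WRONLY;
        else
            how = O_RDONLY;
        fd = open(path, how);      /* let the kernel do the permission check */
        if (fd < 0)
            return -1;
        close(fd);
    }
    return 0;
}
```

One known weakness of this approach: open() checks against the effective UID, whereas the real access() checks against the real UID, so the two give different answers in a set-UID program.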



Missing routines

While you're at it, you might want to add alloca() to the C library (for some reason it isn't in V6), along with mktemp().


Better default input editing and interrupt characters

The default input editing and interrupt characters are, well, primeval. '@' for line delete? How 60s. (You can tell these guys were old Multics hackers; they must have had the Multics line editing characters burned into their brains.)

You can change the editing characters with 'stty'. You can't, however, change the interrupt characters. That includes the use of DELETE for 'interrupt process', which is, ah, non-ideal by modern standards. So since we have to recompile things to change them, we might as well change the input editing characters too.

The interrupt characters are specified in sys/tty.h; change that, and then re-compile tty.c, which is where they are used. Remember to install it in the device library (../lib2) before you build a new Unix.

cc -c -O tty.c
ar r ../lib2 tty.o
rm tty.o

The input editing characters are used in kl.c and dz.c, but I'm not sure how long their setting of them lasts; I think those settings are over-ridden pretty quickly by getty, etc. To change the input editing characters, you need to change getty.c and login.c. The ones I use are here and here; getty goes in /etc, and login in /bin.

Note that I have added two entries to the terminal type table in getty; entry '3' is for pseudo-ttys (console, etc) which send DELETE from the BACKSPACE key on the standard Windoze keyboard, and '4' is for those (TELNET, etc) which send a BACKSPACE character from that key.

So in my /etc/ttys, tty8 (which is the Ersatz-11 console, a pseudo-VT100), is "183" (the '1' turns the line on); and the DZ lines (used for TELNET logins) are "1[a-h]4".
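
To spell out how those three-character entries break down, here is a small sketch of a decoder. It assumes the V6 /etc/ttys layout as I understand it (first character '0' or '1' for off/on, second character the x in /dev/ttyx, third character the type handed to getty). The names ttys_entry and parse_ttys are hypothetical, and the code is modern C, not something lifted from init.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical decoded form of one /etc/ttys line. */
struct ttys_entry {
    int  enabled;   /* 1 if the line is turned on */
    char line;      /* the x in /dev/ttyx */
    char type;      /* selects an entry in getty's terminal type table */
};

/* Decode a three-character entry such as "183".
 * Returns 0 on success, -1 if the entry is malformed. */
int parse_ttys(const char *s, struct ttys_entry *e)
{
    assert(e != NULL);
    if (s == NULL || strlen(s) < 3)
        return -1;
    if (s[0] != '0' && s[0] != '1')
        return -1;
    e->enabled = (s[0] == '1');
    e->line = s[1];
    e->type = s[2];
    return 0;
}
```

So "183" decodes to: enabled, tty8, getty type '3' (DELETE from the BACKSPACE key), and "1a4" to: enabled, ttya, getty type '4' (BACKSPACE from that key).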


Advanced new (to V6) Unix tools

A number of useful tools which aren't in V6, but aren't trivial to add, are covered here. (Simple ones are covered here and here.)


tar

The V7 'tar' is not too hard to get running under V6; the biggest problem is that it needs at least one system call which isn't in 'vanilla' V6 (at least, if you want it to operate the way it normally does); that is covered above.

The code does use fseek(), which in the 'vanilla' Standard I/O Library uses the lseek() call (which also isn't in 'vanilla' V6 Unix), but you could use the alternative fseek() (here, above) instead of adding the lseek() system call.

Once you have them in, getting tar itself to work is pretty easy; mostly dealing with the fact that some system calls return different data in V6. The most problematic one is stat(), which returns the file size in a paired byte and shortword. chown() also takes a different number of arguments in V7; I hacked up a V7-compatible chown (available here - add it to the C library once compiled) to deal with that.
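
As an illustration of the stat() size issue, here is a minimal sketch of gluing the two fields back together. It assumes the V6 layout in which the byte (size0) holds the high 8 bits and the 16-bit word (size1) holds the low 16 bits of a 24-bit size. The function name v6_file_size is made up, and the arguments are taken as ints so that values which got sign-extended on the way (V6 chars are signed) are handled by masking.

```c
#include <assert.h>

/* Combine V6's split file size into one long.
 * size0: high 8 bits (may arrive sign-extended from a signed char)
 * size1: low 16 bits (may arrive sign-extended from a 16-bit int) */
long v6_file_size(int size0, int size1)
{
    long size;

    size = ((long)(size0 & 0377) << 16) | (long)(size1 & 0177777L);
    assert(size >= 0 && size <= 0xFFFFFFL);   /* V6 sizes are 24 bits */
    return size;
}
```

For example, size0 = 1 and size1 = 4464 is a 70000-byte file. Forgetting the masks is the classic bug here: a size1 with the top bit set turns negative and wrecks the result.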

The other issue with tar is that it uses the utime() system call - but it doesn't set the access time, just the modified time. So although I prepared a copy of the utime() system call for V6 (available here if you're curious, including a V7 version of the iupdat() internal system routine), I didn't need it: I just changed the code in tar to use the mdate() system call (the user form of smdate()).

You will also need the header files stdio.h, types.h, nstat.h, dir.h, and signal.h (available through the links).

Note that V6 doesn't keep the write dates when copying or moving a file (and that it also has no 'mvall' command). I find the following shell run command/file (which I have at /lcl/bin/mvall) useful:

tar ctvf - $2 $3 $4 $5 $6 $7 $8 $9 | (cd $1 ; tar xf -)
Usage is:
mvall {target} {files}
which works best if you're in the directory where the files currently reside. The shell file doesn't delete them automagically, although it could; add:
rm $2 $3 $4 $5 $6 $7 $8 $9
to the shell run file if you want that behaviour.


strings

'strings' uses the ftell() call in the Standard I/O Library, so you have to add the lseek() system call before it will work.

Having done that, V6 compatible source is here.


More useful new Unix tools

A few more new, interesting (well, to me :-) Unix tools.

si

Think of this ('system internals') as 'ps' on radioactive steroids. It shows you pretty much all the data inside the kernel: the mount table, the inode table, the file table, the text table, the disk buffer cache - you name it. Eventually I will get around to writing a 'man' page for it - until then, 'Use the Source, Luke'! A quick listing of options (args are like 'ps'):

It uses 'ncheck' for inode number -> file name mappings; it keeps the mappings in a file ("filenms") in the root directory of each disk pack, and recomputes them automagically whenever it looks like they are out of date. (The logic here is still not entirely complete.)

The source is available here, but as currently written it requires some minor hacks to the kernel (to get the system uptime, and also the current values in param.h - I got tired of having to recompile 'si' whenever I changed a parameter). The latter also includes a tweak to allow the running system's size to be found, to see if the symbol table in /unix applies to the running system.

So, you will also need:

The first two are very slightly tweaked to retain useful info ('diff' is your friend). The last goes in conf and has to be listed explicitly in the 'ld' command to build a new Unix - since it doesn't contain any unresolved externals, it won't load from a library. I.e.:
ld -x l.o m40.o c.o param.o ../lib1 ../lib2
mv a.out /nunix

This command is now too large to be compiled with the 'vanilla' V6 C compiler (gets symbol table overflows), so you have to either i) break it into two pieces (which I was too lazy to do), ii) compile it with the new C compiler (above), or iii) increase the size of the symbol table in the V6 compiler (which is not as hard as it sounds). To do the last of these, edit c0h.c to change 'hshsiz' from 200 to (say) 400. Then re-compile and install the compiler. Or you can just download it here.

Also it needs some things I moved out to a private library (I called it libL.a, for 'Local') while I was still trying to make it fit the old compiler (before I just gave up :-):

I think that's all of them! Anyway, it's rather neat: give it a whirl. Something like:
si mfitV 5
in one TELNET window while you're working in another can be most interesting. I was particularly amazed to find out (via the 'b' flag) that even with a large buffer cache, the cache is almost always entirely filled with blocks from the root device. I'm guessing this is because /bin/ is there; it would be interesting to move that to another device, and see what happens.


cmdate

This copies the 'last-modified date' from one file to another; useful if you're moving something around, and don't want to lose that information; the source is here.

Note: First, it only works on systems that have had the smdate() system call added. Second, it wasn't written for use on a real time-sharing system; i.e. you have to be super-user to use it. And if you 'set-UID' it, anyone will be able to change the write dates on anyone else's files. Yes it would be easy to code so that it checks to see if the file owner is the same as the real UID, but... until someone really needs it, I have more interesting things to do! ;-)
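
For whoever does need it, the ownership check mentioned above might look roughly like the following. This is only a sketch in modern POSIX C, not code from cmdate: caller_owns is a hypothetical name, and a V6 version would have to use the V6 stat buffer layout and V6's getuid() conventions instead.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Does the real user (not the set-UID identity) own this file?
 * Returns 1 if so, 0 if not, -1 if the file can't be stat'ed. */
int caller_owns(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    if (stat(path, &sb) < 0)
        return -1;
    return sb.st_uid == getuid();   /* getuid() is the real UID */
}
```

A set-UID cmdate would call this on the target file and refuse to go ahead unless it returns 1 (or the real user is the super-user).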



© Copyright 2014 by J. Noel Chiappa


Last updated: 13/May/2014