Thursday, September 6, 2012

Writing a kernel: I'm falling in love with the PDP-11 architecture

You are doing WHAT?

So the crazyness goes on. I have decided to learn to program in PDP-11 assembly language, and the way I'm doing it is writing a sort of operating system for that platform. I did some OS practices in my college time, but they were based on x86, and were limited to create a real mode (no memory protection!) multitasker. At work I have no reason to do kernel level programming (although I've done some systems level stuff for the IBM mainframe), so I was somehow lacking on OS knowledge.

First of all, my work is completely public. I'm publishing my code at github. The public repository is https://github.com/jguillaumes/muxx MUXX is the name of the toy operating system (Mostly Useless eXperimental eXecutive). I have got to build some tools to manage the SIMH paper tape loading format, which can be found at https://github.com/jguillaumes/retroutils. Feel free to take a look at the code and partake anything you find useful. 

The big picture

MUXX is basically a vehicle for my enjoyement. So it will be probably full of design flaws, underoptimal implementations and failed decisions. Lets see what it IS and what it IS NOT:
  • It will NOT be a UNIX clone.
  • It will have a microkernel structure. Basically, that means that some of the low-level functions will reside in separated tasks with separated address spaces; some of those tasks will be the memory management and some of the device drivers, including the console, multiplexer, paper tape and flopply drivers.
  • It will be written in assembly code and C (not because I like it, but because there is no option that I know of building a cross-compiler for the PDP-11 capable of generating systems level code).
  • The target, emulated machine will be a PDP 11/60 with 256 KW of memory. The only reason for that is that was the only PDP11 I worked with, something like  25 years ago...
MUXX will use memory management (it will be a mapped system), and will run in kernel and user modes. It will not use supervisor mode (at least at the beginning), neither I/D space separation, although I will try to code the necessary hooks to implement those features. 

Development milestones.

There is a doc subdirectory in the MUXX public git repository, which contains some documents about what I am doing. The milestones.odt document lists my development plan (of course subject to changes... that's the good part of being project manager/analyst/system programmer/technical writer all at the same time). I have set up a series of milestones I'm planning to follow, which are (copied straight from the document):

  • Milestone 0: Console I/O, MMU enablement and mapping, system calls using TRAP or EMT, switching to/from user mode via RTS and traps. Already achieved.
  • Milestone 1: Clock interrupt. “AAA/BBB” task switching, with full context management. Human-readable trap/abort messages. Basic memory management (embedded in kernel). Tasks “A” and “B” still linked into kernel. Achieved.
  • Milestone 2: Device Driver framework. Programmed mode paper tape device driver (read only and synchronous). Tasks “A” and “B” loaded from LDA image on paper tape. Tasks loading will be hardcoded at kernel startup, but the tasks will not be linked into the kernel.
  • Milestone 3: Interrupt-driven paper tape device driver. Tasks “A” and “B” loaded from paper tape. The task loading will be still hardcoded.
  • Milestone 4: Memory management out of kernel, in privileged task. Message switching between tasks
  • Milestone 5: Interrupt-driven console device driver. Basic command interpreter (“TASKS”, “LOADA”, “LOADB”, “STOPA” and “STOPB” commands. Loading of tasks “A” and “B” moved out of the startup code.
  • Milestone 6: Interrupt-driven multiplexer device driver (probably DZ11). Tasks “A” and “B” running in different terminals. Several instances of the tasks (up to number of terminals) will be started from the console.
At this time I've achieved milestone 1, so I have a working multitasker, with full memory protection and (sort of) meaningful abort/panic messages. The only device MUXX knows off at this time is the console (DL11), and just for output; that output is done in synchronous mode (so the kernel loops until the I/O operation is done). I have also written a basic "read character" rountine, which is also synchronous, so it can't be used seriously in a multitasking kernel... but I'll overcome this limitation when I reach milestone 5... some day.

Sources of "inspiration"

I don't want to write just another UNIX clone. We have perhaps too many of those. My plan is to learn about OS design and implementantion, and to have fun. Having said that, there is a lot of useful information around. My main sources for... inspiration are:
  • Andy Tannenbaum's book. Yeah, the blue brick. Both the text and the minix sources are really good sources of information. And it is written to be easy to learn from. I'm taking the whole microkernel and message passing ideas from minix.
  • The PDP-11 Handbook. The basic reference book about the PDP-11 architecture. It covers mostly all you need to program the '11: instruction set, interrupt handling, memory managing and basic programming techiques.
  • The PDP-11 peripherals handbook. It contains the information needed to write device drivers: I/O mappings, interrupt vector numbers and control register description. The version I've found in the net is not the last one, and several devices available in SIMH are not covered, but it will be enough to begin.
  • The 2.11BSD system source code.The best way to solve a problem is to look how someone else did solve it before you. Of course you can also set up your own 2.11BSD simulated machine if you want to, but it is probably easier to browse the source code from the linked site.

Tools

The development is being done in a linux machine, so we need some cross-development tools. My previous entry talked about building a PDP-11 cross-assembler. I progressed a little bit since I wrote that. Most of the information I wrote is valid. I've built the tools from the git repositories. The GCC git repository can be found here. The binutils repository is this one.
If you have read the previous entry perhaps you remember I had a quite weird problem with the linker, which failed to create pdp11 binaries. It complained about a syntax error in the default linking script. Although this is not a showstopper (I use a customized link script to build the kernel), it is annoying. The solutuion was ridiculously easy: you MUST use a "clean" environment to build binutils and gcc. By "clean" I mean you must erase some environment variables which interfere in the linking process: LIBPATH, SHLIB_PATH were the culprits in my case. If you plan to build the cross-building tools, check your environment for related variables.
I also found a bug in the gas assembler. I submited it to the binutils bugzilla.The problem is gas assembles this instruction:

         jsr    pc,@(R0)
Like:
        jsr    pc,(R0)
Instead of
        jsr    pc,@0(R0)

Unfortunately, the gcc compiler generates this code if the program uses a function pointer table... for instance, in the typical syscall routing code. Provisionally, until the bug gets fixed (I could try to do it myself...) I'm using a long switch-case block to do the syscall routing. Ugly, but works.
Having a C compiler and an assembler is not enough. The C language is just a little bit over the assembly code. It needs a library (libc) to work. I have tried to port the newlib C library to the PDP-11 target, with mixed results. The real problem is the 64K address space. The code for a simple sprintf() is larger than those 64K! I'm still working on that, but it seems the problem relies in the float/double support. Right now my "kernel" does not support floating point in any way, so I will try to get rid of the float code and see if I can put newlib on diet...

Current status

Right now my "kernel" does the following things:

  • Enables and configures memory management, so each "task" gets its 64K protected address space.
  • Creates and runs tasks in user mode, using TRAP to call the system services.
  • Multitasks, using a simple round-robbin algorithm. I plan to add priorities later.
  • Writes to console (synchronously).
  • Can provide somehow inteligible panic information when it crashes (and it does it a lot).
Right now, the "kernel" and the "tasks" are linked together. My next step will be to write a paper tape device driver, so I can move the "user tasks" out of the kernel and load them separately. Once this is done, I will work on removing the memory management to the kernel and putting it to its own task. 

I'm having a lot of fun :).