MUD - Some of the Ideas I've Been Playing About With... Hat

Hardware Architecture

The basic steps are pretty obvious, and are as follows:

micro --> network --> terminal handler --> parser --> binder --> database management system.

However, the architecture depends on which of these are done in parallel, and on what machines. At the moment, Essex's MUD is:

+-----+   +-----+   +-----+   +---------------------+
|micro|-->| net |-->| T.H.|-->|parser->binder->DBMS |
+-----+   |     |   |     |   |                     |
+-----+   |     |   |     |   |                     |
|micro|-->| net |-->| T.H.|-->|parser->binder->DBMS |
+-----+   |     |   |     |   |                     |
+-----+   |     |   |     |   |                     |
|micro|-->| net |-->| T.H.|-->|parser->binder->DBMS |
+-----+   +-----+   +-----+   +---------------------+
BBC etc.  Prestel   PDP11     DEC-10

That is, the micros are individual entities, and un on a parallel network. Their control paths carry on to a front end PDP11, and onto the DEC-10, all in parallel.

Since the net is out of our hands, and transparent to us anyway, I'll omit it in remaining diagrams.

The VAX system we envisage then looks like this:

+-----+   +------------------------------+
|micro|-->|T.H.-->parser-->binder-->DBMS |
+-----+   |                              |
+-----+   |                              |
|micro|-->|T.H.-->parser-->binder-->DBMS |
+-----+   |                              |
+-----+   |                              |
|micro|-->|T.H.-->parser-->binder-->DBMS |
+-----+   +------------------------------+
BBC etc.  VAX

If we had one separate process to do the last 2 actions, and contact it from the parser to serve database requests, it would be:

+-----+  +------------------------------+
|micro|->|T.H.-->parser                 |
+-----+  |            \                 |
+-----+  |              \               |
|micro|->|T.H.-->parser-->binder-->DBMS |
+-----+  |              /               |
+-----+  |            /                 |
|micro|->|T.H.-->parser                 |
+-----+  +------------------------------+
BBC etc.  VAX

This doesn't really gain anything, as it uses the same amount of processor time. It needs no locks or anything, however, since there is only one process for binding and DBMS, which makes it easier.

Problem: the VAX can only handle 100 ports top, then it gets in trouble. Also, it's expensive!

Solution 1):

if we have 100 players, the fastest they can type is about 10 characters a second. If they all typed at once, non-stop (worst case), that means 1000 characters a second coming in. For this we need 100 ports?! Why not have a multiplexer take them all in and tag them by who sent what? Then we only need one port on the VAX, and a process which takes the tagged input, buffers it, and sends it off to separate parsing processes.

+-----+     +------------------------------+
|micro|     |       parser                 |
+-----+ \   |      /     \                 |
+-----+   \ |    /         \               |
|micro|---->|T.H.-->parser-->binder-->DBMS |
+-----+   / |    \         /               |
+-----+ /   |      \     /                 |
|micro|     |       parser                 |
+-----+     +------------------------------+
BBC etc.    VAX

Do we now need a VAX? If a VAX can't manage it like this, we can have a front-end, eg. a PDP11, which does the terminal handling for it.

+-----+               +-----------------------+
|micro|               |parser                 |
+-----+ \            /|     \                 |
+-----+   \ +-----+/  |       \               |
|micro|---->| T.H.|-->|parser-->binder-->DBMS |
+-----+   / +-----+\  |       /               |
+-----+ /   PDP11    \|     /                 |
|micro|               |parser                 |
+-----+               +-----------------------+
BBC etc.              VAX

We still get the port problem, though, but since the parser is independent of the binder etc. for any particular database, we can incorporate it into the PDP11.

+-----+     +-------------+    +--------------+
|micro|     |       parser|    |              |
+-----+ \   |      /      |\   |              |
+-----+   \ +    /        |  \ |              |
|micro|---->|T.H.-->parser|--->|binder-->DBMS |
+-----+   / +    \        |  / |              |
+-----+ /   |      \      |/   |              |
|micro|     |       parser|    |              |
+-----+     +-------------+    +--------------+
BBC etc.    PDP11              VAX

It could be that we want

\
  \ +-------------+
--->|T.H.-->parser|-->
  / +-------------+
/

as our PDP11, that's OK.

Solution 2):

Noting the same thing bout they user's typing speed, and that the parsing and terminal handling are mundane and not a drain on resources, we do the terminal handling on a micro. The parsing can go there too, and we put the VAX on a net so it only needs one port (the micros send to the binder via the net - they're all on the same one). This gives:

+-----+   +-------------+
|micro|-->|T.H.-->parser|
+-----+   +-------------+\
+-----+   +-------------+  \ +--------------+
|micro|-->|T.H.-->parser|--->|binder-->DBMS |
+-----+   +-------------+  / +--------------+
+-----+   +-------------+/   VAX
|micro|-->|T.H.-->parser|
+-----+   +-------------+
BBC etc.  BBC etc.

The parser and some of the terminal handler could go on the user's micro, ie.

+--------------+   +-----+
|micro-->parser|-->|T.H. |
+--------------+   +-----+

but that gets you nothing except hassle if the user has an odd machine.

The big point about these solutions is that if we hive off the binder and DBMS to a separate machine, it doesn't have to be anywhere near as powerful as a VAX. Small and fast will do the trick. If all users sent a request, the system would have 100 to satisfy at once. Each of these may generate, what, 100 more requests. probably far fewer, and these may require 100 machine code instructions to handle. OK, so that would be one million instructions per second, of a fairly easy sort. A 68000 could do it!

I particularly favour the second option. Here, what happens is the user dials your system and is switched to a modem. Associated with each modem is a micro, which is there only to service the modem. All micros are on an ethernet, say, and also there is a sizeable machine to do the multi-user game. It sends process results back to the terminal-handling micro, and it builds up the text and despatches it. The traffic on the net (well, LAN) is maybe 10 bytes a second from each micro, and probably 100×1000 bytes from the DBMS machine if it sent unadulterated text (which it wouldn't). Even this is only 100K bytes, and that's if all users wanted half a screenful of bludge every second - which they couldn't even read!

It's also cheaper this way. There's a company which makes BBCs and modems in one, calling them softmodems. Even if we didn't get them, 100 BBCs is still only £30K, and the net wouldn't cost much, surely? It's easy to expand, too!

Software Architecture

OK, that's more the hardware side. We're aiming for a modular machine, which, however it is implemented, would look roughly like MUD does at present, ie.

micro --> T.H. --> parser
                         \
                           \
micro --> T.H. --> parser --> binder --> DBMS
                           /
                         /
micro --> T.H. --> parser

This has the first three steps done in parallel, but the binding stage onward really ought to be one-user only, which MUD does by a series of lock instructions, of course.

At each stage along the line, the input to the next stage is more determined. In English, it looks like:

micro --> T.H. --> parser --> binder --> DBMS
       1        2          3          4

1 straight ASCII text;
2 lexical text, ie. tokenised;
3 parse structure, ie. intensional form of request;
4 flat, DBMS request, extensional.

In computer terms, this might be:

1 pick up a box
2 S_PICK S_UP S_A S_BOX S_CRLF
3     VP
     /  \
    V    N
    |    |
  T_GET T_BOX
4
verb=GET object=BOX#3

I'll detail some of this later...

The output is easy, since only the DBMS talks, and it only talks to the terminal handler. It can send huge chunks of text, which the terminal handler formats, buffers and transmits; or, better, it can send coded messages, eg. "format13, Fred" instead of "You get a big smile from Fred."" or whatever. The less it sends out, the less the LAN capacity need be. The terminal handlers won't be overworked, and should have tons of spare capacity.

Functionality

I've been talking of the various stages of this arrangement, and have come to a few reasonably concrete ideas of how they might be set up. I'll go through the stages in turn.

Terminal Handler

The terminal handler just takes input and "normalises" it. In the example earlier, this meant tokenising, but it's more realistic to assume it simply "cleans up" the input, ie. transforms it all to one case, and spaces it properly etc.. This makes the next stage, the parser, easier. Tokenising is possible at the terminal handler level, but some words may have multiple tokens, all of which would have to be passed. So the terminal handler probably takes something of the form
pick up a  box
and transforms it to
PICK UP A BOX <cr>
I don't see the terminal handler as problematical!

Parser

The parser reads things off the terminal handler stream. Assume it does its own tokenising - this means it has to have access to player names so it can parse properly, otherwise it will treat them as unknown words. The DBMS is in contact with the terminal handler for output, so if terminal handler and parser are in one machine and binder/DBMS are in another, it'll mean a link does already exist via the terminal handler. The DBMS will have to keep all parsers informed with regard to who is/isn't playing - simple enough.

The parser has to produce stuff for the binder. This means distilling certain facts from the natural language input - a case frame, if you will. I've given this a lot of thought, indeed I've even written the recogniser part of the parser, and will soon do something on the categorising side to get the final, required slots. It's a bit harder than that, since there is backtracking required - a cinch in Prolog, but we won't be using that!

Ideally, the parser should take a BNF and drop a recogniser. Not too hard to do, but not worth it, I fancy. The final output has to be for the binder. In this form, there is a lot of secondary information which says the conditions which the variables ought to obey. I'll detail these soon, put at present I'll just give the slots:
actor verb obj ins vcount
The actor is whoever gave the command. The vcount is the number of times the other four things should be passed to the binder (for example "drop the ball three times" is interpreted as if the you'd typed "drop the ball" in thrice - it can get a different binding for "ball", of course, eg. if you'd said "break an egg three times"; the difference between "a" and "the"?). Whether the parser passes three separate requests to the binder, or the binder splits it up itself, I don't mind (although I prefer the former).

The verb is an action, augmented by adverbs. Adverbs are handled by giving a verb/adverb pair, and saying the verb to which they're equivalent. So "pick up" is the same as "get", "go quickly" is the same as "run", etc.. The grammar I'm using lets these build up pretty well.

The hardest part of the formalism is the obj and ins part. This is because the other slots are already bound, yet these aren't. Hence, they contain "binding information", a description of those things which have to be true to let an object qualify as a binding.

Basically, an object specification is a structure. The first field tells you how many bindings you want ("drop 3 boxes"), the second field is a list of things you want to be true about the object, ie. its is-list; ("drop a big box") and the third slot is its isn't-list ("drop a box which is not big"). The two lists have the same form, ie. lists of individuals. Individuals are a class, eg. "box", and a path for where it is to be sought. There are also some plus adjectives and some minus adjectives, so if you "drop a big box" it would indicate an individual which is of class "box" with a plus adjective of "big". Don't worry, I'll be giving an example soon! The path saying where to look is either some selector based on the actor (eg. their "carrying" slot), or an object specification.

OK, I'll give you an example now!

"Drop 3 big, shiny coins excluding the heavy florin which isn't silver from the small sack on the floor into the cloth bag which isn't on the table, quickly, twice."

verb throw (drop+quickly=throw)
vcount 2 (so give request to binder twice)
obj (3 [coin () +big +shiny] [florin (1 [sack (1 [floor ()] []) +small] []) -silver])
ins (1 [(bag (1 [] [table ()]) +cloth] [])

In the above obj and ins slots, square brackets represent the is-list and isn't list (the former appearing first), and () means the default location. Thus, for example, [table ()] is an isn't-list, meaning the table which is in the default place to look (probably the room, for the verb "throw").

This looks complicated - it's actually worse because the "adjectives" can be superlatives that act as functions, eg. "biggest". Anyway, the verb, obj, ins and actor are passed vcount times to the binder to find a binding or two...

Binder

The binder operates as follows, with each request made of it (ie. obj or ins - it doesn't have to do them at the same time, but the use of the pronoun %quot;it" might force it so). To evaluate an object specification, the binder walks the is-list evaluating all the individuals which make it up. This will give a list of possible bindings. The same is done for the isn't-list to produce a second list of undesired bindings. The binder removes all elements of the first list which are also elements of the second to get its result. The first count slot's worth of these are taken, and passed back to whatever called the binder.

To evaluate an individual, the path is evaluated. If this is a selector, eg. carry!actor, it's easy, the binder just reads it straight. Otherwise, the binder recurse. It eventually ends up with a list of bound objects as its path; it walks this list carrying out a search, returning a list of successes. A search means something along the "contains" (or whatever) links of the objects in the path, looking for objects of the class of the individual. The adjective matching can either be done at search-time, or afterwards prior to returning the list of successful matches to whatever called the binder (I prefer the latter option). And that's it! Errors, when there's some mistake such as if you say "the box" and there are none, can be passed to the DBMS as "actions". This is how parse failure is handled, too. So there are no errors from the parser, terminal handler or binder per se, they're just transformed to eg. %noobject or %wordunknown, which the DBMS treats as an ordinary action and gives the message (which will be an "error" message!).

Example of the binder in action: I'll only do one object, the obj in the earlier example. This part of the binder, the "bind one object specification" part, is its main part.

OK, proceed like this: walk the (3 [...] [...]) list, evaluating it. This means we must evaluate the is-list, which in turn means evaluating [coin () +big +shiny]. If we evaluate the path, (), we get the default, which we'll suppose is "room". Now search the room of the owner (actor) and come up with, say, (coin1 coin4 coin5 coin9 coin11). Now apply +big and +shiny, and maybe cut out coin1, coin11 and coin5. So, return (coin4 coin9) as the evaluation of the is-list. OK, now to evaluate the isn't-list, which means we have to instantiate [florin (...) -silver] to a florin. Its path, (1 [...] []) +small] []), is 1 object long, so we look at this path's is-list, [sack (1 [floor ()] []). This is a sack, so we look for the sack's path: the object specification says there's one of it. Look at the is-list and it's [floor ()]. The floor has an empty path, so assume the default is "room", and we get, say, (floor7). OK, so that's the is-list; there's no isn't-list, so take the first element of (floor7) and pass it upwards. We now have the sack's path. Search this for elements of class sack, and we find, say, (sack9 sack14). Only sack9 is small, so chuck sack14, take the first element of the list, and pass it upwards. We now have (sack9) as the path for florin! Search sack9 and get maybe (coin4 coin5) as florins in the sack. Let's say neither are silver, so keep them both. We at last have the isn't-list at the top level. Coin4 is in both lists, so remove it from the result list. We then take the first three elements of the is-list; as there's only one, we might want to throw an error here, it depends. Personally, I would, but we don't have to. So (coin4) is returned as the binding, and that's it!

Like I said, complicated!

The binder is in two parts: the management part and the binding part. The former does things like pass results on to the DBMS; the latter is callable only on one object at a time. You sometimes need to call it from the DBMS, though, eg. if you want to know if there is a light source near. Hence, it needs read access to the database. It's so tightly coupled with the DBMS that they ought to be done on a single machine.

DBMS

What is finally passed to the DBMS, then, is a list of requests to process. These are handled serially, although if the DBMS calls itself recursively the functions it calls will be handled first. Hmm, not too clear... The BMS control loop is as follows:
        take next request from buffer
        process the request just taken from buffer
If, in the course of processing, another request is generated, it is handled recursively rather than added to the request buffer. The DBMS consists of pattern/action pairs. When a request is made, it whips down the list of patterns doing a class match on each element. There ought to be a way to optimise this! Hierarchies have to be tangled rather than straight trees, but not looped. Hence, a depth-first search for the class-matches is OK.

for the command "player8 throw coin4 bag6", something like "player throw object container" might match. For unknown commands there can be a catch-all at the end, of the form "being action object object". Whether you make something a class or not is up to you as the programmer, eg. you might want to make "male" and "female" separate classes, or alternatively just use "player" and do the test after the match (which is more sensible!).

I'm simplifying this, by the way. Commands actually have a fifth slot, a "timer". I've not thought too deeply about this yet, but timers are needed to handle demons. If they are included, you can do things like "after 5 seconds, 'run away'" or "every 5 seconds 'who'", which will enable the user to set up their own demons and repeating ones (I suppose "every 5 seconds 'every 5 seconds `who`'" ought not to parse!). Anyway, I'll carry on without them for now!

The hardest part is the definition language. I'm all for no reserved words, using symbols instead. I've come up with something, but it's not finished yet. It's very LISP-like, however, although not syntactically. Lots of operators can be defined in it, too. I think it's wise to give the programmer a choice of doing actions as they like or as we like: all predefined functions ought to be prefixed in some manner, even if they're in a "library", rather than be hard-wired. So .DROP will look at your .CARRY property; you may write your own DROP if you want, but if you use ours you'll need a .CARRY property too.

Anyway, I'll let you have that stuff when it's near to completion!


Copyright © Richard A. Bartle (richard@mud.co.uk)
21st January 1999: design.htm