Roger Firth's IF pages


InfLight -- Inform debugging


In the Z-machine, everything works by numbers. By that, I mean that the multitude of internal cross-references within even the smallest game are all stored as numeric values, either as an index (object number 6, action number 32, property number 99, ...) or as an address (a routine, a string, a variable, etc). Index values tend to be small, and are often held in a single byte (or even in just part of a byte). An address, on the other hand, is always a two-byte number pointing to the memory location where the item is stored (possibly an unwise generalisation, but I think it's true enough for our purposes).

Sixteen bits -- two bytes -- give a range of addresses from 0 to 65535 (hexadecimal $0000 to $FFFF), far too small to encompass all of the items in a typical game. There needs to be a way of extending the address space (while keeping the address values themselves within the two-byte limit), and so the Z-machine uses two modes: byte addressing and packed addressing.

Byte and packed addresses

A byte address is a 16-bit number which points directly to any byte in the 0-64K address space. In a Version 5 game, a packed address is a 16-bit number which, if multiplied by four (giving an 18-bit number with zero in the lowest two bits), then points to every fourth byte in a 0-256K address space. In a Version 8 game, a packed address is a 16-bit number which, if multiplied by eight (giving a 19-bit number with zero in the lowest three bits), then points to every eighth byte in a 0-512K address space. For example, a byte address of $3210 explicitly points to the byte at address $3210, whereas a V5 packed address of $3210 implicitly points to the byte at address $0C840, and a V8 packed address of $3210 implicitly points to the byte at address $19080.
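The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the multiplication, not how any interpreter is actually written; the names `PACKED_MULTIPLIER` and `unpack_address` are invented for the example, and only Versions 5 and 8 are covered.

```python
# Multiplier applied to a packed address, per Z-machine version.
# Only Versions 5 and 8 are handled here (names are hypothetical).
PACKED_MULTIPLIER = {5: 4, 8: 8}


def unpack_address(packed, version=5):
    """Convert a 16-bit packed address into the byte address it refers to."""
    if not 0 <= packed <= 0xFFFF:
        raise ValueError("a packed address must fit in 16 bits")
    return packed * PACKED_MULTIPLIER[version]


print(f"${unpack_address(0x3210, 5):05X}")  # prints $0C840
print(f"${unpack_address(0x3210, 8):05X}")  # prints $19080
```

The largest packed address, $FFFF, unpacks to $3FFFC in Version 5 and $7FFF8 in Version 8, which is where the 256K and 512K ceilings come from.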

The packed address multiplier of x4 or x8 is the only significant difference between Version 5 and Version 8 games. In the rest of this discussion, we'll keep things simple by illustrating only Version 5 addressing (noting in passing that Version 6 and 7 games use a complex variant on Version 5 addressing that is definitely best forgotten).

Things would get pretty chaotic if the two address modes became totally mixed up (though they do become fairly confusing, as we'll see at the end). Therefore, Inform divides the Z-machine address space cleanly into low memory (which uses just byte addressing) and high memory (which uses just packed addressing).

Furthermore, high memory is used for just two things: all the routines and then all the strings; everything else is stored in low memory. This low/high distinction is quite important; because a packed address can point only to every fourth byte, lots of space would be wasted if small items were held in high memory. However, since a typical string consumes rather more than four bytes, and a typical routine occupies a lot more than four bytes, the fact that up to three bytes are unused at the end of each one becomes fairly unimportant.
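To get a feel for how little is wasted, here is a rough Python sketch that works out the unused bytes after an item of a given length. The function name `padding_bytes` and the item sizes are made up for illustration; real routine and string sizes depend entirely on the game.

```python
def padding_bytes(length, multiplier=4):
    """Unused bytes after an item so the next one starts on a packed boundary."""
    return -length % multiplier


# Hypothetical item sizes in bytes, chosen only for illustration.
sizes = [5, 13, 40, 122]
waste = [padding_bytes(n) for n in sizes]

print(waste)                   # prints [3, 3, 0, 2]
print(sum(waste), sum(sizes))  # prints 8 180
```

The padding never exceeds three bytes per item in Version 5 (seven in Version 8), so the longer the items, the smaller the proportion lost.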

Divisions of memory

Low memory begins at address 0, and extends upwards until all items apart from the routines and the strings have been allocated an address. High memory begins immediately after low memory ends, and extends upwards until the routines and then the strings have also been accommodated. Everything in low memory must be reachable using a byte address up to $FFFF, so low memory can't be larger than $10000 (64K) bytes. Items in high memory must be reachable using a packed address up to $FFFF, which in Version 5 points to byte address $3FFFC, so the combination of low memory plus high memory can't be larger than $40000 (256K) bytes.

There's a further sub-division, of much smaller significance. Low memory is split into dynamic memory (where a program can both read and write values) and static memory (where reading is permitted but writing isn't). Dynamic memory lies in the bottom portion of the low memory address space, and contains objects, variables and arrays -- things which commonly get updated during the course of a game. Static memory lies in the top portion of the low memory address space, and contains stuff like the verb grammars and the dictionary, which are continually read but never changed as a game progresses. The distinction, dating back to an era when machines had little physical memory, is nowadays of minimal importance.
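The three regions can be pictured with a small Python sketch. The boundary values `STATIC_BASE` and `HIGH_BASE` below are invented for the example; in a real game they depend on how much space the objects, arrays, grammar and dictionary happen to need.

```python
# Hypothetical region boundaries (byte addresses), for illustration only.
STATIC_BASE = 0x2000   # first byte of static memory
HIGH_BASE = 0x4000     # first byte of high memory


def classify(address, static_base=STATIC_BASE, high_base=HIGH_BASE):
    """Name the memory region that a byte address falls into."""
    if address < 0:
        raise ValueError("addresses are non-negative")
    if address < static_base:
        return "dynamic"   # readable and writable
    if address < high_base:
        return "static"    # readable only
    return "high"          # routines and strings: run or print only


print(classify(0x0100))    # prints dynamic
print(classify(0x27F4))    # prints static
print(classify(0x0C840))   # prints high
```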

Access to memory

We've said that a program can read and write values in low memory. This happens all the time, generally without you being aware of the details (for example, the Inform statement score=score+1; fetches the score global variable from dynamic memory, increments it and stores the result), though you can occasionally glimpse the mechanics (as in the statement width=0->33; for fetching the current screen width). So you might well ask: what can a program do with a high memory address? The answer is short: run it or print it. That's it.
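As a sketch of what a read such as 0->33 amounts to, the following Python models memory as a plain array of bytes. The helper names `read_byte` and `read_word` are invented here, and the memory contents are dummy values; the only things taken from Inform are that `base->index` reads one byte at `base + index`, and that `base-->index` reads a two-byte word, high byte first, at `base + 2*index`.

```python
def read_byte(memory, base, index):
    """Like Inform's base->index: the single byte at address base + index."""
    return memory[base + index]


def read_word(memory, base, index):
    """Like Inform's base-->index: the two-byte word at base + 2*index."""
    addr = base + 2 * index
    return (memory[addr] << 8) | memory[addr + 1]   # high byte comes first


# A tiny pretend memory with dummy contents.
memory = bytearray(64)
memory[33] = 80          # pretend the screen is 80 characters wide
memory[0x10] = 0x32      # high byte of a word
memory[0x11] = 0x10      # low byte of the same word

print(read_byte(memory, 0, 33))        # prints 80
print(hex(read_word(memory, 0, 8)))    # prints 0x3210
```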

If my_var is a variable of some sort containing an address in high memory, then your effective choices are my_var(); (which runs it, if a routine) or print (string) my_var; (which outputs it, if a string). The point is, your program can access high memory only on the Z-machine's terms. Routine and string creation is the sole prerogative of the compiler, and your program can't even read the contents of high memory, let alone update it. So, no fancy tricks with self-modifying code, and no string manipulation features like concatenation or subset extraction or..., well, anything but print.

What's in a number?

One final issue. Those who've been following closely may have spotted a dilemma; if everything is held as a number, how can the Z-machine distinguish between byte addresses, packed addresses and things that aren't addresses at all? Another short answer: it can't, other than from the context in which the number's being used. For sure, the compiler knows the difference, but by the time the Z-machine gets to see it, the value $3210 might be the byte address of a variable, the packed address of a string, or the number of islands in the Pacific. So, if we take this somewhat artificial room:

Object  test_room "Test room"
  has   light
  with  name 'test' 'room',
        n_to central_lobby,
        description [;
            print "The room is full of odd devices.^";
            print "Self=",  (name) self, "^";
            print "Prop1=", (object) self.prop1, "^";
            print "Prop2=", (address) self.prop2, "^";
            print "Prop3="; self.prop3();
            print "Prop4=", (string) self.prop4, "^";
        ],
        prop1 ticket,                           ! an object
        prop2 'ticket',                         ! a dictionary word
        prop3 [; "ticket of bright yellow."; ], ! a routine
        prop4 "bright yellow ticket.";          ! a string

and change the four properties to read as follows (children, don't try this at home):

        prop1 26,                               ! an object
        prop2 10228,                            ! a dictionary word
        prop3 18247,                            ! a routine
        prop4 22346;                            ! a string

then at run-time the two rooms behave identically:


Test room
The room is full of odd devices.
Self=Test room
Prop1=yellow ticket
Prop2=ticket
Prop3=ticket of bright yellow.
Prop4=bright yellow ticket.

There's no reason to do this, of course, but it illustrates the point about knowing what the numbers signify.

Finally, a few fragments of the bleedin' obvious.