NEWS, EDITORIALS, REFERENCE

Subscribe to C64OS.com with your favorite RSS Reader
February 22, 2017#18 Programming Theory

C-Style Structures in 6502 (2/3)

Post Archive Icon

This is part two of a three part series on Pointers, Structures and Recursion in assembly on the 6502. If you haven't already read part one, make sure to read it first.

To review, a pointer is any two consecutive bytes1 in memory that taken together hold the value of another address elsewhere in memory. The order of these bytes depends on the endianness of the CPU. The 6502 is little endian so its pointers are always ordered: least significant byte, most significant byte. We saw in the previous post how the instructions of the 6502 in absolute addressing mode use inline pointers to data, and indirect mode have inline pointers to pointers in data which point to the final data. I made the claim that this unleashes awesome flexibility and power in computer programming. In this post, we are talking about structures and I will show how pointers and structures can work together to represent networks of relationships between rich and complex data types.

Structures, in the terminology of the C programming language, are known as "structs." Structs in C are incredibly well documented and I strongly encourage you to read about them. The Wikipedia article Structs (C Programming Language) is a good starting point. In C, structs are a first class citizen, with specific language features that make it easy and reliable to work with them. In assembly we are not as well off when it comes to working with structures. In fact, when I began my adventures of seriously learning to program in 6502 ASM, I was very confused about how one would even go about recreating some of the programming techniques that are so common and useful in C.

As it turns out, it is possible to reproduce the essential character of C Structs using only the basic features available to us in a good assembler. Although, it is certainly a more manual process with much less compiler intelligence about what your intentions are. Bugs are therefore more easily introduced. But the flip side is that in assembly we are working much more closely to the metal of a C64 and can see very clearly when something is going to be inefficient. This is a problem with higher level languages that abstract the final machine language implementation away from the programmer's day-to-day thoughts.

What is a Data Type?

Let's begin by asking, what is a data type? In a CPU everything is just made up of bits and bytes, right? Yes, at the most fundamental level everything is just bits and bytes. However it is very useful to label collections of bytes with what they are intended to store. In a compiled language the source code allows you to tell the compiler what you intend to do with a collection of bytes so that if you accidentally do something that doesn't make sense it can warn you. Assemblers do not afford us this luxury, but we are still wise to think in terms of data types and appropriately name our labels to keep track of these things in our code.

A data type therefore is a conceptual label for a set of bytes that are intended to hold a kind of data. If you have just one byte, for instance, maybe that byte is intended to hold a character. As in, a PETSCII value that is meant to be displayed on screen. Or maybe it is intended to hold an 8-bit unsigned integer, a number with a value ranging from 0 to 255. Or maybe it is intended to be an 8-bit signed integer, a number ranging from -128 to 127. Or maybe it is meant to be a set of 8 independent binary flags. We might think of these as a Character, an Unsigned Int, a Signed Int and a Bit Field respectively. Even though each of these conceptual things is at its base just a single byte of memory.

Well, what if we put two bytes together and consider what those might be. Two bytes together can't represent a single PETSCII character. But they could be a 16-bit Unsigned Int (0 to 65535), or a 16-bit Signed Int (-32768 to 32767). These are usually called Unsigned Long and Signed Long in C, as in a long integer, an integer made up of more bytes than usual.2 Or, as we just learned in the previous post, two bytes could be intended to be used as a pointer. A pointer is conceptually more than a mere 16-bit Unsigned Int, it is a 16-bit Unsigned Int that represents a meaningful memory location. If just any old 16-bit Unsigned Int containing any arbitrary value were suddenly used as a pointer it would almost certainly lead to a crash. On the other hand, it should be clear that if you want to increment a pointer by one, it is done in precisely the same way in which you would increment a 16-bit Unsigned Int.

What is a Structure?

Now that we know what a data type is we can answer the question, what is a structure? A structure is a new data type, defined conceptually by the programmer, which can be arbitrarily many bytes, within certain limitations. One limitation is that every struct of the same type is the same size. Just as all ints are 1 byte, and all pointers are 2 bytes, if you have a structure called menuentry3 that is 8 bytes, every menuentry is always exactly 8 bytes. And we'll soon see why. Another limitation is that the size of a struct must be no bigger than the range afforded by the data width of the CPU. This is because of how the index registers are used to manipulate the struct. The 6502 has an 8-bit data bus. This means the .Y index register can only hold an unsigned value between 0 and 255. Thus, structures on the C64 cannot be bigger than 256 bytes. In practice this will not be a major limiting factor.

The next question is, what sort of data types might there be that require multiple bytes? The magic of structures is that they are complex data types. A complex type is one which is composed of many smaller data types. First I'll start simple, and then I'll talk about the structures I'm using C64 OS to represent the Menu System, which is the primary and standard user interface common to all C64 OS applications.

Many types of records with multiple fields can be conceived of as a structure. Imagine you have a record type of a person. It consists of their name, birth date, height and weight. A name can vary in length, so let's allot 32 bytes for a name of up to 32 characters. For their birth date we want 1 byte for the day, 1 byte for the month and 2 bytes for the year, because the current year is greater than 255, but won't ever need to be bigger than the year 65535. Height and weight can each have 1 byte. If height is stored in centimeters, no one is likely to exceed 2.55 meters tall. And if weight is in pounds, we could support tracking weights up to 255 lbs. Many people are heavier than that. We could use 2 bytes which seems like overkill. Or we could measure weight in kilograms. 255 kilos is 562 pounds. Only a small fraction of 1% of the population weigh over 500 pounds. Our conceptual structure might look like this:

  Person
	
  name      32 bytes
  year       2 bytes
  month      1 byte
  date       1 byte
  height     1 byte
  weight     1 byte

The total size of a person struct is the sum of the size of all the parts, or 38 bytes. The size of each part, and the order of the parts must be the same for every instance of the person structure. This allows us to determine the offset of each part from the start of the struct. Which we could rewrite like this:

  Person
	
  name       0
  year      32
  month     34
  date      35
  height    36
  weight    37

The reason we have the parts configured as offsets from the start of the structure is to use them as indexed offsets from either an absolute address or a pointer. C has first class language support for calculating these offsets for us, and for naming the struct parts with a special naming convention that concatenates the name of the struct with the name of the struct member parts. In assembly we don't have that, but we can do something similar. Let's see first how it might look in C.

  typedef struct {
    char          name[32];
    unsigned long year;
    unsigned int  month;
    unsigned int  date;
    unsigned int  height;
    unsigned int  weight;
  } person;

The types are defined elsewhere for the CPU being compiled to. And the sizes of each of those types is used by the compiler to infer the offsets from the start of each subtype. The compiler can also infer the size of the whole structure. When you want to create an instance of this structure in memory there are standard C functions that can be used together to allocate memory of exactly the size of the struct, returning a pointer to the start of the struct. The sub types within can be accessed from the pointer returned by the memory allocation function via a language construct notation and the names declared in the struct's definition shown above. Here's how that would look in C:

  person * ptr = malloc(sizeof(person));

  memcopy(ptr->name,"Gregory Nacu",12);
  ptr->name[12] = 0;
	
  ptr->year  = 1981;
  ptr->month = 5;
  ptr->date  = 25;
  
  ptr->height = 170;
  ptr->weight = 64;

Forgive me if my C is a bit rusty, it's been many years since I did any serious work in the language. But the above should suffice to give an example of how structs can be used to group a set of properties together under a collective name. Where the year is being set, 1981 is greater than 8-bits, but it will fit nicely in the 16-bit unsigned long that the "year" subtype was declared as being. The name was declared as a buffer of 32 one-byte characters. In C the memcpy function is a way of easily putting a string of characters into that buffer, and then the 0 is added to terminate the string. C64 OS also uses the convention of a 0 value byte in a string indicating the end of a string. This is used to allow other functions that work with strings to know when they've hit the end of the string.

Structures in 6502 ASM

Now let's look at how we can do the above in 6502 assembly.

  .block
  ;Person
  pname   = 0
  pyear   = 32
  pmonth  = 34
  pdate   = 35
  pheight = 36
  pweight = 37

  psize   = 38;

  ptr = $FB

  lda #1        ;Alloc one page
  jsr pgalloc
  tay           ;Page addr -> .Y
  lda #0
  jsr memset    ;Zero the page

  ldx #0
  stx ptr
  sty ptr+1

  ;Set the name
  ldy #pname
next
  lda name,x
  sta (ptr),y
  inx
  iny
  cpx #12
  bne next
  
  ;Set the year
  ldy #pyear
  lda #<1981
  sta (ptr),y
  iny
  lda #>1981
  sta (ptr),y
  
  ;Set the month
  ldy #pmonth
  lda #5
  sta (ptr),y
  
  ;Set the date
  ldy #pdate
  lda #25
  sta (ptr),y
  
  ;Set the height
  ldy #pheight
  lda #170
  sta (ptr),y
  
  ;Set the weight
  ldy #pweight
  lda #64
  sta (ptr),y
  
  name     .text "Gregory Nacu";
  .bend

That looks like a lot of lines of assembly, but it's not as hard as it might first seem. Let's break down how it all works, then we'll look at a few things that can be done to make things easier.

First off, the actual struct type definition. This is not a thing that exists in ASM, there are only labels. Since the person struct starts with a "p" I decided to make all the labels that represent the struct sub types begin with the letter p. pname, pyear, pheight, etc. These help to limit the likelihood of a label collision. The code is contained within a block, .block opens the block, .bend closes it. Labels only need to be unique per block. Notice also that ASM doesn't have names for the basic data types the way C has char, int, long, etc. Instead, the labels are defined as the offsets into the structure. pname is 32 bytes long, by virtue of the fact that the next label, pyear, starts at offset 32. Meanwhile pyear is 2 bytes long, because the next label, pmonth, starts at an offset 2 greater than it, 34. As a result of this, the size of person can't be computed automatically, so I have another label, psize, that specifies the total size.

As we learned in part one of this series, the 6502's indirect mode instructions can only use a pointer that is stored in zero page. So I have defined the pointer, ptr, at $FB. This is free space that isn't used by the KERNAL or BASIC roms. Next, we have to talk about memory. If you haven't read about my Thoughts on Memory Management, you really should go read that first. And the part two of that is the post on A Hybrid Memory Manager, which you should also read. Briefly, C64OS has a memory manager that can allocate groups of 256 byte pages. In the future I plan to implement a more granular memory allocation scheme which can be initialized and used within a block of allocated pages. But I haven't implemented that yet, and I don't want to discuss vaporware, so the above example uses the page allocator. In the next post when I go into recursion I'll show how to maximize the efficiency of the paged memory allocator.

A struct is just a chunk of memory that is the right size to hold the bytes that all the sub types require. Our person struct is 38 bytes. It will certainly fit in a 256 byte page. The jsr pgalloc takes the page count in .A, and returns the page number to the start of the allocated memory in .A. The jsr memset fills a page with a byte. In this case I am zeroing the page. It can be useful to know that the variables of our struct will all default to zero. The page number, which is only half a pointer, the high byte, gets written to ptr+1, and the low byte, 0, the start of the page is written to ptr. Note that ptr is therefore little endian, as it must be on the 6502.

The remaining lines populate the struct with data, as in the example shown in C. The C function memcpy() is effectively just a loop that moves bytes from one location to another. In the ASM above I manually do that with a loop. The length is hard set to 12, which is the number of bytes in my name. The 0-byte terminator is already there because I memset the entire page to zeros just prior to populating the struct with data. Notice how the struct sub type is referenced. The Y register is initialzed to #pname. Then it is used as an index from the start of ptr. As each byte is copied the Y index is incremented. Care must be taken to not exceed 32 bytes for the name plus its null terminator. In an ideal world a C Compiler would warn you if you tried to memcopy too many bytes into a statically sized buffer, but I'm not sure that it does in practice.

When we're done setting the name and want to set the year we set the Y index equal to #pyear. Note that we defined the year as a 16-bit number. So we have to set each byte individually. In little endian the low byte should be stored first. I use the assembler's notation to get the low byte of a number, 1981, like this: #<1981. In hex 1981 is $07BD, so $BD is stored at #pyear offset into the struct. Next we have to increment .Y once to index the second year byte. Then we set the hi byte there. #>1981, which resolves to $07.

All of the remaining sub types are one byte each. So they are very easy to handle, and perhaps demonstrate most clearly how a struct is accessed. In each case we load a byte into the .A register, load the appropriate struct offset into the .Y register, and store the .A value through the ptr, indexed by .Y. Boom, boom, boom, boom, month, date, height, weight. Here's what's interesting. Although I set those properties in the order in which they occur inside the struct, I didn't have to. The chunks of code that set those properties could come in any order. Or, because the entire memory page was initialized to zero, we could have skipped any of those sets and the value would be left defaulted to zero.

Structures and Macros

After walking through the code, you can see that it's not as complicated as it might have appeared at first. However, it is also obvious that it is not nearly as clean and readable as the C version. It is also error prone, because the repetition of code could have hard-to-spot typos. Fortunately any good assembler will support macros. A macro is a defined unit of code, with a name and arguments. The arguments get plopped into placeholders in the code. Macros are 100% deterministic, they do only and exactly what you tell them to. They are as efficient as if you typed out the code manually, because they are converted into inline code at assemble time. They have the ability to make the code much more readable, and much less error prone. Let's take a look at how we might rewrite our code above using some macros.

  ptr = $FB
  
  lda #1        ;Alloc one page
  jsr pgalloc
  tay           ;Page addr -> .Y
  lda #0
  jsr memset    ;Zero the page
  
  ldx #0
  stx ptr
  sty ptr+1
  
  #memcpy ptr,pname,name,12
  #set16 ptr,pyear,1981
  #set8 ptr,pmonth,5
  #set8 ptr,pdate,25
  #set8 ptr,pheight,170
  #set8 ptr,pweight,64
  
  name     .text "Gregory Nacu";

Well, that looks much simpler. All of the sub type setting has been reduced to what looks like a one line function call. Although these are not sub-routines proper. When this assembles it will end up exactly the same as the manually typed assembly in the first example.

Wait a second though, where does that memcpy, set16 and set8 come from? You would define these as reusable macros that you could put in an external file and include with your source code. Also as in the C example, I moved the structure definition out of the main source code to show what part of the code is necessary to use this structure in an abstract way. Below is what our .include file might contain to declare the structure labels and the macros.

  ;Person
  pname   = 0
  pyear   = 32
  pmonth  = 34
  pdate   = 35
  pheight = 36
  pweight = 37
  
  psize   = 38;
  
  .macro memcpy ;dest offset src len
  ldx #0
  ldy #\2
next
  lda \3,x
  sta (\1),y
  inx
  iny
  cpx #\4
  bne next
  .endm
  
  .macro set16 ;dest offset value
  ldy #\2
  lda #<\3
  sta (\1),y
  iny
  lda #>\3
  sta (\1),y
  .endm
  
  .macro set8 ;dest offset value
  ldy #\2
  lda #\3
  sta (\1),y
  .endm

Structure Nesting

That's pretty cool. But structures can do a lot more than that. One step up in complexity is to nest structures. You may notice that the person struct contains a 4 byte date. A date though can be consider a kind of structure on its own. Let's take a quick look at how that might work.

  ;Person
  pname   = 0
  pdate   = 32
  pheight = 36
  pweight = 37
  
  psize   = 38;
  
  ;Date
  dyear   = 0
  dmonth  = 2
  ddate   = 3
  
  dsize   = 4;

Above we have our two structs defined as sets of labels. The person struct now contains just 4 sub types, name, date, height and weight. Notice how the date property of person still takes up 4 bytes. And the size of person is exactly the same as before. The new data type, date, is its own struct with 3 properties. Year, month and date. The indexes for these start at 0 again like any struct. The date struct can be used on its own just as we saw in the earlier examples, but let's look at how we would use the person struct with the date struct split out into its own type.

  #memcpy ptr,pname,name,12
  #set16 ptr,pdate+dyear,1981
  #set8 ptr,pdate+dmonth,5
  #set8 ptr,pdate+ddate,25
  #set8 ptr,pheight,170
  #set8 ptr,pweight,64
  
  name     .text "Gregory Nacu";

I skipped the allocation of person, because the size of person is completely unchanged. Look at how the date parts can be referenced using the same macros as before. The full offset is the addition of the offsets from both structs. To set the person's birth month, for example, we set8 starting from the ptr to the person, but the offset is pdate+dmonth. Those are resolved by the assembler to 32+2 or 34. And that's exactly where the month int was stored in the original person before splitting date out into its own data type. It should be clear from this example why it is that structs must always be the same size. Because date is embedded inside person, date must always be the same size so that the properties of person that follow date are always found at the same offsets.

It should also be clear that even though a structure may be composed of sub structures and sub sub structures, the outermost and most abstract structure must still not exceed 256 bytes in size. The addition of the sub struct offsets resolves at assemble time into just one integer offset. And that value must be loaded into the .Y register and used to index the pointer. So, even though we can easily imagine making structures that are very large by composing them from other structs that are made of still more structs, in practice the most complex struct cannot just be arbitrarily large without limit.

Networks of Structures

Before we need to move into discussing recursion, there is one more point I want to cover. Above, we saw how pointers point to structs. And then, from a pointer with an index, we can read or write data to some sub type within the struct. But if this was all we could do, structs would just be a end points, individual records containing some data. Nothing too exciting. What is much more interesting is that structs can contain pointers as their sub types. This has many applications. First of all, instead of having a fixed-length string buffer, say 32 bytes, embedded within the struct, we could instead have a pointer to a name string. The name pointer in the struct is just 2 bytes and always 2 bytes, but it points to a string in memory that can be arbitrarily long. It can be much longer than 32 bytes if necessary, but at the same time if it is much shorter than 32 bytes it doesn't need to waste any space with empty padding.

There is however an even more brilliant and abstract use of pointers in structs. The pointer in one struct can point at the location in memory of another struct of the same kind. Thus, structs can be chained together such that one struct points to the next which points to the next which points to the next. In fact the degree of complexity of these networks of structs is limited only by how much memory is available in the computer. Structures can form networks with virtually any topology. Structs can have pointers to children, siblings, parents, and more.

These networks of linked structures are so common that they have names that characterize the design. For example, the structure: A -> B -> C -> D, where each struct just points to the next, is called a singly linked list, or just a linked list. But when the links go in both directions such that: A <-> B <-> C <-> D, where each struct links not just to the next but also to the previous struct, it's called a doubly linked list. Lists are considered to be among the most basic forms of data structuring. The potential complexity of linked data structures is truly mind boggling. An enormous body of computer science has been developed on the foundation of data structures and how to manipulate, traverse and store them efficiently.

Near the beginning of this post I mentioned that I am using structures to manage the menu system of C64 OS. I have designed a flat text file format that describes all the details of the menus and the menu hierarchy. Each application will have one menu definition file that is loaded in when the app is loaded. A file containing menus can be easily edited in a standard text editor. This is a very intentional design decision. It means that even when the app's code has been assembled and is distributed, the menu file will remain as freely editable text. This allows a user to change the keyboard short cuts of menu commands, or to remove menu commands they never use, or rearrange them to suit a different workflow. It also allows the menus to be localized into different languages without having to re-assemble the original source code. I personally like to use novatext, it is included with Novaterm 9.6, is very user friendly and includes block copy, move and delete.

In order to turn the menu definitions text file into real and usable menus the data has to be loaded, parsed and partially transformed. Each row in the text file represents a menu entry, including a label, keyboard command definition, application command code, and a type such as whether the row represents a heading that introduces a sub menu, an action or a spacer. The text file is loaded in using the relocatable Memory Managed Loader. This puts the file's contents into memory, but there is no structure in place that allows the menu rendering code to quickly traverse the data. How for example can it quickly count how many menu entries are in the root menu? How can it know how wide each root level menu header is? Not only for drawing them, but for determining whether a mouse click landed on a menu entry.

The menu loading code reads through the data in memory, and builds a network of interlinked structs. Each struct points to the next struct that represents the next menu entry in the same menu, and also optionally to the first struct of a new menu that is a child of the entry. The menu rendering code can navigate this network of structs extremely quickly. Every time the user clicks the mouse the menu code needs to determine if the click fell within the bounds of an open menu, and then change the menu open statuses and rerender everything accordingly. In order for this to feel snappy on a 1Mhz 6502, the code has to be very efficient. Traversing structs is a

How the menu loading code goes about doing this parsing, struct building and linking, as well as how it manages allocating and deallocating the memory required to store them efficiently requires recursion. Recursion is an essential programming concept for the ability to work with tree-like networks of data structures. And so it's essential that I, we, know how to do it on the 6502. The third part of this post series will focus on recursion.

I've really enjoyed writing these posts. I hope the C64 enthusiast and programmer enjoys reading them.

  1. Everything I'm saying is from the perspective of the 6502, which has an 8-bit data bus and a 16-bit address bus. Sizes of data types may be different on different CPUs. []
  2. One problem with the definition of C data types is that their sizes are traditionally CPU dependent. On a 16-bit CPU a 1-byte integer is a Short. A regular Int is 2-bytes and a Long is 4-bytes. But on an 8-bit CPU I prefer to think of an Int as 1-byte and a Long as 2-bytes. Not that it matters much, because we're not using a C compiler. []
  3. As we'll see shortly, I am going to be using examples directly out of the code I'm working on for C64 OS. The menu system makes extensive use of pointers and structures. And recursion must be employed to parse and render them. []