
December 18, 2019 #92 Technical Deep Dive

1351 Mouse, and Mouse Driver

Lots of people now have asked me if C64 OS is open source. And, the answer is, no, it isn't.

Will it ever be? Who knows. My feeling about open source is that sometimes it's useful, and sometimes it's just a big pain in the butt. I don't want to negotiate with other developers about the direction of the project. I am planning to open a private beta, very soon. I like getting ideas, and in time I hope people will contribute by writing apps, utilities, and drivers for C64 OS and even sell those apps and utilities. That'd be good.

That said, I'm not putting in any copy protection, because I don't think there is a good way to do that in a way that doesn't hinder the software. We're now in an age of the C64 when the fine folk who want to support the work of others will pay to get a copy, and the people who don't want to never will. Copy protection basically just punishes the people who have paid for it, by making it harder for them to make legitimate copies and use it as they see fit.

I also believe that a project that gets open-sourced, especially if it becomes free, takes on the unfortunate stigma of being worthless. If you buy a copy, you've got some skin in the game. You're in the club. You're a user and a customer, and you're in a community. I think that's the way I want C64 OS to be positioned.

That said, from time to time I will publish the source to some aspects of what I'm working on. These are small stand-alone components that can be used by other people. And the first of these will be the 1351 mouse driver.

What is a driver anyway?

A driver is just a piece of code designed to interface with a piece of hardware and produce generic output that can be used by the rest of the system. In games, drivers could simply be those chunks of code that are used to read the joystick input, for example, but they are rarely considered as such. Most games simply hardcode the routine that reads a standard 5-switch joystick (four directions plus fire). In some rare games, like Arkanoid, the game may offer a choice of input: joystick, mouse or paddle, maybe even the keyboard. This is more driver-like, because the game proper is designed to not care which input type is selected. The code for all input devices produces a generic output that can be used by the game. But usually these games still just have their driver code hardcoded into them.

A chunk of code becomes indisputably a driver when it is made a separate file that has to be loaded explicitly or can be swapped out independently of the program that uses it.

In GEOS

GEOS is among only a few C64 environments that has support for independently loadable driver files. I believe GoDot is another example. And of course there are some other operating systems with support for drivers. Of these, I'm most familiar with how GEOS works.

GEOS supports two driver types: input device and printer. Its input drivers may be up to a maximum of ~380 bytes, and have a reserved address block near the very end of memory. Every input driver must be assembled to $FE80 and may run from there up to $FFF9. Every input driver then begins with a standard jumptable whose routines are called at regular intervals by the GEOS KERNAL. I believe that it is not possible to not have a driver installed. The KERNAL always expects there to be a driver there and bad things would probably happen if you tried to put other code in that space.¹

In C64 OS

In C64 OS, drivers are implemented as relocatable binaries. All relocatable binaries in C64 OS must begin with a jumptable, and that of an input driver must start with two specific entries:

Scan Moves
Scan Buttons

Both of these routines are called automatically, at different times, by the C64 OS input KERNAL module. Scan moves is responsible for moving the mouse cursor. It typically does this by monitoring changes in the input device, but it is free to do more sophisticated things. It simply sets the desired position of the mouse cursor by writing to the 16-bit coordinate pair in workspace memory,² and then sets the "mouse moved" bit in the mouse flags byte.

Scan buttons is called automatically, and it is the driver's responsibility to update the up/down status of the left and right mouse buttons by writing the current button state to the mouse buttons variable in workspace memory. It is the responsibility of the driver to support swapping left and right buttons for left-handed use. Button swap is a setting found as a bit in the mouse flags byte.

That's all the input driver has to do. It doesn't have to move the hardware sprites. It doesn't have to worry about how the pointer is bound on the screen. From the information it provides, higher level mouse events are generated and delivered to the menus, utilities and applications, automatically. Mouse events include:

Move
Left Down
Left Track
Left Up
Left Click
Left Double Click
Right Down
Right Track
Right Up
Right Click

Only one input driver can be installed at a time, but it is also possible for no input driver to be installed. When uninstalled, the driver's memory is freed up for use by the rest of the system. When an input driver is installed, it will be automatically loaded into any single free page of memory.

Input drivers that are currently either complete or in progress include:

1351 Mouse (with acceleration and left/right swap support)
Joystick Port 1*
Joystick Port 2*
Commodore 128 numeric keypad*
Koala Pad**

* These also support configurable digital acceleration.

** Koala Pad is still under development.

Tracking position with the 1351

Side Note: Apparently I'm producing lamers by going into such detail on what some people consider trivial. I'm aware that for some people the level of description I go into is unnecessary. But, I like to teach and to explain. And I can remember the learning curve of working through the points I cover. In short, I am writing the articles that I wish I could have found when I was first figuring these out.

Meanwhile, on codebase64.org, this is everything that is said there about the mouse driver code:

The routine alters $d000 & $d001. It shuldn't[sic] be too hard to modify it. There are some enhancments[sic] possible as well. Self calibration and even limiting the returned alteration to a signed byte since the movements never exceeds[sic] 6 bits… The rest of the routine should speak for itself… https://www.codebase64.org — C= 1351 Standard Mouse Driver

In other words, "Here's the code. It's not perfect. You can change it. You're on your own."

So welcome all lamers! To my posts that actually try to clarify how things work.

A significant portion of how the input pointer system works in C64 OS is implemented in the input KERNAL module. Event generation and distribution, for example, is not part of the input driver. Nor is updating the hardware sprites, nor binding the pointer to a specific region of the screen. Additionally, in C64 OS, there is complex interrupt handling, during which the input drivers for mouse and keyboard are called.

What I have open sourced is therefore not really the C64 OS mouse driver. It is based on the C64 OS mouse driver, but includes elements from the C64 OS KERNAL so it can be used standalone in any other program. This code is also focused only on tracking the mouse and moving the sprites. It includes binding the mouse to the screen edges, which the routines in the manual and codebase64 do not. It does not include button scanning, and it doesn't include the creation or distribution of higher level events.

Find the GitHub repository here: /gnacu/1351mousedriver
The complete source is also listed at the end of this post, from a gist.

Let's just dig into the code, a section at a time. And I will explain bits and pieces as we go. I'll point out things I've learned about 6502 assembly programming along the way.

Changing the IRQ handler

The C64 KERNAL ROM's IRQ handler, after backing up the registers, jumps through the vector at $0314/$0315. At boot up, and upon reset, the KERNAL automatically configures these vectors to point to its own ROM routines. Its own routines are used to update the JiffyClock, scan the STOP key, scan the main keyboard matrix and buffer keypresses, and to blink the BASIC text cursor. If you want merely to wedge in your own routine to be run in addition to the KERNAL's own stuff, then you want to back up the original vector, insert your own, and at the end of your own routine jump to the routine pointed to by the original vector.

When modifying the KERNAL's IRQ vector, you have to temporarily mask interrupts. The reason is that changing the vector is not atomic. Every individual 6502 instruction is atomic, which means that, once begun, an instruction is always finished without being interrupted in the middle. If the IRQ line is activated while the CPU is executing an instruction, the instruction will finish executing first, and then the IRQ will be handled before proceeding to the next instruction. Updating a vector is not atomic, however, because you have to write 2 bytes, which takes a minimum of two instructions, between which an interrupt could theoretically occur.

If the interrupt divided the vector update in half, the low byte and high byte would not belong together, and together they would point to some unexpected address. Jumping to that address would likely lead to a crash.

I've learned two things about this. First, as a general rule, you want to minimize the length of time that interrupts are disabled. If an IRQ fires while interrupts are masked, it is not handled until they are unmasked again. This causes the JiffyClock to lose time, and, in the extreme, could lead the user to experience the computer as unresponsive. Loading the complete vector into two registers ahead of time allows you to mask interrupts, write both vector bytes and unmask interrupts in the quickest possible succession. I prefer to use X/Y for pointers and vectors, which in C64 OS is the standard for passing pointers to system calls. There are also a number of macros to assist in this process. The C64 KERNAL ROM tends to prefer using A/Y for these.

The second trick I learned is about maintaining consistent state. Let's say you have a routine that has to mask interrupts. No problem, you can use the SEI instruction before your code, and the CLI instruction immediately after. But actually, there is a problem. Imagine that your routine that needs to mask interrupts is later called from within the context of another routine that also masks interrupts. You would get the following:

;Some other routine...
	SEI
	;interrupts masked
	
	JSR yourroutine

	;interrupts no longer masked, but should be!!
	CLI
	RTS

yourroutine
	SEI
	;interrupts masked for yourself
	
	;do some vector manipulation
	
	CLI
	;clear the interrupt mask for yourself. But it
	;accidentally clears for the parent context too.
	RTS

The solution to this problem is to not use the CLI instruction to restore state, unless you're really sure you should. Instead, call PHP to push the processor status byte to the stack. This backs up the state, whether it's masked or unmasked. Then call SEI to ensure it's masked for your own purpose. When finished, call PLP to pull the processor status from the stack, thus restoring to how it was before.
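Here is a minimal sketch of that pattern, reusing the yourroutine label from the listing above. The body is only a placeholder comment; the point is the PHP/SEI at the top and the PLP at the bottom.

```asm
yourroutine
	PHP         ;save the status register, including the I flag
	SEI         ;make sure interrupts are masked for ourselves

	;do some vector manipulation

	PLP         ;restore the I flag to whatever the caller had
	RTS
```

If the caller had already masked interrupts, they stay masked after the RTS. If the caller had them unmasked, PLP unmasks them again. Either way the parent context gets back exactly the state it had.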

The IRQ handler

One thing you can't do in code that's in a ROM is write directly into the operand bytes that follow an instruction's opcode. To jump to somewhere variable, from ROM, it is absolutely necessary to reserve bytes somewhere in RAM, write the address there, and do an indirect JMP through that vector in RAM.

Since this code is in RAM, however, it is fastest, easiest and most direct to simply put a label before an absolute JMP instruction. Read the vector into X and Y, and then write the vector to label+1 for the low byte (+1 skips the JMP opcode) and label+2 for the high byte. This is what's done at the sysirq label.
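Putting the last two sections together, a wedge could be sketched roughly like this. Only sysirq is a label from the actual driver; install, mouseirq and irqvec are hypothetical names chosen for this illustration, and the real source may be organized differently.

```asm
irqvec	= $0314         ;KERNAL IRQ vector, low byte then high byte

install
	LDX irqvec      ;read the address of the current handler
	LDY irqvec+1
	STX sysirq+1    ;patch it into the operand of the JMP below
	STY sysirq+2

	LDX #<mouseirq  ;have the new vector ready in registers...
	LDY #>mouseirq

	PHP
	SEI             ;...so interrupts are masked only briefly
	STX irqvec
	STY irqvec+1
	PLP             ;restore the previous interrupt state
	RTS

mouseirq
	;scan the mouse here

sysirq	JMP $FFFF       ;placeholder operand, overwritten by install
```

Note that this sketch assumes install is called only once. Called a second time, it would patch sysirq to point at mouseirq itself and the handler would loop forever.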

Constants and Workspace Memory

Next we have our constants and our "workspace" memory. In this sample I've chosen to put the workspace variables, the 16-bit mouse coordinates, inline with the code. In C64 OS those bytes are assigned addresses in real workspace memory, somewhere between $0200 and $03FF. The reason to put them somewhere low in memory is because drivers can be loaded to anywhere and the KERNAL modules' code gets moved around as they accrete features. But the system and the driver both need to be able to exchange data via a stable, reliable, third location.

I personally like to use constants, mostly for code clarity. I start by defining SID, VIC, CIA1 and CIA2 as the base addresses where those chips are found. Then I define a few more constants for the registers I care about, such as potx = sid+$19.

A couple of the constants are used simply for the binding, positioning and offsetting code. Technically, these haven't changed in 35 years, so it might seem superfluous to make them constants rather than just hardcoding the values. But, I guess I just like the mild abstraction, in exchange for clarity of purpose.

maxx and maxy are the maximum values that a sprite can have, in the normalized coordinate system. 0 to 319 for horizontal, and 0 to 199 for vertical positions. offsetx and offsety are defined as the offsets of the physical screen from the normalized origin. And lastly, we need to pick some default coordinate for the mouse cursor to start at. Why not smack dab in the middle of the screen?
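As a rough sketch, the definitions described above might look like the following in Turbo Macro Pro or 64tass style syntax. Only potx = sid+$19 is quoted from the text. The chip base addresses are the standard C64 ones, and the offset values 24 and 50 are the usual sprite coordinates of the top left corner of the visible screen; I am assuming, not quoting, that the driver uses the same values.

```asm
sid	= $d400         ;chip base addresses
vic	= $d000
cia1	= $dc00
cia2	= $dd00

potx	= sid+$19       ;SID paddle registers
poty	= sid+$1a

maxx	= 319           ;normalized coordinate limits
maxy	= 199
offsetx	= 24            ;sprite X of the screen's left edge (assumed)
offsety	= 50            ;sprite Y of the screen's top edge (assumed)

musposx	.word 160       ;16-bit mouse position, inline with the code
musposy	.word 100       ;default: middle of the screen
```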

Scanning Moves, High Level

We need to handle both axes, and they are similar but with one main exception. When you roll the mouse left the PotX gets a negative number, and when you roll the mouse right the PotX gets a positive number. The Y-Axis, though, is inverted at the hardware level. There is probably a reason, but regardless, for vertical moves it's the opposite of what you'd expect. If you roll the mouse up the PotY produces a positive number, even though higher on the screen is represented with smaller numbers in the VIC's registers and in video memory. When the mouse is rolled down, the PotY comes back as a negative number. This has to be handled.

Let's start then with the simpler X-Axis. The general idea is to read the current pot value, and compare it to the previous pot value to know what direction the mouse has moved. If the difference is zero, then there is nothing to do. Otherwise add the move (whether negative or positive) to the current X position.

The routine movechk compares the difference in pot values. Here's how it works from a high level before we get into the details. We read the current pot value into the accumulator, and pass this in, along with the previous pot value in Y. When movechk returns, the zero flag indicates if a move has taken place at all.

If the zero flag is set, no move has occurred along this axis between the last interrupt and this interrupt. When no move has occurred, the other returned parameters don't matter and can be ignored.

If the zero flag is clear, this indicates a move has occurred. Y is returned as the new current pot value. To prepare for the next interrupt, we simply write Y back into the immediate argument for the LDY that comes before the call to movechk. In the 1351 User's Guide, and in the codebase64 version, this byte gets written to a third memory address. I don't see the point of doing this. It takes an extra byte to hold the value, and the LDY then has to load absolute, which takes 2 bytes to address and twice as long to execute. It's technically self modifying, but it's shorter and faster, and the only downside I can see is that it would be tricky to port this code to run from a ROM.

A and X together are returned as a 16-bit signed position delta. This confused the hell out of me when I was first trying to interpret what the code in the back of the 1351 User's Guide was actually doing. We are going to dig deep on this one, in a moment, when we discuss how movechk works.

For now, it is enough to know that A is the low byte of the 16-bit signed value, and X is the high byte. Because it is signed, the negative is already handled, we never need to subtract, we can simply add this to the musposx variable using ADC. Since the high byte is in X, we transfer X to A before doing the second stage of the 16-bit add.
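The X-axis steps just described can be sketched as follows. This fragment relies on movechk, potx and musposx being defined elsewhere. The label oldpotx is named by analogy with oldpoty, noxmove is a hypothetical label, and the listing is a reconstruction from the description, not a quote of the driver.

```asm
	LDA potx        ;current pot value
oldpotx	LDY #0          ;previous pot value (operand is rewritten below)
	JSR movechk
	BEQ noxmove     ;zero flag set: no move on this axis

	STY oldpotx+1   ;self-modify: store the new pot value for next time

	CLC
	ADC musposx     ;add the low byte of the signed delta
	STA musposx
	TXA             ;high byte of the signed delta
	ADC musposx+1   ;add it, plus the carry from the low byte
	STA musposx+1
noxmove
```

STY does not touch the flags or the accumulator, so the delta returned in A is still intact when the addition starts.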

Now let's talk about the Y-axis. It's nearly identical. If zero, then there was no move on this axis and we can skip everything. The other versions of this code I found do the addition even when there has been no move.

Self-modify to write the new current pot value to oldpoty's immediate argument, just as before.

Next we have to reverse the sign, because the Y-Axis is inverted. Since the number is in Two's Complement, the way to reverse the sign is to invert all the bits, and add 1 (with the carry clear.) How this works, if you don't know, is super cool. I went into some detail about signed numbers in two's complement in the post, Floating Point Math from BASIC (2/2). But here's the general idea.

In an unsigned number, each bit is assigned a value that doubles with each more significant digit. From left (most significant) to right (least significant):

	
128, 64, 32, 16, 8, 4, 2, 1

To convert the value to decimal, then, you multiply the column value times the bit value of that column and add them all together. Thus:

1010 0110

1x128  +  0x64  +  1x32  +  0x16  +  0x8  +  1x4  +  1x2  +  0x1 = 

128 + 32 + 4 + 2 = 

166 (Very easy.)

To represent a signed number in two's complement, the value assigned to the most significant bit is negative. In an 8-bit signed number then the value assigned to the most significant bit is −128. Like this:

	
−128, 64, 32, 16, 8, 4, 2, 1

The decimal value is then converted in precisely the same way as above. Thus:

1010 0110

1x−128  +  0x64  +  1x32  +  0x16  +  0x8  +  1x4  +  1x2  +  0x1 = 

−128 + 32 + 4 + 2 = 

−90 (Not too difficult.)

Through the genius of two's complement, it actually works out that if you simply invert the bits and add 1 you reverse the sign. Like this:

	
1010 0110
0101 1001 (inverted)

0x−128 + 1x64 + 0x32 + 1x16 + 1x8 + 0x4 + 0x2 + 1x1 = 

64 + 16 + 8 + 1 = 

89... add 1 and boom

90

you've inverted the sign. It's total magic, which I will leave to the math nerds, but it works every time. (Try it with your own examples, and see for yourself.)
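The same example, worked through in 6502 instructions, as a small standalone sketch:

```asm
	LDA #%10100110  ;$A6, which is -90 as a signed byte
	EOR #$FF        ;invert every bit: %01011001 = 89
	CLC             ;make sure no stray carry gets added
	ADC #1          ;add one: %01011010 = 90
```

Run the same three instructions on the result and you are back at %10100110, or -90.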

That's for an 8-bit number. How does a 16-bit signed number work? It works the same way, the sign bit is still the most significant. But in a 16-bit number then, it's bit 15 (the leftmost, start counting from bit 0 on the right) that's negative. Instead of bit 15 having the value 32,768, when signed it has the value of -32,768.

How would you convert an 8-bit signed number to a 16-bit signed number? You employ another genius trick of two's complement called sign extension. Take your original signed 8-bit number, and add a second byte, the high byte. Then set all of the bits in the high byte equal to the original sign bit from the 8-bit number. Thus, -90 as an 8-bit signed number is 1010 0110, to make it a 16-bit signed number you just go:

1111 1111 1010 0110

Try it out on your calculator.

The left column of buttons shows "S" for signed, and the number of bits of precision, 8 in the first screenshot, 16 in the second. The binary value of the number on the screen is displayed in fine print at the top right of the screen.

Screenshots from Simple Hex Calc for iOS, on the App Store.
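On the 6502, sign extension from 8 to 16 bits takes only a few instructions. This is a generic sketch, not taken from the driver; the label positive is made up for the example.

```asm
	LDA #%10100110  ;-90, this becomes the low byte
	LDX #$00        ;assume positive: high byte is all 0s
	CMP #$80        ;carry is set when bit 7 (the sign bit) is 1
	BCC positive
	DEX             ;negative: $00 - 1 = $FF, high byte is all 1s
positive
	;X/A now hold $FF/$A6 = %1111 1111 1010 0110 = -90
```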

In the code then, for the Y-axis, the sign has to be reversed. This is done by clearing the carry, applying an EOR #$FF (that inverts the bits) and the ADC #1 adds one. We can then take this sign-inverted low byte and add it to the musposy low byte. Recall that X holds the high byte, and by sign extension it is going to be all 1's if the number was negative, or all 0's if the number was positive. Thus, we just need to invert all the bits in this byte to complete the sign-inversion for the whole 16-bit number. Transfer X to A, apply the EOR #$FF to invert, and then add it to the high byte of musposy.
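A sketch of the Y-axis handling as described, again reconstructed from the text and not quoted from the driver. movechk, poty and musposy are assumed to be defined elsewhere, and noymove is a hypothetical label. It relies on the low byte of the delta never being zero at this point, which holds because a zero move has already been skipped.

```asm
	LDA poty        ;current pot value
oldpoty	LDY #0          ;previous pot value (operand is rewritten below)
	JSR movechk
	BEQ noymove     ;zero flag set: no move on this axis

	STY oldpoty+1   ;store the new pot value for next time

	CLC
	EOR #$FF        ;invert the low byte of the delta
	ADC #1          ;add one; carry stays clear as the low byte was not 0
	ADC musposy     ;add the negated low byte to the position
	STA musposy
	TXA             ;high byte of the delta: all 1s or all 0s
	EOR #$FF        ;invert it (TXA and EOR leave the carry alone)
	ADC musposy+1   ;add it, plus the carry from the low byte
	STA musposy+1
noymove
```

For example, with musposy at 100 and a delta of +5 (mouse rolled up), the low byte becomes $FB, the sum wraps to $5F with the carry set, and the high byte works out to $FF + $00 + 1 = $00, leaving 95.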

You might be wondering, if the move delta always fits within an 8-bit range, (which is what we're implicitly assuming when we assume that the 16-bit extended value will either be all 1s or all 0s,) then why use 16-bit numbers at all? The answer, briefly, but we'll get into more detail shortly, is because although the move delta does not exceed 8-bits, the addition of the move delta to the previous mouse position can and will cause the mouse position to exceed the 8-bit range. Therefore, it is easiest simply to get the move delta as a 16-bit value, and add it to the mouse position using a standard 16-bit add.

It is the responsibility of higher level parts of the input KERNAL module to decide how to constrain the 16-bit coordinate space to the physical screen.

Scanning Moves, Low Level

Digging a little deeper then, here's the implementation of the movechk routine. The inputs are commented, and the outputs are also commented, but the outputs vary depending on whether a move has actually occurred.

Don't worry if what it's doing doesn't make sense right away. We'll get into the hardware first, then come back to this routine and it will make more sense.

Briefly, this routine does the comparison between the old pot position and the new, applies any acceleration, and returns the signed 16-bit value (in A/X) ready to be added to the 16-bit coordinate. It's the same routine whether the input values are for the X- or the Y-axis.

Before we get into an explanation of this routine, let's talk about the hardware first.

The lines on the controller port are called the pot lines because, naturally, they were originally meant to be connected to the potentiometers in a pair of game paddles. Commodore built in support for pots so it would be easy to hook up paddles to the C64, and be straightforward to port those 1970s Atari arcade classics, like Pong and Breakout. First, let's think about the theory of how pots would work as an input device.

A potentiometer can be used to divide voltage. As the pot is turned from its minimum to maximum value, the voltage at the center pin varies from near 0V to near the maximum voltage of the circuit, in this case +5V. The voltage can thus be varied continuously (analog-ly?) from 0 to 100%. If we hooked this analog voltage level up to, say, an 8-bit analog-to-digital converter (ADC), we could get a digital value from $00 to $FF that represents the approximate position of the potentiometer.

I very quickly wired this up in iCircuit, and put the output of the ADC through a pair of 7-segment decoders so we can easily read out the digital value that the ADC is producing.

I took screenshots, above, of the digital value on the screen when the pot was set to each of 5 positions: 0%, 25%, 50%, 75% and 100%. The corresponding digital values are reported as $00, $3F, $7F, $BF, and ... $FE. Those all look great, except for that $FE. Shouldn't it be $FF? We'll come back to this in a minute.

The C64's POT lines

The C64's two controller ports each have two pot lines. These are labeled PotAX and PotAY, for port A, and PotBX and PotBY for port B. They are fed to the SID chip's pot lines. Notice that the SID only has two pot lines (PotX and PotY), but there are 4 inputs, two pot lines per controller port. To handle this, the C64 includes a 4066 chip, a Quad Analog Switch/Quad Multiplexor, which is controlled by some lines on CIA 1. It's a bit tricky to program, because all of these lines are overloading the same lines used to scan the keyboard matrix. But it allows the SID to alternate between sampling the two controller ports. For our purposes, the multiplexing can be ignored, because a mouse typically gets connected to controller port 1 and it only uses two pot lines.

According to the documentation about how C64 paddles work, here for example, the SID chip's pot lines do not behave as standard analog-to-digital converters. The pots must be connected to capacitors. Each capacitor charges up more slowly when the resistance of its potentiometer is high, and faster when the resistance of its potentiometer is low. The SID chip then samples the port by allowing the capacitors to drain into its inputs. The SID counts how long it takes for the capacitors to discharge and uses that to infer the resistance level of the pots.

The 1351 is doing something more complicated, but in short, its means of transferring information to the SID chip is the same.

Note that the SID's analog inputs do not act like ordinary voltage-to-binary-style A/D converters, and thus cannot be used in this way. Trying to do so yields a highly unlinear[sic] characteristic that is hardly usable for much of what one would use a "regular" A/D converter. Consider the SID's analog inputs ohmmeters rather than voltmeters. https://www.c64-wiki.com/wiki/Paddle

Although the SID requires these capacitors to read a potentiometer correctly, the capacitors are included on the C64's mainboard. So a real paddle doesn't have to include the capacitors, it just wires a potentiometer directly to the controller port's POT lines.

Schematic fragment showing the capacitors on the C64 mainboard, connected to the SID's POT lines.

So now let's think about what the mouse is doing. First, one mouse uses both pot lines, one line per axis. But, unlike paddles, the mouse doesn't have real analog potentiometers. This is obvious after just a moment's thought, because real pots have a fixed rotational distance. You spin the paddle wheel a couple of times clockwise and boom, it hits the end of its ability to rotate, and while in that position it always produces the same voltage level. So you spin it the other way but after a couple of turns it hits the end in that direction. Clearly this doesn't happen with a mouse. You can just keep pushing the mouse up, pick it up off the table, set it down closer to you and push it away from you again, and it never hits the end. So, how does it work?

The wheels connected to each axis use a very clever device called a quadrature encoder. You can read about how these work here, but I'll try to explain it. The wheel has evenly spaced holes, and between the holes it has—we could call them—spokes. The wheel then spins between an emitter and a detector. The emitter shines light and the detector detects that light. As the wheel rotates the light is blocked by the spokes but passes through the holes as they rotate by, which generates pulses at a rate that is proportional to the speed that the mouse is moving. Makes sense. The problem is that if this were all it were doing, it wouldn't be able to know if the wheel were spinning clockwise or counter-clockwise. So how does it detect the direction of rotation?

The solution to this is very clever. Instead of each axis having one sensor, each axis has two sensors. We'll name the sensors A and B. They are offset from each other such that while one sensor is over a hole the other is over a spoke. The result of this surprisingly simple arrangement is that when the wheel is spinning in one direction, B is detecting (B is over a hole) at the moment that A transitions from not detecting to detecting. But when the wheel is spinning in the other direction, B is not detecting (B is over a spoke) when A transitions from not detecting to detecting. Amazing. If that sounds confusing, check out this animated GIF.

Its simplicity is what makes it so ingenious. Essentially, the rising edge of sensor A is what can be used to count the rotational ticks. But every time A rises, one merely needs to look at the state of B to know if it's rotating forwards or backwards. If B is high, it's going forwards, if B is low, it's going backwards. Brilliant.
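In the real mouse this decoding is done in hardware, not by the C64. Purely as an illustration of the logic, here is how one decoding step could be written in 6502 code. Everything in it is hypothetical: the routine name quadstep, the variables lasta and count, and the assumption that a sample arrives in the accumulator with sensor A in bit 0 and sensor B in bit 1.

```asm
quadstep
	TAY             ;keep a copy of the whole sample
	AND #%00000001  ;isolate sensor A
	TAX
	EOR lasta       ;non-zero if A changed since the last sample
	STX lasta       ;remember A for next time (STX leaves flags alone)
	BEQ done        ;no edge on A, nothing to count
	TXA
	BEQ done        ;A fell rather than rose, ignore it
	TYA
	AND #%00000010  ;on a rising edge of A, look at sensor B
	BEQ backward
	INC count       ;B high: one tick forwards
	RTS
backward
	DEC count       ;B low: one tick backwards
done
	RTS

lasta	.byte 0         ;previous state of sensor A
count	.byte 0         ;running tick count
```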

But, what actually counts the rotational ticks? And how does that count get converted into a voltage level? The 1351 mouse contains a custom IC made by MOS specifically for this purpose. It's the MOS 5717. If you want a more technical description of it, you can read this spec. There is also a patent document on the 1351 mouse that goes into a fair amount of detail.

There are 7 pins on the MOS 5717 that are important for how the analog part of the mouse works. 4 of these are for the quadrature encoders. Sensors A and B for both X and Y axes.³ Internally, the chip has two 6-bit counters, one for each axis. Every time a sensor A rises the counter for that axis will count either up or down. If sensor B for that axis is high the counter counts up. If sensor B is low, the counter counts down. 6-bits is not much though. It's a maximum range of 64 (2 ^ 6 = 64), but it's signed too, because the counter can count up or down. So it can only count up to 31 or down to -32 before overflowing and rolling over.

The other three important pins are PotX, PotY and Sync. PotX and PotY are the output lines connected to the controller port's PotX and PotY lines, and sync is actually just wired up to the PotY line too. To understand how these are used we have to know something about how the SID samples the lines. Every 512 clock cycles it starts draining the internal capacitors and counts their discharge time. Then it lets them charge up again, and at the next 512 cycle mark it starts draining them again.

I'm not sure exactly how many cycles it spends draining and counting and then how many cycles it spends letting them charge up again, but it's not hugely important for our purposes. What matters is that the SID doesn't just continuously know the voltage level on the paddle pots, but rather it samples them. 512 cycles works out to be approximately 2000 times a second. (1,000,000 / 512 = 1953) The SID latches the last value sampled, so you can read the value at any time, but the latched value gets updated 2000 times a second. This is particularly relevant if you really are trying to read 4 paddles at a time. It's not enough to simply switch that multiplexor from port 1 to port 2 and then read the values off the SID, because after changing the multiplexor you have to wait at least 512 cycles, maybe even 1024 cycles, enough time for it to fully resample.

The MOS 5717, then, must simulate the pot voltage dividing behavior. It uses the sync line to detect when the SID begins sampling. Then it uses the values of the counters to output the correct voltage on its PotY and PotX lines to transfer those 6-bit values to the SID. And this transfer is, therefore, effectively analog. Now here's the real trick, after dumping those 6-bit values to the SID, the counters get reset back to zero. If the mouse is still moving, then the counters immediately start counting again. So, although the range is small, -32 to 31, the value gets read in and reset back to zero ~2000 times a second. Therefore, in practice, it's hard to get the mouse to overflow. It's certainly not impossible, but you have to really try. If you shake the mouse back and forth as fast as you can, even with a good driver, the pointer won't track properly. A jerk too fast in one direction will overflow the internal counter, rolling it over, and causing the SID to interpret it as a movement in the opposite direction. There is nothing we can do about that, but it's not a serious problem.

Interpreting a 6-bit signed value

Now we know how the mouse captures the movements, including both directions on both axes, and we know that the SID sucks these values in ~2000 times a second. But we also know that these values are transferred in a more or less analog way. Let's return now to the little iCircuit test I banged together, above. When the potentiometer was set to 100%, it should have let through 100% of the voltage, and the ADC should have interpreted the value as $FF. But it didn't, it got a value of $FE. And that is in a simulator, which in theory might be even more precise than a real world implementation, because there is no nasty real-world noise to get in the way.

The problem is that an analog to digital conversion lacks that perfect digital precision. So if we're converting an analog voltage to an 8-bit value, we should expect some noise to result in the lowest bit fluctuating on and off. Typically noise will only affect the lowest or least significant bits. This makes good sense, because by definition the most significant bit represents massive differences in voltage. The next most significant bit represents the next most massive difference in voltage and so on, all the way down to the least significant bit that is representing just tiny little variations in the voltage that could easily be fluctuating.

To handle this noise problem, the MOS 5717 shifts its 6-bits up so that when you get the value off the SID they fall between bits 1 and 6, like this: X654 321X, not between bits 0 and 5 like this: XX54 3210.
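Here's a small Python sketch of why that middle alignment helps. The function name six_bits and the sample readings are hypothetical. It just shows that readings which differ only in bit 0 or bit 7 carry the same 6-bit payload.

```python
def six_bits(pot):
    """Drop the noise bits (0 and 7) and return the 6-bit field right-aligned."""
    return (pot >> 1) & 0b00111111

clean      = 0b00101100   # X654 321X with both noise bits low
noisy_low  = 0b00101101   # same payload, bit 0 flipped by noise
noisy_high = 0b10101100   # same payload, bit 7 flipped by noise

for reading in (clean, noisy_low, noisy_high):
    print(format(six_bits(reading), "06b"))   # 010110 each time
```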

But the 6510 doesn't operate on 6-bit words, it operates on 8-bit words. So we have to convert a 6-bit signed int to an 8-bit signed int. But the 6-bits are middle aligned, so bits 1 to 5 have to be shifted down one, so they run from 0 to 4. Meanwhile, the sign bit is at bit 6 and has to be shifted or extended up to bit 7.

Diagram of how the bits must be shifted to convert from 6-bit to 8-bit.

What we're left with are those two red X's in the middle, in bit positions 5 and 6. What must they become? Let's just do a little comparison of the values in binary from 6-bit to 8-bit.

  011 111 =
  
  0x−32 + 1x16 + 1x8 + 1x4 + 1x2 + 1x1 = 31

  To represent 31 in 8-bit then, 

  0001 1111 =

  0x−128 + 0x64 + 0x32 + 1x16 + 1x8 + 1x4 + 1x2 + 1x1 = 31

With a positive number it's very easy. The low bits have to align on the right, so because the 6-bits come middle aligned it'll have to be shifted towards the right one place. And the 0 sign bit gets extended out to fill all remaining bits to the left. This sign extension is exactly what we saw earlier when converting from 8-bit to 16-bit.

  100 000 =
  
  1x−32 + 0x16 + 0x8 + 0x4 + 0x2 + 0x1 = −32

  To represent −32 in 8-bit then, 

  1110 0000 =

  1x−128 + 1x64 + 1x32 + 0x16 + 0x8 + 0x4 + 0x2 + 0x1 = −32

And with a negative number, it's just as easy. The low bits have to align on the right, so we shift everything down one place. And the negative sign bit which falls at bit 5 has to be extended out to the left into bits 6 and 7. And, through the magic of two's complement, as you can see above, this actually works. 1110 0000 in 8-bit signed is equal to -32, and 100 000 in 6-bit signed is also equal to -32. All you have to do is extend out the sign bit to fill 8-bits.
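If you'd like to check this for every possible value and not just the two extremes, here is a short Python sketch. The function names are mine, not from the driver, and it works on the already right-aligned 6-bit value.

```python
def sign_extend_6_to_8(six):
    """Copy the sign bit (bit 5) of a right-aligned 6-bit value into bits 6 and 7."""
    six &= 0x3F
    return six | 0xC0 if six & 0x20 else six

def as_signed(byte, bits):
    """Read an unsigned bit pattern of the given width as two's complement."""
    return byte - (1 << bits) if byte & (1 << (bits - 1)) else byte

for six in (0b011111, 0b100000):
    eight = sign_extend_6_to_8(six)
    print(format(six, "06b"), "->", format(eight, "08b"), "=", as_signed(eight, 8))

# Every 6-bit value keeps its meaning after extension.
print(all(as_signed(sign_extend_6_to_8(s), 8) == as_signed(s, 6) for s in range(64)))
```

The first two lines printed match the worked examples above, 31 and −32, and the last line prints True.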

And now, finally, we can return to the movechk routine to see how it processes this 6-bit middle-aligned signed int to a regular signed 8-bit int. It's been a while, so let's show the source code again so you don't have to scroll back up for reference.

Let's deal with the Y register first. Y is not really used in the main body of this routine. The old pot value gets transferred in via Y. Right away this gets written into the immediate argument of the subtraction, at the label oldvalue. And the new pot value is transferred in via the A register. The new pot value gets copied to Y, and that's it for Y. Y holds onto the new pot value throughout the routine and is one of the returned values.

The new pot value, still in A, then has the old pot value subtracted from it. Y still holds the new pot value, but now, A holds the delta between the old and the new pot values. Because the pot value is 6-bit and middle aligned, the most significant and the least significant bits are noise. The least significant bit doesn't need to be handled explicitly because it'll get bumped out when all the bits get shifted down. The most significant bit though, it matters, because it's going to be part of the sign extension. So, right after the subtraction, there is an AND #%01111111. This explicitly sets the most significant bit low.

Next, the delta value in A is compared to #%01000000. If you don't know how the MOS 5717 aligns the 6-bit signed int, this doesn't make any sense. But in fact, that 1 at bit 6 is comparing against the sign bit of the middle-aligned 6-bit value. If the comparison sets the carry, it's because the delta is "greater than" or "equal to" #%01000000. This is confusing too, but it effectively tests if the 6-bit middle-aligned int is negative. Let's say, for example, the delta is #%01000010. From an unsigned int interpretation that's greater than #%01000000. From a 6-bit signed perspective it's less negative, but it's still negative. Thus any negative value will branch us down to the neg label.

But let's stick with what happens if the value is not negative. We've already cleared bit 7, and because of the comparison we know that bit 6 is also clear. Now, we have to shift the 6-bit value to the right, so we do that with the LSR A. In one step this bumps the least significant noise bit out the right side. And LSR always brings a 0 into bit 7. So, it's a positive number, and the upper three bits are all zero, so it's properly sign extended into an 8-bit number.
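The positive path described so far can be modeled in a few lines of Python. This is only a sketch of the steps as the text describes them, not the driver itself. The function name and the sample pot values are made up.

```python
def positive_delta(new_pot, old_pot):
    """Model SBC, AND #%01111111, CMP #%01000000 and LSR A on 8-bit values.

    Returns the right-aligned positive delta, or None when the compare
    would set the carry (the negative case, handled elsewhere).
    """
    a = (new_pot - old_pot) & 0xFF     # subtraction wraps at 8 bits
    a &= 0b01111111                    # force the noisy bit 7 low
    if a >= 0b01000000:                # carry set: 6-bit sign bit is high
        return None
    return a >> 1                      # LSR A bumps out the noise bit

old = 0b00010100                       # 6-bit value 10, middle aligned
new = 0b00011110                       # 6-bit value 15, middle aligned
print(positive_delta(new, old))                 # 5
print(positive_delta(new | 0b10000000, old))    # 5, noise in bit 7 is ignored
print(positive_delta(old, new))                 # None, that's a negative move
```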

Phew! That took some thinking, eh?

If the result of this shifting is zero, though, then no actual movement of the mouse has occurred. No other work needs to be done and the code can simply branch directly to nomove. This returns with the zero flag set, so the state of the other registers doesn't even matter.

If there has been a movement though, we can apply some acceleration. I wrote a whole post about mouse acceleration sometime last year. But the conclusion is that it works out quite well to simply double the move distance if the mouse move is greater than a certain cutoff value. So if the mouse is moving slowly, then it's directly proportional, but if you move the mouse faster not only does it move proportionally faster but it moves twice as fast as that. The customizable setting, then, is what that cutoff value should be. In C64 OS, a workspace variable is dedicated for this value when comparing a positive movement, and the value is populated from system settings. A second workspace variable is used for comparing against a negative movement, and is computed ahead of time as the inverse of the value read from settings.

In the code above I've simply hardcoded this cutoff at 10. A mouse movement of less than 10 will move the mouse pointer less than 10 pixels on screen. But a movement of 11 will move the mouse pointer 22 pixels, and a move of 15 will move the cursor 30 pixels, etc.

There is no possible way that this could lead to an overflow. Because, an initial 6-bit signed value is being converted into a 8-bit signed value. So, even if the mouse movement were the maximum of 31 (011 111 in 6-bit), it will get doubled to 62 (0011 1110 in 8-bit) which is a perfectly valid positive 62 in 8-bit.
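As a quick sanity check of the doubling rule, here's a hedged Python sketch. The text doesn't say what happens when the move is exactly equal to the cutoff, so this sketch assumes (and it is only an assumption) that a move equal to the cutoff is not doubled.

```python
CUTOFF = 10   # hardcoded here, as in the article's example

def accelerate_positive(move, cutoff=CUTOFF):
    """Double a positive move that exceeds the cutoff (assumed strictly greater)."""
    return move * 2 if move > cutoff else move

for move in (9, 11, 15, 31):
    print(move, "->", accelerate_positive(move))

# The largest possible 6-bit positive move, doubled, still fits in a signed byte.
print(max(accelerate_positive(m) for m in range(1, 32)) <= 127)
```

The pairs printed are 9 -> 9, 11 -> 22, 15 -> 30 and 31 -> 62, followed by True.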

But finally, even this number is actually being returned as a 16-bit number. We know the value is positive, so the 16-bit sign extension will be all 0s. X is therefore loaded with 0. A last compare of the A register to 0 guarantees that the Zero flag will be low, so the returned values get interpreted as a mouse movement. Y is the new pot value, A is greater than 0, X is the 16-bit sign extension, and the zero flag is clear.

A negative movement

Suppose now that the comparison to #%01000000 set the carry, so the 6-bit value coming from the MOS 5717 is negative. To convert this to 8-bit signed we're going to have to extend the sign bit out to the left. So the first thing the routine does is ORA #%10000000 to turn bit 7 high.

The next couple of lines, which are in the sample code at the back of the 1351 user guide and in the code example at codebase64, are a bit of a mystery to me. They compare the value to #$ff, and if it's equal, branch to nomove. What's weird about this is that it still includes the noise bit. So, after the subtraction, if you get a 6-bit value of 111 111, that's −1, and it's middle aligned as x111 111x, but we've just forced bit 7 high, so we've got 1111 111x. So why compare to $FF? It's like, if the 6-bit value is −1, and the noise bit happens to be low, well then we'll just proceed and use the −1. But if the 6-bit value is −1 AND the noise bit is high, well then we'll consider that no move. I don't understand what this accomplishes.

Moving on. We've now got the upper two bits high, but we need to shift everything down one. That will right align the 6-bit value, and bump out the noise bit. But we also have to set the new bit 7 to 1. To do this, we set the carry and ROR A. The rotate right will shift everything down, but it also rotates the carry into bit 7. Since we've just set the carry high, our sign extension is now complete: all three upper bits are high.
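Here's the negative path so far as a Python sketch, following the steps as described in the text (ORA, the compare to $FF, then SEC and ROR A). The function name is hypothetical, and returning 0 for the nomove case is my own convention for the sketch.

```python
def negative_delta(a):
    """a is the masked delta with bit 6 set, i.e. in the range $40..$7F."""
    a |= 0b10000000            # ORA #%10000000: bit 7 high
    if a == 0xFF:              # CMP #$ff / branch to nomove
        return 0
    return 0x80 | (a >> 1)     # SEC + ROR A: carry goes into bit 7, rest shifts down

def as_signed8(byte):
    return byte - 256 if byte & 0x80 else byte

print(as_signed8(negative_delta(0b01101100)))  # -10
print(as_signed8(negative_delta(0b01000000)))  # -32
print(as_signed8(negative_delta(0b01111110)))  # -1, noise bit low
print(as_signed8(negative_delta(0b01111111)))  # 0, noise bit high is treated as no move
```

The last two lines show the oddity discussed above: the same 6-bit value of −1 gives two different results depending only on the noise bit.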

We've got the value to return, but here's where we can do the negative acceleration. It's the same idea as above: if the value is beyond a certain cutoff point we will double that value. But how this works with two's complement can be confusing. Let's say, just as with the positive movement, we want the cutoff to be 10. So for the negative movement it has to be −10. In C64 OS there are two workspace variables, the first is the positive cutoff value, i.e. 10, which is read in from a settings file. But then the negative equivalent is calculated by starting at zero, and subtracting 10, and ignoring the effect of the borrow on the carry. The negative comparison is then stored so it doesn't have to be computed every time the mouse moves in a negative direction.

The assembler, Turbo Macro Pro, doesn't like negative constants. I have no idea why, but it gives a parse error if I try to put in CMP #−10. It also gives a parse error if you try to put in CMP #0−10, because this is still negative. However, if you put in CMP #256−10, it does what it should do and computes that to 246. Or, $F6 in hex, or 1111 0110 in binary. Let's check that as a signed 8-bit value:

  1x−128 + 1x64 + 1x32 + 1x16 + 0x8 + 1x4 + 1x2 + 0x1 = 

  −128 + 64 + 32 + 16 + 4 + 2 = −10

Indeed, it equals negative 10. Perfect. A negative number in two's complement gets less negative the bigger the number gets if you interpret the number as unsigned. So, for example 1111 1111 is −1, the least negative possible int. But 1000 0000 is −128, the most negative 8-bit int. Right, so that means that the higher the number gets if you interpret it as unsigned, the less negative it becomes if you interpret it as signed. So, −10 is $F6, a big unsigned number, very close to $FF. So when we compare the value to $F6, if it's equal to or greater than $F6, that's equal to or greater (less negative) than −10, so we can skip the acceleration. That's what is done, above, at line 38. The BCS branches if the unsigned interpretation of the number is greater or equal. We already know the number is negative, because this whole section of code is only if it's negative. Thus, if it's less than $F6 it's more negative than −10.
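The unsigned-compare trick is easy to get backwards, so here's a small Python sketch of it. The names are illustrative. The comparison models a CMP followed by BCS, which branches when the unsigned value in A is greater than or equal to the operand.

```python
POSITIVE_CUTOFF = 10
# Start at zero, subtract the cutoff, ignore the borrow: same as 256 - 10.
NEGATIVE_CUTOFF = (0 - POSITIVE_CUTOFF) & 0xFF

def skips_acceleration(a):
    """True when an unsigned compare against the cutoff would set the carry."""
    return a >= NEGATIVE_CUTOFF

print(NEGATIVE_CUTOFF, hex(NEGATIVE_CUTOFF))   # 246 0xf6
print(skips_acceleration(0xFF))   # True:  -1 is less negative than -10
print(skips_acceleration(0xF6))   # True:  exactly -10
print(skips_acceleration(0xF5))   # False: -11 gets accelerated
print(skips_acceleration(0xE0))   # False: -32 gets accelerated
```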

The actual acceleration itself is done by multiplying by 2, with an ASL A. We should do some checks to make sure this actually does what you'd think it does too. Let's start with −16. That's more negative than −10, so it should be accelerated by multiplying by two.

  1111 0000 = 
	
  1x−128 + 1x64 + 1x32 + 1x16 + 0x8 + 0x4 + 0x2 + 0x1 = 
	
  −128 + 64 + 32 + 16 = −16
	
  Now let's "multiply by two" by shifting everything left.

  1110 0000 = 
	
  1x−128 + 1x64 + 1x32 + 0x16 + 0x8 + 0x4 + 0x2 + 0x1 = 
	
  −128 + 64 + 32 = −32

So that works, it makes the number more negative. Similarly to the acceleration in the positive direction, we don't ever have to worry about it overflowing. The most negative number would result if the previous pot value had been 0 and the mouse took its biggest or fastest negative movement, giving a 6-bit signed move of −32. Double that is just −64, which is not going to overflow or roll over an 8-bit signed number.
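Rather than checking one value by hand, a short Python sketch can try every negative move the mouse can produce. The helper names are mine. asl models a shift left that discards the bit pushed out of bit 7.

```python
def asl(a):
    """Shift left one place within 8 bits, like ASL A (the carry is ignored here)."""
    return (a << 1) & 0xFF

def as_signed8(byte):
    return byte - 256 if byte & 0x80 else byte

print(as_signed8(asl(0b11110000)))   # -32, i.e. -16 doubled

# Every move from -32 to -1 doubles correctly, with no overflow.
mismatches = [v for v in range(-32, 0) if as_signed8(asl(v & 0xFF)) != 2 * v]
print(mismatches)                    # []
```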

Last thing to do then, as before, we're returning this as a 16-bit signed number. So load up the X register with the sign extension, which is going to be $FF. This has the side effect of also setting the zero flag low, so we can just return. Zero flag low indicates that a move has occurred, A is negative, X is the 16-bit sign extension of the value in A, and Y is still holding onto the latest pot position, to use for the next compare.

Binding virtual coordinate to the screen

So what we've covered so far is the scanmovs routine, which reads the mouse hardware, and then passes the result, together with the previous mouse data, to the movechk routine. Movechk converts the values, compares them, and accelerates them, and then returns the move delta as a proper 16-bit signed number.

Scanmovs then takes that 16-bit delta and simply adds it to a 16-bit virtual coordinate space whose origin is 0,0. Scanmovs actually does all of this twice, once for each axis. And the movechk routine is totally agnostic to which axis it's processing. If you wanted to do something more fancy, like apply a different acceleration to the two different axes, you could either implement movechk as two different routines, or modify it to make some additional check to determine which axis it's working on.
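Adding the 16-bit delta to the coordinate is ordinary two-byte addition: low bytes first, then high bytes plus the carry. Here's a Python sketch of that. The function and argument names are hypothetical, and it only models the arithmetic, not scanmovs itself.

```python
def add16(pos_lo, pos_hi, delta_lo, delta_hi):
    """Add a 16-bit delta to a 16-bit position, one byte at a time."""
    total = pos_lo + delta_lo                  # low bytes, carry cleared first
    lo = total & 0xFF
    carry = total >> 8                         # 1 if the low byte overflowed
    hi = (pos_hi + delta_hi + carry) & 0xFF    # high bytes plus the carry
    return lo, hi

def as_signed16(lo, hi):
    value = (hi << 8) | lo
    return value - 65536 if value & 0x8000 else value

# 250 + 62 crosses the 256 boundary: the carry bumps the high byte.
print(as_signed16(*add16(250, 0x00, 0x3E, 0x00)))   # 312
# 5 + (-10): the $FF sign extension makes the result go negative.
print(as_signed16(*add16(5, 0x00, 0xF6, 0xFF)))     # -5
```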

Recall that the mouse IRQ routine makes two calls: scanmovs and then boundmus. What we're left with after the scanmovs part is complete is an updated 16-bit virtual coordinate for where the mouse pointer ought to be. Next, then, is the job of boundmus to map the coordinate to the real offsets and boundaries of the screen.

In GEOS, the mouse pointer can be, and often is, bound to a small region of the screen. For example when you open a desk accessory, like the calculator, the mouse cursor is prevented from leaving the bounds of where the calculator is drawn on screen. In C64 OS, I don't think this is something I'm ever going to do. Because, even when a utility is open, the user still has the ability to click on menus or the status bar. And if the utility wants to be modal, it can set the system modal flag. But, maybe in some special case, like a fullscreen graphics mode for a game, you could limit where the mouse pointer goes.

In this code, however, maxx and maxy are simply constants set to 319 and 199 respectively. This is the size of the screen, 0 to 319 and 0 to 199.

The four bytes musposx and musposy are the 16-bit virtual coordinates of the mouse. In C64 OS these values are in workspace memory, and you can programmatically change them to move the mouse pointer to a different place on screen. You don't have to worry about whether sprites are on, or how the sprites are positioned, or whether some custom program has taken over the mouse sprites. Set these coordinates to where you want the mouse cursor to be. When the game or special program relinquishes control of the sprites and the OS takes the mouse pointer back, it will update it to these coordinates.

The virtual coordinate of the mouse is signed. So, when a negative mouse movement is added to the coordinate, it may cause the virtual coordinate to go negative. The reasons for using a 16-bit coordinate are, A) to generically handle the fact that the screen is wider than 256, and B) to simplify the math while preventing an accelerated vertical movement from rolling the coordinate over and having the mouse disappear off the top of the screen and appear on the bottom.

Each axis is processed, one at a time. X-axis is first. The first thing we do is look at the X-coordinate high byte. If it's negative, then the coordinate is less than zero, and the value of the low byte is irrelevant and the whole X-coordinate can be reset to zero. If the high byte is zero, however, the low byte is also irrelevant, because whatever its value, it is in range and perfectly valid.

If the high byte is something positive, however, I'm taking a bit of a shortcut. I'm assuming a positive high byte will never be greater than one. Thus, if you programmatically set the X coordinate to 512 or greater the math will break. However, it is impossible for the mouse to be moved to 512, because even an accelerated maximum move is 31 * 2 = 62, which when added to the previously bound coordinate of 319 will max out at 381. So the high byte, if positive and greater than zero is assumed to be one. The low byte is then checked against maxx-256 (319-256 = 63). If the low byte is greater than 63, it is replaced by 63.
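The X-axis bounding logic can be sketched in Python like this. It follows the three cases as described, including the shortcut of assuming a positive high byte is 1. The function name bound_x is made up, and this is a model of the logic rather than the boundmus code.

```python
MAXX = 319

def bound_x(lo, hi):
    """Clamp a 16-bit signed X coordinate (low byte, high byte) to 0..MAXX."""
    if hi & 0x80:                 # high byte negative: coordinate is below zero
        return 0, 0
    if hi == 0:                   # 0..255 is always in range
        return lo, 0
    max_lo = MAXX - 256           # 63; high byte is assumed to be 1 from here on
    if lo > max_lo:
        lo = max_lo
    return lo, hi

print(bound_x(0xFB, 0xFF))   # (0, 0)    -5 is pulled back to 0
print(bound_x(200, 0))       # (200, 0)  untouched
print(bound_x(125, 1))       # (63, 1)   381 is pulled back to 319
print(bound_x(10, 2))        # (10, 2)   522: the shortcut breaks, as warned above
```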

Now for the Y-axis. Check the high byte first. If it's negative, the low byte is irrelevant, both are reset to zero, and the code falls through to the next step.

The low byte for the Y-axis is a bit trickier than for the X-axis. If the high byte is greater than zero, I am once again taking a shortcut and assuming that it's 1. It can only be greater than one if it were set that way programmatically, which would cause a problem. In any case, if the high byte is greater than zero, the Y-coordinate is automatically out of bounds. Decrement the high byte to get it back to zero, and set the low byte to maxy.

If the high byte was zero, though, the low byte may or may not be in bounds. So the low byte is compared to maxy. If it's greater than maxy it is set to maxy, otherwise, it is left alone and we fall through to the next step.
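And here's the Y-axis version as a Python sketch, again only a model of the steps described, with a made-up function name. The difference from the X-axis is that any positive high byte is already out of bounds, and a zero high byte still needs its low byte checked.

```python
MAXY = 199

def bound_y(lo, hi):
    """Clamp a 16-bit signed Y coordinate (low byte, high byte) to 0..MAXY."""
    if hi & 0x80:                      # negative: reset both bytes
        return 0, 0
    if hi > 0:                         # assumed to be 1: automatically too big
        return MAXY, (hi - 1) & 0xFF   # decrement the high byte back to zero
    if lo > MAXY:                      # high byte is zero, low byte may be too big
        lo = MAXY
    return lo, 0

print(bound_y(0xF0, 0xFF))   # (0, 0)
print(bound_y(5, 1))         # (199, 0)
print(bound_y(230, 0))       # (199, 0)
print(bound_y(100, 0))       # (100, 0)
```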

Mapping to the sprites

Finally, we have a 16-bit coordinate, that is in bounds. Now it has to be mapped onto the sprites.

The sprites are a bit tricky. Each sprite has a register in the VIC-II for its Y position and a register for its X position. Unfortunately, the X position may need more than 8 bits, so the extra X positioning bit is found in another VIC-II register. To save space, the high bits from each of the 8 sprites are found in the 8 bits of that one register, $10, which we've defined as xposmsb.

Sprites on the C64 are able to leave the top and left edges of the screen. This is obviously useful for games that allow objects to enter or exit the screen from the left or top edges. To accommodate this, the origin of the sprite coordinate system is offset up and left from the origin of the bitmapped screen area. The offsets of the sprite coordinate system are set in two constants offsetx and offsety. To map the virtual mouse coordinate to the actual screen we have to add the sprite offsets.

As in all arithmetic involving more than 8-bits, we add the low bits first. Clear the carry, load the low mouse x position, and add to it the offset. This may cause an overflow and the carry to be set, but whatever the result, it can be written as the sprite's x position low byte. In C64 OS there are 2 sprites used for the mouse pointer, and they are always in the same place, so this value is written to both xpos and xpos2.

Next, we load in the high byte for the X position, and add to it #0, because the offsetX is only 8-bit. This add includes the carry that may have been set high from the previous addition. There are only two possible results of this, either the result will be 0 or 1. If it is 0, branch to clear the high bits. We simply load in the xposmsb, use AND to clear the low two bits without affecting the upper 6 bits and write the result back to xposmsb. If the result was 1, though, we have to set those two low bits. Load in the xposmsb, ORA to set those two low bits, then branch past the clearing code to write the result back to xposmsb.
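Here's the X mapping as a Python sketch. The value of offsetx isn't given in this part of the text, so the 24 used below is an assumption on my part (it's the usual sprite X offset for the left edge of the standard 40-column screen). The function name is also made up, and the register is modeled as a plain integer.

```python
OFFSETX = 24   # assumption, not taken from the article

def map_sprite_x(mus_lo, mus_hi, xposmsb):
    """Return (xpos low byte, new xposmsb) for the two mouse sprites."""
    total = mus_lo + OFFSETX          # low byte plus the offset, carry cleared first
    xpos = total & 0xFF               # written to both xpos and xpos2
    carry = total >> 8
    hi = (mus_hi + 0 + carry) & 0xFF  # high byte plus #0 plus the carry
    if hi == 0:
        xposmsb &= 0b11111100         # clear the low two bits, keep the upper six
    else:
        xposmsb |= 0b00000011         # set the low two bits, keep the upper six
    return xpos, xposmsb

print(map_sprite_x(0, 0, 0b10100011))     # (24, 160)  left edge, MSBs cleared
print(map_sprite_x(240, 0, 0b10100000))   # (8, 163)   240 + 24 carries into the MSB
print(map_sprite_x(63, 1, 0))             # (87, 3)    319 maps to 343
```

Notice that the upper six bits, which belong to the other sprites, pass through untouched in every case.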

It's not terrible, but it is certainly not pretty. Many a budget game from back in the day limited the game area to the first 256 pixels of the screen so they wouldn't have to deal with moving the sprites into that overflow area.

The last thing to do is to set the sprite's Y position. But this is very easy. It's all just 8-bits. Clear the carry, load the virtual mouse y coordinate, add to it the offsety. The result cannot overflow, because the maximum virtual coordinate is 199, and the offsety is only 50, so a sprite that is positioned in the last pixel row of the screen will be at 249. Write the result into the VIC-II registers for the mouse Y position for both sprites, and we're done!

Final Comments and Full Source

I must admit, this turned out to be longer than I thought it would be. I started writing this sometime in November, but then I got tripped up for a while trying to figure out how the SID samples the POT lines. I even acquired my first pair of Commodore Paddles (Thanks Jérémie!) so I could open them up for myself and check to see if there were capacitors in them. I found that the paddles don't have any caps. The C64-wiki article documents the SID correctly (it needs caps to sample the pots correctly), but it took me a while to realize that the C64 mainboard includes those caps.

Then World of Commodore 2019 was coming up quickly, so I put my head down and started trying to fix bugs in C64 OS to get ready for that. And I started preparing my presentations. I gave two talks, plus I wrote a simple "Presenter" program for the C64 to present my slides. I completely ran out of time in November, December came, World of Commodore came and went. And so here, finally, is the completed post about the 1351 mouse and mouse driver.

I'm really pleased that a post like this has gone way beyond just the driver, and got a chance to dive into quadrature encoders, and how the SID samples the pot lines, and how to convert signed numbers from 6-bit to 8-bit to 16-bit with sign extension. I feel like a lot of good stuff has been covered by this post. And I hope that someone, somewhere will find it useful.

Here's the full source in a Gist.

You can read all about GEOS input drivers in The Official GEOS Programmers Reference Guide, Chapter 7: Input Drivers. [↩]
The screen origin is at the top left which is represented by coordinate [0,0]. Any more complex positioning of the hardware sprites is entirely abstracted. [↩]
The mouse, and by this we mean the MOS 5717, can be put in Joystick emulation mode too. But let's just ignore this little complication and assume we're only talking about the behavior while in the native analog mouse mode. [↩]


Greg Naçu — C64OS.com