Tag - commodore

re-examining XOR data checksum used on amiga floppies
logic analyzer on Amiga 500 Paula chip
how the amiga reads floppies
new book
too many variables

re-examining XOR data checksum used on amiga floppies

So I was wondering exactly how effective the XOR checksum that Commodore used for the amiga sectors was.  If I read a sector, and perform the checksum, and the checksum matches the stored checksum, then the data is the same, right?  Not always.  I had expected it to be better, maybe not MD5 or SHA-1 level, but I expected it to be decent.

I had run into this paper a year ago when I was looking at my transfer checksums.  But this certainly also applies to the amiga sectors, too.

Some good excerpts:

  • the percentage of undetected two-bit
    errors becomes approximately 1/k (k being checksum size), which is rather poor performance for
    a checksum (i.e., 12.5% undetected errors for an 8-bit checksum, and 3.125% for a 32-bit checksum).
  • The XOR checksum has the highest probability
    of undetected errors for all checksum algorithms in this study…..
  • There is generally no reason to continue the common practice of
    using an XOR checksum in new designs, because it has the same software computational cost as an
    addition-based checksum but is only about half as effective at detecting errors.

So I wanted to actually prove myself that XOR is that bad, looking at it from the amiga sector perspective, so I wrote a quick program to:

  1. Create a random block of data 512 bytes long
  2. Calculate the 32-bit XOR checksum based on 32-bit chunks at a time and store it as the last byte (as does the amiga)
  3. Select a number of bit inversions(basically corrupt the data in a specific way) which can affect data and/or stored checksum
  4. Recalculate the checksum of the modified block
  5. If the checksums MATCH, meaning that two different sets of data yield the same checksum, then this is an undetected error.
  6. Count the number of undetected errors vs total errors.

That paper lists the properties of an XOR checksum, and I wanted to compare results:

  • Detects all single-bit errors.  (my testing confirms)
  • Fails to detect some two-bit errors. (my testing confirms, see below)
  • If total number of bit errors is odd, XOR checksum detects it. (confirmed with 1,3,5,7 bit errors)
  • Fails to detect any even number of bit errors that occur in the same bit position of the checksum computational block. (confirmed with 2,4,6,8)

The two-bit errors is really the case that I worry about.  If two bit errors occur in the same bit position, and the inverted oppositely(1—>0 and 0—>1), then it won’t be detected.  So how often does this happen with random data?  Paper’s author, Maxino, says 3.125% of the time.  I can definitely confirm this.  My testing shows 1.8 million hits over 56 million tries.   There are some differences with Amiga data, but I think the results can be the same.  I might load some example amiga sectors and try them.

Once the number of bit errors increase, the probability of them happening in the same bit position, in the same direction, and occurring evenly on each position goes down —- and therefore the overall chance of undetection decreases as well.

4 bit undetected errors happen 0.3% of the time.

6 bit undetected errors happen 0.04% of the time.

8 bit undetected errors happen 0.009% of the time.

I have other testing to do, including running burst error tests.

So for each sector, you really don’t want to see only two bit errors occur.  A single bit error, sure.  Multiple bit errors, ok, because we can detect them.  You don’t want to think you have a good copy when it fact the checksum passing was a false positive.

logic analyzer on Amiga 500 Paula chip

So I’ve attached my Intronix LA1034 to the Paula chip, the 16-bit wide data bus, and the DMA Request line. I triggered the logic analyzer on 0x4489, of course.

Here’s the results:

(click for a full size version)

So I’ve spent most of my time sort watching it come out of a floppy drive, but never from Paula’s perspective. It’s always before Commodore’s custom LSI floppy controller Paula could get ahold of it. So notice that the data is still RAW MFM, and not processed in any way, because Paula doesn’t perform this function. The MFM decoding is done in the trackdisk.device, with the help of the bit blitter.(part of Agnus)

So the normal floppy sync pattern is 0xAAAA 0xAAAA 0x4489 0x4489, but why do we see 0x2AAA ??

So, 0x2AAA is 0010 1010 1010 1010. right? We’re missing the first bit. It turns out there was a bug in the MFM encoding routine. It was fixed March 16th 1990 at 1:08am in the morning, in revision 32.5 of the trackdisk.device.

Also, most times, the sync word used was 0x4489 — which is exactly what I use to find sync in the firmware I wrote for the microcontroller.

Oh and here’s Paula with the great EZ hooks on her leads

(click for full size)

how the amiga reads floppies

So I’ve forever wondered, at the hardware level, how the amiga reads floppy disks.  I’ve gotten bits and pieces over the years, but I’ve never really understood the bulk of it.

So there are four main chips involved in the floppy controller — the functions are not grouped together like they would be on a NEC765 or similar controller.

The four chips involved are:

1. 8520 Complex Interface Adapter (CIA), the ODD one, U7 on the schematics.  This is a generic I/O chip that handles a bunch of things — but specifically the control OUTPUTS from the drive to the amiga.  Commodore calls this the “disk sensing” functions.

2. 8520 Complex Interface Adapter (CIA), the EVEN one, U8 on the schematics.  Same as above, but this one handles the control INPUTS from the amiga to the drive.  Selection, control, and stepping.

3. Gary handles the state of the MOTOR of the floppy drive, and takes what write data/gates from PAULA, does some magic stuff (still working on what Gary actually does), runs it through a NAND gate configured as an INVERTER, and then pipes it to the drive.  Gary handles just disk writes, so controller to drive output.

4. Paula handles processing the incoming read data from the drive, handles the DMA including firing off interrupts to the 68K when the SYNCWORD is found, or when the DMA is complete.  Paula has the real job of doing the data separation and has a digital PLL circuit in hardware.  Note that it appears that Paula doesn’t select, turn the motor on, select sides, etc etc whatsoever.  The programmer/OS has to handle all that stuff.  Paula just brings the bits in, DMA’s them into memory, and lets the rest of the processes handle everything else.

All the MFM stuff is handled inside the trackdisk.device stored in the Kickstart ROM.  I’d like to at least partially disassemble the ROM code since thanks to emulators, the ROM files are everywhere.  Maybe I’ll let IDA have a crack at it.

I would _really_ like to see the dpll circuit setup inside Paula to see exactly how Commodore implemented it on the Amiga.  The paper I recommended a couple posts back talked specifically about design decisions surrounding DPLL and I’d love to know what methods the original engineers used.

Originally I thought the controller was almost entirely in software (and actually, the majority of it is, in fact software) — but Paula has some disk controller hardware too.  You always need to have some hardware components for turning leads off and on — but I’m not too surprised that there is some custom hardware there.

I’m ordering a A500 service manual — which I think I already have a foreign copy of — I’d like as much info as possible.


While I’m hesitant to post this at all, this is a quick and dirty application I threw together, and I do mean threw, that will check for valid MFM bytes.

See the previous post for the link to the actual bytes.

It also ignores 0xFF — which is in fact an illegal byte, but I use it as a delimiter. This means a good delimited file from my SX project should only register one bad byte, and that’s the last byte, the checksum.

This program simply displays the number of valid bytes, followed by the number of illegal bytes, and calculates the percentage of them. My good read from earlier read .007%, which is like exactly one byte out of 13,824 wrong. So good answers should be very low.

I made this as a quick utility to check the output from my SX instead of fooling around with my full MFM decoding software.

Note it has a hard limit at 15,000 bytes, and is designed to take exactly one raw commodore amiga mfm track as an input file.


While I’m sure this is virus free, it’s ALWAYS a good idea to check executables before running them. I have no clue on the requirements works on XP SP2.

new book

I recently received “On the Edge: the Spectacular Rise and Fall of Commodore” and have read about half of the book so far.  Very interesting read.  It’s fun to read about how the various Commodores were developed including the Amiga.  It gives an insider’s view into the people and events that took place.



too many variables

I think that’s really the problem I’ve been facing with this floppy project. Trials and tribulations indeed. Or at least trial and error, with stress on the error part. There are probably less than half-a-dozen people worldwide that have the very specific knowledge that I need to get this off the ground. These would be people from CAPS, one person who sells an amiga (as well as other floppy drives) floppy controller card, and that’s it.

I’ve talked to some really nice ex-Commodore employees who have been friendly, and helpful.

The problem, of course, is that these machines were created a long time ago. The programmers of the original OS, freeware/shareware software, hardware, have all long forgotten the details. And you know what they say, “the devil’s in the details.”

To give you an idea of the variables with which I’m working:

1. Am I reading the hardware signal right? Are there *really* only three possibilities coming out of the drive, a “10”, “100”, and a “1000”? This is all I’ve been able to observe…

2. Is the software I’ve written on the SX microcontroller properly sampling this data at the right time? I have two interrupts occuring, one for the edge detection and one for the 2us timeout which clocks a zero in (up to 3 of them in a row before going idle). So far, sometimes this works, and sometimes it doesn’t. Why and how is this getting out of sync?

3. Is the SX transmitting faster than the PC can handle this? So far, my observations say no, and I implemented a software FIFO to help out.

4. Is my software FIFO working?

5. Is my PC software, that’s designed to receive all this, working properly? I’m now storing the transmitted bytes in a memory array, and later storing to disk, to prevent any speed issues associated with accessing the hard drive.

The REAL problem here is that I simply don’t know what is actually leaving the drive in terms of data. The only thing I’ve been able to figure out is that the MORE data is leaving the drive than is showing up on the PC. This is a bad sign. Where’s the data being lost? My guess, the sampling isn’t working properly. Something is slipping. But how the heck can it slip when a transition resets it, and transition is occuring on a very regular basis (minimum every 4us, maximum every 8us). The thing only has to run freely for a few cycles.

Add the fact that the only real pattern that I can VISUALLY look for is the 0xAAAA 0xAAAA 0x4489 0x4489 raw sync sector in raw MFM. Any words, or real data has to be decoded before, properly aligned on a byte, etc. PAIN IN THE BUTT to figure out if everything is actually working. Most of my programming projects, whatever they might be, are straight-forward. Run it, see if it works, and then go back and fix it if it doesn’t. Even the indication that this thing is working is obscured.

There are so many variables, and I’m constantly changing all of them, that who knows when one part is working? Of course, you say, only change one thing at a time. OK, makes sense, but this just isn’t practical. If I change how the SX xfers to the PC, then I have to modify the PC software accordingly…. So if I make a mistake in coding on one side or the other, who knows?

Maybe I’m lacking the necessary basic project management, coding, microcontroller, hardware experience required to get this off and running?

BTW: my title of this entry reminds me of “Too many secrets” or “Cootys Rat Semen.” from Sneakers. If you don’t know what I’m talking about, please disregard! 🙂