Tag - MFM

found bug last night
characterizing speed performance of floppy drive controller
working FPGA version of the amiga floppy project
bought a Saleae Logic. Another tool for my toolbox
logic analyzer on Amiga 500 Paula chip
how the amiga reads floppies
error correction using hamming distances
feasability of writing disks
data rate vs throughput
SlackFest 2007

found bug last night

With this new FPGA solution, certain tracks would result in what I call a “short read.”  A short read is any received track that contains less than 28,125 delta T’s, aka pulse times.  Given a certain capture time, there are minimum’s and maximum’s of the number of pulses on an amiga track.

If we have a 300 rpm drive, then its 60s/300 = 200ms per revolution.  If the bitcells are 2us wide, then you have at most 200ms/2us = 100,000 bit cells.  The one’s density is at max 50%(raw MFM ’10’), so this means every other cell would contain a 1. So 50,000 pulses, so 50,000 delta T’s.  The minimum one’s density is 25%(raw MFM ‘1000’), so 25,000 pulses.  Now we read more than just one revolution of data, because we will very likely start reading in the middle of a sector.  So instead of 11 sectors worth of read time, we actually need to read 12 sectors worth, to ensure we read the entire sector in which we started.  This is 218.2ms of time minimum.  We could potentially re-assemble data, using some type of circular buffer, but this is more trouble than it’s worth.  I currently read 225ms of data.

225ms / 2us = 56,250 maximum, 28,125 minimum.

I had my FTDI chip, for the usb<->ttl converter, D2XX USB parameters setting the USB transfer size to 57000 bytes.  This is definitely over and above what was needed.  Or so I thought.

I bumped the transfer size from 57000 to 60032 (docs said specifically 64 byte multiples), and everything started working.  I had already narrowed it down that the problem tracks were ones that had a high density, where there were lots and lots of pulses.  So I knew the size of the track was related.  I checked for FIFO overflow, and it wasn’t overflowing.

I’ve got to look when I have a free second, but I think my USB packet size is 4096 bytes.  So 56250+4096 (some amount of padding?) = 60346.   Uh-o, I better bump that to 60,352.  I think the driver (or windows?) that maxes out at 64k transfer size, so I still have a little wiggle room.

Long and short is that it appears to be working much better.  I was glad to find this bug with just a little brainstorming, and getting better visibility into my actual pulses count on the FPGA.

characterizing speed performance of floppy drive controller

So I’ve got things working rather swimmingly right now.  Switched drives from Samsung to Sony, and it’s made a huge difference.  The Sony just seems to work better.

I’m averaging about 355ms per track, yielding 57s total disk times.  The 355ms is made up of 313ms of transfer time at an effective throughput rate on the serial of around 1.175mbps.  Which is basically 1.5mbps baud rate, theoretical max of 1.2mbps.  This isn’t horrible performance, but I really want to get back to 2mbps.  I haven’t been using 2mbps because I have massive errors, but I think that there is some round off happening in my UART that prevents it from working correctly.  I need to revisit my UART code and find out exactly why 2mbps doesn’t work.  I’ve run this usb->ttl converter at 2mbps with my uC, so it really should work fine.

If I go to 2mbps, I’ll EASILY chop off the 88ms from 313ms, and I’ll be transferring the track to the PC in REAL TIME.  Basically, as fast as I receive it, I’ll be sending it to the PC.  Remember, that because I transmit the pulse times, and not the data, that fast times are really required.  This is a little more complicated than just saying the RAW MFM rate is 500kbps, so you need 500kbps of bandwidth to the PC.

There are several optimizations I can do, and I’ll post more later.

working FPGA version of the amiga floppy project

So, I’ve been working on the FPGA version of the amiga floppy project for some time.  I just recently had a fair bit of free time, and so everything came together rather quickly!

I’m now able to read amiga floppy disks in using the same Java client software I had developed for use with the Parallax SX microcontroller board.  There were a few minor changes in the software — most notably the read data routine from the hardware.

I’ve written the code in Verilog on a Xilinx Spartan-3e evaluation board.

The various hardware parts I’ve described:

  • UART: Written from scratch, a transmitter and a receiver.   Simple to use, variable baud rates.
  • FIFO: Generated from Xilinx’s CoreGen. This connects the floppy interface to the USB interface. 32k bytes
  • FSM to always empty the FIFO to the PC.  Once something goes in the FIFO, it immediately gets sent to the PC
  • Read-floppy-FSM: Stores 225ms of Delta T’s (aka time between edges) as 8-bit integers into the FIFO.
  • Command FSM: Receives single-character commands from the java software to execute (R for read, U for upper head, L for lower head, etc)
  • Transmit test pattern routine: Sends 32k of data to the PC to test for reliable communication

A couple advantages with the FPGA solution:

  • We transmit the data to the PC as soon as it’s available.  I want to characterize the actual latency, but it should be pretty small.  This is different from my load->FRAM, and then FRAM-> PC method.  This method should be much faster and we’re just not idling for 225ms.
  • Instead of transmitting the bit-sync’d raw MFM to the PC, I’m sending the delta T’s.  While this requires a little more processing on PC, the PC can more easily determine why a particularly sector can’t be read.  For instance, is the time between pulses too small? Too large?  On a fringe edge?  Plus, since the Java decodes these, I can now add sliders for “acceptable delta T’s” for each 4, 6, or 8us windows.  Before that would require modifying the firmware on the microcontroller.  I can also start to do statistical analysis on the pulse times.

I am currently doing about 430ms per track.  This sucks.  I was beating that by 100ms with my microcontroller.  However, the problem is that because a variable amount of data is going to the PC, the PC receiving code does not know when exactly to stop receiving, so there’s a wait-timer which I have to optimize.  Once I receive the minimum amount of data, I wait 100ms since the last received data, and then exit.  I’ve got to put my logic analyzers in place and figure out how to optimize it.

Denis@h3q can read a disk in 43s, which is pretty darn good.  He is using tokens like I mentioned here and here and here.  I’m transferring much more data though, which gives me more information.  I like his time, and maybe that would be a nice goal to beat!  Wow. That’s only 269ms/track.  Hrrrmm…. that’s pretty good.

bought a Saleae Logic. Another tool for my toolbox


I bought at Saleae Logic which is an inexpensive logic analyzer.  See link here.

It isn’t nearly as fast (only samples at 24mhz max), and it doesn’t have as advanced triggering capabilities, but it does do millions->billions of samples.

So, of course, I put it to the test!  I recorded 5 million samples at 24mhz which works out to be 210ms, just slightly over a floppy track time of 203ms.  I sampled an entire track, which is FANTASTIC if you know anything about logic analyzers.  They just don’t usually store much.

Then I wrote a small C program which converts the exported binary data to RAW AMIGA MFM.  I searched for binary patterns of the sync code 0x94489, and exactly 11 of them came up.  Which means that my little code is working, the logic analyzer is correctly reading the data.  I still have to try to decode this track and see if it decodes properly, but this is pretty neat.  It’s like third party verification of what I’m doing.

I have these odd exception cases where sometimes a track refuses to read although the amiga reads it perfectly.  I’m going to get the bottom of those cases.

I hate to say this, but everything just worked tonight.  No problems. Brought back up the java client, programmed the SX, and off I went.  Pretty neat.

I’ll have more to say on this logic analyzer, but the software is really nice, clean, simple.  It does its job.

I can’t tell you for how long I’ve wanted to sample a whole track w/ some test equipment.  You can buy $$$ logic analyzers and not get this type of buffer space…. It achieves it by doing real-time samples straight to the PC’s ram……

logic analyzer on Amiga 500 Paula chip

So I’ve attached my Intronix LA1034 to the Paula chip, the 16-bit wide data bus, and the DMA Request line. I triggered the logic analyzer on 0x4489, of course.

Here’s the results:

(click for a full size version)

So I’ve spent most of my time sort watching it come out of a floppy drive, but never from Paula’s perspective. It’s always before Commodore’s custom LSI floppy controller Paula could get ahold of it. So notice that the data is still RAW MFM, and not processed in any way, because Paula doesn’t perform this function. The MFM decoding is done in the trackdisk.device, with the help of the bit blitter.(part of Agnus)

So the normal floppy sync pattern is 0xAAAA 0xAAAA 0x4489 0x4489, but why do we see 0x2AAA ??

So, 0x2AAA is 0010 1010 1010 1010. right? We’re missing the first bit. It turns out there was a bug in the MFM encoding routine. It was fixed March 16th 1990 at 1:08am in the morning, in revision 32.5 of the trackdisk.device.

Also, most times, the sync word used was 0x4489 — which is exactly what I use to find sync in the firmware I wrote for the microcontroller.

Oh and here’s Paula with the great EZ hooks on her leads

(click for full size)

how the amiga reads floppies

So I’ve forever wondered, at the hardware level, how the amiga reads floppy disks.  I’ve gotten bits and pieces over the years, but I’ve never really understood the bulk of it.

So there are four main chips involved in the floppy controller — the functions are not grouped together like they would be on a NEC765 or similar controller.

The four chips involved are:

1. 8520 Complex Interface Adapter (CIA), the ODD one, U7 on the schematics.  This is a generic I/O chip that handles a bunch of things — but specifically the control OUTPUTS from the drive to the amiga.  Commodore calls this the “disk sensing” functions.

2. 8520 Complex Interface Adapter (CIA), the EVEN one, U8 on the schematics.  Same as above, but this one handles the control INPUTS from the amiga to the drive.  Selection, control, and stepping.

3. Gary handles the state of the MOTOR of the floppy drive, and takes what write data/gates from PAULA, does some magic stuff (still working on what Gary actually does), runs it through a NAND gate configured as an INVERTER, and then pipes it to the drive.  Gary handles just disk writes, so controller to drive output.

4. Paula handles processing the incoming read data from the drive, handles the DMA including firing off interrupts to the 68K when the SYNCWORD is found, or when the DMA is complete.  Paula has the real job of doing the data separation and has a digital PLL circuit in hardware.  Note that it appears that Paula doesn’t select, turn the motor on, select sides, etc etc whatsoever.  The programmer/OS has to handle all that stuff.  Paula just brings the bits in, DMA’s them into memory, and lets the rest of the processes handle everything else.

All the MFM stuff is handled inside the trackdisk.device stored in the Kickstart ROM.  I’d like to at least partially disassemble the ROM code since thanks to emulators, the ROM files are everywhere.  Maybe I’ll let IDA have a crack at it.

I would _really_ like to see the dpll circuit setup inside Paula to see exactly how Commodore implemented it on the Amiga.  The paper I recommended a couple posts back talked specifically about design decisions surrounding DPLL and I’d love to know what methods the original engineers used.

Originally I thought the controller was almost entirely in software (and actually, the majority of it is, in fact software) — but Paula has some disk controller hardware too.  You always need to have some hardware components for turning leads off and on — but I’m not too surprised that there is some custom hardware there.

I’m ordering a A500 service manual — which I think I already have a foreign copy of — I’d like as much info as possible.

error correction using hamming distances

I’ve been thinking for quite some time about error correction.  About using the structure of the MFM to our advantage.  I’ve come up with something although I haven’t implemented the idea yet.

As you know from my many references to it, http://www.techtravels.org/?p=62 , this post shows the 32 possible byte combinations from legal raw MFM.

I’ve run some numbers today.  First, there are illegal values, and then you have illegal PAIRS of MFM bytes.  For instance, while 0xAA is valid by itself, and 0x11 is valid by itself, 0xAA11 is illegal, because the combination of those would produce 4 consecutive zeros, and as we know, this is illegal.  Just like 0x55 is legal and 0x88 is legal, but 0x5588 is illegal because that would produce back-to-back 1’s, also illegal.

So as far as PAIRS go, roughly 1/3 of the total possible pairs (32*32=1024), are illegal.  That works out to be 341 pairs or so.  I was always interested in the percentage of bad pairs so we could use that info to detect bad bits/bytes.

So here’s my logic:

We need to know the data is bad, and that we fixed the data once we’ve attempted some error correction stuff.  The checksum works fine here.  The checksum uses XOR, and while its not perfect, it provides us with a decent check.

Next, we need to know which bytes are bad so we know which ones to fix.  This is also easy.  Because we know which raw values are ok, we can simply check the bytes against a table (when the checksum fails), and then we know roughly which ones are bad.  The byte before or after might be screwy, but we have the general location.

So now that we know which bytes need fixed, we need to figure out what the correct values for the bytes would be.  Enter hamming distance.  If you don’t know what that is, look here.  Long and short of it is that it provides a measurement of how many bits need flipped/changed to go from one byte to another.  In our case, it tells us that if we have a single-bit error (and this we have to guess on), and we give it the bad byte, it will tell us which LEGAL mfm bytes the byte could have been.  I wrote a small c program today that calculates the hamming distance between BAD BYTES, and legal mfm bytes.  Note this produced ~7200 lines of results.

So then what, we now have a small subset of guesses to make at the byte value.  Now, if there is just one bad byte, then its easy enough to swap in some choices, and then re-checksum the track to see if that fixes it.  If there are multiple values, this becomes much harder for me to wrap my head around programmatically wise.  I think the problem wreaks of recursion, which I’ve never been very good at.  I need to test all the possible (or probable) values in each of the appropriate locations.  This is not simple iteration, as far as I know.

So here’s some examples: (these are in decimal,not my normal hex)

Let’s say our bad byte is DEC 83.  Once again, this is illegal because of the “3” ended in binary 11.

So, the output from my program tells me that a single-bit error could have cause an 81 to look like an 83, that means bit 2 in that byte got mangled from a 0 to a 1.

But it could been an 82 as well.  So bit 1 might have been set, when it should have been cleared.

But that’s it.  Assuming that we just had a single-bit error, then those are the only two possibilities.  I think single-bit errors are the most common, but I’ll need to work on this.

OK, so now let’s look at 2-bit errors in the same byte.  If bad byte is still 83, then 17, 18, and 85 are possible choices.

What I’m really doing is intelligently narrowing the search space to valid possible choices.  And then, I could apply the PAIR logic from above, and potentially eliminate more of the choices.

I think it might make sense to check upwards of 3-bad bits of error…… After that it seems like maybe brute force might just be better/simpler…..

While this stuff is nice, if we have a byte get mangled in such a way that it produces another legal different raw mfm byte, then it might be impossible to check.  I can still apply my pair-logic to it which might help identify which byte is the problem byte.

Definitely different and excited stuff.  I’ve never seen this idea used before, at least not in the amiga floppy world.

feasability of writing disks

While there are some other bug fixes and documentation to be done on what’s already been implemented, I started thinking about writing ADF’s to disk over the last few days.

While the hardware part of it is already in place, there are some things that would need done:

  • The interrupt service routine would need modified to not just read data by reacting to edges, but now extract a bit from memory and put an appropriate pulse on the writedata pin. Floppy drive specs say the pulse should be 0.1us to 1.1us wide.
  • Write an SX routine that would receive data from the PC and store it in the fram. This would need to checksum the received data and either error out or request a retransmit.
  • Write PC routines that would create the full MFM track: I’d need to create a header (easy enough, use sector numbers, 11-sector number, track number, and then checksum everything), then MFM encode the data. I’m already doing much of this in the decode portion, so I can basically do the opposite for encoding.
  • Of course there’ll need to be a “controlling” pc routine, much like my current readtrack() routine.

So the whole process of writing a disk would go something like this:

  1. Have user browse for the filename, select the file, load it into a “floppydisk” structure/object that currently exists.
  2. Rewind and make sure I’m at track 0.
  3. Create the first track’s data.
  4. Send a command to the SX to tell it to receive a track’s worth of data.
  5. Send the data making sure it’s been received correctly.
  6. Tell the SX to write the track.
  7. SX enables the write gate by lowering it, and starts pulsing out data, pausing the appropriate times for 0-bits.

I don’t see any major hangups, although there are a few timing related things to get right. I’ve got make sure 18ms has passed after each step pulse. And I’ve got to make sure 100us has passed since I’ve selected the drive(this is mainly during setup for the first trak). For the last track, I need to make sure I pause for 650us after the last track is written. I also have to make sure that the time from the write gate dropping to the first bit MUST be 8us or less. Same with the final bit, I have to raise that write-gate within 8us after the last pulse.

I’ve got to look into creating a gap, sending SYNC words, learning wtf pre-write compensation is, etc.

data rate vs throughput

Although I’m transmitting from the Parallax SX microcontroller to the PC via USB at a data rate of 2mbps, the actual throughput is much less.  First, there is some overhead, namely start and stop bits, which is 25%.  Next, I actually have to read the data from the FRAM, and this takes time.

It takes approximately 1.7us to read 1 byte, and then 5us to transmit that byte.  The 5us is 500ns (1/2mbps) * 10 bits (8 start + 1 start + 1 stop).  So 6.7us per byte.  This doesn’t include the time it takes to findsync().

So my throughput is approximately ~800 kbps on a data rate of 2mbps.

Kind of sucks, but getting to 2mbps is impossible unless I integrate the reading/findsync’ing into the transmit routine.  And I think that’s generally a bad idea.  I really want to protect my bit times so I have quality transmission.  I don’t want to get into the uh-o situation where processing something else overruns the time slot, etc.

Yeah, so right now it looks like <READDATA aka pause><SEND BYTE><READ><SEND><READ><SEND> and so on.  To get to 2mbps, I’d basically have to <send><send><send><send>.  Now, if I could utilize the “dead time” between bits to actually read the data….. well… then I might be closer.  Remember, too, that I’m bit-banging so doing something interrupt driven is out of the question.

I’m not 100% sure the PC isn’t introducing some of this delay.  Which is why I’ve been looking at revamping the read routines.  First, they are butt-ugly, and second, they don’t handle error cases well.  Actually, they hang in error cases.

I’m still floating around the idea of error CORRECTION by taking advantage of the inherent structure of MFM.  I really think that there is something here.

Next steps are to try to work out a better read routine, and then implement a retry system for bad tracks.

SlackFest 2007

Ok, I’ll admit it.  I’ve been slacking like the best of them.

I’m not horribly unsatisified with my progress, especially this year.  The first four months of this year I achieved the following:

* Whole first ADF written January 29th
* Much cleaner hardware, single circuit board February 14th

* Data is now byte-sync’d, transferred sectors in order Februrary 17th

* Working Java client March 26th
As far as next steps go, I’ve got to

1> get error correction and retries working in the java client

2> clean up the gui and set up a “save config file” option, selecting serial ports, etc

3> clean up the java code

4> start testing it under Ubuntu

I’m very interested in error recovery, and I have been putting a decent amount of thought into it.  I’m really fascinated by the fact that Amiga floppy disks have tons and tons of structure.  First, the MFM is output in such a precise and predictable way, ie we know that no two 1’s are found back to back, no more than three consecutive zeros, etc.  We also know about the way clock bits are inserted.  And because of this fact, only certain bytes are possible when made up of these certain bit patterns.  In fact, only 32 are possible.  With the help of a checksum, I think it would be possible to brute-force a certain number of bad bytes.  Now I do think that the checksum is particularly unhelpful for this (XOR, I think), something like MD5 would be da bomb.  No false chance of duplicates yielding the same checksum.  I don’t understand or appreciate the (math-based) ramifications of using XOR though.  It’s certainly used everyplace within encryption, so this is might be better than I think.

My read routines are no longer raw, although I’m probably going to go back and add a raw read function.

I’ve tossed around the idea of simplifying this project even further by eliminating the FRAM memory and adding real-time functionality which I know the emulator crowd wants.  This is further down the line, and honestly, I don’t even know if its possible (or at least, not sure how I would do it)

I’ve still been managing between 12,000 and 20,000 hits per month (about 10,000 unique visitors), and so I really appreciate the continued interest people have shown.  This is really a fun project, and I’m looking forward to expanding it and making it better and more useful.