floppy.c’s mfm decoder

A couple of days ago, when I got ahold of floppy.c, I wrote a quickie code stub that uses floppySectorMfmDecode from Fellow’s floppy.c. (Note that you must use the mirrors to actually download anything.)

This MFM decoder from Fellow’s floppy.c is pretty clear.  AFR is located here. See for the local copy.

Anyway, I mainly used the data portion of the decoder, which was putting out junk for my input from the SX/Amiga drive, so I sort of put it on the back burner.

Then I realized that since it also decodes the header, this might be a GREAT check for SOME data validity. Some of the header fields are MFM-encoded separately from each other, and from the actual data, which means you need less good data to get a valid read. What sucks is that the actual data is spread out across 1024 bytes of MFM encoding: if the second half is screwed up, you lose the even bits of your data, and if the first half is screwed up, you lose the odd bits. This is because the clock bits take up the other ‘half’ of the encoded data.
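To make the header idea concrete, here is a minimal sketch of pulling the four header fields out of the 8 MFM bytes that follow the sync words. It assumes the commonly documented Amiga layout (one longword of odd bits followed by one longword of even bits) and is not the actual floppySectorMfmDecode code; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical struct; field names follow the printout in this post. */
typedef struct {
    uint8_t format;          /* 0xFF for a standard Amiga sector */
    uint8_t track;
    uint8_t sector;
    uint8_t sectors_to_end;  /* sectors left before the track gap */
} sector_info;

/* Read a big-endian 32-bit word from a raw MFM byte buffer. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode 8 MFM bytes: 4 bytes carrying the odd data bits, then
 * 4 bytes carrying the even data bits (assumed layout). */
static sector_info decode_sector_info(const uint8_t *mfm)
{
    uint32_t odd  = read_be32(mfm)     & 0x55555555u; /* drop clock bits */
    uint32_t even = read_be32(mfm + 4) & 0x55555555u;
    uint32_t info = (odd << 1) | even;                /* interleave */

    sector_info s;
    s.format         = (uint8_t)(info >> 24);
    s.track          = (uint8_t)(info >> 16);
    s.sector         = (uint8_t)(info >> 8);
    s.sectors_to_end = (uint8_t)info;
    return s;
}
```

Only these 8 bytes have to be read cleanly to get a usable header, which is why the header survives when the 1024-byte data block does not.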

Here’s the results of that code stub:

(2372) Format: 255 Sector: 9 Track: 0 Sectors to end: 9
(5620) Format: 255 Sector: 1 Track: 0 Sectors to end: 6
(6700) Format: 170 Sector: 245 Track: 85 Sectors to end: 232 (not valid format)
(8811) Format: 170 Sector: 65 Track: 85 Sectors to end: 240 (not valid format)
(20645) Format: 84 Sector: 170 Track: 174 Sectors to end: 193 (not valid format)
(28232) Format: 255 Sector: 3 Track: 2 Sectors to end: 10
(35714) Format: 255 Sector: 10 Track: 2 Sectors to end: 82
(70992) Format: 170 Sector: 93 Track: 64 Sectors to end: 245 (not valid format)
(89788) Format: 255 Sector: 7 Track: 6 Sectors to end: 4
(98245) Format: 255 Sector: 1 Track: 7 Sectors to end: 3

The valid sectors have a format ID of 255 (0xFF); the other ones are simply corrupted data.

You can even tell they are invalid because the sector numbers, track numbers etc are out of range.
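That range check can be sketched as a small helper. The limits are my assumptions for a double-density Amiga disk (11 sectors per track, 160 tracks), and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits for a double-density Amiga disk: 11 sectors per
 * track and 160 tracks (80 cylinders x 2 heads). */
enum { SECTORS_PER_TRACK = 11, TRACK_COUNT = 160 };

/* Return true when every header field is inside its expected range. */
static bool header_plausible(uint8_t format, uint8_t track,
                             uint8_t sector, uint8_t sectors_to_end)
{
    return format == 0xFF
        && track < TRACK_COUNT
        && sector < SECTORS_PER_TRACK
        && sectors_to_end >= 1
        && sectors_to_end <= SECTORS_PER_TRACK;
}
```

Under these assumed limits, the entry at offset 35714 would be rejected too: its format byte is 255, but 82 sectors to end is out of range.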

So what’s this mean? I gotta get to fine-tuning my SX code that captures this data. I knew I wasn’t SUPER close because out of 100 KB of captured data, I’m getting a lousy 10 sectors, where I should be getting many more.

For this, I really have to learn exactly how the RTCC functions in fine detail, because doing this correctly requires very good timing.




  • You can also use the checksum fields to additionally verify the read data. While the checksum fields themselves can be corrupted, a stored checksum that matches the checksum of the data you read is good evidence that the read is valid.

  • Yeah, I thought about that as well. According to some docs I was reading last night, the checksums are not MFM encoded themselves, similar to the SYNCs. This is useful, since at least you don’t have to decode that to check. Has this lined up with what you’ve seen?

    Honestly though, the checksums don’t do me much good yet, because I *know* the data is too corrupt because I’m only seeing a few valid sectors where I should be seeing many more. Until I’m getting 95% of the sector headers right, I don’t expect to have the data 100% correct where the checksum would help me….
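For reference, a minimal sketch of the checksum as I understand it from the commonly circulated format descriptions: XOR the raw MFM longwords together and keep only the data-bit positions. I have not verified this against floppy.c, so treat both the scheme and the function name as assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR together `count` raw MFM longwords and keep only the data-bit
 * positions (assumed scheme, not checked against floppy.c). */
static uint32_t mfm_checksum(const uint32_t *mfm_longs, size_t count)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum ^= mfm_longs[i];
    return sum & 0x55555555u;
}
```

Because the XOR runs over the raw MFM longwords, the check can be done before any odd/even decoding of the data.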

  • Shame on me, I haven’t had a chance to figure out how the checksums are composed yet; that requires a look into the sector encoding subroutine.

    (But I’ve written some fixup code that, given a ‘pseudo MFM’ sector produced by floppyMfmEncodeSector, will rectify it into a fully (I hope) compliant MFM flow. Now if I find the DiskMonitor by Quartex, the only one able to run on my KS1.3, I will finally have something to put into my MCU, and can attach the latter to the 23-pin external floppy socket.)

    On your data corruption issue: maybe things aren’t that bad. At least you don’t seem to skip or stretch bits, I think, or the SYNC pattern wouldn’t appear on a more or less regular basis. If it were only the most significant bit that gets lost, then the culprit could be the piece of the program that forms the whole byte, or deals with shifting, or something like that. Can you publish your SX code?

  • I’ll post my SX code here shortly. If you look at some older posts, I have an older copy already online. My damn ISP can’t get their sh*t together, so my internet access keeps going up & down.

    My new code is much better though — much more streamlined….

  • Hello with greetings from Munich/Germany,
    I found your article here and it is very interesting to me.
    I am involved in a similar project, which you can find on my
    homepage, chapter 1.5 for details.
    My “problem” is still MFM decoding and, in my special case, clock recovery.
    I am already able to decode and encode MFM data, but my current method only works because the strobe-clock signal is available. The disadvantage of this method: I have to manually rebuild all the timing-track/header info, including the CRC, during the MFM-encoding cycle.
    My question:
    How did you decode the MFM data if no clock is available?
    Any references/ideas would be very helpful.
    (I am now using an Altera Cyclone II + Nios.)

    Many thanks in advance and best regards,

    Reinhard Heuberger

    P.S. I can’t get in contact with you via e-mail.

  • Greetings Reinhard!

    I will email you shortly, I’m assuming the last line of the comment means that you don’t have my email address?

    From the amiga floppy perspective, the clock bit of the MFM is unnecessary and I don’t use it for anything in my current designs. MFM goes clock-bit data-bit clock-bit data-bit and so on.

    I look for the first falling edge, basically the first pulse, and then start a timer. I wait for the next edge, and take the value of that timer when the edge arrives. This timer value becomes my “delta t” or the time between edges — and these values are streamed to the PC via FIFO/usb in real-time.

    Once these values get to the PC, I reconstruct the RAW MFM data stream, and use the time-between-edges to define the RAW MFM bit groupings of 10, 100, and 1000. So I now have a stream of bits, but I don’t know yet which bits are clock, and which are data.
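A minimal sketch of that grouping step, not my actual PC-side code. It assumes the intervals have already been converted to nanoseconds and that a double-density MFM bit cell is 2000 ns, so edges arrive roughly 4, 6, or 8 µs apart; the function names and thresholds are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Map one edge-to-edge interval to a number of MFM bit cells,
 * assuming a 2000 ns cell. Returns 0 for an out-of-range interval. */
static int cells_from_delta(uint32_t delta_ns)
{
    if (delta_ns < 3000 || delta_ns >= 9000)
        return 0;
    if (delta_ns < 5000)
        return 2;   /* "10"   */
    if (delta_ns < 7000)
        return 3;   /* "100"  */
    return 4;       /* "1000" */
}

/* Expand intervals into raw MFM bits, one bit (0 or 1) per output
 * byte. Out-of-range intervals are skipped. Returns bits written. */
static size_t deltas_to_bits(const uint32_t *deltas, size_t n,
                             uint8_t *out, size_t out_cap)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        int cells = cells_from_delta(deltas[i]);
        if (cells == 0)
            continue;
        if (w + (size_t)cells > out_cap)
            break;
        out[w++] = 1;                 /* the edge itself */
        for (int k = 1; k < cells; k++)
            out[w++] = 0;             /* empty cells until the next edge */
    }
    return w;
}
```

Skipping bad intervals silently is the simplest choice for a sketch; a real capture tool would probably want to count or log them.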

    So enter the SYNC WORD. For the Amiga, it’s 0xAAAA AAAA 4489 4489. Once I see that bit pattern, the first bit after the last 0x4489 is a clock bit.
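The sync search can be sketched with a 32-bit shift register. As a simplification, this version matches only the two 0x4489 words and ignores the 0xAAAA AAAA preamble; the function name and the one-bit-per-byte input format are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define AMIGA_SYNC 0x44894489u  /* the two 0x4489 sync words */

/* Scan a raw bit stream (one bit per byte) for the sync pattern.
 * Returns the index of the first bit after the sync, which is a
 * clock bit, or (size_t)-1 if the pattern is not found. */
static size_t find_sync(const uint8_t *bits, size_t nbits)
{
    uint32_t window = 0;
    for (size_t i = 0; i < nbits; i++) {
        window = (window << 1) | (bits[i] & 1u);
        if (i >= 31 && window == AMIGA_SYNC)
            return i + 1;
    }
    return (size_t)-1;
}
```

Searching bit by bit rather than byte by byte matters here, because the captured stream has no byte alignment until the sync is found.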

    I now mask away every other bit by ANDing it with 0x55. This effectively removes the clock bits.

    The amiga has a goofy way of encoding odd and even bits, so there is some left-shifting going on and a logical OR going on, but suffice to say that the clock bits are never used.
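Roughly, the masking and odd/even recombination look like this. It is a sketch rather than my actual code, and it assumes the block is laid out as all the odd-bit bytes followed by all the even-bit bytes; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Rebuild `len` data bytes from an MFM block laid out as `len` bytes
 * of odd data bits followed by `len` bytes of even data bits. */
static void mfm_decode_block(const uint8_t *mfm, uint8_t *out, size_t len)
{
    const uint8_t *odd  = mfm;        /* first half: odd bits   */
    const uint8_t *even = mfm + len;  /* second half: even bits */

    for (size_t i = 0; i < len; i++) {
        /* AND with 0x55 drops the clock bits; the odd half is then
         * shifted left into the odd positions and ORed with the even. */
        out[i] = (uint8_t)(((odd[i] & 0x55) << 1) | (even[i] & 0x55));
    }
}
```

For a 512-byte sector, `len` would be 512 and `mfm` would point at the 1024 bytes of encoded data.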

    I’ve tried using various PLL techniques for clock and data recovery/separation but these really haven’t improved my decode-rate, so I’ve kept it nice and simple.

    I don’t know if this will help you but this is what works for my project.

    My email address is listed at the bottom of the home page at