sync’ing to the 0x4489 0x4489

Well, this wasn’t as easy as I first imagined.

There are some issues:

1> By the time you’ve detected the 4489 4489, it’s too late to do much about it. I’m writing one bit at a time, so once you’ve seen the 4489 it has (probably) already been stored at a shifted position. There’s no way to compensate for the shift after the fact: adding 0’s (for instance) to fill out the current byte screws up the decoding on the PC, because the PC uses the 4489 as its guide for when to start decoding. You end up with something like 4489 (but shifted), then some binary 0’s (to properly shift the data), then the start of the sector, 0x55 (for the FF identifier). Those few extra 0 bits of shift data screw up the decoding.
2> Then one might say, don’t start recording until you are properly shifted. Sure you lose the first track, but just record extra data. This sounds good on paper, but once you hit the gap, the bit-shift is different. Now you have to detect the gap, which requires decoding the sector header and looking at the sector offset, and then counting from there.

3> I think post-processing within the SX sounds good, but implementing a block-shift (which rotates around the edges) in memory can’t be easy. While I think my SX and the memory are fast enough to make this feasible, it has the same problem of detecting multiple shifts within the data and correcting for them.
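
To make the shift problem concrete, here is a minimal PC-side sketch in Python (not SX code) that scans a raw capture for 4489 4489 at every bit position and reports how far the sync sits from a byte boundary. The function name `find_sync_bit_offset` and the sample data are made up for illustration.

```python
SYNC = 0x44894489  # the 32-bit "4489 4489" pattern discussed above


def find_sync_bit_offset(raw: bytes) -> int:
    """Return the bit index (0 = MSB of raw[0]) where SYNC starts, or -1."""
    total_bits = len(raw) * 8
    value = int.from_bytes(raw, "big")
    for start in range(total_bits - 32 + 1):
        window = (value >> (total_bits - 32 - start)) & 0xFFFFFFFF
        if window == SYNC:
            return start
    return -1


# A capture that started 3 bits "early": everything lands 3 bits late.
payload = bytes.fromhex("AAAAAAAA4489448955")
shifted = (int.from_bytes(payload, "big") << 5).to_bytes(len(payload) + 1, "big")

offset = find_sync_bit_offset(shifted)
print(offset, offset % 8)  # 35 3 -> sync starts 3 bits past a byte boundary
```

In the shifted capture no byte equals 0x44 or 0x89 any more, which is why a byte-wise search on the PC cannot find the sync.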

I think it’s worth noting that Marco Veneri, when writing AFR, decided (I believe) that doing a one-time shift fix was impossible, and instead wrote a number of routines to find the shifts and then routines to extract bit-shifted data from the byte array. I can’t believe that he wouldn’t have preferred a one-time fix routine, after which he could operate on just the data.

With that said, I think it would be advantageous to do this on the SX, because it eliminates a bunch of routines on the PC. If I can do it with few headaches on the SX, then I think that’s the right way to do it.

I just haven’t stumbled on the right way yet.




  • I only see 4 possibilities for doing the shifting:
    * shift on-the-fly as bits are streaming in from the disk, so everything is aligned by the time you write it to SRAM. or,
    * record unaligned data to SRAM, then shift data in-place SRAM-to-SRAM, so data in SRAM is then aligned. or,
    * shift on-the-fly as you read unaligned data from SRAM, and send aligned data to the PC. or,
    * With unaligned data on the PC, shift the data in the RAM.

    Forgive me for rambling comments on each one.

    In assembly language, doing a block-shift SRAM-to-SRAM is one of the easiest things you can do.
    Leave a free byte or two (to shift into) just before the beginning of recorded data.

    I suppose the simplest thing to do would be to shift the entire track 1 bit at a time SRAM-to-SRAM.
    Your debug code can send the “before” and “after” images to the PC to make sure the shift is working right.

    So for the SRAM-to-SRAM shifting, the SX would repeatedly:
    For each sector (up to #sectors on a track):
        For each possible bit shift (up to #bits in gap + #bits in sector):
            Shift the entire track 1 bit down.
            If there is no 4489 4489 in first 4 bytes, repeat with next bit.
        Send that sector out.
        Shift everything down 1 sector’s worth of bytes.
        Repeat for next sector.
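
The inner loop above can be modelled on the PC in a few lines of Python. This is only a sketch of the idea, not SX code: `shift_left_one_bit` carries bits from byte to byte the same way a chain of rotate-through-carry instructions would, and `align_to_sync` (a made-up name) repeats it until the buffer starts with 44 89 44 89.

```python
SYNC_BYTES = b"\x44\x89\x44\x89"


def shift_left_one_bit(buf: bytearray) -> None:
    """Shift the whole buffer 1 bit toward index 0; a 0 enters at the end."""
    carry = 0
    for i in range(len(buf) - 1, -1, -1):
        new_carry = buf[i] >> 7                 # bit that falls out of this byte
        buf[i] = ((buf[i] << 1) & 0xFF) | carry
        carry = new_carry


def align_to_sync(buf: bytearray, max_shifts: int) -> int:
    """Shift until buf starts with 44 89 44 89; return shifts used, or -1."""
    shifts = 0
    while bytes(buf[:4]) != SYNC_BYTES:
        if shifts == max_shifts:
            return -1
        shift_left_one_bit(buf)
        shifts += 1
    return shifts


# Sample data: a sync header and two data bytes, stored 3 bits late.
raw = int.from_bytes(bytes.fromhex("4489448955AA"), "big") << 5
track = bytearray(raw.to_bytes(7, "big"))
print(align_to_sync(track, 16), track.hex())  # 3 4489448955aa00
```

Shifting the whole track once per bit is slow but simple, which matches the "simplest thing" suggested above.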

    “By the time you’ve detected the 4489 4489, it’s too late”
    If you give yourself a 32 bit buffer, it’s not too late.
    You can (in either of the on-the-fly possibilities listed above):
    * send the oldest bit in the buffer
    * read the newest bit, and shift it into the buffer
    * if the buffer exactly equals 4489 4489, send a bunch of dummy bits to get byte alignment (unless it’s already aligned).
    * repeat.
    (Would this work if you only had a 16 bit buffer, or does 4489 — literal or shifted — ever occur in normal data ?)
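
Here is a rough Python model of that 32-bit buffer loop, just to check the bookkeeping on a PC. It is a sketch under assumptions: the names (`realign_stream`, `sent`, `filled`) are invented, dummy bits are zeros, and nothing is sent until the buffer has filled once.

```python
SYNC = 0x44894489


def bytes_to_bits(data):
    return [(b >> i) & 1 for b in data for i in range(7, -1, -1)]


def bits_to_bytes(bits):
    bits = list(bits)
    bits += [0] * (-len(bits) % 8)          # pad the tail for display only
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))


def realign_stream(bits):
    """Yield bits delayed by 32, zero-padded so each sync starts a byte."""
    window = 0   # the last 32 bits read from the disk
    filled = 0   # how many bits of `window` are valid so far
    sent = 0     # bits sent downstream so far
    for bit in bits:
        if filled == 32:
            yield (window >> 31) & 1        # send the oldest bit
            sent += 1
        else:
            filled += 1
        window = ((window << 1) | bit) & 0xFFFFFFFF
        if filled == 32 and window == SYNC:
            while sent % 8:                 # dummy bits up to a byte boundary
                yield 0
                sent += 1
    for i in range(filled - 1, -1, -1):     # flush what is still buffered
        yield (window >> i) & 1


stream = [1, 0, 1] + bytes_to_bits(bytes.fromhex("4489448955"))
print(bits_to_bytes(realign_stream(stream)).hex())  # a04489448955
```

Because the sync is still sitting in the buffer when it is recognised, the dummy bits land before it rather than after it, so the PC still sees 4489 4489 immediately followed by the sector data.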

    On the other hand, all this “shift stuff around” will take about the same amount of code whether you do it on the SX side or on the PC side. If you need a custom application on the PC side anyway, you might as well stick the code on the PC side (where it will be far easier to debug).

  • David,

    Nice to see you again. Thanks for posting. Your post is very thought-provoking. I’ve tried to type this message 3 times today.

    First, to answer your question, 4489 never shows up in normal data. The way it is encoded (one combination of bits with a missing clock bit) ensures that. It’s legal MFM but illegal data + clock combination as far as the Amiga is concerned.
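
A small Python sketch of the missing-clock-bit idea, for anyone following along. It assumes the usual MFM rule (a clock bit is 1 only between two 0 data bits) and only checks byte-aligned words; the function names are made up.

```python
def mfm_encode_byte(data: int, prev_bit: int = 0) -> int:
    """Encode one data byte as 16 MFM cells (clock, data, clock, data, ...)."""
    out = 0
    for i in range(7, -1, -1):
        bit = (data >> i) & 1
        clock = 1 if (prev_bit == 0 and bit == 0) else 0
        out = (out << 2) | (clock << 1) | bit
        prev_bit = bit
    return out


def is_regular_mfm(word: int, prev_bit: int = 0) -> bool:
    """True if a 16-bit word is exactly what mfm_encode_byte would produce."""
    data = 0
    for i in range(7, -1, -1):
        data = (data << 1) | ((word >> (2 * i)) & 1)   # pick out the data cells
    return mfm_encode_byte(data, prev_bit) == word


print(hex(mfm_encode_byte(0xA1)))   # 0x44a9: the regular encoding of 0xA1
print(is_regular_mfm(0x4489))       # False: one clock bit is missing
```

0x44A9 and 0x4489 differ in exactly one cell, the clock bit between two 0 data bits. The sketch does not prove anything about shifted alignments; it just shows why an encoder following the rule never emits 0x4489 for a data byte.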

    Your on-the-fly method makes the most sense to me. I think I’d do it when it comes in from the drive. So the first sector written would be a complete sector.

    I almost forgot that I attempted this method. It never worked, and I had bigger fish to fry.

    In order to determine how many bits to stuff, I guess you would look at how many bits you have already stored, and if that isn’t an even multiple of 8, then you’d need to store more bits (i.e. a 0) until it became an even multiple. I’m not sure I like this, because then I need a loop inside the data storing routine like so:

    for a=1 to numberofbitstostuff
    store a 0
    next a

    The reason why I don’t like this is TIME. Is that why you also suggested aligning as you sent to the PC?

    My current ISR is about 500ns. I think it takes something like 120ns to get inside the ISR and then my actual routine is 380ns. I really don’t want to expand that size too much.

    BUT if I do it during the PC xfer: reading from the fram is byte-oriented(with my current read routines, which I could of course do manually)…. hrrmmm. then how do you perform 8 shifts, loop thru, I guess — and check for the sequence inside. and then UART code is (of course) byte-based, so you’d need some temporary variable to hold half-real bits and half-placeholder bits…. and then the FRAM will be mid-byte……………………

    YUCK. This is just a mental mess for me. Sorry for the stream-of-consciousness-like posting. It gets messier for me as I go through it.

    Your SRAM-SRAM method also confuses me. (it doesn’t take much these days) I don’t see any clear method of shifting bits in SRAM. Read access from the FRAM is all byte-oriented, and there are no convenient functions to access a particular bit. I suppose I could develop a few, but even then it sounds tedious.

    Moving bits sounds hard to me. Every time you switch from reading mode to writing mode (and vice versa, requiring multiple opcodes and addresses), you have to relocate to the byte AND bit position that you were at. And so you need routines like I mentioned.

    I suppose it would work if I had routines like readbyte(bytelocation, bitoffset) and writebyte(bytelocation, bitoffset). But ugh.
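
For what it’s worth, the two routines are not much code once written down. Below is a Python model of what they would have to do, treating the FRAM as a plain byte array; the names `read_byte` and `write_byte` and the exact argument order are assumptions, and a real SX version would also have to issue the FRAM opcodes and addresses.

```python
def read_byte(mem: bytes, byte_location: int, bit_offset: int) -> int:
    """Return the 8 bits that start bit_offset (0..7) bits into mem[byte_location]."""
    hi = mem[byte_location]
    lo = mem[byte_location + 1] if bit_offset else 0
    return (((hi << 8) | lo) >> (8 - bit_offset)) & 0xFF


def write_byte(mem: bytearray, byte_location: int, bit_offset: int, value: int) -> None:
    """Store value so its MSB lands bit_offset (0..7) bits into mem[byte_location]."""
    if bit_offset == 0:
        mem[byte_location] = value
        return
    keep_hi = (0xFF << (8 - bit_offset)) & 0xFF   # bits ahead of the value
    keep_lo = 0xFF >> bit_offset                  # bits after the value
    mem[byte_location] = (mem[byte_location] & keep_hi) | (value >> bit_offset)
    mem[byte_location + 1] = ((mem[byte_location + 1] & keep_lo)
                              | ((value << (8 - bit_offset)) & 0xFF))


fram = bytes.fromhex("089120")            # 0x4489 stored 3 bits late
print(hex(read_byte(fram, 0, 3)), hex(read_byte(fram, 1, 3)))  # 0x44 0x89
```

Each unaligned access touches two physical bytes, which is the read-modify-write overhead being complained about above.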

    AND YOU THOUGHT *YOU* WERE RAMBLING…. I got you beat here. 🙂

    P.S. I edited the wordpress PHP code to allow a larger comment posting field. Those little boxes kill me

  • I feel like you are encouraging me to ramble on some more :-).

    == General comments ==

    Oh, yeah. I forgot that reading the next byte of FRAM was more complex than
    mov indf,W

    When I say “shift a buffer by 1 bit”, I mean something like

    bit_buffer RES d'9' ; 9 byte bit-buffer.

    bcf STATUS,C ; (unnecessary)
    RLF bit_buffer+8,F
    RLF bit_buffer+7,F
    RLF bit_buffer+6,F
    RLF bit_buffer+5,F
    RLF bit_buffer+4,F
    RLF bit_buffer+3,F
    RLF bit_buffer+2,F
    RLF bit_buffer+1,F
    RLF bit_buffer+0,F

    == FRAM-to-USB on-the-fly ==

    For on-the-fly bit-re-alignment between byte-oriented FRAM and a byte-oriented USB interface, you would:

    * somehow come up with the initial FRAM_index, the index of the byte that contains the first bit of the 4489 sync header
    * somehow come up with n (0

  • oopsies, any attempt to post text including the “less than sign” gets truncated just before the “less than sign”. I suppose we are supposed to use ampersand l t ; (<) … which seems to work.

  • == FRAM-to-USB on-the-fly ==

    For on-the-fly bit-re-alignment between byte-oriented FRAM and a byte-oriented USB interface, you would:

    * somehow come up with the initial FRAM_index, the index of the byte that contains the first bit of the 4489 sync header
    * somehow come up with n (0 <= n < 8), the bit offset that the sync header is shifted, then

    repeat
    call fill_buffer ; read 9 bytes starting at FRAM[FRAM_index] into bit_buffer[0] to bit_buffer[8]
    call align_buffer ; which calls shift_buffer_1_bit “n” times
    ; the first time through this loop,
    ; bit_buffer[0] and bit_buffer[1] now contain 4489
    call empty_buffer ; dump 8 bytes bit_buffer[0] to bit_buffer[7] to USB
    FRAM_index = FRAM_index + 8;
    until done with sector.

    (This code reads some bytes from FRAM twice — the byte at the end of the buffer on one cycle is re-read from FRAM into the beginning of the buffer on the next cycle. Perhaps it would be convenient to buffer just that byte (unshifted) elsewhere, so FRAM could be read straight through.)
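
The 9-bytes-in, 8-bytes-out loop can be checked on a PC with a short Python model. This is a sketch only: the FRAM is a plain `bytes` object, the 72-bit window is a Python integer rather than nine file registers, and `realign_sector` is a made-up name.

```python
def realign_sector(fram: bytes, fram_index: int, n: int, out_len: int) -> bytes:
    """Return out_len aligned bytes; the sync begins n bits into fram[fram_index]."""
    out = bytearray()
    while len(out) < out_len:
        # fill_buffer: 9 bytes, zero-padded if we run off the end of the capture
        chunk = fram[fram_index:fram_index + 9].ljust(9, b"\x00")
        window = int.from_bytes(chunk, "big")
        # align_buffer: shift left n bits inside the 72-bit window
        window = (window << n) & ((1 << 72) - 1)
        # empty_buffer: the top 8 bytes are now byte-aligned
        out += (window >> 8).to_bytes(8, "big")
        fram_index += 8
    return bytes(out[:out_len])


# Sample capture: one junk byte, then 16 bytes of "sector" stored 3 bits late.
data = bytes.fromhex("4489448955") + bytes(range(11))
raw = b"\xff" + (int.from_bytes(data, "big") << 5).to_bytes(17, "big")
print(realign_sector(raw, 1, 3, 16) == data)  # True
```

The ninth byte supplies the n bits that shift into the bottom of the eighth output byte, which is why each pass re-reads the last byte of the previous pass.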

    Does the SX have enough memory to hold an entire sector in RAM and shift it like this?

    == FRAM-to-FRAM in-place ==

    Pretty much the same approach as FRAM-to-USB on-the-fly,
    except instead of writing out to USB, write back to FRAM.
    * read FRAM[0] to FRAM[8] (9 bytes)
    * do your funky magic
    * write to FRAM[0] to FRAM[7] (8 bytes)
    * read FRAM[8] to FRAM[16] (9 bytes)
    * do your funky magic
    * write to FRAM[8] to FRAM[15] (8 bytes)
    until done with sector.


  • == disk-to-FRAM on-the-fly ==

    To re-synchronize to byte boundaries, we could either
    * insert fake zero bits just before that sync header, or
    * delete up to 7 bits just before that sync header
    The stuff just before the sync header is merely garbage in the between-sector gap, right?

    (I probably should have mentioned this back in 2005.)

    “The reason why I don’t like this is TIME.”
    I agree: we don’t want interrupts to be bloated pigs that take so much time that they block the next interrupt.
    But I suspect synchronizing will add less than 10 instruction-times to your worst-case floppy-to-FRAM interrupt time.
    “it’s wafer-thin!”

    Didn’t you post your interrupt code once ?
    … searches in vain for the interrupt code …

    I assume you have something like

    ---- assumed code ----
    bits RES 1
    bitcount RES 1

    store_one_bit: ; (label name assumed)
    bsf STATUS,C ; (necessary)
    jmp finish_storing_bit

    store_zero_bit: ; (label name assumed)
    bcf STATUS,C ; (necessary)
    jmp finish_storing_bit

    finish_storing_bit:
    RLF bits,f
    ; check bits remaining.
    ; if bits+0 is full, dump it to FRAM.
    ; Otherwise return and wait for another bit.
    decfsz bitcount,f
    return ; byte not full yet
    movlw 8
    movwf bitcount
    movf bits,W
    call write_w_to_FRAM
    return
    ---- end assumed code ----

    It’s just a little more code to also detect and synchronize to the 4489 4489 sync header, something like this:

    ---- code to detect and synchronize ----
    ; WARNING: untested code.
    ; Written by David Cary 2007-02-06 and put into the public domain.
    bits RES 3
    bitcount RES 1

    store_one_bit: ; (label name assumed)
    bsf STATUS,C ; (necessary)
    jmp finish_storing_bit

    store_zero_bit: ; (label name assumed)
    ; Just before adding zero bit, check sync.
    ; compare (bits+1):(bits+2) to 0x4489
    ; compare_unsigned_16:
    movf bits+1,w
    addlw (0xFF - 0x44 + 1)
    ; Are they equal ?
    btfss STATUS,Z ; skip the goto when they are equal
    goto continue_storing_bit
    ; yes, hi bytes are equal: compare lo
    movf bits+2,w
    addlw (0xFF - 0x89 + 1)
    btfss STATUS,Z ; skip the goto when they are equal
    goto continue_storing_bit
    ; yes, both bytes are equal: now what?
    ; throw away bits to re-sync.
    ; 8 cases:
    ; bitcount = 8: we’re already synchronized
    ; (so leave bitcount alone).
    ; bitcount = 1..7: not synchronized
    ; (so reset bitcount to 8 to throw away bits).
    movlw 8
    movwf bitcount
    ; also consider setting a “sync found” flag…
    continue_storing_bit:
    ; resume storing zero bit
    bcf STATUS,C ; (necessary)
    jmp finish_storing_bit

    finish_storing_bit:
    RLF bits+2,f
    RLF bits+1,f
    RLF bits+0,f
    ; check bits remaining.
    ; if bits+0 is full, dump it to FRAM.
    ; Otherwise return and wait for another bit.
    decfsz bitcount,f
    return ; byte not full yet
    movlw 8
    movwf bitcount
    movf bits+0,W
    call write_w_to_FRAM
    return
    ---- end code to detect and synchronize ----

  • I’ve got a pretty good gist of what you are doing. I did in fact post my ISR code just recently.

    This is still current.

    Also, there is NO sector gap between sectors. Look at my post here

    especially this line:

    “sectors 5-10: 1034, 2122, 3210, 4298, 5386, 6474”

    Notice that each sector is exactly 1088 bytes apart. That is the exact size for a sector.

    There is however, an 830-byte TRACK gap, which exists in one place on the disk for each track. In the previous post, notice the layout of the disk:

    [random starting point][sector 5][sector 6][sector 7][sector 8][sector 9][sector 10][830-byte gap][sector 0][sector 1][sector 2][sector 3][sector 4].
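
The numbers in that layout can be checked with a few lines of Python. The offsets, the 1088-byte sector size and the 830-byte gap come from the posts above; the nominal positions computed for the gap and for sector 0 are only estimates, since the real gap is not guaranteed to be a whole number of bytes.

```python
SECTOR_BYTES = 1088     # raw MFM bytes per sector (from the post)
TRACK_GAP_BYTES = 830   # raw bytes in the track gap (from the post)
offsets = [1034, 2122, 3210, 4298, 5386, 6474]   # sectors 5-10 (from the post)

spacings = [b - a for a, b in zip(offsets, offsets[1:])]
gap_start = offsets[-1] + SECTOR_BYTES           # nominal end of sector 10
sector0_start = gap_start + TRACK_GAP_BYTES      # nominal start of sector 0
track_bytes = 11 * SECTOR_BYTES + TRACK_GAP_BYTES

print(spacings)                        # [1088, 1088, 1088, 1088, 1088]
print(gap_start, sector0_start)        # 7562 8392
print(track_bytes)                     # 12798
```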

    Don’t assign any importance to the fact that the gap follows sector 10. That is merely a coincidence, and isn’t necessarily going to be like that all the time.

    The full sync sequence is 0xAAAA 0xAAAA 0x4489 0x4489. So if we needed to “erase” bits, or throw them out, or whatever, we could potentially replace them. Also, we could just plain get rid of the SYNC, and use another character, or set of characters (smaller, maybe) that would key the software. Note there are 224 unused raw MFM characters, so we are fine there.

    Also one other point, I do *not* keep shifting bits until I get a byte, and then I write it. I write EACH BIT as it comes in directly to the FRAM.

    I need to print out your code and examine it more closely, but I get the general idea…………

  • One more thing:

    If everything works out perfectly, there seem to be usually only two (though I’ve seen more) different bitshifts per track.

    This is because

    1> you start off in a random place, and so you aren’t properly bitsync’d until you find the first SYNC. Once you find the first SYNC, then you should be properly shifted UNTIL

    2> you hit the track gap. Once that happens, you have to look for another SYNC to get properly aligned after the gap.

    Marco’s afr.c code finds a SYNC, and then assumes there is another sector right at the end of the last one. If there is, he continues on. If not, he searches for another SYNC at a different bit-shift.
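
To be clear, the following is not Marco’s code. It is a rough Python sketch of the strategy as described in the previous paragraph: after a SYNC, jump one sector ahead and expect another SYNC there, and only hunt bit by bit when that expectation fails. The function names and the `sector_bytes` parameter are invented for the example.

```python
SYNC = 0x44894489


def sync_at(value: int, total_bits: int, pos: int) -> bool:
    """True if the 32 bits starting at bit index pos equal SYNC."""
    if pos < 0 or pos + 32 > total_bits:
        return False
    return (value >> (total_bits - 32 - pos)) & 0xFFFFFFFF == SYNC


def find_sector_starts(raw: bytes, sector_bytes: int = 1088) -> list:
    """Return the bit index of every SYNC found, walking sector by sector."""
    total_bits = len(raw) * 8
    value = int.from_bytes(raw, "big")
    starts = []
    pos = 0
    while pos + 32 <= total_bits:
        if sync_at(value, total_bits, pos):
            starts.append(pos)
            pos += sector_bytes * 8   # expect the next sector right behind this one
        else:
            pos += 1                  # otherwise hunt for a SYNC at a new bit-shift
    return starts


# Tiny demo with 16-byte "sectors": junk, two sectors, a 13-bit gap, one sector.
sector = format(0x44894489, "032b") + "10" * 48
bits = "101" + sector + sector + "0" * 13 + sector
raw = int(bits, 2).to_bytes(len(bits) // 8, "big")
print([(s, s % 8) for s in find_sector_starts(raw, sector_bytes=16)])
# [(3, 3), (131, 3), (272, 0)]
```

In the demo the first two sectors share one bit-shift and the sector after the gap has a different one, which is the "two bitshifts per track" situation described above.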

    I see no reason, however, not to fully SYNC on every SYNC pattern I detect. The entire reason, supposedly, for the Amiga sector header to contain a sector offset, which is “sectors until a track gap”, is so that the software decoding the bitstream can know when to start looking for a new SYNC pattern. I think it just so happens, however, that the Amiga constantly resyncs on each pattern received.

  • After further deliberation on the subject, I see no timely way of doing this.

    If I do things from disk -> FRAM: this really requires me to write more than one bit at a time to the memory. It could require up to 7 bit writes, and I’m just not willing to bloat my ISR to that size. My ISR is already around 500ns (including activation), and I think my writes take a ballpark of 140ns each. 140 * 7 = 980ns, and 980 + 500 ~= 1.5us. I just plain don’t like it.
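
Spelled out as a quick Python calculation, with one added assumption that is not from the post: a 2 us MFM bit cell, which is the usual figure for a double-density disk.

```python
ISR_NS = 500           # current ISR time including entry overhead (from the post)
FRAM_WRITE_NS = 140    # ballpark cost of one extra FRAM bit write (from the post)
MAX_STUFFED_BITS = 7   # worst case to reach the next byte boundary
BIT_CELL_NS = 2000     # ASSUMPTION: 2 us per MFM bit cell

worst_case_ns = ISR_NS + MAX_STUFFED_BITS * FRAM_WRITE_NS
share_of_cell = worst_case_ns / BIT_CELL_NS

print(worst_case_ns)              # 1480
print(f"{share_of_cell:.0%}")     # 74%
```

If the 2 us assumption holds, the worst case would leave roughly half a microsecond of slack per bit cell for everything outside the ISR.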

    Maybe FRAM —> FRAM, I’ll investigate this more.

  • Yes, I think you might as well re-sync every time you see the sync pattern.

    So, the “track gap” is usually *not* an exact multiple of 8 bits long. That’s what I expected. So you certainly must re-sync after the track gap.

    No sector gap between sectors? Somehow I forgot that.
    (What if you waited until you somehow detected the “track gap”, and only then started writing data? Then you would only have 1 bitshift per track to deal with. And it wouldn’t matter if the “track gap” was so huge that it wouldn’t all fit in your buffer. But you would have to wait longer, waiting for the disk to rotate all the way to the end of the track, before you step to the next track).

    I agree with keeping the timing-critical disk-to-FRAM as simple as possible, so I wouldn’t bother with trying to do extra stuff at disk-to-FRAM time.

    (: Even though it is moot, I’m going to point out that it does not “requires me to write more than one bit out at a time to the [FRAM].”
    One alternative would be an interrupt routine that
    * immediately saves the current bit to a 24 bit buffer in internal RAM.
    * pulls the oldest bit from the buffer (the one we read 24 interrupts ago).
    * if it would help synchronize things, throws that bit away and returns. These “skipped bits” occur just before the sync header is written to FRAM, but just after the sync header is read from the disk.
    * otherwise (normally) writes that bit to external FRAM and returns.
    I’ve been told that *not* writing a bit is pretty fast 🙂
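
Here is a Python model of that interrupt routine, mainly to check the skip arithmetic. It is a sketch under assumptions: the delay line is a `deque` rather than three shifted bytes, the sync test looks at the newest 16 bits only, and the names (`capture_with_skips`, `pending`, `skip`) are invented.

```python
from collections import deque

SYNC16 = 0x4489


def capture_with_skips(bits, delay=24):
    """Return the bits written to FRAM; bits are dropped so syncs start a byte."""
    line = deque()   # bits read from the disk but not yet written
    recent = 0       # the newest 16 bits, used for sync detection
    skip = 0         # how many upcoming bits to throw away
    written = []

    def emit(bit):
        nonlocal skip
        if skip:
            skip -= 1            # "not writing a bit is pretty fast"
        else:
            written.append(bit)

    for bit in bits:
        line.append(bit)
        recent = ((recent << 1) | bit) & 0xFFFF
        if len(line) >= 16 and recent == SYNC16 and skip == 0:
            pending = len(line) - 16             # bits queued ahead of the sync
            skip = (len(written) + pending) % 8  # drop these to hit a byte edge
        if len(line) > delay:
            emit(line.popleft())
    for bit in line:                             # flush at end of capture
        emit(bit)
    return written


def bytes_to_bits(data):
    return [(b >> i) & 1 for b in data for i in range(7, -1, -1)]


def bits_to_bytes(bits):
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits) - 7, 8))


stream = [1, 0, 1] + bytes_to_bits(bytes.fromhex("4489448955"))
print(bits_to_bytes(capture_with_skips(stream)).hex())  # 4489448955
```

The dropped bits are always the ones queued just ahead of the sync, so after a track gap they come out of whatever precedes the 4489.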

  • While the gap written by the Amiga is a fixed length, in reality, the actual space between end of last sector and the first sync is slightly more than that. This is where and why we have to mess with all this bitshifting junk.

    Wouldn’t this method corrupt the full sync of 0xAAAA 0xAAAA 0x4489 0x4489? Those “skipped bits” are actually part of the second byte of the second 0xAAAA.

    But I guess the response might be, “so what?” As long as we can give the PC something to use as “start of sector indicator” and the data is properly aligned, it doesn’t matter what that pattern is.

    0xAA shows up in normal data, so we can’t sync on those, and I don’t want a huge 72-bit buffer (8 bytes + 1 spare to throw out).

    I’m going to try to put together some code today.

  • Here’s my first stab at it. I haven’t tested or even executed this code yet, but it assembles. I will test later today.

    'note bitcounter=8 and skipwrite = 0 before ISR is called

    'also note bitcounter has no significance from the time we start aligning until we are finished

    SETB rb.1 'debug pin(inside interrupt)



    MODE $09
    MOV !RB,#%00000000

    mov myres, w
    'myres.0 has the actual bit value

    MOVB C, MYRES.0 'Put myres.0 into C to be shifted in
    'shift the bit into the 24-bit buffer
    RL BYTE1
    RL BYTE2
    RL BYTE3
    MOVB MYRES.0, C 'this is the output bit if we need to write it

    CJNE BYTE2, #$44, notasync
    CJNE BYTE1, #$89, notasync
    CJE bitcounter, #1, notasync 'this is a sync but it's already aligned right

    'we've seen an unaligned sync
    'the number of bits we need to skip = the number of bits we are currently out of alignment
    'so let's subtract the number of bits we've written for the current byte from 8
    'and then skip that number of bits

    MOV skipwrite, #8
    SUB skipwrite, bitcounter

    notasync:
    CJE SKIPWRITE, #0, skipdaskip 'if we are not in the process of realigning

    dec skipwrite

    CJA SKIPWRITE, #0, goback
    'skipwrite just went from 1->0, last bit to skip, let's reset bitcounter to 8
    'we are now properly aligned.
    MOV bitcounter, #8

    jmp goback

    skipdaskip:
    dec bitcounter
    CJA BITCOUNTER,#0, skipbcreset 'snz won't work because mov is multi-byte instruction

    mov bitcounter, #8 'reset bitcounter

    skipbcreset:
    CLRB SCK 'make sure clock is low to start with

    MOVB SI, myres.0 'send bit to fram
    NOP 'satisfy 5ns data setup time

    SETB SCK 'raise clock notifying chip to read the data

    'COUNT THE Bit stored
    IJNZ lobyte, goback
    IJNZ hibyte, goback
    inc superhibyte

    goback:
    CLRB rb.1
    MOV W, #-100 '92/93 seems most regular, 100 drifts

  • Note that my routine, even if it did function, puts my ISR around 1us. I thought this was acceptable, but in basic tests I think I’m dropping data.

    I could potentially move my ISR trigger point back, so it triggers closer to the beginning of the bitcell rather than the middle of the bitcell.

    This would gain me more time per bitcell, but I’m just not happy about it.

    Perhaps I’ll try the fram-fram method.