Yellowstone Progress Update
I’m still working on development of an FPGA-based disk controller card for the Apple II – Yellowstone. Over the past couple of months, I spent a long while analyzing the design of the UDC disk controller. The UDC supports all three major types of Apple II disk drives, making it a promising place to begin learning. After that I spent a long while more exploring how I might squeeze the UDC’s 8K of ROM and 2K of RAM into the limited resources of Yellowstone’s FPGA. Just recently I finally finished up those investigations and returned to actively building and testing the Yellowstone card. Unfortunately it still doesn’t work.
I built a second Yellowstone prototype, identical to the original except for selecting a Lattice MachXO2-2000 FPGA instead of a MachXO2-1200. This new chip is just barely large enough to hold the necessary ROM and RAM for my UDC pseudo-work-alike Verilog code. I’m not sure if I’ll use this solution for the final edition of Yellowstone, or if I’ll use a smaller MachXO2 version paired with a separate ROM or RAM, but at least I’m up and running again.
The card seems to work as expected when I probe its memory space from the Apple II monitor. I can access all 8K of ROM via its custom bank-switching logic, and its 2K of RAM also through bank switching. I can probe its soft-IWM and watch the disk I/O lines change. Everything looks OK. But when I try to actually boot a 5.25 inch disk, it just freezes the computer.
It’s not completely dead; it does do *something*. The disk drive turns on and spins. Using a logic analyzer, I can see some brief activity on the disk I/O lines that I interpret as “hello, are you a 3.5 inch drive?” before it goes silent. If I then reset the Apple II and examine some memory locations where I know the UDC stores status info, I can see that it detected one disk drive. But why didn’t it boot? More importantly, why did it freeze?
If this were a normal software program, I could use a debugger to interrupt the program and see where it’s frozen. That alone might be enough to reveal what’s wrong. If not, I could restart the program from the beginning, and step through it line-by-line until I found the problem. But nothing like that is possible here. There’s no facility for Apple II breakpoints or single-stepping through code that’s in ROM, and even if there were, the I/O code is timing-dependent and would likely break when run in the debugger. The poor man’s debugger is printed log messages, flashing LEDs, and similar indicators, but even that will be difficult. I can’t easily add or edit code in the UDC ROM, because it contains lots of absolute address references as well as implicit assumptions about certain chunks of code and data avoiding page boundaries.
I wish I still had my old HP 1631D logic analyzer. Then I could hook up 24 probes to the Apple II’s address bus and data bus and then let the computer run, examining the logged CPU cycles afterwards using the HP’s state listing view. My Saleae logic analyzer is nice for many tasks, but even if it had 24 probes, it’s basically only designed for timing / waveform views. I guess not many people look at parallel busses anymore.
12 Comments so far
In case you’re interested, the HP logic analyzer is available:
https://www.ebay.com/itm/HP-1631D-LOGIC-ANALYZER/254642013657?epid=1104045214&hash=item3b49d8c5d9:g:C9AAAOSwKjRe~hTC
Cool! I was so excited when I got my 1631D, way back in 2008: https://www.bigmessowires.com/2008/01/17/logic-analyzeroscilloscope/ But it was huge and clunky, and I gave it away after I acquired some newer USB logic analyzers. After writing this, I remembered I already have a 32-channel USB logic analyzer, which I reviewed here in 2017: https://www.bigmessowires.com/2017/09/08/zeroplus-lap-c-322000-logic-analyzer-review/ I have to get an Apple II peripheral card to expose all the address and data pins more easily, then we’ll see if the Zeroplus is up to the task. I’m not sure it has enough memory to store a large enough capture window for this.
By closely examining the activity on the disk I/O signals, I was able to make an educated guess about which section of the ROM code is running when it freezes up. I’d hoped that if I stared at that code long enough, the problem would jump out at me, but no such luck. It seems like maybe it’s doing a JSR to a subroutine, but that subroutine never runs. Maybe a timing problem or glitch with the bit-packing stuff I created to squeeze in the ROM code?
I do have one possible avenue for debugging, but it will be painfully awkward. I can replace an instruction in the ROM code with a BRK instruction, then rebuild the FPGA model, reprogram the chip, and reboot the computer. If the flow of control ever reaches that point, it’ll break into the Apple II monitor.
A related approach just popped into my head. Since this is all happening in an FPGA, I could implement my own breakpoint mechanism. CPU writes to some special address could set the lo/hi byte of a breakpoint address. Then whenever a byte is fetched from the synthetic ROM in the FPGA, it could first check whether the address matches the breakpoint address. If it matches, the FPGA could return zero (the opcode for BRK) instead of the actual data at that address.
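As a rough illustration of the breakpoint idea, here is a small behavioral model in Python. It is not the Yellowstone Verilog. The register numbers, the enable bit, and the use of a flat ROM offset (ignoring bank switching) are all assumptions made for the sketch.

```python
BRK_OPCODE = 0x00  # 6502 BRK


class BreakpointRom:
    """Behavioral model of a ROM with one address-match breakpoint."""

    def __init__(self, rom_bytes):
        self.rom = bytes(rom_bytes)
        self.bp_lo = 0
        self.bp_hi = 0
        self.bp_enabled = False  # hypothetical enable bit, not in the original idea

    def write_register(self, reg, value):
        # Hypothetical register map: 0 = low byte, 1 = high byte, 2 = enable.
        if reg == 0:
            self.bp_lo = value & 0xFF
        elif reg == 1:
            self.bp_hi = value & 0xFF
        elif reg == 2:
            self.bp_enabled = bool(value & 1)

    def read(self, addr):
        # Compare every fetch address against the breakpoint address.
        bp_addr = (self.bp_hi << 8) | self.bp_lo
        if self.bp_enabled and addr == bp_addr:
            return BRK_OPCODE  # substitute BRK for the real byte
        return self.rom[addr]
```

One caveat with this scheme: the breakpoint address has to point at an opcode byte. If it lands on an operand byte, the substituted zero silently changes the instruction instead of triggering a break.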
I recommend Dave Banks’ 6502Decoder project, which can reconstruct a 6502 instruction trace with surprisingly few signals traced.
https://github.com/hoglet67/6502Decoder/wiki
See also discussions on 6502.org and stardot.org.uk
Wow, 6502Decoder looks awesome! It appears to be dependent on the Sigrok software, which is something new to learn.
Digilent’s Digital Discovery is a pretty good alternative to these bulky HP logic analyzers, and it is cheap: only 199 USD.
Specs are also very good: 32 channels with a 2 Gbit buffer will give you about 6 s of capture at a 10 MHz sample rate. And it has a lot of other built-in instruments. https://reference.digilentinc.com/reference/instrumentation/digital-discovery/specifications
I’m very pleased with it.
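The capture-time figure above can be checked with a quick calculation. This sketch assumes "2 Gbit" means 2 × 10^9 bits and that every sample stores one bit per channel with no compression. Both are assumptions, not values taken from the Digilent specifications.

```python
def capture_seconds(buffer_bits, channels, sample_rate_hz):
    """Seconds of capture when each sample stores one bit per channel."""
    samples = buffer_bits / channels
    return samples / sample_rate_hz


# 2 Gbit buffer, 32 channels, 10 MHz sample rate
print(capture_seconds(2e9, 32, 10e6))  # 6.25
```

At a 10 MHz sample rate there are about ten samples per CPU cycle on a machine clocked at roughly 1 MHz, so that window would cover several million bus cycles.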
If the FPGA is connected to the signal of interest (and it still has some space), it could _be_ the logic analyser, itself. Looks like the “Reveal” tool would be Lattice’s equivalent of Xilinx’s Chipscope and Altera’s Signal Tap. Or with probably more hook-up work, an open source core like openverifla could be used. They all take up some logic resources though.
The nice thing about the HP analyzers is that you can run an Inverse Assembler on them for the 6502, and get basically a full trace of your running code. It’s a bit tricky to set up though, especially on the older analyzers that only have a serial port. If you have one with Ethernet (like the HP16500C) it works well.
I use an LA5032 32-channel logic analyzer, and I built a bus riser card with a 40-pin connector that breaks out most of the signals (obviously not all of them, since there’s only 32 channels) and maps them right onto the 40-pin connector that the analyzer uses. So I can just plug my card into the riser, plug the riser into the IIe, and plug the cable into the analyzer, and I’m in business.
The software for the LA5032 is surprisingly good…
Is the LA5032 a Saleae clone? From the screenshots of the software that I see, it looks virtually 100% identical to the Saleae software, including the limitation of only having a timing / waveform view and no state listing view.
I’ve found the problem, but I’m not going to fix it. 🙂 At least not yet. The trouble with one-person projects like these is that they’re almost perpetually in a state of brokenness, which gets discouraging after a while. Fix one problem only to immediately encounter two more. Sometimes it’s nice to just stop and savor a bit of progress.
The problem is a basic logic bug, not a timing issue. The code for the disk controller is in a bank switched ROM. Through manual insertion of BRK statements, I was able to determine that when trying to call a subroutine in one bank, it was actually getting the subroutine at the same offset in a different bank. The desired ROM bank is set by writing to the IWM when the disk is off. The trouble is that “off” is a slightly fuzzy concept; the IWM actually keeps the drive spinning for about one second after the computer turns it off, as a simple performance optimization to avoid frequent stopping and starting. For deciding whether to set the ROM bank, I was checking the 1-second-delayed state of the disk motor instead of the live state of the IWM motor latch.
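The difference between the two checks can be shown with a toy Python model. This is only a sketch of the logic described above, not the actual FPGA code. The class name, the tick-based timer, and the 1,000,000-tick delay (about one second at an assumed 1 MHz tick) are all hypothetical.

```python
MOTOR_OFF_DELAY_TICKS = 1_000_000  # assumed: ~1 second at a 1 MHz tick


class BankSelectModel:
    """Toy model: bank writes are accepted only when the disk is 'off'."""

    def __init__(self, use_live_latch):
        self.use_live_latch = use_live_latch
        self.motor_latch = False  # live state written by the CPU
        self.off_timer = 0        # ticks left before the drive really stops
        self.bank = 0

    @property
    def motor_spinning(self):
        # Delayed state: stays true for a while after the latch is cleared.
        return self.motor_latch or self.off_timer > 0

    def set_motor(self, on):
        if self.motor_latch and not on:
            self.off_timer = MOTOR_OFF_DELAY_TICKS
        self.motor_latch = on

    def tick(self, n=1):
        self.off_timer = max(0, self.off_timer - n)

    def write_bank(self, bank):
        if self.use_live_latch:
            disk_is_off = not self.motor_latch      # corrected check
        else:
            disk_is_off = not self.motor_spinning   # buggy check
        if disk_is_off:
            self.bank = bank
        return disk_is_off
```

With `use_live_latch=False`, a bank write issued shortly after the motor latch is cleared is ignored, so a later call lands at the same offset in the wrong bank. With `use_live_latch=True`, the write takes effect immediately.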
Although it was slow and tedious, I was able to diagnose the problem without any logic analyzer or other external tools. I just used strategically placed BRK statements in the disk controller code, and the Apple II monitor to examine the stack and other contents of RAM. But a good LA would still make the next bug 100 times easier to troubleshoot.
While keeping a drive spinning after clearing latch 4 is useful, and while software exploits the fact that the drive remains readable as long as it’s spinning, the inability to do anything to the drive configuration while the timer is running seems a misfeature. Have any tricks been found to switch modes without having to wait out the timer?
I’m planning to try my hand at a high-speed 20-sector 5.25″ RWTS for machines with a 65C02 and IWM. Based on documentation I’ve read, it seems like it should be doable, but reading and writing would require different mode settings (async for writing, so as to allow one byte to be written every 18 cycles, but synchronous for reading, so as to ease the timing of data sampling), which would make it necessary for the read/write routines to indicate what should be done with the drive afterward (either turn it off, leave it running until the next operation, leave the timer set with the drive prepared for writing, or leave the timer set with the drive prepared for reading). Can you think of any better way to handle the trailing mode setting?
> Is the LA5032 a Saleae clone?
The software is different enough that I don’t think it’s an exact clone, but it does seem very close.
You’re right that it doesn’t have a state view (that I’ve found), though I wonder if one could post-process the data to generate such a thing. Not that that would be a trivial undertaking.
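One possible shape for that post-processing, as a minimal Python sketch: keep one row per CPU cycle by taking the last sample before each falling edge of the clock, which is when the 6502 latches data on a read. The channel packing here (bit 0 = clock, bits 1 to 16 = address, bits 17 to 24 = data) is made up for the example, and it needs 25 channels, so it is not a description of any real export format.

```python
def pack(clock, addr, data):
    """Pack one sample using the assumed channel layout."""
    return (clock & 1) | ((addr & 0xFFFF) << 1) | ((data & 0xFF) << 17)


def state_listing(samples):
    """Reduce a timing capture to one (address, data) row per clock cycle."""
    rows = []
    prev = None
    for word in samples:
        # Falling edge: previous sample had clock high, this one has it low.
        if prev is not None and (prev & 1) == 1 and (word & 1) == 0:
            addr = (prev >> 1) & 0xFFFF
            data = (prev >> 17) & 0xFF
            rows.append((addr, data))
        prev = word
    return rows


capture = [
    pack(0, 0xC600, 0xFF),
    pack(1, 0xC600, 0xA2),
    pack(1, 0xC600, 0xA2),
    pack(0, 0xC601, 0xA2),
    pack(1, 0xC601, 0x20),
    pack(0, 0xC602, 0x20),
]
for addr, data in state_listing(capture):
    print(f"{addr:04X}: {data:02X}")
```

Turning those rows into a disassembled instruction trace is the harder part, and is the problem that tools like 6502Decoder address.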