Yellowstone Glitch, Part 5: Fix Signal Integrity With This One Weird Trick
All the clues surrounding Yellowstone’s glitching problems seem to point towards a power supply problem with the ‘245 bus driver and/or the SRAM. The good news is that I now have a “solution” that seems to fix the glitches, and gets the board functioning normally. The bad news is that I don’t understand why the solution works, or what the underlying problem is, so I can’t be confident that it’s really gone.
In part 4 I described how the glitch occurs occasionally if the value $FF is read from RAM. Something goes wrong when driving all 1 bits from RAM onto the data bus. So my solution is to handle RAM reads like this:
- Make the FPGA output the value $AA / 10101010
- After 70 ns, enable the ‘245 bus driver to put that value on the Apple II bus
- After another 70 ns, turn off the FPGA output and enable the RAM to get the real value
- At the end of the bus cycle, disable the ‘245 and the SRAM
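The steps above can be sketched as a small timing model. This is only an illustration in Python, not the actual FPGA logic: the names `BusState`, `state_at`, and `bus_value` are made up for the example, and `CYCLE_END_NS` is a placeholder, since the post gives the two 70 ns delays but not the length of the bus cycle.

```python
from dataclasses import dataclass

PRE_DRIVE = 0xAA      # pattern from the post: 10101010
STEP_NS = 70          # delay between steps, from the post
CYCLE_END_NS = 450    # hypothetical end-of-cycle time; not stated in the post


@dataclass
class BusState:
    fpga_out: bool = False   # FPGA is outputting the pre-drive pattern
    ram_out: bool = False    # SRAM output is enabled
    drv_245: bool = False    # '245 is driving the Apple II bus


def state_at(t_ns: int) -> BusState:
    """Return which outputs are enabled t_ns after the RAM read begins."""
    if t_ns < 0 or t_ns >= CYCLE_END_NS:
        return BusState()                             # '245 and SRAM disabled
    if t_ns < STEP_NS:
        return BusState(fpga_out=True)                # FPGA outputs $AA, '245 off
    if t_ns < 2 * STEP_NS:
        return BusState(fpga_out=True, drv_245=True)  # $AA reaches the bus
    return BusState(ram_out=True, drv_245=True)       # real RAM value reaches the bus


def bus_value(t_ns: int, ram_value: int):
    """Value the '245 puts on the Apple II bus, or None if it is not driving."""
    s = state_at(t_ns)
    if not s.drv_245:
        return None
    return PRE_DRIVE if s.fpga_out else ram_value
```

In this model, a read of $FF shows nothing on the bus for the first 70 ns, then $AA, then $FF from 140 ns until the '245 is disabled.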
And it works! I no longer see glitching on the logic analyzer, and Yellowstone now seems to work normally for booting disks on the Apple IIgs. But why does this trick help?
One possibility is a residual 5V on the data bus from the previous bus cycle’s value. When the 74LVC245 turns on its outputs and tries to drive 3.3V onto the bus, but finds that the bus capacitance is already charged to 5V, this will briefly create a condition that violates the maximum output voltage rating of the chip. Maybe it causes unexpected behavior or chip damage. If that’s what’s happening, then pre-driving 00000000 before the real RAM value instead of 10101010 should be best, because it will avoid the condition where the chip tries to drive 3.3V into a 5V bus capacitance. But I tried this, and it actually made the glitching worse than it was originally. Hmm. That would seem to rule out this explanation.
A second possibility is that there’s no violation of any maximum ratings, but that driving 11111111 onto the data bus simply demands a large amount of instantaneous current, which the bypass capacitor and the voltage regulator are unable to handle. So the local 3.3V voltage sinks down and/or local GND voltage gets pulled up, and some chips glitch. The ‘245 and the RAM are the furthest chips away from the voltage regulator on the PCB, so this would make sense. Pre-driving 10101010 before the actual RAM value helps smooth out the supply current spike by preventing all the outputs from changing to 1 at the same time. Any other pattern with four 1 bits should serve the same purpose.
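To make the "not all at the same time" idea concrete, here is a rough count of how many outputs switch from 0 to 1 at each step. It is a sketch under assumptions: the previous bus value is taken to be $00 as a worst case (the post doesn't say what it is), the helper names are hypothetical, and counting transitions is only a crude stand-in for the real supply current.

```python
def rising_bits(before: int, after: int) -> int:
    """Count the bits that switch from 0 to 1 between two 8-bit bus values."""
    return bin(~before & after & 0xFF).count("1")


def worst_step(values):
    """Largest number of simultaneous 0->1 transitions along a sequence of values."""
    return max(rising_bits(a, b) for a, b in zip(values, values[1:]))


# Assumed worst case: the bus starts at $00 and the RAM value is $FF.
direct = worst_step([0x00, 0xFF])             # all eight outputs rise together
pre_driven = worst_step([0x00, 0xAA, 0xFF])   # four rise, then the other four
```

Under these assumptions `direct` is 8 and `pre_driven` is 4, so the pre-drive halves the number of outputs rising at any one moment. By the same count, pre-driving $00 still leaves all eight transitions in a single step.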
If this is indeed what’s happening, then the real solution ought to be ensuring the ‘245 and the RAM both have really good 3.3V and GND connections back to the voltage regulator and to the board’s bulk capacitor, and also adding some larger value bypass capacitors directly next to each chip. So I did some surgery on the board, adding a 10 uF SMD ceramic bypass capacitor next to the existing 0.1 uF for the ‘245, and running 3.3V and ground wires directly back to the regulator, but it didn’t fix the glitching. That was a surprise, and it would seem to rule out this explanation too.
Now I’ve apparently eliminated both of the plausible explanations for the glitching behavior, leaving me with nothing. Either my testing is flawed, or something else is happening here that’s different from either of these explanations. It’s all driving me slowly insane. Although I now have a work-around solution in pre-driving 10101010, I don’t want to move on until I can explain all this behavior and have some confidence that the problem won’t return.
This is so exciting! Better than any thriller I saw during the last year.
I would bet that it is not a PSU/decoupling problem, but rather some race condition/unintended carry/overflow in your FPGA code. Nevertheless, I love your writeups; please keep us informed (and thrilled too.) Wish you best luck!
Tom
Haha, I’m glad you find it exciting. Honestly I’m surprised that anyone but me is interested in reading about this.
Another possibility is that there are multiple causes of glitches, and my various tests might have fixed some causes but not all of them, leading me to believe the tests didn’t work. For example, I’ve sometimes seen glitches at the moment when the ‘245 is enabled, and also at the moment when it’s disabled. Is that one issue or two? I’ve also seen glitches at the moment the RAM is enabled, if it happens after the ‘245 is already enabled.
There are plenty of us lurkers who enjoy watching and learning from the process.
I appreciate that you take the time to write up your experience in such detail.
Looking at the Apple expansion bus pinout, my suspicion is that the problem is instantaneous current through the ground pin on that connector. Trying to shore up on-board bypass caps would actually make that problem worse, since they would increase the board’s ability to move current between the ground wire and the data wires. A simple way to test that would be to use a glitchy version of the FPGA program, but add a solid ground wire between the expansion board and the Apple motherboard. If adding the ground wire fixes the problem, that would suggest that the voltage drop in the ground connection at the edge of the expansion board is the problem, and that cutting the worst-case charge transfer in half (by using the 10101010 data pattern) would cut the worst-case voltage drop in half, substantially alleviating the problem.
When trying to drive buses with fairly high capacitance, it may seem like one needs high-current outputs, but it’s important to remember that 1980s computers used bus speeds more than an order of magnitude slower than modern ones. They’re designed to handle slow signal transitions. They’re not designed to handle high peak ground currents. Parts like a 74LS373 may seem obsolescent, but I’d be more comfortable driving a 1980s data bus with something like that than with a modern high-speed logic family.
BTW, I’ve been meaning to get back to my Apple programming. One of my next planned projects is to adapt my picture-packing program, which stores 16 double-hi-res pictures (245,760 7-bit bytes, equivalent to 215,040 octets) on a single 5.25″ floppy side, so it can be distributed as a Floppy-Emu compatible Woz file rather than requiring my custom board (one would run the Woz file with a blank floppy to create the disk, which could then be used on a stock machine). If that works, the next project would be to see if I can program the IWM to boost usable 5.25″ track capacity from 5,120 7-bit bytes to 6,144 7-bit bytes. A floppy written this way couldn’t be read with the 1970s Disk II design, but should work with an IWM. If I get that working, do you think it might be a good test for Yellowstone? Do you think the maker of AppleSauce or definer of the Woz format might be interested in such a thing? I don’t think the present Woz format, at least not as emulators would expect to process it, would handle IWM trickery, and even if nobody used it back in the day, I think it would be cool to have programs that run off a physical 5.25″ floppy that holds 80% more data per side than a conventionally-written floppy.
Those suggestions make lots of sense, but unfortunately didn’t help. I tried a ground wire from Yellowstone back to the Apple II motherboard, and also a ground wire from the SRAM to the voltage regulator and from the ‘245 to the voltage regulator.
Driving 11111111 from RAM through the ‘245 onto the data bus is causing trouble. I can see this clearly – other chips begin to glitch the moment the ‘245 is enabled. But driving 11111111 from the FPGA through the ‘245 onto the data bus is OK, and driving other values from RAM is also OK. I can’t explain why. It must somehow be significant that the RAM chip uses a moderately large amount of current (85 mA) when it’s enabled during a bus cycle. I’ve added extra power and ground wires and bypass capacitors seemingly every which way, without improvement.
I’m fairly confident the problem is too much current at some moment in some location. But I can’t put my finger on exactly where, or why, or how that produces the behavior I’m seeing.
That’s a fair point about slower transitions with older parts being beneficial. 74LS parts won’t help me with 3.3V level translation though, and newer 3.3V parts like 74LVC mostly have faster slew rates and higher currents. I could try 74LVC with series resistors, but it’s not something I can easily test without more significant board surgery or a new board.
I was envisioning using a 74LS373 to drive the bus, and a separate chip to handle level translation in the reverse direction. The input side logic translator would only have to drive a small number of new chips with relatively low pin capacitance, as opposed to having to drive a larger number of older chips with higher pin capacitance, and thus its high speed outputs wouldn’t cause excessive peak currents. Replacing one transceiver chip with a combination of an input buffer and an output buffer might be less convenient than using one chip, but it would allow both chips to be picked to be maximally suitable for their intended usage scenarios.
Ah, I see. That could work. I would probably prefer a single bidirectional chip plus series resistors rather than separate input and output chips, but I’ll think about it some more. To your earlier question, I’m not sure if Yellowstone would handle some non-standard IWM trickery, but it might be interesting to try. John K. Morris is the designer of Applesauce and WOZ.