Archive for November, 2020
Yellowstone Back From the Dead
Remember the Yellowstone disk controller card that I designed back in 2018? It was an FPGA-based clone of the Apple II Liron controller, with aspirations to eventually become a universal reconfigurable disk controller. And it worked nicely when it was the only card installed in the computer, but things went haywire when too many other expansion cards were also present. I eventually gave up and abandoned the project, but now I think I’ve fixed it.
The symptoms were documented in a series of blog posts here, here, here, and here. The more expansion cards present along with Yellowstone, the more likely I was to see errors such as unexplained resets and lockups and drops into the Apple II system monitor. Investigation with an oscilloscope showed lots of nasty looking signals, huge over and undershoot on the data bus, and strange transients on the power supply during card I/O.
There were plenty of theories to explain the problem, and I received over 100 comments from helpful readers. Some theories put forward were: poor grounding, insufficient bulk capacitance, a too weak 3.3V regulator, impedance mismatch, bus fighting, failure to meet the minimum input high voltage, a too strong or too weak drive from the bus driver IC, wrong FPGA slew rate, bad scope probes, bad power supply in the Apple II, a race condition in the logic, and more. My own best guess was a combination of grounding and impedance problems. I spent many weeks chasing various theories without much success. I hacked the card and replaced its bus driver with one from a different logic family. I even started to wonder whether the whole problem lay with the computer rather than the card. By March 2018 I gave up in frustration.
Two and a Half Years Later
This past week I’ve been investigating ideas for an Apple II video card, and a reader pointed me to this tech note about Apple IIgs expansion card design. The point was to learn which signals were provided to the different slots, but my eye caught a different paragraph titled “Avoiding Bus Fights”. As the text described,
“To avoid potential (or actual) bus fights, it is helpful to avoid driving read data from an expansion card onto the bus immediately after PH0 rises. … If a card drives data onto the expansion slot data bus immediately after PH0 rises, there may be a bus fight between the expansion card trying to drive the bus, and the Apple IIGS (or Apple IIe) bus buffers, which may not have turned around yet. … Developers can avoid bus fights by simply using 74LS or 74HCT series parts and relying upon typical delay stackups to delay driving the data bus for approximately 30 nanoseconds. A more solid technique is using the first rising edge of the 7M clock, after PH0 rises.”
My card responds when the Apple II asserts the I/O SELECT signal for its slot, which happens at the same time as PH0 rises. What this paragraph says is that the card should intentionally wait at least 30 ns before responding, because the motherboard’s 74LS245 bus driver is still driving the data bus even after PH0 rises and I/O SELECT is asserted!
At first glance this seems ridiculous. Why would the Apple II assert I/O SELECT for a card before it’s safe for the card to output data? But if you assume the card is built with 1978 vintage ICs that can’t respond very quickly anyway, it wouldn’t have been a problem. The trouble only appears when you use an FPGA and modern logic families like 74LVC with propagation delays of just a few nanoseconds. It becomes necessary to add an artificial output delay to avoid bus fighting.
Several readers had suggested more or less exactly this, including Fluffysheap who described it perfectly in the comments here. But I must have been too frustrated or too tired back then, and I never fully followed through on checking this theory.
Bus fighting almost perfectly explains the horrible signals I observed on the scope. For a few tens of nanoseconds at the beginning of my card’s data output, it was fighting with the motherboard’s bus driver, creating eight short circuits on the 8-bit bus. This caused a surge of current, resulting in horrible power supply transients and wild swings on the bus. From my scope observations this period seemed to last about 70-80 ns, rather than the 30 ns mentioned in the tech note. But the tech note described the Apple IIgs, not the Apple IIe that I used for my tests. Maybe the Apple IIe bus driver is slower to shut off.
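The overlap arithmetic can be sketched in a few lines of Python. This is only a rough model under my own assumptions: the function name is hypothetical, each driver is treated as a simple on/off window measured from the rising edge of PH0, and the 5 ns and 100 ns card timings are illustrative values rather than measurements.

```python
def bus_fight_ns(motherboard_release_ns, card_enable_ns):
    """Return how long both drivers are on the data bus at once, in ns.

    Both arguments are delays measured from the rising edge of PH0.
    """
    return max(0, motherboard_release_ns - card_enable_ns)


# A fast modern card that starts driving about 5 ns after PH0 (assumed value)
print(bus_fight_ns(30, 5))    # 25, using the ~30 ns figure from the tech note
print(bus_fight_ns(75, 5))    # 70, using the 70-80 ns seen on the scope
# The same card with a deliberate 100 ns output delay (assumed value)
print(bus_fight_ns(75, 100))  # 0, no overlap
```

Any card delay longer than the motherboard's release time brings the overlap to zero, which is the whole point of the artificial output delay.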
One thing that bus fighting doesn’t seem to explain is why adding more peripheral cards would make the problem worse. It appears my card was engaging in a bus fight with the motherboard’s own bus driver, and the other cards were just innocent bystanders. The only effect of their presence would be to increase the bus capacitance. I may still be missing something here.
Testing It
Armed with this newfound knowledge, I went to edit the Yellowstone FPGA source to insert an intentional delay before enabling the card’s bus driver for output. Lo and behold, code for creating a delay was already there, but commented out. It was written by me. I can’t remember if I ever tested it back in 2018. Maybe I had the idea but never tried it, or maybe I tried it but something went wrong. Either way, I gave thanks to 2018-Steve and just reapplied the already-existing code.
At first there was some comedy, because I tried several different changes that appeared to have no effect. After half an hour, I realized I was rebuilding the FPGA configuration file after each change, but then programming the old configuration file from 2018. Oops.
What can I say, it works. I loaded up my Apple IIe with a sampler of six different expansion cards in different combinations, connected to a variety of Floppy Emus and real drives, including a Smartport-aware Unidisk 3.5 drive. Everything worked as expected, and there were no unexplained resets or other weird behavior.
I looked at the data bus and the power supplies on the scope, and everything appeared cleaner than before. The power supplies looked OK. There was still some overshoot on the data bus when the card first started driving, but much less than before. Maybe this can be improved further by adding some small inline resistors on the next version of the card. I adjusted the output delay to about 120 ns, which is probably much longer than necessary, but it still leaves more than ample time for 2020-era logic chips to do their jobs.
One More Issue
Things look good with Yellowstone on the Apple IIe, but the IIgs is another story. I have to switch the IIgs to normal speed instead of fast, but then the card should work. A real Liron disk controller card works fine in a IIgs, as long as the system speed is changed, so there’s no fundamental incompatibility. Unfortunately the Yellowstone card just plain doesn’t work on the IIgs. At first I thought that was a result of my new output delay, but removing the delay didn’t help. Then I dug through all my notes from 2018 and concluded that the card never worked on the IIgs under any circumstances. So this isn’t a new problem – it’s an old problem I just hadn’t noticed because it was overshadowed by the bus fighting.
The sort-of-good news is that this failure of Yellowstone on the IIgs seems repeatable and debuggable. Instead of weird spurious resets and lockups like I was seeing two years ago, it looks like it’s just a communication error with the drive. The Yellowstone card firmware appears to be running OK, and I get reasonable error messages like “NO DEVICE CONNECTED”. With a Unidisk 3.5 drive attached, there’s no drive activity. With a Floppy Emu attached and configured for Smartport emulation, it reports a checksum error.
I suspect there’s another timing problem here, but this one relates to writes to the card instead of reads. Perhaps I’m not latching the data from the bus at the right time, and due to minor differences between the bus timing on the IIe and IIgs, it still works OK on the IIe but occasionally writes the wrong values to the IWM chip on the IIgs, or fails to write anything. That would cause garbled communication with the drive.
I’m calling a stop for the moment, feeling pleased with this new progress, and optimistic that I’ll eventually find an explanation for the IIgs behavior.
Apple II Video Card Thoughts
Recently I’ve been toying with the idea of building an Apple II video card. Like many of you, I’ve grown tired of searching for the dwindling supply of monitors with a native composite video input, and frustrated with available solutions for composite-to-VGA and composite-to-HDMI conversion. This card would fit in an expansion slot of an Apple II, II+, IIe, or IIgs, and provide a high quality VGA output image suitable for use with a modern computer monitor. I welcome your thoughts on this idea.
Why Do This?
Another Apple II video solution? Aren’t there enough options already? It’s true there have been many different Apple II VGA solutions over the years, and a smaller number of HDMI solutions. Unfortunately almost all of them have been retired, or have very limited availability. I’m also interested in this project as a personal technology challenge. It could combine some of my past experience with Yellowstone (Apple II peripheral cards) and Plus Too and BMOW1 (VGA video generation).
How Would It Work?
I see three possible paths for a high-quality Apple II VGA video solution. One is to convert the signals from the computer’s external monitor port. But monitor ports are only present on the Apple IIc and IIgs, and so that wouldn’t meet my goals. The second approach is to tap into a few key video-related signals directly from the motherboard, before they’re combined to make the composite video output. This is a viable approach, and others have been successful using this method. The downside is that it requires a snarl of jumper wires and clips attached to various points on the motherboard, and the details vary from one motherboard revision to the next.
I’m considering a third approach: design a peripheral card that listens to bus traffic, watches for any writes to video memory, and shadows the video memory data internally. Then the card will generate a VGA video output using the shadowed video memory as its framebuffer. It’s simple in concept, but maybe not so simple in the details. The first two approaches require only a comparatively dumb device to do scan conversion and color conversion. In contrast, the bus snooping approach will require the video card to be a smart device that’s able to emulate all the Apple II’s video generation circuitry – its weird memory layouts and mixed video modes and character ROM and page flipping and everything else. But it promises a simple and easy solution when it comes time to use the card. Just plug it in and go.
To be specific, for basic support of text and LORES/HIRES graphics, the card will need to shadow writes to these memory addresses:
$0400-$07FF TEXT/LORES page 1
$0800-$0BFF TEXT/LORES page 2
$2000-$3FFF HIRES page 1
$4000-$5FFF HIRES page 2
and these soft switches:
$C050/$C051 graphics / text
$C052/$C053 no mix / mix
$C054/$C055 page 1 / page 2
$C056/$C057 lores / hires
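As a rough illustration of the snooping logic, here is a minimal Python model of what the card would do on each bus cycle. The class and method names are hypothetical, the power-on defaults are assumptions, and real hardware would do this in FPGA logic rather than software. The switch polarity follows Apple's documentation, where the odd address of each pair selects text, mixed, page 2, and hires respectively.

```python
class VideoShadow:
    """Shadows video memory writes and tracks the display soft switches."""

    # Inclusive address ranges the card mirrors: TEXT/LORES pages 1-2
    # and HIRES pages 1-2.
    SHADOWED = [(0x0400, 0x0BFF), (0x2000, 0x5FFF)]

    def __init__(self):
        self.ram = {}        # sparse shadow copy: address -> byte
        # Assumed power-on state: text mode, not mixed, page 1, lores
        self.text = True     # $C050 clears, $C051 sets
        self.mixed = False   # $C052 clears, $C053 sets
        self.page2 = False   # $C054 clears, $C055 sets
        self.hires = False   # $C056 clears, $C057 sets

    def bus_cycle(self, addr, data, is_write):
        """Call once per 6502 bus cycle with the snooped address and data."""
        if 0xC050 <= addr <= 0xC057:
            # Soft switches respond to any access, read or write.
            name = ("text", "mixed", "page2", "hires")[(addr - 0xC050) >> 1]
            setattr(self, name, bool(addr & 1))
        elif is_write and any(lo <= addr <= hi for lo, hi in self.SHADOWED):
            self.ram[addr] = data & 0xFF


shadow = VideoShadow()
shadow.bus_cycle(0x0400, 0xC1, is_write=True)   # character written to text page 1
shadow.bus_cycle(0xC050, 0x00, is_write=False)  # switch to graphics
print(hex(shadow.ram[0x0400]), shadow.text)
```

The VGA generator would then read from the shadow RAM and the four flags to decide what to draw.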
Can It Work?
For basic 40 column text mode, I hesitate to say this will be easy, but I don’t anticipate it being difficult. LORES and HIRES graphics may be more challenging, because the memory layouts are increasingly strange, and because determining what color to draw will depend on careful study of NTSC color artifact behavior. But I’m relatively optimistic I could get this all working well enough for it to be useful.
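To give a sense of how strange the layouts are, here is a short Python sketch of the standard interleaved address calculation for text rows and HIRES scan lines. The function names are my own for illustration; the arithmetic follows the well-documented Apple II screen memory map.

```python
def text_row_address(row, page=1):
    """Base address of a 40-column text (or LORES) row, 0..23."""
    base = 0x0400 if page == 1 else 0x0800
    return base + 0x80 * (row % 8) + 0x28 * (row // 8)


def hires_line_address(line, page=1):
    """Base address of a HIRES scan line, 0..191."""
    base = 0x2000 if page == 1 else 0x4000
    return (base
            + 0x400 * (line % 8)          # line within a character cell
            + 0x80 * ((line // 8) % 8)    # cell row within a third of the screen
            + 0x28 * (line // 64))        # which third of the screen


# Consecutive rows on screen are not consecutive in memory:
print([hex(text_row_address(r)) for r in (0, 1, 8)])
print([hex(hires_line_address(y)) for y in (0, 1, 8, 64)])
```

A card that renders from shadowed memory has to apply this mapping in reverse for every scan line it draws.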
After finishing the basics, there will remain several more difficult challenges. First among these is sync. If the vertical blank of the VGA image isn’t synchronized with the vertical blank of the Apple II’s composite video, then tearing or flickering may be visible in the image. Carefully-timed program loops that attempt to change video memory only during blank periods won’t work as intended. Fortunately motherboard revisions 1 and later provide the SYNC signal to peripheral cards in slot 7, which can be used to synchronize the VGA and composite video outputs. For installation in a different slot or on a revision 0 motherboard, a wire can be connected to the motherboard for the SYNC signal, or the card can run unsynchronized.
What about European PAL models of the Apple II? I think Eurapple machines should work OK, since the card’s function is based on bus snooping rather than the actual video data. The VGA output will be the same, since fortunately there are no NTSC/PAL distinctions to worry about for VGA video. I’m less sure about Eurocolor machines. I believe these already have slot 7 occupied with an Apple video card – can anyone confirm? If so, my video card will have to go in a different slot, where the SYNC signal won’t be available. That’s probably no great loss, since software designed to sync with a 50 Hz composite PAL display can’t be synchronized with 60 Hz VGA anyway.
Can double-hires graphics and 80 column text be handled too? I think so, but I haven’t looked at these modes in much detail yet. From what I’ve seen, they both involve bank switching RAM between the main and AUX banks. Supporting these modes would mean shadowing some additional areas of memory, and a few more soft switches that control bank switching, as well as more software complexity. But they should be doable.
Non-standard character ROMs are another question. The video card will need to duplicate the contents of the character ROM in order to generate text. The most obvious solution is to simply assume the character ROM is a standard one, and include a copy of the standard character ROM on the card. But in some models of Apple II, it’s possible to replace the normal character ROM with a ROM containing an international character set in place of the inverse characters. These seem to have been common for Apple II computers sold in several countries. My card won’t know about the custom character ROM, and won’t render those characters correctly. I don’t see any easy answer for this issue. Possibly the card could use DMA to read the character ROM, but that would be a major increase in complexity.
Finally there’s the question of the Apple IIgs. Will the card even work in a IIgs? Maybe not, if writes to video memory on the IIgs are handled specially, and the address and data never appear on the peripheral card bus. That will be easy enough to test. Another question to consider is super hires graphics on the IIgs. I don’t know anything about how this works. I can’t think of any fundamental reason it couldn’t be handled in the same way as the other graphics modes, but it might be very complex.
Hardware Choices
What kind of hardware would be appropriate for a video card like this? It needs at least 18K of RAM to shadow the memory regions for text, LORES, and HIRES graphics. If 80 column text and double-HIRES are also supported, then the RAM requirement grows to something like 54K. The hardware must also be fast enough to snoop 6502 bus traffic and to shadow memory writes that may appear as often as every 2 microseconds. And it must also be capable of generating a VGA output signal with consistent timing and a pixel clock of about 25 MHz.
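The 18K and 25 MHz figures can be checked with a little arithmetic. The sketch below sums the four shadowed regions and works out a pixel clock from the usual 640x480 VGA frame of 800 by 525 total pixels including blanking; choosing that particular VGA mode is my assumption here, not a settled design decision.

```python
# Shadowed regions as inclusive (start, end) address pairs
regions = {
    "TEXT/LORES page 1": (0x0400, 0x07FF),
    "TEXT/LORES page 2": (0x0800, 0x0BFF),
    "HIRES page 1": (0x2000, 0x3FFF),
    "HIRES page 2": (0x4000, 0x5FFF),
}
total_bytes = sum(end - start + 1 for start, end in regions.values())
print(total_bytes, total_bytes // 1024)  # 18432 bytes, 18K

# 640x480 at 60 Hz: 800 pixel clocks per line, 525 lines per frame
pixel_clock_hz = 800 * 525 * 60
print(pixel_clock_hz / 1e6)  # 25.2 MHz, close to the nominal 25.175 MHz
```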
What about something like a Raspberry Pi? Yeah… it might work, but it’s not at all the kind of solution I would choose. Those who’ve read my blog for a while know that while I’ve used the Raspberry Pi in a few projects, it’s not my favorite tool. I strongly prefer solutions that keep as close to the hardware as possible, where I can control every bit and every microsecond. That’s where the fun is.
Could it be done with a fast 32 bit microcontroller, something like an STM32? 54K of RAM would be no problem, and a 100 MHz microcontroller could probably handle an interrupt every 2 microseconds just fine. That would be enough to snoop the 6502 bus and shadow the writes to RAM. I’m less confident that a microcontroller could directly generate the required VGA output signal. If the caches could be disabled, it might be possible to write carefully-timed software loops to output the VGA signal. But the interrupts for RAM shadowing would probably screw that up, and make a hash out of the VGA signal timing.
A small FPGA is probably the solution here. 54K is rather a lot of built-in RAM for an FPGA, so the FPGA would need to be paired with an external RAM of some type – most likely SRAM or a serial RAM. Or the FPGA could be paired with a microcontroller, providing both RAM and some extra processing horsepower.
How Many Bits in a Track? Revisiting Basic Assumptions
Yesterday’s post contained lots of details about Apple II copy-protection and the minutiae of 5.25 inch floppy disk data recording. It mentions that bits are recorded on the disk at a rate of 4 µs per bit. This is well-known, and 4 µs/bit appears all over the web in every discussion of Apple II disks. This number underpins everything the Floppy Emu does involving disk emulation, and has been part of its design since the beginning. But as I learned today, it’s wrong.
Sure, the rate is close to 4 µs per bit, close enough that disk emulation still works fine. But it’s not exactly right. The exact number is 4 clock cycles of the Apple II’s 6502 CPU per bit. With a CPU speed of 1.023 MHz, that works out to 3.91 µs per bit. That’s only a two percent difference compared with 4 µs, but it explains some of the behavior I was seeing while examining copy-protected Apple II games.
With the disk spinning at 300 RPM, it’s making one rotation every 200 milliseconds. 4 µs per bit would result in 50000 bits per track, assuming a disk is written using normal hardware with a correctly calibrated disk drive. 50000 is also the number of bits per track given in the well-known book Beneath Apple DOS. But it’s wrong. At 3.91 µs per bit, standard hardware will write 51150 bits per track.
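The bits-per-track arithmetic is simple enough to check directly. This small Python sketch assumes an ideal 300 RPM drive and uses the advertised 1.023 MHz CPU clock; the function name is only illustrative.

```python
ROTATION_US = 60_000_000 / 300  # one rotation at 300 RPM, in microseconds


def bits_per_track(us_per_bit):
    """Number of bit cells that fit in one rotation at the given bit time."""
    return ROTATION_US / us_per_bit


print(round(bits_per_track(4.0)))        # the textbook 4 µs figure
print(round(bits_per_track(4 / 1.023)))  # four CPU cycles at 1.023 MHz
```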
Not content to trust any references at this point, I measured the number directly using a logic analyzer and a real Apple IIe and Apple IIgs. When writing to the disk, both systems used a rate of about 3.92 µs per bit. Here’s a screen capture from a test run with the Apple IIe, showing the time for 10 consecutive bits at 39.252 µs. There was some jitter of about 50 nanoseconds in the measurements, and measuring longer spans of bits revealed an average bit rate of about 3.9205 µs. That’s a tiny difference versus 3.91 µs. Can I say it’s close enough, and let it go? Of course not.
Even the advertised CPU speed of 1.023 MHz is inaccurate, so what is it really? The Apple II’s CPU clock is actually precisely 4/14ths the speed of the NTSC standard color-burst frequency of 3.579545 MHz. (This number can be derived as 30 frames/sec times 525 lines/frame times 455/2 cycles/line divided by a correction factor of 1.001.) 3.579545 MHz times 4 divided by 14 is 1.02272714 MHz. Rounded to three decimal places that gives the advertised CPU speed of 1.023 MHz. But using the exact CPU frequency, four clock cycles should be 3.91111 µs. There’s still a discrepancy of slightly less than 0.01 µs compared with my measurements, hmmm.
But wait! In the comments to the answer to this Stack Exchange question, it’s mentioned that every 65th clock cycle of the Apple II is 1/7th longer than the others, because of weird reasons. That means the effective CPU speed is slower than I calculated by roughly one part in 65*7 = 455. In light of this, I calculate a new average CPU speed of about 1.0205 MHz, and a time for four clock cycles of about 3.9197 µs. That’s a difference of only 0.0008 µs from my measurement – less than one nanosecond. Ah ha! So everything makes sense, and my measurements were correct.
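Here is the whole derivation as a Python sketch using exact fractions. It models the long cycle literally as one cycle in every 65 lasting 1 + 1/7 nominal cycles, which is my reading of the Stack Exchange comment. Depending on exactly how the long cycle is modeled, the average clock shifts by a few parts per million, but the four-cycle time comes out at about 3.9197 µs either way.

```python
from fractions import Fraction

# NTSC color burst: 30 frames/s * 525 lines * 455/2 cycles per line / 1.001
colorburst_hz = Fraction(30 * 525 * 455, 2) / Fraction(1001, 1000)

# Nominal CPU clock is 4/14 of the color burst frequency
cpu_nominal_hz = colorburst_hz * 4 / 14

# Every 65th cycle is 1/7 longer, so 65 cycles take 65 + 1/7 nominal periods
stretch = Fraction(65) / (65 + Fraction(1, 7))
cpu_average_hz = cpu_nominal_hz * stretch


def four_cycles_us(hz):
    """Duration of four CPU cycles in microseconds."""
    return float(4 / hz * 1_000_000)


print(float(colorburst_hz))
print(round(four_cycles_us(cpu_nominal_hz), 4))  # without the long cycle
print(round(four_cycles_us(cpu_average_hz), 4))  # with the long cycle
```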
The difference between 4 µs per bit and 3.92 may seem like a minor detail, but for a floppy disk emulator developer, it’s like suddenly discovering that the value of pi is not 3.14159 but 3.2. My mind is blown.
Floppy Emu Update: Apple II Copy-Protection
Good news for Apple II fans: there’s a new Apple II firmware update for the BMOW Floppy Emu Disk Emulator. This update introduces substantial emulation enhancements for copy-protected Apple II software on 5.25 inch floppy disk images. Though Floppy Emu is designed for standard non-protected disks, most copy-protected games and utilities should now work too, including most disks from the WOZ-a-Day collection.
The new firmware is version 0.2N-F26 or 0.2N-F27, depending on the model of your Floppy Emu board. You can download the latest firmware here: firmware
Background
Disk images of copy-protected Apple II software normally must use raw bitstreams, rather than using any high-level data representation like sectors. The NIB and WOZ disk image formats are both raw bitstreams, and the WOZ format has grown increasingly popular for this purpose over the past couple of years.
The BMOW Floppy Emu has supported both NIB and WOZ formats for some time. But even when the disk images could be read, the games didn’t necessarily always run correctly, because they sometimes attempted weird non-standard things with the disk drive. The Floppy Emu hardware didn’t always respond to these attempts as the games expected. The focus of this firmware update has been addressing a handful of edge cases and uncommon behaviors to satisfy most copy-protection schemes, enabling more copy-protected games to work with the Floppy Emu.
Thank you to Nick Bauer, who provided me with a carefully-researched list of several dozen copy-protected WOZ games that weren’t working correctly with the previous Floppy Emu firmware. Nick documented the behavior of each game on two different Apple II models, as well as with MAME, which was a tremendous help. After a few weeks of R&D, when the new firmware was ready, Nick tested it with 350+ disk images from his collection.
So what changed?
SD Card I/O Rate
The most significant change is a doubling of the data rate when communicating with the Floppy Emu’s SD memory card, from 5 Mbps to 10 Mbps – the maximum rate supported by the hardware. The data rate has been 5 Mbps since the earliest days of the Floppy Emu, because 10 Mbps didn’t appear to be necessary, and because higher data rates increase the chances of signal errors due to analog signal effects. But at least one copy-protected game wouldn’t run correctly without the higher data rate – the Floppy Emu simply couldn’t fetch data fast enough from the SD card to keep the game satisfied. So 10 Mbps it is.
The data rate was increased in order to help 5.25 inch floppy disk emulation, but it also affects Apple II 3.5 inch floppy disk and Smartport hard disk emulation. This is the change I’ll be watching most closely for user feedback. If there’s already some source of electrical signal degradation in the system, like a noisy power supply, then the increased data rate may result in an unacceptable number of SD card read/write errors. In my testing, it worked smoothly with three different Apple II computers, five different Floppy Emu boards of various generations, and three different SD cards. But one of my earliest hand-soldered Floppy Emu prototypes did report errors at 10 Mbps, so I’ll be keeping an eye on this.
The SD card data rate can be reduced to the old rate of 5 Mbps by holding down Floppy Emu’s PREV button during power-up. In this case, a small number of copy-protected games including Hard Hat Mack may not run correctly, but most software will be unaffected.
Disk Bit Rate
Floppy Emu’s interface with the Apple II computer lies at the opposite end of the data funnel from the SD card. To the computer’s disk controller, the Floppy Emu looks like a standard 5.25 inch floppy drive, with a standard data rate of one bit every 4 microseconds. Except that some copy-protected software violates this standard. The new Floppy Emu firmware changes the bit rate from a fixed rate of 4 microseconds per bit to a variable rate that depends on the disk image, helping games with non-standard bit rates to run correctly.
How does this work? A 5.25 inch floppy disk spins at 300 RPM, which is one rotation every 200 milliseconds. A normal floppy disk has 50000 bits evenly spaced around each circular track, resulting in a rate of one bit per 4 microseconds. But some disks squeeze 51000 or 52000 bits into a track, which produces a bit rate closer to one bit per 3.9 microseconds.
The Floppy Emu’s hardware design makes it difficult to compensate for this, and the previous firmware didn’t even try. The 4 microsecond number is baked into the logic of Floppy Emu’s CPLD chip, ensuring that the rate never wavers even if the microcontroller is servicing an interrupt or is otherwise busy. While this is very helpful for normal disk images, it means that the bit rate can’t be adjusted on the fly. In practice this was rarely a problem, and most games with a 3.9 microsecond bit rate still worked just fine at 4.0 microseconds, but a few did not.
After some experimentation, I settled on a new CPLD design that allows for on-the-fly selection of a few different fixed bit rates. While this isn’t quite as flexible as a fully-adjustable bit rate, it satisfies nearly all of the copy-protected games I tested without requiring any Floppy Emu hardware changes.
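The selection idea can be sketched in Python as picking whichever fixed rate is closest to the ideal rate implied by the track's bit count. The three rates listed here are hypothetical placeholders, not the actual values built into the CPLD, and the function name is only for illustration.

```python
ROTATION_US = 200_000  # one rotation at 300 RPM, in microseconds

# Hypothetical set of selectable bit times, in microseconds per bit
FIXED_RATES_US = (4.00, 3.92, 3.85)


def pick_bit_rate(bits_in_track):
    """Choose the fixed bit time closest to the track's ideal bit time."""
    ideal_us = ROTATION_US / bits_in_track
    return min(FIXED_RATES_US, key=lambda rate: abs(rate - ideal_us))


for bits in (50000, 51000, 52000):
    print(bits, pick_bit_rate(bits))
```

With only a few rates to choose from, a track will usually still take slightly more or less than 200 ms to play back, but the error is much smaller than with a single fixed rate.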
Cross-Track Synchronization
Some copy-protected games rely on the relative spatial orientations of adjacent tracks on the disk. The WOZ disk image format encodes this information, and the old Floppy Emu firmware was already maintaining this cross-track synchronization. Frogger uses a track synchronization technique called Spiradisc, and it worked nicely on the Floppy Emu. But deeper investigation revealed there were still some cases where cross-track sync wasn’t being handled properly. The details depended on what method the software used to verify the track synchronization: counting disk bytes, measuring time, or some hybrid method.
There are several sources of potential synchronization error. Whenever the CPLD bit rate doesn’t exactly match the original disk’s bit rate, the behavior in the time domain will be slightly off. An entire track’s worth of bytes will be transmitted in slightly less or slightly more than 200 ms. With the old firmware, this type of error was most evident for games that used cross-track synchronization and had a per-track bit count significantly different from 50000. This error is mostly eliminated now that the bit rate is changeable.
A second source of error appears when games verify track synchronization through byte counting, if the SD card can’t load the track data quickly enough. A game might read sector 0 from track 0, then immediately step to track 1, and then count how many bytes pass by before sector 0 appears on the new track. If track 1’s data isn’t finished loading from the SD card yet, the Floppy Emu will insert a continuous series of special 10-bit timing bytes into the bitstream, until the real track is ready. This results in the software counting a different total number of bytes until sector 0 appears. Increasing the SD card data rate helps, by loading the track data faster, and reducing or eliminating the time period where the game sees 10-bit timing bytes instead of the track’s data bytes.
Weak Bits
Another common copy-protection scheme involves regions of the disk where there’s no magnetic flux. Due to the design of the floppy drive hardware, these will appear as random data when read. A new random-looking value will be read each time the empty region of the disk is accessed. Copy-protected software can test for this, and confirm that the data really is changing randomly and isn’t some fixed random-looking pattern, such as would be produced by Copy II+ or other disk copying software.
Floppy Emu already supported weak bits, and automatically substituted random “fake bits” in their place. All of the weak bits examples in the WOZ disk image reference set were working fine. But as with cross-track synchronization, a closer look found there were still some minor problems. It appears that the random fake bits supplied by Floppy Emu weren’t quite random-looking enough for some copy-protected software. This was hard to pin down, and I experimented with longer and shorter random sequences, as well as changing the sequence length. What seemed to help most was changing the distribution of random values, so that 0 bits were more likely to appear than 1 bits. The WOZ reference actually mentions this, but I’d tried it once earlier and it seemed to actually make things worse. It remains something of a mystery, but is working OK now with the disk image test suite.
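A biased fake-bit generator along these lines might look like the following Python sketch. The 30 percent probability of a 1 bit is an arbitrary assumption for illustration, not the value used in the firmware, and the function name is hypothetical.

```python
import random


def fake_bits(count, p_one=0.3, rng=None):
    """Return `count` random bits where 1 appears with probability p_one.

    p_one below 0.5 makes 0 bits more likely than 1 bits.
    """
    rng = rng or random.Random()
    return [1 if rng.random() < p_one else 0 for _ in range(count)]


bits = fake_bits(10_000, rng=random.Random(1))
print(sum(bits) / len(bits))  # fraction of 1 bits, near p_one
```

Each read of the weak region would draw a fresh sequence, so software that reads the region twice sees different data both times.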
Other Fixes
Aside from the interesting copy-protection stuff, this new firmware update also includes some basic bug fixes. I found two cases where track-to-track stepping movements of the disk were being misidentified or missed altogether, resulting in the Floppy Emu moving to the wrong track of the disk. Neither of these cases ever appeared with normal software, but they sometimes happened with copy-protected games that used custom code to directly control the stepper motors. A small fraction of WOZ disk images were also being misidentified as DSK images, causing errors when attempting to use them.
Oh yeah, one more thing: the new firmware also includes a menu usability enhancement for SD cards with deeply-nested directories of disk images. When exiting a subdirectory and returning to the parent directory, the selected menu item will now be the item for the subdirectory that was just exited, instead of the first menu item. This makes it easier to navigate into and out of subdirectories without getting confused and lost.