Yellowstone Reprogramming and Testing
I’m working on the hardware design for version 2.0 of Yellowstone, my FPGA-based universal Apple II disk controller. Most of it is going well, but I’m facing two major challenges I’m unsure how to solve: how to support user reprogramming, and how to batch test newly-made boards. I’m feeling blocked by both of these important questions.
User Reprogramming
The board is based on an FPGA, which contains the disk controller logic design as well as 8 KB of synthetic ROM. There’s a JTAG header on the board which can be used to program the FPGA after assembly. But what happens after I’ve shipped the Yellowstone board to a customer, and then I fix a bug or add a new feature? Most people don’t have a JTAG programmer, so how will the customer install the new firmware?
In an ideal world, the Yellowstone board would have some kind of USB port or memory card slot for installing updates. But I’m reluctant to include additional hardware for the sole purpose of future firmware updates. It’s both a cost concern and an “I really just want to finish this design” concern.
An alternative might be to implement reprogramming from within the Apple II itself, using a custom software program. But how, exactly? The Apple II bus isn’t connected to the JTAG header. Maybe I could create a connection with the addition of an extra buffer chip, but my enthusiasm for working on that idea is low. After nearly four years of development, I prefer to focus on getting the core functions of the card debugged and working, and call it done.
The solution I’m leaning towards is to simply forgo user reprogramming for all but the most advanced users. People who own a JTAG programmer (or a Bus Pirate or an Arduino with the appropriate sketch loaded) and are familiar with the software for using it could still update the FPGA if needed, but for everyone else it would be a fixed-function device. The Daisy Chainer adapter that I developed for Floppy Emu faced the same issue surrounding updates and used the same non-solution, and it’s worked out fine. Unless I can think of another simple solution for reprogramming, I’ll probably go with this answer.
Testing
This is a complex board; probably the most complex I’ve ever designed. Let’s say I’ve just assembled 100 of them, or paid a manufacturer to assemble 100 for me: now what? How do I quickly and reliably test them to confirm they work? It’s unrealistic to put each one in my Apple IIe slot 6, connect a bunch of different disk drives, boot some disks, play Oregon Trail… it would take far too much time, and rapidly wear out my equipment. Nor could I expect a manufacturer to do that kind of test for me. It would be prohibitively expensive, if they’d be willing to do it at all.
Some kind of self-test capability would be great, but my options are limited. Without actually connecting something to the drive connectors, I couldn’t verify that they’re working. And unlike most of my other projects, Yellowstone doesn’t have any CPU or microcontroller that could be used to run self-test code. It’s just a big pile of logic circuits and memory that’s normally controlled by the Apple II.
What I should really do is create an external testing apparatus for the Yellowstone card: something the card plugs into, which then quickly exercises all the card’s hardware and flashes a green light if everything’s OK. That’s what I did with the pogo pin test board for testing the USB Wombat.
But here’s the problem: designing an external tester is a big project that needs a lot of time, and the very thought of it makes me collapse into a puddle on the floor whining like a two-year-old who needs his nap. I just don’t want to do it. I designed an external tester for the Wombat, and the effort took at least several weeks. A Yellowstone tester would be even more complex. I’ve simply run out of patience for further scope creep on this project, testing be damned.
So I’ve been bargaining with myself, trying to imagine what kind of limited testing I could do to get the best bang for the buck. What testing would catch the most problems or the most common types of problems, with the least amount of additional engineering work and the least time per test? How far could I get by confirming the FPGA programmed successfully, and doing a visual inspection of the assembled board, without further testing? Is it the difference between 99 percent and 99.99 percent reliability? What number should I be aiming for?
15 Comments so far
Hi Steve,
Just my 2 cents’ worth…
A Lattice-compatible USB JTAG cable is less than $10 on eBay.
Anyone that wants to experiment with Yellowstone could easily afford that.
As far as testing goes, there is precious little on the card besides the FPGA and voltage translators (at least in the pics I have seen).
Simple solution would be to get an A2 bus extender card from Reactive Micro, or make your own, and run the Apple Diagnostics from the A2 side.
For that kind of testing, I have a 50-pin edge plug (cannibalised from a Mac CD-ROM adapter 922-1820) and a ribbon cable to a 50-pin socket.
Cheers,
Leslie
Thought 1: comparative testing between a DUT and a golden sample (same artificial inputs, compare the outputs); but I presume these cards are more than just passive combinatorial logic (and the clocks are not synced) so this probably wouldn’t be much use.
Thought 2: cheat and only measure board power draw. I.e. provide artificial inputs from a micro, connect the outputs to some reasonably low-impedance dummy resistors (so changing outputs noticeably changes overall board current draw), scope the board’s power draw over (e.g.) 5 seconds, then compare the trace shape with a saved ‘known good’ trace you keep onscreen on your scope. Harder to automate, but reasonably quick and easy to do by hand. May be useful or useless depending on the power draw behaviors & predictability of your boards. Should pick up all sorts of faults including: bad soldering, bad parts, bad power supplies & firmware not loaded.
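A rough sketch of what the ‘compare the trace shape’ step might look like if the scope samples were ever pulled onto a PC; the trace data, length, and threshold below are all made up for illustration:

```c
#include <math.h>
#include <stdio.h>

/* Compare a captured board-current trace against a saved "known good" trace.
   Both traces are assumed to be the same length and sampled at the same rate.
   The pass/fail threshold is arbitrary and would need tuning on real boards. */
static int trace_matches(const double *dut, const double *golden, int n, double max_rms)
{
    double sum_sq = 0.0;
    for (int i = 0; i < n; i++) {
        double diff = dut[i] - golden[i];
        sum_sq += diff * diff;
    }
    double rms = sqrt(sum_sq / n);   /* RMS deviation from the golden trace */
    return rms <= max_rms;
}

int main(void)
{
    /* Made-up 8-sample traces standing in for ~5 seconds of scope data */
    double golden[] = { 0.10, 0.12, 0.25, 0.24, 0.18, 0.30, 0.11, 0.10 };
    double dut[]    = { 0.10, 0.13, 0.24, 0.25, 0.19, 0.29, 0.11, 0.10 };
    int n = sizeof(golden) / sizeof(golden[0]);

    printf(trace_matches(dut, golden, n, 0.02) ? "PASS\n" : "FAIL\n");
    return 0;
}
```

An RMS-deviation score is just one way to compare the shapes; correlating the captured trace against the golden one would work too.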
What’s the biggest source of failure for you in the past? Mainly just bad soldering, or do you get faulty silicon reasonably often too? I presume testing the connectors is just as important as anything else (i.e. skipping the connectors & using pogo pins onto test pads might be ill-advised?).
> How far could I get by confirming the FPGA programmed successfully, and doing a visual inspection of the assembled board, without further testing? Is it the difference between 99 percent and 99.99 percent reliability? What number should I be aiming for?
Don’t think in percentages, think in absolute numbers of bad boards.
What are the costs of one RMA process in terms of time, money & reputation/image? What is the max absolute number of RMAs that you want to deal with over (e.g.) 1 year? I would suggest working backwards to a percentage from there.
Re: User Reprogramming, I wrote about this before: the MachXO2 has the capability to self-program its flash, driven by user logic. E.g. additional Apple I/O ports (or magic sequences on existing ports) could control this. The Wishbone bridge would require some logic resources, though.
See e.g. “Flash Memory (UFM/Configuration) Access” in “Using User Flash Memory and Hardened Control Functions in MachXO2 Devices Reference Guide”, p. 49: http://www.latticesemi.com/dynamic/view_document.cfm?document_id=39086
When I had to design a thingy with user-updateable bitstreams (Altera) I ended up with an MCU and an SD card. The MCU was loading the bitstream to the FPGA at power-on. All the user had to do was to place a file on the SD card.
Could you build passive loopback jigs that could be attached to the card edge and floppy connectors? The FPGA could then be put into a special test mode that would toggle its outputs and verify that the corresponding looped-back inputs were toggling. The test mode could be enabled with a jumper, or maybe by detecting a signal connection on the test card (e.g. pulling down a pin that’s normally floating or high in the Apple II). The test could run automatically at power-on when enabled, and drive an LED or two to indicate pass/fail.
This should give pretty good test coverage – you’d verify the FPGA image load, the solder connections, and at least one direction of the bus transceivers. You could also have the self-test exercise both directions of the address and data transceivers.
Since the jigs would be passive, it should save you from the pain of designing a complex test jig with a microcontroller and corresponding test software.
@Leslie I didn’t find a bus extender product at RM. Maybe it’s a discontinued product? It sounds helpful.
@Hales for other products I’ve made, manufacturing problems are rare, and they’re almost always soldering problems. This board will have a larger number of fine-pitch components than anything I’ve done before, so the risk of soldering problems is higher. Occasionally there will also be a mechanical problem with a connector, like a bent or broken pin. Faulty silicon has been extremely rare and almost not worth worrying about.
@Alex B thanks for the reminder. I did investigate this a little, about a year ago. Maybe this would allow for a solution with no extra hardware, driven from the Apple II. I’ll look into it more.
@Mike E some type of loop-back tester would be helpful, but probably wouldn’t get 100 percent test coverage. There are lots of unidirectional level shifters at the board edge connector that could only be tested with externally-driven signals. The FPGA also isn’t well-suited to running test code, unless I also added a soft-CPU, which would be a challenge to squeeze in. But I still like this idea for its simplicity, and it might be good enough.
A test that toggles all the FPGA pins could also be performed with JTAG in EXTEST mode, rather than by logic in the FPGA. I don’t have any experience with this, but it should be theoretically doable.
The main problem here isn’t that designing a tester is impossible, but that I’m suffering from severe design fatigue. After almost four years of Yellowstone it’s hard for me to contemplate adding a major new engineering element now, especially one whose purpose is “manufacturability” rather than functionality of the card itself. But maybe there’s no avoiding it.
I think the ideal tester would be a card with an onboard microcontroller and a 50 pin edge connector, as well as two disk connectors with ribbon cables. You would stick the Yellowstone card in the edge connector, and attach the disk connectors to Yellowstone’s disk outputs. Then the microcontroller would act like an Apple II, reading and writing the Yellowstone’s RAM and ROM, and exercising all the disk I/O signals, confirming everything behaves as expected. But this would take lots of time to develop.
Going back to what Hales suggested for thinking about the testing vs. reliability tradeoff, I’m not sure if an external tester would pay for itself, when you compare its development cost to the cost savings from improved reliability. Let’s use some rough estimates for the numbers: developing a good tester might take me four weeks working half-time, which is 80 hours. Value my engineering time at $60 per hour and that’s $4800 to develop the tester, plus the cost of actually building a few of them, so call it $5000 total.
Will having a detailed tester save me more than $5000? Let’s say I manage to sell 200 cards per year, which I think is a reasonable guess. With strong testing, maybe the reliability rate of delivered cards is 99.5 percent, which is roughly where my other products are. So one card out of 200 would get RMA’d for repair or replacement each year. With weaker testing, say only FPGA programming plus visual inspection, maybe the reliability rate falls to 97 percent. That number might be way off, but it’s my best guess. So then six cards would get RMA’d each year, which is five more than with strong testing. Say each of those five RMAs costs me an hour of time plus $40 in materials, so that’s $100 each, $500 per year. So it would take me ten years to recoup the development cost of the strong tester.
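Here’s that back-of-the-envelope math written out as a tiny program, using nothing but the guesses from the preceding paragraphs:

```c
#include <stdio.h>

int main(void)
{
    /* Rough estimates from the discussion above */
    double tester_dev_cost = 5000.0;  /* ~80 hours at $60/hr plus building a few testers */
    double cards_per_year  = 200.0;
    double strong_yield    = 0.995;   /* delivered-card reliability with a full tester */
    double weak_yield      = 0.97;    /* reliability with FPGA programming + visual inspection only */
    double cost_per_rma    = 100.0;   /* ~1 hour of time plus $40 in materials */

    double extra_rmas_per_year = cards_per_year * (strong_yield - weak_yield); /* ~5  */
    double savings_per_year    = extra_rmas_per_year * cost_per_rma;           /* ~$500 */
    double breakeven_years     = tester_dev_cost / savings_per_year;           /* ~10 */

    printf("Extra RMAs avoided per year: %.1f\n", extra_rmas_per_year);
    printf("RMA savings per year: $%.0f\n", savings_per_year);
    printf("Years to recoup tester cost: %.1f\n", breakeven_years);
    return 0;
}
```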
From a purely financial perspective, it may not make sense to design an automated external tester for a product whose sales volumes are this low.
Another option might be to do what I originally said was unrealistic: test each card by hand in my personal Apple II. Let’s say that takes five minutes per card. Using my $60 per hour figure, for 200 cards per year, that would be $1000 per year worth of testing time. If I could arrange to pay someone else to do that work, the cost would probably be less. This might be a reasonable alternative after all. The downside is the wear and tear it would put on the Apple II, which is hard to quantify. A working Apple IIe would cost several hundred dollars to replace.
Whew… I think this comment response is longer than my original post. 🙂
A thought after reading all the comments… and this is coming from someone with zero experience with FPGAs and such, and definitely not an electronics expert… Is the FPGA programmed by connecting to the JTAG header after assembly? If yes, could you create a firmware that basically just loops through all the input/output pins or whatever while connected to a special and simple test jig, maybe as simple as blinking LEDs or communicating with I/O pins of a microcontroller? Basically just something dead simple to test the solder connections. Then when the test passes, flash the correct programming onto it and move on.
> […] I’m not sure if an external tester would pay for itself, when you compare its development cost to the cost savings from improved reliability.
> […] Value my engineering time at $60 per hour and that’s $4800 to develop the tester, plus the cost of actually building a few of them, so call it $5000 total.
> […] Say each of those five RMAs costs me an hour of time plus $40 in materials, so that’s $100 each, $500 per year. So it would take me ten years to recoup the development cost of the strong tester.
Wow, that tester sounds intense. Much more than I was expecting. (Also I’m envious of your self-worth of 60 USD/h 🙂 I work doing admin, electronics repair and a few small designs for about half of that. I might need to look at other jobs, but for now it’s stable and I’m doing OK with it.)
That covers the financial costs, but what about the social/image side and personal toll of dealing with RMAs? Maybe a polite printed note on the sales page + in each parcel could help temper difficult customer interactions a little bit? E.g. ‘We expect a 3% failure rate for these boards due to their much finer-pitch parts and more difficult soldering than our other products. We’re a small group, please contact us if you think your board has issues.’
Additional difficult thought worth considering: solder shorts between certain pins damaging the user’s host device.
Are you sure it will be low sales? With this turning into a Universal Drive Controller, I want one for my IIe, using Floppy Emus. I am guessing I can do two emu drives, one as a 5.25 and the other as a 3.5. I’m hoping the card is a reasonable price, and that might spur others to replace their existing controllers and go with this to handle either real drives or emus. Who would not want a single UDC for both size drives, and to potentially free up a slot?
JTAG is the way to go for testing of the I/O for shorts or open circuits. If there are spare I/O pins, you could connect them to inputs to the card and then drive them from JTAG, but leave them as “no connection” in the actual design. This might allow for stand-alone testing (probably would need a power jumper or socket – but that would be needed for JTAG programming of the FPGA anyhow).
I think there might even be some automated tools that can generate JTAG vectors from an FPGA design file, but that might run up the cost.
Can you upload a PDF of the schematic to the GitHub repository for those of us that don’t have Eagle licenses? I’m interested in seeing the level shifters and how the card edge and disk signals could be looped back.
Sounds like you should get an EE intern to design/make you a tester!
I think you could get away with using a few loopback wires for a full test.
Essentially, you are selling a two-part product:
– the FPGA program
– and the PCB with passives and a few level converters
You can test both separately.
– The FPGA program on your dev board and maybe on a golden sample. That needs to be done once per program version.
– And the PCB by loopback tests. See below.
Write a primitive test program that writes and reads back bit patterns (running 0, running 1, chessboard pattern, all 0, all 1, whatever you think you need) to test the signals between the FPGA and the world outside the PCB. That tests all wiring from and to the outside world, including level converters. It can also test the RAMs added in the newer PCB rev.
The program can be a JTAG tester running on a PC, temporarily abusing the FPGA as a gigantic shift register. (That’s what JTAG was designed for.)
Or the program can be a special application running on the FPGA, that runs the tests on board and reports back using a single “good/bad” signal, a UART TX register shifting out a detailed report on a spare pin, or anything else you can think of to report the test result.
In the latter case, you need to replace the test program with the final FPGA program as the last step of the test.
At work, we use that technique for almost all boards that we deliver to our customers.
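For illustration, here is a rough sketch of the kind of pattern loop described above, for one hypothetical 16-bit group of looped-back signals. How drive_and_read() would actually wiggle and sample the pins (JTAG boundary scan from a PC, or test logic inside the FPGA) is left as a stub:

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_PINS 16   /* placeholder width for one group of looped-back signals */

/* Stand-in for whatever actually drives the outputs and samples the looped-back
   inputs. A perfect board would echo every pattern back unchanged. */
static uint32_t drive_and_read(uint32_t pattern)
{
    return pattern;
}

static int check_pattern(uint32_t pattern)
{
    uint32_t mask = (1u << NUM_PINS) - 1u;
    uint32_t drove = pattern & mask;
    uint32_t readback = drive_and_read(drove) & mask;
    if (readback != drove) {
        printf("FAIL: drove %04X, read %04X\n", (unsigned)drove, (unsigned)readback);
        return 0;
    }
    return 1;
}

int main(void)
{
    int ok = 1;
    ok &= check_pattern(0x0000);          /* all 0 */
    ok &= check_pattern(0xFFFF);          /* all 1 */
    ok &= check_pattern(0xAAAA);          /* chessboard */
    ok &= check_pattern(0x5555);          /* inverted chessboard */
    for (int i = 0; i < NUM_PINS; i++) {
        ok &= check_pattern(1u << i);     /* running 1 */
        ok &= check_pattern(~(1u << i));  /* running 0 */
    }
    printf(ok ? "all patterns PASS\n" : "board FAILED\n");
    return 0;
}
```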
I’ve mostly put these questions on hold for the time being, but do have a few more thoughts. For user FPGA reprogramming, self-programming through user logic should work, but may be complex to implement. It would require custom coding on the Apple II side to load and transfer the bitstream data, and in the FPGA logic to implement the Wishbone command controller to perform the internal flash memory writing. There’s also a risk of bricking the device if the self-programming is interrupted, because it requires an already-programmed and working FPGA in order to reprogram.
My current thinking is that SPI programming could solve both problems. I’d still have to write the Apple II-side code, but nothing in the FPGA logic, because the SPI configuration interface is a built-in hardware function. That also means it’ll work even if the FPGA is blank or has a corrupted configuration image. So now the question is how to bit-bang SPI from the Apple II using existing bus signals, without any help from the FPGA and with no (or very little) extra hardware.
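To make that question concrete, here’s roughly the shape the Apple II-side code might take, sketched in C (as it might look with a cross-compiler like cc65). The soft-switch addresses and the one-access-per-bit scheme are placeholders for whatever decoding turns out to be feasible, not a real design:

```c
#include <stdint.h>

/* Hypothetical memory-mapped locations in the card's slot I/O space
   ($C0E0-$C0EF for slot 6). How SCK/MOSI/CS would really be derived from
   existing bus signals is exactly the open question; these addresses and
   the "one soft-switch access per bit" scheme are placeholders only. */
#define SPI_CS_ASSERT    (*(volatile uint8_t *)0xC0E0u)
#define SPI_CS_RELEASE   (*(volatile uint8_t *)0xC0E1u)
#define SPI_CLOCK_BIT_0  (*(volatile uint8_t *)0xC0E2u)  /* clock out a 0 bit */
#define SPI_CLOCK_BIT_1  (*(volatile uint8_t *)0xC0E3u)  /* clock out a 1 bit */

/* Shift one byte out MSB-first by touching the appropriate soft switch per bit. */
static void spi_send_byte(uint8_t b)
{
    uint8_t i;
    for (i = 0; i < 8; ++i) {
        if (b & 0x80) {
            (void)SPI_CLOCK_BIT_1;
        } else {
            (void)SPI_CLOCK_BIT_0;
        }
        b <<= 1;
    }
}

/* Stream an FPGA configuration image out the bit-banged SPI interface. */
void spi_send_bitstream(const uint8_t *data, uint16_t len)
{
    uint16_t i;
    (void)SPI_CS_ASSERT;
    for (i = 0; i < len; ++i) {
        spi_send_byte(data[i]);
    }
    (void)SPI_CS_RELEASE;
}
```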
For testing, I’ll manually test the initial boards on real Apple II hardware. If and when that becomes unsustainable, I will design a testing board that’s driven by a microcontroller. I was hoping to avoid it, but… probably can’t. The MCU doesn’t need to fully emulate an Apple II, it just needs to write test patterns to addresses on the board, and then verify the board output signals. A loopback test driven by the FPGA could also work, but I think the MCU approach will be simpler in the end. There are probably too many external connections for a loopback test to be worth it, something like 30 inputs and 12 outputs that must be verified but aren’t directly connected to the FPGA, so it’s going to need an external harness or adapter board no matter what.
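If I do eventually build that MCU-driven tester, its core loop might look something like this. bus_write(), bus_read(), and read_disk_outputs() are hypothetical helpers for whatever MCU pins end up wired to the card, and the addresses and expected values are placeholders:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical hardware-access helpers. On the real tester these would toggle
   the MCU pins wired to the Yellowstone card edge and disk connectors; here
   they are stubs so the test flow can be sketched and compiled. */
static void    bus_write(uint16_t addr, uint8_t value) { (void)addr; (void)value; }
static uint8_t bus_read(uint16_t addr)                 { (void)addr; return 0x00; }
static uint8_t read_disk_outputs(void)                 { return 0x00; }

/* Spot-check a few bytes of the card's synthetic ROM against the expected image.
   The addresses and expected values here are placeholders. */
static bool test_rom(void)
{
    static const struct { uint16_t addr; uint8_t expect; } checks[] = {
        { 0xC600, 0xA2 },   /* placeholder: first bytes of the slot 6 boot ROM */
        { 0xC601, 0x20 },
    };
    for (unsigned i = 0; i < sizeof(checks) / sizeof(checks[0]); i++) {
        if (bus_read(checks[i].addr) != checks[i].expect)
            return false;
    }
    return true;
}

/* Drive patterns into a card register and confirm the disk-side outputs follow.
   The register address and the expected output mapping are placeholders. */
static bool test_disk_signals(void)
{
    static const uint8_t patterns[] = { 0x00, 0xFF, 0xAA, 0x55 };
    for (unsigned i = 0; i < sizeof(patterns); i++) {
        bus_write(0xC0E0, patterns[i]);
        if (read_disk_outputs() != patterns[i])
            return false;
    }
    return true;
}

int main(void)
{
    bool ok = test_rom() && test_disk_signals();
    printf(ok ? "PASS: light the green LED\n" : "FAIL\n");
    return 0;
}
```

Most of the real work would be in the two bus helpers, which have to approximate Apple II bus timing closely enough for the card’s logic to respond normally.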