
Archive for October, 2011

A Working Hardware Replica of the Mac Plus

Plus Too is a home-made replica of a classic Macintosh computer using an FPGA. The project reached a major milestone yesterday, booting to the Finder for the first time, and running several programs. Since then I’ve been getting many inquiries, and because not everyone’s been following the project since its beginning, I’ve created a Plus Too project summary page to document the progress so far and my plans for the next version. If you’re new to Plus Too, please begin by reading the project summary page.

The screenshot above shows a MacWrite document opened with Plus Too. Because there’s no keyboard support yet, the document was created on another computer and added to the Plus Too boot disk image. Below the Macintosh screen region, hardware debugging information is displayed in green. This debug overlay is possible because a pixel-doubled 512 x 342 Mac image conveniently leaves some extra vertical space on a 1024 x 768 VGA display. From left to right, the debugging information shows the current state of the CPU address bus, data bus in, data bus out, address strobes, previous address, and breakpoint address. A poor man’s breakpoint system is implemented by setting a breakpoint address with panel switches. When the address bus matches the breakpoint address, the CPU’s memory transfer acknowledge signal is withheld, effectively pausing the CPU.
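For readers who think better in software, here is a rough Python model of that breakpoint trick. It is only a sketch of the idea, not the FPGA logic itself: the function names are made up for illustration, and masking addresses to 24 bits is an assumption based on the 68000's 24-bit address bus.

```python
ADDRESS_MASK = 0xFFFFFF  # assumption: compare only the 68000's 24 address bits


def dtack_asserted(address_bus: int, breakpoint_address: int,
                   breakpoint_enabled: bool = True) -> bool:
    """Return True if the memory transfer acknowledge would be given.

    The acknowledge is withheld (False) while the address bus matches
    the breakpoint address set on the panel switches.
    """
    hit = breakpoint_enabled and (
        (address_bus & ADDRESS_MASK) == (breakpoint_address & ADDRESS_MASK)
    )
    return not hit


def run_until_breakpoint(addresses, breakpoint_address):
    """Replay a list of bus addresses and stop at the first stalled access."""
    completed = []
    for address in addresses:
        if not dtack_asserted(address, breakpoint_address):
            break  # the CPU would sit here waiting for an acknowledge
        completed.append(address)
    return completed
```

Replaying a short, hypothetical trace such as `run_until_breakpoint([0x400000, 0x400002, 0x400004], 0x400002)` completes only the first access, which is the "pause" described above.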

The current system is implemented entirely with an unmodified Altera DE1 FPGA development board. The next version will use a custom-designed circuit board instead of the DE1 kit. The revised Plus Too will use a real 68000 CPU, and will add a microcontroller to manage the floppy disk SD card interface. It will also add the physical connectors necessary to use a real Mac Plus mouse and keyboard if desired.

What Works

The current system recreates a computer similar to a Mac 512Ke, with 512K of RAM and no SCSI. It boots from a System 6.0.8 floppy disk image stored in ROM. The disk image is pre-encoded into a series of virtual tracks and sectors, with the proper low-level layout, header, footer, checksum, and GCR disk byte format. This encoding is performed offline, using a custom-made utility program. Applications can be launched from the disk and run normally.

What Doesn’t Work

The disk is read-only, and there’s no keyboard, sound, SCSI, serial ports, real-time clock, or parameter RAM. The planned SD card interface for loading disk images hasn’t yet been built. There are some obvious stability problems, and the system tends to freeze up if the mouse is moved too rapidly. Disk I/O seems strangely slow, slower even than on a real Mac 512Ke or Mac Plus. There’s a long, long way to go before this project could be considered “done”, but it’s an exciting start!


Plus Too – Hello World!

Plus Too works! Holy cow, it really works. Hot damn!

I didn’t want to tackle SD card loading yet, so the GCR-pre-encoded 800K disk image resides in ROM, just above the Macintosh ROM image. The floppy drive module uses the video module’s memory access time slot during hblank periods to load disk data, transparently to the CPU.

Plus Too ran for about five minutes while I took these photos, then it locked up. Not bad for the first boot.

Now, to celebrate with a cold beer!


Newark/Farnell Product Review

It seems the Big Mess o’ Wires blog has reached the threshold where companies will send their products to be reviewed. I’m flattered. Newark (or Farnell for those of you in Europe, Element 14 in Asia) has kindly offered to send me a product of my choice from their catalog, anything 25 GBP (about $39) or less.

What kind of electronics parts are people interested in seeing reviewed? Tools? Development boards? Programmers? Kits? Specific ICs? My first thought was a demo board for one of the less common microcontrollers (not AVR or PIC), like the MSP430 or an ARM variant.

Check out www.newark.com, and post a comment if there’s something specific you’d like to see reviewed here.


Happy Mac

Things are starting to warm up now. I’ve successfully booted Plus Too as far as the Happy Mac startup icon! It’s not booting all the way into the Finder yet, but most of the tricky business with the IWM, floppy, and disk encoding schemes has been proven to work. Hooray!

If you can tolerate some shaky-cam video, here’s a movie that demonstrates Plus Too’s current capabilities. It shows booting to the question-mark disk screen, moving the mouse, inserting a blank disk, ejecting a disk, inserting a fragment of a System 3.3 startup disk, and the Happy Mac.

I’ve divided all of Plus Too’s disk-related functions into three parts: IWM, drive, and disk. The IWM is the floppy disk controller chip in the classic Macintosh, and my model of the IWM is finished and working. The Plus Too drive model replicates the brains of a 3.5 inch Sony floppy drive, which has sixteen 1-bit status and control registers. The drive model is mostly done, but there are still a few functions related to disk swapping and disk writing that are incomplete. The disk model replicates the GCR encoded data format of a 3.5 inch Macintosh disk, and is where more work is needed in order to boot to the Finder.

Plus Too is intended to load 400K/800K disk images from an SD card, perform on-the-fly GCR encoding for each sector, and pass the result to the drive model and IWM. That part isn’t working yet, so I took some shortcuts in the test shown in the video. The GCR encoding was done offline with a Windows PC, using a custom program I wrote. Then five sectors of the encoded data were stored in a block ROM inside the FPGA itself. Five sectors isn’t much, but it’s all I had space for, and it’s enough for the Mac to recognize a boot disk and show the Happy Mac icon. Testing the boot sequence this way enabled me to confirm that the GCR encoding algorithm is correct, and that the IWM and drive models are working, even before the SD card interface and on-the-fly encoding module are ready.

Next Steps

The next logical step is to implement an SD card reader interface, so I can load encoded data from the card instead of from the limited FPGA memory. Once that’s done, I should be able to boot all the way to a working Finder. For a read-only system I technically don’t need to do any more than that, but doing the encoding on the fly instead of with an offline tool would be much nicer. To support disk writes, on-the-fly encoding (and decoding) will be a necessity. The encoding and decoding algorithm is somewhat complex, and I’m unsure whether to attempt to design a Verilog state machine to do it, or incorporate a simple microcontroller core (maybe even Tiny CPU) and do it with a conventional program instead.

Other Concerns

There are all kinds of timing problems and glitches hiding just beneath the surface, and I’m worried. Every now and then I’ll make a change that causes Plus Too to exhibit broken behavior or fail to boot, even some innocuous change that definitely doesn’t affect the logic. Just today I made a change that caused an unexplained boot failure, and in the latest version I get random mouse droppings when the mouse is in a certain area of the screen.

Usually if I rearrange some modules or make some other superficial change, the problem will disappear, but that’s a very scary situation. There’s no doubt I need to master the Altera timing constraints editor to sort it all out, but my earlier attempts to make sense of it were dismal failures. Unfortunately, it doesn’t seem to be possible to translate a statement like “external signals D15-D0 must be valid no more than 50ns after the clock edge” into a simple constraint that I can enter somewhere. The whole system seems geared toward me writing custom Tcl scripts, which so far I’ve refused to do. Reading through the documentation, my eyes quickly glaze over and I wonder again why this all has to be so complicated.

A few concerns remain in the drive subsystem as well. With my current test setup, I always return the same five sectors of data, regardless of what track or side is actually being accessed. It’s possible there’s some hidden complexity there that I’ll need to address, or that I’ll incorrectly map disk image data to the wrong sectors, or that the method I’m using to determine what track and sector is being accessed isn’t even valid. This is a fairly small detail, though, and I’m optimistic I’ll be able to extend the current model to support all tracks and sides without major problems.


Crazy Disk Encoding Schemes

Wow. I expected the details of the Macintosh floppy data encoding to be a bit complex, but this is worse than I expected. I think I finally understand it well enough to duplicate it, but I can’t explain why it does what it does. Maybe whoever extended Woz’s code from the Apple II was just in a bad mood.

I’ve been focusing my attention on how a single sector’s data is represented on the disk. Most of it is fairly easy to understand, once you’ve found a reference. Each sector consists of an address block and a data block. Between the blocks are $FF sync bytes. The address block begins with a specific header ($D5AA96, a sort of secret password for old Apple II hackers), then five encoded bytes containing the disk format, track number, sector number, and a checksum, and ends with a specific trailer sequence. The data block begins with a different header, then the sector number, the encoded sector data, and a trailer sequence.

It’s the “encoded data” step where things start to get tricky. Logical data bytes must be encoded into disk data bytes before being written to disk. This is due to physical limitations of the magnetic disk media: bytes with too many consecutive zero bits cannot be stored reliably. Of the 256 possible byte values, only 64 (or is it 67?) values can be stored on disk reliably, so the Mac encodes six bits of logical data at a time into one of the 64 “safe” disk byte values, in a process called 6-and-2 GCR encoding. There’s a 64-entry table in the Macintosh ROM for converting six bits of logical data into the corresponding disk byte, which was often called a nibble (even though it’s not 4 bits). When reading a sector, the process is applied in reverse.
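The table lookup itself is the easy part, and a small Python sketch shows the shape of it. One loud assumption: the 64 values below are the commonly published Apple 6-and-2 translate table, reproduced here for illustration and not dumped from the Macintosh ROM, so check them against the ROM before relying on them. The function names are hypothetical.

```python
# Assumption: the widely published Apple 6-and-2 translate table,
# not read out of the Mac ROM. Verify before relying on it.
GCR_TABLE = bytes([
    0x96, 0x97, 0x9A, 0x9B, 0x9D, 0x9E, 0x9F, 0xA6,
    0xA7, 0xAB, 0xAC, 0xAD, 0xAE, 0xAF, 0xB2, 0xB3,
    0xB4, 0xB5, 0xB6, 0xB7, 0xB9, 0xBA, 0xBB, 0xBC,
    0xBD, 0xBE, 0xBF, 0xCB, 0xCD, 0xCE, 0xCF, 0xD3,
    0xD6, 0xD7, 0xD9, 0xDA, 0xDB, 0xDC, 0xDD, 0xDE,
    0xDF, 0xE5, 0xE6, 0xE7, 0xE9, 0xEA, 0xEB, 0xEC,
    0xED, 0xEE, 0xEF, 0xF2, 0xF3, 0xF4, 0xF5, 0xF6,
    0xF7, 0xF9, 0xFA, 0xFB, 0xFC, 0xFD, 0xFE, 0xFF,
])

# Reverse mapping used when reading a sector back.
GCR_DECODE = {disk_byte: value for value, disk_byte in enumerate(GCR_TABLE)}


def encode_six_bits(value: int) -> int:
    """Map a 6-bit logical value (0..63) to its disk byte."""
    if not 0 <= value < 64:
        raise ValueError("value must fit in six bits")
    return GCR_TABLE[value]


def decode_disk_byte(disk_byte: int) -> int:
    """Map a disk byte back to its 6-bit logical value."""
    if disk_byte not in GCR_DECODE:
        raise ValueError(f"0x{disk_byte:02X} is not a valid disk byte")
    return GCR_DECODE[disk_byte]
```

Every disk byte in this table has its high bit set, and decoding is just the same table read backwards.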

All of this I more-or-less already knew before I began. I expected to find a routine somewhere in ROM that grabs three logical data bytes at a time (24 bits), and shifts out six bits at a time, using the GCR lookup table to produce four disk bytes. Once I found the disk sector write routine, it became clear it does more than that. My first hint was this French page about Apple II DOS 3.3, whose low-level disk format is very similar to the Mac’s. According to this page, data values aren’t used directly as indexes into the GCR table. Instead, each data byte is XOR’d with the previous byte, and the result is used as the index into the GCR table. Why? This is where I fail, because I have no idea why. It seems somehow related to checksumming the data, but it would be easier to use the data values as direct GCR table indexes, and then use the sum of all data values as a checksum.
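To make those two ideas concrete, here is a Python sketch of the simple scheme I expected (three bytes in, four 6-bit table indexes out) plus the running XOR that the French page describes. This is a sketch of the descriptions above only, not the Mac ROM routine, and the function names are invented for illustration.

```python
def split_into_six_bit_groups(data: bytes) -> list[int]:
    """Naive scheme: take 24 bits at a time and shift out four 6-bit values."""
    if len(data) % 3 != 0:
        raise ValueError("data length must be a multiple of 3")
    groups = []
    for i in range(0, len(data), 3):
        word = int.from_bytes(data[i:i + 3], "big")
        groups.extend((word >> shift) & 0x3F for shift in (18, 12, 6, 0))
    return groups


def xor_with_previous(values: list[int]) -> list[int]:
    """Running XOR: each value is XOR'd with the previous logical value.

    Assumption: the value before the first one is treated as zero.
    """
    result = []
    previous = 0
    for value in values:
        result.append(value ^ previous)
        previous = value
    return result


def undo_xor_with_previous(encoded: list[int]) -> list[int]:
    """Inverse of xor_with_previous, as a reader would apply it."""
    result = []
    previous = 0
    for value in encoded:
        previous = value ^ previous
        result.append(previous)
    return result
```

Each output of `xor_with_previous` would then be used as the index into the GCR table. The XOR step is reversible, so at least it costs nothing in terms of recoverability.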

An unexplained running XOR-based index is strange, but I could live with that if it were the only unexplained part. Unfortunately it seems that either the French page is incomplete, or else the Mac encoding method is more complex than Apple DOS 3.3 encoding. I’ve stared at the 68000 assembly code in the ROM routine for quite a while, as well as C re-implementations from MESS and from Ben Herrenschmidt, trying to grasp some kind of high-level purpose in it, but it just seems arbitrary to me.

Instead of XOR-ing each value with the previous one, it XOR’s each value with the sum of all previous values back to the beginning with a stride of three. For example, the 10th value is XOR’d with the sum of the 9th, 6th, 3rd, and 0th values. To facilitate this, three running sums are maintained for the values on the 0th, 1st, and 2nd stride. But wait, it’s more complicated than that. After every 3 logical bytes, the sum for stride 2 is rotated left one bit position, and the bit that’s rotated out is added into the 0th stride sum, and any overflow there is added into the 1st stride sum, whose overflow is added to the second. Or something like that. It’s all a little crazy.


Building a Halloween LED Display

This Halloween, my daughter and I are working on a large LED display board for the yard. We started with a 3 x 6 foot pegboard with a grid of 24 x 48 holes pre-drilled, and 100 each of red, orange, yellow, and green LEDs. 5 mm (T-1 3/4) LEDs fit pretty nicely into the pegboard holes. Her job is to design some interesting animated shapes for the board, and my job is to figure out how to wire the whole thing up and power it.

My initial thought was to fill only the holes needed for a few specific Halloween shapes, and connect the LEDs using some ad-hoc point to point wiring. But the more I thought about measuring, cutting, stripping, and soldering hundreds of different ad-hoc wires, the more I hated the idea.

The Grid

Wiring up the board as a regular grid should be much easier, even if not all the grid points are used. The simplest method would be to lay down lengths of bare wire horizontally and vertically between the pegboard holes, using a plastic or rubber spacer at the crossing points to ensure the row and column wires don’t make electrical contact. To add a new LED to the grid, you’d just bend one leg down and solder it to the nearest row wire, and bend the other leg to the right and solder it to the nearest column wire. If the column wire was supplied a positive voltage while the row wire was grounded, then the LED at the intersection would illuminate. The great thing about this arrangement is that it wouldn’t require any wiring changes if you later need to change the shape of a jack-o-lantern’s eye or a bat’s wing: you could just add a new LED to the grid, and alter the software that controls it.

Driving this grid would require 24 row lines and 48 column lines. The columns would be scanned rapidly one at a time, with all the row lines driven simultaneously to control the state of the LEDs in that column. If the columns were scanned rapidly enough, the human eye would perceive the whole display at once. The 24 row lines would most likely come from a shift register connected to the MCU, while the column lines would come from some type of decoder that enables exactly one of 2^N columns using N input bits from the MCU.
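In software terms, the scan described above might look something like this Python sketch. It only models the data that would be sent for each column, with no real hardware I/O, and the frame layout (a list of 24 rows, each a list of 48 on/off values) is an assumption for illustration.

```python
ROWS = 24
COLS = 48


def column_row_bits(frame, col: int) -> int:
    """Pack the on/off state of every row in one column into a 24-bit word.

    Bit N of the result is 1 if the LED at (row N, col) should be lit.
    This is the word that would be shifted out to the row shift register.
    """
    bits = 0
    for row in range(ROWS):
        if frame[row][col]:
            bits |= 1 << row
    return bits


def scan_once(frame):
    """Return one full scan as (column select value, row bits) pairs.

    A real driver would loop over this list continuously, enabling one
    column through the decoder while the row bits for it are latched.
    """
    return [(col, column_row_bits(frame, col)) for col in range(COLS)]
```

With a single LED lit at row 2, column 5, the scan produces a row word of `0b100` for column 5 and zero for every other column.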

Splitting the Grid

Scanning through 48 columns one at a time would work, but with a 1/48 duty cycle, the LEDs would look very dim. A 1/8 or 1/4 duty cycle is more realistic in order to create an acceptably bright display. To achieve a 1/8 duty cycle, the grid would have to be divided into 6 independent sub-grids, each with 24 rows and 8 columns, with 192 LEDs per fully-populated sub-grid. Each of these sub-grids would require its own 24-bit shift register for the rows, and a 3:8 decoder for the columns.

Power requirements would likely force the sub-grids to be even smaller. Assuming 10 mA per LED, a fully-populated 24 x 8 sub-grid with all LEDs on would see 1.92 amps of current at the decoder, which is way too much for any common IC. After looking at some typical purpose-made LED driver chips, it looks like 300 mA is a more realistic upper limit.

I have a sample of the MAX7219 LED driver, which combines 8 row drivers and 8 column drivers with built-in column-scanning circuitry into a single chip. It can control 64 LEDs, with a total current limit of 320 mA. Each LED is driven with 40 mA when its column is enabled, and the 1/8 duty cycle results in an average drive current of 5 mA per LED. With 1152 total grid positions, it would require subdivision into 18 sub-grids to control the full grid with 18 MAX7219s. For a brighter display, the MAX7219 could be configured with only 4 columns and a 1/4 duty cycle, for an average drive current of 10 mA per LED. That would require twice as many sub-grids and twice as many MAX7219s.
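The arithmetic behind those numbers is easy to check with a few lines of Python. This is just a back-of-the-envelope sketch using the figures quoted above (40 mA peak per LED, 8 rows per chip); the function and parameter names are made up for illustration.

```python
def subgrid_plan(total_leds: int = 24 * 48,
                 rows_per_chip: int = 8,
                 columns_scanned: int = 8,
                 peak_ma_per_led: float = 40.0):
    """Return (chips needed, average mA per LED) for a scanned sub-grid.

    Each chip drives rows_per_chip x columns_scanned LEDs, and each LED is
    only lit while its column is enabled, so the average current is the
    peak current divided by the number of columns scanned.
    """
    leds_per_chip = rows_per_chip * columns_scanned
    chips = -(-total_leds // leds_per_chip)  # ceiling division
    average_ma = peak_ma_per_led / columns_scanned
    return chips, average_ma
```

`subgrid_plan()` gives 18 chips at 5 mA average per LED, and `subgrid_plan(columns_scanned=4)` gives 36 chips at 10 mA, matching the two options described above.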

Cost Estimate

At roughly $8 per MAX7219, the cost of driver chips would be $144 for the normal display, or $288 for the brighter display. That’s a bit out of the budget range for a weekend Halloween project! I suspect I could save a significant amount of money by using a simpler driver chip, but so far a specific solution has eluded me. The row drivers need constant current outputs, with a relatively high current limit in the 40 mA range. The column drivers (assuming they’re a separate chip and not combined like the MAX7219) need a very high current limit in the 300+ mA range.

The best option I’ve found for the row drivers is the STP08DP05B1, a $1.82 chip with eight constant-current outputs able to supply up to 100 mA per row and 800 mA total. I’m having a little trouble understanding the datasheet, though, and I’m not certain those are truly the per-pin current numbers, and whether they’re for sourcing or sinking current.

I’ve not yet found any good options for the column drivers, probably because I don’t know what such things would be called. High-power decoders? Analog demux switches? I need something with three digital inputs, that enables one of eight analog outputs, and can supply as much current as possible to that output, but at least 300 mA.

I could also use standard low-power shift registers and decoders, connected to separate high-power transistors. That would greatly increase the total number of components needed, though, since I’d need a separate transistor for every row and column in every sub-grid.

No matter what the eventual solution, it appears the cost for driver electronics will be at least $50 or so, and possibly much more.

Power

The power requirements of the full grid would be non-trivial. Assuming I used 18 of the MAX7219 drivers, the maximum current draw would be about 6 amps! That’s well outside the capability of any wall-wart 5V supply in my collection, and would probably require some type of bench supply or switching power supply scavenged from an old PC. However, the display would only draw 6A if it were fully populated with all LEDs simultaneously turned on. With a less than fully populated grid, and some upper limit on the fraction of LEDs that could be on at once, the power requirements might be reduced to a more reasonable level.
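A quick Python estimate shows where the 6 amp figure comes from and how much a partly populated grid might help. It assumes the supply current scales linearly with the fraction of grid positions populated and the fraction of those LEDs lit at once, which is a simplification, and the names are hypothetical.

```python
def supply_current_amps(driver_count: int = 18,
                        max_ma_per_driver: float = 320.0,
                        populated_fraction: float = 1.0,
                        lit_fraction: float = 1.0) -> float:
    """Estimate the worst-case supply current for the whole display.

    Assumption: current scales linearly with how much of the grid is
    populated and how many of those LEDs are on at the same time.
    """
    total_ma = (driver_count * max_ma_per_driver
                * populated_fraction * lit_fraction)
    return total_ma / 1000.0
```

With the defaults this works out to 5.76 A, which rounds to the "about 6 amps" above. A half-populated grid with at most half of its LEDs lit would need roughly a quarter of that.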

Solutions?

Since I’m not about to give up on this display entirely, nor wire up a few hundred LEDs with ad-hoc point to point wiring, some variation of the grid approach looks like the only viable solution. That means I’ll be spending a lot of money on driver electronics and a high-current power supply. To help keep the cost and assembly time down, I’ll probably only build a subsection of the entire 24 x 48 grid, but do it in a way such that the rest of the grid can be populated over the course of future holidays. 🙂

Edit: Thinking about this further, I’m starting to doubt whether this project is a very good idea. 5mm LEDs with a 1 inch grid spacing will do a very poor job of space filling, so this display likely won’t look very good. It will appear more like a bunch of isolated point lights than adjacent pixels in an image. That’s OK for abstract displays, but not so good for making a recognizable image of a jack-o-lantern.

My bigger concern is safety. Combining a 6A power supply with a grid of bare wires on the back of the board sounds pretty dangerous. Curious fingers could get a significant shock.

