A CPU in a CPLD
OK, the CPU design spark is back, sooner than I’d expected. I have an urge to implement a minimal CPU using a CPLD. If you’re not familiar with the term, a CPLD is a simple programmable logic chip, existing somewhere on the complexity scale between PALs (like the 22V10s I used in BMOW) and FPGAs. Typically a CPLD has a similar internal structure to PALs, with macrocells containing a single flip-flop and some combinatorial logic for sum-of-products expressions. They are also non-volatile like PALs. Yet typical CPLDs contain 10x as many macrocells as a PAL, with some macrocells used for internal purposes and not connected to any pin. FPGAs are generally much larger and more complex, with thousands of logic cells and specialized hardware blocks for tasks like multiplication and clock synthesis. FPGAs also normally contain some built-in RAM, and are themselves RAM-based, requiring configuration by some other device whenever power is applied.
I’m attracted to CPLDs because I’m hoping they’ll provide a good step up from PALs, without drowning me in FPGA complexity, as happened when I worked on 3D Graphics Thingy. I’m pretty confident I can figure out how to work with CPLDs without driving myself crazy, increasing the chances that I might actually finish this project. Given the limited hardware resources of CPLDs, fitting a CPU will also be an interesting challenge.
I’ve also been wanting to design my own custom PCBs for quite some time, and this will give me an opportunity. The end goal of this project will be a single-board computer on a custom PCB, with my CPLD-CPU, RAM, ROM, some input buttons/switches, and some output LEDs/LCD. I need to limit myself to CPLDs that come in a PLCC package, so I can use a through-hole socket and solder it myself. Unfortunately that will limit my choices pretty severely. I think it’s theoretically possible to hand-solder the more common TQFP surface-mount package, but I’m not excited to try it. And for other package types, forget it.
Here’s some back-of-the-envelope figuring to get the ball rolling. This is assuming an 8-bit CPU with a 10-bit address space (1K).
I/O pins needed:
- 8 data bus
- 10 address bus
- 1 clock
- 1 /reset
- 1 /irq
- 1 read-write
- ~4 chip selects for RAM, ROM, peripherals
That’s 26 I/Os. So a PLCC-44 package should be fine, as CPLDs in that package typically have about 34 I/Os.
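For anyone who wants to play with the numbers, here is the same pin tally as a small Python sketch. The dictionary keys are just labels for the list above, and the 34-I/O figure is my rough "typical" number for a PLCC-44 part, not a value taken from any particular datasheet.

```python
# Pin budget from the list above. PLCC44_TYPICAL_IO is a rough figure
# assumed for a PLCC-44 CPLD, not a datasheet value.
io_pins = {
    "data bus": 8,
    "address bus": 10,
    "clock": 1,
    "/reset": 1,
    "/irq": 1,
    "read-write": 1,
    "chip selects": 4,  # listed as "~4" in the estimate
}
PLCC44_TYPICAL_IO = 34

total_io = sum(io_pins.values())
spare_io = PLCC44_TYPICAL_IO - total_io
print(f"I/Os needed: {total_io}, spare: {spare_io}")
```

Changing the address bus width or the number of chip selects shows quickly how much headroom is left in the package.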
Macrocells needed for holding CPU state:
- 10 program counter
- 10 stack pointer
- 10 scratch/address register
- 8 opcode register
- 3 opcode phase
- 8 accumulator
- 8 index register
- 3 ALU condition codes
That’s 60.
Then I’ll need some macrocells for combinatorial logic. This is a lot harder to predict, and in many cases I should be able to use the combinatorial logic resources and the flip-flop from a single macrocell. I’ll just pull some numbers out of thin air.
Macrocells needed for combinatorial logic:
- 16? arithmetic/logic unit (8-bit add, AND, OR, shift, etc)
- 16? control/sequencing logic
- ??? other stuff I forgot
So that’s a grand-total of 92 macrocells for everything.
If I shrank the address space down, and maybe changed to a 4-bit word size, I might be able to fit it in a 64 macrocell CPLD. But more than likely, it seems I’ll be looking for a CPLD in the 100 to 128 macrocell range. Considering my requirement for PLCC packaging, that will limit the choices to two or three possibilities, but more on that later.
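The macrocell estimate can be tallied the same way. This is only a sketch of the arithmetic above: the combinatorial logic numbers are the same thin-air guesses, the "other stuff I forgot" item is left out because it has no number, and the two device sizes checked are simply the 64 and 128 macrocell sizes mentioned in the text.

```python
# Flip-flops needed to hold CPU state, one macrocell each.
state_bits = {
    "program counter": 10,
    "stack pointer": 10,
    "scratch/address register": 10,
    "opcode register": 8,
    "opcode phase": 3,
    "accumulator": 8,
    "index register": 8,
    "ALU condition codes": 3,
}

# Guessed macrocell counts for pure combinatorial logic.
logic_guess = {
    "arithmetic/logic unit": 16,
    "control/sequencing logic": 16,
}

state_total = sum(state_bits.values())
total = state_total + sum(logic_guess.values())

# Check the estimate against the device sizes discussed in the post.
fits = {size: total <= size for size in (64, 128)}

print(f"state: {state_total}, total: {total}")
for size, ok in fits.items():
    print(f"{size}-macrocell CPLD: {'fits' if ok else 'too small'}")
```

Shrinking the three 10-bit registers or the 8-bit data registers in `state_bits` shows how far the design would have to be cut down to fit the smaller part.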
I think the most challenging part of this project will be the control/sequencing logic, and the assignment of opcodes. BMOW was microcoded, and used a separate microcode ROM to execute a 16-instruction microprogram to implement each CPU instruction. In this case, I’ll need to create dedicated combinatorial logic to drive all the enables, selects, and other inputs in the right sequence to ferry data around the CPU to execute the instructions. Doing this with minimal logic will be a real challenge, and undoubtedly I’ll be using the bits of the opcode itself to derive many of those control signals.
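To make the idea of hardwired control concrete, here is a toy Python model of it. Everything about the encoding is invented for illustration: the bit assignments, the signal names, and the phase numbering are assumptions, not the opcode map I will actually use. The point is only that every control signal is a pure function of the opcode bits and the current phase, with no microcode ROM involved.

```python
def control_signals(opcode: int, phase: int) -> dict:
    """Toy hardwired decoder: signals depend only on opcode bits and phase.

    Hypothetical encoding (not the real design):
      bits 7..6  ALU operation select
      bit 5      operand comes from memory (adds one extra phase)
      bit 4      result is written to the accumulator
    """
    alu_op = (opcode >> 6) & 0b11
    uses_memory = bool(opcode & 0b0010_0000)
    writes_acc = bool(opcode & 0b0001_0000)

    fetch = phase == 0
    # Memory operands need phase 1 for the read, so execution moves to phase 2.
    execute_phase = 2 if uses_memory else 1

    return {
        "load_opcode": fetch,
        "pc_increment": fetch,
        "mem_read": fetch or (uses_memory and phase == 1),
        "alu_op": alu_op,
        "load_acc": writes_acc and phase == execute_phase,
        "phase_reset": phase == execute_phase,
    }


# Walk one made-up instruction through its phases.
for phase in range(3):
    print(phase, control_signals(0b0111_0000, phase))
```

In real logic each entry in that dictionary would be one sum-of-products expression, so the fewer opcode bits and phases a signal depends on, the fewer product terms it costs.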
12 Comments so far
Well, I overlooked the fact that any output that’s not simply a pass-through of some already-existing processor state will require a macrocell, for the combinatorial logic to determine what value to output. In this case, that means I need 23 additional macrocells for the data and address busses, read-write signal, and chip selects. That brings the new estimated total to 115.
Best of luck Steve. I’ll be following this with interest.
I too am looking forward to this. I’d advise getting some more surface-mount experience and going for it. It would take a little more time perhaps, but I think the end result would be more rewarding. Besides, it’d give more hands-on fun!
Actually I think I like the limit of 128 macrocells that using a PLCC package creates. It’s big enough that I think I can do something pretty interesting, but small enough that it’s a challenge to find how to pack the most in the least space. So far I have most of the datapath done in Verilog, but none of the instruction logic yet, and I’ve used 75 macrocells.
I guess you will already have seen this – http://www.opencores.com/project,mcpu
Amazing to get a functional (just!) CPU in 32 macrocells, but it will be much more interesting to see what is possible with 128 macrocells.
Pretty cool project to try. Good luck!
My idea is to make a simple CPU that changes part/all of its own microcode sets. A 2-bit program data CPU could then prove actually useful. If I ever mess with this, I’ll be sure to send you a copy of the results. 😉 (Send me an email so I can add you to my address book if you want that, because I may lose the URL if it’s not in my address book.)
I’m thinking that a Harvard architecture with 4 instruction types and 4-word (8 bits) for address/data would be ideal for a small CPU. The limited number of inputs (hopefully multiplexed, too) would be within programmable logic devices’ pinouts. There is also in theory the simplest programmable state machine from Wolfram’s site, but it’s REALLY useless without infinite RAM and speed.
The funny thing is that I figured out that a cheap parallel EPROM (27Cxxxxx) or Flash ROM variant would be great as an instruction decoder for a CPU. The clocking would be slooooooow but it would work. Forrest Mims III wrote a book, I think, that showed a CPU with that kind of instruction decoding (he used a hand-assembled switcher-diode-array ROM though).
I’ve been thinking about doing something like this myself for a little while but just wanted to say great work both in what you’ve done and especially in writing it all up so well for others to enjoy and follow. It’s really encouraged me to start my own project in my little spare time 🙂
I’m a C++ developer by trade so Verilog is a little strange. It’s so hard to remember you are designing hardware, so you can’t do things like write to the same “variable” in two places without thinking about how the hardware is supposed to manage to do that. I’m learning fast though and it’s great fun.
I plan to do something a little higher level I guess. I’ll probably get a cheap FPGA breakout board and use that as the base rather than try to solder a 144-pin chip. I’d like to make a custom PCB for the rest of it though, and one day make a full board. I want to include VGA video and more external memory on mine, so I need to use an FPGA rather than a CPLD.
I’m testing it out on a Spartan-3AN eval board at the moment. You seem to prefer Altera parts. I was wondering if there was any special reason for this?
Hi John, sounds like you’re coming from a similar background as me. Development of Tiny CPU has clearly stalled, and it’s been almost a year since my last significant work, but I still hope to finish it. In fact it is more or less finished already. I still need to complete the schematic and route the board, but the design part is finished.
You probably have the same Spartan-3A eval board that I do. During development of 3D Graphics Thingy, I spent weeks trying to get the DDR memory interface working, but understanding Xilinx’s docs and navigating their wizards proved too difficult for me. In my limited experience, the Altera stuff seems a little more user-friendly and approachable for hobbyists. Maybe I was just unlucky with the Xilinx part that time, but I’ll be going with Altera for my next project.
I do hope you find the time to finish Tiny CPU, it’s a very nice project, and you’ve clearly put a lot of work into it already.
I’ve just been reading your posts on the 3D graphics thing and now I’m concerned. I had assumed that the RAM on the board was fairly simple static RAM where I could just put an address on some pins, assert a signal on another, and read the value back. But it doesn’t look anything like that simple.
There aren’t enough easily accessible pins on the board to interface to an external static RAM either, at least not without building something to connect to the expansion port.
Oh well, the board is still excellent for learning how these things work and I have a great deal of learning ahead of me before I even think of designing a CPU of my own.
I’ve seen some breakout boards on eBay that contain an FPGA, a configuration ROM, and an oscillator, as well as connectors to get at the pins. Probably I should obtain one of those for my project eventually.
John Burton, I got the same problem and ultimately had to apply solder to the pads to get it to make proper contact.
Steve, I’d love to hear what percentage of ROM-inator II buyers have this problem. It’s most likely widespread. If there is ever a revised version of this ROM SIMM in the future, I’d love to see one with more copper applied to the pads, thereby making it thicker at that point where thickness matters most. If you are only using 1oz copper now, doubling it to 2oz might work, or if you are using 2oz now, then you should go with 4oz.
Whoops, I think you’ve replied to the wrong topic, or the wrong person. See here for the latest on the ROM-inator II.
Sorry for my blunder, Steve. My speed-reading of email notification often gets me in trouble! But hopefully the extra copper could potentially prevent such problems from occurring in the future on the ROM-inator II. Best wishes!