Archive for June, 2010
Pin Constraints
I think I’ve found answers for all the big questions about Tiny CPU’s design, and I’m ready to start creating the final schematic and board layout, save for one issue: pin constraints. I need to constrain which signals are assigned to which pins, because once I create the Tiny CPU board, I can’t change those pin assignments again. If a minor bug fix to the Verilog code caused some enable signal to be reassigned to pin 63 instead of pin 20, I’d have to create a whole new revision of the board.
The normal way to address this problem is by specifying pin constraints in the design software, Altera’s Quartus II in this case. I’ve tried that with both the Tiny CPU and Tiny Device designs, which are two entirely separate CPLD projects, but it doesn’t work for either one. If I take the pin assignments that were originally chosen by the design software, enter constraints that specify that it must keep those same assignments, and then recompile the design, the software complains that the design won’t fit the device. The constraints that describe a fit that already succeeded suddenly no longer fit. This behavior is the same for both designs, and it persists after trying several different methods of specifying constraints at the pin, cell, and LAB levels, using back-annotation as well as manual constraint entry.
I’m going to move forward without constraints, since I don’t have a choice. With luck, any minor bug fixes at the Verilog level will still produce the same pin assignments that I have now. If not, I’ll be spending a lot of money manufacturing board revisions.
Tiny Device
Tiny CPU just barely squeezes into a 128-macrocell CPLD, and so I’ve designed a companion chip called Tiny Device to handle other interface functions. It utilizes a second CPLD, and was originally envisioned as a simple address decoder, but its functions have grown and grown again. The second CPLD is now completely filled as well, and Tiny Device has turned into the Swiss Army Knife of this computer system. Here’s a list of its capabilities:
Address Decoding
As I described in yesterday’s posting, the Tiny Computer memory map is more complex than just statically-mapped RAM and ROM ranges. Tiny Device employs bank switching to dynamically map one of 128 physical memory banks into the lower half of the CPU’s address space. It also manages eight I/O ports in the upper half of the address space, for controlling the bank switching and interfacing with other hardware.
Keyboard Interface
A PS/2 keyboard interface is provided, using a variation of the design I built from discrete logic for BMOW 1. Incoming keyboard bits are shifted into a register, and when a complete byte has been read, Tiny Device sets a status bit to inform the CPU. There is no buffering, so if the CPU doesn’t read the byte before the next one arrives (approximately 3 ms), data will be lost.
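To make the unbuffered behavior concrete, here is a small software model of a receive shift register. It is only a sketch: the class and function names are hypothetical, and the 11-bit frame (start bit, eight data bits LSB first, odd parity, stop bit) is the standard PS/2 device-to-host format rather than something taken from the Tiny Device Verilog, which may handle framing differently.

```python
# Software model of an unbuffered PS/2 receive register (illustrative only).
# Assumption: standard PS/2 frame = start(0), 8 data bits LSB first, odd parity, stop(1).

class KeyboardReceiver:
    FRAME_BITS = 11

    def __init__(self):
        self._bits = []      # bits shifted in so far for the current frame
        self.data = 0        # last complete byte
        self.ready = False   # status bit the CPU would poll

    def clock_in(self, bit):
        """Shift in one bit; latch the byte and set the ready flag on a valid frame."""
        self._bits.append(bit & 1)
        if len(self._bits) < self.FRAME_BITS:
            return
        start, *data_bits, parity, stop = self._bits
        self._bits = []
        odd_parity_ok = (sum(data_bits) + parity) % 2 == 1
        if start == 0 and stop == 1 and odd_parity_ok:
            # No buffering: an unread byte is simply overwritten.
            self.data = sum(b << i for i, b in enumerate(data_bits))
            self.ready = True

    def read(self):
        """CPU reads the byte, which clears the ready flag."""
        self.ready = False
        return self.data


def frame_bits(byte):
    """Build the 11-bit frame a keyboard would send for one byte."""
    data = [(byte >> i) & 1 for i in range(8)]
    parity = 1 - (sum(data) % 2)   # makes the total number of 1s odd
    return [0] + data + [parity, 1]


if __name__ == "__main__":
    rx = KeyboardReceiver()
    for b in frame_bits(0x1C) + frame_bits(0x32):   # second byte arrives before a read
        rx.clock_in(b)
    print(hex(rx.read()))   # the first byte has been lost
```

Sending two frames without an intervening read leaves only the second byte in the register, which is the data-loss case described above.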
Serial Input
Using a super-sampling technique, Tiny Device provides a virtual 57,600 bps serial port. Similar to the keyboard, incoming bits are shifted into a register, and a status bit signals the CPU when a byte is ready. A one-byte buffer is provided, so the CPU must read the byte before the final bit of the next byte arrives (173 us at 57,600 bps), or data will be lost.
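The 173 us figure can be checked with a little arithmetic. The sketch below assumes 8N1 framing (one start bit, eight data bits, one stop bit), which the post doesn’t state explicitly, and uses the 5 MHz system clock to express the same budget in clock cycles.

```python
# Time available to read a received byte before the one-byte buffer is overwritten.
# Assumption: 8N1 framing (1 start + 8 data + 1 stop bit), not stated in the post.
BAUD = 57_600
BITS_PER_FRAME = 10
SYSTEM_CLOCK_HZ = 5_000_000   # system clock produced by Tiny Device's clock divider


def frame_time_us(baud=BAUD, bits=BITS_PER_FRAME):
    """Duration of one serial frame in microseconds."""
    return bits * 1_000_000 / baud


def clocks_per_frame(baud=BAUD, bits=BITS_PER_FRAME, clock_hz=SYSTEM_CLOCK_HZ):
    """Number of system clock cycles that elapse during one frame."""
    return bits * clock_hz / baud


if __name__ == "__main__":
    print(f"{frame_time_us():.1f} us per byte")
    print(f"{clocks_per_frame():.0f} system clocks per byte")
```

Under those assumptions the CPU has a little under 870 clock cycles to notice the status bit and read each byte.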
Serial Output
A separate serial output circuit operates independently of input, providing for simultaneous two-way communication. A status bit informs the CPU when the output circuit is idle and ready to accept a new byte. The CPU must not send a new byte while the output circuit is still busy with the previous one.
LCD Interface
Tiny Device provides a basic interface for communicating with a 128 x 64 graphical LCD. Each of the LCD data and control lines is mapped to an I/O port, but no other control logic is implemented. The CPU is responsible for manipulating the lines as needed to communicate with the LCD, and observing its slow timing requirements.
Tick Count
An 8-bit tick counter is provided at one of the I/O ports. It increments every 3.2 us, and can be used to time the loops that generate audio, measure the period of time between events, or seed a random number generator.
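Because the counter is only 8 bits wide, it wraps around quickly, so interval measurements need modular subtraction. The following is a rough sketch with hypothetical function names. It also notes that 3.2 us equals 16 cycles of the 5 MHz system clock, which suggests a divide-by-16, though the post doesn’t say how the tick is derived.

```python
# Measuring short intervals with a wrapping 8-bit tick counter (illustrative sketch).
TICK_US = 3.2                 # tick period from the post
COUNTER_BITS = 8
SYSTEM_CLOCK_HZ = 5_000_000   # system clock from the post


def elapsed_ticks(start, end):
    """Ticks between two counter reads; correct across at most one wraparound."""
    return (end - start) & ((1 << COUNTER_BITS) - 1)


def elapsed_us(start, end):
    """Elapsed time in microseconds between two counter reads."""
    return elapsed_ticks(start, end) * TICK_US


def wrap_period_us():
    """Longest interval the counter can measure before it wraps."""
    return (1 << COUNTER_BITS) * TICK_US


def clocks_per_tick():
    """System clock cycles per tick (an inference, not stated in the post)."""
    return round(TICK_US * SYSTEM_CLOCK_HZ / 1_000_000)


if __name__ == "__main__":
    print(elapsed_ticks(250, 4))   # counter wrapped between the two reads
    print(wrap_period_us())
    print(clocks_per_tick())
```

Any interval longer than the wrap period (a bit over 0.8 ms) is ambiguous, so longer delays would need software to count wraparounds.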
Clock Division
To provide timing flexibility, a 20 MHz oscillator is divided by four to create the 5 MHz system clock. This allows the system clock speed to be increased or decreased later, without needing to replace the oscillator.
Output Port
A generic 8-bit output port provides direct control over two LEDs and a piezo speaker.
Status Flags
Tiny CPU queries the status register to get the current state of all I/O devices. This includes the ready flags for the keyboard, serial in, and serial out, as well as the current position of the up/down/set navigation switch.
All of this fits in 127 of the CPLD’s 128 macrocells, making Tiny Device an even tighter fit than Tiny CPU. The Tiny Device Verilog source can be viewed here.
Bank Switching
The Tiny CPU design has a 10-bit address bus — that’s all I could fit in the constrained space of the CPLD. 10 bits means 1K of addressable memory, which is practically nothing. I had planned to improve on this by using bank switching, but until recently, I hadn’t thought much about how it would work. After several days of ripping up one plan after another, here’s what I finally came up with.
With its 10-bit address bus, the CPU sees 1K of memory. This is divided into two 512-byte blocks. Block 1 contains the stack, I/O ports, and a scratch RAM area. It is the “common” block, and is always present in the CPU’s address space no matter what is happening with bank switching. In contrast, block 0 is a swappable memory area, and can be mapped to any bank in physical memory.
Physical memory is 64K, and is divided equally between ROM and RAM. The 64K physical memory space is partitioned into 128 banks of 512 bytes each. Any bank can be mapped into block 0. Bank 127 is always mapped into block 1, the common block.
The bank select register is part of the memory-mapped I/O ports in common memory. To swap a bank, the CPU only needs to write the new bank number to the appropriate address.
This may all seem pretty simple, but take a minute to consider some of the implications:
- Upon reset, bank 0 is mapped to block 0. That puts 512 bytes of ROM, 440 bytes of RAM, the I/O ports, and the stack all in the CPU’s address space. That’s plenty for many small programs, and means they won’t have to bother about bank switching at all.
- Larger programs (lots of program code) can be accommodated by bank switching code segments in/out of block 0, all operating on common data in block 1.
- Programs operating on large data structures can copy some bank-switching helper code to block 1, then swap additional RAM banks in/out of block 0.
- Arguments can be passed on the stack to ROM helper routines in other banks, because the stack is in common memory.
- All of ROM is addressable, with no holes. This makes storing images, audio samples, and other data in ROM much easier.
- There is no difference in handling between ROM and RAM banks. A program running entirely from RAM works just like one whose code is in ROM.
Failures
While this design seems relatively straightforward, it took me a painfully long time to arrive at it. I went through several days of failed designs before settling on this one.
My first attempt was to divide the 1K space into a 768 byte ROM region and a 256 byte RAM region, reasoning that most programs would have more code than data, and then simply tack the bank select register onto the existing address bus. So A0-A9 came from the CPU, and A10-A17 came from the register. That worked poorly, because it swapped the entire address space at once. With that scheme, there’s no easy way to grow the program code space while sharing the same data, or vice-versa. It also left holes of inaccessible memory within each 1K physical memory bank, and caused the stack to disappear when switching banks. With more tricks some of those shortcomings could be addressed, but it didn’t seem promising.
My second attempt extended the first by disconnecting some of the higher-order RAM address lines. This caused the entirety of RAM to appear several times in the physical address space, at the cost of having less total RAM available. So for example, RAM might be repeated eight times in the physical address space, so that in the 768/256 ROM/RAM partitioning for banks N and N+8, the RAM portion would be the same in the two banks while the ROM portion differed. That provided a way to handle larger programs with lots of code sharing the same data, but not programs working on large sets of data. It also still had the same problems with memory holes. And with the different sizes and fixed locations of the ROM and RAM regions, it would be difficult to bootload a program and run it from RAM in the same way it would normally run from ROM.
I think there was a third, fourth, and fifth idea too, but the details all sort of blur together. They mapped 1K chunks of physical memory into the 1K CPU address space in different ways, combined with a split of the 1K CPU address space into ROM and RAM regions. They all sucked. I went through a lot of pieces of paper.
Eventually I hit on the idea of tying the ROM/RAM select to the bank register, rather than the CPU address. I don’t know why it took me so long to think of that, when it seems obvious now. I devised a scheme with two blocks with a 768/256 split, where either block could be mapped to any 1K bank of physical memory. When mapping a bank to the 768-byte block 0, the upper 256 bytes were inaccessible. And when mapping a bank to the 256-byte block 1, the lower 768 bytes were inaccessible. Yet any byte in a bank was accessible as long as you mapped it to the appropriate block. This wasn’t TOO bad, but was certainly awkward, and it also required two different bank select registers (one for each block).
Finally I went for a long run in the hills near my home. I find that when I’m sitting at my desk, trying to find the solution to something difficult, it never comes. All my good ideas come to me either when I’m driving, in the shower, or out for a run. About three miles into the run, I thought about a 512/512 split, and realized that if the banks were also 512 bytes, I could fit a whole bank into a block, and eliminate all this weirdness. As I said, it seems obvious now.
There’s one slightly unintuitive aspect to this scheme, which is a result of having 512-byte banks but a 1024-byte address space. The CPU address bus is A0-A9, but A9 is actually not connected to the memory at all! That took me a while to grasp. Instead, A9 is used as a select input to the address decoding logic, and determines whether a memory reference is to block 0 or block 1. It operates the mux that selects either the contents of the bank select register or a fixed value for the upper address lines. The A9 that is actually connected to the memory chips is generated by the decoding logic and is not the CPU’s A9.
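The address translation described above can be modeled in a few lines. This is only a software sketch of what the decoding logic does, with hypothetical names; the real implementation is Verilog inside Tiny Device. CPU address bit A9 selects between the bank select register and the fixed common bank 127, and the chosen 7-bit bank number becomes the upper bits of the 16-bit physical address.

```python
# Model of Tiny Device's bank-switched address translation (illustrative sketch).
BANK_SIZE = 512      # bytes per bank and per CPU block
NUM_BANKS = 128      # 64K physical memory / 512 bytes
COMMON_BANK = 127    # always mapped into block 1


def translate(cpu_addr, bank_select):
    """Map a 10-bit CPU address to a 16-bit physical address."""
    if not 0 <= cpu_addr < 2 * BANK_SIZE:
        raise ValueError("CPU address must fit in 10 bits")
    if not 0 <= bank_select < NUM_BANKS:
        raise ValueError("bank number must fit in 7 bits")
    a9 = (cpu_addr >> 9) & 1          # block select; never reaches the memory chips
    offset = cpu_addr & 0x1FF         # A0-A8 pass straight through
    bank = COMMON_BANK if a9 else bank_select   # the mux operated by A9
    return (bank << 9) | offset


if __name__ == "__main__":
    print(hex(translate(0x000, 0)))    # block 0, bank 0
    print(hex(translate(0x1FF, 5)))    # block 0, bank 5
    print(hex(translate(0x200, 5)))    # block 1 ignores the bank register
```

Note that the lowest bit of the selected bank number lands on physical address bit 9, which is why the A9 seen by the memory chips is not the CPU’s A9.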
2D Graphics Thingy?
Maybe 3D Graphics Thingy isn’t dead after all. One of my goals for the creation of Tiny CPU was to get more comfortable with programmable logic design, and custom circuit board development, and I’ve certainly done that. Once Tiny CPU is done, I think I’m going to revisit graphics in the form of “2D Graphics Thingy”. My plan would be to work through the memory interface problems that stumped me the first time, and create something like a programmable 2D blitter for a VGA frame buffer. That seems like a manageable project I could expect to succeed at, while also being a step along the way toward the ultimate goal of 3DGT.
Power Research
I’d like Tiny CPU to be flexible with its power requirements. The Tiny Test Board that I made uses an external, 5V regulated power supply, and so doesn’t have any on-board voltage regulation. That works fine, but I’d like the flexibility to use other unregulated power supplies too, and especially the option to use battery power. That’s going to take a bit of work.
The supply current measurements that I took recently showed about 150 mA used by the test board, but the demands of the real Tiny CPU board will be much higher. The Tiny CPU board will have a second CPLD, RAM, and ROM that the test board lacks, adding another 150 mA or so. It’s also likely that I’ll substitute a large graphical LCD for the small text LCD on the test board, requiring still more current. 400 mA is probably a good guess for the total current needs of the real Tiny CPU board.
400 mA is not a trivial amount of current. If I used a 9V power supply, the drop across the voltage regulator would be 4V, and power dissipation would be 4V * 400 mA = 1.6 Watts. That would probably require a TO-220 voltage regulator with a heat sink. If I used a 12V power supply, the power dissipation would be even higher, at 2.8 Watts. In theory I could use a lower voltage supply, like 6V or 7V, depending on the capabilities of the voltage regulator used. However, 9V is the lowest voltage commonly available.
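The dissipation numbers above follow from a one-line formula for a linear regulator. Here is a minimal sketch, assuming a 5V output, the estimated 400 mA load, and ignoring the regulator’s own quiescent current.

```python
# Heat dissipated in a linear regulator: (input voltage - output voltage) * load current.
# Assumptions: 5 V output, 0.4 A load (the post's estimate), quiescent current ignored.

def regulator_dissipation_w(v_in, v_out=5.0, i_load_a=0.4):
    """Power in watts that the regulator must shed as heat."""
    return (v_in - v_out) * i_load_a


if __name__ == "__main__":
    for v_in in (6.0, 7.0, 9.0, 12.0):
        print(f"{v_in:4.1f} V supply: {regulator_dissipation_w(v_in):.1f} W")
```

A 6V or 7V supply would cut the heat substantially, which is why the lack of common supplies below 9V matters.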
Battery Power
Powering the Tiny CPU board from a battery would make it totally self-contained. OK, a full-size PC keyboard isn’t very portable, but I could still make some interesting programs using only the buttons and switches on the board itself. Unfortunately, finding suitable batteries looks like it will be more difficult than I thought.
My first idea was to use a 9V battery. Who knew that Energizer publishes the datasheets for all their batteries? Examining the 9V alkaline datasheet was disappointing. It only has a capacity of about 500 mAh, and that’s at a discharge rate of 100 mA. It also assumes the battery is discharged all the way to 4.8V, which is too low for electronics. At a 400 mA discharge rate, the situation is even worse:
The 400 mA discharge rate is off the chart, somewhere in the sub-1-hour lifetime range. Yuck.
If not a 9V alkaline battery, then what? Sanyo Eneloop AA and AAA rechargeable NiMH batteries are among the best rechargeables currently available. Four AA Eneloops would provide a total voltage of about 5.4V, with substantially more capacity than the 9V alkaline battery:
Fully charged, each battery is roughly 1.4V, providing 5.6V total. The cell voltage drops rapidly, though, flattening out at about 1.25V per cell and 5.0V total, while providing 4 hours of useful battery life.
That seems like a good solution, except that it doesn’t provide enough extra voltage to operate a voltage regulator. Even an LDO (low-dropout) regulator typically needs at least 5.5V input in order to get 5.0V output.
Here, then, would seem to be my options:
- Spend a lot of money on 9V batteries. Works, but not very economical.
- Use five AA batteries. Works, but… five batteries? Typical battery holders are one, two, or four batteries. It just seems strange.
- Use four AA batteries, without any voltage regulator. Works… sort of. With four fully-charged batteries, the total voltage of 5.6V would be pushing the limits of safe operation, although it would quickly fall into a more reasonable 5.0V range. This would preclude using any higher voltage wall power supply, though, and so doesn’t seem like a great option.
- Add a switch to allow optional bypassing of the voltage regulator. The regulator could be used with wall power supplies, 9V batteries, and the first minutes of use for AA batteries, then bypassed as the AA’s lose voltage. Works, but dangerous. If I forgot that the regulator was bypassed while connecting a higher voltage supply: poof, toasted board.
Option 2 is probably the best bet, but using five batteries just kind of offends my sensibilities. I’m still searching for a better solution.
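The trade-off between four and five cells comes down to regulator headroom. The sketch below uses only the figures quoted above (1.4V per fresh cell, 1.25V on the discharge plateau, and a 5.5V minimum regulator input); the function names are hypothetical and a real LDO’s dropout will vary with load.

```python
# Does a NiMH pack leave enough headroom for a 5 V LDO regulator? (illustrative sketch)
CELL_FRESH_V = 1.4       # freshly charged cell, per the post
CELL_PLATEAU_V = 1.25    # voltage for most of the discharge, per the post
LDO_MIN_INPUT_V = 5.5    # typical minimum input for 5.0 V out, per the post


def pack_voltage(cells, cell_v):
    """Total voltage of identical cells in series."""
    return cells * cell_v


def regulator_has_headroom(cells, cell_v, min_input_v=LDO_MIN_INPUT_V):
    """True if the pack voltage meets the regulator's minimum input."""
    return pack_voltage(cells, cell_v) >= min_input_v


if __name__ == "__main__":
    for cells in (4, 5):
        for label, v in (("fresh", CELL_FRESH_V), ("plateau", CELL_PLATEAU_V)):
            ok = regulator_has_headroom(cells, v)
            print(f"{cells} cells, {label}: {pack_voltage(cells, v):.2f} V, regulator OK: {ok}")
```

Four cells only clear the threshold while freshly charged, while five cells stay above it across the plateau.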
CPU Angst
I’m starting to fear that Tiny CPU is a short-sighted design. It occupies a strange spot between ultra-simple soft-CPUs like MCPU, and much more capable soft-CPUs like PicoBlaze. The hardware is nearly as complex as PicoBlaze, but the functionality is not so much better than MCPU, making it a poor trade-off. The 10-bit address space and lack of true pointer support (no generic indirect addressing mode) are especially limiting.
The project is far enough along that I’ll finish it anyway, so it will be a good learning experience if nothing else. Next time if I do a “Small CPU” project, substituting even a very small FPGA for the two CPLDs I’m using here should allow for dramatically improved functionality.
Supply Current
I took some power measurements for my test board, at various clock speeds and with different peripherals connected.
Configuration | 1.22 MHz | 4 MHz | 8 MHz | 20 MHz |
---|---|---|---|---|
with keyboard and 20×2 LCD | 142 mA | 147 mA | 145 mA | 148 mA |
with LCD only | 141 mA | 143 mA | 144 mA | 147 mA |
no peripherals | 89 mA | 100 mA | 101 mA | 105 mA |
Conclusions: Clock speed and the presence of a keyboard have little effect on the current draw. A 20×2 line text LCD adds about 50 mA. I also tried a 20×4 LCD, which draws about 65 mA.
The LCD current is higher than I expected. The datasheet for the 20×4 LCD says the supply current should be just 4 mA, but I assume that’s without the backlight. The datasheet doesn’t say much about the configuration of the backlight LEDs or their current needs, and whether they’re in series, parallel, or both, except to note that VLED is 4.2V. That probably means there are two LEDs in series, or possibly several series pairs, connected in parallel.
As I type this, I realize I’ve made a mistake. If the backlight drops 4.2V, then 0.8V should be dropped across my current limiting resistor. If I want 15 mA (pretty typical for an LED), then the resistor should be 0.8V / 15 mA = 53 Ohms. But I’m actually using a 15 Ohm resistor, and I don’t remember how I came up with that value. With that resistor I should see 0.8V / 15 Ohm = 53 mA going to the backlight, plus 4 mA for the controller chip, so 57 mA. That’s not far from what I measured, so at least that makes sense.
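The resistor arithmetic is easy to reproduce. This sketch assumes a 5.0V supply and treats the backlight as a fixed 4.2V drop, which is a simplification since a real LED’s forward voltage shifts somewhat with current.

```python
# Backlight current-limiting resistor arithmetic (Ohm's law on the leftover voltage).
# Assumptions: 5.0 V supply, backlight modeled as a constant 4.2 V drop.
SUPPLY_V = 5.0
BACKLIGHT_VF = 4.2
CONTROLLER_MA = 4.0   # LCD controller supply current from the datasheet figure


def resistor_for_current(i_a):
    """Resistance in ohms needed for a target backlight current in amps."""
    return (SUPPLY_V - BACKLIGHT_VF) / i_a


def current_for_resistor(r_ohm):
    """Backlight current in amps for a given series resistor in ohms."""
    return (SUPPLY_V - BACKLIGHT_VF) / r_ohm


if __name__ == "__main__":
    print(f"{resistor_for_current(0.015):.0f} ohms for 15 mA")
    backlight_ma = current_for_resistor(15) * 1000
    print(f"{backlight_ma:.0f} mA with 15 ohms, {backlight_ma + CONTROLLER_MA:.0f} mA total")
```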
Beep!
I also did some experiments generating tones with a simple speaker. Directly driving the speaker with a digital output pin works, but draws about 15 mA with a 50% duty cycle square wave. Using the digital I/O to switch a transistor on and off instead delivers 50 mA to the speaker, making it noticeably louder while also reducing the current demands on the CPLD. It does require adding a handful more passive components to the board, though: in addition to the transistor itself, it needs a resistor to limit the base current, and a flyback diode in parallel with the speaker to discharge reverse voltage created by switching.
Serial Port
Next up: working on a soft-UART serial port, for communication with a PC. I’m not sure how that will get used ultimately. Maybe a bootloader? I’m sure it will prove useful, though.