Archive for October, 2008
Everything But The Kitchen VSYNC
I’ve decided to do whatever it takes to fix the VSYNC problem with BMOW’s video, and right now it looks like that means ripping out and redesigning a decent portion of the video address circuitry. Things are fine when displaying a static image, but not when the image is changing. When trying to animate something, or scroll a screenful of text, the VSYNC signal often gets fubared. This causes the monitor to throw up its hands and go into power-saving mode. It takes about 5 seconds for it to come back again after VSYNC stabilizes, and the constant pattern of scroll, lose sync, wait, regain sync is infuriating.
I did quite a bit of investigation of the problem using the oscilloscope. The proximate cause is that a miscount occurs in the GAL named VERTHI, which maintains the upper 5 bits of the row count. The low 4 bits of the row count and the 9 bits of column count are all fine. I can’t say why VERTHI is going wrong, but it’s presumably something related to noise or a race condition. There is a fair bit of noise observed with the scope. I tried rewiring the clock wires using a few different topologies in an attempt to reduce noise, without success. I also tried two more completely different clock sources for the row counter: the inverted high bit of the column count, and the HBLANK signal. HBLANK introduced some other timing problems but didn’t help with the loss of sync. The high bit of the column count made the loss of sync far worse. I also tried replacing the GAL with another one, and replacing the other parts that send signals to the GAL.
I wish I had a better explanation, but at this point I’m convinced I just shouldn’t use a GAL for the row counter. Instead I’m planning to use a regular hardware counter like a few 74163s or 74393s, and use the GAL to buffer the row address and generate VSYNC and other derived timing signals. Unfortunately that means eating up more of what little board space is left, dimming the prospects for a future audio system. It also means ripping out gobs of my painstakingly-wrapped wires and replacing them with new ones. I just hope it actually fixes the problem!
Debugging
Holy cow, this thing is riddled with bugs! I started trying to address some of the bugs described in the Boot Logo posting, with mostly positive results. Some problems turned out to be hardware issues, some were in the software, and some still have me scratching my head.
The first bug I’d described in the Boot Logo posting was one where the keyboard appeared not to work during the bootloader sequence, depending on what address the bootloader program was assembled to. A few NOPs added or removed made the difference between the keyboard working or doing nothing. After the usual amount of swearing and probing with the oscilloscope, I discovered that this was caused by a hardware glitch, or more precisely, a bug in the hardware design.
The BMOW keyboard interface hardware reads in a byte’s worth of bits and then signals an interrupt. When the CPU reads the key scancode, that clears the interrupt and resets the bit shifter. The trouble was that as the address bus lines bounce around during the early part of a clock cycle, before settling on their final values, occasionally they would briefly appear as the keyboard interface address. So the interface hardware thought the CPU was reading a scancode, and reset the bit shifter. When the bootloader was assembled to certain addresses, it just happened to cause a pattern on the address bus that exposed this problem, constantly resetting the keyboard bit shifter before a full byte could be read, and rendering the keyboard inoperative as a result.
My fix was to modify the hardware slightly, so that the keyboard interface address must appear on the bus during the second half of the clock cycle in order to reset the bit shifter. It required changing a couple of wires, and altering the keyboard GAL code. This isn’t a perfect fix, but 99.9% of the time the address bus lines should have settled on their final values by then. In practice, it seemed to fix the problem.
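The effect of that change can be sketched in a few lines of Python. This is only a toy model, not the actual GAL equations: the keyboard address `0x8000`, the sample lists, and the function names are all made up for illustration. Each clock cycle is modelled as a series of (address, second_half) samples, with transient addresses appearing only in the first half.

```python
KEYBOARD_ADDR = 0x8000  # hypothetical address, not BMOW's real keyboard port


def shifter_reset(addr, second_half, gated):
    """Return True if this bus sample would reset the keyboard bit shifter."""
    match = addr == KEYBOARD_ADDR
    if gated:
        # Fixed design: the address only counts in the second half of the cycle.
        return match and second_half
    # Original design: any appearance of the address triggers a reset.
    return match


def resets_in_cycle(samples, gated):
    """samples is a list of (address, second_half) pairs for one clock cycle."""
    return any(shifter_reset(addr, half, gated) for addr, half in samples)


# Bus briefly shows the keyboard address while settling on 0x1234.
glitchy_cycle = [(0x8000, False), (0x1234, False), (0x1234, True), (0x1234, True)]
# A genuine scancode read: the bus settles on the keyboard address.
real_read = [(0x0F00, False), (0x8000, True), (0x8000, True)]

print(resets_in_cycle(glitchy_cycle, gated=False))  # spurious reset in the old design
print(resets_in_cycle(glitchy_cycle, gated=True))   # suppressed by the gating
print(resets_in_cycle(real_read, gated=True))       # real reads still reset the shifter
```

The model also shows why the fix is not perfect: if a transient ever lingered into the second half of the cycle, the gated version would still see it.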
IST?
The second bug from the boot logo posting was also one of missed keys from the keyboard. When running the Apple II monitor program, after disassembling a section of memory, the next key typed wasn’t being recognized, so “LIST” became “IST”.
You might think this was related to the first keyboard problem, but it proved to be entirely different. It took me a long, long time to resolve. It turned out to be an interrupt problem. The interrupt routine was inheriting whatever memory bank was active at the time of the interrupt, rather than explicitly setting the active bank to the one it needed. This meant that when a keyboard interrupt hit while video was being drawn, the keyboard driver tried to update its state from VRAM! This caused the key to be ignored, and also sometimes drew a garbage pixel somewhere on the screen. A one-line fix to the interrupt routine to set the active bank fixed the problem quite nicely.
I then attempted to look into the sporadic loss of VSYNC, but kept running into other problems. I lost some time to an intermittent crash on reset, which proved to be another instance where the interrupt routine wasn’t setting the active memory bank. Then I lost still more time to an intermittent crash on power-up. I *think* this was caused by attempting to service an interrupt before the interrupt vector was set up. I altered the bootloader to disable interrupts until the interrupt vector is set, and the problem now seems to have gone away.
Finally I returned to the loss of VSYNC issue once more. I confirmed what I’d discovered when I first encountered the problem: the video row counter is occasionally miscounting when the CPU accesses VRAM. I can’t explain why this happens. It’s a 10-bit counter split across two GALs, and it should endlessly count rows from 0 to 524, generating VSYNC from the count. Nothing should ever alter its counting behavior. Yet I observed it sometimes skipping backwards or forwards in the count, or missing 524 and counting all the way to 1023 before wrapping around.
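To see why a single missed wrap is so disruptive, here is a small Python model of the intended counting behavior. It is a sketch under assumptions, not the GAL logic itself: the VSYNC rows (490 and 491) are the standard 640x480 VGA values and may not match BMOW's, and the function names are invented for this example.

```python
LAST_ROW = 524      # from the post: rows count 0 through 524
COUNTER_BITS = 10   # from the post: a 10-bit counter
# Assumption: standard 640x480 VGA timing; BMOW's actual VSYNC rows may differ.
VSYNC_START, VSYNC_END = 490, 491


def next_row(row, wrap_detected=True):
    """Advance the row counter by one row clock."""
    if row == LAST_ROW and wrap_detected:
        return 0
    # If the terminal count is missed, the counter just keeps going
    # until it overflows its 10 bits.
    return (row + 1) % (1 << COUNTER_BITS)


def vsync_active(row):
    """VSYNC is derived purely from the current row count."""
    return VSYNC_START <= row <= VSYNC_END


def frame_length(miss_terminal_count=False):
    """Row clocks from row 0 until the counter next returns to row 0."""
    row, clocks = 0, 0
    while True:
        wrap_ok = not (miss_terminal_count and row == LAST_ROW)
        row = next_row(row, wrap_ok)
        clocks += 1
        if row == 0:
            return clocks


print(frame_length())                          # normal frame
print(frame_length(miss_terminal_count=True))  # frame where row 524 is missed
```

In this model a normal frame is 525 row clocks, while a frame that misses row 524 stretches to 1024, so the next VSYNC pulse arrives almost a full frame late. That is consistent with a monitor deciding the signal has gone away.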
I can only guess this is caused by some kind of timing or noise problem. Might one of the counter inputs be changing just at the instant of the clock pulse? Or might an input have taken on an illegal voltage? Half-hearted probing with the oscilloscope didn’t find any evidence of such problems.
Feeling a little discouraged, I decided to change course, and alter the video console code to synchronize VRAM access with the VBLANK period to eliminate the “snow” that appears when the CPU and display circuitry contend for VRAM at the same time. I added code to wait until the start of the VBLANK period before accessing VRAM, which should have made for slower, snow-free access.
The new code mostly worked, but not 100%. For one thing, the code has no way of knowing exactly where in the VBLANK period the display circuitry is. It might check and confirm that it’s the VBLANK period, and start doing a VRAM access, just as the VBLANK period ends and the display circuitry tries to access VRAM to display the next frame. I don’t have any easy way of resolving that.
More surprisingly, I discovered that some instructions that did not access VRAM were exercising the hardware as if they did, causing snow. That led to more swearing and probing: the usual routine. What I finally found was a pair of problems related to unexpected addresses on the address bus, similar to keyboard bug #1. Some of BMOW’s instructions drive an address onto the address bus in order to later push that address onto the stack. When this was a VRAM address, it caused the VRAM to think it was being accessed. I hacked the address decoder with an extra wire and code changes to suppress the VRAM memory select signal in such cases, but it feels like an imperfect solution.
Even with that fix in place, I found there were frequently transient values on the address bus that caused VRAM to think it was being accessed during access to some other memory location. I altered a memory control signal to suppress all memory select signals during the first quarter of the clock period, which resolves the problem, but with a heavy cost. By delaying the memory access signal, it reduces the maximum theoretical clock speed of BMOW by 25%. But right now, I’d rather have it run right than run fast.
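The 25% figure follows from simple arithmetic, sketched below in Python. The assumptions are mine: that the memory select was previously usable for the whole clock period, that memory access time is what limits the clock, and that the access time is 100 ns (a made-up number; the result does not depend on it).

```python
def max_clock_hz(access_time_s, usable_fraction):
    """Highest clock frequency at which the usable part of one cycle
    is still at least as long as the memory access time."""
    # Need usable_fraction * period >= access_time, so period >= access_time / usable_fraction.
    return usable_fraction / access_time_s


ACCESS_TIME = 100e-9  # hypothetical 100 ns memory access time

before = max_clock_hz(ACCESS_TIME, 1.0)   # select active for the whole cycle
after = max_clock_hz(ACCESS_TIME, 0.75)   # first quarter of the cycle suppressed
reduction = 1 - after / before

print(f"max clock before: {before / 1e6:.2f} MHz")
print(f"max clock after:  {after / 1e6:.2f} MHz")
print(f"reduction: {reduction:.0%}")
```

If something other than memory access sets the real speed limit, the actual cost may be smaller than this worst case.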
I gave up at that point. With the new code to wait for VBLANK, there’s no snow anymore, but the loss of VSYNC problem seems to be happening much more often. To add insult to injury, I also discovered a new problem where VRAM access during the VBLANK period interferes with retrieval of the video mode bytes, which are stored in a non-visible portion of VRAM. Every so often, my screen of text will suddenly appear as a screen of garbage video instead, as the display hardware interprets the contents of VRAM as an image rather than text. I’m not immediately sure how to resolve that one either.
I should probably be grateful for all these problems. If there were no more problems, there would be nothing to do, and I’d have to start a new project!
Odd Problems
Odd problems– are there any other kind?
I did more work on BMOW today, after a few weeks focusing my attention elsewhere. Wow, there are a lot of strange problems going on. Without having changed a thing from my previous boot logo demo, I found that the machine wouldn’t boot up at all. At power-on, I saw nothing on the LCD and garbage on the video. This seemed to have something to do with the USB connection, oddly enough. With no USB connection, the machine would not boot. With the USB connection to a PC, it would boot OK.
After more experimentation, though, the situation appeared less clear-cut. Yes it would boot when the USB was connected, but it would sporadically reset itself every minute or so. And no it wouldn’t immediately boot when the USB was disconnected, but if I left it sitting long enough, it would eventually boot. With no USB, it appeared to be resetting itself every few seconds or so. Sometimes it would crash half way through drawing the boot logo, or draw the boot logo incorrectly in fascinatingly bizarre ways. I tried to capture one of these with the camera, but it always reset itself before I could snap a photo.
My first guess is that I’ve got some kind of power problem. BMOW has a power-on reset chip that will force the reset line to its active state if the power supply strays too far from 5 volts. Maintaining a USB connection may somehow help minimize power problems, since the USB connector provides power. That may explain why it seems to work better when connected to USB than when not.
After the machine had been on for about 20 minutes, it seemed to work more reliably, although still not without occasional mystery resets. Could the warming of the chips and board have made a difference? Hmmm…
Edit: I should have pulled out the multimeter right away instead of just musing on things. Power and ground differ by 4.5 to 4.6 volts, not the 5.0 volts it should be. The reset chip (a TC1232) is configured to reset the machine if the power is more than 10% out of tolerance, which means outside the 4.5v to 5.5v range. So that certainly explains why BMOW is sporadically resetting. Now as to why the power is out of spec, that may be a more difficult problem to solve.
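The numbers above can be checked with a quick calculation. This sketch uses only the figures from the post (5.0 V nominal, 10% tolerance, 4.5 to 4.6 V measured); the function names are invented, and it looks only at the low side of the window since that is where the measurements sit.

```python
NOMINAL_V = 5.0    # nominal supply voltage, from the post
TOLERANCE = 0.10   # reset chip configured for 10%, from the post


def low_threshold(nominal=NOMINAL_V, tolerance=TOLERANCE):
    """Supply voltage below which the reset chip asserts reset."""
    return nominal * (1 - tolerance)


def margin_above_reset(measured_v):
    """How far the measured supply sits above the low trip point, in volts."""
    return measured_v - low_threshold()


for measured in (4.5, 4.6):
    print(f"{measured:.1f} V measured -> {margin_above_reset(measured):.2f} V of margin")
```

With at most about a tenth of a volt of headroom, any small dip in the supply would be enough to cross the trip point, which fits the sporadic resets.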
Boot Logo Needed!
I’ve made some additions to the ROM-based boot loader, so BMOW now has a proper boot logo paired with cool yet useless system diagnostics. All it needs now is a groovy start-up sound, and it will be a proper retro-80’s wanna-be machine! Power it up, or hit the reset button, and you’ll see this boot screen. You can then upload your program via USB from an attached PC to get something interesting going.
I spent quite a lot of time with a paint program attempting to create a cool boot logo, before admitting defeat and making this utilitarian 4-color logo. It’s actually a 16 color image, and the logo text is anti-aliased using additional colors, but it’s hard to see in the photo.
I’ve been pleased to discover that quite a few people are following BMOW’s evolution, and if anyone with better artistic chops wants to take a crack at making a more impressive BMOW logo, I’d love to hear from you! You’ll win fame (of a sort), fortune (not so much), and my gratitude. If this interests you, post a link in the comments, or send me an email. My address is in the “About BMOW” page. Some details on the format:
- Images can be either 252×64 with 16 colors, or 126×64 with 256 colors.
- For 256 color images, the pixels will be twice as wide as they are tall. This is probably best achieved using a paint program by creating a 252×64 image with 256 colors, then scaling it by 50% horizontally when you’re done.
- Uncompressed BMP images are best.
- I lied about the height being 64: any image height that’s a multiple of eight is OK, but 64 is a good guide.
- I lied about the 256 color images: they actually must be 240 colors, but I can do that conversion myself if your image is 256 colors.
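For anyone preparing an image, the rules above can be written as a small Python check. This is a sketch, not BMOW's actual loader: the function name is made up, and the size calculation assumes pixels are packed at 4 bits each for 16 colors and 8 bits each for 256 colors with no padding, which the list above does not state.

```python
# colors -> (required width in pixels, assumed bits per pixel)
FORMATS = {16: (252, 4), 256: (126, 8)}


def check_logo(width, height, colors):
    """Validate logo dimensions and return the assumed pixel data size in bytes."""
    if colors not in FORMATS:
        raise ValueError("colors must be 16 or 256")
    expected_width, bits_per_pixel = FORMATS[colors]
    if width != expected_width:
        raise ValueError(f"{colors}-color images must be {expected_width} pixels wide")
    if height <= 0 or height % 8 != 0:
        raise ValueError("height must be a positive multiple of eight")
    return width * height * bits_per_pixel // 8


print(check_logo(252, 64, 16))
print(check_logo(126, 64, 256))
```

Under that packing assumption both formats come to the same number of bytes for a given height, since halving the width exactly offsets doubling the bits per pixel.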
Things That Break
No blog update would be complete without a list of all the new things that have broken. Maybe some of you who are working on your own projects will feel better knowing how many random weird problems I have with BMOW.
1. Bootloader Abort – If you press ENTER while on the boot screen, BMOW will skip the USB download step, and jump directly to the program in RAM that you (presumably) downloaded earlier. This is handy after you reset from a program crash, since there’s no need to download the same program again. After I added the boot logo, though, the “ENTER to skip” behavior stopped working. It seemed that it wasn’t recognizing any input from the keyboard at all. With some experimentation, I found that this was influenced by what memory addresses the bootloader code was assembled to. By strategic placement of some NOP instructions, I was able to “fix” it, but I really don’t understand how it broke or why the NOPs make any difference.
2. Missed Keys – Once you load the Apple II monitor program, you can type a monitor command and hit ENTER, and it works fine. But for the next command you type, and all commands thereafter, the first key you press isn’t recognized, so “LIST” becomes “IST”. I think this has something to do with disabling interrupts for too long a period of time, and missing keyboard interrupts. Actually, I have no idea, but I like to make up plausible-sounding explanations.
3. Loss of VSYNC – A while back, I was having some troubles where clearing the video screen would cause a temporary loss of VSYNC, switching my display into power-saving mode. I made some adjustments, and the problem mostly went away. Well, now it’s back. When the video screen is displaying a full screen of text, and the text needs to be scrolled upward by one line, VSYNC is lost and the display retreats into power-saving mode again. At first this was happening almost 100% of the time when text was scrolled. Eventually I discovered that putting the functions in a different order in the file dropped the frequency of the problem to about 10%, even though the functions themselves were unchanged. The only difference was what addresses the code was assembled to. I have no explanation for that, but it hints at some deeper hardware problems that I’d prefer to sweep under the rug for now.
Putting It All Together
I’ve reached an exciting point in the evolution of BMOW. All of the individual bits of technology that I’ve built are now coming together, and rather than a proof-of-concept of keyboard or USB or video functions, I actually have a real computer! I took some time this week to put together a few of those pre-existing technology bits, with the goal of doing something that was actually useful. I now have a computer with a rudimentary BIOS, that can download and run the Apple II machine language monitor program over USB, using BMOW’s VGA display and keyboard for I/O. Holy cow, it practically *is* an Apple II. Here’s the monitor program running, having just performed a disassembly of itself:
I have to say, the very ordinariness of this accomplishment on a “real computer” makes it all the more exciting to me. I showed my wife, and she didn’t even understand what I was trying to demonstrate. It’s a computer running some software, so what? But when I think of the layers of abstractions built on abstractions built on abstractions that it’s taken to get to this point, it makes my head spin. Just a couple weeks ago, I was struggling with the basics of text character generation, and it’s only been six weeks since I couldn’t tell a VSYNC from a V6 and my video signal looked like a train wreck. Before that, of course, was all the time spent monkeying with the keyboard interface and the USB connection. And let’s not forget the CPU itself– those Apple II guys used a nice MOS 6502 CPU, whereas BMOW doesn’t even have a discrete CPU. All the core logic and control functions are built upon a home-designed microarchitecture that’s implemented using a small pile of dirt-simple parts, hand-wired pin-to-pin with 1000+ individual connections.
There’s a delicious irony in the way BMOW seems more and more unremarkable as it behaves more and more like any other computer that people are familiar with. Yet I’m thrilled to have gotten this far. It’s already well surpassed my original expectations, and there’s still plenty more fun to be had working on an audio system, adding some persistent storage, and building a case, not to mention all the software (BASIC? Games? A web server?) that begs to be written.
Boot Loader
Following my experience with the video system, I decided to create a boot loader. It’s a little surprising that I never did this before, actually. While working on the video, after every tiny change to the video software, I had to reprogram the BMOW main ROM and stick it back on the board. I must have done it 100 times, and each time took a couple of minutes of fiddling with the chips, EPROM programmer, and software. What a pain.
Fortunately BMOW already had all the ingredients needed for a better solution, including a USB connection for communicating with a PC. I wrote a fairly simple boot loader program, which I programmed into the main ROM– hopefully my last ROM program for a while. At power-up, the boot loader listens for data from the PC on the USB port. It expects an initial signature, followed by two bytes to determine the download size, then the download data itself, and finally a checksum. The downloaded data is copied byte-for-byte to a fixed address in RAM. Once the download completes, if the checksum matches, the boot loader transfers control to the downloaded program and it’s off to the races.
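The download format described above can be sketched from the PC side in Python. Several details here are guesses made only so the example runs: the signature bytes `BMOW`, the little-endian size field, and the 8-bit additive checksum are all hypothetical, and `receive_packet` merely mirrors in Python what the ROM boot loader would do in its own code.

```python
SIGNATURE = b"BMOW"  # hypothetical signature; the real one isn't given in the post


def checksum(payload):
    """Assumed checksum: sum of all payload bytes, truncated to 8 bits."""
    return sum(payload) & 0xFF


def build_packet(payload):
    """Signature, two-byte size, data, then checksum."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a two-byte size field")
    size = len(payload).to_bytes(2, "little")  # byte order is an assumption
    return SIGNATURE + size + payload + bytes([checksum(payload)])


def receive_packet(stream):
    """Return the payload if the packet is well formed, otherwise None."""
    if stream[:len(SIGNATURE)] != SIGNATURE:
        return None
    pos = len(SIGNATURE)
    size = int.from_bytes(stream[pos:pos + 2], "little")
    pos += 2
    payload = stream[pos:pos + size]
    pos += size
    if len(payload) != size or pos >= len(stream):
        return None  # truncated download
    if stream[pos] != checksum(payload):
        return None  # checksum mismatch: do not run the program
    return payload


program = bytes([0x01, 0x02, 0x03])
packet = build_packet(program)
print(receive_packet(packet) == program)
```

A simple additive checksum like this one catches most single-byte corruption but not reordered bytes, so a stronger check would be worth considering if downloads ever prove unreliable.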
Using the boot loader allows me to experiment with software changes far faster than before. I can change a couple of lines of code, assemble it, hit BMOW’s reset button, and download the new program using the boot loader. The whole process only takes a few seconds, compared with a few minutes for my old method of burning a new ROM each time.