October 10, 2020
I thought I promised myself I wouldn't ignore this blog for 6 months at a time, and here it is October already. I guess I got caught up in a bunch of other stuff: emulating sewing machines for the Game Boy and two card readers for the NDS. GBE+ is finally running DS software well enough that I can begin reverse-engineering and emulating a bunch of fancy Slot-2 hardware. That would have been unthinkable just last year, when maybe one or two games would actually boot and run. Anyway, as I said before, the long gap between these entries doesn't mean a lack of progress. If anything, it's quite the opposite.
Last time I wrote about getting Z-buffering fixed for the 3D engine of the DS, as well as preliminary texture support. I've since implemented many of the various texture formats. The results were most immediate for games that use the 3D engine with orthogonal projection. Essentially a game will use the 3D engine to display 2D graphics via "flat" textured polygons. It's a pretty common method of drawing among DS titles. It offers far more "sprites" than the 2D engine of the DS can handle, and it's comparable to how modern 2D graphics are drawn today. Once textures were implemented, a whole new world of DS emulation opened up.
Initially, I wasn't handling some transparencies correctly. Most textures use palettes, and the first color in the palette is essentially ignored when drawing. This isn't all that different from how OBJ transparencies work on the DMG, GBC, GBA, and NDS 2D engine. There was still a nasty issue with all textures having an outline. As it turns out, the method I described back in March for drawing 3D polygons was to blame. I was always drawing the outline first, then filling in each polygon. The color of the outline was supposed to be the vertex color, but that's not applicable to textured polygons like these. Additionally, even when I stopped drawing the vertex color as the outline, I still mistakenly signaled to my 3D renderer that the outline was already drawn, so for a while textures were missing a 1 pixel border! With both of those problems fixed, textures are generally looking pretty good now.
Not all texture issues are solved just yet in GBE+. I still have to implement some of the trickier texture coordinate transformations, or even more basic things like flipping, clamping, and repeating. As a start, however, I was pretty thrilled to see graphics popping up where previously there had been nothing.
I started dabbling with display capture on the NDS as well. The NDS has the ability to capture output from Engine A and write that data to VRAM. It's almost like taking a screenshot. A lot of games use this to show 3D graphics on both screens at the same time. Only Engine A can use the 3D engine for rendering things. However, a game can capture a particular 3D scene from Engine A, then pass that VRAM over to Engine B while Engine A generates a different scene. The trick is to continually swap Engine A between the top and the bottom screen every other frame, switching the scene it renders. Engine B acts as Engine A's "shadow" by showing the last 3D image that was generated. It sounds pretty elaborate, but it works nicely on real hardware, although I suppose animations are technically limited to 30 FPS for dual-screen 3D. At any rate, I tried my hand at some basic display capture emulation. A number of games still flicker like crazy, but at least they're showing something now.
After emulating the Magic Reader Slot-2 accessory, I figured I should take the time and really commit myself to making improvements on 3D rendering in GBE+. While getting a bunch of games that use the 3D engine for 2D graphics working was great, I wanted to see 3D polygons working more than anything else. Simple homebrew stuff made with libnds worked just fine. I could look at dozens of spinning cubes without issue. 3D graphics for commercial games, however, are a different story. So for September through the start of October, my goal was to get 3D models looking correct for at least one title.
Would you believe that Cory in the House was exceptionally easy to debug? This game became a meme years ago, but it's surprisingly simple when it comes to rendering. The first 3D model that appears in the game is Cory himself, right before the player selects a few options from a menu. The bulk of his shirt is made up of quads and the rest of his body is triangles. None of the matrix stacks are used for this part, just MTX_STORE and MTX_RESTORE, so that made keeping track of the current matrix very straightforward. Additionally, the game doesn't rely on GX FIFO interrupts. no$gba has an excellent debugger for all data sent to the NDS' GPU, including commands issued, textures used, and vertex screen coordinates, all of which proved extremely useful.
I was quite pleased to see GBE+ trying to render something at first, even if it was just a bunch of exploding polygons. I focused on at least getting the first quad of his shirt to render correctly. As it turns out, the game doesn't use the dedicated GX FIFO DMA available to the ARM9 CPU. Instead, it just uses a regular 32-bit fixed destination DMA to the GX FIFO. GBE+ wasn't detecting the writes properly for those DMAs. That fixed a lot of the polygons, but others remained broken. After some further investigation, I realized that the game was changing the position matrix on a per-vertex basis. Unfortunately, GBE+ would only use the position matrix of the last vertex for the entire polygon. Simply keeping a list of the position matrix for each vertex fixed that problem, and Cory soon appeared flawlessly in GBE+!
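Here's a rough sketch of the per-vertex matrix fix in Python. It is not GBE+'s actual code: `PolygonBuilder`, `submit_vertex`, and the helper functions are hypothetical names, and plain integers stand in for the fixed-point values the hardware uses. The point is only that each vertex keeps a snapshot of the position matrix that was current when it arrived.

```python
import copy

def transform(vertex, matrix):
    """Multiply a row vector (x, y, z, w) by a 4x4 matrix."""
    return tuple(
        sum(vertex[k] * matrix[k][col] for k in range(4))
        for col in range(4)
    )

def translation(dx, dy, dz):
    """Build a simple row-vector translation matrix for the demo."""
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [dx, dy, dz, 1],
    ]

class PolygonBuilder:
    """Hypothetical vertex collector for a single polygon."""

    def __init__(self):
        self.position_matrix = translation(0, 0, 0)
        self.vertices = []  # raw vertex data as submitted
        self.matrices = []  # position matrix in effect for each vertex

    def set_position_matrix(self, matrix):
        self.position_matrix = matrix

    def submit_vertex(self, x, y, z):
        self.vertices.append((x, y, z, 1))
        # Snapshot the matrix now; the game may change it before the next vertex.
        self.matrices.append(copy.deepcopy(self.position_matrix))

    def finish(self):
        """Transform each vertex with the matrix that was current when it arrived."""
        return [transform(v, m) for v, m in zip(self.vertices, self.matrices)]
```

Using only the last matrix for every vertex would be the old behavior; zipping the two lists is what keeps each vertex paired with its own matrix.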
After this success, I was determined to get more 3D graphics showing in commercial games. My next step was to finally fix the GX FIFO interrupt in GBE+. It had been hacked somewhat to get a few games to kinda acknowledge it (such as Card de Asobu) but many games needed a true implementation. I sat down and figured out what was wrong with GBE+. Turns out when the GX FIFO interrupt is raised, it sets Bit 21 of the ARM9 interrupt flags (IF) to "1" for as long as the initial interrupt conditions are true, or until the interrupt is disabled by writing to GXSTAT. When games process interrupt handling, they clear the ARM9 interrupt flags, but under certain conditions, the GX FIFO flag cannot be cleared. Emulating this behavior seems to do the trick. A bunch of games that froze now continue in-game.
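The sticky flag behavior can be sketched roughly like this. It's a simplified model, not GBE+'s code: the class and method names are made up, and the two trigger conditions (FIFO less than half full, FIFO empty) and the 128-entry threshold are my assumptions about how the GXSTAT interrupt setting works.

```python
GXFIFO_IRQ = 1 << 21  # GX FIFO bit in the ARM9 interrupt flags (IF)

class Arm9Interrupts:
    """Hypothetical model of the ARM9 IF register and the GX FIFO interrupt."""

    def __init__(self):
        self.if_reg = 0
        self.gx_irq_mode = 0   # 0 = disabled via GXSTAT
        self.fifo_entries = 0

    def gx_condition_met(self):
        # Assumed meanings: 1 = FIFO less than half full, 2 = FIFO empty.
        if self.gx_irq_mode == 1:
            return self.fifo_entries < 128
        if self.gx_irq_mode == 2:
            return self.fifo_entries == 0
        return False

    def update(self):
        """Re-assert the GX FIFO flag for as long as its condition holds."""
        if self.gx_condition_met():
            self.if_reg |= GXFIFO_IRQ

    def write_if(self, value):
        """Games acknowledge interrupts by writing 1s to IF."""
        self.if_reg &= ~value & 0xFFFFFFFF
        # The GX FIFO flag cannot be cleared while the condition is still true.
        self.update()
```

With this model, a game that acknowledges the interrupt while the FIFO is still empty sees Bit 21 come right back, which is the behavior the games that froze seemed to depend on.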
So, with more games booting, I started tackling various problems. Basically, if I saw something wrong, I'd try and debug it, and if it proved too much at the moment, I moved onto a different game and a different problem. Over the past few weeks, I've improved rendering for quad-strips and tri-strips, fixed a few matrix stack issues, eliminated a few more troubles with GX FIFO entry parsing, and others. Of particular note, I emulated the Clip Matrix Results. Most games don't read the Clip Matrix Results, but some do in order to get the most recent position or projection matrix if they don't happen to keep track of that themselves. Oshare Majo is a good example. Without allowing the ARM9 CPU to read the Clip Matrix Results, the 3D models are faceless, as well as mostly headless. Yikes!
I messed up the first implementation for reading the Clip Matrix Results, so one of the girls has her legs just floating up in the air. It's rather interesting that they made full 3D models of Love and Berry wearing dresses. In the screenshot, they just spin around for a second and only the upper-body is visible, yet Sega took the time to texture and animate the legs and shoes. It may have been a holdover from the game's origins in the arcade, copy+paste? Anyway, Oshare Majo is far from perfect, with still some missing polygons here and there, but it's a vast improvement over last month, when nothing worked at all. A lot of other games still need much more work, but a few titles are showing lots of promise. A couple (like Rune Factory and Sonic Rush) are even playable now.
Tales of the Tempest is showing a lot of polygonal graphics as well, most of which look correct. Some error messes the game up when one character walks on-screen, so it's not playable just yet. Still, this is the farthest I've ever gone in-game. At any rate, I'm absolutely thrilled by the amount of progress I've made recently. I honestly thought I wouldn't be able to see this kinda stuff happen in GBE+ until maybe late next year. Debugging 3D stuff is not easy, but I think I've found a set of methods that works well for me.
Tales of the Tempest holds special meaning for me. It was the first DS game I really wanted to emulate. Back in 2011, I was just getting back into emulation after losing interest through my college years. After I finished school, I had a lot of downtime before finding a job, so I picked up my old hobby again. The emulation scene had advanced quite a bit while I was gone, and HD rendering for 3D games was the main attraction for many people interested in Dolphin or PCSX2. I dabbled with 720p for N64 and PS1 games (the most my old laptop for school could handle) but I really wanted to see how DS games handled. To my dismay, at that time it wasn't possible to render DS games in higher resolutions. I hacked up Desmume to force it to render at 2x, but only got a small section of the screen to show. Being a huge Tales fan, I wanted to see what this game had to offer. At 1x, it's honestly a bit of a blurry mess, and heaven forbid you use a bilinear filter to smudge it all up. At 2x and above, though, it looks pretty decent.
That was the catalyst that propelled me into emudev. I wanted to make my own DS emulator that would one day feature high resolution 3D graphics. Although both melonDS and Desmume do that now, it's still my dream to see it happen in GBE+. Seeing Tales of the Tempest display even okay-ish graphics like this means my journey is nearing a full circle. At the very least, I'm closer than ever. I can taste it.
March 2, 2020
So, it's been almost 2 months and no updates here. Must mean I'm stuck right? That was the old me. The new me is always reaching for more progress, more compatibility, more... anything. A lot of January + early February was dedicated to GBA stuff, namely the AGB-006 and Zoids Cyberdrive, but I also started picking apart some Slot-2 NDS hardware. That's the next frontier for me. I've already researched and emulated two devices, and I've got plenty more odds and ends sitting in my collection waiting to be probed. I'll hold off until the next Edge of Emulation articles to elaborate more on those.
My other area of focus has been 3D rendering on the NDS, and a lot has happened. I talked about implementing a z-buffering system (years ago???) but I got stuck back then and never improved it. I know, I know, it should be such a simple thing, right? Something basic and essential to any 3D renderer. Without that, you'll have a hard time drawing anything more than one polygon correctly. Handling stuff in 3D is not something that just clicks with me. I tried z-buffering, got it halfway right, but lost heart and motivation and didn't touch it for the longest time. Once again, however, I was sick of the status quo of GBE+ not emulating NDS stuff like it should, so a while ago I sat down and really got to work.
The hardest part for me was figuring out how to calculate Z values inside a polygon. Drawing the polygon's lines was simple, since it reduces to 2D fairly easily with just the X and Y values. Calculate the slope, plot the pixels, and boom, you have the outline of a triangle or a quad. It's not a big leap either to calculate the Z values of the lines. If the starting 3D point is (0.0, 0.0, 1.0) and the end point is (2.0, 0.0, 3.0), we know that the Z value gradually changes from 1.0 to 3.0. Since I'm already calculating X and Y changes to get the polygon outline, adding Z values is straightforward. But that's just the perimeter of the polygon, what about the insides?
For any given X coordinate, if we take the tallest point in the polygon and compare it to the lowest point in the polygon, we can actually tell how all three coordinates change when moving vertically (well, the X coordinate will remain constant). We just need to do something like (LINE_Z_BOTTOM - LINE_Z_TOP) / (SCREEN_Y_BOTTOM - SCREEN_Y_TOP) to see how the Z value changes for that given X coordinate when moving along the Y axis. That way, GBE+ can scan the polygon, left to right, top to bottom, and collect the polygon's internal Z coordinates. The polygon's outline is used to determine the top-most and bottom-most coordinates to use for the calculations. Now, this is actually something I very much tried to do in 2018, but I got things mixed up (swapping top for bottom), got the depths reversed, and used the wrong Z coordinates (forgot to apply the clip matrix, and therefore the perspective matrix). So, getting z-buffering done was mainly a matter of correcting a bunch of silly mistakes that initially scared me away. Just like that, GBE+ now does spinning cubes without z-fighting!
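The column-by-column idea can be sketched in a few lines of Python. This is only an illustration of the formula above, not GBE+'s renderer: the function names are made up, floats stand in for whatever fixed-point format the real code uses, and "smaller Z is closer" is an assumption of the sketch.

```python
def interpolate_column(y_top, z_top, y_bottom, z_bottom):
    """Return {y: z} for one screen column between two outline points."""
    span = y_bottom - y_top
    if span == 0:
        return {y_top: z_top}
    # (LINE_Z_BOTTOM - LINE_Z_TOP) / (SCREEN_Y_BOTTOM - SCREEN_Y_TOP)
    z_step = (z_bottom - z_top) / span
    return {y_top + i: z_top + z_step * i for i in range(span + 1)}

def draw_column(z_buffer, framebuffer, x, column, color):
    """Plot a column only where it is closer than what is already stored."""
    for y, z in column.items():
        if z < z_buffer[y][x]:  # assumption: smaller Z means closer
            z_buffer[y][x] = z
            framebuffer[y][x] = color
```

For example, a column running from Z = 1.0 at the top to Z = 3.0 four pixels lower gets a step of 0.5 per scanline, and a second polygon sitting at a constant Z of 2.0 only wins the pixels where the first one is farther away than that.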
So, a proper z-buffer paves the way for GBE+ to realistically handle the rest of 3D stuff for the NDS. This was basically a major roadblock I had, but now I've cleared that hurdle. Again, maybe this all seems a bit trivial to some folks, especially those who have dealt with 3D graphics before. But 3D is very intimidating to me, so when I stumbled around and failed in 2018, I had a very hard time picking myself back up and taking another shot at it. At any rate, I wasn't content to just stop there, not when there's so much more to do. I figured I'd take on something that looked fun and interesting, so I chose to deal with interpolating vertex colors. In layman's terms, this is basically just gradients.
At first, vertex color interpolation puzzled me. Again, a similar question popped into my head. How do I calculate the color values inside a polygon? It's easy enough to calculate the colors for the outline. If you have a triangle with one red, green, and blue vertex, just gradually change the color when moving from one vertex to another. When starting at the red one and ending at the blue, draw a line that starts red and bit by bit changes to blue. The math is rudimentary and so is the concept. But what about all those pesky internal colors? A little lightbulb flashed in my head at this point. Just calculate it like I did the Z values. DUH! For every X coordinate, take the top-most color of a polygon and gradually change that to the bottom-most color of a polygon, and that should produce the correct fill colors. And what do you know, it worked on my first try.
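As a minimal sketch of that fill step (hypothetical function names, plain linear interpolation with integer math, and no claim that this matches the hardware's exact rounding):

```python
def lerp_color(top, bottom, steps, i):
    """Blend two (r, g, b) tuples; i = 0 gives top, i = steps gives bottom."""
    if steps == 0:
        return top
    return tuple(t + (b - t) * i // steps for t, b in zip(top, bottom))

def fill_column_colors(y_top, color_top, y_bottom, color_bottom):
    """Return {y: (r, g, b)} for one screen column between two outline colors."""
    span = y_bottom - y_top
    return {
        y_top + i: lerp_color(color_top, color_bottom, span, i)
        for i in range(span + 1)
    }
```

The top and bottom colors come from the already-interpolated outline, so a column that starts on a red edge pixel and ends on a blue one passes through purple in the middle.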
Nice! PSI (CorgiDS and DobieStation author) wrote a concise summary of NDS color interpolation and that helped to solidify and reinforce my understanding on this matter. The color transitions here are smooth like on real hardware too, a bit too smooth actually! I'm still using 24-bit color instead of reducing it to 18 bits. I'll have to take care of that sooner rather than later, but I'm thrilled with the progress so far. Not wanting to stop, I jumped into another important area of 3D rendering, using textures.
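One simple way to do that reduction, assuming the 18 bits are split evenly as 6 bits per channel, is to drop the low 2 bits of each 8-bit channel. The helper names below are hypothetical, and the bit-replication step for converting back to 8 bits is just one common choice for display, not something taken from GBE+.

```python
def to_rgb666(r, g, b):
    """Reduce 8-bit channels to 6 bits each by dropping the low 2 bits."""
    return (r >> 2, g >> 2, b >> 2)

def rgb666_to_rgb888(r, g, b):
    """Expand 6-bit channels back to 8 bits by replicating the top bits."""
    def expand(c):
        return (c << 2) | (c >> 4)
    return (expand(r), expand(g), expand(b))
```

Nearby 24-bit colors such as (200, 100, 50) and (203, 103, 51) collapse to the same 18-bit value, which is where the slightly coarser gradients come from.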
I'll readily admit that studying up on many 3D things tends to make my eyes glaze over, then I start doubting myself when it comes to programming it. But textures, now there's something I understand. I used to deal with OpenGL 2.0 and textured quads all the time for 2D games I made in SDL 1.2, long before I started dabbling with emulation. Furthermore, the affine sprites used by the GBA and NDS have a lot in common with textured quads. So I'm not totally in unfamiliar territory. On the NDS, it's pretty much like old-school OpenGL. A TEXPARAM command is sent, and then TEXCOORD commands are sent for every vertex. Textures are stored in VRAM, so decoding them in many cases isn't much different than handling backgrounds or sprites from the 2D side of the NDS. The example from devkitPro uses Direct Color mode (15-bit RGB), so it's not too difficult to get something displayed.
Whoops! Heh heh, it's facing the wrong way. Well, I kinda got too excited about all the stuff I'd done and wrote this blog post before I even had the chance to fix it... That's my homework I guess. At any rate, I made some pretty decent headway. I plan on implementing support for more texture formats soon. With that done, more games will be able to display graphics where they previously couldn't. I can't wait to see where this leads me.
I'll just take this moment to give a brief shoutout to one of my best friends growing up. Today's his birthday. We were gaming buddies, spending endless hours during weekends and summer breaks lost in virtual worlds. From the PSX, N64, GBA, to GCN, it seemed like we played it all. Those were some real nice childhood memories we made, stuff that sticks with me even today. Haven't seen you in years, but here's hoping you have a good one, wherever you are man!
January 7, 2020
So here's a funny story. On a whim, a few weeks ago I wanted to see how far I could get Cory in the House to run in GBE+. For those of you that don't know, this is basically the best game ever made. Surprisingly, GBE+ made it to the title screen and almost kinda went in-game (though it freezes later, and there are no 3D models yet). The title screen itself was glitched just a bit, as PSISP pointed out. No biggie, I thought. It should be an easy fix. The title screen is just a bunch of sprites. It must have been something to do with rendering.
I started looking around, checking everything about the corrupted sprites. It looked to me as if the graphics in question were being pulled from the wrong address, but that turned out to be a false lead. After looking at every bit of data in OAM, I came to the conclusion that the sprite's tile data itself was essentially bad. GBE+ was rendering it just fine, it's just that the source was no good. Garbage in, garbage out. Some copy operation, either a manual loop or a DMA, was messed up. I kept backtracking to see when the incorrect bytes were written, then further backtracked to find out why. Eventually I found a bit of code in the game that used the NDS' hardware division to calculate an address to begin copying data. I was getting closer to the culprit.
My initial thoughts were "Not this again...". As detailed in my last update, multiplication instructions were playing havoc with Super Princess Peach. A while ago, hardware division caused a nasty bug in Lunar: Dragon Song (remember to mask 32-bit inputs before the division operation, not after). More trouble with math? Really? Well, sort of. The hardware division in GBE+ was solid, but the numerator GBE+ fed the operation was wrong. That went on to affect the result and remainder values and ultimately the address to copy data. I traced where GBE+ came up with the value for the numerator and found that a single instruction threw everything into chaos.
The ADC instruction is pretty simple. It takes three numbers: 2 inputs and the CPU's Carry Flag (a zero or a one). It combines all three and spits out a result. It's pretty useful when dealing with overflows, and it can do some neat tricks like simulate 64-bit numbers with two 32-bit CPU registers if you know what you're doing. The problem here is that GBE+ wasn't calculating the Carry Flag correctly in one weird edge case. Two numbers were being added together, 0x00000000 and 0xFFFFFFFF. Normally, adding those two together results in 0xFFFFFFFF, but in this scenario the Carry Flag was set. So the equation essentially becomes 0xFFFFFFFF + 0x1, which evaluates to zero (overflows 32 bits) and sets the Carry Flag. GBE+ was supposed to catch this, but it didn't. The interpreter correctly set the results, but the code that evaluates all of the CPU flags that need to be changed after arithmetic operations forgot to include the Carry Flag as part of the inputs!
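To make the edge case concrete, here's a small Python sketch. None of these function names come from GBE+; `adc_buggy_flags` is my reconstruction of the kind of mistake described above, where the result includes the incoming carry but the flag check doesn't.

```python
MASK32 = 0xFFFFFFFF

def adc(op1, op2, carry_in):
    """Return (result, carry_out) for a 32-bit add-with-carry."""
    full = (op1 & MASK32) + (op2 & MASK32) + (carry_in & 1)
    return full & MASK32, int(full > MASK32)

def adc_buggy_flags(op1, op2, carry_in):
    """Same result, but the carry-out check ignores the incoming carry."""
    result = ((op1 & MASK32) + (op2 & MASK32) + (carry_in & 1)) & MASK32
    carry_out = int((op1 & MASK32) + (op2 & MASK32) > MASK32)
    return result, carry_out

def add64(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values stored as pairs of 32-bit halves."""
    lo, carry = adc(a_lo, b_lo, 0)
    hi, _ = adc(a_hi, b_hi, carry)
    return lo, hi
```

With inputs 0x00000000, 0xFFFFFFFF, and a set Carry Flag, `adc` returns a zero result with the carry set, while the buggy version returns the same zero result with the carry clear. `add64` shows the 64-bit trick: the carry out of the low halves feeds the addition of the high halves.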
That's actually quite a low-level bug. Worse, GBE+'s GBA core had the same bug! It's actually a bit of an oversight on my part. I had fixed this behavior years ago for other instructions like SBC and RSC, but totally neglected ADC. Whoops... And the thing is, no GBA game seemed to notice or care, so it went undetected all this time. Nothing ever exploded in GBA games, at least none that I ever attributed directly to this slightly broken implementation of mine. It's a wonder how I got so far all of this time. Luck? Better late than never when it comes to fixing bugs like these, I suppose.
At any rate, what I thought was a simple graphical error turned out to be much, much more. Fixing Cory in the House's title screen also fixed other games as well. Metroid Prime Pinball froze at the first logo screen, but now it's playable. Guilty Gear: Dust Striker was one of the earliest DS games to boot in GBE+, but it always got stuck before heading to the title screen. Now (despite some graphical glitches) it too is playable. Polarium refused to get past the first logo screen, but now it seems playable as well. The funky tiles in Zoo Keeper are fixed as well. And the strange score glitches in both Game & Watch Collection titles are gone. Not bad.
December 27, 2019
What a crazy month it's been. GBE+ keeps getting more and more games to at least boot and go in-game. I'm going to have to implement save states soon since they're such a boon for debugging purposes. I had been holding off on tackling save states because a lot of stuff in the NDS core was in flux, but now that things are coming together nicely, I believe a lot of components in GBE+ are mature and won't significantly change. Now that a number of games boot and go in-game, bugs and glitches occur much later than just the first few screens. For a game like Luminous Arc, which has a pretty lengthy intro (4-5 minutes long), that makes it hard to debug why the game freezes before the first battle.
I want to talk about Super Princess Peach. It's a favorite of mine, and I'd really like GBE+ to play it. Until recently, it kept freezing after the title screen. It would start trying to access out-of-bounds memory thousands upon thousands of times, endlessly. After tracking down the source, I saw that one bit of the game's code had a bizarre instruction that placed the Program Counter at some wild address. That instruction came from part of the game that copied other code into a particular area of RAM for later execution. The thing was, GBE+ was making the game load the wrong code. Looking at the instructions that handled the copy operation, I saw that it was writing more data than it should have. The game used one CPU register (R1) to act as the start address to copy from and another (R2) as the final address to copy from. R1 was being set way too high. The game calculates the start address via a multiply instruction in ARM mode, and that's where things got complicated.
For the ARM9 CPU in the NDS, the Carry Flag is not affected when using multiplication instructions, but for the ARM7 CPU, the Carry Flag is supposed to be corrupted or destroyed. Essentially, the Carry Flag on that CPU should be considered garbage and unreliable immediately following a multiplication instruction. I suppose internally, the Carry Flag on ARMv4 architectures might get used as scratch or something during the multiplication operation. So code running on the ARM7 shouldn't rely on the Carry Flag after multiplication until it's certain that another operation has affected it. No commercial game would dare to run crucial code without taking that precaution, right? WRONG! Observe the following code from Super Princess Peach:
0x37f8f64 CMP{P} R3, R12 LSL 0x00 //Sets Carry Flag (and unsets Zero Flag) for this specific problem
0x37f8f68 LDRHI R2, [R8 + 0x14] //The following 3 instructions then execute (Carry Flag is set AND Zero Flag unset)
0x37f8f6c MOVHI R0, 0x18 ROR #0
0x37f8f70 MLAHI R1, R12, R0, R2 //Multiplication potentially affects Carry Flag! (and Zero Flag!)
0x37f8f74 LDRLS R2, [R8 + 0x14] //The following 3 instructions only execute if Carry Flag is unset (OR Zero Flag is set)
0x37f8f78 MOVLS R0, 0x18 ROR #0 //The 1st MLA instruction *could* cause the 2nd one to run unintentionally!
0x37f8f7c MLALS R1, R12, R0, R2 //Only 1 MLA instruction should run, otherwise address calculation is messed up.
ARM code is executed conditionally, which means instructions can be skipped depending on what CPU flags are set. Here, the relevant condition codes are HI and LS. Those effectively translate into "Is the Carry Flag set and Zero Flag unset?" and "Is the Carry Flag unset or Zero Flag set?" respectively. Here the issue is with the Carry Flag specifically, so ignore the Zero Flag going forward. The Carry Flag will get set or unset by the CMP instruction; for the context of the problem it was always set (with R3 > R12 at the time of debugging). All of those -HI instructions execute in this case, but notice how the last instruction is a multiplication operation (MLA). That means the Carry Flag = whatever. Could be anything. And right after that are instructions that conditionally execute based on the status of the Carry Flag! If those -LS instructions are executed right after the -HI instructions, it messes up the address calculation the game uses to copy code to RAM.
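The interaction is easier to see as a tiny simulation. This Python sketch only models the two condition codes and the order of the two MLA instructions; `run_snippet` and its `carry_after_mla` parameter are hypothetical stand-ins for whatever the emulator chooses to do with the Carry Flag after the first multiply, and the Zero Flag is left alone for simplicity.

```python
def cond_hi(carry, zero):
    """HI: Carry Flag set and Zero Flag unset."""
    return carry and not zero

def cond_ls(carry, zero):
    """LS: Carry Flag unset or Zero Flag set."""
    return (not carry) or zero

def run_snippet(carry_after_mla):
    """Return which MLA instructions execute.

    carry_after_mla is what the emulator leaves in the Carry Flag after
    MLAHI: True, False, or None to leave the flag untouched.
    """
    carry, zero = True, False  # state after CMP with R3 > R12
    executed = []
    if cond_hi(carry, zero):
        executed.append("MLAHI")
        if carry_after_mla is not None:
            carry = carry_after_mla
    if cond_ls(carry, zero):
        executed.append("MLALS")
    return executed
```

Forcing the Carry Flag to zero after the multiply makes both MLA instructions run, which is the broken address calculation. Leaving the flag untouched runs only the first one.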
None of the bits of ARM7TDMI documentation I have on hand explains exactly how the Carry Flag gets corrupted, so I always just set it to zero, figuring that games would never be so bold as to not set the Carry Flag after a multiplication instruction. But no, there it is, by Nintendo's own hand no less. Perhaps the developers just got lucky here. Maybe they actually knew something about how multiplication destroyed the Carry Flag and made sure the input values would never cause the Carry Flag in the first multiplication instruction to become unset. It's dumb either way, so for now, I'm no longer touching the Carry Flag in ARM7 multiplication instructions. The Carry Flag is supposed to be meaningless anyway for those instructions, but setting it to zero has proven unsafe for at least one game. I notice that many emulators (mGBA for example) do this for ARM7TDMI emulation, so there must be something to this hands-off approach.
Anyway, with that small change, Super Princess Peach goes in game and is largely playable (with graphical issues, none particularly game-breaking though). The real reason I'm very excited about this progress is that it signals a new phase for GBE+'s NDS core: emulating additional hardware! The NDS has a lot of extra stuff, just like the DMG, GBC, and GBA before it. I can't wait to take a crack at some of them. The Rumble Pak is low-hanging fruit, as far as I'm concerned. Simple to implement, and it can be used right away in a game like Super Princess Peach. Who knows. It may not be long before I start tackling obscure NDS hardware for the Edge of Emulation articles.
December 15, 2019
As I mentioned last time, touchscreen input was messed up in GBE+. Homebrew read the coordinates just fine, but commercial games choked hard for some reason. This was a long-standing problem that really got on my nerves. Finally, after hours of debugging, I pinpointed the culprit. I reasoned that it couldn't have been something related to how I emulate the SPI bus, since all of the reads from the touchscreen looked good. As I found out, commercial games were erasing some areas of RAM where the calibration data for touch input is stored. While homebrew never edited those values, commercial games appeared to zero them out, which in turn caused the game to think all touch input came from the coordinates (0, 0).
Evidently as part of some SDK used to build games, the boot process calculates a CRC16 value for 0x70 bytes of user settings (name, birthday, calibration data, etc) stored in firmware. That data is later copied to RAM once the NDS firmware is finished and hands control over to the game (GBE+ high-level emulates this part for now). The CRC16 value the game generates should be the same as one stored in the NDS firmware as well. The game compares both, and if they don't match it wipes the RAM area where the calibration data should have been copied. GBE+ wasn't putting any CRC16 value in the HLE firmware, so that was one problem. Additionally, some games (presumably using different versions of whatever SDK Nintendo provided) manually calculate the CRC16 value and others use the NDS BIOS for that. GBE+ used an outdated CRC16 algorithm (the one described in GBATEK), but I noticed it was nothing like what the game's code was computing. A quick peek at Desmume cleared things up, so now the HLE BIOS replacement in GBE+ correctly figures out the CRC16 value. Solving those two issues let every game that currently boots properly use the touchscreen. As a result, Magical Starsign is more or less fully playable now, whereas before it could get no farther than the title screen.
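The check-and-wipe logic looks roughly like the Python sketch below. The exact CRC16 routine the games use isn't reproduced here: as a stand-in I'm assuming the common reflected CRC-16 with polynomial 0xA001 and an initial value of 0xFFFF, and `validate_user_settings` plus its parameters are made-up names rather than anything from an SDK or from GBE+.

```python
USER_SETTINGS_SIZE = 0x70  # bytes of user settings covered by the checksum

def crc16(data, initial=0xFFFF):
    """Bitwise reflected CRC-16, polynomial 0xA001 (assumed for this sketch)."""
    crc = initial
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc & 0xFFFF

def validate_user_settings(ram, offset, stored_crc):
    """Keep the copied settings if the checksum matches, otherwise wipe them."""
    block = ram[offset:offset + USER_SETTINGS_SIZE]
    if crc16(block) == stored_crc:
        return True
    # Mismatch: zero the area, which also destroys the touch calibration data.
    ram[offset:offset + USER_SETTINGS_SIZE] = bytes(USER_SETTINGS_SIZE)
    return False
```

An HLE firmware that never stores a checksum always takes the wipe path, so every touch ends up at (0, 0) even though the raw touchscreen reads are fine.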
During this time while I was fooling around, I managed to get Retro Atari Classics to boot and reach the title screen and some menus. Somewhere along with my fixes described above, the game stopped working. I took a look, and it seemed that the game got stuck reading some values from firmware. That is to say, it stopped reading from firmware even when it should have kept reading lots more. Poking around the game's code, I noticed that the ARM7 CPU suddenly executed a bizarre instruction seemingly out of nowhere right before it should have read from firmware. Tracking where the rogue instruction came from, I saw that the ARM9 put it there. I referenced no$gba's debugger and confirmed that the value was supposed to be at that memory location, but it was supposed to be written to the NDS's DTCM, basically a fast chunk of memory for data accesses. The address the ARM9 wrote to overlapped regular RAM, but GBE+ was supposed to treat the DTCM as a separate entity. Turns out GBE+ wasn't doing a thorough job, however. By mistake, GBE+ would write to the DTCM and regular RAM. Whoops! Restricting those writes to the DTCM only when relevant, and making sure the DTCM couldn't be used to execute code, fixed Retro Atari Classics and evidently a host of other games.
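In sketch form, the fix amounts to routing ARM9 data accesses to exactly one place. The Python below is a simplified model with made-up names (`Arm9Memory`, `write8`, `fetch8`); the DTCM base is passed in because games configure it themselves, the 16KB size is my assumption, and a dictionary stands in for regular RAM.

```python
class Arm9Memory:
    """Hypothetical ARM9 memory map with a DTCM overlapping regular RAM."""

    def __init__(self, dtcm_base, dtcm_size=0x4000):
        self.dtcm_base = dtcm_base
        self.dtcm = bytearray(dtcm_size)
        self.main_ram = {}  # sparse stand-in for regular RAM

    def in_dtcm(self, address):
        return self.dtcm_base <= address < self.dtcm_base + len(self.dtcm)

    def write8(self, address, value):
        if self.in_dtcm(address):
            self.dtcm[address - self.dtcm_base] = value & 0xFF
            return  # the bug was carrying on and writing regular RAM too
        self.main_ram[address] = value & 0xFF

    def read8(self, address):
        """Data read: the DTCM takes priority over whatever it overlaps."""
        if self.in_dtcm(address):
            return self.dtcm[address - self.dtcm_base]
        return self.main_ram.get(address, 0)

    def fetch8(self, address):
        """Instruction fetch: the DTCM is data-only, so it is bypassed."""
        return self.main_ram.get(address, 0)
```

Since the ARM7 shares regular RAM but never sees the ARM9's DTCM, a data write that lands only in the DTCM can no longer show up as a stray instruction on the ARM7 side.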
And now for today's showcase of newly working games. Obviously Retro Atari Classics boots. Kirby Canvas Curse boots as well! It goes in-game to menus and almost reaches the first level before hanging. Yoshi's Island DS boots too and goes through some menus. The intro scenes play for quite a while before hanging. A lot of garbage is displayed on-screen. I think my HLE implementations of some of the NDS BIOS decompression functions may need more work. They were copy+pasted from my GBA core, and I'm not certain if it makes a difference that the NDS added callback functionality to many of them. The title screen is by far the most trippy thing I've seen yet in GBE+'s NDS emulation. Mr. Driller boots, goes to the menus, and goes in-game. Although it's pretty slow due to bad CPU timings, and there are some pretty heavy graphical glitches in the menus, the drilling mode itself works! Pretty fun game, if I must say. Game & Watch Collection 2 boots and plays beautifully, thanks to all the work I did on alpha blending and fixing those DTCM bugs. Last and most certainly least, Lunar: Dragon Song boots and seems to be fully playable. The game is so terrible though, I don't think I'll ever find out if GBE+ can run it from start to finish...
This is the kind of progress I wish I had made way back in 2017. Better late than never, I suppose. I'm quite excited to see what I can get running next and what else I can improve. A few games can actually be played to an enjoyable degree. It feels like I had to drag myself every inch just to reach this much progress, but I'm finally here and achieving my dream.
December 11, 2019
Nothing major has changed in terms of compatibility, but the Cart DMA from a few weeks ago is a really tough act to follow. Instead, I've been busy trying to gradually improve some games. I'd like to showcase a few games that actually run pretty well in GBE+. The first is Zoo Keeper. Even before the Cart DMA was implemented, it ran in a playable state. I'm not talking just getting to menus, it actually plays the game. Some sound works, like the digitized voice when going through menus, and a super slow version of the background music with SHARP and LOUD notes every now and then. Sound is something I'll have to seriously work on later. For now, I'm focused on getting games up and running. While a number of modes in Zoo Keeper work, some just crash, and the ones that do work have tiles that shouldn't be there. Still, taken together, everything works more than it doesn't.
Another title that had been working before the Cart DMA was Intellivision Lives! It's a collection of old games for a console released before I was even born, but some of them are actually kinda fun even in this day and age. This game actually got me interested in working on emulating the NDS' window functionality in the 2D engine. The games need to use the window to cutout a certain portion of the screen and display another background layer (the one where the in-game action takes place). So, getting this game playable only needed some adjustments to LCD rendering, and just like that, another commercial NDS game works. Since CPU timings are completely off in GBE+'s NDS core at the moment, the games actually run slower than they're supposed to, but many run just fine overall.
To my surprise, The World Ends With You is now booting with the Cart DMA fixed. Unfortunately, it stalls when it tries to write the initial save file. I got past this by grabbing a save from Desmume, and GBE+ makes it to the title screen. Music plays, but it's super slow and super creepy sounding. GBE+ can't go in-game just yet because the game relies on the touchscreen to do that much. GBE+'s touchscreen functionality strangely works fine with homebrew, but fails in every commercial game. The X/Y coordinates get messed up, although the actual tap itself gets registered, just at (0, 0). It's something I'll have to debug soon, as I have a handful of other games that are probably very playable in GBE+ but get stuck at title screens that need touch input. Still, it's awesome to see one of my favorite games getting this far!
Last but not least is Kirby Super Star Ultra. This game doesn't need Cart DMAs but I hadn't tested it until recently. It takes a looooong time to set up the initial save file (again, due to poor CPU timings in GBE+) but it gets along pretty well after that. The menus can be navigated, and the main game starts up. Apparently there's something wrong with GBE+'s EEPROM save handling, so I had to import another save from Desmume to get Spring Breeze working. Once there, things look great... until Kirby starts moving around. A bug of some sort causes Kirby to launch into the sky, instantly killing him every time he jumps. The minigames don't crash, but they all rely on the touchscreen, which needs work as I said above. Some minor graphical glitches need to be sorted out too. There's currently no sound anywhere. I'm still very happy to see this running so well though. To get this far feels incredible.
I think the remainder of my efforts this year will be on fixing touchscreen input and developing new homebrew tests for NDS hardware. I haven't had a great deal of time to work on those tests, and hopefully they'd reveal many shortcomings in GBE+ as well as a way to fix them.
November 29, 2019
Wow, another update to this blog, 3 days later instead of 6 or 9 months? I'm serious about making decent headway into NDS emulation. It seems once I stop telling myself how impossible it is for GBE+ to be a working NDS emulator, I somehow start to tackle issues and roadblocks that I've been stuck on for years. When people tell you about the power of positive thinking, they aren't joking. Anyway, I've gotten a number of games now booting and running. Some of them have glitchy graphics, a handful freeze, but most importantly, they at least do something rather than nothing. It's proof to me that I'm not a hack who's wasting their time, proof that I'm not running around in circles anymore.
The biggest change in the past few days has been to the way GBE+ handles Cart DMAs. The ARM9 and ARM7 CPUs have special Cart DMA modes that will pull data from the ROM to another area of memory, which is much more convenient than requesting the data manually. When requesting it manually, the game essentially grabs one byte at a time from a set memory location, and then has to copy it again to wherever it needs to. With a DMA, on the other hand, the data can be copied all at once, as far as the software is concerned. It's no surprise that many games rely on the Cart DMA, and without implementing that, an emulator has no hope of properly running those titles.
The thing about the Cart DMA is that it doesn't quite work immediately. Once a CPU writes to the DMA MMIO registers and enables a Cart DMA, the actual transfer doesn't happen until it also initiates a transfer on the cartridge SPI bus. Effectively, there is a delay or pause, much like when other timed DMAs (HBlank and VBlank for example) are enabled. This behavior was what messed up GBE+. I was under the assumption that the Cart DMA happened once the CPU wrote to DMA MMIO registers. I wish GBATEK covered the Cart DMA in more detail; I was confused about Cart DMAs until I used the debugger in no$gba to see what was really going on. Once I changed GBE+ to reflect the delay for the Cart DMA, magical things started happening. Games that had refused to boot for the longest time suddenly came to life! Not just the fading publisher/developer logos, but title screens too!
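To make the "arm now, copy later" idea a bit more concrete, here's a rough sketch of how that delay could be modeled. This is not GBE+'s actual code. Every name here (CartDMA, System, write_dma_control, start_cart_transfer) is made up for illustration, and real hardware has plenty of details (word counts, address control, interrupts) that are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a delayed Cart DMA.
// None of these names come from GBE+ or GBATEK.
struct CartDMA
{
	bool armed = false;     // Set when the CPU enables the Cart DMA via MMIO
	std::size_t dest = 0;   // Destination index into RAM (in 32-bit words)
	std::size_t words = 0;  // Number of 32-bit words to copy
};

struct System
{
	std::vector<std::uint32_t> rom;
	std::vector<std::uint32_t> ram;
	CartDMA dma;

	// Writing the DMA control register only arms the DMA. Nothing is copied yet.
	void write_dma_control(std::size_t dest, std::size_t words)
	{
		dma.armed = true;
		dma.dest = dest;
		dma.words = words;
	}

	// Starting a cartridge transfer is what actually kicks off an armed Cart DMA.
	void start_cart_transfer(std::size_t rom_offset)
	{
		if(!dma.armed) { return; }

		for(std::size_t i = 0; i < dma.words; i++)
		{
			// Stay inside both buffers
			if(((rom_offset + i) >= rom.size()) || ((dma.dest + i) >= ram.size())) { break; }
			ram[dma.dest + i] = rom[rom_offset + i];
		}

		dma.armed = false;
	}
};
```

The mistake I described amounts to doing the copy inside write_dma_control() instead of waiting for start_cart_transfer().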
The first game I tested was one of my all-time favorite NDS games: Super Princess Peach. I was using this game as a benchmark for the Cart DMA, logging everything it tried to do in no$gba and comparing what GBE+ did. I was blown away when I saw the bright and colorful graphics pop up, and with my first try at re-implementing the Cart DMA (no debugging whatsoever). Unfortunately, it doesn't go in-game just yet. One of the emulated CPUs in GBE+ takes a wrong turn, and the Program Counter starts trying to pull instructions from oblivion. I'll have to find out what exactly is wrong, but getting this far along absolutely made my weekend. If anything, this is motivation to keep pressing forward. I am going to play this game in GBE+, one way or another.
For some reason, Burnout Legends gets the farthest out of all the games that now boot with GBE+'s new Cart DMA code. A lot of the menus can be navigated even though the backgrounds are weird and the text isn't always readable. None of the races work, but I doubt there would be much to look at anyway, since hardly anything 3D works yet. Tales of the Tempest barfs up seizure-inducing screens when it tries to play the animated video intro, but pressing some of the buttons gets past that and at least brings up half of the title screen. TotT is not a particularly good Tales game, but it is the game that made me want to dip my toes into NDS emulation. Back then, no emulator supported increasing resolutions. It must have been 2011 when I was out of school and rediscovering my love for emulation but also being disappointed at the state of NDS emulation. Being a huge JRPG nerd, this was the game that I wanted to play the most on the NDS at the time, but it just looked so damn ugly, even with filters like HQx. My dismay eventually brought me here. Even though Desmume and melonDS both do HD resolutions, it's still my dream to make my own emulator that can do it as well. One step closer I guess.
Castlevania: Dawn of Sorrow was a nice surprise. It just shows a single background image then freezes, but it's great to see it getting further along now. GBE+ can boot a few more games not listed here, but those have been working for a while now and aren't related to the most recent changes made to the code. I'll try to showcase them later, once even more progress comes. The great thing about getting more games booting is that I find more problems that need to be fixed. Fix those, and more stuff works, the emulator gets a bit better, and I find even more problems to solve. It's a difficult cycle, but it always means going forward in some way.
November 26, 2019
It's been another crazy year, emulating all sorts of exotic Game Boy hardware. 2019 has been mostly about GBA stuff (Soul Doll Adapter, Battle Chip Gates, Multi Plus On System, Turbo File Advance). After having conquered the Turbo Files, I finally find myself with some spare time for the NDS. Sometimes, it honestly feels like punching a brick wall. I'm always looking for even the slightest crack to make even the tiniest bit of progress. Some days, it feels like I could go on debugging things all night, come up with hundreds of megabytes of logs, reams of notes, and still feel like GBE+ is nothing more than a broken mess when it comes to NDS emulation. It can be very disheartening, especially when I see other people seemingly will working NDS emulators into existence in just a couple of weeks. Focusing on emulating weird Game Boy hardware and accessories kind of gives me an excuse to not work on NDS emulation, especially when it just seems so frustrating.
But recently, I've started telling myself I should stop looking at it as a problem I'll never conquer and instead view it as a challenge. I do like challenging tasks. It's what keeps me interested in my Game Boy research. So, I've decided to get myself together and really take a good crack at shaping up the NDS core in GBE+. It's always been a dream of mine. And I think I'm actually, genuinely pumped up about working on it now. Before it was something I kept putting off, even dreading at times (for fear of failing mostly). But now my attitude is totally different. I'm excited to take on the job ahead of me, even though it'll be tough. I must admit, I could probe add-ons and accessories all day long, but for some reason, the NDS kicks my ass. A lot. It's not going to be easy, but even so, I'm ready to stop being so damn hesitant when it comes to NDS emulation.
So, after being inspired, I couldn't let my new-found positive energy go to waste. Back in March, I finally got some commercial software to boot and run. One of those titles was Game & Watch Collection. That game was more or less "working", if you stretch the definition of that word. GBE+ could reach each of the 3 minigames, but they were unplayable due to several graphical issues...
The graphical glitches fall into 2 categories: alpha-blending between the various backgrounds, and using the correct palettes for them. The game uses a clever mechanism to mimic the old LCD screens of the Game & Watch handhelds. Instead of using a bunch of individual sprites, it only uses a couple. Those are the solid black ones you should normally see moving around the screen. Everything else is actually a second BG layer that's alpha-blended. Active sprites just get drawn over the correct location whenever that part of the G&W LCD is supposed to be activated. Getting the blending working requires emulating the special effects part of the NDS 2D engine. It's basically the same as the one from the GBA with only a few edge cases to consider (not for this game though).
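For reference, the GBA-style blend itself is pretty small once you know which two pixels to mix. Here's a minimal sketch, assuming 15-bit BGR555 colors and the usual 0-16 blend coefficients (EVA for the top layer, EVB for the bottom). The function name is made up, and this is only the math, not the layer tracking or how GBE+ actually does it.

```cpp
#include <algorithm>
#include <cstdint>

// Blends two 15-bit BGR555 colors, one 5-bit channel at a time:
// result = min(31, ((top * eva) + (bottom * evb)) / 16)
// Coefficients above 16 are treated as 16.
std::uint16_t alpha_blend(std::uint16_t top, std::uint16_t bottom, std::uint32_t eva, std::uint32_t evb)
{
	eva = std::min<std::uint32_t>(eva, 16);
	evb = std::min<std::uint32_t>(evb, 16);

	std::uint32_t result = 0;

	// Channels sit at bits 0-4, 5-9, and 10-14
	for(int shift = 0; shift <= 10; shift += 5)
	{
		std::uint32_t a = (top >> shift) & 0x1F;
		std::uint32_t b = (bottom >> shift) & 0x1F;
		std::uint32_t c = std::min<std::uint32_t>(31, ((a * eva) + (b * evb)) >> 4);
		result |= (c << shift);
	}

	return static_cast<std::uint16_t>(result);
}
```

Blending white over black at 8/16 each, for example, gives every channel a value of 15. The hard part is the bookkeeping around it: knowing which layer produced the top pixel and which one sits right underneath.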
I'd been slacking and putting off work on implementing the special effects because it involves quite a bit of work in regards to keeping track of what pixel was rendered by what layer. Once I stopped being lazy, I managed to come up with a decent system that's much cleaner and simpler than what I'd done with my GBA core years ago. Getting alpha blending working fixed Oil Panic, but Donkey Kong and Green House were still a mess. Somewhere along the line, the bottom part of the screen on Donkey Kong went all black. Those backgrounds needed to properly get the palette data from specific parts of VRAM that can be shuffled around. My initial code to handle VRAM banking was a bit naive, as I didn't quite understand how slots within a VRAM bank work on the NDS. I thought there were only 4 slots total, and they switched based on context, but there are actually multiple slots and they're associated with set banks. Once I got all of that sorted, everything could play just fine.
So with that, I managed to get my first commercial NDS game to play mostly without any glitches! All it took was a little effort and time to sit down and get down to business. This was a major boost of confidence for me, proof that I can make GBE+ into a useable NDS emulator someday. That day is coming sooner rather than later if I keep up this pace. I've found a handful of other commercial games that boot in GBE+ (without exploding), so I'm going to pick them apart and slowly work my way up from there. I feel like I've finally got a good foothold, so now I can climb up and keep improving things.
March 18, 2019
As I have said before, this rolling blog is NOT dead. In fact, it's about to get super NOT dead from now on. Previously, my biggest achievement was getting some demo software to run in GBE+. It was the first professionally developed software to run in my emulator. I felt really good about that progress, but I still hit a wall. I never got around to perfecting that Z-buffer I talked about (it works, kinda sorta, but needs more love) and I was still stuck basically. I did work on some more NDS hardware tests, but I still wasn't booting retail games. Black screen. Black screen. BLACK SCREEN! That's all GBE+ ever showed.
Having seen the same stupid, blank screen for years, it gets really demotivating despite all the work I've put into GBE+, like I'm not good enough of a programmer to fix it or something. I'd rather work on something that gives me results, as in graphical output. Results feel good. They're progress, a reward for all the time spent carefully writing code, a pat on the back for all the studying and research it takes to make an emulator function. It's quite simply positive feedback. Without that, I just didn't feel like dedicating time to NDS emulation when it kept shutting me down. However, I am also loath to give up such a personally important goal. So every now and then I take a crack at booting commercial NDS titles.
I had long assumed that there was something not quite right about the way I emulated the way the NDS interfaces with the game ROM. Unlike the Game Boys, the NDS doesn't readily map the ROM to addressable memory. During boot, it can pre-load some data (up to a few megabytes total) from the ROM for the ARM9 and ARM7 CPUs, but accessing additional ROM data involves sending commands via a SPI bus and copy+pasting a handful of bytes at a time to RAM. It's not too complicated, but I had no good way of testing if my implementation was correct or not. The best I could do is compare debugging output from another emulator with GBE+. That's easy enough since I keep a copy of Desmume I modified just for debug logs, but I also needed a test case that did not make extensive use of this ROM interface, to keep the amount of data I needed to analyze to a minimum.
I turned to Game and Watch Collection. It's like 1.3MB total for the relevant ROM data (the padding can be ignored altogether). Unlike the demos, which shove all of their data into RAM via that pre-loading method, Game and Watch Collection used the ROM streaming interface. This made it a perfect test case. I compared GBE+'s results to Desmume's and I found that I was making a silly "off-by-one" error in the ROM transfer lengths. They were 4 bytes too short, terminating before the game logic expected, thus launching it into a perpetual loop where it halted and froze, ultimately giving me that evil, vile black screen of despair. I also messed up ARM7 hardware timers thanks to a typo. Also forgot to add a line of code some time ago for the chip ID used for NDS game carts. Fixing all 3 issues suddenly gave me this:
Yes, that's right! An error screen! A glorious, wonderful, unparalleled error screen! Who cares if it means some save data stuff isn't emulated correctly? It's not an entirely black screen, it's not blank. It's graphical output. Getting that far meant I was on the right path, more right than I'd ever been so far in my journey. I wondered what other games would run or boot in GBE+. To my surprise, Digimon World Dusk and Guilty Gear: Dust Strikers started showing me screens as well. Digimon even went in-game with a few more adjustments. And with some more work on game saves, I got Game and Watch Collection to boot and go in-game. Guilty Gear still needs better save support, but I have the foundations all set.
I'm 110% excited. This is the moment I've wanted for a loooooong time. Now I finally made it. I can only go up from here, and I'm very much looking forward to working on NDS emulation more and more from now on.
November 20, 2018
This rolling blog is NOT dead. Just got sidetracked between 70+ hours of Octopath Traveler and reverse-engineering Mobile Adapter GB games. The Mobile Adapter was a "must finish by end of year" kinda project, at least as far as supporting internal servers for two games. Man, that took a ton of time and energy, and left no space in my life for all things NDS emulation related. Also, been kinda bummed out that 4 years after starting GBE+, no professionally developed software actually ran in the NDS core. Seeing something that Nintendo made themselves boot up in your own emulator has proven elusive for me. Well, until last week that is. Some demo software is running in GBE+'s NDS core. Finally, something besides homebrew works! Had a number of errors in the 2D graphics engine, minor oversights that led to a handful of glitches. My experimental branch dealing with better NDS timings didn't prove as useful as I'd hoped; there are still a bunch of things GBE+ just isn't getting right. But, I'm super excited to start moving away from homebrew, even just a little!
I've started a repo on GitHub for NDS hardware tests. Hopefully that will guide me in pinpointing GBE+'s faults on a low level. Pretty sure the main hang-ups these days are related to reading from the game cart, and probably the timer interrupts. We'll see what happens. In the meantime, I've made a list of things I want to tackle. No more being lazy. Time to really dive into this stuff. To that end, in addition to all of those 2D fixes, today I finished making GBE+ service NDS interrupts via HLE.
This isn't as fancy as it sounds. Whenever an interrupt is fired, the CPU switches modes and jumps to a programmable address where that interrupt is handled by more game code. Once the interrupt is done, it jumps back to whatever code the game was doing before the interrupt fired (typically). The whole "jump here to start, jump back to finish" part belongs to code from the NDS BIOS. For simplicity, it's far easier to tell GBE+ to run that code rather than do the jump manually. However, interrupts are an essential part of most NDS games, so to run them properly, GBE+ needed the NDS BIOS files present as well. This was particularly annoying because GBE+ would complain if the BIOS weren't found, or it would freak out when trying to run NDS software without the BIOS. So I finally had enough and added the ability to manually handle NDS interrupt servicing.
Like I said, it's nothing much. Just push or pop a couple of registers around the stack, calculate the jump address, and set the CPU modes appropriately. I'd done the same thing in the GBA core, and it wasn't too different this time around. Interestingly enough, the NDS7 and NDS9 BIOS differ ever so slightly from the GBA BIOS when it comes time to service interrupts. NDS7 is most like the GBA, changing only 1 instruction, and even then, it's basically the same operation (mov r0, 0x4000000) just with a minor change in the opcode. The interrupt handling vector on NDS7 is at 0x380FFFC instead of the GBA's 0x3FFFFFC. Thanks to the way memory mirroring works on the NDS, however, 0x3FFFFFC holds the same value as 0x380FFFC.
The NDS9 changes things up a bit more. The interrupt handling vector for that CPU is at the DTCM address + 0x3FFC. It floats around depending on wherever the DTCM is specified. To get this address, it has to ask the CP15 coprocessor. It does some quick masking (via bit shifting right, then left) and calculates the vector. Aside from all that, it's pretty much the same as the GBA. At any rate, I can stop worrying about the BIOS settings in GBE+ when loading up tests!
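Here's a small sketch of how those two vector addresses can be worked out. It's illustrative only, not lifted from GBE+. The function names are made up, and I'm assuming the DTCM base lives in the upper bits of the CP15 DTCM register (so the shift is by 12 bits) and that the ARM7's WRAM is a 64KB block at 0x3800000 mirrored up through 0x3FFFFFF.

```cpp
#include <cstdint>

// NDS7: the address holding the interrupt handler pointer is fixed
constexpr std::uint32_t NDS7_IRQ_VECTOR = 0x380FFFC;

// Assumption: ARM7 WRAM is 64KB at 0x3800000 and repeats up to 0x3FFFFFF,
// so any address in that range folds back onto the first 64KB.
std::uint32_t nds7_wram_mirror(std::uint32_t addr)
{
	return 0x3800000 | (addr & 0xFFFF);
}

// NDS9: the vector lives at the DTCM base + 0x3FFC.
// Assumption: the base occupies bits 12-31 of the CP15 DTCM register,
// so shifting right then left by 12 strips the low (non-address) bits.
std::uint32_t nds9_irq_vector(std::uint32_t cp15_dtcm_reg)
{
	std::uint32_t dtcm_base = (cp15_dtcm_reg >> 12) << 12;
	return dtcm_base + 0x3FFC;
}
```

With a DTCM register value of 0x0080000A, for example, the low bits drop off and the vector lands at 0x00803FFC. On the NDS7 side, folding the GBA's 0x3FFFFFC through the mirror gives 0x380FFFC, which is why both addresses read the same value.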
Next up on the list is more 3D software rendering. I feel like implementing a basic Z-buffer, then finally implementing color blending via vertices, and eventually texture support.
August 23, 2018
Okay, honestly been too long since I updated here. Been too long since I touched anything NDS related. Almost close to finishing up better timings for the ARM7 and ARM9 CPUs. It's basically a done deal, just need to test it and make sure nothing explodes. Got real distracted with the Pocket Sonar on the Game Boy. Also got real distracted with Xenoblade Chronicles 2, investing ~90 hours beating that game in my spare time. Anyway, recently started playing Fire Emblem the Sacred Stones on GBE+ (yes, I use GBE+ for personal recreation and emulation) and I forgot how much effort I put into getting GBA games up and running. I've mostly used GBE+ for DMG and GBC games, but GBA games are enjoyable in GBE+ too (in my biased opinion). Just kind of surprised me that I'd actually want to play something from start to finish in a program I wrote myself. Anyway, it got me motivated to hit the NDS again. One of these days, I want to sit down, relax, and start playing DS games in GBE+ for fun.
Today, nothing impressive, but I'm reworking the way the software renderer fills in polygons. Previously I had some wacky code that tried to figure out a polygon's upper and lower boundaries based on vertices. It was pretty obtuse, what with sorting vertices and being only able to fill half-way before needing to switch around a bunch of coordinates. Just ugh, a mess. It segfaulted to boot when it encountered some extreme conditions too. Not my best work, for sure, although I didn't expect it to be. I've never written a 3D software renderer, so I'm just going on instincts when it comes to design.
So I've been meaning to improve the polygon fill method (for solid colors, haven't even attempted textures or even blended colors). GBE+ has pretty decent code to draw the polygon's outline, so I figured the simplest way to fill the polygon is to use coordinates directly from the outline. This gave me nice, clean coordinates that were always safely within bounds of the screen (the outline drawing code handles that stuff much better than my old poly-fill code). Reduced a couple hundred lines down to a few dozen! Probably performs better too, but I haven't done any metrics. So far, I've only reworked the filling code for triangles, but quads and strips aren't that big of a jump. We'll see if I can't tackle blended colors next.
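The general idea of filling from the outline can be sketched like this. It's a simplified stand-in, not GBE+'s renderer. The names (Span, trace_edge, fill_triangle) are made up, I'm using a plain Bresenham line walk for the outline, and clamping X to the screen is just one way to keep things in bounds. Each scanline remembers the leftmost and rightmost outline pixel, and the fill just runs between them.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

constexpr int SCREEN_W = 256;
constexpr int SCREEN_H = 192;

// Leftmost and rightmost outline pixel seen on one scanline
struct Span
{
	int min_x = SCREEN_W;
	int max_x = -1;
};

// Walks one edge with Bresenham's algorithm and records the X extents per scanline.
// X is clamped to the screen so the fill can never write out of bounds.
void trace_edge(int x0, int y0, int x1, int y1, std::vector<Span>& spans)
{
	int dx = std::abs(x1 - x0);
	int dy = -std::abs(y1 - y0);
	int sx = (x0 < x1) ? 1 : -1;
	int sy = (y0 < y1) ? 1 : -1;
	int err = dx + dy;

	while(true)
	{
		if((y0 >= 0) && (y0 < SCREEN_H))
		{
			int cx = std::clamp(x0, 0, SCREEN_W - 1);
			spans[y0].min_x = std::min(spans[y0].min_x, cx);
			spans[y0].max_x = std::max(spans[y0].max_x, cx);
		}

		if((x0 == x1) && (y0 == y1)) { break; }

		int e2 = 2 * err;
		if(e2 >= dy) { err += dy; x0 += sx; }
		if(e2 <= dx) { err += dx; y0 += sy; }
	}
}

// Fills a solid-color triangle by tracing its 3 edges, then filling each scanline
// between the recorded extents. The framebuffer must hold SCREEN_W * SCREEN_H pixels.
void fill_triangle(std::vector<std::uint16_t>& fb, const int xs[3], const int ys[3], std::uint16_t color)
{
	std::vector<Span> spans(SCREEN_H);

	for(int i = 0; i < 3; i++)
	{
		int j = (i + 1) % 3;
		trace_edge(xs[i], ys[i], xs[j], ys[j], spans);
	}

	for(int y = 0; y < SCREEN_H; y++)
	{
		// Untouched scanlines have min_x > max_x, so this loop does nothing for them
		for(int x = spans[y].min_x; x <= spans[y].max_x; x++)
		{
			fb[(y * SCREEN_W) + x] = color;
		}
	}
}
```

Since a triangle is always convex, one min/max pair per scanline is enough. That also holds for convex quads, which is why they shouldn't be a big jump from here.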
June 7, 2018
Not much to say besides I've taken (yet another) serious attempt at fixing the timing of the ARM7 and ARM9 CPUs. I'm fairly certain GBE+ is capable of running commercial games with everything that's been implemented to date. It's just that the timings were faked initially when development first started. Now there's no getting around it. I have to get them at least semi-accurate. There's a certain margin of error that's pretty generous (e.g. Desmume and no$gba aren't 100% accurate regarding timing, but that's not needed to get games running), however, GBE+ is wildly out of whack.
To be honest, the timing code was copy+pasted from the GBA core since the ARM7 is virtually identical in the NDS, and in most regards the ARM9 is just a beefier, bigger brother to the ARM7 (it can actually be quite slower in practice thanks to how long it has to wait to access non-cached memory). Anyway, I only had a vague understanding of how the GBA was properly supposed to do timing, so as the NDS core evolved in GBE+ it became apparent that 2014 me had no idea how to do it right. Some ARM instructions in the GBA still don't have any timing at all. Thankfully most GBA games don't care, but I'll have to clean up that mess eventually too.
While half-assing the GBA worked, it won't cut it on the NDS. So I took the time recently to really sit down and understand how the ARM7 and ARM9 operate when accessing memory and go back and examine just what's happening with a given instruction's timing. Had to dig through some source code from Desmume so I could grasp some concepts (like wtf the cache is doing on the ARM9), but I think I finally get all of the concepts that confused me like 4-5 years ago. Now comes the kinda boring part, trudging through all the CPU interpreter code that executes instructions and reworking timing with a new design. It's definitely not "sexy", there's no whizz-bang graphics or sounds, but it's still work that has to be done. I'm starting on the ARM7 side (since it's simpler) and writing homebrew tests for timing. I can see lots and lots of tests coming my way. At least I figured out how to draw text quite efficiently in assembly, which makes printing variables and other values relatively easy.
May 15, 2018
I keep wanting to do another big article about NDS emulation, but I don't feel the time is right just yet. Maybe once I've actually gotten commercial games up and running in GBE+, I'll finish writing the article I've drafted about the NDS 2D engine. Plus, I'm still busy with the Edge of Emulation articles. So I figured it'd be nice to do something in-between. Nothing special, and much more laidback, and definitely something with a bit less fanfare. Something like a mini development blog based on whatever I happen to work on.
So, I'd ideally like to get NDS emulation stable in GBE+ by the end of this year. If I make one positive change to the codebase every day, something that measurably improves the emulator, I'll get there, no problem. Much better than trying to push several giant changes all at once. Less pressure and more manageable. That's really how I've approached GBE+ since 2014, and my GitHub activity reflects that. One commit every day. If you're always moving forward, eventually you gotta get there. Digressing, I need to get my head back in the game and get GBE+ running games. Time to focus (which is hopefully what these mini-blogs will help with.)
Today, focusing on 3D stuff. GBE+ has basic support for 3D rendering via software. It mostly just draws the lines of triangles and quads (no strips yet), and that's it, just colored outlines. It's not much, but it's something to start with. One annoying problem was that some libnds demos would instantly start "shrinking" all polygons by sending them away from the camera along the z-axis. Couldn't figure out what was going wrong just by looking at it, so I tried disabling a bunch of GX commands sent to the NDS 3D engine. None of them was the culprit, so on a hunch I guessed one of the matrices used to calculate positions was incorrect. Sure enough, the dedicated position matrix (and thereby the clip matrix) was wonky. Certain values kept increasing whenever the matrices were multiplied. Now, you'd expect values to change when something is multiplied (by something other than 1), but an identity matrix was supposed to be loaded right before the matrix multiplication. The values should have been consistent, not infinitely expanding.
The lesson I learned is never assume your code does what you think it does unless you check it up and down. My matrix had specific code to convert it into an identity matrix, or so I thought. I mistakenly believed it was clearing the entire matrix before adding the diagonal line of 1s. It added the 1s just fine, but it didn't erase the other elements. So every time matrix multiplication was called, my "identity" matrix still had values from the last multiplication! Literally took one line of code to fix (calling the clear() function I made for that very purpose) but mystery solved.
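In case the bug is hard to picture, here's a stripped-down reconstruction of it. This is not the real GBE+ matrix class. Matrix4 and its methods are stand-ins, and I'm using plain floats here instead of whatever number format the 3D engine really works with. The only difference between the broken and fixed versions is that one call to clear().

```cpp
#include <array>

// Hypothetical 4x4 matrix, stored row by row
struct Matrix4
{
	std::array<float, 16> data{};

	// Zeroes every element
	void clear() { data.fill(0.0f); }

	// Broken version: writes the diagonal but leaves stale values everywhere else
	void make_identity_buggy()
	{
		for(int i = 0; i < 4; i++) { data[(i * 4) + i] = 1.0f; }
	}

	// Fixed version: wipe everything first, then write the diagonal
	void make_identity()
	{
		clear();
		for(int i = 0; i < 4; i++) { data[(i * 4) + i] = 1.0f; }
	}
};
```

If a matrix already holds leftovers from an earlier multiplication, make_identity_buggy() keeps them, so the next multiply compounds them. That lines up with values that kept growing instead of staying put.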