some assembly required

So, long time huh. Time to write something, I guess.

Its hard to write again after a break, not least because so much as happened in the last three weeks that unless I write for another three weeks I’m never going to cover it all. So right now I’m not going to try. i’ll just write about what I’ve been working on, get caught up to some degree, and maybe come back to any other interesting stuff down the track. So here we go!

I’ve long threatened picking up a DS homebrew kit and doing something interesting with it. I finally snagged a M3 DS Real from the Monash post office, of all places. Its a cute little thing. I got a 4GB microSD card with it, so I should have enough grunt to do anything I could ever hope to do with it. The next thing is just to work out what that is. I have an idea of course, one probably rather predictable if you know me.

I have an interest bordering on an infatuation with the game Frontier: First Encounters; specifically the JJFFE variant. Its a great game, and the effort and style that went into making an old Windows game come to life on more modern systems still totally imprresses me. John Jordan took the game binary and run it through a custom disassembler that produced an assembler source file that would compile on whatever platform he chose. His disassembler also identified operating system calls for graphics, sound, input and filesystem access, which he then abstracted and reimplemented for DirectX, SDL, and whatever else. So now the game runs on (i386) Linux without issue as well as on modern Windows systems. He even fixed a heap of bugs. Thats great!

I’ve messed with this code at various times. I implemented the OS abstraction for AROS a couple of years before I got involved in AROS proper. (That work later led to me working on some graphics subsystem speedups and a SDL graphics driver for AROS). I’ve also long dreamed of somehow converting it to pure C so that it could be hacked on properly. I’ve dabbled with this at various times, both using automatic and manual techniques, but haven’t really got very far mostly because of because of the limited success others have had with the general problem of decompiling assembly back to C.

So anyway, I got a DS kit, and of course started to think about how cool it would be to play FFE on it, and also about how to take advantage of the dual screen and touch screen. I’ve been dreaming of interesting UI changes that would make the game work much better on the DS, but of course first I have to get the game working there. That is not a trivial task, and has been the subject of inquiry for the last couple of weeks.

The problem is obvious. The DS has a pair of ARM CPUs. The JJFFE source is in i386 assembly. So there are really only two options - some sort of emulation, or converting the code to a higher-level language and then recompiling it for ARM.

Emulation would only really require a processor emulator for the game core since all the systems interaction could be done in C, and perhaps would have been the easier option, it doesn’t help much with my eventual goal (or “hope”, rather) of making significant modifications to the code to support the DS hardware properly. So instead, I’ve again returned to converting the assembler code to C.

As mentioned above though, this is something I’d pretty much given up on as being too difficult. I thought about it for a while and realised as a first cut, I don’t need to convert it back to anything resembling its original C. Instead, what if I was write an assmbler that produced C code that implemented the processor instructions, rather than producing raw machine code. The result would look much like C - we’d essentially have a kind of static CPU emulator built into the program code itself, with global variables representing the processor stack and registers. But, it could be recompiled for another CPU, which is the point of the exercise.

This seemed like a reasonable approach, but writing an assembler is insanely complicated. After attempting a hacky parser in Perl, I decided that nothing short of a full assembler would be able to do the job. NASM proved too complicated to penetrate, but then I found YASM, which is a highly modular clone of NASM.

So I took YASM and started writing a custom object format, one that would output C code. However, after experimenting to gain some experience with the interface, I realised that I was just getting the raw machine code and then converting it to C with a little bit of symbol table gymnastics to identify and produce simple unadorned C functions. This reminded me of a project I worked on for a while in 2004 that turns out to be much better suited. That project is a custom disassembler/decompiler of the same kind of was used to produce JJFFE in the first place! Let me explain.

Another old game that I love is Transport Tycoon (actually its sequel, Transport Tycoon Deluxe). At the time, it was Windows-only. There was a project called TTDPatch which would take the server binary and hook all sorts of stuff into it to add new features and fix bugs and whatever else. This worked well, but it was still Windows-only. Wine did a reasonable job with it, but it was still less than ideal. So I decided that I’d give it the same treatment as FFE got, and produce a disassembly and system abstraction that could be run anywhere.

I spent a lot of time studying JJFFE and Jordan’s decompiler and even had a series of email discussions with him to get a feel for just how to do this. After several weeks I managed to get my decompiler to the stage where it produced a viable disassembly and C file of OS call stubs. But, as fate would have it, the day it compiled and ran for the first time (segfaulting of course, as I hadn’t yet learnt about how Windows uses the %fs segment register), OpenTTD was announced, which was essentially a conversion of the original game back to C. My decompiler had no further reason to exist, and so I abandoned it.

The way it worked was pretty straightforward. It implemented what is essentially a Portable Executable (ie the Windows binary format, like ELF for Unix) loader with calls into the NASM disassembler to analyse the code and produce a disassembly and a stub file. Simplified, it does the following:

Inspect the program binary and find the code, data, bss, import and relocation segments.
Load the program binary into RAM.
Apply the relocations to produce a complete program image, additionally creating a “label” for each relocation.
Inspect the import section to build a list of all the external libraries and functions that the program wants.
Disassemble the code segments to find all the relocation labels that are in use and what they point to. From the instruction type, we can determine whether the target is code, data, bss (ie unitialised data), a vector table, etc.
Disassemble from each code label to the next to find any other labels missed in the first disassmble run. That might have happened, for example, if there were “garbage” bytes in between the end of one function and the start of another that caused the wrong disassembly to be produced crossing the function boundary.
Do this dissambly over and over until no new labels are produced.
Run through any relocation labels that have not been processed yet, and make them data labels. This works off the assumption that if the linker thought it important enough to include a relocation, we should probably include whatever that relocation points to in the output, even though we couldn’t actually find it in the code.
Output EXTERN declarations for each external function name.
Disassemble from each code label again, this time producing actual output. Any memory references in the output (ie things beginning with 0x) get replaced with their corresponding label, if there is one.
“Disassemble” from each data label, producing raw byte output (ie db or dd). For any data that was referenced via a relocation, produce the corresponding label.
“Disassemble” from each bss label, producing a memory reservation in the output (ie `resb

Theoretically, the resulting output from that is just about usable. There’s a bit of manual cleanup that has to happen (like the mentioned deal with the %fs register), but this output should at least compile and link, which is most of the fun. Theoretically you implement the stubs for your platform and you’re away.

So, back to our original problem of producing C from a binary. I realised that in this code, I’d already done most of what I needed. I know where all the function boundaries, jump points, vector tables and data are. All that needs to happen is instead of producing formatted assembly, all I need to do is produce some equivalent bit of C code. There’s some complication of course, like the fact that sometimes several instructions map to a single C construct (like if or while) but I figure I’m most of the way there.

So right now, I’m working on cleaning up and generalise the decompiler, which I’ve christened opsoup. It was pretty heavily tied to the PE format before, which of course is no good for me - I need ELF. I’m not bothering with trying to keep it compatible with PE at this point, as I have a pretty specific purpose. I can always add that back in later if I ever need it.

I have absolutely no idea how this is going to go, but its fun finding out. In adition to playing my game, I’m hoping that having the code in C, even horrible not-quite-C, will make it much easier to gradually convert some of the code in actual nice C (due to the availability of things like gdb and printf). I don’t expect it to happen fast, but I’ve been hacking at this code on-and-off for the last five years, so messing with it for another five doesn’t really concern me that much.