Monday 27 October 2008

On me head son

I'm throwing huge lumps of code into the repository at the moment.  I got the first pass at the instruction emulation in and am now writing and executing unit tests.  I've already found and fixed a whole bunch of problems, especially in my condition code flag calculations.

Hooray for unit tests!

It is also incredibly satisfying to watch the tests complete successfully or to follow the operations step by step and see the instructions actually doing the right thing :-)

Saturday 25 October 2008

Grinding out a result

A week since the last post; how time flies. I've been adding instruction emulation and some unit tests this week, which has been both rewarding and a bit tedious at times. I've not had quite as much time as I'd hoped, so progress has been a bit slow, but it's moving in the right direction.

One of my goals in developing this project was to give myself the chance to play with some of the newer language features of Java. I learnt the Java ropes on versions 1.2 and 1.3 and did my certification back when Sun had just started calling it the "Java2 platform". Work hasn't afforded me any opportunity to explore the additions made in 1.5 and later.

Having a strong C and C++ background, I thought I knew an enum when I saw one. How wrong could I be? A Java enum is so much more, and after realising its potential I've incorporated several into this project.

One of the key enums in Miggy is the DataSize enum. This has grown quite a bit since its inception, as I'm finding it so useful. Here's a code snippet:
public enum DataSize
{
    // Arguments: byteSize, readSize, mask, ext, msb, bitSize
    Nibble(0, 2, 0x000f, "", 0x0008, 4),
    Byte(1, 2, 0x00ff, ".b", 0x0080, 8),
    Word(2, 2, 0x0000ffff, ".w", 0x00008000, 16),
    Long(4, 4, 0xffffffff, ".l", 0x80000000, 32),
    Unsized(0, 0, 0, "", 0, 0);

    private final int readSize;
    private final int byteSize;
    private final int mask;
    private final String ext;
    private final int msb;
    private final int bitSize;

    DataSize(int byteSize, int readSize, int mask, String ext, int msb, int bitSize)
    {
        this.byteSize = byteSize;
        this.readSize = readSize;
        this.mask = mask;
        this.ext = ext;
        this.msb = msb;
        this.bitSize = bitSize;
    }

    // Accessor methods omitted; they simply return the fields above.
}
What this does is create an enum named DataSize that has the values Nibble, Byte, Word, Long and Unsized. However each of these values also has additional data associated with it, passed into the constructor. To keep it simple I haven't shown the accessor methods as they just return the private member variables.

You can probably tell what these extra data fields are by their names, but I'll run through them quickly:
  • byteSize - number of bytes the value represents
  • readSize - number of bytes read when used as additional data for instruction decoding
  • mask - bitmask used to extract data of this size
  • ext - the extension used when disassembling instructions of this size
  • msb - Most Significant Bit mask, used when determining if a signed value is positive or negative
  • bitSize - the number of bits used in this data size. For Byte, Word and Long this is byteSize multiplied by 8; Nibble is the exception, with 4 bits and a byteSize of 0.
Using this enum has meant that I do not need loads of conditional code when emulating or disassembling instructions. Apart from making the code simpler and easier to read, it also means I'm able to write more generic code that can handle byte, word or long sized operations.
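As a rough sketch of the kind of generic code this allows (not the actual Miggy source), here is a single ADD routine that works for byte, word and long operands purely by asking the enum for its mask and msb. The accessor names mask() and msb(), the SizedAdd helper and the flag constants are all assumptions made for illustration; the flag bit positions follow the 68000 condition code register layout.

```java
// Trimmed copy of the DataSize idea, keeping only the fields this sketch needs.
// The accessor names mask() and msb() are assumptions; the post does not show them.
enum DataSize {
    Byte(0x00ff, 0x0080),
    Word(0x0000ffff, 0x00008000),
    Long(0xffffffff, 0x80000000);

    private final int mask;
    private final int msb;

    DataSize(int mask, int msb) {
        this.mask = mask;
        this.msb = msb;
    }

    int mask() { return mask; }
    int msb()  { return msb; }
}

// Hypothetical helper: one ADD implementation shared by all three sizes.
final class SizedAdd {
    // Bit positions of the 68000 condition codes (C = bit 0 ... X = bit 4).
    static final int C = 0x01, V = 0x02, Z = 0x04, N = 0x08, X = 0x10;

    /** Returns src + dst truncated to the operand size. */
    static int add(int src, int dst, DataSize size) {
        return (src + dst) & size.mask();
    }

    /** Computes the condition codes for dst + src at the given size. */
    static int flags(int src, int dst, DataSize size) {
        int s = src & size.mask();
        int d = dst & size.mask();
        int r = (s + d) & size.mask();
        int msb = size.msb();
        int ccr = 0;
        if (r == 0) ccr |= Z;
        if ((r & msb) != 0) ccr |= N;
        // Overflow: both operands share a sign that the result does not.
        if (((s ^ r) & (d ^ r) & msb) != 0) ccr |= V;
        // Carry out of the top bit; ADD copies carry into X as well.
        if ((((s & d) | (~r & (s | d))) & msb) != 0) ccr |= C | X;
        return ccr;
    }
}
```

Calling SizedAdd.flags(0x7f, 0x01, DataSize.Byte) sets N and V because the byte result wraps into the sign bit, while the same operands at DataSize.Word set no flags at all. There is no size-specific branch anywhere in the routine.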

For example when disassembling an ADD instruction, each of the "sized" handlers can call a base class method passing a DataSize enum representing the size the instruction is working on.
From the ADD_b.java class which handles the byte sized ADD instruction:

public final DecodedInstruction disassemble(int address, int opcode)
{
    return decode(address, opcode, DataSize.Byte);
}

This gives the ADD base class all the information it requires to decode and disassemble the instruction.
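To make the shape of that arrangement concrete, here is a minimal sketch of a shared base class with thin sized subclasses. It is only an illustration: the base class name ADD, the ext() accessor, the two-field DecodedInstruction stand-in and the output format are assumptions, and the decode handles nothing but the "ADD Dm,Dn" form (data register direct source, data register destination).

```java
// Trimmed DataSize carrying only the disassembly extension; ext() is an assumed accessor name.
enum DataSize {
    Byte(".b"), Word(".w"), Long(".l");

    private final String ext;

    DataSize(String ext) { this.ext = ext; }

    String ext() { return ext; }
}

// Stand-in for the project's DecodedInstruction, whose real fields are not shown in the post.
final class DecodedInstruction {
    final int address;
    final String text;

    DecodedInstruction(int address, String text) {
        this.address = address;
        this.text = text;
    }
}

// Hypothetical base class: all size-independent decoding lives here.
abstract class ADD {
    protected final DecodedInstruction decode(int address, int opcode, DataSize size) {
        int dataReg = (opcode >> 9) & 7; // bits 11-9: data register
        int eaMode  = (opcode >> 3) & 7; // bits 5-3: effective address mode
        int eaReg   = opcode & 7;        // bits 2-0: effective address register
        // Only mode 0 (data register direct) is decoded in this sketch,
        // and the direction bit is ignored (source is assumed to be the <ea>).
        String source = (eaMode == 0) ? "d" + eaReg : "<ea>";
        return new DecodedInstruction(address, "add" + size.ext() + " " + source + ",d" + dataReg);
    }
}

// Each sized handler only supplies its DataSize.
final class ADD_b extends ADD {
    public DecodedInstruction disassemble(int address, int opcode) {
        return decode(address, opcode, DataSize.Byte);
    }
}

final class ADD_l extends ADD {
    public DecodedInstruction disassemble(int address, int opcode) {
        return decode(address, opcode, DataSize.Long);
    }
}
```

With this in place, new ADD_b().disassemble(0x1000, 0xD001) produces the text "add.b d1,d0", and the long variant differs only in the enum value it passes down.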

Sunday 19 October 2008

Disassembler v0.1

I've put the first code drop into the Google Code project repository and thrown together a download containing the compiled jar and a couple of scripts to run the disassembler.

The code isn't pretty at the moment: there's very little in the way of comments and there are virtually no test cases.  I'm a bad man.  That will all change in due course, but I was keen to actually get something out there.  I'm itching to start on the emulation code and the release of this milestone will enable me to make a start now.

The disassembler knows nothing of executable formats but seems quite happy to disassemble lumps of compiled 68000 code.  My test file of choice has been the Amiga Kickstart 1.3 ROM.
I'm sure there are bugs. If anyone decides to try it out, please let me know what you think and whether you find any problems.

If you want to try it out, you can grab the binary release here
You'll need a recent Java runtime environment installed (1.5 or 6).
Extract the archive to a directory and run either disasm.bat on Windows or disasm.sh on *nix:

disasm.sh [-b <address>][-o <offset>] filename

-b address -> The base address to use for the disassembly output.
-o offset -> Byte offset into the file to start disassembling from.

Have fun.

Thursday 16 October 2008

Twist and Shout

I had a bit of a redesign after becoming rather unhappy with how messy things were getting while trying to handle the 10-bit masking and hashing, and the overlapping instruction handling that it required. I decided to push the responsibility for registering opcodes down to the individual instruction classes. This means they can correctly register for the range of opcodes they will handle in full 16-bit glory. It also means these are stored in a 16-bit lookup table, directly indexed by the opcode itself. No hashing or maps required, and no autoboxing of opcode values to store and retrieve the instruction classes.
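A minimal sketch of that idea, under stated assumptions: the InstructionTable class, its add() and lookup() methods and the register() hook are hypothetical names, not the project's real API. Each instruction claims every opcode it handles in a plain 65,536-entry array, so lookup is a single array index. The NOP (0x4E71) and MOVEQ (0111 rrr 0 dddddddd) encodings are the standard 68000 ones.

```java
// Hypothetical variant of the Instruction interface with a registration hook.
interface Instruction {
    int execute(int opcode);

    // Assumed method: each instruction claims the opcodes it handles.
    void register(InstructionTable table);
}

// One slot per possible 16-bit opcode; no hashing and no boxing of keys.
final class InstructionTable {
    private final Instruction[] table = new Instruction[0x10000];

    void add(int opcode, Instruction handler) {
        int index = opcode & 0xffff;
        if (table[index] != null) {
            throw new IllegalStateException("Opcode already claimed: 0x" + Integer.toHexString(index));
        }
        table[index] = handler;
    }

    /** Returns the handler for the opcode, or null if nothing registered for it. */
    Instruction lookup(int opcode) {
        return table[opcode & 0xffff];
    }
}

// NOP has a single fixed encoding.
final class Nop implements Instruction {
    public int execute(int opcode) { return 0; } // return value unused in this sketch

    public void register(InstructionTable table) {
        table.add(0x4e71, this);
    }
}

// MOVEQ #data,Dn covers 8 registers x 256 immediate values.
final class Moveq implements Instruction {
    public int execute(int opcode) { return 0; } // return value unused in this sketch

    public void register(InstructionTable table) {
        for (int reg = 0; reg < 8; reg++) {
            for (int data = 0; data < 256; data++) {
                table.add(0x7000 | (reg << 9) | data, this);
            }
        }
    }
}
```

The cost is a few hundred kilobytes of mostly shared references, in exchange for a decode step that is one masked array read. Overlapping registrations fail loudly at start-up rather than silently at run time.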
The CPU instructions are all loaded dynamically via a configuration file too, so different 680x0 family processors or opcode implementations can be configured quickly.
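One way such configuration-driven loading could work is sketched below. Everything about the format is an assumption: the post does not describe the configuration file, so this example invents a properties-style "instructions" key holding a comma-separated list of class names, and the InstructionLoader, Nop and Rts classes are placeholders.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

interface Instruction {
    int execute(int opcode);
}

// Placeholder instruction classes so the loader has something to instantiate.
final class Nop implements Instruction {
    public int execute(int opcode) { return 0; }
}

final class Rts implements Instruction {
    public int execute(int opcode) { return 0; }
}

// Hypothetical loader: reads class names from configuration text and instantiates them.
final class InstructionLoader {
    /** Loads every class listed under the assumed "instructions" key. */
    static List<Instruction> load(String config) {
        List<Instruction> loaded = new ArrayList<Instruction>();
        try {
            Properties props = new Properties();
            props.load(new StringReader(config));
            for (String name : props.getProperty("instructions", "").split(",")) {
                String className = name.trim();
                if (className.isEmpty()) {
                    continue;
                }
                // asSubclass rejects classes that do not implement Instruction.
                Class<? extends Instruction> type =
                        Class.forName(className).asSubclass(Instruction.class);
                loaded.add(type.getDeclaredConstructor().newInstance());
            }
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load instruction set", e);
        }
        return loaded;
    }
}
```

Swapping in a different processor variant then means shipping a different list of class names rather than recompiling the core.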

Yay.

Apart from imminent RSI, things have gone well. The disassembler is very nearly done and I'm debugging at the moment. I really want to add a lot more unit tests though, as in my coding frenzy I have been a bit lax in that department.
Once the debugging is complete and I'm happy it's disassembling correctly, I'll post the code on the Google Code project and make a compiled version available too.

Double Yay.

Thursday 9 October 2008

Come back z80 ...

No, not really.  Many moons ago I wrote a Gameboy emulator with a cut-down z80 CPU.  Although I never got as far as handling sound, everything else worked quite nicely.  I wrote it in C++ with the CPU core written in x86 assembler.  Thinking back now it seems an order of magnitude simpler than emulating a 68000 in Java.

The painful thing with the 68000 is that the instructions are complex.  Decoding the machine code isn't as simple or straightforward as something like a z80.  Every 68000 instruction starts with a 16-bit opcode word, and decoding these is somewhat long-winded.

The goal is to quickly identify each instruction so that it can be emulated as fast as possible without wasting precious time trying to determine which action to perform.  I also want to use the same determination algorithm for both the emulator and the disassembler.

I've settled on a scheme of using the 10 most significant bits of an instruction for identification.  This covers the majority of the instruction set and allows me to determine the size of the operation in most cases (byte/word/long).  There are however a few special cases where more of the instruction needs to be used for qualification.

I want my core loop clean and envisage something like this simplified pseudo code:

while(running)
    short opcode = readMemWord(reg_pc)
    Instruction i = Instructions.get(opcode & 0xFFC0)
    i.execute(opcode)
end while

I don't really want the mother of all switch statements in there, so I will use a hash collection of some sort, keyed against the 10 most significant bits of the opcode paired with an instance of the class that implements that particular instruction.  This allows me to write a class to handle each instruction and specialise it to handle different operation sizes too.
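Here is a small sketch of that lookup, assuming a HashMap keyed by the masked opcode. The Instructions class shape, its put() method and the Clr handler are hypothetical; CLR is used as the example because its encoding (0100 0010 ss eeeeee) puts the size field in bits 7-6, so the top 10 bits identify both the instruction and its operation size.

```java
import java.util.HashMap;
import java.util.Map;

interface Instruction {
    int execute(int opcode);
}

// Hypothetical registry keyed on the 10 most significant bits of the opcode.
final class Instructions {
    static final int KEY_MASK = 0xffc0;

    private final Map<Integer, Instruction> handlers = new HashMap<Integer, Instruction>();

    void put(int pattern, Instruction handler) {
        handlers.put(pattern & KEY_MASK, handler);
    }

    /** Returns the handler for this opcode, or null if none is registered. */
    Instruction get(int opcode) {
        return handlers.get(opcode & KEY_MASK);
    }
}

// Placeholder handler that only remembers which size it was specialised for.
final class Clr implements Instruction {
    final String ext;

    Clr(String ext) { this.ext = ext; }

    public int execute(int opcode) { return 0; } // return value unused in this sketch
}
```

Registering 0x4200, 0x4240 and 0x4280 with the ".b", ".w" and ".l" handlers means an opcode such as 0x4243 (CLR.W D3) resolves to the word-sized handler, because the low six addressing bits are masked away before the lookup. Note that every get() boxes its int key into an Integer.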

To facilitate this, I have an Instruction interface that these classes all implement that looks like this:

public interface Instruction
{
  public int execute(int opcode);
  public DecodedInstruction disassemble(int address, int opcode);
}

The disassembler will work in exactly the same way, except calling disassemble instead of execute.

The disassembler is about half done now.  Back to it.

Monday 6 October 2008

In the beginning

Miggy has been conceived.

Last weekend, I finally sat down and attempted to start developing the Amiga emulator I've been itching to write for ages.  I already have a reasonable idea of how this is going to be architected - a pluggable API framework so the emulator can evolve - and I'm going to code it in Java.

I've set up a Google Code project to host it at http://miggy.googlecode.com, although I doubt there will be anything there for a while.  This is my first open source project and I am planning on a "release often" strategy to try and keep me going.

Work has started on a 68000 disassembler.