Run your disassembler on the invaders.h ROM file and let's look at the output.
0000 NOP
0001 NOP
0002 NOP
0003 JMP $18d4
0006 NOP
0007 NOP
0008 PUSH PSW
0009 PUSH B
000a PUSH D
000b PUSH H
000c JMP $008c
000f NOP
0010 PUSH PSW
0011 PUSH B
0012 PUSH D
0013 PUSH H
0014 MVI A,#$80
0016 STA $2072
0019 LXI H,#$20c0
001c DCR M
001d CALL $17cd
0020 IN #$01
0022 RRC
0023 JC $0067
0026 LDA $20ea
0029 ANA A
002a JZ $0042
002d LDA $20eb
0030 CPI #$99
0032 JZ $003e
0035 ADI #$01
0037 DAA
0038 STA $20eb
003b CALL $1947
003e XRA A
003f STA $20ea

/*
0000000 00 00 00 c3 d4 18 00 00 f5 c5 d5 e5 c3 8c 00 00
0000010 f5 c5 d5 e5 3e 80 32 72 20 21 c0 20 35 cd cd 17
0000020 db 01 0f da 67 00 3a ea 20 a7 ca 42 00 3a eb 20
0000030 fe 99 ca 3e 00 c6 01 27 32 eb 20 cd 47 19 af 32
*/
The first instructions match what we hand assembled before. After that, you can see some new instructions. I pasted the hex data in below for reference. Notice that if you compare the memory with the instructions, it looks like the addresses are stored backward in memory. They are. This is called little endian: little endian machines like the 8080 store the least significant byte of a number in memory first, so JMP $18d4 is encoded as c3 d4 18. (See below for more on endian-ness.)
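To make the byte order concrete, here is a minimal sketch in C that rebuilds the jump target from the bytes in the hex dump. The names read_le16, rom_start and jmp_target are my own for illustration; they are not from the disassembler in this tutorial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine two bytes stored low-byte-first into one 16-bit value.
   read_le16 is a hypothetical helper name used only for this example. */
static uint16_t read_le16(const uint8_t *mem, size_t offset, size_t size)
{
    assert(offset + 1 < size);   /* both bytes must be inside the buffer */
    return (uint16_t)(mem[offset] | (mem[offset + 1] << 8));
}

/* First eight bytes of invaders.h, copied from the hex dump. */
static const uint8_t rom_start[] = {
    0x00, 0x00, 0x00, 0xc3, 0xd4, 0x18, 0x00, 0x00
};

/* The JMP opcode (0xc3) is at offset 3, so its operand starts at offset 4. */
static uint16_t jmp_target(void)
{
    return read_le16(rom_start, 4, sizeof rom_start);
}
```

Calling jmp_target() gives 0x18d4, the same address the disassembler printed. Because the value is built with a shift and an OR instead of a 16-bit load, the result does not depend on the byte order of the machine running the code.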
In Part 1 I mentioned that this code is the ISR code for Space Invaders. Code for interrupts 0, 1, 2, ... 7 starts at address $0, $8, $10, ... $38. It looks like the 8080 just gives 8 bytes for each ISR. Space Invaders seems to get around this sometimes by just jumping to another address with more space. (It does that at $000c).
It also appears that ISR 2 is longer than the space allocated to it. Its code goes over $0018 (ISR 3's place). I guess Space Invaders doesn't expect to see anything using interrupt #3.
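The interrupt entry points are spaced eight bytes apart, so the address for interrupt n is just n times 8. A small sketch of that arithmetic, with helper names (rst_vector, rst_overflow) that I made up for this example:

```c
#include <assert.h>
#include <stdint.h>

#define RST_SLOT_SIZE 8   /* bytes between consecutive interrupt entry points */

/* Address the 8080 jumps to for interrupt number n (0..7).
   rst_vector is a hypothetical helper name. */
static uint16_t rst_vector(unsigned n)
{
    assert(n < 8);
    return (uint16_t)(n * RST_SLOT_SIZE);
}

/* How many bytes a handler spills past its 8-byte slot, given the address
   where it starts and the address just after its last byte (0 if it fits). */
static unsigned rst_overflow(uint16_t start, uint16_t end)
{
    unsigned len = (unsigned)(end - start);
    return len > RST_SLOT_SIZE ? len - RST_SLOT_SIZE : 0;
}
```

With these, rst_vector(2) is 0x0010 and rst_vector(3) is 0x0018. The handler at $0008 ends with the 3-byte JMP at $000c, so rst_overflow(0x0008, 0x000f) is 0. The part of the handler at $0010 that is visible in the listing runs through the STA at $003f, so rst_overflow(0x0010, 0x0042) reports at least 42 bytes past its slot.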
The Space Invaders ROM file you find on the internet has 4 parts. I'll explain this later, but for now if you want to follow the next section, you need to combine the 4 files into one. On Unix:
cat invaders.h > invaders
cat invaders.g >> invaders
cat invaders.f >> invaders
cat invaders.e >> invaders
Now run your disassembler on the resulting "invaders" file. When the program starts from $0000, the first thing it does is jump to $18d4. I'd consider this the start of the program. Let's take a quick look at that code.
18d4 LXI SP,#$2400
18d7 MVI B,#$00
18d9 CALL $01e6
OK - it does 2 things and calls $01e6. I'm going to just paste some of the jumpy code into one code section here:
01e6 LXI D,#$1b00
01e9 LXI H,#$2000
01ec JMP $1a32
.....
1a32 LDAX D
1a33 MOV M,A
1a34 INX H
1a35 INX D
1a36 DCR B
1a37 JNZ $1a32
1a3a RET
As you saw in the Space Invaders memory map, some of these addresses are interesting. $2000 is the start of the program's "work ram". $2400 is the start of the video memory.
Let's annotate this code a little for what happens right at startup:
18d4 LXI SP,#$2400   ; SP=$2400 - Establish stack for whole program
18d7 MVI B,#$00      ; B=0
18d9 CALL $01e6
.....
01e6 LXI D,#$1b00    ; DE=$1B00
01e9 LXI H,#$2000    ; HL=$2000
01ec JMP $1a32
.....
1a32 LDAX D          ; A = (DE), so whatever was in memory at $1B00
1a33 MOV M,A         ; Store A into (HL), so to $2000
1a34 INX H           ; HL = HL + 1 (now $2001)
1a35 INX D           ; DE = DE + 1 (now $1B01)
1a36 DCR B           ; B = B - 1 (now 0xff because it wrapped around from 0)
1a37 JNZ $1a32       ; loop, will be taken until b=0
1a3a RET
This code looks like it is going to copy 256 bytes from $1b00 to $2000. Why? I don't know. It is possible for you to follow through this program a long way and speculate on what it is doing.
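If the wraparound in B is hard to see, here is roughly the same loop written in C. This is my own rendering, not code from the game or the emulator: mem stands in for the 8080 address space, and the function name copy_block and the returned count are assumptions added so the result is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C rendering of the loop at $1a32. Returns how many bytes were copied.
   copy_block is a hypothetical name; mem must cover the full 64K space. */
static unsigned copy_block(uint8_t *mem, uint16_t de, uint16_t hl, uint8_t b)
{
    unsigned copied = 0;

    assert(mem != NULL);
    do {
        mem[hl] = mem[de];   /* LDAX D then MOV M,A */
        hl++;                /* INX H */
        de++;                /* INX D */
        b--;                 /* DCR B: an 8-bit 0 wraps around to 0xff */
        copied++;
    } while (b != 0);        /* JNZ $1a32 */

    return copied;
}
```

Because the decrement happens before the test, starting with b equal to 0 runs the body 256 times, so copy_block(mem, 0x1b00, 0x2000, 0) copies $1b00-$1bff to $2000-$20ff. Starting with b equal to 3 would copy only 3 bytes.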
There is a problem here. If you have an arbitrary chunk of memory that includes 8080 code, it probably has data interleaved in it.
For example, the sprites for the characters in a game may be mixed in with the code. When your disassembler hits that chunk of memory, it is going to think it's code and continue to chew on it. Unless it gets lucky and comes out of the data on a real instruction boundary, the code disassembled after that chunk may be wrong.
For now, there isn't a whole lot you can do about this. Just be aware the problem exists. If you see things like:
a jump from known good code to an instruction that doesn't exist in your disassembly listing
a stream of nonsense code (like POP B POP B POP B POP C XTHL XTHL XTHL)
then there is probably data in there that renders some portion of your disassembly unusable. You might have to restart it at an offset if this happens to you.
It turns out that Space Invaders has some runs of zeros in it periodically. If our disassembly ever gets off, the zeros will force it to sort of reset itself.
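The reason zeros help: $00 is the 1-byte NOP, and no 8080 instruction is longer than 3 bytes, so a misaligned instruction can swallow at most two zeros as operand bytes. By the third zero in a row the disassembler has to be decoding a NOP, and it is back in step. If you want to find those spots, a sketch like this will do. The function name find_zero_run is my own; it is not part of the disassembler from this tutorial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the offset of the first run of at least min_len zero bytes at or
   after start, or -1 if there is no such run. Hypothetical helper. */
static long find_zero_run(const uint8_t *buf, size_t size,
                          size_t start, size_t min_len)
{
    size_t run = 0;   /* length of the zero run ending at the current byte */

    assert(min_len > 0);
    for (size_t i = start; i < size; i++) {
        run = (buf[i] == 0x00) ? run + 1 : 0;
        if (run >= min_len)
            return (long)(i + 1 - min_len);
    }
    return -1;
}
```

Calling it with min_len set to 3 gives you an address where a restarted disassembly should be safe to pick up again, as long as the zeros there really are NOPs or padding and not part of a data table.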
For a thorough analysis of the Space Invaders code, look here.
Depending on the processor, the bytes of a multi-byte value are stored in memory in different orders. Big endian machines store data from the most significant byte to the least significant byte. Little endian machines store from least significant to most significant. If you write the 32-bit integer 0xAABBCCDD to memory on each machine, it will look like this in memory:
Little endian: $DD $CC $BB $AA
Big endian: $AA $BB $CC $DD
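You can reproduce both layouts on any machine by placing the bytes yourself with shifts. The sketch below does that, and also includes a small probe for the byte order of whatever machine you run it on. All three function names are my own and are only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write v into out[0..3], least significant byte first (little endian). */
static void store_le32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}

/* Write v into out[0..3], most significant byte first (big endian). */
static void store_be32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(v >> (8 * (3 - i)));
}

/* Returns 1 if the machine running this code stores the least significant
   byte of an integer first, 0 otherwise. */
static int host_is_little_endian(void)
{
    uint32_t probe = 0xAABBCCDDu;
    uint8_t first;

    memcpy(&first, &probe, 1);   /* look at the first byte as stored */
    return first == 0xDD;
}
```

store_le32 with 0xAABBCCDD fills the buffer with DD CC BB AA and store_be32 fills it with AA BB CC DD, matching the two lines above. The shift-based functions give the same answer everywhere; only host_is_little_endian depends on the machine.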
I started programming Motorola processors that all used big-endian, so that seems more "natural" to me, but I've gotten used to little endian.
My disassembler and emulator completely avoid the endian issue by only reading and writing 1 byte at a time. If you want to use, say, a 16-bit read to read an address from the ROM, be aware that the code is not portable between host CPU architectures.
Post questions or comments on Twitter @realemulator101, or if you find issues in the code, file them on the github repository.