The “Hello World” program I started disassembling in the last article has a fuck-ton of subroutines, which is surprising considering that the source code only had one function. I am now making an effort to decipher and document all of them and figure out the exact course of system calls made in the execution of a printf() statement. I will be documenting my hacking efforts here.
I figured I would devote this particular entry to everything leading up to the first system call. The program goes through a lot of assembly instructions before accessing the WinAPI, which is the real juice of the assembly dump, and today I have finally deciphered the entire program setup. I have been examining the assembly code and accompanying graphs in IDA while taking notes in Notepad on the other monitor. This article is based on those notes.
First, we begin our hacking of the assembly code at the start subroutine, which is simply a wrapper for start_0. start_0 looks like this:
.code:00401290 start_0 proc near ; CODE XREF: startj
.code:00401290
.code:00401290 var_8 = byte ptr -8
.code:00401290
.code:00401290 push ebx
.code:00401291 push ecx
.code:00401292 push edx
.code:00401293 push ebp
.code:00401294 mov ebp, esp
.code:00401296 sub esp, 8
.code:00401299 mov eax, 1
.code:0040129E call sub_402990
.code:004012A3 mov eax, ds:dword_40827C
.code:004012A8 add eax, 3
.code:004012AB and al, 0FCh
.code:004012AD xor edx, edx
.code:004012AF sub esp, eax
.code:004012B1 mov ecx, esp
.code:004012B3 mov ebx, ds:dword_40827C
.code:004012B9 mov eax, ecx
.code:004012BB call sub_402A70
.code:004012C0 mov eax, ds:dword_40827C
.code:004012C5 mov [ecx+104h], eax
.code:004012CB mov eax, ecx
.code:004012CD mov edx, ecx
.code:004012CF call sub_402A40
.code:004012D4 lea eax, [ebp+var_8]
.code:004012D7 call sub_401810
.code:004012DC call sub_402A90
.code:004012E1 mov esp, ebp
.code:004012E3 pop ebp
.code:004012E4 pop edx
.code:004012E5 pop ecx
.code:004012E6 pop ebx
.code:004012E7 retn
.code:004012E7 start_0 endp
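One detail in this listing is easy to miss: the add eax, 3 / and al, 0FCh pair rounds the value loaded from dword_40827C up to the next multiple of 4 before sub esp, eax carves that many bytes off the stack. Here is a minimal Python sketch of that arithmetic. The function name align4 and the sample sizes are mine, not something from the binary.

```python
def align4(n: int) -> int:
    """Round a 32-bit value up to the next multiple of 4.

    Mirrors `add eax, 3` followed by `and al, 0FCh`: clearing the two
    low bits of AL is the same as clearing the two low bits of EAX.
    """
    return ((n + 3) & 0xFFFFFFFF) & 0xFFFFFFFC


# Illustrative sizes only; the real value of dword_40827C is not shown.
for size in (0, 1, 4, 5, 0x104):
    print(hex(size), "->", hex(align4(size)))
```

Values that are already a multiple of 4 come back unchanged, so the stack pointer stays dword-aligned whatever size is stored in that variable.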
This subroutine opens with a function prologue that saves ebx, ecx, edx, and ebp, sets up a stack frame, and reserves 8 bytes for locals. It then loads 1 into eax and calls sub_402990. And this is where things really get interesting.
The screenshot below shows IDA with both text and graph views of sub_402990.
I had a lot of difficulty figuring out what this subroutine does, partly because there were a lot of jumps and registers that I had to keep track of, and partly just because this was my first time reverse-engineering an executable. But as I continued examining it, gradually the pieces started to come together. Let’s look at all the assembly code up to the first jump:
.code:00402990 sub_402990 proc near ; CODE XREF: start_0+Ep
.code:00402990 ; sub_401810+3Cp ...
.code:00402990 push ebx
.code:00402991 push ecx
.code:00402992 push edx
.code:00402993 push esi
.code:00402994 mov dh, al
.code:00402996 mov esi, offset unk_4085A8
.code:0040299B
.code:0040299B loc_40299B: ; CODE XREF: sub_402990+4A↓j
.code:0040299B mov eax, offset unk_408584
.code:004029A0 mov ecx, esi
.code:004029A2 mov dl, dh
.code:004029A4 cmp esi, eax
.code:004029A6 jbe short loc_4029C2
This sets up everything that happens in the main loop of the subroutine. After the prologue, there are several important mov instructions. First, eax and esi are set to the addresses ds:408584 and ds:4085A8 respectively. I could tell that these were the start and end addresses of an array to be traversed in the loop, though the purpose of this array was a complete mystery to me until I got to the next subroutine and was actually forced to look at that area of memory.
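Next we can see some mov instructions involving 1-byte registers: mov dh, al stashes the low byte of the value passed in eax, and mov dl, dh copies it into dl at the top of each pass. The significance of these is not readily apparent, but they turn out to be crucial in the operation of the program.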
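For anyone else who is new to the 1-byte registers: al and dl are the lowest 8 bits of eax and edx, while dh is bits 8 through 15 of edx. The sketch below emulates the two byte moves from the listing in Python. The helper names and the starting value of edx are placeholders of mine.

```python
def low_byte(reg: int) -> int:
    """AL/BL/CL/DL: bits 0-7 of the 32-bit register."""
    return reg & 0xFF


def high_byte(reg: int) -> int:
    """AH/BH/CH/DH: bits 8-15 of the 32-bit register."""
    return (reg >> 8) & 0xFF


def set_low_byte(reg: int, value: int) -> int:
    """Write an 8-bit value into bits 0-7, leaving the rest alone."""
    return (reg & 0xFFFFFF00) | (value & 0xFF)


def set_high_byte(reg: int, value: int) -> int:
    """Write an 8-bit value into bits 8-15, leaving the rest alone."""
    return (reg & 0xFFFF00FF) | ((value & 0xFF) << 8)


eax = 0x00000001   # start_0 does `mov eax, 1` before the call
edx = 0xDEADBEEF   # arbitrary placeholder for whatever edx held

edx = set_high_byte(edx, low_byte(eax))   # mov dh, al
edx = set_low_byte(edx, high_byte(edx))   # mov dl, dh

print(hex(edx))
```

The upper 16 bits of edx are untouched by either move, which is why the subroutine can keep a saved copy in dh and a working copy in dl inside the same register.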
Now let’s look at the loop. I will show a screenshot of the graph mode for this one, since I feel that would be easier to read in this case:
The conditional branch at the end of the subroutine setup is a jbe, an unsigned below-or-equal test on cmp esi, eax. It jumps past this loop if esi is not above eax, that is to say, if there’s nothing to loop through because the array size is zero. Afterward we see a few branches that transfer control within the loop. But the main point of interest is the two boxes in the middle. The program compares the registers dl and bh. bh contains the 1-byte value pointed to by eax with an offset of 1. dl starts off with the low byte of the value passed in eax (1, set by start_0) but then in subsequent iterations contains whatever was in bh. The loop continuously increments eax by 6, reading the byte at that address and comparing it to the last one until it gets to the right address.
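I am not entirely certain why the increment is 6. When we actually look at the table being scanned, we will see that the function addresses in it are 4 bytes wide, yet the loop steps by 6. My first guess was some sort of alignment scheme, but the rest of the subroutine points to a simpler explanation: it writes a byte at offset 0 of the chosen entry, compares a byte at offset 1, and calls through the dword at offset 2. That adds up to a 6-byte record, with two bytes of bookkeeping in front of each address.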
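One reading that would explain the stride of 6, and I am labeling it an assumption: each table entry is a packed record of two single bytes followed by a 4-byte address. The listing further down fits that (it writes a byte at offset 0 and takes the address at offset 2), and the span from 408584 to 4085A8 is 0x24 bytes, which divides evenly by 6. It divides evenly by 4 as well, so this is not proof. The sketch below uses Python's struct module to walk such a table. The field names and the sample bytes are made up for illustration, not taken from the dump.

```python
import struct

# Assumed layout: flag byte, key byte, 32-bit little-endian address.
ENTRY = struct.Struct("<BBI")


def parse_table(raw: bytes):
    """Split a raw table into (offset, flag, key, address) tuples."""
    return [
        (i * ENTRY.size, flag, key, addr)
        for i, (flag, key, addr) in enumerate(ENTRY.iter_unpack(raw))
    ]


# Hypothetical bytes, not the real contents of 408584..4085A8.
sample = bytes.fromhex(
    "00 01 00 10 40 00"   # flag=0, key=0x01, address=0x00401000
    "00 20 00 20 40 00"   # flag=0, key=0x20, address=0x00402000
)

for offset, flag, key, addr in parse_table(sample):
    print(f"+{offset:#04x}: flag={flag} key={key:#04x} addr={addr:#010x}")

# How many records of this size would fit between the two table bounds.
print((0x4085A8 - 0x408584) // ENTRY.size, "entries of", ENTRY.size, "bytes")
```

The "<" in the format string turns off padding, so the record size comes out as 1 + 1 + 4 bytes instead of being rounded up to 8.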
Once the subroutine has found the right address, it calls sub_402980, which has the following assembly dump:
.code:00402980 sub_402980 proc near ; CODE XREF: sub_402990+42↓p
.code:00402980 ; sub_4029E0+48↓p
.code:00402980 cmp dword ptr [eax], 0
.code:00402983 jnz short loc_402986
.code:00402985 retn
.code:00402986 ; ---------------------------------------------------------------------------
.code:00402986
.code:00402986 loc_402986: ; CODE XREF: sub_402980+3↑j
.code:00402986 call dword ptr [eax]
.code:00402988 retn
.code:00402988 sub_402980 endp
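In higher-level terms, sub_402980 is a guarded indirect call: it does nothing if the dword at eax is zero and calls through it otherwise. Here is a rough Python analogue, using a one-element list to stand in for the memory cell that eax points at. That modeling choice, the function name, and the boolean return value are mine.

```python
from typing import Callable, List, Optional


def call_if_set(slot: List[Optional[Callable[[], None]]]) -> bool:
    """Mimic sub_402980: call through the slot only if it is non-empty.

    Returns True when a call was made. The real routine returns nothing
    meaningful; the flag is only here to make the behavior observable.
    """
    target = slot[0]          # dword ptr [eax]
    if target is None:        # cmp dword ptr [eax], 0 / jnz not taken
        return False          # retn
    target()                  # call dword ptr [eax]
    return True               # retn


log = []
print(call_if_set([None]))                          # empty slot
print(call_if_set([lambda: log.append("called")]))  # populated slot
print(log)
```

The null check means the table can contain empty slots without crashing the program.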
The next screenshot shows the assembly code for the scanning subroutine side-by-side with a dump of the area of memory being scanned.
It turns out the array this subroutine was scanning was actually a jump table containing addresses of different functions. We can easily see this from the call dword ptr [eax] instruction in sub_402980. Tracing back, we can see that the assembly code in the scanning subroutine loads eax with the address two bytes past the current value of ecx, which points at the entry the loop settled on.
.code:004029C2
.code:004029C2 loc_4029C2: ; CODE XREF: sub_402990+16j
.code:004029C2 cmp ecx, offset unk_4085A8
.code:004029C8 jnz short loc_4029CF
.code:004029CA pop esi
.code:004029CB pop edx
.code:004029CC pop ecx
.code:004029CD pop ebx
.code:004029CE retn
.code:004029CF ; ---------------------------------------------------------------------------
.code:004029CF
.code:004029CF loc_4029CF: ; CODE XREF: sub_402990+38j
.code:004029CF lea eax, [ecx+2]
.code:004029D2 call sub_402980
.code:004029D7 mov byte ptr [ecx], 2
.code:004029DA jmp short loc_40299B
.code:004029DA sub_402990 endp
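Putting the pieces together, the outer shape of sub_402990 can be sketched in Python. What the listing confirms: the routine returns when ecx still equals the end of the table, otherwise it calls through the pointer at ecx+2, writes 2 to the first byte of that entry, and starts over. What I am assuming: the selection rule inside the loop (lowest second byte that does not exceed the caller's byte) and that entries already marked with 2 are skipped. The names run_table, DONE, and the sample table are hypothetical.

```python
DONE = 2  # the listing writes 2 to the first byte of an entry after calling it


def run_table(entries, limit):
    """Repeatedly pick one pending entry, call it, and mark it done.

    `entries` is a list of [flag, key, func] lists. The selection rule
    (lowest key that does not exceed `limit`, skipping flagged entries)
    is an assumption; the loop body is not in the text listing.
    """
    while True:
        chosen = None                      # ecx == end of table: nothing picked
        best = limit                       # dl is reloaded from dh every pass
        for entry in entries:              # eax walks the table in steps of 6
            flag, key, _ = entry
            if flag != DONE and key <= best:
                chosen, best = entry, key  # remember entry, tighten the bound
        if chosen is None:
            return                         # loc_4029C2: pops and retn
        if chosen[2] is not None:          # sub_402980 skips null pointers
            chosen[2]()
        chosen[0] = DONE                   # mov byte ptr [ecx], 2


order = []
table = [
    [0, 1, lambda: order.append("b")],
    [0, 0, lambda: order.append("a")],
    [0, 9, lambda: order.append("never")],  # key above the limit
]
run_table(table, limit=1)
print(order)
```

Under these assumptions every pass retires exactly one entry, so the routine ends once nothing eligible is left, and entries with a byte above the caller's value are never called.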
Using the assembly code from the scanning subroutine as a reference, I was able to mentally trace the program’s execution through the jump table to see where the pointer would land. I determined that the landing point is sub_404490. Looking at this subroutine we find that it’s simply a wrapper for sub_405467. This subroutine is shown here:
In the assembly code and in the accompanying graph view, we can now see the first WinAPI calls this program makes: ds:GetACP and ds:GetOEMCP. We have made it through the long and arduous program setup and can now trace the API calls from start to finish. Our work here is complete, at least for now. See you next time and happy hacking!