Loading a PE

This tutorial walks through loading and harnessing a PE32+ file.

Portable Executable (PE) is the file format Windows and UEFI bootloaders use to store code. Like ELF, it contains the code and data for one executable or library, and the metadata detailing what the program loader needs to do to set up that code and data within a process.

The original 32-bit version of the format is known as PE32; the extended 64-bit version is called PE32+.
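
The two flavors can be told apart by the magic number at the start of the optional header: 0x10b for PE32 and 0x20b for PE32+. The following is a minimal sketch of that check using only the standard library. The helper names pe_flavor and make_fake_header are hypothetical, and the header built here is a synthetic stand-in, not a real executable.

```python
import struct

PE32_MAGIC = 0x10B
PE32_PLUS_MAGIC = 0x20B


def pe_flavor(image: bytes) -> str:
    """Return "PE32" or "PE32+" based on the optional header magic."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")
    # e_lfanew: 32-bit file offset of the PE signature, stored at 0x3c
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset : pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header
    (magic,) = struct.unpack_from("<H", image, pe_offset + 4 + 20)
    if magic == PE32_MAGIC:
        return "PE32"
    if magic == PE32_PLUS_MAGIC:
        return "PE32+"
    raise ValueError(f"unexpected optional header magic {magic:#x}")


def make_fake_header(magic: int) -> bytes:
    """Build a minimal synthetic header, for demonstration only."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after the DOS header
    return bytes(dos) + b"PE\x00\x00" + bytes(20) + struct.pack("<H", magic)


print(pe_flavor(make_fake_header(PE32_PLUS_MAGIC)))
```

To run the same check on a real file, pass the bytes read from it to pe_flavor(); this matches the Magic field that objdump -p reports.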

Consider the example tests/pe/pe.pe.c:

#include <stdio.h>

int main() {
    // Unusual number to help me find main()
    // Getting debug info out of PEs is a bit of a chore.
    int out = 0xc001d00d;
    puts("Hello, world!\n");
    return out;
}

This is a standard “hello world” program. We can build it into a PE file with the MinGW version of GCC, since native Windows compilers are awkward to run on other platforms:

cd smallworld/tests
make pe/pe.amd64.pe

Warning

This requires a version of GCC targeting MinGW. On Debian and Ubuntu, this can be installed via the apt package gcc-mingw-w64.

In order to harness the code contained in a PE, we need to do at least the following:

  • Follow the metadata in the PE to unpack the memory image inside

  • Set execution to start at the correct place

In this case, we will also need to provide some kind of implementation for puts().

Using the PE loader

SmallWorld includes a model of the basic features of a PE loader. To exercise it, you will need to use Executable.from_pe(), described in Memory Objects.

PE files define a “preferred” load address, but usually include enough metadata for the program loader to use an alternate load address.

Let’s take a look at our example:

$ objdump -p pe.amd64.pe | head -n 20

pe.amd64.pe:     file format pei-x86-64

Characteristics 0x22
	executable
	large address aware

Time/Date		Tue Jan  1 00:00:00 1980
Magic			020b	(PE32+)
MajorLinkerVersion	14
MinorLinkerVersion	0
SizeOfCode		0000000000021000
SizeOfInitializedData	0000000000007e00
SizeOfUninitializedData	0000000000000000
AddressOfEntryPoint	00000000000013f0
BaseOfCode		0000000000001000
ImageBase		0000000140000000
SectionAlignment	00001000
FileAlignment		00000200
MajorOSystemVersion	6

The PE header lists the preferred load address directly in its ImageBase field; for this file, it is 0x140000000. If we don’t specify a load address, the image will be loaded there. For this harness, we chose to load it at 0x10000 instead. The SmallWorld PE loader will handle relocating the code.

filename = "pe.amd64.pe"
with open(filename, "rb") as f:
    code = smallworld.state.memory.code.Executable.from_pe(
        f, platform=platform, address=0x10000
    )
machine.add(code)
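
To give a feel for what relocating involves, here is a toy sketch of the core arithmetic: every absolute 64-bit address stored in the image is shifted by the difference between the chosen and preferred base. This is an illustration under simplifying assumptions, not SmallWorld's implementation. The function apply_dir64_fixups is hypothetical, the list of fixup locations is supplied by hand rather than parsed from a relocation table, and the image is assumed to be already mapped so that an RVA can be used directly as a buffer offset.

```python
import struct

PREFERRED_BASE = 0x140000000  # ImageBase reported by objdump
LOAD_BASE = 0x10000  # address chosen for the harness


def apply_dir64_fixups(image, fixup_rvas, preferred_base, load_base):
    """Shift each 64-bit absolute address stored at the given RVAs."""
    delta = load_base - preferred_base
    for rva in fixup_rvas:
        (value,) = struct.unpack_from("<Q", image, rva)
        # Wrap to 64 bits, as the hardware would
        struct.pack_into("<Q", image, rva, (value + delta) & 0xFFFFFFFFFFFFFFFF)


# Toy image: one stored pointer at RVA 8, aimed at preferred base + 0x1000
image = bytearray(16)
struct.pack_into("<Q", image, 8, PREFERRED_BASE + 0x1000)

apply_dir64_fixups(image, [8], PREFERRED_BASE, LOAD_BASE)
(patched,) = struct.unpack_from("<Q", image, 8)
print(hex(patched))  # 0x11000
```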

Finding main()

The next step is to figure out where we want to begin executing.

Sadly, internal symbols are a deprecated feature on PE files; Windows and most MinGW binaries won’t have them, and SmallWorld doesn’t look for them.

This is a case for manual reverse engineering (RE). Inspecting the entrypoint routines in Ghidra tells us that the most probable candidate for main() is at offset 0x1000 from the image base, which is the start of the text section.

entrypoint = code.address + 0x1000
cpu.rip.set(entrypoint)
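
Tools such as objdump and Ghidra report addresses relative to the preferred image base, while the harness works with addresses relative to the base we chose. A small helper makes the translation explicit. This is a sketch; the name rebase and the two constants are assumptions taken from this example, not part of SmallWorld.

```python
PREFERRED_BASE = 0x140000000  # ImageBase from the PE header
LOAD_BASE = 0x10000  # address passed to the loader in this harness


def rebase(address, preferred_base=PREFERRED_BASE, load_base=LOAD_BASE):
    """Translate an address seen in a disassembler to its harness address."""
    rva = address - preferred_base
    if rva < 0:
        raise ValueError(f"{address:#x} is below the preferred base")
    return load_base + rva


print(hex(rebase(0x140001000)))  # 0x11000
```

The result is the same value as code.address + 0x1000 when the image is loaded at 0x10000.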

Stubbing out the runtime

Note

TODO: This bit is likely mingw-specific. I need to take a look at an actual Windows binary.

If we tried to harness our program now, it would fail with an unmapped memory error. Let’s take a look at main() in a disassembler:

$ objdump -d pe.amd64.pe | grep -A 10 '140001000:'
   140001000:	55                   	push   %rbp
   140001001:	48 83 ec 30          	sub    $0x30,%rsp
   140001005:	48 8d 6c 24 30       	lea    0x30(%rsp),%rbp
   14000100a:	e8 11 06 00 00       	call   0x140001620
   14000100f:	c7 45 fc 00 00 00 00 	movl   $0x0,-0x4(%rbp)
   140001016:	c7 45 f8 0d d0 01 c0 	movl   $0xc001d00d,-0x8(%rbp)
   14000101d:	48 8d 0d dc 0f 02 00 	lea    0x20fdc(%rip),%rcx        # 0x140022000
   140001024:	e8 a7 0d 02 00       	call   0x140021dd0
   140001029:	8b 45 f8             	mov    -0x8(%rbp),%eax
   14000102c:	48 83 c4 30          	add    $0x30,%rsp
   140001030:	5d                   	pop    %rbp
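
The call targets printed by objdump can be checked by hand: a call with opcode e8 carries a signed 32-bit displacement that is added to the address of the next instruction. The sketch below redoes that arithmetic for the two calls in main(), using the bytes from the listing. The function name call_target is hypothetical.

```python
import struct


def call_target(address: int, encoding: bytes) -> int:
    """Resolve the destination of a 5-byte x86-64 `call rel32` instruction."""
    if len(encoding) != 5 or encoding[0] != 0xE8:
        raise ValueError("not a call rel32 instruction")
    # The displacement is a little-endian signed 32-bit integer
    (rel32,) = struct.unpack("<i", encoding[1:])
    # It is relative to the end of the call instruction
    return address + len(encoding) + rel32


first = call_target(0x14000100A, bytes.fromhex("e811060000"))
second = call_target(0x140001024, bytes.fromhex("e8a70d0200"))
print(hex(first), hex(second))  # 0x140001620 0x140021dd0
```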

Everything in the disassembly of main() is recognizable except the first call, to 0x140001620 (offset 0x1620 from the image base). Let’s disassemble that function as well:

$ objdump -d pe.amd64.pe | grep -A 10 '140001620:'
   140001620:	55                   	push   %rbp
   140001621:	56                   	push   %rsi
   140001622:	57                   	push   %rdi
   140001623:	53                   	push   %rbx
   140001624:	48 83 ec 28          	sub    $0x28,%rsp
   140001628:	48 8d 6c 24 20       	lea    0x20(%rsp),%rbp
   14000162d:	80 3d 54 7a 02 00 00 	cmpb   $0x0,0x27a54(%rip)        # 0x140029088
   140001634:	74 09                	je     0x14000163f
   140001636:	48 83 c4 28          	add    $0x28,%rsp
   14000163a:	5b                   	pop    %rbx
   14000163b:	5f                   	pop    %rdi

If we manually RE this function, we find that it’s just more runtime initialization that we really don’t care about modelling.

Thankfully, it doesn’t return anything or otherwise modify the caller, so we can easily use SmallWorld’s function hooking feature to stub it out:

# Configure __main model
class InitModel(smallworld.state.models.Model):
    name = "__main"
    platform = platform
    abi = smallworld.platforms.ABI.NONE

    def model(self, emulator: smallworld.emulators.Emulator) -> None:
        # Return
        pass


init = InitModel(code.address + 0x1620)
machine.add(init)

Putting it all together

Using what we now know about the PE loader and our particular executable, we can build a complete harness in pe.amd64.py. This will load our file, as well as set up the main stack, and hook puts(). See this tutorial for a more complete guide on how to hook a function, and this tutorial for how to handle external function references in a PE file.

import logging

import smallworld

# Set up logging
smallworld.logging.setup_logging(level=logging.INFO)

# Define the platform
platform = smallworld.platforms.Platform(
    smallworld.platforms.Architecture.X86_64, smallworld.platforms.Byteorder.LITTLE
)

# Create a machine
machine = smallworld.state.Machine()

# Create a CPU
cpu = smallworld.state.cpus.CPU.for_platform(platform)
machine.add(cpu)

# Load and add code into the state
filename = (
    __file__.replace(".py", ".pe")
    .replace(".angr", "")
    .replace(".panda", "")
    .replace(".pcode", "")
)
with open(filename, "rb") as f:
    code = smallworld.state.memory.code.Executable.from_pe(
        f, platform=platform, address=0x10000
    )
    machine.add(code)

# Create a stack and add it to the state
stack = smallworld.state.memory.stack.Stack.for_platform(platform, 0x2000, 0x4000)
machine.add(stack)

# Push a placeholder return address; emulation stops before it is used
stack.push_integer(0x10101010, 8, None)
cpu.rsp.set(stack.get_pointer())


# Configure __main model
class InitModel(smallworld.state.models.Model):
    name = "__main"
    platform = platform
    abi = smallworld.platforms.ABI.NONE

    def model(self, emulator: smallworld.emulators.Emulator) -> None:
        # Return
        pass


init = InitModel(code.address + 0x1620)
machine.add(init)


# Configure puts model
class PutsModel(smallworld.state.models.Model):
    name = "puts"
    platform = platform
    abi = smallworld.platforms.ABI.NONE

    def model(self, emulator: smallworld.emulators.Emulator) -> None:
        # Reading a block of memory from angr will fail,
        # since values beyond the string buffer's bounds
        # are guaranteed to be symbolic.
        #
        # Thus, we must step one byte at a time.
        s = emulator.read_register("rcx")
        v = b""
        try:
            b = emulator.read_memory_content(s, 1)
        except smallworld.exceptions.SymbolicValueError:
            b = None
        while b is not None and b != b"\x00":
            v = v + b
            s = s + 1
            try:
                b = emulator.read_memory_content(s, 1)
            except smallworld.exceptions.SymbolicValueError:
                b = None
        if b is None:
            raise smallworld.exceptions.SymbolicValueError(f"Symbolic byte at {hex(s)}")
        print(v)


puts = PutsModel(0x10000000)
code.update_import("api-ms-win-crt-stdio-l1-1-0.dll", "puts", puts._address)
machine.add(puts)

# Set entrypoint to "main"
cpu.rip.set(code.address + 0x1000)

# Emulate
emulator = smallworld.emulators.UnicornEmulator(platform)
# Exit at offset 0x1031, just past the pop %rbp that ends main()'s epilogue
emulator.add_exit_point(code.address + 0x1031)
final_machine = machine.emulate(emulator)
# for m in machine.step(emulator):
#    pass

Here is what running the harness looks like:

$ python3 pe.amd64.py
[+] starting emulation at 0x11000
[+] emulation complete
puts IAT at 25eb8 or 260ce
b'Hello, world!\n'

We see “Hello, world!”, so we have successfully harnessed pe.amd64.pe.
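
The byte-at-a-time loop in PutsModel is a general pattern for pulling a NUL-terminated string out of emulated memory when a larger read might touch unreadable bytes. Below is a standalone sketch of the same idea that does not depend on SmallWorld. The read_byte callback, the read_c_string name, and the length limit are all assumptions made for illustration, and a dictionary stands in for the emulator's memory.

```python
from typing import Callable, Optional


def read_c_string(
    read_byte: Callable[[int], Optional[bytes]], address: int, limit: int = 4096
) -> bytes:
    """Read a NUL-terminated string one byte at a time.

    read_byte returns a single byte, or None if the byte cannot be read.
    """
    out = bytearray()
    for offset in range(limit):
        b = read_byte(address + offset)
        if b is None:
            raise ValueError(f"unreadable byte at {hex(address + offset)}")
        if b == b"\x00":
            return bytes(out)
        out += b
    raise ValueError(f"no NUL terminator within {limit} bytes")


# Stand-in for emulated memory: address -> single byte
memory = {0x32000 + i: bytes([c]) for i, c in enumerate(b"Hello, world!\n\x00")}
print(read_c_string(memory.get, 0x32000))  # b'Hello, world!\n'
```

Unlike the loop in the harness, this version also caps the string length, so a missing terminator cannot cause an unbounded scan.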