Raw Function Hooking

This tutorial walks through hooking and modeling a function in an unstructured program, such as an embedded device image.

Consider the example tests/hooking/hooking.amd64.s:

; fake the PLT
gets@PLT equ 0x2800
puts@PLT equ 0x2808

BITS 64;
; This program reads an input string and then writes it out again using libc.
; This requires external calls, notionally to gets and puts which will need to
; be modeled.
        mov     rbp, rsp
        sub     rsp, 64
        mov     rdi, rsp
        call    gets@PLT
        call    puts@PLT

You will need to run the following commands to produce tests/hooking/hooking.amd64.bin:

cd smallworld/tests
make hooking/hooking.amd64.bin

Just by looking at the assembly, we can see that we have two external function references to gets@PLT and puts@PLT.
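As a quick sanity check, the addresses those two calls will reach at run time can be worked out by hand. The sketch below assumes the image is loaded at 0x1000 (the address the full harness later in this tutorial uses) and that the calls are encoded relative to the start of the image, so each target is the load address plus the PLT offset from the assembly. The names BASE and PLT_OFFSETS are illustrative only.

```python
# Illustrative arithmetic only; BASE and PLT_OFFSETS are hypothetical names.
BASE = 0x1000  # load address used by the full harness in this tutorial

# Offsets taken from the "fake PLT" equ lines in the assembly.
PLT_OFFSETS = {"gets": 0x2800, "puts": 0x2808}

# Each hooked address is the load address plus the PLT offset.
hook_addresses = {name: BASE + offset for name, offset in PLT_OFFSETS.items()}

for name, address in hook_addresses.items():
    print(f"{name}: {hex(address)}")
# gets: 0x3800
# puts: 0x3808
```

These are the same values the harness computes as code.address + 0x2800 and code.address + 0x2808.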

To successfully harness this program, we will need the following:

  • An execution stack, to support the call opcode and the stack buffer.

  • Some implementation of gets and puts.

We could find another binary that implements them, but we’re not actually interested in executing library code, or dealing with the insanity of configuring glibc. Luckily, we can provide Python models of these functions using SmallWorld.
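To make the idea of a function model concrete, here is a minimal conceptual sketch of hooking. It is not how SmallWorld is implemented, and ToyMachine and its methods are hypothetical names invented for illustration. The point is only that a hook maps an address to a Python callable, and that a call to that address runs the callable and then returns to the caller instead of executing machine code.

```python
from typing import Callable, Dict, List


class ToyMachine:
    """A toy stand-in for an emulator; not a SmallWorld class."""

    def __init__(self) -> None:
        self.pc = 0
        self.return_stack: List[int] = []
        self.hooks: Dict[int, Callable[["ToyMachine"], None]] = {}
        self.log: List[str] = []

    def hook(self, address: int, model: Callable[["ToyMachine"], None]) -> None:
        # Register a Python model to run in place of code at this address.
        self.hooks[address] = model

    def call(self, target: int, return_address: int) -> None:
        # Mimic a call instruction: save the return address and jump.
        self.return_stack.append(return_address)
        self.pc = target
        model = self.hooks.get(target)
        if model is None:
            raise RuntimeError(f"no code or model at {hex(target)}")
        # Run the model, then mimic a ret back to the caller.
        model(self)
        self.pc = self.return_stack.pop()


machine = ToyMachine()
# 0x3800 and 0x3808 are the PLT offsets plus an assumed 0x1000 load address.
machine.hook(0x3800, lambda m: m.log.append("gets model ran"))
machine.hook(0x3808, lambda m: m.log.append("puts model ran"))

# The return addresses here are arbitrary illustrative values.
machine.call(0x3800, return_address=0x1100)
machine.call(0x3808, return_address=0x1200)
print(machine.log, hex(machine.pc))
```

Calling an address with no registered model raises an error in this toy, which mirrors why the unmodeled program cannot run as-is: there is nothing mapped at the PLT addresses.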

Adding a Stack

Setting up a stack is covered in its own tutorial. For this test, we don’t need any interesting data on the stack. We just need to reserve the memory and configure the registers:

# Create our stack
stack = smallworld.state.memory.stack.Stack.for_platform(platform, 0x8000, 0x4000)
machine.add(stack)

# Push a fake return address onto the stack.
stack.push_integer(0xFFFFFFFF, 8, "fake return address")

sp = stack.get_pointer()
cpu.rsp.set(sp)
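The resulting stack pointer can be checked with plain arithmetic. This sketch assumes the stack grows downward from the top of the reserved region, and uses the base (0x8000) and size (0x4000) from the full harness later in this tutorial. The constant names are illustrative.

```python
# Illustrative arithmetic only; these constants mirror the full harness.
STACK_BASE = 0x8000   # lowest address of the stack region
STACK_SIZE = 0x4000   # size of the region in bytes
WORD_SIZE = 8         # size of the pushed fake return address

# Assumption: an empty stack starts at the top of the region and grows down.
sp = STACK_BASE + STACK_SIZE

# Pushing one 8-byte value moves the stack pointer down by 8.
sp -= WORD_SIZE

print(f"SP: {hex(sp)}")
# SP: 0xbff8
```

This matches the SP value printed in the sample run at the end of the tutorial.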

Adding Function Models

There are two options for adding function models. We can use SmallWorld’s existing library of models, or write our own. For this tutorial, let’s do both.

Adding gets

We can add a default gets model to our machine as follows:

# Configure gets model
#
# Lookup needs the following arguments:
# - The name of the function, in this case "gets"
# - The ABI.  In this case, we're compliant with ABI.SYSTEMV
# - The address.  From the assembly, this will be the base address of the program plus 0x2800.
gets = smallworld.state.models.Model.lookup(
    "gets", platform, smallworld.platforms.ABI.SYSTEMV, code.address + 0x2800
)
# Add the model to the machine
machine.add(gets)

This will read a string from stdin, convert it to a null-terminated byte string, and write it to memory at the address specified in the argument register.
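The behavior described above can be sketched in plain Python. This is not SmallWorld's gets model; toy_gets is a hypothetical helper that uses a bytearray in place of emulated memory and a file-like object in place of stdin. It keeps the trailing newline, because the sample output later in this tutorial shows the newline being echoed back.

```python
import io
from typing import TextIO


def toy_gets(memory: bytearray, address: int, stream: TextIO) -> int:
    """Hypothetical stand-in for a gets model; returns the buffer address."""
    # Read one line of text from the input stream.
    line = stream.readline()
    # Convert it to a null-terminated byte string.
    data = line.encode("utf-8") + b"\x00"
    # Refuse to write outside the toy memory region.
    if address < 0 or address + len(data) > len(memory):
        raise ValueError("write would fall outside mapped memory")
    # Write the bytes at the address that the argument register would hold.
    memory[address : address + len(data)] = data
    return address


memory = bytearray(64)  # stands in for the 64-byte stack buffer
toy_gets(memory, 0, io.StringIO("foobar\n"))
print(bytes(memory[:8]))
# b'foobar\n\x00'
```

Note that a real gets has no idea how large the destination buffer is; the bounds check here exists only to keep the toy memory from silently growing.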

Adding puts

Let’s write our own puts model. We want to read a null-terminated string out of memory at the address specified in the argument register, and print it to stdout.

Let’s be a little careful. If someone feeds us an unterminated string, our model could fail in a variety of ways:

  • We could get an “unmapped read” error if we go off the edge of mapped memory.

  • We could get a “symbolic value” error if we’re using angr and read past the edge of initialized memory.

The first case has fairly good error reporting. Let’s add a little more introspection for the second case:

# Define a puts model
class PutsModel(smallworld.state.models.Model):
    name = "puts"
    platform = platform
    abi = smallworld.platforms.ABI.NONE

    def model(self, emulator: smallworld.emulators.Emulator) -> None:
        # Reading a block of memory from angr will fail,
        # since values beyond the string buffer's bounds
        # are guaranteed to be symbolic.
        #
        # Thus, we must step one byte at a time.
        s = emulator.read_register("rdi")
        v = b""
        try:
            b = emulator.read_memory_content(s, 1)
        except smallworld.exceptions.SymbolicValueError:
            b = None
        while b is not None and b != b"\x00":
            v = v + b
            s = s + 1
            try:
                b = emulator.read_memory_content(s, 1)
            except smallworld.exceptions.SymbolicValueError:
                b = None
        if b is None:
            raise smallworld.exceptions.SymbolicValueError(f"Symbolic byte at {hex(s)}")
        print(v)

# Configure the puts model
puts = PutsModel(code.address + 0x2808)
# Add the puts model to the machine
machine.add(puts)
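The byte-at-a-time loop can be exercised without an emulator. The sketch below restructures the same logic into a standalone function and runs it against a stand-in emulator. ToyEmulator, the local SymbolicValueError class, and read_c_string are all hypothetical names for illustration; only the read_memory_content(address, size) call shape is taken from the model above.

```python
from typing import Dict


class SymbolicValueError(Exception):
    """Local stand-in for smallworld.exceptions.SymbolicValueError."""


class ToyEmulator:
    """Stand-in emulator: any byte not in `concrete` counts as symbolic."""

    def __init__(self, concrete: Dict[int, int]) -> None:
        self.concrete = concrete  # maps address -> byte value

    def read_memory_content(self, address: int, size: int) -> bytes:
        out = bytearray()
        for a in range(address, address + size):
            if a not in self.concrete:
                raise SymbolicValueError(f"symbolic byte at {hex(a)}")
            out.append(self.concrete[a])
        return bytes(out)


def read_c_string(emulator: ToyEmulator, address: int) -> bytes:
    """Read one byte at a time until a null terminator is found."""
    value = b""
    while True:
        try:
            b = emulator.read_memory_content(address, 1)
        except SymbolicValueError:
            # Report exactly where the string ran into uninitialized memory.
            raise SymbolicValueError(f"Symbolic byte at {hex(address)}") from None
        if b == b"\x00":
            return value
        value += b
        address += 1


BUFFER = 0xBFB8  # illustrative buffer address
terminated = ToyEmulator({BUFFER + i: c for i, c in enumerate(b"hi\x00")})
unterminated = ToyEmulator({BUFFER + i: c for i, c in enumerate(b"hi")})

print(read_c_string(terminated, BUFFER))
# b'hi'
try:
    read_c_string(unterminated, BUFFER)
except SymbolicValueError as e:
    print(e)
# Symbolic byte at 0xbfba
```

Reading the whole buffer in one call would fail on the first symbolic byte without saying where it was; stepping one byte at a time is what lets the error message name the exact address.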

Putting it All Together

Combined, this harness can be found in the script tests/hooking/hooking.amd64.py:

import logging

import smallworld

# Set up logging and hinting
smallworld.logging.setup_logging(level=logging.INFO)

# Define the platform
platform = smallworld.platforms.Platform(
    smallworld.platforms.Architecture.X86_64, smallworld.platforms.Byteorder.LITTLE
)

# Create a machine
machine = smallworld.state.Machine()

# Create a CPU
cpu = smallworld.state.cpus.CPU.for_platform(platform)
machine.add(cpu)

# Load and add code into the state
code = smallworld.state.memory.code.Executable.from_filepath(
    __file__.replace(".py", ".bin")
    .replace(".angr", "")
    .replace(".panda", "")
    .replace(".pcode", ""),
    address=0x1000,
)
machine.add(code)

# Create a stack and add it to the state
stack = smallworld.state.memory.stack.Stack.for_platform(platform, 0x8000, 0x4000)
machine.add(stack)

# Set the instruction pointer to the code entrypoint
cpu.rip.set(code.address)

# Push a return address onto the stack
stack.push_integer(0xFFFFFFFF, 8, "fake return address")

# Configure the stack pointer
sp = stack.get_pointer()
cpu.rsp.set(sp)

print(f"SP: {hex(sp)}")

# Configure gets model
gets = smallworld.state.models.Model.lookup(
    "gets", platform, smallworld.platforms.ABI.SYSTEMV, code.address + 0x2800
)
machine.add(gets)


# Configure puts model
class PutsModel(smallworld.state.models.Model):
    name = "puts"
    platform = platform
    abi = smallworld.platforms.ABI.NONE

    def model(self, emulator: smallworld.emulators.Emulator) -> None:
        # Reading a block of memory from angr will fail,
        # since values beyond the string buffer's bounds
        # are guaranteed to be symbolic.
        #
        # Thus, we must step one byte at a time.
        s = emulator.read_register("rdi")
        v = b""
        try:
            b = emulator.read_memory_content(s, 1)
        except smallworld.exceptions.SymbolicValueError:
            b = None
        while b is not None and b != b"\x00":
            v = v + b
            s = s + 1
            try:
                b = emulator.read_memory_content(s, 1)
            except smallworld.exceptions.SymbolicValueError:
                b = None
        if b is None:
            raise smallworld.exceptions.SymbolicValueError(f"Symbolic byte at {hex(s)}")
        print(v)


puts = PutsModel(code.address + 0x2808)
machine.add(puts)

# Emulate
emulator = smallworld.emulators.UnicornEmulator(platform)
emulator.add_exit_point(code.address + code.get_capacity())
final_machine = machine.emulate(emulator)

This harness should take its input over stdin, and echo it back to the console.

Here is what running it looks like:

$ echo "foobar" | python3 hooking.amd64.py
[+] starting emulation at 0x1000
[+] emulation complete
SP: 0xbff8
b'foobar\n'

We do in fact see “foobar” echoed back to the console, so we harnessed hooking.amd64.bin successfully.