Function Hooking with ELF Relocations¶
In this tutorial, you will be guided through the steps to model a function, and hook it using the linker metadata found in an ELF file.
Note
Details on actually loading an ELF file can be found in this tutorial. It’s strongly recommended you complete that one first.
Consider the example tests/rela/rela.elf.c:
#include <stdio.h>
int main() {
puts("Hello, world!\n");
return 0;
}
Here, we have an ELF that references the library function puts.
You can build this into rela.amd64.elf using the following commands:
cd smallworld/tests
make rela/rela.amd64.elf
Let’s take a look at the ELF metadata, specifically regarding puts:
$ readelf -s -D rela.amd64.elf | grep 'puts'
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 (3)
$ readelf -r rela.amd64.elf | grep 'puts'
000000002690 000200000007 R_X86_64_JUMP_SLO 0000000000000000 puts@GLIBC_2.2.5 + 0
The symbol is undefined, and it has a JUMP_SLOT relocation,
meaning that this is a dynamic symbol that’s going to get
loaded from another library.
We could load and harness an entire other library
just to provide puts, but we’re not that interested
in exercising the library. Let’s use a model instead.
To do that, we will need the following:
An execution stack, to support the
callopcodeA model of
puts
Adding a Stack¶
Setting up a stack is covered in this tutorial. For this test, we don’t need any interesting data on the stack. We just need to reserve the memory and configure the registers:
# Create our stack
stack = smallworld.state.memory.stack.Stack.for_platform(platform, 0x8000, 4000)
machine.add(stack)
# Push a fake return value onto the stack.
# Make it something we can use as an exit point later.
stack.push_integer(0x7FFFFFF8, 8, "fake return address")
sp = stack.get_pointer()
cpu.rsp.set(sp)
Adding a Puts Model¶
Let’s write our own puts model.
We want to read a null-terminated string out of memory
at the address specified at the argument register,
and print it to stdout.
Let’s be a little careful. If someone feeds us an unterminated string, our model could fail in a variety of ways:
We could get an “unmapped read” error if we go off the edge of mapped memory
We could get a “symbolic value” error if we’re using angr and read past the edge of initialized memory.
The first case has fairly good error reporting. Let’s add a little more introspection for the second case:
# Define a puts model
class PutsModel(smallworld.state.models.Model):
name = "puts"
platform = platform
abi = smallworld.platforms.ABI.NONE
def model(self, emulator: smallworld.emulators.Emulator) -> None:
# Reading a block of memory from angr will fail,
# since values beyond the string buffer's bounds
# are guaranteed to be symbolic.
#
# Thus, we must step one byte at a time.
s = emulator.read_register("rdi")
v = b""
try:
b = emulator.read_memory_content(s, 1)
except smallworld.exceptions.SymbolicValueError:
b = None
while b is not None and b != b"\x00":
v = v + b
s = s + 1
try:
b = emulator.read_memory_content(s, 1)
except smallworld.exceptions.SymbolicValueError:
b = None
if b is None:
raise smallworld.exceptions.SymbolicValueError(f"Symbolic byte at {hex(s)}")
print(v)
Linking the Puts Model¶
We don’t have a fixed address for puts. Normally, that would be up to the program loader.
Let’s just make up a nice round number.
# Configure the puts model at an arbitrary address
puts = PutsModel(code.address + 0x10000)
# Add the puts model to the machine
machine.add(puts)
We also need to do the program loader’s job and update our ELF
with the address of puts:
# Relocate puts
code.update_symbol_value("puts", puts._address)
Putting it All Together¶
Combined, this harness can be found in the script tests/rela/rela.amd64.py
import logging
import smallworld
# Set up logging and hinting
smallworld.logging.setup_logging(level=logging.INFO)
# Define the platform
platform = smallworld.platforms.Platform(
smallworld.platforms.Architecture.X86_64, smallworld.platforms.Byteorder.LITTLE
)
# Create a machine
machine = smallworld.state.Machine()
# Create a CPU
cpu = smallworld.state.cpus.CPU.for_platform(platform)
machine.add(cpu)
# Load and add code into the state
filename = (
__file__.replace(".py", ".elf")
.replace(".angr", "")
.replace(".panda", "")
.replace(".pcode", "")
)
with open(filename, "rb") as f:
code = smallworld.state.memory.code.Executable.from_elf(
f, platform=platform, address=0x400000
)
machine.add(code)
# Set the entrypoint to the address of "main"
entrypoint = code.get_symbol_value("main")
cpu.rip.set(entrypoint)
# Create a stack and add it to the state
stack = smallworld.state.memory.stack.Stack.for_platform(platform, 0x8000, 0x4000)
machine.add(stack)
# Push fake return
# Make it an exit point
exitpoint = entrypoint + code.get_symbol_size("main")
stack.push_integer(exitpoint, 8, None)
machine.add_exit_point(exitpoint)
# Configure the stack pointer
sp = stack.get_pointer()
cpu.rsp.set(sp)
# Configure puts model
class PutsModel(smallworld.state.models.Model):
name = "puts"
platform = platform
abi = smallworld.platforms.ABI.NONE
def model(self, emulator: smallworld.emulators.Emulator) -> None:
# Reading a block of memory from angr will fail,
# since values beyond the string buffer's bounds
# are guaranteed to be symbolic.
#
# Thus, we must step one byte at a time.
s = emulator.read_register("rdi")
v = b""
try:
b = emulator.read_memory_content(s, 1)
except smallworld.exceptions.SymbolicValueError:
b = None
while b is not None and b != b"\x00":
v = v + b
s = s + 1
try:
b = emulator.read_memory_content(s, 1)
except smallworld.exceptions.SymbolicValueError:
b = None
if b is None:
raise smallworld.exceptions.SymbolicValueError(f"Symbolic byte at {hex(s)}")
print(v)
puts = PutsModel(0x10000)
machine.add(puts)
# Relocate puts
code.update_symbol_value("puts", puts._address)
# Emulate
emulator = smallworld.emulators.UnicornEmulator(platform)
machine.emulate(emulator)
This harness should print Hello, world!\n to the console.
Here is what running it looks like:
$ python3 rela.amd64.py
[+] starting emulation at 0x4014d0
[+] emulation complete
b'Hello, world!\n'
We do in fact see, Hello, world!\n printed to the console,
so we harnessed rela.amd64.elf successfully.