Analyses

SmallWorld provides a simple interface for encapsulating analyses that take, as input, the machine state representation. These can be static analyses, meaning the code is examined but not run. Or they can be dynamic analyses which will run the code, using the input Machine to specify the initial state of memory and registers. Dynamic analysis will, additionally, employ instrumentation to monitor execution and collect side information.

Each analysis should create a subclass of Analysis. This interface is incredibly free-form; it includes a single method Analysis.run(), which takes a Machine object and performs whatever analyses the class implements.

Note

Analyses should not mutate the Machine they are passed.

The Analysis.run() method, notably, returns None. This is because analyses are intended to use Hints to communicate any results, making them available for additional processing.

The constructor for the Analysis class takes a Hinter object, which is stored in self.hinter.

Hints

Conceptually, hints are simply statements of discovery made by an analysis. The idea is that a given analysis will generate a number of discrete discoveries – hints – about the nature of some code. Hints can be collected, composed, and synthesized into even richer insights, which can go into logs or reports for direct human consumption or can, themselves, become higher-level hints intended for downstream analysis.

Take, as an example, the CodeCoverage analysis class. Given some SmallWorld harness that sets up the initial environment for execution and packs it into a Machine object, this analysis will execute code using the Unicorn emulator until it hits an exit point, steps outside a specified code bound, or raises an exception, collecting counts for every instruction program counter encountered. This coverage information (a dictionary mapping program counters to counts) is included as part of the CoverageHint that is emitted by this analysis.

Hints are packaged as data classes that subclass Hint.

Analyses designed to consume hints use the specific Hint subclass to filter the hints they want.

Note

Data inside of a Hint is passed by reference and is never marshalled, so a Hint can contain arbitrarily-large or complex information.

SmallWorld includes a library of Hint subclasses covering information relevant to evaluating a SmallWorld harness. Analyses are encouraged to use existing hint subclasses, although they may create their own if necessary.

Hinting is implemented as a basic pub-sub system via the Hinter class. An Analysis takes, as input to its constructor, a Hinter object. This is subsequently available in self.hinter and can be used with self.hinter.send() to publish a hint for consumption, and self.hinter.register() to register a callback that will fire if a specific class of hint is received. The callback will be of the form callback(Hint) -> None.

Caution

Callbacks are only triggered on an exact class match. If a callback is registered to a given Hint class, and an analysis sends a subclass, the callback will not fire.

The following is a basic example of two dependent analyses that communicate via Hint:

from smallworld.hinting import Hint, Hinter
from smallworld.analysis import Analyses
from smallworld.state import Machine

class FirstAHint(Hint):
    pass

class FirstAnalysis(Analysis):
    name = "first-analysis"
    version = "0"
    description = "An analysis that sends hints"

    def run(self, machine: Machine):
        # Send a hint when we start.
        self.hinter.send(FirstAHint(
            message="Hello, world!"
        ))

class SecondAnalysis(Analysis):
    name = "second-analysis"
    version = "0"
    description = "An analysis that listens for hints"

    def on_hint(self, hint: Hint):
        print(f"Hint: {hint.message}")

    def run(self, machine: Machine):
        # Listen for hints of type Hint.
        self.hinter.register(FirstAHint, self.on_hint)

# Set up the hinter
hinter = Hinter()

# Prepare the dependent analysis;
# should return without doing anything.
machine.analyze(SecondAnalysis(hinter))

# Run the base analysis.
# Should cause SecondAnalysis to print "Hello, World!"
machine.analyze(FirstAnalysis(hinter))

For a more realistic example of how analyses can compose, consider studying the communication of hints between TraceExecution and CoverageFrontier. The TraceExecution analysis is somewhat like CodeCoverage, in that it takes the initial code execution environment specified by the Machine object passed, as input, to the self.run() method, and uses this to execute code. The output for TraceExecution is a TraceExecutionHint which includes the sequence of instructions executed, along wih information about comparison and branch instructions encountered, as well as indications about how a trace ended (in an exception or simply because execution reached a proscribed bound). The CoverageFrontier analyses registers a callback on the TraceExecutionHint, collecting these hints across multiple executions and analyzing them, in aggregate to determine which branches in code are encountered but only ever go one way (are half-covered). These branches, or coverage frontier are interesting for targeted fuzzing or other activities.

SmallWorld Analyses