Working with Types, Structures, and Symbols

This document is organized into four sections describing how to work with types in Binary Ninja. The first section covers how to interact with any type, regardless of its source.

The second section explains how to work with the Type Library. This includes the multiple sources from which Binary Ninja can automatically obtain type information, and how you can add to them.

Next, the third section explains how to work with the signature library. While the signature library technically does not directly inform types, it will help automatically match statically compiled functions which are then matched with the type libraries described in the previous section.

Finally, we'll cover how to work with Symbols in a binary.

Working With Types

There are two main ways to interact with types from within a binary view. The first is to use the types view, and the second is to take advantage of the smart structures workflow or otherwise annotate types directly in a disassembly or IL view.

Direct UI manipulation

The simplest way to directly manipulate types in disassembly is by viewing an existing variable or sequence of bytes in linear view and using the following hotkeys:

  • 1, 2, 4, 8: The number hotkeys create a data variable at the current location if none exists, and then set its type to an integer of the size in bytes given by the hotkey.
  • d: Repeatedly pressing d cycles through the integer sizes, with the same effect as pressing the numbers in order.
  • -: Toggles an integer between signed and unsigned.
  • a: Creates a character array at the current location (or retypes the current variable as one), extending up to and including the next null byte.
  • o: Creates a pointer at the current location, or retypes the current variable as one.
  • *: If you have a selection of identical variables, * will convert them into an array of elements.
  • s: A magic hotkey described in greater detail in the next section.

Note that you can apply these types to a region of memory as well, not just a single variable. For example, selecting a large block of bytes and pressing 2 followed by * will create an array of int16_t sized elements.
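To make the effect of these hotkeys concrete, here is a minimal sketch in plain Python (it does not use the Binary Ninja API) of how the same bytes read once they are typed. The helper names as_int16_array and char_array_extent are hypothetical, and little-endian byte order is an assumption; the real result depends on the binary's architecture.

```python
import struct


def as_int16_array(data: bytes) -> list:
    """Interpret little-endian bytes as consecutive signed 16-bit integers,
    roughly what selecting a region and pressing 2 then * produces."""
    count = len(data) // 2
    return list(struct.unpack(f"<{count}h", data[: count * 2]))


def char_array_extent(data: bytes, start: int = 0) -> int:
    """Length of a character array beginning at start, up to and including
    the next null byte (the span the a hotkey would cover)."""
    end = data.index(b"\x00", start)
    return end - start + 1


selection = bytes([0x01, 0x00, 0xFF, 0xFF, 0x34, 0x12, 0x00, 0x80])
values = as_int16_array(selection)
print(values)

string_length = char_array_extent(b"tape\x00rest")
print(string_length)
```

Pressing - on such a variable would flip the interpretation between signed and unsigned, which in struct terms is the difference between the "h" and "H" format codes.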

Smart Structures Workflow

New to stable version 1.3.2015 is the "Smart Structures" feature. Rather than manually create a type in the type view and then apply it to disassembly, you can create structures directly from disassembly using the s hotkey. Consider the following example (created using taped from the 2011 Ghost in the Shellcode CTF if you'd like to play along at home):

  1. Assembly view of the start of 0x8048e20
  2. MLIL view of the same basic block
  3. MLIL view after selecting the return of calloc and pressing s
  4. MLIL view after selecting the offset and pressing s to turn it into a member access
  5. MLIL view after selecting the remaining offsets and pressing s in turn
  6. Viewing the structure automatically created after this workflow
  7. Selecting the remaining bytes and turning them into an array using 1 to turn them all into uint8_t variables, and then * to turn them all into an array

Note that the last step is entirely optional. With the basic structure created, further reverse engineering of this binary reveals that it is actually a linked list and that the structures should look like:

struct Page
{
    int num;
    int count;
    struct Tape* tapes[8];
    struct Page* prev;
    struct Page* next;
};

and:

struct Tape
{
    int id;
    char* name;
    char text[256];
};
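As a rough cross-check of the member offsets these two declarations imply, the same layout can be modelled with Python's standard ctypes module. This is only a sketch, not part of Binary Ninja: the offsets it prints reflect the pointer size of the machine running Python, which may differ from the target binary.

```python
import ctypes


class Tape(ctypes.Structure):
    _fields_ = [
        ("id", ctypes.c_int),
        ("name", ctypes.c_char_p),
        ("text", ctypes.c_char * 256),
    ]


class Page(ctypes.Structure):
    pass  # declared first so the structure can hold pointers to itself


Page._fields_ = [
    ("num", ctypes.c_int),
    ("count", ctypes.c_int),
    ("tapes", ctypes.POINTER(Tape) * 8),
    ("prev", ctypes.POINTER(Page)),
    ("next", ctypes.POINTER(Page)),
]

# Pointer size of the host interpreter (an assumption about where this runs).
ptr_size = ctypes.sizeof(ctypes.c_void_p)

for field_name, _ in Page._fields_:
    offset = getattr(Page, field_name).offset
    print(f"Page.{field_name}: offset {offset:#x}")
```

These are the kinds of offsets that show up as raw pointer arithmetic in MLIL before the structure is applied.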

We can either update our automatically created structure by pressing y to change member types and n to change their names, or we can use the types view to directly import the C code and then apply the types using y. That gives us HLIL that now looks like:

[Image: Taped HLIL]

Types View

To see all types in a Binary View, use the types view. It can be accessed from the menu View > Types. Alternatively, you can access it with the t hotkey from most other views, or by using [CMD/CTRL] p to open the command palette and typing "types". This is the most common interface for creating structures, unions, and types using C-style syntax.

For many built-in file formats, common headers are already enumerated in the types view. These headers are applied when viewing the binary in linear view, where the binary data is shown parsed into that structure or type. This makes them particularly useful for binary parsing, even of non-executable file formats.

[Image: Types View]

Shortcuts and Attributes

From within the Types view, you can use the following hotkeys to create new types, structures, or unions. Alternatively, you can use the right-click menu to access these options and more.

[Image: Types Right Click Menu]

  • s - Create new structure
  • i - Create new type
  • [SHIFT] s - Create new union

The shortcuts for editing existing elements are:

  • y - Edit type / field
  • n - Rename type / field
  • l - Set structure size
  • u - Undefine field

Structs support the attribute __packed to indicate that there is no padding. Additionally, function prototypes support the following keywords to indicate their calling convention or other features:

__cdecl
__stdcall
__fastcall
__convention
__noreturn

Applying Structures and Types

Changing a type

Once you've created your structures, you can apply them to your disassembly. Simply select an appropriate token (variable or memory address), and press y to bring up the change type dialog. Types can be applied on both disassembly and all levels of IL. Any variables that are shared between the ILs will be updated as types are applied.

Examples

enum _flags
{
    F_X = 0x1,
    F_W = 0x2,
    F_R = 0x4
};
struct Header __packed
{
    char *name;
    uint32_t version;
    void (* callback)();
    uint16_t size;
    enum _flags flags;
};
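To see what __packed changes for the Header example above, the sketch below uses Python's standard struct module to compare the size with and without alignment padding. It assumes 64-bit pointers and a 4-byte enum; both are assumptions for illustration, and the real sizes depend on the binary's platform.

```python
import struct

# Header members, assuming 64-bit pointers and a 4-byte enum:
#   char *name        -> Q (8 bytes)
#   uint32_t version  -> I (4 bytes)
#   void (*callback)()-> Q (8 bytes)
#   uint16_t size     -> H (2 bytes)
#   enum _flags flags -> I (4 bytes)
FIELDS = "QIQHI"

# "=" uses standard sizes with no padding, like a packed structure.
packed_size = struct.calcsize("=" + FIELDS)

# "@" inserts the padding the host's C compiler would use between members.
native_size = struct.calcsize("@" + FIELDS)

print(f"packed: {packed_size} bytes, naturally aligned: {native_size} bytes")
```

Without __packed, padding is inserted before callback and flags so that each member sits on its natural alignment boundary; with it, the members are laid out back to back.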

Using the API

Of course, like everything else in Binary Ninja, anything you can accomplish in the UI you can accomplish using the API. Manipulating types is no exception. Here are four common workflows for working with types as commented examples.

Create a new type

Let's follow the most basic workflow: making a new type, inserting it into a BinaryView, and then applying to a variable or memory address.

From C syntax

There are two main ways to create a type with the API. The first is to use one of our APIs that parse a type string and return a type object. For a simple, single type, parse_type_string will return a tuple of the Type and the QualifiedName:

>>> bv.parse_type_string("int foo")
(<type: int32_t>, 'foo')
>>>

For more complicated types that are already in C syntax, you may want to take advantage of the parse_types_* APIs.

>>> bv.platform.parse_types_from_source('''
enum colors {blue, green, brown};

struct person
{
    char name[20];
    int age;
    enum colors eyecolor;
};
''')
<types: {'colors': <type: enum>, 'person': <type: struct>}, variables: {}, functions: {}>

If you're importing a large number of headers from an existing project you might find that some features are not compatible with the type parser that Binary Ninja currently uses. In that case, you may find the Header Plugin useful as it attempts to automate the normalization of headers to make them compatible with Binary Ninja's type parser.

NOTE: While they have similar names, be aware that the parse_types APIs live off of the Platform class as they require knowledge of the existing architecture's platform whereas the simpler parse_type_string is accessed directly from the BinaryView class.

Using Type objects

Base types can be easily composed to create simple type objects or added to structures:

>>> myar = Type.array(Type.char(), 20)
>>> print(repr(myar))
<type: char [20]>

This is useful for creating structures that are not easily created in C syntax, such as sparse structures with only some members defined:

>>> s = types.Structure()
>>> s
<struct: size 0x0>
>>> s.insert(0x20, myar, name='name')
>>> s
<struct: size 0x34>
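The size reported above follows from simple arithmetic: a 20-byte member inserted at offset 0x20 ends at 0x34, and the bytes before it are simply left undefined. The sketch below models that rule in plain Python. SparseStruct is a hypothetical stand-in written for illustration, not the Binary Ninja Structure class.

```python
class SparseStruct:
    """Hypothetical stand-in for a sparse structure; not the Binary Ninja API."""

    def __init__(self):
        self.members = {}  # offset -> (name, width in bytes)

    def insert(self, offset, width, name):
        self.members[offset] = (name, width)

    @property
    def size(self):
        # The structure must be large enough to hold its furthest-reaching member.
        return max(
            (offset + width for offset, (_, width) in self.members.items()),
            default=0,
        )


s = SparseStruct()
empty_size = s.size
s.insert(0x20, 20, "name")  # a char[20] member at offset 0x20
print(hex(empty_size), hex(s.size))
```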

Adding types

Next, we're going to take the optional step of inserting our type into the current BinaryView with the define_user_type API:

>>> bv.define_user_type("myString", myar)

And we can verify the type shows up in the types view as expected:

[Image: Custom type]

This makes the type available to the user to apply more easily and is appropriate for named structures, but is not required if you simply wish to set a type as shown in the next step.

Applying

Of course, having the type available doesn't actually apply it to anything in the binary. Let's examine our sample binary, find a suitable string (like the one at 0x8049f34) and create a data variable using our new type:

>>> bv.define_user_data_var(0x8049f34, bv.types["myString"])

And now we can see that the string was indeed applied to our location:

[Image: Custom type]

Of course, we could have just directly applied our type without inserting it into the types available in the binary. For example:

>>> bv.define_user_data_var(0x8049f34, Type.array(Type.char(), 20))

If we want to name the variable there, see the section below on working with symbols.

NOTE: There is also a define_data_var API; however, it creates an AUTO data variable. Auto APIs are intended for code that is expected to run every time a binary is opened, so their results are not saved to the analysis database; they will generally be regenerated on load. Keep that in mind when you see APIs that have both _auto_ and _user_ variants.

Deleting

To remove a type from the view:

>>> bv.undefine_user_type('person')

Or you can remove a type applied to memory:

>>> bv.undefine_user_data_var(0x8049f34)

Modifying

Here's a snippet to take an existing function, and set the confidence of all the parameter types to 100%:

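# FunctionParameter and Type both live in binaryninja.types
from binaryninja.types import FunctionParameter, Type

old = current_function.function_type
new_parameters = []
for var, param in zip(current_function.parameter_vars, old.parameters):
    new_type = var.type  # the parameter's type, not the variable itself
    new_type.confidence = 255  # maximum confidence
    new_parameters.append(FunctionParameter(new_type, param.name, param.location))

# Rebuild the function type with the updated parameter list
current_function.function_type = Type.function(
    old.return_value, new_parameters, old.calling_convention,
    old.has_variable_arguments, old.stack_adjustment)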

Type Library

coming soon...

Signature Library

While many signatures are built-in and require no interaction to automatically match functions, you may wish to add or modify your own. First, install the SigKit plugin from the plugin manager.

Running the signature matcher

The signature matcher runs automatically by default once analysis completes. You can turn this off in Settings > Analysis > Autorun Function Signature Matcher (the analysis.signatureMatcher.autorun setting).

You can also trigger the signature matcher to run from the menu Tools > Run Analysis Module > Signature Matcher.

Once the signature matcher runs, it will print a brief report to the console detailing how many functions it matched and will rename matched functions. For example:

1 functions matched total, 0 name-only matches, 0 thunks resolved, 33 functions skipped because they were too small

Generating signature libraries

To generate a signature library for the currently-open binary, use Tools > Signature Library > Generate Signature Library. This will generate signatures for all functions in the binary that have a name attached to them. Note that functions with automatically-chosen names such as sub_401000 will be skipped. Once it's generated, you'll be prompted where to save the resulting signature library.

For headless users, you can generate signature libraries by using the sigkit API (examples and documentation). For more detailed information, see our blog post describing signature generation.

If you are accessing the sigkit API through the Binary Ninja GUI and you've installed the sigkit plugin through the plugin manager, you will need to import sigkit under a different name:

import Vector35_sigkit as sigkit

Installing signature libraries

Binary Ninja loads signature libraries from two locations: the signatures folder in the install path and the signatures folder in your user directory.

WARNING: Always place your signature libraries in your user directory. The install path is wiped whenever Binary Ninja auto-updates. You can locate your user directory by running Open Plugin Folder from the command palette and navigating up one directory.

Inside the signatures folder, each platform has its own folder for its set of signatures. For example, windows-x86_64 and linux-ppc32 are two sample platforms. When the signature matcher runs, it uses the signature libraries that are relevant to the current binary's platform. (You can check the platform of any binary you have open in the UI using the console and typing bv.platform)
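The folder layout described above can be sketched with Python's standard pathlib module. The helper names are hypothetical, and the user directory is passed in by the caller rather than discovered, since its location varies by system. The demo uses a temporary directory in place of a real user directory.

```python
import tempfile
from pathlib import Path


def platform_signature_dir(user_dir, platform_name):
    """Folder holding the signature libraries for one platform."""
    return Path(user_dir) / "signatures" / platform_name


def list_signature_libraries(user_dir, platform_name):
    """Names of the .sig files available for a platform, sorted."""
    sig_dir = platform_signature_dir(user_dir, platform_name)
    if not sig_dir.is_dir():
        return []
    return sorted(path.name for path in sig_dir.glob("*.sig"))


# Demo against a throwaway directory standing in for the user directory.
with tempfile.TemporaryDirectory() as demo_user_dir:
    sig_dir = platform_signature_dir(demo_user_dir, "windows-x86_64")
    sig_dir.mkdir(parents=True)
    (sig_dir / "mylib.sig").touch()
    found = list_signature_libraries(demo_user_dir, "windows-x86_64")
    missing = list_signature_libraries(demo_user_dir, "linux-ppc32")

print(found, missing)
```

The platform name passed in should match what bv.platform reports for the binary the signatures are meant for.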

Manipulating signature libraries

You can edit signature libraries programmatically using the sigkit API. A very basic example shows how to load and save signature libraries. Note that Binary Ninja only supports signatures in the .sig format; the other formats are for debugging. The easiest way to load and save signature libraries in this format is with the sigkit.load_signature_library() and sigkit.save_signature_library() functions.

To help debug and optimize your signature libraries, you can open them in the Signature Explorer GUI using Tools > Signature Library > Explore Signature Library. This GUI can also be opened through the sigkit API using sigkit.signature_explorer() and sigkit.explore_signature_library().

For a text-based approach, you can also export your signature libraries to JSON using the Signature Explorer. Then, you can edit them in a text editor and convert them back to a .sig using the Signature Explorer afterwards. Of course, these conversions are also accessible through the API as the sigkit.sig_serialize_json module, which provides a pickle-like interface. Likewise, sigkit.sig_serialize_fb provides serialization for the standard .sig format.

Symbols

Some binaries helpfully have symbol information in them which makes reverse engineering easier. Of course, even if the binary doesn't come with symbol information, you can always add your own. From the UI, this couldn't be simpler. Just select the function, variable, member, register, or whatever you want to change the symbol of and press n.

[Image: Rename a function]

That's it! From an API perspective, there are some helper functions to make the process easier. For example, to rename a function:

>>> current_function.name
'main'
>>> current_function.name = "newName"
>>> current_function.name
'newName'

Other objects or variables may need a symbol created and applied:

>>> mysym = Symbol(SymbolType.FunctionSymbol, here, "myVariableName")
>>> mysym
<SymbolType.FunctionSymbol: "myVariableName" @ 0x80498d0>
>>> bv.define_user_symbol(mysym)

Note that here and bv are used in many of the previous examples. These shortcuts and several others are only available when running in the Binary Ninja Python console and are used here for convenience.

Valid symbol types include:

SymbolType              Description
FunctionSymbol          Symbol for function that exists in the current binary
ImportAddressSymbol     Symbol defined in the Import Address Table
ImportedFunctionSymbol  Symbol for a function that is not defined in the current binary
DataSymbol              Symbol for data in the current binary
ImportedDataSymbol      Symbol for data that is not defined in the current binary
ExternalSymbol          Symbols for data and code that reside outside the BinaryView
LibraryFunctionSymbol   Symbols for external functions outside the library