Merged
4 changes: 0 additions & 4 deletions .gitignore
@@ -1,6 +1,2 @@
__pycache__
.vscode
custom_hooks.py
tests/bin/*
tests/out/*
tests/Makefile
172 changes: 95 additions & 77 deletions README.md
@@ -1,77 +1,95 @@
# Symless

Automatic structures recovering plugin for IDA. Able to reconstruct structures/classes and virtual tables used in a binary.

### Features
* Automatic creation of identified structures (c++ classes, virtual tables and others)
* Xrefs on structures usages
* Functions typing using gathered information

Two modes are available: **Pre-Analysis** and **Plugin**.

## Plugin mode
Interactive IDA plugin. Uses static analysis from an entry point selected by the user to build and propagate a structure.

<p align="center">
<kbd>
<img src="img/plugin-demo.gif" alt="Plugin demo"/>
</kbd>
</p>


### Installation
```
$ python plugin/install.py [-u]
```

**Manual installation**: copy the [symless](symless/) directory and [symless_plugin.py](plugin/symless_plugin.py) into IDA plugins folder.

### Usage
While in IDA disassembly view:
- Right-click a register that contains a structure pointer
- Select **Propagate structure**
- Select which structure & shift to apply

Symless will then propagate the structure, build it and type untyped functions / operands with the harvested information. This action can be undone with **Ctrl-Z**. A new structure can be created, an existing one can be completed.

## Pre-Analysis mode

### Before use

#### Specify your IDA installation:

```
export IDA_DIR="$HOME/idapro-M.m"
```

#### Edit the config file to suit your case:

Specify the memory allocation functions used in your executable in the [imports.csv](symless/config/imports.csv) file. Syntax is discussed there.

Symless uses those to find structures creations from memory allocations. C++ classes can also be retrieved from their virtual tables.

### Usage
```
$ python3 symless.py [-c config.csv] <target(s)>
```

* ```config.csv``` - configuration to be used (defaults to [imports.csv](symless/config/imports.csv))
* ```target(s)``` - one or more binaries / IDA bases

Symless will create a new IDA base when given an executable as an argument. Otherwise keep in mind it may overwrite user-modifications on existing bases.

Once done the IDA base will be populated with information about identified structures.

## Support
Both stripped and non-stripped binaries are supported. Symbols are only used to name the created structures.

**x64** and **i386** binaries using the following calling conventions are supported:
* Windows x64 (```__fastcall```)
* Windows i386 (```__stdcall``` & ```__thiscall```)
* System V x64 (```__fastcall```)
* System V i386 (```__stdcall```)

**IDA Pro 7.6** or newer & **python 3**

## Disclaimer
Symless is still in development and might not fit every use case.
# Symless

An **IDA Pro plugin** that assists with **structure reconstruction**. Using static data-flow analysis to gather information, Symless automates most of the structure creation workflow. Its key features are:

* Inferring and creating structure fields based on access patterns
* Identifying and creating C++ virtual function tables
* Placing cross-references to link each structure field with the code that uses it

## Installation

```bash
$ python3 plugin/install.py [-u]
```

Or install manually: copy the [symless](symless/) directory and [symless_plugin.py](plugin/symless_plugin.py) file into your IDA plugins folder.

## Usage

The **interactive plugin** helps reconstruct a chosen structure. In the Disassembly or Pseudocode view, right-click a line that uses the structure you want to rebuild and select **Propagate structure** from the context menu:

<p align="center">
<kbd>
<img src="img/plugin_context_menu.png"/>
</kbd>
</p>

A form will appear prompting for:

* The **name of the new structure** to create, or an existing structure to extend.
* An **entry point** for the data-flow analysis, which is performed on the microcode. This entry point is a microcode operand that holds a pointer to the structure.

> [!NOTE]
> The microcode is IDA's intermediate representation (IR), generated from the CPU-specific assembly. Because of its similarity with the assembly, it is not difficult to read.

<p align="center">
<kbd>
<img src="img/plugin_builder_form.png" width="448"/>
</kbd>
</p>

Additional options are:

* **Shifted by**, the shift to apply to the structure pointer
* **Spread in callees**, whether the analysis should extend into called functions and discovered virtual methods

Clicking **Propagate** starts the analysis. The structure pointer is tracked from the selected entry, and observed accesses are used to infer structure fields.
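
As a rough illustration of how observed accesses can be turned into fields, here is a minimal sketch. It is not Symless's actual algorithm: the `Access` record, `infer_fields` and `render` are hypothetical names, access sizes are assumed to be 1, 2, 4 or 8 bytes, and the shift is assumed to mean that the tracked pointer targets the structure base plus `shift`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One observed memory access through the tracked pointer (hypothetical model)."""
    disp: int  # displacement from the tracked pointer, in bytes
    size: int  # access width, in bytes


def infer_fields(accesses, shift=0):
    """Map each distinct structure offset to the widest access seen there."""
    fields = {}
    for acc in accesses:
        offset = shift + acc.disp  # assumption: pointer = structure base + shift
        fields[offset] = max(fields.get(offset, 0), acc.size)
    return dict(sorted(fields.items()))


def render(name, fields):
    """Emit a C-like declaration, padding unobserved gaps with byte arrays."""
    lines = [f"struct {name} {{"]
    cursor = 0
    for offset, size in fields.items():
        if offset > cursor:
            lines.append(f"    char gap_{cursor:x}[{offset - cursor}];")
        if offset >= cursor:  # skip a field that overlaps an earlier one
            lines.append(f"    uint{size * 8}_t field_{offset:x};")
            cursor = offset + size
    lines.append("};")
    return "\n".join(lines)


fields = infer_fields([Access(0, 8), Access(8, 2), Access(8, 4), Access(16, 8)])
print(render("demo", fields))
```

Offsets that are never accessed stay as padding in this sketch, which is why starting from initialization code tends to give a more complete layout.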

> [!TIP]
> To get a more complete structure, run the analysis from the code that initializes the structure (for example, right after an allocation or inside a constructor).

The new structure is added to the Local Types view. Cross-references are added on assembly operands for each field access:

<p align="center">
<kbd>
<img src="img/plugin_built_structure.png"/>
</kbd>
</p>

You can then edit field types directly from the pseudocode. The plugin reduces the back-and-forth navigation between disassembly, pseudocode and local types that is otherwise required when creating structures and placing cross-references.

## CLI mode

An **automatic command-line** mode also exists, able to identify and automatically reconstruct most of the structures used in a binary. Symless uses two sources to discover structures:

* Dynamic memory allocations
* C++ virtual function tables and constructors

This automatic mode is intended as a pre-analysis step, to create structures and improve decompilation before manual work.

First, add the memory allocators used in your executable in [imports.csv](symless/config/imports.csv). This allows Symless to rebuild structures from dynamic allocations. If you don't, only C++ classes with virtual tables will be reconstructed.

The pre-analysis is run using:

```bash
$ python3 symless.py [-c config.csv] <target>
```

* ```config.csv``` - configuration file to use (defaults to [imports.csv](symless/config/imports.csv))
* ```target``` - a binary or an IDA database

If the target is an executable, a new IDA database is created. When the analysis finishes, the database is populated with the reconstructed structures.
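
Since the command takes a single target, analyzing several binaries means invoking it once per file. The following is a minimal sketch of such a wrapper, assuming it is run from the repository root; `symless_command` and `run_all` are hypothetical helpers, not part of Symless, and the meaning of the exit status is not documented here.

```python
import subprocess
from typing import Dict, Iterable, List, Optional


def symless_command(target: str, config: Optional[str] = None) -> List[str]:
    """Build the argv for one pre-analysis run, mirroring the usage line above."""
    cmd = ["python3", "symless.py"]
    if config is not None:
        cmd += ["-c", config]
    cmd.append(target)
    return cmd


def run_all(targets: Iterable[str], config: Optional[str] = None) -> Dict[str, int]:
    """Run the pre-analysis once per target and collect each exit status."""
    return {t: subprocess.run(symless_command(t, config)).returncode for t in targets}
```

For example, `run_all(["a.exe", "b.exe"], config="my_allocators.csv")` would launch two separate runs with the same configuration file.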

### Limitations

The main challenge for the automatic analysis is resolving conflicts between structures. This can cause functions to be incorrectly typed, or duplicated structures to be created. In some cases it is better to use the interactive plugin, which is less prone to errors.

## Support

All architectures supported by your IDA decompiler are supported.

Supported IDA versions are **IDA 8.4 and later**.

## Credits

Thalium Team, and Célian Debéthune for working on the architecture-agnostic version during his internship at Thalium.
Binary file removed img/plugin-demo.gif
Binary file not shown.
Binary file added img/plugin_builder_form.png
Binary file added img/plugin_built_structure.png
Binary file added img/plugin_context_menu.png
68 changes: 28 additions & 40 deletions plugin/symless_plugin.py
@@ -1,10 +1,10 @@
import base64
import collections
import importlib
import inspect
import os
import pkgutil
import sys
import traceback
from typing import Collection

import idaapi
@@ -15,9 +15,10 @@


class fixedBtn(idaapi.Form.ButtonInput):
    def __init__(self, plugin: "SymlessPlugin"):
    def __init__(self, plugin: "SymlessPlugin", form: "SymlessInfoForm"):
        super().__init__(self.reload, "0")
        self.plugin = plugin
        self.form = form

    def reload(self, code):
        idaapi.show_wait_box("Reloading Symless..")
@@ -26,18 +27,17 @@ def reload(self, code):
            # terminate all extensions
            self.plugin.term()

            # reload symless code
            reload_plugin()
            remove_old_modules()

            # rebind all extensions
            self.plugin.find_extensions()
            self.plugin.find_extensions(reload=True)

        except Exception as e:
            import traceback

            idaapi.hide_wait_box()
            utils.g_logger.critical(repr(e) + "\n" + traceback.format_exc())
        finally:
        else:
            idaapi.hide_wait_box()
            self.form.Close(1)

    def get_tag(self):
        return "<Reload:%s%d:%s%s:%s:%s>" % (
@@ -72,7 +72,7 @@ def __init__(self, plugin: "SymlessPlugin"):
            {
                "img": idaapi.Form.StringLabel(img_html, tp=idaapi.Form.FT_HTML_LABEL, size=None),
                "info": idaapi.Form.StringLabel(info_html, tp=idaapi.Form.FT_HTML_LABEL, size=None),
                "reload": fixedBtn(plugin),
                "reload": fixedBtn(plugin, self),
            },
        )

@@ -92,7 +92,7 @@ def init(self) -> idaapi.plugmod_t:
        return idaapi.PLUGIN_KEEP

    # find and load extensions from symless plugins folder
    def find_extensions(self):
    def find_extensions(self, reload: bool = False):
        for mod_info in pkgutil.walk_packages(plugins.__path__, prefix="symless.plugins."):
            if mod_info.ispkg:
                continue
@@ -111,21 +111,27 @@ def find_extensions(self):
                spec.loader.exec_module(module)
            except BaseException as e:
                sys.modules.pop(module.__name__)
                print(f"Error while loading extension {mod_info.name}: {e}")
                utils.g_logger.error(f"Error while loading extension {mod_info.name}:")
                utils.g_logger.error(repr(e) + "\n" + traceback.format_exc())
                continue

            # module defines an extension
            if not hasattr(module, "get_plugin"):
                continue

            ext: plugins.plugin_t = module.get_plugin()

            # notify the extension that it has been reloaded
            if reload:
                ext.reload()

            self.ext.append(ext)

    # debug - reload plugin action
    # display info panel
    def run(self, args):
        info = SymlessInfoForm(self)
        info.Compile()
        return info.Execute()
        info.Execute()
        info.Free()

    # term all extensions
    def term(self):
@@ -138,31 +144,13 @@ def PLUGIN_ENTRY() -> idaapi.plugin_t:
    return SymlessPlugin()


# reload one module, by first reloading all imports from that module
# to_reload contains all modules to reload
def reload_module(module, to_reload: set):
    if module not in to_reload:
        return

    # remove from set first, avoid infinite recursion if recursive imports
    to_reload.remove(module)

    # reload all imports first
    for _, dep in inspect.getmembers(module, lambda k: inspect.ismodule(k)):
        reload_module(dep, to_reload)

    # reload the module
    utils.g_logger.info(f"Reloading {module.__name__} ..")
    importlib.reload(module)


# reload all symless code
def reload_plugin():
    # list all modules to reload, unordered
    to_reload = set()
    for k, mod in sys.modules.items():
# remove old symless modules from loaded modules
def remove_old_modules():
    to_remove = set()
    for k in sys.modules.keys():
        if k.startswith("symless"):
            to_reload.add(mod)
            to_remove.add(k)

    for mod in list(to_reload): # copy to alter
        reload_module(mod, to_reload)
    for r in to_remove:
        print(f"Removing old {r} ..")
        del sys.modules[r]
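
The change above swaps an ordered `importlib.reload` walk for simply dropping entries from `sys.modules`, so that the next import re-executes the source. The following standalone sketch shows that mechanism on a throwaway package. `demo_pkg` and `purge_modules` are hypothetical names used only for illustration, and bytecode writing is disabled here purely so the demo never hits a stale `.pyc`.

```python
import importlib
import pathlib
import sys
import tempfile


def purge_modules(prefix: str) -> list:
    """Drop every loaded module whose name starts with prefix; return their names."""
    removed = [name for name in list(sys.modules) if name.startswith(prefix)]
    for name in removed:
        del sys.modules[name]
    return removed


# Throwaway package standing in for the plugin code (hypothetical).
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "core.py").write_text("VERSION = 1\n")
sys.path.insert(0, str(root))
sys.dont_write_bytecode = True  # demo only: never cache compiled modules

first = importlib.import_module("demo_pkg.core").VERSION

# Edit the source on disk, forget the loaded modules, then import again.
(pkg / "core.py").write_text("VERSION = 22\n")
removed = purge_modules("demo_pkg")
importlib.invalidate_caches()
second = importlib.import_module("demo_pkg.core").VERSION

print(first, second)
```

Objects created before the purge still reference the old module objects, which is presumably why the extensions are terminated before the purge and rebound afterwards.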
40 changes: 24 additions & 16 deletions run_script.py
@@ -8,8 +8,8 @@
from typing import List, Optional, Tuple

# max & min supported majors
MIN_MAJOR = 7
MAX_MAJOR = 8
MIN_MAJOR = 8
MAX_MAJOR = 9


def stderr_print(line: str):
@@ -40,20 +40,23 @@ def find_ida_Linux() -> Optional[str]:
    # find in PATH
    if "PATH" in os.environ:
        for path in os.environ["PATH"].split(":"):
            if os.path.exists(os.path.join(path, "idat64")):
            if os.path.exists(os.path.join(path, "idat64")) or os.path.exists(os.path.join(path, "idat")):
                return path

    # find in default location
    for major in range(MAX_MAJOR, MIN_MAJOR - 1, -1):
        for minor in range(9, 0, -1):
            current = "%s/idapro-%d.%d" % (os.environ["HOME"], major, minor)
            if os.path.exists(current):
                return current
        for minor in range(9, -1, -1):
            p1 = "%s/idapro-%d.%d" % (os.environ["HOME"], major, minor)
            p2 = "%s/ida-pro-%d.%d" % (os.environ["HOME"], major, minor)
            if os.path.exists(p1):
                return p1
            if os.path.exists(p2):
                return p2
    return None


# find idat executables
def find_idat() -> Tuple[str, str]:
def find_idat() -> Tuple[Optional[str], str]:
    ida_dir = None

    # user defined IDA path
@@ -73,18 +76,23 @@ def find_idat() -> Tuple[str, str]:
    print(f'Using IDA installation: "{ida_dir}"')

    suffix = ".exe" if sys.platform == "win32" else ""
    ida32 = os.path.join(ida_dir, "idat" + suffix)
    ida64 = os.path.join(ida_dir, "idat64" + suffix)
    idat = os.path.join(ida_dir, "idat" + suffix)
    idat64 = os.path.join(ida_dir, "idat64" + suffix)

    if not os.path.isfile(ida32):
    if not (os.path.isfile(idat) or os.path.isfile(idat64)):
        stderr_print('Missing idat%s in "%s"' % (suffix, ida_dir))
        return None

    if not os.path.isfile(ida64):
        stderr_print('Missing idat64%s in "%s"' % (suffix, ida_dir))
        return None
    # earliest IDA 9 version - only idat64
    if not os.path.isfile(idat):
        return (None, idat64)

    # IDA 9 + - only idat
    if not os.path.isfile(idat64):
        return (None, idat)

    return (ida32, ida64)
    # IDA 8 or earlier
    return (idat, idat64)


# craft IDA batch command
@@ -119,7 +127,7 @@ def run_ida_batchmode(idat: str, filepath: str) -> int:


# Create .idb from 32 bits executable or .i64 from 64 bits exe
def make_idb(ida_install: tuple, filepath: str) -> Tuple[str, int]:
    if run_ida_batchmode(ida_install[0], filepath) == 0:
    if ida_install[0] and run_ida_batchmode(ida_install[0], filepath) == 0:
        return (f"{filepath}.idb", 0)

    # 32 bits analysis failed, try 64 bits mode
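
The default-location search in `find_ida_Linux` changes in two ways: the minor-version range now reaches `.0` (`range(9, -1, -1)` instead of `range(9, 0, -1)`), and an `ida-pro-M.m` directory name is tried alongside `idapro-M.m`. A minimal sketch of the resulting probe order follows. `candidate_dirs` and `first_existing` are hypothetical helpers written for illustration, with the major bounds taken from `MIN_MAJOR` and `MAX_MAJOR` in this diff.

```python
import os
from typing import Callable, Iterable, Iterator, Optional


def candidate_dirs(home: str, min_major: int = 8, max_major: int = 9) -> Iterator[str]:
    """Yield install directories to probe, newest version first."""
    for major in range(max_major, min_major - 1, -1):
        for minor in range(9, -1, -1):  # 9 down to 0 inclusive
            for prefix in ("idapro", "ida-pro"):
                yield "%s/%s-%d.%d" % (home, prefix, major, minor)


def first_existing(
    paths: Iterable[str], exists: Callable[[str], bool] = os.path.exists
) -> Optional[str]:
    """Return the first path for which exists() is true, or None."""
    return next((p for p in paths if exists(p)), None)


dirs = list(candidate_dirs("/home/user"))
print(dirs[0], dirs[-1])
```

Passing `exists` as a parameter keeps the probe order testable without touching the filesystem.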