Cisco-Talos/Xfiletrator

Xfiletrator: The Exfiltration Mapping Framework

A structured framework to document and analyze benign tools abused for data exfiltration. It highlights detection-relevant features, stealth techniques, and forensic artifacts to support threat hunting, detection engineering, and post-incident analysis.


Purpose

This project provides a centralized knowledge base of legitimate tools that have been used, or could be used, for data exfiltration in adversary operations.

While these tools are not inherently malicious, they are often repurposed by threat actors to exfiltrate sensitive information from compromised environments. The goal is to document these tools in a consistent format, making the data easily accessible for defensive use.


YAML Linting & Quality Checks

This project uses automated validation to ensure YAML files are properly formatted and contain all required fields.

Two-Layer Validation

  1. Generic YAML Linting (yamllint)

    • Enforces consistent YAML formatting
    • Catches syntax errors and common issues
    • Configuration in .yamllint
  2. Custom Schema Validation (scripts/validate_yml.py)

    • Validates required top-level fields (Name, Description, Category, etc.)
    • Ensures Forensics section contains all required fields
    • Validates against project-specific JSON schema
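The second layer can be pictured as a required-field check on each parsed entry. The following is a minimal sketch, not the project's actual scripts/validate_yml.py. It assumes the entry has already been loaded into a dict (for example with yaml.safe_load) and that Forensics is a mapping. The REQUIRED_TOP_LEVEL and REQUIRED_FORENSICS sets are illustrative guesses based on the field names mentioned above; the authoritative list lives in YML-Schema.yml.

```python
# Illustrative only: the real required fields are defined in YML-Schema.yml.
REQUIRED_TOP_LEVEL = {"Name", "Description", "Category", "Forensics"}  # assumed subset
REQUIRED_FORENSICS = {"binary-location", "command-line-flags"}  # hypothetical subset


def missing_fields(entry):
    """Return a sorted list of required fields absent from a parsed entry."""
    missing = [f for f in REQUIRED_TOP_LEVEL if f not in entry]
    forensics = entry.get("Forensics")
    if isinstance(forensics, dict):
        # Report nested fields with a "Forensics." prefix so the source is clear.
        missing += [f"Forensics.{f}" for f in REQUIRED_FORENSICS if f not in forensics]
    return sorted(missing)


# Made-up entry that lacks Category and one Forensics field.
entry = {
    "Name": "example-tool",
    "Description": "placeholder",
    "Forensics": {"binary-location": "placeholder"},
}
print(missing_fields(entry))  # ['Category', 'Forensics.command-line-flags']
```

An empty list means the entry passed this particular check. Schema validation with jsonschema would additionally cover value types and allowed values.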

Setup

Install Python dependencies:

pip install -r requirements.txt

Enable pre-commit hooks:

pre-commit install

Running Validation

Pre-commit (automatic): Both yamllint and custom validation run automatically before each commit.

Manual testing:

# Run all validation checks
./scripts/test-validation.sh

# Or run individually:
yamllint .
python scripts/validate_yml.py

GitHub Actions: Every push and pull request automatically runs both validation checks to ensure quality standards.

Configuration

  • yamllint rules: .yamllint (document-start, line-length, and comments spacing disabled)
  • Custom validation: scripts/validate_yml.py and YML-Schema.yml

Framework Structure

Each tool is documented in its own YAML file, located in the /yml/ directory. These entries capture:

  • Tool metadata (name, category, platform, execution method)
  • Capabilities relevant to exfiltration
  • Stealth techniques used to avoid detection
  • Forensic artifacts left on disk or in memory
  • Threat actor usage and external references
  • Tags to support filtering and grouping

This format is designed to support both human review and programmatic consumption (e.g., automation, detection generation or AI projects).
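As one illustration of programmatic consumption, the sketch below filters entries by tag. It is a rough example under stated assumptions: the entries are hard-coded dicts standing in for parsed files from /yml/, the "Name" and "Tags" keys are hypothetical field names, and the tag assignments are made up for demonstration rather than taken from the actual entries.

```python
# Hypothetical parsed entries; in practice these would come from loading
# each file in yml/ (for example with yaml.safe_load).
entries = [
    {"Name": "rclone", "Tags": ["third-party", "cli", "encrypted-transfer"]},
    {"Name": "powershell", "Tags": ["native", "cli"]},
    {"Name": "s3browser", "Tags": ["third-party", "gui"]},
]


def tools_with_tags(entries, *wanted):
    """Return the names of entries that carry every tag in `wanted`."""
    wanted = set(wanted)
    # Set inclusion: keep an entry only if all wanted tags are present.
    return [e["Name"] for e in entries if wanted <= set(e.get("Tags", []))]


print(tools_with_tags(entries, "third-party", "cli"))  # ['rclone']
```

The same pattern could feed a detection pipeline, for instance by selecting all entries tagged cli before generating command-line based rules.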


Repository Layout

exfiltration-framework/
├── yml/                      # One YAML file per tool, structured for parsing
│   ├── awscli.yml
│   ├── pscp.yml
│   ├── restic.yml
│   ├── rclone.yml
│   ├── dropboxapi.yml
│   ├── syncthing.yml
│   ├── curl.yml
│   ├── powershell.yml
│   ├── azcopy.yml
│   └── s3browser.yml
│
├── LICENSE                   # Apache License 2.0
├── README.md                 # Project overview and usage
└── CONTRIBUTING.md           # Contribution guidelines

Tags

Tools may include tags to help categorize them by behavior, usage, or context. Examples include:

By Origin:

  • native
  • third-party
  • cloud-based

By Execution Type:

  • cli
  • gui
  • api

By Detection-Relevant Behavior:

  • masquerading
  • encrypted-transfer
  • scheduled-task
  • background-execution

By Threat Context:

  • ransomware
  • apt
  • exfiltration-only

Contributing

Contributions are welcome. If you would like to propose a new tool, improve an existing entry, or suggest new fields or tags, please refer to the contributing guidelines (coming soon).


YAML Validation

To ensure consistency and avoid errors, all tool entries in the yml/ folder should be validated against the schema defined in YML-Schema.yml.

Install Requirements

Before running the validation script, install the required Python libraries:

pip install pyyaml jsonschema

Run the Validator

From the root of the repository, run:

python scripts/validate_yml.py

This will check all .yml files in the yml/ directory (excluding templates and meta files) and print the validation results.

A file is considered invalid if it is missing required fields, contains unsupported tags, or does not conform to the schema.
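The "unsupported tags" rule can be sketched as a set difference against the allowed vocabulary. This is a hedged illustration, not the project's validator: the allowed values are copied from the Allowed Tags and Categories list in this README, while the entry layout (lowercase keys holding lists of values) is an assumption made for the example.

```python
# Allowed values copied from the "Allowed Tags and Categories" list in this README.
ALLOWED = {
    "categories": {"native", "third-party", "cloud-based"},
    "platforms": {"windows", "linux", "macos"},
    "execution": {"cli", "gui"},
    "threat-actors": {"apt", "ransomware"},
}


def unsupported_values(entry):
    """Map each checked field to its values that are not in the allowed list."""
    problems = {}
    for field, allowed in ALLOWED.items():
        extra = sorted(set(entry.get(field, [])) - allowed)
        if extra:
            problems[field] = extra
    return problems


# Hypothetical entry layout with two values outside the allowed vocabulary.
entry = {
    "categories": ["third-party"],
    "platforms": ["windows", "solaris"],
    "execution": ["cli", "api"],
}
print(unsupported_values(entry))  # {'platforms': ['solaris'], 'execution': ['api']}
```

An empty dict indicates that every checked value is in the allowed vocabulary.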


License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.


Disclaimer

This framework is intended for educational and defensive purposes only. All tools listed are legitimate and not inherently malicious. Their inclusion is based on publicly documented misuse by threat actors.

Allowed Tags and Categories

# =========================
# CATEGORIES: General classification of the tool's origin or deployment model
# =========================
categories:
  - native              # Built into the OS (e.g., PowerShell, certutil)
  - third-party         # Tools developed independently from the OS (e.g., rclone, curl)
  - cloud-based         # Tools designed for use with cloud services or platforms (e.g., AWS CLI, Dropbox API)

# =========================
# PLATFORMS: Operating systems supported by the tool
# =========================
platforms:
  - windows             # Microsoft Windows
  - linux               # Linux distributions
  - macos               # Apple macOS

# =========================
# EXECUTION: How the tool is typically executed or interfaced with
# =========================
execution:
  - cli                 # Command-line interface
  - gui                 # Graphical user interface

# =========================
# CAPABILITIES: Functional features relevant to exfiltration
# =========================
capabilities:
  - file-sync              # Continuous synchronization between folders or systems (e.g., rclone, Syncthing).
  - cloud-sync             # Sync or upload to cloud storage platforms like S3, Azure Blob, Dropbox.
  - api-transfer           # Upload via official or custom APIs (e.g., Dropbox API, AWS SDK).
  - direct-to-cloud        # Exfiltrates data directly to cloud endpoints without local staging.
  - selective-upload       # Allows filtering or targeting specific file types or directories.
  - recursive-upload       # Recursively uploads entire folder trees.
  - credentialed-upload    # Requires or supports authenticated uploads (e.g., tokens, IAM keys).
  - anonymous-upload       # Supports unauthenticated uploads (e.g., pre-signed URLs or public buckets).
  - proxy-aware            # Can route traffic through proxies to mask destination.
  - silent-execution       # Executes without user interaction or visible output (used in scripts or automation).
  - portable-execution     # Runs from non-standard paths or without installation (portable binaries).
  - service-identity       # Supports managed identities or service principals (e.g., AzCopy with Azure roles).
  - user-agent-spoofing    # Can spoof or customize the user-agent string in HTTP requests.
  - endpoint-override      # Allows setting custom or attacker-controlled endpoints (e.g., `--endpoint-url`).
  - ftp-upload             # Can exfiltrate via FTP protocol to external servers.
  - header-exfiltration    # Exfiltrates data inside HTTP headers (e.g., `X-Data:`).
  - multipart-upload       # Simulates browser-style form uploads (e.g., using `curl -F`).

# =========================
# FORENSICS: Artifacts and indicators that may appear on a compromised system
# =========================
forensics:
  - binary-location           # Known install or execution paths, especially outside standard directories.
  - config-file-path          # Presence of tool-specific configuration or credential files.
  - command-line-flags        # Flags or arguments used by attackers to trigger upload, sync, or stealth behavior.
  - registry-entry            # Registry keys used to auto-launch or persist the tool (Windows only).
  - scheduled-task-created    # Use of task scheduler or cron to automate tool execution.
  - log-file-location         # Logs written by the tool that may reveal execution or errors.
  - network-indicator         # Domain or API patterns associated with this tool’s upload behavior.

# =========================
# THREAT ACTORS: Types of actors known to abuse the tool
# =========================
threat-actors:
  - apt              # Advanced Persistent Threat groups
  - ransomware       # Ransomware gangs or affiliates
