Welcome! This repository contains a Python script for performing static malware analysis, developed as a project during my Bachelor of Science (BSc) in Cyber Security.
The goal of this project was to automate the initial triage process of analyzing potentially malicious files without actually running them. By extracting key characteristics and metadata, this tool helps an analyst quickly determine if a file warrants deeper investigation in a sandbox environment.
Static analysis is the process of examining a suspicious file without executing it. Think of it like reading the blueprint of a building to understand its design, instead of walking through the building itself. We look at the file's structure, strings, and other properties to find clues about its potential purpose and capabilities. Because nothing is run, this is a safe first step in malware analysis.
This script provides a suite of static analysis features to quickly gather intelligence on a file:
- 📄 File Information Extraction: Extracts basic but crucial information about the file, such as its type (e.g., executable, PDF, etc.) and size.
- 🧮 Hashing: Calculates the file's digital fingerprints using MD5, SHA-1, and SHA-256 hashes. These are well suited to checking the file's reputation against databases like VirusTotal.
- PE File Analysis: Dives deep into Portable Executable (PE) files, the standard file format for executables, DLLs, etc., on Windows. It can parse headers to reveal compilation timestamps, imported libraries, and function calls, which can hint at the malware's capabilities (e.g., networking, file manipulation).
- 📉 Entropy Analysis: Measures the randomness (entropy) of the file's content. A high entropy score can indicate that the file is packed or encrypted, a common technique used by malware authors to hide their malicious code.
- 🔍 String Analysis: Scans the file for readable text strings. This can often reveal valuable Indicators of Compromise (IoCs) like IP addresses, domain names, file paths, registry keys, or even hidden messages from the author!
- 📋 Report Generation: After the analysis is complete, the script generates a clean, easy-to-read report summarizing all the findings. This is useful for documentation and further investigation.
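To make the features above more concrete, here is a minimal sketch of how the hashing, entropy, string, and PE header checks could be written with only the Python standard library. It is an illustration, not the code from this repository: the function names (`file_hashes`, `shannon_entropy`, `ascii_strings`, `pe_coff_header`) and the default minimum string length of 4 are assumptions, and the PE part only reads the COFF header fields rather than imports or sections.

```python
import hashlib
import math
import re
import struct
from collections import Counter


def file_hashes(data: bytes) -> dict:
    """Return MD5, SHA-1 and SHA-256 hex digests of the raw bytes."""
    return {name: hashlib.new(name, data).hexdigest()
            for name in ("md5", "sha1", "sha256")}


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (constant) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((n / total) * math.log2(total / n)
               for n in Counter(data).values())


def ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def pe_coff_header(data: bytes):
    """Return basic COFF header fields, or None if data is not a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The DOS header stores the offset of the "PE\0\0" signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    if len(data) < pe_offset + 12:
        return None
    # COFF header starts right after the signature:
    # Machine (2 bytes), NumberOfSections (2 bytes), TimeDateStamp (4 bytes).
    machine, sections, timestamp = struct.unpack_from("<HHI", data, pe_offset + 4)
    return {"machine": machine, "sections": sections, "timestamp": timestamp}
```

Each function takes the file's bytes (for example from `open(path, "rb").read()`), so the sample is only ever read, never executed. Entropy close to 8.0 bits per byte is the kind of value that would suggest packing or encryption.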
Navigate directly to the key components of this repository to learn more.
- View the core analysis script that powers this project.
- Dive deeper into the project's objectives, scope, system design, and academic background.
- Explore the visual gallery of all diagrams and screenshots used in the documentation.
- See an example of the output generated by the script after analyzing a file.
To run the analysis, you would typically use a command like this from your terminal:
python3 malware_analyzer.py <path_to_suspicious_file>

The script will then perform all the analysis steps and print the final report to the console or save it to a file.
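As a rough illustration of that command-line flow, the sketch below shows one way an entry point could accept the file path and either print the report or save it. It is not the repository's actual code: the `-o/--output` option, the `render_report` helper, and the report layout are all assumptions made for the example, and the findings are limited to the file name and size.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line parser (the --output flag is hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="malware_analyzer.py",
        description="Static triage of a suspicious file (illustrative sketch).",
    )
    parser.add_argument("path", help="path to the suspicious file")
    parser.add_argument(
        "-o", "--output",
        help="write the report to this file instead of the console",
    )
    return parser


def render_report(findings: dict) -> str:
    """Format findings as a plain-text report, one 'key: value' per line."""
    title = "Static analysis report"
    lines = [title, "=" * len(title)]
    lines += [f"{key}: {value}" for key, value in findings.items()]
    return "\n".join(lines)


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    # The sample is opened in binary read mode only; it is never executed.
    with open(args.path, "rb") as handle:
        data = handle.read()
    findings = {"file": args.path, "size_bytes": len(data)}
    report = render_report(findings)
    if args.output:
        with open(args.output, "w", encoding="utf-8") as out:
            out.write(report + "\n")
    else:
        print(report)
    return 0

# When used as a script, call main() from an `if __name__ == "__main__":` guard.
```

Keeping `render_report` separate from `main` means the same text can go to the console or to a file without duplicating the formatting logic.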
This diagram illustrates the high-level workflow of the analysis script, from file input to report generation.
This was a minor project completed as part of my Cyber Security degree. It was a fantastic learning experience that allowed me to apply theoretical knowledge of malware analysis and Python programming to a practical, hands-on challenge.
While this project met its academic goals, there are many ways it could be expanded:
- Integrate with the VirusTotal API to automatically check file hashes.
- Add support for other file types like PDFs, Office documents, or Android APKs.
- Implement YARA rule scanning to detect known malware families.
- Develop a simple graphical user interface (GUI) for easier use.
This project was a collaborative effort, and I'd like to extend a huge thank you to my teammate Sourabh Pradhan, who contributed to its success during our academic studies.
This project is protected under an official copyright registered with the Government of India.
For full details, please see the COPYRIGHT.md file.
