UnleashedLLM

Democratizing Access to Uncensored AI Language Models

A sophisticated command-line interface tool for downloading, managing, and interacting with powerful AI models completely offline.


Table of Contents

  • Overview
  • Key Features
  • Supported Models
  • System Requirements
  • Installation
  • Usage
  • Architecture
  • Configuration
  • API Reference
  • Performance Optimization
  • Troubleshooting
  • Development
  • Security & Privacy
  • Contributing
  • Changelog
  • License
  • Support

Overview

UnleashedLLM is a comprehensive Python application that provides seamless access to uncensored AI language models through an intuitive command-line interface. The tool enables users to download, manage, and interact with various AI models while ensuring complete privacy and offline functionality after initial setup.

Mission Statement

To democratize access to powerful AI language models by providing a user-friendly, privacy-focused platform that operates entirely offline without compromising on functionality or performance.


Key Features

Core Functionality

  • Advanced Model Management: Download, organize, and manage AI models from a curated registry
  • Interactive Chat Interface: Engage in real-time conversations with context-aware AI models
  • Single Inference Engine: Perform one-off text generation tasks with optimized parameters
  • Complete Offline Operation: Full functionality without internet dependency post-installation

Advanced Capabilities

  • Paginated Model Browser: Browse available models with detailed specifications and filtering
  • Real-time Progress Tracking: Download progress with speed indicators and ETA calculations
  • Comprehensive System Diagnostics: Automated system requirement validation and performance analysis
  • Status Dashboard: Monitor storage usage, model availability, and system health
  • Conversation History Management: Maintain context across extended chat sessions
  • Thread-safe Operations: Concurrent model operations with robust error handling

Technical Features

  • GGUF Format Support: Optimized for CPU inference with quantized models
  • Dynamic Memory Management: Adaptive memory allocation based on model requirements
  • Configurable Threading: Optimized CPU utilization with thread limiting
  • Automatic Dependency Resolution: Seamless installation of required packages (see the sketch after this list)
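
A minimal sketch of how such automatic dependency resolution is commonly implemented; this helper is illustrative and not the project's actual code:

# Illustrative auto-install helper (not part of the project's source)
import importlib
import subprocess
import sys

REQUIRED = {"llama_cpp": "llama-cpp-python"}  # import name -> PyPI name

def ensure_dependencies() -> None:
    """Import each required package, installing it via pip on failure."""
    for module, dist in REQUIRED.items():
        try:
            importlib.import_module(module)
        except ImportError:
            subprocess.check_call([sys.executable, "-m", "pip", "install", dist])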

Supported Models

Model Categories

| Category    | Size Range | Memory Requirement | Performance Level                   |
|-------------|------------|--------------------|-------------------------------------|
| Lightweight | 1-3 GB     | 4 GB RAM           | Fast inference, basic capabilities  |
| Medium      | 4-8 GB     | 8 GB RAM           | Balanced performance and quality    |
| Large       | 7-15 GB    | 16 GB RAM          | High quality, advanced reasoning    |
| Extra Large | 20 GB+     | 32 GB RAM          | Maximum capability, research-grade  |

Available Models

| Model                 | Size    | Context Length | Capabilities                      | Specialization            |
|-----------------------|---------|----------------|-----------------------------------|---------------------------|
| Phi-2 Uncensored      | 1.6 GB  | 2,048 tokens   | Text generation, Chat             | Lightweight conversations |
| Llama2 7B Uncensored  | 4.1 GB  | 4,096 tokens   | Text generation, Chat, Coding     | General purpose           |
| OpenChat 7B           | 4.2 GB  | 8,192 tokens   | Text generation, Chat, Reasoning  | Conversational AI         |
| CodeLlama 13B         | 7.3 GB  | 16,384 tokens  | Coding, Text generation           | Code generation           |
| Mixtral 8x7B          | 26.9 GB | 32,768 tokens  | Text generation, Chat, Reasoning  | Mixture of experts        |
| Llama2 70B Uncensored | 39.5 GB | 4,096 tokens   | Advanced reasoning, Complex tasks | Research applications     |

System Requirements

Minimum Requirements

  • Operating System: Linux (Nix-based environment)
  • Python Version: 3.11 or higher
  • Memory: 8GB RAM
  • Storage: 10GB available space (plus model storage)
  • CPU: Multi-core processor (4+ cores recommended)

Recommended Configuration

  • Memory: 16GB+ RAM for optimal performance
  • Storage: SSD with 100GB+ available space
  • CPU: 8+ core processor with high clock speed
  • Network: Stable internet connection for initial model downloads

Performance Benchmarks

  • Lightweight Models: 2-5 tokens/second on 4-core CPU
  • Medium Models: 1-3 tokens/second on 8-core CPU
  • Large Models: 0.5-1.5 tokens/second on high-end CPU

Installation

Quick Start

# Clone the repository
git clone <repository-url>
cd UnleashedLLM

# Run the application (dependencies auto-install)
python3 main.py

Manual Dependency Installation

# Install core dependencies
pip install llama-cpp-python

# Verify installation
python3 -c "import llama_cpp; print('Installation successful')"

Environment Setup

# Set environment variables (optional)
export LLAMA_CPP_LOG_LEVEL=ERROR
export OMP_NUM_THREADS=8

Usage

Command-Line Interface

Launch the main application:

python3 main.py

Menu Navigation

The application provides an intuitive menu system (a minimal loop sketch follows the list):

  1. Browse Model Library - Explore available models with detailed specifications
  2. Download Model - Select and download models with progress tracking
  3. Interactive Chat - Start conversational sessions with downloaded models
  4. Single Inference - Perform individual text generation tasks
  5. Manage Models - View, organize, and delete downloaded models
  6. System Diagnostics - Check system compatibility and performance metrics
  7. Status Dashboard - Monitor application status and resource usage
  8. Exit - Safely terminate the application
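
A minimal sketch of a menu loop like the one described above; the handler names and bodies are hypothetical placeholders:

# Hypothetical menu loop, for illustration only
MENU = {
    "1": ("Browse Model Library", lambda: print("browsing models...")),
    "2": ("Download Model", lambda: print("downloading...")),
    "8": ("Exit", None),
}

def run_menu() -> None:
    while True:
        for key, (label, _handler) in sorted(MENU.items()):
            print(f"{key}. {label}")
        choice = input("Select an option: ").strip()
        if choice not in MENU:
            print("Invalid selection, please try again.")
            continue
        label, handler = MENU[choice]
        if handler is None:  # the Exit entry
            break
        handler()

if __name__ == "__main__":
    run_menu()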

Advanced Usage Examples

Batch Model Download

# Download multiple models in sequence
python3 main.py --batch-download llama2-7b-uncensored,openchat-7b

Configuration Override

# Override default parameters
python3 main.py --max-threads 4 --context-length 2048

Automated Chat Session

# Start chat with specific model
python3 main.py --chat --model llama2-7b-uncensored

Architecture

Core Components

ModelRegistry

  • Centralized model metadata management
  • Version control and compatibility tracking
  • Dynamic model discovery and registration

ModelManager

  • File system operations and storage management
  • Download resumption and integrity verification
  • Model lifecycle management

ChatInterface

  • Context-aware conversation handling
  • Session persistence and history management
  • Response formatting and sanitization

LlamaCppManager

  • Direct integration with llama-cpp-python (see the sketch after this list)
  • Memory optimization and thread management
  • Performance monitoring and adjustment
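
In practice, this integration likely reduces to constructing a llama_cpp.Llama instance with the tuned parameters. A hedged sketch using llama-cpp-python's public API (the model path is an example filename from Storage Structure below, and the sampling values mirror the defaults listed under Configuration):

from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-7b-chat.Q4_K_M.gguf",  # example filename
    n_ctx=4096,      # context window for this model
    n_threads=8,     # matches the default max_threads
    n_batch=512,     # matches the default batch_size
    use_mmap=True,   # memory-mapped weights keep resident memory low
    verbose=False,
)

result = llm(
    "Explain GGUF quantization in one sentence.",
    max_tokens=400,
    temperature=0.8,
    top_p=0.95,
    top_k=40,
    repeat_penalty=1.1,
)
print(result["choices"][0]["text"])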

SystemDiagnostics

  • Real-time system monitoring
  • Performance benchmarking
  • Resource usage tracking

Design Patterns

  • Singleton Pattern: Ensures a single instance of critical managers (sketched after this list)
  • Factory Pattern: Dynamic model instantiation based on configuration
  • Observer Pattern: Event-driven status updates and notifications
  • Strategy Pattern: Pluggable inference backends and optimization strategies
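
As an example of the first item, a singleton can be enforced with a metaclass. This is a generic sketch of the pattern, not the project's implementation:

class SingletonMeta(type):
    """Metaclass that caches exactly one instance per class."""
    _instances: dict = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class ModelManager(metaclass=SingletonMeta):
    def __init__(self) -> None:
        self.models_dir = "models"

assert ModelManager() is ModelManager()  # always the same instance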

Data Flow

User Input → CLI Parser → Model Manager → Inference Engine → Response Formatter → User Output
     ↓              ↓              ↓              ↓                    ↑
System Check → Model Registry → Model Loading → Context Management → Result Processing

Configuration

Default Parameters

DEFAULT_CONFIG = {
    "max_threads": 8,
    "context_length": "auto",  # Dynamically set per model
    "batch_size": 512,
    "temperature": 0.8,
    "top_p": 0.95,
    "top_k": 40,
    "repeat_penalty": 1.1,
    "max_tokens": 400
}

Environment Variables

| Variable            | Description             | Default  |
|---------------------|-------------------------|----------|
| LLAMA_CPP_LOG_LEVEL | Logging verbosity       | ERROR    |
| OMP_NUM_THREADS     | OpenMP thread count     | 8        |
| MODEL_CACHE_DIR     | Model storage directory | ./models |
| MAX_CONTEXT_LENGTH  | Maximum context tokens  | 32768    |
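
A plausible way these variables could override the defaults at startup (a hedged sketch; the README does not show the project's actual loading code):

import os

# Start from DEFAULT_CONFIG above, then apply environment overrides.
config = {
    "max_threads": int(os.environ.get("OMP_NUM_THREADS", "8")),
    "model_cache_dir": os.environ.get("MODEL_CACHE_DIR", "./models"),
    "max_context_length": int(os.environ.get("MAX_CONTEXT_LENGTH", "32768")),
}
os.environ.setdefault("LLAMA_CPP_LOG_LEVEL", "ERROR")  # quiet native logs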

Storage Structure

UnleashedLLM/
├── main.py                 # Main application entry point
├── models/                 # Downloaded model storage
│   ├── llama-2-7b-chat.Q4_K_M.gguf
│   ├── openchat-3.5-0106.Q4_K_M.gguf
│   └── ...
├── cache/                  # Runtime cache and temporary files
├── logs/                   # Application logs and diagnostics
├── config/                 # User configuration files
└── README.md              # Documentation

API Reference

Core Classes

ModelRegistry

from typing import Dict, List

class ModelRegistry:
    @staticmethod
    def get_model_info(model_id: str) -> Dict: ...

    @staticmethod
    def list_models_by_category(category: str) -> List[Dict]: ...

    @staticmethod
    def validate_model(model_data: Dict) -> bool: ...

ModelManager

from pathlib import Path
from typing import List

class ModelManager:
    def download_model(self, model_id: str, progress_callback=None) -> bool: ...
    def list_downloaded_models(self) -> List[str]: ...
    def delete_model(self, model_id: str) -> bool: ...
    def get_model_path(self, model_id: str) -> Path: ...

ChatInterface

from typing import Dict, List

class ChatInterface:
    def start_chat(self, model_id: str) -> None: ...
    def send_message(self, message: str) -> str: ...
    def end_chat(self) -> None: ...
    def get_chat_history(self) -> List[Dict]: ...
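
Assuming the signatures above, a typical session might look like the following hypothetical usage example:

# Hypothetical usage of the documented API
manager = ModelManager()
if "llama2-7b-uncensored" not in manager.list_downloaded_models():
    manager.download_model(
        "llama2-7b-uncensored",
        # callback signature is assumed; the README does not document it
        progress_callback=lambda pct: print(f"{pct:.0f}% downloaded"),
    )

chat = ChatInterface()
chat.start_chat("llama2-7b-uncensored")
print(chat.send_message("Summarize the GGUF format in two sentences."))
chat.end_chat()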

Performance Optimization

CPU Optimization

  • Thread count limited to prevent CPU overload
  • Automatic CPU core detection and utilization
  • Dynamic batch size adjustment based on system performance

Memory Management

  • Lazy model loading to minimize memory footprint (sketched after this list)
  • Automatic garbage collection for conversation history
  • Memory-mapped file access for large models
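
Lazy loading can be as simple as deferring model construction until the first generation request; a sketch under the assumption that llama-cpp-python is the backend:

from llama_cpp import Llama

class LazyModel:
    """Defer loading model weights until the first generate() call."""

    def __init__(self, model_path: str, **llama_kwargs):
        self._path = model_path
        self._kwargs = llama_kwargs
        self._llm = None  # nothing loaded yet

    def generate(self, prompt: str, max_tokens: int = 400) -> str:
        if self._llm is None:  # load weights on first use only
            self._llm = Llama(model_path=self._path, **self._kwargs)
        out = self._llm(prompt, max_tokens=max_tokens)
        return out["choices"][0]["text"]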

I/O Optimization

  • Streaming downloads with resume capability (see the sketch after this list)
  • Asynchronous file operations
  • Compressed model storage when possible
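
A resumable streaming download is typically built on an HTTP Range header. A hedged sketch (the function name is illustrative; the project's actual downloader is not shown in this README):

import os
import requests

def download_with_resume(url: str, dest: str, chunk_size: int = 1 << 20) -> None:
    """Stream url to dest, resuming from any partial file already on disk."""
    pos = os.path.getsize(dest) if os.path.exists(dest) else 0
    headers = {"Range": f"bytes={pos}-"} if pos else {}
    with requests.get(url, headers=headers, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        # 206 means the server honored the Range request; otherwise restart.
        mode = "ab" if pos and resp.status_code == 206 else "wb"
        with open(dest, mode) as f:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                f.write(chunk)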

Benchmarking Results

| Model Size | RAM Usage | CPU Utilization | Tokens/Second |
|------------|-----------|-----------------|---------------|
| 1.6 GB     | 2.1 GB    | 45%             | 4.2           |
| 4.1 GB     | 5.8 GB    | 65%             | 2.8           |
| 7.3 GB     | 9.2 GB    | 78%             | 1.9           |
| 26.9 GB    | 31.4 GB   | 85%             | 0.8           |

Troubleshooting

Common Issues

Installation Problems

Issue: llama-cpp-python compilation fails

# Solution: Install build dependencies
sudo apt update
sudo apt install build-essential cmake
pip install --upgrade pip setuptools wheel

Memory Errors

Issue: Out of memory when loading large models

  • Reduce context length in configuration
  • Close other memory-intensive applications
  • Consider using smaller model variants

Performance Issues

Issue: Slow inference speed

  • Adjust thread count based on CPU cores
  • Ensure sufficient RAM is available
  • Check for background processes consuming resources

Download Failures

Issue: Network timeouts during model download

  • Check internet connection stability
  • Verify sufficient disk space
  • Resume download using the same model selection

Diagnostic Commands

# Check system resources
python3 main.py --diagnostics

# Verify model integrity
python3 main.py --verify-models

# Performance benchmark
python3 main.py --benchmark

Log Analysis

Application logs are stored in the logs/ directory:

  • application.log: General application events
  • performance.log: Performance metrics and benchmarks
  • errors.log: Error messages and stack traces

Development

Project Structure

src/
├── core/
│   ├── model_registry.py
│   ├── model_manager.py
│   ├── chat_interface.py
│   └── diagnostics.py
├── utils/
│   ├── file_operations.py
│   ├── network_utils.py
│   └── formatting.py
├── config/
│   ├── default_config.py
│   └── model_definitions.py
└── tests/
    ├── unit/
    ├── integration/
    └── performance/

Development Setup

# Create development environment
python3 -m venv dev-env
source dev-env/bin/activate

# Install development dependencies
pip install pytest black flake8 mypy

# Run tests
pytest tests/

# Code formatting
black src/
flake8 src/

Contributing Guidelines

  1. Code Style: Follow PEP 8 guidelines with Black formatting
  2. Testing: Maintain >90% test coverage for new features
  3. Documentation: Update README and inline documentation
  4. Performance: Benchmark new features for regression testing
  5. Security: Review code for potential vulnerabilities

Extension Points

Custom Model Backends

class CustomInferenceEngine:
    def load_model(self, model_path: str) -> bool: ...
    def generate(self, prompt: str, **kwargs) -> str: ...
    def unload_model(self) -> None: ...
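
For instance, a backend built on llama-cpp-python could satisfy this interface as follows (an illustrative implementation, not shipped with the project):

from llama_cpp import Llama

class LlamaCppBackend:
    """CustomInferenceEngine implemented on top of llama-cpp-python."""

    def __init__(self) -> None:
        self._llm = None

    def load_model(self, model_path: str) -> bool:
        try:
            self._llm = Llama(model_path=model_path, verbose=False)
            return True
        except Exception:
            return False

    def generate(self, prompt: str, **kwargs) -> str:
        out = self._llm(prompt, **kwargs)
        return out["choices"][0]["text"]

    def unload_model(self) -> None:
        self._llm = None  # drop the reference so the weights can be freed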

Plugin Architecture

from typing import Dict

class UnleashedLLMPlugin:
    def initialize(self, app_context: Dict) -> None: ...
    def process_input(self, user_input: str) -> str: ...
    def cleanup(self) -> None: ...

Security & Privacy

Privacy Features

  • Complete Offline Operation: No data transmission after model download
  • Local Storage: All models and conversations stored locally
  • No Telemetry: Zero usage data collection or external communications
  • Secure Model Verification: Checksum validation for downloaded files (sketched below)
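
Checksum validation of a downloaded file typically looks like this sketch; the expected digest would come from the model registry, and the helper name is illustrative:

import hashlib

def verify_sha256(path: str, expected_hex: str) -> bool:
    """Hash the file in 1 MiB chunks and compare against the published digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex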

Security Considerations

  • Input Sanitization: All user inputs are sanitized before processing
  • File System Isolation: Models stored in dedicated directories with restricted permissions
  • Memory Protection: Sensitive data cleared from memory after use
  • Audit Trail: Comprehensive logging of all system operations

Best Practices

  • Regularly update the application for security patches
  • Monitor system resources for unusual activity
  • Validate model sources before downloading
  • Use appropriate file permissions for model storage

Contributing

How to Contribute

  1. Fork the Repository

    git clone <your-fork-url>
    cd UnleashedLLM
  2. Create Feature Branch

    git checkout -b feature/your-feature-name
  3. Make Changes

    • Follow coding standards
    • Add comprehensive tests
    • Update documentation
  4. Submit Pull Request

    • Provide clear description of changes
    • Reference related issues
    • Ensure CI passes

Development Workflow

  • Issues: Use GitHub issues for bug reports and feature requests
  • Code Review: All changes require peer review
  • Testing: Automated testing with CI/CD pipeline
  • Documentation: Keep documentation in sync with code changes

Changelog

Version 1.1.0 (2025)

  • Enhanced model registry with 30+ models
  • Improved download resumption capabilities
  • Added system diagnostics and performance monitoring
  • Optimized memory management for large models
  • Enhanced chat interface with better context handling

Version 1.0.0 (2024)

  • Initial release with core functionality
  • Basic model download and chat capabilities
  • Command-line interface implementation
  • Support for GGUF model format

License

This project is developed for educational and research purposes. Users are responsible for compliance with applicable AI model licenses and terms of use.

Third-Party Licenses

  • llama-cpp-python: MIT License
  • Model Files: Various licenses (see individual model documentation)

Support

Getting Help

  • Documentation: Comprehensive README and inline documentation
  • Issues: GitHub Issues for bug reports and feature requests
  • Community: Discussion forums and community support

Contact Information

  • Developer: 0x0806
  • Project Repository: https://github.com/0x0806/UnleashedLLM
  • Issue Tracker: https://github.com/0x0806/UnleashedLLM/issues

UnleashedLLM v1.1.0 | Python 3.11+ | Linux/Nix Environment

Last Updated: 2025
