A specialized Bittensor subnet miner (subnet 231 on testnet) for high-performance FLUX.1-dev inference with TensorRT acceleration and LoRA refitting.
You will need access to an NVIDIA H100 PCIe GPU with the following configuration:
NVIDIA-SMI 570.172.08 Driver Version: 570.172.08 CUDA Version: 12.8
For managing the Python installation, we recommend uv; this is the only officially supported configuration.
Before running this miner, you must:

- Set up a HuggingFace account and token:
  - Create an account at HuggingFace
  - Generate an access token at HuggingFace Settings
  - The token must have write permissions to create and upload models

- Accept the FLUX.1-dev model license:
  - Visit the FLUX.1-dev model page
  - Read and accept the license agreement
  - Without accepting the license, the model download will fail
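Before the first launch, you can optionally confirm that the token works and that the license has been accepted. A minimal sketch using the huggingface_hub client (the model ID matches the MODEL_PATH used later in this README; the error handling shown is illustrative):

# check_hf_access.py - optional pre-flight check (illustrative sketch)
import os
from huggingface_hub import whoami, model_info
from huggingface_hub.utils import GatedRepoError, HfHubHTTPError

token = os.environ["HF_TOKEN"]

# Confirm the token itself is valid.
print("Authenticated as:", whoami(token=token)["name"])

# Confirm the gated FLUX.1-dev repo is accessible (i.e. the license was accepted).
try:
    model_info("black-forest-labs/FLUX.1-dev", token=token)
    print("FLUX.1-dev is accessible - license accepted.")
except GatedRepoError:
    print("License not accepted yet - visit the FLUX.1-dev model page and accept it.")
except HfHubHTTPError as err:
    print("Could not verify model access:", err)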
The miner can run in two modes:
- Inference Mode: Serves pre-trained models with TensorRT acceleration
- Kontext Editing Mode: Enables deterministic image editing with FLUX.1-Kontext-dev
# 1. Clone the repository
git clone https://github.com/dippy-ai/dippy-studio-bittensor-miner
cd dippy-studio-bittensor-miner
# 2. Create .env file with your credentials
cp .env.example .env
# Edit .env and add your HF_TOKEN
# 3. Choose your deployment mode:
# For INFERENCE only (requires TRT engines)
make setup-inference
# 4. Check logs
make logs

The miner server will be available at http://localhost:8091.
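Once the container is up, a quick readiness check against the /health endpoint (documented below) confirms the server is serving requests. A minimal sketch using the Python requests library:

# health_check.py - quick readiness probe (sketch)
import requests

resp = requests.get("http://localhost:8091/health", timeout=10)
print(resp.status_code, resp.text)
resp.raise_for_status()  # non-2xx means the miner is not ready yet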
The miner now supports deterministic image editing using FLUX.1-Kontext-dev.
# Quick setup
make setup-kontext
# Or manually
echo "ENABLE_KONTEXT_EDIT=true" >> .env
make restart

# Edit an image
curl -X POST http://localhost:8091/edit \
-H "Content-Type: application/json" \
-d '{
"prompt": "Add a red hat to the person",
"image_b64": "base64_encoded_image_data",
"seed": 42,
"guidance_scale": 2.5,
"num_inference_steps": 28
}'
# Check status
curl http://localhost:8091/edit/status/{job_id}
# Download result
curl http://localhost:8091/edit/result/{job_id} -o edited.png

# Run E2E tests (local miner)
export ENABLE_KONTEXT_EDIT=true
pytest tests/e2e/kontext_determinism/ -v -m e2e
# Or use make command
make test-kontext-determinism

For Docker-based testing, see tests/e2e/kontext_determinism/README.md for network configuration.
See docs/kontext-editing.md for full API documentation including async callbacks.
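For scripted use, the same edit flow can be driven from Python: base64-encode the source image, submit the job, poll its status, then download the result. This is an illustrative sketch only; the job_id and status field names are assumptions about the response schema, not guaranteed by this README.

# edit_image.py - scripted /edit round trip (sketch; response field names are assumed)
import base64
import time

import requests

BASE = "http://localhost:8091"

with open("input.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "prompt": "Add a red hat to the person",
    "image_b64": image_b64,
    "seed": 42,                  # fixed seed for deterministic editing
    "guidance_scale": 2.5,
    "num_inference_steps": 28,
}

job = requests.post(f"{BASE}/edit", json=payload, timeout=60).json()
job_id = job["job_id"]  # assumed field name

# Poll until the job finishes, then save the edited image.
while True:
    status = requests.get(f"{BASE}/edit/status/{job_id}", timeout=10).json()
    if status.get("status") in ("completed", "failed"):  # assumed status values
        break
    time.sleep(2)

result = requests.get(f"{BASE}/edit/result/{job_id}", timeout=60)
result.raise_for_status()
with open("edited.png", "wb") as f:
    f.write(result.content)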
# Deployment Modes
make setup-inference # Deploy inference-only server (auto-builds TRT if needed)
make setup-kontext # Deploy with FLUX.1-Kontext-dev editing enabled
# Building & Management
make build # Build Docker images
make trt-build # Build TRT engine (20-30 min)
make trt-rebuild # Force rebuild TRT engine
make up # Start miner service
make down # Stop miner service
make logs # Follow miner logs
make restart # Restart miner service
# Testing
make test-kontext-determinism # Run Kontext determinism E2E tests
make test-kontext-unit # Run Kontext unit tests
# Maintenance
make clean-cache # Remove all cached TRT engines
make help # Show all available commands

The system consists of three main components:
The reverse proxy handles Bittensor authentication and routes requests to internal services.
Setup:
- Register on testnet: btcli s register --netuid 231 --subtensor.network test
- Transfer 0.01 testnet TAO to 5FU2csPXS5CZfMVd2Ahdis9DNaYmpTCX4rsN11UW7ghdx24A for the mining permit
- Configure environment variables in reverse_proxy/.env
- Install and run:
  cd reverse_proxy
  uv pip install -e .[dev]
  python server.py
A FastAPI server (miner_server.py) that can run inference
Features:
- Inference Mode: TensorRT-accelerated image generation with LoRA support and automatic engine preloading
- Static file serving: Direct image URL access
Endpoints:
- POST /inference - Generate image (with optional LoRA and callback support)
- POST /edit - Edit image with FLUX.1-Kontext-dev (with callback support)
- GET /inference/status/{job_id} - Check inference status
- GET /edit/status/{job_id} - Check edit job status
- GET /inference/result/{job_id} - Download generated image
- GET /edit/result/{job_id} - Download edited image
- GET /health - Health check
High-performance inference engine for FLUX.1-dev model.
Building the Engine:
# Using make (recommended)
make trt-build # Build if not exists
make trt-rebuild # Force rebuild
# Or Docker directly
docker compose run --rm trt-builder

Follow docs/async_inference_e2e.md for a walkthrough of the new async inference test harness, including how to launch the reference callback server and run checks via pytest or python -m tests.e2e.async_inference.cli run.
Quick start (host execution)
pip install -r requirements_test_async.txt
export ASYNC_MINER_URL=http://localhost:8091
export ASYNC_CALLBACK_BASE_URL=http://127.0.0.1:8092
pytest tests/e2e/async_inference/test_async_inference.py -s

Running against a miner in Docker
The container cannot reach 127.0.0.1 on the host. Use an address the container can see (for example the docker0 gateway on Linux):
pip install -r requirements_test_async.txt
export ASYNC_MINER_URL=http://localhost:8091
export ASYNC_CALLBACK_BASE_URL=http://172.17.0.1:8092
export ASYNC_CALLBACK_BIND_HOST=0.0.0.0 # ensure the mock callback server listens externally
pytest tests/e2e/async_inference/test_async_inference.py -s

Run ip addr show docker0 and use the inet value (typically 172.17.0.1) for ASYNC_CALLBACK_BASE_URL.
Expose the same ASYNC_CALLBACK_BASE_URL (and optionally ASYNC_CALLBACK_BIND_HOST) to the miner service via your .env or compose file, then restart the container so it picks up the new settings.
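If you just want to see what the miner posts back without running the full pytest harness, a throwaway callback receiver is enough. The sketch below logs whatever JSON arrives and returns 200; it makes no assumptions about the callback payload shape, and the bind host and port mirror the ASYNC_CALLBACK_* settings above.

# callback_receiver.py - minimal stand-in for the mock callback server (sketch)
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

HOST = os.environ.get("ASYNC_CALLBACK_BIND_HOST", "0.0.0.0")
PORT = 8092  # matches the port used in ASYNC_CALLBACK_BASE_URL above

class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print("callback received:", json.dumps(json.loads(body), indent=2))
        except json.JSONDecodeError:
            print("callback received (non-JSON):", body[:200])
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer((HOST, PORT), CallbackHandler).serve_forever()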
Create a .env file in the project root:
# Required
HF_TOKEN=your_huggingface_token_here # HuggingFace token with write permissions
# Mode Configuration (set based on deployment choice)
ENABLE_INFERENCE=true
MODEL_PATH=black-forest-labs/FLUX.1-dev # Base model path
OUTPUT_DIR=/app/output # Output directory in container (mapped to ./output on host)
MINER_SERVER_PORT=8091 # Server port
MINER_SERVER_HOST=0.0.0.0 # Server host
SERVICE_URL=http://localhost:8091 # Public URL for image serving

For the reverse proxy, create reverse_proxy/.env:
# Required
MINER_HOTKEY=your_miner_hotkey_here # Bittensor miner hotkey
# Service endpoints (internal)
INFERENCE_SERVER_URL=http://localhost:8091 # Miner server for inference
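Both files are consumed as ordinary environment variables inside the containers. As a rough sketch of how the miner settings above might be read at startup (illustrative only, not the actual miner_server.py code):

# config sketch - reading the miner settings from the environment (illustrative)
import os

HF_TOKEN = os.environ["HF_TOKEN"]  # required; fail fast if it is missing
ENABLE_INFERENCE = os.environ.get("ENABLE_INFERENCE", "true").lower() == "true"
MODEL_PATH = os.environ.get("MODEL_PATH", "black-forest-labs/FLUX.1-dev")
OUTPUT_DIR = os.environ.get("OUTPUT_DIR", "/app/output")
MINER_SERVER_PORT = int(os.environ.get("MINER_SERVER_PORT", "8091"))
MINER_SERVER_HOST = os.environ.get("MINER_SERVER_HOST", "0.0.0.0")
SERVICE_URL = os.environ.get("SERVICE_URL", f"http://localhost:{MINER_SERVER_PORT}")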
The miner provides two main services:

- Receives generation requests via POST to /inference:
  - prompt: Text description for image generation
  - lora_path: Optional path to LoRA weights
  - width/height: Image dimensions
  - num_inference_steps: Quality control
  - guidance_scale: Prompt adherence strength
  - seed: For reproducibility

- Generates images using TensorRT:
  - Uses pre-built TRT engine for fast inference
  - Supports dynamic LoRA switching via refitting
  - Returns image URL immediately after generation

- Serves generated images via static file server:
  - Images accessible at /images/{job_id}.png
  - Direct URL access for validator retrieval
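The same round trip can be scripted end to end; the sketch below submits a request with these parameters, waits for completion, and fetches the image from the static URL. As with the edit example earlier, the job_id and status field names are assumptions about the response schema. The curl equivalents follow.

# generate.py - scripted /inference round trip (sketch; response field names are assumed)
import time

import requests

BASE = "http://localhost:8091"

payload = {
    "prompt": "A beautiful sunset over mountains",
    "lora_path": "/app/models/anime_lora.safetensors",  # optional; omit to use the base model
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 28,
    "guidance_scale": 7.5,
    "seed": 42,
}

job_id = requests.post(f"{BASE}/inference", json=payload, timeout=60).json()["job_id"]  # assumed field

while True:
    status = requests.get(f"{BASE}/inference/status/{job_id}", timeout=10).json()
    if status.get("status") in ("completed", "failed"):  # assumed status values
        break
    time.sleep(2)

# The result endpoint and the static /images URL both serve the finished image.
img = requests.get(f"{BASE}/images/{job_id}.png", timeout=60)
img.raise_for_status()
with open("generated.png", "wb") as f:
    f.write(img.content)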
curl -X POST http://localhost:8091/inference \
-H "Content-Type: application/json" \
-d '{
"prompt": "A beautiful sunset over mountains",
"width": 1024,
"height": 1024,
"num_inference_steps": 28,
"guidance_scale": 7.5,
"seed": 42
}'

curl -X POST http://localhost:8091/inference \
-H "Content-Type: application/json" \
-d '{
"prompt": "A portrait in anime style",
"lora_path": "/app/models/anime_lora.safetensors",
"width": 1024,
"height": 1024
}'

# Inference status
curl http://localhost:8091/inference/status/{job_id}

- H100 PCIe with the following specific nvidia-smi configuration; see below for reference:
NVIDIA-SMI 570.172.08 Driver Version: 570.172.08 CUDA Version: 12.8
- GPU: NVIDIA H100 PCIe (80GB VRAM; supports both FLUX.1-dev + Kontext simultaneously)
- CUDA: Version 12.8 specifically
- RAM: 32GB minimum
- Storage: 100GB+ for model weights (FLUX.1-dev + Kontext models)
- Docker: Latest version with nvidia-container-toolkit
- API logs: Check docker compose logs -f
- GPU usage: Use nvidia-smi to monitor GPU utilization
If inference fails with TRT errors:
- Rebuild the engine: docker compose --profile build up trt-builder --force-recreate
- Check GPU compatibility (requires compute capability 7.0+)
- Ensure sufficient GPU memory (24GB+ recommended)
If LoRA weights don't apply correctly:
- Verify LoRA was trained for FLUX.1-dev (not SDXL or other models)
- Check file path is accessible within container
- Ensure LoRA file is in .safetensors format
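One quick way to rule out a corrupt or mismatched file is to open it with the safetensors library and look at its tensor names. Key naming varies by training tool, so treat this as a heuristic sketch rather than a definitive check:

# inspect_lora.py - heuristic sanity check for a LoRA file (sketch)
from safetensors import safe_open

path = "/app/models/anime_lora.safetensors"  # same path you pass as lora_path

with safe_open(path, framework="pt") as f:
    keys = list(f.keys())

print(f"{len(keys)} tensors in {path}")
print("sample keys:", keys[:5])

# FLUX.1-dev LoRAs generally target transformer blocks; SDXL LoRAs reference the unet.
if any("unet" in k for k in keys):
    print("warning: keys mention 'unet' - this may be an SDXL LoRA, not FLUX.1-dev")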
If you encounter permission errors downloading FLUX.1-dev:
- Ensure you've accepted the model license on HuggingFace
- Verify your HF_TOKEN is correctly set
- Check that your token has read permissions
If container fails to start:
- Check nvidia-docker is installed: docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
- Verify Docker has GPU access
- Check port 8091 is not in use: lsof -i :8091
If the miner can't connect to validators:
- Check firewall settings for port 8091
- Ensure Docker networking is properly configured
- Verify validator endpoints are accessible
- Check SERVICE_URL environment variable for production
For issues and questions:
- Check existing issues in the repository
- Join the Bittensor Discord community
- Review validator documentation for integration details