A LangGraph-based chatbot that can run locally with Ollama or on RunPod serverless infrastructure.
- Local Ollama Support: Run with local Ollama installation
- RunPod Serverless: Deploy Ollama in Docker containers on RunPod for serverless inference
- LangGraph Integration: Complex conversation flows with thoughts, principles, and personality (see the sketch after this list)
- Web Interface: Flask-based chat interface
- Stripe Integration: Payment processing capabilities
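As a rough illustration of how a LangGraph conversation flow can be wired together, here is a minimal sketch. It is not the repository's actual graph; the node names (`think`, `respond`) and the state shape are assumptions chosen only to show the pattern of chaining an internal "thought" step before the reply:

```python
from typing import List, TypedDict

from langgraph.graph import END, StateGraph


class ChatState(TypedDict):
    # Hypothetical state; the real bot likely carries more fields
    # (principles, personality, etc.).
    messages: List[str]
    thought: str


def think(state: ChatState) -> dict:
    # Produce an internal "thought" about the latest user message.
    latest = state["messages"][-1]
    return {"thought": f"The user said: {latest!r}"}


def respond(state: ChatState) -> dict:
    # Turn the thought into a visible reply (an LLM call in the real bot).
    reply = f"(after thinking: {state['thought']}) Hello!"
    return {"messages": state["messages"] + [reply]}


graph = StateGraph(ChatState)
graph.add_node("think", think)
graph.add_node("respond", respond)
graph.set_entry_point("think")
graph.add_edge("think", "respond")
graph.add_edge("respond", END)
app = graph.compile()

result = app.invoke({"messages": ["Hello!"], "thought": ""})
print(result["messages"][-1])
```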
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Start Ollama locally and pull the model (you can verify the server is reachable with the snippet after these steps):

  ```bash
  ollama serve
  ollama pull dolphin-mistral-nemo:latest
  ```

- Run the web interface:

  ```bash
  python web_chat.py
  ```
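To confirm the local Ollama server is reachable before opening the chat, you can query its built-in HTTP API (Ollama listens on http://localhost:11434 by default). This is a generic check rather than a script shipped with the repo:

```python
import requests

# Ask the local Ollama server which models it has pulled.
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Available models:", models)
```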
```bash
# Build the Ollama serverless image
docker build -f Dockerfile.ollama -t yourusername/runpod-ollama:latest .

# Push to Docker Hub
docker push yourusername/runpod-ollama:latest
```

- Go to RunPod Console → Serverless
- Click New Endpoint
- Use your Docker image: yourusername/runpod-ollama:latest
- Configure GPU settings based on your model size
- Set timeout to 600 seconds (10 minutes)
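Once the endpoint is deployed, you can exercise it directly with RunPod's serverless HTTP API before wiring it into the chatbot. The /runsync route and Bearer authentication are standard RunPod conventions; the payload shape (model and prompt fields) is an assumption about what ollama_handler.py expects, so adjust it to match your handler:

```python
import requests

ENDPOINT_URL = "https://api.runpod.ai/v2/YOUR_ENDPOINT_ID/runsync"
API_KEY = "your_runpod_api_key"

# Assumed input schema; match it to whatever ollama_handler.py reads.
payload = {
    "input": {
        "model": "dolphin-mistral-nemo:latest",
        "prompt": "Hello!",
    }
}

resp = requests.post(
    ENDPOINT_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=600,
)
resp.raise_for_status()
print(resp.json())
```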
```python
from chatbot_component import ChatBot, ChatBotConfig

cfg = ChatBotConfig(
    provider="runpod_ollama",
    runpod_endpoint="https://api.runpod.ai/v2/YOUR_ENDPOINT_ID",
    runpod_api_key="your_runpod_api_key",
    model_name="dolphin-mistral-nemo:latest"
)
bot = ChatBot(cfg)
response = bot.get_simple_response("Hello!")
```

Project structure:

- chatbot_component.py - Main chatbot logic with LangGraph
- web_chat.py - Flask web interface
- Dockerfile.ollama - Docker image for RunPod serverless
- ollama_handler.py - RunPod serverless handler (sketched below)
- runpod_ollama_llm.py - Client wrapper for RunPod Ollama
- build_and_deploy.md - Detailed deployment instructions
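For orientation, here is a minimal sketch of what a RunPod serverless handler for Ollama can look like. It is not the repository's actual ollama_handler.py; the input fields, the use of Ollama's /api/generate route, and the default model are assumptions, while runpod.serverless.start is the standard entry point of RunPod's Python SDK:

```python
import requests
import runpod

OLLAMA_URL = "http://localhost:11434"  # Ollama running inside the container


def handler(event):
    # Assumed input schema: {"model": "...", "prompt": "..."}.
    job_input = event.get("input", {})
    model = job_input.get("model", "dolphin-mistral-nemo:latest")
    prompt = job_input.get("prompt", "")

    # Forward the request to the local Ollama server and return its reply.
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=600,
    )
    resp.raise_for_status()
    return {"response": resp.json().get("response", "")}


runpod.serverless.start({"handler": handler})
```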
The chatbot supports three providers, selected via the provider field of ChatBotConfig (see the sketch after this list):
- "ollama" - Local Ollama installation
- "runpod" - RunPod vLLM endpoints (basic)
- "runpod_ollama" - RunPod Ollama serverless (recommended)
Benefits of the runpod_ollama provider:

- ✅ Identical behavior to local Ollama
- ✅ Serverless scaling - pay only when used
- ✅ Any Ollama model - supports full model library
- ✅ Reliable inference - proven Ollama engine
- ✅ Auto-scaling - handles multiple requests
MIT License