This repository showcases the deployment architecture and infrastructure setup for a serverless AI-powered chat application. It leverages AWS services, Amazon Bedrock, and OpenAI LLMs to deliver intelligent conversational experiences.
The design prioritizes scalability, low latency, and minimal operational overhead through a fully managed, cloud-native approach.
The architecture consists of the following components:
- Frontend: a static web application hosted in Amazon S3 and distributed globally through Amazon CloudFront for high performance, caching, and SSL-secured access.
- API layer: the entry point for all client requests; manages routing, authentication, and rate limiting for backend services.
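The rate limiting enforced at the API layer is typically configured in the managed service rather than hand-written, but its core behavior can be sketched as a token bucket. Everything below is an illustrative stand-in, not code from this repository:

```python
import time


class TokenBucket:
    """Illustrative token-bucket rate limiter, similar in spirit to the
    per-client throttling an API layer applies before forwarding requests
    to backend services."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# With no refill, only the burst capacity is available.
bucket = TokenBucket(capacity=2, refill_per_sec=0.0)
results = [bucket.allow() for _ in range(3)]  # [True, True, False]
```

The third request is rejected once the burst budget is exhausted; a refill rate greater than zero would restore tokens over time.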
- AWS Lambda: contains the core logic for handling chat interactions. It processes user input, integrates with both OpenAI and Bedrock models, manages responses, and implements lightweight orchestration and memory handling.
- AI model layer:
  - Amazon Bedrock: provides access to foundation models for enterprise-grade AI.
  - OpenAI LLM (e.g., GPT-4): enhances conversational capabilities with advanced natural-language understanding and reasoning.
  - Lambda dynamically routes each request to the most appropriate model based on context or configuration.
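The routing step above can be sketched as a small dispatcher. The `invoke_bedrock` / `invoke_openai` stubs and the long-prompt rule are hypothetical; a real handler would call the Bedrock runtime and OpenAI APIs through their respective SDKs:

```python
from typing import Callable, Dict

# Hypothetical stand-ins for real SDK calls (boto3 bedrock-runtime / OpenAI client).
def invoke_bedrock(prompt: str) -> str:
    return f"[bedrock] {prompt}"


def invoke_openai(prompt: str) -> str:
    return f"[openai] {prompt}"


MODEL_BACKENDS: Dict[str, Callable[[str], str]] = {
    "bedrock": invoke_bedrock,
    "openai": invoke_openai,
}


def route_chat(prompt: str, config: Dict[str, str]) -> str:
    """Pick a backend from configuration, falling back to a default.
    Context-based rules (here: an assumed length threshold) hook in as well."""
    backend = config.get("model", "openai")
    # Example context rule: send very long prompts to Bedrock.
    if len(prompt) > 2000:
        backend = "bedrock"
    return MODEL_BACKENDS[backend](prompt)


reply = route_chat("hello", {"model": "bedrock"})  # "[bedrock] hello"
```

In a Lambda handler, `config` would come from environment variables or the request payload, so routing policy can change without redeploying code.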
- Persistence layer: stores conversation history and session metadata, enabling contextual continuity across chat sessions.
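The persistence layer's contract can be sketched with an in-memory store. This is a sketch only: a deployment would back it with a managed table (DynamoDB is a common choice for this pattern, but that is an assumption, not something the text specifies):

```python
from collections import defaultdict
from typing import Dict, List


class ConversationStore:
    """In-memory sketch of the persistence layer: conversation history is
    keyed by session ID so follow-up requests can rebuild chat context."""

    def __init__(self, max_turns: int = 20):
        self.max_turns = max_turns
        self._sessions: Dict[str, List[dict]] = defaultdict(list)

    def append(self, session_id: str, role: str, content: str) -> None:
        self._sessions[session_id].append({"role": role, "content": content})
        # Keep only the most recent turns to bound prompt size.
        self._sessions[session_id] = self._sessions[session_id][-self.max_turns:]

    def history(self, session_id: str) -> List[dict]:
        return list(self._sessions[session_id])


store = ConversationStore(max_turns=2)
store.append("s1", "user", "Hi")
store.append("s1", "assistant", "Hello!")
store.append("s1", "user", "What can you do?")
history = store.history("s1")  # only the last 2 turns remain
```

Capping retained turns is one simple memory-handling policy; summarizing older turns before truncation is a common alternative.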