| Content | Quick Link |
|---|---|
| Introduction to AI Agents | Explore |
| Building LLMs for Production | Explore |
| Building High-Performance, Private AI Infrastructure for the Enterprise | Explore |
| Mastering the Model Context Protocol (MCP) | Explore |
| Agent Memory Part I (A Survey of Memory) | Explore |
| Agent Memory Part II (Building Memory Modules for Agentic AI Systems) | Explore |
| Agent Evaluation (Eval) Engineering | Explore |
📥 Download High-Resolution Mind Map (.jpg)
Click here to unfold the full Mind Map (agents-architecture-operations-and-evolution-mindmap.jpg)
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View the "Introduction to AI Agents" Slides (PDF)
📥 Download PDF (Direct Link)
View the AI Agent Project in the LLMs-Lab repository on the Eric-LLMs GitHub profile.
To bridge theory with practice, I developed a modular AI Agent project that implements autonomous reasoning and task execution:
- Architecture: Utilizes a decoupled structure with dedicated directories for `Agent` logic, `Tools`, `Utils`, and `Prompts`.
- Reasoning Loop: Features an `AutoGPT.py` implementation using ReAct (Reasoning and Acting) logic to handle complex, multi-step goal decomposition (see the sketch after this list).
- Functional Tools: Includes custom tools for deep data analysis (Excel processing via Pandas), automated communication via email, PDF-based question answering (`FileQATool`), requirements-driven document generation (`WriterTool`), and dynamic script-based auditing of structured files using custom heuristics and thresholds (`PythonTool`).
- End-to-End Workflow: Supports real-world scenarios, such as identifying underperforming suppliers from sales records and autonomously drafting/sending notifications.
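For orientation, here is a minimal, framework-free sketch of a ReAct-style reasoning loop. It is illustrative only: the function names and prompt format below are hypothetical and do not reproduce the project's actual `AutoGPT.py` code.

```python
# Minimal ReAct-style loop (illustrative sketch; not the project's AutoGPT.py implementation).
# `llm` is any callable that maps a prompt string to a completion string.
import re
from typing import Callable, Dict

def react_loop(goal: str, llm: Callable[[str], str],
               tools: Dict[str, Callable[[str], str]], max_steps: int = 10) -> str:
    """Alternate between reasoning (Thought) and acting (Action) until a Final Answer appears."""
    transcript = f"Goal: {goal}\nAvailable tools: {', '.join(tools)}\n"
    instructions = ("\nRespond with either:\n"
                    "Thought: <reasoning>\nAction: <tool name>\nAction Input: <input>\n"
                    "or\nFinal Answer: <answer>\n")
    for _ in range(max_steps):
        completion = llm(transcript + instructions)
        transcript += completion + "\n"
        if "Final Answer:" in completion:
            return completion.split("Final Answer:", 1)[1].strip()
        action = re.search(r"Action:\s*(.+)", completion)
        action_input = re.search(r"Action Input:\s*(.+)", completion)
        if action and action.group(1).strip() in tools:
            # Execute the chosen tool and feed the observation back into the context.
            arg = action_input.group(1).strip() if action_input else ""
            transcript += f"Observation: {tools[action.group(1).strip()](arg)}\n"
        else:
            transcript += "Observation: unknown tool; choose one of the available tools.\n"
    return "Stopped: step budget exhausted."
```

In the project itself, entries such as `FileQATool`, `WriterTool`, and `PythonTool` would populate the `tools` dictionary.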
⬆️ Back to Top: Table of Contents
This guide covers LLM production, from Transformer architectures to advanced techniques like RAG and Fine-Tuning. It explores frameworks like LangChain, methods to mitigate hallucinations, and optimization via quantization. Learn to build autonomous agents for real-world use.
📥 Download High-Resolution Mind Map (.jpg)
Click here to unfold the full Mind Map (building-llms-for-production-mindmap.jpg)
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View the "Building LLMs for Production" Slides (PDF)
📥 Download PDF (Direct Link)
Explore Practical LLM Implementations in the LLMs-Lab repository on the Eric-LLMs GitHub profile.
The production-grade principles discussed in this book, including Fine-Tuning, RAG optimization, LangChain, Prompt Engineering, Function Calling, and Agents, have each been researched as a standalone module, and each module features multiple project implementations.
⬆️ Back to Top: Table of Contents
📥 Download High-Resolution Mind Map (.jpg)
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View the "Building High-Performance, Private AI Infrastructure for the Enterprise" Slides (PDF)
📥 Download PDF (Direct Link)
Work in progress ...
⬆️ Back to Top: Table of Contents
📥 Download High-Resolution Mind Map (.jpg)
Click here to unfold the full Mind Map (mastering-the-model-context-protocol-mindmap.jpg)
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View the "Mastering the Model Context Protocol (MCP)" Slides (PDF)
📥 Download PDF (Direct Link)
Explore Model Context Protocol (MCP) Projects on GitHub: a curated collection of industry-standard MCP server implementations (a minimal server sketch follows below).
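For context, the official MCP Python SDK exposes a high-level `FastMCP` server class; the sketch below follows the shape of its quickstart, with a purely illustrative tool. Treat it as an assumption about the SDK's surface and check the current documentation before relying on it.

```python
# Minimal MCP server sketch using the FastMCP helper from the official Python SDK.
# The tool below is illustrative only.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers and return the result."""
    return a + b

if __name__ == "__main__":
    # Runs the server over stdio so an MCP client (e.g. an agent host) can connect to it.
    mcp.run()
```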
⬆️ Back to Top: Table of Contents
📥 Download High-Resolution Mind Map (.jpg)
Click here to unfold the full Mind Map (unforgettable_agents_architecting_ai_memory-mindmap.jpg)
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View the "A Blueprint for Memory in Agentic Intelligence" Slides (PDF)
📥 Download PDF (Direct Link)
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View the "Unforgettable Agents Architecting AI Memory" Slides (PDF)
📥 Download PDF (Direct Link)
For a comprehensive list of papers related to Agent Memory, we highly recommend checking out:
- Agent-Memory-Paper-List by Shichun-Liu.
⬆️ Back to Top: Table of Contents
A comprehensive guide on designing memory systems for AI Agents. This document synthesizes academic surveys with practical implementation strategies, covering:
- Theory: Taxonomy of agent memory (Forms, Functions, Dynamics).
- Frameworks: Deep dive into Mem0, Letta (MemGPT), and LangMem.
- Practice: Enterprise-grade solutions using Amazon Bedrock AgentCore.
📥 Download High-Resolution Mind Map (mindmap.png)
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View Slides (PDF)
📥 Download PDF (Direct Link)
The following frameworks and repositories are discussed in this guide, representing the current state-of-the-art in Agentic Memory:
- Mem0: A dual-layer memory framework supporting working, factual, and semantic memory types for agent state persistence.
- Letta (MemGPT): Manages infinite context by treating agents like an OS with virtual memory and recursive summarization.
- LangMem: A LangChain library that implements Semantic, Episodic, and Procedural memory integration for LangGraph agents.
- Amazon Bedrock Samples: A comprehensive collection of examples for using Amazon Bedrock, including various implementations of Agentic workflows and memory patterns.
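To make the memory taxonomy concrete, here is a small framework-agnostic sketch. All class and method names are hypothetical: they illustrate the working/episodic/semantic split rather than the actual APIs of Mem0, Letta, or LangMem.

```python
# Framework-agnostic memory sketch (hypothetical names, not the Mem0/Letta/LangMem APIs).
from collections import deque
from dataclasses import dataclass, field
from typing import List

@dataclass
class AgentMemory:
    working: deque = field(default_factory=lambda: deque(maxlen=20))  # short-term rolling context
    episodic: List[str] = field(default_factory=list)                 # summaries of past interactions
    semantic: List[str] = field(default_factory=list)                 # distilled facts about the user/world

    def observe(self, message: str) -> None:
        """Every turn lands in working memory; overflow is summarized into episodic memory."""
        if len(self.working) == self.working.maxlen:
            self.episodic.append("summary: " + self.working.popleft())
        self.working.append(message)

    def remember_fact(self, fact: str) -> None:
        """Facts extracted by the LLM (e.g. 'user prefers Python') go into semantic memory."""
        self.semantic.append(fact)

    def build_context(self, query: str) -> str:
        """Assemble prompt context; a real system would use embedding search instead of substring match."""
        words = query.lower().split()
        recalled = [m for m in self.semantic + self.episodic if any(w in m.lower() for w in words)]
        return "\n".join(list(self.working) + recalled)
```

Production frameworks add persistence, embedding-based retrieval, and memory-write policies on top of this basic split.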
⬆️ Back to Top: Table of Contents
"In the age of Agents, your product is only as good as your ability to measure it."
Evaluating AI Agents requires a fundamental shift from simple output checks ("vibe checks") to analyzing multi-step trajectories, environment changes, and tool usage. This repository consolidates frameworks and engineering practices for moving from intuition to instrumentation.
It synthesizes industry standards from Anthropic, LangChain, and real-world engineering practices to build a robust Evaluation Harness.
- The Intuition Trap: Why manual "vibe checks" fail as complexity scales.
- The Harness: Building a standardized environment for agent execution composed of Inputs, Tasks, and Graders.
- Trajectory vs. Outcome: Evaluating the journey (reasoning logs, tool calls) rather than just the destination (final answer).
- Reliability Metrics (see the estimator sketch after this list):
- Pass@k (Creativity): Can the agent succeed at least once in k tries? (Good for brainstorming).
- Pass^k (Reliability): Can the agent succeed every single time in k tries? (Critical for autonomous agents).
- Swiss Cheese Model: Layering defenses (Automated Evals โ Human Review โ Production Monitoring) to ensure reliability.
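Both reliability metrics can be estimated from repeated trials. The sketch below uses the standard combinatorial estimator for Pass@k and a simple (success rate)^k estimate for Pass^k; it illustrates the definitions rather than any particular framework's implementation.

```python
# Estimate Pass@k ("succeeds at least once in k tries") and Pass^k ("succeeds in all k tries")
# from n independent trials of which c succeeded.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # fewer than k failures recorded, so any k draws must contain a success
    return 1.0 - comb(n - c, k) / comb(n, k)

def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate that all k independent attempts succeed: (c / n) ** k."""
    return (c / n) ** k

# Example: 10 trials with 7 successes. Pass@3 looks excellent (~0.99),
# but Pass^3 (~0.34) shows why it is the stricter bar for autonomous agents.
print(pass_at_k(10, 7, 3))
print(pass_hat_k(10, 7, 3))
```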
📥 Download High-Resolution Mind Map (mindmap.png)
A Comprehensive Guide to Evaluating AI Agents. It focuses on the engineering framework for testing, including the "Clean Room" methodology, reliability metrics (Pass@k), and the "Harness" architecture, and treats evaluation as a core development practice.
💡 Tip: Press `Ctrl+Click` (or `Command+Click`) to open in a new tab.
View Slides (PDF)
📥 Download PDF (Direct Link)
Implementing a robust evaluation pipeline requires specific infrastructure. The following tools are referenced and utilized in this framework (a generic grader sketch follows the table):
| Tool | Category | Key Features |
|---|---|---|
| LangSmith | Tracing & Debugging | Full trajectory tracing, runnableConfig tagging for A/B testing, and dataset management. |
| LangFuse | Observability | Open-source alternative for observability, prompt management, and lightweight evaluation. |
| DeepEval | Unit Testing | "Pytest for LLMs". Specific metrics for RAG (Hallucination, Answer Relevancy) and Agents. |
| OpenEvals | Graders | A library of pre-built "LLM-as-a-judge" prompts (Conciseness, Correctness, Coherence) compatible with LangSmith. |
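The grading layer these tools provide can be illustrated without committing to any of them. The following is a generic "LLM-as-a-judge" grader with hypothetical names; it is not the OpenEvals or DeepEval API, only the pattern those libraries package up.

```python
# Generic "LLM-as-a-judge" grader sketch (hypothetical names, not a specific library's API).
import json
from dataclasses import dataclass
from typing import Callable

JUDGE_PROMPT = """You are grading an AI agent's answer.
Question: {question}
Agent answer: {answer}
Reference answer: {reference}
Return JSON: {{"score": <number between 0 and 1>, "reason": "<one sentence>"}}"""

@dataclass
class GradeResult:
    score: float
    reason: str

def llm_judge(question: str, answer: str, reference: str,
              judge_model: Callable[[str], str]) -> GradeResult:
    """Ask a (usually stronger) model to grade correctness and return a structured result."""
    raw = judge_model(JUDGE_PROMPT.format(question=question, answer=answer, reference=reference))
    parsed = json.loads(raw)  # production graders add retries and schema validation here
    return GradeResult(score=float(parsed["score"]), reason=str(parsed["reason"]))
```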
To balance cost and performance, we implement a Hybrid Agent Architecture (see the routing sketch after this list):
- Reactive Layer (System 1): Handles simple, direct queries (e.g., "What is the stock price?") with low latency.
- Deliberative Layer (System 2): Activated for complex planning or multi-step reasoning tasks.
- Coordination Layer: A router that classifies intent and dispatches tasks.
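Below is a minimal routing sketch of the three layers, with hypothetical function names and a trivial heuristic classifier standing in for a real intent model.

```python
# Hybrid agent routing sketch: a coordination layer dispatches each query either to a fast
# reactive path or to a slower deliberative planner. All names are illustrative only.
from typing import Callable

def reactive_answer(query: str) -> str:
    """System 1: a single lookup or tool call, optimized for latency."""
    return f"[fast path] lookup result for: {query}"

def deliberative_answer(query: str) -> str:
    """System 2: plan, call tools over multiple steps, then synthesize (stubbed here)."""
    return f"[planning path] multi-step plan executed for: {query}"

def route(query: str, classify_intent: Callable[[str], str]) -> str:
    """Coordination layer: classify intent, then dispatch to the appropriate layer."""
    if classify_intent(query) == "simple_lookup":
        return reactive_answer(query)
    return deliberative_answer(query)

# A trivial heuristic classifier; a production router would use an LLM or a small fine-tuned model.
print(route("What is the stock price of ACME?",
            lambda q: "simple_lookup" if "price" in q.lower() else "complex"))
```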
To prevent "cheating" through shared state, every evaluation trial runs in a fresh container/sandbox (see the pytest-style sketch after this list).
- Isolation: Fresh container for every trial.
- Mocking: Simulate external APIs to control latency and deterministic outputs.
- Cleanup: Aggressive state teardown (no shared history).
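A pytest-style sketch of this clean-room discipline, using hypothetical helper names (no specific sandbox or container product is implied):

```python
# Clean-room evaluation sketch: fresh sandbox per trial, mocked external APIs, aggressive teardown.
# The Sandbox class and test below are hypothetical stand-ins for a real harness.
import pytest
from unittest.mock import MagicMock

class Sandbox:
    """Stand-in for a container handle; a real harness would start an isolated container here."""
    def __init__(self):
        self.state = {}          # no history shared across trials

    def teardown(self):
        self.state.clear()       # aggressive cleanup after every trial

@pytest.fixture
def sandbox():
    box = Sandbox()              # Isolation: a fresh environment per test, never reused
    yield box
    box.teardown()               # Cleanup: drop all state even if the trial failed

@pytest.fixture
def mocked_weather_api():
    # Mocking: deterministic output and zero network latency for the external dependency
    api = MagicMock()
    api.get_forecast.return_value = {"city": "Berlin", "temp_c": 21}
    return api

def test_agent_reports_temperature(sandbox, mocked_weather_api):
    # In a real harness the agent would run inside the sandbox and call the mocked API as a tool.
    forecast = mocked_weather_api.get_forecast("Berlin")
    sandbox.state["last_tool_call"] = forecast
    assert forecast["temp_c"] == 21
```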






