This repository documents the limits of static Retrieval-Augmented Generation (RAG) systems by isolating and evaluating each boundary in the pipeline: ingestion, retrieval, and ranking.
Across multiple controlled experiments, we see that many RAG failures originate before generation — and often before retrieval even runs.
This repository is a systems-level synthesis of several focused experiments that analyze static RAG pipelines, defined as systems where:
- documents are chunked once
- embeddings are fixed
- retrieval behavior is non-adaptive
- ranking is applied once per query
- generation consumes a fixed context window
No agents. No memory. No adaptive control flow.
Only frozen pipelines under controlled variation.
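The frozen pipeline above can be made concrete with a minimal sketch. Everything here is hypothetical and deliberately tiny: a toy bag-of-words embedding stands in for a real model, and `chunks`, `answer`, and `k` are illustrative names, not code from any of the repositories.

```python
# Minimal sketch of a static RAG pipeline: everything is computed once
# and never adapted (chunks, vectors, K, context size are all frozen).
from collections import Counter
import math

def embed(text):
    # Frozen "embedding": a bag-of-words vector (stand-in for a real model).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Documents are chunked once.
chunks = ["the cache is invalidated on write",
          "writes acquire a global lock",
          "reads never block"]

# 2. Embeddings are fixed at ingestion time.
index = [(c, embed(c)) for c in chunks]

def answer(query, k=2):
    # 3-4. Non-adaptive retrieval and one-shot ranking: same K for every query.
    q = embed(query)
    ranked = sorted(index, key=lambda ce: cosine(q, ce[1]), reverse=True)
    # 5. Generation would consume this fixed context window verbatim.
    return [c for c, _ in ranked[:k]]

print(answer("what happens on a write"))
```

If the evidence for a question is not representable inside one of those frozen chunks, nothing downstream of `index` can recover it.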
Static RAG systems fail for reasons that are often misattributed:
- ❌ “The model hallucinated”
- ❌ “The retriever is weak”
- ❌ “The embeddings aren’t good enough”
This body of work shows that failures more often arise from:
- evidence never being representable as a single unit
- relevant chunks existing but appearing too deep to retrieve
- ranking saturating due to poorly formed chunks
These are structural failures, not model failures.
This repository organizes results by system boundary, not chronology.
Each boundary is evaluated in isolation, with all others frozen.
Repo: `rag-minimal-control`
Boundary isolated: End-to-end RAG correctness under minimal assumptions.
What it establishes: A smallest-possible RAG system that can still fail in meaningful ways.
Key insight: A system can be architecturally correct yet fail most questions due to evidence starvation — without hallucination.
Repo: `rag-retrieval-eval`
Boundary isolated: Evidence surfacing, independent of generation.
What it establishes: Whether relevant evidence exists and whether it appears within Top-K can be measured explicitly.
Key insight: Many failures are retrieval-depth failures, not absence of evidence.
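Measuring this explicitly requires only two small functions. The sketch below uses hypothetical chunk IDs and helper names (`first_relevant_rank`, `recall_at_k`); it illustrates the distinction between evidence that is absent and evidence that surfaces too deep.

```python
# Does the gold chunk exist in the ranking at all, and at what depth?

def first_relevant_rank(ranked_chunk_ids, gold_ids):
    """1-based rank of the first gold chunk, or None if it never surfaces."""
    for rank, cid in enumerate(ranked_chunk_ids, start=1):
        if cid in gold_ids:
            return rank
    return None

def recall_at_k(ranked_chunk_ids, gold_ids, k):
    """Fraction of gold chunks that appear within the top K results."""
    return len(set(ranked_chunk_ids[:k]) & set(gold_ids)) / len(gold_ids)

# A depth failure: the evidence exists but sits at rank 7, past a Top-5 cutoff.
ranking = ["c9", "c2", "c5", "c1", "c8", "c3", "c4"]
gold = ["c4"]
print(first_relevant_rank(ranking, gold))  # 7: present, but too deep
print(recall_at_k(ranking, gold, 5))       # 0.0 under K=5
```

A rank of 7 against a Top-5 cutoff is a retrieval-depth failure; a rank of `None` is absence of evidence. The two demand different fixes.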
Repo: `rag-hybrid-retrieval`
Boundary isolated: Recall behavior under different retrieval signals.
What it establishes: Lexical and semantic retrievers surface complementary evidence.
Key insight: Hybrid retrieval improves surfacing but does not eliminate ranking depth or representability issues.
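One common way to combine complementary signals is Reciprocal Rank Fusion (RRF). The sketch below is illustrative, not the repository's implementation; the two rankings and the chunk IDs are made up.

```python
# Fuse a lexical (BM25-style) ranking with a dense (semantic) ranking via RRF.
def rrf(rankings, k=60):
    """Each item scores sum(1 / (k + rank)) across the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical  = ["c3", "c1", "c7"]   # surfaced by exact-term overlap
semantic = ["c5", "c3", "c2"]   # surfaced by paraphrase similarity
fused = rrf([lexical, semantic])
print(fused[0])  # "c3": the only chunk surfaced by both signals
```

Fusion widens what surfaces, but note what it cannot do: a chunk absent from both input rankings never enters `scores`, and a poorly formed chunk stays poorly formed.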
Repo: `rag-reranking-playground`
Boundary isolated: Evidence prioritization after retrieval.
What it establishes: Reranking improves ordering but cannot recover missing or fragmented evidence.
Key insight: Ranking saturates when chunking is misaligned with question structure.
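The limitation is structural and easy to see in a sketch. The relevance scores below are hypothetical cross-encoder-style values, not output from any real model:

```python
# Reranking reorders what retrieval surfaced; it cannot reach past it.
def rerank(candidates, score):
    return sorted(candidates, key=score, reverse=True)

retrieved = ["c2", "c9", "c5"]  # Top-K handed over by the retriever
scores = {"c2": 0.41, "c9": 0.88, "c5": 0.12,
          "c7": 0.99}           # c7: the best chunk, never retrieved

reranked = rerank(retrieved, lambda c: scores[c])
print(reranked)          # ['c9', 'c2', 'c5']
print("c7" in reranked)  # False: reranking cannot add to the candidate set
```

However strong the reranker, its output is a permutation of its input; missing or fragmented evidence stays missing.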
Repo: `rag-chunking-strategies`
Boundary isolated: Document ingestion and representability.
What it establishes: Chunking determines which questions are even answerable as single-context queries.
Key insight: Changing only the chunking strategy causes questions to transition between:
- Atomic
- Structural
- Compositional
- Unanswerable
Retrieval quality becomes irrelevant when no coherent chunk exists.
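The transition is mechanical, as a toy example shows. The document, the chunk sizes, and the splitter functions below are all hypothetical; the point is only that the same evidence span is atomic under one strategy and fragmented under another.

```python
# The same document under two chunking strategies.
doc = ("The scheduler preempts tasks every 10ms. "
       "Preemption is disabled inside critical sections.")

def fixed_size_chunks(text, size):
    # Character windows: oblivious to sentence boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

def sentence_chunks(text):
    # Naive sentence split; adequate for this illustration.
    return [s if s.endswith(".") else s + "." for s in text.split(". ")]

needle = "preempts tasks every 10ms"  # the evidence span a question needs
for name, chunks in [("fixed-30", fixed_size_chunks(doc, 30)),
                     ("sentence", sentence_chunks(doc))]:
    atomic = any(needle in c for c in chunks)
    print(name, "atomic:", atomic)
```

Under the 30-character windows the span straddles a chunk boundary, so no single retrievable unit contains it; under sentence chunking it is atomic. Nothing about the retriever changed.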
Within these constraints, static RAG systems can:
- retrieve localized facts
- surface section-level explanations
- benefit from hybrid retrieval and reranking
- fail deterministically and diagnosably

They cannot:
- adapt retrieval strategy per question
- reason across disjoint chunks
- recover broken causal chains
- respond when representability fails
These are hard system limits, not tuning problems.
This repository establishes that:
- Retrieval failure is often a ranking problem
- Ranking failure is often an ingestion problem
- Chunking is not preprocessing — it is representation modeling
- Some questions are unanswerable under static constraints, regardless of retriever strength
No claims are made about:
- answer quality
- faithfulness
- optimal chunking strategies
- production best practices
Once ingestion, retrieval, and ranking are frozen:
- remaining failures are not stochastic
- further tuning yields diminishing returns
- system behavior becomes predictable
At this point, improving performance requires changing the computational regime, not the parameters.
When answers are:
- compositional by nature
- distributed across chunks
- absent as single coherent units
…the system must decide what to do next.
That transition marks the boundary between static pipelines and adaptive, agentic systems.
This repository closes the static regime.
The next regime — where systems must decide whether to retrieve, how to act, and what state to carry forward — is established in a follow-up repository.
Before adding intelligence, measure and understand the constraints of the system you already have.
This repository does exactly that.