This guide describes a collaborative, structured way to build software with LLMs while maintaining correctness, stability, and long-term maintainability. The goal is not novelty—it's shipping reliable systems without losing momentum.
Vibe coding works because it replaces "trusting the model" with systems that enforce correctness. You move fast without breaking things—not because the LLM is perfect, but because your process is.
Treat the LLM as a junior engineer with infinite stamina: fast, tireless, and helpful—but lacking long-term judgment unless you supply it.
Your job is to provide:
- Constraints
- Invariants
- Tests
- Clear expectations
Begin by talking to the LLM about what you want to build, not how to build it yet. Describe:
- The problem
- The type of program (web app, service, CLI, library, embedded system, etc.)
- The users
- The environment it will run in
Treat this as a design discussion. Bounce ideas back and forth. Let the model question assumptions, suggest alternatives, and surface risks early.
Explicitly prompt the model to act as an expert in the relevant domain (for example: "You are a senior backend engineer building production APIs"). This anchors responses in proven patterns instead of generic or experimental approaches.
Before implementation, ask the LLM to help refine the prompt it should follow. Have it identify:
- Missing constraints
- Ambiguous requirements
- Trade-offs that should be decided explicitly
This step dramatically improves output quality. You are aligning on expectations before any code exists.
Once the idea is clear, have the LLM generate a roadmap broken into phases. Each phase should include:
- Goals
- Features included
- Explicit exclusions
Phases reduce scope creep and give you natural checkpoints to reassess direction.
Before significant coding, generate:
- An architecture document (components, responsibilities, data flow)
- A README (what the project is, how it's structured, how to run it)
These documents act as anchors. They help both you and the LLM maintain consistency across long sessions and incremental changes.
Update them as the system evolves.
See /docs/templates for starting templates.
Create a document that defines the non-negotiable truths of your system: rules that must always hold, regardless of UI, API path, ingest source, or future feature additions. These invariants are the foundation for correctness, trust, and long-term maintainability.
If a change violates an invariant, it is a bug or a product decision—never an implementation detail.
Examples of invariants might include:
- Data integrity rules
- Security boundaries
- Authorization guarantees
- Ordering, idempotency, or consistency requirements
- Performance or availability assumptions
This document should be consulted before changing behavior. If new code conflicts with it, either the code is wrong or the invariant must be consciously revised.
See /docs/templates/INVARIANTS.md for a template.
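To make this concrete, here is a minimal sketch of one hypothetical invariant ("a payment is captured at most once per order") enforced at a module boundary and backed by a test. The names (`PaymentLedger`, `capture`, `DuplicateCaptureError`) are illustrative, not part of any particular framework, and pytest is assumed for the test.

```python
# payments.py (illustrative sketch)
# Invariant: a payment is captured at most once per order_id,
# regardless of which API path, retry, or ingest source triggered the call.


class DuplicateCaptureError(Exception):
    """Raised when code attempts to violate the capture-once invariant."""


class PaymentLedger:
    def __init__(self) -> None:
        self._captured: set[str] = set()

    def capture(self, order_id: str, amount_cents: int) -> None:
        # Enforce the invariant at this boundary, not in every caller.
        if order_id in self._captured:
            raise DuplicateCaptureError(f"order {order_id} already captured")
        self._captured.add(order_id)
        # ... call the payment provider here ...


# test_payments.py
import pytest


def test_capture_once_invariant_holds() -> None:
    ledger = PaymentLedger()
    ledger.capture("order-1", 1000)
    with pytest.raises(DuplicateCaptureError):
        ledger.capture("order-1", 1000)
```

The details do not matter; what matters is that the invariant lives in one enforced place with a test attached, so neither you nor the LLM can quietly route around it.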
Introduce tests as soon as real code appears.
Start with:
- Static pattern tests (linting, formatting, type checks)
- Basic functional tests for core flows
Run tests before accepting or merging any new feature or refactor produced by the LLM.
As features are added, add tests alongside them. Over-testing early may feel annoying, but it is far cheaper than debugging later—especially with fast-moving AI-assisted development.
Tests are not optional feedback; they are gates.
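A first functional test for a core flow can be very small. The sketch below assumes pytest and uses `create_user` as a stand-in for whatever your project's core flow actually is; both names are hypothetical.

```python
# test_core_flows.py -- minimal first functional tests (pytest assumed).
# `create_user` is a placeholder for one of your real core flows.
import pytest

from myapp.users import create_user  # hypothetical module under test


def test_create_user_assigns_id_and_normalizes_email() -> None:
    user = create_user(email="  Alice@Example.COM ")
    assert user.id is not None
    assert user.email == "alice@example.com"


def test_create_user_rejects_empty_email() -> None:
    with pytest.raises(ValueError):
        create_user(email="")
```

Wire tests like these into pre-commit hooks or CI so that "tests pass" is a condition for merging LLM output, not a suggestion.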
Tell the LLM explicitly:
- You want boring, predictable, production-grade solutions
- Prefer established patterns and idioms
- Avoid clever abstractions unless asked
Boring code is successful code. Creativity is opt-in. Reliability is the default.
This single instruction prevents a large class of long-term maintenance problems.
Favor modularity aggressively. Keep files small, focused, and single-purpose. As a practical guideline, avoid scripts or files longer than ~1,500 lines. This is not an arbitrary style preference—it is a reliability constraint when collaborating with LLMs.
LLMs reason best over code they can fully and clearly "see." Large, monolithic files reduce the model's ability to understand context, increase hallucinations, and make subtle bugs more likely. Smaller modules improve correctness, review quality, and iteration speed.
A program that runs is not necessarily a program that can be safely modified. Code should be structured so that both humans and LLMs can understand it in isolation.
Each module should:
- Have a single responsibility
- Expose a clear interface
- Hide internal implementation details
- Be understandable without reading the entire codebase
If a file requires scrolling constantly or holding many concepts in your head at once, it is doing too much.
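As a sketch of what "small module, clear interface" can look like in practice (the module and names are illustrative, not prescriptive):

```python
# rate_limit.py -- one responsibility: decide whether a request is allowed.
# Callers only see allow(); the sliding-window bookkeeping stays private.
import time
from dataclasses import dataclass, field


@dataclass
class RateLimiter:
    max_requests: int
    window_seconds: float
    _timestamps: list[float] = field(default_factory=list)

    def allow(self, now: float | None = None) -> bool:
        """Public interface: return True if another request fits in the window."""
        now = time.monotonic() if now is None else now
        self._prune(now)
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False

    def _prune(self, now: float) -> None:
        # Internal detail: drop timestamps that fell outside the window.
        cutoff = now - self.window_seconds
        self._timestamps = [t for t in self._timestamps if t >= cutoff]
```

A module this size fits comfortably in the model's context: you can say "modify rate_limit.py only," and you can review the resulting diff in one sitting.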
When adding features, resist the temptation to extend existing large files. Instead:
- Extract new behavior into new modules
- Introduce clear boundaries between concerns
- Keep changes localized
Growth should increase the number of small, understandable pieces—not inflate a few central ones.
Modularity creates natural checkpoints for AI-assisted work:
- The LLM can reason about a single module at a time
- Bug hunts can target specific components
- Refactors become safer and more predictable
- Tests can map cleanly to modules
This also makes it easier to give the LLM precise instructions: "Modify this module only" or "Add a new module that implements X."
Treat ~1,500 lines as a soft upper bound, not a strict rule. Hitting that limit is a signal to pause and ask:
- Can this be split?
- Are responsibilities mixed?
- Is there a missing abstraction?
Smaller files are easier to test, easier to document, and easier to reason about—especially across long sessions or future revisits.
Good modularity directly supports:
- Strong System Invariants (enforced at boundaries)
- Cleaner architecture documents
- More targeted tests
- Reduced regression risk
- Faster and safer LLM iteration
In short: modular code is not just cleaner—it is more AI-compatible.
In addition to automated tests, regularly ask the LLM to:
- Review recent changes for bugs
- Check edge cases
- Look for regressions against earlier assumptions
This can be done incrementally—after each feature or group of changes. Treat bug hunts as routine hygiene, not emergency response.
Keep a /prompts folder in the repository containing reusable, role-specific prompts. These act as tools you can invoke consistently.
Examples:
- Engineering prompt (architecture-aware, conservative)
- Frontend bug hunt
- Backend bug hunt
- Bug fixing mode
- User feedback testing (model simulates user behavior and confusion)
- Refactor for clarity
- Performance review
These prompts reduce variance and help you re-enter the correct mindset instantly.
See /prompts for examples.
As the system changes:
- Update the README
- Update the architecture doc
- Update the System Invariants & Contracts if (and only if) product decisions change
Drift between code and documentation is one of the fastest ways to lose control of an AI-assisted project.
The LLM is fast, tireless, and helpful—but it lacks long-term judgment unless you supply it.
Your job is to provide:
- Constraints
- Invariants
- Tests
- Clear expectations
Do that, and vibe coding becomes not just productive, but safe.
After completing significant features or changes, create a checkpoint summary. This builds a rolling audit trail that prevents long-term context loss.
Ask the LLM to produce:
- A 5-bullet summary of what changed
- Any new assumptions introduced
- Any risks or trade-offs added
Why this matters:
- Creates a searchable history of decisions
- Helps you (or future developers) understand why things are the way they are
- Catches drift before it compounds
- Makes it easier to resume work after breaks
Example checkpoint:
## Checkpoint: Added OAuth Authentication (2024-01-15)
### What Changed
- Added OAuth 2.0 support for Google and GitHub
- Created new OAuthController and token validation middleware
- Updated User model to support optional password field
- Added OAuth configuration to environment variables
- Migrated database to add oauth_provider and oauth_id columns
### New Assumptions
- OAuth providers return verified email addresses
- OAuth tokens are validated on every request (no local caching)
- Users can have either password OR OAuth, not both
- Google/GitHub APIs remain stable and available
### Risks & Trade-offs
- OAuth provider outages prevent login (mitigation: support multiple providers)
- Token validation adds latency to every request (acceptable for our scale)
- Email uniqueness now depends on OAuth provider behavior (documented in INVARIANTS.md)
Store checkpoints in:
- A CHANGELOG.md or DECISIONS.md file in /docs
- Commit messages (for smaller changes)
- PR descriptions
- Architecture document updates
Treat checkpoints as future documentation. When you return to the project in 6 months, these summaries are invaluable.
Before you start vibe coding, read these critical documents. They address common failure modes and essential practices:
- Claude Code Setup Guide - Comprehensive beginner guide
- Terminal configuration and notifications
- Running multiple Claude instances in parallel
- Plan mode, CLAUDE.md, slash commands
- Subagents, skills, and hooks
- MCP servers and verification loops
- Context Management - How to maintain coherent context across sessions
- When to start fresh conversations
- Preventing context pollution
- Managing context window limits
- Session continuity patterns
- Code Review for AI - How to review AI-generated code
- AI-specific code smells
- Security review checklist
- Human review checkpoints (non-negotiable)
- Iterative refinement techniques
- Anti-Patterns & Warning Signs - When AI development goes wrong
- Recognizing when the LLM is leading you astray
- Common anti-patterns (Framework Fever, The God File, etc.)
- Course correction strategies
- When AI is the wrong tool
- Version Control Best Practices - Git workflow for AI-assisted development
- Atomic commits with AI code
- Reviewing diffs before committing
- Branching for experiments
- Commit message templates
- Automation & Testing - Automated quality gates for AI code
- GitHub Actions CI/CD pipelines
- Claude Code hooks (PostToolUse for auto-formatting)
- Pre-commit hooks and static analysis
- Security scanning (CodeQL, Semgrep, Gitleaks)
- Verification loops (key to quality AI code)
- Claude Code GitHub Action for PR reviews
- Emotional Prompt Engineering - Using psychology to improve LLM output
- Why telling LLMs they're "the best" actually works
- EmotionPrompt research and findings
- Effective vs. ineffective "hype" strategies
- The sycophancy trap and how to avoid it
- High-competence prompt formulas
Start here: If you only read one, read Anti-Patterns & Warning Signs. It will save you from the most common mistakes.
- Review the quick-reference checklist - Print it and keep it handy
- Read Essential Reading - Understand critical practices and failure modes
- Use the templates in /docs/templates to set up your project documentation
- Browse example prompts in /prompts to see how to communicate effectively with LLMs
- Start with the product conversation before writing any code
- Establish your invariants early and refer to them often
Traditional software development emphasizes careful upfront planning because writing code is expensive. With LLMs, code generation is cheap—but maintaining correctness is still hard.
This workflow inverts the traditional approach:
- Spend time on constraints, tests, and documentation
- Let the LLM handle the tedious implementation
- Use automated checks to catch drift
The result: velocity without chaos.
This guide is a living document. If you've found patterns that work (or anti-patterns that don't), please contribute:
- Share your templates in /docs/templates
- Add useful prompts to /prompts
- Document lessons learned
This guide is released into the public domain. Use it however helps you build better software.