Blog

  • Stop Stuffing Your CLAUDE.md. The Research Says Less Is More.

    A new study benchmarking context files across four coding agents and 138 real-world tasks found that auto-generated CLAUDE.md files hurt performance and inflate costs. Here’s what to do instead.


    Run /init. Watch Claude Code analyze your repo, enumerate every directory, catalog your build tools, and produce a sprawling CLAUDE.md full of information it could have just read from package.json. Commit it. Feel productive. Ship worse code.

    That, at least, is the surprising conclusion of “Evaluating AGENTS.md”, a February 2026 paper from ETH Zurich and LogicStar.ai. The researchers built a new benchmark called AgentBench — 138 real GitHub issues across 12 repositories that already contain developer-written context files — and ran four major coding agents against it in three configurations: no context file at all, an LLM-generated context file (the /init approach), and the developer’s own hand-written context file.

    The results should make anyone who cargo-culted a 600-line CLAUDE.md stop and reconsider.

    The Numbers Don’t Lie

    Across Claude Code with Sonnet 4.5, Codex with GPT-5.2 and GPT-5.1 Mini, and Qwen Code with Qwen3-30B, LLM-generated context files reduced task success rates by an average of 2–3% while increasing inference costs by over 20%. That’s right: the agent spent more money to produce worse results.

    | Setting | Avg. Resolution Rate | Cost Change | Extra Steps |
    | --- | --- | --- | --- |
    | No context file | baseline | baseline | baseline |
    | LLM-generated (`/init`) | −2–3% | +20–23% | +2.5–3.9 |
    | Developer-written (minimal) | +4% avg. | ~+19% | +3.3 |

    Developer-written context files did marginally better than nothing, a 4% average improvement, but they still increased cost and step count. From a cost-efficiency standpoint, the only setting that was consistently, unambiguously better was having no context file at all.

    “Unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.” — Gloaguen et al., “Evaluating AGENTS.md,” 2026

    Why /init Makes Things Worse

    The paper identifies a core problem: auto-generated context files are almost entirely redundant with documentation the agent can already discover. The researchers found that 100% of Sonnet-4.5-generated context files contained codebase overviews, and 95–99% of files generated by other models did too. These overviews describe directory structures, explain what src/ contains, and catalog build tools — all things the agent would find in seconds by running ls and reading README.md.

    Worse, the study measured how quickly agents found the files relevant to an issue, and discovered that context files didn’t help at all. Agents with context files needed as many steps to reach the right files as agents without them, and sometimes more. In some cases, agents wasted steps hunting for the context file itself, reading it multiple times despite it already being in their context window.

    The behavioral analysis is damning. When context files were present, agents ran more tests, grepped more files, read more files, and invoked more repository-specific tooling. The agents were following the instructions — dutifully using uv when told to, running pytest with specific flags when instructed — but all that additional work didn’t translate into solving more tasks. It just burned tokens. GPT-5.2’s reasoning token usage jumped 22% with LLM-generated context files. The agent was literally thinking harder about instructions that weren’t helping it.

    Key finding: When the researchers removed all documentation from the repos (READMEs, docs folders, example code) and left only the context file, LLM-generated files finally outperformed having nothing. This confirms that /init essentially parrots back existing documentation. If your repo already has a README, the auto-generated CLAUDE.md is noise.

    The Exact Opposite of /init

    So what should you actually put in a CLAUDE.md? The answer follows from the research: only things the agent cannot easily glean by reading the repo itself.

    Your CLAUDE.md should be a tiny pointer file. A few lines of orientation, followed by links to deeper documents that the agent can pull in on demand when they’re relevant. Think of it as a lobby directory, not an encyclopedia.

    What /init generates (don’t do this)

    # Project Overview
    This is a Python web application using
    FastAPI with SQLAlchemy ORM...
    
    ## Directory Structure
    - src/ — Main application code
    - src/api/ — API route handlers
    - src/models/ — Database models
    - src/services/ — Business logic
    - tests/ — Test suite
    - docs/ — Documentation
    
    ## Tech Stack
    - Python 3.12
    - FastAPI 0.109
    - SQLAlchemy 2.0
    - PostgreSQL
    - Redis for caching
    - pytest for testing
    
    ## Build & Test
    - pip install -e ".[dev]"
    - pytest tests/
    - ruff check src/
    
    ## Code Style
    - Follow PEP 8
    - Use type hints
    - Docstrings for public functions
    
    (... 200 more lines ...)

    What you should write instead

    # Acme API
    
    Test: `make test`
    Lint: `make lint`
    Single test: `pytest tests/path -k name`
    
    ## Non-Obvious Rules
    - Migrations: never edit models without `make migration`
    - Auth uses custom HMAC, not JWT. Read docs/SSD.md §4 first.
    - Module X and Module Y must not cross-import.
    
    ## Key Documents
    - System design & architecture: docs/SSD.md
    - Deployment & environment setup: docs/deploy.md
    - Third-party API quirks: docs/integrations.md

    The first example describes things Claude can discover in five seconds. The second tells it things it would never guess: the non-obvious migration workflow, the custom auth scheme that looks like JWT but isn’t, and where to find the document that explains the system’s actual design intent.


    Link Out to Contextual Documents

    The philosophy is progressive disclosure. Your CLAUDE.md stays lean — maybe 30 lines — and links to richer documents that the agent reads only when it’s working in that area. Those linked documents should each contain knowledge that isn’t self-evident from reading the code.

    The kind of documents worth linking

    Not every markdown file in your docs/ folder is worth pointing to. The test is simple: could a competent developer figure this out by reading the source code and existing documentation? If yes, don’t link it. If no, it’s a candidate.

    Useful linked documents tend to fall into a few categories. Workflow documents explain non-obvious processes: “we deploy via a Slack bot, not CI,” or “database changes require a review from the DBA team before merging.” Decision records capture why the code is the way it is — the rejected alternatives, the constraints that shaped the design. API integration guides describe quirks and rate limits of third-party services that the code doesn’t make obvious.

    But the single most valuable linked document — the one that gives a coding agent the deepest leverage — is a Software System Design document.

    The SSD: Your Agent’s Architectural Brain

    An SSD (Software System Design, sometimes called SDD) is the document that bridges intent and implementation. It describes what the system does, why it’s structured the way it is, how the components relate, and what the constraints are. It’s the document you’d write before building the system if you were being disciplined about it.

    For a coding agent, the SSD is transformative for a simple reason: code tells you what is, but not what should be. When Claude reads your FastAPI route handlers, it can see the current request/response shapes, the middleware chain, the database queries. What it cannot see is that the auth service is intentionally stateless because you plan to deploy it at the edge. Or that the event system uses eventual consistency because the team evaluated and rejected strong consistency after load testing. Or that module X and module Y must never import from each other because of a planned future extraction into a separate service.

    Without an SSD, the agent will cheerfully introduce a circular dependency, add state to the auth service, or use synchronous calls where you need async events — and its solution will pass all your tests while violating your architecture.

    What belongs in an SSD

    • System purpose and scope — the one-paragraph answer to “what is this and who is it for?”
    • Architectural decisions and constraints — the choices that shaped the system and the alternatives you rejected. This is the most valuable section for an agent.
    • Component responsibilities and boundaries — what each major module owns, and critically, what it must not own.
    • Data flow and integration patterns — how data moves through the system, especially non-obvious paths like event buses, caches, or async queues.
    • Security model — authentication, authorization, data handling constraints.
    • Known traps — the things that look simple but aren’t. Every codebase has these.

    The SSD doesn’t need to be a 40-page IEEE-standard document. A focused 2–4 page markdown file that a senior engineer would write before handing a system to a new team member is exactly right. It should be a living document — updated when architectural decisions change, not when variable names change.

    When you link to it from CLAUDE.md with a line like “For system architecture and design rationale, see docs/SSD.md”, you’re giving the agent something it cannot get any other way: your intent.
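A minimal SSD skeleton along these lines might look as follows. The section names track the list above; the project details (stateless auth, eventual consistency, module boundaries) are illustrative placeholders, not prescriptions:

```markdown
# Acme API — System Design

## Purpose & Scope
One paragraph: what the system does and who it serves.

## Architectural Decisions
- Auth service is stateless (planned edge deployment). Rejected: server-side sessions.
- Event bus uses eventual consistency. Rejected: strong consistency, after load testing.

## Component Boundaries
- `orders` owns the order lifecycle; it must not reach into payment internals.
- `orders` and `payments` communicate only via events (planned service extraction).

## Data Flow
Request → API layer → command handler → event bus → async consumers.

## Security Model
Custom HMAC request signing, not JWT.

## Known Traps
- The cache is write-through only in production; local dev skips it.
```

Two to four pages of this, kept current as decisions change, is the target.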

    A Practical Structure

    Here’s the model I’d recommend, drawn from the research and from what I’ve found working with Claude Code on my own projects:

    # Project Name
    
    One-sentence description of the project.
    
    ## Commands
    Test: `make test`
    Lint: `make lint`
    Type check: `make typecheck`
    Single test: `pytest tests/path.py -k test_name`
    
    ## Non-Obvious Rules
    - Migrations: never edit models without `make migration`
    - Auth uses custom HMAC, not JWT. Read docs/SSD.md §4 first.
    - Module X and Module Y must not cross-import.
    
    ## Key Documents
    - System design & architecture: docs/SSD.md
    - Deployment & environment setup: docs/deploy.md
    - Third-party API quirks: docs/integrations.md

    That’s it. No directory tree. No list of what’s in src/. No explanation of what Python is. Every line either tells the agent something non-obvious or points to a document that does.

    What not to include

    Anything the agent can learn by reading the code. This includes: your tech stack (it’s in your dependency files), your directory structure (it can run ls), your code style (it can read existing files and match the pattern), your API endpoints (they’re defined in the code), and general programming best practices (it already knows those). Every one of these categories appeared in the /init-generated files that the ETH Zurich study found to be counterproductive.


    The Takeaway

    The research is clear: auto-generated context files are redundant documentation that costs you money and focus. Human-written context files help only when they’re minimal — describing requirements the agent can’t discover on its own.

    Your CLAUDE.md should be a short, hand-crafted file that links out to deeper contextual documents. The most important of those linked documents is an SSD — a Software System Design document that captures your architectural intent, constraints, and decisions. The SSD gives the agent something no amount of code-reading can provide: an understanding of why the system is the way it is and what invariants must be preserved.

    Delete your /init output. Write 30 lines by hand. Write an SSD. Link to it. Watch your agent do better work for less money.

    The exact opposite of /init is thinking carefully about what only you know — and writing that down.


    References

    • Gloaguen, T., Mündler, N., Müller, M., Raychev, V., & Vechev, M. (2026). “Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?” arXiv preprint arXiv:2602.11988.
    • Anthropic. (2025). “Best Practices for Claude Code.” code.claude.com.
    • AGENTS.md. (2025). “A simple, open format for guiding coding agents.” agents.md.
  • Stop Scripting Your Coding Agent. Start Steering It.

    The most common mistake people make when configuring coding agents isn’t giving them too little instruction. It’s giving them too much of the wrong kind — in the wrong places.

    I’ve watched teams spend weeks crafting elaborate agent instructions — specifying exact tools, prescribing file-by-file workflows, mandating specific libraries for every decision. And I’ve watched those same agents produce brittle, mediocre output that falls apart the moment something unexpected happens.

    Then I’ve seen a different approach — where the instructions read more like an engineering handbook than a script — and the results are dramatically better. The difference isn’t volume. It’s knowing which parts of agent behavior deserve rigid specificity and which ones deserve principles. Most people get this backward.

    The Specificity Trap (Where It Hurts)

    Here’s what it looks like when specificity is applied to the wrong layer of agent instructions:

    “Use the fs module to read the config file at ./config/app.json. Parse the JSON. Check if the database.host field exists. If it doesn’t, throw an error with the message ‘Missing database host’. Use pg to create a connection pool with max 10 connections…”

    This feels thorough. It feels safe. You’re telling the agent exactly what to do, so what could go wrong?

    Everything. Because you’ve turned a reasoning engine into a script runner.

    When the config file moves, the agent breaks. When the project uses YAML instead of JSON, the agent breaks. When the database is MongoDB instead of Postgres, the agent breaks. It can’t adapt because you never told it what you were actually trying to accomplish — you only told it which levers to pull.

    Worse, you’ve consumed the agent’s context window with mechanical instructions, leaving less room for the agent to reason about what actually matters: the intent behind the code.

    What Principle-Based Instructions Look Like

    Compare the above with this:

    “Configuration must be validated at startup. The application should fail fast with clear error messages if required values are missing. Database connections should be pooled and bounded. Never silently swallow configuration errors.”

    Notice what changed. There are no tool names. No file paths. No library choices. Instead, there are architectural principles: fail fast, validate early, bound resources, surface errors.

    A capable coding agent receiving these instructions will:

    • Find the config file wherever it lives
    • Use whatever parsing library matches the format
    • Choose the right database driver for the stack
    • Implement connection pooling appropriate to the ORM in use
    • Write meaningful error messages

    And critically, it will do all of this in a way that’s consistent with the existing codebase, because you’ve freed it to observe and adapt rather than follow a rigid script.
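The kind of code an agent might produce from those four principles can be sketched in plain Python. The file format, required keys, and function names below are illustrative assumptions, not part of any real project:

```python
import json
from pathlib import Path

# Illustrative assumption: JSON config with these required dotted keys.
REQUIRED_KEYS = ("database.host", "database.port")

def _lookup(data, dotted):
    """Resolve a dotted key like 'database.host' against nested dicts."""
    for part in dotted.split("."):
        if not isinstance(data, dict) or part not in data:
            return None
        data = data[part]
    return data

def load_config(path):
    """Validate configuration at startup: fail fast with a clear message."""
    data = json.loads(Path(path).read_text())
    missing = [key for key in REQUIRED_KEYS if _lookup(data, key) is None]
    if missing:
        # Report every missing value at once, then stop; never run half-configured.
        raise SystemExit(f"Missing required config values: {', '.join(missing)}")
    return data
```

Note that nothing here was dictated by the instructions: the format, the library, and the error mechanism were all the agent’s call, while the principles (validate at startup, fail fast, clear messages) shaped the outcome.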

    Why This Works: Agents Are Reasoners, Not Runners

    The reason principle-based instructions outperform specific ones — for the judgment layer — comes down to how modern coding agents actually work.

    A coding agent is, at its core, a reasoning system that operates over a context window. It reads your codebase, understands patterns, and generates code that fits. It’s not executing your instructions like a shell script — it’s interpreting them within the context of everything else it knows.

    When you give it specific instructions (“use library X, call function Y”), you’re overriding its ability to reason. You’re replacing its judgment — which is the thing you’re paying for — with your own prescriptive choices, which may or may not match the reality of the codebase.

    When you give it principles (“errors should be explicit, not silent”), you’re informing its judgment. You’re giving it a framework for making decisions, not making the decisions for it.

    This mirrors how effective engineering organizations work. The best teams don’t have 200-page procedure manuals telling developers which functions to call. They have architectural decision records, design principles, and coding standards that guide judgment. A senior engineer who understands “we prefer composition over inheritance” will make better decisions across a thousand situations than one who was handed a flowchart for ten.

    The same is true for agents. An agent that understands your architectural principles will make better decisions across an entire codebase than one that was given step-by-step instructions for a handful of tasks.

    The Three Layers That Actually Matter

    When restructuring agent instructions, think in three layers:

    1. Architectural Principles

    These are the non-negotiable beliefs your codebase is built on. They don’t change between tasks.

    • “Prefer pure functions. Side effects should be explicit and contained.”
    • “Every public API endpoint must validate its input before processing.”
    • “State changes flow in one direction. Components never mutate shared state directly.”
    • “Test behavior, not implementation. Tests should survive refactors.”

    These create a worldview that shapes every decision the agent makes.

    2. Quality Criteria (State, Not Action)

    Instead of saying “run ESLint” or “add error handling,” describe the state the code should be in when the work is done:

    • “All exported functions have explicit return types.”
    • “Error states are represented in the type system, not as thrown exceptions.”
    • “No function exceeds 40 lines.”
    • “Database queries are parameterized. No string concatenation in SQL.”

    These are binary, testable conditions. The agent can verify them itself. And because they describe outcomes rather than steps, the agent is free to reach those outcomes however the current codebase demands.
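Because these criteria are mechanical, they can be checked with a few lines of standard-library code. A sketch of the “no function exceeds 40 lines” check (the limit and function name are taken from the bullet above; everything else is an illustrative choice):

```python
import ast

MAX_LINES = 40  # the quality criterion: no function exceeds 40 lines

def oversized_functions(source: str) -> list:
    """Return the names of functions whose bodies span more than MAX_LINES lines."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if node.end_lineno - node.lineno + 1 > MAX_LINES:
                offenders.append(node.name)
    return offenders
```

Run in CI or by the agent itself, a check like this turns a quality criterion into a verifiable state rather than a hope.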

    3. Anti-Patterns (What Must Not Happen)

    The most underused category. Telling an agent what to avoid is often more powerful than telling it what to do:

    • “Never store secrets in code or configuration files committed to version control.”
    • “No synchronous I/O on the request path.”
    • “Never catch an exception and do nothing with it.”
    • “No circular dependencies between modules.”

    Anti-patterns create guardrails without constraining creativity. The agent can build whatever solution it wants — it just can’t violate these boundaries.
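The “never catch an exception and do nothing” guardrail is easiest to see side by side. In this sketch, `submit_payment` is a hypothetical stand-in for a real external call:

```python
import logging

logger = logging.getLogger("payments")

def submit_payment(order):
    # Illustrative stand-in for a real payment gateway call.
    raise RuntimeError("gateway timeout")

# Anti-pattern: the failure vanishes and the caller silently gets None.
def charge_silently(order):
    try:
        return submit_payment(order)
    except Exception:
        pass  # never do this

# Within the guardrail: record the error, then let the caller decide.
def charge(order):
    try:
        return submit_payment(order)
    except Exception:
        logger.exception("payment failed for order %s", order)
        raise
```

The anti-pattern says nothing about how `charge` must be built; it only forbids one failure mode, which is exactly what makes it cheap to state and broad in effect.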

    The Compound Effect

    Here’s the insight most people miss: these three layers don’t just help with individual tasks. They compound across your entire workflow.

    An agent that has internalized your architectural principles will write new code that fits the existing codebase without being told to. It will choose the same patterns, the same error handling style, the same naming conventions — not because you specified them for this task, but because the principles naturally guide it there.

    Specific instructions, on the other hand, are disposable. “Use the fs module to read config” helps with exactly one task. “Configuration must be validated at startup and fail fast on missing values” helps with every task that touches configuration, forever.

    This is the difference between training an agent and scripting one. But it’s only half the picture.

    When Specifics Are Exactly Right

    There’s a category of agent behavior where principles fail and specificity is essential. Understanding where that line falls is what separates a good agent configuration from a great one.

    Consider these two types of instruction:

    “Errors should be surfaced clearly to the user.”

    “Every response must begin with a status header. Phase announcements must fire in order. Output format must include sections A, B, and C with these exact delimiters.”

    The first is a principle. The second is a protocol. And the second should be specific, because protocol isn’t a place where you want the agent exercising judgment. You don’t want it creatively reinterpreting your output format. You don’t want it deciding that today, status headers aren’t important.

    This distinction — judgment vs. protocol — is the real line that separates where principles belong from where specifics belong.

    What Counts as Protocol

    Protocol is anything where consistency matters more than adaptability:

    • Output format. If downstream systems, dashboards, or human workflows depend on structured output, specify the exact structure. The agent’s job is to produce it reliably, not reinvent it.
    • Phase progression. If your agent follows a multi-step workflow (gather context → plan → execute → verify), the sequence should be explicit. The agent shouldn’t skip steps because it feels confident.
    • Integration contracts. API calls to external services, webhook payloads, deployment commands — these have no room for creative interpretation. The endpoint is what it is.
    • Safety gates. “Always ask before deleting files” isn’t a principle to internalize — it’s a rule to follow mechanically, every time, without exception.
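Protocol of this kind can also be enforced outside the agent entirely. A minimal sketch of a mechanical format check; the status header and section names are a hypothetical protocol invented for illustration:

```python
# Hypothetical protocol: a status header plus three named sections.
REQUIRED_SECTIONS = ("## Plan", "## Changes", "## Verification")

def check_protocol(output: str) -> list:
    """Return a list of protocol violations; an empty list means compliant."""
    violations = []
    if not output.startswith("STATUS:"):
        violations.append("missing status header")
    for section in REQUIRED_SECTIONS:
        if section not in output:
            violations.append(f"missing section {section!r}")
    return violations
```

If a check like this runs on every agent response, the protocol stops being a suggestion the model might creatively reinterpret and becomes a gate it must pass.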

    What Counts as Judgment

    Judgment is anything where the right answer depends on context the agent discovers at runtime:

    • Which design pattern fits this module
    • How to handle an error in this specific function
    • Whether to refactor or patch
    • Which testing strategy covers this case
    • How to name things consistently with the existing codebase

    These are the decisions where principles outperform scripts — because the agent has information you didn’t have when you wrote the instructions.

    The Refined Rule

    Script the protocol. Principle-ize the judgment.

    Most agent configurations get this backward. They script the judgment (“use library X for task Y”) while leaving protocol vague (“format the output nicely”). The result: an agent that makes rigid decisions about things that should be flexible, and flexible decisions about things that should be rigid.

    The fix is to audit your instructions and sort each one into two buckets:

    | | Protocol | Judgment |
    | --- | --- | --- |
    | Should be | Specific, exact, non-negotiable | Principle-based, adaptive, contextual |
    | Because | Consistency matters more than creativity | Context matters more than consistency |
    | Example | “Output must include a verification section with pass/fail for each criterion” | “Verification must produce evidence, not just assertions” |
    | Failure mode when wrong | Agent reinvents your workflow every run | Agent ignores codebase context to follow your stale script |

    The protocol column keeps your agent predictable where predictability matters. The judgment column keeps it intelligent where intelligence matters. Together, they produce an agent that’s both reliable and adaptive — which is what you actually want.

    Making the Shift

    If you’re currently running an agent with heavily prescriptive instructions, here’s how to migrate:

    Step 1: Extract principles from your specifics. Look at your existing instructions and ask: “What belief about good software does this specific instruction encode?” The instruction “use try/catch around every database call” encodes the principle “database operations must have explicit error handling.” Write down the principle. You’ll decide what to delete in Step 4.

    Step 2: Describe states, not steps. Rewrite procedural instructions as conditions. “Run the linter after making changes” becomes “All code conforms to the project’s lint configuration.” The agent will figure out how to make the state true.

    Step 3: Add anti-patterns from your code review history. Look at the last 20 pull request comments from your team. The recurring feedback — “don’t do X,” “we never Y,” “this should be Z” — those are your anti-patterns. Codify them.

    Step 4: Sort specifics into protocol vs. judgment. Don’t blindly delete all specific instructions. Instead, ask: “Is this governing a judgment call or a protocol?” If it’s judgment — which library to use, how to structure code, what pattern to follow — replace it with a principle. If it’s protocol — output format, phase sequence, integration contracts, safety gates — keep it specific. Make it more specific.

    Step 5: Remove tool references from the judgment bucket. For everything you’ve identified as judgment, delete mentions of specific tools, libraries, and file paths. The agent will discover these from the codebase. If it can’t, your codebase has a discoverability problem worth fixing independently.

    The Bigger Picture

    The shift from specific instructions to architectural principles isn’t a universal rule — it’s a sorting rule. The real skill is knowing which category each instruction belongs to.

    Protocol needs precision. Judgment needs principles. Most agent configurations over-script judgment (making the agent rigid where it should be adaptive) and under-script protocol (making the agent inconsistent where it should be reliable). Flip that, and everything improves.

    The best agent configurations I’ve seen read like two documents stitched together: a tight operational spec for how the agent should behave — its workflow, its output format, its safety constraints — and a loose engineering handbook for how it should think — its design principles, its quality standards, its anti-patterns.

    Give your agent a worldview for judgment and a rulebook for protocol. You’ll get an agent that’s both reliable and intelligent — which is the combination that actually matters.

  • Embracing Modernity: Transitioning from Django to FastAPI for a Cleaner Architecture

    In the ever-evolving landscape of web development, the quest for efficiency, scalability, and cleaner architecture is relentless. Many developers have made a pivotal shift from traditional frameworks like Django to more modern, asynchronous frameworks like FastAPI. This transition not only embraces the latest in web technology but also paves the way for a more streamlined, clean architecture in your applications.

    Why FastAPI?

    FastAPI is a modern, fast (high-performance) web framework for building APIs with Python 3.7+ based on standard Python type hints. The key features of FastAPI include:

    • Speed: It’s designed to be fast and efficient, significantly outperforming traditional Python frameworks.
    • Type Checking: Utilizing Python type hints, it provides automatic request validation and serialization, reducing the risk of bugs.
    • Asynchronous Support: FastAPI supports asynchronous request handling, making it suitable for high-load applications.
    • Simple Yet Powerful: Despite its simplicity, FastAPI offers extensive features out of the box, including dependency injection, WebSockets, GraphQL, etc.

    The Shift from Django

    Django, a long-standing giant in the Python web framework arena, is known for its “batteries-included” approach. However, it follows a more monolithic architecture, which can sometimes lead to complex, tightly coupled codebases.

    By switching to FastAPI, you step into a world of more modular and decoupled design, leading to what is often referred to as a “clean architecture”.

    Embracing Clean Architecture with FastAPI

    1. Separation of Concerns

    FastAPI encourages a separation of concerns. You can easily separate different parts of your application (e.g., models, schemas, business logic, and endpoints), leading to a more maintainable and scalable codebase.

    2. Dependency Injection

    FastAPI’s dependency injection system is a game-changer. It allows for cleaner, more modular code, making it easy to manage shared resources and services (like database sessions).
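The core idea behind FastAPI’s `Depends` can be sketched in plain Python. This is a toy resolver showing the pattern, not FastAPI’s actual implementation, and the provider and handler names are illustrative:

```python
import inspect

class Depends:
    """Marker wrapping a provider callable, in the spirit of FastAPI's Depends."""
    def __init__(self, provider):
        self.provider = provider

def resolve(handler, **kwargs):
    """Call handler, filling Depends-defaulted parameters from their providers."""
    for name, param in inspect.signature(handler).parameters.items():
        if name not in kwargs and isinstance(param.default, Depends):
            kwargs[name] = param.default.provider()
    return handler(**kwargs)

# Illustrative provider and handler:
def get_db_session():
    return {"session": "open"}  # stand-in for a real database session

def list_orders(db=Depends(get_db_session)):
    return f"querying with {db['session']} session"
```

Because the handler declares what it needs rather than constructing it, tests can call `list_orders(db=fake_session)` directly, which is precisely the modularity the section describes.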

    3. Asynchronous Programming

    The asynchronous capabilities of FastAPI are not just about performance; they also contribute to a cleaner architecture. Asynchronous handlers and background tasks can help keep your logic clear and concise, avoiding the callback hell commonly seen in asynchronous code.
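The handler-plus-background-task style reads as follows in a minimal stdlib sketch; the user-registration scenario and the short sleeps standing in for real I/O are illustrative:

```python
import asyncio

async def send_welcome_email(user: str, outbox: list) -> None:
    await asyncio.sleep(0)  # stand-in for a slow email API call
    outbox.append(user)

async def register_user(user: str, outbox: list) -> str:
    # Schedule the email without blocking the response path.
    asyncio.create_task(send_welcome_email(user, outbox))
    return f"registered {user}"

async def main():
    outbox = []
    response = await register_user("ada", outbox)
    await asyncio.sleep(0.01)  # let the background task finish before the loop closes
    return response, outbox

response, outbox = asyncio.run(main())
```

The request handler stays a short, linear coroutine; the slow work is named, scheduled, and kept out of the response path, with no callback nesting.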

    4. Improved Scalability

    With FastAPI’s performance advantages and its support for asynchronous programming, applications are inherently more scalable. This scalability is a critical aspect of clean architecture, ensuring that your application can grow without a proportional increase in complexity.

    5. Clearer API Design

    FastAPI’s automatic interactive API documentation (with Swagger UI and ReDoc) encourages a more thoughtful and clear API design. This clarity is a cornerstone of clean architecture, promoting maintainability and ease of use.

    Transitioning from Django to FastAPI

    While Django offers a robust solution for many web applications, FastAPI’s focus on speed, simplicity, and clean architecture makes it an attractive option for new projects, especially those heavily reliant on API interactions.

    Migrating Your Project

    • Assess Your Requirements: Consider the specific needs of your project. FastAPI shines in API-heavy, high-performance scenarios.
    • Plan Your Architecture: Leverage FastAPI’s strengths in asynchronous handling and dependency injection to plan a clean, modular architecture.
    • Incremental Migration: For existing Django projects, consider an incremental migration, starting with parts of the application that would benefit the most from FastAPI’s features.

    Conclusion

    The transition from Django to FastAPI is more than just a switch in frameworks. It’s a step towards a more modern, efficient, and cleaner architectural approach in building web applications. FastAPI’s design promotes a more decoupled and scalable architecture, making it a compelling choice for modern web applications that demand performance and maintainability.

    Whether you’re starting a new project or considering refactoring an existing one, FastAPI offers a path towards a cleaner, more efficient web development experience.

  • Building Scalable and Testable Python Applications with Modular Monoliths

    As Python continues to grow in popularity, its use in large-scale applications is becoming more common. However, developers often struggle with the transition from small, simple scripts to large, scalable systems. The key to this transition lies in the architectural patterns we choose. By combining the principles of Ports and Adapters, Domain-Driven Design (DDD), and a modular monolith architecture, Python developers can create systems that are both scalable and maintainable. Moreover, these principles facilitate testing without over-reliance on mocks or patches.

    Embracing a Modular Monolith

    The modular monolith is an architectural style that organizes an application into modules, each encapsulating a specific business capability. This approach offers the simplicity of a monolith with the added benefit of clear boundaries that simplify maintenance and scaling.

    Advantages of a Modular Monolith:

    • Simplified Development: Developers can work on one module without the need to understand the entire codebase.
    • Ease of Deployment: A single codebase and deployment unit make the process straightforward.
    • Refined Scaling: Individual modules can be scaled as needed by adjusting resources or optimizing their performance.

    Ports and Adapters: The Foundation of Flexibility

    Ports and Adapters (also known as Hexagonal Architecture) is an architectural pattern that promotes the separation of concerns by decoupling the application’s core logic from external services and platforms.

    Key Concepts:

    • Ports: Interfaces that define how the application communicates with the outside world.
    • Adapters: Implementations that connect the ports to external services like databases, web frameworks, or third-party APIs.

    Domain-Driven Design: Aligning Code with Business

    DDD is a design approach that focuses on the core domain and its logic. It emphasizes collaboration with domain experts to create a ubiquitous language that is reflected in the code, ensuring that the software accurately represents the business requirements.

    Core Components of DDD:

    • Entities: Objects that are defined by their identity.
    • Value Objects: Objects that are defined by their attributes.
    • Aggregates: A cluster of domain objects that can be treated as a single unit.
    • Repositories: Abstractions for accessing domain objects, typically from a database.
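    To make the first two building blocks concrete, here is a minimal sketch (the `Money` and `Customer` names are illustrative, not from the article): a value object compares by its attributes, while an entity compares by its identity.

```python
from dataclasses import dataclass
from uuid import uuid4

@dataclass(frozen=True)
class Money:
    """Value object: defined entirely by its attributes."""
    amount_cents: int   # minor units avoid float rounding issues
    currency: str

class Customer:
    """Entity: defined by its identity, which outlives attribute changes."""
    def __init__(self, name):
        self.id = uuid4()   # identity assigned once, at creation
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Customer) and other.id == self.id

# Two Money values with equal attributes are interchangeable...
assert Money(999, "USD") == Money(999, "USD")
# ...but two customers who happen to share a name remain distinct entities.
assert Customer("Ada") != Customer("Ada")
```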

    Crafting a Testable Python Application

    When you build an application with testing in mind, you inherently create a more maintainable system. Here’s how you can apply these principles to achieve that:

    Define Clear Module Boundaries

    Each module should represent a distinct area of business logic. For example, in an e-commerce application, you might have modules for “Order Processing,” “Inventory Management,” and “Customer Relations.”

    Use Ports for External Interactions

    Define interfaces for all external interactions. This could be anything from sending emails to querying a database. By using ports, you can easily swap out implementations without changing the core logic.
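    For example, a port for outbound email might look like this minimal sketch (the class and method names here are illustrative assumptions, not from the article):

```python
from abc import ABC, abstractmethod

class EmailSenderPort(ABC):
    """Port: how the core logic asks for an email to be sent."""
    @abstractmethod
    def send(self, to_address, subject, body): ...

class ConsoleEmailSender(EmailSenderPort):
    """Adapter: prints instead of sending; an SMTP-backed adapter
    would implement the same port."""
    def send(self, to_address, subject, body):
        print(f"[email to {to_address}] {subject}")

notifier = ConsoleEmailSender()
notifier.send("ada@example.com", "Order confirmed", "Thanks for your order!")
```

    Swapping the console adapter for a real SMTP one changes a single constructor call; the core logic never touches `smtplib` directly.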

    Implement Adapters for Real and Test Environments

    Create adapters for real-world use and separate adapters for testing. The test adapters can use in-memory databases or simple data structures to simulate real data.

    # ports/repository.py
    from abc import ABC, abstractmethod

    class OrderRepositoryPort(ABC):
        """Port: the interface the core logic uses to store and fetch orders."""

        @abstractmethod
        def add_order(self, order): ...

        @abstractmethod
        def get_order(self, order_id): ...


    # adapters/real_repository.py
    class RealOrderRepository(OrderRepositoryPort):
        """Adapter for production: backed by a real database."""

        def add_order(self, order):
            ...  # e.g. an INSERT through your database driver or ORM

        def get_order(self, order_id):
            ...  # e.g. a SELECT through your database driver or ORM


    # adapters/test_repository.py
    # (Note: pytest collects classes named Test*; in a real suite a name
    # like InMemoryOrderRepository avoids collection warnings.)
    class TestOrderRepository(OrderRepositoryPort):
        """Adapter for tests: a plain dict stands in for the database."""

        def __init__(self):
            self.orders = {}

        def add_order(self, order):
            self.orders[order.id] = order

        def get_order(self, order_id):
            return self.orders.get(order_id)


    Encapsulate Business Logic Within Domain Models

    Keep your business logic within your domain models and entities. This makes it easier to test the business logic in isolation.
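    As a sketch of what that looks like (the `Order` fields and rules below are assumptions for illustration, not taken from the article), the invariants live on the model itself:

```python
class Order:
    """Domain model: the business rules live here, not in a service layer."""
    def __init__(self, order_id):
        self.id = order_id
        self.lines = []      # each line: (sku, quantity, unit_price_cents)
        self.placed = False

    def add_line(self, sku, quantity, unit_price_cents):
        if self.placed:
            raise ValueError("cannot modify a placed order")
        self.lines.append((sku, quantity, unit_price_cents))

    def total_cents(self):
        return sum(qty * price for _, qty, price in self.lines)

    def place(self):
        if not self.lines:
            raise ValueError("cannot place an empty order")
        self.placed = True

# The rules are testable with plain instances: no database, mocks, or patches.
order = Order("o-1")
order.add_line("SKU-42", 2, 1500)
order.place()
assert order.total_cents() == 3000
```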

    Application Services as Orchestrators

    Application services should orchestrate the flow of data between domain models and adapters. They should be thin layers that don’t contain business logic themselves but coordinate the application’s operations.
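    The test later in this article constructs an `OrderService` without showing its body; under these principles it might look like the following sketch (the method name `place_order` comes from that test, everything else is an assumption):

```python
class OrderService:
    """Application service: a thin orchestrator with no business rules."""

    def __init__(self, repository):
        # Depends on the repository *port*, so any adapter can be injected.
        self.repository = repository

    def place_order(self, order):
        # Coordinate only: rules live in the domain, persistence in the adapter.
        self.repository.add_order(order)
        return order.id

# Minimal wiring with throwaway stand-ins (illustrative names):
class InMemoryRepo:
    def __init__(self):
        self.orders = {}
    def add_order(self, order):
        self.orders[order.id] = order
    def get_order(self, order_id):
        return self.orders.get(order_id)

class SimpleOrder:
    def __init__(self, order_id):
        self.id = order_id

repo = InMemoryRepo()
service = OrderService(repository=repo)
service.place_order(SimpleOrder("o-1"))
assert repo.get_order("o-1") is not None
```

    Because the service takes the port in its constructor, the production wiring and the test wiring differ only in which adapter is passed in.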

    Testing Without Mocks or Patches

    With this architecture, you can test most of your application by using real instances of your domain models and test adapters. This reduces the need for mocks or patches and can lead to more reliable tests.

    # tests/test_order_processing.py
    def test_order_can_be_placed():
        # Real domain objects plus the in-memory adapter: no mocks, no patches.
        test_repository = TestOrderRepository()
        order_service = OrderService(repository=test_repository)
        order = Order(...)  # construct a valid domain order here

        order_service.place_order(order)

        assert test_repository.get_order(order.id) is not None


    Scaling and Evolving Your Application

    As your application grows, you may find that some modules need to be scaled independently or even broken out into microservices. With the modular monolith architecture, you can do this one module at a time, minimizing risk and disruption.

    Conclusion

    Writing large, scalable Python applications doesn’t have to be daunting. By embracing a modular monolith architecture and applying the principles of Ports and Adapters and DDD from the outset, you create a codebase that is scalable, maintainable, and testable. This approach not only aligns your code with business requirements but also ensures that your application can evolve with those requirements, whether it remains a monolith or transitions to a microservices architecture.