
Agent-first Django project generation playbook

A practical, step-by-step playbook for technical founders and agencies to go from prompt to a production-ready Django iteration with AI agents.

If you’re building Django products with AI agents, speed is no longer the hard part. Reliability is.

Most teams can now get from blank folder to “something running” in a day. The challenge is getting to production-ready iteration #1 without accumulating hidden debt: brittle architecture, insecure defaults, silent test gaps, and code no human actually understands.

This playbook is a practical operating system for technical founders and agencies using agent-driven workflows. It is optimized for one outcome:

Ship a Django project that is fast to generate, safe to deploy, and maintainable after the demo.


Who this is for

  • Technical founders spinning up SaaS MVPs fast
  • Agencies delivering client apps with small teams
  • Product engineers leading AI-assisted implementation
  • Teams using tools like Codex/Claude/other coding agents to generate substantial code

If you already know Django basics, this gives you a repeatable execution loop from prompt to first production release candidate.


The 7-stage workflow: Prompt → Production-ready iteration

Stage 1) Define the execution spec (before any code)

Treat this as your contract with the agent.

Create a one-page spec with:

  1. Product slice (what exactly gets built in this iteration)
  2. Non-goals (what must not be built yet)
  3. Architecture constraints (Django + DB + queue + auth + deployment target)
  4. Quality bar (tests, lint, type checks, security checks)
  5. Done criteria (what must pass before merge/deploy)

Example

  • Product slice: “Organization onboarding + invite by email + first project creation.”
  • Non-goals: “No billing, no RBAC matrix beyond org admin/member, no mobile API.”
  • Constraints: “Django + Postgres + Celery + Redis, no custom auth backend.”
  • Quality bar: “Ruff, mypy (core modules), pytest >80% on changed modules, migrations reviewed.”
  • Done: “CI green, no high-severity findings, staging smoke test passed.”

Why this matters: most agent failures start with vague specs, not bad code generation.


Stage 2) Generate the scaffold with opinionated defaults

Prompt the agent to generate the project skeleton, but lock in guardrails up front.

Required baseline for generated Django apps:

  • Split settings (base.py, dev.py, prod.py)
  • 12-factor env handling
  • Postgres by default (avoid SQLite drift)
  • Structured logging
  • Health endpoint
  • Docker + compose for local parity
  • Basic CI pipeline (lint/test)
  • Pre-commit hooks
  • Seed admin user flow

At this point, do not ask the agent to build product features. Only foundation.

Output checkpoint: repository runs locally end-to-end in one command and tests pass.
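The "structured logging" item above can be covered with the standard library alone. A minimal sketch, assuming a JSON-lines output format (the field names here are illustrative, not a fixed schema):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line (machine-parseable)."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

Wire this into Django's `LOGGING` setting in `base.py` so dev and prod emit the same shape, and dashboards never depend on ad-hoc print formatting.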


Stage 3) Lock architecture decisions before feature generation

Before the first feature prompt, freeze these decisions in ARCHITECTURE.md:

  • App boundaries (e.g., accounts, organizations, projects, billing)
  • Domain model ownership (which app owns each model)
  • Event/async boundaries (what enters Celery and why)
  • Permissions strategy (object-level checks, service layer, or policy module)
  • API style (Django Ninja / DRF / server-rendered forms)

Then force the agent to reference this file in every implementation prompt.

Rule: no new app/module creation unless explicitly approved in prompt.

This single rule prevents “agent sprawl architecture,” where each prompt invents a new pattern.


Stage 4) Implement in vertical slices, not horizontal layers

Bad pattern with agents:

  • “Generate all models”
  • then “Generate all serializers”
  • then “Generate all views”

This produces disconnected layers in which no real user flow actually works end-to-end.

Use vertical slices instead:

  1. One user story
  2. Domain + API + tests in one pass
  3. Review + fix
  4. Merge

Slice template

  • Story: “Org admin can invite teammate by email.”
  • Includes:
      • model updates
      • business service
      • endpoint/view
      • email trigger
      • tests (unit + integration)
      • docs update

Vertical slices maximize coherence and reduce orphan code.
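As a sketch of what the slice's "business service" layer can look like, here is the invite story as framework-free Python. `Organization`, `invite_teammate`, and the `send_email` callback are hypothetical names; a real implementation would use Django models, a transaction, and a Celery task for the email.

```python
from dataclasses import dataclass, field

@dataclass
class Organization:
    name: str
    member_emails: set = field(default_factory=set)
    pending_invites: set = field(default_factory=set)

class InviteError(Exception):
    """Raised when an invite violates a business rule."""

def invite_teammate(org, inviter_email, invitee_email, send_email):
    """Invite a teammate; returns True if a new invite was created."""
    # Authz: only existing members may invite into the org.
    if inviter_email not in org.member_emails:
        raise InviteError("inviter is not a member of this organization")
    if invitee_email in org.member_emails:
        raise InviteError("invitee is already a member")
    # Idempotent: re-inviting a pending address is a no-op, not an error.
    if invitee_email in org.pending_invites:
        return False
    org.pending_invites.add(invitee_email)
    send_email(invitee_email)  # in Django, a send_mail call or Celery task
    return True
```

Keeping the rule ("members invite, duplicates are no-ops") in one function is what makes the slice's unit, integration, and abuse-path tests cheap to write.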


Stage 5) Run the “hardening pass” before calling it done

After features are generated, run a dedicated hardening cycle. This is where most real quality gains happen.

Hardening checklist:

  • Regenerate migrations and inspect diffs manually
  • Validate queryset performance (select_related, prefetch_related)
  • Add idempotency for retry-prone operations (webhooks, background tasks)
  • Verify authz on every mutating endpoint
  • Validate input constraints server-side (never rely on frontend)
  • Add rate limits where abuse risk exists
  • Remove dead code and duplicate utility functions
  • Tighten exception handling and error observability

Agents are good at producing “happy path complete” systems. Hardening makes them production-safe.
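For the idempotency item, one common shape is a keyed result store consulted before re-running a retryable operation. A minimal in-memory sketch (production code would back this with a Postgres table or Redis, keyed by webhook ID or task ID):

```python
class IdempotencyStore:
    """Run an operation at most once per key, replaying the stored result."""

    def __init__(self):
        self._results = {}  # in-memory stand-in for a Postgres/Redis table

    def run_once(self, key, operation):
        if key in self._results:
            return self._results[key]  # retry path: replay, do not re-execute
        result = operation()
        self._results[key] = result
        return result
```

The point is that webhook redelivery or a Celery retry hits the replay branch instead of charging a card or sending an email twice.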


Stage 6) Human review protocol (non-negotiable)

Use a structured review rubric. Don’t do free-form “looks okay” reviews.

Human reviewer rubric

  1. Correctness — does behavior match acceptance criteria?
  2. Security — authn/authz, secrets, validation, injection risks
  3. Data safety — migrations reversible? destructive operations guarded?
  4. Operability — logs, metrics, health checks, failure visibility
  5. Maintainability — naming clarity, module boundaries, test readability

Require reviewer sign-off for each rubric section in the PR description.

This keeps your quality bar stable across different agents.


Stage 7) First production-ready iteration release

Only release iteration #1 when all are true:

  • CI green (lint, tests, type checks within the configured scope)
  • Security scan has no high-severity unresolved items
  • Migrations reviewed and tested on a staging clone
  • Rollback strategy documented
  • Post-deploy smoke test runbook executed

At this stage, you’re not shipping “perfect.” You’re shipping stable enough to learn from real users.


Common failure modes (and concrete mitigations)

1) Prompt drift across sessions

Symptom: every prompt changes conventions, creating inconsistent code.

Mitigation: maintain a persistent AGENT_RULES.md with coding standards and architecture constraints. Include it in every prompt.


2) Hallucinated dependencies or APIs

Symptom: generated code references nonexistent methods/packages.

Mitigation: add a strict “dependency allowlist” and force agents to run tests immediately after generation. Reject code with unresolved imports.


3) Security-by-assumption

Symptom: endpoints generated with weak permission checks or missing ownership checks.

Mitigation: enforce endpoint checklist:

  • who can call it?
  • what resource scope?
  • what tenant/org boundary?
  • how is access denied logged?

Require tests for unauthorized access for every write endpoint.
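Such an abuse-path test can be as small as asserting the deny branches of a permission helper. A framework-free sketch with a hypothetical `can_modify_project` check (a Django version would hit the endpoint itself with an authenticated test client and expect 403s):

```python
def can_modify_project(actor_org_id, actor_role, project_org_id):
    """Deny across tenant boundaries first, then check the role."""
    if actor_org_id != project_org_id:
        return False  # cross-tenant access is always denied
    return actor_role in {"admin", "member"}

# Abuse-path assertions: the deny branches are the point of the test.
assert can_modify_project(1, "admin", 1) is True
assert can_modify_project(1, "admin", 2) is False   # wrong tenant
assert can_modify_project(1, "viewer", 1) is False  # insufficient role
```

Ordering the tenant check before the role check keeps "right role, wrong org" from ever passing.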


4) Migration hazards

Symptom: risky migrations (drops, table rewrites) generated casually.

Mitigation: force migration review gate:

  • highlight irreversible operations
  • split schema/data migrations when needed
  • require backup + rollback note before apply

5) Test theater

Symptom: lots of tests, low signal.

Mitigation: mandate three classes per critical slice:

  • business-rule unit tests
  • endpoint integration tests
  • permission/abuse-path tests

Track mutation-prone modules, not just global coverage percentage.


6) Operational blind spots

Symptom: app “works,” but failures are invisible in production.

Mitigation: add at minimum:

  • request IDs in logs
  • structured exception logging
  • task retry visibility
  • basic service-level dashboard (errors, latency, queue depth)
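The "request IDs in logs" item can be wired with `contextvars` and a logging filter, all from the standard library; `bind_request_id` here is a hypothetical hook you would call from a Django middleware at the start of each request.

```python
import contextvars
import logging
import uuid

# Holds the current request's ID for the duration of that request.
request_id_var = contextvars.ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    """Stamp every log record with the active request ID."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True

def bind_request_id():
    """Middleware hook: generate and bind a short ID at request start."""
    rid = uuid.uuid4().hex[:8]
    request_id_var.set(rid)
    return rid
```

With the filter attached, a formatter can include `%(request_id)s`, so every log line from one request is correlatable in production.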

Guardrails for quality, security, and maintainability

Use this as your default guardrail stack.

Quality guardrails

  • Pre-commit: format + lint + import order
  • CI: lint + tests + minimal type checks
  • Required PR template with acceptance checklist
  • Small PRs (<500 LOC net when possible)

Security guardrails

  • Secrets only via env/secret manager
  • CSRF/session configuration reviewed per environment
  • Dependency scanning enabled
  • Tenant boundary checks tested explicitly
  • Rate limit + audit logging for sensitive actions

Maintainability guardrails

  • Service layer for non-trivial business logic
  • Keep views thin, models focused, services explicit
  • No duplicated utility modules across apps
  • Every complex module has a short architecture note

Recommended operating loop: Human ↔ Agent

This loop works consistently in real teams:

  1. Human defines slice + constraints
  2. Agent generates implementation + tests
  3. Agent self-checks (lint, tests, static checks)
  4. Human reviews via rubric
  5. Agent applies review deltas
  6. Human signs off + merge/deploy
  7. Post-release notes update prompt/spec library

Cadence recommendation

  • Morning: define 1–2 slices only
  • Midday: implementation + review loops
  • End of day: hardening + merge or explicit carryover

This prevents endless generation with no quality closure.


The “production-ready iteration #1” checklist (cheatsheet)

Copy/paste this into your PR template.

Scope & product

  • [ ] User stories for this iteration are explicit
  • [ ] Non-goals are documented
  • [ ] No out-of-scope feature creep

Code & architecture

  • [ ] Changes follow existing app boundaries
  • [ ] No unapproved framework/pattern introduced
  • [ ] Complex logic moved to service layer

Data & migrations

  • [ ] Migrations reviewed manually
  • [ ] Backward/rollback impact noted
  • [ ] Seed/backfill steps documented (if needed)

Security

  • [ ] Authn/authz checked for every write endpoint
  • [ ] Input validation enforced server-side
  • [ ] Sensitive actions logged
  • [ ] Secrets/config checked for env safety

Quality & tests

  • [ ] Lint + tests pass in CI
  • [ ] Permission abuse-path tests included
  • [ ] Critical flows have integration coverage

Operations

  • [ ] Health check endpoint working
  • [ ] Error logging confirms useful context
  • [ ] Post-deploy smoke tests documented and run

Decision

  • [ ] Ship now
  • [ ] Ship with guardrail TODOs
  • [ ] Hold release (blocked by ___)

Appendix: Prompt templates you can reuse

Prompt A — Project foundation generation

```text
You are generating a production-oriented Django starter.

Constraints:
- Python 3.12+, Django, Postgres, Redis, Celery
- Split settings: base/dev/prod
- Docker + docker-compose for local parity
- Pre-commit, Ruff, pytest, basic CI
- Structured logging and /health endpoint
- No product features yet

Output:
1) folder structure
2) config files
3) minimal runnable app
4) setup instructions
5) tests proving app boots and health endpoint passes
```

Prompt B — Vertical slice implementation

```text
Implement this user story as one vertical slice:

Story: Org admin can invite teammate by email.

Follow ARCHITECTURE.md and AGENT_RULES.md strictly.
Do not create new apps/modules unless requested.

Deliver:
- model/service/view changes
- tests: unit + integration + unauthorized access
- migration files
- brief docs update in docs/changes.md

Run lint/tests and provide exact command outputs.
```

Prompt C — Hardening pass

```text
Perform a hardening pass on this branch.

Focus on:
- authz gaps
- migration safety
- query performance hotspots
- idempotency for retryable operations
- error handling and observability

Return:
1) findings by severity
2) concrete code changes
3) residual risks we accept for iteration #1
```


Final note

Agent-first Django development is not “just faster coding.” It’s a process design problem.

If you standardize specs, constrain architecture, enforce review rubrics, and ship in vertical slices, agents become a force multiplier instead of a cleanup burden.

Build fast, but make your first production iteration boringly reliable.