
Agent-first Django project generation playbook

A practical, step-by-step playbook for technical founders and agencies to go from prompt to a production-ready Django iteration with AI agents.

If you’re building Django products with AI agents, speed is no longer the hard part. Reliability is.

Most teams can now get from blank folder to “something running” in a day. The challenge is getting to production-ready iteration #1 without accumulating hidden debt: brittle architecture, insecure defaults, silent test gaps, and code no human actually understands.

This playbook is a practical operating system for technical founders and agencies using agent-driven workflows. It is optimized for one outcome:

Ship a Django project that is fast to generate, safe to deploy, and maintainable after the demo.


Who this is for

  • Technical founders spinning up SaaS MVPs fast
  • Agencies delivering client apps with small teams
  • Product engineers leading AI-assisted implementation
  • Teams using tools like Codex/Claude/other coding agents to generate substantial code

If you already know Django basics, this gives you a repeatable execution loop from prompt to first production release candidate.


The 7-stage workflow: Prompt → Production-ready iteration

Stage 1) Define the execution spec (before any code)

Treat this as your contract with the agent.

Create a one-page spec with:

  1. Product slice (what exactly gets built in this iteration)
  2. Non-goals (what must not be built yet)
  3. Architecture constraints (Django + DB + queue + auth + deployment target)
  4. Quality bar (tests, lint, type checks, security checks)
  5. Done criteria (what must pass before merge/deploy)

Example

  • Product slice: “Organization onboarding + invite by email + first project creation.”
  • Non-goals: “No billing, no RBAC matrix beyond org admin/member, no mobile API.”
  • Constraints: “Django + Postgres + Celery + Redis, no custom auth backend.”
  • Quality bar: “Ruff, mypy (core modules), pytest >80% on changed modules, migrations reviewed.”
  • Done: “CI green, no high-severity findings, staging smoke test passed.”

Why this matters: most agent failures start with vague specs, not bad code generation.


Stage 2) Generate the scaffold with opinionated defaults

Prompt the agent to generate the project skeleton, but lock in guardrails up front.

Required baseline for generated Django apps:

  • Split settings (base.py, dev.py, prod.py)
  • 12-factor env handling
  • Postgres by default (avoid SQLite drift)
  • Structured logging
  • Health endpoint
  • Docker + compose for local parity
  • Basic CI pipeline (lint/test)
  • Pre-commit hooks
  • Seed admin user flow

At this point, do not ask the agent to build product features. Only foundation.

Output checkpoint: repository runs locally end-to-end in one command and tests pass.
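The "structured logging" item above can be covered with the standard library alone. A minimal sketch, assuming a JSON-lines output format (the field names here are illustrative, not a fixed schema):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line (machine-parseable)."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)
```

Wire this into Django's `LOGGING` setting in `base.py` so dev and prod emit the same shape, and dashboards never depend on ad-hoc print formatting.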


Stage 3) Lock architecture decisions before feature generation

Before the first feature prompt, freeze these decisions in ARCHITECTURE.md:

  • App boundaries (e.g., accounts, organizations, projects, billing)
  • Domain model ownership (which app owns each model)
  • Event/async boundaries (what enters Celery and why)
  • Permissions strategy (object-level checks, service layer, or policy module)
  • API style (Django Ninja / DRF / server-rendered forms)

Then force the agent to reference this file in every implementation prompt.

Rule: no new app/module creation unless explicitly approved in prompt.

This single rule prevents “agent sprawl architecture,” where each prompt invents a new pattern.


Stage 4) Implement in vertical slices, not horizontal layers

Bad pattern with agents:

  • “Generate all models”
  • then “Generate all serializers”
  • then “Generate all views”

This produces disconnected layers in which no real user flow actually works end-to-end.

Use vertical slices instead:

  1. One user story
  2. Domain + API + tests in one pass
  3. Review + fix
  4. Merge

Slice template

  • Story: “Org admin can invite teammate by email.”
  • Includes:
      • model updates
      • business service
      • endpoint/view
      • email trigger
      • tests (unit + integration)
      • docs update

Vertical slices maximize coherence and reduce orphan code.
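As a sketch of what the slice's "business service" layer can look like, here is the invite story as framework-free Python. `Organization`, `invite_teammate`, and the `send_email` callback are hypothetical names; a real implementation would use Django models, a transaction, and a Celery task for the email.

```python
from dataclasses import dataclass, field

@dataclass
class Organization:
    name: str
    member_emails: set = field(default_factory=set)
    pending_invites: set = field(default_factory=set)

class InviteError(Exception):
    """Raised when an invite violates a business rule."""

def invite_teammate(org, inviter_email, invitee_email, send_email):
    """Invite a teammate; returns True if a new invite was created."""
    # Authz: only existing members may invite into the org.
    if inviter_email not in org.member_emails:
        raise InviteError("inviter is not a member of this organization")
    if invitee_email in org.member_emails:
        raise InviteError("invitee is already a member")
    # Idempotent: re-inviting a pending address is a no-op, not an error.
    if invitee_email in org.pending_invites:
        return False
    org.pending_invites.add(invitee_email)
    send_email(invitee_email)  # in Django, a send_mail call or Celery task
    return True
```

Keeping the rule ("members invite, duplicates are no-ops") in one function is what makes the slice's unit, integration, and abuse-path tests cheap to write.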


Stage 5) Run the “hardening pass” before calling it done

After features are generated, run a dedicated hardening cycle. This is where most real quality gains happen.

Hardening checklist:

  • Regenerate migrations and inspect diffs manually
  • Validate queryset performance (select_related, prefetch_related)
  • Add idempotency for retry-prone operations (webhooks, background tasks)
  • Verify authz on every mutating endpoint
  • Validate input constraints server-side (never rely on frontend)
  • Add rate limits where abuse risk exists
  • Remove dead code and duplicate utility functions
  • Tighten exception handling and error observability

Agents are good at producing “happy path complete” systems. Hardening makes them production-safe.
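For the idempotency item, one common shape is a keyed result store consulted before re-running a retryable operation. A minimal in-memory sketch (production code would back this with a Postgres table or Redis, keyed by webhook ID or task ID):

```python
class IdempotencyStore:
    """Run an operation at most once per key, replaying the stored result."""

    def __init__(self):
        self._results = {}  # in-memory stand-in for a Postgres/Redis table

    def run_once(self, key, operation):
        if key in self._results:
            return self._results[key]  # retry path: replay, do not re-execute
        result = operation()
        self._results[key] = result
        return result
```

The point is that webhook redelivery or a Celery retry hits the replay branch instead of charging a card or sending an email twice.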


Stage 6) Human review protocol (non-negotiable)

Use a structured review rubric. Don’t do free-form “looks okay” reviews.

Human reviewer rubric

  1. Correctness — does behavior match acceptance criteria?
  2. Security — authn/authz, secrets, validation, injection risks
  3. Data safety — migrations reversible? destructive operations guarded?
  4. Operability — logs, metrics, health checks, failure visibility
  5. Maintainability — naming clarity, module boundaries, test readability

Require reviewer sign-off for each rubric section in the PR description.

This keeps your quality bar stable across different agents.


Stage 7) First production-ready iteration release

Only release iteration #1 when all are true:

  • CI green (lint, tests, type checks within the configured scope)
  • Security scan has no high-severity unresolved items
  • Migrations reviewed and tested on a staging clone
  • Rollback strategy documented
  • Post-deploy smoke test runbook executed

At this stage, you’re not shipping “perfect.” You’re shipping stable enough to learn from real users.


Common failure modes (and concrete mitigations)

1) Prompt drift across sessions

Symptom: every prompt changes conventions, creating inconsistent code.

Mitigation: maintain a persistent AGENT_RULES.md with coding standards and architecture constraints. Include it in every prompt.


2) Hallucinated dependencies or APIs

Symptom: generated code references nonexistent methods/packages.

Mitigation: add a strict “dependency allowlist” and force agents to run tests immediately after generation. Reject code with unresolved imports.


3) Security-by-assumption

Symptom: endpoints generated with weak permission checks or missing ownership checks.

Mitigation: enforce endpoint checklist:

  • who can call it?
  • what resource scope?
  • what tenant/org boundary?
  • how is access denied logged?

Require tests for unauthorized access for every write endpoint.
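Such an abuse-path test can be as small as asserting the deny branches of a permission helper. A framework-free sketch with a hypothetical `can_modify_project` check (a Django version would hit the endpoint itself with an authenticated test client and expect 403s):

```python
def can_modify_project(actor_org_id, actor_role, project_org_id):
    """Deny across tenant boundaries first, then check the role."""
    if actor_org_id != project_org_id:
        return False  # cross-tenant access is always denied
    return actor_role in {"admin", "member"}

# Abuse-path assertions: the deny branches are the point of the test.
assert can_modify_project(1, "admin", 1) is True
assert can_modify_project(1, "admin", 2) is False   # wrong tenant
assert can_modify_project(1, "viewer", 1) is False  # insufficient role
```

Ordering the tenant check before the role check keeps "right role, wrong org" from ever passing.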


4) Migration hazards

Symptom: risky migrations (drops, table rewrites) generated casually.

Mitigation: force migration review gate:

  • highlight irreversible operations
  • split schema/data migrations when needed
  • require backup + rollback note before apply

5) Test theater

Symptom: lots of tests, low signal.

Mitigation: mandate three classes per critical slice:

  • business-rule unit tests
  • endpoint integration tests
  • permission/abuse-path tests

Track mutation-prone modules, not just global coverage percentage.


6) Operational blind spots

Symptom: app “works,” but failures are invisible in production.

Mitigation: add at minimum:

  • request IDs in logs
  • structured exception logging
  • task retry visibility
  • basic service-level dashboard (errors, latency, queue depth)
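The "request IDs in logs" item can be wired with `contextvars` and a logging filter, all from the standard library; `bind_request_id` here is a hypothetical hook you would call from a Django middleware at the start of each request.

```python
import contextvars
import logging
import uuid

# Holds the current request's ID for the duration of that request.
request_id_var = contextvars.ContextVar("request_id", default="-")

class RequestIdFilter(logging.Filter):
    """Stamp every log record with the active request ID."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True

def bind_request_id():
    """Middleware hook: generate and bind a short ID at request start."""
    rid = uuid.uuid4().hex[:8]
    request_id_var.set(rid)
    return rid
```

With the filter attached, a formatter can include `%(request_id)s`, so every log line from one request is correlatable in production.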

Guardrails for quality, security, and maintainability

Use this as your default guardrail stack.

Quality guardrails

  • Pre-commit: format + lint + import order
  • CI: lint + tests + minimal type checks
  • Required PR template with acceptance checklist
  • Small PRs (<500 LOC net when possible)

Security guardrails

  • Secrets only via env/secret manager
  • CSRF/session configuration reviewed per environment
  • Dependency scanning enabled
  • Tenant boundary checks tested explicitly
  • Rate limit + audit logging for sensitive actions

Maintainability guardrails

  • Service layer for non-trivial business logic
  • Keep views thin, models focused, services explicit
  • No duplicated utility modules across apps
  • Every complex module has a short architecture note

Recommended operating loop: Human ↔ Agent

This loop works consistently in real teams:

  1. Human defines slice + constraints
  2. Agent generates implementation + tests
  3. Agent self-checks (lint, tests, static checks)
  4. Human reviews via rubric
  5. Agent applies review deltas
  6. Human signs off + merge/deploy
  7. Post-release notes update prompt/spec library

Cadence recommendation

  • Morning: define 1–2 slices only
  • Midday: implementation + review loops
  • End of day: hardening + merge or explicit carryover

This prevents endless generation with no quality closure.


The “production-ready iteration #1” checklist (cheatsheet)

Copy/paste this into your PR template.

Scope & product

  • [ ] User stories for this iteration are explicit
  • [ ] Non-goals are documented
  • [ ] No out-of-scope feature creep

Code & architecture

  • [ ] Changes follow existing app boundaries
  • [ ] No unapproved framework/pattern introduced
  • [ ] Complex logic moved to service layer

Data & migrations

  • [ ] Migrations reviewed manually
  • [ ] Backward/rollback impact noted
  • [ ] Seed/backfill steps documented (if needed)

Security

  • [ ] Authn/authz checked for every write endpoint
  • [ ] Input validation enforced server-side
  • [ ] Sensitive actions logged
  • [ ] Secrets/config checked for env safety

Quality & tests

  • [ ] Lint + tests pass in CI
  • [ ] Permission abuse-path tests included
  • [ ] Critical flows have integration coverage

Operations

  • [ ] Health check endpoint working
  • [ ] Error logging confirms useful context
  • [ ] Post-deploy smoke tests documented and run

Decision

  • [ ] Ship now
  • [ ] Ship with guardrail TODOs
  • [ ] Hold release (blocked by ___)

Appendix: Prompt templates you can reuse

Prompt A — Project foundation generation

```text
You are generating a production-oriented Django starter.

Constraints:
- Python 3.12+, Django, Postgres, Redis, Celery
- Split settings: base/dev/prod
- Docker + docker-compose for local parity
- Pre-commit, Ruff, pytest, basic CI
- Structured logging and /health endpoint
- No product features yet

Output:
1) folder structure
2) config files
3) minimal runnable app
4) setup instructions
5) tests proving app boots and health endpoint passes
```

Prompt B — Vertical slice implementation

```text
Implement this user story as one vertical slice:

Story: Org admin can invite teammate by email.

Follow ARCHITECTURE.md and AGENT_RULES.md strictly.
Do not create new apps/modules unless requested.

Deliver:
- model/service/view changes
- tests: unit + integration + unauthorized access
- migration files
- brief docs update in docs/changes.md

Run lint/tests and provide exact command outputs.
```

Prompt C — Hardening pass

```text
Perform a hardening pass on this branch.

Focus on:
- authz gaps
- migration safety
- query performance hotspots
- idempotency for retryable operations
- error handling and observability

Return:
1) findings by severity
2) concrete code changes
3) residual risks we accept for iteration #1
```


Final note

Agent-first Django development is not “just faster coding.” It’s a process design problem.

If you standardize specs, constrain architecture, enforce review rubrics, and ship in vertical slices, agents become a force multiplier instead of a cleanup burden.

Build fast, but make your first production iteration boringly reliable.