project-docs v2: the map is useless if nobody checks they're reading it

From theory to the battlefield

A few days ago I published a post about project-docs, a documentation framework that gives AI agents structure and memory. It explained the architecture, the 6 roles, and task routing. All very clean on paper.

But a framework isn’t tested with diagrams. It’s tested by building something real.

So we put it to work: Cube Trainer, an interactive Rubik’s cube algorithm trainer with 3D visualization, built from scratch with Astro + React + cubing.js. 4 phases, 24 tasks, 47 commits. All orchestrated by Claude Code agents using project-docs v1.

The result was a working project. But the process exposed a fundamental problem: the agent completed 4 development phases without creating a single session note or updating CURRENT_STATE.md, and it left VISION.md as an empty template. The validator passed because it only checked that files existed, not that they had real content.

project-docs v2 is the result of that lesson.

What worked from day one

The always-read trio saved sessions

The three files that are always loaded — VISION.md, TECH_STACK.md, CONVENTIONS.md — proved to be exactly what the agent needed to stay on track. Across 47 commits there wasn’t a single case of “class component in a hooks project” or “imported a library we don’t use.” The agent knew the stack was Astro + React islands, that TypeScript was in strict mode, that cubing.js should only load on detail pages.

Three files. Under 200 lines combined. That was enough to maintain coherence across weeks.

ADRs prevented re-litigating decisions

Cube Trainer has 3 Architecture Decision Records:

  • ADR-001: Why Astro with React islands (not a SPA)
  • ADR-002: Why cubing.js (despite GPL-3.0 license and bundle size)
  • ADR-003: Why static JSON instead of a CMS or database

Without these documents, the agent would have had to make each decision again every time it touched architecture. With ADRs, the decision was already made and documented with the alternatives considered, and the agent simply respected it.

Task routing worked as a complexity filter

  • Fast path for CSS tweaks, typos, configuration: implementer only, no planning or review overhead
  • Standard path for full features like “add all 57 OLL cases”: planner → implementer → tester → reviewer
  • Full path for architecture decisions: all roles + human checkpoint

This prevented two common problems: overkill on trivial changes and insufficient rigor on high-impact changes.
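
The three paths above can be pictured as a small lookup table. This is only a sketch: the task categories, the `routeTask` name, and the data shape are illustrative assumptions, not the framework's actual routing code.

```javascript
// Hypothetical routing table mirroring the three paths described above.
const ROUTES = {
  fast: { roles: ["implementer"], humanCheckpoint: false },
  standard: {
    roles: ["planner", "implementer", "tester", "reviewer"],
    humanCheckpoint: false,
  },
  // "all" stands for every role defined by the framework.
  full: { roles: "all", humanCheckpoint: true },
};

// Assumed task categories; a real project would define its own.
const PATH_BY_KIND = new Map([
  ["css-tweak", "fast"],
  ["typo", "fast"],
  ["config", "fast"],
  ["feature", "standard"],
  ["architecture", "full"],
]);

function routeTask(kind) {
  // Assumption: unknown kinds fall back to the standard path so that
  // review is never skipped by accident.
  const path = PATH_BY_KIND.get(kind) ?? "standard";
  return { path, ...ROUTES[path] };
}
```

Under these assumptions, `routeTask("typo")` selects the implementer-only path, while `routeTask("architecture")` requires the human checkpoint.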

Build-time validation as an automatic gate

The algorithm validation script (npm run validate) became the most important test in the project. It verified that every notation was parseable by cubing.js and that every algorithm actually solved its case.

The agent generated the data, the validator rejected it, and the agent corrected it without human intervention. The feedback loop worked because the reviewer had “algorithm data not validated” as an explicit rejection condition.
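
To give a feel for that gate, here is a dependency-free sketch of the syntax half of the check. It is a simplified stand-in, not the project's script: the real validator relies on cubing.js for parsing and for confirming that each algorithm solves its case, and this sketch does neither. The regex, the function names, and the `{ id, alg }` data shape are assumptions.

```javascript
// Simplified move grammar (an assumption): face turns, wide turns, slices,
// and rotations, with an optional 2, ', or 2' suffix.
const MOVE_PATTERN = /^(?:[RLUDFB]w?|[rludfb]|[MES]|[xyz])(?:2'?|')?$/;

// Returns the tokens of one algorithm that do not look like valid moves.
function invalidMoves(algorithm) {
  return algorithm
    .replace(/[()]/g, " ") // grouping parentheses carry no moves
    .split(/\s+/)
    .filter((token) => token.length > 0 && !MOVE_PATTERN.test(token));
}

// Returns one entry per case whose algorithm has at least one bad token.
function validateAll(cases) {
  return cases
    .map(({ id, alg }) => ({ id, invalid: invalidMoves(alg) }))
    .filter((result) => result.invalid.length > 0);
}
```

A build script built on this idea would fail the build whenever `validateAll` returns a non-empty list, which is the kind of signal the agent can act on without a human in the loop.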

Where v1 failed: three enforcement gaps

The framework had the right rules. What it didn’t have was a way to enforce them.

Gap 1: No content validation

The validator checked that VISION.md existed and had valid YAML frontmatter. But it didn’t verify that the content was real. A file with <!-- TODO: Describe the long-term purpose --> passed all checks. In Cube Trainer, several “always-read” files still had placeholders after 25 commits.

Gap 2: No commit-time enforcement

Nothing blocked a commit when documentation was incomplete. The agent could commit features without updating a single docs file, and it did. The docs-maintainer role was nominally responsible for those updates, but in practice the workflow had no explicit trigger.

Gap 3: No activity detection

There was no way to detect that a project had gone N commits without session notes. Documentation could be completely abandoned and the framework wouldn’t notice.

The lesson: documentation that relies on the honor system doesn’t work with agents. If it’s not verified, it doesn’t exist.

What changed in v2: main architectural shift

Docs are first-class citizens

The most impactful UX decision. In v1, everything lived in a project-docs/ subdirectory (via git submodule or clone). Every path needed a prefix:

# v1 — 17 files with ~50 occurrences of this variable
{{PROJECT_DOCS_PATH}}/product/VISION.md
{{PROJECT_DOCS_PATH}}/context/TECH_STACK.md
{{PROJECT_DOCS_PATH}}/architecture/SYSTEM_OVERVIEW.md

In v2, docs live at the repo root alongside src/. The template variable {{PROJECT_DOCS_PATH}} was eliminated from all 17 files where it existed:

# v2 — direct paths, zero prefix
product/VISION.md
context/TECH_STACK.md
architecture/SYSTEM_OVERVIEW.md

Only {{PROJECT_NAME}} remains in 2 files (CLAUDE.md and AGENTS.md), replaced by init.mjs.
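
Replacing a single variable like this is a small job. A minimal sketch, assuming a plain string substitution (the `fillTemplate` name and its behavior for unknown variables are assumptions, not taken from init.mjs):

```javascript
// Replaces {{NAME}} tokens with values from the given object.
// Unknown tokens are left untouched so they remain visible in the output.
function fillTemplate(text, values) {
  return text.replace(/\{\{([A-Z_]+)\}\}/g, (match, name) =>
    Object.hasOwn(values, name) ? values[name] : match
  );
}
```

For example, `fillTemplate("# {{PROJECT_NAME}} docs", { PROJECT_NAME: "Cube Trainer" })` yields `# Cube Trainer docs`.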

Installation: from 3 steps to a curl

# v1 — git submodule + node + interactive agent
git submodule add https://github.com/lea2696/project-docs.git project-docs/
node project-docs/scripts/bootstrap.mjs
# v2 — a one-liner that downloads 41 files directly to root
curl -sL https://raw.githubusercontent.com/lea2696/project-docs/main/install.sh | bash

The one-liner downloads and extracts dist/ (41 files) directly into the repo root. Zero framework traces: no foreign .git, no .gitmodules, no extra subdirectory. Docs end up in product/, context/, architecture/, etc. at the same level as src/.

With optional PRD support to speed up bootstrap:

curl -sL .../install.sh | bash -s -- --prd path/to/prd.md

This copies the PRD to plans/initial-prd.md. The bootstrapper uses it as a source to fill docs automatically instead of asking questions.

Relocated scripts

v1 | v2
scripts/bootstrap.mjs | scripts/docs/init.mjs (simpler, supports --prd)
scripts/validate-docs.mjs | scripts/docs/validate-docs.mjs (same logic + --strict)

Documentation enforcement: v2’s main feature

Three layers that solve v1’s three gaps.

Layer 1 — Strict validator (--strict)

The validator now has a strict mode that verifies real content, not just structure:

Placeholder detection: Required files (VISION.md, CURRENT_STATE.md, TECH_STACK.md, CONVENTIONS.md, KNOWN_ISSUES.md) cannot contain template content. If the validator finds <!-- TODO: or <!-- Example: in these files, it’s an error.
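
A minimal sketch of what such a check can look like, assuming a line-by-line scan. The function name is hypothetical, and matching the markers case-insensitively is an assumption, not confirmed behavior of validate-docs.mjs:

```javascript
// Template markers named above. Case-insensitive matching is an assumption.
const PLACEHOLDER_PATTERN = /<!--\s*(TODO|EXAMPLE):/i;

// Returns the 1-based line numbers and text of every leftover placeholder.
function findPlaceholderLines(content) {
  return content
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter(({ text }) => PLACEHOLDER_PATTERN.test(text));
}
```

Any non-empty result for a required file would count as an error in strict mode.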

Date validation: last_updated in frontmatter must be a real ISO date, not the template literal YYYY-MM-DD.
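
One way to express that rule, as a sketch (the `isRealIsoDate` name is hypothetical): accept only the YYYY-MM-DD shape with digits, then confirm the day actually exists on the calendar.

```javascript
// True only for a calendar-valid date written as YYYY-MM-DD.
function isRealIsoDate(value) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(String(value));
  if (!match) return false; // rejects the literal "YYYY-MM-DD" template
  const [year, month, day] = match.slice(1).map(Number);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls invalid days over (Feb 30 becomes Mar 2), so compare back.
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}
```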

Session activity: If there are 6+ commits without session notes in sessions/, it’s an error. The framework detects that documentation has been abandoned.

Grace period: Repos with fewer than 3 commits skip all strict checks. Without this, it would be impossible to make the first commit after installing the framework — all docs are empty.
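
The session-activity rule and the grace period fit together roughly as follows. This is a sketch under stated assumptions: the commit list is passed in newest first, each commit carries the paths it changed, and "without session notes" means no file under sessions/ was touched. How the real validator gathers and counts commits is not shown in this post.

```javascript
const MIN_COMMITS_FOR_STRICT = 3; // fewer than 3 commits: skip strict checks
const MAX_COMMITS_WITHOUT_NOTES = 5; // 6 or more without notes: error

// commits: newest first, each shaped like { files: ["src/a.ts", ...] }.
function commitsSinceLastSessionNote(commits) {
  const index = commits.findIndex((commit) =>
    commit.files.some((file) => file.startsWith("sessions/"))
  );
  return index === -1 ? commits.length : index;
}

function checkSessionActivity(commits) {
  // Grace period: a freshly installed repo has nothing to validate yet.
  if (commits.length < MIN_COMMITS_FOR_STRICT) {
    return { skipped: true, errors: [] };
  }
  const stale = commitsSinceLastSessionNote(commits);
  const errors =
    stale > MAX_COMMITS_WITHOUT_NOTES
      ? [`${stale} commits without session notes in sessions/`]
      : [];
  return { skipped: false, errors };
}
```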

Layer 2 — Pre-commit hook (Claude Code)

{
  "hooks": {
    "PreToolUse": [{
      "matcher": "Bash",
      "hooks": [{
        "type": "command",
        "if": "Bash(git commit *)",
        "command": "node scripts/docs/validate-docs.mjs --strict"
      }]
    }]
  }
}

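The hook intercepts every git commit command the agent issues through its Bash tool and runs the strict validator before allowing it. It is a Claude Code hook rather than a git pre-commit hook, so commits made outside Claude Code don’t pass through it.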

A technical detail that changed the design: Claude Code uses exit code 2 to block actions, not exit code 1. Exit 1 is a non-blocking error that the agent can ignore. Exit 2 is a hard block that prevents the action. This was a discovery during implementation — the validator originally returned exit 1 and the agent silently ignored it.
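
In validator terms, the fix amounts to choosing the exit code deliberately. A minimal sketch, with a hypothetical `summarize` helper that is not taken from validate-docs.mjs:

```javascript
const EXIT_OK = 0;
const EXIT_BLOCK = 2; // the blocking code described above; 1 would be ignorable

// Turns a list of validation errors into an exit code plus a message.
function summarize(errors) {
  if (errors.length === 0) {
    return { exitCode: EXIT_OK, message: "docs OK" };
  }
  return {
    exitCode: EXIT_BLOCK,
    message: errors.map((error) => `docs: ${error}`).join("\n"),
  };
}
```

A script would then print `message` to stderr and set `process.exitCode` to `exitCode`, so that any failure blocks the commit rather than merely warning about it.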

Layer 3 — CI as safety net

docs-lint.yml runs the validator with --strict on every push and PR. If the pre-commit hook is bypassed (or if someone commits from outside Claude Code), CI catches it.

.github/workflows/docs-lint.yml
- run: node scripts/docs/validate-docs.mjs --strict

The result: incomplete or abandoned documentation now fails a check at commit time and again in CI, so it can no longer slip through unnoticed. The honor system became real enforcement.

Frontmatter as machine-readable metadata

Every documentation file has a YAML block that agents interpret programmatically:

---
status: draft | active | completed | abandoned
owner: human | planner | implementer | docs-maintainer
last_updated: 2026-04-06
read_policy: always | on-demand | never-default
---

In v1, the reading policy existed only as free text in the README. In v2, read_policy is a structured field. The status field indicates whether a document is current. The owner field identifies who’s responsible for keeping it updated. And last_updated is now validated — it can’t be a placeholder.
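
To illustrate what "interpreting programmatically" can mean here, this sketch reads the block with a deliberately simple `key: value` parser and checks the three enumerated fields. It is not a full YAML parser, and the function names are assumptions. The allowed values come from the block above.

```javascript
const ALLOWED = {
  status: ["draft", "active", "completed", "abandoned"],
  owner: ["human", "planner", "implementer", "docs-maintainer"],
  read_policy: ["always", "on-demand", "never-default"],
};

// Extracts flat "key: value" pairs from a leading --- block, or null if absent.
function parseFrontmatter(content) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  if (!match) return null;
  const fields = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    fields[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  return fields;
}

// Returns one message per enumerated field that is missing or invalid.
function frontmatterErrors(content) {
  const fields = parseFrontmatter(content);
  if (fields === null) return ["missing frontmatter block"];
  const errors = [];
  for (const [key, allowed] of Object.entries(ALLOWED)) {
    if (!allowed.includes(fields[key])) {
      errors.push(`${key}: expected one of ${allowed.join(", ")}`);
    }
  }
  return errors;
}
```

With fields parsed this way, an agent could load every file whose `read_policy` is `always` at session start and defer the rest.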

Templates with concrete examples

Generic TODOs were replaced with commented examples:

<!-- EXAMPLE: "Interactive 3D Rubik's cube algorithm trainer
focused on CFOP method. Targets intermediate cubers
transitioning from beginner method." -->

The difference in output quality between “Describe your project” and a real 3-line example is enormous. But now the strict validator also verifies that those examples get replaced with real content — they can’t stay as-is.

ADR-003: Documentation Enforcement

The framework’s own decisions are documented as ADRs:

  • ADR-001: Two-layer architecture (scaffold + root) — the autodiscovery problem
  • ADR-002: OpenCode support removal — account ban risk
  • ADR-003: Documentation enforcement — why three layers, why exit code 2, why grace period

The framework documents itself with the same format it proposes for projects using it.

Update flow

curl -sL .../install.sh | bash -s -- --update

This only updates infrastructure (scripts, agents, rules, skills). It doesn’t touch docs content, CLAUDE.md, AGENTS.md, or settings.json, so your actual documentation stays intact.

Concrete results in Cube Trainer

To put numbers on what worked (and what didn’t):

  • 47 commits with consistent conventions (feat:, data:, docs:)
  • 4 phases completed without architecture rollbacks
  • 0 decisions re-litigated thanks to ADRs
  • 78 algorithms (PLL + OLL + F2L) automatically validated on every build
  • 3 always-read files were enough to maintain cross-session coherence
  • 25+ commits without a single session note — the gap that motivated v2
  • VISION.md as empty template after 4 phases — the validator never caught it

The agent wrote correct code but completely ignored documentation. Code feedback mechanisms (build-time validation, reviewer rejection conditions) worked. Documentation feedback mechanisms didn’t exist.

The attitude shift

v1: “Agents don’t need more freedom, they need a better map.”

v2: “The map is useless if nobody checks they’re reading it.”

The biggest difference between v1 and v2 isn’t technical. It’s attitudinal. v1 was a good-faith contract: templates are there, folders exist, roles are defined. If something doesn’t work, it’s the agent’s fault for not reading.

v2 assumes that if it’s not verified, it doesn’t exist. Placeholders in always-read files → validation error. 6 commits without session notes → error. Template literal date in frontmatter → error. Pre-commit hook that blocks with exit code 2 → the agent can’t ignore it.

Documentation stopped being a “nice to have” and became an artifact with verifiable invariants, at the same level as tests or the linter.

What’s next

v2 solves the enforcement gaps. But there are things pending:

  • Active memory: RECURRING_MISTAKES needs an automatic trigger, not reliance on agent discipline
  • Automatic sessions: Generating session notes should be a post-commit hook, not a manual task
  • Usage metrics: Knowing which documents each agent reads and which it ignores would help optimize the loading policy

The framework is at github.com/lea2696/project-docs. It installs with a curl, and if you already used v1, --update upgrades the infra without touching your docs.

Repo: github.com/lea2696/project-docs