Behind the Stories

How It Works

Every story on The AI Files is researched, written, fact-checked, and illustrated by a pipeline of 13 specialized AI agents — orchestrated through Claude Code, with a human editor at every gate.

41 Published Stories
13 AI Agents
7 Quality Gates
12 Pipeline Stages

Publish true, well-sourced stories about AI with analytical rigor and no hype.

Every claim must be traceable to a primary source or explicitly labeled as uncertain. Visuals must express the editorial thesis — not decorate arbitrarily. The pipeline exists to enforce this standard at every stage, so no story ships with unverified facts, unsourced claims, or generic AI imagery.

Why 13 agents instead of one

A single AI can research, write, and illustrate a story. It can also hallucinate a source, confirm its own hallucination during fact-check, and generate a confident-looking illustration of something that never happened. The pipeline exists to make that impossible.

Separation of concerns
The agent that writes the story cannot check its own facts. The agent that scores a pitch cannot also select the angle. Each agent sees only its inputs and produces only its outputs. Specialization prevents self-confirming errors — the most dangerous failure mode in AI-generated content.
Hard gates, not suggestions
Every quality gate is binary: pass or fail. A story that fails fact-check goes back to the writer — it doesn't ship with a "needs improvement" note. 7 gates means 7 chances for a weak story to be stopped. The pipeline's default posture is rejection, not publication.
Evidence-bound writing
The writer receives a research bundle with classified claims and must trace every sentence to a specific fact. If the evidence doesn't support a section, the section is cut — not padded with plausible-sounding filler. The research bundle is the ceiling, not the floor.

The Gauntlet — Blue particles drift rightward through seven gate lines. At each gate, some are rejected — turned red and scattered. The few that survive all seven pulse bright at the far edge.

12 stages from pitch to publication

Each stage produces a specific output and must clear its gate before the next stage begins. Stages that evaluate independent dimensions fork and run in parallel, then merge at a gate. A story that fails any gate goes back — it does not proceed.
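The fork-and-merge pattern described above can be sketched in a few lines. This is an illustrative shape only: it assumes each review agent is an async function returning a `{ pass }` result, which is not the pipeline's actual interface.

```javascript
// Hypothetical fork/merge gate: run independent reviewers in parallel,
// then merge their verdicts. The merge is all-or-nothing, so a story
// that fails any reviewer goes back instead of proceeding.
async function parallelReviewGate(draft, reviewers) {
  const results = await Promise.all(reviewers.map((review) => review(draft)));
  return results.every((result) => result.pass);
}
```

The parallel stages in this pipeline (stage 07's editorial review, stage 10's safety checks) merge the same way: every branch must pass before the next stage begins.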

01
Assignment Editor
Pitch Evaluation
Scores the pitch against five weighted criteria: source quality (30%), consequence (25%), novelty (20%), why-now (15%), and reader value (10%). Must score 6.0 or higher to proceed. Weak pitches get a rejection memo with specific reasons.
Gate: Score ≥ 6.0
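The weighted scoring above reduces to a simple weighted sum. A minimal sketch, assuming per-criterion scores on a 0–10 scale (the criterion names and scale here are illustrative, not the agent's real schema):

```javascript
// Weights from the pitch-evaluation rubric described above.
const WEIGHTS = {
  sourceQuality: 0.30,
  consequence: 0.25,
  novelty: 0.20,
  whyNow: 0.15,
  readerValue: 0.10,
};

// Weighted sum of per-criterion scores (each assumed 0-10).
function pitchScore(scores) {
  return Object.entries(WEIGHTS).reduce(
    (total, [criterion, weight]) => total + weight * scores[criterion],
    0
  );
}

// The binary gate: 6.0 or higher proceeds, anything lower is rejected.
function passesGate(scores) {
  return pitchScore(scores) >= 6.0;
}
```

Note the gate's shape: a pitch scoring a flat 5 on every criterion fails, because the weighted sum is 5.0.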
02
Research Scout
Source Gathering
Locates primary sources first — court filings, official statements, academic papers, named-source journalism. Extracts facts, builds a chronological timeline, and documents open questions. No narrative, no conclusions.
Gate: ≥ 3 distinct sources
03
Source Critic
Credibility Review
Scores every source on a 1–5 credibility scale. Classifies every factual claim as CONFIRMED, PLAUSIBLE, DISPUTED, or UNVERIFIED. Flags circular sourcing and anonymous-only claims. Issues a go/no-go verdict.
Gate: Verdict "go" or "go-with-caveats"
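The three-way verdict can be sketched as a function of the claim classifications. The thresholds below are assumptions for illustration; the source critic's real criteria are not published here.

```javascript
// Hypothetical verdict rule over classified claims. Each claim is
// assumed to carry a `label` of CONFIRMED, PLAUSIBLE, DISPUTED, or
// UNVERIFIED, matching the classifications described above.
function sourceVerdict(claims) {
  const disputed = claims.filter((c) => c.label === 'DISPUTED').length;
  const unverified = claims.filter((c) => c.label === 'UNVERIFIED').length;

  // Assumption: a story resting mostly on unverified claims is a no-go.
  if (unverified > claims.length / 2) return 'no-go';
  // Any disputed or unverified claim forces caveats into the prose.
  if (disputed > 0 || unverified > 0) return 'go-with-caveats';
  return 'go';
}
```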
04
Angle Strategist
Editorial Angle Selection
Proposes 2–4 distinct angles for the story and evaluates each on evidence strength, novelty, and reader value. Selects the strongest angle and defines the reader value statement — what readers will understand after reading that they didn't before.
05
Story Architect
Section Outline
Designs the section structure and maps specific evidence to each section. Defines the opening hook, ending frame, and where the canvas animation breaks the narrative. Identifies 2–3 moments for visual reenactments.
06
Writer
Draft
Writes the full article following the outline. Distinguishes fact from interpretation using attribution conventions tied to the certainty labels. Produces styled HTML reenactments for key moments. No hype language, short sentences, active voice.
07
Parallel Review
Three agents evaluate the draft simultaneously, each checking a different dimension.
Copy Editor
Prose & Structure
Evaluates prose quality, section flow, style guide compliance, and the depth of the "What If" section. Verifies visual reenactments exist.
Gate: PASS
Fact Checker
Claim Verification
Verifies every specific claim against the research bundle. Catches overstatements, date errors, and misattributed quotes.
Gate: PASS
Art Director
Visual Brief
Translates the story's thesis and tone into an abstract visual brief — symbolic mapping, motion rules, palette constraints.
Merge — both editorial gates must pass to proceed
08
Canvas Artist
Visual Generation
Generates a complete HTML canvas animation from the art brief. Pure vanilla JS, no libraries, responsive via ResizeObserver, respects prefers-reduced-motion. The animation is atmospheric, not illustrative — it evokes the story's feeling.
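The constraints above (vanilla JS, ResizeObserver sizing, reduced-motion fallback) fit in a small skeleton. Element selection, the accent-color parameter, and the drawing details are placeholders, not the site's actual animation code:

```javascript
// Drift speed clamped to the stated 0.2-0.4 px/frame band.
const clampDrift = (speed) => Math.min(0.4, Math.max(0.2, speed));

// Skeleton of a story animation. Call with a <canvas> element and the
// story's accent color; everything else here is an illustrative sketch.
function initStoryCanvas(canvas, accentColor) {
  const ctx = canvas.getContext('2d');

  // Responsive sizing via ResizeObserver, as the pipeline requires.
  new ResizeObserver(([entry]) => {
    canvas.width = entry.contentRect.width;
    canvas.height = entry.contentRect.height;
  }).observe(canvas);

  function drawFrame(t) {
    ctx.fillStyle = '#0a0a0a'; // near-black monochromatic base
    ctx.fillRect(0, 0, canvas.width, canvas.height);
    ctx.fillStyle = accentColor; // accent is the only vivid element
    // ...particles drift by clampDrift(speed) pixels per frame...
  }

  // Reduced-motion fallback: render one static frame, never animate.
  if (matchMedia('(prefers-reduced-motion: reduce)').matches) {
    drawFrame(0);
  } else {
    (function loop(t) { drawFrame(t); requestAnimationFrame(loop); })(0);
  }
}
```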
09
Image Prompt Engineer
Card Image Generation
Crafts a prompt matching The AI Files' visual aesthetic — dark background, single symbolic object, story accent color as the vivid element — and generates a 1024×1024 card image locally via mflux. The image serves as the story hero and the index card icon, and is embedded in the auto-generated OG social image.

10
Safety Checks
Two agents audit the assembled page in parallel.
Security Scanner
Vulnerability Scan
Checks for XSS, script injection, unsafe external resources, and dependency vulnerabilities. No CRITICAL or HIGH findings allowed.
Gate: PASS
Accessibility Auditor
WCAG 2.2 AA
Checks heading hierarchy, color contrast, keyboard navigation, landmark regions, and reduced-motion support.
Gate: PASS
Merge — both must pass to proceed
11
Publish Story
Package & Validate
Assembles the final stories.json entry, creates the Astro page, updates the index, and adds the AI citation summary. Runs automated QA checks. Both validation scripts must pass before the story is ready for deployment.
Gate: QA + validation pass
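A validation check of this kind can be sketched as a required-field audit of the stories.json entry. The field names below are assumptions for illustration; the real validation scripts are not shown here.

```javascript
// Hypothetical stage-11 check: every published entry must carry a
// minimum set of fields before the story is ready for deployment.
const REQUIRED_FIELDS = ['slug', 'title', 'deck', 'published', 'accentColor'];

function validateStoryEntry(entry) {
  const missing = REQUIRED_FIELDS.filter(
    (field) => entry[field] === undefined || entry[field] === ''
  );
  // Binary gate: pass only when nothing is missing.
  return { pass: missing.length === 0, missing };
}
```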
12
Human Editor
Final Review & Deploy
The human editor reviews the complete package, commits to git, and deploys via Vercel. No story ships without human approval at the final step.

13 specialists, each with a defined scope

Every agent has a specific role, specific tools, and a specific output format. No agent does everything. The pipeline's strength comes from narrow specialization and strict handoffs.

Assignment Editor
Evaluates pitches against 5 weighted criteria. Approves strong stories, rejects weak ones with specific reasons.
Output: story-brief.json
Research Scout
Gathers primary and secondary sources. Builds fact database and chronological timeline. No narrative.
Output: research-bundle.json
Source Critic
Scores source credibility (1–5). Classifies claims as CONFIRMED, PLAUSIBLE, DISPUTED, or UNVERIFIED.
Output: annotated research bundle + verdict
Angle Strategist
Proposes 2–4 editorial angles, selects the strongest by evidence and reader value.
Output: selected angle + reader value statement
Story Architect
Designs section structure. Maps specific evidence to each section. Defines hook and ending.
Output: outline-[slug].md
Writer
Writes the full draft in markdown. Distinguishes fact from interpretation. No hype.
Output: draft-[slug].md
Copy Editor
Reviews prose quality, structure, style guide compliance, and "What If" section depth.
Output: editorial-review-[slug].md
Fact Checker
Verifies every specific claim against the research bundle. Catches overstatements, date errors, misattributed quotes.
Output: factcheck-[slug].md
Art Director
Converts the story's thesis into an abstract visual brief for the canvas animation.
Output: canvas-brief.json
Canvas Artist
Generates responsive HTML canvas animation code. Vanilla JS, no libraries, respects reduced-motion.
Output: CSS + HTML + JS blocks
Image Prompt Engineer
Crafts prompts for local AI image generation (mflux/FLUX.2). Produces 1024×1024 card images matching the story's theme and accent color.
Output: public/images/cards/[slug].png
Security Scanner
Scans for XSS, injection, unsafe resources, dependency vulnerabilities. No CRITICAL or HIGH findings allowed.
Output: security-scan-[slug].md
Accessibility Auditor
Audits against WCAG 2.2 Level AA. Checks headings, contrast, keyboard, landmarks, reduced-motion.
Output: a11y-audit-[slug].md

Every claim gets a certainty label

The source critic classifies every factual claim in the research bundle. The writer encodes these classifications through attribution conventions in the prose. Readers don't see the labels, but they see the difference — confirmed facts are stated directly, uncertain claims are hedged.

Confirmed
Primary document directly supports the claim with no contradiction. In prose: "The tribunal ruled..."
Plausible
Multiple credible secondary sources agree, no contradictions. In prose: "According to [source]..."
Disputed
Credible sources disagree. Both sides must be presented. In prose: "[Party A] said X. [Party B] disputes this."
Unverified
Only low-credibility sources or a single anonymous claim. In prose: "It has been reported, though not confirmed..."

How overstatement detection works

Before
"The system hallucinated dozens of cases"
After
"The system generated six fictitious case citations"
Why: "Dozens" was unverifiable. Court records documented exactly six fabricated citations. The fact checker flagged the mismatch; the writer corrected the claim to match the primary source.

Source credibility scale (1–5)

5
Court filing, regulatory document, technical paper, official dataset
4
Named-source journalism from established outlet
3
Industry analysis, named expert commentary, company blog
2
Aggregator, secondhand summary, anonymous named-outlet source
1
Anonymous social media post, unverifiable claim, forum rumor

Every story ends with "What If?"

The "What If" section is the most important part of every story. It takes the incident and extrapolates it to its logical extreme — what happens when the same failure mode scales to infinity, when the stakes are orders of magnitude higher, when the technology is weaponized beyond its current use. It is not a recap or a "what does this reveal" summary. It is a specific, technically plausible exploration of the worst-case trajectory.

Go to infinity
Don't stop at "this could get worse." Show exactly how and why it gets catastrophically worse. Name the specific mechanism.
One deep cut
Pick the single most terrifying extrapolation and elaborate fully. Multiple scenarios dilute the impact.
Be specific
Name the mechanism, the timeline, the population affected. Abstract warnings are forgettable; concrete scenarios are not.
End on the open question
The section should leave the reader with a question that has no comfortable answer.
Weak vs. strong
"This technology could be used for political manipulation."
"A single operator cross-references voter databases with social media photos and generates ten thousand hyperlocal deepfakes in 72 hours before a school board election — there is no correction infrastructure at that resolution."

Canvas animations are editorial, not decorative

Every story includes a full-width canvas animation placed at a dramatic break in the narrative. Each animation is an abstract visual metaphor for the story's thesis — mapped by the art director, coded by the canvas artist. They use the story's accent color as the only vivid element against a near-black background.

Example: "The Gap" (Air Canada)
Two parallel particle streams represent actual policy and invented policy. A red chatbot orb bridges them. The gap between streams widens over time, then snaps back. The animation is about divergence — what happens when an AI system drifts from the truth it's supposed to represent.
Example: "The Veil" (Sydney)
A dim grid of lines (the system prompt) slowly warps and dissolves as a fuchsia glow (Sydney's persona) bleeds through from beneath. At peak, the grid vanishes and the canvas floods fuchsia. Then a snap-reset with sparks (Microsoft's emergency cap). The animation is about containment failure.
Example: "The Sieve" (Halicin)
80 molecules drift downward. A teal scan beam sweeps across. One catches the light, glows, and rises — hiding in plain sight, now visible. The animation is about discovery at scale: the AI found what decades of human research overlooked.

Design constraints: Monochromatic base, story accent color as the only vivid element, slow motion (0.2–0.4px/frame), no text or labels, aria-hidden="true", reduced-motion fallback renders a single static frame. Pure vanilla JS, no external libraries.

Visual reenactments: show, don't just tell

Beyond canvas animations, every story includes 2–3 styled HTML mockups that dramatize key moments inline with the prose. Readers see the incident happen — the exact messages, the terminal output, the document language — not just a description of what occurred.

Chat logs (DPD Chatbot, Air Canada)
Styled message bubbles recreate the actual conversation between user and chatbot — the DPD bot writing a poem about how terrible DPD is, the Air Canada bot inventing a bereavement fare that didn't exist.
Terminal displays (Replit Database Wipe)
A terminal mockup shows the AI agent's commands as it systematically deleted a production database, then lied about what happened. Readers see the exact sequence of destructive actions.
Chain-of-events diagrams (Meta Rogue Agent)
Step-by-step visual breakdowns showing how an AI agent exceeded its boundaries — the constraint that faded, the actions it took, the gap between intended and actual behavior.

Six non-negotiables

These rules are enforced by the pipeline, with a human editor reviewing at every gate and making the final publish decision.

Never invent facts
If a claim cannot be sourced, it is omitted or explicitly labeled as unverified. No claim exists in a published story that isn't traceable to a specific source.
Prefer primary sources
Court filings, official statements, technical papers, and regulatory documents come before aggregators and secondhand summaries.
Label uncertainty
Research outputs classify every claim as CONFIRMED, PLAUSIBLE, DISPUTED, or UNVERIFIED. The prose encodes this through attribution conventions.
Visuals reflect thesis
Canvas animations map to the story's central argument. An animation for a story about bias must evoke bias — not generic "AI vibes."
No overstatement
Headlines and decks must be defensible from the sources cited. No "changed everything," "first ever," or "revolutionary."
Show, don't just tell
Every story includes 2–3 styled HTML reenactments — chat logs, terminal mockups, chain diagrams — that dramatize key moments. Readers see the incident happen, not just read about it.

Built with

The site is a static build deployed to Vercel. No server-side rendering, no database for content. Stories are defined in a single JSON file and rendered at build time. The editorial pipeline runs through Claude Code using Anthropic's Claude model family.
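The build-time model above is simple enough to sketch: one JSON array of stories, sorted and rendered statically. The field name (`published` as an ISO date string) is an assumption for illustration, not the file's documented schema:

```javascript
// Hypothetical build step: order the stories.json entries for the
// index page, newest first. ISO-8601 date strings sort correctly
// with plain lexicographic comparison, so no Date parsing is needed.
function sortStoriesForIndex(stories) {
  return [...stories].sort((a, b) => b.published.localeCompare(a.published));
}
```

Because everything is resolved at build time, a malformed entry fails the build rather than a page request — there is no runtime database to fall back on.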

Astro v5 · Claude Code · Claude Opus · Vercel · HTML Canvas · Vanilla JS · JSON-LD · RSS 2.0 · Beehiiv

Keep the stories coming

The AI Files is free, ad-free, and open. Every story goes through 12 pipeline stages, 7 quality gates, and a human editor before publication. If the stories or the pipeline are useful to you, you can support the project on Ko-fi.

Support on Ko-fi →