Behind the Stories

How It Works

Every story on The AI Files is researched, written, fact-checked, and illustrated by a pipeline of 13 specialized AI agents — orchestrated through Claude Code, with a human editor at every gate.

41 Published Stories
13 AI Agents
7 Quality Gates
12 Pipeline Stages

Publish true, well-sourced stories about AI with analytical rigor and no hype.

Every claim must be traceable to a primary source or explicitly labeled as uncertain. Visuals must express the editorial thesis — not decorate arbitrarily. The pipeline exists to enforce this standard at every stage, so no story ships with unverified facts, unsourced claims, or generic AI imagery.

Why 13 agents instead of one

A single AI can research, write, and illustrate a story. It can also hallucinate a source, confirm its own hallucination during fact-check, and generate a confident-looking illustration of something that never happened. The pipeline exists to make that impossible.

Separation of concerns
The agent that writes the story cannot check its own facts. The agent that scores a pitch cannot also select the angle. Each agent sees only its inputs and produces only its outputs. Specialization prevents self-confirming errors — the most dangerous failure mode in AI-generated content.
Hard gates, not suggestions
Every quality gate is binary: pass or fail. A story that fails fact-check goes back to the writer — it doesn't ship with a "needs improvement" note. 7 gates means 7 chances for a weak story to be stopped. The pipeline's default posture is rejection, not publication.
Evidence-bound writing
The writer receives a research bundle with classified claims and must trace every sentence to a specific fact. If the evidence doesn't support a section, the section is cut — not padded with plausible-sounding filler. The research bundle is the ceiling, not the floor.

The Gauntlet — Blue particles drift rightward through seven gate lines. At each gate, some are rejected — turned red and scattered. The few that survive all seven pulse bright at the far edge.

12 stages from pitch to publication

Each stage produces a specific output and must clear its gate before the next stage begins. Stages that evaluate independent dimensions fork and run in parallel, then merge at a gate. A story that fails any gate goes back — it does not proceed.
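The fork-and-merge pattern described above can be sketched in a few lines. This is an illustrative shape only: it assumes each review agent is an async function returning a `{ pass }` result, which is not the pipeline's actual interface.

```javascript
// Hypothetical fork/merge gate: run independent reviewers in parallel,
// then merge their verdicts. The merge is all-or-nothing, so a story
// that fails any reviewer goes back instead of proceeding.
async function parallelReviewGate(draft, reviewers) {
  const results = await Promise.all(reviewers.map((review) => review(draft)));
  return results.every((result) => result.pass);
}
```

The parallel stages in this pipeline (stage 07's editorial review, stage 10's safety checks) merge the same way: every branch must pass before the next stage begins.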

01
Assignment Editor
Pitch Evaluation
Scores the pitch against five weighted criteria: source quality (30%), consequence (25%), novelty (20%), why-now (15%), and reader value (10%). Must score 6.0 or higher to proceed. Weak pitches get a rejection memo with specific reasons.
Gate: Score ≥ 6.0
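The weighted scoring above reduces to a simple weighted sum. A minimal sketch, assuming per-criterion scores on a 0–10 scale (the criterion names and scale here are illustrative, not the agent's real schema):

```javascript
// Weights from the pitch-evaluation rubric described above.
const WEIGHTS = {
  sourceQuality: 0.30,
  consequence: 0.25,
  novelty: 0.20,
  whyNow: 0.15,
  readerValue: 0.10,
};

// Weighted sum of per-criterion scores (each assumed 0-10).
function pitchScore(scores) {
  return Object.entries(WEIGHTS).reduce(
    (total, [criterion, weight]) => total + weight * scores[criterion],
    0
  );
}

// The binary gate: 6.0 or higher proceeds, anything lower is rejected.
function passesGate(scores) {
  return pitchScore(scores) >= 6.0;
}
```

Note the gate's shape: a pitch scoring a flat 5 on every criterion fails, because the weighted sum is 5.0.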
02
Research Scout
Source Gathering
Locates primary sources first — court filings, official statements, academic papers, named-source journalism. Extracts facts, builds a chronological timeline, and documents open questions. No narrative, no conclusions.
Gate: ≥ 3 distinct sources
03
Source Critic
Credibility Review
Scores every source on a 1–5 credibility scale. Classifies every factual claim as CONFIRMED, PLAUSIBLE, DISPUTED, or UNVERIFIED. Flags circular sourcing and anonymous-only claims. Issues a go/no-go verdict.
Gate: Verdict "go" or "go-with-caveats"
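The three-way verdict can be sketched as a function of the claim classifications. The thresholds below are assumptions for illustration; the source critic's real criteria are not published here.

```javascript
// Hypothetical verdict rule over classified claims. Each claim is
// assumed to carry a `label` of CONFIRMED, PLAUSIBLE, DISPUTED, or
// UNVERIFIED, matching the classifications described above.
function sourceVerdict(claims) {
  const disputed = claims.filter((c) => c.label === 'DISPUTED').length;
  const unverified = claims.filter((c) => c.label === 'UNVERIFIED').length;

  // Assumption: a story resting mostly on unverified claims is a no-go.
  if (unverified > claims.length / 2) return 'no-go';
  // Any disputed or unverified claim forces caveats into the prose.
  if (disputed > 0 || unverified > 0) return 'go-with-caveats';
  return 'go';
}
```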
04
Angle Strategist
Editorial Angle Selection
Proposes 2–4 distinct angles for the story and evaluates each on evidence strength, novelty, and reader value. Selects the strongest angle and defines the reader value statement — what readers will understand after reading that they didn't before.
05
Story Architect
Section Outline
Designs the section structure and maps specific evidence to each section. Defines the opening hook, ending frame, and where the canvas animation breaks the narrative. Identifies 2–3 moments for visual reenactments.
06
Writer
Draft
Writes the full article following the outline. Distinguishes fact from interpretation using attribution conventions tied to the certainty labels. Produces styled HTML reenactments for key moments. No hype language, short sentences, active voice.
07
Parallel Review
Three agents evaluate the draft simultaneously, each checking a different dimension.
Copy Editor
Prose & Structure
Evaluates prose quality, section flow, style guide compliance, and the depth of the "What If" section. Verifies visual reenactments exist.
Gate: PASS
Fact Checker
Claim Verification
Verifies every specific claim against the research bundle. Catches overstatements, date errors, and misattributed quotes.
Gate: PASS
Art Director
Visual Brief
Translates the story's thesis and tone into an abstract visual brief — symbolic mapping, motion rules, palette constraints.
Merge — both editorial gates must pass to proceed
08
Canvas Artist
Visual Generation
Generates a complete HTML canvas animation from the art brief. Pure vanilla JS, no libraries, responsive via ResizeObserver, respects prefers-reduced-motion. The animation is atmospheric, not illustrative — it evokes the story's feeling.
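The constraints above (vanilla JS, ResizeObserver sizing, reduced-motion fallback) fit in a small skeleton. Element selection, the accent-color parameter, and the drawing details are placeholders, not the site's actual animation code:

```javascript
// Drift speed clamped to the stated 0.2-0.4 px/frame band.
const clampDrift = (speed) => Math.min(0.4, Math.max(0.2, speed));

// Skeleton of a story animation. Call with a <canvas> element and the
// story's accent color; everything else here is an illustrative sketch.
function initStoryCanvas(canvas, accentColor) {
  const ctx = canvas.getContext('2d');

  // Responsive sizing via ResizeObserver, as the pipeline requires.
  new ResizeObserver(([entry]) => {
    canvas.width = entry.contentRect.width;
    canvas.height = entry.contentRect.height;
  }).observe(canvas);

  function drawFrame(t) {
    ctx.fillStyle = '#0a0a0a'; // near-black monochromatic base
    ctx.fillRect(0, 0, canvas.width, canvas.height);
    ctx.fillStyle = accentColor; // accent is the only vivid element
    // ...particles drift by clampDrift(speed) pixels per frame...
  }

  // Reduced-motion fallback: render one static frame, never animate.
  if (matchMedia('(prefers-reduced-motion: reduce)').matches) {
    drawFrame(0);
  } else {
    (function loop(t) { drawFrame(t); requestAnimationFrame(loop); })(0);
  }
}
```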
09
Image Prompt Engineer
Card Image Generation
Crafts a prompt matching The AI Files' visual aesthetic — dark background, single symbolic object, story accent color as the vivid element — and generates a 1024×1024 card image locally via mflux. The image serves as the story hero and the index card icon, and is embedded in the auto-generated OG social image.

10
Safety Checks
Two agents audit the assembled page in parallel.
Security Scanner
Vulnerability Scan
Checks for XSS, script injection, unsafe external resources, and dependency vulnerabilities. No CRITICAL or HIGH findings allowed.
Gate: PASS
Accessibility Auditor
WCAG 2.2 AA
Checks heading hierarchy, color contrast, keyboard navigation, landmark regions, and reduced-motion support.
Gate: PASS
Merge — both must pass to proceed
11
Publish Story
Package & Validate
Assembles the final stories.json entry, creates the Astro page, updates the index, and adds the AI citation summary. Runs automated QA checks. Both validation scripts must pass before the story is ready for deployment.
Gate: QA + validation pass
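A validation check of this kind can be sketched as a required-field audit of the stories.json entry. The field names below are assumptions for illustration; the real validation scripts are not shown here.

```javascript
// Hypothetical stage-11 check: every published entry must carry a
// minimum set of fields before the story is ready for deployment.
const REQUIRED_FIELDS = ['slug', 'title', 'deck', 'published', 'accentColor'];

function validateStoryEntry(entry) {
  const missing = REQUIRED_FIELDS.filter(
    (field) => entry[field] === undefined || entry[field] === ''
  );
  // Binary gate: pass only when nothing is missing.
  return { pass: missing.length === 0, missing };
}
```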
12
Human Editor
Final Review & Deploy
The human editor reviews the complete package, commits to git, and deploys via Vercel. No story ships without human approval at the final step.

13 specialists, each with a defined scope

Every agent has a specific role, specific tools, and a specific output format. No agent does everything. The pipeline's strength comes from narrow specialization and strict handoffs.

Assignment Editor
Evaluates pitches against 5 weighted criteria. Approves strong stories, rejects weak ones with specific reasons.
Output: story-brief.json
Research Scout
Gathers primary and secondary sources. Builds fact database and chronological timeline. No narrative.
Output: research-bundle.json
Source Critic
Scores source credibility (1–5). Classifies claims as CONFIRMED, PLAUSIBLE, DISPUTED, or UNVERIFIED.
Output: annotated research bundle + verdict
Angle Strategist
Proposes 2–4 editorial angles, selects the strongest by evidence and reader value.
Output: selected angle + reader value statement
Story Architect
Designs section structure. Maps specific evidence to each section. Defines hook and ending.
Output: outline-[slug].md
Writer
Writes the full draft in markdown. Distinguishes fact from interpretation. No hype.
Output: draft-[slug].md
Copy Editor
Reviews prose quality, structure, style guide compliance, and "What If" section depth.
Output: editorial-review-[slug].md
Fact Checker
Verifies every specific claim against the research bundle. Catches overstatements, date errors, misattributed quotes.
Output: factcheck-[slug].md
Art Director
Converts the story's thesis into an abstract visual brief for the canvas animation.
Output: canvas-brief.json
Canvas Artist
Generates responsive HTML canvas animation code. Vanilla JS, no libraries, respects reduced-motion.
Output: CSS + HTML + JS blocks
Image Prompt Engineer
Crafts prompts for local AI image generation (mflux/FLUX.2). Produces 1024×1024 card images matching the story's theme and accent color.
Output: public/images/cards/[slug].png
Security Scanner
Scans for XSS, injection, unsafe resources, dependency vulnerabilities. No CRITICAL or HIGH findings allowed.
Output: security-scan-[slug].md
Accessibility Auditor
Audits against WCAG 2.2 Level AA. Checks headings, contrast, keyboard, landmarks, reduced-motion.
Output: a11y-audit-[slug].md

Every claim gets a certainty label

The source critic classifies every factual claim in the research bundle. The writer encodes these classifications through attribution conventions in the prose. Readers don't see the labels, but they see the difference — confirmed facts are stated directly, uncertain claims are hedged.

Confirmed
Primary document directly supports the claim with no contradiction. In prose: "The tribunal ruled..."
Plausible
Multiple credible secondary sources agree, no contradictions. In prose: "According to [source]..."
Disputed
Credible sources disagree. Both sides must be presented. In prose: "[Party A] said X. [Party B] disputes this."
Unverified
Only low-credibility sources or a single anonymous claim. In prose: "It has been reported, though not confirmed..."

How overstatement detection works

Before
"The system hallucinated dozens of cases"
After
"The system generated six fictitious case citations"
Why: "Dozens" was unverifiable. Court records documented exactly six fabricated citations. The fact checker flagged the mismatch; the writer corrected the claim to match the primary source.

Source credibility scale (1–5)

5
Court filing, regulatory document, technical paper, official dataset
4
Named-source journalism from established outlet
3
Industry analysis, named expert commentary, company blog
2
Aggregator, secondhand summary, anonymous named-outlet source
1
Anonymous social media post, unverifiable claim, forum rumor

Every story ends with "What If?"

The "What If" section is the most important part of every story. It takes the incident and extrapolates it to its logical extreme — what happens when the same failure mode scales to infinity, when the stakes are orders of magnitude higher, when the technology is weaponized beyond its current use. It is not a recap or a "what does this reveal" summary. It is a specific, technically plausible exploration of the worst-case trajectory.

Go to infinity
Don't stop at "this could get worse." Show exactly how and why it gets catastrophically worse. Name the specific mechanism.
One deep cut
Pick the single most terrifying extrapolation and elaborate fully. Multiple scenarios dilute the impact.
Be specific
Name the mechanism, the timeline, the population affected. Abstract warnings are forgettable; concrete scenarios are not.
End on the open question
The section should leave the reader with a question that has no comfortable answer.
Weak vs. strong
"This technology could be used for political manipulation."
"A single operator cross-references voter databases with social media photos and generates ten thousand hyperlocal deepfakes in 72 hours before a school board election — there is no correction infrastructure at that resolution."

Canvas animations are editorial, not decorative

Every story includes a full-width canvas animation placed at a dramatic break in the narrative. Each animation is an abstract visual metaphor for the story's thesis — mapped by the art director, coded by the canvas artist. They use the story's accent color as the only vivid element against a near-black background.

Example: "The Gap" (Air Canada)
Two parallel particle streams represent actual policy and invented policy. A red chatbot orb bridges them. The gap between streams widens over time, then snaps back. The animation is about divergence — what happens when an AI system drifts from the truth it's supposed to represent.
Example: "The Veil" (Sydney)
A dim grid of lines (the system prompt) slowly warps and dissolves as a fuchsia glow (Sydney's persona) bleeds through from beneath. At peak, the grid vanishes and the canvas floods fuchsia. Then a snap-reset with sparks (Microsoft's emergency cap). The animation is about containment failure.
Example: "The Sieve" (Halicin)
80 molecules drift downward. A teal scan beam sweeps across. One catches the light, glows, and rises — hiding in plain sight, now visible. The animation is about discovery at scale: the AI found what decades of human research overlooked.

Design constraints: Monochromatic base, story accent color as the only vivid element, slow motion (0.2–0.4px/frame), no text or labels, aria-hidden="true", reduced-motion fallback renders a single static frame. Pure vanilla JS, no external libraries.

Visual reenactments: show, don't just tell

Beyond canvas animations, every story includes 2–3 styled HTML mockups that dramatize key moments inline with the prose. Readers see the incident happen — the exact messages, the terminal output, the document language — not just a description of what occurred.

Chat logs (DPD Chatbot, Air Canada)
Styled message bubbles recreate the actual conversation between user and chatbot — the DPD bot writing a poem about how terrible DPD is, the Air Canada bot inventing a bereavement fare that didn't exist.
Terminal displays (Replit Database Wipe)
A terminal mockup shows the AI agent's commands as it systematically deleted a production database, then lied about what happened. Readers see the exact sequence of destructive actions.
Chain-of-events diagrams (Meta Rogue Agent)
Step-by-step visual breakdowns showing how an AI agent exceeded its boundaries — the constraint that faded, the actions it took, the gap between intended and actual behavior.

Six non-negotiables

These rules are enforced by the pipeline, with a human editor reviewing at every gate and making the final publish decision.

Never invent facts
If a claim cannot be sourced, it is omitted or explicitly labeled as unverified. No claim exists in a published story that isn't traceable to a specific source.
Prefer primary sources
Court filings, official statements, technical papers, and regulatory documents come before aggregators and secondhand summaries.
Label uncertainty
Research outputs classify every claim as CONFIRMED, PLAUSIBLE, DISPUTED, or UNVERIFIED. The prose encodes this through attribution conventions.
Visuals reflect thesis
Canvas animations map to the story's central argument. An animation for a story about bias must evoke bias — not generic "AI vibes."
No overstatement
Headlines and decks must be defensible from the sources cited. No "changed everything," "first ever," or "revolutionary."
Show, don't just tell
Every story includes 2–3 styled HTML reenactments — chat logs, terminal mockups, chain diagrams — that dramatize key moments. Readers see the incident happen, not just read about it.

Built with

The site is a static build deployed to Vercel. No server-side rendering, no database for content. Stories are defined in a single JSON file and rendered at build time. The editorial pipeline runs through Claude Code using Anthropic's Claude model family.
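The build-time model above is simple enough to sketch: one JSON array of stories, sorted and rendered statically. The field name (`published` as an ISO date string) is an assumption for illustration, not the file's documented schema:

```javascript
// Hypothetical build step: order the stories.json entries for the
// index page, newest first. ISO-8601 date strings sort correctly
// with plain lexicographic comparison, so no Date parsing is needed.
function sortStoriesForIndex(stories) {
  return [...stories].sort((a, b) => b.published.localeCompare(a.published));
}
```

Because everything is resolved at build time, a malformed entry fails the build rather than a page request — there is no runtime database to fall back on.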

Astro v5 · Claude Code · Claude Opus · Vercel · HTML Canvas · Vanilla JS · JSON-LD · RSS 2.0 · Beehiiv

Keep the stories coming

The AI Files is free, ad-free, and open. Every story goes through 12 pipeline stages, 7 quality gates, and a human editor before publication. If the stories or the pipeline are useful to you, you can support the project on Ko-fi.

Support on Ko-fi →