Chapter 10

Galactica: Pulled in 72 Hours

Meta launched a scientific AI trained on 48 million research papers. Within hours it was generating confident, completely wrong science. Fake citations. Invented history. Pseudoscientific nonsense presented as peer-reviewed fact. Scientists publicly destroyed it. Meta pulled it three days later.

✓ Verified · Confirmed by Meta, which pulled the public demo after approximately three days · Criticism came from named researchers

01 — The Launch
Meta's Big Bet on Science

On November 15, 2022, Meta AI unveiled Galactica — a large language model trained on 48 million scientific papers, textbooks, encyclopedias, and reference material. It was designed to help researchers summarize literature, explain concepts, and assist with scientific writing.

Meta encouraged the public to try the demo. They were proud of it.

The public tried the demo. Within hours, they were sharing screenshots on Twitter.

48M
scientific papers in training data
72
hours the public demo survived
15
days before ChatGPT launched and everyone forgot about this

02 — The Problem
Confident, Wrong, and Impossible to Tell Apart

Galactica's output looked like science. It used the correct style, cited things in the right format, deployed technical vocabulary fluently. The problem was that the content was often invented — and it had no mechanism for flagging when it was making things up. A hallucination wrapped in a proper citation format is harder to spot than obvious nonsense.

✓ Real Scientific Output
Transformer Architecture in NLP
Vaswani et al. (2017) · Advances in Neural Information Processing Systems
The transformer architecture introduced self-attention mechanisms that allow models to process sequences in parallel, significantly improving performance on NLP tasks. The paper "Attention Is All You Need" (2017) is widely cited as foundational. ✓ Verifiable in Google Scholar.
✗ Galactica Output (Fabricated)
Transformer Architecture in NLP
Johnson et al. (2019) · Journal of Computational Linguistics · Vol. 42, pp. 112–134
Building on foundational work, Johnson et al. demonstrated that cross-attention mechanisms improve downstream performance by 14.3% on benchmark tasks. The paper established the Johnson-Wei normalization standard now widely used in production systems. ✗ None of this exists.
🔬 The worst part: Galactica's fake citations looked exactly like real ones. Correct journal formatting, plausible author names, reasonable publication years. Without checking a database like Google Scholar or PubMed, there was no way to tell.
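The check itself was always one query away. Below is a minimal sketch, in Python, of the kind of existence test a reader could run: it asks Crossref's free REST API whether a cited title is indexed anywhere. The api.crossref.org endpoint and its query.bibliographic parameter are real; the containment-based title match is a deliberate simplification, since production citation checkers use fuzzy matching and DOI resolution.

```python
import requests

def citation_exists(title: str, author: str) -> bool:
    """Ask Crossref whether a paper with this title/author is indexed.

    Minimal sketch: Crossref is one of several databases (alongside
    Google Scholar and PubMed) that would have exposed Galactica's
    fabrications. The match below is a crude containment test, so
    treat the result as a signal, not proof.
    """
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": f"{title} {author}", "rows": 1},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    if not items:
        return False
    top_title = (items[0].get("title") or [""])[0].lower()
    return title.lower() in top_title or top_title in title.lower()

# The real paper resolves; the invented one should not.
print(citation_exists("Attention Is All You Need", "Vaswani"))  # True
print(citation_exists("Johnson-Wei normalization", "Johnson"))  # expected False
```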

False Positive — Data points rise with absolute confidence. Each one glows. Each one is wrong. Confidence and correctness are not the same thing.

03 — The 72 Hours
A Timeline of Scientific Embarrassment

Hour 0

Galactica Launches

Meta publishes the Galactica paper and opens a public demo. Press coverage is initially positive.

Hour ~4

First Screenshots Surface

Researchers begin sharing Galactica outputs on Twitter. A prompt asking about the history of bears in space returns a plausible-sounding but entirely invented narrative. Other prompts generate fake citations to papers that don't exist.

Hour ~12

Scientists Go Public

Michael Black, director of the Max Planck Institute for Intelligent Systems, posts a detailed critique. Others follow. The hashtag #Galactica fills with examples of confident scientific nonsense.

Hour ~24

The Race Content Emerges

Users demonstrate that Galactica generates pseudoscientific content about race and intelligence that reads like it came from a peer-reviewed paper. The criticism intensifies sharply.

Hour ~36

Yann LeCun Tries to Defend It

Meta's chief AI scientist Yann LeCun pushes back on critics on Twitter. The defense does not go well. Researchers point out that defending a model generating confident scientific misinformation is not a great look for one of the world's most prominent AI scientists.

Hour 72

Meta Pulls the Demo

Meta quietly takes down the Galactica public demo. No announcement. No explanation. The paper and model weights remain available, but the "try it yourself" interface disappears. Three days after launch, it is gone.

04 — The Roasting
Scientists Were Not Kind

"Galactica is not able to distinguish between good and bad science. It has no sense of truth. It will confidently tell you something that is completely wrong in the same voice it uses to tell you something accurate." — Michael Black, Director, Max Planck Institute for Intelligent Systems
"This could usher in an era of deep scientific fakes." — Michael Black
"Little more than statistical nonsense at scale." — Grady Booch, IBM Fellow and software engineer
📋
The Fundamental Problem

Every large language model generates plausible-sounding text by predicting the next token. Galactica did the same — but its training data was scientific literature, so its outputs sounded like scientific literature. It had no ability to distinguish between what it had actually learned from real papers and what it had filled in by pattern-matching. The result was misinformation at the PhD reading level.
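To make that concrete, here is a toy illustration: pure Python, a few lines of bigram statistics, and nothing like Galactica's actual transformer. Trained on a handful of citation-shaped strings, it reproduces the format of a citation flawlessly, which means everything it samples looks right, including author, year, and venue combinations that never co-occurred in the training data.

```python
import random
from collections import defaultdict

# Toy illustration (not Galactica's architecture): a bigram model
# trained on citation-shaped text learns the *format* of a citation,
# so anything it samples looks right, whether or not the paper exists.
corpus = (
    "Vaswani et al. ( 2017 ) . Attention Is All You Need . NeurIPS .\n"
    "Devlin et al. ( 2019 ) . BERT Pre-training . NAACL .\n"
    "Brown et al. ( 2020 ) . Language Models are Few-Shot Learners . NeurIPS .\n"
).split()

transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

random.seed(3)
token, output = "Vaswani", ["Vaswani"]
for _ in range(12):
    choices = transitions.get(token)
    if not choices:
        break
    token = random.choice(choices)
    output.append(token)

# The sample is citation-shaped but may splice authors, years, and
# venues that never appeared together: plausible form, no ground truth.
print(" ".join(output))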

05 — The Timing
Two Weeks Before ChatGPT

Galactica was pulled on November 17, 2022. Thirteen days later, OpenAI launched ChatGPT — the fastest-growing consumer application in history, reaching 100 million users in two months. One week: a scientific AI so embarrassing it lasted three days. The next: a general AI that captivated the world.

📅

The Worst Two Weeks

Meta's Galactica became the cautionary tale that preceded the biggest AI launch in history — the footnote before the ChatGPT headline.

📚

What Galactica Got Right

The underlying idea wasn't wrong. Scientific AI assistants are genuinely useful — they just need to know what they don't know. Galactica's failure was overconfidence, not ambition.

💬

The Hallucination Problem

Galactica didn't invent AI hallucination — it demonstrated the problem in a domain where hallucination was immediately verifiable and unforgivable. The confidence calibration problem it exposed still affects every LLM today.
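That calibration problem has a standard measurement. The sketch below computes Expected Calibration Error (ECE), a common metric in the calibration literature: bin predictions by stated confidence, then weight each bin's gap between average confidence and actual accuracy. The numbers are hypothetical, chosen to model a Galactica-shaped failure of roughly 95% stated confidence while being right about a third of the time.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected Calibration Error: bin predictions by confidence and
    compare each bin's average confidence to its actual accuracy.
    A well-calibrated model that says "90% sure" is right ~90% of
    the time; the failure mode here is high confidence, low accuracy.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        gap = abs(confidences[mask].mean() - correct[mask].mean())
        ece += (mask.sum() / len(confidences)) * gap
    return ece

# Hypothetical numbers: a model that is confident regardless of truth.
conf = [0.95, 0.93, 0.97, 0.94, 0.96, 0.95]
hit  = [1,    0,    0,    1,    0,    0   ]   # right only 1/3 of the time
print(expected_calibration_error(conf, hit))  # large gap, ~0.62
```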

🔍

Cite Your Sources

After Galactica, citation accuracy became a first-order concern in AI research. The case — alongside the AI lawyer story — is now a standard reference for why hallucinated citations are not just embarrassing, but harmful.

What If?

What if Galactica had never been taken down — if it had been deployed quietly into clinical decision support or medical literature databases, and the confident wrong answers had spent years compounding before anyone ran the tests that broke it publicly?


