01 — The Problem
Fifty Years of Failure
In the half-century between 1972 and 2022, experimental scientists — using X-ray crystallography, cryo-electron microscopy, and nuclear magnetic resonance — determined the three-dimensional structures of approximately 194,000 proteins. Each structure could take years of work and cost hundreds of thousands of dollars. In July 2022, a single upload to a database added 200 million more.
A protein's function is determined by its three-dimensional shape, and that shape is determined by its amino acid sequence. In 1972, Christian Anfinsen won the Nobel Prize in Chemistry for proving this principle. The question that followed consumed structural biology for the next fifty years: if the sequence determines the shape, can we predict the shape from the sequence alone?
The scale of the problem was staggering. In 1969, the molecular biologist Cyrus Levinthal calculated that a typical protein could assume approximately 10^300 possible configurations. If a protein sampled one configuration per picosecond, it would take longer than the age of the universe to try them all. Yet real proteins fold into their correct shape in milliseconds. Nature had solved the problem. Science had not.
In 1994, a biennial competition called CASP — the Critical Assessment of protein Structure Prediction — was established to measure progress. Research groups would submit predicted structures for proteins whose actual structures had been determined experimentally but not yet published. For decades, progress was incremental. A few points per competition cycle. The protein folding problem was acknowledged, studied, and unsolved.
The Fold — An amino acid chain drifts loosely, searching through configuration space. Then it contracts — rapidly, inevitably — into a compact structure. Cross-bonds appear. The protein holds its shape. Then it unfolds and begins again.
02 — The Bet
From Games to Genes
Demis Hassabis was a chess prodigy who reached master standard at thirteen. He designed video games at Bullfrog Productions alongside Peter Molyneux as a teenager, founded his own game studio, then pivoted to neuroscience — earning a PhD at University College London studying how the brain handles memory and imagination. His doctoral work on the link between episodic memory and imagining the future was named one of Science magazine's top ten breakthroughs of 2007.
In 2010, Hassabis co-founded DeepMind in London with Shane Legg and Mustafa Suleyman. The company's thesis was that techniques from neuroscience could unlock general-purpose AI. Google acquired DeepMind in 2014. In March 2016, DeepMind's AlphaGo defeated Lee Sedol, one of the world's top Go players, 4–1.
After AlphaGo, Hassabis turned the same deep-learning approach toward science. The first target was protein folding.
John Jumper arrived at DeepMind in 2017. He had studied physics and mathematics at Vanderbilt University, then earned a PhD at the University of Chicago applying machine learning to protein dynamics. His background bridged exactly the gap the project required: deep knowledge of both proteins and the neural network architectures that might predict them.
In December 2018, AlphaFold entered CASP13 — DeepMind's first attempt at the competition. It placed first in the Free Modeling category, scoring 68.3 in summed z-scores against 48.2 for the next closest group. The problem was not solved. But the signal was clear: a machine learning lab with no prior structural biology publication had outperformed every academic group in the field on its first try.
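A summed z-score measures how far a group sits above the field on each target, added up across targets. The sketch below is a simplified illustration of that idea only: the group names and scores are hypothetical, the population standard deviation is an assumed choice, and CASP's actual ranking procedure includes details that are not reproduced here.

```python
from statistics import mean, pstdev


def summed_z_scores(scores_by_target: dict[str, dict[str, float]]) -> dict[str, float]:
    """Sum each group's per-target z-score across all targets.

    scores_by_target maps target name -> {group name: score on that target}.
    """
    totals: dict[str, float] = {}
    for group_scores in scores_by_target.values():
        values = list(group_scores.values())
        mu = mean(values)
        sigma = pstdev(values)  # assumption: population standard deviation
        for group, score in group_scores.items():
            # If every group scored the same, no one stands out on this target.
            z = 0.0 if sigma == 0 else (score - mu) / sigma
            totals[group] = totals.get(group, 0.0) + z
    return totals


# Hypothetical scores for three groups on two targets.
example = {
    "target_1": {"group_a": 80.0, "group_b": 60.0, "group_c": 40.0},
    "target_2": {"group_a": 70.0, "group_b": 50.0, "group_c": 60.0},
}

ranking = sorted(summed_z_scores(example).items(), key=lambda item: item[1], reverse=True)
for group, total in ranking:
    print(f"{group}: {total:+.2f}")
```

Because each z-score is relative to the other entrants on the same target, a large summed gap means a group was consistently ahead of the field, not merely better on a handful of easy proteins.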
03 — The Solve
Ninety-Two Point Four
Two years later, AlphaFold 2 entered CASP14. The results, announced in November 2020, were not close.
AlphaFold 2 achieved a median GDT (Global Distance Test) score of 92.4 out of 100 across 97 target proteins. A score above 90 is generally considered competitive with experimental methods. For approximately two-thirds of the targets, AlphaFold scored above 90. The median error in atomic positions was less than one Angstrom, roughly the width of a single atom.
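To make the score concrete: the GDT_TS variant of the metric averages the percentage of residues whose predicted position falls within 1, 2, 4, and 8 Angstroms of the experimental position. The sketch below is a simplification that assumes the two structures have already been superimposed once and that the per-residue distances are given. The full metric searches for the best superposition at each cutoff, which is not done here. The function name and the sample distances are illustrative.

```python
from typing import Sequence

# Distance cutoffs, in Angstroms, used by the GDT_TS variant of the score.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(distances: Sequence[float]) -> float:
    """Simplified GDT_TS on a 0-100 scale.

    distances: per-residue distance (Angstroms) between predicted and
    experimental positions, assuming one fixed superposition.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # Fraction of residues within each cutoff, then the mean of those fractions.
    fractions = [sum(d <= cutoff for d in distances) / n for cutoff in GDT_TS_CUTOFFS]
    return 100.0 * sum(fractions) / len(GDT_TS_CUTOFFS)


# Hypothetical five-residue example: two residues within 1 A, one more
# within 2 A, one more within 4 A, and one outlier beyond 8 A.
sample_distances = [0.5, 0.9, 1.5, 3.0, 9.0]
print(round(gdt_ts(sample_distances), 1))
```

A prediction scores 100 only if every residue lands within 1 Angstrom, so a median above 90 means most of each chain is placed with near-experimental precision.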
The gap was not incremental. It was a discontinuity.
Andrei Lupas, Director at the Max Planck Institute for Developmental Biology and a CASP assessor, put it directly: "It's a game changer. This will change medicine. It will change research. It will change bioengineering. It will change everything."
John Moult, who had founded CASP in 1994 to measure exactly this kind of progress, said the problem was "in large part, solved."
For the other teams at CASP14 — groups that had spent years or decades on their own prediction methods — it was an extinction-level event for their subfield.
04 — The Release
Two Hundred Million Structures
The breakthrough would have been significant regardless. What made it matter beyond the lab was what DeepMind did next.
On July 15, 2021, AlphaFold 2's full methodology was published in Nature, in a paper by Jumper, Hassabis, and colleagues that would accumulate nearly 43,000 citations by November 2025. On the same day, DeepMind open-sourced AlphaFold's code. One week later, in partnership with EMBL-EBI, they launched the AlphaFold Protein Structure Database — initially containing predicted structures for approximately 365,000 proteins.
In July 2022, one year later, DeepMind uploaded predicted structures for approximately 200 million proteins from one million species. Effectively, every protein known to science.
Figure: approximately 194,000 structures from 50 years of experimental work, compared with approximately 200 million from one upload in July 2022.
Science magazine named AlphaFold its 2021 Breakthrough of the Year. By late 2025, according to DeepMind, over three million researchers across 190 countries had used the database, with more than one million users in low- and middle-income countries. The open-source decision had turned a research result into infrastructure.
05 — The Prize
The Nobel
On October 9, 2024, the Royal Swedish Academy of Sciences awarded the Nobel Prize in Chemistry to three people. Half the prize went to Demis Hassabis and John Jumper "for protein structure prediction." The other half went to David Baker, a biochemist at the University of Washington, "for computational protein design" — work that uses related techniques to design entirely new proteins that do not exist in nature.
The pairing was deliberate. Hassabis and Jumper had taught a machine to read the language of proteins. Baker had taught a machine to write in it.
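It was among the first Nobel Prizes in a natural science awarded primarily for work driven by artificial intelligence, announced one day after the physics prize went to John Hopfield and Geoffrey Hinton for foundational work on artificial neural networks.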
06 — Signal
The Other Side of the Fold
The applications have been concrete. Researchers at the University of Oxford used AlphaFold to identify a critical protein in malaria vaccine development, accelerating the path from basic research to clinical trials. Scientists have used it to design enzymes that break down plastic waste more efficiently. Drug discovery pipelines across the pharmaceutical industry have integrated AlphaFold predictions into their early-stage target identification.
But the same capabilities carry a second set of implications.
A 2024 paper in EMBO Reports — "Security challenges by AI-assisted protein design" — laid out the dual-use risks. AI-powered protein prediction and design reduce the time, resources, and expertise required for biological engineering. The paper concluded that these tools could "shorten the risk chain for biological weapon development" by lowering the barrier for non-experts.
This is not hypothetical. The same tools that allow a researcher to design a better enzyme or a novel therapeutic allow anyone with access to design a protein that folds into a shape optimized for harm. The database is open. The code is open. The barrier that once existed — years of training, millions of dollars of equipment, institutional access — has been compressed into a laptop and a download link.
The protein folding problem is solved. The protein design problem is just beginning. The fold goes both ways.
The protein folding problem asked: given a sequence, what shape does it make? The protein design problem asks the inverse: given a desired shape, what sequence produces it? AlphaFold answered the first question. David Baker's work, the other half of the same Nobel Prize, is answering the second. Within a decade, designing a novel protein to a precise specification will be as routine as compiling code.
The same capability that lets a researcher design a protein to neutralize a toxin lets someone design a protein that mimics one. The same database that accelerates a malaria vaccine accelerates a synthetic pathogen optimized for immune evasion. The barrier was never intent. It was capability, and capability just became a free download.
No export control regime, no biosafety protocol, no institutional review board was designed for a world where the tools of molecular engineering are open-source, run on consumer hardware, and improve every eighteen months. The question is not whether someone will use protein design tools to cause harm. The question is what detection infrastructure exists when they do, and right now the answer is: almost none.