From 2014 to 2017, Amazon built and tested an AI to screen job applicants. It quietly taught itself to downgrade women. Engineers spotted the problem, spent roughly two years trying to fix it, and failed. Amazon scrapped the tool and told no one. Reuters told everyone.
In 2014, Amazon's machine learning team had an idea that seemed almost obviously good: automate resume screening. Amazon received hundreds of thousands of job applications every year. Human recruiters were a bottleneck. If an AI could learn, from a decade of successful hires, what made a great Amazon employee, it could process thousands of applications instantly, ranking candidates on a five-star scale.
The team began training the model on a decade's worth of resumes submitted to Amazon. There was a problem embedded in the training data that nobody fully appreciated at first: over the previous ten years, most Amazon employees in technical roles had been men. The tech industry skews male. Amazon skewed male. The AI was about to learn from that.
By 2015, Amazon's engineers realized something was wrong. The model wasn't rating candidates in a gender-neutral way. It was actively penalizing resumes that included the word "women's," as in "captain of women's chess team," "president of women's professional association," or "attended women's college."
The model had learned from the historical data that men had been hired more often, and it treated female-associated signals as predictors of rejection. It was encoding the company's existing gender imbalance and presenting it back as objective hiring criteria.
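To make the mechanism concrete, here is a minimal sketch using synthetic data and scikit-learn. Nothing here reflects Amazon's actual model or data; the point is only that a bag-of-words classifier trained on historically skewed hire/reject labels will assign a negative weight to a gendered token even though gender itself never appears as a feature.

```python
# Minimal sketch with synthetic data (not Amazon's model or data):
# a resume classifier trained on historically skewed outcomes learns
# to penalize a gendered token.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

resumes, hired = [], []
for _ in range(2000):
    is_woman = rng.random() < 0.2          # historically male-skewed pool
    tokens = ["python", "aws"] if rng.random() < 0.7 else ["java", "sql"]
    if is_woman and rng.random() < 0.5:
        tokens.append("captain womens chess team")
    # Historical labels: men were hired at a higher rate, for reasons
    # that have nothing to do with the resume text itself.
    p_hire = 0.30 if is_woman else 0.55
    resumes.append(" ".join(tokens))
    hired.append(int(rng.random() < p_hire))

vec = CountVectorizer()
X = vec.fit_transform(resumes)
model = LogisticRegression(max_iter=1000).fit(X, hired)

# "womens" ends up with a clearly negative coefficient: the model has
# encoded the historical imbalance as if it were a property of the candidate.
coef = dict(zip(vec.get_feature_names_out(), model.coef_[0]))
print(f"weight for 'womens': {coef['womens']:+.2f}")
print(f"weight for 'python': {coef['python']:+.2f}")  # skill token, for contrast
```

The skill tokens are distributed independently of gender in this toy data, so they carry little weight; the gendered token absorbs the historical hiring gap instead.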
The more Amazon's engineers investigated, the more bias they found. The AI had learned to prefer verbs commonly used by men in technical fields: words like "executed," "captured," and "managed" scored well. Softer language associated with collaboration scored lower. The model had reverse-engineered Amazon's existing gender gap and written it into its scoring system.
Amazon's engineers tried to fix the biases, modifying the model to neutralize explicit gender signals. New biases kept surfacing through subtler proxies that had not yet been identified. By 2017, the engineering team concluded that the model could not be trusted to rank candidates without bias, no matter how many patches were applied.
They quietly disbanded the project. The tool was removed. No public announcement was made. No press release. No disclosure to regulators. Candidates who had been screened by the system had no idea it had ever existed, or that their applications had been processed through an algorithm that penalized them for being women. Amazon said the tool was never actually used to make final hiring decisions (it had been used experimentally), but the three years of its operation remained a private internal matter until a Reuters investigation changed that.
On October 9, 2018, Reuters published: "Amazon scraps secret AI recruiting tool that showed bias against women." The story drew immediate global attention. Amazon confirmed the tool had been scrapped, said it was never used in actual hiring decisions, and emphasized that gender was not a factor in its current hiring processes.
The response from lawmakers, academics, and civil rights organizations was swift. Calls for mandatory auditing of AI hiring tools intensified. The story became foundational to the growing AI ethics and algorithmic accountability movement. It is still cited in AI bias research, hiring discrimination law, and HR technology governance discussions today.
The Amazon story illustrates something deeper than one company's mistake. Any AI system trained on historical data will learn historical patterns, including historical injustices. Amazon's AI didn't decide to discriminate. It found the statistical signal that had been present in the training data all along: men had been hired more. It assumed that was the goal. It optimized for it.
The problem of algorithmic bias cannot be solved simply by removing obvious signals like "women's." Bias is encoded in the structure of historical outcomes: who was hired, promoted, paid, and retained. Any model trained on those outcomes will absorb and replicate those inequities unless explicitly prevented from doing so. Preventing it turns out to be extraordinarily hard.
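The proxy problem can be seen in a few lines of code. The sketch below again uses synthetic data and hypothetical feature names: even after the explicit "women's" token is dropped, a correlated proxy feature (say, attendance at a women's college) still receives a negative weight, because it carries the same statistical signal.

```python
# Minimal sketch with synthetic data and hypothetical feature names:
# dropping the explicit token does not remove the learned bias,
# because a correlated proxy feature carries the same signal.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000

is_woman = rng.random(n) < 0.3
womens_token = (is_woman & (rng.random(n) < 0.6)).astype(float)  # explicit signal
proxy = (is_woman & (rng.random(n) < 0.7)).astype(float)         # e.g. a women's college
skill = rng.random(n)                                            # genuinely job-relevant

# Historical labels reflect the old imbalance, not skill alone.
hired = (rng.random(n) < 0.2 + 0.5 * skill - 0.25 * is_woman).astype(int)

X_full = np.column_stack([womens_token, proxy, skill])
X_scrubbed = np.column_stack([proxy, skill])   # explicit token removed

for name, X in [("with explicit token", X_full), ("token removed      ", X_scrubbed)]:
    m = LogisticRegression().fit(X, hired)
    print(name, "coefficients:", np.round(m.coef_[0], 2))
# In both runs the proxy's coefficient stays clearly negative:
# scrubbing the obvious word changes nothing about the learned bias.
```

This is the pattern Amazon's engineers kept running into: each fix removed a symptom, while the underlying correlation between outcome and gender remained available to the model through other features.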
Amazon's case became a founding case study for algorithmic hiring audits. Today, New York City's Local Law 144 requires bias audits of automated hiring tools used to screen candidates in the city, and the EU's AI Act classifies AI used in employment as high-risk, with its own compliance obligations. The hiring AI that nobody was supposed to know about has shaped how governments regulate AI employment tools worldwide.