Evo 2 becomes the largest AI model for biology, capable of writing genomes from scratch

Researchers from the Arc Institute, Stanford University, and Nvidia have unveiled Evo 2, the largest generative AI model for biology to date. Trained on 128,000 genomes spanning all domains of life, Evo 2 has learned to write entire chromosomes and small genomes from scratch.

Unlike earlier biological models, which focused mainly on bacteria, Evo 2 was also trained on human, plant, and yeast DNA. That extends it into eukaryotic genomes, where gene regulation and expression are far more intricate. The training data comprised 9.3 trillion DNA letters (A, T, C, G) drawn from a new open dataset called OpenGenome2.

The model's ability to both generate genetic sequences and predict the impact of DNA mutations, in coding and non-coding regions alike, is unprecedented. In one test, Evo 2 outperformed state-of-the-art models at identifying harmful mutations in BRCA1, a gene linked to breast cancer, using only DNA sequences as input.
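How a sequence-only model can flag a harmful mutation is easier to see with a toy example. One common approach for generative sequence models is to compare how likely the model finds the mutated sequence versus the original. The sketch below is an illustration under loud assumptions: it uses a tiny Markov model as a stand-in, not Evo 2, and the function names (train_markov, log_likelihood, variant_score) are hypothetical. The article does not describe Evo 2's actual scoring procedure.

```python
import math
from collections import defaultdict

BASES = "ACGT"

def train_markov(seqs, k=2):
    """Count which base follows each k-letter context (toy stand-in for a trained model)."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1
    return counts

def log_likelihood(seq, counts, k=2):
    """Sum of log P(base | previous k bases), with add-one smoothing."""
    total = 0.0
    for i in range(k, len(seq)):
        ctx = counts.get(seq[i - k:i], {})
        n = sum(ctx.values())
        total += math.log((ctx.get(seq[i], 0) + 1) / (n + len(BASES)))
    return total

def variant_score(ref, pos, alt, counts, k=2):
    """Log-likelihood of the mutated sequence minus that of the reference."""
    mutated = ref[:pos] + alt + ref[pos + 1:]
    return log_likelihood(mutated, counts, k) - log_likelihood(ref, counts, k)

# Hypothetical data: "train" on a repetitive sequence, then score a T-to-C change.
counts = train_markov(["ATGATGATGATG"])
score = variant_score("ATGATGATG", 4, "C", counts)
print(score)  # negative: the toy model finds the mutated sequence less likely
```

A strongly negative score means the change breaks a pattern the model has learned, which is the intuition behind treating it as potentially disruptive. A real model would learn those patterns from trillions of DNA letters rather than one short string.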

Evo 2 is built like a large language model (LLM), similar to those behind AI chatbots, but it operates in the language of life: the four-letter DNA alphabet. Its architecture lets it process long DNA sequences in a single pass, which is crucial for understanding how distant DNA regions interact, especially in complex organisms.
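To make the language-model analogy concrete, here is a minimal sketch of how DNA can be framed as a next-letter prediction problem, the same setup chatbots use for words. The token ids, the encode and next_token_pairs helpers, and the three-letter context are all assumptions made for illustration. They are not Evo 2's tokenizer or architecture, whose context is far longer.

```python
# Toy illustration only: hypothetical token ids, not Evo 2's actual vocabulary.
VOCAB = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode(seq):
    """Map a DNA string to a list of integer token ids."""
    return [VOCAB[base] for base in seq.upper()]

def next_token_pairs(tokens, context):
    """Yield (preceding window, next token) pairs, as in language-model training."""
    for i in range(1, len(tokens)):
        start = max(0, i - context)
        yield tokens[start:i], tokens[i]

tokens = encode("ATGCGT")
pairs = list(next_token_pairs(tokens, context=3))
for window, target in pairs:
    print(window, "->", target)
```

The size of the window is the key design choice. A model that sees only a few letters at a time cannot connect a gene to a regulatory region far away on the same chromosome, which is why long-sequence processing matters for complex genomes.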

In further demonstrations, the model generated 250 complete human mitochondrial genomes, designed a minimal bacterial genome, created a 16,000-letter yeast chromosome, and even embedded a Morse code message inside a mouse genome. These sequences have yet to be tested in living cells, but experiments are underway.

What sets Evo 2 apart is not just its generative power but its open access. The researchers have made both the code and model parameters publicly available and launched a user-friendly web interface to foster global collaboration in the field of generative biology.

With Evo 2, scientists are now one step closer to building custom DNA blueprints, mapping the effects of genetic mutations, and even designing entirely synthetic life forms.

“This represents a key moment in generative biology,” said study co-author Patrick Hsu. “For the first time, machines can fluently read, write, and reason in the language of DNA.”