For decades, scientific productivity has struggled despite increasing funding and larger research teams. Papers became more incremental, experimental projects stretched for years, and the cost of acquiring new data kept rising. A single protein structure determination could require an entire PhD program and hundreds of thousands of dollars. That narrative changed dramatically in late 2024, when Demis Hassabis and John Jumper received a share of the Nobel Prize in Chemistry for their work on AlphaFold. The recognition marked a pivotal moment: an AI system could predict protein structures at near-atomic accuracy, compressing what once took years into minutes.
Now the paradigm is expanding even further. AI is no longer just an assistant to scientists but is beginning to generate hypotheses, design experiments, and suggest drug candidates. In February 2025, Google unveiled its AI Co-Scientist, a Gemini-based multi-agent system that formulates and evaluates research proposals. We are witnessing the beginning of what researchers call a “magic cycle” in science, where algorithmic tools accelerate discovery, leading to new experiments and data that further refine the algorithms themselves.
AlphaFold: The Protein Revolution That Won a Nobel Prize
The Critical Assessment of Structure Prediction (CASP) is a biennial, community-wide experiment that objectively evaluates protein structure prediction methods. Research groups receive sequences of proteins whose structures have not yet been made public and submit predicted models. The main evaluation metric is the Global Distance Test – Total Score (GDT_TS), which measures how closely a predicted structure matches the experimental one. Scores range from 0 to 100, with higher values indicating better accuracy.
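To make the metric concrete, here is a minimal sketch of the GDT_TS calculation: it averages the fraction of residues whose modeled positions fall within 1, 2, 4, and 8 Å of the experimental structure after superposition. The per-residue deviations below are invented for illustration.

```python
def gdt_ts(distances):
    """Toy GDT_TS: average, over the cutoffs 1, 2, 4, and 8 Angstroms,
    of the fraction of residues within that distance of the experimental
    structure (optimal superposition is assumed to be done already)."""
    cutoffs = (1.0, 2.0, 4.0, 8.0)
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Hypothetical per-residue deviations (in Angstroms) for an 8-residue model
print(gdt_ts([0.5, 0.9, 1.5, 2.2, 3.0, 5.0, 7.0, 12.0]))  # → 53.125
```

Note how the multiple cutoffs reward partial correctness: a model that nails half the chain still scores well above zero even if the rest is far off.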
Until 2018, the best methods rarely exceeded a median GDT_TS of about 40 on the hardest targets. AlphaFold 1 raised this to approximately 60 at CASP13, providing the first sign that deep learning could outperform physics-based methods. Then came AlphaFold 2 in 2020, achieving a median GDT_TS of 92.4 in CASP14. The result was so accurate that many commentators declared the protein folding problem “solved.”
AlphaFold Version Comparison
| Feature | AlphaFold 1 (2018) | AlphaFold 2 (2020) | AlphaFold 3 (2024) |
|---|---|---|---|
| CASP Score | ~60 median GDT_TS | 92.4 median GDT_TS | Not CASP-entered |
| Prediction Scope | Single proteins | Single proteins | All molecules (proteins, DNA, RNA, ligands, ions) |
| Architecture | Deep learning | Evoformer + Structure Module | Pairformer + Diffusion Model |
| Key Innovation | Beat physics-based methods | Solved protein folding | Predicts molecular interactions |
| Protein-Ligand Accuracy | N/A | Limited | 50% higher than previous methods |
| Recognition | CASP13 winner | Transformational | Nobel Prize (2024) |
AlphaFold 3: The Next Evolution
AlphaFold 3, released in May 2024, represents a significant advancement. Rather than focusing solely on protein chains, AF3 predicts interactions of all of life’s molecules, including proteins, DNA, RNA, ligands, and ions.
The architecture features two key innovations. First, the Pairformer replaces AF2’s MSA-heavy Evoformer with a simpler module that processes a reduced set of multiple sequence alignments (MSAs) and template structures, cutting computational overhead while preserving evolutionary context. Second, AF3 generates structures with a diffusion model: the decoder is trained to remove noise from atomic coordinates and, at inference time, starts from random coordinates and denoises them step by step into a 3D structure. The process mirrors the diffusion models used in image generation and lets AF3 model interactions among proteins, nucleic acids, and small molecules rather than only individual protein chains.
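The denoising idea can be illustrated with a toy sketch. This is emphatically not AF3’s actual network: the learned denoiser is replaced by a hand-written nudge toward a known answer, and the “structure” is just a random set of 3D points, but the iterate-from-noise loop is the same shape.

```python
import random

random.seed(0)

# Toy "ground truth": 3D coordinates of a 5-atom structure.
clean = [[random.gauss(0, 1) for _ in range(3)] for _ in range(5)]

def toy_denoiser(coords, target, strength=0.3):
    # Stand-in for the trained denoising network: nudges noisy
    # coordinates toward the clean structure. A real model predicts
    # this correction from sequence and pairwise features instead.
    return [[x + strength * (t - x) for x, t in zip(row, trow)]
            for row, trow in zip(coords, target)]

# Start from pure noise and denoise iteratively, as AF3's
# diffusion decoder does conceptually.
coords = [[random.gauss(0, 5) for _ in range(3)] for _ in range(5)]
for _ in range(50):
    coords = toy_denoiser(coords, clean)

error = max(abs(x - t) for row, trow in zip(coords, clean)
            for x, t in zip(row, trow))
print(error < 1e-5)  # → True: the noise has been removed
```

Each step removes a fraction of the remaining noise, so the error shrinks geometrically; the real system’s denoiser must infer the target structure rather than being handed it.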
On the PoseBusters benchmark of protein-ligand complexes, AlphaFold 3 achieves 50% higher accuracy than previous methods and outperforms physics-based tools.
Global Impact and Adoption
DeepMind released a free AlphaFold Protein Structure Database containing predicted structures for over 200 million proteins, covering almost every cataloged protein known to science. The adoption statistics are remarkable:
| Metric | Value |
|---|---|
| Total Structures | 200+ million |
| Database Users | 3+ million |
| Countries Reached | 190+ |
| Users from Low/Middle Income Countries | 1+ million |
| Total Data Downloaded | 23 TB |
| AlphaFold 2 Paper Citations | ~43,000 |
| AlphaFold 3 Paper Citations | 9,000+ |
| Papers Citing AlphaFold | 35,000+ |
| Annual Citation Growth (2019-2024) | ~180% |
An independent analysis by the Innovation Growth Lab suggests that researchers using AlphaFold 2 see an increase of over 40% in their submission of novel experimental protein structures. These protein structures are more likely to be dissimilar to known structures, encouraging exploration of previously uncharted areas of science.
Real-World Applications
AlphaFold has transformed numerous scientific domains:
Malaria Vaccine Development: Researchers used AlphaFold predictions to model antigens from Plasmodium parasites and design stable immunogens, accelerating the selection of vaccine candidates.
Cancer Research: Scientists have employed AlphaFold to understand the structures of oncogenic proteins and identify cryptic binding sites for targeted therapies.
Enzyme Engineering: The database has guided the design of enzymes for industrial biocatalysis and engineering enzymes that break down plastics.
Agriculture: AlphaFold-derived structures have guided the engineering of drought-resistant crops by revealing how plant proteins respond to stress.
Perhaps the most inspiring story of democratization comes from Turkish undergraduate students Alper and Taner Karagöl. Working remotely from Adana, they taught themselves structural biology through AlphaFold tutorials and, without formal training in the field, published 15 research papers using AlphaFold-predicted structures.
Isomorphic Labs: Commercializing AlphaFold
DeepMind spun out Isomorphic Labs to commercialize AlphaFold for drug discovery. The company partners with pharmaceutical firms including Eli Lilly and Novartis to use AlphaFold’s structural predictions alongside generative models that design candidate molecules. Isomorphic Labs is set to advance its first AI-designed drug candidate into clinical trials by the end of 2025.
Google’s AI Co-Scientist: A Hypothesis Engine
Released in February 2025, Google’s AI Co-Scientist builds on the Gemini 2.0 large language model but departs from single-model paradigms. The system comprises specialized agents orchestrated by a Supervisor:
| Agent | Function |
|---|---|
| Generation Agent | Synthesizes literature and proposes initial research hypotheses |
| Reflection Agent | Critiques its own hypotheses, identifying weak assumptions |
| Ranking Agent | Conducts tournament-style comparisons using Elo rating system |
| Evolution Agent | Iteratively refines promising hypotheses |
| Proximity Agent | Assesses novelty by measuring deviation from existing literature |
| Meta-review Agent | Synthesizes feedback patterns and identifies successful reasoning chains |
This multi-agent architecture leverages test-time compute scaling, a strategy that allocates more computational resources during inference. The system spends additional time reasoning, debating ideas through self-play, and reranking proposals. As it spends more time refining, output quality improves, surpassing baseline models and, in the reported evaluations, unassisted human experts.
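As a rough illustration of the Ranking Agent’s tournament, here is a sketch built on the standard Elo update. The hypothesis names and hidden quality scores are invented, and a noisy numeric comparison stands in for the LLM debate that judges real matches.

```python
import random

random.seed(42)

def elo_update(r_a, r_b, a_wins, k=32.0):
    """Standard Elo: the winner takes points from the loser,
    scaled by how surprising the result was."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * ((1.0 if a_wins else 0.0) - expected_a)
    return r_a + delta, r_b - delta

# Hypothetical hypotheses with hidden quality; in the real system an
# LLM debate decides each match, not a number comparison.
quality = {"H1": 0.9, "H2": 0.5, "H3": 0.2}
ratings = {h: 1200.0 for h in quality}

for _ in range(200):
    a, b = random.sample(list(quality), 2)
    a_wins = (quality[a] + random.gauss(0, 0.1)
              > quality[b] + random.gauss(0, 0.1))
    ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], a_wins)

ranked = sorted(ratings, key=ratings.get, reverse=True)
print(ranked)  # strongest hypothesis floats to the top
```

Pairwise tournaments like this are attractive because judging “which of these two is better?” is far easier than assigning an absolute quality score to a single hypothesis.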
Validated Case Studies
Drug Repurposing for Acute Myeloid Leukemia (AML): The AI Co-Scientist generated novel repurposing hypotheses for this poor-prognosis cancer. In partnership with oncologists, one of the AI’s top suggestions was KIRA6, an IRE1α inhibitor originally developed for unrelated indications. Subsequent experiments showed that KIRA6 reduced AML cell viability at clinically relevant concentrations. Notably, these candidates were not obvious from the existing literature, and the AI identified them within days, whereas human teams might have taken months.
Liver Fibrosis Target Discovery: The AI Co-Scientist proposed focusing on epigenetic regulators including histone deacetylases (HDACs), DNA methyltransferase 1 (DNMT1), and bromodomain-containing protein 4 (BRD4). In experiments using human hepatic organoids, inhibitors of HDACs and BRD4 showed significant anti-fibrotic activity with p-values below 0.01. A follow-up study at Stanford found that these AI-suggested inhibitors outperformed human-selected treatments.
Bacterial Gene Transfer Mechanisms: Scientists at Imperial College London challenged the AI Co-Scientist to generate hypotheses about how capsid-forming phage-inducible chromosomal islands (cf-PICIs) transfer between bacteria. Remarkably, the AI independently proposed that cf-PICIs interact with diverse phage tails to expand their host range, exactly matching unpublished experimental results. This discovery took human scientists nearly a decade but took the AI only 48 hours.
The Broader AI Science Ecosystem
The AI renaissance in science extends beyond AlphaFold and Google’s Co-Scientist. A constellation of tools is being developed to tackle different stages of the research pipeline.
AlphaEvolve: Coding Algorithms with Gemini
DeepMind’s AlphaEvolve couples Gemini Pro and Gemini Flash models with automated evaluators in an evolutionary framework. Gemini Flash explores a wide search space, while Gemini Pro performs deeper reasoning. Candidate algorithms are evaluated, mutated, and recombined in a process similar to natural selection.
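The evolutionary loop can be sketched schematically. This toy version evolves a vector of coefficients toward a target under an automated evaluator; AlphaEvolve instead mutates actual code with Gemini and scores it against real benchmarks, but the select-mutate-evaluate cycle is the same.

```python
import random

random.seed(0)

def evaluate(candidate):
    # Automated evaluator: negative squared error against a target
    # (AlphaEvolve benchmarks real programs here instead).
    target = [3.0, -2.0, 1.0]
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))

def mutate(candidate):
    # Small random edit; in the real system Gemini proposes
    # code modifications rather than numeric perturbations.
    child = list(candidate)
    child[random.randrange(len(child))] += random.gauss(0, 0.5)
    return child

population = [[0.0, 0.0, 0.0] for _ in range(20)]
for generation in range(200):
    scored = sorted(population, key=evaluate, reverse=True)
    parents = scored[:5]                       # elitist selection
    population = parents + [mutate(random.choice(parents))
                            for _ in range(15)]

best = max(population, key=evaluate)
print(best)  # close to the target [3.0, -2.0, 1.0]
```

Keeping the top candidates unchanged each generation (elitism) guarantees the best score never regresses, which is why such loops reliably converge given enough evaluations.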
| Achievement | Impact |
|---|---|
| Matrix Multiplication Kernel Optimization | 23% speedup for Gemini training |
| Overall Training Time Reduction | 1% end-to-end savings |
| 4×4 Complex Matrix Multiplication | Beat Strassen’s 56-year-old algorithm (48 vs 49 multiplications) |
| Data Center Efficiency (Borg) | Recovers 0.7% of Google’s worldwide compute resources |
| FlashAttention Kernel | Up to 32.5% speedup |
| Open Math Problems | Matched the state of the art in ~75% of cases; improved on it in ~20% |
The system discovered an algorithm that multiplies 4×4 complex-valued matrices using 48 scalar multiplications, the first improvement over the 49 obtained by applying Strassen’s 1969 algorithm recursively. This seemingly small gain matters because matrix multiplication underpins virtually all of modern computing and AI.
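For context, Strassen’s trick at the 2×2 level uses 7 multiplications instead of the naive 8; applied recursively to 4×4 matrices it needs 7 × 7 = 49, the count that AlphaEvolve’s 48-multiplication algorithm finally beat. A minimal demonstration of the 2×2 base case:

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with Strassen's 7 multiplications
    (the naive method needs 8). Works for real or complex entries."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

def naive_2x2(A, B):
    # Standard definition: 8 multiplications, for comparison.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen_2x2(A, B) == naive_2x2(A, B))  # → True
```

Trading one multiplication for several extra additions pays off because, applied recursively to large matrices, the multiplication count dominates the cost.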
FutureHouse: Modular Agents for Literature and Chemistry
The FutureHouse platform offers four specialized agents:
- Crow: Performs broad literature search across high-quality open-access papers
- Falcon: Conducts deeper reviews of specific topics
- Owl: Answers “has anyone done X?” by identifying prior art
- Phoenix: Plans and optimizes chemistry experiments based on the ChemCrow framework
Benchmarking shows that these agents outperform state-of-the-art retrieval models on precision and accuracy and even exceed PhD-level human researchers in some literature search tasks.
Sakana AI Scientist: End-to-End Paper Generation
Tokyo-based startup Sakana AI developed The AI Scientist, a fully automated pipeline for producing machine-learning research. It comprises four loops: idea generation, experimental iteration, paper write-up, and automated review. Each loop feeds into the next, enabling the system to continuously refine research directions.
Remarkably, the system can produce a complete research paper for approximately $15, including code, experiments, plots, LaTeX write-up, and peer-review feedback. While currently focused on machine-learning tasks, the concept foreshadows a future where AI systems autonomously produce scholarly output across numerous domains.
NVIDIA BioNeMo: Industry-Scale Drug Discovery
NVIDIA’s BioNeMo platform provides cloud-based generative models and accelerated libraries for drug discovery. The platform offers:
| Capability | Performance Improvement |
|---|---|
| Protein Structure Prediction | 5× to 6.2× speedup |
| Docking Calculations | 5× to 6.2× speedup |
| De Novo Molecule Design | Accelerated generation |
| Virtual Screening | Faster compound evaluation |
Major pharmaceutical and tech-bio leaders have adopted BioNeMo, with Argonne National Laboratory contributing billion-parameter models that scale efficiently on NVIDIA GPUs.
OpenAI Deep Research
In February 2025, OpenAI launched Deep Research, a multi-step research tool integrated into ChatGPT. Deep Research uses a specially optimized version of the o3 model to search, interpret, and analyze large volumes of text, images, and PDFs, synthesizing hundreds of sources into comprehensive reports.
The tool can complete tasks that would take humans many hours in just 5 to 30 minutes, generating reports with citations and reasoning steps. On the challenging “Humanity’s Last Exam” benchmark, which spans over 100 expert domains, Deep Research achieved 26.6% accuracy, the highest published score at the time of launch.
Berkeley Lab: AI + Automation
At Lawrence Berkeley National Laboratory, AI and robotics combine to accelerate materials science:
- A-Lab: Uses AI algorithms to propose new compounds while robots synthesize and test them
- Autobot: Robotic system that explores chemical reaction spaces to identify catalysts
- BELLA Laser Accelerator: Machine learning models optimize and stabilize beam quality
- Distiller Pipeline: Analyzes electron microscopy data in near real-time
Critical Analysis: Promise vs. Reality
The Breadth-and-Depth Conundrum
Modern science confronts a paradox: breakthroughs require deep domain expertise yet increasingly emerge at the intersection of disciplines. Human scientists can rarely master both the breadth of cross-domain knowledge and the depth of specialized methods. AI helps by bridging these divides, synthesizing literature across biology, chemistry, materials science, and computing.
Risks and Limitations
However, the hype around AI in science can overstate its maturity. Kriti Gaur of the biotech data company Elucidata cautioned that until AI systems deliver genuinely original, verifiable insights that withstand scientific scrutiny, they remain “powerful assistants but not true co-scientists.”
Key concerns include:
- Closed-loop recycling: Models trained on existing literature may simply regurgitate known ideas without genuine discovery
- Training data biases: Protein sequences from well-studied organisms dominate public databases, potentially skewing predictions
- Static predictions: AlphaFold has difficulty modeling dynamic conformational changes or binding kinetics
- Human oversight needs: AI suggestions can sometimes replicate existing knowledge or propose unfeasible experiments
Democratization and Equity
AI can democratize science. The story of the Karagöl brothers shows that advanced research tools are no longer confined to elite labs. Open-access frameworks like FutureHouse and BioNeMo lower entry barriers by providing free or low-cost access to knowledge and computation. Deep Research allows researchers without extensive library access to synthesize information quickly.
Yet equitable access depends on broadband infrastructure, computational resources, and multilingual support. There is a risk that AI tools could deepen divides if their benefits accrue mainly to well-resourced institutions.
Patent Law Challenges
AI-driven discovery raises questions for intellectual property. The ability of AI to enumerate huge numbers of protein structures and antibody variants challenges existing patent frameworks. Some scholars argue that broad claims may become indefensible when AI can trivially predict all variants. Legal commentators note a growing consensus that AI can help satisfy enablement requirements by generating predictive data and structural insights, potentially reshaping patent law.
Future Trajectory and Implications
Merging Narrow and General Intelligence
Researchers envision a fusion of domain-specific models like AlphaFold with general-purpose language models. John Jumper has suggested that future systems will combine AlphaFold’s deep, narrow expertise with the broad reasoning of large language models to handle tasks such as protein design, mutagenesis, and drug discovery simultaneously.
Next-Generation Tools
Several projects hint at what comes next:
- Boltz-2 (MIT and Recursion): Uses physics-inspired deep learning to model entire protein families and predict folding kinetics
- Pearl (Genesis Molecular AI): Combines diffusion models with reinforcement learning to design small molecules from scratch
- Genesis Mission: A U.S. Department of Energy initiative that aims to connect AI co-scientist systems with its 17 national laboratories
Impact on Discovery Timelines
If these trajectories hold, drug discovery timelines could shrink from years to months, and materials discovery could follow similar curves. Generative models might allow researchers to screen billions of compounds in silico and then focus wet-lab efforts on the most promising few.
Closing Thoughts
We stand at an inflection point in scientific discovery. The combination of AlphaFold’s structural insights, AI Co-Scientist’s hypothesis generation, and a growing ecosystem of AI-powered tools heralds a future where knowledge creation is no longer limited by human bandwidth. Instead of linear progress, we may see exponential improvements as data and algorithms reinforce each other.
Yet the promise of this magic cycle will only be realized if we remain vigilant about ethics, bias, equity, and human oversight. The next decade will test our ability to harness AI not as a replacement for scientists but as a co-learner, helping us explore the unknown faster and more collaboratively than ever before.
Sources:
- AlphaFold - Google DeepMind
- AlphaFold: Five Years of Impact - Google DeepMind
- Google DeepMind and Isomorphic Labs introduce AlphaFold 3
- AlphaFold - Wikipedia
- Accelerating scientific breakthroughs with an AI co-scientist - Google Research
- AlphaEvolve: A Gemini-powered coding agent - Google DeepMind
- FutureHouse Platform - AI Agents for Scientific Discovery
- The AI Scientist - Sakana AI
- NVIDIA BioNeMo for Biopharma
- Introducing Deep Research - OpenAI
- How AI and Automation are Speeding Up Science - Berkeley Lab
- CASP - Wikipedia
- pLDDT: Understanding local confidence - AlphaFold