Developing custom empirical software for each research challenge has long been a major bottleneck, consuming valuable time and slowing scientific progress. Google Research is applying an AI system that uses large language models (LLMs) to automate and speed up the creation of expert-level scientific software. The approach could let scientists test new ideas and hypotheses in days rather than months.
AI as a Code-Optimizing Research Engine
Manual scientific coding is often tedious, requiring researchers to implement, debug, and optimize new solutions for every experiment. Google's AI system reimagines the process by functioning as a systematic code-optimizing research engine.
Given a clearly defined problem and evaluation metric, the system generates and tests thousands of code versions, using a tree search approach inspired by AlphaZero’s mastery of complex games.
Figure: Schematic of the algorithm that feeds a scorable task and research ideas to an LLM, which generates evaluation code in a sandbox. This code is then used in a tree search, where new nodes are created and iteratively improved using the LLM. Credit: Google Research
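To make the search loop concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Google's implementation: the candidate "program" is reduced to a single number, `score_candidate` is a made-up task, `propose_rewrite` stands in for the LLM rewrite step with a random perturbation, and the UCB-style selection rule is an assumed choice, since the article only says the search is inspired by AlphaZero.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """One candidate solution in the search tree."""
    candidate: float   # stand-in for a full program (assumption)
    score: float       # result of evaluating the candidate on the task
    visits: int = 0    # how many times this node has been expanded
    children: list = field(default_factory=list)


def score_candidate(candidate: float) -> float:
    """Hypothetical scorable task: higher is better, best at 3.0."""
    return -(candidate - 3.0) ** 2


def propose_rewrite(candidate: float, rng: random.Random) -> float:
    """Placeholder for the LLM rewrite step: a small random change."""
    return candidate + rng.uniform(-1.0, 1.0)


def select(nodes: list, total_visits: int, c: float = 1.0) -> Node:
    """Pick a node by score plus a bonus for rarely expanded nodes."""
    def ucb(n: Node) -> float:
        bonus = c * math.sqrt(math.log(total_visits + 1) / (n.visits + 1))
        return n.score + bonus
    return max(nodes, key=ucb)


def tree_search(root_candidate: float, iterations: int, seed: int = 0) -> Node:
    """Grow the tree one child per iteration and return the best node."""
    rng = random.Random(seed)
    root = Node(root_candidate, score_candidate(root_candidate))
    nodes = [root]
    for step in range(iterations):
        parent = select(nodes, step)
        parent.visits += 1
        child_candidate = propose_rewrite(parent.candidate, rng)
        child = Node(child_candidate, score_candidate(child_candidate))
        parent.children.append(child)
        nodes.append(child)
    return max(nodes, key=lambda n: n.score)


best = tree_search(root_candidate=0.0, iterations=200)
print(f"best candidate: {best.candidate:.3f}, score: {best.score:.4f}")
```

In a real system each node would hold a full program and scoring would mean running it in a sandbox against the task's metric. The shape of the loop is the same: select a promising node, ask for a rewrite, score the result, and add it to the tree.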
Inside the AI-Driven Workflow
- Input: Each project begins with a scorable task, including a detailed problem statement, a metric for success, and relevant datasets.
- Idea Generation: The system proposes research strategies and writes executable code, using LLMs to refine each solution.
- Optimization: Through tree search, the system explores a vast space of possible code paths, rapidly homing in on the most promising directions.
- Validation: Every output is designed to be functional, verifiable, and reproducible—meeting the rigorous standards of scientific software.
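A scorable task, as described in the workflow above, can be pictured as a problem statement bundled with a metric and a dataset. The Python sketch below is a hypothetical illustration: `ScorableTask`, its field names, the toy dataset, and the choice of mean absolute error as the metric are all assumptions made for the example, not details from Google's system.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Example = Tuple[float, float]  # (input, target) pair


@dataclass(frozen=True)
class ScorableTask:
    """Hypothetical container for the three inputs named in the workflow."""
    problem_statement: str
    metric: Callable[[Sequence[float], Sequence[float]], float]
    dataset: Sequence[Example]

    def score(self, candidate: Callable[[float], float]) -> float:
        """Run a candidate on every example and apply the metric."""
        predictions = [candidate(x) for x, _ in self.dataset]
        targets = [y for _, y in self.dataset]
        return self.metric(predictions, targets)


def mean_absolute_error(predictions: Sequence[float],
                        targets: Sequence[float]) -> float:
    """Average absolute difference; lower is better."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


task = ScorableTask(
    problem_statement="Predict y from x (toy data generated by y = 2x + 1).",
    metric=mean_absolute_error,
    dataset=[(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
)


def baseline(x: float) -> float:
    return 2.0 * x          # ignores the intercept


def improved(x: float) -> float:
    return 2.0 * x + 1.0    # matches the toy data exactly


print(task.score(baseline))  # 1.0
print(task.score(improved))  # 0.0
```

Because scoring depends only on the candidate, the metric, and a fixed dataset, re-running it gives the same number every time. That determinism is the property that makes candidates comparable and results reproducible.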
Breakthrough Results Across Multiple Disciplines
The system’s capabilities were tested on six demanding benchmarks, spanning fields as diverse as genomics, public health, geospatial analysis, neuroscience, mathematics, and time-series forecasting. The AI consistently matched or exceeded expert-level solutions:
- Genomics: It discovered 40 new approaches for single-cell RNA sequencing data integration, with the top solution outperforming published methods by 14%.
- Public Health: The AI developed 14 models for COVID-19 forecasting, surpassing the CDC’s leading ensemble in accuracy.
- Geospatial Analysis: It delivered advanced semantic segmentation models for remote sensing, edging past state-of-the-art baselines.
- Neuroscience: The system created a new time-series model for predicting whole-brain neural activity, improving upon previous computationally heavy benchmarks.
- Mathematics & Forecasting: The system solved complex integrals that eluded standard methods and developed a time-series forecasting library that performed well across varied datasets.
Redefining the Scientific Process
This technology tackles a persistent hurdle in science: the slow, manual cycle of software development and hypothesis evaluation. By automating and optimizing empirical software, the AI system enables researchers to explore hundreds or thousands of ideas in a fraction of the time. This rapid iteration accelerates discovery and lets scientists focus on creativity and critical thinking.
A Paradigm Shift for Computational Science
AI-powered empirical software signifies a turning point for computational research. With its ability to generate, test, and optimize sophisticated code at unmatched speed and scale, this approach promises to reshape how scientists address complex scientific and societal questions.