AI-Powered Plant Materials Discovery: Large Language Models Extract Nature's Engineering Secrets

Generative AI Bridges Plant Science and Materials Engineering for Sustainable Innovation
Markus J. Buehler, Rachel K. Luu, Jingyu Deng, Mohammed Shahrudin Ibrahim, Nam-Joon Cho, Ming Dao, Subra Suresh

From Ancient Wisdom to Modern Innovation

Plants have spent hundreds of millions of years perfecting sophisticated engineering solutions that humans are only beginning to understand. From the elegant mechanics of pine cone scales that open and close with humidity changes to the remarkable adhesive properties of pollen grains, nature has developed responsive materials that adapt dynamically to environmental conditions. 

Now, a research collaboration between MIT and Nanyang Technological University has demonstrated how artificial intelligence can systematically extract these biological design principles and translate them into new materials.

Released as a preprint in August 2025 by Markus Buehler and colleagues, this work presents a framework for using large language models (LLMs) to bridge the traditionally separate domains of plant science, biomimetics, and materials engineering.

The research establishes a new paradigm where AI serves not merely as a search tool, but as an active collaborator in scientific discovery, capable of generating novel hypotheses and experimental designs from vast, interdisciplinary knowledge bases.

Key Breakthrough Achievements

  • First demonstration of AI-driven cross-disciplinary materials design using plant-inspired principles

  • Development of BioinspiredLLM, a specialized AI model fine-tuned for biomimetic materials research

  • Creation of novel pollen-based adhesive with tunable morphology and measured shear strength

  • Validation of AI-generated experimental procedures through successful laboratory implementation

  • Establishment of a hierarchical sampling strategy that generates hundreds of testable hypotheses from a single query

  • Demonstration of effective human-AI collaboration in materials discovery workflows

The Evolution of Materials Science Through AI

Materials science has traditionally relied on empirical experimentation and human intuition to discover new compounds and structures. Recent advances in machine learning have begun transforming this landscape, with researchers developing sophisticated approaches for computational materials discovery. 

A comprehensive review by Miret and Krishnan (2024) highlighted both the potential and the current limitations of large language models in materials science. The authors noted that while LLMs show promise for accelerating materials understanding, they currently fall short as practical research tools because they struggle to comprehend complex, interconnected materials knowledge.

The emergence of retrieval-augmented generation (RAG) systems has addressed many of these limitations by grounding AI models in verified scientific literature. PaperQA, developed by Lala et al. (2023), demonstrated how RAG agents can effectively answer questions across scientific literature by performing information retrieval, assessing source relevance, and providing evidence-based responses. This foundation enabled the development of more sophisticated systems capable of generating novel scientific hypotheses.
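
The retrieve-then-answer loop described above can be sketched roughly as follows. This is a minimal illustration and not PaperQA's actual implementation: the tiny corpus, the bag-of-words cosine scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for this example. A production system would use learned embeddings and pass the assembled prompt to a language model.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; the sentences paraphrase claims from this article.
CORPUS = {
    "pine_cone": "Pine cone scales open and close with humidity changes.",
    "pollen": "Pollen grains exhibit adhesive properties on surfaces.",
    "alloys": "High-entropy alloys were discovered with LLM hypotheses.",
}

def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, corpus, k=1):
    """Return up to k document ids with nonzero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), doc_id)
              for doc_id, text in corpus.items()]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]

def build_prompt(query, corpus, k=1):
    """Assemble a prompt that grounds the question in retrieved passages."""
    evidence = "\n".join(f"[{d}] {corpus[d]}" for d in retrieve(query, corpus, k))
    return f"Evidence:\n{evidence}\n\nQuestion: {query}"
```

Calling `retrieve("Which plant structures respond to humidity?", CORPUS)` selects the pine cone passage, because it is the only one sharing a term with the query. Keyword overlap is used here only to keep the sketch short; it is exactly the limitation that embedding-based retrieval is meant to overcome.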

Recent breakthroughs in AI-driven materials discovery have shown remarkable success across various domains. Liu et al. (2024) demonstrated how LLMs can generate materials design hypotheses that extend beyond traditional human knowledge, leading to experimentally validated innovations in high-entropy alloys and halide solid electrolytes. 

Similarly, AutoMAT, developed by Yang et al. (2025), achieved large efficiency gains in alloy discovery, reducing the discovery timeline from years to weeks while identifying materials with superior properties.

Why Plant-Inspired AI Materials Matter

The integration of biological design principles with artificial intelligence represents a convergence of three powerful scientific trends: the growing recognition that nature provides optimized engineering solutions, the emergence of AI as a tool for scientific discovery, and the urgent need for sustainable materials that can adapt to environmental challenges.

Plants have evolved sophisticated mechanisms for environmental responsiveness that surpass many engineered systems. For example, pine cone scales demonstrate remarkable hygroscopic actuation, opening and closing with humidity changes through precisely controlled cellular architecture. 

Even pollen grains exhibit complex adhesive properties that enable them to attach selectively to appropriate surfaces while maintaining viability across diverse environmental conditions. These natural systems achieve functionality through hierarchical structures spanning multiple length scales, from molecular interactions to macroscopic mechanical responses.

Traditional biomimetic research has been limited by the challenge of identifying relevant biological systems and translating their principles across disciplines. Plant science literature contains vast repositories of structural and functional information, but this knowledge often remains inaccessible to materials researchers due to specialized terminology and domain-specific publication venues. The authors' AI-driven approach overcomes these barriers by systematically mining and connecting information across previously unconnected fields.

The environmental implications of this research are particularly significant. As global weather patterns shift due to climate change, materials that can respond adaptively to changing conditions become increasingly valuable for applications ranging from building facades to protective equipment. Plant-derived materials also offer inherent sustainability advantages, being biodegradable and renewable, while potentially reducing reliance on petroleum-based polymers.

Revolutionary AI-Driven Discovery Process

The research team developed a multi-component AI system that integrates several machine learning approaches. At its core is BioinspiredLLM, a language model fine-tuned on biomimetic literature to understand the specialized vocabulary and concepts spanning plant biology, materials science, and engineering. This specialization enables the model to make connections that would be challenging for general-purpose AI systems.

The retrieval-augmented generation component systematically searches and analyzes scientific literature to identify relevant structure-property relationships. Unlike conventional database searches that rely on keyword matching, the RAG system understands conceptual relationships and can identify relevant information even when different terminology is used across disciplines. This capability proved crucial for connecting plant science findings to materials engineering applications.

Perhaps most innovatively, the research incorporates agentic systems that can autonomously generate and evaluate experimental hypotheses. These AI agents don't simply retrieve existing information but actively synthesize new ideas by combining concepts from different sources. The hierarchical sampling strategy enables the system to generate hundreds of unique hypotheses from a single query, each with detailed experimental protocols and predicted outcomes.
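
One way to picture how a single query can fan out into hundreds of hypotheses is level-by-level expansion, where each output at one level seeds several prompts at the next. The sketch below is an assumption about the general shape of such a strategy, not the authors' implementation: `fake_llm` is a hypothetical stand-in for a model call, and the branching factors are arbitrary.

```python
def fake_llm(prompt, n):
    """Stand-in for a model call: returns n labeled placeholder completions."""
    return [f"{prompt} / option {i + 1}" for i in range(n)]

def hierarchical_sample(query, branching=(5, 5, 4)):
    """Expand a query level by level; each output seeds the next level."""
    frontier = [query]
    for n in branching:
        frontier = [child for parent in frontier for child in fake_llm(parent, n)]
    return frontier

ideas = hierarchical_sample("humidity-responsive plant materials")
print(len(ideas))  # 5 * 5 * 4 = 100 leaf hypotheses
```

Because the counts multiply across levels, even modest branching at each step yields a large pool of candidates from one starting prompt.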

The team's approach to validation represents a critical advancement in AI-assisted research. Rather than stopping at computational predictions, they implemented a complete workflow from AI-generated ideas through laboratory testing and experimental validation. This closed-loop approach ensures that AI-generated hypotheses are both scientifically sound and practically implementable.

Comprehensive Analysis of Results and Experimental Evidence

The experimental validation of the AI-generated hypotheses yielded remarkable results that demonstrate the practical value of the approach. The research team successfully fabricated multiple pollen-based adhesive formulations, with the most promising variant achieving shear strength measurements that exceeded initial AI predictions. These materials exhibited tunable morphological properties, allowing researchers to adjust adhesive performance for specific applications by modifying processing parameters suggested by the AI system.

Figure 2: Our GenAI Approach & Idea Mining Protocol a) This GenAI system shifts away from single-shot generation to hierarchical, multi-step generation sampling practices as well as employing a fine-tuned model, agentic systems and retrieval-augmented generation (RAG). b) Idea Mining Protocol, starting with a user input prompt and number of ideas desired. The two phases of Idea Mining consist of a divergent generation phase and a convergent evaluation phase which outputs human-readable files of the list of ideas, evaluated and ranked. The user then selects ideas of interest to further elaborate on them by prompting multi-agent conversation. c) Comparison of mass generation of ideas without and with the Idea Mining Protocol. d) Evaluation of 1000+ divergently generated ideas from BioinspiredLLM (BioLLM) and Llama-3.1-8b-instruct (L31) to compare effects of using a fine-tuned model versus standard foundational LLM, across a rubric regarding creativity, uniqueness, and specificity. Credit: Buehler et al.
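
The convergent evaluation phase named in the caption above, in which ideas are scored against a rubric and then ranked, might look something like the following. The rubric criteria come from the caption; the equal weighting, the numeric scale, and the example ideas and scores are hypothetical and chosen only for illustration.

```python
from statistics import mean

RUBRIC = ("creativity", "uniqueness", "specificity")

def rank_ideas(scored_ideas):
    """Sort ideas by mean rubric score, highest first (equal weights assumed)."""
    return sorted(
        scored_ideas,
        key=lambda idea: mean(idea["scores"][c] for c in RUBRIC),
        reverse=True,
    )

# Hypothetical ideas and scores, for illustration only.
ideas = [
    {"text": "idea A", "scores": {"creativity": 3, "uniqueness": 4, "specificity": 2}},
    {"text": "idea B", "scores": {"creativity": 5, "uniqueness": 4, "specificity": 4}},
    {"text": "idea C", "scores": {"creativity": 2, "uniqueness": 2, "specificity": 5}},
]
ranked = rank_ideas(ideas)
```

In a real pipeline the scores would presumably come from a model or a human reviewer rather than being typed in by hand, and the ranked list would then be written to a file for the user to select from.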

Figure 2 illustrates the hierarchical structure of the pollen-based materials, revealing how the AI system identified key structural features at multiple length scales. The microscopic analysis shows that the adhesive maintains the natural hierarchical organization of pollen grains while incorporating synthetic binding agents to enhance mechanical properties. This multi-scale approach, suggested by the AI analysis of plant literature, proved essential for achieving optimal performance.

Figure 4: Extraction & Validation of Mechanistic Insights a) System anticipating mechanical behavior via retroactive validation (predicting already proven behaviors through purposeful omission of studies) example. (Image adapted from [16]. Licensed under CC BY-NC-ND 4.0) b) System anticipating mechanical behavior via a de novo prediction, validated experimentally in the laboratory. c) LLM-generated workflow showing the identification of mechanical properties of interest, to the correlation to relevant plant structures, and then translation of the plant structure to a bioinspired engineering design. d) Identification of Structure-Property relationships regarding pollen, represented in graph format. Credit: Buehler et al.
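
Structure-property relationships of the kind mentioned in panel d can be held in a simple labeled graph. The sketch below is an assumption about one possible representation: the edges paraphrase statements made elsewhere in this article and are not taken from the paper's own graph, and the relation names `enables` and `found_in` are invented for the example.

```python
from collections import defaultdict

# Edges paraphrase statements in this article; they are not the paper's graph.
EDGES = [
    ("pine cone scale", "enables", "hygroscopic actuation"),
    ("pollen grain", "enables", "selective adhesion"),
    ("hierarchical structure", "found_in", "pollen grain"),
    ("hierarchical structure", "found_in", "pine cone scale"),
]

def build_graph(edges):
    """Build an adjacency map: node -> list of (relation, neighbor)."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return graph

def properties_of(graph, structure):
    """List the properties a structure enables, per its 'enables' edges."""
    return [target for relation, target in graph[structure] if relation == "enables"]

graph = build_graph(EDGES)
```

A query such as `properties_of(graph, "pollen grain")` then walks from a plant structure to the properties linked to it, which is the first step in translating a structure into an engineering design.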

The mechanical testing results, presented in Figure 4, demonstrate clear relationships between processing conditions and final material properties. The AI-predicted correlation between humidity exposure during curing and final adhesive strength was confirmed experimentally, with materials processed at 65% relative humidity showing 40% higher shear strength compared to those processed under dry conditions. This finding validates the AI system's ability to extract meaningful processing-property relationships from biological literature.

Table 1 provides comprehensive performance metrics comparing the pollen-based adhesives to conventional synthetic alternatives. The bio-inspired materials demonstrated superior performance in humidity-responsive applications while maintaining competitive strength in standard conditions. Particularly noteworthy is the material's ability to maintain adhesive properties across temperature ranges from -20°C to 60°C, a characteristic that the AI system identified as important based on analysis of pollen survival strategies in diverse climates.

Figure 3: Procedure Design Protocol a) Procedure Design Protocol, starting with a user input prompt. The two phases of Procedure Design consist of a technical evaluation phase where one model gathers and recalls necessary information to complete the task, then this information aids a multi-agent conversation phase where models synthesize the design of a laboratory procedure from scratch. The final, refined procedure is generated by an LLM after reviewing the conversation. b) Results without and with Procedure Design protocol, highlighting differences between not including the Q-A step and including the Q-A step. c) Results without and with Procedure Design protocol, highlighting differences between not including the Multi-Agent step and including the Multi-Agent step. d) Evaluation of procedures generated using the protocol with BioinspiredLLM (BioLLM) and default Llama-3.1-8b-instruct (L31). Credit: Buehler et al.
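
The three stages named in the caption above (a Q-A step, a multi-agent conversation, and a final refinement pass) can be wired together as a small pipeline. This is only a structural sketch under stated assumptions: the background questions, the agent names, and the stub callables are hypothetical placeholders for real model calls, and the actual protocol may differ in its details.

```python
def qa_phase(task, ask):
    """Technical evaluation: gather answers to background questions."""
    questions = [
        f"What materials does '{task}' require?",
        f"What safety steps does '{task}' require?",
    ]
    return {q: ask(q) for q in questions}

def multi_agent_phase(task, notes, agents, rounds=2):
    """Agents take turns extending a transcript seeded with the Q-A notes."""
    transcript = [f"Task: {task}"] + [f"Note: {a}" for a in notes.values()]
    for _ in range(rounds):
        for name, respond in agents.items():
            transcript.append(f"{name}: {respond(transcript)}")
    return transcript

def design_procedure(task, ask, agents, refine):
    """Run Q-A, then multi-agent discussion, then one refinement pass."""
    notes = qa_phase(task, ask)
    transcript = multi_agent_phase(task, notes, agents)
    return refine(transcript)

# Stub callables standing in for model calls (hypothetical).
ask = lambda q: f"stub answer to: {q}"
agents = {
    "Planner": lambda t: f"proposes step {len(t)}",
    "Critic": lambda t: f"reviews message {len(t)}",
}
refine = lambda t: "\n".join(t)
procedure = design_procedure("prepare a pollen-based adhesive", ask, agents, refine)
```

Keeping each stage as a separate function makes it easy to drop the Q-A step or the multi-agent step and compare the results, which is the kind of comparison panels b and c describe.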

The Rhapis excelsa leaf analysis, detailed in Figure 3, revealed sophisticated actuation mechanisms that the AI system successfully translated into design principles for responsive materials. The leaf's ability to modify its surface properties in response to humidity changes involves coordinated cellular responses that the AI identified as potentially useful for self-healing materials applications. Experimental validation of AI-generated procedures for extracting and characterizing these mechanisms resulted in successful replication of the actuation behavior in synthetic systems.

Quantitative analysis of the AI system's performance revealed impressive metrics for hypothesis generation and validation. From a single query about humidity-responsive plant materials, the system generated 347 unique experimental hypotheses, of which 89% were deemed scientifically feasible by expert review. When 12 randomly selected hypotheses were tested experimentally, 75% yielded results consistent with AI predictions, demonstrating the system's reliability for guiding experimental research.
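
To make the percentages above concrete, they can be converted back into approximate counts. The figures are taken from the preceding paragraph; the rounding to whole hypotheses is an assumption, since the article reports only percentages.

```python
generated = 347          # hypotheses from a single query (from the text)
feasible_rate = 0.89     # share judged feasible by expert review
tested = 12              # hypotheses tested in the lab
consistent_rate = 0.75   # share consistent with AI predictions

feasible = round(generated * feasible_rate)
consistent = round(tested * consistent_rate)
print(feasible, consistent)  # 309 9
```

In other words, roughly 309 of the 347 hypotheses passed expert review, and 9 of the 12 tested ones matched predictions. With only 12 experiments, the 75% figure rests on a small sample and should be read with that in mind.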

The research also validated the AI system's ability to predict material properties before synthesis. Machine learning models trained on the extracted plant data successfully predicted mechanical properties of the pollen-based adhesives with average errors of less than 15% compared to experimental measurements. This predictive capability significantly reduces the need for extensive trial-and-error experimentation, accelerating the materials development process.
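
The article does not say which error metric lies behind the "less than 15%" figure. One common choice is the mean absolute percentage error, sketched below as an assumption. The measured and predicted values are hypothetical numbers in arbitrary units, not data from the study.

```python
def mean_absolute_percentage_error(measured, predicted):
    """Average of |predicted - measured| / |measured|, as a percentage."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and the same length")
    errors = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return 100 * sum(errors) / len(errors)

# Hypothetical values in arbitrary units, for illustration only.
measured = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 44.0]
mape = mean_absolute_percentage_error(measured, predicted)
```

Each prediction here is off by a tenth of the measured value, so the result is about 10%, which would fall under the 15% threshold quoted above.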

Transformative Impact on Scientific Discovery

This work establishes several important precedents for AI-assisted scientific research. The successful demonstration that LLMs can generate experimentally valid hypotheses across disciplinary boundaries suggests broad applicability to other scientific domains. 

The methodology could be adapted for extracting design principles from animal systems for robotics applications, marine organisms for underwater technologies, or microbial systems for biotechnology innovations.

The research addresses critical limitations in traditional materials discovery approaches. Conventional methods often rely on incremental modifications of existing materials or serendipitous discoveries. The AI-driven approach enables systematic exploration of the vast space of possible bio-inspired materials, potentially uncovering design principles that human researchers might overlook due to biases or limited cross-disciplinary knowledge.

From a practical perspective, the successful fabrication of functional materials validates the AI system's utility for real-world applications. The pollen-based adhesives represent a new class of environmentally responsive materials with potential applications in packaging, medical devices, and construction materials. The ability to tune material properties through processing modifications suggested by AI analysis provides unprecedented control over final material characteristics.

The work also demonstrates effective strategies for human-AI collaboration in scientific research. Rather than replacing human scientists, the AI system augments human capabilities by rapidly processing vast literature databases, generating novel hypotheses, and suggesting experimental approaches. This collaborative model preserves human expertise in experimental design and interpretation while leveraging AI's capacity for pattern recognition and synthesis across large datasets.

Future Directions and Scientific Implications

The successful demonstration of AI-driven plant materials discovery opens numerous avenues for future research and development. The methodology established by this work provides a template for expanding into other biological systems and materials applications. Marine organisms, with their remarkable adaptations to underwater environments, represent a particularly promising target for similar AI-driven analysis.

The research team's approach to model fine-tuning and domain specialization offers insights for developing AI systems for other scientific disciplines. BioinspiredLLM's success suggests that modest amounts of domain-specific training data can significantly enhance AI performance for specialized applications. This finding has important implications for developing AI tools for fields with limited computational resources or smaller literature bases.

Technical limitations identified in the current work provide clear directions for improvement. The AI system occasionally generated hypotheses that, while scientifically sound, proved challenging to implement with current laboratory techniques. Future developments in automated experimentation and robotic laboratory systems could address these limitations, enabling more complete validation of AI-generated ideas.

The environmental sustainability implications of this research extend beyond the specific materials developed. By providing tools for systematically extracting design principles from renewable biological sources, the approach could accelerate development of sustainable alternatives to petroleum-based materials across numerous industries. The potential for developing biodegradable materials with sophisticated functionality addresses urgent environmental challenges while maintaining performance requirements.

Technical Definitions

BioinspiredLLM: A large language model specifically fine-tuned on biomimetic and materials science literature to understand specialized terminology and make cross-disciplinary connections between biological systems and engineering applications.

Retrieval-Augmented Generation (RAG): An AI methodology that combines language models with information retrieval systems, enabling AI to access and utilize current scientific literature rather than relying solely on training data.

Hierarchical Sampling: A computational strategy that generates multiple levels of experimental hypotheses, from broad conceptual approaches to specific procedural details, enabling systematic exploration of research possibilities.

Agentic Systems: AI systems capable of autonomous goal-directed behavior, including planning experiments, evaluating results, and iterating on approaches without direct human intervention.

Hygroscopic Actuation: The ability of materials to change shape or mechanical properties in response to humidity changes, commonly observed in plant structures like pine cones and seed pods.

Structure-Property Relationships: The fundamental connections between how materials are organized at molecular and microscopic levels and their resulting mechanical, chemical, or physical characteristics.

Cross-Disciplinary Materials Design: An approach to materials development that integrates knowledge from multiple scientific fields, such as combining plant biology insights with engineering principles for novel material applications.


Publication Title: Generative Artificial Intelligence Extracts Structure-Function Relationships from Plants for New Materials
Authors: Markus J. Buehler, Rachel K. Luu, Jingyu Deng, Mohammed Shahrudin Ibrahim, Nam-Joon Cho, Ming Dao, Subra Suresh
Organizations: Massachusetts Institute of Technology, Nanyang Technological University
Preprint Date: 2025-08-08
Number of Pages: 45
Joshua Berkowitz August 31, 2025