ChemAgent uses a layered architecture that combines a general-purpose large language model, a specialized chemistry model, and validation tools to handle complex chemistry queries.

System Overview

The system follows a plan-and-execute pattern using LangGraph, orchestrating three key technologies:

  • GPT-4o: Orchestrates planning, execution, and replanning
  • LlaSMol: Specialized chemistry model for 14 task types
  • RDKit: Validates SMILES structures and detects errors

Architecture Diagram

[Figure: ChemAgent Architecture]
The diagram above illustrates the complete agent workflow with optional PubChem RAG integration.

Core Components

1. LangGraph Plan-Execute Agent

ChemAgent uses LangGraph’s create_react_agent with a state-driven workflow:
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
agent_executor = create_react_agent(llm, tools, state_modifier=prompt)
The agent operates through three interconnected nodes:
  • Planner: Creates step-by-step execution plans
  • Agent/Executor: Executes individual steps using available tools
  • Replanner: Evaluates results and adapts the plan
See plan_execute_agent/rdkit_agent.py:66 for the complete implementation.
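The interplay of the three nodes can be sketched without any LLM or LangGraph dependency. The following is a minimal illustration, not the project's implementation: `run_plan_execute` and the three toy node functions are hypothetical names, and the cycle cap only mimics the role of a recursion limit.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, str]  # (step description, result)


def run_plan_execute(
    task: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str], str],
    replanner: Callable[[str, List[str], List[Step]], List[str]],
    max_cycles: int = 50,
) -> List[Step]:
    """Run plan -> execute -> replan until the plan is empty or the cap is hit."""
    plan = planner(task)
    past_steps: List[Step] = []
    for _ in range(max_cycles):
        if not plan:
            break  # nothing left to do: the replanner has finished the task
        step, remaining = plan[0], plan[1:]
        past_steps.append((step, executor(step)))
        # The replanner sees what has been done and may rewrite what is left.
        plan = replanner(task, remaining, past_steps)
    return past_steps


# Toy stand-ins for the three nodes (hypothetical, no LLM involved).
def toy_planner(task: str) -> List[str]:
    return ["tag query", "ask chemistry model", "validate SMILES"]


def toy_executor(step: str) -> str:
    return f"done: {step}"


def toy_replanner(task: str, remaining: List[str], past: List[Step]) -> List[str]:
    return remaining  # keep the remaining plan unchanged


history = run_plan_execute(
    "What is aspirin's formula?", toy_planner, toy_executor, toy_replanner
)
```

In this sketch the replanner is the only place where the plan can change, which is what lets the loop adapt after a failed or surprising step.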

2. Three Essential Tools

The agent has access to exactly three tools that must be called in sequence:

Tool 1: structure_chem_prompt

Purpose: Tags and structures chemical information in the query
@tool
def structure_chem_prompt(original_prompt):
    """Structure and tag IUPAC or SMILES chemical information for preprocessing."""
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[{"role": "system", "content": SYSTEM_TAG_PROMPT}, ...],
        response_format=StructuredPrompt,
    )
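    # The parsed StructuredPrompt instance carries the rewritten prompt
    # (assumes StructuredPrompt defines a new_prompt field).
    structured = response.choices[0].message.parsed
    return {"new_prompt": structured.new_prompt}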
Transforms:
  • "What is aspirin's formula?" → "What is the molecular formula of <IUPAC> aspirin </IUPAC>?"
  • Adds <SMILES> and <IUPAC> tags where appropriate
See plan_execute_agent/chem_tools.py:56 for implementation details.
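To make the tag format concrete, here is a small sketch of how a downstream step could read the tagged entities back out of a structured prompt. It is an illustration only: `TAG_PATTERN` and `extract_tagged_entities` are hypothetical helpers, not part of ChemAgent.

```python
import re
from typing import Dict, List

# Matches <SMILES> ... </SMILES> and <IUPAC> ... </IUPAC>, trimming inner whitespace.
TAG_PATTERN = re.compile(r"<(SMILES|IUPAC)>\s*(.*?)\s*</\1>", re.DOTALL)


def extract_tagged_entities(prompt: str) -> Dict[str, List[str]]:
    """Collect the contents of every SMILES and IUPAC tag in the prompt."""
    found: Dict[str, List[str]] = {"SMILES": [], "IUPAC": []}
    for tag, value in TAG_PATTERN.findall(prompt):
        found[tag].append(value)
    return found


tagged = "What is the molecular formula of <IUPAC> aspirin </IUPAC>?"
entities = extract_tagged_entities(tagged)
```

Explicit tags let later tools treat the chemical entity as data rather than guessing which words in the query are a molecule name.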

Tool 2: answer_chemistry_query

Purpose: Uses LlaSMol to answer chemistry-specific queries
@tool
def answer_chemistry_query(query: str) -> str:
    """Answer chemistry queries using the LlaSMol model."""
    response = generator.generate(query)
    return response[0]["output"][0]
Supports 14 task types:
  • Name conversions (IUPAC ↔ SMILES, IUPAC → Formula, SMILES → Formula)
  • Property predictions (Solubility, LIPO, BBBP, Clintox, HIV, SIDER)
  • Molecule captioning and generation
  • Forward synthesis and retrosynthesis
See plan_execute_agent/chem_tools.py:124 for the full implementation.

Tool 3: validate_smiles_rdkit

Purpose: Validates SMILES outputs using RDKit with detailed error reporting
@tool
def validate_smiles_rdkit(smiles_string: str) -> dict:
    """Validate a SMILES string using RDKit with validity vectors."""
    parsing_details = parse_smiles(smiles_string)
    return {
        "valid": parsing_details["valid"],
        "error_message": parsing_details["validity_vector"],
    }
Error detection:
  • Unclosed rings
  • Invalid characters
  • Invalid parentheses
  • Semantic chemistry issues
See plan_execute_agent/chem_tools.py:180 and plan_execute_agent/chemistry_parser.py:258 for validation logic.
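Two of the syntactic error classes listed above, unclosed rings and invalid parentheses, can be illustrated with a dependency-free pre-check. This is a simplified sketch and not the RDKit-based logic the project uses: `quick_smiles_checks` is a hypothetical helper, and it does not detect invalid characters or any semantic chemistry issue.

```python
from typing import List


def quick_smiles_checks(smiles: str) -> List[str]:
    """Report unbalanced parentheses, brackets, and unclosed ring labels."""
    problems: List[str] = []
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring-closure labels seen an odd number of times
    in_bracket = False   # inside [...] digits are isotopes/charges, not rings
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if in_bracket:
            if ch == "]":
                in_bracket = False
        elif ch == "[":
            in_bracket = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("unmatched ')'")
                depth = 0
        elif ch == "%":
            # Two-digit ring label such as %12
            open_rings ^= {"%" + smiles[i + 1:i + 3]}
            i += 2
        elif ch.isdigit():
            open_rings ^= {ch}  # first occurrence opens, second closes
        i += 1
    if depth > 0:
        problems.append("unclosed '('")
    if in_bracket:
        problems.append("unclosed '['")
    if open_rings:
        problems.append("unclosed ring(s): " + ", ".join(sorted(open_rings)))
    return problems


benzene_ok = quick_smiles_checks("c1ccccc1")
broken_ring = quick_smiles_checks("C1CCCC")
```

A cheap check like this can only flag obvious syntax slips; a full parser such as RDKit is still needed to decide whether a string is a valid molecule.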

3. PubChem RAG Integration (Optional)

When enabled with --use_rag, the system augments queries with contextual information from PubChem:
from plan_execute_agent.pubchem_rag.query_chemistry import query_chemistry_related

additional_info = await asyncio.to_thread(
    query_chemistry_related,
    input_prompt + "\n" + extracted_text,
)
RAG Pipeline:
  1. Term Extraction: Identifies chemistry-related nouns and proper nouns
  2. PubChem Query: Fetches compound data via REST API
  3. Context Augmentation: Adds retrieved information to the agent’s context
See plan_execute_agent/pubchem_rag/query_chemistry.py:13 for the complete RAG workflow.
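Steps 2 and 3 of the pipeline can be sketched with the standard library and PubChem's public PUG REST interface. The source does not show which endpoints or properties the project actually requests, so the property list and the helper names `build_property_url`, `fetch_properties`, and `augment_prompt` are assumptions made for illustration.

```python
import json
from typing import Dict, Sequence
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_property_url(
    name: str, properties: Sequence[str] = ("MolecularFormula", "IUPACName")
) -> str:
    """Build a PUG REST URL that looks up properties by compound name."""
    return (
        f"{PUG_REST}/compound/name/{quote(name, safe='')}"
        f"/property/{','.join(properties)}/JSON"
    )


def fetch_properties(name: str, timeout: float = 10.0) -> Dict[str, object]:
    """Fetch the first matching property record (performs a network request)."""
    with urlopen(build_property_url(name), timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["PropertyTable"]["Properties"][0]


def augment_prompt(prompt: str, facts: Dict[str, object]) -> str:
    """Append retrieved facts to the prompt as extra context."""
    context = "; ".join(f"{key}: {value}" for key, value in facts.items())
    return f"{prompt}\n\nPubChem context: {context}"


url = build_property_url("aspirin")
augmented = augment_prompt("What is aspirin's formula?", {"MolecularFormula": "C9H8O4"})
```

Calling `fetch_properties("aspirin")` would perform the actual lookup; it is left uncalled here so the sketch runs without network access.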

4. State Management

The agent maintains state using a TypedDict structure:
class PlanExecute(TypedDict):
    input: str
    plan: List[str]
    past_steps: Annotated[List[Tuple], operator.add]
    response: str
State transitions:
  • START → planner → agent → replan → agent → … → END
  • Maximum recursion limit: 50 iterations
See plan_execute_agent/rdkit_agent.py:75 for state definition.
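The `Annotated[..., operator.add]` annotation on `past_steps` marks that field as accumulating rather than overwriting. The sketch below imitates that behavior with the standard library only. It is an approximation for illustration, not LangGraph's internal mechanism, and `apply_update` is a hypothetical helper.

```python
import operator
from typing import (
    Annotated, Any, Dict, List, Tuple, TypedDict,
    get_args, get_origin, get_type_hints,
)


class PlanExecute(TypedDict):
    input: str
    plan: List[str]
    past_steps: Annotated[List[Tuple], operator.add]
    response: str


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a partial update: annotated fields are reduced, others replaced."""
    hints = get_type_hints(PlanExecute, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        hint = hints[key]
        if get_origin(hint) is Annotated:
            reducer = get_args(hint)[1]          # here: operator.add
            merged[key] = reducer(state.get(key, []), value)
        else:
            merged[key] = value                  # plain fields are overwritten
    return merged


state = {"input": "q", "plan": ["a", "b"], "past_steps": [("a", "ok")], "response": ""}
new_state = apply_update(state, {"plan": ["b"], "past_steps": [("b", "ok")]})
```

With this merge rule, a node only has to return the steps it just completed; the history grows while `plan` is simply replaced by the latest version.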

Model Integration

GPT-4o: The Orchestrator

Responsibilities:
  • Creates execution plans (planner node)
  • Decides tool calls and parameters (executor node)
  • Evaluates results and adapts strategy (replanner node)
  • Optionally extracts chemistry text from images
llm = ChatOpenAI(model="gpt-4o", temperature=0)
planner = planner_prompt | llm.with_structured_output(Plan)

LlaSMol: Chemistry Expert

Model variants (roughly 7B parameters each):
  • osunlp/LlaSMol-Mistral-7B (recommended)
  • osunlp/LlaSMol-Llama2-7B
  • osunlp/LlaSMol-CodeLlama-7B
  • osunlp/LlaSMol-Galactica-6.7B
Initialization:
from LLM4Chem.generation import LlaSMolGeneration

generator = LlaSMolGeneration("osunlp/LlaSMol-Mistral-7B", device="cuda")
LlaSMol requires a GPU with sufficient VRAM. Set LOW_VRAM=True in config.py to disable the model in CPU-only or low-memory environments.

RDKit: Structure Validator

RDKit provides two validation layers:
  1. Syntax Validation: Detects malformed SMILES strings
  2. Semantic Validation: Uses DetectChemistryProblems() to find chemistry issues
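from rdkit import Chem

# Layer 1 (syntax): MolFromSmiles returns None when the string cannot be parsed.
mol = Chem.MolFromSmiles(smiles_string, sanitize=False)
# Layer 2 (semantics): only inspect a molecule that parsed successfully.
problems = Chem.DetectChemistryProblems(mol) if mol is not None else ()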

Data Flow

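A typical query flows through the system as follows: the planner drafts a step list, the agent runs structure_chem_prompt, answer_chemistry_query, and validate_smiles_rdkit in sequence, and the replanner either revises the remaining plan or returns the final response. With --use_rag, PubChem context is added to the query.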

Configuration Options

Environment Variables

OPENAI_API_KEY=your_key_here  # Required for GPT-4o

Runtime Configuration

# plan_execute_agent/config.py
# Set to True for CPU-only or low-memory environments (disables LlaSMol)
LOW_VRAM = False  # Default: load and use the LlaSMol model

Agent Parameters

config = {"recursion_limit": 50}  # Maximum plan-execute cycles
See plan_execute_agent/rdkit_agent.py:397 for configuration details.

Performance Characteristics

Latency
  • Without RAG: 5-15 seconds per query
  • With RAG: 10-25 seconds per query
  • Image extraction: adds 3-5 seconds
Resource requirements
  • GPU memory: 8GB+ for LlaSMol models
  • CPU: 4+ cores recommended
  • RAM: 16GB+ for optimal performance
Accuracy
  • SMILES validation: 99%+ with RDKit
  • Name conversions: ~85% accuracy on SMolInstruct
  • Property predictions: Task-dependent (60-90%)

Error Handling

The system includes multiple error handling layers:
  1. GraphRecursionError: Raised when the recursion limit is reached; catching it stops runaway plan-execute loops
  2. VRAM Check: Prevents model loading on low-memory systems
  3. Validation Errors: Detailed feedback via validity vectors
  4. Tool Errors: Graceful degradation with error messages
See plan_execute_agent/rdkit_agent.py:417 for error handling implementation.
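The "graceful degradation" layer can be pictured as a wrapper that turns tool exceptions into structured error messages the replanner can react to. This is a generic sketch under that assumption: `degrade_gracefully` and `toy_tool` are hypothetical names, and the project's actual handling may differ.

```python
import functools
from typing import Any, Callable, Dict


def degrade_gracefully(func: Callable[..., Any]) -> Callable[..., Dict[str, Any]]:
    """Wrap a tool so failures become error dictionaries instead of crashes."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            return {"ok": True, "result": func(*args, **kwargs)}
        except Exception as exc:
            # Report the error type and message so the caller can adapt its plan.
            return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}

    return wrapper


@degrade_gracefully
def toy_tool(smiles: str) -> str:
    """Hypothetical tool that rejects empty input."""
    if not smiles:
        raise ValueError("empty SMILES string")
    return f"validated {smiles}"


good = toy_tool("CCO")
bad = toy_tool("")
```

Returning the error as data keeps the agent loop alive, so a later step can retry with a corrected input.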

Next Steps

Agent Workflow

Explore the plan-execute-replan cycle

LlaSMol Model

Learn about the chemistry model
