Architecture

ChemAgent combines a general-purpose large language model, a specialized chemistry model, and a structure-validation toolkit in a layered architecture for answering complex chemistry queries.

System Overview

The system follows a plan-and-execute pattern using LangGraph, orchestrating three key technologies:

GPT-4o

Orchestrates planning, execution, and replanning

LlaSMol

Specialized chemistry model for 14 task types

RDKit

Validates SMILES structures and detects errors

Architecture Diagram

The diagram above illustrates the complete agent workflow with optional PubChem RAG integration.

Core Components

1. LangGraph Plan-Execute Agent

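ChemAgent builds its executor with LangGraph’s create_react_agent and runs it inside a state-driven plan-and-execute workflow:

from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI

# `tools` and `prompt` are defined elsewhere in rdkit_agent.py.
# Note: newer LangGraph releases may expect `prompt=` in place of `state_modifier=`.
llm = ChatOpenAI(model="gpt-4o")
agent_executor = create_react_agent(llm, tools, state_modifier=prompt)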

The agent operates through three interconnected nodes:

Planner: Creates step-by-step execution plans
Agent/Executor: Executes individual steps using available tools
Replanner: Evaluates results and adapts the plan

See plan_execute_agent/rdkit_agent.py:66 for the complete implementation.
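
The interplay of the three nodes can be sketched without LangGraph at all. The following is a minimal, library-free illustration of a plan-execute-replan loop, not the project's implementation: run_plan_execute, StepLimitReached, and the toy_* functions are hypothetical names, and the step budget stands in for LangGraph's recursion limit.

```python
from typing import Callable, List, Tuple


class StepLimitReached(RuntimeError):
    """Raised when the loop exceeds its step budget (stand-in for a recursion limit)."""


def run_plan_execute(
    query: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str], str],
    replanner: Callable[[List[str], List[Tuple[str, str]]], List[str]],
    max_steps: int = 50,
) -> List[Tuple[str, str]]:
    """Run plan -> execute -> replan until the plan is empty."""
    plan = planner(query)
    past_steps: List[Tuple[str, str]] = []
    steps_taken = 0
    while plan:
        if steps_taken >= max_steps:
            raise StepLimitReached(f"plan not finished after {max_steps} steps")
        step, remaining = plan[0], plan[1:]
        past_steps.append((step, executor(step)))   # execute one step
        plan = replanner(remaining, past_steps)      # adapt the rest of the plan
        steps_taken += 1
    return past_steps


# Toy stand-ins so the sketch runs without any model or API key.
def toy_planner(query: str) -> List[str]:
    return ["tag query", "ask chemistry model", "validate SMILES"]


def toy_executor(step: str) -> str:
    return f"done: {step}"


def toy_replanner(remaining: List[str], past_steps: List[Tuple[str, str]]) -> List[str]:
    return remaining  # keep the remaining plan unchanged


history = run_plan_execute("SMILES of ethanol?", toy_planner, toy_executor, toy_replanner)
```

In the real agent, the planner and replanner would be GPT-4o calls and the executor would be the tool-calling agent; the loop structure is the part this sketch is meant to show.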

2. Three Essential Tools

The agent has access to exactly three tools that must be called in sequence:

Tool 1: structure_chem_prompt

Purpose: Tags and structures chemical information in the query

@tool
def structure_chem_prompt(original_prompt):
    """Structure and tag IUPAC or SMILES chemical information for preprocessing."""
    # Ask GPT-4o for a response that conforms to the StructuredPrompt schema
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[{"role": "system", "content": SYSTEM_TAG_PROMPT}, ...],
        response_format=StructuredPrompt,
    )
    # Excerpt: the step that derives `simplified_prompt` from `response` is omitted here
    return {"new_prompt": simplified_prompt.get("new_prompt")}

Transforms:

"What is aspirin's formula?" → "What is the molecular formula of <IUPAC> aspirin </IUPAC>?"
Adds <SMILES> and <IUPAC> tags where appropriate

See plan_execute_agent/chem_tools.py:56 for implementation details.
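
To make the tag format concrete, here is a small sketch of how a downstream step could pull the tagged terms back out of a structured prompt. It assumes only the <SMILES> and <IUPAC> tag syntax shown above; extract_tagged_terms and TAG_PATTERN are hypothetical names and are not part of chem_tools.py.

```python
import re
from typing import Dict, List

# Matches <SMILES> ... </SMILES> or <IUPAC> ... </IUPAC>, trimming inner whitespace.
TAG_PATTERN = re.compile(r"<(SMILES|IUPAC)>\s*(.*?)\s*</\1>", re.DOTALL)


def extract_tagged_terms(prompt: str) -> Dict[str, List[str]]:
    """Collect the contents of every SMILES and IUPAC tag in a prompt."""
    found: Dict[str, List[str]] = {"SMILES": [], "IUPAC": []}
    for tag, value in TAG_PATTERN.findall(prompt):
        found[tag].append(value)
    return found


tagged = extract_tagged_terms(
    "What is the molecular formula of <IUPAC> aspirin </IUPAC>?"
)
```

Here tagged["IUPAC"] contains the single entry "aspirin" and tagged["SMILES"] is empty.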

Tool 2: answer_chemistry_query

Purpose: Uses LlaSMol to answer chemistry-specific queries

@tool
def answer_chemistry_query(query: str) -> str:
    """Answer chemistry queries using the LlaSMol model."""
    response = generator.generate(query)
    return response[0]["output"][0]

Supports 14 task types:

Name conversions (IUPAC → SMILES, SMILES → IUPAC, IUPAC → formula, SMILES → formula)
Property predictions (Solubility, LIPO, BBBP, Clintox, HIV, SIDER)
Molecule captioning and generation
Forward synthesis and retrosynthesis

See plan_execute_agent/chem_tools.py:124 for the full implementation.
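
The count of 14 can be checked by writing the task list out as data. The grouping below is an assumption based on the SMolInstruct task families that LlaSMol was trained on; the dictionary keys, the task labels, and category_of are illustrative names, not identifiers from the project.

```python
from typing import Dict, List

# Assumed task grouping (labels are illustrative, not project identifiers).
LLASMOL_TASKS: Dict[str, List[str]] = {
    "name_conversion": [
        "IUPAC to SMILES",
        "SMILES to IUPAC",
        "IUPAC to formula",
        "SMILES to formula",
    ],
    "property_prediction": ["Solubility", "LIPO", "BBBP", "Clintox", "HIV", "SIDER"],
    "molecule_description": ["molecule captioning", "molecule generation"],
    "chemical_reaction": ["forward synthesis", "retrosynthesis"],
}


def category_of(task: str) -> str:
    """Return the category a task label belongs to."""
    for category, tasks in LLASMOL_TASKS.items():
        if task in tasks:
            return category
    raise KeyError(task)


total_tasks = sum(len(tasks) for tasks in LLASMOL_TASKS.values())
```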

Tool 3: validate_smiles_rdkit

Purpose: Validates SMILES outputs using RDKit with detailed error reporting

@tool
def validate_smiles_rdkit(smiles_string: str) -> dict:
    """Validate a SMILES string using RDKit with validity vectors."""
    parsing_details = parse_smiles(smiles_string)
    return {
        "valid": parsing_details["valid"],
        "error_message": parsing_details["validity_vector"],
    }

Error detection:

Unclosed rings
Invalid characters
Invalid parentheses
Semantic chemistry issues

See plan_execute_agent/chem_tools.py:180 and plan_execute_agent/chemistry_parser.py:258 for validation logic.
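
The syntactic error classes listed above (unclosed rings, invalid characters, unbalanced parentheses) can be illustrated with a few lines of plain Python. This is a deliberately simplified stand-in, not RDKit and not the project's chemistry_parser: check_smiles_syntax is a hypothetical helper, its character whitelist is loose, and it performs no valence or other semantic checks.

```python
from typing import Dict, List, Set

# Loose whitelist: letters, digits, and common SMILES bond/branch symbols.
_ALLOWED = set(
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789()[]=#$:/\\@+-.%*"
)


def check_smiles_syntax(smiles: str) -> Dict[str, object]:
    """Report basic syntax problems in a SMILES string."""
    errors: List[str] = []
    open_rings: Set[str] = set()
    depth = 0
    in_bracket = False
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch not in _ALLOWED:
            errors.append(f"invalid character {ch!r} at position {i}")
        elif ch == "[":
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                errors.append(f"unmatched ']' at position {i}")
            in_bracket = False
        elif in_bracket:
            pass  # digits inside [...] are isotopes, H counts, or charges
        elif ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                errors.append(f"unmatched ')' at position {i}")
            else:
                depth -= 1
        elif ch.isdigit() or ch == "%":
            # "%" introduces a two-digit ring label such as %12
            label = smiles[i + 1 : i + 3] if ch == "%" else ch
            if ch == "%":
                i += 2
            if ch == "%" and not (len(label) == 2 and label.isdigit()):
                errors.append("malformed two-digit ring label after '%'")
            else:
                # A ring label opens on first use and closes on second use.
                open_rings.symmetric_difference_update({label})
        i += 1
    if depth:
        errors.append(f"{depth} unclosed parenthesis(es)")
    if in_bracket:
        errors.append("unclosed '[' bracket")
    if open_rings:
        errors.append("unclosed ring(s): " + ", ".join(sorted(open_rings)))
    return {"valid": not errors, "errors": errors}


benzene = check_smiles_syntax("c1ccccc1")
broken_ring = check_smiles_syntax("c1ccccc")
```

A string that passes this check can still be chemically invalid, which is why the real tool relies on RDKit for the final verdict.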

3. PubChem RAG Integration (Optional)

When enabled with --use_rag, the system augments queries with contextual information from PubChem:

import asyncio

from plan_execute_agent.pubchem_rag.query_chemistry import query_chemistry_related

# Inside an async function: run the blocking PubChem lookup in a worker thread
additional_info = await asyncio.to_thread(
    query_chemistry_related,
    input_prompt + "\n" + extracted_text,
)

RAG Pipeline:

Term Extraction: Identifies chemistry-related nouns and proper nouns
PubChem Query: Fetches compound data via REST API
Context Augmentation: Adds retrieved information to the agent’s context

See plan_execute_agent/pubchem_rag/query_chemistry.py:13 for the complete RAG workflow.
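
The query and augmentation steps of the pipeline can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in query_chemistry.py: it assumes lookups by compound name through PubChem's PUG REST property endpoint, it skips term extraction entirely, and build_pubchem_url, fetch_properties, and augment_prompt are hypothetical helper names.

```python
import json
from typing import Dict
from urllib.parse import quote
from urllib.request import urlopen

PUBCHEM_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_pubchem_url(name: str) -> str:
    """Build a PUG REST URL that looks a compound up by name."""
    encoded = quote(name, safe="")
    return f"{PUBCHEM_BASE}/compound/name/{encoded}/property/MolecularFormula,IUPACName/JSON"


def fetch_properties(name: str, timeout: float = 10.0) -> Dict[str, object]:
    """Fetch the first property record for a name (performs a network call)."""
    with urlopen(build_pubchem_url(name), timeout=timeout) as resp:
        data = json.load(resp)
    return data["PropertyTable"]["Properties"][0]


def augment_prompt(prompt: str, records: Dict[str, Dict[str, object]]) -> str:
    """Append retrieved facts to the prompt as plain-text context."""
    if not records:
        return prompt
    lines = [
        f"- {term}: " + ", ".join(f"{key}={value}" for key, value in record.items())
        for term, record in records.items()
    ]
    return prompt + "\n\nPubChem context:\n" + "\n".join(lines)


# Offline usage: the record is supplied by hand instead of being fetched.
context = augment_prompt(
    "What is aspirin's formula?",
    {"aspirin": {"MolecularFormula": "C9H8O4"}},
)
```

fetch_properties is left uncalled here so the sketch runs without network access; in practice its result would be passed to augment_prompt for each extracted term.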

4. State Management

The agent maintains state using a TypedDict structure:

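import operator
from typing import Annotated, List, Tuple, TypedDict

class PlanExecute(TypedDict):
    input: str                                        # original user query
    plan: List[str]                                   # remaining steps
    past_steps: Annotated[List[Tuple], operator.add]  # appended to, never overwritten
    response: str                                     # final answer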

State transitions:

START → planner → agent → replan → agent → … → END
Recursion limit: 50 graph steps (LangGraph counts each node execution toward the limit, not each full plan-execute cycle)

See plan_execute_agent/rdkit_agent.py:75 for state definition.
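
The Annotated[List[Tuple], operator.add] annotation is what makes past_steps accumulate: LangGraph treats the second argument as a reducer and combines the old and new values with it, while fields without a reducer are overwritten. The sketch below mimics that merge rule in plain Python to show the effect. apply_update is a hypothetical helper written for this illustration and is not a LangGraph API.

```python
import operator
from typing import Annotated, Any, Dict, List, Tuple, TypedDict, get_type_hints


class PlanExecute(TypedDict):
    input: str
    plan: List[str]
    past_steps: Annotated[List[Tuple], operator.add]
    response: str


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's partial update into the state, honoring reducers."""
    hints = get_type_hints(PlanExecute, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated types carry their extra arguments in __metadata__.
        reducer = getattr(hints[key], "__metadata__", (None,))[0]
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)  # e.g. list + list
        else:
            merged[key] = value                        # plain overwrite
    return merged


state = {"input": "q", "plan": ["a", "b"], "past_steps": [("a", "ok")], "response": ""}
new_state = apply_update(state, {"plan": ["b"], "past_steps": [("b", "ok")]})
```

After the update, plan has been replaced with the shorter list while past_steps holds both completed steps.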

Model Integration

GPT-4o: The Orchestrator

Responsibilities:

Creates execution plans (planner node)
Decides tool calls and parameters (executor node)
Evaluates results and adapts strategy (replanner node)
Optionally extracts chemistry text from images

llm = ChatOpenAI(model="gpt-4o", temperature=0)
planner = planner_prompt | llm.with_structured_output(Plan)
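
with_structured_output(Plan) asks the model to return data matching the Plan schema instead of free text. The fields of Plan are not shown in this page, so the sketch below assumes the simplest shape, a single ordered list of step strings, and uses a dataclass plus a JSON parser as a library-free stand-in for the real schema. parse_plan is a hypothetical helper.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Plan:
    """Ordered steps to follow (assumed shape of the structured output)."""
    steps: List[str]


def parse_plan(raw_json: str) -> Plan:
    """Validate a JSON payload and turn it into a Plan."""
    data = json.loads(raw_json)
    steps = data.get("steps") if isinstance(data, dict) else None
    if not isinstance(steps, list) or not all(isinstance(s, str) for s in steps):
        raise ValueError("plan must contain a list of step strings under 'steps'")
    return Plan(steps=steps)


plan = parse_plan('{"steps": ["Tag the query", "Ask LlaSMol", "Validate the SMILES"]}')
```

Rejecting malformed payloads at this boundary keeps the executor from receiving a plan it cannot iterate over.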

LlaSMol: Chemistry Expert

Model variants (roughly 7B parameters each):

osunlp/LlaSMol-Mistral-7B (recommended)
osunlp/LlaSMol-Llama2-7B
osunlp/LlaSMol-CodeLlama-7B
osunlp/LlaSMol-Galactica-6.7B

Initialization:

from LLM4Chem.generation import LlaSMolGeneration

generator = LlaSMolGeneration("osunlp/LlaSMol-Mistral-7B", device="cuda")

LlaSMol requires a GPU with sufficient VRAM. In CPU-only or low-memory environments, set LOW_VRAM=True in config.py to skip loading the model.

RDKit: Structure Validator

RDKit provides two validation layers:

Syntax Validation: Detects malformed SMILES strings
Semantic Validation: Uses DetectChemistryProblems() to find chemistry issues

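from rdkit import Chem

# sanitize=False parses the syntax only, so chemistry problems can be inspected separately
mol = Chem.MolFromSmiles(smiles_string, sanitize=False)  # None if the SMILES cannot be parsed
# Only run the semantic check when parsing succeeded
problems = Chem.DetectChemistryProblems(mol) if mol is not None else ()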

Data Flow

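A typical query flows through the system as follows (summarized from the components described above):

When --use_rag is enabled, PubChem context is added to the query
The planner (GPT-4o) drafts a step-by-step plan
The executor runs each step, calling structure_chem_prompt, answer_chemistry_query, and validate_smiles_rdkit in that order
The replanner reviews the results and either revises the plan or returns the final response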

Configuration Options

Environment Variables

OPENAI_API_KEY=your_key_here  # Required for GPT-4o
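
A missing key otherwise surfaces only on the first GPT-4o call, so it can help to check for it at startup. The helper below is a minimal sketch of such a check; require_env is a hypothetical name and is not part of the project.

```python
import os
from typing import Mapping


def require_env(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return a required environment variable or fail with a clear message."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the agent")
    return value


# Usage with an explicit mapping, so the example does not depend on the real environment.
api_key = require_env("OPENAI_API_KEY", {"OPENAI_API_KEY": "your_key_here"})
```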

Runtime Configuration

# In plan_execute_agent/config.py
# Set to True for CPU-only environments
LOW_VRAM = False  # Default: use LlaSMol model

# Elsewhere, read the flag with:
from plan_execute_agent.config import LOW_VRAM

Agent Parameters

config = {"recursion_limit": 50}  # Maximum number of graph steps (node executions) per run

See plan_execute_agent/rdkit_agent.py:397 for configuration details.

Performance Characteristics

Latency

Without RAG: 5-15 seconds per query
With RAG: 10-25 seconds per query
Image extraction: +3-5 seconds

Resource Requirements

GPU Memory: 8GB+ for LlaSMol models
CPU: 4+ cores recommended
RAM: 16GB+ for optimal performance

Accuracy

SMILES validation: 99%+ with RDKit
Name conversions: ~85% accuracy on SMolInstruct
Property predictions: Task-dependent (60-90%)

Error Handling

The system includes multiple error handling layers:

GraphRecursionError: Raised by LangGraph when the recursion limit is reached; catching it stops runaway plan-execute loops
VRAM Check: Prevents model loading on low-memory systems
Validation Errors: Detailed feedback via validity vectors
Tool Errors: Graceful degradation with error messages

See plan_execute_agent/rdkit_agent.py:417 for error handling implementation.
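
Graceful degradation at the tool level usually means returning the failure as text so the replanner can react to it, rather than letting the exception end the run. The decorator below is a minimal sketch of that idea. safe_tool and flaky_chemistry_tool are hypothetical names, and the project's actual handling may differ.

```python
import functools
from typing import Callable


def safe_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Wrap a tool so failures come back as text instead of crashing the run."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs) -> str:
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # broad on purpose: any tool failure is reported
            return f"Tool '{func.__name__}' failed: {type(exc).__name__}: {exc}"

    return wrapper


@safe_tool
def flaky_chemistry_tool(query: str) -> str:
    """Toy tool that rejects empty input."""
    if not query:
        raise ValueError("empty query")
    return f"answer for {query}"


ok_result = flaky_chemistry_tool("CCO")
error_result = flaky_chemistry_tool("")
```

The error string becomes part of the step result, so a later planning step can retry or choose a different tool.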

Next Steps

Agent Workflow

Explore the plan-execute-replan cycle

LlaSMol Model

Learn about the chemistry model
