Welcome to ChemAgent

ChemAgent is a Plan-and-Execute agent that leverages RDKit, LangGraph, and LLMs to handle chemistry-related tasks with optional RAG support from PubChem. This guide will help you run your first successful chemistry query.
ChemAgent uses GPT-4o for planning and, optionally, LlaSMol-Mistral-7B for specialized chemistry tasks. Make sure you have your OpenAI API key ready.

Your First Query

The simplest way to get started is to run a basic IUPAC to SMILES conversion query.
1. Import the agent

import asyncio
from plan_execute_agent.rdkit_agent import process_input
2. Run a simple query

# IUPAC to SMILES conversion
result, completed, attempts, llasmol_response, llasmol_errors, formatted_input = (
    asyncio.run(process_input(
        "Could you provide the SMILES for <IUPAC> 4-ethyl-4-methyloxolan-2-one </IUPAC>?"
    ))
)

print("Result:", result)
print("Completed:", completed)
print("Attempts:", attempts)
3. Check the output

The process_input() function returns a tuple containing:
  • result: The final answer to your query
  • completed: Boolean indicating if the task completed successfully
  • attempts: Number of replanning attempts made
  • llasmol_response: Raw response from the LlaSMol model (if LOW_VRAM=False)
  • llasmol_errors: Any validation errors encountered
  • formatted_input: The structured input created by the planner
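
Unpacking six positional values is easy to get wrong, so one option is to wrap the tuple in named fields. The sketch below is only an illustration: QueryOutcome and wrap_outcome are hypothetical helpers that are not part of ChemAgent, the field types are left open because this page does not specify them, and the sample tuple is a hand-written stand-in rather than captured agent output.

```python
from typing import Any, NamedTuple


class QueryOutcome(NamedTuple):
    """Hypothetical wrapper; field order mirrors the tuple described above."""

    result: Any
    completed: bool
    attempts: int
    llasmol_response: Any
    llasmol_errors: Any
    formatted_input: Any


def wrap_outcome(raw: tuple) -> QueryOutcome:
    """Convert the 6-tuple returned by process_input() into named fields."""
    if len(raw) != 6:
        raise ValueError(f"expected 6 values, got {len(raw)}")
    return QueryOutcome(*raw)


# Stand-in values for illustration only; real values come from process_input().
sample = ("CCC1(C)CC(=O)OC1", True, 0, None, [], "formatted query text")
outcome = wrap_outcome(sample)
print(outcome.result, outcome.completed, outcome.attempts)
```

In real use, the argument would be the value returned by asyncio.run(process_input(...)).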

Running from Command Line

You can also run queries directly from the terminal:
python -m plan_execute_agent.rdkit_agent --query "Could you provide the SMILES for <IUPAC> 4-ethyl-4-methyloxolan-2-one </IUPAC>?"

Common Query Examples

import asyncio
from plan_execute_agent.rdkit_agent import process_input

query = "Please provide the SMILES representation for <IUPAC> 4-ethyl-4-methyloxolan-2-one </IUPAC>."
result, completed, attempts, _, _, _ = asyncio.run(process_input(query))

print(f"SMILES: {result}")
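
The example queries on this page wrap the compound name in <IUPAC> ... </IUPAC> tags. A small helper can build such queries consistently and recover the tagged name later. This is a minimal sketch: tag_iupac, build_smiles_query, and extract_iupac are hypothetical names, not ChemAgent functions, and the spacing inside the tags simply copies the examples above.

```python
import re


def tag_iupac(name: str) -> str:
    """Wrap an IUPAC name in the tag format used by the example queries."""
    return f"<IUPAC> {name.strip()} </IUPAC>"


def build_smiles_query(name: str) -> str:
    """Build an IUPAC-to-SMILES question in the style shown on this page."""
    return f"Could you provide the SMILES for {tag_iupac(name)}?"


# Non-greedy match so several tagged names in one query are found separately.
_IUPAC_PATTERN = re.compile(r"<IUPAC>\s*(.*?)\s*</IUPAC>", re.DOTALL)


def extract_iupac(query: str) -> list[str]:
    """Return every IUPAC name found between <IUPAC> tags."""
    return _IUPAC_PATTERN.findall(query)


query = build_smiles_query("4-ethyl-4-methyloxolan-2-one")
print(query)
print(extract_iupac(query))
```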

Using RAG for Enhanced Context

Enable PubChem RAG to add context from the PubChem database:
import asyncio
from plan_execute_agent.rdkit_agent import process_input

query = "Could you provide the SMILES for <IUPAC> 4-ethyl-4-methyloxolan-2-one </IUPAC>?"

# Enable RAG by setting use_rag=True
result, completed, attempts, _, _, _ = asyncio.run(
    process_input(query, use_rag=True)
)

print(f"Result with PubChem context: {result}")
RAG queries take longer as they fetch additional context from PubChem. Only enable RAG when you need enhanced context for complex queries.
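
This page does not describe how ChemAgent retrieves its PubChem context. To give a sense of why a lookup adds latency, the sketch below builds a request URL for PubChem's public PUG REST service, which needs a network round trip per compound. The helper name pubchem_property_url and the choice of properties are assumptions made for illustration; the code only constructs the URL and does not send a request.

```python
from urllib.parse import quote

PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_property_url(name, properties=("MolecularFormula", "IUPACName")):
    """Build a PUG REST URL that looks up properties for a compound name.

    Hypothetical helper: ChemAgent's own retrieval code may differ.
    """
    encoded_name = quote(name.strip(), safe="")  # escape spaces, slashes, etc.
    property_list = ",".join(properties)
    return f"{PUG_REST_BASE}/compound/name/{encoded_name}/property/{property_list}/JSON"


url = pubchem_property_url("acetic acid")
print(url)
```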

Processing Images with GPT-4o

ChemAgent can extract chemistry information from images using GPT-4o’s vision capabilities:
import asyncio
from plan_execute_agent.rdkit_agent import process_input

query = "Please identify this molecule and provide its IUPAC name and SMILES."
image_path = "sample-chem-image.jpg"

result, completed, attempts, _, _, _ = asyncio.run(
    process_input(query, image_path=image_path)
)

print(f"Extracted chemistry info: {result}")
The image extraction happens automatically before the main query processing. GPT-4o extracts chemical names, formulas, and relevant text from the image.
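
Because an invalid image path is ignored with only a warning (see Troubleshooting), it can help to check the path before calling the agent. The following is a minimal sketch; usable_image_path is a hypothetical helper, not part of ChemAgent.

```python
from pathlib import Path
from typing import Optional


def usable_image_path(image_path: str) -> Optional[str]:
    """Return the path as a string if it points to an existing file, else None."""
    path = Path(image_path).expanduser()
    if path.is_file():
        return str(path)
    return None


checked = usable_image_path("sample-chem-image.jpg")
if checked is None:
    print("Image not found; the query would run without image context.")
else:
    print(f"Using image: {checked}")
```

Failing early on a None result avoids silently running a query that was meant to include an image.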

Supported Query Types

ChemAgent supports a wide range of chemistry tasks:

Name Conversion

  • IUPAC to Molecular Formula
  • IUPAC to SMILES
  • SMILES to IUPAC
  • SMILES to Molecular Formula

Property Prediction

  • Solubility (ESOL)
  • LIPO (Lipophilicity)
  • BBBP (Blood-brain barrier permeability)
  • Clintox (Clinical toxicity)
  • HIV activity
  • Side Effects

Molecule Tasks

  • Molecule Captioning
  • Molecule Generation
  • Molecule Description

Reaction Chemistry

  • Forward Synthesis
  • Retrosynthesis

Understanding the Agent Architecture

ChemAgent uses a Plan-and-Execute architecture:
  1. Planner: Creates a step-by-step plan using GPT-4o
  2. Executor: Executes each step using specialized tools:
    • structure_chem_prompt: Tags IUPAC/SMILES information
    • answer_chemistry_query: Processes queries using LlaSMol (if enabled)
    • validate_smiles_rdkit: Validates SMILES output using RDKit
  3. Replanner: Updates the plan based on the results and replans if needed
The agent automatically validates all SMILES outputs using RDKit to check that they represent chemically valid structures.
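
The control flow of the three roles above can be sketched in plain Python. This is not ChemAgent's LangGraph implementation: the planner, executor, and validator below are toy stand-ins, every name is hypothetical, and the parenthesis check is only a crude placeholder for real RDKit validation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class AgentState:
    query: str
    plan: List[str] = field(default_factory=list)
    past_steps: List[Tuple[str, str]] = field(default_factory=list)
    answer: Optional[str] = None
    attempts: int = 0  # number of replanning rounds so far


def run_plan_execute(
    query: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str, AgentState], str],
    validator: Callable[[str], bool],
    max_attempts: int = 3,
) -> AgentState:
    state = AgentState(query=query, plan=planner(query))
    while state.attempts < max_attempts:
        output = ""
        # Executor: run each step of the current plan in order.
        for step in state.plan:
            output = executor(step, state)
            state.past_steps.append((step, output))
        # Validation: accept the final step's output if it passes.
        if validator(output):
            state.answer = output
            return state
        # Replanner: this sketch simply asks for a fresh plan.
        state.attempts += 1
        state.plan = planner(query)
    return state  # answer stays None if every attempt failed


def toy_planner(query: str) -> List[str]:
    return ["tag the input", "answer the query"]


def toy_executor(step: str, state: AgentState) -> str:
    if step == "tag the input":
        return state.query
    # Return a malformed string first so that one replan is triggered.
    return "CC(=O" if state.attempts == 0 else "CC(=O)O"


def toy_validator(candidate: str) -> bool:
    # Crude stand-in for RDKit: non-empty with balanced parentheses.
    return bool(candidate) and candidate.count("(") == candidate.count(")")


final = run_plan_execute("SMILES for acetic acid?", toy_planner, toy_executor, toy_validator)
print(final.answer, final.attempts)
```

Capping the loop with max_attempts plays the same role as a recursion limit: a query that never validates stops after a bounded number of replans.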

Low VRAM Mode

The agent defaults to LOW_VRAM mode, which is intended for systems with limited VRAM (less than 15GB):
# In plan_execute_agent/config.py
LOW_VRAM = True  # Default setting
In LOW_VRAM mode:
  • LlaSMol model is not loaded
  • Only GPT-4o is used for all tasks
  • Significantly reduced memory footprint
  • Still provides accurate results for most queries
With LOW_VRAM=True, the answer_chemistry_query tool will raise a RuntimeError. The agent will respond directly using GPT-4o instead.
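
The guard-and-fallback behavior described above can be illustrated with stubs. This is a sketch under stated assumptions: answer_chemistry_query_stub and answer_with_fallback are hypothetical stand-ins rather than the real tool or agent logic, and the error text is taken from the Troubleshooting section.

```python
from typing import Callable

LOW_VRAM = True  # mirrors the default setting shown above


def answer_chemistry_query_stub(query: str) -> str:
    """Stand-in for the LlaSMol-backed tool; refuses to run in LOW_VRAM mode."""
    if LOW_VRAM:
        raise RuntimeError(
            "answer_chemistry_query tool cannot be used with LOW_VRAM enabled"
        )
    return f"[LlaSMol placeholder answer for: {query}]"


def answer_with_fallback(query: str, fallback: Callable[[str], str]) -> str:
    """Try the specialized tool first; use the fallback if it is disabled."""
    try:
        return answer_chemistry_query_stub(query)
    except RuntimeError:
        return fallback(query)


reply = answer_with_fallback("What is the SMILES of ethanol?", lambda q: f"[general model] {q}")
print(reply)
```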

Tracking Results

All query results are automatically logged to plan_execute_agent/run_logs.csv:
# View the logs
cat plan_execute_agent/run_logs.csv

# Visualize attempts and errors
python plot.py
The log contains:
  • Query text
  • Number of attempts
  • Completion status
  • Validation errors (if any)
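
For a quick summary without plot.py, the log can be read with Python's csv module. The column names used here (query, attempts, completed, errors) are assumptions based on the list above, and the sample rows are made up for illustration; check the header row of your own run_logs.csv before adapting this sketch.

```python
import csv
import io

# Made-up sample rows; the real file lives at plan_execute_agent/run_logs.csv.
SAMPLE_LOG = """query,attempts,completed,errors
IUPAC to SMILES example,0,True,
SMILES to IUPAC example,2,False,invalid SMILES
"""


def summarize_log(handle) -> dict:
    """Count runs, completed runs, and the mean number of attempts."""
    rows = list(csv.DictReader(handle))
    total = len(rows)
    completed = sum(1 for row in rows if row["completed"].strip().lower() == "true")
    mean_attempts = sum(int(row["attempts"]) for row in rows) / total if total else 0.0
    return {"total": total, "completed": completed, "mean_attempts": mean_attempts}


summary = summarize_log(io.StringIO(SAMPLE_LOG))
print(summary)
```

To run it against the real log, pass an open file handle instead of the io.StringIO sample.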

Next Steps

Installation Guide

Complete setup instructions for production use

Core Concepts

Learn about the agent architecture

API Reference

Detailed documentation of all functions and tools

Guides

More chemistry query examples and use cases

Troubleshooting

If you encounter a GraphRecursionError, the agent exceeded the recursion limit (default: 50). This usually means the query is too complex or vague. Solution: Simplify your query or increase the recursion limit in the code.
If SMILES validation fails, check the llasmol_errors return value for details. Solution: The agent will automatically replan and try to fix the error. If it persists, the SMILES may be fundamentally invalid.
If the image path is invalid, the agent will print a warning and ignore the image. Solution: Verify the image path exists and is accessible.
If you see “answer_chemistry_query tool cannot be used with LOW_VRAM enabled”, the agent tried to use LlaSMol when it’s disabled. Solution: Set LOW_VRAM=False in plan_execute_agent/config.py and ensure you have ≥15GB VRAM.
