Overview

ChemAgent requires an OpenAI API key, plus a few optional variables used only for fine-tuning, to be set in the environment before running the agent. These variables are loaded from a .env file in the root directory of the project.

Setup Instructions

1. Create .env File

The project uses python-dotenv to load environment variables. Create a .env file in the root directory:
cp .env.example .env
The .env file is automatically loaded using load_dotenv(override=True) in the agent scripts.

2. Required Environment Variables

OpenAI API Key

ChemAgent uses OpenAI’s API for several components:
  • GPT-4o for the plan-and-execute agent (rdkit_agent.py:65)
  • GPT-4o for structuring chemical prompts with tags (chem_tools.py:64)
  • AsyncOpenAI client for RAG queries (rdkit_agent.py:268)
Variable Name: OPENAI_API_KEY
Usage Locations:
  • plan_execute_agent/rdkit_agent.py - ChatOpenAI LLM initialization
  • plan_execute_agent/chem_tools.py - OpenAI client for structured outputs
  • plan_execute_agent/pubchem_rag/llm_response.py - RAG query processing
Configuration:
.env
OPENAI_API_KEY=your_api_key_here
The OpenAI API key is required for the agent to function. Without it, the agent will fail to initialize.
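Because a missing key only surfaces when the agent tries to initialize, it can help to check for it explicitly at startup. The following is a minimal sketch, not code from the ChemAgent repository; the helper name require_env is hypothetical, and it assumes the .env file has already been loaded into the process environment.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty.

    Hypothetical helper for illustration; it is not part of ChemAgent.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value


if __name__ == "__main__":
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(f"Configuration error: {err}")
```

Raising a clear error here avoids printing the key itself and points directly at the variable that needs to be set.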

WandB Configuration (Optional)

For fine-tuning tasks, Weights & Biases (WandB) integration is available:
.env
WANDB_PROJECT=your_project_name
WANDB_WATCH=gradients
WANDB_LOG_MODEL=checkpoint
Usage: LLM4Chem/finetune.py for tracking fine-tuning experiments
WandB variables are only required if you plan to fine-tune the LlaSMol models.

Environment Loading

The .env file is loaded in multiple locations:

Agent Scripts

from dotenv import load_dotenv

load_dotenv(override=True)  # Loads .env file
Locations:
  • plan_execute_agent/rdkit_agent.py:40-42
  • plan_execute_agent/chem_tools.py:45
The override=True flag ensures that environment variables in .env take precedence over system environment variables.
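The precedence rule can be illustrated with a simplified model. The sketch below does not call python-dotenv and is not its implementation; apply_env is a hypothetical stand-in that only mimics how the override flag decides between a value already in the environment and a value read from .env.

```python
import os
from typing import Dict, MutableMapping


def apply_env(
    values: Dict[str, str],
    override: bool,
    environ: MutableMapping[str, str] = os.environ,
) -> None:
    """Merge parsed .env values into an environment mapping.

    Simplified model for illustration only:
    - override=False keeps any value that is already set
    - override=True replaces it with the value from .env
    """
    for key, value in values.items():
        if override or key not in environ:
            environ[key] = value


if __name__ == "__main__":
    # A plain dict stands in for the process environment.
    env = {"OPENAI_API_KEY": "from-shell"}
    dotenv_values = {"OPENAI_API_KEY": "from-dotenv"}

    apply_env(dotenv_values, override=False, environ=env)
    print(env["OPENAI_API_KEY"])  # from-shell

    apply_env(dotenv_values, override=True, environ=env)
    print(env["OPENAI_API_KEY"])  # from-dotenv
```

One practical consequence: with override=True, a key exported in the shell is replaced by whatever .env contains, so an outdated .env file can mask a newer shell export.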

Distributed Training Variables

For fine-tuning with distributed training:
LOCAL_RANK=0
WORLD_SIZE=1
Usage: LLM4Chem/finetune.py for multi-GPU training coordination
These variables are typically set automatically by your distributed training launcher (e.g., torchrun, deepspeed) and don’t need manual configuration.
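As a rough illustration of how a training script might consume these variables, the sketch below reads them with single-process defaults. It is not taken from LLM4Chem/finetune.py; the function name read_dist_env and the fallback values (0 and 1, matching the example values above) are assumptions.

```python
import os
from typing import Mapping, Tuple


def read_dist_env(environ: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Return (local_rank, world_size) from the environment.

    Hypothetical helper: falls back to a single-process setup
    (rank 0, world size 1) when no launcher has set the variables.
    """
    local_rank = int(environ.get("LOCAL_RANK", "0"))
    world_size = int(environ.get("WORLD_SIZE", "1"))
    return local_rank, world_size


if __name__ == "__main__":
    local_rank, world_size = read_dist_env()
    mode = "distributed" if world_size > 1 else "single-process"
    print(f"local_rank={local_rank} world_size={world_size} ({mode})")
```

Run directly with python, this reports a single-process setup; under a launcher that exports the variables, each worker would see its own LOCAL_RANK.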

Dependencies

The environment configuration requires:
python-dotenv==0.19.1
langchain-openai==0.1.25
openai
Install via:
pip install -r agent_requirements.txt
pip install -r comb_requirements.txt
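After installing, the pinned versions can be compared against what is actually present in the environment. This is a minimal sketch using only the standard library (importlib.metadata, Python 3.8+); the PINNED mapping copies the two pins listed above, and the helper names are hypothetical.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Pins copied from the dependency list above; openai is unpinned there, so it is omitted.
PINNED = {"python-dotenv": "0.19.1", "langchain-openai": "0.1.25"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (expected, found)} for every pin that does not match."""
    problems = {}
    for package, expected in pins.items():
        found = installed_version(package)
        if found != expected:
            problems[package] = (expected, found)
    return problems


if __name__ == "__main__":
    mismatches = check_pins(PINNED)
    for package, (expected, found) in mismatches.items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
    if not mismatches:
        print("All pinned packages match")
```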

Verification

To verify your environment is configured correctly:
import os
from dotenv import load_dotenv

load_dotenv(override=True)

# Check if OpenAI API key is loaded
if os.getenv("OPENAI_API_KEY"):
    print("✓ OpenAI API key configured")
else:
    print("✗ OpenAI API key not found")

Next Steps

VRAM Settings

Configure VRAM requirements for the LlaSMol model

Model Selection

Choose and configure LlaSMol models
