Model Selection - ChemAgent

Overview

ChemAgent supports multiple LlaSMol model variants, each built on a different foundation model. The choice of model determines both the chemistry reasoning capabilities and the computational requirements.

Available Models

ChemAgent supports four LlaSMol model variants defined in LLM4Chem/config.py:145-150:

Model Options

Model Name	Base Model	Parameters	Recommended Use
`osunlp/LlaSMol-Mistral-7B`	`mistralai/Mistral-7B-v0.1`	7B	Default - Best overall performance
`osunlp/LlaSMol-Galactica-6.7B`	`facebook/galactica-6.7b`	6.7B	Scientific domain knowledge
`osunlp/LlaSMol-Llama2-7B`	`meta-llama/Llama-2-7b-hf`	7B	General chemistry tasks
`osunlp/LlaSMol-CodeLlama-7B`	`codellama/CodeLlama-7b-hf`	7B	Code-like SMILES generation

All models require at least 15GB VRAM. See VRAM Settings for details.
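As a rough sanity check on the VRAM figure: the models are loaded in bfloat16, which stores each parameter in two bytes. The sketch below estimates the weight footprint only, using the nominal parameter counts from the table. It ignores activations, the KV cache, and beam-search overhead, and the helper name is hypothetical.

```python
# Back-of-the-envelope estimate of weight memory in bfloat16 (2 bytes/param).
# Illustrative only: activations, KV cache, and beam search add to this.

BYTES_PER_PARAM_BF16 = 2


def weight_memory_gb(num_params: float) -> float:
    """Return the approximate weight footprint in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM_BF16 / 1e9


for name, params in [("7B variants", 7e9), ("Galactica-6.7B", 6.7e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.1f} GB for weights")
```

The weights alone come to roughly 14 GB for a 7B model, which is consistent with the stated minimum leaving little headroom.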

Default Model Configuration

The agent uses Mistral-7B by default:

chem_tools.py:115-119

if not LOW_VRAM:
    from LLM4Chem.generation import LlaSMolGeneration
    generator = LlaSMolGeneration("osunlp/LlaSMol-Mistral-7B", device="cuda")
else:
    generator = None

Changing Models

To use a different model, modify the generator initialization in plan_execute_agent/chem_tools.py:

Example: Using Galactica

generator = LlaSMolGeneration(
    "osunlp/LlaSMol-Galactica-6.7B",
    device="cuda"
)

Example: Using Llama2

generator = LlaSMolGeneration(
    "osunlp/LlaSMol-Llama2-7B",
    device="cuda"
)

Example: Using CodeLlama

generator = LlaSMolGeneration(
    "osunlp/LlaSMol-CodeLlama-7B",
    device="cuda"
)
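Rather than editing the source for every switch, the model name could be read from an environment variable. This is a sketch, not part of ChemAgent: the CHEMAGENT_MODEL variable and the select_model_name helper are hypothetical, and the list of names is copied from the Model Options table.

```python
import os

# Model names copied from the Model Options table.
SUPPORTED_MODELS = (
    "osunlp/LlaSMol-Mistral-7B",
    "osunlp/LlaSMol-Galactica-6.7B",
    "osunlp/LlaSMol-Llama2-7B",
    "osunlp/LlaSMol-CodeLlama-7B",
)
DEFAULT_MODEL = SUPPORTED_MODELS[0]


def select_model_name(env=None):
    """Pick a model name from CHEMAGENT_MODEL (hypothetical), else the default."""
    env = os.environ if env is None else env
    name = env.get("CHEMAGENT_MODEL", DEFAULT_MODEL)
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unsupported model {name!r}; choose one of {SUPPORTED_MODELS}"
        )
    return name


# An empty environment falls back to the default model.
print(select_model_name({}))
```

The returned name would then be passed as the first argument to LlaSMolGeneration in chem_tools.py.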

Model Loading Process

Model loading is handled by load_tokenizer_and_model in LLM4Chem/model.py:19-61 and proceeds in five steps:

1. Base Model Resolution

model.py:19-23

def load_tokenizer_and_model(model_name, base_model=None, device=None):
    if base_model is None:
        if model_name in BASE_MODELS:
            base_model = BASE_MODELS[model_name]
    assert base_model is not None, "Please assign the corresponding base model."

The function automatically resolves the base model from the BASE_MODELS dictionary.
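The lookup can be reproduced in isolation. In the sketch below the BASE_MODELS mapping is reconstructed from the Model Options table, so the real dictionary in LLM4Chem/config.py may differ in detail, and resolve_base_model is a hypothetical stand-in for the first lines of load_tokenizer_and_model.

```python
# Mapping reconstructed from the Model Options table (an assumption; the
# authoritative dictionary lives in LLM4Chem/config.py).
BASE_MODELS = {
    "osunlp/LlaSMol-Mistral-7B": "mistralai/Mistral-7B-v0.1",
    "osunlp/LlaSMol-Galactica-6.7B": "facebook/galactica-6.7b",
    "osunlp/LlaSMol-Llama2-7B": "meta-llama/Llama-2-7b-hf",
    "osunlp/LlaSMol-CodeLlama-7B": "codellama/CodeLlama-7b-hf",
}


def resolve_base_model(model_name, base_model=None):
    """An explicit base_model wins; otherwise fall back to the mapping."""
    if base_model is None:
        base_model = BASE_MODELS.get(model_name)
    assert base_model is not None, "Please assign the corresponding base model."
    return base_model


print(resolve_base_model("osunlp/LlaSMol-Galactica-6.7B"))
```

A model name that is missing from the mapping and has no explicit base_model triggers the assertion, which is the error covered under Troubleshooting.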

2. Tokenizer Initialization

model.py:25-30

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.padding_side = 'left'
tokenizer.pad_token = '<pad>'
tokenizer.sep_token = '<unk>'
tokenizer.cls_token = '<unk>'
tokenizer.mask_token = '<unk>'
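Left padding matters for batched generation with decoder-only models: every prompt has to end at the same position so that new tokens are appended directly after real text rather than after padding. The toy function below illustrates the idea on plain lists of token IDs. It is not the tokenizer's implementation, and the pad ID of 0 is arbitrary.

```python
def pad_left(batch, pad_id=0):
    """Left-pad token ID lists to a common length; return IDs and attention masks."""
    width = max(len(seq) for seq in batch)
    input_ids = [[pad_id] * (width - len(seq)) + seq for seq in batch]
    # The mask is 0 over padding and 1 over real tokens.
    attention_mask = [[0] * (width - len(seq)) + [1] * len(seq) for seq in batch]
    return input_ids, attention_mask


ids, mask = pad_left([[5, 6, 7], [8]])
print(ids)   # [[5, 6, 7], [0, 0, 8]]
print(mask)  # [[1, 1, 1], [0, 0, 1]]
```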

3. Device Selection

model.py:32-34

if device is None:
    device = get_device()

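When no device is passed, get_device() chooses one automatically:

Returns "cuda" if a CUDA GPU is available
Otherwise returns "cpu", but loading on CPU is not implemented, so step 4 raises NotImplementedError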

4. Model Loading with PEFT

model.py:35-50

if device == "cuda":
    model = AutoModelForCausalLM.from_pretrained(
        base_model,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    
    model = PeftModelForCausalLM.from_pretrained(
        model,
        model_name,
        torch_dtype=torch.bfloat16,
    )
    
    model = model.merge_and_unload()
else:
    raise NotImplementedError("No implementation for loading model on CPU yet.")

CPU inference is not currently supported. The model requires a CUDA-enabled GPU.
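Because get_device() can return "cpu" while the loader only implements the CUDA branch, a caller may prefer to fail early with a clearer message. A minimal sketch, assuming a hypothetical require_cuda helper; the CUDA check is passed in as a boolean so the example runs without PyTorch installed.

```python
def get_device(cuda_available):
    """Same decision as model.py's get_device(), with the CUDA check passed in."""
    return "cuda" if cuda_available else "cpu"


def require_cuda(device):
    """Raise a descriptive error before any model weights are downloaded."""
    if device != "cuda":
        raise RuntimeError(
            f"LlaSMol loading needs a CUDA GPU, but the selected device is {device!r}."
        )
    return device


# In real use the flag would come from torch.cuda.is_available().
print(require_cuda(get_device(cuda_available=True)))
```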

5. Configuration and Optimization

model.py:53-59

model.config.pad_token_id = tokenizer.pad_token_id
model.config.bos_token_id = tokenizer.bos_token_id
model.config.eos_token_id = tokenizer.eos_token_id

model.eval()
if torch.__version__ >= "2" and sys.platform != "win32":
    model = torch.compile(model)

Device Configuration

Automatic Device Detection

The default behavior detects available hardware:

model.py:10-16

def get_device():
    if torch.cuda.is_available():
        device = "cuda"
    else:
        device = "cpu"
    return device

Manual Device Selection

You can specify the device explicitly:

generator = LlaSMolGeneration(
    "osunlp/LlaSMol-Mistral-7B",
    device="cuda"  # Explicitly use CUDA
)

Multi-GPU Support

The model uses device_map="auto" for automatic multi-GPU distribution:

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # Automatically distributes across GPUs
)

Generation Configuration

Models can be configured with task-specific generation settings defined in LLM4Chem/config.py:26-102:

Example: Forward Synthesis

TASKS_GENERATION_SETTINGS = {
    "forward_synthesis": {
        "generation_kargs": {"num_return_sequences": 5, "num_beams": 8},
    },
}

Example: Retrosynthesis

"retrosynthesis": {
    "max_new_tokens": 960,
    "generation_kargs": {"num_return_sequences": 10, "num_beams": 13},
}

Example: Property Prediction

"property_prediction-esol": {
    "batch_size": 16,
    "max_new_tokens": 20,
    "generation_kargs": {
        "num_return_sequences": 1,
        "num_beams": 4,
    },
}
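One way to consume these settings is to look up the task entry and flatten it into keyword arguments. The sketch below assumes that keys outside generation_kargs correspond to named parameters of generate() and that generation_kargs is forwarded as extra generation settings. The helper and this mapping are assumptions, not the repository's code.

```python
# The three entries shown above, collected into one dictionary.
TASKS_GENERATION_SETTINGS = {
    "forward_synthesis": {
        "generation_kargs": {"num_return_sequences": 5, "num_beams": 8},
    },
    "retrosynthesis": {
        "max_new_tokens": 960,
        "generation_kargs": {"num_return_sequences": 10, "num_beams": 13},
    },
    "property_prediction-esol": {
        "batch_size": 16,
        "max_new_tokens": 20,
        "generation_kargs": {"num_return_sequences": 1, "num_beams": 4},
    },
}


def build_generate_kwargs(task):
    """Flatten one task entry into a single kwargs dict (hypothetical helper)."""
    settings = dict(TASKS_GENERATION_SETTINGS[task])  # copy, leave the source intact
    kwargs = dict(settings.pop("generation_kargs", {}))
    kwargs.update(settings)
    # With beam search, no more sequences can be returned than beams are kept.
    if "num_beams" in kwargs:
        assert kwargs.get("num_return_sequences", 1) <= kwargs["num_beams"]
    return kwargs


print(build_generate_kwargs("retrosynthesis"))
# {'num_return_sequences': 10, 'num_beams': 13, 'max_new_tokens': 960}
```

The result could be splatted into a call such as generator.generate(query, **kwargs).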

Generation API

The LlaSMolGeneration class provides the generation interface:

Initialization

generation.py:65-70

class LlaSMolGeneration(object):
    def __init__(self, model_name, base_model=None, device=None):
        self.prompter = GeneralPrompter(get_chat_content)
        self.tokenizer, self.model = load_tokenizer_and_model(
            model_name, base_model=base_model, device=device
        )
        self.device = self.model.device

Generation Method

generation.py:114

def generate(self, input_text, batch_size=1, max_input_tokens=512, 
             max_new_tokens=1024, canonicalize_smiles=True, 
             print_out=False, **generation_settings):

Parameters:

input_text: Query string or list of queries
batch_size: Number of samples per batch (default: 1)
max_input_tokens: Maximum input length (default: 512)
max_new_tokens: Maximum output length (default: 1024)
canonicalize_smiles: Canonicalize SMILES in input (default: True)
print_out: Print input/output during generation (default: False)
**generation_settings: Additional generation parameters
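To make the interaction between input_text and batch_size concrete, the sketch below shows one plausible way a single query or a list of queries could be split into batches. It illustrates the parameter semantics only; the actual batching logic in generation.py is not shown in this document and may differ.

```python
def iter_batches(input_text, batch_size=1):
    """Yield lists of at most batch_size queries (illustrative only)."""
    # A single string is treated as a one-element list of queries.
    queries = [input_text] if isinstance(input_text, str) else list(input_text)
    for start in range(0, len(queries), batch_size):
        yield queries[start:start + batch_size]


print(list(iter_batches(["q1", "q2", "q3"], batch_size=2)))
# [['q1', 'q2'], ['q3']]
```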

Example Usage

from LLM4Chem.generation import LlaSMolGeneration

# Initialize generator
generator = LlaSMolGeneration(
    "osunlp/LlaSMol-Mistral-7B",
    device="cuda"
)

# Generate response
query = "What is the SMILES for <IUPAC> ethanol </IUPAC>?"
response = generator.generate(
    query,
    max_new_tokens=512,
    num_return_sequences=3,
    num_beams=5
)

print(response[0]["output"][0])
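The indexing in the example suggests that the response is a list with one entry per query, each holding an "output" list of candidates. Under that assumption, the following sketch walks over every candidate rather than only the first. The mock response and its placeholder strings stand in for real model output, and ranked_candidates is a hypothetical helper.

```python
# Stand-in for the value returned by generator.generate(); the strings are
# placeholders, not real model output.
mock_response = [
    {"output": ["candidate 1", "candidate 2", "candidate 3"]},
]


def ranked_candidates(response):
    """Flatten a response into (query_index, rank, text) tuples."""
    return [
        (query_index, rank, text)
        for query_index, item in enumerate(response)
        for rank, text in enumerate(item["output"], start=1)
    ]


for query_index, rank, text in ranked_candidates(mock_response):
    print(f"query {query_index}, rank {rank}: {text}")
```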

Supported Tasks

Models support the following chemistry tasks (config.py:4-19):

Synthesis Tasks

forward_synthesis - Predict products from reactants
retrosynthesis - Predict reactants from products

Molecule Understanding

molecule_captioning - Generate descriptions of molecules
molecule_generation - Generate molecules from descriptions

Name Conversion

name_conversion-i2f - IUPAC to molecular formula
name_conversion-i2s - IUPAC to SMILES
name_conversion-s2f - SMILES to molecular formula
name_conversion-s2i - SMILES to IUPAC

Property Prediction

property_prediction-esol - Water solubility
property_prediction-lipo - Lipophilicity
property_prediction-bbbp - Blood-brain barrier permeability
property_prediction-clintox - Clinical toxicity
property_prediction-hiv - HIV activity
property_prediction-sider - Side effects
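The task identifiers follow a family-subtask naming pattern, which makes them easy to group programmatically. The sketch below copies the names from the lists above; the split_task and group_by_family helpers are hypothetical and not part of LLM4Chem.

```python
# Task names copied from the lists above.
TASKS = (
    "forward_synthesis", "retrosynthesis",
    "molecule_captioning", "molecule_generation",
    "name_conversion-i2f", "name_conversion-i2s",
    "name_conversion-s2f", "name_conversion-s2i",
    "property_prediction-esol", "property_prediction-lipo",
    "property_prediction-bbbp", "property_prediction-clintox",
    "property_prediction-hiv", "property_prediction-sider",
)


def split_task(task):
    """Split 'family-subtask' names; tasks without a hyphen have no subtask."""
    family, _, subtask = task.partition("-")
    return family, subtask or None


def group_by_family(tasks):
    """Map each family to the list of its subtasks (empty if it has none)."""
    groups = {}
    for task in tasks:
        family, subtask = split_task(task)
        groups.setdefault(family, [])
        if subtask is not None:
            groups[family].append(subtask)
    return groups


print(group_by_family(TASKS)["name_conversion"])
# ['i2f', 'i2s', 's2f', 's2i']
```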

Model Requirements

Dependencies

transformers==4.34.1
torch==2.0.0
peft==0.7.0
accelerate==0.24.1
bitsandbytes==0.41.3.post2
sentencepiece==0.1.99
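Version mismatches are a common source of loading errors, so it can help to compare the installed packages against these pins before starting the agent. The sketch below uses only the standard library's importlib.metadata. The find_mismatches helper is hypothetical, and the package names and pins are taken from the list above.

```python
from importlib import metadata

PINNED = {
    "transformers": "4.34.1",
    "torch": "2.0.0",
    "peft": "0.7.0",
    "accelerate": "0.24.1",
    "bitsandbytes": "0.41.3.post2",
    "sentencepiece": "0.1.99",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu117" when comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[package] = (expected, installed)
    return mismatches


for package, (expected, installed) in find_mismatches(PINNED).items():
    print(f"{package}: expected {expected}, found {installed}")
```

An empty result means every pinned package is installed at the expected version.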

Hardware Requirements

Component	Requirement
GPU	CUDA-enabled (NVIDIA)
VRAM	Minimum 15GB
CUDA	Version compatible with PyTorch 2.0.0
Driver	NVIDIA driver supporting CUDA

Troubleshooting

Model Not Found

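If you encounter model download errors, authenticate with Hugging Face first. Some base models, such as meta-llama/Llama-2-7b-hf, are gated and also require accepting their license on the model page: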

# Ensure you have Hugging Face access
from huggingface_hub import login
login(token="your_hf_token")

Base Model Not Resolved

If you see:

AssertionError: Please assign the corresponding base model.

Ensure your model name is in the BASE_MODELS dictionary, or pass the base model explicitly:

generator = LlaSMolGeneration(
    "osunlp/LlaSMol-Mistral-7B",
    base_model="mistralai/Mistral-7B-v0.1",
    device="cuda"
)

CUDA Out of Memory

See VRAM Settings for memory optimization strategies.

Next Steps

VRAM Settings

Configure memory requirements

Chemistry Tools

Learn about available chemistry tools
