Overview
The usage_metrics module provides provider-agnostic utilities for extracting token usage and cost information from LLM response messages. It handles the variations in metadata structure across different providers (OpenAI, HuggingFace, etc.) and ensures consistent usage tracking.
extract_usage_from_ai_message()
Extract token usage from an LLM response message in a provider-agnostic way.
Signature
Parameters
LLM response message object (typically AIMessage from LangChain)
Returns
Dictionary containing:
- input_tokens (int): Number of input/prompt tokens
- output_tokens (int): Number of output/completion tokens
- total_tokens (int): Total tokens (input + output)
- usage_source (str): Source of usage data ("usage_metadata", "response_metadata", or "missing")
Extraction Priority
The function searches for usage information in the following order:
- message.usage_metadata (LangChain standard)
- message.response_metadata["token_usage"]
- message.response_metadata["usage"]
- Returns zeros if not found
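The lookup order above can be sketched roughly as follows. This is an illustration under stated assumptions, not the module's actual code: `find_usage_block` is a hypothetical name, the messages are `SimpleNamespace` stand-ins rather than real AIMessage objects, and skipping empty dictionaries is an assumption about how "not found" is decided.

```python
from types import SimpleNamespace


def find_usage_block(message):
    """Hypothetical helper: return (raw_usage_dict, usage_source)."""
    # 1. LangChain's standard location.
    usage = getattr(message, "usage_metadata", None)
    if isinstance(usage, dict) and usage:
        return usage, "usage_metadata"

    # 2. and 3. Provider-specific keys inside response_metadata.
    response_metadata = getattr(message, "response_metadata", None)
    if not isinstance(response_metadata, dict):
        response_metadata = {}
    for key in ("token_usage", "usage"):
        candidate = response_metadata.get(key)
        if isinstance(candidate, dict) and candidate:
            return candidate, "response_metadata"

    # 4. Nothing found: the caller would report zeros.
    return {}, "missing"


# Stand-in messages covering the three outcomes.
standard = SimpleNamespace(
    usage_metadata={"input_tokens": 12, "output_tokens": 30, "total_tokens": 42},
    response_metadata={},
)
openai_style = SimpleNamespace(
    usage_metadata=None,
    response_metadata={"token_usage": {"prompt_tokens": 12, "completion_tokens": 30}},
)
empty = SimpleNamespace(usage_metadata=None, response_metadata={})

for msg in (standard, openai_style, empty):
    raw, source = find_usage_block(msg)
    print(source, raw)
```

Both response_metadata locations map to the same "response_metadata" source label, which matches the three usage_source values listed under Returns.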
Field Name Mapping
The function handles multiple field name variations:
- Input tokens: input_tokens, prompt_tokens, input
- Output tokens: output_tokens, completion_tokens, output
- Total tokens: total_tokens, total
If total_tokens is not provided or is 0, it is calculated as input_tokens + output_tokens.
Example
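A minimal sketch of the field-name normalization and the total_tokens fallback, assuming the raw usage dictionary has already been located. The names `normalize_usage`, `first_present`, and `to_int` are hypothetical, and the coercion rule (unparseable values become 0) is an assumption rather than documented behavior.

```python
INPUT_KEYS = ("input_tokens", "prompt_tokens", "input")
OUTPUT_KEYS = ("output_tokens", "completion_tokens", "output")
TOTAL_KEYS = ("total_tokens", "total")


def to_int(value, default=0):
    """Coerce a value to int; fall back to default for None or bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def first_present(raw, keys):
    """Return the first non-None value stored under any candidate key."""
    for key in keys:
        if raw.get(key) is not None:
            return raw[key]
    return None


def normalize_usage(raw):
    """Map provider-specific field names onto the three standard counters."""
    input_tokens = to_int(first_present(raw, INPUT_KEYS))
    output_tokens = to_int(first_present(raw, OUTPUT_KEYS))
    total_tokens = to_int(first_present(raw, TOTAL_KEYS))
    # Missing or zero totals are derived from the other two counters.
    if total_tokens == 0:
        total_tokens = input_tokens + output_tokens
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": total_tokens,
    }


# OpenAI-style names, one value as a string, no total provided.
print(normalize_usage({"prompt_tokens": 12, "completion_tokens": "30"}))
```

Listing the candidate keys in tuples keeps the priority explicit: when a provider sends both input_tokens and prompt_tokens, the first name in the tuple wins.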
Usage Sources
- "usage_metadata": Token usage was found in message.usage_metadata (LangChain standard location)
- "response_metadata": Token usage was found in message.response_metadata["token_usage"] or message.response_metadata["usage"]
- "missing": No token usage information found; all token counts are 0
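Because a "missing" source means every count is 0, it can help to measure how often it occurs. The sketch below tallies usage_source values over a batch of result dictionaries; `summarize_usage_sources` is a hypothetical helper, and the input is assumed to be a list of dictionaries shaped like the documented return value.

```python
from collections import Counter


def summarize_usage_sources(usage_results):
    """Count usage_source values and compute the share reported as missing."""
    counts = Counter(
        result.get("usage_source", "missing") for result in usage_results
    )
    total = sum(counts.values())
    missing_rate = counts["missing"] / total if total else 0.0
    return dict(counts), missing_rate


# Illustrative results; the token counts are made up.
results = [
    {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15, "usage_source": "usage_metadata"},
    {"input_tokens": 8, "output_tokens": 2, "total_tokens": 10, "usage_source": "response_metadata"},
    {"input_tokens": 7, "output_tokens": 3, "total_tokens": 10, "usage_source": "usage_metadata"},
    {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0, "usage_source": "missing"},
]

counts, missing_rate = summarize_usage_sources(results)
print(counts, missing_rate)
```

A rising missing rate is a hint that a provider integration is not returning usage metadata, so zero counts should not be read as free calls.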
extract_cost_from_ai_message()
Extract provider-reported cost from an LLM response message when available.
Signature
Parameters
LLM response message object (typically AIMessage from LangChain)
Returns
Dictionary containing:
- total_cost (Optional[float]): Provider-reported cost in USD, or None if not available
- cost_source (str): Source of cost data ("response_metadata", "response_metadata.usage", "response_metadata.billing", or "missing")
Extraction Priority
The function searches for cost information in the following order:
- Direct fields in response_metadata: total_cost, cost, usd_cost
- Fields in response_metadata["usage"]: total_cost, cost, usd_cost
- Fields in response_metadata["billing"]: total_cost, cost, usd_cost
- Returns None if not found
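The search order above might look roughly like this. It is a sketch under assumptions, not the module's implementation: `find_cost` and `to_float` are hypothetical names, the message is a `SimpleNamespace` stand-in, and treating unparseable values as absent is an assumption.

```python
from types import SimpleNamespace

COST_KEYS = ("total_cost", "cost", "usd_cost")


def to_float(value):
    """Coerce a value to float; return None for None or bad input."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def find_cost(message):
    """Hypothetical helper: look for a provider-reported cost, never estimate."""
    response_metadata = getattr(message, "response_metadata", None)
    if not isinstance(response_metadata, dict):
        response_metadata = {}

    # Containers are checked in the documented priority order.
    locations = (
        (response_metadata, "response_metadata"),
        (response_metadata.get("usage"), "response_metadata.usage"),
        (response_metadata.get("billing"), "response_metadata.billing"),
    )
    for container, source in locations:
        if not isinstance(container, dict):
            continue
        for key in COST_KEYS:
            cost = to_float(container.get(key))
            if cost is not None:
                return {"total_cost": cost, "cost_source": source}

    return {"total_cost": None, "cost_source": "missing"}


billed = SimpleNamespace(response_metadata={"billing": {"usd_cost": "0.0021"}})
unbilled = SimpleNamespace(response_metadata={"token_usage": {"prompt_tokens": 12}})

print(find_cost(billed))
print(find_cost(unbilled))
```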
Important Behavior
This function intentionally does not estimate cost from a local price table. If the provider does not return billing metadata, cost is reported as missing. Use the pricing.resolve_total_cost() function for cost estimation.
Example
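One way to combine the two behaviors is to use the provider-reported cost when it exists and fall back to an estimator only when total_cost is None. In the sketch below, `choose_cost`, `estimate_from_price_table`, the "estimated" label, and the price table are all hypothetical; the estimator is a stand-in, because the signature of pricing.resolve_total_cost() is not shown here.

```python
# Hypothetical prices in USD per 1,000,000 tokens; the numbers are made up.
PRICE_PER_MILLION = {"input": 0.50, "output": 1.50}


def estimate_from_price_table(usage):
    """Stand-in estimator; the real estimation lives in the pricing module."""
    input_cost = usage["input_tokens"] / 1_000_000 * PRICE_PER_MILLION["input"]
    output_cost = usage["output_tokens"] / 1_000_000 * PRICE_PER_MILLION["output"]
    return input_cost + output_cost


def choose_cost(cost_result, usage, estimator):
    """Prefer provider-reported cost; estimate only when none was reported."""
    if cost_result.get("total_cost") is not None:
        return cost_result["total_cost"], cost_result["cost_source"]
    return estimator(usage), "estimated"


usage = {"input_tokens": 2000, "output_tokens": 1000, "total_tokens": 3000}

reported = {"total_cost": 0.004, "cost_source": "response_metadata.billing"}
not_reported = {"total_cost": None, "cost_source": "missing"}

print(choose_cost(reported, usage, estimate_from_price_table))
print(choose_cost(not_reported, usage, estimate_from_price_table))
```

Keeping a source label next to the number lets downstream reports separate billed amounts from estimates.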
Cost Sources
- "response_metadata": Cost was found in direct fields of message.response_metadata
- "response_metadata.usage": Cost was found in message.response_metadata["usage"]
- "response_metadata.billing": Cost was found in message.response_metadata["billing"]
- "missing": No provider-reported cost found; total_cost is None
Complete Usage Tracking Example
Track Usage and Cost for Single Query
Aggregate Metrics Across Multiple Queries
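A minimal aggregation sketch, assuming each record merges the documented usage and cost dictionaries for one call. `aggregate_metrics` is a hypothetical helper, and the choice to report total_cost as None when no call carried a cost is an assumption made to keep "unknown" distinct from "free".

```python
def aggregate_metrics(records):
    """Sum token counts and provider-reported costs over many calls."""
    totals = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    total_cost = 0.0
    calls = 0
    calls_with_cost = 0

    for record in records:
        calls += 1
        for key in totals:
            totals[key] += record.get(key, 0)
        cost = record.get("total_cost")
        if cost is not None:
            total_cost += cost
            calls_with_cost += 1

    return {
        **totals,
        "calls": calls,
        "avg_total_tokens": totals["total_tokens"] / calls if calls else 0.0,
        # None signals that no call reported a cost at all.
        "total_cost": total_cost if calls_with_cost else None,
        "calls_with_cost": calls_with_cost,
    }


# Illustrative records; the values are made up.
records = [
    {"input_tokens": 10, "output_tokens": 20, "total_tokens": 30, "total_cost": 0.25},
    {"input_tokens": 5, "output_tokens": 5, "total_tokens": 10, "total_cost": None},
]

print(aggregate_metrics(records))
```

Reporting calls_with_cost alongside total_cost shows how much of the batch the cost figure actually covers.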
Utility Functions
Internal Helpers
The module includes internal utility functions for safe type coercion.
Provider Compatibility
Supported Providers
- OpenAI: Full support for usage_metadata and response_metadata extraction
- HuggingFace: Full support for TGI and Inference Endpoint metadata
- Other Providers: Graceful fallback with the "missing" source indicator
Metadata Structure Variations
The module handles common metadata structures such as the LangChain standard (usage_metadata).
Best Practices
Always extract usage before cost calculation
Token usage information is required for cost estimation. Always call extract_usage_from_ai_message() before resolve_total_cost().
Check usage_source for data quality
Monitor the usage_source field to identify when usage data is missing. This helps catch configuration issues early.
Use provider-reported cost when available
Provider-reported costs are more accurate than estimates. Always prefer extract_cost_from_ai_message() results when total_cost is not None.
Aggregate metrics for batch operations
When evaluating multiple examples, aggregate metrics across all calls to get total costs and average usage patterns.
