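Classification is one of the most common use cases for LLMs. In BAML, you declare the labels as an enum (or a class containing one) and use it as the function's return type, so the model's answer reaches your code as a typed, validated value rather than free-form text.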

Single-Label Classification

Spam Detection

Let’s start with a simple spam classifier that categorizes messages as SPAM or NOT_SPAM.

Define the Schema

First, define an enum for your classification labels:
spam_classifier.baml
enum MessageType {
  SPAM
  NOT_SPAM
}

Create the Classification Function

spam_classifier.baml
function ClassifyText(input: string) -> MessageType {
  client "openai/gpt-4o-mini"
  prompt #"
    Classify the message. 

    {{ ctx.output_format }}

    {{ _.role("user") }} 
    
    {{ input }}
  "#
}
The {{ ctx.output_format }} macro automatically injects instructions for the LLM to return valid enum values.
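
To make the idea concrete, here is a minimal Python sketch of how a list of allowed labels can be derived from an enum and turned into prompt text. The helper `render_enum_instructions` and its wording are hypothetical and only illustrate the concept; they are not BAML's implementation, and the text BAML actually injects may look different.

```python
from enum import Enum
from typing import Type


class MessageType(Enum):
    # Local stand-in for the BAML enum defined above
    SPAM = "SPAM"
    NOT_SPAM = "NOT_SPAM"


def render_enum_instructions(enum_cls: Type[Enum]) -> str:
    """Hypothetical helper: list the allowed labels as prompt text."""
    lines = [f"Answer with exactly one {enum_cls.__name__} value:"]
    lines.extend(f"- {member.name}" for member in enum_cls)
    return "\n".join(lines)


instructions = render_enum_instructions(MessageType)
print(instructions)
```

Because the instructions are generated from the schema, adding a label to the enum updates the prompt without editing it by hand.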

Test Your Classifier

spam_classifier.baml
test BasicSpamTest {
  functions [ClassifyText]
  args {
    input "Buy cheap watches now! Limited time offer!!!"
  }
}

test NonSpamTest {
  functions [ClassifyText]
  args {
    input "Hey Sarah, can we meet at 3 PM tomorrow to discuss the project?"
  }
}

Usage in Code

from baml_client import b
from baml_client.types import MessageType

def classify_message(text: str) -> MessageType:
    # The generated client returns a MessageType enum member
    return b.ClassifyText(text)

# Example usage
message = "CONGRATULATIONS! You've won $1,000,000!!!"
classification = classify_message(message)

if classification == MessageType.SPAM:
    print("This is spam!")
else:
    print("This is legitimate")
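
The same pattern extends to a list of messages. The sketch below assumes the classifier is passed in as a callable, so it runs without a model call; `filter_inbox` and `fake_classify` are hypothetical names, and `MessageType` here is a local stand-in for the generated `baml_client.types.MessageType`. In real code you would pass `b.ClassifyText` instead of the stub.

```python
from collections import Counter
from enum import Enum
from typing import Callable, Iterable, List, Tuple


class MessageType(str, Enum):
    # Local stand-in for baml_client.types.MessageType
    SPAM = "SPAM"
    NOT_SPAM = "NOT_SPAM"


def filter_inbox(
    messages: Iterable[str],
    classify: Callable[[str], MessageType],
) -> Tuple[List[str], Counter]:
    """Keep non-spam messages and count how many fell under each label."""
    kept: List[str] = []
    counts: Counter = Counter()
    for text in messages:
        label = classify(text)
        counts[label] += 1
        if label != MessageType.SPAM:
            kept.append(text)
    return kept, counts


def fake_classify(text: str) -> MessageType:
    # Crude stub standing in for b.ClassifyText; not a real spam detector
    return MessageType.SPAM if "!!!" in text else MessageType.NOT_SPAM


inbox = [
    "CONGRATULATIONS! You've won $1,000,000!!!",
    "Hey Sarah, can we meet at 3 PM tomorrow to discuss the project?",
]
kept, counts = filter_inbox(inbox, fake_classify)
print(kept)
```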

Support Ticket Classification

This example classifies a support ticket into exactly one of five categories:
ticket_classifier.baml
enum Category {
  Refund
  CancelOrder
  TechnicalSupport
  AccountIssue
  Question
}

function ClassifyMessage(input: string) -> Category {
  client "openai/gpt-4o"
  prompt #"
    Classify the following INPUT into ONE of the following categories:

    INPUT: {{ input }}

    {{ ctx.output_format }}

    Response:
  "#
}

test ClassifySupport {
  functions [ClassifyMessage]
  args {
    input "I want to return my order and get a refund"
  }
}
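
Once a ticket has a single category, application code usually maps it to an action. A minimal sketch, assuming a hypothetical set of queue names and a local stand-in for the generated `Category` enum:

```python
from enum import Enum


class Category(str, Enum):
    # Local stand-in for baml_client.types.Category
    Refund = "Refund"
    CancelOrder = "CancelOrder"
    TechnicalSupport = "TechnicalSupport"
    AccountIssue = "AccountIssue"
    Question = "Question"


# Hypothetical queue names; replace with your own routing targets
QUEUES = {
    Category.Refund: "billing",
    Category.CancelOrder: "orders",
    Category.TechnicalSupport: "engineering",
    Category.AccountIssue: "accounts",
    Category.Question: "general",
}


def route(category: Category) -> str:
    """Return the queue for a category, falling back to 'general'."""
    return QUEUES.get(category, "general")


print(route(Category.Refund))
```

Keeping the mapping in a dictionary keyed by enum members makes it easy to check that every category has a destination.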

Multi-Label Classification

When an item can belong to several categories at once, return an array of enum values:
multi_label.baml
enum TicketLabel {
  ACCOUNT
  BILLING
  GENERAL_QUERY
  TECHNICAL
  URGENT
}

class TicketClassification {
  labels TicketLabel[]
  confidence string @description("High, Medium, or Low")
}

function ClassifyTicket(ticket: string) -> TicketClassification {
  client "openai/gpt-4o-mini"
  prompt #"
    You are a support agent at a tech company. 
    Analyze the support ticket and select all applicable labels.

    {{ ctx.output_format }}

    {{ _.role("user") }}
    
    {{ ticket }}
  "#
}

Multi-Label Test Cases

multi_label.baml
test SingleLabelCase {
  functions [ClassifyTicket]
  args {
    ticket "I need help resetting my password"
  }
}

test MultiLabelCase {
  functions [ClassifyTicket]
  args {
    ticket "My account is locked and I can't access my billing information. This is urgent!"
  }
}

Usage in Code

from baml_client import b
from baml_client.types import TicketClassification, TicketLabel

def escalate_ticket(ticket_text: str) -> None:
    # Placeholder: replace with your own escalation logic
    print(f"Escalating: {ticket_text}")

def categorize_ticket(ticket_text: str) -> TicketClassification:
    result = b.ClassifyTicket(ticket_text)

    print(f"Labels: {result.labels}")
    print(f"Confidence: {result.confidence}")

    # Escalate to the priority queue when the ticket is marked urgent
    if TicketLabel.URGENT in result.labels:
        escalate_ticket(ticket_text)

    return result

# Example
ticket = "I forgot my password and need to update my payment method"
classification = categorize_ticket(ticket)
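
Because `confidence` is declared as a free-form string ("High, Medium, or Low"), the model may vary its casing or spacing. The sketch below normalizes the string before comparing it; `confidence_rank` and `needs_escalation` are hypothetical helpers, and `TicketLabel` is a local stand-in for the generated enum.

```python
from enum import Enum
from typing import Iterable


class TicketLabel(str, Enum):
    # Local stand-in for baml_client.types.TicketLabel
    ACCOUNT = "ACCOUNT"
    BILLING = "BILLING"
    GENERAL_QUERY = "GENERAL_QUERY"
    TECHNICAL = "TECHNICAL"
    URGENT = "URGENT"


CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def confidence_rank(raw: str) -> int:
    """Map 'High'/'Medium'/'Low' to 2/1/0; unknown values count as low."""
    return CONFIDENCE_RANK.get(raw.strip().lower(), 0)


def needs_escalation(labels: Iterable[TicketLabel], confidence: str) -> bool:
    """Escalate only urgent tickets labeled with at least medium confidence."""
    return TicketLabel.URGENT in labels and confidence_rank(confidence) >= 1


print(needs_escalation([TicketLabel.ACCOUNT, TicketLabel.URGENT], " High "))
```

If you need stricter guarantees, declaring the confidence as an enum in the BAML schema avoids the normalization step.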

Best Practices

1. Use Descriptive Enum Values

// Good - clear and descriptive
enum Sentiment {
  POSITIVE
  NEGATIVE
  NEUTRAL
  MIXED
}

// Avoid - ambiguous
enum Sentiment {
  GOOD
  BAD
  OK
}

2. Add Context to Complex Classifications

class ContentModeration {
  category "SAFE" | "INAPPROPRIATE" | "NEEDS_REVIEW"
  reason string @description("Explanation for the classification")
  confidence float @description("Score between 0 and 1")
}

function ModerateContent(text: string) -> ContentModeration {
  client "openai/gpt-4o"
  prompt #"
    Moderate the following content for safety.
    Provide a clear reason for your classification.

    {{ ctx.output_format }}

    {{ _.role("user") }}
    {{ text }}
  "#
}

3. Test Edge Cases

test AmbiguousMessage {
  functions [ClassifyText]
  args {
    input "Is this spam? Not sure..."
  }
}

test EmptyInput {
  functions [ClassifyText]
  args {
    input ""
  }
}

test MixedContent {
  functions [ClassifyText]
  args {
    input "Hi there! Buy our product now! Also, how's the weather?"
  }
}
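
Edge cases can also be checked from application code by running a labeled set through the classifier and collecting the misses. This is a minimal sketch: `evaluate` and `stub_classify` are hypothetical, labels are plain strings for brevity, and the expected labels are illustrative rather than ground truth. In practice you would pass the generated function (for example `b.ClassifyText`) and compare against enum members.

```python
from typing import Callable, List, Tuple


def evaluate(
    cases: List[Tuple[str, str]],
    classify: Callable[[str], str],
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Return accuracy and a list of (text, expected, actual) misses."""
    misses = []
    for text, expected in cases:
        actual = classify(text)
        if actual != expected:
            misses.append((text, expected, actual))
    accuracy = (len(cases) - len(misses)) / len(cases) if cases else 0.0
    return accuracy, misses


def stub_classify(text: str) -> str:
    # Keyword stub standing in for the real model call
    return "SPAM" if "buy" in text.lower() else "NOT_SPAM"


cases = [
    ("Buy cheap watches now! Limited time offer!!!", "SPAM"),
    ("Hey Sarah, can we meet at 3 PM tomorrow to discuss the project?", "NOT_SPAM"),
    ("Hi there! Buy our product now! Also, how's the weather?", "SPAM"),
    ("", "NOT_SPAM"),
    ("CONGRATULATIONS! You've won $1,000,000!!!", "SPAM"),
]
accuracy, misses = evaluate(cases, stub_classify)
print(accuracy, misses)
```

Reviewing the misses, not only the accuracy figure, shows which kinds of input the prompt handles poorly.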

Advanced: Classification with Confidence Scores

class ClassificationResult {
  category Category
  confidence float @description("Between 0.0 and 1.0")
  alternative_categories Category[] @description("Other possible categories")
}

function ClassifyWithConfidence(input: string) -> ClassificationResult {
  client "openai/gpt-4o"
  prompt #"
    Classify the input and provide:
    1. The most likely category
    2. A confidence score (0.0 to 1.0)
    3. Alternative categories if confidence is below 0.8

    {{ ctx.output_format }}

    {{ _.role("user") }}
    {{ input }}
  "#
}

Usage in Code

from baml_client import b

result = b.ClassifyWithConfidence("Maybe I want a refund?")

if result.confidence < 0.7:
    print(f"Low confidence. Consider: {result.alternative_categories}")
    # Route to human review
else:
    print(f"Confident classification: {result.category}")
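
A model-reported confidence is just another generated field, so it is worth guarding against out-of-range values before branching on it. The sketch below clamps the score to [0, 1] and returns a routing decision; `Result` is a local stand-in for the generated `ClassificationResult`, and `decide` with its 0.7 default threshold is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Result:
    # Local stand-in for baml_client.types.ClassificationResult
    category: str
    confidence: float
    alternative_categories: List[str] = field(default_factory=list)


def decide(result: Result, threshold: float = 0.7) -> Tuple[str, List[str]]:
    """Route to human review below the threshold, otherwise accept."""
    # Clamp in case the model returns a score outside [0, 1]
    score = min(max(result.confidence, 0.0), 1.0)
    if score < threshold:
        candidates = [result.category, *result.alternative_categories]
        return "human_review", candidates
    return "auto", [result.category]


print(decide(Result("Refund", 0.55, ["Question"])))
```

Treat the score as a rough signal for routing rather than a calibrated probability.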
