Classification Examples

Classification is one of the most common use cases for LLMs. In BAML, you declare the labels as an enum or class, and the function returns a typed value parsed against that schema.

Single-Label Classification

Spam Detection

Let’s start with a simple spam classifier that categorizes messages as SPAM or NOT_SPAM.

Define the Schema

First, define an enum for your classification labels:

spam_classifier.baml

enum MessageType {
  SPAM
  NOT_SPAM
}

Create the Classification Function

spam_classifier.baml

function ClassifyText(input: string) -> MessageType {
  client "openai/gpt-4o-mini"
  prompt #"
    Classify the message. 

    {{ ctx.output_format }}

    {{ _.role("user") }} 
    
    {{ input }}
  "#
}

The {{ ctx.output_format }} macro renders instructions derived from the function's return type, here the valid MessageType values, so you don't have to list them in the prompt by hand.

Test Your Classifier

spam_classifier.baml

test BasicSpamTest {
  functions [ClassifyText]
  args {
    input "Buy cheap watches now! Limited time offer!!!"
  }
}

test NonSpamTest {
  functions [ClassifyText]
  args {
    input "Hey Sarah, can we meet at 3 PM tomorrow to discuss the project?"
  }
}

Usage in Code

Python

from baml_client import b
from baml_client.types import MessageType

def classify_message(text: str) -> MessageType:
    result = b.ClassifyText(text)
    return result

# Example usage
message = "CONGRATULATIONS! You've won $1,000,000!!!"
classification = classify_message(message)

if classification == MessageType.SPAM:
    print("This is spam!")
else:
    print("This is legitimate")

TypeScript

import { b } from './baml_client'
import { MessageType } from './baml_client/types'

async function classifyMessage(text: string): Promise<MessageType> {
  return await b.ClassifyText(text)
}

// Example usage
const message = "CONGRATULATIONS! You've won $1,000,000!!!"
const classification = await classifyMessage(message)

if (classification === MessageType.SPAM) {
  console.log("This is spam!")
} else {
  console.log("This is legitimate")
}
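
In an application, the classifier is often applied to a batch of messages rather than a single one. The sketch below shows one way to do that. It assumes a local stand-in MessageType enum that mirrors the BAML enum, and it uses a hypothetical keyword_classifier stub in place of b.ClassifyText so the example runs without a generated client or an API key. The name partition_messages is also illustrative.

```python
from enum import Enum
from typing import Callable, Iterable


class MessageType(Enum):
    """Local stand-in mirroring the BAML enum (assumption, not the generated type)."""
    SPAM = "SPAM"
    NOT_SPAM = "NOT_SPAM"


def partition_messages(
    messages: Iterable[str],
    classify: Callable[[str], MessageType],
) -> tuple[list[str], list[str]]:
    """Split messages into (legitimate, spam) using the supplied classifier."""
    legitimate: list[str] = []
    spam: list[str] = []
    for text in messages:
        if classify(text) == MessageType.SPAM:
            spam.append(text)
        else:
            legitimate.append(text)
    return legitimate, spam


def keyword_classifier(text: str) -> MessageType:
    """Hypothetical offline stub; real code would pass b.ClassifyText instead."""
    return MessageType.SPAM if "won $" in text else MessageType.NOT_SPAM


inbox = [
    "CONGRATULATIONS! You've won $1,000,000!!!",
    "Hey Sarah, can we meet at 3 PM tomorrow to discuss the project?",
]
legitimate, spam = partition_messages(inbox, keyword_classifier)
print(f"{len(legitimate)} legitimate, {len(spam)} spam")
```

Passing the classifier in as an argument keeps the batching logic testable offline. With the real client you would import MessageType from baml_client.types rather than redefining it.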

Support Ticket Classification

Here’s a more sophisticated example that classifies support tickets into categories:

ticket_classifier.baml

enum Category {
  Refund
  CancelOrder
  TechnicalSupport
  AccountIssue
  Question
}

function ClassifyMessage(input: string) -> Category {
  client "openai/gpt-4o"
  prompt #"
    Classify the following INPUT into ONE of the following categories:

    INPUT: {{ input }}

    {{ ctx.output_format }}

    Response:
  "#
}

test ClassifySupport {
  functions [ClassifyMessage]
  args {
    input "I want to return my order and get a refund"
  }
}
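
Once a ticket has a category, a common next step is to route it. The following is a minimal sketch under stated assumptions: Category is a local stand-in that mirrors the BAML enum, and the queue names and the route helper are hypothetical.

```python
from enum import Enum


class Category(Enum):
    """Local stand-in mirroring the BAML enum (assumption, not the generated type)."""
    Refund = "Refund"
    CancelOrder = "CancelOrder"
    TechnicalSupport = "TechnicalSupport"
    AccountIssue = "AccountIssue"
    Question = "Question"


# Hypothetical queue names, chosen only for illustration.
QUEUE_BY_CATEGORY: dict[Category, str] = {
    Category.Refund: "billing",
    Category.CancelOrder: "billing",
    Category.TechnicalSupport: "engineering",
    Category.AccountIssue: "accounts",
    Category.Question: "general",
}


def route(category: Category) -> str:
    """Return the queue name configured for a category."""
    return QUEUE_BY_CATEGORY[category]


# Fail fast if a category is added to the enum without a matching queue.
missing = [c for c in Category if c not in QUEUE_BY_CATEGORY]
assert not missing, f"No queue configured for: {missing}"

print(route(Category.Refund))  # prints "billing"
```

Because the function returns an enum rather than free text, the lookup table can be checked for completeness at import time.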

Multi-Label Classification

For cases where an item can belong to multiple categories simultaneously, use arrays:

multi_label.baml

enum TicketLabel {
  ACCOUNT
  BILLING
  GENERAL_QUERY
  TECHNICAL
  URGENT
}

class TicketClassification {
  labels TicketLabel[]
  confidence string @description("High, Medium, or Low")
}

function ClassifyTicket(ticket: string) -> TicketClassification {
  client "openai/gpt-4o-mini"
  prompt #"
    You are a support agent at a tech company. 
    Analyze the support ticket and select all applicable labels.

    {{ ctx.output_format }}

    {{ _.role("user") }}
    
    {{ ticket }}
  "#
}

Multi-Label Test Cases

multi_label.baml

test SingleLabelCase {
  functions [ClassifyTicket]
  args {
    ticket "I need help resetting my password"
  }
}

test MultiLabelCase {
  functions [ClassifyTicket]
  args {
    ticket "My account is locked and I can't access my billing information. This is urgent!"
  }
}

Usage in Code

Python

from baml_client import b
from baml_client.types import TicketLabel

def categorize_ticket(ticket_text: str):
    result = b.ClassifyTicket(ticket_text)

    print(f"Labels: {result.labels}")
    print(f"Confidence: {result.confidence}")

    # Check for specific labels
    if TicketLabel.URGENT in result.labels:
        # Escalate to priority queue. escalate_ticket is a placeholder for
        # your own escalation logic; it is not part of the BAML client.
        escalate_ticket(ticket_text)

    return result

# Example
ticket = "I forgot my password and need to update my payment method"
classification = categorize_ticket(ticket)

TypeScript

import { b } from './baml_client'
import { TicketLabel } from './baml_client/types'

async function categorizeTicket(ticketText: string) {
  const result = await b.ClassifyTicket(ticketText)
  
  console.log(`Labels: ${result.labels}`)
  console.log(`Confidence: ${result.confidence}`)
  
  // Check for specific labels
  if (result.labels.includes(TicketLabel.URGENT)) {
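    // Escalate to priority queue. escalateTicket is a placeholder for
    // your own escalation logic; it is not part of the BAML client.
    await escalateTicket(ticketText)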
  }
  
  return result
}

// Example
const ticket = "I forgot my password and need to update my payment method"
const classification = await categorizeTicket(ticket)
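
Because confidence is declared as a string in TicketClassification, the returned text may vary in case or spacing. One possible way to handle that in application code is sketched below. ConfidenceLevel and parse_confidence are hypothetical names introduced for this example, not part of BAML.

```python
from enum import Enum
from typing import Optional


class ConfidenceLevel(Enum):
    """Hypothetical application-side enum for the three documented levels."""
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def parse_confidence(raw: str) -> Optional[ConfidenceLevel]:
    """Map free-text confidence to a ConfidenceLevel, or None if unrecognized."""
    cleaned = raw.strip().lower()
    for level in ConfidenceLevel:
        if cleaned == level.value:
            return level
    return None


print(parse_confidence(" High "))     # prints "ConfidenceLevel.HIGH"
print(parse_confidence("very sure"))  # prints "None"
```

An alternative design is to declare the field as an enum in the BAML schema, as labels already is, so that unexpected values are caught when the response is parsed.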

Best Practices

1. Use Descriptive Enum Values

// Good - clear and descriptive
enum Sentiment {
  POSITIVE
  NEGATIVE
  NEUTRAL
  MIXED
}

// Avoid - ambiguous
enum Sentiment {
  GOOD
  BAD
  OK
}

2. Add Context to Complex Classifications

class ContentModeration {
  category "SAFE" | "INAPPROPRIATE" | "NEEDS_REVIEW"
  reason string @description("Explanation for the classification")
  confidence float @description("Score between 0 and 1")
}

function ModerateContent(text: string) -> ContentModeration {
  client "openai/gpt-4o"
  prompt #"
    Moderate the following content for safety.
    Provide a clear reason for your classification.

    {{ ctx.output_format }}

    {{ _.role("user") }}
    {{ text }}
  "#
}
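
The extra fields are only useful if the calling code acts on them. The sketch below shows one way to turn a moderation result into an action. ModerationResult is a local stand-in for the generated ContentModeration type, and decide_action, the 0.8 default, and the action names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModerationResult:
    """Local stand-in for the generated ContentModeration type (assumption)."""
    category: str
    reason: str
    confidence: float


def decide_action(result: ModerationResult, min_confidence: float = 0.8) -> str:
    """Return 'publish', 'block', or 'review' (hypothetical action names)."""
    # A score outside [0, 1] (or NaN) is treated as unreliable.
    if not 0.0 <= result.confidence <= 1.0:
        return "review"
    # Low-confidence results go to a human regardless of category.
    if result.confidence < min_confidence:
        return "review"
    if result.category == "SAFE":
        return "publish"
    if result.category == "INAPPROPRIATE":
        return "block"
    # NEEDS_REVIEW and any unexpected value fall through to review.
    return "review"


example = ModerationResult(category="SAFE", reason="Friendly greeting", confidence=0.95)
print(decide_action(example))  # prints "publish"
```

Storing reason alongside the decision gives reviewers the explanation the prompt asked the model to provide.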

3. Test Edge Cases

test AmbiguousMessage {
  functions [ClassifyText]
  args {
    input "Is this spam? Not sure..."
  }
}

test EmptyInput {
  functions [ClassifyText]
  args {
    input ""
  }
}

test MixedContent {
  functions [ClassifyText]
  args {
    input "Hi there! Buy our product now! Also, how's the weather?"
  }
}
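
Some edge cases, such as empty input, can also be handled before the model is called. The following is a minimal sketch of that idea. classify_if_nonempty is a hypothetical helper, and recording_stub stands in for a generated BAML function so the example runs offline.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def classify_if_nonempty(text: str, classify: Callable[[str], T]) -> Optional[T]:
    """Call classify only when text has non-whitespace content; else return None."""
    cleaned = text.strip()
    if not cleaned:
        return None
    return classify(cleaned)


calls: list[str] = []


def recording_stub(text: str) -> str:
    """Hypothetical stand-in for a generated BAML function; records its input."""
    calls.append(text)
    return "NOT_SPAM"


print(classify_if_nonempty("   ", recording_stub))  # prints "None"
print(classify_if_nonempty(" Is this spam? Not sure... ", recording_stub))
print(calls)  # only the non-empty message reached the classifier
```

Returning None makes the skipped case explicit, so the caller decides what an empty message means instead of relying on whatever label the model picks.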

Advanced: Classification with Confidence Scores

class ClassificationResult {
  category Category
  confidence float @description("Between 0.0 and 1.0")
  alternative_categories Category[] @description("Other possible categories")
}

function ClassifyWithConfidence(input: string) -> ClassificationResult {
  client "openai/gpt-4o"
  prompt #"
    Classify the input and provide:
    1. The most likely category
    2. A confidence score (0.0 to 1.0)
    3. Alternative categories if confidence is below 0.7

    {{ ctx.output_format }}

    {{ _.role("user") }}
    {{ input }}
  "#
}

Python

from baml_client import b

result = b.ClassifyWithConfidence("Maybe I want a refund?")

if result.confidence < 0.7:
    print(f"Low confidence. Consider: {result.alternative_categories}")
    # Route to human review
else:
    print(f"Confident classification: {result.category}")

TypeScript

import { b } from './baml_client'

const result = await b.ClassifyWithConfidence("Maybe I want a refund?")

if (result.confidence < 0.7) {
  console.log(`Low confidence. Consider: ${result.alternative_categories}`)
  // Route to human review
} else {
  console.log(`Confident classification: ${result.category}`)
}
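
When a result is routed to human review, it can help to present the primary category together with the alternatives as a single de-duplicated list. The sketch below is one way to build that list. The ClassificationResult dataclass is a local stand-in that uses plain strings for categories, and review_candidates is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class ClassificationResult:
    """Local stand-in for the generated type, with categories as strings (assumption)."""
    category: str
    confidence: float
    alternative_categories: list[str] = field(default_factory=list)


def review_candidates(result: ClassificationResult, threshold: float = 0.7) -> list[str]:
    """Return categories to show a reviewer; an empty list means auto-accept."""
    if result.confidence >= threshold:
        return []
    candidates: list[str] = []
    # Keep the primary category first and drop repeats while preserving order.
    for name in [result.category, *result.alternative_categories]:
        if name not in candidates:
            candidates.append(name)
    return candidates


uncertain = ClassificationResult("Refund", 0.55, ["Question", "Refund", "CancelOrder"])
print(review_candidates(uncertain))  # prints "['Refund', 'Question', 'CancelOrder']"
```

The de-duplication step covers the case where the model repeats the primary category among the alternatives.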

Next Steps

Learn about Data Extraction for more complex structured outputs
Explore Tool Calling to combine classification with actions
Check out Prompt Engineering Tips for better accuracy
