Industrial Anomaly Datasets

Industrial anomaly datasets target manufacturing-defect detection and come with pixel-level ground-truth masks. LAFT supports MVTec AD and VisA, two widely used benchmarks for industrial anomaly detection.

Overview

Industrial datasets inherit from IndustrialAnomalyDataset and provide:

Image-level labels: Boolean tensor indicating normal (False) or anomalous (True)
Pixel-level masks: Binary masks showing exact defect locations
Category-based: Each dataset has multiple object categories (bottles, cables, etc.)
Mask transforms: A separate mask_transform is applied to each mask, independently of the image transform

Building an Industrial Dataset

Use the build_industrial_dataset() function to load MVTec AD or VisA:

from laft.datasets import build_industrial_dataset
from torchvision import transforms
import torch

image_transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
])

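mask_transform = transforms.Compose([
    # Masks arrive as bool [H, W]; convert to float [1, H, W] before resizing
    transforms.Lambda(lambda x: x.float().unsqueeze(0)),
    # Nearest-neighbor keeps mask values at exactly 0.0 or 1.0
    transforms.Resize(224, interpolation=transforms.InterpolationMode.NEAREST),
])

dataset = build_industrial_dataset(
    name="mvtec",              # or "visa"
    category="bottle",         # object category
    split="test",              # "train" or "test"
    root="./data",
    transform=image_transform,
    mask_transform=mask_transform,
)

image, mask, label = dataset[0]
print(f"Label: {label}")  # True if anomalous
print(f"Mask shape: {mask.shape}")  # float [1, H', W'] here; bool [H, W] without a mask_transform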

The mask_transform receives a boolean tensor of shape [H, W]. Convert it to float and add a channel dimension before operations such as resizing.

MVTec AD

MVTec Anomaly Detection (MVTec AD) is one of the most widely used benchmarks for industrial anomaly detection. It contains 15 categories: 10 objects and 5 textures.

Dataset Structure

MVTec AD organizes each category by split and then by anomaly type, with masks in a parallel ground_truth directory:

mvtec_anomaly_detection/
├── bottle/
│   ├── train/
│   │   └── good/          # Normal samples only
│   ├── test/
│   │   ├── good/          # Normal samples
│   │   ├── broken_large/  # Anomaly type
│   │   ├── broken_small/
│   │   └── contamination/
│   └── ground_truth/      # Pixel-level masks
│       ├── broken_large/
│       ├── broken_small/
│       └── contamination/
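
The layout above can be traversed with a few lines of standard-library code. The sketch below is not LAFT's loader: `list_test_samples` is a hypothetical helper, and it assumes the `<name>_mask.png` naming shown in the implementation excerpt further down. It runs against a throwaway tree of empty files rather than the real dataset.

```python
import tempfile
from pathlib import Path
from typing import Optional


def list_test_samples(category_root: Path) -> list[tuple[Path, Optional[Path]]]:
    """Pair each test image with its mask path (None for 'good' images)."""
    samples = []
    for image_path in sorted((category_root / "test").glob("*/*.png")):
        defect_type = image_path.parent.name
        if defect_type == "good":
            mask_path = None  # normal images have no ground-truth mask
        else:
            mask_path = (category_root / "ground_truth" / defect_type
                         / f"{image_path.stem}_mask.png")
        samples.append((image_path, mask_path))
    return samples


# Build a tiny stand-in tree with empty files, just to exercise the pairing.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "bottle"
    for rel in ["test/good/000.png",
                "test/broken_large/000.png",
                "ground_truth/broken_large/000_mask.png"]:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()

    pairs = [
        (img.relative_to(root).as_posix(),
         None if mask is None else mask.relative_to(root).as_posix())
        for img, mask in list_test_samples(root)
    ]

print(pairs)
```

Images under good/ map to None because the ground_truth directory only has subfolders for anomaly types.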

Usage Example

from laft.datasets import build_industrial_dataset

# Load test set for a category
dataset = build_industrial_dataset(
    name="mvtec",
    category="bottle",
    split="test",
    root="./data",
)

print(f"Total samples: {len(dataset)}")
print(f"Anomalies: {dataset.labels.sum().item()}")
print(f"Normal: {(~dataset.labels).sum().item()}")

# Iterate through dataset
for image, mask, label in dataset:
    # image: PIL.Image (or transformed tensor)
    # mask: torch.Tensor [H, W], dtype=torch.bool
    # label: torch.Tensor [], dtype=torch.bool
    
    if label:
        # Anomalous sample - mask shows defect location
        defect_pixels = mask.sum().item()
        print(f"Found anomaly with {defect_pixels} defect pixels")
    else:
        # Normal sample - mask is all zeros
        assert mask.sum() == 0

Implementation Details

From laft/datasets/mvtec.py:61-71:

def load_image(self, index: int):
    with open(os.path.join(self.data_root, self.split, f"{self.filenames[index]}.png"), "rb") as f:
        image = Image.open(f)
        image.load()
    return image

def load_mask(self, index: int):
    with open(os.path.join(self.data_root, "ground_truth", f"{self.filenames[index]}_mask.png"), "rb") as f:
        mask = Image.open(f)
        mask.load()
    return torch.from_numpy(np.asarray(mask).copy()) > 0

The > 0 comparison turns the decoded mask image into a boolean tensor: every nonzero pixel counts as defective, whatever its gray level.
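
The thresholding step can be checked on a small array. This is a minimal sketch that assumes the mask PNG decodes to an 8-bit grayscale array; the pixel values are made up for illustration.

```python
import numpy as np
import torch

# Stand-in for a decoded grayscale mask: 0 = background, nonzero = defect.
raw = np.array([[0, 255, 0],
                [0, 128, 0]], dtype=np.uint8)

# Same conversion as in load_mask: copy, wrap as a tensor, threshold at zero.
mask = torch.from_numpy(raw.copy()) > 0

print(mask.dtype)
print(mask.sum().item())  # number of defect pixels
```

Both 255 and 128 map to True, so partially gray pixels along a defect edge are still counted as defective.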

Training vs Testing

Train Split

Contains only normal samples
No anomalies included
Used for learning normal patterns

Test Split

Mix of normal and anomalous samples
Multiple defect types per category
Pixel-level masks for anomalies

VisA

The VisA (Visual Anomaly) dataset contains 12 object categories with more diverse anomaly types than MVTec AD.

Dataset Structure

VisA uses CSV metadata for organization:

VisA_20220922/
├── split_csv/
│   └── 1cls.csv          # Metadata file
├── candle/
│   ├── Data/
│   │   ├── Images/
│   │   │   ├── Normal/
│   │   │   └── Anomaly/
│   │   └── Masks/

Usage Example

from laft.datasets import build_industrial_dataset

# Load VisA dataset
dataset = build_industrial_dataset(
    name="visa",
    category="pcb1",
    split="test",
    root="./data",
)

image, mask, label = dataset[0]

# VisA provides detailed anomaly masks
if label:
    print(f"Anomaly detected with mask coverage: {mask.float().mean():.2%}")

Implementation Details

From laft/datasets/visa.py:52-59:

with open(os.path.join(self.data_root, "split_csv", self.csv_filename)) as f:
    for row in DictReader(f):
        if row["object"] == category and row["split"] == split:
            self.image_filenames.append(row["image"])
            self.mask_filenames.append(row["mask"] or None)
            label_list.append(row["label"] != "normal")

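Each CSV row names the object category, split, label, image path, and mask path, so samples are selected by filtering rows rather than by walking directories. Rows with an empty mask field are stored with a mask filename of None.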
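
The row filter can be tried without the dataset on disk. The sketch below reuses the column names from the excerpt (object, split, label, image, mask); the CSV rows are invented for illustration, and `select_rows` is a hypothetical helper, not part of LAFT.

```python
import csv
import io

# Made-up rows using the column names read by the excerpt above.
CSV_TEXT = """object,split,label,image,mask
candle,train,normal,candle/Data/Images/Normal/0000.JPG,
candle,test,normal,candle/Data/Images/Normal/0001.JPG,
candle,test,anomaly,candle/Data/Images/Anomaly/000.JPG,candle/Data/Masks/Anomaly/000.png
pcb1,test,anomaly,pcb1/Data/Images/Anomaly/000.JPG,pcb1/Data/Masks/Anomaly/000.png
"""


def select_rows(csv_text, category, split):
    """Collect image paths, mask paths, and labels for one category and split."""
    image_filenames, mask_filenames, labels = [], [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["object"] == category and row["split"] == split:
            image_filenames.append(row["image"])
            mask_filenames.append(row["mask"] or None)  # empty string -> None
            labels.append(row["label"] != "normal")
    return image_filenames, mask_filenames, labels


images, masks, labels = select_rows(CSV_TEXT, "candle", "test")
print(images)
print(masks)
print(labels)
```

The `or None` idiom is what lets downstream code tell "no mask file" apart from a real path.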

Mask Transforms

Whenever the image transform changes geometry (resize, crop, flip), the mask needs the same geometric change. Otherwise mask pixels no longer line up with image pixels.

Basic Mask Transform

from torchvision import transforms
from laft.datasets import build_industrial_dataset

# Image transform
image_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], 
                        std=[0.229, 0.224, 0.225]),
])

# Mask transform - must match image transform geometry
mask_transform = transforms.Compose([
    # bool [H, W] -> float [1, H, W], so Resize sees a channel dimension
    transforms.Lambda(lambda x: x.float().unsqueeze(0)),
    # Nearest-neighbor keeps mask values at exactly 0.0 or 1.0
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.NEAREST),
])

dataset = build_industrial_dataset(
    name="mvtec",
    category="bottle",
    split="test",
    root="./data",
    transform=image_transform,
    mask_transform=mask_transform,
)
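
A cheap guard against misaligned transforms is to compare spatial sizes after both transforms have run. The helper below is a hypothetical sketch, not part of LAFT; it assumes image and mask are tensors whose last two dimensions are height and width.

```python
import torch


def assert_aligned(image: torch.Tensor, mask: torch.Tensor) -> None:
    """Raise if the last two (spatial) dimensions of image and mask differ."""
    if image.shape[-2:] != mask.shape[-2:]:
        raise ValueError(
            f"image is {tuple(image.shape[-2:])} but mask is {tuple(mask.shape[-2:])}"
        )


image = torch.rand(3, 224, 224)
good_mask = torch.zeros(1, 224, 224)
stale_mask = torch.zeros(1, 900, 900)  # a mask that was never resized

assert_aligned(image, good_mask)  # passes silently
try:
    assert_aligned(image, stale_mask)
except ValueError as err:
    print(err)
```

Calling such a check on the first sample of a dataset catches a forgotten or mismatched mask_transform before any metric is computed.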

Advanced Mask Handling

import torch.nn.functional as F
from laft.datasets import build_industrial_dataset

class MaskTransform:
    def __init__(self, size=224):
        self.size = size
    
    def __call__(self, mask):
        # mask is boolean tensor [H, W]
        mask = mask.float().unsqueeze(0).unsqueeze(0)  # [1, 1, H, W]
        mask = F.interpolate(mask, size=(self.size, self.size), 
                            mode='nearest')
        return mask.squeeze(0).bool()  # [1, H, W], bool

mask_transform = MaskTransform(size=224)

dataset = build_industrial_dataset(
    name="mvtec",
    category="transistor",
    split="test",
    root="./data",
    mask_transform=mask_transform,
)

Use mode='nearest' for mask interpolation to preserve binary values. Avoid bilinear interpolation, which produces fractional values along defect edges.
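
The difference between the two modes is easy to see on a toy mask. This sketch uses only torch.nn.functional.interpolate on a synthetic 4x4 mask; the sizes are arbitrary.

```python
import torch
import torch.nn.functional as F

# A 4x4 boolean mask with a 2x2 defect in the top-left corner.
mask = torch.zeros(4, 4, dtype=torch.bool)
mask[:2, :2] = True
batched = mask.float()[None, None]  # [1, 1, 4, 4], as interpolate expects

nearest = F.interpolate(batched, size=(6, 6), mode="nearest")
bilinear = F.interpolate(batched, size=(6, 6), mode="bilinear", align_corners=False)

print(nearest.unique())   # only 0 and 1 survive
print(bilinear.unique())  # includes values strictly between 0 and 1
```

With bilinear resizing, a later .bool() or > 0 would silently grow the defect region, since every fractional value is nonzero.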

Working with Masks

Visualizing Anomalies

import matplotlib.pyplot as plt
from laft.datasets import build_industrial_dataset

dataset = build_industrial_dataset(
    name="mvtec",
    category="bottle",
    split="test",
    root="./data",
)

# Find first anomaly
for idx, (image, mask, label) in enumerate(dataset):
    if label:
        fig, axes = plt.subplots(1, 3, figsize=(12, 4))
        
        axes[0].imshow(image)
        axes[0].set_title("Original Image")
        axes[0].axis('off')
        
        axes[1].imshow(mask, cmap='gray')
        axes[1].set_title("Defect Mask")
        axes[1].axis('off')
        
        # Overlay
        axes[2].imshow(image)
        axes[2].imshow(mask, alpha=0.5, cmap='Reds')
        axes[2].set_title("Overlay")
        axes[2].axis('off')
        
        plt.tight_layout()
        plt.show()
        break

Computing Metrics

from laft.datasets import build_industrial_dataset
import torch

dataset = build_industrial_dataset(
    name="mvtec",
    category="carpet",
    split="test",
    root="./data",
)

# Analyze defect coverage
defect_coverages = []

for image, mask, label in dataset:
    if label:  # Only anomalies
        coverage = mask.float().mean().item()
        defect_coverages.append(coverage)

print(f"Average defect coverage: {torch.tensor(defect_coverages).mean():.2%}")
print(f"Max defect coverage: {torch.tensor(defect_coverages).max():.2%}")
print(f"Min defect coverage: {torch.tensor(defect_coverages).min():.2%}")
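
When masks share one size and can be stacked, the same statistics can be computed without a Python loop. The sketch below uses synthetic masks and labels in place of a real dataset; the tensor names are illustrative only.

```python
import torch

# Synthetic stand-ins: three 4x4 masks and their image-level labels.
masks = torch.zeros(3, 4, 4, dtype=torch.bool)
masks[1, :2, :2] = True  # 4 of 16 pixels
masks[2, :2, :] = True   # 8 of 16 pixels
labels = torch.tensor([False, True, True])

# Fraction of defect pixels per image, then keep anomalous images only.
coverage = masks.flatten(1).float().mean(dim=1)
anomalous = coverage[labels]

print(f"Average defect coverage: {anomalous.mean():.2%}")
print(f"Max defect coverage: {anomalous.max():.2%}")
print(f"Min defect coverage: {anomalous.min():.2%}")
```

Boolean indexing with labels plays the role of the `if label:` filter in the loop version.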

DataLoader Integration

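from torch.utils.data import DataLoader
from torchvision import transforms
from laft.datasets import build_industrial_dataset

dataset = build_industrial_dataset(
    name="mvtec",
    category="screw",
    split="test",
    root="./data",
    transform=transforms.ToTensor(),  # default collation cannot batch PIL images
)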

loader = DataLoader(
    dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4,
)

for images, masks, labels in loader:
    # images: [batch_size, C, H, W]
    # masks: [batch_size, H, W], dtype=torch.bool
    # labels: [batch_size], dtype=torch.bool
    
    num_anomalies = labels.sum().item()
    print(f"Batch has {num_anomalies} anomalies")
    break
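
The batch shapes noted in the comments can be reproduced without downloading any data. The class below is a hypothetical stand-in that returns the same (image, mask, label) triple from random tensors; it is not LAFT's dataset class.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyAnomalyDataset(Dataset):
    """Hypothetical stand-in that returns (image, mask, label) tensors."""

    def __init__(self, num_samples=8, size=32):
        # Odd indices are marked anomalous.
        self.labels = torch.arange(num_samples) % 2 == 1
        self.size = size

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        image = torch.rand(3, self.size, self.size)
        mask = torch.zeros(self.size, self.size, dtype=torch.bool)
        label = self.labels[index]
        if label:
            mask[:4, :4] = True  # fake 4x4 defect
        return image, mask, label


loader = DataLoader(ToyAnomalyDataset(), batch_size=4, shuffle=False)
images, masks, labels = next(iter(loader))

print(images.shape, masks.shape, labels.shape)
print(labels.sum().item(), "anomalies in the first batch")
```

Default collation stacks the per-sample tensors along a new first dimension, which only works when every image and mask in the batch has the same size.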

Why Industrial Datasets?

Industrial datasets are crucial for:

Pixel-level localization: Identify exact defect locations, not just image-level classification
Real-world manufacturing: Test on actual production scenarios
Diverse defect types: Each category has multiple anomaly types (cracks, scratches, contamination, etc.)
Benchmark standardization: Compare methods on widely-accepted datasets

Use Cases

Quality Control

Automatically detect manufacturing defects in production lines

Defect Localization

Identify precise locations of anomalies for repair or analysis

Zero-Shot Detection

Train on normal samples only, detect any deviation

Few-Shot Learning

Learn from limited anomaly examples with normal data

Dataset Comparison

Feature	MVTec AD	VisA
Categories	15	12
Image Resolution	700-1024 px per side	Varies
Train Samples/Category	60-391	450-905
Test Samples/Category	42-167	150-201
Defect Types	1-8 per category	3-8 per category
Organization	Directory-based	CSV metadata
Masks	Binary PNG	Binary PNG

Reference

API Summary

from laft.datasets import build_industrial_dataset

def build_industrial_dataset(
    name: Literal["mvtec", "visa"],
    category: str,  # See CATEGORIES for each dataset
    split: Literal["train", "test"],
    root: str = "./data",
    transform: Callable | None = None,
    mask_transform: Callable | None = None,
) -> IndustrialAnomalyDataset

Dataset Returns

image, mask, label = dataset[index]
# image: PIL.Image or transformed tensor
# mask: torch.Tensor of shape [H, W], dtype=torch.bool
# label: torch.Tensor of shape [], dtype=torch.bool (True=anomaly)

Base Class Reference

From laft/datasets/base.py:54-94:

class IndustrialAnomalyDataset(Dataset):
    labels: torch.Tensor  # [num_samples], bool
    
    def __getitem__(self, index: int):
        image = self.load_image(index)
        label = self.labels[index]
        
        if label.item():  # anomaly
            mask = self.load_mask(index)
        else:
            mask = torch.zeros((image.size[1], image.size[0]), 
                              dtype=torch.bool)
        
        if self.transform is not None:
            image = self.transform(image)
        if self.mask_transform is not None:
            mask = self.mask_transform(mask)
        
        return image, mask, label

Source Code

View the complete implementation in laft/datasets/mvtec.py and laft/datasets/visa.py
