TaobaoVDDataset
Dataset class for the TaobaoVD-GC video quality assessment dataset. Handles video loading, frame sampling, and preprocessing for both DOVER++ (640x640) and V-JEPA2 (384x384) models.
Parameters
Path to CSV file with labels
Directory containing video files
Number of frames to sample from each video
Target resolution for videos (640 for DOVER++, 384 for V-JEPA2)
Dataset mode: 'train', 'val', or 'test'
Optional video processor for V-JEPA2 model
Attributes
Column names for MOS scores: ['Traditional_MOS', 'Alignment_MOS', 'Aesthetic_MOS', 'Temporal_MOS', 'Overall_MOS']
Whether the dataset contains ground truth labels
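As an illustration of how the five MOS columns could map to a label vector, here is a minimal sketch using only the standard library. The helper name `row_to_labels`, the `video_name` column, and the sample CSV rows are hypothetical; the actual class may parse its CSV differently. Returning zeros when the MOS columns are absent mirrors the documented behavior for unlabeled data.

```python
import csv
import io

# Column names as listed in the attribute description above.
MOS_COLUMNS = [
    "Traditional_MOS",
    "Alignment_MOS",
    "Aesthetic_MOS",
    "Temporal_MOS",
    "Overall_MOS",
]


def row_to_labels(row):
    """Return five floats in MOS_COLUMNS order, or zeros if a column is missing."""
    if all(col in row for col in MOS_COLUMNS):
        return [float(row[col]) for col in MOS_COLUMNS]
    # Unlabeled rows fall back to a zero vector of the same length.
    return [0.0] * len(MOS_COLUMNS)


# Hypothetical CSV content, for illustration only.
labeled_csv = io.StringIO(
    "video_name,Traditional_MOS,Alignment_MOS,Aesthetic_MOS,Temporal_MOS,Overall_MOS\n"
    "clip_a.mp4,3.5,4.0,2.5,3.0,3.25\n"
)
unlabeled_csv = io.StringIO("video_name\nclip_b.mp4\n")

labeled_row = next(csv.DictReader(labeled_csv))
unlabeled_row = next(csv.DictReader(unlabeled_csv))

print(row_to_labels(labeled_row))
print(row_to_labels(unlabeled_row))
```

In a tensor-based pipeline the returned list would typically be wrapped in a tensor of shape (5,).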
Methods
__len__()
Returns the number of samples in the dataset.
Returns: int - Number of samples
__getitem__(idx)
Get a single sample from the dataset.
Parameters: idx (int): Sample index
Returns: Dict[str, Any] containing:
frames (torch.Tensor): Video frames with shape (C, T, H, W)
prompt (str): Text prompt for the video
video_name (str): Name of the video file
labels (torch.Tensor): MOS labels with shape (5,), or zeros in test mode
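This reference does not state how frames are selected from each video. One common strategy is uniform sampling, sketched below under that assumption: the clip is split into equal segments and the frame at the midpoint of each segment is taken. The function name `uniform_frame_indices` is hypothetical and not part of the documented API.

```python
def uniform_frame_indices(total_frames, num_frames):
    """Pick num_frames indices spread evenly over a clip of total_frames frames."""
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Midpoint of each of num_frames equal-length segments. When the clip is
    # shorter than num_frames, some indices repeat, so the output length is fixed.
    return [int((i + 0.5) * total_frames / num_frames) for i in range(num_frames)]


print(uniform_frame_indices(100, 4))
print(uniform_frame_indices(3, 8))
```

Every returned index stays below `total_frames`, so the result can be used directly to index decoded frames before they are arranged as (C, T, H, W).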
OptimizedGPUCollate
Optimized collate function for GPU processing with text encoding. Handles batching of video data and text encoding, optimizing for GPU memory usage and processing speed.
Parameters
Optional video processor for V-JEPA2
Text encoder for prompt processing
Device to place tensors on
Maximum number of frames per video
Methods
__call__(batch)
Collate a batch of samples.
Parameters: batch (List[Dict[str, Any]]): List of samples from the dataset
Returns: Dict[str, torch.Tensor] containing:
pixel_values_videos (torch.Tensor): Batched video frames (B, C, T, H, W)
text_emb (torch.Tensor): Text embeddings (B, D)
labels (torch.Tensor): Batched labels (B, 5)
video_names (List[str]): List of video names
prompts (List[str]): List of original prompts
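The regrouping step of a collate function (a list of per-sample dicts becomes one dict of per-field collections) can be sketched with plain Python lists. This is not the class's actual implementation: `regroup_batch` and `toy_encoder` are hypothetical names, the frames are placeholder strings, and a tensor-based version would additionally stack the `frames` and `labels` lists (for example with `torch.stack`) and move them to the target device.

```python
def regroup_batch(batch, encode_text):
    """Regroup a list of sample dicts into a dict of per-field lists."""
    prompts = [sample["prompt"] for sample in batch]
    return {
        "pixel_values_videos": [sample["frames"] for sample in batch],
        "text_emb": [encode_text(prompt) for prompt in prompts],
        "labels": [sample["labels"] for sample in batch],
        "video_names": [sample["video_name"] for sample in batch],
        "prompts": prompts,
    }


def toy_encoder(prompt):
    """Stand-in for a text encoder: character count and word count as floats."""
    return [float(len(prompt)), float(len(prompt.split()))]


# Placeholder samples using the keys documented for __getitem__.
samples = [
    {"frames": "frames_0", "prompt": "a red kettle", "video_name": "a.mp4", "labels": [0.0] * 5},
    {"frames": "frames_1", "prompt": "shoes", "video_name": "b.mp4", "labels": [0.0] * 5},
]

batch = regroup_batch(samples, toy_encoder)
print(batch["video_names"])
print(batch["text_emb"])
```

Encoding the prompts inside the collate step, as the description above indicates, means the model receives ready-made text embeddings alongside the video batch.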
create_data_loaders
Create train and validation data loaders.
Parameters
Path to training CSV file
Path to validation CSV file
Directory containing training videos
Directory containing validation videos
Batch size for data loading
Number of frames per video
Target resolution for videos
Optional video processor for V-JEPA2
Optional text encoder for prompt processing
Device for processing
Number of worker processes for data loading
Returns
Training data loader
Validation data loader
create_test_loader
Create a test data loader.
Parameters
Path to test CSV file
Directory containing test videos
Batch size for data loading
Number of frames per video
Target resolution for videos
Optional video processor for V-JEPA2
Optional text encoder for prompt processing
Device for processing
Returns
Test data loader with num_workers=0