Overview
The TikTok Auto Collection Sorter stores video embeddings and predictions in three main artifact files:
- labeled_embeddings.pt: Features and labels for training videos
- unlabeled_embeddings.pt: Features for unsorted videos awaiting classification
- predictions.json: Model predictions with confidence scores
Each .pt file is a Python dictionary of tensors and lists, saved with torch.save() and loaded with torch.load().
labeled_embeddings.pt
Generated by extract_features.py:198-205. Contains extracted features from videos already sorted into folders.
File Structure
PyTorch dictionary containing training data.
features: Multi-modal embeddings for each video
- Shape: (num_videos, 1024)
- Type: torch.FloatTensor
- Description: Each row is one video's concatenated embedding:
  - First 512 dimensions: CLIP visual features (averaged over N sampled frames)
  - Last 512 dimensions: CLIP text features (from Whisper transcription)
  - Both modalities are L2-normalized before concatenation
Example:
# Shape: (150, 1024) for 150 labeled videos
tensor([[0.123, -0.045, ..., 0.089],   # Video 1
        [0.456, 0.012, ..., -0.234],   # Video 2
        ...])
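The layout described above can be sketched as follows. This is a minimal illustration in plain Python, not the project's code: `build_embedding` and its helpers are hypothetical names, the toy vectors use 4 dimensions per modality instead of 512, and averaging the frames before normalizing is an assumption, since the description only says both modalities are normalized before concatenation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def build_embedding(frame_features, text_features):
    """Concatenate normalized visual and text features (hypothetical helper).

    Assumption: frame features are averaged first, then normalized.
    """
    visual = l2_normalize(mean_vector(frame_features))
    text = l2_normalize(text_features)
    return visual + text  # list concatenation: visual half first, text half last


# Toy data: 3 sampled frames and one transcript vector, 4-d each instead of 512-d.
frames = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 2.0, 0.0], [2.0, 0.0, 2.0, 0.0]]
transcript = [0.0, 3.0, 0.0, 4.0]

embedding = build_embedding(frames, transcript)
print(len(embedding))  # 8 here; 1024 with 512-d CLIP features
```

Because each half has unit norm, the concatenated row has norm √2 rather than 1, and the visual and text halves carry equal weight in any dot product between two rows.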
labels: Class indices for each video
- Shape: (num_videos,)
- Type: torch.LongTensor
- Description: Integer class labels corresponding to folder categories. Each index refers to a position in the label_names list.
Example:
# Shape: (150,)
tensor([0, 2, 1, 0, 3, ...])  # 0=art, 1=fitness, 2=funny, 3=music, etc.
label_names: Sorted list of folder category names
- Type: Python list of strings
- Description: Maps integer indices to human-readable folder names. Order determines the class index mapping.
Example:
["art", "fitness", "funny", "music", "soccer"]
# Index 0 = "art", Index 1 = "fitness", etc.
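To make the index mapping concrete, here is a small sketch of how sorted folder names translate to class indices and back. The folder list and the `name_to_index` variable are illustrative, and the sketch assumes the names are ordered with Python's default string sort.

```python
# Hypothetical folder names found under the videos directory, in arbitrary order.
folders = ["soccer", "art", "music", "fitness", "funny"]

# Sorting fixes the order, and the position in the sorted list is the class index.
label_names = sorted(folders)
name_to_index = {name: index for index, name in enumerate(label_names)}

# Decode a few class indices back into folder names.
labels = [0, 2, 1, 0, 3]
decoded = [label_names[i] for i in labels]
print(decoded)
```

Since the mapping depends on sort order, adding or renaming a folder can shift every later index, so stored labels should only be decoded with the label_names list saved in the same file.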
video_paths: File paths for each video
- Type: Python list of strings
- Description: Absolute paths to source video files. Order matches the rows of the features and labels tensors.
Example:
[
  "/home/user/data/Favorites/videos/soccer/video1.mp4",
  "/home/user/data/Favorites/videos/fitness/video2.mp4",
  ...
]
Loading Example
import torch
# Load labeled embeddings
data = torch.load("artifacts/labeled_embeddings.pt", weights_only=False)
# Access components
features = data["features"] # torch.Tensor (N, 1024)
labels = data["labels"] # torch.Tensor (N,)
label_names = data["label_names"] # list[str]
video_paths = data["video_paths"] # list[str]
print(f"Loaded {len(features)} videos")
print(f"Categories: {label_names}")
print(f"Feature shape: {features.shape}") # (150, 1024)
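After loading, it can be worth confirming that the parallel arrays line up before training on them. The helper below is a hypothetical sketch, not part of the project: it relies only on `len()`, iteration, and `int()`, so it should accept the loaded dictionary as well as the plain lists used here as stand-ins for tensors.

```python
from collections import Counter


def check_labeled(data):
    """Validate the parallel-array layout and return per-category video counts."""
    n = len(data["features"])
    if len(data["labels"]) != n or len(data["video_paths"]) != n:
        raise ValueError("features, labels, and video_paths must have equal length")
    num_classes = len(data["label_names"])
    counts = Counter()
    for label in data["labels"]:
        index = int(label)  # accepts Python ints and 0-d tensors alike
        if not 0 <= index < num_classes:
            raise ValueError(f"label {index} is outside 0..{num_classes - 1}")
        counts[data["label_names"][index]] += 1
    return dict(counts)


# Stand-in data with the same keys as labeled_embeddings.pt (lists instead of tensors).
sample = {
    "features": [[0.1] * 4, [0.2] * 4, [0.3] * 4],
    "labels": [0, 2, 0],
    "label_names": ["art", "fitness", "funny"],
    "video_paths": ["a.mp4", "b.mp4", "c.mp4"],
}
category_counts = check_labeled(sample)
print(category_counts)
```

The returned counts also show at a glance which categories have few training examples.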
unlabeled_embeddings.pt
Generated by extract_features.py:235-241. Contains extracted features from videos in the root folder (not yet sorted).
File Structure
PyTorch dictionary containing features for unsorted videos.
features: Multi-modal embeddings for each unsorted video
- Shape: (num_videos, 1024)
- Type: torch.FloatTensor
- Description: Same structure as the labeled_embeddings.pt features (512-d visual + 512-d text, each L2-normalized)
Example:
# Shape: (50, 1024) for 50 unsorted videos
tensor([[0.234, -0.123, ..., 0.456],
        [0.678, 0.234, ..., -0.123],
        ...])
video_paths: File paths for each unsorted video
- Type: Python list of strings
- Description: Absolute paths to unsorted video files in the root videos folder. Order matches the rows of the features tensor.
Example:
[
  "/home/user/data/Favorites/videos/unsorted_video1.mp4",
  "/home/user/data/Favorites/videos/unsorted_video2.mp4",
  ...
]
Note: No labels or label_names keys since these videos are unlabeled.
Loading Example
import torch
# Load unlabeled embeddings
data = torch.load("artifacts/unlabeled_embeddings.pt", weights_only=False)
features = data["features"] # torch.Tensor (M, 1024)
video_paths = data["video_paths"] # list[str]
print(f"Found {len(features)} unsorted videos to classify")
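Since the two embedding files differ only in whether the label keys are present, code that accepts either file can branch on the keys. The function below is a hypothetical helper sketched for illustration, and it works on any dictionary with the documented key names.

```python
def artifact_kind(data):
    """Classify a loaded embeddings dictionary by the keys it contains."""
    keys = set(data)
    if not {"features", "video_paths"} <= keys:
        raise ValueError("missing required key: features or video_paths")
    if {"labels", "label_names"} <= keys:
        return "labeled"
    return "unlabeled"


# Stand-in dictionaries with the documented keys (values shortened for the demo).
labeled = {"features": [], "labels": [], "label_names": [], "video_paths": []}
unlabeled = {"features": [], "video_paths": []}

print(artifact_kind(labeled))
print(artifact_kind(unlabeled))
```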
predictions.json
Generated by predict.py:158-173. Contains model predictions for all unsorted videos.
File Structure
JSON array of prediction objects, one per unsorted video.
Each object in the array has the following fields:
video: Filename of the video (basename only, not full path)
- Example: "unsorted_video1.mp4"
predicted_folder: Top prediction, the folder category with the highest confidence
- Example: "soccer"
confidence: Confidence score for the top prediction (0.0 to 1.0)
- Example: 0.87 (87% confidence)
top_predictions: Ranked list of top-k predictions with confidence scores. Each entry is an object with:
- folder: Folder category name
- confidence: Confidence score (0.0 to 1.0)
Example:
[
  {"folder": "soccer", "confidence": 0.87},
  {"folder": "fitness", "confidence": 0.09},
  {"folder": "funny", "confidence": 0.04}
]
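The fields above can be assembled from a vector of per-class scores roughly as follows. This is a sketch under assumptions rather than the code in predict.py: `build_prediction` is a hypothetical name, `k=3` is an illustrative default, and the scores are assumed to already be probabilities aligned with label_names.

```python
import os


def build_prediction(video_path, probabilities, label_names, k=3):
    """Build one prediction object from per-class probabilities (hypothetical helper)."""
    # Pair each category with its score and rank from most to least confident.
    ranked = sorted(
        zip(label_names, probabilities), key=lambda pair: pair[1], reverse=True
    )
    top = [{"folder": name, "confidence": float(score)} for name, score in ranked[:k]]
    return {
        "video": os.path.basename(video_path),  # basename only, not the full path
        "predicted_folder": top[0]["folder"],
        "confidence": top[0]["confidence"],
        "top_predictions": top,
    }


label_names = ["art", "fitness", "funny", "music", "soccer"]
probabilities = [0.0, 0.09, 0.04, 0.0, 0.87]
prediction = build_prediction(
    "/home/user/data/Favorites/videos/unsorted_video1.mp4", probabilities, label_names
)
print(prediction["video"], prediction["predicted_folder"])
```

In this layout the top-level predicted_folder and confidence simply repeat the first entry of top_predictions, which keeps the common case a single key lookup.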
Complete Example
[
  {
    "video": "tiktok_12345.mp4",
    "predicted_folder": "soccer",
    "confidence": 0.87,
    "top_predictions": [
      {"folder": "soccer", "confidence": 0.87},
      {"folder": "fitness", "confidence": 0.09},
      {"folder": "funny", "confidence": 0.04}
    ]
  },
  {
    "video": "tiktok_67890.mp4",
    "predicted_folder": "music",
    "confidence": 0.62,
    "top_predictions": [
      {"folder": "music", "confidence": 0.62},
      {"folder": "art", "confidence": 0.31},
      {"folder": "funny", "confidence": 0.07}
    ]
  }
]
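A consumer of this file may want to check each entry against the structure shown above before acting on it. The validator below is a hypothetical sketch; the rule that predicted_folder must match the first ranked entry is inferred from the examples, not stated by the project.

```python
import json


def validate_prediction(entry):
    """Return a list of problems found in one prediction object (empty if none)."""
    problems = []
    for key in ("video", "predicted_folder", "confidence", "top_predictions"):
        if key not in entry:
            problems.append(f"missing field: {key}")
    if problems:
        return problems
    ranked = entry["top_predictions"]
    scores = [item["confidence"] for item in ranked]
    if any(not 0.0 <= score <= 1.0 for score in scores):
        problems.append("confidence outside [0, 1]")
    if scores != sorted(scores, reverse=True):
        problems.append("top_predictions is not ranked by confidence")
    if not ranked or ranked[0]["folder"] != entry["predicted_folder"]:
        problems.append("predicted_folder does not match the first ranked entry")
    return problems


good = json.loads(
    '{"video": "tiktok_12345.mp4", "predicted_folder": "soccer", "confidence": 0.87,'
    ' "top_predictions": [{"folder": "soccer", "confidence": 0.87},'
    ' {"folder": "fitness", "confidence": 0.09}]}'
)
bad = {"video": "tiktok_67890.mp4", "confidence": 0.62}

print(validate_prediction(good))
print(validate_prediction(bad))
```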
Loading Example
import json
# Load predictions
with open("artifacts/predictions.json") as f:
    predictions = json.load(f)

# Print one summary line per video
for pred in predictions:
    video = pred["video"]
    folder = pred["predicted_folder"]
    conf = pred["confidence"]
    print(f"{video} → {folder} ({conf:.0%})")
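One common use of the confidence field is to act only on predictions above a cutoff and leave the rest for manual review. The sketch below assumes a hypothetical threshold of 0.8, which is not a value taken from the project, and uses the two entries from the Complete Example as input.

```python
def split_by_confidence(predictions, threshold=0.8):
    """Separate predictions into confident ones and ones needing manual review."""
    confident = [p for p in predictions if p["confidence"] >= threshold]
    uncertain = [p for p in predictions if p["confidence"] < threshold]
    return confident, uncertain


predictions = [
    {"video": "tiktok_12345.mp4", "predicted_folder": "soccer", "confidence": 0.87},
    {"video": "tiktok_67890.mp4", "predicted_folder": "music", "confidence": 0.62},
]

confident, uncertain = split_by_confidence(predictions)
for pred in confident:
    print(f"auto-sort {pred['video']} into {pred['predicted_folder']}")
for pred in uncertain:
    print(f"review {pred['video']} manually")
```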
model_config.json
Generated by train.py:220-225. Stores model metadata and configuration.
File Structure
JSON object containing model configuration and metadata, with the following fields:
- model_type: Type of trained model: "mlp", "knn", or "logreg"
- input_dim: Feature dimension (1024 for CLIP visual + text). Only present for MLP models
- num_classes: Number of output classes (folder categories). Only present for MLP models
- hidden_dim: Hidden layer size. Only present for MLP models. Default: 256
- label_names: Sorted list of category names matching class indices
- feature_dim: Input feature dimension (same as input_dim)
- best_cv_accuracy: Best cross-validation accuracy achieved during training
Example
{
  "model_type": "mlp",
  "input_dim": 1024,
  "num_classes": 5,
  "hidden_dim": 256,
  "label_names": ["art", "fitness", "funny", "music", "soccer"],
  "feature_dim": 1024,
  "best_cv_accuracy": 0.89
}
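Because three of the fields exist only for MLP models, code reading this file should not assume they are present. The checker below is a hypothetical sketch of that handling; the cross-field rules (num_classes equal to the number of label names, input_dim equal to feature_dim) are inferred from the field descriptions rather than enforced by the project as far as this page states.

```python
MLP_ONLY_FIELDS = ("input_dim", "num_classes", "hidden_dim")


def check_config(config):
    """Return a list of inconsistencies found in a model_config dictionary."""
    problems = []
    model_type = config.get("model_type")
    if model_type not in {"mlp", "knn", "logreg"}:
        problems.append(f"unknown model_type: {model_type!r}")
    if model_type == "mlp":
        for key in MLP_ONLY_FIELDS:
            if key not in config:
                problems.append(f"missing MLP field: {key}")
        if "num_classes" in config and config["num_classes"] != len(config["label_names"]):
            problems.append("num_classes does not match len(label_names)")
        if "input_dim" in config and config["input_dim"] != config["feature_dim"]:
            problems.append("input_dim does not match feature_dim")
    return problems


mlp_config = {
    "model_type": "mlp",
    "input_dim": 1024,
    "num_classes": 5,
    "hidden_dim": 256,
    "label_names": ["art", "fitness", "funny", "music", "soccer"],
    "feature_dim": 1024,
    "best_cv_accuracy": 0.89,
}
# Non-MLP models omit the MLP-only fields, so the checker skips them.
knn_config = {
    "model_type": "knn",
    "label_names": ["art", "fitness", "funny", "music", "soccer"],
    "feature_dim": 1024,
    "best_cv_accuracy": 0.89,
}

print(check_config(mlp_config))
print(check_config(knn_config))
```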
Data Flow Summary
- Feature Extraction (extract_features.py)
  - Input: Raw .mp4 video files
  - Output: labeled_embeddings.pt + unlabeled_embeddings.pt
- Model Training (train.py)
  - Input: labeled_embeddings.pt
  - Output: model.pt + model_config.json
- Prediction (predict.py)
  - Input: unlabeled_embeddings.pt + model.pt + model_config.json
  - Output: predictions.json
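The dependency chain above implies a simple way to tell which stage still needs to run: look for the first stage whose outputs are missing. The sketch below is illustrative only; `next_stage` is a hypothetical helper, and it assumes all artifacts live in a single directory, which this page confirms only for the embeddings and predictions files.

```python
import tempfile
from pathlib import Path

# Each stage's script and the artifact files it is documented to produce.
STAGES = [
    ("extract_features.py", ["labeled_embeddings.pt", "unlabeled_embeddings.pt"]),
    ("train.py", ["model.pt", "model_config.json"]),
    ("predict.py", ["predictions.json"]),
]


def next_stage(artifacts_dir):
    """Return the first script whose outputs are missing, or None if all exist."""
    root = Path(artifacts_dir)
    for script, outputs in STAGES:
        if not all((root / name).exists() for name in outputs):
            return script
    return None


# Demo in a temporary directory: pretend only feature extraction has run.
with tempfile.TemporaryDirectory() as tmp:
    stage_when_empty = next_stage(tmp)
    for name in ("labeled_embeddings.pt", "unlabeled_embeddings.pt"):
        (Path(tmp) / name).touch()
    stage_after_extraction = next_stage(tmp)

print(stage_when_empty)
print(stage_after_extraction)
```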