The laft module provides essential functions for vector transformations, similarity computations, and feature space manipulations commonly used in representation learning.

inner

Projects features onto a vector or subspace defined by one or more vectors.
def inner(
    features: Tensor,  # [batch_size, feature_size]
    vectors: Tensor,   # [feature_size] or [num_vectors, feature_size]
    *,
    basis: bool = True,
) -> Tensor:           # [batch_size, feature_size]
features
Tensor
Input features with shape [batch_size, feature_size] to be projected.
vectors
Tensor
Vector(s) defining the projection subspace. Can be:
  • 1D tensor of shape [feature_size] for single vector projection
  • 2D tensor of shape [num_vectors, feature_size] for subspace projection
basis
bool
default:"True"
If True, treats vectors as an orthonormal basis. If False, computes the orthonormal basis via SVD.
projection
Tensor
Projected features with shape [batch_size, feature_size].

Usage

Project features onto a concept vector:
import torch
from laft import inner

# Project features onto a single direction
features = torch.randn(32, 512)  # 32 samples, 512-dim features
concept_vector = torch.randn(512)  # concept direction
projected = inner(features, concept_vector)

# Project onto a subspace defined by multiple vectors
concept_vectors = torch.randn(10, 512)  # 10 concept vectors
subspace_projection = inner(features, concept_vectors)
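
Mathematically, projection onto a subspace reduces to a matrix product with an orthonormal basis: for basis rows B, the projection of x is x Bᵀ B. A minimal pure-PyTorch sketch of this idea (the helper name `project_onto_subspace` is illustrative, not laft's actual implementation):

```python
import torch

def project_onto_subspace(features, vectors, basis=True):
    # Promote a single direction [d] to a one-row matrix [1, d],
    # normalized to unit length.
    if vectors.dim() == 1:
        vectors = vectors.unsqueeze(0) / vectors.norm()
    elif not basis:
        # Orthonormalize: the right singular vectors of a [k, d]
        # matrix span the same row space as the original rows.
        _, _, vh = torch.linalg.svd(vectors, full_matrices=False)
        vectors = vh
    # For an orthonormal basis B (rows), the projection is x @ B.T @ B.
    return features @ vectors.T @ vectors
```

Projecting onto the full standard basis returns the input unchanged, which is a quick way to sanity-check the formula.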

orthogonal

Removes the component of features along the specified vector(s), projecting features onto the orthogonal complement of their span.
def orthogonal(
    features: Tensor,  # [batch_size, feature_size]
    vectors: Tensor,   # [feature_size] or [num_vectors, feature_size]
    *,
    normalize: bool = False,
    basis: bool = True,
) -> Tensor:           # [batch_size, feature_size]
features
Tensor
Input features with shape [batch_size, feature_size].
vectors
Tensor
Vector(s) to remove from features. Shape [feature_size] or [num_vectors, feature_size].
normalize
bool
default:"False"
If True, normalizes the resulting orthogonal projection to unit length.
basis
bool
default:"True"
If True, treats vectors as an orthonormal basis. If False, computes the orthonormal basis via SVD.
orthogonal_features
Tensor
Features with the specified vector component(s) removed, shape [batch_size, feature_size].

Usage

Remove unwanted concepts from feature representations:
import torch
from laft import orthogonal

# Remove bias direction from features
features = torch.randn(32, 512)
bias_vector = torch.randn(512)
debiased = orthogonal(features, bias_vector)

# Remove multiple unwanted directions and normalize
unwanted_directions = torch.randn(5, 512)
cleaned = orthogonal(features, unwanted_directions, normalize=True)
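
Conceptually, `orthogonal` is the complement of `inner`: subtract the projection along the unwanted direction and keep the residual. A hedged single-vector sketch (the name `remove_component` is hypothetical):

```python
import torch
import torch.nn.functional as F

def remove_component(features, vector, normalize=False):
    u = vector / vector.norm()  # unit direction to remove
    # Subtract each row's projection onto u; the residual is
    # orthogonal to u by construction.
    residual = features - torch.outer(features @ u, u)
    if normalize:
        residual = F.normalize(residual, dim=-1)
    return residual
```

After removal, every row has (numerically) zero dot product with the removed direction.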

cosine_similarity

Computes pairwise cosine similarity between two sets of vectors.
def cosine_similarity(
    x1: Tensor,
    x2: Tensor | None = None,
    eps: float = 1e-8
) -> Tensor
x1
Tensor
First set of vectors. Shape [batch_size, feature_size] for 2D or [batch_size, seq_len, feature_size] for 3D.
x2
Tensor | None
default:"None"
Second set of vectors. If None, computes similarity of x1 with itself. Must have same dimensionality as x1.
eps
float
default:"1e-8"
Small epsilon value to prevent division by zero during normalization.
similarity_matrix
Tensor
Cosine similarity matrix. Shape depends on input dimensions:
  • 2D inputs: [batch_size_x1, batch_size_x2]
  • 3D inputs: [batch_size, seq_len_x1, seq_len_x2]

Usage

import torch
from laft import cosine_similarity

# Compute pairwise similarities between two sets
embeddings1 = torch.randn(32, 512)
embeddings2 = torch.randn(64, 512)
similarity = cosine_similarity(embeddings1, embeddings2)  # [32, 64]

# Compute self-similarity matrix
self_sim = cosine_similarity(embeddings1)  # [32, 32]
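
Pairwise cosine similarity amounts to row-normalizing both inputs and taking a matrix product. A compact sketch of that computation (an illustration, not laft's source):

```python
import torch
import torch.nn.functional as F

def pairwise_cosine(x1, x2=None, eps=1e-8):
    x2 = x1 if x2 is None else x2
    # Normalize rows, then one matmul yields all pairwise dot products.
    a = F.normalize(x1, dim=-1, eps=eps)
    b = F.normalize(x2, dim=-1, eps=eps)
    return a @ b.transpose(-2, -1)  # handles 2D and batched 3D inputs
```

The diagonal of a self-similarity matrix is (numerically) all ones, since every vector has similarity 1 with itself.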

cosine_distance

Computes cosine distance (1 - cosine similarity) between vectors.
def cosine_distance(
    x1: Tensor,
    x2: Tensor | None = None,
    eps: float = 1e-8
) -> Tensor
x1
Tensor
First set of vectors.
x2
Tensor | None
default:"None"
Second set of vectors. If None, uses x1.
eps
float
default:"1e-8"
Small epsilon value for numerical stability.
distance_matrix
Tensor
Cosine distance matrix with same shape as cosine_similarity output.

Usage

import torch
from laft import cosine_distance

embeddings1 = torch.randn(32, 512)
embeddings2 = torch.randn(64, 512)
distance = cosine_distance(embeddings1, embeddings2)  # [32, 64]
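
Because the distance is one minus the similarity, a self-distance matrix has a (near-)zero diagonal and all values fall in [0, 2]. A quick sanity check of this relationship in plain PyTorch:

```python
import torch
import torch.nn.functional as F

emb = torch.randn(8, 16)
n = F.normalize(emb, dim=-1)
dist = 1.0 - n @ n.T
# Each vector is at distance ~0 from itself; antipodal vectors are at 2.
```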

knn

Computes k-nearest neighbor anomaly scores based on cosine distance.
def knn(
    train_features: Tensor,
    test_features: Tensor,
    *,
    n_neighbors: int = 30,
) -> Tensor
train_features
Tensor
Training set features with shape [num_train, feature_size].
test_features
Tensor
Test set features with shape [num_test, feature_size].
n_neighbors
int
default:"30"
Number of nearest neighbors to consider. Automatically capped at num_train if larger.
scores
Tensor
Anomaly scores for test samples, shape [num_test]. Lower scores indicate samples more similar to training data.

Usage

Detect anomalies using k-nearest neighbors:
import torch
from laft import knn

# Train on normal samples
train_features = torch.randn(1000, 512)  # normal samples
test_features = torch.randn(100, 512)    # test samples

# Compute anomaly scores (lower = more normal)
scores = knn(train_features, test_features, n_neighbors=30)
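
A common scoring rule for this kind of detector is the mean cosine distance to the k nearest training samples; whether laft averages the k distances or takes only the k-th is an assumption in the sketch below (the name `knn_scores` is illustrative):

```python
import torch
import torch.nn.functional as F

def knn_scores(train_features, test_features, n_neighbors=30):
    k = min(n_neighbors, train_features.shape[0])  # cap at num_train
    # Cosine-distance matrix between every test and train sample.
    d = 1.0 - (F.normalize(test_features, dim=-1)
               @ F.normalize(train_features, dim=-1).T)
    # Mean distance to the k closest training samples, per test row.
    nearest, _ = torch.topk(d, k, dim=1, largest=False)
    return nearest.mean(dim=1)
```

A test sample identical to a training sample scores near zero, the "most normal" end of the scale.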

pca

Computes principal components of vectors using randomized SVD.
def pca(
    vectors: Tensor,
    n_components: int | None = None,
    *,
    center: bool = False,
    niter: int = 5,
) -> Tensor
vectors
Tensor
Input vectors with shape [num_samples, feature_size].
n_components
int | None
default:"None"
Number of principal components to compute. If None, computes min(num_samples, feature_size) components.
center
bool
default:"False"
If True, centers the data by subtracting the mean before computing PCA.
niter
int
default:"5"
Number of iterations for the randomized SVD algorithm.
components
Tensor
Principal component vectors with shape [n_components, feature_size].

Usage

import torch
from laft import pca

# Extract top principal components
features = torch.randn(1000, 512)
components = pca(features, n_components=50, center=True)

# Project data onto principal components
projected = features @ components.T  # [1000, 50]
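
The same decomposition can be sketched with PyTorch's built-in randomized SVD, `torch.pca_lowrank`, which returns `V` of shape `[feature_size, q]`; transposing it matches the `[n_components, feature_size]` layout documented above (this is an illustration, not necessarily laft's implementation):

```python
import torch

x = torch.randn(200, 64)
# Randomized SVD of the (optionally centered) data matrix.
u, s, v = torch.pca_lowrank(x, q=8, center=True, niter=5)
components = v.T  # [8, 64]; each row is a principal direction
```

The rows of `components` are orthonormal, so `components @ components.T` is (approximately) the identity.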

align_vectors

Aligns vectors to have consistent sign relative to a reference vector.
def align_vectors(
    vectors: torch.Tensor,
    reference_idx: int = 0
) -> Tensor
vectors
Tensor
Vectors to align, shape [num_vectors, feature_size].
reference_idx
int
default:"0"
Index of the vector to use as reference for alignment.
aligned
Tensor
Aligned vectors with shape [num_vectors, feature_size]. Vectors are flipped if they have negative cosine similarity with the reference.

Usage

Ensure consistent direction across related vectors:
import torch
from laft import align_vectors

# Align concept vectors to point in similar directions
concept_vectors = torch.randn(10, 512)
aligned = align_vectors(concept_vectors, reference_idx=0)
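
Sign alignment is a one-liner in spirit: flip any row whose dot product with the reference is negative. A hedged sketch (the name `align_signs` is hypothetical):

```python
import torch

def align_signs(vectors, reference_idx=0):
    ref = vectors[reference_idx]
    # Flip rows whose dot product with the reference is negative.
    signs = torch.sign(vectors @ ref)
    signs[signs == 0] = 1.0  # leave exactly-orthogonal rows untouched
    return vectors * signs.unsqueeze(1)
```

After alignment every row has a non-negative dot product with the reference, and the reference row itself is unchanged.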

prompt_pair

Computes pairwise differences between prompt embeddings, useful for analyzing prompt variations.
def prompt_pair(*prompts_list: Tensor) -> Tensor
prompts_list
*Tensor
One or more prompt embedding tensors:
  • Single tensor: Computes all pairwise differences within one set of prompts [num_prompts, feature_size]
  • Multiple tensors: Computes pairwise differences between different sets of prompts
pairwise_diff
Tensor
Aligned pairwise differences between prompts. Shape [num_pairs, feature_size] where:
  • Single input: num_pairs = num_prompts * (num_prompts - 1) / 2
  • Multiple inputs: num_pairs = sum of all cross-combinations

Usage

Analyze semantic differences between prompts:
import torch
from laft import prompt_pair

# Compare variations of a single prompt
prompts = torch.randn(5, 512)  # 5 prompt variations
differences = prompt_pair(prompts)  # [10, 512] (5 choose 2)

# Compare prompts across different categories
category_a = torch.randn(3, 512)
category_b = torch.randn(4, 512)
cross_diffs = prompt_pair(category_a, category_b)  # [12, 512] (3 * 4)
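
For the single-input case, the C(n, 2) count comes from enumerating all i < j index pairs. A minimal sketch of that enumeration, ignoring the sign alignment the return description mentions (the name `within_set_diffs` is illustrative):

```python
import torch

def within_set_diffs(prompts):
    n = prompts.shape[0]
    # Upper-triangular indices give every i < j pair: C(n, 2) columns.
    i, j = torch.triu_indices(n, n, offset=1)
    return prompts[i] - prompts[j]
```

For 5 prompts this yields 10 difference vectors, matching the 5-choose-2 count in the example above.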
