BERTScore

BERTScore expects pre-computed token embeddings — for example, the contextual vectors produced by a BERT model for each token in a sentence. You must extract these embeddings yourself before calling this function; reval does not perform tokenisation or model inference.

Internally, the function computes a greedy alignment using dot-product similarity: each token in the candidate is matched to its most similar token in the reference (and vice versa), and the scores are averaged to produce precision, recall, and F1.

For best results, L2-normalise your embeddings with Normalize before passing them in, so that the dot product equals cosine similarity.
A sequence of token embedding vectors for the candidate text. Each inner slice is the embedding for one token. All vectors should have the same dimensionality.
A sequence of token embedding vectors for the reference text. Each inner slice is the embedding for one token. All vectors should have the same dimensionality as those in candidates.

Returns three float64 values:
Precision — average maximum similarity from each candidate token to its most similar reference token.
Recall — average maximum similarity from each reference token to its most similar candidate token.
F1 — harmonic mean of precision and recall.
Example
DotProduct
Computes the dot product of two vectors, as used by BERTScore for greedy token alignment. When the vectors are L2-normalised, the dot product equals cosine similarity. Returns 0 if the vectors have different lengths.
The first vector.
The second vector. Must have the same length as a; otherwise 0 is returned.

Returns float64 — the sum of element-wise products.
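A minimal sketch of the behaviour described above, including the zero result on a length mismatch. The lowercase name is illustrative; reval's exported function is DotProduct.

```go
package main

import "fmt"

// dotProduct sums the element-wise products of a and b,
// returning 0 when the lengths differ.
func dotProduct(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var s float64
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

func main() {
	fmt.Println(dotProduct([]float64{1, 2, 3}, []float64{4, 5, 6})) // prints: 32
	fmt.Println(dotProduct([]float64{1, 2}, []float64{1, 2, 3}))   // prints: 0 (length mismatch)
}
```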
L2Norm
Computes the L2 norm of a — the square root of the sum of squared elements. This is the magnitude used to normalise vectors before computing cosine similarity.
The input vector.
float64 — the Euclidean length of the vector.
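A short illustrative re-implementation of the norm described above (square root of the sum of squares); reval's exported function is L2Norm.

```go
package main

import (
	"fmt"
	"math"
)

// l2Norm returns the Euclidean length of a.
func l2Norm(a []float64) float64 {
	var ss float64
	for _, v := range a {
		ss += v * v // accumulate squared elements
	}
	return math.Sqrt(ss)
}

func main() {
	fmt.Println(l2Norm([]float64{3, 4})) // prints: 5
}
```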
Normalize
Returns an L2-normalised copy of a — that is, a divided by its L2 norm so that the result has unit length. If the norm is zero (the zero vector), the original slice is returned unchanged.
The input vector to normalise.
[]float64 — a new slice with the same direction as a and a magnitude of 1.0.