Precision / recall / F1
Three scores per pair — precision (coverage of candidate in reference), recall (coverage of reference in candidate), F1 (harmonic mean).
Compute precision, recall, and F1 between candidate and reference texts using token-level matching.
You send two arrays — `candidates` and `references` — and the service compares them pairwise using contextual token embeddings. For each candidate / reference pair the response returns precision (how much of the candidate appears in the reference), recall (how much of the reference appears in the candidate), and F1 (harmonic mean). Synchronous. Note: the field names are `candidates` and `references`. The older `text_a` / `text_b` pairs are NOT accepted by the live upstream.
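To make the three scores concrete, here is a minimal sketch of BERTScore-style greedy matching over token vectors. It is an illustration of the general technique, not the service's confirmed implementation: the function names `cosine` and `greedy_scores` and the toy 2-D vectors are hypothetical, and a real contextual model would supply much higher-dimensional, non-zero embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_scores(cand_vecs, ref_vecs):
    """Score one candidate/reference pair from per-token vectors.

    Precision averages, over candidate tokens, the best match found in the
    reference. Recall does the same in the other direction. F1 is the
    harmonic mean of the two.
    """
    precision = sum(
        max(cosine(c, r) for r in ref_vecs) for c in cand_vecs
    ) / len(cand_vecs)
    recall = sum(
        max(cosine(r, c) for c in cand_vecs) for r in ref_vecs
    ) / len(ref_vecs)
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy vectors (made up for illustration, not produced by any model).
candidate_tokens = [[1.0, 0.0], [0.0, 1.0]]  # two candidate tokens
reference_tokens = [[1.0, 0.0]]              # one reference token
scores = greedy_scores(candidate_tokens, reference_tokens)
```

In this toy case only one of the two candidate tokens has a counterpart in the reference, so precision is lower than recall. The same asymmetry is what distinguishes the two scores in a real response.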
Compares texts at the contextual token embedding level for nuanced semantic comparison.
Send aligned `candidates` and `references` arrays; the response includes one results[] entry per pair.
Works on the languages supported by the underlying contextual model.
Find near-duplicate content with a similarity threshold on F1.
Check if two sentences convey the same meaning (high F1).
Compare a generation against a reference to measure semantic preservation (BERTScore-style).
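The near-duplicate use case comes down to filtering `results[]` by F1. The sketch below assumes you already have the parsed results list. The helper name `near_duplicates`, the sample scores, and the 0.9 threshold are all illustrative assumptions; this page does not recommend a specific cutoff, so tune it on your own data.

```python
def near_duplicates(results, threshold=0.9):
    """Return the indices of pairs whose F1 meets or exceeds the threshold."""
    return [i for i, r in enumerate(results) if r["f1"] >= threshold]


# Hypothetical scores shaped like the entries of results[].
sample_results = [
    {"precision": 0.96, "recall": 0.95, "f1": 0.955},
    {"precision": 0.71, "recall": 0.64, "f1": 0.673},
    {"precision": 0.93, "recall": 0.91, "f1": 0.92},
]

# Indices refer back to the aligned candidates/references arrays.
duplicate_indices = near_duplicates(sample_results, threshold=0.9)
```

Because results are aligned by index, each returned index identifies the (candidate, reference) pair that was flagged.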
Input
candidates: array of strings + references: array of strings (aligned by index)
Output
results[] — one {precision, recall, f1} per (candidate, reference) pair
Prerequisites
POST a single-element candidates / references pair. Response wraps the score triplet in results[].
{
  "candidates": ["The cat sat on the mat"],
  "references": ["A cat is on the rug"]
}
Response
{
  "status": "success",
  "data": {
    "results": [
      { "precision": 0.9589, "recall": 0.9589, "f1": 0.9589 }
    ]
  }
}
Field names are `candidates` and `references`, NOT `text_a` / `text_b`. Both arrays must align: results[i] corresponds to (candidates[i], references[i]).
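A minimal client sketch for the request and response shown above, using only the Python standard library. The base URL and the `Authorization: Bearer` header are assumptions, since this page documents neither the host nor the auth scheme. The helper names `build_payload`, `pair_results`, and `score` are likewise hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host (assumption)
ENDPOINT = "/v1/proxy/text-similarity"


def build_payload(candidates, references):
    """Build the request body, enforcing index alignment up front."""
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")
    return {"candidates": list(candidates), "references": list(references)}


def pair_results(payload, response):
    """Zip each results[i] back onto (candidates[i], references[i])."""
    results = response["data"]["results"]
    return list(zip(payload["candidates"], payload["references"], results))


def score(candidates, references, api_key):
    """POST the payload and return (candidate, reference, scores) tuples."""
    payload = build_payload(candidates, references)
    request = urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return pair_results(payload, json.load(resp))


# Offline check against the sample response from this page (no network call).
payload = build_payload(["The cat sat on the mat"], ["A cat is on the rug"])
sample_response = {
    "status": "success",
    "data": {
        "results": [{"precision": 0.9589, "recall": 0.9589, "f1": 0.9589}]
    },
}
paired = pair_results(payload, sample_response)
```

Checking the array lengths before sending catches misaligned input locally, before a request is made.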
Compare aligned candidate / reference text lists. Returns precision, recall, F1 per pair.
POST /v1/proxy/text-similarity
Billed per request.
| Service | Unit | Price |
|---|---|---|
| Text Similarity | item | $0.004/request |
A: Each of precision, recall, and F1 ranges from 0 to 1. Higher means more similar.
A: To compare one candidate against several references, replicate the candidate to match the length of the references list: candidates = [c, c, c], references = [r1, r2, r3]. The response returns three score triplets in the same order.
A: Text Embeddings compares whole-sentence vectors. Text Similarity compares at the token level, giving more nuanced precision/recall/F1 (BERTScore-style).
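The one-candidate-against-many pattern from the answers above can be sketched as follows. The helper names `one_to_many_payload` and `best_match` are hypothetical, and the scores are made-up values standing in for a real response.

```python
def one_to_many_payload(candidate, references):
    """Repeat the candidate so both arrays have the same length."""
    references = list(references)
    return {
        "candidates": [candidate] * len(references),
        "references": references,
    }


def best_match(references, results):
    """Return the reference with the highest F1, along with its scores."""
    best_index = max(range(len(results)), key=lambda i: results[i]["f1"])
    return references[best_index], results[best_index]


refs = ["A cat is on the rug", "Stock prices fell today", "The cat sat on a mat"]
payload = one_to_many_payload("The cat sat on the mat", refs)

# Made-up scores in the shape of results[], in the same order as refs.
fake_results = [
    {"precision": 0.91, "recall": 0.90, "f1": 0.905},
    {"precision": 0.62, "recall": 0.60, "f1": 0.61},
    {"precision": 0.98, "recall": 0.97, "f1": 0.975},
]
closest_reference, closest_scores = best_match(refs, fake_results)
```

Since results come back in input order, picking the best match only requires keeping the original references list on hand.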
1.2 (2026-04-29)
1.1 (2026-02-23)
1.0 (2026-01-26)