Text Similarity

Compute precision, recall, and F1 between candidate and reference texts using token-level matching.

You send two index-aligned arrays, `candidates` and `references`, and the service compares them pair by pair using contextual token embeddings. For each candidate / reference pair, the response returns precision (how much of the candidate is covered by the reference), recall (how much of the reference is covered by the candidate), and F1 (the harmonic mean of the two). The call is synchronous.

Note: the field names are `candidates` and `references`. The older `text_a` / `text_b` form is not accepted by the live upstream.

Tags: text, analysis, nlp

Overview

Features

Precision / recall / F1

Three scores per pair — precision (coverage of candidate in reference), recall (coverage of reference in candidate), F1 (harmonic mean).

Token-level matching

Compares texts at the contextual token embedding level for nuanced semantic comparison.

Pairwise list input

Send aligned `candidates` and `references` arrays; the response includes one results[] entry per pair.

Multilingual

Works on the languages supported by the underlying contextual model.

Use Cases

Duplicate detection

Find near-duplicate content with a similarity threshold on F1.

Paraphrase detection

Check if two sentences convey the same meaning (high F1).

Generation eval

Compare a generation against a reference to measure semantic preservation (BERTScore-style).

Input / Output

Input

candidates: array of strings + references: array of strings (aligned by index)

JSON body

Output

results[] — one {precision, recall, f1} per (candidate, reference) pair

JSON

Specs

Latency: ~0.3-1 s per pair
Async: false
Rate Limit: Per API key
Max Input: Model-dependent (typically a few hundred tokens per text)

Quickstart

Prerequisites

  • A CN8 Gateway API key with `text-similarity` in `allowed_services`

1. Compare two texts

text-similarity

POST a single-element candidates / references pair. Response wraps the score triplet in results[].

POST /v1/proxy/text-similarity
{
  "candidates": ["The cat sat on the mat"],
  "references": ["A cat is on the rug"]
}

Response

{
  "status": "success",
  "data": {
    "results": [
      { "precision": 0.9589, "recall": 0.9589, "f1": 0.9589 }
    ]
  }
}

Field names are `candidates` and `references` — NOT `text_a` / `text_b`. Both arrays must align: results[i] corresponds to (candidates[i], references[i]).
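The request and response shapes above can be exercised with a small standard-library client. This is a minimal sketch, not an official SDK: the host in `BASE_URL` and the bearer-token `Authorization` header are assumptions (neither is specified on this page), and `build_request` / `parse_scores` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: placeholder host. Replace with your actual gateway base URL.
BASE_URL = "https://gateway.example.com"


def build_request(candidates, references, api_key):
    """Build (but do not send) a POST request for /v1/proxy/text-similarity."""
    body = json.dumps(
        {"candidates": candidates, "references": references}
    ).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/v1/proxy/text-similarity",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth. Check your gateway's auth scheme.
            "Authorization": "Bearer " + api_key,
        },
    )


def parse_scores(response_body):
    """Return the list of {precision, recall, f1} dicts from a response body."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError("unexpected status: %r" % payload.get("status"))
    return payload["data"]["results"]
```

Sending the request is then `urllib.request.urlopen(build_request(...)).read()`, and the returned bytes or string can be passed to `parse_scores`.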

Text Similarity

POST · sync

Compare aligned candidate / reference text lists. Returns precision, recall, F1 per pair.

/v1/proxy/text-similarity

Pricing

Billed per request.

Service          Unit   Price
Text Similarity  item   $0.004/request

  • Cost is per request, not per pair, so batch as many comparisons as you can per call.
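To make the batching advice concrete, the arithmetic can be sketched as follows. The price comes from the table above; `pairs_per_request` is a hypothetical parameter, since this page does not state a maximum number of pairs per call, and larger batches will take longer given the per-pair latency.

```python
from decimal import Decimal

PRICE_PER_REQUEST = Decimal("0.004")  # USD per request, from the pricing table


def cost_usd(num_pairs, pairs_per_request):
    """Total cost when num_pairs comparisons are split into equal-size batches."""
    if num_pairs < 0 or pairs_per_request < 1:
        raise ValueError("num_pairs must be >= 0 and pairs_per_request >= 1")
    requests_needed = -(-num_pairs // pairs_per_request)  # ceiling division
    return requests_needed * PRICE_PER_REQUEST


# 100 pairs sent one per request vs. all in a single request.
one_by_one = cost_usd(100, 1)
single_batch = cost_usd(100, 100)
```

Under these assumptions, 100 single-pair requests cost 100 times as much as one 100-pair request.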

Guides & Tips

Important Notes (verified against the live upstream)

  • Request fields are `candidates` and `references`, both arrays of strings. The legacy `text_a` / `text_b` form is rejected with 400.
  • Arrays are aligned by index: `results[i]` corresponds to `(candidates[i], references[i])`.
  • Both arrays MUST have the same length.
  • Cost is per request, not per pair.
  • `f1` is the primary similarity metric. Use precision/recall when you care about directional coverage.
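Because every request is billed, it can help to catch the shape errors listed above on the client before sending. The following is a minimal sketch of such a check; `validate_payload` is a hypothetical helper, and the upstream may enforce additional rules not documented here.

```python
def validate_payload(payload):
    """Raise ValueError for request shapes the notes above describe as invalid."""
    if "text_a" in payload or "text_b" in payload:
        raise ValueError("legacy text_a / text_b fields are rejected; "
                         "use candidates / references")
    for field in ("candidates", "references"):
        value = payload.get(field)
        if not isinstance(value, list) or not all(isinstance(t, str) for t in value):
            raise ValueError(field + " must be an array of strings")
    if len(payload["candidates"]) != len(payload["references"]):
        raise ValueError("candidates and references must have the same length")
    return payload
```

Calling `validate_payload(body)` just before serialising the request body turns a billed 400 response into a local exception.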

How it works

  • Both texts in a pair are tokenised and run through a contextual model.
  • Each token in the candidate is matched to the most similar token in the reference (and vice versa) by cosine similarity on token embeddings.
  • Precision: average of the maximum similarities from candidate tokens to reference tokens. Recall: the same average from reference tokens to candidate tokens. F1: harmonic mean of precision and recall.
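The matching steps above can be sketched in a few lines of plain Python. This illustrates the general greedy-matching idea only: the toy 2-D vectors stand in for real contextual token embeddings, and the service's actual model, tokeniser, and any weighting or rescaling are not documented here. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(cand_vecs, ref_vecs):
    """Return (precision, recall, f1) from per-token best-match similarities."""
    # Precision: each candidate token takes its best match in the reference.
    precision = sum(
        max(cosine(c, r) for r in ref_vecs) for c in cand_vecs
    ) / len(cand_vecs)
    # Recall: each reference token takes its best match in the candidate.
    recall = sum(
        max(cosine(r, c) for c in cand_vecs) for r in ref_vecs
    ) / len(ref_vecs)
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


# Toy example: the candidate has one token the reference lacks.
candidate = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0]]
p, r, f1 = greedy_match_scores(candidate, reference)
```

In the toy example the extra candidate token lowers precision while recall stays at its maximum, which is the directional behaviour the definitions describe.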

When to use F1 vs precision/recall

  • F1 is balanced: use it for general "are these the same?" questions.
  • Precision: "is the candidate a subset of the reference?" (translation/paraphrase fidelity)
  • Recall: "does the candidate cover everything in the reference?" (summarization completeness)
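One way to apply this guidance is to pick the metric by the question being asked and then threshold it. The sketch below assumes the `results[]` shape shown in the Quickstart; the question labels, the helper names, and the default threshold of 0.9 are illustrative only and should be tuned on your own data.

```python
# Maps the question being asked to the score that answers it.
METRIC_FOR_QUESTION = {
    "same": "f1",             # general "are these the same?"
    "fidelity": "precision",  # is the candidate contained in the reference?
    "coverage": "recall",     # does the candidate cover the reference?
}


def flag_pairs(results, question="same", threshold=0.9):
    """Return indices of pairs whose chosen score meets the threshold."""
    metric = METRIC_FOR_QUESTION[question]
    return [i for i, scores in enumerate(results) if scores[metric] >= threshold]
```

For example, `flag_pairs(results, "coverage", 0.85)` would list the summaries that cover their references at or above an (assumed) 0.85 recall cut-off.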

FAQ

Q: What is the score range?

A: 0 to 1 for each of precision, recall, and F1. Higher means more similar.

Q: Can I compare 1 candidate to N references?

A: Replicate the candidate to match the references list length: candidates = [c, c, c], references = [r1, r2, r3]. The response returns three scores in order.
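The replication trick can be wrapped in two small helpers. This is a sketch with hypothetical function names; it only builds the request body and reads a `results[]` list of the shape shown in the Quickstart.

```python
def one_to_many_payload(candidate, references):
    """Repeat one candidate so it lines up with every reference by index."""
    references = list(references)
    return {
        "candidates": [candidate] * len(references),
        "references": references,
    }


def best_reference_index(results):
    """Index of the reference whose pair has the highest f1."""
    return max(range(len(results)), key=lambda i: results[i]["f1"])
```

Since billing is per request, comparing one candidate against N references this way still counts as a single request.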

Q: How is this different from Text Embeddings + cosine similarity?

A: Text Embeddings compares whole-sentence vectors. Text Similarity compares at the token level, giving more nuanced precision/recall/F1 (BERTScore-style).


Changelog

1.2 (2026-04-29)

  • Renamed request fields: `text_a` / `text_b` → `candidates` / `references` (arrays). Old form returns 400.
  • Documented the results[] wrapper and per-pair shape.
  • Documented per-request (not per-pair) pricing.

1.1 (2026-02-23)

  • Aligned with text_similarity prose (now superseded).

1.0 (2026-01-26)

  • Initial catalog.