Language Detection

Detect the language of text with confidence scores, supporting 75+ languages.

You send a `texts` array and the service returns one detection per text inside `data.results`. Each detection has an ISO 639-1 code (`en`, `tr`, `es`, …), a full `language_name`, and a `confidence` score. The call is synchronous. Note: detections are wrapped in a `results[]` array, and the live upstream does not return alternative-language candidates.

text · analysis · nlp

Overview

Features

75+ languages

Detects major and many minor languages, returning standard ISO 639-1 codes.

Confidence per detection

Each detection includes a confidence score (0-1).

Batch input

`texts` is an array; the response holds one detection per input string in `results[]`.

Fast

Lightweight detection; typically sub-second response.

Use Cases

Content routing

Route text to the right translator, locale, or language-specific NLP pipeline.

Analytics

Measure language distribution in user-generated content.

Pre-processing

Detect language before feeding text into language-specific services (sentiment, NER, etc.).
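The routing and analytics use cases above can be sketched in a few lines of Python. This is a minimal illustration, not part of the service: the function names, the handler mapping, and the 0.5 confidence cutoff are all assumptions chosen for the example. It only relies on the `language` and `confidence` fields that each detection carries.

```python
from collections import Counter

# Assumption: 0.5 is an arbitrary cutoff picked for illustration only.
MIN_CONFIDENCE = 0.5


def route_by_language(detection, handlers, default, min_confidence=MIN_CONFIDENCE):
    """Return the handler for a detection, or `default` if unsure or unmapped."""
    if detection["confidence"] < min_confidence:
        # Low-confidence detections go to the fallback instead of a
        # language-specific pipeline.
        return default
    return handlers.get(detection["language"], default)


def language_distribution(detections):
    """Count how often each ISO 639-1 code appears in a list of detections."""
    return Counter(d["language"] for d in detections)
```

Here `handlers` would map ISO 639-1 codes to whatever your application uses downstream (a translator, a locale, a sentiment pipeline), and `language_distribution` takes the entries of `data.results` collected over many requests.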

Input / Output

Input

texts: array of strings (JSON body)

Output

results[] with language, language_name, confidence per input (JSON)

Specs

Latency: ~0.1-0.5 s
Async: false
Rate Limit: Per API key
Max Input: No hard limit; very short strings (<10 chars) may reduce accuracy

Quickstart

Prerequisites

  • A CN8 Gateway API key with `text-language` in `allowed_services`

1. Detect language

text-language

POST a texts array. Response contains one results[] entry per input string.

POST /v1/proxy/text-language
{
  "texts": ["Bonjour, comment allez-vous?"]
}

Response

{
  "status": "success",
  "data": {
    "results": [
      { "language": "fr", "language_name": "French", "confidence": 0.5357 }
    ]
  }
}

The request field is `texts` (plural). Detections are wrapped in `results[]`.
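The request and response above might be handled from Python roughly as follows. This is a sketch under stated assumptions: the base URL is a placeholder, the `Authorization: Bearer` header is a guess at how the API key is passed, and the helper names are hypothetical. Only the endpoint path, the `texts` field, and the `status` / `data.results` response shape come from this page.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute your gateway's actual base URL.
BASE_URL = "https://gateway.example.com"
ENDPOINT = "/v1/proxy/text-language"


def build_request(texts, api_key):
    """Build a POST request carrying the `texts` array as a JSON body."""
    if isinstance(texts, str):
        # The legacy single-string form is rejected, so wrap it in a list.
        texts = [texts]
    body = json.dumps({"texts": list(texts)}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: Bearer-token auth; the real header is not documented here.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        BASE_URL + ENDPOINT, data=body, headers=headers, method="POST"
    )


def parse_detections(payload):
    """Return the list of detections found under data.results."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    return payload["data"]["results"]


def detect_languages(texts, api_key, timeout=10):
    """Send the request and return one detection dict per input text."""
    request = build_request(texts, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_detections(json.load(response))
```

Splitting request building and response parsing from the network call keeps both halves checkable without a live API key.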

Language Detection

POST /v1/proxy/text-language (sync)

Detect the language of one or more input texts. Returns the ISO 639-1 code, language name, and confidence for each text.

Pricing

Billed per request.

Service             Unit   Price
Language Detection  item   $0.001/request

Guides & Tips

Important Notes (verified against the live upstream)

  • The field name is `texts` (plural array). The legacy `text` (string) form returns 400.
  • Detections live inside `data.results`, not directly on `data`.
  • The live upstream returns no `alternatives` field, only the top match.
  • Cost is per request, not per text in the array.
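Since cost is per request and detections sit under `data.results`, sending many texts in one call and matching results back to inputs is a natural pattern. The sketch below assumes that `results[]` preserves input order, which this page implies ("one detection per input string") but does not state outright; the helper name is hypothetical.

```python
def pair_with_inputs(texts, payload):
    """Pair each input text with its detection from data.results.

    Assumption: results[] is returned in the same order as `texts`.
    """
    results = payload["data"]["results"]
    if len(results) != len(texts):
        # One detection per input is expected; anything else is suspicious.
        raise ValueError(
            f"expected {len(texts)} detections, got {len(results)}"
        )
    return list(zip(texts, results))
```

The length check guards against silently misaligned pairs if the response ever contains fewer or more entries than were sent.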

How it works

  • The detector returns the best match with a confidence score.
  • For mixed-language content, the dominant language is typically returned.

FAQ

Q: What format is the language code?

A: ISO 639-1 two-letter codes (e.g. en, tr, es, de, fr, ar, zh, ja, ko).

Q: How accurate is detection on very short inputs?

A: Short texts (<10 characters) have lower confidence. For reliable detection, send at least a sentence.

Q: Can I get alternative-language candidates?

A: Not from the live upstream — only the top match is returned. If you need alternatives, send the same text to a second backend client-side.
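For the short-input caveat in the FAQ above, one option is to separate very short strings client-side before sending them, so their detections can be treated with more caution. This is only a sketch: the function name is hypothetical, and the 10-character threshold simply mirrors the figure quoted on this page.

```python
# Threshold taken from the "<10 characters" note; adjust to taste.
MIN_RELIABLE_CHARS = 10


def split_by_length(texts, min_chars=MIN_RELIABLE_CHARS):
    """Split texts into (long_enough, too_short) by stripped length."""
    long_enough, too_short = [], []
    for text in texts:
        if len(text.strip()) >= min_chars:
            long_enough.append(text)
        else:
            too_short.append(text)
    return long_enough, too_short
```

Texts in the second list can still be sent for detection; the split just marks which results deserve a closer look at their `confidence`.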

Changelog

1.2 (2026-04-29)

  • Aligned to live upstream: request body is `texts` (array), response is `data.results[]`.
  • Removed `alternatives` field from response shape; it is not returned by the live upstream.

1.1 (2026-02-23)

  • Aligned with language_detector prose (now superseded).

1.0 (2026-01-26)

  • Initial catalog.