6 min · Nagovori

Introducing the Nagovori API

api · announcement · developer

Today we're launching the Nagovori API — a simple REST interface to our speech-to-text and text-to-speech engine. If you've been using Nagovori through the web interface or our Telegram bot, you can now build the same capabilities directly into your applications.

Why an API?

Since launch, Nagovori has processed hundreds of thousands of audio minutes through the web dashboard and messaging bots. But developers kept asking: "Can I call this from my server?"

The answer is now yes.

We built the API for teams that need transcription as a building block rather than a standalone tool. Think: a CRM that automatically transcribes sales calls. A podcast platform that generates searchable transcripts. A law firm that turns depositions into structured documents. A customer support platform that logs calls as text.

These aren't hypothetical use cases — they're requests we've received from teams already using Nagovori through the web interface.

What You Can Do

The API covers everything the web interface offers, and it's designed to be predictable. No websockets, no GraphQL, no vendor SDK required. Just JSON over HTTPS.

Speech-to-Text:

  • Upload audio files in any common format (MP3, WAV, OGG, FLAC, M4A, WEBM, and more)
  • Automatic language detection across 50+ languages
  • Timecoded transcript segments for subtitle generation
  • Real-time streaming via Server-Sent Events — watch the text appear as it's recognized
  • AI postprocessing: clean up filler words or generate structured summaries
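
The timecoded segments map naturally onto subtitle formats. Below is a minimal sketch of converting segments to SRT. It assumes each segment is a dict with `start` and `end` in seconds plus a `text` field; these field names are an assumption, not something this post confirms, so adjust them to the actual response.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues, numbered from 1."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        # "start", "end", and "text" are assumed field names.
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # Cues are separated by a blank line, as SRT expects.
    return "\n".join(cues)
```

Writing the returned string to a `.srt` file should be enough for most video players to pick it up.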

Text-to-Speech:

  • Convert text to natural-sounding audio in 7 distinct voices
  • Stream audio in real time for interactive applications
  • Batch synthesis for pre-rendering content
  • Stress mark support for Russian text (за́мок vs замо́к)
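
In Unicode, a stress mark like the one in за́мок is usually written as the combining acute accent U+0301 placed right after the stressed vowel. The sketch below adds one programmatically. `add_stress` is a hypothetical helper, and whether the API expects exactly this encoding is an assumption to check against the documentation.

```python
STRESS = "\u0301"  # COMBINING ACUTE ACCENT


def add_stress(word: str, vowel_index: int) -> str:
    """Insert a stress mark after the character at vowel_index (0-based)."""
    vowels = "аеёиоуыэюяАЕЁИОУЫЭЮЯ"
    if word[vowel_index] not in vowels:
        raise ValueError(f"{word[vowel_index]!r} is not a Russian vowel")
    return word[: vowel_index + 1] + STRESS + word[vowel_index + 1 :]


castle = add_stress("замок", 1)  # stress on the first syllable: "castle"
lock = add_stress("замок", 3)    # stress on the second syllable: "lock"
```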

Usage Management:

  • Create and manage API keys from your profile
  • Track usage and remaining minutes via /v1/profile/usage
  • Same pricing as web — no API surcharge
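
A minimal sketch of checking the balance from Python using only the standard library. The endpoint path and Bearer scheme come from this post. The response schema is not described here, so the sketch returns the decoded JSON as-is rather than guessing field names; `build_usage_request` and `fetch_usage` are illustrative helper names.

```python
import json
import urllib.request

BASE = "https://api.nagovori.ru/v1"


def build_usage_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET request for /v1/profile/usage."""
    return urllib.request.Request(
        f"{BASE}/profile/usage",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_usage(api_key: str) -> dict:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(build_usage_request(api_key), timeout=10) as resp:
        return json.load(resp)
```

Calling `fetch_usage` with the key read from an environment variable keeps the secret out of source control.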

Authentication

We designed authentication to be simple. Create an API key from your profile page, and use it as a Bearer token:

curl https://api.nagovori.ru/v1/me \
  -H "Authorization: Bearer nag_YOUR_API_KEY"

API keys start with nag_ so they're easy to identify in code reviews and secret scanners. Keys are hashed with SHA-256 before storage — we never store the raw key. If a key leaks, revoke it instantly from your profile and create a new one.

You can create multiple keys for different environments (staging, production, CI) and revoke them independently.

A Complete Example

Here's a full transcription workflow in Python — from file upload to printed transcript:

import os
import time

import requests

API_KEY = os.environ["NAGOVORI_API_KEY"]
BASE = "https://api.nagovori.ru/v1"
PATH = "call.mp3"
headers = {"Authorization": f"Bearer {API_KEY}"}

# File metadata is sent with both the presign and the transcription request
file_info = {
    "filename": os.path.basename(PATH),
    "content_type": "audio/mpeg",
    "size_bytes": os.path.getsize(PATH),
}

# 1. Get a presigned upload URL
resp = requests.post(f"{BASE}/uploads/presign", headers=headers, json=file_info)
resp.raise_for_status()  # fail here on 4xx/5xx, not later on a missing key
presign = resp.json()

# 2. Upload the file directly to object storage
with open(PATH, "rb") as f:
    requests.put(presign["upload_url"], data=f,
                 headers={"Content-Type": "audio/mpeg"}).raise_for_status()

# 3. Start transcription
resp = requests.post(f"{BASE}/transcriptions", headers=headers, json={
    **file_info,
    "object_key": presign["object_key"],
    "language": "auto",
})
resp.raise_for_status()
job = resp.json()

# 4. Poll until complete
while job["status"] in ("queued", "processing"):
    time.sleep(3)
    resp = requests.get(f"{BASE}/transcriptions/{job['id']}", headers=headers)
    resp.raise_for_status()
    job = resp.json()

if job["status"] == "completed":
    print(job["transcript_text"])
else:
    print(f"Failed: {job.get('error_message')}")

The same flow works in any language. Check our Examples page for TypeScript, Go, and cURL versions.
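
The polling loop in the example runs until the job leaves the queued or processing state. In production you may want an upper bound on the wait. Here is a sketch of a generic helper; `wait_for_job`, `fetch_job`, and the default timeout are illustrative choices, not part of the API.

```python
import time
from typing import Callable

PENDING = ("queued", "processing")  # statuses polled on in the example


def wait_for_job(
    fetch_job: Callable[[], dict],
    timeout: float = 600.0,
    interval: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_job until the status is no longer pending or time runs out."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()
        if job["status"] not in PENDING:
            return job
        # Give up if the next poll would land past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"job still {job['status']} after {timeout}s")
        sleep(interval)
```

Pass a zero-argument callable that performs the GET request and returns the parsed JSON. The `sleep` and `clock` parameters exist only so the helper can be tested without real waiting.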

Real-Time Streaming

For applications that need to show text as it's being recognized, the API supports Server-Sent Events:

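import json

import requests

# BASE, headers, and job are defined as in the complete example above.
job_id = job["id"]  # id returned by POST /transcriptions (step 3)

url = f"{BASE}/transcriptions/{job_id}/stream"
with requests.get(url, headers=headers, stream=True) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if line.startswith(b"data: "):
            # Parse the JSON event and update your UI
            event = json.loads(line[len(b"data: "):])
            print(event)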

The stream emits token events with individual words as they're recognized, and a final done event with the complete transcript.
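
Server-Sent Events are a line-based format: `event:` names the event, `data:` carries the payload, and a blank line ends each event. The parser sketch below assumes the stream labels its events with `event: token` and `event: done` and sends JSON in `data:`. That framing and the payload fields are assumptions to verify against the API Reference; `iter_sse_events` is an illustrative helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[bytes]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, payload) pairs from raw SSE lines."""
    event_name, data_parts = "message", []  # "message" is the SSE default name
    for raw in lines:
        line = raw.decode("utf-8")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Per the SSE format, one leading space after the colon is dropped.
            data_parts.append(line[len("data:"):].removeprefix(" "))
```

With `requests`, the generator can be fed directly from `response.iter_lines()`, and the caller can branch on the event name to append tokens or finish on `done`.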

Pricing

API usage shares the same minute balance as the web interface and bots. There's no separate API pricing — if you have free minutes or a purchased package, the API uses the same pool.

Minute packages go as low as 1.4 ₽/min at large volumes. Every new account gets free minutes to test with, and no credit card is required.

Rate Limits and Fair Use

To keep the service fast for everyone:

  • 60 API requests per minute per user
  • 1 transcription running at a time (queue additional jobs)
  • File uploads via presigned URLs (no multipart through our servers)

These limits are generous for most use cases. If you're building something that needs higher throughput, reach out.
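
To stay under the limit of 60 requests per minute, a client can throttle itself before sending. Below is a minimal sliding-window sketch. `RateLimiter` is an illustrative helper, and how the server actually counts its window is not stated here, so treat this as a conservative client-side guard rather than a mirror of the server's logic.

```python
import time
from collections import deque
from typing import Callable


class RateLimiter:
    """Allow at most max_calls within any sliding window of period seconds."""

    def __init__(self, max_calls: int = 60, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.max_calls, self.period = max_calls, period
        self._clock, self._sleep = clock, sleep
        self._sent = deque()  # timestamps of recent calls, oldest first

    def acquire(self) -> None:
        """Block until another request may be sent, then record it."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        if len(self._sent) >= self.max_calls:
            # Wait until the oldest call falls out of the window.
            self._sleep(self.period - (now - self._sent[0]))
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
```

Call `acquire()` immediately before each API request. The injectable `clock` and `sleep` are there so the limiter can be tested without waiting a real minute.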

Getting Started

  1. Sign in to nagovori.ru
  2. Go to your Profile and create an API key
  3. Follow the Quickstart guide — your first transcription in 5 minutes
  4. Browse the full API documentation and interactive API Reference

We're excited to see what you build. If you have questions or feedback, reach out through the platform.