Build a Telegram Transcription Bot in 30 Minutes
In this tutorial, you'll build a Telegram bot that transcribes voice messages and audio files using the Nagovori API. The bot receives a voice message, sends it to Nagovori for transcription, and replies with the text.
Prerequisites
- Python 3.10+
- A Telegram bot token from @BotFather
- A Nagovori API key (create one in your Profile)
Setup
Install dependencies:
pip install python-telegram-bot requests
Set environment variables:
export TELEGRAM_BOT_TOKEN="your-telegram-bot-token"
export NAGOVORI_API_KEY="nag_your_api_key"
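If either variable is missing, a plain `os.environ[...]` lookup fails with a bare `KeyError`. A small helper can turn that into a readable startup message. This is a minimal sketch; `require_env` is a hypothetical name, not part of any library.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or exit with a readable message."""
    value = os.environ.get(name, "").strip()
    if not value:
        # SystemExit prints the message and stops the process without a traceback.
        raise SystemExit(f"Missing environment variable: {name}")
    return value
```

In bot.py this could replace the direct lookups, for example `TELEGRAM_TOKEN = require_env("TELEGRAM_BOT_TOKEN")`.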
The Code
Create bot.py:
import os
import time
import tempfile

import requests
from telegram import Update
from telegram.ext import Application, MessageHandler, filters, ContextTypes

TELEGRAM_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
API_KEY = os.environ["NAGOVORI_API_KEY"]
BASE_URL = "https://api.nagovori.ru/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}


async def handle_voice(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Handle voice messages and audio files."""
    message = update.message
    if not message:
        return

    # Get the file from Telegram
    if message.voice:
        file = await message.voice.get_file()
        filename = "voice.ogg"
        content_type = "audio/ogg"
    elif message.audio:
        file = await message.audio.get_file()
        filename = message.audio.file_name or "audio.mp3"
        content_type = message.audio.mime_type or "audio/mpeg"
    elif message.document and message.document.mime_type and \
            message.document.mime_type.startswith("audio/"):
        file = await message.document.get_file()
        filename = message.document.file_name or "audio.mp3"
        content_type = message.document.mime_type
    else:
        return

    await message.reply_text("Transcribing... please wait.")

    # Download from Telegram
    with tempfile.NamedTemporaryFile(suffix=".ogg", delete=False) as tmp:
        await file.download_to_drive(tmp.name)
        file_size = os.path.getsize(tmp.name)

    # 1. Presign upload
    presign = requests.post(
        f"{BASE_URL}/uploads/presign",
        headers=HEADERS,
        json={
            "filename": filename,
            "content_type": content_type,
            "size_bytes": file_size,
        },
    ).json()

    # 2. Upload to Nagovori
    with open(tmp.name, "rb") as audio:
        requests.put(
            presign["upload_url"],
            data=audio,
            headers={"Content-Type": content_type},
        )
    os.unlink(tmp.name)

    # 3. Create transcription
    job = requests.post(
        f"{BASE_URL}/transcriptions",
        headers=HEADERS,
        json={
            "object_key": presign["object_key"],
            "filename": filename,
            "content_type": content_type,
            "size_bytes": file_size,
            "language": "auto",
        },
    ).json()

    # 4. Poll for result
    while job["status"] in ("queued", "processing"):
        time.sleep(2)
        job = requests.get(
            f"{BASE_URL}/transcriptions/{job['id']}",
            headers=HEADERS,
        ).json()

    if job["status"] == "completed":
        text = job["transcript_text"]
        # Telegram messages have a 4096 character limit
        for i in range(0, len(text), 4000):
            await message.reply_text(text[i:i + 4000])
    else:
        error = job.get("error_message", "Unknown error")
        await message.reply_text(f"Transcription failed: {error}")


def main():
    app = Application.builder().token(TELEGRAM_TOKEN).build()
    app.add_handler(MessageHandler(
        filters.VOICE | filters.AUDIO | filters.Document.AUDIO,
        handle_voice,
    ))
    print("Bot started. Listening for voice messages...")
    app.run_polling()


if __name__ == "__main__":
    main()
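The handler slices the transcript every 4000 characters, which can cut a word in half at a chunk boundary. One possible refinement, sketched here, is to break at the last whitespace inside each window and fall back to a hard cut only when there is none. `split_for_telegram` is a hypothetical helper name, and the 4000-character default simply mirrors the value used in bot.py.

```python
def split_for_telegram(text: str, limit: int = 4000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring whitespace breaks."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Find the last space or newline within the window (index <= limit).
        cut = max(remaining.rfind(" ", 0, limit + 1), remaining.rfind("\n", 0, limit + 1))
        if cut <= 0:
            # No whitespace to break on: fall back to a hard cut.
            cut = limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

In the handler, the reply loop would then become `for chunk in split_for_telegram(text): await message.reply_text(chunk)`.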
Running
python bot.py
Send a voice message to your bot. It first replies with "Transcribing... please wait." and then with the transcribed text, typically within seconds for short messages.
Deployment
For production, consider:
- Webhooks instead of polling for lower latency
- Async polling with asyncio instead of time.sleep
- Error handling with retry logic for API failures
- Docker container for easy deployment
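Two of these points, async polling and retry logic, can be sketched with the standard library alone. The example below is one possible shape, not a definitive implementation: `with_retries` and `poll_job` are hypothetical helpers, the retry counts, delays and timeout are illustrative defaults, and `fetch_job` stands for any coroutine function that returns the job dictionary.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def with_retries(
    call: Callable[[], Awaitable[Any]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> Any:
    """Run an async call, retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return await call()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            await asyncio.sleep(base_delay * 2 ** attempt)


async def poll_job(
    fetch_job: Callable[[], Awaitable[dict]],
    interval: float = 2.0,
    timeout: float = 300.0,
) -> dict:
    """Poll until the job leaves the queued/processing states or the timeout expires."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        job = await with_retries(fetch_job)
        if job["status"] not in ("queued", "processing"):
            return job
        if loop.time() >= deadline:
            raise TimeoutError("Transcription did not finish in time")
        # asyncio.sleep yields to the event loop, so other chats keep being served.
        await asyncio.sleep(interval)
```

With the blocking requests library, `fetch_job` could wrap the existing `requests.get(...).json()` call in `asyncio.to_thread` so the HTTP request itself also stays off the event loop.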
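Docker Example
The Dockerfile below expects a requirements.txt next to bot.py that lists the two dependencies installed earlier: python-telegram-bot and requests.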
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY bot.py .
CMD ["python", "bot.py"]
Extending the Bot
Ideas for improvements:
- Add /lang command to set preferred language
- Support video messages (extract audio track)
- Add /summary command to use AI postprocessing
- Store transcription history in a database
- Add inline mode for transcribing forwarded voice messages
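As a starting point for the /lang idea, the per-chat preference logic can be kept separate from the Telegram wiring. The sketch below uses an in-memory dictionary. The names `chat_languages`, `SUPPORTED`, `set_language` and `language_for` are hypothetical, and apart from "auto" (which bot.py already sends) the language codes are assumptions that should be checked against the Nagovori documentation.

```python
# Hypothetical in-memory store: chat_id -> language code. Lost on restart.
chat_languages: dict[int, str] = {}

# Assumed set of codes; only "auto" appears in the tutorial's request body.
SUPPORTED = {"auto", "en", "ru"}


def set_language(chat_id: int, args: list[str]) -> str:
    """Store the language from `/lang <code>` and return the reply text."""
    if len(args) != 1 or args[0].lower() not in SUPPORTED:
        return "Usage: /lang <" + "|".join(sorted(SUPPORTED)) + ">"
    chat_languages[chat_id] = args[0].lower()
    return f"Language set to {chat_languages[chat_id]}."


def language_for(chat_id: int) -> str:
    """Language to send in the transcription request; defaults to auto-detect."""
    return chat_languages.get(chat_id, "auto")
```

In python-telegram-bot, a `CommandHandler("lang", ...)` callback could pass `update.effective_chat.id` and `context.args` to `set_language`, and the transcription request could use `language_for(...)` in place of the fixed "auto".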
Cost
Each voice message uses your Nagovori minute balance. A typical 1-minute voice message uses 1 minute from your balance. Check your remaining minutes in the Profile.
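To warn users before a long message drains the balance, the bot could estimate usage from the duration Telegram reports for a voice message (`message.voice.duration`, in seconds). The sketch below is only an estimate: `estimate_minutes` is a hypothetical helper, and rounding up to whole minutes is an assumption, since the exact billing rule is not stated here.

```python
import math


def estimate_minutes(duration_seconds: int, round_up: bool = True) -> float:
    """Estimate balance usage for one message; the rounding rule is an assumption."""
    minutes = duration_seconds / 60
    # Assumed policy: partial minutes count as a full minute when round_up is True.
    return float(math.ceil(minutes)) if round_up else minutes
```

For a 60-second voice message this gives 1 minute, matching the typical case described above.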