Async Speech-to-Text
Articles
- What should I do if I'm getting an error?
- Is there a Postman collection for using the API?
- How can I use Universal-2?
- How does the API handle files that contain spoken audio in multiple languages?
- Are there any limits on file size or file duration for files submitted to the API?
- Are custom models more accurate than general models?
- What audio and video file types are supported by your API?
- How do I generate subtitles?
- How can I transcribe YouTube videos?
- Do you have any resources for how to use your API?
- What is the difference between your Speech-to-Text models?
- Can I submit files to the API that are stored in a Google Drive?
- How can I integrate AssemblyAI with other services?
- Does it cost extra to export SRT or VTT captions?
- Can I get a list of all transcripts I have created?
- Where can I find a list of recent changes to the API?
- Can I delete the transcripts I have created using the API?
- Do you offer cross-file Speaker Identification?
- Do you offer translation?
- Should I use Speaker Labels or Multi-channel?
- Do you have example use cases for using AssemblyAI?
- What languages do you support?
- How does Automatic Language Detection work?
- What is the recommended file type for using your API?
- How are individual speakers identified and how does the Speaker Labels feature work?
- What are the recommended options for audio noise reduction?
- How can I test AssemblyAI without writing code?
- Is there an OpenAPI spec/schema for the API?
- Do you offer voice-to-voice or text-to-speech (TTS)?
- Is there a way to generate SRT or VTT captions with Speaker Labels?
- What is the difference between Custom Vocabulary and Custom Spelling?
- Does your API return timestamps for individual words?
- Can I customize how words are spelled by the model?
- What IP address should I whitelist for AssemblyAI?
- How does AssemblyAI compare to other ASR providers?
- Can I specify a start time and end time so only part of a call recording is transcribed?
- How long does it take to transcribe a file?
- What types of audio URLs can I use with the API?
- How are paragraphs created for the /paragraphs endpoint?
- Do we have resources for building with Make?
- What is the minimum audio duration that the API can transcribe?
- How are word/transcript level confidence scores calculated?
- Where can I find cURL code examples?
- What causes a "read operation timed out" error?
- Can I use the API without internet access?
- Can I send audio to AssemblyAI in segments and still get speaker labels for the whole recording?
- Why can't I access recording URLs from the /upload endpoint directly?
- Should I use the NA or EU endpoint for my Speech-to-Text requests?
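Many of the articles above share the same underlying async flow: submit an audio file for transcription, then poll the transcript endpoint until the job reaches a terminal status. The sketch below illustrates just the polling half; the endpoint path follows AssemblyAI's documented v2 REST API, while `poll_until_done`, `fetch_status`, and the stub transport are illustrative names, not part of any SDK.

```python
import time

# Base URL for AssemblyAI's v2 REST API (per the public docs).
API_BASE = "https://api.assemblyai.com/v2"

def poll_until_done(fetch_status, transcript_id, interval_s=3.0, sleep=time.sleep):
    """Poll a transcript job until it reaches a terminal status.

    `fetch_status(transcript_id)` must return the job as a dict; it is
    injected here so the loop can run without network access. In real use
    it would GET f"{API_BASE}/transcript/{transcript_id}" with your API
    key in the Authorization header.
    """
    while True:
        job = fetch_status(transcript_id)
        if job["status"] in ("completed", "error"):
            return job
        sleep(interval_s)

# Stub transport that walks through the statuses an async job moves through.
statuses = iter(["queued", "processing", "completed"])
fake_fetch = lambda tid: {"id": tid, "status": next(statuses), "text": "hello"}

result = poll_until_done(fake_fetch, "abc123", interval_s=0, sleep=lambda s: None)
print(result["status"])  # completed
```

In production code you would replace the stub with a real HTTP GET and keep a modest polling interval (or use webhooks instead of polling, as the API also supports).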