Async Speech-to-Text

How can I use Universal-2?
How does AssemblyAI compare to other ASR providers?
Are custom models more accurate than general models?
Does it cost extra to export SRT or VTT captions?
How does Automatic Language Detection work?
How are paragraphs created for the /paragraphs endpoint?
What is the difference between Custom Vocabulary and Custom Spelling?
How are individual speakers identified and how does the Speaker Label feature work?
How does the API handle files that contain spoken audio in multiple languages?
Do you offer cross-file Speaker Identification?
Should I use Speaker Labels or Multi-channel?
Do you offer translation?
Do you offer voice-to-voice or text-to-speech (TTS)?
Do you have resources for building with Make?
Is there an OpenAPI spec/schema for the API?
How are word/transcript level confidence scores calculated?
Where can I find a list of recent changes to the API?
What audio and video file types are supported by your API?
What is the recommended file type for using your API?
Are there any limits on file size or file duration for files submitted to the API?
How long does it take to transcribe a file?
Can I delete the transcripts I have created using the API?
Does your API return timestamps for individual words?
Can I specify a start time and end time for transcription instead of transcribing the whole length of a call recording?
What is the difference between your Speech-to-Text models?
Is there a Postman collection for using the API?
Can I get a list of all transcripts I have created?
What IP address should I whitelist for AssemblyAI?
What is the minimum audio duration that the API can transcribe?
What causes a "read operation timed out" error?
What types of audio URLs can I use with the API?
Can I use the API without internet access?
How do I generate subtitles?
Do you have any resources for how to use your API?
How can I integrate AssemblyAI with other services?
What languages do you support?
How can I test AssemblyAI without writing code?
Do you have example use cases for using AssemblyAI?
How can I transcribe YouTube videos?
Can I submit files to the API that are stored in Google Drive?
Can I send audio to AssemblyAI in segments and still get speaker labels for the whole recording?
What should I do if I'm getting an error?
Should I use the NA or EU endpoint for my Speech-to-Text requests?
Can I customize how words are spelled by the model?
What are the recommended options for audio noise reduction?
Why can't I access recording URLs from the /upload endpoint directly?
Where can I find cURL code examples?
Is there a way to generate SRT or VTT captions with Speaker Labels?