Async Speech-to-Text
Articles
- What should I do if I'm getting an error?
- Is there a Postman collection for using the API?
- How can I use Universal-2?
- How does the API handle files that contain spoken audio in multiple languages?
- Are there any limits on file size or file duration for files submitted to the API?
- Are custom models more accurate than general models?
- What audio and video file types are supported by your API?
- How do I generate subtitles?
- How can I transcribe YouTube videos?
- Do you have any resources for how to use your API?
- What is the difference between your Speech-to-Text models?
- Can I submit files to the API that are stored in a Google Drive?
- How can I integrate AssemblyAI with other services?
- Does it cost extra to export SRT or VTT captions?
- Can I get a list of all transcripts I have created?
- Where can I find a list of recent changes to the API?
- Can I delete the transcripts I have created using the API?
- Do you offer cross-file Speaker Identification?
- Do you offer translation?
- Should I use Speaker Labels or Multi-channel?
- Do you have example use cases for using AssemblyAI?
- What languages do you support?
- How does Automatic Language Detection work?
- What is the recommended file type for using your API?
- How are individual speakers identified and how does the Speaker Labels feature work?
- What are the recommended options for audio noise reduction?
- How can I test AssemblyAI without writing code?
- Is there an OpenAPI spec/schema for the API?
- Do you offer voice-to-voice or text-to-speech (TTS)?
- Is there a way to generate SRT or VTT captions with Speaker Labels?
- What is the difference between Custom Vocabulary and Custom Spelling?
- Does your API return timestamps for individual words?
- Can I customize how words are spelled by the model?
- What IP address should I whitelist for AssemblyAI?
- How does AssemblyAI compare to other ASR providers?
- Can I specify a start time and end time so only part of a call recording is transcribed?
- How long does it take to transcribe a file?
- What types of audio URLs can I use with the API?
- How are paragraphs created for the /paragraphs endpoint?
- Do we have resources for building with Make?
- What is the minimum audio duration that the API can transcribe?
- How are word/transcript level confidence scores calculated?
- Where can I find cURL code examples?
- What causes a "read operation timed out" error?
- Can I use the API without internet access?
- Can I send audio to AssemblyAI in segments and still get speaker labels for the whole recording?
- Why can't I access recording URLs from the /upload endpoint directly?
- Should I use the NA or EU endpoint for my Speech-to-Text requests?
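Many of the articles above share the same underlying async flow: submit an audio file for transcription, then poll the transcript endpoint until the job reaches a terminal status. The sketch below illustrates just the polling half; the endpoint path follows AssemblyAI's documented v2 REST API, while `poll_until_done`, `fetch_status`, and the stub transport are illustrative names, not part of any SDK.

```python
import time

# Base URL for AssemblyAI's v2 REST API (per the public docs).
API_BASE = "https://api.assemblyai.com/v2"

def poll_until_done(fetch_status, transcript_id, interval_s=3.0, sleep=time.sleep):
    """Poll a transcript job until it reaches a terminal status.

    `fetch_status(transcript_id)` must return the job as a dict; it is
    injected here so the loop can run without network access. In real use
    it would GET f"{API_BASE}/transcript/{transcript_id}" with your API
    key in the Authorization header.
    """
    while True:
        job = fetch_status(transcript_id)
        if job["status"] in ("completed", "error"):
            return job
        sleep(interval_s)

# Stub transport that walks through the statuses an async job moves through.
statuses = iter(["queued", "processing", "completed"])
fake_fetch = lambda tid: {"id": tid, "status": next(statuses), "text": "hello"}

result = poll_until_done(fake_fetch, "abc123", interval_s=0, sleep=lambda s: None)
print(result["status"])  # completed
```

In production code you would replace the stub with a real HTTP GET and keep a modest polling interval (or use webhooks instead of polling, as the API also supports).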