# Fast mode

Fast mode provides synchronous transcription for individual audio files. The response returns immediately after the transcription completes.

## Key features

- **Single file processing**: Processes one audio file at a time
- **Immediate results**: Returns results after the transcription completes
- **Short recordings**: Works best for short recordings

Fast mode can handle one (mono) or two (stereo) audio channels. The API returns either a single combined transcript or separate transcripts for each channel.

## Input

We support two audio input channel formats.

- Mono channel
- Stereo channel

## Output

- Single transcript: a single aggregated transcript when `channel_separation=false`
- Per-channel transcripts: per-channel aggregate transcripts with channel tags when `channel_separation=true`

## Audio formats

WAV, MP3, M4A, and MP4

## Transcription options

These options control transcription language, segmentation mode, and channel separation.

| Parameter            | Type    | Default  | Description                                                   |
| -------------------- | ------- | -------- | ------------------------------------------------------------- |
| `language`           | string  | -        | Language code (e.g., en-US). See language abbreviations.      |
| `segmentation_mode`  | string  | `"auto"` | Voice Activity Detection (VAD) mode. Set `"none"` to disable. |
| `word_time_offsets`  | boolean | `false`  | Include word-level timestamps.                                |
| `channel_separation` | boolean | `false`  | Transcribe stereo channels separately.                        |

## Audio transcription

Use this endpoint to send an audio file for synchronous transcription:

**`POST /aiservices/scribe/transcribe`**

### Example request (cURL)

This cURL example sends an audio file to the Scribe API and returns a transcription:

```shell
curl -X POST https://api.zoom.us/v2/aiservices/scribe/transcribe \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "file": "https://example.com/path/clip.mp3",
    "config": {
      "language": "en-US",
      "word_time_offsets": true,
      "channel_separation": false
    }
  }'
```

#### Response (200)

```json
{
  "request_id": "req_123",
  "duration_sec": 27.4,
  "result": {
    "text_display": "Human-readable with punctuation.",
    "text_lexical": "lowercase lexical form",
    "segments": [
      {
        "start": 0.0,
        "end": 5.2,
        "text": "Welcome everyone ...",
        "words": [
          { "word": "Welcome", "start": 0.0, "end": 0.4 },
          { "word": "everyone", "start": 0.41, "end": 0.9 }
        ]
      }
    ]
  }
}
```
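
### Example client (Python sketch)

The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_payload`, `transcribe`, and `word_timings` are hypothetical, the URL is copied from the cURL example, and placing `segmentation_mode` inside `config` is an assumption based on where the other options appear.

```python
import json
import urllib.request

# Assumption: full URL taken from the cURL example in this document.
API_URL = "https://api.zoom.us/v2/aiservices/scribe/transcribe"


def build_payload(file_url, language, segmentation_mode="auto",
                  word_time_offsets=False, channel_separation=False):
    """Build the JSON body; defaults mirror the transcription options table."""
    return {
        "file": file_url,
        "config": {
            "language": language,
            # Assumption: segmentation_mode sits in "config" like the other options.
            "segmentation_mode": segmentation_mode,
            "word_time_offsets": word_time_offsets,
            "channel_separation": channel_separation,
        },
    }


def transcribe(file_url, token, **options):
    """POST the payload and return the decoded JSON response (hypothetical helper)."""
    body = json.dumps(build_payload(file_url, **options)).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def word_timings(response):
    """Flatten segments into (word, start, end) tuples.

    Returns an empty list when the response carries no word-level offsets.
    """
    timings = []
    for segment in response.get("result", {}).get("segments", []):
        for word in segment.get("words", []):
            timings.append((word["word"], word["start"], word["end"]))
    return timings
```

A call might look like `word_timings(transcribe(url, token, language="en-US", word_time_offsets=True))`. The parser reads only the fields shown in the sample response; the shape of per-channel output with `channel_separation=true` is not documented here, so this sketch does not attempt to handle it.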