speech endpoint to implement the following functions based on the TTS model:

IMPORTANT: Users must be told that the speech they are hearing is AI-generated, not a human voice.

| Format | Features | Applicable scenarios |
|---|---|---|
| MP3 | Default format | General use |
| Opus | Low latency | Internet streaming and communication |
| AAC | Efficient compression | Mobile device playback |
| FLAC | Lossless compression | Audio archiving |
| WAV | Uncompressed | Low-latency applications |
| PCM | Raw samples (24 kHz, 16-bit signed) | Direct processing of raw audio |
Note: The current voices are optimized mainly for English.
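
The PCM row in the table describes headerless audio, so most players cannot open it directly. The following is a minimal sketch of wrapping such data in a WAV container with Python's standard `wave` module. The sample rate and bit depth come from the table; the mono channel count and little-endian byte order are assumptions, and `pcm_to_wav_bytes` and `sine_pcm` are hypothetical helper names. The sine tone only stands in for real endpoint output.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 24_000   # from the table: 24 kHz
SAMPLE_WIDTH_BYTES = 2    # from the table: 16-bit signed
CHANNELS = 1              # assumption: mono output


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap raw PCM samples in a WAV container and return the file bytes."""
    if len(pcm) % (SAMPLE_WIDTH_BYTES * CHANNELS) != 0:
        raise ValueError("PCM data does not contain a whole number of frames")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav_file.setframerate(SAMPLE_RATE_HZ)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def sine_pcm(freq_hz: float = 440.0, seconds: float = 0.5) -> bytes:
    """Stand-in for endpoint output: a sine tone as little-endian 16-bit PCM."""
    n = int(SAMPLE_RATE_HZ * seconds)
    samples = (
        int(32767 * 0.3 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE_HZ))
        for i in range(n)
    )
    return struct.pack(f"<{n}h", *samples)


if __name__ == "__main__":
    wav_bytes = pcm_to_wav_bytes(sine_pcm())
    with open("speech.wav", "wb") as f:
        f.write(wav_bytes)
```

If the real output turns out to be stereo or to use a different byte order, the constants and the `struct` format string would need to change accordingly.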