sarvamai-go SDK Documentation
Text-to-Speech

Convert (REST)

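`TextToSpeech.Convert` is the REST method for file-style TTS generation: it takes the input text and a target language and returns the generated audio as base64 payloads.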

Signature

func (c *TTSClient) Convert(ctx context.Context, text string, targetLanguage LanguageCode, opts ...option) (*ConvertResponse, error)

Options

Option                     Type
WithModel                  BulbulV2 | BulbulV3
WithSpeakerVoice           SpeakerVoice
WithPitch                  float64
WithPace                   float64
WithLoudness               float64
WithSpeechSampleRate       SpeechSampleRate
WithOutputAudioCodec       AudioCodec
WithEnablePreprocessing    bool
WithTemperature            float64
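
The variadic `opts ...option` parameter in the signature suggests the functional-options pattern. The sketch below shows how that pattern generally works in Go; the `request` type, the lower-case option constructors, and `build` are illustrative stand-ins and are not the SDK's actual internals.

```go
package main

import "fmt"

// request is a hypothetical stand-in for whatever the SDK builds internally.
type request struct {
	Model string
	Pace  *float64 // nil means "not set by the caller"
}

// option mutates a request; each With... style constructor returns one.
type option func(*request)

func withModel(m string) option { return func(r *request) { r.Model = m } }
func withPace(p float64) option { return func(r *request) { r.Pace = &p } }

// build applies the options in the order they were passed.
func build(opts ...option) request {
	var r request
	for _, o := range opts {
		o(&r)
	}
	return r
}

func main() {
	r := build(withModel("BulbulV3"), withPace(1.2))
	fmt.Println(r.Model, *r.Pace)
}
```

Storing optional values as pointers lets later validation distinguish "option not supplied" from "option set to zero", which matters when a model rejects an option outright.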

Global validation

  1. target_language_code must be in languages.TargetLanguages.
  2. text length:
    • BulbulV3: 1..2500
    • BulbulV2: 1..1500
  3. speaker voice must exist in model-specific allowed voice list.
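
The text-length rule above can be checked before making a request. The following is a minimal sketch, not SDK code: `checkTextLength` and the string model keys are hypothetical, and it assumes the limits are counted in Unicode code points rather than bytes, which this page does not specify.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxTextLen mirrors the per-model limits listed above. The string keys
// are placeholders for the SDK's model constants.
var maxTextLen = map[string]int{
	"BulbulV3": 2500,
	"BulbulV2": 1500,
}

// checkTextLength is a hypothetical pre-flight helper.
// Assumption: length is measured in runes (code points), not bytes.
func checkTextLength(model, text string) error {
	limit, ok := maxTextLen[model]
	if !ok {
		return fmt.Errorf("unknown model %q", model)
	}
	n := utf8.RuneCountInString(text)
	if n < 1 || n > limit {
		return fmt.Errorf("text length %d outside 1..%d for %s", n, limit, model)
	}
	return nil
}

func main() {
	fmt.Println(checkTextLength("BulbulV2", "नमस्ते")) // non-empty and well under the limit
	fmt.Println(checkTextLength("BulbulV2", ""))       // empty text is rejected
}
```

If the service counts bytes instead, replace `utf8.RuneCountInString(text)` with `len(text)`; for Indic scripts the two can differ by a factor of about three.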

Model-specific valid / invalid combinations

Combination                                                              Result
BulbulV3 + WithPitch(...)                                                Validation error
BulbulV3 + WithLoudness(...)                                             Validation error
BulbulV3 + WithEnablePreprocessing(...)                                  Validation error
BulbulV3 + WithPace(0.5..2.0)                                            Valid
BulbulV3 + WithTemperature(0.01..2.0)                                    Valid
BulbulV3 + sample rate in {8000,16000,22050,24000,32000,44100,48000}     Valid
BulbulV2 + WithTemperature(...)                                          Validation error
BulbulV2 + WithPitch(-0.75..0.75)                                        Valid
BulbulV2 + WithLoudness(0.3..3.0)                                        Valid
BulbulV2 + WithPace(0.3..3.0)                                            Valid
BulbulV2 + sample rate in {8000,16000,22050,24000}                       Valid
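
The combination rules above can be expressed as a small validation function. This is a sketch under stated assumptions: `settings`, `checkRange`, and `checkSettings` are hypothetical names, the models are plain strings rather than SDK constants, and the ranges are treated as inclusive. Whether the SDK runs equivalent checks client-side or relies on the server is not stated on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// settings is a hypothetical stand-in for the options a caller supplied.
// A nil pointer means the option was not set.
type settings struct {
	Pitch, Pace, Loudness, Temperature *float64
	Preprocessing                      *bool
	SampleRate                         *int
}

// checkRange reports an error if v is set and outside [lo, hi] (assumed inclusive).
func checkRange(name string, v *float64, lo, hi float64) error {
	if v != nil && (*v < lo || *v > hi) {
		return fmt.Errorf("%s %v outside %v..%v", name, *v, lo, hi)
	}
	return nil
}

// checkSettings applies the per-model rules from the table above.
func checkSettings(model string, s settings) error {
	var checks []error
	var rates []int
	switch model {
	case "BulbulV3":
		if s.Pitch != nil || s.Loudness != nil || s.Preprocessing != nil {
			return errors.New("BulbulV3 rejects pitch, loudness and preprocessing")
		}
		checks = []error{
			checkRange("pace", s.Pace, 0.5, 2.0),
			checkRange("temperature", s.Temperature, 0.01, 2.0),
		}
		rates = []int{8000, 16000, 22050, 24000, 32000, 44100, 48000}
	case "BulbulV2":
		if s.Temperature != nil {
			return errors.New("BulbulV2 rejects temperature")
		}
		checks = []error{
			checkRange("pitch", s.Pitch, -0.75, 0.75),
			checkRange("loudness", s.Loudness, 0.3, 3.0),
			checkRange("pace", s.Pace, 0.3, 3.0),
		}
		rates = []int{8000, 16000, 22050, 24000}
	default:
		return fmt.Errorf("unknown model %q", model)
	}
	for _, err := range checks {
		if err != nil {
			return err
		}
	}
	if s.SampleRate != nil {
		for _, r := range rates {
			if r == *s.SampleRate {
				return nil
			}
		}
		return fmt.Errorf("sample rate %d not supported by %s", *s.SampleRate, model)
	}
	return nil
}

func main() {
	pace, temp := 1.2, 0.7
	fmt.Println(checkSettings("BulbulV3", settings{Pace: &pace, Temperature: &temp}))
	fmt.Println(checkSettings("BulbulV2", settings{Temperature: &temp}))
}
```

Note that the same pace value can be valid for one model and invalid for the other: 2.5 falls inside BulbulV2's 0.3..3.0 range but outside BulbulV3's 0.5..2.0.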

Response

Field        Type
RequestId    string
Audios       []string (base64 audio payloads)

Example

resp, err := client.TextToSpeech.Convert(
    ctx,
    "Hello from Sarvam AI Go SDK",
    tts.LanguageEnIN,
    tts.WithModel(tts.BulbulV3),
    tts.WithSpeakerVoice(tts.SpeakerShubh),
    tts.WithTemperature(0.7),
    tts.WithOutputAudioCodec(tts.AudioCodecMP3),
)
if err != nil {
    panic(err)
}
fmt.Println(len(resp.Audios))
