
Overview

VideoDraft uses Google’s advanced Text-to-Speech technology to provide natural, expressive voices for your video narration. With support for multiple languages and voice styles, you can create professional voiceovers that connect with your audience.
Speech generation costs 1 credit per 100 characters. Visionary and Studio plans include unlimited speech generation!

Available Voices

English Voices

Alex (Male)

Voice ID: en-US-Wavenet-D
  • Natural American accent
  • Professional tone
  • Clear articulation
  • Versatile delivery

Maria (Female)

Voice ID: en-US-Wavenet-C
  • Warm, friendly tone
  • American accent
  • Engaging delivery
  • Perfect for education

International Voices

Carlos (Male)

Voice ID: es-ES-Wavenet-B
  • Native Spanish speaker
  • Clear pronunciation
  • Professional tone

Lucia (Female)

Voice ID: es-ES-Wavenet-C
  • Warm Spanish voice
  • Natural intonation
  • Engaging style
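If you manage scripts for several languages, it can help to keep the voice list in one place. The sketch below only restates the names and Voice IDs listed above. The dictionary layout and the `voices_for_language` helper are assumptions made for illustration, not part of VideoDraft.

```python
# Voice names and IDs are copied from the lists above.
# The dictionary layout and helper function are hypothetical.
VOICES = {
    "Alex": {"voice_id": "en-US-Wavenet-D", "language": "en-US", "gender": "male"},
    "Maria": {"voice_id": "en-US-Wavenet-C", "language": "en-US", "gender": "female"},
    "Carlos": {"voice_id": "es-ES-Wavenet-B", "language": "es-ES", "gender": "male"},
    "Lucia": {"voice_id": "es-ES-Wavenet-C", "language": "es-ES", "gender": "female"},
}


def voices_for_language(language_code):
    """Return the names of voices for a language code, sorted alphabetically."""
    return sorted(
        name for name, voice in VOICES.items() if voice["language"] == language_code
    )


if __name__ == "__main__":
    for name in voices_for_language("es-ES"):
        print(name, VOICES[name]["voice_id"])
```

Keeping the language code next to each Voice ID makes it easy to pick a native voice when you create translated versions of a video.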

Voice Technology

Google Wavenet

Technology: Advanced neural network synthesis
  • Natural prosody and intonation
  • Human-like voice quality
  • Contextual emphasis
  • Emotional expression

Google Neural2 & Chirp3-HD

Next-Generation Voices:
  • Enhanced naturalness
  • Better pronunciation
  • Improved clarity
  • Regional accents

Speech Synthesis Features

Speaking Styles

Natural, relaxed delivery perfect for:
  • Educational content
  • Explainer videos
  • Casual presentations
  • Social media
Clear, authoritative tone ideal for:
  • Corporate videos
  • Product demos
  • Training materials
  • News updates
Expressive, engaging style for:
  • Narratives
  • Children’s content
  • Entertainment
  • Documentaries
Clear, measured pace for:
  • Tutorials
  • How-to guides
  • Technical content
  • E-learning

Voice Parameters

Adjustable Settings:
  • Speaking Rate: 0.5x to 2.0x speed
  • Pitch: -20% to +20% adjustment
  • Volume Gain: -10dB to +10dB
  • Emphasis: Automatic or manual
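To see how these ranges behave in practice, here is a minimal sketch that clamps out-of-range values to the limits listed above. The `VoiceSettings` class, its field names, and the `gain_to_amplitude` helper are hypothetical. The decibel conversion uses the standard amplitude formula, 10 ** (dB / 20).

```python
from dataclasses import dataclass

# Ranges are taken from the settings list above.
# The class, field names, and helpers are hypothetical.
RATE_RANGE = (0.5, 2.0)      # speaking rate multiplier
PITCH_RANGE = (-20.0, 20.0)  # pitch adjustment in percent
GAIN_RANGE = (-10.0, 10.0)   # volume gain in decibels


def clamp(value, bounds):
    """Limit value to the inclusive (low, high) bounds."""
    low, high = bounds
    return max(low, min(high, value))


@dataclass(frozen=True)
class VoiceSettings:
    speaking_rate: float = 1.0
    pitch_percent: float = 0.0
    volume_gain_db: float = 0.0

    def clamped(self):
        """Return a copy with every value forced into its documented range."""
        return VoiceSettings(
            speaking_rate=clamp(self.speaking_rate, RATE_RANGE),
            pitch_percent=clamp(self.pitch_percent, PITCH_RANGE),
            volume_gain_db=clamp(self.volume_gain_db, GAIN_RANGE),
        )


def gain_to_amplitude(gain_db):
    """Convert a decibel gain to a linear amplitude ratio."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    print(VoiceSettings(speaking_rate=3.0, pitch_percent=-50.0).clamped())
    print(round(gain_to_amplitude(-10.0), 3))
```

As a rough guide, the +10dB maximum roughly triples the signal amplitude, so small gain changes are usually enough.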

Text Formatting for Speech

SSML Support

Use Speech Synthesis Markup Language for advanced control:
<speak>
  Welcome to VideoDraft. 
  <break time="500ms"/>
  Let's create <emphasis level="strong">amazing</emphasis> videos together.
  <prosody rate="slow">Take your time to explore.</prosody>
</speak>
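If you generate SSML from a script rather than typing it by hand, special characters such as "&" and "<" need to be escaped or the markup becomes invalid. The sketch below assembles the example above using only the Python standard library. The helper names (`pause`, `emphasize`, `slow`, `speak`) are hypothetical and are not part of any VideoDraft tool.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def pause(milliseconds):
    """Return an SSML break of the given length in milliseconds."""
    return f'<break time="{int(milliseconds)}ms"/>'


def emphasize(text, level="strong"):
    """Wrap escaped text in an SSML emphasis element."""
    return f"<emphasis level={quoteattr(level)}>{escape(text)}</emphasis>"


def slow(text):
    """Wrap escaped text in a prosody element with a slow rate."""
    return f'<prosody rate="slow">{escape(text)}</prosody>'


def speak(*parts):
    """Join already-built SSML fragments inside a <speak> root."""
    return "<speak>" + " ".join(parts) + "</speak>"


ssml = speak(
    escape("Welcome to VideoDraft."),
    pause(500),
    escape("Let's create"),
    emphasize("amazing"),
    escape("videos together."),
    slow("Take your time to explore."),
)

# Parsing raises xml.etree.ElementTree.ParseError if the markup is malformed.
ET.fromstring(ssml)

if __name__ == "__main__":
    print(ssml)
```

Parsing the result before you paste it in is a cheap way to catch an unclosed tag.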

Pronunciation Guide

Numbers and Dates:
  • “2024” → “twenty twenty-four”
  • “$99” → “ninety-nine dollars”
  • “3/4” → “three quarters”
Abbreviations:
  • “Dr.” → “Doctor”
  • “vs.” → “versus”
  • “etc.” → “et cetera”
Custom Pronunciation:
Use phonetic spelling in parentheses:
"CEO (see-ee-oh)" for letter-by-letter
"GIF (jiff)" for specific pronunciation

Credit Usage

Pricing Structure

Characters | Credits | Notes
100 | 1 | Standard rate
1,000 | 10 | ~1 minute of speech
5,000 | 50 | ~5 minutes of speech
10,000 | 100 | ~10 minutes of speech
The average speaking rate is 150-180 words per minute, which works out to approximately 900-1,100 characters per minute.
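To budget a script before you generate it, you can estimate credits and duration from the character count. This is a minimal sketch using the rate of 1 credit per 100 characters. Rounding partial blocks up, and counting every character including spaces, are assumptions. Check your account usage for the exact billing behavior.

```python
import math

CHARS_PER_CREDIT = 100   # from the pricing above
CHARS_PER_MINUTE = 1000  # midpoint of the 900-1,100 range quoted above


def estimate_credits(script, unlimited_plan=False):
    """Estimate credits for a script.

    Assumes partial blocks of 100 characters round up (not confirmed).
    Visionary and Studio plans include unlimited speech, so they cost 0.
    """
    if unlimited_plan or not script:
        return 0
    return math.ceil(len(script) / CHARS_PER_CREDIT)


def estimate_minutes(script, chars_per_minute=CHARS_PER_MINUTE):
    """Estimate narration length in minutes at normal speaking rate."""
    return len(script) / chars_per_minute


if __name__ == "__main__":
    script = "Welcome to VideoDraft. " * 50
    print(estimate_credits(script), "credits")
    print(round(estimate_minutes(script), 1), "minutes")
```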

Unlimited Plans

Visionary & Studio Plans Include:
  • ✅ Unlimited speech generation
  • ✅ All voice options
  • ✅ No character limits
  • ✅ Priority processing

Best Practices

Script Writing for TTS

Write for the ear, not the eye:
  • Use simple sentences
  • Avoid complex punctuation
  • Break up long thoughts
  • Use natural pauses
Example:
Instead of: "The product—which launched in 2023—has received numerous accolades."

Write: "The product launched in 2023. It has received numerous accolades."
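For longer scripts, a quick automated pass can point out sentences worth rewriting. The sketch below flags sentences that contain dashes or semicolons, or that run past a word limit. The 20-word threshold and the function name are assumptions, and the sentence splitting is deliberately simple.

```python
import re

MAX_WORDS = 20  # assumed threshold for a "long" sentence; tune to taste

# Em-dash (written as an escape) and semicolon. Parentheses are left out
# because they are used for pronunciation hints.
COMPLEX_PUNCTUATION = re.compile("[\u2014;]")


def flag_hard_to_speak(script, max_words=MAX_WORDS):
    """Return (sentence, reasons) pairs for sentences worth simplifying."""
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    flagged = []
    for sentence in sentences:
        if not sentence:
            continue
        reasons = []
        if len(sentence.split()) > max_words:
            reasons.append("long sentence")
        if COMPLEX_PUNCTUATION.search(sentence):
            reasons.append("complex punctuation")
        if reasons:
            flagged.append((sentence, reasons))
    return flagged


if __name__ == "__main__":
    draft = (
        "The product\u2014which launched in 2023\u2014has received numerous accolades. "
        "The product launched in 2023. It has received numerous accolades."
    )
    for sentence, reasons in flag_hard_to_speak(draft):
        print(reasons, "->", sentence)
```

Treat the output as a list of candidates to review by ear. Some flagged sentences will still sound fine.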

Voice Selection Guide

Choose Based on Content:

Corporate/Professional

  • Alex (Male): Authority
  • Maria (Female): Approachable
  • Formal scripts
  • Clear delivery

Educational

  • Maria (Female): Friendly teacher
  • Alex (Male): Expert instructor
  • Clear explanations
  • Engaging tone

Marketing

  • Either voice works
  • Match brand personality
  • Enthusiastic delivery
  • Call-to-action focus

Storytelling

  • Choose by character
  • Match narrative tone
  • Expressive delivery
  • Emotional range

Common Issues

Pronunciation Problems

Solutions for common TTS issues:
  1. Acronyms
    • Write phonetically: “NASA (nah-sah)”
    • Or spell out: “N-A-S-A”
  2. Technical Terms
    • Add pronunciation guides
    • Use simpler alternatives
    • Test before finalizing
  3. Foreign Words
    • Use phonetic spelling
    • Or provide translation
    • Test with target audience
  4. Numbers
    • Write as words for clarity
    • “1st” → “first”
    • “1,000” → “one thousand”
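If the same fixes come up in every script, you can apply them before pasting the text in. The sketch below rewrites a few abbreviations and ordinals from the examples in this guide and spells out acronyms letter by letter. The replacement tables are illustrative, not exhaustive, and the function names are hypothetical.

```python
import re

# Replacements mirror the examples in this guide. Extend them for your scripts.
ABBREVIATIONS = {"Dr.": "Doctor", "vs.": "versus", "etc.": "et cetera"}
ORDINALS = {"1st": "first", "2nd": "second", "3rd": "third"}


def normalize_for_speech(text):
    """Replace known abbreviations and ordinals with their spoken form.

    Note: an abbreviation that ends a sentence loses its period,
    so re-check punctuation afterwards.
    """
    for short, spoken in {**ABBREVIATIONS, **ORDINALS}.items():
        # Match only when the token is not glued to other word characters.
        pattern = r"(?<!\w)" + re.escape(short) + r"(?!\w)"
        text = re.sub(pattern, spoken, text)
    return text


def spell_out(acronym):
    """Turn an acronym into letter-by-letter form, for example N-A-S-A."""
    return "-".join(acronym.upper())


if __name__ == "__main__":
    print(normalize_for_speech("Dr. Lee vs. the 1st team"))
    print(spell_out("NASA"))
```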

Quality Optimization

For Best Results:
  • Proofread carefully
  • Test short segments first
  • Listen to full narration
  • Make adjustments as needed

Multi-Language Projects

Creating Versions

  1. Prepare Master Script
    • Write in primary language with clear structure
  2. Professional Translation
    • Ensure cultural adaptation, not just literal translation
  3. Select Native Voice
    • Choose voice that matches regional preferences
  4. Test and Refine
    • Have native speakers review the output

Localization Tips

  • Adjust speaking pace for language
  • Consider cultural context
  • Maintain consistent tone
  • Verify technical terms

Advanced Techniques

Emotional Delivery

Conveying Emotion Through Text:
  • Short sentences = urgency
  • Long sentences = calm
  • Questions = engagement
  • Exclamations = excitement

Character Voices

Creating Distinction:
  • Use different voices for characters
  • Adjust speaking rate
  • Vary sentence structure
  • Add personality through word choice

Background Integration

Mixing with Music:
  • Keep narration clear
  • Allow musical breaks
  • Sync to rhythm when appropriate
  • Balance audio levels

Future Features

Coming soon:
  • Voice cloning (custom voices)
  • Emotional voice variants
  • More language options
  • Real-time preview
  • Advanced SSML editor

Next Steps

Pro tip: Generate a 30-second test narration with your script to ensure the voice and pacing match your vision before creating the full video!
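One way to cut a test excerpt is to take whole sentences from the start of your script until you reach roughly 30 seconds of speech. The sketch below assumes about 1,000 characters per minute, the midpoint of the range quoted under Credit Usage. The function name is hypothetical and the sentence splitting is intentionally simple.

```python
import re

CHARS_PER_MINUTE = 1000  # assumed midpoint of 900-1,100 characters per minute


def preview_excerpt(script, seconds=30, chars_per_minute=CHARS_PER_MINUTE):
    """Return the opening sentences that fit in roughly `seconds` of speech.

    The first sentence is always kept, even if it is longer than the budget.
    """
    budget = int(chars_per_minute * seconds / 60)
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    excerpt = ""
    for sentence in sentences:
        candidate = (excerpt + " " + sentence).strip()
        if len(candidate) > budget and excerpt:
            break
        excerpt = candidate
    return excerpt


if __name__ == "__main__":
    script = " ".join(["This is one short sentence of narration."] * 30)
    sample = preview_excerpt(script)
    print(len(sample), "characters")
    print(sample)
```

At the standard rate, a 30-second excerpt of about 500 characters costs around 5 credits.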