6.3 – Controlling Emotion with Speech to Speech

Achieve more control and realism in AI-generated voices by using speech to speech, allowing you to convey emotion, pacing, and expressive performance in your movie scenes. Get started by watching the lesson video, where you’ll see this process in action.

What you'll learn

Use speech to speech tools to record and control how dialogue is delivered
Maintain your intended tone and pacing while transforming the voice
Apply an AI-generated voice that follows your exact inflection and emphasis
Capture subtle expressions such as hesitation, nervousness, or urgency
Prepare voice clips that easily sync with animated characters

Lesson Overview

Believable, emotionally rich dialogue for your movie characters takes more than clearly spoken words. Text to speech engines can sound natural, but they often struggle with passionate delivery, tense moments, or human quirks like stammering and awkward pauses. This lesson introduces the speech to speech feature in ElevenLabs (sometimes called a voice changer), which gives you direct control over the emotional content and delivery of your character’s lines.

Instead of typing text and relying on the AI to interpret emotion, you record the dialogue yourself, delivering it exactly as you want it, with the right tone, rhythm, and emphasis. The AI then transforms your performance into a different-sounding voice while keeping your nuances intact. This is especially useful for scenes that need real emotional expression, such as a character hesitating, sounding nervous, or speaking with pacing that is hard to script. The approach works whether you are polishing a single dramatic moment or aiming for a consistent vocal performance throughout a film, and it makes your AI-generated movies sound more convincing and human.

Who This Is For

If you want AI-generated voices in your movies to sound more expressive and tailored, this lesson is designed for you.

Filmmakers seeking more emotional range in AI dialogue
Animators working on character-driven scenes
Writers who want their dialogue to be performed with intention
Creators needing custom pacing or emphasis
Anyone who finds text to speech too flat or inconsistent for their project

Where This Fits in a Workflow

This lesson comes into play when you need more nuance in a character’s voice than text to speech alone can provide. After writing your script and preparing your scenes, you can use the speech to speech tool to bring lines to life—especially for moments that demand emotional complexity or specific timing.

For example, if a character delivers a nervous warning or shows hesitation, you’ll record the line in your own voice with the intended emotion, upload it, and select the desired AI voice. The resulting audio can then be used for precise lip syncing in your animation or scene. This approach ensures your movies have the exact performance you envisioned, rather than settling for generic delivery.
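
To plan lip sync, it helps to know how many animation frames a finished clip covers. The following is a minimal sketch, not part of the lesson's toolset: it assumes the clip has been exported as a WAV file and that the scene runs at 24 frames per second, and the function names `clip_duration_seconds` and `frames_needed` are hypothetical.

```python
import math
import wave


def clip_duration_seconds(path: str) -> float:
    """Return the length of a WAV clip in seconds."""
    with wave.open(path, "rb") as clip:
        # Total sample frames divided by the sample rate gives seconds.
        return clip.getnframes() / clip.getframerate()


def frames_needed(duration_seconds: float, fps: int = 24) -> int:
    """Return how many video frames fully cover a clip of this length.

    The default of 24 fps is an assumption; use your project's frame rate.
    """
    return math.ceil(duration_seconds * fps)
```

For instance, `frames_needed(clip_duration_seconds("warning_line.wav"))` would tell you how long the character's mouth animation has to run for that line, where `warning_line.wav` is a placeholder file name.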

Technical & Workflow Benefits

The traditional text to speech process often means trial and error—adjusting punctuation, guessing at prompts, and hoping the AI “gets” your emotional intent. With speech to speech, you cut out the guesswork by performing the line yourself. The AI retains your pace, pauses, and inflection, so nervous stammers, urgent warnings, or subtle changes in tone are preserved in the final output.

This method saves the time you would otherwise spend tweaking scripts to coax out a better delivery. It also lifts the expressive limits of text prompts, so your AI voices feel more authentic. In projects where character performance is key, such as dramatic dialogue or comedic timing, speech to speech can make the difference between an artificial-sounding voice and a believable character.

Practice Exercise

Try applying the speech to speech technique using a short scene from your own script or a practice scenario. For example, imagine a character urgently warning someone not to touch a mysterious object.

Record yourself delivering the line with nervous pauses or urgency (e.g., “Don’t, don’t touch it, please.”).
Upload your recording to the speech to speech tool and select an AI voice.
Generate the new voice clip and compare it with a text to speech version of the same line.
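
The lesson does these steps in the ElevenLabs web interface. If you would rather script them, the sketch below shows one possible shape for the upload step. Everything specific in it is an assumption to check against the current ElevenLabs API reference before use: the endpoint path, the `xi-api-key` header, the `audio` and `model_id` form fields, and the default model name. The function names are hypothetical, and the upload relies on the third-party `requests` package.

```python
from pathlib import Path

# Assumed values: verify against the current ElevenLabs API reference.
API_BASE = "https://api.elevenlabs.io/v1"
DEFAULT_MODEL = "eleven_english_sts_v2"


def build_sts_request(voice_id: str, api_key: str, model_id: str = DEFAULT_MODEL) -> dict:
    """Describe the HTTP request for one speech to speech conversion."""
    if not voice_id:
        raise ValueError("voice_id is required")
    return {
        "url": f"{API_BASE}/speech-to-speech/{voice_id}",
        "headers": {"xi-api-key": api_key},
        "data": {"model_id": model_id},
    }


def convert_recording(recording: Path, output: Path, voice_id: str, api_key: str) -> Path:
    """Upload your recorded performance and save the converted clip."""
    # Third-party dependency, imported here so build_sts_request needs nothing extra.
    import requests

    request = build_sts_request(voice_id, api_key)
    with recording.open("rb") as audio_file:
        response = requests.post(
            request["url"],
            headers=request["headers"],
            data=request["data"],
            files={"audio": audio_file},  # assumed form field name
            timeout=120,
        )
    response.raise_for_status()
    output.write_bytes(response.content)  # converted audio returned in the body
    return output
```

Keeping the request description separate from the upload lets you inspect the URL and fields without spending any API credits.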

Ask yourself: Did the speech to speech version better capture the emotion or pacing you wanted? Which method would you use for emotional scenes in your movie?

Course Context Recap

In the previous lesson, you learned to generate character voices with text to speech and may have noticed its limits in controlling expressiveness. This lesson introduced speech to speech as a way to get more emotional and realistic performances. Up next, you’ll see how to sync these expressive audio clips to on-screen characters for full animation results. Keep going to bring even more depth to your AI-powered movie production.
