Freesound Forums

AI voice training Scripts and techniques

Started April 10th, 2023 · 8 replies · Latest reply by Sadiquecat 3 months, 4 weeks ago

Sadiquecat

3,339 sounds

415 posts

2 years, 5 months ago
#1

Hello.

I'm looking for techniques, tips and scripts to read in order to train an AI text-to-speech voice.

I can't find much info other than "go on that website, throw audio at it and voila".

I'd love to have a more technical approach!

I want to record myself, and maybe some family members sort of to immortalise what we sound like.

So does anyone know if there's a script to read that covers all the different sounds, maybe covering the common ones a few times? Is saying each letter and making each "sound" a thing, or is it unnatural and noisy training data?
I'm sure reading books would be a great start, but sometimes people can be monotonous while reading, or exaggerate punctuation or articulation, so in the end would it be "natural"?
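
As a rough illustration of "covering the common ones a few times", here is a minimal Python sketch that counts how often some letters and letter pairs appear in a candidate script and reports the ones that are under-represented. Everything in it is an assumption for illustration: the `TARGETS` list and the `undercovered` helper are made up, and spelling is only a crude stand-in for actual speech sounds (a proper check would use a pronunciation dictionary).

```python
import re

# Hypothetical target list: a few common English digraphs plus the alphabet.
# This is a spelling-based proxy, not a real phoneme inventory.
TARGETS = ["th", "sh", "ch", "ng", "oo", "ee"] + list("abcdefghijklmnopqrstuvwxyz")

def undercovered(script: str, min_count: int = 3) -> dict:
    """Return the targets that occur fewer than min_count times in the script."""
    # Lowercase and keep only letters and spaces so punctuation is ignored.
    text = re.sub(r"[^a-z ]", "", script.lower())
    counts = {target: text.count(target) for target in TARGETS}
    return {target: n for target, n in counts.items() if n < min_count}

if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    # With min_count=1, this lists the targets the sentence never uses.
    print(undercovered(sample, min_count=1))
```

Running it over a longer reading script with a higher `min_count` would show which spellings need a few more sentences.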

I'd also presume something like this would use different specific words for an angry tone, a calm tone, quick, casual, etc...

Then there's probably the technical side of mic placement. Is a "close" mic preferable, or would it sound too "podcasty", and would something 30 cm or 1 m away be better? I presume recording at a few distances at once would be best.
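
For a feel of what those distances do to level, here is a small Python sketch based on the inverse-square law. It assumes an idealised point source in a free field, so it only describes the direct sound: in a real room the reflected sound does not drop off the same way, which is part of why a distant mic sounds "roomier". The function name `level_change_db` is made up for this example.

```python
import math

def level_change_db(ref_distance_m: float, new_distance_m: float) -> float:
    """Change in direct-sound level (dB) when moving the mic from ref to new distance.

    Assumes a point source in a free field (inverse-square law):
    each doubling of distance lowers the level by about 6 dB.
    """
    return -20.0 * math.log10(new_distance_m / ref_distance_m)

if __name__ == "__main__":
    # Compare a close position against the distances mentioned above.
    for distance in (0.30, 1.0):
        print(f"0.15 m -> {distance} m: {level_change_db(0.15, distance):.1f} dB")
```

So recording at several distances at once mostly trades direct level against room sound; the gain can be matched afterwards, the room character cannot.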

If you have resources to point to, or experience, or ideas, let me know!

Many thanks <3

CC0 Be a hero.
Appricot

0 sounds

2 posts

2 years, 4 months ago
#2

Hi

I'm not sure if you've tried Microsoft's Clipchamp. It looks like most of the functions you described are there. At least for voiceover it works well. No scripts, special prompts, API, etc. required. I've tried the text-to-speech and it was really good. You can use the embedded voices or train it with your own recordings. The settings aren't very rich yet, and tone/mood adjustments are hardly possible. At least I didn't try. Perhaps you'll find a way to do that too. ;) Still, the different speeds and voice levels are really good. Honestly, I didn't expect it could be so good. It's worth a try.

Sadiquecat

3,339 sounds

415 posts

2 years, 3 months ago
#3

Sorry for the delay, I didn't see any notification of your message ^^'
Thanks I appreciate it and will have a deeper look.
The "speaker coach" sounds interesting, didn't think a thing like that would exist.

I have found the thing I was looking for the other day : https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/record-custom-voice-samples

There's a rather in-depth explanation of the process, and scripts in a few languages!

Cheers!

CC0 Be a hero.
Appricot

0 sounds

2 posts

2 years, 3 months ago
#4

You've found really useful instructions. They look a bit sophisticated, but I assume they could result in a really good customized voice. I'll have to find time to check them out. Thanks a lot for sharing.

Sadiquecat

3,339 sounds

415 posts

3 months, 4 weeks ago
#5

Edit: This was a response to a now deleted message.

Excuse my scepticism, but is this a full AI bot, or is it an AI assisted response, or am I mistaken? x')

Why 15-20cm mic distance?

CC0 Be a hero.
lujainsameer

54 sounds

34 posts

3 months, 4 weeks ago
#6

I don't know, but I don't think you should be that far away from your mic, but also don't be too close to it.

orange juice :D
Sadiquecat

3,339 sounds

415 posts

3 months, 4 weeks ago
#7

Good to see you back Lua! :)

CC0 Be a hero.
© 2025 Universitat Pompeu Fabra