Voice samples being stolen for Voice Cloning


joanette24

Hey all, I was wondering how you feel when a potential customer specifically asks for a downloadable audio sample of your voice work. On my gig page I have my samples available for anyone to listen to, and for the most part that works great. However, every now and then I get asked to send an audio sample as a WAV or MP3 file. In one case a person kept asking me for an MP3 sample, but I found out by looking at their Fiverr page that they clone voice artists’ samples and then use the clones to sell text-to-speech gigs. After that I decided never to send samples again. Instead, I politely and professionally explain to buyers that, due to the dramatic rise of voice cloning and AI TTS software, I unfortunately cannot send downloadable samples, but that they are always welcome to listen to the samples on my gig page. Do you guys and girls feel that I am being fair in doing so?

Sounds reasonable enough to me.

Gig page audio samples should be more than enough to satisfy a real prospective client. Once someone starts asking for free samples or audio to download, it’s a sign that the “buyer” is likely a cheapskate or a fraud.

Asking if I’m capable of pulling off a specific type of voice is perfectly reasonable. Wanting a sample while they are window shopping among half a dozen other VO sellers is enough reason to make me say goodbye.

-Oh, and welcome to the forum!

Thanks for the reply, and thanks for the welcome! I completely agree. I just wonder how widespread this type of voice sample fraud might be. In the last month alone I’ve had 2 separate buyers very specifically ask for samples that are very obviously script lines used for AI voice cloning training. Are people aware of this type of fraud? The last thing I want to do is cause a panic. Or maybe I am just unlucky 😛

VO sellers who keep even a perfunctory eye on the industry are well aware of such programs and schemes.

Best policy is, no order, no voice. But that won’t help if you receive an actual order from someone wanting you to perform such content for a voice mimicking program…

Turning down such an order would require a cancellation, and those hurt a seller’s stats and visibility. A Catch-22 situation.

I have no problem sending my produced demos. These contain music and/or sound effects and audio processing that make them impossible to use for cloning purposes. At least, I believe they do.

If clients want raw audio where I say specific things, or custom auditions, they have to pay for it. If they clone my voice, they have actually used my voice to create an end product, meaning they are violating my rights unless they ordered a full buy-out.

I know many of my buyers are studios/producers etc. looking for samples to add to their library so that when customers contact them, they can present the demo to the client. This is very normal. Besides, they might not want to disclose to their client that they got the voice over on Fiverr, considering it will often be a lot cheaper here than on other platforms.

So all in all, I have no problem sending my demos, but they’re produced enough that cloning the voice for use with TTS software would be difficult, if not impossible.

Considering it would be super easy for anyone to copy the audio from your gig profile (all you need is Audacity to record the audio from your PC’s output), I don’t think you’re especially protected from this type of scam and theft, even if you refuse to send samples.

With that being said, if they are using your voice to create a clone for TTS without you agreeing to this, I’m sure it could be considered a violation of your rights, and I would actively pursue anyone doing this to the fullest extent of the law.

I would also make sure to cram their websites, social media profiles and anything else I could find with bad reviews and basically do what I could to let anyone who considers buying their shitty TTS software know that they are unethical thieves.

The big flaw with voice replication/cloning is that it is still terrible at replicating believable emotion. It’s eerily accurate at replicating straight, normal speech but is synthetic garbage whenever passion, anger, fear, humor, intensity, etc. are required.

Maybe they could record/train the AI with those separately, and then in the text-to-speech software (the text entry bit) they could specify which groups of words require that different type of speech, like how you bold certain text.

I don’t really care what hiccups/difficulties they face or how they might circumvent them.

It feels artificial. I don’t see it as a threat to more high-paying projects, but it’s absolutely a threat to the lower-tier market, especially when uneducated clients think they can reproduce the results created by a real voice over artist for next to nothing using TTS. That’s still years away, if it’s ever possible.

I can easily hear when TTS is used, even when it sounds very convincing, because there’s an intuitive sense in all of us that just “knows” when something isn’t right. It’s like looking at a high-quality 3D render of a room: it looks photo realistic, but there’s something that just feels a bit “off”; it doesn’t feel real, even though it looks real.

It could probably be beneficial to those who do proper voice overs too. They’d have the rights to use it with their own voice and could probably train it better (e.g. with more data) than other people, so it would probably sound better. As long as the buyer knew exactly what they were getting (and maybe could hear what it would be like), it could be okay. E.g. if there are days/times when a voice over person has lost their voice or something, they could enable the gig of theirs that uses that sort of tech. I assume it could generate the audio for a very big script faster too.

A VO seller who sells a blend of real and automated VO will likely just be lumped in with the robo-voice sellers. If I was a prospective buyer, I would definitely want to know if the seller was going to actually perform the script or just run it through a program.

That’s why I suggested “they could enable the gig of theirs that uses that sort of tech”. What I meant was they could have separate gigs: one or more that used their normal voice over service, and one where they could use this tech (e.g. when there’s a problem with their voice or they’ve lost their voice). The gig that used this type of tech would say so in the gig description/title so the buyer would know before ordering, and it could have samples of demos created with this tech for buyers to listen to.

That’s the issue. Deception in gig descriptions is so widespread and pervasive, there would definitely be sellers who’d lie about offering a real voice and just offer a robo-voice. In fact, that is already happening right now.

Absolutely. It’s best to be vigilant from the get-go and make sure you know what the warning signs are before accepting any gigs, or at least to add TTS purposes to a buy-out license. AI training tech is evolving at a scary pace, and there is already real-time voice cloning software that requires only a couple of seconds’ worth of audio if you are tech-savvy enough and have beefy hardware. If you are interested in hearing some high-quality AI voices, have a listen to Replica Studios’ or LOVO’s AI voices. They have some very, VERY convincing voices for a multitude of characters and emotions, specifically targeting the video game industry. They have of course been recorded in high-quality environments with many hours of AI training. But just as deepfakes have evolved to a point beyond the uncanny valley, so will AI voices reach that level as well. They already have on some levels. It’s fascinating to me, and I’m all for progress and the wonders of technology. I would just want people to be aware of it and take precautions to ensure their voices don’t get stolen, or at least give consent beforehand to avoid legal issues.

Archived

This topic is now archived and is closed to further replies.