Freesound Forums

Freesound in the era of generative Artificial Intelligence

Started June 7th, 2024 · 48 replies · Latest reply by JunkiEDM 7 months ago

Sadiquecat

3,414 sounds

461 posts

1 year, 7 months ago

Hello there,
A Freesound blog post about AI has been posted here https://blog.freesound.org/?p=2082

I invite you to read it there!

Bear in mind, as you read and form your opinion, the following challenges Freesound faces.
It is built as a research project, and it follows the Creative Commons licences.
With those pillars, how can Freesound do what is best for its community?

This is a sensitive topic for many of us. I am optimistic we all wish the best for our fellow humans; we just disagree on how to get there.

Disclaimers:
-I am a Freesound moderator. I have seen, and slightly participated in, behind-the-scenes discussions about AI. I am really grateful and respectful towards the team, seeing how serious and passionate yet respectful the discussions were. I am biased to respect the Freesound team.
-Posts, comments, and replies by me (Sadiquecat) are my thoughts and opinions as an individual. I do not represent Freesound! Nor do I speak on their behalf.

CC0 Be a hero.

frederic.font

748 sounds

503 posts

1 year, 7 months ago

Thanks for opening this thread @Sadiquecat, a discussion was initiated in the blog post (https://blog.freesound.org/?p=2082#comment-115389), but it might be good to continue it here in the forums so participation is easier for anyone interested.

frederic
the freesound team

strangely_gnarled

17 sounds

605 posts

1 year, 7 months ago

Just one headsworth of thought:-
I'm concerned that Freesound may be swamped with many hundreds of thousands of AI-gen uploads from, perhaps, just a small number of accounts, making it impossible to manage and moderate them.
Measures/restrictions might be needed which would negatively impact the 'traditional' audio-artist posters who are the flesh on the backbone of this wonderful site.

Unfortunately I do not have ideas or suggestions for workable solutions to stop a casket of gems being tossed and stirred into a mountain of (probably very worthy) artificial stones.

I'm sure this sounds elitist, but I assure you it's not supposed to.
I don't begrudge or wish to interfere with the opportunities AI promises.
Equally I don't want to trawl endless shelves in a hyper-market of mass production if I'm interested in artisan originals from the creative skills of the likes of a Qubodup, ERH, RHumphries, klankbeeld... (random sample - hundreds++ more..)

I've never used a freesound sample in the outside world, but I've learned, been fascinated and inspired by creative people, their thoughts, sounds, tricks and techniques.
Perhaps I'm an oddity. Certainly not a typical freesounder.

Wibby

Heaven in the sky is to die for, Heaven on earth is to live for.

Sadiquecat

3,414 sounds

461 posts

1 year, 7 months ago

You're definitely not alone there.

There's an "AI sound flooding" section in the blog post.

There's the idea of having a flag for AI content, and being able to filter it in/out of searches.

As far as moderation goes, there are ideas of having systems to automatically identify AI generated content. Ironically, using AI to identify AI to flag AI LOL.

The blog post also mentions:

"we don’t see the flooding problem as a very likely scenario, at least in quantities that would severely impact Freesound. We believe that, unlike other platforms in which users upload content and get economic rewards for its consumption, Freesound users have no incentives for uploading large amounts of content and abusing our terms. Furthermore, the relatively small size of Freesound plays in our favour: in case of need, we might be able to impose more strict rules for uploading sounds and manual checks which would allow us to avoid content flooding."

I half agree with this. While it is true that there's not much incentive to spam, I think the ease of making sounds will draw a larger, well-meaning public wanting to contribute stuff they made using AI. After all, it's easier to find an AI website than to go out and record something. So that ease will probably draw more people in, for better or worse.
Then will come a layer of quantity. For the same reasons, I'd expect the average Joe to generate 50 sounds instead of recording 3. In other words, one AI user could count as 5 normal users.
So I don't foresee a commercial spamming of AI content that would overwhelm Freesound, but I can imagine AI uploads being a substantial part of uploads (say over 40%) in 5 years or so.
Flooding wouldn't be an issue (at least for users) as long as it can be filtered out / flagged. But more uploads inevitably means more work for the mod team, which in turn means more time for tickets for human recorded/made sounds to be approved.

I may be wrong, I can also imagine people getting bored of the novelty of generative AI.
The "on demand" nature of AI generated content might not prompt people to share what they make as much as actual recorded or creatively made sounds, as anyone could generate a sound themselves instead of searching for it. I presume as time goes on, more and more people will get bored of AI generated stuff. I'm personally no longer impressed by AI images. I know its potential and what it looks like, to the point that I feel bored looking at an otherwise extraordinarily pleasant looking one. I may be an exception here, but I doubt I'll be alone in going "Ah AI generated stuff... Seen it before. Boring..." and wishing to return to cruder, more "human achievement" art. If there's a general feeling like this, I presume the making and sharing of AI generated content will steady at less prominent levels than most fear.

CC0 Be a hero.

timp666

0 sounds

1 post

1 year, 7 months ago

One thought I had with regard to this supposedly legit use is that some CC licenses require attribution for each and every use. That means attribution is required with every "output". All they have done so far is make a CSV list of what their dataset was trained on, which is only INPUT, and it's not accessible without signing up. It does not seem to uphold the requirements of attribution with its OUTPUT?

frederic.font

748 sounds

503 posts

1 year, 7 months ago

Hi timp666, I guess you are referring to the Stable Audio Open model, right? The CSV list is only accessible after sign up, and in our opinion it should be fully public. We already asked Stability AI to fix that.

When you refer to the Output you mean the sounds generated by the model, right? This is discussed in our blog post (https://blog.freesound.org/?p=2082), so I recommend checking it out if you have not done so. The summary is that the CC attribution requirement would only be applicable if the sounds generated by the model are considered derivative works or adaptations of sounds in the training set (think of it as if the sounds generated by the model redistribute in some way the sounds or parts of sounds from the training set), and this is not normally considered to be the case (this is something that AI models actively try to avoid).

frederic
the freesound team

j1987

246 sounds

1 post

1 year, 6 months ago

I think it's a good idea to have a checkbox on upload for marking a sound as AI generated. And adding a search function to filter them out if you don't want to download AI sounds.

I recently started getting into Stable Audio Open, where the majority of the dataset comes from Freesound, and what I've been doing is generating sound effects, and then mashing them up, filtering, and adding effects in Adobe Audition. I'd like to share these on Freesound, and having clear guidelines would go a long way.

As far as allowing bots to train on your material, I have no opinion since all my sounds are CC-0.

temnix

0 sounds

2 posts

1 year, 6 months ago

Hello. I don't normally respond to blog entries and don't care to participate on the forums here generally. However, your entry on generation of artificial sounds by algorithms alarmed me. Your position struck me as oblivious to the real danger of generated sounds: that recordings from rich reality will give way to inferior substitutes. It must be understood that "generative A. I." is always to be spelled in quotes; it has been more fairly called plagiarism software. It generates nothing but combs through the work of others and recombines it. But not giving credit is not the point here, recycling banality is. Reality from the microphone is abundant, strange, unpredictable, unfathomable, inspiring, ecstatic. Human ideas of what something ought to sound like are an echo in headspace. Letting recordings be displaced by rehashings, whether unlabeled or simply so massively common that real sounds will be pushed off to vanishing sidelines, is opening the door to simulacra, substituting margarine for butter.

We all know, for example, what typical videogame sounds are supposed to be like: what the sound of clanging armor is, the sound of a pained grunt, a leap, a fireball, a steam whistle, a shutting door, a zombie roar. Our familiarity is already a product of such games' serialization. Still, developers with good budgets, even when they engineer the sounds, are smart about adding extras, small touches, accents that those might include. There the individual memory and creativity come to help what would otherwise be a cliche. Developers with big budgets, like those behind Diablo IV, send out teams with large wooly mikes. The reason is simply that you can't outsmart the real thing. If generated sounds are allowed presence, we will be sliding to a place where sounds are mixed up only from previous sounds, and everything ends up sounding the same, but more fundamentally, fake. At that point real recordings may even begin to be distrusted for dissimilitude to the bell-curve triviality cycled and recycled by the software, just as real speech, with individual style and a good vocabulary, already gets denounced as fake by Internet mobs with the mental scope of a football fan.

Artificial sounds mixed by humans are already second-rate to live recordings; they are filler, though sometimes good filler. After all, not everything can be recorded. But letting "A. I." participate on equal terms will distort the field. The human uploaders who clicked a button and let the computer churn are not creators and must not be allowed to think of themselves as such. Laissez-faire will also make Freesound unusable to me personally. "A. I." sounds not should but MUST be labelled, at the least; however, the really wise and courageous approach would be to ban them preemptively altogether. When you say that "A. I." will be "an important part of our lives," you are surrendering to conformity with a dangerous and utterly unuseful technology whose only real purpose and intent, out there, is to take away people's jobs to cut employers' costs. We are not doomed to these fakes, however. We have a choice. I can speak for myself that this fakery will not be part of MY life in any capacity. Even the fact that you are discussing letting such sounds in throws Freesound in a great deal of doubt, as far as I am concerned. If I stop by in another six months and see "A. I." all over the place, I am not going to shrug and start incorporating these fakes into my creativity. I will simply stick with old sounds, record my own, head off to another place without them or do without sounds whenever I can.

klankbeeld

7,347 sounds

2,077 posts

1 year, 6 months ago

Thanks @temnix and others for the mail,

Nice to hear that AI cannot handle reality. YES, it's close.
I challenge AI to reproduce my 'river' sounds from the Netherlands 100% artificially without using the original files.
Moreover, my climate impact from making such a 1-hour sound recording is very small.
- 2 sandwiches
- 1 bottle of water
- 1 or 2 hours of cycling
that will lead to several grams of CO2

The emissions from producing 1 hour of audio in a data center will be many times higher.

And...

it's just a pale reflection of what I make in real life. Many artists and scientists gratefully use my files. And it will remain that way for centuries to come.....

The question is whether we let the heavy burden on Mother Earth and our grandchildren count more than our artificial product.

I hope my grandchildren will still say: 'My grandfather recorded that beautiful sound along the river in 2024 and I can still hear it.' And you and your children too.

I will continue to go out on my bike to make beautiful things for you......... I look at the modern world and continue stoically.

To hear, you first have to listen

kevp888

1,598 sounds

212 posts

1 year, 6 months ago

#10

@klankbeeld : Totally agree ! And moreover, I guess AI could never give me even 1% of the enjoyment and gratefulness I get when I have the chance to record unpredictable events, a beautiful birdsong, an impressive thunder strike, some cute kids singing, or this special atmosphere which will give all its flavour to the sound I share.

Wishing you all the best !

Kevin

Robinhood76

1,924 sounds

111 posts

1 year, 6 months ago

#11

AI will be an important milestone for us. It will have an impact on those who, like me, make a living creating sounds.
All the changes will strike us seriously and will sift us into those who love sound recording and those who do it for money. The first group will survive as real sound artists; the second one will be replaced by AI.
We have some time to prepare, because sounds are not a priority in AI and most people need very specific audio, but surely it will come.
I use AI when creating graphics for illustrating books and I see how fast it develops, but it's still hard to get the results that I want. It's always a compromise.
AI will divide creativity into two groups: cheap artificial utility for the masses and artistic value for elite users. The middle will shrink and disappear in the next 10-15 years.
So keep the joy in what you are doing. True passion will always have an audience.

sound addictive human being...

SieuAmThanh

1,701 sounds

41 posts

1 year, 5 months ago

#12

Many people forget that humans use works (art, pictures, music, books, movies, etc.) to train, practise, and learn skills. Even the corrupt copyright monopoly laws in many countries allow it. Training AI is the same idea.

I like AI; it is great for inspiring and helping me. But a big problem is that AI often uses the English language, which limits the types of sound it can create. It is hard to create sounds for things that belong to a different culture or language and have no English name, because the training data has no data from other cultures.

kevp888

1,598 sounds

212 posts

1 year, 5 months ago

#13

@SieuAmThanh: Deeply agree with you. Like many other things, AI probably suffers from globalisation.
And in that matter, I think and hope humans will always keep the edge !

Wishing you all the best !

Kevin

klankbeeld

7,347 sounds

2,077 posts

1 year, 5 months ago

#14

Hello freesounders.

I am following this discussion on AI with great pleasure.
The development on this subject in my old head (60+) is going faster than I can keep up with myself.
I have decided, despite all the arguments of possible abuse, to participate in the new world of AI. I am now completely open in it.
This week I received a request from a university in France to make a 24-hour city sound to train an AI product. It was a small effort. I have since made the file available to them. A PhD is going to work on this.

Freesounders, when the steam engine was introduced the world was also too small and a huge discussion ensued. Just read back on wiki.
I am all over after reading the viewpoint on the freesound blog ( https://blog.freesound.org/?p=2082 ). Have faith in the scientists in the universities of this world. No one can say now what the AI world will look like in two years.
I say: like me, make your beautiful sound recordings available with thorough descriptions, and you will contribute to the future of all our world.

ps. I have also adjusted my position regarding the energy consumption, in my earlier post, of AI; that too will be solved by the scientists in time. Welcome to 2024.

To hear, you first have to listen

kevp888

1,598 sounds

212 posts

1 year, 5 months ago

#15

@Klankbeeld : Yes, it’s good to keep optimistic. As AI mimics humans, it’s a great idea to give it the best from us to train !
May I ask which French university ?

Wishing you all the best !

Kevin

klankbeeld

7,347 sounds

2,077 posts

1 year, 5 months ago

#16

https://www.univ-gustave-eiffel.fr

To hear, you first have to listen

kevp888

1,598 sounds

212 posts

1 year, 5 months ago

#17

I see, thank you !

Kevin

josefe299A

0 sounds

1 post

1 year, 5 months ago

#18

I have used freesound.org sounds many times before for animation projects, and when I have been able to, I have donated because I appreciate the contributions the site and its users make in order to create a richer commons for culture and creativity.

I am concerned, like others, about not only allowing but embracing AI sounds in what is otherwise a rich, people-driven community.

Most of my personal work is CC0, CCBY or CCBYSA, and I strongly disagree with the interpretation that the output of generative models should not give credit to the sounds used to train them. If freesound.org truly stands by the CC licenses, this should be obvious from the get go. This is because training AI models does NOT function like a human using something as reference does. An AI model mashes together a number of things to create new ones, effectively creating a derivative of those works. Therefore, if the generative model does not give credit per output, it does not comply with the license.

And don't forget that CC is not the only game in town when it comes to licenses, nor are they right in every aspect of what they preach. Deferring to them when talking about taking a stance on AI-generated content is a fallacy, and feels like an easy way out action.

Someone else made a good point as well about the environmental impact of AI, even though they later recanted their opinion. I agree with their first statement, that the "generation" of a single sound will be way worse for the planet than the recording of a sound by a person would be otherwise.

I am also concerned about the flooding of sounds here. If I keep using the site and AI remains embraced as it is here, I will most definitely not use any sounds made in or after 2024, and I am very much not interested in donating money to an organization that so readily embraces the capitalistic push of AI by corporations that are interested in protecting their bottom lines and not the people who make art. I strongly recommend reading Cory Doctorow's article on this: https://pluralistic.net/2024/07/25/accountability-sinks/#work-harder-not-smarter

There's also the fact that there's real people being exploited to train these systems, not just stuff they find on the internet: https://briarpatchmagazine.com/articles/view/the-workers-ai-hides

Not to mention how these models are being trained by trampling over the denial of consent of multiple people: https://www.tomshardware.com/tech-industry/artificial-intelligence/several-ai-companies-said-to-be-ignoring-robots-dot-txt-exclusion-scraping-content-without-permission-report

This is not a rejection of technology per se; I agree that AI could be a tool that is helpful in some areas of life. But the way that it is being implemented now is not only soulless, but unethical, exploitative and overtly colonial. I reject the corporations pushing this on us in every aspect of our lives; I reject the companies firing workers in the hopes of "maximizing productivity" with technology; I reject the idea that feeding these monsters is good for the commons. The commons are the people; without people there's no commons.

And these companies are perpetrating an enclosure of our commons, yet again. I would have thought freesound.org would have been on the side of the people and not of the enclosures, and I hope that some further discussions change the way this goes.

Hopefully the bubble bursts soon and we can take a look at how we fix this mess.

Wishing you the best, and very concerned about this site,
-J.

synthoid000

0 sounds

1 post

1 year, 4 months ago

#19

One of my favorite topics, no longer confined to a "theoretical" world. Trying to "contain" AI in its ever evolving forms is like trying to hold water in a fishnet, its perpetual expansion into de facto ubiquity in countless aspects of human endeavor is inevitable, more like a co-evolutionary symbiosis. As a momentary sidenote, I recall having this very discussion with attorneys, social anthropologists and all sorts of other folks at the Virtual Humans conference . . . circa 1997.

Having said all that, I still cling to the notion, like a barnacle to a rock, of organically creating my sometimes "mysterious" sounds, often wafting out of the synth lab in the predawn hours (not sure about nearby humans, but various critters do come by to investigate). I even make it a point to identify on my released music tracks that my material is "AI free" content, although I will admit that the cover art graphics are often AI generated or enhanced.

This is what leads to my point here in this meandering missive, that being that the boundary between AI and human created content, whatever it might be, is becoming ever more diffuse, if not eventually irrelevant. Even if I organically create my content, at what point does some portion of it get fed into an AI sampling engine, and regurgitated into a slightly modified rendering of [fill in the blank] genre of sound sample, or part of a newly created piece of music?

The ability to detect, unravel and identify such content is itself becoming ever more challenging at best. Ironically, AI tools can now be used for exactly this purpose . . . separating out specific tracks, looking for specific features of interest, and so on. An argument can be made for identifying the AI status of current and future content releases . . . well, maybe.

As for sound and soundscape libraries, presets, "performance" files and so on for virtually every type of synth platform and DAW known to exist, these are all off the shelf products being offered by numerous artists and vendors. The "old" days of countless hours spent inventing the next really unique, newly discovered types of soundscapes and "sound matrices" (my fancy term for such) are fading away, almost like an ancient artform being kept alive by the relatively fanatical few, while almost anyone can buy a bundle of [fill in the blank] content ingredients and patch together the next big hit tune, or so the relentless heavy marketing keeps trying to claim.

At least in my odd parallel universe, I'm not really that interested in creating the next "big hit" for the top 40 list and all that, it's a bit more esoteric, that I have fun with, and sometimes gain a collection of Spotify listeners (always appreciated). As for the more serious contenders who really are trying to navigate the next "big hit" commercial waters, hoping to get noticed among the seemingly countless other artists with the same rap, hiphop, technorave or whatever genre modeled content, staying "organic" will likely not even be possible.

The modern production studio is heavily layered in highly sophisticated software, capable of enhancing, creating or "correcting" every conceivable aspect of actual music production. Real instruments do still exist (well, sometimes), with many of the artists and musicians who "play" their computers and synth platforms as unique forms of virtual instrumentation. Of course AI is getting filtered into all of this, it would be absurd to pretend it's not.

I'm constantly bombarded by colorful, loud blaring advertisements for the latest version of AI whatever to create, exactly as worded, the next hit tune, the next big hit bass line, vocal, drum tracks, etc., . . . the ads often lean towards providing quasi automated workflow, ability to grind out that next big hit tune in a few days or even hours . . . and then, of course, comes the just now starting to appear AI enhanced marketing and promo platforms so that your new AI created hit tune now becomes a huge commercial success with the AI promo engines . . .
see a pattern here, maybe?

Well, back to my relatively remote synth lab techno bunker studio nestled among the vineyards and orchards of northern Cal, I feel somewhat lucky to be more or less outside the AI / commercial hit tune music production box, but I do appreciate making note of all this, and others offering their thoughts in this bubbling cauldron of potential controversy, brewing at this very moment.

mrtd_

0 sounds

1 post

1 year, 4 months ago

#20

I have no experience with Freesound; I became interested, so I created an account, and the first thing I saw was the news post that sparked this forum thread. To be clear about my stance, I completely avoid the generative AI text/image/sound services like ChatGPT, Midjourney etc. and the companies behind them.

I wanted to comment specifically on these two notions:

- The "AI boom" being comparable to the "sampling boom" with regard to copyright
- Accepting AI created content because AI is "here to stay"

The generative AI models are trained on human created material, both non-copyrighted and copyrighted. These companies' goal was to cement their products as "here to stay" before the laws could "catch up". There would be no "AI boom" at this scale if they hadn't scraped basically the entire internet for vast amounts of material, with intentional disregard for legality and for the value of creative work. Generative AI devalues creative work both in its functionality (replacing it and reducing it to a button-click) and in how it was developed (theft).

That's why it's not comparable to the "sampling boom". It's also why I believe that it's naive to think that copyright has any value to these companies, and why I believe that being accepting towards generative AI, as someone who does creative work, is self-contradiction. If you have concerns about the damage generative AI will do to communities like this, what about what their creators already have done? If I embrace generative AI, I'm not just embracing a new tool/tech, I'm embracing products that were made by stealing from people like me.

Finally, completely anecdotal: I worked as a software developer for a global company (based in the EU) that purchased their own in-house instance of ChatGPT. During testing, they called a meeting about concerns, benefits, potential etc. There were some concerns, but a lot of praise and talk about benefits. There was also some "who's to say if ChatGPT isn't sentient?". It isn't, it's a language model... I asked what their thoughts were on how ChatGPT was trained on copyrighted material, and how OpenAI (ChatGPT creators) had/have ongoing lawsuits because of it. An attendee said "that's not [our company]'s problem". These companies don't care about us or our rights, and in the case of generative AI, their success is built on our existing work.
