Tuesday, March 12, 2013

just how bad is soundcloud's "automatic content protection system"?

as many of you know, one of my music processes is a form of sonification (or audification) where i trick my computer into opening various data files as if they were sound files, and then i make music out of the resulting sounds. (i wrote a tutorial about it here.)
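for anyone curious what "opening data as sound" means in practice, here is a minimal sketch of the general idea in python. it is not the exact workflow behind the song: the function name `sonify`, the 8-bit mono format, and the 44100 hz sample rate are all assumptions chosen for illustration. every byte of the input is simply treated as one audio sample.

```python
import io
import wave


def sonify(data: bytes, out, sample_rate: int = 44100) -> int:
    """Write arbitrary bytes as 8-bit unsigned mono PCM in a WAV container.

    `out` may be a file path or a seekable binary file object.
    Returns the number of audio frames written (one frame per input byte).
    The format choices here are illustrative assumptions, not a standard.
    """
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(1)          # 1 byte per sample (unsigned 8-bit)
        wav.setframerate(sample_rate)
        wav.writeframes(data)        # the raw data becomes the waveform
    return len(data)


if __name__ == "__main__":
    # Hypothetical usage: any bytes will do, e.g. the contents of a file
    # read with open(path, "rb").read(). An in-memory buffer is used here.
    buf = io.BytesIO()
    frames = sonify(bytes(range(256)) * 100, buf)
    print(frames, "frames,", frames / 44100, "seconds at 44100 hz")
```

changing the sample width, channel count, or sample rate changes how the same bytes sound, which is a large part of the fun.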

a while back i posted an old song called "denial of service" to soundcloud because i was thinking about including it on my recently-released album on the DLL. the song is what you might call "pure" data sound, in that every sound in the song is made of this sonified data. so imagine my surprise when i got an email from soundcloud that they'd taken the song down because they thought i had sampled some house song:

i repeat: every sound in my song is taken from sonified data. no drum machines, drum samples, or any other instruments were used. but somehow my song still triggered an automatic takedown for uncleared sampling. i didn't even run any audio effects on the samples—they are all clean.

want to hear for yourself? here is my song (warning: fairly abrasive). and here is the song they think i sampled (warning: fairly mediocre). no human who listened to these two recordings would think they had anything in common.

as a glitch artist/musician, i find this fascinating. what could this algorithm be hearing in my song to make it think i sampled some house track? in terms of tone, timbre, tempo, and meter, these two recordings could hardly be more dissimilar. but to a bot, they apparently sound alike. somewhere in there, indiscernible to mere human hearing, the two must have some tonal sweep or polyrhythm in common. or some checksum came up with the same result by chance. it's hard to even speculate about how this may have happened without knowing more about how their bot works.
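to make the "same checksum by chance" idea concrete, here is a toy sketch in python. nothing in it reflects how soundcloud's system actually works, which is unknown; the names `tiny_checksum` and `find_collision` and the 16-bit checksum size are assumptions for illustration. it only shows that when a fingerprint is short, two completely unrelated inputs can end up with the same value after surprisingly few tries.

```python
import hashlib


def tiny_checksum(data: bytes, bits: int = 16) -> int:
    """Return a deliberately short checksum: the top `bits` bits of SHA-256.

    `bits` must be between 1 and 32. A short checksum is used on purpose
    so that chance collisions are easy to find.
    """
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - bits)


def find_collision(bits: int = 16):
    """Hash distinct made-up inputs until two share a checksum.

    Returns (first_input, second_input, shared_checksum). By the pigeonhole
    principle this stops within 2**bits + 1 inputs; by the birthday bound it
    usually stops after roughly 2**(bits / 2) inputs.
    """
    seen = {}
    i = 0
    while True:
        data = f"track-{i}".encode()   # hypothetical stand-in for a recording
        checksum = tiny_checksum(data, bits)
        if checksum in seen:
            return seen[checksum], data, checksum
        seen[checksum] = data
        i += 1


if __name__ == "__main__":
    first, second, checksum = find_collision()
    print(first, "and", second, "share checksum", checksum)
```

a full-length cryptographic hash makes this kind of accident astronomically unlikely, so if a chance match really was the cause, the fingerprint being compared would have to be much coarser than that.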

but it's also frustrating that my song has been pulled down based on such a laughably false accusation. and although i'm certain i will win the dispute, it's annoying to have to send soundcloud my name, address, and phone number, as well as tick seven checkboxes threatening to sue my ass off if i'm lying:

of course, there's no way for me to take soundcloud to court for wasting my time or for issuing bogus threats. all this because of a poorly programmed bot. in a way i'm lucky that my song obviously doesn't contain any "musical" samples of any sort. what would happen if i had made some housey track that just happened to use the same drum hits, or just sounded similar?

update: within a couple hours of filing a dispute, i received the following message:

thank you for providing feedback in regards of the upload:
stAllio! - denial of service

This notification is to inform you that your upload has been released to your account.
Thanks,
The SoundCloud team

so although it's reassuring that these disputes are reviewed quickly, it's alternately fascinating and horrifying that such a thing could happen in the first place. and i must wonder what would've happened if my case hadn't been so open-and-shut.

7 comments:

meatsock said...

i got nabbed by the same system for starting my show with a song from the Kopyright Liberation Front. fwiw either pitch shifting by more than 3% or layering different tracks seems to throw it off, but we shouldn't HAVE to

stAllio! said...

that's interesting... they actually nabbed you for sampling the KLF, and not for sampling whoever the KLF may have sampled? if so then some copyright troll must have acquired the rights to the KLF material... unless the KLF is now sending out takedown notices as a media prank.

Eric Honour said...

Given your description, my guess would be that in addition to actual "does it sound like that" they also calculate a hash value and compare it against other stuff they know of, and you are a victim of a hash collision: http://en.wikipedia.org/wiki/Hash_collision

The Fizz said...

I'm wondering if you tried using notepad++ (not notepad) instead of wordpad. It seems to be much more flexible in opening and saving "unsupported" file types without breaking them. It's also much more manageable for undoing things, in my opinion... Once you've saved a file once, it saves it the same way each time, which makes it easy to see what you did to the file. And even if you do manage to break it, you can simply undo the change and save, and your file is back. It also has quite a few editing features I like to play with to get different effects.

stAllio! said...

fizz, the fact that wordpad is so destructive is precisely why i use it for glitch art.

i haven't tried notepad++; i'm pretty comfortable with using a hex editor so it seems like notepad++ would be a step down.

Iris said...

This is cool!

Robert F. Crocker said...

You can get instant popularity and more exposure for your music track and other future audio projects. More people will come to listen to your music and admire you.
Yes! We assured you that the followers we provide for your buy real soundcloud plays account are permanent. They will never drop in future.