Tuesday, March 12, 2013

just how bad is soundcloud's "automatic content protection system"?

as many of you know, one of my music processes is a form of sonification (or audification) where i trick my computer into opening various data files as if they were sound files, and then i make music out of the resulting sounds. (i wrote a tutorial about it here.)
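for anyone curious what "opening a data file as sound" can look like in code, here is a minimal sketch in python. it is not the exact method from my tutorial, just one way to get the same effect: it assumes the bytes are treated as unsigned 8-bit mono samples at 44100 Hz, and the function name `sonify_bytes` is made up for this example.

```python
import wave


def sonify_bytes(data: bytes, out_path: str, sample_rate: int = 44100) -> int:
    """Write arbitrary bytes to a WAV file as unsigned 8-bit mono PCM.

    Assumption: one byte = one sample. Other sample widths, channel counts,
    or rates would make the same data sound very different.
    Returns the number of frames written.
    """
    with wave.open(out_path, "wb") as wav:
        wav.setnchannels(1)         # mono
        wav.setsampwidth(1)         # 1 byte per sample (8-bit)
        wav.setframerate(sample_rate)
        wav.writeframes(data)       # the raw data becomes the waveform
    return len(data)
```

you would call it with the contents of any non-audio file, for instance `sonify_bytes(open("some_file.bin", "rb").read(), "out.wav")`, where both file names are placeholders.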

a while back i posted an old song called "denial of service" to soundcloud because i was thinking about including it on my recently-released album on the DLL. the song is what you might call "pure" data sound, in that every sound in the song is made of this sonified data. so imagine my surprise when i got an email from soundcloud that they'd taken the song down because they thought i had sampled some house song.

i repeat: every sound in my song is taken from sonified data. no drum machines, drum samples, or any other instruments were used. but somehow my song still triggered an automatic takedown for uncleared sampling. i didn't even run any audio effects on the samples—they are all clean.

want to hear for yourself? here is my song (warning: fairly abrasive). and here is the song they think i sampled (warning: fairly mediocre). no human who listened to these two recordings would think they had anything in common.

as a glitch artist/musician, i find this fascinating. what could this algorithm be hearing in my song to make it think i sampled some house track? in terms of tone, timbre, tempo, and meter, these two recordings could hardly be more dissimilar. but to a bot, they apparently sound alike. somewhere in there, indiscernible to mere human hearing, the two must have some tonal sweep or polyrhythm in common. or some checksum came up with the same result by chance. it's hard to even speculate about how this may have happened without knowing more about how their bot works.
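to be clear, i have no idea how soundcloud's matcher actually works, and the following is not it. it is only a toy sketch of the general problem: any fingerprint that throws away most of the signal can give identical results for recordings that sound nothing alike. the made-up `coarse_fingerprint` below just counts sign changes per window, so a smooth sine wave and a buzzy square wave at the same pitch come out looking the same.

```python
import math


def coarse_fingerprint(samples, window=100):
    """Toy fingerprint: count sign changes inside each window of samples."""
    prints = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        crossings = sum(
            1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0)
        )
        prints.append(crossings)
    return prints


def sine(freq, rate, n):
    # half-sample offset so no sample lands exactly on zero
    return [math.sin(2 * math.pi * freq * (i + 0.5) / rate) for i in range(n)]


def square(freq, rate, n):
    # same pitch as the sine, but a much harsher timbre
    return [1.0 if s >= 0 else -1.0 for s in sine(freq, rate, n)]


smooth = sine(50, 1000, 400)
harsh = square(50, 1000, 400)

# different waveforms, identical toy fingerprints
print(smooth != harsh)
print(coarse_fingerprint(smooth) == coarse_fingerprint(harsh))
```

a real system is surely far more sophisticated than this, but the same kind of collision is at least one plausible way a false match could happen.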

but it's also frustrating that my song has been pulled down based on such a laughably false accusation. and although i'm certain i will win the dispute, it's annoying to have to send soundcloud my name, address, and phone number, as well as tick seven checkboxes threatening to sue my ass off if i'm lying.

of course, there's no way for me to take soundcloud to court for wasting my time or for issuing bogus threats. all this because of a poorly programmed bot. in a way i'm lucky that my song obviously doesn't contain any "musical" samples of any sort. what would happen if i had made some housey track that just happened to use the same drum hits, or just sounded similar?

update: within a couple hours of filing a dispute, i received the following message:

thank you for providing feedback in regards of the upload:
stAllio! - denial of service

This notification is to inform you that your upload has been released to your account.
Thanks,
The SoundCloud team

so although it's reassuring that these disputes are reviewed quickly, it's alternately fascinating and horrifying that such a thing could happen in the first place. and i must wonder what would've happened if my case hadn't been so open-and-shut.