I went on a tear the other day, after finding out what wavetable synthesis actually does – I’d always vaguely assumed it was just another kind of sample playback, and wasn’t interested. A few hours of Python later, I have a little program that goes mining through my music library, hunting for interesting waveforms; after a little normalization and some spectral morphing, out come a bunch of wavetables, like you’d use in Serum or Massive.
The machine proceeds to use its wavetables, in a barely controlled sort of let’s-just-see-what-happens way, to generate anywhere from a few seconds to a few minutes of… well… let’s call it “sound”. It’s terrible, most of the time, admittedly – but it is a surprising kind of sound, too, and worth the exploration, for the compelling bits of alien music that sometimes come blurting out of that weird robot’s brain; dark throbbing rhythmic gritty off-kilter stuff that I would never have imagined making on purpose.
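For the curious, here is roughly what the wavetable part boils down to. This is a minimal sketch, not the actual program: the names `normalize`, `morph`, and `render` are made up for illustration, the source waveforms are a plain sine and saw rather than anything mined from a music library, and the morph is a simple time-domain crossfade standing in for proper spectral morphing.

```python
import math

def normalize(cycle):
    """Remove DC offset and scale a single-cycle waveform to peak amplitude 1.0."""
    mean = sum(cycle) / len(cycle)
    centered = [s - mean for s in cycle]
    peak = max(abs(s) for s in centered)
    if peak == 0.0:
        return centered  # silent cycle: nothing to scale
    return [s / peak for s in centered]

def morph(table_a, table_b, steps):
    """Build `steps` frames that crossfade linearly from table_a to table_b.

    Assumption: a time-domain crossfade, used here as a simpler stand-in
    for spectral morphing.
    """
    frames = []
    for i in range(steps):
        t = i / (steps - 1) if steps > 1 else 0.0
        frames.append([(1 - t) * a + t * b for a, b in zip(table_a, table_b)])
    return frames

def render(frames, freq, seconds, sample_rate=44100):
    """Read the table with a phase accumulator, sweeping through the frames over time."""
    size = len(frames[0])
    n = int(seconds * sample_rate)
    out = []
    phase = 0.0  # position within one cycle, in [0, 1)
    for i in range(n):
        # Pick the frame for this moment; the sweep runs once over the whole render.
        frame = frames[min(int(i / n * len(frames)), len(frames) - 1)]
        pos = phase * size
        i0 = int(pos) % size
        i1 = (i0 + 1) % size
        frac = pos - int(pos)
        # Linear interpolation between neighbouring table entries.
        out.append((1 - frac) * frame[i0] + frac * frame[i1])
        phase = (phase + freq / sample_rate) % 1.0
    return out

SIZE = 256
sine = normalize([math.sin(2 * math.pi * k / SIZE) for k in range(SIZE)])
saw = normalize([2 * k / SIZE - 1 for k in range(SIZE)])
frames = morph(sine, saw, steps=8)
audio = render(frames, freq=110.0, seconds=0.5)
```

The difference from plain sample playback is in `render`: the table holds a single cycle, pitch comes from how fast the phase accumulator steps through it, and the timbre changes as playback moves from frame to frame.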
I had a few ideas for ways I might try to control this novel instrument – following the envelope or pitch of one sound to drive the generation of another, perhaps – so I went to look at the state of the music-analysis world, imagining I’d find some envelope-follower algorithm and do some stuff with FFTs.
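The envelope follower I had in mind is about as simple as these things get. The sketch below is one common textbook form (rectify the signal, then smooth it with separate attack and release time constants); the function name, the default times, and the test tone are all my own assumptions, picked only to have something to run.

```python
import math

def envelope_follower(samples, sample_rate=44100, attack_ms=5.0, release_ms=100.0):
    """Track the loudness contour of a signal with a one-pole attack/release smoother."""
    # Per-sample smoothing coefficients derived from the time constants.
    attack = math.exp(-1.0 / (sample_rate * attack_ms / 1000.0))
    release = math.exp(-1.0 / (sample_rate * release_ms / 1000.0))
    env = 0.0
    out = []
    for s in samples:
        level = abs(s)  # full-wave rectification
        # Rise quickly when the signal is above the envelope, fall slowly otherwise.
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level
        out.append(env)
    return out

# Demo: a 220 Hz tone that sounds for 0.25 s, followed by 0.25 s of silence.
sr = 44100
burst = [math.sin(2 * math.pi * 220 * n / sr) if n < sr // 4 else 0.0
         for n in range(sr // 2)]
env = envelope_follower(burst, sr)
```

The resulting `env` list is a slow-moving control signal, which is the sort of thing you could use to steer another sound's volume or wavetable position.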
Well, sure, tempo sync is old hat in the DJ world by now, and I’ve even gotten used to effortless key-matching, but the tools you can use now, and the kinds of information they can measure about music, are far beyond what I had imagined.
aubio, for example, is a C library with Python bindings, and an unusually easy-to-use command line tool, which can find beats, follow the tempo, distinguish notes, follow the pitch of those notes, and also break the track up into its louder and quieter sections.
pyAudioAnalysis builds from there, offering a generic toolkit for extracting, classifying, and measuring features in sound; you can use this library to train a custom segmentation engine, which will recognize and extract any particular pieces of an audio recording you might happen to be interested in.
Essentia, though, is the box of shiny toys that really made my head explode. Beat tracking, segmentation, feature classification, sure, we’ve seen that now; but how about… melody extraction? Mood detection? The list is amazing: dynamic complexity, spectral energy, dissonance, spectral complexity, beat loudness, and danceability!?
You’re not stuck on the command line, either – Sonic Visualiser will accept Essentia as a plug-in, along with a whole lot of others, and then there’s this Pure Data toolkit…