Audio

Jason Downer

CopperCloudMusic

Sound Editor

Composer

For the past few years I've been developing an audio synthesis engine named Nemi. Sound is produced by non-linear feedback networks which have been evolved using genetic algorithms, as the equations these networks represent are too complex to craft by hand.

Even simple voices are astoundingly complex, and it is nearly impossible to determine what a given network will produce beforehand. Key state, note, and velocity can all dramatically alter the character of the sound.

Most networks create noise, or silence. Only after several generations of refinement do useable instruments emerge.

The Nemi sound engine can be driven by MIDI, TCP connections, or data files. Beowulf clustering works but is still kludgy at this point, and each machine must launch its own daemon beforehand.

Realtime audio is achieved by timing execution speed for each "K" or control cycle, then cutting (if necessary) the longest, non-sustaining voices. I'm getting good results with 440 K cycles per second, which works out to a little more than 2 ms of latency (1/440 s ≈ 2.27 ms). That is good enough for what I do.
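Here is a minimal sketch of that voice-cutting idea, in Python for readability. The `Voice` class, the `cull_voices` function, and the reading of "longest" as "has been sounding for the most control cycles" are assumptions made for illustration; none of this is Nemi's actual code.

```python
K_RATE = 440                 # control cycles per second, from the text
CYCLE_BUDGET = 1.0 / K_RATE  # seconds available per control cycle


class Voice:
    """Hypothetical voice record: a name, an age in K cycles, a sustain flag."""

    def __init__(self, name, age_cycles, sustaining):
        self.name = name
        self.age_cycles = age_cycles
        self.sustaining = sustaining


def cull_voices(voices, elapsed, budget=CYCLE_BUDGET):
    """Drop one voice if the last control cycle overran its time budget.

    Only non-sustaining voices are candidates, and the one that has been
    sounding longest is removed. Returns a new list.
    """
    if elapsed <= budget:
        return list(voices)
    candidates = [v for v in voices if not v.sustaining]
    if not candidates:
        return list(voices)
    oldest = max(candidates, key=lambda v: v.age_cycles)
    return [v for v in voices if v is not oldest]


voices = [
    Voice("pad", 900, True),
    Voice("pluck", 300, False),
    Voice("bell", 120, False),
]
latency_ms = CYCLE_BUDGET * 1000.0
on_time = cull_voices(voices, elapsed=0.001)   # within budget, nothing cut
overrun = cull_voices(voices, elapsed=0.003)   # over budget, one voice cut
print(round(latency_ms, 2), [v.name for v in overrun])
```

A real engine would measure `elapsed` with a high-resolution clock around each cycle and might cut more than one voice per overrun; the sketch removes a single voice to keep the logic visible.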

I've included a few clips in mp3 and ogg format to illustrate.

Evil Machine.ogg

Illuminati5.mp3

Midi6.mp3

For info on instrument modeling and waveguides visit Planet CCRMA.

A neat project I've been working on is Enscribe, which transforms color photographs into audio watermarks. Check it out.

Another interesting field is that of audio re-synthesis. A Fourier Transform converts data (be it audio, visual, or other type of data) into discrete frequencies. I ran into interesting problems while trying to create a noise removal filter...

JPEG image files use a sister of the Fourier Transform called the Discrete Cosine Transform, or DCT, to achieve high compression ratios. If the compression is set too high, the image starts to resemble a mosaic of blurry blocks.

The blur and the wavy lines around edges (the Gibbs phenomenon) are caused by a loss of low amplitude spectra. A similar effect can be produced with audio, creating a smeared sound image. If the smear is delayed it resembles reverb. This is exactly what a poor implementation of an MPEG encoder would produce.
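To see the ringing for yourself, here is a small sketch in plain Python. It takes a one-dimensional step edge, computes an unnormalized DCT-II, throws away every coefficient below a threshold, and reconstructs. This is a simplification chosen for illustration: real JPEG works on 8x8 two-dimensional blocks and quantizes coefficients rather than zeroing them outright, and the function names and the threshold of 1.0 are my own assumptions.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a list of samples."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
        for k in range(n)
    ]


def idct2(coeffs):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        coeffs[0] / n
        + (2.0 / n)
        * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5) * k) for k in range(1, n))
        for t in range(n)
    ]


def drop_small(coeffs, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


step = [0.0] * 8 + [1.0] * 8            # a sharp edge
coeffs = dct2(step)
exact = idct2(coeffs)                   # all coefficients kept
lossy = idct2(drop_small(coeffs, 1.0))  # low amplitude coefficients lost

for original, approx in zip(step, lossy):
    print(f"{original:4.1f} {approx:7.3f}")
```

With every coefficient kept the edge comes back intact. With the small ones dropped, the reconstruction overshoots above 1 and undershoots below 0 on either side of the edge, which is the one-dimensional version of those wavy lines.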

I've included some sample code to illustrate audio programming here. It shows the basics of opening an OSS audio device in full duplex with open and ioctl. There's also an example of using a POSIX timer, plus some simple FFT code. It tries to remove noise by parabolically cutting frequency spectra below a certain threshold in real time. Bump the cutoff threshold up a few orders of magnitude and you get those "fading" artifacts.
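The spectral cut itself can be sketched offline in a few lines of Python. This is not the linked program: it uses a slow O(n²) DFT instead of an FFT so that it needs nothing beyond the standard library, it works on one fixed block instead of a live audio stream, and I am assuming that "parabolically" means a gain of (magnitude / threshold)² for bins under the threshold. The names `parabolic_gate` and `THRESHOLD` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Inverse of dft."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def parabolic_gate(spec, threshold):
    """Leave loud bins alone; scale quiet bins by (magnitude / threshold)^2."""
    out = []
    for c in spec:
        mag = abs(c)
        gain = 1.0 if mag >= threshold else (mag / threshold) ** 2
        out.append(c * gain)
    return out


N = 64
THRESHOLD = 1.0
# A loud tone in bin 4 plus a faint "noise" tone in bin 10.
signal = [
    math.sin(2 * math.pi * 4 * t / N) + 0.01 * math.sin(2 * math.pi * 10 * t / N)
    for t in range(N)
]
spectrum = dft(signal)
cleaned_spec = parabolic_gate(spectrum, THRESHOLD)
cleaned = [v.real for v in idft(cleaned_spec)]  # input was real, so keep the real part

print(abs(spectrum[10]), abs(cleaned_spec[10]))
```

Raising `THRESHOLD` far enough that the wanted tone also falls under it starts eating the signal itself, which is where the "fading" artifacts come from.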

Some "code in process" that tries to remap frequencies (heterodyning) in realtime is here, but it's not pretty.
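For a rough idea of what remapping frequencies means here, the sketch below moves the energy in each DFT bin to a new bin chosen by a mapping function. It is an offline, single-block Python illustration and not the realtime code linked above. The `remap_bins` function and the choice to read "square root remapping" as bin k moving to bin round(sqrt(k)) are my assumptions; the mapping is passed in as a parameter so other curves can be tried.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spec):
    """Inverse of dft."""
    n = len(spec)
    return [
        sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def remap_bins(spec, mapping):
    """Move each positive-frequency bin k to bin round(mapping(k)).

    The mirrored negative-frequency bin gets the conjugate so the result
    still transforms back to a real signal. Bins mapped outside the
    range 1 .. n/2 - 1 are dropped, as is the Nyquist bin.
    """
    n = len(spec)
    half = n // 2
    out = [0j] * n
    out[0] = spec[0]  # keep the DC term where it is
    for k in range(1, half):
        target = int(round(mapping(k)))
        if 1 <= target < half:
            out[target] += spec[k]
            out[n - target] += spec[k].conjugate()
    return out


N = 64
tone = [math.sin(2 * math.pi * 16 * t / N) for t in range(N)]  # energy in bin 16
shifted = remap_bins(dft(tone), math.sqrt)                     # bin 16 -> bin 4
remapped = [v.real for v in idft(shifted)]

print(abs(shifted[4]), abs(shifted[16]))
```

Because several source bins can land on the same target bin, their contents are summed. A block-by-block version on real audio would also need windowing and overlap to avoid clicks at block boundaries.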

Here are some examples of the heterodyne work. Sorry, but they are low volume:

Piano with remapping from 100 bins above

Piano with sqrt of 2 remapping

Square root remapping - flamenco guitar

Square root regular guitar

Square root horns and strings

Cheesy synth to the 3.1 power