Automatic Generation of Sound Synthesis Techniques
Research carried out at the MIT Media Lab. Master of Science in Media Arts & Sciences
Digital sound synthesizers, ubiquitous today in sound cards, software and dedicated hardware, use algorithms (Sound Synthesis Techniques, SSTs) capable of generating sounds similar to those of acoustic instruments, as well as entirely novel sounds. Designing SSTs is a very hard problem. It is usually assumed that human ingenuity is required to design an algorithm suitable for synthesizing a sound with certain characteristics. Many commonly used SSTs are the fruit of experimentation and a long refinement process. An SST is determined by its “functional form” and its “internal parameters”. SSTs are usually designed by selecting a fixed functional form from a handful of commonly used SSTs and then applying a parameter estimation technique to find the set of internal parameters that best emulates the target sound.
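The conventional workflow described above (fix the functional form, then estimate its internal parameters) can be illustrated with a minimal sketch. It assumes a simple two-operator FM formula as the fixed functional form, a mean squared error as the distance to the target, and an exhaustive search over a small grid of candidate modulation indices. The function names, sample rate and grid are illustrative assumptions, not taken from the research itself.

```python
import math

SAMPLE_RATE = 8000  # Hz, illustrative value
N_SAMPLES = 256     # number of samples compared

def fm_tone(carrier_hz, mod_hz, index, n=N_SAMPLES, sr=SAMPLE_RATE):
    """Fixed functional form (assumed here): simple two-operator FM synthesis."""
    return [
        math.sin(2 * math.pi * carrier_hz * t / sr
                 + index * math.sin(2 * math.pi * mod_hz * t / sr))
        for t in range(n)
    ]

def mse(a, b):
    """Mean squared error between two equally long sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def estimate_index(target, carrier_hz, mod_hz, candidates):
    """Parameter estimation by grid search: keep the candidate modulation
    index whose output is closest to the target sound."""
    return min(candidates,
               key=lambda i: mse(fm_tone(carrier_hz, mod_hz, i), target))

# Target produced by the same functional form with a "hidden" index of 2.5.
target = fm_tone(440.0, 110.0, 2.5)
candidates = [k * 0.25 for k in range(21)]  # 0.0, 0.25, ..., 5.0
best_index = estimate_index(target, 440.0, 110.0, candidates)
print("estimated modulation index:", best_index)
```

In this sketch only one parameter is searched and the functional form never changes, which is exactly the limitation the approach in the next paragraph addresses.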
A new approach for automating the design of SSTs is proposed. It uses a set of examples of the desired behavior of the SST in the form of “inputs + target sound”. The approach is capable of suggesting novel functional forms and their internal parameters, suited to follow closely the given examples.
Designing an SST is stated as a search problem in the SST space (the space spanned by all possible valid functional forms and internal parameters, within certain limits that keep the search practical). The search is performed using evolutionary methods, specifically Genetic Programming (GP). A custom language for representing and manipulating SSTs as topology graphs and expression trees is proposed, together with the mapping rules between the two representations. Fitness functions that use analytical and perceptual distance metrics between the target and the produced sounds are discussed.
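To make the idea of searching over functional forms concrete, here is a small, heavily simplified sketch. It assumes expression trees stored as nested tuples, a tiny operator set (`sin`, `add`, `mul`), a purely analytical fitness (mean squared error against a target tone) and a mutation-only evolutionary loop with elitism. None of this is the custom language or the AGeSS implementation; crossover, topology graphs and perceptual metrics are omitted, and every name below is hypothetical.

```python
import math
import random

# A tree is a float constant, the leaf "t" (time in seconds),
# or a tuple: ("sin", child), ("add", a, b), ("mul", a, b).
UNARY = {"sin": math.sin}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def evaluate(tree, t):
    """Evaluate an expression tree at time t."""
    if isinstance(tree, float):
        return tree
    if tree == "t":
        return t
    if tree[0] in UNARY:
        return UNARY[tree[0]](evaluate(tree[1], t))
    return BINARY[tree[0]](evaluate(tree[1], t), evaluate(tree[2], t))

def random_tree(rng, depth):
    """Grow a random tree no deeper than `depth`."""
    if depth == 0 or rng.random() < 0.3:
        return "t" if rng.random() < 0.5 else rng.uniform(-10.0, 10.0)
    op = rng.choice(["sin", "add", "mul"])
    if op in UNARY:
        return (op, random_tree(rng, depth - 1))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def mutate(tree, rng, depth):
    """Replace one randomly chosen subtree, keeping the depth bounded."""
    if not isinstance(tree, tuple) or rng.random() < 0.2:
        return random_tree(rng, depth)
    children = list(tree[1:])
    i = rng.randrange(len(children))
    children[i] = mutate(children[i], rng, depth - 1)
    return (tree[0], *children)

TIMES = [n / 64 for n in range(64)]
TARGET = [math.sin(2 * math.pi * 3 * t) for t in TIMES]  # example target sound

def fitness(tree):
    """Analytical distance to the target (lower is better)."""
    return sum((evaluate(tree, t) - y) ** 2
               for t, y in zip(TIMES, TARGET)) / len(TIMES)

def evolve(seed=0, pop_size=60, generations=30, max_depth=4):
    """Mutation-only evolution; the best quarter survives unchanged."""
    rng = random.Random(seed)
    population = [random_tree(rng, max_depth) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), rng, max_depth)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=fitness)
    return best, history + [fitness(best)]

best_tree, history = evolve()
print("best tree:", best_tree)
print("error, first vs. last generation:", history[0], history[-1])
```

Because the survivors are copied unchanged into the next generation, the best error never gets worse from one generation to the next; how close the search gets to the target depends on the seed and on the operator set.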
The AGeSS system (Automatic Generation of Sound Synthesizers), developed at the Media Lab, is outlined, and some SSTs and their evolution are shown.
Digital Watermarking of Audio Signals Using a Psychoacoustic Auditory Model and Spread Spectrum Theory
Research carried out at the University of Miami. Master of Science in Music Engineering
A new algorithm for embedding a digital watermark into an audio signal is proposed. It uses spread spectrum theory to generate a watermark resistant to different removal attempts and a psychoacoustic auditory model to shape and embed the watermark into the audio signal while retaining the signal’s perceptual quality. Recovery is performed without knowledge of the original audio signal. A software system is implemented and tested for perceptual transparency and data-recovery performance.
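The spread-spectrum part of this idea can be sketched in a few lines. The example below assumes a key-seeded pseudo-random sequence of ±1 chips, one data bit spread over a block of chips, and blind detection by correlating the received signal with the same sequence, so the original audio is not needed for recovery. The constant gain `alpha` is only a stand-in for the psychoacoustic shaping, which is not modeled here; all names and values are illustrative assumptions rather than the algorithm from the thesis.

```python
import math
import random

def pn_sequence(key, length):
    """Key-seeded pseudo-random sequence of +1/-1 chips."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(host, bits, key, chips_per_bit, alpha):
    """Add each bit, spread over `chips_per_bit` samples, to the host signal.
    `alpha` is a fixed gain (assumption); a psychoacoustic model would
    instead shape the watermark so that it stays below the masking threshold."""
    pn = pn_sequence(key, len(bits) * chips_per_bit)
    marked = list(host)
    for i, chip in enumerate(pn):
        sign = 1.0 if bits[i // chips_per_bit] else -1.0
        marked[i] += alpha * sign * chip
    return marked

def detect(signal, n_bits, key, chips_per_bit):
    """Blind detection: correlate each block with the PN sequence and
    read the bit from the sign of the correlation."""
    pn = pn_sequence(key, n_bits * chips_per_bit)
    bits = []
    for b in range(n_bits):
        start = b * chips_per_bit
        corr = sum(signal[start + j] * pn[start + j]
                   for j in range(chips_per_bit))
        bits.append(1 if corr > 0 else 0)
    return bits

CHIPS_PER_BIT = 4096
payload = [1, 0, 1, 1, 0, 0, 1, 0]
# Host audio: a 440 Hz tone at an 8 kHz sample rate (illustrative).
host = [0.5 * math.sin(2 * math.pi * 440 * n / 8000)
        for n in range(len(payload) * CHIPS_PER_BIT)]

marked = embed(host, payload, key=1234, chips_per_bit=CHIPS_PER_BIT, alpha=0.05)
recovered = detect(marked, len(payload), key=1234, chips_per_bit=CHIPS_PER_BIT)
print("recovered bits:", recovered)
```

Spreading each bit over many chips is what lets the correlation rise above the interference from the host audio even though the watermark amplitude is small compared with the signal.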
These projects were done at MIT Media Lab between 1999 and 2001