Sonic Texture Recognition/Classification (2008-12)


To work with that element of sound we might call its “texture”, and to create work that is itself “textural”, is a very contemporary idea. I have focused on texture as an aesthetic interest, as a perceptual phenomenon and as a behavior of sound signals. I am keenly interested in ways to extract, sense, transform or synthesize textural elements, and in how to create machines that can identify, parse and re-define textural qualities. My work in developing creative, listening and reacting machines has largely centered around this aesthetic concern, which has most recently led me to explore nonlinear time/frequency analysis techniques, something that finds its way into other software projects including FILTER and GREIS. This “experimental signals and systems” work is built around empirical mode decomposition and machine listening/learning built upon a dynamical systems model. It is best articulated in my 2012 Journal of the Acoustical Society of America paper.

The abstract:

“This paper describes a system for modeling, recognizing, and classifying sound textures. The described system translates contemporary approaches from video texture analysis, creating a unique approach in the realm of audio and music. The signal is first represented as a set of mode functions by way of the Empirical Mode Decomposition technique for time/frequency analysis, before expressing the dynamics of these modes as a linear dynamical system (LDS). Both linear and nonlinear techniques are utilized in order to learn the system dynamics, which leads to a successful distinction between unique classes of textures. Five classes of sounds comprised a data set, consisting of crackling fire, typewriter action, rainstorms, carbonated beverages, and crowd applause, drawing on a variety of source recordings. Based on this data set the system achieved a classification accuracy of 90%, which outperformed both a Mel-Frequency Cepstral Coefficient based LDS-modeling approach from the literature, as well as one based on a standard Gaussian Mixture Model classifier.”
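
To give a feel for the “linear dynamical system” step mentioned in the abstract, here is a minimal sketch in Python. It is not the implementation from the paper. It assumes the mode functions have already been extracted and stacked as the rows of an array, and it uses the closed-form, SVD-based fit that is common in the video texture literature, which covers only a linear way of learning the dynamics. The names `fit_lds`, `n_states` and `center` are hypothetical, and the input is a synthetic stand-in rather than real audio.

```python
import numpy as np


def fit_lds(modes, n_states, center=True):
    """Fit x[t+1] = A x[t], y[t] = C x[t] to an array of shape (n_modes, n_frames).

    Illustrative only: a closed-form SVD fit, not the paper's procedure.
    """
    modes = np.asarray(modes, dtype=float)
    if center:
        # Remove the per-mode mean so the model describes fluctuations only.
        modes = modes - modes.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(modes, full_matrices=False)
    C = U[:, :n_states]                        # observation matrix
    X = s[:n_states, None] * Vt[:n_states, :]  # state sequence, one column per frame
    # Least-squares estimate of the state transition matrix.
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C, X


# Synthetic stand-in for a stack of four mode functions (assumption: real
# input would come from an empirical mode decomposition of a recording).
theta = 0.3
A_true = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta), np.cos(theta)]])
rng = np.random.default_rng(0)
C_true = rng.standard_normal((4, 2))
x = np.array([1.0, 0.0])
frames = []
for _ in range(200):
    frames.append(C_true @ x)
    x = A_true @ x
modes = np.stack(frames, axis=1)  # shape (4, 200)

A, C, X = fit_lds(modes, n_states=2, center=False)
```

In a sketch like this, the fitted pair `(A, C)` serves as a compact description of one recording. A classifier would then compare such descriptions across recordings, a step that is left out here.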

System Diagram for this project:

Related writings:

Doug Van Nort, Jonas Braasch and Pauline Oliveros. Sound Texture Recognition through Dynamical Systems Modeling of Empirical Mode Decomposition. Journal of the Acoustical Society of America, Vol. 132, issue 4, pp. 2734-2744, 2012.

Doug Van Nort. Instrumental Listening: sonic gesture as design principle. Organised Sound 14(2):177-187, August 2009.

Doug Van Nort, Jonas Braasch and Pauline Oliveros. Sound Texture Analysis based on a Dynamical Systems Model and Empirical Mode Decomposition. Proceedings of the 129th Convention of the Audio Engineering Society, San Francisco, CA, November 2010.

Doug Van Nort. Texture Perception: Signal Modeling and Compositional Approaches. Proceedings of the 2007 Conference of the Society for Music Perception and Cognition (SMPC-07), Montreal, QC, August 2007.