A Computational Approach to Rhythm Description

Audio Features for the Computation of Rhythm Periodicity Functions and their use in Tempo Induction and Music Content Processing

   

A Dissertation submitted to the Department of Technology of the University Pompeu Fabra
for the program in Computer Science and Digital Communication in partial fulfilment of the requirements for the degree of
Doctor per la Universitat Pompeu Fabra
with the mention of European Doctor

 

Fabien Gouyon 2005

 

 

Abstract:

This dissertation is about musical rhythm. More precisely, it is concerned with computer programs that automatically extract rhythmic descriptions from musical audio signals.


New algorithms are presented for tempo induction, tatum estimation, time signature determination, swing estimation, swing transformations and classification of ballroom dance music styles.
These algorithms directly process digitized recordings of acoustic musical signals. The backbones of these algorithms are rhythm periodicity functions: functions measuring the salience of a rhythmic pulse as a function of the period (or frequency) of the pulse, calculated from selected instantaneous physical attributes (henceforth features) emphasizing rhythmic aspects of sound.
These features are computed at a constant time rate on small chunks (frames) of audio signal waveforms.
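The idea of a rhythm periodicity function can be illustrated with a minimal sketch. It assumes frame energy as the feature and autocorrelation as the periodicity measure; both are common choices picked here for illustration, not necessarily the features or functions studied in the dissertation. The function names, the synthetic click signal and all parameter values are hypothetical.

```python
def frame_energy(signal, frame_size, hop_size):
    """Feature: energy of each frame, computed at a constant frame rate."""
    energies = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = signal[start:start + frame_size]
        energies.append(sum(x * x for x in frame))
    return energies


def periodicity_function(feature, max_lag):
    """Normalized autocorrelation of the mean-removed feature.

    Returns a dict mapping each lag (a pulse period, in frames) to a salience value.
    """
    mean = sum(feature) / len(feature)
    centered = [v - mean for v in feature]
    norm = sum(v * v for v in centered) or 1.0
    salience = {}
    for lag in range(1, max_lag + 1):
        acc = sum(centered[i] * centered[i + lag]
                  for i in range(len(centered) - lag))
        salience[lag] = acc / norm
    return salience


# Hypothetical test signal: 4 s at 8000 Hz with a short click every 0.5 s.
sample_rate = 8000
frame_size = 160   # 20 ms frames
hop_size = 80      # 10 ms hop, i.e. 100 feature values per second
signal = [0.0] * (sample_rate * 4)
for onset in range(0, len(signal), sample_rate // 2):
    for i in range(onset, min(onset + 40, len(signal))):
        signal[i] = 1.0

feature = frame_energy(signal, frame_size, hop_size)
salience = periodicity_function(feature, max_lag=120)

# The most salient period, converted from frames to seconds and to BPM.
best_lag = max(salience, key=salience.get)
period_seconds = best_lag * hop_size / sample_rate
bpm = 60.0 / period_seconds
print(best_lag, period_seconds, bpm)
```

On this synthetic input the most salient lag corresponds to the click spacing. Real recordings typically show several peaks at related periods, one per metrical level, which is why picking a single peak is only a starting point.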


Our algorithms determine the tempo and tatum of music from different genres, provided the tempo is almost constant, with over 80% accuracy if we do not insist on finding a specific metrical level. They identify time signature with around 90% accuracy, assuming lower metrical levels are known.
They classify ballroom dance music into 8 categories with around 80% accuracy when taking nothing but rhythmic aspects of the music into account. Finally, they add (or remove) swing to musical audio signals in a fully automatic fashion, while preserving very good sound quality.
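What "not insisting on a specific metrical level" can mean when scoring tempo estimates is sketched below. An estimate counts as correct if it falls within a relative tolerance of the reference tempo or of a simple multiple or fraction of it. The 4% tolerance, the set of factors and the function names are assumptions made for illustration; they are not taken from the dissertation's evaluation protocol.

```python
def tempo_matches(estimate_bpm, reference_bpm, tolerance=0.04,
                  factors=(1.0, 2.0, 3.0, 1 / 2, 1 / 3)):
    """True if the estimate is close to the reference tempo scaled by any factor.

    tolerance and factors are hypothetical defaults chosen for this sketch.
    """
    for factor in factors:
        target = reference_bpm * factor
        if abs(estimate_bpm - target) <= tolerance * target:
            return True
    return False


def accuracy(estimates, references, **kwargs):
    """Fraction of estimates accepted by tempo_matches."""
    hits = sum(tempo_matches(e, r, **kwargs)
               for e, r in zip(estimates, references))
    return hits / len(references)


# Made-up example values, not results from the dissertation.
estimates = [120.0, 60.0, 90.0, 200.0]
references = [120.0, 120.0, 120.0, 100.0]

lenient = accuracy(estimates, references)                 # any related metrical level
strict = accuracy(estimates, references, factors=(1.0,))  # reference level only
print(lenient, strict)
```

Passing `factors=(1.0,)` gives the stricter score, where an estimate at half or double the reference tempo counts as an error.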


From a more general standpoint, this dissertation substantially contributes to the field of computational rhythm description:

a) by proposing a unifying functional framework;
b) by reviewing the architecture of many existing systems with respect to individual blocks of this framework;
c) by organizing the first public evaluation of tempo induction algorithms;
d) by identifying promising research directions, particularly with respect to the selection of the instantaneous features best suited to the computation of useful rhythm periodicity functions, and the strategy for combining and parsing multiple sources of rhythmic information.

 

 

Links:

Thesis in PDF format (ISBN: 84-689-7256-8)

Satellite workshop on Multidisciplinary approaches to the understanding of music cognition, co-organized with Hendrik Purwins

    

 

Thesis Direction:

Dr. Xavier Serra, Department of Technology, Pompeu Fabra University, Barcelona, and
Dr. Gerhard Widmer, Department of Computational Perception, Johannes Kepler University, Linz

 

Thesis Committee:

Chairperson:
Dr. Gustavo Deco
Department of Technology
Pompeu Fabra University, Barcelona

Secretary:
Dr. Hector Geffner
Department of Technology
Pompeu Fabra University, Barcelona

Thesis reader:
Dr. Peter Desain
Nijmegen Institute for Cognition and Information
University of Nijmegen, Nijmegen

Thesis reader:
Dr. Simon Dixon
Austrian Research Institute for Artificial Intelligence
Vienna

Thesis reader:
Dr. Anssi Klapuri
Institute of Signal Processing
Tampere University of Technology, Tampere