Pattern Recognition Letters

Special Issue On

Future Trends in Pattern Recognition of Non-Speech Audio Signals (PRNSA)



Pattern Recognition Letters is seeking high-quality manuscripts for a Special Issue on Future Trends in Pattern Recognition of Non-Speech Audio Signals, scheduled for publication in 2010.

Aims and Scope:

August 1996: in a panel session at the Thirteenth National Conference on Artificial Intelligence (AAAI-96), Rodney A. Brooks (noted Professor of Robotics at MIT) remarked that while automatic speech recognition was a highly researched domain, there had been very little work trying to build machines able to understand "non-speech sound cues". He went as far as naming this one of the biggest challenges faced by Artificial Intelligence.

Twelve years have passed, and the pattern recognition community has worked extensively to meet this challenge. Systems now exist that are able to analyze the contents of musical signals (e.g. classifying them into genres or recognizing their instruments), soundscapes (e.g. discriminating the sound of a busy road from that of a calm pedestrian street), non-verbal communication cues of humans (e.g. detecting baby cries in a kindergarten environment) and animals (e.g. predicting the context of dog barks), to list but a few.

In an interesting parallel to the hegemony of hidden Markov models in speech recognition, most techniques to recognize non-speech audio have relied so far on a common paradigm, the Bag-of-Frames (BOF). BOF represents signals as the long-term statistical distribution of local frame-based feature vectors; a prototypical implementation is Gaussian Mixture Models of Mel-Frequency Cepstral Coefficients. While this paradigm has provided a successful basis for many of today’s most spectacular achievements, recent research increasingly suggests that it is intrinsically limited to moderate performance. Its limitations include the difficulty of modeling temporal dynamics between successive frames, of accounting either for short events or for long-term dependencies, and of incorporating extrinsic information such as contextual cues. The time has come, it seems, to turn to alternative frameworks with greater cognitive and biological plausibility.
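
As an illustration of the BOF paradigm and of its blindness to temporal order, here is a minimal sketch in Python. It is not a reference implementation: to stay self-contained it assumes two toy per-frame features (log energy and zero-crossing rate) in place of Mel-frequency cepstral coefficients, and a single diagonal Gaussian in place of a Gaussian mixture. The function names, frame sizes and test signal are all hypothetical.

```python
import math
import random


def frame_signal(signal, frame_len=256, hop=128):
    """Cut a signal into overlapping frames (sizes are illustrative)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def frame_features(frame):
    """Toy stand-ins for MFCCs: log energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (math.log(energy + 1e-12), crossings / (len(frame) - 1))


def bag_of_frames(features):
    """Summarise all frames by per-feature mean and variance,
    i.e. one diagonal Gaussian standing in for a GMM."""
    n = len(features)
    dims = range(len(features[0]))
    means = [sum(f[d] for f in features) / n for d in dims]
    variances = [sum((f[d] - means[d]) ** 2 for f in features) / n
                 for d in dims]
    return means, variances


# Hypothetical test signal: half a second of tone, then half a second of noise.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(4000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
features = [frame_features(f) for f in frame_signal(tone + noise)]

# Shuffling the frames destroys the temporal structure of the signal...
shuffled = features[:]
rng.shuffle(shuffled)

# ...yet the bag-of-frames model is the same up to rounding error.
model = bag_of_frames(features)
shuffled_model = bag_of_frames(shuffled)
same_model = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for part, shuffled_part in zip(model, shuffled_model)
    for a, b in zip(part, shuffled_part)
)
print("model unchanged by shuffling:", same_model)
```

Because the model only pools frame-level statistics, a tone followed by noise and the same frames in random order receive the same description, which is one concrete form of the difficulty with temporal dynamics noted above.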

For this special issue, we are soliciting original contributions from leading researchers in academia and industry that address pattern recognition techniques for non-speech audio signals and depart significantly from the traditional frame-based models. By non-speech audio, we mean any acoustic signal within the hearing range of humans or animals that is not associated with linguistic content. This includes music, soundscapes and environmental noises (broadly defined), animal vocalizations or acoustic signaling, etc. Human vocalizations also qualify if considered in their prosodic, non-linguistic qualities (e.g. babbling, humming, etc.).

We are particularly interested in systems that take pattern recognition out of the desktop, into real-world, continuous, mobile, context-rich physical devices. The topics include, but are not limited to:

Submission procedure:

Manuscripts should conform to the standard guidelines of Pattern Recognition Letters. Instructions for formatting papers can be found in the "Guide for authors". Submitted articles must not have been previously published and must not be currently submitted for publication elsewhere. Prospective authors should submit an electronic copy of their complete manuscript via the online Elsevier Editorial System (EES), in which authors must select "PRNSA", the acronym of the special issue, as the "article type" from the menu. All submitted papers will be reviewed by at least two independent reviewers.

Important Dates:

o    Full paper due:   Jan. 31, 2009 (extended: Feb. 15, 2009)
o    First notification:   April 31, 2009 (delayed: May 15, 2009)
o    Revised manuscript (for second review) due:   June 31, 2009 (extended: July 15, 2009)
o    Acceptance Notification:   August 31, 2009
o    Final manuscript due:   Sept. 31, 2009
o    Scheduled publication of the special issue:   2010

Guest Editors:

Dr. Jean-Julien Aucouturier (aucouturier@gmail.com)
Temple University, Japan Campus.

Dr. Laurent Daudet (daudet@lam.jussieu.fr)
Université Pierre et Marie Curie-Paris 6, France.