Through machine listening techniques, the LISTEN joint laboratory will accelerate research on extracting information from audio signals, focusing on the analysis of sound scenes and audio content, especially music, with very promising applications across many sectors.
Taken individually, these research topics are not new, but the performance of the associated systems has increased considerably thanks to unified methodological approaches that combine machine learning and deep learning applied to different types of audio signals (so-called machine listening), which accelerates the deployment of these technologies and their adoption by the public.
This increased performance has led to significant progress, with a strong socio-economic impact in some sectors, and makes it possible to envisage extending these technologies to other fields of application:
- Robust speech processing: detection of the human voice in its many forms (natural, amplified, shouted, multiple voices…) and speaker identification
- Source separation, enhancement and localization: handling both generic sound classes (speech, music, environmental sounds) and underrepresented ones (rare sound sources); a minimal separation sketch follows this list
- Spatio-temporal detection of domestic, urban, industrial or natural sound events (a minimal detection sketch also follows this list):
  - for instance in the automotive sector: in autonomous vehicles, or for the acoustic detection at a distance of hazards, warning signals or priority vehicles
  - to assist frail people: understanding the domestic sound environment and detecting sounds linked to abnormal or dangerous situations (falls, broken windows, alarms…)
  - or for predictive maintenance in industry, e.g., intelligent listening for anomaly detection on the production line.
- Music content analysis: transcription (rhythm, melody, harmony) to improve access to music catalogs, provide automatic recommendations, or support music creation and pedagogy
- Ecological applications:
  - Smart cities, e.g., characterization of noise pollution sources
  - Bioacoustics, e.g., animal sound segmentation, detection and classification
- And many others…
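
As referenced in the source separation item above, the following minimal sketch gives a flavour of such processing using librosa's harmonic-percussive source separation (HPSS) on a synthetic signal. It is only a stand-in under simplifying assumptions (a synthetic tone-plus-clicks mixture, default HPSS parameters) and does not describe the laboratory's own separation methods.

```python
# Minimal source-separation illustration: harmonic-percussive source
# separation (HPSS) from librosa, applied to a synthetic mixture.
# The signal and parameters are illustrative placeholders only.
import numpy as np
import librosa

sr = 22050                                  # sampling rate (Hz)
t = np.linspace(0.0, 2.0, int(2.0 * sr), endpoint=False)
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # quasi-harmonic part: a 440 Hz tone
clicks = np.zeros_like(t)
clicks[:: sr // 4] = 1.0                    # percussive part: periodic clicks
mixture = tone + clicks

# Decompose the mixture into harmonic and percussive components.
harmonic, percussive = librosa.effects.hpss(mixture)

print(f"harmonic energy:   {np.sum(harmonic ** 2):.1f}")
print(f"percussive energy: {np.sum(percussive ** 2):.1f}")
```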
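
The sound event detection applications listed above typically rest on the unified pipeline mentioned earlier: time-frequency features (for example a log-mel spectrogram) fed to a neural classifier. The sketch below is a minimal, hedged example of that pattern; the class names, network size and synthetic input are assumptions made for illustration and do not correspond to the systems developed in the laboratory.

```python
# Minimal machine-listening sketch: log-mel features + a tiny CNN that
# produces per-class scores for sound event tagging. Classes, architecture
# and the synthetic input are illustrative placeholders only.
import numpy as np
import librosa
import torch
import torch.nn as nn

SAMPLE_RATE = 16000
CLASSES = ["speech", "alarm", "glass_break", "engine", "birdsong"]  # hypothetical labels

def log_mel(waveform: np.ndarray, sr: int = SAMPLE_RATE) -> torch.Tensor:
    """Turn a mono waveform into a log-mel spectrogram tensor of shape (1, n_mels, frames)."""
    mel = librosa.feature.melspectrogram(y=waveform, sr=sr, n_fft=1024,
                                         hop_length=512, n_mels=64)
    log_mel_spec = librosa.power_to_db(mel, ref=np.max)
    return torch.from_numpy(log_mel_spec).float().unsqueeze(0)

class SoundEventTagger(nn.Module):
    """Tiny CNN mapping a log-mel spectrogram to per-class logits."""
    def __init__(self, n_classes: int = len(CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling over time and frequency
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

if __name__ == "__main__":
    # One second of synthetic noise stands in for a real recording.
    waveform = np.random.randn(SAMPLE_RATE).astype(np.float32)
    batch = log_mel(waveform).unsqueeze(0)             # shape (1, 1, 64, frames)
    scores = torch.sigmoid(SoundEventTagger()(batch))  # multi-label event scores
    for name, score in zip(CLASSES, scores.squeeze(0).tolist()):
        print(f"{name}: {score:.2f}")
```

In a real system the classifier would be trained on labeled recordings and would produce tags over time rather than per clip, but the feature-plus-classifier structure stays the same.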