SONY

Metric Learning with Background Noise Class for Few-shot Detection of Rare Sound Events

Date
2020
Academic Conference
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Authors
Kazuki Shimada
Yuichiro Koyama
Akira Inoue (Sony Corporation)
Research Areas
Audio & Acoustics

Abstract

Few-shot learning systems for sound event recognition have gained interest because they require only a few examples to adapt to new target classes, without fine-tuning. However, such systems have only been applied to chunks of sound for classification or verification. In this paper, we aim to achieve few-shot detection of rare sound events from query sequences that contain not only the target events but also other events and background noise. It is therefore necessary to prevent false positive reactions to both the other events and the background noise. We propose metric learning with a background noise class for few-shot detection. Our contribution is threefold: the explicit inclusion of background noise as an independent class, a suitable loss function that emphasizes this additional class, and a corresponding sampling strategy that assists training. The method provides a feature space in which the event classes and the background noise class are sufficiently separated. Evaluations on few-shot detection tasks, using DCASE 2017 task2 and ESC-50, show that our proposed method outperforms metric learning that does not consider the background noise class. The few-shot detection performance is also comparable to that of the DCASE 2017 task2 baseline system, which requires a huge amount of annotated audio data.
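
The idea of treating background noise as its own class can be illustrated with a minimal sketch. The abstract does not specify the embedding network, the exact loss, or the sampling strategy, so everything below is an assumption: the code uses a prototypical-network-style nearest-prototype rule on hand-made 2-D embeddings, and `weighted_nll` is a hypothetical stand-in for a loss that emphasizes the background class. All function names, class labels, and numbers are illustrative only.

```python
import math

def prototype(embeddings):
    """Mean of a list of equal-length embedding vectors."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def class_posteriors(query, prototypes):
    """Softmax over negative squared distances to each class prototype."""
    logits = {c: -sq_dist(query, p) for c, p in prototypes.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {c: math.exp(v - m) for c, v in logits.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}

def weighted_nll(query, label, prototypes, class_weights):
    """Negative log-likelihood scaled by a per-class weight.

    Hypothetical stand-in for a loss that emphasizes the background class;
    classes missing from class_weights get weight 1.0.
    """
    p = class_posteriors(query, prototypes)[label]
    return -class_weights.get(label, 1.0) * math.log(p)

def detect(frames, prototypes, target, threshold=0.5):
    """Flag each query frame whose posterior for `target` exceeds threshold."""
    return [class_posteriors(f, prototypes)[target] > threshold for f in frames]

# Toy support set: a few labeled embeddings per class (made-up values).
support = {
    "target_event": [[1.0, 0.0], [0.9, 0.1]],
    "other_event": [[0.0, 1.0], [0.1, 0.9]],
    "background": [[0.0, 0.0], [0.1, 0.1]],
}
prototypes = {c: prototype(e) for c, e in support.items()}

# Query sequence: a background-like frame, a target-like frame, an other-event frame.
query_frames = [[0.05, 0.0], [0.95, 0.05], [0.0, 0.95]]

with_background = detect(query_frames, prototypes, "target_event")

# Same detector with the background prototype removed, for comparison.
event_only = {c: p for c, p in prototypes.items() if c != "background"}
without_background = detect(query_frames, event_only, "target_event")

# Loss on a background frame, with and without extra weight on that class.
plain_loss = weighted_nll(query_frames[0], "background", prototypes, {})
emphasized_loss = weighted_nll(
    query_frames[0], "background", prototypes, {"background": 2.0}
)
```

In this toy setup, the background-like first frame is flagged as the target when only event prototypes are available, because it happens to lie slightly closer to the target prototype than to the other event. Once a background prototype is present, that frame is absorbed by the background class and only the second frame is detected.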
