DCASE 2021

Sony has won first place in the Task 3 category of DCASE 2021, the world's largest international competition for the detection and classification of acoustic scenes and events.

The goal of Task 3 (Sound Event Localization and Detection with Directional Interference) is to recognize individual sound events of specific classes by detecting their temporal activity and estimating their locations in the presence of spatial ambient noise and interfering directional events that do not belong to the target classes.

Comment by corresponding author Kazuki Shimada

I am very happy to have won first place in DCASE 2021 Task 3. First of all, I would like to thank all the members (Naoya Takahashi, Yuichiro Koyama, Shusuke Takahashi, Emiru Tsunoo, Masafumi Takahashi, Yuki Mitsufuji) who worked on this task with me. We were able to obtain good results thanks to the cooperation of many people. The aim of Task 3 is to estimate when and where acoustic events such as human voices, footsteps, and dog barks occur. Understanding the actual sound environment and content in time and space enables AI to behave according to the situation, and leads to efficient and flexible content production. Using this result as a foothold, I would like to further advance research on sound environment understanding and content analysis, including through collaboration with other research institutes, and contribute to a wide range of Sony’s business areas such as electronics and entertainment.