ICCV is the premier international computer vision event comprising the main conference and several co-located workshops and tutorials.

October 11-17, 2021
(ICCV-2021 is a Virtual-only Conference)

Sony is pleased to be a Champion Exhibitor of ICCV-2021.

Recruitment information for ICCV-2021

We look forward to highly motivated individuals applying to Sony so that we can work together to fill the world with emotion and pioneer the future with dreams and curiosity. Join us and be part of a diverse, innovative, creative, and original team to inspire the world.

For Sony AI positions, please see

*The special job offer for ICCV-2021 has closed. Thank you for your many applications.

Our event information

Sponsor Session 01

(1) Federated Learning in Practice


Federated Learning has attracted significant attention because it offers a privacy-aware model training paradigm that does not require sharing raw data, yet allows participants to collaboratively construct a better global model. In this talk, I will introduce the background and categorization of Federated Learning, followed by several interesting industrial cases.

Speaker: Lingjuan Lyu - Sony AI
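
As a rough illustration of the idea in the abstract above (participants train locally and share only model parameters, never raw data), the following is a minimal sketch of federated averaging on a one-parameter linear model. It is not taken from the talk: the function names `local_update`, `federated_average`, and `train`, the toy model y ≈ w * x, and the learning-rate and round settings are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# One client's private dataset: (x, y) pairs that never leave the client.
Dataset = List[Tuple[float, float]]


def local_update(w: float, data: Dataset, lr: float = 0.01, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one client's data for y ≈ w * x.

    Only the updated weight is returned; the raw data stays local.
    """
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_weights: List[float], client_sizes: List[int]) -> float:
    """Server-side step: average client weights, weighted by dataset size."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


def train(clients: List[Dataset], rounds: int = 50) -> float:
    """Alternate local training and server aggregation for a fixed number of rounds."""
    w_global = 0.0
    for _ in range(rounds):
        # Each client starts from the current global weight (hypothetical setup).
        updates = [local_update(w_global, data) for data in clients]
        w_global = federated_average(updates, [len(d) for d in clients])
    return w_global
```

For example, calling `train` with two clients whose data both follow y = 3x should return a weight close to 3, even though the server only ever sees the clients' weights.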

(2) Visual Datasets and Biases


Machine learning models invariably depend on the source of their experience, i.e., the data that they are trained and evaluated on. Data size and efficiency are often prioritised to the detriment of data curation practices. In this talk, I will give an overview of several research questions related to visual datasets and biases.

Speaker: Jerone Andrews - Sony AI

Sponsor Session 02

Highly Efficient Realtime Visual Sensing Applications with Event-based Sensors


Inspired by human eyes, event-based vision sensors (EVS) are an emerging technology that addresses challenging scenarios faced by today's conventional cameras. They react quickly to sudden changes, handle extreme lighting conditions, and operate with very little power. We have drastically reduced the EVS pixel size, to the point where we can integrate it with state-of-the-art image sensors to get the best of both worlds. We are developing software solutions to facilitate adoption of EVS technology in various markets. We introduce some of our current work, including high-speed tracking, high-speed depth sensing, visible light communication, and efficient event-based processing.

Speaker: Yoshitaka Miyatani - Sony Semiconductor Solutions Corporation, Japan
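
To make "reacting to changes" in the abstract above more concrete, here is a minimal sketch of a commonly used idealized model of an event-based pixel: a pixel emits an event only when its log intensity has changed by more than a threshold since its last event. This is an assumption for illustration, not a description of Sony's EVS hardware or of the talk's methods; the function name `frames_to_events`, the threshold value, and the frame-based input are all hypothetical.

```python
import math
from typing import List, Tuple

# An event is (t, x, y, polarity), with polarity +1 (brighter) or -1 (darker).
Event = Tuple[int, int, int, int]


def frames_to_events(
    frames: List[List[List[float]]],
    threshold: float = 0.2,
    eps: float = 1e-3,
) -> List[Event]:
    """Convert a sequence of intensity frames into a sparse list of events.

    Each pixel keeps a reference log intensity. An event is emitted only when
    the current log intensity differs from that reference by at least
    `threshold`; unchanged pixels produce no output at all.
    """
    # Reference log intensities are initialised from the first frame.
    ref = [[math.log(v + eps) for v in row] for row in frames[0]]
    events: List[Event] = []
    for t, frame in enumerate(frames[1:], start=1):
        for y, row in enumerate(frame):
            for x, v in enumerate(row):
                log_v = math.log(v + eps)
                diff = log_v - ref[y][x]
                if abs(diff) >= threshold:
                    events.append((t, x, y, 1 if diff > 0 else -1))
                    ref[y][x] = log_v  # reset the reference after firing
    return events
```

In this simplified model a static scene yields no events, which is one intuition for why event-based processing can be efficient: only the pixels that change need to be handled.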

Sponsor Session 03

(1) Densely connected multidilated convolutional networks for dense prediction tasks


Tasks such as semantic segmentation require modeling both local and global structures in a large input field. As the local and global structures often depend on each other, modeling them simultaneously is important. We propose a novel CNN architecture called D3Net that enables multiresolution learning in almost all layers. D3Net outperforms state-of-the-art architectures in several tasks.

Speaker: Naoya Takahashi - R&D Center, Sony Group Corporation

(2) Beyond image recognition towards audio and language


Multimodal recognition is critical for realizing AI that understands the world at a deeper level. We introduce some of Sony's endeavors toward this goal, first with localization of sounding objects through weakly supervised learning, and then with a review of the transformer architecture in visuolinguistic tasks.

Speaker: Tokuhiro Nishikawa - R&D Center, Sony Group Corporation

Speaker: Andrew Shin - R&D Center, Sony Group Corporation

Technologies & Business Use Case

Technology 02

Sony's Latest Image Sensors and the Technologies that Lie Behind Them

In the imaging and sensing field, Sony has a great selection of cutting-edge products, such as intelligent vision sensors with AI processing functionality, ToF image sensors that can be used even for AR/MR, and automotive image sensors critical to realizing autonomous driving. The unique technologies that lie behind these products are all world firsts, and include back-illuminated CMOS image sensors, stacked CMOS image sensors, and Cu-Cu direct bonding.

Technology 03

Imaging and Sensing Technology

CMOS image sensor development at Sony began in 1996 and led to the launch of our first CMOS image sensor (IMX001) in 2000. At the time, CMOS image sensors produced noisy images in low light and also offered fewer pixels than CCD image sensors. However, the lower readout speed of CCD image sensors convinced us that they would be unable to support high-resolution data as the industry moved from SD to HD video. In 2004, we therefore made a major change of course, shifting our focus from CCD to CMOS image sensor development. It was a bold decision: instead of relying on the world's number one share we held in CCD image sensors, we would be building on a negligible market share in CMOS image sensors.
Later, in 2007, we commercialized CMOS image sensors with an original column A/D conversion circuit for fast, low-noise performance, followed in 2009 by back-illuminated CMOS image sensors with twice the sensitivity of conventional image sensors, a sensitivity beyond that of the human eye.
Further technical innovations that have enabled us to consistently lead the industry include stacked CMOS image sensors in 2012, which offer higher image quality and multiple functions in a smaller package thanks to layering of the pixel and signal-processing sections, and, in 2015, the world's first image sensors with a Cu-Cu connection, enabling smaller packages, higher performance, and greater productivity in manufacturing.
