Sony would like to extend our heartfelt condolences to those who have died from the COVID-19 virus and their families and pray for the speedy recovery of those currently battling the disease.
As many of you already know, the European Conference on Computer Vision (ECCV) is to be held as a fully virtual conference for the first time in its long history.
Sony salutes the ECCV and all general chairs for prioritizing the need to protect people's lives with its decision to convert the conference to a virtual environment, and we thank the staff at the organizing committee for their swift preparations.
As one of the sponsors of the event, Sony intended to hold sessions and host a technology exhibition in the sponsors booth.
It was very unfortunate that we couldn't meet you all in Glasgow but as an alternative we intend to introduce several of Sony's latest combined AI, Sensor and Computer Vision technologies on this site,
including some which are still in the development stage.
While it may be small, we are keen to contribute to the first virtual conference in any way that we can.
We ECCV participants readily join the global battle to defeat COVID-19 and, once this global health emergency has passed, we look forward to seeing you all in person next year on site.

ECCV 2020 :
European Conference on Computer Vision

The European Conference on Computer Vision (ECCV) is one of the top conferences for researchers in this field,
together with the IEEE conferences ICCV (International Conference on Computer Vision) and CVPR (Computer Vision and Pattern Recognition).
Similar to ICCV and CVPR in scope, it covers cutting-edge developments in image and video analysis, AI / deep learning methods for CV, and a rich spectrum of applications including security, healthcare, big data, and many more. ECCV is held biennially, in alternation with ICCV.

Industry sessions :
Unique assets for next generation computer vision and AI

Sony Semiconductor will show the unique intelligent vision sensor, which realizes high-speed edge AI processing. Sony R&D Center will introduce cutting edge algorithms and implementations for next generation entertainment. Sony AI will talk about the concept of Flagship Projects to leverage assets with AI for visionary applications.

Session 1
Intelligent Vision Sensor

Sony has developed innovative image sensors called intelligent vision sensors. They are the first image sensors in the world to be equipped with AI processing functionality. The sensors were realized by utilizing Sony's advanced stacking technology and an original DSP for AI. Including AI processing functionality on the image sensor itself enables high-speed edge AI processing and extraction of only the necessary data, which reduces data transmission latency, addresses privacy concerns, and reduces power consumption and communication costs when using cloud services. These products expand the opportunities to develop AI-equipped cameras, enabling a diverse range of applications in the retail and industrial equipment industries and contributing to building optimal systems that link with the cloud.

Fig. Intelligent Vision Sensor

  • Seigo Hirakawa

    Sony Semiconductor Solutions Corporation, Japan

    Seigo Hirakawa is a Senior Manager at Sony Semiconductor Solutions Corporation in Atsugi. He leads a development team for signal processing including AI for image sensors. He received a bachelor's and master's degree in electronic engineering from Tokyo Institute of Technology, Japan.

Session 2
Online and Offline 3D Mapping

Sony RDC conducts research in the field of Computer Vision. Among other things, we have implemented state-of-the-art systems in 3D modelling for both online and offline reconstruction. To make 360˚ reconstruction and sensing possible in real time with low power consumption, we have developed a processor specifically for vision sensing. This processor can connect up to 12 cameras, IMUs, etc., and performs stereo depth estimation, SLAM, occupancy mapping, etc., simultaneously. Our offline 3D modelling system focuses on reconstruction from high-resolution images as well as from large sets of images. We have developed effective solutions for Multi-View Stereo, depth refinement and mesh reconstruction, and show superior results on well-known academic benchmarks. Combining online and offline reconstruction enables a unique VR experience and extended robotic applications.

  • Dr. Andreas Kuhn

    Sony R&D Center Europe, Germany

    Andreas Kuhn is a Senior Engineer at Sony R&D Center Europe in Stuttgart. Prior to joining Sony in 2017, he worked as a Postdoc at the University of the Bundeswehr Munich and the University of North Carolina at Chapel Hill. Andreas holds a Diploma and PhD in Computer Science from the University of Tübingen and the University of the Bundeswehr Munich respectively. His main research interests lie in 3D Reconstruction from images by Multi-View Stereo, Optimization and Machine Learning. Andreas also contributes to reviewing activities for the European Conference on Computer Vision.

Session 3
Sony AI

Sony AI is a new organization, established in November 2019, with offices in Japan, the US and Europe. With a mission to unleash human imagination and creativity, we believe in researching and developing AI techniques that empower artists, makers and creators around the world. In this session, I will introduce Sony AI's approach to AI research, and discuss some of our first projects.

  • Dr. Peter Dürr

    Sony AI Zürich, Switzerland

    Peter Dürr is the Director of Sony AI in Zürich. After joining Sony in 2011 he worked on computer vision, AI and robotics research in various assignments at Sony R&D Center and Aerosense in Tokyo, at Sony R&D Center Europe, and recently Sony AI in Zürich. Peter holds an MSc in mechanical engineering from ETH Zürich and a PhD in computer and communication science from EPFL in Lausanne.

Business use case

Case 1
Sony's World's First Intelligent Vision Sensors with AI Processing Functionality
Enabling High-Speed Edge AI Processing and Contributing to Building of Optimal Systems Linked with the Cloud

Sony Corporation announced two models of intelligent vision sensors, the first image sensors in the world to be equipped with AI processing functionality. Including AI processing functionality on the image sensor itself enables high-speed edge AI processing and extraction of only the necessary data, which, when using cloud services, reduces data transmission latency, addresses privacy concerns, and reduces power consumption and communication costs.

Fig. Intelligent Vision Sensor

The spread of IoT has resulted in all types of devices being connected to the cloud, making commonplace the use of information processing systems where information obtained from such devices is processed via AI on the cloud. On the other hand, the increasing volume of information handled in the cloud poses various problems: increased data transmission latency hindering real-time information processing; security concerns from users associated with storing personally identifiable data in the cloud; and other issues such as the increased power consumption and communication costs cloud services entail. The new sensor products feature a stacked configuration consisting of a pixel chip and a logic chip. They are the world's first image sensors to be equipped with AI image analysis and processing functionality on the logic chip. The signal acquired by the pixel chip is processed via AI on the sensor, eliminating the need for high-performance processors or external memory and enabling the development of edge AI systems. The sensor outputs metadata (semantic information belonging to image data) instead of image information, reducing data volume and addressing privacy concerns. Moreover, the AI capability makes it possible to deliver diverse functionality for versatile applications, such as real-time object tracking with high-speed AI processing. Different AI models can also be chosen by rewriting internal memory in accordance with user requirements or the conditions of the location where the system is being used.
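
To give a rough sense of why outputting metadata instead of image information reduces data volume, the following minimal sketch compares the size of one uncompressed frame with the size of a small detection record. The frame dimensions, the JSON layout and the detection values are hypothetical and chosen only for illustration; they are not the sensor's specification or output format.

```python
import json

# Hypothetical frame dimensions, for illustration only (not a sensor spec).
WIDTH, HEIGHT, BYTES_PER_PIXEL = 1920, 1080, 1


def raw_frame_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one uncompressed frame."""
    return width * height * bytes_per_pixel


def metadata_bytes(detections):
    """Size in bytes of the detections serialized as compact UTF-8 JSON."""
    return len(json.dumps(detections, separators=(",", ":")).encode("utf-8"))


# Hypothetical metadata record: label, confidence and bounding box per object.
detections = [
    {"label": "person", "score": 0.91, "box": [412, 220, 596, 710]},
    {"label": "person", "score": 0.84, "box": [1010, 198, 1175, 690]},
]

if __name__ == "__main__":
    frame = raw_frame_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)
    meta = metadata_bytes(detections)
    print(f"raw frame: {frame} bytes, metadata: {meta} bytes")
```

Under these assumptions the metadata record is several orders of magnitude smaller than the frame, and it contains no pixel data that could identify a person.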

Case 2
Using imaging information
that humans can't see in AI applications
Polarization image sensor

1) Polarization Image Sensor with Four-Directional on-chip Polarizer and global shutter function

Sony Semiconductor Solutions has launched a polarization image sensor (polarization sensor) with a 3.45µm pixel size and a four-directional polarizer formed on the photodiode of the image sensor chip*1. This polarization sensor targets the industrial equipment market. In addition to capturing brightness and color*2, this image sensor can also capture polarization information that cannot be detected by a normal image sensor. This polarization sensor can be used in many applications in the industrial field, such as inspection when visibility and sensing are difficult.

2) Developing high-performance AI as we develop products

In many cases, AI has been developed by applying machine learning to images visible to the human eye. Without a doubt, however, using polarization sensors to take advantage of image information that we cannot see will greatly expand the horizons of AI. To put it another way, instead of this technology solving our problems for us, the technology will expand our own ability to solve problems. We are picking up the pace in wholly new AI development that draws on our unmatched development of high-performance sensors. And in consideration of the many different products and services Sony offers, we are uniquely positioned to rapidly develop practical, high-performance AI as we develop products and launch them around the world. Unlike AI development that relies solely on image processing or machine learning, this makes our work very satisfying.
Polarsens is a CMOS image sensor pixel technology in which polarizers of several different angles are formed on chip during the semiconductor process, allowing highly accurate alignment with the pixels.

3) Example of Polarization Direction

Polarization direction provides information about the orientation of the reflecting surface of an object. For the scene in Fig. "Normal Image", the polarization direction image shows the angle of the polarization direction in color using HSV color mapping (Fig. "Polarization Direction image"). In this example, the upper side of the cube is highlighted in light blue, meaning that the angle of the polarization direction is 90 degrees (according to Fig. "HSV color mapping").
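
As an illustration of how a polarization direction image of this kind can be computed, the following minimal sketch derives the angle and degree of linear polarization from four intensities measured behind 0, 45, 90 and 135 degree polarizers, using the standard linear Stokes parameters. It assumes that the four directions are 0/45/90/135 degrees and that the angle range 0 to 180 degrees is mapped linearly onto the full HSV hue circle; both are assumptions made for this sketch, not a description of Sony's processing, and the function names are hypothetical.

```python
import colorsys
import math


def polarization_from_four_directions(i0, i45, i90, i135):
    """Return (angle in degrees within [0, 180), degree of linear polarization)."""
    s0 = (i0 + i45 + i90 + i135) / 2.0  # total intensity
    s1 = i0 - i90                       # 0 degree vs 90 degree component
    s2 = i45 - i135                     # 45 degree vs 135 degree component
    if s0 <= 0:
        return 0.0, 0.0                 # no light: angle is undefined, report 0
    dolp = min(1.0, math.hypot(s1, s2) / s0)
    aolp = (0.5 * math.degrees(math.atan2(s2, s1))) % 180.0
    return aolp, dolp


def angle_to_rgb(aolp_degrees):
    """Map a polarization angle to an RGB color (assumed linear hue mapping)."""
    hue = (aolp_degrees % 180.0) / 180.0
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)


if __name__ == "__main__":
    # Fully polarized light at 90 degrees (intensities follow Malus's law).
    angle, degree = polarization_from_four_directions(0.0, 0.5, 1.0, 0.5)
    print(angle, degree, angle_to_rgb(angle))
```

With this assumed mapping, an angle of 90 degrees falls on hue 0.5, which is cyan, in line with the light blue shown on the upper side of the cube.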

Case 3
Eye AF in "Alpha" Series and "Xperia 1"
Focus on Shooting with Eye AF and Capturing the Decisive Moment of Kando

1) Certain Success through AF Technology The Strong Belief Supporting Advanced Technology

Eye AF refers to the camera's ability to detect subjects' eyes and focus perfectly on them. We produce all of the core elements at our company including the CMOS sensors and image processing engines and lenses, and as we optimize every individual step, we can draw out the very best AF accuracy and tracking. The Eye AF feature is highly rated for its ability to look after all focusing issues, leaving the photographer free to concentrate on composition and the subject's expression. The evolved real-time Eye AF found on Alpha 9 II cameras can keep up with even the fastest athletes, quickly and accurately focusing and tracking their eyes. Sony's Eye AF technology has since been used in more cameras and has been well received by professionals and many more.

Fig. Sony's full-frame mirrorless camera, Alpha 9 II

This full-frame mirrorless camera has a game-changing image sensor capable of a new dimension of photographic speed. With high-speed subject recognition and lightning-fast AI-based data processing, it tracks subjects' eyes in real time. In addition, the camera is equipped with other unique features such as silent shooting, which omits shutter noise and unwanted vibration, expanding photographic potential.

2) Developing New Face Detection and Realizing the World's First Smartphone Eye AF

Sony's flagship Xperia 1 smartphone is the world's first smartphone with Eye AF. Eye AF is also well known as one of the signature features of Sony's "Alpha" series cameras. Normal cameras are held either vertically or horizontally, but smartphones can be held at any orientation. Xperia 1 therefore introduced face and eye detection technology that works seamlessly for faces and eyes at any angle. Getting real-time processing onto the device was a struggle, but by making use of our base technology, we succeeded. Furthermore, Xperia 1 II, the successor to the Xperia 1, is equipped with "Real-time Eye AF", which locks focus on the subject's eye for stunning portrait shots and is now available for both humans and animals.

Fig. Xperia 1 Ⅱ

Case 4
Image Recognition Technology on aibo
~ Building Eyes of aibo to Live in Homes ~

aibo uses its eyes (cameras) to recognize its surroundings and modifies its behavior based on what it sees. It identifies owners and detects friends (other aibos) and toys to play with. An edge-friendly, lightweight algorithm based on deep learning allows it to recognize these rapidly and interact with people. With the depth sensor, it can reach toys (e.g. pink balls and bones) while avoiding obstacles. A camera on its back performs SLAM to build a map of the room, which is used for the go-to-charger and patrol applications.

Fig. Sensors/Display device on aibo

Fig. Image Recognition Technology on aibo


  • Ilya Reshetouski, Hideki Oyaizu, Kenichiro Nakamura, Ryuta Sato, Suguru Ushiki, Ryuichi Tadano, Atsushi Ito, Jun Murayama (Sony R&D Center)


    We introduce a lensless imaging framework for contemporary computer vision applications in long-wavelength infrared (LWIR). The framework consists of two parts: a novel lensless imaging method that utilizes the idea of local directional focusing for optimal binary sparse coding, and a lensless imaging simulator based on the Fresnel-Kirchhoff diffraction approximation. Our lensless imaging approach, besides being computationally efficient, is calibration-free and allows for wide FOV imaging. We employ our lensless imaging simulation software for optimizing reconstruction parameters and for synthetic image generation for CNN training. We demonstrate the advantages of our framework on a dual-camera system (RGB-LWIR lensless), where we perform CNN-based human detection using the fused RGB-LWIR data.

  • Yuhi Kondo, Taishi Ono, Legong Sun , Yasutaka Hirasawa, Jun Murayama (Sony R&D Center)


    Polarization has been used to solve many computer vision tasks such as Shape from Polarization (SfP), but existing methods suffer from the ambiguity problems of polarization. To overcome such problems, some research works have suggested using a Convolutional Neural Network (CNN). However, acquiring a large-scale dataset with polarization information is a very difficult task. If there is an accurate model that can describe the complicated phenomenon of polarization, we can easily produce synthetic polarized images of various situations to train a CNN. In this paper, we propose a new polarimetric BRDF (pBRDF) model. We prove its accuracy by fitting our model to data measured under a variety of light and camera conditions. We render polarized images using this model and use them to estimate surface normals. Experiments show that the CNN trained on our polarized images is more accurate than one trained on RGB only.

Recruit information

If you are interested in working with us, please click here to see our open job and internship positions.
