SONY

Thirty-fifth Conference on Neural Information Processing Systems

December 6–14, 2021
(NeurIPS-2021 is a Virtual-only Conference)

In an effort to provide a safe, productive, and worthwhile forum for attendees and sponsors alike, the annual NeurIPS conference will remain fully virtual for 2021. NeurIPS-2021 is hosting a complimentary careers website where companies, non-profits, and academics can post jobs, postdoctoral positions, and fellowships. We support this activity.

Recruit information for NeurIPS-2021

We look forward to highly motivated individuals applying to Sony so that we can work together to fill the world with emotion and pioneer the future with dreams and curiosity. Join us and be part of a diverse, innovative, creative, and original team that inspires the world. If you are interested in working with us, please click here for the full list of open job and internship positions.

Full Time

  • AI Engineer

    Full Time
    United States
    Machine Learning

  • Research Scientist (Computer Vision and Algorithmic Fairness)

    Full Time
    United States
    AI Ethics, Computer Vision

  • Research Scientist (Reinforcement Learning)

    Full Time
    United States
    Reinforcement Learning

  • Senior Research Scientist (Reinforcement Learning)

    Full Time
    United States
    Reinforcement Learning

  • Lead Research Scientist Machine Learning for Gastronomy

    Full Time
    Tokyo, Japan
    Machine Learning

  • Research Scientist (Privacy-Preserving Machine Learning)

    Full Time
    Tokyo, Japan
    Machine Learning

  • Senior Research Scientist (Computer Vision)

    Full Time
    Tokyo, Japan
    Computer Vision

  • Senior Research Scientist (Reinforcement Learning)

    Full Time
    Tokyo, Japan
    Reinforcement Learning

  • Senior Research Scientist (Robotics)

    Full Time
    Tokyo, Japan
    Robotics

  • Robotics Engineer

    Full Time
    Tokyo, Japan
    Robotics

  • Research Scientist (Ethics)

    Full Time
    Tokyo, Japan
    AI Ethics

  • Project Lead - AI Ethics: AI Ethics Office (AEO), Sony Group Corporation

    Full Time
    Tokyo, Japan
    Others

  • Project Manager

    Full Time
    Tokyo, Japan
    Others

  • Senior Research Engineer - Machine Learning/Image Processing

    Full Time
    San Jose, United States
    Computer Vision, Machine Learning

  • AI Application Engineer

    Full Time
    Tokyo, Japan
    Computer Vision, Natural Language Processing

  • Senior Research Engineer Biomedical Engineering & Computer Vision

    Full Time
    San Jose, United States
    Machine Learning, Data Analysis

  • Machine Learning Research Engineer

    Full Time
    Tokyo, Japan
    Data Analysis, Machine Learning

  • R&D engineer of Music, Acoustics, Speech, and Language technology field

    Full Time
    Tokyo, Japan
    Speech/Audio Signal Processing, Natural Language Processing

  • Robotics, Engineer / Researcher

    Full Time
    Tokyo, Japan
    Robotics

  • Research Engineer | Computational Sensing, Embedded AI

    Full Time
    Tokyo, Japan
    Computer Vision

Intern

  • Research Intern (Privacy-Preserving Machine Learning)

    Internship
    Location Flexible
    Machine Learning

  • Robotics, Engineer / Researcher

    Internship
    Tokyo, Japan
    Robotics

  • AI Application Engineer

    Internship
    Tokyo, Japan
    Computer Vision, Natural Language Processing

  • Deep Learning, Researcher/Research Engineer/Software Engineer

    Internship
    Tokyo, Japan
    Machine Learning

  • R&D engineer of Music, Acoustics, Speech, and Language technology field

    Internship
    Tokyo, Japan
    Speech/Audio Signal Processing, Natural Language Processing

  • Engineer / Researcher in Affective Computing (brain technology, biomedical engineering and signal processing)

    Internship
    Kanagawa, Japan
    Human Machine Interaction, Machine Learning

Technologies & Business Use Case

Technology 01

We introduce two technologies related to Sony's entertainment businesses. First, we have built a distributed computing environment that uses a large number of GPUs to study real-time, unbiased ray tracing. Second, we explain a lightweight motion-capture technology that uses only six IMUs to estimate whole-body joint positions and orientations.

1) Realtime Ray Tracing Technology with Distributed Rendering and Material Estimation

  • Tadayasu Hakamatani

    Advanced Technology Dept., Future Technology Group, Sony Interactive Entertainment.

    He graduated from Tokyo Institute of Technology in 1989, after which he joined Sony. He joined the game division in 1993 and was involved in the commercialization of PS1/PS2/PS3. Since 2015, he has been researching real-time photorealistic expression, focusing on ray tracing technology.

2) Capture Bodily Motions with a Simple Setup

  • Yasutaka Fukumoto

    R&D Center, Sony Group Corporation

Yasutaka Fukumoto is a Senior Manager at the R&D Center in Tokyo, where he leads a team of research engineers working on motion-sensing technologies and applications for a wide range of Sony's future businesses. He received a master's degree in Mechano-Informatics from The University of Tokyo in 2004, after which he joined the Sony-Kihara Research Center and has been engaged in various research and development projects for more than 17 years.

Publications

Publication 01
Adversarial Intrinsic Motivation for Reinforcement Learning

Peter Stone
Sony AI

Authors:
Ishan Durugkar, Mauricio Tec, Scott Niekum, and Peter Stone

Abstract:
Learning with an objective to minimize the mismatch with a reference distribution has been shown to be useful for generative modeling and imitation learning. In this paper, we investigate whether one such objective, the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution, can be utilized effectively for reinforcement learning (RL) tasks. Specifically, this paper focuses on goal-conditioned reinforcement learning where the idealized (unachievable) target distribution has full measure at the goal. This paper introduces a quasimetric specific to Markov Decision Processes (MDPs) and uses this quasimetric to estimate the above Wasserstein-1 distance. It further shows that the policy that minimizes this Wasserstein-1 distance is the policy that reaches the goal in as few steps as possible. Our approach, termed Adversarial Intrinsic Motivation (AIM), estimates this Wasserstein-1 distance through its dual objective and uses it to compute a supplemental reward function. Our experiments show that this reward function changes smoothly with respect to transitions in the MDP and directs the agent's exploration to find the goal efficiently. Additionally, we combine AIM with Hindsight Experience Replay (HER) and show that the resulting algorithm accelerates learning significantly on several simulated robotics tasks when compared to other rewards that encourage exploration or accelerate learning.
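
The potential-difference shaping at the heart of this idea can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the paper's implementation: a 1-D chain MDP, a closed-form stand-in for the learned dual potential, and hypothetical function names.

```python
# Illustrative sketch: a potential-difference intrinsic reward on a 1-D
# chain MDP with states 0..N-1 and the goal at state N-1.
N = 10  # chain length (illustrative)

def potential(s: int) -> float:
    # Stand-in for the learned dual potential f: the negated step
    # distance to the goal, which is what the Wasserstein-1 dual with a
    # step-count quasimetric would recover in this idealized tabular case.
    return float(-(N - 1 - s))

def intrinsic_reward(s: int, s_next: int) -> float:
    # Supplemental reward for a transition s -> s_next: the potential
    # difference, positive for moves toward the goal, negative away from it.
    return potential(s_next) - potential(s)
```

In this toy setting the reward changes smoothly along the chain and always points the agent toward the goal, mirroring the behavior the abstract describes for the learned reward.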

Publication 02
Machine versus Human Attention in Deep Reinforcement Learning Tasks

Peter Stone
Sony AI

Authors:
Sihang Guo, Ruohan Zhang, Bo Liu, Yifeng Zhu, Mary Hayhoe, Dana Ballard, and Peter Stone

Abstract:
Deep reinforcement learning (RL) algorithms are powerful tools for solving visuomotor decision tasks. However, the trained models are often difficult to interpret, because they are represented as end-to-end deep neural networks. In this paper, we shed light on the inner workings of such trained models by analyzing the pixels that they attend to during task execution, and comparing them with the pixels attended to by humans executing the same tasks. To this end, we investigate the following two questions that, to the best of our knowledge, have not been previously studied. 1) How similar are the visual representations learned by RL agents and humans when performing the same task? and, 2) How do similarities and differences in these learned representations explain RL agents' performance on these tasks? Specifically, we compare the saliency maps of RL agents against visual attention models of human experts when learning to play Atari games. Further, we analyze how hyperparameters of the deep RL algorithm affect the learned representations and saliency maps of the trained agents. The insights provided have the potential to inform novel algorithms for closing the performance gap between human experts and RL agents.
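
One generic way to compare an agent's saliency map with a human attention map is pixel-wise Pearson correlation. The sketch below is a simple illustration of such a comparison, not the paper's exact metric, and the function name is hypothetical.

```python
import math

def saliency_correlation(map_a, map_b):
    # Pearson correlation between two flattened saliency maps (lists of
    # per-pixel attention weights). Returns 1.0 for identical attention
    # patterns, -1.0 for opposite ones, and 0.0 for degenerate maps.
    n = len(map_a)
    ma = sum(map_a) / n
    mb = sum(map_b) / n
    cov = sum((a - ma) * (b - mb) for a, b in zip(map_a, map_b))
    va = math.sqrt(sum((a - ma) ** 2 for a in map_a))
    vb = math.sqrt(sum((b - mb) ** 2 for b in map_b))
    return cov / (va * vb) if va and vb else 0.0
```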

Publication 03
Conflict-Averse Gradient Descent for Multi-task learning

Peter Stone
Sony AI

Authors:
Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, and Qiang Liu

Abstract:
The goal of multi-task learning is to enable more efficient learning than single-task learning by sharing model structures for a diverse set of tasks. A standard multi-task learning objective is to minimize the average loss across all tasks. While straightforward, using this objective often results in much worse final performance for each task than learning them independently. A major challenge in optimizing a multi-task model is the conflicting gradients, where gradients of different task objectives are not well aligned, so that following the average gradient direction can be detrimental to specific tasks' performance. Previous work has proposed several heuristics to manipulate the task gradients to mitigate this problem, but most of them lack a convergence guarantee and/or could converge to any Pareto-stationary point. In this paper, we introduce Conflict-Averse Gradient descent (CAGrad), which minimizes the average loss function while leveraging the worst local improvement of individual tasks to regularize the algorithm trajectory. CAGrad balances the objectives automatically and still provably converges to a minimum of the average loss. It includes the regular gradient descent (GD) and the multiple gradient descent algorithm (MGDA) in the multi-objective optimization (MOO) literature as special cases. On a series of challenging multi-task supervised learning and reinforcement learning tasks, CAGrad achieves improved performance over prior state-of-the-art multi-objective gradient manipulation methods.
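
A toy two-task version of a conflict-averse update direction can be sketched as follows. This is an illustrative approximation, assuming the dual form d = g0 + (c·‖g0‖/‖gw‖)·gw with the convex weight found by a crude grid search; the function name, `c`, and `steps` are hypothetical knobs, and the paper's solver is more careful than this.

```python
import math

def cagrad_2task(g1, g2, c=0.5, steps=1000):
    # g1, g2: the two task gradients, as equal-length lists of floats.
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    norm = lambda a: math.sqrt(dot(a, a))
    g0 = [(a + b) / 2 for a, b in zip(g1, g2)]  # average gradient
    phi = c * norm(g0)
    # Grid-search the convex combination weight w that minimizes the
    # dual objective gw . g0 + c * ||g0|| * ||gw||.
    best_w, best_val = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        gw = [w * a + (1 - w) * b for a, b in zip(g1, g2)]
        val = dot(gw, g0) + phi * norm(gw)
        if val < best_val:
            best_val, best_w = val, w
    gw = [best_w * a + (1 - best_w) * b for a, b in zip(g1, g2)]
    ngw = norm(gw) or 1.0
    # Final direction: average gradient plus a conflict-averse correction.
    return [a + phi * b / ngw for a, b in zip(g0, gw)]
```

For any c < 1 the returned direction still has positive inner product with the average gradient, so it remains a descent direction for the average loss while tilting toward the worse-off task.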

Publication 04
Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning

Lingjuan Lyu
Sony AI and the University of Melbourne

Authors:
Xu Xinyi (National University of Singapore), Lingjuan Lyu* (Sony AI), Xingjun Ma (Deakin University), Chenglin Miao (University of Georgia), Chuan-Sheng Foo (Institute for Infocomm Research), Bryan Kian Hsiang Low (National University of Singapore)

Abstract:
Collaborative machine learning provides a promising framework for different agents to pool their resources (e.g., data) for a common learning task. In realistic settings where agents are self-interested and not altruistic, they may be unwilling to share data or model without adequate rewards. Furthermore, as the data/model the agents share may differ in quality, designing rewards which are fair to them is important so they do not feel exploited and discouraged from sharing. In this paper, we investigate this problem in gradient-based collaborative machine learning. We propose a novel cosine gradient Shapley to evaluate the agents' contributions and design commensurate rewards in the form of better models. Compared to existing baselines, our method is more efficient and does not require a validation dataset. We provide theoretical fairness guarantees and empirically validate the effectiveness of our method.
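
The role of gradient alignment in scoring contributions can be illustrated with a drastically simplified proxy: score each agent by the cosine similarity between its gradient and the aggregated gradient. The paper's actual reward scheme builds Shapley values from such cosine terms; the names below are hypothetical.

```python
import math

def cosine(u, v):
    # Cosine similarity between two gradient vectors (lists of floats).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def cosine_contributions(agent_grads):
    # Score each agent by the cosine between its gradient and the
    # aggregate gradient: agents whose updates align with the collective
    # direction score high; adversarial or noisy agents score low.
    agg = [sum(col) for col in zip(*agent_grads)]
    return [cosine(g, agg) for g in agent_grads]
```

Note this proxy needs no validation dataset, which echoes the efficiency point made in the abstract.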

Session Name and Date/Time:
Poster Session 5
Thursday, December 09, 2021
12:30 AM - 02:00 AM (PST)

Publication 05
Anti-Backdoor Learning: Training Clean Models on Poisoned Data

Lingjuan Lyu
Sony AI and the University of Melbourne

Authors:
Yige Li (Xidian University), Xixiang Lyu (Xidian University), Nodens Koren (University of Copenhagen), Lingjuan Lyu (Sony AI), Bo Li (University of Illinois at Urbana-Champaign), Xingjun Ma (Deakin University)

Abstract:
Backdoor attacks have emerged as a major security threat to deep neural networks (DNNs). While existing defense methods have demonstrated promising results on detecting and erasing backdoor triggers, it is still not clear whether measures can be taken to prevent the triggers from being learned into the model in the first place. In this paper, we introduce the concept of anti-backdoor learning, whose aim is to train clean models on backdoor-poisoned data. We frame the overall learning process as a dual task of learning the clean portion of the data and learning the backdoor portion of the data. From this view, we identify two inherent characteristics of backdoor attacks as their weaknesses: 1) the models learn backdoored data at a much faster rate than clean data, and the stronger the attack, the faster the model converges on backdoored data; and 2) the backdoor task is tied to a specific class (the backdoor target class). Based on these two weaknesses, we propose a general learning scheme, Anti-Backdoor Learning (ABL), to automatically prevent backdoor attacks during training. ABL introduces a two-stage gradient ascent mechanism into standard training to 1) help isolate backdoor examples at an early training stage, and 2) break the correlation between backdoor examples and the target class at a later training stage. Through extensive experiments on multiple benchmark datasets against 10 state-of-the-art attacks, we empirically show that ABL-trained models on backdoor-poisoned data achieve the same performance as if they had been trained on purely clean data. Code is available at https://github.com/bboylyg/ABL.
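
The first weakness, that backdoored data is learned much faster than clean data, suggests isolating suspiciously low-loss examples early in training, which is the spirit of ABL's first stage. A minimal sketch (the function name and the `isolation_rate` knob are illustrative assumptions, not the paper's exact procedure):

```python
def isolate_low_loss(losses, isolation_rate=0.01):
    # Flag the fraction `isolation_rate` of training examples with the
    # lowest per-example loss as suspected backdoor samples, exploiting
    # the observation that poisoned examples are fit far faster than
    # clean ones. Returns a set of suspected indices.
    k = max(1, int(len(losses) * isolation_rate))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return set(ranked[:k])
```

In ABL the examples isolated this way are later trained with gradient ascent to break their link to the target class; this sketch covers only the isolation step.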

Session Name and Date/Time:
Poster Session 5
Thursday, December 09, 2021
12:30 AM - 02:00 AM (PST)

Publication 06
Exploiting Data Sparsity in Secure Cross-Platform Social Recommendation

Lingjuan Lyu
Sony AI and the University of Melbourne

Authors:
Jamie Cui (Ant Group), Chaochao Chen (Ant Group), Lingjuan Lyu (Sony AI), Carl Yang (Emory University), Li Wang (Ant Group)

Abstract:
Social recommendation has shown promising improvements over traditional systems since it leverages social correlation data as an additional input. Most existing works assume that all data are available to the recommendation platform. However, in practice, user-item interaction data (e.g., ratings) and user-user social data are usually generated by different platforms, both of which contain sensitive information. Therefore, how to perform secure and efficient social recommendation across different platforms, where the data are highly sparse in nature, remains an important challenge. In this work, we bring secure computation techniques into social recommendation and propose S3Rec, a sparsity-aware secure cross-platform social recommendation framework. As a result, S3Rec can not only improve the recommendation performance of the rating platform by incorporating the sparse social data on the social platform, but also protect the data privacy of both platforms. Moreover, to further improve model training efficiency, we propose two secure sparse matrix multiplication protocols based on homomorphic encryption and private information retrieval. Our experiments on two benchmark datasets demonstrate that S3Rec improves the computation time and communication size of the state-of-the-art model by about 40× and 423× on average, respectively.

Session Name and Date/Time:
Poster Session 4
Wednesday, December 08, 2021
04:30 PM - 06:00 PM (PST)

Publication 07 (Deep RL Workshop)
Expert Human-Level Driving in Gran Turismo Sport Using Deep Reinforcement Learning with Image-based Representation

Ryuji Imamura¹, Takuma Seno², Kenta Kawamoto², and Michael Spranger²
1) Sony AI Intern, 2) Sony AI

Abstract:
When humans play virtual racing games, they use visual environmental information on the game screen to understand the rules of the environment. In contrast, a state-of-the-art realistic racing game AI agent that outperforms human players does not use image-based environmental information but rather the compact and precise measurements provided by the environment. In this paper, a vision-based control algorithm is proposed and compared with human players' performance under the same conditions in realistic racing scenarios using Gran Turismo Sport (GTS), which is known as a high-fidelity realistic racing simulator. In the proposed method, the environmental information that constitutes part of the observations in conventional state-of-the-art methods is replaced with feature representations extracted from game screen images. We demonstrate that the proposed method performs expert human-level vehicle control under high-speed driving scenarios even with game screen images as high-dimensional inputs. Additionally, it outperforms the built-in AI in GTS in a time trial task, and its score places it among the top 10% of approximately 28,000 human players.

Publication 08 (Offline RL Workshop)
d3rlpy: An Offline Deep Reinforcement Learning Library

Takuma Seno
Sony AI and Keio University

Abstract:
In this paper, we introduce d3rlpy, an open-source offline deep reinforcement learning (RL) library for Python. d3rlpy supports a number of offline deep RL algorithms as well as online algorithms via a user-friendly API. To assist deep RL research and development projects, d3rlpy provides practical and unique features such as data collection, exporting policies for deployment, preprocessing and postprocessing, distributional Q-functions, multi-step learning, and a convenient command-line interface. Furthermore, d3rlpy provides a novel graphical interface that enables users to train offline RL algorithms without writing code. Lastly, the implemented algorithms are benchmarked with D4RL datasets to ensure implementation quality.

Session Name and Date/Time:
2nd Offline Reinforcement Learning Workshop
Tuesday, December 14, 2021
