Pursue compact and high-performance AI
We are developing technology that accurately recognizes users’ natural speech amid background noise and reverberation. Our focus is on improving the real-world performance of audio signal processing and speech recognition technologies. We use deep learning to optimally integrate audio signal processing and speech recognition. This enables advanced speech recognition in unfavorable conditions, such as when there is mechanical noise from robotics. By tailoring these optimizations to specific devices and use cases, we aim to make the technology thoroughly user-friendly.
We are developing Spoken Language Understanding technology to understand user utterances. This technology converts the text strings produced by speech recognition into machine-understandable information (a semantic representation). Our models are based on various linguistic phenomena such as disfluencies and abbreviations, as well as on a semantic database that links spoken language with the real world. To further the understanding of natural language itself, we are developing Natural Language Processing technology that analyzes text. This process involves tokenizing, assigning parts of speech and semantic attributes, and parsing the structure. We are also developing Knowledge Information Processing technology, which is applied to language disambiguation.
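As a rough illustration of converting recognized text into a semantic representation, the following sketch removes disfluencies, expands abbreviations, and fills a simple intent/slot frame. The filler list, abbreviation table, intent keywords, device list, and the functions `normalize` and `understand` are all hypothetical and invented for this example; it is a rule-based toy, not the models described above.

```python
import re

# Hypothetical resources, invented for this illustration only.
FILLERS = {"um", "uh", "er"}
ABBREVIATIONS = {"tv": "television", "temp": "temperature"}
INTENT_KEYWORDS = {"turn on": "POWER_ON", "turn off": "POWER_OFF"}
DEVICES = {"television", "light", "radio"}


def normalize(text):
    """Lowercase, tokenize, drop filler words, and expand abbreviations."""
    tokens = re.findall(r"[a-z']+", text.lower())
    tokens = [t for t in tokens if t not in FILLERS]
    return [ABBREVIATIONS.get(t, t) for t in tokens]


def understand(text):
    """Map a recognized text string to a toy intent/slot frame."""
    tokens = normalize(text)
    sentence = " ".join(tokens)
    # Pick the first intent whose key phrase occurs in the cleaned sentence.
    intent = next(
        (label for phrase, label in INTENT_KEYWORDS.items() if phrase in sentence),
        "UNKNOWN",
    )
    # Pick the first token that names a known device, if any.
    device = next((t for t in tokens if t in DEVICES), None)
    return {"intent": intent, "slots": {"device": device}}


frame = understand("Um, turn on the, uh, TV")
print(frame)
```

Here the disfluencies "um" and "uh" are discarded and "TV" is expanded before the frame is filled, so the downstream logic only ever sees normalized tokens.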
Deep Learning is a machine-learning technology that allows users to create AI models that “recognize and predict” and “generate and transform” any data through the provision of training data. The R&D deep learning research team has been working on fundamental technologies including large-scale training, model compression, few-shot learning, generative modeling, and neural rendering. The outcomes of these research efforts are integrated into the Neural Network Console, a GUI development tool, and the Neural Network Libraries, which are open-source libraries, to actively contribute to the advancement of the AI field. At Sony, these technologies are used not only in electronics products featuring AI technology, but also in the entertainment business, such as movies, music, and games, as well as in new types of business.
Behavior Learning, including but not limited to deep reinforcement learning, is technology that enables an autonomous system to learn optimal behavior through its own trial-and-error experience. We aim to develop these technologies for planning actions in environments that are too complex or varied for humans to deal with, and for online optimization and control mechanisms that adapt effectively to environments that vary more than anyone could anticipate in advance. We aim to apply this technology in robotics, including both navigation and manipulation, and in gaming AI. We are also proactively working on joint research projects with overseas universities and laboratories that are utilizing cutting-edge technologies.
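To make the idea of learning from trial and error concrete, the following is a minimal sketch of tabular Q-learning, a much simpler relative of the deep reinforcement learning mentioned above. The five-cell corridor environment, the reward scheme, and the hyperparameter values are assumptions made up for this example.

```python
import random

N_STATES = 5          # corridor cells 0..4; the goal is cell 4 (assumed toy task)
ACTIONS = (-1, +1)    # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # illustrative hyperparameters


def step(state, action):
    """Move within the corridor; reward 1.0 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=300, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random sometimes, and always on ties.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update toward reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy action for each non-goal cell after training.
policy = [ACTIONS[0 if left > right else 1] for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct; the preference emerges from the rewards it encounters while exploring.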
We are developing an Agent Platform that understands user utterances and text input, combines audio and visual representations, and responds with animated character representations. This Agent Platform uses multiple technologies including speech recognition, image recognition, spoken language understanding, interactive response generation, and visual expression. We are also building development tools, SDKs, and cloud systems for agent applications. Applications are being used in the entertainment field, such as animation and movies; in the financial field, such as life plan support; and in the B2B field, such as store and office reception and guidance.
Each business unit within the Sony Group is striving to create new customer value and new business through the use of data. The Sony Group is involved in many different businesses, including entertainment, electronics, network services, and finance, and therefore generates huge amounts of widely varying data on a daily basis. We are involved in research to develop new machine-learning technologies and analysis platforms that allow us to make effective and easy use of this data. For example, we are currently developing Prediction One, which allows users to easily perform predictive analysis without being experts in either machine learning or statistics, as well as causal inference technology for personalization in direct-to-consumer (DTC) services. Using these core technologies and analysis platforms, we will be able to bring these advanced technologies to market.
AI functions based on deep learning have been widely utilized in many types of products and services. However, the training time required for deep learning increases every year because realizing more-advanced AI functions requires “more and more training datasets” and “ever-larger models for training.” Therefore, we are developing the technologies required for the Sony-dedicated supercomputer (GAIA), which we are currently constructing. In particular, we are focusing on three technologies: “AI-Optimized Architecture” based on the latest hardware, “Resource Management,” which aims to get the most out of the available hardware resources, and “Large-Scale Deep Learning,” which shortens training time dramatically through the use of multiple processors such as GPUs. Our aim is the ultra-acceleration of AI development and the creation of new value through the use of more advanced AI, while also integrating the “sData” training data sharing system.
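One common way to use multiple processors for training is data parallelism: each worker computes a gradient on its own shard of the data, and the gradients are averaged before every update. The sketch below simulates this in a single process, assuming a one-parameter linear model and equally sized shards. The function names, the dataset, and the hyperparameters are hypothetical; nothing here describes how GAIA itself is built.

```python
def gradient(w, shard):
    """Mean gradient of squared error for the model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for an all-reduce collective: every worker receives the mean."""
    return sum(values) / len(values)


def train(data, n_workers=4, lr=0.01, steps=200):
    # Split the data into equally sized shards, one per simulated worker.
    shards = [data[i::n_workers] for i in range(n_workers)]
    w = 0.0
    for _ in range(steps):
        # On real hardware these gradients would be computed in parallel.
        local_grads = [gradient(w, shard) for shard in shards]
        w -= lr * all_reduce_mean(local_grads)
    return w


data = [(float(x), 3.0 * x) for x in range(1, 9)]  # samples of y = 3x
w = train(data)
print(round(w, 3))
```

Because the shards are the same size, the averaged gradient equals the full-batch gradient, so the parallel version follows the same optimization path as single-worker training while dividing the per-step work among workers.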
Explainable AI is a technology that visualizes the bases of judgment in machine learning, which otherwise appears to humans as what is termed a “black box,” so that humans can understand it. The goal of these efforts is to create AI that can be trusted by humans. Machine learning has seen remarkable progress in recent years, with innovation apparent in every field. On the other hand, there are ethical concerns that AI may harm humans. Therefore, we are undertaking research and development on explainable AI to address issues of fairness, accountability, and transparency in machine learning through technology. Through the use of explainable AI, we are aiming to develop a form of AI that allows humans to understand why it makes particular decisions and to control those decisions.
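One widely used, model-agnostic way to expose the basis of a model's judgment is permutation feature importance: shuffle one input feature and measure how much the prediction error grows. The sketch below is offered only as an example of the general idea, not as the method used in the research described above. The `black_box` model and the synthetic data are hypothetical.

```python
import random


def black_box(row):
    """Hypothetical opaque model: strong use of feature 0, weak use of 1, none of 2."""
    return 5.0 * row[0] + 0.5 * row[1]


def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, seed=0):
    """Increase in error when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    scores = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # Rebuild every row with feature j replaced by its shuffled value.
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        scores.append(mean_squared_error(model, shuffled, targets) - baseline)
    return scores


data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
targets = [black_box(r) for r in rows]
scores = permutation_importance(black_box, rows, targets)
print([round(s, 3) for s in scores])
```

A feature the model ignores scores zero, while a feature it relies on heavily scores high, giving a human-readable ranking without opening up the model itself.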