Cutting Edge

What it will take for Sony to become No. 1 in the AI field

May 29, 2019

In November 2018, Sony announced that it had achieved the world’s fastest time* for distributed deep learning. This was made possible by using the Neural Network Libraries deep learning framework developed by Sony and the AI computing infrastructure called AI Bridging Cloud Infrastructure (ABCI) constructed and operated by Japan’s National Institute of Advanced Industrial Science and Technology (AIST). We sat down for a talk with the two Sony researchers and the one ABCI developer at AIST who worked together to achieve this feat.

*As of November 13, 2018 (Sony survey)


  • Hiroaki Mikami

    Core System Development Department
    System and Platform Technology
    Development Division 1
    R&D Center, Sony Corporation

  • Yuichi Kageyama

    Core System Development Department
    System and Platform Technology
    Development Division 1
    R&D Center, Sony Corporation

  • Hirotaka Ogawa

    Team Leader
    Artificial Intelligence
    Research Center
    National Institute of
    Advanced Industrial Science
    and Technology (AIST)

Why Sony and AIST are collaborating

──What led to the collaboration between the National Institute of Advanced Industrial Science and Technology (AIST) and Sony Corporation?

Yuichi Kageyama:Yoshiyuki Kobayashi, an evangelist for Neural Network Libraries (NNL) at Sony, introduced us to Dr. Satoshi Sekiguchi, Vice President and Director General of the Department of Information Technology and Human Factors, AIST and suggested that we look into the potential to use ABCI.
It just so happened that, at that time, we wanted to advance our research and development by using a supercomputer with world-class computing and data processing capabilities like those of ABCI. So we immediately met with Ogawa-san, AIST’s research team leader, and tested the waters by saying, “We would love to do some research with you!”

──What kind of challenges were you facing in the previous environment?

Hiroaki Mikami:Originally, we envisioned using a cloud-based IT infrastructure service (cloud service). However, while a cloud service can provide a network that is stable enough, it is less effective when you want higher performance or larger scale. For example, a cloud service can deliver network delay on the order of milliseconds, and if you only need a few dozen GPUs, that is no problem. However, if you need faster speed or more GPUs, that is quite difficult to achieve with a cloud service. That was the appeal of ABCI: its network delay is on the order of microseconds, and it has the greatest number of usable GPUs in Japan.
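To make the millisecond-versus-microsecond point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a simple ring all-reduce, in which gradients pass through 2 × (N − 1) sequential hops among N GPUs, and it counts only the latency term, ignoring bandwidth. The ring algorithm, the 1 ms cloud figure, and the function name are illustrative assumptions, not a description of what Sony or AIST actually ran; the 2.5 microsecond figure is the ABCI latency quoted later in this interview.

```python
def ring_allreduce_latency_s(num_gpus: int, per_hop_latency_s: float) -> float:
    """Latency-only cost of one ring all-reduce (hypothetical model).

    A ring all-reduce over N workers takes 2 * (N - 1) sequential
    communication steps, so the latency term grows linearly with N.
    Bandwidth and compute time are deliberately ignored here.
    """
    return 2 * (num_gpus - 1) * per_hop_latency_s


CLOUD_LATENCY_S = 1e-3    # assumption: "order of milliseconds"
ABCI_LATENCY_S = 2.5e-6   # ABCI figure quoted later in the interview

for num_gpus in (32, 4352):  # "a few dozen" GPUs vs. all of ABCI
    cloud = ring_allreduce_latency_s(num_gpus, CLOUD_LATENCY_S)
    abci = ring_allreduce_latency_s(num_gpus, ABCI_LATENCY_S)
    print(f"{num_gpus:>5} GPUs: cloud {cloud:.6f} s, ABCI {abci:.6f} s")
```

Under these assumptions, the latency cost with a few dozen GPUs is small either way, but at 4,352 GPUs a millisecond-class network would spend roughly 8.7 seconds per all-reduce on latency alone, against roughly 22 milliseconds at 2.5 microseconds per hop. Real systems use smarter communication patterns, so treat this only as an illustration of why latency starts to dominate as the GPU count grows.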

Kageyama:When it comes to the development of AI, we have to try everything. It is essential to employ a lot of GPUs to accelerate development. That is why using a supercomputer makes sense: it makes it possible to use many GPUs. However, that had not been an option for us to even consider because we didn’t have access to one, and the possibility of getting access to a supercomputer such as ABCI anytime soon didn’t even cross our minds. Then, several years ago, the fields of high-performance computing (HPC) and AI began to merge at an accelerating pace. That finally presented us with a chance to get access to ABCI and to work with AIST’s HPC experts, so we quickly made the request when that door opened.

So you expected to achieve the world’s fastest time?!

──What did you think, Ogawa-san, when you received the inquiry for the joint research?

Hirotaka Ogawa:I knew of NNL, which Sony had developed, but not many researchers were using this machine learning tool, so I was a little anxious about the idea.

But after listening to Kageyama-san, I downloaded the NNL code and tried it out, and I realized that it was very good. I was able to confirm for myself, once again, that the level of Sony software is quite high, and with that I knew we could expect good results.

Kageyama:So, compared to proven machine learning tools, it seems that you had your doubts about whether Sony’s software was really usable or not, right? In the field of AI, I believe Sony still has much to do. We want to be No. 1 in AI technology development, in Japan at least. As we aim to be in the global top-class in AI, it will be essential for us to actively collaborate with outside parties. So, in that sense, this joint research activity was a huge step forward.

──In the 2nd ABCI Grand Challenge (October 2018) program, Sony came out of nowhere to achieve the world’s top speed. Going into that challenge, did you have any conviction or confidence that Sony would be able to produce the world’s top speed?

Mikami:Well, not everything worked out as we had expected, but yes, we had thought we were capable of doing what we did, which was 3.7 minutes.

Kageyama:At the 1st ABCI Grand Challenge (July 2018), we were delighted to achieve a record of 10 minutes, but right after that, a Chinese IT company published a paper reporting a time of 6.6 minutes, which got us down. Then we achieved the world’s fastest time of 3.7 minutes. After that, another company achieved 2.2 minutes, and then 1.8 minutes, and we in turn got down to 2.0 minutes in January 2019.
Honestly, since Sony’s scale of investment in AI is completely unlike that of a major IT company, I think that it is great that we are now able to keep pace with them.

4,352 GPUs, 2.5 microsecond latency

──Could you provide an overview of ABCI?

Ogawa:To accelerate the introduction of cutting-edge AI technology into R&D and industry, three things are critical: deep learning algorithms, an enormous amount and variety of real-world big data, and the high-performance computing power to combine the two, especially for machine learning processing. ABCI is a large-scale cloud computing system that brings all three of these elements together to provide a platform for promoting open innovation. AIST started development of the system in 2016 and officially launched operation in August 2018.
ABCI is located at AIST’s Kashiwa research center, which is near the Tsukuba Express railway’s Kashiwanoha-campus Station. It is a computing system that consists of 1,088 servers, each with 2 CPUs and 4 GPUs for a total of 4,352 GPUs. As Mikami-san touched on earlier when he talked about latency, ABCI not only has many servers equipped with GPUs, but all the servers and all the GPUs are configured in a way to achieve communication within a latency of 2.5 microseconds or less.
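As a quick sanity check on the figures above, the totals can be worked out in a few lines of Python. The per-server counts come from the interview; the variable names are only illustrative.

```python
# Figures quoted in the interview for ABCI.
SERVERS = 1088
CPUS_PER_SERVER = 2
GPUS_PER_SERVER = 4

# Totals across the whole system.
total_cpus = SERVERS * CPUS_PER_SERVER
total_gpus = SERVERS * GPUS_PER_SERVER

print(f"Total CPUs: {total_cpus}")
print(f"Total GPUs: {total_gpus}")
```

The GPU total matches the 4,352 stated in the interview, and the same arithmetic gives 2,176 CPUs.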

──What is your primary motivation?

Ogawa:I’m crazy about playing with this big toy called ABCI! All joking aside, though, my job is to help produce top-notch results using ABCI. That means helping to scale up the technological seeds being researched by AIST, various research institutes, universities, and companies into real-world solutions, and making this powerful tool available to various industries in a variety of ways.
At present, ABCI has been used in more than 100 projects by AIST, public research institutes, companies, and universities, and the ABCI user base has reached 500 to 600 people. However, I am not satisfied yet. In the future, I would like to see the number of users increase five-fold.

AI is already being used in almost every case

──What fields will Sony develop in the future by using deep learning?

Kageyama:In terms of research and development, we do various things such as one-shot learning and transfer learning. In the product application area, there are more and more Sony products and services that utilize AI such as aibo, the real estate price estimation engine, Xperia Ear’s gesture recognition, and digital paper handwriting input.

Mikami:Shortening learning time helps to make a good model quickly. Products and services will be individually optimized for people more and more in the future, so I hope our research activities will help bring these things to society more quickly.

──In the case of Sony, I am excited to think of the possibilities that AI could bring to the realm of entertainment….

Kageyama:That is something I still can’t comment on. However, yes, since Sony is a creative entertainment company, I think it is natural that we should implement and make good use of AI in all sorts of businesses, whether in entertainment or finance.

Where can I find a deep neural network expert?!

──What future benefits do you think there will be for AIST in the present collaboration with Sony in the field of AI?

Ogawa:I think there are various levels of benefits we will see. Kageyama-san said that he thought supercomputing was out of his reach, and at AIST we don’t have so much knowledge of the deep learning processing that Sony excels in, although we have accumulated a lot of technology regarding supercomputing and are fluent with the ABCI system itself. So, just coming together like this and complementing each other’s technological strengths is very important in itself. In addition, Sony has expertise in the application domain. So, going forward, I hope to see Sony and AIST share knowledge and promote research and development together in this domain as well.

──One of the things often said about AI is that it can analyze data, but that it still takes humans to interpret the results. What is your opinion of the human-machine relationship of this kind, Kageyama-san?

Kageyama:Basically, that topic falls in the domain of explainable AI, and it is a frequent topic within Sony as well. Whenever AI produces a result, the question often is, “What is the justification for those results?” For example, if AI makes a financial prediction, it is natural for there to be questions such as, “Why is that so?” Obviously, nobody is going to be persuaded by an answer like, “That is the result of learning.” So, how we interpret the AI results remains a matter of research and development. In fact, I think that the more we try to address safety and security issues, the more there will be a need for being accountable for such AI results. So, that is a topic that is being researched internally as well.

──Are you recruiting new people?

Mikami:I would very much like to find someone who has expert skills in deep neural networks (DNNs).

Kageyama:The competition for people in this area is very fierce. So, I think it is important to increase Sony’s AI presence by building on achievements like this one with ABCI, one at a time. We have implemented services including our Neural Network Console cloud service, and we just need to keep steadily moving forward with our strategies. In addition, it may be necessary to more actively disclose Sony’s achievements in the AI field, including those related to products and services. Since researchers see the essence of things, I don’t think spending money on promotions and such will necessarily bring in new talent. Rather, I think we can better attract new talent by showing how appealing it is to do research here at Sony, since we have such a broad range of products and services, from electronics to entertainment to finance, and so on.

──I’d like to ask you, Ogawa-san, what do you think is the key to getting good talent to come to Japan?

Ogawa:In the AI industry, especially in the research field, the mobility of talent is a given. From experience, we know that it is quite difficult to get researchers who have just earned degrees in the United States to move to Japan. So, I spend more time and effort trying to get researchers, especially less experienced researchers from Europe, India, and China, to come here. In doing so, it is important to give such recruits tasks that are appealing. There is not much we can do about researchers leaving after having acquired knowledge at AIST. I often comfort myself by saying, “Well, at least we fulfilled our mission as a public organization to help develop talent.”

Kageyama:Speaking of which, Mikami-san is in his second year with Sony. His original area of expertise was something else.

Mikami:In school, I was working on compilers and programming languages. My specialty is human-computer interaction, or UI.

Ogawa:I see.

Kageyama:Mikami-san produced wonderful results despite facing the challenge of taking on a new field. What’s great about the field of AI is that we can produce innovation by gathering together a variety of talented people. That is, even people who have completely different areas of expertise, if tasks are set appropriately and they progress toward goals, can rise to the top level. This is the kind of approach I want to incorporate.

──Lastly, please tell us about what you found most interesting during this collaboration with AIST.

Mikami:It was exciting that I got to quickly and directly get my work out into the world. That was a real pleasure.

Kageyama:It was fun to simply have the chance to talk with people outside the company. In doing so, I realized just how many different ideas are out there and how much more I need to grow.
Speaking of which, I was surprised to hear people say things like, “You Sony people are always on time!” I had thought Sony was better known for being quite loose about punctuality. Anyhow, just being able to experience different cultures was fascinating for me.
