Envisioning the Future Created by Sony’s AI
— Artificial Intelligence That Assists Creativity

We live in an age where artificial intelligence is already used in many ways — not just in smart speakers but also in processes such as personnel recruitment and investment management. What is happening at the forefront of AI research? What course should Sony set for the future? We sat down with two of Sony's most distinguished talents in the field of AI, Hiroaki Kitano and Masahiro Fujita, to explore these questions.


  • Hiroaki Kitano

    Senior Vice President
    Sony Corporation
    President & Chief Executive Officer
    Sony Computer Science Laboratories, Inc.

  • Masahiro Fujita

    Senior Chief Researcher of
    AI Collaboration Office,
    Sony Corporation

Input and output are now being linked

──In 2005, the inventor and futurist Ray Kurzweil introduced the idea of technological singularity. Now, 14 years later, what are the themes and issues at the forefront of AI research?

Hiroaki Kitano: Image recognition, and recognition and classification systems in general, have progressed a great deal and are entering the deployment phase. In fact, various open-source deep learning tools and user-friendly deep learning services such as Google's Cloud AutoML are now available. Many people are already using “AI” in the form of machine learning.

Also, natural language processing has evolved remarkably. Google is leading the way here as well. It has announced an advanced natural language processing technique called BERT, as well as technology that makes it possible not only to translate language but to do so while reproducing the voice and mood of the speaker. The next challenge is the practical application of these technologies, and an even more difficult challenge is achieving smooth, continuous dialogue.

More and more new projects are emerging at the academic level that demonstrate possible uses of such technologies. However, when it comes to implementation in products and services, some efforts are going very well, while others are not living up to expectations. This is because not everything can be done with deep learning.

Masahiro Fujita: I think deep neural network research started in the late 2000s, and it was around 2010 that it became quite familiar to researchers. From 2014 or so, it was reported that visual image recognition technologies exceeded human abilities in some situations. The rapid progress of such recognition technologies owes a great deal to major advances in computer technology, such as the capability to process massive amounts of labeled training data and techniques for avoiding overtraining of neural networks. The next approach to appear was the one used in AlphaGo, where deep neural networks and reinforcement learning were combined to learn interaction strategies.

Then, lately, generative models using deep neural networks have been emerging. They have the ability to draw pictures, create music, and synthesize speech. Using adversarial training has made it possible to create very realistic images and speech.

The significance of AI capacity to make scientific discoveries

──Kitano-san, we have heard that you will be the keynote speaker at the International Joint Conference on Artificial Intelligence, IJCAI 2019. What will you talk about?

Kitano: I am planning to speak on the integration of artificial intelligence and systems biology, focusing especially on integrating AI and robotics to autonomously make scientific discoveries worthy of a Nobel Prize or beyond. I would say this theme is seriously important not only for me but also for society, and I hope this challenge will be met in the next 20 to 30 years, particularly in the fields of life sciences and medicine.

I think the life sciences are a field that we human beings are not very good at. The amount of information is massive, and the research targets are highly non-linear and extremely high-dimensional. Their dynamics are very complex and hard to understand, and human cognition is not good at grasping such things. By the way, 2 million scholarly articles are published every year in the biomedical field. Even if we look only at important topics such as oncology or the immune system, several hundred new articles are published every day. In other words, the volume completely exceeds human information-processing capacity.

That is where I want to develop AI for scientific discovery. Even now, the life sciences are making great progress with iPS cell therapies and new anti-cancer drugs being created, but I think that the life sciences will make an even more dramatic leap forward by incorporating AI.

As one of the challenges in the field of AI and systems biology, I am advocating the Nobel Turing Challenge. The idea is that, if a highly autonomous AI scientist is developed, the Nobel Prize Committee will not be able to tell whether a human being or an AI was responsible for a particular scientific discovery. For example, Satoshi Nakamoto is said to be the founder of Bitcoin and blockchain, but no one really knows who he is. Yet we assume it to be somebody, somewhere, not an AI. This is only because we all know that AI has not reached that level yet. And if a Nobel Prize in Economics were awarded in the field of blockchain or virtual currency, Satoshi Nakamoto should without doubt receive it. There is a growing chance that something similar will happen in the future, except that the discovery will actually have been made by an AI system.

Another example was Nicolas Bourbaki in the mathematical world of the 1930s. Everyone thought that there was a person named Nicolas Bourbaki, but it turned out that it was actually a virtual personality that a group of mathematicians created. In that same sense, I am hoping that we can quickly arrive at that level of AI, where you would think at first it was something a human being did, when in fact it was AI.

Making AI systems that can carry out scientific discovery means creating a machine that can generate knowledge. In practice, this has not been realized yet. Once AI can produce knowledge, it can accelerate that production and fill society with great discoveries on a daily basis. Then, if that knowledge is used to advance technologies to the next step, civilization will probably enter a new phase, one in which AI accelerates autonomously, even beyond human hands.

──If it comes to that level, what kind of AI will it be?

Kitano: There are two possibilities. One is a single AI agent that can access everything and act as a super scientist. The other is a group of various AI agents, each of which specializes in its own field of expertise and is connected with the others, so that they expand their knowledge organically.

Basically speaking, human scientists have a history of studying one thing and then another, eventually combining their findings to arrive at a discovery. To make this process happen with AI, a question arises: “Is it better to have a lot of AI agents working together, or should we create a super AI that can access everything?” It is not yet clear which way would lead more quickly to a historic discovery. I tend to think, however, that it would be better to have a variety of AI agents. It is a very interesting thing to consider.

There is also an approach of not aiming for huge discoveries, but rather checking all the possibilities using not-so-difficult experiments for a period of a year or so, which is exactly what machines are good at.

What is interesting about science is that we don't know where the important discoveries are waiting. Recently, as you might have heard, there was news about the mass culture of hematopoietic stem cells using a component of liquid glue. The substance was found after 15 years of research. The story is interesting, but my point is that it wasn't an accident that they thought of liquid glue; it was the result of exhaustive experimentation.

It is often the case that we can obtain a lot of findings just by doing all the possible not-so-complicated experiments, and at times something very important is revealed in the process. The interesting thing about science is serendipity, but when we humans intervene, we tend to lean toward what we believe is important. Since machines do not have such a bias, I think they will make great discoveries in areas that humans would normally not attempt.

──Fujita-san, what do you think about the question of a super-scientist AI versus diversified AI?

Fujita: My own preference is for multiple agents, each with narrow but deep knowledge, acting in an integrated manner; I think that approach might be more likely to succeed. It is difficult to develop a single agent that can address a whole field from the beginning, and besides, learning a wide variety of fields makes it hard for the learning itself to converge. So I think it will be better to have many AI agents, each specializing in a certain field and integrated into the overall effort.

When we discuss how to build a dialogue agent, an example that comes up is developing two different agents, one that knows a lot about curry and another that knows a lot about ramen, versus developing a single agent that knows about all of those foods. Developing the former is not so difficult, but the latter is quite hard.

In that respect, when there are multiple agents having special knowledge and ability in their respective domains and speaking in their own special way, it might be possible to realize an AI agent that can make a scientific discovery, especially if we bring them together in an incremental (gradually growing) way.

Why is Sony heading toward food?

──I heard that you will hold a workshop pertaining to food at IJCAI. What is the purpose of that?

Fujita: I’d like to propose introducing AI as a cooking tool in the area of food. For any cuisine, whether Japanese or Basque, there are rules that have been cultivated over a long history. In cooking, combinations of various ingredients, seasonings, and cooking methods are used in accordance with those rules, but a new flavor, meaning taste and aroma, can be created by changing these elements and trying new combinations. For example, raspberry and seaweed share the same dominant molecular structure of taste and aroma, so a raspberry-and-seaweed roll will be very delicious. It is already known that combining elements within certain constraints can produce delicious flavors.

With its very extensive range of ingredients, cooking is in a sense a chemical process. So the question of what results the exploration of cooking will yield is an attractive theme for the field of artificial intelligence.

Kitano: The Science & Cooking World Congress 2019 was held in Barcelona in March of this year. Sony conducted a session and also hosted a dinner there. By the way, the opening talk of this event was by Ferran Adrià, the owner of El Bulli, which has become very famous for molecular gastronomy.

Looking at the discussions that came out of the conference, not only did they address tastes that are known to be good by experience, they also really made headway into science, and I felt there are many areas where we as Sony can make a contribution.

Barcelona and San Sebastian are kind of the Silicon Valley of food. They are experimenting with a lot of things, and the chefs who used to work at El Bulli now seem to have their own restaurants and labs and ateliers in which they are doing all sorts of experiments. All this is resulting in great recipes that make their way to restaurants. It really seems like they are truly practicing science and food in earnest.

Fujita: The other day I talked with a chef from San Sebastian. He told me that he runs a restaurant and is always busy with day-to-day operations. So he takes four months off each year: one month is an actual vacation, and the remaining three months are spent trying to come up with new dishes. Apparently he needs to do that in order to get into a creative frame of mind. The mindset and mood for taking on the challenge of creating novel, delicious dishes are completely different from those for cooking delicious plates amid the busy daily operations of a restaurant. I think that AI can help such chefs in their creative pursuits.

The moment it becomes a trolley problem, you’ve already lost?!

──When we talk about AI, the trolley problem and frame problem are commonly cited. What concerns do you think there are in AI from an ethical point of view?

Fujita: I think from Sony's view, the point is to understand what kind of social issues the AI integrated in Sony products and services could potentially cause.

It is generally known that the accuracy of face detection and recognition may vary depending on skin color or gender. Also, when an AI system is used for a company's recruiting, the data may lean toward men, which can introduce a certain bias. Such results from the use of AI carry a risk of unintentional differences in treatment by gender.

A deep neural network is a black box. So, if you simply use it as a black box, the output is likely to be affected by certain secondary biases, even if you feed it proper data and believe you have obtained a good result. Therefore, it is critical for Sony to fully recognize this fact and deliver AI systems that can be accepted by society, while making every effort to assure our customers that there are no such problems.

As for the trolley problem, it is a difficult issue for even humans to think about, although it is an interesting topic to discuss.

──Kitano-san, what are your thoughts on how to eliminate biases?

Kitano: I think it will not work well unless we apply an engineering-oriented approach to the process. An approach of saying, "Well, we tried it and this is what we got," surely does not work. We need to make a system for designing unbiased AI systems, including establishing quality standards for the process of creating AI. In that sense, I think that industrialization and engineering of machine learning, in particular, will be a very important process.

As for the trolley problem, from my point of view, the moment you face a trolley problem, you have already lost. I mean, if the track ahead is blocked by pedestrians, whether it is the main track or a side track, you should have slowed the vehicle down earlier. In situations of poor visibility, such as fog or a sharp curve, you need all the more to run the vehicle at reduced speed and be prepared to stop at any time.

The trolley problem is quite interesting as a philosophical question, but in reality you must not let things get that far. Obviously a solution needs to be implemented before things reach that point.

──Kitano-san has mentioned an engineering-oriented approach to address bias. Is such research already underway?

Fujita: There are organizations emerging to discuss or work on those approaches. Of course, we are aware that we need to be careful in this regard. As you know, Sony and others in the home appliance industry have established a rigorous quality assurance system. Be it hardware or software, we already have a process to confirm performance against established quality and safety standards. Now, as we integrate AI into this process, it is imperative that we recognize what is unique about AI. That will be the key point.

AI that assists creativity

──Speaking of AI that is unique to Sony, I’m sure the entertainment area is the first thing that comes to mind. What kind of results has Sony's AI produced in this area?

Fujita: For music, for example, Sony Computer Science Laboratories (Sony CSL) has developed an AI agent called Flow Machines, which composes music automatically. However, Sony's stance is not to replace humans with completely automated music composition. Rather, we want to make a system that assists music creators. This project is all about AI and creators interacting with each other, and the end goal is to help the creator make great music.

It is the same with cooking. Just as in entertainment, the chef is given options based on various search results and hypotheses, and selects what they like based on their human sensibilities. It is the same for movie production. There will always be a need for such systems to bring out creativity.

Kitano: Flow Machines is a project led by Sony CSL Paris, and a video showing how Flow Machines made a Beatles-like song trended on YouTube about three years ago. That was interesting in that it showed AI was capable of such a thing, but in practical terms, an artist won’t sell any music by saying, "Hey everyone, I made a Beatles-like song!" Obviously, it has to be that artist's own song. So our concept now is to use AI to assist artists in creating their own original songs interactively. For example, the artist enters their own chord progressions and style data. Flow Machines can then help produce music that is in that style but that the artist would not have thought of. That is the major change we have made. It is human-centric, supporting artists in exercising their creativity. AI is capable of producing pretty interesting things that a human would never, ever have thought of. We actually gave a performance of that kind at South by Southwest (SXSW) this year.

I think that such AI assistance will continue to grow in the field of entertainment creativity. It will be interesting when AI does something that artists do not expect, leading to stimulation of the artists’ creativity. There is a limit to the imagination of humans, after all. When AI puts out ideas, artists will feel stimulated and their creativity will be powered to find further ideas.

A correlation between AI and the intestinal flora?

──The intestine is often called the second brain. Is there any possibility that research findings about the intestines will be reflected in AI research?

Kitano: It depends on the phase. Reinforcement learning is being integrated with results in animal behavioral sciences and psychology. Right now there are so many things to do in applying deep learning and reinforcement learning, so everyone is focused on that.

Also, things like the peripheral nervous system, brain-gut correlation, the immune control of nerves, and the microflora are all extremely fundamental elements. Recently, it has been found that intestinal bacteria and inflammatory bowel disease are significantly related to central nervous system diseases. I have no idea how this finding will contribute to AI, but scientifically it is a big step, and these areas are attracting attention from researchers because many exciting discoveries are appearing and changing our conventional wisdom. I think this finding will be closely related to research on gastronomy as well.

Fujita: Speaking of intestinal flora, there is the scientific side of accurately modeling a biological system, but I think there is also engineering significance in learning how to make complex systems work in a cooperative and stable way by mimicking biological intestinal flora. Indeed, when we consider a system in which a variety of subsystems work together toward a unified goal without any central command, looking into the operating principles of such organisms is often very insightful.

So, in investigating how to have multi-agents work together toward the same goal without a central command, the discussion will turn to those kinds of considerations.

──Are there any other cases of mimicking a living thing in AI research?

Fujita: I also do robotics. So, putting aside the question of whether it is good or bad, I tend to use biomimetic solutions at times. Conversely, when considering a system without human constraints, such as one with several arms instead of two, we need a system based on a different concept from the human body. This naturally makes me question what to use as a metaphor. I am always in search of this kind of reference, and I try to look in nature for anything that might have a similar system.

Kitano: Right. When I’m doing fine-pitch surface mounting on a printed circuit board by hand, I wish I had two more arms to hold the board while soldering the components.

Fujita: To make robots work with people in a coordinated manner on a task that has to be solved in cooperation with humans, what should we do, and how? Simply put, it is a matter of estimating what humans want to do and assisting with it. But that in itself is a very difficult problem.

For example, in the world of shared autonomy, machines and human beings skillfully share autonomy to implement given tasks, but the idea of machines being able to estimate the intentions of humans is still a very difficult matter. Naturally, we will need to keep in mind the analogies of how humans estimate others’ intentions, while we implement more of an engineering-oriented approach.

──In that sense as well, the area of cooking is pretty interesting.

Fujita: So far Sony has been working with movies, music, and games, which basically have a lot to do with virtual worlds. However, food involves work by physical entities, and we will need to figure out how AI and robots can work together on real tasks in the real world. I think this is a very good theme for robotics researchers as well.

──There are so many different fields to work in. I think that would mean there is an urgency to develop and recruit talent. What is the current situation in that regard?

Kitano: Well, we are always trying to find talented people. That is all we can do. To recruit very talented people, we need challenging themes that attract them, themes that make them want to join Sony and change the world together. In addition to that basic necessity, talented people are watching whether Sony is committed to the AI field, so we need to show that we are serious about AI. At the same time, from a practical standpoint, we need to offer upside potential in a variety of forms.

Fujita: I think it is important to keep appealing to talented people through attractive achievements. We need to tell the world about the attractive challenges we are working on and bring in talented people to take them on. Acting as a main sponsor of IJCAI is one way of sending that message so that we can hire excellent people.

──Where are AI personnel concentrated now?

Kitano: There is an AI talent shortage all over the world. There simply aren’t enough people who can handle AI technology anywhere. But if we had to identify an area where AI talent is concentrated, it may be startups. In the case of startups, the upside of an IPO is very attractive. So Sony may need to do carve-outs or take similar measures to ensure that upside. I think we need to consider a variety of approaches, find ways to keep attracting good talent, do new things, and keep our business growing. We need to act proactively as we address the reality of the world.

Fujita: When we talk about recruiting AI personnel, we also mean engineers who can use AI systems to provide products and services, as well as AI researchers. Such engineers in particular are rapidly becoming necessary.

Kitano: But one important thing here is that we must carefully consider where to deploy them; in other words, the people we recruit need to understand the domains and areas we want to aim at. They have a lot of knowledge of AI and data science, which is fine. But if we want to use AI for cooking and they know nothing about cooking, they cannot make the project a success. The same goes for the manufacturing industry. If they don't understand manufacturing processes, materials, or design at all, the results will be pretty poor.

The basics of AI are important, but I think it is just as important to have a deep understanding of the domain to which AI is applied. This applies not only to engineering and science, but also to the management, accounting, and personnel domains. When we create an AI system to assist with personnel work, for instance, the researchers and engineers involved in the development must understand personnel work; otherwise the resulting system won't be very useful.

──Lastly, tell us about your dreams. What do you think will become reality within your lifetime?

Kitano: For me, it is the Nobel Turing Challenge. That will change the future for humanity. AI will accelerate discoveries exponentially, bringing about a great transformation of civilization. That is my biggest dream.

Fujita: For me it is food. I would like to make a breakthrough in the field of food using robots and AI. I envision chefs having the same standing as artists and musicians at Sony and everyone enjoying their food, or systems that can help people without cooking knowledge or skills make food with the help of AI. I hope I can propose something that will drastically change people’s view of the food world.

Reference: Sony × AI
