Human-computer interaction evolution: From voice assistants to “virtual humans”, who is the leader?
On January 9, 2007, when the original iPhone was unveiled, a confident Steve Jobs made the point clearly on stage: whether it is the Mac, the iPod or the iPhone, the revolutionary innovation lies in the evolution of human-computer interaction.
Since then, with smartphones as the carrier, human-computer interaction has continued to evolve with technological breakthroughs.
For example, as the mobile internet and artificial intelligence converged, intelligent voice assistants such as Apple Siri and Google Assistant became fashionable for a time. They opened up a new way of interacting beyond touch: voice dialogue, which is closer to the way humans communicate. Within a few years of its birth, this kind of interaction had spread across the entire industry.
However, the voice assistant is not the end point, and people are still looking for further breakthroughs in human-computer interaction. Today in particular, nearly 15 years after the iPhone was born, technologies such as AI, AR, VR and digital media have developed rapidly, and the concept of the metaverse has emerged. At this moment of rapid technological change, a question worth pondering faces the entire industry:
Now that voice assistants have become commonplace, in which direction should human-computer interaction evolve?
Of course, this is a question the entire industry must answer through practice. Among the many companies answering it, the most striking is OPPO, which is putting its own thinking about the future direction of human-computer interaction into practice on the strength of its AI capabilities.
As of August this year, ColorOS had reached 460 million monthly active users worldwide. Serving such a large user base requires strong technical capabilities, especially AI capabilities. So how does OPPO build the AI services that support more than 460 million users around the world?
Why is the “virtual human” of multimodal interaction the future?
As for the future of human-computer interaction, what really represents OPPO’s own thinking and choices is a major evolution of its Xiaobu Assistant, announced at the recent OPPO Developers Conference.
Put simply, OPPO is evolving Xiaobu Assistant from an intelligent auxiliary tool based on voice interaction into an intelligent assistant based on multimodal interaction with a virtual human, so that the interaction between human and intelligent assistant becomes an interaction between human and virtual human.
This is not difficult to understand – after all, the “virtual human” based on multi-modal interaction has become a new direction for the current exploration and development of human-computer interaction, and has also become the common choice of the entire industry.
For example, in June this year, China’s first original virtual student, “Hua Zhibing”, entered Tsinghua University as an undergraduate. The original intention behind designing her was that she would eventually think like a human, keep learning like a human, understand human thoughts, actively generate interactions that meet user needs, and capture human needs intuitively and comprehensively. Behind the birth of “Hua Zhibing”, of course, is multimodal interaction at work.
In addition to enrolling at university, “virtual humans” have also taken center stage on the Internet.
For example, in September, more than 20 virtual idols joined Xiaohongshu at the same time. They became trend intelligence officers, trying on new products from many trendy brands for the first time and showing off different styles. Not long ago, the virtual human AYAYI also became the digital manager of Tmall Super Brand Day and opened a Tmall Double 11 metaverse art exhibition. Interestingly, Huawei used a digital human for the first time at this year’s HDC developer conference to provide real-time sign language during the live broadcast.
Looking across society more broadly, “virtual humans” are also being integrated seamlessly into everyday life.
For example, Xinhua News Agency and Tencent jointly created “Xiaozhen”, a digital astronaut and digital reporter developed specifically for aerospace themes and scenarios; imma, a Japanese digital human styled as a Harajuku girl, appeared at the closing of the Tokyo Paralympics in early September; Shanghai Pudong Development Bank and Baidu jointly released the digital employee “Xiaopu”; Hunan Satellite TV announced the launch of the first digital host, Xiao Yang; and Jiangsu Satellite TV launched a program, “2060”, to promote virtual idols, among others.
It can be seen that, with the continuous development of digital media technologies such as artificial intelligence and virtual reality, the virtual human based on multimodal interaction has moved from the Internet and the digital world into real social settings, and this has become a major trend. Some people even believe that the virtual human will become the basic mode of human-computer interaction in the future.
It is against such a background that OPPO is also aware of the general development trend of human-computer interaction, and has seized the opportunity to realize a new round of evolution of Xiaobu Assistant from a voice assistant to a “virtual human”.
OPPO Xiaobu assistant transforms and opens “virtual human” customization
Xiaobu Assistant is the first mobile phone voice assistant in China to exceed 100 million monthly active users. It now has 130 million monthly active users and is the most concentrated expression of OPPO’s AI applications. As an artificial intelligence assistant built around the smartphone, Xiaobu takes voice interaction as its core and covers terminal devices across OPPO’s multiple mobile phone brands.
Since its birth in 2018, Xiaobu Assistant has undergone several upgrades, continually renewing its functions and experience. At this OPPO Developers Conference, OPPO announced that Xiaobu Assistant has been officially upgraded from a pure voice assistant to a multimodal intelligent assistant comprising five functional modules: voice, suggestions, instructions, screen recognition and scanning.
Multimodality is a key feature of intelligent assistants in the intelligent era. Intelligent assistants need to be compatible with different software and hardware scenarios, introduce more AI capabilities, and combine them closely with existing AI capabilities to form a comprehensive AI capability.
The “Xiaobu Virtual Human” launched in September this year is also an important presentation of the multi-modal interaction form of the intelligent assistant.
As the industry’s first mobile phone smart assistant based on multimodal interaction with a virtual human, Xiaobu Virtual Human breaks with earlier modes of interaction, turning the interaction between human and intelligent assistant into an interaction between human and virtual human.
It is reported that, in exploring dimension-breaking human-computer interaction, “Xiaobu Virtual Human” can deliver content services, real-time interaction and emotional interaction with users across multiple scenarios and ecosystems. This is directly reflected in its first live features, such as anthropomorphic broadcasting of news and weather. At present, these features cover the OPPO Reno5/6 and Find X3 series models.
At the same time, Xiaobu Virtual Human integrates voice, semantic and visual multimodal interaction technologies to provide a natural and smooth interactive experience. With the support of multimodal emotion recognition algorithms, it can keenly capture users’ emotional cues and build a multi-dimensional emotional connection with them.
What is interesting is that OPPO uses Xiaobu Assistant as the entrance to open up the Xiaobu ecosystem.
Every developer can customize an exclusive Xiaobu virtual human. These virtual humans can have different voices, appearances, personalities, skills and services, take on different roles such as intelligent customer service agent, virtual assistant or livestream sales host, and run on multiple smart terminal devices.
OPPO’s move not only empowers developers to bring the “virtual human” to human-computer interaction and so join the wider industry trend. In essence, it aims to give users a more intelligent, personalized, natural and realistic interactive experience.
The “magic ammunition” behind Xiaobu’s transformation is not just AI
If the evolution of Xiaobu is a transformation, then AI technology can be said to be the core “magic ammunition” behind this transformation.
In fact, as the saying goes, tall buildings rise from the ground. Both the transformation and upgrade of Xiaobu Assistant and the open platform ecosystem that OPPO has built for developers rest on OPPO’s full-stack AI technology ecosystem. After all, only with a stable foundation can a tall building rise.
OPPO’s bet on AI reflects long-term, planned thinking that combines focused capabilities with broad coverage. The AI framework is the focus of OPPO’s overall AI capabilities and a major expression of its AI strength.
To build this AI framework, OPPO has spent a great deal of energy and money creating the OPPO full-stack AI technology ecosystem. It is understood that this ecosystem consists of six parts:
- A hybrid cloud infrastructure layer for compute, networking, middleware, and databases;
- A cloud-native data lake layer for storing and processing massive cross-system data;
- An integrated device-cloud machine learning system covering device-side inference, model compression, large-scale training, and AutoML;
- An AI capability layer for basic capabilities such as voice, NLP, knowledge graph, CV, and recommendation and search;
- A cross-terminal, multi-scenario business application layer;
- An AI security capability layer that gives enterprises secure and reliable protection.
As we all know, machine learning requires a lot of data calculation and verification. OPPO has now entered more than 50 countries around the world, and the monthly active users of ColorOS have reached 460 million. The huge user base has also accumulated massive data and computing resources for its full-stack AI technology ecosystem.
Each quarter, the video cloud is invoked more than 420 million times and more than 30 billion new photos are added; big data grows by more than 10 PB every day, and the Heyun infrastructure covers eight regions around the world.
From the perspective of the machine learning system, OPPO’s full-stack AI technology ecosystem has two parts, the device side and the cloud side. On the device side, the algorithm engine, acceleration framework and model seat provide low-latency, efficient responses. On the cloud side, large-scale training and inference are performed on device-side data uploaded to the cloud, which in turn optimizes the device-side algorithms.
With the support of massive data, strong algorithms and computing power, OPPO AI has been continuously recognized in the industry. Built on large-scale knowledge-based pre-training, its speech and semantic metrics, including contextual understanding and short-text similarity, have repeatedly topped authoritative industry evaluations.
Not only that, at CVPR, the world’s top computer vision conference, OPPO took first place in extreme super-resolution perception, handheld device visual positioning and multi-target behavior analysis across 2020 and 2021.
In addition, from the perspective of AI capabilities, OPPO AI has basic capabilities such as voice, NLP, knowledge graph, CV, and recommendation and search. The human-evaluated satisfaction of its end-to-end generative dialogue model has exceeded 85%, a relatively high level in the industry, and this result has been applied to the business scenario of Xiaobu’s generative chat.
Alongside powerful AI capabilities, security is an important part that cannot be ignored. OPPO’s AI security technology provides a full range of safeguards, from application detection and protection against malicious behavior to countering attacks.
Official data shows that, in application detection, more than 530,000 apps have been scanned and more than 10,000 malicious apps found. The browser blocks more than 3 million malicious downloads every day and has more than 150,000 privacy policies. Malicious user behavior has been filtered more than 114 billion times, and 2.8 million malicious accounts have been banned.
It is not difficult to see that OPPO’s full-stack AI technology ecosystem is the implementation of OPPO’s key AI capabilities and technical advantages, providing a solid architectural foundation, higher resource utilization efficiency, and secure and credible privacy protection for OPPO’s massive AI services.
Of course, AI technical capabilities alone are not enough. Beyond intelligence, OPPO hopes the AI ecosystem it creates will also be humane and warm.
Therefore, in response to the psychological problems of the urban population, OPPO has released the “AI Warming Up Plan”, which provides warm companionship to more than 2 million people every day. To make mobile phones friendlier to the elderly, OPPO has created a care version of Xiaobu Assistant, which has improved the mobile phone experience of 7 million elderly users. For the inheritance of traditional culture, OPPO launched the “I am a folk music artist” campaign together with the musician A Duo, with more than 2.8 billion exposures across the network, promoting the inheritance of folk music through technological innovation.
At a fundamental level, the evolution of OPPO Xiaobu Assistant reflects OPPO’s strong investment and continuous accumulation in basic technologies such as AI, and its active embrace and continuous exploration of new concepts and trends in the industry. More importantly, this continuous exploration of the future form of human-computer interaction reflects the great emphasis OPPO places on the actual product experience of every ordinary user.
From a certain point of view, whether it is a voice assistant or a “virtual human”, a truly excellent device experience must be based on an in-depth understanding and perception of users themselves. It must also use the power of technology and the perspective of the humanities to embody this understanding in specific, easy-to-use software and hardware products, so that they directly meet users’ real needs.
This is also why every truly consumer-facing technology company should stand at the intersection of technology and the humanities, just as Steve Jobs did.
After all, technology is about people.