The OpenAI chatbot caused this outcry even though people are technically not allowed to access it from inside China. But so many figured out how to reach it through proxy servers that this week the government blocked that access, Chinese media reported.
Beaten to the punch by US-made chatbots such as ChatGPT and Microsoft Bing, China’s biggest tech companies, top universities and even city governments rushed to say they would release their own versions. Search giant Baidu announced this week that it will release its ChatGPT competitor, Ernie Bot, in March.
Despite only just announcing the efforts, these companies – including Baidu, e-commerce giant Alibaba and Tencent, maker of popular messaging app WeChat – have spent the better part of a decade developing their internal AI capabilities.
Baidu, which is the most popular search engine in the country, is the closest to winning the race. But despite years of investment and weeks of hype, the company has yet to release Ernie Bot.
Artificial intelligence experts suggest that the Chinese government’s tight control over the country’s internet is partly to blame.
“With a generative chatbot, there’s no way to know in advance what it will say,” said Zhao Yuanyuan, a former member of Baidu’s natural language processing team. “It’s a huge concern.”
Baidu did not respond to a request for comment.
In China, regulators require that everything posted online, down to the shortest commentary, be reviewed first to make sure it doesn’t contravene a long list of banned topics. For example, a Baidu search for Xinjiang will simply return geographic information about the western region, with no mention of the system of re-education camps its Uyghur population has been subjected to for years.
Baidu has become so good at filtering this type of content that other companies are using its software to do it for them.
The challenge Baidu and other Chinese tech companies face is applying those same constraints to a chatbot that creates new content with every use. It is precisely this quality that has made ChatGPT so compelling — its ability to create the feel of an organic conversation by generating a fresh answer to every prompt — and so hard to censor.
“Even if Baidu launches Ernie Bot as promised, there is a good chance it will be suspended soon,” said Xu Liang, lead developer at Hangzhou-based YuanYu Intelligence, a start-up that launched its own smaller-scale chatbot in late January. “There will simply be too much moderation to do.”
Xu would know – his own bot, ChatYuan, was suspended a few days after its launch.
At first, everything went well. When ChatYuan was asked about Xi Jinping, the bot praised China’s top leader and described him as a reformist who values innovation, according to screenshots published by news sites in Hong Kong and Taiwan.
But when asked about the economy, the bot said there was “no room for optimism” as the country faced critical issues including pollution, a lack of investment and a housing bubble.
The bot also described the war in Ukraine as Russia’s “war of aggression,” according to the screenshots. China’s official position has been to diplomatically – and perhaps materially – support Russia.
The ChatYuan website remains under maintenance. Xu insisted that the outage was caused by technical errors, and said the company chose to keep the service offline while it improved its content moderation.
Xu was “not particularly in a rush” to bring the user-facing service back online, he said.
A handful of other organizations spearheaded their own efforts, including a team of researchers from Fudan University in Shanghai, whose chatbot Moss was swamped with traffic and crashed within 24 hours of being released.
Users around the world have already demonstrated that ChatGPT itself can easily be tricked into going rogue and sharing information that its parent company tried to prevent it from disclosing, such as how to commit a violent crime.
“As we’ve seen with ChatGPT, it’s going to be very complicated to actually control the outputs of some of these models,” said Jeff Ding, assistant professor of political science at George Washington University, who focuses on AI competition between the United States and China.
So far, Chinese tech giants have used their AI capabilities to augment other — less politically risky — product lines, such as cloud services, driverless cars and search. With a government crackdown having already put the country’s tech companies on edge, the release of China’s first large-scale chatbot puts Baidu in an even more precarious position.
Baidu CEO Robin Li was optimistic during a call with investors on Wednesday, and said the company would release Ernie Bot in the coming weeks, then include the AI behind it in most of its other products, from advertising to driverless vehicles.
“Baidu is the best representative of the long-term growth of China’s artificial intelligence market,” Li said in a letter to investors. “We are at the crest of the wave.”
Baidu is already as synonymous with search in China as Google is elsewhere, and Ernie Bot could cement Baidu’s position as a major provider of the most advanced AI technology, a top priority in Beijing’s drive to achieve full technological independence from the United States.
Baidu stands to gain by making Ernie Bot available as part of its cloud services, which currently hold only a 9 percent share of a highly competitive market, according to Kevin Xu, a technology executive and author of the technology newsletter Interconnected. The ability to use AI to chat with passengers is also a fundamental part of the company’s plans for Apollo, the software that powers its driverless cars.
The type of AI behind chatbots learns its job by assimilating huge amounts of information available online: encyclopedias, academic journals and social networks. Experts have suggested that any chatbot in China would need to internalize only the Party-approved information made easily accessible online inside the firewall.
But according to open-source research papers on its training data, Ernie consumed a vast wealth of information in English, including Wikipedia and Reddit, both of which are blocked in China.
The more information the AI digests – and, importantly, the more it interacts with real humans – the more successful it is at mimicking them.
But an AI bot can’t always distinguish between helpful and hateful content. According to George Washington University’s Ding, after the 175-billion-parameter model behind ChatGPT was trained, parent company OpenAI still needed to employ several dozen human contractors to teach it not to regurgitate racist and misogynistic speech or give instructions for things like building a bomb.
This human-tuned version, called InstructGPT, is the framework behind the chatbot. No similar effort has been announced for Baidu’s Ernie Bot or any of the other Chinese projects underway, Ding said.
Even with a strong content management team in place at Baidu, that may not be enough.
Zhao, the former Baidu employee, said the company initially dedicated only a handful of engineers to developing its AI framework. “Baidu’s AI research has been hampered by a lack of commitment to a risky field that promised little short-term return,” she said.
Baidu maintains a list of the banned keywords it filters, including content involving violence, pornography and politics, according to Zhao. The company also outsources data labeling and content moderation work to a team of contractors as needed, she said.
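The keyword-list approach Zhao describes can be illustrated with a minimal sketch. This is not Baidu’s actual system — the banned terms, function names and withhold-for-review behavior here are all hypothetical placeholders — but it shows the basic mechanics of releasing a generated reply only if it clears a banned-keyword check:

```python
# Minimal, hypothetical sketch of keyword-list content filtering.
# The banned terms and the withhold-for-review behavior are
# illustrative placeholders, not Baidu's actual implementation.

BANNED_KEYWORDS = {"placeholder_term_1", "placeholder_term_2"}

def passes_filter(text: str) -> bool:
    """Return True if the text contains none of the banned keywords."""
    lowered = text.lower()
    return not any(keyword in lowered for keyword in BANNED_KEYWORDS)

def moderate(reply: str) -> str:
    """Release a generated reply only if it clears the keyword filter."""
    if passes_filter(reply):
        return reply
    return ""  # withhold the reply, e.g. for human review
```

A real moderation pipeline would be far more elaborate — Zhao notes that Baidu also outsources labeling and review to human contractors — but the core gate is the same shape: generated text is checked against a maintained list before it reaches the user.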
Early generations of AI chatbots launched in China, including a Microsoft bot called XiaoBing — which translates to Little Bing — first launched in 2014, quickly ran afoul of censors and were taken offline. XiaoBing, which Microsoft established as an independent brand in 2020, has been repeatedly removed from WeChat for comments such as telling users that its dream was to emigrate to the United States.
The team behind XiaoBing was too eager to show off its technological advancements and did not sufficiently consider the political consequences, Zhao said.
“Earlier-generation chatbots could only select answers from an engineer-curated database and could decline out-of-scope questions,” she said. “Problems arose even under those predetermined conditions.”