Artificial intelligence software has beaten humans in one of the world’s most challenging reading comprehension tests.
In a feat being hailed as a world first, a deep neural network scored higher than the average person on a quiz designed by Stanford University.
The breakthrough could lead to more advanced robots and automated systems, capable of solving complex problems and answering difficult questions.
Future applications could range from customer service to helping tackle social and political issues, like climate change and conflicts over resources.
The AI, created by retail firm Alibaba’s Institute of Data Science and Technologies, based in Hangzhou, China, was tested on the Stanford Question Answering Dataset (Squad).
Squad is a large-scale reading comprehension dataset comprising more than 100,000 question-answer pairs based on more than 500 Wikipedia articles.
It is seen as the world’s top machine reading-comprehension test and attracts companies, universities and institutes, from Google, Facebook, IBM and Microsoft to Carnegie Mellon University, Stanford University and the Allen Institute for Artificial Intelligence.
Teams competing in the challenge have to build machine-learning models that can provide answers to the questions in the dataset.
Alibaba’s deep neural network model scored 82.44, beating the human score of 82.304 in providing exact answers to questions.
The company said it’s the first time a machine has outdone a real person in such a contest.
Microsoft Research achieved a similar feat, scoring 82.65, but those results were finalised a day after Alibaba’s, the firm said.
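For readers curious how an "exact answer" score of this kind is worked out, the following is a minimal sketch rather than the official scoring code. It assumes the commonly described approach: each answer is lowercased, stripped of punctuation and of the articles "a", "an" and "the", and then counted as correct only if it matches one of the accepted human answers word for word. The function names and the four toy questions are hypothetical, invented purely for illustration.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop English articles, collapse spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match_score(predictions: dict, references: dict) -> float:
    """Percentage of questions whose prediction equals any accepted answer."""
    hits = 0
    for question_id, accepted_answers in references.items():
        # A missing prediction is treated as an empty (wrong) answer.
        predicted = normalize_answer(predictions.get(question_id, ""))
        if any(predicted == normalize_answer(a) for a in accepted_answers):
            hits += 1
    return 100.0 * hits / len(references)


# Hypothetical toy data: four questions, each with its accepted answers.
references = {
    "q1": ["gravity", "Gravity"],
    "q2": ["water vapour condensing", "condensation"],
    "q3": ["the Norman conquest"],
    "q4": ["1066"],
}
predictions = {
    "q1": "Gravity.",
    "q2": "evaporation",
    "q3": "Norman Conquest",
    "q4": "1066",
}

score = exact_match_score(predictions, references)
print(f"Exact match: {score:.2f}")
```

Under this kind of measure a score of 82.44 would mean that roughly 82 in every 100 answers matched an accepted answer exactly, which is why a gap as small as that between 82.44 and 82.304 can still separate two entrants.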
Luo Si, chief scientist for natural language processing at the Alibaba institute, said in a written statement: ‘It is our great honor to witness the milestone where machines surpass humans in reading comprehension.
‘That means objective questions such as “what causes rain” can now be answered with high accuracy by machines.
‘The technology underneath can be gradually applied to numerous applications such as customer service, museum tutorials and online responses to medical inquiries from patients, decreasing the need for human input in an unprecedented way.’
The accuracy of Alibaba’s AI is tied to its ability to infer meaning, narrowing down from paragraphs to sentences to words, locating precise phrases that contain potential answers.
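The narrowing-down idea can be illustrated with a deliberately crude sketch. This is not Alibaba's model, which is a trained deep neural network whose details are not given here. The sketch assumes that simple word overlap stands in for learned relevance, and every name in it (`narrow_down`, `STOPWORDS`, the sample paragraphs) is hypothetical.

```python
import re

# Hypothetical list of common words to ignore when comparing texts.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "what", "which", "who",
             "to", "and", "by", "does", "do"}


def content_words(text: str) -> set:
    """Lowercased words of the text, minus the stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}


def overlap(question: str, passage: str) -> int:
    """Number of content words shared by the question and the passage."""
    return len(content_words(question) & content_words(passage))


def longest_unmatched_run(question: str, sentence: str) -> str:
    """Longest run of consecutive words that do not repeat the question."""
    question_words = content_words(question)
    best, current = [], []
    for word in re.findall(r"[A-Za-z0-9]+", sentence):
        if word.lower() in question_words:
            current = []  # a question word ends the current run
        else:
            current.append(word)
            if len(current) > len(best):
                best = list(current)
    return " ".join(best)


def narrow_down(question: str, paragraphs: list) -> tuple:
    """Go from paragraphs to one sentence to one candidate phrase."""
    # Step 1: keep the paragraph sharing the most words with the question.
    paragraph = max(paragraphs, key=lambda p: overlap(question, p))
    # Step 2: within it, keep the best-matching sentence.
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
    sentence = max(sentences, key=lambda s: overlap(question, s))
    # Step 3: within the sentence, keep the words that are new information.
    phrase = longest_unmatched_run(question, sentence)
    return paragraph, sentence, phrase


paragraphs = [
    "The Normans settled in France. They later conquered England in 1066.",
    "Clouds form when air rises and cools. Rain is caused by water droplets "
    "becoming heavy enough to fall under gravity.",
]
paragraph, sentence, phrase = narrow_down("What causes rain?", paragraphs)
print(sentence)
print(phrase)
```

A real system replaces each of these hand-written steps with scores learned from the question-answer pairs in the dataset, and it picks out a far tighter phrase than this toy version can.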