Microsoft researchers have created technology that uses artificial intelligence to read a document and answer questions about it roughly as well as a human.
The technology could help search engines and intelligent assistants interact with people and provide information in more natural ways, much like people communicate with each other.
A team at Microsoft Research Asia reached the human parity milestone using the Stanford Question Answering Dataset, known among researchers as SQuAD. It's a machine reading comprehension dataset that is made up of questions about a set of Wikipedia articles.
According to the SQuAD leaderboard, on Jan. 3, Microsoft submitted a model that reached a score of 82.650 on the exact match portion. Human performance on the same set of questions and answers is 82.304. On Jan. 5, researchers with the Chinese e-commerce company Alibaba submitted a model that scored 82.440, also about the same as a human.
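For readers curious how an exact match score like the ones above might be computed, here is a minimal sketch in Python. It assumes the normalization commonly used for this kind of scoring (lowercasing, then dropping punctuation, English articles and extra whitespace) and counts a prediction as correct if it matches any one of the human reference answers. The function names and sample data are hypothetical, and this is not the official SQuAD evaluation script.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, then strip punctuation, English articles and extra whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def exact_match_score(predictions: dict[str, str],
                      references: dict[str, list[str]]) -> float:
    """Percentage of questions answered with an exact match (0 to 100)."""
    hits = sum(
        exact_match(predictions.get(qid, ""), refs)  # unanswered counts as a miss
        for qid, refs in references.items()
    )
    return 100.0 * hits / len(references)


# Hypothetical sample data, for illustration only.
sample_references = {
    "q1": ["Denver Broncos"],
    "q2": ["1066", "in 1066"],
}
sample_predictions = {
    "q1": "the Denver Broncos.",  # matches once normalized
    "q2": "1067",                 # matches no reference answer
}
sample_score = exact_match_score(sample_predictions, sample_references)
```

Under this scheme an answer gets no partial credit: it either matches a reference after normalization or it does not, which is why a fraction of a point separates the leaderboard entries from the human figure.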
Microsoft has made a significant investment in machine reading comprehension. For example, instead of returning a list of links in response to a typed search query, Microsoft's Bing search engine is moving toward providing people with more plainspoken answers, or with multiple sources of information on a topic that is more complex or controversial.
With machine reading comprehension, researchers say computers also would be able to quickly parse information found in books and documents and provide people with the information they need most in an easily understandable way.
That would, for example, let drivers more easily find the answer they need in a dense car manual.
Microsoft is already applying earlier versions of the models that were submitted for the SQuAD dataset leaderboard in its Bing search engine, and the company is working on applying them to more complex problems.
It's also looking at ways that computers can generate natural answers when that requires information from several sentences. For example, if the computer is asked, "Is John Smith a U.S. citizen?" the answer may have to be drawn from a paragraph such as, "John Smith was born in Hawaii. That state is in the U.S."
Ming Zhou, assistant managing director of Microsoft Research Asia, said the SQuAD dataset results are an important milestone, but he noted that, overall, people are still much better than machines at comprehending the complexity and nuance of language.
"Natural language processing is still an area with lots of challenges that we all need to keep investing in and pushing forward," Zhou said. "This milestone is just a start."