The Project Halo team is embarking on an exciting new phase of research and making a significant shift toward semi-automated knowledge acquisition through natural language processing over textual resources. We are also integrating semi-formal methods for reasoning with knowledge, such as textual entailment and evidential reasoning, and a robust hybrid architecture in which multiple reasoning modules operate in tandem.
In the current two-year phase, we are giving the computer increasingly difficult biology exams at the 4th, 8th, and 12th grade levels, aiming to gradually improve its performance on, and eventually pass, these exams. This is ambitious because it requires the computer to have substantial general and simple science knowledge. Eventually, the system will evolve to field users' questions, shifting the focus from answering exam questions to answering direct user queries. The three main technology thrusts within the system are:
- Extracting knowledge from relevant natural language texts, supplemented by other sources and techniques such as crowd-sourcing and judicious use of manual encoding.
- Performing inference with that knowledge to answer questions.
- Using natural language processing and other techniques to interpret the actual exam questions and connect them with the knowledge.
Founded by technologist and philanthropist Paul G. Allen, Vulcan is a Seattle-based company that is committed to innovation and believes in the power of a great idea. In addition to its artificial intelligence arm, Vulcan is behind the research and development of several new technologies that help people live, work, play and learn.
In 2001, Vulcan convened a group of AI experts to brainstorm how to create a system with massive amounts of human knowledge that people could access in an interactive, easy-to-use fashion. Just as the philosopher Aristotle mentored his peers, a "Digital Aristotle" would be a computer embodiment of an insightful teacher. Project Halo was launched to move ahead with making the "Digital Aristotle" vision a reality.
Project Halo began as a six-month pilot in 2002 to evaluate the feasibility of constructing reasoning systems capable of sophisticated behavior. The pilot efforts focused on encoding 70 pages from a chemistry textbook. Although the pilot demonstrated good question-answering performance, the cost of knowledge acquisition was high. As a result, the research program that followed focused heavily on more efficient knowledge-acquisition technologies that allowed domain experts to enter knowledge directly into the system. This work resulted in the Automated User-Centered Reasoning and Acquisition System (AURA), which could operate in the domains of physics, chemistry, and biology. After this initial development, there was a substantial effort to encode large portions of a popular biology textbook, which resulted in:
- BioKB101, a large knowledge base about biology built with AURA
- Inquire, a sophisticated iPad app that helped students learn biology: www.inquireproject.com
Students found the Inquire iPad application useful and the knowledge acquisition process was sophisticated, but the cost of building the knowledge base remained too high. Moving forward, the project will put a greater emphasis on automated techniques for knowledge acquisition.
A Study of the AKBC Requirements for Passing an Elementary Science Test
Peter Clark, Phil Harrison, Niranjan Balasubramanian. Proc. AKBC'13, 2013
This article outlines the new direction of the Halo project, with emphasis on semi-automatic knowledge acquisition and evidential reasoning.
Project Halo: Towards a Knowledgeable Biology Textbook
Peter Clark. Invited Talk at CLEF 2012 (Conference and Labs of the Evaluation Forum), 2012
This presentation overviews the research on AURA and Inquire, and describes some of the new research on textual inference for question-answering.
Inquire Biology: A Textbook that Answers Questions
Vinay Chaudhri et al., AI Magazine 34 (3), 2013
This article describes Inquire, an intelligent iPad-based textbook that not only presents material but also answers users' questions using a substantial biology KB constructed with AURA.
Project Halo Update: Progress Toward Digital Aristotle
David Gunning et al., AI Magazine 31 (3), 2010
This article describes AURA, a sophisticated knowledge acquisition and reasoning environment for capturing domain knowledge directly from experts and answering users' questions.
Project Halo: Towards a Digital Aristotle
Noah Friedland et al., AI Magazine 25 (4), 2004, pp. 29-47
This article describes the results of the original pilot project (2002-3) to assess the state of the art in applied knowledge representation and reasoning technologies.
Project Halo Team
Peter Clark is the Senior Research Manager for Project Halo. His work focuses upon NLP, machine reasoning, and large knowledge bases, and the interplay between these three areas. He received his Ph.D. in Computer Science in 1991, and has researched these topics for 30 years with more than 80 refereed publications, including a AAAI Best Paper award in 1997.
Philip Harrison is an NLP specialist in language interpretation, with emphasis on syntax and semantics. He was one of the principal developers of the Boeing Simplified English Checker, a program used worldwide by aerospace writers to check conformity with an official writing standard.
William Smith specializes in creating applications that use emerging Semantic Web technologies. His current work includes Linked Open Data, semantic application framework development, and integration of neurobiology data sets. William joined Vulcan in 2010 working on Semantic Web applications, and has spent over 10 years developing web frameworks across several different industries.
Tommy Lu joined Vulcan in 2010 and oversees the quality assurance and quality engineering process for Project Halo. He has developed several automation frameworks, harnesses, and software utilities using a variety of technologies.
Kevin Humphreys is a Research Scientist who brings both academic and industry experience in natural language to the Project Halo team. He has a Ph.D. in Natural Language Generation and has carried out academic research in Information Extraction including entries to MUC and TREC system evaluations, and development of the GATE Language Engineering environment. He also has over 12 years of product development experience at Microsoft, including work on natural language features for Office and Bing.
- Dan Weld, University of Washington
- Chris Manning, Stanford University
- Jonathan Berant, Stanford University
- Benjamin Van Durme, Johns Hopkins University
- Mihai Surdeanu, University of Arizona