Philosophers are mortal: Inferring the truth of unseen facts

G Angeli, CD Manning - Proceedings of the seventeenth conference on computational natural …, 2013 - aclanthology.org
Abstract
Large databases of facts are prevalent in many applications. Such databases are accurate, but as they broaden their scope they become increasingly incomplete. In contrast to extending such a database, we present a system to query whether it contains an arbitrary fact. This work can be thought of as re-casting open domain information extraction: rather than growing a database of known facts, we smooth this data into a database in which any possible fact has membership with some confidence. We evaluate our system predicting held out facts, achieving 74.2% accuracy and outperforming multiple baselines. We also evaluate the system as a commonsense filter for the ReVerb Open IE system, and as a method for answer validation in a Question Answering task.
1 Introduction
Databases of facts, such as Freebase (Bollacker et al., 2008) or Open Information Extraction (Open IE) extractions, are useful for a range of NLP applications from semantic parsing to information extraction. However, as the domain of a database grows, it becomes increasingly impractical to collect completely, and increasingly unlikely that all the elements intended for the database are explicitly mentioned in the source corpus. In particular, common-sense facts are rarely explicitly mentioned, despite their abundance. It would be useful to infer the truth of such unseen facts rather than assuming them to be implicitly false. A growing body of work has focused on automatically extending large databases with a finite set of additional facts. In contrast, we propose a system to generate the (possibly infinite) completion of such a database, with a degree of confidence for each unseen fact. This task can be