English access to structured data

Citation

Richardson, K.; Bobrow, D. G.; Condoravdi, C.; Waldinger, R.; Das, A. English access to structured data. IEEE International Conference on Semantic Computing (ICSC); 2011 September 19-21; Stanford University, CA.

Abstract

In this paper we present work on using a domain model to guide text interpretation, in the context of a project that aims to interpret English questions as queries to be answered from structured databases. We adapt a broad-coverage, ambiguity-enabled NLP system to produce domain-specific logical forms by using knowledge of the domain to zero in on the appropriate interpretation. The non-logical vocabulary of the logical forms is drawn from a domain theory that constitutes a higher-level abstraction of the contents of a set of related databases. The meanings of the vocabulary terms are encoded in an axiomatic domain theory. In order to retrieve information from the databases, the logical forms must be instantiated by values from fields in the database. We use the axiomatic domain theory, interpreted by a first-order theorem prover called SNARK, to identify the groundings, and then retrieve the values through procedural attachments semantically linked to the database. SNARK attempts to prove the logical form as a theorem by reasoning using the database content, and returns the exemplars of the proof(s) to the user as answers to the query. The focus of this paper is more on the language task; however, we discuss the interaction that must occur between language and reasoning in order to build an end-to-end natural language interface to databases. We illustrate the process using examples drawn from an HIV treatment domain, where the underlying databases are reports of temporally bound treatments of individual patients.
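
The grounding step described in the abstract can be illustrated with a small Python sketch. It is not SNARK and does not use SNARK's API: it is a toy backtracking search over a conjunction of atoms, where each predicate has a "procedural attachment" that looks up values in an in-memory table. The table contents, the predicate names (treated_with, before), the variable convention (a leading "?"), and the example question are all hypothetical and are not taken from the paper.

```python
# Hypothetical treatment records: (patient, drug, start_year, end_year).
TREATMENTS = [
    ("p1", "AZT", 2001, 2003),
    ("p2", "AZT", 2004, 2005),
    ("p2", "3TC", 2004, 2006),
]

def is_var(term):
    """Variables are strings that start with '?' (a convention of this sketch)."""
    return isinstance(term, str) and term.startswith("?")

def unify(args, row, bindings):
    """Match an atom's arguments against one database row; return extended bindings or None."""
    new = dict(bindings)
    for term, value in zip(args, row):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to a different value
        elif term != value:
            return None      # constant does not match the row
    return new

def treated_with(args, bindings):
    """Procedural attachment: ground treated_with(...) from the table."""
    for row in TREATMENTS:
        new = unify(args, row, bindings)
        if new is not None:
            yield new

def before(args, bindings):
    """Comparison attachment; both arguments must be bound by earlier goals."""
    left, right = (bindings.get(a, a) for a in args)
    if left < right:
        yield bindings

ATTACHMENTS = {"treated_with": treated_with, "before": before}

def prove(goals, bindings=None):
    """Yield every set of bindings that satisfies all goals, left to right."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    (pred, *args), rest = goals[0], goals[1:]
    for new in ATTACHMENTS[pred](args, bindings):
        yield from prove(rest, new)

# Assumed logical form for "Which patients started AZT before 2004?"
query = [("treated_with", "?p", "AZT", "?s", "?e"), ("before", "?s", 2004)]
answers = [b["?p"] for b in prove(query)]
```

Here the bindings found for ?p play the role of the answers returned to the user. A real first-order prover such as SNARK also reasons with the axioms of the domain theory, which this sketch leaves out entirely.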

