RootsTech 2012 Featured Presenter: David Embley


David Embley is our next featured presenter.

Web of Knowledge for Historical Documents

This class will be held Thursday, Feb 2 from 4:15-5:15. It is targeted towards software developers.

Key Ideas to be Learned from the Presentation:

1. What a web of knowledge superimposed over historical documents is.

2. How to automate the process of superimposing a web of knowledge over historical documents:

(a) Construct extraction ontologies that can automatically extract information from historical documents.

(b) Preprocess historical documents, building links from conceptualizations in the web of knowledge to facts in historical documents.

(c) Define inference rules, and preprocess extracted facts to obtain implied facts.

(d) Using extraction ontologies, interpret user queries, return results, and provide result justifications.

3. The status of a prototype implementation along with some experimental results.

4. What needs to be done to make the vision practical:

(a) Automate the construction of fact extractors as much as possible.

(b) Enhance and optimize hybrid keyword and semantic search.

(c) Build tools to clean and organize facts, including information integration and object-identity resolution.

(d) Enable multilingual fact extraction and query processing.
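For developers who want a concrete picture of steps 2(a) through 2(c) before the session, here is a toy Python sketch. It is not David's system: a single hand-written regular expression stands in for an extraction ontology, and the sample sentence, the relation names (`child_of`, `grandchild_of`), and all function names are made up for illustration. It shows extracted facts keeping a link back to the supporting text, and one inference rule deriving an implied fact together with its justification.

```python
import re
from typing import NamedTuple

class Fact(NamedTuple):
    relation: str
    subject: str
    obj: str
    start: int  # offset where the supporting text begins in the document
    end: int    # offset just past the supporting text

# Hypothetical recognizer: a capitalized name, then "son of"/"daughter of",
# then another capitalized name. A real extraction ontology would be richer.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
CHILD_OF = re.compile(rf"(?P<child>{NAME}), (?:son|daughter) of (?P<parent>{NAME})")

def extract_facts(text):
    """Return child_of facts, each linked to its span in the source text."""
    return [
        Fact("child_of", m.group("child"), m.group("parent"), m.start(), m.end())
        for m in CHILD_OF.finditer(text)
    ]

def infer_grandchildren(facts):
    """Rule: child_of(a, b) and child_of(b, c) imply grandchild_of(a, c).

    Each implied fact is returned with the two extracted facts that justify it.
    """
    by_child = {}
    for f in facts:
        by_child.setdefault(f.subject, []).append(f)
    implied = []
    for f in facts:
        for g in by_child.get(f.obj, []):
            implied.append(("grandchild_of", f.subject, g.obj, (f, g)))
    return implied

# Made-up sample text, not taken from any real record.
document = "Mary Ely, daughter of Richard Ely. Richard Ely, son of William Ely."
facts = extract_facts(document)
implied = infer_grandchildren(facts)

for relation, subject, obj, support in implied:
    print(relation, subject, obj)
    for f in support:
        # The stored offsets recover the exact passage behind each fact.
        print("  because:", document[f.start:f.end])
```

Keeping the character offsets on every fact is what lets a query answer point back to the passage in the original document, which is the "result justification" idea in step 2(d).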

For more information about this topic, see David's syllabus.
