Public Universities with Billion Dollar Endowments? Challenges and Opportunities in the Acquisition of Open-Domain Fine-Grained Classes from the Web
Instances (e.g., university of michigan at ann arbor, cadillac cts-v) and their class labels (universities, sport sedans) are basic building blocks towards the elusive goal of constructing knowledge resources automatically. Different Web users may refer to the same instances using different class labels at different levels of granularity. Consequently, efforts to represent knowledge about open-domain instances benefit from the acquisition of as many relevant class labels as possible. Class labels conveniently summarize the properties shared among the instances of each class. Finer-grained classes are more specific, and their class labels capture more of the relevant properties associated with their instances, but they are more difficult to extract from textual data.
Marius Pasca is a research scientist at Google. He graduated with a Ph.D. degree in Computer Science from Southern Methodist University in Dallas, Texas and an M.Sc. degree in Computer Science from Joseph Fourier University in Grenoble, France. His current research interests include factual information extraction from unstructured text and natural-language matching functions for information retrieval.