Integrating Word Embedding Offsets into the Espresso System for Part-Whole Relation Extraction

Phi Van Thuy (1451208)


The part-whole relation, or meronymy, plays an important role in many domains and applications. Among approaches to the part-whole relation extraction task, the Espresso bootstrapping algorithm has proved effective, significantly improving recall while maintaining high precision. In this study, we first investigate how using fine-grained subtypes and a careful seed selection step affects the performance of part-whole relation extraction. We find that multi-task learning and careful seed selection were the major factors in achieving higher precision. We then improve the Espresso bootstrapping algorithm for the part-whole relation extraction task by integrating a word embedding approach into its iterations. The key idea of our approach is to add a ranker component, the Similarity Ranker, to the Instances Extraction phase of the Espresso system. This component uses the embedding offsets between the instance pairs of part-whole relations. Experiments show that the proposed system achieves a precision of 84.9% when harvesting instances of the part-whole relation and outperforms the original Espresso system.
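
To make the idea of ranking by embedding offsets concrete, the following is a minimal sketch and not the system's actual Similarity Ranker. It assumes that the offset of a pair is vec(whole) - vec(part) and that a candidate is scored by the mean cosine similarity between its offset and the offsets of the seed pairs; the scoring rule, the function names, and the two-dimensional toy vectors are all illustrative assumptions.

```python
import math

def offset(emb, part, whole):
    """Embedding offset of a (part, whole) pair: vec(whole) - vec(part)."""
    return [w - p for p, w in zip(emb[part], emb[whole])]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_score(emb, candidate, seeds):
    """Mean cosine similarity between the candidate offset and each seed offset.

    Assumed scoring rule, chosen only for illustration.
    """
    cand = offset(emb, *candidate)
    return sum(cosine(cand, offset(emb, *s)) for s in seeds) / len(seeds)

def rank_candidates(emb, candidates, seeds):
    """Sort candidate (part, whole) pairs from highest to lowest score."""
    return sorted(candidates,
                  key=lambda c: similarity_score(emb, c, seeds),
                  reverse=True)

# Hypothetical 2-d toy embeddings; real word vectors have hundreds of dimensions.
TOY_EMB = {
    "wheel": [1.0, 0.0], "car": [1.0, 1.0],
    "finger": [0.0, 0.5], "hand": [0.0, 1.5],
    "leaf": [2.0, 0.2], "tree": [2.0, 1.1],
    "cat": [0.5, 1.0], "dog": [1.5, 1.0],
}
SEEDS = [("wheel", "car"), ("finger", "hand")]
CANDIDATES = [("cat", "dog"), ("leaf", "tree")]

if __name__ == "__main__":
    for pair in rank_candidates(TOY_EMB, CANDIDATES, SEEDS):
        print(pair, round(similarity_score(TOY_EMB, pair, SEEDS), 3))
```

In this toy setting the offset of (leaf, tree) points in the same direction as the seed offsets, so it ranks above (cat, dog), whose offset is orthogonal to them. How such a score would be combined with Espresso's own instance reliability measure is not specified here and is left open.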