Coordination is a syntactic phenomenon in which two or more elements, known as conjuncts, are linked, typically by a coordinating conjunction. Coordinate structures occur frequently in natural language and are a major source of ambiguity. In this dissertation, we focus on identifying the conjuncts of coordinate structures, particularly in English text. We propose two methods for the task, (1) a top-down approach and (2) a bottom-up approach, both of which use deep neural networks.
The top-down approach first identifies a coordinate structure and then retrieves the individual conjuncts from it. Our neural networks capture the similarity and replaceability of conjuncts as features, without relying on external thesauri, language models, or syntactic parsers. Because the model is lightweight, our system can examine all possible coordination spans. Although this approach outperforms existing methods, our analysis reveals that the system is weak at identifying the individual conjuncts within a coordinate structure.
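As a rough illustration of the top-down idea, the following minimal sketch enumerates every span that contains a given conjunction, picks the best one under a scoring function, and then splits it into conjuncts. The function names, the `score` callable (a stand-in for the neural model), and the rule of splitting at commas are all assumptions made for this example and are not taken from the dissertation.

```python
from typing import Callable, List, Tuple

def candidate_spans(n_words: int, cc_index: int) -> List[Tuple[int, int]]:
    """All (begin, end) spans, end exclusive, with at least one word
    on each side of the conjunction at cc_index."""
    return [(b, e)
            for b in range(0, cc_index)
            for e in range(cc_index + 2, n_words + 1)]

def split_conjuncts(words: List[str], begin: int, end: int,
                    cc_index: int) -> List[List[str]]:
    """Split words[begin:end] at the conjunction and at commas.
    Splitting at every comma is a simplifying assumption."""
    conjuncts: List[List[str]] = []
    current: List[str] = []
    for idx in range(begin, end):
        if idx == cc_index or words[idx] == ",":
            if current:
                conjuncts.append(current)
                current = []
        else:
            current.append(words[idx])
    if current:
        conjuncts.append(current)
    return conjuncts

def top_down(words: List[str], cc_index: int,
             score: Callable[[int, int], float]) -> List[List[str]]:
    """Pick the highest-scoring span first, then retrieve its conjuncts."""
    begin, end = max(candidate_spans(len(words), cc_index),
                     key=lambda span: score(*span))
    return split_conjuncts(words, begin, end, cc_index)

words = "I like tea , coffee and milk".split()
# Hypothetical scorer that happens to prefer spans starting at "tea".
conjuncts = top_down(words, cc_index=5, score=lambda b, e: -abs(b - 2))
```

Exhaustive enumeration stays cheap here because the number of spans around one conjunction grows only quadratically with sentence length.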
The bottom-up approach, in contrast, first finds conjuncts and then constructs a coordinate structure from them. In this approach, the coordinate structures of a given sentence are identified in the form of a syntactic tree, which is produced by a context-free grammar for coordination. Our neural network model consists of submodels, each specialized in capturing a different part of a coordinate structure. Using these models with the CKY algorithm, our system produces coordinate structures efficiently.
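To make the combination of a grammar and CKY concrete, the sketch below fills a CKY chart with a toy three-rule grammar for flat, two-conjunct coordination and then ranks the resulting candidates with a placeholder scorer. The grammar, the conjunction list, the function names, and the scorer are assumptions chosen for illustration; they are not the grammar or the submodels used in the dissertation.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

CC_WORDS = {"and", "or", "but"}  # assumed toy list of conjunctions

# Toy binary rules (parent, left child, right child); illustrative only.
BINARY_RULES = [
    ("N", "N", "N"),        # a sequence of ordinary words
    ("CCN", "CC", "N"),     # conjunction followed by the right conjunct
    ("COORD", "N", "CCN"),  # left conjunct plus the rest
]

Pair = Tuple[List[str], List[str]]

def cky_chart(words: List[str]) -> Dict[Tuple[int, int], Set[str]]:
    """Bottom-up CKY recognition: chart[(i, j)] holds labels of words[i:j]."""
    n = len(words)
    chart: Dict[Tuple[int, int], Set[str]] = {}
    for i, word in enumerate(words):
        chart[(i, i + 1)] = {"CC"} if word.lower() in CC_WORDS else {"N"}
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            labels: Set[str] = set()
            for k in range(i + 1, j):  # split point
                for parent, left, right in BINARY_RULES:
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        labels.add(parent)
            chart[(i, j)] = labels
    return chart

def coordination_candidates(words: List[str]) -> List[Pair]:
    """Read (left conjunct, right conjunct) pairs off the COORD cells."""
    chart, n, found = cky_chart(words), len(words), []
    for i in range(n):
        for j in range(i + 3, n + 1):
            if "COORD" not in chart[(i, j)]:
                continue
            for k in range(i + 1, j - 1):  # k = conjunction position
                if ("N" in chart[(i, k)] and "CC" in chart[(k, k + 1)]
                        and "N" in chart[(k + 1, j)]):
                    found.append((words[i:k], words[k + 1:j]))
    return found

def best_coordination(words: List[str],
                      score: Callable[[List[str], List[str]], float]
                      ) -> Optional[Pair]:
    """Rank candidates with a scorer that stands in for the neural submodels."""
    candidates = coordination_candidates(words)
    return max(candidates, key=lambda c: score(*c)) if candidates else None

# Hypothetical scorer: prefer equal-length conjuncts, then longer ones.
toy_score = lambda l, r: -10 * abs(len(l) - len(r)) + len(l) + len(r)
best = best_coordination("a b and c d".split(), toy_score)
```

A full system would fold the scores into the chart cells themselves (Viterbi-style CKY) and allow nested coordination; this sketch keeps recognition and ranking separate for readability.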
The main contribution of this dissertation is the demonstration of effective frameworks for coordination disambiguation. Experimental results show that our methods achieve state-of-the-art performance while ensuring that the global structure of coordination is consistent.