Abstract
Linguistic research and language technology development employ large data repositories of ordered trees, known as “treebanks.” We define LPath, a path language based on XPath for linguistic trees represented in XML, and provide a new labeling scheme for LPath query evaluation. We report a strategy for evaluating expressions of the language against treebank data. The language contains three expressive features that are important for linguistic queries: immediate precedence, subtree scoping, and edge alignment. We motivate and illustrate these features with a variety of linguistic queries. This work provides a scalable and reusable model for linguistic tree queries and relates it to well-understood semistructured and relational languages.
Original language | English |
---|---|
Title of host publication | Programming Language Technologies for XML (PLANX) |
Pages | 35-46 |
Number of pages | 12 |
Publication status | Published - 2005 |
Externally published | Yes |