Abstract
This paper addresses the question of how document classifiers can exploit implicit information about document similarity to improve their accuracy. We infer document similarity using simple n-gram overlap, and demonstrate that this improves overall document classification performance on two datasets. As part of this, we find that collective classification based on simple iterative classifiers outperforms the more complex and computationally intensive dual classifier approach.
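
To make the notion of n-gram overlap concrete, the following is a minimal sketch of one way such a similarity score could be computed. The abstract does not specify the exact measure, so the choice of word bigrams, lowercase whitespace tokenisation, and Jaccard normalisation are all assumptions made for illustration, and the function names `word_ngrams` and `ngram_overlap` are hypothetical.

```python
def word_ngrams(text, n=2):
    """Return the set of word n-grams in `text`.

    Assumption: lowercase whitespace tokenisation; the paper may use
    a different tokeniser or character n-grams instead.
    """
    tokens = text.lower().split()
    # An empty set results when the text has fewer than n tokens.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_overlap(doc_a, doc_b, n=2):
    """Jaccard overlap between the n-gram sets of two documents.

    Assumption: Jaccard normalisation is one plausible choice of
    overlap measure, not necessarily the one used in the paper.
    """
    a, b = word_ngrams(doc_a, n), word_ngrams(doc_b, n)
    if not a or not b:
        return 0.0  # no n-grams to compare
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    score = ngram_overlap("the cat sat on the mat", "the cat sat on the rug")
    print(round(score, 3))
```

In a collective classification setting, pairs of documents whose score exceeds some threshold could be treated as linked, so that the predicted label of one document informs the classification of the other. The threshold and the linking rule would need to be tuned per dataset.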
Original language | English |
---|---|
Title of host publication | Proceedings of the 4th Joint Conference on Lexical and Computational Semantics (*SEM 2015) |
Place of Publication | Denver, United States |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 106-116 |
Number of pages | 11 |
ISBN (Electronic) | 9781941643396 |
Publication status | Published - 1 Jan 2015 |
Externally published | Yes |
Event | 4th Joint Conference on Lexical and Computational Semantics, *SEM 2015 - Denver, United States; Duration: 4 Jun 2015 → 5 Jun 2015 |
Conference
Conference | 4th Joint Conference on Lexical and Computational Semantics, *SEM 2015 |
---|---|
Country/Territory | United States |
City | Denver |
Period | 4 Jun 2015 → 5 Jun 2015 |