Trang Tran Hanh Pham


2024

Contribution of Move Structure to Automatic Genre Identification: An Annotated Corpus of French Tourism Websites
Rémi Cardon | Trang Tran Hanh Pham | Julien Zakhia Doueihi | Thomas François
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The present work studies the contribution of move structure to automatic genre identification. This concept, well known in other branches of genre analysis, seems to have found little application in natural language processing. We describe how we collect a corpus of French-language tourism websites and annotate it with move structure. We then conduct experiments on automatic genre identification with this corpus. Our results show that informing a model with move structure can increase its performance on automatic genre identification while reducing the need for annotated data and computational power.