Building a corpus for the anonymization of Romanian jurisprudence

Vasile Păiș, Dan Tufis, Elena Irimia, Verginica Barbu Mititelu


Abstract
Access to jurisprudence is of paramount importance both for law professionals (judges, lawyers, law students) and for the wider public. In Romania, the Superior Council of Magistracy holds a large database of jurisprudence from different courts in the country, which is updated daily. However, the database must be anonymized before public access can be granted. This paper presents the efforts behind building a corpus for the anonymization process. We present the annotation scheme, the manual annotation methods, and the platform used.
Anthology ID:
2024.law-1.7
Volume:
Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII)
Month:
March
Year:
2024
Address:
St. Julians, Malta
Editors:
Sophie Henning, Manfred Stede
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
71–76
URL:
https://aclanthology.org/2024.law-1.7
Cite (ACL):
Vasile Păiș, Dan Tufis, Elena Irimia, and Verginica Barbu Mititelu. 2024. Building a corpus for the anonymization of Romanian jurisprudence. In Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII), pages 71–76, St. Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
Building a corpus for the anonymization of Romanian jurisprudence (Păiș et al., LAW-WS 2024)
PDF:
https://aclanthology.org/2024.law-1.7.pdf
Video:
https://aclanthology.org/2024.law-1.7.mp4