Maria José B. Finatto

Also published as: Maria José Bocorny Finatto, Maria José Finatto


2024

Can rules still beat neural networks? The case of automatic normalisation for 18th-century Portuguese texts
Leonardo Zilio | Rafaela R. Lazzari | Maria José B. Finatto
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 2

2016

VerbLexPor: a lexical resource with semantic roles for Portuguese
Leonardo Zilio | Maria José Bocorny Finatto | Aline Villavicencio
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper presents a lexical resource developed for Portuguese. The resource contains sentences annotated with semantic roles. The sentences were extracted from two domains: Cardiology research papers and newspaper articles. Both corpora were analyzed with the PALAVRAS parser and subsequently processed with a subcategorization frames extractor, so that each sentence that contained at least one main verb was stored in a database together with its syntactic organization. The annotation was manually carried out by a linguist using an annotation interface. Both the annotated and non-annotated data were exported to an XML format, which is readily available for download. The reason behind exporting non-annotated data is that there is syntactic information collected from the parser annotation in the non-annotated data, and this could be useful for other researchers. The sentences from both corpora were annotated separately, so that it is possible to access sentences either from the Cardiology or from the newspaper corpus. The full resource presents more than seven thousand semantically annotated sentences, containing 192 different verbs and more than 15 thousand individual arguments and adjuncts.

2015

VerbLexPor: um recurso léxico com anotação de papéis semânticos para o português (VerbLexPor: a lexical resource annotated with semantic roles for Portuguese)
Leonardo Zilio | Maria José Bocorny Finatto | Aline Villavicencio
Proceedings of the 10th Brazilian Symposium in Information and Human Language Technology

2014

Comparing the Quality of Focused Crawlers and of the Translation Resources Obtained from them
Bruno Laranjeira | Viviane Moreira | Aline Villavicencio | Carlos Ramisch | Maria José Finatto
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Comparable corpora have been used as an alternative to parallel corpora as resources for computational tasks that involve domain-specific natural language processing. One way to gather documents related to a specific topic of interest is to traverse a portion of the web graph in a targeted way, using focused crawling algorithms. In this paper, we compare several focused crawling algorithms by using them to collect comparable corpora in a specific domain. Then, we compare the evaluation of the focused crawling algorithms to the performance of linguistic processes executed after training with the corresponding generated corpora. We also propose a novel approach for focused crawling, exploiting the expressive power of multiword expressions.

2011

Comparando Avaliações de Inteligibilidade Textual entre Originais e Traduções de Textos Literários (Comparing Textual Intelligibility Evaluations among Literary Source Texts and their Translations) [in Portuguese]
Bianca Franco Pasqualini | Carolina Evaristo Scarton | Maria José B. Finatto
Proceedings of the 8th Brazilian Symposium in Information and Human Language Technology

Características do jornalismo popular: avaliação da inteligibilidade e auxílio à descrição do gênero (Characteristics of Popular News: the Evaluation of Intelligibility and Support to the Genre Description) [in Portuguese]
Maria José B. Finatto | Carolina Evaristo Scarton | Amanda Rocha | Sandra Aluísio
Proceedings of the 8th Brazilian Symposium in Information and Human Language Technology

2009

Statistically-Driven Alignment-Based Multiword Expression Identification for Technical Domains
Helena Caseli | Aline Villavicencio | André Machado | Maria José Finatto
Proceedings of the Workshop on Multiword Expressions: Identification, Interpretation, Disambiguation and Applications (MWE 2009)