Decompose, Prioritize, and Eliminate: Dynamically Integrating Diverse Representations for Multimodal Named Entity Recognition

Zihao Zheng, Zihan Zhang, Zexin Wang, Ruiji Fu, Ming Liu, Zhongyuan Wang, Bing Qin


Abstract
Multi-modal Named Entity Recognition, a fundamental task for multi-modal knowledge graph construction, requires integrating multi-modal information to extract named entities from text. Previous research has explored the integration of multi-modal representations at different granularities. However, they struggle to integrate all these multi-modal representations to provide rich contextual information to improve multi-modal named entity recognition. In this paper, we propose DPE-MNER, which is an iterative reasoning framework that dynamically incorporates all the diverse multi-modal representations following the strategy of “decompose, prioritize, and eliminate”. Within the framework, the fusion of diverse multi-modal representations is decomposed into hierarchically connected fusion layers that are easier to handle. The incorporation of multi-modal information prioritizes transitioning from “easy-to-hard” and “coarse-to-fine”. The explicit modeling of cross-modal relevance eliminate the irrelevances that will mislead the MNER prediction. Extensive experiments on two public datasets have demonstrated the effectiveness of our approach.
Anthology ID:
2024.lrec-main.403
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
SIG:
Publisher:
ELRA and ICCL
Note:
Pages:
4498–4508
Language:
URL:
https://aclanthology.org/2024.lrec-main.403
DOI:
Bibkey:
Cite (ACL):
Zihao Zheng, Zihan Zhang, Zexin Wang, Ruiji Fu, Ming Liu, Zhongyuan Wang, and Bing Qin. 2024. Decompose, Prioritize, and Eliminate: Dynamically Integrating Diverse Representations for Multimodal Named Entity Recognition. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 4498–4508, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Decompose, Prioritize, and Eliminate: Dynamically Integrating Diverse Representations for Multimodal Named Entity Recognition (Zheng et al., LREC-COLING 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.lrec-main.403.pdf