HYRR: Hybrid Infused Reranking for Passage Retrieval

Jing Lu, Keith Hall, Ji Ma, Jianmo Ni


Abstract
Existing passage retrieval systems typically adopt a two-stage retrieve-then-rerank pipeline. To obtain an effective reranking model, many prior works have focused on improving the model architectures, such as leveraging powerful pretrained large language models (LLMs) and designing better objective functions. However, less attention has been paid to the issue of collecting high-quality training data. In this paper, we propose HYRR, a framework for training robust reranking models. Specifically, we propose a simple but effective approach to select training data using hybrid retrievers. Our experiments show that the rerankers trained with HYRR are robust to different first-stage retrievers. Moreover, evaluations using the MS MARCO and BEIR data sets demonstrate that our proposed framework generalizes effectively to both supervised and zero-shot retrieval settings.
Anthology ID:
2024.lrec-main.748
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
8528–8534
URL:
https://aclanthology.org/2024.lrec-main.748
Cite (ACL):
Jing Lu, Keith Hall, Ji Ma, and Jianmo Ni. 2024. HYRR: Hybrid Infused Reranking for Passage Retrieval. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 8528–8534, Torino, Italia. ELRA and ICCL.
Cite (Informal):
HYRR: Hybrid Infused Reranking for Passage Retrieval (Lu et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.748.pdf