Yanyue Zhang


2024

pdf bib
Opinions Are Not Always Positive: Debiasing Opinion Summarization with Model-Specific and Model-Agnostic Methods
Yanyue Zhang | Yilong Lai | Zhenglin Wang | Pengfei Li | Deyu Zhou | Yulan He
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

As in the existing opinion summary data set, more than 70% are positive texts, the current opinion summarization approaches are reluctant to generate the negative opinion summary given the input of negative opinions. To address such sentiment bias, two approaches are proposed through two perspectives: model-specific and model-agnostic. For the model-specific approach, a variational autoencoder is proposed to disentangle the input representation into sentiment-relevant and sentiment-irrelevant components through adversarial loss. Therefore, the sentiment information in the input is kept and employed for the following decoding which avoids interference of content information with emotional signals. To further avoid relying on some specific opinion summarization frameworks, a model-agnostic approach based on counterfactual data augmentation is proposed. A dataset with a more balanced emotional polarity distribution is constructed using a large pre-trained language model based on some pairwise and mini-edited principles. Experimental results show that the sentiment consistency of the generated summaries is significantly improved using the proposed approaches, while their semantics quality is unaffected.

2023

pdf bib
Disentangling Text Representation With Counter-Template For Unsupervised Opinion Summarization
Yanyue Zhang | Deyu Zhou
Findings of the Association for Computational Linguistics: ACL 2023

Approaches for unsupervised opinion summarization are generally based on the reconstruction model and generate a summary by decoding the aggregated representation of inputs. Recent work has shown that aggregating via simple average leads to vector degeneration, generating the generic summary. To tackle the challenge, some approaches select the inputs before aggregating. However, we argue that the selection is too coarse as not all information in each input is equally essential for the summary. For example, the content information such as “great coffee maker, easy to set up” is more valuable than the pattern such as “this is a great product”. Therefore, we propose a novel framework for unsupervised opinion summarization based on text representation disentanglement with counter-template. In specific, a disentangling module is added to the encoder-decoder architecture which decouples the input text representation into two parts: content and pattern. To capture the pattern information, a counter-template is utilized as supervision, which is automatically generated based on contrastive learning. Experimental results on two benchmark datasets show that the proposed approach outperforms the state-of-the-art baselines on both quality and stability.