Publications

Exploring the Dialogue Comprehension Ability of Large Language Models

Published in arXiv, 2023

This study introduces a dual-assessment approach for large language models (LLMs): dialogue summarization to evaluate factual consistency, and factual questions derived from the dialogues to gauge comprehension. The evaluation uncovers a notable error rate, and the paper proposes a multi-task fine-tuning strategy for improvement.

Recommended citation: She S, Huang S, Wang X, et al. Exploring the Dialogue Comprehension Ability of Large Language Models[J]. arXiv preprint arXiv:2311.07194, 2023. https://arxiv.org/abs/2311.07194

Improved Pseudo Data for Machine Translation Quality Estimation with Constrained Beam Search

Published in EMNLP 2023, 2023

The study introduces CBSQE, a method for generating more accurate pseudo data for machine translation quality estimation by using constrained beam search to differentiate between likely correct and incorrect translation segments, improving performance in both supervised and unsupervised settings.

Recommended citation: Geng, X., Zhang, Y., Lai, Z., She, S., Zou, W., Tao, S., Yang, H., Chen, J., & Huang, S. (2023). Improved Pseudo Data for Machine Translation Quality Estimation with Constrained Beam Search. Conference on Empirical Methods in Natural Language Processing. https://aclanthology.org/2023.emnlp-main.764.pdf

CoP: Factual Inconsistency Detection by Controlling the Preference

Published in AAAI 2023, 2022

This paper proposes a novel approach that uses controlled model behavior (preference) as a signal for detecting factual inconsistency, achieving state-of-the-art performance on the corresponding task. Code, a written explanation (in Chinese), and the paper preprint are available.

Recommended citation: She S, Geng X, Huang S, et al. CoP: Factual Inconsistency Detection by Controlling the Preference[J]. arXiv preprint arXiv:2212.01611, 2022. https://arxiv.org/abs/2212.01611