📝 Publications

Why Safeguarded Ships Run Aground? Aligned Large Language Models’ Safety Mechanisms Tend to Be Anchored in The Template Region 
Chak Tou Leong, Qingyu Yin, Jian Wang, Wenjie Li

Direct Preference Optimization Using Sparse Feature-Level Constraints 
Qingyu Yin†, Chak Tou Leong†, Hongbo Zhang, Minjun Zhu, Hanqi Yan, Qiang Zhang, Yulan He, Wenjie Li, Jun Wang, Yue Zhang, Linyi Yang

E^2CL: Exploration-based Error Correction Learning for Embodied Agents 
Hanlin Wang†, Chak Tou Leong†, Jian Wang, Wenjie Li

No Two Devils Alike: Unveiling Distinct Mechanisms of Fine-tuning Attacks 
Chak Tou Leong, Yi Cheng, Kaishuai Xu, Jian Wang, Hanlin Wang, Wenjie Li

Self-Detoxifying Language Models via Toxification Reversal 
Chak Tou Leong†, Yi Cheng†, Jiashuo Wang, Jian Wang, Wenjie Li
- EMNLP 2025, TokenSkip: Controllable Chain-of-Thought Compression in LLMs, Heming Xia, Chak Tou Leong, Wenjie Wang, Yongqi Li, Wenjie Li
- EMNLP 2025, Expanding before Inferring: Enhancing Factuality in Large Language Models through Premature Layers Interpolation, Dingwei Chen, Ziqiang Liu, Feiteng Fang, Chak Tou Leong, Shiwen Ni, Ahmadreza Argha, Hamid Alinejad-Rokny, Min Yang, Chengming Li
- ACL 2025 Findings, STeCa: Step-level Trajectory Calibration for LLM Agent Learning, Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li
- ACL 2025, Subtle Errors Matter: Preference Learning via Error-injected Self-editing, Kaishuai Xu, Tiezheng Yu, Wenjun Hou, Yi Cheng, Chak Tou Leong, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li
- EMNLP 2024 Findings, Deeper Insights Without Updates: The Power of In-Context Learning Over Fine-Tuning, Qingyu Yin, Xuzheng He, Luoao Deng, Chak Tou Leong, Fan Wang, Yanzhao Yan, Xiaoyu Shen, Qiang Zhang
- ACL 2024, Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue, Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiaoyong Wei
- ACL 2024 Findings, Muffin: Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback, Jiashuo Wang, Chunpu Xu, Chak Tou Leong, Wenjie Li, Jing Li
- ACMMM 2024, SCREEN: A Benchmark for Situated Conversational Recommendation, Dongding Lin, Jian Wang, Chak Tou Leong, Wenjie Li
- AAAI 2024, COOPER: Coordinating Specialized Agents towards a Complex Dialogue Goal, Yi Cheng, Wenge Liu, Jian Wang, Chak Tou Leong, Yi Ouyang, Wenjie Li, Xian Wu, Yefeng Zheng
