📝 Publications

See the full list on Google Scholar.

Preprint

Direct Preference Optimization Using Sparse Feature-Level Constraints
Qingyu Yin†, Chak Tou Leong†, Hongbo Zhang, Minjun Zhu, Hanqi Yan, Qiang Zhang, Yulan He, Wenjie Li, Jun Wang, Yue Zhang, Linyi Yang

EMNLP 2024 Findings

E^2CL: Exploration-based Error Correction Learning for Embodied Agents
Hanlin Wang†, Chak Tou Leong†, Jian Wang, Wenjie Li

Preprint

No Two Devils Alike: Unveiling Distinct Mechanisms of Fine-tuning Attacks
Chak Tou Leong, Yi Cheng, Kaishuai Xu, Jian Wang, Hanlin Wang, Wenjie Li

EMNLP 2023

Self-Detoxifying Language Models via Toxification Reversal
Chak Tou Leong†, Yi Cheng†, Jiashuo Wang, Jian Wang, Wenjie Li