I am a third-year PhD student in the Natural Language Processing Group at The Hong Kong Polytechnic University (PolyU), advised by Prof. Maggie Wenjie Li. Before that, I received my BEng degree from the School of Computer Science, Wuhan University, in 2022.
My primary focus is on uncovering mechanistic insights to enhance the safety alignment of Large Language Models (LLMs). Closely related to this, I am also passionate about the mechanistic interpretability of LLMs’ general computational processes (check out Awesome-LLM-Interpretability!). Beyond these core areas, I have broad interests in LLM alignment, improving their reasoning capabilities, and developing more effective interactions between LLMs, humans, and the environment.
🔥 News
- 2024.09: 🎉 Two papers accepted to EMNLP 2024
- 2024.05: 🎉 Two papers accepted to ACL 2024
📝 Publications
E^2CL: Exploration-based Error Correction Learning for Embodied Agents
Hanlin Wang†, Chak Tou Leong†, Jian Wang, Wenjie Li
No Two Devils Alike: Unveiling Distinct Mechanisms of Fine-tuning Attacks
Chak Tou Leong, Yi Cheng, Kaishuai Xu, Jian Wang, Hanlin Wang, Wenjie Li
Self-Detoxifying Language Models via Toxification Reversal
Chak Tou Leong†, Yi Cheng†, Jiashuo Wang, Jian Wang, Wenjie Li
EMNLP 2024 Findings
Deeper Insights Without Updates: The Power of In-Context Learning Over Fine-Tuning, Qingyu Yin, Xuzheng He, Luoao Deng, Chak Tou Leong, Fan Wang, Yanzhao Yan, Xiaoyu Shen, Qiang Zhang
ACL 2024
Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue, Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiaoyong Wei
ACL 2024 Findings
Muffin: Mitigating Unhelpfulness in Emotional Support Conversations with Multifaceted AI Feedback, Jiashuo Wang, Chunpu Xu, Chak Tou Leong, Wenjie Li, Jing Li
ACM MM 2024
SCREEN: A Benchmark for Situated Conversational Recommendation, Dongding Lin, Jian Wang, Chak Tou Leong, Wenjie Li
AAAI 2024
COOPER: Coordinating Specialized Agents towards a Complex Dialogue Goal, Yi Cheng, Wenge Liu, Jian Wang, Chak Tou Leong, Yi Ouyang, Wenjie Li, Xian Wu, Yefeng Zheng
🎖 Honors and Awards
- 2021.12: Outstanding Prize, Scholarships for Hong Kong, Macau and Overseas Chinese Students (awarded to 9 students school-wide)
Academic and Professional Activities
Open-Source Projects
- Awesome-LLM-Interpretability
- A curated list of LLM interpretability materials: tutorials, libraries, surveys, papers, blogs, etc.
Academic Services
- Conference Reviewer: NeurIPS 2024, ICLR 2025
Teaching Assistant
- COMP 6709: Advanced Natural Language Processing, Spring 2024, PolyU
- COMP 4133: Information Retrieval, Fall 2023, PolyU
- COMP 5423: Natural Language Processing, Spring 2023, PolyU