Soolab Sibei Yang

Sibei Yang is an Associate Professor and PhD Supervisor at the HCP Lab, School of Computer Science and Engineering, Sun Yat-sen University. Prior to this, she was a Tenure-track Assistant Professor at ShanghaiTech University from 2021 and a Research Assistant Professor at The Hong Kong Polytechnic University from 2020. She received her Ph.D. from the University of Hong Kong in 2020, under the supervision of Prof. Yizhou Yu, with support from the Hong Kong PhD Fellowship. Dr. Yang obtained her B.S. in Computer Science from the Chu Kochen Honors College at Zhejiang University in 2016.

She has published over 60 papers in leading venues such as TPAMI, CVPR, ICCV, ECCV, NeurIPS, ICLR, and SIGGRAPH. She is the first or corresponding author of more than 30 papers in top-tier (CCF-A) journals and conferences.

Research Interests

Our current research interests primarily focus on 1) Multimodal Large Language Models (LLM/MLLM), 2) Vision-Language Understanding and Generation, 3) Embodied AI, and 4) Open-world Visual Understanding. Our mission is to facilitate the learning of unified and universal perception, understanding, reasoning, generation, and interaction within the realm of an open world. We envision that harnessing multimodal information—particularly vision and language—in a unified and generalizable manner, and enabling its deployment in real-world physical interactions, is fundamental to advancing our understanding of the world and fostering the emergence of artificial intelligence.

[Recruitment – Jan 2026] We are seeking Research Interns; applicants should submit their transcript and CV to sibeiyang9@gmail.com with the subject line [Research Intern – Name]. Preference will be given to candidates intending to apply for graduate study in our group.

News

Mar, 2026	3 papers are accepted by CVPR 2026.🥳🥳🥳
Jan, 2026	2 papers are accepted by ICLR 2026.🐲🐲🐲
Dec, 2025	Sibei Yang will serve as Area Chair for ECCV 2026.
Oct, 2025	Congratulations to Yulin Zhang and Jiajin Tang for receiving the National Scholarship👍
Sep, 2025	Sibei Yang is listed in the Stanford University/Elsevier “World’s Top 2% Scientists”
Sep, 2025	5 papers are accepted by NeurIPS 2025 🎊🎊🎊
Aug, 2025	Sibei Yang will serve as Area Chair for ICLR 2026
Jun, 2025	7 papers are accepted by ICCV 2025 👏👏👏
May, 2025	2 papers are accepted by ACL 2025 🎉🎉
Feb, 2025	6 papers are accepted by CVPR 2025 🥳🥳🥳
Jan, 2025	3 papers are accepted by ICLR 2025 🎊🎊🎊
Dec, 2024	Sibei Yang will serve as Area Chair for ICCV 2025
Oct, 2024	Congratulations to Cheng Shi for once again receiving the National Scholarship! 🎉🎉🎉
Aug, 2024	One paper is accepted by TPAMI 2024 👏👏👏
Jul, 2024	Three papers are accepted by ECCV 2024 🎉🎉🎉
Feb, 2024	Two papers are accepted by CVPR 2024 🎊🎊🎊
Jan, 2024	One paper is accepted by ICLR 2024 🐲🐲🐲
Dec, 2023	Congratulations to Cheng Shi for receiving the National Scholarship, and to Jiajin Tang for receiving the Outstanding Student Award. 👏👏👏
Sep, 2023	2 papers, Free-Bloom (Zero-Shot Text-to-Video Generation) and DDCoT (CoT Prompting for Multimodal Reasoning in LMs), are accepted by NeurIPS 2023 🎉🎉🎉
Jul, 2023	5 papers are accepted by ICCV 2023 🎉
May, 2023	Sibei Yang will serve as Area Chair for WACV 2024.
May, 2023	Our website is released! Thanks to Yufan, Hanzhuo and Yuchen :)

Recent Publications

* equal contribution; † corresponding author.

2026

Vision Transformer Needs More Than Register

Cheng Shi, Yizhou Yu, and Sibei Yang†

Accepted by CVPR, 2026

arXiv Code

WeaveTime: Streaming from Earlier Frames into Emergent Memory in VideoLLMs

Yulin Zhang, Cheng Shi, and Sibei Yang†

Accepted by CVPR, 2026

arXiv HTML

Direct Language Embedding Enables Gaussian Splatting for Large Scenes

Zhida Li, Jianqiao Zhu, Hejin Huang, Yipeng Qin, Sibei Yang, Guanbin Li

Accepted by CVPR Finding, 2026

Chart Deep Research in LVLMs via Parallel Relative Policy Optimization

Jiajin Tang, Gaoyang, Wenjie Wang, Sibei Yang†, Xing Chen

Accepted by ICLR, 2026

RefAny3D: 3D Asset-Referenced Diffusion Models for Image Generation

Hanzhuo Huang, Qingyang Bao, Zekai Gu, Zhongshuo Du, Cheng Lin, Yuan Liu†, Sibei Yang†

Accepted by ICLR, 2026

arXiv HTML Code

2025

Vision Function Layer in Multimodal LLMs

Cheng Shi, Yizhou Yu, and Sibei Yang†

Accepted by NeurIPS, 2025

arXiv PDF Code

Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video

Yulin Zhang, Cheng Shi, Yang Wang, Sibei Yang†

Accepted by NeurIPS, 2025

arXiv HTML PDF

Discovering Compositional Hallucination in LVLMs

Sibei Yang†, Ge Zheng, Jiajin Tang, Jiaye Qian, Hanzhuo Huang, Cheng Shi

Accepted by NeurIPS, 2025

PDF

Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats

Jiaye Qian, Ge Zheng, Yuchen Zhu, Sibei Yang†

Accepted by NeurIPS, 2025

arXiv PDF Code

Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models

Xu Yue, Chengyan Fu, Li Xiong, Sibei Yang, Wenjie Wang

Accepted by NeurIPS, 2025

arXiv PDF

Sim-DETR: Unlock DETR for Temporal Sentence Grounding

Jiajin Tang*, Zhengxuan Wei*, Yuchen Zhu, Cheng Shi, Guanbin Li, Liang Lin, Sibei Yang†

Accepted by ICCV, 2025

arXiv PDF

Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context

Ge Zheng*, Jiaye Qian*, Jiajin Tang, Sibei Yang†

Accepted by ICCV, 2025

arXiv PDF

No More Sibling Rivalry: Debiasing Human-Object Interaction Detection

Bin Yang*, Yulin Zhang*, Hong-Yu Zhou, Sibei Yang†

Accepted by ICCV, 2025

arXiv PDF

Closed-Loop Transfer for Weakly-supervised Affordance Grounding

Jiajin Tang*, Zhengxuan Wei*, Ge Zheng, Sibei Yang†

Accepted by ICCV, 2025

arXiv PDF

Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning

Zhengxuan Wei*, Jiajin Tang*, Sibei Yang†

Accepted by ICCV, 2025

arXiv PDF Code

Penalizing Boundary Activation for Object Completeness in Diffusion Models

Haoyang Xu, Tianhao Zhao, Sibei Yang, Yutian Lin

Accepted by ICCV, 2025

arXiv PDF Code

VLDrive: Vision-Augmented Lightweight MLLMs for Efficient Language-grounded Autonomous Driving

Ruifei Zhang, Wei Zhang, Xiao Tan, Sibei Yang, Xiang Wan, Xiaonan Luo, Guanbin Li

Accepted by ICCV, 2025

arXiv PDF Code

Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM

Qiyuan Dai and Sibei Yang†

Accepted by CVPR, 2025

arXiv PDF

Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement

Qiyuan Dai, Hanzhuo Huang, Yu Wu, Sibei Yang†

Accepted by CVPR, 2025

arXiv PDF

Rethinking Query-based Transformer for Continual Image Segmentation

Yuchen Zhu*, Cheng Shi*, Dingyou Wang, Jiajin Tang, Zhengxuan Wei, Yu Wu, Sibei Yang†

Accepted by CVPR, 2025

arXiv PDF Code

SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model

Chunlin Yu*, Hanqing Wang*, Ye Shi, Haoyang Luo, Sibei Yang, Jingyi Yu, Jingya Wang

Accepted by CVPR, 2025

arXiv HTML Code

VTON 360: High-fidelity virtual try-on from any viewing direction

Zijian He, Yuwei Ning, Yipeng Qin, Guangrun Wang, Sibei Yang, Liang Lin, Guanbin Li

Accepted by CVPR, 2025

arXiv HTML Code

Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability

Yingdong Shi, Changming Li, Yifan Wang, Yongxiang Zhao, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren

Accepted by CVPR, 2025

arXiv HTML

DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing

Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang

Accepted by ACL, 2025

arXiv PDF Code

Don’t Say No: Jailbreaking LLM by Suppressing Refusal

Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin, Sibei Yang, Wenjie Wang

Accepted by ACL, 2025

arXiv PDF Code

MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow

Hanzhuo Huang*, Yuan Liu*, Ge Zheng, Jiepeng Wang, Zhiyang Dou, Sibei Yang†

Accepted by ICLR, 2025

arXiv HTML Code

CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs

Jinpeng Li*, Haiping Wang*, Jiabin Chen, Yuan Liu, Zhiyang Dou, Yuexin Ma, Sibei Yang, Yuan Li, Wenping Wang, Zhen Dong, Bisheng Yang

Accepted by ICLR, 2025

PDF Code

Discovering Influential Neuron Path in Vision Transformers

Yifan Wang, Yifei Liu, Yingdong Shi, Changming Li, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren

Accepted by ICLR, 2025

arXiv Code