Sibei Yang is an Associate Professor and PhD Supervisor at the HCP Lab, School of Computer Science and Engineering, Sun Yat-sen University. Prior to this, she was a Tenure-track Assistant Professor at ShanghaiTech University starting in 2021 and a Research Assistant Professor at The Hong Kong Polytechnic University starting in 2020. She received her Ph.D. from the University of Hong Kong in 2020, under the supervision of Prof. Yizhou Yu, with support from the Hong Kong PhD Fellowship. She obtained her B.S. in Computer Science from the Chu Kochen Honors College at Zhejiang University in 2016.
She has published over 60 papers in leading venues such as TPAMI, CVPR, ICCV, ECCV, NeurIPS, ICLR, and SIGGRAPH. As the first or corresponding author, Sibei Yang has contributed to more than 30 papers in top-tier (CCF-A) journals and conferences.
Research Interests
Our current research interests primarily focus on 1) Multimodal Large Language Models (LLM/MLLM), 2) Vision-Language Understanding and Generation, 3) Embodied AI, and 4) Open-world Visual Understanding. Our mission is to facilitate the learning of unified and universal perception, understanding, reasoning, generation, and interaction within the realm of an open world. We envision that harnessing multimodal information—particularly vision and language—in a unified and generalizable manner, and enabling its deployment in real-world physical interactions, is fundamental to advancing our understanding of the world and fostering the emergence of artificial intelligence.
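We are recruiting master's and PhD students for Fall 2026 admission. Interested students are welcome to contact me with a CV at sibeiyang9@gmail.com. To learn about the group from a student's perspective, you can contact Cheng Shi, a senior member of the group, at shicheng2025@connect.hku.hk.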
| Oct, 2025 | Congratulations to Yulin Zhang and Jiajin Tang for receiving the National Scholarship 👍 |
| Sep, 2025 | Sibei Yang is listed in the Stanford University/Elsevier “World’s Top 2% Scientists” |
| Sep, 2025 | 5 papers are accepted by NeurIPS 2025 🎊🎊🎊 |
| Aug, 2025 | Sibei Yang will serve as Area Chair for ICLR 2026 |
| Jun, 2025 | 7 papers are accepted by ICCV 2025 👏👏👏 |
| May, 2025 | 2 papers are accepted by ACL 2025 🎉🎉 |
| Feb, 2025 | 6 papers are accepted by CVPR 2025 🥳🥳🥳 |
| Jan, 2025 | 3 papers are accepted by ICLR 2025 🎊🎊🎊 |
| Dec, 2024 | Sibei Yang will serve as Area Chair for ICCV 2025 |
| Oct, 2024 | Congratulations to Cheng Shi for once again receiving the National Scholarship! 🎉🎉🎉 |
| Aug, 2024 | One paper is accepted by TPAMI 2024 👏👏👏 |
| Jul, 2024 | Three papers are accepted by ECCV 2024 🎉🎉🎉 |
| Feb, 2024 | Two papers are accepted by CVPR 2024 🎊🎊🎊 |
| Jan, 2024 | One paper is accepted by ICLR 2024 🐲🐲🐲 |
| Dec, 2023 | Congratulations to Cheng Shi for receiving the National Scholarship, and to Jiajin Tang for receiving the Outstanding Student Award. 👏👏👏 |
| Sep, 2023 | 2 papers, Free-Bloom (Zero-Shot Text-to-Video Generation) and DDCoT (CoT Prompting for Multimodal Reasoning in LMs), are accepted by NeurIPS 2023 🎉🎉🎉 |
| Jul, 2023 | 5 papers are accepted by ICCV 2023 🎉 |
| May, 2023 | Sibei Yang will serve as Area Chair for WACV 2024. |
| May, 2023 | Our website is released! Thanks to Yufan, Hanzhuo and Yuchen :) |
* equal contribution; † corresponding author.
2025
-
Vision Function Layer in Multimodal LLMs
Accepted by NeurIPS, 2025
-
Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
Accepted by NeurIPS, 2025
-
Discovering Compositional Hallucination in LVLMs
Accepted by NeurIPS, 2025
-
Intervene-All-Paths: Unified Mitigation of LVLM Hallucinations across Alignment Formats
Accepted by NeurIPS, 2025
-
Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
Xu Yue, Chengyan Fu, Li Xiong,
Sibei Yang, Wenjie Wang
Accepted by NeurIPS, 2025
-
Vision Transformer Needs More Than Register
In submission, 2025
-
Sim-DETR: Unlock DETR for Temporal Sentence Grounding
Accepted by ICCV, 2025
-
Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context
Accepted by ICCV, 2025
-
No More Sibling Rivalry: Debiasing Human-Object Interaction Detection
Accepted by ICCV, 2025
-
Closed-Loop Transfer for Weakly-supervised Affordance Grounding
Accepted by ICCV, 2025
-
Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning
Accepted by ICCV, 2025
-
Penalizing Boundary Activation for Object Completeness in Diffusion Models
Accepted by ICCV, 2025
-
VLDrive: Vision-Augmented Lightweight MLLMs for Efficient Language-grounded Autonomous Driving
Ruifei Zhang, Wei Zhang, Xiao Tan,
Sibei Yang, Xiang Wan, Xiaonan Luo, Guanbin Li
Accepted by ICCV, 2025
-
Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM
Accepted by CVPR, 2025
-
Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
Accepted by CVPR, 2025
-
Rethinking Query-based Transformer for Continual Image Segmentation
Accepted by CVPR, 2025
-
SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model
Chunlin Yu*, Hanqing Wang*, Ye Shi, Haoyang Luo,
Sibei Yang, Jingyi Yu, Jingya Wang
Accepted by CVPR, 2025
-
VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction
Zijian He, Yuwei Ning, Yipeng Qin, Guangrun Wang,
Sibei Yang, Liang Lin, Guanbin Li
Accepted by CVPR, 2025
-
Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability
Yingdong Shi, Changming Li, Yifan Wang, Yongxiang Zhao, Anqi Pang,
Sibei Yang, Jingyi Yu, Kan Ren
Accepted by CVPR, 2025
-
DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing
Yi Wang, Fenghua Weng,
Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang
Accepted by ACL, 2025
-
Don’t Say No: Jailbreaking LLM by Suppressing Refusal
Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin,
Sibei Yang, Wenjie Wang
Accepted by ACL, 2025
-
MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
Accepted by ICLR, 2025
-
CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs
Jinpeng Li*, Haiping Wang*, Jiabin Chen, Yuan Liu, Zhiyang Dou, Yuexin Ma,
Sibei Yang, Yuan Li, Wenping Wang, Zhen Dong, Bisheng Yang
Accepted by ICLR, 2025
-
Discovering Influential Neuron Path in Vision Transformers
Yifan Wang, Yifei Liu, Yingdong Shi, Changming Li, Anqi Pang,
Sibei Yang, Jingyi Yu, Kan Ren
Accepted by ICLR, 2025
2024
-
A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-oriented Perspective
Accepted by TPAMI, 2024
-
Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
Accepted by ECCV, 2024
-
Plain-DNet: A Plain Multi-Dataset Object Detector
Accepted by ECCV, 2024
-
WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language
Zhenxiang Lin, Xidong Peng, Peishan Cong,
Ge Zheng, Yujing Sun, Yuenan Hou, Xinge Zhu,
Sibei Yang, Yuexin Ma
Accepted by ECCV, 2024
-
Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
Accepted by CVPR, 2024
-
The Devil is in the Object Boundary: Towards Annotation-free Instance Segmentation Using Foundation Models
Accepted by ICLR, 2024
-
OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers
Han Liang, Jiacheng Bao, Ruichi Zhang, Sihan Ren, Yuecheng Xu,
Sibei Yang, Xin Chen, Jingyi Yu, Lan Xu
Accepted by CVPR, 2024
-
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand
Yumeng Liu*, Yaxun Yang*, Youzhuo Wang*, Xiaofei Wu, Jiamin Wang, Yichen Yao, Sören Schwertfeger,
Sibei Yang, Wenping Wang, Jingyi Yu, Xuming He, Yuexin Ma
Accepted by IJCAI, 2024
2023
-
DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
Accepted by NeurIPS, 2023
-
Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator
Accepted by NeurIPS, 2023
-
LoGoPrompt: Synthetic Text Images Can Be Good Visual Prompts for Vision-Language Models
Accepted by ICCV, 2023
-
EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment
Accepted by ICCV, 2023
-
CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection
Accepted by ICCV, 2023
-
Temporal Collection and Distribution for Referring Video Object Segmentation
Accepted by ICCV, 2023
-
Grounded Image Text Matching with Mismatched Relation Reasoning
Yu Wu*, Yana Wei*, Haozhe Wang, Yongfei Liu,
Sibei Yang, Xuming He†
Accepted by ICCV, 2023
-
Contrastive Grouping with Transformer for Referring Image Segmentation
Accepted by CVPR, 2023
-
CCQ: Cross-Class Query Network for Partially Labeled Organ Segmentation
Accepted by AAAI, 2023
-
DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
Longwen Zhang*, Qiwei Qiu*, Hongyang Lin*, Qixuan Zhang,
Cheng Shi, Wei Yang, Ye Shi,
Sibei Yang†, Lan Xu†, Jingyi Yu†
Accepted by SIGGRAPH, 2023
-
A Unified Visual Information Preservation Framework for Self-supervised Pre-training in Medical Image Analysis
Hong-Yu Zhou*, Chixiang Lu*, Chaoqi Chen,
Sibei Yang, Yizhou Yu†
Accepted by TPAMI, 2023
2022
-
Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding
Accepted by ECCV, 2022