Sibei Yang

She received her Ph.D. degree from the University of Hong Kong in 2020, advised by Prof. Yizhou Yu. Her Ph.D. study was supported by the Hong Kong PhD Fellowship. She obtained her B.S. degree in computer science from Chu Kochen Honors College at Zhejiang University in 2016.

Her general research interests span computer vision, natural language processing, and their intersection.

Research Lab

Our current research interests primarily focus on 1) Open-world Visual Understanding, 2) Neural Generation and Editing, 3) Vision-Language Joint Understanding, 4) Large Language and Vision Models, and 5) Embodied AI. Our mission is to facilitate the learning of unified and universal perception, understanding, reasoning, and generation within the realm of an open world. We believe that learning from multimodal information (especially vision and language) in a general and unified manner holds the key to a deeper understanding of our world.

We are always looking for undergraduate and graduate students!


News

May 20, 2025 2 papers are accepted by ACL 2025 🎉🎉
Feb 27, 2025 6 papers are accepted by CVPR 2025 🥳🥳🥳
Jan 22, 2025 3 papers are accepted by ICLR 2025 🎊🎊🎊
Dec 3, 2024 Sibei Yang will serve as Area Chair for ICCV 2025
Oct 1, 2024 Congratulations to Cheng Shi for once again receiving the National Scholarship, and to Ge Zheng for receiving the National Scholarship as well! 🎉🎉🎉
Aug 13, 2024 One paper is accepted by TPAMI 2024 👏👏👏
Jul 5, 2024 Three papers are accepted by ECCV 2024 🎉🎉🎉
Feb 27, 2024 Two papers are accepted by CVPR 2024 🎊🎊🎊
Jan 15, 2024 One paper is accepted by ICLR 2024 🐲🐲🐲
Dec 30, 2023 Congratulations to Cheng Shi for receiving the National Scholarship, and to Jiajin Tang for receiving the Outstanding Student Award 👏👏👏
Sep 23, 2023 Two papers, Free-Bloom (Zero-Shot Text-to-Video Generation) and DDCoT (CoT Prompting for Multimodal Reasoning in LMs), are accepted by NeurIPS 2023 🎉🎉🎉
Jul 14, 2023 5 papers are accepted by ICCV 2023 🎉
Jun 1, 2023 Congratulations to Ge Zheng on winning the Undergraduate Excellent Graduation Thesis award!
May 27, 2023 Sibei Yang will serve as Area Chair for WACV 2024
May 13, 2023 Our website is released! Thanks to Yufan, Hanzhuo and Yuchen :)

Recent Publications

* equal contribution; † corresponding author.

2025

  1. Last-vit.jpg
    Vision Transformer Needs More Than Register
    In submission, 2025
  2. LoopTrans
    Closed-Loop Transfer for Weakly-supervised Affordance Grounding
    In submission, 2025
  3. SuperChat
    SuperChat: Introducing Super-Image Representation for Video LLMs
    In submission, 2025
  4. EyesWideOpen
    Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video
    In submission, 2025
  5. HalTrapper
    Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context
    In submission, 2025
  6. SCHall
    Discovering Compositional Hallucination in LVLMs
    In submission, 2025
  7. TTA.jpg
    Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM
    Accepted by CVPR, 2025
  8. GCD.png
    Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement
    Accepted by CVPR, 2025
  9. Rethinking.png
    Rethinking Query-based Transformer for Continual Image Segmentation
    Accepted by CVPR, 2025
  10. SegAfford.png
    SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model
    Chunlin Yu*, Hanqing Wang*, Ye Shi, Haoyang Luo, Sibei Yang, Jingyi Yu, Jingya Wang
    Accepted by CVPR, 2025
  11. vton.png
    VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction
    Zijian He, Yuwei Ning, Yipeng Qin, Guangrun Wang, Sibei Yang, Liang Lin, Guanbin Li
    Accepted by CVPR, 2025
  12. dissecting.png
    Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability
    Yingdong Shi, Changming Li, Yifan Wang, Yongxiang Zhao, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren
    Accepted by CVPR, 2025
  13. delman.jpg
    DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing
    Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang
    Accepted by ACL, 2025
  14. DSN.png
    Don’t Say No: Jailbreaking LLM by Suppressing Refusal
    Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin, Sibei Yang, Wenjie Wang
    Accepted by ACL, 2025
  15. mvtokenflow.gif
    MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow
    Hanzhuo Huang*, Yuan Liu*, Ge Zheng, Jiepeng Wang, Zhiyang Dou, Sibei Yang†
    Accepted by ICLR, 2025
  16. cityanchor.png
    CityAnchor: City-scale 3D Visual Grounding with Multi-modality LLMs
    Jinpeng Li*, Haiping Wang*, Jiabin Chen, Yuan Liu, Zhiyang Dou, Yuexin Ma, Sibei Yang, Yuan Li, Wenping Wang, Zhen Dong, Bisheng Yang
    Accepted by ICLR, 2025
  17. ICLR2025
    Discovering Influential Neuron Path in Vision Transformers
    Yifan Wang, Yifei Liu, Yingdong Shi, Changming Li, Anqi Pang, Sibei Yang, Jingyi Yu, Kan Ren
    Accepted by ICLR, 2025

2024

  1. TPAMI2024
    A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-oriented Perspective
    Chaoqi Chen*, Yushuang Wu*, Qiyuan Dai*, Hong-Yu Zhou*, Mutian Xu, Sibei Yang†, Xiaoguang Han†, Yizhou Yu†
    Accepted by TPAMI, 2024
  2. Part2Object.png
    Part2Object: Hierarchical Unsupervised 3D Instance Segmentation
    Accepted by ECCV, 2024
  3. Plain-D.png
    Plain-DNet: A Plain Multi-Dataset Object Detector
    Accepted by ECCV, 2024
  4. wildrefer.png
    WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language
    Zhenxiang Lin, Xidong Peng, Peishan Cong, Ge Zheng, Yujing Sun, Yuenan Hou, Xinge Zhu, Sibei Yang, Yuexin Ma
    Accepted by ECCV, 2024
  5. CVPR2024
    Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
    Accepted by CVPR, 2024
  6. devil.png
    The Devil is in the Object Boundary: Towards Annotation-free Instance Segmentation Using Foundation Models
    Accepted by ICLR, 2024
  7. omg.png
    OMG: Towards Open-vocabulary Motion Generation via Mixture of Controllers
    Han Liang, Jiacheng Bao, Ruichi Zhang, Sihan Ren, Yuecheng Xu, Sibei Yang, Xin Chen, Jingyi Yu, Lan Xu
    Accepted by CVPR, 2024
  8. RealDex.png
    RealDex: Towards Human-like Grasping for Robotic Dexterous Hand
    Yumeng Liu*, Yaxun Yang*, Youzhuo Wang*, Xiaofei Wu, Jiamin Wang, Yichen Yao, Sören Schwertfeger, Sibei Yang, Wenping Wang, Jingyi Yu, Xuming He, Yuexin Ma
    Accepted by IJCAI, 2024

2023

  1. ddcot.jpg
    DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
    Accepted by NeurIPS, 2023
  2. flower_bloom.gif
    Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator
    Hanzhuo Huang*, Yufan Feng*, Cheng Shi, Lan Xu, Jingyi Yu, Sibei Yang†
    Accepted by NeurIPS, 2023
  3. ICCV2023
  4. ICCV2023
    EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment
    Accepted by ICCV, 2023
  5. ICCV2023
    CoTDet: Affordance Knowledge Prompting for Task Driven Object Detection
    Accepted by ICCV, 2023
  6. ICCV2023
    Temporal Collection and Distribution for Referring Video Object Segmentation
    Accepted by ICCV, 2023
  7. ICCV2023
    Grounded Image Text Matching with Mismatched Relation Reasoning
    Yu Wu*, Yana Wei*, Haozhe Wang, Yongfei Liu, Sibei Yang, Xuming He†
    Accepted by ICCV, 2023
  8. CVPR2023
    Contrastive Grouping with Transformer for Referring Image Segmentation
    Accepted by CVPR, 2023
  9. AAAI2023
    CCQ: Cross-Class Query Network for Partially Labeled Organ Segmentation
    Xuyang Liu, Bingbing Wen, Sibei Yang†
    Accepted by AAAI, 2023
  10. SIGGRAPH2023
    DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
    Longwen Zhang*, Qiwei Qiu*, Hongyang Lin*, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang†, Lan Xu†, Jingyi Yu†
    Accepted by SIGGRAPH, 2023
  11. TPAMI2023
    A Unified Visual Information Preservation Framework for Self-supervised Pre-training in Medical Image Analysis
    Hong-Yu Zhou*, Chixiang Lu*, Chaoqi Chen, Sibei Yang, Yizhou Yu†
    Accepted by TPAMI, 2023

2022

  1. ECCV2022
    Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding
    Accepted by ECCV, 2022