
Weitong Cai

London QMUL UESTC, NCTU Multimodal Learning Computer Vision

I received my Ph.D. in Computer Science from Queen Mary University of London (QMUL), under the supervision of Prof. Shaogang Gong FREng. Prior to that, I received both my Master's and Bachelor's degrees from the University of Electronic Science and Technology of China (UESTC), where I was supervised by Prof. Jianwen Chen. I was also a visiting student at National Chiao Tung University (NCTU), supervised by Prof. Hsueh-Ming Hang.

My research lies at the intersection of computer vision, machine learning, and large foundation models, with a particular focus on multimodal representation learning and video understanding.

News
Publications
Color When It Counts: Grayscale-Guided Online Triggering for Always-On Streaming Video Sensing
Weitong Cai, Hang Zhang, Yukai Huang, Shitong Sun, Jiankang Deng, Songcen Xu, Jifei Song, Zhensong Zhang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
Egocentric Co-Pilot: Web-Native Smart-Glasses Agents for Assistive Egocentric AI
Sicheng Yang, Yukai Huang, Weitong Cai, Shitong Sun, Fengyi Fang, You He, Yiqiao Xie, Jiankang Deng, Hang Zhang, Jifei Song, Zhensong Zhang
The ACM Web Conference (WWW), 2026
Plug-and-Play Clarifier: A Zero-Shot Multimodal Framework for Egocentric Intent Disambiguation
Sicheng Yang, Yukai Huang, Weitong Cai, Shitong Sun, You He, Jiankang Deng, Hang Zhang, Jifei Song, Zhensong Zhang
AAAI Conference on Artificial Intelligence (AAAI), 2026
MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval
Weitong Cai, Jiabo Huang, Shaogang Gong, Hailin Jin, Yang Liu
Pattern Recognition (PR), 2025
Semantic Video Moment Retrieval by Temporal Feature Perturbation and Refinement
Weitong Cai, Jiabo Huang, Jian Hu, Shaogang Gong, Hailin Jin, Yang Liu
International Conference on Pattern Recognition Systems (ICPRS), 2024
Relax Image-Specific Prompt Requirement in SAM: A Single Generic Prompt for Segmenting Camouflaged Objects
Jian Hu, Jiayi Lin, Shaogang Gong, Weitong Cai
AAAI Conference on Artificial Intelligence (AAAI), 2024
Hybrid-Learning Video Moment Retrieval across Multi-Domain Labels
Weitong Cai, Jiabo Huang, Shaogang Gong
British Machine Vision Conference (BMVC), 2022
Experiences
    Teaching Fellow (40% full-time) at QMUL for Probability & Matrices (ECS509U, Prof. Qianni Zhang), 2024-2025
    Demonstrator at QMUL for Machine Learning (ECS708P, Prof. Ioannis Patras), Fall, 2023
    Demonstrator at QMUL for Deep Learning and Computer Vision (ECS795P, Prof. Shaogang Gong), Spring, 2021, 2022, and 2023
    Teaching Assistant at UESTC for Information Theory and Information Coding (Prof. Wenyi Wang), Spring, 2019
Misc
  • My name is more formally written as 蔡卫彤.
  • Swimming is one of my passions, with my specialties being backstroke and medley.
  • I am a fan of StarCraft, both 1 and 2, and I hope there will be a 3.
  • I also enjoy the mainline Pokémon games and hope to see a remake of Emerald.

Last updated: March 2026


Special thanks to the template!