Yuxiang Cai is an assistant professor in the School of Software Technology at Zhejiang University. He received his Ph.D. in computer science and technology from Zhejiang University in 2023, supervised by Prof. Jianwei Yin. His research interests lie in crossover services, model transfer and vision-language models (VLMs). In particular, he is actively working on intelligent interpretation of multimodal data (video, 2D/3D image, …), transfer learning (UDA, TTA, …), AI+X (remote sensing, medicine, …) and space–air–ground integrated computing.

Students interested in joining our research group (Master's students or interns) are welcome to contact me at caiyuxiang AT zju.edu.cn.

🔥 News

  • 2026.02:  🎉 One paper is accepted by CVPR 2026
  • 2026.02:   I was invited to serve as a reviewer for ACMMM 2026
  • 2026.02:   I was invited to serve as a reviewer for ICML 2026
  • 2026.01:  🎉 One paper is accepted by ICLR 2026
  • 2026.01:  🎉 One paper is accepted by ICASSP 2026
  • 2025.12:   I was invited to serve as a reviewer for IJCAI-ECAI 2026
  • 2025.11:   I was invited to serve as a reviewer for CVPR 2026
  • 2025.09:  🎉 One United States Patent is granted
  • 2025.09:  🎉 One patent is granted by the State Intellectual Property Office of China
  • 2025.08:   I was invited to serve as a reviewer for AAAI 2026
  • 2025.05:  🎉 One paper is accepted by IEEE GRSM (IF=16.4)
  • 2025.04:  🎉 One patent is granted by the State Intellectual Property Office of China
  • 2025.03:  🎉 One paper is accepted by ICME 2025
  • 2025.02:   I was invited to serve as a reviewer for ACMMM 2025
  • 2025.01:   I was invited to serve as a reviewer for IJCAI 2025
  • 2024.07:  🎉 One paper is accepted by ACMMM 2024
  • 2024.01:   Joined the School of Software Technology at Zhejiang University as an Assistant Professor
  • 2023.11:  🎉 One patent is granted by the State Intellectual Property Office of China
  • 2023.07:  🎉 One paper is accepted by ACMMM 2023
  • 2023.05:   I was invited to serve as a reviewer for IEEE TGRS
  • 2023.02:   I was invited to serve as a reviewer for NPL
  • 2022.12:  🎉 One paper is accepted by IEEE TGRS
  • 2022.09:   I was invited to serve as a reviewer for TJSC
  • 2022.08:  🎉 One paper is accepted by IEEE TGRS

📝 Selected Publications

CVPR 2026

IBISAgent: Reinforcing Pixel-Level Visual Reasoning in MLLMs for Universal Biomedical Object Referring and Segmentation

Yankai Jiang, Qiaoru Li, Binlu Xu, Haoran Sun, Chao Ding, Junting Dong, Yuxiang Cai*, Xuhong Zhang, Jianwei Yin.

  • This work proposes a novel agentic MLLM, named IBISAgent, that reformulates segmentation as a vision-centric, multi-step decision-making process. IBISAgent enables MLLMs to generate interleaved reasoning and text-based click actions, invoke segmentation tools, and produce high-quality masks without architectural modifications.
ACMMM 2024

MultiDAN: Unsupervised, Multistage, Multisource and Multitarget Domain Adaptation for Semantic Segmentation of Remote Sensing Images

Yuxiang Cai, Yongheng Shang, Jianwei Yin*

  • This work proposes MultiDAN, a novel multistage, multisource and multitarget unsupervised domain adaptation network for semantic segmentation of remote sensing images.
ACMMM 2023

Exploring High-Correlation Source Domain Information for Multi-Source Domain Adaptation in Semantic Segmentation

Yuxiang Cai, Meng Xi*, Yongheng Shang, Jianwei Yin

  • This work proposes a novel multi-source domain adaptation method for semantic segmentation that maximally exploits the high-correlation source domains and source pixels for the target domain.
TGRS 2023

DASRSNet: Multitask Domain Adaptation for Super-Resolution-Aided Semantic Segmentation of Remote Sensing Images

Yuxiang Cai, Yingchun Yang, Yongheng Shang, Zhengwei Shen, Jianwei Yin*

  • This work proposes DASRSNet, a novel multitask domain adaptation network for cross-domain semantic segmentation of low-resolution remote sensing images.
TGRS 2022

IterDANet: Iterative Intra-domain Adaptation for Semantic Segmentation of Remote Sensing Images

Yuxiang Cai, Yingchun Yang, Yongheng Shang, Zhenqian Chen, Zhengwei Shen, Jianwei Yin*

  • This work proposes IterDANet, an iterative intra-domain adaptation network for semantic segmentation of remote sensing images.

 * denotes corresponding author

📓 Selected Patents

  • US, “SEMANTIC SEGMENTATION METHOD FOR CROSS-DOMAIN REMOTE SENSING IMAGES BASED ON ITERATIVE INTRA-DOMAIN ADAPTATION”, US 12423825 B2.
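  • CN, “Multi-Source Domain Adaptation Semantic Segmentation Method and Apparatus Based on Multi-Level Domain Correlation” (基于多级领域相关度的多源领域自适应语义分割方法及装置), ZL 2023 1 1119643.3.
  • CN, “Cross-Domain Remote Sensing Image Semantic Segmentation Method Based on Iterative Intra-Domain Adaptation and Self-Training” (基于迭代域内适应和自训练的跨域遥感图像语义分割方法), ZL 2022 1 0402338.4.
  • CN, “Cross-Satellite Remote Sensing Image Semantic Segmentation Method Based on Bidirectional Unsupervised Domain Adaptation Fusion” (基于双向无监督域适应融合的跨星遥感图像语义分割方法), ZL 2021 1 1017498.9.
  • CN, “Object Detection Method for High-Resolution Remote Sensing Scenes Based on Size-Balanced FCOS” (基于尺寸平衡FCOS的高分辨率遥感场景目标检测方法), ZL 2021 1 1143539.9.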

📚 Professional Activities

  • Reviewer for IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), International Conference on Machine Learning (ICML), ACM International Conference on Multimedia (ACMMM), AAAI Conference on Artificial Intelligence (AAAI), International Joint Conference on Artificial Intelligence (IJCAI), …
  • Reviewer for IEEE Transactions on Geoscience and Remote Sensing (TGRS), Neural Processing Letters (NPL), The Journal of Supercomputing (TJSC), …

🎓 Education

  • 2018.09 - 2023.12, Ph.D., Computer Science and Technology, Zhejiang University, Hangzhou.
  • 2014.09 - 2018.06, B.E., Computer Science and Technology, Lanzhou University, Lanzhou.

💬 Contact

  • caiyuxiang AT zju dot edu dot cn
  • Building 3, School of Software Technology, Zhejiang University, Ningbo, China.