๐Ÿ“ Publications

๐Ÿ“ Publications

Book

Springer 2025

Visual Object Tracking: An Evaluation Perspective
X. Zhao, Shiyu Hu, X. Yin
Springer, Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)
📌 Visual Object Tracking 📌 Intelligent Evaluation Technology
📃 Book

Acceptance

TPAMI 2023

Global Instance Tracking: Locating Target More Like Humans
Shiyu Hu, X. Zhao, L. Huang, K. Huang
IEEE Transactions on Pattern Analysis and Machine Intelligence (CCF-A Journal)
📌 Visual Object Tracking 📌 Large-scale Benchmark Construction 📌 Intelligent Evaluation Technology
📃 Paper 📑 PDF 🪧 Poster 🌐 Platform 🔧 Toolkit 💾 Dataset

IJCV 2024

SOTVerse: A User-defined Task Space of Single Object Tracking
Shiyu Hu, X. Zhao, K. Huang
International Journal of Computer Vision (CCF-A Journal)
📌 Visual Object Tracking 📌 Dynamic Open Environment Construction 📌 3E Paradigm
📃 Paper 📑 PDF 🌐 Platform

IJCV 2024

BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision
X. Zhao, Shiyu Hu✉️, Y. Wang, J. Zhang, Y. Hu, R. Liu, H. Lin, Y. Li, R. Li, K. Liu, J. Li
International Journal of Computer Vision (CCF-A Journal)
📌 Visual Object Tracking 📌 Drone-based Tracking 📌 Visual Robustness
📃 Paper 🌐 Platform 📑 PDF 🔧 Toolkit 💾 Dataset

NeurIPS 2023

A Multi-modal Global Instance Tracking Benchmark (MGIT): Better Locating Target in Complex Spatio-temporal and Causal Relationship
Shiyu Hu, D. Zhang, M. Wu, X. Feng, X. Li, X. Zhao, K. Huang
Conference on Neural Information Processing Systems (CCF-A Conference, Poster)
📌 Visual Language Tracking 📌 Long Video Understanding and Reasoning 📌 Hierarchical Semantic Information Annotation
📃 Paper 📃 PDF 🪧 Poster 📹 Slides 🌐 Platform 🔧 Toolkit 💾 Dataset

中国图象图形学报 2023

Visual Intelligence Evaluation Techniques for Single Object Tracking: A Survey (单目标跟踪中的视觉智能评估技术综述)
Shiyu Hu, X. Zhao, K. Huang
Journal of Image and Graphics (《中国图象图形学报》, CCF-B Chinese Journal)
📌 Visual Object Tracking 📌 Intelligent Evaluation Technique 📌 AI4Science
📃 Paper 📑 PDF

IET-CVI 2025

Improved SAR Aircraft Detection Algorithm Based on Visual State Space Models
Y. Wang, J. Zhang, Y. Wang, Shiyu Hu✉️, B. Shen, Z. Hou, W. Zhou
IET Computer Vision (CCF-C Journal)
📌 Synthetic Aperture Radar 📌 State Space Models 📌 Aircraft Object Detection

ICML 2025

CSTrack: Enhancing RGB-X Tracking via Compact Spatiotemporal Features
X. Feng, D. Zhang, Shiyu Hu, X. Li, M. Wu, J. Zhang, X. Chen, K. Huang
International Conference on Machine Learning (CCF-A Conference, Poster)
📌 Visual Object Tracking 📌 Multi-modal Learning

NeurIPS 2024

Beyond Accuracy: Tracking more like Human via Visual Search
D. Zhang, Shiyu Hu, X. Feng, X. Li, M. Wu, J. Zhang, K. Huang
Conference on Neural Information Processing Systems (CCF-A Conference, Poster)
📌 Visual Object Tracking 📌 Visual Search Mechanism 📌 Visual Turing Test

NeurIPS 2024

MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts
X. Feng, X. Li, Shiyu Hu, D. Zhang, M. Wu, J. Zhang, X. Chen, K. Huang
Conference on Neural Information Processing Systems (CCF-A Conference, Poster)
📌 Visual Language Tracking 📌 Human-like Memory Modeling 📌 Adaptive Prompts

CVPRW 2024

Diverse Text Generation for Visual Language Tracking Based on LLM
X. Li, X. Feng, Shiyu Hu, M. Wu, D. Zhang, J. Zhang, K. Huang
The 3rd Workshop on Vision Datasets Understanding and DataCV Challenge in CVPR 2024 (Workshop in CCF-A Conference, Oral, Best Paper Honorable Mention)
📌 Visual Language Tracking 📌 Large Language Model 📌 Evaluation Technique
📃 Paper 📃 PDF 🪧 Poster 📹 Slides 🌐 Platform 🔧 Toolkit 💾 Dataset 🏆 Award

ICASSP 2025

Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues
X. Feng, D. Zhang, Shiyu Hu, X. Li, M. Wu, J. Zhang, X. Chen, K. Huang
IEEE International Conference on Acoustics, Speech, and Signal Processing (CCF-B Conference, Poster)
📌 Visual Language Tracking 📌 Multi-modal Learning 📌 Grounding Model

ICASSP 2024

Robust Single-particle Cryo-EM Image Denoising and Restoration
J. Zhang, T. Zhao, Shiyu Hu, X. Zhao
IEEE International Conference on Acoustics, Speech, and Signal Processing (CCF-B Conference, Poster)
📌 Medical Image Processing 📌 AI4Science 📌 Diffusion Model
📃 Paper 📑 PDF

TCSVT 2024

Finger in Camera Speaks Everything: Unconstrained Air-Writing for Real-World
M. Wu, K. Huang, Y. Cai, Shiyu Hu, Y. Zhao, W. Wang
IEEE Transactions on Circuits and Systems for Video Technology (CCF-B Journal)
📌 Air-writing Technique 📌 Benchmark Construction 📌 Human-machine Interaction
📃 Paper 📃 PDF 🔧 Toolkit

PRCV 2024

VS-LLM: Visual-Semantic Depression Assessment based on LLM for Drawing Projection Test
M. Wu, Y. Kang, X. Li, Shiyu Hu, X. Chen, Y. Kang, W. Wang, K. Huang
Chinese Conference on Pattern Recognition and Computer Vision (CCF-C Conference)
📌 Psychological Assessment System 📌 Gamified Assessment 📌 AI4Science
📃 Paper 📃 PDF

PRCV 2023

A Hierarchical Theme Recognition Model for Sandplay Therapy
X. Feng, Shiyu Hu, X. Chen, K. Huang
Chinese Conference on Pattern Recognition and Computer Vision (CCF-C Conference, Poster)
📌 Psychological Assessment System 📌 Gamified Assessment 📌 AI4Science
📃 Paper 📑 PDF 🔖 Supplementary 🪧 Poster

Neurocomputing 2022

Revisiting Instance Search: A New Benchmark Using Cycle Self-training
Y. Zhang, C. Liu, W. Chen, X. Xu, F. Wang, H. Li, Shiyu Hu, X. Zhao
Neurocomputing (CCF-C Journal)
📌 Video Instance Search 📌 Benchmark Construction 📌 Data Mining
📃 Paper 📑 PDF 🌐 Project

图学学报 2021

Visual Turing: The Next Development of Computer Vision in The View of Human-computer Gaming (视觉图灵:从人机对抗看计算机视觉下一步发展)
K. Huang, X. Zhao, Q. Li, Shiyu Hu
Journal of Graphics (《图学学报》, CCF-C Chinese Journal)
📌 Visual Object Tracking 📌 Intelligent Evaluation Technique 📌 AI4Science
📃 Paper 📑 PDF

C&E:AI 2025

Artificial Intelligence-Enabled Adaptive Learning Platforms: A Review
L. Tan, Shiyu Hu, Darren J. Yeo, K. Cheong
Computers & Education: Artificial Intelligence
📌 Adaptive Learning Platforms 📌 AI for Education 📌 Educational Technology

ไธญๅ›ฝๅฟƒ็†ๅซ็”Ÿๆ‚ๅฟ— 2025
sym

A Review of Intelligent Psychological Assessment Based on Interactive Environment (基于交互环境的智能化心理测评)
K. Huang, Y. Kang, C. Yan, Shiyu Hu, L. Wang, T. Tao, W. Gao
Chinese Mental Health Journal (《中国心理卫生杂志》, CSSCI Journal, Top Psychological Journal in China)
📌 Psychological Assessment System 📌 Gamified Assessment 📌 AI4Science

CSAI 2023

Rethinking Similar Object Interference in Single Object Tracking
Y. Wang, Shiyu Hu, X. Zhao
International Conference on Computer Science and Artificial Intelligence (EI Conference, Oral)
📌 Visual Object Tracking 📌 Similar Object Interference 📌 Data Mining
📃 Paper 🗒 BibTeX 📑 PDF

Preprint

Preprint

FIOVA: A Multi-Annotator Benchmark for Human-Aligned Video Captioning
Shiyu Hu*, X. Li*, X. Li, J. Zhang, Y. Wang, X. Zhao, K. Cheong (*Equal Contributions)
📌 Large Vision-Language Models 📌 Video Caption 📌 Video Understanding
📃 Paper 📑 PDF 🌐 Project

Preprint

When LLMs Learn to be Students: The SOEI Framework for Modeling and Evaluating Virtual Student Agents in Educational Interaction
Y. Ma*, Shiyu Hu*, X. Li, Y. Wang, Y. Chen, S. Liu, K. Cheong (*Equal Contributions)
📌 AI4Education 📌 LLMs 📌 LLM-based Agent
📃 Paper 📑 PDF

Preprint

DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM
X. Li, Shiyu Hu, X. Feng, D. Zhang, M. Wu, J. Zhang, K. Huang
📌 Visual Language Tracking 📌 Large Language Model 📌 Evaluation Technique
📃 Paper 📑 PDF 🌐 Project

Preprint

Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark
X. Li, Shiyu Hu, X. Feng, D. Zhang, M. Wu, J. Zhang, K. Huang
📌 Visual Language Tracking 📌 Multi-modal Interaction 📌 Evaluation Technology
📃 Paper 📑 PDF 🌐 Project

Preprint

Nearing or Surpassing: Overall Evaluation of Human-Machine Dynamic Vision Ability
Shiyu Hu, X. Zhao, Y. Wang, Y. Shan, K. Huang
📌 Visual Object Tracking 📌 Intelligent Evaluation Technique 📌 AI4Science
📑 PDF