📝 Publications

Acceptance

TPAMI 2023

Global Instance Tracking: Locating Target More Like Humans
Shiyu Hu, X. Zhao, L. Huang, K. Huang
IEEE Transactions on Pattern Analysis and Machine Intelligence (CCF-A Journal)
📌 Visual Object Tracking 📌 Large-scale Benchmark Construction 📌 Intelligent Evaluation Technology
📃 Paper 🗒 bibTex 📑 PDF 🪧 Poster 🌐 Platform 🔧 Toolkit 💾 Dataset

IJCV 2024

SOTVerse: A User-defined Task Space of Single Object Tracking
Shiyu Hu, X. Zhao, K. Huang
International Journal of Computer Vision (CCF-A Journal)
📌 Visual Object Tracking 📌 Dynamic Open Environment Construction 📌 3E Paradigm
📃 Paper 🗒 bibTex 📑 PDF 🌐 Platform

IJCV 2024

BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision
X. Zhao, Shiyu Hu✉️, Y. Wang, J. Zhang, Y. Hu, R. Liu, H. Lin, Y. Li, R. Li, K. Liu, J. Li
International Journal of Computer Vision (CCF-A Journal)
📌 Visual Object Tracking 📌 Drone-based Tracking 📌 Visual Robustness
📃 Paper 🌐 Platform 🗒 bibTex 📑 PDF 🔧 Toolkit 💾 Dataset

NeurIPS 2023

A Multi-modal Global Instance Tracking Benchmark (MGIT): Better Locating Target in Complex Spatio-temporal and Causal Relationship
Shiyu Hu, D. Zhang, M. Wu, X. Feng, X. Li, X. Zhao, K. Huang
the 37th Conference on Neural Information Processing Systems (CCF-A Conference, Poster)
📌 Visual Language Tracking 📌 Long Video Understanding and Reasoning 📌 Hierarchical Semantic Information Annotation
📃 Paper 🗒 bibTex 📃 PDF 🪧 Poster 📹 Slides 🌐 Platform 🔧 Toolkit 💾 Dataset

中国图象图形学报 2023

Visual Intelligence Evaluation Techniques for Single Object Tracking: A Survey (单目标跟踪中的视觉智能评估技术综述)
Shiyu Hu, X. Zhao, K. Huang
Journal of Images and Graphics (《中国图象图形学报》, CCF-B Chinese Journal)
📌 Visual Object Tracking 📌 Intelligent Evaluation Technique 📌 AI4Science
📃 Paper 📑 PDF

NeurIPS 2024

Beyond Accuracy: Tracking more like Human via Visual Search
D. Zhang, Shiyu Hu, X. Feng, X. Li, M. Wu, J. Zhang, K. Huang
the 38th Conference on Neural Information Processing Systems (CCF-A Conference, Poster)
📌 Visual Object Tracking 📌 Visual Search Mechanism 📌 Visual Turing Test

NeurIPS 2024

MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts
X. Feng, X. Li, Shiyu Hu, D. Zhang, M. Wu, J. Zhang, X. Chen, K. Huang
the 38th Conference on Neural Information Processing Systems (CCF-A Conference, Poster)
📌 Visual Language Tracking 📌 Human-like Memory Modeling 📌 Adaptive Prompts

ICASSP 2025

Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues
X. Feng, D. Zhang, Shiyu Hu, X. Li, M. Wu, J. Zhang, X. Chen, K. Huang
the 50th IEEE International Conference on Acoustics, Speech, and Signal Processing (CCF-B Conference, Poster)
📌 Visual Language Tracking 📌 Multi-modal Learning 📌 Grounding Model

TCSVT 2024

Finger in Camera Speaks Everything: Unconstrained Air-Writing for Real-World
M. Wu, K. Huang, Y. Cai, Shiyu Hu, Y. Zhao, W. Wang
IEEE Transactions on Circuits and Systems for Video Technology (CCF-B Journal)
📌 Air-writing Technique 📌 Benchmark Construction 📌 Human-machine Interaction
📃 Paper 🗒 bibTex 📃 PDF 🔧 Toolkit

CVPRW 2024

Diverse Text Generation for Visual Language Tracking Based on LLM
X. Li, X. Feng, Shiyu Hu, M. Wu, D. Zhang, J. Zhang, K. Huang
the 3rd Workshop on Vision Datasets Understanding and DataCV Challenge in CVPR 2024 (Workshop in CCF-A Conference, Oral, Best Paper Honorable Mention)
📌 Visual Language Tracking 📌 Large Language Model 📌 Evaluation Technique
📃 Paper 🗒 bibTex 📃 PDF 🪧 Poster 📹 Slides 🌐 Platform 🔧 Toolkit 💾 Dataset 🏆 Award

ICASSP 2024

Robust Single-particle Cryo-EM Image Denoising and Restoration
J. Zhang, T. Zhao, Shiyu Hu, X. Zhao
the 49th IEEE International Conference on Acoustics, Speech, and Signal Processing (CCF-B Conference, Poster)
📌 Medical Image Processing 📌 AI4Science 📌 Diffusion Model
📃 Paper 🗒 bibTex 📑 PDF

PRCV 2024

VS-LLM: Visual-Semantic Depression Assessment based on LLM for Drawing Projection Test
M. Wu, Y. Kang, X. Li, Shiyu Hu, X. Chen, Y. Kang, W. Wang, K. Huang
the 7th Chinese Conference on Pattern Recognition and Computer Vision (CCF-C Conference)
📌 Psychological Assessment System 📌 Gamified Assessment 📌 AI4Science
📃 Paper 🗒 bibTex 📃 PDF

中国心理卫生杂志 2024

A Review of Intelligent Psychological Assessment Based on Interactive Environment (基于交互环境的智能化心理测评)
K. Huang, Y. Kang, C. Yan, Shiyu Hu, L. Wang, T. Tao, W. Gao
Chinese Mental Health Journal (《中国心理卫生杂志》, CSSCI Journal, Top Psychological Journal in China)
📌 Psychological Assessment System 📌 Gamified Assessment 📌 AI4Science

PRCV 2023

A Hierarchical Theme Recognition Model for Sandplay Therapy
X. Feng, Shiyu Hu, X. Chen, K. Huang
the 6th Chinese Conference on Pattern Recognition and Computer Vision (CCF-C Conference, Poster)
📌 Psychological Assessment System 📌 Gamified Assessment 📌 AI4Science
📃 Paper 🗒 bibTex 📑 PDF 🔖 Supplementary 🪧 Poster

CSAI 2023

Rethinking Similar Object Interference in Single Object Tracking
Y. Wang, Shiyu Hu, X. Zhao
the 7th International Conference on Computer Science and Artificial Intelligence (EI Conference, Oral)
📌 Visual Object Tracking 📌 Similar Object Interference 📌 Data Mining
📃 Paper 🗒 bibTex 📑 PDF

Neurocomputing 2022

Revisiting Instance Search: A New Benchmark Using Cycle Self-training
Y. Zhang, C. Liu, W. Chen, X. Xu, F. Wang, H. Li, Shiyu Hu, X. Zhao
Neurocomputing (CCF-C Journal)
📌 Video Instance Search 📌 Benchmark Construction 📌 Data Mining
📃 Paper 🗒 bibTex 📑 PDF 🌐 Project

图学学报 2021

Visual Turing: The Next Development of Computer Vision in The View of Human-computer Gaming (视觉图灵：从人机对抗看计算机视觉下一步发展)
K. Huang, X. Zhao, Q. Li, Shiyu Hu
Journal of Graphics (《图学学报》, CCF-C Chinese Journal)
📌 Visual Object Tracking 📌 Intelligent Evaluation Technique 📌 AI4Science
📃 Paper 🗒 bibTex 📑 PDF

Preprint

Preprint

Can LVLMs Describe Videos like Humans? A Five-in-One Video Annotations Benchmark for Better Human-Machine Comparison
Shiyu Hu*, X. Li*, X. Li, J. Zhang, Y. Wang, X. Zhao, K. Cheong (*Equal Contributions)
Submitted to a CAAI-A conference, under review
📌 Large Vision-Language Models 📌 Evaluation Technique 📌 Visual Turing
📃 Paper 🗒 bibTex 📑 PDF 🌐 Project

Preprint

Students Rather Than Experts: A New AI for Education Pipeline to Model More Human-like and Personalised Early Adolescences
Y. Ma*, Shiyu Hu*, X. Li, Y. Wang, S. Liu, K. Cheong (*Equal Contributions)
Submitted to a CAAI-A conference, under review
📌 AI4Education 📌 LLMs 📌 LLM-based Agent
📃 Paper 🗒 bibTex 📑 PDF 🌐 Project

Preprint

DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM
X. Li, Shiyu Hu, X. Feng, D. Zhang, M. Wu, J. Zhang, K. Huang
Submitted to a CAAI-A conference, under review
📌 Visual Language Tracking 📌 Large Language Model 📌 Evaluation Technique
📃 Paper 🗒 bibTex 📑 PDF 🌐 Project

Preprint

Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark
X. Li, Shiyu Hu, X. Feng, D. Zhang, M. Wu, J. Zhang, K. Huang
Submitted to a workshop in CCF-A conference, under review
📌 Visual Language Tracking 📌 Multi-modal Interaction 📌 Evaluation Technology
📃 Paper 🗒 bibTex 📑 PDF 🌐 Project

Preprint

Nearing or Surpassing: Overall Evaluation of Human-Machine Dynamic Vision Ability
Shiyu Hu, X. Zhao, Y. Wang, Y. Shan, K. Huang
📌 Visual Object Tracking 📌 Intelligent Evaluation Technique 📌 AI4Science
📑 PDF 🗒 bibTex