Qitong Wang (王琦童)

Ph.D. Student

University of Delaware

Dept. of Computer & Information Sciences

Email CV Scholar Github X (Twitter)

I am currently pursuing my Ph.D. in the Department of Computer and Information Sciences (CIS) at the University of Delaware (UD), advised by Christopher Rasmussen. Previously, I collaborated with Julie Michelle Klinger on designing machine learning frameworks tailored for geospatial data analysis. My research centers on Computer Vision and Machine Learning. Specifically, I focus on the application of trustworthy deep learning models. My research also involves developing frameworks for video learning and understanding.

Prior to joining the University of Delaware, I completed my M.S. degree in the Department of Computer Science at Boston University, advised by Margrit Betke. During that period, my research focused on developing models for text detection and recognition. Before that, I received my B.Eng. degree from the Wuhan University of Technology.

In industry, I have been fortunate to intern or collaborate with Fan Du (Dolby Laboratories), Pranav Maneriker (Dolby Laboratories), Jihui Jin (Dolby Laboratories), Ting Liu (Google), Long Zhao (Google), Liangzhe Yuan (Google), R. Manmatha (Amazon), and Yusheng Xie (Amazon).

News

(Jul 2025) I will be serving as a reviewer for ACMMM 2025 Datasets.
(Jul 2025) I will be serving as a Program Committee member for AAAI 2026.
(Jul 2025) I will be serving as a reviewer for IEEE Access.
(Jul 2025) Paper on MO-SAM accepted at PLOS Sustainability and Transformation 2025.
(May 2025) I will be serving as a reviewer for BMVC 2025.
(Feb 2025) I will be serving as a reviewer for ACMMM 2025.
(Feb 2025) I will be serving as a reviewer for ICCV 2025.
(Dec 2024) I will be serving as a reviewer for IEEE TMM.
(Dec 2024) Received the Outstanding Conference Travel Award from the CIS Department at UD.
(Dec 2024) Paper on VLM prediction rationality accepted at AAAI 2025.

Publications

MO-SAM: Testing the reliability and limits of mine feature delineation using Segment Anything Model to democratize mine observation and research

Qitong Wang, Emmanuel Chinkaka, Romain Richaud, Mehrnaz Haghdadi, Coryn Wolk, Kopo V. Oromeng, Kyle Frankel Davis, Federica Bianco, Xi Peng, Julie Michelle Klinger

PLOS Sustainability and Transformation, 2025.

paper code bibtex

@article{10.1371/journal.pstr.0000182,
doi = {10.1371/journal.pstr.0000182},
author = {Wang, Qitong AND Chinkaka, Emmanuel AND Richaud, Romain AND Haghdadi, Mehrnaz AND Wolk, Coryn AND Oromeng, Kopo V. AND Davis, Kyle Frankel AND Bianco, Federica B. AND Peng, Xi AND Klinger, Julie Michelle},
journal = {PLOS Sustainability and Transformation},
publisher = {Public Library of Science},
title = {MO-SAM: Testing the reliability and limits of mine feature delineation using Segment Anything Model to democratize mine observation and research},
year = {2025},
month = {07},
volume = {4},
url = {https://doi.org/10.1371/journal.pstr.0000182},
pages = {1-25},
number = {7},
}

Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model's Prediction Rationality

Qitong Wang, Tang Li, Kien X. Nguyen, Xi Peng

Association for the Advancement of Artificial Intelligence (AAAI), Philadelphia, Pennsylvania, USA, 2025.

paper code bibtex

@InProceedings{Wang_2025_Rationale,
author = {Wang, Qitong and Li, Tang and Nguyen, Kien X. and Peng, Xi},
title = {Beyond Accuracy: On the Effects of Fine-tuning Towards Vision-Language Model's Prediction Rationality},
booktitle = {Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI)},
month = {February},
year = {2025},
}

Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition

Qitong Wang, Long Zhao, Liangzhe Yuan, Ting Liu, Xi Peng

International Conference on Computer Vision (ICCV), Paris, France, 2023.

paper code blogpost bibtex

@InProceedings{Wang_2023_ICCV,
author = {Wang, Qitong and Zhao, Long and Yuan, Liangzhe and Liu, Ting and Peng, Xi},
title = {Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2023},
pages = {3307-3317}
}

Learning Representational Invariances for Data-Efficient Action Recognition

Yuliang Zou, Jinwoo Choi, Qitong Wang, Jia-Bin Huang

Computer Vision and Image Understanding (CVIU), 2022.

paper code website bibtex

@article{zou2023learning,
title={Learning representational invariances for data-efficient action recognition},
author={Zou, Yuliang and Choi, Jinwoo and Wang, Qitong and Huang, Jia-Bin},
journal={Computer Vision and Image Understanding},
volume={227},
pages={103597},
year={2023},
publisher={Elsevier}
}

Region-aware Arbitrary-shaped Text Detection with Progressive Fusion

Qitong Wang, Bin Fu, Ming Li, Junjun He, Xi Peng, Yu Qiao

IEEE Transactions on Multimedia (TMM), 2022.

paper code bibtex

@article{wang2022region,
title={Region-aware Arbitrary-shaped Text Detection with Progressive Fusion},
author={Wang, Qitong and Fu, Bin and Li, Ming and He, Junjun and Peng, Xi and Qiao, Yu},
journal={IEEE Transactions on Multimedia},
year={2022},
publisher={IEEE}
}

Semantic-Based Sentence Recognition in Images Using Bimodal Deep Learning

Yi Zheng, Qitong Wang, Margrit Betke

IEEE International Conference on Image Processing (ICIP), Anchorage, Alaska, USA, 2021.

paper data bibtex

@article{Zheng2021SemanticBasedSR,
title={Semantic-Based Sentence Recognition in Images Using Bimodal Deep Learning},
author={Y. Zheng and Qitong Wang and Margrit Betke},
journal={2021 IEEE International Conference on Image Processing (ICIP)},
year={2021},
pages={2753-2757},
url={https://api.semanticscholar.org/CorpusID:238082348}
}

A Method for Detecting Text of Arbitrary Shapes in Natural Scenes That Improves Text Spotting

Qitong Wang, Yi Zheng, Margrit Betke

Workshop on Text and Documents in the Deep Learning Era (CVPR), Virtual, 2020.

paper code bibtex

@InProceedings{Wang_2020_CVPR_Workshops,
author = {Wang, Qitong and Zheng, Yi and Betke, Margrit},
title = {A Method for Detecting Text of Arbitrary Shapes in Natural Scenes That Improves Text Spotting},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2020}
}

Service

Conference Reviewer - ECCV 2024, CVPR 2025, ICCV 2025, ACMMM 2025, ACMMM 2025 Datasets, BMVC 2024-2025.
Journal Reviewer - IEEE Transactions on Image Processing (TIP), IEEE Transactions on Multimedia (TMM), IEEE Access, PLOS ONE.
Program Committee - AAAI 2026.
Volunteering Conference Reviewer - CVPR 2023, NeurIPS 2023, AAAI 2024, ICLR 2025.
Volunteering Journal Reviewer - IEEE Transactions on Artificial Intelligence (TAI), IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), ACM Transactions on Intelligent Systems and Technology (TIST).