Research Projects

Our research primarily focuses on privacy-preserving multimodal learning and its applications in multimedia analysis and reasoning:

  • Object Re-Identification (Re-ID) aims to match the same object (e.g., persons, vehicles, animals) across multiple distinct views. Our work encompasses a wide range of directions, including cross-modal Re-ID, unsupervised Re-ID, domain-generalized Re-ID, multi-species Re-ID, and UAV Re-ID (a toy retrieval sketch follows this list).
  • Multimodal Emotional Understanding aims to comprehensively analyze and interpret human emotions and intentions by leveraging multiple modalities. Our work integrates textual expressions, visual cues, acoustic features, and their intricate interactions to achieve a more nuanced and robust understanding of human emotional states and behavioral intentions.
  • Multimodal AI-Generated Content (AIGC) focuses on generating and synthesizing content across multiple modalities such as text, images, audio, and video through AI technologies. Our work spans several key areas in AIGC development, including text-guided fashion editing.
  • Continual Learning, also known as incremental learning, aims to enable neural networks to acquire new information from continuous streams of training data while retaining previously learned knowledge. Our work mainly covers the challenging class-incremental learning setting and incremental learning based on pre-trained vision-language models (a minimal replay sketch follows this list).
  • Federated Learning is a decentralized approach to machine learning that enables multiple devices or institutions to collaboratively train a model without sharing their local data. Our work covers security, generalization, and robustness in federated learning, as well as federated graph learning (a toy aggregation sketch follows this list).
  • Multimodal Medical AI aims to integrate and analyze diverse types of medical data (e.g., images, clinical notes, genomic data) for comprehensive healthcare applications. Our research focuses on multimodal medical image analysis, with particular emphasis on developing interpretable AI systems, enhancing model generalizability, and ensuring fairness across diverse populations.
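
As an illustration of the core retrieval step in Re-ID, here is a minimal sketch of ranking a gallery by embedding similarity. It assumes features already extracted by a trained backbone; the rank_gallery helper and the random features are hypothetical stand-ins, not our released code:

    import numpy as np

    def rank_gallery(query_feat, gallery_feats):
        """Rank gallery items by cosine similarity to a query embedding.
        query_feat: (D,) feature of the query image; gallery_feats: (N, D)."""
        q = query_feat / np.linalg.norm(query_feat)
        g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
        return np.argsort(-(g @ q))  # most similar gallery index first

    # Toy usage: random vectors stand in for a real backbone's features.
    rng = np.random.default_rng(0)
    print(rank_gallery(rng.normal(size=128), rng.normal(size=(1000, 128)))[:5])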
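
For continual learning, one common baseline against catastrophic forgetting is rehearsal: keep a small memory of past examples and mix them into each new batch. The buffer below is a minimal reservoir-sampling sketch, shown as one illustrative instance rather than a description of our methods:

    import random

    class ReplayBuffer:
        """Fixed-size memory; reservoir sampling keeps every example seen so far
        equally likely to be retained."""
        def __init__(self, capacity):
            self.capacity, self.memory, self.seen = capacity, [], 0

        def add(self, example):
            self.seen += 1
            if len(self.memory) < self.capacity:
                self.memory.append(example)
            else:
                j = random.randrange(self.seen)
                if j < self.capacity:
                    self.memory[j] = example

        def sample(self, k):
            # Mix these replayed examples into each new-task training batch.
            return random.sample(self.memory, min(k, len(self.memory)))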
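
And for federated learning, the canonical aggregation step is federated averaging (FedAvg): clients train locally, and the server combines only their parameters, never their raw data. A toy sketch under that assumption (the fedavg helper is illustrative, not our released implementation):

    import numpy as np

    def fedavg(client_weights, client_sizes):
        """Average per-layer parameters across clients, weighted by local data size."""
        total = sum(client_sizes)
        n_layers = len(client_weights[0])
        return [sum(w[l] * (n / total) for w, n in zip(client_weights, client_sizes))
                for l in range(n_layers)]

    # Toy round: two clients, a one-layer model; the server sees parameters only.
    w_a, w_b = [np.array([1.0, 2.0])], [np.array([3.0, 4.0])]
    print(fedavg([w_a, w_b], client_sizes=[100, 300]))  # -> [array([2.5, 3.5])]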

Funding

Terms for Released Implementations:

The software provided here is for personal research purposes only. Redistribution and commercial use are not permitted. Feedback, applications, and further development are welcome. Contact yemang AT whu.edu.cn to report bugs or discuss collaborations. All rights to the implementations are reserved by the authors.


Copyright © 2024 Multimedia Analysis & Reasoning (MARS) Lab
Wuhan University 武汉大学