Cecilia-Xue's Blog



Bridging Sensor Gaps via Single-Direction Tuning for Hyperspectral Image Classification

Posted on 2023-09-29

Recently, vision transformer (ViT) models have excelled in diverse vision tasks, emerging as robust alternatives to convolutional neural networks. Inspired by this, some researchers have started exploring ViTs for hyperspectral image (HSI) classification and have achieved remarkable results. However, training ViT models requires a considerable number of training samples, while hyperspectral data, because of its high annotation cost, typically offers only a small number of labeled samples. This contradiction has not been effectively addressed. In this paper, we propose the single-direction tuning (SDT) strategy to solve this problem. SDT serves as a bridge that lets us leverage existing labeled HSI datasets, and even RGB datasets, to improve performance on new HSI datasets with limited samples. SDT inherits the idea of prompt tuning, which reuses pre-trained models with minimal modifications to adapt them to new tasks. Unlike prompt tuning, however, SDT is custom-designed to accommodate the characteristics of HSIs. It uses a parallel architecture, an asynchronous cold-hot gradient update strategy, and unidirectional interaction, aiming to fully harness the representation learning capabilities obtained from training on heterologous, and even cross-modal, datasets. In addition, we introduce a novel triplet-structured transformer (Tri-Former), in which spectral attention and spatial attention modules are merged in parallel to form the token mixing component, reducing computation cost, and a 3D convolution-based channel mixer module is integrated to enhance stability and preserve structure information. Comparison experiments on three representative HSI datasets captured by different sensors demonstrate that Tri-Former achieves better performance than several state-of-the-art methods. Homologous, heterologous, and cross-modal tuning experiments verify the effectiveness of SDT.
[Paper] [code]
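To make the idea of merging spectral and spatial attention in parallel more concrete, here is a minimal sketch in plain Python. It is not the Tri-Former implementation: the attention is parameter-free (no learned query, key, or value projections), the two branches are merged by simple addition, and the names `self_attention` and `parallel_token_mixer` are hypothetical. All of these are assumptions made for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Parameter-free scaled dot-product self-attention.

    tokens: list of equal-length vectors. Each output vector is a
    softmax-weighted average of all input vectors.
    """
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
        )
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def parallel_token_mixer(patch):
    """patch: P pixels, each a list of B band values (a P x B matrix).

    Spatial branch: every pixel is a token, attention runs across pixels.
    Spectral branch: every band is a token, attention runs across bands.
    The branches run side by side on the same input and are summed
    (the sum is an assumed merge rule, not taken from the paper).
    """
    spatial = self_attention(patch)
    spectral = transpose(self_attention(transpose(patch)))
    return [
        [s + c for s, c in zip(s_row, c_row)]
        for s_row, c_row in zip(spatial, spectral)
    ]


# Toy patch: 4 pixels with 3 spectral bands each.
toy_patch = [
    [0.1, 0.5, 0.9],
    [0.2, 0.4, 0.8],
    [0.9, 0.1, 0.3],
    [0.7, 0.6, 0.2],
]
mixed = parallel_token_mixer(toy_patch)
```

Because both branches read the same input rather than feeding one into the other, they could be evaluated concurrently, which is the property the abstract points to when it mentions reduced computation cost.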

TriFormer.jpg
sdt.jpg

Transformer-based Open-world Instance Segmentation with Cross-task Consistency Regularization

Posted on 2023-09-28

Open-World Instance Segmentation (OWIS) is an emerging research topic that aims to segment class-agnostic object instances from images. The mainstream approaches use a two-stage segmentation framework, which first locates the candidate object bounding boxes and then performs instance segmentation. In this work, we instead promote a single-stage transformer-based framework for OWIS. We argue that the end-to-end training process in the single-stage framework is more convenient for directly regularizing the localization of class-agnostic object pixels. Based on the transformer-based instance segmentation framework, we propose a regularization model that predicts foreground pixels, and we use its relation to instance segmentation to construct a cross-task consistency loss. We show that this consistency loss can alleviate the problem of incomplete instance annotation, a common problem in existing OWIS datasets. We also show that the proposed loss lends itself to an effective solution to semi-supervised OWIS, which can be considered an extreme case in which all object annotations are absent for some images. Our extensive experiments demonstrate that the proposed method achieves impressive results in both fully-supervised and semi-supervised settings. Compared to SOTA methods, the proposed method significantly improves the $AP_{100}$ score by 4.75% in the UVO $\rightarrow$ UVO setting and by 4.05% in the COCO $\rightarrow$ UVO setting. In the semi-supervised setting, our model trained with only 30% labeled data even outperforms its fully-supervised counterpart trained with 50% labeled data. [Paper] [code]
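The abstract does not spell out the form of the cross-task consistency loss, so the following is only a sketch of one way such a loss could look. It assumes that the union of the predicted instance masks (taken here as a per-pixel maximum) should agree with the predicted foreground map, and it measures disagreement with a Dice-style term. The function names, the max-based union, and the Dice form are all assumptions for illustration, not the paper's definition.

```python
def union_of_instances(instance_masks):
    """Per-pixel maximum over instance mask probabilities.

    instance_masks: list of flattened masks, each a list of values in [0, 1].
    The max is an assumed stand-in for the union of all instances.
    """
    return [max(pixel_probs) for pixel_probs in zip(*instance_masks)]


def dice_consistency_loss(foreground, instance_masks, eps=1e-6):
    """Dice-style disagreement between the foreground map and the mask union.

    Returns a value near 0 when the two predictions agree and near 1 when
    they do not overlap at all. No ground-truth labels are involved, so the
    term can also be computed on images without annotations.
    """
    union = union_of_instances(instance_masks)
    intersection = sum(f * u for f, u in zip(foreground, union))
    denominator = sum(foreground) + sum(union)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


# Toy example on a flattened 4-pixel image with two predicted instances.
masks = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
agree = dice_consistency_loss([1.0, 1.0, 0.0, 0.0], masks)
disagree = dice_consistency_loss([0.0, 0.0, 1.0, 1.0], masks)
```

Since the term compares two predictions with each other rather than with labels, it illustrates why a loss of this kind fits the semi-supervised case described above, where some images carry no object annotations.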

TOIS.jpg

UW360 Video Database for Learning Object Representation in Underwater Tracking

Posted on 2022-05-29

Underwater object tracking is attracting growing interest because of its importance for marine engineering and aquatic robotics. Although numerous tracking algorithms have been proposed over the last decade, they mainly focus on general open-air datasets. In contrast, objects in underwater videos are small and densely distributed, and they follow more complex motion patterns in the unique water environment. To develop tracking algorithms that can deal with the challenges posed by underwater videos, we construct the first real-world underwater tracking benchmark, called UW360, which contains 360 videos with over 180K frames covering a diverse range of underwater scenes. The UW360 dataset exhibits several challenges, including (1) missing discriminative information, (2) dense scenarios, and (3) complex motion trajectories and appearance variations. We experimentally evaluate state-of-the-art trackers on UW360 and present a comprehensive analysis. Furthermore, we explore several method designs to shed light on potential future directions for this new scenario beyond open-air tracking. [Paper] [Benchmark] [Dataset]

UW360.jpg

HyT-NAS for Hyperspectral Image Classification

Posted on 2022-05-29

In this paper, neural architecture search (NAS) and the transformer are combined for the hyperspectral image (HSI) classification task for the first time. The proposed method differs from previous work in two main ways. First, we revisit the search spaces designed in previous NAS methods for HSI classification and propose a novel hybrid search space consisting of a space-dominated cell and a spectrum-dominated cell. Compared with the search spaces proposed in previous work, the hybrid search space is better aligned with the characteristics of HSI data, namely that HSIs have relatively low spatial resolution and extremely high spectral resolution. Second, to further improve classification accuracy, we graft the emerging transformer module onto the automatically designed convolutional neural network (CNN) to add global information to the local-region-focused features learned by the CNN. Experimental results on three public HSI datasets show that the proposed method achieves much better performance than the comparison approaches. [Paper] [code]

HyT-NAS.jpg

Record of 2016 RoboCup China Open (China Robot Competition)

Posted on 2016-10-29

Congratulations! Our team, Exited, won the first prize in the 2016 China Robot Competition!

1.jpg

© 2023 Xizhe Xue(薛希哲)