Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark
This paper introduces a new method for domain adaptation through a meta self-learning approach.
Apr 19, 2022
Introduction
This paper introduces a new method for domain adaptation through a meta self-learning approach. The experiments show a noticeable improvement in final performance compared to similar methods. The authors also introduce a new large-scale (Chinese & English) multi-domain dataset for benchmarking and experiments.
Contributions
- Collected a multi-source domain adaptation dataset for text recognition with over 5 million images from 5 different domains. According to the authors, it is the first multi-domain adaptation dataset for text recognition.
- Proposed a new self-learning framework for multi-source domain adaptation, which is effective and can easily be applied to any multi-domain adaptation and self-learning problem.
- Experiments are conducted on the new dataset, which provides a benchmark and shows the effectiveness of the proposed method.
Related Works
- Text Recognition
- 4-stage STR network similar to Naver's: TPS/STN + CNN + BiLSTM/GRU + CTC/Attention
- Domain adaptation
- Discrepancy based domain adaptation
- Uses a domain confusion loss by calculating the maximum mean discrepancy (MMD) between the source domain data and the target domain data.
- Adversarial training based domain adaptation
- Uses a domain discriminator and a gradient reversal layer between the feature extractor and the discriminator, which forces the feature extractor to learn domain-invariant features (a minimal sketch follows after this list).
- Self-training-based domain adaptation
- The model is trained iteratively by generating pseudo-labels for the target data and adding them to the training data.
- Meta-Learning
- Model-Agnostic Meta-Learning (MAML)
- Reptile: A Scalable Meta-Learning Algorithm
- Online Meta-Learning
- Self-Learning
- Predict labels for the unlabeled data using a model trained on the source domains and treat them as correct labels if the prediction confidence is higher than a threshold. Self-learning can bring a considerable improvement because it uses target-domain data directly.
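For the adversarial-training approach mentioned above, here is a minimal PyTorch sketch of a gradient reversal layer with a domain discriminator (in the spirit of DANN). The class names, feature dimension, and number of domains are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of a gradient reversal layer (GRL) for adversarial
# domain adaptation; illustrative only, not the paper's implementation.
import torch
from torch import nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -lambda backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing back into the feature extractor.
        return -ctx.lambd * grad_output, None


def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)


class DomainDiscriminator(nn.Module):
    """Predicts which domain a feature vector came from (dims are assumptions)."""

    def __init__(self, feat_dim=256, num_domains=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, num_domains)
        )

    def forward(self, features, lambd=1.0):
        # The GRL makes the feature extractor *maximize* the domain loss,
        # pushing it toward domain-invariant features.
        return self.net(grad_reverse(features, lambd))
```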
Multi-Domain Text Recognition Dataset
The proposed multi-domain text dataset consists of 5,209,215 images in total and is divided into five domains, which are:
- Synthetic domain
- 1,110,620 images in total
- Handwritten domain
- The handwritten-domain data is generated using images from the CASIA Online and Offline Chinese Handwriting Databases; 1,897,021 images in total.
- Document domain
- The document-domain data is collected from an open-source project and contains about 3 million images. The authors filtered out images containing characters outside the character set, leaving 1,710,885 images in total.
- Street view domain
- Merged from: SVT, SVT perspective, ICDAR2013, ICDAR2015, RCTW17, ICDAR-2019, and CUTE80 datasets → 199,346 images
- Car plate domain
- Re-balanced CCPD dataset → 207,928 images in total
The character set size is set to 3,816, with 3,754 common Chinese characters and 62 alphanumeric characters.
Meta Self-Learning algorithm
- Warm-Up and Generation of Pseudo-Labels:
- The model is first trained on the source domains D_S as the warm-up phase. Warm-up is a necessary step for the self-learning method: it greatly improves the quality of the generated pseudo-labels and leads to better results. Without warm-up, the generated pseudo-labels either have low confidence or wrong content, which severely hurts prediction accuracy on the target domain. After the warm-up, the target data with pseudo-labels are generated.
- Random Split: How the pseudo-labels are used is one of the most important issues. As the raw pseudo-labels can be noisy, a meta-update is used in the proposed method. During the meta-update, both source-domain data and target-domain data with pseudo-labels are used and are randomly divided into a meta-train set and a meta-test set, which correspond to the support set and query set in vanilla MAML.
- Meta-update loss: train (meta-train) and test (meta-test) loss terms (the paper's equations are not reproduced here; see the sketch below).
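The exact meta-update equations were not captured in these notes. Below is a minimal MAML-style sketch consistent with the description above, assuming L is the recognition loss, β is the inner (meta) SGD learning rate, γ weights the meta-test term, and α is the outer (Adam) learning rate; this is a reconstruction, not the paper's exact formulation.

```latex
% Inner (meta-train) step on the meta-train split D_{tr}
% (source data plus pseudo-labeled target data):
\theta' = \theta - \beta \, \nabla_{\theta} \mathcal{L}\big(\theta;\, D_{tr}\big)

% Meta-objective: meta-train loss plus the meta-test loss evaluated
% with the adapted parameters \theta' on the meta-test split D_{te}:
\mathcal{L}_{\mathrm{meta}}(\theta) = \mathcal{L}\big(\theta;\, D_{tr}\big)
    + \gamma \, \mathcal{L}\big(\theta';\, D_{te}\big)

% Outer update (Adam with learning rate \alpha in the experiments):
\theta \leftarrow \theta - \alpha \, \nabla_{\theta} \mathcal{L}_{\mathrm{meta}}(\theta)
```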
Experiment Results
- Experiment settings:
- PyTorch on Tesla T4
- Adam is used as the outer optimizer, and SGD is used as the meta optimizer
- α is set to 1e-3
- β and γ are changed during the training process
- Input size:
- 100 × 32
- The character set in these experiments is set to 3,818, which includes:
- 3,756 common Chinese characters
- 62 alphanumeric characters.
- Results:
- Baseline:
- The baseline model is trained on the source domains only, without any multi-source domain adaptation method.
- MLDG:
- During training, the source domains are divided into a meta-train set and a meta-test set. The model first updates one step using the meta-train set and then validates on the meta-test set. The final model, converged on the source domains, is deployed on the truly held-out target domain.
- Pseudo-Label:
- As warm-up is a necessary step for the pseudo-label method, the baseline model is used as the pre-trained model and training with pseudo-labels starts directly from it.
- Pseudo-label confidence score thresholds (see the sketch after this list):
- Handwritten: 0.98
- Others: 0.9
- Meta Self-Learning (paper proposal):
- Same settings as the pseudo-label method
- NOTE: The best results are obtained with different pseudo-label usage settings (see the explanations below).
- Pseudo-label usage settings experiments:
- IAOS: Use all 5 domains during the meta-update, and only the source domains during the outer optimization.
- IPOA: Use the pseudo-label domain as the meta-test set only during the meta-update, and use all five domains during the outer optimization.
- IPOP: Same setting as IPOA during the meta-update, but only images with pseudo-labels are used during the outer optimization.
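For the pseudo-label generation and confidence thresholds mentioned above, here is a minimal PyTorch sketch of confidence-thresholded pseudo-labeling. The model output shape, the greedy decoding, and the sequence-confidence definition (product of per-step character probabilities) are assumptions, and CTC blank handling is omitted.

```python
# Sketch of confidence-thresholded pseudo-label generation for self-learning.
# Thresholds follow the post (0.98 handwritten, 0.9 other domains).
import torch


@torch.no_grad()
def generate_pseudo_labels(model, target_loader, device, threshold=0.9):
    """Run the warmed-up model on unlabeled target images and keep only
    predictions whose confidence exceeds the threshold."""
    model.eval()
    pseudo_labeled = []
    for images in target_loader:
        images = images.to(device)
        # Assumed model output: per-step character distributions
        # of shape (batch, seq_len, num_chars).
        probs = model(images).softmax(dim=-1)
        char_probs, char_ids = probs.max(dim=-1)   # greedy decode per step
        confidence = char_probs.prod(dim=-1)       # sequence-level confidence
        keep = confidence > threshold
        for img, ids, conf in zip(images[keep], char_ids[keep], confidence[keep]):
            pseudo_labeled.append((img.cpu(), ids.cpu(), conf.item()))
    return pseudo_labeled
```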
Conclusion
- A fresh approach/direction for solving the domain-shift problem in text recognition
- Large Multi-Domain Chinese Text Recognition dataset
- Application to Lomin Textscope: documents from different domains???
- Self-Learning
- Online-Learning