Meta Self-Learning for Multi-Source Domain Adaptation: A Benchmark

Apr 19, 2022

Introduction

This paper introduces a new method for domain adaptation based on a meta self-learning approach. The experiments show a noticeable improvement in final performance compared to similar methods. The authors also introduce a new large-scale (Chinese & English) dataset for multi-domain benchmarking and experiments.
 

Contributions

  • Collected a multi-source domain adaptation dataset for text recognition with over 5 million images from 5 different domains. It is the first multi-domain adaptation dataset for text recognition (according to the authors).
  • Proposed a new self-learning framework for multi-source domain adaptation, which is effective and can easily be applied to any multi-domain adaptation or self-learning problem.
  • Experiments are conducted on the new dataset, which provides a benchmark and shows the effectiveness of the proposed method.

Related Work

  • Text Recognition
    • Four-stage STR network similar to Naver's framework: TPS/STN + CNN + BiLSTM/GRU + CTC/Attention
  • Domain adaptation
    • Discrepancy-based domain adaptation
      • Uses a domain confusion loss computed as the maximum mean discrepancy (MMD) between the source domain data and the target domain data (a minimal sketch follows this list).
    • Adversarial-training-based domain adaptation
      • Uses a domain discriminator with a gradient reversal layer between it and the feature extractor, which forces the feature extractor to produce domain-invariant features (see the GRL sketch after this list).
    • Self-training-based domain adaptation
      • The model is trained iteratively by generating pseudo-labels for the target data and adding them to the training data.
  • Meta-Learning
    • Model-Agnostic Meta-Learning (MAML)
    • Reptile: A Scalable Meta-Learning Algorithm
    • Online Meta-Learning
    • Self-Learning
      • Labels are predicted for the unlabeled data using the model trained on the source domains, and they are treated as correct labels if the prediction confidence is higher than a threshold (see the pseudo-labeling sketch after this list). The self-learning method can bring considerable improvement because it uses target-domain data directly.
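
As a quick illustration of the MMD term (not the paper's implementation), here is a minimal PyTorch sketch of a biased RBF-kernel MMD estimate between a batch of source features and a batch of target features; the kernel bandwidth and all names are illustrative assumptions.

```python
import torch

def rbf_mmd(source_feat, target_feat, sigma=1.0):
    """Biased MMD^2 estimate between two feature batches with an RBF kernel.

    source_feat: [N, D] source-domain features, target_feat: [M, D] target-domain features.
    """
    def rbf_kernel(a, b):
        # Pairwise squared Euclidean distances mapped through a Gaussian kernel.
        dists = torch.cdist(a, b) ** 2
        return torch.exp(-dists / (2 * sigma ** 2))

    k_ss = rbf_kernel(source_feat, source_feat).mean()
    k_tt = rbf_kernel(target_feat, target_feat).mean()
    k_st = rbf_kernel(source_feat, target_feat).mean()
    return k_ss + k_tt - 2 * k_st

# Usage (illustrative): add the MMD term to the recognition loss so the
# feature extractor pulls the two feature distributions together.
# loss = task_loss + lambda_mmd * rbf_mmd(f_src, f_tgt)
```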
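
Likewise, a minimal sketch of a gradient reversal layer in PyTorch, assuming the usual DANN-style setup of feature extractor, GRL, and domain discriminator; the class and identifiers here are illustrative, not the paper's code.

```python
import torch
from torch.autograd import Function

class GradReverse(Function):
    """Identity in the forward pass; scales the gradient by -lambd in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # The reversed (negated) gradient flows back into the feature extractor,
        # pushing it toward features the domain discriminator cannot separate.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Usage (illustrative): features -> grad_reverse -> domain discriminator.
# domain_logits = discriminator(grad_reverse(features, lambd=1.0))
```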
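
Finally, a minimal sketch of the confidence-thresholded pseudo-labeling described in the last item; the `model.recognize` interface, loader, and device handling are assumptions for illustration rather than the paper's code. The paper's experiments use thresholds of 0.98 for the handwritten domain and 0.9 for the other domains.

```python
import torch

@torch.no_grad()
def generate_pseudo_labels(model, unlabeled_loader, threshold=0.9, device="cuda"):
    """Run the source-trained recognizer on target-domain images and keep only
    predictions whose confidence exceeds the threshold."""
    model.eval()
    pseudo_set = []
    for images in unlabeled_loader:
        images = images.to(device)
        # Assumed interface: the recognizer returns a decoded text and a confidence
        # score per image (e.g. the product of per-character probabilities).
        texts, scores = model.recognize(images)
        for img, text, score in zip(images, texts, scores):
            if score >= threshold:  # keep high-confidence predictions only
                pseudo_set.append((img.cpu(), text))
    return pseudo_set

# The kept (image, pseudo-label) pairs are then mixed into the training data
# for the next round of self-learning.
```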

Multi-Domain Text Recognition Dataset

The proposed multi-domain text dataset consists of 5,209,215 images in total and is divided into five domains:
    • Synthetic domain
      • 1,110,620 images in total
    • Handwritten domain
      • Generated using images from the CASIA Online and Offline Chinese Handwriting Databases; 1,897,021 images in total.
    • Document domain
      • Collected from an open-source project containing about 3 million images; the authors filtered out images containing characters outside the character set (a small sketch of this filtering follows below), leaving 1,710,885 images in total.
    • Street view domain
      • Merged from: SVT, SVT perspective, ICDAR2013, ICDAR2015, RCTW17, ICDAR-2019, and CUTE80 datasets → 199,346 images
    • Car plate domain
      • Re-balanced CCPD dataset → 207,928 images in total
      The character set size is set to 3,816, with 3,754 common Chinese characters and 62 alphanumeric characters.
       
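
A minimal sketch of the character-set filtering mentioned for the document domain, assuming labels are plain strings checked against the allowed character set; the function and variable names are illustrative.

```python
def filter_by_charset(samples, charset):
    """Keep only (image_path, label) pairs whose label uses allowed characters.

    samples: iterable of (image_path, label) pairs; charset: set of allowed characters.
    """
    return [(path, label) for path, label in samples
            if all(ch in charset for ch in label)]

# Usage (illustrative): charset = set of the 3,816 characters described above.
# kept = filter_by_charset(raw_samples, charset)
```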

Meta Self-Learning Algorithm

    • Warm-Up and Generation of Pseudo-Labels:
      • The model is first trained on the source domains (D_S) as a warm-up phase. Warm-up is a necessary step for the self-learning method and greatly improves the quality of the generated pseudo-labels, leading to better results. Without warm-up, the generated pseudo-labels would either have low confidence or wrong content, which would greatly hurt prediction accuracy on the target domain. After the warm-up, target-domain data with pseudo-labels is generated.
      • Random Split: How the pseudo-labels are used is one of the most important issues. Since raw pseudo-labels can be noisy, a meta-update is used in the method. During the meta-update, both source-domain data and target-domain data with pseudo-labels are used, and they are divided randomly into a meta-train set and a meta-test set, which correspond to the support set and query set in vanilla MAML.
      • Meta-update loss: the loss on the meta-train set is computed first, the model takes one inner gradient step, and the loss on the meta-test set is then evaluated with the updated parameters; the two losses are combined for the outer update (see the sketch below).
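
A rough sketch of this meta-update in the MAML/MLDG style, assuming the meta-train and meta-test losses are combined as in vanilla MAML; α, β and γ match the hyperparameters listed in the experiment settings below, but the paper's exact formulation may differ.

```latex
% Inner (meta-train) step: one gradient step on the meta-train split
\theta' = \theta - \beta \,\nabla_{\theta}\,\mathcal{L}_{\mathrm{tr}}(\theta)

% Combined meta objective: meta-train loss plus the meta-test loss
% evaluated with the adapted parameters, weighted by gamma
\mathcal{L}_{\mathrm{meta}}(\theta) = \mathcal{L}_{\mathrm{tr}}(\theta)
    + \gamma\,\mathcal{L}_{\mathrm{te}}(\theta')

% Outer update with learning rate alpha (Adam in the experiments)
\theta \leftarrow \theta - \alpha\,\nabla_{\theta}\,\mathcal{L}_{\mathrm{meta}}(\theta)
```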
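
And a minimal first-order PyTorch sketch of one such meta-update step; `model`, `criterion`, the batch format, and the function name are illustrative assumptions, while the SGD-inner / Adam-outer split follows the experiment settings rather than the paper's released code.

```python
import copy
import torch

def meta_self_learning_step(model, meta_train_batch, meta_test_batch,
                            criterion, outer_opt, beta=1e-2, gamma=1.0):
    """One meta-update (first-order sketch): inner SGD step on the meta-train
    split, meta-test loss with the adapted weights, combined outer update."""
    x_tr, y_tr = meta_train_batch
    x_te, y_te = meta_test_batch

    # Inner step: one SGD update on a temporary copy of the model.
    adapted = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(adapted.parameters(), lr=beta)
    inner_loss = criterion(adapted(x_tr), y_tr)
    inner_opt.zero_grad()
    inner_loss.backward()
    inner_opt.step()

    # Meta-test loss evaluated with the adapted weights.
    meta_test_loss = criterion(adapted(x_te), y_te)
    meta_test_grads = torch.autograd.grad(meta_test_loss, list(adapted.parameters()))

    # Outer step: gradient of the meta-train loss on the original model plus
    # gamma times the meta-test gradient (first-order approximation).
    outer_opt.zero_grad()
    meta_train_loss = criterion(model(x_tr), y_tr)
    meta_train_loss.backward()
    for p, g in zip(model.parameters(), meta_test_grads):
        p.grad = gamma * g if p.grad is None else p.grad + gamma * g
    outer_opt.step()
    return meta_train_loss.item(), meta_test_loss.item()
```

Here `outer_opt` would be an Adam optimizer over `model.parameters()`, matching the Adam-outer / SGD-inner split in the settings; a full second-order update would instead backpropagate through the inner step.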
 
 

Experiment Results

 
  • Experiment settings:
    • PyTorch on Tesla T4
    • Adam is used as the outer optimizer, and SGD is used as the meta optimizer
      • α is set to 1e-3
      • β and γ are changed during the training process
    • Input size:
      • 100 × 32
    • The character set in these experiments is set to 3,818, which includes:
      • 3,756 common Chinese characters
      • 62 alphanumeric characters
  • Results:
    • Baseline:
      • The baseline model is trained on the source domains only, without any multi-source domain adaptation method.
    • MLDG:
      • During training, the source domains are divided into a meta-train set and a meta-test set. The model first updates one step using the meta-train set and is then validated on the meta-test set. The final model, converged on the source domains, is deployed on the truly held-out target domain.
    • Pseudo-Label:
      • As warm-up is a necessary step for the pseudo-label method, the baseline model is used as the pre-trained model and training with pseudo-labels starts directly from it.
      • Pseudo-label score threshold:
        • Handwritten: 0.98
        • Others: 0.9
    • Meta Self-Learning (paper proposal):
      • Same settings as the pseudo-label method.
        NOTE: The best results are obtained with different pseudo-label usage settings (see the explanations below).
         
      • Experiments on pseudo-label usage settings:
        • IAOS: Use all five domains during the meta-update, and use only the source domains during the outer optimization.
        • IPOA: Use only the pseudo-label domain as the meta-test set during the meta-update, and use all five domains during the outer optimization.
        • IPOP: Has the same setting as IPOA during the meta-update, but uses only images with pseudo-labels during the outer optimization.

Conclusion

    • A fresh approach/direction for solving the domain shift problem in the text recognition area
    • Large Multi-Domain Chinese Text Recognition dataset
    • Possible application to Lomin Textscope: documents from different domains?
      • Self-Learning
      • Online-Learning