JokerGAN: Memory-Efficient Model for Handwritten Text Generation with Text Line Awareness

Oct 1, 2021 · Jan Zdenek, Hideki Nakayama
Abstract
Collecting labeled data to train models for image recognition problems, including handwritten text recognition (HTR), is a tedious and expensive task. Recent work on handwritten text generation shows that generative models can be used as a data augmentation method to improve the performance of HTR systems. We propose a new method for handwritten text generation that uses generative adversarial networks with multi-class conditional batch normalization, which enables us to use character sequences of variable length as conditional input. Compared to existing methods, our model has significantly lower memory requirements, which are almost constant regardless of the size of the character set. This allows us to train a generative model for languages with a large number of characters, such as Japanese. An additional condition makes the generator aware of the vertical properties of the characters in the generated sequence, which helps us generate text with well-aligned characters in the text line. Our experiments on handwritten text datasets show that the proposed model can be used to boost the performance of HTR, particularly when we only have access to partially annotated data and train the generative model in a semi-supervised fashion. Our results also show that our model outperforms the current state of the art for handwritten text generation. In addition, a human evaluation study indicates that the proposed method generates handwritten text images that look more realistic and natural.
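To make the idea of conditioning batch normalization on a character sequence more concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and the paper's exact formulation may differ. It assumes that each character selects its own scale and shift vector from a lookup table, and that each character modulates an equal-width horizontal slice of the feature map. The function name `multi_class_cbn` and the tables `gamma_table` and `beta_table` are hypothetical names chosen for illustration.

```python
import numpy as np


def multi_class_cbn(features, char_ids, gamma_table, beta_table, eps=1e-5):
    """Hypothetical sketch of batch norm conditioned on a character sequence.

    features:    (N, C, H, W) feature maps
    char_ids:    (N, T) integer character indices, one row per text line
    gamma_table: (num_chars, C) per-character scale vectors (assumed layout)
    beta_table:  (num_chars, C) per-character shift vectors (assumed layout)
    """
    w = features.shape[3]
    t = char_ids.shape[1]

    # Parameter-free normalization over the batch and spatial axes.
    mean = features.mean(axis=(0, 2, 3), keepdims=True)
    var = features.var(axis=(0, 2, 3), keepdims=True)
    normalized = (features - mean) / np.sqrt(var + eps)

    # Assumption: the W columns are split evenly among the T characters.
    col_to_char = (np.arange(w) * t) // w      # shape (W,)
    ids_per_col = char_ids[:, col_to_char]     # shape (N, W)

    # Look up a scale and shift per column, then reshape to (N, C, 1, W)
    # so they broadcast over the height axis.
    gamma = gamma_table[ids_per_col].transpose(0, 2, 1)[:, :, None, :]
    beta = beta_table[ids_per_col].transpose(0, 2, 1)[:, :, None, :]
    return gamma * normalized + beta


# Usage: two text lines of two characters each, from a 5-character set.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4, 8))
ids = np.array([[0, 1], [2, 0]])
out = multi_class_cbn(x, ids, np.ones((5, 3)), np.zeros((5, 3)))
print(out.shape)
```

Under these assumptions, a longer character sequence only changes which table rows are looked up and how wide the feature map is, and each additional character in the character set adds just two vectors of length `C` per normalization layer.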
Publication
29th ACM International Conference on Multimedia (ACMMM 2021)
Authors
Jan Zdenek
Research Scientist
Jan is a research scientist at CyberAgent, where he works on artificial intelligence and computer vision with a focus on image generation and editing. He received his PhD in Information Science and Technology from the University of Tokyo, where his research centered on image generation. Prior to that, he received his Master’s degree in Creative Informatics from the University of Tokyo, and his Bachelor’s degree in Computer and Information Science from the Czech Technical University in Prague. Born and raised in the Czech Republic, he currently works in Japan.