Erasing Scene Text With Weak Supervision
Abstract
Scene text erasing is the task of removing text from natural scene images, and it has been gaining attention in recent years. The main motivation is to conceal private information, such as license plate numbers and house nameplates, that can appear in images. In this work, we propose a novel method for scene text erasing that approaches the problem as a general inpainting task. Unlike previous methods, which require pairs of original images containing text and the same images with the text removed, our method does not need corresponding image pairs for training. Instead, we use a separately trained scene text detector and an inpainting network. The scene text detector predicts segmentation maps of text instances, which are then used as masks for the inpainting network. The inpainting network, trained on the Places2 dataset of indoor and outdoor scenes, fills in the masked-out regions of an input image and generates a final image with the text erased. The results show that our method is able to remove text from images and fill in the background naturally.
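As a rough illustration only, the detect-then-inpaint pipeline described above might be wired together as in the sketch below. It is not the paper's implementation: `erase_text`, `binarize_mask`, and `mean_fill_inpaint` are hypothetical names, the threshold and mask dilation are assumed details, and the mean-colour fill is a trivial placeholder standing in for the trained inpainting network. The text detector is passed in as any callable that returns a per-pixel text probability map.

```python
import numpy as np


def binarize_mask(seg_map, threshold=0.5, dilate=1):
    """Turn a text-probability map (H, W) into a boolean mask.

    The mask is grown by `dilate` pixels (4-neighbourhood) so that it also
    covers the edges of the characters. Both steps are assumptions.
    """
    mask = seg_map >= threshold
    for _ in range(dilate):
        padded = np.pad(mask, 1, mode="constant", constant_values=False)
        # A pixel is masked if it or any of its direct neighbours is masked.
        mask = (
            padded[1:-1, 1:-1]
            | padded[:-2, 1:-1]
            | padded[2:, 1:-1]
            | padded[1:-1, :-2]
            | padded[1:-1, 2:]
        )
    return mask


def mean_fill_inpaint(masked_image, mask):
    """Placeholder inpainter: fill the hole with the mean unmasked colour."""
    out = masked_image.astype(np.float64).copy()
    if mask.any() and (~mask).any():
        out[mask] = masked_image[~mask].mean(axis=0)
    return out


def erase_text(image, detect_fn, inpaint_fn, threshold=0.5, dilate=1):
    """Detect text, mask it out, and let the inpainter fill the masked region.

    image:      float array of shape (H, W, 3)
    detect_fn:  image -> (H, W) text probability map (stand-in for the detector)
    inpaint_fn: (masked_image, mask) -> filled image (stand-in for the network)
    """
    mask = binarize_mask(detect_fn(image), threshold, dilate)
    masked = image.copy()
    masked[mask] = 0  # hide the text pixels from the inpainter
    filled = inpaint_fn(masked, mask)
    # Keep the original pixels everywhere outside the mask.
    return np.where(mask[..., None], filled, image).astype(image.dtype)
```

Because the two components only communicate through the mask, either one can be swapped or retrained independently, which is what allows training without paired images.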
Publication
Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision (WACV 2020)

Authors
Research Scientist
Jan is a research scientist at CyberAgent, where he works on artificial intelligence and computer vision
with a focus on image generation and editing. He received his PhD in Information Science and Technology
from the University of Tokyo, where his research centered on image generation. Prior to that, he received
his Master’s degree in Creative Informatics from the University of Tokyo, and his Bachelor’s degree
in Computer and Information Science from the Czech Technical University in Prague.
Born and raised in the Czech Republic, he currently works in Japan.