Scientists from the Division of Sustainable Energy and Environmental Engineering at Osaka University used generative adversarial networks trained on a custom dataset to virtually remove obstructions from building façade images. This work may assist in civic planning as well as computer vision applications.
The ability to digitally "erase" unwanted occluding objects from a cityscape is highly useful but requires a great deal of computing power. Previous methods used standard image datasets to train machine learning algorithms. Now, a team of researchers at Osaka University has built a custom dataset as part of a general framework for automatically removing unwanted objects, such as pedestrians, riders, vegetation, or cars, from an image of a building's façade. The removed region was then replaced using digital inpainting to efficiently restore a complete view.
The researchers used data on the Kansai region of Japan from an open-source street view service, rather than the conventional building image sets often used in machine learning for urban landscapes. They then constructed a dataset to train a generative adversarial network (GAN) to inpaint the occluded regions with high accuracy. "For the task of façade inpainting in street-level scenes, we adopted an end-to-end deep learning-based image inpainting model by training with our customized datasets," first author Jiaxin Zhang explains.
The team used semantic segmentation to detect several types of obstructing objects, including pedestrians, vegetation, and cars, and used GANs to fill the detected regions with background textures and patching information from street-level imagery. They also proposed a workflow to automatically filter unblocked building façades from street view images, and customized the dataset to contain both original and masked images for training additional machine learning algorithms.
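
To make the data-preparation side of this workflow concrete, the following Python sketch shows how a segmentation label map could be turned into an occlusion mask, how nearly unobstructed façade images could be filtered, and how an original/masked training pair could be built. It is an illustration only, not the researchers' implementation: the class IDs in `OCCLUDER_IDS`, the 1% threshold, and all function names are assumptions made for this example, and the segmentation model and the GAN itself are left out.

```python
import numpy as np

# Hypothetical label IDs for occluding classes (an assumption for this sketch);
# a real segmentation model defines its own label map.
OCCLUDER_IDS = {"vegetation": 8, "pedestrian": 11, "rider": 12, "car": 13}


def occlusion_mask(labels: np.ndarray) -> np.ndarray:
    """Return a boolean mask that is True where a pixel belongs to an occluder."""
    return np.isin(labels, list(OCCLUDER_IDS.values()))


def occlusion_ratio(labels: np.ndarray) -> float:
    """Fraction of pixels covered by occluding classes."""
    return float(occlusion_mask(labels).mean())


def is_unblocked(labels: np.ndarray, max_ratio: float = 0.01) -> bool:
    """Treat a facade image as unblocked if occluders cover at most max_ratio of it.

    The 1% default is an arbitrary illustrative threshold.
    """
    return occlusion_ratio(labels) <= max_ratio


def make_training_pair(image: np.ndarray, mask: np.ndarray, fill: int = 255):
    """Blank out the masked pixels of a clean image, returning (masked, original)."""
    masked = image.copy()
    masked[mask] = fill  # the boolean mask selects whole pixels across all channels
    return masked, image
```

In a pipeline of this kind, `is_unblocked` would select clean façade images, and `make_training_pair` would apply occluder-shaped masks to them so that an inpainting model can be trained to reconstruct the hidden region from the masked input.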