Authors: Jisan Mahmud, True Price, Akash Bapat, Jan-Michael Frahm Description: We propose a boundary-aware multi-task deep-learning-based framework for fast 3D building modeling from a single overhead image. Unlike most existing techniques which rely on multiple images for 3D scene modeling, we seek to model the buildings in the scene from a single overhead image by jointly learning a modified signed distance function (SDF) from the building boundaries, a dense heightmap of the scene, and scene semantics. To jointly train for these tasks, we leverage pixel-wise semantic segmentation and normalized digital surface maps (nDSM) as supervision, in addition to labeled building outlines. At test time, buildings in the scene are automatically modeled in 3D using only an input overhead image. We demonstrate an increase in building modeling performance using a multi-feature network architecture that improves building outline detection by considering network features learned for the other jointly learned tasks. We also introduce a novel mechanism for robustly refining instance-specific building outlines using the learned modified SDF. We verify the effectiveness of our method on multiple large-scale satellite and aerial imagery datasets, where we obtain state-of-the-art performance in the 3D building reconstruction task.