Span-based Localizing Network for Natural Language Video Localization