Most urban applications require building footprints as concise vector graphics with sharp boundaries rather than as pixel-wise raster images. This contrasts with the majority of existing methods, which typically produce over-smoothed footprint polygons; editing such automatically generated polygons can be inefficient, sometimes even more time-consuming than manual digitization. This paper introduces a semi-automatic approach to building footprint extraction based on semantically sensitive superpixels and graph neural networks. We first learn to generate superpixels that are not only boundary-preserving but also semantically sensitive: they respond exclusively to building boundaries rather than to those of other objects. These intermediate superpixel representations can naturally be treated as nodes of a graph. A graph neural network is therefore employed to model the global interactions among all superpixels and to enhance the representativeness of node features for building segmentation, which also enables efficient editing of the segmentation results. Classical methods are then used to extract and regularize the boundaries of the vectorized building footprints. We achieve accurate segmentation with only a few clicks and simple strokes, without any editing of polygon vertices. In vector graphics evaluation, our method improves AP50 by a significant 8% over established techniques. We also design a streamlined pipeline for interactive editing that further improves the overall quality of the results. The code for training the superpixel and graph networks will be made publicly available at
https://vrlab.org.cn/~hanhu/projects/spgraph/.
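As a toy illustration of the superpixels-as-graph-nodes idea, the sketch below runs one round of mean-aggregation message passing (GCN-style) over superpixel features, where two superpixels are connected if they share a boundary. This is a minimal assumption-laden sketch, not the authors' actual network; the function name and feature layout are hypothetical.

```python
import numpy as np

def message_passing(features, adjacency):
    """One round of mean aggregation over superpixel neighbors.

    Hypothetical sketch, not the paper's exact architecture.
    features:  (N, D) array, one feature vector per superpixel.
    adjacency: (N, N) 0/1 array; adjacency[i, j] = 1 if superpixels
               i and j share a boundary.
    """
    # Add self-loops so each node keeps a share of its own features.
    a = adjacency + np.eye(adjacency.shape[0])
    # Row-normalize: average over each node's neighborhood.
    a = a / a.sum(axis=1, keepdims=True)
    return a @ features

# Toy example: 4 superpixels in a chain, with 2-D node features.
feats = np.array([[1.0, 0.0],
                  [1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 1.0]])
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

smoothed = message_passing(feats, adj)
print(smoothed.shape)  # (4, 2)
```

In a learned setting, this aggregation would be interleaved with trainable weight matrices and nonlinearities, and the resulting node features classified as building or background per superpixel.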