PanoContext: A Whole-Room 3D Context Model for Panoramic Scene Understanding
Author(s): Zhang, Yinda; Song, Shuran; Tan, Ping; Xiao, Jianxiong
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1qg23
Abstract: The field-of-view of standard cameras is very small, which is one of the main reasons that contextual information is not as useful as it should be for object detection. To overcome this limitation, we advocate the use of 360° full-view panoramas in scene understanding, and propose a whole-room context model in 3D. For an input panorama, our method outputs 3D bounding boxes of the room and all major objects inside, together with their semantic categories. Our method generates 3D hypotheses based on contextual constraints and ranks the hypotheses holistically, combining both bottom-up and top-down context information. To train our model, we construct an annotated panorama dataset and reconstruct the 3D model from single-view using manual annotation. Experiments show that solely based on 3D context without any image region category classifier, we can achieve a comparable performance with the state-of-the-art object detector. This demonstrates that when the FOV is large, context is as powerful as object appearance. All data and source code are available online.
Publication Date: 2014
Citation: Zhang, Yinda, Shuran Song, Ping Tan, and Jianxiong Xiao. "PanoContext: A Whole-Room 3D Context Model for Panoramic Scene Understanding." In European Conference on Computer Vision (2014): pp. 668-686. doi:10.1007/978-3-319-10599-4_43
DOI: 10.1007/978-3-319-10599-4_43
ISSN: 0302-9743
EISSN: 1611-3349
Pages: 668-686
Type of Material: Conference Article
Journal/Proceeding Title: European Conference on Computer Vision
Version: Author's manuscript