The More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement
Author(s): Wang, Jingyan; Russakovsky, Olga; Ramanan, Deva
To refer to this page use:
http://arks.princeton.edu/ark:/88435/pr1bj9z
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wang, Jingyan | - |
dc.contributor.author | Russakovsky, Olga | - |
dc.contributor.author | Ramanan, Deva | - |
dc.date.accessioned | 2021-10-08T19:44:12Z | - |
dc.date.available | 2021-10-08T19:44:12Z | - |
dc.date.issued | 2018-03 | en_US |
dc.identifier.citation | Wang, Jingyan, Olga Russakovsky, and Deva Ramanan. "The More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement." In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1794-1803. IEEE, 2018. doi: 10.1109/WACV.2018.00199 | en_US |
dc.identifier.uri | https://www.cs.cmu.edu/~jingyanw/papers/refinement.pdf | - |
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1bj9z | - |
dc.description.abstract | Comprehensive object understanding is a central challenge in visual recognition, yet most advances with deep neural networks reason about each aspect in isolation. In this work, we present a unified framework to tackle this broader object understanding problem. We formalize a refinement module that recursively develops understanding across space and semantics - "the more it looks, the more it sees." More concretely, we cluster the objects within each semantic category into fine-grained subcategories; our recursive model extracts features for each region of interest, recursively predicts the location and the content of the region, and selectively chooses a small subset of the regions to process in the next step. Our model can quickly determine if an object is present, followed by its class ("Is this a person?"), and finally report fine-grained predictions ("Is this person standing?"). Our experiments demonstrate the advantages of joint reasoning about spatial layout and fine-grained semantics. On the PASCAL VOC dataset, our proposed model simultaneously achieves strong performance on instance segmentation, part segmentation and keypoint detection in a single efficient pipeline that does not require explicit training for each task. One of the reasons for our strong performance is the ability to naturally leverage highly-engineered architectures, such as Faster-RCNN, within our pipeline. Source code is available at https://github.com/jingyanw/recursive-refinement. | en_US |
dc.format.extent | 1794 - 1803 | en_US |
dc.language.iso | en_US | en_US |
dc.relation.ispartof | 2018 IEEE Winter Conference on Applications of Computer Vision (WACV) | en_US |
dc.rights | Author's manuscript | en_US |
dc.title | The More You Look, the More You See: Towards General Object Understanding Through Recursive Refinement | en_US |
dc.type | Conference Article | en_US |
dc.identifier.doi | doi:10.1109/WACV.2018.00199 | - |
dc.identifier.isbn13 | 978-1-5386-4886-5 | - |
dc.identifier.isbn13 | 978-1-5386-4887-2 | - |
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/conference-proceeding | en_US |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
ObjectUnderstandingRecursiveRefinement.pdf | | 8.19 MB | Adobe PDF | View/Download |
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.