Sean Bell*,
Paul Upchurch*,
Noah Snavely,
Kavita Bala
Cornell University
* Equal contribution
Computer Vision and Pattern Recognition (CVPR) 2015
Abstract:
Recognizing materials in real-world images is a challenging task. Real-world materials have rich surface texture, geometry, lighting conditions, and clutter, which combine to make the problem particularly difficult. In this paper, we introduce a new, large-scale, open dataset of materials in the wild, the Materials in Context Database (MINC), and combine this dataset with deep learning to achieve material recognition and segmentation of images in the wild.
MINC is an order of magnitude larger than previous material databases, while being more diverse and well-sampled across its 23 categories. Using MINC, we train convolutional neural networks (CNNs) for two tasks: classifying materials from patches, and simultaneous material recognition and segmentation in full images. For patch-based classification on MINC we found that the best performing CNN architectures can achieve 85.2% mean class accuracy. We convert these trained CNN classifiers into an efficient fully convolutional framework combined with a fully connected conditional random field (CRF) to predict the material at every pixel in an image, achieving 73.1% mean class accuracy. Our experiments demonstrate that having a large, well-sampled dataset such as MINC is crucial for real-world material recognition and segmentation.
BibTeX:
@article{bell15minc,
  author = "Sean Bell and Paul Upchurch and Noah Snavely and Kavita Bala",
  title = "Material Recognition in the Wild with the Materials in Context Database",
  journal = "Computer Vision and Pattern Recognition (CVPR)",
  year = "2015",
}
These models predict 23 material classes with a mean class accuracy of 85.2% (GoogLeNet) on the MINC test set.
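The accuracy figures on this page are mean class accuracies: accuracy is computed separately for each material class and then averaged, so frequent classes do not dominate the score. The following is a minimal sketch of that metric, not the evaluation code used for the paper. The function name and the example labels are illustrative assumptions.

```python
from collections import defaultdict


def mean_class_accuracy(y_true, y_pred):
    """Average of per-class accuracies (illustrative helper, not the authors' code)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = defaultdict(int)  # correct predictions per true class
    total = defaultdict(int)    # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    if not total:
        raise ValueError("at least one sample is required")
    # Each class contributes equally, regardless of how many samples it has.
    per_class = [correct[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)


# Hypothetical labels: wood is 2/3 correct, glass is 1/1 correct.
truth = ["wood", "wood", "wood", "glass"]
preds = ["wood", "wood", "brick", "glass"]
score = mean_class_accuracy(truth, preds)
```

In this toy example the plain accuracy is 3/4, while the mean class accuracy is the average of 2/3 and 1, which shows how the two metrics can differ when classes are unevenly represented.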
MINC is the full dataset described in our paper (Section 3). It consists of 3M labeled point samples and 7061 labeled material segmentations in 23 material categories. To use MINC, you will also need the original-resolution images.
MINC-2500 is a patch classification dataset with 2500 samples per category (Section 5.4 of the paper). It is a subset of MINC in which each sample has been resized to 362 x 362 pixels and every category is sampled evenly. The original-resolution images are not needed because the archive includes the extracted patches.
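Sampling each category evenly can be pictured as drawing the same number of samples from every class. The sketch below shows one way to build such a balanced subset. It is an assumption-laden illustration, not the procedure used to create MINC-2500: the function name, the `(category, item)` input format, and the fixed random seed are all hypothetical.

```python
import random
from collections import defaultdict


def sample_evenly(samples, per_category, seed=0):
    """Draw `per_category` items from every category (illustrative, hypothetical helper).

    `samples` is an iterable of (category, item) pairs.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    by_category = defaultdict(list)
    for category, item in samples:
        by_category[category].append(item)

    balanced = {}
    for category, items in by_category.items():
        if len(items) < per_category:
            raise ValueError(f"category {category!r} has only {len(items)} samples")
        # Sample without replacement so no item appears twice.
        balanced[category] = rng.sample(items, per_category)
    return balanced


# Toy input: 10 "wood" samples and 4 "glass" samples, balanced to 3 each.
toy = [("wood", i) for i in range(10)] + [("glass", i) for i in range(4)]
subset = sample_evenly(toy, per_category=3)
```

For a dataset shaped like MINC-2500, `per_category` would be 2500 and there would be 23 categories.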
We provide original-resolution images for non-commercial research purposes only. Please enter your information to generate a "terms and conditions" form (JavaScript must be enabled).
The annotations are licensed under a Creative Commons Attribution 4.0 International License. The photos have their own licenses.
This work was supported in part by Google, Amazon AWS for Education, an NSERC PGS-D scholarship, the National Science Foundation (grants IIS-1149393, IIS-1011919, IIS-1161645), and the Intel Science and Technology Center for Visual Computing. We thank NVIDIA for the generous donation of K40 GPUs.