Material segmentations
Users were asked to draw around regions of a single type of material.
Vanishing points
Each color corresponds to one vanishing point.
Whitebalance points
Users were asked to click on points that are white or gray.
Each color corresponds to one user.
Median chroma
- 5.545 (24.2 s)
- 5.267 (7.45 s)
- 7.211 (6.09 s)
- 5.239 (3.19 s)
- 5.461 (2.95 s)
Human reflectance judgements
Our user interface for collecting annotations shows the user an image and asks them, for a particular pair of pixels (indicated with crosshairs and labeled Points 1 and 2), which of the two points has a darker surface color. The user can then select one of three options: Point 1, Point 2, or About the same. We ask users to specify their confidence in their assessment as Guessing, Probably, or Definitely, as was done by [Branson et al. 2010].
We aggregate judgements from 5 users for each pair of points and use the CUBAM machine learning model [Welinder et al. 2010] to model two forms of bias.
See our publication for more details.

Our user interface
Intrinsic image decompositions
The input image is decomposed into a "reflectance" layer and a "shading" layer. Note that the reflectance layer is listed twice: color (left) and grayscale (center). Decompositions are ordered by error and then by runtime (best on top). The parameters for each algorithm are the same for all photos; they are set to the values that produce the lowest mean error (WHDR) across all photos. See our publication for more details.
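As a rough illustration of what a decomposition means, the sketch below assumes the multiplicative model commonly used for intrinsic images, in which each pixel's intensity is the product of its reflectance and its shading. The page itself does not state the model, and the function names, the single-channel nested-list image format, and the `eps` guard are all assumptions made for this example.

```python
def shading_from_reflectance(image, reflectance, eps=1e-6):
    """Recover shading as S = I / R per pixel (assumed multiplicative model).

    `image` and `reflectance` are nested lists (rows of floats) holding
    single-channel linear intensities. `eps` avoids division by zero.
    """
    return [
        [i / max(r, eps) for i, r in zip(image_row, refl_row)]
        for image_row, refl_row in zip(image, reflectance)
    ]


def recompose(reflectance, shading):
    """Rebuild the image as I = R * S per pixel."""
    return [
        [r * s for r, s in zip(refl_row, shading_row)]
        for refl_row, shading_row in zip(reflectance, shading)
    ]


# Tiny 2x2 example with made-up values.
image = [[0.2, 0.4], [0.1, 0.8]]
reflectance = [[0.4, 0.8], [0.5, 0.8]]

shading = shading_from_reflectance(image, reflectance)
rebuilt = recompose(reflectance, shading)
```

Multiplying the two layers back together should reproduce the input up to floating-point error, which is a quick sanity check on any decomposition that follows this model.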
- Algorithm: garces2012_clustering
  - Parameters:
    - km k: 8
    - remap gamma 2 2: False
  - Result:
    - Weighted human disagreement rate (WHDR): 21.3% (δ: 0.1)
    - WHDR for equal edges only: 0.2323
    - WHDR for inequalities only: 0.1842
    - Runtime: 3.2 s
 
- Algorithm: bell2014_densecrf
  - Parameters:
    - abs reflectance weight: 0
    - abs shading gray point: 0.5
    - abs shading weight: 500.0
    - chromaticity weight: 0
    - kmeans intensity scale: 0.5
    - kmeans n clusters: 20
    - n iters: 25
    - pairwise intensity chromaticity: True
    - pairwise weight: 104
    - shading blur init method: none
    - shading blur sigma: 0.1
    - shading target norm: L2
    - shading target weight: 20000.0
    - split clusters: True
    - theta c: 0.025
    - theta l: 0.1
    - theta p: 0.1
  - Result:
    - Weighted human disagreement rate (WHDR): 32.6% (δ: 0.1)
    - WHDR for equal edges only: 0.2690
    - WHDR for inequalities only: 0.4103
    - Runtime: 210.5 s
 
- Algorithm: zhao2012_nonlocal
  - Parameters:
    - chrom thresh: 0.001
    - gamma: False
    - texture patch distance: 0.0003
    - texture patch variance: 0.03
  - Result:
    - Weighted human disagreement rate (WHDR): 32.7% (δ: 0.1)
    - WHDR for equal edges only: 0.3502
    - WHDR for inequalities only: 0.2918
    - Runtime: 50.4 s
 
- Algorithm: shen2011_optimization
  - Parameters:
    - rho: 1.9
    - unmap srgb: False
    - wd: 3
  - Result:
    - Weighted human disagreement rate (WHDR): 36.9% (δ: 0.1)
    - WHDR for equal edges only: 0.5649
    - WHDR for inequalities only: 0.0785
    - Runtime: 246.1 s
 
- Algorithm: grosse2009_grayscale_retinex
  - Citation: Roger Grosse, Micah K. Johnson, Edward H. Adelson, William T. Freeman. "Ground truth dataset and baseline evaluations for intrinsic image algorithms". Proceedings of the International Conference on Computer Vision (ICCV). http://www.cs.toronto.edu/~rgrosse/intrinsic/
  - Result:
    - Weighted human disagreement rate (WHDR): 38.4% (δ: 0.1)
    - WHDR for equal edges only: 0.5768
    - WHDR for inequalities only: 0.0971
    - Runtime: 145.3 s
 
- Algorithm: grosse2009_color_retinex
  - Citation: Roger Grosse, Micah K. Johnson, Edward H. Adelson, William T. Freeman. "Ground truth dataset and baseline evaluations for intrinsic image algorithms". Proceedings of the International Conference on Computer Vision (ICCV). http://www.cs.toronto.edu/~rgrosse/intrinsic/
  - Parameters:
    - L1: True
    - threshold color: 0.7
    - threshold gray: 0.5
  - Result:
    - Weighted human disagreement rate (WHDR): 39.6% (δ: 0.1)
    - WHDR for equal edges only: 0.5808
    - WHDR for inequalities only: 0.1216
    - Runtime: 206.3 s
 
- Algorithm: baseline_reflectance
  - Result:
    - Weighted human disagreement rate (WHDR): 40.2% (δ: 0.1)
    - WHDR for equal edges only: 0.0000
    - WHDR for inequalities only: 1.0000
    - Runtime: 0.1 s
 
- Algorithm: baseline_shading
  - Result:
    - Weighted human disagreement rate (WHDR): 50.4% (δ: 0.1)
    - WHDR for equal edges only: 0.7845
    - WHDR for inequalities only: 0.0859
    - Runtime: 0.1 s
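The WHDR figures above can be read with the help of a small sketch. It assumes the usual definition of the metric: for each judged pair of points, the algorithm's reflectance values are compared using the threshold δ, and the weights of all pairs where the algorithm disagrees with the human answer are summed and divided by the total weight. The `Judgement` class, its field names, and the example values are hypothetical and are not taken from the dataset's own code.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    """One human comparison, paired with the algorithm's reflectance values."""
    r1: float      # algorithm's reflectance at Point 1 (hypothetical field)
    r2: float      # algorithm's reflectance at Point 2 (hypothetical field)
    darker: str    # human answer: "1", "2", or "E" (about the same)
    weight: float  # confidence weight of the human answer


def predict_darker(r1, r2, delta=0.1, eps=1e-10):
    """Return which point the algorithm considers darker: "1", "2", or "E"."""
    r1, r2 = max(r1, eps), max(r2, eps)
    if r2 / r1 > 1.0 + delta:
        return "1"  # Point 1 is darker by more than the threshold
    if r1 / r2 > 1.0 + delta:
        return "2"  # Point 2 is darker by more than the threshold
    return "E"      # within the threshold: treated as about the same


def whdr(judgements, delta=0.1):
    """Weighted fraction of judgements that the algorithm gets wrong."""
    total = sum(j.weight for j in judgements)
    if total == 0:
        return None
    wrong = sum(
        j.weight for j in judgements
        if predict_darker(j.r1, j.r2, delta) != j.darker
    )
    return wrong / total


# A constant reflectance layer always predicts "E", so it is never wrong on
# equal pairs and always wrong on inequality pairs.
equal_pairs = [Judgement(0.5, 0.5, "E", 1.0)]
inequality_pairs = [Judgement(0.5, 0.5, "1", 0.8), Judgement(0.5, 0.5, "2", 0.6)]

print(whdr(equal_pairs))                     # 0.0
print(whdr(inequality_pairs))                # 1.0
print(whdr(equal_pairs + inequality_pairs))
```

Under these assumptions, the pattern matches the baseline_reflectance entry, which reports 0.0000 for equal edges and 1.0000 for inequalities.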