Models of visual attention have focused predominantly on bottom-up approaches that ignore structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information becomes available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition.
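The mechanism summarized above can be sketched in a few lines: pool low-level feature statistics (oriented-energy responses) over a coarse spatial grid to form a global scene descriptor, then use a scene-driven spatial prior to modulate a bottom-up saliency map. This is a minimal illustrative sketch, not the paper's implementation: the `gist_descriptor` gradient-based features stand in for the multiscale Gabor responses used in gist-style representations, and the `contextual_prior` mapping from scene to expected target location is hypothetical.

```python
import numpy as np

def gist_descriptor(image, n_orients=4, grid=4):
    """Coarse global descriptor of a grayscale image: oriented gradient
    energy pooled over a grid x grid spatial layout (a stand-in for the
    multiscale Gabor features of gist-style scene representations)."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # orientation folded to [0, pi)
    h, w = image.shape
    desc = np.zeros((grid, grid, n_orients))
    for i in range(grid):
        for j in range(grid):
            ys = slice(i * h // grid, (i + 1) * h // grid)
            xs = slice(j * w // grid, (j + 1) * w // grid)
            m, a = mag[ys, xs], ang[ys, xs]
            for o in range(n_orients):
                lo, hi = o * np.pi / n_orients, (o + 1) * np.pi / n_orients
                desc[i, j, o] = m[(a >= lo) & (a < hi)].sum()
    return desc.ravel() / (desc.sum() + 1e-9)  # normalized global layout

def contextual_prior(shape, expected_row_frac, sigma_frac=0.15):
    """Spatial prior on target location given the scene: here a Gaussian
    over vertical position (the gist-to-location mapping is hypothetical)."""
    h, w = shape
    rows = np.arange(h)[:, None].astype(float)
    mu, sigma = expected_row_frac * h, sigma_frac * h
    return np.exp(-0.5 * ((rows - mu) / sigma) ** 2) * np.ones((1, w))

def context_modulated_saliency(saliency, prior):
    """Modulate bottom-up saliency by the scene-driven prior, renormalized."""
    s = saliency * prior
    return s / (s.max() + 1e-9)
```

A usage sketch: for an image whose global descriptor suggests, say, a street scene, a prior concentrated in the lower part of the image would boost saliency where pedestrians are likely, restricting search before any local analysis of candidate regions.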
© 2003 Optical Society of America
(100.5010) Image processing: Pattern recognition
(330.0330) Vision, color, and visual optics: Vision, color, and visual optics
(330.4060) Vision, color, and visual optics: Vision modeling
Antonio Torralba, "Modeling global scene factors in attention," J. Opt. Soc. Am. A 20, 1407-1418 (2003)