2017 Abstracts - Vision Sciences Society

Vision Sciences Society
17th Annual Meeting, May 19-24, 2017
TradeWinds Island Resorts, St. Pete Beach, Florida
Sessions Overview
Member-Initiated Symposia
Saturday Morning Talks
Saturday Morning Posters
Saturday Afternoon Talks
Saturday Afternoon Posters
Sunday Morning Talks
Sunday Morning Posters
Sunday Afternoon Talks
Sunday Afternoon Posters
Monday Morning Talks
Monday Morning Posters
Tuesday Morning Talks
Tuesday Morning Posters
Tuesday Afternoon Talks
Tuesday Afternoon Posters
Wednesday Morning Talks
Wednesday Morning Posters
Topic Index
Author Index
Sessions Overview

Member-Initiated Symposia

Saturday Morning Talks
Perception and Action: Arm movements
Face Perception: Experience and disorders
Object Recognition: Neural mechanisms
Perceptual Learning

Saturday Morning Posters
Attention: Features
Motion: Biological motion
Visual Search: Features and objects
Visual Memory: Long term and working
Visual Memory: Working memory
Color and Light: Neural mechanisms
Color and Light: Constancy
Binocular Vision: Continuous flash suppression and awareness
Binocular Vision: Other
Perceptual Organization: Grouping
Perceptual Organization: Neural mechanisms
Temporal Processing: Duration
Multisensory: Vision and audition

Saturday Afternoon Talks
Attention: Features
Motion: Flow, biological, and higher-order
Visual Search: Other
Color and Light: Material perception

Saturday Afternoon Posters
Perception and Action: Affordances
Face Perception: Models
Face Perception: Neural mechanisms
Eye Movements: Pursuit and anticipation
Object Recognition: Where in the brain?
Scene Perception: Models and other
Scene Perception: Neural mechanisms
3D Perception: Shape
Visual Memory: Neural mechanisms
Visual Memory: Cognitive disorders, individual differences
Multisensory: Touch and balance
Spatial Vision: Crowding and masking

Sunday Morning Talks
Attention: Selection and modulation
Color and Light: Color vision
Spatial Vision: Crowding and statistics
3D Perception

Sunday Morning Posters
Motion: Depth and models
Motion: Flow and illusions
Motion: Higher order
Development: Typical and lifespan
Perception and Action: Grasping
Object Recognition: Foundations
Perceptual Learning: Plasticity and adaptation
Perceptual Learning: Specificity and transfer
Attention: Neuroimaging
Eye Movements: Cognition
Scene Perception: Categorization and memory
Scene Perception: Spatiotemporal factors

Sunday Afternoon Talks
Object Recognition: Mechanisms and models
Binocular Vision: Rivalry and bistability
Spatial Vision: Neural mechanisms
Multisensory Processing

Sunday Afternoon Posters
Motion: Neural mechanisms
Face Perception: Development and experience
Face Perception: Disorders
Development: Atypical development
Color and Light: Appearance
Color and Light: Other
Attention: Exogenous and endogenous
Attention: Spatial selection
Attention: Individual differences, lifespan and clinical
Perception and Action: Walking and navigating
Temporal Processing: Sequences, oscillations and temporal order
Temporal Processing: Timing

Monday Morning Talks
Eye Movements: Neural mechanisms
Perceptual Organization
Attention: Mostly temporal
Binocular Vision: Stereopsis

Monday Morning Posters
Color and Light: Material perception
Color and Light: Lightness and brightness
Spatial Vision: Models
Spatial Vision: Neural mechanisms
Object Recognition: Models
Perception and Action: Manual interception and reaching movements
Face Perception: Emotion
Face Perception: Social cognition
Visual Memory: Limitations
Visual Memory: Attention and cognition
Eye Movements: Remapping and applications
Eye Movements: Saccades

Tuesday Morning Talks
Face Perception: Emotion and models
Eye Movements: Fixation and perception
Visual Search: Attention
Motion: Neural mechanisms and models

Tuesday Morning Posters
Attention: Capture
Attention: Divided
Attention: Electrophysiology
Perception and Action: Mutual interactions
Face Perception: Individual differences, learning and experience
Face Perception: Wholes, parts, and features
Object Recognition: Reading
3D Perception: Space
Binocular Vision: Stereopsis
Perceptual Learning: Models and neural mechanisms
Spatial Vision: Texture and natural image statistics

Tuesday Afternoon Talks
Scene Perception
Attention: Neural manipulation and mechanism
Development
Visual Memory: Working memory and persistence

Tuesday Afternoon Posters
Visual Search: Eye movements and memory
Visual Search: Models and mechanisms
Eye Movements: Models and neural mechanisms
Eye Movements: Perception
Perception and Action: Theory and mechanisms
Color and Light: Cognition and preference
Color and Light: Thresholds
Attention: Attentional blink
Attention: Inattention, blindnesses, and awareness
Binocular Vision: Rivalry and bistability
Object Recognition: Categories
Object Recognition: Features

Wednesday Morning Talks
Face Perception: Neural mechanisms and models
Perception and Action: The basis of decisions and actions
Eye Movements: Saccades and pursuit
Visual Memory: Capacity and integration

Wednesday Morning Posters
Attention: Reward and value
Attention: Tracking, time and selection
Attention: Space and objects
Object Recognition: Neural mechanisms

Abstract Numbering System
Each abstract is assigned a unique 4- or 6-digit number based on when and where it is to be presented. The format of the
abstract numbering is DT.RN (where D is the Day, T is the Time, R is the Room and N is the presentation Number).
First Digit - Day
2 Saturday
3 Sunday
4 Monday
5 Tuesday
6 Wednesday
Second Digit - Time
1 Early AM talk session
2 Late AM talk session
3 AM poster session
4 Early PM talk session
5 Late PM talk session
6 PM poster session
Third Digit - Room
1 Talk Room 1
2 Talk Room 2
3 Banyan Breezeway
4 Pavilion
Fourth-Sixth Digits - Number
1, 2, 3... For talks
001, 002... For posters
Examples
21.16 Saturday, early AM talk in Talk Room 1, 6th talk
36.3013 Sunday, PM poster in Banyan Breezeway, poster board 13
53.4106 Tuesday, AM poster in the Pavilion, poster board 106
Note: Two digits after the period indicate a talk; four digits indicate a poster (the last three digits are the board number).
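As an illustration of the DT.RN scheme, the following Python sketch decodes an abstract number into its day, time, room and presentation number. It is a minimal example, not an official tool: the function and type names are hypothetical, and the day digits are assumed to run from 2 (Saturday) to 6 (Wednesday), which is consistent with the two poster examples above (3 = Sunday, 5 = Tuesday).

```python
import re
from typing import NamedTuple

# Assumption: day digits run 2..6 for Saturday..Wednesday. Only 3 (Sunday)
# and 5 (Tuesday) are directly attested by the worked examples.
DAYS = {2: "Saturday", 3: "Sunday", 4: "Monday", 5: "Tuesday", 6: "Wednesday"}
TIMES = {
    1: "Early AM talk session",
    2: "Late AM talk session",
    3: "AM poster session",
    4: "Early PM talk session",
    5: "Late PM talk session",
    6: "PM poster session",
}
ROOMS = {1: "Talk Room 1", 2: "Talk Room 2", 3: "Banyan Breezeway", 4: "Pavilion"}


class AbstractNumber(NamedTuple):
    day: str
    time: str
    room: str
    number: int  # talk order or poster board number
    kind: str    # "talk" or "poster"


# D and T before the period; R followed by a 1-digit talk number
# or a 3-digit poster board number after it.
_PATTERN = re.compile(r"(\d)(\d)\.(\d)(\d|\d{3})")


def parse_abstract_number(code: str) -> AbstractNumber:
    """Decode a DT.RN abstract number such as '36.3013'."""
    match = _PATTERN.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a DT.RN abstract number: {code!r}")
    day, time, room, number = match.groups()
    kind = "poster" if len(number) == 3 else "talk"
    try:
        return AbstractNumber(
            DAYS[int(day)], TIMES[int(time)], ROOMS[int(room)], int(number), kind
        )
    except KeyError as exc:
        raise ValueError(f"unknown day, time or room digit in {code!r}") from exc
```

Calling parse_abstract_number("53.4106") yields the Tuesday AM poster session in the Pavilion, board 106; a string that does not fit the pattern raises ValueError.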
Member-Initiated Symposia
Schedule Overview
Friday, May 19, 2017, 12:00 - 2:00 pm
S1 - A scene is more than the sum of its objects: The mechanisms of object-object and object-scene integration, Talk Room 1
S2 - The Brain Correlates of Perception and Action: from Neural Activity to Behavior, Pavilion
Friday, May 19, 2017, 2:30 - 4:30 pm
S3 - How can you be so sure? Behavioral, computational, and neuroscientific perspectives on metacognition in perceptual decision-making, Talk Room 1
S4 - The Role of Ensemble Statistics in the Visual Periphery
Friday, May 19, 2017, 5:00 - 7:00 pm
S5 - Cutting across the top-down-bottom-up dichotomy in attentional capture research, Talk Room 1
S6 - Virtual Reality and Vision Science, Pavilion
S1 - A scene is more than the sum of its objects: The mechanisms of object-object and object-scene integration
Friday, May 19, 2017, 12:00 - 2:00 pm, Talk Room 1
Organizer(s): Liad Mudrik, Tel Aviv University and Melissa Võ,
Goethe University Frankfurt
Presenters: Michelle Greene, Monica S. Castelhano, Melissa L.H.
Võ, Nurit Gronau, Liad Mudrik
In the lab, vision researchers are typically trying to create “clean”,
controlled environments and stimulations in order to tease apart
the different processes that are involved in seeing. Yet in real life,
visual comprehension is never a sterile process: objects appear
with other objects in cluttered, rich scenes, which have certain
spatial and semantic properties. In recent years, more and more
studies are focusing on object-object and object-scene relations as
possible guiding principles of vision. The proposed symposium
aims to present current findings in this continuously developing
field, while specifically focusing on two key questions that have
attracted substantial scientific interest in recent years: how do
scene-object and object-object relations influence object processing, and what are the necessary conditions for deciphering these
relations? Greene, Castelhano and Võ will each tackle the first
question in different ways, using information theoretic measures,
visual search findings, eye movement, and EEG measures. The
second question will be discussed with respect to attention and
consciousness: Võ’s findings suggest automatic processing of
object-scene relations, but do not rule out the need for attention. This view is corroborated and further stressed by Gronau’s
results. With respect to consciousness, Mudrik, however, will
present behavioral and neural data suggesting that consciousness
may not be an immediate condition for relations processing, but
rather serve as a necessary enabling factor. Taken together, these
talks should lay the groundwork for an integrative discussion of both
complementary and conflicting findings. Whether these are based
on different theoretical assumptions, methodologies or experimental approaches, the core of the symposium will speak to how
to best tackle the investigation of the complexity of real-world
scene perception.
Measuring the Efficiency of Contextual Knowledge
Speaker: Michelle Greene, Stanford University
The last few years have brought us both large-scale image databases and
the ability to crowd-source human data collection, allowing us to measure
contextual statistics in real world scenes (Greene, 2013). How much contextual information is there, and how efficiently do people use it? We created
a visual analog to a guessing game suggested by Claude Shannon (1951)
to measure the information scenes and objects share. In our game, 555 participants on Amazon’s Mechanical Turk (AMT) viewed scenes in which a
single object was covered by an opaque bounding box. Participants were
instructed to guess about the identity of the hidden object until correct.
Participants were paid per trial, and each trial terminated upon correctly
guessing the object, so participants were incentivized to guess as efficiently
as possible. Using information theoretic measures, we found that scene
context can be encoded with less than 2 bits per object, a level of redundancy that is even greater than that of English text. To assess the information from scene category, we ran a second experiment in which the image
was replaced by the scene category name. Participants still outperformed
the entropy of the database, suggesting that the majority of contextual
knowledge is carried by the category schema. Taken together, these results
suggest that not only is there a great deal of information about objects coming from scene categories, but that this information is efficiently encoded
by the human mind.
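To make the guessing-game measure concrete, here is a minimal Python sketch. The abstract does not state which estimator was used; the sketch assumes the guess-number bounds from Shannon (1951), in which the distribution of the number of guesses needed per trial bounds the entropy from above and below. The function names and the toy guess counts are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def guess_distribution(guess_counts):
    """Return q, where q[i-1] is the fraction of trials solved on guess i."""
    n_trials = len(guess_counts)
    counts = Counter(guess_counts)
    max_guess = max(counts)
    return [counts.get(i, 0) / n_trials for i in range(1, max_guess + 1)]


def shannon_guessing_bounds(guess_counts):
    """Lower and upper entropy bounds (bits per item) from guess numbers.

    Upper bound: entropy of the guess-number distribution q.
    Lower bound: sum over i of i * (q_i - q_{i+1}) * log2(i), which assumes
    q is non-increasing, i.e. guesses are made in order of likelihood.
    """
    q = guess_distribution(guess_counts)
    upper = -sum(p * math.log2(p) for p in q if p > 0)
    q_ext = q + [0.0]  # q_{i+1} is zero past the largest observed guess number
    lower = sum(
        i * (q_ext[i - 1] - q_ext[i]) * math.log2(i)
        for i in range(1, len(q) + 1)
    )
    return lower, upper


# Hypothetical data: number of guesses needed on each of eight trials.
toy_guesses = [1, 1, 1, 1, 2, 2, 3, 4]
lower_bits, upper_bits = shannon_guessing_bounds(toy_guesses)
```

With real data, one list entry per trial would hold the number of guesses a participant needed before naming the hidden object; the two returned values bracket the bits of uncertainty remaining about the object once the scene context is known.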
Where in the world?: Explaining Scene Context Effects during
Visual Search through Object-Scene Spatial Associations
Speaker: Monica S. Castelhano, Queen’s University
The spatial relationship between objects and scenes and its effects on visual
search performance have been well established. Here, we examine how
object-scene spatial associations support scene context effects on eye movement guidance and search efficiency. We reframed two classic visual search
paradigms (set size and sudden onset) according to the spatial association
between the target object and scene. Using the recently proposed Surface
Guidance Framework, we operationalize target-relevant and target-irrelevant regions. Scenes are divided into three regions (upper, mid, lower)
that correspond with possible relevant surfaces (wall, countertop, floor).
Target-relevant regions are defined according to the surface on which the
target is likely to appear (e.g., painting, toaster, rug). In the first experiment, we explored how spatial associations affect search by manipulating
set size in either target-relevant or target-irrelevant regions. We found
that only increases in set size within target-relevant regions adversely affected
search performance. In the second experiment, we manipulated whether
a suddenly-onsetting distractor object appeared in a target-relevant or target-irrelevant region. We found that fixations to the distractor were significantly more likely and search performance was negatively affected in
the target-relevant condition. The Surface Guidance Framework allows for
further exploration of how object-scene spatial associations can be used to
quickly narrow processing to specific areas of the scene and largely ignore
information in other areas. Viewing effects of scene context through the
lens of target-relevancy allows us to develop a new understanding of how
the spatial associations between objects and scenes can affect performance.
What drives semantic processing of objects in scenes?
Speaker: Melissa L.H. Võ, Goethe University Frankfurt
Objects hardly ever appear in isolation, but are usually embedded in a
larger scene context. This context, determined e.g. by the co-occurrence
of other objects or the semantics of the scene as a whole, has a large impact
on the processing of each and every object. Here I will present a series of
eye tracking and EEG studies from our lab that 1) make use of the known
time-course and neuronal signature of scene semantic processing to test
whether seemingly meaningless textures of scenes are sufficient to modulate semantic object processing, and 2) raise the question of its automaticity. For instance, we have previously shown that semantically inconsistent
objects trigger an N400 ERP response similar to the one known from language processing. Moreover, an additional but earlier N300 response signals perceptual processing difficulties that are in line with classic findings
of impeded object identification from the 1980s. We have since used this
neuronal signature to investigate scene context effects on object processing
and recently found that a scene’s mere summary statistics, visualized as
seemingly meaningless textures, elicit a very similar N400 response. Further, we have shown that observers looking for target letters superimposed
on scenes fixated task-irrelevant semantically inconsistent objects embedded in the scenes to a greater degree and without explicit memory for
these objects. Manipulating the number of superimposed letters reduced
this effect, but not entirely. As part of this symposium, we will discuss the
implications of these findings for the question as to whether object-scene
integration requires attention.
Vision at a glance: the necessity of attention to contextual
integration processes
Speaker: Nurit Gronau, The Open University of Israel
Objects that are conceptually consistent with their environment are typically grasped more rapidly and efficiently than objects that are inconsistent
with it. The extent to which such contextual integration processes depend
on visual attention, however, is largely disputed. The present research
examined the necessity of visual attention to object-object and object-scene
contextual integration processes during a brief visual glimpse. Participants
performed an object classification task on associated object pairs that were
either positioned in expected relative locations (e.g., a desk-lamp on a desk)
or in unexpected, contextually inconsistent relative locations (e.g., a desk-lamp under a desk). When both stimuli were relevant to task requirements,
latencies to spatially consistent object pairs were significantly shorter than
to spatially inconsistent pairs. These contextual effects disappeared, however, when spatial attention was drawn to one of the two object stimuli
while its counterpart object was positioned outside the focus of attention and was irrelevant to task-demands. Subsequent research examined
object-object and object-scene associations which are based on categorical
relations, rather than on specific spatial and functional relations. Here too,
processing of the semantic/categorical relations necessitated allocation of
spatial attention, unless an unattended object was explicitly defined as a
to-be-detected target. Collectively, our research suggests that associative
and integrative contextual processes underlying scene understanding rely
on the availability of spatial attentional resources. However, stimuli which
comply with task-requirements (e.g., a cat/dog in an animal, but not in
a vehicle detection task) may benefit from efficient processing even when
appearing outside the main focus of visual attention.
Object-object and object-scene integration: the role of conscious
Speaker: Liad Mudrik, Tel Aviv University
On a typical day, we perform numerous integration processes; we repeatedly integrate objects with the scenes in which they appear, and decipher
the relations between objects, resting both on their tendency to co-occur and
on their semantic associations. Such integration seems effortless, almost
automatic, yet computationally speaking it is highly complicated and challenging. This apparent contradiction evokes the question of consciousness’
role in the process: is it automatic enough to obviate the need for conscious
processing, or does its complexity necessitate the involvement of conscious
experience? In this talk, I will present EEG, fMRI and behavioral experiments that tap into consciousness’ role in processing object-scene integration and object-object integration. The former revisits subjects’ ability to
integrate the relations (congruency/incongruency) between an object and
the scene in which it appears. The latter examines the processing of the relations between two objects, in an attempt to differentiate between associative relations (i.e., relations that rest on repeated co-occurrences of the two
objects) vs. abstract ones (i.e., relations that are more conceptual, between
two objects that do not tend to co-appear but are nevertheless related). I will
claim that in both types of integration, consciousness may function as an
enabling factor rather than an immediate necessary condition.
S2 - The Brain Correlates of Perception and Action: from Neural Activity to Behavior
Friday, May 19, 2017, 12:00 - 2:00 pm, Pavilion
Organizer(s): Simona Monaco, Center for Mind/Brain Sciences,
University of Trento & Annalisa Bosco, Department of Pharmacy
and Biotech, University of Bologna
Presenters: J. Douglas Crawford, Patrizia Fattori, Simona
Monaco, Annalisa Bosco, Jody C. Culham
In recent years neuroimaging and neurophysiology have enabled
cognitive neuroscience to identify numerous brain areas that are
involved in sensorimotor integration for action. This research
has revealed cortical and subcortical brain structures that work
in coordination to allow accurate hand and eye movements. The
visual information about objects in the environment is integrated into the motor plan through a cascade of events known
as visuo-motor integration. These mechanisms make it possible not only to
extract relevant visual information for action, but also to continuously update this information throughout action planning and
execution. As our brain evolved to act towards real objects in
the natural environment, studying hand and eye movements in
experimental situations that resemble the real world is critical
for our understanding of the action system. This aspect has been
relatively neglected in the cognitive sciences, mostly because
of the challenges associated with the experimental setups and
technologies. This symposium provides a comprehensive view of
the neural mechanisms underlying sensory-motor integration for
the production of eye and hand movements in situations that are
common to real life. The range of topics covered by the speakers
encompasses the visual as well as the motor and cognitive neurosciences, and is therefore relevant to junior and senior scientists
specialized in any of these areas. We bring together researchers
from macaque neurophysiology to human neuroimaging and
behavior. The combination of works that use these cutting-edge
techniques offers a unique insight into the effects that are detected
at the neuronal level, extended to neural populations and translated into behavior. There will be five speakers. Doug Crawford
will address the neuronal mechanisms underlying perceptual-motor integration during head-unrestrained gaze shifts in the frontal
eye field and superior colliculus of macaques. Patrizia Fattori will
describe how the activity of neurons in the dorsomedial visual
stream of macaques is modulated by gaze and hand movement
direction as well as properties of real objects. Jody Culham will
illustrate the neural representation for visually guided actions and
real objects in the human brain revealed by functional magnetic
resonance imaging (fMRI). Simona Monaco will describe the
neural mechanisms in the human brain underlying the influence
of intended action on sensory processing and the involvement of
the early visual cortex in action planning and execution. Annalisa
Bosco will detail the behavioral aspects of the influence exerted by
action on perception in human participants.
Visual-motor transformations at the Neuronal Level in the Gaze
Speaker: J. Douglas Crawford, Centre for Vision Research, York
University, Toronto, Ontario, Canada
Additional Authors: AmirSaman Sajad, Center for Integrative &
Cognitive Neuroscience, Vanderbilt University, Nashville, TN and
Morteza Sadeh, Centre for Vision Research, York University, Toronto,
Ontario, Canada
The fundamental question in perceptual-motor integration is how, and
at what level, do sensory signals become motor signals? Does this occur
between brain areas, within brain areas, or even within individual neurons? Various training or cognitive paradigms have been combined with
neurophysiology and/or neuroimaging to address this question, but the
visuomotor transformations for ordinary gaze saccades remain elusive.
To address these questions, we developed a method for fitting visual and
motor response fields against various spatial models without any special training, based on trial-to-trial variations in behavior (DeSouza et al.
2011). More recently we used this to track visual-motor transformations
through time. We find that superior colliculus and frontal eye field visual
responses encode target direction, whereas their motor responses encode
final gaze position relative to initial eye orientation (Sajad et al. 2015; Sadeh
et al. 2016). This transformation occurs between neuron populations, but can also
be observed within individual visuomotor cells. When a memory delay
is imposed, a gradual transition of intermediate codes is observed (perhaps due to an imperfect memory loop), with a further ‘leap’ toward gaze
motor coding in the final memory-motor transformation (Sajad et al. 2016).
However, we found a similar spatiotemporal transition even within the
brief burst of neural activity that accompanies a reactive, visually-evoked
saccade. What these data suggest is that visuomotor transformations are
a network phenomenon that is simultaneously observable at the level of
individual neurons, and distributed across different neuronal populations
and structures.
Neurons for eye and hand action in the monkey medial posterior
parietal cortex
Speaker: Patrizia Fattori, University of Bologna
Additional Authors: Patrizia Fattori, Rossella Breveglieri, Claudio Galletti,
Department of Pharmacy and Biotechnology, University of Bologna
In the last decades, several components of the visual control of eye and
hand movements have been disentangled by studying single neurons in the
brain of awake macaque monkeys. In this presentation, particular attention
will be given to the influence of the direction of gaze upon the reaching
activity of neurons of the dorsomedial visual stream. We recorded from
the caudal part of the medial posterior parietal cortex, finding neurons sensitive to the direction and amplitude of arm reaching actions. The reaching activity of these neurons was influenced by the direction of gaze, some
neurons preferring foveal reaching, others peripheral reaching. Manipulations of eye/target positions and of hand position showed that the reaching
activity could be in an eye-centered, head-centered, or mixed frame of reference, depending on the neuron. We also found neurons modulated by the visual features of real objects and neurons also modulated by
features of grasping movements, such as wrist orientation and grip formation. So it
seems that the entire neural machinery for encoding eye and hand action is
hosted in the dorsomedial visual stream. This machinery takes part in the
sequence of visuomotor transformations required to encode many aspects
of the reach-to-grasp actions.
The role of the early visual cortex in action
Speaker: Simona Monaco, Center for Mind/Brain Sciences, University
of Trento
Additional Authors: Simona Monaco, Center for Mind/Brain Sciences,
University of Trento; Doug Crawford, Centre for Vision Research, York
University, Toronto, Ontario, Canada; Luca Turella, Center for
Mind/Brain Sciences, University of Trento; Jody Culham, Brain and Mind
Functional magnetic resonance imaging has recently shown that
intended action modulates the sensory processing of object orientation in
areas of the action network in the human brain. In particular, intended
actions can be decoded in the early visual cortex using multivoxel pattern
analyses before the movements are initiated, regardless of whether the
target object is visible or not. In addition, the early visual cortex is re-recruited
during actions in the dark towards stimuli that have been previously seen.
These results suggest three main points. First, the action-driven modulation
of sensory processing is shown at the neural level in a network of areas
that includes the early visual cortex. Second, the role of the early visual
cortex goes well beyond the processing of sensory information for perception,
and this region might be the target of reentrant feedback for sensory-motor
integration. Third, the early visual cortex shows action-driven modulation during
both action planning and execution, suggesting a continuous exchange of
information with higher-order visual-motor areas for the production of a
motor output.
The influence of action execution on object size perception
Speaker: Annalisa Bosco, Department of Pharmacy and Biotechnology,
University of Bologna
Additional Authors: Annalisa Bosco, Department of Pharmacy and
Biotechnology, University of Bologna; Patrizia Fattori, Department of
Pharmacy and Biotechnology, University of Bologna
When performing an action, our perception is focused towards object
visual properties that enable us to execute the action successfully. The
motor system is also able to influence perception, but only a few
studies have reported evidence for hand action-induced modifications of
visual perception. Here, we aimed to test for a feature-specific perceptual
modulation before and after a reaching and grasping action. Two groups of
subjects were instructed to either grasp or reach to different sized bars and,
before and after the action, to perform a size perceptual task by manual and
verbal report. Each group was tested in two experimental conditions: no
prior knowledge of action type, where subjects did not know the subsequent
type of movement, and prior knowledge of action type, where they were
aware of the subsequent type of movement. In both manual and verbal
perceptual size responses, we found that after a grasping movement the
size perception was significantly modified. Additionally, this modification
was enhanced when the subjects knew in advance the type of movement to
execute in the subsequent phase of the task. These data suggest that the
knowledge of action type and the execution of the action shape the perception of
object properties.
Neuroimaging reveals the human neural representations for
visually guided grasping of real objects and pictures
Speaker: Jody C. Culham, Brain and Mind Institute, University of
Western Ontario
Additional Authors: Jody C. Culham, University of Western Ontario;
Sara Fabbri, Radboud University Nijmegen; Jacqueline C. Snow,
University of Nevada, Reno; Erez Freud, Carnegie-Mellon University
Neuroimaging, particularly functional magnetic resonance imaging (fMRI),
has revealed many human brain areas that are involved in the processing
of visual information for the planning and guidance of actions. One area
of particular interest is the anterior intraparietal sulcus (aIPS), which is
thought to play a key role in processing information about object shape for
the visual control of grasping. However, much fMRI research has relied
on artificial stimuli, such as two-dimensional photos, and artificial actions,
such as pantomimed grasping. Recent fMRI studies from our lab have used
representational similarity analysis on the patterns of fMRI activation from
brain areas such as aIPS to infer neural coding in participants performing
real actions upon real objects. This research has revealed the visual features
of the object (particularly elongation) and the type of grasp (including the
number of digits and precision required) that are coded in aIPS and other
regions. Moreover, this work has suggested that these neural representations are affected by the realness of the object, particularly during grasping.
Taken together, these results highlight the value of using more ecological
paradigms to study sensorimotor control.
Friday, May 19, 2017, 2:30 - 4:30 pm, Talk Room 1
Organizer(s): Megan Peters, University of California Los Angeles
Presenters: Megan Peters, Ariel Zylberberg, Michele Basso, Wei
Ji Ma, Pascal Mamassian
Metacognition, or our ability to monitor the uncertainty of our
thoughts, decisions, and perceptions, is of critical importance
across many domains. Here we focus on metacognition in perceptual decisions -- the continuous inferences that we make about
the most likely state of the world based on incoming sensory
information. How does a police officer evaluate the fidelity of
his perception that a perpetrator has drawn a weapon? How
does a driver compute her certainty in whether a fleeting visual
percept is a child or a soccer ball, impacting her decision to
swerve? These kinds of questions are central to daily life, yet how
such ‘confidence’ is computed in the brain remains unknown. In
recent years, increasingly keen interest has been directed towards
exploring such metacognitive mechanisms from computational
(e.g., Rahnev et al., 2011, Nat Neuro; Peters & Lau, 2015, eLife),
neuroimaging (e.g., Fleming et al., 2010, Science), brain stimulation (e.g., Fetsch et al., 2014, Neuron), and neuronal electrophysiology (e.g., Kiani & Shadlen, 2009, Science; Zylberberg et
al., 2016, eLife) perspectives. Importantly, the computation of
confidence is also of increasing interest to the broader range of
researchers studying the computations underlying perceptual
decision-making in general. Our central focus is on how confidence is computed in neuronal populations, with attention to
(a) whether perceptual decisions and metacognitive judgments
depend on the same or different computations, and (b) why confidence judgments sometimes fail to optimally track the accuracy
of perceptual decisions. Key themes for this symposium will
include neural correlates of confidence, behavioral consequences
of evidence manipulation on confidence judgments, and computational characterizations of the relationship between perceptual
decisions and our confidence in them. Our principal goal is to
attract scientists studying or interested in confidence/uncertainty,
sensory metacognition, and perceptual decision-making from
both human and animal perspectives, spanning from the computational to the neurobiological level. We bring together speakers
from across these disciplines, from animal electrophysiology and
behavior through computational models of human uncertainty, to
communicate their most recent and exciting findings. Given the
recency of many of the findings discussed, our symposium will
cover terrain largely untouched by the main program. We hope
that the breadth of research programs represented in this symposium will encourage a diverse group of scientists to attend and
actively participate in the discussion.
Transcranial magnetic stimulation to visual cortex induces
suboptimal introspection
Speaker: Megan Peters, University of California Los Angeles
Additional Authors: Megan Peters, University of California Los Angeles;
Jeremy Fesi, The Graduate Center of the City University of New York;
Namema Amendi, The Graduate Center of the City University of New
York; Jeffrey D. Knotts, University of California Los Angeles; Hakwan
In neurological cases of blindsight, patients with damage to primary visual
cortex can discriminate objects but report no visual experience of them. This
form of ‘unconscious perception’ provides a powerful opportunity to study
perceptual awareness, but because the disorder is rare, many researchers
have sought to induce the effect in neurologically intact observers. One
promising approach is to apply transcranial magnetic stimulation (TMS)
to visual cortex to induce blindsight (Boyer et al., 2005), but this method
has been criticized for being susceptible to criterion bias confounds: perhaps TMS merely reduces internal visual signal strength, and observers are
unwilling to report that they faintly saw a stimulus even if they can still
discriminate it (Lloyd et al., 2013). Here we applied a rigorous response-bias-free 2-interval forced-choice method for rating subjective experience
in studies of unconscious perception (Peters and Lau, 2015) to address this
concern. We used Bayesian ideal observer analysis to demonstrate that
observers’ introspective judgments about stimulus visibility are suboptimal
even when the task does not require that they maintain a response criterion
-- unlike in visual masking. Specifically, observers appear metacognitively
blind to the noise introduced by TMS, in a way that is akin to neurological
cases of blindsight. These findings are consistent with the hypothesis that
metacognitive judgments require observers to develop an internal model
of the statistical properties of their own signal processing architecture, and
that introspective suboptimality arises when that internal model abruptly
becomes invalid due to external manipulations.
The influence of evidence volatility on choice, reaction time and
confidence in a perceptual decision
Speaker: Ariel Zylberberg, Columbia University
Additional Authors: Ariel Zylberberg, Columbia University; Christopher
R. Fetsch, Columbia University; Michael N. Shadlen, Columbia
Many decisions are thought to arise via the accumulation of noisy evidence
to a threshold or bound. In perceptual decision-making, the bounded evidence accumulation framework explains the effect of stimulus strength,
characterized by signal-to-noise ratio, on decision speed, accuracy and
confidence. This framework also makes intriguing predictions about the
behavioral influence of the noise itself. An increase in noise should lead to
faster decisions, reduced accuracy and, paradoxically, higher confidence.
To test these predictions, we introduce a novel sensory manipulation that
mimics the addition of unbiased noise to motion-selective regions of visual
cortex. We verified the effect of this manipulation with neuronal recordings
from macaque areas MT/MST. For both humans and monkeys, increasing
the noise induced faster decisions and greater confidence over a range of
stimuli for which accuracy was minimally impaired. The magnitude of the
effects was in agreement with predictions of a bounded evidence accumulation model.
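To make the first two predictions concrete, here is a minimal simulation sketch of bounded evidence accumulation. It is an illustration only, not the authors' model: the drift, bound, time step, noise levels, and the function names `simulate_trial` and `summarize` are all assumptions chosen for the example, and the confidence prediction is not modeled.

```python
import math
import random

def simulate_trial(drift, noise_sd, bound, dt, rng, max_time=10.0):
    """Accumulate noisy evidence until it reaches +bound or -bound."""
    evidence, t = 0.0, 0.0
    while abs(evidence) < bound and t < max_time:
        # Each step adds a deterministic drift plus Gaussian noise.
        evidence += drift * dt + noise_sd * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    # The drift is positive, so ending above zero counts as a correct choice.
    return evidence > 0, t

def summarize(noise_sd, n_trials=1000, seed=0):
    """Return (accuracy, mean decision time) for one noise level."""
    rng = random.Random(seed)
    # Hypothetical parameters: drift 1.0, bound 1.0, time step 0.01.
    results = [simulate_trial(1.0, noise_sd, 1.0, 0.01, rng) for _ in range(n_trials)]
    accuracy = sum(correct for correct, _ in results) / n_trials
    mean_time = sum(t for _, t in results) / n_trials
    return accuracy, mean_time

low_noise = summarize(noise_sd=1.0)
high_noise = summarize(noise_sd=2.0)
print("low noise  (accuracy, mean time):", low_noise)
print("high noise (accuracy, mean time):", high_noise)
```

With these assumed settings, the higher noise level should produce shorter mean decision times and lower accuracy, because the accumulated evidence reaches a bound sooner but is less reliably on the correct side.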
A role for the superior colliculus in decision-making and
Speaker: Michele Basso, University of California Los Angeles
Additional Authors: Michele Basso, University of California Los Angeles;
Piercesare Grimaldi, University of California Los Angeles; Trinity Crapse,
University of California Los Angeles
Evidence implicates the superior colliculus (SC) in attention and perceptual
decision-making. In a simple target-selection task, we previously showed
that discriminability between target and distractor neuronal activity in the
SC correlated with decision accuracy, consistent with the hypothesis that
SC encodes a decision variable. Here we extend these results to determine
whether SC also correlates with decision criterion and confidence. Trained
monkeys performed a simple perceptual decision task in two conditions
to induce behavioral response bias (criterion shift): (1) the probability of
two perceptual stimuli was equal, and (2) the probability of one perceptual stimulus was higher than the other. We observed consistent changes in
behavioral response bias (shifts in decision criterion) that were directly correlated with SC neuronal activity. Furthermore, electrical stimulation of SC
mimicked the effect of stimulus probability manipulations, demonstrating
that SC correlates with and is causally involved in setting decision criteria.
To assess confidence, monkeys were offered a ‘safe bet’ option on 50% of
trials in a similar task. The ‘safe bet’ always yielded a small reward, encouraging monkeys to select the ‘safe bet’ when they were less confident rather
S3 - How can you be so sure?
Behavioral, computational, and neuroscientific perspectives on metacognition in perceptual decision-making
than risk no reward for a wrong decision. Both monkeys showed metacognitive sensitivity: they chose the ‘safe bet’ more often on more difficult trials. Single- and multi-neuron recordings from SC revealed two distinct neuronal
populations: one that discharged more robustly for more confident trials,
and one that did so for less confident trials. Together these findings show
how SC encodes information about decisions and decisional confidence.
Testing the Bayesian confidence hypothesis
Speaker: Wei Ji Ma, New York University
Additional Authors: Wei Ji Ma, New York University; Will Adler, New
York University; Ronald van den Berg, University of Uppsala
Asking subjects to rate their confidence is one of the oldest procedures in
psychophysics. Remarkably, quantitative models of confidence ratings
have been scarce. What could be called the “Bayesian confidence hypothesis” states that an observer’s confidence rating distribution is completely
determined by posterior probability. This hypothesis predicts specific
quantitative relationships between performance and confidence. It also
predicts that stimulus combinations that produce the same posterior will
also produce the same confidence distribution. We tested these predictions
in three contexts: a) perceptual categorization; b) visual working memory;
c) the interpretation of scientific data.
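The prediction that equal posteriors yield equal confidence can be illustrated with a small sketch. The abstract does not specify a model, so everything here is an assumption made for the example: a two-category task with Gaussian measurement noise and equal priors, category means at +mu and -mu, hypothetical rating criteria, and a deterministic mapping from posterior to rating (the hypothesis itself concerns rating distributions).

```python
import math

def posterior_category_a(x, sigma, mu=1.0):
    """Posterior probability of category A (mean +mu) versus B (mean -mu).

    Assumes equal priors and Gaussian measurement noise with SD sigma,
    for which the log posterior odds equal 2 * mu * x / sigma**2.
    """
    log_odds = 2.0 * mu * x / sigma**2
    return 1.0 / (1.0 + math.exp(-log_odds))

def confidence_rating(x, sigma, criteria=(0.6, 0.75, 0.9)):
    """Map the posterior of the chosen category onto a 1-4 rating.

    The criteria are hypothetical cut points on posterior probability.
    """
    p_a = posterior_category_a(x, sigma)
    p_chosen = max(p_a, 1.0 - p_a)  # observer picks the more probable category
    rating = 1 + sum(p_chosen > c for c in criteria)
    return p_chosen, rating

# Two different measurement/noise combinations with the same x / sigma**2.
weak_but_reliable = confidence_rating(x=1.0, sigma=1.0)
strong_but_noisy = confidence_rating(x=4.0, sigma=2.0)
print(weak_but_reliable, strong_but_noisy)
```

In this sketch the two stimulus combinations differ in both measurement and noise level but share the same posterior, so they receive the same rating; a model in which confidence depended on the measurement or the noise level alone would treat them differently.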
Integration of visual confidence over time and across stimulus
Speaker: Pascal Mamassian, Ecole Normale Supérieure
Additional Authors: Pascal Mamassian, Ecole Normale Supérieure;
Vincent de Gardelle, Université Paris 1; Alan Lee, Lingnan University
Visual confidence refers to our ability to estimate our own performance
in a visual decision task. Several studies have highlighted the relatively
high efficiency of this meta-perceptual ability, at least for simple visual
discrimination tasks. Are observers equally good when visual confidence
spans more than one stimulus dimension or more than a single decision?
To address these issues, we used the method of confidence forced-choice
judgments, where participants are prompted to choose, between two alternatives, the stimulus for which they expect their performance to be better
(Barthelmé & Mamassian, 2009, PLoS CB). In one experiment, we asked
observers to make confidence choice judgments between two different
tasks (an orientation-discrimination task and a spatial-frequency-discrimination task). We found that participants were equally good at making these
across-dimensions confidence judgments as when choices were restricted
to a single dimension, suggesting that visual confidence judgments share
a common currency. In another experiment, we asked observers to make
confidence-choice judgments between two ensembles of 2, 4, or 8 stimuli.
We found that participants were increasingly good at making ensemble
confidence judgments, suggesting that visual confidence judgments can
accumulate information across several trials. Overall, these results help us
better understand how visual confidence is computed and used over time
and across stimulus dimensions.
S4 - The Role of Ensemble Statistics in
the Visual Periphery
Friday, May 19, 2017, 2:30 - 4:30 pm, Pavilion
Organizer(s): Brian Odegaard, University of California-Los Angeles
Presenters: Michael Cohen, David Whitney, Ruth Rosenholtz,
Tim Brady, Brian Odegaard
The past decades have seen the growth of a tremendous amount
of research into the human visual system’s capacity to encode
“summary statistics” of items in the world. Studies have shown
that the visual system possesses a remarkable ability to compute
properties such as average size, position, motion direction, gaze
direction, emotional expression, and liveliness, as well as variability in color and facial expression, documenting the phenomena
across various domains and stimuli. One recent proposal in
the literature has focused on the promise of ensemble statistics
to provide an explanatory account of subjective experience in
the visual periphery (Cohen, Dennett, & Kanwisher, Trends in
Cognitive Sciences, 2016). In addition to this idea, others have
suggested that summary statistics underlie performance in visual
tasks in a broad manner. These hypotheses open up intriguing
questions: how are ensemble statistics encoded outside the fovea,
and to what extent does this capacity explain our experience of
the majority of our visual field? In this proposed symposium, we
aim to discuss recent empirical findings, theories, and methodological considerations in pursuit of answers to many questions
in this growing area of research, including the following: (1) How
does the ability to process summary statistics in the periphery
compare to this ability at the center of the visual field? (2) What
role (if any) does attention play in the ability to compute summary statistics in the periphery? (3) Which computational modeling frameworks provide compelling, explanatory accounts of this
phenomenon? (4) Which summary statistics (e.g., mean, variance)
are encoded in the periphery, and are there limitations on the
precision/capacity of these estimates? By addressing questions
such as those listed above, we hope that participants emerge from
this symposium with a more thorough understanding of the role
of ensemble statistics in the visual periphery, and how this phenomenon may account for subjective experience across the visual
field. Our proposed group of speakers is shown below, and we
hope that faculty, post-docs, and graduate students alike would
find this symposium to be particularly informative, innovative,
and impactful.
Ensemble statistics and the richness of perceptual experience
Speaker: Michael Cohen, MIT
While our subjective impression is of a detailed visual world, a wide variety of empirical results suggest that perception is actually rather limited.
Findings from change blindness and inattentional blindness highlight
how much of the visual world regularly goes unnoticed. Furthermore, direct estimates of the capacity of visual attention and
working memory reveal that surprisingly few items can be processed and
maintained at once. Why do we think we see so much when these empirical results suggest we see so little? One possible answer to this question
resides in the representational power of visual ensembles and summary
statistics. Under this view, those items that cannot be represented as individual objects or with great precision are nevertheless represented as part
of a broader statistical summary. By representing much of the world as an
ensemble, observers have perceptual access to different aspects of the entire
field of view, not just a few select items. Thus, ensemble statistics play a
critical role in our ability to account for and characterize the apparent richness of perceptual experience.
Ensemble representations as a basis for rich perceptual
Speaker: David Whitney, University of California-Berkeley
Much of our rich visual experience comes in the form of ensemble representations, the perception of summary statistical information in groups of
objects—such as the average size of items, the average emotional expression of faces in a crowd, or the average heading direction of point-light
walkers. These ensemble percepts occur over space and time, are robust to
outliers, and can occur in the visual periphery. Ensemble representations
can even convey unique and emergent social information like the gaze of
an audience, the animacy of a scene, or the panic in a crowd, information
that is not necessarily available at the level of the individual crowd members. The visual system can make these high-level interpretations of social
and emotional content with exposures as brief as 50 ms, thus revealing an
extraordinarily efficient process for compressing what would otherwise
be an overwhelming amount of information. Much of what is believed to
count as rich social, emotional, and cognitive experience actually comes in
the form of basic, compulsory, visual summary statistical processes.
Speaker: Ruth Rosenholtz, MIT
Visual perception is full of puzzles. Human observers effortlessly perform many visual tasks, and have the sense of a rich percept of the visual
world. Yet when probed for details they are at a loss. How does one explain
this combination of marvelous successes and puzzling failures? Numerous researchers have explained the failures in terms of severe limits on
resources of attention and memory. But if so, how can one explain the
successes? My lab has argued that many experimental results pointing to
apparent attentional limits instead derive, at least in part, from losses in
peripheral vision. Furthermore, we demonstrated that those losses could
arise from peripheral vision encoding its inputs in terms of a rich set of local
image statistics. This scheme is theoretically distinct from encoding ensemble statistics of a set of similar items. I propose that many of the remaining
attention/memory limits can be unified in terms of a limit on decision complexity. This decision complexity is difficult to reason about, because the
complexity of a given task depends upon the underlying encoding. A complex, general-purpose encoding likely evolved to make certain tasks easy
at the expense of others. Recent advances in understanding this encoding
-- including in peripheral vision -- may help us finally make sense of the
puzzling strengths and limitations of visual perception.
The role of spatial ensemble statistics in visual working memory
and scene perception
Speaker: Tim Brady, University of California-San Diego
At any given moment, much of the relevant information about the visual
world is in the periphery rather than the fovea. The periphery is particularly
useful for providing information about scene structure and spatial layout,
as well as informing us about the spatial distribution and features of the
objects we are not explicitly attending and fixating. What is the nature of
our representation of this information about scene structure and the spatial
distribution of objects? In this talk, I’ll discuss evidence that representations
of the spatial distribution of simple visual features (like orientation, spatial
frequency, color), termed spatial ensemble statistics, are specifically related
to our ability to quickly and accurately recognize visual scenes. I’ll also
show that these spatial ensemble statistics are a critical part of the information we maintain in visual working memory – providing information about
the entire set of objects, not just a select few, across eye movements, blinks,
occlusions and other interruptions of the visual scene.
Summary Statistics in the Periphery: A Metacognitive Approach
Speaker: Brian Odegaard, University of California-Los Angeles
Recent evidence indicates that human observers often overestimate their
capacity to make perceptual judgments in the visual periphery. How can
we quantify the degree to which this overestimation occurs? We describe
how applications of Signal Detection Theoretic frameworks provide one
promising approach to measure both detection biases and task performance
capacities for peripheral stimuli. By combining these techniques with new
metacognitive measures of perceptual confidence (such as meta-d’; Maniscalco & Lau, 2012), one can obtain a clearer picture regarding (1) when subjects can simply perform perceptual tasks in the periphery, and (2) when
they have true metacognitive awareness of the visual surround. In this
talk, we describe results from recent experiments employing these quantitative techniques, comparing and contrasting the visual system’s capacity
to encode summary statistics in both the center and periphery of the visual
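As a brief illustration of the Signal Detection Theoretic quantities mentioned above, the sketch below computes sensitivity (d') and criterion (c) from hit and false-alarm rates under the standard equal-variance Gaussian model. The rates are made-up values, not data from the talk, and the function name `sdt_measures` is hypothetical. Estimating meta-d' requires fitting confidence-rating data and is not attempted here.

```python
from statistics import NormalDist

def sdt_measures(hit_rate, false_alarm_rate):
    """Equal-variance Gaussian SDT: return (d_prime, criterion).

    Rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    would need a correction before the z-transform.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(false_alarm_rate)
    criterion = -0.5 * (z(hit_rate) + z(false_alarm_rate))
    return d_prime, criterion

# Hypothetical detection rates for central and peripheral presentation.
central = sdt_measures(hit_rate=0.80, false_alarm_rate=0.20)
peripheral = sdt_measures(hit_rate=0.80, false_alarm_rate=0.50)
print("central    (d', c):", central)
print("peripheral (d', c):", peripheral)
```

In this made-up example the hit rate is the same in both conditions, but the higher false-alarm rate in the periphery implies lower sensitivity together with a more liberal (negative) criterion, which is the kind of dissociation between bias and capacity that these measures are meant to separate.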
S5 - Cutting across the top-down-bottom-up
dichotomy in attentional capture
Friday, May 19, 2017, 5:00 - 7:00 pm, Talk Room 1
Organizer(s): J. Eric T. Taylor, Brain and Mind Institute at Western
Presenters: Nicholas Gaspelin, Matthew Hilchey, Dominique
Lamy, Stefanie Becker, Andrew B. Leber
Research on attentional selection describes the various factors
that determine what information is ignored and what information is processed. These factors are commonly described as either
bottom-up or top-down, indicating whether stimulus properties or an observer’s goals determine the outcome of selection.
Research on selection typically adheres strongly to one of these
two perspectives; the field is divided. The aim of this symposium
is to generate discussions and highlight new developments in the
study of attentional selection that do not conform to the bifurcated approach that has characterized the field for some time (or
trifurcated, with respect to recent models emphasizing the role
of selection history). The research presented in this symposium
does not presuppose that selection can be easily or meaningfully
dichotomized. As such, the theme of the symposium is cutting
across the top-down-bottom-up dichotomy in attentional selection
research. To achieve this, presenters in this session either share
data that cannot be easily explained within the top-down or bottom-up framework, or they propose alternative models of existing
descriptions of sources of attentional control. Theoretically, the
symposium will begin with presentations that attempt to resolve
the dichotomy with a new role for suppression (Gaspelin &
Luck) or further bemuse the dichotomy with typically bottom-up
patterns of behaviour in response to intransient stimuli (Hilchey,
Taylor, & Pratt). The discussion then turns to demonstrations that
the bottom-up, top-down, and selection history sources of control
variously operate on different perceptual and attentional processes (Lamy & Zivony; Becker & Martin), complicating our categorization of sources of control. Finally, the session will conclude
with an argument for more thorough descriptions of sources
of control (Leber & Irons). In summary, these researchers will
present cutting-edge developments using converging methodologies (chronometry, EEG, and eye-tracking measures) that further
our understanding of attentional selection and advance attentional capture research beyond its current dichotomy. Given the
heated history of this debate and the importance of the theoretical
question, we expect that this symposium should be of interest to a
wide audience of researchers at VSS, especially those interested in
visual attention and cognitive control.
Summary statistic encoding plus limits on decision complexity
underlie the richness of visual perception as well as its quirky
Mechanisms Underlying Suppression of Attentional Capture by
Salient Stimuli
Speaker: Nicholas Gaspelin, Center for Mind and Brain at the University
of California, Davis
Additional Authors: Nicholas Gaspelin, Center for Mind and Brain at
the University of California, Davis; Carly J. Leonard, Center for Mind and
Brain at the University of California, Davis; Steven J. Luck, Center for
Mind and Brain at the University of California, Davis
Researchers have long debated the nature of cognitive control in vision,
with the field being dominated by two theoretical camps. Stimulus-driven
theories claim that visual attention is automatically captured by salient
stimuli, whereas goal-driven theories argue that capture depends critically
on the goals of a viewer. To resolve this debate, we have previously provided
key evidence for a new hybrid model called the signal suppression hypothesis.
According to this account, all salient stimuli generate an active salience signal which automatically attempts to guide visual attention. However, this
signal can be actively suppressed. In the current talk, we review the converging evidence for this active suppression of salient items, using behavioral, eye tracking and electrophysiological methods. We will also discuss
the cognitive mechanisms underlying suppression effects and directions
for future research.
Beyond the new-event paradigm in visual attention research: Can
completely static stimuli capture attention?
Speaker: Matthew Hilchey, University of Toronto
Additional Authors: Matthew D. Hilchey, University of Toronto; J. Eric
T. Taylor, Brain and Mind Institute at Western University; Jay Pratt,
University of Toronto
The last several decades of attention research have focused almost exclusively on paradigms that introduce new perceptual objects or salient sensory changes to the visual environment in order to determine how attention
is captured to those locations. There are a handful of exceptions, and in the
spirit of those studies, we asked whether or not a completely unchanging
stimulus can attract attention, using variations of classic additional singleton
and cueing paradigms. In the additional singleton tasks, we presented a
preview array of six uniform circles. After a short delay, one circle changed
in form and luminance – the target location – and all but one location
changed luminance, leaving the sixth location physically unchanged. The
results indicated that attention was attracted toward the vicinity of the only
unchanging stimulus, regardless of whether all circles around it increased
or decreased luminance. In the cueing tasks, cueing was achieved by changing the luminance of 5 circles in the object preview array either 150 or 1000
ms before the onset of a target. Under certain conditions, we observed
canonical patterns of facilitation and inhibition emerging from the location containing the physically unchanging cue stimulus. Taken together, the
findings suggest that a completely unchanging stimulus, which bears no
obvious resemblance to the target, can attract attention in certain situations.
Stimulus salience, current goals and selection history do not
affect the same perceptual processes
Speaker: Dominique Lamy, Tel Aviv University
Additional Authors: Dominique Lamy, Tel Aviv University; Alon Zivony,
Tel Aviv University
When exposed to a visual scene, our perceptual system performs several
successive processes. During the preattentive stage, the attentional priority
accruing to each location is computed. Then, attention is shifted towards
the highest-priority location. Finally, the visual properties at that location
are processed. Although most attention models posit that stimulus-driven
and goal-directed processes combine to determine attentional priority,
demonstrations of purely stimulus-driven capture are surprisingly rare.
In addition, the consequences of stimulus-driven and goal-directed capture on perceptual processing have not been fully described. Specifically,
whether attention can be disengaged from a distractor before its properties have been processed is unclear. Finally, the strict dichotomy between
bottom-up and top-down attentional control has been challenged based on
the claim that selection history also biases attentional weights on the priority map. Our objective was to clarify what perceptual processes stimulus
salience, current goals and selection history affect. We used a feature-search
spatial-cueing paradigm. We showed that (a) unlike stimulus salience and
current goals, selection history does not modulate attentional priority, but
only perceptual processes following attentional selection; (b) a salient distractor not matching search goals may capture attention but attention can
be disengaged from this distractor’s location before its properties are fully
processed; and (c) attentional capture by a distractor sharing the target feature entails that this distractor’s properties are mandatorily processed.
Which features guide visual attention, and how do they do it?
Speaker: Stefanie Becker, The University of Queensland
Additional Authors: Stefanie Becker, The University of Queensland;
Aimee Martin, The University of Queensland
Previous studies purport to show that salient irrelevant items can attract
attention involuntarily, against the intentions and goals of an observer.
However, corresponding evidence originates predominantly from RT
and eye movement studies, whereas EEG studies largely failed to support
saliency capture. In the present study, we examined effects of salient colour
distractors on search for a known colour target when the distractor was
similar vs. dissimilar to the target. We used both eye tracking and EEG (in
separate experiments), and also investigated participants’ awareness of the
features of irrelevant distractors. The results showed that capture by irrelevant distractors was strongly top-down modulated, with target-similar distractors attracting attention much more strongly, and being remembered
better, than salient distractors. Awareness of the distractor correlated more
strongly with initial capture than with attentional dwelling on the distractor after it was selected. The salient distractor enjoyed no noticeable advantage over non-salient control distractors with regard to implicit measures,
but was overall reported with higher accuracy than non-salient distractors.
This raises the interesting possibility that salient items may primarily boost
visual processes directly, by requiring less attention for accurate perception, not by summoning spatial attention.
Toward a profile of goal-directed attentional control
Speaker: Andrew B. Leber, The Ohio State University
Additional Authors: Andrew B. Leber, The Ohio State University; Jessica
L. Irons, The Ohio State University
Recent criticism of the classic bottom-up/top-down dichotomy of attention
has deservedly focused on the existence of experience-driven factors outside this dichotomy. However, as researchers seek a better framework characterizing all control sources, a thorough re-evaluation of the top-down,
or goal-directed, component is imperative. Studies of this component have
richly documented the ways in which goals strategically modulate attentional control, but surprisingly little is known about how individuals arrive
at their chosen strategies. Consider that manipulating goal-directed control
commonly relies on experimenter instruction, which lacks ecological validity and may not always be complied with. To better characterize the factors
governing goal-directed control, we recently created the adaptive choice
visual search paradigm. Here, observers can freely choose between two targets on each trial, while we cyclically vary the relative efficacy of searching
for each target. That is, on some trials it is faster to search for a red target
than a blue target, while on other trials the opposite is true. Results using
this paradigm have shown that choice behavior is far from optimal, and
appears largely determined by competing drives to maximize performance
and minimize effort. Further, individual differences in performance are
stable across sessions while also being malleable to experimental manipulations emphasizing one competing drive (e.g., reward, which motivates
individuals to maximize performance). This research represents an initial
step toward characterizing an individual profile of goal-directed control
that extends beyond the classic understanding of “top-down” attention and
promises to contribute to a more accurate framework of attentional control.
S6 - Virtual Reality and Vision Science
Friday, May 19, 2017, 5:00 - 7:00 pm, Pavilion
Organizer(s): Bas Rokers, University of Wisconsin - Madison &
Karen B. Schloss, University of Wisconsin - Madison
Presenters: Jacqueline Fulvio, Robin Held, Emily Cooper, Stefano
Baldassi, David Luebke
Virtual reality (VR) and augmented reality (AR) provide exciting
new opportunities for vision research. In VR sensory cues are
presented to simulate an observer’s presence in a virtual environment. In AR sensory cues are presented that embed virtual
stimuli in the real world. This symposium will bring together
speakers from academia and industry to present new scientific
discoveries enabled by VR/AR technology, discuss recent and
forthcoming advances in the technology, and identify exciting
new avenues of inquiry. From a basic research perspective, VR
and AR allow us to answer fundamental scientific questions that
have been difficult or impossible to address in the past. VR/AR
headsets provide a number of potential benefits over traditional
psychophysical methods, such as incorporating a large field of
view, high frame rate/low persistence, and low latency head
tracking. These technological innovations facilitate experimental
research in highly controlled, yet naturalistic three-dimensional
environments. However, VR/AR also introduces its own set
of unique challenges of which potential researchers should be
aware. Speakers from academia will discuss ways they have used
VR/AR as a tool to advance knowledge about 3D perception,
multisensory integration, and navigation in naturalistic three-dimensional environments. Speakers will also present research on
perceptual learning and neural plasticity, which may benefit from
training in cue-rich environments that simulate real-world conditions. These talks will shed light on how VR/AR may ultimately
be used to mitigate visual deficits and contribute to the treatment
of visual disorders. Speakers from industry will highlight recent
technological advances that can make VR such a powerful tool
for research. Industry has made significant strides solving engineering problems involving latency, field of view, and presence.
However, challenges remain, such as resolving cue conflicts and
eliminating motion sickness. Although some of these issues may
be solved through engineering, others are due to limitations of the
visual system and require solutions informed by basic research
within the vision science community. This symposium aims to
provide a platform that deepens the dialog between academia and
industry. VR holds unprecedented potential for building assistive technologies that will aid people with sensory and cognitive
disabilities. Hearing from speakers in industry will give vision
scientists an overview of anticipated technological developments,
which will help them evaluate how they may incorporate VR/
AR in their future research. In turn vision researchers may help
identify science-based solutions to current engineering challenges.
In sum this symposium will bring together two communities
for the mutually beneficial advancement of VR-based research.
Who may want to attend: This symposium will be of interest to
researchers who wish to consider incorporating AR/VR into their
research, get an overview of existing challenges, and get a sense
of future directions of mutual interest to industry and academia.
The talks will be valuable to researchers at all stages of their
careers. Hearing from representatives from both industry and
academia may be useful for early stage researchers seeking opportunities beyond the highly competitive academic marketplace and
may help researchers at all stages identify funding sources in the
highly competitive granting landscape.
Extra-retinal cues improve accuracy of 3D motion perception in
virtual reality environments
Speaker: Jacqueline Fulvio, University of Wisconsin - Madison
Additional Authors: Jacqueline M. Fulvio & Bas Rokers, Department of
Psychology, UW-Madison
Our senses provide imperfect information about the world that surrounds
us, but we can improve the accuracy of our perception by combining
sensory information from multiple sources. Unfortunately, much of the
research in visual perception has utilized methods of stimulus presentation that eliminate potential sources of information. It is often the case for
example, that observers are asked to maintain a fixed head position while
viewing stimuli generated on flat 2D displays. We will present recent work
on the perception of 3D motion using the Oculus Rift, a virtual reality (VR)
head-mounted display with head-tracking functionality. We describe the
impact of uncertainty in visual cues presented in isolation, which has surprising consequences for the accuracy of 3D motion perception. We will
then describe how extra-retinal cues, such as head motion, improve visual
accuracy. We will conclude with a discussion of the potential and limitations of VR technology for understanding visual perception.
Perceptual considerations for the design of mixed-reality
Speaker: Robin Held, Microsoft
Additional Authors: Robin Held, Microsoft
Virtual-reality head-mounted displays (VR HMDs) block out the real world
while engulfing the user in a purely digital setting. Meanwhile, mixed-reality (MR) HMDs embed digital content within the real world while maintaining the user’s perception of her or his surroundings. This ability to
simultaneously perceive both rendered content and real objects presents
unique challenges for the design of MR content. I will briefly review the
technologies underlying current MR headsets, including display hardware,
tracking systems, and spatial audio. I will also discuss how the existing
implementations of those technologies impact the user’s perception of the
content. Finally, I will show how to apply that knowledge to optimize MR
content for comfort and aesthetics.
Designing and assessing near-eye displays to increase user
Speaker: Emily Cooper, Dartmouth College
Additional Authors: Nitish Padmanaban, Robert Konrad, and Gordon
Wetzstein, Department of Electrical Engineering, Stanford University
From the desktop to the laptop to the mobile device, personal computing
platforms evolve over time. But in each case, one thing stays the same: the
primary interface between the computer and the user is a visual display.
Recent years have seen impressive growth in near-eye display systems,
which are the basis of most virtual and augmented reality experiences.
There is, however, a unique set of challenges to designing a display
that is literally strapped to the user’s face. With an estimated half of all
adults in the United States requiring some level of visual correction, maximizing inclusivity for near-eye displays is essential. I will describe work
that combines principles from optics, optometry, and visual perception to
identify and address major limitations of near-eye displays both for users
with normal vision and those who require common corrective lenses. I will
also describe ongoing work assessing the potential for near-eye displays to
assist people with less common visual impairments in performing day-to-day tasks.
See-through Wearable Augmented Reality: challenges and opportunities for vision science
Speaker: Stefano Baldassi, Meta Company
Additional Authors: Stefano Baldassi & Moqian Tian, Analytics & Neuroscience Department, Meta Company
We will present Meta’s Augmented Reality technology and the challenges
faced in product development that may generate strong mutual connections between vision science and technology, as well as new areas of
research for vision science and research methods using AR. The first line
of challenges comes from the overlap between virtual content and the real
world due to the non-opacity of the rendered pixels and the see-through
optics. What are the optimal luminance, contrast and color profiles to minimize
interference? Will the solutions be qualitatively different in photopic
and scotopic conditions? With SLAM, the virtual objects can be locked onto
the real scene. Does the real world provide the same environmental context
to the virtual object as to a real object? Last, what are the implications of digital
content in the periphery, given Meta’s industry-leading 90° FOV? The second line of challenges is in the domain of perception and action and multisensory integration. Meta supports manipulation of virtual objects. In the
absence of haptic stimulation, when hands interact with the virtual object
we currently rely on visual and proprioceptive cues to guide touch. How is
the visuo-motor control of hands affected by manipulations without haptics? In order to enable people to interact with the virtual objects realistically and effectively, are cues like occlusion and haptic feedback necessary?
Will time-locked sound introduce valuable cues?
Computational Display for Virtual and Augmented Reality
Speaker: David Luebke, NVIDIA
Additional Authors: David Luebke, VP Graphics Research, NVIDIA
Wearable displays for virtual & augmented reality face tremendous challenges, including: Near-Eye Display: how to put a display as close to the
eye as a pair of eyeglasses, where we cannot bring it into focus? Field of
view: how to fill the user’s entire vision with displayed content? Resolution: how to fill that wide field of view with enough pixels, and how to render all of those pixels? A “brute force” display would require 10,000×8,000
pixels per eye! Bulk: displays should be as unobtrusive as sunglasses, but
optics dictate that most VR displays today are bigger than ski goggles.
Focus cues: today’s VR displays provide binocular display but only a fixed
optical depth, thus missing the monocular depth cues from defocus blur
and introducing vergence-accommodation conflict. To overcome these
challenges requires understanding and innovation in vision science, optics,
display technology, and computer graphics. I will describe several “computational display” VR/AR prototypes in which we co-design the optics,
display, and rendering algorithm with the human visual system to achieve
new tradeoffs. These include light field displays, which sacrifice spatial resolution to provide thin near-eye display and focus cues; pinlight displays,
which use a novel and very simple optical stack to produce wide field-of-view see-through display; and a new approach to foveated rendering,
which uses eye tracking and renders the peripheral image with less detail
than the foveal region. I’ll also talk about our current efforts to “operationalize” vision science research, which focuses on peripheral vision, crowding, and saccadic suppression artifacts.
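To give a sense of scale for the “brute force” figure quoted in this abstract, the following is a minimal arithmetic sketch. Only the 10,000×8,000 pixels-per-eye resolution comes from the abstract; the two-eye count and the 90 Hz refresh rate passed in the usage lines are illustrative assumptions, not values stated by the speaker.

```python
# Resolution per eye quoted in the abstract for a "brute force" display.
BRUTE_FORCE_WIDTH_PX = 10_000
BRUTE_FORCE_HEIGHT_PX = 8_000


def pixels_per_eye(width_px=BRUTE_FORCE_WIDTH_PX, height_px=BRUTE_FORCE_HEIGHT_PX):
    """Number of pixels in a single frame for one eye."""
    return width_px * height_px


def pixels_per_second(refresh_hz, eyes=2):
    """Pixels that must be rendered each second.

    refresh_hz and eyes are assumptions supplied by the caller;
    the abstract does not state a refresh rate.
    """
    return pixels_per_eye() * eyes * refresh_hz


if __name__ == "__main__":
    print("pixels per eye:", pixels_per_eye())
    # 90 Hz is a hypothetical refresh rate used only for illustration.
    print("pixels per second at 90 Hz, two eyes:", pixels_per_second(90))
```

Under these assumptions the renderer would have to produce on the order of ten billion pixels per second, which is the kind of load that approaches such as foveated rendering aim to reduce.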
Saturday Morning Talks
Perception and Action: Arm movements
Saturday, May 20, 8:15 - 9:45 am
Talk Session, Talk Room 1
Moderator: Robert Volcic
21.11, 8:15 am The causal role of the lateral occipital (LO) cortex
and anterior intraparietal sulcus (aIPS) in real and pantomimed
grasping: an fMRI-guided TMS study Diana Tonin1([email protected]),
Vincenzo Romei2, Rachel Lambert1, Andre Bester1, Janak Saada3,
Stephanie Rossit1; 1School of Psychology, University of East Anglia,
Norwich, UK, 2Department of Psychology, Centre for Brain Science, University of Essex, Colchester, UK, 3Department of Radiology, Norfolk and
Norwich University Hospital, Norwich, UK
Milner and Goodale (1995) propose a model of vision that makes a distinction between vision for perception and vision for action. One strong
claim of this model is that the visual processing of objects for real grasping depends on dorsal stream areas whereas the processing of objects for
pantomimed actions depends on the ventral stream regions. However,
even more than 20 years after its formulation, this claim is largely based
on a single-case neuropsychological study: visual form agnosic patient
DF can scale her grip aperture to different object sizes during real visually-guided grasping, but her grip scaling is impaired when performing
pantomimed grasping in a location adjacent to these same objects. Here we
used fMRI-guided transcranial magnetic stimulation (TMS) to shed light
on the specific role of the lateral occipital (LO) cortex, a key ventral stream
area in object perception, and the anterior intraparietal sulcus (aIPS), a key
dorsal stream region in grip scaling, in real and pantomimed grasping. We
applied theta burst TMS over left aIPS, left LO or vertex in three separate
sessions before 16 participants performed real object grasping and pantomimed grasping in an adjacent location to the presented object. Grasping
movements were performed in open loop with the right hand in response
to 3D Efron blocks presented in the right visual field. For real grasping,
TMS over aIPS significantly weakened the relationship between object size
and grip aperture when compared to TMS over LO and TMS over vertex,
whereas TMS over LO had no effects. For pantomimed grasping, TMS over
both aIPS and LO considerably reduced the relationship between object
size and grip aperture when compared to vertex stimulation. Our results
show that while aIPS is causally involved in grip scaling for both real and
pantomimed grasping, LO is only involved in pantomimed grasping.
21.12, 8:30 am Proprioception calibrates object size constancy for
grasping but not perception in limited viewing conditions Juan
Chen1([email protected]), Irene Sperandio2, Melvyn Goodale1; 1The Brain
and Mind Institute, The University of Western Ontario, London, Ontario,
Canada, 2School of Psychology, University of East Anglia, Norwich, UK
Observers typically perceive an object as being the same size even when it
is viewed at different distances. What is seldom appreciated, however, is
that people also use the same grip aperture when grasping an object positioned at different viewing distances in peripersonal space. Perceptual size
constancy has been shown to depend on a range of distance cues, each of
which will be weighted differently in different viewing conditions. What
is not known, however, is whether or not the same distance cues (and the
same cue weighting) are used to calibrate size constancy for grasping. To
address this question, participants were asked either to grasp or to manually estimate (using their right hand) the size of spheres presented at different distances in a full-viewing condition (light on, binocular viewing) or in
a limited-viewing condition (light off, monocular viewing through a 1 mm
hole). In the full-viewing condition, participants showed size constancy in
both tasks. In the limited-viewing condition, participants no longer showed
size constancy, opening their hand wider when the object was closer in
both tasks. This suggests that binocular and other visual cues contribute
to size constancy in both grasping and perceptual tasks. We then asked
participants to perform the same tasks while their left hand was holding
a pedestal under the sphere. Remarkably, the proprioceptive cues from
holding the pedestal with their left hand dramatically restored size constancy in the grasping task but not in the manual estimation task. These
results suggest that proprioceptive information can support size constancy
in grasping when visual distance cues are severely limited, but such cues
are not sufficient to support size constancy in perception.
Acknowledgement: This work was supported by a discovery grant from the Natural Sciences and Engineering Research Council of Canada (NSERC) to M.A.G.
and the
21.13, 8:45 am The medial grasping area in the parietal cortex
of the macaque Patrizia Fattori1([email protected]), Rossella
Breveglieri1, Marina De Vitis1, Annalisa Bosco1, Claudio Galletti1; 1Dept.
Pharmacy and Biotechnology, University of Bologna, Italy
The parietal lobe hosts different areas involved in linking sensory information to signals useful to control arm movements. One area of the inferior parietal lobule, area AIP, is the parietal grasping area par excellence.
Recently, an area of the superior parietal lobule, area V6A, has also been
found to be involved in encoding grasping (Fattori et al., 2010 J. Neurosci;
Fattori et al., 2012 J. Neurosci). In this work, we tested grasping responses
of single V6A neurons with the same tasks used to study the lateral grasping area AIP (Murata et al., 2000 J. Neurophysiol). Delayed grasping actions
were performed by two monkeys towards five objects of different shapes
(so as to require different grip types) either in full vision or in complete darkness. In a second task, the same objects were observed without performing
grasping. We recorded 200 V6A neurons, and quantified the neural activity during grasping preparation, execution, and object holding and during
object observation. The overwhelming majority of neurons (94%) were
task-related. Most task-related cells were influenced by the visual background. Half of the neurons were excited by the visual input, half were
inhibited. Similarly to AIP, V6A contains Visual cells, activated only during
grasping in light; Motor neurons, equally activated during grasping in dark
and in light; Visuomotor cells, differently activated while grasping in dark
and in light. A subpopulation of grasp-related cells discharged also while
the monkey observed the objects outside the grasping context. Most of the
Visual, Motor, and Visuomotor neurons, highly or moderately selective
for the object during grasping, lost their selectivity during object observation. A picture emerges in which the two grasping parietal areas share some
functional properties useful to update and control prehension, as the hand
approaches an object, but with a different weight of sensory signals.
Acknowledgement: MIUR (PRIN and FIRB) EU FP7-IST-217077-EYESHOTS
21.14, 9:00 am Modeling Hand-Eye Movements in a Virtual Ball
Catching Setup using Deep Recurrent Neural Network Kamran
Binaee1([email protected]), Anna Starynska1, Rakshit Kothari1,
Christopher Kanan1, Jeff Pelz1, Gabriel Diaz1; 1Rochester Institute of Technology, Chester F. Carlson Center for Imaging Science
Previous studies show that humans efficiently formulate predictive strategies to make accurate eye/hand movements when intercepting a target moving in their field of view, such as a ball in flight. Nevertheless, it is
not clear how these strategies compensate for noisy sensory input and how
long these strategies are valid in time, as when a ball is occluded mid-flight
prior to an attempted catch. To investigate, we used a Virtual Reality ball
catching paradigm to record the 3D gaze of ten subjects as well as their head
and hand movements. Subjects were instructed to intercept a virtual ball
in flight while wearing a head mounted display which was being tracked
using a motion capture system. Midway through its parabolic trajectory, the
ball was made invisible for a blank duration of 500 ms. We created 9 different ball trajectories by choosing three pre-blank (300, 400, 500 ms) and
three post-blank durations (600, 800, 1000 ms). The ball launch position and
angle were randomized. During the blank, average angular displacement
of the ball was 11 degrees of visual angle. In this period subjects were able
to track the ball successfully using head+eye pursuit. In successful trials, subjects have higher smooth pursuit gain values during the blank, combined
with a sequence of saccades in the direction of ball trajectory toward the
end of the trial. Approximately 200 ms before the catching frame, the angular
gaze-ball tracking error in elevation forecasts the subject’s success or failure.
We used this dataset to train a deep Recurrent Neural Network (RNN) that
models human hand-eye movements. By using previous input sequences,
the RNN model predicts the angular gaze vector and hand position for a
short duration into the future. Consistent with studies of human behavior,
the proposed model accuracy decreases when we extend the prediction
window beyond 120 ms.
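The timing design described in abstract 21.14 can be sketched as a small grid, offered here only as an illustration. The pre-blank, blank, and post-blank durations and the 11 degree mean displacement are taken from the abstract; the function names, the dictionary layout, and the assumption that total flight time is the simple sum of the three phases are hypothetical and are not the authors' code.

```python
from itertools import product

# Durations reported in the abstract (milliseconds).
PRE_BLANK_MS = (300, 400, 500)
BLANK_MS = 500
POST_BLANK_MS = (600, 800, 1000)


def trajectory_conditions():
    """Enumerate the 3 x 3 grid of pre-blank and post-blank durations."""
    conditions = []
    for pre, post in product(PRE_BLANK_MS, POST_BLANK_MS):
        conditions.append({
            "pre_blank_ms": pre,
            "blank_ms": BLANK_MS,
            "post_blank_ms": post,
            # Assumption: total flight time is the sum of the three phases.
            "total_flight_ms": pre + BLANK_MS + post,
        })
    return conditions


def mean_angular_speed_deg_per_s(displacement_deg, duration_ms):
    """Average angular speed implied by a displacement over a duration."""
    return displacement_deg / (duration_ms / 1000.0)


if __name__ == "__main__":
    for condition in trajectory_conditions():
        print(condition)
    # 11 degrees of displacement during the 500 ms blank, as reported.
    print(mean_angular_speed_deg_per_s(11, BLANK_MS), "deg/s")
```

Under the summation assumption, the nine conditions span total flight times from 1400 ms to 2000 ms, and the reported displacement corresponds to an average ball speed of 22 degrees per second during the blank.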
21.15, 9:15 am Congruency between perceptual and conceptual
object size modulates visually-guided action Christine Gamble1([email protected]), Joo-Hyun Song1,2; 1Cognitive, Linguistic, and Psychological Sciences Department, Brown University, 2Brown
Institute for Brain Science, Brown University
In daily interactions with the world around us, object sizes critically affect
the kinematics and dynamics of goal-directed movements such as pointing or grasping. For instance, ballistic pointing movements are faster the
larger their target is (Fitts, 1955). However, because the perceptual and conceptual sizes of objects are mostly consistent in the real world (elephants
are almost always perceived and conceptualized as larger than rabbits),
it is not clear if we guide movements solely based on our assessment of
perceptual size, or if objects’ higher-order conceptual sizes also influence
action. Here, we compared pointing movements directed at images of
real-world objects when their relative perceptual sizes were either congruent or incongruent with their relative conceptual sizes (e.g. an elephant was
presented as perceptually larger or smaller than a rabbit, respectively). Participants were instructed to point to the larger or smaller of two simultaneously presented objects in perceptual and conceptual size judgment tasks.
We observed that participants pointed to target objects faster when their
perceptual and conceptual sizes were congruent compared to incongruent.
Furthermore, we demonstrated that when perceptual and conceptual sizes
were incongruent, pointing movements were more attracted towards the
incorrect object (e.g. in the larger perceptual size judgment task, the conceptually larger but perceptually smaller object), leading to more curved trajectories. These results were observed in both the perceptual and conceptual
size judgment tasks, consistent with prior research showing that perceptual
size judgments are impaired when object size is inconsistent with object
knowledge (Konkle and Oliva, 2012). Despite this interactive modulation
of goal-directed pointing by perceptual and conceptual size, we observed
greater overall competition (i.e. curvature) in the conceptual size judgment
task. Thus, we propose that assessments of perceptual size have greater
influence on action than assessments of real-world conceptual size, despite
the fact that both are performed automatically.
21.16, 9:30 am Errors in manual interception are precisely what one
would expect for the psychophysically determined errors in perception Cristina de la Malla1, Jeroen Smeets1, Eli Brenner1; 1Department
of Human Movement Sciences, Vrije Universiteit Amsterdam
Visual illusions influence the way we perceive things. They can also influence the way we move. Whether illusions influence perception and action
to the same extent is still under debate. A major difficulty in resolving this
debate is that it only makes sense to compare the influences if one knows
how the action is based on the attribute that is influenced by the illusion.
We therefore used what we know about how visual information guides
the hand in interception to examine whether psychophysical estimates of
the extent to which a two-dimensional Gabor patch (a sinusoidal grating
whose luminance contrast is modulated by a two-dimensional Gaussian) is perceived to move at a different velocity than its true velocity when
the grating moves within the Gaussian can explain the errors that people
make when trying to intercept such Gabor patches. In separate two-interval
forced choice discrimination tasks we measured how moving the grating
influenced the perceived position and the perceived velocity of the target.
When the grating moved in the same direction as the patch, the patch was
judged to move faster than it really did. When it moved in the opposite
direction, it was judged to move more slowly. The perceived position was
hardly affected. We calculated the errors that subjects would make if they
used these judgements to predict the motion of the moving patch during
the last 100 ms of an interceptive action (when movements can no longer be
corrected due to sensorimotor delays). We compared these predicted errors
with the errors that subjects actually made when they had to intercept similar targets. The predicted errors closely matched the actual errors in interception. We conclude that errors in perceiving how a target moves lead to
the errors that one would expect in an action directed towards such targets.
Face perception: Experience and disorders
Saturday, May 20, 8:15 - 9:45 am
Talk Session, Talk Room 2
Moderator: Isabel Gauthier
21.21, 8:15 am The speed of continuous face detection suggests
shortcuts in the visual hierarchy for upright faces Jacob Martin1([email protected]), Charles Davis1, Maximilian Riesenhuber2,
Simon Thorpe1; 1CerCo, CNRS, 2Department of Neuroscience, Georgetown
University Medical Center
The detection of faces in the visual field is a key cognitive task of high ecological importance. While a number of studies have shown human subjects’
impressive ability to detect faces in individual images, we here report evidence that subjects are able to rapidly saccade towards 4000 faces continuously at rates approaching 6 faces a second when there is no background
(including the time for blinks and eye movements). Surprisingly, pasting
or hiding the faces by blending them into large background pictures had
little effect on detection rates, saccade reaction times, or accuracy. Saccade
reaction times were similar to the “ultra-rapid” saccades found in studies
which utilized pauses and fixations between experimental trials (Crouzet et
al. 2010). Upright faces were found more quickly and more accurately than
inverted faces, both with and without a cluttered background, and over a
large range of eccentricities (4°-16°). These results argue for the existence of
a face-selective shortcut in the visual hierarchy which enables ultra-rapid
and high-throughput face detection.
Acknowledgement: ERC Advanced Grant No323711 (M4), NSF NEI
21.22, 8:30 am Thickness of deep layers in FFA predicts face
recognition performance Isabel Gauthier1([email protected]), Rankin McGugin1, Benjamin Tamber-Rosenau2, Allen Newton3;
1Department of Psychology, Vanderbilt University, Nashville, TN, USA,
2Department of Psychology, University of Houston, Houston, TX, USA,
3Department of Radiology and Radiological Sciences, Vanderbilt University, Nashville, TN, USA
Individual differences in expertise with non-face objects have been positively related to neural selectivity for these objects in several brain regions,
including in the fusiform face area (FFA). Recently, we reported that FFA’s
cortical thickness is also positively correlated with expertise for non-living objects, while FFA’s cortical thickness is negatively correlated with
face recognition ability. These opposite relations between structure and
visual abilities, obtained in the same subjects, were postulated to reflect
the earlier experience with faces relative to cars, with different mechanisms
of plasticity operating at these different developmental times. Here we
predicted that variability for faces, presumably reflecting pruning, would
be found selectively in deep cortical layers. In 13 men selected to vary in
their performance with faces, we used ultra-high field imaging (7 Tesla):
we localized the FFA functionally and collected and averaged 6 ultra-high
resolution susceptibility-weighted images (SWI). Voxel dimensions were
0.194x0.194x1.00mm, covering 20 slices with 0.1mm gap. Images were
then processed by two operators blind to behavioral results to define the
gray matter/white matter (deep) and gray matter/CSF (superficial) cortical boundaries. Internal boundaries between presumed deep, middle and
superficial cortical layers were obtained with an automated method based
on image intensities. We used an extensive battery of behavioral tests to
quantify both face and object recognition ability. We replicate prior work
with face and non-living object recognition predicting large and independent parts of the variance in cortical thickness of the right FFA, in different
directions. We also find that face recognition is specifically predicted by the
thickness of the deep cortical layers in FFA, whereas recognition of vehicles
relates to the thickness of all cortical layers. Our results represent the most
precise structural correlate of a behavioral ability to date, linking face recognition ability to a specific layer of a functionally-defined area.
Acknowledgement: This work was supported by the NSF (SBE-0542013 and
SMA-1640681) and the Vanderbilt Vision Research Center (P30-EY008126)
Acknowledgement: NWO 464-13-169
21.23, 8:45 am Hemispheric specialization for faces in pre-reading
children Aliette Lochy1([email protected]), Adelaide de Heering2,
Bruno Rossion1; 1Psychological Sciences Research Institute, Institute
of Neuroscience, University of Louvain, Belgium, 2UNESCOG/CRCN,
Université Libre de Bruxelles, Belgium
References Behrmann, M. & Plaut, DC. (2013). TICS, 17(5), 210-219. De
Heering, A. & Rossion, B. (2015). eLife, 4, e06564. Liu-Shuang, J. et al.
(2014). Neuropsychologia, 52, 57-72. Lochy, A. et al. (2016). PNAS, 113:
Acknowledgement: Belgian Science Policy Office (Belspo) Grant IAP P7/33,
European Research Grant facessvep 284025, and the Belgian National Fund for
Scientific Research
21.24, 9:00 am Development of neural sensitivity to face identity
correlates with perceptual discriminability Vaidehi Natu1([email protected]
stanford.edu), Michael Barnett1, Jake Hartley1, Jesse Gomez2, Anthony
Stigliani1, Kalanit Grill-Spector1,2,3; 1Department of Psychology, Stanford
University, Stanford, CA 94305, 2Neurosciences Program, Stanford University School of Medicine, 3Stanford Neurosciences Institute, Stanford
University, Stanford, CA 94305
Face-selective regions in the human ventral stream undergo prolonged
development from childhood to adulthood. Children also show protracted
development of face perception. However, the neural mechanisms underlying the perceptual development remain unknown. Here, we asked if
development is associated with changes in neural sensitivity to face identity, or changes in the overall level of response to faces, or both. Using fMRI,
we measured brain responses in ventral face-selective regions (IOG-faces,
pFus-faces, and mFus-faces) and two object-selective regions (pFs-objects and LO-objects, as control regions) in children (ages 5-12, N=23) and
adults (ages 22-34, N=12), when they viewed adult and child faces, which
parametrically varied in the amount of dissimilarity. Because similar faces
generate lower responses than dissimilar faces due to fMRI-adaptation, this
paradigm can be used to study neural sensitivity across age groups. Additionally, a
subset of participants (12 children; 11 adults) participated in a behavioral
experiment conducted to assess perceptual discriminability of face identity. Our data reveal the following main findings: (1) in both children and
adults, responses in ventral face-selective regions linearly increased with
face dissimilarity (Fig. 1a), (2) neural sensitivity to face identity increased
with age in face- but not object-selective regions (Fig. 1b), (3) the amplitude
of responses to faces increased with age in both face- and object-selective
regions (Fig. 1c) and (4) perceptual discriminability of face identity was correlated with the neural sensitivity to face identity of face-selective regions
(Fig. 1d). Our results suggest that developmental increases in neural sensitivity to face identity in face-selective regions improve perceptual discriminability of faces. These findings significantly advance understanding
Acknowledgement: NIH Grants: 1RO1EY 02231801A1, 1R01EY02391501A1
and 5T32EY020485 NSF Grant: DGE-114747.
21.25, 9:15 am Deafness Amplifies Visual Information Sampling
during Face Recognition Junpeng Lao1([email protected]), Chloé
Stoll2, Matthew Dye3, Olivier Pascalis2, Roberto Caldara1; 1Eye and Brain
Mapping Laboratory (iBMLab), Department of Psychology, University of
Fribourg, Fribourg, Switzerland, 2Laboratoire de Psychologie et Neurocognition (CNRS), Université Grenoble Alpes, Grenoble, France, 3Rochester
Institute of Technology/National Technical Institute for Deaf, Rochester,
New York, USA
We move our eyes to navigate, identify dangers, objects and people in a
wide range of situations during social interactions. However, the extent to
which visual sampling is modulated and shaped by non-visual information is difficult to control. A particular fate of nature might be helpful to
achieve this feat: the occurrence of deafness. Research has shown that early
profound hearing loss enhances the sensitivity and efficiency of the visual
channel in deaf individuals, resulting in a larger peripheral visual attention
compared to the hearing population (Dye et al., 2009). However, whether
such perceptual bias extends to visual sampling strategies deployed during
the biologically-relevant face recognition task remains to be clarified. To
this aim, we recorded the eye movements of deaf and hearing observers
while they performed a delayed matching task with upright and inverted
faces. Deaf observers showed a preferential central fixation pattern compared to hearing controls, with the spatial fixation density peaking just
below the eyes. Interestingly, even unlike hearing observers presenting a
global fixation pattern, the deaf observers were not impaired by the face
inversion and did not change their sampling strategy. To assess whether
this particular fixation strategy in the deaf observers was paired with a
larger information intake, the same participants performed the identical
experiment with a gaze-contingent paradigm parametrically and dynamically modulating the quantity of information available at each fixation – the
Expanding Spotlight (Miellet et al. 2013). Visual information reconstruction
with a retinal filter revealed an enlarged visual field in deafness. Unlike
hearing participants, deaf observers used larger information intake from all
the fixations. This visual sampling strategy was robust and as effective for
inverted face recognition. Altogether, our data show that the face system
is flexible and might tune to distinct strategies as a function of visual and
social experience.
Acknowledgement: This study was supported by the Swiss National Science
Foundation (n° 100014_156490/1) awarded to Roberto Caldara.
21.26, 9:30 am Is face perception preserved in pure alexia? Evaluating
complementary contribution of the left fusiform gyrus to
face processing Andrea Albonico1,2([email protected]), Jason
Barton1; 1Human Vision and Eye Movement Laboratory, Departments of
Medicine (Neurology), Ophthalmology and Visual Sciences, University
of British Columbia, Vancouver, Canada, 2NeuroMI - Milan Center for
Neuroscience, Milano, Italy
Face recognition and reading are two expert forms of human visual processing. Recent evidence shows that they involve overlapping cerebral networks in the right and left hemispheres, leading the many-to-many hypothesis to predict that a lesion to the left fusiform gyrus that causes pure alexia
will also be associated with mild impairments in face processing. Our goal
was to determine if alexic subjects showed face identity processing deficits
similar to but milder than those seen in prosopagnosia following right fusiform
lesions, or if they had different, complementary face processing deficits,
which would be predicted if there were hemispheric lateralization of different face perceptual functions. We tested three patients with pure alexia
from left fusiform lesions and one prosopagnosic subject with a right fusiform lesion. First, they had standard neuropsychologic tests of face identity recognition. Second, we tested their ability to discriminate faces in
images reduced to high-contrast linear contours, similar to letters. Third,
we assessed their ability to detect and discriminate facial speech patterns,
and to identify these and integrate them with speech sounds in the McGurk
effect (Campbell et al, 1986). Alexic subjects had normal familiarity for
face identity on the Cambridge Face Memory Test. However, they were
The developmental origin of the human right hemispheric lateralization
for face perception remains unclear. According to a recent hypothesis,
the increase in left lateralized posterior neural activity during reading
acquisition contributes to, or even determines, the right hemispheric
lateralization for face perception (Behrmann & Plaut, 2013). This view
contrasts with the right hemispheric advantage observed in infants only a few
months old. Recently, a Fast Periodic Visual Stimulation (FPVS) paradigm
in EEG showed that periodically presented faces among objects lead to
strongly right lateralized face-selective responses in 4-6-month-old infants
(de Heering & Rossion, 2015). Here we used the exact same paradigm in
EEG to study the lateralization of responses to faces in a group (N=35) of
5-year-old pre-school children showing left-lateralized responses to letters
(Lochy et al., 2016). Rather surprisingly, we found bilateral face-selective
responses in this population, with a small positive correlation found
between preschool letter knowledge and right hemispheric lateralization
for faces (rho=0.30; p< 0.04), but no correlation between the left lateralization to letters and the right lateralization to faces. However, discrimination
of facial identity with FPVS (Liu-Shuang et al., 2014) in these pre-reading
children was strongly right lateralized, and unrelated to their letter knowledge. These findings suggest that factors other than reading acquisition,
such as the posterior corpus callosum maturation during early childhood
as well as the level required by the perceptual categorization process (i.e.,
generic face categorization vs. face individualization), play a key role in
the right hemispheric lateralization for face perception in humans.
of neural mechanisms underlying the development of face perception and
have important implications for assessing development in neural mechanisms of high-level cortical areas.
impaired in matching faces for identity across viewpoint, which was worse
with line-contour faces. The prosopagnosic patient was also impaired in
matching faces across viewpoints, but did well with line-contour faces.
Alexic patients could detect facial speech patterns but had trouble identifying them and integrating them with speech sounds, whereas identification
and integration were intact in the prosopagnosic subject. We conclude that,
in addition to visual word processing, the left fusiform gyrus is involved
in processing linear contour information and speech patterns in faces, a
contribution complementary to the face identity processing of the right
fusiform gyrus.
Object Recognition: Neural mechanisms
Saturday, May 20, 10:45 am - 12:30 pm
Talk Session, Talk Room 1
Moderator: Timothy Andrews
22.11, 10:45 am Dynamic differences in letter contrast polarity
improve peripheral letter string and word recognition performance Jean-Baptiste Bernard1,2,3,4([email protected]),
Eric Castet1,2; 1Laboratoire de Psychologie Cognitive, CNRS, UMR 7290,
Aix-Marseille Université, 3Fondation de l’avenir, 4Fondation Visaudio
Letter crowding (the inability to identify a letter when surrounded by other
letters) is reduced when target and flankers are dissimilar. This release is
particularly strong when target and flankers have different contrast polarities (Kooi et al, 1994), but peripheral word recognition does not benefit
from this release because observers need to simultaneously report target
and flankers of different contrasts (Chung et al, 2010). Here, we investigate
if the sequential uncrowding of successive letters using a dynamic contrast
polarity difference could improve peripheral letter string and word recognition performance. Three subjects participated in two experiments with
letters presented horizontally at 10° in the lower visual field in white or
black on a gray background (same absolute contrast value) while eye position was controlled. Subjects identified trigrams (Experiment 1) and 5-letter
words (Experiment 2) using three different displays (4 blocks of 50 trials for
each experiment and each display): (a) Basic display (Black letters), (b) Static
contrast polarity variation (SCPV) display (letters with static alternate contrast polarity) and (c) Dynamic contrast polarity variation (DCPV) display
(each letter successively changing from black to white for 200 ms from left
to right). Presentation duration was 800 ms in Experiment 1, and depended
on subjects in Experiment 2 (3.54±1.26 s). Letter print-size was adjusted so
that letter recognition rate was at 50% for the basic display in Experiment
1. For each subject, results for Experiments 1 and 2 show the best recognition rate for the DCPV display (average: 64±4% (Exp1) and 72±5% (Exp2))
compared to the SCPV (40±2% (Exp1) and 57±3% (Exp2)) and basic (42±3%
(Exp1) and 49±6% (Exp2)) displays. These results suggest that peripheral
letter string and word recognition can be improved using letter contrast
polarity differences if word letters are successively uncrowded.
Acknowledgement: Fondations de l’Avenir et Visaudio, AP-VIS-15-001s
22.12, 11:00 am A developmental deficit in seeing the orientation
of typical 2D objects Gilles Vannuscorps1([email protected]
edu), Albert Galaburda2, Eric Falk3, Alfonso Caramazza1,4; 1Department of
Psychology, Harvard University, Cambridge (MA), USA, 2Department of
Neurology, Harvard Medical School and Beth Israel Deaconess Medical
Center, Boston (MA), USA, 3Carroll School, Lincoln (MA), USA, 4Center
for Mind/Brain Sciences, Università degli Studi di Trento, Trento (TN),
We report the results from a single-case study of an adolescent, Davida,
with no remarkable medical history, normal neurological exam, brain MRI
and electroencephalogram, who has a highly specific deficit in perceiving
the orientation of static, moving, flashed and flickering 2D shapes such as
black, grey or colored letters, arrows, abstract shapes and line drawings of
objects and faces. Davida reports seeing multiple orientations of these stimuli concurrently (the correct orientation and the equivalent of its rotation by
90, 180 and 270 degrees). Davida’s results in non-speeded tasks probing her
perception of orientation through verbal judgments, visual illusions, direct
copy, and directed movements corroborated this difficulty. For instance,
when asked to point to the tip of an arrow shown on a computer screen,
she typically pointed where the tip of the arrow would be if the arrow was
rotated by 90, 180 or 270 degrees. In contrast, (a) the processing of orientation
from auditory, tactile and kinesthetic information is intact; (b) visual
judgments about the identity, shape, distance, color, size, movement and
location of the same kind of stimuli are intact; and (c) the perception of the
orientation of the same shapes (letters, arrows, abstract shapes) shown in
3D, under very low luminance contrast and very low spatial frequencies
is intact. The dissociation between processes engaged in the perception of
the orientation of 2D shapes under medium to high luminance contrast
and spatial frequency from those involved in the perception of the identity, shape, distance, color, size, movement and location of the same kind of
stimuli, and those involved in the perception of the orientation of low contrast and low spatial frequency 2D and 3D shapes raises intriguing questions about the interaction of dorsal and ventral visual processes across the
two hemispheres.
22.13, 11:15 am A data-driven approach to stimulus selection
reveals the importance of visual properties in the neural representation of objects. David Coggan1([email protected]), David Watson1,
Tom Hartley1, Daniel Baker1, Timothy Andrews1; 1Department of Psychology, University of York, UK
The neural representation of objects in the ventral visual pathway has
been linked to high-level properties of the stimulus, such as semantic or
categorical information. However, the extent to which patterns of neural response in these regions reflect more basic underlying principles is
unclear. One problem is that existing studies generally employ stimulus
conditions chosen by the experimenter, potentially obscuring the contribution of more basic stimulus dimensions. To address this issue, we used
a data-driven analysis to describe a large database of objects in terms of
their visual properties (spatial frequency, orientation, location). Clustering
algorithms were then used to select images from distinct regions of this
feature space. Images in each cluster did not clearly correspond to typical
object categories. Nevertheless, they elicited distinct patterns of response in
the ventral stream. Moreover, the similarity of the neural response across
different clusters could be predicted by the similarity in image properties,
but not by the similarity in semantic properties. These findings provide an
image-based explanation for the emergence of higher-level representations
of objects in the ventral visual pathway.
22.14, 11:30 am Neural Mechanisms of Categorical Perception
in Human Visual Cortex Edward Ester1([email protected]),
Thomas Sprague1,2, John Serences1,2; 1Department of Psychology, University of California, San Diego, 2Neurosciences Graduate Program, University of California, San Diego
Category learning warps perceptual space by enhancing the discriminability of physically similar exemplars from different categories and minimizing differences between equally similar exemplars from the same category,
but the neural mechanisms responsible for these changes are unknown.
One possibility is that categorization alters how visual information is represented by sensory neural populations. Here, we used a combination of
fMRI, EEG, and computational modeling to test this possibility. In Experiment 1, we used fMRI and an inverted encoding model (IEM) to estimate
population-level feature representations while participants classified a set
of orientations into two discrete groups (Freedman & Assad, 2006). We
reasoned that if category learning alters representations of sensory information, then orientation-selective responses in early visual areas should
be biased according to category membership. Indeed, representations of
orientation in visual areas V1-V3 were biased away from the actual stimulus orientation and towards the center of the appropriate category. These
biases predicted participants’ behavioral choices and their magnitudes
scaled inversely with the angular distance separating a specific orientation
from the category boundary (i.e., larger biases were observed for orientations adjacent to the boundary relative to those further away
from the boundary). In Experiment 2, we recorded EEG over occipitoparietal electrode sites while participants performed a similar categorization
task. This allowed us to generate time-resolved representations of orientation and track the temporal dynamics of category biases. We observed
biases as early as 50-100 ms after stimulus onset, suggesting that category
learning alters how visual information is represented by sensory neural populations.
Acknowledgement: NIH R01-EY025872
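The inverted encoding model (IEM) mentioned in abstract 22.14 can be sketched roughly as follows. This is a minimal simulation under stated assumptions, not the authors' analysis pipeline: the number of channels and voxels, the cos^8 tuning function, the noise level and all variable names are illustrative choices that do not come from the abstract. The sketch shows the two generic IEM steps: estimating voxel weights from training data, then inverting those weights to reconstruct a channel response profile for held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes (not from the abstract): 9 orientation channels, 50 voxels.
n_channels, n_voxels = 9, 50
centers = np.arange(n_channels) * 180.0 / n_channels  # channel centers in degrees

def channel_responses(orientations_deg):
    """Idealized channel tuning (cos^8, periodic over 180 deg); channels x trials."""
    oris = np.asarray(orientations_deg, dtype=float)
    diff = np.deg2rad(oris[None, :] - centers[:, None])
    return np.cos(diff) ** 8

# Simulated "true" voxel weights (voxels x channels); purely illustrative.
true_weights = rng.normal(size=(n_voxels, n_channels))

def simulate_voxels(orientations_deg, noise_sd=0.1):
    """Simulated voxel responses B = W C + noise (voxels x trials)."""
    clean = true_weights @ channel_responses(orientations_deg)
    return clean + rng.normal(scale=noise_sd, size=clean.shape)

# Step 1 (training): estimate the weights W in B = W C by least squares.
train_oris = rng.uniform(0.0, 180.0, size=180)
C_train = channel_responses(train_oris)
B_train = simulate_voxels(train_oris)
W_hat = np.linalg.lstsq(C_train.T, B_train.T, rcond=None)[0].T  # voxels x channels

# Step 2 (test): invert the model to estimate channel responses on new data.
test_oris = np.full(20, 60.0)
B_test = simulate_voxels(test_oris)
C_hat = np.linalg.lstsq(W_hat, B_test, rcond=None)[0]  # channels x trials

# The trial-averaged reconstruction should peak at the channel nearest 60 deg.
reconstruction = C_hat.mean(axis=1)
peak_center = centers[int(np.argmax(reconstruction))]
print("reconstruction peaks at channel centered on", peak_center, "deg")
```

In a design like the one described, a category bias would appear as a shift of the reconstructed peak away from the presented orientation and toward the category center; the simulation above contains no such bias and only illustrates the estimation steps.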
22.15, 11:45 am Joint coding of shape and blur in area V4 Timothy
Oleskiw1,2([email protected]), Amy Nowack2, Anitha Pasupathy2; 1Department of Applied Mathematics, University of Washington, 2Department of
Biological Structure, University of Washington
(typically right) may be mandatorily engaged by faces, the non face-preferred hemisphere (typically left) may be flexibly recruited to serve current
task demands.
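The frequency-tagging logic of abstract 22.16 (a 6 Hz image stream with a face as every 5th image, giving a general response at 6 Hz and a face-selective response at 6/5 = 1.2 Hz) can be illustrated with a small simulation. This is a hedged sketch, not the authors' EEG pipeline: the sampling rate, recording length, response shape and amplitudes are all assumptions chosen for illustration.

```python
import numpy as np

fs = 240.0         # sampling rate in Hz (assumed)
base_hz = 6.0      # image presentation rate (from the abstract)
face_every = 5     # a face is every 5th image (from the abstract)
duration_s = 20.0  # simulated recording length (assumed)

samples_per_image = int(fs / base_hz)
n_images = int(duration_s * base_hz)

# Hypothetical response evoked by one image: a smooth bump within its time slot.
t_img = np.arange(samples_per_image) / fs
bump = np.sin(np.pi * base_hz * t_img) ** 2

signal = np.zeros(samples_per_image * n_images)
for i in range(n_images):
    is_face = (i % face_every) == face_every - 1
    amplitude = 1.5 if is_face else 1.0  # assumed larger response to faces
    start = i * samples_per_image
    signal[start:start + samples_per_image] = amplitude * bump

# Amplitude spectrum of the simulated recording.
spectrum = np.abs(np.fft.rfft(signal - signal.mean())) / signal.size
freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)

def amplitude_at(freq_hz):
    """Spectral amplitude at the frequency bin closest to freq_hz."""
    return spectrum[int(np.argmin(np.abs(freqs - freq_hz)))]

amp_base = amplitude_at(base_hz)               # common response to all images
amp_face = amplitude_at(base_hz / face_every)  # differential response to faces
amp_control = amplitude_at(1.0)                # a frequency not tagged by the design
print(amp_base, amp_face, amp_control)
```

Because the face-related difference repeats every five images, it appears at 1.2 Hz (and its harmonics), while a frequency unrelated to the design, such as 1.0 Hz, carries essentially no signal in this noise-free simulation.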
22.17, 12:15 pm Does symmetry have a special status in single neurons?
RT Pramod1,2([email protected]), SP Arun1,2; 1Center for
Neuroscience, Indian Institute of Science, Bangalore, India, 2Department
of Electrical Communication Engineering, Indian Institute of Science,
Bangalore, India
Acknowledgement: This work was funded by NEI grant R01EY018839 to A.
Pasupathy, Vision Core grant P30EY01730 to the University of Washington, P51
grant OD010425 to the Washington National Primate Research Center, Natural
Sciences and Engineering Research Council of Canada PGS-D to T. D. Oleskiw, and University of Washington Computational Neuroscience Training Grant to T. D. Oleskiw.
Acknowledgement: Wellcome-DBT India Alliance (SPA) MHRD, Government of
India (PRT)
22.16, 12:00 pm Selective attention modulates face categorization
differently in the left and right hemispheres Genevieve Quek1([email protected]), Dan Nemrodov2, Bruno Rossion1, Joan
Liu-Shuang1; 1Psychological Sciences Research Institute and Institute of
Neuroscience, University of Louvain, Belgium, 2Department of Psychology, University of Toronto Scarborough, Canada
Despite the broad interest in the role of selective attention in human face
perception, there has been little focus on characterizing attentional modulation of this critical brain function in a dynamic visual environment. Here
we exploited fast periodic visual stimulation to separately characterise the
impact of attentional enhancement and suppression on generic face categorization. We recorded 128 channel EEG while participants viewed a 6Hz
stream of object images (e.g., buildings, animals, objects, etc.) with a face
image embedded as every 5th image in the sequence (i.e., OOOOFOOOOFOOOOF…). Stimulating the visual system this way elicits a response at
exactly 6Hz, reflecting processing common to both face and object images,
and a response at 6Hz/5 (i.e., 1.2 Hz), reflecting a differential response to
faces as compared to objects. We measured this face-selective response
while manipulating the focus of task-based attention: On Attend Faces trials, participants responded to instances of female faces in the sequence; on
Attend Objects trials, they responded to instances of guitars, and on Baseline trials, they performed an orthogonal task, monitoring a central fixation cross for colour changes. We inspected indices of attentional enhancement (Attend Face–Baseline) and attentional suppression (Baseline–Attend
Objects) on right and left occipito-temporal electrodes separately. We
observed that during the orthogonal task, face-specific activity was predominantly centred over the occipito-temporal region of the face-preferred
hemisphere (right hemisphere in 13/15 observers). Whereas task-based attentional suppression was comparable across the left and right hemispheres,
task-based attentional enhancement was much more prominent in the non
face-preferred hemisphere (left hemisphere in 13/15 observers). These
results suggest the left and right face-selective cortical regions may support
face categorization in distinct ways – where the face-preferred hemisphere
Symmetry is a salient global attribute: it is easy to detect and remember, and it
influences fundamental visual processes such as recognition and segmentation. Yet we know very little about how symmetry is represented
in neurons. To address this issue, we recorded from single neurons in the
monkey inferior temporal (IT) cortex using shapes made of two arbitrary
parts connected by a stem. Shapes made with two identical parts were
symmetric while those with different parts were asymmetric. We tested
the same shapes oriented vertically and horizontally to characterize mirror symmetry about both axes. Using these shapes we asked whether symmetric objects had any special status at the neural level that would explain
their special status at the behavioural level. Our main findings were similar for horizontal and vertical objects: (1) Symmetric objects did not evoke
significantly stronger neural responses compared to asymmetric objects;
(2) Neural responses to the whole object were explained as a linear sum
of the part responses, with no special deviation for symmetric objects; (3)
Neural responses to symmetric objects elicited no greater nonlinear interactions between parts compared to asymmetric objects and (4) The sole distinguishing characteristic of symmetric objects was that they were more
distinct from each other compared to equivalent asymmetric objects. This
distinctiveness is a straightforward outcome of part summation but explains
a number of observations regarding symmetry in perception. We propose
that symmetry becomes special in perception due to generic computations
at the neural level.
Perceptual Learning
Saturday, May 20, 10:45 am - 12:30 pm
Talk Session, Talk Room 2
Moderator: Jozsef Fiser
22.21, 10:45 am REM sleep stabilizes visual perceptual learning
which was rendered fragile by NREM sleep Yuka Sasaki1(yuka_
[email protected]), Masako Tamaki1, Takeo Watanabe1; 1Brown University, Department of Cognitive, Linguistic, and Psychological Sciences
NREM sleep plays a role in enhancing visual perceptual learning (VPL).
However, to successfully consolidate already enhanced VPL, a stabilization process of VPL needs to occur after the enhancement. Otherwise,
VPL should be left fragile and vulnerable to interference by training of
another task. Since REM sleep succeeds NREM sleep during which VPL
is enhanced, we tested whether REM sleep plays a role in stabilization.
More specifically, we tested whether VPL is still resilient to interference
and therefore is stabilized if REM sleep does not occur after training. We
used a two-block training paradigm using the texture discrimination task.
Earlier studies have shown that learning of the first training with a stimulus
is interfered by the second training with a similar but different stimulus
unless the time-interval between the two trainings was longer than 60 min.
Here, we separated two trainings by a 120-min interval, during which subjects either slept (sleep group, n=11) or stayed awake (control-wake group,
n=10). Performance was measured before and after the first training, before
the interval, and after the second training. In the control-wake group, consistent with previous findings, the first learning was not interfered by
the second, showing that the first learning was stabilized during wakefulness. In the sleep group, the first learning was significantly
interfered by the second training with subjects who showed only NREM
sleep, whereas no such interference occurred with subjects who showed
REM sleep after NREM sleep. The degree of the resilience of the first learning measured after the second training was significantly correlated with
the strength of theta activity (5-7 Hz) from the visual areas retinotopically
Blur is a common and informative cue of naturalistic visual scenes. For
example, cast shadows have blurry boundaries, as do objects outside the
focal plane, and surface features of 3D objects may be associated with shading blur. Interestingly, while psychophysical studies have long demonstrated the importance of detecting and encoding blur for scene segmentation and perception of depth and 3D structure, the underlying neural
mechanisms have yet to be discovered. To investigate this we record
single-unit activity from area V4 in two awake fixating Macaca mulatta in
response to shape stimuli exhibiting blurred boundaries. Specifically, after
classifying shape selectivity of single neurons, preferred and non-preferred
shapes are presented under multiple levels of Gaussian blur. Surprisingly,
our data reveal a population of V4 neurons which are tuned for intermediate levels of boundary blur, demonstrating, for the very first time, blur
selectivity anywhere in primate visual cortex. After performing a series
of control experiments our results reveal a sophisticated neural computation within V4, with responses being enhanced by the removal of high spatial frequency content; this effect is not explained by confounding factors
of stimulus size, curvature, or contrast. We interpret our findings in the
context of computational studies that argue for shape and blur as forming a
sufficient representation of naturalistic images. A simple descriptive model
is proposed to explain observed data, wherein blur selectivity modulates
the gain of shape-selective responses, supporting the hypothesis that shape
and blur are fundamental features of a sufficient neural code for natural
image representation within the ventral pathway. More generally, we
believe that our findings will shift paradigms surrounding area V4’s role in
visual processing: as opposed to computations of object recognition alone,
our results suggest that V4 also provides the neural substrate underlying
processes of scene segmentation and understanding.
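The descriptive model in abstract 22.15, in which blur selectivity modulates the gain of shape-selective responses, might be sketched as a multiplicative model. The abstract does not give the functional form, so everything below is an assumption for illustration: the Gaussian blur tuning, its preferred value and width, the shape drive values and the names are hypothetical.

```python
import numpy as np

# Hypothetical shape drive of one neuron for three shapes (preferred to non-preferred).
shape_drive = np.array([1.0, 0.6, 0.2])

# Hypothetical blur levels (Gaussian blur s.d., arbitrary units).
blur_levels = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

def blur_gain(blur, preferred=2.0, width=1.0):
    """Assumed Gaussian gain that peaks at an intermediate blur level."""
    return np.exp(-0.5 * ((blur - preferred) / width) ** 2)

# Gain modulation: response = shape drive x blur gain (shapes x blur levels).
responses = np.outer(shape_drive, blur_gain(blur_levels))

# Under a purely multiplicative gain, every shape peaks at the same blur level
# and the rank order of shapes is the same at every blur level.
best_blur = blur_levels[responses.argmax(axis=1)]
shape_rank = responses.argsort(axis=0)
print(best_blur)
print(shape_rank)
```

A model of this kind separates shape selectivity from blur tuning, so a neuron keeps its shape preference while its overall response is scaled by the blur of the boundary.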
corresponding to the trained visual field during REM sleep. These results
suggest that theta activity in the visual area during REM sleep is necessary
for consolidation of VPL during sleep after training.
Acknowledgement: NIH R01EY019466, NSF BCS 1539717
22.22, 11:00 am Evidence for awake replay in human visual cortex
after training Ji Won Bang1([email protected]), Yuka Sasaki2, Takeo
Watanabe2, Dobromir Rahnev1; 1School of Psychology, Georgia Institute
of Technology, 2Cognitive, Linguistic & Psychological Sciences, Brown
Understanding how the human brain learns is a fundamental goal of neuroscience. A large body of animal research shows that awake replay -- the
repetition of neuronal patterns exhibited during learning -- plays a critical
role in memory formation. However, very few studies have tested whether
awake replay occurs in humans and none have employed non-hippocampus-dependent tasks. Here, we examined whether awake replay occurs in
the human visual cortex immediately after extensive training on a visual
task. We trained participants on one of two Gabor patch orientations (45°
vs. 135°) using a two-interval forced choice (2IFC) detection task. Critically,
using functional MRI, we obtained participants’ spontaneous brain activity both before and after the vision training. We then tested whether the
post-training spontaneous activity in early visual cortex appeared more
similar to the trained than untrained stimulus (to classify the spontaneous
activity we first constructed a decoder that could distinguish the patterns of
activity for each Gabor orientation). Consistent with the existence of awake
replay, we found that immediately after vision training, the activation patterns in areas V1 and V3 were more likely to be classified as the trained
orientation. No such difference was found for the pre-training spontaneous
activity. In addition, behavioral performance on the trained orientation significantly improved after training demonstrating the effectiveness of the
training. Taken together, these results demonstrate that a process of awake
replay occurs immediately after visual learning. Our findings are the first
to demonstrate the phenomenon of awake replay in non-hippocampus-dependent tasks. We speculate that awake replay may be fundamental to all
types of learning.
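The classification logic described in this abstract can be illustrated with a minimal sketch: a decoder is trained on stimulus-evoked patterns for each Gabor orientation and is then used to label spontaneous-activity patterns. The abstract does not state which decoder was used, so the nearest-centroid rule, the function names, and the three-voxel toy data below are all assumptions made for illustration only.

```python
from statistics import mean

def centroid(patterns):
    """Average a list of equal-length voxel patterns, voxel by voxel."""
    return [mean(voxel) for voxel in zip(*patterns)]

def sq_dist(a, b):
    """Squared Euclidean distance between two voxel patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_decoder(patterns_by_label):
    """Nearest-centroid decoder (an assumed stand-in): one mean pattern per label."""
    return {label: centroid(p) for label, p in patterns_by_label.items()}

def classify(decoder, pattern):
    """Return the label whose centroid lies closest to the pattern."""
    return min(decoder, key=lambda label: sq_dist(decoder[label], pattern))

def fraction_classified_as(decoder, patterns, label):
    """Share of patterns that the decoder assigns to the given label."""
    hits = sum(classify(decoder, p) == label for p in patterns)
    return hits / len(patterns)

# Invented toy data: stimulus-evoked patterns used to build the decoder.
evoked = {
    "45deg": [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]],
    "135deg": [[0.0, 1.0, 0.1], [0.2, 0.9, 0.3]],
}
decoder = train_decoder(evoked)

# Invented spontaneous-activity patterns recorded before and after training.
pre_rest = [[0.5, 0.6, 0.1], [0.6, 0.4, 0.2], [0.3, 0.7, 0.0], [0.7, 0.5, 0.1]]
post_rest = [[0.8, 0.3, 0.1], [0.7, 0.2, 0.0], [0.4, 0.6, 0.2], [0.9, 0.4, 0.1]]

TRAINED = "45deg"  # hypothetical choice of trained orientation
print(fraction_classified_as(decoder, pre_rest, TRAINED))   # 0.5
print(fraction_classified_as(decoder, post_rest, TRAINED))  # 0.75
```

In this toy case the post-training patterns are labeled as the trained orientation more often than the pre-training patterns, which is the kind of comparison the abstract reports.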
22.23, 11:15 am Combining the cholinesterase inhibitor donepezil
with perceptual learning in adults with amblyopia Susana
Chung1,2,3([email protected]), Roger Li1,2,3, Michael Silver1,2,3, Dennis
Levi1,2,3; 1School of Optometry, UC Berkeley, 2Vision Science Graduate
Program, UC Berkeley, 3Helen Wills Neuroscience Institute, UC Berkeley
Amblyopia is a developmental disorder that results in a wide range of
visual deficits. Although brain plasticity is limited in adults, one approach
to recovering vision in adults with amblyopia is perceptual learning (PL).
Recent evidence suggests that neuromodulators may enhance adult plasticity. Here we asked whether donepezil, a cholinesterase inhibitor, can
enhance PL in adults with amblyopia. Nine adults with amblyopia were
first trained on a single-letter identification task (letters were presented
at low contrast) while taking a daily dose (5 mg) of donepezil throughout
training. Following 10,000 trials of training, participants showed improved
contrast sensitivity in identifying single letters. However, the magnitude
of improvement was no greater than, and the rate of improvement was
slower than that obtained in a previous study in which adults with amblyopia were trained using identical experimental protocols but without donepezil (Chung, Li & Levi, 2012). In addition, the transfer of learning to a
size-limited (acuity) or to a spacing-limited (crowding) task was less than
that found in the previous study with no donepezil administration. After
an interval of several weeks, six of these participants returned for a second
training task — flanked letter identification (identifying crowded letters) —
also with concurrent donepezil administration. Following another 10,000
trials of training, only one observer showed learning for this subsequent
training task, which has been previously shown to be highly amenable to
PL in adults with amblyopia. Control studies showed that the lack of a
learning effect on the flanked letter identification task was not due to either
the order of the two training tasks or the use of a sequential training paradigm. Our results reveal that donepezil does not enhance or speed up PL of
single-letter identification in adults with amblyopia, and importantly, may
even block participants’ PL of a task related to crowding.
Acknowledgement: NIH/NEI Research Grants R01-EY012810 and R01EY020976
22.24, 11:30 am Dissociable effects of stimulus strength, task
demands, and training on occipital and parietal EEG signals
during perceptual decision-making Sirawaj Itthipuripat1(itthipuri-
[email protected]), Kai-Yu Chang2, Vy Vo1, Stephanie Nelli1, John
Serences1,3; 1Neurosciences Graduate Program, UCSD, 2Cognitive Science,
UCSD, 3Psychology, UCSD
In most tasks, behavioral performance depends on several factors including
stimulus strength, task demands, and also the amount of expertise. Here,
we investigated how these different factors impacted neural modulations
of early sensory and post-sensory processing. To address this question, we
recorded electroencephalography (EEG) from human subjects performing a perceptual decision-making task and used two event-related potentials (ERPs):
an early visual negativity (VN) and a late centro-parietal positivity (CPP) as
neuromarkers for early sensory and post-sensory processing, respectively.
Across four days, subjects discriminated the orientation of a patch of oriented lines as we manipulated stimulus strength (0-60% coherence) and
task demands (number of possible target orientations: 2 or 4 choices). While
behavioral performance improved with increased stimulus coherence,
reduced choice number, and increased training duration, we observed distinguishable modulations of the VN and CPP components. Specifically, the
amplitudes of the VN and CPP increased multiplicatively with increased
stimulus coherence. On the other hand, reducing the task demands did not
alter the VN amplitude, but increased the ramping rate of the CPP. Similar to increasing stimulus coherence, training amplified the VN amplitude;
however, it reduced the CPP amplitude. The data suggest that altering task
demands can produce an effect on post-sensory processing that is similar to
changing stimulus strength but in the absence of changes in early sensory
processing. On the other hand, training and increasing stimulus strength
can produce similar effects on early sensory processing with different patterns of neural modulations at post-sensory stages.
Acknowledgement: An HHMI international student fellowship to S.I., NIH R01MH092345 and a James S. McDonnell Foundation grant to J.T.S
22.25, 11:45 am Double training reduces motor response specificity Lukasz Grzeczkowski1,2([email protected]), Aline
Cretenoud1, Fred Mast3, Michael Herzog1; 1Laboratory of Psychophysics,
Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL),
2Department of Psychology, Ludwig-Maximilian University of Munich,
3Institute of Psychology, University of Bern
The hallmark of perceptual learning is its specificity. Recently, we trained
observers with a classical three-line bisection task where observers
responded by button presses whether the central line is offset to the left or
right. Performance improved. However, there was no transfer to the same
bisection task when observers adjusted the central line with the computer
mouse. Likewise, adjustment training did not transfer to the button press
condition. Here, we first show that training is even specific when the trained
hand is used for both motor responses. However, there is transfer from the
trained to the untrained hand. Most importantly, we show that a double
training protocol enables strong transfer from the mouse adjustment condition to the button presses condition but not the other way around. In each
training session, observers trained blockwise with either a vertical bisection stimulus and adjusted the central line with the computer mouse or
they trained with a horizontal bisection stimulus and responded by button
presses. Before and after training, we tested performance with the vertical
bisection stimulus where observers responded by button presses. Surprisingly, training led to transfer in this condition. Without the double training, there was no such transfer. We propose that stimuli are coded together
with their corresponding actions when both are linked through extensive
Acknowledgement: Project “Learning from Delayed and Sparse Feedback” (Project Number: CRSII2_147636) of the Swiss National Science Foundation (SNFS)
22.26, 12:00 pm Visual statistical learning provides scaffolding for
emerging object representations Jozsef Fiser1([email protected]), Gabor
Lengyel1, Marton Nagy1; 1Department of Cognitive Science, Central European University, Hungary
Although an abundance of studies have demonstrated humans’ abilities for
visual statistical learning (VSL), far fewer studies have focused on the consequences of VSL. Recent papers reported that attention is biased toward
detected statistical regularities, but this observation was restricted to spatial
locations and provided no functional interpretation of the phenomenon. We tested the idea that statistical regularities identified by VSL
constrain subsequent visual processing by coercing further processing to
be compatible with those regularities. Our paradigm used the well-documented fact that within-object processing has an advantage over across-object processing. We combined the standard VSL paradigm with a visual
search task in order to assess whether participants detect a target better
within a statistical chunk than across chunks. Participants (N=11) viewed
4-4 alternating blocks of “observation” and “search” trials. In both blocks,
complex multi-shape visual scenes were presented, which unbeknownst to
the participants, were built from pairs of abstract shapes without any clear
segmentation cues. Thus, the visual chunks (pairs of shapes) generating
the scenes could only be extracted by tracking the statistical contingencies
of shapes across scenes. During “observation”, participants just passively
observed the visual scenes, while during “search”, they performed a 3-AFC
task deciding whether T letters appearing in the middle of the shapes
formed a horizontal or vertical pair. Despite identical distance between
the target letters, participants performed significantly better in trials in
which targets appeared within a visual chunk than across two chunks or
across a chunk and a single shape. These results suggest that similar to
object-defined within/between relations, statistical contingencies learned
implicitly by VSL facilitate visual processing of elements that belong to the
same statistical chunk. This similarity between the effects of true objects
and statistical chunks supports the notion that VSL has a central role in the
emergence of internal object representations.
22.27, 12:15 pm Evidence for stimulus abstraction before perceptual
learning Xin-Yu Xie1([email protected]), Cong Yu1; 1School of
Psychological and Cognitive Sciences, IDG-McGovern Institute for Brain
Sciences, and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing
Visual perceptual learning (VPL) is traditionally attributed to early cortical
neural plasticity or response reweighting. However, our double training
studies demonstrate often complete learning transfer to untrained locations,
orientations, and physical stimuli, suggesting that VPL involves learning
at a conceptual level (e.g., learning an abstract orientation concept). It is
unclear whether such a concept is abstracted after learning (e.g., abstracting the rules of reweighting that define a concept), or before learning (e.g.,
abstracting stimulus information before reweighting). Subjects practiced
orientation discrimination with a Gabor that either rotated trial-by-trial in
12 locations (anti-clockwise) and 4 orientations (clockwise) at 5-deg eccentricity, or in a roving order (47 conditions excluding the pre/post one that
was never practiced). Each condition received 2 trials per block, 12 trials
per session, over 5 daily sessions. A staircase controlled the orientation
difference from trial to trial. The multiple stimulus conditions and scarce
number of trials per condition minimize the possibility of early cortical
plasticity and response reweighting (and so abstraction of reweighting
rules). Both rotating and roving training conditions produced significant
orientation learning. Training also improved the untrained pre/post condition, as much as when training was performed at the pre/post condition
with an equal number of trials. Similar effects were seen with orientation training using symmetry dot-patterns whose global orientation rotated. However, training with an irrelevant contrast discrimination task with multiple conditions had no significant effect on orientation performance at the
pre/post condition, indicating that orientation learning is genuine and not
caused by improved attention to the periphery. These results suggest that
early cortical plasticity and response reweighting, as well as abstraction of
reweighting rules, are unnecessary for VPL. Instead the brain may abstract
the stimulus information in advance before reweighting, which explains
VPL and its transfer in various double training studies and in the current multiple-condition training study.
Acknowledgement: Supported by a Natural Science Foundation of China grant
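The trial counts in this design can be worked through from the figures given in the abstract (12 locations, 4 orientations, 2 trials per block and 12 trials per session for each condition, 5 sessions). The short sketch below only enumerates the conditions and multiplies out the counts; which location and orientation pairing served as the pre/post condition is not stated, so the `PRE_POST` value is a hypothetical placeholder, and the derived totals follow from the stated numbers rather than being reported by the authors.

```python
from itertools import product

N_LOCATIONS = 12         # from the abstract
N_ORIENTATIONS = 4       # from the abstract
TRIALS_PER_BLOCK = 2     # per condition, from the abstract
TRIALS_PER_SESSION = 12  # per condition, from the abstract
N_SESSIONS = 5           # from the abstract

# Every location/orientation pairing is one stimulus condition.
all_conditions = list(product(range(N_LOCATIONS), range(N_ORIENTATIONS)))

# Hypothetical placeholder: the actual pre/post pairing is not given.
PRE_POST = (0, 0)
trained_conditions = [c for c in all_conditions if c != PRE_POST]

# Derived quantities (arithmetic on the stated figures).
blocks_per_session = TRIALS_PER_SESSION // TRIALS_PER_BLOCK
trials_per_condition = TRIALS_PER_SESSION * N_SESSIONS
total_training_trials = trials_per_condition * len(trained_conditions)

print(len(all_conditions), len(trained_conditions))  # 48 47
print(blocks_per_session, trials_per_condition)      # 6 60
print(total_training_trials)                         # 2820
```

Each practiced condition therefore receives only 60 trials in total, which is the "scarce number of trials per condition" the argument relies on.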
Saturday Morning Posters
Attention: Features
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Banyan Breezeway
23.3001 Acts feature based suppression of ignored stimuli globally
or locally in the visual field? Matthias Mueller1([email protected]
de); 1Dept. of Psychology, University of Leipzig, Germany
A key property of feature-based attention is global facilitation of the
attended feature throughout the visual field. However, the question to
what extent suppression of unattended features acts globally as well is still
a matter of controversial results. We presented superimposed randomly
moving red and blue dot kinematograms (RDKs) flickering at a different frequency each to elicit frequency specific steady-state visual evoked
potentials (SSVEP) in the center of the screen and a red and blue RDK in
the left and right periphery, respectively. Subjects shifted attention to one
color of the superimposed central RDKs to detect coherent motion events
in the attended color RDK, while the peripheral RDKs were task irrelevant.
We found global facilitation of the attended color but suppression was
restricted to the location of focused attention. We replicated our previous
result: shifting of processing resources followed a bi-phasic process with a
leading enhancement of the attended RDK followed by suppression of the
unattended RDK. Our results are based on a pre-cue baseline serving as reference for post-cue SSVEP analysis. Other reference measures, like a neutral color (Painter et al., 2014), probe stimuli (Moher et al., 2014; Zhang and
Luck, 2009), or no baseline reference (Störmer and Alvarez, 2014) resulted
in a different pattern of effects. While some results suggested global suppression relative to the reference measure, others suggest local suppression, hampering conceptual advancement. Thus, providing solutions for a
common and unique reference frame seems to be an important challenge
for future studies to uncover neural dynamics in visual attention in general.
Acknowledgement: German Research Foundation
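The frequency-tagging logic with a pre-cue baseline can be sketched as follows: the amplitude at each flicker frequency is read out from the signal, and the post-cue amplitude is expressed as a percent change from the pre-cue amplitude. This is only an illustration under stated assumptions. The flicker frequencies, sampling rate, synthetic sinusoidal "EEG", and all function names are hypothetical and are not taken from the study.

```python
import cmath
import math

def amplitude_at(signal, freq_hz, sample_rate_hz):
    """Amplitude of one frequency component (single-bin discrete Fourier transform)."""
    n = len(signal)
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    return 2 * abs(acc) / n

def percent_change(post, pre):
    """Post-cue amplitude expressed relative to the pre-cue baseline."""
    return 100.0 * (post - pre) / pre

def two_tone(amp_a, amp_b, freq_a, freq_b, sample_rate_hz, seconds):
    """Toy signal: the sum of two sinusoids, one per flickering RDK."""
    n = int(sample_rate_hz * seconds)
    return [
        amp_a * math.sin(2 * math.pi * freq_a * k / sample_rate_hz)
        + amp_b * math.sin(2 * math.pi * freq_b * k / sample_rate_hz)
        for k in range(n)
    ]

FS = 500                 # hypothetical sampling rate (Hz)
F_ATT, F_UNATT = 10, 12  # hypothetical flicker frequencies (Hz)

# Invented amplitudes: equal before the cue, diverging after it.
pre_cue = two_tone(1.0, 1.0, F_ATT, F_UNATT, FS, 1.0)
post_cue = two_tone(1.2, 0.9, F_ATT, F_UNATT, FS, 1.0)

att_change = percent_change(
    amplitude_at(post_cue, F_ATT, FS), amplitude_at(pre_cue, F_ATT, FS)
)
unatt_change = percent_change(
    amplitude_at(post_cue, F_UNATT, FS), amplitude_at(pre_cue, F_UNATT, FS)
)
print(round(att_change, 3), round(unatt_change, 3))  # 20.0 -10.0
```

With a pre-cue reference, facilitation appears as a positive change and suppression as a negative change; a different reference measure would shift both values, which is the concern raised at the end of the abstract.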
23.3002 Using Angles as Features Matthew Inverso1([email protected]
edu), Charles Chubb1, Charles Wright1, George Sperling1; 1University of
California, Irvine
Introduction. What properties of angles can serve as features for spatially
distributed attention? Here we use the centroid paradigm to investigate
this question. Methods. Stimuli were clouds of randomly oriented v’s with
vertices of different angles. In Experiment 1, stimuli comprised eight v’s, 4
each with acute and obtuse angles of size that varied randomly both within
a trial and across trials. Stimuli were exposed for 200 ms and followed by
noise masks. In separate conditions, observers strove to mouse-click the
centroid of the v’s, giving equal weight to (1) all angles, (2) acute angles
(< 90 deg) while ignoring obtuse angles (>90 deg), (3) obtuse angles while
ignoring acute angles. Experiments 2 and 3 used v’s with only two fixed
angle sizes, and participants strove to mouse-click the centroid of a designated one of the two sets. Results. Participants in Experiment 1 gave
about 4-5 times more average weight to targets than distracters (we call
this measure “Selectivity”). However, for component angles of 67.5 versus
112.5 deg, the most similar target and distracter angles, selectivity was only
about 2.5. In Experiment 2, when filtering between just two acute angles
(30, 60 deg) participants gave target acute angles about 4 times more weight
than distractors. However, when discriminating between 75 versus 105 or
between 120 versus 150 deg, selectivity was reduced to less than 2. Conclusions. Angles can be used as a feature in feature based attention. For equal
angular differences between targets and distracters, selectivity is greater for
small than for large angles.
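The "Selectivity" measure defined above, the ratio of average target weight to average distracter weight, can be made concrete with a small sketch of a weighted centroid. The item positions and weights below are invented, and the function names are hypothetical; how the authors estimated the weights from observers' mouse clicks is not described in the abstract and is not modeled here.

```python
def weighted_centroid(points, weights):
    """Centroid of 2-D points in which each point counts in proportion to its weight."""
    total = sum(weights)
    x = sum(w * px for w, (px, _) in zip(weights, points)) / total
    y = sum(w * py for w, (_, py) in zip(weights, points)) / total
    return (x, y)

def selectivity(weights, is_target):
    """Average target weight divided by average distracter weight."""
    targets = [w for w, t in zip(weights, is_target) if t]
    distracters = [w for w, t in zip(weights, is_target) if not t]
    return (sum(targets) / len(targets)) / (sum(distracters) / len(distracters))

# Invented display: four targets around (1, 1), four distracters around (11, 11).
points = [(0, 0), (2, 0), (0, 2), (2, 2), (10, 10), (12, 10), (10, 12), (12, 12)]
is_target = [True] * 4 + [False] * 4

# Invented weights: distracters leak through with a quarter of the target weight.
weights = [1.0] * 4 + [0.25] * 4

print(selectivity(weights, is_target))     # 4.0
print(weighted_centroid(points, weights))  # (3.0, 3.0)
```

A perfectly selective observer would respond at the target centroid (1, 1); with a selectivity of 4 the predicted response is pulled toward the distracters, to (3, 3) in this toy display.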
23.3003 Feature-based surround suppression in the motion
domain Sang-Ah Yoo1, 5([email protected]), John Tsotsos2, 5, Mazyar
Fallah3, 4, 5; 1Department of Psychology, York University, 2Department of
Electrical Engineering and Computer Science, York University, 3School of
Kinesiology and Health Science, York University, 4Canadian Action and
Perception Network, York University, 5Centre for Vision Research, York
When we attend to a certain visual feature, such as a specific orientation
(Tombu & Tsotsos, 2008) or specific colour (Störmer & Alvarez, 2014), processing of features nearby in that space is suppressed (i.e., feature-based
surround suppression). In the present study, we investigated feature-based
surround suppression in a new feature domain, motion direction, using
motion repulsion as a measurement. Chen and colleagues (2005) suggested
that attention to one motion direction reduces motion repulsion by inhibiting the other direction. Based on this finding, we conducted a similar direction judgment task with naïve participants. They reported perceived
directions of two superimposed motions after viewing the motions for 2
sec. The directional differences between two motions systematically varied
(10~70 deg) and the surfaces were separated by different colours (green or
red). In the unattended condition, participants performed direction judgment tasks only, attending equally to both motions. In the attended condition, a colour cue was presented, indicating which motion participants
should attend. Participants were asked to detect a brief directional shift
of the cued motion and then, report the perceived motion directions. We
compared the magnitude of motion repulsion between the two attention
conditions. In contrast to the findings of Chen and colleagues, participants
showed greater motion repulsion in the attended condition than in the
unattended condition, especially when two motions moved along nearby
directions. The results suggest that feature-based surround suppression
exists in the motion domain and that it may occur at an early stage of
motion processing where the global direction of motion is computed.
Acknowledgement: the Canada Research Chairs Program, the Natural Sciences
and Engineering Research Council of Canada, and the Air Force Office for Scientific Research (USA)
23.3004 Does Feature-Based Attention for Grayscale Vary Across
Visual Tasks with Identical Stimuli? Howard Yang1([email protected]
edu), Peng Sung2, Charles Chubb3, George Sperling4; 1Department of Cognitive Sciences, UC Irvine, 2Department of Cognitive Sciences, UC Irvine,
3Department of Cognitive Sciences, UC Irvine, 4Department of Cognitive
Sciences, UC Irvine
Are feature-based visual attention filters for dark versus light items invariant across different tasks? Method. Stimuli were briefly flashed (300 ms)
clouds comprising 16 bars (length, width = .72°, .045°), two each of 8 Weber
contrasts ±0.25, ±0.5, ±0.75, ±1 on a mean gray background. Bar orientations
had a fixed dispersion of 22.5 deg. around a mean that varied randomly
across trials. In the centroid task, the participant strove to mouse-click the
centroid of a “target set” of bars, giving equal weight to all bars in this set
while ignoring all the other “distractor” bars. In the slant task, participants
adjusted the orientation of a central response bar to match the mean-orientation of the bars in the target set, giving equal weight to all target bars while
ignoring the distractor bars. In each of the two tasks, in separately blocked
conditions, the target set included 1) all bars, 2) bright bars only [bars more
luminous than the background] or 3) dark bars only. Results. In each condition in each task, we derived an attention filter that reflected the impact
exerted on the participant’s responses by bars of different Weber contrasts.
In both tasks, participants’ attention filters in the all-bars condition gave
nearly equal weight to all 8 Weber contrasts. In the bright-bars-only and
dark-bars-only selective attention conditions, participants’ centroid-task
attention filters more accurately approximated equal weight to all target
bar luminances than slant-task filters, which despite contrary instructions
and feedback, weighted bars more nearly in proportion to absolute Weber
contrast. On the other hand, in selective attention to bright-bars-only and
dark-bars-only conditions, slant-task filters assigned very little weight
to distractors yielding excellent target-to-distractor-weight ratios: >14:1,
whereas centroid-task filters were less selective, yielding target-to-distractor-weight ratios: 5:1. Conclusion. Attention filters for gray-scale can differ
between different tasks using identical stimuli.
23.3005 Shape interactions require more than feedforward representation Larissa D’Abreu1([email protected]), Timothy Sweeny1;
1University of Denver
At any moment, some objects in the environment are seen clearly whereas
others elude visual awareness. Objects that go unseen may be missed
because they fail to engage reentrant processing from higher- to lower-levels of visual analysis. Nevertheless, several investigations suggest that
these unseen objects are processed, at least to some extent, in a feedforward
wave of representation, and that this processing can influence attention
and even bias other perceptual judgments. Here, we attempted to understand the depth of feedforward representation at an intermediate level of
visual analysis. Object-substitution masking (OSM) is thought to prevent
feedback activity while preserving feedforward activity. Thus, we used
OSM to evaluate whether the feedforward representation of an unseen
shape’s aspect ratio is potent enough to influence the appearance of another
nearby shape that is clearly visible. Observers viewed two simultaneously
presented ellipses on each trial for 17 msec. An arrow appeared after the
ellipses disappeared, cueing observers to rate the aspect ratio (e.g. how tall
or flat, using a magnitude matching screen) of one ellipse from the pair.
On some trials, the uncued ellipse was masked by four adjacent dots that
lingered after its offset for 240 msec. On each trial, we measured subjective awareness by asking observers how many ellipses they saw clearly. As
expected, the aspect ratio of the cued ellipse was biased toward that of the
uncued ellipse on control trials with no masking—perceptual averaging.
Crucially, this averaging effect did not occur when the uncued ellipse was
successfully masked. Interestingly, when observers indicated that they saw
both ellipses, we found a significant effect of perceptual averaging intermediate between the other conditions. These results suggest that feedforward
representation of an unseen object is insufficient to influence the perception
of a nearby object, at least at intermediate levels of visual analysis.
23.3006 Variable Viewpoint Hybrid Search: Searching for the
Object or the Image? Abla Alaoui Soce ([email protected]),
Bochao Zou1,3, Jeremy Wolfe1,2; 1Brigham and Women’s Hospital, 2Harvard
Medical School, 3School of Optoelectronics, Beijing Institute of Technology
In hybrid search, observers search visual arrays for any of several target types held in memory. Items in the visual display must be matched
to some internal representation or ‘template’. Previous experiments have
shown that searching for specific targets is more efficient than searching
for categories of targets (Cunningham & Wolfe, 2014). Between search for
this exact image of this exact chair and search for the category “chairs”,
is search for a specific object that can be viewed from multiple positions.
Such search for targets that appear under different viewpoints is closer to
real world search. We conducted a hybrid search experiment using specific
target objects that could be rendered in multiple viewpoints. We compare
this varied viewpoint condition to a specific viewpoint condition, in which
each target appeared in only a single viewpoint. Is varied viewpoint hybrid
search similar to single viewpoint search, suggesting that search templates
are independent of viewpoint? Or, is varied viewpoint search like a category search where multiple views are like multiple instances of a category?
When the memory set size is 2-4, searching for varied viewpoint targets (2
targets: 33 msec/item; 4 targets: 82 msec/item) was just as fast as searching
for single viewpoint targets (2 targets: 38 msec/item; 4 targets: 69 msec/
item), (t(11)=0.88, p=0.40; t(11)=1.09, p=0.30). However, when more targets (8-16) are stored in memory, searching for varied viewpoint targets (8
targets: 127 msec/item; 16 targets: 150 msec/item) was less efficient than
searching for specific viewpoints (8 targets: 80 msec/item; 16 targets: 103
msec/item) (t(11)=4.63, p< 0.001, t(11)=3.36, p< 0.01), more closely resembling a categorical search (8 targets: 125 msec/item). This suggests that a
small number of viewpoint independent representations can be activated
in hybrid search. For larger memory sets, however, observers might only
activate a canonical view, making search harder for other views.
Acknowledgement: National Eye Institute (NEI) Grant No. EY017001
23.3007 How does attention alter perceived contrast? Enhancement at low contrast levels turns into attenuation at high contrast
levels. Liu-Fang Zhou1,2([email protected]), Simona Buetti2, Shena
Lu1, Yong-Chun Cai1; 1Department of Psychology and Behavioral Sciences,
Zhejiang University, Hangzhou, China, 2Department of Psychology, University of Illinois at Urbana-Champaign, IL, USA
It has been a long-standing question of whether attention alters appearance. A recent landmark study demonstrated that attention enhances
apparent contrast (Carrasco, Ling, & Read, 2004, Nature Neuroscience).
One shortcoming of the tasks used in this type of study is that they are
prone to induce cue-related response biases. Here we used two opposite
tasks to isolate the response bias in these tasks. Participants were presented
with an abrupt cue and two gratings afterward, and were instructed to
report the orientation of the stimulus which looked higher (Experiment
1) or lower (Experiment 2) in contrast. By comparing performance across
experiments, the reversal of instructions allows us to better estimate the magnitude of observed response biases and therefore to better isolate the effects
of attention on apparent contrast. We also systematically studied attentional
effects over a wide range of contrast levels (15-60%). When using a higher
comparative task, we found a boost of apparent contrast by attention with
low-contrast stimuli (15% and 25%), but null effects with high-contrast
stimuli (40% and 60%). When using a lower comparative task, surprisingly,
an attentional attenuation of apparent contrast was found at high contrast
levels, whereas no effect was found at low contrast levels. The between-experiment analysis demonstrated that the observed effect was an additive
combination of a perceptual effect induced by attention and response bias
to report the item on the side where the cue was presented. After isolating
the two effects, we were able to demonstrate that attention alters perceived
contrast in a contrast-dependent way: attention enhances contrast at low
contrast levels, but attenuates it at high contrast levels.
Acknowledgement: Zhejiang Provincial Natural Science Foundation of China
23.3008 Surround Suppression in Feature-based Attention to
Color Wanghaoming Fang1([email protected]), Mark Becker1,
Taosheng Liu1,2; 1Department of Psychology, Michigan State University,
2Neuroscience Program, Michigan State University
Goal. Feature-based attention can enhance perception of an attended color.
However, it is less clear how attending to a color modulates processing of
nearby colors. The feature-similarity gain model predicts a graded level of
attentional enhancement centered on the attended color. However, a center-surround mechanism claims inhibition of colors nearby the attended
one (Störmer & Alvarez, 2014). Here, we investigate how attentional modulation varies systematically as a function of the difference between the
stimulus color and the attended color. Methods. Subjects were sequentially presented with two intervals, with each interval consisting of a patch
of static colored dots. In one patch all dots had random colors, while in the
other patch one color was overrepresented (the target). Subjects performed
a 2IFC task reporting the interval that contained the target. The amount
of overrepresentation was determined by interleaved staircases for each
target color and each subject in a thresholding session at the start of the
experiment. In the cueing condition, a fixed-color cue appeared briefly at
the beginning of each trial. The target matched this color on 50% of trials.
In the remaining trials, the target was ±15°, ±30°, ±45° or ±60° away from
the cued color (6.25% each) on a color wheel (CIE L*a*b space). In separate
blocks of neutral trials, there were no cues. The cueing effect was the difference between the neutral and cued conditions for a given color. Results.
For most subjects, we found a significant enhancement for the cued target
color and, more importantly, a general trend for inhibition at its immediate
neighbors (±15°). Once outside this inhibitory zone, there was a rebound of
cueing effect. Thus, our data are consistent with a surround-suppression
effect in feature-based attention. We also found evidence for an interaction between attentional modulation and category boundaries in the color
Acknowledgement: National Institutes of Health (R01EY022727)
23.3009 Continuous vs. categorical representation of feature-based
attentional priority in human frontoparietal
cortex Mengyuan Gong1,2([email protected]), Taosheng Liu1,2; 1Department
of Psychology, Michigan State University, 2Neuroscience Program,
Michigan State University
Previous studies suggest a functional role of dorsal frontoparietal network
in representing feature-based attentional priority, yet how these features
are represented remains unclear. In an fMRI experiment, we used a feature
cueing paradigm to assess whether attentional priority signals vary continuously or categorically as a function of feature similarity. We presented
two superimposed dot fields moving along two linear directions (left-tilted
and right-tilted), while varying the angular separation between the two
motion directions. Subjects were cued to attend to one of the two dot fields
and respond to a possible speed-up in the cued direction. We examined
how information contained in the multi-voxel neural patterns changed
with the angular separation between the two directions. If attentional priority represents continuous changes of the features, priority signals in the
dorsal pathway should become more similar when the angular separation
between the attended directions decreases. However, if attentional priority
represents attended feature in a categorical manner, then priority signals
should remain largely invariant with respect to changes in the angular separation. We trained a classifier to decode the attended direction (left-tilted
vs. right-tilted) for each angular separation, and found that the decoding
accuracy improved with increasing angular separation in the visual cortex (V1 and V2). In contrast, decoding accuracy remained invariant to the
degree of feature similarity (and significantly above chance) in the intraparietal sulcus (IPS) and frontal areas (FEF and IFJ). These results indicate
dissociated roles of visual cortex and frontoparietal areas in representing
attentional priority, suggesting a flexible transformation of feature-based
priority from continuous to categorical representation along the dorsal
visual streams.
Acknowledgement: National Institutes of Health (R01EY022727)
23.3010 Tuning attention to relative features results in feature-based
enhancement and suppression Josef Schoenhammer1([email protected]), Stefanie Becker2, Dirk Kerzel1; 1University
of Geneva, 2The University of Queensland
Many theories of visual attention propose that we select sought-for items
(targets) by tuning attention to their elementary features (e.g., green, yellow). However, recent findings showed that we often select a target in a
context-dependent manner, by tuning attention to its relative features, that
is, to the features that the target has relative to the surrounding non-target items (e.g., greener, yellower). In our Experiment 1, we replicated these
basic findings, employing a cueing paradigm with spatially unpredictive
pre-cues. Target and nontarget colors remained fixed in a block of trials
(e.g., yellowish-green and green), so that the relative color also remained
constant (e.g., yellower). Consistent with a relational account, we found
that the cues elicited cueing effects only when they had the same relative
color as the target (e.g., yellowest item), regardless of whether the cues had
the same elementary color as the target or not (e.g., yellowish-green or yellow). Critically, cues that mismatched both the target’s elementary and
relative color (e.g., a green cue among yellowish-green contextual cues),
elicited inverse cueing effects, that is, slower RTs in cued than uncued trials. It has been hypothesized that these effects might be attributable to suppression of the cue color or, alternatively, to capture by the contextual cues,
as those had the same elementary and relative color as the target. In Experiment 2, we added a white cue to each cue array. We assumed that this color
would be neither attentionally enhanced nor suppressed. Hence, trials in
which the white cue appeared at the target location served as the baseline.
We found that RTs were slower than baseline when the mismatching cue
appeared at the target location, but faster than baseline when a matching
contextual cue appeared at the target location. Thus, the results suggest that
inverse effects are the result of combined suppression and enhancement.
23.3011 Short display time reduces distractor interference when
distractor is a feature of the target Zhi Li1([email protected]), Fan
Yang1, Yijie Chen1; 1Department of Psychology and Behavioral Sciences,
Zhejiang University
Load theory (Lavie and Tsal, 1994) reconciles the long-standing debate between the early- and
late-selection hypotheses of selective attention by assuming that the locus of
the attentional filter is flexible, depending on the perceptual load of the task. The
filter operates at an early stage when perceptual load is high and operates at
a late stage when perceptual load is low. Evidence supporting load theory
often involves a flanker task, in which distractor and target are spatially
separated. When distractor and target occupy the same space, however,
object-based attention may take over and the distractor may be processed
to a late stage regardless of the perceptual load (Chen, 2003; Cosman and
Vecera, 2012). The present study examined the load effect when distractor
is a feature of the target. Participants judged whether the color of a central
item also appeared on the item in a peripheral array. The items were either
all colorful numbers or all color squares. The display time of the stimuli
was either very short (barely enough for the task) or self-terminated (on
screen until response). In the colorful number condition, color was task-relevant information and number was task-irrelevant information. By using
the results from the color square condition as baseline, the time spent on
processing the task-irrelevant feature (i.e. number) in the colorful number
condition was calculated. When the display time was short,
less time was spent on processing the task-irrelevant feature than when
the display time was self-terminated. These findings show that the time
available for processing the stimuli affects whether the task-irrelevant
information is processed. Short display time significantly reduced
the distractor interference even when the distractor was a feature of the
target. These findings support and extend load theory.
Acknowledgement: Chinese National Natural Science Foundation (31671129)
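The baseline subtraction described in the abstract above can be sketched in Python as follows. This is a minimal illustration, assuming the cost of the task-irrelevant feature is estimated as a difference of mean response times; the function name and all RT values are hypothetical and are not data from the study.

```python
from statistics import mean


def irrelevant_feature_cost(rt_colorful_number_ms, rt_color_square_ms):
    """Mean RT difference (ms) attributed to processing the task-irrelevant number.

    The color-square condition serves as the baseline, as in the abstract.
    """
    return mean(rt_colorful_number_ms) - mean(rt_color_square_ms)


# Made-up RTs in milliseconds, used only to show the arithmetic.
cost_short = irrelevant_feature_cost([620, 640, 630], [600, 610, 605])            # 25
cost_self_terminated = irrelevant_feature_cost([720, 740, 730], [640, 650, 645])  # 85
```

A smaller cost in the short-display condition than in the self-terminated condition would correspond to the pattern the abstract reports.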
23.3012 Is Mean Size a Good Example of a Statistical Summary
Representation? Centroid versus Mean Size Judgments Laris
RodriguezCintron1([email protected]), Charles Wright2, Charles Chubb3;
1Cognitive Sciences, University of California Irvine, 2Cognitive Sciences,
University of California Irvine, 3Cognitive Sciences, University of California Irvine
Introduction. Work by Ariely (2001) inspired interest in research using the
judged mean size of a briefly presented set of stimuli, differing in size, as
a prototypical example of a statistical summary representation (SSR). Like
Ariely, many authors have concluded that mean size judgments rely on a
global strategy – i.e., most members of the set are included in this calculation (Ariely, 2001). However, Myczek and Simons (2008) presented simulation results suggesting that mean-size judgments could result from a
subsampling strategy. To explore whether subsampling is the appropriate
mechanism to explain performance in the mean-size task, we used an efficiency analysis to compare performance across three tasks: two versions
of the centroid task and the mean size task. Like the subsampling simulations, the efficiency analysis used in centroid-task research (Sun, Chubb,
Wright, & Sperling, 2015) is based on the degree to which an ideal observer fails
to register or include all of the stimuli in the calculation. Method. Observers
were presented with a cloud of either 3 or 9 squares for 300 ms followed
by a mask. In different sessions, observers were asked to estimate one of
(a) the mean size of the stimuli, (b) the centroid of the stimuli ignoring the
size differences, or (c) the centroid weighting the elements of the stimuli
according to their size. Results. We found that efficiency was high in both
centroid tasks, but substantially lower in the mean-size task. Conclusions.
These results suggest that stimulus size is registered accurately and can be
used effectively in the context of centroid judgments but not for judgments
of mean size. Presumably, sources of error other than subsampling lead to
the low efficiency observed when judging mean size. Given these results,
size judgments may be a poor task to use to study SSRs.
23.3013 Conjunctive targets are better than or equal to both constituent feature targets in the centroid paradigm A. Nicole Winter1([email protected]), Charles Wright1, Charles Chubb1, George Sperling1;
1University of California at Irvine
In the centroid paradigm (Sun, Chubb, Wright, & Sperling, 2015), a method
for studying feature-based attention, participants view a brief display of
items and then estimate the centroid, or center of mass, of the target items
while ignoring the distractors. In our previous work (Winter, Wright,
Chubb, & Sperling, 2016), we found performance on conjunctive target conditions was better than feature target conditions for one constituent feature
dimension and worse for the other. In this study, we find performance on
conjunctive target conditions is better than or equal to performance on both
constituent feature target conditions. Methods: Targets were defined by
luminance (the darkest items), shape (the most circular items), or their conjunction (the darkest and most circular items). Each stimulus display contained items that varied over two levels of each feature dimension. These
two levels were chosen to be either more or less similar, resulting in four
display types that were intermixed throughout the three blocked target
conditions. Results: As expected, performance in all three target conditions
was better when the stimuli differed more on the relevant dimension(s).
When both the feature dimensions were sufficiently different, performance
on the conjunction task was better than or equal to performance on both
feature tasks. Conclusion: Given the visual search literature, it is perhaps
surprising that participants can estimate the centroids of conjunctive targets at all, let alone better than they can constituent feature targets. The
current findings suggest that conjunctive centroid judgments do not incur
any cost to performance; rather, it seems they offer a performance advantage when the levels of both feature dimensions are sufficiently different.
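For readers unfamiliar with the centroid task, the quantity that observers estimate can be sketched as a weighted center of mass. The Python below is an illustrative sketch only: the function name, the coordinates, and the weights (1 for targets, 0 for distractors, or item size for a size-weighted centroid) are assumptions describing an ideal response, not the analysis code used in these studies.

```python
def weighted_centroid(points, weights=None):
    """Weighted center of mass of (x, y) points; equal weights by default."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    x = sum(w * px for w, (px, _) in zip(weights, points)) / total
    y = sum(w * py for w, (_, py) in zip(weights, points)) / total
    return x, y


# Hypothetical display: four items at the corners of a square.
items = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]

# Ideal target centroid: targets get weight 1, distractors weight 0.
is_target = [1, 1, 0, 0]
target_centroid = weighted_centroid(items, is_target)  # (2.0, 0.0)

# Ignoring target status gives the centroid of all items.
overall_centroid = weighted_centroid(items)            # (2.0, 2.0)

# A size-weighted centroid uses item sizes as weights.
size_weighted = weighted_centroid([(0.0, 0.0), (4.0, 0.0)], [1, 3])  # (3.0, 0.0)
```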
23.3014 Ensembles Increase Search Efficiency When Predictive
of Target Location Phillip Witkowski1([email protected]), Joy
Geng1; 1University of California, Davis
23.3015 Limits to Attentional Selection of Features Madison
Elliott1([email protected]), Ronald Rensink1; 1The University of
British Columbia
Longstanding questions exist about how features like color and orientation
are selected by visual attention (Theeuwes, 2013; Brawn & Snowden, 1999;
Treisman, 1988). Here, we present a new methodology to investigate this
issue. This methodology is based on the perception of Pearson correlation
r in scatterplots containing both a “target” population, and an irrelevant
“distractor” population, which is to be disregarded. Observers viewed two
such scatterplots side-by-side (each containing a target and a distractor
population), and were asked to identify the one with the higher target correlation. Methods from Rensink & Baldridge (2010) were used to measure
discrimination via just noticeable differences (JNDs) at 75% correct. Target items were always black, and the background always white. Distractor
items differed in color or in orientation (Fig. 1). In our color manipulation,
distractor dots were one of four shades of red. In our orientation manipulation, target dots were replaced with horizontal lines, and distractors were
lines oriented at 30, 45, 60, and 90 degrees. In conditions where there were
no distractor populations, JNDs were proportional to the distance from r
= 1, consistent with the results of earlier studies. In two-population conditions, however, the slope of the JND lines increased, indicating interference from the irrelevant distractors. Two forms of interference were found.
In our color manipulation, when the distractor dots were light pink (and
most different from the target dots), interference was low, but when they
were dark red (and most similar to the target dots), interference was high.
Meanwhile, in our orientation manipulation, interference was high for distractors at 60 and 90 degrees, but low for distractors at 30 and 45 degrees.
This suggests that attentional selection may differ for different features. It
also shows that this methodology may be a useful new way to examine
attentional selection.
Acknowledgement: Natural Sciences and Engineering Research Council, Canada.
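The two quantities at the heart of the abstract above, Pearson's r of a scatterplot and a JND that shrinks in proportion to the distance from r = 1, can be sketched in Python as follows. The function names and the slope parameter k are hypothetical; the abstract reports the proportional relationship but no specific slope value.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def predicted_jnd(r, k):
    """JND proportional to the distance from r = 1 (k is an assumed slope)."""
    return k * (1.0 - r)


r_perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear data
jnd_mid = predicted_jnd(0.5, k=0.2)
```

In this formulation, interference from a distractor population would appear as a larger fitted k, which corresponds to the steeper JND lines reported for the two-population conditions.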
23.3016 Flexible prioritization of feature dimensions in perception
of objects, ensembles, and social stimuli Jose Rivera-Aparicio1([email protected]
williams.edu), Benjamin Lin1, Jeremy Cone1, Mariko Moher1; 1Psychology
Department, Williams College
23.3017 What modulate attentional parameters, familiarity or features? Thomas Sørensen1,2([email protected]), Yongming Wang2,3,
Xinlu Cai2,3, Raymond Chan2,3,4, Jonas Dall1,2; 1Department of Communication and Psychology, Aalborg University, 2Sino-Danish Center for Education and Research, 3Institute of Psychology, Chinese Academy of Sciences,
4CAS Key Laboratory of Mental Health
Several studies have investigated object-based capacity limitations of visual
short-term memory (VSTM) (e.g. Luck & Vogel, 1997; Alvarez & Cavanagh,
2001). Recently, research interest has turned from object-based processing towards the resolution of objects retained in short-term memory (e.g.
Wilken & Ma, 2004). Although this research is highly relevant, there may be
an inherent difference between stimuli that can be easily classified into a discrete category and stimuli that lie on a continuous spectrum. Previous studies have shown that the object-based capacity of VSTM is not only
limited by object complexity, as argued by Alvarez & Cavanagh (2001), but
also relates to familiarity and expertise (Sørensen & Kyllingsbæk, 2012; Dall,
Watanabe, & Sørensen, 2016). Here we investigated the influence of two
vectors of complexity, namely the number of features that constitute an
object versus the degree of familiarity with said object. We presented Chinese observers with a whole report design (see Sperling, 1960), consisting of
four stimulus conditions. Chinese characters varied along two aspects: the
word frequency and the number of strokes used in the character. Data were
analysed using the Theory of Visual Attention (Bundesen, 1990) enabling
us to isolate specific components of attention: VSTM capacity (K),
processing speed (C), and the threshold for visual perception (t0) (e.g. Ásgeirsson, Nordfang, & Sørensen, 2015). The threshold of
visual perception was not affected by the manipulation of stroke count, nor
by character frequency. In contrast, we found a consistent pattern in both processing speed and VSTM capacity, revealing that observer performance
was driven mainly by familiarity, and not stroke count, demonstrating that
object complexity is dependent on the robustness of an observer’s mental
categories, rather than on the number of features in the object per se.
Acknowledgement: Sino-Danish Center for Education and Research
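To make the roles of the processing-speed and threshold parameters concrete, the Python sketch below uses the exponential encoding function commonly associated with the Theory of Visual Attention. It is a simplified illustration that ignores the capacity limit K and assumes the total rate C is split equally across items; the function names and the numeric values are hypothetical and do not come from the study.

```python
from math import exp


def encoding_probability(exposure_ms, rate_per_s, t0_ms):
    """Probability that one item is encoded by the end of the exposure.

    Uses 1 - exp(-v * (t - t0)) for t > t0, and 0 at or below the threshold t0.
    """
    effective_s = max(0.0, exposure_ms - t0_ms) / 1000.0
    return 1.0 - exp(-rate_per_s * effective_s)


def per_item_rate(total_rate_c, n_items):
    """Equal split of total processing speed C across items (an assumption)."""
    return total_rate_c / n_items


v = per_item_rate(total_rate_c=40.0, n_items=4)            # 10 items/s each
p_below_threshold = encoding_probability(10, v, t0_ms=20)  # 0.0
p_short = encoding_probability(120, v, t0_ms=20)
p_long = encoding_probability(220, v, t0_ms=20)
```

In this simplified picture, a familiarity effect on C would steepen the rise of encoding probability with exposure duration, whereas a change in t0 would shift the whole curve along the time axis.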
23.3018 Blur as a Guide for Attention when Viewing Representational Visual Art Christina Chao1([email protected]), Chai-Youn Kim2,
Emily Grossman1; 1Department of Cognitive Sciences, University of California, Irvine, 2Department of Psychology, Korea University
Background. Visual artists implement particular techniques (e.g. line
arrangements, spatial layout, shadows) when creating a representational
2-dimensional art piece. If and how an artist implements a particular technique can influence how viewers’ attention is guided through the art piece.
Here, we analyze the use of surrounding blur, which artists use to high-
Introduction: Research shows the visual system efficiently encodes peripheral objects as statistical representations, known as ensembles. However,
few studies have explored the role of ensembles in visual search. Some
models suggest that attention is drawn to ensembles with average qualities
similar to the target (Im et al., 2015). Other models propose that learned
ensemble-target associations facilitate visual search by cueing the location
of the target (Alvarez, 2011). Our project examines the function of ensembles to understand how they are used to facilitate target search and localization. Methods: Participants (N=20 per experiment) located a target
line in one of two groups of lines, which formed ensembles in opposite
locations on the screen. The average orientation of one ensemble matched
the target orientation. The non-matching ensemble was 30 to 60 degrees different. Participants reported whether the target was in the left- or right-side
ensemble, or was absent. In Experiment 1, the target was equally likely to
be in all locations. In Experiment 2, the target was in the matching ensemble
on 75% of trials and in either the non-matching ensemble or absent in 25%
of trials. Results: Results from both experiments indicated that participants made significantly more initial saccades toward the matching ensemble, suggesting that the ensembles captured attention. Only in Experiment
2 did participants have faster response times, suggesting that participants
used ensembles as cues to the target location after learning the ensemble-target association. This is further supported by evidence that participants were less likely to check the opposite ensemble after finding the target. This pattern suggests ensembles primarily influence visual search by
acting as learned cues to the target’s location. Conclusion: These results
suggest that target-matching ensembles capture attention, but the effect on
visual search is small unless there is a meaningful association between the
ensemble and the target.
As we look around the world, we identify items along many dimensions,
such as color (looking for red as you search for an apple) and shape (looking for skinny rods as you search for a writing implement). Which dimension we prioritize may change, depending on our current goals. Using a
task-switching paradigm, we examined whether certain feature dimensions
are prioritized over others in visual processing of objects, ensembles, and
social stimuli (e.g., animate creatures). On each trial, participants matched a
target stimulus to one of two probe stimuli according to a particular dimension, such as color. After a few trials, the relevant dimension switched,
forcing participants to focus on a previously ignored dimension (shape, in
this case). We also investigated whether there was an asymmetry in switch
costs; that is, whether it is easier to switch from one dimension to another
(e.g., color to shape) than vice versa (e.g., shape to color). In Experiments
1a and 1b, participants sorted individual objects and homogeneous ensembles. As expected, switches in the sorting dimension led to increased reaction times. Furthermore, participants incurred a larger cost when switching
from color to shape than vice versa, suggesting that color may be prioritized over shape for both individual objects and for homogeneous ensembles. In Experiment 2, participants sorted individual objects and heterogeneous ensembles. Switch costs were again observed; however, participants
did not exhibit asymmetric switch costs for color or shape. In Experiment
3, participants sorted social stimuli. Switch costs were observed, and once
again, participants exhibited a greater switch cost for switching from color
to shape than vice versa. Together, these results suggest that while color is
prioritized over shape in perception of objects and social stimuli, this may
not be the case for heterogeneous ensembles. This underscores the importance of context in featural processing.
light or emphasize a component of the art piece (often an important figure or object). Given that blurred regions of visual scenes are less fixated
than clear regions (Enns & MacDonald, 2012; DiPaola et al., 2013), how
is the salience of highlighted objects impacted when blur is included in a
visual art piece? Method. Regions of high salience were identified on each
art piece through mouse clicks made by a naïve group of human subjects
(N = 24). From these data we identified three commonly selected regions
of interest (the primary face, a secondary face, and a salient object). A new
group of subjects (N = 81) then participated in a change detection paradigm
to measure the impact of blur on these three targeted salient regions, and a
nonsalient control region. Blur was applied surrounding the object,
on the object, or at a random location in the image, or was omitted entirely. Results.
We found a main effect of region on the ability to detect changes, but no
significant effect of blur positioning. Blur did not modulate salience as measured through change blindness. An analysis of artistic ability revealed a
trend towards higher salience when blur surrounded the targeted regions
of interest. Conclusion. Our results suggest that implementing blur in an
artistic sense alters the aesthetics of the image, but may be less effective
for guiding attention. For the non-expert, blur may only be effective when
everything but the region of interest is blurred.
23.3019 ‘Mind contact’: Might eye-gaze effects actually reflect
more general phenomena of perceived attention and intention? Clara Colombatto1([email protected]), Benjamin van
Buren1, Brian Scholl1; 1Department of Psychology, Yale University
Eye gaze is an especially powerful social signal, and direct eye contact has
profound effects on us, influencing multiple aspects of attention and memory. Existing work has typically assumed that such phenomena are specific
to eye gaze — but might such effects instead reflect more general phenomena of perceived attention and intention (which are, after all, what we so
often signify with our eyes)? If so, then such effects might replicate with
distinctly non-eyelike stimuli — such as simple geometric shapes that are
seen to be pointing in various directions. Here we report a series of experiments of this sort, each testing whether a previously discovered ‘eye gaze’
effect generalizes to other stimuli. For example, inspired by work showing
that faces with direct gaze break into awareness faster, we used continuous
flash suppression (CFS) to render invisible a group of geometric ‘cone’
shapes that pointed toward or away from the observers, and we measured
the time that such stimuli took to break through interocular suppression.
Just as with gaze, cones directed at the observer broke into awareness faster
than did ‘averted’ cones that were otherwise equated — and a monocular control experiment ruled out response-based explanations that did not
involve visual awareness, per se. In another example, we were inspired
by the “stare in the crowd effect”, wherein faces with direct eye gaze are
detected faster than are faces with averted gaze. We asked whether this
same effect occurs when it is cones rather than eyes that are ‘staring’, and
indeed it does: cones directed at the observer were detected more readily
(in fields of averted cones) than were cones averted away from the observer
(in fields of direct cones). These results collectively suggest that previously
observed “eye contact” effects may be better characterized as “mind contact” effects.
Acknowledgement: ONR MURI #N00014-16-1-2007
23.3020 The role of visual attention and high-level object information on short-term visual working memory in a change detection
task. Moreno Coco1([email protected]), Antje Nuthmann1,
Sergio Della Sala1; 1Human Cognitive Neuroscience, Psychology, University of Edinburgh, UK
Some studies have suggested that visual attention and visual working memory (VWM) rely on shared processes and on the same limited
resources (e.g., Chun, 2011; Kiyonaga & Egner, 2013). Other studies, instead,
have shown that visual attention and VWM might be dissociable and complementary (e.g., Johnson et al., 2008; Tas et al., 2016). In the present study,
we investigated whether and to what degree visual attention is a necessary
condition for effective encoding in VWM. Moreover, we explored how the
memorability of objects depends on their high-level contextual information
(e.g., their congruency or location). Twenty-six young participants performed a change detection task on 192 photographs of naturalistic scenes
(96 experimental/change trials, 96 fillers/no change trials), while being
eye-tracked. Three conditions of target object were considered: Congruency
(it became another object), Location (it moved to another location) or Both
(it changed and moved). We implemented the change using a gaze contingency paradigm. This was done to ensure that the object was always looked
at during the study (or encoding) phase, prior to the retention interval (900
ms). We analyzed accuracy and response time for correct trials. We found
that participants were better able to remember a change when Both features changed than when the target object changed Location (second best)
or Congruency (worst and slowest). Crucially, the closer participants’ eye
fixations were to the target object, and the higher the similarity in scan-patterns during encoding and recall, the more likely it was that they correctly
detected the change. These results suggest that visual attention is predictive
of effective VWM, especially when the object does not change in location.
This condition is difficult to discriminate by resorting to extra-foveal strategies only.
Motion: Biological motion
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Banyan Breezeway
23.3021 Identity Matching of Unfamiliar People from Point-Light
Biological Motion Asal Baragchizadeh1([email protected]
edu), Alice O’Toole1; 1School of Behavioral and Brain Sciences, University
of Texas at Dallas
Point-light displays (PLDs) (Johansson, 1973) present compelling depictions of humans in motion and contain useful information for action
(Dittrich, 1993; Kozlowski & Cutting, 1977) and gender perception (Kozlowski & Cutting, 1978). The few studies that have tested person recognition
from PLDs provided weak support for biological motion as an identity
cue, but only when participants were asked to name the familiar people
depicted. Here, we examined the role of biological motion for identification using an identity-matching task (same or different person) for a large
number of unfamiliar identities. We tested a broad range of actions, including walking, running, jumping forward, and boxing. Participants (n = 39)
matched identities in 120 pairs of PLDs and responded using a 5-point scale
(1: sure the same person to 5: sure different people). Subjects viewed PLD
pairs of same action (e.g., both walking) and different actions (e.g., walking and boxing). Results showed performance accuracy well above chance
in the same-action condition (mean a-ROC = .70, 95% CI [0.68, 0.73], p <
0.0001). In the different-action condition, accuracy was moderate and also
greater than chance (mean a-ROC = .59, 95% CI [0.57, 0.62], p < 0.0001).
As expected, identity discrimination was more accurate when the pairs
performed the same action rather than different actions (p < 0.0001). For
same-action trials, the quality of identity information varied with action
type (cf., also Loula et al., 2005). Jumping forward yielded the highest
a-ROC score (M=.77, SD=.22), followed by walking (M=.70, SD=.09), and
running (M=.63, SD=.21). Boxing yielded the lowest a-ROC score (M=.62,
SD=.33). In combination with previous work (Cutting & Kozlowski, 1977;
Beardsworth & Buckner, 1981; Loula et al., 2005), the current results suggest that biological motion cues provide reliable information not only for
discriminating the identity of familiar people, but also for discriminating
unfamiliar identities.
23.3022 Categorizing features of coordination from joint actions Joseph Burling1([email protected]), Hongjing Lu1; 1University
of California, Los Angeles
Our ability to perceive others’ actions and coordinate our own body movements accordingly is essential for interacting with the social world. Interacting with others often requires precise control of our own limbs and body to
adapt to sudden changes in movement. However, during passive observation of joint action between two persons, are observers sensitive to specific
features of coordinated movement, and do groups of features emerge for
different types of social actions? Participants viewed short video sequences
showing two actors performing ten different interpersonal interactions,
such as shake hands, high-five, etc. In some trials, temporal misalignments
were introduced that shifted one actor’s movements forward or
backward in time relative to the partner actor. The temporal offsets varied in magnitude for each lead/lag condition (exact timing depended on
the total action duration). Participants rated degree of interactiveness on
a scale of 1–7. First, we compared human interactiveness ratings across
joint actions and found a significant interaction between offset magnitude
and joint action type, F(9, 454) = 10.7, p < .001. We found that temporal misalignment did not alter participant ratings for some joint actions, e.g., shake
domain, we systematically varied the number of noise dots in which the
stimuli were embedded using an adaptive staircase approach. Contrary to
our prediction, both groups showed equal sensitivity to global bird motion
with no inversion cost. However, consistent with previous work showing a
robust inversion effect for human motion, both groups were more sensitive
to upright human walkers than their inverted counterparts. Thus, at least
under the conditions of our experiment, our result suggests that experience
in the bird domain does not influence the sensitivity to global bird motion.
However, the presence of an inversion effect with humans, but not with birds, suggests that
motion recognition within the two domains relies on different mechanisms.
Acknowledgement: NSF BCS-1353391
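The adaptive staircase mentioned in the abstract above is not specified further, so the Python sketch below shows only one common variant: a 2-down/1-up rule applied to the number of noise dots. The rule, step size, floor, and function name are all assumptions for illustration. Because more noise makes detection harder, a run of correct responses adds noise dots and an error removes them.

```python
def staircase_step(noise_dots, correct_streak, was_correct,
                   step=5, n_down=2, floor=0):
    """Return (new_noise_dots, new_correct_streak) after one trial."""
    if was_correct:
        correct_streak += 1
        if correct_streak >= n_down:
            # Enough consecutive correct responses: make the task harder.
            return noise_dots + step, 0
        return noise_dots, correct_streak
    # Any error: make the task easier, but never go below the floor.
    return max(floor, noise_dots - step), 0


# Hypothetical run: correct, correct, error.
level, streak = 20, 0
for response in (True, True, False):
    level, streak = staircase_step(level, streak, response)
```

Averaging the noise level at the reversal points of such a run is one conventional way to estimate a detection threshold.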
23.3025 Motion information reducing manipulations can bias the
23.3023 Subcortical and cortical responses to local biological
motion as revealed by fMRI and MEG Dorita Chang1([email protected]),
Hiroshi Ban2,3, Yuji Ikegaya2,4, Ichiro Fujita3, Nikolaus Troje5; 1Department
of Psychology, The University of Hong Kong, Hong Kong, 2Center for
Information and Neural Networks (CiNet), NICT, Japan, 3Graduate School
of Frontier Biosciences, Osaka University, Japan, 4Graduate School of
Pharmaceutical Sciences, The University of Tokyo, Japan, 5Department of
Psychology, Queen’s University, Canada
We report findings from both human fMRI (n = 35), and MEG (n = 10)
experiments that tested neural responses to dynamic (“local”, acceleration)
cues in biological motion. We measured fMRI responses (3T Siemens Trio,
1.5 mm³) to point-light stimuli that were degraded according to: 1. spatial
coherency (intact, horizontally scrambled with vertical order retained, horizontally scrambled with vertical order inverted); 2. local motion (intact,
constant velocity); and 3. temporal structure (intact, scrambled). Results
from MVPA decoding analyses revealed surprising sensitivity of subcortical (non-visual) thalamic area ventral lateral nucleus (VLN) for discriminating local naturally-accelerating biological motion from constant velocity motion, in addition to a wide cortical network that extends dorsally
through the IPS and ventrally, including the STS. Retaining the vertical
order of the local trajectories resulted in higher accuracies than inverting
it, but phase-randomization did not affect (discrimination) responses. In a
separate experiment, different subjects were presented with the same stimuli while magnetic responses were measured using a 360-channel
whole-head MEG system (Neuromag 360, Elekta; 1000 Hz sampling frequency).
Results revealed responses in much of the same cortical network identified
using fMRI, peaking at 100-150 ms, and again at 350-500 ms after stimulus
onset during which we also observed important functional differences with
greater activity in hMT+, LO, and STS for structure-from-motion versus
the local natural acceleration stimulus, and greater early (V1-V3) and IPS
activity for the local natural acceleration versus constant velocity motion.
We also observed activity along the medial surface by 200 ms. The fact that
medial activity arrives distinctly following early cortical activity (100-150
ms), but before the 350-500 ms window suggests that the implication of
thalamic VLN for biological motion perception observed with fMRI may
have arisen from early cortical responses, but not higher order extrastriate
23.3024 Examining the role of motion in expert object recognition. Simen Hagen1([email protected]), Quoc Vuong2, Lisa Scott3, Tim
Curran4, James Tanaka1; 1Department of Psychology, University of
Victoria, 2Institute of Neuroscience, Newcastle University, 3Department of
Psychology, University of Florida, 4Department of Psychology and Neuroscience, University of Colorado Boulder
Motion information contributes to multiple functions during the early
stages of vision (e.g., attract attention, segment objects from the background); however, it also contributes to later stages of object recognition.
For example, human observers can detect the presence of a human, judge
its actions, judge its gender and identity simply based on motion cues conveyed in a point-light display. In the current study we examined whether
real-world experience in an object domain can influence the sensitivity to the
motion of objects within that domain. People with and without extensive
experience in the bird domain were shown point-light displays of upright
and inverted birds in flight, or upright and inverted human walkers, and
asked to discriminate them from spatially scrambled point-light displays of
the same stimuli. While the spatially scrambled stimuli retained the local
motion of each dot of the moving objects, they disrupted the global percept
of the object in motion. To estimate a detection threshold in each object
Acknowledgement: ARI grant W5J9CQ-11-C-0047 and NSF
discrimination of sex in biological motion perception Eric Hiris1([email protected]), Danielle Brzezinski1, Alayna Stein1; 1Department of
Psychology, University of Wisconsin - La Crosse
The relative importance of motion and form information in biological
motion perception has been debated in the biological motion literature.
Several techniques have been used to study the role of form information
in biological motion perception, including presenting stationary single
frames of biological motion to remove motion entirely, or reducing the
motion information available in the display by (1) presenting sequential
position walkers where the point-lights move on the limbs across frames
(Beintema & Lappe, 2002; Beintema, Georg, & Lappe, 2006) or (2) presenting size-changing biological motion where the size of the entire biological
motion display varies across frames (Lappe, Wittinghofer, & de Lussanet,
2015). We compared these various methods of investigating the role of form
in biological motion perception. Specifically, we compared performance on
a sex discrimination task in biological motion in four conditions: (1) normal
biological motion, (2) static frame of normal biological motion, (3) sequential position biological motion, and (4) size-changing biological motion. The
results showed that discriminability of the sex of the actor, as measured
by the slope of the best fitting logistic regression functions, was highest in
normal biological motion and significantly lower in the other conditions
that remove or reduce motion information. In addition to these expected
results, sequential position biological motion and size-changing biological
motion also created a significant bias in sex discrimination where the displays were biased to be perceived as more male (as measured by the 50%
point of the function). These results suggest that some manipulations of
biological motion stimuli may create significant biases in biological motion
perception that are not likely due to the removal of motion information per
se. Future research is needed to explain the basis of the bias and may lead to
greater understanding of the mechanisms of biological motion perception.
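The two measures named in this abstract, the slope of the fitted logistic function and its 50% point, can be illustrated with a minimal sketch. It assumes a hypothetical stimulus axis (negative values for more female-typical motion, positive for more male-typical) and made-up, noise-free response proportions. The fit is ordinary least squares on logit-transformed proportions, which is a simplification and not necessarily the fitting procedure used in the study.

```python
import math


def logistic(x, intercept, slope):
    """Probability of a "male" response at stimulus level x."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def fit_logistic(levels, proportions, eps=1e-3):
    """Least-squares line through logit-transformed proportions.

    Proportions are clipped to [eps, 1 - eps] so the logit stays finite.
    Returns (intercept, slope) of the logistic function.
    """
    logits = []
    for p in proportions:
        p = min(max(p, eps), 1.0 - eps)
        logits.append(math.log(p / (1.0 - p)))
    n = len(levels)
    mean_x = sum(levels) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, logits))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fifty_percent_point(intercept, slope):
    """Stimulus level at which both responses are equally likely."""
    return -intercept / slope


# Hypothetical data generated from a known function (intercept 0.5, slope 1.2).
levels = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
proportions = [logistic(x, 0.5, 1.2) for x in levels]
intercept, slope = fit_logistic(levels, proportions)
bias = fifty_percent_point(intercept, slope)
```

On this assumed axis, a steeper slope corresponds to better discriminability, and a 50% point below zero means that an ambiguous display is more often judged male.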
23.3026 Priming and Adaptation in Biological Motion Perception Hongjing Lu1,2([email protected]), Yujia Peng1; 1Department
of Psychology, UCLA, 2Department of Statistics, UCLA
Recent perceptual experiences can alter visual perception in two distinct
ways: priming and adaptation-induced aftereffects. While priming typically leads to facilitation effects, associated with faster and/or more accurate responses to the same stimulus, adaptation results in repulsive aftereffects, revealed by slower and/or less accurate responses to the same stimulus. In the present study, we examined priming and adaptation effects
in biological motion perception, and their interactions with part/whole
body structure. On each trial, participants first viewed a walking action
(S1) presented as either the whole-body motion, or as subparts of body
movements (bipedal leg movements, bilateral arm-leg movements with the
same motion direction, and unilateral arm-leg movements with the opponent motions). The S1 stimuli were presented in the forms of point-light
or skeleton walkers for either a short duration (100 ms) or a long duration
(500 ms). After the presentation of S1, a point-light whole-body walker was
shown briefly as the second stimulus (S2) for 200 ms. Participants were
asked to judge the facing direction of the walker in S2. Results showed that
with a short duration of S1 (100 ms), both whole-body and subpart movements in S1 elicited robust priming effects but with different facilitation
magnitudes. For subpart movements, bipedal feet movements and bilateral
arm-leg movements showed stronger priming effects than did unilateral
arm-leg movements, suggesting that some subparts of body movements
may be encoded in the hierarchical representation of actions. When the S1
stimulus was presented for a long duration (500 ms), the whole-body skeleton display yielded an adaptation effect, and no other conditions yielded
hands, tug-of-war, arguing and threaten. However, ratings varied depending on the temporal direction of misalignments for other joint actions, such
as catch, high-five, chicken dance, skipping and threaten. Second, based
on rating distributions across joint actions, we fit a generative probabilistic
cluster model to group the distributions into latent classes, revealing shared
characteristics among sets of joint actions. The resulting clusters organized
joint actions by the dimensions of average rating score and sensitivity to
offset directionality. Further analysis on the clustered structure of joint
actions revealed that global motion synchrony, spatial proximity between
actors, and local, brief, but highly salient moments of interpersonal coordination are critical features that impact judgments of interactiveness.
aftereffect, suggesting that biological motion adaptation depends on the
global whole-body representation of actions. These findings also indicate
that a transition from priming to adaptation depends on the temporal duration of the first stimulus.
Acknowledgement: NSF BCS-1353391
23.3027 Seeing illusory body movements in human causal interactions Yujia Peng1([email protected]), Hongjing Lu1,2; 1Department of
Psychology, University of California Los Angeles, 2Department of Statistics, University of California Los Angeles
Goal-directed actions entail causality. One person moving his limbs in a
certain way causes another person to react, creating a meaningful interaction. For example, seeing a friend throwing a ball towards you causes
you to raise your arm to catch it. If humans are sensitive to the causal relation between two individual actions, then one action may provide information about the causal history of a static frame of the other action: the
causal actions that generated the posture change over time. The present
study examined whether human causal interactions can induce a percept
of gradual motion between two distinct postures. The stimuli involved
an interactive action, with one actor throwing an object and another actor
catching it. The object itself was not presented. On each trial, the thrower
was shown first, followed by a brief presentation of the catcher, while the
thrower continued his movements throughout the entire trial. During the brief presentation, the catcher demonstrated either a sudden posture change (two
static posture frames) or a gradual posture change with smooth movements
(multiple frames). The two actors either showed a meaningful interaction
(i.e. they faced each other), or a non-interactive situation (i.e. they faced
away), or were presented upside-down. Participants judged whether the
catcher showed a sudden or gradual posture change. We found that in the
interaction condition, the proportion of trials in which a sudden change
was misidentified as a gradual change was significantly higher than in
the non-interactive or the inverted conditions. This finding suggests that
observers were more likely to perceive illusory gradual motions when
body movements were consistent with a causal interpretation of two actors
interacting to achieve a common goal. To account for the human results,
a Bayesian model was developed that incorporated inferred expectations
based on causal actions.
Acknowledgement: NSF BCS-1353391
23.3028 How Do We Recognize People in Motion? Noa Simhi1(noa.
[email protected]), Galit Yovel1,2; 1The School of Psychological Sciences,
Tel-Aviv University, 2The Sagol School of Neuroscience, Tel-Aviv University
Person recognition has been primarily studied with static images of faces.
However, in real life we typically see the whole person in motion. This
dynamic exposure provides rich information about a person’s face and
body shape as well as their body motion. What is the relative contribution
of the face, body and motion to person recognition? In a series of studies,
we examined the conditions under which the body and motion contribute
to person recognition beyond the face. In these studies, participants were
presented with short videos of people walking towards the camera and
were asked to recognize them from a still image or a video that was taken
on a different day (so recognition was not based on clothing or external
facial features). Our findings show that person recognition relies primarily on the face, when facial information is clear and available. However,
when facial information is unclear or at a distance the body contributes to
person recognition beyond the face. Furthermore, although person recognition based on the body alone is very poor, the body can be used for person recognition when presented in whole person context and in motion. In
particular, person recognition from uninformative faceless heads attached
to headless bodies was better than recognition from the body alone. Additionally, person recognition from dynamic headless bodies was better than
recognition from multiple static images taken from the video. Overall our
results show that when facial information is clearly available, person recognition is primarily based on the face. When facial information is degraded,
body, motion and the context of the whole person are used for person recognition. Thus, even though the face is the primary source of information
for person identity, information from the body contributes to person recognition in particular in the context of the whole person in motion.
23.3029 Dynamics of multistable biological motion perception Louisa Sting1([email protected]), Leonid
Fedorov2, Tjeerd Dijkstra2, Howard Hock3, Martin Giese2; 1Department
of Computer Science, Cognitive Science Center, University of Tuebingen, 2Center for Integrative Neuroscience, HIH, UKT, University of Tuebingen, 3Center for Complex Systems and the Brain Sciences, Department of
Psychology, Florida Atlantic University
The dynamic stability of percepts has been extensively studied in low-level
motion (Hock et al. 2003, 1996). A manifestation of dynamic stability is
the perceptual hysteresis shown for a pair of mutually exclusive motion
stimuli. So far hysteresis effects have not been investigated in biological
motion perception. Its measurement requires a parameter that controls the
relative bias of perception for the two alternatives. We developed such a
stimulus for biological motion perception and investigated dynamic stability. METHODS: Our stimulus is based on the fact that body motion perception from two-dimensional movies can be bistable (Vanrie et al. 2004),
alternating between two different percepts. We developed a new stimulus
by randomly sampling two shaded volumetric walkers covered with 1050
circular discs. The fraction of discs drawn from either walker is a hysteresis
parameter that allows the preference for the two perceived walking directions to be varied gradually.
We conducted two experiments: I. Measurement of the
times before the first perceptual switch as a function of the hysteresis parameter. II. Measurement of a hysteresis loop, varying the hysteresis parameter
gradually up and down. This experiment adapted the Modified Method of
Limits (Hock et al., 1993). RESULTS: Experiment I shows that, depending
on the hysteresis parameter, the new stimulus can induce both an unambiguous perception of walking direction and perceptual bistability. The
average switching time is smallest if both percepts are equally likely and it
depends systematically on the hysteresis parameter (p < 10^-15). Experiment
II measured the percept probabilities as a function of the hysteresis parameter. These probabilities are significantly dependent on previous values of
the parameter (i.e. whether it was increasing or decreasing), implying perceptual hysteresis (p < 0.01). CONCLUSION: We demonstrated that body
motion perception, like low-level motion perception, shows indicators of
dynamic multi-stability.
Acknowledgement: HFSP RGP0036/2016 European Commission HBP FP7ICT-2013-FET-F/ 604102 Fp7-PEOPLE-2011-ITN (Marie Curie): ABC
PITN-GA-011-290011, German Federal Ministry of Education and Research:
BMBF, FKZ: 01GQ1002A, Deutsche Forschungsgemeinschaft: DFG GI 305/4-1,
DFG GZ: KA 1258/15-1.
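The disc-sampling stimulus described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: each walker is represented by a hypothetical list of 1050 disc descriptions with matching indices, a fixed count of discs (rather than an independent draw per disc) is taken from walker A, and the function names are invented for this example. The abstract does not specify these details.

```python
import random


def mix_walkers(discs_a, discs_b, fraction_a, rng):
    """Build one disc set from two walkers.

    discs_a and discs_b hold the same number of disc descriptions, one per
    disc index. A fixed share of indices, fraction_a, is taken from walker A
    and the remainder from walker B.
    """
    if len(discs_a) != len(discs_b):
        raise ValueError("both walkers need the same number of discs")
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie in [0, 1]")
    n = len(discs_a)
    from_a = set(rng.sample(range(n), round(fraction_a * n)))
    return [discs_a[i] if i in from_a else discs_b[i] for i in range(n)]


def hysteresis_sweep(steps):
    """Parameter values that rise from 0 to 1 and then fall back to 0."""
    up = [i / steps for i in range(steps + 1)]
    return up + up[-2::-1]


rng = random.Random(0)
walker_a = [("A", i) for i in range(1050)]  # placeholder disc descriptions
walker_b = [("B", i) for i in range(1050)]
frame = mix_walkers(walker_a, walker_b, 0.3, rng)
sweep = hysteresis_sweep(4)
```

Running the same sweep upward and then downward is what allows percept probabilities at a given parameter value to be compared between the increasing and decreasing branches.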
Visual Search: Features and objects
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Banyan Breezeway
23.3030 Target prevalence in a search task transfers to another
search task if their search items look visually similar Han-Gyeol
Son1([email protected]), Hyung-Bum Park1, Joo-Seok Hyun1; 1Department of Psychology, Chung-Ang University
The probability of target presence can affect accuracy and speed of a visual
search task, and this is known as the target prevalence effect. The present study
reports that target prevalence of a visual search that was once performed
can influence another subsequent search with neutral target prevalence
(i.e., 50%) if the search arrays are visually similar. In the experiments, participants performed two independent search tasks across trials where one
had a target prevalence of 10, 50, or 90% (prevalence-search), while the
other had 50% (neutral-search). In the target-mismatch condition, the target
for each task differed in the target-relevant feature (e.g., different Landolt
gap-openness), but the search items across the two tasks shared a target-irrelevant feature (e.g., rounded black Landolt Cs), making the search arrays
look visually similar. Conversely, in the array-mismatch condition, the target for each task shared the target-relevant feature (e.g., the same Landolt
gap-openness), but the search items across the two tasks differed in their
target-irrelevant feature (e.g., rounded black Landolt Cs vs. angulated
white Landolt Cs), making the search arrays look dissimilar. The results
showed that target prevalence manipulation of the prevalence-search
influenced accuracy and RTs of neutral search trials exclusively in the target-mismatch condition, indicating that target prevalence of a search task
can transfer to another search task if their search items look similar. They
further suggest that contextual information such as target prevalence in a
daily search task can influence another search task if the tasks share objects
that are visually similar rather than dissimilar.
23.3031 Motor Biases Do Not Account for the Low Prevalence
Effect Chad Peltier1([email protected]), Mark Becker1; 1Michigan State
The low prevalence effect (LPE) is an increase in miss errors as target prevalence decreases in a visual search task. There are three proposed causes of
this effect: a decrease in quitting threshold, a conservative shift in criterion,
and a target absent motor bias. Per the motor bias hypothesis, the frequent
target absent responses that occur in low target prevalence searches bias
the target absent motor response. Occasionally this prepotent motor bias
results in an erroneous target absent button press despite a target detection.
Fleck and Mitroff (2007) found that the LPE was eliminated when observers
could make a corrective response to overcome the prepotent target absent
response. Several researchers have since found that allowing a corrective
response or controlling for motor biases does not eliminate the LPE. Here
we predict that motor biases will influence search performance only when
the response-to-response time between trials is minimal. We investigate our
hypothesis by allowing corrective responses under different conditions. In
Experiment 1 we manipulate target prevalence and set size. Results show
that motor biases contribute to the LPE only when the set size is small,
thereby producing short response-to-response intervals between trials. In
Experiment 2, we manipulate the Inter-Response-Interval (IRI) to find the
time it takes to eliminate effects of motor biases on the LPE. The results
show that as the IRI increases, the effects of motor biases decrease. Overall,
we show that motor biases not only fail to account for the LPE, but also
fail to influence search performance when target presence judgements are
separated by enough time. These results indicate that researchers do not
need to control for motor biases in time consuming serial search tasks and
that real-world searches where trials last several seconds are unlikely to be
influenced by motor biases.
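One of the proposed causes listed in this abstract, a conservative shift in criterion, is usually quantified with equal-variance signal detection theory. The sketch below is a generic illustration with made-up hit and false-alarm rates; it is not an analysis reported by the authors, and the function name is invented for this example.

```python
from statistics import NormalDist


def sensitivity_and_criterion(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) from equal-variance signal detection theory.

    Rates must lie strictly between 0 and 1. A positive criterion indicates a
    conservative observer who tends to respond "target absent".
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    return z_hit - z_fa, -(z_hit + z_fa) / 2.0


# Made-up rates chosen only to show the direction of a criterion shift.
high_prevalence = sensitivity_and_criterion(0.90, 0.10)
low_prevalence = sensitivity_and_criterion(0.70, 0.02)
```

With these illustrative rates, sensitivity is nearly the same in both cases while the criterion moves from about zero to a clearly positive value, which is the pattern a criterion-shift account of more misses at low prevalence would predict.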
23.3032 Target prevalence in a visual search task differentially
modulates lure effects from visual working memory Beatriz Gil
Gómez de Liaño1([email protected]), Trafton Drew2, Daniel
Rin1, Jeremy Wolfe3; 1Universidad Autónoma de Madrid, 2University of
Utah, 3Brigham & Women’s Hospital-Harvard Medical School
Performance on Visual Search (VS) is driven by bottom-up, stimulus-based
information and the top-down state of the observer: What is she looking
for? What else is on her mind? Here, we investigate the top-down effects of
holding task-irrelevant objects in working memory during VS. Observers
searched through 16 real-world objects looking for a target, while maintaining other specific objects in working memory (1 or 4). The memorized
objects could appear as “lures”: distractors in the VS task. We varied target
prevalence in VS (4%, 50%, 96%, and a 100% target present condition). Our
results demonstrate that lure effects clearly depend on target prevalence.
For target absent trials, RTs are longer when a lure is present for almost all
conditions, particularly in the 96% condition, and this effect did not interact
with target prevalence. This slowing may reflect the cost of recognizing
and disengaging from the lure. For target present trials, at low (4%) prevalence, lures significantly decreased RTs (p< .001), while there was no effect
of lures for 50% and 96%. This speeding may be related to the elevation
of miss error rates. Perhaps finding a lure encouraged quicker search termination, which could be related to the “satisfaction of search” effect in
radiology. In fact, real-world search tasks vary widely, from low prevalence
tasks like radiologists screening for cancer to high prevalence tasks like
looking for posts on a trending topic on Twitter. The present results suggest
that the effects of distracting information held in working memory depend
on the nature of the VS, and remind us that prevalence has complex effects
on search performance.
Acknowledgement: PSI2015-69358-R (MINECO, Spain)
23.3033 Temporal dynamics of attentional templates Anna
Grubert1([email protected]), Martin Eimer2; 1Department of
Psychology, Durham University, 2Department of Psychological Sciences,
Birkbeck, University of London
Attentional templates (representations of target features) are activated
prior to search to guide spatial attention to target-matching events in the
visual field. To investigate the temporal dynamics of preparatory template
activation, we developed a new rapid serial probe presentation technique
and measured N2pc components to these probes during single-colour and
multiple-colour search. Participants searched for colour-defined targets
that appeared together with different-colour distractors in circular search
displays. During this task, a continuous stream of circular probe displays
appeared at a different location closer to fixation. These task-irrelevant displays contained one coloured and five grey items. All probe and search
displays were presented for 50ms and were separated by a 200ms stimulus onset asynchrony. In Experiment 1, target colour was constant (e.g.,
red), and probe arrays contained either a target-colour or a distractor-colour singleton. Only target-matching probes triggered an N2pc, indicative
of template-guided attentional capture. No N2pc was elicited for probes
that appeared directly after a preceding search display and the N2pc was
largest for probes that immediately preceded the next search display. This
demonstrates that template activation is not constant, but is modulated in
line with temporal task parameters. A control experiment showed that the
apparent transient template de-activation following search displays is not
an automatic consequence of target identity or response-related processing.
Analogous N2pc results were found in Experiment 2, where target colour
alternated predictably across successive search displays (e.g., red, green,
red), and singleton probes either matched the previous or the upcoming
target colour. Both types of probes triggered a similar N2pc pattern, suggesting simultaneous activation of colour target templates during multiple-colour search even when the upcoming target colour is known. These
results show that our new rapid serial probe presentation method can provide novel electrophysiological insight into the time course of attentional
Acknowledgement: This work was supported by grant ES/K006142/1 from the
Economic and Social Research Council (ESRC), UK.
23.3034 Visual search through displays of data Christine
Nothelfer1([email protected]), Steven Franconeri2; 1Northwestern
With the increasing availability and importance of data, the human visual
system serves as a critical tool for analysis of patterns, trends, and relations
among those data. Building on recent translational visual search work in
domains like baggage screening (e.g., Mitroff et al., 2015) and radiology
(e.g., Wolfe, 2016), we explored how different ways of representing data
values can lead to efficient or inefficient visual processing of the relations
between the values in a data pair. We asked participants to find a particular
relation among the opposite relation (e.g., small/large value pairs among
large/small), under a variety of common, and manipulated, data encoding
methods: bullet graphs (one value as a bar, the other as a threshold dash
through the bar), line graphs, connected dash graphs, area graphs, dash
graphs (lines placed where the tops of bars in a bar graph ‘would be’),
adjacent bars, and separated bars. Displays were divided into quadrants
containing 1-5 data pairs each, for a total set size of 4-20 items. Participants
were asked to quickly indicate which quadrant contained the target data
pair. The choice of data depiction led to enormous differences in processing
efficiency for relations between values (ranked fast to slow in the order
listed above), ranging from flat search slopes (line graphs), through medium search slopes of
69 ms/pair (dash graphs), to severely steep search slopes of 115 ms/pair
(separated bar graphs), one of the most ubiquitous encoding types. Visual
search for relations can be strikingly serial, but performance can improve
substantially with small changes to displays. Exploring visual search and
relation processing in the context of data visualization displays may provide both a rich case study for basic research on the mechanisms of search
and concrete guidelines for the students and scientists who use vision
to process and convey patterns in data.
23.3035 Visual search in large letter arrays containing words: are
words implicitly processed during letter search? Maria Falikman1,2,3([email protected]); 1Department of Theoretical and
Applied Linguistics, Lomonosov Moscow State University, 2Cognitive
Research Lab, National Research University Higher School of Economics,
3Department of Psychology, Russian Academy for National Economy and
Public Affairs
Acknowledgement: This work was supported by the Ministry of Education of
the Republic of Korea and the National Research Foundation of Korea (NRF-2016S1A5A2A01026073 & NRF-2014S1A5A2A03066219).
The word superiority effect (Cattell, 1886) has been discussed in psychology for
more than a century. However, a question remains whether automatic word
processing is possible without its spatial segregation. Our previous studies
of letter search in large letter arrays containing words without spatial segregation revealed no difference in performance and eye movements when
observers searched for letters always embedded in words, never embedded in words, or when there were no words in the array (Falikman, 2014;
Falikman, Yazykov, 2015). Yet both the percentage of participants who
noticed words during letter search and their subjective reports whether
words made search easier or harder significantly differed for target letters
within words and target letters out of words. In the current study, we used
the Process Dissociation Procedure (Jacoby, 1991) to investigate whether
words are processed implicitly when observers search for letters. Two
groups of participants, 40 subjects each, performed a 1-minute search for 24
target letters (either Ts, always within words, or Hs, always out of words)
in the same letter array of 10 pseudorandom letter strings, 60 letters each,
containing 24 Russian mid-frequency nouns. After that, they filled in two
identical word-stem completion forms, each containing the same 48 word
beginnings (24 for words included in the array). First, the participants were
instructed to use words that could appear in the search array (“inclusion
test”), then – to avoid using such words (“exclusion test”). Comparison
of conscious and unconscious processing probabilities revealed no difference between them (with the former not exceeding 0.09 and the latter not
exceeding 0.11), no difference between the two conditions, and no interaction between the factors. We conclude that, despite subjective
reports, words embedded in random letter strings are mostly not processed
either explicitly or implicitly during letter search, and that automatic unitization requires spatial segregation.
Acknowledgement: Fundamental Research Program, National Research University Higher School of Economics
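The conscious and unconscious processing probabilities mentioned in this abstract are typically derived from inclusion and exclusion test performance. The sketch below assumes the standard independence equations of the procedure (inclusion = C + A(1 - C), exclusion = A(1 - C)); the completion rates are made up for illustration, and any baseline correction the authors may have applied is not modelled.

```python
def process_dissociation(p_inclusion, p_exclusion):
    """Estimate controlled (C) and automatic (A) contributions.

    Assumes the standard independence equations:
        inclusion = C + A * (1 - C)
        exclusion = A * (1 - C)
    """
    controlled = p_inclusion - p_exclusion
    if controlled >= 1.0:
        raise ValueError("automatic estimate is undefined when C equals 1")
    automatic = p_exclusion / (1.0 - controlled)
    return controlled, automatic


# Made-up word-stem completion rates, not data from the study.
controlled, automatic = process_dissociation(0.30, 0.20)
```

When inclusion and exclusion rates are nearly equal, the controlled estimate approaches zero, so any remaining completions with array words are attributed to the automatic component.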
23.3036 The guidance of attention by features and feature configurations during shape/shape conjunction search Cody McCants1([email protected]), Nick Berggren1, Martin Eimer1; 1Department of
Psychological Sciences, Birkbeck, University of London, UK
During visual search, attentional templates guide the allocation of attention
to objects with template-matching features. When targets are defined by
a conjunction of two colours, these guidance processes appear to operate
independently for each of these features. As a result, there is no effective
attentional guidance when targets defined by a particular configuration
(e.g., blue above red) are accompanied by a distractor with the reverse
configuration (e.g., red above blue) in the same display. This has been
demonstrated recently with ERP markers of attentional selectivity (N2pc
components; Berggren & Eimer, 2016). The present study investigated
attentional guidance in tasks where targets were defined by the configuration of two different shapes (e.g., hourglass above circle). Participants
reported the presence or absence of such combined-shape targets while
ignoring non-target items that could be non-matching (e.g. hexagon above
cross), partially-matching (e.g., cross above circle) or fully-matching in the
reverse configuration (e.g., circle above hourglass). The pattern of N2pc
components measured for target-present and target-absent displays again
revealed evidence for independent attentional guidance by each of the
two target-defining shapes. However, and in contrast to colour configuration
search, attention was also guided by shape configuration. When a target
and a reverse-configuration nontarget appeared in the same display, an
N2pc was triggered by the target, although it was delayed and attenuated
relative to the target N2pc triggered in the absence of a template-matching
distractor. Similar N2pc results were observed when the two shapes were
adjacent (thus forming a single object) and when they appeared in different
quadrants, demonstrating that guidance by shape configuration was not
simply based on search templates for fused target objects. Results show that
in contrast to colour, conjunction search in the shape domain can be guided
by target templates for spatial configurations.
University, Russia, 3Department of General Psychology, University of
Padova, Italy, 4Human Inspired Technology Research Centre, University
of Padova, Italy
Real world objects have a variety of features with different probability distributions. A tree leaf can have a unimodal hue distribution in summer that
changes to a bimodal one in autumn. We have previously shown that perceptual systems can learn not only summary statistics (mean or variance),
but also distribution shapes (probability density functions). To use such
information observers need to relate it to spatial locations and other features. We investigated whether observers can do this during visual search.
Ten observers looked for an odd-one-out line among 64 lines differing in
orientation. Each observer participated in five conditions consisting of
interleaved prime (5-7 trials) and test (2 trials) streaks. Distractors on prime
streaks were randomly drawn from a mixture of two Gaussian distributions (10° SD) or a mixture of Gaussian and uniform (20° range) with means
located ±20° from a random value. The target was oriented 60° to 90° away
from the mean of the resulting bimodal distribution. During test streaks,
both target and distractor mean changed with distractors randomly drawn
from a single Gaussian distribution. In the spatially-bound condition, the
two prime distributions were spatially separated with distractors from one
distribution on the left, the rest on the right. In the feature-bound condition,
distractors from one distribution were blue, the others yellow (target color
was randomly yellow or blue). We analyzed RTs on test trials by distance in
feature space relative to distractor distributions on prime streaks and target
location or color. Separation of distributions by location and, to a lesser
extent, by color, allowed observers to encode them separately. However,
the properties of one distribution affected encoding of another. The results
demonstrate the power and limitations of distribution encoding: observers
can encode more than one distribution simultaneously, but each resulting
representation is affected by other distributions.
Acknowledgement: Supported by Russian Foundation for Humanities (#15-3601358) and Icelandic Research Fund (#152427).
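The prime-streak displays described in this abstract can be sketched as a sampling procedure. The following covers only the two-Gaussian variant and rests on several assumptions not stated in the abstract: 63 distractors plus one target make up the 64 lines, the two mixture components are equally weighted, and orientations wrap with a period of 180°. The function names are invented for this example.

```python
import random


def orientation_distance(a, b):
    """Smallest angle between two line orientations (period 180 degrees)."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def sample_prime_display(rng, n_distractors=63, sd=10.0, offset=20.0):
    """Draw distractor orientations from a two-Gaussian mixture plus a target."""
    center = rng.uniform(0.0, 180.0)  # random value the two means straddle
    distractors = []
    for _ in range(n_distractors):
        # Each distractor comes from one of two Gaussians at center -/+ offset.
        mean = center + rng.choice((-offset, offset))
        distractors.append(rng.gauss(mean, sd) % 180.0)
    # The target lies 60 to 90 degrees from the mean of the bimodal mixture.
    shift = rng.choice((-1.0, 1.0)) * rng.uniform(60.0, 90.0)
    target = (center + shift) % 180.0
    return center, distractors, target


rng = random.Random(1)
center, distractors, target = sample_prime_display(rng)
```

In the spatially-bound and feature-bound conditions, each distractor would additionally carry a screen side or a colour determined by which of the two components it was drawn from.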
23.3038 Does Orientation Matter? The Effects of Target Orientation in Multiple Target Visual Search Stephen Adamo1, Joseph Nah1,
Andrew Collegio1, Paul Scotti1, Sarah Shomstein1; 1Department of Psychology, The George Washington University
Multiple-target visual searches, where more than one target can be present
in a search array, are subject to Subsequent Search Miss (SSM) errors: a
decrease in second target detection after successful detection of the first
target. While SSM errors have been known in radiology for over 50 years,
their underlying cause remains elusive. The perceptual set account predicts
that SSM errors are driven by target similarity, such that a second target
is more likely to be missed if it is dissimilar to a previously found target.
Biggs et al. (2015) provided strong initial evidence for this account by
exploring how different types of target similarity affect SSM errors. If a second target shared the identity, color, or category of a previously found target, observers made fewer SSM errors. However, target orientation was not
investigated as a measure of similarity. Here, we investigated SSM errors in
a multiple target search, with targets that appeared in the same or different
orientation. Observers were asked to search for up to two targets: high- or low-salience target letter T’s and L’s, amongst low-salience pseudo T/L
distractors. Four search items were independently rotated either 0°, 90°,
180°, or 270° and presented for 400ms equally centered around a fixation
point. The results demonstrated an SSM effect with decreased low-salience
target accuracy after a high-salience target was detected compared to single, low-salience, target accuracy. However, there was improved second
target detection when both targets shared identity (i.e., both T’s or both L’s)
and orientation, compared to when both targets were either different types
(i.e., a T and an L) or different orientations (e.g., two T’s of different orientations). These results provide novel evidence suggesting that SSM errors
are impacted by the rotation of targets, providing further evidence for the
perceptual set account. Acknowledgement: ESRC
Acknowledgement: NSF BCS-1534823 to S. Shomstein
23.3037 Binding feature distributions to locations and to other features Andrey Chetverikov1,2([email protected]), Gianluca Campana3,4, Árni
Kristjánsson1; 1Faculty of Psychology, School of Health Sciences, University of Iceland, Iceland, 2Department of Psychology, Saint Petersburg State
23.3039 The Influence of Color and Form Information on Visual
Search Guidance and Verification Times Mark Becker1([email protected]
msu.edu), Ryan Wujcik1, Chad Peltier1; 1Department of Psychology, Michigan State University
23.3040 The Grass isn’t Greener: No detriment for red-green color
deficiency in search for camouflaged targets Alyssa Hess1(alyssa.
[email protected]), Mark Neider1; 1Department of Psychology, College
of Sciences, University of Central Florida
Visual search is an essential task we perform every day. Color is an important feature for guiding search, yet those with visual color impairments
often live normal and unassisted lives (Wolfe, 1994). Those with red-green
color vision deficiencies are at a specific disadvantage when discriminating natural terrains, which often lie within wavelengths between 557 – 589
mμ (Hendley & Hecht, 1949). Despite this, those with color-deficiencies are
able to successfully analyze complex natural scenes in recognition tasks,
suggesting some sort of visual accommodation for impoverished chromatic information (Gegenfurtner, Wichmann & Sharpe, 1996). Previously,
we investigated how search changes in natural images when targets are
not only camouflaged, but also presented without color information (in
grayscale), finding that accuracy suffered for those searching without color
information. However, it is not yet understood how those with deficiencies
search natural images for obscured targets. In this experiment, we compared those with normal vision to those with red-green color deficiencies
in a visual search task for a camouflaged target in natural, wooded images.
We found no significant differences between those with color-deficiencies
and those with normal vision in response time or accuracy, suggesting two
possible conclusions. First, color information may not be necessary to guide
attention in this unique type of search task. Alternatively, those with red-green deficiencies might reprioritize visual information in order to guide
search in natural scenes.
Acknowledgement: This work was supported by grant number N000141512243
from the Office of Naval Research to MBN
23.3041 Physical Properties Guide Visual Search for Real-world
Objects Li Guo1([email protected]), Susan Courtney1, Jason Fischer1;
1Department of Psychological and Brain Sciences, The Johns Hopkins
Finding a missing earring in a jewelry box can be a frustrating challenge,
but it becomes easier if the earring differs in color, shape, or size from other
items in the box. These visual attributes and many others help to guide our
attention toward the items of interest. Does our knowledge of objects’ physical properties – e.g., that the earring is hard, smooth, and dense – also guide
our search? Would we be faster to locate the earring if it appeared among
soft objects rather than hard ones? Here, we tested observers’ ability to use
their physical knowledge about everyday objects to guide their attention
in visual search. We presented participants with search arrays comprising
sixteen objects. The objects were rated on perceived hardness by a separate
group of online participants, and in each search array a target object was
paired either with 15 distractors of similar hardness (e.g., soft target among
soft distractors) or 15 distractors of different hardness (e.g., soft target
among hard distractors). Participants (n=24) were asked to find the target
object among the distractors after viewing a word label of that target for 1s.
They pressed a key after locating the target and then indicated the target
location with a mouse click. We found that participants were faster to locate
a target object when it appeared among distractors of different hardness
(1.28s±0.05) vs. distractors of similar hardness (1.64s±0.08; t(23)=5.04; p<
0.001). Critically, this effect was intact after controlling for any influences of
image luminance, contrast, color, shape, and semantic content. Our results
indicate that observers can use their knowledge of objects’ physical properties to guide their visual search toward likely targets. These findings point
toward an important role of physical knowledge in guiding how we engage
with visual scenes in daily life.
23.3042 Task-irrelevant optic flow guides overt attention during
visual search Yoko Higuchi1([email protected]),
Terumasa Endo2, Satoshi Inoue2, Takatsune Kumada1; 1Graduate school of
Informatics, Kyoto University, 2TOYOTA Motor Corporation
It is known that basic visual features such as color or contrast capture
attention. Recent research suggests that optic flow also attracts attention
even when it is irrelevant to participants’ task. However, the impact of an
individual’s attentional set on attentional capture to optic flow is poorly
understood. In two experiments, we examined whether task-irrelevant
optic flow induces attentional capture under different conditions of a participant’s attentional set. The first experiment aimed to confirm the course
of attentional guidance via optic flow in a visual search task. Participants
had to find a target Gabor patch with an orientation different from the distractors. Prior to onset of the search display, a task-irrelevant optic flow display
was presented for 1, 3, or 5 sec. Results indicated that all three optic-flow
exposure conditions yielded faster search times when the target was presented at the expanding point of optic flow (EPOF) than when the target
appeared at another location. Moreover, eye movement analysis revealed
that the first saccade headed for the EPOF. In the second experiment, a task-irrelevant
color circle was presented in the search display. This procedure
ensured that participants’ attention would be directed to the color circle, which
was presented concurrently with the target, if participants were sensitive to feature singletons. However, results revealed that the optic flow continued to
strongly guide attention. In other words, a color singleton does not override attentional capture created by optic flow. These results suggest that
optic flow quickly guides attention toward the expanding point regardless
of participants’ attentional set.
23.3043 Effects of prior knowledge on visual search in
depth Bochao Zou1,2([email protected]), Yue Liu1, Jeremy Wolfe2;
1School of Optoelectronics, Beijing Institute of Technology, China, 2Visual
Attention Lab, Harvard Medical School and Brigham & Women’s Hospital, United States
In visual search, if observers know the target has a certain feature in advance,
they can restrict search to potential elements that share this feature. A limited set of features (e.g. color and motion) can support this phenomenon.
What about knowledge of target depth, signaled by binocular disparity?
Can observers restrict search to disparity-defined depth planes? Previous
studies have produced somewhat different results in reaction time
and search efficiency. We hypothesized that more stable guidance by disparity
might be seen if more time was provided for observers to resolve depth in
the stereoscopic displays. In Experiment 1, observers searched for a 2 among
5s. To provide this time prior to search, figure-eight placeholders
were used. Placeholders were presented in depth from the beginning of
each trial and then changed into 2s or 5s after a 1500msec delay. Three conditions were compared: one depth plane or two depth planes with targets
cued to be in the front or back plane. Searches in both two-depth conditions
were significantly more efficient than in the one-depth condition (34 vs.
49ms/item). Perfect guidance would predict that two-depth RTxSetSize
slopes should be half of the one-depth slopes. However, they were not that
shallow. In Experiment 2, placeholders were presented in a cloud of 12 possible depths. Observers were cued toward the front, back, or middle of the
depth cloud or no cue was given. The cue was probabilistic, meaning that
a front cue indicated that the target was most likely in the closest depth plane
In visual search, a working memory representation of the search target
guides attention to similar items, and is used to verify whether an inspected
item is the target. Research comparing picture to text-based search cues
finds that picture cues produce better guidance (reduced time to first fixate
the target) and shorter verification times (reduced time between fixating the
target and response). These findings suggest that a high fidelity working
memory representation benefits both search guidance and target verification. Here we investigated the source of this picture cue benefit to determine
how the cue’s visual form and/or color information differentially impact
guidance and verification times. Given that visual acuity drops precipitously with eccentricity, visual form information may be unlikely to influence guidance to the periphery, but may benefit verification. By contrast,
color information is likely to influence guidance but may have little influence on verification times. To test the impact of color and visual form information, subjects searched colored arrays of photo-realistic objects while we
manipulated the cue type. Cues were gray-scaled pictures, color pictures,
text labels, or text-labels with color information (e.g., “blue shoe”). Tracking eye movements allowed us to parse reaction times into guidance and
target verification phases. Results show that color information improved
guidance, and did so to a similar extent for both picture and text cues. To
our surprise, form information also improved guidance and this improvement was additive with the color benefit, and of about the same magnitude.
In terms of verification times, form information reduced verification times,
but color information did not reduce verification times for text-based cues.
Finally, during guidance, color’s impact occurred within the first fixation
while the benefit of form information was slightly delayed. These results
suggest unique impacts of form and color information on search guidance
and target verification processes.
with lower probabilities at further depth planes. With these more heterogeneous displays, no benefits of prior knowledge of depth were observed. In
a control, we confirmed that guidance by color works with our paradigm.
Depth guidance works, but may be limited to near and far.
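The perfect-guidance prediction for Experiment 1 can be checked with simple arithmetic. The sketch below assumes that perfect guidance to one of two equally populated depth planes would halve the effective set size; the function name and the guidance index are illustrative assumptions, not part of the reported analysis.

```python
def predicted_guided_slope(unguided_slope_ms: float, n_planes: int) -> float:
    """Slope expected if search is restricted to one of n equally populated planes."""
    return unguided_slope_ms / n_planes


one_depth_slope = 49.0  # ms/item, reported for the one-depth condition
two_depth_slope = 34.0  # ms/item, reported for the two-depth conditions

perfect = predicted_guided_slope(one_depth_slope, n_planes=2)

# Illustrative index (an assumption): the fraction of the maximum possible
# slope reduction that was actually observed.
guidance_index = (one_depth_slope - two_depth_slope) / (one_depth_slope - perfect)

print(f"perfect-guidance slope: {perfect:.1f} ms/item")
print(f"guidance index: {guidance_index:.2f}")
```

Under these assumptions the observed 34 ms/item lies between the unguided slope and the perfect-guidance slope, consistent with the statement that the two-depth slopes were "not that shallow".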
tion of the Next item become increasingly random since the next item has
not been found. The results show that observers are actively searching for
and finding the Next target before they finish collecting the Current target.
Acknowledgement: National Eye Institute (NEI) Grant No. EY017001, National
973 Program of China (2013CB328805).
23.3046 Category supersedes identity in visual search: Attentional
23.3044 Into the Woods: Characterizing and Training Detection of
Camouflaged Targets in Natural Scenes Dawn Sarno1([email protected]
knights.ucf.edu), Alyssa Hess1, Joanna Lewis1, Ada Mishler1, Corey Bohil1,
Arthur Kramer2, Mark Neider1; 1Department of Psychology, College of
Sciences, University of Central Florida, 2Departments of Psychology and
Engineering, Colleges of Science and Engineering, Northeastern University
Search performance has been shown to decline as target-background similarity increases (Wolfe, Oliva, Horowitz, Butcher, & Bompas, 2002). For
some tasks, such as searching for a camouflaged enemy, this decrement in
performance can mean life or death. Previous research has suggested that
performance on these difficult search tasks can be improved through training (Hess, Wismer, Bohil, & Neider, 2016) and, importantly, this training
has been found to transfer to novel stimuli (Neider, Ang, Voss, Carbonari,
& Kramer, 2013). The goal of the present study was to develop a training
intervention to improve detection of camouflaged targets in natural scenes
and engender transfer to untrained targets and backgrounds. The training
task consisted of searching for camouflaged targets derived from distorted
patches of a wooded scene. Following training, transfer to new background
classes was assessed utilizing novel wooded and urban scenes; transfer to
new target classes was assessed with three novel target types: blur, lens
flare, and geometric. Participants were assigned to one of three training
groups (adaptive, massed, or control) and trained over 14 one-hour sessions. In the adaptive group, target difficulty varied on a trial-to-trial basis
depending on performance; the massed group received increasingly more
difficult targets as they progressed through the training sessions. Following training, both training groups showed evidence of transfer of training
to novel wooded scenes compared to the control group, with the adaptive
group showing the strongest evidence of transfer (average 1.5s decrease in
response times, 7% increase in accuracy). The adaptive group also demonstrated transfer of training to several novel target classes. Our findings suggest that adaptively training participants to detect camouflaged targets in
natural scenes can engender transfer of training to untrained background
and target types.
Acknowledgement: This work was supported by grant number N000141512243
from the Office of Naval Research to MBN
23.3045 When does visual search move on?: Using the color wheel
to measure the dynamics of foraging search Anna Kosovicheva1([email protected]), Joseph Feffer2, Abla Alaoui Soce3, Matthew Cain3,5,
Jeremy Wolfe3,4; 1Department of Psychology, Northeastern University,
2State College Area High School, 3Brigham and Women’s Hospital, 4Harvard Medical School, 5U.S. Army Natick Soldier RD&E Center
When foraging for multiple instances of a visual target, can observers begin
to search for the next item before collecting the current target or must they
complete the current search first? To answer this question, we examined the
temporal dynamics of foraging search using a novel dynamic color technique. Observers searched for 2–16 Ts among 9–23 Ls while all items continuously varied independently in color. The trial terminated after a pseudorandom number of targets had been clicked. At that point, observers were
shown a color response palette and asked to report either the color of the
“Current” target they had just clicked or the “Next” target they intended to
click. The difference between the observers’ color response and the actual
color of the item at the end of the trial gives an estimate of the time (relative
to the end of the trial) when they found the target. If observers were guessing, color responses would be uniformly distributed. However, distributions for Current and Next trials were narrower than those
expected by random guesses, indicating that observers were able to report
the color of the Next item on some proportion of the trials. Observers’ color
responses were also consistent with sequential acquisition of the targets.
Average responses for Current targets corresponded to colors shown 330
ms before the end of the trial, while average responses for Next targets
occurred 174 ms before the end of the trial. As targets become sparser, it
takes longer to find the Next target and, thus, reports of the color and loca-
Acknowledgement: National Eye Institute (NEI) Grant No. EY017001
templates reflect participants’ category knowledge in both item
and set searches Brianna McGee1([email protected]), Chelsea Echiverri1,
Benjamin Zinszer2, Rachel Wu1; 1University of California, Riverside,
2University of Rochester
Prior research has shown that category search is similar to 1-item search
(as measured by the N2pc ERP marker of attentional selection) because
items in a category can be grouped into one attentional template. The present study investigated whether the perceived size of a familiar category
impacts the attentional template used when searching for a category or specific items from that category. Critically, the perceived size of the categories
was based on prior knowledge, rather than the experimental stimuli. We
presented participants with sixteen items: eight from a smaller category
(social media logos) and eight from a larger category (manufacturing company logos). We predicted that search for smaller categories would rely on
a better-defined attentional template compared to larger categories, and
therefore produce a larger N2pc. Twenty adult participants completed four
search tasks: Search 1) specific social media logo (e.g., Facebook); Search 2)
specific manufacturing logo (e.g., Xbox); Search 3) any social media logo;
Search 4) any manufacturing logo. Neither reaction time nor accuracy differed between searches for social media logos or manufacturing logos, and
familiarity measures showed that both categories were equally familiar to
the participants. However, only searches in the social media category (for
either a specific item or any item from the category) produced a significant
N2pc. No N2pc was found in either item or category search for manufacturing logos. Our results show that participants’ knowledge about a category’s
size influences the way they search for both a specific item from the category and the whole category.
23.3047 Modeling categorical search guidance using a convolutional
neural network designed after the ventral visual pathway Gregory Zelinsky1([email protected]), Chen-Ping
Yu2; 1Departments of Psychology and Computer Science, Stony Brook
University, 2Department of Psychology, Harvard University
Category-consistent features (CCFs) are those features occurring both
frequently and consistently across the exemplars of an object category.
Recently, we showed that a CCF-based generative model captured the overt
attentional guidance of people searching for name-cued target categories
from a 68-category subordinate/basic/superordinate-level hierarchy (Yu,
Maxfield, & Zelinsky, 2016, Psychological Science). Here we extend this
work by selecting CCFs for the same 68 target categories using an 8-layer
Convolutional Neural Network (CNN) designed to reflect areas, receptive
field (kernel) sizes, and bypass connections in the primate ventral stream
(VsNet). We replicated our previously-reported finding that the number of
CCFs, averaged over categories at each hierarchical level, explains the subordinate-level guidance advantage observed in gaze time-to-target. However, we now show stronger guidance to individual categories for which
more CNN-CCFs were extracted (r=0.29, p=0.01). We also found that CCFs
extracted from VsNet’s V1-V2 layers were most important for guidance
to subordinate-cued targets (police cars); CCFs from TEO-V4 and V4-TE
contributed most to guidance at the basic (car) and superordinate (vehicle) levels, respectively. This pattern suggests a broad coding of priority
throughout the ventral stream that varies with cue specificity; early visual
areas contribute more to subordinate-level guidance while less specific cues
engage later areas coding representative parts of the target category. Converging evidence for this suggestion was obtained by finding the image
patches eliciting the strongest filter responses and showing that these units
from areas V4 and higher had receptive fields tuned to highly category-specific parts (police car sirens). VsNet also better captured observed attentional guidance behavior, and achieved higher classification accuracy, than
comparable CNN models (AlexNet, Deep-HMAX), despite VsNet having fewer convolutional filters. We conclude that CCFs are important for
explaining search guidance, and that the best model for extracting CCFs is
a deep neural network inspired by the primate visual system.
Acknowledgement: This work was supported by NSF grant IIS-1161876 to G.J.Z.
23.3048 How the Heck Did I Miss That? How to use the hybrid
search paradigm to study “incidental finding” errors in radiology. Jeremy Wolfe1,2([email protected]), Abla Alaoui Soce1;
1Brigham and Women’s Hospital, 2Harvard Medical School
Acknowledgement: NEI EY017001
23.3049 “Deep” Visual Patterns Are Informative to Practicing
Radiologists in Mammograms in Diagnostic Tasks Jennevieve Sevilla1,2([email protected]),
Jay Hegdé1,2,3; 1Brain and Behavior Discovery
Institute, 2James and Jean Culver Vision Discovery Institute, Augusta University, Augusta, GA, 3Department of Ophthalmology, Medical College of
Georgia, Augusta University, Augusta, GA
“Deep Learning” is a form of perceptual learning in which the trainee learns
to perform a given task by learning the informative, often abstract, statistical patterns in the data from a relatively large set of labeled examples. We
have previously reported that, using deep learning, naïve, non-professional
human observers can be trained to detect camouflaged objects in natural
scenes, or anomalies in radiological images (Chen and Hegdé, Psychol Sci
2014; Hegdé, J Vis 2014). By systematically manipulating the deep visual
patterns (e.g., principal components [PCs]) using image synthesis algorithms, we have identified the patterns that such non-professional ‘experts’
use in detecting cancers in screening mammograms. But it is not known
whether or to what extent practicing radiologists can or do use the same
patterns. To help address this issue, we tested practicing radiologists (N
= 9; 3 mammography specialists) under comparable conditions. Briefly,
either original mammograms or synthesized counterparts that were missing 0 to 2 of the previously characterized PCs were viewed ad libitum one
per trial. Depending on the trial, subjects indicated whether the mammogram contained a cancer (detection task), or whether the image was original
or synthesized (discrimination task). Subjects were unable to discriminate
original vs. synthesized images when the latter contained all PCs (d′ = 0.38,
p > 0.05), indicating that the two sets of images were mutually perceptually
metameric. In the detection task, the performance of the radiologists covaried with the cumulative eigenvalue of the PCs in the image and with that
of the non-professional subjects (two-way ANCOVA, eigenvalue x training
mode; p < 0.05 for both factors and interaction). Together, our results indicate that at least some of the visual patterns used by professionally trained
radiologists are the same as those learned and used by non-professionals
trained in the laboratory.
Acknowledgement: This study was supported by the U. S. Army Research Office
grants W911NF-11-1-0105 and W911NF-15-1-0311 to Jay Hegdé.
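The discrimination result above is reported as a sensitivity index d′. As a reminder of how such an index is typically obtained, here is a minimal sketch of the standard equal-variance signal detection formula, d′ = z(hit rate) − z(false-alarm rate). The abstract does not say how d′ was computed, so the log-linear correction and the trial counts below are assumptions for illustration only.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int,
            false_alarms: int, correct_rejections: int) -> float:
    """Equal-variance d' from response counts.

    A log-linear correction (add 0.5 to each cell) keeps the rates
    strictly between 0 and 1 so the inverse normal CDF stays finite.
    """
    z = NormalDist().inv_cdf
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts: chance-level observer vs. a sensitive observer.
print(round(d_prime(50, 50, 50, 50), 2))
print(round(d_prime(80, 20, 20, 80), 2))
```

A d′ near zero, as in the first hypothetical case, means the two image classes could not be told apart.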
23.3050 Predicting airport screening officers’ visual search competency
with a rapid assessment Stephen Mitroff1([email protected]),
Justin Ericson1, Benjamin Sharpe2; 1The George Washington University,
2Kedlin Company
Visual search is a vital cognitive ability for a variety of professions, including airport security, radiology, and the military. Given the importance of
such professions, it is necessary to maximize performance, and one means to
do so is to select individuals based upon their visual search competency.
Recent work has suggested that it is possible to quickly classify individuals as strong or weak visual searchers (Ericson, Kravitz, & Mitroff, Psychonomic Society 2016), demonstrating that those who started out faster and
more accurate were more likely to have superior performance later in the
task. A critical question is whether it is possible to predict search competency within a professional search environment. The current study examined whether a relatively quick visual search task could predict professional searchers’ actual on-job performance. Over 600 professional searchers from the USA Transportation Security Administration (TSA) completed
an approximately 10-minute assessment on a tablet-based XRAY simulator
(derived from Airport Scanner; Kedlin Co.). The assessment contained 72
trials that were simulated XRAY images of bags. Targets (0 or 1 per trial)
were drawn from a set of 20 prohibited items, and distractors (5 to 15 per
trial) were taken from a set of 100 allowed items. Participants searched for
prohibited items and tapped on them with their finger. Two tutorials had
to be successfully completed prior to the assessment. Performance on the
assessment significantly related to three on-job measures of performance
for the TSA officers: (1) detecting simulated threat items projected into
actual carry-on bags, (2) detecting real threat items covertly introduced into
the checkpoint, and (3) an annual proficiency exam. These findings suggest
that it may be possible to quickly identify potential hires based on their core
visual search competency, which could provide organizations the ability to
make new hires and assess their current workforce.
Acknowledgement: US Transportation Security Administration
Visual Memory: Long term and working
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4001 Context transitions modulate perceptual serial dependence
Anastasia Kiyonaga1([email protected]), Mauro Manassi1,
Mark D’Esposito1, David Whitney1; 1Helen Wills Neuroscience Institute
Serial dependence in perception describes when visual stimuli appear more
similar to recently-attended stimuli than they truly are. By smoothing perception over the ever-changing image on the retina, this bias is thought
to stabilize our visual experience from one moment to the next. Although
this perceptual continuity is generally helpful, it fundamentally reflects
a misapprehension of the current stimulus, and could therefore interfere
with processing when the previous information is no longer relevant to
current goals (i.e., proactive interference). If serial dependence between
successive perceptual instances were flexible and adaptive, therefore, it
should be reduced when environmental cues trigger the segmentation of
visual episodes. Event boundaries (i.e., transitions to a new episode or
context) are thought to update one’s goal state in working memory, and
may thereby signal that the upcoming information should be segregated
from what came before. Accordingly, we expect a stable context to promote
serial dependence in perception, whereas a context shift should flush the
lingering trace of recent perception from memory, and curtail its influence
on current processing. We tested this hypothesis by periodically changing
the background color (i.e., context) during a continuous series of orientation judgments. On each trial, participants saw a randomly-oriented grating, then adjusted a response bar to match their perceived orientation of
the stimulus. Across all trials, orientation responses were attracted toward
the orientation from the previous trial. On the first trial in a new context,
however, this serial dependence was eliminated, suggesting that the (typically attractive) previous trial information was purged at the start of a new
visual episode. In contrast, all subsequent trials in a given context showed
significant serial dependence. These data suggest that context transitions
can update goal settings to dampen the bias toward recently-attended stimuli when it may no longer serve current goals.
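Serial dependence of this kind is commonly quantified as the response error on the current trial, signed by the direction of the previous trial's orientation. The sketch below shows one way to do that for orientations, which wrap every 180°. The function names and example values are hypothetical, and the abstract does not describe the authors' actual analysis.

```python
import math


def orientation_diff(a_deg: float, b_deg: float) -> float:
    """Signed difference a - b for orientations, wrapped to [-90, 90)."""
    return (a_deg - b_deg + 90.0) % 180.0 - 90.0


def attraction(prev_stim: float, cur_stim: float, cur_resp: float) -> float:
    """Response error signed so that positive means 'toward the previous stimulus'."""
    error = orientation_diff(cur_resp, cur_stim)
    rel_prev = orientation_diff(prev_stim, cur_stim)
    if rel_prev == 0.0:
        return 0.0  # previous and current orientations coincide: no direction
    return math.copysign(1.0, rel_prev) * error


# Hypothetical trials: the same +4 degree error counts as attraction when the
# previous grating was at 30 degrees and as repulsion when it was at 170.
print(attraction(prev_stim=30.0, cur_stim=10.0, cur_resp=14.0))
print(attraction(prev_stim=170.0, cur_stim=10.0, cur_resp=14.0))
```

Averaging this signed error separately for first-in-context trials and later trials would give the kind of comparison the abstract describes.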
When radiologists perform one task (e.g. Does this patient have pneumonia?), they are also expected to search for “incidental findings” that might
be clinically significant (e.g. signs of lung cancer). Unfortunately, these
incidental findings are missed at rates higher than is desirable. Moreover,
the same lesion that would be found if it were the object of search, can be
missed when it is an incidental finding. To develop techniques to address
this problem, we have designed a hybrid search analog task that can be
used with non-experts. In hybrid search, observers look for an instance
of any of several candidate targets held in memory. Reaction time (RT)
increases linearly with the visual set size and linearly with the log of the
number of targets held in memory. The same pattern is seen with search for
categorical targets (e.g. find any cat, car, coin, or cookie), but these targets
produce longer RTs. To simulate the incidental finding situation, observers
search for any of three specific and three categorical targets. Specific targets
are the analog of the radiologist’s specific task. Categorical targets are the
analog of the incidental findings. They are known to the observer but less
well-defined than the specific targets. When categorical and specific targets
are mixed within a block, observers miss more than twice as many categorical targets as they do specific targets. Observers miss fewer categorical
targets if all targets in a block are categorical. Observers miss the fewest
targets when all were specific. In a mixed block with 4X as many specific
targets as categorical targets, the categorical target miss rate becomes very
large (38%), mimicking the pattern of incidental finding errors in radiology.
Given this ‘model system’, we can test interventions that could reduce the
incidental error rate in the lab and in the clinic.
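The hybrid search pattern described above (RT linear in visual set size and linear in the log of the memory set size) can be written as a small predictive function. This is only a sketch: it assumes the two effects are additive, and all parameter values are made up for illustration rather than fitted to the reported data.

```python
import math


def hybrid_search_rt(visual_set_size: int, memory_set_size: int,
                     base_ms: float = 400.0,
                     ms_per_item: float = 40.0,
                     ms_per_log2_target: float = 100.0) -> float:
    """Predicted RT (ms): linear in visual set size, linear in log2(memory set size).

    All three parameters are hypothetical placeholders.
    """
    if visual_set_size < 1 or memory_set_size < 1:
        raise ValueError("set sizes must be at least 1")
    return (base_ms
            + ms_per_item * visual_set_size
            + ms_per_log2_target * math.log2(memory_set_size))


# Doubling the memory set adds a constant RT cost at a fixed visual set size.
for m in (1, 2, 4, 8):
    print(m, hybrid_search_rt(visual_set_size=8, memory_set_size=m))
```

Under this sketch, each doubling of the number of memorized targets costs the same amount of time, which is what a logarithmic memory-set effect implies.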
23.4002 Evidence for sequential access in visual long-term
memory Inder Singh1([email protected]), Aude Oliva2, Marc Howard1;
1Center for Memory and Brain, Boston University, 2CSAIL, Massachusetts
Institute of Technology
One of the most well-known results in recognition memory tasks is that
the response time increases and accuracy decreases with an increase in
the lag between the item and the probe. These effects are usually explained by
changes in memory strength with lag. Models of memory that include a
temporal dimension allow for a mechanism of sequential access. We used a
continuous recognition paradigm with highly memorable pictures to mitigate changes in accuracy and enable a detailed examination of the effect
of recency on retrieval dynamics across three experiments. The recency
at which the pictures were repeated ranged over more than two orders
of magnitude from immediate repetitions after a few seconds to tens of
minutes. Analysis of the RT distributions showed that the time at which
memories became accessible changed with the recency of the probe item.
Despite changes in accuracy across the three experiments, we see a consistent slope of the first decile of the RT distributions with logarithm of the
intervening lag. The linear trend in RT on a log scale suggests an underlying compressed temporal dimension. Analyses of RT distributions showed
that the time to initiate memory access varies with log(lag). Additional
analyses revealed that this effect was not attributable to an effect of immediate repetitions nor to increased processing fluency of the probe. These
results suggest that visual memories can be accessed by sequentially scanning along a compressed temporal representation of the past. The form of
the compression is closely analogous to the compression associated with
cortical magnification in vision.
Acknowledgement: NSF, AFOSR
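The log-lag analysis described in this abstract can be illustrated with a minimal sketch. It assumes response times are grouped by intervening lag, takes the first decile of each group, and fits a least-squares line against log10(lag). The function names and the synthetic data are hypothetical and are not taken from the authors' analysis.

```python
import math
import statistics

def first_decile(rts):
    """Return the first decile (10th percentile) of a list of response times."""
    # quantiles(n=10) returns nine cut points; the first one is the first decile.
    return statistics.quantiles(rts, n=10)[0]

def slope_vs_log_lag(rts_by_lag):
    """Least-squares slope and intercept of first-decile RT against log10(lag)."""
    xs = [math.log10(lag) for lag in rts_by_lag]
    ys = [first_decile(rts) for rts in rts_by_lag.values()]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical synthetic data: RTs (ms) shift by 50 ms per tenfold increase in lag.
demo = {lag: [500 + 50 * math.log10(lag) + k for k in range(0, 200, 10)]
        for lag in (1, 10, 100)}
slope, intercept = slope_vs_log_lag(demo)
print(f"slope per log10 unit of lag: {slope:.1f} ms")
```

A constant slope across lags spanning several orders of magnitude is what a logarithmically compressed timeline would predict; a linear timeline would instead give a slope that grows with lag on this scale.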
23.4003 Different Limits on Fidelity in Visual Working Memory and
Visual Long Term Memory Natalie Kataev1([email protected]),
Andrei Teodorescu2, Ron Hajaj1, Roy Luria1,3, Yonatan Goshen-Gottstein1;
1School of Psychological Sciences, Tel-Aviv University, 2The Institute of
Information Processing and Decision Making, Haifa University, 3Sagol
School for Neuroscience, Tel-Aviv University
How detailed are long-term memory (LTM) representations as compared
to those of working memory (WM)? Recently, Brady et al. (2013) suggested
that both types of memory are constrained by the same bound on fidelity,
after which the memory representation is lost. In their experiments, however, WM performance may have been contaminated with LTM representations. Here, we aimed to replicate their findings, while tapping a purer
measure of WM. In addition, we examined whether a representation of an
item can exist alongside the absence of color information. Participants were
presented with colored real-life objects and were asked to remember both
the items and their color. At test, participants judged whether the objects,
presented in grey, had previously appeared (item memory) and then chose
their color from a continuous-color wheel (color memory). This procedure
allowed us to examine the memory of an item separately from the memory
of its corresponding color, in a within-subject design. In the WM condition,
participants had three seconds to encode three colored objects, after which
they performed the item- and color-memory tasks for only a single object
out of the three. In the LTM condition, participants were presented with
hundreds of items, one at a time for three seconds each. They were subsequently tested for item and color memory. We calculated the variability
of internal representations of color (fidelity) and the probability of forgetting an object’s color. In replication of Brady et al. (2013), the probability of
guessing in LTM was found to be higher than in WM. However, the critical
analysis of fidelity revealed significantly better fidelity for WM. We also
found that items can be remembered while their color is lost, indicating that
item and color information are partly independent. We discuss the theoretical implications of different boundaries of WM and LTM.
23.4004 Enhanced perceptual processing of visual context benefits
later memory Megan deBettencourt1,2([email protected]), Nicholas
Turk-Browne3,4, Kenneth Norman3,4; 1Institute for Mind and Biology,
University of Chicago, 2Department of Psychology, University of Chicago,
3Princeton Neuroscience Institute, Princeton University, 4Department of
Psychology, Princeton University
Fluctuations in attention affect task performance in the moment, but can
also have long-lasting consequences by influencing memory formation.
These effects are typically studied by manipulating whether to-be-remembered objects or words are selectively attended during encoding. However,
a key determinant of memory is the temporal context in which stimuli are
embedded, not just the individual stimuli themselves. Here we examine
how attention to temporal context impacts subsequent memory. Participants in two fMRI experiments completed multiple runs of a memory-encoding task. In each run, they studied lists of sequentially presented words
for a later test. Between words, participants were rapidly presented with a
series of photographs from a single visual category, either faces or scenes.
These photographs served as the temporal context, and were not themselves tested for memory. At the end of the run, participants were asked
to recall as many words as possible from one of the lists. We trained a multivariate pattern classifier to decode the two possible contexts (face versus
scene) from an independent localizer task with no words. Applying this
classifier to the memory-encoding runs allowed us to measure the perceptual processing of the temporal context for a given list. As a manipulation
check, we were able to decode the visual category of the interleaved context photographs when collapsing across lists. Critically, list-wise variance
in this decoding related to list-wise variance in the number of words later
recalled. Moreover, within lists, there was more classifier evidence for the
category of the context surrounding, and even preceding, words that were
later remembered versus forgotten. Altogether, these findings suggest that
enhanced contextual processing may be one mechanism through which
attention can boost memory formation.
Acknowledgement: NIH R01 EY021755, NSF BCS 1229597, The John Templeton Foundation, Intel Corporation
23.4005 The impact of mnemonic interference on memory for
visual form Aedan Li1([email protected]), Celia Fidalgo1,
Andy Lee1,2, Morgan Barense1,2; 1Department of Psychology, University of
Toronto, 2Rotman Research Institute, Baycrest
How does interference impact memory? Previous work found that different types of distracting information can differentially alter how visual representations are forgotten. In addition, a recent series of experiments found
that highly dissimilar interfering items erase the contents of memory, while
highly similar and variable interfering items blur memory representations.
Though these effects have been shown for color memory, it is unclear if
they extend to other object features such as shape. Here, we used a novel
“Shape Wheel” to assess how different kinds of interference would impact
shape memory. On this wheel, 2D line drawings were morphed together to
create an array of 360 shapes, corresponding to 360 degrees on a circle. Participants were asked to remember a shape sampled from this wheel, then
were shown interfering shapes that were either perceptually similar to the
studied shape, perceptually dissimilar from the studied shape, perceptually variable, or scrambled shapes (baseline condition). We used a mixture
model to measure the probability that the item is stored in memory, defined
as accuracy, as well as the level of detail of that representation, defined as
precision. We found that when interfering shapes were similar to the studied item, a numerical but non-significant benefit to memory accuracy was
observed. However, when interfering shapes were dissimilar to the studied
item, accuracy was reduced. In contrast, memory precision was reduced
only when interfering shapes were similar or perceptually variable. These
findings extend previous results by demonstrating the differential effects
of interference for isolated feature-level shape information. Visually dissimilar interference erases memory representations, while visually variable
and highly similar interfering items tended to blur shape representations.
As the impact of interference was consistent across studies, these findings
may offer a set of general principles regarding how interference impacts
high-level object representations and all features therein.
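The mixture-model analysis mentioned in this abstract can be sketched as follows. This is a generic von Mises plus uniform mixture fitted by grid search, assuming response errors are measured in degrees on the 360-degree wheel. The parameter names `p_mem` (probability the item is in memory) and `kappa` (concentration, a proxy for precision) are illustrative, and the fitting procedure the authors used may differ.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order 0, from its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def mixture_pdf(error_deg, p_mem, kappa):
    """Density (per radian) of a response error under a von Mises + uniform mixture."""
    theta = math.radians(error_deg)
    von_mises = math.exp(kappa * math.cos(theta)) / (2 * math.pi * bessel_i0(kappa))
    uniform = 1.0 / (2 * math.pi)  # random guessing anywhere on the wheel
    return p_mem * von_mises + (1.0 - p_mem) * uniform

def fit_mixture(errors_deg, p_grid, kappa_grid):
    """Grid-search maximum-likelihood estimate of (p_mem, kappa)."""
    best = None
    for p in p_grid:
        for k in kappa_grid:
            ll = sum(math.log(mixture_pdf(e, p, k)) for e in errors_deg)
            if best is None or ll > best[0]:
                best = (ll, p, k)
    return best[1], best[2]
```

In this framing, interference that "erases" memory would show up as a lower `p_mem`, whereas interference that "blurs" memory would show up as a lower `kappa` with `p_mem` unchanged.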
23.4006 Does an unexpected task reset the contents of visual
working memory? Garrett Swan1([email protected]), Brad Wyble1,
Hui Chen2; 1Department of Psychology, College of Liberal Arts, Pennsylvania State University, 2Department of Psychology, Zhejiang University
It is well known that visual information can be held in memory while performing different tasks concurrently, such as remembering a color during a
separate visual search task. However, it is not clear whether we can maintain this information in the face of immediate unexpected tasks, such as a
surprise question. This question is relevant to our general understanding of
23.4007 Are memorable images easier to categorize rapidly? Lore
Goetschalckx1([email protected]), Steven Vanmarcke1, Pieter
Moors1, Johan Wagemans1; 1Laboratory of Experimental Psychology, Brain
and Cognition, KU Leuven
Some images we see stick in mind, while others fade. Recent studies of
visual memory have found remarkable levels of consistency for this inter-item variability across observers (e.g., Isola 2011), suggesting that memorability can be considered an intrinsic image property. However, the visual
features underlying memorability are not yet well understood. Investigating the relation between image memorability and inter-item variability
in other visual tasks can provide more insight. Here, we asked whether
an image that is easier to process and categorize is also more memorable.
We used a rapid-scene categorization task and assessed whether there are
consistent differences in difficulty between images in this task (defined
as “categorizability”) and whether they correlate with memorability. We
selected 14 scene categories and 44 images per category from a set previously quantified on memorability (Bylinskii 2015). Per trial, participants
saw an image for a duration of 32 ms, followed by a mask of 80 ms. Next, a
category label appeared on screen and the task was to indicate whether the
label matched the image. For each participant, a random half of the scenes
was presented as signal trials (i.e., label matches image), the other half as
no-signal trials. For signal trials, we collected on average 79 responses per
image. An image’s categorizability score was calculated as the proportion
of correct responses on signal trials. The average categorizability score per
category varied between .55 and .89. Thus, given the task context, some categories were considerably easier than others. For most categories, consistency scores were high (mean split-half Spearman’s rho up to .90), suggesting that categorizability is an intrinsic image property too. However, the
predicted positive correlation between categorizability and memorability
was not observed. This suggests that the ease with which an image can be
categorized relies on features distinct from those involved in memorability.
Acknowledgement: Research Foundation - Flanders (FWO)
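The categorizability score and the split-half consistency check described in this abstract can be sketched as below. As a simplifying assumption, this version splits each image's pool of signal-trial responses into random halves, whereas the study itself most likely split by participant. All function names are hypothetical.

```python
import math
import random

def categorizability(responses):
    """Proportion of correct responses (1 = correct, 0 = incorrect) on signal trials."""
    return sum(responses) / len(responses)

def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(xs, ys):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

def spearman(xs, ys):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

def split_half_consistency(responses_by_image, n_splits=100, seed=0):
    """Mean Spearman's rho between image scores from random halves of the responses."""
    rng = random.Random(seed)
    rhos = []
    for _ in range(n_splits):
        half_a, half_b = [], []
        for responses in responses_by_image:
            shuffled = list(responses)
            rng.shuffle(shuffled)
            mid = len(shuffled) // 2
            half_a.append(categorizability(shuffled[:mid]))
            half_b.append(categorizability(shuffled[mid:]))
        rhos.append(spearman(half_a, half_b))
    return sum(rhos) / len(rhos)
```

A high mean rho indicates that the ranking of images by difficulty is stable across independent halves of the data, which is the sense in which the score behaves like an image property.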
23.4008 Resource scarcity impairs visual online detection and
prospective memory Brandon Tomm1([email protected]),
Jiaying Zhao1,2; 1Department of Psychology, University of British Columbia, 2Institute for Resources, Environment and Sustainability
Operating under limited resources (e.g., money, time) poses significant
demands on the cognitive system. Scarcity induces attentional trade-offs
of information in the environment, which can impact memory encoding. In
three experiments (N=227) we demonstrate that people under time scarcity
failed to detect time-saving cues as they occur in the environment, suggesting that scarcity impairs the ability to detect online cues. These time-saving
cues, if noticed, would have saved more time for the time-poor participants,
alleviating the condition of scarcity. A follow-up experiment showed that
the visuospatial proximity of the time-saving cues to the focal task determined successful detection of the time-saving cues, suggesting that the
online detection errors can be explained by spatial attention on the task at
hand. Thus, time scarcity may cause attentional trade-offs whereby attention is focused on the task at hand, while ironically, other beneficial information is neglected as it occurs in the environment. We also demonstrate
that people under time scarcity were more likely to forget previous instructions to execute future actions, suggesting that scarcity causes prospective
memory errors. Ironically, the time-poor participants failed to remember
previous instructions which, if followed, would have saved them time.
These experiments show that scarcity impairs the online detection of beneficial information in the environment, as well as the execution of prospective memory cues. Failures of prospective memory and online detection
are particularly problematic because they cause forgetting and neglect of
beneficial information, perpetuating the condition of scarcity. The current
studies provide a new cognitive account for the counterproductive behaviors in individuals under resource scarcity, and have implications for interventions to reduce neglect and forgetting in the poor.
23.4009 Suppressing visual representations in long-term memory
with recognition Ashleigh Maxcey1([email protected]); 1The Ohio
State University
In this presentation, I will discuss a paradigm we have developed to look at
recognition-induced forgetting of visual objects. Recognition-induced forgetting occurs when practice recognizing an object, from a group of objects
learned at the same time, leads to worse memory for objects from that
group that were not practiced. This forgetting effect is commonly accompanied by improved memory for practiced objects. We have shown that
recognition-induced forgetting is not an artifact of category-based set size.
I will discuss our developmental work showing this forgetting effect comes
online by 6 years of age without a memory benefit for practiced objects until
9 years of age. Further, the forgetting appears to remain robust with healthy
aging in samples of older adults, without the benefit for practiced objects
shown in young adults but accompanied by a decrease in intrusion errors.
I will conclude by discussing our use of this paradigm to understand how
this forgetting phenomenon operates on temporally clustered objects and
stimuli of expertise, as well as our technique of using cathodal transcranial direct-current stimulation to DLPFC to examine the role of inhibitory
mechanisms in this forgetting phenomenon.
23.4010 Sequential whole-report reveals different states in visual
working memory Benjamin Peters1([email protected]),
Benjamin Rahm2, Stefan Czoschke1, Catherine Barnes1, Jochen Kaiser1,
Christoph Bledowski1; 1Institute of Medical Psychology, Goethe-University, Frankfurt am Main, Germany, 2Medical Psychology and Medical
Sociology, Albert-Ludwigs-University, Freiburg, Germany
Working memory (WM) provides rapid and flexible access to a limited
amount of information in the service of ongoing tasks. Studies of visual
WM usually involve the encoding and retention of multiple items, while
probing a single item only. Little is therefore known about how well multiple items can be reported from visual WM. Here we asked participants
to successively report each of up to eight simultaneously encoded Gabor
patch orientations from WM. Report order was externally cued, and stimulus orientations had to be reproduced on a continuous dimension. Participants were able to sequentially report items from WM with an above-chance precision even at high set sizes. Importantly, we observed that
precision varied systematically with report order: It dropped steeply from
the first to the second report but decreased only slightly thereafter. This
trajectory of precision was better captured by a discontinuous rather than
an exponential function, suggesting that items were reported from different
states in visual WM. Additional experiments showed that the steep drop in
precision between the first and subsequent reports could not be explained
by a retro-cue that selectively protected fragile visual WM representations
for the first reported item, the longer retention interval for later reported
items, or the visual interference by the first report. Instead, the drop in precision disappeared when participants performed an interfering task that
mimicked the executive demands of the report procedure after the retention interval and prior to the first report. The present study provided the
hitherto missing initial characterization of sequential reports from visual
WM. Taken together, these results suggest that a sequential whole-report
reveals qualitatively different states in visual WM that may differ in the
degree of dependence on executive functions.
Acknowledgement: German Research Foundation (DFG Grant BL 931/3-1)
visual working memory and attention and is also relevant for experimental paradigms utilizing surprise test methodologies. When considering the
results of experiments with unexpected questions, it is especially important
to determine if an inability to report information is due to the reorientation
to a new task imposed by the surprise question. To answer this question,
we ran two experiments where the instructions unexpectedly switched
from recognition to recall in a surprise trial. Half of the participants were
asked to report the same attribute (Exp1 = Identity, Exp2 = Color) of a target
stimulus in both pre-surprise and post-surprise trials, while for the other
half, the reported attribute switched from identity to color or vice versa.
Importantly, all participants had to read an unexpected set of instructions
and respond differently on the surprise trial. A decline in accuracy on the
surprise trial compared to the first control trial was only observed in the
different-attribute groups, but not in the same-attribute groups. Accuracy
on the surprise trial was also higher for the same-attribute groups than the
different-attribute groups. Furthermore, there was no difference in reaction
time on the surprise trial between the two groups. These results suggest
that information participants expected to report can survive an encounter
with an unexpected task. The implication is that failures to report information on a surprise trial in many experiments reflect genuine differences in
memory encoding, rather than forgetting or overwriting induced by the
surprise question.
23.4011 Surface and boundary organization of objects influences
visual short-term memory performance Benjamin McDunn1([email protected]), James Brown1; 1University of Georgia
Visual features of an object can be properties of either its surface (such as
color or texture) or its boundary (such as shape or size). Previous visual
short-term memory studies have focused on the importance of object number in determining capacity limits, but an “object” might have any number
of distinct surfaces and boundary contours. In the current study, we explore
how the organization of surfaces and boundaries that indicate two task-relevant features can influence memory performance. In Experiment 1, memory for two task-relevant features (color and orientation) was tested in four
display conditions utilizing different surface and boundary organizations
for the object stimuli. Experiment 2 tested the same four display conditions
using two boundary features, size and shape, as the task-relevant features.
Both experiments were conducted using both a full probe trial-type, where
all studied objects reappeared at test, and a partial probe trial-type, where
only one object reappeared at test. The combination of display conditions
and probe trial-types allows us to distinguish effects from both local proximity of the features and utilization of the global spatial layout of the display.
The results of Experiments 1 and 2 show significant differences depending on display type that interact with probe type. Interestingly, the results
suggest the differences between surface and boundary organizations of the
stimuli were mediated by differences in the utilization of either local proximity of the features or the global spatial layout of the display. This finding
suggests some effects of object status on memory performance observed in
previous studies may be mediated by how effectively these proximity cues
can be utilized by participants.
23.4012 The Role of Memory Uncertainty in Change Localization
Aspen Yoo1([email protected]), Luigi Acerbi1,2, Wei Ji Ma1,2;
1Department of Psychology, New York University, 2Center for Neural
Science, New York University
In many perceptual tasks, humans near-optimally use sensory uncertainty
information in their decisions. It is unknown whether they do so in decisions based on visual working memory (VWM). Some circumstantial evidence is available: humans’ confidence reports are positively correlated
with their errors in a delayed-estimation task (Rademaker et al., 2012), and
humans near-optimally integrate current knowledge of uncertainty with
working memories (Keshvari et al., 2012). However, it is unclear whether
people accurately store uncertainty information in VWM and use it in a
subsequent decision. To investigate this, we collected data in two change
localization tasks with variable stimulus reliability. Each trial consisted of a
sample array of four Gabors, a delay, and a test array of four Gabors. Participants reported which of the four Gabors changed in orientation. In Task
1, we used two levels of contrast to manipulate memory uncertainty. In the
sample array, the stimuli could be all high contrast, all low contrast, or two
of each. In the test array, stimuli were either all high or all low contrast.
In Task 2, we replicated this result with variable delay times (1 or 3 seconds) instead of variable contrast. We evaluated two models. The Optimal
model assumes that observers know their memory uncertainty on a trial-to-trial and item-to-item basis and use this information to maximize performance. The Fixed model assumes observers do not use knowledge of their
uncertainty, but assume that stimuli are equally uncertain. In both tasks,
the Optimal model outperformed the Fixed model for three of four participants (: Task 1: M = 10.1, SEM = 6.9; Task 2: M = 9.3, SEM = 15.3). Moreover,
the Optimal model provides good fits to the psychometric curves. These
results provide preliminary evidence that humans maintain uncertainty
information in VWM and use it in a subsequent decision.
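The contrast between the two decision rules in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' model. It assumes Gaussian memory noise with a known standard deviation per item, a change magnitude that is uniform over orientation space, and an equal prior over the four locations. Under those assumptions, the log posterior ratio that item i changed is, up to a constant, log(sigma_i) + d_i^2 / (2 * sigma_i^2), where d_i is the measured difference between sample and test. An observer that ignores uncertainty simply picks the largest absolute difference.

```python
import math

def optimal_choice(diffs, sigmas):
    """Index of the item most likely to have changed, using per-item noise levels.

    diffs  : measured sample-to-test orientation differences (degrees)
    sigmas : assumed noise standard deviation of each difference (degrees)
    """
    scores = [math.log(s) + d * d / (2 * s * s) for d, s in zip(diffs, sigmas)]
    return max(range(len(scores)), key=scores.__getitem__)

def fixed_choice(diffs):
    """Index of the item with the largest absolute difference (equal assumed noise)."""
    return max(range(len(diffs)), key=lambda i: abs(diffs[i]))

# Hypothetical trial: items 0 and 2 are remembered precisely, items 1 and 3 poorly.
diffs = [10.0, 12.0, 1.0, -2.0]
sigmas = [2.0, 20.0, 2.0, 20.0]
print(optimal_choice(diffs, sigmas), fixed_choice(diffs))
```

The two rules diverge exactly when a small difference on a precisely remembered item is stronger evidence for a change than a larger difference on a noisy item, which is why mixed-reliability displays are informative for comparing the models.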
23.4013 Is location information lost from visual short-term
memory? Andra Mihali1([email protected]), Wei Ji Ma1; 1New York
Visual short-term memory (VSTM) performance as a function of set size is
well accounted for by noise corrupting the stimulus representation, with
the amount of noise increasing with set size. It has been proposed that, in
addition to this mechanism, there is also a loss of binding between feature
and location information (Bays et al., 2009). An analysis of delayed-estimation data suggests that the prevalence of such binding errors is low (Van
den Berg, Awh, and Ma, 2014), but this analysis was quite indirect. Here,
we address the question of whether location information is maintained in
VSTM with a more direct approach. 11 observers performed two VSTM-based
tasks with arrays of 2, 3, 4, and 6 items: a target detection task (target
present half of the time) and a target localization task (always one target).
Any loss of location information would affect localization performance
but not detection performance. Therefore, if we can jointly fit an optimal
observer model with the same parameters to detection and localization,
this would suggest that location information loss is minimal. Indeed, we
were able to fit well the variable-precision encoding model jointly to the
detection and localization data. These preliminary model fits suggest that
location information is maintained in VSTM to a significant extent.
Acknowledgement: NIH
23.4014 Attentional boost effect: Failure to replicate Katherine
Moen1([email protected]), Stephanie Saltzmann1, Melissa Beck1; 1Psychology, Louisiana State University
Dual-task paradigms typically impair performance relative to single-task
paradigms. However, research on the attentional boost effect (ABE) suggests that dual-task performance is improved to the level of single-task performance on critical trials (when a response is required). The response leads to
improved memory for the simultaneously presented scene, relative to a single task. Research suggests that responding to a subset of stimuli results in
increased attention, leading to better memory for the associated memory items. In an attempt to better understand the role of attention in the
ABE, we measured eye movements during a typical ABE task. Therefore,
the current study sought to replicate the ABE and document the pattern
of eye movements associated with critical and non-critical trials. In three
experiments, participants encoded real-world scenes with a circle in the
center. Divided attention (DA) participants pressed a button when the circle was a non-prevalent color, and full attention (FA) participants ignored
the circles. All participants completed a recognition memory test after a
delay. Replicating previous ABE studies, Experiment 1 used a 1000ms
encoding time and a two-alternative-forced-choice recognition test, Experiment 2 used 1000ms encoding time and a single-item recognition test, and
Experiment 3 used a 500ms encoding time and a single-item recognition
test. Behavioral results revealed no differences between FA and DA in the
first two experiments and a DA impairment in Experiment 3. No ABE was
observed in any of the three experiments. Across three experiments, dwell
times during test were longer for the FA condition compared to DA condition. There were no differences during encoding, and no differences for
critical versus non-critical trials. Overall, these experiments suggest that the
ABE is not robust. The lack of an ABE is consistent with the lack of allocation of attention differences, as measured by eye movements, on critical
versus non-critical trials.
23.4015 Working Memory Capacity and Cognitive Filtering Predict
Demand Avoidance. Jeff Nador1([email protected]), Brad Minnery2,
Matt Sherwood2, Assaf Harel1, Ion Juvina1; 1Wright State University,
2Wright State Research Institute
In general, optimization of task performance minimizes cognitive demand.
For example, when participants can choose freely between task variants,
they tend to select the one that minimizes cognitive demand (Kool et al.,
2010). Here, we test whether the ability to filter irrelevant information
during task performance will reduce cognitive processing demands. Previous research has shown that observers with higher visual working memory (VWM) capacity tend to be more efficient cognitive filterers (Vogel,
McCollough & Machizawa, 2005). Consequently, we hypothesize that such
demand avoidance arises from individual differences in VWM capacity. To
test this hypothesis, we collected psychophysical and electrophysiological
measures of VWM capacity and cognitive filtering in a sample of 22 observers. We then correlated these with independent psychophysical measures
of demand avoidance. We found that observers with higher VWM capacity
tended to select the less demanding of two task alternatives, and that this
occurred because filtering irrelevant information increased their sensitivity
to our covert demand manipulation. Moreover, reaction times increased
significantly when a given trial’s instructions switched with respect to the
preceding trial’s, and this increase tended to be larger among those with
greater cognitive filtering ability. Taken together, our results suggest that
working memory capacity and cognitive filtering ability contribute to individual differences in demand avoidance. Inefficient cognitive filterers tend
to process more irrelevant information and are therefore less sensitive to
covert variations in demand. Efficient filterers, on the other hand, can
successfully ignore irrelevant information, and are therefore more sensitive.
As such, we surmise that individual differences in visual working memory
capacity and cognitive filtering predict demand avoidance.
the same plane as the saccade as opposed to changes that occurred in the
orthogonal plane. This supports our hypothesis that saccade-driven remapping impacts the precision of SWM – saccades smeared SWM.
Acknowledgement: Office of Naval Research
23.4018 The effects of content-dependent competition on working
memory capacity limits Jason Scimeca1([email protected]), Jacob
Miller1, Mark D’Esposito1; 1Helen Wills Neuroscience Institute, University
of California, Berkeley
gmail.com), Chaipat Chunharas1, Pascal Mamassian2, John Serences1;
1Psychology Department, University of California San Diego, San Diego,
USA, 2Laboratoire des Systèmes Perceptifs, Ecole Normale Supérieure,
Paris, France
Both visual attention and working memory (WM) are marked by severe
capacity limits. The biased-competition model (Desimone & Duncan, 1995)
proposes that capacity limits in visual attention arise because simultaneously perceived stimuli compete for neural representation in sensory cortex. Sensory recruitment models of WM (D’Esposito, 2007; Postle, 2006)
argue that information in WM is maintained in sensory cortex. The competitive map framework links these models by proposing that capacity limits
in attention and WM arise from competition in content-dependent cortical maps (Franconeri et al., 2013). A recent study demonstrated that this
map framework can explain visual processing limits for simultaneously
presented items drawn from either the same or different categories (e.g.
two faces/two scenes or four faces; Cohen et al., 2014). Here we examine
whether the map framework can explain capacity limits in WM. To prevent competition at perception, four items were sequentially presented and
then maintained in WM for 10 seconds. Across several categories (faces/
bodies/scenes), WM capacity is higher when items are drawn from separate categories versus a single category. This is consistent with lower
across-category versus within-category competition, supporting the role of
content-dependent competition in WM capacity limits independent of perceptual competition. Furthermore, we used fMRI and a forward modeling
approach (Brouwer & Heeger, 2011) to assess the nature of competition that
occurs within WM. Using multivoxel patterns in sensory cortex recorded
on low-load (load-2) same-category trials, we trained a model to project
fMRI activity into a representational space consisting of content-dependent
channels (e.g. a face channel and a scene channel). We then invert the model
to reconstruct channel amplitudes based on data from load-4 same-category and mixed-category trials. The amplitude of the relevant channel predicts behavioral accuracy across trials, and these amplitudes are higher on
mixed-category versus same-category trials, consistent with reduced cortical competition across content-dependent channels.
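The forward-modeling step described in this abstract can be sketched in miniature. This is a generic two-channel linear encoding model, B = W C, with weights estimated by least squares on training data and then inverted on held-out data. It assumes exactly two content channels (for example, face and scene) so that a 2x2 matrix inverse suffices. The function names and shapes are hypothetical, and the authors' actual pipeline (following Brouwer & Heeger, 2011) involves many details omitted here.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inv2(m):
    """Inverse of a 2x2 matrix; assumes it is non-singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def fit_weights(train_bold, train_channels):
    """Least-squares weights W (voxels x 2) from B = W C.

    train_bold     : voxels x trials activity matrix
    train_channels : 2 x trials matrix of assumed channel responses
    """
    ct = transpose(train_channels)
    return matmul(matmul(train_bold, ct), inv2(matmul(train_channels, ct)))

def reconstruct_channels(weights, test_bold):
    """Invert the model on new data: C_hat = (W^T W)^-1 W^T B."""
    wt = transpose(weights)
    return matmul(matmul(inv2(matmul(wt, weights)), wt), test_bold)
```

Training on low-load trials and inverting on load-4 trials, as the abstract describes, then amounts to calling `fit_weights` on the first set and `reconstruct_channels` on the second, and comparing the reconstructed amplitude of the relevant channel across conditions.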
follows retinal coordinates, while repulsion from cardinal follows
real-world coordinates. Rosanne Rademaker1([email protected]
Systematic biases emerge when people report an orientation from memory
after a brief delay. One such bias is the classic oblique effect, with smaller
replication errors for targets presented at cardinal compared to oblique orientations. A second known bias is a repulsion away from the cardinal axes,
with responses to targets near vertical and horizontal exaggerated to lie
even further away from those axes. Here we wanted to test the origins of
these biases. Twelve participants were presented with randomly oriented
gratings (between 1–180º in 3º steps) on each trial for 100 ms. After a 1.5
s delay period a response probe appeared and participants replicated the
target orientation using the mouse. Critically, on half of the trials a rotating
chinrest tilted the head of participants 45º from upright – with tilt direction counterbalanced across participants. Participants switched between
upright and tilted head positions every 60 trials, and 1800 trials per tilt
position were collected over the course of several days. Data show that
the classic oblique effect is tied to a retinal coordinate frame, with better
resolution for targets presented at orientations that are cardinal relative
to the head, irrespective of its tilt. However, the repulsion from cardinal
remained tied to real world vertical and horizontal. We hypothesize that
while the classical oblique effect is driven by retinal and cortical factors
determined during visual development (such as the over-representation of
cardinal orientations in visual cortex), the second ‘repulsion’ bias is due to
a higher-level decisional component whereby representations are cropped
relative to real-world cardinal coordinates.
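The two coordinate frames contrasted above can be made concrete with a little orientation arithmetic. The sketch below is an illustration only: it assumes that the eyes rotate fully with the head (ignoring ocular counter-roll), uses an arbitrary sign convention for tilt, and all function names are hypothetical.

```python
def wrap_orientation(deg):
    """Wrap an angle into the half-open orientation range [-90, 90)."""
    return (deg + 90.0) % 180.0 - 90.0

def signed_error(response_deg, target_deg):
    """Signed reproduction error in orientation space (period 180 degrees)."""
    return wrap_orientation(response_deg - target_deg)

def world_to_retinal(world_deg, head_tilt_deg):
    """Orientation relative to the head, assuming the eyes rotate with the head."""
    return (world_deg - head_tilt_deg) % 180.0

def distance_to_cardinal(deg):
    """Absolute angular distance to the nearest cardinal axis (0 or 90 degrees)."""
    return abs(wrap_orientation(2.0 * deg) / 2.0)

# A grating that is oblique in the world (45 deg) is cardinal on the retina
# when the head is tilted by 45 deg.
retinal_deg = world_to_retinal(45.0, 45.0)
```

Sorting errors by `distance_to_cardinal` of the world orientation versus the retinal orientation is one way to ask which frame a given bias follows.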
Visual Memory: Working memory
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4017 Saccades Smear Spatial Working Memory Matthew
Peterson1,2,3([email protected]), Shane Kelley1, Eric Blumberg1; 1Human
Factors and Applied Cognition, George Mason University, 2Cognitive and
Behavioral Neuroscience Program, George Mason University, 3Neuroscience Interdisciplinary Program, George Mason University
We took advantage of saccadic remapping to test whether eye movements
and spatial working memory (SWM) share a common spatial representation
system. If SWM, perception, and saccades share the same spatial representation system, then multiple saccades along the same axis should lead to
representational errors along that axis due to the increased potential for
errors from multiple remappings. Subjects performed a spatial change
detection task (6 sessions). During the retention interval, subjects detected
whether an X (go) or O (no-go) appeared. Depending on the session, the X or
O was either located centrally (no-shift), peripherally and identified using
covert attention (covert), or peripherally and identified using a saccade. For
the change detection task, when a change occurred, an item moved either in
the vertical or horizontal direction (0.56°-3.4°). For Experiment 1, saccades
and covert shifts occurred along the horizontal axis, and in Experiment 2
shifts were made along the vertical axis. For both experiments, memory
accuracy was highest for the no-shift condition, lower for the covert condition, and lowest for the saccade condition (Lawrence et al., 2004). Gaussian PDFs were fit to the memory data, with standard deviation (precision),
guessing, and bias as parameters. None of the tasks had a significant effect
on bias or guessing. However, in both experiments, a covert attention shift
led to a loss of memory precision compared to the no-shift condition, and
this loss was equal in both the vertical and horizontal planes. Importantly,
in both experiments saccades had an additional effect beyond covert attention: there was an additional loss of precision for changes that occurred in
23.4019 The Functional Limit in Visual Working Memory Storage:
The Tale Is In The Tail Marcus Cappiello1([email protected]), Weiwei
Zhang1; 1Department of Psychology, University of California, Riverside
Although working memory plays a significant role in a wide range of cognitive domains, its storage capacity is highly limited. The nature of this
limit in visual working memory (VWM) has been the subject of considerable controversy. The discrete-slot model attributes the bottleneck in VWM
storage to capacity (i.e., a limited number of discrete slots). In contrast, the
variable precision model replaces the capacity limit with variable mnemonic precision in that the variance in precision could produce extremely
low precision that behaviorally resembles random memory responses. One
common misconception is that the discrete-slot model assumes no variance in mnemonic precision across items and trials, which may lead
to the model’s poor performance in comparison to the variable precision
model. More importantly, the most fundamental difference between the
two models is the presence (discrete-slot model) or absence (variable precision model) of guessing for a large number of to-be-remembered items
(i.e. a capacity limit). The present study thus adopted four approaches to
establish this discrete nature of VWM storage. First, the slot model with
variability sufficiently accounted for residual errors in model fits. Second,
the slot model outperformed the variable precision model at large memory
set sizes when capacity constrained recall performance more than precision. Third, the variable precision model produced an increasing proportion
of extremely low precision trials that are indistinguishable from guessing
(and hence functionally equivalent to capacity) over memory set sizes.
Lastly, non-parametric modeling of recall data showed strong evidence for
two discrete clusters of precision in the true underlying distribution (one
for guessing and the other for graded memory representation). Together
these results provide strong support for the slot model of VWM storage.
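The role of guessing in the discrete-slot account can be illustrated with a short simulation. This is a sketch under simplifying assumptions, not the models fitted in the abstract: capacity is fixed, stored items share a single standard deviation (so the variability discussed above is omitted), memory errors are drawn from an unwrapped Gaussian, and the function names and parameter values are hypothetical.

```python
import numpy as np

def slot_guess_rate(set_size, capacity):
    """Probability that the probed item is not in memory under a fixed capacity."""
    return max(0.0, 1.0 - capacity / set_size)

def simulate_slot_errors(set_size, capacity, sd_deg, n_trials, rng):
    """Draw recall errors in degrees: Gaussian when stored, uniform when guessing."""
    in_memory = rng.random(n_trials) >= slot_guess_rate(set_size, capacity)
    memory_errors = rng.normal(0.0, sd_deg, n_trials)      # stored items
    guess_errors = rng.uniform(-180.0, 180.0, n_trials)    # random guesses
    return np.where(in_memory, memory_errors, guess_errors)

rng = np.random.default_rng(0)
errors = simulate_slot_errors(set_size=6, capacity=3, sd_deg=20.0, n_trials=1000, rng=rng)
```

With a set size of 6 and a capacity of 3, half of the simulated trials are guesses, which produces the flat tail of the error distribution that the title alludes to.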
23.4016 Dissociable biases in orientation recall: The oblique effect
23.4020 When shorter delays lead to worse memories: Taking
attention away from visual working memory temporarily makes it
more vulnerable to test interference. Benchi Wang1(wangbenchi.[email protected]), Jan Theeuwes1, Christian Olivers1; 1Department of
Experimental and Applied Psychology, Vrije Universiteit, Amsterdam, the
Netherlands
Evidence shows that visual working memory (VWM) is strongly served
by attentional mechanisms, whereas other evidence shows that VWM
representations readily survive when attention is taken away. To reconcile these findings, we tested the hypothesis that directing attention away
makes a memory representation vulnerable to interference from the test
pattern, but only temporarily so. When given sufficient time, the robustness of VWM can be restored. In six experiments, participants remembered
a single grating for a later memory test. In the crucial conditions, participants also performed a letter change detection task in between, during the
delay period. Across several replications, Experiments 1-4 demonstrated the
predicted effect: the intervening task had an adverse effect on memory performance, but only when the test display appeared immediately after the
secondary task. At long delays (of 3.5 seconds), memory performance was
on a par with conditions in which there was no intervening task. By varying the similarity between the test and memorized pattern, Experiments
5-6 further showed that performance suffered at early test intervals, unless
the test item was dissimilar to the memory item. In conclusion, VWM storage involves multiple types of representation, with unattended memories
being more susceptible to interference than others. Importantly,
this fragility is only temporary.
23.4021 No evidence for an object working memory capacity benefit
with extended viewing time Colin Quirk1([email protected]),
Edward Vogel1; 1Department of Psychology, University of Chicago
Multiple studies have shown that visual working memory (VWM) fills
within hundreds of milliseconds (Vogel, Woodman, & Luck, 2006) and
additional encoding time does not allow for more items to be stored (e.g.
Luck & Vogel, 1997). In contrast, recent studies have suggested that there is
a VWM capacity benefit for real-world objects at long encoding times (i.e.
multiple seconds). For example, Brady, Störmer, & Alvarez (2016) showed
that VWM performance for real-world objects is better than for simple
colors at long encoding times, supporting the claim that realistic items
have more information that can be encoded given sufficient time. Additionally, they measured the contralateral delay activity (a neural marker
for the amount of information stored in VWM) and found an increase in
CDA amplitude for real-world objects compared to colors for large set
sizes at long encoding times, suggesting that this increase in performance
is due to an effect of VWM capacity and not long-term memory. In our
first experiment, we attempted a direct replication of Brady et al.’s behavioral phenomenon with a larger number of subjects (N=25) and more trials
per condition (50 trials per condition). Subjects were asked to remember
six real-world objects or colors after a presentation time of 200ms, 1s, or
2s. We failed to replicate their primary behavioral result, instead finding
that performance was improved for both colors and real-world objects at
longer encoding times. There was no significant difference between VWM
performance for colors and real-world objects. Our second experiment was
another attempt at a direct replication (N=25) that also included a stronger articulatory suppression manipulation and again we found no performance benefit for real-world objects at long encoding times. These results
suggest that there is no additional benefit for real-world objects compared
to simple colors under extended viewing conditions.
23.4022 Encoding strategies in visual working memory Hagar
Cohen1([email protected]), Halely Balaban1,2, Roy Luria1,2; 1The School
of Psychological Sciences, Tel-Aviv University, 2The Sagol School of Neuroscience, Tel-Aviv University
The goal of the present study was to examine which type of task, either simple or complex, receives higher priority when encoded into visual working
memory. Participants performed the change detection task with arrays of 2
and 3 items that could be either colored squares (the simple task), random
polygons (the complex task) or a mixture of both stimuli. By equating the
number of items in each comparison while varying complexity level, we
were able to measure how the addition of a complex object affects the encoding of simple objects and vice versa. In Experiment 1, accuracy for color was
not further affected by the addition of a polygon relative to adding another
color, but polygon performance significantly decreased when a polygon appeared
next to a color or another polygon, indicating a preference for encoding
simple items. In Experiment 2 we replaced the random polygons with difficult-to-distinguish colors and replicated the results of the previous experiment, suggesting that the results were not due to category preference. In
Experiment 3, we encouraged participants to encode the polygons by telling
them that on trials in which colors and polygons are presented together,
the chance of the polygon being the probed item is much higher than that
of the color (which was indeed the case). We found an increase in accuracy for
polygons (relative to Experiment 1), accompanied by a mild decrease in
color performance. Our results suggest that although participants’ initial
strategy is to encode the items in the simple task, they are able to change it
when motivated to do so.
23.4023 Visual working memory of multiple preferred objects Holly
Lockhart1([email protected]), Stephen Emrich1; 1Psychology Department,
Brock University
A key debate regarding visual working memory (VWM) mechanisms
focuses on the differences between discrete- versus continuous-resource
models of VWM capacity limits. Recent findings have demonstrated that
VWM resources can be allocated disproportionately according to the probability that an item would be probed, consistent with the continuous resource
model. However, this finding was based on a single report on each trial,
with the assumption that all items in the display would get the predicted
quantity of VWM resources. The current study sought to address this methodological limitation, and determine whether multiple items are reported
according to attentional prioritization during encoding. Using a two-item
report task we tested the flexibility and quality of VWM when two items
are cued during the encoding of a super-capacity display of six coloured
items. To establish attentional priority, the cued items were probed on 50%
of the trials, while in 25% of trials one cued item and one uncued item were
reported, and two uncued items were reported on the remaining 25% of
trials. Measures of precision, guess rate, and rate of non-target errors were
taken from the three-component mixture model. Results show that participants reported the two cued items with approximately equal precision to
each other, suggesting that in fact multiple items can be prioritized simultaneously. Uncued items were also half as likely to be correctly reported and
three times as likely to be misremembered compared with cued items. The
results are in line with the predictions of a flexible resource model in which
VWM resources can be allocated across multiple items in accordance with
the task demands.
Acknowledgement: NSERC
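The three-component mixture model mentioned above treats each response as coming from the target, from a non-target item, or from a uniform guess. The following is a minimal sketch of its likelihood, assuming von Mises noise with a shared concentration `kappa` and equal weighting of the non-targets; the function names and parameterization are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def von_mises_pdf(x, mu, kappa):
    """Von Mises density on the circle (x and mu in radians)."""
    return np.exp(kappa * np.cos(x - mu)) / (2.0 * np.pi * np.i0(kappa))

def mixture_likelihood(response, target, nontargets, kappa, p_nontarget, p_guess):
    """Likelihood of one response under a target + non-target + guess mixture."""
    p_target = 1.0 - p_nontarget - p_guess
    nontargets = np.asarray(nontargets, dtype=float)
    target_part = p_target * von_mises_pdf(response, target, kappa)
    # Non-target errors: responses centered on one of the other items
    swap_part = p_nontarget * np.mean(von_mises_pdf(response, nontargets, kappa))
    # Guesses: uniform over the circle
    guess_part = p_guess / (2.0 * np.pi)
    return target_part + swap_part + guess_part

def negative_log_likelihood(params, responses, targets, nontarget_lists):
    """Summed -log likelihood over trials; params = (kappa, p_nontarget, p_guess)."""
    kappa, p_nontarget, p_guess = params
    return -sum(
        np.log(mixture_likelihood(r, t, nt, kappa, p_nontarget, p_guess))
        for r, t, nt in zip(responses, targets, nontarget_lists)
    )
```

Minimizing `negative_log_likelihood` with a general-purpose optimizer would yield estimates of precision, guess rate, and non-target error rate of the kind reported for cued and uncued items.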
23.4024 The precision of visual working memory is set by the
number of subsets Gaeun Son1([email protected]), Sang Chul
Chong1,2; 1Graduate Program in Cognitive Science, Yonsei University,
2Department of Psychology, Yonsei University
The current study investigated whether the precision of visual working
memory could be changed by the number of subsets, with the number of
items unchanged. Specifically, we always presented five differently oriented bars but varied the number of subsets. We assumed that similarly oriented bars would be organized as one subset while dissimilar bars would
be treated as other subsets. Within a subset the orientation difference of all
bars was 5°, and across subsets, the smallest orientation difference between
two bars was 45°. Thus, within a subset all bars had similar orientations,
but across subsets bars had dissimilar orientations. If subsets are used as
units of visual working memory rather than individual bars, the precision
of represented bars should decrease as the number of subsets increases.
In Experiments 1 and 2, five differently oriented bars were presented for
200ms. After 900ms, a bar whose orientation could be adjusted by a mouse
was presented in the center of the screen with a circle indicating one specific location of the encoded bars. Participants were asked to recall the orientation of the item indicated by the circle. In Experiment 1, there were
two conditions with one subset and two subsets, and in Experiment 2 there
were also two conditions with two subsets and three subsets. We compared
memory precision between the two conditions in each experiment, with the
following steps being taken: (a) the five bars were sorted by their orientations in each subset condition; and (b) the orientation precision of corresponding bars was compared between the subset conditions. We found that
the precision of reported orientation decreased as the number of subsets
increased. These results suggest that items held in visual working memory
are organized into subsets depending on orientation similarity and the subsets are represented as units of visual working memory.
formation and maintenance of mean with sequentially presented items
in VWM is distinct from those with simultaneously presented items, and
shows a recency effect specific to the mean computation.
Acknowledgement: This work was supported by the National Research
Foundation of Korea (NRF) grant funded by the Korea government (NRF-2016R1A2B4016171).
Acknowledgement: JSPS KAKENHI #16H01727
23.4025 Integration of ensemble representations stored in visual
working memory Jifan Zhou1([email protected]), Yijun Zhang1,
Shulin Chen1, Rende Shui1, Mowei Shen1; 1Department of Psychology and
Behavioral Sciences, Zhejiang University
The “working” function of visual working memory (VWM) has been highlighted by recent studies. Findings demonstrated that the sequentially
presented visual elements would be involuntarily integrated into visual
objects or figures inside VWM, providing evidence that VWM functions as
a buffer serving perceptual processes by storing the intermediate perceptual representations for further processing. In those studies, the number of
visual elements was usually controlled within the capacity of VWM; however, the realistic environment we live in is so rich and complex that the
visual system has to constantly deal with massive visual information. How
is such an enormous amount of information actually processed with limited
VWM capacity? Although researchers know that the visual system can extract statistical properties of crowds of objects to form ensemble
representations, it is largely unclear whether and how ensemble representations integrate inside VWM. This issue was investigated in the present
study. Participants viewed two temporally separated groups of discs; after
a short delay, they reported the memorized mean size of either one of the
groups or the whole (i.e., all the discs in the two groups) by adjusting a
probe disc. The results indicated that participants were able to accurately report the mean size of each group and of the whole set of discs. More
importantly, the reported mean size of the whole could be predicted by
the pooled mean calculated based on the reported means of two individual groups. This result suggested that the temporally separated ensemble
representations stored in VWM are able to be integrated into a higher-level
ensemble representation, using the perceived statistics of the crowds of
objects. Thus, when the number of objects exceeds the capacity of VWM,
the visual system will choose to store the necessary statistics for describing
the ensemble and supporting further statistical computation.
Acknowledgement: This research is supported by the National Natural Science
Foundation of China (No.31571119, No. 31600881), and the Fundamental
Research Funds for the Central Universities.
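The pooled-mean prediction described in this abstract can be written out in a few lines. The abstract does not state how the two group means were combined, so the sketch below assumes the standard weighting by the number of items in each group; the function name and the example values are hypothetical.

```python
def pooled_mean(mean_a, n_a, mean_b, n_b):
    """Mean of the combined set, given each group's mean and number of items."""
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)

# Example: 4 discs with a reported mean size of 1.0 and 12 discs with a
# reported mean size of 2.0 (arbitrary units)
predicted_whole = pooled_mean(1.0, 4, 2.0, 12)  # 1.75
```

Comparing `predicted_whole` with the mean size that participants report for the whole set is the kind of check the abstract describes.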
23.4026 Formation and maintenance of mean orientation of
sequentially presented objects in visual working memory Jun
Saiki1([email protected]), Mutsumi Yamaoka1; 1Graduate School
of Human and Environmental Studies, Kyoto University
We can perceive ensemble information such as average size and orientation quickly and efficiently. Such efficient statistical perception is observed
both with simultaneous and sequential presentation of objects. Mean size
information of simultaneously presented objects influences visual working
memory (VWM) for each object. However, few studies have addressed the
relationship between VWM for ensemble and for single items with sequentially presented objects. This study investigated characteristics of VWM for
ensemble formed from sequentially presented items, compared with VWM
for ensemble from simultaneously presented items. Participants viewed a
sequence of randomly oriented arrow stimuli with 100ms duration for each
item and 1000ms inter-item interval, and reported the orientation of their
mean or of a particular item. In Experiment 1, participants reported the
mean orientation of a 4- or 12-item sequence, and multiple regression analysis showed that the beta weight of each item increased with its serial position,
indicating a recency effect. To test whether the recency effect simply reflects
the property of sequential memory task, Experiment 2 asked participants
to report the orientation of all 4 items, and the recency effect disappeared,
suggesting that the recency effect is specific to the computation of mean
orientation. Next, to examine whether the recency effect occurs only when
memory for single items is unnecessary, Experiment 3 asked participants
to report either a single item or the mean, depending on the response cue
presented after the sequence, and mean orientation trials still showed the
recency effect. Furthermore, VWM precisions for a single item and for the
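mean were comparable, which is inconsistent with studies using simultaneous presentation reporting higher precision with the mean. Taken together,
the formation and maintenance of the mean with sequentially presented items
in VWM is distinct from that with simultaneously presented items, and
shows a recency effect specific to the mean computation.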
23.4027 An effect of categorical similarity on object-location binding
in visual working memory Yuri Markov1([email protected]),
Igor Utochkin1; 1National Research University Higher School of Economics, Moscow, Russia
Research shows that object-location binding errors can occur in VWM
indicating a failure to store bound representations rather than mere forgetting (Bays et al., 2009; Pertzov et al., 2012). Here we investigated how
categorical similarity between real-world objects influences the probability
of object-location binding errors. Our observers memorized three objects
(image set: Konkle et al., 2010) presented for 3 seconds and located around
an invisible circumference. After a 1-second delay they had to (1) locate
one of those objects on the circumference according to its original position (localization task), or (2) recognize an old object when paired with a
new object (recognition task). On each trial, three encoded objects could be
drawn from the same category or from different categories, providing two levels of
categorical similarity. For the localization task, we used the mixture model
(Zhang & Luck, 2008) with swap (Bays et al., 2009) to estimate the probabilities of correct and swapped object-location conjunctions, as well as the
precision of localization, and guess rate (locations are forgotten). We found
that categorical similarity had no effect on localization precision and guess
rate. However, the observers made more swaps when the encoded objects
were drawn from the same category. Importantly, there were no correlations between the probabilities of these binding errors and probabilities
of false recognition in the recognition task, which suggests that the binding errors cannot be explained solely by poor memory for objects. Rather,
remembering objects and binding them to locations appear to be partially
distinct processes. We suggest that categorical similarity impairs the ability
to store objects attached to their locations in VWM.
Acknowledgement: Program for Basic Research at NRU HSE in 2016
23.4028 Perceptual organization predicts variability in visual
working memory performance across displays and items. Young
Eun Park1,2([email protected]), William Ju3, Frank Tong1,2;
1Psychology Department, Vanderbilt University, 2Vanderbilt Vision
Research Center, 3Department of Electrical Engineering and Computer
Science, Vanderbilt University
Growing evidence suggests that visual working memory can store items
as perceptual groups rather than as independent units (Brady, Konkle, &
Alvarez, 2011). Can perceptual grouping explain particularly high capacity estimates for certain types of stimuli? Based on our previous finding
that working memory capacity for orientation is greatly enhanced for
line stimuli in comparison to Gabor gratings (Park et al., VSS 2015), here
we investigated whether working memory for line orientation can take
advantage of Gestalt rules of organization to make more efficient use of
its limited capacity. We hypothesized that multiple line orientations can
be stored more efficiently if they can be organized into perceptual groups
according to the rules of “similarity” and “good continuation”. If so, then
working memory performance should vary systematically depending on
the spatial relations between items in any given visual display. We randomly generated 96 displays, each containing six oriented lines at various
locations, and presented the same set of displays to 700+ observers in an
online experiment. On each trial, observers were asked to report the orientation of a randomly selected item from memory. Pooling the responses
from all observers (110+ trials/item), we observed marked differences in
the average error magnitude across displays (25.4˚-39.2˚) and items (19.3˚-84.4˚), which proved highly consistent across observers (a random split-half
correlation of .77, p < .0001). We characterized the frequency of orientation
clustering in the displays, as well as the degree of collinearity among pairs
of orientations based on the smoothness of their implied path. By entering
these factors into a multiple regression model, we could accurately predict
working memory performance for specific displays (R = 0.62) and items (R
= 0.40). Our findings demonstrate that the presence of rich spatial structure
in arrays of oriented lines allows for highly efficient storage of information
in visual working memory.
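The random split-half consistency measure reported above can be sketched as follows. This is an illustration on simulated data rather than the authors' analysis: it assumes a single random split of observers and a dense observers-by-items error matrix (in the actual study each observer was probed on only some items), and the function name and simulated values are hypothetical.

```python
import numpy as np

def split_half_correlation(errors, rng):
    """Correlate per-item mean error between two random halves of observers.

    errors: (n_observers, n_items) array of absolute errors; NaN marks items
            on which an observer was never probed.
    """
    order = rng.permutation(errors.shape[0])
    half = errors.shape[0] // 2
    mean_a = np.nanmean(errors[order[:half]], axis=0)
    mean_b = np.nanmean(errors[order[half:]], axis=0)
    return np.corrcoef(mean_a, mean_b)[0, 1]

# Simulated example: items differ in difficulty, observers add independent noise
rng = np.random.default_rng(1)
item_difficulty = rng.uniform(20.0, 80.0, size=30)
errors = item_difficulty + rng.normal(0.0, 10.0, size=(200, 30))
r = split_half_correlation(errors, rng)
```

A high correlation between the two halves indicates that differences in error across items are a stable property of the displays rather than observer noise.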
23.4025 Integration of ensemble representations stored in visual
23.4027 An effect of categorical similarity on object-location bind-
23.4029 Perceptual Grouping Influences Neural Correlates of
Spatial Working Memory Laura Rabbitt1([email protected]), Craig
McDonald1, Matthew Peterson1; 1Psychology, George Mason University
Spatial working memory is limited in the number of locations that can be
maintained over time but can be improved when stimuli are organized
into familiar patterns. This study examined the neural correlates of spatial working memory (SWM), specifically whether SWM could be measured by
the contralateral delay activity (CDA), an event-related potential known
to index visual working memory. Additionally, the study investigated
whether or not task instruction would alter the amplitude of the CDA in
the SWM task. Participants performed an SWM change
detection task in which they remembered the locations of 1-4
colored squares on one side of a bilateral array, indicated by a cue prior
to the beginning of the trial. At the beginning of the experiment, participants were given one of two instruction types: to remember the individual
location of the squares (spatial instructions), or to remember the squares
by grouping them into a single unit (constellation instructions). Results of
the experiment demonstrated that the CDA indexes the number of items
in SWM and increases in amplitude as the number of locations to remember increases. Unlike the spatial instruction condition, the CDA reached
an asymptote for two locations in the constellation instruction condition.
Additionally, the CDA amplitude was sustained for a longer period of time
in the constellation instruction condition than for the individual location
instructions. These results indicate that the CDA can measure SWM and
that perceptual grouping influences the pattern and duration of neural correlates of SWM.
23.4030 Successful movement inhibition boosts the inhibition of
distractors in visual working memory Min-Suk Kang1,2([email protected]
skku.edu), Hayoung Song1,2,3; 1Department of Psychology, Sungkyunkwan
University, 2Center for Neuroscience and Imaging Research, IBS, 3Global
Biomedical Engineering, Sungkyunkwan University
The common inhibitory control hypothesis posits that the executive control
process is involved in inhibiting thoughts and actions. According to the theory, the inhibition of a prepotent response should also facilitate the inhibition
of competing representations in memory. To test this hypothesis, in Experiment 1, participants remembered three targets that were presented with
either one or five distractors. During the retention interval, they performed
a stop-signal task in which they countermanded a simple choice response
upon an infrequent stop-signal (25%). We found that the stop-signal trials
that were successfully inhibited (canceled trials) resulted in higher memory
accuracy than the trials in which the response was erroneously committed
(non-canceled trials). This result indicates that the successful response inhibition facilitated the inhibition of distractors and, thus, the working memory performance was enhanced due to reduced distractor intrusion. Alternatively, however, it is possible that the poor memory performance in the
non-canceled trials could have occurred because participants adjusted their
behaviors after committing errors in the non-canceled trials. We ruled out
the post-error processing hypothesis in Experiment 2. Participants remembered three targets that were presented with either one or five distractors
as in Experiment 1. During the retention interval, they performed a simple
shooting task in which they were required to shoot a moving target by
pressing a button upon an infrequent shooting-signal (25%). We found that
the memory accuracy was comparable whether the shoot hit or missed the
target. This result indicates that the post-error processing cannot explain
poor memory performance that accompanied non-canceled trials of Experiment 1. Taken together, these results indicate that the common inhibitory
mechanism activated by the inhibition of distractors in visual working
memory is further boosted by successful response inhibition.
Acknowledgement: NRF 2016R1D1A1B03930292
23.4031 The time course of retaining the hierarchical representation
in visual working memory Vladislav Khvostov1([email protected]),
Igor Utochkin1; 1National Research University Higher School of Economics, Russia
It was shown that the features of individual items retrieved from visual
working memory (VWM) are systematically biased towards the mean feature of a sample set (Brady & Alvarez, 2011), suggesting hierarchical encoding in VWM. In this work, we investigated how hierarchical representations
are stored over time. Observers were shown four white differently oriented
triangles for 200 ms and asked to memorize their orientations. After a 1-,
4-, or 7-second delay, they had to report either one individual orientation
or the average orientation of all triangles by rotating a probe circle. The target
was either precued (a signal to memorize one particular orientation, all four
individual orientations, or the average orientation) or postcued (no signal
presented in advance, requiring observers to remember both the individual orientations and the average).
Using the mixture model (Zhang & Luck, 2008), we estimated the precision and the probability of a tested representation being in VWM, as well
as a systematic bias that would indicate hierarchical coding. Participants
showed very precise and unbiased memories when only one triangle was
precued. However, when they had to remember four orientations their
reports were less precise and strongly biased towards the mean, both when
the triangles were precued and postcued. However, the bias did not reach
the mean, showing that observers had some memory for both the mean and
the individual orientations – this is a signature of hierarchical coding. One
surprising finding was that the bias towards the mean was slightly stronger
after 1 second as compared to 4 or 7 seconds. This suggests that individual
representations may be a bit more affected by the mean at early retention
stages. However, there were no other substantial changes in the precision,
biases, or probability of being in memory with the delay. This suggests
that hierarchical representations probably depend more on encoding than
retention factors.
Acknowledgement: Program for Basic Research at NRU HSE in 2016
23.4032 Frequency domain analyses of EEG reveal neural correlates of visual working memory capacity limitations observed
during encoding using a full report paradigm. Kyle Killebrew1(kyle-
[email protected]), Candace Peacock2, Gennadiy Gurariy1, Marian
Berryhill1, Gideon Caplovitz1; 1University of Nevada, Reno, Department of
Psychology, 2University of California, Davis, Department of Psychology
Visual working memory (VWM) is capacity limited. In an attempt to better
understand the role of encoding-related processes in this capacity limitation,
we combined a full report VWM paradigm with an EEG frequency tagging
technique to measure neural correlates of encoding related processes. This
paradigm allows behavioral and neural responses for all items in a memory array to be measured independently. Specifically, observers performed
a full report VWM task in which they were presented with a four-item
memory array and asked to recall as many items as possible after a brief
delay. The memory arrays contained visual shape stimuli, each flickering
at different rates. Each of these was ‘tagged’ using that particular flicker
frequency. While performing the task, neural activity was recorded using
high-density electroencephalography (hdEEG) and the steady-state visual
evoked potential (SSVEP) was measured during the WM encoding period
in response to the frequency tagged stimuli. During retrieval, observers
either recalled the location and identity of each item in the order they chose
or in an order explicitly demanded by the paradigm. Our results demonstrate that the frequency tag amplitudes for correctly recalled items were
larger than for forgotten items for both recall types on electrodes broadly
distributed across the scalp. A secondary induced-power analysis found
increased power in the theta (5-8Hz), alpha (9-12Hz) and beta (13-31Hz)
bands on trials in which 3 or 4 items were correctly recalled compared to 1
or 2 items. However, this effect was only observed in the sequential-recall
conditions. Building on our previous work using a recognition paradigm,
the current results further demonstrate the important role of encoding-related processes in the overall capacity limitation of VWM.
Acknowledgement: NSF 1632738, NSF 1632849
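Frequency tagging, as used in this abstract, relies on reading out the amplitude of the recorded signal at each item's flicker frequency. The sketch below shows one simple way to do that with a single-bin discrete Fourier transform. The sampling rate, the tag frequencies (6 and 10 Hz) and the synthetic signal are hypothetical values chosen for illustration; the abstract does not report the authors' analysis pipeline.

```python
import math

def tag_amplitude(signal, sample_rate, freq):
    """Amplitude of the component at `freq` (Hz), from a single-bin discrete Fourier transform."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

# Hypothetical recording: two items tagged at 6 Hz and 10 Hz, sampled at 250 Hz for 2 s.
sample_rate, duration = 250, 2.0
times = [i / sample_rate for i in range(int(sample_rate * duration))]
eeg = [2.0 * math.sin(2 * math.pi * 6 * t) + 0.5 * math.sin(2 * math.pi * 10 * t) for t in times]

amp_6hz = tag_amplitude(eeg, sample_rate, 6)    # strongly represented item
amp_10hz = tag_amplitude(eeg, sample_rate, 10)  # weakly represented item
```

The estimate is exact only when the analysis window holds a whole number of cycles of the tag frequency, which is why the example uses a 2 s window.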
Color and Light: Neural mechanisms
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4033 Luminance modulates the contrast response in human
visual cortex Louis Vinke1,2([email protected]), Sam Ling2,3,4; 1Graduate
Program in Neuroscience, Boston University, Boston, Massachusetts, USA,
2Center for Systems Neuroscience, Boston University, Boston, Massachusetts, USA, 3Department of Psychological and Brain Sciences, Boston
University, Boston, Massachusetts, USA, 4Donders Institute for Brain,
Cognition and Behavior, Radboud University, Nijmegen, The Netherlands
23.4034 Receptive Field Structures of Color-responsive Neurons
in Macaque Monkey V1 Wei-Ming Huang1([email protected]),
Hsiang-Yu Wu1, Yu-Cheng Pei2, Chun-I Yeh1,3,4; 1Department of Psychology, National Taiwan University, Taiwan., 2Department of Physical
Medicine and Rehabilitation, Chang Gung Memorial Hospital, Taiwan.,
3Neurobiology and Cognitive Science Center, National Taiwan University,
Taiwan., 4Institute of Brain and Mind Sciences, National Taiwan University, Taiwan.
Visual receptive fields have been studied as a way to understand the properties of color- and luminance-responsive neurons in the primary visual
cortex (V1). In macaque monkey V1, many neurons responding to color are
highly selective for orientation and spatial frequency (Johnson et al., 2001;
Friedman et al., 2003). One would predict that the receptive field structures
of color-responsive neurons should consist of multiple elongated sub-regions (like simple cells). However, previous studies had shown mixed
results: some found simple-cell-like receptive fields by using dense noise
(Horwitz et al., 2007; Johnson et al., 2008), whereas others found receptive fields that were blob-like and less elongated when using sparse noise
(Conway and Livingstone, 2006). Here we measured receptive fields of V1
color-responsive neurons with three different stimulus ensembles: Hartley
gratings, binary white noise, and binary sparse noise. All three stimulus
ensembles consisted of equiluminance colors of red and green representing
different cone weights. Receptive fields were estimated by reverse correlation and fitted with the 2-D Gabor function. We studied 54 V1 units and
found that Hartley maps tended to have higher aspect ratios (p=0.03) and
larger numbers of subregions (p=0.02) than white-noise maps (Friedman’s
test). There was a negative correlation between the aspect ratio of the map
and the circular variance measured with drifting gratings (Hartley gratings:
r=-0.28, p=0.04; white noise: r=-0.30, p=0.04; Spearman’s rank correlation).
Similar to previous findings, the distribution of circular variances for color-responsive neurons was comparable with that for luminance-responsive
neurons (Leventhal et al., 1995; Ringach et al., 2002). In summary, the receptive field structure of color-responsive neurons may change with the stimulus ensemble used to map it. For neurons that are well tuned for orientation, the
tuning properties can be predicted by their receptive field structures.
Acknowledgement: MOST 103-2321-B-002-028 / MOST 104-2320-B-002065-MY3
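The circular variance used in this abstract summarises how sharply a neuron is tuned for orientation. The sketch below uses the definition common in the orientation-tuning literature (for example Ringach et al., 2002, which the abstract cites): one minus the normalised length of the response-weighted resultant on the doubled-angle circle. Whether the authors computed it in exactly this way is an assumption, and the function name is illustrative.

```python
import cmath
import math

def circular_variance(orientations_deg, responses):
    """Circular variance of an orientation tuning curve.

    Returns a value near 0 for a sharply tuned cell and near 1 for an untuned cell.
    Orientations are in degrees over a 180-degree range, so angles are doubled
    before being placed on the unit circle.
    """
    resultant = sum(r * cmath.exp(2j * math.radians(theta))
                    for theta, r in zip(orientations_deg, responses))
    return 1.0 - abs(resultant) / sum(responses)
```

Under this definition, the negative correlation reported above means that units with more elongated maps (higher aspect ratio) tended to have lower circular variance, that is, sharper orientation tuning.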
23.4035 Quantifying the relation between pupil size and electro-
physiological engagement of visual cortex Nina Thigpen1([email protected]
ufl.edu), Andreas Keil1; 1University of Florida
Little is known about how physical properties of the eye influence how
light information is received in early visual cortex. In humans, mass activity
in the primary visual cortex can be quantified by measuring the steady-state
visually evoked potential (ssVEP). Given that both the ssVEP and
pupil size are often modulated by the same manipulations, such as stimulus brightness and physiological arousal, the question arises whether pupil
size is directly related to the amount of primary visual cortical engagement
to a given stimulus. To test this hypothesis, we systematically manipulated pupil size by manipulating the brightness of five sinusoidal gratings,
shown one at a time to participants for 3 seconds, 40 times each. Each stimulus flickered at either 6, 10, or 15 Hz, to elicit ssVEPs, used as a measure of
visual cortical engagement. We observed a strong negative linear relation
between pupil size and ssVEP amplitude, across participants and driving
frequencies. Surprisingly, we observed a quadratic relationship between
pupil size and the amplitude of the second harmonic of the driving frequency. These results suggest that there is a systematic relation between
pupil size and mass activity in primary visual cortex that is not explained
by light energy entering the retina. The data may play an important role
in the understanding of the bi-directional relationships between the autonomic nervous system and primary visual cortex.
Acknowledgement: This research was supported by grant R01MH097320 from
the National Institutes of Health and by grant N00014-14-1-0542 from the Office
of Naval Research to Andreas Keil
23.4036 Tracing the representation of colored objects in the
primate brain Le Chang1,2([email protected]), Pinglei Bao1,2,
Doris Tsao1,2; 1Division of Biology and Biological Engineering, California
Institute of Technology, 2Howard Hughes Medical Institute
Even though color vision is commonly defined as the ability of an organism
to distinguish objects based on the wavelengths of light they reflect, color
research has mainly focused on the representation of colors independent
of the problem of distinguishing objects. Thus the mechanism by which
color contributes to object recognition remains unclear, as little is known
about how color and object information are co-represented in the part of
the brain responsible for object recognition: in primates, inferotemporal
(IT) cortex. The recent discovery of “color patches” in macaque IT cortex
makes this problem experimentally tractable. Here we recorded neurons in
three color patches, middle color patch CLC, and two anterior color patches
ALC and AMC, while presenting images of objects systematically varied in
hue. We found that all three patches contain high concentrations of hue-selective cells, and carry significant information about both hue and object
identity. We found two clear transformations across the three patches. The
first transformation, from CLC to ALC, reduces information about object
identity. The second transformation, from ALC to AMC, mainly affects representation of hue: color space is represented in a dramatically distorted
way in AMC, with over-representation of yellow and red, the natural colors of mammal faces and bodies; furthermore, AMC develops an expanded
representation of primate faces, displaying hue-invariant representation
of monkey identity. Our findings suggest that IT cortex uses three distinct
computational strategies to represent colored objects: multiplexing hue and
object shape across all objects (CLC), extracting hue largely invariant to
shape (ALC and AMC), and multiplexing hue and object shape specifically
for ecologically important objects (AMC). Overall, our study reveals the
neural architecture for representing colored objects in IT cortex, and sheds
light on the general organizational principles of IT cortex.
Acknowledgement: HHMI NIH (RO1EY019702)
23.4037 Electrophysiological correlates of perceptual blue-yellow
asymmetries with #thedress Talia Retter1,2([email protected]),
Owen Gwinn1, Sean O’Neil1, Fang Jiang1, Michael Webster1; 1Department
of Psychology, Center for Integrative Neuroscience, University of Nevada,
Reno, USA, 2Psychological Sciences Research Institute, Institute of Neuroscience, University of Louvain, Belgium
Asymmetries in blue-yellow color perception, in which blues appear less
saturated than yellows of equivalent chromatic contrast, have been shown
to affect a variety of color percepts including the appearance of #thedress
(Winkler et al, 2015). This asymmetry may reflect greater ambiguity about
the source of bluish tints compared to other hues, e.g., whether the blue is
due to the lighting or the surface. When the blue is inverted to yellow, this
ambiguity may be removed, and agreement about the inverted dress color
is much higher. We tested for neural correlates of this perceptual asymmetry using a paradigm for measuring asymmetries in visual discrimination
with high-density electroencephalography (EEG) and frequency tagging
(Retter & Rossion, 2016). Throughout four 50-sec sequences, the original
The vast majority of models in vision downplay the importance of overall
luminance in the neural coding of visual signals, placing emphasis instead
on the coding of features such as relative contrast. Given that the visual
system is tasked with encoding surfaces and objects in scenes, which often
vary independently in luminance and contrast, it seems plausible that luminance information is indeed encoded and plays an influential role in visuocortical processing. However, the cortical response properties that support
luminance encoding remain poorly understood. In this study, we investigate the interaction between contrast response and luminance in human
visual cortex, using fMRI. We assessed BOLD responses in early visual cortex (V1-V3) while participants viewed checkerboard stimuli that varied in
contrast and luminance. Specifically, we utilized an adaptation paradigm
that allowed us to reliably measure contrast responses at multiple spatial
scales (voxel-wise and retinotopic), and across a set of luminance levels. To
control for changes in pupil diameter with varying luminance levels, stimuli were viewed monocularly through an artificial pupil. We found that
the extent to which the overall luminance of a signal modulates responses
in visual cortex is contrast dependent, with reliable increases in contrast
responses along with increasing luminance levels, but only occurring at
low levels of contrast. Furthermore, the modulation strength of luminance
on contrast responses did not exhibit any retinotopic bias. These results
reveal that the visuocortical neural code does indeed retain and utilize
information about the luminance of a visual signal, but appears to preferentially modulate the response only at low-to-zero contrast levels. This
finding suggests that luminance likely plays a dominant role in visual tasks
such as our perceptual encoding and segregation of surfaces.
dress image was presented in alternation with the color-inverted image at
a rate of six images per second (6 Hz), with measurements collected for 14
observers. Given that the color-inverted image is consistently perceived to
have yellow stripes, we hypothesized that an asymmetry between the two
images would be present in the EEG recording at the alternation rate (3 Hz).
Settings were also made for a second pair of dress images formed by rotating the original colors by +/-90 deg to create reddish-greenish versions,
for which a less-pronounced perceptual asymmetry predicts a weaker 3
Hz response. These hypotheses were supported, with larger blue-yellow
than red-green responses at 3 Hz and its specific harmonics (e.g., 9 Hz) in
the frequency-domain of the EEG over occipital channels. In contrast, the
response at the image-presentation rate of 6 Hz and its harmonics did not
differ across these two conditions. Our results suggest that the blue-yellow
asymmetry, a potentially higher-level aspect of color appearance unrelated
to chromatic sensitivity, is nevertheless evident in electrophysiological
recordings of the cortical responses to color.
illuminations covering a greenish-blue chromaticity range in equal perceptual steps; the three reference sets overlapped but had different means.
All illuminations were the smoothest-possible metamers for the requested
chromaticity. The point of subjective equality (PSE) for each reference in
each block was determined by averaging the final reversals of two interleaved one-up, one-down staircases, one approaching the reference from
yellower hues, the other bluer. PSEs were systematically biased towards
the mean of each block (the same illumination was remembered as more
yellow when all references were biased towards yellow, compared to when
biased towards blue). The set of illuminations perceptually equal to each
reference chromaticity, defined as all hues between the convergence points
of the two staircases, are skewed towards bluer hues. While illumination
colour memory shows the same bias towards recent stimuli as surface
colour memory, there is an additional bias towards blue which may result
from poorer discrimination or prior expectations for illumination colour.
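The PSE estimate described in this abstract averages the final reversals of two interleaved staircases. A minimal sketch of that computation follows. The number of final reversals to average (`n_final`) and the representation of each staircase as a list of presented stimulus levels are assumptions, since the abstract does not state them.

```python
def reversal_levels(levels):
    """Return the stimulus levels at which a staircase track changed direction."""
    reversals = []
    last_direction = 0
    for prev, curr in zip(levels, levels[1:]):
        direction = (curr > prev) - (curr < prev)  # +1 up, -1 down, 0 unchanged
        if direction != 0:
            if last_direction != 0 and direction != last_direction:
                reversals.append(prev)
            last_direction = direction
    return reversals

def pse_from_staircases(track_a, track_b, n_final=4):
    """Average the final `n_final` reversals of each track, then average the two tracks.

    Assumes each track contains at least one reversal.
    """
    means = []
    for track in (track_a, track_b):
        final = reversal_levels(track)[-n_final:]
        means.append(sum(final) / len(final))
    return sum(means) / 2.0

# Hypothetical tracks in arbitrary hue steps: one approaching from above, one from below.
from_above = [5, 4, 3, 2, 3, 2, 3, 2, 3]
from_below = [0, 1, 2, 3, 4, 3, 4, 3, 4, 3]
pse = pse_from_staircases(from_above, from_below)
```

In this toy example the two tracks converge on different levels (2.5 and 3.5), and the interval between those convergence points corresponds to the set of stimuli treated as perceptually equal to the reference.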
Acknowledgement: EY10834 P20, GM103650, FNRS FC7159
real scenes Sylvia Pont1([email protected]), Ling Xia2, Tatiana Kartashova1; 1Perceptual Intelligence lab, Industrial Design Engineering, Delft
University of Technology, 2Changzhou Key Laboratory of Robotics and
Intelligent Technology, College of Internet of Things Engineering, Hohai
University, China
23.4038 Differential effects of four types of TMS on signal process-
ing Greta Vilidaite1([email protected]), Daniel Baker1; 1Department of
Psychology, University of York
Transcranial magnetic stimulation (TMS) is often used to link behaviour
to anatomy by targeting a brain area during an associated task. Decreases
in performance on that task are often explained as a suppression of stimulus-driven signals, but could also be explained by increases in neural noise.
This study used a 2IFC double-pass contrast discrimination paradigm (Burgess & Colborne, 1988, J Opt Soc Am A, 5:617-627) to distinguish between
these two possibilities in four types of TMS: online single-pulse (spTMS),
online three-pulse repetitive (rTMS), offline continuous (cTBS) and intermittent theta burst stimulation (iTBS). Using standard stimulation protocols with a Magstim Super Rapid2, online TMS was applied to early visual
cortex 50ms after onset of each stimulus in each interval, and offline TBS
was applied before the start of the task. On each trial (200 total) two grating
stimuli of random contrast were presented peripherally (position determined by phosphene localization). Half of the trials contained a 4% contrast increment in one of the intervals. The exact same trial sequence was
then repeated with randomized interval order (second pass). A decrease
in accuracy in the 4% target condition would indicate signal suppression
whereas a reduction in consistency of responses between the two passes
would indicate an increase in neural noise. Mean accuracy and consistency
scores were bootstrapped within participants. It was found that spTMS
reduced accuracy whereas rTMS decreased consistency. This implies that
spTMS decreases signal strength whilst rTMS increases neural noise without affecting the stimulus-driven signal. Offline stimulation (cTBS, iTBS)
did not affect accuracy or consistency. This is the first study to compare
several types of TMS using a single paradigm that can dissociate noise from
suppression. These findings can explain inconsistencies in results between
previous studies using different TMS protocols and so comparisons across
protocols should be made with caution.
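The double-pass logic of this abstract rests on two summary numbers: accuracy (sensitive to signal strength) and the consistency of responses between the two passes (sensitive to internal noise). The sketch below computes both. It assumes responses have already been re-coded so that the same code refers to the same stimulus in both passes, despite the randomised interval order, and that only target trials are passed in for the accuracy count; the function and argument names are illustrative.

```python
def double_pass_summary(correct, pass1, pass2):
    """Return (accuracy, consistency) for a double-pass experiment.

    correct : the correct response on each trial
    pass1   : observer's responses on the first pass
    pass2   : observer's responses to the identical trials on the second pass
    """
    n = len(correct)
    hits = sum(r == c for r, c in zip(pass1, correct))
    hits += sum(r == c for r, c in zip(pass2, correct))
    accuracy = hits / (2 * n)                                   # pooled over both passes
    consistency = sum(a == b for a, b in zip(pass1, pass2)) / n  # same answer both times
    return accuracy, consistency

# Hypothetical four-trial example.
accuracy, consistency = double_pass_summary([1, 2, 1, 2], [1, 2, 2, 2], [1, 1, 2, 2])
```

A manipulation that lowers accuracy while leaving consistency intact points to signal suppression, whereas one that lowers consistency points to added noise, which is the dissociation the abstract reports for spTMS and rTMS.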
Color and Light: Constancy
23.4040 The optics, perception and design of light diffuseness in
Human observers can perceive intensity and direction differences of the
illumination on objects and in scenes. They also have a sense for the light
diffuseness. Reviewing studies into light diffuseness perception and practical lighting guidelines we encountered the problem that there is no agreement on how to describe and measure the light diffuseness, complicating
comparisons. We found a large variety of metrics relating to visual effects of
light diffuseness, including contrast, shape expressing, material expressing,
and atmosphere effects. Moreover, many metrics appeared to be application-, context- or even object-specific. We compared four approaches and
propose a normalized metric for light diffuseness, ranging from 0, meaning fully collimated light (a beam with zero spread), to 1, meaning fully
diffuse or Ganzfeld illumination. We developed a measurement method
for real scenes using cubic illuminance metering. We tested metric and
method using simulations, measurements on Debevec luminance maps
using a cubic and tetrahedron shaped meter, and measurements in real
scenes using the cubic meter. We also tested the influence of scene properties (lighting, geometry and furnishing) and variations within scenes.
We compared optical against psychophysical data from our own and other
studies, and against practical lighting guidelines. We found that the cubic
meter method and metric give robust measurements of light diffuseness.
Measurements in real scenes fell in a wide range of 0.1 – 0.9. We found
extremely strong effects of furnishing and geometry. Such material-lighting
interactions in scenes / architectural spaces are not well-understood and
form a challenge in practical lighting design. Most practical guidelines note
a broadband range centered slightly above medium diffuseness or hemispherical diffuse light (overcast sky). The psychophysical data contract to
narrow bands, depending on the type of scene (varying per experiment),
suggesting a template representation of light diffuseness that depends on
the overall appearance of a scene.
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
Acknowledgement: Part of this work has been funded by the EU FP7 Marie Curie
Initial Training Networks (ITN) project PRISM, Perceptual Representation of
Illumination, Shape and Material (PITN-GA-2012-316746).
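The abstract on light diffuseness does not spell out its formula, so the following is only a sketch of how a normalized diffuseness value could be derived from cubic illuminance readings. It assumes Cuttle's cubic illumination decomposition (an illumination vector plus a symmetric component, with a vector-to-scalar ratio between 0 and 4) and rescales that ratio to the 0 to 1 range described above. All names are hypothetical and the authors' actual metric may differ.

```python
import math

def cubic_diffuseness(x_pos, x_neg, y_pos, y_neg, z_pos, z_neg):
    """Normalized diffuseness from six illuminances on the faces of a small cube.

    Assumed decomposition (Cuttle-style, not taken from the abstract):
      vector    = magnitude of the per-axis illuminance differences
      symmetric = mean over axes of the smaller reading of each opposed pair
      scalar    = vector / 4 + symmetric
    Returns 0 for fully collimated light and 1 for fully diffuse light.
    """
    pairs = [(x_pos, x_neg), (y_pos, y_neg), (z_pos, z_neg)]
    vector = math.sqrt(sum((p - n) ** 2 for p, n in pairs))
    symmetric = sum(min(p, n) for p, n in pairs) / 3.0
    scalar = vector / 4.0 + symmetric
    if scalar == 0:
        raise ValueError("no light measured on any face")
    return 1.0 - (vector / scalar) / 4.0
```

With a single lit face the function returns 0 (a collimated beam), with six equal readings it returns 1 (Ganzfeld-like illumination), and mixed readings fall in between, in line with the 0.1 to 0.9 range reported for real scenes.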
23.4039 Memory Bias for Illumination Colour Stacey Aston1(stacey.
23.4041 Contrast adaptation and illuminant spectra Ivana Ilic1(iva-
[email protected]), Maria Olkkonen2,3, Anya Hurlbert1; 1Institute of Neuroscience, Newcastle University, UK, 2Psychology, Durham University, UK,
3Institute of Behavioural Sciences, University of Helsinki, Finland
Perceptual estimates of surface colour are biased in memory toward the
mean of recently viewed colours (Olkkonen et al. 2014). Discrimination of
global illumination colour is also biased, with discrimination thresholds
enlarged for illumination changes opponent to the adaptation illumination;
yet overall discrimination is poorest for bluish illumination changes (Aston
et al. 2015). Does memory for illumination colour show the same central
tendency as surface colour, and are biases for memory and discrimination
linked? Participants (n=7) viewed an enclosed grey wall illuminated by
tuneable multi-channel LED lamps. Following an initial 2-min adaptation
period under D65 illumination, participants viewed on each trial: reference
light (500 ms), top-up adaptation light (2000 ms), and test light (500 ms);
then (under D65) responded by button press whether the test was “bluer or
yellower” than the reference. Each of three trial blocks contained 5 reference
[email protected]), Lorne Whitehead2, Michael Webster1; 1Department of Psychology, College of Liberal Arts, University of Nevada, Reno,
2Department of Physics and Astronomy, University of British Columbia
Artificial illuminants vary widely not only in their mean chromaticity but
also in the range or gamut of colors they produce. For example, new high-gamut LED illuminants can expand the saturation of reds and greens by
roughly 30%. We explored how the visual system might adapt to changes in
the color distributions induced by different illuminants. A set of simulated
surfaces (Munsell spectra) was constructed to form a uniform circle of chromaticities in cone-opponent space, when illuminated by a Planckian radiator
with a color temperature of 2724 or 4000 K. Corresponding sets were then
calculated for the same surfaces under a 3-primary LED spectrum with the
same mean chromaticity. Observers simultaneously adapted for 3 minutes
to a random sequence of the same surfaces under each pair of Planckian
vs. LED sources, shown in two 4-deg fields above and below fixation. A
light or diffused light. We used a magnitude estimation method for evaluations. Observers evaluated each sample's appearance in terms of glossiness, naturalness, translucency, sharpness, saturation, brightness, roughness, heaviness, hardness, and preference. They rated the appearance of test samples
under diffused light for each item in comparison with that under the direct
light, which served as a reference. The results showed differences in
appearance for all items between direct and diffused light. Those shifts
were generally larger in glossiness, sharpness, brightness, and roughness,
implying that those factors are especially influenced by the diffuseness of
lighting. Samples tended to appear less glossy and smoother under diffused
light than direct light, and this difference was larger for a sample with
a rough surface. These trends were consistent with our previous finding.
Acknowledgement: EY-10834
23.4044 Unraveling simultaneous transparency and illumination
23.4042 Universal information limit on real-world color con-
stancy David Foster1([email protected]), Iván Marín-Franch2;
1School of Electrical and Electronic Engineering, University of Manchester,
Manchester, UK, 2Faculty of Optics and Optometry, University of Murcia,
Murcia, Spain
The light reflected from scenes under the sun and sky changes over the
course of the day, yet the reflecting properties of individual surfaces appear
unchanged. The phenomenon of color constancy is often attributed to operations applied to cone photoreceptor signals. These operations include
cone-specific adaptation such as von Kries scaling, typically by average
scene color or the brightest color; transformations of combinations of cone
signals; and transformations of the whole color gamut, e.g., for optimum
color discrimination. But are any of these or similar operations sufficient
for constancy in the real world, where both spectral and geometric changes
in illumination occur, including changes in shadows and mutual illumination? To address this question, cone signals were calculated from time-lapse
hyperspectral radiance images of five different outdoor scenes containing
mixtures of herbaceous vegetation, woodland, barren land, rock, and rural
and urban buildings. Shannon’s mutual information between cone signals
was estimated across successive time intervals. Combined with the data
processing inequality from information theory (“functions of data cannot
increase information”), these estimates set an upper limit on the performance of any color constancy operation using cone signals alone. For all
five scenes, the information limit declined markedly with increasing time
interval, though not always monotonically. This pattern was little altered
by changing the way that cone signals were initially sampled before information was estimated, e.g., taking spatial ratios of signals, omitting signals
from dark regions of scenes, and using local statistical features (local mean,
maximum, and standard deviation of cone signals). Moreover, dividing
scenes into a mosaic of smaller patches for independent processing did not
improve performance. It seems that operations on color signals alone are
insufficient to uniquely identify the reflecting properties of individual surfaces. Reliable color constancy in the real world depends on more than just
Acknowledgement: Engineering and Physical Sciences Research Council, United
Kingdom (Grant Nos. GR/R39412/01, EP/B000257/1, and EP/E056512/1)
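The argument in this abstract combines a mutual-information estimate with the data processing inequality: no function applied to the later signals can carry more information about the earlier signals than the later signals themselves. The sketch below illustrates the inequality on a toy discrete example using a plug-in estimate of Shannon mutual information. The data, the variable names and the choice of processing function are hypothetical, and the authors' estimator for continuous cone signals is certainly more sophisticated.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Shannon mutual information (bits) of the empirical joint distribution of (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y])) for (x, y), c in joint.items())

# Hypothetical toy data: x is a signal at one time, y the corresponding signal later.
pairs = [(0, 0), (0, 1), (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7)]

info_raw = mutual_information(pairs)
# Any deterministic processing of y (here, coarse quantisation) can only keep or lose information.
info_processed = mutual_information([(x, y // 4) for x, y in pairs])
```

Here the raw signals share 2 bits and the quantised ones only 1 bit, so the raw estimate acts as the ceiling for any operation applied to the signals, which is the sense in which the abstract speaks of an upper limit.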
23.4043 Appearance of surface property influenced by the diffuse-
ness of lighting Yoko Mizokami1([email protected]), Yuki
Kiyasu1, Hirohisa Yaguchi1; 1Graduate School of Advanced Integration
Science, Chiba University
Lighting conditions can strongly influence the apparent surface properties of objects. It is known that the components of specular and diffuse reflection
change depending on the diffuseness of lighting. The diffuseness of lighting
could therefore influence the appearance of various surface properties, and it is
important to investigate these effects systematically. We previously examined
how the impression of surface appearance of test samples with different
roughness and shape changed under diffused light and direct light, and
our results suggested that glossiness and smoothness were main factors
influenced by the lighting conditions (ECVP2016). Here, we further examine how the surface appearance of test samples with different roughness
and shape changed by diffused light and direct light using real samples in
real miniature rooms. We prepared plane test samples with three different
levels of surface roughness and spheres with matt and gloss surfaces. A
sample was placed in the center of a miniature room with either directed
Acknowledgement: JSPS KAKENHI 16K00368, the Konica Minolta Imaging
Science Encouragement Award
changes Robert Ennis1([email protected]), Katja
Doerschner1; 1Justus-Liebig University, Giessen, Germany
Retinally incident light is an ambiguous product of spectral distributions of
light in the environment and their interactions with reflecting, absorbing,
and transmitting materials. An ideal color constant observer would unravel
these confounded sources of information and account for changes in each
factor. We have previously shown (VSS, 2016) that when observers view
the whole scene, they can disentangle simultaneous changes in the color
of the illumination and the surfaces of opaque objects, although standard
global scene statistics in the color constancy literature did not fully account
for their behavior. Here, we have extended this investigation to simultaneous changes in the color of the illuminant and of glass-like blobby
objects (similar to Glavens (Phillips et al., 2016)). To simulate changes in
the color of the illuminant and of transparent objects, we made a simple
physically-based GPU-accelerated rendering system. Color changes were
constrained to “red-green” and “blue-yellow” axes. At the beginning of
the experiment, observers (n=6) first saw examples of the most extreme
illuminant/transparency changes for our images. They were asked to use
these references as a mental scale for illumination/transparency change
(0% to 100% change). Next, they used their scale to judge the magnitude
of illuminant/transparency change between pairs of images. Observers
viewed sequential, random pairs of images (2s per image) with a view of
the whole scene or of only the object itself (produced by masking the scene).
Observers were capable of extracting simultaneous illumination/transparency changes when provided with a view of the whole scene, but were
worse when viewing only the object. Global scene statistics did not fully
account for their behavior in either condition. We take this as suggesting
that observers make use of local changes in shadows, highlights, and caustics across different objects to determine the properties of the illuminant
and the objects it illuminates.
23.4045 #thedress: A Tool for Understanding How Color Vision
Works Rosa Lafer-Sousa1([email protected]), Bevil Conway2; 1Department of Brain and Cognitive Sciences, MIT, 2Laboratory of Sensorimotor
Research, National Eye Institute, NIH
The “dress” photograph provides an opportunity to investigate how the
brain resolves stimulus ambiguity to achieve color. We analyzed responses
from a large number of naïve and non-naïve subjects, collected in lab and
online. First, contrary to initial scientific reports suggesting a wide range
of dress percepts, using K-means clustering to analyze color-matching
responses we find the dress was viewed categorically (white/gold and
blue/black) among observers who had seen the photograph before and
naïve subjects (color-matching responses predicted subjects’ categorical
labels, binomial regression). As well, 48% of observers self-reported experiences of perceptual switching (W/G switched less often). These results
show that #thedress is analogous to bi-stable shape images. Second, we
quantitatively compared color-matching responses obtained online and
in laboratory, and performed a power analysis to determine the number
of subjects required to obtain results representative of the general population. We conclude that initial scientific studies were underpowered.
Third, observers' descriptions of the lighting conditions were predictive
of the colors seen (binomial regression); W/G observers typically inferred
a cool illuminant, whereas B/K observers inferred a warm illuminant.
Fourth, subjective reports of where in the image subjects looked revealed
systematic differences between B/K and W/G observers. Fifth, we show
new color was sampled from the distributions every 200 ms. Test stimuli
were then shown for 500 ms in the two fields and interleaved with 4 sec
of re-adaptation. The tests included 16 chromaticities uniformly sampling
different chromatic angles relative to the illuminant mean. The test pair
were yoked so that increasing the test contrast in the top field reduced it in
the bottom field or vice versa, and observers adjusted them to match their
appearance. These matches required significantly higher contrast along the
reddish-greenish axis for the LED adaptation, consistent with a sensitivity
loss induced by selective adaptation to the higher red-green contrast created by the LED spectra. Our results suggest that commonly available light
sources may significantly alter the states of contrast adaptation in the visual
system, and that this contrast adaptation is important for understanding
the perceptual consequences of both short and long-term exposure to different illuminants.
Funding: EY-10834
here that when the dress is cropped from the rest of the photograph, and
digitally placed on a female model, a color tint applied to the model’s skin
that reflects the illuminant was sufficient for observers to disambiguate the
illuminant color and achieve a predictable perception of the dress’ colors.
Finally, presenting versions of the photograph with unambiguous lighting
cues influenced how subjects reported the dress’ colors in subsequent viewings of the original photograph. Together the results document a powerful
example of a bi-stable color image, and illustrate how multiple perceptual
and cognitive cues are used by the brain to resolve color.
was approximately 7.2 (a Caucasian 5.7, two Chinese 7.9), slightly larger
than the conventional values of acceptable perceptual error (4.0). Despite
these differences, color constancy indices between selected CIE standard
illuminants (D65, A, F2, F11) ranged over 0.57–0.84, close to values from
traditional color-constancy experiments with human observers. The color
quality of facial prostheses in modern additive skin manufacturing may be
as good perceptually as that of real human skin, even under different scene
23.4048 When the brightest is not the best: Illuminant estimation
23.4046 Luminance-contrast reversal disambiguates illumination interpretation in #TheDress Shigeki Nakauchi1([email protected]), Kai
Shiromi1, Hiroshi Higashi1, Mohammad Shehata1,2, Shinsuke Shimojo2;
1Department of Computer Science and Engineering, Toyohashi University
of Technology, 2Biology and Biological Engineering, California Institute of Technology
Background: One of the potential hypotheses for explaining the individual
differences in perceiving #TheDress is ambiguity in illumination interpretation. Asymmetry between luminance and blue-yellow in variations of
natural sunlight is suspected to play an important role. Here, to test the
hypothesis, color matching experiments were conducted for variants of
#TheDress. Methods: As for the visual stimuli, we manipulated the hue
and/or luminance contrast of the original #TheDress: original (OR), hue
reversed (HR), luminance contrast reversed (LR), hue and luminance contrast reversed (HLR). Observers were asked to view one of these images
displayed on a calibrated monitor and to match the dress/lace colors in
CIELUV uniform color space by selecting the closest color among 25 uniform rectangular color patches equally spaced in L*, u* and v* coordinates
by button press. Observers were pre-categorized into blue-black (BK) and
white-gold (WG) groups by their color naming responses to the original. Results: Matches between BK and WG differ in L* (lightness) and v*
(blue-yellow direction) for the OR, replicating previous observations. However, L* matches for the bright lace part in the HR image
still differ among groups although individual differences in color naming
vanished. For both the LR and HLR, however, we found no differences
in matches between groups although the HLR image had the same luminance-color structure (bright blue and dark yellow) as the original. Discussion: Results imply that luminance-contrast polarity is one of the key
factors affecting the individual differences in #TheDress. This is because
reversing the luminance-contrast may disambiguate indirect/direct illumination interpretations. Furthermore, specular highlights do not work
as a local cue for the illuminant color in the luminance-contrast reversed
23.4047 Color quality assessments of 3D facial prostheses in
varying illuminations Kinjiro Amano1([email protected]), Ali
Sohaib2, Kaida Xiao3, Julian Yates1, Charles Whitford4, Sophie Wuerger2;
1School of Medical Sciences, University of Manchester, UK, 2Institute of
Psychology Health and Society, University of Liverpool, UK, 3School of
Design, University of Leeds, UK, 4School of Engineering, University of
Liverpool, UK
Skin color provides essential information about an individual’s health condition and emotion. Additive manufacturing of human skin has been developed markedly in recent years along with increasing demands for clinical
and medical applications. It is therefore critical to achieve precise color
reproduction of facial skin and constant color appearance under different
illuminations particularly for the application to maxillofacial prostheses. In
this study, the color quality of 3D facial prostheses under various illuminations was assessed by measuring human perceptual error, quantified by
the color difference metric CIEDE2000, and an index for color constancy.
The index was calculated in the same manner as a standard color-constancy
index. Thus, in a color space, where the chromaticity coordinates of real
skin and artificial skin were located, let a be the distance between real skin
and artificial skin colors under a test illuminant and let b be the distance
between the real skin color under test and reference illuminants, then the
index is 1 – a/b. Perfect constancy corresponds to unity and the greater the
error, the lower the index. 3D facial prostheses of three human subjects, one
Caucasian and two Chinese, were generated by additive manufacturing
with elaborate color management from 3D color digital imaging to 3D
printing. Colors of the 3D prostheses and subjects’ real skin were compared
with a spectrophotometer. Mean color difference CIEDE2000 over subjects
Acknowledgement: EPSRC, UK (EP/L001012/1 and EP/K040057)
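The color-constancy index defined in abstract 23.4047 (1 - a/b) can be illustrated with a minimal sketch. The function name, the two-dimensional chromaticity tuples, and the use of Euclidean distance are assumptions made for illustration only; the abstract does not state which color space or distance measure was used.

```python
import math


def constancy_index(real_test, artificial_test, real_reference):
    """Return 1 - a/b for chromaticity coordinates given as tuples.

    a: distance between real and artificial skin under the test illuminant.
    b: distance between real skin under the test and reference illuminants.
    Euclidean distance is an assumption, not taken from the abstract.
    """
    a = math.dist(real_test, artificial_test)
    b = math.dist(real_test, real_reference)
    if b == 0:
        raise ValueError("test and reference illuminants give identical colors")
    return 1.0 - a / b


# Hypothetical coordinates: an artificial skin that matches real skin exactly
# gives a = 0 and therefore an index of 1 (perfect constancy).
perfect = constancy_index((0.0, 0.0), (0.0, 0.0), (3.0, 4.0))
# Here a = 3 and b = 6, so the index is 0.5.
partial = constancy_index((0.0, 0.0), (3.0, 0.0), (0.0, 6.0))
```

As the abstract notes, larger errors (larger a relative to b) lower the index.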
based on highlight geometry Takuma Morimoto1([email protected]
new.ox.ac.uk), Robert Lee2, Hannah Smithson1; 1University of Oxford,
Department of Experimental Psychology, 2University of Lincoln, School of
To achieve color constancy, the visual system must estimate the illuminant. An influential proposal for illuminant estimation is to assume that
the brightest element in a scene is either a white surface or a specular highlight and therefore provides the illuminant color. We tested an alternative
hypothesis: Observers use the geometry of the surface and the illumination
to select highlight regions, even when they are not the brightest elements
in the scene. In computer-rendered scenes we manipulated the reliability
of the “brightest element” and the “highlight geometry” cues to the illuminant, and tested the effect on performance in an operational color constancy task. To eliminate other cues to the illuminant, scenes contained only
a single spherical surface illuminated by multiple point sources of light,
each with the same spectral content. The surface reflectance took a single
spectral distribution but was modified by surface texture that attenuated
the reflectance by a variable scale factor. The surface had one of three levels
of specularity: zero (matte), low and mid. In the experiment, observers saw
a one-second animation and their task was to indicate if the color change
was due to an illuminant change or a material change. Discrimination performance was close to chance for matte surfaces, as predicted. However,
as specularity increased, performance significantly improved. Importantly,
it was shown that performance exceeded the prediction given by an ideal
observer using the brightest element to perform the discrimination. Moreover, separate analyses for trials in which the specular region fell on a
dark part of the texture showed an additional performance enhancement,
even though the brightest element heuristic would predict a performance
decrease. These results suggest that human observers do not simply rely on
the brightest element in constancy tasks, but rather utilize the geometry of
specular regions to separate surface and illuminant properties.
Acknowledgement: Wellcome Trust (094595/Z/10/Z)
Binocular Vision: Continuous flash suppression
and awareness
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4049 Why are dynamic Mondrian patterns unusually effective in
inducing interocular suppression? Shui’Er Han1([email protected]
com), Garry Kong1,2, Randolph Blake3, David Alais1; 1School of Psychology, University of Sydney, 2Science Division, New York University Abu
Dhabi, 3Department of Psychology, Vanderbilt University, Nashville, TN
In so-called continuous flash suppression (CFS), a dynamic sequence of
Mondrian images presented to one eye effectively suppresses a static target in the other eye for many seconds at a time. This strong and enduring interocular suppression is generally attributed to the rapid Mondrian
pattern changes, which resemble a series of backward and forward masks.
However, using spatiotemporal filtering techniques, recent studies demonstrate similarities with binocular rivalry, with CFS producing strong suppression when stimuli favouring parvocellular streams (slow temporal
modulations) are used and when target/masker attributes are matched. To
evaluate this discrepancy, we manipulated the pattern and temporal structure of a 10 Hz Mondrian and measured the respective effects on suppression durations. The Mondrian pattern is an ideal masker because it contains
an abundance of edge and contour information and these features influence
both visual temporal masking and rivalry suppression. Compared to phase
23.4050 Mechanisms of suppression: How the classic Mondrian
beats noise in CFS masking Weina Zhu1,2([email protected]),
Jan Drewes2, David Melcher2; 1School of Information Science, Yunnan
University, 650091 Kunming, China, 2Center for Mind/Brain Sciences
(CIMeC), University of Trento, 38068 Rovereto, Italy
In a typical Continuous Flash Suppression (CFS) paradigm (Tsuchiya &
Koch, 2005), a series of different “Mondrian” patterns is repeatedly flashed
to one eye, suppressing awareness of the image presented to the other eye.
In our previous study (Zhu, Drewes, & Melcher, 2016), we found that the
spatial density of the Mondrian patterns affected the effectiveness of CFS.
To better understand this finding, we varied the shape and edge information in the mask. Typical Mondrian-style masks are made from individual rectangular patches, resulting in sharp horizontal and vertical edges
between neighboring luminance levels. To investigate the role of these
edges, we compared grayscale Mondrian masks with various noise patterns as well as phase-scrambled Mondrian equivalents and “Klee” masks
(a type of pink noise mask with edges). We employed a breakthrough CFS
paradigm with photographic face/house stimuli and a range of temporal
masking frequencies (3-16Hz). The noise patterns were white noise with
spatial frequency filtering applied, resulting in noise spectra ranging from
1/f^0.5 to 1/f^10. Subjects (N=16) were instructed to press the button as soon
as they saw any part of the stimulus. Results show that the most effective
mask was the classic Mondrian. Among the noise masks, pink noise (1/f^1)
led to longer suppression while the least effective masking was achieved
by 1/f^10 noise. Interestingly, the masking effectiveness of the phase-scrambled Mondrian masks as well as the “Klee” masks was not significantly different from pink noise. Adding edges to noise masks therefore did not significantly improve masking effectiveness. Phase scrambling the Mondrian
patterns did significantly reduce their effectiveness. The remaining advantage Mondrian masks have over random noise patterns may result from the
higher effective contrast range or presence of surface shapes afforded to the
Mondrians by the patchwork design.
Acknowledgement: This work was supported by the National Natural Science
Foundation of China (61005087, 61263042, 61563056), Science and Technology Project of Yunnan Province, China (2015Z010). JD and DM were
supported by a European Research Council (ERC) grant (Grant Agreement No.
313658) and High-level Foreign Expert Grant (GDT20155300084).
23.4051 Different suppressing stimuli produce different suppression
in the continuous flash suppression paradigm Motomi
Shimizu1([email protected]), Eiji Kimura2; 1Graduate School of
Advanced Integration Science, Chiba University, Japan, 2Department of
Psychology, Faculty of Letters, Chiba University, Japan
Purpose: Stimulating one eye with a high-contrast dynamic stimulus can
render a salient stimulus in the other eye invisible (continuous flash suppression; CFS). We have previously demonstrated, using a flickering achromatic Gabor as the suppressing stimulus, that successive exchanges of
the eye-of-presentation led to breaking suppression (Shimizu & Kimura,
VSS2016). This finding suggested that CFS is mainly mediated by eye- rather than stimulus-based suppression. This study aimed to extend the
previous finding and investigated whether different suppressing stimuli
such as Mondrian patterns would produce similar or different suppression. Method: The suppressing stimulus was a series of different Mondrian
patterns presented at a rate of 5 Hz. The patterns were either achromatic
or chromatic. The target was an achromatic Gabor patch (2.5 cpd, sigma
= 0.22°, 40% contrast). The eye of presentation was manipulated in three
conditions. In the dichoptic condition the suppressing stimulus was presented to the observer’s dominant eye and the target to the other. In
the eye-swap condition the two stimuli were dichoptically presented but
repeatedly exchanged between the eyes every 1 second. In the monocular
condition they were presented to the same eye. We asked observers to
detect the target as soon as possible and measured detection time. Results &
Discussion: In the dichoptic condition, Mondrian patterns, whether achromatic or chromatic, produced longer detection time (i.e., stronger suppression) than the Gabor patch used previously. Moreover, in contrast to the
results with the Gabor, Mondrian patterns resulted in significantly longer
detection time (3.7 sec) in the eye-swap condition than in the monocular
condition (1.9 sec), although the suppression was much reduced relative to
that in the dichoptic condition (8.7 sec). The significant, although reduced,
suppression in the eye-swap condition cannot be easily accounted for by
eye-based suppression, and implicates a partial contribution of eye-independent, possibly stimulus-based, suppression.
Acknowledgement: Supported by JSPS KAKENHI (26285162 & 25285197)
23.4052 Analyzing the time course of processing invisible stimuli: Applying event history analysis to breaking continuous flash
suppression data. Pieter Moors1([email protected]), Johan
Wagemans1; 1Laboratory of Experimental Psychology, Department of
Brain and Cognition, University of Leuven (KU Leuven)
Breaking continuous flash suppression (b-CFS) is an interocular suppression paradigm in which the time to detect an initially suppressed stimulus is measured for different classes of stimuli. For example, in one classic
study Jiang et al. (2007) reported shorter suppression times for upright
compared to inverted faces. Because such a difference was not observed in
a perceptually matched condition which did not involve interocular suppression, Jiang et al. argued that they had provided evidence for unconscious
processing of face stimuli. Although the suitability of b-CFS as a paradigm
for unraveling the scope and limits of unconscious processing has already
been firmly criticized, it is still argued that the paradigm is useful because
increased suppression durations allow for more elaborate processing of the
perceptually suppressed stimulus and hence certain effects can be detected
more easily. In this study, we explicitly tested this claim by applying event
history analysis to a set of b-CFS studies collected in recent
years. Event history analysis refers to a set of statistical methods for studying the occurrence and timing of events while explicitly taking the passage
of time into account. For an experiment comparable to Jiang et al. (2007),
our analyses show that, over the course of a trial, the hazard (i.e., probability of response occurrence at time t, given that no response has occurred
yet) associated with upright faces was higher compared to inverted faces,
supporting faster breakthrough of upright faces. However, this face orientation effect did not interact with time, indicating no differential evolution
over time for the hazard functions associated with upright and inverted
faces. We discuss these results in the context of current discussions on the
usefulness of b-CFS paradigms to allow for more elaborate processing of
the perceptually suppressed stimulus.
Acknowledgement: METH/02/14
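The hazard described in abstract 23.4052 (the probability of a response at time t, given that no response has occurred yet) can be illustrated with a minimal discrete-time sketch. The function name, bin width, and breakthrough times below are hypothetical, and this simple life-table estimate is an assumption for illustration; the abstract does not describe the authors' actual models.

```python
def discrete_hazard(times, bin_width, n_bins):
    """Estimate the hazard in each time bin from breakthrough times.

    For bin k, hazard = (responses in the bin) / (trials still without a
    response at the start of the bin). Trials whose time falls beyond the
    last bin simply stay in the risk set. Returns None for a bin whose
    risk set is empty.
    """
    at_risk = len(times)
    hazards = []
    for k in range(n_bins):
        lo, hi = k * bin_width, (k + 1) * bin_width
        events = sum(1 for t in times if lo <= t < hi)
        hazards.append(events / at_risk if at_risk > 0 else None)
        at_risk -= events
    return hazards


# Hypothetical breakthrough times in seconds for one condition.
example_times = [0.5, 1.5, 1.6, 2.5]
example_hazard = discrete_hazard(example_times, bin_width=1.0, n_bins=3)
# Bin 0: 1 of 4 trials respond; bin 1: 2 of the remaining 3; bin 2: 1 of 1.
```

Comparing such per-bin estimates between conditions (for example upright versus inverted faces) is one way to ask whether an effect changes over the course of a trial.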
23.4053 Perceptual learning does not affect access to awareness
Chris Paffen1([email protected]), Surya Gayet1, Micha Heilbron1,
Stefan Van der Stigchel1; 1Experimental Psychology & Helmholtz Institute,
Utrecht University
Visual information that is relevant for an observer can be prioritized for
access to awareness (Gayet et al., 2014). Here we investigate whether information that became relevant due to extensive training is prioritized for
awareness. Participants performed 3 days of speed discrimination training
involving dots moving in two directions, of which one was task-relevant
(the attended direction) and the other was task-irrelevant (the ignored
direction). Before and after training, we measured detection times for
reporting the location of moving dots that were initially suppressed from
awareness by continuous flash suppression (breaking continuous flash
suppression, or b-CFS, a method often used to assess prioritization for access
to awareness). We hypothesized that b-CFS durations for the
attended direction would selectively decrease after training. We also measured motion coherence thresholds for the attended, ignored and a neutral
motion direction before and after training. Results show that perceptual
learning took place: during training, speed discrimination of the attended
motion direction became increasingly better. Also, coherence thresholds for
the attended motion direction decreased after training, while thresholds for
ignored and neutral motion directions were unaffected. B-CFS durations
decreased after training for all three motion directions, revealing no selective
decrease for the previously attended motion. A follow-up experiment
showed that b-CFS durations also decreased after three days without training, revealing that perceptual learning did not cause the general decrease in
b-CFS durations. We conclude that information that has become relevant
due to extensive training is not prioritized for access to awareness. Our
experiments do show, however, that b-CFS durations decrease for stimuli
that are shown in succession, even when measurements are separated by
several days. The latter has important consequences for studies applying
b-CFS to assess access to awareness. Gayet, S., Van der Stigchel, S., &
Paffen, C. L. E. (2014). Front Psychol, 5, 460.
scrambled Mondrians, our findings reveal significantly longer suppression durations for intact Mondrian patterns. This suppressive advantage
applied to both location and identity judgments, and was predominantly
driven by pattern edges. Updating the Mondrian smoothly and continuously resulted in lower suppression durations than the standard, discrete
presentation schedule, demonstrating the significant contribution of visual
temporal masking in CFS. The differences in suppression durations with
an intact, discretely updated Mondrian masker also varied with temporal
frequency content, suggesting that there might be a dual component mechanism in CFS involving temporal masking and interocular suppression.
23.4054 The Functional Order of Binocular Rivalry and Blind Spot
Filling-in Stella Qian1([email protected]), Jan Brascamp1,2, Taosheng
Liu1,2; 1Department of Psychology, Michigan State University, 2Neuroscience Program, Michigan State University
Binocular rivalry occurs when the two eyes receive conflicting information,
leading to perceptual alternations between the two eyes’ images. The locus of
binocular rivalry has received intense investigation as it is pertinent to the
mechanisms of visual awareness. Here we assessed the functional stage
of binocular rivalry relative to blind spot filling-in. Blind spot filling-in
is thought to transpire in V1, providing a reference point for the locus of
rivalry. We conducted two experiments to explore the functional order of
binocular rivalry and blind spot filling-in. Experiment 1 examined if the
information filled-in at the blind spot can engage in rivalry with a physical stimulus at the corresponding location in the fellow eye. Resulting perceptual reports showed no difference between this condition and a condition where filling-in was precluded by presenting the same stimuli away
from the blind spot, suggesting that the rivalry process is not influenced
by any filling-in that might occur. In Experiment 2, we paired the fellow
eye’s rival stimulus, not with the filled-in surface at the blind spot, but with
the ‘inducer’ that immediately surrounds the blind spot and that engenders filling-in. We also established two control conditions away from the
blind spot: one involving a ring physically identical to the inducer, and one
involving a disk that resembled the filled-in percept. Perceptual reports in
the blind spot condition resembled those in the former, ‘ring’ condition,
more than those in the latter, ‘disk’ condition, indicating that a perceptually suppressed inducer does not engender filling-in. Our behavioral data
suggest that binocular rivalry functionally precedes blind spot filling-in.
We conjecture that binocular rivalry involves processing stages at or before
V1, which would be consistent with views of binocular rivalry that involve
low-level competition, and with evidence that binocular rivalry correlates
can be found as early as the lateral geniculate nucleus.
Acknowledgement: Supported by a grant from the National Institutes of Health
23.4055 The content of visual working memory alters processing
of visual input prior to conscious access: evidence from pupillometry Surya Gayet1([email protected]), Chris Paffen1, Matthias Guggenmos2,
Philipp Sterzer2, Stefan Van der Stigchel1; 1Experimental Psychology,
Utrecht University, Helmholtz Institute (Utrecht, The Netherlands),
2Psychiatry and Psychotherapy, Charite University Medicine (Berlin, Germany)
Visual working memory (VWM) allows for keeping relevant visual information available after termination of its sensory input. Storing information
in VWM, however, affects concurrent conscious perception of visual input:
initially suppressed visual input gains prioritized access to consciousness
when it matches the content of VWM (Gayet et al., 2013). Recently, there
has been a debate whether such modulations of conscious access operate
prior to conscious perception or, rather, during a transition period from
non-conscious to conscious perception. Here, we used pupil size measurements to track the influence of VWM on visual input continuously, and
dissociate between these possibilities. Participants were sequentially presented with two shapes drawn from different shape categories (ellipses,
rectangles, or triangles) and a retro-cue, indicating which of the two shapes
should be remembered for subsequent recall. During the retention interval,
participants were instructed to report whether a target shape, which either
matched or mismatched the concurrently memorized item, was presented
left or right of fixation. Critically, the target shape was initially suppressed
from consciousness by continuous flash suppression, and could therefore
only be responded to once it was consciously accessible. Analyses of
response times revealed that targets were released from suppression faster
when they matched compared to when they mismatched the memorized
shape. This behavioral effect was paralleled by a differential pupillary
response, such that pupil constriction was more pronounced when visual
input matched than when it mismatched the content of VWM.
Importantly, this difference in pupil size had already emerged 500 ms after
target onset, and almost two seconds before participants could report the
location of the target shape. We conclude that the content of VWM affects
processing of visual input when it is not yet consciously accessible, thereby
allowing it to reach prioritized conscious access.
Acknowledgement: This project was funded by grants 404.10.306 and 452.13.008
from the Netherlands Organization for Scientific Research to S.V.d.S. and
C.L.E.P., and to S.V.d.S. respectively, by a seed money grant from Neuroscience
and Cognition Utrecht to S.G., and M.G. and P.S. were supported by the German
Research Foundation (grants STE 1430/6-2 and STE 1430/7-1).
23.4056 Access to awareness and semantic categories: low-level
image properties drive access to awareness Sjoerd Stuit1([email protected]
uu.nl), Martijn Barendregt2, Susan te Pas1; 1Experimental Psychology,
Utrecht University, 2Experimental & Applied Psychology, Vrije Universiteit Amsterdam
Many social and visual research experiments have demonstrated behavioral
effects based on semantic category differences, even when presented subliminally. For example, threatening faces reach awareness faster compared
to neutral faces and naked human bodies attract attention when they match
your sexual preference. Overall, images from categories that are relevant to
an observer reach awareness faster compared to irrelevant images. However, a direct comparison of the processing of visual images from different
semantic categories is complicated by the inherent differences in low-level
image properties. Thus the question remains: is the time an image requires
to reach awareness determined by the semantic category and its relevance
or by low-level image properties? Here, we used a set of 400 pseudo-randomly selected images (from Google images) divided into four semantic
categories (food, animals, art and naked human bodies) to test if access to
awareness differs between categories when low-level image properties are
taken into account. We used a breaking continuous flash suppression paradigm to measure the amount of time an image takes to reach the observer’s
awareness. Next, we extracted multiple indices of color and spatial frequency information from each image. Using a mixed-effects analysis we
show that after taking image statistics into account, naked human bodies
show no categorical effect on access to awareness. However, images of animals still result in a deviating rate of access to awareness compared to all other
categories. Taken together, we show that most of the variance in access to
awareness is in fact due to differences in low-level image properties. In
particular, we find that differences in the spatial frequency content of the
target image and that of the interocular mask strongly predict the variance
in access to awareness. Our results demonstrate the importance of taking
image statistics into account before comparing semantic categories.
23.4057 The effect of trypophobic images on conscious awareness
during continuous flash suppression Risako Shirai1([email protected]
kwansei.ac.jp), Hirokazu Ogawa1; 1Department of Integrated Psychological Sciences, Kwansei Gakuin University
Trypophobia is a fear of clustered objects like lotus seed heads. Trypophobic objects do not involve dangerous objects, but they are a source of
discomfort. Recently, Cole and Wilkins (2013) demonstrated that such trypophobic images contained excess energy at a particular range of spatial
frequencies and claimed the unique power spectrums caused discomfort.
In the present study, we examined whether the unique trypophobic power
spectra affect access to conscious awareness by using the breaking continuous flash suppression (b-CFS) paradigm. In the b-CFS paradigm, a
dynamic masking pattern is presented to one eye, which can suppress
awareness of a target image presented to the other eye until the target
image breaks the suppression. The target images consisted of trypophobic, fear-related, hole or neutral scenes. All target images were shown in their original form in
the intact-image condition, while the target images were converted to phase-scrambled
images in the phase-scrambled-image condition. In both conditions,
participants were instructed to press a left or right key to indicate where
the target image appeared on the display. The results showed that the
fear-related and hole images emerged into awareness faster than the neutral images in the intact-image condition. Moreover, the trypophobic images
emerged into awareness faster than neutral, fear-related and hole images.
23.4058 Dissociating the Effects of Relevance and Predictability on
Visual Detection Sensitivity Roy Moyal1([email protected]), Shimon
Edelman1; 1Department of Psychology, College of Arts and Sciences,
Cornell University
When confronted with familiar tasks, people draw upon past experience to
anticipate sensory input and optimize their performance. While it is uncontroversial that predictions influence perception, their effects on visual
detection sensitivity are debatable; some studies suggest that surprise facilitates detection, whereas others show the opposite effect. These seemingly
contradictory findings might be attributable to interactions between the relevance and the predictability of the stimuli used. To clarify the effects of
expectation on visual detection sensitivity and dissociate them from those
of relevance cues and primes, we conducted two continuous flash suppression experiments. In each trial, participants viewed a thin gray bar in one
eye and an animated rectangular patch in the other eye. Expectation was
manipulated by the inclusion of a cue, which predicted (with 80% validity) the orientation of the bar in half of the trials. In the first experiment,
participants were asked to quickly press a key only if they detected a bar
of a certain orientation (the relevance manipulation). In the second experiment, the cues were identical to the targets predicted by them in half of
the trials; in the remaining trials, colored circles were used instead (the cue
type manipulation). When the masked bar did not warrant a response,
detection performance was poorer (in terms of both visibility reports and
localization performance) in invalid cue trials relative to both valid cue and
nonpredictive cue trials. These differences were absent when the presented
stimulus was behaviorally relevant. Primes and abstract predictive cues,
when valid, improved detection performance to similar extents. Our results
suggest that, when both are at play, the effects of attentional priorities on
visual detection thresholds override those of prior expectations. They also
indicate that predictive cueing and repetition priming may rely on similar
neural mechanisms.
Binocular Vision: Other
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4059 Interocular interactions in macaque LGN Kacie Dougherty1([email protected]),
Michele Cox1, Jacob Westerberg1,
Alexander Maier1; 1Department of Psychology, College of Arts and Science, Vanderbilt University
Some of the most common visual disorders, such as amblyopia and stereoblindness, affect binocular vision. However, our understanding of how
the brain processes binocular inputs is limited. Here we investigate where
the signals from the two eyes first interact in the primary visual pathway in
primates with normal binocular vision. The LGN is the first structure in this
pathway receiving inputs from both eyes, with neighboring layers receiving exclusive inputs from one retina or the other. While the vast majority of
neurons in the LGN are driven by stimulation of one eye only, it is unclear
to what degree responses of LGN neurons depend on what is viewed by
both eyes. In the primary visual cortex (V1), the next stage in the primary
visual pathway, the vast majority of neurons respond to either eye, with
one eye often evoking stronger responses than the other. In this study, we
test the hypothesis that interocular interactions occur prior to spiking in V1.
We trained macaque monkeys to fixate on a computer screen. Using a linear
multicontact electrode array, we recorded LGN spiking responses to drifting gratings that varied in contrast and were presented to one or both eyes.
Then, we compared contrast response functions under monocular and binocular stimulation conditions. We observed that the firing rate of a minority
of LGN neurons, exclusive to the magnocellular layers, modulated under
binocular stimulation. These effects included both binocular suppression
and facilitation. We will discuss these results with regard to interocular
anatomical connections in the primate early visual system.
Acknowledgement: 2T32 EY007135-21, P30-EY008126, Knights Templar Eye
Foundation, Whitehall Foundation
23.4060 Overestimation of the number of elements in a three-dimensional stimulus is dependent on the size of the area containing the elements Yusuke Matsuda1([email protected]), Koichi
Shimono1, Saori Aida2; 1Faculty of Marine Technology, Tokyo University
of Marine Science and Technology, 2School of Computer Science, Tokyo
University of Technology
Numerosity perception has been intensively examined using two-dimensional (2-D) stimuli, but has almost never been investigated using three-dimensional (3-D) stimuli. Recently, however, it was reported that a stereoscopic 3-D stimulus is perceived to have more elements than a stereoscopic
2-D stimulus when both contain the same number of elements. This suggests that the depth structure of the stimulus plays a role in numerosity
perception. We examined the effect of the size of the area containing the elements on the overestimation phenomenon, using random-dot stereograms
for 3-D and 2-D stimuli, which consisted of black square elements (6.7 * 6.7
arcmin) scattered in a circular area (4.4, 8.9, or 13.3 arcdeg in diameter).
When the stereograms were fused, the 3-D and 2-D stimuli were perceived
to have two transparent surfaces and a single surface, respectively. Observers performed a numerosity discrimination task, where they identified
which of the two stimuli (presented side-by-side) had a greater number
of elements. The number of elements was maintained constant at 50, 100,
or 150 for the 3-D stimulus and varied for the 2-D stimulus to calculate the
Weber fraction as an index of the degree of numerosity overestimation for
the 3-D stimulus. The results indicated that the Weber fraction increases
with the size of the circular area and the number of elements. The results
can be explained in terms of a process or processes, with an output representing the perceived numerosity. The process(es) loads the visual system
more heavily when the observer estimates the number of elements scattered in a 3-D space than when a single surface is estimated. This results in
the overestimation of the elements in the 3-D stimuli.
23.4061 Binocular contrast interactions in cross- and iso-oriented surround modulation: measurement and modeling Pi-Chun
Huang1([email protected]); 1Department of Psychology,
National Cheng Kung University
The detectability and discrimination abilities of a visual target can be
improved or impaired by its surround stimulus, which is termed center-surround modulation. However, it remains unclear whether surround modulation occurs before or only after the binocular integration
stage. In response, the pattern-masking paradigm was adopted to systematically measure the detection threshold of a target (horizontal Gabor, 2
cpd) under various pedestal contrasts and two surround contrasts (0 and
0.4), and with monocular, binocular and dichoptic viewing conditions. We
also compared the modulation effects when the surround orientation was
in parallel or orthogonal to the target orientation. With the monocular and
dichoptic viewing conditions, the results showed that surround facilitation
occurred at low pedestal contrast when the target and the surround mask
were presented to the same eye; in contrast, surround suppression occurred
at low pedestal contrast when the target and mask were presented to different eyes regardless of the pedestal’s eye origin. With the binocular viewing
condition, the surround modulation disappeared. To further investigate
this phenomenon, the surround modulation effects under different combinations of eye origin were fitted with a two-stage binocular contrast-gain
control model. The model not only successfully described the results, but
also demonstrated that the surround modulation occurred before binocular summation, with interocular suppression also being involved. Furthermore, surround modulation was better modeled by using the multiplicative
excitatory and multiplicative suppressive factors at the monocular level,
but linearly added for interocular influence. Thus the role of surround
modulation was to raise the gain of the spatial filter at the monocular level.
Acknowledgement: This work was supported by NSC-101-2401-H-006-003-MY2
and NSC 102-2420-H-006-010-MY2 to PCH.
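The two-stage structure described in abstract 23.4061 can be illustrated with a generic binocular contrast-gain control sketch. This is not the author's fitted model: the functional form is a common two-stage layout (monocular gain control with interocular suppression, binocular summation, then an output nonlinearity), and every name and parameter value below (`m`, `s`, `p`, `q`, `z`, `k_exc`, `k_sup`) is an illustrative assumption. The multiplicative factors `k_exc` and `k_sup` stand in for a surround acting on monocular excitation and suppression.

```python
def monocular_stage(c_self, c_other, k_exc=1.0, k_sup=1.0, m=1.3, s=1.0):
    """Stage 1 for one eye (contrasts in percent).

    Excitation c_self**m is divided by a pool containing the eye's own
    contrast and the other eye's contrast (interocular suppression).
    k_exc and k_sup are hypothetical multiplicative surround factors.
    """
    return k_exc * c_self ** m / (s + k_sup * c_self + c_other)


def binocular_response(c_left, c_right, p=8.0, q=6.5, z=0.1, **surround):
    """Stage 2: sum the two monocular outputs, then apply a second nonlinearity."""
    binsum = (monocular_stage(c_left, c_right, **surround)
              + monocular_stage(c_right, c_left, **surround))
    return binsum ** p / (z + binsum ** q)


if __name__ == "__main__":
    # Binocular vs. monocular presentation of a 1% target.
    print(binocular_response(1.0, 1.0), binocular_response(1.0, 0.0))
    # A 40% contrast in the other eye suppresses the 10% signal at stage 1.
    print(monocular_stage(10.0, 40.0), monocular_stage(10.0, 0.0))
```

In this sketch a surround that raises `k_exc` increases the monocular gain before binocular summation, which is the kind of pre-summation locus the abstract argues for; the actual fitted factors are not given in the abstract.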
However, the phase-scrambled versions of the trypophobic images did
not show any differences between the image types. These results showed
that the trypophobic unique power spectrums did not affect the conscious
awareness. Furthermore, we assessed what factors contributed to creating
the benefit of trypophobic images on awareness using multiple regression
analysis. The results showed that the benefit of trypophobic images on
awareness was predicted by the benefit of hole and fear-related images on
awareness. Taken together, the individual cognitive processes of trypophobic images might be explained by how much the processes of the simple
geometric shape and emotion were facilitated.
23.4062 Binocular Combination: Data and Binocular Perceptual
Template Model Chang-Bing Huang ([email protected]), Ge
Chen1,2,3, Fang Hou4, Zhong-Lin Lu5; 1CAS Key Laboratory of Behavioral
Science, Institute of Psychology, Chinese Academy of Sciences, 16 LinCui
Rd, Chaoyang Dist, Beijing 100101, China, 2University of Chinese Academy of Sciences, Beijing, China, 3Visual Information Processing and Learning Lab (VisPal), Institute of Psychology, CAS, Beijing, China, 4School
of Ophthalmology and Optometry and Eye Hospital, Wenzhou Medical
University, Wenzhou, Zhejiang, China, 5Center for Cognitive and Brain
Sciences, Department of Psychology, Ohio State University, 225 Psychology Building, 1835 Neil Avenue Columbus, Ohio 43210, USA.
We have two eyes but only see one world. How visual inputs from the
two eyes combine in binocular vision has been one of the major focuses in
basic and clinical vision research. Here, we employed the external noise
approach with dichoptic displays to develop a binocular perceptual template model (bPTM) based on the multi-pathway contrast-gain control model (MCM)
and the perceptual template model. The method of constant stimuli was
used to measure psychometric functions in a sinewave grating detection
task in two spatial frequencies, three external noise levels, seven contrast
levels, and four dichoptic and one binocular conditions. There were a total
of 210 conditions and 18900 trials (90 trials/condition). We found that the
threshold versus external noise contrast (TvC) functions in the four dichoptic conditions were virtually identical, and were only higher than that of
the binocular condition in zero and low external noise conditions. The
thresholds in the highest external noise conditions were virtually identical
across all five display conditions. We propose a binocular perceptual template model that consists of monocular perceptual templates, non-linear
transducer, and internal noises, interocular contrast-gain control, binocular
summation, binocular internal noise, and decision process. The model is
compatible with the original PTM in binocular conditions, and the MCM
developed for suprathreshold phase and contrast combination and stereopsis. With only five parameters, the bPTM provided an excellent account
of all the data (r^2 > 90%). With one additional parameter, the model can
take into account the imbalances between the two eyes in near-threshold
tasks, complementing the multi-pathway contrast gain control model in
suprathreshold tasks. The empirical results and bPTM shed new light on
binocular combination and may provide the basis to investigate binocular
vision in clinical populations.
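The design size stated in abstract 23.4062 follows directly from crossing the listed factors. The short check below reproduces the arithmetic; the variable names are illustrative, and the factor levels are taken from the abstract.

```python
import math

# Factor levels as listed in the abstract.
design = {
    "spatial_frequencies": 2,
    "external_noise_levels": 3,
    "contrast_levels": 7,
    "display_conditions": 5,  # four dichoptic + one binocular
}
trials_per_condition = 90

n_conditions = math.prod(design.values())
n_trials = n_conditions * trials_per_condition

print(n_conditions, n_trials)
```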
23.4063 Real-time experimental control with graphical user interface (REC-GUI) for vision research Ari Rosenberg1([email protected]), Byounghoon Kim1, Shobha Kenchappa1, Ting-Yu Chang1;
1Department of Neuroscience, School of Medicine and Public Health,
University of Wisconsin-Madison, Madison, WI, USA
Vision science studies often involve a combination of behavioral control,
stimulus rendering/presentation, and precisely timed measurements of
electrophysiological and/or behavioral responses. The constraints imposed
by these various requirements can make it challenging to jointly satisfy
all the necessary design specifications for experimental control systems.
Since precise knowledge of the temporal relationships between behavioral
and neuronal data is fundamental to understanding brain function, we
are spearheading an open-source, flexible software suite for implementing behavioral control, high precision control of stimulus presentation,
and electrophysiological recordings. The state-of-the-art system is being
developed to implement highly demanding specifications (e.g., rendering geometrically correct stereoscopic images with large depth variations,
binocular stimulus presentation at 240 Hz, and real-time enforcement of
behavior such as binocular eye and head positions), making the system ideally suited for a broad range of vision studies. The Real-Time Experimental
Control with Graphical User Interface (REC-GUI) consists of three major
components: (i) experimenter control panel (Python), (ii) scripts for rendering 2D or 3D visual stimuli (MATLAB/Octave with PsychToolbox), and
(iii) data acquisition components including eye/head monitoring (search
coil: Crist Instruments; optical: EyeLink, SR-Research Inc.) and high-density neural recording (Scout Processor, Ripple Inc.). Because rendering and
presenting complex visual stimuli like 3D stereoscopic images can require
significant computing power capable of interrupting display synchronization, the system divides stimulus rendering/presentation and behavioral
control between different processors. All processors communicate with
each other over a network in real-time using User Datagram Protocol to
minimize communication delays (average 767 ± 260 µsec over a gigabit
network switch). Because the system is modular, all components can be
easily substituted to be compatible with different software, hardware, and
data acquisition systems. For example, MATLAB-based stimulus rendering/presentation can be readily replaced with C-code. We will soon make
all of the MATLAB/Octave and Python scripts available for customization
and collaborative development.
Acknowledgement: This work was supported by National Institutes of Health
Grant DC014305, the Alfred P. Sloan Foundation, and the Whitehall Foundation.
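Abstract 23.4063 describes processors exchanging messages over User Datagram Protocol and reports a communication delay. The sketch below shows, under stated assumptions, how such a delay could be timed from Python using only the standard library. It is not REC-GUI code: the function names are hypothetical, the echo thread merely stands in for a second machine, and because it runs over the loopback interface and times a full round trip, its numbers are not comparable to the figure in the abstract.

```python
import socket
import statistics
import threading
import time


def echo_server(sock, n_packets):
    """Echo each datagram back to its sender (stand-in for a remote machine)."""
    for _ in range(n_packets):
        data, addr = sock.recvfrom(1024)
        sock.sendto(data, addr)


def measure_round_trips(n_packets=200):
    """Return a list of UDP round-trip times in seconds over loopback."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.settimeout(2.0)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
    address = server.getsockname()

    worker = threading.Thread(target=echo_server, args=(server, n_packets))
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)
    round_trips = []
    try:
        for i in range(n_packets):
            payload = i.to_bytes(4, "big")
            start = time.perf_counter()
            client.sendto(payload, address)
            reply, _ = client.recvfrom(1024)
            round_trips.append(time.perf_counter() - start)
            if reply != payload:
                raise RuntimeError("echo mismatch")
    finally:
        client.close()
        worker.join()
        server.close()
    return round_trips


if __name__ == "__main__":
    rtts = measure_round_trips()
    print(f"mean {statistics.mean(rtts) * 1e6:.0f} us, "
          f"sd {statistics.stdev(rtts) * 1e6:.0f} us")
```

UDP is connectionless and does not retransmit, which keeps latency low but means a real system would need to handle lost datagrams; here a lost packet simply raises a timeout.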
23.4064 Interocular enhancement revealed in binocular combination Jian Ding1([email protected]), Oren Yehezkel1, Anna
Sterkin2, Uri Polat3, Dennis Levi1; 1School of Optometry, UC Berkeley,
2Goldschleger Eye Research Institute, Sackler Faculty of Medicine, Tel
Aviv University, 3Faculty of Life Sciences, Optometry and Vision Sciences,
Bar-Ilan University
Interocular suppression has been demonstrated in multiple binocular tasks.
However, evidence for interocular enhancement has been elusive, because,
in the normal visual system, it is concealed by the strong interocular suppression. Interocular enhancement was first exposed in a study on amblyopic binocular vision (Ding, Klein & Levi 2013b) where the interocular suppression from the non-dominant eye to the dominant eye is almost absent,
thus revealing interocular enhancement. For normal binocular vision, adding interocular enhancement to a gain-control model (Ding-Sperling model,
Ding & Sperling 2006) results in significant improvement in model fitting in
binocular phase and contrast combination tasks (Ding, Klein & Levi 2013a),
and in binocular contrast discrimination (Ding & Levi 2016). In the present study, we examined how normally sighted observers combine slightly
different orientations presented to the two eyes. The stimuli were briefly
presented (80 ms) Gabor patches (3 cpd) presented to the two eyes, which
differed in both orientation and contrast. We used a signal-detection rating
method to estimate the perceived orientation. We tested three orientation
differences (10, 15 and 20 degrees), four base contrasts (10, 20, 40, and 60%)
and seven interocular (dichoptic) contrast ratios (0.25, 0.5, 0.75, 1, 1.33, 2,
and 4). We found that the interocular suppression decreased when the
base contrast increased, contradicting the prediction (more suppression at
a higher contrast level) of the Ding-Sperling model. Our modeling showed
that interocular enhancement is needed to neutralize the effect of interocular suppression when the base contrast increased. By adding interocular
enhancement to the Ding-Sperling model, the modified model successfully
accounted for the whole data set. Combined with interocular suppression,
interocular enhancement appears to play an important role in binocular
Acknowledgement: NEI: RO1EY020976
23.4065 A contrast-based Pulfrich effect in normals and a spontaneous Pulfrich effect in amblyopes Alexandre Reynaud1(alexandre.[email protected]), Robert Hess1; 1McGill Vision Research, Dept of Ophthalmology, McGill University
Any processing delay between the two eyes can result in illusory 3D percepts for moving objects because of either changes in the pure disparities over time for disparity sensors or by changes to sensors that encode
motion/disparity conjointly. This is demonstrated by viewing a fronto-parallel pendulum through a neutral density (ND) filter placed over one eye,
resulting in the illusory 3D percept of the pendulum following an elliptical
orbit in depth, the so-called Pulfrich phenomenon. Because of the difference between their two eyes, a small percentage (4%) of mild anisometropic
amblyopes who have rudimentary stereo are known to experience a spontaneous Pulfrich phenomenon. Here we use a paradigm where a cylinder
rotating in depth, defined by moving Gabor patches is presented at different interocular phases, generating strong to ambiguous depth percepts.
This paradigm allows one to manipulate independently the spatio-temporal properties of the patches to determine their influence on perceived
motion-in-depth. We show psychophysically that an interocular contrast
difference can itself result in a similar illusory 3D percept of motion-in-depth. For amblyopes we observe a spontaneous Pulfrich phenomenon but
opposite to that expected, suggesting faster processing by the amblyopic
eye. This spontaneous delay is reduced at low spatial and temporal frequencies. We conclude that spatiotemporal properties of the stimuli are
important for the illusion of motion-in-depth from contrast differences or
the spontaneous Pulfrich experienced by some amblyopes.
Acknowledgement: ERA-NET NEURON (JTC 2015) to RFH
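23.4066 The impact of object-based grouping on perceived depth magnitude: Virtual vs. physical targets Aishwarya Sudhama1([email protected]), Lesley Deas2, Brittney Hartle2, Matthew Cutone2, Laurie Wilcox2; 1Department of Biology, Centre for Vision Research, York University, 2Department of Psychology, Centre for Vision Research, York University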
The amount of depth perceived between two vertical lines is markedly
reduced when those lines are connected to form the boundaries of a uniform closed object (Deas & Wilcox, 2014). Recently, we suggested that this
degraded depth effect is contingent on perceptual grouping of elements
to form an object and on disparity changes along the horizontal axis (Sudhama et al., 2015 VSS). In previous studies stimuli were presented virtually
on LCD displays, using a mirror stereoscope. In this set of experiments, we
ask whether the same distortions in perceived depth are observed when
multiple, consistent 2D depth cues are present. Here, we replicated Deas
and Wilcox’s original paradigm using physical stimuli. Targets consisted
of 3D-printed vertical posts (in isolation and connected to form rectangles),
mounted on a customized computer-controlled motion platform. Stimuli
were printed with a range of horizontal disparities between the vertical
contours. The stimulus dimensions, viewing distance, and test manipulations closely matched the original paradigm. A set of four disparities was
tested ten times apiece in random order, for isolated and connected stimuli.
On each trial, observers judged the amount of depth between two vertical
posts using a touch sensitive strip. We found that the resulting depth magnitude estimates were accurate over a large range of disparities. Moreover,
there was no difference between estimates obtained in the isolated line vs.
closed object configurations. In follow-up experiments with virtual targets
we also found that the disruptive effects of perceptual grouping are modulated by the presence of multiple depth cues. We argue that this is not due
to conflicts between 2D and stereoscopic depth cues. Instead, the absence
of reliable additional depth cues makes stereoscopic depth estimates more
susceptible to phenomena such as object-based grouping. These results
have clear implications for creation and use of stereoscopic imagery in virtual environments.
Acknowledgement: NSERC Discovery and CREATE Grants
23.4067 Visual Discomfort and Ethnicity Robert Mosher1(robert.
[email protected]), Daniel Del Cid1, Arthur Ilnicki1, Stefanie
Drew1; 1Vision Sciences, College of Social and Behavioral Sciences, California State University Northridge
Ethnic differences in prevalence of visual impairment have been reported
in children but this distinction has been less studied in college age adults
(Kleinstein, 2003). Other research suggests an overall 16.8% prevalence
of myopia in the Latino population (Tarczy-Hornoch, 2006). We previously noted a high incidence of visual discomfort symptoms reported by
a diverse population of college students. These findings were expected, as
symptoms including ocular fatigue, perceptual distortions, and headaches
are linked to nearwork tasks commonly performed by students, such as
reading or viewing computer screens. Based on ethnic-specific prevalence
of myopia noted by other studies we investigated whether similar patterns
might be observed in reports of visual discomfort symptoms in college
students. Methods: Two validated surveys for assessing visual discomfort
symptoms, the Visual Discomfort Survey (VDS) (Conlon et al., 1999) and
the Convergence Insufficiency Symptoms Survey (CISS) (Borsting et al.,
2003), and demographic questions were administered to 451 college students. Results: Participants who identified as Latino/a or Hispanic showed
significantly different patterns of results in a mediation model than those
who identified as Non-Latino/a or Non-Hispanic. These findings suggest
prevalence of visual discomfort in the college population may be confounded by ethnic differences.
23.4068 Does an eye movement make the difference in 3D? Katharina Rifai1([email protected]), Siegfried Wahl1;
1Institute for Ophthalmic Research, University of Tuebingen
Currently, stereo image presentation is still the dominant commercially
used 3D image presentation technology. Widely applied in cinematic
applications, gaming, and optical technologies, adverse effects such as
fatigue, vertigo, and nausea are well described. However, the origin of
those adverse effects is relatively sparsely explored, partially due to technological difficulties in disentangling influencing factors. Stereographic
image content commonly varies from real world visual input in a variety
of ways. Specifically, in most applications the influence of the observer’s
eye remains unconsidered. Thus neither accommodation nor changes in
image projection due to eye movements are accurately mirrored in stereoscopic 3D image content. In the current study, task performance as well as
subjective experience is assessed in a video-based stereo imaging system,
in which the visual impact of accommodation as well as image projection
changes due to eye movements can selectively be enabled. In four conditions (ACC accommodation, EM eye movements, ACC&EM, STATIC)
subjects performed a time-limited manual accuracy task. Subjects collected
pins from predefined touch points with forceps. Task performance was
measured by the number of pins collected per unit time and compared between
the four conditions. Subjective experience was evaluated in a customized
questionnaire and compared with task performance. The questionnaire assessed
task difficulty, experienced depth, immersion, and adverse reactions.
Thus, the present study considers the dedicated influences
of accommodation and eye movements in 3D perception of stereographic
video content. The results shed light on the relevance of active vision in the
perception of depth.
Acknowledgement: This project has been funded by the Inter-University Center
for Medical Technologies Stuttgart – Tübingen (IZST) and Carl Zeiss Meditec
23.4069 A Bayesian model of distance perception from ocular convergence Peter Scarfe1([email protected]), Paul Hibbard2; 1School
of Psychology and Clinical Language Sciences, University of Reading, UK,
2Department of Psychology, University of Essex, UK
When estimating distance from ocular convergence humans make systematic errors such that perceived distance is a progressive underestimate of
true physical distance. Similarly, when estimating the shape of an object
from binocular visual cues, object depth is progressively underestimated
with increasing distance. Misestimates of distance are thought to be key to
explaining this lack of shape constancy. Here we present a Bayesian model
of distance perception from ocular convergence which predicts these biases
given the assumption that the brain is trying to estimate the most likely
distance to have produced the measured, noisy, ocular convergence signal. We show that there is a lawful relationship between the magnitude
of noise in the ocular convergence signal and the magnitude of perceptual
bias (more noise results in greater bias). Furthermore, using a database of
laser scans of natural objects, we generate prior probabilities of distances in
the environment and show how these priors are distorted by the process of
distance estimation, such that the perceptual prior based on distance estimates is not necessarily equal to the objectively measured distance prior in
the world. This has important implications for defining perceptual priors
based on direct statistical measurements of the environment across multiple disciplines.
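The convergence signal discussed in abstract 23.4069 is related to distance by simple geometry, and that geometry alone shows why a fixed amount of angular noise matters more at far distances. The sketch below covers only this geometric step; it is not the authors' Bayesian model, and the interpupillary distance, noise magnitude, and function names are illustrative assumptions.

```python
import math


def vergence_angle(distance_m, ipd_m=0.063):
    """Convergence angle (radians) for symmetric fixation at distance_m."""
    return 2.0 * math.atan(ipd_m / (2.0 * distance_m))


def distance_from_vergence(angle_rad, ipd_m=0.063):
    """Inverse mapping: fixation distance implied by a convergence angle."""
    return ipd_m / (2.0 * math.tan(angle_rad / 2.0))


def distance_interval(distance_m, noise_rad, ipd_m=0.063):
    """Distances implied by the true angle plus and minus noise_rad."""
    angle = vergence_angle(distance_m, ipd_m)
    nearest = distance_from_vergence(angle + noise_rad, ipd_m)
    farthest = distance_from_vergence(angle - noise_rad, ipd_m)
    return nearest, farthest


if __name__ == "__main__":
    noise = math.radians(0.25)  # assumed angular noise, for illustration only
    for d in (0.5, 1.0, 2.0):
        lo, hi = distance_interval(d, noise)
        print(f"{d:.1f} m -> [{lo:.3f}, {hi:.3f}] m, width {hi - lo:.3f} m")
```

Because the angle changes ever more slowly with distance, the same angular noise spans a wider and more asymmetric range of distances farther away, so any prior over distance has more influence on the estimate there. How the prior and likelihood combine to produce the reported bias is specified by the authors' model, not by this sketch.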
23.4070 Fusional Vergence differences between manual phoropter
and automated phoropter Efrain Castellanos1([email protected]), Kevin Phan1; 1Western University College of Optometry
Purpose: The use of automated phoropters is becoming common in ophthalmic clinics; however, “the clinical norms” utilized for evaluating vergences were obtained using the manual phoropter. We sought to investigate
and compare the fusional vergence findings obtained with the automated
phoropter (Nidek RT-5100) and the manual phoropter (Topcon). Methods: The study was conducted at the College of Optometry at Western
University of Health Sciences, Pomona, California, where a total of 188 participants (optometry students) were paired, examined
each other, and performed vergence measurements. Vergence was measured at both distance (20 feet) and near (40
centimeters) using 1) the manual phoropter and 2) the automated phoropter. The
sequence of measurement was randomized. Results: A paired-samples
t-test was used to compare the blur, break, and recovery values obtained with each method. The mean
values of blur, break, and recovery were significantly different between the
magnitude: Virtual vs. physical targets Aishwarya Sudhama1([email protected]
my.yorku.ca), Lesley Deas2, Brittney Hartle2, Matthew Cutone2, Laurie
Wilcox2; 1Department of Biology, Centre for Vision Research, York
University, 2Department of Psychology, Centre for Vision Research, York
two phoropters at 20 feet (p = 0.006, 0.013, and 0.002, respectively).
At near (40 cm), base-out convergence differed significantly for recovery (p < 0.0001), as did base-in divergence for break in
fusional vergence (p = 0.006). Conclusion: The vergence values obtained
using an automated phoropter are significantly different when compared to
values obtained using manual phoropter and the results obtained using
these phoropters cannot be used interchangeably. Clinicians need to take
this into account when making any clinical judgement involving any prism
prescription. A new set of clinical norms might be needed as a clinical
guideline when evaluating patients using automated phoropters.
23.4071 Modulation of oculomotor control & adaptation with
cerebellar TMS: effects on slow-tonic vergence adaptation. Heidi
Patterson1([email protected]), Ian Erkelens1, Claudia Martin Calderon1, William Bobier1, Benjamin Thompson1,2; 1University of Waterloo,
School of Optometry & Vision Science, 2University of Auckland, School of
Optometry & Vision Science
The adaptation of heterophoria to horizontal base-out prism reflects a slow
change in the underlying tonic vergence neural innervation. Recent fMRI
evidence suggests the posterior cerebellum may play a role in this unique
adaptive process. We applied continuous theta-burst stimulation (cTBS) to
the oculomotor vermis (OMV) of the posterior cerebellum to investigate a
causal relationship between this neural structure and slow-tonic vergence
(STV) adaptation. 14 subjects fused a 0.18 LogMAR chart at 40cm through
a 15 prism diopter (PD) base-out prism for 4 minutes after receiving active
or sham cTBS (bursts of three 50 Hz pulses at 200 ms intervals for 40 seconds)
to the OMV of the posterior cerebellum. Change in heterophoria, measured with the Modified Thorington technique every 15 seconds, defined
the amplitude and rate of STV adaptation. cTBS was applied at 80% of the
individual’s active motor threshold via a 2x75mm butterfly coil. Stimulation sites were localized using the BrainSight® neuro-navigation system
and anatomical landmarks. The amplitude of STV adaptation was not
different between active (6.31 ± 0.40 PD) and sham (6.97 ± 0.41 PD) conditions, p = 0.18. There was also no difference between the maximum rate of
tonic vergence adaptation in the active (0.41 ± 0.07 PD/s) or sham (0.33 ±
0.05 PD/s) conditions, p = 0.18. Baseline levels of tonic vergence innervation, measured before and after stimulation at each visit, were not different
between conditions (p > 0.15). cTBS applied to the OMV of the posterior
cerebellum did not affect tonic vergence innervation or STV adaptation to
base-out prism in healthy controls. This is in contrast to other types of oculomotor adaptation, where cTBS has been shown to affect both reflexive
pro-saccade generation and adaptation to double step stimuli. These results
suggest the OMV of the posterior cerebellum plays a limited role in the
management and adaptation of tonic vergence neural innervation.
Acknowledgement: NSERC, OGS, COETF
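The stimulation protocol in abstract 23.4071 implies a total pulse count that the abstract does not state. The check below works it out, assuming the protocol is read as bursts of three pulses delivered every 200 ms for 40 seconds; that reading and the variable names are assumptions.

```python
pulses_per_burst = 3      # three pulses at 50 Hz within each burst
burst_interval_ms = 200   # one burst every 200 ms (5 bursts per second)
duration_ms = 40 * 1000   # 40 seconds of stimulation

n_bursts = duration_ms // burst_interval_ms
total_pulses = n_bursts * pulses_per_burst

print(n_bursts, total_pulses)
```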
23.4072 A computational model for the joint development of
accommodation and vergence control Jochen Triesch1([email protected]), Samuel Eckmann1, Bertram Shi2; 1Frankfurt Institute for
Advanced Studies, 2Dept. of Electronic and Computer Engineering, Hong
Kong University of Science and Technology
Several studies investigating the development of amblyopia and strabismus suggest a strong interaction between vergence and accommodation.
For example, patients suffering from strabismus often develop amblyopia and subjects with amblyopia show decreased vergence and accommodation performance. Here we present the first computational model
for the joint development of accommodation and vergence control in the
active efficient coding framework. We use an online sparse coding algorithm to learn binocular receptive fields similar to those in V1 simple cells.
These adapt online to the input statistics by maximizing coding efficiency.
Simultaneously, the learned sparse representation is used to determine
the reward for two actor-critic reinforcement learners (RLs), which control
accommodation and vergence, respectively. By optimizing coding complexity (for accommodation control) and efficiency (for vergence control)
the system learns to focus images with zero disparity under healthy conditions. Interestingly, the accommodation RL learns to deduce the correct
command from the input disparity. We simulate an anisometropic case
where the refraction power of one eye is decreased. In this situation our
model chooses to focus close objects with the healthy and distant objects
with the hyperopic eye. Vergence performance remains high as long as the
refraction difference stays small. However, when focusing the object with
one eye leads to a highly blurred input for the other eye, the receptive fields
become more and more monocular. Thus, the RLs are no longer able to
assess the exact input disparity, which ultimately leads to a decrease of
both the vergence and accommodation performance. In conclusion, we
present, to the best of our knowledge, the first model for the joint learning
of vergence and accommodation control. The model explains how the brain
might learn to exploit disparity signals to control both vergence and accommodation and how refractive errors could derail this process.
Acknowledgement: This research is supported by the project
Perceptual Organization: Grouping
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4073 Inter-Edge Grouping: Are many figure-ground principles
actually perceptual grouping? Joseph Brooks1([email protected]),
Anka Davila1, Akul Satish1; 1Centre for Cognitive Neuroscience & Cognitive Systems, School of Psychology, University of Kent
Figure-ground organization (a.k.a. edge-assignment) determines the shapes
that we see at edges and is widely known through experience of Rubin’s
reversible faces-vase image. It is thought to be affected by a host of image-based (e.g., convexity) and non-image factors (e.g., attention, familiarity).
Figure-ground organization often appears alongside perceptual grouping
as a topic in psychology textbooks but they are typically discussed as separate processes of perceptual organization with their own distinct phenomena and mechanisms. Here, we propose a new class of figure-ground principles based on perceptual grouping between edges and demonstrate that
this inter-edge grouping (IEG) is a powerful influence on figure-ground
organization. We presented participants with tri-partite images with two
vertical dividing edges creating a central region and two flanking regions
(e.g., Rubin’s faces-vase). The two dividing edges were either grouped or
ungrouped according to one of seven different grouping principles (e.g.,
colour similarity, common fate). We measured figure-ground organization
of these tri-partite images (e.g., inner or flanking regions figural) using both
subjective reports and an objective, indirect measure of figure-ground organization. Across all grouping principles and both measures, we found that
figure-ground organization was affected by IEG such that the central region
between the two edges was more likely to be reported as figural when the
edges were grouped whereas the flanking regions were reported as figural
with ungrouped edges. In addition to these new phenomena, we can also
describe some classic figure-ground principles under the same coherent
framework. For instance, symmetry in multi-partite displays can be reinterpreted as inter-edge symmetry and convexity effects on figure-ground may
be partially due to inter-edge good continuation. Our results suggest that
figure-ground organization and grouping have more than a mere association within Gestalt psychology. Instead, perceptual grouping may provide
a mechanism underlying a broad class of new and extant figure-ground principles.
Acknowledgement: Experimental Psychology Society Small Grant
23.4074 The mechanism underlying the competition between
grouping organizations Einat Rashal1([email protected]), Michael
Herzog1; 1Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Most previous investigations studied the competition between grouping
principles using subjective reports. Recently, Rashal, Yeshurun and Kimchi
(2016) used the primed-matching paradigm to investigate the time-course
of this competition. In this paradigm, a prime stimulus is followed by a
pair of test figures that are either identical to one another or different. Typically, “same” responses to the test-pair are faster and/or more accurate
when they are similar to the prime than when they are dissimilar. In that
study, the primes depicted one grouping principle, or two principles that
led to different organizations (e.g., columns by brightness similarity and
rows by proximity). Their results showed that at certain points of the time-course both organizations produced similar priming, suggesting that representations of both organizations are constructed, and presumably compete
before the final percept is chosen for conscious perception. In the current
study, we examined whether the time-course of the competition is affected
by grouping strength. To that end, we manipulated the degree of the elements’ similarity for each grouping principle in the prime. Priming effects
Acknowledgement: European Union’s Horizon 2020 research and innovation
23.4075 Estimating the relative strength of similarity and proximity
in perceptual grouping with tripole Glass patterns Chien-Chung
Chen1,2([email protected]), Lee Lin1, Yih-Shiuan Lin1; 1Department of
Psychology, National Taiwan University, 2Neurobiology and Cognitive
Science Center, National Taiwan University
In the Gestalt tradition, proximity and similarity are important cues for perceptual organization. We investigated how the visual system integrates these
cues by measuring their relative strength when they produced conflicting
grouping signals. We used tripole Glass patterns (tGPs), which are composed of
randomly distributed sets of three dots, each comprising a seed and two context
dots. The tripoles were arranged so that linking the seed with one
context dot would produce a percept of a clockwise (CW) spiral, while linking it with the
other would produce a counter-clockwise (CCW) spiral. The contrast of the context dots ranged from
-30 to 0 dB while the seed contrast was kept at -20 dB. The distances between
the seed and the context dots were between 5 and 20 min. The observers’
task was to indicate whether the tGP they perceived in each trial was CW
or CCW. When the distances between the seed and the two context dots
were the same, the probability of seeing a CW spiral first increased and then
decreased with CW dot contrast, forming an inverted-U shaped psychometric function. The peak of the inverted-U function shifted rightward as CCW
dot contrast increased. When all dots had the same contrast, observers were 10-20%
more likely to see the pattern formed by linking the seed with the context dot at
half the distance than the pattern formed with the other dot. This proximity advantage was canceled
by decreasing the contrast of the nearer dot by about 6 dB (50%), suggesting a linear trade-off between proximity and contrast similarity. Our
results cannot be accounted for by either similarity or contrast energy theories of Glass patterns, but were well fit by a pattern normalization model, in
which the response of a pattern detector is the sum of the excitations of
linear filters operating on local dipoles, raised to a power and divided by an
inhibition signal from all other dipoles.
Acknowledgement: MOST (Taiwan) 103-2410-H-002-076-MY3
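The correspondence between 6 dB and 50% contrast, and the general shape of a divisive normalization response, can be illustrated with a minimal sketch. The exponent `p`, the constant `sigma`, and the function names are illustrative assumptions; the abstract does not report the fitted parameters or the exact form of the model.

```python
import math

def db_to_ratio(db):
    """Convert a contrast change in dB to an amplitude ratio (20*log10 convention)."""
    return 10 ** (db / 20)

def pattern_response(target_excitations, other_excitations, p=2.0, sigma=0.1):
    """Divisive normalization sketch (assumed form, not the authors' fitted model):
    excitations of the target dipoles are raised to a power and summed, then
    divided by a constant plus the pooled excitation of all other dipoles."""
    numerator = sum(e ** p for e in target_excitations)
    inhibition = sigma + sum(e ** p for e in other_excitations)
    return numerator / inhibition

# A 6 dB reduction roughly halves the contrast.
print(round(db_to_ratio(-6), 3))
```

In this sketch, raising the excitation of the competing dipoles lowers the response of the target pattern detector, which is the qualitative behavior a normalization account relies on.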
23.4076 Parallelism is an emergent feature not derived from the
detection of individual line slopes James Pomerantz1([email protected]
edu), Curtiss Chapman1, Jon Flynn2, Colin Noe1, Tian Yingxue1; 1Department of Psychology, Rice University, 2Neurobiology and Anatomy, UT
Health Sciences Center Houston
Visual systems are quite sensitive to parallelism between two or more line
segments, but how do they detect this feature? Many methods are possible,
one of which includes computing and then comparing the slopes of the
lines in a feedforward manner. If the visual system employs this method,
it should be harder for us to perceive parallelism between two lines when
they are oriented obliquely, compared with being horizontal or vertical,
because of the oblique effect (OE; Appelle, 1972). Our experiment confirmed the expected OE for processing individual line segments, but we
found a greatly reduced OE for processing pairs of parallel vs. nonparallel
lines. We also demonstrated a sizeable configural superiority effect (CSE;
Pomerantz, Sager, and Stoever, 1977) for line pairs over individual line
segments. This CSE means it is easier to determine which of four line segments has a different slope from the other three identical segments when
the same, non-informative line segment is added next to all four segments
to create four pairs of segments, three pairs of which are parallel and one
non-parallel (or vice versa). Our findings differed largely in expected ways
when the stimuli were presented inside a diamond-shaped frame (sharing
parallelism within the frame) rather than square: performance with oblique
lines improved. In summary, our results suggest that parallelism is a salient
emergent feature in vision, more salient to us than the slopes of the individual lines from which it arises. Parallelism appears to be detected through
some method other than computing and comparing the slopes of the two
line segments, perhaps by being detected directly. Parallelism thus joins
other emergent features arising from line segments, including collinearity,
intersection, closure, and symmetry, as an extremely salient higher-order
property of wholes that is more perceptible than the component parts
from which it derives.
23.4077 Category-based updating of object representations Ru Qi
Yu1([email protected]), Jiaying Zhao1,2; 1Department of Psychology,
University of British Columbia, 2Institute for Resources, Environment and
Sustainability, University of British Columbia
The visual system is efficient at detecting regularities in the environment.
When two objects reliably co-occur, changes in one object are automatically
transferred to its co-occurring partner. It is unknown how such updating
can transpire across categorical boundaries. In Experiment 1, participants
viewed a random temporal stream of objects, which came from two distinct categories based on texture (i.e., objects in Category A had stripes vs.
objects in Category B had dots). Each object had a unique shape. After exposure, one object in Category A (e.g., A1) increased in size, and participants
recalled the size of another object in the same category (e.g., A2) or in the
different category (e.g., B1). We found that objects in the same category
were recalled to be reliably larger than objects in the different category,
suggesting that changes in one object are more likely to be transferred to
another in the same category than to an object in a different category. To
elucidate if the cross-category transfer can be facilitated by statistical regularities, we conducted Experiment 2, where participants viewed the same
objects, except now objects in the two categories were temporally paired
(i.e., A1 reliably appeared before B1). After exposure, one object in Category A (e.g., A1) increased in size, and participants recalled the size of the
cross-category paired partner (B1), the within-category random object (A2),
or cross-category random object (B2). We found that the within-category
object (A2) was recalled to be reliably larger than any cross-category object
(B1 or B2). This suggests that changes in one object were more strongly
transferred to other objects in the same category than to any objects of a different
category, regardless of statistical regularities. These results reveal a within-category advantage in the updating of feature changes: such changes are more
readily transferred within the same category than across categories.
23.4078 Solving the Complexity of Object Occlusions in Scenes: The
Grouping of Adjacent Surfaces and Non-Adjacent but Connected
Surfaces Debarshi Datta1([email protected]), Howard Hock1,2;
1Department of Psychology, Schmidt College of Science, Florida Atlantic
University, 2Center for Complex Systems and Brain Sciences, Schmidt
College of Science, Florida Atlantic University
In contrast with classic Gestalt examples of perceptual grouping, most natural environments contain multiple objects, each with multiple surfaces.
Each object is likely to occlude other objects partially and is itself likely to
be partially occluded. A central question, therefore, is how the visual system resolves the resulting surface correspondence problem by successfully
determining which surfaces belong to which objects (Guzman, 1969). To
this end, a recently developed dynamic grouping methodology determines
whether pairs of adjacent surfaces are grouped together (Hock & Nichols,
2012; Hock & Schöner, 2015). The grouping of adjacent surfaces, which
depends on their affinity state, is indicated by the direction of perceived
motion across one surface when its luminance (and thus, its luminance similarity with the adjacent surface) is perturbed. It is shown here that dynamic
grouping also can occur for nonadjacent surfaces, provided they are
linked in two dimensions by a connecting surface. METHOD: Three disconnected horizontal surfaces with the same luminance are presented with
two darker surfaces that connect them. Dynamic grouping motion is created by decreasing the central horizontal surface’s luminance, decreasing
its similarity with the flanking horizontal surfaces (while simultaneously
increasing its similarity with the darker connecting surfaces). RESULTS:
The perception of outward (diverging) dynamic grouping motion toward
the flanking horizontal surfaces indicated that the central horizontal surface was grouped with the nonadjacent, but connected flanking horizontal
surfaces. This was consistent with connectivity functioning as a grouping
variable, which was first reported by Palmer and Rock (1994). Preliminary
evidence indicates that the dynamic grouping motion is stronger when
the nonadjacent but connected horizontal surfaces are aligned, and the
connecting surfaces function as occluders (in this case dark vertical bars),
consistent with amodal completion requiring the perceptual grouping of
nonadjacent surfaces behind an occluding surface.
[23.4074, continued] were expected to emerge for the organization that produces stronger priming relative to the other (i.e., the dominant organization). The time-course
of the competition was examined by varying prime duration. We found
that priming effects for the dominant organization increased as grouping strength increased for that organization. However, priming was also
reduced for the dominant organization as grouping strength for the second
organization increased. These results further support previous findings of
a competition between multiple representations, and provide evidence for
grouping strength as a factor in this competition.
23.4079 Evidence for Configural Superiority Effects in Convolutional
Neural Networks Shaiyan Keshvari1([email protected]), Ruth
Rosenholtz1,2; 1Computer Science and Artificial Intelligence Laboratory,
Massachusetts Institute of Technology, 2Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology
Finding a left-tilted line among right-tilted becomes easier with an “L”
added to each item, transforming the task into one of finding a triangle
among arrows. This configural superiority effect occurs for a wide array
of stimuli, and is thought to result from vision utilizing “emergent” features (EFs), such as closure, in the composite case. A more computational
interpretation can be couched in the idea of a visual processing hierarchy, in
which higher level representations support complex tasks at the expense of
other, possibly less ecologically relevant tasks. Detecting the oddball might
be inherently easier in the composite condition given the representation at
some level of the hierarchy. To test this, we used the VGG-16 (Simonyan & Zisserman, 2015) convolutional neural network (CNN), trained to
recognize objects using the ImageNet dataset, as a stand-in for the hierarchical visual encoding. Such CNNs have high performance on object recognition, as well as on tasks for which they are not trained. Feature vectors
at different layers correlate with responses of various brain areas (Hong et
al., 2015). We tested five EF stimuli in a 4AFC oddball localization task
(Pomerantz & Cragin, 2013). We trained a multi-class SVM operating on the
outputs of the last fully connected layer, and performed a K-fold cross-validation. Two EFs (orthogonality and roundness) show better performance
(33 and 53 percentage points, respectively) in the composite than the base
case. One (closure) showed no effect (< 1 pp), and two (parallelism and 3D)
had worse performance in the composite (23 and 21 pp). A pilot behavioral
experiment (200 ms presentation) confirmed that observers (N=2) are better
with composite stimuli for all five EFs (44 +/- 0.06 pp). This suggests that
some EFs are better represented by the highest layers of the network than their
base features, but it is not the complete story.
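The K-fold cross-validation step described above can be sketched as follows. This is only an illustration under stated assumptions: the feature vectors are placeholders for network outputs, the value of `k` is arbitrary, and a nearest-centroid classifier is swapped in for the multi-class SVM so that the example needs no external library.

```python
import random

def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and split them into k nearly equal folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not an SVM): pick the class whose mean vector is closest."""
    sums, counts = {}, {}
    for vec, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, v in enumerate(vec):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)

def cross_validated_accuracy(features, labels, k=5, seed=0):
    """Fraction of items classified correctly when each item is held out exactly once."""
    correct = 0
    for held_out in k_fold_indices(len(features), k, seed):
        held = set(held_out)
        train_x = [f for i, f in enumerate(features) if i not in held]
        train_y = [l for i, l in enumerate(labels) if i not in held]
        for i in held_out:
            if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
                correct += 1
    return correct / len(features)
```

For a 4AFC oddball localization task the labels would be the four possible oddball positions, and chance accuracy would be 25%.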
23.4080 Can perceptual grouping unfold in the absence of visual
consciousness? Ruth Kimchi1([email protected]), Dina
Devyatko1, Shahar Sabary1; 1Department of Psychology and Institute of
Information Processing and Decision making, University of Haifa
What kinds of perceptual organization can occur without awareness of
the stimulus? Previous studies addressing this issue yielded inconsistent
results (e.g., Harris et al., 2011; Lau & Cheung, 2012; Montoro et al., 2014;
Moors et al., 2015; Wang et al., 2012). The inconsistency may be partly due
to different techniques used to induce invisibility. In this study, we examined whether visual consciousness is required for two perceptual grouping
principles: luminance similarity and element connectedness, using a priming
paradigm and the same technique — continuous flash suppression (CFS;
Tsuchiya & Koch, 2005) — to render the prime invisible. Participants were
presented with a liminal prime consisting of dots organized into rows or
columns by luminance similarity (Experiment 1; 20 participants) or by element connectedness (Experiment 2; 19 participants), followed by a clearly
visible target composed of lines, the orientation of which could be congruent or incongruent with the orientation of the prime. The prime-target SOA
varied (200, 400, 600, or 800 ms). On each trial participants made a speeded
discrimination response to the orientation of the target lines (vertical or
horizontal) and then rated the visibility of the prime using a scale ranging
from 0 (“I saw nothing”) to 3 (“I clearly saw …”). Unconscious grouping of
the prime was measured as the priming effect on target discrimination performance of prime-target orientation congruency, on trials in which participants reported no visibility of the prime. In both experiments, and across
all prime-target SOAs, there was no priming when the prime was reported
invisible; significant priming was observed when the prime was reported
visible. These findings suggest that perceptual grouping by luminance similarity and by element connectedness does not take place when the visual
stimulus is rendered nonconscious using CFS.
Acknowledgement: ISF
23.4081 1,2,3, many: Perceptual order is computed by patches
containing 3x3 “repetitions” of Motifs Mikhail Katkov1(mikhail.
[email protected]), Hila Harris1, Dov Sagi1; 1Department of Neurobiology, Weizmann Institute of Science, Rehovot, 76100 Israel
It is believed that symmetry plays an important role in human visual perception, as is manifested by the Gestalt laws. Mathematically, symmetry is
defined as a transformation mapping an image to itself. An important class
of such transformations is the Wallpaper Group that consists of repetitive
patterns (Motifs). Strict symmetry rarely appears in nature. In statistical
physics the deviation from symmetry is characterized by the Order Parameter, ranging from zero (random) to one (symmetry). Operationally, it is
usually defined as the first order statistic over a local symmetry measure.
Here we are interested in estimating the size of the local symmetry measure in the human visual system, defined in terms of the number of Motif
repetitions. We used a 4AFC spatial odd-ball discrimination task: three
quadrants of the stimulus contained randomly generated textures, whereas the
fourth quadrant contained a texture with a varying degree of order. Order
was controlled by the thermodynamic temperature in a Boltzmann distribution with potentials having different symmetries. Images were generated
by a Chromatic Gibbs Sampler. Motif size was NxM Gaussian blobs (N=7-9,
M=7-9 depending on the symmetry of the potential). Results from 4 observers show that psychometric functions (discrimination performance vs. temperature) were not different between images trimmed to the size of 3x3 motifs
and larger images, whereas discrimination of images of one motif size
was practically at chance level for all temperatures. Images of 2x2 motifs
were in between (at high temperatures these patches do not have obvious
repetitions). Importantly, scaling the trimmed images leads to the same
performance. We conclude from these results that if order is computed
in the brain it is performed by patches containing 3x3 motifs. Moreover,
the amount of information in the patches, and not the physical size, is
important for order perception.
Acknowledgement: Basic Research Foundation, administered by the Israel Academy of Science
23.4082 Examining a shift in response bias through two lenses: A
concurrent examination of process and informational characteristics Michael Wenger1([email protected]), Lisa DeStefano1, James
Townsend2, Yanjun Liu2, Ru Zhang3; 1Psychology, Cellular and Behavioral
Neurobiology, The University of Oklahoma, 2Psychological and Brain
Sciences, Indiana University, 3University of Colorado Boulder
Critical distinctions in human information processing, such as parallel versus serial processing, or integral versus separable dimensions of
encoded information, are at the very core of understanding the foundations
of psychological experience. As such, two lines of general and powerful
mathematical characterizations of these problems have been developed,
resulting in two meta-theories: general recognition theory (GRT, Ashby &
Townsend, 1986), which addresses the relations among multiple sources of
encoded information using response frequencies, and systems factorial theory (SFT, Townsend & Nozawa, 1995), which addresses fundamental characteristics of processing using reaction times (RTs). To date, GRT and SFT
have evolved separately, with open questions existing for each; the present
effort is one of a set of ongoing efforts intended to address the questions
of each meta-theory individually by using the two approaches together. In
the present effort, we sought to investigate the extent to which a response
bias could be reliably identified in both response frequencies and RTs, by
using static GRT, a newer RT version of GRT (RTGRT, Townsend, Houpt,
& Silbert, 2012) and SFT. The stimuli for this particular investigation were
designed to induce the Hering illusion, where physically vertical lines to
the left and right of a center point are superimposed on a set of radiating
lines, resulting in the illusion that the vertical lines are bowed outward.
Observers participated in two tasks, a double factorial task using a conjunctive (AND) response rule and a complete identification task. Payoff
schemes were manipulated such that optimal responding was first
unbiased and then was biased toward specific responses. Results indicate
that capacity increased with liberal bias and decreased with conservative
bias. Individual differences across observers suggest the potential for using
these regularities to advance a theoretical synthesis of the two approaches.
Acknowledgement: National Science Foundation
Perceptual Organization: Neural mechanisms
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4083 Neural representations of ensemble coding for visual
The human brain is endowed with the ability to summarize the properties of similar objects to efficiently represent a complex visual environment.
Although previous behavioral studies demonstrated that we can extract the
mean orientation, size and speed from sets of items (Ariely, 2001; Chong
& Treisman, 2003; Dakin & Watt, 1997; Watamaniuk & Duchon, 1992),
the underlying neural mechanism remains poorly understood. Here, we
investigated the neural substrates of visual statistical summary representation. More specifically, using fMRI and encoding methods we examined
1) whether the mean orientation is represented in population-level orientation tuning responses in early visual areas as well as high-level fronto-parietal regions and 2) whether this tuning profile is modulated by the
variance of orientation in sets of items. In the experiment, 30 small Gabor
patches varying in orientation briefly appeared at random locations within
a hypothetical circle and subjects were instructed to estimate their mean
orientation and indicate it by a button press. Our behavioral data showed
that averaging performance was impaired as the orientations of the Gabor
patches became more heterogeneous. We also found robust activation in parietal
and dorsolateral prefrontal cortices while subjects performed the averaging
task. Next, we estimated the tuning responses of the mean orientation in
parietal and dorsolateral frontal cortices as well as early retinotopic visual
areas. The results showed that the population-level orientation tuning functions peak at the mean orientation of sets of Gabor patches, and the tuning
strength was attenuated as the variance in orientation increased, which
reflects the decrease in behavioral performance. Our results suggest that
early visual cortex and frontoparietal regions may serve to process ensemble coding, whereby summary statistics of visual stimuli are extracted.
Acknowledgement: This work was supported by IBS-R015-D1
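Because orientation is periodic over 180 degrees, the mean orientation of a set of Gabor patches cannot be taken as a plain arithmetic average. A minimal sketch of one common approach, the double-angle circular mean, is given below; the abstract does not state how the mean was computed, so this method and the function name are assumptions. The resultant length returned alongside the mean falls toward zero as the set becomes more heterogeneous.

```python
import math

def mean_orientation(orientations_deg):
    """Return (mean orientation in [0, 180), resultant length in [0, 1]).

    Angles are doubled so that 0 and 180 degrees map to the same direction,
    averaged as unit vectors, and the result is halved again.
    """
    c = sum(math.cos(math.radians(2 * o)) for o in orientations_deg)
    s = sum(math.sin(math.radians(2 * o)) for o in orientations_deg)
    n = len(orientations_deg)
    mean = (math.degrees(math.atan2(s, c)) / 2) % 180
    return mean, math.hypot(c, s) / n
```

For example, orientations of 170 and 10 degrees average to an orientation at (or numerically just below the wrap-around of) 0 degrees rather than 90 degrees, which is what an arithmetic mean would give.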
23.4084 Conjoint and independent representation of numerosity
and area in human intraparietal cortex Andrew Persichetti1([email protected]
emory.edu), Lauren Aulet1, Daniel Dilks1, Stella Lourenco1; 1Department
of Psychology, Emory University
The posterior parietal cortex in primates has been implicated in representing different abstract quantities such as numerosity and spatial extent (e.g.,
object size). However, there is heated debate about the functional organization of the underlying representations. Recent evidence from single-unit
recording in monkeys and population receptive field models in humans
suggests that there are overlapping groups of parietal neurons tuned for
both quantities. Here, using a continuous carry-over functional magnetic
resonance imaging adaptation design, we asked whether the overlap in
these representations reflects independent populations of neurons coding
for each quantity separately, or a single neural population that is conjointly
tuned to both quantities. Specifically, we presented images of dot arrays
that varied concurrently in numerosity (2-5 dots) and cumulative area in
a continuous, counterbalanced sequence. We modeled adaptation along
these two dimensions with both a City-block and Euclidean contraction
covariate. In the case of independent populations, neural adaptation will
reflect the additive combinations of adaptation for number and area in isolation, as modeled by the City-block covariate. In the case of a single conjoint population, the amount of adaptation for a combined change will be
subadditive, as modeled by the Euclidean contraction covariate. We found
a subadditive amount of adaptation in a posterior region in the right intraparietal sulcus (rIPS), which overlaps with previously reported topographic
maps for both dimensions. In contrast, we found an additive amount of
adaptation in a more anterior region of the rIPS, as well as in both a posterior and anterior region in the left IPS (lIPS). Thus, we found evidence for
both conjoint and independent populations of neurons, with neurons in the
posterior rIPS conjointly representing numerosity and area, and neurons in
the anterior regions of the rIPS and in the lIPS independently representing
these dimensions.
Acknowledgement: This work was supported by a Scholar Award from the John
Merck Fund to SFL, Emory College, Emory University (DD) and the National
Science Foundation Graduate Research Fellowship Program under Grant No.
DGE-1444932 (AP).
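The difference between the two covariates in abstract 23.4084 can be made concrete with a small sketch. It assumes that each covariate is a distance between consecutive dot arrays in a two-dimensional (numerosity, area) space, measured in normalized step units; the function names and step values are hypothetical.

```python
import math

def city_block_change(d_numerosity, d_area):
    """Prediction for independent populations: changes along the two dimensions add."""
    return abs(d_numerosity) + abs(d_area)

def euclidean_change(d_numerosity, d_area):
    """Prediction for a conjointly tuned population: a combined change is
    smaller than the sum of its parts (subadditive)."""
    return math.hypot(d_numerosity, d_area)

# Hypothetical step sizes between consecutive dot arrays, in normalized units.
for step in [(1, 0), (0, 1), (1, 1), (2, 1)]:
    print(step, city_block_change(*step), round(euclidean_change(*step), 3))
```

The two predictions coincide when only one dimension changes and diverge only for combined changes, which is why the design varies numerosity and area concurrently.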
23.4085 Measuring Integration Processes in Visual Symmetry with
Frequency-tagged EEG Nihan Alp1,2([email protected]), Peter
Kohler2, Naoki Kogo1, Johan Wagemans1, Anthony Norcia2; 1Brain and
Cognition, KU Leuven, 2Department of Psychology, Stanford University
Previous brain imaging studies of symmetry have shown that several higher-tier visual areas have strong responses to mirror (Sasaki et al. 2005) and
rotation symmetry (Kohler et al., 2016). The aim of the current work was to
isolate dynamic signatures of brain responses associated with the integrative processes that underlie symmetry perception. We measured steady-state VEPs as participants viewed symmetric patterns comprised of distinct
spatial regions presented at two different frequencies (f1/f2). Under these
circumstances, the intermodulation (IM) components have been shown
to capture integrative processing (Alp et al., 2016), because only neuronal
populations that non-linearly integrate the parts of the image can produce
these IMs. To measure integration processing during mirror symmetry
perception, we used wallpaper patterns (Fedorov, 1891). For the mirror
symmetric stimuli, we generated a PMM pattern containing two mirror
symmetry axes by tiling the plane with a two-fold mirror symmetric lattice. We then diagonally split each lattice into separate parts to generate an
image-pair that could be presented at different frequencies. To generate the
control stimuli, we created a control pattern by rotating the first lattice by
90°, and then combining diagonally split images from the mismatched patterns. This procedure removes all mirror symmetry from the control image,
while keeping local properties equal. All images contained translation and
rotation symmetry, but mirror symmetry could only emerge through the
combination of the image-pair in the mirror symmetric stimulus. Both mirror and control stimuli evoked activity at the IMs, indicating that non-linear
integration is occurring for both pattern types. Several response components showed differential responses between the mirror and control stimuli, however, indicating symmetry-specific integration. There was a complex pattern of statistically reliable differences in both self-terms (2f1,2f2)
and IMs (f2-f1), which suggests the involvement of distinctive non-linear
global pooling in the presence of mirror symmetry.
Acknowledgement: Research Foundation Flanders - FWO
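Why intermodulation components index non-linear integration can be shown with a toy signal. In the sketch below, the tagging frequencies, the sampling rate, and the use of squaring as the non-linearity are all illustrative assumptions and are not taken from the study. A linear sum of two sinusoids has no energy at f2-f1, whereas a squared sum does, along with the self-terms 2f1 and 2f2.

```python
import math

def amplitude_at(signal, freq_hz, sample_rate_hz):
    """Amplitude of one frequency component via a single-bin discrete Fourier sum."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

F1, F2, RATE = 5.0, 7.0, 256  # hypothetical tagging frequencies (Hz) and sampling rate
t = [i / RATE for i in range(RATE)]  # one second of samples
linear = [math.sin(2 * math.pi * F1 * x) + math.sin(2 * math.pi * F2 * x) for x in t]
nonlinear = [v * v for v in linear]  # squaring stands in for non-linear integration

for name, sig in [("linear", linear), ("nonlinear", nonlinear)]:
    print(name, round(amplitude_at(sig, F2 - F1, RATE), 3))
```

Only a population that combines the two tagged inputs non-linearly can produce a response at f2-f1, which is the logic behind using these components as a marker of integration.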
23.4086 Recurrent Interaction between Visual Cortical Areas
Contributes to Contour Integration in the Human Brain: An fMRI-guided TMS Study Ya Li1([email protected]), Yonghui Wang1, Sheng
Li2,3,4,5; 1School of Psychology, Shaanxi Normal University, 2School of
Psychological and Cognitive Sciences, Peking University, 3Beijing Key
Laboratory of Behavior and Mental Health, Peking University, 4Key Laboratory of Machine Perception (Ministry of Education), Peking University,
5PKU-IDG/McGovern Institute for Brain Research, Peking University
One of the challenging tasks for the human visual system is to extract
and integrate local elements from a cluttered background into a
global contour percept. Although previous studies have suggested
the involvement of both striate and extrastriate cortex in this intermediate-level processing of visual perception, the relative roles of these areas and the dynamic
interactions between them are largely unknown. To examine whether
the recurrent processing between the lower and higher-level visual areas
plays a causal role in contour integration, we applied fMRI-guided transcranial magnetic stimulation (TMS) on early visual cortex (V1/V2) and
intermediate-level visual area (V3B) at four SOAs (60/80, 90/110, 120/140
or 150/170 ms) (plus a no-TMS condition) while the participants performed
a contour detection task. Results showed that both V1/V2 and V3B were
critically involved in the process of contour integration. Importantly, the
first critical interference time window for V1/V2 (120/140 ms, p < .05, Cohen’s
d = 0.57) follows that for V3B (90/110 ms, p < .05, Cohen’s d = 0.58). The
interference effect was also found at 150/170 ms for both areas (V1/V2: p
= .05, Cohen’s d = 0.50; V3B: p = .08, Cohen’s d = 0.41). These findings
suggest that the critical contribution of V3B to contour integration was
earlier than that of V1/V2. The present study provides direct evidence supporting the causal role of the recurrent processing between V3B and V1/V2
in contour integration and agrees with the data from monkey physiology.
Our findings fit well with the incremental grouping theory (Roelfsema,
2006; Roelfsema & Houtkamp, 2011), in which a feedforward sweep generates a coarse template in higher visual areas with large receptive fields
before the processing of detailed information in lower visual areas with small
receptive fields through feedback mechanisms.
Acknowledgement: This work was supported by the National Natural Science
Foundation of China (31230029, 31271081, 31371026).
[23.4083, continued] features in the early visual and fronto-parietal cortex Kyeong-Jin
Tark1([email protected]), Sunyoung Park1, Insub Kim1,2, Won Mok
Shim1,2; 1Center for Neuroscience Imaging Research, Institute for Basic
Science (IBS), 2Department of Biomedical Engineering, Sungkyunkwan
University (SKKU)
and/or pars triangularis, and just anterior to, but not overlapping (< 2%
overlap), iPCS maps. Second, we found that ToM activation patterns in the
TPJ were more posterior and superior, closer to the angular gyrus, compared to the visually defined TPJ region, which was closer to the planum
temporale, again with no overlap. The individual subject analysis shows
that the positions of the visual areas in association cortex are systematically
related to, but not overlapping with, regions defined by cognitive tasks.
Acknowledgement: NSF Graduate Research Fellowship DGE 1342536 (W.M.),
NIH-R01 EY016407 (C.C.) and NIH-R00 EY022116 (J.W.)
23.4087 Two-stage generative process in illusory shape perception:
23.4089 Top-down neural processing that supplements missing
a MEG study Ling Liu1,2([email protected]), Huan Luo1,2; 1School of
Psychological and Cognitive Sciences, Peking University, 2IDG/McGovern Institute for Brain Research, Peking University
Grouping local parts into coherent shapes (e.g., illusory shape perception)
is a central function in vision and has been suggested to be a generative
process such that feedback signals carry predictions (i.e., the illusory shape)
and feedforward signals represent prediction errors (i.e., the mismatch
between predictions and actual bottom-up inputs). Although recent fMRI
studies provide evidence supporting the predictive coding hypothesis in
illusory shape perception, the neuronal dynamics and the associated brain
regions underlying this generative process remain largely unknown. To
address the issue, we recorded magnetoencephalography (MEG) signals
while human subjects were presented with Pac-Man figures whose combination either did or did not induce an illusory shape percept
(‘Kanizsa triangle’), corresponding to the grouping and ungrouping conditions,
respectively. Critically, we employed a temporal response function
(TRF) technique combined with random modulation of the luminance of each
Pac-Man to extract the neuronal response specific to each of the three Pac-Man
figures. First, the TRF responses for the grouping condition showed decreased
activity compared to the ungrouping condition, consistent with the predictive
coding account, given that prediction errors are assumed to be smaller
when shape perception is induced. Second, two time periods, one early and
one late, associated with different neuronal oscillatory frequencies and different brain regions, showed the inhibition effects. Specifically, within 100-150 ms,
a beta-band (14-20 Hz) decrease originated in bilateral early
visual cortex (V1 and V2) and TPJ regions; within 200-400 ms, a theta-band
(4-7 Hz) inhibition was found to arise from right IFG and TPJ regions.
We propose that illusory shape perception consists of two stages: an
early one that quickly encodes prediction error in early sensory areas and
a late one that performs background inhibition in right parietal and frontal
regions after the establishment of illusory shape as foreground.
Acknowledgement: the National Nature Science Foundation of China Grants to
H. L. (31522027, 31571115).
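The TRF idea in abstract 23.4087 can be illustrated with a toy simulation. The abstract does not say which estimator was used, so the sketch below assumes the simplest one: when each Pac-Man's luminance is modulated by an independent white-noise sequence, the cross-correlation between one sequence and the recorded signal approximates the response kernel for that figure alone. The kernels, sample count, and function names are hypothetical and are not taken from the study.

```python
import random

def simulate_response(stimuli, kernels):
    """Noise-free recording: each stimulus convolved with its own kernel, then summed."""
    n = len(stimuli[0])
    response = [0.0] * n
    for stim, kernel in zip(stimuli, kernels):
        for t in range(n):
            for lag, weight in enumerate(kernel):
                if t - lag >= 0:
                    response[t] += weight * stim[t - lag]
    return response

def estimate_trf(stimulus, response, n_lags):
    """Cross-correlation estimate of the TRF (assumes a white-noise stimulus)."""
    power = sum(s * s for s in stimulus)
    trf = []
    for lag in range(n_lags):
        xcorr = sum(stimulus[t - lag] * response[t]
                    for t in range(lag, len(response)))
        trf.append(xcorr / power)
    return trf

rng = random.Random(0)
n_samples = 8000
# Three independent luminance sequences, one per Pac-Man (hypothetical values).
stimuli = [[rng.uniform(-1.0, 1.0) for _ in range(n_samples)] for _ in range(3)]
# Hypothetical response kernels, one per Pac-Man.
true_kernels = [[0.0, 1.0, 0.5, -0.3],
                [0.0, 0.6, 0.2, 0.0],
                [0.5, -0.4, 0.0, 0.1]]
response = simulate_response(stimuli, true_kernels)
estimated = [estimate_trf(s, response, 4) for s in stimuli]
```

Because the three sequences are independent, each estimate recovers only the kernel driven by its own Pac-Man, which is what allows figure-specific responses to be separated from a single recording.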
23.4088 The topographical relationship between visual field maps
in association cortex and brain areas involved in non-visual cognition Eline Kupers1([email protected]), Wayne Mackey2, Clayton
Curtis1,2, Jonathan Winawer1,2; 1Department of Psychology, New York
University, New York, USA, 2Center for Neural Science, New York University, New York, USA
Visual field maps have been found in all lobes of the brain. Multiple maps
in association cortex are in or near regions associated with cognitive tasks
that are not explicitly visual. Understanding the relationship between visually-defined areas and cognitively-defined areas will clarify our understanding of association cortex. Using fMRI, we investigated two pairs of
visually defined and cognitively defined areas in individual subjects: (1) a
visual field map in the inferior precentral sulcus (iPCS), and Broca’s area,
and (2) a visually responsive region in the temporoparietal junction (TPJ;
Horiguchi et al., 2016, doi:10.1093/cercor/bhu226) and an area involved
in theory of mind (ToM, Saxe & Kanwisher, 2003, doi:10.1016/S10538119(03)00230-1). Using an attention-demanding retinotopic mapping
task (Mackey et al., 2016, doi:10.1101/083493), we defined visual field maps
in iPCS and visually responsive regions in TPJ. In the same subjects, Broca’s
area was defined by a language localizer (words > jabberwocky sentences;
Fedorenko et al.,2012, doi:10.1016/j.cub.2012.09.011), and a ToM area in TPJ
from a story localizer (ToM stories > descriptions of pictures; Saxe & Kanwisher, 2003). We projected the contrast patterns onto the cortical surface
and compared their locations to the previously defined visual areas. We
found that Broca’s area was left-lateralized, on or near the pars opercularis
image features revealed by brain decoding with deep neural network representation Mohamed Abdelhack1,2([email protected]
kyoto-u.jp), Yukiyasu Kamitani1,2; 1Department of Intelligence Science and
Technology, Graduate School of Informatics, Kyoto University, 2Department of Neuroinformatics, ATR Computational Neuroscience Laboratories
The problem of visual recognition entails matching the sensory input with
the stored knowledge of semantic information. This process involves a
feed-forward component where visual input is processed to extract features
characterizing different objects. It is also presumed to involve an opposite
processing pathway where high-level features propagate back providing
prediction on the kind of visual stimulus presented. This top-down pathway appears to be useful particularly for tasks like processing degraded
stimuli. The process by which the forward and backward pathways integrate leading to visual perception is still largely unknown. Here, using a
deep neural network (DNN) trained to recognize objects as a proxy for hierarchical neural representations (Horikawa & Kamitani, 2015), we demonstrate that top-down processing pathway attempts to supplement missing
visual features. We first trained multivoxel fMRI decoders to predict DNN
features of multiple layers for stimulus images. The trained decoders were
then used to analyze independent fMRI data collected while viewing pairs
of normal and degraded images. Degraded images were created by blurring original images using averaging filters, and by binarizing slightly
blurred images using thresholding. We found that decoded features from
fMRI responses to degraded images were more correlated to DNN features
calculated from the original images than from the degraded (presented)
ones. This was especially salient in the lower level DNN representations.
We also found that the task of categorizing the visual stimuli increased the
correlation difference especially in higher visual areas. These results suggest the operation of the top-down pathway as it attempts to supplement
the missing information in degraded images. The effect of the categorizing
task may indicate how giving a prior guides the exploration efforts towards
successful perception. This DNN-based brain decoding approach may
reveal the interactions between the bottom-up and the top-down pathways,
providing empirical evidence for existing and novel theoretical models.
Acknowledgement: 1- JSPS KAKENHI Grant number JP15H05710,
JP15H05920, 2- ImPACT Program of Council for Science, Technology and Innovation (Cabinet Office, Government of Japan), 3- The New Energy and Industrial
Technology Development Organization (NEDO)
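The comparison at the heart of abstract 23.4089 can be sketched as follows: decoded features are correlated with the DNN features of the original image and with those of the degraded (presented) image, and the difference between the two correlations is the quantity of interest. The feature vectors below are invented toy numbers and the function name is hypothetical; the sketch only illustrates the comparison and is not the authors' pipeline.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy feature vectors (hypothetical): the degraded image yields flattened features.
original_features = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
degraded_features = [0.5, 0.4, 0.6, 0.5, 0.4, 0.5]
# Features decoded from fMRI responses to the degraded image (hypothetical).
decoded_features = [0.8, 0.2, 0.7, 0.3, 0.65, 0.35]

r_original = pearson(decoded_features, original_features)
r_degraded = pearson(decoded_features, degraded_features)
correlation_difference = r_original - r_degraded
```

A positive `correlation_difference` corresponds to the pattern reported in the abstract, in which decoded features resemble the original image more than the image that was actually shown.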
23.4090 Visual hallucinations following occipital stroke associated
with altered structural connectivity Sara Rafique1([email protected]
ca), John Richards2, Jennifer Steeves1; 1Centre for Vision Research and
Department of Psychology, York University, 2Department of Emergency
Medicine, University of California, Davis, Medical Center,
Irreversible damage to the visual pathway that results in vision loss can
produce visual hallucinations in cognitively healthy individuals. These
visual hallucinations stem from disruption to neuronal function that leads
to aberrant functional activity across visual cortices and associated cortical networks. We sought to investigate structural changes in white matter
connectivity and its contribution to chronic visual hallucinations following damage to the visual cortex. We performed diffusion tensor imaging
to assess white matter in a patient suffering from continuous and disruptive unformed visual hallucinations for more than 2 years following right
occipital stroke, and in healthy age-matched controls. White matter structure was reconstructed using probabilistic fibre tractography, and diffusion
was quantified by measuring diffusion tensor indices. Using probabilistic
tractography, we reconstructed reciprocal white matter tracts between the
lateral geniculate nucleus and visual cortex, and between visual cortices.
Temporal Processing: Duration
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4091 Individual differences in the perception of (a bigger)
time Simon Cropper1([email protected]), Christopher Groot1,
Andrew Corcoran1, Aurelio Bruno2, Alan Johnston3; 1MSPS, University
of Melbourne, Australia, 2Experimental Psychology, University College
London, UK, 3School of Psychology, University of Nottingham, UK
The ability of subjects to identify and reproduce brief temporal intervals
is influenced by many factors whether stimulus-, task- or subject-based.
Previously we have shown the effects of personality on sub-second timing
judgements (VSS 2015); the current study extends this result to supra-second judgements, to examine the postulated dissociation between sub- and
supra-second timing mechanisms. 141 undergraduate subjects completed
the OLIFE schizotypal personality questionnaire prior to performing a
modified temporal-bisection task. Subjects responded to two identical
instantiations of a 4deg grating, presented 4deg above fixation for 3 secs
in a rectangular temporal-envelope. They initiated presentation with a button-press, and released the button when they considered the stimulus to
be half-way through. Subjects were then asked to indicate their ‘most accurate estimate’ of the two intervals. The stimuli were static and blocked into
four repeats of 50 stimulus pairs. The significant order-effect seen in the
sub-second data disappeared; this was at the expense of accuracy, as the
mid-point was consistently underestimated. Precision in the response was
increased as a proportion of total duration, reducing the variance below
that predicted by Weber’s law. This result is consistent with a breakdown
of the scalar properties of time perception in the early supra-second range.
All subjects showed good insight into their own performance, though that
insight did not necessarily correlate with the veridical bisection point; they
were consistently and confidently wrong. The significant correlations with
schizotypy seen in the sub-second data were not replicated in the current
study. These data support a partial dissociation of timing mechanisms,
but also suggest that not only is perception the critical mitigator of confidence in time, but that individuals effectively compensate for differences in
perception at the level of metacognition in early supra-second time.
23.4092 Perception of duration in the absence of the clock
reset Ljubica Jovanovic1([email protected]), Pascal Mamassian1;
Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, Département
d’Études Cognitives, École Normale Supérieure, Paris, France
Models of time perception propose different mechanisms, varying in
complexity and levels of explanation (Block & Grondin, 2014; Matthews
& Meck, 2014). However, most models assume that in order to estimate
duration a clear onset of the to-be-timed interval is needed. We aimed to explore
this assumption by investigating estimation of time in the absence of the onset
of the interval. Stimuli consisted of a small disc rotating around a clock
with variable speeds. After a variable duration, the disc would stop and
participants were prompted to reproduce the duration of the last rotation.
Since the stopping position was random, there was no salient onset of the
last rotation. In order to investigate the contribution of visual information on the timings, we introduced an occlusion along the path of the disc.
We compared performance in non-occluded and occluded conditions. In
addition, there were conditions in which the disc could abruptly change
speed behind the occluder, to investigate the effect of unexpected event
on perceived duration. In agreement with previous work, short durations
(stimuli moving fast) are perceived to last longer while long durations
are underestimated (Jazayeri & Shadlen, 2010). Moreover, bias and variability of reproduced durations in this task were comparable to a control
experiment with clear onset of the duration. Importantly and surprisingly, reproduced times were less biased in the occluded condition. Past
work has revealed that perceived duration is biased by visual information
(Kaneko & Murakami, 2009). Our results indicate that when visual information is not always available, participants properly take into account the
duration during which stimulus is absent and their overall performance
is improved. Taken together, our results suggest that timing is possible
without assuming any reset of a clock. We propose ways to modify existing
models of time perception to account for our results.
Acknowledgement: PACE ITN, European Union’s Horizon 2020 research and
innovation programme under the Marie Skłodowska-Curie grant agreement No
23.4093 Stimulus response compatibility affects duration judgments, not the rate of an internal timer. D. Alexander Varakin1([email protected]); 1Department of Psychology, Eastern Kentucky
Varakin, Hays, and Renfro (2015, VSS) demonstrated that stimulus response
compatibility (SRC) influences duration judgments. The current experiment tested whether SRC affects an internal timer’s rate. Participants (N
= 215) performed a temporal bisection task, judging on each trial whether
a visual stimulus’ duration was closer to pre-learned short or long standards. Response mapping was counterbalanced: about half of participants
used a right-hand key for “long” judgments and a left-hand key for “short”
judgments, vice versa for remaining participants. On each trial, stimuli
appeared on the left or right side of the monitor, thus inducing SRC. Two
additional factors were manipulated. The first was the temporal location of
the SRC-relevant stimulus. In the “during” condition, the stimulus being
judged appeared on the left or the right of fixation. In the “after” condition, the stimulus being judged appeared in the center of the monitor, and
the response prompt appeared on the left or right of fixation. If SRC only
changes the rate of an internal timer, then SRC might not be observed in the
“after” condition, because the relevant temporal interval had ended when
SRC was introduced. The second factor was the magnitude of short/long
standard durations, which were either 200ms/800ms or 400ms/1600ms. If
SRC changes the rate of an internal timer, the SRC effect should be smaller
for 200ms/800ms standards than for 400ms/1600ms. The results replicated
SRC’s influence on temporal bisection: long-compatible stimuli reliably
elicited long judgments at shorter durations than short-compatible stimuli.
However, SRC was observed even when it was present only after the relevant temporal interval ended, and SRC did not interact with the magnitude
of the short and long standards. Overall, these results suggest that SRC
did not affect the rate of an internal timer, but may have affected processes
otherwise unrelated to time perception.
23.4094 Central tendency effects override and generalize across
illusions in time estimation Eckart Zimmermann1([email protected]
com); 1Institute for Experimental Psychology, Heinrich Heine University
Düsseldorf, Universitätsstraße 1, 40225 Düsseldorf, Germany
Illusions and central tendency effects strongly modulate temporal interval estimations. First, interval estimations are subject to distortions during
active and passive observation: Interval compression occurs when an
action produces a stimulus or when one of the interval markers is masked.
Second, central tendency effects consist in an overestimation of short and
an underestimation of long intervals. To understand the functional role of
both phenomena, I asked which effect would dominate if both are set into
direct competition. To this end, I tested two temporal illusions: active intentional compression and passive mask-induced compression. Both illusions
produced systematic underestimations when several interval durations
were presented in blockwise fashion. However, strong central tendency
effects occurred when interval duration was randomized. I presented an
interval of 112 ms intermixed either in a context of 5 shorter (32-96 ms) or 5
longer (128-192 ms) intervals. The 112 ms interval compressed to about half
of its duration when presented only with shorter intervals and dilated by a
factor of 1.5 when presented only with longer intervals. Central tendency
effects thus clearly dominated interval estimations. Next, I asked about the
Tracts were further reconstructed from visual cortex to frontal, temporal,
and parietal regions of interest based on fMRI findings showing functional
differences in the patient compared with healthy age-matched controls.
White matter tracts showed regeneration of terminal fibres of ipsilesional
optic radiations in the patient that were displaced anterior to the lesion site;
however, reciprocal intrahemispheric tracts from ipsilesional visual cortex
to lateral geniculate body were disrupted. There was an absence of interhemispheric white matter tracts from ipsilesional to contralesional primary
visual cortex, while contralesional to ipsilesional tracts were spared in the
patient. Further, we observed compromised structural characteristics and
changes in diffusion of white matter tracts in the patient connecting the
visual cortex with frontal and temporal regions. This cortical remapping
and disruption of communication between visual cortices and from visual
cortex to remote regions is consistent with our previous findings showing
imbalanced functional activity of the same regions associated with chronic
visual hallucinations in the patient.
generality of central tendency effects by testing their transfer between the
illusions. I presented the active illusion with a duration of 112 ms either with
5 shorter or with 5 longer passive illusion intervals. In separate sessions, I
presented the passive illusion for 112 ms intermixed into either shorter or
longer active illusion intervals. Central tendency effects induced in either
the active or the passive illusion intervals transferred to the other illusion.
These results demonstrate that the immediate context of sensory stimulation determines whether intervals appear compressed or dilated, irrespective of whether the interval is actively produced or passively observed. This
is consistent with recent Bayesian explanations of time estimation.
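A minimal sketch of the kind of Bayesian account that abstract 23.4094 alludes to: if the observer combines a noisy measurement with a Gaussian prior built from the intervals in the current block, the estimate is pulled toward the block mean. The equal 16 ms spacing of the context intervals, the inclusion of the target in the prior, and the 30 ms measurement noise are all assumptions made for illustration; the abstract gives only the ranges (32-96 ms and 128-192 ms) and the 112 ms target.

```python
from statistics import mean, pvariance

def bayes_estimate(measured_ms, context_ms, measurement_sd_ms):
    """Posterior mean for a Gaussian prior (from the context) and Gaussian likelihood."""
    prior_mean = mean(context_ms)
    prior_var = pvariance(context_ms)
    # Weight on the measurement grows as the prior gets broader than the noise.
    weight = prior_var / (prior_var + measurement_sd_ms ** 2)
    return weight * measured_ms + (1.0 - weight) * prior_mean

target = 112                                         # ms, from the abstract
short_context = [32, 48, 64, 80, 96, target]         # assumed spacing
long_context = [128, 144, 160, 176, 192, target]     # assumed spacing
noise_sd = 30.0                                      # ms, hypothetical

est_short = bayes_estimate(target, short_context, noise_sd)
est_long = bayes_estimate(target, long_context, noise_sd)
```

Under these assumptions the same 112 ms interval is estimated as shorter in the short context and longer in the long context. The toy model reproduces only the direction of the central tendency effect, not the magnitudes reported above.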
23.4095 Synchronized stimuli are perceived to be shorter Bo-Rong
Lin1([email protected]), Chang-Bing Huang1; 1Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences
University of Chinese Academy of Sciences, CAS Visual Information
Processing Laboratory (VisPal), Institute of Psychology, CAS
Processing and telling time, or time perception, is vital for the survival of
any organism. A variety of spatial cues have been found to be effective
in modulating observers’ time perception. Here, we investigated the effect
of synchronization, a form of temporal cue, on the perceived duration of
visual stimuli in which spatial cues were largely controlled. Stimuli were
100 non-overlapping Gabor patches moving in one of two directions that
were orthogonal to their orientations. Synchronization of the Gabor patches was
defined by entropy and correlation; entropy refers to the probability of
moving direction change and correlation refers to the likelihood that all
Gabor elements reverse their motion directions simultaneously. In Experiment 1, nineteen observers performed a duration discrimination task that
included four pairs of stimuli that differed in entropy and correlation. We
found that stimuli with high synchronization were perceived as significantly
shorter (by ~50ms) than random but otherwise identical stimuli of the
same duration. This contraction effect couldn’t be explained by change in
perceived speed (Experiment 2). Varying the display duration from 350 to
1050ms didn’t significantly affect the magnitude of perceived contraction
(Experiment 3). Furthermore, we found no significant difference in both
appearance and disappearance detection times to stimuli (Experiment 4),
ruling out the possibility of different detection time with stimuli of different synchronization factors. Taken together, our findings suggest that synchronization can also effectively modulate time perception, possibly acting
via slowing down the pacemaker, an essential part of the internal clock.
Acknowledgement: Supported by the National Natural Science Foundation of
China and the Knowledge Innovation Program of Chinese Academy of Sciences.
23.4096 A superposition of moving and static stimuli appears to
dilate in time when the moving stimulus is attended to Daisuke
Hayashi1([email protected]), Hiroki Iwasawa1, Takayuki
Osugi2, Ikuya Murakami1; 1Department of Psychology, The University of
Tokyo, 2Department of Human Sciences and Cultural Studies, Yamagata
A moving stimulus appears to last longer than a static one. This time
dilation in a moving stimulus has been explained by stimulus domains,
such as temporal frequency and speed (e.g., Kanai et al., 2006; Kaneko &
Murakami, 2009). However, previous studies have presented moving and
static stimuli separately, and it is still unknown whether the observer’s
attentional set to the moving stimulus affects perceived duration when
the moving and static stimuli overlap in the same location. We presented
moving and static random-dot patterns simultaneously within the same
field and instructed the observer to attend to either one of these patterns
colored differently. We measured the perceived duration of the attended
pattern in the two-interval forced-choice paradigm with the method of constant stimuli. The standard stimulus consisted of 300 moving dots and 300
static dots whereas the comparison stimulus consisted of 300 static dots.
The standard stimulus appeared to last longer when the moving stimulus
was attended to than when the static one was attended to, even though
the physical display was the same between conditions. Subsidiary experiments demonstrated that the difference in perceived duration between the
attend-to-moving and attend-to-static conditions increased proportionally
to the physical duration and that, compared to the apparent duration of a
static pattern alone, the attended static pattern in the superposition did
not appear to last shorter when it was added to a moving pattern that had already
been visible for one second. These results indicate that endogenous attention
to a moving stimulus is a crucial component of the time dilation and that
a simultaneously appearing stimulus within the superposition can draw
exogenous attention, affecting the perceived duration. In the pacemaker-accumulator framework, these effects can be explained by selective accumulation of temporal units by feature-based attention.
Acknowledgement: Supported by a JSPS Grant-in-Aid for Scientific Research on
Innovative Areas (25119003)
23.4097 Time compression, but not dilation, in slowly moving stimuli Saya Kashiwakura1, Isamu Motoyoshi2; 1Department of Integrated
Sciences, The University of Tokyo, 2Department of Life Sciences, The
University of Tokyo
A number of psychophysical studies have shown that a moving visual
stimulus is perceived to last longer than a stationary stimulus. In contrast,
here we report a case in which the perceived duration of a moving stimulus
is shorter than that of a stationary stimulus. In our procedure, observers
viewed a natural movie (i.e., a running horse) that was presented at a particular speed (0.0, 0.25, or 1.9 relative to the original speed) for a particular
duration (0.5, 1.1, or 1.6 sec), and indicated whether its duration appeared
longer or shorter than that of the comparison movie presented at the original speed. We found that the duration of the movie with a slow speed (i.e.,
0.25 speed) was perceived to be shorter than that of a static image (i.e., 0.0
speed), especially when the physical duration of the stimulus was longer
than 1.1 sec. The perceived duration of a fast movie was longer than that
of a static image, consistent with previous studies. Similar patterns of
results were obtained when we employed artificial stimuli such as drifting
gratings, and when we measured the perceived duration by manual reproduction. These results are inconsistent with the fundamental assumption
of time perception that the subjective experience of time passage depends
on the number of changes or events. We discuss potential factors that may
account for this paradoxical effect.
Acknowledgement: This study was partially supported by JSPS KAKENHI
JP16H01499 and JP15H03461.
23.4098 Task-relevant attention and repetition suppression
co-determine perceived duration Yong-Jun Lin1([email protected]),
Shinsuke Shimojo1; 1Computation and Neural Systems, California Institute
of Technology
Duration perception of an event can be influenced by the temporal context. One such phenomenon is subjective time expansion induced in an
oddball paradigm (“oddball chronostasis”), where the duration of a novel
item (oddball) appears longer than that of repeated items (standards). Two
leading theories are 1) attention enhances oddball duration [Tse et al., 2004]
and 2) repetition suppression reduces standards duration [Pariyadath &
Eagleman, 2007]. However, no studies so far have evaluated both together.
We thus measured observers’ chronostasis magnitude (CM) with the method of constant
stimuli, where CM = standard duration - point of subjective equality of the target, and manipulated three sequence types: repeated, ordered
and random (Fig 1a). The stimulus dimensions were digits, orientations,
and colors in Exp 1, 2, and 3, respectively (Fig 1b-d). The repeated condition was the classic oddball paradigm. In the ordered condition, items
never repeated; the target was the item that did not follow the order. In
the random condition, the observers were instructed which item would be
the target while the other items were random and unpredictable. Positive
CM in the random condition would indicate task-relevant attention effect.
From random to ordered condition, CM increment would imply prediction error effect; from ordered to repeated condition, it would imply repetition suppression. Results in Exp 1 and 2 revealed task-relevant attention
and repetition suppression effects (Fig 2a,b); results in Exp 3 showed only
task-relevant attention effect (Fig 2c). In Exp 1 and 2, attention and repetition suppression contributed about equally. In all experiments, CM correlations between sequence type condition pairs were mostly significant
(Tab 1), indicating a common factor, which is likely task-relevant attention.
Hence, both attention and repetition suppression are necessary for explaining the original oddball phenomenon. In the special case of colors, attention
alone may be sufficient as an account.
Acknowledgement: NSF-1439372 and JST.CREST to Shinsuke Shimojo
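The chronostasis magnitude defined in abstract 23.4098 (CM = standard duration - point of subjective equality of the target) can be worked through with a small example. The response proportions, the 500 ms standard, and the use of linear interpolation to locate the point of subjective equality (PSE) are all assumptions for illustration; the abstract does not state how the PSE was fitted.

```python
def point_of_subjective_equality(durations_ms, p_longer):
    """Linearly interpolate the target duration at which p('target longer') = 0.5."""
    points = list(zip(durations_ms, p_longer))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        if p0 <= 0.5 < p1:
            return d0 + (0.5 - p0) * (d1 - d0) / (p1 - p0)
    raise ValueError("responses never cross 0.5")

standard_ms = 500                                    # hypothetical standard duration
target_durations_ms = [300, 400, 500, 600, 700]      # hypothetical comparison levels
p_target_longer = [0.10, 0.30, 0.60, 0.85, 0.95]     # hypothetical response proportions

pse_ms = point_of_subjective_equality(target_durations_ms, p_target_longer)
chronostasis_magnitude_ms = standard_ms - pse_ms
```

A positive CM means the target needed less physical time than the standard to be judged equal in duration, that is, the target appeared subjectively expanded.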
23.4099 Attention mediates the encoding of duration Jim Maarseveen1([email protected]), Hinze Hogendoorn1, Frans Verstraten1,2, Chris
Paffen1; 1Utrecht University, Helmholtz Institute, Department of Experimental Psychology, The Netherlands , 2University of Sydney, Faculty of
Science, School of Psychology, Sydney, NSW 2006, Australia
23.4100 Luminance motion induces larger time compression and
larger time dilation than equiluminant motion Hiroshi Yoshimatsu1,2([email protected]), Yuki Murai, Yuko Yotsumoto;
Department of Life Sciences, The University of Tokyo, 2Japan Society for
the Promotion of Science
After adapting to a moving stimulus, the duration of another moving stimulus presented at the adapted location is underestimated (adaptation-induced time compression). On the other hand, the duration of a moving stimulus is overestimated relative to that of a static stimulus (time dilation). The compression effect has been reported to be selective to luminance motion and
has therefore been considered to relate to early motion processing such as the magnocellular pathway. In contrast, the dilation effect has been considered to relate
to higher motion processing such as area MT; however, the luminance
selectivity of this effect remains unknown. In this study, we directly compared the adaptation-induced time compression and the time dilation in
terms of luminance selectivity. In the experiments, we measured the time
compression and the time dilation for luminance gratings and subjectively
equiluminant color gratings (0.5 cpd, diam 8°). In the time dilation
experiment, observers compared the duration of a moving (7 Hz) standard
stimulus and that of a static test. In the time compression experiment, the
adaptor was presented at the beginning of each trial, and observers compared the durations of two moving (7 Hz) stimuli: a standard presented at
the adapted location and a test at another location. All stimuli were centered
at 5° eccentricity. The standard duration was 600 ms and the test duration
was variable (300-1200 ms). We found that the luminance motion induced
significantly larger time compressions (~17%) than the equiluminant
motion (~7%), consistent with previous studies. Furthermore, significant time dilations were observed for the luminance motion (~22%), but not
for the equiluminant motion. These results indicate that luminance motion
induces larger motion-induced time distortions. Our study suggests that
early visual processing such as the magnocellular pathway is responsible for both the time compression and the time dilation.
Acknowledgement: NIH Grant R00-EY022116 (JW)
Multisensory: Vision and audition
Saturday, May 20, 8:30 am - 12:30 pm
Poster Session, Pavilion
23.4102 Oculomotor Response Precedes Awareness Access of
Multisensory Emotional Information Under Interocular Suppression Yung-Hao Yang1(yunghaoy[email protected]), Su-Ling Yeh1; 1Department of Psychology National Taiwan University, Taiwan
Previous studies have shown that emotionally salient information can
attract attention in the absence of visual awareness. Since affective voice
can enhance emotional meaning of facial expression, we tested whether
emotional congruency of affective voices can also modulate attention allocation of invisible facial expressions. We adopted the continuous flash
suppression (CFS) paradigm to render facial expressions (e.g., happy and
fearful) invisible to the participants, and manipulated affective voices (e.g.,
laughing and screaming) to generate either congruent or incongruent conditions. We measured the time to release from interocular suppression and
simultaneously recorded eye movements as an index of attention allocation.
The results showed that happy faces had shorter first-saccade latencies and
shorter suppression times than fearful faces; the latter result has been replicated in experiments with different databases. Importantly, congruent
affective voices yielded shorter dwell times and shorter suppression times
than their incongruent counterparts. The results suggest that affective voice can
influence the attention attraction of invisible facial expression. In addition,
these results also provide new evidence that emotional meaning of facial
expression can be extracted under interocular suppression and thus integrated with affective voice.
Keywords: Facial expression, multisensory
integration, unconscious processing, eye-movement
Acknowledgement: This research was supported by Taiwan’s Ministry of Science
and Technology under Grant 104-2420-H-002-003-MY2 to Dr. Su-Ling Yeh
23.4101 Temporal windows in psychophysical discrimination and
in neural responses in human visual cortex Jingyang Zhou1([email protected]nyu.edu), Silvia Choi1, Jonathan Winawer1; 1Department of Psychology,
New York University
23.4103 Context dependent crossmodal associations between
visual spatial frequencies and auditory amplitude modulation
rates. Joo Huang TAN1([email protected]), Po-Jang HSIEH1;
Duke-NUS Medical School
Our percept of the world is a multisensory one. In order to successfully
navigate our surrounds, we must integrate information originating from
different modalities. Here, we delve into the relationship between two low-level
sensory features. Specifically, we investigated the highly specific perceptual matches that exist between auditory amplitude modulation (AM)
Vis io n S c ie nc es Societ y
Saturday AM
Attention has been suggested to play an important role in duration processing. Here, we investigated whether attention mediates the encoding of the
durations of multiple events by measuring the duration after-effect (DAE)
following adaptation to concurrently presented events with different
durations. Observers adapted by viewing two streams of Gaussian blobs
displayed to the left and right of a central fixation cross. In Experiment
1, blobs in one stream lasted 200 ms while the blobs in the other stream
lasted 800 ms. To manipulate attention, observers were instructed to perform a duration-oddball detection task on one of the streams (and ignore
the other), while maintaining central fixation. In Experiment 2, observers
adapted in three conditions: repetitions of blobs lasting 200 and 400 ms,
while performing the oddball task on the 200 ms blobs (Attended: 200 ms,
Unattended: 400 ms; A200/U400) or the 400 ms blobs (A400/U200), or to
repetitions of blobs that both lasted 400 ms, while performing the oddball
task on one of the streams (A400/U400). To measure the DAE, observers
completed a cross-modal duration judgment task in which they compared
a fixed auditory reference to a visual test stimulus with a varying duration. The results of Experiment 1 reveal that attending to blobs lasting 200
ms caused an after-effect in line with adaptation to 200 ms, while attending 800 ms blobs caused an after-effect in line with adaptation to 800 ms.
This shows that the magnitude of the DAE depended on which duration
was attended to during adaptation. Experiment 2 revealed no difference
between the after-effects when the unattended stimulus lasted 200 ms
(A400/U200) and when it lasted 400 ms (A400/U400), demonstrating that
the unattended duration does not contribute to the measured DAE. These
results show that attention plays a crucial role in selecting which durations
are encoded.
Previously at VSS (Zhou et al., 2016, 56.4040), we presented a model of
temporal responses to briefly viewed stimuli measured with fMRI and
intracranial electrodes. We showed that the duration over which responses
to visual stimuli interact, a temporal window, increases along the visual
hierarchy. Here, we tested a behavioral correlate of the different temporal window lengths with several psychophysical tasks designed such that
performance was expected to be limited by different stages of cortical processing. We adopted a psychophysics paradigm similar to Burr and Santoro
2001, (doi:10.1016/S0042-6989(01)00072-4), in which we measured discrimination sensitivity as a function of exposure duration, defining the temporal
window for a given task as the duration at which sensitivity saturated (the
inflection point in the sensitivity versus duration plots). The first two tasks
were adapted from Burr and Santoro, and the third was novel: (1) contrast
thresholds for dot motion direction judgments (left or right; 100% coherence); (2) coherence thresholds for dot motion direction judgments (left or
right; high contrast); (3) contrast thresholds for judgments of facial emotional expression (happy or sad). Task 1 was expected to isolate a first stage
motion mechanism that is spatially local, represented in early visual areas
(V1), with short temporal windows. Task 2 was expected to be limited by
a second stage motion mechanism which integrates motion direction over
large regions (MT or MST), with longer temporal windows. Task 3 was
expected to be limited by a late stage object recognition mechanism, with
long temporal windows. The results were in accord with these predictions,
with short temporal windows for Task 1 (~300 ms) and long windows for
Tasks 2 and 3 (~1000 ms; ~1700 ms). Future work will test explicit linking
models that predict the psychophysical window length from the neural
Saturday AM
Satur day Morni ng P os t er s
rate and visual spatial frequency. We conducted a series of perceptual
matching tasks between visual spatial frequencies and auditory AM rates.
Participants were tasked to adjust the AM rates of the sound stimulus to
match the spatial frequency of a grating stimulus displayed on screen. Each
participant was only presented with one specific visual spatial frequency
for the experiment. Initial AM rate of the sound stimulus at the start of
each trial was manipulated as an independent variable across subjects. We
demonstrate that perceptual associations made between specific pairs of
visual spatial frequencies and auditory amplitude modulation rates are
highly conserved across individuals within specific experimental context.
However, these associations are highly, and easily influenced by auditory
context. This work serves to demonstrate the point that sensory context can
exert strong influences on crossmodal associations. 23.4104 Look at me when I’m talking to you! Sound influences gaze
behaviour in a ‘split-screen’ film Jonathan Batten1([email protected]
com), Jennifer Haensel1, Tim Smith1; 1Psychological Sciences, Birkbeck,
University of London
Viewing a dynamic audiovisual scene has inherent challenges for where
and when gaze is allocated because of the competing and transient sensory
information. The applied craft of film production has developed intuitive
solutions for guiding viewers’ gaze through visual and sound editing techniques; for example, sound designers believe that increasing the loudness of
dialogue relative to background ambient noise orients a viewer’s attention
to the speaking character. A fundamental assumption of these techniques
is that a viewer’s gaze is attracted to audiovisual elements in a scene and
inversely less attracted to visual events without sound. Empirical evidence
of viewing behaviour to dynamic scenes has predominantly focused on
visual features; the role of sound as an influence on viewers’ gaze is less
clear. This study utilised a found experiment, Mike Figgis’s experimental
feature film Timecode (2000), which contains four continuous perspectives
of interrelated events displayed using a 2x2 split-screen, where each quadrant has an isolatable sound mix. We investigated the influence of sound on
gaze behaviour to a 4 minute 50 second excerpt by manipulating the presence of sound across the four quadrants one at a time with abrupt sound
cuts shifting sound 16 times (each quadrant represented four times). Forty-eight participants free-viewed the clip whilst being eye-tracked (sound
order was counterbalanced across participants). Sound representation to a
quadrant significantly increased the proportion of gaze to that region. Gaze
was also influenced by time, as later sound representations of a quadrant
had a significantly higher proportion of gaze than earlier ones. Fixation
durations to sound regions were significantly longer than those to visual
only quadrants. The auditory and visual salience values are also considered
as predictors of gaze between the quadrants. These preliminary results suggest that dynamic scene viewing behaviour is significantly influenced by
the inclusion of corresponding sound.
Acknowledgement: Jonathan Batten’s research was funded by an ESRC PhD
Studentship (ES/J500021/1).
23.4106 Limits of sensory fusion in audio-visual cue conflict stimuli
Baptiste Caziot1,2([email protected]), Pascal Mamassian1,2; 1Ecole
Normale Supérieure, 2Paris Sciences et Lettres
What are the limits of sensory fusion in cue conflict stimuli? We recorded
perceptual reports and RTs for discrepant audio and visual cues. A shape
subtending approximately 10 deg was displayed twice on a monitor, for 83 ms each time, with the two presentations separated by 333 ms. The size of the shape changed between the
two occurrences so as to simulate a displacement in depth. Simultaneous
with the visual displays, auditory white noise was played through headphones with varying loudness also simulating a distance change (inverse
square law). Participants reported whether the target was approaching or
receding. We recorded unimodal and bimodal performance. In bimodal trials the 2 cues were either congruent (simulating the same change in depth),
or opposite creating a strong conflict between cues. From block to block
observers were instructed to report the direction of either the visual signal
or the auditory signal. We found that in visual blocks perceptual reports
were similar in the unimodal, congruent and conflict conditions, as were
response times. Thus responses appear to have been mediated by the visual
signal alone. In sharp contrast, in auditory blocks perceptual decisions were
more precise in the congruent condition than the unimodal condition. But
perceptual decisions were very poor in the conflict condition, and response
times were longer. Therefore observers could disregard the auditory signal
when paying attention to the visual signal, but were unable to suppress the
visual signal when paying attention to the auditory signal. In a separate
experiment observers were unaware of the cue conflict and fused the cues
optimally. A mixture model (Knill, 2003; Körding et al., 2007), in which the
probability of fusing the cues decreases with increasing conflict, captured the
pattern of results only if different fusion functions were used in the different
tasks, underlining the strong contribution of task to sensory fusion.
Acknowledgement: NSF/ANR - CRCNS 1430262
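As an illustration of how the two cues in 23.4106 can simulate the same (or an opposite) change in depth, the sketch below relates a simulated distance change to a change in sound level (inverse square law) and in visual angle. The distances and the function names are hypothetical; the abstract does not report the simulated distances, so this is only a sketch of the geometry, not the authors' stimulus code.

```python
import math

def sound_level_change_db(d_start, d_end):
    """Level change (dB) of a point source when its distance goes d_start -> d_end.

    Inverse square law: intensity is proportional to 1/d**2, so the level
    changes by 10*log10((d_start/d_end)**2) = 20*log10(d_start/d_end).
    """
    return 20.0 * math.log10(d_start / d_end)

def visual_angle_deg(object_size, distance):
    """Visual angle (deg) subtended by an object of a given physical size."""
    return math.degrees(2.0 * math.atan(object_size / (2.0 * distance)))

# Hypothetical example: an object approaching from 2 m to 1 m.
d_start, d_end = 2.0, 1.0
# Physical size chosen so that the object subtends 10 deg at the start distance.
size = 2.0 * d_start * math.tan(math.radians(10.0) / 2.0)

level_change = sound_level_change_db(d_start, d_end)  # positive: sound gets louder
angle_start = visual_angle_deg(size, d_start)
angle_end = visual_angle_deg(size, d_end)             # larger: shape grows

print(f"sound level change: {level_change:+.2f} dB")
print(f"visual angle: {angle_start:.2f} deg -> {angle_end:.2f} deg")
```

A conflict trial in this sketch would pair the visual change for an approach with the level change for a retreat, i.e. `sound_level_change_db(d_end, d_start)`, which has the opposite sign.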
23.4107 The Expanding and Shrinking Double Flash: An Auditory
Triggered Dynamic Replay of a Visual Stimulus Noelle Stiles1,2([email protected]), Armand Tanguay, Jr.1,3, Shinsuke Shimojo1; 1Division
of Biology and Biological Engineering, California Institute of Technology, 2Department of Ophthalmology, University of Southern California,
3Departments of Electrical Engineering, Biomedical Engineering, and
Ophthalmology, University of Southern California
Background: In the double flash illusion, a visual flash can be doubled by
the presentation of two beeps, one simultaneously with the flash and one
following (Shams, et al., 2000). The current study found that a visual flash
of a static spatial gradient (black at the center to white at the edges on a
white background) can generate the perception of visual expansion then
contraction. Furthermore, when multiple beeps are played during and after
the visual gradient flash, the visual expansion then contraction is perceived
twice. Methods: A single flash of either a black circle with sharp edges (SE)
or a circular gradient (G) is presented for 20 ms. One, two, or three beeps
of 6 ms each are paired with this flash, randomly across trials. Participants
(N = 7) were asked to report the number of flashes perceived and the type
of perception (e.g., for 1 flash: a circle expanding then shrinking, a circle
shrinking then expanding, or a flash of constant size). Results: Participants
reported significantly more flashes for the two and three beep conditions
compared to the single beep condition for both the SE and G flashes (p
< 0.005). Participants indicated significantly more dynamics (expansion or
shrinking) for the G flash as compared to the SE flash (p < 0.005). Participants also verbally indicated that the illusory flash for the G stimulus (i.e.,
the second flash when two flashes were reported) had the same dynamic
visual expansion as the real first flash. Discussion: The double flash illusion
occurs even with a gradient stimulus. We hypothesize that the gradient
flash generates a perception of expansion due to higher contrast regions
being processed faster (Seiffert and Cavanagh, 1999). As the center region
has the highest instantaneous contrast, it is processed faster than the rest
of the gradient.
23.4108 The Spatial Double Flash Illusion: Audition-Induced
Spatial Displacement Armand Tanguay, Jr.1,2([email protected]),
Bolton Bailey2, Noelle Stiles2,3, Carmel Levitan2,4, Shinsuke Shimojo2;
1Departments of Electrical Engineering, Biomedical Engineering, and
Ophthalmology, University of Southern California, 2Division of Biology
and Biological Engineering, California Institute of Technology, 3Department of Ophthalmology, University of Southern California, 4Department
of Cognitive Science, Occidental College
Background: The spatial double flash illusion is generated by the brief presentation of a central visual stimulus (a small rectangular target; a “flash”)
in conjunction with a short auditory stimulus (a “beep”) that is physically
displaced to the left (or right) of the central (peripheral) flash, followed by a
second identical auditory stimulus that is physically displaced to the right
(or left) of the single flash. The second beep generates an illusory flash that
is displaced in the direction of the auditory beep sequence. This illusion is
a variant of the original double flash illusion with no audio displacement
(Shams, et al., 2000). Methods: A 17 ms flash of a white rectangle against a
grey background is presented centrally, displaced by 11.5° vertically below
a fixation cross, in conjunction with a 7 ms 800 Hz audio tone (beep). A
second beep is generated 57 ms following the first beep. The two speakers
used to present the beeps are displaced to the left and right of a centrally
located monitor. Participants (N = 10) were asked to report the number of
flashes perceived, whether or not the two flashes were collocated or displaced, and if displaced, in which direction. Results: Participants reported
significantly more illusory flashes displaced in the direction of the auditory
beep sequence than in the opposite direction (Left to right, p = 0.011; Right
to left, p = 0.036). Discussion: The illusory flash following the presented
flash was perceived to be displaced laterally in space in the same direction
as the sequence of audio stimuli predominantly more often than it was perceived
to be displaced in the opposite direction. As such, both the generation of the illusory flash and its location are modified by auditory input, an
unusual example of crossmodal interaction in which audition dominates
over vision.
23.4109 Protective effects of combined audiovisual stimulation on
temporal expectations in noisy environments Felix Ball ([email protected]
gmx.de), Lara Michels1, Fabienne Fuehrmann1, Johanna Starke1,2, Toemme
Noesselt1,3; 1Department of Biological Psychology, Otto-von-Guericke
University, Magdeburg, Germany, 2Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany, 3Center of Behavioural Brain
Sciences, Otto-von-Guericke University, Magdeburg, Germany
In real life, we are exposed to a rich environment, a complex and continuous stream of multisensory information. This information needs to
be integrated to generate a reliable mental model of our world. There is
converging evidence that there are at least two optimization mechanisms
to integrate incoming information: multisensory interactions (MSI) and
temporal expectations (TE). However, how these mechanisms interact is
currently unknown. In a series of 4 psychophysical experiments we tested
whether MSI-induced behavioral benefits interact with TE-induced benefits, and whether these effects are affected by distinct experimental contexts. In particular, auditory (A) and/or visual (V) stimulus sequences were
presented either alone or simultaneously in all experiments. Participants
discriminated visual and/or auditory frequencies of deviant target stimuli
(high/low) within each sequence. Moreover, temporal expectation about
time-of-target-occurrence was manipulated block-wise: targets preferentially occurred either early (‘early block’) or late (‘late block’) within the
stimulus sequence within each block. Task difficulty was further altered
by using speakers (‘same location’, Exp. 1 & 3) or headphones (‘different
location’, Exp. 2 & 4), and by changing the predictability of target modality
(predictable: Exp.1 & 2, unpredictable: Exp. 3 & 4). Multisensory interplay
was always quantified by comparing subject-specific performance during
multisensory stimulation with performance in the best unisensory condition (max-criterion). We observed distinct effects for MSI: multisensory
enhancement was dependent on task difficulty, increased with increasing
noise and was dominant when participants reported having problems with
the task. Remarkably, TE effects were also enhanced for multisensory relative to unisensory stimulation and TE effects for unisensory stimuli even
vanished under high spatial uncertainty. Together, the pattern of results
indicates that multisensory stimulation has a protective and enhancing
effect on the generation and usage of temporal expectations, highlighting
the need for multisensory paradigms in future studies investigating temporal expectations.
Acknowledgement: SFB-TR31-TPA08
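The max-criterion described in 23.4109 compares each subject's multisensory performance with that subject's best unisensory condition. A minimal sketch follows; the subject labels and accuracy values are invented for illustration and are not data from the study.

```python
def multisensory_gain(acc_a, acc_v, acc_av):
    """Max-criterion: audiovisual performance minus the better unisensory one."""
    return acc_av - max(acc_a, acc_v)

# Hypothetical proportion-correct scores per subject: (auditory, visual, audiovisual)
subjects = {
    "s01": (0.70, 0.80, 0.88),
    "s02": (0.85, 0.60, 0.84),
    "s03": (0.65, 0.66, 0.75),
}

# Subject-specific gains; a positive value means multisensory enhancement.
gains = {sid: multisensory_gain(*acc) for sid, acc in subjects.items()}
mean_gain = sum(gains.values()) / len(gains)

for sid, gain in gains.items():
    print(f"{sid}: gain = {gain:+.2f}")
print(f"mean gain = {mean_gain:+.3f}")
```

Because the baseline is the better of the two unisensory scores for each subject, a subject who is already near ceiling in one modality (like the hypothetical `s02`) can show no gain, or a small loss, even when the audiovisual score is high.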
23.4110 Processing of congruent and incongruent facial expressions during listening to music: an eye-tracking study Kari
Kallinen1([email protected]); 1Finnish Defense Research Agency
Introduction Studies have shown that (a) multimodal emotional experience might be increased in the combined music-picture condition and (b)
that music influences ratings of visual stimuli. However, there is a scarcity of
studies that examine the potential moderating effects of music on looking
at images. In the present paper we report the results of an eye-tracking
study on congruent and incongruent emotional music (joy, sad, and anger)
and facial expressions (happy and sad). We expected that facial expressions
congruent with the music would attract more attention than incongruent faces.
In addition, we expected that angry music (which had no corresponding face images) would elicit the highest eye-movement activity between the
facial expressions (as the subject searches for a corresponding facial expression). Methods Five men and five women aged 33-64 years (M = 46.9)
took part in the experiment. Their task was to listen to three pieces of
music (a priori sad, joyful and angry) and at the same time look at facial
expressions (sad and happy) presented on the screen. Eye movements were
tracked with Ergoneer Dikablis eye-tracker during listening to music and
watching the facial expressions. Results As expected, in connection with joyful and sad music the congruent faces (i.e., happy faces for joyful music
and sad faces for sad music) elicited more attention in terms of AOI attention ratio and total glance time as compared to incongruent faces (for AOI
attention ratio Ms = 53.2% and 36.7%, p = .002; for total glance time Ms = 12.9
and 8.88 seconds, p = .002). In connection with music that expressed anger
the preliminary analysis showed no effects. Conclusion The results give
new information about the interactive effects of emotional music and facial
expressions. The knowledge about the effects of music on image processing and interaction between music and images are important and useful,
among other things, in the context of (multi)media design and presentation.
23.4111 Cross-modal Matching as a Means of Stimulus Norming
for the Visual World Paradigm Kelly Dickerson1([email protected]), Brandon Perelman1, Peter Gerhardstein2; 1Human Research and
Engineering Directorate, US Army Research Laboratory, 2Department of
Psychology, Binghamton University
The visual world paradigm (VWP) has been used to evaluate how a variety
of cognitive and contextual factors influence the deployment of visual spatial attention. Previous VWP studies have employed linguistic cues, where
a spoken or written target word immediately precedes a search array. This
method of cuing is effective and easily controlled but lacks ecological validity. A more ecologically valid way to cue a visual event would be to use
environmental sounds as cues. While this answer to ecological validity
seems straightforward, matching audio and visual cues when those cues
are meant to represent real objects is methodologically challenging. The
present study attempts to meet this challenge by using subjective ratings
of stimulus attributes as our matching variables. This study contains 150
stimuli (50 images, 100 sounds) that were rated by 10 participants for pleasantness and familiarity. For each stimulus participants were also asked for
an open-ended identification response. Following the subjective ratings
phase of the experiment participants completed a 4AFC task where each
of the 50 images was presented with two matched (target) sounds and two
mismatched (lure) sounds. There were no significant differences between
sounds and images for pleasantness or familiarity. There was a significant
difference in identification accuracy between sounds and images, with
images being slightly more accurately identified than sounds. In the 4AFC
task participants were highly accurate at selecting one of the two matched
sounds, and false alarms (responses to lures) were generally low. When a
matched auditory stimulus was selected the participant was significantly
more likely to select the more pleasant or familiar of the two sounds. These
results will inform stimulus selection and matching in a future VWP study.
Using subjective factors for cross-modal matching is one possible approach
to overcoming difficulties in norming real-world audio-visual events.
23.4112 Cross-modal ‘Goodness of Fit’ Judgments of Auditory and
Visual Meter in Musical Rhythms Stephen Palmer1([email protected]
berkeley.edu), Joshua Peterson1, Nori Jacoby2; 1Psychology, U. C. Berkeley,
2Neuroscience, Columbia University
The metrical hierarchy of musical rhythm is defined by the structure of
emphases on beats in measures. We investigated 3/4 and 4/4 time signatures in auditory and visual meter using cross-modal goodness-of-fit ratings for visual and auditory probes, respectively. For auditory context conditions, four measures in 3/4 or 4/4 time were defined by a louder beat followed by 2 or 3 softer, equally-timed beats, respectively. A visual probe circle
occurred in the next four measures at one of 12 phase-angles relative to the
auditory downbeat: 0°, 45°, 60°, 90°, 120°, 135°, 180°, 225°, 240°, 270°, 300°,
and 315°. Context and probe modalities were reversed for the visual context conditions. Participants rated how well probe stimuli “fit” the rhythmic context in the other modality. Visual contexts showed an expected
beat-defined hierarchy, with highest ratings on the downbeat, next-highest
for the other beats, and lowest for non-beats. Auditory contexts showed
a single broad peak for the downbeat, with little evidence of elevated fit
ratings for other beats over non-beats. Similar results were obtained when
participants made explicit ratings of cross-modal synchrony using the same
stimuli, suggesting a role for purely psychophysical asymmetries in visual
vs. auditory processing. Several factors relevant to explaining the asymmetry between these cross-modal conditions are discussed, including faster
and more accurate timing information in auditory than visual perception,
and increased precision in timing information with additional repetitions
of events occurring at regular intervals. Additional data support the relevance of these factors.
Acknowledgement: NSF Grants BCS-1059088
Acknowledgement: One of us (BB) gratefully acknowledges support from a
Caltech Summer Undergraduate Research Fellowship (SURF).
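For the cross-modal meter study above (23.4112), it can help to see which of the 12 probe phase angles coincide with beats in each time signature. The sketch below assumes that 360° corresponds to one full measure, which is an interpretation of the abstract rather than something it states; the function name and labels are likewise illustrative.

```python
PROBE_ANGLES = [0, 45, 60, 90, 120, 135, 180, 225, 240, 270, 300, 315]

def classify_probe(angle_deg, beats_per_measure):
    """Label a probe phase angle as downbeat, beat, or non-beat.

    Assumes one measure spans 360 degrees, so beats fall every
    360 / beats_per_measure degrees starting at the downbeat (0 degrees).
    """
    beat_step = 360 / beats_per_measure
    if angle_deg % 360 == 0:
        return "downbeat"
    if angle_deg % beat_step == 0:
        return "beat"
    return "non-beat"

for meter in (3, 4):
    labels = {angle: classify_probe(angle, meter) for angle in PROBE_ANGLES}
    beats = [a for a, lab in labels.items() if lab == "beat"]
    non_beats = [a for a, lab in labels.items() if lab == "non-beat"]
    print(f"{meter}/4: beats at {beats}, non-beats at {non_beats}")
```

Under this assumption, 120° and 240° are beats only in 3/4, while 90°, 180°, and 270° are beats only in 4/4, so the same probe set can separate the downbeat, the other beats, and non-beats in both meters.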
Saturday Afternoon Talks
Attention: Features
Saturday, May 20, 2:30 - 4:15 pm
Talk Session, Talk Room 1
Moderator: Greg Zelinsky
Saturday PM
24.11, 2:30 pm Seeing physics in the blink of an eye Chaz Firestone1([email protected]), Brian Scholl1; 1Department of Psychology,
Yale University
People readily understand visible objects and events in terms of invisible physical forces, such as gravity, friction, inertia, and momentum. For
example, we can appreciate that certain objects will balance, slide, fall, bend
or break. This ability has historically been associated with sophisticated
higher-level reasoning, but here we explore the intriguing possibility that
such physical properties (e.g. whether a tower of blocks will topple) are
extracted during rapid, automatic, visual processing. We did so by exploring both the timecourse of such processing and its consequences for visual
awareness. Subjects saw hundreds of block-towers for variable masked
durations and rated each tower’s stability; later, they rated the same towers
again, without time pressure. We correlated these limited-time and unlimited-time impressions of stability to determine when such correlations peak
— asking, in other words, how long it takes to form a “complete” physical
intuition. Remarkably, stability impressions after even very short exposures (e.g. 100ms) correlated just as highly with unlimited-time judgments
as did impressions formed after exposures an order-of-magnitude longer
(e.g. 1000ms). Moreover, these immediate physical impressions were accurate, agreeing with physical simulations — and doing so equally well at
100ms as with unlimited time. Next, we exploited inattentional blindness to
ask whether stability is processed not only quickly, but also spontaneously
and in ways that promote visual awareness. While subjects attended to a
central stimulus, an unexpected image flashed in the periphery. Subjects
more frequently noticed this image if it was an unstable tower (vs. a stable
tower), even though these two towers were just the same image presented
upright or inverted. Thus, physical scene understanding is fast, automatic,
and attention-grabbing: such impressions are fully extracted in (an exposure faster than) the blink of an eye, and a scene’s stability is automatically
prioritized in determining the contents of visual awareness.
Acknowledgement: ONR MURI #N00014-16-1-2007
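The timecourse analysis in 24.11 correlates stability ratings made after a brief masked exposure with ratings of the same towers made without time pressure, separately for each exposure duration. A minimal sketch of that computation is given below; all ratings, the rating scale, and the set of durations are made up for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical stability ratings (1 = stable, 7 = unstable) for six towers.
unlimited_time = [1, 2, 4, 5, 6, 7]
limited_time = {  # exposure duration in ms -> ratings of the same six towers
    50: [3, 2, 4, 3, 5, 4],
    100: [1, 3, 4, 5, 5, 7],
    1000: [2, 2, 4, 5, 6, 6],
}

# One correlation per exposure duration; the question is where this curve plateaus.
r_by_duration = {
    ms: pearson_r(ratings, unlimited_time) for ms, ratings in limited_time.items()
}
for ms, r in sorted(r_by_duration.items()):
    print(f"{ms} ms: r = {r:.2f}")
```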
24.12, 2:45 pm Strategic Templates for Rejection Nancy Carlisle1([email protected]);
1Department of Psychology, Lehigh
Can attention actively suppress a feature? Arita, Carlisle, & Woodman
(JEP:HPP, 2012) reported that cuing a distractor color (negative
cue) sped search compared with neutral-cue trials, although the benefit was smaller
than typical positive cues. Search arrays contained two colors of objects,
where all objects of one color appeared in one hemifield. The negative cue
benefit was reduced when search was made easier by reducing the set size.
This suggests that attentional templates can be used for active suppression,
but only if it is strategically beneficial to use the cue. A failure to replicate
calls this conclusion into question. Beck & Hollingworth found no negative
cue benefit with mixed colors and suggested the previous results could be
explained by a spatial template generated after the search array was presented (JEP:HPP, 2015). Crucially, their manipulation may have reduced
the perceived benefit of the negative cue. I created a design with ⅓ of trials
in a block with mixed color arrangement (as in Beck & Hollingworth, 2015),
and ⅔ of trials with separated arrangement (as in Arita and colleagues,
2012). Importantly, participants did not know which array arrangement
would appear, and therefore needed to adopt a strategy based on the utility
of the cue for the entire block. I found a main effect of cue type on reaction
time (p < .0001), with both positive and negative cues leading to significant
benefits (p’s < .0001). Importantly, I found no interaction between cue type
and array arrangement (p = .97) indicating no difference in the cuing effects
based on display arrangement, in contrast to the spatial template hypothesis. My data suggest we can actively suppress a particular feature, but that
we may only utilize this control when it is strategically advantageous. This
evidence suggests active suppression should be incorporated into theories
of attentional control.
24.13, 3:00 pm How do we ignore salient distractors? Clayton Hickey1([email protected]), Matthew Weaver1, Hanna Kadel2, Wieske
van Zoest1; 1Center for Mind / Brain Sciences, University of Trento, Italy,
2Philipps-Universität Marburg, Germany
Our visual environment is too rich for us to deal with at once, so we sample
from it by making eye movements. Optimally, we should suppress stimuli
that are strategically unimportant so as to ensure that useful objects are fixated first. But there is little actual evidence that distractor suppression plays
a role in oculomotor control. Here, we use concurrent recording of EEG
and eye-tracking first to determine if distractor suppression fosters efficient eye-movement behaviour. Participants searched for targets presented
alongside salient distractors, and we subsequently sorted trials as a function of which stimulus was first fixated. Results show that target-directed
saccades are associated not only with enhanced attentional processing of
the target, as reflected in the N2pc, but also stronger suppression of the distractor, as indexed in the distractor positivity (Pd). In a subsequent experiment, we build from this finding to investigate the impact of proactive
distractor cues. These tell people about the characteristics of non-targets
they should ignore before search begins. We find that people are better able
to ignore cued distractors, as reflected in saccadic accuracy and reaction
time. But, surprisingly, this is associated with a reduction in the distractor-elicited Pd. This suggests that distractor cues do not act by potentiating
online attentional suppression, but rather by reducing cortical sensitivity to
distractor features before the stimuli appear. We investigate this hypothesis in further time-frequency analyses of EEG data preceding target- and
distractor-directed saccades, identifying correlates of oculomotor control in
the phase synchronization of parietal alpha and occipital beta. Our results
demonstrate the key role of distractor suppression in oculomotor control,
pointing at two ways such suppression can be instantiated in the brain.
24.14, 3:15 pm More than a filter: Feature-based attention regulates the distribution of visual working memory resources Blaire
Dube1([email protected]), Stephen Emrich2, Naseem Al-Aidroos1;
1Department of Psychology, University of Guelph, 2Department of Psychology, Brock University
How does feature-based attention regulate visual working memory (VWM)
performance? The prominent filter account proposes that attention acts like
a “bouncer” for VWM—the brain’s “nightclub”—filtering out distracting
information to ensure that access to VWM resources is reserved for relevant information. This account, however, originated from discrete-capacity
models of VWM architecture, the assumptions of which have since been
challenged. Across three experiments, we revisited the filter account by
testing if feature-based attention plays a broader role in regulating VWM
performance. Each experiment used partial report tasks in which participants memorized the colors of circle and square stimuli, and we provided
a feature-based goal by manipulating the likelihood that one shape would
be probed over the other across a range of probabilities. By decomposing
participants’ responses using mixture and variable-precision models, we
estimated the contributions of guesses, non-target responses, and imprecise
memory representations to their errors. Consistent with the filter account,
participants were less likely to guess when the probed memory item
matched the feature-based goal. Interestingly, this effect varied with the
strength of the goal, even across high-probabilities where goal-matching
information should always be prioritized, demonstrating strategic control
over filter strength. Beyond this effect of attention on which stimuli were
encoded, we also observed effects on how they were encoded: Estimates of
both memory precision and non-target errors varied continuously with feature-based attention. The results demonstrate a new role for feature-based
attention in dynamically regulating the distribution of resources within
working memory so that the most relevant items are encoded with the
greatest precision.
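The decomposition of response errors described above can be illustrated with a minimal sketch of a three-component mixture density. It assumes a von Mises memory distribution, a uniform guessing component, and non-target responses centred on the non-probed items. The function names and parameter values are hypothetical and are not taken from the abstract; the authors' actual mixture and variable-precision models may differ.

```python
import numpy as np

def von_mises_pdf(x, kappa):
    """Von Mises density on the circle, centred on 0, with concentration kappa."""
    return np.exp(kappa * np.cos(x)) / (2.0 * np.pi * np.i0(kappa))

def mixture_density(response, target, nontargets, kappa, p_guess, p_nontarget):
    """Density of a response (radians) under a three-component mixture.

    Components (an assumed, commonly used form):
      - target responses: von Mises centred on the probed item's value
      - non-target responses: von Mises centred on each non-probed item
      - guesses: uniform on the circle
    """
    response = np.asarray(response, dtype=float)
    nontargets = np.asarray(nontargets, dtype=float)
    p_target = 1.0 - p_guess - p_nontarget
    target_part = p_target * von_mises_pdf(response - target, kappa)
    # Average the non-target component over all non-probed items.
    swap_part = p_nontarget * von_mises_pdf(
        response[..., None] - nontargets, kappa
    ).mean(axis=-1)
    guess_part = p_guess / (2.0 * np.pi)
    return target_part + swap_part + guess_part

# Hypothetical example: one probed item, two non-probed items.
grid = np.linspace(-np.pi, np.pi, 20000, endpoint=False)
density = mixture_density(grid, 0.3, [1.5, -2.0], kappa=8.0,
                          p_guess=0.2, p_nontarget=0.1)
```

Fitting such a density to observed responses (for example by maximum likelihood) yields estimates of the guess rate, the non-target rate, and memory precision, which are the quantities the abstract compares across goal strengths.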
24.15, 3:30 pm Attentional cues potentiate recovery of fine direction discrimination in cortically-blind patients Matthew Cavanaugh1,2,3([email protected]), Antoine Barbot2,3,
Marisa Carrasco4, Krystel Huxlin2,3; 1Neuroscience Graduate Program,
University of Rochester, 2Flaum Eye Institute, University of Rochester,
3Center for Visual Science, University of Rochester, 4Department of Psychology and Center for Neural Science, NYU
24.16, 3:45 pm Prediction facilitates complex shape processing in
visual cortex Peter Kok1([email protected]), Nicholas Turk-Browne1,2;
1Princeton Neuroscience Institute, Princeton University, Princeton, NJ,
USA, 2Department of Psychology, Princeton University, Princeton, NJ,
Perception is an inferential process, in which sensory inputs and prior
knowledge are combined to arrive at a best guess of what is in the world.
In line with this, previous studies have shown that expectations strongly
modulate neural signals in sensory cortices. However, most of these studies
have focused on expectations about simple features, such as the orientation
or spatial location of a grating. This stands in contrast to daily life, where
expectations often pertain to more complex objects, such as the expectation of seeing a dog upon hearing a bark. In the current study, we used
auditory cues to manipulate the predictability of complex shapes that were
defined along a continuum of Fourier descriptors. With high-resolution
fMRI, we found that the univariate neural response to invalidly predicted
shapes was delayed with respect to validly predicted shapes throughout
visual cortex (e.g., in V1, V2, and lateral occipital cortex). Not only was the
overall response delayed, but so too was the information present in neural
activity patterns. Specifically, we trained an inverted encoding model on
shapes in the absence of predictions, and used this model to reconstruct
what these visual areas represented when a shape was validly and invalidly predicted. The same shape was presented in both cases, but there was
a marked delay in information about this shape when it was invalidly predicted. These results suggest that invalid expectations interfere with shape
processing throughout the visual cortical hierarchy. Moreover, the fact that
predictions about complex shape change the timing of neural responses
stands in contrast to the effect of predictions about simple features, which
modulate the amplitude of response. This discrepancy suggests that different
neural mechanisms may underlie expectations of varying complexity,
which could be related to different sources of object vs. feature expectations
in the brain.
Acknowledgement: This work was supported by The Netherlands Organisation
for Scientific Research (NWO, Rubicon grant 446-15-004) to P.K. and NIH grant
R01 EY021755 to N.B.T-B.
24.17, 4:00 pm Computing Saliency over Proto-Objects Predicts
Fixations During Scene Viewing Yupei Chen1([email protected]
edu), Gregory Zelinsky1,2; 1Department of Psychology, Stony Brook University, 2Department of Computer Science, Stony Brook University
Most models of fixation prediction operate at the feature level, best exemplified by the Itti-Koch (I-K) saliency model. Others suggest that objects
are more important (Einhäuser et al., 2008), but defining objects requires
human annotation. We propose a computationally-explicit middle ground
by predicting fixations using a combination of saliency and mid-level representations of shape known as proto-objects (POs). For 384 real-world scenes
we computed an I-K saliency map and a proto-object segmentation, the latter using the model from Yu et al. (2014). We then averaged the saliency
values internal to each PO to obtain a salience for each PO segment. The
maximally-salient PO determined the next fixation, with the specific x,y
position being the saliency-weighted centroid of the PO’s shape. To generate sequences of saccades we inhibited fixated locations in the saliency
map, as in the I-K model. We found that this PO-saliency model outperformed (p < .001) the I-K saliency model in predicting fixation-density
maps obtained from 12 participants freely viewing the same 384 scenes (3
seconds each). Comparison to the GBVS saliency model showed a similarly
significant benefit. Over five levels we also manipulated the coarseness of
the PO segmentations for each scene on a fixation-by-fixation basis, meaning that the first predicted fixation was based on the coarsest segmentation
and the fifth predicted fixation was based on the finest. Doing this revealed
considerable improvements relative to the other tested saliency models,
largely due to the capture of a relationship between center bias and ordinal
fixation position. Rather than being an ad hoc addition to a saliency model,
a center bias falls out of our model via its coarse-to-fine segmentation of a
scene over time (fixations). We conclude that fixations are best modeled at
the level of proto-objects, which combines the benefit of objects with the
computability of features.
Acknowledgement: This work was supported by NSF grant IIS-1161876 to G.J.Z.
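The fixation-selection steps described in the abstract above (average the saliency inside each proto-object, pick the most salient one, fixate its saliency-weighted centroid, then inhibit the fixated location) can be sketched as follows. This is a minimal illustration under stated assumptions: the saliency map and the proto-object label map are taken as given arrays, inhibition is a simple zeroed disk, and all function names and parameters are hypothetical rather than the authors' implementation.

```python
import numpy as np

def next_fixation(saliency, labels):
    """Return (x, y): saliency-weighted centroid of the most salient segment."""
    best_label, best_mean = None, -np.inf
    for lab in np.unique(labels):
        mean_sal = saliency[labels == lab].mean()  # average saliency inside the segment
        if mean_sal > best_mean:
            best_label, best_mean = lab, mean_sal
    ys, xs = np.nonzero(labels == best_label)
    w = saliency[ys, xs]
    if w.sum() <= 0:
        # No saliency left in this segment: fall back to the plain centroid.
        return float(xs.mean()), float(ys.mean())
    return float((xs * w).sum() / w.sum()), float((ys * w).sum() / w.sum())

def predict_scanpath(saliency, labels, n_fixations=5, inhibit_radius=2):
    """Predict a fixation sequence by repeatedly selecting and inhibiting."""
    sal = saliency.astype(float).copy()
    yy, xx = np.mgrid[0:sal.shape[0], 0:sal.shape[1]]
    path = []
    for _ in range(n_fixations):
        x, y = next_fixation(sal, labels)
        path.append((x, y))
        # Assumed form of inhibition: zero a disk around the fixated point.
        sal[(xx - x) ** 2 + (yy - y) ** 2 <= inhibit_radius ** 2] = 0.0
    return path
```

The coarse-to-fine variant in the abstract would pass a different (progressively finer) label map on each iteration; the sketch keeps a single segmentation for brevity.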
Motion: Flow, biological, and higher-order
Saturday, May 20, 2:30 - 4:15 pm
Talk Session, Talk Room 2
Moderator: Michael Morgan
24.21, 2:30 pm Viewpoint oscillation frequency influences the
perception of distance travelled from optic flow Martin Bossard1([email protected]), Cédric Goulon1, Daniel Mestre1; 1Aix-Marseille Université, CNRS, ISM UMR 7287, Marseille, France
In everyday life, humans and most animals need to navigate in their environment, which produces multiple sources of perceptual information, such
as locomotor cues (i.e. proprioceptive, efference copy and vestibular cues)
and optic flow. However, few studies have focused on the role of the visual consequences of walking (bob, sway, and lunge head motion) in self-motion
perception. In a previous study, in which static observers were presented
with a visual simulation of forward motion, we showed that adding
rhythmical components to an optic flow pattern improved the accuracy
of subjects’ travelled distance estimations, in comparison with a purely
translational flow. One explanation is that oscillations increase global retinal motion and thus improve vection.
Another hypothesis is that, because walking step frequency is a significant
cue in speed perception, the visual consequences of step frequency might
underlie the better estimations. To test this, we used the same experimental
procedure in which observers, immersed inside a 4-sided CAVE, had to
indicate when they thought they had reached a previously seen target. We
tested whether different oscillation frequencies would affect the perception
of distance travelled. Observers were confronted with 4 conditions of optic
flows simulating forward self-motion. The first condition was generated
by purely translational optic flow, at constant speed. The three other conditions of flows were vertical triangular oscillations with three kinds of
Background. Visual perceptual training in cortically-blind (CB) fields
improves performance on trained tasks, recovering vision at previously
blind locations. However, contrast sensitivity and fine discrimination
remain abnormal, limiting the usefulness of recovered vision in daily life.
Here, we asked whether it is possible to overcome residual impairment in
fine direction discrimination (FDD) performance by training CB subjects
with endogenous, feature-based attention (FBA) cues. Methods. Nine
CB subjects were recruited and underwent coarse direction discrimination
(CDD) training, followed by FDD training with an FBA cue. Following
completion of each training protocol, we tested FDD thresholds at blindfield locations and corresponding intact-field locations, with both neutral
and valid FBA cues. T-tests were used to assess significance of differences
in FDD thresholds attained after different types of training. Results. Subjects who trained using CDD tasks were able to attain FDD thresholds of
26±5.5° (average±SEM). FDD training without cues yielded FDD thresholds of 18±3.8°, not significantly different from those attained following CDD training (26±5.5°; p>0.1). Following FDD training with FBA cues,
FDD thresholds measured with valid FBA cues (5.4±1.3°) were significantly
lower than thresholds attained following FDD training without FBA cues
(p=0.02) or CDD training (p=0.01). Moreover, FDD thresholds at blind-field
locations for subjects trained and measured with FBA cues were statistically indistinguishable from thresholds at intact-field locations measured
with FBA cues (p=0.054) or with neutral FBA cues (p=0.4). Even when measured using neutral FBA cues, FDD thresholds at the blind-field locations
trained with FBA cues (9.8±0.2°) were significantly lower than following
CDD (p=0.02). Lastly, intact-field thresholds were lower when tested with
(3.1±0.3°) than without (4.3±0.7°) FBA cues (p=0.03). Conclusion. Mechanisms governing FBA appear to be intact and functional in CB subjects.
Importantly, training with FBA can be leveraged to recover normal, fine
visual discrimination performance at trained, blind-field locations.
frequencies added to linear forward motion, at the same forward speed.
Results show that two groups can be distinguished. Regarding the first
group, as in the previous study, adding rhythmic components improves
the perception of distance travelled. For the second group, the higher the
frequency, the earlier the answers, suggesting that these subjects related
the oscillation frequency to their step frequency and perceived themselves
as moving faster.
24.22, 2:45 pm Optic flow and self-motion information during
real-world locomotion Jonathan Matthis1([email protected]), Karl
Muller1, Kathryn Bonnen1, Mary Hayhoe1; 1Center for Perceptual Systems,
University of Texas at Austin
A large body of research has examined the way that patterns of motion
on the retina contribute to perception of movement through the world,
but the actual visual self-motion stimulus experienced during real-world
locomotion has never been measured. We used computer vision techniques
to estimate optic flow from video recorded by a head-mounted camera while
subjects walked over various types of real-world terrain. Eye movements
and full body kinematics were also recorded. We found that the optic flow
experienced during locomotion reveals a pulsing pattern of visual motion
that is coupled to the phasic acceleration patterns of the gait cycle. This
pulsing optic flow pattern is not present in the constant-velocity flow fields
that are generally used to simulate self-motion. This difference between
real-world and simulated visual self-motion has important consequences
for the behavior of the focus of expansion (FOE) during locomotion, which
has been extensively studied as a key locus of information about heading
direction but has not been recorded during natural behavior. Results show
that the acceleration patterns of the head cause the FOE to follow a complex
path in the visual field, in contrast to simulated constant-velocity self-motion stimuli wherein the FOE lies in a stable location in the observer’s environment. To examine how task-relevant locomotor variables could be derived
from real-world stimuli, we processed the head-mounted videos using biologically plausible models of motion sensitive areas in visual cortex. The
resulting patterns of simulated neural activity are complex, but display a
clear coupling to the bipedal gait cycle. By comparing the resulting patterns
of simulated neural activity across different time scales to subjects’ kinematics, we found features that correlate with locomotion-relevant variables
such as heading direction. We also found features of the visual motion
stimulus that may play an important role in postural stability during locomotion over rough terrain.
Acknowledgement: NIH 1T32 - EYE021462, NIH R01 - EY05729
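To make the notion of tracking the FOE concrete, the following is a minimal sketch of one standard way to locate a focus of expansion in a single flow field. It assumes purely translational flow within the frame, so that every flow vector points directly away from the FOE, and solves for the FOE by linear least squares. The abstract does not state which method was used; the function name and inputs are hypothetical.

```python
import numpy as np

def estimate_foe(x, y, u, v):
    """Least-squares focus of expansion for flow (u, v) sampled at points (x, y).

    For purely translational flow, (x - x0, y - y0) is parallel to (u, v), so
    v * (x - x0) - u * (y - y0) = 0 at every sample. Rearranged, this is the
    linear system  v * x0 - u * y0 = v * x - u * y  in the unknowns (x0, y0).
    """
    A = np.column_stack([v, -u])
    b = v * x - u * y
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe  # array [x0, y0]

# Synthetic expanding flow field centred on (3, -2), for illustration only.
gx, gy = np.meshgrid(np.arange(-5.0, 6.0), np.arange(-5.0, 6.0))
x, y = gx.ravel(), gy.ravel()
u, v = 0.1 * (x - 3.0), 0.1 * (y + 2.0)
foe = estimate_foe(x, y, u, v)
```

Applying such an estimate frame by frame would trace the FOE path over time; with head rotation present, the pure-translation assumption no longer holds and the estimate would need a rotation-compensation step.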
24.23, 3:00 pm Visual-vestibular detection of curvilinear paths
during self-motion John Perrone1([email protected]); 1The School of
Psychology, The University of Waikato, New Zealand
Humans are able to navigate through cluttered environments while avoiding obstacles in their way. How this occurs is still unknown despite many
years of research. It is well established that the visual image motion on the
back of the eyes (vector flow field) can be used to extract information about
our trajectory (e.g., heading) as well as the relative depth of points in the
world but that rotation of the eye/head or body confounds this extraction
process. We have previously shown how local efference signals regarding
eye or head movements can be used to compensate for the perturbations
in image motion caused by the rotation (Perrone & Krauzlis, JOV, 2008).
However, movement of the body along curved paths also introduces
visual rotation, yet the mechanisms for detecting and compensating for
this remain a mystery. The curvilinear signals from the primate vestibular
system that have so far been measured indicate insufficient precision for a
direct compensatory role. A curvilinear path generates a flow vector (T+R,
θ) made up of a translation and rotation component. We need to find (R, φ)
which provides the body’s curvilinear rotation rate and direction. I have
discovered that there exists a trigonometric relationship linking the flow
vector to the curvilinear rate, i.e., R = (T+R)sin(α - θ)/sin(α - φ). Here, α is
a function of the heading. The body’s curvilinear rotation can be found by
sampling many vectors and testing a sparse array of heading directions (α).
However, this purely visual solution occasionally produces errors in the (R,
φ) estimates. Model simulations show that broadly tuned vestibular signals
for the values of α, R and φ are sufficient to eliminate these errors. Model
tests against existing human psychophysical data revealed comparable curvilinear rotation estimation precision. Combined visual-vestibular signals
produce greater accuracy than each on its own.
24.24, 3:15 pm Residual Perception of Biological Motion in Cortical
Blindness Meike Ramon1([email protected]), Nicolas Ruffieux1,
Junpeng Lao1, Françoise Colombo2, Lisa Stacchi1, François-Xavier Borruat3,
Ettore Accolla2,3,4, Jean-Marie Annoni2,3,4, Roberto Caldara1; 1University of
Fribourg, Eye and Brain Mapping Laboratory, Department of Psychology,
Rue P.-A.-de-Faucigny 2, 1700 Fribourg, Switzerland, 2Fribourg Hospital,
Unit of Neuropsychology and Aphasiology, CP, 1708 Fribourg, Switzerland, 3Jules-Gonin Ophtalmological Hospital, Neuro-Ophthalmology Unit,
University of Lausanne, Avenue de France 15, 1004 Lausanne, Switzerland, 4University of Fribourg, Laboratory for Cognitive and Neurological Sciences, Department of Medicine, Ch. du Musée 5, 1700 Fribourg,
The ability to perceive biological motion (BM) relies on a distributed network of brain regions and can be preserved after damage to high-level
visual areas. However, whether it can withstand the loss of vision following bilateral striate damage remains unknown. Here we tested categorization of human and animal BM in BC, a rare case of cortical blindness after
anoxia-induced bilateral striate damage. The severity of his impairment,
encompassing various aspects of vision and causing blind-like behavior,
contrasts with a residual ability to process motion (for a video demonstration see perso.unifr.ch/roberto.caldara/VSS/BC_patient.mov). We presented BC with static or dynamic point-light displays (PLDs) of human or
animal walkers. These stimuli were presented individually, or in pairs in
two alternative forced choice (2AFC) tasks. Confronted with individual
PLDs, BC was unable to categorize the stimuli, irrespective of whether they
were static or dynamic. In the 2AFC task, BC exhibited appropriate gaze
towards diagnostic information, but performed at chance level with static
PLDs, in stark contrast to his ability to efficiently categorize dynamic biological agents. This striking ability to categorize BM when provided with top-down
information is important for at least two reasons. Firstly, it emphasizes the
importance of assessing patients’ (visual) abilities across a range of task
constraints, which can reveal potential residual abilities that may in turn
represent a key feature for patient rehabilitation. Secondly, our findings reinforce
the view that the BM processing network can operate despite severely
impaired low-level vision, emphasizing that processing dynamicity in biological agents is a robust feature of human vision.
24.25, 3:30 pm Who’s chasing whom?: Changing background
motion reverses impressions of chasing in perceived animacy Benjamin
van Buren1([email protected]), Brian Scholl1; 1Department of
Psychology, Yale University
Visual processing recovers not only seemingly low-level features such as
color and orientation, but also seemingly higher-level properties such as
animacy and intentionality. Even abstract geometric shapes are automatically seen as alive and goal-directed if they move in certain ways. What
cues trigger perceived animacy? Researchers have traditionally focused
on the local motions of objects, but what may really matter is how objects
move with respect to the surrounding scene. Here we demonstrate how
movements that signal animacy in one context may be perceived radically
differently in the context of another scene. Observers viewed animations
containing a stationary central disc and a peripheral disc, which moved
around it haphazardly. A background texture (a map of Tokyo) moved
behind the discs. For half of observers, the background moved generally
along the vector from the peripheral disc to the central disc (as if the discs
were moving together over the background, with the central disc always
behind the peripheral disc); for the other half of observers, the background
moved generally along the vector from the central disc to the peripheral
disc. Observers in the first condition overwhelmingly perceived the central
disc as chasing the peripheral disc, while observers in the second condition experienced the reverse. A second study explored objective detection:
observers discriminated displays in which a central ‘wolf’ disc chased a
peripheral ‘sheep’ disc from inanimate control displays in which the wolf
instead chased the sheep’s (invisible) mirror image. Although chasing was
always signaled by the wolf and sheep’s close proximity, detection was
accurate when the background moved along the vector from the sheep to
the wolf, but was poor when the background moved in an uncorrelated
manner (controlling for low-level motion). These dramatic context effects
indicate that spatiotemporal patterns signaling animacy are detected with
reference to a scene-centered coordinate system.
Acknowledgement: ONR MURI #N00014-16-1-2007
24.26, 3:45 pm Non-retinotopic feature integration is mandatory
and precise Leila Drissi Daoudi1([email protected]), Haluk
Öğmen2, Michael Herzog1; 1Laboratory of Psychophysics, Brain Mind
Institute, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland,
2Department of Electrical & Computer Engineering, University of Denver,
Denver, USA
Acknowledgement: This project was financed with a grant from the Swiss
SystemsX.ch initiative (2015/336), evaluated by the Swiss National Science
24.27, 4:00 pm Attraction and Repulsion Between Local and Global
Motion Michael Morgan1([email protected]), Joshua Solomon1;
1Applied Vision Research Centre, City, University of London
The interaction between local and global motion was studied with moving, circular clouds of dots, which could also move within the cloud. If the
cloud moved near-vertically downwards (~270 deg) but the dots within
it moved obliquely (240 or 300 deg) the apparent path of the cloud was
attracted to that of the dots, as previously demonstrated with moving
Gabor patches (Tse & Hsieh, 2006; Lisi & Cavanagh, 2015). This attractive
effect was enhanced in parafoveal viewing. A larger effect in the opposite
direction (repulsion) was found for the perceived direction of the dots
when they moved near-vertically (~270 deg) and the cloud containing them
moved obliquely (240 or 300 deg). These results are discussed in relation to
Gestalt principles of perceived relative motion, and more recent Bayes-inspired accounts of the interaction between local and global motion.
Acknowledgement: The Leverhulme Trust RPG_2016_124
Visual Search: Other
Saturday, May 20, 5:15 - 6:45 pm
Talk Session, Talk Room 1
Moderator: Arni Kristjansson
25.11, 5:15 pm If I showed you where you looked, you still wouldn’t
remember Avi Aizenman1([email protected]), Ellen
Kok2,3, Melissa Vo4, Jeremy Wolfe2; 1Vision Science, University of California, Berkeley, 2Brigham and Women’s Hospital/Harvard Medical School,
3School of Health Professions Education, Maastricht University, 4Scene
Grammar Lab, Goethe University
Observers are no better at reporting where they just fixated in an image
than they are at guessing where someone else has fixated. We investigated
whether providing participants with explicit, online information about
where they looked during a search task would help them recall their own
eye movements afterwards. Seventeen observers searched for various
objects in “Where’s Waldo” images for 3s. On 2/3rds of scenes, observers made target present/absent responses afterwards. On the other third,
however, they were asked to click twelve locations in the scene where they
thought they had just fixated. Half of the scenes were presented normally
(control). In the other half, we employed a gaze-contingent window that
gave the impression of a roving 7.5 deg “spotlight” that illuminated everything fixated, while the rest of the display was still visible but darker. To
measure the fidelity of the memory, we placed a virtual circle around each
Acknowledgement: DFG grant VO 1683/2-1 to author M.L.V.
25.12, 5:30 pm Peripheral Representations Enhance Dense Clutter
Metrics in Free Search Arturo Deza1([email protected]), Miguel
Eckstein2; 1Dynamical Neuroscience, UCSB, 2Psychological and Brain
Sciences, UCSB
Introduction: Clutter models (Feature Congestion, FC, Rosenholtz et al., 2005;
Edge Density, ED, Mack & Oliva, 2004; and ProtoObject Segmentation, PS,
Yu et al., 2014) aim at generating a score from an image that correlates with
performance in a visual task such as search. However, previous metrics do
not take into account the interactions between the influences of clutter and
the foveated nature of the human visual system. Here we incorporate foveated architectures into standard clutter models (Deza & Eckstein, 2016), and
assess their ability (relative to unfoveated clutter metrics) to predict multiple eye movement search performance across images with varying clutter.
Methods: Observers (n = 8) freely searched for a target (a person, yes/no
task) with varying levels of clutter, small targets, and with a 50% probability of target present. Data Analysis: We correlated the clutter scores for
images with the time to foveate a target (2 degree radius from target center).
Results: We find that Feature Congestion (r=0.45 vs r_Fov=0.72, p<0.05)
and Edge Density (r=0.38 vs r_Fov=0.87, p<0.05) benefit from inclusion
of a foveated (Fov) architecture. ProtoObject Segmentation does not show
such improvements. However, the unfoveated ProtoObject Segmentation
model correlates just as highly with human foveation time as all the other
foveated versions: r=0.76 vs r_Fov=0.38. The dissociation in results across
the FC, ED and PS models can be explained in terms of differences across models
in the spatial density of the representations. ProtoObject Segmentation has
spatially coarse intermediate representations leading to little effects from
spatial pooling associated with a foveated architecture. Conclusion: Models
with spatially dense representation pipelines can benefit from a foveated
architecture when computing clutter metrics to predict time to foveate a
target during search with complex scenes.
25.13, 5:45 pm The width of the functional viewing field is sensitive to distractor-target similarity even in efficient singleton
search Gavin Ng1([email protected]), Alejandro Lleras1, Simona Buetti1;
1University of Illinois at Urbana-Champaign
Contrary to most models of visual search, recent work from our lab showed
that variability in efficient search is meaningful and systematic. Reaction
times (RTs), which reflect stage one processing times, increase logarithmically with set size, indicating an exhaustive processing of the scene, even
in the presence of an easily visible singleton target. This increase is modulated by distractor-target similarity. The functional viewing field (FVF)
- the region surrounding fixation from which useful information can be
extracted - has been shown to be smaller in inefficient compared to efficient
search tasks. Here we show that the size of the FVF, like RTs, is variable
even in efficient search tasks. We monitored eye movements as observers
discriminated the direction of a singleton target. In higher distractor-target
similarity displays, observers’ initial saccades landed further away from
the target than in low distractor-target similarity displays, even though
search was efficient with both types of distractors. Furthermore, observers
executed more fixations, indicating that the FVF was smaller in higher distractor-target similarity displays. Interestingly, regardless of distractor-target similarity, observers fixated closer to the target when there were more items in the
display. This presumably results from observers switching to a smaller FVF
in order to determine the identity of the target once it is located. Additionally, we found that initial saccade latencies (ISLs) were not affected by total
Visual features can be integrated across retinotopic locations. For example,
when a Vernier is followed by a sequence of flanking lines on either side, a
percept of two diverging motion streams is elicited. Even though the central Vernier is unconscious due to metacontrast masking, its offset is visible
at the following elements. If an offset is introduced to one of the flanking
lines, the two offsets integrate (Otto et al., 2006). Here, by varying the number of flanking lines and the position of the flank offset, we show that this
integration lasts up to 450ms. Furthermore, this process is mandatory, i.e.,
observers are not able to consciously access the individual lines and change
their decision. These results suggest that the contents of consciousness can
be modulated by an unconscious memory-process wherein information is
integrated for up to 450ms. This mandatory and unconscious process is not
sluggish, but very precise. By using parallel streams, we show that even for
spatially very close stimuli the offsets do not spill over. Offsets integrate
only when presented in the same stream. Hence, these results suggest that
non-retinotopic feature integration is a very precise mechanism, and that
the streams create a spatio-temporal window of unconscious, mandatory
integration that lasts up to 450ms.
fixation and each click and measured the overlap. Perfect overlap would
represent perfect memory. When modeled with some noise in placing
clicks, best fixation produced 66% overlap for an average circle of diameter
2.6 deg. Overlap with randomly generated ‘clicks’ is chance performance
(21% overlap). As in prior work, participants’ click performance (28% overlap) was far from ceiling and quite close to chance performance. In the spotlight
condition, it was slightly better than in the no-spotlight control (26%, p=0.02).
Giving observers more information about their fixations by dimming the periphery improved memory for those fixations modestly, at best.
Interestingly, 9 of 14 observers queried thought the spotlight improved
their memory (even though it didn’t). One thought it made matters worse
and four reported no subjective difference. Memory for fixations is poor,
introspection about that memory is poor, and additional information about
fixation does not help much.
set size, suggesting that initial processing of the display is not exhaustive
but restricted to the FVF. However, the first eye movement was sensitive
to distractor type: ISLs were significantly longer for higher distractor-target
similarity displays and most of the initial saccades were directed towards
the target. Our results show that the size of the FVF is modulated by distractor-target similarity even in efficient visual search, and that this affects
the initial processing time of the search display.
25.14, 6:00 pm Serial dependence determines object classification
in visual search Mauro Manassi1([email protected]), Árni
Kristjánsson2, David Whitney1; 1University of California, Berkeley, Department of Psychology, Berkeley, CA, USA, 2Department of Psychology,
University of Iceland
In everyday life, we continuously search and classify the environment
around us: we look for keys in our messy room, for a friend in the street
and so on. A very important kind of visual search is performed by radiologists, who have to search and classify tumors in X-rays. An underlying
assumption of such tasks is that search and recognition are independent of
our past experience. However, recent studies have shown that our percepts
can be strongly biased toward previously seen stimuli (Fischer & Whitney,
2014; Liberman et al., 2014). Here, we tested whether serial dependence can
influence search and classification of objects in critical tasks such as tumor
detection. We created three objects with random shapes (objects A/B/C)
and generated 48 morph objects in between each pair (147 objects in total).
Observers were presented on each trial with a random object and were
asked to classify the morph as A/B/C. In order to simulate a tumor search
task, we embedded the morph in a noisy background and randomized its
location. We found that subjects made consistent perceptual errors when
classifying the shape on the current trial, seeing it as more similar to the
shape presented on the previous trial. This perceptual attraction extended
over 15 seconds back in time (up to 3 trials back). In a control experiment,
we checked whether this kind of serial dependence is due to response
repetition, on some trials asking subjects to press the space bar instead of
classifying the object. Serial dependence still occurred from those trials, ruling out a mere response bias. Our results showed that object classification
in visual search can be strongly biased by previously seen stimuli. These
results are particularly important for radiologists, who search and classify
tumors when viewing many consecutive X-rays.
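The stimulus count reported above follows from simple arithmetic: three prototype shapes plus 48 morphs for each of the three prototype pairs. The sketch below only reproduces that count. The function name and the tuple representation of morphs are illustrative assumptions, not the authors' stimulus-generation code.

```python
from itertools import combinations

def build_morph_set(prototypes, morphs_per_pair):
    """Enumerate prototypes plus indexed morphs between each prototype pair.

    Hypothetical representation: a prototype is a 1-tuple, a morph is
    (shape_a, shape_b, step) with step running from 1 to morphs_per_pair.
    """
    items = [(p,) for p in prototypes]
    for a, b in combinations(prototypes, 2):
        for step in range(1, morphs_per_pair + 1):
            items.append((a, b, step))
    return items

stimuli = build_morph_set(["A", "B", "C"], morphs_per_pair=48)
print(len(stimuli))  # 3 prototypes + 3 pairs * 48 morphs = 147
```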
25.15, 6:15 pm Searching with and against each other Diederick
Niehorster1,2([email protected]), Tim Cornelissen1,3, Ignace Hooge4,
Kenneth Holmqvist1,5; 1The Humanities Laboratory, Lund University,
Sweden, 2Department of Psychology, Lund University, Sweden, 3Scene
Grammar Lab, Department of Cognitive Psychology, Goethe University
Frankfurt, Germany, 4Department of Experimental Psychology, Helmholtz Institute, Utrecht University, the Netherlands, 5UPSET, North-West
University (Vaal Triangle Campus), South Africa
Although in real life people frequently perform visual search together, in lab
experiments this social dimension is typically left out. Collaborative search
with feedback about partners’ gaze has been shown to be highly efficient
(Brennan et al. 2008). Here we aim to replicate previous findings regarding
collaborative search strategies and how they change when people compete instead. While being eye-tracked, participants searched a jittered hexagonal
grid of Gabors for a vertically oriented target among 24 distractors
rotated by -10° or 10°. Sixteen participants completed
three conditions: individual, collaborative and competitive search. For collaboration and competition, searchers were paired with another searcher
and shown in real time which element the other searcher was looking at.
Searchers were instructed to find the target as fast as possible and received
points or a penalty depending on whether they found the correct target.
When instructed to collaborate, both searchers received points or a penalty,
regardless of who responded. During competition, only the searcher who
responded was awarded points or penalized. Early in trials the overlap in
visited hexagons between searchers remained low, indicating that searchers formed a collaboration strategy. This strategy resulted in search times
that were roughly half those of individual search without an increase in
errors, indicating collaboration was efficient. During competition overlap
increased earlier, indicating that competing searchers divided the search
space less efficiently than collaborating searchers. During competition,
participants increased the rate at which they inspected the elements of the
display and, despite no longer dividing the search space as efficiently as
during collaboration, found targets faster than in the collaboration condition without an increase in errors. We conclude that participants can efficiently search together when provided only with information about their
partner’s gaze position. Competing searchers found the target even faster,
but without a clear strategy.
Acknowledgement: Marcus and Amalia Wallenberg foundation,
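One way to quantify the overlap in visited elements between two searchers is sketched below. The abstract does not define its overlap measure, so the definition used here (elements visited by both searchers divided by elements visited by either, accumulated fixation by fixation) and the example fixation sequences are assumptions for illustration only.

```python
def cumulative_overlap(visits_a, visits_b):
    """Share of visited elements inspected by both searchers, per fixation index."""
    seen_a, seen_b = set(), set()
    overlap = []
    for a, b in zip(visits_a, visits_b):
        seen_a.add(a)
        seen_b.add(b)
        union = seen_a | seen_b
        overlap.append(len(seen_a & seen_b) / len(union))
    return overlap

# Hypothetical fixation sequences (element indices on a 25-element grid).
split_search = cumulative_overlap([0, 1, 2, 3], [24, 23, 22, 21])
redundant_search = cumulative_overlap([0, 1, 2, 3], [0, 1, 2, 3])
print(split_search, redundant_search)
```

Under this definition, a pair that divides the display keeps the measure near zero, while a pair that inspects the same elements drives it toward one.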
25.16, 6:30 pm Estimates of a priori power and false discovery rates
induced by post-hoc changes from thousands of independent
replications Dwight Kravitz1([email protected]), Stephen Mitroff1;
1Department of Psychology, The George Washington University
Scientific progress relies on accurate inference about the presence (or
absence) of an experimental effect. Failures to replicate high-profile studies have elevated concerns about the integrity of inference in psychology
research (Nosek et al., 2015). One proposed solution is pre-registering
experimental designs before data collection to prevent post-hoc changes
that might increase false positives and to increase publication of null findings. However, pre-registration does not always align with the inherently
complex and unpredictable nature of research, particularly when a priori
power estimates are not sufficient to guide the design of studies. Better a
priori power estimates would also increase confidence in interpreting null
results. The current study used a massive dataset of visual search performance (>11 million participants, >2.8 billion trials; Airport Scanner, Kedlin
Co.; www.airportscannergame.com) to produce empirical estimates of the
a priori power of various designs (i.e., number of trials and participants)
and to estimate the impact of and appropriate corrections for various posthoc changes (e.g., retaining pilot data). Dividing the dataset into many
thousands of independent replications of various designs allowed estimation of the minimum effect size each design can reliably detect (i.e., a priori power). Application of common post-hoc changes to these thousands
of replications yielded precise estimates of the individual and combined
impact of post-hoc changes on false positive rates, which in some cases
were >30%. Critically, adjusted p-values that correct for post-hoc changes
can also be derived. The approach and findings discussed here have the
potential to significantly strengthen research practices, guiding the design
of studies, encouraging transparent reporting of all results, and providing
corrections that allow flexibility without sacrificing integrity.
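The logic of estimating false positive rates from many independent replications can be illustrated with a small simulation. This is a minimal sketch on synthetic Gaussian data with no true effect, not the Airport Scanner dataset. The specific post-hoc change modeled here (collecting extra observations after a non-significant first test), the z-test with known unit variance, and all parameter values are assumptions chosen for illustration.

```python
import math
import random

def two_sided_p(sample):
    """Two-sided z-test p-value for mean = 0, assuming known unit variance."""
    z = sum(sample) / math.sqrt(len(sample))
    return math.erfc(abs(z) / math.sqrt(2))

def false_positive_rates(n_replications=4000, n_initial=20, n_added=10,
                         alpha=0.05, seed=1):
    """Compare a fixed design with one that adds data after a non-significant look."""
    rng = random.Random(seed)
    fixed_hits = 0
    flexible_hits = 0
    for _ in range(n_replications):
        # Null effect: every observation is drawn from N(0, 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_initial)]
        significant = two_sided_p(sample) < alpha
        fixed_hits += significant
        if not significant:
            # Post-hoc change: collect more data and test again.
            sample += [rng.gauss(0.0, 1.0) for _ in range(n_added)]
            significant = two_sided_p(sample) < alpha
        flexible_hits += significant
    return fixed_hits / n_replications, flexible_hits / n_replications

fixed_rate, flexible_rate = false_positive_rates()
print(f"fixed design: {fixed_rate:.3f}, with post-hoc extension: {flexible_rate:.3f}")
```

Because every replication that is significant at the first look also counts as significant in the flexible design, the flexible rate can never fall below the fixed rate; the gap between them is the inflation attributable to the post-hoc change.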
Color and Light: Material perception
Saturday, May 20, 5:15 - 6:45 pm
Talk Session, Talk Room 2
Moderator: Sylvia Pont
25.21, 5:15 pm Neo’s Spoon and Newton’s Apples: Prediction of
rigid and non-rigid deformations of materials Lorilei Alley1(lorilei.
[email protected]), Alexandra Schmid1, Katja Doerschner1;
Justus Liebig University Giessen
Throughout life we acquire complex knowledge about the properties of
objects in the world. This knowledge allows us to efficiently predict future
events (e.g. whether a falling porcelain cup will shatter) and is critical for
survival (e.g. predicting if a snake will strike). Vision research is only beginning to understand the mechanisms underlying such complex predictions.
We conducted a study to investigate whether the visual system makes
predictions about the kinematics of materials based on object shape and
surface properties. Stimuli were computer-rendered familiar objects (teacup, chair, spoon, jelly etc.) that we hypothesised would generate strong
expectations about their material kinematics when dropped from a height
(whether they would shatter, wobble, splash, bounce, etc.). Control stimuli were novel unfamiliar 3D shapes rendered with the familiar objects’
surface properties. Utilizing a ‘violation of expectation’ paradigm, on each
trial we showed a static view of the object, followed by a video sequence
of the object falling and impacting the ground. The motion was either ‘congruent’ with the object and material, behaving as expected (e.g. a falling
teacup shattered), or ‘incongruent’, where the kinematics violated potential
predictions (e.g. a falling teacup wrinkled like cloth). In a ‘static’ condition, different observers viewed only the first static frame. Observers used
a scale to rate each video clip on 4 adjectives: ‘hard’, ‘gelatinous’, ‘heavy’,
and ‘liquid’. We analysed whether congruency of the motion affected how
fast observers performed the ratings. We computed a predictability score
for the ‘congruent’ outcome by comparing ratings between this condition
and the static condition. Stimuli with high predictability scores should generate larger surprise effects (i.e. longer RTs), and this is exactly what was
found (r = .483, p < .001). Our results demonstrate for the first time that kinematic properties are an integral part of the visual system’s representation
of material qualities.
Acknowledgement: Sofja Kovalevskaja Award: Alexander von Humboldt Foundation, sponsored by the German Federal Ministry for Education and Research.
25.22, 5:30 pm Visual perception of elastic behavior of bouncing
objects Vivian Paulun1([email protected]),
Roland Fleming1; 1Department of Psychology, Justus-Liebig-University
When an object bounces around a scene, its behavior depends on both its
intrinsic material properties (e.g., elasticity) and extrinsic factors (e.g., initial position, velocity). Visually inferring elasticity requires disentangling
these different contributions to the observed motion. Moreover, although
the space of possible trajectories is very large, some motions appear intuitively more plausible than others. Here we investigated how the visual system estimates elasticity and the typicality of object motion from short (2s)
simulations in which a cubic object bounced in a room. We varied elasticity
in ten even steps and randomly varied the object’s start position, orientation and velocity to gain three random samples for each elasticity. Based
on these 30 variations we created two reduced versions of each stimulus,
showing the cube in a completely black environment (as opposed to a fully
rendered room). In one condition the cube was identical to the original stimulus; in the other, the cube rigidly followed the same path without rotating
or deforming. Thirteen observers rated the apparent elasticity of the cubes
and the typicality of their motion. We found that observers were good at
estimating elasticity in all three conditions, i.e. irrespective of whether the
scene provided all possible cues or was reduced to the movement path.
Some of the random variations produced more typical representatives of a
given elasticity than others. Rigid motion was generally perceived as less
typical than full motion. The pattern of ratings is consistent with simple
heuristics based on the duration and the speed of the motion: The longer
and faster an object moved, the higher was its perceived elasticity. The
same measures showed a similar but weaker relation to the true elasticity
of the cubes. Analysis of the distribution of many trajectories suggests such
heuristics can be inferred through unsupervised learning.
Acknowledgement: This research was supported by the DFG (SFB-TRR-135:
25.23, 5:45 pm Perceiving gloss behind transparent layers Sabrina
Hansmann-Roth1, Pascal Mamassian1; 1Laboratoire des Systèmes Perceptifs, CNRS UMR 8248, 29 rue d’Ulm, 75005 Paris, France
The image intensity depends on the illumination, the reflectance properties
of objects but also on the reflectance and absorption properties of any intervening media. We recently showed (Hansmann-Roth & Mamassian, VSS
2016) that dark backgrounds increase perceived gloss of central patches.
We hypothesized that this simultaneous gloss contrast induced by the dark
background causes a perceptual shift in the luminance range. Highlights
on the central patch appeared brighter, inducing an increase in perceived
gloss. In the current study we present the participant with glossy objects
behind partially-transmissive materials. The transparent layer causes an
achromatic color shift and compression in contrast, which can affect the
perception of the specular reflections of the object behind the transparent
layer. We rendered objects with various gloss levels and presented them
behind four different transparent layers with varying reflectance properties
ranging from black to white (constant transmittance: 0.5). We conducted
a maximum likelihood conjoint measurement experiment (Knoblauch &
Maloney, 2012) and investigated the contamination of different transparent layers on perceived gloss. We presented two objects simultaneously
and asked our participants to indicate which object appears glossier. We
used the additive model of MLCM, assigned perceptual scale values to
each gloss level of the object and to each reflectance level of the transparent
layer, and modeled the contribution of both features to perceived gloss.
Our results indicate a significant contribution of the transparent layer when
estimating gloss. Highlights and lowlights are affected most by the lightest
and darkest transparent layer respectively. In conclusion, we show that
disentangling the transparent layer from the underlying object results in
a form of gloss induction. Both, the intensification of the lowlights by the
darkest layer and the increase of brightness of the highlights by the lightest
layer induce an increase in perceived gloss.
Acknowledgement: EU Marie Curie Initial Training Network “PRISM” (FP7 PEOPLE-2012-ITN, Grant Agreement: 316746)
25.24, 6:00 pm The interaction between surface roughness and the
illumination field on the perception of metallic materials James
Todd1([email protected]), Farley Norman2; 1Department of Psychology,
Ohio State University, 2Department of Psychological Sciences, Western
Kentucky University
An important phenomenon in the study of human perception is the ability
of observers to identify different types of surface materials. One factor that
complicates this process is that materials can be observed with a wide range
of surface geometries and light fields. The present research was designed
to examine the influence of these factors on the appearance of metal. The
stimuli depicted three possible objects that were illuminated by three possible light fields. These were generated by a single point light source, 2 rectangular area lights, or projecting light onto a translucent white box that
contained the object (and the camera) so that the object would be illuminated by ambient light in all directions. The materials were simulated using
measured parameters of chrome with four different levels of roughness.
Observers rated the metallic appearance and shininess of each depicted
object using two sliders. The highest rated appearance of metal and shininess occurred for the surfaces with the lowest roughness in the ambient
light field, and these ratings dropped systematically as the roughness was
increased. For the objects illuminated by point or area lights, the appearance of metal and shininess were significantly less than in the ambient conditions for the lowest roughness value, and significantly greater than in the
ambient condition for intermediate values of roughness. We also included
a control condition depicting objects with a low roughness and a porcelain
reflectance function that had both Lambertian and specular components.
These objects were rated as highly shiny but they did not appear metallic.
An analysis of the luminance patterns in these images revealed that the primary difference between metal and porcelain occurs near smooth occlusion
boundaries, thus suggesting that these regions provide critical information
for distinguishing different types of shiny materials.
25.25, 6:15 pm The interplay between material qualities and
lighting Fan Zhang1([email protected]), Huib de Ridder1, Rene van
Egmond1, Sylvia Pont1; 1Perceptual Intelligence Lab, Industrial Design
Engineering, Delft University of Technology
In previous research we tested visual material perception in matching and
discrimination tasks, and found multiple material and lighting dependent
interactions. We used four basis surface scattering modes, namely diffuse,
asperity, forward, and mesofacet scattering, which we represented by
covering a same-shaped 3D object with “matte”, “velvety”, “specular”,
and “glittery” finishes, respectively. All four birds were photographed
in so-called ambient, focus and brilliance lighting, three canonical modes
that are commonly used in lighting design. In the current study, we asked
observers to judge the 12 stimuli on 9 material qualities terms that are commonly used in material perception studies, namely “matte”, “velvety”,
“specular”, “glittery”, “hard”, “soft”, “rough”, “smooth”, and “glossy”,
to test 1) whether the naming of the scattering modes we used is proper;
2) whether certain material qualities can be brought out or eliminated by
certain canonical types of illuminations. For each term and each stimulus image, we first asked observers to judge whether it was applicable. If
they answered “yes”, they were asked to rate the term on a scale from 1
to 7. Three repetitions for 12 stimuli and 9 qualities, made 324 trials per
observer. In preliminary results with 9 inexperienced observers, we found
that 1) “matte” applied to all materials, while “velvety”, “specular” and
“glittery” specifically applied to those respective materials; 2) for “specular” and “glossy” we found similar judgments; 3) brilliance light brought
out glitteriness, specularity, glossiness and smoothness the best; 4) focus
light resulted in a small increase in the velvetiness, softness and roughness
ratings compared to those for ambient and brilliance light. In further analysis, we will look into how the key stimuli image features can trigger certain
perceived qualities and how to design lighting to optimize appearance.
Acknowledgement: This work has been funded by the EU FP7 Marie Curie Initial
Training Networks (ITN) project PRISM, Perceptual Representation of Illumination, Shape and Material (PITN-GA-2012-316746).
25.26, 6:30 pm Integration of color and gloss in surface material discrimination Toni Saarela1([email protected]), Maria
Olkkonen1,2; 1Institute of Behavioural Sciences, University of Helsinki,
2Department of Psychology, Durham University
Background: Real-world surfaces differ from each other in several respects,
for example, in roughness, diffuse reflectance (color), and specular reflectance (gloss). When identifying and discriminating surface materials, the
visual system could use information from several such cues to perform
more precisely and consistently across varying viewing conditions. We
tested the integration of information from diffuse and specular reflectance in a discrimination task. Methods: Stimuli were spectrally rendered images of 3D shapes with surface corrugation. We independently
varied two surface-material “cues”: (1) diffuse reflectance, resulting in
greenish-to-bluish color variation, and (2) specular reflectance, resulting
in matte-to-glossy appearance variation. On each trial, the observer saw
a reference and a test stimulus. The reference was near the middle of the
color and gloss ranges, with trial-to-trial jitter. In different blocks of trials,
the test varied in the color-only, gloss-only, or in one of three intermediate
directions. The observer identified the bluer and/or glossier stimulus. We
fit psychometric functions to the proportion-bluer/glossier data to estimate
discrimination thresholds. To encourage observers to judge surface properties rather than local image cues, different shapes were interleaved, but
within each trial the two shapes were identical. Each shape was further
rendered with several rotation angles, changing the pattern of colors and
specular highlights. On each trial, rotation was selected randomly for each
stimulus. Results: Having two cues improved discrimination: Thresholds
were lower in the two-cue than in the single-cue conditions. Comparison against model predictions revealed that cue integration was less than
optimal statistically, falling between the optimal and strongest-single-cue
threshold predictions. Conclusion: The visual system can combine information from color and gloss to improve discrimination of surface material,
although the integration falls short of statistically optimal. When faced with
shape, viewpoint, and material variation, the visual system might rely on a
robust but sub-optimal strategy of cue integration.
Acknowledgement: Supported by the Academy of Finland grant 287506.
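The two model predictions that bracket the observed thresholds can be written down under a common textbook formulation. The abstract does not state the exact model, so the following is a sketch under assumptions: the two cues carry independent Gaussian noise, thresholds are expressed in equated step units, and the test changes both cues by the same step. The threshold values are hypothetical.

```python
import math

def optimal_threshold(t_color, t_gloss):
    """Two-cue threshold if independent cue sensitivities sum in quadrature."""
    return 1.0 / math.sqrt(1.0 / t_color**2 + 1.0 / t_gloss**2)

def strongest_single_cue_threshold(t_color, t_gloss):
    """Two-cue threshold if the observer relies only on the more sensitive cue."""
    return min(t_color, t_gloss)

# Hypothetical single-cue thresholds in arbitrary, equated step units.
t_color, t_gloss = 1.0, 1.0
lower = optimal_threshold(t_color, t_gloss)               # 1 / sqrt(2)
upper = strongest_single_cue_threshold(t_color, t_gloss)  # 1.0
print(lower, upper)
```

Under these assumptions, an observed two-cue threshold between `lower` and `upper` corresponds to the sub-optimal but beneficial integration described in the abstract.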
Saturday Afternoon Posters
Perception and Action: Affordances
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Banyan Breezeway
26.3001 Categorical selectivity in the visual pathway revealed by
fMRI in awake macaques Vassilis Pelekanos1,2([email protected]),
Olivier Joly1, Robert Mok1, Matthew Ainsworth1,2, Radoslaw Cichy3, Diana
Kyriazis1, Maria Kelly2, Andrew Bell1,2, Nikolaus Kriegeskorte1; 1Cognition and Brain Sciences Unit, Medical Research Council, 2Department of
Experimental Psychology, Oxford University, 3Department of Education
and Psychology, Free University Berlin
Neuronal activity in the primate occipito-temporal pathway has been
shown to contain information about visual object category. Specifically,
recent neuroimaging and electrophysiological studies have revealed that
the inferior temporal (IT) cortex contains regions that encode stimuli
belonging to one category compared to others. However, the majority of
these studies have limited their investigations to higher parts of IT cortex,
leaving earlier areas in the visual hierarchy, as well as areas within the
occipito-parietal pathway, less thoroughly explored. Here, we used functional magnetic resonance imaging (fMRI) to investigate the neural encoding of object categorisation in awake behaving macaques. We employed an
event-related design and presented 3 monkeys with 48 images, consisting
of 24 animate (human- and animal- faces and body-parts) and 24 inanimate (objects and places) images. Monkeys were trained to fixate on a cue
and received juice reward for maintaining fixation within the frame where
the images were presented. For each subject, we collected approximately
1170-1521 volumes per session. We collected about 10 sessions per monkey. Regions of interest included early visual areas (V1, V2, V3, V4), category-selective regions in IT cortex (face-, object-, body-part- “patches”) and
dorsal-parietal regions. We used a general linear model to analyse the time
series data in which we included the animals’ broken fixations and head
motion as nuisance regressors. Consistent with previous studies, we found
that the animate-inanimate, face-body parts and the face-inanimate contrasts activated face patches. We also found voxels preferentially activated
by objects vs. places in patches along the temporal cortex and, interestingly,
in the intraparietal sulcus. We did not find any categorical organisation in
areas V1-3, but the animate-inanimate division was observed in V4. Our
results suggest that as one moves beyond the striate cortex, a network of
visual areas exhibiting a categorical organisation of object representation
begins to emerge.
Acknowledgement: European Research Council
26.3002 Grasp Affordances Are Necessary for Enhanced Target
Detection Near the Hand Robert McManus1([email protected]
edu), Laura Thomas1; 1Center for Visual and Cognitive Neuroscience,
Department of Psychology, North Dakota State University
Observers show biases in attention when viewing objects within versus outside of their hands’ grasping space. For example, people are faster to detect
targets presented near a single hand than targets presented far from the
hand (Reed et al., 2006). While this effect could be due to the proximity of
the hands alone, recent evidence suggests that visual biases near the hands
could be contingent on both the hands’ proximity and an observer’s affordances for grasping actions (e.g., Thomas, 2015; 2016). The current study
examined the role an observer’s potential to act plays in biasing attention
to the space near the hands. Sixty-one participants completed a standard
Posner cueing task in which targets appeared on the left or right side of
the display. Across blocks, participants either placed their non-responding
hand near one of the target locations or kept this hand in their laps. Half of
the participants completed this task with their hands free, creating an affordance for a grasping action. The remaining participants completed the task
with their non-responding hand immobilized by a fingerboard, eliminating
their potential to grasp. In the hands-free condition, participants showed
faster detection of targets presented near the right hand than when targets
were presented far from the hand. However, participants in the hands-immobilized condition were no faster to detect targets near the hands than
targets appearing far from the hands. These results suggest that improved
target detection is contingent not only on the proximity of the hands to a
stimulus, but on the ability to use the hands in a grasping action as well.
Acknowledgement: National Science Foundation NSF BCS 1556336, Google
Faculty Research Award
26.3003 Breaking Ground: Effects of Texture Gradient Disruption
on the Visual Perception of Object Reach-Ability Jonathan Doyon1([email protected]), Alen Hajnal1; 1Department of Psychology,
University of Southern Mississippi
Gibson’s ground theory of space perception (1950) places the density gradient of a surface at the center of distance perception, such that (1) the rate
of change in the density of the texture elements of a surface specifies the
orientation and slant of a surface relative to an observer, and (2) the magnitude change in density of texture elements surrounding an object compared
to the density near the observer specifies that object’s distance from the
observer. Sinai, Ooi, and He (1998) investigated this theory as it relates to
how the brain might exploit this cue to simplify computations of distance.
Their investigations found that observers were more accurate in judging
absolute distance, measured by blind-walking, when objects were viewed
on a surface of homogenous texture (i.e., continuous texture gradient), as
opposed to a surface of heterogenous texture (i.e., an interruption of texture gradient caused by a gap in the surface). Here, we seek to extend this
investigation to the perception of an object’s reach-ability. We suspect that
continuous texture gradient may be critical in the successful realization of
certain affordances (Gibson, 1979). In a table-top reaching task, participants
were asked to judge whether an object was reachable in two conditions:
(1) when the object rested on a surface with a continuous texture gradient
and (2) when the object rested on a surface with an interruption in the texture gradient (i.e., two distinct gradients). Results showed that participants
overestimated action capabilities in both conditions, but less so in the heterogeneous condition. As a consequence, participants in the heterogenous
condition were also more accurate with respect to their capabilities. This
comports with Sinai, Ooi, & He’s findings that distance estimates across
discontinuous gradients are smaller than those made across continuous
26.3004 Bayes meets Gibson: Affordance-based control of target
interception in the face of uncertainty Scott Steinmetz1([email protected]
gmail.com), Nathaniel Powell1, Oliver Layton1, Brett Fajen1; 1Cognitive
Science Department, Rensselaer Polytechnic Institute
To be efficient over time in the pursuit of moving targets, humans and
other animals must know when to abandon the chase of a target that is
moving too fast to catch or for which the costs of pursuing outweigh the
benefits of catching. From an affordance-based perspective, this entails
perceiving catchability, which is determined by how fast one needs to
move to intercept in relation to one’s locomotor capabilities. However,
affordances are traditionally treated as categorical (i.e., the action is either
possible or not) when in fact the presence of variability in both perception
and movement ensures that target catchability is a continuous, probabilistic function. We developed a computational framework that treats interception as a dynamic decision making process under uncertainty. In our
dynamic Bayesian model, the pursuer continuously updates its belief about
catchability based on informational variables, such as relative target distance and time until the target reaches an escape region. These beliefs shift
based on the likelihood of detecting each variable at a given value (plus
noise) when the target was catchable and when it was uncatchable. At each
moment, the model uses its belief about catchability to decide whether to
continue to pursue the target or give up. To evaluate the model, we compared its beliefs about target catchability to the behavior of humans in an
experiment in which subjects had to decide whether to pursue or abandon
the chase of a moving target (Fajen et al., VSS 2016). In a subset of randomly
selected trials, the model’s beliefs closely matched human behavior – that
is, the belief reflected high certainty in catchability when subjects pursued
the target and high certainty in uncatchability when subjects gave up. Our
framework provides a powerful tool for investigating action-scaled affordances as probabilistic functions of actor-environment variables.
Acknowledgement: ONR N00014-14-1-0359
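The belief update described for the dynamic Bayesian model can be sketched as a repeated application of Bayes' rule. This is an illustrative sketch, not the authors' model: it uses a single generic informational variable, Gaussian likelihoods given as hypothetical (mean, sd) pairs for the catchable and uncatchable cases, and an arbitrary give-up criterion.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def update_belief(prior_catchable, observation, catchable_model, uncatchable_model):
    """One Bayes step: posterior probability that the target is catchable."""
    like_c = gaussian_pdf(observation, *catchable_model)
    like_u = gaussian_pdf(observation, *uncatchable_model)
    evidence = like_c * prior_catchable + like_u * (1.0 - prior_catchable)
    return like_c * prior_catchable / evidence

def pursue_or_quit(observations, give_up_below=0.1, prior=0.5,
                   catchable_model=(5.0, 2.0), uncatchable_model=(9.0, 2.0)):
    """Update the belief after each observation; stop once it drops below the criterion."""
    belief = prior
    for step, obs in enumerate(observations):
        belief = update_belief(belief, obs, catchable_model, uncatchable_model)
        if belief < give_up_below:
            return "give up", step, belief
    return "pursue", len(observations) - 1, belief

# Hypothetical observation streams of one informational variable.
fast_target = pursue_or_quit([9.5, 10.0, 9.0])
slow_target = pursue_or_quit([5.0, 4.5, 5.5])
print(fast_target, slow_target)
```

Observations that are more likely under the uncatchable model pull the belief down until the pursuer abandons the chase; observations typical of catchable targets push it toward certainty and pursuit continues.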
26.3005 Towards Affordance-Based Control in Catching Fly Balls:
The Affordance of Catchability Dees Postma1([email protected]
com), Frank Zaal1; 1Center for Human Movement Sciences, University
Medical Center Groningen, University of Groningen
After a drive in baseball, it is crucial for the fielding team to get hold of
the ball as quickly as possible. The best way to do this is to make a direct
catch. However, this might not always be possible. Some fly balls are simply uncatchable. In that case, it might be better to get the ball after the first
bounce. The latter situation requires a fielder to employ different timing
and coordination from the former, illustrating that perceived catchability
could have an effect on locomotor control. Until now, the effects of (un)
catchability on running to catch fly balls have received little attention. We
aim to formulate an affordance-based control strategy that appreciates
the influence of perceived catchability on locomotor control in catching
fly balls. A first step in doing so is to identify the factors that cause some
fly balls to be catchable and others to be uncatchable. In an experiment,
18 participants were required to intercept 44 fly balls. Some fly balls were
catchable whereas others were not. Mixed Effects Regression was used to
examine a number of factors possibly related to catchability. The analysis
showed that the boundary between catchable and uncatchable fly balls is
largely determined by the locomotor qualities of the individual, the distance to be covered and the time available to do so. The present contribution also studied participants’ ability to judge catchability. The same participants were presented with another set of 44 fly balls for which they were
to indicate whether these would be uncatchable. Importantly, they were
allowed to start running before giving their judgment. Participants could
judge catchability correctly. Interestingly, their judgments were predominantly given after they had already started running. These findings pave
the way towards the formulation of an affordance-based control strategy
for running to catch fly balls.
Acknowledgement: University Medical Center Groningen
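The three factors identified above (locomotor qualities, distance to be covered, and time available) suggest a simple first approximation of the catchability boundary: a ball is catchable when the average speed required to reach the landing point does not exceed the fielder's maximum running speed. This is a deliberately simplified sketch that ignores reaction time and acceleration; it is not the mixed-effects regression reported in the abstract, and all values are hypothetical.

```python
def required_speed(distance_m, time_available_s):
    """Average speed needed to cover the distance in the time available."""
    return distance_m / time_available_s

def is_catchable(distance_m, time_available_s, max_speed_m_per_s):
    """Catchable if the required average speed is within the fielder's maximum."""
    return required_speed(distance_m, time_available_s) <= max_speed_m_per_s

# Hypothetical fielder with a maximum running speed of 6 m/s.
near_ball = is_catchable(distance_m=12.0, time_available_s=3.0, max_speed_m_per_s=6.0)
far_ball = is_catchable(distance_m=30.0, time_available_s=3.0, max_speed_m_per_s=6.0)
print(near_ball, far_ball)
```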
26.3006 Learning affordances through action: Evidence from visual
search Greg Huffman1([email protected]), Jay Pratt1;
University of Toronto
It has long been thought that objects are processed according to affordances
they offer. Much of the evidence for this conclusion, however, comes from
studies that used images of tools that participants may or may not have
previous experience interacting with. Moreover, many tools are spatially
asymmetric, adding a further potential confound. In the current study, we
eliminated these confounds by using simple geometric stimuli and having
participants learn that certain color-shape combinations afforded successfully finishing a task whereas others did not. The learning trials began with
a small circle (the ‘agent’) surrounded by two circles and two squares that
were blue or yellow and were contained within a ‘+’-shaped structure. The
participant’s task was to move the agent, using the arrow keys, past the
shapes, out of the structure. Importantly, two of these color-shape combinations allowed the agent to pass (doors) while the other two stopped
the agent (walls). To measure whether doors were preferentially processed
after affordances were learned, the test trials had participants search for a
‘T’ among ‘L’s that were presented on the same color-shape combinations.
Evidence for affordance processing would be found if response times were
shorter for targets appearing on doors than targets on walls. The data supported this hypothesis, indicating that not only do affordances guide object
processing, but also that affordances can be learned and assigned to otherwise arbitrary stimuli. The response time benefit may reflect a search bias
with the attentional system prioritizing the processing of previously
action-relevant stimuli.
Acknowledgement: Natural Sciences and Engineering Research Council of
26.3008 Action-Specific Effects in Perception and their Mechanisms Jessica Witt1([email protected]), Nathan Tenhundfeld1,
Marcos Janzen1, Michael Tymoski1, Ian Thornton2; 1Department of Psychology, College of Natural Sciences, Colorado State University, 2Department of Cognitive Science, University of Malta
Spatial perception is influenced by the perceiver’s ability to act. For example, distances appear farther when traversing them requires more energy,
and balls appear to move faster when they are more difficult to block.
Despite many demonstrations of action-specific effects across a wide range
of scenarios, little is known about the mechanism underlying these effects.
To explore these mechanisms, we leveraged individual differences with the
idea that if a common mechanism underlies two tasks, outcomes on these
tasks should be highly related. We found that the magnitude of the two
action-specific effects described as examples did not correlate with each
other (r = .15, p = .19). This suggests unique mechanisms underlying energetic-based and performance-based action-specific effects. Furthermore,
the magnitude of each effect correlated with perceptual precision within
the task (r = .31, r = .40, ps < .001) but not with perceptual precision in the
other task (rs < .03). This pattern is consistent with a Bayesian mechanism
such that when visual information is less reliable, the perceptual system
places greater weight on other sources of information such as those from
the motor system. In addition, performance on a biological motion perception task did not correlate with the magnitude of either action-specific effect
(rs < .03). This lack of relationship suggests that the processes involved in
connecting the motor system to the visual system to perceive biological
motion are not the same processes that connect the motor and visual systems to perceive distance or speed. These data are the first to suggest different mechanisms underlying the different kinds of action-specific effects
and to suggest multiple types of connections from the motor system to the
visual system.
Acknowledgement: National Science Foundation (BCS-1348916 and BCS-1632222)
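The Bayesian account proposed in abstract 26.3008 can be illustrated with inverse-variance cue weighting. This is a minimal sketch and not the authors' model: the function name `combine_cues`, the Gaussian-cue assumption, and all numbers are hypothetical. It shows only that a less reliable visual estimate shifts weight toward a second (here, motor-based) source.

```python
def combine_cues(visual, visual_sd, motor, motor_sd):
    """Reliability-weighted average of two Gaussian cues.

    Each cue is weighted by its inverse variance, so a noisier visual
    estimate shifts weight toward the motor-based estimate.
    """
    w_visual = 1.0 / visual_sd ** 2
    w_motor = 1.0 / motor_sd ** 2
    weight_on_motor = w_motor / (w_visual + w_motor)
    estimate = (w_visual * visual + w_motor * motor) / (w_visual + w_motor)
    return estimate, weight_on_motor


# Hypothetical numbers: vision signals 10 m, the motor-based cue signals 12 m.
precise, w_precise = combine_cues(10.0, 1.0, 12.0, 2.0)  # reliable vision
noisy, w_noisy = combine_cues(10.0, 2.0, 12.0, 2.0)      # less reliable vision
```

Under these assumed values, the combined estimate moves further toward the motor-based cue when visual precision is lower, which is the qualitative pattern the abstract describes.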
26.3009 Distance on hill overestimation is not influenced by hiking
experience Marcos Janzen1([email protected]), Nathan Tenhundfeld1,
Michael Tymoski1, Jessica Witt1; 1Colorado State University
Previous research has found that perceptual estimations are scaled by
one’s ability to act and by the associated costs related to this action. For
example, distances on a hill are judged as being farther than distances on
the flat, due to the higher metabolic costs associated with traversing hills.
One’s ability to act may vary according to several different factors, such as
age, body size, body control, energetic potential, and task requirements.
The extent to which one’s ability to act affects spatial perception and which
factors specifically contribute to the variation of estimations might provide
insight into the underlying mechanisms that rule action-specific perception. In a previous study, we hypothesized that participants with more
experience walking up hills would judge distances on hills more accurately,
while less experienced participants would overestimate distances on hills.
Previous results indicated that experienced hikers did not overestimate distances on hills when visually matching them to distances on the flat. In this
study, we aimed to replicate that finding, and to include other factors that
have previously been shown to influence distance estimation. Participants
visually matched distances on a hill to distances on flat ground in VR and
answered a survey on hiking experience. Replicating the main effect, participants overestimated distances on the hill. Contrary to our hypothesis,
hiking experience did not modulate overestimation on hills, nor did BMI,
or percent of body fat/muscle, contrary to previous research. This study
utilized a highly reliable measure which imbues confidence in interpreting
this null result. Considering the very high reliability of 0.96, these results
suggest that the distance-on-hill effect is robust, and that previous experience does not influence this effect.
26.3010 I Can’t Afford Both: Walk-through-ability Affordance
Judgments do not Correlate to the Distance on Hill Effect Michael
Tymoski1([email protected]), Jessica Witt4, Nathan Tenhundfeld2, Marcos Janzen3; 1Colorado State University
Previous research has shown that visual perception is fundamentally
linked to information about the perceiver’s body. The ecological approach
to visual perception states that perception of affordances (i.e. environmental cues about action capabilities) is anchored to physical dimensions of the
26.3011 An uphill battle: Distances are reported as farther on a hill
even when immediate feedback about estimation accuracy is provided Nathan Tenhundfeld1([email protected]), Jessica Witt1;
1Cognitive Psychology, Natural Sciences, Colorado State University
Studies have reported distances are seen as farther on a hill than on the
flat ground. These studies are part of a larger theoretical framework that
suggests your ability to act changes how you see the world. However, this
framework has been met with controversy. Critics suggest the reported
‘distance-on-hill’ effect may be nothing more than response bias. One
effective method used to eliminate response biases is to provide feedback
about the accuracy of the perceptual judgments. We used an Oculus Rift
DK2 head-mounted virtual reality system to present the stimuli. Participants were tasked with visually matching the perceived egocentric distance
between a cone placed on a virtual hill, and a cone placed on the virtual
flat ground. After each trial they were given feedback which told them if
their estimate was too far, too close, or correct (which was defined as being
within 30 cm of the actual distance). Results indicated that despite the feedback there was still an overall significant main effect for the hill on perceived distance, F(1, 40) = 19.36, p < .001. This suggests that even though
participants were given accurate feedback to correct their estimations, they
were still unable to resist the distance-on-hill effect. The main effect for distance on the difference scores was also significant, F(1, 40) = 73.75, p < .001.
As the distance to the target cone increased, so too did the effect of the hill
(versus flat ground). This provides further substantiation for an energetic
account of the distance-on-hill effect. Taken together, this replication of the
distance-on-hill effect in virtual reality, and the effect’s resistance to immediate feedback on estimation accuracy, provides evidence for a perceptual
account and helps rule out a response bias account for the effect of action
on perception.
Acknowledgement: NSF
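The trial feedback rule described in abstract 26.3011 can be sketched as follows. The function name `feedback`, the use of metres, and the treatment of the 30 cm boundary as inclusive are assumptions made for illustration; the abstract states only that an estimate within 30 cm of the actual distance counted as correct.

```python
def feedback(estimate_m, actual_m, tolerance_m=0.30):
    """Return the trial feedback for a distance estimate, in metres."""
    error = estimate_m - actual_m
    # Assumption: the 30 cm tolerance is inclusive.
    if abs(error) <= tolerance_m:
        return "correct"
    return "too far" if error > 0 else "too close"
```

For example, under these assumptions an estimate of 10.5 m for a 10 m target would be reported as too far, and an estimate of 10.2 m as correct.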
26.3012 Support for modulation of visuomotor processes in shared,
social space: Non-human distractors do not influence motor
congruency effects relating to object affordances Elizabeth Saccone1([email protected]), Owen Churches1, Ancret Szpak1,
Mike Nicholls1; 1School of Psychology, Flinders University of South
Recent research suggests close, interpersonal proximity modulates visuomotor processes for object affordances in shared space. In our previous
study, manipulable object stimuli in reachable space elicited motor congruency effects for participants acting alone, but when a co-actor stood
opposite, only objects nearest the participant produced motor congruency
effects. An alternative, non-social mechanism may explain these findings,
however. Perhaps participants perceived the co-actor as a distractor they
attempted to ignore, and in doing so neglected the space and stimuli nearby.
The current study addressed this alternative explanation with a non-social version of the original experiment, employing non-human distractor
items in place of the co-actor. Participants stood at a narrow table, viewing images of household objects on a flat screen. Participants responded
to the upright or inverted orientation of objects with left- or right-facing
handles. Objects appeared in one of two locations, either nearer or farther
from the participant’s side of the table. Participants performed the task both
alone and with a Japanese waving cat statue (Experiment 1) or a digital
metronome (Experiment 2) placed opposite. Both experiments produced
the typical object affordance congruency effect: a response advantage when
left/right response hand matched the object’s left/right handle orientation.
In Experiment 1, the cat statue elicited a similar but statistically nonsignificant pattern of results to the original, social study, perhaps reflecting
participants’ anthropomorphisation of the cat. Accordingly, Experiment
2 employed a distractor that was devoid of human-like features (metronome). The affordance congruency effect emerged but was not modulated
by stimulus proximity and metronome presence. These results support
past findings indicating social modulation of object affordances in nearbody space. Together with the previous study, these results provide an
important step towards understanding how visuomotor processes operate
in real-world, social contexts and have broad implications for object affordance, joint action and peripersonal space research.
26.3013 Memory for real objects is better than images – but only
when they are within reach Michael Compton1([email protected]),
Jacqueline Snow1; 1The University of Nevada, Reno
Previous studies of human memory have focused on stimuli in the form
of two-dimensional images, rather than tangible real-world objects. Previously, we found a memory advantage for real-world objects versus colored
photographs of the same items. A potential explanation for this ‘real object
advantage’ (ROA) is that real objects (but not their representations) afford
genuine physical interaction. Here, we examined directly whether the ROA
is influenced by reachability by comparing memory performance for images
versus real objects when they are presented either within or outside of
reach. Participants were asked to memorize 112 different objects: half were
real objects and the remainder were high-resolution color images of objects.
Half of the stimuli in each display format were presented within reach, and
the remainder were outside of reaching distance. The images were matched
closely to the real objects for size, viewing angle, background and illumination. Participants completed a free recall task, a recognition task, and also
a task in which they indicated whether the object was displayed as a real
object or an image. We predicted that if graspability is important in driving
the ROA, then stimuli positioned beyond reach should have no influence
on memory for images, but should impair memory for real objects. In line
with this prediction, we found that free recall for real objects was superior
to images, but only when the objects were within reach. Conversely, recall
for images was unaffected by distance, suggesting that the effect for the
real objects was not attributable to distance-related changes such as retinal
size. A similar pattern was observed in participants’ ability to indicate the
format in which the stimuli were presented. Reachability is a critical determinant of the ROA.
Face Perception: Models
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Banyan Breezeway
26.3014 Coding of faces by tensor components Sidney Lehky1,3([email protected]), Anh Huy Phan2, Andrzej Cichocki2, Keiji Tanaka1; 1Cognitive Brain Mapping Laboratory, RIKEN Brain Science Institute, 2Advanced
Brain Signal Processing Laboratory, RIKEN Brain Science Institute, 3Computational Neurobiology Laboratory, Salk Institute
Neurons selectively responsive to faces exist in the ventral visual stream of
both monkeys and humans. However, the characteristics of face cell receptive fields are poorly understood. Here we use tensor decompositions of
faces to model a range of possibilities for the neural coding of faces that
may inspire future experimental work. Tensor decomposition is in some
sense a generalization of principal component analysis from 2-D to higher
dimensions. For this study the input face set was a 4-D array, with two spatial dimensions, color the third dimension, and the population of different
faces forming the fourth dimension. Tensor decomposition of a population
of faces produces a set of components called tensorfaces. Tensorfaces can
be used to reconstruct different faces by doing different weighted combinations of those components. A set of tensorfaces thus forms a population
perceiver’s body in a task-relevant way, while the action-specific account
for visual perception states that visual perception is linked to internal cues
about action capability. For instance, the “distance on hill (DoH)” effect
demonstrates that distance judgments are bioenergetically scaled, such
that distances on hills are perceived as farther than equal distances on flat
ground, due to the increased energy requirement to walk the distance on
a hill versus on flat ground. Both theories posit that cues related to the
perceiver’s body and its potential for action modulate visual perception.
However, this theoretical convergence is unexplored. We conducted an
individual differences study by comparing participants’ performance on
a DoH task and an affordance task. For the DoH paradigm participants
were presented a virtual hill on an Oculus Rift DK2. They then performed
a visual matching task on the egocentric distance to both a cone on the hill
and a cone on the flat ground. As expected, distances on hills were judged
to be farther away than distances on flat ground, F(1, 158) = 85.25, p < .001.
For the affordance paradigm, participants made judgments on their ability to walk through a doorway aperture. Their affordance judgments were
correlated with their actual abilities, r = .456, p < .001. However, when we
compared performance on these two tasks, there was no significant correlation between the DoH effect and the accuracy of affordance judgements, r = .014,
p = .859. These data suggest that, although the action-specific and ecological
accounts for visual perception theoretically converge, they do not employ
the same underlying mechanism.
code for the representation of faces. A special feature of the tensor decomposition algorithm we used was the ability to specify the complexity of the
tensorface components, measured as Kolmogorov complexity (algorithmic
information). High-complexity tensorfaces have clear face-like appearances, while low-complexity tensorfaces have blob-like appearances that
crudely approximate faces. For a fixed population size, high-complexity
tensorfaces produced smaller reconstruction errors than low-complexity
tensorfaces when dealing with familiar faces. However, high-complexity
tensorfaces had a poorer ability to generalize to handling novel face stimuli
that were very different from the input face training set. This raises the
possibility that it may be advantageous for biological face cell populations
to contain a diverse range of complexities rather than a single optimal complexity.
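The idea in abstract 26.3014 that a face is rebuilt as a weighted combination of components can be sketched in a few lines. This is a toy illustration only: it flattens each component to a list of pixel values, whereas the tensor decomposition in the abstract keeps the spatial and color dimensions separate. The names `reconstruct` and `squared_error` and the 4-pixel data are hypothetical.

```python
def reconstruct(weights, components):
    """Weighted sum of components; each component is a flat list of pixels."""
    n_pixels = len(components[0])
    return [
        sum(w * comp[i] for w, comp in zip(weights, components))
        for i in range(n_pixels)
    ]


def squared_error(a, b):
    """Sum of squared pixel differences between two flat images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Two hypothetical 4-pixel components and a target "face".
components = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
face = [2.0, 3.0, 2.0, 3.0]

good = reconstruct([2.0, 3.0], components)   # weights that fit the face
poor = reconstruct([1.0, 1.0], components)   # weights that do not
```

Comparing `squared_error(good, face)` with `squared_error(poor, face)` gives a simple reconstruction-error measure of the kind the abstract uses to compare component sets.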
26.3015 Identifying ‘Confusability Regions’ in Face Morphs Used
for Ensemble Perception Emma ZeeAbrahamsen1([email protected]),
Jason Haberman1; 1Department of Psychology, Rhodes College
The ability to extract summary statistics from a set of similar items, a phenomenon known as ensemble perception, is an active area of research. In
exploring high-level ensemble domains, such as the perception of average
expression, researchers have often utilized gradually changing face morphs
that span a circular distribution (e.g., happy to sad to angry to happy).
However, in their current implementation, face morphs may unintentionally introduce noise into the ensemble measurement, leading to an underestimation of ensemble perception abilities. Specifically, some facial expressions on the morph wheel appear perceptually similar even though they are
positioned relatively far apart. For instance, in a morph wheel of happy-sad-angry-happy expressions, an expression between happy and sad may
not be perceptually distinguishable from an expression between sad and
angry. Without accounting for this perceptual confusability, observer error
will be overestimated. The current experiment accounts for this by determining the perceptual confusability of a previously implemented morph
wheel. In a 2-alternative-forced-choice task, 7 observers were asked to discriminate between multiple anchor images (36 in total) and all 360 facial
expressions on the morph wheel (which yielded close to 27,000 trials per
participant). Results are visualized on a ‘confusability matrix’ depicting the
images most likely to be confused for one another. This confusability matrix
reveals discrimination thresholds of relatively adjacent expressions and,
more importantly, uncovers confusable images between distant expressions on the morph wheel, previously unaccounted for. By accounting for
these ‘confusability regions,’ we demonstrate a significant improvement in
model estimation of previously published ensemble performance, suggesting high-level ensemble abilities may be better than previously thought.
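Observer error on a circular morph wheel is commonly scored as the shortest wraparound separation between the reported and the correct face. The sketch below assumes a 360-step wheel, as in abstract 26.3015; the function name `circular_distance` is hypothetical and the abstract does not state how error was computed.

```python
def circular_distance(a, b, n_steps=360):
    """Shortest separation between two positions on a wheel of n_steps."""
    diff = abs(a - b) % n_steps
    return min(diff, n_steps - diff)
```

The abstract's point is that this geometric distance can overstate error: two faces far apart on the wheel may still be perceptually confusable, so a large `circular_distance` does not always reflect a large perceptual mistake.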
26.3016 The Lightness Distortion Effect: Additive Conjoint Measurement Shows Race Has a Larger Influence on Perceived Lightness of Upright than Inverted Faces Nikolay Nichiporuk1,2([email protected]), Kenneth Knoblauch3, Clément Abbatecola3, Steven
Shevell1,2,4; 1Department of Psychology, University of Chicago, IL, USA,
2Institute for Mind and Biology, University of Chicago, IL, USA, 3University of Lyon, Université Claude Bernard Lyon 1, INSERM, Stem Cell and
Brain Research Institute U1208, Lyon, France, 4Department of Ophthalmology and Visual Science, University of Chicago, IL, USA
BACKGROUND African American faces are judged to be darker than Caucasian faces, even when faces are matched for mean luminance and contrast
(Levin & Banaji, 2006). This is the Lightness Distortion Effect (LDE), which
is found even when faces are blurred, suggesting that low-level visual features drive LDE to at least some degree (Firestone & Scholl, 2015). Here,
the LDE is measured using maximum likelihood conjoint measurement
(MLCM). Upright and inverted faces were tested separately to control for
low-level visual features. METHODS The joint influence of (1) overall mean
luminance and (2) race was measured for perceived face lightness. Thirteen
African American faces ranging in mean luminance and contrast and 13
Caucasian faces, matched to the African American faces in mean luminance
and contrast, were tested (Levin & Banaji, 2006). All pairs of the 26 faces
(either upright or inverted, in separate runs) were presented straddling
fixation for 250 msec, followed immediately by a noise mask (with replications, 1,800 judgments in all for each observer). Conjoint measurement
requires that participants only choose which member of a pair appears
lighter; this ameliorates concern about demand characteristics (e.g. Firestone & Scholl, 2015). Perceptual lightness scales for all 26 face stimuli were
derived from MLCM. RESULTS & CONCLUSIONS Each observer’s results
were analyzed separately. For 5 of the 6 observers, race had a significant
effect on lightness judgments of upright faces (p < 0.001 for each observer)
in the direction of a fixed decrement in perceived lightness for each African
American face. Further, the magnitude of this effect was greater for upright
than inverted faces for 5 of the 6 observers (p < 0.05). The greater effect of
race with upright than inverted faces shows that perception of face lightness depends on race beyond just low-level features.
26.3017 Face Representations in Deep Convolutional Neural
Networks Connor Parde1([email protected]), Carlos Castillo2,
Matthew Hill1, Y. Colon1, Jun-Cheng Chen2, Swami Sankaranarayanan2,
Alice O’Toole1; 1School of Behavioral and Brain Sciences, The University
of Texas at Dallas, 2Department of Electrical Engineering, University of
Maryland, College Park
Algorithms based on deep convolutional neural networks (DCNNs) have
made impressive gains on the problem of recognizing faces across changes
in appearance, illumination, and viewpoint. These networks are trained
on a very large number of face identities and ultimately develop a highly
compact representation of each face at the network’s top level. It is generally assumed that these representations capture aspects of facial identity
that are invariant across pose, illumination, expression, and appearance.
We analyzed the top-level feature space produced by two state-of-the-art
DCNNs trained for face identification with >494,000 images of 10,575 individuals (Chen, 2016; Sankaranarayanan, 2016). In one set of experiments,
we trained classifiers to predict image-based properties of faces using the
networks’ top-level feature descriptions as input. Classifiers determined
face yaw to within 9.5 degrees and face pitch (frontal versus offset) at 67%
correct. Top-level features also predicted whether the input came from a
photograph or video frame with 87% accuracy. In a second experiment,
we compared top-level feature codes of different views of the same identities to develop an index of feature invariance. Surprisingly, we found that
invariant coding was a characteristic of individual identities, rather than
individual features, with some identities encoded invariantly and others not. In a third analysis, we used t-Distributed Stochastic Neighbor
Embedding to visualize the top-level DCNN feature space for the Janus
CS3 dataset (cf. Klare et al., 2015) containing over 69,000 images of 1,894
distinct identities. This visualization indicated that image quality information is retained in the top-level DCNN features, with poor quality images
clustering at the center of the space. The representation of photometric
details for face images in top-level DCNN features echoes findings of object
category-orthogonal information in macaque IT cortex (Hong et al., 2016),
reinforcing the claim that coarse codes can effectively represent complex
stimulus sets.
Acknowledgement: This work was supported by the Intelligence Advanced
Research Projects Activity (IARPA).
26.3018 Training a deep convolutional neural network with multiple
face sizes and positions, but not resolutions, is necessary for
generating invariant face recognition across these transformations Megha Srivastava1,2([email protected]), Kalanit Grill-Spector2,3;
1Computer Science Department, Stanford University, 2Psychology Department, Stanford University, 3Stanford Neurosciences Institute, Stanford
Convolutional neural networks have demonstrated human-like ability
in face recognition, with recent networks achieving as high as 97% accuracy (Taigman, 2014). It is thought that non-linear operations (e.g. maximum-pooling) are key for developing position and size invariance (Riesenhuber & Poggio, 1999). However, it is unknown how training contributes
to invariant face recognition. Here, we tested how training affects invariant
face recognition across position, size, and resolution. We used a convolutional neural network architecture of TensorFlow (tensorflow.org). We
trained the network to recognize 101 faces that varied in age, gender, and
ethnicity across views (15 views/face, spanning 0 to ±105°). The network
was trained on 80% of views, randomly selected, and tested on the remaining 20% of views. During training faces were shown centrally and presented in one size and resolution. Then, we tested face recognition across
views for new positions, sizes, and resolutions not shown during training.
Results show that face recognition performance progressively declined
for faces shown in different positions (Figure 1A) or sizes (Figure 1B) than
shown during training. However, face recognition performance generalized across resolutions (Figure 1C). Further experiments using a constant
number of training examples, but different training regimes, revealed that
training with random positions (Figure 1D) or random sizes (Figure 1E)
generated more robust performance than training with faces in 5 positions
(Figure 1D) or 5 sizes (Figure 1E). Additionally, the network displayed better performance on faces shown in new sizes than new positions. Overall,
our results indicate that the architecture of the neural network is (1) sufficient for invariant face recognition across resolutions, but (2) insufficient
for invariant face recognition across size and position unless trained with
many faces varying in size and position. By understanding the limits of
convolutional neural networks, we can gain insight into the factors that enable successful face recognition.
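The 80%/20% division of views in abstract 26.3018 can be sketched as a random split. The sketch assumes the split is made separately for each face over its 15 views, which the abstract does not specify; the function name `split_views` and the fixed seed are hypothetical.

```python
import random


def split_views(n_views=15, train_fraction=0.8, seed=0):
    """Randomly assign the view indices of one face to training or test."""
    rng = random.Random(seed)
    views = list(range(n_views))
    rng.shuffle(views)
    n_train = round(n_views * train_fraction)
    return sorted(views[:n_train]), sorted(views[n_train:])


train_views, test_views = split_views()
```

With 15 views per face, this assumed per-face split gives 12 training views and 3 held-out test views.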
26.3019 Using Psychophysical Methods to Study Face IdentificaOliver Garrod1, Lukas Snoek2, Steven Scholte2, Philippe Schyns1; 1Institute
of Neuroscience and Psychology, University of Glasgow, Scotland, UK,
2Department of Psychology, Brain and Cognition, University of Amsterdam, Netherlands
Deep neural networks (DNN) have been very effective in identifying
human faces from 2D images, on par with human-level performance.
However, little is known about how they do it. In fact, their complexity
makes their mechanisms about as opaque as those of the brain. Here, unlike
previous research that generally treats DNNs as black boxes, we use rigorous psychophysical methods to better understand the representations
and mechanisms underlying the categorization behavior of DNNs. We
trained a state-of-the-art 10-layer ResNet to recognize 2,000 human identities generated from a 3D face model where we controlled age (25, 45, 65
years of age), emotional expression (happy, surprise, fear, disgust, anger,
sad, neutral), gender (male, female), 2 facial orientation axes X and Y (each
with 5 levels from -30 to +30 deg), vertical and horizontal illuminations
(each with 5 levels from -30 to +30), plus random scaling and translation of
the resulting 26,250,000 2D images (see S1). At training, we used two conditions of similarity of images (most similar; most different using subsets of
face generation parameters) to test generalization of identity across the full
set of face generation parameters. We found catastrophic (i.e. not graceful)
degradation of performance in the most different condition, particularly
when combining the factors of orientation and illumination. To understand the visual information the network learned to represent and identify
faces, we applied the Bubbles procedure (Gosselin & Schyns, 2001) at testing.
We found striking differences in the features that the network represents
compared with those typically used in humans. To our knowledge, this
is the first time that a rigorous psychophysical approach controlling the
dimensions of face variance is applied to better understand the behavior
and information coding of complex DNNs. Our results inform fundamental differences between categorization mechanisms and representations of
DNNs and the human brain.
Acknowledgement: PGS is funded by Wellcome Trust Investigator grant
107802/Z/15/Z and by MURI grant EP/N019261/1
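The image count reported in abstract 26.3019 can be checked by multiplying the factor levels. The sketch assumes that gender is a fixed property of each identity rather than a crossed factor; this is an inference from the arithmetic, not something the abstract states. All variable names are hypothetical.

```python
n_identities = 2000
n_ages = 3             # 25, 45, 65 years
n_expressions = 7      # happy, surprise, fear, disgust, anger, sad, neutral
n_orientation_x = 5    # 5 levels from -30 to +30 deg
n_orientation_y = 5
n_light_vertical = 5   # 5 levels from -30 to +30
n_light_horizontal = 5

# Assumption: gender is determined by identity, so it adds no factor here.
images_per_identity = (
    n_ages * n_expressions
    * n_orientation_x * n_orientation_y
    * n_light_vertical * n_light_horizontal
)
total_images = n_identities * images_per_identity
```

Under this assumption the product matches the 26,250,000 images given in the abstract; crossing gender with identity would double it.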
26.3020 Picturing Jonah Hill: memory-based image reconstruction
of facial identity Chi-Hsun Chang1([email protected]), Dan
Nemrodov1, Andy Lee1,2, Adrian Nestor1; 1Department of Psychology at
Scarborough, University of Toronto, Toronto, Ontario, Canada, 2Rotman
Research Institute, Baycrest Centre, Toronto, Ontario, Canada
Our memory for human faces has been studied extensively, especially
regarding the specific factors that influence face memorability. However, the detailed visual content of the representations underlying face
memory remains largely unclear. Additionally, the relationship between
face memory and face perception is not well understood given that these
two aspects of face processing are typically investigated independently.
Accordingly, the current work aimed to examine these issues by adopting
an imaging reconstruction approach to estimate the visual appearance of
face images from memory and perception-based behavioural data in neurologically healthy adults. Specifically, we used judgements of visual similarity between facial stimuli and between recollections of facial appearance
retrieved from memory to construct a joint perceptual-memory face space.
The structure of this hybrid representational space was then used to reconstruct the appearance of different facial identities corresponding to both
unfamiliar and familiar faces. Specifically, memory-based reconstructions
were carried out for newly-learned faces as well as for famous individuals (i.e., media personalities such as Jonah Hill). Our results demonstrated
robust levels of memory and perception-based reconstruction performance
for all face categories separately in every participant. In particular, we note
the ability to successfully reconstruct images corresponding to famous
individuals based on information retrieved from long-term memory. Thus,
the current work provides a new approach to probing the representational
content of face memory. Theoretically, it suggests that memory and perception share a common representational basis and, furthermore, from a
translational standpoint, it can open a new path for concrete applications
such as computer-based ‘sketch artists’.
26.3021 Large inversion effects are not specific to faces and do
not vary with object expertise Constantin Rezlescu1,2([email protected]),
Tirta Susilo3, Angus Chapman3, Alfonso Caramazza2; 1Institute of
Cognitive Neuroscience, University College London, 2Department of Psychology, Harvard University, 3School of Psychology, Victoria University
of Wellington
Visual object recognition is impaired when stimuli are shown upside-down. This phenomenon is known as the inversion effect, and a substantial
body of evidence suggests it is much larger for faces than non-face objects.
The large inversion effect for faces has been widely used as key evidence
that face processing is special, and hundreds of studies have used it as a
tool to investigate face-specific processes. Here we show that large inversion effects are not specific to faces. We tested two groups of participants,
over the web (n=63) and in the laboratory (n=57), with two car tasks tapping basic object recognition and within-class recognition. Both car tasks
generated large inversion effects which were identical to those produced by
parallel face tasks. For basic-level recognition, the car inversion effects were
28% (SD=12%) and 27% (SD=13%), while the face inversion effects were
28% (SD=13%) and 26% (SD=10%) (web and lab samples, respectively). For
within-class recognition, the car inversion effects were 23% (SD=9%) and
26% (SD=10%), while the face inversion effects were 25% (SD=13%) and
25% (SD=10%). Additional analyses showed that the car inversion effects
did not vary with car expertise. Our findings demonstrate that non-face
object recognition can depend on processes that are highly orientation-specific, challenging a critical behavioral marker of face-specific processes. We
suggest that, rather than being face-specific, inversion effects are the result
of a special type of processing engaged in recognition of exemplars from
complex but highly homogeneous sets of objects with a canonical orientation. Studies which claimed to measure face-specific mechanisms did not
control for this type of processing and so will need reexamination.
26.3022 Initial fixation to faces during gender identification is
optimized for natural statistics of expressions Yuliy Tsank1(yuliy.
[email protected]), Miguel Eckstein1; 1Psychological and Brain Sciences, University of California Santa Barbara
During face discrimination tasks, observers vary their initial fixation to
faces in order to maximize task performance. For gender discrimination
of faces with neutral expressions, the optimal point of fixation (OPF) is
just below the eyes and is predicted by an optimal Bayesian model that
takes into account the foveated nature of the visual system (foveated ideal
observer, FIO; Peterson & Eckstein, 2012). Here, we investigate human
OPFs for a gender task with an atypical relative frequency of facial expressions (50/50 - happy/neutral) that alters the theoretical OPF predicted by
the FIO. Methods: Five observers completed a gender discrimination task
with 40 faces (20/20 male/female, and 10 of each with a happy expression) in luminance noise and 15 deg in height. In one condition, observers
made free saccades in 3 blocks with presentation times of 350ms. In a second condition (forced-fixation), we assessed the human OPFs for the same
task. Observers randomly fixated 1 of 5 horizontally centered positions on
the face (forehead, eyes, nose, mouth, and an individual preferred fixation)
for 200ms. Results: The forced-fixation condition showed that the human
OPF is above the nose tip. However, human initial fixations in the free-saccade condition are suboptimal and directed to a higher point located below
the eyes. This is consistent with the OPF for a gender discrimination task
using the more frequent neutral-expression faces. Conclusions: Our findings suggest that observers optimize their initial fixation to faces, taking
into account the statistics of occurrence of facial expressions. This fixation
strategy seems to be inflexible to greatly altered facial expression statistics.
tion in a Deep Neural Network Tian Xu1([email protected]),
Face Perception: Neural mechanisms
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Banyan Breezeway
[email protected]), Katja Weibert1, Robin Kramer1, Kay Ritchie1, Mike
Burton1; 1Department of Psychology, University of York, York, UK
A full understanding of face recognition must involve identifying the
visual information that is used to discriminate different identities and how
this is represented in the brain. Previous behavioural studies have shown
that the recognition of familiar faces is primarily based on differences in
surface properties of the image. Our aim was to explore the importance of
surface properties and also shape properties in the neural representation of
familiar faces. We took a set of face images and measured variance in the
shape and surface properties using principal component analysis (PCA).
Face-selective regions (FFA, OFA and STS) were defined in an independent
localizer scan. We then showed participants a subset of the face images and
measured the resulting patterns of neural response using fMRI. Patterns of
response to pairs of images were compared to generate a similarity matrix
across all faces in each ROI. Corresponding similarity matrices for shape
and surface properties were then created by correlating the PCA vectors
across pairs of images. The similarity matrices for shape and surface properties were then used to predict the patterns of neural response in each
ROI. Patterns of response in the OFA could be predicted by both the shape
and surface properties of the face images. However, patterns of response
in the FFA and STS could only be predicted by the shape of the face image.
The dissociation between the selectivity for shape in the FFA and previous behavioural findings revealing a preeminent role of surface properties
in face recognition suggests that, although the FFA may play a role in the
recognition of facial identity, this region is not solely responsible for this
26.3024 Fast periodic visual stimulation reveals face familiarity
processing across image variability in the human adult brain Friederike Zimmermann1([email protected]), Bruno
Rossion1; 1University of Louvain (UCL), Belgium
Recognizing a familiar face across widely variable natural images is a fundamental ability for us humans (Burton & Jenkins, 2011). Yet, it is difficult
to capture this process reliably, without an explicit behavioural task. Here
we designed a fast and automatic approach that required both discriminating highly familiar from unfamiliar faces and generalizing across different
images of the same individual face. We recorded high-density electroencephalogram (EEG) at 6 Hz during brief 70 sec sequences of fast periodic
visual stimulation (FPVS). During stimulation, unfamiliar faces were presented at 6 Hz (base) with familiar (here famous) faces appearing as every
7th image at a stimulation rate of 0.86 Hz. Images showed faces across naturally occurring changes in view, expression or visual appearance. Throughout each sequence, base face identity varied at every stimulation cycle
while the 0.86 Hz periodic famous identity stayed the same. Stimulation
sequences were also shown at inverted orientation. Clear visual discrimination responses emerged to periodic presentations of the same familiar face
over occipito-temporal electrodes. Importantly, familiarity responses were
markedly reduced on inverted sequences, ruling out a low-level account.
These findings indicate that the human brain can implicitly discriminate
familiar from unfamiliar faces at a glance, and generalize across wide image
variability. Our data demonstrate that FPVS-EEG is a highly sensitive tool
to characterize rapid invariant familiar face recognition in the human brain.
26.3025 Compound facial threat cue perception: Contributions
of visual pathways by image size Troy Steiner1([email protected]
gmail.com), Robert Franklin Jr.2, Kestutis Kveraga3,4, Reginald Adams, Jr.1;
1Department of Psychology, The Pennsylvania State University, U.S.A.,
2College of Arts and Sciences, Anderson University, U.S.A., 3Athinoula A.
Martinos Center, Department of Radiology, Massachusetts General Hospital, U.S.A., 4Department of Radiology, Harvard Medical School, U.S.A.
Introduction: Relatively greater amygdalar response has been found to
rapidly presented fear faces when coupled with averted gaze (offering
clear signal of threat location), and to sustained presentations of ambiguous threat (direct gaze fear; Adams et al., 2012). To help explain these
results, the parvocellular (P) and magnocellular (M) pathways have been
implicated in the processing of ambiguous versus clear threat, respectively.
We tested another presentation parameter relevant to these visual pathways, manipulating stimulus size as a natural spatial filter (Smith &
Schyns, 2009). Methods: Twenty-nine (15 female) participants passively
viewed fearful faces in an ABA design alternating between averted and
direct gaze, for 16 blocks. Fourteen participants (8 female) were presented
small faces (visual angle of 3.9° by 5°) while fifteen participants (6 female)
were presented large faces (visual angle of 9.7° by 12.5°). Each block consisted of 16 trials; stimuli were presented for 300 ms then a 1200 ms fixation. Results: For small presentations favoring the M pathway, clear threat
(averted gaze fear) minus ambiguous threat (direct gaze fear) yielded
activation in many regions, including: right-amygdala, PMC, SMA, left-IFC, right-OFC, thalamus, insula, and left-TPJ. Ambiguous threat-gaze
minus clear threat yielded fewer areas of activation including the OFC,
right-mPFC, right-ITG, CG, and posterior cingulate. The pattern was in
many ways reversed for large stimuli: Ambiguous minus clear threat elicited activation in the amygdala, mPFC regions, right-STS, bilateral-IFG,
and right-insula. Clear minus ambiguous threat elicited fewer activations, including in the right-insula and SMA, left-caudate, cingulate, and
MT. Conclusion: Our findings with small versus large presentations of
clear versus ambiguous threat-gaze pairs yielded patterns of activation similar to those previously found for rapid versus sustained presentations of the
same threat-gaze pairs (Adams et al., 2012). These findings highlight that
presentation parameters yield highly variable effects, presumably due to
differential magnocellular versus parvocellular involvement.
Acknowledgement: This work was supported by grant # R01 MH101194 awarded
to KK and RBA, Jr.
26.3026 Population receptive field tuning in the human Fusiform
Face area Kelly Chang1([email protected]), Yiqin Shen1, Jason Webster1,
Geoffrey Boynton1, Yuichi Shoda1, Ione Fine1; 1University of Washington
Introduction. Population receptive field modeling (pRF; Dumoulin & Wandell, 2008) provides a powerful way of estimating the cumulative tuning
of the population of cells in a single voxel. PRF models have been used to
estimate spatial tuning properties in multiple cortical and subcortical visual
areas, auditory frequency tuning, attentional effects, and topographic organization for size/numerosity. Here we examine whether pRF modeling
would reveal systematic voxel-wise tuning preferences within the Fusiform Face area (FFA). The FFA is selectively responsive to face vs. non-face
stimuli, and individual voxels show preferences for individual faces. However, it is still debated whether neurons with similar tuning preferences
for identity are scattered or clustered (Dubois et al., 2015). Methods. Stimuli consisted of 19 distinct pairs of stereotypically Caucasian and African
American male faces. Each pair was morphed into 7 equal steps, creating a
total of 133 unique faces. Stimuli were histogram equalized for luminance.
Using fMRI, we presented subjects with the full stimulus set in each run,
and we collected at least 6 runs from each subject (n = 3). For each subject,
we used a modified pRF model to estimate each voxel’s tuning preference
along our morph sequence within functionally defined FFA. Results. All
voxels in the functionally defined FFA showed a strong baseline response
to all faces. However, for all subjects and hemispheres, voxels also showed
replicable (across sessions) tuning along the dimension of race. At a threshold of r > 0.2 (corresponding to a 1.25% false discovery rate) 58% (averaged
across subjects) of voxels in the left FFA and 62% of voxels in the right FFA
showed featural selectivity. Tuning preferences varied smoothly across the
cortical surface of the FFA. It remains to be seen whether these topologies
show inter-subject consistency or are unique across individuals.
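As a rough illustration of the kind of voxel-wise tuning estimate described in abstract 26.3026, the sketch below fits a one-dimensional Gaussian tuning curve along a 7-step morph axis by grid search. The abstract does not specify its "modified pRF model", so the Gaussian form, the grid of candidate centres and widths, and all function names are assumptions made for illustration. Only the stimulus count (19 pairs x 7 steps = 133 faces) and the r > 0.2 threshold come from the text.

```python
import math

# Stimulus-set arithmetic taken from the abstract: 19 pairs x 7 morph steps.
N_PAIRS = 19
N_STEPS = 7
N_FACES = N_PAIRS * N_STEPS  # 133 unique faces

R_THRESHOLD = 0.2  # reliability threshold quoted in the abstract


def gaussian_tuning(steps, mu, sigma):
    """Predicted response at each morph step for a Gaussian centred on mu.

    Hypothetical tuning model; the abstract does not give the actual form.
    """
    return [math.exp(-((s - mu) ** 2) / (2.0 * sigma ** 2)) for s in steps]


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def fit_voxel(steps, responses, mus, sigmas):
    """Grid-search the (mu, sigma) whose prediction best correlates with the data.

    Correlation is insensitive to baseline and positive gain, so a voxel with
    a strong response to all faces can still show a tuning preference.
    Returns (best_r, best_mu, best_sigma).
    """
    best = None
    for mu in mus:
        for sigma in sigmas:
            r = pearson_r(gaussian_tuning(steps, mu, sigma), responses)
            if best is None or r > best[0]:
                best = (r, mu, sigma)
    return best


if __name__ == "__main__":
    steps = list(range(1, N_STEPS + 1))
    # Simulated voxel: large baseline plus a modest preference for step 5.
    voxel = [2.0 + 0.5 * g for g in gaussian_tuning(steps, 5, 1.5)]
    r, mu, sigma = fit_voxel(steps, voxel, steps, [0.5, 1.0, 1.5, 2.0, 3.0])
    print(N_FACES, mu, sigma, r > R_THRESHOLD)
```

In this toy setup a voxel counts as tuned when its best-fitting correlation exceeds the threshold. A real analysis would also model the hemodynamic response and cross-validate across sessions.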
26.3027 Mapping Spatial Preferences in Face and Object Patches
in the Rhesus Macaque Using fMRI Caleb Sponheim1, Adam
Messinger1, Leslie Ungerleider1; 1Section on Neurocircuitry, Lab of Brain
and Cognition, National Institute of Mental Health
The dorsal visual stream of the brain primarily processes visual spatial
information, whereas the ventral visual stream processes the shape, category, and identity of visual objects. In primates, the ventral visual stream
contains regions that preferentially respond to visual categories, such as
faces and objects. However, even in the ventral visual stream, neural activity can be modulated by the location of an object in the visual field. Early
visual areas are retinotopically organized, with many responding preferentially or exclusively to stimuli in the contralateral visual field. Early visual
Acknowledgement: NIH Intramural Research Program
26.3028 Face repetition probability does not affect repetition
suppression in macaque middle lateral face patch. Kasper
Vinken1,2([email protected]), Hans Op de Beeck2, Rufin
Vogels1; 1Laboratory for Neuro- and Psychophysiology, KU Leuven, 2Laboratory of Biological Psychology, KU Leuven
It has been proposed that repetition suppression (i.e. a reduced neural
activity when stimuli are repeated) results from a fulfilled expectation of
repetition or a reduced prediction error (Summerfield et al., 2008). This
implies that repetition suppression should increase when a repetition
is expected and decrease when it is unexpected. While this prediction is
supported by human functional imaging (fMRI) studies (e.g. Summerfield
et al., 2008), no evidence was found in macaque inferior temporal cortex
(IT) spiking activity (Kaliukhovich & Vogels, 2011). Here, we tested three
possible explanations for this discrepancy by recording spiking activity in
macaque IT. First, we performed recordings in face patch ML instead of
outside of a face-selective region. Second, we used faces as stimuli. Third,
we required our monkeys to perform a task instead of passive fixation. In
two experiments, we manipulated the probability of a face repetition (75%
or 25%) between blocks of 40 trials. A trial consisted of two face presentations that were either a repetition (same face identity) or an alternation,
followed by a saccade response by the monkey to receive a reward. In a first
experiment, we included target trials where one face was inverted. The task
(did the trial contain an inverted face?) was orthogonal to the manipulation
of repetition probability. We observed repetition suppression, but there
was no effect of repetition probability on its magnitude (face-selective cells
in 2 monkeys). In a further experiment, we made a face repetition relevant
to the task (i.e. was the face repeated?). There was a clear performance bias
dependent on the probability of a face repetition, but again no effect on
the responses (face-selective cells in 1 monkey). In conclusion, regardless
of whether a face repetition is explicitly relevant for the monkey, we see no
evidence of a prediction error response in ML.
26.3029 The superior temporal sulcus is causally connected to the
amygdala: A combined TBS-fMRI study David Pitcher1([email protected]
york.ac.uk), Shruti Japee2, Lionel Rauth2, Leslie Ungerleider2; 1Department of Psychology, University of York, UK, 2National Institute of Mental
Health, USA
Non-human primate neuroanatomical studies have identified a cortical
pathway from the superior temporal sulcus (STS) projecting into dorsal sub-regions of the amygdala, but whether this same pathway exists
in humans is unknown. Here, we addressed this question by combining
thetaburst transcranial magnetic stimulation (TBS) with functional magnetic resonance imaging (fMRI) to test the prediction that the STS and
amygdala are functionally connected during face perception. Human participants (N=17) were scanned, over two sessions, while viewing 3-second
video clips of moving faces, bodies and objects. During these sessions, TBS
was delivered over the face-selective right posterior STS (rpSTS) or over the
vertex control site. A region-of-interest analysis revealed results consistent
with our hypothesis. Namely, TBS delivered over the rpSTS reduced the
neural response to faces (but not to bodies or objects) in the rpSTS, right
anterior STS (raSTS) and right amygdala, compared to TBS delivered over
the vertex. By contrast, TBS delivered over the rpSTS did not significantly
reduce the neural response to faces in the right fusiform face area (rFFA)
or right occipital face area (rOFA). This pattern of results is consistent with
the existence of a cortico-amygdala pathway in humans for processing face
information projecting from the rpSTS, via the raSTS, into the amygdala.
This conclusion is consistent with non-human primate neuroanatomy and
with existing face perception models. References Pitcher, D., Japee, S.,
Rauth, L., & Ungerleider, L. G. (In Press). The superior temporal sulcus is
causally connected to the amygdala: A combined TBS-fMRI study. Journal
of Neuroscience.
Acknowledgement: NIH Intramural
26.3030 A combined fMRI-MEG investigation of face information
processing in the occipito-temporal cortex Xiaoxu Fan1,2([email protected]), Hanyu Shao1, Fan Wang1, Sheng He1,2; 1Institute of
Biophysics, CAS, 2Department of psychology, University of Minnesota
The processing of face information involves a distributed functional network of face sensitive areas in the occipitotemporal cortex and beyond.
However, we do not yet have a comprehensive understanding of the temporal dynamics of these key regions and their interactions. In this study, we
investigated the spatio-temporal properties of face processing in the occipitotemporal cortex using fMRI and Magnetoencephalography (MEG). Subjects viewed faces and objects during the experiment. Each subject’s
face-selective regions were localized with fMRI by contrasting responses to faces
and objects. Their MEG data were analyzed with a beamformer method,
which uses an inverse model to obtain MEG source signals. The
results show that face-selective areas identified by MEG are highly consistent with those localized by fMRI. More importantly, MEG signals at the
various sources reveal an intricate dynamic picture of these hierarchical
face sensitive areas. Specifically, the face-selective signal at right Occipital
Face Area (rOFA) reaches its peak at around 110 ms, and would last longer if
face components were rearranged, preventing the perception of a holistic
face. Face information then engages the right posterior Fusiform Face Area
(pFFA) and the anterior FFA (aFFA) at about 120 ms and 130 ms, respectively.
Activity in the left fusiform gyrus peaks slightly later than the rFFA, at
around 140 ms. Subsequently, a region in the inferior temporal gyrus just
lateral to the rFFA is activated, with somewhat more sustained signal and
peaking at about 155 ms, with a second peak at around 210 ms. The right
posterior Superior Temporal Sulcus (pSTS), presumably more sensitive to
dynamic facial properties, reaches peak activity at about 170 ms. Overall,
while many studies have proposed the N170 as a key electrophysiological
index for face processing, our source-localized MEG data suggest a significantly earlier engagement of the core ventral face-selective areas.
Acknowledgement: XDB02050001, NSFC81123002
26.3031 Differential visual pathway contributions to compound
facial threat cue processing Cody Cushing1([email protected]
edu), Reginald Adams, Jr.2, Hee Yeon Im1,3, Noreen Ward1, Kestutis Kveraga1,3; 1Athinoula A. Martinos Center, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, U.S.A., 2Department of
Psychology, The Pennsylvania State University, State College, PA, U.S.A.,
3Department of Radiology, Harvard Medical School, Boston, MA, U.S.A.
Facial expression can be a threat cue whose meaning depends on the direction of eye gaze. For example, fear combined with averted eye gaze clearly
signals threat and its location, while a direct gaze leaves the location ambiguous (Adams et al., 2012). Processing of clear and ambiguous threat cues is
thought to differentially involve the major visual pathways: magnocellular
(M) pathway for rapid processing of clear threat and parvocellular (P) pathway for slower processing of threat ambiguity (Adams et al., 2012, Kveraga, 2014). Here we sought to characterize neurodynamics while perceiving
threat from faces that were biased towards M or P pathways, or unbiased
two-tone images. Participants (N=58) viewed a series of direct and averted
gaze fearful and neutral displays, each for 1s. We extracted source-localized
MEG activity and computed direction of information flow via phase-slope
index (PSI) analysis in select regions of the face-processing network: primary visual cortex (V1), fusiform face area (FFA), periamygdaloid cortex
(PAC), posterior superior temporal sulcus (pSTS), and orbitofrontal cortex
(OFC). We found that early activity in V1 led activity in FFA (p=0.001) and
in PAC (p=0.007) for P-biased compared to M-biased faces. In fear > neutral
faces, activity in left PAC and FFA was modulated by pathway, with PAC
leading for M-biased and FFA leading for P-biased fear faces (p=0.03). With
areas can also exhibit a preference for the upper or lower visual field. It is
unclear how spatial preferences are retained as information travels down
the ventral visual stream. In particular, it is not known whether shape
selective areas in the inferior temporal cortex show spatial preferences.
To assess the retinotopic dependence of face-selective areas, we measured
fMRI responses in two rhesus macaques to the presentation of static monkey faces (and objects) in four quadrants of the visual field during a central
fixation task. We evaluated responses in six face-selective areas, many of
them in the superior temporal sulcus, and found they all exhibited a similar retinotopic pattern of activation. Activation was greater when complex
objects were presented in the contralateral quadrants than in the ipsilateral quadrants of the visual field. Activation was also greater when faces
were presented in the lower visual field quadrants than in the upper quadrants. The results suggest that visual field location information is retained
throughout the ventral visual stream, and affects the processing of complex
shape stimuli such as faces and objects. The preferences also suggest that
a face in the lower hemifield of the visual field may be assessed and recognized more consistently than one in the upper half of the visual field.
increased threat ambiguity, OFC led activity in pSTS later in the trial for
both unbiased faces (p=0.005) and for M/P-biased faces (p=0.003), suggesting reflective processing to interpret the meaning of the cue. These results
suggest an early role of feedforward processing for parvocellular inputs,
evidenced by its flow from V1 and pSTS, and indicate feedback based from
magnocellular inputs, as evidenced by its flow from PAC to FFA. Our findings describe the dynamics of information flow for M/P contributions to
emotional face perception.
Acknowledgement: R01 MH101194
26.3032 Neurodynamics of reading crowd emotion: Independent visual pathways and hemispheric contributions Hee Yeon
Im1,2([email protected]), Cody Cushing1, Daniel Albohn3, Troy
Steiner3, Noreen Ward1, Reginald Adams, Jr.3, Kestutis Kveraga1,2; 1Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General
Hospital, Charlestown, MA, USA, 2Department of Radiology, Harvard
Medical School, Boston, MA, USA, 3Department of Psychology, The Pennsylvania State University, State College, PA, USA
Introduction: The visual system exploits redundancies by extracting summary statistics from groups of similar items. In social situations, extracting
average emotion from crowds of faces helps us to avoid potential threats
(e.g., mob violence or panic). We conducted fMRI, MEG, and behavioral
experiments to investigate contributions of magnocellular (M) and parvocellular (P) visual pathways, and of hemispheric lateralization in reading
of crowd emotion. Methods: Participants in fMRI (N=32), MEG (N=38),
and behavioral (N=36) experiments were presented bilaterally with either
arrays of faces or single faces with varying emotional expressions. Participants performed a 2AFC task, choosing which of the two facial crowds or single faces to avoid. In the behavioral experiment, the original stimuli were
converted to M-biased (low-luminance contrast) or P-biased (isoluminant
chromatic) stimuli to isolate visual field contributions. Results: fMRI and
MEG results revealed that reading crowd emotion evoked highly lateralized activations along the dorsal stream, including the prefrontal and parietal cortex. Conversely, individual emotion processing activated the ventral
stream, including the FFA. MEG activity in the precuneus (dorsal stream)
differentially increased for facial crowds from 180 ms after stimulus onset,
indicating early engagement of M-pathway, whereas the FFA (ventral
stream) showed higher activation for individual faces 200 ms after stimulus
onset. Furthermore, we found goal-dependent hemispheric asymmetry for
crowd emotion processing: the right hemisphere was superior for task-congruent (e.g., angry crowd to avoid) decisions and the left hemisphere for
processing task-incongruent cues (e.g., happy crowd). However, the right
hemisphere was superior for individual emotion processing regardless
of task demands. Finally, in the behavioral task, we found crowd emotion extraction to be more accurate from M- than P-stimuli, and observed
goal-dependent hemispheric asymmetry only for M-stimuli. Conclusion:
Unlike individual emotion processing, reading crowd emotion is predominantly carried out by the M/dorsal stream, with right-hemisphere superiority for task-congruent decisions.
Acknowledgement: R01 MH101194
26.3033 Spatiotemporal dynamics of view-sensitive and view-invariant
face identity processing Charles C.-F. Or1,2([email protected]
sg), Joan Liu-Shuang2, Bruno Rossion2; 1Division of Psychology, School of
Humanities & Social Sciences, Nanyang Technological University, Singapore, 2Psychological Sciences Research Institute & Institute of Neuroscience, University of Louvain, Belgium
The ability to extract the identity of faces across substantial variations in
angular head orientation is critical for face recognition, yet the underlying neural mechanism is not well understood. Using a validated paradigm
with fast periodic visual stimulation in electroencephalography (EEG; Liu-Shuang, Norcia, & Rossion, 2014, Neuropsychologia), we investigated the
tuning function of face identity perception in 20 observers across 7 ranges
of viewpoint variations: 0° (no change), ±15°, ±30°, ±45°, ±60°, ±75°, ±90°.
In each 60-s stimulation sequence, images of one single face identity, randomly chosen from our stimulus set, were displayed successively at a rapid
rate of F = 6 Hz (6 images/s), interleaved with different face identities at
fixed intervals of every 7th face (F/7 Hz = 0.86 Hz). Critically, at every stimulation cycle, faces varied randomly both in viewpoint within a predefined
range (e.g. in the ±45° condition, faces were shown between -45° and +45°
in steps of 5°) and in size between 80% and 120%. Periodic EEG responses
at 6 Hz captured general visual processing of the face stimuli, while those
at 0.86 Hz and harmonics captured face individualisation. All observers
showed significant face individualisation responses, peaking over bilateral occipito-temporal regions. These responses decreased linearly with
increasing viewpoint variations (responses decreased by > 50% between
0° and ±90° conditions), suggesting reduced face identity discrimination.
Analysing the face individualisation response in the time-domain revealed
a dissociation between an early (~200–300 ms) view-sensitive response and
a later (~300–600 ms) view-invariant response, both peaking over the same
bilateral occipito-temporal regions. These findings suggest two separate
view-based face recognition processes, where an initial reduced ability to
discriminate face identities due to viewpoint variations is complemented
partly by a later, high-level view-invariant process.
Acknowledgement: This work was supported by the Postdoctoral Researcher
Fellowship from the National Fund for Scientific Research to C.O. and J.L., and
Grant facessvep 284025 from the European Research Council to B.R.
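The frequency-tagging arithmetic in abstract 26.3033 (base rate F = 6 Hz, identity change every 7th image, hence F/7 = 0.86 Hz) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names are hypothetical, the single-bin Fourier estimate stands in for a full EEG spectrum, and skipping oddball harmonics that coincide with the base rate is a common convention assumed here rather than something stated in the abstract.

```python
import math


def oddball_frequency(base_hz, every_nth):
    """Frequency of the periodic change when it occurs at every n-th image."""
    return base_hz / every_nth


def oddball_harmonics(base_hz, every_nth, max_hz):
    """Harmonics of the oddball frequency up to max_hz.

    Harmonics that fall on multiples of the base rate are skipped (assumed
    convention), since they mix with the general visual response.
    """
    harmonics = []
    k = 1
    while k * base_hz / every_nth <= max_hz + 1e-9:
        if k % every_nth != 0:
            harmonics.append(k * base_hz / every_nth)
        k += 1
    return harmonics


def amplitude_at(signal, fs, freq_hz):
    """Amplitude of a sinusoidal component at freq_hz (single-bin DFT).

    Exact when the recording holds a whole number of cycles of freq_hz.
    """
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / fs) for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


if __name__ == "__main__":
    base = 6.0
    odd = oddball_frequency(base, 7)  # about 0.857 Hz
    fs, duration = 100.0, 7.0         # 7 s holds 6 oddball and 42 base cycles
    n = int(fs * duration)
    # Synthetic "EEG": strong base response plus a weaker oddball response.
    eeg = [math.sin(2 * math.pi * base * i / fs)
           + 0.3 * math.sin(2 * math.pi * odd * i / fs) for i in range(n)]
    print(round(odd, 2), oddball_harmonics(base, 7, base))
    print(amplitude_at(eeg, fs, base), amplitude_at(eeg, fs, odd))
```

The same arithmetic applies to the other FPVS designs in this session, for example 12 Hz stimulation with a face at every 8th image, which gives 1.5 Hz.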
26.3034 The Spatiotemporal Neural Dynamics of the Processing
of Infant Faces. Lawrence Symons1([email protected]), Kelly
Jantzen1, Amanda Hahn2, Taylor Kredel1, Benjamin Ratcliff1, Nikal Toor1,
McNeel Jantzen1; 1Department of Psychology, Western Washington University, 2Department of Psychology, Humboldt State University
Substantial evidence indicates that both infant faces and attractive adult
faces are associated with stronger activity across an extended face processing network that critically includes the orbitofrontal cortex and the
fusiform gyrus. In this study we used electroencephalography (EEG) to
investigate the spatiotemporal brain dynamics of face processing to better
understand the degree to which specialized processing occurs for infant
faces. EEG was acquired while participants viewed infant faces and adult
faces of the same or opposite sex. All faces were digitally manipulated to
have high and low aesthetic versions. Source analysis of the event related
potentials revealed activity across a broad face processing network. The
most significant increases occurred for infant faces regardless of cuteness.
Early increases were observed at the time of the N170 in the orbitofrontal
cortex, the inferior occipital gyrus and the fusiform gyrus. Later increases
were observed between 300 and 500 milliseconds in the anterior cingulate,
the superior temporal sulcus and the precuneus. Attractiveness resulted in
only a modest change in neural activity. The results of this experiment suggest that infant faces undergo specialized processing that does not simply
reflect their perceived cuteness.
26.3035 Temporal dynamics of the core and extended face perception
system with fMRI Silvia Ubaldi1([email protected]), Aidas
Aglinskas1, Elisa Fait1, Scott Fairhall1; 1Center for Mind/Brain Sciences,
University of Trento
The extensively studied core and extended network for perceiving and
knowing about others provides for a critical aspect of human cognition.
Here, to gain a systems-level understanding of hierarchical organization
and interregional coordination, we apply a novel approach to assess temporal dynamics with fMRI. To determine the differential temporal-tuning of
cortical regions, we cognitively overload the system using the rapid-serial
presentation of faces (N=35 participants). Famous faces and buildings were
presented at twelve different ISIs ranging from 100 to 1200 msec. Contrasting faces with buildings revealed the core system of face perception (OFA,
FFA, pSTS), along with the extended system areas (precuneus, mPFC, IFG,
ATL, amygdalae). Beta-values for each ISI were extracted to determine
differential regional temporal-tuning across the system. Neural activity in
OFA and FFA was maximal at ISI 300 msec, offsetting around 400 msec
in OFA and at 500 msec in FFA. A similar temporal-tuning profile in IFG
is consistent with top-down/perceptual coordination. In extended system
areas (precuneus, ATL, mPFC), activity was effectively gated at faster ISIs,
coming online only in a second wave of activations peaking at 500 msec.
A more complex profile was observed in pSTS. Investigating cortical temporal-tuning provides novel insight into the systems-level organization of the
network. In particular, convergent temporal profiles between IFG and core
regions suggest a role of the IFG in top-down amplification of relevant/
perceptual signals. We also observed a broad two-stage activation of the
system, with early-perceptual processes transitioning to a near simultaneous second wave of activation across the extended system. Interestingly the
amygdala, a potentially rapidly activated limbic region, grouped with these
latter areas. The application of cognitive-masking to determine regional
temporal tuning provides novel insight into the function of face-perception
system while simultaneously providing a bridge between fMRI and magneto/encephalographic techniques.
26.3036 Attention modulation of rapid face identity discrimination
Xiaoqian Yan1([email protected]), Joan Liu-Shuang2, Bruno Rossion3;
1Institute of research in Psychology, University of Louvain, 2Institute
of research in Psychology, University of Louvain, 3Institute of research in
Psychology, University of Louvain
The human face carries prominent biological and social meaning and is
detected quickly and automatically (as early as after 100 ms of visual presentation).
By contrast, it has been suggested that face identity processing
depends on selective attention (Palermo & Rhodes, 2007). Our study used
a fast periodic visual stimulation (FPVS) approach to examine the effect of
selective attention on face identity discrimination at a glance. We recorded
128-channel EEG while participants viewed 70-s sequences of female faces
shown at 6 Hz. Within each sequence, a randomly selected identity was
repeated (A) with different female face identities (B, C…) embedded every
7th image (AAAAAABAAAAAAC…). Responses at 6 Hz reflect common
visual processing of all stimuli, while responses at 0.857 Hz (i.e., 6 Hz/7)
reflect face identity discrimination (Liu-Shuang et al., 2014). Participants
performed two tasks: (1) on Attend Fixation trials, participants monitored
the central fixation cross for color changes (7 targets); (2) on Attend Face
trials, participants responded to male faces which randomly replaced a
female face identity change (7 targets). Although there were robust face
discrimination responses in the orthogonal task as shown previously, attending
to face gender increased responses on all electrodes, including the bilateral
occipito-temporal regions. This effect does not appear to stem from a
general increase in attention as behavioral performance and 6-Hz common
visual responses did not differ between conditions. Thus, it appears that
selective attention can modulate face identity discrimination in a rapid
visual stream, but is not mandatory.
Acknowledgement: FNRS
26.3037 Neural Correlates of Dynamic Face Perception Huseyin
Ozkan1([email protected]), Sharon Gilad-Gutnick1, Evan Ehrenberg1,
Pawan Sinha1; 1Brain and Cognitive Sciences, Massachusetts Institute of
Past research on the electrophysiology of face perception has focused
almost exclusively on brain responses to artificial stimuli that are transient
and static. Therefore, our knowledge of the electrophysiological correlates
of face perception is rudimentary, consisting mostly of averaged ERP
responses in the first 200 ms after stimulus onset, and lacking virtually
any description of how our brain may respond to more naturally occurring
dynamic faces. Our goal was to characterize the neural correlates of
naturally occurring dynamic faces over a more sustained presentation time
(500 ms). To this end, we recorded magnetoencephalography responses to
both dynamic and static face and non-face stimuli and used traditional
ERF component analysis to compare our results to the M100 and M170 face
responses, as well as machine learning techniques to reveal other
representations of viewing a dynamic face. In our ERF analyses, we observe that the
dynamic-face induced ERFs have larger M100 and M170 responses (M170
is ~40 ms earlier) compared to the static-face ERFs. In our classification
analyses, the face vs non-face classification performance is shown to constantly
improve as a larger time window is used, until 500 ms, yielding ~80% accuracy
at 500 ms for both dynamic and static stimuli. Hence, the information
of face-ness is not specific to a time interval but rather distributed (more
widely in the case of dynamic stimuli) in the full temporal content. Finally,
this strong face selectivity is achieved at the sensors that probe the temporal
lobes for dynamic stimuli, and the occipital lobes for static stimuli.
Overall, our results both provide new correlates of dynamic face perception
and emphasize the critical information that lies in looking at sustained
responses rather than the traditional transient responses to static faces.
Acknowledgement: This work was supported in part by The National Eye
Institute (NIH) and The Scientific and Technological Research Council of Turkey
26.3038 Coarse to fine human face detection in a dynamic visual
scene Joan Liu-Shuang1([email protected]), Genevieve Quek1,
Valérie Goffaux1, Bruno Rossion1; 1University of Louvain, Belgium
Human observers detect faces in the visual environment extremely rapidly
and automatically. Yet how basic units of visual information processing,
i.e. spatial frequencies (SF), play a role in this remarkable ability remains
unexplored. We shed light on this fundamental issue by estimating the
minimal and optimal amount of SF content required for fast face detection.
Stimulation sequences composed of naturalistic and highly variable
images of faces and objects were presented with parametrically increasing
SF content (0.50 to 128 cycles-per-image or cpi across 14 SF steps, 4 s/step),
such that initially blurry images gradually sharpened over the course of a
56-s sequence. Stimuli were shown rapidly at 12 Hz (83-ms SOA), thereby
constraining perception to a single glance. A No Face condition consisted
of randomly presented object images, while in the critical Face condition,
face images were interleaved among objects every 8th image
(OOOOOOOFOOOOOOOFOO…) at a frequency of 1.5 Hz (667-ms SOA).
Electroencephalographic (EEG) responses at 1.5 Hz (and harmonics) reflect face
detection (i.e. differential perception of faces vs. objects) while responses
at 12 Hz (and harmonics) reflect visual processing common to objects and
faces (Retter & Rossion, 2016, Neuropsychologia). Participants responded
the moment they could perceive faces. All 16 participants detected faces
at around 6.46 cpi and showed significant face-selective responses located
over (right) occipito-temporal regions in the Face condition only. Critically,
this face-selective response emerged at around 4.22 cpi (≈1.69 cycles-per-face
or cpf) and steadily increased until 23.24 cpi (≈9.30 cpf). Beyond 23.24
cpi, face-selective responses were equivalent to responses to full-spectrum
(unfiltered) faces both in amplitude and spatio-temporal dynamics. In summary,
neural face detection emerges with extremely coarse SF information
(before explicit behavioural response) but continues to integrate SF content
until a relatively fine level of image detail, thereby demonstrating the
relevance of higher SF in face detection.
Acknowledgement: J.L.S. is supported by a FNRS Post-Doc grant (FC91608),
G.Q. is supported by the University of Louvain & the Marie Curie Actions of the
European Commission (F211800012), V.G. is a FNRS-FRS Research Associate,
& B.R. is supported by an ERC grant (facessvep)
26.3039 Task-modulated integration of facial features in the
brain Simon Faghel-Soubeyrand1([email protected]),
Frédéric Gosselin1; 1Département de Psychologie, Université de Montréal
The presence of intermodulation frequencies (IM) in an EEG frequency-tagging
paradigm indicates non-linear integration of multiple tagged
visual features by the brain (Norcia et al., 2015). Despite its growing use in
high-level vision, the efficiency of IM as an index of non-linear processing
remains unclear, mostly because the importance of the non-linear integration
for the task is typically unknown. We assessed the efficiency of IM
using a realistic face processing task which we know implements a simple
XOR non-linear function: wink detection. On each trial, EEG activity was
recorded while each feature of a face flickered at a specific frequency (e.g.
left eye: 6 Hz, right eye: 3 Hz and mouth: 8 Hz). Subjects had to fixate a
central cross and detect winks (one eye closed rather than no eyes/both
eyes closed) in the non-linear condition, and the closing of one of the two
eyes in the linear condition. Comparisons of brain responses between tasks
during identical visual stimulations revealed that left/right-eye tagged
IM, the neural response imputable to the non-linear integration of both
features, were stronger in occipito-temporal electrodes when this particular
feature integration was useful for the task at hand (i.e. wink condition,
F(1,362) = 13.98, p < .001). The magnitude of the eye-pair IM was also associated
with faster response time (RT) in the non-linear wink detection condition
(r = -.73, p < .05), but not in the linear control task (r = -.10, p > .70).
Conversely, the magnitude of the mouth tagged neural responses (unrelated to
both tasks) was associated with longer RT in both conditions (r1 = .67, p1 <
.05; r2 = .85, p2 < .05), most likely reflecting a distractor effect. While the
magnitude of feature frequency-tags clearly outweighed that of IM (average
SNRs were ~15 and ~1.75, respectively), the present results clearly demonstrate
that IM can be an effective neural correlate of non-linear visual integration processing.
Acknowledgement: Natural Sciences and Engineering Research Council of
26.3040 Characteristics of face adaptation revealed by EEG Owen
Gwinn1([email protected]), Talia Retter1,2, Sean O’Neil1, Michael Webster1;
1Department of Psychology, Center for Integrative Neuroscience, University of Nevada, Reno, USA, 2Psychological Sciences Research Institute,
Institute of Neuroscience, University of Louvain, Belgium
Recent work has demonstrated robust face adaptation in neural responses
monitored with electroencephalography and frequency tagging (Retter
& Rossion, 2016). We examined factors controlling this adaptation and
whether they exhibit similar properties to the face aftereffects measured
behaviorally. An average female face was contracted or expanded along the
horizontal or vertical axis to form four images. Observers viewed a 20-sec
sequence of the faces presented at a rate of 6 Hz, while responses were
recorded with high-density EEG. This resulted in a 6 Hz signal over occipital channels, indicating that responses to each of the four distortions were
equal. This sequence was repeated after 20-sec adaptation to alternations
between two of the faces (e.g. horizontal contracted and expanded), with
the logic that a selective response change to the adapting faces should lead
to asymmetric responses during the test phase and a signal at 3 Hz. This pair
has the same mean (undistorted) as the test sequence and thus should not
bias responses driven only by the mean. However, adaptation instead produced a 3 Hz response that was present over right occipito-temporal sites,
consistent with selective adaptation to the distortion axis. Similar biases
were found when the adapting distortions were twice the magnitude of test
distortions, or when adapting to a single novel distortion (e.g. expanded
both horizontal and vertical) that was not part of the test sequence. These
effects argue against the alternative that the neural responses are driven
by prior exposure to the same image or face during adaptation and test.
Instead, the neural aftereffects appear to reflect response changes induced
by both the mean distortion and the contrast (variance) of the distortions.
While adaptation to the mean parallels perception, the neural adaptation
to variance appears stronger and may reflect processes distinct from those
underlying the perceived aftereffects.
Acknowledgement: EY10834, P20 GM103650, FNRS FC7159
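The frequency-tagging arithmetic used in this and the preceding abstracts can be made explicit with a short sketch. It is illustrative only: the function names `oddball_frequency` and `intermodulation_terms` are hypothetical, the 6 Hz and 8 Hz pair is simply borrowed from the example tagging frequencies quoted in abstract 26.3039, and the reading of the 3 Hz signal as a two-image alternation is an assumption rather than something the authors state.

```python
def oddball_frequency(base_hz, every_nth):
    """Frequency at which every n-th image of a periodic stream recurs."""
    return base_hz / every_nth


def intermodulation_terms(f1, f2, max_order=1):
    """Sorted positive frequencies |n*f1 + m*f2| for nonzero integers n, m
    with |n|, |m| <= max_order (low-order intermodulation terms)."""
    terms = set()
    for n in range(-max_order, max_order + 1):
        for m in range(-max_order, max_order + 1):
            if n == 0 or m == 0:
                continue  # pure harmonics of a single tag are not intermodulation
            freq = abs(n * f1 + m * f2)
            if freq > 0:
                terms.add(round(freq, 6))
    return sorted(terms)


# 6 Hz stream with an identity change every 7th image (26.3036).
identity_change_hz = oddball_frequency(6.0, 7)
# 12 Hz stream with a face every 8th image (26.3038).
face_among_objects_hz = oddball_frequency(12.0, 8)
# Assumption: a response asymmetry that repeats every 2nd image of a 6 Hz stream.
alternation_hz = oddball_frequency(6.0, 2)
# Sum and difference terms for two hypothetical tags at 6 Hz and 8 Hz.
im_terms = intermodulation_terms(6.0, 8.0)
```

Raising `max_order` adds higher-order combinations such as 2*f1 - f2, which is where additional intermodulation components would be expected to appear.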
26.3041 Representational confusion: the possible consequence of
demeaning your data Fernando Ramírez1,2([email protected]), Carsten Allefeld1,2, John-Dylan Haynes1,2; 1Bernstein Center for
Computational Neuroscience Berlin, Charité – Universitätsmedizin Berlin,
Germany, 2Berlin Center for Advanced Neuroimaging, Charité – Universitätsmedizin, Berlin, Germany
The increased sensitivity afforded by multivariate pattern analysis methods (MVPA) has led to their widespread application in neuroscience.
Recently, similarity-based multivariate methods seeking not only to detect
information regarding a dimension of interest, say, an object’s rotational
angle, but to describe the underlying representational structure, have flourished under the name of Representational Similarity Analysis (RSA). However, data pre-processing steps implemented before conducting RSA can
significantly change the correlation (and covariance) structure of the data,
hence possibly leading to representational confusion—i.e., concluding that
brain area X encodes information according to representational scheme A,
and not B, when the opposite is true. Here, we demonstrate with computer
simulations and empirical fMRI-data that time series demeaning (including z-scoring) can lead to representational confusion. Further, we expose
a complex interaction between the effects of data demeaning and how the
brain happens to encode information—usually the question under study—
hence incurring a form of circularity. These findings should foster reflection
on implicit assumptions bearing on the interpretation of MVPA and RSA,
and awareness of the possible impact of data demeaning on inferences
regarding representational structure and neural coding.
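A toy calculation illustrates how demeaning alone can reshape a correlation structure. This is a minimal sketch, not the authors' simulation: the two four-voxel patterns are invented numbers, `pearson` and `demean_across_conditions` are hypothetical helper names, and subtracting each voxel's mean across conditions stands in for the time-series demeaning discussed above.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def demean_across_conditions(patterns):
    """Subtract, for every voxel, its mean over all condition patterns."""
    n_cond = len(patterns)
    voxel_means = [sum(col) / n_cond for col in zip(*patterns)]
    return [[v - m for v, m in zip(p, voxel_means)] for p in patterns]


# Invented response patterns (one value per voxel) for two conditions.
cond_a = [5.0, 6.0, 7.0, 8.5]
cond_b = [5.5, 6.5, 6.8, 8.0]

# Raw correlation is dominated by the voxel offsets the conditions share.
r_raw = pearson(cond_a, cond_b)

# After demeaning, only the difference between the conditions is left.
dm_a, dm_b = demean_across_conditions([cond_a, cond_b])
r_demeaned = pearson(dm_a, dm_b)
```

With only two conditions the demeaned patterns are mirror images of each other, so their correlation is -1 however similar the raw patterns were (provided the difference pattern is not constant across voxels). Here two strongly correlated raw patterns therefore end up perfectly anti-correlated.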
26.3042 Representational similarity analysis of EEG and fMRI
responses to face identities and emotional expressions Kaisu
Ölander1,2([email protected]), Ilkka Muukkonen1, Jussi Numminen3, Viljami Salmela1,2; 1Department of Psychology and Logopedics,
University of Helsinki, Helsinki, Finland, 2Aalto NeuroImaging, Aalto
University, Espoo, Finland, 3Helsinki Medical Imaging Center, Töölö
Hospital, University of Helsinki, Helsinki, Finland
Currently it is acknowledged that the cortical network processing facial
information consists of several areas that process different aspects of faces
from low-level features to person-related knowledge. We applied multivariate representational similarity analysis to investigate how parametric
variation of facial expressions and identities affects temporal (EEG) and spatial (fMRI) patterns of brain activity. As stimuli, we used faces with neutral expression, happy, fearful, angry, and morphed (50%) versions of the
expressions, as well as four core identities (two male and two female) and
all combinations of these identities morphed (33 and 67%) to each other. In
total, we had 112 different faces (7 expressions from 16 identities). The representational dissimilarity matrices (RDMs) were calculated for each time
point in the event related potentials (ERPs) from EEG, and for each voxel
in fMRI data by using a searchlight approach. Low-level stimulus model
RDMs were based on spatial frequency (SF) spectrum of the whole face, the
region around the eyes or the region around the mouth. Additional model
RDMs were based on the emotional expressions, identities, and interactions
between these factors. ERP RDMs correlated with SF models between 220-460 ms, with the identity model between 270-420 ms and 670-1000 ms, and
with emotion models at 180 ms, between 230-500 ms, and at 600 ms. There
was also an interaction between emotion type and identity. In fMRI, activity patterns related to expressions were found in early visual areas (V1-V3),
lateral occipital complex (LOC), occipital face area (OFA), fusiform face
area (FFA), posterior superior temporal sulcus (pSTS), and left middle frontal regions, and identity related patterns only in frontal areas. Distinct distributions of positive and negative correlations across face selective areas
suggest a different type of processing in LOC, OFA and FFA in comparison
to other regions in the face network. Acknowledgement: Academy of Finland
Eye Movements: Pursuit and anticipation
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Banyan Breezeway
26.3043 Does the baseline motor response predict the short-term
adaptability of phasic vergence? Ian Erkelens1([email protected]
ca), William Bobier1; 1University of Waterloo, Optometry & Vision Science
It has been hypothesized that faster, more accurate baseline neural-motor
responses result in greater adaptability to repeated external perturbations.
Like saccades, phasic convergence exhibits robust adaptive behavior when
exposed to double-step gap stimuli. Directional asymmetries exist in the
non-adapted baseline motor response of this phasic vergence mechanism
to convergent or divergent disparities. We leverage these directional asymmetries to investigate the relationship between the baseline motor response
and its adaptability to double-step convergent or divergent stimuli. 10
adults (26±3.8y/o) completed 2 study visits where baseline convergence
or divergence responses to a 2° disparity step were measured and then
adapted using an increasing double-step stimulus (2°+1.5°, 175ms). Individual eye movements were recorded at 250Hz with infrared video oculography, while stimuli were presented dichoptically at 40cm. Vergence kinematics of baseline and adapted responses were compared between stimulus
directions. Compared to convergence, divergence exhibited significantly
less adaptive changes in gain (9±2%, vs. 31±3% p=0.0005), peak velocity
(4±4% vs. 32±3% p=0.0001) and peak acceleration (3±5% vs.30±6%, p =
0.006). Only divergence gain was altered after adaptation (p = 0.005); while
divergence peak velocity (p = 0.36) and peak acceleration (p=0.63) were
unchanged. Adapted divergence response duration increased (25±9ms,
p=0.03), whereas adapted convergence duration was unchanged (-6±9ms,
p=0.97). Baseline convergence peak velocity was faster (12.5±1.4°/s vs.
8.7±2.4°/s, p=0.004) than divergence in all subjects. Baseline vergence peak
velocity was the strongest predictor of the adaptability of the gain and peak
velocity of each system. The results demonstrate that phasic convergence
adapts to systematic errors by altering all orders of the dynamic response,
whereas phasic divergence adapts by altering only the duration of response
output. This adaptive behavior is most strongly correlated with the initial
peak velocity of the response, suggesting the baseline neural-motor function determines the degree of adaptability within this oculomotor system.
Acknowledgement: NSERC, OGS, COETF
26.3044 Dynamic modulation of volatility by reward contingencies:
effects on anticipatory smooth eye movement Jean-Bernard
Damasse1([email protected]), Anna Montagnini1, Laurent Perrinet1; 1Institut de Neurosciences de la Timone, CNRS - Aix-Marseille Université, Marseille, France
Acknowledgement: ANR grant « Reinforcement and Eye Movements » ANR-13-APPR-0008-02
26.3045 Effect of attention on cyclovergence and cycloversion
eye movements. Madhumitha Mahadevan1([email protected]
UH.EDU), Scott Stevenson1; 1College of Optometry, University of Houston
This study aimed to determine the effect of spatial attention on torsional eye
responses. Verbal instructions were used to ask the subjects to pay attention to one of two torsion stimuli to determine if the instructions had any
effect on cyclovergence and cycloversion response amplitudes. The stimuli
consisted of a fixation dot, an inner disk, and an outer annulus, each filled
with static random dots. Diameters of the fixation dot, the central disk and
the annulus were 0.5 degrees, 40.3 degrees and 80.6 degrees respectively.
Dots in the central disk and annulus rotated 5 degrees back and forth about
the central fixation with a frequency of 0.25 or 0.5 Hz. Four conditions were
run to balance attention and frequency across field position. Dots seen by
left and right eyes rotated in the same (cycloversion) or opposite (cyclovergence) directions. Subjects (N = 6) wore red – green anaglyph glasses and
were asked to hold their head and gaze steady on the central fixation dot
and pay attention to the torsion motion of the disk and ignore the annulus
or vice versa. Scleral search coils were used to record eye position at 500
Hz. Fourier analysis was used to determine the tracking amplitude at each
frequency for each condition. We observed consistent responses to both
vergence and version stimuli in all subjects for both frequencies, with an
overall average tracking gain of 0.08. Instruction to attend or ignore made a
roughly 2x change in version responses in 3 subjects, a 1.25x change in a 4th
subject and no change in a 5th and 6th subject. No subject showed a change
in cyclovergence responses with attention. The results of this study suggest
that the mechanisms controlling cyclovergence are outside the influence of
attentional enhancement.
Acknowledgement: Student Vision Science Grants to advance Research (SVGR) University of Houston, College of Optometry
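The step from a recorded torsion trace to a tracking gain can be sketched as a single-frequency Fourier projection. This is an illustration under stated assumptions, not the authors' analysis code: `amplitude_at` is a hypothetical helper, the 8-s duration is chosen so that both stimulus frequencies complete whole cycles, the 5-degree value is treated as the stimulus amplitude, and the trace itself is synthetic.

```python
import math


def amplitude_at(signal, sample_rate_hz, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT).
    Exact when the recording spans a whole number of cycles."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n


SAMPLE_RATE_HZ = 500        # coil sampling rate quoted in the abstract
STIM_AMPLITUDE_DEG = 5.0    # assumption: 5 deg taken as the stimulus amplitude
DURATION_S = 8              # assumption: whole cycles at both 0.25 and 0.5 Hz

# Synthetic torsion trace: 0.4 deg at 0.25 Hz plus 0.1 deg at 0.5 Hz.
trace = [0.4 * math.sin(2 * math.pi * 0.25 * i / SAMPLE_RATE_HZ)
         + 0.1 * math.sin(2 * math.pi * 0.5 * i / SAMPLE_RATE_HZ)
         for i in range(SAMPLE_RATE_HZ * DURATION_S)]

# Gain = response amplitude at the stimulus frequency / stimulus amplitude.
gain_quarter_hz = amplitude_at(trace, SAMPLE_RATE_HZ, 0.25) / STIM_AMPLITUDE_DEG
gain_half_hz = amplitude_at(trace, SAMPLE_RATE_HZ, 0.5) / STIM_AMPLITUDE_DEG
```

Because the two stimuli move at different frequencies, the response to each can be read out separately from the same trace, which is what makes an attend-versus-ignore comparison possible within a single recording.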
26.3046 Cognitive expectation modulates ocular torsion Austin
Rothwell1([email protected]), Miriam Spering1,2,3; 1Ophthalmology & Visual Sciences, University of British Columbia, 2Institute
for Computing, Information & Cognitive Systems, University of British
Columbia, 3Center for Brain Health, University of British Columbia
Purpose: Torsional eye movements are considered reflexive responses to
visual image rotation or to head roll about the visual axis. Recent studies
indicate that torsion scales with visual stimulus properties such as rotational direction, speed and size, indicating a voluntary component. However, it is unclear whether these eye movements can be modulated by cognitive factors such as expectation. Method: Head-fixed healthy human
adults (n=6) viewed a textured disk, translating horizontally to the right
across a computer monitor and rotating about its center. This type of stimulus triggers horizontal smooth pursuit eye movements with a torsional
component. Stimulus rotation was either clockwise, in the same direction
as a rolling ball (“natural”), or counterclockwise (“unnatural”). In baseline
trials, the texture moved horizontally without rotation. These three rotation conditions were presented in separate blocks of 100 trials each, two
blocks per condition, to elicit cognitive expectation of rotational direction.
Three-dimensional eye position was recorded with a head-mounted Chronos eye tracker. Results: Observers initiated horizontal pursuit 250 ms
prior to stimulus onset in anticipation of translational stimulus motion.
This effect was stronger for baseline than for rotation conditions, indicating that stimulus rotation is taken into account when computing anticipatory horizontal pursuit velocity. Interestingly, the eyes also started rotating clockwise in response to “natural” and counterclockwise in response
to “unnatural” rotation prior to stimulus onset in anticipation of stimulus
rotation. Conclusions: Torsional eye movements can be modulated by
cognitive factors, indicating a strong voluntary component in the control of
these movements. The frontal pursuit pathway, including areas such as the
frontal eye field, might carry ocular torsion signals and underlie the effects
of cognitive expectation on ocular torsion.
26.3047 Altered smooth pursuit of global motion caused by illusory
position shifts in local elements Zheng Ma1([email protected]), Steve
Heinen1; 1The Smith Kettlewell Eye Research Institute
Previously we showed that the pursuit system can integrate local motion
information to veridically pursue the global motion of a large object. Here,
we demonstrate that illusory position shifts in local elements alter pursuit
gain for global motion, and similarly affect global motion perception. The
target consisted of four Gabor patches arranged in a diamond configuration, each drifting within a circular aperture. It is well known that Gabors
drifting within static apertures produce an illusory displacement of the
Gabor in the drift direction. In the current experiment, the apertures containing the Gabor patches translated together to the left or right in each
trial at a constant velocity of 10°/s. Drift direction conditions relative to the
global translation direction were Same, Opposite, or Orthogonal. Observers
were instructed to pursue the global motion. Pursuit gain was higher in the
Same than the Opposite condition, evidence that the local drifting motion
patches disrupted the pursuit system’s ability to integrate global motion.
To test if the effect originated in the motion perception system, we assessed
observers’ perceived speed of the translating stimuli using a staircase
method. Consistent with the pursuit result, we found higher global speed
perception in the Same than in the Opposite condition. We further asked
if integration could be restored when a non-illusory local motion cue was
provided by adding circular frames to each Gabor patch. This manipulation
reduced the difference between the Same and Opposite conditions for both
smooth pursuit and perception. The results suggest that local moving elements that produce an illusory position shift can interfere with the pursuit
and perception of global motion. However, when additional local translation cues are provided, motion information is successfully integrated to
accurately guide pursuit and perception.
Acknowledgement: Smith Kettlewell Eye Research Institute Postdoctoral Fellowship
26.3048 Response of pursuit cells in MST after eye position perturbation
by microstimulation of the Superior Colliculus (SC) Jérome
Fleuriet1,2([email protected]), Leah Bakst2,3, Michael Mustari1,2,4; 1Department of Ophthalmology, University of Washington, Seattle, WA, 2Washington National Primate Research Center, University of Washington,
Seattle, WA, 3Graduate Program in Neuroscience, University of Washington, Seattle, WA, 4Department of Biological Structure, University of
Washington, Seattle, WA
Primates use the fovea to maintain high quality central vision of the world.
When a target is moving, smooth pursuit eye movements (SEM) keep the
fovea of both eyes on target. A large proportion of neurons in area MST
discharge during SEM. These smooth pursuit cells (SPC) carry visual and
extraretinal signals. We recorded 27 SPC in MST of a macaque monkey
during SEM trials interrupted with a saccade that was elicited by electrical microstimulation (MS) in the SC. The MS consisted of a train of pulses
(0.1ms, 400Hz, 40ms) at low currents (< 40mA). The tracking behavior was
characterized by 1) an evoked saccade that brought the eye outside the target
path, 2) a SEM following this abrupt change in eye position and 3) a corrective
saccade. We quantified the ratio of activity after the evoked saccade,
before the corrective saccade and during the corrective saccade. A majority
of neurons (63%) presented ratios between 0.8 and 1.2 after the evoked
saccade but ratios less than 0.8 during the corrective saccade. Among this
subpopulation, 59% actually had a decrease of their firing rate before the
corrective saccade. On average, this drop of activity occurred 54 ms (±11 ms)
after the evoked saccade offset or 109 ms (±9 ms) after its onset. Interestingly,
in 90% of cases the latency of this drop of activity was sensitive to the delay
between the evoked and corrective saccades. Finally, 22% of neurons did not
present a drop of activity during these intervals while 15% presented a
decrease from the evoked saccade offset. This eye position perturbation
showed that the activity of a majority of MST smooth pursuit cells was not
interrupted by a direct corollary discharge from the saccadic system. However,
this activity seems inhibited by the occurrence of a corrective saccade
even though not always time-locked to it.
Acknowledgement: EY026274, EY013308, EY06069, ORIP ODO10425, and
Research to Prevent Blindness
By using a visual tracking task, where we manipulated the probability for
the target to move in a direction (Right) or another (Left) in three different
direction-biased blocks (with 50%, 75% and 90% of rightward trials respectively),
we observed systematic and graded anticipatory smooth pursuit
eye movements (aSPEM) in human volunteers, suggesting that probabilistic
information about the a priori direction of future motions is inferred to
optimize visuomotor tracking. Smooth eye movements are known to be
sensitive to reward contingencies during the visually guided phase
(Schütz et al, 2015), during maintained pursuit under blanking (Madelain &
Krauzlis, 2003) and during anticipation, where aSPEM could be enhanced or reduced
by reward in a velocity criterion-matching protocol (Damasse et al, 2016).
Optimal decision-making results from the weight given to the outcomes
of possible decisions. These weights reflect their relevance in predicting
future outcomes, which itself is related to the volatility of the environment
(Behrens et al, 2007). In our situation, indeed, the way each past outcome is
included to infer decision-making in the present is quite complex, as it has
to account both for an evolving reward schedule and for sensorimotor
regularities (probability of motion direction). To analyze this, we implemented
an agent that produces aSPEM velocities and is parameterized by a characteristic
memory decay time (i.e. the number of past trials used to estimate the
likelihood of a particular motion direction, similarly to Anderson &
Carpenter, 2006). We challenged this model by comparing its predictions to the
experimental aSPEM velocity changes associated with specific trial-sequences
(tested across many subjects). Results suggest that aSPEM reflect an
estimation of the volatility of predictive information that may be dynamically
biased by the reinforcement program. This dynamical bias was consistent
with our previously reported block-based results.
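The idea of an agent whose anticipatory velocity depends on a memory decay time, as described in the aSPEM abstract (26.3044) above, can be sketched with a leaky integrator over past trial directions. This is a hypothetical illustration rather than the authors' model: the function names, the exponential form of the memory, the linear mapping from probability to velocity and all parameter values are assumptions.

```python
def update_probability(p_right, moved_right, memory_trials):
    """Leaky-integrator update: recent trials weigh more, with a time
    constant of roughly `memory_trials` past trials."""
    outcome = 1.0 if moved_right else 0.0
    return p_right + (outcome - p_right) / memory_trials


def anticipatory_velocity(p_right, max_speed_deg_s=1.0):
    """Map the direction estimate onto a signed anticipatory velocity
    (positive = rightward); the linear mapping is an assumption."""
    return (2.0 * p_right - 1.0) * max_speed_deg_s


def run_agent(directions, memory_trials=5.0, p_start=0.5):
    """Return the anticipatory velocity produced before each trial and the
    final estimate of the probability of rightward motion."""
    p_right, velocities = p_start, []
    for moved_right in directions:
        velocities.append(anticipatory_velocity(p_right))
        p_right = update_probability(p_right, moved_right, memory_trials)
    return velocities, p_right


# A run of 20 rightward trials followed by 20 leftward trials.
velocities, final_p = run_agent([True] * 20 + [False] * 20)
```

A short memory makes the agent track a reversal quickly but also makes it noisy in a stable block, while a long memory does the opposite. That trade-off is one way to think about the link between memory decay and estimated volatility.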
26.3049 Eye-hand coordination during visuomotor tracking under
complex hand-cursor mapping Frederic Danion1([email protected]
univ-amu.fr), Randy Flanagan2; 1Institut de Neurosciences de la Timone,
CNRS, Aix-Marseille University, 2Centre for Neuroscience Studies and
Department of Psychology, Queen’s University
Previous studies have investigated eye-hand coordination when tracking with the hand a pseudo-random target (Xia & Barnes, 1999; Soechting et al, 2010; Tramper & Gielen, 2011). In all these studies the mapping
between hand motion and cursor motion was always straightforward. Here
we investigate the impact of using a complex hand-cursor mapping. Two
hand-cursor mappings were tested, either a simple one in which hand and
cursor motion matched perfectly, or a complex one in which the cursor
behaved as a mass attached to the hand by means of a spring. Pseudo-random target motion was obtained via the combination of two sinusoids on
each of the vertical and horizontal axes (Mrotek & Soechting, 2007). Subjects
were instructed to move their hand so as to bring the animated cursor as
close as possible to the moving target. Our results showed that hand
tracking performance was substantially more accurate under the regular
mapping than the spring one. On average the tracking error (i.e. cursor-target distance) was almost two times greater under the spring mapping (4.8
vs. 2.7 cm). Although in the latter case hand tracking improved across trials,
performance never returned to baseline (i.e. compared to regular). Despite
those substantial differences in hand tracking performance, eye behaviour
seemed relatively unaffected. Indeed, under both types of mapping, gaze
always led cursor position and lagged behind target position, but with gaze
remaining substantially closer to the target (about 2.5 cm) than to the
cursor (up to 4.5 cm under spring). In addition, we found no difference in
the saccade rate between the two mappings. Overall we conclude that 1)
even when subjects have to learn a complex hand-cursor mapping, gaze is
mostly driven to gather information about ongoing target motion, and 2)
eye behaviour is relatively insensitive to hand-cursor mapping.
Acknowledgement: This work was supported by a PICS from the CNRS, and by a
French National Grant REM ANR-13-APPR-0008
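The spring mapping described in this abstract can be illustrated with a short simulation. This is a minimal sketch and not the authors' implementation: the sinusoid frequencies and amplitudes, the mass, stiffness and damping values, and all function names are assumptions chosen for illustration, and only one axis is simulated.

```python
import math

def target_position(t, freqs=(0.3, 0.5), amps=(5.0, 5.0)):
    """One axis of target motion as the sum of two sinusoids (cm).
    The frequencies (Hz) and amplitudes (cm) are illustrative assumptions."""
    return sum(a * math.sin(2 * math.pi * f * t) for f, a in zip(freqs, amps))

def spring_cursor(hand, dt, mass=1.0, stiffness=40.0, damping=1.0):
    """Cursor modelled as a point mass tied to the hand by a damped spring.
    Integrated with semi-implicit Euler; all parameters are hypothetical."""
    pos, vel = hand[0], 0.0
    trajectory = []
    for h in hand:
        force = stiffness * (h - pos) - damping * vel
        vel += (force / mass) * dt
        pos += vel * dt
        trajectory.append(pos)
    return trajectory

def mean_abs_error(a, b):
    """Mean absolute distance between two one-dimensional trajectories."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

dt = 0.001
times = [i * dt for i in range(10000)]      # 10 s of simulated tracking
target = [target_position(t) for t in times]
hand = list(target)                          # idealised hand that copies the target
simple_error = mean_abs_error(hand, target)  # simple mapping: cursor equals hand
spring_error = mean_abs_error(spring_cursor(hand, dt), target)
print(f"simple mapping error: {simple_error:.2f} cm")
print(f"spring mapping error: {spring_error:.2f} cm")
```

Even with a hand that follows the target perfectly, the spring-driven cursor deviates from the target, which is why the hand must learn to compensate for the cursor dynamics under this mapping.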
Object Recognition: Where in the brain?
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4001 Lateral occipitotemporal cortex’s selectivity to small artifacts
reflects multi-modal representation of shape-grasp mapping
elements Wei Wu1([email protected]), Xiaoying Wang1, Chenxi
He1, Yanchao Bi1; 1State Key Laboratory of Cognitive Neuroscience and
Learning & IDG/McGovern Institute for Brain Research, Beijing Normal
Recent studies have reported intriguingly similar activation preference to
small artifacts relative to other object categories in the left lateral occipitotemporal cortex (lLOTC) across various modalities and populations (see
reviews in Ricciardi et al., 2014; Bi et al., 2016). What drives the multimodal
tool selectivity here is unclear. Our study investigated the potential properties underlying the multimodal small artifact selectivity in the lLOTC
using representational similarity analysis (RSA). BOLD-fMRI responses to
33 small artifacts were collected for both sighted and congenitally blind
individuals while they performed size judgment tasks on object auditory
names or pictures. Similarity ratings on the overall shape, the shape of the
object parts people typically interact with (i.e., when grasping for typical
use), the manner of manipulating and of grasping were collected to build 4
different behavioral representational similarity matrices (RSMs). RSA identified significant correlations between the functionally-defined lLOTC’s neural
RSM and the grasping-manner and grasp-part-shape RSMs across all experiments (Rs > 0.109; ps < 0.012). Furthermore, the shared variance of these
two variables derived from principal component analyses significantly correlated with lLOTC’s neural RSM across all experiments (sighted auditory:
r = 0.129, P < 0.01; sighted visual: r = 0.215, P < 10^-6; blind: r = 0.124, P <
0.01). The unique effects of either of these two variables, as well as the
effects of overall-shape and overall-manipulation manner, were observed
in the sighted visual experiment and not the blind auditory experiment (Rs
< 0.07; ps > 0.127), i.e., not exhibiting multi-modal patterns. These results
indicate that the representation of the shape element that is indicative of the
manner of grasping best explains the multi-modal representation of small
artifacts in lLOTC, highlighting the critical role of interaction between
visual and nonvisual object properties on the functional organization of the
higher-order visual cortex (Bi et al., 2016).
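The core step of the RSA described in this abstract, comparing a neural RSM with a behavioral RSM, can be sketched as follows. This is an illustrative sketch under stated assumptions and not the authors' pipeline: the toy patterns and ratings are invented, the function names are hypothetical, and the use of a Spearman rank correlation over the upper triangle is a common RSA convention that the abstract does not specify.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def rsm(patterns):
    """Similarity matrix: pairwise Pearson correlation of response patterns."""
    n = len(patterns)
    return [[pearson(patterns[i], patterns[j]) for j in range(n)] for i in range(n)]

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, flattened row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

# Toy voxel patterns for four objects (invented values).
neural_patterns = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [3.9, 3.2, 1.8, 1.1],
]
# Toy behavioral similarity ratings, e.g. for grasping manner (invented values).
behavioral_rsm = [
    [1.00, 0.90, 0.10, 0.20],
    [0.90, 1.00, 0.15, 0.10],
    [0.10, 0.15, 1.00, 0.80],
    [0.20, 0.10, 0.80, 1.00],
]
neural_rsm = rsm(neural_patterns)
rho = spearman(upper_triangle(neural_rsm), upper_triangle(behavioral_rsm))
print(f"neural-behavioral RSM correlation: {rho:.3f}")
```

Only the off-diagonal entries are compared, because the diagonal of a similarity matrix is 1 by construction and would inflate the correlation.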
26.4002 The N300p, a novel ERP component associated with
extended categorization training Yue Meng1([email protected]
com), Shamsi Monfared1, Jonathan Folstein1; 1Florida State University
Subordinate level category learning, which is thought to elicit perceptual
expertise, affects allocation of attention to learned stimuli and creation of
new perceptual and mnemonic representations. There is some controversy
concerning whether effects of expertise are driven primarily by attention or
formation of new perceptual representations (e.g. recruitment of the FFA).
It is therefore desirable to study the neural correlates of category learning
in the context of an attention manipulation. Interestingly, the N250, an ERP
component associated with perceptual expertise, has a similar time course
and postero-lateral scalp distribution to an attention-related ERP component, the selection negativity. Here we attempted to replicate and extend a
previous study in which we found evidence dissociating the N250, which
was sensitive to trained vs. untrained stimuli, from the selection negativity,
which was sensitive to number of features shared with a rare target. Participants were trained over six sessions (an increase from our previous study)
to categorize cartoon alien stimuli, followed by an EEG session during
which participants detected a single target alien and ignored non-targets
that shared between zero and four features with the target. This task was
performed on trained vs. untrained stimuli appearing at fast or slow presentation rates (an attempt to manipulate attentional load). The selection
negativity scaled with the number of target features in the stimulus, but
was insensitive to presentation rate. The comparison of trained to untrained
stimuli elicited an unexpected new component, which we call the N300p.
This large negative component had a similar time course to the N250 but,
unlike the N250, a clearly parietal scalp distribution. The N300p could be a
new expertise component associated with novel aspects of our task, which
included many-to-one mapping and a stimulus set in which participants
were required to process disjunctions of highly interchangeable features.
26.4003 A-modal versus Cross-modal: How input modality and
visual experience affect categorical representation in the “visual”
cortex Stefania Mattioni1,2([email protected]), Mohamed Rezk2,
Karen Cuculiza1, Ceren Battal1, Roberto Bottini1, Markus Van Ackeren1,
Nick Oosterhof1, Olivier Collignon1,2; 1Center for Mind/Brain Sciences
(CIMEC), University of Trento, Italy, 2Institute of Psychology (IPSY)
and Institute of Neuroscience (IONS), University of Louvain-la-Neuve,
It has recently been proposed that some regions of the occipital cortex,
typically considered purely visual, develop a preferential tuning for specific categories independently of the sensory input and visual experience.
In contrast, several studies showed that occipital responses to non-visual
inputs are unique to blind individuals due to crossmodal plasticity. To further assess how the functional tuning of occipital regions is (in)dependent
of visual input and experience, we characterized with fMRI brain responses
to 8 categories presented acoustically in sighted and early blind individuals, and to the same stimuli presented visually in a separate sighted group.
First, we observed that the posterior middle temporal gyrus (pMTG) was
Acknowledgement: European Research Council starting grant MADVIS (ERCStG 337573)
26.4004 Contralateral bias persists in category-selective visual
areas Sarah Herald1([email protected]), Hua Yang2,
Bradley Duchaine1; 1Psychological and Brain Sciences, Dartmouth College,
2University of Massachusetts Medical School, Worcester, MA
fMRI studies in humans and single-unit work in macaques have suggested
that visual recognition mechanisms show contralateral biases that are
much weaker than those found in early visual cortex (Hemond et al., 2007).
In almost all these studies, single stimuli were displayed peripherally to
assess biases. In natural vision though, visual stimuli are present in both
hemifields, and a recent ERP study found the N170 was driven exclusively
by the contralateral stimulus when faces and houses were simultaneously
presented to both hemifields (Towler & Eimer, 2015). To examine contralateral biases in category-selective visual areas using a more naturalistic
display with fMRI, we first carried out a dynamic localizer using videos of
faces, bodies, scenes, objects, and scrambled objects to identify category-selective areas. We then scanned participants while they viewed faces and
houses simultaneously presented to the left and right visual hemifields. We
found that face-selective and place-selective areas displayed large contralateral biases in which category-selective regions were primarily influenced
by contralaterally-presented stimuli. For example, in the right OFA, the
response to a contralateral house and an ipsilateral face is comparable to
a contralateral house and ipsilateral house. Conversely, the response to a
contralateral face and an ipsilateral house is only slightly weaker than the
response to a contralateral face and ipsilateral face. Other category-selective areas, though not all, showed responses that were much more strongly
modulated by contralateral than ipsilateral stimuli. These findings tentatively suggest that, under natural viewing conditions, peripheral stimuli
are represented primarily in contralateral category-selective areas and that
detection of peripheral stimuli is carried out largely by the contralateral
26.4005 Building of object view invariance in a newly-discovered
network in inferior temporal cortex Pinglei Bao1,2([email protected]),
Doris Tsao1,2; 1Division of Biology and Biological Engineering, Caltech,
2The Howard Hughes Medical Institute
Object recognition in primates is mediated by hierarchical, multi-stage
processing of visual information within occipital and inferior temporal (IT)
cortex. It is known that IT contains several networks that process specific
categories or stimulus dimensions. Furthermore, at least in the case of the face
processing network, the nodes appear to be organized hierarchically, e.g.,
neurons in the middle face patches are tuned for specific facial views,
while those in the most anterior patch are tuned for the identity of faces in a
view-invariant way. However, there remains a fair amount of IT cortex that
doesn’t belong to any known network, raising the question: are there any
new, undiscovered networks not yet accounted for by existing functional
parcellation studies? If so, what are these networks processing and how are
they organized? To address this question, we exploited the technique of
electrical microstimulation combined with simultaneous functional magnetic resonance imaging. Electrical microstimulation of a region of macaque
IT cortex not belonging to any known network produced strong activation
in three patches that also didn’t overlap with any known networks. We
targeted single-unit recordings to these three patches, while monkeys
passively viewed an image set consisting of 51 objects with 24 views for
each object; the objects included faces, animals, houses, vegifruit, vehicles,
and man-made objects. Average responses across neurons from the three
patches revealed high similarity in object preferences between the patches,
further confirming these patches belong to a common network; for example, all three patches showed the smallest response to faces. Representational similarity analysis on population single-unit responses in each of the
three patches revealed the most view-invariant representation in the most
anterior patch, and the least view-invariant representation in the posterior
patch, suggesting that, analogous to the face patch network, view-invariant
object representation is built up hierarchically within this new network.
26.4006 Decoding the representational dynamics of object recognition
with MEG, behavior, and computational models Brett
Bankson1([email protected]), Martin Hebart1, Chris Baker1; 1Section
on Learning and Plasticity, Laboratory of Brain and Cognition, National
Institute of Mental Health
Previous studies using electrophysiological recordings have identified the
time course of category representation during the first several hundred milliseconds of object recognition, but less is known about the perceptual and
semantic features reflected by this information (Cichy et al., 2016; Clarke et
al., 2012). Here we apply machine learning methods and representational
similarity analysis (RSA) to MEG recordings in order to elucidate the temporal evolution of representations for concrete visual objects. During MEG
recording, 32 participants were repeatedly presented with object stimuli while completing a visual oddball task. Half of the participants were
exposed to one set of 84 object exemplars, while the other half was presented with different exemplars of the same concepts. The 84 object concepts were selected based on lexical frequency. We used a support vector
classifier to produce pairwise decoding accuracies between all object items
at all time points, which served as dissimilarity matrices for later analyses. Complementary behavioral data from an object arrangement task were
included in our analyses, as well as model predictions from a semantic
model and a convolutional neural network. MEG analyses showed robust
pairwise decoding of object images, peaking around 100 ms post-stimulus
onset. Before 150 ms, the MEG data contained information similar to the
early layers of a convolutional neural network (CNN), suggesting early
discriminability in patterns of neural activity based on visual information
before 150 ms. From 200-450 ms, the MEG data show persistent similarity
across visual exemplars for the same concept. Further, there was high correlation with the behavioral data, mid-level CNN layers, and the semantic
model. Together, these results suggest the emergence of an abstract behaviorally-relevant representation of concrete object concepts peaking between
250-300 ms.
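The idea of using pairwise decoding accuracies as a dissimilarity matrix can be sketched for a single time point as follows. This is an illustrative sketch and not the authors' analysis: the abstract reports a support vector classifier, whereas this example swaps in a simple nearest-centroid classifier with leave-one-trial-out cross-validation so that it needs no external library. The data values and function names are invented for illustration.

```python
def centroid(trials):
    """Mean pattern across a list of trial vectors."""
    n = len(trials)
    return [sum(t[d] for t in trials) / n for d in range(len(trials[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pairwise_accuracy(trials_a, trials_b):
    """Leave-one-trial-out accuracy for discriminating two conditions
    with a nearest-centroid classifier (a stand-in for the SVM)."""
    n = min(len(trials_a), len(trials_b))
    correct = 0
    for k in range(n):
        centre_a = centroid(trials_a[:k] + trials_a[k + 1:n])
        centre_b = centroid(trials_b[:k] + trials_b[k + 1:n])
        correct += sq_dist(trials_a[k], centre_a) < sq_dist(trials_a[k], centre_b)
        correct += sq_dist(trials_b[k], centre_b) < sq_dist(trials_b[k], centre_a)
    return correct / (2 * n)

def decoding_matrix(conditions):
    """Symmetric matrix of pairwise accuracies; higher means more dissimilar."""
    n = len(conditions)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = pairwise_accuracy(conditions[i], conditions[j])
    return matrix

# Toy sensor patterns at one time point: A and B overlap, C is well separated.
cond_a = [[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2], [0.1, -0.2]]
cond_b = [[0.1, 0.0], [0.0, 0.2], [0.2, -0.1], [-0.2, 0.1]]
cond_c = [[5.0, 5.0], [5.2, 5.1], [4.9, 5.2], [5.1, 4.8]]
matrix = decoding_matrix([cond_a, cond_b, cond_c])
for row in matrix:
    print(row)
```

In a time-resolved analysis, a matrix of this kind would be computed separately at every time point and then compared with model or behavioral matrices.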
26.4007 Representation of visual and motor object features in
human cortex Ariana Familiar1([email protected]), Heath Matheson1, Sharon Thompson-Schill1; 1Department of Psychology, University of
To accomplish object recognition, we must remember the shared sensorimotor features of thousands of objects, as well as each object’s unique
combination of features. While theories differ on how exactly the brain does
this, many agree that featural information is integrated in at least one cortical region, or “convergence zone”, which acts as a semantic representation
area that links object features of different information types. Moreover, it
has been posited that the anterior temporal lobe (ATL) acts as a “hub” that associates object features across sensory and motor modalities, as it is reciprocally connected to early modality-specific cortical regions and patients
with ATL damage have shown deficits in processing and remembering
object information across input modalities (Patterson et al., 2007). Our lab
recently found evidence that the left ATL encodes integrated shape and
color information for objects uniquely defined by these features (fruits/
vegetables; Coutanche & Thompson-Schill, 2014), suggesting ATL acts as a
convergence zone for these visual object features. However, whether ATL
encodes integrated object features from different modalities had not been
established. We used functional magnetic resonance imaging (fMRI) and
multi-voxel pattern analysis (MVPA) to examine whether ATL acts as an
area of convergence for object features across visual and motor modalities.
Using a whole-brain searchlight analysis, we found that activity patterns during
the most reliable region being able to decode the 8 presented categories
independently of the input modality (in vision and audition in the sighted)
and visual experience (in audition in the sighted and blind). Importantly,
we also observed that the occipital cortex of blind individuals showed
enhanced coding of acoustical stimuli. To further understand the nature
of this reorganization, we used representational similarity analysis (RSA)
in those regions in order to link similarities of brain activity patterns with
different feature similarities of the acoustic stimulus space. We found a
stronger correlation between the patterns of activity in some portions of the
occipital cortex with the categorical features of the stimuli (e.g. animate-inanimate), whereas we did not find any information about the physical
properties (e.g. pitch) of the stimuli. Together, our results suggest that the
occipital cortex shows a strong sensory tuning toward visual stimuli in the
sighted and reorganizes to enhance its response toward non-visual input
in case of early visual deprivation. Additional analyses on the nature of
the functional reorganization show that the representation is mostly linked
to “high-level” categorical tuning rather than low-level properties of the
sounds (e.g. pitch).
a memory retrieval task in a region within the left ATL could successfully
classify objects defined by unique combinations of visual (material) and
motor (grip) features, but could not classify either constituent feature while
generalizing over identity. These results suggest that in addition to being
a convergence zone for visual object features, left ATL also acts as an area
of convergence for object information across visual and motor modalities.
26.4008 The large-scale organization of object processing in the
ventral and dorsal pathways Erez Freud1,2([email protected]),
Jody Culham, David Plaut, Marlene Behrmann; 1Department of
Psychology, Carnegie Mellon University, 2Center for the Neural Basis of
Cognition, Carnegie Mellon University and the University of Pittsburgh,
3The Brain and Mind Institute, University of Western Ontario, 4Department
of Psychology, University of Western Ontario
One of the hallmark properties of the ventral visual pathway is sensitivity to
object shape. Accumulating evidence, however, suggests that object-based
representations are also derived by the dorsal visual pathway although
less is known about the characteristics of these representations, their spatial distribution, and their perceptual importance. To bridge this gap, the
present study combined psychophysical and fMRI experiments in which
participants viewed and recognized objects with different levels of scrambling that distorted object shape. Neural shape sensitivity was assessed
by measuring the reduction of fMRI activation in response to scrambled
versus intact versions of the same stimulus. In the ventral pathway, shape
sensitivity increased linearly along an anterior-posterior axis from early
visual cortices (i.e., V1-V4) to posterior extrastriate cortices (i.e., LO1) and
remained constant along the occipitotemporal cortex. In the dorsal pathway, shape sensitivity also increased linearly along an anterior-posterior
axis from early visual cortices (i.e., V1-V3d) to posterior extrastriate cortices
(i.e., V3a, posterior IPS). However, in stark contrast to the ventral stream,
in moving from posterior extrastriate cortices to more anterior regions (i.e.,
IPS 1-4, aIPS), shape selectivity gradually decreased. Interestingly, as with
the anterior ventral pathway, the posterior IPS activation profile was found
to be highly correlated with recognition performance obtained outside of
the scanner, further pointing to a plausible contribution of this region to
perception. Finally, these results were replicated using a different method
for manipulating object integrity (diffeomorphic alteration) suggesting
the results are not attributable to modulations of low-level object features.
Together, these results provide novel evidence that object representations
along the dorsal pathway are not monolithic and gradually change along
the posterior-anterior axis. These findings challenge the binary dichotomy
between the two pathways and suggest that object recognition might be the
product of more distributed neural mechanisms.
26.4009 Effect of Task on Object Category Representations Across
Human Ventral, Dorsal, and Frontal Brain Regions JohnMark
Taylor1([email protected]), Maryam Vaziri-Pashkam1, Yaoda
Xu1; 1Department of Psychology, Harvard University
Recent evidence demonstrates that dorsal visual stream regions represent
not just “where” or “how” information, but also object identity (“what”)
information like the ventral stream. Here, we further explored the hypothesis that the dorsal stream encodes object representations in a manner that
also reflects task demands, whereas the ventral stream encodes objects in a
relatively task-invariant manner. We also examined responses in a frontal
ROI corresponding to the frontal eye field (FEF), as it belongs to the dorsal attention network along with the intraparietal sulcus (IPS), and since
frontal regions also participate in object categorization tasks. In our fMRI
MVPA study, participants viewed blocks of pictures from 8 object categories and did either a category oddball task (e.g., respond to a face in a block
of houses) or a one-back task (e.g., respond to the same face appearing
twice in a row). The oddball task thus drew participants’ explicit attention
to object category while the one-back task drew attention to the exact exemplar shown. We examined object category representations across a number
of ROIs, including object responsive regions in lateral occipital cortex, inferior and superior IPS, and FEF. Consistent with past research, we obtained
significant object category decoding across both visual pathways, and additionally in FEF. Category decoding was not enhanced by the category oddball task, suggesting that information extracted in the one-back task was
sufficient to distinguish categories. We found that task and category contributed roughly equally to the category representational structures in the
dorsal and frontal ROIs, but in ventral regions category contributed much
more than task to the category representational structures. Task context
thus plays a more prominent role in shaping object category representation
in dorsal and frontal regions than in ventral regions.
Acknowledgement: Supported by NIH grant 1R01EY022355 to YX
26.4010 Spatial frequency tolerant object representations in the
ventral and dorsal visual processing pathways Maryam
Vaziri-Pashkam1([email protected]), Yaoda Xu1; 1Vision Sciences Laboratory,
Department of Psychology, Harvard University
Object category representations have been found in both human ventral
and dorsal visual processing regions. Given the differences in anatomical
connections to parvo- and magno-cellular layers in LGN, the two pathways
may exhibit differential sensitivity to spatial frequency. To test this idea, in
this study, observers viewed blocks of images from six natural object categories and performed a one-back repetition detection task on the images.
Images were shown in full spectrum, high spatial frequencies (>7 cpd), or
low spatial frequencies (< 1 cpd). Using fMRI and MVPA, we examined
how object category decoding would be modulated by the spatial frequency content of the images. We examined responses from topographic
regions V1-V4 and IPS0-2, the object shape selective lateral occipital cortex
(LO), a temporal region activated by our object stimuli, as well as superior
and inferior IPS (two parietal regions previously implicated in object processing). We obtained above chance category decoding for the intact, high
and low frequency images in all the regions examined. Importantly, the
decoding accuracy was no different between the high and low frequency
images in all the regions examined except for V4 and IPS2 where decoding
was higher for the high frequency images. We also trained the classifier
with high and tested it with low frequency images or vice versa and found
that all regions showed robust generalization across spatial frequency. A
representational similarity analysis further showed that object category
representations were separated based on spatial frequency in early visual
but not in dorsal and higher ventral regions. These results demonstrate that
object category representations in both ventral and dorsal regions are tolerant to changes in spatial frequency and argue against a dissociation of the
two pathways based on spatial frequency sensitivity.
Acknowledgement: NIH grant 1R01EY022355 to Y.X.
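The cross-frequency generalization test described in this abstract (train the classifier on one spatial frequency band, test it on the other, and vice versa) can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' MVPA pipeline: the nearest-centroid classifier, the two category labels, the response values and all function names are hypothetical and chosen only to keep the example self-contained.

```python
def centroid(samples):
    """Mean pattern across a list of response vectors."""
    n = len(samples)
    return [sum(s[d] for s in samples) / n for d in range(len(samples[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(samples_by_category):
    """Nearest-centroid 'training': store one mean pattern per category."""
    return {cat: centroid(samples) for cat, samples in samples_by_category.items()}

def predict(model, pattern):
    """Assign the category whose centroid is closest to the pattern."""
    return min(model, key=lambda cat: sq_dist(pattern, model[cat]))

def accuracy(model, samples_by_category):
    """Proportion of test patterns assigned to their true category."""
    outcomes = [predict(model, s) == cat
                for cat, samples in samples_by_category.items()
                for s in samples]
    return sum(outcomes) / len(outcomes)

# Toy response patterns (invented): two hypothetical categories in each band.
high_sf = {"face": [[1.0, 0.0], [0.9, 0.1]], "house": [[0.0, 1.0], [0.1, 0.9]]}
low_sf = {"face": [[0.8, 0.2], [0.7, 0.1]], "house": [[0.2, 0.8], [0.3, 0.7]]}

high_to_low = accuracy(train(high_sf), low_sf)  # train on high, test on low
low_to_high = accuracy(train(low_sf), high_sf)  # train on low, test on high
print(high_to_low, low_to_high)
```

Above-chance accuracy in both directions indicates that the category information the classifier relies on is shared across the two frequency bands, rather than specific to either one.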
26.4011 Encoding of partially occluded and occluding stimuli in the
macaque inferior temporal cortex Tomoyuki Namima1,2([email protected]
uw.edu), Anitha Pasupathy1,2; 1Department of Biological Structure, University of Washington, 2Washington National Primate Research Center,
University of Washington
Image segmentation – the process by which scenes are segmented into component objects – is a fundamental aspect of vision and a cornerstone of scene
understanding; its neural basis, however, is largely unknown. Partial occlusions pose a special challenge to segmentation because, unlike non-overlapping stimuli, they require the parsing of overlapping contours and
regions and/or the grouping of noncontiguous regions. To begin to understand how partially occluded stimuli are segmented in the primate brain,
we studied the responses of single neurons in IT cortex to shape stimuli
subjected to increasing levels of occlusion. We asked whether IT responses
are consistent with a segmented representation whereby responses of each
neuron are dictated by either the occluded or the occluding stimulus, but
not both. We recorded from 43 well-isolated, single IT neurons as animals
were engaged in a sequential shape discrimination task. On each trial, two
stimuli were presented in sequence and the animal had to report whether
the stimuli were the same or different with a rightward or leftward saccade,
respectively. The second stimulus in the sequence was occluded with randomly positioned dots; occlusion levels were titrated by varying occluding
dot width. Some neurons (11/43, 26%) showed strong responses to unoccluded stimuli and responses gradually declined with increasing levels of
occlusion. These unoccluded-preferred neurons showed shape-selective
responses to occluded stimuli. These neurons behaved quite like those in IT
cortex during passive fixation (Kovacs et al., 1995) and their responses were
consistent with an encoding of the identity of the occluded shape. Many others (21/43, 49%), however, showed weak responses to unoccluded stimuli
and stronger responses under occlusion. Taken together, our results support the idea that IT neurons encode segmented components of the image,
with one sub-group encoding the occluded stimulus and the other encoding
the occluders.
Acknowledgement: NEI grant R01EY018839, Vision Core grant P30EY01730,
P51 grant OD010425
26.4012 A dynamic representation of shape similarity in the lateral
intraparietal area Koorosh Mirpour1([email protected]),
James Bisley1,2,3, Wei Song Ong4; 1Dept Neurobiol, UCLA, Los Angeles,
CA, 2Jules Stein Eye Institute, David Geffen Sch. of Med. Los Angeles, CA,
3Dept of Psychology and the Brain Res. Inst., UCLA, Los Angeles, CA,
4Dept Neurobiol, Univ. of Pennsylvania, Philadelphia, PA
Acknowledgement: National Eye Institute
26.4013 Neural responses to shape and texture stimuli in macaque
area V4 Taekjun Kim1([email protected]), Wyeth Bair1, Anitha
Pasupathy1; 1Biological Structure, School of Medicine, University of
Functional and anatomical evidence demonstrates that visual information
is processed along the dorsal and ventral visual pathways in the primate.
Area V4 is an important intermediate stage of the ventral visual pathway,
which is specialized for object recognition. Several studies agree that V4
neurons respond to visual shape properties (e.g., contour features along a
shape boundary), and surface properties (e.g., color, brightness, and texture). However, the main role of V4 in vision has been extensively debated.
In the current study, we examined whether object boundary or surface
characteristics have a greater impact on V4 single unit activity by presenting simple 2D shape stimuli and texture patches to the same neurons. Our
findings showed that simple 2D shape stimuli typically evoked stronger
responses from V4 single units compared to texture patches contained
within the receptive field or extending beyond. In many cells (>40% of our
cell population), neural responses to circular texture patches were tuned to
crucial dimensions of texture perception – coarseness, directionality, and
regularity. However, response variation attributable to texture information
was largely influenced by the cells’ preference for a visual shape defined
by texture stimuli. Texture stimuli that defined the surface of a preferred, but
not a non-preferred, visual shape could yield response variation. Preferred
visual shape of a single unit was unchanged under various texture conditions within the shapes. Unlike standard shape stimuli which affect both
transient and sustained activity of V4 neurons, neural response variation
due to the texture information was reflected mostly in sustained activity as
a form of suppression. These results suggest that the main role of V4 neurons in object recognition is to represent shape information which is largely
consistent across surface properties.
26.4014 Exploring the role of curvature for neural shape representations
across hV4 and Lateral Occipital visual field maps Richard Vernon1,2([email protected]), Andre Gouws1,2, Samuel
Lawrence1,2, Bruce Keefe1,2, Declan McKeefry3, Alex Wade1,2, Antony
Morland1,2; 1York Neuroimaging Centre, University of York, York, UK,
2Department of Psychology, University of York, York, UK, 3Bradford
School of Optometry and Vision Sciences, University of Bradford, Bradford, UK
Whilst Macaque V4’s role for curvature and shape processing is well
documented, the relationship between human Lateral Occipital LO-1/2
and more ventral hV4 retinotopic maps is less clear. The former regions
overlap shape-selective Lateral Occipital Complex (LOC), and we previously demonstrated that LO-2 and LOC share similar (potentially curvature-based) neural shape representations. Therefore, we asked more directly
whether curvature is explicitly represented in these LO maps, or if instead
it is processed by hV4 in line with Macaque literature. We used radial frequency (RF) patterns to test whether degree of curvature (manipulated via
amplitude), or simply the number of curves (manipulated via frequency;
range 3-5), would most influence neural shape representations. This was
tested in a rapid event-related fMRI experiment, using representational
similarity analysis to assess patterns of activity across retinotopically- and
functionally-defined regions of interest (ROIs). Neural similarity was compared to shape similarity metrics based upon amplitude, frequency, and
additional predictors derived from Principal Component Analysis (PCA)
on more exploratory stimulus similarity metrics. Those PCA components
were ‘Lobe-prominence’ (degree of curve protrusion) and ‘Lobe-curvature’
(curvature breadth/acuteness). After controlling for low-level influences,
we found three divergent influences in later visual cortex. First, frequency
was only influential for LO-1, implying LO-1 is at least partially distinct
from other ROIs. Amplitude was well-represented across all Lateral Occipital ROIs (LO-1/2, LO); however, this is complicated by our two components.
We found that ‘Lobe-curvature’ was somewhat influential not only for Lateral Occipital ROIs, but also for hV4. Conversely, ‘Lobe-prominence’ only
explained variance in Lateral Occipital regions. This implies that whilst
hV4 likely does process shape curvature to some extent, its representation
is nevertheless distinct from that of our Lateral Occipital ROIs. On the basis
of these results, we suggest the Lateral Occipital shape representation may
in part be based upon shape protrusions.
Acknowledgement: BBSRC grants BB/L007770/1 and BB/P007252/1
26.4015 Radial frequency tuning in human visual cortex Antony
Morland1,2([email protected]), Samuel Lawrence1,2, Richard Vernon1,2,
Bruce Keefe1,2, Andre Gouws1,2, Alex Wade1,2, Declan McKeefry3; 1York
Neuroimaging Centre, University of York, York, UK, 2Department of
Psychology, University of York, York, UK, 3Bradford School of Optometry
and Vision Sciences, University of Bradford, Bradford, UK
Radial frequency (RF) patterns are shape stimuli defined by a sinusoidal
modulation of a circle’s radius. Low frequency RF patterns, with few modulations around the perimeter, are processed by global, mid-level shape
mechanisms; however, the neural locus of these mechanisms in humans is
not well understood. We used fMRI to measure neural responses to a large
range of RFs, and modeled neural tuning to RF in early, lateral and ventral visual cortex. Responses were modeled by a Gaussian neural model
defined in RF space, where each voxel’s tuning to RF was defined by the
model which generated a response that best predicted the fMRI data. To
quantify this pattern, we measured tuning profiles to RF for visual areas
V1, V2, V3, V4, VO1, VO2, LO1, LO2 and object-selective LOC. Low, globally processed RF tuning was localised to lateral occipital cortex (LO) in all
subjects. Specifically, tuning to global RFs first emerged in visual field maps
LO1 and LO2, and persisted through LOC. In addition, we correlated RF
tuning profiles from each area against stimulus contrast energy and shape
defined by circularity. Only LO2 and LOC profiles were significantly better
explained by sensitivity to shape over contrast energy. All early and ventral
areas showed tuning to high, locally processed RFs and were more strongly
correlated with stimulus contrast energy over shape. We replicated our
results using a control stimulus set where all RFs were combined with the
same high-frequency contour modulations to match stimuli for low-level
differences, showing LO responses were driven by the global shape of low
RFs which remained constant across both stimulus sets. Our results suggest
a shape processing pathway through lateral occipital cortex, where global
shape representations are formed in LO2, likely providing input to LOC
where more complex representations of objects are formed.
Acknowledgement: BBSRC grants BB/L007770/1 and BB/P007252/1
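The Gaussian tuning model described in abstract 26.4015 can be illustrated with a minimal sketch. It assumes a simple grid search over preferred radial frequency and tuning width, with a least-squares gain and offset per candidate. The stimulus set, noise level, and parameter grids are hypothetical, and the sketch fits response amplitudes directly rather than an fMRI time series, so it is not the authors' pipeline.

```python
import numpy as np

def gaussian_tuning(rf, mu, sigma):
    """Predicted response to radial frequency `rf` for a unit tuned to `mu`."""
    return np.exp(-0.5 * ((rf - mu) / sigma) ** 2)

def fit_voxel(rfs, responses, mus, sigmas):
    """Grid search: return (mu, sigma, r2) of the best-fitting Gaussian."""
    best = (None, None, -np.inf)
    for mu in mus:
        for sigma in sigmas:
            pred = gaussian_tuning(rfs, mu, sigma)
            # Least-squares gain and offset for this candidate tuning curve.
            X = np.column_stack([pred, np.ones_like(pred)])
            beta, *_ = np.linalg.lstsq(X, responses, rcond=None)
            resid = responses - X @ beta
            r2 = 1.0 - resid.var() / responses.var()
            if r2 > best[2]:
                best = (mu, sigma, r2)
    return best

# Hypothetical stimulus set and a simulated voxel tuned to RF 4.
rfs = np.array([2, 3, 4, 5, 6, 8, 10, 12, 16, 20], dtype=float)
rng = np.random.default_rng(0)
responses = 1.5 * gaussian_tuning(rfs, 4.0, 2.0) + 0.2 + rng.normal(0, 0.02, rfs.size)

mu_hat, sigma_hat, r2 = fit_voxel(
    rfs, responses, mus=np.arange(2, 21, 1.0), sigmas=np.arange(0.5, 8.5, 0.5)
)
```

In this sketch, a voxel whose best-fitting `mu_hat` is small would count as tuned to low, globally processed RFs.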
Visual object recognition in primates is a very efficient and reliable cognitive ability. Psychophysical studies have shown that the flexibility, efficiency
and performance of visual object recognition are achieved by the representation of shape similarities as opposed to the representation of shapes themselves. Stable versions of such neural representations have been found in
the ventral pathway of non-human primates. However, some aspects of
visual object recognition require dynamic comparisons of shape similarity in the context of a goal-oriented task. This form of representation is more
likely to appear in an area that can integrate bottom-up sensory information with top-down task-relevant information. We tested whether neurons in the lateral
intraparietal area (LIP) of posterior parietal cortex could fulfill this role by
collating information from object specific similarity map representations to
allow general decisions about whether a stimulus matches the object being
looked for. We found that when animals compared two peripheral stimuli
to a sample at their fovea, the response to the matching target remained
stable, but the response to the distractor depended on how similar it is to
the sample: the more similar, the greater the response to the distractor. Our
data suggest that mental comparisons may utilize a dynamic perceptual
similarity representation in LIP, which bridges object information from the
ventral stream with decision making activity in pre-frontal cortex.
26.4016 Decoding face pareidolia in the human brain with
fMRI Susan Wardle1,2([email protected]), Kiley Seymour1,2,3,
Jessica Taubert4; 1Department of Cognitive Science, Macquarie University, Sydney, Australia, 2ARC Centre for Excellence in Cognition and its
Disorders, Macquarie University, 3School of Psychology, University of
New South Wales, Sydney, Australia, 4Laboratory of Brain And Cognition,
National Institute of Mental Health
A common human experience is face pareidolia, whereby illusory faces
are perceived in inanimate objects. A unique aspect of pareidolia is that
the objects are typically perceived simultaneously as both an illusory face
and an inanimate object. Ventral visual areas such as the lateral occipital
complex (LOC) and fusiform face area (FFA) in human occipital-temporal
cortex are category-selective and respond to either objects or faces respectively. Consequently, it is unclear how these category-selective regions process stimuli with a dual face/object identity. Here we use fMRI to probe
how visual stimuli with a persistent dual identity are processed by face
and object-selective areas. We used a diverse image set containing natural
examples of pareidolia in a wide variety of everyday objects. Critically, we
created a yoked image set that was matched for object content and visual
features but did not contain any illusory faces. We used a yoked block
design to measure patterns of BOLD activation in response to objects where
pareidolia was present or absent. Standard functional localizers were used
to define category-selective areas. Using standard leave-one-run-out classification, a linear support vector machine (SVM) could decode pareidolia
objects versus non-face objects from both early visual cortex (V1), and higher-level category-selective areas (LOC and FFA). Importantly, in both LOC
and FFA the classifier could successfully decode the presence or absence
of pareidolia faces in new image sets that were not used for training the
classifier, demonstrating generalization. In contrast, the presence of pareidolia could not be decoded in V1 when different image sets were used for
training versus testing the classifier. This suggests that both FFA and LOC
respond to the presence of illusory faces in inanimate objects. Interestingly,
cross-classification of object identity was not successful in either FFA or
LOC, suggesting face pareidolia is strongly represented in these areas.
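The leave-one-run-out scheme mentioned in abstract 26.4016 can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for the linear SVM used in the abstract. The data are synthetic, and the run, block, and voxel counts are hypothetical.

```python
import numpy as np

def leave_one_run_out_accuracy(patterns, labels, runs):
    """Mean accuracy over folds, each holding out one run for testing."""
    accuracies = []
    for held_out in np.unique(runs):
        train = runs != held_out
        test = ~train
        classes = np.unique(labels[train])
        # Nearest-centroid classifier (a stand-in for the linear SVM).
        centroids = np.stack(
            [patterns[train & (labels == c)].mean(axis=0) for c in classes]
        )
        dists = np.linalg.norm(
            patterns[test][:, None, :] - centroids[None, :, :], axis=2
        )
        predicted = classes[dists.argmin(axis=1)]
        accuracies.append(np.mean(predicted == labels[test]))
    return float(np.mean(accuracies))

# Synthetic data: 6 runs x 2 conditions x 4 blocks, 50 voxels (all hypothetical).
rng = np.random.default_rng(1)
n_runs, n_blocks, n_voxels = 6, 4, 50
runs = np.repeat(np.arange(n_runs), 2 * n_blocks)
labels = np.tile(np.repeat([0, 1], n_blocks), n_runs)
signal = rng.normal(0, 1, n_voxels)  # condition-specific activation pattern
patterns = rng.normal(0, 1, (runs.size, n_voxels)) + np.outer(labels, signal)

accuracy = leave_one_run_out_accuracy(patterns, labels, runs)
```

Holding out whole runs keeps training and test patterns from sharing run-specific noise. Testing generalization to new image sets, as in the abstract, would instead hold out all blocks from one image set.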
26.4017 A tool for automatic identification of cerebral sinuses and
corresponding artifacts in fMRI Keith Jamison1([email protected]),
Luca Vizioli1, Ruyuan Zhang1, Jinyi Tao2, Jonathan Winawer3, Kendrick
Kay1; 1Department of Radiology, University of Minnesota, 2Department of
Psychology, University of Minnesota, 3Psychology and Center for Neural
Science, New York University
Functional magnetic resonance imaging (fMRI) is a widely used method for
investigating the cortical mechanisms of visual perception. Given that fMRI
measures oxygenation-related changes in hemodynamics, it is critical to
understand the factors governing the accuracy with which hemodynamics
reflect neural activity. We conducted ultra-high-resolution fMRI in human
visual cortex during a simple event-related visual localizer experiment (7T,
0.8-mm isotropic, 2.2-s TR, 84 slices, gradient-echo EPI), and also collected
whole-brain anatomical T1- and T2-weighted volumes (3T, 0.8-mm isotropic). We find that major cerebral sinuses (superior sagittal sinus, straight
sinus, and left and right transverse sinuses) can be clearly identified by
computing the ratio of the T1- and T2-weighted volumes (Salimi-Khorshidi
et al. 2014), and we show that these sinuses are nearly perfectly aligned
across subjects after transformation to volumetric MNI space. We then construct a sinus atlas and develop a software tool that automatically predicts
the location of the sinuses given only a T1-weighted anatomical volume
obtained for a subject. We show that this tool accurately reproduces manual
segmentations of the sinuses in our subjects. Importantly, we demonstrate
that regions of the cortical surface located near the sinuses correspond to
regions with signal dropout and unreliable fMRI responses in our functional data. These sinus-affected regions are not only located near hV4 as
previously reported (Winawer et al. 2010), but are also located near many
other regions in occipital, parietal, and temporal cortex. Because the atlas is
accurate, automated, and easy to use, we suggest that it be routinely used
to identify cortical regions that are likely to suffer from imaging artifacts,
thereby avoiding the need to exclude regions based on ad hoc, subjective
measures and aiding proper interpretation of fMRI data.
Scene Perception: Models and other
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4018 Places: An Image Database for Deep Scene Understanding Bolei Zhou1([email protected]), Agata Lapedriza2, Antonio Torralba1, Aude Oliva1; 1MIT, 2Universitat Oberta de Catalunya
The rise of multi-million-item dataset initiatives has enabled machine
learning algorithms to reach near-human performance at object and scene
recognition. Here we describe the Places Database, a repository of 10 million pictures, labeled with semantic categories and attributes, comprising
a quasi-exhaustive list of the types of environments encountered in the
world. Using state-of-the-art Convolutional Neural Networks (CNNs), we
report classification performance on natural images collected
in the wild with a smartphone, as well as the regions used by the model to
identify the type of scene. Looking into the representation learned by the
units of the neural networks, we find that meaningful units representing
shapes, objects, and regions emerge as the diagnostic information to represent visual scenes. With its high coverage and high diversity of exemplars,
Places offers an ecosystem of visual context to guide progress on currently
intractable visual recognition problems. Such problems could include
determining the actions happening in a given environment, spotting inconsistent objects or human behaviors for a particular place, and predicting
future events or the cause of events given a scene.
26.4019 Similarities Between Deep Neural Networks and Brain
Regions In Processing Good and Bad Exemplars of Natural
Scenes Manoj Kumar1,2([email protected]), Shuchen Zhang3, Diane
Beck1,2,4; 1Neuroscience Program, University of Illinois at Urbana-Champaign, 2Beckman Institute, University of Illinois at Urbana-Champaign,
3Department of Computer Engineering, University of Illinois at Urbana-Champaign, 4Department of Psychology, University of Illinois at Urbana-Champaign
The layers of Deep Neural Networks (DNNs) have shown some commonality with processing in the visual cortical hierarchy. It is unclear, however,
whether DNNs capture other behavioral regularities of natural scenes,
e.g. the representativeness of an image to its category. Humans are better at categorizing and detecting good (more representative of category)
than bad exemplars of natural scenes. Similarly, prior work has shown
that good exemplars are decoded better than bad exemplars in V1 as well
as higher visual areas such as the retrosplenial cortex (RSC) and the parahippocampal place area (PPA). Here we ask whether a DNN that was
not explicitly informed about the representativeness of its training set shows
a similar good-exemplar advantage and, if so, in which layers we see
this effect. We used good and bad exemplars from six categories of natural scenes (beaches, city streets, forests, highways, mountains and offices)
and processed them through the pre-trained Places205-AlexNet. We asked
both whether this DNN could distinguish good from bad scenes, as well
as whether 6-way category classification differed for good and bad exemplars. Classification was performed using the feature space of each layer
separately. Our results show that while in the lowest layer (conv1) there is
insufficient information to make a good vs. bad discrimination, layer
fc7 (the second-highest fully connected layer) can clearly make this discrimination. Furthermore, the six-way categorization was better for good
exemplars than bad exemplars in lower and higher layers, although categorization accuracy increases overall at the higher layers. This parallels
results seen in V1, RSC and PPA, and suggests that the DNN learns statistical regularities that distinguish good from bad exemplars without such
information being explicitly encoded into the training set.
Acknowledgement: ONR MURI (DMB)
26.4020 Computational mechanisms for identifying the navigational affordances of scenes in a deep convolutional neural network Michael Bonner1([email protected]), Russell Epstein1;
1University of Pennsylvania
A central component of spatial navigation is determining where one can
and cannot go in the immediate environment. For example, in indoor environments, walls limit one’s potential routes, while passageways facilitate
movement. In a recent set of fMRI experiments, we found evidence sug-
asked to rate each image with one of four options: Not symmetric, Somewhat symmetric, Symmetric, and Highly symmetric. We measure the
bilateral symmetry of an image by comparing CNN features across multiple levels between two vertical halves of an image. We use the AlexNet
model pre-trained on the ImageNet dataset for extracting feature maps at
all 5 convolutional layers. The extracted feature maps of the two bilateral
halves are then compared to one another at different layers and spatial levels. The degree of similarity on different feature maps can then be used to
model the range of symmetry an image can be seen to have. We train a
multiclass SVM classifier to predict one of the four symmetry judgements
based on these multi-level CNN symmetry scores. Our symmetry classifier
has a very low accuracy when it needs to predict all observers’ responses
equally well on individual images. However, our classification accuracies
increase dramatically when each observer is modeled separately. Our
results suggest that symmetry is in fact in the eye of the beholder: While
some observers focus on high-level object semantics, others prefer low or
mid-level features in their symmetry assessment.
While there is evidence that both visual salience and previously stored scene
knowledge influence scene viewing behavior, the relationship between
them and viewing behavior is unclear. The present study investigated the
relationship between stimulus-based saliency and knowledge-based scene
meaning. Sixty-five participants performed a scene memorization task
while their eye movements were recorded. Each participant viewed 40 real-world scenes for 12 seconds each. A duration-weighted fixation density
map for each scene was computed across all 65 participants to summarize
viewing behavior. A saliency map for each scene was computed using the
Graph-Based Visual Saliency model (Harel, Koch, & Perona 2006). Finally,
a meaning map was generated for each scene using a new method in which
people rated how informative/recognizable each scene region was on a
6-point Likert scale. Specifically, each scene was sampled using overlapping
circular patches at two spatial scales (300 3° patches or 108 7° patches). Each
unique patch was then rated by participants on Amazon Mechanical Turk
(N=165), each of whom rated a random set of ~300 patches. The 3° and 7° rating maps
were smoothed using interpolation, and then averaged together to produce
a meaning map for each scene. The unique and shared variance between
the fixation density map and the saliency and meaning maps for each scene
were computed using multiple linear regression with salience and meaning as predictors. The squared partial correlation showed that on average
meaning explained 50% (SD=11.9) of the variance in scene fixation density
while salience explained 35% (SD=12.5). The squared semi-partial correlation indicated that on average meaning uniquely explained 19% (SD=10.6)
of variance in fixation density while salience only uniquely accounted for
4% (SD=4.0). These results suggest that scene meaning is a better predictor
of viewing behavior than salience, and stored scene-knowledge uniquely
accounts for relevant scene regions not captured by salience alone.
26.4021 Expecting and detecting objects in real-world scenes: when do target, nontarget and coarse scene features contribute? Harish Katti1(harish[email protected]), Marius Peelen2, S. P. Arun1;
1Centre for Neuroscience, Indian Institute of Science, Bangalore, India,
560012, 2Center for Mind/Brain Sciences, University of Trento, 38068
Rovereto, Italy
Humans excel at finding objects in complex natural scenes but understanding this behaviour has been difficult because natural scenes contain targets, nontargets and coarse scene features. Here we performed two studies to elucidate object detection on natural scenes. In Study 1, participants
detected cars or people in a large set of natural scenes. For each scene,
we extracted target-associated features, annotated nontarget objects, and
extracted coarse scene structure and used them to model detection performance. Our main finding is that target detection in both person and car tasks
was predicted using target and coarse scene features, with no discernible
contribution of nontarget objects. By contrast, nontarget objects predicted
target rejection times in both person and car tasks, with contributions from
target features for person rejection. In Study 2, we sought to understand
the computational advantage of context. Context is commonly thought of as
reducing computation by constraining the locations to search. But can it have a
more fundamental role in making detection more accurate? To do so, scene
context must be learned independently from target features. Humans,
unlike computers, can learn contextual expectations separately when we
see scenes without targets. To measure these expectations, we asked subjects to indicate the scale, location and likelihood at which targets may
occur in scenes without targets. Humans showed highly systematic expectations that we could accurately predict using scene features. Importantly,
we found that augmenting state-of-the-art deep neural networks with these
human-derived expectations improved performance. This improvement
came from accepting poor matches at highly likely locations and rejecting strong matches at unlikely locations. Taken together our results show
that humans show systematic behaviour in detecting objects and forming
expectations on natural scenes that can be predicted and understood using
computational modelling.
Acknowledgement: (HK) Department of Science and Technology, Government
of India, (SPA) Indian Institute of Science (SPA, MVP) India Trento partnership
26.4022 Symmetry in the Eye of the Beholder Seyed Ali Amirshahi1,2,3([email protected]), Asha Anoosheh4, Stella Yu1,2, Jakob
Suchan5, Carl Schultz6, Mehul Bhatt5; 1UC Berkeley, 2ICSI, 3NTNU, 4ETH
Zurich, 5University of Bremen, 6University of Muenster
We study how subjective perception of symmetry can be computationally
explained by features at different levels. We select 149 images with varying degrees of symmetry from photographs and movie frames and collect
responses from 200 subjects. Each subject is shown 50 random images and
Acknowledgement: Seyed Ali Amirshahi was supported by a fellowship within
the FITweltweit programme of the German Academic Exchange Service (DAAD).
26.4023 The Relationship Between Salience and Meaning During
Real-World Scene Viewing Taylor Hayes1([email protected]), John
Henderson1,2; 1Center for Mind and Brain, University of California, Davis,
2Department of Psychology, University of California, Davis
Acknowledgement: National Science Foundation (BCS-1151358)
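The squared partial and squared semi-partial correlations reported for abstract 26.4023 follow from comparing nested regressions. The sketch below uses the textbook definitions: the squared semi-partial correlation is the gain in R^2 when a predictor is added, and the squared partial correlation divides that gain by the variance left unexplained by the other predictor. The maps are synthetic stand-ins, and the function names are illustrative rather than taken from the authors' analysis.

```python
import numpy as np

def r_squared(y, *predictors):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    X = np.column_stack([np.ones_like(y), *predictors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (residual @ residual) / (centered @ centered)

def unique_variance(y, target, other):
    """Squared semi-partial and squared partial correlation of `target` with y."""
    r2_full = r_squared(y, target, other)
    r2_other = r_squared(y, other)
    semipartial_sq = r2_full - r2_other             # share of total variance
    partial_sq = semipartial_sq / (1.0 - r2_other)  # share of variance `other` leaves
    return semipartial_sq, partial_sq

# Synthetic stand-ins for the flattened maps of one scene (values are made up).
rng = np.random.default_rng(2)
n_pixels = 2000
salience = rng.normal(size=n_pixels)
meaning = 0.5 * salience + rng.normal(size=n_pixels)
fixation_density = 0.8 * meaning + 0.2 * salience + rng.normal(size=n_pixels)

meaning_sr2, meaning_pr2 = unique_variance(fixation_density, meaning, salience)
salience_sr2, salience_pr2 = unique_variance(fixation_density, salience, meaning)
```

Because meaning and salience are correlated in this simulation, each predictor's unique contribution is smaller than its simple squared correlation with fixation density.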
26.4024 THREAT - A database of line-drawn scenes to study threat
perception Jasmine Boshyan1,2([email protected]),
Nicole Betz3, Lisa Feldman Barrett1,3,4, David De Vito5, Mark Fenske5,
Reginald Adams, Jr.6, Kestutis Kveraga1,2; 1Athinoula A. Martinos Center
for Biomedical Imaging, Massachusetts General Hospital, Charlestown,
MA, USA, 2Department of Radiology, Harvard Medical School, Boston,
MA, USA, 3Department of Psychology, Northeastern University, Boston,
MA, USA, 4Department of Psychiatry, Massachusetts General Hospital, Charlestown, MA, USA, 5Department of Psychology, University of
Guelph, Guelph, Canada, 6Department of Psychology, The Pennsylvania
State University, State College, PA
Efficient extraction of threat information from scene images is a remarkable
feat of our visual system, but little is known about how it is accomplished.
To facilitate studies of threat perception with well-controlled scene images,
we created a set comprising 500 hand-traced line drawings of photographic
visual scenes depicting various dimensions of threat. We used color-photo
scene images previously reported in Kveraga et al. (2015) depicting direct
gesting that the human visual system solves this problem by automatically
identifying the navigational affordances of the local scene. Specifically, we
found that the occipital place area (OPA), a scene-selective region near the
transverse occipital sulcus, appears to automatically encode the navigational layout of visual scenes, even when subjects are not engaged in a navigational task. Given the apparent automaticity of this process, we predicted
that affordance identification could be rapidly achieved through a series
of purely feedforward computations performed on retinal inputs. To test
this prediction and to explore other computational properties of affordance
identification, we examined the representational content in a deep convolutional neural network (CNN) that was trained on the Places database for
scene categorization but has also been shown to contain information relating to the coarse spatial layout of scenes. Using representational similarity
analysis (RSA), we found that the CNN contained information relating to
both the neural responses of the OPA and the navigational affordances of
scenes, most prominently in the mid-level layers of the CNN. We then performed a series of analyses to isolate the visual inputs that are critical for
identifying navigational affordances in the CNN. These analyses revealed
a strong reliance on visual features at high-spatial frequencies and cardinal
orientations, both of which have previously been identified as low-level
stimulus preferences of scene-selective visual cortex. Together, these findings demonstrate the feasibility of computing navigational affordances in
a feedforward sweep through a hierarchical system, and they highlight the
specific visual inputs on which these computations rely.
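Representational similarity analysis (RSA), as used above to compare CNN layers with OPA responses, can be sketched in a few lines. The example builds a representational dissimilarity matrix (RDM) for each system and rank-correlates their upper triangles. All data are synthetic, the variable names are hypothetical, and the rank correlation ignores ties, which is adequate for continuous-valued dissimilarities.

```python
import numpy as np

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson r between rows."""
    return 1.0 - np.corrcoef(patterns)

def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries as a flat vector."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def spearman(a, b):
    """Spearman correlation via ranks (assumes no tied values)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

def rsa_similarity(patterns_a, patterns_b):
    """Rank correlation between the RDMs of two systems (rows = stimuli)."""
    return spearman(upper_triangle(rdm(patterns_a)), upper_triangle(rdm(patterns_b)))

# Synthetic example: a CNN layer and an ROI that share latent scene structure.
rng = np.random.default_rng(4)
n_scenes = 20
latent = rng.normal(size=(n_scenes, 5))
cnn_layer = latent @ rng.normal(size=(5, 100)) + 0.1 * rng.normal(size=(n_scenes, 100))
roi = latent @ rng.normal(size=(5, 60)) + 0.1 * rng.normal(size=(n_scenes, 60))
unrelated = rng.normal(size=(n_scenes, 60))

shared = rsa_similarity(cnn_layer, roi)
baseline = rsa_similarity(cnn_layer, unrelated)
```

Because RSA compares dissimilarity structure rather than raw activations, the two systems may have different numbers of units or voxels, as here.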
threat, indirect threat, threat aftermath, and low threat scenes. Sixty participants were randomly assigned to rate all 500 scenes answering one of three
questions: 1) How much harm might you be about to suffer in this scene if
this was your view of the scene?; 2) How much harm might someone (not
you) be about to suffer in this scene?; 3) How much harm might someone
(not you) have already suffered in this scene? Another 134 participants
were randomly assigned to rate the images on various other threat dimensions. The mean ratings on these threat dimensions were submitted to a factor analysis, which resulted in three distinct factors including Affect (comprised of perceived emotional intensity, physical and psychological harm,
and affect), Proximity (comprised of perceived threat clarity, its proximity
in space and time, and degree of motion), and Agency (comprised of perceived human and animal agency, and whether inanimate objects present
in the scene could be used as a potential weapon). Mean ratings on three
harm questions and three factors were then submitted to cluster analyses,
which grouped images into six distinct categories. This unique set of
images, accompanied by ratings assessing multiple dimensions of threat
and their clusters, is well suited for investigating research questions on
emotion regulation and threat perception in neurotypical and clinical populations. Information on using it can be found at http://www.kveragalab.
Acknowledgement: This work was supported by grant # R01 MH101194 awarded
to KK and RBA, Jr
26.4025 The Use of Infographics to Evaluate Visual Context Processing Beliz Hazan1([email protected]), Daniel D. Kurylo1,2; 1The
Graduate Center, CUNY, 2CUNY Brooklyn College
The use of contextual information may be explored with infographics
(informational graphics). Infographics are described as a combination of text,
visual pictures, and graphs to demonstrate data, information and knowledge, as well as convey information through visual storytelling. Comprehending infographics has been associated with several cognitive functions,
including attention, visuospatial perception, and visual working memory,
as well as perception of holistic characteristics, a process termed Gestalt
Thinking. The study described here aimed to develop an assessment tool of
context processing by using infographics at different perceptual and cognitive levels. Observers viewed complex images and were asked specific
questions about information contained within the image. Level 1 test items
contained relationships among basic stimulus features, such as color and
luminance, which required perceptual comparison and reasoning. Level 2
test items contained conceptual relationships among stimulus components,
which required deductive reasoning. Performance was indexed as the
level of feature disparity, where critical visual information was progressively made more salient. Assessments of verbal comprehension (vocabulary and similarities) and perceptual reasoning (block design and matrix
reasoning) were based upon a standardized test (WASI II). Results indicated
that unlike Level 1 infographics, a significant positive correlation existed
between Level 2 infographics and matrix reasoning, which involves fluid
intelligence, knowledge of part-whole relationships, and perceptual organization (Spearman rs=.897, p< .05). Unexpectedly, a significant negative
correlation existed between Level 2 infographics and the similarities subtest, which involves crystallized intelligence and verbal concept formation
(rs =-.901, p< .001). Results indicate that comprehension of Level 2 infographics, which rely on global relationships, is enhanced by visuospatial
and perceptual organization ability, but weakened by greater ability in
focusing on specific concepts. Results support a model of contextual processing that emphasizes global relationships and deemphasizes attentional
focus on image components.
Acknowledgement: The Graduate Center, CUNY Doctoral Student Research
from other objects is the specific spatial information which they carry
regarding other objects. Our lab previously showed that participants have a
precise notion of where objects belong relative to anchors but not relative to
other objects (Boettcher & Vo, 2016). In a series of two eye-tracking experiments we tested what role anchor objects occupy during visual search. In
Experiment 1, participants searched through scenes for an object which was
cued in the beginning of each trial. Critically, in half of the scenes a target
relevant anchor was swapped for an irrelevant, albeit semantically consistent, anchor. This led to marginally faster reaction times and time to first
fixation on the target. Additionally, subjects covered significantly less of the
scene when the anchor was present compared to swapped. These marginal
effects might underestimate the role of anchors owing to the sheer speed of
the search, partly due to the guidance available from the physical features
of the target. Therefore, in Experiment 2 participants were briefly shown
a target-absent scene before the target cue. Search was then restricted to
a gaze-contingent window. Participants were now significantly faster to
respond, and the area of the scene which they covered was significantly
smaller for trials with congruent compared to swapped anchors. Moreover, observers were marginally faster at fixating the target in the anchor
present trials. Taken together, anchor objects seem to play a critical role in
scene grammar, and specifically in executing it during visual search within
Acknowledgement: DFG grant VO 1683/2-1 to MLV.
26.4027 Aging alters neural processing underlying figure-ground
organization Allison Sekuler1([email protected]), Jordan Lass1,
Ali Hashemi1, Patrick Bennett1, Mary Peterson2; 1Department of Psychology, Neuroscience & Behaviour, McMaster University, 2Department of
Psychology and Cognitive Science Program, University of Arizona
Aging decreases observers’ ability to segment figure from ground, but what
neural mechanisms underlie age-related changes in figure-ground organization? We measured EEG in older and younger observers while they
viewed stimuli comprising eight alternating convex and concave regions,
and indicated whether a red probe was “on” or “off” the region they perceived as figure. There were two types of stimuli: (1) high-competition stimuli, in which all convex regions were one colour and all concave regions
were a different colour; these stimuli support two competing interpretations: convex or concave figures in front of a ground of uniformly coloured
regions of the other type; (2) low-competition stimuli, in which all concave
regions were the same colour, but convex regions were each coloured differently, favouring the interpretation of convex figures of varying colours
in front of a uniformly coloured background comprising concave regions.
In younger observers, the amplitude of the parieto-occipital N250 event-related potential (ERP) was more negative for high- than low-competition
stimuli. This difference was absent in older observers. Furthermore, the
N250 amplitude difference was inversely correlated with behaviour: Individuals showing a larger N250 difference between high- and low-competition stimuli reported seeing the convex regions as figures equally often in
high- and low-competition stimuli, whereas individuals with similar N250
amplitudes across conditions were less likely to perceive convex regions as
figure in high- than low-competition stimuli. The brain-behaviour correlation also separated observers by age: Older observers generally showed
a large behavioural difference between high- and low-competition conditions, but smaller ERP differences; whereas younger observers generally
showed a small behavioural difference, but larger ERP differences. These
results suggest that figure-ground organization is driven by mechanisms
sensitive to the degree of competition between stimulus interpretations,
and that the age-related reduction in resolving high figure-ground competition in healthy aging is related to an altered neural response.
Acknowledgement: CFI, CIHR, CRC, NSF, NSERC, NSF, ONR
26.4026 Anchoring spatial predictions: Evidence for the critical
role of anchor objects for visual search in scenes. Sage Boettcher1,2([email protected]), Eric Dienhart1, Melissa Vo1; 1Scene
Grammar Lab, Goethe University Frankfurt, 2Brain & Cognition Lab,
Oxford University
Real-world scenes follow certain rules known as scene grammar, which
allow for extremely efficient visual search. In the current work, we seek
to understand what role objects, specifically anchor objects, hold during a
visual search in 3D rendered scenes. Anchors are normally large and diagnostic of the scene they are found in. However, what distinguishes anchors
Scene Perception: Neural mechanisms
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4029 Time-resolved fMRI decoding reveals spatio-temporal
characteristics of scene processing network Zhengang Lu1([email protected]), Soojin Park1; 1Department of Cognitive Science, Johns
Hopkins University
Acknowledgement: National Eye Institute (NEI R01EY026042, to SP)
26.4030 Evidence for a grid-like representation of visual space in
humans Joshua Julian1([email protected]), Alex Keinath1, Giulia
Frazzetta1, Russell Epstein1; 1Department of Psychology, University of
Most cortical regions represent visual space retinotopically. However,
many behaviors would benefit from a non-retinotopic representation of
visual space. Grid cells in the entorhinal cortex (EC) may provide a neural substrate for such a non-retinotopic representation. In freely navigating
rodents, grid cells fire when the animal’s body occupies a hexagonal lattice
of spatial locations along the chamber floor. In head-fixed monkeys, on the
other hand, grid cells fire when the animal directs its gaze to a hexagonal
lattice of locations on the visible screen (Killian et al., 2012). To determine
whether similar scene-based grid responses can be identified in humans, we
scanned participants with fMRI while tracking their gaze during an unconstrained visual search task in which they had to find a target letter (‘L’)
among numerous distractor letters (‘T’s). Building on fMRI methods previously used to identify the grid signal during virtual navigation (Doeller et
al., 2010), we used a quadrature filter approach to measure fMRI responses
as a function of gaze movement direction. In particular, we first extracted
gaze movement directions modulo 60°, thus equating all 6-fold symmetric
gaze movement directions. Then, using half of the fMRI data, we computed
the rotation of the gaze movement directions that maximally modulated
EC activity. Using this fit rotation angle, we predicted EC activity in the
withheld fMRI data. Examination of these independent data confirmed that
there was significant modulation of EC activity bilaterally as a function of
gaze movement direction. Follow-up analyses confirmed that this modulation exhibited only the 6-fold rotational symmetry characteristic of grid cell
firing, and not 4- or 8-fold symmetries. These results mark the first evidence
of a grid-like representation of visual space in humans, and suggest that the
same mechanisms supporting the cognitive map of navigational space may
also support a map of visual space.
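The split-half logic described in this abstract can be illustrated with a short sketch on synthetic data. This is not the authors' pipeline: it assumes one response value per gaze movement sample, estimates the orientation of an n-fold modulation from a cosine/sine (quadrature) pair in one half of the data, and then measures the aligned modulation in the withheld half. The function names, the noise level, and the simulated 10° orientation are hypothetical choices made only for illustration.

```python
import numpy as np


def fit_orientation(theta, y, fold=6):
    """Estimate the preferred orientation (radians) of an n-fold modulation.

    Regresses y on a quadrature pair, cos(fold*theta) and sin(fold*theta),
    and converts the two weights into a phase.
    """
    X = np.column_stack([np.ones_like(theta),
                         np.cos(fold * theta),
                         np.sin(fold * theta)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.arctan2(beta[2], beta[1]) / fold


def aligned_amplitude(theta, y, phi, fold=6):
    """Amplitude of the modulation aligned to orientation phi in held-out data."""
    X = np.column_stack([np.ones_like(theta),
                         np.cos(fold * (theta - phi))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def run_demo(seed=0, n=2000, true_phi_deg=10.0):
    """Simulate a 6-fold signal, then cross-validate 4-, 6-, and 8-fold models."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, n)          # simulated gaze movement directions
    y = np.cos(6 * (theta - np.deg2rad(true_phi_deg))) + rng.normal(0.0, 1.0, n)
    train, test = slice(0, n // 2), slice(n // 2, n)  # two independent halves
    results = {}
    for fold in (4, 6, 8):
        phi = fit_orientation(theta[train], y[train], fold)
        results[fold] = (np.rad2deg(phi),
                         aligned_amplitude(theta[test], y[test], phi, fold))
    return results


if __name__ == "__main__":
    for fold, (phi_deg, amp) in run_demo().items():
        print(f"{fold}-fold: orientation {phi_deg:6.1f} deg, held-out amplitude {amp:5.2f}")
```

In this simulation only the 6-fold model should show a clearly positive held-out amplitude, which mirrors the control comparison against 4- and 8-fold symmetries described in the abstract.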
26.4031 Discriminating multimodal from amodal representations
of scene categories using fMRI decoding Yaelan Jung1([email protected]
gmail.com), Bart Larsen2, Dirk Bernhardt-Walther1; 1Department of Psychology, University of Toronto, 2Department of Psychology, University of
Previous studies have shown that, unlike V1 and A1, temporal, parietal,
and prefrontal cortices process sensory information from multiple sensory
modalities (Downar et al., 2000). However, it is unknown whether neurons
in these areas process sensory information regardless of modality (amodal),
or whether these areas contain separate but spatially mixed populations
of neurons dedicated to each sensory modality (multimodal). Here we
used fMRI to study how temporal, parietal, and prefrontal areas represent scene categories in the case of conflicting evidence from visual and
auditory input. For instance, participants were shown an image of a beach
and played office sounds at the same time. If a brain area processes visual
and auditory information separately, then we expect scene categories to be
decodable from at least one modality, as conflicting information from the
other modality is not processed by the same neurons. However, in an area
where neurons integrate information across sensory modalities, conflicting
information from visual and auditory inputs should lead to interference
and hence a deterioration of the neural representation of scene categories.
In our experiment, we were able to decode scene categories from fMRI
activity in temporal and parietal areas for visual or auditory stimuli. By
contrast, in prefrontal areas, we could decode neither visual nor auditory
scene categories in this conflicting condition. Note that both types of scene
categories were decodable from the image-only and sound-only conditions,
when there was no conflicting information from the other modality. These
results show that even though temporal, parietal, and prefrontal cortices
all represent scene categories based on multimodal inputs, only prefrontal
cortex contains an amodal representation of scene categories, presumably
at a conceptual level.
Acknowledgement: NSERC Discovery Grant (#498390), Canadian Foundation
for Innovation
26.4032 Retinotopic organization of scene area in macaque inferior
temporal cortex and its implications for development Michael
Arcaro1([email protected]), Margaret Livingstone1;
1Department of Neurobiology, Harvard Medical School
Primates have specialized domains in inferior temporal (IT) cortex that are
responsive to particular object categories. Recent fMRI studies have shown
that retinotopic maps cover much of category-selective IT cortex in humans
and monkeys. So far, retinotopy in monkey IT cortex has been reported
within and around the lower bank of the STS (Kolster et al. 2014; Janssens
et al. 2014). In the present study, we confirm this previously reported retinotopy and extend these prior findings by examining retinotopy in the
ventral-most regions of IT - occipital temporal sulcus (OTS). We identified two retinotopic areas, referred to as OTS1 and OTS2, which have not
been described previously in the macaque. These new regions are located
ventral to retinotopic areas V4A and PIT. Both regions contain contralateral representations of the periphery with little coverage of central visual
space. OTS1/2 show selectivity for scenes compared to objects, faces, and
bodies. Our results resolve the relationship between scene-selective areas
in humans (Aguirre et al. 1996, Epstein and Kanwisher 1998) and primates
(Nasr et al. 2011; Kornblith et al. 2013). OTS1/2 overlap with the functionally defined place area, LPP. Further, the visual field organization of
OTS1/2 corresponds well with the organization of scene-selective retinotopic areas PHC1/2 in humans (Arcaro et al. 2009). Our data provide new
evidence that monkey LPP is the homologue to human area PPA. Our
results illustrate parallels in the retinotopic organization between primate
species. First, the broad eccentricity bias across human ventral temporal
cortex (Hasson et al. 2002) is clearly present in macaque IT cortex. Second,
we find that the extent of retinotopy in macaque IT cortex roughly matches
that in humans. Recent results from our lab suggest that this retinotopic
organization is present at birth and is likely fundamental in guiding experience-dependent development of IT.
Acknowledgement: NIH, NIBIB
Current functional brain imaging studies have identified a number of areas
involved in scene processing in the human brain, including the Parahippocampal Place Area (PPA), the Occipital Place Area (OPA), and the Retrosplenial Complex (RSC). These spatial loci of scene processing have to be
combined with the temporal characteristics of brain activity to provide a
comprehensive understanding of what and how each brain area represents
scene information. Using a recently developed time-resolved repetition
paradigm, we combined fMRI and a multi-voxel pattern decoding method
to show a spatial and temporal characterization of brain responses to scene
images varied in spatial boundary (open vs. closed) and scene content (natural vs. urban). Critically, the repetition lags between the first and second
scene image were manipulated in small steps (33 ms) ranging from 66 ms
to 1033 ms to reconstruct time courses of spatial boundary and scene content representation in each functionally localized scene-related brain area.
We observed that the PPA showed a similar time course for both spatial
boundary and scene content information, suggesting that the temporal
characteristics of the PPA might be robust to various scene properties.
Interestingly, the OPA showed an opposite time course for representing
spatial boundary and scene content information, suggesting that the OPA
might process spatial boundary and scene content in a competing manner
in time. In contrast, RSC did not show any relation in representing spatial
boundary and scene content in time. These findings suggest that different
scene areas differ not only in what types of scene information they represent
but also in the temporal profiles with which they extract information from scenes. Our
preliminary results provide a spatio-temporal-resolved fMRI approach as a
tool for further understanding the neural dynamics in the scene processing
network during the first second of scene perception.
26.4033 Eye movements during scene viewing are causally dependent on the occipital place area Jennifer Henry1([email protected]
com), George Malcolm2, Edward Silson1, Chris Baker1; 1Laboratory of
Brain and Cognition, NIMH, NIH, 2School of Psychology, University of
East Anglia, UK
Despite the huge variability of visual properties in our environment, we
can efficiently process the scenes we are embedded in. This processing is
supported by three cortical regions: parahippocampal place area (PPA),
medial place area (MPA) [or retrosplenial complex, RSC], and occipital
place area (OPA). Within the contexts of recognition and navigation, the
functions of these regions are generally studied in terms of the visual information they respond to. Here we move beyond these tasks to investigate
the role of OPA in guiding eye movements during scene viewing. OPA is
i) located in occipito-parietal cortex, likely feeding information into parts
of the dorsal pathway critical for eye movements, and ii) contains retinotopic representations of the contralateral visual field. OPA was disrupted
with transcranial magnetic stimulation (TMS) while participants searched
scenes for 1s. Participants then chose which of two objects had been in the
previous scene. On half of the trials, participants received repetitive TMS: a
five pulse train over 500ms, starting at scene onset. Half of the participants
received TMS to rOPA and half to rOFA (occipital face area), which also
exhibits a contralateral visual field bias though is more responsive to face
stimuli. If OPA plays a causal role for gaze guidance in scenes, then TMS
to rOPA, but not rOFA, should disrupt the eye movement pattern. Given
OPA’s contralateral representation, eye movements should be biased
toward the ipsilateral visual field following rOPA, but not rOFA stimulation. There was an overall left-to-right gaze pattern across all conditions,
despite every trial starting at center. Critically, the average fixation position
for participants in the rOPA condition was biased toward the ipsilateral
visual field and saccade latencies to the ipsifield were shorter. These results
suggest that OPA might play a causal role in analysing local scene information for eye movement guidance.
26.4034 Category discrimination of early electrophysiological
responses reveals the time course of natural scene perception Matthew Lowe1([email protected]), Jason Rajsic1,
Susanne Ferber1,2, Dirk Walther1,2; 1Department of Psychology, University
of Toronto, 2Rotman Research Institute, Baycrest
Humans have the remarkable ability to categorize complex scenes within a
single glance. Which properties of scenes make this feat possible, and what
is the time course of this process? Neural representations of scene categories for line drawings and colour photographs have been shown to elicit
similar responses in scene-selective cortex. Together with previous investigations highlighting the importance of surface features for scene identification, these results suggest that both structure and surface features play
an integral role in perceiving and understanding our environment. Within
the spatial domain, these features may be closely interwoven in the human
brain. Within the temporal domain, however, they may elicit distinct patterns along a hierarchy of visual processing. To investigate these questions,
the present study used electroencephalography (EEG) to examine the time
course of scene processing (beach; city; forest; highway; mountain; office)
for colour photographs and line drawings (stimulus-type) in the human
visual system. Participants (N=16) performed a blocked scene-memorization task during observation of colour photographs and line drawings. An
initial event-related potential (ERP) analysis revealed dissociable response
patterns across scene categories over the occipital pole for early visually-evoked components P1 and P2. Furthermore, line drawings evoked an
overall higher P1 amplitude, while colour photographs evoked a higher
P2 peak. Additional differences across stimulus-type were distributed
throughout cortex. To investigate these response patterns in greater detail,
we performed an analysis examining the grand-averaged correlations for
within-category versus across-category discriminations during the time
course of scene processing. This analysis revealed that significant discriminations of scene categories in line drawings emerge earlier (~80ms) than
in colour photographs (~100 ms). Critically, these findings provide evidence
that basic-level categorization of scenes can occur earlier in visual processing than object-class detection (e.g., animal detection), and further suggest
that differences in visual feature processing emerge across the temporal
domain for natural scene perception.
Acknowledgement: NSERC Discovery Grant (#498390), Canadian Foundation
for Innovation
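The within-category versus across-category comparison mentioned in abstract 26.4034 can be illustrated with a small sketch. It is an assumption-laden toy example rather than the authors' analysis: the data are simulated, each item is reduced to one channel pattern per time point, and the names `category_discrimination` and `simulate`, the channel count, and the onset sample are all hypothetical.

```python
import numpy as np


def category_discrimination(patterns, labels):
    """Within- minus across-category pattern correlation at each time point.

    patterns: array of shape (n_items, n_channels, n_times)
    labels:   category label per item, length n_items
    """
    labels = np.asarray(labels)
    n_items, _, n_times = patterns.shape
    same = labels[:, None] == labels[None, :]
    within = same & ~np.eye(n_items, dtype=bool)   # same category, different item
    across = ~same                                  # different categories
    scores = np.empty(n_times)
    for t in range(n_times):
        r = np.corrcoef(patterns[:, :, t])          # item-by-item correlation matrix
        scores[t] = r[within].mean() - r[across].mean()
    return scores


def simulate(seed=0, n_cat=6, n_per_cat=4, n_chan=32, n_times=40, onset=20):
    """Noise-only patterns before `onset`; category template plus noise afterwards."""
    rng = np.random.default_rng(seed)
    labels = np.repeat(np.arange(n_cat), n_per_cat)
    templates = rng.normal(size=(n_cat, n_chan))
    data = rng.normal(size=(labels.size, n_chan, n_times))
    data[:, :, onset:] += templates[labels][:, :, None]
    return data, labels


if __name__ == "__main__":
    data, labels = simulate()
    scores = category_discrimination(data, labels)
    print("mean score before onset:", round(float(scores[:20].mean()), 3))
    print("mean score after onset: ", round(float(scores[20:].mean()), 3))
```

A score near zero means that patterns from the same category are no more alike than patterns from different categories; the first time point at which the score departs reliably from zero gives an estimate of when category information emerges.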
26.4035 Artificially-generated scenes demonstrate the importance
of global scene properties for scene perception Mavuso Mzozoyana1([email protected]), Matthew Lowe2,3, Iris Groen4, Jonathan
Cant2, Assaf Harel1; 1Department of Psychology, Wright State University, 2Department of Psychology, University of Toronto Scarborough,
3Department of Psychology, University of Toronto St. George, 4Laboratory
of Brain and Cognition, National Institute of Mental Health, National
Institutes of Health
A recent surge of behavioral, neuroimaging, and electrophysiological studies highlights the significance of global scene properties, such as spatial
boundary and naturalness, for scene perception and categorization. The
stimuli used in these studies are oftentimes real-world naturalistic scene
images, which while essential for maintaining ecological validity, also pose
a real challenge for interpretation. Specifically, since real-world scenes
vary dramatically in physical stimulus properties (e.g. color) and range of
semantic categories they span, it is difficult to isolate the unique role that
global scene properties play in scene processing. To overcome this challenge, the present study used a set of computer-generated scene stimuli
(Lowe et al., 2016) that were designed to control for two global scene properties (spatial boundary and naturalness) while minimizing and controlling
for other sources of scene information, such as color and semantic category.
The set comprised 576 individual grayscale scene exemplars spanning 12
spatial layouts and 12 textures for each combination of naturalness (manmade/natural) and spatial boundary (open/closed). We presented these
artificial scenes to participants while their Event-Related Potentials (ERPs)
were recorded. We aimed to establish whether the artificial scenes would
generate similar electrophysiological signatures of naturalness and spatial
boundary previously obtained using real-world scene images (Harel et al.,
2016). Strikingly, we found that similar to previous work, the peak amplitude of the P2 ERP component was sensitive to both the spatial boundary
and naturalness of the scenes despite vast differences between the stimuli.
In addition, we also found earlier effects of spatial boundary and naturalness, expressed as a modulation of the amplitude of the P1 and N1 components. These results suggest that naturalness and spatial boundary have a
robust influence on the nature of scene processing. This influence is independent of scene category and color, and might be observed earlier than
previously thought.
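The size of the stimulus set in abstract 26.4035 follows from its factorial structure: 2 levels of naturalness × 2 levels of spatial boundary × 12 layouts × 12 textures = 576 exemplars. The sketch below enumerates such a set; it assumes the four factors are fully crossed, and the integer layout and texture labels are placeholders rather than the identifiers used in the actual stimulus set.

```python
from itertools import product

NATURALNESS = ("manmade", "natural")
BOUNDARY = ("open", "closed")
LAYOUTS = range(1, 13)    # 12 spatial layouts (placeholder labels)
TEXTURES = range(1, 13)   # 12 textures (placeholder labels)


def build_stimulus_set():
    """Enumerate every combination of the four factors, assuming full crossing."""
    return [
        {"naturalness": n, "boundary": b, "layout": l, "texture": t}
        for n, b, l, t in product(NATURALNESS, BOUNDARY, LAYOUTS, TEXTURES)
    ]


if __name__ == "__main__":
    stimuli = build_stimulus_set()
    print(len(stimuli))  # total number of exemplars
```

Under this assumption each of the four naturalness-by-boundary cells holds 12 × 12 = 144 exemplars.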
26.4036 Neurodynamics and hemispheric lateralization in threat
and ambiguous negative scene recognition Noreen Ward1([email protected]), David De Vito2, Cody Cushing1, Jasmine
Boshyan1,3, Hee Yeon Im1,3, Reginald Adams, Jr.4, Kestutis Kveraga1,3;
1Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts
General Hospital, Charlestown, MA, 2Department of Psychology, University of Guelph, Guelph, Canada, 3Department of Radiology, Harvard
Medical School, Boston, MA, USA, 4Department of Psychology, The Pennsylvania State University, State College, PA
Efficient threat detection and appropriate action are critical for survival.
However, some stimuli are merely negative without an impending threat
and may offer useful clues about past or future dangers. We have shown
previously (Kveraga et al., 2015) that threat and merely negative scene
images are well discriminated and activate distinct brain networks in fMRI.
However, the neurodynamics and hemispheric contributions underlying this process have not been studied. Methods: In this MEG study, we
employed bilaterally presented threat or merely negative scene images
paired with contextually matched neutral scenes in a 2AFC paradigm. Participants (N=64) had to identify the threatening or negative scene in each
pair via a key press corresponding to the side of presentation. We extracted
source-localized MEG activity from five ROIs in both hemispheres: fusiform
face area (FFA), posterior STS (pSTS), periamygdaloid cortex (PAC), parahippocampal cortex (PHC), and orbitofrontal cortex (OFC). Results: When
threat or merely negative scenes were presented in the left visual hemifield,
the contralateral right hemisphere (RH) and the ipsilateral left hemisphere
(LH) showed significantly greater activation starting at about 300-400 ms
for threat vs. merely negative scenes. This threat amplitude advantage was
significantly greater in LH. Conversely, when threat or merely negative
scenes were presented in the right visual hemifield, the contralateral LH
generally had a phase lead in activity for threat vs. merely negative stimuli
but no amplitude difference, while the ipsilateral RH had higher activity
to merely negative scenes late in the trial, beginning at ~700 ms. Conclusions: Our findings show that deciding between two scene images leads
to differential hemispheric dynamics. Threat images evoke greater activity when presented on the left, in both LH and RH, while merely negative
images evoke increased later activity when presented in the right hemifield.
Acknowledgement: This work was supported by grant # R01 MH101194 awarded
to KK and RBA, Jr
26.4037 Dissociating scene navigation from scene categorization:
Evidence from Williams syndrome Frederik Kamps1([email protected]
edu), Stephanie Wahab1, Daniel Dilks1; 1Department of Psychology, Emory
Recent functional magnetic resonance imaging (fMRI) evidence suggests
that human visual scene processing is supported by at least two functionally distinct systems: one for visually-guided navigation, including the
occipital place area (OPA), and a second for scene categorization (e.g., recognizing a kitchen vs. a beach), including the parahippocampal place area
(PPA). However, fMRI data are correlational, and a stronger test of this
“two systems for visual scene processing” hypothesis would ask whether
it is possible to find cases of neurological insult impairing one ability independent of the other. Toward this end, here we tested visually-guided navigation and categorization abilities in adults with Williams syndrome (WS),
a genetic developmental disorder involving cortical thinning in and around
the posterior parietal lobe (potentially including OPA, but not PPA). WS
adults and mental-age matched (MA) controls (i.e., 7-year-old typically developing children) completed a visually-guided navigation and a categorization task. In the visually-guided navigation task, participants viewed
images of scenes, and indicated which of three doors (left, center, or right)
they would be able to exit along a complete path on the floor. In the categorization task, participants viewed the exact same scene images, and indicated whether each depicted a bedroom, kitchen, or living room. If visual
scene processing is supported by independent visually-guided navigation
and categorization systems, then WS adults should be impaired on the visually-guided navigation task, but not on the categorization task. Indeed, we
found that WS adults performed significantly worse on the visually-guided
navigation task compared to the categorization task, relative to MA controls. These findings provide the first causal evidence for dissociable visually-guided navigation and categorization systems, and further suggest that
this distinction may have a genetic basis. Future studies will ask whether
patients with PPA damage show the opposite profile from WS, for a full
double dissociation.
Acknowledgement: This work was supported by Emory College, Emory
University (DDD), a National Eye Institute Vision Sciences training grant
5T32EY007092-30 (FSK), and a Scholarly Inquiry and Research at Emory
(SIRE) Independent Research Grant (SW)
3D Perception: Shape
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4038 Inferring the deformation of unfamiliar objects Filipp
Schmidt1([email protected]), Flip Phillips2, Roland
Fleming1; 1Department of Psychology, Justus-Liebig-University Giessen,
2Department of Psychology and Neuroscience, Skidmore College
When objects are deformed by external forces (e.g. a crushed can or twisted
rag), the resulting shape is a complex combination of features from the original shape and those imparted by the transformation. If we observe only the
resulting shape, distinguishing the origin of its various features is formally
ambiguous. However, in many cases the transformation leaves distinctive
signatures that could be used to infer how the object has been transformed.
Here we investigated how well observers can identify the type and magnitude of deformations applied to unfamiliar 3D shapes. We rendered objects
subjected to physical simulations of 12 shape-transforming processes (e.g.,
twisting, crushing, stretching). Observers rated the magnitude of object
deformation at different stages of the transformation process (e.g., barely
twisted vs. strongly twisted). Another group viewed one transformed
object at a time and ranked other objects (which were submitted to the same
or one of the 11 other transformations) according to their similarity to the
test object in terms of the applied transformation. A third group viewed a
subset of the objects and painted on the surface to indicate which regions
appeared most informative about the type of transformation. We find that
observers can estimate the magnitude of deformation of unfamiliar objects
without knowing their pre-transformed shapes. They can infer specific
causal origins from these deformations, reflected in their ability to identify other objects subjected to the same transformation. We also identify
the shape features underlying these inferences by comparing the painting responses to the physical mesh deformations. Our findings show that
observers can infer transformations from object shape. This ability to infer
the causal origin of objects is potentially useful in estimating their physical
properties (e.g., stiffness), predicting their future states, or judging similarity between different objects.
Acknowledgement: This research was funded by the DFG (SFB-TRR-135:
“Cardinal mechanisms of perception”) and by an ERC Consolidator Award (ERC-2015-CoG-682859: “SHAPE”)
26.4039 Depth-Inversion “Easillusions” and “Hardillusions”:
Differences for Scenes and Faces Thomas Papathomas1,2([email protected]
rci.rutgers.edu), Attila Farkas1, Tom Grace3, Alistair Kapadia4, John
Papayanopoulos5, Vanja Vlajnic6, Sophia Lovoulos7, Katya Echazarreta8,
Yuan Li9; 1Center for Cognitive Science, Rutgers University, Piscataway,
NJ, 2Dept of Biomedical Engineering, Rutgers University, Piscataway, NJ,
3Dept of Psychology, Rutgers University, Piscataway, NJ, 4Westfield High
School, Westfield, NJ, 5Dept. of Mechanical Engineering, Georgia Tech,
Atlanta, GA, 6Dept. of Applied Statistics, Pennsylvania State University,
College Park, PA, 7Holmdel High School, Holmdel, NJ, 8Dept. of Electrical
Engineering, University of California, Los Angeles, 9Dept. of Computer
Science, Rutgers University, Piscataway, NJ
Introduction: Two depth-inversion (DI) illusions, where viewers perceive
depth structure opposite to the stimulus’s physical depth, are hollow
masks and reverspectives [Wade & Hughes, 1999]. For faces, one explanation is that face-specific 3D stored knowledge and a general convexity bias
overcome data-driven depth cues to produce DI. For reverspectives, stored
general perspective rules (e.g., that retinal trapezoids are rectangles slanted
in physical space with their long retinal edge closer to the viewer) may account
for the DI [Gregory, Phil. Trans. R. Soc. B, 2005]. We call such easily obtained
DI illusions “Easillusions”. Rationale for present study: We investigated
whether humans can downplay stored knowledge and rules to obtain DI
“hardillusions” for stimuli in which stored knowledge and rules oppose,
rather than favor, DI [Papathomas et al. ECVP 2015]. Examples of such
stimuli are normal, convex 3-D faces and “proper-perspectives”, in which
the retinal trapezoids are consistent with the depth of the physical surfaces.
However, our 2015 study included only unpainted masks and fragments of
reverse-perspectives. Methods: This study included a complete reverspective and realistically painted convex masks. Stimuli were both easillusions
(hollow mask, reverspective) and hardillusions (convex mask, proper-perspective). Also, all stimuli were either realistically painted or unpainted.
We assessed how long it took 16 subjects to obtain DI. Results: Painted
reverspectives tended to produce DI faster than unpainted ones, whereas
painted proper perspectives were much slower than unpainted ones. For
faces, the difference for obtaining DI between painted and unpainted faces
was much smaller for both the hollow and convex masks. The only stimuli for which no subject was able to obtain DI were painted convex masks. Conclusions: Perspective painted cues played a much stronger role than facial
painted cues, providing additional evidence that facial 3D geometry plays
a larger role than scene 3D geometry.
26.4040 Distortions in perceived depth magnitude for stereoscopic
surfaces Matthew Cutone1([email protected]), Laurie Wilcox1; 1Centre
for Vision Research, Department of Psychology, York University
When interpreting 3D surfaces, the human visual system is often confronted with regions with little or no structure. It has been shown that the
stereoscopic system rapidly and effectively interpolates depth estimates
across ambiguous regions to create the percept of a continuous plane. In
previous studies, we have found reduced estimates of perceived depth
magnitude for rows of elements with smooth disparity gradients (Deas
& Wilcox, 2015). Here we evaluate whether such distortions are also seen for
extended surfaces, and assess the extent to which the interpolation process is influenced by the coherence of disparate elements. We used stimuli
composed of sparse, randomly positioned elements whose disparity map
depicted two overlapping Gaussian bumps, with laterally separated peaks,
but variable amplitude and width. The Gaussians were constrained so that
the peaks were aligned horizontally, and the stimulus was encircled by a
textured annulus positioned on the screen plane. Two horizontal line segments, one on each side of the circle, indicated the middle of the stimulus:
the region that the observers were asked to estimate. Subjects viewed the
stimuli stereoscopically, and on each trial toggled between the stimulus
and a response screen. On the response screen, subjects indicated the perceived height of the surface along the horizontal midline, using lines with
nodes that could be adjusted vertically, effectively drawing a cross-section
of the surface. We found that depth estimates were accurate in the fronto-parallel regions at the edges of the bumps, but that for all test conditions,
subjects systematically and significantly underestimated depth within the
region containing depth variation. The depth distortions were remarkably
consistent across observers, were not influenced by the percentage of elements displaced off the surface, and appear to have been constrained by the
regions of maximum surface curvature.
26.4041 Shape constancy in anaglyphs: Effects of angle, context
and instruction Alexander Bies1([email protected]), Atsushi Kikumoto1, Stefanos Lazarides1, Margaret Sereno1; 1Psychology, Arts and
Sciences, University of Oregon
When asked to draw a figure from perspective, many people inadvertently
draw a shape representative of the figure’s physical characteristics instead.
This shape constancy effect has been studied extensively, yet questions
remain about which 3-dimensional contextual cues are sufficient to elicit
such effects, and whether, in a fully crossed design, the variables of context,
instruction and angle interact. Here, we rendered quadrilateral surfaces as
skeletal outlines embedded in a cuboid polyhedron or as an isolated shape,
lacking texture and shading, rotated around the viewing plane, and converted them into anaglyphs. Participants were asked to report the shapes’
physical and apparent width using a 6-alternative forced choice paradigm.
Twelve participants each completed 440 trials, 2 repetitions of 11 angles of
rotation of 3 object widths, 2 types of instruction and 2 levels of context.
Statistical analyses averaging across repetitions and object width revealed
a significant three-way interaction among rotation angle, instruction, and
context. Participants were equivalently accurate at low levels of rotation
and when judgment and context were consistent (e.g., physical width
judgments were accurate for 3-dimensional shapes). When instructed
judgment and level of context were inconsistent (i.e., judging the physical
width of a quadrilateral or apparent width of a cuboid), participants made
increasingly large errors (i.e., regression toward the real object and toward
the apparent object, respectively). Results were replicated in a second
experiment using the method of adjustment to capture width judgments.
Although individuals can correctly estimate both real and apparent widths
across rotation angle, errors arise with rotation under conditions of inconsistent instruction and context. In addition, binocular integration is sufficient to drive the perception of three-dimensional shape and elicit shape
constancy effects from anaglyphs. Future studies of artistic ability and the
effects of shape constancy may build on these results and this new approach to
measuring shape constancy.
26.4042 Critical contours link surface inferences with image
flows Benjamin Kunsberg1([email protected]), Steven Zucker2;
1Department of Applied Mathematics, Brown University, 2Department of
Computer Science, Yale University
Three-dimensional shape can be inferred from shading and/or contours,
but both problems are ill posed. Computational efforts impose priors to
force a unique solution. However, psychophysical results indicate that
viewers infer qualitatively similar but not quantitatively identical shapes
from shading or contours or both. The challenge is to connect shading and
contour inference and find the family of related solutions. We have discovered a solution to this challenge by reformulating the problem not as
an inference from images directly to surfaces, but rather as one that passes
through an abstraction. At the heart of this abstraction is another fundamental question rarely addressed: how can local image information, given
e.g. by image derivatives, integrate into global constructs, e.g. surface ridges?
Artists achieve it intuitively with strokes; suggestive contours attempt it
by presuming the surface is given. Our theory uses a geometrical-topological construct (the Morse-Smale complex). We introduce a novel shading-to-critical-contour limiting process, which identifies image regions
(e.g., consecutive narrow bands of bright, dark, and bright intensity) with a
distinct gradient flow signature. Curves through the flow flanked by steep
gradients are the critical contours. They capture generic aspects of artists’
drawings and have a physiologically-identifiable signature. Importantly,
this shading-limit holds for a variety of rendering functions, from Lambertian to specular to oriented textures, so it applies to many materials.
Our main mathematical result proves that the image critical contours are
one-to-one with corresponding structures on (the slant of) surfaces. This
is the invariance that, in turn, anchors the equivalence class of surfaces.
The theory is novel in that (i) it reveals an invariance at the core of surface
inferences; (ii) connects the shading, contour, and texture problems; and
(iii) transitions from local to global. Psychophysical predictions follow.
The phenomenon of motion transparency is well known and has been
extensively investigated for decades. A typical demonstration of motion
transparency shows random dots moving in two directions, but the perception of depth is ambiguous. Here we find that when motion parallax
is used to present planes of random dots at different depths, then this
facilitates the perception of transparency and makes depth less ambiguous. The motion of random dots was synchronized to participants’ head
movements, to present frontoparallel overlaid surfaces at different depths,
within a 28 deg circular mask. The number of overlaid planes, dot density
(0.5, 1, 2, 4, 8 dots/deg²), and depth separation between the planes were
varied. Four participants indicated how many planes in depth they were
able to perceive. In separate trials, participants also indicated the depth of
the planes using a depth matching task. A coherence noise task was also
used to determine the percentage of signal dots necessary to
perceive transparency with motion parallax. The results indicated that
participants could perceive at most three simultaneous overlaid surfaces.
Increasing either the number of planes or dot density had a detrimental
effect on the perception of transparency. At higher dot densities, only two
planes could be perceived. The results with the coherence noise task indicated that transparency could be perceived at percent signal levels comparable to those for motion without transparency. These results are similar
to those found for motion transparency in which disparity was used to
present planes at different depths, and suggest that the number of planes
that can be perceived may be limited to at most three because of effects of
attention. Moreover, depth perception was likely degraded at the highest
densities because of inhibitory interactions between adjacent dots moving
in opposite directions and depth averaging.
26.4044 Seeing through transparent layers Dicle Dovencioglu1(d-
[email protected]), Andrea van Doorn2,3, Jan Koenderink2,3, Katja
Doerschner1; 1Justus Liebig University of Giessen (JLU Giessen), Department of General Psychology, Giessen, Germany, 2University of Leuven
(KU Leuven), Laboratory of Experimental Psychology, Leuven, Belgium,
3Utrecht University, Experimental Psychology, Utrecht, The Netherlands
Humans are good at estimating the causal changes in visual information by perceptually dividing complex visual scenes into multiple layers;
this is also true when objects are viewed through a transparent layer. For
example, we can effectively drive through heavy fog or hard rain; or decide
whether an object in a river is animate while fishing. In such complex
scenes, changes in visual information might be due to observer motion,
object motion, deformations of the transparent medium, or a combination of these. Recent research has shown that image deformations can
provide information to attribute various properties to transparent layers,
such as their refractive index, thickness, or transparency. However, different transparent media can cause similar amounts of refraction, or they
can be rated as similarly translucent even though one appears foggier. Despite our
rich lexicon to describe the nature of a transparent layer, the optical and
geometrical properties that identify each transparent layer class remain
to be discovered. Here, we use eidolons to estimate equivalence classes
for perceptually similar transparent layers. Specifically, we ask whether
we could describe the specific image deformations that are interpreted
as transparency in terms of the parameters of the Eidolon Factory (reach,
grain, coherence; https://github.com/gestaltrevision/Eidolon). To create
a stimulus space for the eidolons of a fiducial image, while keeping the
coherence fixed at 1, we varied the reach and grain levels to systematically
increase the amount of local disarray in an image. We asked participants (n
= 11) to adjust the reach and grain values simultaneously so that the object
in the scene looked as if it were under water. Our results suggest that eidolons
with higher grain values (g > 8) are in a perceptually equivalent class and
that these eidolons give an underwater impression, probably due to the wavelike large local disarray.
Acknowledgement: KD is supported by a Sofja Kovalevskaja Award from the
Alexander von Humboldt Foundation, endowed by the German Ministry of
26.4043 The perception of transparency with motion parallax Athena Buckthought1([email protected]), Shuhang Wu1;
1Psychology Department, Roanoke College, Salem, VA U.S.A.
26.4045 Highlight disparities contribute to perceived depth of
shiny 3D surface Jeffrey Saunders1([email protected]); 1Department of
Psychology, University of Hong Kong
When a shiny surface is viewed binocularly, the specular highlights have
different disparities than points on the surface. This study tested whether
conflicting highlight disparities contribute to perception of surface shape
Acknowledgement: Supported by the Hong Kong Research Grants Council, GRF
26.4046 Non-veridical Depth Perception Causes Symmetric 3D
Objects to Appear Asymmetric, and Vice Versa Ying Yu1([email protected]
osu.edu), Alexander Petrov1, James Todd1; 1Department of Psychology,
The Ohio State University
Prior research has indicated that perceived depth from binocular disparity becomes increasingly compressed as viewing distance increases. One
geometric consequence of this is that a symmetric 3D object should be perceived as asymmetric whenever the axis of compression is at an oblique
angle to the plane of 3D symmetry. Method: To test this hypothesis, we
presented binocular images of 3D polyhedra with one plane of mirror
symmetry, similar to the stimuli of Li et al. (2011, doi:10.1167/11.4.11). On
each trial, one of fifteen objects was rendered against a gray background
on an LCD monitor. The 3D orientations of these objects were constrained
so that the viewing direction (i.e., the Z-axis) formed an oblique angle with
the object’s symmetry plane, and at least five pairs of corresponding vertices were visible. Visible edges were all rendered in black, all occluded
regions were removed, and polka-dot textures were mapped onto each visible face. Six observers viewed each polyhedron binocularly through LCD shutter
glasses from a chin rest and pressed keys to stretch or compress the object along the Z dimension so as to make it appear as symmetrical as possible. Each observer performed 90 trials at each of two viewing
distances: “near” (100cm) and “far” (200cm). The dependent variable was
the Z-scaling (S) required to make the object appear symmetrical. S=1 produced an object with perfect 3D symmetry, whereas deviations up or down
from 1 produced increasing asymmetries. Results: For most observers, the
adjustments were significantly larger than 1 and increased systematically
with viewing distance. The group-averaged mean adjustment was S=1.24
(SE=0.07) and 1.61 (SE=0.18) at the near and far distances, respectively. This
suggests that observers’ inability to accurately scale binocular disparities
can cause physically symmetric objects to appear asymmetric, and some
asymmetric objects to appear symmetric.
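The geometric argument above can be illustrated with a minimal sketch. It assumes that perceived depth is modeled as a pure scaling of the Z-axis; the function name `skew_angle_deg`, the asymmetry index it computes, and the example plane normal are illustrative assumptions, not the measure used in the study.

```python
import numpy as np

def skew_angle_deg(plane_normal, z_scale):
    """Departure from mirror symmetry (degrees) after scaling depth by z_scale.

    For a mirror-symmetric object, segments joining corresponding vertices are
    parallel to the symmetry-plane normal n, and their midpoints lie in that
    plane. Scaling Z by A = diag(1, 1, z_scale) maps the segment direction to
    A @ n and the midpoint-plane normal to inv(A).T @ n. The scaled object is
    still mirror symmetric only if these two directions remain parallel.
    """
    n = np.asarray(plane_normal, dtype=float)
    n = n / np.linalg.norm(n)
    A = np.diag([1.0, 1.0, float(z_scale)])
    segment_dir = A @ n
    midplane_normal = np.linalg.inv(A).T @ n
    cos_angle = abs(segment_dir @ midplane_normal) / (
        np.linalg.norm(segment_dir) * np.linalg.norm(midplane_normal))
    return float(np.degrees(np.arccos(np.clip(cos_angle, 0.0, 1.0))))

# Hypothetical symmetry-plane normal, 45 deg away from the viewing (Z) axis.
oblique = [1.0, 0.0, 1.0]
for s in (0.5, 1.0, 1.24, 1.61):
    print(f"S = {s:.2f}: skew = {skew_angle_deg(oblique, s):.2f} deg")
```

In this sketch the skew is zero when S=1, and also when the plane normal is parallel or perpendicular to the Z-axis, which is consistent with the requirement that the axis of compression be oblique to the symmetry plane.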
26.4047 Distortions of apparent 3D shape from shading caused
by changes in the direction of illumination Makaela Nartker1(nar-
[email protected]), James Todd1, Alexander Petrov1; 1The Ohio State University
A fundamental problem for the perception of 3D shape from shading is to
achieve some level of constancy over variations in the pattern of illumination. The present experiment was designed to investigate how changes
in the direction of illumination influence the apparent shapes of surfaces.
The stimuli included 3D objects with Lambertian reflectance functions that
were illuminated by rectangular area lights. The radial positions of these
lights were systematically manipulated to allow five different directions of
illumination. All stimuli had exactly the same bounding contours so that
those contours provided no information for distinguishing the different
possible surfaces. Observers judged the 3D shapes of these objects in two
phases: First, they marked critical points (e.g. local depth minima, maxima,
and inflection points) along a designated scan line in an image. These were
then used to position control points on a spline curve located adjacent to the
image, and observers adjusted the shape of that curve to match the apparent profile in depth along the designated scan line. The results revealed
that parts of the surface appeared to shift slightly toward the direction of
illumination, but these changes were much smaller than what would be
expected based on differences in the pattern of luminance among the stimulus images. Regions of high curvature where the surface abruptly changed
from flat to curved remained much more stable over changes in illumination than regions with more gradual curvature. These findings demonstrate
that there is a substantial amount of illumination constancy in the perception of 3D shape from shading, but that it is not perfect. Several hypotheses
are considered about how this constancy could potentially be achieved.
26.4048 Effect of head translation and manual control on depth
sign perception from motion parallax Masahiro Ishii1([email protected]
ac.jp); 1School of Design, Sapporo City University
Motion parallax produced during observer translation acts as a cue for perceiving relative depth. However, when information about observer translation is unavailable, the perceived sign of depth is ambiguous. This ambiguity has been established both theoretically and empirically. This study focuses on manual control,
in comparison with head translation, as a cue for perceiving depth sign,
since action can affect vision. Harman et al. (1999), for instance, reported
that human object recognition was better when the observer could rotate
the object images using a trackball than under passive observation. In this
study, an experiment was conducted to investigate the effect of manual
control on reducing ambiguity of depth sign perception. For comparison,
the effect of observer translation was also investigated. Stimuli were generated by a computer and presented on a CRT monitor. Four participants
took part in the experiments. The displays simulated a corrugated surface
in the frontoparallel plane, and it could be rotated to-and-fro around a
vertical axis. In the experiment, each corrugated surface had one of two
possible spatial phases (center-far/center-near). The surface structure was
depicted with random dots on a black background. The stimulus change
was associated with the rotation of a knob manipulated by the participant,
or the lateral translation of a chin rest yoked to head translation. The axis
of the knob was aligned with the axis of the stimulus rotation. The display
was presented during participant manipulation or translation. Participants
were required to discriminate between center-far and center-near. With head translation, participants perceived the depth sign with almost perfect
accuracy. With manual control, by contrast, they perceived the
depth sign with an accuracy of around 0.75 (chance level 0.5). This suggests that
the visual system treats visual change produced by head translation as more
reliable than visual change produced by manual control.
Acknowledgement: JSPS KAKENHI 26330310
26.4049 Minimal Deformation Constrains the Perceived Height of
the Stereokinetic Cone Yang Xing1([email protected]), Zili Liu1;
1Department of Psychology, University of California, Los Angeles, USA
The current study was conducted to examine whether the minimal deformation hypothesis can explain a stereokinetic percept. Stereokinetic stimuli
are 2D configurations that lead to 3D percepts when rotated in the image
plane. A rotating ellipse with an eccentric dot gives rise to the percept of
a cone with defined height. The dot is perceived as the apex of the cone,
which is constantly deforming except when the dot is on the minor axis
of the ellipse. In the current study, the spatial relationship between the
ellipse and dot varied across trials in terms of the dot’s location (0º [minor
axis], 30º, 60º, 90º [major axis]), the aspect ratio of the ellipse (0.6 or 0.8),
and rotation speed (60º/sec or 90º/sec). During each trial, participants (n
= 8) adjusted the length of a 2D bar to indicate their perceived height of
the cone. This 2D bar was oriented along the ellipse’s minor axis and was
perceived to be perpendicular to the circular base of the cone. Our results
were quantitatively consistent with the traditional hypothesis of minimal
deformation, which is similar to the maximal rigidity assumption (Ullman, 1979). As the dot shifted position from the minor axis towards the
major axis, observers consistently reported an increasingly shorter cone.
The results illustrate the tendency of observers to perceive the apex of the
cone at a height that minimized its distance to the axis of rotation in order
to reduce the relative motion between the dot and circular base of the cone.
Therefore, the hypothesis can also be considered as a 3D extension of the
more recent “slow and smooth” hypothesis (Yuille & Grzywacz, 1988;
Weiss, Simoncelli, & Adelson, 2002).
when other shape cues are available. A recent study by Muryy et al (2013)
found that perceived shape followed highlight disparities for mirrored surfaces, but not for surfaces with texture or shading. Is stereo information
from highlights overridden by additional surface information, or is there
still an influence on quantitative perceived shape? To test this, I varied the
depth of specular highlights relative to a surface and measured the effect
on perceived extent in depth. Subjects viewed stereo images of elliptical
bumps with a flat frontal rim, with varied height and curvature, and estimated height of the bump. Surfaces had smooth shading with and without
highlights, and highlights were either accurate or shifted in depth. Simulated illumination was a grid of light fixtures, which was translated and
scaled to control the image position of highlights. Surfaces had either high
contrast texture or very low contrast texture to vary the quality of surface
disparity information. With high contrast texture, highlight disparities did
not influence depth estimates. The presence of highlights produced an
overall increase in perceived depth, but varying the depth of highlights had
no effect. With low contrast texture, depth estimates were strongly influenced by highlight disparities. The presence of highlights improved accuracy, and varying the depth of highlights produced corresponding changes
in depth estimates. The effect was the same whether the highlights were
consistent or inconsistent with smooth shading. The results demonstrate
that binocular highlights can influence perceived shape when other surface
information is available but weak. This suggests that highlight disparities
are perceptually integrated despite providing conflicting information
26.4050 Mapping the Hierarchical Neural Network of 3D Vision
using Diffusion Tensor Imaging Ting-Yu Chang1([email protected]
edu), Niranjan Kambi2, Erin Kastar2, Jessica Phillips2, Yuri Saalmann2, Ari
Rosenberg1; 1Department of Neuroscience, School of Medicine and Public
Health, University of Wisconsin-Madison, Madison, WI, USA, 2Department of Psychology, University of Wisconsin-Madison, Madison, WI, USA
The transformation of egocentrically encoded two-dimensional (2D) retinal
images into allocentric three-dimensional (3D) visual perception is essential
for successful interactions with the environment. However, the hierarchical
neural network underlying this transformation remains largely unknown.
Here we use diffusion magnetic resonance imaging (GE MR750 3T scanner,
16-channel receive-only head coil; 60 diffusion directions, b=1000 s/mm²,
NEX=10) to map the neural network of 3D vision in rhesus macaques
(N=7). Focus is given to the caudal intraparietal area (CIP), an important
site of 3D visual processing, as well as the visual posterior sylvian area
(VPS) which is implicated in allocentric vision. T1-weighted scans are first
used to define cortical areas according to the F99 atlas using CARET software. High-resolution diffusion-weighted scans (1mm isotropic) are then
used to perform probabilistic tractography using FSL. Our results reveal
a network within the dorsal visual pathway that putatively underlies 3D
vision. Consistent with previous anatomical data, we find that V3A is
strongly connected with CIP. We further find that the posterior intraparietal area (PIP) likely contributes to the 2D to 3D visual transformation
as an intermediate stage between V3A and CIP. Additionally, we provide
the first evidence that CIP is connected with the retroinsular cortex (Ri), a
subdivision of VPS where visual responses are observed. By combining our
probabilistic tractography results with previous electrophysiological and
anatomical data, we propose that the following circuit underlies the 2D to
3D visual transformation and creation of allocentric visual representations:
V1 → V2d → V3A → PIP → CIP → Ri. To elucidate a broader neural network underlying 3D visual perception and action, future work will extend
this analysis to include the ventral visual pathway, as well as decision and
motor circuits. We are additionally using these results to guide electrophysiological studies investigating the neural basis of 3D visual perception.
Acknowledgement: This work was supported by National Institutes of Health
Grants DC014305 (A.R.) and 1R01MH110311 (Y.S.), the Alfred P. Sloan Foundation (A.R.), and the Whitehall Foundation (A.R.).
26.4051 Overrepresentation of vertical limbs in primate inferotemporal
cortex Cynthia Steinhardt1([email protected]), Chia-Chun
Hung1, Charles Connor1; 1Department of Neuroscience Krieger Mind/
Brain Institute Johns Hopkins University
We reported previously that responses of individual neurons in macaque
monkey inferotemporal cortex (IT) convey information about 3D medial
axis shape (Hung et al., Neuron, 2012). Specifically, many IT neurons signal configurations of medial axis elements (connected torsos and limbs) in
terms of 3D position, orientation, curvature, and connectivity. We hypothesized that these neurons provide an explicit, efficient shape code for elongated, branching objects such as vertebrate animals. Here, we analyzed
the strength of IT neural population responses to projecting limbs (medial
axis elements that have a termination on one end). Our dataset comprised
spiking responses of 111 IT neurons, each tested with 400–600 3D medial
axis shapes presented on a computer monitor using shading and binocular
disparity as cues for shape-in-depth. These shapes were initially random
but evolved through multiple generations based on a genetic algorithm
driven by the neuron’s responses. Thus, our sampling strategy converged
toward high response shapes in later generations. We characterized projecting limbs in terms of their object-centered 3D position, 3D orientation,
curvature, and surface shape. We binned the position/orientation/curvature/surface space into a multi-dimensional matrix. For each stimulus,
we summed the neural response into the bins occupied by that stimulus.
We used plots and statistical tests to analyze anisotropies in the resulting
matrix. One major trend was over-representation of vertical limbs, that
is, limbs projecting downwards or upwards. We hypothesize that this
over-representation of vertical limbs reflects the prevalence and/or ecological significance of vertical projections in the natural world.
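The binning step described above (summing each stimulus's response into the bins it occupies) can be sketched as follows. This is a minimal illustration under stated assumptions: the two-dimensional bin grid, the bin indices, and the response values are hypothetical, and the function name `accumulate_responses` is not from the study.

```python
import numpy as np

def accumulate_responses(occupied_bins, responses, shape):
    """Sum each stimulus's response into every bin that stimulus occupies.

    occupied_bins: one list of index tuples per stimulus (hypothetical layout).
    responses:     one response value per stimulus.
    shape:         shape of the multi-dimensional bin matrix.
    """
    total = np.zeros(shape)
    for bins, response in zip(occupied_bins, responses):
        # Count each bin at most once per stimulus.
        for idx in set(bins):
            total[idx] += response
    return total

# Toy example: a 3 x 2 grid (e.g. limb orientation x curvature, both assumed).
occupied = [[(0, 0), (2, 1)],   # stimulus A has limbs falling in two bins
            [(0, 0)]]           # stimulus B has one limb in the first bin
rates = [10.0, 5.0]             # made-up responses, spikes/s
matrix = accumulate_responses(occupied, rates, shape=(3, 2))
print(matrix)
```

Anisotropies would then show up as bins whose summed response is large relative to the others, although a real analysis would also need to account for how often each bin was sampled.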
26.4052 Perception of Depth in Natural Scenes Yiran Duan1([email protected]
gmail.com), Alexandra Yakovleva1, Anthony Norcia1; 1Department of
Psychology, Stanford University
The role of disparity in depth perception has typically been studied using
random-dot stereograms because they contain no monocular cues, thus
isolating the responses to “pure disparity”. Pure disparity rarely exists in
the real world. Rather, disparity occurs in the context of several monocular depth cues. To study disparity processing in natural scenes, we compared visually evoked responses to 2D and 3D versions of the same scene
that differed only in their disparity structure. Thirty natural scenes were
drawn from a large set of natural image stereo-pairs (Burge et al., 2016 J.
Vis) to ensure diversity of scene types and depth maps. The scenes mainly
included outdoor settings with trees and buildings. Nineteen subjects
(Mean age = 22.9, 8 males) viewed a sequential two-alternative temporal
forced choice presentation of two different versions of the same scene (2D
vs. 3D) interleaved by a scrambled image with the same power spectrum
(4 images per trial, 750 ms each). Scenes were viewed orthostereoscopically
at 3 meters through a pair of shutter glasses. After each trial, participants
indicated with a key press which version of the scene was 3D. Performance
on the discrimination was >90%. We compared 128-channel Visual Evoked
Potentials elicited by 2D and 3D scenes using Reliable Component Analysis (Dmochowski, Greaves, and Norcia, 2015). Both scene types elicited
responses with onset times of ~180 msec. The differential response between
2D and 3D scenes was maximal on mid-line electrodes over the occipital
pole. Significant differences between responses to 2D and 3D scenes first
emerged at around 200 msec. This suggests that approximately 20 additional msec are needed for the brain to begin the extraction of 3D structure
from the disparity cues in natural scenes.
Acknowledgement: EY018875-04 from the National Institutes of Health
26.4053 Learning to identify depth edges in real-world images with
3D ground truth Krista Ehinger1([email protected]), Kevin Joseph1,
Wendy Adams2, Erich Graf2, James Elder1,3; 1Centre for Vision Research,
York University, 2Department of Psychology, University of Southampton,
3Vision: Science to Applications (VISTA) Program, York University
Luminance edges in an image are produced by diverse causes: change
in depth, surface orientation, reflectance or illumination. Discriminating
between these causes is a key step in visual processing, supporting segmentation and object recognition. Previous work has shown that humans
can discriminate depth from non-depth edges based on a relatively small
visual region around the edge (Vilankar et al. 2014); however, little is
known about the visual cues involved in this discrimination. Attempts to
address this question have been based upon small and potentially biased
hand-labelled datasets. Here we employ the new Southampton-York Natural Scenes (SYNS) 3D dataset (Adams et al. 2016) to construct a larger and
more objective ground truth dataset for edge classification and train a deep
network to discriminate depth edges from other kinds of edges. We used a
standard computer vision edge detector to identify visible luminance edges
in the HDR images and fit planar surfaces to the 3D points on either side
of the edge. Based on these planar fits we classified each edge as depth
or non-depth. We trained convolutional neural networks on a subset of
SYNS scenes to discriminate between these two classes based only on the
information in contrast-normalized image patches centred on each edge.
We found that the networks were able to discriminate depth edges in a
reserved test subset of the SYNS scenes with 81% accuracy. Interestingly,
this performance was relatively invariant with patch size. Although performance decreased when information about edge orientation or color was
removed, it remained in the 72-75% range, suggesting a larger role for spatial (e.g., blur, textural, configural) cues. These results demonstrate that 1)
the SYNS dataset can be used to provide 3D ground truth for visual tasks,
and 2) color, orientation and spatial cues are all important for the local
discrimination of depth edges.
Acknowledgement: NSERC CREATE Training Program in Vision Science &
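The ground-truth labelling step in the preceding abstract (fitting planar surfaces to the 3D points on either side of an edge) can be sketched as below. The abstract does not state the classification rule, so the depth-gap criterion, the `threshold` value, and all function names are assumptions made for illustration only.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3D points: returns (centroid, unit normal)."""
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[-1]

def depth_at(points, x, y):
    """Depth (Z) predicted at image-plane location (x, y) by the fitted plane.

    Assumes the plane is not seen edge-on (normal z-component is nonzero).
    """
    c, n = fit_plane(points)
    return c[2] - (n[0] * (x - c[0]) + n[1] * (y - c[1])) / n[2]

def is_depth_edge(left_pts, right_pts, edge_xy, threshold=0.1):
    """Assumed rule: a depth edge has a depth gap between the two planar fits."""
    x, y = edge_xy
    gap = abs(depth_at(left_pts, x, y) - depth_at(right_pts, x, y))
    return bool(gap > threshold)

def grid(xs, depth_fn):
    """Toy 3D samples on one side of a vertical edge at x = 0."""
    return np.array([[x, y, depth_fn(x, y)] for x in xs for y in (-1.0, 0.0, 1.0)])

left_xs, right_xs = (-1.0, -0.5), (0.5, 1.0)
# Occlusion: two frontoparallel surfaces at different depths.
occlusion = is_depth_edge(grid(left_xs, lambda x, y: 1.0),
                          grid(right_xs, lambda x, y: 3.0), (0.0, 0.0))
# Crease: surfaces meet at the edge, so orientation changes but depth does not.
crease = is_depth_edge(grid(left_xs, lambda x, y: 1.0 + x),
                       grid(right_xs, lambda x, y: 1.0 - x), (0.0, 0.0))
print(occlusion, crease)
```

A single depth-gap threshold is only one possible design; a rule based on the gap relative to viewing distance, or on the residuals of the fits, would be equally consistent with the description.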
26.4054 Mitigating Perceptual Error in Synthetic Animatronics
using Visual Feature Flow Ryan Schubert1,2([email protected]), Gerd
Bruder1, Greg Welch1; 1Institute for Simulation and Training, University
of Central Florida, 2Computer Science Department, University of North
Carolina at Chapel Hill
Acknowledgement: The Office of Naval Research (ONR) Code 30 under Dr. Peter
Squire, Program Officer (ONR awards N00014-14-1-0248 and N00014-12-11003)
Visual Memory: Neural mechanisms
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4055 Spatial selectivity of alpha band activity declines with
increasing visual working memory load David Sutterer1([email protected]
uchicago.edu), Joshua Foster1, Kirsten Adam1, Edward Vogel1, Edward
Awh1; 1University of Chicago
Recent work has demonstrated that it is possible to reconstruct spatially-specific channel tuning functions (CTFs) during the encoding and delay
period of a working memory (WM) task, using an inverted encoding model
(IEM) and electroencephalography (Foster et al., 2016). These CTFs can be
derived from the distribution of alpha-band (8-12 Hz) activity across the
scalp, providing a temporally resolved measure of the location of a single
position stored in WM. Recent functional magnetic resonance imaging (fMRI) work
has demonstrated that feature-specific patterns of BOLD activity degrade
as memory load increases (e.g. Sprague, Ester, and Serences, 2014). Here,
we show that the loss of feature specificity with increasing memory load
extends to spatially specific patterns of alpha band activity. On each trial,
participants encoded and maintained the location of either one or two colored dots while EEG was recorded. After a 1s delay period, participants
were cued to report the location of one of the dots. We trained an IEM
to assess CTF selectivity for each set-size and found that CTF selectivity
decreased when participants maintained two items relative to a single
item, consistent with behavioral performance decrements observed with
memory load increases. A key debate is whether items are maintained
simultaneously (each with lower precision), or if instead only a single item
is actively represented at any given time-point (i.e. shifting 1-item focus
of attention), but previous fMRI work lacked the temporal resolution to
address this question. Simulations revealed that CTF selectivity was significantly higher for the observed two-item activity than would be expected
if subjects only actively maintained one of the two items, suggesting that
participants simultaneously represented two positions. Together this pattern of results supports the idea that oscillatory activity in the alpha band
is integral to online spatial representations during memory maintenance.
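For readers unfamiliar with inverted encoding models, the general idea behind the analysis above can be sketched on simulated data. This is a generic illustration, not the authors' pipeline: the basis function (a raised half-cosine), the number of channels and electrodes, the noise level, and all names are assumptions.

```python
import numpy as np

def basis_set(n_chan=8, power=7):
    """Predicted response of each spatial channel (rows) to each position (cols)."""
    centers = np.arange(n_chan) * 2.0 * np.pi / n_chan
    offset = centers[:, None] - centers[None, :]
    return np.abs(np.cos(offset / 2.0)) ** power

def train_iem(B_train, C_train):
    """Estimate weights W (electrodes x channels) such that B ~= W @ C."""
    W_t, *_ = np.linalg.lstsq(C_train.T, B_train.T, rcond=None)
    return W_t.T

def invert_iem(W, B_test):
    """Estimate channel responses (channels x trials) from held-out data."""
    C_hat, *_ = np.linalg.lstsq(W, B_test, rcond=None)
    return C_hat

# Simulated alpha power: 20 electrodes, 8 positions, 80 trials (all assumed).
rng = np.random.default_rng(0)
n_elec, n_chan = 20, 8
positions = np.tile(np.arange(n_chan), 10)
C = basis_set(n_chan)[:, positions]            # channels x trials
W_true = rng.normal(size=(n_elec, n_chan))     # hypothetical scalp weights
B = W_true @ C + 0.01 * rng.normal(size=(n_elec, positions.size))

train, test = slice(0, 40), slice(40, 80)
W = train_iem(B[:, train], C[:, train])
C_hat = invert_iem(W, B[:, test])
decoded = C_hat.argmax(axis=0)
accuracy = float(np.mean(decoded == positions[test]))
print(f"decoded position matches remembered position on {accuracy:.0%} of trials")
```

In practice the estimated channel responses are re-centered on the remembered position and averaged across trials to give a CTF, whose selectivity can then be compared across set sizes.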
26.4056 Topography of alpha-band power tracks improvement
in working memory precision with repeated encoding Kirsten
Adam1([email protected]), Joshua Foster1, David Sutterer1, Edward
Vogel1, Edward Awh1; 1Department of Psychology, University of Chicago
The topography of EEG alpha-band power (8 – 12 Hz) tracks locations held
in visual working memory in a time-resolved fashion (Foster et al. 2016).
However, there has been little work linking changes in the quality of alpha-band representations with changes in behavioral precision. In two experiments, we tracked changes in behavioral precision and EEG alpha-band
representations as memoranda were repeated across trials. In Experiment
1 (n = 16), participants performed a 1-item spatial working memory task.
During each trial, participants remembered a briefly presented spatial location (100 ms) over a blank delay (1,000 ms) and reported the remembered
location using a mouse click. The same memory display was repeated six
trials in a row. In Experiment 2 (n = 23), participants performed a 1- or
2-item spatial working memory task, and each display repeated three times.
In both experiments, behavioral precision improved for repeated displays
relative to novel displays, and the quality of alpha-band representations
likewise improved. Previously, it has been demonstrated that alpha-band
representations track memory content independent from response preparation; thus, the observed changes in the quality of alpha-band representations rule out simple motor priming accounts of improvement across
repetitions. These data also demonstrate that decoded alpha-band representations are sensitive to subtle improvements in the quality of working
memory representations (here, an average response error improvement of
only around 1 degree). Finally, we observed that participants continued to
attend to the previously remembered location (as measured by the presence
of alpha-band representations) during the inter-trial interval, suggesting
that the behavioral boost for repeated stimuli came from covertly attending
relevant locations at the time of encoding.
Acknowledgement: NIH 2R01 MH087214-06A1
26.4057 Working memory reconstructions using alpha-band activity
are disrupted by sensory input. Tom Bullock1,2([email protected]
ucsb.edu), Mary MacLean1,2, Barry Giesbrecht1,2; 1Dept. of Psychological
and Brain Sciences, University of California, Santa Barbara, CA 93106,
2Institute for Collaborative Biotechnologies, University of California, Santa
Barbara, CA 93106
Recent work suggests that the spatial distribution of alpha-band activity
across the scalp measured by electroencephalography (EEG) can be used
to track specific spatial representations of stimuli held in working memory
(WM; Foster et al. 2016). Here, we tested the extent to which these representations can be disrupted by sensory input. Participants (n=18) performed a
simple recall task involving the presentation of a circular stimulus (250ms)
at one of eight equally spaced locations surrounding fixation and the subsequent recall of the stimulus location following a brief retention period
(1750ms). Critically, we manipulated the representation of the stimulus
during the retention period by 1) requiring participants to close their eyes
immediately after stimulus offset and 2) presenting a mask immediately
after stimulus offset. Requiring participants to close their eyes eliminated
the potential for continued spatial selection during the retention period,
and masking reduced possible after-image effects. Participants engaged in
four conditions while we recorded EEG at the scalp: eyes-open, eyes-closed,
eyes-open/masked, eyes-closed/masked. We used an inverted encoding
modeling technique to estimate location-selective tuning functions (TFs)
from spatially distributed alpha activity measured across the scalp during
the target and retention period (Foster et al. 2016). We then folded these
TFs at center and calculated slope at each time-point. We observed a robust
stimulus representation (greater positive slope) during the stimulus presentation, followed by a decline in the quality of the representation during
the 500ms post-stimulus offset. Between 200-500ms post-stimulus the mask
caused significant disruption to the spatial representation of the stimulus, relative to the unmasked conditions (p< .05). Furthermore, the stimulus representation was not reliable in the eyes-closed/masked condition
during the final 1000 ms of the retention period (p< .05), and WM precision
was reduced (p< .05). Together, these effects suggest alpha-band WM representations are not immune to disruption by sensory input.
Acknowledgement: This work was supported by the Institute for Collaborative Biotechnologies through contract W911NF-09-0001 from the U.S. Army
Research Office.
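The fold-and-slope step described in abstract 26.4057 above can be illustrated with a minimal sketch. It assumes a tuning function sampled at eight channel offsets ordered from -180 to 135 degrees relative to the remembered location; the function names, the channel ordering, and the example values are illustrative assumptions, not the authors' code.

```python
def fold_tuning_function(tf):
    """Fold a tuning function around its center channel.

    `tf` holds one response per channel, ordered by offset from the
    remembered location, e.g. [-180, -135, -90, -45, 0, 45, 90, 135] degrees
    (an assumed ordering). Responses at equal absolute offsets are averaged.
    The result runs from the farthest offset to the center (offset 0).
    """
    n = len(tf)
    if n % 2 != 0:
        raise ValueError("expected an even number of channels")
    center = n // 2
    folded = [tf[0]]                      # farthest channel, no mirror partner
    for k in range(1, center):
        folded.append((tf[k] + tf[n - k]) / 2.0)
    folded.append(tf[center])             # center channel, no mirror partner
    return folded


def linear_slope(values):
    """Least-squares slope of `values` against their index 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2.0
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def tf_slope(tf):
    """Slope of the folded tuning function, in response units per channel step."""
    return linear_slope(fold_tuning_function(tf))


# Hypothetical tuning functions: one peaked at the center, one flat.
peaked = [0.0, 0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.1]
flat = [0.2] * 8
print(tf_slope(peaked), tf_slope(flat))
```

Under this convention a larger positive slope indicates a tuning function that peaks more sharply at the remembered location, and a slope near zero indicates no location-selective response.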
26.4058 Alpha-band activity reveals robust representations of spatial
position during the storage of non-spatial features in working
memory Joshua Foster1([email protected]), Emma Bsales1,
Edward Awh1; 1Department of Psychology, The University of Chicago
The perceived visual flow of features on a 3D object provides cues about
the underlying shape and motion. Likewise, imagery of a dynamic virtual
object projected onto a static featureless physical object can be used to simulate shape or motion. If the shape of the virtual object is geometrically
different from the physical object, the visual flow of features moving across
the surface will be distorted, conveying incorrect shape cues. Mitigating
these incorrect shape cues supports synthetic animatronics—simulating
physical motion or deformation on geometrically static display surfaces.
To achieve this, we define two sets of feature flow curves that represent the
visual flow of a set of features over the course of an animation for a specific viewpoint: one set of features for the flow corresponding to the correct
perception of the virtual object, and a second for the flow of features otherwise distorted by the display surface. These feature flow curves provide a
basis for a perceptual error measure at single time steps (e.g., visual angular
error) and for identifying temporal flow patterns that might give perceptual shape cues for the underlying display surface (e.g., a sharp trajectory
change indicative of a fold or edge). We then dynamically alter the virtual
imagery on the physical surface to reduce perceptual error by diminishing
the visibility of specific features (and thus the resulting visual flow). This
is achieved by contrast reduction or low-pass filtering proportional to the
aggregate error across a set of viewpoints. We have observed that by doing
this dynamic filtering of the virtual imagery we can reduce the unwanted
perception of the underlying surface while maintaining feature salience
in areas of geometric similarity, upholding the overall perception of the
desired virtual shape and motion.
Visual working memory (WM) enables active maintenance of visual information via sustained patterns of stimulus-specific activity (Harrison &
Tong, 2009; Serences et al., 2009). Past work has shown that observers can
control which features of an object are maintained in WM (Serences et al.,
2009; Woodman & Vogel, 2008). However, behavioral studies suggest that
stimulus position enjoys a privileged status in WM (e.g., Rajsic & Wilson,
2014), raising the possibility that unlike non-spatial features, stimulus position may be necessarily maintained alongside to-be-remembered features.
To test whether stimulus position is maintained during non-spatial WM
tasks, we examined spatially selective alpha-band (8-12 Hz) activity using
an encoding model of spatial selectivity. Using this approach, past work
has shown that the scalp distribution of alpha-band activity tracks locations
stored in WM (Foster et al., 2016). In Experiment 1, observers remembered
the color of a sample stimulus. While the position of the sample stimulus
varied trial-to-trial, stimulus position was irrelevant to the task and unpredictive of probe position. Nevertheless, alpha activity tracked the original
location of the stimulus throughout the delay period, demonstrating that
stimulus position was represented in the pattern of alpha-band activity.
Experiment 2 established that these spatial representations are under
volitional control rather than being an automatic consequence of sensory
activity – when observers were asked to store one of two simultaneously
presented sample stimuli, spatially selective alpha activity was amplified
for the target item compared to the non-target item. In Experiment 3, we
observed spatially selective activity throughout the delay period of an orientation WM task, suggesting that spatial representations are not specific to
the storage of colors in WM but are seen during the storage of non-spatial
features in WM more generally. Our findings show that active representations of stimulus position are retained during the maintenance of non-spatial features in WM.
26.4059 Parieto-occipital alpha power dynamics selectively code
for the storage of spatial locations in visual working memory Keisuke Fukuda1([email protected]), Christopher Sundby2,3,
Geoffrey Woodman2; 1Department of Psychology, University of Toronto
Mississauga, 2Department of Psychology, Vanderbilt Vision Research
Center, Vanderbilt University, 3Department of Law, Vanderbilt University
Visual working memory (VWM) allows us to actively represent a limited
amount of visual information in mind at a given moment. Recent electrophysiological studies have consistently shown that modulations of the
power of parieto-occipital alpha activity (8-13Hz) are directly involved in
active maintenance of VWM representations. For example, the reduction
of the parieto-occipital alpha power observed during the VWM retention
interval shows the capacity-limited set size effect predicted by the behavioral measures of VWM capacity (Erikson, et al., 2016; Fukuda, Mance, &
Vogel, 2015; Fukuda, Kang, & Woodman, 2016). Furthermore, the topographical distribution of this alpha power modulation during VWM delay
can be used to decode the content of VWM (Foster, et al., 2016; Fukuda,
Kang, & Woodman, 2016; Samaha, Sprague, & Postle, 2016). In this study,
we sought to extend this finding by specifying the nature of the representations in VWM that are reflected in these parieto-occipital alpha power
dynamics. More specifically, we had participants maintain location, color,
or their conjunction in a short-term memory task while we recorded their
electroencephalograms (EEGs). Pattern classification results revealed that
location information, but not color information, can be reliably decoded
from the topographical distribution of the parieto-occipital alpha power
during the retention interval of the memory task. This finding clearly
demonstrates the selective sensitivity of the parieto-occipital alpha activity
to the storage of spatial locations in VWM.
Acknowledgement: This research was supported by the National Institutes of
Health (R01-EY019882, R01-EY025275, R01-MH110378, P30-EY08126, and
26.4060 Alpha-Band Activity Tracks Updates to the Content of Spatial Working Memory Eren Gunseli1([email protected]), Joshua
Foster1, David Sutterer1, Edward Vogel1, Edward Awh1; 1Department of
Psychology, Institute for Mind and Biology, University of Chicago
Prior work has shown that topography of alpha-band activity tracks
locations maintained in spatial working memory (WM). Here, we tested
whether dynamic changes in alpha activity track the updating of information in spatial WM. Subjects were shown a memory location followed by
an auditory cue which instructed subjects to update the location held in
memory. Subjects used the mouse to click on the updated location. We used
an inverted spatial encoding model to reconstruct the spatially-selective
response profiles from the topographic distribution of alpha power. This
time-resolved analysis showed that spatially-specific alpha activity tracked
the initial location held in working memory, and revealed the transition
to the newly updated location. Furthermore, the location specificity of the
estimated response profiles, or Channel Tuning Functions (CTFs), showed
that subjects with a stronger focus on the updated location were faster to
report that location at the end of the trial. These findings highlight a new
approach for observing active updating of the contents of spatial WM.
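The inverted spatial encoding model mentioned in abstract 26.4060 above can be sketched as two linear steps: estimate channel-to-electrode weights from training data, then invert those weights to recover channel responses from held-out data. The sketch below is a generic illustration on simulated data, not the authors' pipeline. The basis shape (a rectified cosine raised to the 7th power), the 20 simulated electrodes, and all function names are assumptions.

```python
import numpy as np


def make_basis(n_channels=8, n_locations=8, power=7):
    """Idealized channel responses (channels x locations); assumed basis shape."""
    centers = np.arange(n_channels) * 2 * np.pi / n_channels
    locations = np.arange(n_locations) * 2 * np.pi / n_locations
    offset = locations[None, :] - centers[:, None]
    return np.abs(np.cos(offset / 2.0)) ** power


def train_iem(train_data, train_labels, basis):
    """Least-squares weights (electrodes x channels) from training trials."""
    predicted = basis[:, train_labels]              # channels x trials
    return train_data @ np.linalg.pinv(predicted)


def invert_iem(weights, test_data):
    """Estimated channel responses (channels x trials) for held-out trials."""
    return np.linalg.pinv(weights) @ test_data


# Simulated alpha power: 20 hypothetical electrodes, 8 locations, 80 training trials.
rng = np.random.default_rng(0)
basis = make_basis()
true_weights = rng.normal(size=(20, 8))
train_labels = np.tile(np.arange(8), 10)
test_labels = np.arange(8)
train_data = true_weights @ basis[:, train_labels] + 0.01 * rng.normal(size=(20, 80))
test_data = true_weights @ basis[:, test_labels]

weights = train_iem(train_data, train_labels, basis)
estimated = invert_iem(weights, test_data)
print(estimated.argmax(axis=0))  # index of the most responsive channel per test trial
```

Applying the inversion separately at each time point yields a time-resolved estimate of the channel response profile, which is the kind of output the abstract refers to as a channel tuning function.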
26.4061 Suppression of irrelevant information from working
memory is reflected in the PD and CDAp components of the
EEG Tobias Feldmann-Wüstefeld1([email protected]), Edward
Vogel1; 1University of Chicago
Visual Working Memory (WM) literature has traditionally focused on how
the visual system maintains relevant information. On the other hand, visual
attention studies demonstrated the crucial role of active suppression. Given
the close relationship between visual WM and visual attention, it stands to
reason that active suppression plays an important, and yet often ignored,
role in WM. To better characterize this, we adapted a classical change
detection task (Luck & Vogel, 1997) to include irrelevant information. In
this task, participants were simultaneously presented with items that were
to-be-memorized (memory targets) and to-be-ignored items (memory distractors). Critically, memory targets and distractors were systematically
lateralized, enabling us to use lateralized ERP components to isolate the
neural markers of suppression from WM. Specifically, we were interested in an N2pc subcomponent, the distractor positivity (PD). The PD is
typically observed in visual search tasks in which salient items need to be
actively suppressed (Hickey et al., 2009). We hypothesized that this ERP
component would also be implicated in this WM task given that active
suppression was required. We found that the PD component increased
with the number of distractors to be suppressed from WM, with the WM
capacity being identical. This suggests that in order to sufficiently maintain
relevant information in WM, more active suppression was required with an
increasing number of irrelevant items. Furthermore, individual differences
in WM capacity predicted the PD amplitude. This demonstrates that the
ability to suppress irrelevant information from WM contributes to better
WM performance. In addition we found contralateral delay activity of
positive polarity (CDAp) starting at around 450 ms, suggesting lingering
active suppression of irrelevant items from WM. In sum our results suggest
that active suppression of irrelevant information plays an important role in
visual WM and its neural markers are the ERP components PD and CDAp.
26.4062 What Information Can Actually Be Decoded from the EEG
in Visual Working Memory Tasks? GiYeul Bae1([email protected]),
Steven Luck1; 1Center for Mind and Brain, University of California, Davis
Previous research showed that a feature value stored in working memory (WM)
can be decoded from the spatial pattern of oscillatory activity in the EEG
alpha band (8-12 Hz). The present study sought to determine what information is actually being decoded. First, we asked whether orientation can
be decoded when it is decoupled from location. In Experiment 1, observers
performed an orientation delayed estimation task in which the orientation
and the angular location of a sample stimulus were independently manipulated. We separately decoded both the orientation and the angular location.
We found that pure orientation decoding was above chance, although it
was considerably weaker than pure location decoding. Second, we investigated how precisely angular location information can be decoded. Decoding precision was computed by estimating the dispersion of the decoded
response distributions, and we also compared decoding performance using
different numbers of underlying channel tuning functions (CTFs). We
found that the decoding precision reaches asymptote at 8 CTFs, implying
that the EEG can discriminate angular locations separated by as little as 45 degrees of
the angular space. Third, we found that the decoding was equally precise
across different feature values and that it reflects both metric and categorical information. Fourth, we found that decoding is above chance when
the decoding weights derived from one observer are applied to another
observer, indicating some consistency in scalp topography. Fifth, by temporally separating data for training and testing, we found that the EEG codes
both time-dependent and time-independent information. Lastly, using a
2AFC task, Experiment 2 showed that orientation decoding can be achieved
using a task other than delayed estimation. Together, these findings provide several insights about what visual information is actively maintained
in working memory and can be decoded from the EEG.
Acknowledgement: This work was supported by grant R01MH076226 to SJL.
26.4063 Decoding the Contents of Working Memory Using EEG
Provides Evidence For the Sensory Recruitment Hypothesis Allison
Bruning1([email protected]), Michael Pratte1; 1Department of Psychology, Mississippi State University
26.4064 Bridging Working Memory and Imagery: Encoding induced
alpha EEG activity reveals similar neural processes Joel Robitaille1([email protected]), Stephen Emrich1; 1Psychology Department,
Brock University
While both imagery and visual working memory address the mental representation of visual information, it remains unclear whether the representations of information during these processes are mediated by similar mechanisms. Albers et al. (2012) were able to demonstrate that working memory
representations can be identified and tracked down during a mental imagery rotation by decoding fMRI activity detected in the primary visual cortex.
A recent study by Foster et al. (2016) reported that it is possible to identify
the feature of an object held in working memory by applying an encoding
model on induced alpha activity (8-15Hz). In an attempt to determine the
similarities between imagery and working memory, we replicated Foster et
al. (2016) and extended their findings by investigating the behavioural and
neural properties of imagery. A forward encoding model was applied to EEG
activity recorded while participants were holding the orientation of a stimulus in working memory and then transformed through a mental rotation
of 60°. The reconstruction of orientation selectivity profiles revealed the
orientation of the working memory representation and reliable changes in
the mental representation during the imagery manipulation. Furthermore,
the behavioural results indicate that the level of precision in the report of
the transformed orientation feature is comparable with typical working
memory precision. These results suggest that visual working memory and
imagery share similar neural and behavioural mechanisms.
Acknowledgement: NSERC
26.4065 Time-reversed activation of sequentially memorized items
during maintaining period in humans Qiaoli Huang1,2([email protected]
pku.edu.cn), Jiarong Jia1,2,3, Huan Luo1,2; 1School of Psychological and
Cognitive Science, Peking University, 2IDG/McGovern Institute for Brain
Research, Peking University, 3Peking-Tsinghua Center for Life Science,
Peking University
It has been hotly debated whether multiple items in working memory are
represented simultaneously or sequentially. A recent study found that items
in different sequence positions elicited gamma power that is phase locked
to distinct phases of a theta oscillation, supporting the sequential representation model. However, the results are based on activity in the encoding period
and it still remains unknown how the sequentially memorized items are
represented during maintaining period. In the present study, we recorded
EEG activities while human subjects performed a sequential working memory task. In each trial, subjects were first presented with multiple rectangles
with different orientation and color, and were instructed to only memorize
the orientation of the cued rectangles and their temporal order (“Encoding
phase”). Next, they performed a central fixation task, and were simultaneously presented with circles either endowed with memory-related color or
memory-unrelated color (“Maintaining phase”). Finally, they were asked
to judge whether or not the orientation of a presented rectangle was similar
to that of the to-be-memorized rectangles in the Encoding phase (“Recalling
phase”). Critically, we employed a temporal response function technique
(TRF) to extract item-specific response in “Maintaining phase”. We found
that first, the TRF responses for the to-be-memorized items exhibited stronger alpha-band (~10 Hz) power compared to non-memorized items. Second,
the alpha power profiles for the multiple to-be-memorized items showed a
time-reversed alpha activation pattern. Specifically, the item that occupied
earlier (later) sequential position in Encoding phase elicited later (earlier)
alpha responses in Maintaining period. Finally, the sequential activation
sequence became faster as the memory list became longer. In summary, our
results support the sequential representation model in working memory,
and provide direct neuronal evidence in human subjects that sequential
memory is mediated by re-activating sequences in a time-reversed manner
during maintaining period.
Acknowledgement: This work was supported by the National Natural Science
Foundation of China Grant 31522027 and 31571115 to LH.
26.4066 Modulation of working memory filtering efficiency during
acute bouts of exercise. Lindsey Purpura1,2([email protected]
ucsb.edu), Thomas Bullock1,2, Barry Giesbrecht1,2; 1University of California,
Santa Barbara, 2Institute for Collaborative Biotechnologies
Locomotor activity impacts behavioral performance and brain activity in
various species including invertebrates, rodents, and humans (Chiappe et
al., 2010; Niell & Stryker, 2010; Bullock et al., 2015; Bullock et al., 2016).
Here we investigated the effect of exercise on working memory (WM) filtering efficiency. Filtering efficiency was measured using the amplitude
of the scalp-recorded contralateral delay activity (CDA). CDA amplitude tracks with the number of encoded items and can be used as a way
to assess the amount of encoded information compared to the number of
relevant and irrelevant items presented (Vogel et al., 2005). While previous research suggests sensory gain during exercise (Bullock et al., 2015),
other work suggests impaired cognitive control during physical exertion
(Eddy et al., 2015). If filtering efficiency is related to attentional control,
then the finding that cognitive control is impaired during physical exercise
predicts decreased filtering efficiency during exercise. To test this prediction, participants (n=5) encoded a brief (100 ms) memory array and after
a retention period (900ms) reported if any relevant items had changed
orientation. One-third of trials included four relevant items, one-third presented two relevant items, and one-third presented two relevant and two
irrelevant items. Participants completed this task during rest (mean heart
rate (HR)=67.3 bpm) and low intensity (duration = 45 min; mean HR=103.7
bpm) cycling. Filtering efficiency was calculated for each subject and each
exercise condition. Filtering efficiency was higher during rest compared
to low intensity exercise, t(4)=2.31, p< .05 (one-tailed). All participants
showed this effect. Although it is unclear whether this effect is caused by a
drop in CDA amplitude for four relevant items or an increase in amplitude
on trials with irrelevant items during low intensity exercise compared to
rest, the results suggest that WM filtering efficiency is modulated by brief
bouts of physical exercise.
Acknowledgement: This work was supported by the Institute for Collaborative Biotechnologies through contract W911NF-09-0001 from the U.S. Army
Research Office.
26.4067 Neural evidence for unitization following perceptual expertise
Jackson Liang1([email protected]), Jonathan Erez2, Felicia
Zhang3, Rhodri Cusack2,4, Morgan Barense1,5; 1Department of Psychology,
University of Toronto, 2Department of Psychology, University of Western
Ontario, 3Department of Psychology, Princeton University, 4The Brain and
Mind Institute, 5Rotman Research Institute
Recent fMRI studies have shown that the contents of visual working memory can be decoded from early visual areas, including V1. This result has
been interpreted as support for the sensory recruitment hypothesis: the
idea that the neurons responsible for vision also sub-serve visual working
memory and visual imagery. However, whereas these results imply that
the same brain areas are responsible for vision and visual memory, they do
not rule out the possibility that these processes rely on completely different
populations of neurons within these areas. For example, although viewing
and remembering an orientation might lead to the same global radial bias
pattern in V1, entirely different neurons may produce these patterns during
vision and memory. We develop a novel EEG paradigm that allows us to
directly test whether the same neurons responsible for processing incoming visual signals are indeed modulated by an internally driven memory
signal. Participants held an orientation in working memory while viewing
a flickering visual noise patch. This flickering stimulus generated an EEG
response known as the steady state visually evoked potential (SSVEP), a
measure of early neural responses to the noise stimulus. Critically, if memory relies on these same visual neurons, then the SSVEP response to visual
stimulation should also carry information about the stimulus being held in
memory. We confirm this prediction by showing that a multivariate pattern
classifier can be used to identify a remembered orientation from the stimulus-driven SSVEP. This finding demonstrates a direct interaction between a
bottom-up stimulus-driven signal and a top-down memory-driven signal,
providing strong evidence for the sensory recruitment hypothesis and a
powerful new approach for investigating visual memory with EEG.
The organization of representations in the ventral visual stream (VVS) is
thought to be hierarchical, such that posterior VVS represents simple object
features, whereas anterior VVS supports increasingly complex conjunctive representations of multiple features. Despite considerable empirical
support for this representational hierarchy for processing novel objects, it
is unclear what changes occur to distributed object representations with
extended learning. The perceptual expertise literature shows that discrimination between complex objects becomes faster with experience; this is a
hallmark of unitization theory, whereby multiple features can be unitized
and accessed as rapidly as a single feature. Keeping the organizing principle of the representational hierarchy in mind, this simple idea makes a
powerful and unique prediction: unitization through perceptual training
should modify conjunctive representations, but not simply as response tuning of existing representations. Rather, conjunctive representations would
be redistributed to posterior VVS, whose architecture is specialized for
processing single features. To test this hypothesis, we used fMRI to scan
participants before and after visual training with novel objects comprising
1-3 features that were organized into distinct feature conjunctions. First, we
used neural pattern similarity to replicate earlier findings that complex feature conjunctions were associated with conjunctive coding in anterior VVS.
Critically, we also demonstrated that for well-learned objects, the strength
of conjunctive coding increased post-training within posterior VVS. Furthermore, multidimensional scaling revealed increased pattern separation
of the representation for individual objects following training. Finally, we
showed that functional connectivity between anterior and posterior VVS
increased for unfamiliar objects, consistent with early involvement in unitizing feature conjunctions in response to novelty. While there is strong
behavioral support for unitization theory, a compelling neural mechanism
had been lacking to date. Here, we leveraged recent advances in VVS subregional function to link established behavioral observations with representational transformations in the human brain.
26.4068 Neural mechanisms of precision in visual working memory
for faces Elizabeth Lorenc1([email protected]), Mark
D’Esposito1,2; 1Helen Wills Neuroscience Institute, University of California,
Berkeley, 2Psychology, University of California, Berkeley
Visual working memory (VWM) allows for the maintenance and manipulation of information about objects no longer in view. Interestingly, the
precision with which visual information can be encoded, maintained, and
retrieved from VWM varies considerably between healthy individuals, and
even from trial to trial within a single individual. We hypothesize that a
stimulus-selective area such as the fusiform face area (FFA) supports precise VWM by maintaining perception-related activity when a visual stimulus is no longer present. To that end, we trained an encoding model on
perception-related activity patterns, and then inverted the model to reconstruct face VWM representations in the FFA and early visual areas. Functional magnetic resonance imaging data was collected while participants
performed a delayed-estimation task for faces. On each trial, a post-cue
indicated whether the participant should store the item through a 10s
delay period (“Store”) or discard it from memory (“Drop”). “Store” trials
ended with a method-of-adjustment response in which a random face was
morphed to match the remembered face, and “drop” trials ended with a
perceptual matching task in which a probe face was morphed to match a
simultaneously-presented test face. We found that faces could reliably be
reconstructed from both the early visual and FFA regions of interest during
perception, before the “store” or “drop” post-cue. Interestingly, reliable
face reconstructions persisted in both V1-V3 and the FFA through the memory delay, when a participant was actively holding a face in memory. However, we found the opposite in the “drop” trials; delay activity patterns
were anti-correlated with those at perception, yielding negative population
tuning curves. Future analyses will investigate the role of the lateral prefrontal cortex in sustaining perception-related activity when a face stimulus
is actively maintained in working memory, and in suppressing that activity
when maintenance is not required.
Acknowledgement: NIH Grant MH63901 and NIMH F31MH107157
26.4069 Decoding visual spatial working memory uncertainty from
human cortex Thomas Sprague1([email protected]), Masih Rahmati1,
Aspen Yoo1, Wei Ji Ma1,2, Clayton Curtis1,2; 1Department of Psychology,
New York University, 2Center for Neural Science, New York University
Although we have remarkable insight into the variations in quality of our
visual working memory (WM) representations (Rademaker et al, 2012),
how this uncertainty arises from neural activity patterns remains unknown.
Bayesian theories of probabilistic population coding posit that information is represented as a probability distribution over feature values within
populations of noisy neurons, and that the width of distributions directly
indexes the uncertainty with which a feature is represented (Pouget et al,
2000; Ma et al, 2006; Jazayeri & Shadlen, 2006). Previous efforts to relate
behavioral performance to neural WM representations (Ester et al, 2013;
Sprague et al, 2014; 2016) have used linear methods, which cannot utilize the noise in neural responses to optimally constrain decoding. This is
a critical challenge, as noise places important constraints on representations
of information in neural activity patterns (Averbeck et al, 2006). Here, we
adapted a recently-published decoding method to measure representations
of spatial positions in WM, as well as their uncertainty (van Bergen et al,
2016). This method, based on a Bayesian generative model of neural activity
which incorporates spatial preferences of individual voxels and estimates
of their noise, results in a full likelihood function over feature values, rather
than a point estimate. Participants remembered a precise spatial position
over an extended delay interval (10 s) while we imaged cortical activation
patterns using BOLD fMRI. Decoded likelihood functions from visual cortex yielded accurate estimates of decoded feature values (mean of the likelihood function). Furthermore, the uncertainty of feature representations
(circular standard deviation of the likelihood function) accurately reflected
the noise of the representation: on trials with greater uncertainty, decoding
error was higher. These results support variable precision models of WM,
which posit that items are maintained with different levels of precision
across items and across trials (van den Berg et al, 2012).
Acknowledgement: NIH/NEI R01-EY01640 (CEC) and NIH NEI R01-EY020958
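Abstract 26.4069 above summarizes each decoded likelihood function by its mean (the decoded value) and its circular standard deviation (the uncertainty). A minimal sketch of those two summary statistics follows, assuming the likelihood is sampled on a grid of polar angles in degrees. The function name, the grid, and the example likelihoods are illustrative assumptions. The Bayesian decoder that produces the likelihood is not reproduced here.

```python
import math


def circular_mean_and_sd(angles_deg, likelihood):
    """Circular mean and circular standard deviation (both in degrees)
    of a likelihood function sampled at `angles_deg`."""
    total = sum(likelihood)
    if total <= 0:
        raise ValueError("likelihood must have positive mass")
    probs = [v / total for v in likelihood]        # normalize to sum to 1
    c = sum(p * math.cos(math.radians(a)) for a, p in zip(angles_deg, probs))
    s = sum(p * math.sin(math.radians(a)) for a, p in zip(angles_deg, probs))
    resultant = min(math.hypot(c, s), 1.0)         # mean resultant length, clamped
    mean_deg = math.degrees(math.atan2(s, c)) % 360.0
    if resultant == 0.0:
        return mean_deg, math.inf                  # no preferred direction at all
    sd_deg = math.degrees(math.sqrt(-2.0 * math.log(resultant)))
    return mean_deg, sd_deg


# Hypothetical likelihoods over eight positions, both centered on 90 degrees.
angles = list(range(0, 360, 45))
narrow = [0, 0, 1, 0, 0, 0, 0, 0]
broad = [0, 1, 2, 1, 0, 0, 0, 0]
narrow_mean, narrow_sd = circular_mean_and_sd(angles, narrow)
broad_mean, broad_sd = circular_mean_and_sd(angles, broad)
print(narrow_mean, narrow_sd)
print(broad_mean, broad_sd)
```

Both example likelihoods decode to the same position, but the broader one has the larger circular standard deviation, which is the sense in which the width of the likelihood indexes uncertainty.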
26.4070 Active Maintenance of Working Memory Representations
Remains Robust Under Automatic, But Not Non-Automatic, Processing of Distractor Stimuli Orestis Papaioannou1,2([email protected]
edu), Steven Luck1,2; 1Center for Mind and Brain, UC Davis, 2Department
of Psychology, UC Davis
Visual working memory relies heavily on the active maintenance of representations. However, it is unclear whether this active maintenance can
co-occur with other concurrent processing of stimuli. Sparked by this question, we used event related potentials (ERPs) - specifically contralateral
delay activity (CDA) - to create a continuous marker of active maintenance
of lateralized stimuli during the processing of lexical stimuli. Participants
were asked to remember four colored items presented on the right or left
side of the screen for a change detection task. A lexical item (word or consonant string) was presented during the 1500 ms retention interval on a
subset of trials. Participants were instructed to either ignore these items
and focus entirely on the memory task (single-task condition), or to indicate
whether the item presented was a word or consonant string (dual-task condition). A CDA to the lateralized memory items was observed for both conditions prior to, or in the absence of, an intervening stimulus. However, the
CDA was disrupted by the processing of the lexical items during the dual-task
condition, but not the single-task condition. A larger N400 - a component associated with semantic and orthographic processing - was found for
words compared to consonant strings in both conditions, indicating that
participants differentiated between words and consonant strings in both
conditions. Thus, during the single-task condition, the CDA was not disrupted by the lexical stimuli even though the N400 data indicate that these
stimuli were discriminated. Taken together, these findings suggest that
active maintenance is unimpeded by automatic lexical processing but fails
when this same processing must be tied to a non-automated task. Interestingly, behavioral measures show only a minor decrease in change-detection performance in the dual-task condition, providing evidence of a secondary working memory process that can support the memory task when
the CDA has been disrupted.
Acknowledgement: Grant R01 MH076226, National Institute of Mental Health.
26.4071 Decoding the Content of Visual Working Memory in the
Human Visual System Xilin Zhang1([email protected]), Nicole
Mlynaryk1, Shruti Japee1, Leslie Ungerleider1; 1Laboratory of Brain and
Cognition, National Institute of Mental Health, National Institutes of
Health, Bethesda, Maryland, USA
See page 3 for Abstract Numbering System
26.4072 Can the visual cortex represent the invisible? Shude
Zhu1([email protected]), Li Zhang1, Rudiger von der Heydt1,2;
1Krieger Mind/Brain Institute, Johns Hopkins University, 2Department of
Neuroscience, Johns Hopkins School of Medicine
In everyday vision objects often occlude others from sight, but objects
are perceived as permanent despite temporary occlusions. Observations
on border ownership coding in low-level visual areas (V1, V2) suggested
an influence from object representations at a higher level that have some
persistence (O’Herron & von der Heydt, JOV 11(2):12, 2011). We recorded
from neurons of areas V2 and V4 searching for persistence of object-evoked
activity during temporary occlusion. Monkeys performed a visual foraging
task in which they sequentially fixated individual figures of an array of
10 figures in search for reward. The array was constructed so that fixating one figure would, in most cases, bring another figure into the receptive
field (RF) of the neuron under study, while in other cases it would bring
uniform background into the RF. During the presentation of the array, a
grating of opaque stripes drifted over the array, variably occluding some
of the figures. To a human observer, the 10 figures appeared permanent
despite the temporary occlusions. The fixations produced 4 different conditions, depending on whether the RF was on an occluding stripe or not,
and whether there was a figure at the location of the RF or not. We determined the average firing rate for each condition and calculated a permanence index PERMI= (OccludedFigure – OccludedNothing) / (VisibleFigure – VisibleNothing). Preliminary results suggest that V4 contains a small
proportion of neurons (8/86) with high permanence (PERMI >0.5), whereas
no such neurons were found in V2 (0/43). The distribution of recording
locations suggests spatial clustering of high permanence neurons within
V4. Thus, V4 might be involved in providing object permanence.
Acknowledgement: NIH R01EY02966, NIH R01EY027544, ONR BAA08-019
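The permanence index defined in this abstract can be illustrated with a minimal sketch. The function name, argument names, and the firing rates in the example are hypothetical and chosen only for illustration; the abstract gives the formula but not how the authors computed it in practice.

```python
def permanence_index(occluded_figure, occluded_nothing,
                     visible_figure, visible_nothing):
    """PERMI = (OccludedFigure - OccludedNothing) / (VisibleFigure - VisibleNothing).

    Each argument is a mean firing rate for one of the four fixation conditions.
    """
    visible_effect = visible_figure - visible_nothing
    if visible_effect == 0:
        # Assumption: the index is treated as undefined when the neuron shows
        # no figure response under visible conditions.
        raise ValueError("PERMI is undefined when the visible figure effect is zero")
    return (occluded_figure - occluded_nothing) / visible_effect


# Hypothetical mean firing rates in spikes/s, not data from the study.
example = permanence_index(occluded_figure=18.0, occluded_nothing=10.0,
                           visible_figure=30.0, visible_nothing=10.0)
print(example)  # 0.4
```

Under this definition, a value near 1 means the figure-related response survives occlusion almost fully, and a value near 0 means it disappears when the figure is hidden.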
26.4073 TMS of the frontal eye fields reveals load- and cue-related modulations of cortical excitability and effective connectivity Amanda van Lamsweerde1([email protected]),
sented in either the right or left visual hemifield, indicated by a cue at the
beginning of each trial. Partway through the WM delay, single pulse TMS
was delivered to the right FEF. The resulting TMS-evoked response (TER)
was then source-localized and two synthetic measures--significant current density (SCD) and significant current scattering (SCS)--were derived
to assess condition-specific changes in cortical excitability and effective
connectivity, respectively. Preliminary results (n=6) revealed load- and
cue-specific increases in both measures: both SCD and SCS increased when
four versus two items were maintained, but this effect was restricted to
the condition in which stimuli were encoded from the contralateral visual
hemifield. Follow-up analyses examining whether these differences were
specific to particular brain regions hypothesized to play a role in WM maintenance revealed that changes in SCD and SCS were most pronounced in
the stimulated region, with smaller modulations observed across a range of
areas, including bilateral primary and extrastriate visual cortex, among others. Additional planned analyses include an assessment of the relationship
between the TER, the amplitude of CDA, and behavioral measures of WM
capacity and precision.
Acknowledgement: NIH R15-MH105866-01
Visual Memory: Cognitive disorders, individual
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4074 Individual differences reveal independent mechanisms
for working memory and perceptual serial dependence Kathy
Zhang1([email protected]), David Whitney1,2; 1Department of
Psychology, University of California - Berkeley, 2Helen Wills Neuroscience Institute, University of California - Berkeley
Objects in the world appear to have stable identities, sometimes more stable
than they actually are. This is thought to be facilitated by perceptual serial
dependence, and has been reported for orientation, faces, attractiveness,
position, among various other visual object domains. The mechanism for
serial dependence remains unclear: it has been argued that serial dependence may occur at the level of perception, at the level of decisions, or that
serial dependence, being an influence of the past on the perception of the
present, may simply be a byproduct of working memory. Here we tested
individual differences in perceptual serial dependence for orientation and
faces in addition to measures of working memory capacity to try to determine whether serial dependence is indeed correlated with memory. There
were substantial individual differences in perceptual serial dependence,
but the correlation between orientation and face serial dependence across
subjects was approximately zero and non-significant. This indicates that
there is some degree of independence and likely separable processes that
contribute to serial dependence in the perception of faces and orientation,
while also providing evidence against the possibility that serial dependence
is a stimulus invariant ‘decision’ bias. We then compared serial dependence
in face and orientation perception to measures of working memory capacity, including an operation span task and a change detection task. While the
estimates of working memory capacity were stable across sessions within
subjects, each of these measures of working memory was only weakly and
non-significantly correlated with perceptual serial dependence of either
faces or orientation. These results build on previous ones (Zhang et al., 2015)
suggesting that serial dependence operates independently from traditional
measures of working memory. They further raise the possibility that serial
dependence operates at multiple levels of visual processing.
Andrea Bocincova1, Andrew Heinz1, Jeffrey Johnson1; 1North Dakota State
26.4075 Time is needed for memory to be biased toward an ensem-
In a recent study, Reinhart et al. (2012) reported that the amplitude of
local field potentials recorded in the macaque FEF during a delayed-saccade working memory task predicted the amplitude of contralateral delay
activity (CDA) measured over posterior sensors. This finding suggests that
the FEF may contribute to the maintenance of information in WM through
feedback inputs to posterior brain regions. In this study, we sought causal
evidence for this possibility by using TMS to stimulate the FEF and EEG to
measure the strength of the resulting response and its spatial spread to distal cortical areas during WM maintenance. Specifically, subjects completed
a cued recall test in which they remembered either two or four colors pre-
1Department of Psychology, Sungkyunkwan University, 2Center for Neuroscience Imaging Research, Institute for Basic Science
ble average Byung-Il Oh1([email protected]), Min-Suk Kang1,2;
It has been shown that memory of an individual item is biased toward
an ensemble average (Brady & Alvarez, 2011; Lew & Vul, 2015). Here we
investigated how this bias changes over time by using an orientation estimation task. Participants were presented with 20 oriented white bars for
200 ms on a gray background. In each trial, the stimuli had five orientations
that differed by 0, −15, +15, −30, and +30 degrees from a randomly selected
orientation, and each orientation was uniformly sampled four times. These
20 stimuli were then randomly presented within an imaginary 4 × 5 grid.
Saturday PM
Visual working memory (VWM) enables the storage and manipulation of
limited information about stimuli no longer in view over short temporal
intervals. Given the huge amount of information confronting the visual
system at any given moment, VWM storage of multiple items is often
required. However, how multiple VWM representations are maintained in
the human visual system remains unclear. Here, we used functional magnetic resonance imaging (fMRI) multivariate pattern analysis (MVPA) to
address this question as human subjects performed a delayed orientation
discrimination task. During each trial, subjects maintained fixation while
three sample orientation gratings (20° ± 3°, 80° ± 3° and 140° ± 3°) were
randomly presented in three locations (5° eccentricity) for 3 s, followed
by a cue indicating the number of gratings (i.e., the memory load: R1, R2,
and R3) that subjects needed to remember (one, two, and three gratings,
respectively). These three types of cue appeared with equal probability
and randomly in the experiment. After a 10-s retention interval, a test grating was briefly and randomly presented in one of the cued locations, and
subjects indicated its orientation relative to the cued grating (± 3° or ± 6°).
Behavioral data showed that increasing memory load (R1, R2, and R3)
impaired subjects' performance in discriminating the small differences in
orientation between the cued grating and the test grating. The fMRI experiment demonstrated that MVPA decoding in occipital areas, but not parietal
areas, closely tracked the decline in subjects' performance with
increasing memory load (R1, R2, and R3). Our results suggest that occipital
cortex, but not parietal cortex, has a central role in storing multiple VWM representations
in the human brain.
The participants had to remember those bars and recall the orientation of a
target, which was cued with a circular contour. This cue was displayed at
one of three different times. In one condition, the stimuli and the cue were
simultaneously presented; in the other two conditions, the cue was presented
500 ms or 1000 ms after stimulus onset. The estimation phase started 1500
ms after stimulus onset, at which point a randomly oriented probe appeared
within the circular cue, and the participants reported the remembered orientation of the target by adjusting the probe orientation. Importantly, the
target orientation was either the mean orientation of the 20 stimuli or ±15
degrees from the mean. When the cue was simultaneously presented with
the stimuli, the remembered orientation of the target was similar to its physical orientation. However, when the cue was presented 500 ms or 1000 ms
after the stimuli onset, the remembered orientation was biased toward the
mean orientation of the 20 stimuli. These results suggest that the memory
representation is gradually biased toward an ensemble average over time.
Acknowledgement: NRF 2016R1D1A1B03930292
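The stimulus construction described in this abstract (five orientations offset by 0, ±15, and ±30 degrees from a randomly selected orientation, each used four times, placed at random within a 4 × 5 grid) can be sketched as below. This is only an illustration under stated assumptions: the function name is hypothetical, orientations are treated as axial (wrapped to 0-180 degrees), and the base orientation is drawn uniformly, none of which the abstract specifies.

```python
import random


def make_ensemble_display(rng, n_rows=4, n_cols=5):
    """Return (base_orientation, [((row, col), orientation_deg), ...])."""
    offsets = [0, -15, 15, -30, 30]      # offsets named in the abstract
    repeats = 4                          # each orientation sampled four times
    base = rng.uniform(0.0, 180.0)       # assumption: uniform base orientation
    # Assumption: bar orientation is axial, so wrap into [0, 180).
    orientations = [(base + off) % 180.0
                    for off in offsets for _ in range(repeats)]
    cells = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    rng.shuffle(cells)                   # random assignment to grid cells
    return base, list(zip(cells, orientations))


base, display = make_ensemble_display(random.Random(0))
print(len(display))  # 20
```

Because the offsets are symmetric around zero, the mean orientation of the 20 bars equals the base orientation, which is the ensemble average that targets at ±15 degrees could be biased toward.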
26.4077 Dissociable Effects of Depressed Mood, Schizotypal
Personality Disorder, and Age on the Number and Quality of Visual
Working Memory Representations Weiwei Zhang1([email protected]
ucr.edu), Weizhen Xie1, Marcus Cappiello1; 1Dept. of Psychology, UC
Limited storage in visual working memory (VWM) sets a major constraint
on a variety of cognitive and affective processes. Additional reduction of
this central bottleneck has been associated with declines in various health
related factors such as depressed mood, Schizotypal Personality Disorder
(SPD), and age. The present study examined the disruptive effects of these
factors on quantitative (i.e., the number of retained representations) and
qualitative (i.e., precision) aspects of VWM representations. In two studies, participants completed a short-term color recall task along with questionnaires on mental health including depressed mood, SPD, and demographic information. Study 1 showed that depressed mood was associated
with reduced VWM storage capacity, whereas SPD was associated with
reduced mnemonic precision (assessed as the inverse of recall variability
after randomness in recall was factored out). These patterns were absent in
sensory memory, indicating the VWM effects were post-perceptual. Study
2 replicated the reduction in VWM storage capacity by depressed mood
and further demonstrated that chronological age negatively correlated
with VWM precision. The latter effect remained significant after statistically controlling the contribution of poor sleep quality that was associated
with age. These results demonstrate that depression, SPD, and age can have
dissociable effects on VWM representations, in line with the growing
literature suggesting that the two aspects of VWM representations can be
dissociated using different experimental manipulations and supported
by non-overlapping neural mechanisms. Together, these findings support
the view that the quantity of retained VWM representations can be independent of
their quality.
26.4078 Impact of Impaired Spontaneous Grouping on Estimates
of Visual Working Memory Capacity in Schizophrenia Molly
Erickson1([email protected]), Brian Keane1, Dillon Smith1, Steven
Silverstein1; 1Division of Schizophrenia Research, Department of Psychiatry, Rutgers University
Schizophrenia is a mental illness that is associated with working memory
(WM) deficits; however, a mechanistic account for these deficits has not yet
been identified. Recent evidence suggests that electrophysiological abnormalities during early encoding/consolidation processes may constrain WM
capacity in people with schizophrenia (PSZ; Erickson et al., 2016). One hypothesis that dovetails with
these observations is that PSZ do not use spontaneous configural grouping
strategies to encode and consolidate items in storage the same way that
healthy control subjects (HCS) do. The present study was conducted to test
this hypothesis. Two HCS and one PSZ have completed the task to date,
with an expanded sample size anticipated by May 2017. Participants were
exposed to three variants on a change-detection task: (1) a pro-grouping
task variant wherein to-be-remembered items (four sectored circles) create an illusory-contour-defined polygon; (2) an anti-grouping task variant
wherein to-be-remembered items are rotated and surrounded by circles to inhibit illusory contour formation; and (3) a neutral task variant, wherein to-be-remembered items are rotated, but not surrounded by
circles that inhibit illusory contour formation. Consistent with our hypothesis, preliminary results suggest that HCS have reduced accuracy in the
anti-grouping condition compared to the neutral and pro-grouping conditions. By contrast, PSZ accuracy appears to be improved in the pro-grouping condition compared to either the anti-grouping or neutral task variants.
Taken together, these observations suggest that (1) HCS can flexibly use
grouping strategies to encode items regardless of whether grouping cues
are weak (neutral condition) or strong (pro-grouping condition) to improve
WM storage; and (2) poor WM task performance in PSZ may be due in part
to decreased use of spontaneous grouping strategies to encode items. This
conclusion is supported by evidence that PSZ exhibit improved WM task
performance when grouping cues are made more explicit.
26.4079 Evidence of limited cross-category visual statistical learning
in amnesia Marian Berryhill1([email protected]), Adelle Cerreta1,
Timothy Vickery2; 1Department of Psychology, Program in Cognitive and
Brain Sciences, University of Nevada, 2Department of Psychological and
Brain Sciences, University of Delaware
The neural correlates of visual statistical learning (VSL) remain debated,
but neuroimaging and neuropsychological findings support the emerging
view that medial temporal lobe (MTL) involvement is needed for this form of implicit learning.
We extended new findings showing that performance on classic triplet
VSL tasks is interrupted in amnesic patients. We sought to test whether
some forms of VSL may persist without intact MTL, by combining stimuli
within and between broad categories (faces and scenes) in an otherwise
typical VSL paradigm. In Experiment 1, the familiarization task required
participants to monitor sequentially presented faces (male/female) and
scenes (indoor/outdoor), and to report image flickers. 16 AB pairs were
repeated, such that A always predicted B. The nature of these pairs was
the key manipulation. To examine how categorical boundaries impact
VSL, these pairs were consistent (male->male; indoor->indoor), inconsistent (male->female; indoor->outdoor) or cross-category (male->outdoor;
indoor->female). During a surprise 2AFC recognition phase, the task was
to pick the more familiar pairing versus a foil. Here, the amnesic participant showed chance performance, suggesting no VSL and a reliance of this
form of VSL on MTL structures. In Experiment 2, the familiarization task
was modified to require a stimulus categorization judgment. During the
recognition stage, the patient demonstrated significantly above chance performance for a subset of AB pairs. Surprisingly, her recognition was better
for the pairs that crossed category boundaries, regardless of whether the
same (male->outdoor) or a different (male->indoor) motor response was
required. She showed this same pattern across two testing sessions separated by more than a week. These data provide additional context to our
understanding of the relationship between VSL and the MTL. We found
evidence of limited VSL despite profound MTL damage, suggesting that
the neural underpinnings of VSL may be more varied and contingent on
task demands than previously thought.
Acknowledgement: This material is based on work supported by the National
Science Foundation (NSF 1632738, NSF 1632849). The NSF had no role in study
design, data collection and analysis.
26.4080 Distortions of spatial memory: Social attention, but not
social interaction effects Tim Vestner1([email protected]), Steven
Tipper1, Tom Hartley1, Shirley-Ann Rueschemeyer1; 1Department of Psychology, University of York
Recent studies have demonstrated the malleability of distance perception
in affective/social situations, claiming that distance perception is inherently tied to social experience. Importantly, only an egocentric perspective,
involving distances between the observer and a target stimulus, was studied. The present series of experiments investigated whether similar distortions of space also hold true for allocentric conditions. Using a variety of
displays presenting individuals in different spatial configurations and various relationships to each other, it was tested whether an observer recalled
the distance between individuals as smaller or larger depending on their
relationship and level of engagement. Distance-altering effects resulting
from the attention-direction of the individuals were found. However, thus
far there is no evidence for the influence of the social relationships of the
agents on the recall of their distance to each other. These results confirm
previous research on attention cueing but are not in agreement with theories proposing social top-down effects on spatial memory.
26.4081 Degradation of object-specific knowledge from atrophy of
perirhinal cortex Amy Price1([email protected]), Amy Halpin2,
Michael Bonner3, Murray Grossman2; 1Princeton Neuroscience Institute,
Princeton University, 2Department of Neurology, University of Pennsylvania, 3Department of Psychology, University of Pennsylvania
Multisensory: Touch and balance
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4082 Measuring end-to-end latency of a virtual reality system
objectively and psychophysically Andrew Glennerster1([email protected]), Stuart Gilson2; 1School of Psychology and Clinical
Language Sciences, University of Reading, UK, 2Department of Optometry
and Visual Science, University College of Southeast Norway
Reduction of end-to-end latency (‘motion-to-photons’) is critical for convincing, high-fidelity virtual reality and for scientific uses of VR that mimic
real world interactions as closely as possible. We measured the end-to-end
latency of a real-time infra-red camera-based tracking system (Vicon), with
rendering on a standard graphics PC and using a head mounted display
(nVis SX111 HMD). A 100Hz camera captured both a tracked ‘wand’ and
the rendered object (a sphere) on the display screen as the wand was moved
from side to side. Cross-correlation of the centroid positions of the tracked
wand and rendered sphere allowed us to calculate the end-to-end latency
of the system for different displays. With our HMD (LCD display), this was
about 40ms (± 2ms) whereas for a CRT it was 30ms. Because our display
was refreshed at 60Hz and rendering time was less than 16.6ms, we could
wait for the latest possible Vicon tracker coordinate (available at 250Hz)
before rendering the next frame and swapping buffers. This reduced
latency by 9ms (to 31ms). In a psychophysical experiment, we showed
that a reduction in latency of this magnitude was easily detectable. Three
observers waved a wand, rendered as a multi-faceted ball and, in a forced-choice paradigm, identified whether the latency between hand movement
and rendered stimulus movement was ‘high’ or ‘low’ (50% of trials were of
each type; 4 practice trials including both types preceded each run). We varied the latency difference by a combination of (i) adding artificial latency to
one stimulus and (ii) minimizing the latency of the shorter latency stimulus.
Plotting d’ against log latency difference and fitting a straight line showed
that the threshold difference (d’ = 1) was less than 4ms for all participants.
This corresponds to a remarkably low Weber fraction of about 10%.
26.4083 Multimodal Contributions to Subjective Visual Vertical Chéla Willey1([email protected]), Zili Liu1; 1Department of Psychology, College of Life Sciences, University of California, Los Angeles
We investigated the perception of subjective visual vertical (SVV) as a result
of visuo-vestibulo-proprioceptive integration. Estimates of vertical are typically made by rotating a rod in space to a vertical position while standing
upright. The visual context in which the rod is presented can influence SVV
estimates as follows. In the rod and frame task, SVV estimates are biased
towards the orientation of a surrounding contextual frame. However, SVV
may also be influenced by vestibular and proprioceptive input indicating
the direction of gravity. We sought to measure the effect of these modalities in SVV estimates by reducing their contribution in four conditions.
Participants performed the rod and frame task while standing upright and
while lying down using a virtual reality headset. This allowed us to eliminate contributing information due to vestibular gravitational cues available in the upright position. The use of virtual reality also allowed for the
immersed visual illusion of the upright position in the supine condition.
Further, we manipulated proprioceptive input by applying vibration to the
participants’ back in the supine position, the feet in the upright position
and to the neck in both positions during half of the trials in each body orientation condition. During repeated trials, participants judged the orientation
(clockwise or counterclockwise) of the rod located at the center of a neutral
or rotated 3D frame. We found that there was an increase in bias towards
the orientation of the frame and a decrease in sensitivity in SVV estimates
in the supine conditions as compared to upright conditions, suggesting an
effect of vestibular information. However, we found limited support that
proprioceptive information (without vibration) influenced SVV estimates
in the current study. Our results suggest that SVV estimates are more heavily influenced by visual cues when there is a lack of available vestibular
26.4084 Effect of Vibrotactile Feedback through the Floor on
Social Presence in an Immersive Virtual Environment Myungho
Lee1([email protected]), Gerd Bruder1, Greg Welch1; 1University
of Central Florida
Despite the multisensory nature of human perception, applications involving virtual humans are typically limited to visual stimulation and speech.
We performed an experiment investigating the effects of combined visual,
auditory and/or vibrotactile stimuli on a participant’s sense of social presence with a virtual human. In an immersive virtual environment achieved
via a head-mounted display, the participants were exposed to a virtual
human (VH) walking toward them and pacing back and forth, within their
social space. Participants were randomly assigned to one of three conditions: participants in the “Sound” condition (N=11) received spatial auditive feedback of the ground impact of the footsteps of the VH; participants
in the “Vibration” condition (N=10) received additional vibrotactile feedback from the footsteps of the VH via a haptic platform; while participants
in the “Mute” condition (N=11) were not exposed to sound or vibrotactile
feedback. We measured presence/social presence via questionnaires. We
analyzed the participants’ head movement data regarding backing away
behaviors when the VH invaded the participant’s personal space as well as
the view direction toward the face of the VH. Our results show that social
presence and the backing away distance in the Vibration condition were
significantly higher than in the Sound condition. Presence in the Mute
condition was significantly lower than in the other two conditions. The
vibrotactile feedback of a VH’s footsteps increased the social presence in
both subjective self-reports of the sense of social presence and behavioral
responses when it was accompanied by sounds, compared to vision and
sounds only. We found that participants who experienced both the footstep
sounds and vibrations exhibited a greater avoidance behavior to the VH,
e.g., avoided looking at the VH’s face directly and moved their head backward more when the VH invaded their personal space.
Acknowledgement: The Office of Naval Research (ONR) Code 30 under Dr. Peter
Squire, Program Officer (ONR awards N00014-14-1-0248 and N00014-12-11003)
Acknowledgement: EPSRC EP/N019423/1
Over a lifetime of experience, we store information about the objects in our
environment and their defining features. One important statistical property
for visual objects is the co-occurrence of constituent features. For example,
the round shape of an apple co-occurs frequently with the color red, but
not blue. Recent fMRI work has shown that a sub-region of the anterior
temporal lobe, perirhinal cortex, encodes object-specific information, while
lower-level perceptual features are encoded in posterior regions. Here we
tested whether perirhinal cortex plays a causal role in object-specific representations by coding object-feature co-occurrence statistics. We examined patients with neurodegenerative disease that, as a group, had atrophy spanning portions of temporal, parietal and frontal cortex. We used a
behavioral version of representational similarity analysis to characterize the
semantic and perceptual representations of objects in the patients. Through
a series of behavioral assessments, we constructed dissimilarity matrices
reflecting the patients' understanding of feature co-occurrence statistics for
each object category, the perceptual similarities for color, and perceptual
similarities for simple shapes. We then examined the degree to which the
patients’ behavioral dissimilarity matrices correlated with those of a control
group, which provided a continuous measure of behavioral impairment for
each task in every patient. We tested for neural regions that underlie these
representations by examining correlations with neuroimaging measures of
cortical atrophy. Consistent with our predictions, we found that the degree
of atrophy of perirhinal cortex was strongly correlated with the degree to
which patients were impaired on knowledge of object-feature statistics
but not with their perceptual assessments of color and shape information,
which appeared to rely on more posterior regions of the ventral stream.
These results situate perirhinal cortex at the interface of perception and
semantic memory, and demonstrate its critical role in coding the statistical
regularities that define an object category.
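The comparison of patient and control dissimilarity matrices described in this abstract can be sketched as follows. This is a minimal illustration under assumptions: the abstract does not say which correlation measure was used, so Pearson correlation over the upper triangle is assumed here, and all function names and matrix values are hypothetical.

```python
from math import sqrt


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square dissimilarity matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def rdm_similarity(patient_rdm, control_rdm):
    """Assumed impairment measure: lower values mean the patient's
    dissimilarity structure departs more from the control group's."""
    return pearson(upper_triangle(patient_rdm), upper_triangle(control_rdm))


# Hypothetical 3-item dissimilarity matrices, not data from the study.
control = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
patient = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
score = rdm_similarity(patient, control)
```

Only the upper triangle is used because a dissimilarity matrix is symmetric with a zero diagonal, so the remaining entries carry no additional information.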
26.4085 Spatiotemporal dynamics of braille letter perception in
blind readers Santani Teng ([email protected]), Radoslaw Cichy , Dimitrios Pantazis2, Aude Oliva1; 1CSAIL, Massachusetts Institute of Technology, 2McGovern Institute for Brain Research, Massachusetts Institute of
Technology, 3Department of Education and Psychology, Free University
of Berlin
Traditionally “visual” cortical regions in blind persons are known to activate in response to a wide range of nonvisual tasks, suggesting a functional
reorganization in response to blindness. However, the functional correlates
of specific regional activations and, more generally, the principles governing the reorganization of cortical processing remain unclear. This is
in part because the underlying dynamics of crossmodal plasticity are not
well understood. Previously, we measured brain responses to Braille letter
stimuli using MEG alone, finding that sensory representations are widely
variable and idiosyncratic across subjects. Thus, to elucidate the braille processing stream in blind individuals with high spatial resolution, here we
additionally measure fMRI responses in repeated sessions for individual
subjects. Early-blind, braille-proficient participants were presented with
single-letter stimuli to the left index finger in random order, responding to
occasional deviants. Univariate contrasts yielded reliable BOLD activation
in right somatosensory and bilateral occipital and parietal cortices. Further,
we use MVPA to generate similarity matrices between letter identities from
MEG and fMRI data. We then relate these results with representational similarity analysis, leveraging the spatial resolution of fMRI and the temporal
resolution of MEG in a spatiotemporal fusion analysis: Significant correlations between MEG and fMRI representations index the propagation
of braille information between its arrival in somatosensory cortex and its
subsequent evolution along the processing stream. We interpret results in
the context of competing proposals of processing hierarchies, e.g. whether
a visually deprived cortex reverses the typical visual hierarchy or largely
co-opts it using tactile input.
Acknowledgement: Vannevar Bush Faculty Fellowship in Cognitive Neuroscience
to A.O., NIH R01-EY020484 to A.O., Martinos Imaging Center at MIT
26.4086 Estimation of gloss and shape from vision and
touch. Wendy Adams1([email protected]), Gizem Küçükoğlu2,
Michael Landy2; 1Department of Psychology, University of Southampton,
2Department of Psychology, New York University
Image cues to gloss are affected by both gloss and shape. For this reason, an
optimal observer would jointly estimate gloss and shape. Consistent with
this, we have shown that underestimation of depth is associated with overestimation of gloss, and vice versa. If observers jointly estimate shape
and gloss, using all available information, then shape cues from another
modality, such as haptics (touch), should modulate perceived gloss. Observers viewed and touched visual-haptic stimuli that independently varied in
depth and gloss. The shape information provided by touch was either consistent with vision, or differed by ±15%. On each trial, observers reported
both perceived depth and perceived gloss, with reference to two sets of
physical stimuli: one varied in gloss (painted ping pong balls with varying mixtures of matte and glossy varnish), the other in depth (3-D printed
arrays of random-depth ellipses similar to the visually and haptically rendered stimuli). As expected, perceived depth increased as a function of
both visually and haptically defined depth; observers integrated the depth
information provided by the two modalities. In addition, and in line with
previous findings, perceived gloss was affected by both visually rendered
gloss and visually rendered depth: perceived gloss increased with visually-defined depth. However, the haptic depth perturbations had little effect
on perceived gloss. Subtle changes in quantitative depth, signalled by a
non-visual modality, do not appear to modulate the perceived glossiness
of a surface.
Acknowledgement: EPSRC grant EP/K005952/1
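The statement that observers integrated visual and haptic depth can be illustrated with the standard reliability-weighted (maximum-likelihood) cue-combination rule. This is a sketch under assumed Gaussian noise; the noise values in the usage note are hypothetical and the abstract does not state that this model was fitted.

```python
def combine_cues(visual, haptic, sigma_v, sigma_h):
    """Reliability-weighted average of two depth estimates.

    Each cue is weighted by its inverse variance; the combined estimate
    has a lower standard deviation than either cue alone.
    """
    rel_v = 1.0 / sigma_v ** 2
    rel_h = 1.0 / sigma_h ** 2
    w_v = rel_v / (rel_v + rel_h)
    estimate = w_v * visual + (1.0 - w_v) * haptic
    sigma = (1.0 / (rel_v + rel_h)) ** 0.5
    return estimate, sigma
```

For example, with equally reliable cues and a haptic depth 15% larger than the visual depth, `combine_cues(1.0, 1.15, 0.1, 0.1)` returns an estimate halfway between the two cues.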
26.4087 The role of proprioception in visuo-haptic size perception Robert Volcic1([email protected]), Nadeen Alalami1; 1Department of Psychology, New York University Abu Dhabi
Constancy in visual size perception is generally incomplete: the perceived size is affected by changes in fixation distance. Whether haptic
and visuo-haptic size perception is subject to the same lack of constancy
is, however, as yet unknown. In principle, haptic size perception should
not be affected by changes in hand position and should thus be unbiased.
But, if so, which sense dominates visuo-haptic size perception? Here we
show that visuo-haptic size perception is more than the simple combination
of visually and haptically sensed dimensions. Specifically, in Experiment
1, we asked participants to judge the size of two objects placed at different egocentric distances in visual, haptic and visuo-haptic conditions. The
point of subjective equality and the discrimination threshold, determined
with an adaptive staircase procedure, were taken as measures of size constancy and precision, respectively. We found a lack of size constancy in
both visual and haptic conditions but, surprisingly, not in the visuo-haptic
condition. Precision was lowest in the haptic condition, with no advantage
of the visuo-haptic condition over the visual condition. In Experiment 2,
we tested two possibilities that may explain these results. The first was that
participants might have estimated the size of objects by comparing them to
their own visible hand. The second was that they might have scaled retinal
size by the haptically sensed distance. To contrast these two possibilities,
we manipulated whether participants could see their hand while grasping
objects in the visuo-haptic condition. We found that participants’ size constancy and precision were not impacted by the availability of hand vision.
In sum, our findings show that imperfect size constancy is also found in
haptics, and suggest that visuo-haptic size perception incorporates proprioceptive information about hand position, which promotes visual
processing of an object’s properties.
26.4088 Hand as a Deformable Sensor: Toward a Quantitative
Framework for Characterizing 4D Dynamics of the Hand during
Visual-Haptic Cross-Modal Perception Jay Hegdé1,2,3([email protected]);
1Brain and Behavior Discovery Institute, 2James and Jean Culver Vision
Discovery Institute, 3Department of Ophthalmology, Medical College of
Georgia, Augusta University, Augusta, GA
During haptic exploration of objects, the hand undergoes complex and
dynamic shape changes, or deformations. Deformations in the sensor inevitably introduce distortions in the sensory information. Therefore, in order
to perceive the object properly, the brain must, in computational terms,
‘discount’ hand deformations during visual-haptic cross-modal perception. This process is poorly understood, in no small measure because of a
lack of a quantitative understanding of the deformations of the hand during
haptic exploration. To help overcome this barrier, we used a 3D scanner to
measure the 3D shape of both hands at rest and during haptic exploration
in 7 adult human subjects (5 females) at an average spatial resolution of
0.62 mm ± 0.54 [SD]. As the requisite first step in developing a 4D framework for hand representations, we constructed, for the first time, a standard
coordinate system (or map) for hands, akin to those that already exist for
brains. Standard hand maps derived by coregistering the 14 individual hands with different available algorithms yielded comparable results
(average pairwise cophenetic correlation = 0.43, df > 3 × 10^4, p < 10^-5). Reassuringly, distortions in the map were smallest at the tip of each finger (mean
error, 0.71 mm ± 0.49), where haptic sensitivity is known to be the highest.
Distortions were largest over the opisthenar and the purlicue (4.57 mm ±
2.14), which are relatively unimportant in haptic sensing. To help validate
the map, we determined the extent to which the positions of three of the most
sensitive locations of the hand (fingertips of forefinger, thumb, and middle
finger) in individual hands matched the corresponding locations in the
standard hand. We found that these errors were relatively small (0.52 mm ±
0.34). Together, our results demonstrate the feasibility of representing hand
dynamics during haptic explorations within a standard coordinate system.
Acknowledgement: This study was supported by NIH/NINDS grant R21
NS086356, the U. S. Army Research Office grants W911NF-11-1-0105 and
W911NF-15-1-0311, and a pilot grant from the James and Jean Culver Vision
Discovery Institute of Augusta University to Jay Hegdé.
26.4089 Eye and hand dissociation in depth and direction: behavioral encoding of reach Annalisa Bosco1(annalisa.bosco2[email protected]), Valentina Piserchia1, Patrizia Fattori1; 1Department of Pharmacy and
Biotechnology, University of Bologna
The encoding of reaching towards targets in 3-dimensional space has been
studied at the behavioral level. However, the contribution of coordinate systems to movement control for dissociated reaches where eye and target positions varied both in direction and depth is not fully understood. Twelve
healthy participants were tested in a memory guided task where reaching targets were presented at different depths and directions in foveal and
peripheral viewing conditions. The peripheral and foveal viewing condi-
abilities. Visual tasks included visual acuity, contrast sensitivity, depth perception and motion sensitivity tasks. Motion sensitivity was determined by
measuring duration thresholds in two size and two contrast conditions.
Postural control tasks included maintaining upright stance under different combinations of visual-aid and floor conditions, and shifting the center of mass
to a certain location without stepping out of the initial position. Results: We
replicated existing findings that contrast sensitivity and visual acuity are
related to the ability to shift the center of mass accurately (P < 0.01, r = 0.523
and P < 0.01, r = 0.526, respectively). More importantly, we found that motion
sensitivities to different stimulus sizes are related to different abilities
of balance control. Sensitivity to a large motion stimulus is negatively correlated with the ability to maintain upright stance (P < 0.001, r = -0.540),
and sensitivity to a small motion stimulus is positively correlated with the
ability to shift the center of mass accurately (P < 0.001, r = 0.670). Discussion:
The improved perceptual sensitivity to large, high-contrast motion
in older adults indicates a deteriorated ability to suppress irrelevant
motion signals. Our current finding that the size of the motion stimulus changes
the correlation between motion perception and balance suggests that suppression of irrelevant peripheral motion signals might have
a critical role in controlling balance.
Acknowledgement: Ministero dell’Università e della Ricerca by FIRB
Acknowledgement: This research was supported by the National Research Foundation of Korea (NRF-2015R1D1A1A01060520)
26.4090 Causal inference in the updating and weighting of allocentric and egocentric information for spatial constancy during
whole-body motion Florian Perdreau1([email protected]),
Mathieu Koppen1, Pieter Medendorp1; 1Radboud University, Donders
Institute for Brain, Cognition & Behaviour, Nijmegen, Netherlands
It has been reported that the brain combines egocentric and allocentric information to update object positions after an intervening movement. Studies
typically use discrete updating tasks (i.e., comparing pre- to post-movement
target representations). Such approaches, however, cannot reveal how the
brain would weigh the information in these reference frames during the
intervening motion. A reasonable assumption is that objects with a stable
position over time would be more likely to be considered reliable allocentric landmarks. But inferring whether an object is stable in space while
the observer is moving involves attributing perceived changes in location
to either the object’s or the observer’s displacement. Here, we tested this
causal inference hypothesis by designing a continuous whole-body motion
updating task. At the beginning of a trial, a target was presented for 500 ms,
within a large visual frame. As soon as the target disappeared, participants
were asked to move a cursor to this location by controlling a linear guide
mounted on the vestibular sled on which they were seated. Participants
were translated sideways as soon as their reaching movement started, and
they had to maintain the cursor on the remembered target location in space
while being moved. During the sled motion, the frame would move with a
velocity proportional to that of the sled (gain ranging from -0.7 to 0.7). Participants’ responses showed a systematic bias in the direction of the frame
displacement, one that increased with the difference between the frame
and the sled velocity for small differences, but decreased for large
differences. This bias pattern provides evidence for humans exploiting a
dynamic Bayesian inference process with two causal structures to mediate
the dynamic integration of allocentric and egocentric information in spatial
Acknowledgement: This research was supported by a VICI grant to W.P.M.
26.4091 Suppressive mechanism in motion perception correlates
with postural control ability Liana Saftari1([email protected]
ac.kr), Shuping Xiong2, Oh-Sang Kwon1; 1Department of Human Factors
Engineering, UNIST, Ulsan, South Korea, 2Department of Industrial and
Systems Engineering, KAIST, Daejeon, South Korea
Motivation: To properly control postural balance, good coordination
among the visual, vestibular and muscle control systems must be maintained. It has been documented that deteriorated visual abilities such as
visual acuity, contrast sensitivity and depth perception are related to
deficits in postural control. However, little attention has been paid to the
relation between motion detection sensitivity and postural control, which
is an important gap in the literature because motion signals are crucial in maintaining balance. Methods: Twenty older (67-78) and twenty young adults
(19-24) underwent a series of tasks measuring visual and postural control
26.4092 Interaction Effect of Frequency, Velocity and Amplitude on Perceived Vection Magnitude for Yaw Visual Oscillation Xiao FU1([email protected]), Yue WEI2, Daniel CHEN1, Richard SO1,2; 1Department of Industrial Engineering and Logistics Management, The Hong
Kong University of Science and Technology, Hong Kong, PRC, 2Division
of Bio-medical Engineering, The Hong Kong University of Science and
Technology, Hong Kong, PRC
The phenomenon of vection (also referred to as the illusory sensation of self-motion) is commonly perceived by stationary observers when watching coherently moving visual stimuli. For sinusoidally oscillating visual stimuli, the
oscillation frequency (f), root-mean-squared (RMS) velocity (v), and amplitude (A) are related by v = √2·π·f·A. Previous studies in our lab showed that, when stimuli
oscillated along the fore-and-aft axis, the influence of frequency on perceived vection depended on amplitude and RMS velocity, which indicated
that these three factors had interactive influences on perceived vection. This
study examined the effects of these three factors when viewers were watching visual scenes oscillating along the yaw axis. The experiment consisted
of 4 repeated sessions on separate days, adopting a within-subjects design.
Subjects were exposed to a virtual optokinetic drum with alternating black-and-white stripes oscillating sinusoidally. Preliminary data from the first
8 subjects (gender balanced) showed that when visual oscillations were of
the same frequency, perceived vection magnitude first became stronger
and then became weaker as the velocity (or amplitude) increased (see supplemental materials). Analyses of the main effects of amplitude indicated
that the larger the amplitude, the stronger the perceived vection. However,
when the velocity became very large, the increase in perceived vection
due to larger amplitude could not compensate for the reduction of vection
due to the increase in velocity needed to maintain the same frequency. In
conclusion, exposure to visual oscillations along the yaw axis of the same
frequency but different amplitudes and velocities could generate different
vection magnitudes due to the interacting effects of velocity and amplitude.
Findings suggest that frequency alone should not be regarded as a sufficient predictor of perceived vection magnitude.
Acknowledgement: The Hong Kong Grants Council Grants 16200915 and 618812
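The relation among frequency, RMS velocity and amplitude stated in this abstract follows from a sinusoidal oscillation x(t) = A·sin(2πft): its velocity has peak 2πfA, and the RMS of a sinusoid is its peak divided by √2, giving √2·π·f·A. The sketch below checks this numerically; the function names and the example values are illustrative assumptions.

```python
import math

def rms_velocity(freq_hz, amplitude):
    """Analytic RMS velocity of x(t) = A * sin(2*pi*f*t)."""
    return math.sqrt(2) * math.pi * freq_hz * amplitude

def rms_velocity_numeric(freq_hz, amplitude, samples=10000):
    """RMS velocity estimated by sampling one full oscillation period."""
    period = 1.0 / freq_hz
    total = 0.0
    for k in range(samples):
        t = period * k / samples
        v = 2 * math.pi * freq_hz * amplitude * math.cos(2 * math.pi * freq_hz * t)
        total += v * v
    return math.sqrt(total / samples)
```

Because the three quantities are tied together, holding frequency fixed means any change in amplitude is also a proportional change in RMS velocity, which is why the two cannot be varied independently at one frequency.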
Spatial Vision: Crowding and masking
Saturday, May 20, 2:45 - 6:45 pm
Poster Session, Pavilion
26.4093 Crowding asymmetries in a neural model of image segmentation Alban Bornet1(al[email protected]), Adrien Doerig1, Michael
Herzog1, Gregory Francis2; 1Laboratory of Psychophysics, Brain Mind
Institute, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland,
2Department of Psychological Sciences, Purdue University, USA
In crowding, perception of a target deteriorates when neighbored by flankers. Contrary to predictions of classic pooling models, crowding is strong
only when the target groups with the flankers. We recently showed that
tions consisted in three eye/hand configurations: in the constant-gaze configuration, the eyes fixated a central fixation target and the hand reached
one of the peripheral reaching targets; in the constant-reach configuration,
the eyes fixated one of the peripheral targets and the hand always reached
the central target; and in the foveal-reach configuration, the fixation and
reaching targets were coincident. A novel approach for behavioral data,
combining gradient and vector analysis, was used to define the prevalent
coordinate system used by each subject.
The results showed reach error patterns that are based on both eye-centered and space-centered representations: in depth, errors deviated more towards a
space-centered representation, and in direction they were evenly balanced between
eye-centered and space-centered representations. We correlated the trajectory variability in
eye-centered and space-centered coordinates and found that, in direction, the variability was described by a combination of linear and nonlinear models and, in depth, by a significant linear model. The present data indicate that the different weights of coordinate systems found in depth and
direction are correlated with the variability distribution across eye/target
configurations. In particular, the nonlinear distribution of movement variability in direction can be related to a mixed encoding, and the linear distribution in depth to a more defined spatiotopic encoding.
a version of the neural LAMINART model could explain many grouping
effects in crowding. In the model, top-down segmentation signals promote separate neural representations of separate groups in a scene. The
model is implemented as spiking neurons in the NEST simulator, consists
of hundreds of thousands of neurons and several million connections, and
represents early stages of vision (V1, V2 and V4). Here, we present new
simulations that hypothesize that the placement of the top-down signal is
less precise for more peripheral locations. The overall strength of crowding
for flankers depends on whether the top-down signal can generate distinct
segmentations of the target and a flanker. A flanking set of 8 long lines
spans a large surface that the top-down signal will easily catch and segment from the target vernier, so such segmentations will be very common
regardless of distance from fixation. In contrast, a flanking square can, in
principle, be segmented from the target, but such segmentations will be less
common with larger distances from fixation. These properties produce the
predicted crowding asymmetry: when the target is in the right-side visual
field, crowding is stronger with long-sized lines flanking the target on the
left (closer to fixation) and a square flanking the target on the right (farther
from fixation) than when the flanker locations are switched. In an empirical study, 6 observers discriminated a target vernier with the stimuli used
in the simulations. Consistent with the model predictions, crowding was
stronger with an array of aligned flankers to the left and a square to the
right compared to the other way around.
somewhat on the specific task. For example, using stimuli that do not easily
combine to form a unique symbol (e.g. letters or objects), observers typically
confuse the source of objects and report either the target or a distractor.
Alternatively, when continuous features are used (e.g. orientated gratings
or line positions), observers often report a feature matching the average of
target and distractor features. To help reconcile these empirical differences,
we developed a method of adjustment that allows detailed analysis of multiple error categories occurring within the one task. We presented a Landolt
C target oriented randomly at 10° eccentricity in the right peripheral visual
field in one of several distractor conditions. To report the target orientation,
an observer adjusted an identical foveal target. We converted each perceptual report into angular distances from the target orientation and from the
orientations of the various distractor elements. We applied new analyses
and modelling to these data to quantify whether perceptual reports show
evidence of positional uncertainty, source confusion, and featural averaging on a trial-by-trial basis. Our results show that observers reported a distractor orientation instead of the target in more than 50% of trials in some
conditions. Our data also reveal a heterogeneous distribution of perceptual reports that depends on target-distractor distance. We conclude that
aggregate performance in visual crowding cannot be neatly labelled, and
the appearance of a crowded display is probabilistic.
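The conversion of each report into angular distances from the target and distractor orientations can be sketched as follows. A Landolt C has a 360° orientation range, so differences must wrap around the circle. The nearest-element labelling is only a crude illustration of source confusion; the function names are hypothetical and this is not the authors' model.

```python
def angular_distance(report_deg, reference_deg):
    """Signed smallest rotation from reference to report, in [-180, 180)."""
    return (report_deg - reference_deg + 180.0) % 360.0 - 180.0

def nearest_source(report_deg, target_deg, distractor_degs):
    """Label the display element whose orientation is closest to the report."""
    candidates = [("target", target_deg)]
    candidates += [(f"distractor_{i}", d) for i, d in enumerate(distractor_degs)]
    return min(candidates,
               key=lambda c: abs(angular_distance(report_deg, c[1])))[0]
```

Reports that fall between the target and a distractor orientation, rather than near either one, would be candidates for featural averaging and need a separate analysis.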
Acknowledgement: Human Brain Project, grant agreement n°604102
26.4094 Perceptual Grouping and Segmentation: Uncrowding Gregory Francis1([email protected]), Alban Bornet2, Adrien
Doerig2, Michael Herzog2; 1Psychological Sciences, Purdue University,
2Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique
Fédérale de Lausanne
In visual crowding a target stimulus can be difficult to identify or discriminate when it is surrounded by additional flanking elements. Numerous
empirical studies have demonstrated that the strength of crowding depends
on the apparent grouping relationship between the target and flanking elements. Last year (Francis, Manassi & Herzog, 2016), we described a neural
network model of visual perception that explained a wide variety of crowding effects as the result of neural grouping and segmentation mechanisms.
We now present new model simulations of an uncrowding effect, where a
single stimulus around a target causes strong crowding but additional flanking stimuli produce weaker crowding effects. Manassi, Lonchampt, Clarke
& Herzog (JOV, 2016) investigated more than a dozen such uncrowding
examples; and our new simulations demonstrate that the neural network
model accounts for many of these new uncrowding examples because adding flanking elements leads to a larger perceptual group of flankers that
is distinct from the target. This larger group is more easily segmented by
top-down signals that produce distinct representations of the target and
the group of flankers. These distinct representations avoid crowding effects
because the segmentation process effectively isolates the target. Consistent with the new empirical data, the model demonstrates uncrowding for
flankers made of squares, circles, hexagons, octagons, stars, and irregular
shapes, as long as they form perceptual groups. The model fails to match
empirical findings for some situations where grouping effects seem to be
very sensitive to contextual details; a failure that is not too surprising since
the model grouping mechanisms do not perfectly match human behavior.
Overall, the model is able to account for many new cases of uncrowding;
and the cases where the model is unsatisfactory suggest ways to improve
the model to better understand crowding effects, perceptual grouping, and
visual segmentation.
Acknowledgement: The research leading to these results has received funding
from the European Union Seventh Framework Programme (FP7/2007-2013)
under grant agreement 604102 (HBP).
26.4095 On the heterogeneity of visual crowding William Harrison1,2([email protected]), Peter Bex3; 1Department of Psychology,
University of Cambridge, 2Queensland Brain Institute, The University of
Queensland, 3Department of Psychology, Northeastern University
Our ability to identify a visual object in clutter is far worse than predicted
by the eyes’ optics and nerve fiber density. Although the ubiquity of such
visual impairment, referred to as crowding, is generally well accepted,
the appearance of crowded stimuli is debated due in part to the fact that
the patterns of perceptual errors made under crowded conditions depend
Acknowledgement: NIH grant R01EY021553 (P.J.B.) NHMRC grant
APP1091257 (W.J.H.)
26.4096 Un-crowding affects cortical activation in V1 differently from LOC Maya Jastrzebowska1,2([email protected]), Vitaly
Chicherov1, Bogdan Draganski2,3, Michael Herzog1; 1Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne
(EPFL), 1015 Lausanne, Switzerland, 2LREN – Department for Clinical
Neurosciences, CHUV, University of Lausanne, Lausanne, Switzerland,
3Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig,
In crowding, neighboring elements impede the perception of a target. Surprisingly, increasing the number of neighboring elements can decrease
crowding, i.e., lead to uncrowding (Manassi, 2015). Here, we used fMRI to
investigate the cortical locus of (un)crowding. The experiment consisted of
seven conditions: (1) target only (eight circular target gratings surrounding a central fixation dot, tilted either clockwise (CW) or counterclockwise
(CCW)), (2) 2-flanker (each of the 8 targets was flanked by an inside and
outside vertically-oriented grating), (3) annulus-flanker (inside and outside
flankers connected into annuli), (4) 4-flanker (one inside and three outside
gratings), and (5-7) control conditions corresponding to conditions (2-4)
with targets removed. Participants were asked either to indicate whether
target gratings were tilted CW or CCW by pushing one of two buttons or to
push a button randomly if targets were not present. Target discrimination
was highest in the target only condition, followed by the annulus-flanker,
4-flanker and 2-flanker conditions, respectively. As crowding is known to
attenuate the BOLD response, we predicted that the percent signal change
(PSC) closely reflects the behavioral results (successive decrease in target
identification from annulus-flanker to 4-flanker to 2-flanker) in brain areas
underlying the crowding effect. The PSC was calculated for each subject,
each region of interest (target-activated areas in V1–V4 and LOC) and each
of the conditions of interest. In fMRI, crowding and uncrowding effects
were present throughout areas V1–V4 and LOC, as indicated by comparisons of PSCs in the target only versus 2-flanker and 4-flanker conditions.
However, the more fine-grained differences between the 2-flanker condition and the 4-flanker and annulus-flanker conditions were only present
in V1, V4 and LOC. The expected successive decrease in PSC from annulus-flanker to 4-flanker to 2-flanker was only observed in the LOC, reflecting uncrowding.
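The percent signal change comparison in this abstract can be sketched as follows. The abstract does not give its PSC formula, so the common definition relative to a baseline is assumed here; the function names and any values passed in are hypothetical.

```python
def percent_signal_change(condition_mean, baseline_mean):
    """PSC relative to baseline (assumed definition, not taken from the abstract)."""
    return 100.0 * (condition_mean - baseline_mean) / baseline_mean

def decreases_successively(values):
    """True if each value is strictly smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))
```

Passing a region's PSC values in the order annulus-flanker, 4-flanker, 2-flanker to `decreases_successively` tests for the ordering the authors predicted from the behavioral results.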
26.4097 Relationships between retinal ganglion cells, Ricco’s
area and crowding zone Rong Liu1([email protected]), MiYoung Kwon1;
1Department of Ophthalmology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL
Previous studies reported enlargements in the area of complete spatial summation (Ricco’s area) for luminance detection in response to loss of Retinal
Ganglion Cells (Redmond et al., 2010). While the finding suggests the integrity of RGCs may alter spatial pooling properties, little is known about its
impact on the pooling mechanisms of complex object recognition. Here we
examine the relationships between RGC layer thickness, Ricco’s area and
Acknowledgement: This work was supported by Research to Prevent Blindness,
and EyeSight Foundation of Alabama.
26.4098 The effect of overall stimulus configuration on crowding Matthew Pachai1([email protected]), Maya Roinishvili2, Michael
Herzog1; 1Laboratory of Psychophysics, Brain Mind Institute, Ecole Polytechnique Fédérale de Lausanne (EPFL), 2Institute of Cognitive Neurosciences, Agricultural University of Georgia
Since its inception, the standard framework to study vision has been,
implicitly or explicitly, both hierarchical and feedforward. This framework
has facilitated the deconstruction of complex mechanisms into smaller,
more tractable problems, and has been extremely successful in characterizing the processing of basic visual elements. However, this framework
can break down when elements are presented in context, as they are in
everyday life. In crowding, peripheral object discrimination is hindered
by the presence of nearby elements; for example, a Vernier embedded
in a square is more difficult to discriminate than one presented alone.
However, adding additional flanking squares can, counter-intuitively,
ameliorate this deleterious effect (Manassi et al, 2013, J Vis). This example demonstrates how, in order to understand low-level vision, we must
also understand higher-level processing. Here, we take a step toward this
integrated approach by characterizing the effect of flanker configuration
on crowding in a theory-agnostic manner. Previous studies have examined
a small number of experimenter-selected configurations, whereas here we
made no assumptions about which configurations should affect crowding.
We placed a Vernier embedded in a square at the centre of all possible 3x5
arrays with an equal number of squares and stars (3432 total). Six observers
discriminated this Vernier in the presence of each configuration, repeating
each until responding incorrectly or achieving six correct responses in its
presence. In this way, we were able to quantify the effect of all possible configurations on performance. Among the many interesting patterns in our
large data set, we observed a strong positive correlation between the number of clustered square elements and discrimination performance, as well
as evidence for an effect of symmetry. More generally, our data suggest
that configurations encouraging separate grouping of target and flanker
elements ameliorate crowding, using a paradigm in which grouping was
not explicitly manipulated.
Acknowledgement: Swiss National Science Foundation (SNF) Project Number
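The reported total of 3432 configurations is consistent with one reading of the design, assumed here: the centre cell of the 3x5 array holds the target, and the remaining 14 cells are filled with 7 squares and 7 stars, giving C(14, 7) arrangements. The sketch below counts and enumerates them; the constant and function names are hypothetical.

```python
from itertools import combinations
from math import comb

FLANKER_CELLS = 3 * 5 - 1      # 15 cells minus the centre cell with the target
SQUARES = FLANKER_CELLS // 2   # assumed: 7 squares, 7 stars

def count_configurations():
    """Number of ways to choose which flanker cells hold squares."""
    return comb(FLANKER_CELLS, SQUARES)

def enumerate_configurations():
    """Yield every flanker layout as a tuple of 'square' / 'star' labels."""
    for squares in combinations(range(FLANKER_CELLS), SQUARES):
        chosen = set(squares)
        yield tuple("square" if i in chosen else "star"
                    for i in range(FLANKER_CELLS))
```

Enumerating the layouts this way makes it possible to test every configuration rather than a small experimenter-selected subset.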
26.4099 Are there benefits of Visual Crowding? Srimant Tripathy1([email protected]), Harold Bedell2; 1School of Optometry &
Vision Science, Faculty of Life Sciences, University of Bradford, 2College of
Optometry, University of Houston
Objects in peripheral vision are less identifiable when flanked by nearby
distractors. This loss in performance, known as visual crowding, has traditionally been viewed to result from a bottleneck in visual processing (e.g.
Levi, Vision Research, 2008). Here we will present the alternate view that
crowding could serve to suppress the representation of repetitive items in
peripheral vision so that attentional resources can be directed to non-repetitive, interesting and novel items, potentially as the targets of upcoming
saccades. Traditionally, crowding experiments have maximized the degradation of target-object identification by presenting targets and distractors
with similar characteristics (Kooi et al., Spatial Vision, 1994) or that group
together strongly (Manassi et al., Journal of Vision, 2012). These conditions yield robust target-flanker interactions extending over long distances
(Bouma, Nature 1970; Tripathy et al., Journal of Vision, 2014). However,
rather than representing a limitation of peripheral visual processing, the
strong suppression of repetitive or similar items might be desirable, serving
to enhance the individuation and processing of eccentrically viewed novel
objects. In agreement with this interpretation, crowding is known to be
minimal in foveal vision, is strongest between repetitive or similar objects
in peripheral vision and between items that readily group together, and
is reduced by target-flanker differences in contrast polarity, color, depth,
shape, or complexity. Crowding also is weakened when transients increase
the conspicuity of the target relative to the flankers and when the target
represents the end-point of a planned saccade. This presentation will evaluate previous studies of crowding to ascertain whether the results can be
understood better in terms of an implicit deficit in the processing of nearby
visual targets or in terms of a beneficial suppression of repetitive peripheral
stimuli so that processing resources can be devoted more fully to unfixated
objects that are novel and potentially interesting.
26.4100 Cross-optotype metrics for foveal lateral masking Sarah
Waugh1([email protected]), Monika Formankiewicz1, Denis
Pelli2; 1Anglia Vision Research, Department of Vision and Hearing
Sciences, Anglia Ruskin University, 2Psychology Department, New York
A reliable unified metric for the effects of lateral masking on foveal visual
acuity remains important and elusive due to the variety of optotypes and
spacings used, especially for clinical testing of children. We sought to
find a measure of lateral masking that is conserved across clinical optotypes. We asked: 1) Does a target surrounded by pictures or symbols produce similar effects to letters? 2) How do these relate to the effects of bars
or a box? 3) Is any masking metric conserved across the different stroke:size ratios of HOTV (5:1), Lea Symbols (7:1), and Kay Pictures (10:1)? For
three adults, the method of constant stimuli yielded psychometric functions of performance versus target size for three flanker conditions (box,
bars, similar optotypes). Visual acuities and psychometric function slopes
were estimated for eight target-flanker separations (0-10 stroke-widths)
and for an isolated optotype. A clinical staircase was used separately on 16
adults to estimate acuity for an isolated optotype and the optimal flanker
position for each of 3 metrics (stroke-width edge-to-edge; arcmin edge-to-edge; optotype-width centre-to-centre). A repeated measures ANOVA
performed on laboratory data revealed that the visual acuity versus flanker
separation (in stroke-widths) function was conserved across HOTV, Lea
symbols, and Kay pictures. Psychometric function slopes (performance
versus target size) were significantly steeper than for isolated targets when
flankers were 2 stroke widths away. Lateral masking was estimated from
clinical staircases. It was strongest when flankers were similar optotypes.
It was conserved across optotypes when using units of either stroke-width
or arcmin. When data from both groups were combined, lateral masking
was best conserved when expressed as edge-to-edge spacing in units of
stroke-width. Lateral masking effects on visual acuity measures are
most consistent when surrounding flankers are similar optotypes and units
of stroke-width are used to specify separation.
Acknowledgement: This work was supported by an Evelyn Trust Grant (to SJW)
and HEFCE QR (Quality Related) Funds (to Anglia Vision Research)
26.4101 Topological dominance in peripheral vision Ruijie Wu1,2(r-
[email protected]), Bo Wang1,2, Yan Zhuo1,2, Lin Chen1,2; 1State Key
Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, 2The Innovation Center of Excellence on Brain
Science, Chinese Academy of Sciences
Previous studies have shown that the speed of visual information processing increases with eccentricity. Research has also demonstrated that the
visual system is sensitive to topological changes, such as the appearance
and disappearance of holes in a figure. Our results suggest that, compared
to foveal vision, peripheral vision is more engaged in the rapid detection of topological changes. We employed a change detection task with eye
movement monitoring. One of the moving figures underwent an abrupt
crowding zone. Ten normally-sighted subjects participated in the study.
For Ricco’s area, a subject’s contrast detection threshold was measured
using a luminance-disc with varying diameter. Two-limbed functions were
fitted to the data of log detection threshold versus log stimulus area. Ricco’s
area was defined as the breakpoint of the two-limbed function. For crowding zone, a subject’s contrast recognition threshold was measured using a
flanked letter with varying center-to-center spacing between the target and
flankers. Clipped lines were fitted to the data of log recognition threshold versus spacing. Crowding zone was defined as the minimum spacing
that yields no threshold elevation in the fit. Measurements were made at 4
locations (at 8.5° eccentricity). The RGC plus inner plexiform (RGC+)
layer thickness in the central 20° visual-field was measured by Spectral-Domain Optical Coherence Tomography. We found that the RGC+ layer
thickness correlated with both Ricco’s area (r=−0.58, p<0.01) and crowding zone (r=−0.31, p=0.05). A significant correlation between Ricco’s area
and crowding zone was also found (r=0.58, p<0.01). Regression analysis
showed that a decrease of 1 μm in the RGC+ layer thickness enlarges crowding
zone by 0.02° while an increase of 1° in Ricco’s area (in diameter) enlarges
crowding zone by 4°. Our results demonstrated close relationships between
RGCs, Ricco’s area and crowding zone even in healthy eyes. Our findings
further support the view that changes in RGCs may alter the properties of
spatial integration zone.
shape change either with a change of hole number (“HC”) or without
(“nHC”). In 7 experiments, many local features (e.g. luminance, similarity,
spatial frequency, perimeter and shape of the contour) were well controlled.
The detection performance was quantified as d’. The results showed that
the d’ under the “HC” condition was significantly larger in the periphery
than in the fovea, whereas the “nHC” condition manifested significantly
larger d’ in the fovea than in the periphery. The luminance contrast was manipulated to control for task difficulty. Six experiments consistently showed the advantage of “HC” detection in the periphery, and the
same finding was also obtained with a random motion paradigm. When
the stimulus size was scaled in the periphery according to the cortical magnification theory, the topological advantage was not diminished. Moreover, when measured at more eccentricities, even at an eccentricity of 30°,
the performance of “HC” retained its sensitivity while “nHC” deteriorated
as eccentricity increased. Further, in our fMRI experiment, the response
to the “HC” condition in the periphery was contrasted with the activation in
response to the “nHC” condition in the periphery. The result revealed a major
activation in the anterior temporal lobe. The observed sensitivity advantage
in the periphery may support the view that a topological definition of objects
provides a coherent account for the object perception in the peripheral
Acknowledgement: The work was supported in part by Chinese MOST grants
(2015CB351701, 2012CB825500), NSFC grant (91132302), and CAS grants
(XDB2010001, XDB2050001).
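The sensitivity index d’ used above to quantify detection performance can be illustrated with a minimal sketch. It assumes the standard signal-detection definition, d’ = z(hit rate) − z(false-alarm rate), together with a log-linear correction for extreme rates; the abstract does not state which correction, if any, the authors applied, and the counts below are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count) keeps the rates away
    from 0 and 1, where the z-transform would be infinite. This correction
    is an assumption of the sketch, not a detail taken from the abstract.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one observer; these are not data from the study.
periphery_hc = d_prime(hits=45, misses=5, false_alarms=10, correct_rejections=40)
fovea_hc = d_prime(hits=35, misses=15, false_alarms=10, correct_rejections=40)
```

With equal false-alarm counts, the condition with more hits yields the larger d’, which is the pattern the abstract reports for “HC” changes in the periphery.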
26.4102 Crowding and binding: Not all feature-dimensions behave
equally Amit Yashar1([email protected]), Xiuyun Wu1, Jiageng
Chen1, Marisa Carrasco1,2; 1Department of Psychology, New York University, New York, NY, USA, 2Center for Neural Science, New York University, New York, NY, USA
Background. Crowding refers to the failure to identify a peripheral item
in clutter. The nature of crowding and the stage at which it occurs are still
debated. Crowding has been proposed as the consequence either of averaging
nearby features (mixture model) or of switching between target and distractor objects (swapping model). We use a novel quantitative approach to
disambiguate these two hypotheses and assess the stage of processing at
which crowding occurs by characterizing errors and the interdependency
of different feature-dimensions. Methods. Observers (n=14) estimated
the orientation and spatial frequency (SF) of a Gabor (Exp.1) or the orientation and color of a “T” (Exp.2) via two separate reports. The target was
presented at 7° eccentricity. In the crowding conditions, two distractors
flanked the target, each with unique features. We characterized crowding
errors with respect to each distractor along the two feature-dimensions. We
compared two probabilistic models –mixture and swap– to characterize
the error distributions for each feature-dimension independently and with
respect to the other dimension. Results. Under crowded conditions, the
swapping model performed significantly better than the mixture model for
orientation and color estimation errors, indicating switching between target
and distractor. However, the mixture model better characterized SF errors,
indicating averaging across target and distractors. Regarding interdependency, whereas color and orientation swapped independently from each
other, SF and orientation errors correlated; the probability of swapping orientation with a given distractor was independent of the direction of the color
error, but higher when SF error was toward that distractor. Conclusion.
Crowding leads to the swapping of color and orientation but averaging
of orientation and SF. Whereas orientation and color crowding are independent, orientation and SF are interdependent. Our results suggest that
crowding operates after orientation is bound with SF but before it is bound
with color.
26.4103 The alleviation of crowding effect through perceptual
learning Ziyun Zhu1([email protected]), Fang Fang1,2,3,4; 1School
of Psychological and Cognitive Sciences and Beijing Key Laboratory of
Behavior and Mental Health, 2Key Laboratory of Machine Perception
(Ministry of Education), 3Peking-Tsinghua Center for Life Sciences, 4PKU-IDG/McGovern Institute for Brain Research, Peking University, Beijing
100871, P.R. China
Our recent study showed that the crowding effect can be completely eliminated by perceptual learning (Zhu, Fan, and Fang, JOV 2016). Here, we
present data to further characterize this process. Subjects were trained on
a crowded orientation discrimination task with a target centered at 10°
eccentricity together with two abutting flankers positioned radially. The
target and flankers were circular patches of a sine-wave grating. Before and
after training, we measured orientation discrimination thresholds with the
crowded and isolated targets. In Experiment 1, the diameter of the target
and flankers could be 1.5°, 2°, 2.5° or 3°. We found that the extent of alleviation of the crowding effect by training depended on the center-to-center
distance between the target and flankers. The greater the distance, the less
crowding effect after training. When the distance was larger than 3°, the
crowding effect could be completely eliminated. In Experiment 2, we first
replicated our previous finding that there was little transfer of the learning
effect between the left and right visual fields. A new finding is that the
learning effect to eliminate crowding could completely transfer from the
upper to the lower visual field, but not vice versa. In Experiment 3, we
examined whether the learned ability to eliminate the orientation crowding could generalize to eliminate letter crowding. Before and after training,
we also measured the contrast thresholds for identifying crowded and isolated target letters, which had the same size as and were placed at the same
location as the target grating. We found that the learning effect could completely transfer and eliminate the letter crowding effect. Taken together,
these results suggest that, with a relatively large target, the crowding effect is
dominated by some high-level cognitive components, though constrained
by visual hemifield properties. The cognitive components can be modified
by perceptual training.
26.4104 Invariant tuning of lateral interactions between visual
stimuli Sunwoo Kwon1([email protected]), Savel’ev Sergey2,
Thomas Albright3, Sergei Gepshtein3; 1Brain and Cognitive Sciences, University of Rochester, 2Department of Physics, Loughborough University,
3Vision Center Laboratory, Salk Institute for Biological Studies
Perception of visual stimuli is modulated by their context. The effect of context can be facilitatory or suppressive in a manner that is highly sensitive
to stimulus conditions. To gain a better insight into mechanisms of contextual modulation, we asked what patterns of modulation are invariant to
the nuances of stimulation. We uncover the invariants by measuring maps
of modulation across the full range of modulating parameters and then
compare the empirical maps with predictions of models of neural interactions. We studied contextual modulation using two luminance gratings
(“flankers”) and a “probe” positioned between the flankers. The probe was
either distributed (luminance grating) or localized (line). We measured the
probe visibility for a wide range of flanker contrasts (C) and spatial frequencies (SF). First, we used distributed probes and obtained a bivariate map of probe contrast threshold in the coordinates of C and SF. The
facilitatory effect of context formed well-defined “islands” in the map, i.e.,
facilitation was tuned to both C and SF. The nonmonotonic effect of flanker
contrast could only arise in a nonlinear system, for example in the process
of stimulus encoding or when the distributed neural activity is collapsed to
the binary decision variable. Second, we attempted to bypass the decision
nonlinearity using a localized probe (line). Line contrast threshold varied
as a function of location between the flankers and it depended on flanker
contrast, similar to the results with distributed probes. The results suggest the encoding origin of the nonlinearity. We use results of empirical
mapping to constrain a model of contextual modulation in terms of the
canonical inhibition-stabilized neural network (ISN). We show that a chain
of canonical ISN nodes is tuned to SF (arXiv:1410.4237) producing lateral
interactions that are also tuned to flanker SF and C, similar to our results.
26.4105 Statistics of boundary, luminance, and pattern informa-
tion predict occluding target detection in natural backgrounds R
Calen Walshe1([email protected]), Stephen Sebastian1, Wilson
Geisler1; 1The University of Texas at Austin
Detecting spatial patterns is a fundamental task solved by the human visual
system. Two important constraints on detection performance are the variability that is found in natural scenes and the degradation of the image
that occurs due to optical blurring and non-homogeneous sampling of the
retinal ganglion cell (RGC) mosaic across the visual field. Furthermore,
most previous studies of detection performance have been conducted in
the fovea with additive targets. However, image cues are different with
occluding targets so these studies may not generalize well to occluding
targets presented in the periphery. Here, we report eccentricity thresholds
(eccentricity for 70% correct detection) for four different occluding targets
presented in natural backgrounds at varying, but known, distances from
the fovea. The luminance and contrast of the targets were fixed, and precise
experimental control of the statistics (luminance, contrast and similarity) of
the natural backgrounds was obtained using a novel method known as constrained scene sampling (Sebastian, Abrams & Geisler, submitted). Next,
we describe a first-principles model, limited by known physiology of the
human visual system and by the statistics of natural scenes, to compare with
the pattern of observed thresholds. First, target-present and target-absent
images are filtered by a modulation transfer function that approximates the
optics of the human eye. Second, RGC responses are simulated by blurring
and downsampling the optically-filtered image in a fashion consistent with the
midget RGCs at each retinal eccentricity. The model then combines luminance, pattern, and boundary information in the target region to predict
detectability across the visual field. We show that a weighted combination
of these three cues predicts the pattern of thresholds observed in our experiment. These results provide a characterization of the information that the
human visual system is likely to be using when detecting occluding objects
in the periphery.
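The eccentricity threshold defined above (the eccentricity at which detection falls to 70% correct) can be read off a measured accuracy-versus-eccentricity curve. The following is a minimal sketch that assumes simple linear interpolation between tested eccentricities; the abstract does not say how thresholds were estimated, and the function name and data are hypothetical.

```python
def eccentricity_threshold(eccentricities, proportion_correct, criterion=0.70):
    """Linearly interpolate the eccentricity at which accuracy falls to criterion.

    Assumes eccentricities are sorted in increasing order and that accuracy
    decreases with eccentricity. Returns None if the criterion is never crossed.
    """
    points = list(zip(eccentricities, proportion_correct))
    for (e0, p0), (e1, p1) in zip(points, points[1:]):
        # Find the first pair of neighbouring points that brackets the criterion.
        if p0 >= criterion >= p1 and p0 != p1:
            return e0 + (p0 - criterion) * (e1 - e0) / (p0 - p1)
    return None


# Hypothetical accuracies at five eccentricities (degrees); not data from the study.
ecc = [2.0, 4.0, 8.0, 12.0, 16.0]
acc = [0.98, 0.92, 0.80, 0.60, 0.52]
threshold = eccentricity_threshold(ecc, acc)
```

In this made-up example the criterion is crossed between 8° and 12°, so the interpolated threshold lies halfway between them. A fitted psychometric function would be a more robust alternative when data are noisy.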
26.4106 Detecting, Localizing and Correcting Exposure-Saturated
Regions Using a Natural Image Statistics Model Zeina Sinno1,2,3([email protected]), Christos Bampis1,2,3, Alan Bovik1,2,3; 1Laboratory for Image
and Video Engineering (LIVE), 2Department of Electrical and Computer
Engineering, 3The University of Texas at Austin
While the human visual system is able to adapt to a wide range of ambient illumination levels, cameras often deliver over- and/or under-exposed
pictures of consequently low quality. This is particularly true of low-cost
CMOS-based mobile camera devices that pervade the market. Towards
finding a way to remediate this problem, we study the characteristics of
poorly-exposed image regions under a natural scene statistics model with
a goal of creating a framework for detecting, localizing and correcting
over- and/or under-exposed pictures. Poorly-exposed picture regions are
detected and located by analyzing the distributions of bandpass, divisively
normalized pictures under a natural scene statistics model. Poor exposure
levels lead to characteristic changes of the empirical probability density
functions (histograms) of the processed pictures. This can be used to trace
potential images saturated by over- or under-exposure. Once detected, it
is possible to ameliorate these distortions. If a stack (sequence) of maps
of the same scene is available taken at different exposure levels, then it is
possible to correct poorly exposed regions by fusing the multiple images.
Experiments on multi-exposure datasets demonstrate the effectiveness of
such an approach which suggests its potential for real-time camera tuning
and post-editing of multiply exposed images.
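The idea of analyzing bandpass, divisively normalized pictures can be sketched as follows. This example assumes a mean-subtracted, contrast-normalized (MSCN) transform over 3×3 windows, a form of divisive normalization commonly used in natural scene statistics work; the abstract does not specify the filters or window sizes, and the function names, the constant `c`, and the threshold `eps` are all hypothetical. In a saturated region the local contrast collapses, so the normalized coefficients pile up at zero.

```python
import math


def mscn(image, c=1.0):
    """Mean-subtracted, contrast-normalized coefficients over 3x3 windows.

    Each pixel has its local mean subtracted and is divided by the local
    standard deviation plus a stabilizing constant c.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for q in range(cols):
            # Collect the 3x3 neighbourhood, clipped at the image border.
            patch = [image[i][j]
                     for i in range(max(0, r - 1), min(rows, r + 2))
                     for j in range(max(0, q - 1), min(cols, q + 2))]
            mean = sum(patch) / len(patch)
            var = sum((p - mean) ** 2 for p in patch) / len(patch)
            out[r][q] = (image[r][q] - mean) / (math.sqrt(var) + c)
    return out


def near_zero_fraction(coeffs, eps=0.05):
    """Fraction of coefficients with magnitude below eps."""
    flat = [v for row in coeffs for v in row]
    return sum(abs(v) < eps for v in flat) / len(flat)


# Hypothetical 6x6 test patches with 8-bit grey levels.
checker = [[200 if (r + q) % 2 else 50 for q in range(6)] for r in range(6)]
saturated = [[255] * 6 for _ in range(6)]  # clipped, over-exposed patch

frac_checker = near_zero_fraction(mscn(checker))
frac_saturated = near_zero_fraction(mscn(saturated))
```

A high fraction of near-zero coefficients in a region is one simple histogram statistic that could flag it as poorly exposed; a full detector would presumably compare the whole empirical distribution against a natural-scene model.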
Sunday Morning Talks
Attention: Selection and modulation
Sunday, May 21, 8:15 - 9:45 am
Talk Session, Talk Room 1
Moderator: Anne Sereno
31.11, 8:15 am Investigating the neural correlates of automatic
attention shifts in electroencephalography Merle Ahrens1,2(m.
[email protected]), Domenica Veniero1, Monika Harvey2,
Gregor Thut1; 1Institute of Neuroscience and Psychology, University of
Glasgow, UK, 2School of Psychology, University of Glasgow, UK
Previous research has highlighted posterior oscillations in the alpha-band
to play a key role in goal-directed (top-down) visuospatial attention (Foxe & Snyder 2011). However, the oscillatory signatures of automatically
driven (bottom-up) alerting and orienting of attention remain uncertain.
Likewise, it is unclear to what extent these automatic processes are influenced by top-down components, such as mid-frontal oscillatory activity in
the theta-band. These oscillations are associated with cognitive control processes activated when goal directed bias over habitual responses is needed
(Cavanagh & Frank 2014). Here, we employed electroencephalography to
investigate the neural correlates of automatic attentional engagement in
healthy participants. We utilized an exogenously cued dot detection task.
Following a non-predictable spatial cue or no-cue, targets were presented
at cued or non-cued positions at four different cue-target delays (ranging
from 105.8-705.8ms), known to induce initial attentional benefits and later
inhibition-of-return (IOR). This experimental manipulation allowed us to
investigate both automatic alerting (cue vs. no-cue independent of space)
and automatic (re)orienting (cued vs. uncued position) at early and later
stages of spatial attention processes. Between-subject correlations of reaction times (RTs) and alpha-power revealed that individuals who showed
an early alerting effect (faster RTs in cue vs. no-cue) exhibited stronger
alpha-band desynchronization over occipital regions before target onset
(independent of space and hemisphere). Notably, the same analysis also
revealed a negative influence of mid-frontal theta activity (P300) over alerting, where individuals with higher central theta-power displayed slower
RT. Interestingly, central theta-increases also negatively affected later spatial components of automatic attention (i.e. IOR), where IOR was abolished
in individuals with higher theta power. These results suggest an interplay between top-down processes and automatic attention mechanisms,
in accordance with cognitive control overriding reflexive processes. They
highlight the need to control for the engagement of higher-order computations in order to better understand the neural correlates of automatic processes in isolation.
Acknowledgement: The work was supported by a PhD studentship from the
College of Science and Engineering at the University of Glasgow (received by
31.12, 8:30 am Alpha and gamma neurofeedback reinforce control
of spatial attention Yasaman Bagherzadeh1([email protected]), Daniel
Baldauf, Benjamin Lu, Dimitrios Pantazis, Robert Desimone; McGovern Institute for Brain Research, Massachusetts Institute of Technology,
Previous studies have shown that alpha synchrony is linked with suppression of information processing, whereas gamma frequency is associated
with attention to targets. To determine whether alpha and gamma synchrony play a causal role in the control of spatial attention, we designed
a MEG neurofeedback task to train subjects to increase an asymmetry of
oscillatory power between the left and right parietal cortex (alpha in Exp1
or gamma in Exp2). During neurofeedback trials a Gabor pattern was presented in the center of the screen, with its contrast modulated according to
a real time measure of the hemispheric asymmetry in the alpha (or gamma)
range. We tested the effects of these oscillatory changes on both behavioral performance in a free viewing task and on visual evoked potentials
recorded from visual cortex. Twenty healthy subjects participated in the
study. They were divided into two groups, a left and a right training group,
depending on the feedback direction of the hemispheric asymmetry. We
found that subjects were able to control the asymmetry between the left
and right hemispheres in the frequency range of interest in both training
directions. Increasing alpha in one hemisphere led to reduced visually
evoked responses (Exp1) while increasing gamma in one hemisphere led
to enhanced evoked responses (Exp2). Hemispheric asymmetry in the
alpha band resulted in attentional bias in the free viewing task by reducing
the number of fixations in the contralateral hemifield. The results support
the idea that alpha and gamma synchrony play reciprocal roles in the control of spatially directed attention.
31.13, 8:45 am Accounting for attention in perceptual decisions
and confidence Rachel Denison1,2([email protected]), William
Adler2, Marisa Carrasco1,2, Wei Ji Ma1,2; 1Department of Psychology, New
York University, 2Center for Neural Science, New York University
Purpose: To make optimal perceptual decisions, observers must take into
account the uncertainty inherent in their sensory representations. Humans
take into account sensory uncertainty caused by stimulus factors such as
low contrast. However, it is not known whether humans take into account
sensory uncertainty caused by internal factors such as low attention. Here
we asked whether humans adjust their perceptual decisions and confidence
reports to account for attention-dependent uncertainty. Methods: Twelve
observers performed an orientation categorization task, in which the two
categories had the same mean orientation but different standard deviations, and reported both categorization (category 1 or 2) and confidence
(4-point scale) on each trial. In this task, unlike a traditional left vs. right
orientation discrimination, the optimal choice boundaries depend on orientation uncertainty. We manipulated endogenous (voluntary) covert spatial
attention trial-by-trial using a central precue pointing to one of four possible stimulus locations (valid and invalid precues) or to all locations (neutral
precue). Four stimuli appeared briefly on each trial, and a response cue indicated which stimulus should be reported. We used generative modeling of
the experimental data and model comparison to determine the influence
of attention on decision and confidence boundaries. Results: Attentional
cueing affected performance accuracy—highest for valid, intermediate for
neutral, lowest for invalid—verifying that the attentional manipulation of
orientation uncertainty was successful. Decision and confidence boundaries shifted under different levels of attention in a way indistinguishable
from optimal. The Fixed model, in which observers do not adjust for attention-dependent uncertainty, fit the data poorly. The Bayesian model and
two heuristic models, in which observers adjust boundaries according to
parametric decision rules, performed similarly, and substantially better
than the Fixed model. Conclusion: Perceptual decision-making responds
flexibly to uncertainty related to attention, an internal state. This flexibility
should improve perceptual decisions in everyday vision.
Acknowledgement: Funding was provided by National Institutes of Health
National Eye Institute grants F32 EY025533 to R.N.D., T32 EY007136 to NYU
supporting R.N.D., and a National Science Foundation Graduate Research Fellowship to W.T.A.
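Why the optimal choice boundary in this task depends on orientation uncertainty can be made concrete with a short worked sketch. It assumes equal priors, two Gaussian categories with a common mean (taken as 0) and standard deviations σ1 < σ2, and Gaussian measurement noise with standard deviation σ. The measurement x then has variance v1 = σ1² + σ² under category 1 and v2 = σ2² + σ² under category 2, and equating the two likelihoods gives the criterion k² = ln(v2/v1) · v1·v2 / (v2 − v1): report category 1 when |x| < k. The specific parameter values below are hypothetical, and this is not the authors' fitted model.

```python
import math


def optimal_boundary(sigma_cat1, sigma_cat2, sigma_noise):
    """Criterion k such that |x| < k favors the narrow category (category 1).

    Assumes equal priors, Gaussian categories sharing a mean of 0, and
    Gaussian measurement noise; requires sigma_cat1 < sigma_cat2.
    """
    v1 = sigma_cat1 ** 2 + sigma_noise ** 2  # variance of x given category 1
    v2 = sigma_cat2 ** 2 + sigma_noise ** 2  # variance of x given category 2
    k_squared = math.log(v2 / v1) * v1 * v2 / (v2 - v1)
    return math.sqrt(k_squared)


# Hypothetical values in degrees: category SDs of 3 and 12, with more
# measurement noise when attention is directed elsewhere (invalid precue).
k_valid = optimal_boundary(3.0, 12.0, sigma_noise=2.0)
k_invalid = optimal_boundary(3.0, 12.0, sigma_noise=6.0)
```

Under these assumptions the boundary moves outward as noise grows, so an observer who ignores attention-dependent uncertainty (as in the Fixed model) would apply the same k in both cases and decide suboptimally.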
31.14, 9:00 am Task performance in covert, but not overt, attention
correlates with early ERP laterality Rinat Hilo1([email protected]
ac.il), Marisa Carrasco2, Shlomit Yuval-Greenberg1,3; 1School of Psychological Sciences, Tel Aviv University, 2Department of Psychology and Center
for Neural Science, New York University, 3Sagol School of Neuroscience,
Tel Aviv University
Background. Visual performance decreases with target eccentricity. To
compensate for such decrements, we move our eyes to target locations
(overt attention) or attend to these locations without accompanying
eye-movements (covert attention). Both overt and covert attention enhance
perceptual performance, but it is undetermined whether they do so to the
same extent. Here we compared overt and covert attentional enhancements
using electrophysiological and behavioral measurements. Methods. ERP
and eye-tracking were measured in 16 participants. On each trial, a central
directional cue (100% valid) pointed to the left or right. In most (80%) trials
a task-irrelevant probe appeared bilaterally 300-500ms post cue. Only trials
with probes were analyzed and the ERP signal was examined relative to
probe-onset at time zero, to reveal ongoing attentional enhancement while
observers attended to the cued location. In half the trials, a target Gabor patch was presented 300-500ms after the probe in the cued location. To
indicate target detection, observers responded either by pressing a button
(covert condition) or by shifting the eyes (overt condition). Results. Mean
visual sensitivity was significantly higher for the covert than the overt condition, resulting from the same hit rate and a lower false-alarm rate. Laterality of the
ERP responses (difference between contralateral and ipsilateral channels
relative to the cue) was found for both overt and covert attention shifts,
around the P1 component (90-135ms) and the N2 component (185-300ms).
ERP laterality in the P1 time-range was positively correlated across participants with task performance on the covert, but not the overt, task. Conclusion. Covert attention can be more effective than overt attention. Overt
attention is a natural dual-task requiring both shifting of attention and performing a target-directed action. Covert attention requires only shifting of
attention without a goal-directed action, and therefore can be easier to perform and is more correlated with early attentional ERP components.
Acknowledgement: The Binational United States-Israel National Science Foundation, grant 2013356.
31.15, 9:15 am Effect of Apparent Depth in Peripheral Target
Detection in Driving under Focused and Divided Attention Jiali
Song1([email protected]), Patrick Bennett1, Allison Sekuler1, Hong-Jin
Sun1; 1Psychology, Behaviour & Neuroscience, McMaster University
31.16, 9:30 am Attention to shape enhances shape discrimination in AIT neural population coding but attention to space does
not modulate location discrimination in LIP of macaque monkeys. Anne Sereno1([email protected]), Sidney Lehky2; 1Dept.
of Neurobiology and Anatomy, Univ. of Texas Health Science Center in
Houston, 2Computational Neurobiology Laboratory, Salk Institute
We studied attentional effects for stimulus shape and location in anterior
inferotemporal cortex (AIT, ventral stream) and lateral intraparietal cortex
(LIP, dorsal stream). Monkeys performed two delayed-match-to-sample
tasks. Stimuli were identical in both tasks, but in one the monkey attended
to sample shape (shape attention task) and in the other to sample location
(location attention task). There was also a third passive task in which the
monkey maintained central fixation while the same stimuli were presented
in the same locations. We examined data from all shapes at the most effective location (shape representations), or all locations using the most effective shape (location representations). At the single cell level, there was a
broad range of attentional gain factors for stimulus shape and location in
both brain areas. At the population level, responses of all neurons to each
shape or each location formed a response vector. Mean distance between
response vectors for different shapes was greater (more distinctive) in AIT
than LIP, while mean distance between locations was greater (more distinctive) in LIP than AIT. We determined the effect of different attentional
conditions on mean distance between response vectors for all shapes or all
locations. In AIT, mean response distance between shapes was significantly
larger under the shape attention task compared to the location attention
task. In contrast, in LIP, mean response distances for locations were not significantly different between the two attention tasks. Even when changes in
mean responses were factored out, multidimensional scaling still showed
significant task differences in AIT but not LIP, indicating that attention was
globally distorting neural representation spaces only in AIT. Despite single-cell attentional modulations in both areas, we suggest that attentional
modulations of population representations may be weaker in the dorsal
stream because it must maintain more veridical representations for visuomotor control.
Color and Light: Color vision
Sunday, May 21, 8:15 - 9:45 am
Talk Session, Talk Room 2
Moderator: Michael Crognale
31.21, 8:15 am Metameric Mismatching in Natural and Artificial
Reflectances Arash Akbarinia1([email protected]), Karl
Gegenfurtner2; 1Centre de Visió per Computador, Universitat Autònoma
de Barcelona, 2Abteilung Allgemeine Psychologie, Justus-Liebig-Universität
The human visual system and most digital cameras sample the continuous
spectral power distribution through three classes of receptors. This implies
that two distinct spectral reflectances can result in identical tristimulus values under one illuminant and differ under another – the problem of metamer mismatching. It is still debated how frequently this issue arises in the
real world, using naturally occurring reflectance functions and common
illuminants. We gathered more than ten thousand spectral reflectance
samples from various sources, covering a wide range of environments
(e.g., flowers, plants, Munsell chips) and evaluated their responses under
a number of natural and artificial light sources. For each pair of reflectance functions, we estimated the perceived difference using the CIE-defined ΔE2000 distance metric in Lab color space. The degree of metamer
mismatching depended on the lower threshold value l when two samples
would be considered to lead to equal sensor excitations (ΔE < l), and on
the higher threshold value h when they would be considered different.
For example, for l=h=1, we found that 43,129 comparisons out of a total
of 6×10⁷ pairs would be considered metameric (1 in 10⁴). For l=1 and h=5,
this number reduced to 705 metameric pairs (2 in 10⁶). Extreme metamers,
for instance l=1 and h=10, were rare (22 pairs or 6 in 10⁸), as were instances
where the two members of a metameric pair would be assigned to different color categories. Not unexpectedly, we observed variations among different reflectance databases and illuminant spectra, with metamers more frequent
under artificial illuminants than under natural ones. Overall, our numbers are
not very different from those obtained earlier (Foster et al, JOSA A, 2006).
However, our results also show that the degree of metamerism is typically
not very strong and that category switches hardly ever occur.
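The pair-counting criterion described above (a match below the lower threshold l under one illuminant, a difference above the higher threshold h under another) can be sketched in a few lines. To stay short, this example swaps in the Euclidean CIELAB distance (CIE76) for the ΔE2000 metric used in the abstract, so its numbers are not comparable to those reported; the function names and Lab coordinates are hypothetical.

```python
import math
from itertools import combinations


def delta_e76(lab_a, lab_b):
    """Euclidean distance in CIELAB (CIE76), a stand-in for Delta E 2000."""
    return math.dist(lab_a, lab_b)


def count_metamer_mismatches(lab_under_1, lab_under_2, low, high):
    """Count sample pairs that match under illuminant 1 (Delta E < low)
    but differ under illuminant 2 (Delta E > high)."""
    count = 0
    for i, j in combinations(range(len(lab_under_1)), 2):
        matches = delta_e76(lab_under_1[i], lab_under_1[j]) < low
        differs = delta_e76(lab_under_2[i], lab_under_2[j]) > high
        if matches and differs:
            count += 1
    return count


# Hypothetical Lab coordinates of three surfaces under two illuminants.
under_daylight = [(50.0, 10.0, 10.0), (50.2, 10.1, 10.3), (70.0, -5.0, 20.0)]
under_lamp = [(52.0, 14.0, 8.0), (51.0, 8.0, 15.0), (71.0, -2.0, 18.0)]
n_pairs = count_metamer_mismatches(under_daylight, under_lamp, low=1.0, high=5.0)
```

Only the first two surfaces match under the first illuminant and separate under the second, so one pair is counted. Raising `high` makes the criterion stricter, which is why the abstract reports fewer pairs for h=5 and h=10 than for h=1.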
31.22, 8:30 am Quickly-forming, shape-dependent memory biases
in color perception Maria Olkkonen1,2([email protected]),
Toni Saarela1; 1Institute of Behavioural Sciences, University of Helsinki,
2Department of Psychology, Durham University
Background: Both long-term and short-term experience with object color
affects color perception. Memory colors (typical colors of familiar objects
such as fruit) draw perceived color towards them. Central-tendency bias
(CTB) occurs when the perceived color of a stimulus held in memory shifts
towards the average color of recent stimulus history. We studied how these
biases develop: First, what is the time-course of CTB? Second, do memory
colors start developing immediately upon exposure to shape-dependent hue
distributions? Methods: Observers compared the hue of two stimuli in a
2IFC task. A 2-second delay separated the reference (first) and test (second)
intervals. Two visually distinct 2D-shapes, “softy” and “spiky”, were used.
Both had five reference values ranging from blueish to greenish in CIELAB
color space; softies were on average greener and spikies bluer but had one
The ability to detect events in the visual periphery is crucial to driving
safely. The useful field of view (UFOV) task provides an index of the spatial
extent of peripheral vision under focused and divided attention. Previous
research reported reduced UFOV at greater perceived distances in driving (Andersen et al., 2011; Pierce & Andersen, 2014); however, these studies used long stimulus durations, making it difficult to compare directly
with the traditional UFOV task (Sekuler & Ball, 1988; Sekuler, Bennett &
Mamelak, 2000), which correlates with critical aspects of driving performance (Owsley et al., 1998; Ball et al., 1993). Furthermore, previous studies
on the depth effect in driving assessed performance only under divided
attention. The current study adapts the traditional 2D UFOV task to a computer-rendered 3D environment to examine whether apparent depth affects
the detection of brief peripheral targets, under focused and divided attention, and with target retinal image size matched across depth. In the central
task, participants tried to maintain a constant distance from a speed-varying lead car, indicated when the lead car’s image size matched that of a surrounding size-invariant box. In the peripheral task, participants detected
targets appearing at one of several possible locations on the left or right
side at two apparent distances, implied via simulated forward motion and
pictorial cues. The central and peripheral tasks were completed separately
under focused attention, and then simultaneously under divided attention. We tested 24 participants and found they responded more accurately
to near than far targets at larger eccentricities under focused and divided
attention. Another 24 participants, tested in a second experiment with different target appearance probabilities, showed similar results. Thus, our
data suggest that apparent depth influenced the detection of briefly flashed
peripheral targets. These results are generally consistent with previous
research, and have important implications for understanding the mechanisms modulating the UFOV.
color in common. Reference and test were always the same shape except
for the common reference color, for which reference and test were of different shape. On each trial, observers indicated whether the test appeared
bluer or greener than the reference. A 1-1 staircase procedure controlled
the test hue for each reference to track its perceived color. Results: In
within-shape judgments, perceived color of the extreme references was
biased towards the middle hues, consistent with CTB. This effect formed
quickly: it appeared during the first 20 trials and was very prominent after
50-100 trials. Across-shape judgments showed, unexpectedly, a repulsive
effect: The “softy” reference was matched by a bluer “spiky”, and “spiky”
reference by a greener “softy”. Conclusion: Memory biases of perceived
color develop rapidly and can be shape-dependent. The observed attractive
and repulsive biases can be explained by shape-specific adaptation to color
range, whereby hues are normalized with respect to the hue distribution
separately for the two shapes, followed by a central-tendency bias.
the target immediately prior to the color display improved overall performance from M=79.6% to M=86.1% (z=9.6, p < .0001) compared to
uncued trials. Verbal cues facilitated visual discrimination only when the
target and non-targets spanned a category boundary, and in discriminating
more from less typical colors. Exp. 2 showed that color names improved
discrimination performance even when categories were blocked, making
the cues redundant. Exp. 3 showed that facilitation from visual color cues
was significantly smaller than from verbal cues, suggesting that words are
especially effective in activating categorical color representations. Overall,
our results suggest that processing color names affects the ability to distinguish colors and that the extent to which we perceive colors categorically
may be flexible and depend on the current task.
Acknowledgement: Supported by the Academy of Finland grant 287506.
31.23, 8:45 am Color-ambiguity Matching Steven Shevell1,2,3([email protected]
uchicago.edu), Wei Wang1,2; 1Institute for Mind and Biology, University of
Chicago, 2Department of Psychology, University of Chicago, 3Department
of Ophthalmology & Visual Science, University of Chicago
Classical color matching reveals physically different lights that appear
identical. For example, a mixture of 550+670 nm lights appears identical to
580 nm light viewed alone. The explanation is that the physically different
lights result in identical neural responses so must be indistinguishable. Note
this explains why the lights match each other, though not their perceived
hue. This principle is extended here to neural representations of color that
are ambiguous and thus perceptually unstable: two lights with identical
but ambiguous neural representations match each other even though their
hue can vary. This is color-ambiguity matching. METHODS/RESULTS:
Ambiguous chromatic neural representations were generated using a form
of interocular-switch rivalry (aka stimulus rivalry; Logothetis, Leopold &
Sheinberg, Nature, 1996). Two binocularly rivalrous chromaticities were
swapped between the two eyes about 7 times/second. This resulted in a
sustained percept for over 1.5 sec (>10 eye swaps) of one color and then
the other color. Further, two such ambiguous representations, one above
a fixation point and one below it, usually were the same color (far above
chance, p < 0.001). Although their color changed regularly, both appeared
the same. Importantly, a control with an ambiguous representation above
fixation and a nonrivalrous stable representation below showed that the
stable color (below) did not directly facilitate the same color above. CONCLUSION: Two separate stimuli that generate identical but ambiguous
neural representations become grouped, even though the resulting color
is unstable. Like classical color matching, the neural representations establish a match without specifying the perceived hue, which fluctuates over
time. Note grouping by identical ambiguous representations occurs prior
to perceptual resolution of the neural ambiguity. This is unlike the typical
assumption that the color seen determines grouping, as in resolution of the
correspondence problem in ambiguous apparent motion in which grouping by color establishes motion direction.
31.25, 9:15 am Individual differences in hue scaling suggest
mechanisms narrowly tuned for color and broadly tuned for lightness Kara Emery1([email protected]), Vicki Volbrecht2, David
Peterzell3, Michael Webster1,4; 1Graduate Program in Integrative Neuroscience, University of Nevada, Reno, 2Department of Psychology, Colorado
State University, 3Clinical Psychology Doctoral Program, College of Psychology and Holistic Studies, John F. Kennedy University, 4Department of
Psychology, University of Nevada, Reno
Individual differences in color appearance judgments are large and reliable among color-normal observers, but for poorly understood reasons. In
our recent factor analyses of hue-scaling functions (Emery et al., 2017a,b,
Vision Research), we found that the differences depended on multiple processes each tuned to a narrow range of stimulus hues, consistent with a
multiple-channel or population code mediating color appearance, but not
classic opponency. In the present work, we extended this analysis outside
the equiluminant plane by sampling the colors of increments and decrements, to assess the tuning for both hue and lightness. Stimuli included 12
chromatic angles at 30-deg intervals along a circle of fixed contrast in the
cone-opponent plane. Each was shown at five lightness levels (0.5, 0.7, 1, 1.4,
and 2 times the 20 cd/m² luminance of the gray background). The stimuli
were displayed in random order in a uniform 2-deg field and were repeatedly pulsed until observers recorded the perceived proportion of red,
green, blue, or yellow in the hue. Individual settings for 14 observers were
factor-analyzed with PCA and Varimax rotation. The analysis revealed
approximately seven systematically-tuned factors (i.e. with moderate to
high loadings on 2 or more adjacent stimuli). Together these accounted for
>80% of the total variance. The factors approximated our previous analyses in exhibiting narrow and unipolar tuning for chromatic angle. Across
the lightness levels, however, the factors tended to show consistent loadings, suggesting, for example, that there were strong correlations between
how an individual scaled the hues of increments and decrements. No clear
univariant factor emerged that was specific to increments or decrements,
even though the average perceived hue varied with lightness. This pattern
confirms that many independently varying dimensions determine inter-observer differences in hue judgments, and suggests that these dimensions
co-vary (operate similarly) across a range of lightness levels.
Acknowledgement: Supported by NIH EY-026618
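The Varimax rotation step mentioned in the hue-scaling abstract above (31.25) can be illustrated with a small sketch. It implements raw Varimax with Kaiser's pairwise planar rotations on an already-extracted loading matrix; the PCA step is omitted, the loading values are hypothetical, and nothing here is claimed to match the software or settings the authors used.

```python
import math

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    p = len(L)
    total = 0.0
    for j in range(len(L[0])):
        sq = [row[j] ** 2 for row in L]
        mean = sum(sq) / p
        total += sum((x - mean) ** 2 for x in sq) / p
    return total

def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Raw Varimax: rotate each pair of factors by Kaiser's optimal angle."""
    L = [row[:] for row in loadings]
    p, k = len(L), len(L[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                u = [r[i] ** 2 - r[j] ** 2 for r in L]
                v = [2.0 * r[i] * r[j] for r in L]
                A, B = sum(u), sum(v)
                C = sum(a * a - b * b for a, b in zip(u, v))
                D = 2.0 * sum(a * b for a, b in zip(u, v))
                phi = 0.25 * math.atan2(D - 2.0 * A * B / p,
                                        C - (A * A - B * B) / p)
                largest = max(largest, abs(phi))
                c, s = math.cos(phi), math.sin(phi)
                for r in L:
                    r[i], r[j] = c * r[i] + s * r[j], -s * r[i] + c * r[j]
        if largest < tol:  # no pair needed further rotation
            break
    return L

# Hypothetical loadings: a simple structure hidden by a 30-degree rotation.
simple = [[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
          [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]]
c30, s30 = math.cos(math.radians(30)), math.sin(math.radians(30))
hidden = [[c30 * x + s30 * y, -s30 * x + c30 * y] for x, y in simple]
rotated = varimax(hidden)

print(varimax_criterion(hidden), varimax_criterion(rotated))
```

Because the rotation is orthogonal, each stimulus keeps its communality (the row sum of squared loadings); only the distribution of loadings across factors changes, which is what makes narrowly tuned factors easier to read off.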
31.26, 9:30 am Color vision for flight control in Drosophila Kit
31.24, 9:00 am Facilitation of color discrimination by verbal and
visual cues Lewis Forder1([email protected]), Gary Lupyan1; 1Department of Psychology, University of Wisconsin-Madison, Madison, WI, USA
People can distinguish millions of hues, but often refer to colors categorically using linguistic terms that denote large regions of color space. We
hypothesized that color names warp color representations making them
more categorical such that simply hearing a color name would induce more
categorical color perception. We tested this hypothesis by examining how
cuing color verbally (Exps. 1, 2) or visually (Exp. 3) affected people’s ability
to distinguish colors at various locations in color space. We predicted that
hearing a color name (e.g., “blue”) would transiently facilitate perceptually
discriminating between blues and non-blues. If the label activates a more
categorical color representation it should also, counterintuitively, help in
distinguishing more typical blues from less typical blues, but not highly
typical blues and slightly less typical blues. We conducted three experiments (N=85) to test these hypotheses. After locating each participant’s
focal colors and color boundaries, we tested color discrimination using a
4-alternative odd-one-out task maintaining a constant perceptual distance
between target and non-targets in CIELUV space. Hearing a color name of
Acknowledgement: EY-10834
Longden1([email protected]), Michael Reiser1; 1HHMI Janelia
Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
The circuitry of color vision in Drosophila is a classic system for understanding the development of neural circuitry and is among the best described for
any animal. Flies can learn to discriminate different wavelengths of light,
but it is not known what they use color vision for in spontaneous behavior.
We have explored how the processing of the wavelength of light contributes to three different kinds of flight behavior. To do this, we developed a
novel ultraviolet and green projector system to display wide-field visual
stimuli. We measured the flight control responses of tethered flies by optically recording changes in wing stroke amplitude. First, flies can stabilize
the horizon even when the intensities of the different wavelengths are
matched so that they provide no luminance contrast. This is achieved by
combining wavelength-sensitive phototaxis and color-blind motion vision
to stabilize the scene. Second, during looms the steering responses are color-blind, but the wing beat frequency varies with the intensity of ultraviolet light. Third, during flight towards an attractive object, a single vertical
stripe, the responses are color-blind. Taken together, our results show how
the wavelength of light can influence multiple aspects of flight attitude and
control, and allow the operation of color vision circuitry to be investigated
in the context of natural behaviors.
potentially explain the large variability in observed critical spacing across
different experimental conditions if it is the case that behavioral effects are
observable only when cortical interference reaches a threshold.
Acknowledgement: HHMI Janelia Research Campus
Acknowledgement: BBSRC EASTBIO
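For the SSVEP crowding study (abstract 32.12), the idea of fitting the same psychometric curve to behavioral accuracy and to target-elicited SSVEP amplitude can be sketched as below. This is an illustrative assumption throughout: the logistic form, the grid search, the definition of critical spacing as the curve midpoint, and all data values are hypothetical and are not taken from the abstract.

```python
import math

def logistic(x, midpoint, slope, lo, hi):
    """Logistic curve rising from `lo` to `hi` with the given midpoint."""
    return lo + (hi - lo) / (1.0 + math.exp(-slope * (x - midpoint)))

def fit_logistic(spacings, values, lo, hi, mid_grid, slope_grid):
    """Least-squares grid search; returns the best (midpoint, slope)."""
    best = None
    for m in mid_grid:
        for s in slope_grid:
            err = sum((logistic(x, m, s, lo, hi) - y) ** 2
                      for x, y in zip(spacings, values))
            if best is None or err < best[0]:
                best = (err, m, s)
    return best[1], best[2]

# Hypothetical target-flanker spacings (deg) and noiseless synthetic data.
spacings = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
accuracy = [logistic(x, 2.0, 3.0, 0.5, 0.95) for x in spacings]
ssvep = [logistic(x, 2.2, 3.0, 0.4, 1.0) for x in spacings]

mid_grid = [i / 10 for i in range(5, 41)]
slope_grid = [1.0, 2.0, 3.0, 4.0]

behavioral_fit = fit_logistic(spacings, accuracy, 0.5, 0.95, mid_grid, slope_grid)
cortical_fit = fit_logistic(spacings, ssvep, 0.4, 1.0, mid_grid, slope_grid)
print(behavioral_fit, cortical_fit)
```

Comparing the two fitted midpoints is one simple way to ask whether cortical and behavioral estimates of the spatial extent of interference agree.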
Spatial Vision: Crowding and statistics
Sunday, May 21, 10:45 am - 12:30 pm
Talk Session, Talk Room 1
Moderator: Steve Dakin
32.13, 11:15 am Cortical distance determines the perceptual
outcomes of crowding John Greenwood1([email protected]),
32.11, 10:45 am Cortical magnification factor of human V2 predicts
individual susceptibility to letter-crowding Steven Dakin1,2([email protected]
auckland.ac.nz), Samuel Schwarzkopf3,4, Geraint Rees3,5, Catherine
Morgan2, Elaine Anderson1,3,5; 1UCL Institute of Ophthalmology, University College London, 2School of Optometry & Vision Science, University of
Auckland, 3UCL Institute of Cognitive Neuroscience, University College
London, 4Experimental Psychology, University College London, 5Wellcome Trust Centre for Neuroimaging, University College London
Acknowledgement: Wellcome Trust, University of Auckland
32.12, 11:00 am Suppressive stimulus interactions in visual cortex
reflect the critical spacing in crowding Leili Soo1([email protected]
ac.uk), Ramakrishna Chakravarthi1, Plamen Antonov1, Søren Andersen1;
1School of Psychology, University of Aberdeen, UK
Crowding is a phenomenon in which peripheral object recognition is
impaired by the close proximity of irrelevant stimuli. Currently, the neural processes underlying object recognition and its failure in crowding are
not well understood. Research examining the neural implementation of
visual attention has found that stimulus processing in visual cortex is suppressed by the presence of nearby stimuli. Could the breakdown of object
recognition seen in crowding be explained by such flanker-induced suppression of target processing in the visual cortex? To answer this question,
we assessed cortical processing of a target object as a function of flanker
presence and distance to the target while participants performed a target
orientation discrimination task. Flankers and targets flickered at different
frequencies to elicit steady-state visual evoked potentials (SSVEPs), which
allow for the assessment of cortical processing of each of the concurrently
presented stimuli. Target identification accuracy and target-elicited SSVEP
amplitudes decreased with decreasing target-flanker separations. Additionally, we fitted psychometric curves to both behavioral data and target-elicited SSVEP amplitudes in order to determine the spatial extent of
interference (‘critical spacing’). The cortical and behavioral critical spacing
estimates closely mirrored each other. Unexpectedly, however, the presence of any flankers, even those far beyond either critical spacing, dramatically decreased SSVEP amplitudes elicited by the target, relative to the
unflanked condition. We conclude that suppressive stimulus interactions
between targets and flankers in the visual cortex may underlie the perceptual phenomenon of crowding. Further, the finding that flankers far outside the traditional critical spacing can suppress target processing might
In peripheral vision, object recognition is disrupted by clutter. This
crowding effect typically causes target and flanker objects to appear more
alike (assimilation). However, tilt contrast effects also increase in peripheral vision, causing target and flanker objects to appear more dissimilar
(repulsion). Although repulsion dominates in the parafovea with large
target-flanker separations, assimilation increases with higher eccentricities and/or smaller separations (Mareschal, Morgan & Solomon, 2010).
The common factor has been argued to be cortical distance: assimilation
occurs when flankers are close to the target within retinotopic maps, while
flankers at greater distances induce repulsion. Here we test this proposal
with two psychophysical manipulations that dissociate cortical and physical distance. Observers (n=8) judged the orientation of a target Gabor
(clockwise/counter-clockwise of vertical), flanked by two Gabors oriented
either clockwise or counter-clockwise of vertical. We first manipulate cortical distance via the arrangement of target-flanker elements: because cortical magnification is higher along the radial dimension (extending from
fixation), radially-positioned flankers will be cortically closer to the target
than tangential/iso-eccentric flankers. Accordingly, we observe far more
assimilation errors with radial flankers, while tangential flankers predominantly induce repulsion errors. We next manipulate cortical distance by
presenting stimuli in the upper and lower visual fields. Because the upperfield representation is compressed (Fortenbaugh, Silver & Robertson, 2015),
target-flanker separations will be effectively reduced relative to the lower
field. Accordingly, for stimuli with the same physical eccentricity and target-flanker separation, we observe far greater assimilation in the upper
than the lower visual field. Individual differences in visual-field size are
also correlated with these assimilation rates. Our results suggest that cortical distance is a key determinant of the perceptual outcomes of crowding.
By combining models of crowding and tilt contrast, we suggest that the
compulsory pooling of orientation-selective population responses can provide a common mechanism for these effects.
Acknowledgement: Funded by the UK Medical Research Council
32.14, 11:30 am Towards a Unifying Model of Crowding: Model
Olympics Adrien Doerig1([email protected]), Aaron Clarke2,
Greg Francis3, Michael Herzog1; 1Laboratory of Psychophysics, Brain
Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL),
Switzerland, 2Laboratory of Computational Vision, Psychology Department, Bilkent University, Ankara, Turkey, 3Department of Psychological
Sciences, Purdue University, USA
In crowding, perception of an object deteriorates in the presence of nearby
elements. Obviously, crowding is a ubiquitous phenomenon since elements
are rarely seen in isolation. To date, there is no consensus on how
to model crowding. In previous experiments, it was shown that the global
configuration of the entire stimulus needs to be taken into account. These
findings rule out simple pooling or substitution models and favor models
sensitive to global spatial aspects. In order to further investigate how to
incorporate these aspects into models, we tested a large number of models, using a database of about one hundred stimuli. As expected, all local
models fail. Further, capturing basic regularities in the stimulus does not
suffice to account for global aspects, as illustrated by the failures of Fourier analysis and textural models. Our results highlight the importance
of grouping to explain crowding. Specifically, we show that a two-stage
model improves performance strongly. In this model, first, elements are
segregated into groups and, second, only elements in the same group interfere with each other. The model must integrate information across large
parts of the visual field.
32.15, 11:45 am How do we count at a glance? Richard Murray1([email protected]
yorku.ca), Kevin DeSimone1,2, Minjung Kim1; 1Department of Psychology
and Centre for Vision Research, York University, 2Department of Psychology and Center for Neural Science, New York University
Our peripheral vision is fundamentally limited by our inability to recognise objects when they appear within “clutter”, a phenomenon known as
crowding. Although widely studied, the cortical locus of this phenomenon
remains unclear. This is in part because it is difficult to distinguish neural
activity arising from a change in the stimulus (e.g. from introducing clutter)
from activity associated with the resulting crowding. Here we overcome
this by quantifying individual differences in susceptibility to crowding and
correlating this with parameter estimates of cortical architecture, assessed
using population receptive field (pRF) analysis of human fMRI data. We
report that a simple psychophysical index of ‘crowding susceptibility’ (the
ratio of acuity for an isolated letter versus a crowded letter) is highly correlated with individual estimates of cortical magnification factor (CMF) in
visual areas V2 and V3. This is strong evidence that V2/V3 plays a crucial
role in setting the spatial scale of crowding and, as has been noted in several
complementary psychophysical and computational studies, is consistent
both with the receptive field (RF) and shape-encoding properties of cells
within these areas.
Joseph Danter1, Rhiannon Finnie1; 1Experimental Psychology, University
College London
Many studies have examined the ability of humans and other animals
to rapidly perceive the approximate number of elements in a scene, but
there has been little work on what computation underlies this ability. To
address this question we measured psychophysical decision spaces for
number judgements. Observers judged whether a reference stimulus or
test stimulus contained more dots. The reference stimulus had fixed area
and density on all trials. The test stimulus had a wide range of areas and
densities across trials. From 3,300 trials we created a 2D plot showing the
observer’s probability of choosing the test stimulus as more numerous, as a
function of its log area and log density. In fifteen such plots (five observers
with three reference stimuli each; 49,500 trials total), fitted decision curves
showed that number judgements were based on log-area plus log-density,
i.e., they were monotonically related to true number (consistent with Cicchini et al., 2016). We fitted a generalized additive model (GAM) to these
data, and found that number judgements were based on almost perfectly
logarithmic transformations of area and density, again demonstrating that
number judgements are tightly linked to true number. There is debate
about whether number judgements are based on number, or on low-level
properties like density. We implemented an ideal observer model that simply counts stimulus elements, and also Dakin et al.’s (2011) bandpass-energy model of number perception, and ran them in the same experiment as
human observers. Surprisingly, both models’ decision spaces were practically the same as human observers’. Thus decision spaces are highly informative in that they reveal the stimulus properties that guide observers’
number judgements, but they are less useful for discriminating between
current competing models. We will suggest that number adaptation aftereffect experiments have greater potential to choose between current models.
Acknowledgement: NSERC, CFI
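The decision rule reported in the counting abstract above (32.15), where judgements follow log-area plus log-density, can be written as a toy observer model. This is a sketch under stated assumptions: the cumulative-Gaussian choice function and the noise parameter sigma are hypothetical, and the stimulus values are made up for illustration.

```python
import math

def p_choose_test(area, density, ref_area, ref_density, sigma=0.2):
    """Probability of judging the test as more numerous.

    The decision variable is log(area) + log(density), i.e. log number,
    compared with the same quantity for the reference. `sigma` is a
    hypothetical internal-noise parameter in log units.
    """
    d = (math.log(area) + math.log(density)) \
        - (math.log(ref_area) + math.log(ref_density))
    return 0.5 * (1.0 + math.erf(d / (sigma * math.sqrt(2.0))))

# Reference: area 1, density 10 (10 dots). Tests trade area against density.
same_number = p_choose_test(2.0, 5.0, 1.0, 10.0)    # 10 dots
more_by_area = p_choose_test(4.0, 5.0, 1.0, 10.0)   # 20 dots
more_by_density = p_choose_test(2.0, 10.0, 1.0, 10.0)  # 20 dots

print(same_number, more_by_area, more_by_density)
```

Under this rule, iso-probability contours in the (log area, log density) plane are straight lines of slope -1, so any two tests with the same number of dots are chosen equally often regardless of how area and density trade off.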
32.16, 12:00 pm Multidimensional Normalization is Optimal for
Detection in Natural Scenes Wilson Geisler1([email protected]),
Stephen Sebastian1, Jared Abrams1; 1Center for Perceptual Systems, University of Texas at Austin
A fundamental everyday visual task is to detect specific target objects
within a background scene. Under natural conditions, both the properties
of the background and the amplitude of the target (if present) are generally
different on every occasion. To gain some understanding of detection under
such natural conditions we determined the amplitude thresholds in natural images of a matched-template detector, as a function of the three local
background properties: luminance, contrast, and phase-invariant similarity
to the target. We found that threshold (which is equal to the standard deviation of the template response) is a linear separable function (the product)
of all three dimensions—“multidimensional Weber’s law.” This fact poses a
serious problem for detecting targets under natural conditions, where both
the properties of the background and the target amplitude are uncertain.
Specifically, good performance requires a different decision criterion on the
template responses for each possible combination of background properties. However, we show that divisively normalizing the template (feature)
responses by the product of the locally estimated luminance, contrast, and
similarity creates a distribution of template responses that is normal with
a standard deviation of 1.0, independent of the background properties.
Thus, for any desired false-alarm rate the optimal hit rate is obtained with
a single decision criterion, even under maximum uncertainty. This is just
the sort of normalization (gain-control) observed early in the visual system
for the dimensions of luminance and contrast, and perhaps for similarity.
In psychophysical experiments, we show that human performance is consistent in detail with this normalized matched template observer (wh