Neural Measures of Dynamic Changes in Attentive Tracking Load

Trafton Drew1, Todd S. Horowitz1, Jeremy M. Wolfe1,
and Edward K. Vogel2
Brigham and Womenʼs Hospital and Harvard Medical School, University of Oregon
© 2011 Massachusetts Institute of Technology
■ In everyday life, we often need to track several objects simultaneously, a task modeled in the laboratory using the multiple-object tracking (MOT) task [Pylyshyn, Z., & Storm, R. W. Tracking multiple independent targets: Evidence for a parallel tracking mechanism. Spatial Vision, 3, 179–197, 1988]. Unlike MOT, however, in life, the set of relevant targets tends to be fluid and change over time. Humans are quite adept at “juggling” targets in and out of the target set [Wolfe, J. M., Place, S. S., & Horowitz, T. S. Multiple object juggling: Changing what is tracked during extended MOT. Psychonomic Bulletin & Review, 14, 344–349, 2007]. Here, we measured the neural underpinnings of this process using electrophysiological methods. Vogel and colleagues [McCollough, A. W., Machizawa, M. G., & Vogel, E. K. Electrophysiological measures of maintaining representations in visual working memory. Cortex, 43, 77–94, 2007; Vogel, E. K., McCollough, A. W., & Machizawa, M. G. Neural measures reveal individual differences in controlling access to working memory. Nature, 438, 500–503, 2005; Vogel, E. K., & Machizawa, M. G. Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748–751, 2004] have shown that the amplitude of a sustained lateralized negativity, contralateral delay activity (CDA), indexes the number of items held in visual working memory. Drew and Vogel [Drew, T., & Vogel, E. K. Neural measures of individual differences in selecting and tracking multiple moving objects. Journal of Neuroscience, 28, 4183–4191, 2008] showed that the CDA also indexes the number of items being tracked in a standard MOT task. In the current study, we set out to determine whether the CDA is a signal that merely represents the number of objects that are attended during a trial or a dynamic signal capable of reflecting on-line changes in tracking load during a single trial. By measuring the response to add or drop cues, we were able to observe dynamic changes in CDA amplitude. The CDA appears to rapidly represent the current number of objects being tracked. In addition, we were able to generate some initial estimates of the time course of this dynamic process. ■
In a typical multiple-object tracking (MOT) experiment, participants are asked to track several target items among a number of identical distractors for 5–20 sec (Pylyshyn & Storm, 1988). After this period, they are asked to identify the tracked items. Most individuals can successfully track about four items, illustrating a significant capacity limitation in visual processing. In real life, we rarely track the same set of items for extended periods. It is more likely that we will be rapidly switching between items as some become relevant and others lose relevance. For instance, while driving, one might need to track the movements of the cars to the left and the front when preparing to change lanes, then switch to monitoring the truck to the right when attempting to get in an exit lane. To study this more complex natural operation, Wolfe, Place, and Horowitz (2007) introduced the “multiple-object juggling” paradigm. Instead of tracking a fixed set of targets throughout the trial, participants were cued to add and drop targets throughout the trial. Participants were surprisingly good at this task; there was no significant cost for juggling targets as opposed to tracking a fixed set.
How quickly do we acquire new targets and stop tracking old ones? This is an attentional switching problem. Previous research on the topic has studied shifting attention
between fixed spatial locations in response to symbolic
cues using ERPs and fMRI. Converging evidence from these
studies suggests that top–down attentional modulation of
spatial attention is primarily driven by areas in the posterior
parietal cortex (e.g., Yantis, 2008). For example, Bisley and
Goldberg (2003) have shown that the lateral intraparietal
area dynamically represents attended locations during sustained voluntary attention tasks. Yantis et al. (2002) have
shown that activity in the superior parietal lobule responds
transiently to cues that instruct the participant to switch
the spatial location of attention. This area is also active
when switching between nonspatial features (Liu, Slotnick,
Serences, & Yantis, 2003), strongly suggesting a role for
this area in controlling voluntary shifts of attention in multiple representational domains. Two ERP studies on the
subject have focused on nonlateralized parietal activity presumed to reflect a change in the focus of spatial attention.
The switch activity occurred between 300 and 700 msec
Journal of Cognitive Neuroscience 24:2, pp. 440–450
after the cue (Brignani, Lepsien, Rushworth, & Nobre,
2009; Grent-ʼT-Jong & Woldorff, 2007). In both cases, the
investigators studied the response to a symbolic cue that
instructed the observer to shift attention to a different location. Brignani and colleagues (2009) contrasted the evoked
response to “hold” and “switch” cues, thereby allowing
them to focus on activity that was necessary to shift the
locus of attention in preparation for a stimulus in the
new location. However, although the observed activity suggests that processes related to the attentional shift begin
roughly 350 msec after the cue, it is not clear what aspect
of the attentional switch is reflected by this activity. As this
was the first time at which the “switch” and “hold” activity
differed significantly, one might expect that it reflects the
beginning of the process, but with the existing data, there is
no way to know when the process is complete; that is, when
has attention effectively switched to the new location?
The current study addressed this question by examining the effect of an exogenous shift cue during an MOT
task on a sustained, lateralized ERP component called
the contralateral delay activity (CDA; McCollough,
Machizawa, & Vogel, 2007; Vogel & Machizawa, 2004).
When a participant is asked to track a set of stimuli presented in one visual hemifield, the difference between
contralateral and ipsilateral activity at posterior-occipital
electrode sites increases with the number of items and saturates above tracking capacity (Drew & Vogel, 2008). By
manipulating the size of the area within which the objects
moved, we have previously shown that this activity appears
to reflect the number of attended items rather than the size
of the attentional spotlight (Drew & Vogel, 2008). Vogel
and colleagues have suggested that this component is an
index of the number of currently attended targets (Drew
& Vogel, 2008; McCollough et al., 2007; Drew, McCollough,
& Vogel, 2006; Vogel & Machizawa, 2004). If so, the CDA
should be sensitive to changes in tracking load during a
trial, thereby allowing us to estimate the time course of
effectively switching from tracking one group of objects
to another.
In visual working memory (VWM) tasks, the CDA responds
to increases in memory load. When participants held two
items in memory and were then asked to add two more
items, CDA amplitude increased (Ikkai, McCollough, &
Vogel, 2010; Vogel, McCollough, & Machizawa, 2005). A
clear prediction of the hypothesis that the CDA is a dynamic index of the number of items currently being attended is that the CDA would decrease when load is
reduced. Alternatively, and consistent with existing data,
the CDA could reflect the number of items selected or individuated during the course of the entire trial. That is,
CDA amplitude might ratchet up each time target load is
modified and might simply asymptote once its maximum
amplitude was reached. This is straightforward in the case
of adding items. CDA amplitude should increase when
additional targets are added to a preexisting target load
(2 + 2 = 4; Vogel et al., 2005). However, the “ratchet-up” account would predict a paradoxical result when participants are asked to drop three items and pick up one
new item (as in the drop conditions of the present article).
Now, 3 − 1 = 4, because the number of items attended
during the entire trial would increase, whereas the active
number of items being tracked would decrease. If the
ratchet-up hypothesis were true, it would render the CDA
a much less effective tool for studying dynamic situations
because it would no longer be representative of the current
load and the signal appears to saturate above three
or four items (Drew & Vogel, 2008; Vogel & Machizawa, 2004).
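The contrast between the two accounts can be made concrete with a small sketch. It is only an illustration of the reasoning above, not an analysis from the study: the function names, the event encoding, and the ceiling of four items are assumptions introduced here (the ceiling stands in for the saturation "above three or four items").

```python
def dynamic_load(events):
    """Load predicted if the CDA indexes the items currently being tracked.

    `events` is a list of (added, dropped) pairs in trial order
    (a hypothetical encoding used only for this illustration).
    """
    current = 0
    for added, dropped in events:
        current += added - dropped
    return current


def ratchet_load(events, ceiling=4):
    """Load predicted if the CDA counts every item selected during the trial.

    Dropped items are ignored, and the total is capped at `ceiling`,
    an assumed saturation point.
    """
    total = 0
    for added, _dropped in events:
        total += added
    return min(total, ceiling)


# Drop trial: select three targets, then drop all three and pick up one new item.
drop_trial = [(3, 0), (1, 3)]
# Add trial: select one target, then drop it and pick up three new items.
add_trial = [(1, 0), (3, 1)]

for name, trial in (("Drop", drop_trial), ("Add", add_trial)):
    print(name, "dynamic:", dynamic_load(trial), "ratchet-up:", ratchet_load(trial))
```

Under these assumptions the two accounts agree about the direction of change on Add trials but diverge on Drop trials, which is why the Drop condition is the diagnostic case.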
To test the dynamic index hypothesis for the CDA, we
employed a variant on Wolfe et al.ʼs (2007) multiple-object
juggling experiments. In this paradigm, participants are
able to change tracking load with no observable behavioral
cost. This creates an ideal situation to determine whether
the CDA is truly representative of the number of items
being actively attended: if so, it should dynamically increase and decrease in response to unpredictable
changes in tracking load during the trial. On the other
hand, if the CDA merely reflects the number of items selected in a given trial, the amplitude should increase when
the tracking load decreases.
In two experiments reported here, participants performed a lateralized tracking task, as in Drew and Vogel
(2008). In the Track 1 and Track 3 baseline conditions,
participants would track either one or three targets, providing baseline CDA measures for one and three targets.
In both experiments, we presented a cue after 500 msec
of tracking on some trials. If tracking three targets, the cue
would either indicate that the participant should Hold
on to that target set or Drop to tracking a single target.
If tracking a single target, the cue would indicate that
the participant should switch to tracking three targets
(Add). In all switch conditions, participants were asked
to switch to new items. For instance, when switching from
one to three items none of the three items was previously
a target. We recorded ERPs and analyzed CDA amplitude.
To anticipate the results, CDA amplitude followed the
tracking conditions. Thus, in the Add condition, for example, the CDA was initially equivalent to that in the Track 1
condition because participants were tracking a single target in both cases. After the cue, CDA amplitude rose to
equal the Track 3 level, indicating that the participant
was tracking three targets. The time course of the CDA
transition is a measure of the time necessary to make
the switch in tracking behavior. The same logic applies
in reverse to the Drop condition. The Hold condition
serves as a control to evaluate the effects of the cue. In
Experiment 2, we replaced the Hold condition with Refresh 1 and Refresh 3 conditions, where the objects that
the participant was tracking were briefly flashed again during the cue period (“refreshing” their identity). Because
the changes in target load were indicated by flashing
the appropriate number of new targets, these “refresh”
conditions served as an important control for the visually
evoked activity in response to the cue items. This allowed
us to generate some initial estimates of the time-course
of increasing and decreasing target load during MOT.
There were 17 participants in both Experiment 1 and
Experiment 2, with no overlap. All were neurologically
normal, recruited from the Eugene, Oregon, community,
and gave informed consent according to procedures approved by the University of Oregon institutional review board.
Stimulus Displays and Procedures
Our method followed the lateralized MOT design developed by Drew and Vogel (2008). Participants were required
to fixate the center of the monitor. Similar stimuli were presented on both sides of fixation to equalize the perceptual
input in each hemifield. However, participants were instructed to attend to one hemifield on a given trial, and
the attended hemifield was varied randomly from trial to trial.
Each trial began with a 200-msec arrow cue presented
at fixation that informed the participants which side of
the screen to attend. Following a 100–200 msec ISI, a
set of eight stationary squares (each subtending 0.4° ×
0.4°) appeared on either side of the screen. One or three
of these squares on the attended side were then illuminated in the relevant color for 500 msec to designate the
targets. An equal subset on the unattended side was illuminated in the irrelevant color to equate stimulus energy
in the two hemifields. For half of the participants, red was
the relevant color and green was the irrelevant color, and
the color mappings were reversed for the remaining subjects. These colors were photometrically equiluminant
according to a Konica Minolta ChromaMeter CS-100a.
After the selection period, color cues disappeared
and all objects began to move independently for 2000 msec.
Motion in each hemifield was confined within an invisible
rectangle subtending 8.90° × 4.45°, with the inner edge of
this rectangle laterally offset from fixation by 2.16°. Objects
moved at a speed of approximately 1.6°/sec. Motion trajectories were linear and changed at random intervals or when
the items made contact with other items or the boundaries
of the invisible rectangle.
Following the motion phase, one item on the attended
side was illuminated in the relevant color and another on
the unattended side was illuminated in the irrelevant color.
Participants made a button press to identify the selected
item on the attended side as having been either a tracked
target or an untracked distractor.
Figure 1. Experimental design for Experiment 1. Here, we have depicted just the attended side of the screen. The unattended side of the screen was matched in all phases except selection, where the attended side contained relevant colors on targets whereas the unattended side contained the irrelevant color on the same number of items. Relevant color was counterbalanced across participants and is depicted as gray in this example, whereas the irrelevant color is black. In the actual experiment, isoluminant red and green were used as the two color types. On switch trials, the items that the participant switched to were always different from the originally tracked items. All trial types were interleaved so that a participant who began tracking three objects did not know if they would be told to switch to tracking one item (Drop), ignore an irrelevant cue (Hold), or simply continue tracking.
In Experiment 1, there were five different trial types: Track 1, Track 3, Add, Drop, and Hold (see Figure 1). For Track 1 and Track 3 trials, participants tracked the initial targets throughout the duration of the trial. On Add trials, participants initially tracked one item. After 500 msec of tracking, three items, which had previously been distractors, were cued in the relevant color on the attended side of the screen for 200 msec. To match the physical stimulation on both sides of the screen as closely as possible, three items were illuminated on the unattended side in the relevant color at the same time. Participants were instructed to immediately stop tracking the initial target and start tracking the new targets. Drop and Hold trials followed the same time course but asked the participant to initially track three items. In the Drop condition, one former distractor was then cued, and the participant was to track this item until the end of the trial. In the Hold condition, one distractor item was cued in the irrelevant color on both the attended and unattended side of the screen. Participants were told to ignore this cue and continue tracking the three initially cued targets. All trial types were interleaved so that when initially asked to track one target, there was a 50% probability of being asked to continue tracking that item throughout the trial, and when initially tracking three targets, there was a two-thirds chance of being asked to stay with the initial targets. After a short practice block, participants completed 880 trials, yielding 176 trials per condition.
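The trial counts and cue probabilities for Experiment 1 follow directly from interleaving five equally frequent conditions. The sketch below reproduces that arithmetic; the dictionary layout and the function name are hypothetical, and equal trial counts per condition are inferred from the 880/176 figures in the text.

```python
from fractions import Fraction

TOTAL_TRIALS = 880

# condition -> (initial target load, participant keeps the initial targets?)
CONDITIONS = {
    "Track 1": (1, True),
    "Add": (1, False),
    "Track 3": (3, True),
    "Drop": (3, False),
    "Hold": (3, True),
}


def stay_probability(initial_load):
    """Chance of keeping the initial targets, given the initial load."""
    keeps = [keep for load, keep in CONDITIONS.values() if load == initial_load]
    return Fraction(sum(keeps), len(keeps))


trials_per_condition = TOTAL_TRIALS // len(CONDITIONS)

print(trials_per_condition)   # trials in each condition
print(stay_probability(1))    # after starting with one target
print(stay_probability(3))    # after starting with three targets
```

Because the conditions are interleaved, the initial load alone never tells the participant whether a switch cue will arrive.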
There were six conditions in Experiment 2, four (Track 1,
Track 3, Add, and Drop) of which were replications from
Experiment 1. The two new conditions in this experiment
were the Refresh 1 and Refresh 3 conditions. In these conditions, the initially cued target items were cued again (or
refreshed) using the relevant color during the cue period.
As in the Add and Drop conditions, an identical number of
items was cued in the relevant color on the unattended
side. After a short practice block, participants completed
1248 trials, yielding 208 trials per condition.
Electrophysiological Recording and Analysis
ERPs were recorded in each experiment using our standard recording and analysis procedures, including rejection of trials contaminated by blocking, blinks, or large
(>1°) eye movements (see McCollough et al., 2007). We
recorded from 22 tin electrodes mounted in an elastic cap
(Electrocap International) using the International 10/20
System. 10/20 sites F3, FZ, F4, T3, C3, CZ, C4, T4, P3,
PZ, P4, T5, T6, O1 and O2 were used along with five nonstandard sites: OL midway between T5 and O1; OR midway between T6 and O2; PO3 midway between P3 and
OL; PO4 midway between P4 and OR; POz midway between PO3 and PO4. All sites were recorded with a left
mastoid reference, and the data were re-referenced offline to the algebraic average of the left and right mastoids.
Horizontal EOG was recorded from electrodes placed
approximately 1 cm to the left and right of the external
canthi of each eye to measure horizontal eye movements.
To detect blinks, vertical EOG was recorded from an electrode mounted beneath the left eye and referenced to the
left mastoid. The EEG and EOG were amplified with a SA
Instrumentation amplifier with a bandpass of 0.01–80 Hz
and were digitized at 250 Hz in LabView 6.1 running on a
Eye Movements
Trials containing either blinks or eye movements were excluded from further analysis. Participants with trial rejection rates >25% were excluded from the sample. Using
these criteria, we eliminated 1 of the 17 participants who
took part in Experiment 1 and 4 of the 17 participants in
Experiment 2. We analyzed the horizontal EOG channel
over a long time window (100–2400 msec) to ensure that
eye position did not drift into the movement area during
the trial and that the presence of a switch cue on some trials
did not result in an increased rate of eye position drift during the trial. We divided the data on the basis of the side
that was attended and the experimental condition for
each trial. We found no evidence of significant hEOG drift
in either experiment: The main effects for both side attended (F(1, 15) = 2.00, p = .18, η2 = .12; Experiment 2:
F(1, 12) = 4.38, p = .058, η2 = .27) and condition (F(4,
60) = 1.41, p = .24, η2 = .09; F(5, 60) = 1.93, p = .10,
η2 = .14) were not significant, and the two factors did
not interact (F(4, 60) = .73, p = .57, η2 = .05, F(5, 60) =
1.72, p = .14, η2 = .13). Overall, the drift toward the attended side of the screen was less than 0.7 μV in both experiments. Hillyard and Galambos (1970) have shown that
a 1° eye movement elicits roughly a 16-μV deflection in
hEOG waveforms. Given that the area that the boxes
moved within was lateralized by 2.16° from fixation, it is unlikely that these small drifts in fixation affected the data.
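The scale of this drift is easier to judge in degrees of visual angle. The sketch below applies the roughly 16 μV per degree figure cited above to the 0.7 μV upper bound; treating the relation as linear for such small deflections is an assumption of this illustration, and the names are hypothetical.

```python
UV_PER_DEGREE = 16.0        # approximate hEOG deflection per 1 degree (Hillyard & Galambos, 1970)
INNER_EDGE_OFFSET_DEG = 2.16  # inner edge of the movement area, relative to fixation


def heog_to_degrees(drift_uv):
    """Convert an hEOG deflection in microvolts to degrees (linear approximation)."""
    return drift_uv / UV_PER_DEGREE


drift_deg = heog_to_degrees(0.7)
print(f"{drift_deg:.3f} deg of drift vs. {INNER_EDGE_OFFSET_DEG} deg offset")
```

On this approximation the drift is a small fraction of a degree, well short of the distance to the nearest edge of the movement area.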
Difference Waves
Contralateral and ipsilateral waveforms were defined in
terms of the side of the screen attended on a given trial.
We then examined the data in terms of contralateral and
ipsilateral response, collapsing the data across attend left
and right trials. As in previous work (Drew & Vogel,
2008; McCollough et al., 2007; Vogel et al., 2005; Vogel &
Machizawa, 2004), we averaged the response from a set of
five posterior-occipital electrode pairs (P3/4, PO3/4, T5/6,
OL/R, O1/2). We computed difference waves by subtracting
ipsilateral activity from contralateral activity. For the remainder of the article, we will analyze difference waves
from this averaged response unless otherwise stated.
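As a minimal sketch of the difference-wave computation described above: the electrode pairs are those listed in the text, but the data layout (a dictionary of per-electrode sample lists in μV) and the function name are assumptions made for illustration, not the analysis code used in the study.

```python
# (left-hemisphere site, right-hemisphere site) for the five pairs named in the text
ELECTRODE_PAIRS = [("P3", "P4"), ("PO3", "PO4"), ("T5", "T6"), ("OL", "OR"), ("O1", "O2")]


def difference_wave(erp, attended_side):
    """Return contralateral minus ipsilateral activity, averaged over the pairs.

    `erp` maps electrode name -> list of samples for one attended side.
    """
    if attended_side not in ("left", "right"):
        raise ValueError("attended_side must be 'left' or 'right'")
    contra_sites, ipsi_sites = [], []
    for left_site, right_site in ELECTRODE_PAIRS:
        if attended_side == "left":
            # Attending left: right-hemisphere sites are contralateral.
            contra_sites.append(right_site)
            ipsi_sites.append(left_site)
        else:
            contra_sites.append(left_site)
            ipsi_sites.append(right_site)
    n_samples = len(erp[contra_sites[0]])
    wave = []
    for t in range(n_samples):
        contra = sum(erp[site][t] for site in contra_sites) / len(contra_sites)
        ipsi = sum(erp[site][t] for site in ipsi_sites) / len(ipsi_sites)
        wave.append(contra - ipsi)
    return wave


# Toy data: left-hemisphere sites are more negative than right-hemisphere sites.
toy_erp = {}
for left_site, right_site in ELECTRODE_PAIRS:
    toy_erp[left_site] = [-1.0, -2.0]
    toy_erp[right_site] = [0.0, 0.0]

print(difference_wave(toy_erp, "right"))
```

Collapsing across attend-left and attend-right trials would then amount to averaging the two resulting difference waves.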
Experiment 1
Behavioral Results
Figure 2. Behavioral performance in Experiment 1 (A) and Experiment 2 (B). Error bars here and in all subsequent figures represent SEM.
We converted behavioral accuracy to an estimate of the number of objects tracked using the equation m = n(2P − 1). Here, m is an estimate of the number of items accurately tracked, n is the total number of target items, and P is the proportion correct (Scholl, 2001). We consider n to be the number of targets the participant was ultimately responsible for at report, so n was 3 for Add trials and 1 for Drop trials. The primary utility of this transformation is to ensure that participants were capable of tracking more than one target when asked to do so. If we consider percent correct instead of m, the pattern of results is identical, except that accuracy is naturally lower for the three-target conditions relative to the one-target conditions.
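The conversion m = n(2P − 1) can be written as a one-line function. The accuracy values in the example are hypothetical and chosen only to show how the estimate behaves; they are not data from either experiment.

```python
def items_tracked(n_targets, proportion_correct):
    """Estimate the number of items tracked: m = n(2P - 1).

    `proportion_correct` is accuracy on a 0-1 scale, so chance (0.5) gives m = 0.
    """
    return n_targets * (2 * proportion_correct - 1)


# Hypothetical accuracies for illustration only.
print(items_tracked(3, 0.90))  # three targets, 90% correct
print(items_tracked(1, 0.95))  # one target, 95% correct
print(items_tracked(3, 0.50))  # chance performance
```

Because accuracy is scaled by n, an m above 1 in a three-target condition indicates that more than one target was being tracked.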
The performance data in Figure 2A show that participants
were able to efficiently switch target load on-line without
any detectable cost, as observed by Wolfe et al. (2007). A
repeated measures ANOVA showed a large effect of trial
type on performance (F(1, 15) = 52.38, p < .001, η2 =
.78). Using planned comparisons, we found that, as expected, the estimated number of items tracked was
higher in the Track 3 condition than the Track 1 condition
(t(15) = 8.22, p < .001, η2 = .82). In the Drop condition,
the final target load was a single item. Accordingly, we
found that the Track 1 and Drop conditions yielded
equivalent performance (t(15) = 1.52, p = ns, η2 =
.13). Similarly, performance in the Add and Hold conditions, in which the final target load was three items, was
equivalent to the Track 3 case (Add vs. Track 3: t(15) =
−0.43, p = ns, η2 = .01; Hold vs. Track 3: t(15) = 1.10,
p = ns, η2 = .08). In other words, there was no observable
behavioral cost of switching tracking load during the trial
despite a cue period (200 msec) that was shorter than
in previous work (500 msec in Wolfe et al., 2007).
Electrophysiological Results
Replicating previous work (Drew & Vogel, 2008), CDA
amplitude was higher for the Track 3 condition than
the Track 1 condition throughout the trial, including during the initial selection of the target before movement
onset. Amplitude in both the Add and Drop conditions
changed dramatically after the cue period, rising in the
Add condition and falling in the Drop condition. The
Hold condition amplitude appears to be stable aside
from a positive deflection immediately after the cue period. To quantify these results, we initially analyzed the
mean amplitude of the CDA during three periods. The
first was during the typical time window associated with
selection activity before motion onset (the N2pc; 200–300 msec). The remaining two measured the CDA at different time points during tracking: one immediately before the onset of the switch cue (800–1000 msec) and a
later period once the amplitude of juggling conditions
appeared to have stabilized (1700–1900 msec).
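A mean-amplitude measure over these three windows could be computed along the following lines. The 250 Hz sampling rate and the window boundaries come from the text; the assumption that the waveform's first sample falls at selection onset (0 msec), the function name, and the toy waveform are all introduced here for illustration.

```python
SAMPLE_RATE_HZ = 250  # digitization rate reported in the recording section

WINDOWS_MS = {
    "selection": (200, 300),
    "early": (800, 1000),
    "late": (1700, 1900),
}


def mean_amplitude(wave, start_ms, end_ms, t0_ms=0):
    """Mean of `wave` between start_ms and end_ms (end exclusive).

    `t0_ms` is the latency of the first sample relative to selection onset
    (assumed to be 0 here).
    """
    ms_per_sample = 1000 / SAMPLE_RATE_HZ
    first = round((start_ms - t0_ms) / ms_per_sample)
    last = round((end_ms - t0_ms) / ms_per_sample)
    segment = wave[first:last]
    return sum(segment) / len(segment)


# Toy difference wave: -1 uV for the first 1000 msec, -3 uV afterwards.
toy_wave = [-1.0] * 250 + [-3.0] * 250

for name, (start, end) in WINDOWS_MS.items():
    print(name, mean_amplitude(toy_wave, start, end))
```

With real data the same function would be applied to each condition's difference wave before the per-window comparisons.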
In the selection time window (Figure 3C), we first compared the amplitude among conditions in which the participant was asked to initially track the same number of
targets. Amplitude was equivalent for the two conditions
that began with tracking one target (Track 1 and Add; F(1,
15) = .05, p = .82, η2 = .00) and for the three conditions
that began with tracking three targets (Track 3, Drop and
Hold; F(2, 30) = 2.24, p = .12, η2 = .13). Accordingly, we
collapsed the data for this time window into these two categories and then compared amplitude between them. As
in Drew and Vogel (2008), we found that the N2pc was
higher when three targets were initially selected (F(1,
15) = 21.12, p < .001, η2 = .59).
We followed the same procedure for the early time window (Figure 3D). Again amplitude was equivalent for the
Track 1 and Add conditions (F(1, 15) = 0.45, p = .51, η2 =
.03). There was a small but significant effect of condition
for the Track 3, Drop, and Hold conditions (F(2, 30) =
3.84, p = .03, η2 = .2), but this appears to be driven
by slightly lower amplitude in the Track 1 condition.
Furthermore, an analysis during the same early time window and using visually identical stimuli in Experiment 2
found no effect of condition ( p > .6). Comparing these
across initial target load, we found that amplitude for conditions where one item was initially tracked was much
lower than trials where three items were initially tracked
(F(1, 15) = 30.9, p < .001, η2 = .67).
In the late time window (Figure 3E), we again grouped
the data on the basis of the current number of items being
tracked. Because the late time window is after the cue, the
categories are different than in the previous time windows. Amplitude was equivalent within the one target
category (now comprising Track 1 and Drop; F(1, 15) =
.01, p = .97, η2 = .00) and within the three target category
(now comprising Track 3, Add, and Hold; F(2, 30) = .21,
p = .81, η2 = .01). As in the other time windows, amplitude was significantly higher when tracking three targets
compared with tracking one (F(1, 15) = 13.52, p < .005,
η2 = .47).
Next, we compared amplitude within the juggling conditions across time windows (Figure 3B). Comparing the amplitude in early and late time windows for the Add and
Drop conditions, we found a main effect of time (F(1,
15) = 8.3, p = .02, η2 = .36) and, more importantly, a significant crossover interaction (F(1, 15) = 22.3, p < .001,
η2 = .6), but no effect of condition (F(1, 15) = 2.8, p =
.11, η2 = .16). Paired t tests show that amplitude increased
significantly for the Add condition (t(15) = 19.89, p < .001,
η2 = .56) and decreased significantly for the Drop condition (t(15) = −5.19, p < .01, η2 = .26).
Timing Analysis
The previous analyses confirm that CDA amplitude reflects the number of targets that participants should be
currently tracking within a given time window and that
amplitude changes accordingly when participants are
asked to switch tracking loads. However, as is clear from
the waveforms, this is not an instantaneous process: It
takes time for amplitude to shift from the level equivalent
to the number of targets initially tracked to the final tracking load. From the waveforms, it appears clear that amplitude in the Add condition reaches the level of the Track 3
condition later than the Drop condition reaches the level
of Track 1. However it is also clear that the amplitude in
the Hold condition decreases for roughly 200 msec soon
after the cue period. Hold amplitude during the 1200–
1400 msec time window where this effect is maximal is significantly more positive than amplitude in the early (t(15) =
3.67, p < .005, η2 = .47) or late time window (t(15) = 6.43,
p < .001, η2 = .73). It is important to understand this effect
before we attempt to make any estimates about the time
course of adding and dropping items. The positive-going
transient response to the cue in the Hold condition could
be related to inhibitory processes as the participants attempt to suppress the irrelevant information during the
cue period. Alternatively, the onset of stimuli could induce
an automatic response irrespective of the need to inhibit
information. If this is the case, then there may be a visually evoked transient to the cue (e.g., P1) in both the
Add and Drop conditions that obscures our ability to discern when object information has been added or dropped.
We conducted a second experiment to investigate this
Experiment 2
To determine whether the cue transient observed in the Hold condition related to inhibition of the irrelevant stimulus in this condition or a more general response, we replicated Experiment 1 with two new conditions, Refresh 1 and Refresh 3, in place of the Hold condition. In both Refresh conditions, the targets that were initially cued were simply cued again in the relevant color during the cue period. To balance visual stimulation, the same number of items was cued in the relevant color on the unattended side as well.
Figure 3. Electrophysiological results from Experiment 1. (A) Difference waveforms of the CDA amplitude for the five conditions. Waveforms were taken from an average of five electrode pairs from posterior–parietal sites and were time-locked to the onset of the selection period. Motion began 500 msec subsequently. Only data from correct trials are shown. (B) Mean CDA amplitude for the Add and Drop conditions as a function of time. (C, D, and E) Mean CDA amplitude for the five conditions over the selection (C), early (D), and late (E) time windows.
Figure 4. Electrophysiological results from Experiment 2. (A) Difference waveforms for the four conditions also used in Experiment 1; Refresh 1 and Refresh 3 waveforms have been omitted for ease of viewing. (B) Mean CDA amplitude for the Add and Drop conditions as a function of time. (C, D, and E) Mean CDA amplitude over the selection (C), early (D), and late (E) time windows. As in Experiment 1, amplitude in the Add and Drop conditions changed to reflect the current number of items being tracked. This can be observed in mean amplitude bar graphs for the early (D) and late (E) time windows.
Behavioral Results
As in Experiment 1, we analyzed behavioral performance
on the basis of the number of objects the participant had
to track at the end of each trial (Figure 2B). The estimated
number of objects tracked was higher when tracking three
targets than when tracking one target (F(1, 12) = 64.3, p <
.001, η2 = .84). Performance was equivalent for the one target conditions (Track 1, Refresh 1, and Drop, F(2, 24) =
2.70, p = .088, η2 = .18). However, we did observe a significant effect of condition within the three target category
(Track 3, Refresh 3, and Add; F(2, 24) = 7.94, p < .005,
η2 = .40), which appears to be driven by the fact that performance in the Track 3 condition was significantly worse than
performance in the Refresh 3 condition (t(12) = 3.79, p <
.005, η2 = .55). As in Experiment 1, participants showed no
cost of switching targets during the trial: performance in
the Add and Track 3 conditions was equivalent (t(12) =
1.88, p = ns, η2 = .23), as was Drop and Track 1 performance (t(12) = .79, p = ns, η2 = .05).
Electrophysiological Results
Four of the six conditions in this experiment were repeated from Experiment 1 and the overall results for
these conditions are strikingly similar (Figure 4A). As in
Experiment 1, we grouped conditions on the basis of the
number of targets that the participant was instructed to
be tracking during the given time window. Tracking three
targets was associated with higher amplitude than tracking
one target in both the selection (200–300 msec, Figure 4C;
F(1, 12) = 30.58, p < .001, η2 = .72) and early (800–
1000 msec, Figure 4D; F(1, 12) = 46.3, p < .001, η2 =
.79) time windows. There was no effect of condition within
the one or three target categories (all Fs < 0.6, all ps > .6).
The same pattern of results held in the later period: Three
target amplitude was greater than one target amplitude
(Figure 4E, F(1, 12) = 26.94, p < .001, η2 = .69), with
no effect of condition within the one target category
(F(2, 24) = 1.90, p = .17, η2 = .14), or the three target category (F(2, 24) = 1.28, p = .30, η2 = .10). When we compared the Add and Drop conditions (Figure 4B), there was
once again an effect of time (F(1, 12) = 42.49, p < .001,
η2 = .78) and a strong crossover interaction (F(1, 12) =
52.27, p < .001, η2 = .81), but no effect of condition (F(1,
12) = .03, p = .86, η2 = .00).
Critically, we observed a cue transient soon after the cue
period in each of the Refresh conditions that appeared to
be very similar to the deflection in response to the Hold
cue in Experiment 1 (compare Figure 4F and Figure 3A).
This suggests that the positive-going transient response
to the cue is stimulus-driven rather than an inhibitory process that is task-driven. Aside from the period immediately
following the cue, amplitude in the Refresh conditions
followed the same pattern as the Track 1 and Track 3 conditions. Although there was a significant effect of both time
(early or late time window: F(1, 12) = 9.75, p < .01, η2 =
.45) and condition (Refresh 1 or Refresh 3: F(1, 12) = 33.1,
p < .001, η2 = .73), the two factors did not interact (F(1,
12) = 0.00, p = .96, η2 = 0). Furthermore, as with the Hold
condition in Experiment 1, amplitude in both Refresh conditions was significantly lower during the 1200–1400 msec
time window than in either the early (Refresh 1: t(12) = 2.69,
p < .05, η2 = .37; Refresh 3: t(12) = 2.81, p < .005, η2 =
.40) or late time windows (Refresh 1: t(12) = 5.23, p < .001,
η2 = .70; Refresh 3: t(12) = 5.43, p < .001, η2 = .71).
Timing Analysis
In both Experiment 1 and Experiment 2, we observed a
transient decrease in CDA amplitude in response to cues
that did not necessitate switching object load (i.e., the
Hold and Refresh conditions, respectively). The transient
positivity also appears to be present in the Add and Drop
waveforms. In both experiments, amplitude for the Add
conditions deflects positively before increasing to reflect
the increase in target load, whereas the Drop waveform
appears to decrease immediately to reflect the lowered
tracking load. We, therefore, suggest that the observed
waveforms are a composite of two underlying processes,
one reflecting the change in the number of targets being
tracked, and one reflecting a temporary positive response
to the cue. A better estimate of the switching process
should therefore account for the positive deflection that is unrelated to the switching process. We did so by subtracting
the amplitude on the Refresh trials from that on the Switch
trials. Figure 5B shows the result of subtracting the difference wave on Refresh 1 trials from that of the Drop trials
and Refresh 3 from Add. Figure 5C shows the result of
subtracting the difference wave of Refresh 1 from the
Add condition and subtracting Refresh 3 from the Drop
condition. These figures provide bookends for estimating
when the switch took place while controlling for the positive deflection observed in response to flashing objects during the cue interval. In both cases, we used a 20-msec
sliding window to estimate the timing of the switch. We
used Figure 5B to estimate the first time at which the
switch conditions (Add and Drop) reflected the current
number of objects being tracked. Here, we found the first
time at which the amplitude in each switch condition
was statistically equivalent (p > .001) to the amplitude
for the newly cued target set size for all subsequent periods.
This yields an estimate of 280 msec after the cue onset for
both the Drop and Add conditions. The subtraction in
Figure 5C allowed us to estimate the final time at which
the switch conditions were equivalent to the target load
before the switch cue. Here, we found the first point
where all subsequent points of the waveform were greater
than 0 (p < .001). This yields an estimate of 480 msec
after the cue for the Drop condition and 500 msec for
the Add condition.
Figure 5. Timing analysis for Experiment 2. (A) Difference CDA waveforms for the Refresh 1 and Refresh 3 conditions for comparison with the Add and Drop waveforms; Track 1 and Track 3 waveforms have been omitted for ease of viewing. As in the Hold condition from Experiment 1, there is a positive amplitude deflection in both Refresh conditions soon after the cue period. (B) Two conditions designed to control for the positive deflection in response to cue information found in Experiment 1. We subtracted Refresh 1 amplitude from the Drop condition and Refresh 3 amplitude from Add. The stars denote the last time window where each waveform differed significantly from zero. (C) Two additional conditions: Drop–Refresh 3 and Add–Refresh 1. Here the stars denote the first time when each condition differed significantly from zero and remained above this level for the rest of the trial.
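The logic of the timing analysis (subtract a Refresh waveform, then look for the first window from which a criterion holds for the remainder of the trial) can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the text does not specify the step size of the sliding window or the exact test, so the sketch advances one sample at a time and uses a one-sample t test against zero. All function names and data values are hypothetical.

```python
import math
from statistics import mean, stdev


def subtract_control(switch_wave, refresh_wave):
    """Point-by-point subtraction of a Refresh waveform from a switch waveform."""
    return [s - r for s, r in zip(switch_wave, refresh_wave)]


def window_means(wave, n_samples):
    """Mean of each run of n_samples consecutive samples, stepping one sample."""
    return [mean(wave[i:i + n_samples]) for i in range(len(wave) - n_samples + 1)]


def one_sample_t(values):
    """t statistic for the mean of `values` against zero."""
    m, sd = mean(values), stdev(values)
    if sd == 0:
        return 0.0 if m == 0 else math.copysign(math.inf, m)
    return m / (sd / math.sqrt(len(values)))


def first_sustained(flags):
    """Index of the first True from which every later flag is also True."""
    onset = None
    for i, flag in enumerate(flags):
        if not flag:
            onset = None
        elif onset is None:
            onset = i
    return onset


def sustained_onset_ms(subject_waves, times_ms, n_samples, t_crit, significant=True):
    """Start time of the first window from which the criterion holds until the end.

    significant=True looks for a sustained difference from zero (as in Figure 5C);
    significant=False looks for sustained equivalence to zero (as in Figure 5B).
    """
    per_subject = [window_means(w, n_samples) for w in subject_waves]
    flags = []
    for k in range(len(per_subject[0])):
        t = one_sample_t([ws[k] for ws in per_subject])
        flags.append((abs(t) > t_crit) == significant)
    idx = first_sustained(flags)
    return None if idx is None else times_ms[idx]


# Invented data: three participants, one sample every 20 msec after the cue.
times_ms = [0, 20, 40, 60, 80, 100]
switch = [[0.75, 0.25, 0.5, 1.5, 1.75, 1.5],
          [0.25, 0.75, 0.5, 1.75, 1.5, 2.0],
          [0.5, 0.5, 0.5, 1.5, 2.0, 1.75]]
refresh = [[0.5] * 6 for _ in switch]
corrected = [subtract_control(s, r) for s, r in zip(switch, refresh)]

# 4.303 is the two-tailed critical t for df = 2 at alpha = .05 (toy criterion).
# With 13 participants (df = 12) and p < .001 the value would be about 4.32.
onset = sustained_onset_ms(corrected, times_ms, n_samples=2, t_crit=4.303)
print(onset)
```

Passing `significant=False` applies the complementary criterion, returning the first window from which the corrected waveform no longer differs from zero. As noted in the text, estimates of this kind shift somewhat with the strictness of the statistical criterion.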
Both of these analyses suggest that the time courses for
adding and dropping items are remarkably similar. These
numbers vary somewhat on the basis of how conservative
we are in our statistical tests, but the surprising pattern of
Add and Drop diverging at approximately the same time is
Shifting target sets while tracking multiple moving objects
is a complex attentional operation which humans engage
in on a daily basis. These results provide the first demonstration that we can measure the neural activity underlying this process operating in real time by measuring
the amplitude of CDA. Our results conclusively demonstrate that the CDA is a dynamic rather than transient signal that is sensitive to both tracking load increases and
decreases. This technique provides a powerful tool for
studying how visual attention operates under conditions
that approximate real-world behaviors more closely than
typical laboratory tasks.
Relationship to Previous Research
Previous studies have suggested that CDA amplitude may
serve as an on-line index of the number of items being
held in working memory or the number of items being
tracked during MOT (Drew & Vogel, 2008; McCollough
et al., 2007; Vogel & Machizawa, 2004). The current study
extends this line of research by showing that CDA amplitude is sensitive to dynamic changes in the number
of targets during an MOT task. Our group has previously
suggested that one similarity between visual working
memory (VWM) and MOT is that both tasks require a
pointer for each object that is being attended (Drew,
Horowitz, Wolfe, & Vogel, 2011; Drew & Vogel, 2008). In
fact, when viewing visually identical trials where the task
was manipulated to be either a tracking task or a VWM task,
the CDA amplitude mirrors the target load in both tasks,
suggesting substantial overlap in the neural mechanisms
that underlie these two tasks (Drew et al., 2011).
Given the hypothesis that CDA amplitude serves as an
on-line index of the number of items being attended
rather than reflecting the number of items that have been
attended in a given trial, it is critical to demonstrate that
the CDA is sensitive to dynamic changes in the number of
items being attended in a given trial. The current study
confirms this hypothesis in the MOT domain. This paves
the way for future research using the CDA to examine
other tasks where attentional load might change during
the trial.
We replicated and extended Wolfe et al.ʼs (2007) finding
that participants can “juggle” moving items in and out of
the target set with little or no impairment in tracking performance. In the original Wolfe et al. study, targets were
added or deleted one by one. Here, participants were
asked to simultaneously delete their current target set
and acquire new targets while holding fixation and attending lateralized items. Again, there was no noticeable decrement in performance relative to constant set tracking.
Although we view MOT as a more ecologically valid task
than many typical cognitive psychology tasks, it would be
very unusual to attentively track a fixed set of cars for a
long period while driving on a highway. It is much more
common to rapidly change what we are tracking as old
targets become irrelevant and new ones appear. This behavior is captured in the laboratory with the multiple-object juggling task. Our data and those of Wolfe et al.
(2007) suggest that humans are quite good at this task.
We can add and drop objects one at a time or drop one
set and acquire a whole new set at the same time with
equal aplomb. Thus, the multiple-object juggling task
may prove a useful way to assess everyday cognition in
populations with potential deficits, such as the elderly.
One might question whether these cues might simply be treated as the initiation of a new trial, because the target
items in these experiments always changed during switch
trials. This might have been the case if our trial types were
blocked, but because all trial types were interleaved it was
necessary to track items up until the time when a switch
cue could occur. Alternatively, the participants could have
chosen to gamble that a given trial would be a switch trial,
but such a strategy would result in lower performance on
nonswitch trials with the appropriate target set size. For
instance, this would predict that Set Size 1 accuracy would
be lower than Drop condition accuracy. We found no evidence of such effects. This means that one fundamental
difference between the start of the trial and the cue onset
for switch trials is that participants were actively tracking
either one or three items when the switch cue occurred,
meaning that when the switch cue occurred the participant had to both pick up the new items and quickly stop
tracking the initial items.
Time Course Information
After controlling for low-level effects in response to the
onset of cue information, we documented the time
course of attentional switching between moving objects
in response to peripheral cues. There is an existing literature
on attentional switching, which has focused on the neural
response to symbolic switch cues. By focusing on the
ERPs evoked by the switch cue, two groups have estimated that cue information is processed between 300
and 700 msec after the cue onset (Brignani et al., 2009;
Grent-ʼT-Jong & Woldorff, 2007). In the current work,
we have focused on the shifts engendered by the cue
rather than cue processing alone and the second experiment was specifically designed to account for low-level
activity evoked by stimulus onset but not specific to cue
processing. In our paradigm, once the switch cue has
been encoded the participant must then switch tracking
sets, and our timing estimates are based on examining both
the time when the new tracking load is first reflected in
the CDA waveform and the time when the switch waveform first differed significantly from the original target
load amplitude. These analyses gave a range of time during which the switch process appears to be taking place.
Given the many differences between our paradigm and
those used in the previous studies, it was surprising that
the time course of this process (between 280 and 500 msec)
was quite similar to the estimates generated by focusing
on cue processing alone.
We did not expect to find that the time course was
nearly identical for adding and dropping items, having
assumed that switching to three items would take more
time than switching to one. However, this is less surprising when we note that there was also no effect of number
of targets on the latency of the initial rise of CDA at the
start of the trial. Furthermore, our participants showed
hardly any behavioral cost of switching tracking load.
Future work could explore the possibility of modulating
the amount of time it takes to complete the switching process. For instance, the current study measures the time
course for changing target load without a concurrent load,
but we might expect that this switching process would
take a longer time if the participant had to continue tracking additional objects during a switch condition.
Future Directions
Recent work has shown that the CDA is elicited by a number of lateralized tasks beyond VWM and MOT, including
visual search (Woodman & Arita, 2011) and curve tracing
(Lefebvre, Jolicoeur, & DellʼAcqua, 2010). The current
finding that the CDA is sensitive to dynamic changes in
attentional load during a trial provides researchers with
a number of advantages over measuring behavioral performance alone. In the MOT domain, participants can fail
to track targets in several different ways. A participant
could fail to select targets for tracking at the start of a
trial, lose track of a target while tracking, or might confuse a target with a distractor and begin tracking the
wrong object. These effects would be indistinguishable
in accuracy measures but would yield different patterns
in the ERP data. Failures of target selection (Pylyshyn &
Annan, 2006) would be reflected in N2pc amplitude. As
the data from the current study show, the CDA is sensitive to decreases in tracking load, so when objects are
lost, we should see decreased CDA amplitude, whereas
target–distractor swaps would yield relatively stable
Consider the perceptual grouping effects observed by
Yantis (1992). He found that there was a significant advantage for tracking items that initially appeared in easily
grouped configuration (such as a square) and suggested
that perceptual grouping enables participants to effectively lower the number of items that are being tracked
in a given trial. If so, we would expect to observe reduced
CDA amplitude. Alternatively, perhaps the effect depends
entirely on improved selection. By measuring the N2pc
and the CDA, we could determine which phase of the
MOT task is affected by perceptual grouping: selection,
tracking, or both.
One of the defining properties of working memory is that
the contents of this system are constantly changing as
new information is encoded and old information is
moved to a more consolidated form or simply forgotten.
In the current study, we used CDA amplitude and the excellent temporal resolution of ERPs to study this dynamic
process. Although previous CDA research has primarily
studied the process of encoding new information, here
we also studied the process of deleting irrelevant information in the face of new, more relevant information.
We have shown quite clearly that the CDA is dynamically
sensitive to both increases and decreases in tracking load
and have documented for the first time the time course
of attentional switching between moving objects in response to peripheral cues. Although a great deal of important recent research on the relationship between
attention and working memory has employed fMRI techniques, critical information about these processes may
be missed because of the poor temporal resolution of
the hemodynamic response. As we move toward studying these processes in more ecologically valid paradigms,
it is advantageous to employ techniques that are capable
of detecting changes that occur along the same time
scale (milliseconds as opposed to seconds) as the processes we are interested in studying. In the current work,
we have shown that CDA amplitude reflects changes in
the number of items being attended within less than
a second and that the time course appears to be relatively stable when picking up one or three new items.
We hope that future research will be able to use these
techniques to further refine our understanding of the time
course of shuttling information in and out of working memory.
Reprint requests should be sent to Trafton Drew, Harvard Medical School, Visual Attention Lab, 64 Sidney St., Suite 170, Cambridge, MA 02139, or via e-mail:
Bisley, J. W., & Goldberg, M. E. (2003). Neuronal activity in
the lateral intraparietal area and spatial attention. Science,
299, 81–86.
Brignani, D., Lepsien, J., Rushworth, M. F. S., & Nobre, A. C.
(2009). The timing of neural activity during shifts of spatial
attention. Journal of Cognitive Neuroscience, 21, 2369–2383.
Drew, T., Horowitz, T. S., Wolfe, J. M., & Vogel, E. K. (2011).
Delineating the neural signatures of tracking spatial position
and working memory during attentive tracking. Journal of
Neuroscience, 31, 659–668.
Drew, T., McCollough, A. W., & Vogel, E. K. (2006). Event-related potential measures of visual working memory.
Clinical EEG and Neuroscience, 37, 286–291.
Drew, T., & Vogel, E. K. (2008). Neural measures of individual
differences in selecting and tracking multiple moving objects.
Journal of Neuroscience, 28, 4183–4191.
Grent-ʼT-Jong, T., & Woldorff, M. G. (2007). Timing and
sequence of brain activity in top–down control of
visual-spatial attention. PLoS Biology, 5, 114–126.
Hillyard, S. A., & Galambos, R. (1970). Eye movement artifact
in the CNV. Electroencephalography and Clinical
Neurophysiology, 28, 173–182.
Ikkai, A., McCollough, A. W., & Vogel, E. K. (2010). Contralateral
delay activity provides a neural measure of the number of
representations in visual working memory. Journal of
Neurophysiology, 103, 1963–1968.
Lefebvre, C., Jolicoeur, P., & DellʼAcqua, R. (2010).
Electrophysiological evidence of enhanced cortical activity
in the human brain during visual curve tracing. Vision
Research, 50, 1321–1327.
Liu, T. S., Slotnick, S. D., Serences, J. T., & Yantis, S. (2003).
Cortical mechanisms of feature-based attentional control.
Cerebral Cortex, 13, 1334–1343.
McCollough, A. W., Machizawa, M. G., & Vogel, E. K. (2007).
Electrophysiological measures of maintaining representations
in visual working memory. Cortex, 43, 77–94.
Pylyshyn, Z., & Storm, R. W. (1988). Tracking multiple
independent targets: Evidence for a parallel tracking
mechanism. Spatial Vision, 3, 179–197.
Pylyshyn, Z. W., & Annan, V. (2006). Dynamics of target
selection in multiple object tracking (MOT). Spatial Vision,
19, 485–504.
Scholl, B. J. (2001). Objects and attention: The state of the art.
Cognition, 80, 1–46.
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity
predicts individual differences in visual working memory
capacity. Nature, 428, 748–751.
Vogel, E. K., McCollough, A. W., & Machizawa, M. G.
(2005). Neural measures reveal individual differences
in controlling access to working memory. Nature, 438, 500–503.
Wolfe, J. M., Place, S. S., & Horowitz, T. S. (2007). Multiple
object juggling: Changing what is tracked during extended
multiple object tracking. Psychonomic Bulletin & Review, 14, 344–349.
Woodman, G., & Arita, J. T. (2011). Direct electrophysiological
measurement of attentional templates in visual working
memory. Psychological Science, 22, 212–215.
Yantis, S. (1992). Multi-element visual tracking—Attention
and perceptual organization. Cognitive Psychology, 24,
Yantis, S. (2008). The neural basis of selective attention: Cortical
sources and targets of attentional modulation. Current
Directions in Psychological Science, 17, 86–90.
Yantis, S., Schwarzbach, J., Serences, J. T., Carlson, R. L.,
Steinmetz, M. A., Pekar, J. J., et al. (2002). Transient neural
activity in human parietal cortex during spatial attention
shifts. Nature Neuroscience, 5, 995–1002.
Volume 24, Number 2
Copyright of Journal of Cognitive Neuroscience is the property of MIT Press and its content may not be copied
or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission.
However, users may print, download, or email articles for individual use.