Durham Research Online

Deposited in DRO: 23 May 2016
Version of attached file: Accepted Version
Peer-review status of attached file: Peer-reviewed

Citation for published item:
Thaler, Lore and Castillo-Serrano, Josefina (2016) 'People's ability to detect objects using click-based echolocation: a direct comparison between mouth-clicks and clicks made by a loudspeaker.' PLoS ONE, 11(5), e0154868.
Further information on publisher's website: https://doi.org/10.1371/journal.pone.0154868

Publisher's copyright statement:
© 2016 Thaler, Castillo-Serrano. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Human echolocation: mouth-clicks vs. loudspeaker clicks
People’s ability to detect objects using click-based echolocation: A direct comparison between
mouth-clicks and clicks made by a loudspeaker
Thaler, L.¹, Castillo-Serrano, J.¹

¹ Department of Psychology, Durham University
Corresponding author:
Lore Thaler
[email protected]
Department of Psychology, Durham University
Science Site, South Road
Durham DH1 5AY
United Kingdom
Abstract
Echolocation is the ability to use reflected sound to obtain information about the spatial environment. It is an active process that requires both the production of the emission and the sensory processing of the resultant sound. Appreciating the general usefulness of echo-acoustic cues for people, in particular those with vision impairments, various devices have been built that exploit the principle of echolocation to obtain and provide information about the environment. Common to all these devices is that they do not require the person to make a sound; instead, the device produces the emission autonomously and feeds a resultant sound back to the user. Here we tested whether echolocation performance in a simple object detection task was affected by the use of a head-mounted loudspeaker as compared to active clicking. We found that 27 sighted participants new to echolocation generally did better when they used the loudspeaker as compared to mouth-clicks, and that two blind participants with experience in echolocation did equally well with mouth-clicks and the speaker. Importantly, performance of sighted participants was not statistically different from performance of the blind experts when they used the speaker. Based on acoustic click data collected from a subset of our participants, those participants whose mouth-clicks were more similar to the speaker clicks, and thus had higher peak frequencies and sound intensity, did better. We conclude that our results are encouraging for the consideration and development of assistive devices that exploit the principle of echolocation.
1. Introduction
Echolocation is the ability to use reflected sound to obtain information about the spatial environment. Echolocation has been studied extensively in various bat species, as well as in some marine mammals. It has also been studied in humans. To echolocate, a person emits a sound, e.g. a mouth-click, and then uses the sound reflections to obtain information about the environment. In this way echolocation is an active process that requires both the production of the emission and the sensory processing of the resultant sound. People can use echolocation to determine the distance, direction, size, material, motion or shape of distal 'silent' surfaces (for reviews see Kolarik et al., 2014; Stoffregen & Pittenger, 1995; Thaler & Goodale, in press). It can therefore provide sensory information otherwise unavailable without vision and, as a consequence, direct sensory benefits for people who are blind. For people with vision impairments, the use of echolocation is also associated with benefits in daily life, such as better mobility in unfamiliar places (Thaler, 2013). Going beyond direct sensory benefits, it has also been suggested that the use of echolocation may improve the calibration of spatial representations for people who are blind from an early age (Vercillo et al., 2015).

Appreciating the general usefulness of echo-acoustic cues for people, in particular those with vision impairments, various devices have been built that exploit the principle of echolocation to obtain and provide information about the environment (Ciselet et al., 1982; Heyes, 1984; Hughes, 2001; Ifukube et al., 1991; Kay, 1964, 1974, 2000; Mihajlik & Guttermuth, 2001; Sohl-Dickstein et al., 2015; Waters & Abulula, 2007). Some of these devices are distance-measurement or localization devices; that is, they send out an ultrasonic pulse and then transform the incoming information into a secondary signal about distance and location, which is fed back to the user. Other devices are based on the idea that the signal should not be changed, but that the user's brain 'should do the work'. For example, the device by Sohl-Dickstein et al. (2015) sends out an ultrasonic emission, receives the echoes binaurally via artificial pinnae, and then simply down-samples the signal and sends this down-sampled (but otherwise 'raw') signal to the user via headphones. In this way, it is up to the user to extract the relevant information from the signal. Common to all these devices is that they do not require the person to make a sound; instead, the device produces the emission autonomously and feeds the resultant sound back to the user.

In the context of auditory processing, people typically show a phenomenon referred to as echo-suppression (Litovsky et al., 1999; Wallach et al., 1949). This term refers to a wide class of phenomena according to which, if two sounds are presented in rapid succession, the percept is dominated by the leading sound. As a consequence, the percept of the second sound is suppressed. This can improve speech intelligibility as well as localization of sound sources in conditions in which reverberations are present. Importantly, using a virtual auralization technique, it has been suggested that echo suppression is reduced during echolocation where people actively produce the emission by making mouth-clicks, as compared to echolocation where people do not actively produce the emission (Wallmeier et al., 2013). If this result also applied in 'natural' conditions, there would be implications for assistive technology. Specifically, since the use of assistive devices based on echolocation does not require people to actively make a sound, there is the chance that people might be at a disadvantage (i.e. their echolocation ability might be reduced) when using a device as compared to making their own emissions. Thus, here we tested whether echolocation performance in a simple object detection task was affected by the use of a head-mounted loudspeaker as compared to active clicking. Current devices based on echolocation provide sound to the listener using earphones. In our loudspeaker condition, however, we used only a loudspeaker, and no earphones. We did this to keep the natural hearing experience constant across conditions (i.e. HRTF, frequency response characteristics of the outer and inner ear, real-time listening).
We found that a sample of 27 sighted people new to echolocation did equally well or even better using the loudspeaker. We also found that two blind people with expertise in echolocation performed equally well with the speaker and when making their own clicks. Finally, we found that even though the two blind experts performed generally better than the sighted participants, the difference in performance was only significant when using mouth-clicks. In this way, using the speaker enabled sighted 'novices' to approach the performance of echo-experts. A correlational analysis of the acoustic features of mouth-clicks from a subset of our participants (N=16) showed that clicks that were more similar to the clicks made by the loudspeaker, and that therefore had higher intensity and higher peak frequencies, were associated with better performance in our experiment.

We discuss the results with respect to previous findings which suggested that echo suppression should be reduced (and echolocation therefore be enhanced) when people make their own clicks. We conclude that our results are encouraging for the consideration and development of assistive devices that exploit the principle of echolocation.
2. Method
All procedures were approved by the ethics board in the Department of Psychology at Durham University and followed the principles laid out by the World Medical Association in the Declaration of Helsinki and the BPS code of practice. Blind participants were given accessible versions of all documents. We obtained written informed consent from all participants.
2.1. Overview of the experiment
Sighted blindfolded and blind participants were asked to use click-based echolocation to determine whether there was a disk in front of them or not. The disk could be presented at two different distances (1m and 2m). Participants echolocated either using mouth-clicks or using clicks played through a head-worn loudspeaker.
2.2. Participants

Twenty-seven sighted and two blind participants took part in this experiment. Sighted participants (14 female; mean age: 29.1; SD: 10.1) reported normal or corrected-to-normal vision, normal hearing, and no prior experience with echolocation. Blind participants were both totally blind at the time of testing and reported using mouth-click based echolocation on a daily basis. (B1: male, 49 years at time of testing; enucleated in infancy because of retinoblastoma; reported having used echolocation for as long as he can remember. B2: male, 31 years at time of testing; lost sight gradually from birth due to glaucoma; since early childhood (approx. 3 years) only bright light detection; reported having used echolocation on a daily basis since he was 12 years old.) Participants volunteered to take part in the study and were compensated £6/hour or with participant pool credit.
2.3. Apparatus
The experiment was conducted in a sound-insulated and echo-acoustically dampened room (approx. 2.9m x 4.2m x 4.9m; noise-insulated room-inside-a-room construction, lined with acoustic foam wedges that effectively absorb frequencies above 315 Hz).

Participants were seated in the centre of the room on a height-adjustable chair facing the back of the room. In trials where an object was present, participants were presented with a 60cm-diameter disc made of polystyrene covered in aluminium foil, mounted on a metal pole (1cm diameter). In trials where an object was absent, participants were presented only with the 1cm-diameter metal pole (i.e. the pole from which the disc had been removed). The pole had a movable base to facilitate placing it at either 1m or 2m from the participant. Once participants were seated on the chair, its height was adjusted to match the height of the participant's ears to the height of the centre of the disk.

Throughout the experiment participants wore a blindfold and a head strap with a loudspeaker mounted on it (Visaton SC5.9 ND; 60g; 90mm (H) x 50mm (W) x 30mm (D)). The speaker was driven by an IBM Lenovo N500 laptop (Intel Pentium Dual CPU T3400 2.16 GHz, 3 GB RAM, 64-bit Windows 7 Enterprise SP1), connected to the speaker via a USB soundcard (Creative Sound Blaster X-Fi HD; Creative Technology Ltd., Creative Labs Ireland, Dublin, Ireland) and an amplifier (Dayton DTA-1), using Audacity software (Audacity 2.1.0). The speaker was placed on the forehead with its centre about 25cm from either ear.
2.4. Sound Characteristics
The sound file (wav-file) used to generate clicks via the speaker had been generated in Matlab R2012 (The Mathworks, Natick, MA) at 24 bit and 96 kHz. It was 12.1 seconds long and contained 17 individual clicks separated by 750 milliseconds of silence. Each individual click was a 4 kHz tone amplitude-modulated by a decaying exponential. An illustration of the waveform of an individual click as played through the speaker (recorded with a DPA SMK-SC4060 (with protective grid removed) and a TASCAM DR100-MKII at 24 bit and 96 kHz) is shown in Figure 1a. The click's frequency spectrum is shown in Figure 1b. We chose this specific sound for three reasons. First, it has been suggested previously that a sinusoid amplitude-modulated by a decaying exponential is a suitable model for the waveforms created by echolocators' mouth-clicks (Martinez-Rojas et al., 2009). Second, the duration and spectral frequency were within the range of durations and frequencies described previously for echolocation mouth-clicks (Schörnich et al., 2012). Finally, to the experimenters this sound phenomenologically resembled the mouth-clicks made by people who echolocate on a regular basis.
Figure 1 – (a) Waveform of an individual click as played through the speaker (recorded with DPA SMK-SC4060 with protective grid removed and TASCAM DR100-MKII at 24 bit and 96 kHz). (b) The click's frequency spectrum.
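For illustration, the following Matlab sketch generates a click train with the properties described above (4 kHz carrier, decaying-exponential envelope, 17 clicks at 750 ms intervals, 24 bit / 96 kHz). The decay constant and output file name are our own assumptions, chosen so that the click dies away within a few milliseconds, roughly matching the speaker-click duration reported in Table 1; the published stimulus may have used different values.

% Sketch: synthesize a click train similar to the speaker stimulus.
% ASSUMPTIONS: decay constant and file name are illustrative only.
fs    = 96000;                              % sampling rate (Hz)
f0    = 4000;                               % carrier frequency (Hz)
decay = 1000;                               % 1/s; envelope ~5% after ~3 ms (assumed)
t     = (0:round(0.010*fs)-1)'/fs;          % 10 ms of samples per click
click = exp(-decay*t) .* sin(2*pi*f0*t);    % 4 kHz tone, exponential envelope
gap   = zeros(round(0.750*fs), 1);          % 750 ms of silence
train = click;
for k = 1:16                                % 17 clicks in total
    train = [train; gap; click];
end
train = 0.9 * train / max(abs(train));      % normalize just below full scale
audiowrite('speaker_clicks.wav', train, fs, 'BitsPerSample', 24);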
The mouth-clicks people made varied from person to person, but all were brief transients. The rate of clicking was comparable across oral and speaker conditions. We recorded clicks for B1 and B2 as well as for 14 sighted participants. Unfortunately, we were not able to make recordings for the other sighted participants. Table 1 lists acoustic features of people's clicks. Clicks were analyzed in Matlab as follows: First, we detected individual clicks by detecting the peak value of the sound envelope, computed as the absolute value of the waveform. Peaks had to have a minimum separation from one another of 100ms. We then extracted the sound from 15ms prior to the peak up to 30ms after it. We then fitted exponentials of the form y = c·e^(-bt) to the envelope data, where y is the fitted envelope data point and t is the sample number. We fitted one curve to the 15ms of envelope data from the beginning to the peak, and one to the 30ms of data from the peak to the end. The fitted curve is maximal at the peak and drops off away from the peak; the height of the maximum depends on c, and the drop-off rate on b. The onset and offset of the sound were defined as the samples where the value of the fitted curve fell below 95% of its maximum value. Each click and curve-fit was checked audio-visually, and data were rejected if the extracted sound was not a click (e.g. coughing, background noise, swallowing). We then used onset and offset values to extract the click from the sound file and to estimate the duration, peak intensity, RMS intensity, and peak frequency (i.e. the frequency with maximum amplitude in the frequency spectrum) of clicks. We subsequently also computed a 'dissimilarity measure' (DM) that quantified how similar the acoustics of a participant's mouth-click were to the speaker click. To compute dissimilarity we first computed the difference between mouth-click and speaker click with respect to peak intensity, peak frequency and duration. We did not use RMS intensity because it was highly correlated with peak intensity, and because peak intensity by itself had a higher correlation with performance (compare Table 1 and see also 'Results'). We then normalized these difference values for each acoustic feature by their standard deviation across participants, and took the absolute values of these normalized differences. Finally, to get a single dissimilarity measure, we added the normalized absolute difference values together. We did this using only intensity and frequency (DM_I,F), and using intensity, frequency and duration (DM_I,F,D). A code sketch of this analysis pipeline is given below.
Subject | Duration (ms) | RMS Intensity (dB) | Peak Intensity (dB) | Peak Frequency (Hz) | DM_I,F | DM_I,F,D
Speaker | 6.2 (0.1)  | -9.9 (0)    | -4.4 (0)    | 3979 (4)    | --  | --
B1      | 5.3 (1.6)  | -10.2 (1.5) | -3.6 (1.4)  | 3487 (598)  | 0.8 | 1
B2      | 4.1 (1.3)  | -10.4 (1.6) | -3.6 (1.5)  | 2903 (378)  | 1.6 | 2.1
S1      | 11.6 (4.3) | -21.6 (2.3) | -15.9 (2)   | 1592 (138)  | 5.6 | 6.8
S2      | 11 (5.6)   | -24.8 (2.1) | -17.7 (1.5) | 2124 (1230) | 5.2 | 6.3
S3      | 5.5 (2.9)  | -21.7 (2.7) | -16.3 (2.3) | 1834 (503)  | 5.3 | 5.5
S4      | 6.2 (4.1)  | -21.1 (3.4) | -15 (2.5)   | 1361 (736)  | 5.7 | 5.7
S5      | 4.7 (2.2)  | -20.1 (2.4) | -14.7 (2)   | 2852 (2852) | 3.6 | 4
S6      | 7.2 (2)    | -18.6 (2.8) | -13.3 (2.4) | 1723 (131)  | 4.9 | 5.1
S7      | 6.4 (2.2)  | -20.3 (2.9) | -14.7 (2.6) | 2094 (272)  | 4.7 | 4.7
S8      | 6.6 (2)    | -18 (2.3)   | -12.6 (2.1) | 1472 (179)  | 5.1 | 5.2
S9      | 12.8 (1.5) | -8.8 (1.5)  | -3.4 (1.6)  | 1229 (19)   | 3.9 | 5.4
S10     | 6 (3.5)    | -22.7 (1.9) | -16.6 (1.5) | 3149 (316)  | 3.6 | 3.7
S11     | 16.1 (6.4) | -24.2 (1.8) | -17.2 (1.5) | 1315 (963)  | 6.2 | 8.5
S12     | 3.4 (1.4)  | -14.8 (2.8) | -9.7 (2.4)  | 1757 (839)  | 4.1 | 4.7
S13     | 18.1 (3.2) | -18.5 (3.1) | -13.2 (3.1) | 1015 (40)   | 5.8 | 8.5
S14     | 10.8 (3.7) | -18.3 (2.5) | -12.1 (2.2) | 1781 (226)  | 4.6 | 5.6

Table 1 – Acoustic features of clicks. For reference, features of the clicks made by the loudspeaker, computed using our methods, are given in the top row. Values are means; standard deviations are given in parentheses. The last two columns are values of the dissimilarity measure (DM) based on differences between mouth-clicks and the speaker clicks in terms of peak intensity (I), frequency (F) or duration (D).
2.5. Procedure
For sighted participants the experiment consisted of two sessions. In each session there were two click conditions (self-produced mouth-clicks and loudspeaker clicks). The order of click conditions was counterbalanced across participants. In each session, participants completed 48 trials per click condition, with 24 trials for each distance (1m or 2m). The object was absent for 12 of those 24 trials. The order of distances (1m vs. 2m) and object conditions (present vs. absent) was block-randomized. At the beginning of each session, the experimenter demonstrated how to make mouth-clicks. Participants then practiced until they produced adequate clicks for the task. Our criteria for adequate clicks were (a) that they were not 'double-clicks' (i.e. clicks created when the tongue is placed far back in the mouth and essentially creates two brief successive oral vacuum pulses, which sound like a deeper 'clucking'), and (b) that participants could make the clicks comfortably and sustain them throughout a 12-second trial at a rate similar to the speaker. Participants completed 2 practice trials per distance and presence condition and received feedback during these practice trials.

For the blind participants trained in echolocation the experiment consisted of only one session, during which all conditions (speaker vs. mouth-clicks; 1m vs. 2m; absent vs. present) were presented in block-randomized order.
At the beginning of each trial, participants occluded their ears with the tips of their index fingers. The experimenter then placed the pole and object. Subsequently, the experimenter stepped behind the participant and tapped them on the shoulder as a sign that they were allowed to unblock their ears. Participants then either produced tongue clicks or listened to the loudspeaker clicks (the click-train was triggered by the experimenter), depending on the condition they were in. Participants were given twelve seconds to listen to the clicks and echoes and to respond whether the object was placed in front of them ('present') or not ('absent'). If participants produced their own tongue clicks, the experimenter tapped them on the shoulder again as a sign that time was over for that trial. For the pre-recorded clicks, the end of the click-train signalled that time was over for that trial. If participants gave no response within those twelve seconds the experimenter requested a judgement. The response was recorded for each trial. As soon as participants had given a response, they blocked their ears again in order to start the next trial.

No feedback on response accuracy was given. Participants could take breaks as often as they wanted. One session took approximately 90 minutes to complete.
2.6. Data analysis
For sighted participants, we calculated the accuracy of each participant's responses for each distance (1m vs. 2m), click condition (self-produced click vs. loudspeaker click) and session (1 vs. 2). For the two blind participants we calculated accuracy for each distance and click condition. If participants had answered entirely at random, their accuracy in any condition would have been 0.5.

On the group level, data were analysed using a repeated measures ANOVA with 'session' (1 vs. 2), 'distance' (1m vs. 2m) and 'sound' (speaker vs. mouth-click) as repeated variables. For the two blind people trained in echolocation we analysed performance on an individual basis in comparison to the group.

To determine whether the acoustic features of clicks shown in Table 1 were related to performance we ran correlation analyses. For these we correlated individual acoustic features with participants' performance in mouth-click conditions, and we also ran a multiple linear regression analysis with individual acoustic features as predictors and participants' performance in mouth-click conditions as criterion (see the sketch below).
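As a rough sketch of this correlation/regression step (data layout and variable names are ours; corr and stepwisefit come from the Statistics and Machine Learning Toolbox):

% features: one row per participant, columns = duration, peak intensity,
% RMS intensity, peak frequency (as in Table 1); acc: overall accuracy in
% the mouth-click conditions. Both are assumed to be prepared beforehand.
[r, p] = corr(features, acc);                         % feature-wise correlations
[b, se, pval, inmodel] = stepwisefit(features, acc);  % stepwise linear regression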
3. Results
3.1. Group analysis – Sighted Participants

The main effect of 'session' was significant (F(1, 26) = 7.899, p = .009), indicating that participants were more accurate in detecting the target object during session 2 (M = .650, SD = 0.119) than during session 1 (M = .582, SD = 0.140). Moreover, results showed a significant main effect of sound (F(1, 26) = 8.172, p = .008), indicating that participants' detection accuracy was better when they used the loudspeaker (M = .653, SD = 0.161) than when they produced their own tongue clicks (M = .579, SD = 0.093). The analysis also revealed a significant main effect of distance (F(1, 26) = 19.346, p < .001), indicating that participants' accuracy in detecting the target object was higher when it was placed at 1m (M = .648, SD = 0.129) than at 2m (M = .584, SD = 0.109). In addition, the analysis showed a significant interaction between sound and distance (F(1, 26) = 5.549, p = .026) and a significant interaction between session, sound and distance (F(1, 26) = 4.398, p = .046). None of the other effects were significant.

We used paired t-tests (Bonferroni corrected) to follow up the significant interaction effects. The follow-up analysis for the sound x distance interaction revealed a significant difference between speaker and mouth-clicks at 1m (t(26) = -3.699; p = .001) but not at 2m (t(26) = -1.303; p = .204). Furthermore, we found that performance was significantly better at 1m than at 2m when using the loudspeaker (t(26) = 4.481; p < .001), but not when using mouth-clicks (t(26) = 1.51; p = .143). This pattern of results is illustrated in Figure 2.
Figure 2 – Performance split by distance and sound. Error bars represent SEM across participants. ** p < .01; *** p < .001
The follow-up analysis for the sound x distance x session interaction confirmed these results, but also showed that the effects of distance and sound source were only evident in the second session. Specifically, the difference between speaker and mouth-click at 1m was significant only in session 2 (t(26) = -4.234; p < .001), not in session 1 (t(26) = -1.542; p = .135); similarly, the better performance at 1m as compared to 2m with the loudspeaker was significant only in session 2 (t(26) = 5.228; p < .001), not in session 1 (t(26) = 1.925; p = .065). This pattern of results is illustrated in Figure 3.
Figure 3 – Performance split by session, distance and sound. Error bars represent SEM across participants. *** p < .001
3.2 Sighted vs. Blind Echolocation Experts
Performance of B1 and B2, plotted together with the data from the group of sighted participants (B1 and B2's single-session performance has been plotted for both session 1 and 2), is shown in Figure 4. It is evident that B1 performed perfectly in all conditions (note that for this reason the plot for B1 has two results superimposed); thus, B1's performance was unaffected by distance or sound (mouth-click vs. speaker). B2 shows slight variation, but a chi-square test applied to the distribution of correct responses was non-significant (χ²(1, N=91) = .01; p = .919), suggesting that B2's performance was also the same at 1m and 2m, and for mouth-clicks and speaker.
Figure 4 – Data for B1 and B2 plotted in comparison to data from sighted participants, split by session, distance and sound (i.e. data replotted from Figure 3). Note that the plot for B1 has two results superimposed. For results of significance tests between sighted participants and B1 and B2 please see Table 2.
It is also evident that B1 and B2's performance exceeded that of the sighted participants. To determine whether these performance differences were significant, we computed modified t-tests, which allow comparison of a single case to a group of subjects (Crawford & Howell, 1998; Crawford & Garthwaite, 2002). Using this procedure, we found that performance of sighted participants was always significantly different from both B1 and B2 when using tongue clicks. In contrast, performance was not significantly different when using the loudspeaker, with the one exception of B1 in session 1 at 2m. The test results are summarized in detail in Table 2, and a sketch of the test is given below.
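For reference, the Crawford and Howell (1998) test compares a single case x to a control sample with mean M, standard deviation S and size n via t = (x - M) / (S * sqrt((n + 1) / n)) with n - 1 degrees of freedom. A minimal Matlab sketch (function and variable names are ours; tcdf requires the Statistics and Machine Learning Toolbox):

function [t, p] = crawford_howell(x, group)
% Modified t-test (Crawford & Howell, 1998): single case vs. control group.
    n = numel(group);
    t = (x - mean(group)) / (std(group) * sqrt((n + 1) / n));
    p = 2 * (1 - tcdf(abs(t), n - 1));   % two-tailed p, df = n - 1
end

With n = 27 sighted controls the test has 26 degrees of freedom, matching the t(26) values reported in Table 2.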
Condition                  | B1                       | B2
Session 1, 1m, mouth-click | t(26) = 2.65; p = .013*  | t(26) = 2.65; p = .013*
Session 1, 2m, mouth-click | t(26) = 3.216; p = .003**| t(26) = 2.626; p = .014*
Session 2, 1m, mouth-click | t(26) = 2.364; p = .026* | t(26) = 2.364; p = .026*
Session 2, 2m, mouth-click | t(26) = 3.248; p = .003**| t(26) = 2.599; p = .015*
Session 1, 1m, loudspeaker | t(26) = 1.577; p = .127  | t(26) = 1.397; p = .174
Session 1, 2m, loudspeaker | t(26) = 2.205; p = .037* | t(26) = 1.764; p = .090
Session 2, 1m, loudspeaker | t(26) = 1.242; p = .225  | t(26) = 1.014; p = .320
Session 2, 2m, loudspeaker | t(26) = 1.952; p = .062  | t(26) = 1.518; p = .141

Table 2 – Results of modified t-tests comparing performance of B1 and B2 to performance of the sighted sample for each condition.
3.3 Acoustic features of Mouth-Clicks and Performance
To investigate the relationship between the acoustic features of mouth-clicks from a subset (N=16) of our participants and their performance, we adopted a correlation/regression approach. First, we computed individual correlations between each acoustic feature of the clicks and people's overall accuracy in mouth-click conditions (averaged across sessions and distances). Scatterplots are shown in Figure 5. All correlations were significant (duration: r = -.508, p = .045; peak intensity: r = .617, p = .011; RMS intensity: r = .575, p = .02; frequency: r = .589, p = .016). Subsequently, we used stepwise multiple linear regression to determine which variables, or variable combinations, contributed significantly. Using this approach we found that both peak intensity (standardized beta: .499; t(13) = 2.681; p = .019) and peak frequency (standardized beta: .461; t(13) = 2.478; p = .028) had significant positive relationships to overall performance, and that the overall fit was significant (F(2,13) = 8.949; p = .004; R² = 0.579). Thus, in our experiment people whose clicks were louder and had higher frequencies performed better when using mouth-clicks. When we removed B1 and B2 from the analysis the correlations became non-significant (duration: r = -.443, p = .113; peak intensity: r = .067, p = .819; RMS intensity: r = .097, p = .741; frequency: r = .115, p = .695).
Figure 5 – Scatterplots between individual acoustic variables and performance. Data from B1 and B2 are highlighted in the plots.
To investigate whether the similarity of a person's click to the loudspeaker click was related to how well they did in our experiment, we correlated the dissimilarity measures with overall accuracy. The correlation between participants' overall accuracy and DM_I,F was r = -.768 (p < .001), and for DM_I,F,D it was r = -.747 (p = .001). Scatterplots are shown in Figure 5. The data suggest that participants whose clicks were more similar to the loudspeaker click did better. As evident from the acoustic statistics shown in Table 1, clicks that were more similar to the speaker also had higher intensities and peak frequencies. When removing B1 and B2 from the analysis the correlations became non-significant (DM_I,F: r = -.206, p = .480; DM_I,F,D: r = -.378, p = .183).
4. Discussion
Here we tested how well people were able to detect an object in front of them based on acoustic echoes. They could use either mouth-clicks or a loudspeaker, and we tested both 27 sighted participants new to echolocation and two blind participants with experience in echolocation. We found that sighted participants new to echolocation generally did better when they used the loudspeaker as compared to mouth-clicks, and that this improvement was most pronounced in the second session and at 1m distance. Furthermore, we found that B1 and B2, both of whom had experience in echolocation, did equally well with mouth-clicks and the speaker. Finally, we found that even though B1 and B2 performed generally better than the sighted participants, the difference in performance was only significant when using mouth-clicks. In this way, using the speaker enabled sighted participants to approach the performance of B1 and B2. Across a subset of 16 of our participants (incl. B1 and B2), those participants whose mouth-clicks were more similar to the speaker clicks, and thus had higher peak frequencies and sound intensity, did better.
Echo-suppression
These results strongly suggest that the use of the loudspeaker did not impair echolocation performance in our experiment. Based on the idea that the active production of a click leads to reduced echo-suppression (Wallmeier et al., 2013), we might have expected the opposite pattern of results, namely that participants would have been worse at detecting objects via echoes when they used the speaker as compared to mouth-clicks: if mouth-clicks were to lead to reduced echo-suppression, people should do better in echolocation when making mouth-clicks. The fact that we did not observe an advantage of mouth-clicks in our study suggests that reduced echo-suppression during active echolocation, as proposed by Wallmeier and colleagues, did not drive performance in our experiment.

Nonetheless, our task design might have been unsuitable to measure effects of echo-suppression, because the sounds that people used in the speaker and mouth-click conditions were not identical (compare the Method section, where we provide data from click measurements). In fact, for the majority of participants whose clicks we measured, we found that their clicks were softer and/or had lower peak frequencies than the clicks made by the speaker. Thus, differences in performance between active clicking and the speaker in our study were confounded with differences in the acoustics of the emission itself. In this way, even though our results suggest that echo-suppression during active echolocation did not drive performance in our experiment, the design of our experiment does not invalidate the hypothesis put forth by Wallmeier et al. (2013).
Acoustic Features
The results of the analyses of acoustic features suggest (based on individual correlations) that the intensity, duration and frequency of clicks were related to performance in our experiment. The follow-up multiple linear regression analysis highlighted in particular the contributions of intensity and frequency. Yet, the correlations became non-significant when B1 and B2 were excluded from the analysis. The latter finding suggests that the correlations were driven largely by differences in acoustic click features and performance between the sighted participants on the one hand and B1 and B2 on the other.

In our study, perceptual echo-expertise and acoustic features of mouth-clicks are confounded, because B1 and B2 not only have clicks that are typically shorter, higher, and more intense than those of sighted participants, but they also have more experience in perceiving and processing echoes. Thus, we cannot be sure whether the correlations we observed are indicative of an association between performance and acoustic features of clicks, or of an association between performance and perceptual-cognitive echo-expertise. Nonetheless, there is previous research that is generally consistent with what we found in regards to frequency and intensity. For example, Rowan et al. (2013, 2015) found that people's perception of lateral position was better with high-pass (>2 kHz) as compared to low-pass (<2 kHz) stimuli. They also found that performance improved with increasing sound level. Note, however, that the stimuli they used were noise stimuli, not clicks. Interestingly, with respect to emission duration it has been reported that people tend to do better with longer sounds. For example, Rowan et al. (2013) found that people's ability to localize the lateral position of an object improved as stimulus duration increased from 10 to 400 ms. Similarly, Schenkman and Nilsson (2010) found that people's ability to determine the presence of an object increased as stimulus duration increased from 5 ms to 50 ms to 500 ms. In our experiment, however, shorter clicks were associated with better performance, which may seem at odds with these previous findings. This can potentially be explained considering that the magnitude of the duration differences we observed across participants was far below the duration differences used by Rowan et al. (2013) or Schenkman and Nilsson (2010). Furthermore, we did not use noise stimuli, but clicks. In sum, future work should investigate the issue of acoustic click features more systematically, and our results as well as the other work discussed above suggest that duration, frequency and intensity should be features to consider in this context.
Generalization to other Tasks

The task we used here was a simple object detection task. Future work is needed to determine how the results generalize to more complex scenarios and tasks.

Assistive Technology
The main goal of our work was to test whether people could successfully echolocate using a loudspeaker, and how this compares to using their own mouth-clicks. We addressed this question because of its high relevance to developers of assistive devices, which work on the basis of technology rather than of people making their own emissions. Here we found that the use of a loudspeaker enabled people who had no experience in echolocation to improve their performance as compared to when they used their own mouth-clicks, and that this advantage was most pronounced at 1m distance and in the second testing session. Most importantly, we also found that these 'echo-naïve' people, when using the loudspeaker, were able to perform similarly (i.e. not significantly differently) to two echolocation experts, i.e. people with longstanding expertise in echolocation. Finally, for these two echolocation experts the use of a loudspeaker did not make any difference, i.e. they performed equally well in all conditions. This suggests that technology as simple as a head-worn loudspeaker making audible clicks enables people to perform better than, or just as well as, when using mouth-clicks.
As mentioned in the introduction, various technological assistive devices for people with vision impairments have been developed based on the echolocation principle (Ciselet et al., 1982; Heyes, 1984; Hughes, 2001; Ifukube et al., 1991; Kay, 1964, 1974, 2000; Mihajlik & Guttermuth, 2001; Sohl-Dickstein et al., 2015; Waters & Abulula, 2007). The devices range in complexity and purpose, but all have in common that they generate the emission and feed a more or less processed signal back to the user. The advantage of technological assistive devices is that they can, for example, achieve greater spatial resolution by working in the ultrasonic range, but our current results suggest that even a tool as simple as a head-worn acoustic loudspeaker may facilitate echolocation. Natural echolocation, in turn, offers advantages in terms of ease of access, sturdiness, and low cost. Future research will determine the degree to which assistive technology may or may not supersede natural echolocation.
Conclusion
Our study is the first to directly compare people's performance in an echolocation task when they used their mouth or a head-worn loudspeaker to make clicks. Performance was either the same or better with the loudspeaker. This result is encouraging for the development of assistive technology based on echolocation.
References
Ciselet, V., Pequet, E., Richard, I., Veraart, C., & Meulders, M. (1982). Substitution sensorielle de la vision par l'audition au moyen de capteurs d'information spatial. Archives Internationales de Physiologie et de Biochimie, 90, P47.

Crawford, J. R., & Garthwaite, P. H. (2002). Investigation of the single case in neuropsychology: Confidence limits on the abnormality of test scores and test score differences. Neuropsychologia, 40(8), 1196-1208.

Crawford, J. R., & Howell, D. C. (1998). Comparing an individual's test score against norms derived from small samples. The Clinical Neuropsychologist, 12(4), 482-486.

Heyes, A. D. (1984). Sonic Pathfinder: A programmable guidance aid for the blind. Electronics and Wireless World, 90, 26-29.

Hughes, B. (2001). Active artificial echolocation and the nonvisual perception of aperture passability. Human Movement Science, 20, 371-400.

Ifukube, T., Sasaki, T., & Peng, C. (1991). A blind mobility aid modeled after echolocation of bats. IEEE Transactions on Biomedical Engineering.

Kay, L. (1964). An ultrasonic sensing probe as a mobility aid for the blind. Ultrasonics, 2(2), 53-59.

Kay, L. (1974). A sonar aid to enhance spatial perception of the blind: Engineering design and evaluation. Radio and Electronic Engineer, 44(11), 605-627.

Kay, L. (2000). Auditory perception of objects by blind persons, using a bioacoustic high resolution air sonar. The Journal of the Acoustical Society of America.

Kolarik, A. J., Cirstea, S., Pardhan, S., & Moore, B. C. (2014). A summary of research investigating echolocation abilities of blind and sighted humans. Hearing Research, 310, 60-68.

Litovsky, R. Y., Colburn, H. S., Yost, W. A., & Guzman, S. J. (1999). The precedence effect. The Journal of the Acoustical Society of America, 106(4), 1633-1654.

Martinez-Rojas, J. A., Hermosilla, J. A., Montero, R. S., & Espí, P. L. L. (2009). Physical analysis of several organic signals for human echolocation: Oral vacuum pulses. Acta Acustica united with Acustica, 95, 325-330.

Mihajlik, P., & Guttermuth, M. (2001). DSP-based ultrasonic navigation aid for the blind. Proceedings of the 18th IEEE Instrumentation and Measurement Technology Conference.

Rowan, D., Papadopoulos, T., Edwards, D., & Allen, R. (2015). Use of binaural and monaural cues to identify the lateral position of a virtual object using echoes. Hearing Research, 323, 32-39.

Rowan, D., Papadopoulos, T., Edwards, D., Holmes, H., Hollingdale, A., Evans, L., & Allen, R. (2013). Identification of the lateral position of a virtual object based on echoes by humans. Hearing Research, 300, 56-65.

Schenkman, B. N., & Nilsson, M. E. (2010). Human echolocation: Blind and sighted persons' ability to detect sounds recorded in the presence of a reflecting object. Perception, 39(4), 483-501.

Schörnich, S., Wiegrebe, L., & Nagy, A. (2012). Discovering your inner bat: Echo-acoustic target ranging in humans. Journal of the Association for Research in Otolaryngology, 13, 673-682.

Sohl-Dickstein, J., Teng, S., Gaub, B., Rodgers, C., Li, C., DeWeese, M., & Harper, N. (2015). A device for human ultrasonic echolocation. IEEE Transactions on Biomedical Engineering, 62, 1526-1534.

Stoffregen, T. A., & Pittenger, J. B. (1995). Human echolocation as a basic form of perception and action. Ecological Psychology, 7(3), 181-216.

Thaler, L. (2013). Echolocation may have real-life advantages for blind people: An analysis of survey data. Frontiers in Physiology, 4, 98.

Thaler, L., & Goodale, M. A. (in press). Echolocation in people: An overview. WIREs Cognitive Science.

Vercillo, T., Milne, J. L., Gori, M., & Goodale, M. A. (2015). Enhanced auditory spatial localization in blind echolocators. Neuropsychologia, 67, 35-40.

Wallach, H., Newman, E. B., & Rosenzweig, M. R. (1949). A precedence effect in sound localization. The Journal of the Acoustical Society of America, 21(4), 468.

Wallmeier, L., Geßele, N., & Wiegrebe, L. (2013). Echolocation versus echo suppression in humans. Proceedings of the Royal Society B: Biological Sciences, 280, 20131428.

Waters, D., & Abulula, H. (2007). Using bat-modelled sonar as a navigational tool in virtual environments. International Journal of Human-Computer Studies.