A mobile augmented reality audio system with binaural microphones

Robert Albrecht, Tapio Lokki, and Lauri Savioja
Aalto University School of Science, Department of Media Technology
P.O. Box 15400, FI-00076 Aalto
{robert.albrecht, tapio.lokki, lauri.savioja}@aalto.fi

ABSTRACT

This paper presents a microphone-hear-through augmented reality hardware system, consisting of insert headphones with binaural microphones, i.e., a microphone at each ear, together with a separate mixer and equalizer unit. The equalizer compensates for changes in ear canal resonances caused by the headphones blocking the ear canals, and is used to achieve a natural-sounding reproduction of the acoustic environment when playing the microphone signals through the headphones. The system can be used both for pure augmented reality applications and for any applications presenting audio through headphones while allowing the user to listen to the environment at the same time. Examples of both types of applications are given. The applications can utilize the binaural microphone signals and the user can adjust the level of the environmental sounds heard through the headphones.

Categories and Subject Descriptors
H.5.1 [Information Interfaces and Presentation]: Multimedia Information Systems - artificial, augmented, and virtual realities; audio input/output

General Terms
Design, Measurement

Keywords
Microphone-hear-through augmented reality, audio, binaural microphones

1. INTRODUCTION

While the original use of mobile phones is voice communication, new mobile applications typically present information through visual displays and accept input of information through touch interfaces. These types of interfaces, however, draw the attention of the user to the mobile device and away from the surroundings. Using audio for the presentation of information, and possibly also for the input of information, leaves the user’s sight as well as hands free to observe and interact with the environment.

In mobile applications, certain criteria can be defined for the usage of audio. The audio reproduction system must naturally be mobile, which implies that it should be small in size and wearable. In most applications, it is also desirable that the audio is heard only by the user, either not to disturb people in the surroundings, or because the information might be of a private nature. These criteria typically restrict the sound reproduction method to the use of headphones of some kind. Most types of headphones, however, attenuate sounds from the environment, and thus impair the perception of the acoustic environment.

Techniques allowing a user to hear the surrounding acoustic environment while adding virtual sounds to the auditory perception have been studied in the field of augmented reality. Two possible techniques are “acoustic-hear-through” (or hear-through) augmented reality and “microphone-hear-through” (or mic-through) augmented reality (AR) [4]. Hear-through AR can be achieved using, e.g., bone-conduction headphones, which do not attenuate sounds from the surroundings. The headset used in mic-through AR attenuates sounds from the surroundings, but has microphones located on both earpieces. The microphone signals are mixed with virtual sounds and played through the headphones. In this way, the acoustic environment is perceived unattenuated.

This paper discusses design issues related to mic-through AR. It presents an implementation of mic-through AR hardware consisting of insert headphones with binaural microphones, and a separate mixer and equalizer unit. The usability of the implementation is briefly evaluated. Finally, some applications that could take advantage of an AR system like this are presented.
2. ADVANTAGES OF MIC-THROUGH AUGMENTED REALITY

Although using bone-conduction headphones to render virtual sound sources would allow one to hear natural sounds unattenuated and without loss of quality, using headphones with binaural microphones has its advantages. Firstly, reproducing the natural sound environment using headphones allows the listener to either attenuate the level of natural sounds, when desired, or alternatively amplify those sounds. Secondly, the signals from the binaural microphones can be used for many interesting applications. The sound quality of bone-conduction headphones is also in general inferior to that of conventional headphones. Using headphones for displaying both the natural sound environment and sound events that augment it would probably result in a better integration of natural and virtual sounds.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. IWS ’11, August 30, 2011, Stockholm, Sweden. Copyright 2011 ACM 978-1-4503-0883-0/11/08 ...$10.00.

3. DESIGN ISSUES

The design of a mic-through AR system introduces some issues. The environmental sounds heard through the system must sound close to natural, especially if the system might be used for extended periods of time. For this reason, it must be taken into consideration, e.g., how the natural acoustics of the ear canal change when the ear canal entrance is blocked.

3.1 Ear Canal Resonances

The open ear canal acts as a quarter-wavelength resonator. The lowest resonance lies between 3 and 5.5 kHz [2], mainly depending on the length of the ear canal. When the ear canal entrance is blocked, e.g., by a headphone earpiece, the ear canal instead acts as a half-wavelength resonator, with the lowest resonance at a frequency double that of the quarter-wavelength resonance. When designing headphones, the missing quarter-wavelength resonance is often taken into account by adding a peak in the magnitude response, but the unnatural half-wavelength resonance is seldom compensated for [7]. Both these resonances should be taken into consideration when designing a natural-sounding mic-through AR system.

3.2 Leakage

Although insert headphones quite effectively attenuate sounds from the environment, some sound always leaks through and past the headphones. This leakage is larger at lower frequencies [8]. In a mic-through AR system, where the leaking natural sound is heard together with the microphone signals reproduced with headphones, this leakage would boost environmental sounds at low frequencies if not compensated for.

The leakage also introduces another problem. If the sound reproduced with the headphones is delayed relative to the leaking natural sound, a comb-filtering effect is heard. The latency introduced when processing the microphone signals and mixing them with the signals from an external source must therefore be kept at a minimum. For this reason, the task cannot be performed, for example, on a computer.

4. IMPLEMENTATION

The hardware used for mic-through AR presented here is based on earlier research [7, 8]. The system consists of a headset with binaural microphones and a separate mixer and equalizer unit, henceforth simply called the mixer.

A usability evaluation made with the original mixer concluded, among other things, that the mixer was unnecessarily large, and suggested that users should be able to adjust the level of the microphone signals reproduced with the headphones [9]. Based on the original mixer, a novel mixer was designed. Two main requirements were set. It should be smaller in size than the original mixer and thus easier to carry around. It should also connect to computers and mobile devices via universal serial bus (USB), allowing input of the microphone signals on, e.g., laptops, which often lack an analog stereo microphone input. The mixer should also draw its power through the USB connector, positively affecting both size and ease of operation compared with the original battery-powered mixer. Additionally, the level of the microphone signals passed to the headphones should be adjustable.

Ideally, the mixer would not be a separate unit, but integrated either into the headset or into the mobile phone. In such a design, the microphone signal level should be adjustable in software. The novel mixer is, however, aimed at providing a reasonably sized platform for testing mobile augmented reality audio applications, and is not intended as a prototype for a final product.

4.1 Headset

The headsets used for the implemented mic-through augmented reality system are modified Philips SHN2500 headphones (see Figure 1). For noise-cancelling purposes, these headphones have a microphone attached to each earpiece. The noise-cancelling electronics from the headphones have been removed, and instead, an additional 3.5 mm tip-ring-sleeve connector has been added for microphone signal output.

Figure 1: Earpieces of the Philips SHN2500 noise-cancelling headphones. These have microphones located at the end of the earpieces opposite to the headphone drivers.

4.2 Mixer and Equalizer

The mixer in the mic-through augmented reality audio system performs two tasks. First, it performs equalization of the microphone signals. Then, it mixes the equalized microphone signals with signals from external sources, and passes the mixed stereo signal to the headphones.

In the equalization circuit, there are two peak/notch filters with adjustable center frequency, quality factor, and gain. These can be used to add a peak to the magnitude response representing the missing quarter-wavelength resonance, and remove the peak caused by the half-wavelength resonance. Additionally, a high-pass filter with adjustable cutoff frequency can be used to attenuate low frequencies, which otherwise would be pronounced because of leakage. Insert headphones often also have a pronounced low-frequency response due to the pressure chamber principle [7]. To ensure a short enough delay for the signal travelling from the microphone to the headphone, the equalizer is implemented as an analog circuit.

Suitable parameters for the equalization filters of the original mixer were found by doing measurements with four test subjects. The transfer function from a loudspeaker in front of the test subject to a microphone 5 mm inside the entrance of the ear canal was measured both with the ear canal unoccluded and with the headset and mixer used without equalization. Figure 2 shows the transfer functions measured with one of the test subjects. The black line shows the magnitude response with the ear canal unoccluded and the grey line using the headset and mixer without equalization. Individual equalization curves were calculated for each test subject as the difference between the magnitude responses of the two measurements on the dB scale. Figure 3 shows a generic equalization curve based on the average of the individual curves, as measured from the mixer. Figure 4 shows transfer function measurements with the headset and mixer using individual equalization (grey line), compared with measurements done with the ear canal unoccluded (black line).

Figure 5: The mixer of the augmented reality audio system. On the left side of the mixer are the headphone output, the auxiliary and microphone inputs, and the USB connector. On the right side, there are two potentiometer knobs for adjusting the level of the microphone signals passed to the headphones. The dimensions of the box, excluding knobs and terminals, are 73 mm × 50 mm × 25 mm.
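The calculation of the equalization curves described in Section 4.2 can be illustrated with a minimal sketch. It assumes that the two magnitude responses of each subject are available in dB at the same frequency points. The function names and the numeric values are hypothetical placeholders chosen for illustration, and they are not measured data from [7].

```python
def individual_curve_db(unoccluded_db, occluded_db):
    """Per-subject equalization curve: unoccluded response minus the
    response with headset and mixer (no equalization), both in dB."""
    if len(unoccluded_db) != len(occluded_db):
        raise ValueError("responses must share the same frequency points")
    return [u - o for u, o in zip(unoccluded_db, occluded_db)]


def generic_curve_db(individual_curves):
    """Generic curve: average of the individual curves at each frequency."""
    count = len(individual_curves)
    return [sum(values) / count for values in zip(*individual_curves)]


# Placeholder magnitude responses (dB) at three arbitrary frequency points.
subject_a = individual_curve_db([0.0, 10.0, 2.0], [1.0, 2.0, 9.0])
subject_b = individual_curve_db([0.0, 12.0, 3.0], [0.0, 2.0, 8.0])
generic = generic_curve_db([subject_a, subject_b])
print(subject_a, subject_b, generic)
```

A positive value in a curve corresponds to a frequency region where the equalizer would have to add gain (such as the missing quarter-wavelength resonance), and a negative value to a region where it would have to attenuate (such as the half-wavelength resonance).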
The mixer also contains an integrated circuit performing the task of a USB sound card. The sound card allows input of the microphone signals to a computer, and output from the computer to the headphones via the mixer. The mixer also has an auxiliary analog stereo input. The equalized microphone signals are thus mixed with the signals input from an external source, either via USB or analog input, or both.

The analog equalization circuit receives power through the USB connector. The mixer is shown in Figure 5. Two potentiometer knobs allow adjusting the level of the microphone signals passed to the headphones. A block diagram of the mixer is shown in Figure 6.

Figure 2: The transfer function measured from a loudspeaker in front of a test subject to a microphone inside the subject’s ear canal. The black line is the magnitude response with the ear canal unoccluded, and the grey line when using the headset and mixer without equalization. Adapted from [7].

Figure 3: A generic equalization curve based on measurements with all four test subjects. The curve was measured from the mixer. Adapted from [7].

Figure 4: The transfer function measured from a loudspeaker in front of a test subject to a microphone inside the subject’s ear canal. The black line is the magnitude response with the ear canal unoccluded (the same as in Figure 2), and the grey line when using the headset and mixer with individual equalization. Adapted from [7].

Figure 6: Block diagram of the mixer. On the left side are the analog microphone and auxiliary inputs. With the USB audio chip (at the bottom) the mixer can function as a USB audio device, allowing digital auxiliary input and microphone output to platforms supporting such devices. The USB connector also provides the mixer with a 5 V operating voltage. The equalizer (at the top) provides natural-sounding reproduction of the microphone signals through the headphones. The level of the equalized microphone signals can be adjusted, and the signals are mixed with any signals from the auxiliary inputs, before they are passed to the headphones.

5. USABILITY

For the usability study, the mixers were powered by 1150 mAh rechargeable batteries supplying the required 5 V operating voltage through a USB socket. Three test users, of whom two were authors of this paper and the third was also familiar with the previous mixer, tested the mixer and headset in different situations for several hours in total and reported their experiences with the system.

5.1 Naturalness

Two of the users commented that the timbre of the system was not entirely natural, but that it had a bit too much emphasis on mid frequencies or higher frequencies. The colouration seemed to amplify certain sounds like, e.g., keyboard strokes. In general, however, the audio quality was considered good and adaptation to the sound might not take more than a few minutes. One user said that after maybe one hour you might even forget that you are wearing the headset.

Localization of sound sources worked well with the headset. Noise sources like laptop fans might however be problematic to localize. Normally, there was no noticeable distortion of sounds, but listening to music from loudspeakers at loud levels caused distortion.

5.2 Noise

The noise level from the microphones and mixer is somewhat high, masking low-level sounds from the environment. When the level of the microphone signal is kept at a normal level, the noise is not particularly disturbing. When the level is increased, the noise begins to disturb. One user complained about noise caused by movement of the headset cables.

5.3 Communication

Two of the users reported wanting to take off the headset when talking to other people. One user speculated that because you have your ears plugged while talking, you assume that you cannot understand other people. He thought that wearing the headset might in reality affect speech intelligibility less than one would think. Overall, however, speech intelligibility was considered good and the most problematic thing with talking to other people might be that your own voice is amplified and sounds strange. It might also seem impolite to talk to people while wearing a headset.

While using the headset and mixer, talking on the phone was possible, but the fact that the headset earpiece protrudes slightly from the ear was found problematic. This makes it difficult to put the phone in the correct position where the phone speaker is placed against the microphone of the headset. Naturally, a headset like this should be connected directly to the mobile phone and used like any hands-free headset.

5.4 Mobility and Comfort

The test setup was somewhat difficult to wear, because it consisted of the mixer and the separate battery with quite a long cable in between. It would definitely be much easier to use the mixer if it was, e.g., integrated into the headset, also reducing the number of cables required.

The current mixer has two separate potentiometers for controlling the amplification of the microphone signal passed to each ear separately. A single stereo potentiometer for adjusting both channels at the same time would be much more convenient, possibly with separate balance adjustment. The potentiometer knobs were found to be unnecessarily large and easily adjusted by mistake, e.g., when keeping the mixer in the pocket.

All of the users reported some discomfort wearing the insert-type headphones. One user reported feeling slight pain after 30 minutes of usage.

5.5 On the Go

When riding a bike, wind noise was found disturbing when using the headset and mixer. When walking outside, wind noise was generally not a problem. One of the users reported that he did not hear bikes approaching from behind. This might have been because of wind noise, but it still felt that they should have been heard. Increasing the level of the microphone signal might help, but would also increase the level of noise. When driving a car, the headset felt transparent and comfortable.

When riding a bus, especially when listening to music, the possibility to attenuate environmental sounds played through the headphones completely was seen as a good option. In this case the headphones block sounds from the outside quite nicely. On the other hand, it might be nice to hear environmental sounds while walking and listening to music, allowing you to be aware of your surroundings.

5.6 The Occlusion Effect

The earpieces of the headset occlude the ear canals, amplifying especially low-frequency sounds which are conducted through bone and tissue, or through the headset, to the ear canal and cannot escape through the ear canal opening. This causes, e.g., the headset wearer’s own voice to sound boomy and strange. Sounds of eating and drinking are also conducted to the ear canal through bones and flesh and thus are amplified when wearing the headset.

6. APPLICATIONS

The applications that can take advantage of the presented AR system can be divided into two categories. The first category consists of what can be called pure AR applications, where the real environment is augmented with virtual objects, which depend on the real environment and add information related to it. These applications typically require information about the user’s location and head orientation with respect to real-world points of interest, to be able to render relevant information at the correct locations. Alternatively, these applications could extract and analyze information from the binaural microphone signals. The second category of applications only uses the system as a means to present virtual objects without hindering perception of the real environment at the same time. An example of each category is given below.
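As a side illustration of the geometry that the first category of applications relies on, the following minimal sketch computes the direction of a point of interest relative to the user’s head. It assumes a flat local coordinate system (x towards east, y towards north, in metres) and a head yaw given in degrees clockwise from north. The function name and these conventions are hypothetical and are not taken from any of the cited systems.

```python
import math


def relative_azimuth_deg(user_xy, head_yaw_deg, poi_xy):
    """Azimuth of a point of interest relative to the facing direction.

    Returns degrees in [-180, 180): 0 is straight ahead, positive is to
    the right of the user, negative is to the left.
    """
    dx = poi_xy[0] - user_xy[0]
    dy = poi_xy[1] - user_xy[1]
    # Compass bearing from the user to the point (0 = north, 90 = east).
    bearing = math.degrees(math.atan2(dx, dy))
    # Subtract the head yaw and wrap the result into [-180, 180).
    return (bearing - head_yaw_deg + 180.0) % 360.0 - 180.0


# A point 10 m east of a user who faces north lies 90 degrees to the right.
east_of_user = relative_azimuth_deg((0.0, 0.0), 0.0, (10.0, 0.0))
# After the user turns to face east, the same point is straight ahead.
after_turn = relative_azimuth_deg((0.0, 0.0), 90.0, (10.0, 0.0))
print(east_of_user, after_turn)
```

The resulting angle is the kind of input a 3D sound renderer would need in order to place a virtual source at the location of the real-world object.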
More application scenarios have been presented in previous research [3, 5, 6].

An example of the first category would be a virtual museum guide [10]. Using 3D sound, the guide presents the exhibit that the visitor is focusing his or her attention on, based on location and head orientation. The guide also makes recommendations about other exhibits that the visitor might find interesting, based on which types of exhibits the visitor has shown interest in. The same concept could be used, e.g., for a virtual tourist guide.

An example of the second category is the Mobile Augmented Messaging application [1]. The application is used for sharing short audio messages between members of groups. Messaging happens asynchronously, i.e., users can browse through and listen to old messages at any time, in addition to listening to new messages when they arrive. Thus, it is an audio equivalent to text chats and discussion forums. A mic-through augmented reality system is not necessary for the application, but allows users to receive messages and listen to the environment at the same time, as well as record messages using the microphones in the headset. In situations where recording of audio messages is not possible or desirable, messages can be input as text and converted to audio using text-to-speech synthesis. The application also includes the possibility to do binaural recordings and post these on Facebook.

If a user belongs to several groups, messages from different groups are played back from different directions in front of the user, using head-related transfer functions. Test users found this to be a convenient way to identify which group a new message belonged to in some cases. However, the ability to separate different angles was quickly reduced as the number of groups increased.
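The shrinking angular separation between groups can be illustrated with a small sketch. The source does not state which directions the application assigns to the groups, so the even spacing over a 180-degree frontal arc used here is purely an assumption, and the function names are hypothetical. Rendering with head-related transfer functions is not shown.

```python
def group_azimuths(num_groups, arc_deg=180.0):
    """Evenly spaced playback directions in front of the listener.

    Angles are in degrees: 0 is straight ahead, negative is to the left.
    The arc endpoints themselves are left unused.
    """
    if num_groups < 1:
        raise ValueError("at least one group is required")
    step = arc_deg / (num_groups + 1)
    return [-arc_deg / 2 + step * (i + 1) for i in range(num_groups)]


def spacing_deg(num_groups, arc_deg=180.0):
    """Angular distance between neighbouring groups for the same layout."""
    return arc_deg / (num_groups + 1)


three_groups = group_azimuths(3)
five_groups = group_azimuths(5)
print(three_groups, spacing_deg(3))
print(five_groups, spacing_deg(5))
```

Under this assumed layout, going from three to five groups reduces the spacing between neighbouring directions from 45 to 30 degrees, which is consistent with the observation that directions become harder to tell apart as groups are added.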
In addition to voice message input being faster than text input and leaving the user’s hands and vision free for other tasks, test users found the expressiveness of spoken messages to be one of the biggest advantages. Browsing through old voice messages was considered problematic and slow compared with text chats. The test users also thought it was easier and less confusing to participate in several discussions at the same time with text chats.

Compared with, e.g., a real-time audio teleconference, the asynchronous communication in this application does not allow for a natural flow of conversation. This was, however, not considered particularly problematic by the test users. In some cases, it might actually be an advantage, allowing all users to say what they want to say without having to wait for their turn.

This kind of application is ideally suited for push-to-talk-type communication. The users can constantly wear headsets and send each other audio messages. Other users hear these messages directly, but also have the possibility to listen to earlier messages again.

7. CONCLUSIONS

Using mic-through techniques for augmented reality applications provides the benefit of allowing users to adjust the level of sounds from the environment, at the cost of slightly impaired perception of the acoustic environment, compared with hear-through techniques. Also, the available binaural microphone signals make many interesting applications possible. The implemented system with headset and mixer provides a useful system for testing AR applications on platforms supporting USB audio devices. The main areas for improvement are reduction of noise and colouration of the sound, as well as making the system integrate better with mobile phones.

8. ACKNOWLEDGMENTS

The research leading to these results has received funding from Nokia Research Center, the Academy of Finland, project no. [218238] and the European Research Council under the European Community’s Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement no. [203636].

9. REFERENCES

[1] R. Albrecht. Messaging in mobile augmented reality audio. Master’s thesis, Aalto University School of Electrical Engineering, 2011.
[2] D. Hammershøi and H. Møller. Sound transmission to and within the human ear canal. The Journal of the Acoustical Society of America, 100(1):408–427, 1996.
[3] A. Härmä, J. Jakka, M. Tikander, M. Karjalainen, T. Lokki, H. Nironen, and S. Vesa. Techniques and applications of wearable augmented reality audio. In 114th Audio Engineering Society Convention, 2003. Preprint no. 5768.
[4] R. Lindeman, H. Noma, and P. de Barros. Hear-through and mic-through augmented reality: Using bone conduction to display spatialized audio. In Proc. ISMAR 2007, pages 173–176. IEEE, 2007.
[5] T. Lokki, H. Nironen, S. Vesa, L. Savioja, A. Härmä, and M. Karjalainen. Application scenarios of wearable and mobile augmented reality audio. In 116th Audio Engineering Society Convention, 2004. Preprint no. 6026.
[6] M. Peltola, T. Lokki, and L. Savioja. Augmented reality audio for location-based games. In 35th International Conference of the Audio Engineering Society, 2009.
[7] V. Riikonen. User-related acoustics in a two-way augmented reality audio system. Master’s thesis, Helsinki University of Technology, 2008.
[8] V. Riikonen, M. Tikander, and M. Karjalainen. An augmented reality audio mixer and equalizer. In 124th Audio Engineering Society Convention, 2008. Preprint no. 7372.
[9] M. Tikander. Usability issues in listening to natural sounds with an augmented reality audio headset. Journal of the Audio Engineering Society, 57(6):430–441, 2009.
[10] A. Zimmermann and A. Lorenz. LISTEN: a user-adaptive audio-augmented museum guide. User Modeling & User-Adapted Interaction, 18(5):389–416, 2008.