Near-field vector signal enhancement

US 20080152167A1

(19) United States
(12) Patent Application Publication    (10) Pub. No.: US 2008/0152167 A1
     Taenzer                           (43) Pub. Date: Jun. 26, 2008

(54) NEAR-FIELD VECTOR SIGNAL ENHANCEMENT

(75) Inventor: Taenzer, Los Altos, CA

Correspondence Address:
THELEN REID BROWN RAYSMAN & STEINER LLP
P.O. BOX 640640

(73) Assignee: STEP Communications Corporation, San Jose, CA (US)

(21) Appl. No.: 11/645,019

(22) Filed: Dec. 22, 2006

Publication Classification

(51) Int. Cl.
     H04B 15/00 (2006.01)

(52) U.S. Cl. ..................................................... 381/94.2

(57) ABSTRACT

Near-field sensing of wave signals, for example for application in headsets and earsets, is accomplished by placing two or more spaced-apart microphones along a line generally between the headset and the user's mouth. The signals produced at the output of the microphones will disagree in amplitude and time delay for the desired signal, the wearer's voice, but will disagree in a different manner for the ambient noises. Utilization of this difference enables recognizing, and subsequently ignoring, the noise portion of the signals and passing a clean voice signal. A first approach involves a complex vector difference equation applied in the frequency domain that creates a noise-reduced result. A second approach creates an attenuation value that is proportional to the complex vector difference, and applies this attenuation value to the original signal in order to effect a reduction of the noise. The two approaches can be applied separately or combined.

[Drawing sheets 1-13 (FIGS. 1-16), Jun. 26, 2008, not reproduced. Recoverable labels:
FIG. 1: headset housing 16 with extension portion 14, front microphone 10 and rear microphone 12.
FIG. 1A: per-channel chain of FILTER (13), MATCH (15), AMP (11) and A/D converter (22, 24).
FIG. 2: FRAME, WINDOW & DFT blocks (20), PROCESSING, then IDFT, OVERLAP & ADD.
FIGS. 5 and 6: sensitivity/attenuation (dB) vs. distance (m); FIG. 6 plotted at arrival angles of 0, 30, 60, 120, 150 and 180 degrees.
FIG. 7: directionality pattern at d = 0.127 m for 300, 500, 1000 and 2000 Hz.
FIGS. 8 and 9: attenuation (dB) vs. input signal magnitude difference (dB).
FIG. 13: table of Lo_lim and Hi_lim values vs. bin index.
FIGS. 14A and 14B: input signal magnitude difference (dB) vs. bin frequency (Hz), 30 Hz to 10,000 Hz.]

NEAR-FIELD VECTOR SIGNAL ENHANCEMENT

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] (Not Applicable)

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The invention relates to near-field sensing systems.

[0004] 2. Description of the Related Art

[0005] When communicating in noisy ambient conditions, a voice signal may be contaminated by the simultaneous pickup of ambient noises. Single-channel noise reduction methods are able to provide a measure of noise removal by using a priori knowledge about the differences between voice-like signals and noise signals to separate and reduce the noise. However, when the "noise" consists of other voices or voice-like signals, single-channel methods fail. Further, as the amount of noise removal is increased, some of the voice signal is also removed, thereby changing the purity of the remaining voice signal; that is, the voice becomes distorted. Further, the residual noise in the output signal becomes more voice-like. When used with speech recognition software, these defects decrease recognition accuracy.

[0006] Array techniques attempt to use spatial or adaptive filtering to: a) increase the pickup sensitivity to signals arriving from the direction of the voice while maintaining or reducing sensitivity to signals arriving from other directions, b) determine the direction toward noise sources and steer beam-pattern nulls toward those directions, thereby reducing sensitivity to those discrete noise sources, or c) deconvolve and separate the many signals into their component parts. These systems are limited in their ability to improve signal-to-noise ratio (SNR), usually by the practical number of sensors that can be employed. For good performance, large numbers of sensors are required. Further, null steering (Generalized Sidelobe Canceller, or GSC) and separation (Blind Source Separation, or BSS) methods require time to adapt their filter coefficients, thereby allowing significant noise to remain in the output during the adaptation period (which can be many seconds). Thus, GSC and BSS methods are limited to semi-stationary situations.

[0007] A good description of the prior art pertaining to noise cancellation/reduction methods and systems is contained in U.S. Pat. No. 7,099,821 by Visser and Lee, entitled "Separation of Target Acoustic Signals in a Multi-Transducer Arrangement." This reference covers not only at-ear, but also remote (off-ear) voice pick-up technologies.

[0008] Prior art technologies for at-ear voice pickup systems recently have been driven by the availability and public acceptance of wired and wireless headsets, primarily for use with cellular telephones. A boom microphone system, in which the microphone's sensing port is located very close to the mouth, long has been a solution that provides good performance due to its close proximity to the desired signal. U.S. Pat. No. 6,009,184 by Tate and Wolff, entitled "Noise Control Device for a Boom Mounted Noise-canceling Microphone," describes an enhanced version of such a microphone. However, demand has driven a reduction in the size of headset devices, so that a conventional prior art boom microphone solution has become unacceptable.

[0009] Current at-ear headsets generally utilize an omni-directional microphone located at the very tip of the headset closest to the user's mouth. In current devices this means that the microphone is located 3" to 4" away from the mouth, and the amplitude of the voice signal is consequently reduced by the 1/r spreading effect. However, noise signals, which generally arrive from distant locations, are not reduced, so the result is a degraded signal-to-noise ratio (SNR).

[0010] Many methods have been proposed for improving SNR while preserving the reduced size and more distant from-the-mouth location of modern headsets. Relatively simple first-order microphone systems that employ pressure-gradient methods, either as "noise canceling" microphones or as directional microphones (e.g. U.S. Pat. Nos. 7,027,603; 6,681,022; 5,363,444; 5,812,659; and 5,854,848), have been employed in an attempt to mitigate the deleterious effects of the at-ear pick-up location. These methods introduce additional problems: the proximity effect, exacerbated wind noise sensitivity and electronic noise, frequency response coloration of far-field (noise) signals, the need for equalization filters, and, if implemented electronically with dual microphones, the requirement for microphone matching. In practice, these systems also suffer from on-axis noise sensitivity that is identical to that of their omni-directional brethren.

[0011] In order to achieve better performance, second-order directional systems (e.g. U.S. Pat. No. 5,473,684 by Bartlett and Zuniga, entitled "Noise-canceling Differential Microphone Assembly") have also been attempted, but the defects common to first-order systems are also greatly magnified, so that wind noise sensitivity, signal coloration and electronic noise, in addition to equalization and matching requirements, make this approach unacceptable.

[0012] Thus, adaptive systems based upon GSC, BSS or other multi-microphone methods also have been attempted with some success (see for example McCarthy and Boland, "The Effect of Near-field Sources on the Griffiths-Jim Generalized Sidelobe Canceller," Institution of Electrical Engineers, London, IEE conference publication ISSN 0537-9989, CODEN IECPB4, and U.S. Pat. Nos. 7,099,821; 6,799,170; 6,691,073; and 6,625,587). Such systems suffer from increased complexity and cost, multiple sensors requiring matching, slow response to moving or rapidly changing noise sources, incomplete noise removal, and voice signal distortion and degradation. Another drawback is that these systems operate only with relatively clean (positive SNR) input signals, and actually degrade the signal quality when operating with poor (negative SNR) input signals. The voice degradation often interferes with Automatic Speech Recognition (ASR), a major application for such headsets.

[0013] Another multi-microphone noise reduction technology applicable to headsets is disclosed by Luo, et al. in U.S. Pat. No. 6,668,062, entitled "FFT-based Technique for Adaptive Directionality of Dual Microphones." In this method, developed for use in hearing aids, two microphones are spaced approximately 10 cm apart within a behind-the-ear device. The microphone signals are converted to the frequency domain and an output signal Z(ω) is created from them, where X(ω), Y(ω) and Z(ω) are the frequency-domain transforms of the time-domain input signals x(t) and y(t) and of the time-domain output signal z(t). In hearing aids the goal is to help the user to clearly hear the conversations of other individuals and also to hear environmental sounds, but not to hear the user him/herself. Thus, this technology is designed to clarify far-field sounds. Further, this technology operates to produce a directional sensitivity pattern that "cancels noise . . . when the noise and the target signal are not in the same direction from the apparatus." The downsides are that this technology significantly distorts the desired target signal and requires excellent microphone array element matching.

[0014] Others have developed technologies specifically for near-field sensing applications. For example, Goldin (U.S. Publication No. 2006/0013412 A1 and "Close Talking Autodirective Dual Microphone," AES Convention, Berlin, Germany, May 8-11, 2004) has proposed using two microphones with controllable delay-and-add technology to create a set of first-order, narrow-band pick-up beam patterns that optimally steer the beams away from noise sources. The optimization is achieved through real-time adaptive filtering, which creates the independent control of each delay using LMS adaptive means. This scheme has also been utilized in modern DSP-based hearing aids. Although essentially GSC technology, for near-field voice pick-up applications this system has been modified to achieve non-directional noise attenuation. Unfortunately, when there is more than a single noise source at a particular frequency, this system cannot optimally reduce the noise. In real situations, even if there is only one physical noise source, room reverberations effectively create additional virtual noise sources with many different directions of arrival, but all having identical frequency content, thereby circumventing this method's ability to operate effectively. In addition, by being adaptive, this scheme requires substantial time to adjust in order to minimize the noise in the output signal. Further, the rate of noise attenuation vs. distance is limited and the residual noise in the output signal is highly colored, among other defects.

[0015] In accordance with one embodiment described herein, there is provided a voice sensing method for significantly improved voice pickup in noise, applicable for example in a wireless headset. Advantageously it provides a clean, non-distorted voice signal with excellent noise removal, wherein the small residual noise is not distorted and retains its original character. Functionally, a voice pickup method for better selecting the user's voice signal while rejecting noise signals is provided.

[0016] Although discussed in terms of voice pickup (i.e. acoustic, telecom and audio), the system herein described is applicable to any wave energy sensing system (wireless radio, optical, geophysics, etc.) where near-field pick-up is desired in the presence of far-field noises/interferers. An alternative use gives superior far-field sensing for astronomy, gamma ray, medical ultrasound, and so forth.

[0017] Benefits of the system disclosed herein include attenuation of far-field noise signals at a rate twice that of prior art systems while maintaining flat frequency response characteristics. It provides clean, natural voice output, highly reduced noise, high compatibility with conventional transmission channel signal processing technology, natural sounding low residual noise, excellent performance in extreme noise conditions (even in negative SNR conditions), and instantaneous response (no adaptation-time problems), yet demonstrates low compute power, memory and hardware requirements for low cost applications.

[0018] Acoustic voice applications for this technology include mobile communications equipment such as cellular handsets and headsets, cordless telephones, CB radios, walkie-talkies, police and fire radios, computer telephony applications, stage and PA microphones, lapel microphones, computer and automotive voice command applications, intercoms and so forth. Acoustic non-voice applications include sensing for active noise cancellation systems, feedback detectors for active suspension systems, geophysical sensors, infrasonic and gunshot detector systems, underwater warfare and the like. Non-acoustic applications include radio and radar, astrophysics, medical PET scanners, radiation detectors and scanners, airport security systems and so forth.

[0019] The system described herein can be used to accurately sense local noises, so that these local noise signals can be removed from mixed signals that contain desired far-field signals, thereby obtaining clean sensing of the far-field signals.

[0020] Yet another use is to reverse the described attenuation action so that near-field voice signals are removed and only the noise is preserved. This resulting noise signal, along with the original input signals, can then be sent to a spectral subtraction, Generalized Sidelobe Canceller, Wiener filter, Blind Source Separation system or other noise removal apparatus where a clean noise reference signal is needed for accurate noise removal.

[0021] The system does not change the purity of the remaining voice while improving upon the signal-to-noise ratio (SNR) improvement performance of beamforming-based systems, and it adapts much more quickly than do GSC or BSS methods. With these other systems, SNR improvements are still below 10 dB in most high noise applications.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

[0022] Many advantages of the present invention will be apparent to those skilled in the art with a reading of this specification in conjunction with the attached drawings, wherein like reference numerals are applied to like elements, and wherein:

[0023] FIG. 1 is a schematic diagram of a type of a wearable near-field audio pick-up device;

[0024] FIG. 1A is a block diagram illustrating a general pick-up process;

[0025] FIG. 2 is a generalized block diagram of a system for accomplishing noise reduction;

[0026] FIG. 3 is a block diagram showing processing details;

[0027] FIG. 4 is a block diagram of a signal processing portion of a direct equation approach;

[0028] FIG. 5 shows on-axis sensitivity relative to the mouth sensitivity vs. distance from the headset;

[0029] FIG. 6 shows the attenuation response of a system at seven different arrival angles from 0° to 180°;

[0030] FIG. 7 is a plot of the directionality pattern of a system using two omni-directional microphones, measured at 300, 500, 1000 and 2000 Hz;

[0031] FIG. 8 shows the attenuation created by Equation (7) as a function of the magnitude difference between the front microphone signal and the rear microphone signal for the 3 dB design example;


[0032] FIG. 9 shows the attenuation characteristics produced by Equations (8) and (9) as compared with that produced by Equation (7);

[0033] FIG. 10 shows a block diagram of how an attenuation technique can be implemented without the need for the real-time calculation of Equation (7);

[0034] FIG. 11 shows a block diagram of a processing method applying full attenuation to the output signal;

[0035] FIG. 12 demonstrates a block diagram of a calculation approach for limiting the output to expected signals;

[0036] FIG. 13 is an example limit table;

[0037] FIGS. 14A and 14B show a set of limits plotted versus frequency;

[0038] FIG. 15 shows a graph of sensitivity as a function of the source distance away from the microphone array along the major axis, and that of a prior art system; and

[0039] FIG. 16 shows the data of FIG. 15 graphed on a logarithmic distance scale to better demonstrate the improved performance.

DETAILED DESCRIPTION OF THE INVENTION

[0040] Embodiments of the present invention are described herein in the context of near-field pick-up systems. Those of ordinary skill in the art will realize that the following detailed description of the present invention is illustrative only and is not intended to be in any way limiting. Other embodiments of the present invention will readily suggest themselves to such skilled persons having the benefit of this disclosure. Reference will now be made in detail to implementations of the present invention as illustrated in the accompanying drawings. The same reference indicators will be used throughout the drawings and the following detailed description to refer to the same or like parts.

[0041] In the interest of clarity, not all of the routine features of the implementations described herein are shown and described. It will, of course, be appreciated that in the development of any such actual implementation, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the art having the benefit of this disclosure.

[0042] The system described herein is based upon the use of a controlled difference in the amplitude of two detected signals in order to retain, with excellent fidelity, signals originating from nearby locations while significantly attenuating those originating from distant locations. Although not constrained to audio and sound detection apparatus, presently the best application is in head-worn headsets, in particular wireless devices known as Bluetooth® headsets.

[0043] Recognizing that energy waves are basically spherical as they spread out from a source, it can be seen that such waves originating from nearby (near-field) source locations are greatly curved, while waves originating from distant (far-field) source locations are nearly planar. The intensity of an energy wave is its power per unit area. As energy spreads out, the intensity drops off as 1/r², where r is the distance from the source. Magnitude is the square root of intensity, so the magnitude drops off as 1/r. The greater the difference in distance of two detectors from a source, the greater is the difference in magnitude between the detected signals.
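The 1/r relationship above is the cue the system exploits: two sensors a fixed distance apart see nearly equal magnitudes for a far-field source but distinctly different magnitudes for a near-field one. A minimal sketch of this effect (the function name and example distances are ours, chosen to match the headset geometry discussed later, not values prescribed by the text):

```python
import math

def magnitude_difference_db(r_front_m: float, spacing_m: float) -> float:
    """Level difference (dB) between two on-axis sensors for a point
    source, assuming ideal 1/r spherical-spreading magnitude decay."""
    r_rear_m = r_front_m + spacing_m
    return 20.0 * math.log10(r_rear_m / r_front_m)

# Near-field source 12 cm from the front sensor, ports 5 cm apart:
near_db = magnitude_difference_db(0.12, 0.05)   # about 3 dB
# Far-field source 3 m away, same ports: nearly 0 dB
far_db = magnitude_difference_db(3.0, 0.05)
```

For a mouth-to-front distance of 12 cm and roughly 5 cm port spacing this gives about a 3 dB level difference for the voice, while a source 3 m away produces well under 0.2 dB, which is the contrast the processing relies upon.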

[0044] The system employs a unique combination of a pair of microphones located at the ear and a signal process that utilizes the magnitude difference in order to preserve a voice signal while rapidly attenuating noise signals arriving from distant locations. For this system, the drop-off of signal sensitivity as a function of distance is double that of a noise-canceling microphone located close to the mouth, as in a high-end boom microphone system, yet the frequency response is still zeroth-order, that is, inherently flat. Noise attenuation is not achieved with directionality, so all noises, independent of arrival direction, are removed. In addition, due to its zeroth-order sensitivity response, the system does not suffer from the proximity effect and is wind-noise-resistant, especially using the second processing method described below.

[0045] The system effectively provides an appropriately designed microphone array used with proper analog and A/D circuitry designed to preserve the signal "cues" required for the process, combined with the system process itself. It should be noted that the input signals are often "contaminated" with significant noise energy. The noise may even be greater than the desired signal. After the system's process has been applied, the output signal is cleaned of the noise and the resulting output signal is usually much smaller. Thus, the dynamic range of the input signal path should be designed to linearly preserve the high input dynamic range needed to encompass all possible input signal amplitudes, while the dynamic range requirement for the output path is often relaxed in comparison.

Microphone Array

[0046] An array of two or more microphones, preferably positioned along a line (axis) between the headset location and the user's mouth (in particular, the upper lip is a preferred target, so that both oral and nasal utterances are detected), is shown in FIG. 1. Only two microphones are shown, but a greater number can be used. The two microphones are designated 10 and 12 and are mounted on or in a housing 16. The housing may have an extension portion 14. Another portion of the housing or a suitable component is disposed in the opening of the ear canal of the wearer such that the speaker of the device can be heard by the wearer. Although the microphone elements 10 and 12 are preferably omni-directional units, noise-canceling and uni-directional devices and even active array systems also may be compatibly utilized. When directional microphones or microphone systems are used, they are preferably aimed toward the mouth, providing additional attenuation for noise sources located in less sensitive directions from the microphones.

[0047] The remaining discussion will focus primarily on two omni-directional microphone elements 10 and 12, with the understanding that other types of microphones and microphone systems can be used. For the remaining description, the microphone closest to the mouth, that is, microphone 10, will be called the "front" microphone, and the microphone farthest from the mouth (12) the "rear" microphone.

[0048] In simple terms, using the example of two spaced-apart microphones located at the ear of the user and on a line approximately extending in the direction of the mouth, the two microphone signals are detected, digitized, divided into time frames and converted to the frequency domain using conventional digital Fourier transform (DFT) techniques. In the frequency domain, the signals are represented by complex numbers. After optional time alignment of the signals, 1) the difference between pairs of those complex numbers is computed according to a mathematical equation, or 2) their weighted sum is attenuated according to a different mathematical equation, or both. Since in the system described herein there is no inherent restriction on microphone spacing (as long as it is not zero), other system considerations are the driving factors in the choice of the time alignment approach.

[0049] The ratio of the vector magnitudes, or norms, is used to control the noise attenuation created by each of the two methods. The result of the processing is a noise-reduced frequency domain output signal, which is subsequently transformed by conventional inverse Fourier means to the time domain, where the output frames are overlapped and added together to create the digital version of the output signal. Subsequently, D/A conversion can be used to create an analog version of the output signal when needed. This approach involves digital frequency domain processing, which the remainder of this description will further detail. It should be recognized, however, that alternative approaches include processing in the analog domain, or digital processing in the time domain, and so forth.
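The frame/window/DFT, per-bin processing, and inverse-DFT/overlap-add chain described above can be sketched as follows. This is a simplified illustration using NumPy: the framing parameters and function names are ours, and the per-bin `process` callback stands in for either of the two noise-reduction methods, which are detailed later in the text:

```python
import numpy as np

def stft_process(x, process, frame=256, hop=128):
    """Frame, window and DFT each segment, apply per-bin processing,
    then inverse-DFT and overlap-add the frames (cf. FIG. 2)."""
    win = np.hanning(frame)
    y = np.zeros(len(x))
    for start in range(0, len(x) - frame + 1, hop):
        spec = np.fft.rfft(x[start:start + frame] * win)              # frame, window & DFT
        y[start:start + frame] += np.fft.irfft(process(spec)) * win   # IDFT, overlap & add
    return y
```

With `process = lambda s: s` the pipeline is a window-weighted identity; a production implementation would additionally normalize by the summed window overlap so that unity processing reconstructs the input exactly.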

[0050] Normalizing the acoustic signals sensed by the two microphones 10 and 12 to that of the front microphone 10, the front microphone's frequency domain signal is, by definition, equal to "1." That is,

Sf(ω, θ, d, r) = 1    (2)

where ω is the radian frequency, θ is the effective angle of arrival of the acoustic signal relative to the direction toward the mouth (that is, the array axis), d is the separation distance between the two microphone ports, and r is the range to the sound source from the front microphone 10 in increments of d. Thus, the frequency domain signal from the rear microphone 12 is

Sr(ω, θ, d, r) = γ⁻¹ e^(−iωrd(γ−1)/c),    (3)

where

γ = [1 + (2/r)cos(θ) + 1/r²]^(1/2),    (4)

c is the effective speed of sound at the array, and i is the imaginary operator √−1. The term rd(γ−1)/c represents the arrival time difference (delay) of an acoustic signal at the two microphone ports. It can be seen from these equations that when r is large, in other words when a sound source is far away from the array, the magnitude of the rear signal is equal to "1," the same as that of the front signal.
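Under the stated assumptions, Equations (3) and (4) can be evaluated directly. The sketch below uses our own function names and assumes c = 343 m/s; it returns the rear-microphone spectrum value when the front spectrum is normalized to 1:

```python
import cmath
import math

def gamma(theta_rad: float, r: float) -> float:
    """Eq. (4): rear-to-front source-distance ratio; r is the source
    range from the front port in units of the port spacing d."""
    return math.sqrt(1.0 + (2.0 / r) * math.cos(theta_rad) + 1.0 / r ** 2)

def rear_signal(omega: float, theta_rad: float, d: float, r: float,
                c: float = 343.0) -> complex:
    """Eq. (3): magnitude 1/gamma, phase set by the delay r*d*(gamma-1)/c."""
    g = gamma(theta_rad, r)
    return cmath.exp(-1j * omega * r * d * (g - 1.0) / c) / g
```

On axis (θ = 0), γ reduces to 1 + 1/r, so the magnitude is r/(r + 1); for large r the magnitude approaches 1, matching the far-field behavior noted above.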

[0051] When the source signal is arriving on-axis from a location along a line toward the user's mouth (θ = 0), the magnitude of the rear signal is

|Sr| = γ⁻¹ = r/(r + 1).    (5)

[0052] As an example of how this result is used in the design of the array, assume that the designer desires the magnitude of the voice signal to be 3 dB higher in the front microphone 10 than it is in the rear microphone 12. In this case, r/(r + 1) = 10^(−3/20) = 0.708, and thus r = 2.42. Therefore, the front microphone 10 should be located 2.42·d away from the mouth, and, of course, the rear microphone 12 should be located a distance d behind the front microphone. If the distance from the mouth to the front microphone 10 will be, for example, 12 cm (4¾ in) in a particular design, then the desired port-to-port spacing in the microphone array (that is, the separation between the microphones 10 and 12) will be 4.96 cm (about 5 cm, or 2 in). Of course, the designer is free to choose the magnitude ratio desired for any particular design.
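This design calculation can be automated. A small sketch (our function name) inverts the on-axis magnitude relation r/(r + 1) = 10^(−ΔdB/20) to obtain the port spacing:

```python
import math

def port_spacing(delta_db: float, mouth_to_front_m: float) -> float:
    """Given the desired front-vs-rear voice level difference (dB) and
    the mouth-to-front-microphone distance, return the port spacing d."""
    m = 10.0 ** (-delta_db / 20.0)   # desired rear/front magnitude ratio
    r = m / (1.0 - m)                # solve r/(r + 1) = m for r
    return mouth_to_front_m / r      # front microphone sits at r*d from the mouth

d = port_spacing(3.0, 0.12)   # the 3 dB, 12 cm example: d is about 4.96 cm
```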

Microphone Matching

[0053] Some processing steps that may be initially applied to the signals from the microphones 10 and 12 are described with reference to FIG. 1A. It is advantageous to provide microphone matching, and using omni-directional microphones, microphone matching is easily achieved. Omni-directional microphones are inherently flat-response devices with virtually no phase mismatch between pairs. Thus, any simple prior art level matching method suffices for this application. Such methods range from purchasing pre-matched microphone elements for microphones 10 and 12, factory selection of matched elements, post-assembly test fixture dynamic testing and adjustment, and post-assembly mismatch measurement with matching "table" insertion into the device for operational on-the-fly correction, to dynamic real-time automatic algorithmic mismatch correction.

Analog Signal Processing

[0054] As shown in FIG. 1A, analog processing of the microphone signals may be performed, and typically consists of pre-amplification using amplifiers 11 to increase the normally very small microphone output signals and possibly filtering using filters 13 to reduce out-of-band noise and to address the need for anti-alias filtering prior to digitization of the signals if used in a digital implementation. However, other processing can also be applied at this stage, such as limiting, compression, analog microphone matching (15) and/or squelch.

[0055] The system described herein optimally operates with linear, undistorted input signals, so the analog processing is used to preserve the spectral purity of the input signals by having good linearity and adequate dynamic range to cleanly preserve all parts of the input signals.

A/D-D/A Conversion

[0056] The signal processing conducted herein can be implemented using an analog method in the time domain. By using a bank of band-split filters, combined with Hilbert transformers and well-known signal amplitude detection means, to separate and measure the magnitude and phase components within each band, the processing can be applied on a band-by-band basis, where the multi-band outputs are then combined (added) to produce the final noise-reduced analog output signal.


[0057] Alternatively, the signal processing can be applied digitally, either in the time domain or in the frequency domain. The digital time-domain method, for example, can perform the same steps and in the same order as identified above for the analog method, or may be any other appropriate method.

[0058] Digital processing can also be accomplished in the frequency domain using the Digital Fourier Transform (DFT), Wavelet Transform, Cosine Transform, Hartley Transform or any other means to separate the information into frequency bands before processing.

[0059] Microphone signals are inherently analog, so after the application of any desired analog signal processing, the resulting processed analog input signals are converted to digital signals. This is the purpose of the A/D converters (22, 24) shown in FIGS. 1A and 2, one conversion channel per input signal. Conventional A/D conversion is well known in the art, so there is no need for discussion of the requirements on anti-aliasing filtering, sample rate, bit depth, linearity and the like, since standard good practices suffice.

[0060] After the noise reduction processing, for example by circuit 30 in FIG. 2, is complete, a single digital output signal is created. This output signal can be utilized in a digital system without further conversion, or alternatively can be converted back to the analog domain using a conventional D/A converter system as known in the art.

Time Alignment

[0061] For the best output signal quality, it is preferable, but not required, that the two input signals be time aligned for the signal of interest, that is, in the instant example, for the user's voice. Since the front microphone 10 is located closer to the mouth, the voice sound arrives at the front microphone first, and shortly thereafter it arrives at the rear microphone 12. It is this time delay for which compensation is to be applied; i.e., the front signal should be time delayed, for example by circuit 26 of FIG. 2, by a time equal to the propagation time of sound as it travels around the headset from the location of the front microphone 10 port to the rear microphone 12 port. Numerous conventional methods are available for accomplishing this time alignment of the input signals including, but not limited to, analog delay lines, cubic-spline digital interpolation methods and DFT phase modification methods.

[0062] One simple means for accomplishing the delay is to select, during the headset design, a microphone spacing, d, that allows for offsetting the digital data stream from the front microphone by a whole number of samples. For example, when the port spacing combined with the effective sound velocity at the in-situ headset location gives a signal time delay of, for example, 62.5 usec or 125 usec, then at a sample rate of 16 ksps the former delay can be accomplished by offsetting the data by one sample and the latter delay by offsetting the data by two samples. Since many telecommunication applications operate at a sample rate of 8 ksps, the latter delay can be accomplished with a data offset of one sample. This method is simple, low cost, consumes little compute power and is accurate.
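The sample-offset arithmetic above can be checked with a short script (a minimal sketch; the function name is my own, not from the specification):

```python
def delay_in_samples(delay_sec, sample_rate_hz):
    """Whole number of samples closest to a given time delay."""
    return round(delay_sec * sample_rate_hz)

# 62.5 usec at 16 ksps -> one-sample offset; 125 usec -> two samples
assert delay_in_samples(62.5e-6, 16000) == 1
assert delay_in_samples(125e-6, 16000) == 2
# at the common telephony rate of 8 ksps, 125 usec is a single-sample offset
assert delay_in_samples(125e-6, 8000) == 1
```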

Frequency Domain (Fourier) Transformation

[0064] One of the simplest and most common means for multi-band separation of signals in the frequency domain is the Short-Time Fourier Transform (STFT), and the Fast Fourier Transform (FFT) commonly is the digital implementation of choice. Although alternative means for multi-band processing are applicable as discussed above, a standard digital FFT/IFFT pair for transformation and processing is described herein.
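A minimal sketch of such an FFT/IFFT analysis-synthesis pair, assuming 50%-overlapped frames and a periodic Hanning window (the frame length and hop size are illustrative choices, not values from the specification):

```python
import numpy as np

def stft_frames(x, frame_len=256, hop=128):
    """Split x into 50%-overlapped frames, window each, and take its DFT."""
    win = np.hanning(frame_len + 1)[:-1]   # periodic Hann: overlapped copies sum to 1
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.array([np.fft.rfft(win * x[i*hop : i*hop + frame_len])
                     for i in range(n_frames)])

def overlap_add(frames, frame_len=256, hop=128):
    """Inverse-DFT each frame and overlap-add back into a time signal."""
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, F in enumerate(frames):
        out[i*hop : i*hop + frame_len] += np.fft.irfft(F, frame_len)
    return out

# round trip: the interior of the signal is reconstructed exactly
x = np.random.default_rng(0).standard_normal(2048)
y = overlap_add(stft_frames(x))
assert np.allclose(y[256:-256], x[256:len(y)-256], atol=1e-10)
```

Any per-bin noise-reduction processing would be applied to the frames between analysis and synthesis; with no processing, the overlap-add property guarantees transparent reconstruction.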

[0065] FIG. 2 is a generalized block diagram of a system 20 for accomplishing the noise reduction with digital Fourier transform means. Signals from the front (10) and rear (12) microphones are converted to digital form by the A/D converters (22, 24), after which a time alignment circuit 26 for the signal of interest acts on at least one of the converted, digital signals, followed by framing and windowing by circuits 28 and 29, which also generate frequency domain representations of the signals by digital Fourier transform (DFT) means as described above. The two resultant signals are then applied to a processor 30, which operates based upon a difference equation applied to each pair of narrow-band, preferably time-aligned, input signals in the frequency domain. The wide arrows indicate where multiple pairs of input signals are undergoing processing in parallel. In the description herein it will be understood that the signals being described are individual narrow-band frequency-separated "sub"-signals, wherein a pair is the frequency-corresponding subsignals originating from each of the two microphones.

[0066] First, each sub-signal of the pair is separated into its norm, also known as the magnitude, and its unit vector, wherein a unit vector is the vector normalized to a magnitude of "1" by dividing by its norm. Thus,

S_f(ω,θ,d,r) = |S_f(ω,θ,d,r)| × Ŝ_f(ω,θ,d,r)   (6)

where |S_f(ω,θ,d,r)| is the norm of S_f(ω,θ,d,r), and Ŝ_f(ω,θ,d,r) is the unit vector of S_f(ω,θ,d,r). Thus, all of the magnitude information about the input signal S_f is in the norm, while all the angle information is in the unit vector. For the on-axis signals described above with respect to Equations 2-4, |S_f(ω,θ,d,r)| = 1 and Ŝ_f(ω,θ,d,r) = e^(i0) = 1. Similarly, and for the same signals, |S_r(ω,θ,d,r)| = γ⁻¹ and Ŝ_r(ω,θ,d,r) = e^(iωτ_d(1−γ)/c).

[0067] The output signal from circuit 30, then, is

S̄(ω,θ,d,r) = (|S_f(ω,θ,d,r)| − |S_r(ω,θ,d,r)|) × (Ŝ_f(ω,θ,d,r) + Ŝ_r(ω,θ,d,r))   (7)

= (1 − γ⁻¹) × |2 cos(ωτ_d(1−γ)/2c)| × e^(iωτ_d(1−γ)/2c)   (8)
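As a sketch, assuming per-bin complex spectra held in numpy arrays (the function name, variable names and the eps guard are my own), the norm/unit-vector processing of Equations (6) and (7) amounts to:

```python
import numpy as np

def circuit30_output(Sf, Sr, eps=1e-12):
    """Equation (7): per-bin output = (|Sf| - |Sr|) * (unit(Sf) + unit(Sr)).

    Sf, Sr: complex spectra, one value per frequency bin.
    eps guards the unit-vector division in empty bins.
    """
    nf, nr = np.abs(Sf), np.abs(Sr)
    uf, ur = Sf / (nf + eps), Sr / (nr + eps)   # unit vectors, Eq. (6)
    return (nf - nr) * (uf + ur)

# equal magnitudes in both channels (far-field noise) -> output vanishes
Sf = np.array([1.0 + 1.0j])
Sr = np.array([np.exp(1j * 0.3) * np.abs(Sf[0])])
assert np.allclose(circuit30_output(Sf, Sr), 0.0)

# front bin stronger (near-field voice) -> signal passes
out = circuit30_output(np.array([2.0 + 0j]), np.array([1.0 + 0j]))
assert np.allclose(out, 2.0 + 0j)
```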

[0063] The processing may use the well known "overlap-and-add" method. Use of this method often may include the use of a window, such as the Hanning window, or other methods as are known in the art.

[0068] Here it can be seen that the amplitude of the output signal is proportional to the difference in magnitudes of the two input signals, while the angle of the output signal is the angle of the sum of the unit vectors, which is equal to the average of the electrical angles of the two input signals.

[0069] This signal processing performed in circuit 30 is shown in more detail in the block diagram of FIG. 3. Although it provides a noise reduction function, this form of the processing is not very intuitive as to how the noise reduction actually occurs.

[0070] Dropping the common variables (ω,θ,d,r) for clarity and rearranging the terms of Equation 8 above gives,

S̄ = (|S_f|/|S_r| − |S_r|/|S_f|) × ((|S_r| × S_f + |S_f| × S_r) / (|S_f| + |S_r|))   (9)

[0074] Note the minus sign in the middle of Equation (11). In the prior art approaches, direct summation of two independent NR equations helps to achieve greater directional far-field noise reduction than when either equation is used alone. In the present system, a single difference equation (11) is utilized without summation. The result is a unique, nearly non-directional near-field sensing system.

[0075] FIG. 4 is a block diagram of the signal processing portion of this direct equation method for creating the noise-reduced output signal vector S̄(ω,θ,d,r) from the two input signal vectors F = S_f(ω,θ,d,r) and R = S_r(ω,θ,d,r).

[0076] Operation of this equation method is as follows: 1) Assume that a noise source is located in the far-field. In this case, the magnitudes of the two input signals are virtually the same as each other due to 1/r signal spreading.

In Equation (9) the arrows again represent vectors. On inspection, it can be seen that the frequency domain output signal for each frequency band is the product of two terms: the first term (the portion before the product sign) is a scalar value which is proportional to the attenuation of the signal. This attenuation is a function of the ratio of the norms of the two input signals and therefore is a function of the distance from the sound source to the array. The second term of Equation (9) (the portion after the product sign) is an average of the two input signals, where each is first normalized to have a magnitude equal to one-half the harmonic mean of the two separate signal magnitudes. This calculation creates an intermediate signal vector that has the optimum reduction for any set of independent random noise components in the input signals. The calculation then attenuates that intermediate signal according to a measure of the distance to the sound source by multiplication of the intermediate signal vector by the scalar value of the first term.
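The two-term structure described above can be checked numerically for one bin (a minimal check using the rearrangement as reconstructed here; the co-phased magnitudes 2 and 1 are assumed values for illustration):

```python
import math

# one frequency bin, co-phased signals with |Sf| = 2, |Sr| = 1
Sf, Sr = 2.0, 1.0
nf, nr = abs(Sf), abs(Sr)

# Eq. (7): difference of norms times sum of unit vectors
eq7 = (nf - nr) * (Sf / nf + Sr / nr)

# rearranged form: a scalar depending only on the ratio of the norms ...
scalar = nf / nr - nr / nf
# ... times an average whose parts each have magnitude = half the harmonic mean
intermediate = (nr * Sf + nf * Sr) / (nf + nr)
assert math.isclose(scalar * intermediate, eq7)

# each normalized part has magnitude nf*nr/(nf+nr), half the harmonic mean
assert math.isclose(abs(nr * Sf / (nf + nr)), nf * nr / (nf + nr))
```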

[0071] Note that this processing is "instantaneous"; in other words it does not rely upon any prior information from earlier time frames, and therefore it does not suffer from adaptation delay. It should be clarified that in these discussions, the variable X(ω,θ,d,r) below is calculated as a ratio of the magnitudes when in the linear domain, and as the difference of the logarithms (usually expressed in dB) when in the log domain. Thus, X is described herein as a ratio when the discussion centers around a linear description, and as a difference when the discussion is about usage in the logarithmic domain. Although the foregoing allows insight into the noise reduction process, it is important when actually calculating the noise reduction to be as efficient as possible for achieving high speed at low compute power. Thus, a more computationally efficient method of expressing these equations now will be discussed.

[0072] First, the ratio X(ω,θ,d,r) of the transformed, short-time-framed input signal magnitudes is obtained, where

X(ω,θ,d,r) = |S_f(ω,θ,d,r)| / |S_r(ω,θ,d,r)| = sqrt( ({Re[S_f(ω,θ,d,r)]}² + {Im[S_f(ω,θ,d,r)]}²) / ({Re[S_r(ω,θ,d,r)]}² + {Im[S_r(ω,θ,d,r)]}²) )   (10)

[0073] Using this magnitude ratio and the original input signals, the output signal S̄(ω,θ,d,r) is calculated as

S̄(ω,θ,d,r) = (1 − X⁻¹) × S_f(ω,θ,d,r) − (1 − X) × S_r(ω,θ,d,r)   (11)
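A per-bin sketch of this direct-equation method, Equations (10) and (11), with an eps guard added for empty bins (function and variable names are my own):

```python
import numpy as np

def direct_difference(Sf, Sr, eps=1e-12):
    """Eqs. (10)-(11): output = (1 - 1/X)*Sf - (1 - X)*Sr, X = |Sf|/|Sr|."""
    X = np.abs(Sf) / (np.abs(Sr) + eps)            # Eq. (10), per-bin ratio
    return (1.0 - 1.0 / (X + eps)) * Sf - (1.0 - X) * Sr   # Eq. (11)

# far-field noise: equal magnitudes -> X = 1 -> both weights vanish -> output ~ 0
noise_f = np.array([np.exp(1j * 0.7)])
noise_r = np.array([np.exp(1j * 0.2)])
assert np.allclose(direct_difference(noise_f, noise_r), 0.0)

# near-field voice: front bin twice as strong -> the signal passes
out = direct_difference(np.array([2.0 + 0j]), np.array([1.0 + 0j]))
assert np.allclose(out, 2.0 + 0j)
```

Note that for the near-field case the result (2.0) matches the norm/unit-vector form of Equation (7), as the two expressions are algebraically equivalent.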

For such a far-field source, X is essentially unity, so both 1 − X⁻¹ and 1 − X are equal to zero. Thereby, according to Equation (11), the output signal is virtually zero, and therefore far-field signals are greatly attenuated.

2) Assume that a voice signal originates on-axis with a signal magnitude difference of, for example, 3 dB. In this case, X ≈ 1.4, so that 1 − X⁻¹ ≈ 0.29 and 1 − X ≈ −0.41. These values are in inverse proportion to the magnitude difference of the input signals. As these two values are applied in Equation (11), they have the effect of equalizing or normalizing the two input signals about a mean value. Thus, the output signal becomes the vector average of the two input signals after normalization. It is useful to note that the result is not a vector difference, as is used in gradient field sensing.
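The numbers in this example can be verified directly, treating the 3 dB figure as an amplitude ratio:

```python
# a 3 dB front/rear magnitude difference gives X = 10**(3/20) ~ 1.41
X = 10 ** (3.0 / 20.0)
w_front, w_rear = 1 - 1 / X, 1 - X
assert abs(w_front - 0.29) < 0.01      # weight applied to the front signal
assert abs(w_rear - (-0.41)) < 0.01    # weight applied to the rear signal
```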

3) The double difference seen in Equation (11) leads to a second-order slope in the attenuation-vs-distance characteristic of the system. FIG. 5 shows the on-axis sensitivity relative to the mouth sensitivity vs. distance from the headset. Thus in FIG. 5, the mouth signal sensitivity is at the left end of the curve and at 0 dB. The amount below zero is proportional to the signal attenuation produced by the system, and is here plotted at frequencies of 300, 500, 1 k, 2 k, 3 k and 5 kHz. Clearly the frequency response is identical at all frequencies, since all the attenuation curves are identical (they all fall on top of one another). Identical frequency response is advantageous, since it prevents frequency-response coloration of the signal as a function of distance, i.e. noise sources sound natural, although greatly attenuated. This second-order slope provides excellent noise attenuation performance of the system.
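The second-order slope can be illustrated with a toy 1/r spreading model (the 20 mm spacing and the distances are assumed for illustration only, not taken from the patent's measured curves):

```python
import math

def output_level(r, d=0.02):
    """Eq. (7) output magnitude for a time-aligned on-axis source at distance r,
    assuming ideal 1/r spreading and mic spacing d (illustrative model only)."""
    Sf, Sr = 1.0 / r, 1.0 / (r + d)   # co-phased, so both unit vectors equal 1
    return (Sf - Sr) * 2.0

# far from the headset the response falls roughly 12 dB per doubling of distance,
# i.e. a second-order slope, versus 6 dB/doubling for a single microphone
slope_db = 20 * math.log10(output_level(2.0) / output_level(1.0))
assert -12.5 < slope_db < -11.5
```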

[0077] The attenuation slope is only slightly directional. Noise sources that are located at other angles with respect to the headset are equally or more greatly attenuated. FIG. 6 shows the attenuation response of the system at seven different arrival angles from 0° to 180° for a frequency of 1 kHz. It will be noted that the attenuation response is nearly identical at all angles, except for greater noise attenuation at 90°. This is due to a first-order "figure-8" (noise-canceling) directionality pattern. The attenuation performance at all angles that are not on-axis exceeds that of the on-axis attenuation shown in FIG. 5.

4) The double difference displayed by Equation (11) also creates cancellation of any first-order frequency response characteristic (although not of the directionality) so that the overall frequency response is zeroth-order even though the directionality response is first-order. This means that the frequency response is "flat" when used with flat-response omnidirectional microphones. In actuality, the frequency charac-
