# FFT

```
ADSP-21020 FFT User Notes
-------------------------
by Ronnin Yee
August 21, 1991

There are currently six FFT routines in the 21020 assembly code runtime
library, two complex and four real:

Complex FFT
-----------
    One radix-2 routine and one radix-4 routine.

Real FFT
--------
    rfft2.asm     A radix-2 real FFT (RFFT).
    irfft2.asm    A radix-2 inverse real FFT (IRFFT).
    rfft4.asm     A radix-4 real FFT (RFFT).
    irfft4.asm    A radix-4 inverse real FFT (IRFFT).

In general, a radix-4 FFT will run faster than a radix-2 FFT but will take up
more space and has more restrictions on the length of the FFT. Specifically,
all radix-2 FFT routines will take data lengths that are any power of two
(>= 32 points), while complex radix-4 routines will only take data lengths
that are a power of four (>= 64) and real radix-4 routines will only take data
lengths that are twice a power of four (>= 128).

Complex inverse FFTs are not provided since they are very easy to implement
with just a forward FFT. To implement an inverse FFT, one just needs to swap
the real and imaginary parts of the data, perform the forward FFT, and then
swap the real and imaginary parts of the result. If the forward FFT is
unscaled, the result is N times the true inverse.

To ease the confusion of which data goes where for each of the routines, the
following table of variables and their locations is presented, where "N" is
the length of the FFT:

The arrays of each routine are divided between data memory (DM) and program
memory (PM).

Routine           Input data            Output data            Tables
------------------------------------------------------------------------------
complex radix-2:  redata[N] <a>         refft[N]               sine[N/2]
                  imdata[N] <a>         imfft[N]               cosine[N/2]
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
complex radix-4:  redata[N]             refft[N] <a>           cosine[3N/4]
                  imdata[N]             imfft[N] <a>           sine[3N/4]
------------------------------------------------------------------------------
rfft2:            real[N] <a>           refft[N/2+1]           sine[N/4]
                                        imfft[N/2+1]           cosine[N/4]
                                                               h_sine[N/4]
                                                               h_cosine[N/4]
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
rfft4:            evreal[N/2] <b>       refft[N/2+1] <b> <z>   cosine[3N/8]
                  odreal[N/2]           imfft[N/2+1]           sine[3N/8]
                                                               h_sine[N/4]
                                                               h_cosine[N/4]
------------------------------------------------------------------------------
irfft2:           refft[N/2+1] <b>      evreal[N/2] <b> <z>    sine[N/4]
                  imfft[N/2+1]          odreal[N/2]            cosine[N/4]
                                                               n_sine[N/4]
                                                               n_cosine[N/4]
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
irfft4:           refft[N/2+1]          real[N] <a>            cosine[3N/8]
                  imfft[N/2+1]                                 sine[3N/8]
                                                               n_sine[N/4]
                                                               n_cosine[N/4]
------------------------------------------------------------------------------

Notes:
    <a> This array must start on an address that is a multiple of N.
    <b> This array must start on an address that is a multiple of N/2.
    <z> This output buffer can overlap with the input buffer in the same
        memory space.

    redata = real part of time domain data
    imdata = imaginary part of time domain data
    refft  = real part of frequency domain data
    imfft  = imaginary part of frequency domain data

    real   = real time domain data
    evreal = real time domain data, even indices only { =x(2n) }
    odreal = real time domain data, odd indices only { =x(2n+1) }

All input and output data are in normal order since the routines handle the
necessary bit and digit reversals.

The strange symmetry of how data is shuffled around in the radix-2 and radix-4
routines is a result of a difference in structure between them: the radix-2
routines perform their bit reversal before the FFT, while the radix-4 routines
perform theirs after the FFT. This affects where the data should be placed for
optimal performance.

Space Saving Ideas and Other Table Talk
---------------------------------------
If space is an issue and multiple FFT routines are being used, one may get the
urge to share tables between routines and thus save space. Consider the
following points before doing so:

1) The "sine" and "cosine" tables of the radix-2 and radix-4 routines are NOT
   compatible. The radix-4 routines read their sines and cosines in a sort of
   bit-reversed ordering.

2) The "sine" and "cosine" tables of any radix-2 routine are compatible with
   the "sine" and "cosine" tables of any other radix-2 FFT routine of the same
   length and type (real or complex). The tables of complex and real routines
   are different because the complex routines do N point FFTs while the real
   routines actually do N/2 point FFTs. The same holds true among radix-4
   routines.

3) "sine" and "cosine" radix-4 tables compiled for a length N FFT can also be
   used for FFTs of length less than N. This is a result of the
   bit-reversed-like ordering of the tables.

4) All "h_sine" and "h_cosine" tables are the same for the same length FFT.
   This also holds true for "n_sine" and "n_cosine".

5) "n_cosine" and "n_sine" are the same as "h_cosine" and "h_sine" multiplied
   by a factor of (2/N). If you wish to share these tables also and don't mind
   the scaling, use the "h_cosine" and "h_sine" tables and change
   "f2=(1/2*HN);" to "f2=0.5;" in the beginning of the conversion stage.
   This will cause the output of this routine to be (ifft)*(N/2).

Things to do when your FFT won't work
-------------------------------------
0) Take a coffee break. A refreshed perspective can do wonders.

1) Re-check preprocessor variables. Some of these variables are computed
   slightly differently for different routines. "STAGES", for instance, is
   one of them.

2) Recompute bit-reversing in bit-reversed variables. This can be quite a
   pain.

3) Are the arrays that the bit-reversed variables are pointing to on the
   correct boundaries? Note that currently this may require a trip to the
   architecture file (see explanation in the routine).

4) Have you given the right file names to all the tables and incoming data?
   For that matter, are they the right length and did you use the right
   program to create them?

5) Make sure all your data is going into the right memory space. The
   assembler will NOT flag an error if you define a variable "foo" in PM and
   use it to access data in DM.

6) Did you remember to re-compile and re-link?

7) Repeat steps 1) and 2) very carefully.

8) Use the simulator to verify all of the above.

Hopefully, this will solve most of your FFT problems.

The Conversion Stage in the Real FFTs
-------------------------------------
For a more complete discussion of the algorithm we used for the real FFTs, see
"The Fast Fourier Transform" by E. Oran Brigham (Prentice Hall: New Jersey,
1974).

Given a 2N point sequence, x(n), and having taken the FFT of x(2n)+jx(2n+1)
for n=0,1,...,N-1, we can now compute:

    Given FFT(x(2n)+jx(2n+1)) = A(k)+jF(k),
    let X(k) = R(k) + jI(k),
    let c(k) = cos(pi*k/N),
    let s(k) = sin(pi*k/N),

    2R(k) = A(k)+A(N-k) + c(k)(F(k)+F(N-k)) - s(k)(A(k)-A(N-k))
    2I(k) = F(k)-F(N-k) - s(k)(F(k)+F(N-k)) - c(k)(A(k)-A(N-k))

This will give us X(k). Notice what happens when we let k'=N-k (k'-> k for
convenience):

    2R(N-k) =  A(k)+A(N-k) - c(k)(F(k)+F(N-k)) + s(k)(A(k)-A(N-k))
    2I(N-k) = -F(k)+F(N-k) - s(k)(F(k)+F(N-k)) - c(k)(A(k)-A(N-k))

because c(N-k) = -c(k) and s(N-k) = s(k).

Since the data needed to compute X(k) is the same as the data needed to
compute X(N-k), we decided to compute them simultaneously during each
iteration of the conversion stage loop. So the algorithm does the conversion
in k, N-k pairs (except for the mid-point).

In order to calculate the inverse, we realize that we have four unknowns and
four equations. Thus it is a simple matter to derive:

    2A(k) = R(k)+R(N-k) - s(k)(R(k)-R(N-k)) - c(k)(I(k)+I(N-k))
    2F(k) = I(k)-I(N-k) + c(k)(R(k)-R(N-k)) - s(k)(I(k)+I(N-k))

and

    2A(N-k) =  R(k)+R(N-k) + s(k)(R(k)-R(N-k)) + c(k)(I(k)+I(N-k))
    2F(N-k) = -I(k)+I(N-k) + c(k)(R(k)-R(N-k)) - s(k)(I(k)+I(N-k))

We can, therefore, calculate A(k)+jF(k) from X(k), run the inverse FFT and
regain x(n).
```
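The conversion-stage formulas and the swap trick for inverse FFTs can be checked numerically. The sketch below is a rough Python model of the arithmetic, not the 21020 assembly. It assumes a naive O(n^2) DFT in place of the library's complex FFT, and every function name in it (`dft`, `swap`, `inverse_dft`, `real_fft`, `inverse_real_fft`) is made up for illustration. It also computes each X(k) independently, whereas the notes describe the routines as working in k, N-k pairs.

```python
import cmath
import math


def dft(z):
    """Naive O(n^2) forward DFT; a stand-in for the library's complex FFT."""
    n = len(z)
    return [sum(z[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def swap(z):
    """Exchange the real and imaginary parts of every element."""
    return [complex(v.imag, v.real) for v in z]


def inverse_dft(z):
    """Inverse transform as swap, forward transform, swap, then a 1/N scale."""
    return [v / len(z) for v in swap(dft(swap(z)))]


def real_fft(x):
    """Return X(0..N) for a real sequence x of length 2N."""
    n = len(x) // 2
    z = dft([complex(x[2 * m], x[2 * m + 1]) for m in range(n)])
    out = []
    for k in range(n + 1):
        zk, znk = z[k % n], z[(n - k) % n]  # index N wraps around to 0
        a, f, an, fn = zk.real, zk.imag, znk.real, znk.imag
        c, s = math.cos(math.pi * k / n), math.sin(math.pi * k / n)
        re = 0.5 * (a + an + c * (f + fn) - s * (a - an))   # R(k)
        im = 0.5 * (f - fn - s * (f + fn) - c * (a - an))   # I(k)
        out.append(complex(re, im))
    return out


def inverse_real_fft(spec):
    """Recover the 2N real samples from X(0..N)."""
    n = len(spec) - 1
    z = []
    for k in range(n):
        r, i = spec[k].real, spec[k].imag
        rn, inn = spec[n - k].real, spec[n - k].imag
        c, s = math.cos(math.pi * k / n), math.sin(math.pi * k / n)
        a = 0.5 * (r + rn - s * (r - rn) - c * (i + inn))   # A(k)
        f = 0.5 * (i - inn + c * (r - rn) - s * (i + inn))  # F(k)
        z.append(complex(a, f))
    x = []
    for v in inverse_dft(z):  # each v is x(2m) + j*x(2m+1)
        x.extend([v.real, v.imag])
    return x


# Arbitrary test data with 2N = 8 points.
samples = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.0, 4.0]
spectrum = real_fft(samples)
direct = dft(samples)[:len(samples) // 2 + 1]
max_err = max(abs(p - q) for p, q in zip(spectrum, direct))
roundtrip = inverse_real_fft(spectrum)
```

Here `max_err` compares the conversion-stage result against a direct 2N-point DFT of the same samples, and `roundtrip` should reproduce `samples` to within rounding error. The explicit 1/N in `inverse_dft` is needed because this stand-in transform is unscaled.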