Application Report
SPRA965 − October 2003
TMS320C64x DSP Peripheral Component Interconnect (PCI) Performance
Stéphane Smith
C6x Device Applications
ABSTRACT
This application report describes the number of cycles required to perform a given peripheral
component interconnect (PCI) data transfer based on a variety of permutations of burst
length, CPU speed, EMIF speed, etc.
The PCI bus, created by Intel in 1992, enables fast accesses between PCI adapters, system
memory and external memory. To ensure throughput near or at the processor’s native bus
speed, data transactions are performed as burst transfers. In addition, the PCI architecture
implements many features to provide simultaneous connectivity between multiple devices.
Due to the nature of burst transfers and PCI bus arbitration, variations in hardware settings
can drastically affect the throughput across the PCI bus.
This document provides data sheets of possible TMS320C64x hardware configurations
and their effects on PCI throughput performance. More specifically, it examines transfer
latency, the number of PCI cycles required to transfer n words of data, overall throughput
given n-word bursts, and turn-around penalty.
Contents
1 Design Problem
2 Solution
3 Measurement Assumptions
4 Master/Target Latency
5 Number of PCI Cycles Required to Transfer n Words of Data
6 Total Throughput
7 Turn-Around Penalty
8 References
List of Figures
Figure 1 Read Latency Timing Diagram
Figure 2 Write Latency Timing Diagram
Figure 3 Number of PCI Cycles to Read 5 Words Timing Diagram
Figure 4 Number of PCI Cycles to Write 5 Words Timing Diagram
Figure 5 Read Throughput Timing Diagram
Figure 6 Write Throughput Timing Diagram
Figure 7 Read/Read Turn-Around Penalty
Trademarks are the property of their respective owners.
List of Tables
Table 1 Read Latency (Measured in PCI Clock Cycles)
Table 2 Number of PCI Cycles to Read n Words of Data
Table 3 Number of PCI Cycles to Write n Words of Data
Table 4 Total Throughput for Reads (Measured in MB/s)
Table 5 Total Throughput for Writes (Measured in MB/s)
Table 6 Turn-Around Penalty (Measured in PCI Clock Cycles)
1 Design Problem
How do various hardware permutations affect the peripheral component interconnect (PCI)
throughput on TMS320C64x digital signal processors?
2 Solution
PCI devices access each other, system memory, and external memory through burst transfers.
A burst transfer is characterized by having an initialization or address phase followed by two or
more data phases. During the address phase, the master passes a starting address and a
transaction type. Various transaction types include I/O read, I/O write, memory read, memory
write, configuration read, configuration write, etc. The following phases are data phases until the
master signals the last data element. Various configurations influence the performance of PCI
burst transfers. The configurations under examination include:
• CPU speed
• PCI speed
• Transfer Source/Destination
• EMIF speed
• EMIF width
• Burst Length
These hardware variations affect transfer latency, the number of PCI cycles required to transfer
n words of data, overall throughput given n word bursts, and turn-around penalty.
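As an illustration of how these configurations combine, the following sketch enumerates the permutations that appear in the read tables of this report. The names used here (CPU_MHZ, emif_clocks, configurations) are hypothetical and chosen only for this example. The sketch assumes what the tables show: L2 transfers have no EMIF settings, and the 720 MHz CPU rows list only a 133 MHz EMIF. The write tables cover only burst lengths of 8 and 1024 words, so they contain fewer rows.

```python
from itertools import product

# Values taken from the read tables in this report (MHz, bits, words).
CPU_MHZ = (500, 600, 720)
PCI_MHZ = (33, 66)
BURST_WORDS = (8, 16, 64, 1024)
EMIF_WIDTH_BITS = (64, 32)


def emif_clocks(cpu_mhz):
    """EMIF clock rates listed for a given CPU speed (133 MHz only at 720 MHz)."""
    return (133,) if cpu_mhz == 720 else (133, 100)


def configurations():
    """Yield (cpu, src_dst, pci, burst, emif, width) for every read-table row."""
    for cpu, pci, burst in product(CPU_MHZ, PCI_MHZ, BURST_WORDS):
        # Internal L2 transfers do not depend on the EMIF settings.
        yield (cpu, "L2 256k", pci, burst, None, None)
        for emif, width in product(emif_clocks(cpu), EMIF_WIDTH_BITS):
            yield (cpu, "EMIFA (SDRAM)", pci, burst, emif, width)


configs = list(configurations())
print(len(configs))  # 104
```

Each tuple corresponds to one measured row, which makes it straightforward to key the latency, transfer-cycle, and throughput results by configuration.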
3 Measurement Assumptions
The PCI performance measurements were taken with the following assumptions:
• There is no CPU, EMIF, EDMA or PCI activity other than what is required to perform simple
PCI transfers. All measurements were taken with ideal system traffic; actual throughput for
specific applications will vary.
• The DSP is functioning as a target device. (When in master mode, the target should always
be ready without latency.)
• PCI latency timer is set to its default value of 0x0.
• PCI Min_Gnt (Minimum Grant) field is set to its default value of 0x0.
• PCI Max_Lat (Maximum Latency) field is set to the default value of 0x0.
• Unless otherwise stated, EMIF is connected to SDRAM.
4 Master/Target Latency
Master/target latency is the amount of time from when the master starts the transaction to
when the target is ready to transfer the first data item. In order to prevent devices from
monopolizing the PCI bus, the PCI bus specification implements the first data phase rule.
According to the specification, the target is limited to 16 PCI clock cycles to complete the
first data transfer. Similarly, the master is limited to a maximum of 8 PCI clock cycles. If,
for any reason, the target cannot meet this requirement, it must issue a retry. The retry,
indicated by asserting STOP while TRDY remains deasserted and DEVSEL remains asserted,
terminates the transaction prematurely, thereby freeing the PCI bus for use by other devices.
After a minimum of two clock cycles, the initiator may reattempt to transfer data.
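The first data phase rule can be sketched as a simple check. This is a minimal illustration, not part of any TI or PCI software interface; the constant and function names are hypothetical, and the only values taken from the text are the 16-clock target limit and the 8-clock master limit.

```python
# Limits stated in the text, in PCI clock cycles.
TARGET_FIRST_DATA_LIMIT = 16
MASTER_DATA_LIMIT = 8


def target_must_retry(clocks_until_first_data):
    """Return True when the target cannot meet the 16-clock limit and must retry."""
    return clocks_until_first_data > TARGET_FIRST_DATA_LIMIT


def master_within_limit(clocks_until_ready):
    """Return True when the master meets its 8-clock limit."""
    return clocks_until_ready <= MASTER_DATA_LIMIT


print(target_must_retry(12))  # False
print(target_must_retry(18))  # True
```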
In general, master/target latency is a function of:
• How fast the master can transfer data
• Access time for the target device
In these tests, the master requests a read; therefore, the target device must prefetch data.
During this prefetching stage, the target continually issues retries until it is ready to stream
data.
Figure 1 displays the latency where:
• add. represents the target address.
• cmd is the command used to determine the type of transaction to be performed.
• t/a represents the turn-around cycle required by reads to hand off control of the AD bus to
the target.
• d# represents the data items being transferred.
The latency measurement begins at the start of the transaction. This is followed by a series of
retries, until data is finally ready to be transferred.
[Timing diagram: clk, frame, AD, C/BE, IRDY, TRDY, and STOP. The marked latency interval
spans repeated add/cmd phases that end in retries, followed by add, t/a, and data items d1 to d5.]
Figure 1. Read Latency Timing Diagram
Table 1. Read Latency (Measured in PCI Clock Cycles)
CPU, PCI, and EMIF are clock rates in MHz; Width is the EMIF width in bits. The remaining
columns give the latency for burst lengths of 8, 16, 64, and 1024 words.
CPU    SRC/DST        PCI    EMIF   Width        8      16      64    1024
-----  -------------  -----  -----  ------  ------  ------  ------  ------
500    L2 256k        33     -      -           18      18      18      18
500    L2 256k        66     -      -           26      26      26      26
500    EMIFA (SDRAM)  33     133    64          22      22      22      22
500    EMIFA (SDRAM)  33     133    32          22      22      22      22
500    EMIFA (SDRAM)  33     100    64          22      22      22      22
500    EMIFA (SDRAM)  33     100    32          22      22      22      22
500    EMIFA (SDRAM)  66     133    64          30      30      30      30
500    EMIFA (SDRAM)  66     133    32          30      30      30      30
500    EMIFA (SDRAM)  66     100    64          30      30      30      30
500    EMIFA (SDRAM)  66     100    32          34      34      34      34
600    L2 256k        33     -      -           18      18      18      18
600    L2 256k        66     -      -           22      22      22      22
600    EMIFA (SDRAM)  33     133    64          18      18      18      18
600    EMIFA (SDRAM)  33     133    32          22      22      22      22
600    EMIFA (SDRAM)  33     100    64          22      22      22      22
600    EMIFA (SDRAM)  33     100    32          22      22      22      22
600    EMIFA (SDRAM)  66     133    64          26      26      26      26
600    EMIFA (SDRAM)  66     133    32          30      30      30      30
600    EMIFA (SDRAM)  66     100    64          30      30      30      30
600    EMIFA (SDRAM)  66     100    32          34      34      34      34
720    L2 256k        33     -      -           14      14      14      14
720    L2 256k        66     -      -           22      22      22      22
720    EMIFA (SDRAM)  33     133    64          18      18      18      18
720    EMIFA (SDRAM)  33     133    32          22      22      22      22
720    EMIFA (SDRAM)  66     133    64          26      26      26      26
720    EMIFA (SDRAM)  66     133    32          30      30      30      30
Figure 2 displays the latency where:
• add. represents the target address.
• cmd is the command used to determine the type of transaction to be performed.
• d# represents the data items being transferred.
For write transfers, data is ready immediately; therefore, the latency is always one PCI cycle.
[Timing diagram: clk, frame, AD, C/BE, IRDY, TRDY, and STOP. The marked latency interval
covers the add/cmd phase, which is followed directly by data items d1 to d5.]
Figure 2. Write Latency Timing Diagram
5 Number of PCI Cycles Required to Transfer n Words of Data
Before a device transfers data through the PCI bus, it must prefetch data into either its
read-ahead or write buffers. The device may then connect to and burst data across the PCI bus.
As the data in the buffer depletes, the device must simultaneously prefetch more data while
continually transferring information. If the time required to prefetch data exceeds the time to
transfer data, the buffer will eventually completely empty and the device will be forced to
disconnect from its target. Once more data are available in the buffer, the initiator may reattempt
to complete the transaction.
Figure 3 displays the number of cycles to transfer five words where:
• add. represents the target address.
• cmd is the command used to determine the type of transaction to be performed.
• t/a represents the turn-around cycle required by reads to hand off control of the AD bus to
the target.
• d# represents the data items being transferred.
[Timing diagram: clk, frame, AD, C/BE, IRDY, TRDY, and STOP, with the measured interval
labeled "n words". Retried add/cmd phases are followed by add, t/a, and data items d1 to d5.]
Figure 3. Number of PCI Cycles to Read 5 Words Timing Diagram
Table 2. Number of PCI Cycles to Read n Words of Data
CPU, PCI, and EMIF are clock rates in MHz; Width is the EMIF width in bits. The remaining
columns give the transfer cycles (Xfer) for burst lengths of 8, 16, 64, and 1024 words.
CPU    SRC/DST        PCI    EMIF   Width        8      16      64    1024
-----  -------------  -----  -----  ------  ------  ------  ------  ------
500    L2 256k        33     -      -            8      16      64    1024
500    L2 256k        66     -      -            8      16      64    1479
500    EMIFA (SDRAM)  33     133    64           8      16      64    1024
500    EMIFA (SDRAM)  33     133    32           8      16      64    1024
500    EMIFA (SDRAM)  33     100    64           8      16      64    1024
500    EMIFA (SDRAM)  33     100    32           8      16      64    1210
500    EMIFA (SDRAM)  66     133    64           8      16     154    2809
500    EMIFA (SDRAM)  66     133    32           8      16     221    3660
500    EMIFA (SDRAM)  66     100    64           8      16     213    3268
500    EMIFA (SDRAM)  66     100    32           8      16     245    4584
600    L2 256k        33     -      -            8      16      64    1024
600    L2 256k        66     -      -            8      16      64    1024
600    EMIFA (SDRAM)  33     133    64           8      16      64    1024
600    EMIFA (SDRAM)  33     133    32           8      16      64    1024
600    EMIFA (SDRAM)  33     100    64           8      16      64    1024
600    EMIFA (SDRAM)  33     100    32           8      16      64    1303
600    EMIFA (SDRAM)  66     133    64           8      16     158    2730
600    EMIFA (SDRAM)  66     133    32           8      16     146    3049
600    EMIFA (SDRAM)  66     100    64           8      16     150    2886
600    EMIFA (SDRAM)  66     100    32           8      16     253    4809
720    L2 256k        33     -      -            8      16      64    1024
720    L2 256k        66     -      -            8      16      64    1024
720    EMIFA (SDRAM)  33     133    64           8      16      64    1024
720    EMIFA (SDRAM)  33     133    32           8      16      64    1024
720    EMIFA (SDRAM)  66     133    64           8      16     154    2534
720    EMIFA (SDRAM)  66     133    32           8      16     142    3245
Figure 4 displays the number of cycles to transfer five words where:
• add. represents the target address.
• cmd is the command used to determine the type of transaction to be performed.
• d# represents the data items being transferred.
[Timing diagram: clk, frame, AD, C/BE, IRDY, TRDY, and STOP, with the measured interval
labeled "n words". The add/cmd phase is followed directly by data items d1 to d5.]
Figure 4. Number of PCI Cycles to Write 5 Words Timing Diagram
Table 3. Number of PCI Cycles to Write n Words of Data
CPU, PCI, and EMIF are clock rates in MHz; Width is the EMIF width in bits. The remaining
columns give the transfer cycles (Xfer) for burst lengths of 8 and 1024 words.
CPU    SRC/DST        PCI    EMIF   Width        8    1024
-----  -------------  -----  -----  ------  ------  ------
500    L2 256k        33     -      -            9    1025
500    L2 256k        66     -      -            9    1265
500    EMIFA (SDRAM)  33     133    64           9    1025
500    EMIFA (SDRAM)  33     133    32           9    1025
500    EMIFA (SDRAM)  33     100    64           9    1025
500    EMIFA (SDRAM)  33     100    32           9    1025
500    EMIFA (SDRAM)  66     133    64           9    1417
500    EMIFA (SDRAM)  66     133    32           9    1417
500    EMIFA (SDRAM)  66     100    64           9    1421
500    EMIFA (SDRAM)  66     100    32           9    1421
600    L2 256k        33     -      -            9    1025
600    L2 256k        66     -      -            9    1025
600    EMIFA (SDRAM)  33     133    64           9    1025
600    EMIFA (SDRAM)  33     133    32           9    1025
600    EMIFA (SDRAM)  33     100    64           9    1025
600    EMIFA (SDRAM)  33     100    32           9    1025
600    EMIFA (SDRAM)  66     133    64           9    1245
600    EMIFA (SDRAM)  66     133    32           9    1245
600    EMIFA (SDRAM)  66     100    64           9    1245
600    EMIFA (SDRAM)  66     100    32           9    1237
720    L2 256k        33     -      -            9    1025
720    L2 256k        66     -      -            9    1025
720    EMIFA (SDRAM)  33     133    64           9    1025
720    EMIFA (SDRAM)  33     133    32           9    1025
720    EMIFA (SDRAM)  66     133    64           9    1025
720    EMIFA (SDRAM)  66     133    32           9    1025
6 Total Throughput
The total throughput is defined as the amount of data transferred per unit time. Total PCI
throughput is measured from the start of the transaction until the last data item has been
transferred. The equation for calculating total throughput is:
\[
\text{TotalThroughput} = \frac{(\text{\#words})(4)}{(\text{pclk})(\text{latency} + \text{xfer})} \quad [\text{bytes/s}]
\]
Where:
#words is the number of words of data transferred.
pclk is the PCI clock period (typically 30ns for a 33MHz clock or 15ns for a 66MHz clock).
latency is the number of cycles between when the master starts the transaction to when the
target is ready to transfer the first data item.
xfer is the number of cycles required to transfer n words of data.
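The following sketch applies the total-throughput equation to values from the latency and transfer-cycle tables and reproduces the corresponding entries in the throughput tables. The function and constant names are hypothetical and exist only for this example; the clock periods are the typical 30 ns and 15 ns values given above, and MB/s is taken to mean 10^6 bytes per second.

```python
# Typical PCI clock period in nanoseconds for each PCI clock rate (MHz).
PCLK_NS = {33: 30.0, 66: 15.0}


def total_throughput_mb_s(words, pci_mhz, latency, xfer):
    """Return total throughput in MB/s for one burst.

    words   -- number of 4-byte words transferred
    latency -- PCI cycles until the target is ready for the first data item
    xfer    -- PCI cycles required to transfer the words
    """
    total_ns = PCLK_NS[pci_mhz] * (latency + xfer)
    return words * 4 / total_ns * 1000.0  # bytes per ns -> MB/s


# Read, 500 MHz CPU, L2, 33 MHz PCI, 8 words: latency 18, xfer 8.
print(round(total_throughput_mb_s(8, 33, 18, 8), 1))        # 41.0
# Read, 500 MHz CPU, L2, 66 MHz PCI, 1024 words: latency 26, xfer 1479.
print(round(total_throughput_mb_s(1024, 66, 26, 1479), 1))  # 181.4
# Write, 33 MHz PCI, 8 words: latency 1, xfer 9.
print(round(total_throughput_mb_s(8, 33, 1, 9), 1))         # 106.7
```

The three results match the 41, 181.4, and 106.7 MB/s entries listed in Tables 4 and 5 for those configurations.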
Figure 5 displays the total throughput of a PCI transfer where:
• add. represents the target address.
• cmd is the command used to determine the type of transaction to be performed.
• t/a represents the turn-around cycle required by reads to hand off control of the AD bus to
the target.
• d# represents the data items being transferred.
[Timing diagram: clk, frame, AD, C/BE, IRDY, TRDY, and STOP, with the measured interval
labeled "Total Throughput". Retried add/cmd phases are followed by add, t/a, and data items
d1 to d5.]
Figure 5. Read Throughput Timing Diagram
Table 4. Total Throughput for Reads (Measured in MB/s)
CPU, PCI, and EMIF are clock rates in MHz; Width is the EMIF width in bits. The remaining
columns give the throughput for burst lengths of 8, 16, 64, and 1024 words.
CPU    SRC/DST        PCI    EMIF   Width        8      16      64    1024
-----  -------------  -----  -----  ------  ------  ------  ------  ------
500    L2 256k        33     -      -           41    62.7   104.1     131
500    L2 256k        66     -      -         62.7   101.6   189.6   181.4
500    EMIFA (SDRAM)  33     133    64        35.6    56.1    99.2   130.5
500    EMIFA (SDRAM)  33     133    32        35.6    56.1    99.2   130.5
500    EMIFA (SDRAM)  33     100    64        35.6    56.1    99.2   130.5
500    EMIFA (SDRAM)  33     100    32        35.6    56.1    99.2   110.8
500    EMIFA (SDRAM)  66     133    64        56.1    92.8    92.8    96.2
500    EMIFA (SDRAM)  66     133    32        56.1    92.8      68    74.0
500    EMIFA (SDRAM)  66     100    64        56.1    92.8    70.2    82.8
500    EMIFA (SDRAM)  66     100    32        50.8    85.3    61.2    59.1
600    L2 256k        33     -      -           41    62.7   104.1     131
600    L2 256k        66     -      -         71.1   112.3   198.4   261.1
600    EMIFA (SDRAM)  33     133    64          41    62.7   104.1     131
600    EMIFA (SDRAM)  33     133    32        35.6    56.1    99.2   130.5
600    EMIFA (SDRAM)  33     100    64        35.6    56.1    99.2   130.5
600    EMIFA (SDRAM)  33     100    32        35.6    56.1    99.2     103
600    EMIFA (SDRAM)  66     133    64        62.7   101.6    92.8    99.1
600    EMIFA (SDRAM)  66     133    32        56.1    92.8      97    88.7
600    EMIFA (SDRAM)  66     100    64        56.1    92.8    94.8    93.6
600    EMIFA (SDRAM)  66     100    32        50.8    85.3    59.5    56.4
720    L2 256k        33     -      -         48.5    71.1   109.4   131.5
720    L2 256k        66     -      -         71.1   112.3   198.4   261.1
720    EMIFA (SDRAM)  33     133    64          41    62.7   104.1     131
720    EMIFA (SDRAM)  33     133    32        35.6    56.1    99.2   130.5
720    EMIFA (SDRAM)  66     133    64        62.7   101.6    94.8   106.7
720    EMIFA (SDRAM)  66     133    32        56.1    92.8    99.2    83.4
Figure 6 displays the total throughput of a PCI transfer where:
• add. represents the target address.
• cmd is the command used to determine the type of transaction to be performed.
• d# represents the data items being transferred.
[Timing diagram: clk, frame, AD, C/BE, IRDY, TRDY, and STOP, with the measured interval
labeled "Throughput". The add/cmd phase is followed directly by data items d1 to d5.]
Figure 6. Write Throughput Timing Diagram
Table 5. Total Throughput for Writes (Measured in MB/s)
CPU, PCI, and EMIF are clock rates in MHz; Width is the EMIF width in bits. The remaining
columns give the throughput for burst lengths of 8 and 1024 words.
CPU    SRC/DST        PCI    EMIF   Width        8    1024
-----  -------------  -----  -----  ------  ------  ------
500    L2 256k        33     -      -        106.7   133.1
500    L2 256k        66     -      -        213.3   215.7
500    EMIFA (SDRAM)  33     133    64       106.7   133.1
500    EMIFA (SDRAM)  33     133    32       106.7   133.1
500    EMIFA (SDRAM)  33     100    64       106.7   133.1
500    EMIFA (SDRAM)  33     100    32       106.7   133.1
500    EMIFA (SDRAM)  66     133    64       213.3   192.6
500    EMIFA (SDRAM)  66     133    32       213.3   192.6
500    EMIFA (SDRAM)  66     100    64       213.3     192
500    EMIFA (SDRAM)  66     100    32       213.3     192
600    L2 256k        33     -      -        106.7   133.1
600    L2 256k        66     -      -        213.3   266.1
600    EMIFA (SDRAM)  33     133    64       106.7   133.1
600    EMIFA (SDRAM)  33     133    32       106.7   133.1
600    EMIFA (SDRAM)  33     100    64       106.7   133.1
600    EMIFA (SDRAM)  33     100    32       106.7   133.1
600    EMIFA (SDRAM)  66     133    64       213.3   219.2
600    EMIFA (SDRAM)  66     133    32       213.3   219.2
600    EMIFA (SDRAM)  66     100    64       213.3   219.2
600    EMIFA (SDRAM)  66     100    32       213.3   220.6
720    L2 256k        33     -      -        106.7   133.1
720    L2 256k        66     -      -        213.3   266.1
720    EMIFA (SDRAM)  33     133    64       106.7   133.1
720    EMIFA (SDRAM)  33     133    32       106.7   133.1
720    EMIFA (SDRAM)  66     133    64       213.3   266.1
720    EMIFA (SDRAM)  66     133    32       213.3   266.1
7 Turn-Around Penalty
The turn-around penalty is the overhead associated with back-to-back transfers.
During a read, the target device prefetches data and issues retries until it is ready to stream data
to its master device. Similarly, for writes, the target device prefetches data into a write buffer
before transferring the first data item. When a device transitions from one transaction to another,
the read-ahead or write buffers might be full or partially full, so the device must empty them
and prefetch new data before performing its next burst transfer. The additional time required
to flush the FIFOs is the turn-around penalty. The back-to-back configurations are read/write,
read/read, write/read, and write/write.
Figure 7 displays a back-to-back read/read transaction where:
• ad. is the target address.
• cm is the command determining the type of transaction to be performed.
• t/a represents the turn-around cycle required by reads to hand off control of the AD bus to the
target.
• d# represents the data items being transferred.
[Timing diagram: clk, frame, AD, C/BE, IRDY, TRDY, and STOP for two back-to-back reads. The
first read is marked "Isolated latency" and the second is marked "Turn-Around latency"; each
shows retried ad./cm phases followed by ad., t/a, d1, and d2.]
Figure 7. Read/Read Turn-Around Penalty
The equation to calculate the turn-around penalty is as follows:
\[
\text{TurnAroundPenalty} = \text{TurnAroundLatency} - \text{IsolatedLatency}
\]
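As a small worked example, the sketch below rearranges the equation to recover the turn-around latency from a tabulated penalty. The function names are hypothetical. The example assumes that the isolated latency of the second (read) transaction equals the read latency in Table 1, which the report does not state explicitly.

```python
def turn_around_penalty(turn_around_latency, isolated_latency):
    """Penalty in PCI clock cycles, per the equation above."""
    return turn_around_latency - isolated_latency


def turn_around_latency(isolated_latency, penalty):
    """Rearranged form: latency seen by the second back-to-back transfer."""
    return isolated_latency + penalty


# 500 MHz CPU, EMIFA (SDRAM), 33 MHz PCI, 100 MHz EMIF, 32-bit width:
isolated = 22  # read latency from Table 1 (assumed to be the isolated latency)
penalty = 4    # R/R penalty from Table 6
back_to_back = turn_around_latency(isolated, penalty)
print(back_to_back)                                 # 26
print(turn_around_penalty(back_to_back, isolated))  # 4
```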
Table 6. Turn-Around Penalty (Measured in PCI Clock Cycles)
CPU, PCI, and EMIF are clock rates in MHz; Width is the EMIF width in bits. The remaining
columns give the penalty for each back-to-back type: read/write (R/W), read/read (R/R),
write/read (W/R), and write/write (W/W).
CPU    SRC/DST        PCI    EMIF   Width      R/W     R/R     W/R     W/W
-----  -------------  -----  -----  ------  ------  ------  ------  ------
500    L2 256k        33     -      -            0       4       8       8
500    L2 256k        66     -      -            0       4       8       8
500    EMIFA (SDRAM)  33     133    64           0       0       8       8
500    EMIFA (SDRAM)  33     133    32           0       0       8       8
500    EMIFA (SDRAM)  33     100    64           0       0       8       8
500    EMIFA (SDRAM)  33     100    32           0       4       8       8
500    EMIFA (SDRAM)  66     133    64           0      16       8       8
500    EMIFA (SDRAM)  66     133    32           0      20      12       8
500    EMIFA (SDRAM)  66     100    64           0      20       8       8
500    EMIFA (SDRAM)  66     100    32           0      20       8       8
600    L2 256k        33     -      -            0       0       8       8
600    L2 256k        66     -      -            0       4       8       8
600    EMIFA (SDRAM)  33     133    64           0       4       8       8
600    EMIFA (SDRAM)  33     133    32           0       0       8       8
600    EMIFA (SDRAM)  33     100    64           0       0       8       8
600    EMIFA (SDRAM)  33     100    32           0       4       8       8
600    EMIFA (SDRAM)  66     133    64           0      16      12       8
600    EMIFA (SDRAM)  66     133    32           0      16       8       8
600    EMIFA (SDRAM)  66     100    64           0      20       8       8
600    EMIFA (SDRAM)  66     100    32           0      24       8       8
720    L2 256k        33     -      -            0       4       8       8
720    L2 256k        66     -      -            0       0       8       8
720    EMIFA (SDRAM)  33     133    64           0       4       8       8
720    EMIFA (SDRAM)  33     133    32           0       4      12       8
720    EMIFA (SDRAM)  66     133    64           0      12       8       8
720    EMIFA (SDRAM)  66     133    32           0      16       8       8
8 References
1. TMS320C6000 Peripherals Reference Guide (SPRU190)
2. Shanley, Tom and Don Anderson. PCI System Architecture 4th ed. Mindshare Inc., 1999.
3. TMS320C6201/6701 DSP Host Port Interface (HPI) Performance (SPRA449)
IMPORTANT NOTICE
Texas Instruments Incorporated and its subsidiaries (TI) reserve the right to make corrections, modifications,
enhancements, improvements, and other changes to its products and services at any time and to discontinue
any product or service without notice. Customers should obtain the latest relevant information before placing
orders and should verify that such information is current and complete. All products are sold subject to TI’s terms
and conditions of sale supplied at the time of order acknowledgment.
TI warrants performance of its hardware products to the specifications applicable at the time of sale in
accordance with TI’s standard warranty. Testing and other quality control techniques are used to the extent TI
deems necessary to support this warranty. Except where mandated by government requirements, testing of all
parameters of each product is not necessarily performed.
TI assumes no liability for applications assistance or customer product design. Customers are responsible for
their products and applications using TI components. To minimize the risks associated with customer products
and applications, customers should provide adequate design and operating safeguards.
TI does not warrant or represent that any license, either express or implied, is granted under any TI patent right,
copyright, mask work right, or other TI intellectual property right relating to any combination, machine, or process
in which TI products or services are used. Information published by TI regarding third-party products or services
does not constitute a license from TI to use such products or services or a warranty or endorsement thereof.
Use of such information may require a license from a third party under the patents or other intellectual property
of the third party, or a license from TI under the patents or other intellectual property of TI.
Reproduction of information in TI data books or data sheets is permissible only if reproduction is without
alteration and is accompanied by all associated warranties, conditions, limitations, and notices. Reproduction
of this information with alteration is an unfair and deceptive business practice. TI is not responsible or liable for
such altered documentation.
Resale of TI products or services with statements different from or beyond the parameters stated by TI for that
product or service voids all express and any implied warranties for the associated TI product or service and
is an unfair and deceptive business practice. TI is not responsible or liable for any such statements.
Following are URLs where you can obtain information on other Texas Instruments products and application
solutions:
Products
Amplifiers          amplifier.ti.com
Data Converters     dataconverter.ti.com
DSP                 dsp.ti.com
Interface           interface.ti.com
Logic               logic.ti.com
Power Mgmt          power.ti.com
Microcontrollers    microcontroller.ti.com
Applications
Audio               www.ti.com/audio
Automotive          www.ti.com/automotive
Broadband           www.ti.com/broadband
Digital Control     www.ti.com/digitalcontrol
Military            www.ti.com/military
Optical Networking  www.ti.com/opticalnetwork
Security            www.ti.com/security
Telephony           www.ti.com/telephony
Video & Imaging     www.ti.com/video
Wireless            www.ti.com/wireless
Mailing Address:
Texas Instruments
Post Office Box 655303 Dallas, Texas 75265
Copyright © 2003, Texas Instruments Incorporated