MIPS32® 4Km® Processor Core Datasheet

November 19, 2004
The MIPS32® 4Km® core from MIPS® Technologies is a member of the MIPS32 4K® processor core family. It is a high-performance, low-power, 32-bit MIPS RISC core designed for custom system-on-silicon applications. The core is designed
for semiconductor manufacturing companies, ASIC developers, and system OEMs who want to rapidly integrate their own
custom logic and peripherals with a high-performance RISC processor. It is highly portable across processes, and can be
easily integrated into full system-on-silicon designs, allowing developers to focus their attention on end-user products. The
4Km core is ideally positioned to support new products for emerging segments of the digital consumer, network, systems,
and information management markets, enabling new tailored solutions for embedded applications.
The 4Km core implements the MIPS32 Architecture and contains all MIPS II™ instructions; special multiply-accumulate
(MAC), conditional move, prefetch, wait, and leading zero/one detect instructions; and the 32-bit privileged resource
architecture. The Memory Management Unit consists of a simple, fixed Block Address Translation (BAT) mechanism for
applications that do not require the full capabilities of a Translation Lookaside Buffer based MMU.
The synthesizable 4Km core implements single cycle MAC instructions, which enable DSP algorithms to be performed
efficiently. The Multiply/Divide Unit (MDU) allows 32-bit x 16-bit MAC instructions to be issued every cycle. A 32-bit x
32-bit MAC instruction can be issued every 2 cycles.
Instruction and data caches are fully configurable from 0 - 16 Kbytes in size. In addition, each cache can be organized as
direct-mapped or 2-way, 3-way, or 4-way set associative. Load and fetch cache misses only block until the critical word
becomes available. The pipeline resumes execution while the remaining words are being written to the cache. Both caches
are virtually indexed and physically tagged to allow them to be accessed in the same clock that the address is translated.
An optional Enhanced JTAG (EJTAG) block allows for single-stepping of the processor as well as instruction and data
virtual address breakpoints.
Figure 1 shows a block diagram of the 4Km core. The core is divided into required and optional blocks as shown.
[Figure 1: block diagram of the processor core. Fixed/required blocks: Execution Core, Mul/Div Unit, System Coprocessor, MMU (BAT), Cache Control, BIU (thin interface to the on-chip bus(es)), Power Mgmt. Optional blocks: Instruction Cache, Data Cache, EJTAG.]
Figure 1 4Km Core Block Diagram
MIPS32® 4Km® Processor Core Datasheet, Revision 01.08
Copyright © 2000-2002 MIPS Technologies Inc. All rights reserved.
Features
• 32-bit Address and Data Paths
• EJTAG Debug Support with single stepping, virtual instruction and data address breakpoints
• MIPS32-Compatible Instruction Set
– All MIPS II Instructions
– Multiply-Accumulate and Multiply-Subtract Instructions (MADD, MADDU, MSUB, MSUBU)
– Targeted Multiply Instruction (MUL)
– Zero/One Detect Instructions (CLZ, CLO)
– Wait Instruction (WAIT)
– Conditional Move Instructions (MOVZ, MOVN)
– Prefetch Instruction (PREF)
• Programmable Cache Sizes
– Individually configurable instruction and data caches
– Sizes from 0 - 16KB
– Direct Mapped, 2-, 3-, or 4-Way Set Associative
– Loads block only until critical word is available
– Write-through, no write-allocate
– 16-byte cache line size, word sectored
– Virtually indexed, physically tagged
– Cache line locking support
– Non-blocking prefetches
• Scratchpad RAM Support
– Can optionally replace 1 way of the I- and/or D-cache with a fast scratchpad RAM
– 20 index address bits allow access of arrays up to 1MB
– Memory-mapped registers attached to the scratchpad port can be used as a coprocessor interface
• R4000-style Privileged Resource Architecture
– Count/Compare registers for real-time timer interrupts
– I and D watch registers for SW breakpoints
– Separate interrupt exception vector
• Memory Management Unit
– Simple Block Address Translation (BAT) mechanism
• Simple Bus Interface Unit (BIU)
– All I/Os fully registered
– Separate unidirectional 32-bit address and data buses
– Two 16-byte collapsing write buffers
• Multiply/Divide Unit
– Maximum issue rate of one 32x16 multiply per clock
– Maximum issue rate of one 32x32 multiply every other clock
– Early-in iterative divide. Minimum 11 and maximum 34 clock latency (dividend (rs) sign extension-dependent)
• Power Control
– Minimum frequency: 0 MHz
– Power-down mode (triggered by WAIT instruction)
– Support for software-controlled clock divider
Architecture Overview
The 4Km core contains both required and optional blocks. Required blocks are the lightly shaded areas of the block diagram in Figure 1 and must be implemented to remain MIPS-compliant. Optional blocks can be added to the 4Km core based on the needs of the implementation.
The required blocks are as follows:
• Execution Unit
• Multiply/Divide Unit (MDU)
• System Control Coprocessor (CP0)
• Memory Management Unit (MMU)
• Block Address Translation (BAT)
• Cache Controllers
• Bus Interface Unit (BIU)
• Power Management
Optional blocks include:
• Instruction Cache
• Data Cache
• Scratchpad RAM
• Enhanced JTAG (EJTAG) Controller
The section entitled "4Km Core Required Logic Blocks" on page 3 discusses the required blocks. The section entitled "4Km Core Optional Logic Blocks" on page 10 discusses the optional blocks.
Pipeline Flow
The 4Km core implements a 5-stage pipeline with performance similar to the R3000 pipeline. The pipeline allows the processor to achieve high frequency while minimizing device complexity, reducing both cost and power consumption.
The 4Km core pipeline consists of five stages:
• Instruction (I Stage)
• Execution (E Stage)
• Memory (M Stage)
• Align (A Stage)
• Writeback (W Stage)
The 4Km core implements a bypass mechanism that allows the result of an operation to be forwarded directly to the instruction that needs it without having to write the result to the register and then read it back.
Figure 2 shows a timing diagram of the 4Km core pipeline.
[Figure 2: timing diagram across the I, E, M, A, and W stages. Labels shown: I-Cache (I-A1, I-A2), I Dec, RegRd, ALU Op, D-Cache (D-AC), Align, RegW, Bypass paths, and separate rows for Mul-16x16/32x16, Mul-32x32, and Div, each ending in Acc and RegW.]
Figure 2 4Km Core Pipeline
I Stage: Instruction Fetch
During the Instruction fetch stage:
• An instruction is fetched from instruction cache.
E Stage: Execution
During the Execution stage:
• Operands are fetched from register file.
• The arithmetic logic unit (ALU) begins the arithmetic or logical operation for register-to-register instructions.
• The ALU calculates the data virtual address for load and store instructions.
• The ALU determines whether the branch condition is true and calculates the virtual branch target address for branch instructions.
• Instruction logic selects an instruction address.
• All multiply and divide operations begin in this stage.
M Stage: Memory Fetch
During the memory fetch stage:
• The arithmetic ALU operation completes.
• The data cache fetch and the data virtual-to-physical address translation are performed for load and store instructions.
• Data cache look-up is performed and a hit/miss determination is made.
• A 16x16 or 32x16 multiply calculation completes.
• A 32x32 multiply operation stalls for one clock in the M stage.
• A divide operation stalls for a maximum of 34 clocks in the M stage. Early-in sign extension detection on the dividend will skip 7, 15, or 23 stall clocks.
A Stage: Align
During the Align stage:
• A separate aligner aligns load data to its word boundary.
• A 16x16 or 32x16 multiply operation performs the carry-propagate-add. The actual register writeback is performed in the W stage.
• A MUL operation makes the result available for writeback. The actual register writeback is performed in the W stage.
W Stage: Writeback
• For register-to-register or load instructions, the instruction result is written back to the register file during the W stage.
4Km Core Required Logic Blocks
The 4Km core consists of the following required logic blocks as shown in Figure 1. These logic blocks are defined in the following subsections:
• Execution Unit
• Multiply/Divide Unit (MDU)
• System Control Coprocessor (CP0)
• Memory Management Unit (MMU)
• Block Address Translation (BAT)
• Cache Controller
• Bus Interface Control (BIU)
• Power Management
Execution Unit
The 4Km core execution unit implements a load/store architecture with single-cycle ALU operations (logical, shift, add, subtract) and an autonomous multiply/divide unit. The 4Km core contains thirty-two 32-bit general-purpose registers used for integer operations and address calculation. The register file consists of two read ports and one write port and is fully bypassed to minimize operation latency in the pipeline.
The execution unit includes:
• 32-bit adder used for calculating the data address
• Address unit for calculating the next instruction address
• Logic for branch determination and branch target address calculation
• Load aligner
• Bypass multiplexers used to avoid stalls when executing instruction streams where data-producing instructions are followed closely by consumers of their results
• Leading Zero/One detect unit for implementing the CLZ and CLO instructions
• Arithmetic Logic Unit (ALU) for performing bitwise logical operations
• Shifter & Store Aligner
Multiply/Divide Unit (MDU)
The 4Km core contains a multiply/divide unit (MDU) that contains a separate pipeline for multiply and divide operations. This pipeline operates in parallel with the integer unit (IU) pipeline and does not stall when the IU pipeline stalls. This setup allows long-running MDU operations, such as a divide, to be partially masked by system stalls and/or other integer unit instructions.
The MDU consists of a 32x16 Booth-recoded multiplier, result/accumulation registers (HI and LO), a divide state machine, and the necessary multiplexers and control logic. The first number shown (‘32’ of 32x16) represents the rs operand. The second number (‘16’ of 32x16) represents the rt operand. The 4Km core only checks the value of the latter (rt) operand to determine how many times the operation must pass through the multiplier. The 16x16 and 32x16 operations pass through the multiplier once. A 32x32 operation passes through the multiplier twice.
The MDU supports execution of one 16x16 or 32x16 multiply operation every clock cycle; 32x32 multiply operations can be issued every other clock cycle. Appropriate interlocks are implemented to stall the issuance of back-to-back 32x32 multiply operations. The multiply operand size is automatically determined by logic built into the MDU.
Divide operations are implemented with a simple 1 bit per clock iterative algorithm. An early-in detection checks the sign extension of the dividend (rs) operand. If rs is 8 bits wide, 23 iterations are skipped. For a 16-bit-wide rs, 15 iterations are skipped, and for a 24-bit-wide rs, 7 iterations are skipped. Any attempt to issue a subsequent MDU instruction while a divide is still active causes an IU pipeline stall until the divide operation is completed.
Table 1 lists the repeat rate (the number of cycles until the operation can be reissued, at peak issue rate) and latency (number of cycles until a result is available) for the 4Km core multiply and divide instructions. The approximate latency and repeat rates are listed in terms of pipeline clocks. For a more detailed discussion of latencies and repeat rates, refer to Chapter 2 of the MIPS32 4K™ Processor Core Family Software User’s Manual.
Table 1 4Km Core Integer Multiply/Divide Unit Latencies and Repeat Rates
Opcode                                  Operand Size (mul rt) (div rs)   Latency   Repeat Rate
MULT/MULTU, MADD/MADDU, MSUB/MSUBU      16 bits                          1         1
                                        32 bits                          2         2
MUL                                     16 bits                          2         1
                                        32 bits                          3         2
DIV/DIVU                                8 bits                           12        11
                                        16 bits                          19        18
                                        24 bits                          26        25
                                        32 bits                          33        32
The MIPS architecture defines that the result of a multiply or divide operation be placed in the HI and LO registers. Using the Move-From-HI (MFHI) and Move-From-LO (MFLO) instructions, these values can be transferred to the general-purpose register file.
As an enhancement to the MIPS II ISA, the 4Km core
implements an additional multiply instruction, MUL,
which specifies that multiply results be placed in the
primary register file instead of the HI/LO register pair. By
avoiding the explicit MFLO instruction, required when
using the LO register, and by supporting multiple
destination registers, the throughput of multiply-intensive
operations is increased.
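The figures in Table 1 can be captured as a small lookup, which is handy when estimating the cost of a multiply-heavy loop. The sketch below is illustrative only: the function names are hypothetical, and it assumes that an operand counts as "N bits wide" when it is a sign-extended N-bit two's complement value, which is how the early-in description reads but is not spelled out for every opcode.

```python
# (latency, repeat rate) in pipeline clocks, copied from Table 1.
MDU_TIMING = {
    ("MULT", 16): (1, 1),   # MULT/MULTU, MADD/MADDU, MSUB/MSUBU
    ("MULT", 32): (2, 2),
    ("MUL", 16): (2, 1),
    ("MUL", 32): (3, 2),
    ("DIV", 8): (12, 11),   # DIV/DIVU
    ("DIV", 16): (19, 18),
    ("DIV", 24): (26, 25),
    ("DIV", 32): (33, 32),
}


def signed_width(value, candidates):
    """Return the smallest candidate width whose signed range holds `value`.

    `value` is treated as a 32-bit two's complement register value.
    Assumption: the hardware classifies operands by sign extension.
    """
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    for bits in candidates:
        if -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
            return bits
    return 32


def mdu_timing(op_class, operand):
    """Look up (latency, repeat rate) for rt (multiplies) or rs (divides)."""
    widths = (8, 16, 24, 32) if op_class == "DIV" else (16, 32)
    return MDU_TIMING[(op_class, signed_width(operand, widths))]
```

For example, `mdu_timing("MUL", 0x1234)` selects the 16-bit row for MUL, while a dividend of `0xFFFFFF80` (the value -128) selects the 8-bit DIV row under the sign-extension assumption.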
Two other instructions, multiply-add (MADD) and
multiply-subtract (MSUB), are used to perform the
multiply-accumulate and multiply-subtract operations. The
MADD instruction multiplies two numbers and then adds
the product to the current contents of the HI and LO
registers. Similarly, the MSUB instruction multiplies two
operands and then subtracts the product from the HI and
LO registers. The MADD and MSUB operations are
commonly used in DSP algorithms.
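As a rough illustration of the accumulate behavior described above, the following sketch models HI and LO as one 64-bit accumulator. It is a behavioral model only, not a description of the hardware; the function names are hypothetical, and it assumes signed 32-bit operands (the MADD/MSUB case rather than MADDU/MSUBU).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit register value as a signed integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def madd(hi, lo, rs, rt):
    """Return the new (HI, LO) after adding rs * rt to the HI/LO pair."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc + to_signed32(rs) * to_signed32(rt)) & MASK64
    return (acc >> 32) & MASK32, acc & MASK32


def msub(hi, lo, rs, rt):
    """Return the new (HI, LO) after subtracting rs * rt from the HI/LO pair."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc - to_signed32(rs) * to_signed32(rt)) & MASK64
    return (acc >> 32) & MASK32, acc & MASK32


def dot_product(xs, ys):
    """Accumulate a dot product the way a MADD loop would, starting from HI = LO = 0."""
    hi = lo = 0
    for x, y in zip(xs, ys):
        hi, lo = madd(hi, lo, x, y)
    return hi, lo
```

A dot product, the inner loop of many FIR filters, maps onto one `madd` step per tap, with the result read back from HI and LO only once at the end.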
System Control Coprocessor (CP0)
In the MIPS architecture, CP0 is responsible for the virtual-to-physical address translation and cache protocols, the
exception control system, the processor’s diagnostics
capability, the operating modes (kernel, user, and debug),
and interrupts enabled or disabled. Configuration
information such as cache size and set associativity is
available by accessing the CP0 registers, listed in Table 2.
Table 2 Coprocessor 0 Registers in Numerical Order
Register Number   Register Name    Function
0                 Index [1]        Reserved in the 4Km core.
1                 Random [1]       Reserved in the 4Km core.
2                 EntryLo0 [1]     Reserved in the 4Km core.
3                 EntryLo1 [1]     Reserved in the 4Km core.
4                 Context [2]      Pointer to page table entry in memory.
5                 PageMask [1]     Reserved in the 4Km core.
6                 Wired [1]        Reserved in the 4Km core.
7                 Reserved         Reserved.
8                 BadVAddr [2]     Reports the address for the most recent address-related exception.
9                 Count [2]        Processor cycle count.
10                EntryHi [1]      Reserved in the 4Km core.
11                Compare [2]      Timer interrupt control.
12                Status [2]       Processor status and control.
13                Cause [2]        Cause of last general exception.
14                EPC [2]          Program counter at last exception.
15                PRId             Processor identification and revision.
16                Config           Configuration register.
16                Config1          Configuration register 1.
17                LLAddr           Load linked address.
18                WatchLo [2]      Low-order watchpoint address.
19                WatchHi [2]      High-order watchpoint address.
20 - 22           Reserved         Reserved.
23                Debug [3]        Debug control and exception status.
24                DEPC [3]         Program counter at last debug exception.
25 - 27           Reserved         Reserved.
28                TagLo/DataLo     Low-order portion of cache tag interface.
29                Reserved         Reserved.
30                ErrorEPC [2]     Program counter at last error.
31                DeSave [3]       Debug handler scratchpad register.
1. Registers used in memory management.
2. Registers used in exception processing.
3. Registers used during debug.
Coprocessor 0 also contains the logic for identifying and managing exceptions. Exceptions can be caused by a variety of sources, including boundary cases in data, external events, or program errors. Table 3 shows the exception types in order of priority.
Table 3 4Km Core Exception Types
Exception        Description
Reset            Assertion of SI_ColdReset signal.
Soft Reset       Assertion of SI_Reset signal.
DSS              EJTAG Debug Single Step.
DINT             EJTAG Debug Interrupt. Caused by the assertion of the external EJ_DINT input, or by setting the EjtagBrk bit in the ECR register.
NMI              Assertion of EB_NMI signal.
Machine Check    TLB write that conflicts with an existing entry.
Interrupt        Assertion of unmasked hardware or software interrupt signal.
Deferred Watch   Deferred Watch (unmasked by K|DM -> !(K|DM) transition).
DIB              EJTAG debug hardware instruction break matched.
WATCH            A reference to an address in one of the watch registers (fetch).
AdEL             Fetch address alignment error. Fetch reference to protected address.
TLBL             Fetch TLB miss.
IBE              Instruction fetch bus error.
DBp              EJTAG Breakpoint (execution of SDBBP instruction).
Sys              Execution of SYSCALL instruction.
Bp               Execution of BREAK instruction.
RI               Execution of a Reserved Instruction.
CpU              Execution of a coprocessor instruction for a coprocessor that is not enabled.
Ov               Execution of an arithmetic instruction that overflowed.
Tr               Execution of a trap (when trap condition is true).
DDBL / DDBS      EJTAG Data Address Break (address only) or EJTAG Data Value Break on Store (address+value).
WATCH            A reference to an address in one of the watch registers (data).
AdEL             Load address alignment error. Load reference to protected address.
AdES             Store address alignment error. Store to protected address.
DBE              Load or store bus error.
DDBL             EJTAG data hardware breakpoint matched in load data compare.
Modes of Operation
The 4Km core supports three modes of operation: user mode, kernel mode, and debug mode. User mode is most often used for application programs. Kernel mode is typically used for handling exceptions and operating system kernel functions, including CP0 management and I/O device accesses. An additional Debug mode is used during system bring-up and software development. Refer to the EJTAG section for more information on debug mode.
Virtual Address Range       Segment   Description
0xFF400000 - 0xFFFFFFFF     kseg3     Memory Mapped
0xFF200000 - 0xFF3FFFFF     kseg3     Memory/EJTAG [1]
0xE0000000 - 0xFF1FFFFF     kseg3     Memory Mapped
0xC0000000 - 0xDFFFFFFF     kseg2     Kernel virtual address space. Mapped, 512 MB
0xA0000000 - 0xBFFFFFFF     kseg1     Kernel virtual address space. Unmapped, 512 MB, Uncached
0x80000000 - 0x9FFFFFFF     kseg0     Kernel virtual address space. Unmapped, 512 MB
0x00000000 - 0x7FFFFFFF     kuseg     User virtual address space. Mapped, 2048 MB
1. This space is mapped to memory in user or kernel mode, and by the EJTAG module in debug mode.
Figure 3 4Km Core Virtual Address Map
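The virtual address map in Figure 3 amounts to a range check on the upper address bits. A minimal sketch follows, with hypothetical helper names; the rule that user mode may only reference kuseg is inferred from the "User virtual address space" and "Kernel virtual address space" labels in the figure rather than stated outright.

```python
# (first address, last address, segment name), taken from Figure 3.
SEGMENTS = [
    (0x00000000, 0x7FFFFFFF, "kuseg"),
    (0x80000000, 0x9FFFFFFF, "kseg0"),
    (0xA0000000, 0xBFFFFFFF, "kseg1"),
    (0xC0000000, 0xDFFFFFFF, "kseg2"),
    (0xE0000000, 0xFFFFFFFF, "kseg3"),
]

# Window inside kseg3 that the EJTAG module claims in debug mode (Figure 3, note 1).
EJTAG_WINDOW = (0xFF200000, 0xFF3FFFFF)


def segment_of(vaddr):
    """Return the name of the segment containing a 32-bit virtual address."""
    vaddr &= 0xFFFFFFFF
    for first, last, name in SEGMENTS:
        if first <= vaddr <= last:
            return name
    raise AssertionError("unreachable: the segments cover all 32-bit addresses")


def user_mode_may_access(vaddr):
    """Assumption: user mode is limited to kuseg; other segments are kernel-only."""
    return segment_of(vaddr) == "kuseg"


def handled_by_ejtag(vaddr, debug_mode):
    """True when the address falls in the EJTAG window while in debug mode."""
    vaddr &= 0xFFFFFFFF
    return debug_mode and EJTAG_WINDOW[0] <= vaddr <= EJTAG_WINDOW[1]
```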
Memory Management Unit (MMU)
The 4Km core contains an MMU that interfaces between the execution unit and the cache controller. The 4Km core provides a simple block address translation (BAT) mechanism that is smaller than the TLB in the MIPS32 4Kc® core and more easily synthesized. Like the TLB, the BAT performs virtual-to-physical address translation and provides attributes for the different segments. Those segments that are unmapped in the 4Kc core’s TLB implementation (kseg0 and kseg1) are translated identically by the BAT.
Figure 4 shows how the BAT is implemented in the 4Km core.
[Figure 4: the Instruction Address Calculator and the Data Address Calculator each supply a Virtual Address to the BAT and to their cache (Instruction Cache Tag RAM, Data Cache RAM); a Comparator on each side produces Instruction Hit/Miss and Data Hit/Miss.]
Figure 4 Address Translation During a Cache Access
The BAT also determines the cacheability of each segment. These attributes are controlled via bits in the Config register. Table 4 shows the encoding for the K23 (bits 30:28), KU (bits 27:25), and K0 (bits 2:0) bits of the Config register.
Table 4 Cache Coherency Attributes
Config Register Fields K23, KU, and K0   Cache Coherency Attribute
0*, 1*, 3, 4*, 5*, 6*                    Cacheable, noncoherent, write-through, no write-allocate
2, 7*                                    Uncached
* 2 and 3 are the required MIPS32 mappings for uncached and cacheable references; other values may have different meanings in other MIPS32 processors.
In the 4Km core, no translation exceptions can be taken, although address errors are still possible.
Table 5 Cacheability of Segments with Block Address Translation
Segment      Virtual Address Range        Cacheability
useg/kuseg   0x0000_0000 - 0x7FFF_FFFF    Controlled by the KU field (bits 27:25) of the Config register. See Table 4 for mapping. This segment is always uncached when ERL = 1.
kseg0        0x8000_0000 - 0x9FFF_FFFF    Controlled by the K0 field (bits 2:0) of the Config register. See Table 4 for mapping.
kseg1        0xA000_0000 - 0xBFFF_FFFF    Always uncacheable
kseg2        0xC000_0000 - 0xDFFF_FFFF    Controlled by the K23 field (bits 30:28) of the Config register. See Table 4 for mapping.
kseg3        0xE000_0000 - 0xFFFF_FFFF    Controlled by the K23 field (bits 30:28) of the Config register. See Table 4 for mapping.
The BAT performs a simple translation to map from virtual addresses to physical addresses. This mapping is shown in Figure 5.
Virtual Address               Physical Address
kseg3 (0xE000_0000)           kseg3 (0xE000_0000)
kseg2 (0xC000_0000)           kseg2 (0xC000_0000)
kseg1 (0xA000_0000)           kseg0/kseg1 (0x0000_0000)
kseg0 (0x8000_0000)           kseg0/kseg1 (0x0000_0000)
useg/kuseg (0x0000_0000)      useg/kuseg (0x4000_0000)
                              reserved (0x2000_0000)
Figure 5 BAT Memory Map (ERL=0) in the 4Km Processor Core
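The fixed mapping of Figure 5 and the cacheability rules of Tables 4 and 5 can be sketched in a few lines. This is a reading of the figure and tables, not vendor code: the function names are hypothetical, the translation covers the ERL=0 map only, and the `erl` argument reflects the Table 5 note that useg/kuseg is always uncached when ERL = 1.

```python
UNCACHED_CODES = {2, 7}  # Table 4: every other encoding is cacheable, write-through


def bat_translate(vaddr):
    """Map a virtual address to a physical address per Figure 5 (ERL=0)."""
    vaddr &= 0xFFFFFFFF
    if vaddr < 0x80000000:          # useg/kuseg is placed at physical 0x4000_0000
        return vaddr + 0x40000000
    if vaddr < 0xA0000000:          # kseg0 maps to physical 0x0000_0000
        return vaddr - 0x80000000
    if vaddr < 0xC0000000:          # kseg1 shares the same physical region as kseg0
        return vaddr - 0xA0000000
    return vaddr                    # kseg2 and kseg3 map to the same addresses


def is_cacheable(vaddr, config, erl=False):
    """Decide cacheability from the Config register fields named in Table 5."""
    vaddr &= 0xFFFFFFFF
    if vaddr < 0x80000000:
        if erl:
            return False            # useg/kuseg is always uncached when ERL = 1
        field = (config >> 25) & 0x7    # KU, bits 27:25
    elif vaddr < 0xA0000000:
        field = config & 0x7            # K0, bits 2:0
    elif vaddr < 0xC0000000:
        return False                    # kseg1 is always uncacheable
    else:
        field = (config >> 28) & 0x7    # K23, bits 30:28
    return field not in UNCACHED_CODES
```

Note that kseg0 and kseg1 alias the same physical region, so `bat_translate(0x80001000)` and `bat_translate(0xA0001000)` return the same physical address and differ only in cacheability.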
When ERL=1, useg and kuseg become unmapped and uncached. This behavior is the same as if there was a TLB. This mapping is shown in Figure 6.
[Figure 6: virtual side shows kseg3 (0xE000_0000), kseg2 (0xC000_0000), kseg1 (0xA000_0000), kseg0 (0x8000_0000), and useg/kuseg (0x0000_0000); physical side shows kseg3 (0xE000_0000), kseg2 (0xC000_0000), reserved (0x8000_0000), useg/kuseg, and kseg0/kseg1 (0x0000_0000).]
Figure 6 BAT Memory Map (ERL=1) in the 4Km Processor Core
Cache Controllers
The 4Km core instruction and data cache controllers support caches of various sizes, organizations, and set associativity. For example, the data cache can be 2 Kbytes in size and 2-way set associative, while the instruction cache can be 8 Kbytes in size and 4-way set associative. Each cache can be accessed in a single processor cycle. In addition, each cache has its own 32-bit data path and both caches can be accessed in the same pipeline clock cycle. Refer to the section entitled "4Km Core Optional Logic Blocks" on page 10 for more information on instruction and data cache organization.
The cache controllers also have built-in support for replacing one way of the cache with a scratchpad RAM. See the section entitled "4Km Core Optional Logic Blocks" on page 10 for more information on scratchpad RAMs.
Bus Interface (BIU)
The Bus Interface Unit (BIU) controls the external interface signals. Additionally, it contains the implementation of the 32-byte collapsing write buffer. The purpose of this buffer is to store and combine write transactions before issuing them at the external interface. Since the 4Km core caches follow a write-through cache policy, the write buffer significantly reduces the number of write transactions on the external interface and reduces the amount of stalling in the core due to issuance of multiple writes in a short period of time.
The write buffer is organized as two 16-byte buffers. Each buffer contains data from a single 16-byte aligned block of memory. One buffer contains the data currently being transferred on the external interface, while the other buffer contains accumulating data from the core. Data from the accumulation buffer is transferred to the external interface buffer under one of these conditions:
• When a store is attempted from the core to a different 16-byte block than is currently being accumulated
• SYNC Instruction
• Store to an invalid merge pattern
• Any load or store to uncached memory
• A load to the line being merged
Note that if the data in the external interface buffer has not been written out to memory, the core is stalled until the memory write completes. After completion of the memory write, accumulated buffer data can be written to the external interface buffer.
Merge Pattern Control
The 4Km core implements two 16-byte collapsing write buffers that allow byte, halfword, tri-byte, or word writes from the core to be accumulated in the buffer into a 16-byte value before bursting the data out onto the bus in word format. Note that writes to uncached areas are never merged.
The 4Km core provides two options for merge pattern control:
• No merge
• Full merge
In No Merge mode, writes to a different word within the same line are accumulated in the buffer. Writes to the same word cause the previous word to be driven onto the bus.
In Full Merge mode, all combinations of writes to the same line are collected in the buffer. Any pattern of byte enables is possible.
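The hand-off conditions listed for the accumulation buffer can be illustrated with a toy model. The class below is a sketch under stated assumptions, not the hardware behavior: it models Full Merge mode only (so the "invalid merge pattern" condition never arises), ignores the stall while the external interface buffer drains, and expects each store to stay within one 16-byte block. The class and attribute names are hypothetical.

```python
LINE_BYTES = 16  # size of one collapsing write buffer, per the text


class WriteBufferModel:
    """Toy model of the accumulation buffer in Full Merge mode."""

    def __init__(self):
        self.block = None  # 16-byte-aligned address being accumulated, or None
        self.bytes = {}    # byte offset within the block -> byte value
        self.bus = []      # (block address, {offset: byte}) handed to the external buffer

    def _flush(self):
        """Transfer the accumulated data, if any, to the external interface buffer."""
        if self.block is not None:
            self.bus.append((self.block, dict(self.bytes)))
        self.block, self.bytes = None, {}

    def store(self, addr, data, cached=True):
        """Store `data` (a bytes object) at `addr`."""
        block, offset = addr & ~(LINE_BYTES - 1), addr % LINE_BYTES
        # A store to a different block, or any uncached store, forces a hand-off.
        if not cached or (self.block is not None and block != self.block):
            self._flush()
        new_bytes = {offset + i: b for i, b in enumerate(data)}
        if not cached:
            self.bus.append((block, new_bytes))  # uncached writes are never merged
            return
        self.block = block
        self.bytes.update(new_bytes)

    def load(self, addr, cached=True):
        """An uncached load, or a load to the line being merged, forces a hand-off."""
        if not cached or (addr & ~(LINE_BYTES - 1)) == self.block:
            self._flush()

    def sync(self):
        """SYNC forces a hand-off."""
        self._flush()
```

Two stores to `0x100` and `0x104` collapse into a single entry, which reaches `bus` only when a later store targets another block (for example `0x110`), a load hits the same line, or `sync()` is called.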
SimpleBE Mode
To aid in attaching the 4Km core to existing busses, there is a mode that only generates “simple” byte enables. Only byte enables representing naturally aligned byte, half, and word transactions will be generated. Legal byte enable patterns are shown in Table 6. Writes with illegal byte enable patterns will be broken into two separate write transactions. This splitting is independent of the merge pattern control in the write buffer. The only case where a read can generate illegal byte enables is on an uncached tri-byte load (LWL/LWR). These reads will be converted into a word read on the bus.
Table 6 Valid SimpleBE Byte Enable Patterns
EB_BE[3:0]
0001
0010
0100
1000
0011
1100
1111
4Km Core Reset
The 4Km core has two types of reset input signals: Reset and ColdReset.
The ColdReset signal must be asserted on either a power-on reset or a cold reset. In a typical application, a power-on reset occurs when the machine is first turned on. A cold reset (also called a hard reset) typically occurs when the machine is already on and the system is rebooted. A cold reset completely initializes the internal state machines of the 4Km core without saving any state information. The Reset and ColdReset signals work in conjunction with one another to determine the type of reset operation (see Table 7).
Table 7 4Km Reset Types
Reset   ColdReset   Action
0       0           Normal Operation, no reset.
1       0           Warm or Soft reset.
X       1           Cold or Hard reset.
The Reset signal is asserted for a warm reset. A warm reset restarts the 4Km core and preserves more of the processor's internal state than a cold reset. The Reset signal can be asserted synchronously or asynchronously during a cold reset, or synchronously to initiate a warm reset. The assertion of Reset causes a soft reset exception within the 4Km core. In debug mode, EJTAG can request that the soft reset function be masked. It is system dependent whether this functionality is supported. In normal mode, the soft reset cannot be masked.
Power Management
The 4Km core offers a number of power management features, including low-power design, active power management, and power-down modes of operation. The 4Km core is a static design that supports slowing or halting the clocks, which reduces system power consumption during idle periods.
The 4Km core provides two mechanisms for system-level low power support:
• Register-controlled power management
• Instruction-controlled power management
Register-Controlled Power Management
The RP bit in the CP0 Status register provides a software mechanism for placing the system into a low power state. The state of the RP bit is available externally via the SI_RP signal. The external agent then decides whether to place the device in low power mode, such as by reducing the system clock frequency.
Three additional bits, the EXL and ERL bits of the Status register and the DM bit of the Debug register, support the power management function by allowing the user to change the power state if an exception or error occurs while the 4Km core is in a low power state. Depending on what type of exception is taken, one of these three bits will be asserted and reflected on the SI_EXL, SI_ERL, or EJ_DebugM outputs. The external agent can look at these signals and determine whether to leave the low power state to service the exception.
The following 4 power-down signals are part of the system interface and change state as the corresponding bits in the CP0 registers are set or cleared:
• The SI_RP signal represents the state of the RP bit (27) in the CP0 Status register.
• The SI_EXL signal represents the state of the EXL bit (1) in the CP0 Status register.
• The SI_ERL signal represents the state of the ERL bit (2) in the CP0 Status register.
• The EJ_DebugM signal represents the state of the DM bit (30) in the CP0 Debug register.
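The relationship between the four signals and their CP0 bits can be sketched as a simple decode. The bit positions come from the list above; the function names and the wake-up policy are hypothetical, since the text leaves the decision to the external agent.

```python
# Bit positions taken from the list above.
STATUS_RP = 1 << 27
STATUS_ERL = 1 << 2
STATUS_EXL = 1 << 1
DEBUG_DM = 1 << 30


def power_signals(status, debug):
    """Derive the four power-down outputs from the Status and Debug register values."""
    return {
        "SI_RP": bool(status & STATUS_RP),
        "SI_EXL": bool(status & STATUS_EXL),
        "SI_ERL": bool(status & STATUS_ERL),
        "EJ_DebugM": bool(debug & DEBUG_DM),
    }


def should_leave_low_power(signals):
    """One possible external-agent policy (an assumption, not mandated by the core):
    leave the low power state when an exception, error, or debug entry is pending."""
    return signals["SI_RP"] and (
        signals["SI_EXL"] or signals["SI_ERL"] or signals["EJ_DebugM"]
    )
```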
data, or data coming from the external interface. The
instruction cache control logic controls the bypass
function.
The 4Km4Km core supports instruction-cache locking.
Cache locking allows critical code or data segments to be
locked into the cache on a “per-line” basis, enabling the
system programmer to maximize the efficiency of the
system cache.
The cache-locking function is always available on all
instruction-cache entries. Entries can then be marked as
locked or unlocked on a per entry basis using the CACHE
instruction.
Instruction-Controlled Power Management
The second mechanism for invoking power-down mode is
through execution of the WAIT instruction. When the
WAIT instruction is executed, the internal clock is
suspended. However, the internal timer and some of the
input pins (SI_Int[5:0], SI_NMI, SI_Reset, and
SI_ColdReset) continue to run. Once the CPU is in
instruction-controlled power management mode, any
interrupt, NMI, or reset condition causes the CPU to exit
this mode and resume normal operation.
Data Cache
The 4Km core asserts the SI_SLEEP signal, which is part
of the system interface bus, whenever the WAIT instruction
is executed. The assertion of SI_SLEEP indicates that the
clock has stopped and the 4Km core is waiting for an
interrupt.
In addition to instruction-cache locking, the 4Km core also
supports a data-cache locking mechanism identical to the
instruction cache. Critical data segments are locked into the
cache on a “per-line” basis. The locked contents can be
updated on a store hit, but cannot be selected for
replacement on a cache miss.
4Km Core Optional Logic Blocks
The cache-locking function is always available on all data
cache entries. Entries can then be marked as locked or
unlocked on a per-entry basis using the CACHE
instruction.
The 4Km core consists of the following optional logic
blocks as shown in the block diagram in Figure 1.
The data cache is an optional on-chip memory block of up
to 16 Kbytes. This virtually indexed, physically tagged
cache is protected. Because the data cache is virtually
indexed, the virtual-to-physical address translation occurs
in parallel with the cache access. The tag holds 22 bits of
physical address, 4 valid bits, a lock bit, and the fill
replacement bit.
Instruction Cache
The instruction cache is an optional on-chip memory block
of up to 16 Kbytes. Because the instruction cache is
virtually indexed, the virtual-to-physical address
translation occurs in parallel with the cache access rather
than having to wait for the physical address translation. The
tag holds 22 bits of physical address, 4 valid bits, a lock bit,
and the fill replacement bit.
The instruction cache block also contains and manages the
instruction line fill buffer. Besides accumulating data to be
written to the cache, instruction fetches that reference data
in the line fill buffer are serviced either by a bypass of that
Cache Memory Configuration
The 4Km core incorporates on-chip instruction and data
caches that can each be accessed in a single processor
cycle. Each cache has its own 32-bit data path and can be
accessed in the same pipeline clock cycle. Table 8 lists the
4Km core instruction and data cache attributes.
Table 8 4Km Core Instruction and Data Cache Attributes
Parameter | Instruction | Data
Size | 0 - 16 Kbytes | 0 - 16 Kbytes
Organization | 1 - 4 way set associative | 1 - 4 way set associative
Line Size | 16 bytes | 16 bytes
Read Unit | 32 bits | 32 bits
Write Policy | n/a | write-through without write-allocate
Miss restart after transfer of | miss word | miss word
Cache Locking | per line | per line
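As an illustration of how the attributes in Table 8 relate to one another, the following Python sketch derives the number of sets and the index and offset widths from a cache size and associativity. The function name cache_geometry and the requirement that the set count be a power of two are assumptions made for this example; they are not taken from the core documentation.

```python
def cache_geometry(size_bytes, ways, line_bytes=16):
    """Return (sets, index_bits, offset_bits) for a set-associative cache.

    Assumes the number of sets is a power of two (an assumption of this
    sketch) and uses the 16-byte line size listed in Table 8 by default.
    """
    if size_bytes <= 0 or ways <= 0:
        raise ValueError("size and associativity must be positive")
    if size_bytes % (ways * line_bytes) != 0:
        raise ValueError("size must be a whole number of lines per way")
    sets = size_bytes // (ways * line_bytes)
    if sets & (sets - 1):
        raise ValueError("number of sets must be a power of two")
    index_bits = sets.bit_length() - 1         # log2(sets)
    offset_bits = line_bytes.bit_length() - 1  # log2(line size)
    return sets, index_bits, offset_bits


# A 16-Kbyte, 4-way cache with 16-byte lines
print(cache_geometry(16 * 1024, 4))
```

Under these assumptions each way of a 16-Kbyte, 4-way cache holds 4 Kbytes, which matches the per-way figure quoted for a regular cache way in the scratchpad RAM description.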
Cache Protocols
The 4Km core supports the following cache protocols:
• Uncached: Addresses in a memory area indicated as
uncached are not read from the cache. Stores to such
addresses are written directly to main memory, without
changing cache contents.
• Write-through: Loads and instruction fetches first
search the cache, reading main memory only if the
desired data does not reside in the cache. On data store
operations, the cache is first searched to see if the
target address is cache resident. If it is resident, the
cache contents are updated, and main memory is also
written. If the cache look-up misses, only main
memory is written.
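The two protocols can be summarized with a small behavioral model. The Python sketch below is illustrative only: the class WriteThroughCache and its methods are hypothetical names, it tracks single words rather than 16-byte lines, and it ignores replacement, locking, and the write buffer.

```python
class WriteThroughCache:
    """Word-granular model of the uncached and write-through protocols."""

    def __init__(self, memory):
        self.memory = memory  # dict: address -> value, stands in for main memory
        self.lines = {}       # dict: address -> cached copy

    def load(self, addr):
        # Cached load: search the cache first, read main memory only on a miss.
        if addr not in self.lines:
            self.lines[addr] = self.memory.get(addr, 0)
        return self.lines[addr]

    def store(self, addr, value):
        # Write-through without write-allocate: update the cache only on a hit;
        # main memory is written in either case.
        if addr in self.lines:
            self.lines[addr] = value
        self.memory[addr] = value

    def uncached_load(self, addr):
        # Uncached: never read from the cache.
        return self.memory.get(addr, 0)

    def uncached_store(self, addr, value):
        # Uncached: write main memory directly, cache contents unchanged.
        self.memory[addr] = value
```

In this model a store that misses leaves the cache untouched, which is the behavior the write-through description gives for a look-up miss.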
Scratchpad RAM
The 4Km core also supports replacing up to one way of
each cache with a scratchpad RAM. The scratchpad RAM
is user-defined and can consist of a variety of devices. The
main requirement is that it must be accessible with timing
similar to a regular cache RAM. This means that an index
will be driven one cycle, a tag will be driven the following
clock, and the scratchpad must return a hit signal and the
data in the second clock. The scratchpad can thus easily
contain a large RAM/ROM or memory-mapped registers.
The core’s interface to a scratchpad RAM is slightly
different than to a regular cache RAM. Additional index
bits allow access to a larger array, 1MB of scratchpad RAM
versus 4KB for a cache way. The core does not
automatically refill the scratchpad way and will not select
it for replacement on cache misses. Additionally, stores that
hit in the scratchpad will not generate write-throughs to
main memory.
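The size difference quoted above can be expressed as a number of address bits. The following Python sketch is a simple arithmetic check, assuming that 4KB means 4096 bytes and 1MB means 2^20 bytes; the helper name address_bits is hypothetical.

```python
def address_bits(array_bytes):
    """Number of byte-address bits needed to span a power-of-two sized array."""
    if array_bytes <= 0 or array_bytes & (array_bytes - 1):
        raise ValueError("array size must be a positive power of two")
    return array_bytes.bit_length() - 1


cache_way_bits = address_bits(4 * 1024)      # regular cache way
scratchpad_bits = address_bits(1024 * 1024)  # scratchpad RAM
extra_index_bits = scratchpad_bits - cache_way_bits

print(cache_way_bits, scratchpad_bits, extra_index_bits)
```

Under these assumptions the scratchpad interface spans eight more address bits than a regular cache way.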
EJTAG Debug Support
The 4Km core provides for an optional Enhanced JTAG
(EJTAG) interface for use in the software debug of
application and kernel code. In addition to standard user
mode and kernel modes of operation, the 4Km core
provides a Debug mode that is entered after a debug
exception (derived from a hardware breakpoint, single-step
exception, etc.) is taken and continues until a debug
exception return (DERET) instruction is executed. During
this time, the processor executes the debug exception
handler routine.
Refer to the section called "4Km Core Signal Descriptions"
on page 16 for a list of EJTAG interface signals.
The EJTAG interface operates through the Test Access Port
(TAP), a serial communication port used for transferring
test data in and out of the 4Km core. In addition to the
standard JTAG instructions, special instructions defined in
the EJTAG specification define what registers are selected
and how they are used.
Debug Registers
Three debug registers (DEBUG, DEPC, and DESAVE)
have been added to the MIPS Coprocessor 0 (CP0) register
set. The DEBUG register shows the cause of the debug
exception and is used for the setting up of single-step
operations. The DEPC, or Debug Exception Program
Counter, register holds the address on which the debug
exception was taken. This is used to resume program
execution after the debug operation finishes. Finally, the
DESAVE, or Debug Exception Save, register enables the
saving of general-purpose registers used during execution
of the debug exception handler.
To exit debug mode, a Debug Exception Return (DERET)
instruction is executed. When this instruction is executed,
the system exits debug mode, allowing normal execution of
application and system code to resume.
EJTAG Hardware Breakpoints
There are several types of simple hardware breakpoints
defined in the EJTAG specification. These stop the normal
operation of the CPU and force the system into debug
mode. There are two types of simple hardware breakpoints
implemented in the 4Km core: Instruction breakpoints and
Data breakpoints.
The 4Km core can be configured with the following
breakpoint options:
• No data or instruction breakpoints
• One data and two instruction breakpoints
• Two data and four instruction breakpoints
Instruction breaks occur on instruction fetch operations,
and the break is set on the virtual address on the bus
between the CPU and the instruction cache. Instruction
breaks can also be made on the ASID value used by the
MMU. Finally, a mask can be applied to the virtual address
to set breakpoints on a range of instructions.
Data breakpoints occur on load/store transactions.
Breakpoints are set on virtual address and ASID values,
similar to the Instruction breakpoint. Data breakpoints can
be set on a load, a store, or both. Data breakpoints can also
be set based on the value of the load/store operation.
Finally, masks can be applied to both the virtual address
and the load/store value.
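The address and ASID comparison described for hardware breakpoints can be sketched as follows. This is a minimal illustration rather than the EJTAG register-level definition: the function name and parameters are hypothetical, and the mask polarity (a mask bit of 1 excludes that address bit from the comparison) is an assumption of the example.

```python
def address_break_matches(vaddr, bp_addr, mask, asid=None, bp_asid=None):
    """Return True if a 32-bit virtual address hits a masked breakpoint.

    mask: bits set to 1 are ignored in the comparison (assumed polarity).
    bp_asid: when not None, the access ASID must also match.
    """
    compared_bits = ~mask & 0xFFFFFFFF
    addr_hit = ((vaddr ^ bp_addr) & compared_bits) == 0
    asid_hit = bp_asid is None or asid == bp_asid
    return addr_hit and asid_hit


# Break on any address in the 16-byte range starting at 0x80001000
print(address_break_matches(0x80001004, 0x80001000, 0xF))
```

A data breakpoint would add a similar masked comparison on the load/store value and a check on the access type.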
4Km Core Instructions
The 4Km core instruction set complies with the MIPS32
instruction set architecture. Table 9 provides a summary of
instructions implemented by the 4Km core.
Table 9 4Km Core Instruction Set
Instruction | Description | Function
ADD | Integer Add | Rd = Rs + Rt
ADDI | Integer Add Immediate | Rt = Rs + Immed
ADDIU | Unsigned Integer Add Immediate | Rt = Rs +U Immed
ADDU | Unsigned Integer Add | Rd = Rs +U Rt
AND | Logical AND | Rd = Rs & Rt
ANDI | Logical AND Immediate | Rt = Rs & (0^16 || Immed)
BEQ | Branch On Equal | if Rs == Rt then PC += (int)offset
BEQL | Branch On Equal Likely | if Rs == Rt then PC += (int)offset else Ignore Next Instruction
BGEZ | Branch on Greater Than or Equal To Zero | if !Rs[31] then PC += (int)offset
BGEZAL | Branch on Greater Than or Equal To Zero And Link | GPR[31] = PC + 8; if !Rs[31] then PC += (int)offset
BGEZALL | Branch on Greater Than or Equal To Zero And Link Likely | GPR[31] = PC + 8; if !Rs[31] then PC += (int)offset else Ignore Next Instruction
BGEZL | Branch on Greater Than or Equal To Zero Likely | if !Rs[31] then PC += (int)offset else Ignore Next Instruction
BGTZ | Branch on Greater Than Zero | if !Rs[31] && Rs != 0 then PC += (int)offset
BGTZL | Branch on Greater Than Zero Likely | if !Rs[31] && Rs != 0 then PC += (int)offset else Ignore Next Instruction
BLEZ | Branch on Less Than or Equal to Zero | if Rs[31] || Rs == 0 then PC += (int)offset
BLEZL | Branch on Less Than or Equal to Zero Likely | if Rs[31] || Rs == 0 then PC += (int)offset else Ignore Next Instruction
BLTZ | Branch on Less Than Zero | if Rs[31] then PC += (int)offset
BLTZAL | Branch on Less Than Zero And Link | GPR[31] = PC + 8; if Rs[31] then PC += (int)offset
BLTZALL | Branch on Less Than Zero And Link Likely | GPR[31] = PC + 8; if Rs[31] then PC += (int)offset else Ignore Next Instruction
BLTZL | Branch on Less Than Zero Likely | if Rs[31] then PC += (int)offset else Ignore Next Instruction
BNE | Branch on Not Equal | if Rs != Rt then PC += (int)offset
BNEL | Branch on Not Equal Likely | if Rs != Rt then PC += (int)offset else Ignore Next Instruction
BREAK | Breakpoint | Break Exception
CACHE | Cache Operation | See Software User’s Manual
COP0 | Coprocessor 0 Operation | See Software User’s Manual
CLO | Count Leading Ones | Rd = NumLeadingOnes(Rs)
CLZ | Count Leading Zeroes | Rd = NumLeadingZeroes(Rs)
DERET | Return from Debug Exception | PC = DEPC; Exit Debug Mode
DIV | Divide | LO = (int)Rs / (int)Rt; HI = (int)Rs % (int)Rt
DIVU | Unsigned Divide | LO = (uns)Rs / (uns)Rt; HI = (uns)Rs % (uns)Rt
ERET | Return from Exception | if SR[2] then PC = ErrorEPC else PC = EPC; SR[1] = 0; SR[2] = 0; LL = 0
J | Unconditional Jump | PC = PC[31:28] || offset<<2
JAL | Jump and Link | GPR[31] = PC + 8; PC = PC[31:28] || offset<<2
JALR | Jump and Link Register | Rd = PC + 8; PC = Rs
JR | Jump Register | PC = Rs
LB | Load Byte | Rt = (byte)Mem[Rs+offset]
LBU | Unsigned Load Byte | Rt = (ubyte)Mem[Rs+offset]
LH | Load Halfword | Rt = (half)Mem[Rs+offset]
LHU | Unsigned Load Halfword | Rt = (uhalf)Mem[Rs+offset]
LL | Load Linked Word | Rt = Mem[Rs+offset]; LL = 1; LLAdr = Rs + offset
LUI | Load Upper Immediate | Rt = immediate << 16
LW | Load Word | Rt = Mem[Rs+offset]
LWL | Load Word Left | See Software User’s Manual
LWR | Load Word Right | See Software User’s Manual
MADD | Multiply-Add | HI | LO += (int)Rs * (int)Rt
MADDU | Multiply-Add Unsigned | HI | LO += (uns)Rs * (uns)Rt
MFC0 | Move From Coprocessor 0 | Rt = CPR[0, n, sel]
MFHI | Move From HI | Rd = HI
MFLO | Move From LO | Rd = LO
MOVN | Move Conditional on Not Zero | if Rt ≠ 0 then Rd = Rs
MOVZ | Move Conditional on Zero | if Rt = 0 then Rd = Rs
MSUB | Multiply-Subtract | HI | LO -= (int)Rs * (int)Rt
MSUBU | Multiply-Subtract Unsigned | HI | LO -= (uns)Rs * (uns)Rt
MTC0 | Move To Coprocessor 0 | CPR[0, n, sel] = Rt
MTHI | Move To HI | HI = Rs
MTLO | Move To LO | LO = Rs
MUL | Multiply with register write | HI | LO = Unpredictable; Rd = ((int)Rs * (int)Rt)[31:0]
MULT | Integer Multiply | HI | LO = (int)Rs * (int)Rt
MULTU | Unsigned Multiply | HI | LO = (uns)Rs * (uns)Rt
NOR | Logical NOR | Rd = ~(Rs | Rt)
OR | Logical OR | Rd = Rs | Rt
ORI | Logical OR Immediate | Rt = Rs | Immed
PREF | Prefetch | Load Specified Line into Cache
SB | Store Byte | (byte)Mem[Rs+offset] = Rt
SC | Store Conditional Word | if LL = 1 then Mem[Rs+offset] = Rt; Rt = LL
SDBBP | Software Debug Break Point | Trap to SW Debug Handler
SH | Store Half | (half)Mem[Rs+offset] = Rt
SLL | Shift Left Logical | Rd = Rt << sa
SLLV | Shift Left Logical Variable | Rd = Rt << Rs[4:0]
SLT | Set on Less Than | if (int)Rs < (int)Rt then Rd = 1 else Rd = 0
SLTI | Set on Less Than Immediate | if (int)Rs < (int)Immed then Rt = 1 else Rt = 0
SLTIU | Set on Less Than Immediate Unsigned | if (uns)Rs < (uns)Immed then Rt = 1 else Rt = 0
SLTU | Set on Less Than Unsigned | if (uns)Rs < (uns)Rt then Rd = 1 else Rd = 0
SRA | Shift Right Arithmetic | Rd = (int)Rt >> sa
SRAV | Shift Right Arithmetic Variable | Rd = (int)Rt >> Rs[4:0]
SRL | Shift Right Logical | Rd = (uns)Rt >> sa
SRLV | Shift Right Logical Variable | Rd = (uns)Rt >> Rs[4:0]
SSNOP | Superscalar Inhibit No Operation | NOP
SUB | Integer Subtract | Rd = (int)Rs - (int)Rt
SUBU | Unsigned Subtract | Rd = (uns)Rs - (uns)Rt
SW | Store Word | Mem[Rs+offset] = Rt
SWL | Store Word Left | See Software User’s Manual
SWR | Store Word Right | See Software User’s Manual
SYNC | Synchronize | See Software User’s Manual
SYSCALL | System Call | SystemCallException
TEQ | Trap if Equal | if Rs == Rt then TrapException
TEQI | Trap if Equal Immediate | if Rs == (int)Immed then TrapException
TGE | Trap if Greater Than or Equal | if (int)Rs >= (int)Rt then TrapException
TGEI | Trap if Greater Than or Equal Immediate | if (int)Rs >= (int)Immed then TrapException
TGEIU | Trap if Greater Than or Equal Immediate Unsigned | if (uns)Rs >= (uns)Immed then TrapException
TGEU | Trap if Greater Than or Equal Unsigned | if (uns)Rs >= (uns)Rt then TrapException
TLT | Trap if Less Than | if (int)Rs < (int)Rt then TrapException
TLTI | Trap if Less Than Immediate | if (int)Rs < (int)Immed then TrapException
TLTIU | Trap if Less Than Immediate Unsigned | if (uns)Rs < (uns)Immed then TrapException
TLTU | Trap if Less Than Unsigned | if (uns)Rs < (uns)Rt then TrapException
TNE | Trap if Not Equal | if Rs != Rt then TrapException
TNEI | Trap if Not Equal Immediate | if Rs != (int)Immed then TrapException
WAIT | Wait for Interrupts | Stall until interrupt occurs
XOR | Exclusive OR | Rd = Rs ^ Rt
XORI | Exclusive OR Immediate | Rt = Rs ^ (uns)Immed
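To make the Function notation concrete, the Python sketch below models three of the entries (CLZ, CLO, and MADD) on 32-bit register values. It is an illustrative model only: the helper names are hypothetical, registers are passed as plain integers, and the HI/LO pair is treated as one 64-bit accumulator that wraps on overflow.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def clz(rs):
    """Count leading zero bits of a 32-bit value (CLZ)."""
    return 32 - (rs & MASK32).bit_length()


def clo(rs):
    """Count leading one bits of a 32-bit value (CLO)."""
    return clz(~rs & MASK32)


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def madd(hi, lo, rs, rt):
    """HI | LO += (int)Rs * (int)Rt, returning the new (hi, lo) pair."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc + to_signed32(rs) * to_signed32(rt)) & MASK64
    return acc >> 32, acc & MASK32


print(clz(1), clo(0xF0000000), madd(0, 5, 3, 4))
```

The unsigned variant MADDU would follow the same pattern with the operands masked to 32 bits instead of sign-converted.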
4Km Core Signal Descriptions
This section describes the signal interface of the 4Km
microprocessor core.
The pin direction key for the signal descriptions is shown
in Table 10 below.
Table 10 4Km Core Signal Direction Key
Dir | Description
I | Input to the 4Km core sampled on the rising edge of the appropriate CLK signal.
O | Output of the 4Km core, unless otherwise noted, driven at the rising edge of the appropriate CLK signal.
A | Asynchronous inputs that are synchronized by the core.
S | Static input to the 4Km core. These signals are normally tied to either power or ground and should not change state while SI_ColdReset is deasserted.
The 4Km core signals are listed in Table 11 below. Note
that the signals are grouped by logical function, not by
expected physical location. All signals, with the exception
of EJ_TRST_N, are active-high signals. EJ_DINT and
SI_NMI go through edge-detection logic so that only one
exception is taken each time they are asserted.
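The edge-detection behavior described for EJ_DINT and SI_NMI can be sketched as a function over a sequence of per-clock samples. This is a behavioral illustration, not the core's logic: the function name is hypothetical, and the choice that the very first sample never posts an event (because there is no earlier sample) is an assumption of the example.

```python
def posted_events(samples):
    """Return the clock indices at which an edge-detected input posts an event.

    samples: iterable of 0/1 levels, one per clock, as seen after
    synchronization. An event is posted when the signal is sampled
    asserted one clock after being sampled deasserted.
    """
    events = []
    previous = None  # no earlier sample exists before the first clock
    for clock, level in enumerate(samples):
        if previous == 0 and level == 1:
            events.append(clock)
        previous = level
    return events


# Held high for three clocks, released, then asserted again
print(posted_events([0, 1, 1, 1, 0, 1]))
```

Holding the input asserted for several clocks produces a single event, which is the property the paragraph above describes.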
Table 11 4Km Signal Descriptions
Signal Name | Type | Description
System Interface
Clock Signals:
SI_ClkIn | I | Clock Input. All inputs and outputs, except a few of the EJTAG signals, are sampled and/or asserted relative to the rising edge of this signal.
SI_ClkOut | O | Reference Clock for the External Bus Interface. This clock signal provides a reference for deskewing any clock insertion delay created by the internal clock buffering in the core.
Reset Signals:
SI_ColdReset | A | Hard/Cold Reset Signal. Causes a Reset Exception in the core.
SI_NMI | A | Non-Maskable Interrupt. An edge detect is used on this signal. When this signal is sampled asserted (high) one clock after being sampled deasserted, an NMI is posted to the core.
SI_Reset | A | Soft/Warm Reset Signal. Causes a SoftReset Exception in the core.
Power Management Signals:
SI_ERL | O | This signal represents the state of the ERL bit (2) in the CP0 Status register and indicates the error level. The core asserts SI_ERL whenever a Reset, Soft Reset, or NMI exception is taken.
SI_EXL | O | This signal represents the state of the EXL bit (1) in the CP0 Status register and indicates the exception level. The core asserts SI_EXL whenever any exception other than a Reset, Soft Reset, NMI, or Debug exception is taken.
SI_RP | O | This signal represents the state of the RP bit (27) in the CP0 Status register. Software can write this bit to indicate that the device can enter a reduced power mode.
SI_SLEEP | O | This signal is asserted by the core whenever the WAIT instruction is executed. The assertion of this signal indicates that the clock has stopped and that the core is waiting for an interrupt.
Interrupt Signals:
SI_Int[5:0] | A | Active-high Interrupt Pins. These signals are driven by external logic and, when asserted, indicate the corresponding interrupt exception to the core. These signals go through synchronization logic and can be asserted asynchronously to SI_ClkIn.
SI_TimerInt | O | This signal is asserted whenever the Count and Compare registers match and is deasserted when the Compare register is written. In order to have timer interrupts, this signal needs to be brought back into the 4Km core on one of the 6 SI_Int interrupt pins. Traditionally, this has been accomplished via muxing SI_TimerInt with SI_Int[5]. Exposing SI_TimerInt as an output allows more flexibility for the system designer. Timer interrupts can be muxed or ORed into one of the interrupts, as desired in a particular system. In a complex system, it could even be fed into a priority encoder to allow SI_Int[5:0] to map up to 63 interrupt sources.
Configuration Inputs:
SI_Endian | S | Indicates the base endianness of the core. Base Endian Mode: 0 = Little Endian, 1 = Big Endian.
SI_MergeMode[1:0] | S | The state of these signals determines the merge mode for the 16-byte collapsing write buffer. Encoding: 00b = No Merge, 01b = Reserved, 10b = Full Merge, 11b = Reserved.
SI_SimpleBE[1:0] | S | The state of these signals can constrain the core to only generate certain byte enables on EC™ interface transactions. This eases connection to some existing bus standards. Byte Enable Mode: 00b = All BEs allowed, 01b = Naturally aligned bytes, halfwords, and words only, 10b = Reserved, 11b = Reserved.
External Bus Interface
EB_ARdy | I | Indicates whether the target is ready for a new address. The core will not complete the address phase of a new bus transaction until the clock cycle after EB_ARdy is sampled asserted.
EB_AValid | O | When asserted, indicates that the values on the address bus and access types lines are valid, signifying the beginning of a new bus transaction. EB_AValid must always be valid.
EB_Instr | O | When asserted, indicates that the transaction is an instruction fetch versus a data reference. EB_Instr is only valid when EB_AValid is asserted.
EB_Write | O | When asserted, indicates that the current transaction is a write. This signal is only valid when EB_AValid is asserted.
EB_Burst | O | When asserted, indicates that the current transaction is part of a cache fill or a write burst. Note that there is redundant information contained in EB_Burst, EB_BFirst, EB_BLast, and EB_BLen. This is done to simplify the system design. The information can be used in whatever form is easiest.
EB_BFirst | O | When asserted, indicates the beginning of the burst. EB_BFirst is always valid.
EB_BLast | O | When asserted, indicates the end of the burst. EB_BLast is always valid.
EB_BLen<1:0> | O | Indicates the length of the burst. This signal is only valid when EB_AValid is asserted. Encoding: 0 = reserved, 1 = burst length 4, 2 = reserved, 3 = reserved.
EB_SBlock | SI | When sampled asserted, sub-block ordering is used. When sampled deasserted, sequential addressing is used.
EB_BE<3:0> | O | Indicates which bytes of the EB_RData or EB_WData buses are involved in the current transaction. If an EB_BE signal is asserted, the associated byte is being read or written. EB_BE lines are only valid while EB_AValid is asserted. Read data bits sampled and write data bits driven valid per byte enable: EB_BE<0> = EB_RData<7:0> and EB_WData<7:0>; EB_BE<1> = EB_RData<15:8> and EB_WData<15:8>; EB_BE<2> = EB_RData<23:16> and EB_WData<23:16>; EB_BE<3> = EB_RData<31:24> and EB_WData<31:24>.
EB_A<35:2> | O | Address lines for external bus. Only valid when EB_AValid is asserted. EB_A[35:32] are tied to 0 in this core.
EB_WData<31:0> | O | Output data for writes.
EB_RData<31:0> | I | Input Data for reads.
EB_RdVal | I | Indicates that the target is driving read data on EB_RData lines. EB_RdVal must always be valid. EB_RdVal may never be sampled asserted until the rising edge after the corresponding EB_ARdy was sampled asserted.
EB_WDRdy | I | Indicates that the target of a write is ready. The EB_WData lines can change in the next clock cycle. EB_WDRdy will not be sampled until the rising edge where the corresponding EB_ARdy is sampled asserted.
EB_RBErr | I | Bus error indicator for read transactions. EB_RBErr is sampled on every rising clock edge until an active sampling of EB_RdVal. EB_RBErr sampled with asserted EB_RdVal indicates a bus error during read. EB_RBErr must be deasserted in idle phases.
EB_WBErr | I | Bus error indicator for write transactions. EB_WBErr is sampled on the rising clock edge following an active sample of EB_WDRdy. EB_WBErr must be deasserted in idle phases.
EB_EWBE | I | Indicates that any external write buffers are empty. The external write buffers must deassert EB_EWBE in the cycle after the corresponding EB_WDRdy is asserted and keep EB_EWBE deasserted until the external write buffers are empty.
EB_WWBE | O | When asserted, indicates that the core is waiting for external write buffers to empty.
EJTAG Interface
TAP interface. These signals comprise the EJTAG Test Access Port. These signals will not be connected if the core does not implement the TAP controller.
EJ_TRST_N | I | Active-low Test Reset Input (TRST*) for the EJTAG TAP. At power-up, the assertion of EJ_TRST_N causes the TAP controller to be reset.
EJ_TCK | I | Test Clock Input (TCK) for the EJTAG TAP.
EJ_TMS | I | Test Mode Select Input (TMS) for the EJTAG TAP.
EJ_TDI | I | Test Data Input (TDI) for the EJTAG TAP.
EJ_TDO | O | Test Data Output (TDO) for the EJTAG TAP.
EJ_TDOzstate | O | Drive indication for the output of TDO for the EJTAG TAP at chip level. 1: The TDO output at chip level must be in Z-state. 0: The TDO output at chip level must be driven to the value of EJ_TDO. IEEE Standard 1149.1-1990 defines TDO as a 3-stated signal. To avoid having a 3-state core output, the 4Km core outputs this signal to drive an external 3-state buffer.
Debug Interrupt:
EJ_DINTsup | S | Value of DINTsup for the Implementation register. A 1 on this signal indicates that the EJTAG probe can use the DINT signal to interrupt the processor.
EJ_DINT | I | Debug exception request when this signal is asserted in a CPU clock period after being deasserted in the previous CPU clock period. The request is cleared when debug mode is entered. Requests when in debug mode are ignored.
Debug Mode Indication:
EJ_DebugM | O | Asserted when the core is in Debug Mode. This can be used to bring the core out of a low power mode. In systems with multiple processor cores, this signal can be used to synchronize the cores when debugging.
Device ID bits:
These inputs provide an identifying number visible to the EJTAG probe. If the EJTAG TAP controller is not implemented, these inputs are not connected. These inputs are always available for soft core customers. On hard cores, the core “hardener” can set these inputs to their own values.
EJ_ManufID[10:0] | S | Value of the ManufID[10:0] field in the Device ID register. As per IEEE 1149.1-1990 section 11.2, the manufacturer identity code shall be a compressed form of the JEDEC standard manufacturer’s identification code in JEDEC Publication 106, which can be found at: http://www.jedec.org/ ManufID[6:0] bits are derived from the last byte of the JEDEC code by discarding the parity bit. ManufID[10:7] bits provide a binary count of the number of bytes in the JEDEC code that contain the continuation character (0x7F). Where the number of continuation characters exceeds 15, these 4 bits contain the modulo-16 count of the number of continuation characters.
EJ_PartNumber[15:0] | S | Value of the PartNumber[15:0] field in the Device ID register.
EJ_Version[3:0] | S | Value of the Version[3:0] field in the Device ID register.
System Implementation Dependent Outputs:
These signals come from EJTAG control registers. They have no effect on the core, but can be used to give EJTAG debugging software additional control over the system.
EJ_SRstE | O | Soft Reset Enable. EJTAG can deassert this signal if it wants to mask soft resets. If this signal is deasserted, none, some, or all soft reset sources are masked.
EJ_PerRst | O | Peripheral Reset. EJTAG can assert this signal to request the reset of some or all of the peripheral devices in the system.
EJ_PrRst | O | Processor Reset. EJTAG can assert this signal to request that the core be reset. This can be fed into the SI_Reset signal.
Performance Monitoring Interface
These signals can be used to implement performance counters, which can be used to monitor hardware/software performance.
PM_DCacheHit | O | This signal is asserted whenever there is a data-cache hit.
PM_DCacheMiss | O | This signal is asserted whenever there is a data-cache miss.
PM_DTLBHit | O | This signal is not used in the 4Km processor core and is tied to ground.
PM_DTLBMiss | O | This signal is not used in the 4Km processor core and is tied to ground.
PM_ICacheHit | O | This signal is asserted whenever there is an instruction-cache hit.
PM_ICacheMiss | O | This signal is asserted whenever there is an instruction-cache miss.
PM_InstComplete | O | This signal is asserted each time an instruction completes in the pipeline.
PM_ITLBHit | O | This signal is not used in the 4Km processor core and is tied to ground.
PM_ITLBMiss | O | This signal is not used in the 4Km processor core and is tied to ground.
PM_JTLBHit | O | This signal is not used in the 4Km processor core and is tied to ground.
PM_JTLBMiss | O | This signal is not used in the 4Km processor core and is tied to ground.
PM_WTBMerge | O | This signal is asserted whenever there is a successful merge in the write-through buffer.
PM_WTBNoMerge | O | This signal is asserted whenever a non-merging store is written to the write-through buffer.
Scan Test Interface
These signals provide the interface for testing the core. The use and configuration of these pins are implementation-dependent.
ScanEnable | I | This signal should be asserted while scanning vectors into or out of the core. The ScanEnable signal must be deasserted during normal operation and during capture clocks in test mode.
ScanMode | I | This signal should be asserted during all scan testing both while scanning and during capture clocks. The ScanMode signal must be deasserted during normal operation.
ScanIn<n:0> | I | This signal is input to the scan chain.
ScanOut<n:0> | O | This signal is output from the scan chain.
BistIn<n:0> | I | Input to the BIST controller.
BistOut<n:0> | O | Output from the BIST controller.
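The EJ_ManufID[10:0] description in Table 11 amounts to a small encoding rule, sketched below in Python. The function name manuf_id is hypothetical, the JEDEC code is assumed to be supplied as a list of byte values (continuation bytes first, identity byte last), and the byte sequence in the usage line is an illustrative input rather than a statement about any particular manufacturer.

```python
def manuf_id(jedec_bytes):
    """Compress a JEDEC manufacturer code into an 11-bit ManufID value.

    ManufID[6:0]  = last byte with the parity bit (bit 7) discarded.
    ManufID[10:7] = number of bytes carrying the continuation character
                    0x7F, taken modulo 16.
    """
    if not jedec_bytes:
        raise ValueError("JEDEC code must contain at least one byte")
    continuation_count = sum(1 for b in jedec_bytes if (b & 0x7F) == 0x7F)
    identity = jedec_bytes[-1] & 0x7F
    return ((continuation_count % 16) << 7) | identity


# Two continuation bytes followed by an identity byte of 0xA7
print(hex(manuf_id([0x7F, 0x7F, 0xA7])))
```

The result would be driven onto the EJ_ManufID[10:0] static inputs by the integrator; how the bytes are obtained from JEDEC Publication 106 is outside the scope of this sketch.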
4Km Core Bus Transactions
The 4Km core implements the EC™ interface for its bus
transactions. This interface uses a pipelined, in-order
protocol with independent address, read data, and write
data buses. The following subsections describe the four
basic bus transactions: single read, single write, burst read,
and burst write.
Single Read
If a bus error occurs during the data transaction, external
logic asserts EB_RBErr in the same clock as EB_RdVal.
Figure 7 shows the basic timing relationships of signals
during a simple read transaction. During a single read
cycle, the 4Km core drives the address onto EB_A[35:2]
and byte enable information onto EB_BE[3:0]. To
maximize performance, the EC interface does not define a
maximum number of outstanding bus cycles. Instead it
provides the EB_ARdy input signal. This signal is driven by
external logic and controls the generation of addresses on
the bus.
In the 4Km4Km core, the address is driven whenever it
becomes available, regardless of the state of EB_ARdy.
However, the 4Km4Km core always continues to drive the
address until the clock after EB_ARdy is sampled asserted.
For example, at the rising edge of the clock 2 in Figure 7,
the EB_ARdy signal is sampled low, indicating that external
logic is not ready to accept the new address. However, the
4Km core still drives EB_A[35:2] in this clock as shown.
On the rising edge of clock 3, the 4Km core samples
EB_ARdy asserted and continues to drive the address until
the rising edge of clock 4.
Clock #
1
2
3
4
5
6
7
Single Write
Figure 8 shows a typical write transaction. The 4Km core
drives address and control information onto the
EB_A[35:2] and EB_BE[3:0] signals on the rising edge of
clock 2. As in the single read cycle, these signals remain
active until the clock edge after the EB_ARdy signal is
sampled asserted. The 4Km core asserts the EB_Write
signal to indicate that a valid write cycle is on the bus and
EB_AValid to indicate that valid address is on the bus.
The 4Km core drives write data onto EB_WData[31:0] in
the same clock as the address and continues to drive data
until the clock edge after the EB_WDRdy signal is sampled
asserted. If a bus error occurs during a write operation,
external logic asserts the EB_WBErr signal one clock after
asserting EB_WDRdy.
Clock #
1
2
3
4
5
6
7
8
EB_Clk
Address and Control held until clock after EB_ARdy sampled asserted
8
EB_ARdy
Addr
Wait
EB_Clk
Address and Control held until clock after EB_ARdy sampled asserted
EB_ARdy
Addr
Wait
EB_A[35:2]
Valid
EB_Write
EB_A[35:2]
Valid
EB_Instr,
EB_BE[3:0],
Valid
EB_BE[3:0]
Valid
EB_AValid
Driven by system logic
EB_AValid
Data is Driven until clock after EB_WDRdy
EB_WData[31:0]
EB_RData[31:0]
Valid
Valid
Driven by system logic
EB_WDRdy
EB_RdVal
EB_WBErr
EB_RBErr
Figure 8 Single Write Transaction Timing Diagram
EB_Write
Figure 7 Single Read Transaction Timing Diagram
Burst Read
The EB_Instr signal is only asserted during a single read
cycle if there is an instruction fetch from non-cacheable
memory space. The EB_AValid signal is driven in each
clock that EB_A[35:2] is valid on the bus. The 4Km core
drives EB_Write low to indicate a read transaction.
The 4Km core is capable of generating burst transactions
on the bus. A burst transaction is used to transfer multiple
data items in one transaction.
The EB_RData[31:0] and EB_RdVal signals are first
sampled on the rising edge of clock 4, one clock after
EB_ARdy is sampled asserted. Data is sampled on every
clock thereafter until EB_RdVal is sampled asserted.
22
MIPS32® 4Km® Processor Core Datasheet, Revision 01.08
Copyright © 2000-2002 MIPS Technologies Inc. All right reserved.
Clock #
1
2
3
4
5
6
7
8
Table 13
Sub-Block Ordering Protocols
EB_Clk
Starting Address
EB_A[3:2]
Address Progression
of EB_A[3:2]
00
00, 01, 10, 11
01
01, 00, 11, 10
EB_BE[3:0]
10
10, 11, 00, 01
EB_Burst
11
11, 10, 01, 00
Addr
Wait
EB_ARdy
EB_A[35:2]
Adr1
Adr2
EB_Instr
Adr3
Adr4
Valid
EB_BFirst
EB_BLast
Driven by system logic
EB_AValid
EB_RData[31:0]
Data1
Read
Wait
EB_RdVal
Data2
Data3
Data4
Read
Wait
EB_RBErr
EB_Write
Figure 9 Burst Read Transaction Timing Diagram
The 4Km4Km core drives address and control information
onto the EB_A[35:2] and EB_BE[3:0] signals on the rising
edge of clock 2. As in the single read cycle, these signals
remain active until the clock edge after the EB_ARdy signal
is sampled asserted. The 4Km core continues to drive
EB_AValid as long as a valid address is on the bus.
The EB_Instr signal is asserted if the burst read is for an
instruction fetch. The EB_Burst signal is asserted while the
address is on the bus to indicate that the current address is
part of a burst transaction. The 4Km core asserts the
EB_BFirst signal in the same clock as the first address is
driven and the EB_BLast signal in the same clock as the last
address to indicate the start and end of a burst cycle.
Figure 9 shows an example of a burst read transaction.
Burst read transactions initiated by the 4Km core always
contain four data transfers in a sequence determined by the
critical word (the address that caused the miss) and
EB_SBlock. In addition, the data requested is always a 16byte aligned block.
The 4Km core first samples the EB_RData[31:0] signals
two clocks after EB_ARDy is sampled asserted. External
logic asserts EB_RdVal to indicate that valid data is on the
bus. The 4Km core latches data internally whenever
EB_RVal is sampled asserted.
The order of words within this 16-byte block varies
depending on which of the words in the block is being
requested by the execution unit and the ordering protocol
selected. The burst always starts with the word requested
by the execution unit and proceeds in either an ascending
or descending address order, wrapping when the block
boundary is reached. Table 12 and Table 13 show the
sequence of address bits 2 and 3.
Note that on the rising edge of clocks 3 and 6 in Figure 9,
the EB_RdVal signal is sampled deasserted, causing wait
states in the data return. There is also an address wait state
caused by EB_ARdy being sampled deasserted on the rising
edge of clock 4. Note that the core holds address 3 on the
EB_A bus for an extra clock because of this wait state.
External logic asserts the EB_RBErr signal in the same
clock as data if a bus error occurs during that data transfer.
Table 12 Sequential Ordering Protocols

Starting Address EB_A[3:2]    Address Progression of EB_A[3:2]
00                            00, 01, 10, 11
01                            01, 10, 11, 00
10                            10, 11, 00, 01
11                            11, 00, 01, 10
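The progression in Table 12 is a modulo-4 increment of EB_A[3:2] starting at the critical word. A minimal sketch, assuming byte addresses held in ordinary integers; the function names are hypothetical, and only the sequential protocol of Table 12 is modeled:

```python
def sequential_progression(start_bits):
    """Return the four EB_A[3:2] values of a sequential burst.

    start_bits: value of EB_A[3:2] for the critical word (0..3).
    """
    return [(start_bits + i) & 0x3 for i in range(4)]


def sequential_burst_addresses(miss_addr):
    """Return the four word-aligned byte addresses of a sequential burst.

    The block is the 16-byte aligned region containing miss_addr, and the
    burst wraps when the block boundary is reached.
    """
    block_base = miss_addr & ~0xF
    start_bits = (miss_addr >> 2) & 0x3
    return [block_base | (bits << 2) for bits in sequential_progression(start_bits)]


order = sequential_progression(0b10)
addrs = sequential_burst_addresses(0x1008)
```

For a miss at byte address 0x1008 the burst covers 0x1008, 0x100C, 0x1000, 0x1004, matching the row for starting address 10.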
Burst Write
Burst write transactions are used to empty one of the write
buffers. A burst write transaction is only performed if the
write buffer contains 16 bytes of data associated with the
same aligned memory block, otherwise individual write
transactions are performed. Figure 10 shows a timing
diagram of a burst write transaction. Unlike the read burst,
a write burst always begins with EB_A[3:2] equal to 00b.
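The choice between a burst write and individual writes can be illustrated as follows. This is a sketch under assumptions: the write buffer is modeled as a dictionary from byte address to byte value, the function name plan_write_buffer_flush is hypothetical, and the ascending order used for individual writes is an assumption of the sketch, not something the text specifies.

```python
def plan_write_buffer_flush(buffered_bytes):
    """Decide how one write buffer would be emptied.

    buffered_bytes: dict mapping byte address -> byte value
                    (hypothetical model of one write buffer).
    Returns ("burst", word_addresses) when the buffer holds all 16 bytes
    of one aligned 16-byte block, otherwise ("single", word_addresses).
    """
    addrs = set(buffered_bytes)
    blocks = {a & ~0xF for a in addrs}
    if len(addrs) == 16 and len(blocks) == 1:
        base = blocks.pop()
        # A write burst always begins with EB_A[3:2] equal to 00b.
        return "burst", [base + 4 * i for i in range(4)]
    # Assumed ordering for the sketch: ascending word address.
    return "single", sorted({a & ~0x3 for a in addrs})


full_block = {0x2000 + i: i for i in range(16)}
half_block = {0x2000 + i: i for i in range(8)}
full_plan = plan_write_buffer_flush(full_block)
half_plan = plan_write_buffer_flush(half_block)
```

Sixteen bytes that straddle two aligned blocks do not qualify for a burst in this model, because they are not associated with the same aligned memory block.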
[Figure 10 Burst Write Transaction Timing Diagram. Clocks 1
through 8. Signals shown: EB_Clk, EB_ARdy, EB_A[35:2],
EB_BE[3:0], EB_Write, EB_Burst, EB_BFirst, EB_BLast,
EB_AValid, EB_WData[31:0], EB_WDRdy, EB_WBErr. Labels in
the figure: Adr1 to Adr4, Data1 to Data4, Write, Wait,
"Driven by system logic".]
External logic drives the EB_WBErr signal one clock after
the corresponding assertion of EB_WDRdy if a bus error
has occurred, as shown by the arrows in Figure 10.
The 4Km core drives address and control information onto
the EB_A[35:2] and EB_BE[3:0] signals on the rising edge
of clock 2. As in the single read cycle, these signals remain
active until the clock edge after the EB_ARdy signal is
sampled asserted. The 4Km core continues to drive
EB_AValid as long as a valid address is on the bus.
The 4Km core asserts the EB_Write, EB_Burst, and
EB_AValid signals during the time the address is driven.
EB_Write indicates that a write operation is in progress.
The assertion of EB_Burst indicates that the current
operation is a burst. EB_AValid indicates that a valid
address is on the bus.
The 4Km core asserts the EB_BFirst signal in the same
clock as address 1 is driven to indicate the start of a burst
cycle. In the clock that the last address is driven, the 4Km
core asserts EB_BLast to indicate the end of the burst
transaction.
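The address-phase control signals of a burst write can be tabulated per address beat. The sketch below is illustrative only: the function name and the dictionary layout are hypothetical, and address wait states caused by EB_ARdy are not modeled.

```python
def burst_write_address_phase(block_base):
    """List the control values driven with each of the four addresses.

    block_base: byte address of the 16-byte aligned block being written.
    Each entry gives the word address plus the signal values asserted
    while that address is on the bus.
    """
    if block_base & 0xF:
        raise ValueError("a burst write starts at a 16-byte aligned address")
    beats = []
    for i in range(4):
        beats.append({
            "EB_A": block_base + 4 * i,   # Adr1..Adr4
            "EB_Write": 1,                # write operation in progress
            "EB_Burst": 1,                # address is part of a burst
            "EB_AValid": 1,               # valid address on the bus
            "EB_BFirst": int(i == 0),     # asserted with the first address
            "EB_BLast": int(i == 3),      # asserted with the last address
        })
    return beats


beats = burst_write_address_phase(0x3000)
```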
In Figure 10, the first data word (Data1) is driven in clocks
2 and 3 due to the EB_WDRdy signal being sampled
deasserted at the rising edge of clock 2, causing a wait state.
When EB_WDRdy is sampled asserted on the rising edge of
clock 3, the 4Km core responds by driving the second word
(Data2).
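The effect of EB_WDRdy wait states on the write data can be sketched with a simple per-clock model. It assumes, for illustration, that each entry of wdrdy_samples is the EB_WDRdy value sampled at the clock edge that ends the clock in which the current word is driven; the function name and this sampling convention are assumptions of the sketch, not a restatement of the exact edge timing in Figure 10.

```python
def drive_write_data(words, wdrdy_samples):
    """Return the word driven on EB_WData[31:0] in each modeled clock.

    words:         data words of the burst, in order (Data1..Data4).
    wdrdy_samples: EB_WDRdy value sampled at the end of each modeled clock
                   (assumed convention for this sketch).
    """
    driven = []
    index = 0
    for ready in wdrdy_samples:
        if index >= len(words):
            break                    # all words have been accepted
        driven.append(words[index])  # current word stays on the bus
        if ready:
            index += 1               # accepted: next clock drives the next word
        # not ready: wait state, the same word is driven again
    return driven


timeline = drive_write_data(
    ["Data1", "Data2", "Data3", "Data4"],
    [0, 1, 1, 0, 1, 1],
)
```

In this model a deasserted sample simply repeats the current word for one more clock, which is how Data1 comes to occupy two clocks in the description above.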
Copyright © 2000-2002 MIPS Technologies, Inc. All rights reserved.
Unpublished rights reserved under the Copyright Laws of the United States of America.
This document contains information that is proprietary to MIPS Technologies, Inc. (“MIPS Technologies”). Any
copying, reproducing, modifying, or use of this information (in whole or in part) which is not expressly permitted
in writing by MIPS Technologies or a contractually-authorized third party is strictly prohibited. At a minimum, this
information is protected under unfair competition and copyright laws. Violations thereof may result in criminal
penalties and fines.
MIPS Technologies or any contractually-authorized third party reserves the right to change the information
contained in this document to improve function, design or otherwise. MIPS Technologies does not assume any
liability arising out of the application or use of this information, or of any error of omission in such information.
Any warranties, whether express, statutory, implied or otherwise, including but not limited to the implied warranties
of merchantability or fitness for a particular purpose, are excluded. Any license under patent rights or any other
intellectual property rights owned by MIPS Technologies or third parties shall be conveyed by MIPS Technologies
or any contractually-authorized third party in a separate license agreement between the parties.
The information contained in this document shall not be exported or transferred for the purpose of reexporting in
violation of any U.S. or non-U.S. regulation, treaty, Executive Order, law, statute, amendment or supplement thereto.
The information contained in this document constitutes one or more of the following: commercial computer
software, commercial computer software documentation or other commercial items. If the user of this information,
or any related documentation of any kind, including related technical data or manuals, is an agency, department, or
other entity of the United States government (“Government”), the use, duplication, reproduction, release,
modification, disclosure, or transfer of this information, or any related documentation of any kind, is restricted in
accordance with Federal Acquisition Regulation 12.212 for civilian agencies and Defense Federal Acquisition
Regulation Supplement 227.7202 for military agencies. The use of this information by the Government is further
restricted in accordance with the terms of the license agreement(s) and/or applicable contract terms and conditions
covering this information from MIPS Technologies or any contractually-authorized third party.
MIPS®, R3000®, R4000®, R5000® and R10000® are among the registered trademarks of MIPS Technologies, Inc.
in the United States and certain other countries, and MIPS16™, MIPS16e™, MIPS32™, MIPS64™, MIPS-3D™,
MIPS-based™, MIPS I™, MIPS II™, MIPS III™, MIPS IV™, MIPS V™, MDMX™, SmartMIPS™, 4K™,
4Kc™, 4Km™, 4Kp™, 4KE™, 4KEc™, 4KEm™, 4KEp™, 4KS™, 4KSc™, 5K™, 5Kc™, 5Kf™, 20K™,
20Kc™, R20K™, R4300™, ATLAS™, CoreLV™, EC™, JALGO™, MALTA™, MGB™, SEAD™, SEAD-2™,
SOC-it™ and YAMON™ are among the trademarks of MIPS Technologies, Inc.
All other trademarks referred to herein are the property of their respective owners.
Document Number: MD00040
01.03-2B