ARM1176JZ-S™
Revision: r0p7
Technical Reference Manual
Copyright © 2004-2009 ARM Limited. All rights reserved.
ARM DDI 0333H (ID012410)
Release Information
Change history

Date              Issue  Confidentiality   Change
19 July 2004      A      Non-Confidential  First release.
18 April 2005     B      Non-Confidential  Minor corrections and enhancements.
29 June 2005      C      Non-Confidential  R0p1 changes - addition of CPUCLAMP. Figure 10-1 updated. Section 10.4.3 updated. Table 23-1 updated. Minor corrections and enhancements.
22 March 2006     D      Non-Confidential  First release for r0p2. Minor corrections and enhancements.
19 July 2006      E      Non-Confidential  Patch update for r0p4.
25 April 2007     F      Non-Confidential  Update for r0p6 release. Minor corrections and enhancements.
15 February 2008  G      Non-Confidential  Update for r0p7 release. Minor corrections and enhancements.
27 November 2009  H      Non-Confidential  Update for r0p7 maintenance release. Minor corrections and enhancements.
Proprietary Notice
Words and logos marked with ® or ™ are registered trademarks or trademarks of ARM® Limited in the EU and other
countries, except as otherwise stated below in this proprietary notice. Other brands and names mentioned herein may be
the trademarks of their respective owners.
Neither the whole nor any part of the information contained in, or the product described in, this document may be
adapted or reproduced in any material form except with the prior written permission of the copyright holder.
The product described in this document is subject to continuous developments and improvements. All particulars of the
product and its use contained in this document are given by ARM in good faith. However, all warranties implied or
expressed, including but not limited to implied warranties of merchantability, or fitness for purpose, are excluded.
This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable for any
loss or damage arising from the use of any information in this document, or any error or omission in such information,
or any incorrect use of the product.
Where the term ARM is used it means “ARM or any of its subsidiaries as appropriate”.
Figure 14-1 on page 14-2 reprinted with permission from IEEE Std. 1149.1-2001, IEEE Standard Test Access Port and
Boundary-Scan Architecture by IEEE Std. The IEEE disclaims any responsibility or liability resulting from the
placement and use in the described manner.
Confidentiality Status
This document is Non-Confidential. The right to use, copy and disclose this document may be subject to license
restrictions in accordance with the terms of the agreement entered into by ARM and the party that ARM delivered this
document to.
Unrestricted Access is an ARM internal classification.
Product Status
The information in this document is final, that is for a developed product.
Web Address
http://www.arm.com
ARM DDI 0333H
ID012410
Copyright © 2004-2009 ARM Limited. All rights reserved.
Non-Confidential, Unrestricted Access
Contents
ARM1176JZ-S Technical Reference Manual
Preface
    About this manual
    Feedback

Chapter 1    Introduction
    1.1     About the processor
    1.2     Extensions to ARMv6
    1.3     TrustZone security extensions
    1.4     ARM1176JZ-S architecture with Jazelle technology
    1.5     Components of the processor
    1.6     Power management
    1.7     Configurable options
    1.8     Pipeline stages
    1.9     Typical pipeline operations
    1.10    ARM1176JZ-S instruction set summary
    1.11    Product revisions

Chapter 2    Programmer’s Model
    2.1     About the programmer’s model
    2.2     Secure world and Non-secure world operation with TrustZone
    2.3     Processor operating states
    2.4     Instruction length
    2.5     Data types
    2.6     Memory formats
    2.7     Addresses in a processor system
    2.8     Operating modes
    2.9     Registers
    2.10    The program status registers
    2.11    Additional instructions
    2.12    Exceptions
    2.13    Software considerations

Chapter 3    System Control Coprocessor
    3.1     About the system control coprocessor
    3.2     System control processor registers

Chapter 4    Unaligned and Mixed-endian Data Access Support
    4.1     About unaligned and mixed-endian support
    4.2     Unaligned access support
    4.3     Endian support
    4.4     Operation of unaligned accesses
    4.5     Mixed-endian access support
    4.6     Instructions to reverse bytes in a general-purpose register
    4.7     Instructions to change the CPSR E bit

Chapter 5    Program Flow Prediction
    5.1     About program flow prediction
    5.2     Branch prediction
    5.3     Return stack
    5.4     Memory Barriers
    5.5     ARM1176JZ-S IMB implementation

Chapter 6    Memory Management Unit
    6.1     About the MMU
    6.2     TLB organization
    6.3     Memory access sequence
    6.4     Enabling and disabling the MMU
    6.5     Memory access control
    6.6     Memory region attributes
    6.7     Memory attributes and types
    6.8     MMU aborts
    6.9     MMU fault checking
    6.10    Fault status and address
    6.11    Hardware page table translation
    6.12    MMU descriptors
    6.13    MMU software-accessible registers

Chapter 7    Level One Memory System
    7.1     About the level one memory system
    7.2     Cache organization
    7.3     Tightly-coupled memory
    7.4     DMA
    7.5     TCM and cache interactions
    7.6     Write buffer

Chapter 8    Level Two Interface
    8.1     About the level two interface
    8.2     Synchronization primitives
    8.3     AXI control signals in the processor
    8.4     Instruction Fetch Interface transfers
    8.5     Data Read/Write Interface transfers
    8.6     Peripheral Interface transfers
    8.7     Endianness
    8.8     Locked access

Chapter 9    Clocking and Resets
    9.1     About clocking and resets
    9.2     Clocking and resets with no IEM
    9.3     Clocking and resets with IEM
    9.4     Reset modes

Chapter 10   Power Control
    10.1    About power control
    10.2    Power management
    10.3    Intelligent Energy Management

Chapter 11   Coprocessor Interface
    11.1    About the coprocessor interface
    11.2    Coprocessor pipeline
    11.3    Token queue management
    11.4    Token queues
    11.5    Data transfer
    11.6    Operations
    11.7    Multiple coprocessors

Chapter 12   Vectored Interrupt Controller Port
    12.1    About the PL192 Vectored Interrupt Controller
    12.2    About the processor VIC port
    12.3    Timing of the VIC port
    12.4    Interrupt entry flowchart

Chapter 13   Debug
    13.1    Debug systems
    13.2    About the debug unit
    13.3    Debug registers
    13.4    CP14 registers reset
    13.5    CP14 debug instructions
    13.6    External debug interface
    13.7    Changing the debug enable signals
    13.8    Debug events
    13.9    Debug exception
    13.10   Debug state
    13.11   Debug communications channel
    13.12   Debugging in a cached system
    13.13   Debugging in a system with TLBs
    13.14   Monitor debug-mode debugging
    13.15   Halting debug-mode debugging
    13.16   External signals

Chapter 14   Debug Test Access Port
    14.1    Debug Test Access Port and Debug state
    14.2    Synchronizing RealView ICE
    14.3    Entering Debug state
    14.4    Exiting Debug state
    14.5    The DBGTAP port and debug registers
    14.6    Debug registers
    14.7    Using the Debug Test Access Port
    14.8    Debug sequences
    14.9    Programming debug events
    14.10   Monitor debug-mode debugging

Chapter 15   Trace Interface Port
    15.1    About the ETM interface

Chapter 16   Cycle Timings and Interlock Behavior
    16.1    About cycle timings and interlock behavior
    16.2    Register interlock examples
    16.3    Data processing instructions
    16.4    QADD, QDADD, QSUB, and QDSUB instructions
    16.5    ARMv6 media data-processing
    16.6    ARMv6 Sum of Absolute Differences (SAD)
    16.7    Multiplies
    16.8    Branches
    16.9    Processor state updating instructions
    16.10   Single load and store instructions
    16.11   Load and Store Double instructions
    16.12   Load and Store Multiple Instructions
    16.13   RFE and SRS instructions
    16.14   Synchronization instructions
    16.15   Coprocessor instructions
    16.16   SVC, SMC, BKPT, Undefined, and Prefetch Aborted instructions
    16.17   No operation
    16.18   Thumb instructions

Chapter 17   AC Characteristics
    17.1    Processor timing diagrams
    17.2    Processor timing parameters

Appendix A   Signal Descriptions
    A.1     Global signals
    A.2     Static configuration signals
    A.3     TrustZone internal signals
    A.4     Interrupt signals, including VIC interface
    A.5     AXI interface signals
    A.6     Coprocessor interface signals
    A.7     Debug interface signals, including JTAG
    A.8     ETM interface signals
    A.9     Test signals

Appendix B   Summary of ARM1136J-S and ARM1176JZ-S Processor Differences
    B.1     About the differences between the ARM1136J-S and ARM1176JZ-S processors
    B.2     Summary of differences

Appendix C   Revisions

Glossary
List of Tables
ARM1176JZ-S Technical Reference Manual
             Change history
Table 1-1    TCM configurations
Table 1-2    Configurable options
Table 1-3    ARM1176JZ-S processor default configurations
Table 1-4    Key to instruction set tables
Table 1-5    ARM instruction set summary
Table 1-6    Addressing mode 2
Table 1-7    Addressing mode 2P, post-indexed only
Table 1-8    Addressing mode 3
Table 1-9    Addressing mode 4
Table 1-10   Addressing mode 5
Table 1-11   Operand2
Table 1-12   Fields
Table 1-13   Condition codes
Table 1-14   Thumb instruction set summary
Table 2-1    Write access behavior for system control processor registers
Table 2-2    Secure Monitor bus signals
Table 2-3    Address types in the processor system
Table 2-4    Mode structure
Table 2-5    Register mode identifiers
Table 2-6    GE[3:0] settings
Table 2-7    PSR mode bit values
Table 2-8    Exception entry and exit
Table 2-9    Exception priorities
Table 3-1    System control coprocessor register functions
Table 3-2    Summary of CP15 registers and operations
Table 3-3    Summary of CP15 MCRR operations
Table 3-4    Main ID Register bit functions
Table 3-5    Results of access to the Main ID Register
Table 3-6    Cache Type Register bit functions
Table 3-7    Results of access to the Cache Type Register
Table 3-8    Example Cache Type Register format
Table 3-9    TCM Status Register bit functions
Table 3-10   TLB Type Register bit functions
Table 3-11   Results of access to the TLB Type Register
Table 3-12   Processor Feature Register 0 bit functions
Table 3-13   Results of access to the Processor Feature Register 0
Table 3-14   Processor Feature Register 1 bit functions
Table 3-15   Results of access to the Processor Feature Register 1
Table 3-16   Debug Feature Register 0 bit functions
Table 3-17   Results of access to the Debug Feature Register 0
Table 3-18   Auxiliary Feature Register 0 bit functions
Table 3-19   Results of access to the Auxiliary Feature Register 0
Table 3-20   Memory Model Feature Register 0 bit functions
Table 3-21   Results of access to the Memory Model Feature Register 0
Table 3-22   Memory Model Feature Register 1 bit functions
Table 3-23   Results of access to the Memory Model Feature Register 1
Table 3-24   Memory Model Feature Register 2 bit functions
Table 3-25   Results of access to the Memory Model Feature Register 2
Table 3-26   Memory Model Feature Register 3 bit functions
Table 3-27   Results of access to the Memory Model Feature Register 3
Table 3-28   Instruction Set Attributes Register 0 bit functions
Table 3-29   Results of access to the Instruction Set Attributes Register 0
Table 3-30   Instruction Set Attributes Register 1 bit functions
Table 3-31   Results of access to the Instruction Set Attributes Register 1
Table 3-32   Instruction Set Attributes Register 2 bit functions
Table 3-33   Results of access to the Instruction Set Attributes Register 2
Table 3-34   Instruction Set Attributes Register 3 bit functions
Table 3-35   Results of access to the Instruction Set Attributes Register 3
Table 3-36   Instruction Set Attributes Register 4 bit functions
Table 3-37   Results of access to the Instruction Set Attributes Register 4
Table 3-38   Results of access to the Instruction Set Attributes Register 5
Table 3-39   Control Register bit functions
Table 3-40   Results of access to the Control Register
Table 3-41   Resultant B bit, U bit, and EE bit values
Table 3-42   Auxiliary Control Register bit functions
Table 3-43   Results of access to the Auxiliary Control Register
Table 3-44   Coprocessor Access Control Register bit functions
Table 3-45   Results of access to the Coprocessor Access Control Register
Table 3-46   Secure Configuration Register bit functions
Table 3-47   Operation of the FW and FIQ bits
Table 3-48   Operation of the AW and EA bits
Table 3-49   Secure Debug Enable Register bit functions
Table 3-50   Results of access to the Coprocessor Access Control Register
Table 3-51   Non-Secure Access Control Register bit functions
Table 3-52   Results of access to the Auxiliary Control Register
Table 3-53   Translation Table Base Register 0 bit functions
Table 3-54   Results of access to the Translation Table Base Register 0
Table 3-55   Translation Table Base Register 1 bit functions
Table 3-56   Results of access to the Translation Table Base Register 1
Table 3-57   Translation Table Base Control Register bit functions
Table 3-58   Results of access to the Translation Table Base Control Register
Table 3-59   Domain Access Control Register bit functions
Table 3-60   Results of access to the Domain Access Control Register
Table 3-61   Data Fault Status Register bit functions
Table 3-62   Results of access to the Data Fault Status Register
Table 3-63   Instruction Fault Status Register bit functions
Table 3-64   Results of access to the Instruction Fault Status Register
Table 3-65   Results of access to the Fault Address Register
Table 3-66   Results of access to the Instruction Fault Address Register
Non-Confidential, Unrestricted Access
3-23
3-24
3-25
3-26
3-26
3-27
3-27
3-28
3-29
3-29
3-30
3-30
3-31
3-31
3-32
3-33
3-33
3-34
3-35
3-36
3-36
3-37
3-37
3-38
3-39
3-39
3-40
3-41
3-41
3-42
3-43
3-44
3-45
3-47
3-48
3-49
3-50
3-51
3-52
3-53
3-53
3-54
3-55
3-55
3-56
3-57
3-58
3-58
3-60
3-60
3-61
3-62
3-63
3-63
3-64
3-66
3-67
3-67
3-68
3-69
Table 3-67 Functional bits of c7 for Set and Index .......... 3-72
Table 3-68 Cache size and S parameter dependency .......... 3-72
Table 3-69 Functional bits of c7 for MVA .......... 3-73
Table 3-70 Functional bits of c7 for VA format .......... 3-74
Table 3-71 Cache operations for entire cache .......... 3-74
Table 3-72 Cache operations for single lines .......... 3-76
Table 3-73 Cache operations for address ranges .......... 3-76
Table 3-74 Cache Dirty Status Register bit functions .......... 3-78
Table 3-75 Cache operations flush functions .......... 3-79
Table 3-76 Flush Branch Target Entry using MVA bit functions .......... 3-80
Table 3-77 PA Register for successful translation bit functions .......... 3-81
Table 3-78 PA Register for unsuccessful translation bit functions .......... 3-82
Table 3-79 Results of access to the Data Synchronization Barrier operation .......... 3-84
Table 3-80 Results of access to the Data Memory Barrier operation .......... 3-85
Table 3-81 Results of access to the Wait For Interrupt operation .......... 3-86
Table 3-82 Results of access to the TLB Operations Register .......... 3-87
Table 3-83 Instruction and data cache lockdown register bit functions .......... 3-89
Table 3-84 Results of access to the Instruction and Data Cache Lockdown Register .......... 3-89
Table 3-85 Data TCM Region Register bit functions .......... 3-91
Table 3-86 Results of access to the Data TCM Region Register .......... 3-91
Table 3-87 Instruction TCM Region Register bit functions .......... 3-92
Table 3-88 Results of access to the Instruction TCM Region Register .......... 3-93
Table 3-89 Data TCM Non-secure Control Access Register bit functions .......... 3-94
Table 3-90 Effects of NS items for data TCM operation .......... 3-95
Table 3-91 Instruction TCM Non-secure Control Access Register bit functions .......... 3-96
Table 3-92 Effects of NS items for instruction TCM operation .......... 3-96
Table 3-93 TCM Selection Register bit functions .......... 3-97
Table 3-94 Results of access to the TCM Selection Register .......... 3-97
Table 3-95 Cache Behavior Override Register bit functions .......... 3-98
Table 3-96 Results of access to the Cache Behavior Override Register .......... 3-99
Table 3-97 TLB Lockdown Register bit functions .......... 3-100
Table 3-98 Results of access to the TLB Lockdown Register .......... 3-100
Table 3-99 Primary Region Remap Register bit functions .......... 3-102
Table 3-100 Encoding for the remapping of the primary memory type .......... 3-103
Table 3-101 Normal Memory Remap Register bit functions .......... 3-103
Table 3-102 Remap encoding for Inner or Outer cacheable attributes .......... 3-104
Table 3-103 Results of access to the memory region remap registers .......... 3-105
Table 3-104 DMA identification and status register bit functions .......... 3-106
Table 3-105 DMA Identification and Status Register functions .......... 3-106
Table 3-106 Results of access to the DMA identification and status registers .......... 3-107
Table 3-107 DMA User Accessibility Register bit functions .......... 3-108
Table 3-108 Results of access to the DMA User Accessibility Register .......... 3-108
Table 3-109 DMA Channel Number Register bit functions .......... 3-109
Table 3-110 Results of access to the DMA Channel Number Register .......... 3-109
Table 3-111 Results of access to the DMA enable registers .......... 3-111
Table 3-112 DMA Control Register bit functions .......... 3-112
Table 3-113 Results of access to the DMA Control Register .......... 3-113
Table 3-114 Results of access to the DMA Internal Start Address Register .......... 3-114
Table 3-115 Results of access to the DMA External Start Address Register .......... 3-115
Table 3-116 Results of access to the DMA Internal End Address Register .......... 3-116
Table 3-117 DMA Channel Status Register bit functions .......... 3-117
Table 3-118 Results of access to the DMA Channel Status Register .......... 3-119
Table 3-119 DMA Context ID Register bit functions .......... 3-120
Table 3-120 Results of access to the DMA Context ID Register .......... 3-120
Table 3-121 Secure or Non-secure Vector Base Address Register bit functions .......... 3-121
Table 3-122 Results of access to the Secure or Non-secure Vector Base Address Register .......... 3-122
Table 3-123 Monitor Vector Base Address Register bit functions .......... 3-123
Table 3-124 Results of access to the Monitor Vector Base Address Register .......... 3-123
Table 3-125 Interrupt Status Register bit functions .......... 3-124
Table 3-126 Results of access to the Interrupt Status Register .......... 3-124
Table 3-127 FCSE PID Register bit functions .......... 3-125
Table 3-128 Results of access to the FCSE PID Register .......... 3-126
Table 3-129 Context ID Register bit functions .......... 3-128
Table 3-130 Results of access to the Context ID Register .......... 3-128
Table 3-131 Results of access to the thread and process ID registers .......... 3-129
Table 3-132 Peripheral Port Memory Remap Register bit functions .......... 3-131
Table 3-133 Results of access to the Peripheral Port Remap Register .......... 3-131
Table 3-134 Secure User and Non-secure Access Validation Control Register bit functions .......... 3-132
Table 3-135 Results of access to the Secure User and Non-secure Access Validation Control Register .......... 3-133
Table 3-136 Performance Monitor Control Register bit functions .......... 3-134
Table 3-137 Performance monitoring events .......... 3-135
Table 3-138 Results of access to the Performance Monitor Control Register .......... 3-137
Table 3-139 Results of access to the Cycle Counter Register .......... 3-138
Table 3-140 Results of access to the Count Register 0 .......... 3-139
Table 3-141 Results of access to the Count Register 1 .......... 3-140
Table 3-142 System validation counter register operations .......... 3-140
Table 3-143 Results of access to the System Validation Counter Register .......... 3-141
Table 3-144 System Validation Operations Register functions .......... 3-142
Table 3-145 Results of access to the System Validation Operations Register .......... 3-143
Table 3-146 System Validation Cache Size Mask Register bit functions .......... 3-145
Table 3-147 Results of access to the System Validation Cache Size Mask Register .......... 3-146
Table 3-148 TLB Lockdown Index Register bit functions .......... 3-149
Table 3-149 TLB Lockdown VA Register bit functions .......... 3-150
Table 3-150 TLB Lockdown PA Register bit functions .......... 3-150
Table 3-151 Access permissions APX and AP bit fields encoding .......... 3-151
Table 3-152 TLB Lockdown Attributes Register bit functions .......... 3-152
Table 3-153 Results of access to the TLB lockdown access registers .......... 3-152
Table 4-1 Unaligned access handling .......... 4-4
Table 4-2 Memory access types .......... 4-13
Table 4-3 Unalignment fault occurrence when access behavior is architecturally unpredictable .......... 4-14
Table 4-4 Legacy endianness using CP15 c1 .......... 4-17
Table 4-5 Mixed-endian configuration .......... 4-19
Table 4-6 B bit, U bit, and EE bit settings .......... 4-19
Table 6-1 Access permission bit encoding .......... 6-12
Table 6-2 TEX field, and C and B bit encodings used in page table formats .......... 6-15
Table 6-3 Cache policy bits .......... 6-15
Table 6-4 Inner and Outer cache policy implementation options .......... 6-16
Table 6-5 Effect of remapping memory with TEX remap = 1 .......... 6-17
Table 6-6 Values that remap the shareable attribute .......... 6-18
Table 6-7 Primary region type encoding .......... 6-18
Table 6-8 Inner and outer region remap encoding .......... 6-18
Table 6-9 Memory attributes .......... 6-20
Table 6-10 Memory region backwards compatibility .......... 6-26
Table 6-11 Fault Status Register encoding .......... 6-34
Table 6-12 Summary of aborts .......... 6-35
Table 6-13 Translation table size .......... 6-43
Table 6-14 Access types from first-level descriptor bit values .......... 6-45
Table 6-15 Access types from second-level descriptor bit values .......... 6-47
Table 6-16 CP15 register functions .......... 6-53
Table 6-17 CP14 register functions .......... 6-54
Table 7-1 TCM configurations .......... 7-7
Table 7-2 Access to Non-secure TCM .......... 7-8
Table 7-3 Access to Secure TCM .......... 7-8
Table 7-4 Summary of data accesses to TCM and caches .......... 7-14
Table 7-5 Summary of instruction accesses to TCM and caches .......... 7-15
Table 8-1 AXI parameters for the level 2 interconnect interfaces .......... 8-3
Table 8-2 AxLEN[3:0] encoding .......... 8-10
Table 8-3 AxSIZE[2:0] encoding .......... 8-11
Table 8-4 AxBURST[1:0] encoding .......... 8-11
Table 8-5 AxLOCK[1:0] encoding .......... 8-11
Table 8-6 AxCACHE[3:0] encoding .......... 8-12
Table 8-7 AxPROT[2:0] encoding .......... 8-12
Table 8-8 AxSIDEBAND[4:1] encoding .......... 8-13
Table 8-9 ARSIDEBANDI[4:1] encoding .......... 8-13
Table 8-10 AXI signals for Cacheable fetches .......... 8-14
Table 8-11 AXI signals for Noncacheable fetches .......... 8-14
Table 8-12 Linefill behavior on the AXI interface .......... 8-15
Table 8-13 Noncacheable LDRB .......... 8-16
Table 8-14 Noncacheable LDRH .......... 8-16
Table 8-15 Noncacheable LDR or LDM1 .......... 8-17
Table 8-16 Noncacheable LDRD or LDM2 .......... 8-17
Table 8-17 Noncacheable LDRD or LDM2 from word 7 .......... 8-18
Table 8-18 Noncacheable LDM3, Strongly Ordered or Device memory .......... 8-18
Table 8-19 Noncacheable LDM3, Noncacheable memory or cache disabled .......... 8-18
Table 8-20 Noncacheable LDM3 from word 6, or 7 .......... 8-18
Table 8-21 Noncacheable LDM4, Strongly Ordered or Device memory .......... 8-19
Table 8-22 Noncacheable LDM4, Noncacheable memory or cache disabled .......... 8-19
Table 8-23 Noncacheable LDM4 from word 5, 6, or 7 .......... 8-19
Table 8-24 Noncacheable LDM5, Strongly Ordered or Device memory .......... 8-20
Table 8-25 Noncacheable LDM5, Noncacheable memory or cache disabled .......... 8-20
Table 8-26 Noncacheable LDM5 from word 4, 5, 6, or 7 .......... 8-20
Table 8-27 Noncacheable LDM6, Strongly Ordered or Device memory .......... 8-20
Table 8-28 Noncacheable LDM6, Noncacheable memory or cache disabled .......... 8-21
Table 8-29 Noncacheable LDM6 from word 3, 4, 5, 6, or 7 .......... 8-21
Table 8-30 Noncacheable LDM7, Strongly Ordered or Device memory .......... 8-21
Table 8-31 Noncacheable LDM7, Noncacheable memory or cache disabled .......... 8-21
Table 8-32 Noncacheable LDM7 from word 2, 3, 4, 5, 6, or 7 .......... 8-21
Table 8-33 Noncacheable LDM8 from word 0 .......... 8-22
Table 8-34 Noncacheable LDM8 from word 1, 2, 3, 4, 5, 6, or 7 .......... 8-22
Table 8-35 Noncacheable LDM9 .......... 8-23
Table 8-36 Noncacheable LDM10 .......... 8-23
Table 8-37 Noncacheable LDM11 .......... 8-23
Table 8-38 Noncacheable LDM12 .......... 8-24
Table 8-39 Noncacheable LDM13 .......... 8-24
Table 8-40 Noncacheable LDM14 .......... 8-25
Table 8-41 Noncacheable LDM15 .......... 8-25
Table 8-42 Noncacheable LDM16 .......... 8-25
Table 8-43 Half-line Write-Back .......... 8-26
Table 8-44 Full-line Write-Back .......... 8-26
Table 8-45 Cacheable Write-Through or Noncacheable STRB .......... 8-27
Table 8-46 Cacheable Write-Through or Noncacheable STRH .......... 8-27
Table 8-47 Cacheable Write-Through or Noncacheable STR or STM1 .......... 8-29
Table 8-48 Cacheable Write-Through or Noncacheable STRD or STM2 to words 0, 1, 2, 3, 4, 5, or 6 .......... 8-30
Table 8-49 Cacheable Write-Through or Noncacheable STM2 to word 7 .......... 8-30
Table 8-50 Cacheable Write-Through or Noncacheable STM3 to words 0, 1, 2, 3, 4, or 5 .......... 8-31
Table 8-51 Cacheable Write-Through or Noncacheable STM3 to words 6 or 7 .......... 8-31
Table 8-52 Cacheable Write-Through or Noncacheable STM4 to word 0, 1, 2, 3, or 4 .......... 8-32
Table 8-53 Cacheable Write-Through or Noncacheable STM4 to word 5, 6, or 7 .......... 8-32
Table 8-54 Cacheable Write-Through or Noncacheable STM5 to word 0, 1, 2, or 3 .......... 8-33
Table 8-55 Cacheable Write-Through or Noncacheable STM5 to word 4, 5, 6, or 7 .......... 8-33
Table 8-56 Cacheable Write-Through or Noncacheable STM6 to word 0, 1, or 2 .......... 8-34
Table 8-57 Cacheable Write-Through or Noncacheable STM6 to word 3, 4, 5, 6, or 7 .......... 8-34
Table 8-58 Cacheable Write-Through or Noncacheable STM7 to word 0 or 1 .......... 8-35
Table 8-59 Cacheable Write-Through or Noncacheable STM7 to word 2, 3, 4, 5, 6 or 7 .......... 8-35
Table 8-60 Cacheable Write-Through or Noncacheable STM8 to word 0 .......... 8-36
Table 8-61 Cacheable Write-Through or Noncacheable STM8 to word 1, 2, 3, 4, 5, 6, or 7 .......... 8-36
Table 8-62 Cacheable Write-Through or Noncacheable STM9 .......... 8-37
Table 8-63 Cacheable Write-Through or Noncacheable STM10 .......... 8-37
Table 8-64 Cacheable Write-Through or Noncacheable STM11 .......... 8-38
Table 8-65 Cacheable Write-Through or Noncacheable STM12 .......... 8-38
Table 8-66 Cacheable Write-Through or Noncacheable STM13 .......... 8-39
Table 8-67 Cacheable Write-Through or Noncacheable STM14 .......... 8-39
Table 8-68 Cacheable Write-Through or Noncacheable STM15 .......... 8-40
Table 8-69 Cacheable Write-Through or Noncacheable STM16 .......... 8-40
Table 8-70 Example Peripheral Interface reads and writes .......... 8-41
Table 9-1 Reset modes .......... 9-10
Table 11-1 Coprocessor instructions .......... 11-3
Table 11-2 Coprocessor control signals .......... 11-4
Table 11-3 Pipeline stage update .......... 11-7
Table 11-4 Addressing of queue buffers .......... 11-10
Table 11-5 Retirement conditions .......... 11-20
Table 12-1 VIC port signals .......... 12-3
Table 13-1 Terms used in register descriptions .......... 13-5
Table 13-2 CP14 debug register map .......... 13-5
Table 13-3 Debug ID Register bit field definition .......... 13-7
Table 13-4 Debug Status and Control Register bit field definitions .......... 13-8
Table 13-5 Data Transfer Register bit field definitions .......... 13-12
Table 13-6 Vector Catch Register bit field definitions .......... 13-14
Table 13-7 Summary of debug entry and exception conditions .......... 13-14
Table 13-8 Processor breakpoint and watchpoint registers .......... 13-16
Table 13-9 Breakpoint Value Registers, bit field definition .......... 13-17
Table 13-10 Processor Breakpoint Control Registers .......... 13-17
Table 13-11 Breakpoint Control Registers, bit field definitions .......... 13-18
Table 13-12 Meaning of BCR[22:20] bits .......... 13-19
Table 13-13 Processor Watchpoint Value Registers .......... 13-20
Table 13-14 Watchpoint Value Registers, bit field definitions .......... 13-21
Table 13-15 Processor Watchpoint Control Registers .......... 13-21
Table 13-16 Watchpoint Control Registers, bit field definitions .......... 13-22
Table 13-17 Debug State Cache Control Register bit functions .......... 13-24
Table 13-18 Debug State MMU Control Register bit functions .......... 13-24
Table 13-19 CP14 debug instructions .......... 13-26
Table 13-20 Debug instruction execution .......... 13-27
Table 13-21 Secure debug behavior .......... 13-28
Table 13-22 Behavior of the processor on debug events .......... 13-33
Table 13-23 Setting of CP15 registers on debug events .......... 13-34
Table 13-24 Values in the link register after exceptions .......... 13-36
Table 13-25 Read PC value after Debug state entry .......... 13-39
Table 13-26 Example memory operation sequence .......... 13-41
Table 14-1 Supported public instructions .......... 14-6
Table 14-2 Scan chain 7 register map .......... 14-19
Table 15-1 Instruction interface signals .......... 15-2
Table 15-2 ETMIACTL[17:0] .......... 15-3
Table 15-3 ETMIASECCTL[1:0] .......... 15-4
Table 15-4 Data address interface signals .......... 15-4
Table 15-5 ETMDACTL[17:0] .......... 15-5
Table 15-6 Data value interface signals .......... 15-6
Table 15-7 ETMDDCTL[3:0] .......... 15-6
Table 15-8 ETMPADV[2:0] .......... 15-6
Table 15-9 Coprocessor interface signals .......... 15-7
Table 15-10 ETMCPSECCTL[1:0] format .......... 15-7
Table 15-11
Table 16-1
Table 16-2
Table 16-3
Table 16-4
Table 16-5
Table 16-6
Table 16-7
Table 16-8
Table 16-9
Other connections ..................................................................................................................... 15-8
Pipeline stages .......................................................................................................................... 16-3
Definition of cycle timing terms ................................................................................................. 16-5
Register interlock examples ...................................................................................................... 16-6
Data Processing Instruction cycle timing behavior if destination is not PC ............................... 16-7
Data Processing Instruction cycle timing behavior if destination is the PC ............................... 16-7
QADD, QDADD, QSUB, and QDSUB instruction cycle timing behavior ................................... 16-9
ARMv6 media data-processing instructions cycle timing behavior ......................................... 16-10
ARMv6 sum of absolute differences instruction timing behavior ............................................ 16-11
Example interlocks .................................................................................................................. 16-11
Table 16-10 Example multiply instruction cycle timing behavior ................................................................. 16-12
Table 16-11 Branch instruction cycle timing behavior ................................................................................. 16-14
Table 16-12 Processor state updating instructions cycle timing behavior .................................................. 16-15
Table 16-13 Cycle timing behavior for stores and loads, other than loads to the PC ................................. 16-16
Table 16-14 Cycle timing behavior for loads to the PC ............................................................................... 16-17
Table 16-15 <addr_md_1cycle> and <addr_md_2cycle> LDR example instruction explanation ............... 16-17
Table 16-16 Load and Store Double instructions cycle timing behavior ..................................................... 16-19
Table 16-17 <addr_md_1cycle> and <addr_md_2cycle> LDRD example instruction explanation ............. 16-19
Table 16-18 Cycle timing behavior of Load and Store Multiples, other than load multiples including the PC ....... 16-21
Table 16-19 Cycle timing behavior of Load Multiples, where the PC is in the register list .......................... 16-22
Table 16-20 RFE and SRS instructions cycle timing behavior .................................................................... 16-23
Table 16-21 Synchronization Instructions cycle timing behavior ................................................................ 16-24
Table 16-22 Coprocessor Instructions cycle timing behavior ...................................................................... 16-25
Table 16-23 SVC, BKPT, undefined, prefetch aborted instructions cycle timing behavior ......................... 16-26
Table 17-1 Global signals ........................................................................................................................... 17-3
Table 17-2 AXI signals ................................................................................................................................ 17-3
Table 17-3 Coprocessor signals ................................................................................................................. 17-4
Table 17-4 ETM interface signals ............................................................................................................... 17-5
Table 17-5 Interrupt signals ........................................................................................................................ 17-5
Table 17-6 Debug interface signals ............................................................................................................ 17-5
Table 17-7 Test signals ............................................................................................................................... 17-6
Table 17-8 Static configuration signals ....................................................................................................... 17-6
Table 17-9 TrustZone internal signals ......................................................................................................... 17-6
Table A-1 Global signals ............................................................................................................................. A-2
Table A-2 Static configuration signals ......................................................................................................... A-4
Table A-3 TrustZone internal signals ........................................................................................................... A-5
Table A-4 Interrupt signals .......................................................................................................................... A-6
Table A-5 Port signal name suffixes ............................................................................................................ A-7
Table A-6 Instruction read port AXI signal implementation ......................................................................... A-8
Table A-7 Data port AXI signal implementation ........................................................................................... A-9
Table A-8 Peripheral port AXI signal implementation ................................................................................ A-10
Table A-9 DMA port signals ....................................................................................................................... A-11
Table A-10 Core to coprocessor signals ..................................................................................................... A-12
Table A-11 Coprocessor to core signals ..................................................................................................... A-12
Table A-12 Debug interface signals ............................................................................................................ A-14
Table A-13 ETM interface signals ............................................................................................................... A-15
Table A-14 Test signals ............................................................................................................................... A-16
Table B-1 TCM for ARM1176JZ-S processors ............................................................................................ B-6
Table B-2 CP15 c15 features common to ARM1136J-S and ARM1176JZ-S processors ........................... B-7
Table B-3 CP15 c15 only found in ARM1136J-S processors ...................................................................... B-8
Table C-1 Differences between issue G and issue H .................................................................................. C-1
List of Figures
ARM1176JZ-S Technical Reference Manual
Key to timing diagram conventions .............................................................................................. xxi
Figure 1-1 ARM1176JZ-S processor block diagram .................................................................................... 1-8
Figure 1-2 ARM1176JZ-S pipeline stages ................................................................................................. 1-24
Figure 1-3 Typical operations in pipeline stages ........................................................................................ 1-26
Figure 1-4 Typical ALU operation ............................................................................................................... 1-26
Figure 1-5 Typical multiply operation ......................................................................................................... 1-27
Figure 1-6 Progression of an LDR/STR operation ..................................................................................... 1-28
Figure 1-7 Progression of an LDM/STM operation ..................................................................................... 1-28
Figure 1-8 Progression of an LDR that misses .......................................................................................... 1-29
Figure 2-1 Secure and Non-secure worlds ................................................................................................... 2-3
Figure 2-2 Memory in the Secure and Non-secure worlds ........................................................................... 2-6
Figure 2-3 Memory partition in the Secure and Non-secure worlds ............................................................. 2-7
Figure 2-4 Big-endian addresses of bytes within words ............................................................................. 2-15
Figure 2-5 Little-endian addresses of bytes within words .......................................................................... 2-15
Figure 2-6 Register organization in ARM state .......................................................................................... 2-20
Figure 2-7 Processor core register set showing banked registers ............................................................. 2-21
Figure 2-8 Register organization in Thumb state ....................................................................................... 2-22
Figure 2-9 ARM state and Thumb state registers relationship ................................................................... 2-23
Figure 2-10 Program status register ............................................................................................................. 2-24
Figure 2-11 LDREXB instruction .................................................................................................................. 2-30
Figure 2-12 STREXB instructions ................................................................................................................ 2-30
Figure 2-13 LDREXH instruction .................................................................................................................. 2-31
Figure 2-14 STREXH instruction .................................................................................................................. 2-32
Figure 2-15 LDREXD instruction .................................................................................................................. 2-33
Figure 2-16 STREXD instruction .................................................................................................................. 2-33
Figure 2-17 CLREX instruction ..................................................................................................................... 2-34
Figure 2-18 NOP-compatible hint instruction ............................................................................................... 2-34
Figure 3-1 System control and configuration registers ................................................................................. 3-5
Figure 3-2 MMU control and configuration registers .................................................................................... 3-7
Figure 3-3 Cache control and configuration registers .................................................................................. 3-8
Figure 3-4 TCM control and configuration registers ..................................................................................... 3-8
Figure 3-5 Cache Master Valid Registers .................................................................................................... 3-9
Figure 3-6 DMA control and configuration registers ..................................................................................... 3-9
Figure 3-7 System performance monitor registers ..................................................................................... 3-10
Figure 3-8 System validation registers ....................................................................................................... 3-11
Figure 3-9 CP15 MRC and MCR bit pattern ............................................................................................... 3-12
Figure 3-10 Main ID Register format ............................................................................................................ 3-20
Figure 3-11 Cache Type Register format ..................................................................................................... 3-22
Figure 3-12 TCM Status Register format ..................................................................................................... 3-24
Figure 3-13 TLB Type Register format ......................................................................................................... 3-25
Figure 3-14 Processor Feature Register 0 format ........................................................................................ 3-27
Figure 3-15 Processor Feature Register 1 format ........................................................................................ 3-28
Figure 3-16 Debug Feature Register 0 format ............................................................................................. 3-29
Figure 3-17 Memory Model Feature Register 0 format ................................................................................ 3-31
Figure 3-18 Memory Model Feature Register 1 format ................................................................................ 3-32
Figure 3-19 Memory Model Feature Register 2 format ................................................................................ 3-34
Figure 3-20 Memory Model Feature Register 3 format ................................................................................ 3-36
Figure 3-21 Instruction Set Attributes Register 0 format .............................................................................. 3-37
Figure 3-22 Instruction Set Attributes Register 1 format .............................................................................. 3-38
Figure 3-23 Instruction Set Attributes Register 2 format .............................................................................. 3-39
Figure 3-24 Instruction Set Attributes Register 3 format .............................................................................. 3-41
Figure 3-25 Instruction Set Attributes Register 4 format .............................................................................. 3-42
Figure 3-26 Control Register format ............................................................................................................. 3-45
Figure 3-27 Auxiliary Control Register format .............................................................................................. 3-49
Figure 3-28 Coprocessor Access Control Register format ........................................................................... 3-51
Figure 3-29 Secure Configuration Register format ....................................................................................... 3-52
Figure 3-30 Secure Debug Enable Register format ..................................................................................... 3-54
Figure 3-31 Non-Secure Access Control Register format ............................................................................ 3-56
Figure 3-32 Translation Table Base Register 0 format ................................................................................ 3-58
Figure 3-33 Translation Table Base Register 1 format ................................................................................ 3-59
Figure 3-34 Translation Table Base Control Register format ....................................................................... 3-61
Figure 3-35 Domain Access Control Register format ................................................................................... 3-63
Figure 3-36 Data Fault Status Register format ............................................................................................. 3-64
Figure 3-37 Instruction Fault Status Register format .................................................................................... 3-66
Figure 3-38 Cache operations ...................................................................................................................... 3-70
Figure 3-39 Cache operations with MCRR instructions ............................................................................... 3-71
Figure 3-40 c7 format for Set and Index ....................................................................................................... 3-72
Figure 3-41 c7 format for MVA ..................................................................................................................... 3-73
Figure 3-42 Format of c7 for VA ................................................................................................................... 3-74
Figure 3-43 Cache Dirty Status Register format .......................................................................................... 3-78
Figure 3-44 c7 format for Flush Branch Target Entry using MVA ................................................................ 3-79
Figure 3-45 PA Register format for successful translation ........................................................................... 3-80
Figure 3-46 PA Register format for aborted translation ................................................................................ 3-81
Figure 3-47 TLB Operations Register MVA and ASID format ...................................................................... 3-88
Figure 3-48 TLB Operations Register ASID format ...................................................................................... 3-88
Figure 3-49 Instruction and data cache lockdown register formats .............................................................. 3-88
Figure 3-50 Data TCM Region Register format ............................................................................................ 3-91
Figure 3-51 Instruction TCM Region Register format ................................................................................... 3-92
Figure 3-52 Data TCM Non-secure Control Access Register format ........................................................... 3-94
Figure 3-53 Instruction TCM Non-secure Control Access Register format .................................................. 3-96
Figure 3-54 TCM Selection Register format ................................................................................................. 3-97
Figure 3-55 Cache Behavior Override Register format ................................................................................ 3-98
Figure 3-56 TLB Lockdown Register format ............................................................................................... 3-100
Figure 3-57 Primary Region Remap Register format ................................................................................. 3-102
Figure 3-58 Normal Memory Remap Register format ................................................................................ 3-103
Figure 3-59 DMA identification and status registers format ....................................................................... 3-106
Figure 3-60 DMA User Accessibility Register format ................................................................................. 3-107
Figure 3-61 DMA Channel Number Register format .................................................................................. 3-109
Figure 3-62 DMA Control Register format .................................................................................................. 3-112
Figure 3-63 DMA Channel Status Register format ..................................................................................... 3-117
Figure 3-64 DMA Context ID Register format ............................................................................................ 3-120
Figure 3-65 Secure or Non-secure Vector Base Address Register format ................................................ 3-121
Figure 3-66 Monitor Vector Base Address Register format ........................................................................ 3-122
Figure 3-67 Interrupt Status Register format .............................................................................................. 3-124
Figure 3-68 FCSE PID Register format ...................................................................................................... 3-125
Figure 3-69 Address mapping with the FCSE PID Register ....................................................................... 3-127
Figure 3-70 Context ID Register format ..................................................................................................... 3-127
Figure 3-71 Peripheral Port Memory Remap Register format .................................................................... 3-130
Figure 3-72 Secure User and Non-secure Access Validation Control Register format .............................. 3-132
Figure 3-73 Performance Monitor Control Register format ........................................................................ 3-133
Figure 3-74 System Validation Counter Register format for external debug request counter .................... 3-141
Figure 3-75 System Validation Cache Size Mask Register format ............................................................. 3-145
Figure 3-76 TLB Lockdown Index Register format ..................................................................................... 3-149
Figure 3-77 TLB Lockdown VA Register format ......................................................................................... 3-149
Figure 3-78 TLB Lockdown PA Register format ......................................................................................... 3-150
Figure 3-79 TLB Lockdown Attributes Register format .............................................................................. 3-151
Figure 4-1 Load unsigned byte ..................................................................................................................... 4-6
Figure 4-2 Load signed byte ......................................................................................................................... 4-6
Figure 4-3 Store byte .................................................................................................................................... 4-7
Figure 4-4 Load unsigned halfword, little-endian ......................................................................................... 4-7
Figure 4-5 Load unsigned halfword, big-endian ........................................................................................... 4-8
Figure 4-6 Load signed halfword, little-endian ............................................................................................. 4-8
Figure 4-7 Load signed halfword, big-endian ............................................................................................... 4-9
Figure 4-8 Store halfword, little-endian ........................................................................................................ 4-9
Figure 4-9 Store halfword, big-endian ........................................................................................................ 4-10
Figure 4-10 Load word, little-endian ............................................................................................................. 4-10
Figure 4-11 Load word, big-endian .............................................................................................................. 4-11
Figure 4-12 Store word, little-endian ............................................................................................................ 4-11
Figure 4-13 Store word, big-endian .............................................................................................................. 4-12
Figure 6-1 Memory ordering restrictions .................................................................................................... 6-24
Figure 6-2 Translation table managed TLB fault checking sequence part 1 .............................................. 6-30
Figure 6-3 Translation table managed TLB fault checking sequence part 2 .............................................. 6-31
Figure 6-4 Backwards-compatible first-level descriptor format .................................................................. 6-37
Figure 6-5 Backwards-compatible second-level descriptor format ............................................................. 6-38
Figure 6-6 Backwards-compatible section, supersection, and page translation ........................................ 6-38
Figure 6-7 ARMv6 first-level descriptor formats with subpages disabled ................................................... 6-39
Figure 6-8 ARMv6 second-level descriptor format ..................................................................................... 6-40
Figure 6-9 ARMv6 section, supersection, and page translation ................................................................. 6-41
Figure 6-10 Creating a first-level descriptor address ................................................................................... 6-44
Figure 6-11 Translation for a 1MB section, ARMv6 format .......................................................................... 6-46
Figure 6-12 Translation for a 1MB section, backwards-compatible format .................................................. 6-46
Figure 6-13 Generating a second-level page table address ........................................................................ 6-47
Figure 6-14 Large page table walk, ARMv6 format ...................................................................................... 6-48
Figure 6-15 Large page table walk, backwards-compatible format .............................................................. 6-49
Figure 6-16 4KB small page or 1KB small subpage translations, backwards-compatible format ................ 6-50
Figure 6-17 4KB extended small page translations, ARMv6 format ............................................................. 6-51
Figure 6-18 4KB extended small page or 1KB extended small subpage translations, backwards-compatible format ................................................................................................... 6-52
Figure 7-1 Level one cache block diagram .................................................................................................. 7-4
Figure 8-1 Level two interconnect interfaces ................................................................................................ 8-2
Figure 8-2 Channel architecture of reads ..................................................................................................... 8-8
Figure 8-3 Channel architecture of writes .................................................................................................... 8-8
Figure 8-4 Swizzling of data and strobes in BE-32 big-endian configuration ............................................. 8-42
Figure 9-1 Processor clocks with no IEM ..................................................................................................... 9-3
Figure 9-2 Read latency with no IEM ........................................................................................................... 9-4
Figure 9-3 Processor clocks with IEM .......................................................................................................... 9-6
Figure 9-4 Processor synchronization with IEM ........................................................................................... 9-6
Figure 9-5 Read latency with IEM ................................................................................................................ 9-8
Figure 9-6 Power-on reset .......................................................................................................................... 9-10
Figure 10-1 IEM structure ............................................................................................................................. 10-7
Figure 11-1 Core and coprocessor pipelines .......... 11-5
Figure 11-2 Coprocessor pipeline and queues .......... 11-5
Figure 11-3 Coprocessor pipeline .......... 11-7
Figure 11-4 Token queue buffers .......... 11-9
Figure 11-5 Queue reading and writing .......... 11-10
Figure 11-6 Queue flushing .......... 11-11
Figure 11-7 Instruction queue .......... 11-12
Figure 11-8 Coprocessor data transfer .......... 11-15
Figure 11-9 Instruction iteration for loads .......... 11-16
Figure 11-10 Load data buffering .......... 11-17
Figure 12-1 Connection of a VIC to the processor .......... 12-3
Figure 12-2 VIC port timing example .......... 12-5
Figure 12-3 Interrupt entry sequence .......... 12-7
Figure 13-1 Typical debug system .......... 13-2
Figure 13-2 Debug ID Register format .......... 13-6
Figure 13-3 Debug Status and Control Register format .......... 13-8
Figure 13-4 DTR format .......... 13-12
Figure 13-5 Vector Catch Register format .......... 13-13
Figure 13-6 Breakpoint Control Registers format .......... 13-17
Figure 13-7 Watchpoint Control Registers format .......... 13-21
Figure 14-1 JTAG DBGTAP state machine diagram .......... 14-2
Figure 14-2 RealView ICE clock synchronization .......... 14-3
Figure 14-3 Bypass register bit order .......... 14-8
Figure 14-4 Device ID code register bit order .......... 14-9
Figure 14-5 Instruction register bit order .......... 14-9
Figure 14-6 Scan chain select register bit order .......... 14-10
Figure 14-7 Scan chain 0 bit order .......... 14-11
Figure 14-8 Scan chain 1 bit order .......... 14-11
Figure 14-9 Scan chain 4 bit order .......... 14-13
Figure 14-10 Scan chain 5 bit order, EXTEST selected .......... 14-15
Figure 14-11 Scan chain 5 bit order, INTEST selected .......... 14-15
Figure 14-12 Scan chain 6 bit order .......... 14-17
Figure 14-13 Scan chain 7 bit order .......... 14-18
Figure 14-14 Behavior of the ITRsel IR instruction .......... 14-22
Figure 15-1 ETMCPADDRESS format .......... 15-7
Preface
This preface introduces the ARM1176JZ-S™ Technical Reference Manual (TRM). It contains the
following sections:
• About this manual on page xix
• Feedback on page xxiii.
About this manual
This book is for the ARM1176JZ-S processor. In this book the generic term processor means the
ARM1176JZ-S processor.
Product revision status
The rnpn identifier indicates the revision status of the product described in this manual, where:
rn Identifies the major revision of the product.
pn Identifies the minor revision or modification status of the product.
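As an illustration of the rnpn convention, the following minimal sketch splits an identifier such as r0p7 into its major and minor numbers. The function name parse_rnpn is hypothetical and is not part of any ARM tool or library.

```c
#include <stdio.h>

/* Hypothetical helper: splits an "rnpn" identifier such as "r0p7"
 * into its major (rn) and minor (pn) revision numbers.
 * Returns 1 on success, 0 if the string is not in rnpn form. */
int parse_rnpn(const char *id, int *major, int *minor)
{
    char trailing;
    /* The final %c only converts if unexpected text follows the minor number. */
    int fields = sscanf(id, "r%dp%d%c", major, minor, &trailing);
    return fields == 2;
}
```

For the revision that this manual describes, parse_rnpn("r0p7", ...) would report major revision 0 and minor revision 7.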
Intended audience
This document is written for hardware and software engineers who implement processor
system designs and integrate the processor into a target system.
Using this manual
This book is organized into the following chapters:
Chapter 1 Introduction
Read this for an introduction to the processor and descriptions of the major
functional blocks.
Chapter 2 Programmer’s Model
Read this for a description of the processor registers and programming details.
Chapter 3 System Control Coprocessor
Read this for a description of the processor’s system control coprocessor CP15
registers and programming details.
Chapter 4 Unaligned and Mixed-endian Data Access Support
Read this for a description of the processor support for unaligned and
mixed-endian data accesses.
Chapter 5 Program Flow Prediction
Read this for a description of the functions of the processor’s Prefetch Unit,
including static and dynamic branch prediction and the return stack.
Chapter 6 Memory Management Unit
Read this for a description of the processor’s Memory Management Unit (MMU)
and the address translation process.
Chapter 7 Level One Memory System
Read this for a description of the processor’s level one memory system, including
caches, TCM, DMA, TLBs, and write buffer.
Chapter 8 Level Two Interface
Read this for a description of the processor’s level two memory interface and the
peripheral port.
Chapter 9 Clocking and Resets
Read this for a description of the processor’s clocking modes and the reset signals.
Chapter 10 Power Control
Read this for a description of the processor’s power control facilities.
Chapter 11 Coprocessor Interface
Read this for details of the processor’s coprocessor interface.
Chapter 12 Vectored Interrupt Controller Port
Read this for a description of the processor’s Vectored Interrupt Controller
interface.
Chapter 13 Debug
Read this for a description of the processor’s debug support.
Chapter 14 Debug Test Access Port
Read this for a description of the JTAG-based processor Debug Test Access Port.
Chapter 15 Trace Interface Port
Read this for a description of the trace interface port.
Chapter 16 Cycle Timings and Interlock Behavior
Read this for a description of the processor’s instruction cycle timing and for
details of the interlocks.
Chapter 17 AC Characteristics
Read this for a description of the timing parameters applicable to the processor.
Appendix A Signal Descriptions
Read this for a description of the processor signals.
Appendix B Summary of ARM1136J-S and ARM1176JZ-S Processor Differences
Read this for a summary of the differences between the ARM1136JF-S™ and
ARM1176JZ-S processors.
Appendix C Revisions
Read this for a description of the technical changes between released issues of this
book.
Glossary
Read this for definitions of terms used in this book.
Conventions
This section describes the conventions that this manual uses:
• Typographical
• Timing diagrams on page xxi
• Signals on page xxi.
Typographical
The typographical conventions are:
italic
Highlights important notes, introduces special terminology, denotes
internal cross-references, and citations.
bold
Highlights interface elements, such as menu names. Denotes signal
names. Also used for terms in descriptive lists, where appropriate.
monospace
Denotes text that you can enter at the keyboard, such as commands, file
and program names, and source code.
monospace (underlined)
Denotes a permitted abbreviation for a command or option. You can enter
the underlined text instead of the full command or option name.
monospace italic
Denotes arguments to monospace text where the argument is to be
replaced by a specific value.
monospace bold
Denotes language keywords when used outside example code.
< and >
Enclose replaceable terms for assembler syntax where they appear in code
or code fragments. For example:
MRC p15, 0, <Rd>, <CRn>, <CRm>, <Opcode_2>
Timing diagrams
The figure named Key to timing diagram conventions explains the components used in timing
diagrams. Variations, when they occur, have clear labels. You must not assume any timing
information that is not explicit in the diagrams.
Shaded bus and signal areas are undefined, so the bus or signal can assume any value within the
shaded area at that time. The actual level is unimportant and does not affect normal operation.
[Figure: Key to timing diagram conventions. Legend entries: Clock, HIGH to LOW, Transient,
HIGH/LOW to HIGH, Bus stable, Bus to high impedance, Bus change, High impedance to stable bus.]
Timing diagrams sometimes show single-bit signals as HIGH and LOW at the same time and
they look similar to the bus change shown in Key to timing diagram conventions. If a timing
diagram shows a single-bit signal in this way then its value does not affect the accompanying
description.
Signals
The signal conventions are:
Signal level
The level of an asserted signal depends on whether the signal is
active-HIGH or active-LOW. Asserted means:
• HIGH for active-HIGH signals
• LOW for active-LOW signals.
Lower-case n
At the start or end of a signal name denotes an active-LOW signal.
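The two signal conventions can be combined into a small check. This is only a sketch: the function names are hypothetical, and it assumes that a lower-case n at the start or end of a name always marks an active-LOW signal, as described above.

```c
#include <stdbool.h>
#include <string.h>

/* True when the name starts or ends with a lower-case 'n',
 * which this manual uses to mark an active-LOW signal. */
bool signal_is_active_low(const char *name)
{
    size_t len = strlen(name);
    return len > 0 && (name[0] == 'n' || name[len - 1] == 'n');
}

/* True when the sampled level (true = HIGH) means the signal is asserted. */
bool signal_is_asserted(const char *name, bool level_high)
{
    return signal_is_active_low(name) ? !level_high : level_high;
}
```

With these helpers, nDMASIRQ is reported as asserted when it is LOW, and SPIDEN as asserted when it is HIGH.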
Additional reading
This section lists publications by ARM and by third parties.
See Infocenter, http://infocenter.arm.com, for access to ARM documentation.
ARM publications
This book contains information that is specific to the ARM1176JZ-S processor. See the
following documents for other relevant information:
• ARM Architecture Reference Manual (ARM DDI 0406)
Note
The ARM DDI 0406 edition of the ARM Architecture Reference Manual (the ARM ARM)
incorporates the supplements to the previous ARM ARM, including the Security
Extensions supplement.
• Jazelle® V1 Architecture Reference Manual (ARM DDI 0225)
• AMBA® AXI Protocol V1.0 Specification (ARM IHI 0022)
• Embedded Trace Macrocell Architecture Specification (ARM IHI 0014)
• ARM1136J-S Technical Reference Manual (ARM DDI 0211)
• ARM11 Memory Built-In Self Test Controller Technical Reference Manual
(ARM DDI 0289)
• ARM1176JZF-S™ and ARM1176JZ-S™ Implementation Guide (ARM DII 0081)
• CoreSight ETM11™ Technical Reference Manual (ARM DDI 0318)
• RealView™ Compilation Tools Developer Guide (ARM DUI 0203)
• ARM PrimeCell® Vectored Interrupt Controller (PL192) Technical Reference Manual
(ARM DDI 0273)
• Intelligent Energy Controller Technical Overview (ARM DTO 0005).
Other publications
This section lists relevant documents published by third parties:
• IEEE Standard Test Access Port and Boundary-Scan Architecture specification
1149.1-1990 (JTAG).
Figure 14-1 on page 14-2 is printed with permission IEEE Std. 1149.1-1990, IEEE Standard
Test Access Port and Boundary-Scan Architecture Copyright 2001, by IEEE. The IEEE
disclaims any responsibility or liability resulting from the placement and use in the described
manner.
Feedback
ARM Limited welcomes feedback on the ARM1176JZ-S processor and its documentation.
Feedback on the product
If you have any comments or suggestions about this product, contact your supplier giving:
• the product name
• a concise explanation of your comments.
Feedback on this book
If you have any comments on this manual, send email to errata@arm.com giving:
• the document title
• the document number
• the page number(s) to which your comments apply
• a concise explanation of your comments.
ARM Limited also welcomes general suggestions for additions and improvements.
Chapter 1
Introduction
This chapter introduces the ARM1176JZ-S processor and its features. It contains the following
sections:
• About the processor on page 1-2
• Extensions to ARMv6 on page 1-3
• TrustZone security extensions on page 1-4
• ARM1176JZ-S architecture with Jazelle technology on page 1-6
• Components of the processor on page 1-8
• Power management on page 1-21
• Configurable options on page 1-23
• Pipeline stages on page 1-24
• Typical pipeline operations on page 1-26
• ARM1176JZ-S instruction set summary on page 1-30
• Product revisions on page 1-46.
1.1 About the processor
The ARM1176JZ-S processor incorporates an ARM11 integer core that implements the ARM
architecture v6. It supports the ARM and Thumb™ instruction sets, Jazelle technology to enable
direct execution of Java bytecodes, and a range of SIMD DSP instructions that operate on 16-bit
or 8-bit data values in 32-bit registers.
The ARM1176JZ-S processor features:
• TrustZone™ security extensions
• provision for Intelligent Energy Management (IEM™)
• high-speed Advanced Microprocessor Bus Architecture (AMBA) Advanced Extensible
Interface (AXI) level two interfaces supporting prioritized multiprocessor
implementations
• an integer core with integral EmbeddedICE-RT logic
• an eight-stage pipeline
• branch prediction with return stack
• low interrupt latency configuration
• internal coprocessors CP14 and CP15
• external coprocessor interface
• Instruction and Data Memory Management Units (MMUs), managed using MicroTLB
structures backed by a unified Main TLB
• Instruction and data caches, including a non-blocking data cache with Hit-Under-Miss
(HUM)
• virtually indexed and physically addressed caches
• 64-bit interface to both caches
• level one Tightly-Coupled Memory (TCM) that you can use as a local RAM with DMA
• external coprocessor support
• trace support
• JTAG-based debug.
Note
The only functional difference between the ARM1176JZ-S and ARM1176JZF-S processors is
that the ARM1176JZF-S processor includes a Vector Floating-Point (VFP) coprocessor.
1.2 Extensions to ARMv6
The ARM1176JZ-S processor provides support for extensions to ARMv6 that include:
• Store and Load Exclusive instructions for bytes, halfwords and doublewords and a new
Clear Exclusive instruction.
• A true no-operation instruction and yield instruction.
• Architectural remap registers.
• Cache size restriction through CP15 c1. You can restrict cache size to 16KB for Operating
Systems (OSs) that do not support page coloring.
• Revised use of TEX remap bits. The ARMv6 MMU page table descriptors use a large
number of bits to describe all of the options for inner and outer cacheability. In reality, it is
believed that no application requires all of these options simultaneously. Therefore, it is
possible to configure the ARM1176JZ-S processor to support only a small number of
options by means of the TEX remap mechanism. This implies a level of indirection in the
page table mappings.
The TEX CB encoding table provides two OS managed page table bits. For binary
compatibility with existing ARMv6 ports of OSs, this gives a separate mode of operation
of the MMU. This is called the TEX remap configuration and is controlled by bit [28] TR
in CP15 Register 1.
• Revised use of AP bits. In the ARM1176JZ-S processor the APX and AP[1:0] encoding
b111 is Privileged or User mode read-only access. AP[0] indicates an abort type, Access
Bit fault, when CP15 c1[29] is 1.
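The two CP15 c1 control bits named in this list can be tested with simple masks. The following is a minimal sketch that works on a copy of the register value. The macro and function names are hypothetical, and reading or writing the register itself (with MRC and MCR to CP15 c1, in a privileged mode) is left out so that the example stays portable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as given in this section; the macro names are illustrative only. */
#define CP15_C1_TR_BIT         (UINT32_C(1) << 28) /* TEX remap enable, bit [28] */
#define CP15_C1_ACCESS_BIT_EN  (UINT32_C(1) << 29) /* bit [29]: AP[0] used as an access bit */

/* True when the TEX remap configuration is selected. */
bool tex_remap_enabled(uint32_t c1)
{
    return (c1 & CP15_C1_TR_BIT) != 0;
}

/* True when AP[0] can generate an Access Bit fault. */
bool access_bit_enabled(uint32_t c1)
{
    return (c1 & CP15_C1_ACCESS_BIT_EN) != 0;
}

/* Returns the register value with TEX remap enabled, leaving other bits unchanged. */
uint32_t enable_tex_remap(uint32_t c1)
{
    return c1 | CP15_C1_TR_BIT;
}
```

A read-modify-write sequence of this kind avoids disturbing the other control bits held in the same register.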
1.3 TrustZone security extensions
Caution
TrustZone security extensions enable a Secure software environment. The technology does not
protect the processor from hardware attacks and the implementor must take appropriate steps to
secure the hardware and protect the trusted code.
The ARM1176JZ-S processor supports TrustZone security extensions to provide a secure
environment for software. This section summarizes processor elements that TrustZone uses. For
details of TrustZone, see the ARM Architecture Reference Manual.
The TrustZone approach to integrated system security depends on an established trusted code
base. The trusted code is a relatively small block that runs in the Secure world in the processor
and provides the foundation for security throughout the system. This security applies from
system boot and enforces a level of trust at each stage of a transaction.
The processor has:
• seven operating modes that can be either Secure or Non-secure
• Secure Monitor mode, that is always Secure.
Except when the processor is in Secure Monitor mode, the NS bit in the Secure Configuration
Register determines whether the processor runs code in the Secure or Non-secure worlds. The
Secure Configuration Register is in CP15 register c1, see c1, Secure Configuration Register on
page 3-52.
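The rule in the previous paragraph can be written as a short predicate. This sketch makes two assumptions that this section does not state: that Secure Monitor mode is encoded as b10110 in the CPSR mode field M[4:0], and that NS is bit [0] of the Secure Configuration Register. Check both against the ARM Architecture Reference Manual before relying on them. The names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPSR_MODE_MASK    UINT32_C(0x1F)
#define CPSR_MODE_MONITOR UINT32_C(0x16) /* assumed M[4:0] encoding of Secure Monitor mode */
#define SCR_NS_BIT        UINT32_C(0x1)  /* assumed position of NS in the Secure Configuration Register */

/* True when the processor state described by cpsr and scr is in the Secure world. */
bool in_secure_world(uint32_t cpsr, uint32_t scr)
{
    if ((cpsr & CPSR_MODE_MASK) == CPSR_MODE_MONITOR)
        return true;                 /* Secure Monitor mode is always Secure */
    return (scr & SCR_NS_BIT) == 0;  /* otherwise the NS bit selects the world */
}
```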
Secure Monitor mode is used to switch operation between the Secure and Non-secure worlds.
Secure Monitor mode uses these banked registers:
R13_mon Stack Pointer
R14_mon Link Register
SPSR_mon Saved Program Status Register
The processor implements this instruction to enter Secure Monitor mode:
SMC Secure Monitor Call, switches from one of the privileged modes to the Secure
Monitor mode.
The processor implements these TrustZone related signals:
nDMASIRQ Secure DMA transfer request, see c11, DMA Channel Status Register on
page 3-117.
nDMAEXTERRIR Not maskable error DMA interrupt, see c11, DMA Channel Status Register on
page 3-117.
SPIDEN Secure privileged invasive debug enable, see Secure Monitor mode and debug on
page 13-4.
SPNIDEN Secure privileged non-invasive debug enable, see Secure Monitor mode and
debug on page 13-4.
Note
Do not confuse Secure Monitor mode with the Monitor debug-mode.
AXI supports trusted peripherals through these signals:
AxPROT[1] Protection type signal, see AxPROT[2:0] on page 8-12.
RRESP[1:0] Read response signal, see AXI interface signals on page A-7.
BRESP[1:0] Write response signal, see AXI interface signals on page A-7.
ETMIASECCTL[1:0] and ETMCPSECCTL[1:0]
TrustZone information for tracing, see Secure control bus on page 15-4.
1.4 ARM1176JZ-S architecture with Jazelle technology
The ARM1176JZ-S processor has three instruction sets:
• the 32-bit ARM instruction set used in ARM state, with media instructions
• the 16-bit Thumb instruction set used in Thumb state
• the 8-bit Java bytecodes used in Jazelle state.
For details of both the ARM and Thumb instruction sets, see the ARM Architecture Reference
Manual. For full details of the ARM1176JZ-S Java instruction set, see the Jazelle V1
Architecture Reference Manual.
1.4.1 Instruction compression
A typical 32-bit architecture can manipulate 32-bit integers with single instructions, and address
a large address space much more efficiently than a 16-bit architecture. When processing 32-bit
data, a 16-bit architecture takes at least two instructions to perform the same task as a single
32-bit instruction.
When a 16-bit architecture has only 16-bit instructions, and a 32-bit architecture has only 32-bit
instructions, overall the 16-bit architecture has higher code density, and greater than half the
performance of the 32-bit architecture.
Thumb implements a 16-bit instruction set on a 32-bit architecture, giving higher performance
than on a 16-bit architecture, with higher code density than a 32-bit architecture.
The ARM1176JZ-S processor can easily switch between running in ARM state and running in
Thumb state. This enables you to optimize both code density and performance to best suit your
application requirements.
1.4.2 The Thumb instruction set
The Thumb instruction set is a subset of the most commonly used 32-bit ARM instructions.
Thumb instructions are 16 bits long, and have a corresponding 32-bit ARM instruction that has
the same effect on the processor model. Thumb instructions operate with the standard ARM
register configuration, enabling excellent interoperability between ARM and Thumb states.
Thumb has all the advantages of a 32-bit core:
• 32-bit address space
• 32-bit registers
• 32-bit shifter and Arithmetic Logic Unit (ALU)
• 32-bit memory transfer.
Thumb therefore offers a long branch range, powerful arithmetic operations, and a large address
space.
The availability of both 16-bit Thumb and 32-bit ARM instruction sets gives you the flexibility
to emphasize performance or code size on a subroutine level, according to the requirements of
your applications. For example, you can code critical loops for applications such as fast
interrupts and DSP algorithms using the full ARM instruction set, and link them with Thumb code.
1.4.3 Java bytecodes
ARM architecture v6 with Jazelle technology executes variable length Java bytecodes. Java
bytecodes fall into two classes:
Hardware execution
Bytecodes that perform stack-based operations.
Software execution
Bytecodes that are too complex to execute directly in hardware are executed in
software. An ARM register is used to access a table of exception handlers to
handle these particular bytecodes.
A complete list of the ARM1176JZ-S processor-supported Java bytecodes and their
corresponding hardware or software instructions is in the Jazelle V1 Architecture Reference
Manual.
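The idea of a handler table for software-executed bytecodes can be pictured as a lookup indexed by the 8-bit bytecode. The following is a conceptual sketch only. It does not describe the actual Jazelle table layout or register usage, and every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*bytecode_handler)(uint8_t opcode);

/* Hypothetical software handler, present only to make the sketch self-contained. */
static int handle_in_software(uint8_t opcode)
{
    return opcode;
}

/* One slot per 8-bit bytecode. An empty slot (NULL) stands for a bytecode
 * that is assumed to execute directly in hardware. */
static bytecode_handler handler_table[256];

/* Marks a bytecode as software-executed by installing a handler for it. */
void install_software_handler(uint8_t opcode)
{
    handler_table[opcode] = handle_in_software;
}

/* Returns the handler result, or -1 when no software handler is installed. */
int dispatch_bytecode(uint8_t opcode)
{
    bytecode_handler handler = handler_table[opcode];
    return handler ? handler(opcode) : -1;
}
```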
1.5 Components of the processor
The main components of the ARM1176JZ-S processor are:
• Integer core
• Load Store Unit (LSU) on page 1-11
• Prefetch unit on page 1-11
• Memory system on page 1-12
• AMBA AXI interface on page 1-15
• Coprocessor interface on page 1-17
• Debug on page 1-17
• Instruction cycle summary and interlocks on page 1-19
• System control on page 1-19
• Interrupt handling on page 1-19.
Figure 1-1 shows the structure of the ARM1176JZ-S processor.
[Block diagram of the ARM1176JZ-S processor. Blocks shown: JTAG interface, ETM interface,
Coprocessor interface, VIC interface, Instruction Cache, Prefetch Unit, Integer core,
Load Store Unit, Data Cache, Instruction TCM, L1 instruction side controller,
Memory management unit, L1 data side controller, Data TCM, System metrics,
L2 instruction interface, Power control, L2 data interface, Peripheral port, L2 DMA interface.]
Figure 1-1 ARM1176JZ-S processor block diagram
1.5.1 Integer core
The ARM1176JZ-S processor is built around the ARM11 integer core. It is an implementation
of the ARMv6 architecture and runs the ARM, Thumb, and Java instruction sets. The processor
contains EmbeddedICE-RT™ logic and a JTAG debug interface to enable hardware debuggers to
communicate with the processor. The following sections describe the core in more detail:
• Instruction set categories on page 1-9
• Conditional execution on page 1-9
• Registers on page 1-9
• Modes and exceptions
• Thumb instruction set on page 1-10
• DSP instructions on page 1-10
• Media extensions on page 1-10
• Datapath on page 1-10
• Branch prediction on page 1-11
• Return stack on page 1-11.
Instruction set categories
The main instruction set categories are:
• branch instructions
• data processing instructions
• status register transfer instructions
• load and store instructions
• coprocessor instructions
• exception-generating instructions.
Note
Only load, store, and swap instructions can access data from memory.
Conditional execution
The processor conditionally executes nearly all ARM instructions. You can decide if the
condition code flags, Negative, Zero, Carry, and Overflow, are updated according to their result.
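Conditional execution can be modelled as a test of the four flags against a 4-bit condition field. The sketch below assumes the standard ARM condition encodings and the N, Z, C, and V positions in CPSR bits [31:28], both taken from the ARM Architecture Reference Manual rather than from this section. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag positions in the CPSR. */
#define FLAG_N (UINT32_C(1) << 31)
#define FLAG_Z (UINT32_C(1) << 30)
#define FLAG_C (UINT32_C(1) << 29)
#define FLAG_V (UINT32_C(1) << 28)

/* Returns true when an instruction with the given 4-bit condition field
 * would execute for the flag values held in cpsr. */
bool condition_passed(unsigned cond, uint32_t cpsr)
{
    bool n = (cpsr & FLAG_N) != 0;
    bool z = (cpsr & FLAG_Z) != 0;
    bool c = (cpsr & FLAG_C) != 0;
    bool v = (cpsr & FLAG_V) != 0;
    bool result;

    /* Bits [3:1] select the base test, bit [0] inverts it. */
    switch ((cond >> 1) & 0x7u) {
    case 0: result = z; break;               /* EQ / NE */
    case 1: result = c; break;               /* CS / CC */
    case 2: result = n; break;               /* MI / PL */
    case 3: result = v; break;               /* VS / VC */
    case 4: result = c && !z; break;         /* HI / LS */
    case 5: result = (n == v); break;        /* GE / LT */
    case 6: result = (n == v) && !z; break;  /* GT / LE */
    default: return true;                    /* AL, and the unconditional encoding */
    }
    return (cond & 1u) ? !result : result;
}
```

An instruction only changes the outcome of later tests if it is a flag-setting form, which is the choice described above.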
Registers
The ARM1176JZ-S core contains:
• 33 general-purpose 32-bit registers
• 7 dedicated 32-bit registers.
Note
At any one time, 16 general-purpose registers are visible. The remainder are banked registers
used to speed up exception processing.
Modes and exceptions
The core provides a set of operating and exception modes, to support systems combining
complex operating systems, user applications, and real-time demands. There are eight operating
modes, six of which are exception processing modes:
• User
• Supervisor
• fast interrupt
• normal interrupt
• abort
• system
• Undefined
• Secure Monitor.
Thumb instruction set
The Thumb instruction set contains a subset of the most commonly-used 32-bit ARM
instructions encoded into 16-bit wide opcodes. This reduces the amount of memory required for
instruction storage.
DSP instructions
The DSP extensions to the ARM instruction set provide:
• 16-bit data operations
• saturating arithmetic
• MAC operations.
The processor executes multiply instructions using a single-cycle 32x16 implementation. The
processor can perform 32x32, 32x16, and 16x16 multiply instructions (MAC).
Media extensions
The ARMv6 instruction set provides media instructions to complement the DSP instructions.
There are four media instruction groups:
• Multiplication instructions for handling 16-bit and 32-bit data, including
dual-multiplication instructions that operate on both 16-bit halves of their source registers.
This group includes an instruction that improves the performance and size of code for
multi-word unsigned multiplications.
• Single Instruction Multiple Data (SIMD) instructions to perform operations on pairs of
16-bit values held in a single register, or on sets of four 8-bit values held in a single
register. The main operations supplied are addition and subtraction, selection, pack, and
saturation.
• Instructions to extract bytes and halfwords from registers and zero-extend or sign-extend
them. These include a parallel extraction of two bytes followed by extension of each byte
to a halfword.
• Unsigned Sum-of-Absolute-Differences (SAD) instructions. These are used in MPEG motion
estimation.
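The behavior of two of these groups can be modelled in portable C. The sketch below shows an addition on pairs of 16-bit values with no carry between the halves, and a sum of absolute differences over four unsigned bytes. It is a reference model of the arithmetic only, with hypothetical function names, and makes no claim about how the hardware implements the operations.

```c
#include <stdint.h>

/* Adds the two 16-bit halves of a and b independently. Any carry out of
 * the low halfword is discarded instead of rippling into the high halfword. */
uint32_t simd_add16(uint32_t a, uint32_t b)
{
    uint32_t lo = (a + b) & 0xFFFFu;
    uint32_t hi = ((a >> 16) + (b >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}

/* Sum of absolute differences over the four unsigned bytes of a and b. */
uint32_t sad_u8x4(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t x = (a >> (8 * i)) & 0xFFu;
        uint32_t y = (b >> (8 * i)) & 0xFFu;
        sum += (x > y) ? (x - y) : (y - x);
    }
    return sum;
}
```

In motion estimation, a loop over a block of pixels would accumulate sad_u8x4 results to score how well two blocks match.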
Datapath
The datapath consists of three pipelines:
• ALU, shift and Sat pipeline
• MAC pipeline
• load or store pipeline, see Load Store Unit (LSU) on page 1-11.
ALU, shift or Sat pipe
The ALU, shift and Sat pipeline executes most of the ALU operations, and includes a 32-bit
barrel shifter. It consists of three pipeline stages:
Shift The Shift stage contains the full barrel shifter. This stage performs all shifts,
including those required by the LSU.
The Shift stage implements a saturating left shift that doubles the value of an
operand and saturates it.
ALU The ALU stage performs all arithmetic and logic operations, and generates the
condition codes for instructions that set these flags.
The ALU stage consists of a logic unit, an arithmetic unit, and a flag generator.
The pipeline logic evaluates the flag settings in parallel with the main adder in the
ALU. The flag generator is enabled only on flag-setting operations.
The ALU stage separates the carry chains of the main adder for 8-bit and 16-bit
SIMD instructions.
Sat The Sat stage implements the saturation logic required by the various classes of
DSP instructions.
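The saturating operations mentioned for the Shift and Sat stages can be expressed as a clamp to the signed 32-bit range. This is a minimal reference model with hypothetical function names. It shows the arithmetic result only and does not model any flag that records that saturation occurred.

```c
#include <stdint.h>

/* Clamps a 64-bit intermediate result to the signed 32-bit range. */
int32_t saturate32(int64_t x)
{
    if (x > INT32_MAX)
        return INT32_MAX;
    if (x < INT32_MIN)
        return INT32_MIN;
    return (int32_t)x;
}

/* Saturating left shift by one: doubles the operand and saturates it. */
int32_t sat_double(int32_t x)
{
    return saturate32((int64_t)x * 2);
}

/* Saturating addition of two signed 32-bit values. */
int32_t sat_add(int32_t a, int32_t b)
{
    return saturate32((int64_t)a + b);
}
```

Computing in 64 bits before clamping keeps the model free of signed overflow, which is undefined behavior in C.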
MAC pipe
The MAC pipeline executes all of the enhanced multiply, and multiply-accumulate instructions.
The MAC unit consists of a 32x16 multiplier and an accumulate unit that is configured to
calculate the sum of two 16x16 multiplies. The accumulate unit has its own dedicated single
register read port for the accumulate operand.
To minimize power consumption, the processor only clocks each of the MAC and ALU stages
when required.
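The arithmetic behind a 32x16 multiplier and a dual 16x16 accumulate can be sketched as follows. The decomposition of a 32x32 product into two 32x16 partial products is one possible approach, shown here as an assumption for illustration. This section does not state how the MAC unit sequences wider multiplies, and the function names are hypothetical.

```c
#include <stdint.h>

/* Low 32 bits of a 32x32 product, built from two 32x16 partial products. */
uint32_t mul32_from_32x16(uint32_t a, uint32_t b)
{
    uint32_t low  = a * (b & 0xFFFFu);  /* a x b[15:0]  */
    uint32_t high = a * (b >> 16);      /* a x b[31:16] */
    return low + (high << 16);          /* unsigned arithmetic wraps modulo 2^32 */
}

/* Sum of two signed 16x16 products taken from the halfwords of a and b.
 * The conversions to int16_t rely on the usual two's complement behavior.
 * The 64-bit result is wide enough that the sum cannot overflow. */
int64_t dual_mul16_add(uint32_t a, uint32_t b)
{
    int64_t lo = (int64_t)(int16_t)(a & 0xFFFFu) * (int16_t)(b & 0xFFFFu);
    int64_t hi = (int64_t)(int16_t)(a >> 16) * (int16_t)(b >> 16);
    return lo + hi;
}
```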
Return stack
The processor includes a three-entry return stack to accelerate returns from procedure calls. For
each procedure call, the processor pushes the return address onto a hardware stack. When the
processor recognizes a procedure return, the processor pops the address held in the return stack
that the prefetch unit uses as the predicted return address.
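A three-entry return stack can be modelled as a small circular buffer. In this sketch the behavior when a fourth call is pushed is an assumption: the oldest entry is overwritten. This section does not specify what the hardware does in that case, and the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define RETURN_STACK_ENTRIES 3

struct return_stack {
    uint32_t entry[RETURN_STACK_ENTRIES];
    unsigned top;    /* index of the next slot to write */
    unsigned count;  /* number of valid entries, at most three */
};

/* Records the return address of a procedure call. */
void rs_push(struct return_stack *rs, uint32_t return_address)
{
    rs->entry[rs->top] = return_address;
    rs->top = (rs->top + 1) % RETURN_STACK_ENTRIES;
    if (rs->count < RETURN_STACK_ENTRIES)
        rs->count++;  /* when full, the oldest entry is overwritten (assumption) */
}

/* Pops the predicted return address. Returns false when the stack is empty. */
bool rs_pop(struct return_stack *rs, uint32_t *predicted)
{
    if (rs->count == 0)
        return false;
    rs->top = (rs->top + RETURN_STACK_ENTRIES - 1) % RETURN_STACK_ENTRIES;
    *predicted = rs->entry[rs->top];
    rs->count--;
    return true;
}
```

With this model, call chains up to three deep return with a correct prediction, and deeper chains lose the outermost return addresses first.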
Note
See Pipeline stages on page 1-24 for details of the pipeline stages and instruction progression.
See Chapter 3 System Control Coprocessor for system control coprocessor programming
information.
1.5.2 Load Store Unit (LSU)
The Load Store Unit (LSU) manages all load and store operations. The load-store pipeline
decouples loads and stores from the MAC and ALU pipelines.
When the processor issues LDM and STM instructions to the LSU pipeline, other instructions
run concurrently, subject to the requirements of supporting precise exceptions.
1.5.3 Prefetch unit
The prefetch unit fetches instructions from the instruction cache, Instruction TCM, or from
external memory and predicts the outcome of branches in the instruction stream.
See Chapter 5 Program Flow Prediction for more details.
Branch prediction
The core uses both static and dynamic branch prediction. All branches are predicted where the
target address is an immediate address, or fixed-offset PC-relative address.
The first level of branch prediction is dynamic, through a 128-entry Branch Target Address
Cache (BTAC). If the PC of a branch matches an entry in the BTAC, the processor uses the
branch history and the target address to fetch the new instruction stream.
The processor might remove dynamically predicted branches from the instruction stream, and
might execute such branches in zero cycles.
If the address mappings are changed, the BTAC must be flushed. A BTAC flush instruction is
provided in the CP15 coprocessor.
The processor uses static branch prediction to manage branches not matched in the BTAC. The
static branch predictor makes a prediction based on the direction of the branches.
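The two prediction levels can be pictured with a simplified model. Several details here are assumptions and are not taken from this section: the BTAC is treated as direct-mapped and indexed by word-aligned PC bits, its history is reduced to a single taken or not-taken bit, and the static rule is the common one of predicting backward branches as taken and forward branches as not taken. All names are hypothetical. See Chapter 5 Program Flow Prediction for the actual behavior.

```c
#include <stdbool.h>
#include <stdint.h>

#define BTAC_ENTRIES 128

struct btac_entry {
    bool valid;
    uint32_t pc;         /* address of the branch */
    uint32_t target;     /* cached target address */
    bool predict_taken;  /* simplified branch history */
};

struct btac {
    struct btac_entry e[BTAC_ENTRIES];
};

/* Assumed direct-mapped index derived from word-aligned PC bits. */
unsigned btac_index(uint32_t pc)
{
    return (pc >> 2) % BTAC_ENTRIES;
}

/* Static rule (assumption): backward branches taken, forward not taken. */
bool static_predict_taken(int32_t offset)
{
    return offset < 0;
}

/* Dynamic prediction on a BTAC hit, static prediction on a miss. */
bool predict_taken(const struct btac *b, uint32_t pc, int32_t offset)
{
    const struct btac_entry *entry = &b->e[btac_index(pc)];
    if (entry->valid && entry->pc == pc)
        return entry->predict_taken;
    return static_predict_taken(offset);
}

/* Invalidates every entry, as required when address mappings change. */
void btac_flush(struct btac *b)
{
    for (unsigned i = 0; i < BTAC_ENTRIES; i++)
        b->e[i].valid = false;
}
```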
1.5.4 Memory system
The level-one memory system provides the core with:
• separate instruction and data caches
• separate instruction and data Tightly-Coupled Memories
• 64-bit datapaths throughout the memory system
• virtually indexed, physically tagged caches
• memory access controls and virtual memory management
• support for four sizes of memory page
• two-channel DMA into TCMs
• I-fetch, D-read/write interface, compatible with multi-layer AMBA AXI
• 32-bit dedicated peripheral interface
• export of memory attributes for second-level memory system.
The following sections describe the memory system in more detail:
• Instruction and data caches
• Cache power management on page 1-13
• Instruction and data TCM on page 1-13
• TCM DMA engine on page 1-14
• DMA features on page 1-14
• Memory Management Unit on page 1-14.
Instruction and data caches
The core provides separate instruction and data caches. The cache has the following features:
ARM DDI 0333H
ID012410
•
Independent configuration of the instruction and data cache during synthesis to sizes
between 4KB and 64KB.
•
4-way set-associative instruction and data caches. You can lock each way independently.
•
Pseudo-random or round-robin replacement.
•
Eight word cache line length.
•
The MicroTLB entry determines whether cache lines are write-back or write-through.
•
Ability to disable each cache independently, using the system control coprocessor.
•
Data cache misses that are non-blocking. The processor supports up to three outstanding
data cache misses.
•
Streaming of sequential data from LDM and LDRD operations, and sequential instruction
fetches.
•
Critical word first filling of the cache on a cache-miss.
•
You can implement all the cache RAM blocks, and the associated tag and valid RAM
blocks using standard ASIC RAM compilers. This ensures optimum area and
performance of your design.
•
Each cache line is marked with a Secure or Non-secure tag that defines if the line contains
Secure or Non-secure data.
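As a worked example of the geometry implied by these features, the sketch below derives the number of sets from the cache size, using the four ways and the eight-word (32-byte) line length listed above. The function names are hypothetical, and the simple modulo indexing is an assumption made for illustration rather than a description of the actual RAM organization.

```c
#include <stdint.h>

enum {
    CACHE_WAYS       = 4,      /* 4-way set-associative */
    CACHE_LINE_BYTES = 8 * 4   /* eight 32-bit words per line */
};

/* Returns the number of sets for a cache of the given size in bytes, or 0 if
 * the size is not a power of two between 4KB and 64KB. */
static uint32_t cache_num_sets(uint32_t cache_bytes)
{
    if (cache_bytes < 4096u || cache_bytes > 65536u ||
        (cache_bytes & (cache_bytes - 1u)) != 0u) {
        return 0u;
    }
    return cache_bytes / (CACHE_WAYS * CACHE_LINE_BYTES);
}

/* Assumed indexing: the line number of the (virtual) address modulo the
 * number of sets. Returns 0 for an invalid cache size. */
static uint32_t cache_set_index(uint32_t addr, uint32_t cache_bytes)
{
    uint32_t sets = cache_num_sets(cache_bytes);
    return sets ? (addr / CACHE_LINE_BYTES) % sets : 0u;
}
```

Under these assumptions a 16KB cache has 128 sets, and a 64KB cache has 512.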
Cache power management
To reduce power consumption, the core uses sequential cache operations to reduce the number
of full cache reads. If a cache read is sequential to the previous cache read, and the read is within
the same cache line, only the data RAM set that was previously read is accessed. The core does
not access tag RAM during sequential cache operations.
To reduce power consumption further, the core reads only the addressed words within a
cache line at any time.
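The sequential-access condition can be sketched as a simple predicate. The function name is hypothetical, and treating "sequential" as "starts exactly where the previous read ended" is an assumption; the text does not define the term precisely.

```c
#include <stdbool.h>
#include <stdint.h>

enum { LINE_BYTES = 32 };  /* eight-word cache line */

/* Returns true when a cache read could reuse the previously selected data
 * RAM set without a tag RAM access: the read must follow on directly from
 * the previous read (assumed meaning of "sequential") and must fall within
 * the same cache line. */
static bool can_skip_tag_lookup(uint32_t prev_addr, uint32_t prev_bytes,
                                uint32_t addr)
{
    bool sequential = (addr == prev_addr + prev_bytes);
    bool same_line  = (addr / LINE_BYTES) == (prev_addr / LINE_BYTES);
    return sequential && same_line;
}
```

A word read at 0x104 that follows a word read at 0x100 qualifies, but a read at 0x120 that follows one at 0x11C does not, because it starts a new cache line.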
Instruction and data TCM
Because some applications might not respond well to caching, configurable memory blocks are
provided for Instruction and Data Tightly Coupled Memories (TCMs). These ensure high-speed
access to code or data.
An Instruction TCM typically holds interrupt or exception code that the processor must
access at high speed, without any potential delay resulting from a cache miss.
A Data TCM typically holds a block of data for intensive processing, such as audio or video
processing.
You can configure each TCM to be Secure or Non-secure.
Level one memory system
You can separately configure the size of the Instruction TCM (ITCM) and the size of the Data
TCM (DTCM) to be 0KB, 4KB, 8KB, 16KB, 32KB or 64KB. For each side (ITCM and DTCM):
•
If you configure the TCM size to be 4KB you get one TCM, of 4KB, on this side.
•
If you configure the TCM size to be larger than 4KB you get two TCMs on this side, each
of half the configured size. So, for example, if you configure an ITCM size of 16KB you
get two ITCMs, each of size 8KB.
Table 1-1 lists all possible TCM configurations. See Configurable options on page 1-23 for
more information about configuring your ARM1176JZ-S implementation.
Table 1-1 TCM configurations
Configured TCM size    Number of TCMs    Size of each TCM
0KB                    0                 0
4KB                    1                 4KB
8KB                    2                 4KB
16KB                   2                 8KB
32KB                   2                 16KB
64KB                   2                 32KB
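The mapping in Table 1-1 can be expressed as a small lookup. This is a minimal sketch; the structure and function names are hypothetical and are not part of any ARM deliverable.

```c
#include <stdbool.h>
#include <stdint.h>

struct tcm_layout {
    uint32_t count;    /* number of TCMs on this side */
    uint32_t each_kb;  /* size of each TCM in KB */
};

/* Fills *out for one side (ITCM or DTCM) according to Table 1-1.
 * Returns false if configured_kb is not one of the listed sizes. */
static bool tcm_layout_for(uint32_t configured_kb, struct tcm_layout *out)
{
    switch (configured_kb) {
    case 0:
        out->count = 0u; out->each_kb = 0u;
        return true;
    case 4:
        out->count = 1u; out->each_kb = 4u;
        return true;
    case 8: case 16: case 32: case 64:
        /* Larger than 4KB: two TCMs, each half the configured size. */
        out->count = 2u; out->each_kb = configured_kb / 2u;
        return true;
    default:
        return false;
    }
}
```

Calling `tcm_layout_for(16, &layout)` yields two TCMs of 8KB each, matching the ITCM example in the text.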
The TCM can be anywhere in the memory map. The INITRAM pin enables booting from the
ITCM.
See Chapter 7 Level One Memory System for more details.
TCM DMA engine
To support use of the TCMs by data-intensive applications, the core provides two DMA
channels to transfer data to or from the Instruction or Data TCM blocks. DMA can proceed in
parallel with CPU accesses to the TCM blocks. Arbitration is on a cycle-by-cycle basis. The
DMA channels connect with the System-on-Chip (SoC) backplane through a dedicated 64-bit
AMBA AXI port.
The DMA controller is programmed using the CP15 system-control coprocessor. DMA transfers
can only take place between a TCM and external memory. There is no coherency support with
the caches.
Note
Only one of the two DMA channels can be active at any time.
DMA features
The DMA controller has the following features:
•
runs in background of CPU operations
•
enables CPU priority access to TCM during DMA
•
programmed with Virtual Addresses
•
controls DMA to either the instruction or data TCM
•
allocated by a privileged process (OS)
•
software can check and monitor DMA progress
•
interrupts on DMA event
•
ability to configure each channel to transfer data between Secure TCM and Secure
external memory.
Memory Management Unit
The Memory Management Unit (MMU) has a unified Translation Lookaside Buffer (TLB) for
both instructions and data. The MMU includes a 4KB page mapping size to enable a smaller
RAM and ROM footprint for embedded systems and operating systems such as Windows CE
that have many small mapped objects. The ARM1176JZ-S processor implements the Fast
Context Switch Extension (FCSE) and high vectors extension that are required to run Microsoft
Windows CE. See Chapter 6 Memory Management Unit for more details.
The MMU is responsible for protection checking, address translation, and memory attributes,
and some of these can be passed to an external level two memory system. The memory
translations are cached in MicroTLBs for each of the instruction and data caches, with a single
Main TLB backing the MicroTLBs.
The MMU has the following features:
•
matches Virtual Address, ASID, and NSTID
•
each TLB entry is marked with the NSTID
•
checks domain access permissions
•
checks memory attributes
•
translates virtual-to-physical address
•
supports four memory page sizes
•
maps accesses to cache, TCM, peripheral port, or external memory
•
hardware handles TLB misses
•
software control of TLB.
Paging
Four page sizes are supported:
•
16MB super sections
•
1MB sections
•
64KB large pages
•
4KB small pages.
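Each page size fixes how many low-order address bits pass through translation unchanged. The sketch below shows that arithmetic only; the enumeration and function names are hypothetical, and the real translation table formats are described in Chapter 6.

```c
#include <stdint.h>

enum page_kind { SMALL_PAGE, LARGE_PAGE, SECTION, SUPERSECTION };

/* Number of address bits that form the offset within the page. */
static uint32_t page_offset_bits(enum page_kind kind)
{
    switch (kind) {
    case SMALL_PAGE:   return 12u;  /* 4KB  */
    case LARGE_PAGE:   return 16u;  /* 64KB */
    case SECTION:      return 20u;  /* 1MB  */
    case SUPERSECTION: return 24u;  /* 16MB */
    }
    return 0u;
}

/* Combines the physical base of the page with the offset bits of the
 * Virtual Address. */
static uint32_t translate(uint32_t va, uint32_t pa_base, enum page_kind kind)
{
    uint32_t mask = (1u << page_offset_bits(kind)) - 1u;
    return (pa_base & ~mask) | (va & mask);
}
```

For a 1MB section, the low 20 bits of the Virtual Address are kept, so 0x12345678 mapped to base 0x80000000 becomes 0x80045678.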
Domains
Sixteen access domains are supported.
TLB
A two-level TLB structure is implemented. Eight entries in the main TLB are lockable.
Hardware TLB loading is supported, and is backwards compatible with previous versions of the
ARM architecture.
ASIDs
TLB entries can be global, or can be associated with particular processes or applications using
Application Space IDentifiers (ASIDs). ASIDs enable TLB entries to remain resident during
context switches to avoid subsequent reload of TLB entries and also enable task-aware
debugging.
NSTID
TrustZone extensions enable the system to mark each entry in the TLB as Secure or Non-secure
with the Non-Secure Table IDentifier (NSTID).
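The way the Virtual Address, ASID, and NSTID combine in a TLB match can be sketched as below. The entry layout and all names are hypothetical and chosen for illustration; only the matching rule itself (address match, global or same ASID, same NSTID) comes from the descriptions above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a TLB entry. */
struct tlb_entry {
    uint32_t va_base;      /* Virtual Address of the start of the page */
    uint32_t offset_bits;  /* 12, 16, 20, or 24 depending on page size */
    uint8_t  asid;         /* Application Space IDentifier */
    bool     global;       /* global entries match any ASID */
    bool     non_secure;   /* NSTID: true for a Non-secure entry */
    bool     valid;
};

static bool tlb_entry_matches(const struct tlb_entry *e, uint32_t va,
                              uint8_t current_asid, bool current_non_secure)
{
    uint32_t page_mask = ~((1u << e->offset_bits) - 1u);

    if (!e->valid) {
        return false;
    }
    if (((va ^ e->va_base) & page_mask) != 0u) {
        return false;  /* different page */
    }
    if (e->non_secure != current_non_secure) {
        return false;  /* NSTID mismatch */
    }
    return e->global || e->asid == current_asid;
}
```

Because non-global entries only match their own ASID, entries for several processes can stay resident across a context switch, which is the benefit the text describes.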
System control coprocessor
Cache, TCM, and DMA operations are controlled through a dedicated coprocessor, CP15,
integrated within the core. This coprocessor provides a standard mechanism for configuring the
level one memory system, and also provides functions such as memory barrier instructions. See
System control on page 1-19 for more details.
1.5.5
AMBA AXI interface
The bus interface provides high bandwidth connections between the processor, second level
caches, on-chip RAM, peripherals, and interfaces to external memory.
There are separate bus interfaces for:
•
instruction fetch, 64-bit data
•
data read/write, 64-bit data
•
peripheral access, 32-bit data
•
DMA, 64-bit data.
All interfaces are AMBA AXI compatible. This enables them to be merged in smaller systems.
Additional signals are provided on each port to support second-level cache.
The ports support the following bus transactions:
Instruction fetch
Servicing instruction cache misses and noncacheable instruction fetches.
Data read/write
Servicing data cache misses, hardware handled TLB misses, cache eviction and
noncacheable data reads and writes.
DMA
Servicing the DMA engine for writing and reading the TCMs. This behaves as a
single bidirectional port.
These ports enable several simultaneous outstanding transactions, providing:
•
high performance from second-level memory systems that support parallelism
•
high use of pipelined and multi-page memories such as SDRAM.
The following sections describe the AMBA AXI interface in more detail:
•
Bus clock speeds
•
Unaligned accesses
•
Mixed-endian support
•
Write buffer
•
Peripheral port.
Bus clock speeds
The bus interface ports operate synchronously to the CPU clock if IEM is not implemented.
Unaligned accesses
The core supports unaligned data access. Words and halfwords can align to any byte boundary.
This enables access to compacted data structures with no software overhead. This is useful for
multi-processor applications and reducing memory space requirements.
The Bus Interface Unit (BIU) automatically generates multiple bus cycles for unaligned
accesses.
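To illustrate why an unaligned access can need more than one bus cycle, the sketch below counts how many aligned 64-bit bus words an access touches. It assumes, purely for illustration, that an access is split wherever it crosses an aligned 64-bit boundary; the function name is hypothetical, and the actual BIU behavior is described in Chapter 8.

```c
#include <stdint.h>

enum { BUS_BYTES = 8 };  /* 64-bit data read/write port */

/* Number of aligned 64-bit bus words covered by an access of size_bytes
 * starting at addr. Returns 0 for a zero-length access. */
static uint32_t bus_transfers_needed(uint32_t addr, uint32_t size_bytes)
{
    uint64_t first;
    uint64_t last;

    if (size_bytes == 0u) {
        return 0u;
    }
    first = (uint64_t)addr / BUS_BYTES;
    last  = ((uint64_t)addr + size_bytes - 1u) / BUS_BYTES;
    return (uint32_t)(last - first + 1u);
}
```

Under this assumption a word at 0x1000 needs one transfer, while a word at 0x1006 straddles two bus words and needs two.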
Mixed-endian support
The core provides the option of switching between little-endian and byte-invariant big-endian
data access modes. This means the core can share data with big-endian systems, and improves
the way the core manages certain types of data.
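In a byte-invariant scheme each byte keeps the same address in both modes, and only the significance of the bytes within a multi-byte value changes. A minimal sketch of the two word views, with hypothetical function names:

```c
#include <stdint.h>

/* Little-endian view: the byte at the lowest address is least significant. */
static uint32_t load32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Byte-invariant big-endian view: the same bytes at the same addresses,
 * but the byte at the lowest address is most significant. */
static uint32_t load32_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

For the byte sequence 0x11, 0x22, 0x33, 0x44 at increasing addresses, the little-endian view is 0x44332211 and the big-endian view is 0x11223344.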
Write buffer
All memory writes take place through the write buffer. The write buffer decouples the CPU
pipeline from the system bus for external memory writes. Memory reads are checked for
dependency against the write buffer contents.
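The dependency check can be pictured as an address-overlap test against every pending write. This is a hedged software model only: the number of slots, the structure layout, and the function names are all assumptions, and the model does not say what the hardware does once a dependency is found.

```c
#include <stdbool.h>
#include <stdint.h>

enum { WB_SLOTS = 4 };  /* hypothetical depth, for illustration only */

struct pending_write {
    uint32_t addr;
    uint32_t size;   /* bytes */
    bool     valid;
};

struct write_buffer {
    struct pending_write slot[WB_SLOTS];
};

/* True if byte ranges [a, a+an) and [b, b+bn) overlap. */
static bool ranges_overlap(uint32_t a, uint32_t an, uint32_t b, uint32_t bn)
{
    return (uint64_t)a < (uint64_t)b + bn && (uint64_t)b < (uint64_t)a + an;
}

/* True if a read of size bytes at addr depends on any buffered write. */
static bool read_depends_on_write(const struct write_buffer *wb,
                                  uint32_t addr, uint32_t size)
{
    for (int i = 0; i < WB_SLOTS; i++) {
        const struct pending_write *w = &wb->slot[i];
        if (w->valid && ranges_overlap(addr, size, w->addr, w->size)) {
            return true;
        }
    }
    return false;
}
```

A halfword read at 0x2002 depends on a buffered word write at 0x2000, whereas a word read at 0x2004 does not.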
Peripheral port
The peripheral port is a 32-bit AMBA AXI interface that provides direct access to local,
Non-Shared devices without using the main bus system. The memory regions that these
Non-Shared devices use are marked as Device and Non-Shared.
Accesses to these memory regions are routed to the peripheral port instead of to the data
read-write ports.
See Chapter 8 Level Two Interface for more details.
1.5.6
Coprocessor interface
The ARM1176JZ-S processor connects to external coprocessors through the coprocessor
interface. This interface supports all ARM coprocessor instructions:
•
LDC
•
LDCL
•
STC
•
STCL
•
MRC
•
MRRC
•
MCR
•
MCRR
•
CDP.
The memory system returns data for all loads to coprocessors in the order of the accesses in the
program. The processor suppresses HUM operation of the cache for coprocessor instructions.
The external coprocessor interface relies on the coprocessor executing all its instructions in
order.
Externally-connected coprocessors follow the early stages of the core pipeline to permit the
exchange of instructions and data between the two pipelines. The coprocessor runs one pipeline
stage behind the core pipeline.
To prevent the coprocessor interface introducing critical paths, wait states can be inserted in
external coprocessor operations. These wait states enable critical signals to be retimed.
Chapter 11 Coprocessor Interface describes the interface for on-chip coprocessors such as
floating-point or other application-specific hardware acceleration units.
1.5.7
Debug
The ARM1176JZ-S core implements the ARMv6.1 Debug architecture that includes extensions
of the ARMv6 Debug architecture to support TrustZone. It introduces three levels of debug:
•
debug everywhere
•
debug in Non-secure privileged and user, and Secure user
•
debug in Non-secure only.
The debug coprocessor, CP14, implements a full range of debug features that Chapter 13 Debug
and Chapter 14 Debug Test Access Port describe.
The core provides extensive support for real-time debug and performance profiling.
The following sections describe debug in more detail:
•
System performance monitoring on page 1-18
•
ETM interface on page 1-18
•
ETM trace buffer on page 1-18
•
Software access to trace buffer on page 1-18
•
Real-time debug facilities on page 1-18
•
Debug and trace Environment on page 1-19.
System performance monitoring
This is a group of counters that you can configure to monitor the operation of the processor and
memory system. See System performance monitor on page 3-10 for more details.
ETM interface
You can connect an external Embedded Trace Macrocell (ETM) unit to the processor for
real-time code tracing of the core in an embedded system.
The ETM interface collects various processor signals and drives these signals from the core. The
interface is unidirectional and runs at the full speed of the core. The ETM interface connects
directly to the external ETM unit without any additional glue logic. You can disable the ETM
interface for power saving.
For more information see:
•
the Embedded Trace Macrocell Architecture Specification
•
Chapter 15 Trace Interface Port
•
Appendix A Signal Descriptions, for details of ETM-related signals.
ETM trace buffer
You can extend the functionality of the ETM by adding an on-chip trace buffer. The trace buffer
is an on-chip memory area. The trace buffer stores trace information during capture that
otherwise passes immediately through the trace port at the operating frequency of the core.
When capture is complete the stored information can be read out at a reduced clock rate from
the trace buffer using the JTAG port of the SoC, instead of through a dedicated trace port.
This is a two-step process that avoids you implementing a wide trace port that has many
high-speed device pins. In effect, a zero-pin trace port is created where the device already has a
JTAG port and associated pins.
Software access to trace buffer
You can access buffered trace information through an APB slave-based memory-mapped
peripheral included as part of the trace buffer. You can perform internal diagnostics on a closed
system where a JTAG port is not normally brought out.
Real-time debug facilities
The ARM1176JZ-S processor contains an EmbeddedICE-RT logic unit that provides the
following real-time debug facilities:
•
up to six breakpoints
•
thread-aware breakpoints
•
up to two watchpoints
•
Debug Communications Channel (DCC).
The EmbeddedICE-RT logic connects directly to the core and monitors the internal address and
data buses. You can access the EmbeddedICE-RT logic in one of two ways:
•
executing CP14 instructions
•
through a JTAG-style interface and associated TAP controller.
The EmbeddedICE-RT logic supports two modes of debug operation:
Halting debug-mode
On a debug event, such as a breakpoint or watchpoint, the debug logic stops the
core and forces the core into Debug state. This enables you to examine the internal
state of the core, and the external state of the system, independently from other
system activity. When the debugging process completes, the core and system state
is restored, and normal program execution resumes.
Monitor debug-mode
On a debug event, the core generates a debug exception instead of entering Debug
state, as in Halting debug-mode. The exception entry activates a debug monitor
program that debugs the processor while critical interrupt service routines
continue to operate. The debug monitor program communicates with the debug host
over the DCC.
Debug and trace Environment
Several external hardware and software tools are available for you to enable:
•
real-time debugging using the EmbeddedICE-RT logic
•
execution trace using the ETM.
1.5.8
Instruction cycle summary and interlocks
Chapter 16 Cycle Timings and Interlock Behavior describes instruction cycles and gives
examples of interlock timing.
1.5.9
System control
The control of the memory system and its associated functionality, and other system-wide
control attributes are managed through a dedicated system control coprocessor, CP15. See
System control and configuration on page 3-5 for more details.
1.5.10
Interrupt handling
Interrupt handling in the ARM1176JZ-S processor is compatible with previous ARM
architectures, but has several additional features to improve interrupt performance for real-time
applications.
The following sections describe interrupt handling in more detail:
•
Vectored Interrupt Controller port
•
Low interrupt latency configuration on page 1-20
•
Configuration on page 1-20
•
Exception processing enhancements on page 1-20.
Note
The nIRQ and nFIQ signals are level-sensitive and must be held LOW until a suitable interrupt
response is received from the processor.
Vectored Interrupt Controller port
The core has a dedicated port that enables an external interrupt controller, such as the ARM
Vectored Interrupt Controller (VIC), to supply a vector address along with an interrupt request
(IRQ) signal. This provides faster interrupt entry but you can disable it for compatibility with
earlier interrupt controllers.
Low interrupt latency configuration
This mode minimizes the worst-case interrupt latency of the processor, with a small reduction
in peak performance, or instructions-per-cycle. You can tune the behavior of the core to suit the
requirements of the application.
The low interrupt latency configuration disables HUM operation of the cache. In low interrupt
latency configuration, on receipt of an interrupt, the ARM1176JZ-S processor:
•
abandons any pending restartable memory operations
•
restarts memory operations on return from the interrupt.
To obtain maximum benefit from the low interrupt latency configuration, software must only use
multi-word load or store instructions that are fully restartable. The software must not use
multi-word load or store instructions on memory locations that produce side-effects for the type
of access concerned. This applies to:
ARM
LDC, all forms of LDM, LDRD, and STC, and all forms of STM and STRD.
Thumb
LDMIA, STMIA, PUSH, and POP.
To achieve optimum interrupt latency, memory locations accessed with these instructions must
not have large numbers of wait-states associated with them. To minimize the interrupt latency,
the following is recommended:
•
multiple accesses to areas of memory marked as Device or Strongly Ordered must not be
performed
•
access to slow areas of memory marked as Device or Strongly Ordered must not be
performed. That is, those that take many cycles in generating a response
•
SWP operations must not be performed to slow areas of memory.
Configuration
You configure the processor for low interrupt latency mode by use of the system control
coprocessor. To ensure that a change between normal and low interrupt latency configurations
is synchronized correctly, you must use software systems that only change the configuration
while interrupts are disabled.
Exception processing enhancements
The ARMv6 architecture contains several enhancements to exception processing, to reduce
interrupt handler entry and exit time:
SRS
Save return state to a specified stack frame.
RFE
Return from exception.
CPS
Directly modify the CPSR.
Note
With TrustZone, in Non-secure state, specifying Secure Monitor mode in the <mode> field of the
SRS instruction causes the processor to take the Undefined exception.
1.6
Power management
The ARM1176JZ-S processor includes several micro-architectural features to reduce energy
consumption:
•
Accurate branch and return prediction, reducing the number of incorrect instruction fetch
and decode operations.
•
Use of physically tagged caches that reduce the number of cache flushes and refills, to
save energy in the system.
•
The use of MicroTLBs reduces the power consumed in translation and protection
look-ups for each memory access.
•
The caches use sequential access information to reduce the number of accesses to the Tag
RAMs and to unmatched data RAMs.
•
Extensive use of gated clocks and gates to disable inputs to unused functional blocks.
Because of this, only the logic actively in use to perform a calculation consumes any
dynamic power.
•
Optionally supports IEM. The ARM1176JZ-S is separated into three different blocks to
support three different power domains:
—
all the RAMs
—
the core logic that is clocked by CLKIN and FREECLKIN
—
four optional IEM Register Slices, one for each port, that provide an asynchronous
interface between the Level 2 ports, powered by VCore and clocked by CLKIN, and
the AXI system, powered by VSoc and clocked by the ACLK clocks.
The ARM1176JZ-S processor supports four levels of power management:
Run mode
This mode is the normal mode of operation when the processor can use all its
functions.
Standby mode
This mode disables most of the processor clocks of the device, while the processor
remains powered up. This reduces the power drawn to the static leakage current,
plus a tiny clock power overhead required to enable the processor to wake up from
the standby state. Any one of the following events causes a transition from
standby mode to run mode:
•
an interrupt, either masked or unmasked
•
a debug request, regardless of whether debug is enabled
•
reset.
Shutdown mode
This mode powers down the entire processor. The processor must save all states,
including cache and TCM state, externally. The processor is returned to the run
state by the assertion of reset. The processor saves the states with interrupts
disabled, and finishes with a Data Synchronization Barrier operation. The
ARM1176JZ-S processor then communicates with the power controller that it is
ready to be powered down.
Dormant mode
This mode powers down the processor and leaves the caches and the TCM
powered up and maintaining their state. The valid bits remain visible to software
to enable you to implement dormant mode. For full implementation of dormant
mode you must:
•
modify the RAM blocks to include an input clamp
•
implement separate power domains.
For full implementation of dormant mode see ARM1176JZ-S and ARM1176JZ-S
Implementation Guide.
For more details of power management features see Chapter 10 Power Control.
1.7
Configurable options
Note
These options are configurable features of your ARM1176JZ-S processor implementation. They
are not programmable options of the implemented device.
Table 1-2 lists the ARM1176JZ-S processor configurable options.
Table 1-2 Configurable options
Feature                 Range of options
IEM support             Yes or No
Cache way size          1KB, 2KB, 4KB, 8KB, or 16KB
Number of cache ways    4, not configurable
TCM block size          4KB, 8KB, 16KB, or 32KB
Number of TCM blocks    0, or auto-configures to 1 or 2 (see note a)
a. Number of TCM blocks depends only on the size of the TCM RAM.
In addition, the form of the BIST solution for the RAM blocks in the ARM1176JZ-S design is
determined when the processor is implemented. For details, see the ARM11 Memory Built-In
Self Test Controller Technical Reference Manual.
Table 1-3 lists the default configuration of the ARM1176JZ-S processor.
Table 1-3 ARM1176JZ-S processor default configurations
Feature                 Default value
IEM support             No
Cache way size          4KB
Number of cache ways    4
TCM block size          8KB
Number of TCM blocks    2
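The default values in Table 1-3 can be checked against the ranges in Table 1-2 with a short sketch. The structure and function names are hypothetical, and the check covers only the ranges listed in Table 1-2, not every constraint an implementation flow might apply.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the configurable options in Table 1-2. */
struct core_config {
    bool     iem_support;
    uint32_t cache_way_kb;
    uint32_t cache_ways;
    uint32_t tcm_block_kb;
    uint32_t tcm_blocks;
};

/* Default values from Table 1-3. */
static const struct core_config default_config = { false, 4u, 4u, 8u, 2u };

static bool is_one_of(uint32_t v, const uint32_t *set, unsigned n)
{
    for (unsigned i = 0; i < n; i++) {
        if (set[i] == v) {
            return true;
        }
    }
    return false;
}

/* Checks a configuration against the ranges listed in Table 1-2. */
static bool config_is_valid(const struct core_config *c)
{
    static const uint32_t way_kb[] = { 1u, 2u, 4u, 8u, 16u };
    static const uint32_t tcm_kb[] = { 4u, 8u, 16u, 32u };

    if (c->cache_ways != 4u || !is_one_of(c->cache_way_kb, way_kb, 5u)) {
        return false;
    }
    if (c->tcm_blocks == 0u) {
        return true;  /* no TCM implemented */
    }
    return c->tcm_blocks <= 2u && is_one_of(c->tcm_block_kb, tcm_kb, 4u);
}

/* Total size of one cache: way size multiplied by the number of ways. */
static uint32_t cache_total_kb(const struct core_config *c)
{
    return c->cache_way_kb * c->cache_ways;
}
```

With the default values, each cache is 4 ways of 4KB, which gives 16KB.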
1.8
Pipeline stages
Figure 1-2 shows:
•
the two Fetch stages
•
a Decode stage
•
an Issue stage
•
the four stages of the ARM1176JZ-S integer execution pipeline.
These eight stages make up the processor pipeline.
Common stages:          Fe1 -> Fe2 -> De -> Iss
Integer execution:      Sh -> ALU -> Sat -> WBex
Multiply-accumulate:    MAC1 -> MAC2 -> MAC3 -> WBex
Load/store:             ADD -> DC1 -> DC2 -> WBls
Figure 1-2 ARM1176JZ-S pipeline stages
From Figure 1-2, the pipeline operations are:
Fe1
First stage of instruction fetch, where the address is issued to memory and data
returns from memory.
Fe2
Second stage of instruction fetch and branch prediction.
De
Instruction decode.
Iss
Register read and instruction issue.
Sh
Shifter stage.
ALU
Main integer operation calculation.
Sat
Pipeline stage to enable saturation of integer results.
WBex
Write back of data from the multiply or main execution pipelines.
MAC1
First stage of the multiply-accumulate pipeline.
MAC2
Second stage of the multiply-accumulate pipeline.
MAC3
Third stage of the multiply-accumulate pipeline.
ADD
Address generation stage.
DC1
First stage of data cache access.
DC2
Second stage of data cache access.
WBls
Write back of data from the Load Store Unit.
By overlapping the various stages of operation, the ARM1176JZ-S processor maximizes the
clock rate achievable to execute each instruction. It delivers a throughput approaching one
instruction for each cycle.
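The throughput claim follows from the usual pipeline arithmetic. The sketch below assumes an ideal pipeline with no stalls, interlocks, or mispredicted branches, which real code does not achieve; the function name is hypothetical.

```c
#include <stdint.h>

enum { PIPELINE_STAGES = 8 };  /* Fe1, Fe2, De, Iss, and four execute stages */

/* Cycles for n instructions to pass through an ideal, stall-free pipeline:
 * the first instruction takes PIPELINE_STAGES cycles, and each later
 * instruction completes one cycle after the one before it. */
static uint64_t ideal_cycles(uint64_t n)
{
    return n ? n + PIPELINE_STAGES - 1u : 0u;
}
```

Under this assumption 1000 instructions take 1007 cycles, so the average approaches one instruction for each cycle as the instruction count grows.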
The Fetch stages can hold up to four instructions, where branch prediction is performed on
instructions ahead of execution of earlier instructions.
The Issue and Decode stages can contain any instruction in parallel with a predicted branch.
The Execute, Memory, and Write stages can contain a predicted branch, an ALU or multiply
instruction, a load/store multiple instruction, and a coprocessor instruction in parallel execution.
1.9
Typical pipeline operations
Figure 1-3 shows all the operations in each of the pipeline stages in the ALU pipeline, the
load/store pipeline, and the HUM buffers.
Common decode pipeline: Fe1 (1st fetch stage), Fe2 (2nd fetch stage), De (instruction decode), Iss (register read and instruction issue).
ALU pipeline: Ex1/Sh (shifter operation), Ex2/ALU (calculate writeback value), Ex3/Sat (saturation), WBex (base register writeback).
Multiply pipeline: MAC1 (1st multiply stage), MAC2 (2nd multiply stage), MAC3 (3rd multiply stage).
Load/store pipeline: ADD (data address calculation), DC1 (first stage of data cache access), DC2 (second stage of data cache access), WBls (writeback from LSU).
Hit under miss: load miss waits.
Figure 1-3 Typical operations in pipeline stages
Figure 1-4 shows a typical ALU data processing instruction. The processor does not use the
load/store pipeline or the HUM buffer.
Common decode pipeline: Fe1, Fe2, De, Iss.
ALU pipeline: Ex1/Sh (shifter operation), Ex2/ALU (calculate writeback value), Ex3/Sat (saturation), WBex (base register writeback).
Multiply pipeline (MAC1, MAC2, MAC3): not used.
Load/store pipeline (ADD, DC1, DC2, WBls): not used.
Hit under miss: not used.
Figure 1-4 Typical ALU operation
Figure 1-5 on page 1-27 shows a typical multiply operation. The MUL instruction can loop in
the MAC1 stage until it has passed through the first part of the multiplier array enough times.
The MUL instruction progresses to MAC2 and MAC3 where it passes through the second half
of the array once to produce the final result.
Common decode pipeline: Fe1, Fe2, De, Iss.
ALU pipeline (Ex1/Sh, Ex2/ALU, Ex3/Sat): not used. WBex: base register writeback.
Multiply pipeline: MAC1 (1st multiply stage), MAC2 (2nd multiply stage), MAC3 (3rd multiply stage).
Load/store pipeline (ADD, DC1, DC2, WBls): not used.
Hit under miss: not used.
Figure 1-5 Typical multiply operation
1.9.1
Instruction progression
Figure 1-6 shows an LDR/STR operation that hits in the data cache.
Common decode pipeline: Fe1, Fe2, De, Iss.
ALU pipeline: Ex1/Sh (shifter operation), Ex2/ALU (calculate writeback value), Ex3/Sat (saturation), WBex (base register writeback).
Multiply pipeline (MAC1, MAC2, MAC3): not used.
Load/store pipeline: ADD (data address calculation), DC1 (first stage of data cache access), DC2 (second stage of data cache access), WBls (writeback from LSU).
Hit under miss: not used.
Figure 1-6 Progression of an LDR/STR operation
Figure 1-7 shows the progression of an LDM/STM operation that completes by use of the
load/store pipeline. Other instructions can use the ALU pipeline at the same time as the
LDM/STM completes in the load/store pipeline.
Common decode pipeline: Fe1, Fe2, De, Iss.
ALU pipeline: Ex1/Sh (shifter operation), Ex2/ALU (calculate writeback value), Ex3/Sat (saturation), WBex (base register writeback).
Multiply pipeline (MAC1, MAC2, MAC3): not used.
Load/store pipeline: ADD (data address calculation), DC1 (first stage of data cache access), DC2 (second stage of data cache access), WBls (writeback from LSU).
Hit under miss: not used unless a miss occurs.
Figure 1-7 Progression of an LDM/STM operation
Figure 1-8 on page 1-29 shows the progression of an LDR that misses. When the LDR is in the
HUM buffers, other instructions, including independent loads that hit in the cache, can run under
it.
Stages in numbered order of progression: Fe1 (1), Fe2 (2), De (3), Iss (4), Ex1/Sh and ADD (5), Ex2/ALU and DC1 (6), Ex3/Sat (7), WBex (8), load in the hit under miss buffers (9, 10), DC2 (11), WBls (12).
Multiply pipeline (MAC1, MAC2, MAC3): not used.
Figure 1-8 Progression of an LDR that misses
See Chapter 16 Cycle Timings and Interlock Behavior for details of instruction cycle timings.
1.10
ARM1176JZ-S instruction set summary
This section provides:
•
an Extended ARM instruction set summary on page 1-31
•
a Thumb instruction set summary on page 1-42.
Table 1-4 lists a key to the ARM and Thumb instruction set tables.
The ARM1176JZ-S processor implements the ARM architecture v6 with ARM Jazelle
technology. For a description of the ARM and Thumb instruction sets, see the ARM Architecture
Reference Manual. Contact ARM Limited for complete descriptions of all instruction sets.
Table 1-4 Key to instruction set tables
ARM DDI 0333H
ID012410
Symbol
Description
{!}
Update base register after operation if ! present.
{^}
For all STMs and LDMs that do not load the PC, stores or restores the User
mode banked registers instead of the current mode registers if ^ present, and
sets the S bit. For LDMs that load the PC, indicates that the CPSR is loaded
from the SPSR.
B
Byte operation.
H
Halfword operation.
T
Forces execution to be handled as having User mode privilege. Cannot be
used with pre-indexed addresses.
x
Selects HIGH or LOW 16 bits of register Rm. T selects the HIGH 16 bits,
T = top, and B selects the LOW 16 bits, B = bottom.
y
Selects HIGH or LOW 16 bits of register Rs. T selects the HIGH 16 bits,
T = top, and B selects the LOW 16 bits, B = bottom.
{cond}
Updates condition flags if cond present. See Table 1-13 on page 1-42.
{field}
See Table 1-12 on page 1-41.
{S}
Sets condition codes, optional.
<a_mode2>
See Table 1-6 on page 1-38.
<a_mode2P>
See Table 1-7 on page 1-39.
<a_mode3>
See Table 1-8 on page 1-40.
<a_mode4>
See Table 1-9 on page 1-40.
<a_mode5>
See Table 1-10 on page 1-41.
<cp_num>
One of the coprocessors p0 to p15.
<effect>
Specifies the effect required on the interrupt disable bits, A, I, and F in the
CPSR:
IE = Interrupt enable
ID = Interrupt disable.
<iflags> specifies the bits affected if <effect> is specified.
<endian_specifier>
BE = Set E bit in instruction, set CPSR E bit.
LE = Reset E bit in instruction, clear CPSR E bit.
<HighReg>
Specifies a register in the range R8 to R15.
<iflags>
A sequence of one or more of the following:
a = Set A bit.
i = Set I bit.
f = Set F bit.
If <effect> is specified, the sequence determines the interrupt flags that are
affected.
<immed_8*4>
A 10-bit constant, formed by left-shifting an 8-bit value by two bits.
<immed_8>
An 8-bit constant.
<immed_8r>
A 32-bit constant, formed by right-rotating an 8-bit value by an even number
of bits.
<label>
The target address to branch to.
<LowReg>
Specifies a register in the range R0 to R7.
<mode>
The new mode number for a mode change. See Mode bits on page 2-28.
<op1>, <op2>
Specify, in a coprocessor-specific manner, the coprocessor operation to
perform.
<operand2>
See Table 1-11 on page 1-41.
<option>
Specifies additional instruction options to the coprocessor. An integer in the
range 0 to 255 surrounded by { and }.
<reglist>
A comma-separated list of registers, enclosed in braces { and }.
<rotation>
One of ROR #8, ROR #16, or ROR #24.
<Rm>
Specifies the register, the value of which is the instruction operand.
<Rn>
Specifies the base register, the value of which is the base address.
<shift>
Specifies the optional shift. If present, it must be one of:
•
LSL #N. N must be in the range 0 to 31.
•
ASR #N. N must be in the range 1 to 32.
1.10.1
Extended ARM instruction set summary
Table 1-5 summarizes the extended ARM instruction set.
Table 1-5 ARM instruction set summary
Operation
Assembler
Arithmetic
Add
ADD{cond}{S} <Rd>, <Rn>, <operand2>
Add with carry
ADC{cond}{S} <Rd>, <Rn>, <operand2>
Subtract
SUB{cond}{S} <Rd>, <Rn>, <operand2>
Subtract with carry
SBC{cond}{S} <Rd>, <Rn>, <operand2>
Reverse subtract
RSB{cond}{S} <Rd>, <Rn>, <operand2>
Reverse subtract with carry
RSC{cond}{S} <Rd>, <Rn>, <operand2>
Multiply
MUL{cond}{S} <Rd>, <Rm>, <Rs>
Multiply-accumulate
MLA{cond}{S} <Rd>, <Rm>, <Rs>, <Rn>
Multiply unsigned long
UMULL{cond}{S} <RdLo>, <RdHi>, <Rm>, <Rs>
Multiply unsigned accumulate long
UMLAL{cond}{S} <RdLo>, <RdHi>, <Rm>, <Rs>
Multiply signed long
SMULL{cond}{S} <RdLo>, <RdHi>, <Rm>, <Rs>
Multiply signed accumulate long
SMLAL{cond}{S} <RdLo>, <RdHi>, <Rm>, <Rs>
Saturating add
QADD{cond} <Rd>, <Rm>, <Rn>
Saturating add with double
QDADD{cond} <Rd>, <Rm>, <Rn>
Saturating subtract
QSUB{cond} <Rd>, <Rm>, <Rn>
Saturating subtract with double
QDSUB{cond} <Rd>, <Rm>, <Rn>
Multiply 16x16
SMULxy{cond} <Rd>, <Rm>, <Rs>
Multiply-accumulate 16x16+32
SMLAxy{cond} <Rd>, <Rm>, <Rs>, <Rn>
Multiply 32x16
SMULWy{cond} <Rd>, <Rm>, <Rs>
Multiply-accumulate 32x16+32
SMLAWy{cond} <Rd>, <Rm>, <Rs>, <Rn>
Multiply signed accumulate long 16x16+64
SMLALxy{cond} <RdLo>, <RdHi>, <Rm>, <Rs>
Count leading zeros
CLZ{cond} <Rd>, <Rm>
Compare
Compare
CMP{cond} <Rn>, <operand2>
Compare negative
CMN{cond} <Rn>, <operand2>
Logical
Move
MOV{cond}{S} <Rd>, <operand2>
Move NOT
MVN{cond}{S} <Rd>, <operand2>
Test
TST{cond} <Rn>, <operand2>
Test equivalence
TEQ{cond} <Rn>, <operand2>
AND
AND{cond}{S} <Rd>, <Rn>, <operand2>
XOR
EOR{cond}{S} <Rd>, <Rn>, <operand2>
OR
ORR{cond}{S} <Rd>, <Rn>, <operand2>
Bit clear
BIC{cond}{S} <Rd>, <Rn>, <operand2>
Copy
CPY{cond} <Rd>, <Rm>
Branch
Branch
B{cond} <label>
Branch with link
BL{cond} <label>
Branch and exchange
BX{cond} <Rm>
Branch, link and exchange
BLX <label>
Branch, link and exchange
BLX{cond} <Rm>
Branch and exchange to Jazelle
state
BXJ{cond} <Rm>
Status register handling
Move SPSR to register
MRS{cond} <Rd>, SPSR
Move CPSR to register
MRS{cond} <Rd>, CPSR
Move register to SPSR
MSR{cond} SPSR_{field}, <Rm>
Move register to CPSR
MSR{cond} CPSR_{field}, <Rm>
Move immediate to SPSR flags
MSR{cond} SPSR_{field}, #<immed_8r>
Move immediate to CPSR flags
MSR{cond} CPSR_{field}, #<immed_8r>
Load
Word
LDR{cond} <Rd>, <a_mode2>
Word with User mode privilege
LDR{cond}T <Rd>, <a_mode2P>
PC as destination, branch and exchange
LDR{cond} R15, <a_mode2P>
Byte
LDR{cond}B <Rd>, <a_mode2>
Byte with User mode privilege
LDR{cond}BT <Rd>, <a_mode2P>
Byte signed
LDR{cond}SB <Rd>, <a_mode3>
Halfword
LDR{cond}H <Rd>, <a_mode3>
Halfword signed
LDR{cond}SH <Rd>, <a_mode3>
Doubleword
LDR{cond}D <Rd>, <a_mode3>
Return from exception
RFE<a_mode4> <Rn>{!}
Load multiple
Stack operations
LDM{cond}<a_mode4L> <Rn>{!}, <reglist>
Increment before
LDM{cond}IB <Rn>{!}, <reglist>{^}
Increment after
LDM{cond}IA <Rn>{!}, <reglist>{^}
Decrement before
LDM{cond}DB <Rn>{!}, <reglist>{^}
Decrement after
LDM{cond}DA <Rn>{!}, <reglist>{^}
Stack operations and restore CPSR
LDM{cond}<a_mode4> <Rn>{!}, <reglist+pc>^
User registers
LDM{cond}<a_mode4> <Rn>{!}, <reglist>^
Soft preload
Memory system hint
In Non-secure this instruction behaves like a NOP
PLD <a_mode2>
Store
Word
STR{cond} <Rd>, <a_mode2>
Word with User mode privilege
STR{cond}T <Rd>, <a_mode2P>
Byte
STR{cond}B <Rd>, <a_mode2>
Byte with User mode privilege
STR{cond}BT <Rd>, <a_mode2P>
Halfword
STR{cond}H <Rd>, <a_mode3>
Doubleword
STR{cond}D <Rd>, <a_mode3>
Store return state
SRS<a_mode4> <mode>{!}
Store multiple
Stack operations
STM{cond}<a_mode4S> <Rn>{!}, <reglist>
User registers
STM{cond}<a_mode4S> <Rn>, <reglist>^
Increment before
STM{cond}IB <Rn>{!}, <reglist>{^}
Increment after
STM{cond}IA <Rn>{!}, <reglist>{^}
Decrement before
STM{cond}DB <Rn>{!}, <reglist>{^}
Decrement after
STM{cond}DA <Rn>{!}, <reglist>{^}
Swap
Word
SWP{cond} <Rd>, <Rm>, [<Rn>]
Byte
SWP{cond}B <Rd>, <Rm>, [<Rn>]
Change state
Change processor state
CPS<effect> <iflags>{, <mode>}
Change processor mode
CPS <mode>
Change endianness
SETEND <endian_specifier>
NOP-compatible hints
No Operation
NOP{<cond>}
YIELD{<cond>}
Byte-reverse
Byte-reverse word
REV{cond} <Rd>, <Rm>
Byte-reverse halfword
REV16{cond} <Rd>, <Rm>
Byte-reverse signed halfword
REVSH{cond} <Rd>, <Rm>
Synchronization primitives
Load exclusive
LDREX{cond} <Rd>, [<Rn>]
Store exclusive
STREX{cond} <Rd>, <Rm>, [<Rn>]
Load Byte Exclusive
LDREXB{cond} <Rd>, [<Rn>]
Load Halfword Exclusive
LDREXH{cond} <Rd>, [<Rn>]
Load Doubleword Exclusive
LDREXD{cond} <Rd>, [<Rn>]
Store Byte Exclusive
STREXB{cond} <Rd>, <Rm>, [<Rn>]
Store Halfword Exclusive
STREXH{cond} <Rd>, <Rm>, [<Rn>]
Store Doubleword Exclusive
STREXD{cond} <Rd>, <Rm>, [<Rn>]
Clear Exclusive
CLREX
Coprocessor
Data operations
CDP{cond} <cp_num>, <op1>, <CRd>, <CRn>, <CRm>{, <op2>}
Move to ARM reg from coproc
MRC{cond} <cp_num>, <op1>, <Rd>, <CRn>, <CRm>{, <op2>}
Move to coproc from ARM reg
MCR{cond} <cp_num>, <op1>, <Rd>, <CRn>, <CRm>{, <op2>}
Move double to ARM reg from coproc
MRRC{cond} <cp_num>, <op1>, <Rd>, <Rn>, <CRm>
Move double to coproc from ARM reg
MCRR{cond} <cp_num>, <op1>, <Rd>, <Rn>, <CRm>
Load
LDC{cond} <cp_num>, <CRd>, <a_mode5>
Store
STC{cond} <cp_num>, <CRd>, <a_mode5>
Alternative coprocessor
Data operations
CDP2 <cp_num>, <op1>, <CRd>, <CRn>, <CRm>{, <op2>}
Move to ARM reg from coproc
MRC2 <cp_num>, <op1>, <Rd>, <CRn>, <CRm>{, <op2>}
Move to coproc from ARM reg
MCR2 <cp_num>, <op1>, <Rd>, <CRn>, <CRm>{, <op2>}
Move double to ARM reg
from coproc
MRRC2 <cp_num>, <op1>, <Rd>, <Rn>, <CRm>
Move double to coproc
from ARM reg
MCRR2 <cp_num>, <op1>, <Rd>, <Rn>, <CRm>
Load
LDC2 <cp_num>, <CRd>, <a_mode5>
Store
STC2 <cp_num>, <CRd>, <a_mode5>
Software interrupt
SVC{cond} <immed_24>
Secure Monitor Call
SMC{cond} <immed_16>
Software breakpoint
BKPT <immed_16>
Parallel add/subtract
Signed add high 16 + 16,
low 16 + 16, set GE flags
SADD16{cond} <Rd>, <Rn>, <Rm>
Saturated add high 16 + 16,
low 16 + 16
QADD16{cond} <Rd>, <Rn>, <Rm>
Signed high 16 + 16, low 16 + 16,
halved
SHADD16{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 + 16, low 16 +
16, set GE flags
UADD16{cond} <Rd>, <Rn>, <Rm>
Saturated unsigned high 16 + 16,
low 16 + 16
UQADD16{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 + 16,
low 16 + 16, halved
UHADD16{cond} <Rd>, <Rn>, <Rm>
Signed high 16 + low 16,
low 16 - high 16, set GE flags
SADDSUBX{cond} <Rd>, <Rn>, <Rm>
Saturated high 16 + low 16,
low 16 - high 16
QADDSUBX{cond} <Rd>, <Rn>, <Rm>
Signed high 16 + low 16,
low 16 - high 16, halved
SHADDSUBX{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 + low 16,
low 16 - high 16, set GE flags
UADDSUBX{cond} <Rd>, <Rn>, <Rm>
Saturated unsigned
high 16 + low 16, low 16 - high 16
UQADDSUBX{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 + low 16,
low 16 - high 16, halved
UHADDSUBX{cond} <Rd>, <Rn>, <Rm>
Signed high 16 - low 16,
low 16 + high 16, set GE flags
SSUBADDX{cond} <Rd>, <Rn>, <Rm>
Saturated high 16 - low 16,
low 16 + high 16
QSUBADDX{cond} <Rd>, <Rn>, <Rm>
Signed high 16 - low 16,
low 16 + high 16, halved
SHSUBADDX{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 - low 16,
low 16 + high 16, set GE flags
USUBADDX{cond} <Rd>, <Rn>, <Rm>
Saturated unsigned
high 16 - low 16, low 16 + high 16
UQSUBADDX{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 - low 16,
low 16 + high 16, halved
UHSUBADDX{cond} <Rd>, <Rn>, <Rm>
Signed high 16-16, low 16-16,
set GE flags
SSUB16{cond} <Rd>, <Rn>, <Rm>
Saturated high 16 - 16, low 16 - 16
QSUB16{cond} <Rd>, <Rn>, <Rm>
Signed high 16 - 16, low 16 - 16,
halved
SHSUB16{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 - 16, low 16 - 16,
set GE flags
USUB16{cond} <Rd>, <Rn>, <Rm>
Saturated unsigned high 16 - 16,
low 16 - 16
UQSUB16{cond} <Rd>, <Rn>, <Rm>
Unsigned high 16 - 16, low 16 - 16,
halved
UHSUB16{cond} <Rd>, <Rn>, <Rm>
Four signed 8 + 8, set GE flags
SADD8{cond} <Rd>, <Rn>, <Rm>
Four saturated 8 + 8
QADD8{cond} <Rd>, <Rn>, <Rm>
Four signed 8 + 8, halved
SHADD8{cond} <Rd>, <Rn>, <Rm>
Four unsigned 8 + 8, set GE flags
UADD8{cond} <Rd>, <Rn>, <Rm>
Four saturated unsigned 8 + 8
UQADD8{cond} <Rd>, <Rn>, <Rm>
Four unsigned 8 + 8, halved
UHADD8{cond} <Rd>, <Rn>, <Rm>
Four signed 8 - 8, set GE flags
SSUB8{cond} <Rd>, <Rn>, <Rm>
Four saturated 8 - 8
QSUB8{cond} <Rd>, <Rn>, <Rm>
Four signed 8 - 8, halved
SHSUB8{cond} <Rd>, <Rn>, <Rm>
Four unsigned 8 - 8, set GE flags
USUB8{cond} <Rd>, <Rn>, <Rm>
Four saturated unsigned 8 - 8
UQSUB8{cond} <Rd>, <Rn>, <Rm>
Four unsigned 8 - 8, halved
UHSUB8{cond} <Rd>, <Rn>, <Rm>
Sum of absolute differences
USAD8{cond} <Rd>, <Rm>, <Rs>
Sum of absolute differences and accumulate
USADA8{cond} <Rd>, <Rm>, <Rs>, <Rn>
Sign/zero extend and add
Two low 8/16, sign extend to 16, + 16
SXTAB16{cond} <Rd>, <Rn>, <Rm>{, <rotation>}
Low 8/32, sign extend to 32, + 32
SXTAB{cond} <Rd>, <Rn>, <Rm>{, <rotation>}
Low 16/32, sign extend to 32, + 32
SXTAH{cond} <Rd>, <Rn>, <Rm>{, <rotation>}
Two low 8/16, zero extend to 16, + 16
UXTAB16{cond} <Rd>, <Rn>, <Rm>{, <rotation>}
Low 8/32, zero extend to 32, + 32
UXTAB{cond} <Rd>, <Rn>, <Rm>{, <rotation>}
Low 16/32, zero extend to 32, + 32
UXTAH{cond} <Rd>, <Rn>, <Rm>{, <rotation>}
Two low 8, sign extend to 16, packed 32
SXTB16{cond} <Rd>, <Rm>{, <rotation>}
Low 8, sign extend to 32
SXTB{cond} <Rd>, <Rm>{, <rotation>}
Low 16, sign extend to 32
SXTH{cond} <Rd>, <Rm>{, <rotation>}
Two low 8, zero extend to 16, packed 32
UXTB16{cond} <Rd>, <Rm>{, <rotation>}
Low 8, zero extend to 32
UXTB{cond} <Rd>, <Rm>{, <rotation>}
Low 16, zero extend to 32
UXTH{cond} <Rd>, <Rm>{, <rotation>}
Signed multiply and multiply, accumulate
Signed
(high 16 x 16) + (low 16 x 16) + 32,
and set Q flag.
SMLAD{cond} <Rd>, <Rm>, <Rs>, <Rn>
As SMLAD, but high x low,
low x high, and set Q flag
SMLADX{cond} <Rd>, <Rm>, <Rs>, <Rn>
Signed
(high 16 x 16) - (low 16 x 16) + 32
SMLSD{cond} <Rd>, <Rm>, <Rs>, <Rn>
As SMLSD, but high x low,
low x high
SMLSDX{cond} <Rd>, <Rm>, <Rs>, <Rn>
Signed
(high 16 x 16) + (low 16 x 16) + 64
SMLALD{cond} <RdLo>, <RdHi>, <Rm>, <Rs>
As SMLALD, but high x low,
low x high
SMLALDX{cond} <RdLo>, <RdHi>, <Rm>, <Rs>
Signed
(high 16 x 16) - (low 16 x 16) + 64
SMLSLD{cond} <RdLo>, <RdHi>, <Rm>, <Rs>
As SMLSLD, but high x low, low x high
SMLSLDX{cond} <RdLo>, <RdHi>, <Rm>, <Rs>
32 + truncated high 32 (32 x 32)
SMMLA{cond} <Rd>, <Rm>, <Rs>, <Rn>
32 + rounded high 32 (32 x 32)
SMMLAR{cond} <Rd>, <Rm>, <Rs>, <Rn>
32 - truncated high 32 (32 x 32)
SMMLS{cond} <Rd>, <Rm>, <Rs>, <Rn>
32 - rounded high 32 (32 x 32)
SMMLSR{cond} <Rd>, <Rm>, <Rs>, <Rn>
Signed (high 16 x 16) + (low 16 x 16), and set Q flag
SMUAD{cond} <Rd>, <Rm>, <Rs>
As SMUAD, but high x low, low x high, and set Q flag
SMUADX{cond} <Rd>, <Rm>, <Rs>
Signed (high 16 x 16) - (low 16 x 16)
SMUSD{cond} <Rd>, <Rm>, <Rs>
As SMUSD, but high x low, low x high
SMUSDX{cond} <Rd>, <Rm>, <Rs>
Truncated high 32 (32 x 32)
SMMUL{cond} <Rd>, <Rm>, <Rs>
Rounded high 32 (32 x 32)
SMMULR{cond} <Rd>, <Rm>, <Rs>
Unsigned 32 x 32, + two 32, to 64
UMAAL{cond} <RdLo>, <RdHi>, <Rm>, <Rs>
Saturate, select, and pack
Signed saturation at
bit position n
SSAT{cond} <Rd>, #<immed_5>, <Rm>{, <shift>}
Unsigned saturation at
bit position n
USAT{cond} <Rd>, #<immed_5>, <Rm>{, <shift>}
Two 16 signed saturation at
bit position n
SSAT16{cond} <Rd>, #<immed_4>, <Rm>
Two 16 unsigned saturation at
bit position n
USAT16{cond} <Rd>, #<immed_4>, <Rm>
Select bytes from Rn/Rm based
on GE flags
SEL{cond} <Rd>, <Rn>, <Rm>
Pack low 16/32, high 16/32
PKHBT{cond} <Rd>, <Rn>, <Rm>{, LSL #<immed_5>}
Pack high 16/32, low 16/32
PKHTB{cond} <Rd>, <Rn>, <Rm>{, ASR #<immed_5>}
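The saturation entries in Table 1-5 can be illustrated with a small behavioral model. The sketch below is not taken from this manual: the function names ssat and usat are hypothetical helpers, the optional <shift> is ignored, and the returned flag stands for the Q flag being set when saturation occurs. It assumes the architectural ranges of 1 to 32 for signed saturation and 0 to 31 for unsigned saturation.

```python
from typing import Tuple


def ssat(value: int, n: int) -> Tuple[int, bool]:
    """Model of signed saturation to an n-bit signed range (assumed 1 <= n <= 32).

    Returns the saturated value and True if saturation occurred.
    """
    if not 1 <= n <= 32:
        raise ValueError("signed saturation position must be 1..32")
    low, high = -(1 << (n - 1)), (1 << (n - 1)) - 1
    clamped = max(low, min(high, value))
    return clamped, clamped != value


def usat(value: int, n: int) -> Tuple[int, bool]:
    """Model of unsigned saturation of a signed value to 0..2^n - 1 (assumed 0 <= n <= 31)."""
    if not 0 <= n <= 31:
        raise ValueError("unsigned saturation position must be 0..31")
    low, high = 0, (1 << n) - 1
    clamped = max(low, min(high, value))
    return clamped, clamped != value
```

For example, under these assumptions ssat(200, 8) clamps to the largest 8-bit signed value and reports saturation, while ssat(5, 8) returns the input unchanged.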
Table 1-6 summarizes addressing mode 2.
Table 1-6 Addressing mode 2
Addressing mode
Assembler
Offset
-
Immediate offset
[<Rn>, #+/-<immed_12>]
Zero offset
[<Rn>]
Register offset
[<Rn>, +/-<Rm>]
Scaled register offset
[<Rn>, +/-<Rm>, LSL #<immed_5>]
[<Rn>, +/-<Rm>, LSR #<immed_5>]
[<Rn>, +/-<Rm>, ASR #<immed_5>]
[<Rn>, +/-<Rm>, ROR #<immed_5>]
[<Rn>, +/-<Rm>, RRX]
Pre-indexed offset
-
Immediate offset
[<Rn>, #+/-<immed_12>]!
Zero offset
[<Rn>]
Register offset
[<Rn>, +/-<Rm>]!
Scaled register offset
[<Rn>, +/-<Rm>, LSL #<immed_5>]!
[<Rn>, +/-<Rm>, LSR #<immed_5>]!
[<Rn>, +/-<Rm>, ASR #<immed_5>]!
[<Rn>, +/-<Rm>, ROR #<immed_5>]!
[<Rn>, +/-<Rm>, RRX]!
Post-indexed offset
-
Immediate
[<Rn>], #+/-<immed_12>
Zero offset
[<Rn>]
Register offset
[<Rn>], +/-<Rm>
Scaled register offset
[<Rn>], +/-<Rm>, LSL #<immed_5>
[<Rn>], +/-<Rm>, LSR #<immed_5>
[<Rn>], +/-<Rm>, ASR #<immed_5>
[<Rn>], +/-<Rm>, ROR #<immed_5>
[<Rn>], +/-<Rm>, RRX
Table 1-7 summarizes addressing mode 2P, post-indexed only.
Table 1-7 Addressing mode 2P, post-indexed only
Addressing mode
Assembler
Post-indexed offset
-
Immediate offset
[<Rn>], #+/-<immed_12>
Zero offset
[<Rn>]
Register offset
[<Rn>], +/-<Rm>
Scaled register offset
[<Rn>], +/-<Rm>, LSL #<immed_5>
[<Rn>], +/-<Rm>, LSR #<immed_5>
[<Rn>], +/-<Rm>, ASR #<immed_5>
[<Rn>], +/-<Rm>, ROR #<immed_5>
[<Rn>], +/-<Rm>, RRX
Table 1-8 summarizes addressing mode 3.
Table 1-8 Addressing mode 3
Addressing mode
Assembler
Immediate offset
[<Rn>, #+/-<immed_8>]
Pre-indexed
[<Rn>, #+/-<immed_8>]!
Post-indexed
[<Rn>], #+/-<immed_8>
Register offset
[<Rn>, +/- <Rm>]
Pre-indexed
[<Rn>, +/- <Rm>]!
Post-indexed
[<Rn>], +/- <Rm>
Table 1-9 summarizes addressing mode 4.
Table 1-9 Addressing mode 4
Addressing mode
Stack type
Block load
Stack pop (LDM, RFE)
IA
Increment after
FD
Full descending
IB
Increment before
ED
Empty descending
DA
Decrement after
FA
Full ascending
DB
Decrement before
EA
Empty ascending
Block store
Stack push (STM, SRS)
IA
Increment after
EA
Empty ascending
IB
Increment before
FA
Full ascending
DA
Decrement after
ED
Empty descending
DB
Decrement before
FD
Full descending
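The correspondence in Table 1-9 between stack types and block-transfer addressing modes can be written as a lookup. This is a minimal sketch: the dictionaries transcribe the table, and the helper name block_mnemonic is hypothetical, introduced only for illustration.

```python
# Stack type -> addressing mode, transcribed from Table 1-9.
LOAD_MODES = {"FD": "IA", "ED": "IB", "FA": "DA", "EA": "DB"}   # stack pop (LDM)
STORE_MODES = {"EA": "IA", "FA": "IB", "ED": "DA", "FD": "DB"}  # stack push (STM)


def block_mnemonic(op: str, stack_type: str) -> str:
    """Return the block-transfer mnemonic equivalent to a stack-type mnemonic.

    op is "LDM" or "STM"; stack_type is one of "FD", "ED", "FA", "EA".
    """
    table = {"LDM": LOAD_MODES, "STM": STORE_MODES}[op]
    return op + table[stack_type]
```

With this mapping, a full descending stack pushes with the decrement-before form and pops with the increment-after form, so block_mnemonic("STM", "FD") and block_mnemonic("LDM", "FD") give STMDB and LDMIA respectively.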
Table 1-10 summarizes addressing mode 5.
Table 1-10 Addressing mode 5
Addressing mode
Assembler
Immediate offset
[<Rn>, #+/-<immed_8*4>]
Immediate pre-indexed
[<Rn>, #+/-<immed_8*4>]!
Immediate post-indexed
[<Rn>], #+/-<immed_8*4>
Unindexed
[<Rn>], <option>
Table 1-11 summarizes Operand2 assembler.
Table 1-11 Operand2
Operation
Assembler
Immediate value
#<immed_8r>
Logical shift left
<Rm> LSL #<immed_5>
Logical shift right
<Rm> LSR #<immed_5>
Arithmetic shift right
<Rm> ASR #<immed_5>
Rotate right
<Rm> ROR #<immed_5>
Register
<Rm>
Logical shift left
<Rm> LSL <Rs>
Logical shift right
<Rm> LSR <Rs>
Arithmetic shift right
<Rm> ASR <Rs>
Rotate right
<Rm> ROR <Rs>
Rotate right extended
<Rm> RRX
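The immediate form of Operand2, #<immed_8r>, accepts only constants that can be produced by right-rotating an 8-bit value by an even number of bits. The following sketch checks whether a constant is expressible in that form. The helper names ror32 and encode_immed_8r are hypothetical, and the function reports the rotation amount itself rather than any particular instruction field encoding.

```python
from typing import Optional, Tuple


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by amount bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def encode_immed_8r(constant: int) -> Optional[Tuple[int, int]]:
    """Find (imm8, rotation) with ror32(imm8, rotation) == constant.

    rotation is an even number in the range 0 to 30. Returns None when the
    constant cannot be formed from an 8-bit value in this way.
    """
    constant &= 0xFFFFFFFF
    for rotation in range(0, 32, 2):
        # Undo a right rotation by rotating left by the same amount.
        imm8 = ror32(constant, (32 - rotation) % 32)
        if imm8 <= 0xFF:
            return imm8, rotation
    return None
```

A constant such as 0xFF000000 is expressible because it is 0xFF rotated right by 8, whereas 0x101 is not, because its set bits span nine bit positions.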
Table 1-12 summarizes the MSR instruction fields.
Table 1-12 Fields
Suffix
Sets this bit in the MSR field_mask
MSR instruction bit number
c
Control field mask bit, bit 0
16
x
Extension field mask bit, bit 1
17
s
Status field mask bit, bit 2
18
f
Flags field mask bit, bit 3
19
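Table 1-12 maps each field suffix to one bit of the MSR instruction. As a hedged illustration, the sketch below combines suffixes into the corresponding instruction bits. The names FIELD_BITS and msr_field_mask are hypothetical and only restate the table.

```python
# Suffix -> MSR instruction bit number, transcribed from Table 1-12.
FIELD_BITS = {"c": 16, "x": 17, "s": 18, "f": 19}


def msr_field_mask(fields: str) -> int:
    """Return the instruction bits [19:16] selected by a suffix string such as "cf"."""
    mask = 0
    for suffix in fields:
        mask |= 1 << FIELD_BITS[suffix]
    return mask
```

For instance, the suffix string "fc" selects the flags and control fields, which sets bits 19 and 16 and leaves bits 18 and 17 clear.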
Table 1-13 summarizes condition codes.
Table 1-13 Condition codes
Suffix
Description
EQ
Equal
NE
Not equal
HS/CS
Unsigned higher or same, carry set
LO/CC
Unsigned lower, carry clear
MI
Negative, minus
PL
Positive or zero, plus
VS
Overflow
VC
No overflow
HI
Unsigned higher
LS
Unsigned lower or same
GE
Signed greater or equal
LT
Signed less than
GT
Signed greater than
LE
Signed less than or equal
AL
Always
1.10.2
Thumb instruction set summary
Table 1-14 summarizes the Thumb instruction set.
Table 1-14 Thumb instruction set summary
Operation
Assembler
Move
Immediate, update flags
MOV <Rd>, #<immed_8>
LowReg to LowReg, update flags
MOV <Rd>, <Rm>
HighReg to LowReg
MOV <Rd>, <Rm>
LowReg to HighReg
MOV <Rd>, <Rm>
HighReg to HighReg
MOV <Rd>, <Rm>
Copy
CPY <Rd>, <Rm>
Arithmetic
Add
ADD <Rd>, <Rn>, #<immed_3>
Add immediate
ADD <Rd>, #<immed_8>
Add LowReg and LowReg, update flags
ADD <Rd>, <Rn>, <Rm>
Add HighReg to LowReg
ADD <Rd>, <Rm>
Add LowReg to HighReg
ADD <Rd>, <Rm>
Add HighReg to HighReg
ADD <Rd>, <Rm>
Add immediate to PC
ADD <Rd>, PC, #<immed_8*4>
Add immediate to SP
ADD <Rd>, SP, #<immed_8*4>
Add immediate to SP
ADD SP, #<immed_7*4>
ADD SP, SP, #<immed_7*4>
Add with carry
ADC <Rd>, <Rs>
Subtract immediate
SUB <Rd>, <Rn>, #<immed_3>
Subtract immediate
SUB <Rd>, #<immed_8>
Subtract
SUB <Rd>, <Rn>, <Rm>
Subtract immediate from SP
SUB SP, #<immed_7*4>
Subtract with carry
SBC <Rd>, <Rm>
Negate
NEG <Rd>, <Rm>
Multiply
MUL <Rd>, <Rm>
Compare
Compare immediate
CMP <Rn>, #<immed_8>
Compare LowReg and LowReg, update flags
CMP <Rn>, <Rm>
Compare LowReg and HighReg, update flags
CMP <Rn>, <Rm>
Compare HighReg and LowReg, update flags
CMP <Rn>, <Rm>
Compare HighReg and HighReg, update flags
CMP <Rn>, <Rm>
Compare negative
CMN <Rn>, <Rm>
Logical
AND
AND <Rd>, <Rm>
XOR
EOR <Rd>, <Rm>
OR
ORR <Rd>, <Rm>
Bit clear
BIC <Rd>, <Rm>
Move NOT
MVN <Rd>, <Rm>
Test bits
TST <Rd>, <Rm>
Shift/Rotate
Logical shift left
LSL <Rd>, <Rm>, #<immed_5>
LSL <Rd>, <Rs>
Logical shift right
LSR <Rd>, <Rm>, #<immed_5>
LSR <Rd>, <Rs>
Arithmetic shift right
ASR <Rd>, <Rm>, #<immed_5>
ASR <Rd>, <Rs>
Rotate right
ROR <Rd>, <Rs>
Branch
Conditional
B{cond} <label>
Unconditional
B <label>
Branch with link
BL <label>
Branch, link and exchange
BLX <label>
Branch, link and exchange
BLX <Rm>
Branch and exchange
BX <Rm>
Load
With immediate offset
-
Word
LDR <Rd>, [<Rn>, #<immed_5*4>]
Halfword
LDRH <Rd>, [<Rn>, #<immed_5*2>]
Byte
LDRB <Rd>, [<Rn>, #<immed_5>]
With register offset
-
Word
LDR <Rd>, [<Rn>, <Rm>]
Halfword
LDRH <Rd>, [<Rn>, <Rm>]
Signed halfword
LDRSH <Rd>, [<Rn>, <Rm>]
Byte
LDRB <Rd>, [<Rn>, <Rm>]
Signed byte
LDRSB <Rd>, [<Rn>, <Rm>]
PC-relative
LDR <Rd>, [PC, #<immed_8*4>]
SP-relative
LDR <Rd>, [SP, #<immed_8*4>]
Multiple
LDMIA <Rn>!, <reglist>
Store
With immediate offset
-
Word
STR <Rd>, [<Rn>, #<immed_5*4>]
Halfword
STRH <Rd>, [<Rn>, #<immed_5*2>]
Byte
STRB <Rd>, [<Rn>, #<immed_5>]
With register offset
-
Word
STR <Rd>, [<Rn>, <Rm>]
Halfword
STRH <Rd>, [<Rn>, <Rm>]
Byte
STRB <Rd>, [<Rn>, <Rm>]
SP-relative
STR <Rd>, [SP, #<immed_8*4>]
Multiple
STMIA <Rn>!, <reglist>
Push/Pop
Push registers onto stack
PUSH <reglist>
Push LR and registers onto stack
PUSH <reglist, LR>
Pop registers from stack
POP <reglist>
Pop registers and PC from stack
POP <reglist, PC>
Change state
Change processor state
CPS<effect> <iflags>
Change endianness
SETEND <endian_specifier>
Byte-reverse
Byte-reverse word
REV <Rd>, <Rm>
Byte-reverse halfword
REV16 <Rd>, <Rm>
Byte-reverse signed halfword
REVSH <Rd>, <Rm>
Supervisor call
SVC <immed_8>
Software breakpoint
BKPT <immed_8>
Sign or zero extend
Sign extend 16 to 32
SXTH <Rd>, <Rm>
Sign extend 8 to 32
SXTB <Rd>, <Rm>
Zero extend 16 to 32
UXTH <Rd>, <Rm>
Zero extend 8 to 32
UXTB <Rd>, <Rm>
1.11
Product revisions
This section describes differences in functionality between product revisions of the
ARM1176JZ-S processor:
r0p0-r0p1
Contains the following differences:
•
The addition of the CPUCLAMP input in r0p1 to better support IEM. See
Intelligent Energy Management on page 10-6.
•
The top level RTL hierarchy has been changed in r0p1 to better support
IEM. See Intelligent Energy Management on page 10-6.
•
The architectural clock gating scheme for the generation of clock dedicated
to the RAMs has been changed. For more information see the description
of the RAM interface implementation in the ARM1176JZF-S™ and
ARM1176JZ-S™ Implementation Guide.
r0p1-r0p2
There are no functional differences between r0p1 and r0p2.
r0p2-r0p4
There are no functional differences between r0p2 and r0p4.
r0p4-r0p6
Between r0p4 and r0p6 there are no differences in the functionality described in
this Technical Reference Manual. However, r0p6 introduces optional top-level
latches, for implementing Dormant mode or IEM with cell libraries that do not
provide retention latches. For more information see the description of Dormant
mode implementation in the ARM1176JZF-S™ and ARM1176JZ-S™
Implementation Guide.
r0p6-r0p7
There are no functional differences between r0p6 and r0p7.
Note
Product revisions r0p3 and r0p5 were not generally available.
Chapter 2
Programmer’s Model
This chapter describes the processor registers and provides information for programming the
microprocessor. It contains the following sections:
•
About the programmer’s model on page 2-2
•
Secure world and Non-secure world operation with TrustZone on page 2-3
•
Processor operating states on page 2-12
•
Instruction length on page 2-13
•
Data types on page 2-14
•
Memory formats on page 2-15
•
Addresses in a processor system on page 2-16
•
Operating modes on page 2-17
•
Registers on page 2-18
•
The program status registers on page 2-24
•
Additional instructions on page 2-30
•
Exceptions on page 2-36
•
Software considerations on page 2-59.
2.1
About the programmer’s model
The processors implement ARM architecture v6 with Java extensions and TrustZone™ security
extensions.
The architecture includes the 32-bit ARM instruction set, 16-bit Thumb instruction set, and the
8-bit Java instruction set. For details of both the ARM and Thumb instruction sets, see the ARM
Architecture Reference Manual. For the Java instruction set see the Jazelle V1 Architecture
Reference Manual.
TrustZone provides Secure and Non-secure worlds for software to operate in. For more details
see Secure world and Non-secure world operation with TrustZone on page 2-3 and the ARM
Architecture Reference Manual.
2.2
Secure world and Non-secure world operation with TrustZone
This section describes:
•
TrustZone model
•
How the Secure model works on page 2-4.
For more details on TrustZone and the ARM architecture, see the ARM Architecture Reference
Manual.
2.2.1
TrustZone model
The basis of the TrustZone model is that the computing environment splits into two isolated
worlds, the Secure world and the Non-secure world, with no leakage of Secure data to the
Non-secure world. Software Secure Monitor code, running in the Secure Monitor Mode, links
the two worlds and acts as a gatekeeper to manage program flow. The system can have both
Secure and Non-secure peripherals that suitable Secure and Non-secure device drivers control.
Figure 2-1 shows the relationship between the Secure and Non-secure worlds. The Operating
System (OS) splits into the Secure OS, that includes the Secure kernel, and the Non-secure OS,
that includes the Non-secure kernel. For details on modes of operation, see Operating modes on
page 2-17.
[Figure labels: User mode and Privileged modes in the Non-secure and Secure
worlds; Non-secure application, Non-secure kernel, Secure tasks, Secure kernel,
Secure device driver, Secure device; Monitor with fixed entry points.]
Figure 2-1 Secure and Non-secure worlds
In normal Non-secure operation the OS runs tasks in the usual way. When a User process
requires Secure execution it makes a request to the Non-secure kernel, that operates in privileged
mode, and this calls the Secure Monitor to transfer execution to the Secure world.
This approach to secure systems means that the platform OS, that works in the Non-secure
world, has only a few fixed entry points into the Secure world through the Secure Monitor. The
trusted code base for the Secure world, that includes the Secure kernel and Secure device
drivers, is small and therefore much easier to maintain and verify.
Note
Software that runs in User mode cannot directly switch the world that it operates in. Changes
from one world to the other can only occur through the Secure Monitor mode.
2.2.2
How the Secure model works
This section describes how the Secure model works from a program perspective and includes:
•
The NS bit and Secure Monitor mode
•
Secure memory management on page 2-5
•
System boot sequence on page 2-8
•
Secure interrupts on page 2-8
•
Secure peripherals on page 2-8
•
Secure debug on page 2-9.
The NS bit and Secure Monitor mode
The Non-secure (NS) bit determines if the program execution is in the Secure or Non-secure
world. The NS bit is in the Secure Configuration Register (SCR) in coprocessor CP15, see c1,
Secure Configuration Register on page 3-52. All the modes of the core, except the Secure
Monitor, can operate in either the Secure or Non-secure worlds, so there are both Secure and
Non-secure User modes and Secure and Non-secure privileged modes, see Operating modes on
page 2-17 and Registers on page 2-18.
Note
An attempt to access the SCR directly in User modes, Secure or Non-secure, or in Non-secure
privileged modes, makes the processor enter the Undefined exception trap. SCR can only be
accessed in Secure privileged modes.
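The access rule in the Note above can be restated as a predicate. This is an illustrative sketch only: the function name scr_access_allowed and the string mode names are assumptions made for the example, and a False result stands for the processor taking the Undefined exception.

```python
# Privileged modes of the core. Secure Monitor mode is written "Monitor" here.
PRIVILEGED_MODES = {
    "FIQ", "IRQ", "Supervisor", "Abort", "Undefined", "System", "Monitor",
}


def scr_access_allowed(mode: str, ns: bool) -> bool:
    """Return True if the SCR can be accessed from the given mode and NS bit state.

    ns is True when the NS bit selects the Non-secure world.
    """
    if mode == "Monitor":
        # Secure Monitor mode is always Secure, regardless of the NS bit.
        return True
    if mode not in PRIVILEGED_MODES:
        # User mode, Secure or Non-secure, cannot access the SCR.
        return False
    # Other privileged modes can access the SCR only in the Secure world.
    return not ns
```

Under this model a Non-secure Supervisor access is refused, while the same access from Secure Supervisor mode or from Secure Monitor mode is permitted.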
Secure Monitor mode is a privileged mode and is always Secure regardless of the state of the
NS bit. The Secure Monitor is code that runs in Secure Monitor mode and processes switches
to and from the Secure world. The overall security of the software relies on the security of this
code along with the Secure boot code.
When the Secure Monitor transfers control from one world to the other it must save the
processor context, that includes register banks, from one world and restore those for the other
world. The processor hardware automatically shadows and changes context information in
CP15 registers appropriately.
If the Secure Monitor determines that a change from one world to the other is valid it writes to
the NS bit to change the world in operation. Although all Secure privileged modes can access
the NS bit, it is strongly recommended that you only use the Secure Monitor to change the NS
bit. See the ARM Architecture Reference Manual for more information.
A Secure Monitor Call (SMC) is used to enter the Secure Monitor mode and perform a Secure
Monitor kernel service call. This instruction can only be executed in privileged modes, so when
a User process wants to request a change from one world to the other it must first execute a SVC
instruction. This changes the processor to a privileged mode where the Supervisor call handler
processes the SVC and executes a SMC, see Exceptions on page 2-36.
Note
An attempt by a User process to execute an SMC makes the processor enter the Undefined
exception trap.
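The request path described above, from a User process through SVC and SMC to the Secure Monitor, can be sketched as a small state model. Everything named here (Core, UndefinedInstruction, request_secure_service, and the choice of Supervisor as the mode entered on return) is a hypothetical simplification for illustration. It does not model register banking, exception vectors, or context save and restore.

```python
class UndefinedInstruction(Exception):
    """Stands for the processor entering the Undefined exception trap."""


class Core:
    def __init__(self) -> None:
        self.mode = "User"
        self.ns = True  # NS bit set: Non-secure world

    @property
    def secure(self) -> bool:
        # Secure Monitor mode is always Secure, regardless of the NS bit.
        return self.mode == "Monitor" or not self.ns

    def svc(self) -> None:
        # SVC moves the core to a privileged mode in the current world.
        self.mode = "Supervisor"

    def smc(self) -> None:
        # SMC can only be executed in privileged modes.
        if self.mode == "User":
            raise UndefinedInstruction("SMC executed in User mode")
        self.mode = "Monitor"

    def monitor_return(self, ns: bool, mode: str = "Supervisor") -> None:
        # In this model only the Secure Monitor changes the NS bit.
        if self.mode != "Monitor":
            raise RuntimeError("world switch attempted outside Secure Monitor mode")
        self.ns = ns
        self.mode = mode


def request_secure_service(core: Core) -> Core:
    """User process -> SVC handler -> SMC -> Secure Monitor -> Secure world."""
    core.svc()
    core.smc()
    core.monitor_return(ns=False)
    return core
```

The model makes the ordering constraint visible: calling smc() directly on a core that is still in User mode raises the exception, so the SVC step cannot be skipped.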
The Secure Monitor mode is responsible for the switch from one world to the other. You must
only modify the SCR in Secure Monitor mode.
The recommended way to return to the Non-secure world is to:
1.
Set the NS bit in the SCR.
2.
Execute a MOVS, SUBS or RFE.
All ARM implementations ensure that the processor cannot execute the prefetched instructions
that follow MOVS, SUBS, or equivalents, with Secure access permissions.
It is strongly recommended that you do not use an MSR instruction to switch from the Secure
to the Non-secure world. There is no guarantee that, after the NS bit is set in Secure Monitor
mode, an MSR instruction avoids execution of prefetched instructions with Secure access
permission. This is because the processor prefetches the instructions that follow the MSR with
Secure privileged permissions and this might form a security hole in the system if the prefetched
instructions then execute in the Non-secure world.
If the prefetched instructions are in Non-secure memory, with the MSR at the boundary between
Secure and Non-secure memory, they might be corrupted to give Secure information to the
Non-secure world.
To avoid this problem with the MSR instruction, you can use an IMB sequence shortly after the
MSR. If you use the IMB sequence you must ensure that the instructions that execute after the
MSR and before the IMB do not leak any information to the Non-secure world and do not rely
on the Secure permission level.
It is strongly recommended that you do not set the NS bit in Privileged modes other than in
Secure Monitor mode. If you do so you face the same problem as a return to the Non-secure
world with the MSR instruction.
Note
To avoid leakage after an MSR instruction use an IMB sequence.
To enter the Secure Monitor the processor executes:
SMC {<cond>} <imm16>
Where:
<cond>     Is the condition under which the processor executes the SMC.
<imm16>    The processor ignores this 16-bit immediate value, but the Secure Monitor can
           use it to determine the service to provide.
To return from the Secure Monitor the processor executes:
MOVS PC, R14_mon
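The entry rules above can be sketched as a small Python model. It is illustrative only: the mode names, the smc function, and the services table are hypothetical and are not part of the processor or of any ARM software.

```python
# Illustrative model of SMC entry; every name here is hypothetical.
PRIVILEGED_MODES = {"fiq", "irq", "svc", "abt", "und", "sys", "mon"}


def smc(current_mode, imm16, services):
    """Return (new_mode, service) for an SMC executed in current_mode.

    The core ignores imm16. Only the Secure Monitor software interprets
    it, which is modeled here as a lookup in the services dictionary.
    """
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    if current_mode == "usr":
        # An SMC attempted in User mode takes the Undefined exception.
        return ("und", None)
    if current_mode not in PRIVILEGED_MODES:
        raise ValueError("unknown mode: " + current_mode)
    # Any privileged mode enters Secure Monitor mode.
    return ("mon", services.get(imm16))
```

A User process therefore reaches the Secure Monitor in two steps: an SVC to reach Supervisor mode, then an SMC issued by the Supervisor call handler.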
Secure memory management
The principle of TrustZone memory management is to partition the physical memory into
Secure and Non-secure regions. The Secure protection is ensured by checking all physical
access to memory or peripherals. There are various means to split the global physical memory
into Secure and Non-secure regions. This can be done at each slave level, in the memory
controller, or in a global module, for example. The partition can be hard-wired or configurable.
All systems can have specific requirements, but the partitioning must be done so that any
Non-secure access to Secure memory or device causes an external abort to the core, a security
violation. An AXI signal AxPROT[1] indicates whether the current access is Secure or not and
is used to check the access.
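As a rough sketch of the check described above, the following Python function models a partition checker at a slave or memory controller. The function name and return values are hypothetical. The AxPROT[1] polarity (1 for Non-secure, 0 for Secure) follows the AMBA AXI protocol. Permitting every combination other than a Non-secure access to a Secure region is an assumption of this sketch, because individual systems can impose further rules.

```python
def check_access(axprot1, region_is_secure):
    """Model a partition check for one bus transfer.

    axprot1: value of AxPROT[1] for the transfer (1 = Non-secure, 0 = Secure).
    region_is_secure: True if the addressed region is configured as Secure.
    Returns "abort" for a security violation, otherwise "ok".
    """
    non_secure_access = (axprot1 == 1)
    if non_secure_access and region_is_secure:
        # Reported to the core as an external abort.
        return "abort"
    return "ok"
```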
The Secure information exists at any stage of the memory management to guarantee the integrity
of data:
• at L2 stage, you can split the memory mapping into Secure and Non-secure regions
• in the MMU, Secure and Non-secure descriptors can coexist and they are differentiated
  by the NSTID.
In the descriptors the NS attribute indicates whether the corresponding physical memory is
Secure or Non-secure.
For Non-secure descriptors, marked with NSTID=Non-secure, the NS attribute is forced to the
Non-secure value. The Non-secure world can only target Non-secure memory.
For Secure descriptors, marked with NSTID=Secure, the NS attribute indicates whether the
physical memory targeted is Secure or Non-secure.
In the caches, instruction and data, each line is tagged as Secure or Non-secure, so that Secure
and Non-secure data can coexist in the cache. Each time a cache line fill is performed, the NS
tag is updated appropriately.
For external accesses, AxPROT[1] indicates whether the access is Secure or Non-secure.
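The descriptor rules above reduce to a small truth table. The sketch below is a hypothetical Python model, not a description of the hardware implementation. It returns whether the resulting access is Non-secure, which corresponds to AxPROT[1] = 1 on the AXI interface.

```python
def access_is_non_secure(world_is_secure, descriptor_ns):
    """Return True if the memory access is made as Non-secure.

    world_is_secure: True when the descriptor is tagged NSTID=Secure.
    descriptor_ns: the NS attribute held in the descriptor (0 or 1).
    """
    if not world_is_secure:
        # NSTID=Non-secure: the NS attribute is forced to Non-secure.
        return True
    # NSTID=Secure: the NS attribute in the descriptor decides.
    return bool(descriptor_ns)
```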
The TrustZone security extensions are completely compatible with existing software. This
means that existing applications and operating systems access memory without change. Where
a system employs Secure functionality the Non-secure world is effectively blind to Secure
memory. This means that Secure and Non-secure memory can co-exist with no effect on
Non-secure code.
Figure 2-2 shows the basic connection of the Secure and Non-secure memory.
[Figure 2-2 is a block diagram and is not reproduced here. Its labels show the core and its
world state, the MMU with an NSTID and NS attribute for each descriptor, the page table walk,
the cache and TCM with a Secure (S) or Non-secure (NS) tag on each line, and the AXI interface
that drives AxPROT[1] to the arbiter, decoder, external memory, Secure slave, Non-secure slave,
and master peripheral. Abort signals return to the core.]
Figure 2-2 Memory in the Secure and Non-secure worlds
The virtual memory address maps for the Secure and Non-secure worlds appear as separate
blocks. Figure 2-3 shows how the Secure and Non-secure virtual address spaces might map onto
the physical address space. In this example:
• Non-secure descriptors are stored in Non-secure memory and can only target Non-secure
  memory
• Secure descriptors are stored in Secure memory and can target both Secure and
  Non-secure memory.
[Figure 2-3 is a diagram and is not reproduced here. Its labels show the Non-secure and Secure
virtual memory maps, each with level 1 descriptors (1MB sections) and level 2 descriptors (4KB
small pages), the Secure and Non-secure translation table base addresses, the NS attribute on
Secure descriptors, and a physical memory that contains 32KB on-chip RAM, Non-secure SDRAM,
Secure peripherals, and Non-secure peripherals, divided into 4KB Secure and Non-secure pages.]
Figure 2-3 Memory partition in the Secure and Non-secure worlds
System boot sequence
Caution
TrustZone security extensions enable a Secure software environment. The technology does not
protect the processor from hardware attacks and the implementor must make sure that the
hardware that contains the boot code is appropriately secure.
The processor always boots in the privileged Supervisor mode in the Secure world, that is the
NS bit is 0. This means that code not written for TrustZone always runs in the Secure world, but
has no way to switch to the Non-secure world. Because the Secure and Non-secure worlds
mirror each other this Secure operation does not affect the functionality of code not written for
TrustZone. The processor is therefore compatible with other ARMv6 architectures. Peripherals
boot in their most Secure state.
The Secure OS code at the reset vector must:
1. Initialize the Secure OS. This includes normal boot actions such as:
   a. Generate page tables and switch on the MMU if the design uses caches or memory
      protection.
   b. Switch on the stack.
   c. Set up the run time environment and program stacks for each processor mode.
2. Initialize the Secure Monitor. This includes such actions as:
   a. Allocate TCM memory for the Secure Monitor code.
   b. Allocate scratch work space.
   c. Set up the Secure Monitor stack pointer and initialize its state block.
3. Program the partition checker to allocate physical memory available to the Non-secure
   OS.
4. Yield control to the Non-secure OS. The Non-secure OS boots after this.
The overall security of the software relies on the security of the boot code along with the code
for the Secure Monitor.
Secure interrupts
There are no new pins to deal with Secure interrupts. However the IRQ and FIQ bits in the SCR
can be set to 1, so that the core branches to Secure Monitor mode, instead of IRQ or FIQ mode,
when an interrupt occurs. For more information see c1, Secure Configuration Register on
page 3-52.
FIQ can be used to enter the Secure world in a deterministic way, if it is configured as NMI when
the core is in the Non-secure world. This configuration is done using the FW and FIQ bits in the
SCR. The nIRQ pin can also be used as a Secure interrupt and can enter Monitor mode directly,
if the IRQ bit in the SCR is set to 1. However, it might be masked in the Non-secure world if the
I bit in the CPSR is set to 1.
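The routing choice made by the IRQ and FIQ bits of the SCR can be sketched as follows. This is a hypothetical Python model. The bit positions used for the IRQ and FIQ bits (bits 1 and 2) are an assumption of this sketch and must be checked against c1, Secure Configuration Register. Interrupt masking by the CPSR I and F bits and the effect of the FW bit are deliberately not modeled.

```python
# Assumed SCR bit positions; check them against the SCR description.
SCR_IRQ = 1 << 1
SCR_FIQ = 1 << 2


def interrupt_entry_mode(kind, scr):
    """Return the mode entered when an unmasked interrupt is taken.

    kind: "irq" or "fiq".
    scr: current value of the Secure Configuration Register.
    """
    if kind == "irq":
        return "mon" if scr & SCR_IRQ else "irq"
    if kind == "fiq":
        return "mon" if scr & SCR_FIQ else "fiq"
    raise ValueError("kind must be 'irq' or 'fiq'")
```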
Secure peripherals
You can protect a Secure peripheral by mapping it to a Secure memory region. In addition, you
can protect Secure peripherals by checking the AxPROT[1] signal and generating an error
response if a Non-secure access attempts to read or write a Secure register.
Secure peripherals require Secure device drivers to supervise them. To minimize the effects of
drivers on system security it is recommended that the Secure device drivers run in the Secure
User mode so that they cannot change the NS bit directly.
Secure debug
For details of software debug in Secure systems, see Chapter 13 Debug. Because the processor
boots in Secure mode you might have to make special arrangements to debug code not written
for TrustZone.
2.2.3
TrustZone write access disable
The processor pin CP15SDISABLE disables write access to certain registers in the system
control coprocessor. Table 2-1 lists the registers affected by this pin.
Attempts to write to the registers in Table 2-1 when CP15SDISABLE is HIGH result in an
Undefined exception. Reads from the registers are still permitted. For more information about
the registers, see Chapter 3 System Control Coprocessor.
A change to the CP15SDISABLE pin takes effect on the instructions decoded by the processor
as quickly as practically possible. Software must perform a Prefetch Flush CP15 operation, after
a change to this pin on the boundary of the macrocell, to ensure that its effect is recognized for
following instructions. It is expected that:
• control of the CP15SDISABLE pin remains within the SoC that embodies the macrocell
• the CP15SDISABLE pin is set to logic 0 by the SoC hardware at reset.
You can use the CP15SDISABLE pin to disable subsequent access to system control processor
registers after the Secure boot code runs and protect the configuration that the Secure boot code
applies.
Note
With the exception of the TCM Region Registers, the registers in Table 2-1 are only accessible
in Secure Privileged modes.
Table 2-1 Write access behavior for system control processor registers
Register                                                    Instruction that is Undefined         Security condition
                                                            when CP15SDISABLE=1
Secure Control Register                                     MCR p15, 0, Rd, c1, c0, 0             Secure Monitor or Privileged when NS=0
Secure Translation Table Base Register 0                    MCR p15, 0, Rd, c2, c0, 0             Secure Monitor or Privileged when NS=0
Secure Translation Table Control Register                   MCR p15, 0, Rd, c2, c0, 2             Secure Monitor or Privileged when NS=0
Secure Domain Access Control Register                       MCR p15, 0, Rd, c3, c0, 0             Secure Monitor or Privileged when NS=0
Data TCM Non-secure Control Access Register                 MCR p15, 0, Rd, c9, c1, 2             Secure Monitor or Privileged when NS=0
Instruction/Unified TCM Non-secure Control Access Register  MCR p15, 0, Rd, c9, c1, 3             Secure Monitor or Privileged when NS=0
Data TCM Region Registers                                   MCR p15, 0, Rd, c9, c1, 0             All TCM Base Registers for which the Data TCM Non-secure Control Access Register = 0
Instruction/Unified TCM Region Registers                    MCR p15, 0, Rd, c9, c1, 1             All TCM Base Registers for which the Instruction/Unified TCM Non-secure Control Access Register = 0
Secure Primary Region Remap Register                        MCR p15, 0, Rd, c10, c2, 0            Secure Monitor or Privileged when NS=0
Secure Normal Memory Remap Register                         MCR p15, 0, Rd, c10, c2, 1            Secure Monitor or Privileged when NS=0
Secure Vector Base Register                                 MCR p15, 0, Rd, c12, c0, 0            Secure Monitor or Privileged when NS=0
Monitor Vector Base Register                                MCR p15, 0, Rd, c12, c0, 1            Secure Monitor or Privileged when NS=0
Secure FCSE Register                                        MCR p15, 0, Rd, c13, c0, 0            Secure Monitor or Privileged when NS=0
Peripheral Port remap Register                              MCR p15, 0, Rd, c15, c2, 4            Secure Monitor or Privileged when NS=0
Instruction Cache master valid register                     MCR p15, 3, Rd, c15, c8, {0-7}        Secure Monitor or Privileged when NS=0
Data Cache master valid register                            MCR p15, 3, Rd, c15, c12, {0-7}       Secure Monitor or Privileged when NS=0
TLB lockdown Index register                                 MCR p15, 5, Rd, c15, c4, 2            Secure Monitor or Privileged when NS=0
TLB lockdown VA register                                    MCR p15, 5, Rd, c15, c5, 2            Secure Monitor or Privileged when NS=0
TLB lockdown PA register                                    MCR p15, 5, Rd, c15, c6, 2            Secure Monitor or Privileged when NS=0
TLB lockdown Attribute register                             MCR p15, 5, Rd, c15, c7, 2            Secure Monitor or Privileged when NS=0
Validation registers                                        MCR p15, 0, Rd, c15, c9, 0            Secure Monitor or Privileged when NS=0
                                                            MCR p15, 0, Rd, c15, c12, {4-7}
                                                            MCR p15, 0, Rd, c15, c14, 0
                                                            MCR p15, {0-7}, Rd, c15, c13, {0-7}
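The effect of CP15SDISABLE on a few rows of Table 2-1 can be sketched as a lookup. This is a hypothetical Python model: the function name, the tuple layout (Opcode_1, CRn, CRm, Opcode_2), and the choice of rows are illustrative assumptions. The model covers only the CP15SDISABLE check and omits the normal Secure privileged permission checks and the TCM Region Register conditions.

```python
# (Opcode_1, CRn, CRm, Opcode_2) for a subset of the rows in Table 2-1.
PROTECTED_WRITES = {
    (0, 1, 0, 0),    # Secure Control Register
    (0, 2, 0, 0),    # Secure Translation Table Base Register 0
    (0, 3, 0, 0),    # Secure Domain Access Control Register
    (0, 12, 0, 0),   # Secure Vector Base Register
    (0, 12, 0, 1),   # Monitor Vector Base Register
}


def cp15_access(is_write, opc1, crn, crm, opc2, cp15sdisable):
    """Return "undefined" if CP15SDISABLE blocks the access, else "ok"."""
    blocked = (opc1, crn, crm, opc2) in PROTECTED_WRITES
    if is_write and cp15sdisable and blocked:
        # MCR to a protected register while the pin is HIGH.
        return "undefined"
    # Reads (MRC) are still permitted.
    return "ok"
```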
2.2.4
Secure Monitor bus
The SECMONBUS exports a set of signals from the core for use in a monitoring block inside
the chip.
Caution
Implementors must ensure that the SECMONBUS signals do not compromise the security of
the processor. The signals provide information for a security monitoring block, that is inside the
SoC, and must not appear outside the chip.
Table 2-2 lists the signals that appear on the Secure Monitor bus SECMONBUS.
Table 2-2 Secure Monitor bus signals
Bits      Description
[24]      ETMIACTL[11] unmodified by Non-invasive security enable masking.
          This signal is disabled when ETMPWRUP = 0 and the Performance Monitoring counters are disabled.
[23]      ETMIACTL[9] unmodified by Non-invasive security enable masking.
          This signal is disabled when ETMPWRUP = 0 and the Performance Monitoring counters are disabled.
[22]      Signal that indicates, for duration of operation, the execution of a DMB or DSB operation.
[21]      Signal that indicates, for 1 cycle, the execution of a Prefetch Flush operation.
[20:19]   Instruction/Unified TCM Region Register bit [0], entries [1:0].
[18:17]   Data TCM Region Register bit [0], entries [1:0].
[16]      Non-Secure Access Control register bit [18].
[15]      Secure Control Register I bit, bit [12].
[14]      Secure Control Register C bit, bit [2].
[13]      Secure Control Register M bit, bit [0].
[12]      Secure Configuration Register NS bit, bit [0].
[11]      CPSR A bit, bit [8], taken from the core pipeline writeback stage.
[10]      CPSR I bit, bit [7], taken from the core pipeline writeback stage.
[9]       CPSR F bit, bit [6], taken from the core pipeline writeback stage.
[8:5]     CPSR mode bits, bits [3:0], taken from the core pipeline writeback stage.
[4:3]     ETMDDCTL[1:0] unmodified by Non-invasive security enable masking.
          This signal is disabled when ETMPWRUP = 0 and the Performance Monitoring counters are disabled.
[2:1]     ETMDACTL[1:0] unmodified by Non-invasive security enable masking.
          This signal is disabled when ETMPWRUP = 0 and the Performance Monitoring counters are disabled.
[0]       ETMIACTL[0] unmodified by Non-invasive security enable masking.
          This signal is disabled when ETMPWRUP = 0 and the Performance Monitoring counters are disabled.
Note
nRESETIN resets all SECMONBUS output pins except bits [24:23] and bits [2:0].
nPORESETIN resets the output pins for bits [24:23] and bits [2:0].
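A monitoring block or a simulation testbench might split a sampled SECMONBUS value into its fields as shown in the sketch below. The bit positions come from Table 2-2. The function and the field names are hypothetical and chosen only for illustration.

```python
def _bits(value, high, low):
    """Extract bits [high:low] of value as an integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_secmonbus(bus):
    """Split a 25-bit SECMONBUS sample into named fields (Table 2-2)."""
    return {
        "etmiactl_11": _bits(bus, 24, 24),
        "etmiactl_9": _bits(bus, 23, 23),
        "dmb_or_dsb": _bits(bus, 22, 22),
        "prefetch_flush": _bits(bus, 21, 21),
        "itcm_region_bit0": _bits(bus, 20, 19),
        "dtcm_region_bit0": _bits(bus, 18, 17),
        "nsacr_bit18": _bits(bus, 16, 16),
        "control_i": _bits(bus, 15, 15),
        "control_c": _bits(bus, 14, 14),
        "control_m": _bits(bus, 13, 13),
        "scr_ns": _bits(bus, 12, 12),
        "cpsr_a": _bits(bus, 11, 11),
        "cpsr_i": _bits(bus, 10, 10),
        "cpsr_f": _bits(bus, 9, 9),
        "cpsr_mode_3_0": _bits(bus, 8, 5),
        "etmddctl": _bits(bus, 4, 3),
        "etmdactl": _bits(bus, 2, 1),
        "etmiactl_0": _bits(bus, 0, 0),
    }
```

Note that only CPSR mode bits [3:0] appear on the bus, so the decoded mode field is four bits wide.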
2.3
Processor operating states
The processor has these operating states:
ARM state       32-bit, word-aligned ARM instructions are executed in this state.
Thumb state     16-bit, halfword-aligned Thumb instructions.
Jazelle state   Variable length, byte-aligned Java instructions.
In Thumb state, the Program Counter (PC) uses bit 1 to select between alternate halfwords. In
Jazelle state, all instruction fetches are in words.
Note
Transition between ARM and Thumb states does not affect the processor mode or the register
contents. For details on entering and exiting Jazelle state see Jazelle V1 Architecture Reference
Manual.
2.3.1
Switching state
You can switch the operating state of the processor between:
• ARM state and Thumb state using the BX and BLX instructions, and loads to the PC. The
  ARM Architecture Reference Manual describes state switching.
• ARM state and Jazelle state using the BXJ instruction.
All exceptions are entered, handled, and exited in ARM state. If an exception occurs in Thumb
state or Jazelle state, the processor reverts to ARM state. Exception return instructions restore
the SPSR to the CPSR, which can also cause a transition back to Thumb state or Jazelle state.
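As a minimal sketch of the ARM and Thumb switch, the following Python function models the state selection performed by BX. It relies on the behavior defined in the ARM Architecture Reference Manual, where bit [0] of the branch target selects Thumb state and is then cleared from the address. The function name is hypothetical, and the model assumes a correctly aligned target.

```python
def bx(target):
    """Model BX Rm: return (next_pc, state) for a 32-bit target value."""
    target &= 0xFFFFFFFF
    if target & 1:
        # Bit [0] set selects Thumb state; the bit is not part of the PC.
        return (target & 0xFFFFFFFE, "thumb")
    # Bit [0] clear selects ARM state; a word-aligned target is assumed.
    return (target, "arm")
```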
2.3.2
Interworking ARM and Thumb state
The processor enables you to mix ARM and Thumb code. For details see the chapter about
interworking ARM and Thumb in the RealView Compilation Tools Developer Guide.
2.4
Instruction length
Instructions are one of:
• 32 bits long, in ARM state
• 16 bits long, in Thumb state
• variable length, multiples of 8 bits, in Jazelle state.
2.5
Data types
The processor supports the following data types:
• word, 32-bit
• halfword, 16-bit
• byte, 8-bit.
Note
• When any of these types are described as unsigned, the N-bit data value represents a
  non-negative integer in the range 0 to +2^N - 1, using normal binary format.
• When any of these types are described as signed, the N-bit data value represents an integer
  in the range -2^(N-1) to +2^(N-1) - 1, using two’s complement format.
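The two ranges in the Note can be computed directly for each data type. The helper names below are hypothetical and serve only as a worked example.

```python
def unsigned_range(n_bits):
    """Range of an unsigned N-bit value: 0 to 2^N - 1."""
    return (0, 2 ** n_bits - 1)


def signed_range(n_bits):
    """Range of a signed two's complement N-bit value."""
    return (-(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1)
```

For example, a byte covers 0 to 255 unsigned and -128 to 127 signed.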
For best performance you must align these as follows:
• word quantities must be aligned to four-byte boundaries
• halfword quantities must be aligned to two-byte boundaries
• byte quantities can be placed on any byte boundary.
The processor provides mixed-endian and unaligned access support. For details see Chapter 4
Unaligned and Mixed-endian Data Access Support.
Note
You cannot use LDRD, LDM, LDC, STRD, STM, or STC instructions to access 32-bit
quantities if they are unaligned.
2.6
Memory formats
The processor views memory as a linear collection of bytes numbered in ascending order from
zero. Bytes 0-3 hold the first stored word, and bytes 4-7 hold the second stored word, for
example.
The processor can treat words in memory as being stored in either:
• Legacy big-endian format
• Little-endian format.
Additionally, the processor supports mixed-endian and unaligned data accesses. For details see
Chapter 4 Unaligned and Mixed-endian Data Access Support.
2.6.1
Legacy big-endian format
In legacy big-endian format, the processor stores the most significant byte of a word at the
lowest-numbered byte, and the least significant byte at the highest-numbered byte. Therefore,
byte 0 of the memory system connects to data lines 31-24. Figure 2-4 shows this.
                 Bit   31-24   23-16   15-8    7-0     Word address
Higher address         8       9       10      11      8
                       4       5       6       7       4
Lower address          0       1       2       3       0
• Most significant byte is at lowest address
• Word is addressed by byte address of most significant byte
Figure 2-4 Big-endian addresses of bytes within words
2.6.2
Little-endian format
In little-endian format, the lowest-numbered byte in a word is the least significant byte of the
word and the highest-numbered byte is the most significant. Therefore, byte 0 of the memory
system connects to data lines 7-0. Figure 2-5 shows this.
                 Bit   31-24   23-16   15-8    7-0     Word address
Higher address         11      10      9       8       8
                       7       6       5       4       4
Lower address          3       2       1       0       0
• Least significant byte is at lowest address
• Word is addressed by byte address of least significant byte
Figure 2-5 Little-endian addresses of bytes within words
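The difference between the two formats can be seen by storing the same word both ways. The sketch below uses the standard Python int.to_bytes method. The function name and the sample value are arbitrary, and the model covers a single 32-bit word only.

```python
def store_word(word, big_endian):
    """Return the four bytes of a 32-bit word in ascending address order."""
    order = "big" if big_endian else "little"
    return list((word & 0xFFFFFFFF).to_bytes(4, order))
```

For the word 0x11223344, the byte at the lowest address is 0x11 in legacy big-endian format and 0x44 in little-endian format.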
2.7
Addresses in a processor system
Three distinct types of address exist in the processor system:
• Virtual Address (VA)
• Modified Virtual Address (MVA)
• Physical Address (PA).
When the core is in the Secure world the VA is Secure, and when the core is in the Non-secure
world the VA is Non-secure. To get the VA to PA translation, the core uses the Secure page tables
while it is in the Secure world. Otherwise it uses the Non-secure page tables.
Table 2-3 lists the address types in the processor system.
Table 2-3 Address types in the processor system
Processor         Caches                        TLBs                                             AXI bus
Virtual Address   Virtual index, physical tag   Translates Virtual Address to Physical Address   Physical Address
This is an example of the address manipulation that occurs when the processor requests an
instruction, see Figure 1-1 on page 1-8:
1. The VA of the instruction is issued by the processor, Secure or Non-secure VA according
   to the world where the core is.
2. The Instruction Cache is indexed by the lower bits of the VA. The VA is translated using
   the ProcID, Secure or Non-secure one, to the MVA, and then to PA in the Translation
   Lookaside Buffer (TLB). The TLB performs the translation in parallel with the Cache
   lookup. The translation uses Secure descriptors if the core is in Secure world. Otherwise
   it uses the Non-secure ones.
3. If the protection check carried out by the TLB on the MVA does not abort and the PA tag
   is in the Instruction Cache, the instruction data is returned to the processor.
4. The PA is passed to the AXI bus interface to perform an external access, in the event of a
   cache miss. The external access is always Non-secure when the core is in Non-secure
   world. In Secure world, the external access is Secure or Non-secure according to the NS
   attribute value in the selected descriptor.
2.8
Operating modes
In all states there are eight modes of operation:
• User mode is the usual ARM program execution state, and is used for executing most
  application programs
• Fast interrupt (FIQ) mode is used for handling fast interrupts
• Interrupt (IRQ) mode is used for general-purpose interrupt handling
• Supervisor mode is a protected mode for the OS
• Abort mode is entered after a data abort or prefetch abort
• System mode is a privileged user mode for the OS
• Undefined mode is entered when an undefined instruction exception occurs
• Secure Monitor mode is a Secure mode for the TrustZone Secure Monitor code.
Note
Secure Monitor mode is not the same as monitor debug mode.
Modes other than User mode are collectively known as privileged modes. Privileged modes are
used to service interrupts or exceptions, or to access protected resources. Table 2-4 lists the
mode structure for the processor.
Table 2-4 Mode structure
Modes            Mode type    State of core
                              NS bit = 1    NS bit = 0
User             User         Non-secure    Secure
FIQ              privileged   Non-secure    Secure
IRQ              privileged   Non-secure    Secure
Supervisor       privileged   Non-secure    Secure
Abort            privileged   Non-secure    Secure
Undefined        privileged   Non-secure    Secure
System           privileged   Non-secure    Secure
Secure Monitor   privileged   Secure        Secure
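Table 2-4 can be summarized in two small functions. This is a hypothetical Python sketch in which the mode names are the register mode identifiers used in this chapter. The only special case is Secure Monitor mode, which is Secure whatever the value of the NS bit.

```python
MODES = ("usr", "fiq", "irq", "svc", "abt", "und", "sys", "mon")


def is_privileged(mode):
    """All modes other than User mode are privileged."""
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)
    return mode != "usr"


def core_state(mode, ns_bit):
    """Return "Secure" or "Non-secure" for a mode and NS bit value."""
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)
    if mode == "mon":
        # Secure Monitor mode is always Secure.
        return "Secure"
    return "Non-secure" if ns_bit else "Secure"
```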
2.9
Registers
The processor has a total of 40 registers:
• 33 general-purpose 32-bit registers
• seven 32-bit status registers.
These registers are not all accessible at the same time. The processor state and operating mode
determine the registers that are available to the programmer.
2.9.1
The ARM state core register set
In ARM state, 16 general registers and one or two status registers are accessible at any time. In
privileged modes, mode-specific banked registers become available. Figure 2-6 on page 2-20
shows the registers that are available in each mode.
The ARM state core register set contains 16 directly-accessible registers, R0-R15. Another
register, the Current Program Status Register (CPSR), contains condition code flags, status bits,
and current mode bits. Registers R0-R12 are general-purpose registers used to hold either data
or address values. Registers R13, R14, R15, and the Saved Program Status Register (SPSR)
have the following special functions:
Stack Pointer
Register R13 is used as the Stack Pointer (SP).
R13 is banked for the exception modes. This means that an exception
handler can use a different stack to the one in use when the exception
occurred.
In many instructions, you can use R13 as a general-purpose register, but
the architecture deprecates this use of R13 in most instructions. For more
information see the ARM Architecture Reference Manual.
Link Register
Register R14 is used as the subroutine Link Register (LR).
Register R14 receives the return address when a Branch with Link (BL or
BLX) instruction is executed.
You can treat R14 as a general-purpose register at all other times. The
corresponding banked registers R14_mon, R14_svc, R14_irq, R14_fiq,
R14_abt, and R14_und are similarly used to hold the return values when
interrupts and exceptions arise, or when BL or BLX instructions are
executed within interrupt or exception routines.
Program Counter
Register R15 holds the PC:
• in ARM state this is word-aligned
• in Thumb state this is halfword-aligned
• in Jazelle state this is byte-aligned.
Saved Program Status Register
In privileged modes, another register, the SPSR, is accessible. This
contains the condition code flags, status bits, and current mode bits saved
as a result of the exception that caused entry to the current mode.
Banked registers have a mode identifier that indicates the mode that they relate to. Table 2-5 lists
these mode identifiers.
Table 2-5 Register mode identifiers
Mode             Mode identifier
User             usr (a)
Fast interrupt   fiq
Interrupt        irq
Supervisor       svc
Abort            abt
System           usr (a)
Undefined        und
Secure Monitor   mon
a. The usr identifier is usually omitted from register names. It is only used in
   descriptions where the User or System mode register is specifically accessed
   from another operating mode.
FIQ mode has seven banked registers mapped to R8–R14 (R8_fiq–R14_fiq). As a result many
FIQ handlers do not have to save any registers.
The Secure Monitor, Supervisor, Abort, IRQ, and Undefined modes each have alternative
mode-specific registers mapped to R13 and R14, permitting a private stack pointer and link
register for each mode.
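The banking rules above can be expressed as a mapping from a mode and a register number to a physical register name. The following Python sketch is illustrative: the function names are hypothetical, and the mode names are the identifiers from Table 2-5. Counting the distinct names reproduces the total of 33 general-purpose registers, with R15 included in that count.

```python
BANKED_R13_R14 = ("fiq", "svc", "abt", "irq", "und", "mon")
ALL_MODES = ("usr", "sys") + BANKED_R13_R14


def physical_register(mode, n):
    """Name the physical register that Rn refers to in the given mode."""
    if mode not in ALL_MODES:
        raise ValueError("unknown mode: " + mode)
    if not 0 <= n <= 15:
        raise ValueError("register number must be 0 to 15")
    if mode == "fiq" and 8 <= n <= 14:
        # FIQ mode banks R8 to R14.
        return "R%d_fiq" % n
    if mode in BANKED_R13_R14 and n in (13, 14):
        # Every exception mode has its own stack pointer and link register.
        return "R%d_%s" % (n, mode)
    # User and System modes share the unbanked registers.
    return "R%d" % n


def count_general_purpose_registers():
    """Count the distinct physical registers reachable as R0 to R15."""
    names = {physical_register(m, n) for m in ALL_MODES for n in range(16)}
    return len(names)
```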
Figure 2-6 on page 2-20 shows the ARM state registers.
ARM state general registers and program counter
System and User   FIQ         Supervisor  Abort       IRQ         Undefined   Secure monitor
R0                R0          R0          R0          R0          R0          R0
R1                R1          R1          R1          R1          R1          R1
R2                R2          R2          R2          R2          R2          R2
R3                R3          R3          R3          R3          R3          R3
R4                R4          R4          R4          R4          R4          R4
R5                R5          R5          R5          R5          R5          R5
R6                R6          R6          R6          R6          R6          R6
R7                R7          R7          R7          R7          R7          R7
R8                R8_fiq      R8          R8          R8          R8          R8
R9                R9_fiq      R9          R9          R9          R9          R9
R10               R10_fiq     R10         R10         R10         R10         R10
R11               R11_fiq     R11         R11         R11         R11         R11
R12               R12_fiq     R12         R12         R12         R12         R12
R13               R13_fiq     R13_svc     R13_abt     R13_irq     R13_und     R13_mon
R14               R14_fiq     R14_svc     R14_abt     R14_irq     R14_und     R14_mon
R15 (PC)          R15 (PC)    R15 (PC)    R15 (PC)    R15 (PC)    R15 (PC)    R15 (PC)
ARM state program status registers
CPSR              CPSR        CPSR        CPSR        CPSR        CPSR        CPSR
                  SPSR_fiq    SPSR_svc    SPSR_abt    SPSR_irq    SPSR_und    SPSR_mon
Registers shown with a mode suffix, such as R8_fiq, are banked registers.
Figure 2-6 Register organization in ARM state
Figure 2-7 on page 2-21 shows an alternative view of the ARM registers.
[Figure 2-7 is a diagram and is not reproduced here. It groups the registers as follows:
• 16 general-purpose registers plus 1 status register: R0-R14, R15 (PC), and the CPSR
• 33 general-purpose registers in total
• 23 mode-specific registers (banked registers): 17 banked general-purpose registers,
  R8_fiq-R14_fiq and the R13 and R14 copies for svc, abt, irq, und, and mon, plus 6 banked
  status registers, SPSR_fiq, SPSR_svc, SPSR_abt, SPSR_irq, SPSR_und, and SPSR_mon
• 7 status registers in total.]
Figure 2-7 Processor core register set showing banked registers
2.9.2
The Thumb state core register set
The Thumb state core register set is a subset of the ARM state set. The programmer has direct
access to:
• eight general registers, R0–R7. For details of high register access in Thumb state see
  Accessing high registers in Thumb state on page 2-22
• the PC
• a stack pointer, SP, ARM R13
• an LR, ARM R14
• the CPSR.
There are banked SPs, LRs, and SPSRs for each privileged mode. Figure 2-8 on page 2-22
shows the Thumb state core register set.
Thumb state general registers and program counter
System and User   FIQ         Supervisor  Abort       IRQ         Undefined   Secure monitor
R0                R0          R0          R0          R0          R0          R0
R1                R1          R1          R1          R1          R1          R1
R2                R2          R2          R2          R2          R2          R2
R3                R3          R3          R3          R3          R3          R3
R4                R4          R4          R4          R4          R4          R4
R5                R5          R5          R5          R5          R5          R5
R6                R6          R6          R6          R6          R6          R6
R7                R7          R7          R7          R7          R7          R7
SP                SP_fiq      SP_svc      SP_abt      SP_irq      SP_und      SP_mon
LR                LR_fiq      LR_svc      LR_abt      LR_irq      LR_und      LR_mon
PC                PC          PC          PC          PC          PC          PC
Thumb state program status registers
CPSR              CPSR        CPSR        CPSR        CPSR        CPSR        CPSR
                  SPSR_fiq    SPSR_svc    SPSR_abt    SPSR_irq    SPSR_und    SPSR_mon
Registers shown with a mode suffix, such as SP_fiq, are banked registers.
Figure 2-8 Register organization in Thumb state
2.9.3
Accessing high registers in Thumb state
In Thumb state, the high registers, R8–R15, are not part of the standard core register set. You
can use special variants of the MOV instruction to transfer a value from a low register, in the
range R0–R7, to a high register, and from a high register to a low register. The CMP instruction
enables you to compare high register values with low register values. The ADD instruction
enables you to add high register values to low register values. For more details, see the ARM
Architecture Reference Manual.
2.9.4
ARM state and Thumb state registers relationship
Figure 2-9 on page 2-23 shows the relationships between the Thumb state and ARM state
registers. See the Jazelle V1 Architecture Reference Manual for details of Jazelle state registers.
Thumb state              ARM state                Group
R0-R7                    R0-R7                    Low registers
-                        R8-R12                   High registers
Stack pointer (SP)       Stack Pointer (R13)      High registers
Link register (LR)       Link Register (R14)      High registers
Program counter (PC)     Program Counter (R15)    High registers
CPSR                     CPSR
SPSR                     SPSR
Figure 2-9 ARM state and Thumb state registers relationship
Note
Registers R0–R7 are known as the low registers. Registers R8–R15 are known as the high
registers.
2.10
The program status registers
The processor contains one CPSR, and six SPSRs for exception handlers to use. The program
status registers:
• hold information about the most recently performed ALU operation
• control the enabling and disabling of interrupts
• set the processor operating mode.
Figure 2-10 shows the arrangement of bits in the status registers, and the sections from The
condition code flags to Reserved bits on page 2-29 inclusive describe it.
Bits      Field       Description
[31]      N           Negative/Less than
[30]      Z           Zero
[29]      C           Carry/Borrow/Extend
[28]      V           Overflow
[27]      Q           Sticky overflow
[26:25]   DNM (RAZ)
[24]      J           Jazelle state bit
[23:20]   DNM (RAZ)
[19:16]   GE[3:0]     Greater than or equal to
[15:10]   DNM (RAZ)
[9]       E           Data endianness bit
[8]       A           Imprecise abort disable bit
[7]       I           IRQ disable
[6]       F           FIQ disable
[5]       T           Thumb state bit
[4:0]     M[4:0]      Mode bits
Figure 2-10 Program status register
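The layout in Figure 2-10 can be turned into a field decoder. The following Python sketch is illustrative, and the function name is hypothetical. It extracts each field from a 32-bit program status register value and ignores the DNM (RAZ) bits.

```python
def decode_psr(psr):
    """Split a 32-bit program status register value into its fields."""
    return {
        "N": (psr >> 31) & 1,
        "Z": (psr >> 30) & 1,
        "C": (psr >> 29) & 1,
        "V": (psr >> 28) & 1,
        "Q": (psr >> 27) & 1,
        "J": (psr >> 24) & 1,
        "GE": (psr >> 16) & 0xF,
        "E": (psr >> 9) & 1,
        "A": (psr >> 8) & 1,
        "I": (psr >> 7) & 1,
        "F": (psr >> 6) & 1,
        "T": (psr >> 5) & 1,
        "M": psr & 0x1F,
    }
```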
Note
The bits that Figure 2-10 identifies as Do Not Modify (DNM), Read As Zero (RAZ), must not be
modified by software. These bits are:
• Readable, to enable the processor state to be preserved, for example, during process
  context switches
• Writable, to enable the processor state to be restored. To maintain compatibility with
  future ARM processors, and as good practice, you are strongly advised to use a
  read-modify-write strategy when changing the CPSR.
2.10.1
The condition code flags
The N, Z, C, and V bits are the condition code flags. You can set them by arithmetic and logical
operations, and also by MSR and LDM instructions. The processor tests these flags to determine
whether to execute an instruction.
In ARM state, most instructions can execute conditionally on the state of the N, Z, C, and V bits.
The exceptions are:
•
BKPT
•
CDP2
•
CPS
•
LDC2
•
MCR2
•
MCRR2
•
MRC2
•
MRRC2
•
PLD
•
SETEND
•
RFE
•
SRS
•
STC2.
In Thumb state, only the Branch instruction can be executed conditionally. For more
information about conditional execution, see the ARM Architecture Reference Manual.
2.10.2
The Q flag
The Sticky Overflow (Q) flag can be set by certain multiply and fractional arithmetic
instructions:
•
QADD
•
QDADD
•
QSUB
•
QDSUB
•
SMLAD
•
SMLAxy
•
SMLAWy
•
SMLSD
•
SMUAD
•
SSAT
•
SSAT16
•
USAT
•
USAT16.
The Q flag is sticky in that, when set by an instruction, it remains set until explicitly cleared by
an MSR instruction writing to the CPSR. Instructions cannot execute conditionally on the status
of the Q flag.
To determine the status of the Q flag you must read the PSR into a register and extract the Q flag
from this. For details of how the Q flag is set and cleared, see individual instruction definitions
in the ARM Architecture Reference Manual.
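The sticky behavior of the Q flag can be sketched in C on a copy of the PSR held in a variable. This is an illustrative model only, not processor code: the names PSR_Q_BIT, psr_q_is_set, psr_clear_q, and model_qadd are hypothetical, and the Q bit position, bit 27, is taken from Figure 2-10.

```c
#include <stdbool.h>
#include <stdint.h>

#define PSR_Q_BIT (1u << 27)   /* Q flag, bit 27 in Figure 2-10 */

/* Extract the Q flag from a PSR value that has been read into a register. */
static inline bool psr_q_is_set(uint32_t psr)
{
    return (psr & PSR_Q_BIT) != 0;
}

/* Value to write back with MSR to clear Q; all other bits are unchanged. */
static inline uint32_t psr_clear_q(uint32_t psr)
{
    return psr & ~PSR_Q_BIT;
}

/* Hypothetical model of a saturating signed add in the style of QADD:
 * the result saturates to the 32-bit signed range and Q is set on
 * saturation. Q is never cleared here, which is what makes it sticky. */
static inline int32_t model_qadd(int32_t a, int32_t b, uint32_t *psr)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) {
        sum = INT32_MAX;
        *psr |= PSR_Q_BIT;
    } else if (sum < INT32_MIN) {
        sum = INT32_MIN;
        *psr |= PSR_Q_BIT;
    }
    return (int32_t)sum;
}
```

In this sketch a later non-saturating model_qadd call leaves Q set, and only psr_clear_q clears it, mirroring the explicit MSR write described above.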
2.10.3
The J bit
The J bit in the CPSR indicates when the processor is in Jazelle state.
When:
J=0
The processor is in ARM or Thumb state, depending on the T bit.
J=1
The processor is in Jazelle state.
Note
•
The combination of J = 1 and T = 1 causes similar effects to setting T=1 on a non
Thumb-aware processor. That is, the next instruction executed causes entry to the
Undefined Instruction exception. Entry to the exception handler causes the processor to
re-enter ARM state, and the handler can detect that this was the cause of the exception
because J and T are both set in SPSR_und.
•
MSR cannot be used to change the J bit in the CPSR.
•
The placement of the J bit avoids the status or extension bytes in code running on
ARMv5TE or earlier processors. This ensures that OS code written using the deprecated
CPSR, SPSR, CPSR_all, or SPSR_all syntax for the destination of an MSR instruction
continues to work.
2.10.4
The GE[3:0] bits
Some of the SIMD instructions set GE[3:0] as greater-than-or-equal bits for individual
halfwords or bytes of the result. Table 2-6 lists these.
Table 2-6 GE[3:0] settings
Instruction  GE[3]                     GE[2]                     GE[1]                     GE[0]
             A op B ≥ C                A op B ≥ C                A op B ≥ C                A op B ≥ C
Signed
SADD16       [31:16] + [31:16] ≥ 0     [31:16] + [31:16] ≥ 0     [15:0] + [15:0] ≥ 0       [15:0] + [15:0] ≥ 0
SSUB16       [31:16] - [31:16] ≥ 0     [31:16] - [31:16] ≥ 0     [15:0] - [15:0] ≥ 0       [15:0] - [15:0] ≥ 0
SADDSUBX     [31:16] + [15:0] ≥ 0      [31:16] + [15:0] ≥ 0      [15:0] - [31:16] ≥ 0      [15:0] - [31:16] ≥ 0
SSUBADDX     [31:16] - [15:0] ≥ 0      [31:16] - [15:0] ≥ 0      [15:0] + [31:16] ≥ 0      [15:0] + [31:16] ≥ 0
SADD8        [31:24] + [31:24] ≥ 0     [23:16] + [23:16] ≥ 0     [15:8] + [15:8] ≥ 0       [7:0] + [7:0] ≥ 0
SSUB8        [31:24] - [31:24] ≥ 0     [23:16] - [23:16] ≥ 0     [15:8] - [15:8] ≥ 0       [7:0] - [7:0] ≥ 0
Unsigned
UADD16       [31:16] + [31:16] ≥ 2^16  [31:16] + [31:16] ≥ 2^16  [15:0] + [15:0] ≥ 2^16    [15:0] + [15:0] ≥ 2^16
USUB16       [31:16] - [31:16] ≥ 0     [31:16] - [31:16] ≥ 0     [15:0] - [15:0] ≥ 0       [15:0] - [15:0] ≥ 0
UADDSUBX     [31:16] + [15:0] ≥ 2^16   [31:16] + [15:0] ≥ 2^16   [15:0] - [31:16] ≥ 0      [15:0] - [31:16] ≥ 0
USUBADDX     [31:16] - [15:0] ≥ 0      [31:16] - [15:0] ≥ 0      [15:0] + [31:16] ≥ 2^16   [15:0] + [31:16] ≥ 2^16
UADD8        [31:24] + [31:24] ≥ 2^8   [23:16] + [23:16] ≥ 2^8   [15:8] + [15:8] ≥ 2^8     [7:0] + [7:0] ≥ 2^8
USUB8        [31:24] - [31:24] ≥ 0     [23:16] - [23:16] ≥ 0     [15:8] - [15:8] ≥ 0       [7:0] - [7:0] ≥ 0
Note
GE bit is 1 if A op B ≥ C, otherwise 0.
The SEL instruction uses GE[3:0] to select the source register that supplies each byte of its
result.
Note
•
For unsigned operations, the GE bits are determined by the usual ARM rules for carries
out of unsigned additions and subtractions, and so are carry-out bits.
•
For signed operations, the rules for setting the GE bits are chosen so that they have the
same sort of greater than or equal functionality as for unsigned operations.
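As an illustration of the UADD8 row of Table 2-6 and of byte selection by SEL, the following C sketch models both operations on ordinary integers. The function names model_uadd8 and model_sel are hypothetical, and the operand order assumed for SEL, first operand chosen when the GE bit is 1, follows the usual architectural definition rather than anything stated in this section.

```c
#include <stdint.h>

/* Model of UADD8: four independent unsigned byte additions.
 * GE[lane] is the carry out of each lane, that is, sum >= 2^8. */
static inline uint32_t model_uadd8(uint32_t a, uint32_t b, uint32_t *ge)
{
    uint32_t result = 0;
    *ge = 0;
    for (unsigned lane = 0; lane < 4; lane++) {
        uint32_t sum = ((a >> (8 * lane)) & 0xFFu) + ((b >> (8 * lane)) & 0xFFu);
        if (sum >= 0x100u) {
            *ge |= 1u << lane;
        }
        result |= (sum & 0xFFu) << (8 * lane);
    }
    return result;
}

/* Model of SEL: each result byte comes from rn when the matching GE bit
 * is 1, otherwise from rm (assumed operand order). */
static inline uint32_t model_sel(uint32_t rn, uint32_t rm, uint32_t ge)
{
    uint32_t result = 0;
    for (unsigned lane = 0; lane < 4; lane++) {
        uint32_t mask = 0xFFu << (8 * lane);
        result |= (((ge >> lane) & 1u) ? rn : rm) & mask;
    }
    return result;
}
```

For example, adding 0xFF01FF01 and 0x01010101 overflows byte lanes 1 and 3, so the model sets GE[3:0] to b1010, and a following model_sel takes bytes 3 and 1 from its first operand.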
2.10.5
The E bit
ARM and Thumb instructions are provided to set and clear the E bit. The E bit controls
load/store endianness. For details of where the E bit is used, see Chapter 4 Unaligned and
Mixed-endian Data Access Support.
Architecture versions prior to ARMv6 specify this bit as SBZ. This ensures no endianness
reversal on loads or stores.
2.10.6
The A bit
The A bit is set automatically. It is used to disable imprecise Data Aborts. It might not be
writable in the Non-secure world if the AW bit in the SCR register is reset. For details of how
to use the A bit, see Imprecise Data Abort mask in the CPSR/SPSR on page 2-47.
2.10.7
The control bits
The bottom eight bits of a PSR are known collectively as the control bits. They are the:
•
Interrupt disable bits
•
T bit
•
Mode bits on page 2-28.
The control bits change when an exception occurs. When the processor is operating in a
privileged mode, software can manipulate these bits.
Interrupt disable bits
The I and F bits are the interrupt disable bits:
•
When the I bit is set, IRQ interrupts are disabled.
•
When the F bit is set, FIQ interrupts are disabled. FIQ can be non-maskable in the
Non-secure world if the FW bit in the SCR register is reset.
Note
You can change the SPSR F bit in the Non-secure world but this does not update the CPSR if
the SCR bit 4 (FW) does not permit it.
T bit
The T bit reflects the operating state:
•
when the T bit is set, the processor is executing in Thumb state
•
when the T bit is clear, the processor is executing in ARM state, or Jazelle state depending
on the J bit.
Note
Never use an MSR instruction to force a change to the state of the T bit in the CPSR. If an MSR
instruction does try to modify this bit the result is architecturally Unpredictable. In the
ARM1176JZ-S processor this bit is not affected.
Mode bits
M[4:0] are the mode bits. Table 2-7 lists how these bits determine the processor operating mode.
Table 2-7 PSR mode bit values
M[4:0]  Mode            Visible state registers, Thumb                                 Visible state registers, ARM
b10000  User            R0–R7, R8–R12 (a), SP, LR, PC, CPSR                            R0–R14, PC, CPSR
b10001  FIQ             R0–R7, R8_fiq–R12_fiq (a), SP_fiq, LR_fiq, PC, CPSR, SPSR_fiq  R0–R7, R8_fiq–R14_fiq, PC, CPSR, SPSR_fiq
b10010  IRQ             R0–R7, R8–R12 (a), SP_irq, LR_irq, PC, CPSR, SPSR_irq          R0–R12, R13_irq, R14_irq, PC, CPSR, SPSR_irq
b10011  Supervisor      R0–R7, R8–R12 (a), SP_svc, LR_svc, PC, CPSR, SPSR_svc          R0–R12, R13_svc, R14_svc, PC, CPSR, SPSR_svc
b10111  Abort           R0–R7, R8–R12 (a), SP_abt, LR_abt, PC, CPSR, SPSR_abt          R0–R12, R13_abt, R14_abt, PC, CPSR, SPSR_abt
b11011  Undefined       R0–R7, R8–R12 (a), SP_und, LR_und, PC, CPSR, SPSR_und          R0–R12, R13_und, R14_und, PC, CPSR, SPSR_und
b11111  System          R0–R7, R8–R12 (a), SP, LR, PC, CPSR                            R0–R14, PC, CPSR
b10110  Secure Monitor  R0–R7, R8–R12 (a), SP_mon, LR_mon, PC, CPSR, SPSR_mon          R0–R12, PC, CPSR, SPSR_mon, R13_mon, R14_mon
a. Access to these registers is limited in Thumb state.
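The mode encodings in Table 2-7 can be decoded from a PSR value with a few lines of C. This is a sketch for illustration: the function names are hypothetical, and the rule that every mode except User and System has an SPSR is read from the register lists in the table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PSR_MODE_MASK 0x1Fu   /* M[4:0] */

/* Return the mode name for the M[4:0] field of a PSR value, or NULL when
 * the encoding is not one of the values listed in Table 2-7. */
static inline const char *psr_mode_name(uint32_t psr)
{
    switch (psr & PSR_MODE_MASK) {
    case 0x10: return "User";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "Supervisor";
    case 0x16: return "Secure Monitor";
    case 0x17: return "Abort";
    case 0x1B: return "Undefined";
    case 0x1F: return "System";
    default:   return NULL;
    }
}

/* User is the only unprivileged mode in the table. */
static inline bool psr_mode_is_privileged(uint32_t psr)
{
    return psr_mode_name(psr) != NULL && (psr & PSR_MODE_MASK) != 0x10;
}

/* User and System list no SPSR among their visible registers. */
static inline bool psr_mode_has_spsr(uint32_t psr)
{
    uint32_t mode = psr & PSR_MODE_MASK;
    return psr_mode_name(psr) != NULL && mode != 0x10 && mode != 0x1F;
}
```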
2.10.8
Modification of PSR bits by MSR instructions
In previous architecture versions, MSR instructions can modify the flags byte, bits [31:24], of
the CPSR in any mode, but the other three bytes are only modifiable in privileged modes.
After the introduction of ARM architecture v6, however, each CPSR bit falls into one of the
following categories:
•
Bits that are freely modifiable from any mode, either directly by MSR instructions or by
other instructions whose side-effects include writing the specific bit or writing the entire
CPSR.
Bits in Figure 2-10 on page 2-24 that are in this category are N, Z, C, V, Q, GE[3:0], and E.
•
Bits that must never be modified by an MSR instruction, and so must only be written as a
side-effect of another instruction. If an MSR instruction does try to modify these bits the
results are architecturally Unpredictable. In the processor these bits are not affected.
Bits in Figure 2-10 on page 2-24 that are in this category are J and T.
•
Bits that can only be modified from privileged modes, and that are completely protected
from modification by instructions while the processor is in User mode. The only way that
these bits can be modified while the processor is in User mode is by entering a processor
exception, as Exceptions on page 2-36 describes.
Bits in Figure 2-10 on page 2-24 that are in this category are A, I, F, and M[4:0].
Only Secure privileged modes can write directly to the CPSR mode bits to enter Secure
Monitor mode. If the core is in Secure User mode, Non-secure User mode, or Non-secure
privileged modes it ignores changes to the CPSR to enter the Secure Monitor. The core
does not copy mode bits in the SPSR, changed in the Non-secure world, across to the
CPSR.
2.10.9
Reserved bits
The remaining bits in the PSRs are unused, but are reserved. When changing a PSR flag or
control bits, make sure that these reserved bits are not altered. You must ensure that your
program does not rely on reserved bits containing specific values because future processors
might use some or all of the reserved bits.
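The three categories of CPSR bit, together with the rule that reserved bits are left unaltered, can be summarized as a set of masks. The C sketch below is a simplified model and not a description of the hardware: the mask and function names are hypothetical, the bit positions are taken from Figure 2-10, and the TrustZone restrictions on the A and F bits and on entry to Secure Monitor mode are deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* N, Z, C, V, Q, GE[3:0], and E: modifiable from any mode. */
#define PSR_ANY_MODE_MASK   0xF80F0200u
/* A, I, F, and M[4:0]: modifiable only from privileged modes. */
#define PSR_PRIVILEGED_MASK 0x000001DFu
/* J and T: never modified by MSR. */
#define PSR_EXEC_STATE_MASK 0x01000020u
/* Bits [26:25], [23:20], and [15:10]: reserved, must not be altered. */
#define PSR_RESERVED_MASK   0x06F0FC00u

/* Model the CPSR value that results when MSR requests 'requested' with all
 * bytes selected. Bits outside the writable set keep their old value. */
static inline uint32_t model_msr_cpsr(uint32_t cpsr, uint32_t requested,
                                      bool privileged)
{
    uint32_t writable = PSR_ANY_MODE_MASK;
    if (privileged) {
        writable |= PSR_PRIVILEGED_MASK;
    }
    return (cpsr & ~writable) | (requested & writable);
}
```

The four masks are disjoint and together cover all 32 bits, so a request with every bit set from User mode changes only the flags, GE[3:0], and E.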
2.11
Additional instructions
To support extensions to ARMv6, the ARM1176JZ-S processor includes these instructions in
addition to those in the ARMv6 and TrustZone architectures:
•
Load Register Exclusive instructions, see LDREXB, LDREXH on page 2-31, and
LDREXD on page 2-33
•
Store Register Exclusive instructions, see STREXB, STREXH on page 2-32, and STREXD
on page 2-33
•
Clear Register Exclusive instruction, see CLREX on page 2-34
•
Yield instruction, see NOP-compatible hints on page 2-34.
2.11.1
Load or Store Byte Exclusive
These instructions operate on unsigned data of size byte.
No alignment restrictions apply to the addresses of these instructions.
The LDREXB and STREXB instructions share the same data monitors as the LDREX and
STREX instructions, a local and a global monitor for each processor, for shared memory
support.
LDREXB
Figure 2-11 shows the format of the Load Register Byte Exclusive, LDREXB, instruction.
Bits      [31:28]  [27:20]          [19:16]  [15:12]  [11:8]  [7:4]    [3:0]
Field     Cond     0 0 0 1 1 1 0 1  Rn       Rd       SBO     1 0 0 1  SBO
Figure 2-11 LDREXB instruction
Syntax
LDREXB{<cond>} <Rd>, [<Rn>]
Operation
if ConditionPassed(cond) then
    processor_id = ExecutingProcessor()
    Rd = Memory[Rn,1]
    if Shared(Rn) == 1 then
        physical_address = TLB(Rn)
        MarkExclusiveGlobal(physical_address,processor_id,1)
    MarkExclusiveLocal(processor_id)
STREXB
Figure 2-12 shows the format of the Store Register Byte Exclusive, STREXB, instruction.
Bits      [31:28]  [27:20]          [19:16]  [15:12]  [11:8]  [7:4]    [3:0]
Field     Cond     0 0 0 1 1 1 0 0  Rn       Rd       SBO     1 0 0 1  Rm
Figure 2-12 STREXB instruction
Syntax
STREXB{<cond>} <Rd>, <Rm>, [<Rn>]
Operation
if ConditionPassed(cond) then
    processor_id = ExecutingProcessor()
    if IsExclusiveLocal(processor_id) then
        if Shared(Rn) == 1 then
            physical_address = TLB(Rn)
            if IsExclusiveGlobal(physical_address,processor_id,1) then
                Memory[Rn,1] = Rm
                Rd = 0
                ClearByAddress(physical_address,1)
            else
                Rd = 1
        else
            Memory[Rn,1] = Rm
            Rd = 0
    else
        Rd = 1
    ClearExclusiveLocal(processor_id)
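The way software typically uses the LDREXB and STREXB pair is a retry loop. The C sketch below models only the simplest case of the Operation pseudocode, a single processor accessing non-shared memory, so the global monitor does not appear. The type monitor_t and the model_ function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the local monitor of one processor. */
typedef struct {
    bool local_exclusive;
} monitor_t;

/* LDREXB: load the byte, then MarkExclusiveLocal. */
static inline uint8_t model_ldrexb(monitor_t *mon, const uint8_t *address)
{
    uint8_t value = *address;
    mon->local_exclusive = true;
    return value;
}

/* STREXB for non-shared memory: the store happens only if the local
 * monitor is set. Returns the Rd status, 0 on success and 1 on failure.
 * The local monitor is always cleared afterwards. */
static inline uint32_t model_strexb(monitor_t *mon, uint8_t *address,
                                    uint8_t value)
{
    uint32_t status = 1;
    if (mon->local_exclusive) {
        *address = value;
        status = 0;
    }
    mon->local_exclusive = false;
    return status;
}

/* Typical usage pattern: repeat the load/modify/store until the
 * exclusive store reports success. */
static inline uint8_t model_increment_byte(monitor_t *mon, uint8_t *address)
{
    uint8_t next;
    do {
        next = (uint8_t)(model_ldrexb(mon, address) + 1u);
    } while (model_strexb(mon, address, next) != 0);
    return next;
}
```

A store exclusive that is not preceded by a load exclusive finds the local monitor clear, returns 1, and leaves memory unchanged.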
2.11.2
Load or Store Halfword Exclusive
These instructions operate on naturally aligned, unsigned data of size halfword:
•
The address in memory must be 16-bit aligned, address[0] == b0
When (A,U) == (0,1), (1,0) or (1,1) in CP15 register 1, the instruction generates alignment
faults if this condition is not met.
For more information, see Operation of unaligned accesses on page 4-13.
•
The transaction must be a single access or indivisible burst on bus widths < 16 bits
For AXI based systems, the exclusive access signal, AxPROT[4], must remain asserted
throughout the burst where AxSIZE < 0x1.
The LDREXH and STREXH instructions share the same data monitors as the LDREX and
STREX instructions, a local and a global monitor for each processor, for shared memory
support.
LDREXH
Figure 2-13 shows the format of the Load Register Halfword Exclusive, LDREXH, instruction.
Bits      [31:28]  [27:20]          [19:16]  [15:12]  [11:8]  [7:4]    [3:0]
Field     Cond     0 0 0 1 1 1 1 1  Rn       Rd       SBO     1 0 0 1  SBO
Figure 2-13 LDREXH instruction
Syntax
LDREXH{<cond>} <Rd>, [<Rn>]
Operation
if ConditionPassed(cond) then
    processor_id = ExecutingProcessor()
    Rd = Memory[Rn,2]
    if Shared(Rn) == 1 then
        physical_address = TLB(Rn)
        MarkExclusiveGlobal(physical_address,processor_id,2)
    MarkExclusiveLocal(processor_id)
STREXH
Figure 2-14 shows the format of the Store Register Halfword Exclusive, STREXH, instruction.
Bits      [31:28]  [27:20]          [19:16]  [15:12]  [11:8]  [7:4]    [3:0]
Field     Cond     0 0 0 1 1 1 1 0  Rn       Rd       SBO     1 0 0 1  Rm
Figure 2-14 STREXH instruction
Syntax
STREXH{<cond>} <Rd>, <Rm>, [<Rn>]
Operation
if ConditionPassed(cond) then
    processor_id = ExecutingProcessor()
    if IsExclusiveLocal(processor_id) then
        if Shared(Rn) == 1 then
            physical_address = TLB(Rn)
            if IsExclusiveGlobal(physical_address,processor_id,2) then
                Memory[Rn,2] = Rm
                Rd = 0
                ClearByAddress(physical_address,2)
            else
                Rd = 1
        else
            Memory[Rn,2] = Rm
            Rd = 0
    else
        Rd = 1
    ClearExclusiveLocal(processor_id)
2.11.3
Load or Store Doubleword
The LDREXD and STREXD instructions behave as follows:
•
The operands are considered as two words, that load or store to consecutive
word-addressed locations in memory.
•
Register restrictions are the same as LDRD and STRD. For STRD in ARM state, the
registers Rm and R(m+1) provide the value that is stored, where m is an even number.
•
The address in memory must be 64-bit aligned, address[2:0] == b000
When (A,U) == (0,1), (1,0) or (1,1) in CP15 register 1, the instruction generates alignment
faults if this condition is not met.
For more information, see Operation of unaligned accesses on page 4-13.
•
The transaction must be a single access or indivisible burst on bus widths < 64 bits
For AXI based systems, the exclusive access signal, AxPROT[4], must remain asserted
throughout the burst where AxSIZE < 0x3.
The LDREXD and STREXD instructions share the same data monitors as the LDREX and
STREX instructions, a local and a global monitor for each processor, for shared memory
support.
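The alignment conditions given for the byte, halfword, and doubleword exclusive instructions reduce to a test of the low address bits. The following C sketch shows that test. The function name is hypothetical, and only the three sizes described in this section are accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Check the natural alignment required by an exclusive access of
 * size_bytes: 1 (no restriction), 2 (address[0] == b0), or
 * 8 (address[2:0] == b000). */
static inline bool exclusive_address_is_aligned(uint32_t address,
                                                unsigned size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2 || size_bytes == 8);
    return (address & (size_bytes - 1u)) == 0;
}
```

Whether a misaligned address actually produces an alignment fault depends on the A and U bits in CP15 register 1, as described above, and is not modeled here.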
LDREXD
Figure 2-15 shows the format of the Load Register Doubleword Exclusive, LDREXD,
instruction.
Bits      [31:28]  [27:20]          [19:16]  [15:12]  [11:8]  [7:4]    [3:0]
Field     Cond     0 0 0 1 1 0 1 1  Rn       Rd       SBO     1 0 0 1  SBO
Figure 2-15 LDREXD instruction
Syntax
LDREXD{<cond>} <Rd>, [<Rn>]
Operation
if ConditionPassed(cond) then
    processor_id = ExecutingProcessor()
    Rd = Memory[Rn,4]
    R(d+1) = Memory[Rn+4,4]
    if Shared(Rn) == 1 then
        physical_address = TLB(Rn)
        MarkExclusiveGlobal(physical_address,processor_id,8)
    MarkExclusiveLocal(processor_id)
STREXD
Figure 2-16 shows the format of the Store Register Doubleword Exclusive, STREXD,
instruction.
Bits      [31:28]  [27:20]          [19:16]  [15:12]  [11:8]  [7:4]    [3:0]
Field     Cond     0 0 0 1 1 0 1 0  Rn       Rd       SBO     1 0 0 1  Rm
Figure 2-16 STREXD instruction
Syntax
STREXD{<cond>} <Rd>, <Rm>, [<Rn>]
Operation
if ConditionPassed(cond) then
    processor_id = ExecutingProcessor()
    if IsExclusiveLocal(processor_id) then
        if Shared(Rn) == 1 then
            physical_address = TLB(Rn)
            if IsExclusiveGlobal(physical_address,processor_id,8) then
                Memory[Rn,4] = Rm
                Memory[Rn+4,4] = R(m+1)
                Rd = 0
                ClearByAddress(physical_address,8)
            else
                Rd = 1
        else
            Memory[Rn,4] = Rm
            Memory[Rn+4,4] = R(m+1)
            Rd = 0
    else
        Rd = 1
    ClearExclusiveLocal(processor_id)
2.11.4
CLREX
Figure 2-17 shows the format of the Clear Exclusive, CLREX, instruction.
Bits      [31:28]  [27:20]          [19:16]  [15:12]  [11:8]  [7:4]    [3:0]
Field     1111     0 1 0 1 0 1 1 1  SBO      SBO      SBZ     0 0 0 1  SBO
Figure 2-17 CLREX instruction
The dummy STREX construct specified in ARMv6 is required for correct system behavior. The
CLREX instruction replaces the dummy STREX instruction.
This operation is unconditional in the ARM instruction set.
Syntax
CLREX
Operation
ClearExclusiveLocal(processor_id)
2.11.5
NOP-compatible hints
Figure 2-18 shows the format of the NOP-compatible hint instruction.
Bits      [31:28]  [27:16]                  [15:12]  [11:8]   [7:0]
Field     Cond     0 0 1 1 0 0 1 0 0 0 0 0  SBO      0 0 0 0  Hint
Figure 2-18 NOP-compatible hint instruction
Syntax
<cond>
Is the condition when the instruction executes. It produces no useful change in
functionality, but is provided to ensure disassembly followed by reassembly
always regenerates the original code.
<hint>
defaults to zero
hint == 0x0: the instruction is NOP
hint == 0x1: the instruction is YIELD
For all other values, RESERVED, the instruction behaves like NOP.
The true NOP for ARM state is equivalent to an MSR to the CPSR with the immed_value
redefined as the hint field and no bytes selected. The instruction is fully architecturally defined,
with all encodings assigned.
Note
True NOPs are architected for alignment reasons and do not have any timing guarantees with
respect to their neighboring instructions.
In a Symmetric Multi-Threading (SMT) design, a yield instruction enables a thread to generate
a hint to the processor that runs it. The hint indicates that the current activity of the thread is not
important, for example sitting in a spin-lock, and so can yield. On a uniprocessor system, this
instruction behaves as a NOP. OSs can use the yielding NOP in those places that require the
yield hint, and the non-yielding NOP in other cases.
Operation
The instruction acts as a NOP irrespective of whether the condition passes or fails, effectively
the ALWAYS condition. Do not use RESERVED values in software.
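The encoding in Figure 2-18 can be assembled and classified with simple masks. The C sketch below is illustrative only: the names encode_hint, classify_hint, and hint_kind_t are hypothetical, the SBO field is written as all ones, and the classifier assumes that a condition field of b1111 does not belong to this instruction.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    HINT_NOT_A_HINT,   /* not the Figure 2-18 encoding */
    HINT_NOP,          /* hint == 0x0 */
    HINT_YIELD,        /* hint == 0x1 */
    HINT_RESERVED      /* any other hint value, behaves like NOP */
} hint_kind_t;

/* Build the ARM-state encoding: Cond in [31:28], the fixed pattern
 * 0 0 1 1 0 0 1 0 0 0 0 0 in [27:16], SBO in [15:12], zeros in [11:8],
 * and the hint in [7:0]. */
static inline uint32_t encode_hint(uint32_t cond, uint32_t hint)
{
    assert(cond <= 0xEu && hint <= 0xFFu);
    return (cond << 28) | 0x0320F000u | hint;
}

/* Classify a 32-bit ARM instruction word. Matching the SBO field exactly
 * is a simplification made for this sketch. */
static inline hint_kind_t classify_hint(uint32_t instruction)
{
    if ((instruction >> 28) == 0xFu ||
        (instruction & 0x0FFFFF00u) != 0x0320F000u) {
        return HINT_NOT_A_HINT;
    }
    switch (instruction & 0xFFu) {
    case 0x0: return HINT_NOP;
    case 0x1: return HINT_YIELD;
    default:  return HINT_RESERVED;
    }
}
```

With the AL condition, b1110, this gives 0xE320F000 for NOP and 0xE320F001 for YIELD.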
2.12
Exceptions
Exceptions occur whenever the normal flow of a program has to be halted temporarily, for
example, to service an interrupt from a peripheral. Before attempting to handle an exception, the
processor preserves the current processor state so that the original program can resume when the
handler routine has finished.
If two or more exceptions occur simultaneously, the exceptions are dealt with in the fixed order
given in Exception priorities on page 2-57.
This section provides details of the processor exception handling:
•
Exception entry and exit summary on page 2-37
•
Entering an ARM exception on page 2-38
•
Leaving an ARM exception on page 2-38.
Several enhancements are made in ARM architecture v6 to the exception model, mostly to
improve interrupt latency, as follows:
•
New instructions are added to give a choice of stack to use for storing the exception return
state after exception entry, and to simplify changes of processor mode and the disabling
and enabling of interrupts.
•
The interrupt vector definitions on ARMv6 are changed to support the addition of
hardware to prioritize the interrupt sources and to look up the start vector for the related
interrupt handling routine.
•
A low interrupt latency configuration is added in ARMv6. In terms of the instruction set
architecture, it specifies that multi-access load/store instructions, ARM LDC, LDM,
LDRD, STC, STM, and STRD, and Thumb LDMIA, POP, PUSH, and STMIA, can be
interrupted and then restarted after the interrupt has been processed.
•
Support for an imprecise Data Abort that behaves as an interrupt rather than as an abort,
in that it occurs asynchronously relative to the instruction execution. Support involves the
masking of a pending imprecise Data Abort at times when entry into Abort mode is
deemed unrecoverable.
2.12.1
New instructions for exception handling
This section describes the instructions added to accelerate the handling of exceptions. Full
details of these instructions are given in the ARM Architecture Reference Manual.
Store Return State (SRS)
This instruction stores R14_<current_mode> and SPSR_<current_mode> to sequential
addresses, using the banked version of R13 for a specified mode to supply the base address, and
to be written back to if base register Write-Back is specified. This enables an exception handler
to store its return state on a stack other than the one automatically selected by its exception entry
sequence.
The addressing mode used is a version of an ARM addressing mode, modified to assume a
{R14,SPSR} register list rather than using a list specified by a bit mask in the instruction. For
more information see the ARM Architecture Reference Manual. This enables the SRS
instruction to access stacks in a manner compatible with the normal use of STM instructions for
stack accesses.
When in Non-secure state, specifying Secure Monitor mode in the <mode> parameter field causes
the SRS instruction to generate an Undefined Instruction exception. This behavior prevents the
Secure Monitor stack values from being altered.
Return From Exception (RFE)
This instruction loads the PC and CPSR from sequential addresses. This is used to return from
an exception that has had its return state saved using the SRS instruction, see Store Return State
(SRS) on page 2-36, and again uses a version of an ARM addressing mode, modified to assume
a {PC,CPSR} register list.
Change Processor State (CPS)
This instruction provides new values for the CPSR interrupt masks, mode bits, or both, and is
designed to shorten and speed up the read/modify/write instruction sequence used in ARMv5 to
perform such tasks. Together with the SRS instruction, it enables an exception handler to save
its return information on the stack of another mode and then switch to that other mode, without
modifying the stack belonging to the original mode or any registers other than the new mode
stack pointer.
This instruction also streamlines interrupt mask handling and mode switches in other code. In
particular it enables short code sequences to be made atomic efficiently in a uniprocessor system
by disabling interrupts at their start and re-enabling interrupts at their end. A similar Thumb
instruction is also provided. However, the Thumb instruction can only change the interrupt
masks, not the processor mode as well, to avoid using too much instruction set space.
2.12.2
Exception entry and exit summary
Table 2-8 summarizes the PC value preserved in the relevant R14 on exception entry, and the
recommended instruction for exiting the exception handler. Full details of Jazelle state
exceptions are provided in the Jazelle V1 Architecture Reference Manual.
Table 2-8 Exception entry and exit
Exception  Return instruction    Previous state                          Notes
or entry                         ARM R14_x  Thumb R14_x  Jazelle R14_x
SVC        MOVS PC, R14_svc      PC + 4     PC + 2       -               a
SMC        MOVS PC, R14_mon      PC + 4     -            -               a
UNDEF      MOVS PC, R14_und      PC + 4     PC + 2       -               a
PABT       SUBS PC, R14_abt, #4  PC + 4     PC + 4       PC + 4          b
FIQ        SUBS PC, R14_fiq, #4  PC + 4     PC + 4       PC + 4          c
IRQ        SUBS PC, R14_irq, #4  PC + 4     PC + 4       PC + 4          c
DABT       SUBS PC, R14_abt, #8  PC + 8     PC + 8       PC + 8          d
RESET      NA                    -          -            -               e
BKPT       SUBS PC, R14_abt, #4  PC + 4     PC + 4       PC + 4          f
a. Where the PC is the address of the SVC, SMC, or undefined instruction. Not used in Jazelle state.
b. Where the PC is the address of the instruction that had the Prefetch Abort.
c. Where the PC is the address of the instruction that was not executed because the FIQ or IRQ took priority.
d. Where the PC is the address of the Load or Store instruction that generated the Data Abort.
e. The value saved in R14_svc on reset is Unpredictable.
f. Software breakpoint.
2.12.3
Entering an ARM exception
SCR[3:1] determine the mode that the processor enters on an FIQ, IRQ, or external abort
exception, see System control and configuration on page 3-5.
When handling an ARM exception the processor:
1.
Preserves the address of the next instruction in the appropriate LR. When the exception
entry is from:
ARM and Jazelle states:
The processor writes the value of the PC into the LR, offset by a value, current
PC + 4 or PC + 8 depending on the exception, that causes the program to
resume from the correct place on return.
Thumb state:
The processor writes the value of the PC into the LR, offset by a value, current
PC + 2, PC + 4 or PC + 8 depending on the exception, that causes the program
to resume from the correct place on return.
The exception handler does not have to determine the state when entering an exception.
For example, in the case of a SVC, MOVS PC, R14_svc always returns to the next instruction
regardless of whether the SVC was executed in ARM or Thumb state.
2.
Copies the CPSR into the appropriate SPSR.
3.
Forces the CPSR mode bits to a value that depends on the exception.
4.
Forces the PC to fetch the next instruction from the relevant exception vector.
The processor can also set the interrupt and imprecise abort disable flags to prevent otherwise
unmanageable nesting of exceptions.
Note
Exceptions are always entered, handled, and exited in ARM state. When the processor is in
Thumb state or Jazelle state and an exception occurs, the switch to ARM state takes place
automatically when the exception vector address is loaded into the PC.
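The entry steps listed above can be sketched as a small C model for the case of an IRQ. This is an illustration under stated assumptions rather than a description of the hardware: the type core_t and the function model_irq_entry are hypothetical, the IRQ vector is assumed to be at the normal low address 0x00000018, the A bit is assumed to be set on entry, and the effects of the SCR, of a vectored interrupt controller, and of the E bit are ignored.

```c
#include <stdint.h>

#define PSR_MODE_MASK 0x0000001Fu
#define PSR_MODE_IRQ  0x00000012u
#define PSR_T_BIT     (1u << 5)
#define PSR_I_BIT     (1u << 7)
#define PSR_A_BIT     (1u << 8)
#define PSR_J_BIT     (1u << 24)

/* Hypothetical subset of the processor state touched by IRQ entry. */
typedef struct {
    uint32_t cpsr;
    uint32_t spsr_irq;
    uint32_t lr_irq;
    uint32_t pc;
} core_t;

/* next_instruction is the address of the instruction that was not
 * executed because the IRQ took priority. */
static inline void model_irq_entry(core_t *core, uint32_t next_instruction)
{
    core->lr_irq = next_instruction + 4;               /* step 1: PC + 4 */
    core->spsr_irq = core->cpsr;                       /* step 2 */
    core->cpsr = (core->cpsr & ~PSR_MODE_MASK) | PSR_MODE_IRQ;  /* step 3 */
    core->cpsr &= ~(PSR_T_BIT | PSR_J_BIT);            /* enter ARM state */
    core->cpsr |= PSR_I_BIT | PSR_A_BIT;               /* mask IRQ, aborts */
    core->pc = 0x00000018u;                            /* step 4 (assumed) */
}
```

Because the saved SPSR still holds the original T and J bits, restoring it on return puts the processor back in the state it was in before the interrupt.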
2.12.4
Leaving an ARM exception
When an exception has completed, the exception handler must move the LR, minus an offset,
to the PC. The offset varies according to the type of exception, as Table 2-8 on page 2-37 lists.
Typically the return instruction is an arithmetic or logical operation with the S bit set and Rd =
R15, so the core copies the SPSR back to the CPSR.
Note
The action of restoring the CPSR from the SPSR automatically resets the T bit and J bit to the
values held immediately prior to the exception. The A, I, and F bits are also automatically
restored to the value they held immediately prior to the exception.
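The offsets used by the return instructions in Table 2-8 can be captured in a lookup, shown here as a C sketch. The enumeration and function names are hypothetical, and Reset is omitted because it has no return instruction.

```c
#include <stdint.h>

typedef enum {
    EXC_SVC, EXC_SMC, EXC_UNDEF, EXC_PABT,
    EXC_FIQ, EXC_IRQ, EXC_DABT, EXC_BKPT
} exception_t;

/* Offset subtracted from the banked R14 by the recommended return
 * instruction: MOVS PC, R14 subtracts nothing, SUBS subtracts 4 or 8. */
static inline uint32_t exception_return_offset(exception_t exception)
{
    switch (exception) {
    case EXC_SVC:
    case EXC_SMC:
    case EXC_UNDEF:
        return 0;
    case EXC_DABT:
        return 8;
    default:            /* PABT, FIQ, IRQ, BKPT */
        return 4;
    }
}

/* Address loaded into the PC on return, given the saved link register. */
static inline uint32_t exception_return_address(exception_t exception,
                                                uint32_t lr)
{
    return lr - exception_return_offset(exception);
}
```

For a Data Abort taken on an instruction at 0x8000, R14_abt holds 0x8008, so the return address is 0x8000 and the aborted load or store is retried.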
2.12.5
Reset
When the nRESETIN signal is driven LOW a reset occurs, and the processor abandons the
executing instruction. The nVFPRESETIN signal is not connected and you must tie it LOW.
When nRESETIN is driven HIGH again, the processor:
1.
Forces the NS bit in the SCR to 0 (Secure) and CPSR M[4:0] to b10011 (Secure Supervisor
mode), sets the A, I, and F bits in the CPSR, and clears the CPSR T bit and J bit. The E bit
is set based on the state of the BIGENDINIT and UBITINIT pins. Other bits in the CPSR are
indeterminate.
2.
Forces the PC to fetch the next instruction from the reset vector address.
3.
Reverts to ARM state, and resumes execution.
After reset, all register values except the PC and CPSR are indeterminate.
See Chapter 9 Clocking and Resets for more details of the reset behavior for the processor.
2.12.6
Fast interrupt request
The Fast Interrupt Request (FIQ) exception supports fast interrupts. In ARM state, FIQ mode
has eight private registers to reduce, or even remove the requirement for register saving,
minimizing the overhead of context switching.
An FIQ is externally generated by taking the nFIQ signal input LOW. The nFIQ input is
registered internally to the processor. It is the output of this register that is used by the processor
control logic.
Irrespective of whether exception entry is from ARM state, Thumb state, or Jazelle state, an FIQ
handler returns from the interrupt by executing:
SUBS PC,R14_fiq,#4
You can disable FIQ exceptions within a privileged mode by setting the CPSR F flag. When the
F flag is clear, the processor checks for a LOW level on the output of the nFIQ register at the
end of each instruction.
The FW bit and FIQ bit in the SCR register configure the FIQ as:
•
non-maskable in the Non-secure world, FW bit in the SCR
•
branching to either the current FIQ mode or Secure Monitor mode, FIQ bit in the SCR.
FIQs and IRQs are disabled when an FIQ occurs. You can use nested interrupts but it is up to
you to save any corruptible registers and to re-enable FIQs and interrupts.
2.12.7
Interrupt request
The IRQ exception is a normal interrupt caused by a LOW level on the nIRQ input. IRQ has a
lower priority than FIQ, and is masked on entry to an FIQ sequence.
Irrespective of whether exception entry is from ARM state, Thumb state, or Jazelle state, an IRQ
handler returns from the interrupt by executing:
SUBS PC,R14_irq,#4
You can disable IRQ exceptions within a privileged mode by setting the CPSR I flag. When the
I flag is clear, the processor checks for a LOW level on the output of the nIRQ register at the end
of each instruction.
IRQs are disabled when an IRQ occurs. You can use nested interrupts but it is up to you to save
any corruptible registers and to re-enable IRQs.
The IRQ bit in the SCR register configures the IRQ to branch to either the current IRQ mode or
to the Secure Monitor mode.
2.12.8
Low interrupt latency configuration
The FI bit, bit 21, in CP15 register 1 enables a low interrupt latency configuration. This bit is
not duplicated in both worlds, and can only be modified in Secure state. It applies to both worlds.
This mode reduces the interrupt latency of the processor. This is achieved by:
•
disabling Hit-Under-Miss (HUM) functionality
•
abandoning restartable external accesses so that the core can react to a pending interrupt
faster than is normally the case
•
recognizing low-latency interrupts as early as possible in the main pipeline.
To ensure that a change between normal and low interrupt latency configurations is
synchronized correctly, the FI bit must only be changed using the sequence:
1.
Data Synchronization Barrier.
2.
Change FI Bit.
3.
Data Synchronization Barrier.
You must disable interrupts during this complete sequence of operations.
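The three-step sequence can be sketched in C as follows. This is a sketch under stated assumptions, not validated code: the function names are hypothetical, the CP15 operations used are the c1 Control Register access and the c7, c10, 4 Data Synchronization Barrier operation, and the caller is assumed to be running in a Secure privileged mode in ARM state with interrupts already disabled. When the code is not compiled for ARM state with a GCC-compatible compiler, a plain variable stands in for the register so that the logic can be exercised on a host.

```c
#include <stdbool.h>
#include <stdint.h>

#define CONTROL_FI (1u << 21)   /* FI bit, bit 21 of CP15 register 1 */

#if defined(__arm__) && !defined(__thumb__) && defined(__GNUC__)
static inline void data_sync_barrier(void)
{
    uint32_t zero = 0;
    __asm__ volatile("mcr p15, 0, %0, c7, c10, 4" : : "r"(zero) : "memory");
}
static inline uint32_t read_control_register(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(value));
    return value;
}
static inline void write_control_register(uint32_t value)
{
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0" : : "r"(value) : "memory");
}
#else
/* Host stand-in: a variable models the register, the barrier is a no-op. */
static uint32_t model_control_register;
static inline void data_sync_barrier(void) { }
static inline uint32_t read_control_register(void)
{
    return model_control_register;
}
static inline void write_control_register(uint32_t value)
{
    model_control_register = value;
}
#endif

/* Change the FI bit using the barrier, change, barrier sequence.
 * The caller must keep interrupts disabled for the whole call. */
static inline void set_low_interrupt_latency(bool enable)
{
    data_sync_barrier();                          /* 1. DSB */
    uint32_t control = read_control_register();
    control = enable ? (control | CONTROL_FI) : (control & ~CONTROL_FI);
    write_control_register(control);              /* 2. change FI bit */
    data_sync_barrier();                          /* 3. DSB */
}
```

Only the FI bit is changed by the read-modify-write, so the other control bits keep their values.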
You must ensure that software systems only change the FI bit shortly after Reset, while
interrupts are disabled. In low interrupt latency configuration, software must only use
multi-word load/store instructions in ways that are fully restartable. In particular, they must not
be used on memory locations that produce non-idempotent side-effects for the type of memory
access concerned.
This enables, but does not require, implementations to make these instructions interruptible
when in low interrupt latency configuration. If the instruction is interrupted before it is
complete, the result might be that one or more of the words are accessed twice, but the
idempotency of the side-effects, if any, of the memory accesses ensures that this does not matter.
Note
There is a similar existing requirement with unaligned and multi-word load/store instructions
that access memory locations that can abort in a recoverable way. An abort on one of the words
accessed can cause a previously-accessed word to be accessed twice, once before the abort, and
once again after the abort handler has returned. The requirement in this case is either:
• all side-effects are idempotent
• the abort must either occur on the first word accessed or not at all.
The instructions that this rule currently applies to are:
• ARM instructions LDC, all forms of LDM, LDRD, STC, all forms of STM, STRD, and
  unaligned LDR, STR, LDRH, and STRH
• Thumb instructions LDMIA, PUSH, POP, and STMIA, and unaligned LDR, STR, LDRH,
  and STRH.
System designers are also advised that memory locations accessed with these instructions must
not have large numbers of wait-states associated with them if the best possible interrupt latency
is to be achieved.
2.12.9 Interrupt latency example
This section gives an extended example to show how the combination of new facilities improves
interrupt latency. The example is not necessarily entirely realistic, but illustrates the main points.
For simplicity, this example applies to legacy code, that is, code that does not use any
TrustZone features. You can therefore assume that the core runs code in only one world, either
Secure or Non-secure.
The assumptions made are:
1. Vector Interrupt Controller (VIC) hardware exists to prioritize interrupts and to supply the
address of the highest priority interrupt to the processor core on demand. In the ARMv5
system, the address is supplied in a memory-mapped I/O location, and loading the address
acts as an entering interrupt handler acknowledgement to the VIC. In the ARMv6 system,
the address is loaded and the acknowledgement given automatically, as part of the
interrupt entry sequence. In both systems, a store to a memory-mapped I/O location is
used to send a finishing interrupt handler acknowledgement to the VIC.
2. The system has the following layers:
Real-time layer
Contains handlers for a number of high-priority interrupts. These
interrupts can be prioritized, and are assumed to be signaled to the
processor core by means of the FIQ interrupt. Their handlers do not
use the facilities supplied by the other two layers. This means that
all memory they use must be locked down in the TLBs and caches.
It is possible to use additional code to make access to nonlocked
memory possible, but this example does not describe this.
Architectural completion layer
Contains Prefetch Abort, Data Abort and Undefined instruction
handlers whose purpose is to give the illusion that the hardware is
handling all memory requests and instructions on its own, without
requiring software to handle TLB misses, virtual memory misses,
and near-exceptional floating-point operations, for example. This
illusion is not available to the real-time layer, because the software
handlers concerned take a significant number of cycles, and it is not
reasonable to have every memory access to take large numbers of
cycles. Instead, the memory concerned has to be locked down.
Non real-time layer
Provides interrupt handlers for low-priority interrupts. These
interrupts can also be prioritized, and are assumed to be signaled to
the processor core using the IRQ interrupt.
3. The corresponding exception priority structure is as follows, from highest to lowest
   priority:
   a. FIQ1, highest priority FIQ
   b. FIQ2
   c. ...
   d. FIQm, lowest priority FIQ
   e. Data Abort
   f. Prefetch Abort
   g. Undefined instruction
   h. SVC
   i. IRQ1, highest priority IRQ
   j. IRQ2
   k. ...
   l. IRQn, lowest priority IRQ
The processor core prioritization handles most of the priority structure, but the VIC
handles the priorities within each group of interrupts.
Note
This list reflects the priorities that the handlers are subject to, and differs from the
priorities that the exception entry sequences are subject to. The difference occurs because
simultaneous Data Abort and FIQ exceptions result in the sequence:
a. Data Abort entry sequence executed, updating R14_abt, SPSR_abt, PC, and CPSR.
b. FIQ entry sequence executed, updating R14_fiq, SPSR_fiq, PC, and CPSR.
c. FIQ handler executes to completion and returns.
d. Data Abort handler executes to completion and returns.
For more information see the ARM Architecture Reference Manual.
4. Stack and register usage is:
   • The FIQ1 interrupt handler has exclusive use of R8_fiq to R12_fiq. In ARMv5,
     R13_fiq points to a memory area, that is mainly for use by the FIQ1 handler.
     However, a few words are used during entry for other FIQ handlers. In ARMv6, the
     FIQ1 interrupt handler has exclusive use of R13_fiq.
   • The Undefined instruction, Prefetch Abort, Data Abort, and non-FIQ1 FIQ handlers
     use the stack pointed to by R13_abt. This stack is locked down in memory, and
     therefore of known, limited depth.
   • All IRQ and SVC handlers use the stack pointed to by R13_svc. This stack does not
     have to be locked down in memory.
   • The stack pointed to by R13_usr is used by the current process. This process can be
     privileged or unprivileged, and uses System or User mode accordingly.
5. Timings are roughly consistent with ARM10 timings, with the pipeline reload penalty
   being three cycles. It is assumed that pipeline reloads are combined to execute as quickly
   as reasonably possible, and in particular that:
   • If an interrupt is detected during an instruction that has set a new value for the PC,
     after that value has been determined and written to the PC but before the resulting
     pipeline refill is completed, the pipeline refill is abandoned and the interrupt entry
     sequence started as soon as possible.
   • Similarly, if an FIQ is detected during an exception entry sequence that does not
     disable FIQs, after the updates to R14, the SPSR, the CPSR, and the PC but before
     the pipeline refill has completed, the pipeline refill is abandoned and the FIQ entry
     sequence started as soon as possible.
FIQs in the example system in ARMv5
In ARMv5, all FIQ interrupts come through the same vector, at address 0x0000001C or
0xFFFF001C. To implement the above system, the code at this vector must get the address of the
correct handler from the VIC, branch to it, and transfer to using R13_abt and the Abort mode
stack if it is not the FIQ1 handler. The following code does this, assuming that R8_fiq holds the
address of the VIC:
FIQhandler
        LDR     PC, [R8,#HandlerAddress]
        ...
FIQ1handler
        ... Include code to process the interrupt ...
        STR     R0, [R8,#AckFinished]
        SUBS    PC, R14, #4
        ...
FIQ2handler
        STMIA   R13, {R0-R3}
        MOV     R0, LR
        MRS     R1, SPSR
        ADD     R2, R13, #8
        MRS     R3, CPSR
        BIC     R3, R3, #0x1F
        ORR     R3, R3, #0x17           ; = Abort mode number (0b10111)
        MSR     CPSR_c, R3
        STMFD   R13!, {R0, R1}
        LDMIA   R2, {R0, R1}
        STMFD   R13!, {R0, R1}
        LDMDB   R2, {R0, R1}
        BIC     R3, R3, #0x40           ; = F bit
        MSR     CPSR_c, R3
        ... FIQs are now re-enabled, with original R2, R3, R14, SPSR on stack
        ... Include code to stack any more registers required, process the interrupt
        ... and unstack extra registers
        ADR     R2, #VICaddress
        MRS     R3, CPSR
        ORR     R3, R3, #0x40           ; = F bit
        MSR     CPSR_c, R3
        STR     R0, [R2,#AckFinished]
        LDR     R14, [R13,#12]          ; Original SPSR value
        MSR     SPSR_fsxc, R14
        LDMFD   R13!, {R2,R3,R14}
        ADD     R13, R13, #4
        SUBS    PC, R14, #4
        ...
The major problem with this is the length of time that FIQs are disabled at the start of the lower
priority FIQs. The worst-case interrupt latency for the FIQ1 interrupt occurs if a lower priority
FIQ2 has fetched its handler address, and is approximately:
• 3 cycles for the pipeline refill after the LDR PC instruction fetches the handler address
• + 24 cycles to get to and execute the MSR instruction that re-enables FIQs
• + 3 cycles to re-enter the FIQ exception
• + 5 cycles for the LDR PC instruction at FIQhandler
• = 35 cycles.
Note
FIQs must be disabled for the final store to acknowledge the end of the handler to the VIC.
Otherwise, more badly timed FIQs, each occurring close to the end of the previous handler, can
cause unlimited growth of the locked-down stack.
FIQs in the example system in ARMv6
Using the VIC and the new instructions, there is no longer any requirement for everything to go
through the single FIQ vector, and the changeover to a different stack occurs much more
smoothly. The code is:
FIQ1handler
        ... Include code to process the interrupt ...
        STR     R0, [R8,#AckFinished]
        SUBS    PC, R14, #4
        ...
FIQ2handler
        SUB     R14, R14, #4
        SRSFD   R13_abt!
        CPSIE   f, #0x17                ; = Abort mode (0b10111)
        STMFD   R13!, {R2, R3}
        ... FIQs are now re-enabled, with original R2, R3, R14, SPSR on stack
        ... Include code to stack any more registers required, process the interrupt
        ... and unstack extra registers
        LDMFD   R13!, {R2, R3}
        ADR     R14, #VICaddress
        CPSID   f
        STR     R0, [R14,#AckFinished]
        RFEFD   R13!
        ...
The worst-case interrupt latency for an FIQ1 now occurs if the FIQ1 occurs during an FIQ2
interrupt entry sequence, after it disables FIQs, and is approximately:
• 3 cycles for the pipeline refill for the FIQ2 exception entry sequence
• + 5 cycles to get to and execute the CPSIE instruction that re-enables FIQs
• + 3 cycles to re-enter the FIQ exception
• = 11 cycles.
Note
In the ARMv5 system, the potential additional interrupt latency caused by a long LDM or STM
being in progress when the FIQ is detected was only significant because the memory system was
able to stretch its cycles considerably. Otherwise, it was dwarfed by the number of cycles lost
because of FIQs being disabled at the start of a lower-priority interrupt handler. In ARMv6, this
is still the case, but it is a lot closer.
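The two worst-case figures can be checked by summing the per-step cycle counts quoted in the ARMv5 and ARMv6 examples. The following is only an arithmetic sketch; the dictionary keys are informal labels, not names from the manual.

```python
# Approximate cycle counts quoted in the text for the worst-case FIQ1 latency.
ARMV5_STEPS = {
    "pipeline refill after LDR PC fetches the handler address": 3,
    "reach and execute the MSR that re-enables FIQs": 24,
    "re-enter the FIQ exception": 3,
    "LDR PC at FIQhandler": 5,
}

ARMV6_STEPS = {
    "pipeline refill for the FIQ2 exception entry sequence": 3,
    "reach and execute the CPSIE that re-enables FIQs": 5,
    "re-enter the FIQ exception": 3,
}


def worst_case_latency(steps):
    """Sum the per-step cycle counts of one latency breakdown."""
    return sum(steps.values())


armv5 = worst_case_latency(ARMV5_STEPS)
armv6 = worst_case_latency(ARMV6_STEPS)
saving = armv5 - armv6
```

Most of the difference comes from the shorter window in which FIQs are disabled at the start of the lower-priority handler, and from no longer needing the LDR PC dispatch through the single FIQ vector.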
Alternatives to the example system
Two alternatives to the design in FIQs in the example system in ARMv6 on page 2-43 are:
• The first alternative is not to reserve the FIQ registers for the FIQ1 interrupt, but instead
either to:
— share them out among the various FIQ handlers
The first restricts the registers available to the FIQ1 handler and adds the software
complication of managing a global allocation of FIQ registers to FIQ handlers.
Also, because of the shortage of FIQ registers, it is not likely to be very effective if
there are many FIQ handlers.
— require the FIQ handlers to treat them as normal callee-save registers.
The second adds a number of cycles of loading important addresses and variable
values into the registers to each FIQ handler before it can do any useful work. That
is, it increases the effective FIQ latency by a similar number of cycles.
• The second alternative is to use IRQs for all but the highest priority interrupt, so that there
is only one level of FIQ interrupt. This achieves very fast FIQ latency, 5-8 cycles, but at a
cost to all the lower-priority interrupts, because every exception entry sequence now disables
them. You then have the following possibilities:
—
None of the exception handlers in the architectural completion layer re-enable
IRQs. In this case, all IRQs suffer from additional possible interrupt latency caused
by those handlers, and so effectively are in the non real-time layer. In other words,
this results in there only being one priority for interrupts in the real-time layer.
—
All of the exception handlers in the architectural completion layer re-enable IRQs
to permit IRQs to have real-time behavior. The problem in this case is that all IRQs
can then occur during the processing of an exception in the architectural completion
layer, and so they are all effectively in the real-time layer. In other words, this
effectively means that there are no interrupts in the non real-time layer.
—
All of the exception handlers in the architectural completion layer re-enable IRQs,
but they also use additional VIC facilities to place a lower limit on the priority of
IRQs that are taken. This permits IRQs at that priority or higher to be treated as being
in the real-time layer, and IRQs at lower priorities to be treated as being in the non
real-time layer. The price paid is some additional complexity in the software and in
the VIC hardware.
Note
For either of the last two options, the new instructions speed up the IRQ re-enabling and
the stack changes that are likely to be required.
2.12.10 Aborts
An abort can be caused by either:
• the MMU signalling an internal abort
• an external abort being raised from the AXI interfaces, by an AXI error response.
There are two types of abort:
• Prefetch Abort
• Data Abort on page 2-46.
IRQs are disabled when an abort occurs. When the aborts are configured to branch to Secure
Monitor mode, the FIQ is also disabled.
Note
The Interrupt Status Register shows at any time if there is a pending IRQ, FIQ, or External
Abort. For more information, see c12, Interrupt Status Register on page 3-123.
All aborts from the TLB are internal except for aborts from page table walks that are external
precise aborts. If the EA bit is 1 for translation aborts, see c1, Secure Configuration Register on
page 3-52, the core branches to Secure Monitor mode in the same way as it does for all other
external aborts.
Prefetch Abort
This is signaled with the instruction as it enters the pipeline Decode stage.
When a Prefetch Abort occurs, the processor marks the prefetched instruction as invalid, but
does not take the exception until the instruction is to be executed. If the instruction is not
executed, for example because a branch occurs while it is in the pipeline, the abort does not take
place.
After dealing with the cause of the abort, the handler executes the following instruction
irrespective of the processor operating state:
SUBS PC,R14_abt,#4
This action restores both the PC and the CPSR, and retries the aborted instruction.
Data Abort
Data Abort on the processor can be precise or imprecise. Precise Data Aborts are those
generated after performing an instruction side CP15 operation, and all those generated by the
MMU:
• alignment faults
• translation faults
• access bit faults
• domain faults
• permission faults.
Data Aborts that occur because of watchpoints are imprecise in that the processor and system
state presented to the abort handler is the processor and system state at the boundary of an
instruction shortly after the instruction that caused the watchpoint, but before any following
load/store instruction. Because the state that is presented is consistent with an instruction
boundary, these aborts are restartable, even though they are imprecise.
Errors that cause externally generated Data Aborts might be precise or imprecise. Two separate
FSR encodings indicate if the external abort is precise or imprecise:
• all external aborts to loads when the CP15 Register 1 FI bit, bit 21, is set are precise
• all external aborts to loads or stores to Strongly Ordered memory are precise
• all external aborts to loads to the Program Counter or the CPSR are precise
• all external aborts on the load part of a SWP are precise
• all other external aborts are imprecise.
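The rules in this list can be expressed as a small decision function. This is a host-side sketch only: the function name and its parameters are hypothetical, and each parameter simply stands for one of the conditions listed above.

```python
def external_abort_is_precise(is_load, fi_bit=False, strongly_ordered=False,
                              loads_pc_or_cpsr=False, swp_load_part=False):
    """Return True if an external Data Abort is precise under the listed rules."""
    if strongly_ordered:
        return True   # loads or stores to Strongly Ordered memory
    if swp_load_part:
        return True   # the load part of a SWP
    if is_load and (fi_bit or loads_pc_or_cpsr):
        return True   # loads with FI set, or loads to the PC or the CPSR
    return False      # all other external aborts are imprecise
```

For example, a store to Normal memory is reported as imprecise whether or not the FI bit is set, because the FI rule covers loads only.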
External aborts are supported on cacheable locations. The abort is transmitted to the processor
only if a word requested by the processor had an external abort.
Precise Data Aborts
A precise Data Abort is signaled when the abort exception enables the processor and system
state presented to the abort handler to be consistent with the processor and system state when
the aborting instruction was executed. With precise Data Aborts, the restarting of the processor
after the cause of the abort has been rectified is straightforward.
The ARM1176JZ-S processor implements the base restored Data Abort model, which differs from
the base updated Data Abort model implemented by the ARM7TDMI-S processor.
With the base restored Data Abort model, when a Data Abort exception occurs during the
execution of a memory access instruction, the base register is always restored by the processor
hardware to the value it contained before the instruction was executed. This removes the
requirement for the Data Abort handler to unwind any base register update that might have been
specified by the aborted instruction, which simplifies the software Data Abort handler. See the
ARM Architecture Reference Manual for more details.
After dealing with the cause of the abort, the handler executes the following return instruction
irrespective of the processor operating state at the point of entry:
SUBS PC,R14_abt,#8
This restores both the PC and the CPSR, and retries the aborted instruction.
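The return instructions for the two abort types differ only in the offset subtracted from R14_abt: 4 for a Prefetch Abort and 8 for a Data Abort. The following host-side sketch shows why both return to the aborted instruction. The function names are hypothetical, and the entry-side offsets are those given in the exception entry descriptions later in this chapter (address of the aborted instruction + 4 or + 8).

```python
# Offset between the aborted instruction and the value written to R14_abt.
LINK_OFFSET = {"prefetch_abort": 4, "data_abort": 8}


def r14_on_entry(kind, aborted_address):
    """Value placed in R14_abt when the abort exception is taken."""
    return (aborted_address + LINK_OFFSET[kind]) & 0xFFFFFFFF


def return_address(kind, r14_abt):
    """Address loaded into the PC by SUBS PC,R14_abt,#offset."""
    return (r14_abt - LINK_OFFSET[kind]) & 0xFFFFFFFF
```

In both cases the subtraction undoes the offset applied on entry, so the aborted instruction is fetched and executed again.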
Imprecise Data Aborts
An imprecise Data Abort is signaled when the processor and system state presented to the abort
handler cannot be guaranteed to be consistent with the processor and system state when the
aborting instruction was issued.
2.12.11 Imprecise Data Abort mask in the CPSR/SPSR
An imprecise Data Abort caused, for example, by an External Error on a write that has been held
in a Write Buffer, is asynchronous to the execution of the causing instruction and can occur
many cycles after the instruction that caused the memory access has retired. For this reason, the
imprecise Data Abort can occur at a time when the processor is in Abort mode because of a
precise Data Abort, or when it has live state in Abort mode but is handling an interrupt.
To avoid the loss of the Abort mode state, R14_abt and SPSR_abt, in these cases, which would
lead to the processor entering an unrecoverable state, the existence of a pending imprecise
Data Abort must be held by the system until a time when the Abort mode can safely be entered.
A mask is added into the CPSR to indicate that an imprecise Data Abort can be accepted. This
bit is referred to as the A bit. The imprecise Data Abort causes a Data Abort to be taken when
imprecise Data Aborts are not masked. When imprecise Data Aborts are masked, then the
implementation is responsible for holding the presence of a pending imprecise Data Abort until
the mask is cleared and the abort is taken. The A bit is set automatically on entry into Abort,
IRQ, and FIQ modes, and on Reset.
Note
You cannot change the CPSR A bit in the Non-secure world if the SCR bit 5 is reset. You can
change the SPSR A bit in the Non-secure world but this does not update the CPSR if the SCR
bit 5 does not permit it.
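The masking behavior can be sketched as a small state machine. This host-side model is an illustration only: the class and method names are hypothetical, and it deliberately ignores the effect of SCR bit 5 described in the Note, so it corresponds to software that is always permitted to change the A bit.

```python
class ImpreciseAbortModel:
    """Hypothetical model of a pending imprecise Data Abort and the CPSR A bit."""

    def __init__(self):
        self.a_bit = 1        # the A bit is set on Reset
        self.pending = False  # an imprecise abort is held while masked
        self.taken = 0        # number of Data Abort exceptions taken

    def signal_imprecise_abort(self):
        self.pending = True
        self._deliver()

    def write_a_bit(self, value):
        self.a_bit = value
        self._deliver()

    def _deliver(self):
        # The abort is taken only when it is pending and not masked.
        if self.pending and self.a_bit == 0:
            self.pending = False
            self.taken += 1
            self.a_bit = 1    # entry into Abort mode sets the A bit again
```

An abort signaled while the A bit is set stays pending, and is taken as soon as software clears the A bit.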
2.12.12 Supervisor call instruction
You can use the Supervisor call instruction (SVC) to enter Supervisor mode, usually to request
a particular supervisor function. The SVC handler reads the opcode to extract the SVC function
number. An SVC handler returns by executing the following instruction, irrespective of the
processor operating state:
MOVS PC, R14_svc
This action restores the PC and CPSR, and returns to the instruction following the SVC.
IRQs are disabled when a Supervisor call occurs.
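Extracting the SVC function number from the opcode can be sketched as follows. The function names are hypothetical. The field positions are those of the ARM and Thumb SVC encodings defined by the ARM architecture (a 24-bit immediate in ARM state, an 8-bit immediate in Thumb state) rather than anything stated in this section, and a real handler would determine the state from the T bit in SPSR_svc.

```python
def svc_opcode_address(r14_svc, thumb):
    """Address of the SVC instruction, given the link register on entry.

    R14_svc holds the address of the instruction after the SVC, so the SVC
    itself is one instruction earlier: 4 bytes in ARM state, 2 in Thumb state.
    """
    return (r14_svc - (2 if thumb else 4)) & 0xFFFFFFFF


def svc_number(opcode, thumb=False):
    """Extract the immediate field from an SVC opcode."""
    if thumb:
        return opcode & 0xFF        # Thumb SVC: immediate in bits [7:0]
    return opcode & 0x00FFFFFF      # ARM SVC: immediate in bits [23:0]
```

For example, the ARM opcode 0xEF000011 carries function number 0x11, and the Thumb opcode 0xDF05 carries function number 5.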
2.12.13 Secure Monitor Call (SMC)
When the processor executes the Secure Monitor Call (SMC) the core enters Secure Monitor
mode to execute the Secure Monitor code. For more details on SMC and the Secure Monitor,
see The NS bit and Secure Monitor mode on page 2-4.
Note
An attempt by a User process to execute an SMC makes the processor enter the Undefined
exception trap.
2.12.14 Undefined instruction
When an instruction is encountered that neither the processor, nor any coprocessor in the
system, can handle, the processor takes the undefined instruction trap. Software can use this
mechanism to extend the ARM instruction set by emulating undefined coprocessor instructions.
After emulating the failed instruction, the trap handler executes the following instruction,
irrespective of the processor operating state:
MOVS PC,R14_und
This action restores the CPSR and returns to the next instruction after the undefined instruction.
IRQs are disabled when an undefined instruction trap occurs. For more information about
undefined instructions, see the ARM Architecture Reference Manual.
2.12.15 Breakpoint instruction (BKPT)
A breakpoint (BKPT) instruction operates as though the instruction causes a Prefetch Abort.
A breakpoint instruction does not cause the processor to take the Prefetch Abort exception until
the instruction reaches the Execute stage of the pipeline. If the instruction is not executed, for
example because a branch occurs while it is in the pipeline, the breakpoint does not take place.
After dealing with the breakpoint, the handler executes the following instruction irrespective of
the processor operating state:
SUBS PC,R14_abt,#4
This action restores both the PC and the CPSR, and retries the breakpointed instruction.
Note
If the EmbeddedICE-RT logic is configured into Halting debug-mode, a breakpoint instruction
causes the processor to enter Debug state. See Halting debug-mode debugging on page 13-50.
2.12.16 Exception vectors
The Secure Configuration Register bits [3:1] determine the mode that is entered when an IRQ,
an FIQ, or an external abort exception occurs.
Three CP15 registers define the base address of the following vector tables:
•
Non-secure, Non_Secure_Base_Address
•
Secure, Secure_Base_Address
•
Secure Monitor, Monitor_Base_Address.
If high vectors are enabled, Non_Secure_Base_Address and Secure_Base_Address registers are
treated as being 0xFFFF0000, regardless of the value of these registers.
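The way a vector address is formed from these base addresses can be sketched as below. This is a host-side illustration with hypothetical names. The offsets are the ones used in the exception entry descriptions that follow, and Reset is left out because its entry sequence branches to 0x00000000 or 0xFFFF0000 rather than to a base address register.

```python
HIGH_VECTORS_BASE = 0xFFFF0000

# Vector offsets as used in the exception entry descriptions.
VECTOR_OFFSET = {
    "undefined": 0x04,
    "svc": 0x08,
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    "irq": 0x18,
    "fiq": 0x1C,
}


def vector_address(exception, base_address, high_vectors=False, monitor=False):
    """Return the address that the core branches to for an exception.

    base_address is the value of the base address register that applies.
    High vectors override only the Secure and Non-secure base addresses,
    not the Monitor base address.
    """
    if high_vectors and not monitor:
        base_address = HIGH_VECTORS_BASE
    return (base_address + VECTOR_OFFSET[exception]) & 0xFFFFFFFF
```

This sketch does not cover the VIC port, where an IRQ branches to the address supplied by the VIC instead of to a vector table entry.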
Exceptions occurring in Non-secure world
The following exceptions occur in the Non-secure world:
• Reset on page 2-49
• Undefined instruction
• Supervisor call exception
• External Prefetch Abort on page 2-50
• Internal Prefetch Abort on page 2-50
• External Data Abort on page 2-50
• Internal Data Abort on page 2-51
• Interrupt request (IRQ) exception on page 2-51
• Fast Interrupt Request (FIQ) exception on page 2-52
• Secure Monitor Call Exception on page 2-52.
Reset
When Reset is de-asserted:
/* Enter secure state */
R14_svc = UNPREDICTABLE value
SPSR_svc = UNPREDICTABLE value
CPSR [4:0] = 0b10011 /* Enter supervisor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of Secure Control Register bit[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF0000
else
PC = 0x00000000
Undefined instruction
On an undefined instruction:
/* Non-secure state is unchanged */
R14_und = address of the next instruction after the undefined instruction
SPSR_und = CPSR
CPSR [4:0] = 0b11011 /* Enter undefined Instruction mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF0004
else
PC = Non_Secure_Base_Address + 0x00000004
Supervisor call exception
On an SVC:
/* Non-secure state is unchanged */
R14_svc = address of the next instruction after the SVC instruction
SPSR_svc = CPSR
CPSR [4:0] = 0b10011 /* Enter supervisor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
    PC = 0xFFFF0008
else
    PC = Non_Secure_Base_Address + 0x00000008
External Prefetch Abort
On an external prefetch abort:
if SCR[3]=1 /* external prefetch aborts trapped to Secure Monitor mode */
    R14_mon = address of the aborted instruction + 4
    SPSR_mon = CPSR
    CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [6] = 1 /* Disable fast interrupts */
    CPSR [7] = 1 /* Disable interrupts */
    CPSR [8] = 1 /* Disable imprecise aborts */
    CPSR [9] = Secure EE-bit /* store value of Secure Ctrl Reg bit[25] */
    CPSR[24] = 0 /* Clear J bit */
    PC = Monitor_Base_Address + 0x0000000C
Else
    R14_abt = address of the aborted instruction + 4
    SPSR_abt = CPSR
    CPSR [4:0] = 0b10111 /* Enter abort mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [7] = 1 /* Disable interrupts */
    If SCR[5]=1 (bit AW)
        CPSR [8] = 1 /* Disable imprecise aborts */
    Else
        CPSR [8] = UNCHANGED
    CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
    CPSR[24] = 0 /* Clear J bit */
    if high vectors configured then
        PC = 0xFFFF000C
    else
        PC = Non_Secure_Base_Address + 0x0000000C
Internal Prefetch Abort
On an internal prefetch abort:
/* Non-secure state is unchanged */
R14_abt = address of the aborted instruction + 4
SPSR_abt = CPSR
CPSR [4:0] = 0b10111 /* Enter abort mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
If SCR[5]=1 (bit AW)
    CPSR [8] = 1 /* Disable imprecise aborts */
Else
    CPSR [8] = UNCHANGED
CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
    PC = 0xFFFF000C
else
    PC = Non_Secure_Base_Address + 0x0000000C
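The CPSR updates in this entry sequence can be written as bit operations. The sketch below is a host-side model with hypothetical names. It follows the pseudocode for an abort taken to Abort mode in the Non-secure world, and shows that the A bit is only forced to 1 when the AW bit, SCR[5], is set.

```python
MODE_MASK = 0x1F
MODE_ABORT = 0b10111
T_BIT = 1 << 5    # Thumb state
I_BIT = 1 << 7    # IRQ disable
A_BIT = 1 << 8    # imprecise abort disable
E_BIT = 1 << 9    # data endianness
J_BIT = 1 << 24   # Jazelle state


def nonsecure_abort_entry_cpsr(cpsr, scr_aw, ns_ee_bit):
    """Return the CPSR after entry to Abort mode in the Non-secure world."""
    new = (cpsr & ~MODE_MASK) | MODE_ABORT  # CPSR[4:0] = 0b10111
    new &= ~T_BIT                           # execute in ARM state
    new |= I_BIT                            # disable interrupts
    if scr_aw:
        new |= A_BIT                        # otherwise CPSR[8] is unchanged
    new = (new & ~E_BIT) | (E_BIT if ns_ee_bit else 0)
    new &= ~J_BIT                           # clear J bit
    return new & 0xFFFFFFFF
```

The F bit, CPSR[6], is not touched, which matches the pseudocode: an abort taken to Abort mode does not disable FIQs.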
External Data Abort
On an External Precise Data Abort or on an External Imprecise Abort with CPSR[8]=0 (A bit):
/* Non-secure state is unchanged */
if SCR[3]=1 /* external aborts trapped to Secure Monitor mode */
    R14_mon = address of the aborted instruction + 8
    SPSR_mon = CPSR
    CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [6] = 1 /* Disable fast interrupts */
    CPSR [7] = 1 /* Disable interrupts */
    CPSR [8] = 1 /* Disable imprecise aborts */
    CPSR [9] = Secure EE-bit /* store value of secure Ctrl Reg bit[25] */
    CPSR[24] = 0 /* Clear J bit */
    PC = Monitor_Base_Address + 0x00000010
Else /* external Aborts trapped in abort mode */
    R14_abt = address of the aborted instruction + 8
    SPSR_abt = CPSR
    CPSR [4:0] = 0b10111 /* Enter abort mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [7] = 1 /* Disable interrupts */
    If SCR[5]=1 (bit AW)
        CPSR [8] = 1 /* Disable imprecise aborts */
    Else
        CPSR [8] = UNCHANGED
    CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
    CPSR[24] = 0 /* Clear J bit */
    if high vectors configured then
        PC = 0xFFFF0010
    else
        PC = Non_Secure_Base_Address + 0x00000010
Internal Data Abort
On an Internal Data Abort, that is, any Data Abort that is not an external abort. These are the
Data Aborts from the L1 memory management that occur when the MMU detects a fault:
/* Non-secure state is unchanged */
R14_abt = address of the aborted instruction + 8
SPSR_abt = CPSR
CPSR [4:0] = 0b10111 /* Enter abort mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
If SCR[5]=1 (bit AW)
    CPSR [8] = 1 /* Disable imprecise aborts */
Else
    CPSR [8] = UNCHANGED
CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
    PC = 0xFFFF0010
else
    PC = Non_Secure_Base_Address + 0x00000010
Interrupt request (IRQ) exception
On an Interrupt Request, and CPSR[7]=0, I bit:
/* Non-secure state is unchanged */
if SCR[1]=1 /* IRQ trapped in Secure Monitor mode */
    R14_mon = address of the next instruction to be executed + 4
    SPSR_mon = CPSR
    CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [6] = 1 /* Disable fast interrupts */
    CPSR [7] = 1 /* Disable interrupts */
    CPSR [8] = 1 /* Disable imprecise aborts */
    CPSR [9] = Secure EE-bit /* store value of secure Ctrl Reg bit[25] */
    CPSR[24] = 0 /* Clear J bit */
    PC = Monitor_Base_Address + 0x00000018
else
    R14_irq = address of the next instruction to be executed + 4
    SPSR_irq = CPSR
    CPSR [4:0] = 0b10010 /* Enter IRQ mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [7] = 1 /* Disable interrupts */
    If SCR[5]=1 (bit AW)
        CPSR [8] = 1 /* Disable imprecise aborts */
    Else
        CPSR [8] = UNCHANGED
    CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
    CPSR[24] = 0 /* Clear J bit */
    if VE == 0 /* Core with VIC port only */
        if high vectors configured then
            PC = 0xFFFF0018
        else
            PC = Non_Secure_Base_Address + 0x00000018
    else
        PC = IRQADDR
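The choice of mode and branch target for a Non-secure IRQ depends on three inputs: SCR[1], the VE bit, and the high vectors setting. The decision can be sketched as below. This is a host-side model with hypothetical function and parameter names, and `irqaddr` stands for the handler address supplied by the VIC.

```python
def nonsecure_irq_target(scr_irq, ve, high_vectors,
                         ns_base, monitor_base, irqaddr):
    """Return (mode, pc) for an IRQ taken in the Non-secure world."""
    if scr_irq:
        # SCR[1] = 1: the IRQ is trapped to Secure Monitor mode.
        return ("monitor", (monitor_base + 0x18) & 0xFFFFFFFF)
    if ve:
        # VE = 1: branch directly to the address supplied by the VIC.
        return ("irq", irqaddr)
    if high_vectors:
        return ("irq", 0xFFFF0018)
    return ("irq", (ns_base + 0x18) & 0xFFFFFFFF)
```

The order of the tests mirrors the pseudocode: the SCR[1] check comes first, so the VE bit and the high vectors setting only matter when the IRQ is taken to IRQ mode.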
Fast Interrupt Request (FIQ) exception
On a Fast Interrupt Request, and CPSR[6]=0, F bit:
/* Non-secure state is unchanged */
if SCR[2]=1 /* FIQ trapped in Secure Monitor mode */
    R14_mon = address of the next instruction to be executed + 4
    SPSR_mon = CPSR
    CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [6] = 1 /* Disable fast interrupts */
    CPSR [7] = 1 /* Disable interrupts */
    CPSR [8] = 1 /* Disable imprecise aborts */
    CPSR [9] = Secure EE-bit /* store value of secure Ctrl Reg bit[25] */
    CPSR[24] = 0 /* Clear J bit */
    PC = Monitor_Base_Address + 0x0000001C
Else
    /* SCR[4] (bit FW) must be set to avoid infinite loop until FIQ is asserted */
    R14_fiq = address of the next instruction to be executed + 4
    SPSR_fiq = CPSR
    CPSR [4:0] = 0b10001 /* Enter FIQ mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [6] = 1 /* Disable fast interrupts */
    CPSR [7] = 1 /* Disable interrupts */
    If SCR[5]=1 (bit AW)
        CPSR [8] = 1 /* Disable imprecise aborts */
    Else
        CPSR [8] = UNCHANGED
    CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
    CPSR[24] = 0 /* Clear J bit */
    if high vectors configured then
        PC = 0xFFFF001C
    else
        PC = Non_Secure_Base_Address + 0x0000001C
Secure Monitor Call Exception
On an SMC:
If (UserMode) /* undefined instruction */
    R14_und = address of the next instruction after the SMC instruction
    SPSR_und = CPSR
    CPSR [4:0] = 0b11011 /* Enter undefined instruction mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [7] = 1 /* Disable interrupts */
    CPSR [9] = Non-secure EE-bit /* store value of NS Control Reg[25] */
    CPSR[24] = 0 /* Clear J bit */
    If high vectors configured then
        PC = 0xFFFF0004
    else
        PC = Non_Secure_Base_Address + 0x00000004
else
    R14_mon = address of the next instruction after the SMC instruction
    SPSR_mon = CPSR
    CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
    CPSR [5] = 0 /* Execute in ARM state */
    CPSR [6] = 1 /* Disable fast interrupts */
    CPSR [7] = 1 /* Disable interrupts */
    CPSR [8] = 1 /* Disable imprecise aborts */
    CPSR [9] = Secure EE-bit /* store value of secure Ctrl Reg bit[25] */
    CPSR[24] = 0 /* Clear J bit */
    PC = Monitor_Base_Address + 0x00000008 /* SMC vectored to the conventional SVC vector */
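The two outcomes of an SMC executed in the Non-secure world can be sketched as a small decision function. This is a host-side model with hypothetical names. It only captures the mode and branch target, not the register and CPSR updates.

```python
def nonsecure_smc_target(user_mode, high_vectors, ns_base, monitor_base):
    """Return (mode, pc) for an SMC executed in the Non-secure world."""
    if user_mode:
        # An SMC from User mode is treated as an undefined instruction.
        base = 0xFFFF0000 if high_vectors else ns_base
        return ("undefined", (base + 0x04) & 0xFFFFFFFF)
    # From a privileged mode the core enters Secure Monitor mode, using
    # the offset of the conventional SVC vector.
    return ("monitor", (monitor_base + 0x08) & 0xFFFFFFFF)
```

The high vectors setting affects only the User mode case, because the Monitor base address is not overridden by high vectors.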
Exceptions occurring in Secure world
The behavior in Secure state is identical to that in Non-secure state, except that
Secure_Base_Address is used instead of Non_Secure_Base_Address and that CPSR[6], F bit,
and CPSR[8], A bit, are updated regardless of bits [5:4] of the Secure Configuration Register.
Apart from Reset, the software model does not expect any exception to occur in Secure
Monitor mode. However, if an exception does occur in Secure Monitor mode, the NS bit in the
SCR is automatically reset and the core branches to the exception handler either in the Secure
world or in Secure Monitor mode. It branches to Secure Monitor mode for IRQs, FIQs, or
external aborts that have the corresponding bit set in SCR[3:1].
The following exceptions occur in the Secure world:
•
Reset
•
Undefined instruction on page 2-54
•
Supervisor call exception on page 2-54
•
External Prefetch Abort on page 2-54
•
Internal Prefetch Abort on page 2-55
•
External Data Abort on page 2-55
•
Internal Data Abort on page 2-55
•
Interrupt request (IRQ) exception on page 2-56
•
Fast Interrupt Request (FIQ) exception on page 2-56
•
Secure Monitor Call Exception on page 2-57.
Reset
When Reset is de-asserted:
/* Stay in secure state */
R14_svc = UNPREDICTABLE value
SPSR_svc = UNPREDICTABLE value
CPSR [4:0] = 0b10011 /* Enter supervisor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of Secure Control Register bit[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF0000
else
PC = 0x00000000
Undefined instruction
On an undefined instruction:
/* secure state is unchanged */
R14_und = address of the next instruction after the undefined instruction
SPSR_und = CPSR
CPSR [4:0] = 0b11011 /* Enter undefined Instruction mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF0004
else
PC = Secure_Base_Address + 0x00000004
Supervisor call exception
On a SVC:
/* secure state is unchanged */
R14_svc = address of the next instruction after the SVC instruction
SPSR_svc = CPSR
CPSR [4:0] = 0b10011 /* Enter supervisor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF0008
else
PC = Secure_Base_Address + 0x00000008
External Prefetch Abort
On an external prefetch abort:
/* secure state is unchanged */
if SCR[3]=1 /* external prefetch aborts trapped to Secure Monitor mode */
R14_mon = address of the aborted instruction + 4
SPSR_mon = CPSR
CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
PC = Monitor_Base_Address + 0x0000000C
Else
R14_abt = address of the aborted instruction + 4
SPSR_abt = CPSR
CPSR [4:0] = 0b10111 /* Enter abort mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF000C
else
PC = Secure_Base_Address + 0x0000000C
Internal Prefetch Abort
On an internal prefetch abort:
/* secure state is unchanged */
R14_abt = address of the aborted instruction + 4
SPSR_abt = CPSR
CPSR [4:0] = 0b10111 /* Enter abort mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF000C
else
PC = Secure_Base_Address + 0x0000000C
External Data Abort
On an External Precise Data Abort or on an External Imprecise Abort with CPSR[8]=0 (A bit):
/* secure state is unchanged */
if SCR[3]=1 /* external aborts trapped to Secure Monitor mode */
R14_mon = address of the aborted instruction + 8
SPSR_mon = CPSR
CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
PC = Monitor_Base_Address + 0x00000010
Else /* external Aborts trapped in abort mode */
R14_abt = address of the aborted instruction + 8
SPSR_abt = CPSR
CPSR [4:0] = 0b10111 /* Enter abort mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF0010
else
PC = Secure_Base_Address + 0x00000010
Internal Data Abort
On an Internal Data Abort, that is, any Data Abort that is not an external abort. These are
Data Aborts from L1 memory management that occur when the MMU detects a fault:
/* secure state is unchanged */
R14_abt = address of the aborted instruction + 8
SPSR_abt = CPSR
CPSR [4:0] = 0b10111 /* Enter abort mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF0010
else
PC = Secure_Base_Address + 0x00000010
Interrupt request (IRQ) exception
On an Interrupt Request when CPSR[7], the I bit, is 0:
/* secure state is unchanged */
if SCR[1]=1
/* IRQ trapped in Secure Monitor mode */
R14_mon = address of the next instruction to be executed + 4
SPSR_mon = CPSR
CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
PC = Monitor_Base_Address + 0x00000018
else
R14_irq = address of the next instruction to be executed + 4
SPSR_irq = CPSR
CPSR [4:0] = 0b10010 /* Enter IRQ mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if VE == 0 /* Core with VIC port only */
if high vectors configured then
PC = 0xFFFF0018
else
PC = Secure_Base_Address + 0x00000018
else
PC = IRQADDR
Fast Interrupt Request (FIQ) exception
On a Fast Interrupt Request when CPSR[6], the F bit, is 0:
/* secure state is unchanged */
if SCR[2]=1 /* FIQ trapped in Secure Monitor mode */
R14_mon = address of the next instruction to be executed + 4
SPSR_mon = CPSR
CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
PC = Monitor_Base_Address + 0x0000001C
else
R14_fiq = address of the next instruction to be executed + 4
SPSR_fiq = CPSR
CPSR [4:0] = 0b10001 /* Enter FIQ mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
if high vectors configured then
PC = 0xFFFF001C
else
PC = Secure_Base_Address + 0x0000001C
Secure Monitor Call Exception
On a SMC:
If (UserMode) /* undefined instruction */
R14_und = address of the next instruction after the SMC instruction
SPSR_und = CPSR
CPSR [4:0] = 0b11011 /* Enter undefined instruction mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [7] = 1 /* Disable interrupts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
If high vectors configured then
PC = 0xFFFF0004
else
PC = Secure_Base_Address + 0x00000004
else
R14_mon = address of the next instruction after the SMC instruction
SPSR_mon = CPSR
CPSR [4:0] = 0b10110 /* Enter Secure Monitor mode */
CPSR [5] = 0 /* Execute in ARM state */
CPSR [6] = 1 /* Disable fast interrupts */
CPSR [7] = 1 /* Disable interrupts */
CPSR [8] = 1 /* Disable imprecise aborts */
CPSR [9] = Secure EE-bit /* store value of secure Control Reg[25] */
CPSR[24] = 0 /* Clear J bit */
PC = Monitor_Base_Address + 0x00000008
/* SMC vectored to the */
/*conventional SVC vector */
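The Secure-world entry sequences above all follow one pattern: save the CPSR, select the target mode, set the mask bits, then pick a vector. As a minimal sketch of that pattern, the following C model mirrors the Secure-world IRQ pseudocode only. The type and function names (core_cfg, entry_result, secure_irq_entry) are hypothetical and exist only for illustration. The model ignores the R14 update and is not a substitute for the pseudocode.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPSR fields used by the pseudocode above. */
#define CPSR_MODE_MASK 0x1Fu
#define CPSR_T (1u << 5)   /* Thumb state bit */
#define CPSR_F (1u << 6)   /* FIQ mask */
#define CPSR_I (1u << 7)   /* IRQ mask */
#define CPSR_A (1u << 8)   /* imprecise abort mask */
#define CPSR_E (1u << 9)   /* endianness, loaded from the Secure EE-bit */
#define CPSR_J (1u << 24)  /* Jazelle state bit */
#define MODE_IRQ 0x12u     /* 0b10010 */
#define MODE_MON 0x16u     /* 0b10110 */
#define SCR_IRQ (1u << 1)  /* SCR[1]: IRQ trapped to Secure Monitor mode */

/* Hypothetical container for the configuration the pseudocode reads. */
typedef struct {
    uint32_t scr;          /* Secure Configuration Register */
    bool secure_ee;        /* Secure Control Register bit[25] */
    bool high_vectors;     /* high vectors configured */
    bool ve;               /* VE bit: vector supplied as IRQADDR */
    uint32_t secure_base;  /* Secure_Base_Address */
    uint32_t monitor_base; /* Monitor_Base_Address */
    uint32_t irqaddr;      /* IRQADDR */
} core_cfg;

typedef struct {
    uint32_t cpsr; /* CPSR after exception entry */
    uint32_t spsr; /* value saved in SPSR_irq or SPSR_mon */
    uint32_t pc;   /* vector address */
} entry_result;

/* Model of IRQ entry in the Secure world with CPSR[7] = 0. */
entry_result secure_irq_entry(uint32_t cpsr, const core_cfg *c)
{
    entry_result r;
    uint32_t n = cpsr & ~(CPSR_MODE_MASK | CPSR_T | CPSR_J | CPSR_E);

    r.spsr = cpsr;
    n |= CPSR_I | CPSR_A;          /* both paths set the I and A bits */
    if (c->secure_ee)
        n |= CPSR_E;

    if (c->scr & SCR_IRQ) {        /* trapped to Secure Monitor mode */
        n |= MODE_MON | CPSR_F;    /* Monitor entry also sets the F bit */
        r.pc = c->monitor_base + 0x18u;
    } else {
        n |= MODE_IRQ;             /* F bit keeps its previous value */
        if (c->ve)
            r.pc = c->irqaddr;
        else
            r.pc = c->high_vectors ? 0xFFFF0018u : c->secure_base + 0x18u;
    }
    r.cpsr = n;
    return r;
}
```

The same skeleton can be adapted for the FIQ and external abort cases by changing the SCR bit that is tested, the target mode, and the vector offset.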
2.12.17 Exception priorities
When multiple exceptions arise at the same time, a fixed priority system determines the order
in which they are handled. Table 2-9 lists the order of exception priorities.
Table 2-9 Exception priorities
Priority      Exception
1 (Highest)   Reset
2             Precise Data Abort
3             FIQ
4             IRQ
5             Prefetch Abort
6             Imprecise Data Abort
7 (Lowest)    BKPT, Undefined Instruction, SVC, SMC
Some exceptions cannot occur together:
•
The BKPT, undefined instruction, SMC, and SVC exceptions are mutually exclusive.
Each corresponds to a particular, non-overlapping, decoding of the current instruction.
•
When FIQs are enabled, and a precise Data Abort occurs at the same time as an FIQ, the
processor enters the Data Abort handler, and proceeds immediately to the FIQ vector.
A normal return from the FIQ causes the Data Abort handler to resume execution.
Precise Data Aborts must have higher priority than FIQs to ensure that the transfer error
does not escape detection. You must add the time for this exception entry to the worst-case
FIQ latency calculations in a system that uses aborts to support virtual memory.
The FIQ handler must not access any memory that can generate a Data Abort, because the
initial Data Abort exception condition is lost if this happens.
Note
If the data abort is a precise external abort and bit 3 (EA) of SCR is set, the processor enters
Secure Monitor mode where aborts and FIQs are disabled automatically. Therefore, the
processor does not proceed to FIQ vector immediately afterwards.
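The fixed ordering in Table 2-9 can be illustrated with a small C sketch that picks the highest-priority exception from a set of pending ones. The enum and function names are hypothetical. The four mutually exclusive instruction-decode exceptions (BKPT, Undefined Instruction, SVC, SMC) are folded into one entry because they share priority 7. The sketch models only the ordering, not the Data Abort and FIQ interaction described above.

```c
#include <stdint.h>

/* Enumerators are declared in Table 2-9 priority order, highest first. */
enum exc {
    EXC_RESET,
    EXC_PRECISE_DATA_ABORT,
    EXC_FIQ,
    EXC_IRQ,
    EXC_PREFETCH_ABORT,
    EXC_IMPRECISE_DATA_ABORT,
    EXC_INSTRUCTION,  /* BKPT, Undefined Instruction, SVC, or SMC */
    EXC_NONE
};

/* pending has bit n set when the exception with enum value n is pending. */
enum exc highest_priority(uint32_t pending)
{
    for (int e = EXC_RESET; e < EXC_NONE; e++) {
        if (pending & (1u << e))
            return (enum exc)e;
    }
    return EXC_NONE;
}
```

For example, a pending mask with both the FIQ and Precise Data Abort bits set selects the Precise Data Abort, which matches the requirement that the abort handler is entered before the FIQ vector.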
2.13 Software considerations
When using the processor you must consider the following software issues:
•
Branch Target Address Cache flush
•
Waiting for DMA to complete.
2.13.1 Branch Target Address Cache flush
When the processor switches from the Secure to the Non-secure state the Secure Monitor code
is responsible for flushing the BTAC if necessary. See About program flow prediction on
page 5-2 for more information.
2.13.2 Waiting for DMA to complete
When software must wait for the DMA to generate an interrupt that indicates completion of a
transfer between external memory and an Instruction TCM, prioritization between core requests
from a tight loop and the DMA can lock the DMA out from writing to the TCM, which freezes the
system. To avoid this, two mechanisms are recommended:
1.
Use the WFI operation in the wait loop to freeze core execution while permitting the DMA
to continue. Standby mode is not entered in this case, because the running DMA prevents
entry. See Standby mode on page 10-3 for more details.
2.
Include at least five instructions, which can include NOP instructions, in the wait loop.
For details of the WFI operation see c7, Cache operations on page 3-69.
Note
In the ARM1176 instruction set, WFI is a valid instruction but is treated as a NOP.
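As a rough sketch of the first mechanism, the following C fragment waits on a completion flag and issues the CP15 Wait For Interrupt operation on each pass. Because the WFI instruction itself is treated as a NOP, the sketch uses the c7 operation instead. The encoding MCR p15, 0, <Rd>, c7, c0, 4 is taken from general knowledge of ARM11 cores and should be checked against the c7 section referenced above. The flag dma_done and both function names are hypothetical, and the sketch assumes that a DMA completion interrupt handler sets the flag. The CP15 write requires a privileged mode. On a non-ARM host the operation compiles to nothing.

```c
#include <stdint.h>

/* Issue the CP15 Wait For Interrupt operation (assumed encoding, see text). */
static inline void wfi_cp15(void)
{
#if defined(__arm__) && !defined(__thumb__)
    uint32_t zero = 0;
    __asm__ volatile("mcr p15, 0, %0, c7, c0, 4" : : "r"(zero) : "memory");
#endif
}

/*
 * Wait until the hypothetical completion flag becomes nonzero.
 * Returns the number of times the core waited, for illustration only.
 */
unsigned wait_for_dma(volatile const uint32_t *dma_done)
{
    unsigned waits = 0;

    while (*dma_done == 0) {
        wfi_cp15();  /* core stops issuing requests, DMA keeps running */
        waits++;
    }
    return waits;
}
```

The design point is that the core stops competing with the DMA for the TCM while it waits, rather than polling the flag in a tight loop.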
Chapter 3
System Control Coprocessor
This chapter describes the purpose of the system control coprocessor, its structure, operation, and
how to use it. It contains the following sections:
•
About the system control coprocessor on page 3-2
•
System control processor registers on page 3-14.
3.1 About the system control coprocessor
This section gives an overall view of the system control coprocessor. For details of the registers
in the system control coprocessor, see System control processor registers on page 3-14.
The purpose of the system control coprocessor, CP15, is to control and provide status
information for the functions implemented in the ARM1176JZ-S processor. The main functions
of the system control coprocessor are:
•
overall system control and configuration
•
cache configuration and management
•
Tightly-Coupled Memory (TCM) configuration and management
•
Memory Management Unit (MMU) configuration and management
•
DMA control
•
system performance monitoring.
The system control coprocessor does not exist in a distinct physical block of logic.
3.1.1 System control coprocessor functional groups
The system control coprocessor appears as a set of 32-bit registers that you can write to and read
from. Some of the registers permit more than one type of operation. The functional groups for
the registers are:
•
System control and configuration on page 3-5
•
MMU control and configuration on page 3-6
•
Cache control and configuration on page 3-7
•
TCM control and configuration on page 3-8
•
Cache Master Valid Registers on page 3-8
•
DMA control on page 3-9
•
System performance monitor on page 3-10
•
System validation on page 3-11.
The system control coprocessor controls the TrustZone operation of the processor:
•
some of the registers are only accessible in the Secure world
•
some of the registers are banked for Secure and Non-secure worlds
•
some of the registers are common to both worlds.
Note
When Secure Monitor mode is active the core is in the Secure world. The processor treats all
accesses as Secure and the system control coprocessor behaves as if it operates in the Secure
world regardless of the value of the NS bit, see c1, Secure Configuration Register on page 3-52.
In Secure Monitor mode, the NS bit defines the copies of the banked registers in the system
control coprocessor that the processor can access:
NS = 0
Access to Secure world CP15 registers
NS = 1
Access to Non-secure world CP15 registers.
Registers that are only accessible in the Secure world are always accessible in Secure Monitor
mode, regardless of the value of the NS bit.
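The access rules above can be summarized in a short C model. This is a sketch under stated assumptions: the types and function names are hypothetical, privilege checks are ignored, and outside Secure Monitor mode the NS bit is taken to indicate the current world.

```c
#include <stdbool.h>
#include <stdint.h>

/* A CP15 register that is banked for the Secure and Non-secure worlds. */
typedef struct {
    uint32_t secure;
    uint32_t non_secure;
} banked_reg;

/*
 * Select the copy of a banked register that an access reaches.
 * In Secure Monitor mode the core is Secure, but the NS bit still
 * chooses the copy. Outside Monitor mode the NS bit reflects the world.
 */
uint32_t *banked_select(banked_reg *r, bool ns_bit)
{
    return ns_bit ? &r->non_secure : &r->secure;
}

/* Registers that are Secure-only are always reachable from Monitor mode. */
bool secure_only_accessible(bool monitor_mode, bool ns_bit)
{
    return monitor_mode || !ns_bit;
}
```

A monitor that needs to save Non-secure context could, in this model, set NS = 1, read the banked copies, and still reach Secure-only registers without leaving Secure Monitor mode.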
Table 3-1 on page 3-3 lists the overall functionality for the system control coprocessor as it
relates to its registers.
Table 3-2 on page 3-15 lists the registers in the system control coprocessor in register order and
gives their reset values.
Table 3-1 System control coprocessor register functions
Function
    Register/operation                        Reference to description

System control and configuration
    Control                                   c1, Control Register on page 3-44
    Auxiliary control                         c1, Auxiliary Control Register on page 3-49
    Secure Configuration                      c1, Secure Configuration Register on page 3-52
    Secure Debug Enable                       c1, Secure Debug Enable Register on page 3-54
    Non-Secure Access Control                 c1, Non-Secure Access Control Register on page 3-55
    Coprocessor Access Control                c1, Coprocessor Access Control Register on page 3-51
    Secure or Non-secure Vector Base Address  c12, Secure or Non-secure Vector Base Address Register on page 3-121
    Monitor Vector Base Address               c12, Monitor Vector Base Address Register on page 3-122
    ID code (a)                               c0, Main ID Register on page 3-20
    Feature ID, CPUID scheme                  c0, CPUID registers on page 3-26

MMU control and configuration
    TLB Type                                  c0, TLB Type Register on page 3-25
    Translation Table Base 0                  c2, Translation Table Base Register 0 on page 3-57
    Translation Table Base 1                  c2, Translation Table Base Register 1 on page 3-59
    Translation Table Base Control            c2, Translation Table Base Control Register on page 3-61
    Domain Access Control                     c3, Domain Access Control Register on page 3-63
    Data Fault Status                         c5, Data Fault Status Register on page 3-64
    Instruction Fault Status                  c5, Instruction Fault Status Register on page 3-66
    Fault Address                             c6, Fault Address Register on page 3-68
    Instruction Fault Address                 c6, Instruction Fault Address Register on page 3-69
    Watchpoint Fault Address                  c6, Watchpoint Fault Address Register on page 3-69
    TLB Operations                            c8, TLB Operations Register on page 3-86
    TLB Lockdown                              c10, TLB Lockdown Register on page 3-100
    Memory Region Remap                       c10, Memory region remap registers on page 3-101
    Peripheral Port Memory Remap              c15, Peripheral Port Memory Remap Register on page 3-130
    Context ID                                c13, Context ID Register on page 3-127
    FCSE PID                                  c13, FCSE PID Register on page 3-125
    Thread And Process ID                     c13, Thread and process ID registers on page 3-128
    TLB Lockdown Access                       c15, TLB lockdown access registers on page 3-149

Cache control and configuration
    Cache Type                                c0, Cache Type Register on page 3-21
    Cache Operations                          c7, Cache operations on page 3-69
    Data Cache Lockdown                       c9, Data and instruction cache lockdown registers on page 3-88
    Instruction Cache Lockdown                c9, Data and instruction cache lockdown registers on page 3-88
    Cache Behavior Override                   c9, Cache Behavior Override Register on page 3-98

TCM control and configuration
    TCM Status                                c0, TCM Status Register on page 3-24
    Data TCM Region                           c9, Data TCM Region Register on page 3-90
    Instruction TCM Region                    c9, Instruction TCM Region Register on page 3-92
    Data TCM Non-secure Access Control        c9, Data TCM Non-secure Control Access Register on page 3-94
    Instruction TCM Non-secure Access Control c9, Instruction TCM Non-secure Control Access Register on page 3-95
    TCM Selection                             c9, TCM Selection Register on page 3-97

Cache Master Valid
    Instruction Cache Master Valid            c15, Instruction Cache Master Valid Register on page 3-147
    Data Cache Master Valid                   c15, Data Cache Master Valid Register on page 3-148

DMA control
    DMA Identification and Status             c11, DMA identification and status registers on page 3-105
    DMA User Accessibility                    c11, DMA User Accessibility Register on page 3-107
    DMA Channel Number                        c11, DMA Channel Number Register on page 3-109
    DMA enable                                c11, DMA enable registers on page 3-110
    DMA Control                               c11, DMA Control Register on page 3-111
    DMA Internal Start Address                c11, DMA Internal Start Address Register on page 3-114
    DMA External Start Address                c11, DMA External Start Address Register on page 3-115
    DMA Internal End Address                  c11, DMA Internal End Address Register on page 3-116
    DMA Channel Status                        c11, DMA Channel Status Register on page 3-117
    DMA Context ID                            c11, DMA Context ID Register on page 3-120

System performance monitor
    Performance Monitor Control               c15, Performance Monitor Control Register on page 3-133
    Cycle Counter                             c15, Cycle Counter Register on page 3-137
    Count Register 0                          c15, Count Register 0 on page 3-138
    Count Register 1                          c15, Count Register 1 on page 3-139

System validation
    Secure User and Non-secure Access Validation Control
                                              c15, Secure User and Non-secure Access Validation Control Register on page 3-132
    System Validation Counter                 c15, System Validation Counter Register on page 3-140
    System Validation Operations              c15, System Validation Operations Register on page 3-142
    System Validation Cache Size Mask         c15, System Validation Cache Size Mask Register on page 3-145

a. Returns device ID code.
3.1.2 System control and configuration
The purpose of the system control and configuration registers is to provide overall management
of:
•
TrustZone behavior
•
memory functionality
•
interrupt behavior
•
exception handling
•
program flow prediction
•
coprocessor access rights for CP0-CP13.
The system control and configuration registers also provide the processor ID.
The system control and configuration registers consist of three 32-bit read only registers and
eight 32-bit read/write registers. Figure 3-1 shows the arrangement of registers in this functional
group.
[Figure 3-1: register map giving the CRn (c0, c1, c12), Opcode_1, CRm, and Opcode_2 encoding and the
access type of each register in the group: ID Code Register, CPUID Registers, Control Register,
Auxiliary Control Register, Coprocessor Access Control Register, Secure Configuration Register,
Secure Debug Enable Register, Non-secure Access Control Register, Non-secure or Secure Vector
Base Address Register, Monitor Vector Base Address Register, and Interrupt Status Register.]
Figure 3-1 System control and configuration registers
To use the system control and configuration registers you read or write individual registers that
make up the group, see Use of the system control coprocessor on page 3-12.
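The CRn, Opcode_1, CRm, and Opcode_2 columns in these register maps are the operand fields of the MRC and MCR coprocessor instructions that read and write CP15. As an illustration, the following C sketch assembles the 32-bit ARM-state instruction word from those fields. The function name cp15_encode is hypothetical, and the sketch always uses the AL (always) condition code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build the ARM-state encoding of
 *   MRC p15, opc1, Rd, CRn, CRm, opc2   (read  = true)
 *   MCR p15, opc1, Rd, CRn, CRm, opc2   (read  = false)
 * with the AL condition code.
 */
uint32_t cp15_encode(bool read, unsigned opc1, unsigned rd,
                     unsigned crn, unsigned crm, unsigned opc2)
{
    uint32_t insn = 0xEE000F10u;   /* cond = AL, coprocessor 15, bit[4] = 1 */

    if (read)
        insn |= 1u << 20;          /* L bit: 1 for MRC, 0 for MCR */
    insn |= (opc1 & 0x7u) << 21;   /* Opcode_1 */
    insn |= (crn & 0xFu) << 16;    /* CRn */
    insn |= (rd & 0xFu) << 12;     /* ARM register Rd */
    insn |= (opc2 & 0x7u) << 5;    /* Opcode_2 */
    insn |= (crm & 0xFu);          /* CRm */
    return insn;
}
```

For example, reading the ID Code Register into R0 with MRC p15, 0, R0, c0, c0, 0 corresponds to cp15_encode(true, 0, 0, 0, 0, 0), which yields 0xEE100F10.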
Some of the functionality depends on how you set external signals at reset.
System control and configuration behaves in three ways:
•
as a set of flags or enables for specific functionality
•
as a set of numbers, values that indicate system functionality
•
as a set of addresses for processes in memory.
3.1.3 MMU control and configuration
The purpose of the MMU control and configuration registers is to:
•
allocate physical address locations from the Virtual Addresses (VAs) that the processor
generates
•
control program access to memory
•
designate areas of memory as either:
— Noncacheable
— unbufferable
— Noncacheable and unbufferable.
•
detect MMU faults and external aborts
•
hold thread and process IDs
•
provide direct access to the TLB lockdown entries.
The MMU control and configuration registers consist of one 32-bit read-only register, one 32-bit
write-only register, and 22 32-bit read/write registers. Figure 3-2 on page 3-7 shows the
arrangement of registers in this functional group.
[Figure 3-2: register map giving the CRn, Opcode_1, CRm, and Opcode_2 encoding and the access type
of each register in the group: TLB Type Register, Translation Table Base Register 0, Translation
Table Base Register 1, Translation Table Base Control Register, Domain Access Control Register,
Data Fault Status Register, Instruction Fault Status Register, Fault Address Register, Watchpoint
Fault Address Register, Instruction Fault Address Register, TLB Operations Register, TLB Lockdown
Register, the memory region remap registers (Primary Region Remap Register and Normal Memory Remap
Register), FCSE PID Register, Context ID Register, the thread and process ID registers (User
Read/Write, User Read Only, and Privileged Only Thread and Process ID Registers), Peripheral Port
Memory Remap Register, and the TLB lockdown access registers (TLB Lockdown Index, VA, PA, and
Attributes Registers).]
Figure 3-2 MMU control and configuration registers
To use the MMU control and configuration registers you read or write individual registers that
make up the group, see Use of the system control coprocessor on page 3-12.
MMU control and configuration behaves in three ways:
•
as a set of numbers, values that describe aspects of the MMU or indicate its current state
•
as a set of addresses for tables in memory
•
as a set of operations that act on the MMU.
3.1.4 Cache control and configuration
The purpose of the cache control and configuration registers is to:
•
provide information on the size and architecture of the instruction and data caches
•
control instruction and data cache lockdown
•
control cache maintenance operations that include clean and invalidate caches, drain and
flush buffers, and address translation
•
override cache behavior during debug or interruptible cache operations.
The cache control and configuration registers consist of one 32-bit read only register and four
32-bit read/write registers. Figure 3-3 on page 3-8 shows the arrangement of the registers in this
functional group.
[Figure 3-3: register map giving the CRn (c0, c7, c9), Opcode_1, CRm, and Opcode_2 encoding and the
access type of each register in the group: Cache Type Register, Cache Operations Register, Data
Cache Lockdown Register, Instruction Cache Lockdown Register, and Cache Behavior Override Register.]
Figure 3-3 Cache control and configuration registers
To use the cache control and configuration registers you read or write individual registers that
make up the group, see Use of the system control coprocessor on page 3-12.
Cache control and configuration registers behave as:
•
a set of numbers, values that describe aspects of the caches
•
a set of bits that enable specific cache functionality
•
a set of operations that act on the caches.
3.1.5 TCM control and configuration
The purpose of the TCM control and configuration registers is to:
•
inform the processor about the status of the TCM regions
•
define TCM regions.
The TCM control and configuration registers consist of one 32-bit read-only register and five
32-bit read/write registers. Figure 3-4 shows the arrangement of registers.
[Figure 3-4: register map giving the CRn (c0, c9), Opcode_1, CRm, and Opcode_2 encoding and the
access type of each register in the group: TCM Status Register, Data TCM Region Register,
Instruction TCM Region Register, Data TCM Non-secure Access Control Register, Instruction TCM
Non-secure Access Control Register, and TCM Selection Register.]
Figure 3-4 TCM control and configuration registers
To use the TCM control and configuration registers you read or write individual registers that
make up the group, see Use of the system control coprocessor on page 3-12.
TCM control and configuration behaves in three ways:
•
as a set of numbers, values that describe aspects of the TCMs
•
as a set of bits that enable specific TCM functionality
•
as a set of addresses that define the memory locations of data stored in the TCMs.
3.1.6 Cache Master Valid Registers
The purpose of the Cache Master Valid Registers is to hold the state of the Master Valid bits of
the instruction and data caches.
The Cache Master Valid Registers consist of two 32-bit read/write registers. Figure 3-5 on page 3-9
shows the arrangement of registers in this functional group.
[Figure 3-5: register map for CRn c15, Opcode_1 3, showing the Instruction Cache Master Valid
Register (CRm c8) and the Data Cache Master Valid Register (CRm c12), with their access types.]
Figure 3-5 Cache Master Valid Registers
To use the Cache Master Valid Registers you read or write the individual registers that make up
the group, see Use of the system control coprocessor on page 3-12.
The Cache Master Valid Registers behave as a set of bits that define the cache contents as valid
or invalid. The number of bits is a function of the cache size.
3.1.7 DMA control
The purpose of the DMA control registers is to:
•
enable software to control DMA
•
transfer large blocks of data between the TCM and an external memory
•
determine accessibility
•
select DMA channel.
The Enable, Control, Internal Start Address, External Start Address, Internal End Address,
Channel Status, and Context ID Registers are multiple registers with one register of each for
each channel that is implemented.
The DMA control registers consist of five 32-bit read-only registers, three 32-bit write-only
registers and seven 32-bit read/write registers. Figure 3-6 shows the arrangement of registers.
[Figure 3-6: register map for CRn c11 giving the Opcode_1, CRm, and Opcode_2 encoding and the access
type of each register in the group: the DMA Identification and Status Registers (Present, Queued,
Running, Interrupting), DMA User Accessibility Register, DMA Channel Number Register, the DMA
Enable Registers (Stop, Start, Clear), DMA Control Register, DMA Internal Start Address Register,
DMA External Start Address Register, DMA Internal End Address Register, DMA Channel Status
Register, and DMA Context ID Register. The figure marks the registers that exist once per channel,
selected by the DMA Channel Number Register.]
Figure 3-6 DMA control and configuration registers
To use the DMA control and configuration registers you read or write the individual registers
that make up the group, see Use of the system control coprocessor on page 3-12.
Code can execute several DMA operations while in User mode if these operations are enabled
by the DMA User Accessibility Register.
If code running in User mode attempts a privileged operation on the DMA control registers, the
processor takes an Undefined instruction trap.
The DMA control registers specify the block of data for transfer, the destination of the
transfer, and the direction of the DMA. For more details on the operation see
DMA on page 7-10.
DMA control behaves in four ways:
•
as a set of numbers, values that describe aspects of the DMA channels or indicate their
current state
•
as a set of bits that enable specific DMA functionality
•
as a set of addresses that define the memory locations of data for transfer
•
as a set of operations that act on the DMA channels.
3.1.8 System performance monitor
The purpose of the performance monitor registers is to:
•
control the monitoring operation
•
count events.
The system performance monitor consists of four 32-bit read/write registers. Figure 3-7 shows
the arrangement of registers in this functional group.
[Figure 3-7: register map for CRn c15, Opcode_1 0, CRm c12, with Opcode_2 values 0 to 3 selecting the
Performance Monitor Control Register, Cycle Counter Register, Count Register 0, and Count
Register 1, together with their access types.]
Figure 3-7 System performance monitor registers
To use the system performance monitor registers you read or write individual registers that make
up the group, see Use of the system control coprocessor on page 3-12.
Note
The counters are only enabled when the SPNIDEN input and the SUNIDEN bit, see c1, Secure
Debug Enable Register on page 3-54, are appropriately set. When the core is in a mode where
non-invasive debug is not permitted, events are not counted but the cycle count register, CCNT,
continues to count.
You cannot use the system performance monitor registers at the same time as the system
validation registers, because both sets of registers use the same physical counters. You must
disable one set of registers before you start to use the other set. See System validation on
page 3-11.
System performance monitoring counts system events, such as cache misses, TLB misses,
pipeline stalls, and other related features to enable system developers to profile the performance
of their systems. It can generate interrupts when the number of events reaches a given value.
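The counting rule in the Note above can be pictured with a small C model: the cycle counter advances on every cycle, while the two event counters advance only when non-invasive debug is permitted. This is a sketch only. The structure and function names are hypothetical, the single debug_permitted flag stands in for the combined effect of SPNIDEN, SUNIDEN, and the current mode, and the enable bits, overflow flags, and interrupt generation of the real registers are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the three counters. */
typedef struct {
    uint32_t ccnt;    /* Cycle Counter Register */
    uint32_t count0;  /* Count Register 0 */
    uint32_t count1;  /* Count Register 1 */
} perf_counters;

/* Advance the model by one cycle. */
void perf_tick(perf_counters *p, bool debug_permitted,
               bool event0, bool event1)
{
    p->ccnt++;                 /* CCNT continues to count in every case */
    if (debug_permitted) {     /* events count only when permitted */
        if (event0)
            p->count0++;
        if (event1)
            p->count1++;
    }
}
```

One consequence visible in the model is that a profile taken across a region where non-invasive debug is not permitted shows elapsed cycles but no events for that region.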
3.1.9
System validation
The system validation registers extend the use of the system performance monitor registers to
provide some functions for validation and must not be used for other purposes. The system
validation registers schedule and clear:
•
resets
•
interrupts
•
fast interrupts
•
external debug requests.
The system validation registers consist of four 32-bit read/write registers. Figure 3-8 shows the
arrangement of registers.
[Figure: CP15 c15 system validation registers. CRm = c9: Secure User and Non-secure Access Validation Control Register. CRm = c12: System Validation Counter Registers (reset counter, interrupt counter, fast interrupt counter, external debug request counter). CRm = c13: System Validation Operations Registers (start and stop operations for the reset, interrupt, fast interrupt, and external debug request counters, individually and in combination). CRm = c14: System Validation Cache Size Mask Register.]
Figure 3-8 System validation registers
The System Validation Counter Register and System Validation Operations Register reuse the
Cycle Counter Register, Count Register 0, and Count Register 1, see System performance
monitor on page 3-10, to schedule resets, interrupts, and fast interrupts respectively. External
debug requests are scheduled using an additional 6-bit counter that is not used by the system
performance monitor registers.
Each of the four counters counts upwards, and when the counter overflows, the corresponding
event occurs. To the core, the events are indistinguishable from ordinary external events. The
System Validation Registers provide functions for loading the counter registers with the
required number of clock cycles before the event occurs, and starting, stopping and clearing the
counters, to return them to their System performance monitor functionality.
The System Validation Registers are usually only accessible from Secure privileged modes, but
a Secure User and Non-secure Access Validation Control Register is provided to permit access
to the System Validation Registers from User modes and Non-secure modes.
The System Validation Cache Size Mask Register masks the physical size of the caches and
TCMs to make their size appear different to the processor. You can use this in validation by
simulation, but you must not use it in a manufactured device because it can corrupt correct
operation of the processor.
To use the system validation registers you read or write individual registers that make up the
group, see Use of the system control coprocessor.
You cannot use the System Validation Registers at the same time as the System Performance
Monitor Registers, because both sets of registers use the same physical counters. You must
disable one set of registers before starting to use the other set. See System performance monitor
on page 3-10.
System validation behaves in three ways:
•
as a set of bits that enable specific system validation functionality
•
as a set of operations that schedule and clear system validation events
•
as a set of numbers, values that describe aspects of the caches and TCMs for system
validation.
3.1.10
Use of the system control coprocessor
This section describes the general method for use of the system control coprocessor.
You can access system control coprocessor CP15 registers with MRC and MCR instructions.
MCR{cond} P15,<Opcode_1>,<Rd>,<CRn>,<CRm>,<Opcode_2>
MRC{cond} P15,<Opcode_1>,<Rd>,<CRn>,<CRm>,<Opcode_2>
Figure 3-9 shows the instruction bit pattern of MRC and MCR instructions.
Bit layout: [31:28] Cond | [27:24] 1 1 1 0 | [23:21] Opcode_1 | [20] L | [19:16] CRn | [15:12] Rd | [11:8] 1 1 1 1 | [7:5] Opcode_2 | [4] 1 | [3:0] CRm
Figure 3-9 CP15 MRC and MCR bit pattern
The CRn field of MRC and MCR instructions specifies the coprocessor register to access. The
CRm and Opcode_2 fields specify a particular operation when addressing registers. The L
bit distinguishes between an MRC (L=1) and an MCR (L=0).
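The field positions in Figure 3-9 can be checked by assembling an instruction word by hand. The following C sketch is illustrative only: the function name cp15_encode and its parameters are hypothetical, and the condition value 0xE (always) used in the usage note is taken from the general ARM instruction set encoding rather than from this section.

```c
#include <assert.h>
#include <stdint.h>

/* Assemble the 32-bit encoding of a CP15 MRC (is_read != 0, L = 1) or
 * MCR (is_read == 0, L = 0) instruction from the fields of Figure 3-9.
 * The name and parameter list are hypothetical, for illustration only. */
static uint32_t cp15_encode(unsigned cond, int is_read, unsigned opcode_1,
                            unsigned rd, unsigned crn, unsigned crm,
                            unsigned opcode_2)
{
    /* Each field must fit in its bit range. */
    assert(cond < 16 && rd < 16 && crn < 16 && crm < 16);
    assert(opcode_1 < 8 && opcode_2 < 8);

    return ((uint32_t)cond << 28)            /* [31:28] Cond           */
         | (UINT32_C(0xE) << 24)             /* [27:24] fixed b1110    */
         | ((uint32_t)opcode_1 << 21)        /* [23:21] Opcode_1       */
         | ((uint32_t)(is_read != 0) << 20)  /* [20]    L              */
         | ((uint32_t)crn << 16)             /* [19:16] CRn            */
         | ((uint32_t)rd << 12)              /* [15:12] Rd             */
         | (UINT32_C(0xF) << 8)              /* [11:8]  fixed b1111    */
         | ((uint32_t)opcode_2 << 5)         /* [7:5]   Opcode_2       */
         | (UINT32_C(1) << 4)                /* [4]     fixed 1        */
         | (uint32_t)crm;                    /* [3:0]   CRm            */
}
```

For example, cp15_encode(0xE, 1, 0, 0, 0, 0, 0) builds the word for MRC p15,0,r0,c0,c0,0 under the assumption that 0xE is the always condition.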
Instructions CDP, LDC, and STC, together with unprivileged MRC and MCR instructions to
privileged-only CP15 registers, and Non-secure accesses to Secure registers, cause the
processor to take the Undefined instruction trap.
Note
Attempting to read from a nonreadable register, or to write to a nonwriteable register causes
Undefined exceptions.
The Opcode_1, Opcode_2, and CRm fields Should Be Zero in all instructions that access CP15,
except when the values specified are used to select required operations. Using other values
results in Undefined exceptions.
In all cases, reading from or writing any data values to any CP15 registers, including those fields
specified as Unpredictable (UNP), Should Be One (SBO), or Should Be Zero (SBZ), does not
cause any physical damage to the chip.
3.2
System control processor registers
This section gives details of all the registers in the system control coprocessor. The section
presents a summary of the registers and detailed descriptions in register order of CRn,
Opcode_1, CRm, Opcode_2.
You can access CP15 registers with MRC and MCR instructions:
MCR{cond} P15,<Opcode_1>,<Rd>,<CRn>,<CRm>,<Opcode_2>
MRC{cond} P15,<Opcode_1>,<Rd>,<CRn>,<CRm>,<Opcode_2>
3.2.1
Register allocation
Table 3-2 on page 3-15 lists the allocation and reset values of the registers of the system control
coprocessor where:
•
CRn is the register number within CP15
•
Op1 is the Opcode_1 value for the register
•
CRm is the operational register
•
Op2 is the Opcode_2 value for the register.
•
Type applies to the Secure, S, or the Non-secure, NS, world and is:
— B, registers banked in Secure and Non-secure worlds. If the registers are not banked
then they are common to both worlds or only accessible in one world.
— NA, no access
— RO, read-only access
— RO, read-only access in privileged modes only
— R/W, read/write access
— R/W, read/write access in privileged modes only
— WO, write-only access
— WO, write-only access in privileged modes only
— X, access depends on another register or external signal.
Table 3-2 Summary of CP15 registers and operations
CRn | Op1 | CRm | Op2 | Register or operation | S type | NS type | Reset value | Page
c0 | 0 | c0 | 0 | Main ID | RO | RO | 0x41xFB76x [a] | page 3-20
c0 | 0 | c0 | 1 | Cache Type | RO | RO | 0x10152152 [b] | page 3-21
c0 | 0 | c0 | 2 | TCM Status | RO | RO | 0x00020002 [c] | page 3-24
c0 | 0 | c0 | 3 | TLB Type | RO | RO | 0x00000800 | page 3-25
c0 | 0 | c1 | 0 | Processor Feature 0 | RO | RO | 0x00000111 | page 3-27
c0 | 0 | c1 | 1 | Processor Feature 1 | RO | RO | 0x00000011 | page 3-28
c0 | 0 | c1 | 2 | Debug Feature 0 | RO | RO | 0x00000033 | page 3-29
c0 | 0 | c1 | 3 | Auxiliary Feature 0 | RO | RO | 0x00000000 | page 3-30
c0 | 0 | c1 | 4 | Memory Model Feature 0 | RO | RO | 0x01130003 | page 3-31
c0 | 0 | c1 | 5 | Memory Model Feature 1 | RO | RO | 0x10030302 | page 3-32
c0 | 0 | c1 | 6 | Memory Model Feature 2 | RO | RO | 0x01222100 | page 3-34
c0 | 0 | c1 | 7 | Memory Model Feature 3 | RO | RO | 0x00000000 | page 3-35
c0 | 0 | c2 | 0 | Instruction Set Feature Attribute 0 | RO | RO | 0x00140011 | page 3-36
c0 | 0 | c2 | 1 | Instruction Set Feature Attribute 1 | RO | RO | 0x12002111 | page 3-38
c0 | 0 | c2 | 2 | Instruction Set Feature Attribute 2 | RO | RO | 0x11231121 | page 3-39
c0 | 0 | c2 | 3 | Instruction Set Feature Attribute 3 | RO | RO | 0x01102131 | page 3-40
c0 | 0 | c2 | 4 | Instruction Set Feature Attribute 4 | RO | RO | 0x00001141 | page 3-42
c0 | 0 | c2 | 5 | Instruction Set Feature Attribute 5 | RO | RO | 0x00000000 | page 3-43
c0 | 0 | c2 | 6-7 | Reserved | - | - | - | -
c0 | 0 | c3-c7 | - | Reserved | - | - | - | -
c1 | 0 | c0 | 0 | Control | R/W, B [d], X | R/W | 0x00050078 [e] | page 3-44
c1 | 0 | c0 | 1 | Auxiliary Control | R/W | RO | 0x00000007 | page 3-49
c1 | 0 | c0 | 2 | Coprocessor Access Control | R/W | R/W | 0x00000000 | page 3-51
c1 | 0 | c1 | 0 | Secure Configuration | R/W | NA | 0x00000000 | page 3-52
c1 | 0 | c1 | 1 | Secure Debug Enable | R/W | NA | 0x00000000 | page 3-54
c1 | 0 | c1 | 2 | Non-Secure Access Control | R/W | RO | 0x00000000 | page 3-55
Table 3-2 Summary of CP15 registers and operations (continued)
CRn | Op1 | CRm | Op2 | Register or operation | S type | NS type | Reset value | Page
c2 | 0 | c0 | 0 | Translation Table Base 0 | R/W, B, X | R/W | 0x00000000 | page 3-57
c2 | 0 | c0 | 1 | Translation Table Base 1 | R/W, B | R/W | 0x00000000 | page 3-59
c2 | 0 | c0 | 2 | Translation Table Base Control | R/W, B, X | R/W | 0x00000000 | page 3-61
c3 | 0 | c0 | 0 | Domain Access Control | R/W, B, X | R/W | 0x00000000 | page 3-63
c4 | - | - | - | Not used | - | - | - | -
c5 | 0 | c0 | 0 | Data Fault Status | R/W, B | R/W | 0x00000000 | page 3-64
c5 | 0 | c0 | 1 | Instruction Fault Status | R/W, B | R/W | 0x00000000 | page 3-66
c6 | 0 | c0 | 0 | Fault Address | R/W, B | R/W | 0x00000000 | page 3-68
c6 | 0 | c0 | 1 | Watchpoint Fault Address | R/W | NA | 0x00000000 | page 3-69
c6 | 0 | c0 | 2 | Instruction Fault Address | R/W, B | R/W | 0x00000000 | page 3-69
c7 | 0 | c0 | 4 | Wait For Interrupt | WO | WO | - | page 3-85
c7 | 0 | c4 | 0 | PA | R/W, B | R/W | 0x00000000 | page 3-80
c7 | 0 | c5 | 0 | Invalidate Entire Instruction Cache | WO | WO, X | - | page 3-71
c7 | 0 | c5 | 1 | Invalidate Instruction Cache Line by MVA | WO | WO | - | page 3-71
c7 | 0 | c5 | 2 | Invalidate Instruction Cache Line by Index | WO | WO | - | page 3-71
c7 | 0 | c5 | 4 | Flush Prefetch Buffer | WO | WO | - | page 3-79
c7 | 0 | c5 | 6 | Flush Entire Branch Target Cache | WO | WO | - | page 3-79
c7 | 0 | c5 | 7 | Flush Branch Target Cache Entry by MVA | WO | WO | - | page 3-79
c7 | 0 | c6 | 0 | Invalidate Entire Data Cache | WO | NA | - | page 3-71
c7 | 0 | c6 | 1 | Invalidate Data Cache Line by MVA | WO | WO | - | page 3-71
c7 | 0 | c6 | 2 | Invalidate Data Cache Line by Index | WO | WO | - | page 3-71
c7 | 0 | c7 | 0 | Invalidate Both Caches | WO | NA | - | page 3-71
c7 | 0 | c8 | 0-3 | VA to PA translation in the current world | WO | WO | - | page 3-82
c7 | 0 | c8 | 4-7 | VA to PA translation in the other world | WO | NA | - | page 3-83
Table 3-2 Summary of CP15 registers and operations (continued)
CRn | Op1 | CRm | Op2 | Register or operation | S type | NS type | Reset value | Page
c7 | 0 | c10 | 0 | Clean Entire Data Cache | WO, X | WO, X | - | page 3-71
c7 | 0 | c10 | 1 | Clean Data Cache Line by MVA | WO | WO | - | page 3-71
c7 | 0 | c10 | 2 | Clean Data Cache Line by Index | WO | WO | - | page 3-71
c7 | 0 | c10 | 4 | Data Synchronization Barrier | WO | WO | - | page 3-84
c7 | 0 | c10 | 5 | Data Memory Barrier | WO | WO | - | page 3-85
c7 | 0 | c10 | 6 | Cache Dirty Status | RO, B | RO | 0x00000000 | page 3-78
c7 | 0 | c13 | 1 | Prefetch Instruction Cache Line | WO | WO | - | page 3-71
c7 | 0 | c14 | 0 | Clean and Invalidate Entire Data Cache | WO, X | WO, X | - | page 3-71
c7 | 0 | c14 | 1 | Clean and Invalidate Data Cache Line by MVA | WO | WO | - | page 3-71
c7 | 0 | c14 | 2 | Clean and Invalidate Data Cache Line by Index | WO | WO | - | page 3-71
c8 | 0 | c5 | 0 | Invalidate Instruction TLB unlocked entries | WO, B | WO | - | page 3-86
c8 | 0 | c5 | 1 | Invalidate Instruction TLB entry by MVA | WO, B | WO | - | page 3-86
c8 | 0 | c5 | 2 | Invalidate Instruction TLB entry on ASID match | WO, B | WO | - | page 3-86
c8 | 0 | c6 | 0 | Invalidate Data TLB unlocked entries | WO, B | WO | - | page 3-86
c8 | 0 | c6 | 1 | Invalidate Data TLB entry by MVA | WO, B | WO | - | page 3-86
c8 | 0 | c6 | 2 | Invalidate Data TLB entry on ASID match | WO, B | WO | - | page 3-86
c8 | 0 | c7 | 0 | Invalidate unified TLB unlocked entries | WO, B | WO | - | page 3-86
c8 | 0 | c7 | 1 | Invalidate unified TLB entry by MVA | WO, B | WO | - | page 3-86
c8 | 0 | c7 | 2 | Invalidate unified TLB entry on ASID match | WO, B | WO | - | page 3-86
Table 3-2 Summary of CP15 registers and operations (continued)
CRn | Op1 | CRm | Op2 | Register or operation | S type | NS type | Reset value | Page
c9 | 0 | c0 | 0 | Data Cache Lockdown | R/W | R/W, X | 0xFFFFFFF0 | page 3-88
c9 | 0 | c0 | 1 | Instruction Cache Lockdown | R/W | R/W, X | 0xFFFFFFF0 | page 3-88
c9 | 0 | c1 | 0 | Data TCM Region | R/W, X | R/W, X | 0x00000014 [f] | page 3-90
c9 | 0 | c1 | 1 | Instruction TCM Region | R/W, X | R/W, X | 0x00000014 [g] | page 3-92
c9 | 0 | c1 | 2 | Data TCM Non-secure Control Access | R/W, X | NA | 0x00000000 | page 3-94
c9 | 0 | c1 | 3 | Instruction TCM Non-secure Control Access | R/W, X | NA | 0x00000000 | page 3-95
c9 | 0 | c2 | 0 | TCM Selection | R/W, B | R/W | 0x00000000 | page 3-97
c9 | 0 | c8 | 0 | Cache Behavior Override | R/W [h] | R/W | 0x00000000 | page 3-98
c10 | 0 | c0 | 0 | TLB Lockdown | R/W, X | R/W, X | 0x00000000 | page 3-100
c10 | 0 | c2 | 0 | Primary Region Memory Remap Register | R/W, B, X | R/W | 0x00098AA4 | page 3-101
c10 | 0 | c2 | 1 | Normal Memory Region Remap Register | R/W, B, X | R/W | 0x44E048E0 | page 3-101
c11 | 0 | c0 | 0-3 | DMA identification and status | RO | RO, X | 0x0000000B [i] | page 3-105
c11 | 0 | c1 | 0 | DMA User Accessibility | R/W | R/W, X | 0x00000000 | page 3-107
c11 | 0 | c2 | 0 | DMA Channel Number | R/W, X | R/W, X | 0x00000000 | page 3-109
c11 | 0 | c3 | 0-2 | DMA enable | WO, X | WO, X | - | page 3-110
c11 | 0 | c4 | 0 | DMA Control | R/W, X | R/W, X | 0x08000000 | page 3-111
c11 | 0 | c5 | 0 | DMA Internal Start Address | R/W, X | R/W, X | - | page 3-114
c11 | 0 | c6 | 0 | DMA External Start Address | R/W, X | R/W, X | - | page 3-115
c11 | 0 | c7 | 0 | DMA Internal End Address | R/W, X | R/W, X | - | page 3-116
c11 | 0 | c8 | 0 | DMA Channel Status | RO, X | RO, X | 0x00000000 | page 3-117
c11 | 0 | c15 | 0 | DMA Context ID | R/W | R/W, X | - | page 3-120
c12 | 0 | c0 | 0 | Secure or Non-secure Vector Base Address | R/W, B, X | R/W | 0x00000000 | page 3-121
c12 | 0 | c0 | 1 | Monitor Vector Base Address | R/W, X | NA | 0x00000000 | page 3-122
c12 | 0 | c1 | 0 | Interrupt Status | RO | RO | 0x00000000 [j] | page 3-123
Table 3-2 Summary of CP15 registers and operations (continued)
CRn | Op1 | CRm | Op2 | Register or operation | S type | NS type | Reset value | Page
c13 | 0 | c0 | 0 | FCSE PID | R/W, B, X | R/W | 0x00000000 | page 3-125
c13 | 0 | c0 | 1 | Context ID | R/W, B | R/W | 0x00000000 | page 3-127
c13 | 0 | c0 | 2 | User Read/Write Thread and Process ID | R/W, B | R/W | 0x00000000 | page 3-128
c13 | 0 | c0 | 3 | User Read-only Thread and Process ID | R/W, RO, B [k] | R/W, RO | 0x00000000 | page 3-128
c13 | 0 | c0 | 4 | Privileged Only Thread and Process ID | R/W, B | R/W | 0x00000000 | page 3-128
c14 | - | - | - | Not used | - | - | - | -
c15 | 0 | c2 | 4 | Peripheral Port Memory Remap | R/W, B, X | R/W | 0x00000000 | page 3-130
c15 | 0 | c9 | 0 | Secure User and Non-secure Access Validation Control | R/W, X | NA | 0x00000000 | page 3-132
c15 | 0 | c12 | 0 | Performance Monitor Control | R/W, X | R/W, X | 0x00000000 | page 3-133
c15 | 0 | c12 | 1 | Cycle Counter | R/W, X | R/W, X | 0x00000000 | page 3-137
c15 | 0 | c12 | 2 | Count 0 | R/W, X | R/W, X | 0x00000000 | page 3-138
c15 | 0 | c12 | 3 | Count 1 | R/W, X | R/W, X | 0x00000000 | page 3-139
c15 | 0 | c12 | 4-7 | System Validation Counter | R/W, X | R/W, X | 0x00000000 | page 3-140
c15 | 0 | c13 | 1-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
c15 | 0 | c14 | 0 | System Validation Cache Size Mask | R/W, X | R/W, X | 0x00006655 [l] | page 3-145
c15 | 1 | c13 | 0-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
c15 | 2 | c13 | 1-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
c15 | 3 | c8 | 0-7 | Instruction Cache Master Valid | R/W, X | NA | 0x00000000 | page 3-147
c15 | 3 | c12 | 0-7 | Data Cache Master Valid | R/W, X | NA | 0x00000000 | page 3-148
c15 | 3 | c13 | 0-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
c15 | 4 | c13 | 0-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
c15 | 5 | c4 | 2 | TLB Lockdown Index | R/W, X | NA | 0x00000000 | page 3-149
c15 | 5 | c5 | 2 | TLB Lockdown VA | R/W, X | NA | - | page 3-149
c15 | 5 | c6 | 2 | TLB Lockdown PA | R/W, X | NA | - | page 3-149
c15 | 5 | c7 | 2 | TLB Lockdown Attributes | R/W, X | NA | - | page 3-149
c15 | 5 | c13 | 0-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
c15 | 6 | c13 | 0-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
c15 | 7 | c13 | 0-7 | System Validation Operations | R/W, X | R/W, X | 0x00000000 | page 3-142
a. See c0, Main ID Register on page 3-20 for the values of bits [23:20] and bits [3:0].
b. Reset value depends on the cache size implemented. The value here is for 16KB instruction and data caches.
c. Reset value depends on the number of TCM banks implemented. The value here is for 2 data TCM and 2 instruction TCM
banks.
d. Some bits in this register are banked and some Secure modify only.
e. Reset value depends on external signals.
f. Reset value depends on the TCM sizes implemented. The value here is for 16KB TCM banks.
g. Reset value depends on the TCM sizes implemented, and on the value of the INITRAM static configuration signal. The value
here is for 16KB TCM banks, with INITRAM tied LOW.
h. Some bits in this register are common and some Secure modify only.
i. Reset value depends on the number of DMA channels implemented and the presence of TCMs.
j. Reset value depends on external signals.
k. This register is read/write in Privileged modes and read-only on User mode.
l. Reset value depends on the cache and TCM sizes implemented. The value here is for 2 banks of 16KB instruction and data
TCMs and 16KB instruction and data caches.
Table 3-3 lists the operations available with MCRR instructions:
MCRR{cond} P15,<Opcode_1>,<End Address>,<Start Address>,<CRm>
Table 3-3 Summary of CP15 MCRR operations
Op1 | CRm | Register or operation | S type | NS type | Reset value | Page
0 | c5 | Invalidate instruction cache range | WO | WO | - | page 3-69
0 | c6 | Invalidate data cache range | WO | WO | - | page 3-69
0 | c12 | Clean data cache range | WO | WO | - | page 3-69
0 | c14 | Clean and invalidate data cache range | WO | WO | - | page 3-69
3.2.2
c0, Main ID Register
The purpose of the Main ID Register is to return the device ID code that contains information
about the processor.
The Main ID Register is:
•
in CP15 c0
•
a 32-bit read-only register common to the Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-10 shows the arrangement of bits in the register.
Bit layout: [31:24] Implementor | [23:20] Variant number | [19:16] Architecture | [15:4] Primary part number | [3:0] Revision
Figure 3-10 Main ID Register format
The contents of the Main ID Register depend on the specific implementation. Table 3-4 lists
how the bit values correspond with the Main ID Register functions.
Table 3-4 Main ID Register bit functions
Bits | Field name | Function
[31:24] | Implementor | Indicates implementor, ARM Limited: 0x41
[23:20] | Variant number | The major revision number n in the rn part of the rnpn revision status: 0x0
[19:16] | Architecture | Indicates that the architecture is given in the feature registers: 0xF
[15:4] | Primary part number | Indicates part number, ARM1176JZ-S: 0xB76
[3:0] | Revision | The minor revision number n in the pn part of the rnpn revision status. For example: for release r0p0: 0x0; for release r0p7: 0x7
Note
If an Opcode_2 value corresponding to an unimplemented or reserved ID register with CRm
equal to c0 and Opcode_1 = 0 is encountered, the system control coprocessor returns the value
of the main ID register.
Table 3-5 lists the results of attempted access for each mode.
Table 3-5 Results of access to the Main ID Register
Secure Privileged Read | Secure Privileged Write | Non-secure Privileged Read | Non-secure Privileged Write | User
data | Undefined exception | data | Undefined exception | Undefined exception
To use the Main ID Register read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c0
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15,0,<Rd>,c0,c0,0 ;Read Main ID Register
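Once the value has been read into a general-purpose register, the fields listed in Table 3-4 can be separated with shifts and masks. The following C sketch is only an illustration: the helper bits(), the struct main_id, and decode_main_id() are hypothetical names, and the sample value in the usage note is assembled from the Table 3-4 field values for an r0p7 part.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit register value (hypothetical helper). */
static uint32_t bits(uint32_t value, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (value >> lo) & mask;
}

/* Main ID Register fields as laid out in Table 3-4. */
struct main_id {
    unsigned implementor;   /* [31:24] */
    unsigned variant;       /* [23:20] major revision, the n in rn */
    unsigned architecture;  /* [19:16] */
    unsigned part_number;   /* [15:4]  */
    unsigned revision;      /* [3:0]   minor revision, the n in pn */
};

static struct main_id decode_main_id(uint32_t value)
{
    struct main_id id;
    id.implementor  = bits(value, 31, 24);
    id.variant      = bits(value, 23, 20);
    id.architecture = bits(value, 19, 16);
    id.part_number  = bits(value, 15, 4);
    id.revision     = bits(value, 3, 0);
    return id;
}
```

With the r0p7 field values from Table 3-4 the register reads 0x410FB767, and decode_main_id() separates it into implementor 0x41, variant 0x0, architecture 0xF, part number 0xB76, and revision 0x7.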
For more information on the processor features, see c0, CPUID registers on page 3-26.
3.2.3
c0, Cache Type Register
The purpose of the Cache Type Register is to provide information about the size and architecture
of the cache for the operating system. This enables the operating system to establish how to
clean the cache and how to lock it down. Inclusion of this register enables RTOS vendors to
produce future-proof versions of their operating systems.
The Cache Type Register is:
•
in CP15 c0
•
a 32-bit read-only register, common to Secure and Non-secure worlds
•
accessible in privileged modes only.
All ARMv4T and later cached processors contain this register. Figure 3-11 shows the
arrangement of bits in the Cache Type Register.
Bit layout: [31:29] 0 0 0 | [28:25] Ctype | [24] S | Dsize [23:12]: [23] P, [22] 0, [21:18] Size, [17:15] Assoc, [14] M, [13:12] Len | Isize [11:0]: [11] P, [10] 0, [9:6] Size, [5:3] Assoc, [2] M, [1:0] Len
Figure 3-11 Cache Type Register format
Table 3-6 lists how the bit values correspond with the Cache Type Register functions.
Table 3-6 Cache Type Register bit functions
Bits | Field name | Function
[31:29] | - | 0
[28:25] | Ctype | The Cache type and Separate bits provide information about the cache architecture. b1110, indicates that the ARM1176JZF-S processor supports write back cache, Format C cache lockdown, and Register 7 cache cleaning operations.
[24] | S bit | S = 1, indicates that the processor has separate instruction and data caches and not a unified cache.
[23:12] | Dsize field | Provides information about the size and construction of the Data cache [a].
[23] | P bit | The P, Page, bit indicates restrictions on page allocation for bits [13:12] of the VA. For ARM1176JZF-S processors, the P bit is set if the cache size is greater than 16KB. For more details see Restrictions on page table mappings page coloring on page 6-41. 0 = no restriction on page allocation. 1 = restriction applies to page allocation.
[22] | - | 0
[21:18] | Size | The Size field indicates the cache size in conjunction with the M bit. b0000 = 0.5KB cache, not supported; b0001 = 1KB cache, not supported; b0010 = 2KB cache, not supported; b0011 = 4KB cache; b0100 = 8KB cache; b0101 = 16KB cache; b0110 = 32KB cache; b0111 = 64KB cache; b1000 = 128KB cache, not supported.
[17:15] | Assoc | b010, indicates that the ARM1176JZF-S processor has 4-way associativity. All other values for Assoc are reserved.
[14] | M bit | Indicates the cache size and cache associativity values in conjunction with the Size and Assoc fields. In the ARM1176JZF-S processor the M bit is set to 0, for the Data and Instruction Caches.
[13:12] | Len | b10, indicates that ARM1176JZF-S processor has a cache line length of 8 words, that is 32 bytes. All other values for Len are reserved.
[11:0] | Isize field | Provides information about the size and construction of the Instruction cache.
[11] | P | The functions of the Isize bit fields are the same as the equivalent Dsize bit fields and the Isize values have the corresponding meanings.
[10] | - | As above.
[9:6] | Size | As above.
[5:3] | Assoc | As above.
[2] | M | As above.
[1:0] | Len | As above.
a. The ARM1176JZF-S processor does not support cache sizes of less than 4KB.
Table 3-7 lists the results of attempted access for each mode.
Table 3-7 Results of access to the Cache Type Register
Secure Privileged Read | Secure Privileged Write | Non-secure Privileged Read | Non-secure Privileged Write | User
Data | Undefined exception | Data | Undefined exception | Undefined exception
To use the Cache Type Register read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c0
•
CRm set to c0
•
Opcode_2 set to 1.
For example:
MRC p15,0,<Rd>,c0,c0,1; returns cache details
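The returned value can be decoded according to Table 3-6. The following C sketch is a hedged illustration: bits(), struct cache_desc, and decode_cache_field() are hypothetical names, and only the encodings that Table 3-6 defines are interpreted (M = 0, Size b0000 to b1000, Assoc b010, Len b10). Any other encoding is reported as 0.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit value (hypothetical helper). */
static uint32_t bits(uint32_t value, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (value >> lo) & mask;
}

/* Decoded form of one 12-bit Dsize or Isize field. */
struct cache_desc {
    unsigned p;           /* P bit: 1 = page allocation restriction applies */
    unsigned size_bytes;  /* 0 if the encoding is not listed in Table 3-6   */
    unsigned ways;        /* 0 if the Assoc value is reserved               */
    unsigned line_bytes;  /* 0 if the Len value is reserved                 */
};

/* field is Dsize (register bits [23:12]) or Isize (bits [11:0]),
 * shifted down so that Len occupies bits [1:0]. */
static struct cache_desc decode_cache_field(uint32_t field)
{
    struct cache_desc d = {0, 0, 0, 0};
    unsigned size  = bits(field, 9, 6);
    unsigned assoc = bits(field, 5, 3);
    unsigned m     = bits(field, 2, 2);
    unsigned len   = bits(field, 1, 0);

    d.p = bits(field, 11, 11);
    if (m == 0 && size <= 8)
        d.size_bytes = 512u << size;  /* b0000 = 0.5KB, doubling per step */
    if (m == 0 && assoc == 2)
        d.ways = 4;                   /* b010 = 4-way */
    if (len == 2)
        d.line_bytes = 32;            /* b10 = 8 words, 32 bytes */
    return d;
}
```

For a configuration with Ctype b1110, S b1, and both size fields equal to 0x152, the fields assemble to 0x1D152152. Passing bits(value, 23, 12) and bits(value, 11, 0) to decode_cache_field() then describes each cache as 16KB, 4-way, with 32-byte lines.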
Table 3-8 on page 3-24, for example, lists the Cache Type Register values for an ARM1176JZ-S
processor with:
•
separate instruction and data caches
•
cache size = 16KB
•
associativity = 4-way
•
line length = eight words
•
caches use write-back, CP15 c7 for cache cleaning, and Format C for cache lockdown.
Table 3-8 Example Cache Type Register format
Bits | Field name | Value | Behavior
[31:29] | Reserved | b000 | -
[28:25] | Ctype | b1110 | -
[24] | S | b1 | Harvard cache
[23] | Dsize P | b0 | -
[22] | Dsize Reserved | b0 | -
[21:18] | Dsize Size | b0101 | 16KB
[17:15] | Dsize Assoc | b010 | 4-way
[14] | Dsize M | b0 | -
[13:12] | Dsize Len | b10 | 8 words per line, 32 bytes
[11] | Isize P | b0 | -
[10] | Isize Reserved | b0 | -
[9:6] | Isize Size | b0101 | 16KB
[5:3] | Isize Assoc | b010 | 4-way
[2] | Isize M | b0 | -
[1:0] | Isize Len | b10 | 8 words per line, 32 bytes
3.2.4
c0, TCM Status Register
The purpose of the TCM Status Register is to inform the system about the number of Instruction
and Data TCMs available in the processor.
Table 3-9 on page 3-25 lists the purposes of the individual bits in the TCM Status Register.
Note
In the ARM1176JZ-S processor there is a maximum of two Instruction TCMs and two Data
TCMs.
The TCM Status Register is:
•
in CP15 c0
•
a 32-bit read-only register common to Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-12 shows the bit arrangement for the TCM Status Register.
Bit layout: [31:29] 0 0 0 | [28:19] SBZ/UNP | [18:16] DTCM | [15:3] SBZ/UNP | [2:0] ITCM
Figure 3-12 TCM Status Register format
Table 3-9 lists how the bit values correspond with the TCM Status Register functions.
Table 3-9 TCM Status Register bit functions
Bits | Field name | Function
[31:29] | - | Always b000.
[28:19] | - | UNP/SBZ
[18:16] | DTCM | Indicates the number of Data TCM banks implemented. b000 = 0 Data TCMs; b001 = 1 Data TCM; b010 = 2 Data TCMs. All other values reserved.
[15:3] | - | UNP/SBZ
[2:0] | ITCM | Indicates the number of Instruction TCM banks implemented. b000 = 0 Instruction TCMs; b001 = 1 Instruction TCM; b010 = 2 Instruction TCMs. All other values reserved.
Attempts to write the TCM Status Register or read it in User modes result in Undefined
exceptions.
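As a small illustration of the DTCM and ITCM fields, the following C sketch extracts the two bank counts from a TCM Status Register value. The helper bits() and the two accessor names are hypothetical. Values above b010 are reserved, so a caller might treat them as invalid.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit value (hypothetical helper). */
static uint32_t bits(uint32_t value, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (value >> lo) & mask;
}

/* Number of Data TCM banks, bits [18:16] of the TCM Status Register. */
static unsigned tcm_data_banks(uint32_t tcm_status)
{
    return bits(tcm_status, 18, 16);
}

/* Number of Instruction TCM banks, bits [2:0] of the TCM Status Register. */
static unsigned tcm_instruction_banks(uint32_t tcm_status)
{
    return bits(tcm_status, 2, 0);
}
```

For the value 0x00020002, which corresponds to two Data TCM banks and two Instruction TCM banks, both functions return 2.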
To use the TCM Status Register read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c0
•
CRm set to c0
•
Opcode_2 set to 2.
For example:
MRC p15,0,<Rd>,c0,c0,2 ; returns TCM status register
3.2.5
c0, TLB Type Register
The purpose of the TLB Type Register is to return the number of lockable entries for the TLB.
The TLB has 64 entries organized as a unified two-way set associative TLB. In addition, it has
eight lockable entries that the read-only TLB Type Register specifies.
The TLB Type Register is:
•
in CP15 c0
•
a 32-bit read-only register common to the Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-13 shows the bit arrangement for the TLB Type Register.
Bit layout: [31:24] SBZ/UNP | [23:16] ILsize | [15:8] DLsize | [7:1] SBZ/UNP | [0] U
Figure 3-13 TLB Type Register format
Table 3-10 lists how the bit values correspond with the TLB Type Register functions.
Table 3-10 TLB Type Register bit functions
Bits | Field name | Function
[31:24] | - | UNP/SBZ
[23:16] | ILsize | Instruction lockable size specifies the number of instruction TLB lockable entries. 0, indicates that the ARM1176JZ-S processor has a unified TLB.
[15:8] | DLsize | Data lockable size specifies the number of unified or data TLB lockable entries. 0x08, indicates that the ARM1176JZ-S processor has 8 unified TLB lockable entries.
[7:1] | - | UNP/SBZ
[0] | U | Unified specifies if the TLB is unified, 0, or if there are separate instruction and data TLBs, 1. 0, indicates that the ARM1176JZ-S processor has a unified TLB.
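The three fields of the TLB Type Register can be read out as in the following C sketch. It is an illustration only: bits(), struct tlb_type, and decode_tlb_type() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit value (hypothetical helper). */
static uint32_t bits(uint32_t value, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
    return (value >> lo) & mask;
}

struct tlb_type {
    unsigned ilsize;   /* [23:16] instruction TLB lockable entries     */
    unsigned dlsize;   /* [15:8]  unified or data TLB lockable entries */
    int      unified;  /* 1 when the U bit is 0, that is a unified TLB */
};

static struct tlb_type decode_tlb_type(uint32_t value)
{
    struct tlb_type t;
    t.ilsize  = bits(value, 23, 16);
    t.dlsize  = bits(value, 15, 8);
    t.unified = (bits(value, 0, 0) == 0);
    return t;
}
```

Decoding 0x00000800 gives ILsize 0, DLsize 8, and a unified TLB, which matches the eight lockable entries described for this register.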
Table 3-11 lists the results of attempted access for each mode.
Table 3-11 Results of access to the TLB Type Register
Secure Privileged Read | Secure Privileged Write | Non-secure Privileged Read | Non-secure Privileged Write | User
Data | Undefined exception | Data | Undefined exception | Undefined exception
To use the TLB Type Register read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c0
•
CRm set to c0
•
Opcode_2 set to 3.
For example:
MRC p15,0,<Rd>,c0,c0,3 ; returns TLB details
3.2.6
c0, CPUID registers
This section describes the CPUID registers:
•
c0, Processor Feature Register 0 on page 3-27
•
c0, Processor Feature Register 1 on page 3-28
•
c0, Debug Feature Register 0 on page 3-29
•
c0, Auxiliary Feature Register 0 on page 3-30
•
c0, Memory Model Feature Register 0 on page 3-31
•
c0, Memory Model Feature Register 1 on page 3-32
•
c0, Memory Model Feature Register 2 on page 3-34
•
c0, Memory Model Feature Register 3 on page 3-35
•
c0, Instruction Set Attributes Register 0 on page 3-36
•
c0, Instruction Set Attributes Register 1 on page 3-38
•
c0, Instruction Set Attributes Register 2 on page 3-39
•
c0, Instruction Set Attributes Register 3 on page 3-40
•
c0, Instruction Set Attributes Register 4 on page 3-42
•
c0, Instruction Set Attributes Register 5 on page 3-43.
Note
The CPUID registers are sometimes described as the Core Feature ID registers.
c0, Processor Feature Register 0
The purpose of the Processor Feature Register 0 is to provide information about the execution
state support and programmer’s model for the processor.
Processor Feature Register 0 is:
•
in CP15 c0
•
a 32-bit read-only register common to the Secure and Non-secure worlds
•
accessible in privileged modes only.
Table 3-12 lists how the bit values correspond with the Processor Feature Register 0 functions.
Figure 3-14 shows the bit arrangement for Processor Feature Register 0.
Bit layout: [31:28] Reserved | [27:24] Reserved | [23:20] Reserved | [19:16] Reserved | [15:12] State3 | [11:8] State2 | [7:4] State1 | [3:0] State0
Figure 3-14 Processor Feature Register 0 format
Table 3-12 Processor Feature Register 0 bit functions
Bits | Field name | Function
[31:28] | - | Reserved. RAZ.
[27:24] | - | Reserved. RAZ.
[23:20] | - | Reserved. RAZ.
[19:16] | - | Reserved. RAZ.
[15:12] | State3 | Indicates support for Thumb-2™ execution environment. 0x0, ARM1176JZ-S processors do not support Thumb-2.
[11:8] | State2 | Indicates support for Java extension interface. 0x1, ARM1176JZ-S processors support Java.
[7:4] | State1 | Indicates type of Thumb encoding that the processor supports. 0x1, ARM1176JZ-S processors support Thumb-1 but do not support Thumb-2.
[3:0] | State0 | Indicates support for 32-bit ARM instruction set. 0x1, ARM1176JZ-S processors support 32-bit ARM instructions.
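Because each State field is a 4-bit nibble, one indexed accessor is enough to read all four. The following C sketch is a minimal illustration with a hypothetical function name, pfr0_state().

```c
#include <assert.h>
#include <stdint.h>

/* Return the StateN field (N = 0..3) of a Processor Feature Register 0
 * value. State0 is bits [3:0], State1 bits [7:4], State2 bits [11:8],
 * and State3 bits [15:12]. The function name is hypothetical. */
static unsigned pfr0_state(uint32_t pfr0, unsigned n)
{
    assert(n < 4);
    return (pfr0 >> (4 * n)) & 0xFu;
}
```

For the value 0x00000111, pfr0_state() returns 1 for State0, State1, and State2, and 0 for State3, which matches the field values listed in Table 3-12.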
Table 3-13 lists the results of attempted access for each mode.
Table 3-13 Results of access to the Processor Feature Register 0
Secure Privileged Read | Secure Privileged Write | Non-secure Privileged Read | Non-secure Privileged Write | User
Data | Undefined exception | Data | Undefined exception | Undefined exception
To use the Processor Feature Register 0 read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c0
•
CRm set to c1
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c0, c1, 0 ;Read Processor Feature Register 0
c0, Processor Feature Register 1
The purpose of the Processor Feature Register 1 is to provide information about the execution
state support and programmer’s model for the processor.
Processor Feature Register 1 is:
•
in CP15 c0
•
a 32-bit read-only register common to the Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-15 shows the bit arrangement for Processor Feature Register 1.
Bit layout: [31:28] Reserved | [27:24] Reserved | [23:20] Reserved | [19:16] Reserved | [15:12] Reserved | [11:8] Microcontroller programmer's model | [7:4] Security extension | [3:0] Programmer's model
Figure 3-15 Processor Feature Register 1 format
Table 3-14 lists how the bit values correspond with the Processor Feature Register 1 functions.
Table 3-14 Processor Feature Register 1 bit functions
Bits | Field name | Function
[31:28] | - | Reserved. RAZ.
[27:24] | - | Reserved. RAZ.
[23:20] | - | Reserved. RAZ.
[19:16] | - | Reserved. RAZ.
[15:12] | - | Reserved. RAZ.
[11:8] | Microcontroller programmer’s model | Indicates support for the ARM microcontroller programmer’s model. 0x0, Not supported by ARM1176JZ-S processors.
[7:4] | Security extension | Indicates support for Security Extensions Architecture v1. 0x1, ARM1176JZ-S processors support Security Extensions Architecture v1, TrustZone.
[3:0] | Programmer’s model | Indicates support for standard ARMv4 programmer’s model. 0x1, ARM1176JZ-S processors support the ARMv4 model.
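Software that must run on more than one core might test the Security extension field before using Secure-world features. The following C sketch shows one way to do that. The function name is hypothetical, and it treats only the value 0x1 as supported because that is the only encoding this table defines.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 when the Security extension field, bits [7:4] of Processor
 * Feature Register 1, reads 0x1 (Security Extensions Architecture v1).
 * Hypothetical helper: other non-zero encodings are not defined in this
 * table, so they are reported as not supported here. */
static int pfr1_has_security_ext_v1(uint32_t pfr1)
{
    return ((pfr1 >> 4) & 0xFu) == 0x1u;
}
```

For the value 0x00000011 the function returns 1, and for a value with bits [7:4] clear it returns 0.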
Table 3-15 lists the results of attempted access for each mode.
Table 3-15 Results of access to the Processor Feature Register 1
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Processor Feature Register 1 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c1
• Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c0, c1, 1 ;Read Processor Feature Register 1
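As a minimal sketch of how software might interpret the value returned by the MRC above, the following C fragment extracts the 4-bit fields listed in Table 3-14. The helper names id_field, pfr1_has_security_ext, and pfr1_has_microcontroller_model are illustrative assumptions and not part of any ARM-supplied interface. The register value is assumed to have been read already in a privileged mode.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: return the 4-bit field whose lowest bit is `lsb`. */
static inline uint32_t id_field(uint32_t reg, unsigned lsb)
{
    return (reg >> lsb) & 0xFu;
}

/* [7:4] = 0x1 indicates Security Extensions Architecture v1 (Table 3-14). */
static inline bool pfr1_has_security_ext(uint32_t pfr1)
{
    return id_field(pfr1, 4) == 0x1u;
}

/* [11:8] = 0x0 indicates no microcontroller programmer's model (Table 3-14). */
static inline bool pfr1_has_microcontroller_model(uint32_t pfr1)
{
    return id_field(pfr1, 8) != 0x0u;
}
```

With the field values that Table 3-14 gives for the ARM1176JZ-S processor, the register reads as 0x00000011, so the first predicate is true and the second is false.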
c0, Debug Feature Register 0
The purpose of the Debug Feature Register 0 is to provide information about the debug system
for the processor.
Debug Feature Register 0 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-16 shows the bit arrangement for Debug Feature Register 0.
[31:24] Reserved | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-16 Debug Feature Register 0 format
Table 3-16 lists how the bit values correspond with the Debug Feature Register 0 functions.
Table 3-16 Debug Feature Register 0 bit functions
Bits     Field name  Function
[31:28]  -           Reserved. RAZ.
[27:24]  -           Reserved. RAZ.
[23:20]  -           Indicates the type of memory-mapped microcontroller debug model that the processor supports.
                     0x0, ARM1176JZ-S processors do not support this debug model.
[19:16]  -           Indicates the type of memory-mapped Trace debug model that the processor supports.
                     0x0, ARM1176JZ-S processors do not support this debug model.
[15:12]  -           Indicates the type of coprocessor-based Trace debug model that the processor supports.
                     0x0, ARM1176JZ-S processors do not support this debug model.
[11:8]   -           Indicates the type of embedded processor debug model that the processor supports.
                     0x0, ARM1176JZ-S processors do not support this debug model.
[7:4]    -           Indicates the type of Secure debug model that the processor supports.
                     0x3, ARM1176JZ-S processors support the v6.1 Secure debug architecture based model.
[3:0]    -           Indicates the type of applications processor debug model that the processor supports.
                     0x3, ARM1176JZ-S processors support the v6.1 debug model.
Table 3-17 lists the results of attempted access for each mode.
Table 3-17 Results of access to the Debug Feature Register 0
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Debug Feature Register 0 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c1
• Opcode_2 set to 2.
For example:
MRC p15, 0, <Rd>, c0, c1, 2 ;Read Debug Feature Register 0
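The six debug model fields in Table 3-16 can be unpacked into a structure in one step. The following is a sketch only: the structure dfr0_fields, its member names, and the function dfr0_decode are hypothetical, and the input is assumed to be a value already obtained with the MRC above.

```c
#include <stdint.h>

/* Hypothetical container for the fields described in Table 3-16. */
struct dfr0_fields {
    uint8_t mmap_mcu_debug;  /* [23:20] memory-mapped microcontroller debug */
    uint8_t mmap_trace;      /* [19:16] memory-mapped Trace debug */
    uint8_t cp_trace;        /* [15:12] coprocessor-based Trace debug */
    uint8_t embedded_debug;  /* [11:8]  embedded processor debug */
    uint8_t secure_debug;    /* [7:4]   Secure debug */
    uint8_t apps_debug;      /* [3:0]   applications processor debug */
};

/* Split a Debug Feature Register 0 value into its 4-bit fields. */
static inline struct dfr0_fields dfr0_decode(uint32_t dfr0)
{
    struct dfr0_fields f;
    f.mmap_mcu_debug = (uint8_t)((dfr0 >> 20) & 0xFu);
    f.mmap_trace     = (uint8_t)((dfr0 >> 16) & 0xFu);
    f.cp_trace       = (uint8_t)((dfr0 >> 12) & 0xFu);
    f.embedded_debug = (uint8_t)((dfr0 >> 8) & 0xFu);
    f.secure_debug   = (uint8_t)((dfr0 >> 4) & 0xFu);
    f.apps_debug     = (uint8_t)(dfr0 & 0xFu);
    return f;
}
```

For the field values in Table 3-16 the register reads as 0x00000033, which decodes to secure_debug and apps_debug both equal to 3 and all other members equal to 0.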
c0, Auxiliary Feature Register 0
The purpose of the Auxiliary Feature Register 0 is to provide additional information about the
features of the processor.
The Auxiliary Feature Register 0 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Table 3-18 lists how the bit values correspond with the Auxiliary Feature Register 0 functions.
Table 3-18 Auxiliary Feature Register 0 bit functions
Bits     Field name  Function
[31:16]  -           Reserved. RAZ.
[15:12]  -           Implementation Defined.
[11:8]   -           Implementation Defined.
[7:4]    -           Implementation Defined.
[3:0]    -           Implementation Defined.
The contents of the Auxiliary Feature Register 0 [31:16] are Reserved. The contents of the
Auxiliary Feature Register 0 [15:0] are Implementation Defined. In the ARM1176JZ-S
processor, the Auxiliary Feature Register 0 reads as 0x00000000.
Table 3-19 lists the results of attempted access for each mode.
Table 3-19 Results of access to the Auxiliary Feature Register 0
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Auxiliary Feature Register 0 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c1
• Opcode_2 set to 3.
For example:
MRC p15, 0, <Rd>, c0, c1, 3 ;Read Auxiliary Feature Register 0.
c0, Memory Model Feature Register 0
The purpose of the Memory Model Feature Register 0 is to provide information about the
memory model, memory management, cache support, and TLB operations of the processor.
The Memory Model Feature Register 0 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-17 shows the bit arrangement for Memory Model Feature Register 0.
[31:28] Reserved | [27:24] - | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-17 Memory Model Feature Register 0 format
Table 3-20 lists how the bit values correspond with the Memory Model Feature Register 0
functions.
Table 3-20 Memory Model Feature Register 0 bit functions
Bits     Field name  Function
[31:28]  -           Reserved. RAZ.
[27:24]  -           Indicates support for FCSE.
                     0x1, ARM1176JZ-S processors support FCSE.
[23:20]  -           Indicates support for the ARMv6 Auxiliary Control Register.
                     0x1, ARM1176JZ-S processors support the Auxiliary Control Register.
[19:16]  -           Indicates support for TCM and associated DMA.
                     0x3, ARM1176JZ-S processors support ARMv6 TCM and DMA.
[15:12]  -           Indicates support for cache coherency with DMA agent, shared memory.
                     0x0, ARM1176JZ-S processors do not support this model.
[11:8]   -           Indicates support for cache coherency with CPU agent, shared memory.
                     0x0, ARM1176JZ-S processors do not support this model.
[7:4]    -           Indicates support for Protected Memory System Architecture (PMSA).
                     0x0, ARM1176JZ-S processors do not support PMSA.
[3:0]    -           Indicates support for Virtual Memory System Architecture (VMSA).
                     0x3, ARM1176JZ-S processors support:
                     • VMSA v7 remapping and access flag.
Table 3-21 lists the results of attempted access for each mode.
Table 3-21 Results of access to the Memory Model Feature Register 0
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Memory Model Feature Register 0 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c1
• Opcode_2 set to 4.
For example:
MRC p15, 0, <Rd>, c0, c1, 4 ;Read Memory Model Feature Register 0.
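One plausible use of this register is to decide at run time which memory system architecture is present. The sketch below classifies a previously read value using the VMSA and PMSA fields from Table 3-20. The enumeration mem_arch and the functions mmfr0_mem_arch and mmfr0_has_tcm are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical classification of the memory system reported by the register. */
enum mem_arch { MEM_ARCH_NONE, MEM_ARCH_PMSA, MEM_ARCH_VMSA };

static inline enum mem_arch mmfr0_mem_arch(uint32_t mmfr0)
{
    uint32_t vmsa = mmfr0 & 0xFu;         /* [3:0] VMSA support */
    uint32_t pmsa = (mmfr0 >> 4) & 0xFu;  /* [7:4] PMSA support */

    if (vmsa != 0u)
        return MEM_ARCH_VMSA;
    if (pmsa != 0u)
        return MEM_ARCH_PMSA;
    return MEM_ARCH_NONE;
}

/* [19:16] is non-zero when TCM and associated DMA are supported. */
static inline int mmfr0_has_tcm(uint32_t mmfr0)
{
    return ((mmfr0 >> 16) & 0xFu) != 0u;
}
```

The field values in Table 3-20 combine to 0x01130003, for which mmfr0_mem_arch returns MEM_ARCH_VMSA and mmfr0_has_tcm returns non-zero.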
c0, Memory Model Feature Register 1
The purpose of the Memory Model Feature Register 1 is to provide information about the
memory model, memory management, cache support, and TLB operations of the processor.
The Memory Model Feature Register 1 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-18 shows the bit arrangement for Memory Model Feature Register 1.
[31:28] - | [27:24] - | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-18 Memory Model Feature Register 1 format
Table 3-22 lists how the bit values correspond with the Memory Model Feature Register 1
functions.
Table 3-22 Memory Model Feature Register 1 bit functions
Bits     Field name  Function
[31:28]  -           Indicates support for branch target buffer.
                     0x1, ARM1176JZ-S processors require flushing of branch predictor on VA change.
[27:24]  -           Indicates support for test and clean operations on data cache, Harvard or unified architecture.
                     0x0, no support in ARM1176JZ-S processors.
[23:20]  -           Indicates support for level one cache, all maintenance operations, unified architecture.
                     0x0, no support in ARM1176JZ-S processors.
[19:16]  -           Indicates support for level one cache, all maintenance operations, Harvard architecture.
                     0x3, ARM1176JZ-S processors support:
                     • invalidate instruction cache including branch prediction
                     • invalidate data cache
                     • invalidate instruction and data cache including branch prediction
                     • clean data cache, recursive model using cache dirty status bit
                     • clean and invalidate data cache, recursive model using cache dirty status bit.
[15:12]  -           Indicates support for level one cache line maintenance operations by Set/Way, unified architecture.
                     0x0, no support in ARM1176JZ-S processors.
[11:8]   -           Indicates support for level one cache line maintenance operations by Set/Way, Harvard architecture.
                     0x3, ARM1176JZ-S processors support:
                     • clean data cache line by Set/Way
                     • clean and invalidate data cache line by Set/Way
                     • invalidate data cache line by Set/Way
                     • invalidate instruction cache line by Set/Way.
[7:4]    -           Indicates support for level one cache line maintenance operations by MVA, unified architecture.
                     0x0, no support in ARM1176JZ-S processors.
[3:0]    -           Indicates support for level one cache line maintenance operations by MVA, Harvard architecture.
                     0x2, ARM1176JZ-S processors support:
                     • clean data cache line by MVA
                     • invalidate data cache line by MVA
                     • invalidate instruction cache line by MVA
                     • clean and invalidate data cache line by MVA
                     • invalidation of branch target buffer by MVA.
Table 3-23 lists the results of attempted access for each mode.
Table 3-23 Results of access to the Memory Model Feature Register 1
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Memory Model Feature Register 1 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c1
• Opcode_2 set to 5.
For example:
MRC p15, 0, <Rd>, c0, c1, 5 ;Read Memory Model Feature Register 1.
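To show how the eight 4-bit fields in Table 3-22 combine into a single register value, the following sketch packs the documented field values into a 32-bit word. The function pack_id_fields and the array mmfr1_arm1176 are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: pack eight 4-bit values, fields[0] being bits [3:0]
   and fields[7] being bits [31:28]. */
static inline uint32_t pack_id_fields(const uint8_t fields[8])
{
    uint32_t value = 0;
    for (int i = 7; i >= 0; --i)
        value = (value << 4) | (uint32_t)(fields[i] & 0xFu);
    return value;
}

/* Field values from Table 3-22, listed from bits [3:0] up to bits [31:28]. */
static const uint8_t mmfr1_arm1176[8] = {
    0x2, 0x0, 0x3, 0x0, 0x3, 0x0, 0x0, 0x1
};
```

Calling pack_id_fields(mmfr1_arm1176) gives 0x10030302, which is the value software would expect to read from this register if every field matches the table.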
c0, Memory Model Feature Register 2
The purpose of the Memory Model Feature Register 2 is to provide information about the
memory model, memory management, cache support, and TLB operations of the processor.
The Memory Model Feature Register 2 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-19 shows the bit arrangement for Memory Model Feature Register 2.
[31:28] - | [27:24] - | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-19 Memory Model Feature Register 2 format
Table 3-24 lists how the bit values correspond with the Memory Model Feature Register 2
functions.
Table 3-24 Memory Model Feature Register 2 bit functions
Bits     Field name  Function
[31:28]  -           Indicates support for a Hardware access flag.
                     0x0, no support in ARM1176JZ-S processors.
[27:24]  -           Indicates support for Wait For Interrupt stalling.
                     0x1, ARM1176JZ-S processors support Wait For Interrupt.
[23:20]  -           Indicates support for memory barrier operations.
                     0x2, ARM1176JZ-S processors support:
                     • Data Synchronization Barrier
                     • Prefetch Flush
                     • Data Memory Barrier.
[19:16]  -           Indicates support for TLB maintenance operations, unified architecture.
                     0x2, ARM1176JZ-S processors support:
                     • invalidate all entries
                     • invalidate TLB entry by MVA
                     • invalidate TLB entries by ASID match.
[15:12]  -           Indicates support for TLB maintenance operations, Harvard architecture.
                     0x2, ARM1176JZ-S processors support:
                     • invalidate instruction and data TLB, all entries
                     • invalidate instruction TLB, all entries
                     • invalidate data TLB, all entries
                     • invalidate instruction TLB by MVA
                     • invalidate data TLB by MVA
                     • invalidate instruction and data TLB entries by ASID match
                     • invalidate instruction TLB entries by ASID match
                     • invalidate data TLB entries by ASID match.
[11:8]   -           Indicates support for cache maintenance range operations, Harvard architecture.
                     0x1, ARM1176JZ-S processors support:
                     • invalidate data cache range by VA
                     • invalidate instruction cache range by VA
                     • clean data cache range by VA
                     • clean and invalidate data cache range by VA.
[7:4]    -           Indicates support for background prefetch cache range operations, Harvard architecture.
                     0x0, no support in ARM1176JZ-S processors.
[3:0]    -           Indicates support for foreground prefetch cache range operations, Harvard architecture.
                     0x0, no support in ARM1176JZ-S processors.
Table 3-25 lists the results of attempted access for each mode.
Table 3-25 Results of access to the Memory Model Feature Register 2
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Memory Model Feature Register 2 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c1
• Opcode_2 set to 6.
For example:
MRC p15, 0, <Rd>, c0, c1, 6 ;Read Memory Model Feature Register 2.
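The barrier and Wait For Interrupt fields of this register lend themselves to simple capability checks. The following sketch assumes a value already read with the MRC above; the function names are hypothetical. Table 3-24 documents only the value 0x2 for the barrier field, so treating larger values as a superset is an assumption of this example.

```c
#include <stdint.h>
#include <stdbool.h>

/* [27:24] is non-zero when Wait For Interrupt stalling is supported. */
static inline bool mmfr2_has_wfi(uint32_t mmfr2)
{
    return ((mmfr2 >> 24) & 0xFu) != 0u;
}

/* [23:20] = 0x2 is documented as DSB, Prefetch Flush, and DMB.
   Assumption: values above 0x2 include the same operations. */
static inline bool mmfr2_has_dmb(uint32_t mmfr2)
{
    return ((mmfr2 >> 20) & 0xFu) >= 0x2u;
}

/* [31:28] is non-zero when a hardware access flag is supported. */
static inline bool mmfr2_has_hw_access_flag(uint32_t mmfr2)
{
    return ((mmfr2 >> 28) & 0xFu) != 0u;
}
```

The field values in Table 3-24 combine to 0x01222100, for which the first two predicates are true and the third is false.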
c0, Memory Model Feature Register 3
The purpose of the Memory Model Feature Register 3 is to provide information about the
memory model, memory management, cache support, and TLB operations of the processor.
The Memory Model Feature Register 3 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-20 shows the bit arrangement for Memory Model Feature Register 3.
[31:8] Reserved | [7:4] - | [3:0] -
Figure 3-20 Memory Model Feature Register 3 format
Table 3-26 lists how the bit values correspond with the Memory Model Feature Register 3
functions.
Table 3-26 Memory Model Feature Register 3 bit functions
Bits     Field name  Function
[31:8]   -           Reserved. RAZ.
[7:4]    -           Support for hierarchical cache maintenance by MVA, all architectures.
                     0x0, no support in ARM1176JZ-S processors.
[3:0]    -           Support for hierarchical cache maintenance by Set/Way, all architectures.
                     0x0, no support in ARM1176JZ-S processors.
Table 3-27 lists the results of attempted access for each mode.
Table 3-27 Results of access to the Memory Model Feature Register 3
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Memory Model Feature Register 3 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c1
• Opcode_2 set to 7.
For example:
MRC p15, 0, <Rd>, c0, c1, 7 ;Read Memory Model Feature Register 3.
c0, Instruction Set Attributes Register 0
The purpose of the Instruction Set Attributes Register 0 is to provide information about the
instruction set that the processor supports beyond the basic set.
The Instruction Set Attributes Register 0 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-21 on page 3-37 shows the bit arrangement for Instruction Set Attributes Register 0.
[31:28] Reserved | [27:24] - | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-21 Instruction Set Attributes Register 0 format
Table 3-28 lists how the bit values correspond with the Instruction Set Attributes Register 0
functions.
Table 3-28 Instruction Set Attributes Register 0 bit functions
Bits     Field name  Function
[31:28]  -           Reserved. RAZ.
[27:24]  -           Indicates support for divide instructions.
                     0x0, no support in ARM1176JZ-S processors.
[23:20]  -           Indicates support for debug instructions.
                     0x1, ARM1176JZ-S processors support BKPT.
[19:16]  -           Indicates support for coprocessor instructions.
                     0x4, ARM1176JZ-S processors support:
                     • CDP, LDC, MCR, MRC, STC
                     • CDP2, LDC2, MCR2, MRC2, STC2
                     • MCRR, MRRC
                     • MCRR2, MRRC2.
[15:12]  -           Indicates support for combined compare and branch instructions.
                     0x0, no support in ARM1176JZ-S processors.
[11:8]   -           Indicates support for bitfield instructions.
                     0x0, no support in ARM1176JZ-S processors.
[7:4]    -           Indicates support for bit counting instructions.
                     0x1, ARM1176JZ-S processors support CLZ.
[3:0]    -           Indicates support for atomic load and store instructions.
                     0x1, ARM1176JZ-S processors support SWP and SWPB.
Table 3-29 lists the results of attempted access for each mode.
Table 3-29 Results of access to the Instruction Set Attributes Register 0
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Instruction Set Attributes Register 0 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c2
• Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c0, c2, 0 ;Read Instruction Set Attributes Register 0
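Code that must run on several processors might test individual instruction groups before using them. The sketch below checks three of the fields from Table 3-28 on a previously read value. The function names are hypothetical and are introduced only for this example.

```c
#include <stdint.h>
#include <stdbool.h>

/* [27:24] is non-zero when divide instructions are supported. */
static inline bool isar0_has_divide(uint32_t isar0)
{
    return ((isar0 >> 24) & 0xFu) != 0u;
}

/* [7:4] is non-zero when bit counting instructions (CLZ) are supported. */
static inline bool isar0_has_clz(uint32_t isar0)
{
    return ((isar0 >> 4) & 0xFu) != 0u;
}

/* [3:0] is non-zero when SWP and SWPB are supported. */
static inline bool isar0_has_swp(uint32_t isar0)
{
    return (isar0 & 0xFu) != 0u;
}
```

The field values in Table 3-28 combine to 0x00140011, so on this processor isar0_has_divide is false while the other two predicates are true.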
c0, Instruction Set Attributes Register 1
The purpose of the Instruction Set Attributes Register 1 is to provide information about the
instruction set that the processor supports beyond the basic set.
The Instruction Set Attributes Register 1 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-22 shows the bit arrangement for Instruction Set Attributes Register 1.
[31:28] - | [27:24] - | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-22 Instruction Set Attributes Register 1 format
Table 3-30 lists how the bit values correspond with the Instruction Set Attributes Register 1
functions.
Table 3-30 Instruction Set Attributes Register 1 bit functions
Bits     Field name  Function
[31:28]  -           Indicates support for Jazelle instructions.
                     0x1, ARM1176JZ-S processors support BXJ and J bit in PSRs.
[27:24]  -           Indicates support for interworking instructions.
                     0x2, ARM1176JZ-S processors support:
                     • BX, and T bit in PSRs
                     • BLX, and PC loads have BX behavior.
[23:20]  -           Indicates support for immediate instructions.
                     0x0, no support in ARM1176JZ-S processors.
[19:16]  -           Indicates support for if then instructions.
                     0x0, no support in ARM1176JZ-S processors.
[15:12]  -           Indicates support for sign or zero extend instructions.
                     0x2, ARM1176JZ-S processors support:
                     • SXTB, SXTB16, SXTH, UXTB, UXTB16, and UXTH
                     • SXTAB, SXTAB16, SXTAH, UXTAB, UXTAB16, and UXTAH.
[11:8]   -           Indicates support for exception 2 instructions.
                     0x1, ARM1176JZ-S processors support SRS, RFE, and CPS.
[7:4]    -           Indicates support for exception 1 instructions.
                     0x1, ARM1176JZ-S processors support LDM(2), LDM(3) and STM(2).
[3:0]    -           Indicates support for endianness control instructions.
                     0x1, ARM1176JZ-S processors support SETEND and E bit in PSRs.
Table 3-31 lists the results of attempted access for each mode.
Table 3-31 Results of access to the Instruction Set Attributes Register 1
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Instruction Set Attributes Register 1 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c2
• Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c0, c2, 1 ;Read Instruction Set Attributes Register 1
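Some fields in this register describe graded levels of support rather than a simple yes or no. The following sketch names the levels of the interworking field, bits [27:24]. Table 3-30 documents only the value 0x2; reading 0x1 as BX without BLX is an assumption based on the order of the two items listed for 0x2. All identifiers are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical names for the interworking field levels. Only level 2 is
   documented in Table 3-30; level 1 is an assumption of this sketch. */
enum interwork_level {
    INTERWORK_NONE = 0,
    INTERWORK_BX   = 1,
    INTERWORK_BLX  = 2
};

/* Return the raw interworking field, bits [27:24]. */
static inline unsigned isar1_interwork(uint32_t isar1)
{
    return (unsigned)((isar1 >> 24) & 0xFu);
}

static inline bool isar1_has_blx(uint32_t isar1)
{
    return isar1_interwork(isar1) >= (unsigned)INTERWORK_BLX;
}

/* [31:28] is non-zero when BXJ and the J bit in the PSRs are supported. */
static inline bool isar1_has_bxj(uint32_t isar1)
{
    return ((isar1 >> 28) & 0xFu) != 0u;
}
```

The field values in Table 3-30 combine to 0x12002111, for which isar1_interwork returns 2 and both predicates are true.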
c0, Instruction Set Attributes Register 2
The purpose of the Instruction Set Attributes Register 2 is to provide information about the
instruction set that the processor supports beyond the basic set.
The Instruction Set Attributes Register 2 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-23 shows the bit arrangement for Instruction Set Attributes Register 2.
[31:28] - | [27:24] - | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-23 Instruction Set Attributes Register 2 format
Table 3-32 lists how the bit values correspond with the Instruction Set Attributes Register 2
functions.
Table 3-32 Instruction Set Attributes Register 2 bit functions
Bits     Field name  Function
[31:28]  -           Indicates support for reversal instructions.
                     0x1, ARM1176JZ-S processors support REV, REV16, and REVSH.
[27:24]  -           Indicates support for PSR instructions.
                     0x1, ARM1176JZ-S processors support MRS and MSR exception return instructions for data-processing.
[23:20]  -           Indicates support for advanced unsigned multiply instructions.
                     0x2, ARM1176JZ-S processors support:
                     • UMULL and UMLAL
                     • UMAAL.
[19:16]  -           Indicates support for advanced signed multiply instructions.
                     0x3, ARM1176JZ-S processors support:
                     • SMULL and SMLAL
                     • SMLABB, SMLABT, SMLALBB, SMLALBT, SMLALTB, SMLALTT, SMLATB,
                       SMLATT, SMLAWB, SMLAWT, SMULBB, SMULBT, SMULTB, SMULTT,
                       SMULWB, SMULWT, and Q flag in PSRs
                     • SMLAD, SMLADX, SMLALD, SMLALDX, SMLSD, SMLSDX, SMLSLD,
                       SMLSLDX, SMMLA, SMMLAR, SMMLS, SMMLSR, SMMUL, SMMULR,
                       SMUAD, SMUADX, SMUSD, and SMUSDX.
[15:12]  -           Indicates support for multiply instructions.
                     0x1, ARM1176JZ-S processors support MLA.
[11:8]   -           Indicates support for multi-access interruptible instructions.
                     0x1, ARM1176JZ-S processors support restartable LDM and STM.
[7:4]    -           Indicates support for memory hint instructions.
                     0x2, ARM1176JZ-S processors support PLD.
[3:0]    -           Indicates support for load and store instructions.
                     0x1, ARM1176JZ-S processors support LDRD and STRD.
Table 3-33 lists the results of attempted access for each mode.
Table 3-33 Results of access to the Instruction Set Attributes Register 2
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Instruction Set Attributes Register 2 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c2
• Opcode_2 set to 2.
For example:
MRC p15, 0, <Rd>, c0, c2, 2 ;Read Instruction Set Attributes Register 2
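The multiply fields in Table 3-32 list several instruction groups for a single field value. The sketch below assumes that the groups are cumulative, so that each higher value adds one more group to those of the lower values. That reading is suggested by the table but not stated in it, and the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Return the advanced signed multiply field, bits [19:16]. */
static inline unsigned isar2_signed_mult_level(uint32_t isar2)
{
    return (unsigned)((isar2 >> 16) & 0xFu);
}

/* Assumption: level 3 adds the SMLAD group (third item for 0x3). */
static inline bool isar2_has_smlad(uint32_t isar2)
{
    return isar2_signed_mult_level(isar2) >= 3u;
}

/* Assumption: level 2 of bits [23:20] adds UMAAL (second item for 0x2). */
static inline bool isar2_has_umaal(uint32_t isar2)
{
    return ((isar2 >> 20) & 0xFu) >= 2u;
}
```

The field values in Table 3-32 combine to 0x11231121, for which the signed multiply level is 3 and both predicates are true.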
c0, Instruction Set Attributes Register 3
The purpose of the Instruction Set Attributes Register 3 is to provide information about the
instruction set that the processor supports beyond the basic set.
The Instruction Set Attributes Register 3 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-24 on page 3-41 shows the bit arrangement for Instruction Set Attributes Register 3.
[31:28] - | [27:24] - | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-24 Instruction Set Attributes Register 3 format
Table 3-34 lists how the bit values correspond with the Instruction Set Attributes Register 3
functions.
Table 3-34 Instruction Set Attributes Register 3 bit functions
Bits     Field name  Function
[31:28]  -           Indicates support for Thumb-2 extensions.
                     0x0, no support in ARM1176JZ-S processors.
[27:24]  -           Indicates support for true NOP instructions.
                     0x1, ARM1176JZ-S processors support NOP and the capability for additional NOP compatible
                     hints. ARM1176JZ-S processors do not support NOP16.
[23:20]  -           Indicates support for Thumb copy instructions.
                     0x1, ARM1176JZ-S processors support Thumb MOV(3) low register ⇒ low register, and the
                     CPY alias for Thumb MOV(3).
[19:16]  -           Indicates support for table branch instructions.
                     0x0, no support in ARM1176JZ-S processors.
[15:12]  -           Indicates support for synchronization primitive instructions.
                     0x2, ARM1176JZ-S processors support:
                     • LDREX and STREX
                     • LDREXB, LDREXH, LDREXD, STREXB, STREXH, STREXD, and CLREX.
[11:8]   -           Indicates support for SVC instructions.
                     0x1, ARM1176JZ-S processors support SVC.
[7:4]    -           Indicates support for Single Instruction Multiple Data (SIMD) instructions.
                     0x3, ARM1176JZ-S processors support:
                     PKHBT, PKHTB, QADD16, QADD8, QADDSUBX, QSUB16, QSUB8, QSUBADDX,
                     SADD16, SADD8, SADDSUBX, SEL, SHADD16, SHADD8, SHADDSUBX, SHSUB16,
                     SHSUB8, SHSUBADDX, SSAT, SSAT16, SSUB16, SSUB8, SSUBADDX, SXTAB16,
                     SXTB16, UADD16, UADD8, UADDSUBX, UHADD16, UHADD8, UHADDSUBX,
                     UHSUB16, UHSUB8, UHSUBADDX, UQADD16, UQADD8, UQADDSUBX, UQSUB16,
                     UQSUB8, UQSUBADDX, USAD8, USADA8, USAT, USAT16, USUB16, USUB8,
                     USUBADDX, UXTAB16, UXTB16, and the GE[3:0] bits in the PSRs.
[3:0]    -           Indicates support for saturate instructions.
                     0x1, ARM1176JZ-S processors support QADD, QDADD, QDSUB, QSUB and Q flag in PSRs.
Table 3-35 lists the results of attempted access for each mode.
Table 3-35 Results of access to the Instruction Set Attributes Register 3
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Instruction Set Attributes Register 3 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c2
• Opcode_2 set to 3.
For example:
MRC p15, 0, <Rd>, c0, c2, 3 ;Read Instruction Set Attributes Register 3
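When only the presence or absence of each instruction group matters, the register can be reduced to one bit per field. The following sketch builds such a summary from a previously read value. The function id_nonzero_fields is a hypothetical helper, not an ARM-defined operation.

```c
#include <stdint.h>

/* Hypothetical helper: return a bitmap in which bit i is set when the
   4-bit field at bits [4i+3:4i] of `reg` is non-zero. */
static inline uint8_t id_nonzero_fields(uint32_t reg)
{
    uint8_t map = 0;
    for (unsigned i = 0; i < 8u; ++i) {
        if (((reg >> (4u * i)) & 0xFu) != 0u)
            map |= (uint8_t)(1u << i);
    }
    return map;
}
```

The field values in Table 3-34 combine to 0x01102131, for which the function returns 0x6F. Bit 4 (table branch, bits [19:16]) and bit 7 (Thumb-2 extensions, bits [31:28]) are clear, matching the two fields that the table lists as unsupported.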
c0, Instruction Set Attributes Register 4
The purpose of the Instruction Set Attributes Register 4 is to provide information about the
instruction set that the processor supports beyond the basic set.
The Instruction Set Attributes Register 4 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
Figure 3-25 shows the bit arrangement for Instruction Set Attributes Register 4.
[31:24] Reserved | [23:20] - | [19:16] - | [15:12] - | [11:8] - | [7:4] - | [3:0] -
Figure 3-25 Instruction Set Attributes Register 4 format
Table 3-36 lists how the bit values correspond with the Instruction Set Attributes Register 4
functions.
Table 3-36 Instruction Set Attributes Register 4 bit functions
Bits     Field name  Function
[31:28]  -           Reserved. RAZ.
[27:24]  -           Reserved. RAZ.
[23:20]  -           Indicates fractional support for synchronization primitive instructions.
                     0x0, ARM1176JZ-S processors support all synchronization primitive instructions.
                     See Table 3-34 on page 3-41.
[19:16]  -           Indicates support for barrier instructions.
                     0x0, None. ARM1176JZ-S processors support only the CP15 barrier operations.
[15:12]  -           Indicates support for SMC instructions.
                     0x1, ARM1176JZ-S processors support SMC.
[11:8]   -           Indicates support for writeback instructions.
                     0x1, ARM1176JZ-S processors support all defined writeback addressing modes.
[7:4]    -           Indicates support for with shift instructions.
                     0x4, ARM1176JZ-S processors support:
                     • shifts of loads and stores over the range LSL 0-3
                     • constant shift options
                     • register controlled shift options.
[3:0]    -           Indicates support for Unprivileged instructions.
                     0x1, ARM1176JZ-S processors support LDRBT, LDRT, STRBT, and STRT.
Table 3-37 lists the results of attempted access for each mode.
Table 3-37 Results of access to the Instruction Set Attributes Register 4
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Instruction Set Attributes Register 4 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c2
• Opcode_2 set to 4.
For example:
MRC p15, 0, <Rd>, c0, c2, 4 ;Read Instruction Set Attributes Register 4
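Bits [23:20] of this register qualify the synchronization primitive field in bits [15:12] of Instruction Set Attributes Register 3. The sketch below combines the two fields into one number so that they can be compared together. Combining them as a major and fractional part in this way is an assumption of the example, and all function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Combine ISAR3[15:12] (major part) with ISAR4[23:20] (fractional part).
   Assumption: a larger combined value means more instructions supported. */
static inline unsigned sync_prim_version(uint32_t isar3, uint32_t isar4)
{
    unsigned major = (unsigned)((isar3 >> 12) & 0xFu);
    unsigned frac  = (unsigned)((isar4 >> 20) & 0xFu);
    return (major << 4) | frac;
}

/* [15:12] is non-zero when SMC is supported. */
static inline bool isar4_has_smc(uint32_t isar4)
{
    return ((isar4 >> 12) & 0xFu) != 0u;
}

/* [19:16] is zero when only the CP15 barrier operations are available. */
static inline bool isar4_has_barrier_instrs(uint32_t isar4)
{
    return ((isar4 >> 16) & 0xFu) != 0u;
}
```

With the values from Table 3-34 and Table 3-36 (0x01102131 and 0x00001141), sync_prim_version returns 0x20, isar4_has_smc is true, and isar4_has_barrier_instrs is false, so barrier operations must be issued through CP15.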
c0, Instruction Set Attributes Register 5
The purpose of the Instruction Set Attributes Register 5 is to provide additional information
about the properties of the processor.
The Instruction Set Attributes Register 5 is:
• in CP15 c0
• a 32-bit read-only register common to the Secure and Non-secure worlds
• accessible in privileged modes only.
The contents of the Instruction Set Attributes Register 5 are implementation defined. In the
ARM1176JZ-S processor, Instruction Set Attributes Register 5 is read as 0x00000000.
Table 3-38 lists the results of attempted access for each mode.
Table 3-38 Results of access to the Instruction Set Attributes Register 5
Secure Privileged           Non-secure Privileged       User
Read  Write                 Read  Write
Data  Undefined exception   Data  Undefined exception   Undefined exception
To use the Instruction Set Attributes Register 5 read CP15 with:
• Opcode_1 set to 0
• CRn set to c0
• CRm set to c2
• Opcode_2 set to 5.
For example:
MRC p15, 0, <Rd>, c0, c2, 5 ;Read Instruction Set Attributes Register 5.
3.2.7 c1, Control Register
This section contains information on:
• Purpose of the Control Register
• Structure of the Control Register
• Operation of the Control Register on page 3-45
• Use of the Control Register on page 3-47
• Behavior of the Control Register on page 3-48.
Purpose of the Control Register
The purpose of the Control Register is to provide control and configuration of:
• memory alignment, endianness, protection, and fault behavior
• MMU and cache enables and cache replacement strategy
• interrupts and the behavior of interrupt latency
• the location for exception vectors
• program flow prediction.
Table 3-39 on page 3-45 lists the purposes of the individual bits in the Control Register.
Structure of the Control Register
The Control Register is:
• in CP15 c1
• a 32-bit register; Table 3-39 on page 3-45 lists read and write access to individual bits for the Secure and Non-secure worlds
• accessible in privileged modes only
• partially banked; Table 3-39 on page 3-45 lists banked and Secure modify only bits.
Figure 3-26 on page 3-45 shows the arrangement of bits in the register.
[31:30] SBZ | [29] FA | [28] TR | [27:26] SBZ | [25] EE | [24] VE | [23] XP | [22] U | [21] FI | [20:19] SBZ | [18] IT | [17] SBZ | [16] DT | [15] L4 | [14] RR | [13] V | [12] I | [11] Z | [10] F | [9] R | [8] S | [7] B | [6:4] SBO | [3] W | [2] C | [1] A | [0] M
Figure 3-26 Control Register format
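Because several fields of the Control Register must be written back with their existing values, software normally changes it with a read-modify-write sequence. The following sketch shows only the modify step on a value that is assumed to have been read already. The macro names and the function ctrl_update are hypothetical; the bit positions are those shown in Figure 3-26 and listed in Table 3-39.

```c
#include <stdint.h>

/* Hypothetical mask names; bit positions from Figure 3-26 and Table 3-39. */
#define CTRL_FA  (1u << 29)  /* Force AP */
#define CTRL_TR  (1u << 28)  /* TEX remap */
#define CTRL_EE  (1u << 25)  /* CPSR E bit on exception */
#define CTRL_VE  (1u << 24)  /* VIC interface determines interrupt vectors */
#define CTRL_XP  (1u << 23)  /* Subpage AP bits disabled */
#define CTRL_U   (1u << 22)  /* Unaligned data access support */
#define CTRL_FI  (1u << 21)  /* Low interrupt latency configuration */
#define CTRL_L4  (1u << 15)  /* Loads to PC do not set the T bit */
#define CTRL_RR  (1u << 14)  /* Round-robin cache replacement */
#define CTRL_V   (1u << 13)  /* Location of exception vectors */

/* Compute a new register value: clear the bits in `clear`, then set the
   bits in `set`. All other bits keep their existing values. If a bit is
   in both masks, it ends up set. */
static inline uint32_t ctrl_update(uint32_t current, uint32_t set, uint32_t clear)
{
    return (current & ~clear) | set;
}
```

For example, ctrl_update(value, CTRL_U | CTRL_XP, CTRL_RR) enables unaligned access support and disables subpage AP bits, selects random cache replacement, and leaves every other bit of value unchanged.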
Operation of the Control Register
Table 3-39 lists how the bit values correspond with the Control Register functions.
Table 3-39 Control Register bit functions
Bits
Field
name
Access
Function
[31:30]
-
-
This field is UNP when read. Write as the existing value.
[29]
FA
Banked
This bit controls the Force AP functionality in the MMU that generates Access
Bit faults, see Access permissions on page 6-11
0 = Force AP is disabled, reset value.
1 = Force AP is enabled.
[28]
TR
Banked
This bit controls the TEX remap functionality in the MMU, see Memory
region attributes on page 6-14.
0 = TEX remap disabled. Normal ARMv6 behavior, reset value
1 = TEX remap enabled. TEX[2:1] become page table bits for OS.
[27:26]
-
-
This field is UNP when read. Write as the existing value.
[25]
EE bit
Banked
Determines how the E bit in the CPSR bit is set on an exception. The reset
value depends on external signals.
0 = CPSR E bit is set to 0 on an exception, reset value.
1 = CPSR E bit is set to 1 on an exception.
[24]
VE bit
Banked
Enables the VIC interface to determine interrupt vectors.
See the description of the V bit, bit [13].
0 = Interrupt vectors are fixed, reset value.
1 = Interrupt vectors are defined by the VIC interface.
[23]
XP bit
Banked
Enables the extended page tables to be configured for the hardware page
translation mechanism.
0 = Subpage AP bits enabled, reset value.
1 = Subpage AP bits disabled.
[22]
U bit
Banked
Enables unaligned data access operations, including support for mixed
little-endian and big-endian operation. The A bit has priority over the U bit.
The reset value of the U bit depends on external signals.
0 = Unaligned data access support disabled, reset value. The processor treats
unaligned loads as rotated aligned data accesses.
1 = Unaligned data access support enabled. The processor permits unaligned
loads and stores and support for mixed endian data is enabled.
[21]
FI bit
Secure modify only
Configures low latency features for fast interrupts. This bit is overridden by
the FIO bit, see c1, Auxiliary Control Register on page 3-49.
0 = All performance features enabled, reset value.
1 = Low interrupt latency configuration enabled. See Low interrupt latency
configuration on page 2-40.
[20:19]
-
-
UNP/SBZ
[18]
IT bit
-
Deprecated. Global enable for instruction TCM.
Function redundant in ARMv6.
SBO
[17]
-
-
UNP/SBZ
[16]
DT bit
-
Deprecated. Global enable for data TCM.
Function redundant in ARMv6.
SBO
[15]
L4 bit
Secure modify only
Determines if the T bit is set for PC load instructions. For more details see the
ARM Architecture Reference Manual.
0 = Loads to PC set the T bit, reset value.
1 = Loads to PC do not set the T bit, ARMv4 behavior.
[14]
RR bit
Secure modify only
Determines the replacement strategy for the cache.
0 = Normal replacement strategy by random replacement, reset value.
1 = Predictable replacement strategy by round-robin replacement.
[13]
V bit
Banked
Determines the location of exception vectors, see c12, Secure or Non-secure
Vector Base Address Register on page 3-121 and c12, Monitor Vector Base
Address Register on page 3-122. The reset value of the V bit depends on an
external signal.
0 = Normal exception vectors selected, the Vector Base Address Registers
determine the address range, reset value.
1 = High exception vectors selected, address range = 0xFFFF0000-0xFFFF001C.
[12]
I bit
Banked
Enables level one instruction cache.
0 = Instruction Cache disabled, reset value.
1 = Instruction Cache enabled.
[11]
Z bit
Banked
Enables branch prediction.
0 = Program flow prediction disabled, reset value.
1 = Program flow prediction enabled.
[10]
F bit
-
Should Be Zero
[9]
R bit
Banked
Deprecated. Enables ROM protection. If you modify the R bit this does not
affect the access permissions of entries already in the TLB. See MMU
software-accessible registers on page 6-53.
0 = ROM protection disabled, reset value.
1 = ROM protection enabled.
[8]
S bit
Banked
Deprecated. Enables MMU protection. If you modify the S bit this does not
affect the access permissions of entries already in TLB.
0 = MMU protection disabled, reset value.
1 = MMU protection enabled.
[7]
B bit
Secure modify only
Determines operation as little-endian or big-endian word invariant memory
system and the names of the low four-byte addresses within a 32-bit word. The
reset value of the B bit depends on the BIGENDINIT external signal.
0 = Little-endian memory system, reset value.
1 = Big-endian word-invariant memory system.
[6:4]
-
-
This field returns 1 when read.
Should Be One.
[3]
W bit
-
Not implemented in the processor.
Read As One
Write Ignore.
[2]
C bit
Banked
Enables level one data cache.
0 = Data cache disabled, reset value.
1 = Data cache enabled.
[1]
A bit
Banked
Enables strict alignment of data to detect alignment faults in data accesses.
The A bit setting takes priority over the U bit.
0 = Strict alignment fault checking disabled, reset value.
1 = Strict alignment fault checking enabled.
[0]
M bit
Banked
Enables the MMU.
0 = MMU disabled, reset value.
1 = MMU enabled.
Attempts to read or write the Control Register from Secure or Non-secure User modes result
in an Undefined exception.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Attempts to write Secure modify only bits in Non-secure privileged modes are ignored.
Attempts to read Secure modify only bits return the Secure bit value. Table 3-40 lists the actions
that result from attempted access for each mode.
Table 3-40 Results of access to the Control Register
Access type          Secure Privileged access   Non-secure Privileged access        User access
                                                Read             Write
Secure modify only   Secure bit                 Secure bit       Ignored            Undefined exception
Banked               Secure bit                 Non-secure bit   Non-secure bit     Undefined exception
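The masking that Table 3-40 describes for Non-secure privileged accesses can be modelled in a few lines of C. This is an illustrative sketch only: the function names, and the idea of passing the Secure and Non-secure copies as plain integers, are assumptions made for the example, and reserved or fixed-value bits are not modelled. The Secure modify only mask is built from the FI, L4, RR, and B bits in Table 3-39.

```c
#include <stdint.h>

/* Secure modify only bits from Table 3-39: FI [21], L4 [15], RR [14], B [7]. */
#define CR_SECURE_MODIFY_ONLY ((1u << 21) | (1u << 15) | (1u << 14) | (1u << 7))

/* Hypothetical model of a Non-secure privileged read:
 * Secure modify only bits come from the Secure copy,
 * Banked bits come from the Non-secure copy. */
static inline uint32_t cr_ns_read(uint32_t secure_copy, uint32_t ns_copy)
{
    return (secure_copy & CR_SECURE_MODIFY_ONLY) |
           (ns_copy & ~CR_SECURE_MODIFY_ONLY);
}

/* Hypothetical model of a Non-secure privileged write:
 * returns the part of the written value that reaches the
 * Non-secure copy. Writes to Secure modify only bits are ignored. */
static inline uint32_t cr_ns_write_effect(uint32_t written_value)
{
    return written_value & ~CR_SECURE_MODIFY_ONLY;
}
```

In this model a Non-secure write of all ones leaves the four Secure modify only bits untouched, which matches the Ignored entry in the table.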
Use of the Control Register
It is recommended that you use a read-modify-write technique to update the Control Register.
To access the Control Register, read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c1
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c1, c0, 0 ; Read Control Register configuration data
MCR p15, 0, <Rd>, c1, c0, 0 ; Write Control Register configuration data
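The read-modify-write technique can be sketched in C as follows. This is a minimal illustration rather than ARM-supplied code: the names cr_update, cr_read, and cr_write are hypothetical, the bit masks follow Table 3-39, and the two accessor functions assume a GCC-compatible compiler that targets ARM and code that runs in a privileged mode.

```c
#include <stdint.h>

/* Bit masks taken from Table 3-39. */
#define CR_M (1u << 0)   /* MMU enable */
#define CR_A (1u << 1)   /* strict alignment fault checking */
#define CR_C (1u << 2)   /* level one data cache enable */
#define CR_Z (1u << 11)  /* program flow prediction enable */
#define CR_I (1u << 12)  /* level one instruction cache enable */

/* Pure modify step: clear the bits in clear_mask, then set the bits
 * in set_mask. All other bits keep their existing value. */
static inline uint32_t cr_update(uint32_t current, uint32_t set_mask,
                                 uint32_t clear_mask)
{
    return (current & ~clear_mask) | set_mask;
}

#if defined(__GNUC__) && defined(__arm__)
/* Read step: MRC p15, 0, <Rd>, c1, c0, 0 */
static inline uint32_t cr_read(void)
{
    uint32_t value;
    __asm__ volatile ("mrc p15, 0, %0, c1, c0, 0" : "=r" (value));
    return value;
}

/* Write step: MCR p15, 0, <Rd>, c1, c0, 0 */
static inline void cr_write(uint32_t value)
{
    __asm__ volatile ("mcr p15, 0, %0, c1, c0, 0" : : "r" (value) : "memory");
}
#endif
```

On a target, cr_write(cr_update(cr_read(), CR_I | CR_C, 0)) would enable both level one caches and write every other bit back as its existing value, as the UNP fields in Table 3-39 require.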
Normally, to set the V bit and the B, EE, and U bits you configure signals at reset.
The V bit depends on VINITHI at reset:
•
VINITHI LOW sets V to 0
•
VINITHI HIGH sets V to 1.
The B, EE, and U bits depend on how you set BIGENDINIT and UBITINIT at reset.
Table 3-41 lists the values of the B, EE, and U bits that result for the reset values of these signals.
See Reset values of the U, B, and EE bits on page 4-19.
Table 3-41 Resultant B bit, U bit, and EE bit values
UBITINIT   BIGENDINIT   EE   U   B
0          0            0    0   0
0          1            0    0   1
1          0            0    1   0
1          1            1    1   0
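The mapping in Table 3-41 reduces to three Boolean expressions. The sketch below restates the table in C so that the combinations are easy to check. The structure and function names are hypothetical and exist only for this example.

```c
#include <stdbool.h>

/* Reset values of the EE, U, and B bits of the Control Register. */
struct reset_endian_bits {
    bool ee;
    bool u;
    bool b;
};

/* Restates Table 3-41: derive the reset values from the levels of
 * the UBITINIT and BIGENDINIT signals at reset. */
static inline struct reset_endian_bits reset_bits(bool ubitinit, bool bigendinit)
{
    struct reset_endian_bits r;
    r.ee = ubitinit && bigendinit;   /* set only when both signals are HIGH */
    r.u  = ubitinit;                 /* follows UBITINIT */
    r.b  = !ubitinit && bigendinit;  /* set only when UBITINIT is LOW */
    return r;
}
```

With both signals HIGH the processor resets with EE and U set and B clear, while BIGENDINIT HIGH on its own sets only the B bit.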
Behavior of the Control Register
These bits in the Control Register exhibit specific behavior:
A bit
The A bit setting takes priority over the U bit. The Data Abort trap is taken if strict
alignment is enabled and the data access is not aligned to the width of the
accessed data item.
DT bit
This bit is used in ARM946 and ARM966 processors to enable the Data TCM.
In ARMv6, the TCM blocks have individual enables that apply to each block. As
a result, this bit is now redundant and Should Be One. See c9, Data TCM Region
Register on page 3-90 for a description of the ARM1176JZ-S TCM enables.
IT bit
This bit is used in ARM946 and ARM966 processors to enable the Instruction
TCM.
In ARMv6, the TCM blocks have individual enables that apply to each block. As
a result, this bit is now redundant and Should Be One. See c9, Instruction TCM
Region Register on page 3-92 for a description of the ARM1176JZ-S TCM
enables.
R bit
Modifying the R bit does not affect the access permissions of entries already in
the TLB. See MMU software-accessible registers on page 6-53.
S bit
Modifying the S bit does not affect the access permissions of entries already in
the TLB. See MMU software-accessible registers on page 6-53.
W bit
The ARM1176JZ-S processor does not implement the write buffer enable
because all memory writes take place through the Write Buffer.
3.2.8
c1, Auxiliary Control Register
The purpose of the Auxiliary Control Register is to control:
•
program flow
•
low interrupt latency
•
cache cleaning
•
MicroTLB cache strategy
•
cache size restriction.
For more information on how the system control coprocessor operates with caches, see Cache
control and configuration on page 3-7.
Table 3-42 lists the purposes of the individual bits in the Auxiliary Control Register.
The Auxiliary Control Register is:
•
in CP15 c1
•
a 32-bit:
— read/write register in the Secure world
— read only register in the Non-secure world
•
accessible in privileged modes only.
Figure 3-27 shows the arrangement of bits in the register.
Figure 3-27 Auxiliary Control Register format
Table 3-42 lists how the bit values correspond with the Auxiliary Control Register functions.
Table 3-42 Auxiliary Control Register bit functions
Bits
Field name
Function
[31]
FIO
Provides additional level of control for low interrupt latency configuration. This bit overrides the
FI bit, see FI bit in c1, Control Register on page 3-44:
0 = Normal operation for low interrupt latency configuration, reset value
1 = Low interrupt latency configuration overridden. This feature:
•
disables the fast interrupt response introduced by setting the FI bit
•
disables Hit-Under-Miss (HUM) functionality
•
abandons restartable external accesses so that all external aborts to loads are precise.
[30]
FSD
Provides additional level of control for speculative operations, see c1, Control Register on
page 3-44. Force speculative operations force the PC to a new value because of static,
speculative, branch prediction:
0 = Enable force speculative operations, reset value
1 = Disable force speculative operations.
[29]
BFD
Disables branch folding. This behavior also depends on the SB and DB bits, [2:1] in this register,
and the Z bit, see c1, Control Register on page 3-44:
0 = Branch folding is enabled, when branch prediction is enabled, reset value
1 = Branch folding is disabled.
[28]
PHD
Disables instruction prefetch halting on unconditional, unpredictable instructions that later result
in a prefetch buffer flush. This prefetch halting is a power saving technique:
0 = Prefetch halting is enabled, reset value
1 = Prefetch halting is disabled.
[27:7]
-
UNP/SBZ
[6]
CZ
Controls the restriction of cache size to 16KB. This enables the processor to run software that
does not support ARMv6 page coloring. When set, the CZ bit does not affect the Cache Type
Register. See Restrictions on page table mappings page coloring on page 6-41 for more
information:
0 = Normal ARMv6 cache behavior, reset value
1 = Cache size limited to 16KB.
[5]
RV
Disables block transfer cache operations:
0 = Block transfer cache operations enabled, reset value
1 = Block transfer cache operations disabled.
[4]
RA
Disables clean entire data cache:
0 = Clean entire data cache enabled, reset value
1 = Clean entire data cache disabled.
[3]
TR
Enables MicroTLB random replacement strategy. This depends on the cache replacement
strategy that the RR bit controls, see c1, Control Register on page 3-44. The MicroTLB strategy
is only random when the cache strategy is random:
0 = MicroTLB replacement is Round Robin, reset value
1 = MicroTLB replacement is Random if cache replacement is also Random.
[2]
SB
Enables static branch prediction. This depends on program flow prediction that the Z bit enables,
see c1, Control Register on page 3-44:
0 = Static branch prediction disabled
1 = Static branch prediction enabled, if the Z bit is set. The reset value is 1.
[1]
DB
Enables dynamic branch prediction. This depends on program flow prediction that the Z bit
enables, see c1, Control Register on page 3-44:
0 = Dynamic branch prediction disabled
1 = Dynamic branch prediction enabled, if the Z bit is set. The reset value is 1.
[0]
RS
Enables the return stack. This depends on program flow prediction that the Z bit enables, see c1,
Control Register on page 3-44:
0 = Return stack is disabled
1 = Return stack is enabled, if the Z bit is set. The reset value is 1.
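Table 3-42 states that the SB, DB, and RS bits only take effect when the Z bit of the Control Register is set. The following C sketch makes that dependency explicit. It is an illustrative model, the structure and function names are hypothetical, and it deliberately leaves out the BFD bit because the exact interaction of branch folding with SB and DB is not spelled out here.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR_Z     (1u << 11)  /* Control Register Z bit, Table 3-39 */
#define ACTLR_RS (1u << 0)   /* return stack enable */
#define ACTLR_DB (1u << 1)   /* dynamic branch prediction enable */
#define ACTLR_SB (1u << 2)   /* static branch prediction enable */

struct flow_prediction {
    bool return_stack;
    bool dynamic_pred;
    bool static_pred;
};

/* Combine the Control Register and Auxiliary Control Register values
 * into the set of program flow features that are actually active. */
static inline struct flow_prediction effective_prediction(uint32_t control,
                                                          uint32_t aux)
{
    bool z = (control & CR_Z) != 0;
    struct flow_prediction p;
    p.return_stack = z && (aux & ACTLR_RS) != 0;
    p.dynamic_pred = z && (aux & ACTLR_DB) != 0;
    p.static_pred  = z && (aux & ACTLR_SB) != 0;
    return p;
}
```

Because SB, DB, and RS all reset to 1 while Z resets to 0, setting the Z bit alone is enough in this model to turn on all three features.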
Table 3-43 lists the results of attempted access for each mode.
Table 3-43 Results of access to the Auxiliary Control Register
Secure Privileged   Non-secure Privileged            User
Read     Write      Read     Write
Data     Data       Data     Undefined exception     Undefined exception
To update the Auxiliary Control Register you must use a read-modify-write technique. To access
the Auxiliary Control Register, read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c1
•
CRm set to c0
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c1, c0, 1 ; Read Auxiliary Control Register
MCR p15, 0, <Rd>, c1, c0, 1 ; Write Auxiliary Control Register
3.2.9
c1, Coprocessor Access Control Register
The purpose of the Coprocessor Access Control Register is to set access rights for the
coprocessors CP0 through CP13. This register has no effect on access to CP14, the debug
control coprocessor, or CP15, the system control coprocessor. This register also provides a
means for software to determine if any particular coprocessor, CP0-CP13, exists in the system.
The Coprocessor Access Control Register is:
•
in CP15 c1
•
a 32-bit read/write register common to Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-28 shows the arrangement of bits in the register.
Figure 3-28 Coprocessor Access Control Register format
Table 3-44 lists how the bit values correspond with the Coprocessor Access Control Register
functions.
Table 3-44 Coprocessor Access Control Register bit functions
Bits
Field name
Function
[31:28]
-
UNP/SBZ.
-
cp<n> (a)
Defines access permissions for each coprocessor.
Access denied is the reset condition.
Access denied is the behavior for non-existent coprocessors:
b00 = Access denied, reset value. Attempted access generates an Undefined exception
b01 = Privileged mode access only
b10 = Reserved.
b11 = Privileged and User mode access.
a. n is the coprocessor number between 0 and 13.
Access to coprocessors in the Non-secure world depends on the permissions set in the c1,
Non-Secure Access Control Register on page 3-55.
The results of attempts to read or write the Coprocessor Access Control Register access bits
depend on the corresponding bit for each coprocessor in c1, Non-Secure Access Control Register
on page 3-55.
Table 3-45 lists the results of attempted access to coprocessor access bits for each mode.
Table 3-45 Results of access to the Coprocessor Access Control Register
Corresponding bit in Non-Secure   Secure Privileged   Non-secure Privileged   User
Access Control Register           Read     Write      Read     Write
0                                 Data     Data       b00      Ignored        Undefined exception
1                                 Data     Data       Data     Data           Undefined exception
To use the Coprocessor Access Control Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c1
•
CRm set to c0
•
Opcode_2 set to 2.
For example:
MRC p15, 0, <Rd>, c1, c0, 2 ; Read Coprocessor Access Control Register
MCR p15, 0, <Rd>, c1, c0, 2 ; Write Coprocessor Access Control Register
You must perform an Instruction Memory Barrier (IMB) sequence immediately after an update
of the Coprocessor Access Control Register, see Memory Barriers on page 5-8. You must not
attempt to execute any instructions that are affected by the change of access rights between the
IMB sequence and the register update.
To determine if any particular coprocessor exists in the system write the access bits for the
coprocessor of interest with a value other than b00. If the coprocessor does not exist in the
system the access rights remain set to b00.
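Each coprocessor owns a two-bit access field, with cp0 in bits [1:0] and cp13 in bits [27:26]. The C sketch below shows one way to compute a new register value and to interpret the value read back during the existence check described above. The enumeration and function names are hypothetical, and the code only manipulates values. The MRC and MCR accesses and the IMB sequence still have to be performed by privileged code on the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Access field encodings from Table 3-44. b10 is reserved. */
enum cp_access {
    CP_DENIED    = 0x0,  /* b00 */
    CP_PRIV_ONLY = 0x1,  /* b01 */
    CP_FULL      = 0x3   /* b11 */
};

/* Return cpacr with the field for coprocessor n (0 to 13) replaced. */
static inline uint32_t cpacr_set(uint32_t cpacr, unsigned n, enum cp_access a)
{
    assert(n <= 13);
    unsigned shift = 2u * n;
    return (cpacr & ~(3u << shift)) | ((uint32_t)a << shift);
}

/* Extract the field for coprocessor n (0 to 13). */
static inline enum cp_access cpacr_get(uint32_t cpacr, unsigned n)
{
    assert(n <= 13);
    return (enum cp_access)((cpacr >> (2u * n)) & 3u);
}

/* Existence check: readback is the register value read after writing
 * a value other than b00 to the field for coprocessor n. */
static inline bool cp_exists(uint32_t readback, unsigned n)
{
    return cpacr_get(readback, n) != CP_DENIED;
}
```

For instance, cpacr_set(value, 10, CP_FULL) requests Privileged and User mode access to CP10, and cp_exists applied to the value read back afterwards reports whether the request took effect.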
3.2.10
c1, Secure Configuration Register
The purpose of the Secure Configuration Register is to define:
•
the current world as Secure or Non-secure
•
the world in which the core executes exceptions
•
the ability to modify the A and I bits in the CPSR in the Non-secure world.
The Secure Configuration Register is:
•
in CP15 c1
•
a 32 bit read/write register
•
accessible in Secure privileged modes only.
Figure 3-29 shows the arrangement of bits in the register.
Figure 3-29 Secure Configuration Register format
Table 3-46 lists how the bit values correspond with the Secure Configuration Register functions.
Table 3-46 Secure Configuration Register bit functions
Bits
Field name
Function
[31:7]
-
UNP/SBZ.
[6]
nET
The Early Termination bit is not implemented in ARM1176JZ-S processors.
UNP/SBZ.
[5]
AW
Determines if the A bit in the CPSR can be modified when in the Non-secure world:
0 = Disable modification of the A bit in the CPSR in the Non-secure world, reset value
1 = Enable modification of the A bit in the CPSR in the Non-secure world.
[4]
FW
Determines if the F bit in the CPSR can be modified when in the Non-secure world:
0 = Disable modification of the F bit in the CPSR in the Non-secure world, reset value
1 = Enable modification of the F bit in the CPSR in the Non-secure world.
[3]
EA
Determines External Abort behavior for Secure and Non-secure worlds:
0 = Branch to abort mode on an External Abort exception, reset value
1 = Branch to Secure Monitor mode on an External Abort exception.
[2]
FIQ
Determines FIQ behavior for Secure and Non-secure worlds:
0 = Branch to FIQ mode on an FIQ exception, reset value
1 = Branch to Secure Monitor mode on an FIQ exception.
[1]
IRQ
Determines IRQ behavior for Secure and Non-secure worlds:
0 = Branch to IRQ mode on an IRQ exception, reset value
1 = Branch to Secure Monitor mode on an IRQ exception.
[0]
NS bit
Defines the world for the processor:
0 = Secure, reset value
1 = Non-secure.
Note
When the core runs in Secure Monitor mode the state is considered Secure regardless of the state
of the NS bit. However, Monitor mode code can access Non-secure banked copies of registers if
the NS bit is set to 1. See the ARM Architecture Reference Manual for information on the effect
of the Security Extensions on the CP15 registers.
The permutations of the bits in the Secure Configuration Register have certain security
implications. Table 3-47 lists the results for combinations of the FW and FIQ bits.
Table 3-47 Operation of the FW and FIQ bits
FW   FIQ   Function
1    0     FIQs handled locally.
0    1     FIQs can be configured to give deterministic Secure interrupts.
1    1     Non-secure world able to make denial of service attack, avoid use of this function.
0    0     Avoid because the core might enter an infinite loop for Non-secure FIQ.
Table 3-48 lists the results for combinations of the AW and EA bits.
Table 3-48 Operation of the AW and EA bits
AW   EA   Function
1    0    Aborts handled locally.
0    1    All external aborts trapped to Secure Monitor.
1    1    All external imprecise data aborts trapped to Secure Monitor but the Non-secure world
          can hide Secure aborts from the Secure Monitor, avoid use of this function.
0    0    Avoid because the core can unexpectedly enter an abort mode in the Non-secure world.
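Tables 3-47 and 3-48 follow the same pattern: the two combinations in which the bits differ are usable, and the two in which they are equal are to be avoided. A C sketch of a check that Secure boot code could apply to a candidate register value is shown below. The function names are hypothetical and the check simply mirrors the two tables. It does not replace a security review of the chosen configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks from Table 3-46. */
#define SCR_NS  (1u << 0)
#define SCR_IRQ (1u << 1)
#define SCR_FIQ (1u << 2)
#define SCR_EA  (1u << 3)
#define SCR_FW  (1u << 4)
#define SCR_AW  (1u << 5)

/* Table 3-47: usable only when exactly one of FW and FIQ is set. */
static inline bool scr_fiq_config_ok(uint32_t scr)
{
    bool fw  = (scr & SCR_FW) != 0;
    bool fiq = (scr & SCR_FIQ) != 0;
    return fw != fiq;
}

/* Table 3-48: usable only when exactly one of AW and EA is set. */
static inline bool scr_abort_config_ok(uint32_t scr)
{
    bool aw = (scr & SCR_AW) != 0;
    bool ea = (scr & SCR_EA) != 0;
    return aw != ea;
}
```

Note that the reset value, with all four bits clear, falls into the rows marked Avoid in both tables, so this check reports it as not usable.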
For more details on the use of Secure Monitor mode, see The NS bit and Secure Monitor mode
on page 2-4.
To use the Secure Configuration Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c1
•
CRm set to c1
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c1, c1, 0 ; Read Secure Configuration Register data
MCR p15, 0, <Rd>, c1, c1, 0 ; Write Secure Configuration Register data
An attempt to access the Secure Configuration Register from any state other than Secure
privileged results in an Undefined exception.
3.2.11
c1, Secure Debug Enable Register
The purpose of the Secure Debug Enable Register is to provide control of permissions for debug
in Secure User mode, see Chapter 13 Debug.
Table 3-49 on page 3-55 lists the purposes of the individual bits in the Secure Debug Enable
Register.
The Secure Debug Enable Register is:
•
in CP15 c1
•
a 32 bit register in the Secure world only
•
accessible in Secure privileged modes only.
Figure 3-30 shows the arrangement of bits in the register.
Figure 3-30 Secure Debug Enable Register format
Table 3-49 lists how the bit values correspond with the Secure Debug Enable Register functions.
Table 3-49 Secure Debug Enable Register bit functions
Bits
Field name
Function
[31:2]
-
This field is UNP when read. Write as the existing value.
[1]
SUNIDEN
Enables Secure User non-invasive debug:
0 = Non-invasive debug is not permitted in Secure User mode, reset value
1 = Non-invasive debug is permitted in Secure User mode.
[0]
SUIDEN
Enables Secure User invasive debug:
0 = Invasive debug is not permitted in Secure User mode, reset value
1 = Invasive debug is permitted in Secure User mode.
Table 3-50 lists the results of attempted access for each mode.
Table 3-50 Results of access to the Secure Debug Enable Register
Secure Privileged     Non-secure Privileged   User
Read      Write
Data      Data        Undefined exception     Undefined exception
To use the Secure Debug Enable Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c1
•
CRm set to c1
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c1, c1, 1 ; Read Secure Debug Enable Register
MCR p15, 0, <Rd>, c1, c1, 1 ; Write Secure Debug Enable Register
3.2.12
c1, Non-Secure Access Control Register
The purpose of the Non-Secure Access Control Register is to define the Non-secure access
permission for:
•
coprocessors
•
cache lockdown registers
•
TLB lockdown registers
•
internal DMA.
Note
This register has no effect on Non-secure access permissions for the debug control coprocessor,
CP14, or the system control coprocessor, CP15.
The Non-Secure Access Control Register is:
•
in CP15 c1
•
a 32 bit register:
— read/write in the Secure world
— read only in the Non-secure world
•
only accessible in privileged modes.
Figure 3-31 shows the arrangement of bits in the register.
Figure 3-31 Non-Secure Access Control Register format
Table 3-51 lists how the bit values correspond with the Non-Secure Access Control Register
functions.
Table 3-51 Non-Secure Access Control Register bit functions
Bits
Field name
Function
[31:19]
-
Reserved.
UNP/SBZ.
[18]
DMA
Reserves the DMA channels and registers for the Secure world and determines the page tables,
Secure or Non-secure, to use for DMA transfers. For details, see DMA on page 7-10:
0 = DMA reserved for the Secure world only and the Secure page tables are used for DMA
transfers, reset value
1 = DMA can be used by the Non-secure world and the Non-secure page tables are used for
DMA transfers.
[17]
TL
Prevents operations in the Non-secure world from locking page tables in TLB lockdown
entries.
The Invalidate Single Entry or Invalidate ASID match operations can match a TLB lockdown
entry but an Invalidate All operation only applies to unlocked entries:
0 = Reserve TLB Lockdown registers for Secure operation only, reset value
1 = TLB Lockdown registers available for Secure and Non-secure operation.
[16]
CL
Prevents operations in the Non-secure world from changing cache lockdown entries:
0 = Reserve cache lockdown registers for Secure operation only, reset value
1 = Cache lockdown registers available for Secure and Non-secure operation.
[15:14]
-
Reserved.
UNP/SBZ.
[13:0]
CPn (a)
Determines permission to access the given coprocessor in the Non-secure world:
0 = Secure access only, reset value
1 = Secure or Non-secure access.
a. n is the coprocessor number from 0 to 13.
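The CPn bits gate the Non-secure view of the Coprocessor Access Control Register, as Table 3-45 lists: a field whose CPn bit is 0 reads as b00 and ignores writes from the Non-secure world. The C sketch below models that behavior on plain integer values. The function names are hypothetical, and the model assumes that bit n of this register corresponds to the access field in bits [2n+1:2n] of the Coprocessor Access Control Register.

```c
#include <stdint.h>

/* Expand the 14 CPn bits [13:0] into a mask that covers the matching
 * two-bit fields of the Coprocessor Access Control Register. */
static inline uint32_t cpacr_ns_mask(uint32_t nsacr)
{
    uint32_t mask = 0;
    for (unsigned n = 0; n <= 13; n++) {
        if (nsacr & (1u << n)) {
            mask |= 3u << (2u * n);
        }
    }
    return mask;
}

/* Value that a Non-secure privileged read returns. */
static inline uint32_t cpacr_ns_read(uint32_t cpacr, uint32_t nsacr)
{
    return cpacr & cpacr_ns_mask(nsacr);
}

/* Register contents after a Non-secure privileged write of value. */
static inline uint32_t cpacr_ns_write(uint32_t cpacr, uint32_t nsacr,
                                      uint32_t value)
{
    uint32_t mask = cpacr_ns_mask(nsacr);
    return (cpacr & ~mask) | (value & mask);
}
```

With only CP10 and CP11 set, for example, the Non-secure world sees and can change bits [23:20] only, and every other access field appears as access denied.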
To use the Non-Secure Access Control Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c1
•
CRm set to c1
•
Opcode_2 set to 2.
For example:
MRC p15, 0, <Rd>, c1, c1, 2 ; Read Non-Secure Access Control Register data
MCR p15, 0, <Rd>, c1, c1, 2 ; Write Non-Secure Access Control Register data
Table 3-52 lists the results of attempted access for each mode.
Table 3-52 Results of access to the Non-Secure Access Control Register
Secure Privileged   Non-secure Privileged            User
Read     Write      Read     Write
Data     Data       Data     Undefined exception     Undefined exception
3.2.13
c2, Translation Table Base Register 0
The purpose of the Translation Table Base Register 0 is to hold the physical address of the
first-level translation table.
You use Translation Table Base Register 0 for process-specific addresses, where each process
maintains a separate first-level page table. On a context switch you must modify both
Translation Table Base Register 0 and the Translation Table Base Control Register, if
appropriate.
Table 3-53 on page 3-58 lists the purposes of the individual bits in the Translation Table Base
Register 0.
The Translation Table Base Register 0 is:
•
in CP15 c2
•
a 32 bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-32 on page 3-58 shows the bit arrangement for the Translation Table Base Register 0.
Figure 3-32 Translation Table Base Register 0 format
Table 3-53 lists how the bit values correspond with the Translation Table Base Register 0
functions.
Table 3-53 Translation Table Base Register 0 bit functions
Bits
Field name
Function
[31:14-N] (a)
Translation table base 0
Holds the translation table base address, the physical address of the first level
translation table. The reset value is 0.
[13-N:5] (a)
-
UNP/SBZ.
[4:3]
RGN
Indicates the Outer cacheable attributes for page table walking:
b00 = Outer Noncacheable, reset value
b01 = Write-back, Write Allocate
b10 = Write-through, No Allocate on Write
b11 = Write-back, No Allocate on Write.
[2]
P
If the processor supports ECC, indicates to the memory controller whether ECC is
enabled or disabled. For ARM1176JZ-S processors this is 0:
0 = Error-Correcting Code (ECC) is disabled, reset value
1 = ECC is enabled.
[1]
S
Indicates the page table walk is to Non-Shared or to Shared memory:
0 = Non-Shared, reset value
1 = Shared.
[0]
C
Indicates the page table walk is Inner Cacheable or Inner Non Cacheable:
0 = Inner Noncacheable, reset value
1 = Inner cacheable.
a. For an explanation of N see c2, Translation Table Base Control Register on page 3-61.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-54 lists the results of attempted access for each mode.
Table 3-54 Results of access to the Translation Table Base Register 0
Secure Privileged           Non-secure Privileged               User
Read          Write         Read              Write
Secure data   Secure data   Non-secure data   Non-secure data   Undefined exception
A write to the Translation Table Base Register 0 updates the address of the first level translation
table from the value in bits [31:7] of the written value, to account for the maximum value of 7
for N. The number of bits of this address that the processor uses, and therefore, the required
alignment of the first level translation table, depends on the value of N, see c2, Translation Table
Base Control Register on page 3-61.
A read from the Translation Table Base Register 0 returns the complete address of the first level
translation table in bits [31:7] of the read value, regardless of the value of N.
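Because the base address field is [31:14-N], the first level translation table must be aligned to 16KB divided by 2 to the power N, that is, from 16KB for N = 0 down to 128 bytes for N = 7. The C sketch below composes a register value from a table address and the attribute fields in Table 3-53. The names are hypothetical, N is passed in by the caller because it lives in the Translation Table Base Control Register, and the P bit is left at 0 as stated for ARM1176JZ-S processors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* RGN encodings, bits [4:3], from Table 3-53. */
enum ttbr_rgn {
    RGN_OUTER_NONCACHEABLE = 0,  /* b00 */
    RGN_WB_WRITE_ALLOCATE  = 1,  /* b01 */
    RGN_WRITE_THROUGH      = 2,  /* b10 */
    RGN_WB_NO_ALLOCATE     = 3   /* b11 */
};

/* Required alignment in bytes of the first level table for N in 0..7. */
static inline uint32_t ttbr0_alignment(unsigned n)
{
    assert(n <= 7);
    return 0x4000u >> n;
}

/* Compose a Translation Table Base Register 0 value. table_pa is the
 * physical address of the first level table and must be aligned as
 * required for the given N. */
static inline uint32_t ttbr0_value(uint32_t table_pa, unsigned n,
                                   enum ttbr_rgn rgn, bool shared,
                                   bool inner_cacheable)
{
    assert((table_pa & (ttbr0_alignment(n) - 1u)) == 0);
    return table_pa
         | ((uint32_t)rgn << 3)           /* RGN, bits [4:3] */
         | (shared ? (1u << 1) : 0u)      /* S, bit [1] */
         | (inner_cacheable ? 1u : 0u);   /* C, bit [0] */
}
```

The resulting value is what privileged code would then write with MCR p15, 0, <Rd>, c2, c0, 0.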
To use the Translation Table Base Register 0 read or write CP15 c2 with:
•
Opcode_1 set to 0
•
CRn set to c2
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c2, c0, 0 ; Read Translation Table Base Register 0
MCR p15, 0, <Rd>, c2, c0, 0 ; Write Translation Table Base Register 0
Note
The ARM1176JZ-S processor cannot page table walk from level one cache. Therefore, if C is
set to 1, to ensure coherency, you must either store page tables in Inner write-through memory
or, if in Inner write-back, you must clean the appropriate cache entries after modification so that
the mechanism for the hardware page table walks sees them.
3.2.14
c2, Translation Table Base Register 1
The purpose of the Translation Table Base Register 1 is to hold the physical address of the
first-level table. The expected use of the Translation Table Base Register 1 is for OS and I/O
addresses.
Table 3-55 on page 3-60 lists the purposes of the individual bits in the Translation Table Base
Register 1.
The Translation Table Base Register 1 is:
•
in CP15 c2
•
a 32 bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-33 shows the bit arrangement for the Translation Table Base Register 1.
Figure 3-33 Translation Table Base Register 1 format
Table 3-55 lists how the bit values correspond with the Translation Table Base Register 1
functions.
Table 3-55 Translation Table Base Register 1 bit functions
Bits
Field name
Function
[31:14]
Translation table base 1
Holds the translation table base address, the physical address of the first level
translation table. The reset value is 0.
[13:5]
-
UNP/SBZ.
[4:3]
RGN
Indicates the Outer cacheable attributes for page table walking:
b00 = Outer Noncacheable, reset value
b01 = Write-back, Write Allocate
b10 = Write-through, No Allocate on Write
b11 = Write-back, No Allocate on Write.
[2]
P
If the processor supports ECC, indicates to the memory controller whether ECC is
enabled or disabled. For ARM1176JZ-S processors this is 0:
0 = Error-Correcting Code (ECC) is disabled, reset value
1 = ECC is enabled.
[1]
S
Indicates the page table walk is to Non-Shared or to Shared memory:
0 = Non-Shared, reset value
1 = Shared.
[0]
C
Indicates the page table walk is Inner Cacheable or Inner Non Cacheable:
0 = Inner Noncacheable, reset value
1 = Inner Cacheable.
Table 3-56 lists the results of attempted access for each mode.
Table 3-56 Results of access to the Translation Table Base Register 1
Secure Privileged           Non-secure Privileged               User
Read          Write         Read              Write
Secure data   Secure data   Non-secure data   Non-secure data   Undefined exception
A write to the Translation Table Base Register 1 updates the address of the first level translation
table from the value in bits [31:14] of the written value. Bits [13:5] Should Be Zero. The
first level translation table that Translation Table Base Register 1 addresses must reside on a
16KB boundary.
To use the Translation Table Base Register 1 read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c2
•
CRm set to c0
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c2, c0, 1 ; Read Translation Table Base Register 1
MCR p15, 0, <Rd>, c2, c0, 1 ; Write Translation Table Base Register 1
Note
The ARM1176JZ-S processor cannot page table walk from level one cache. Therefore, if C is
set to 1, to ensure coherency, you must either store page tables in Inner write-through memory
or, if in Inner write-back, you must clean the appropriate cache entries after modification so that
the mechanism for the hardware page table walks sees them.
3.2.15
c2, Translation Table Base Control Register
The purpose of the Translation Table Base Control Register is to determine if a page table miss
for a specific VA uses, for its page table walk, either:
•
Translation Table Base Register 0. The recommended use is for task-specific addresses
•
Translation Table Base Register 1. The recommended use is for operating system and I/O
addresses.
Table 3-57 lists the purposes of the individual bits in the Translation Table Base Control
Register.
The Translation Table Base Control Register is:
•
in CP15 c2
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-34 shows the bit arrangement for the Translation Table Base Control Register.
[31:6] UNP/SBZ | [5] PD1 | [4] PD0 | [3] SBZ | [2:0] N
Figure 3-34 Translation Table Base Control Register format
Table 3-57 lists how the bit values correspond with the Translation Table Base Control Register
functions.
Table 3-57 Translation Table Base Control Register bit functions
Bits
Field
name
Function
[31:6]
-
UNP/SBZ.
[5]
PD1
Specifies occurrence of a page table walk on a TLB miss when using Translation Table Base Register
1. When page table walk is disabled, a Section Fault occurs instead on a TLB miss:
0 = The processor performs a page table walk on a TLB miss, with Secure or Non-secure privilege
appropriate to the current world. This is the reset value
1 = The processor does not perform a page table walk. If a TLB miss occurs with Translation Table
Base Register 1 in use, the processor returns a Section Translation Fault.
[4]
PD0
Specifies occurrence of a page table walk on a TLB miss when using Translation Table Base Register
0. When page table walk is disabled, a Section Fault occurs instead on a TLB miss:
0 = The processor performs a page table walk on a TLB miss, with Secure or Non-secure privilege
appropriate to the current world. This is the reset value
1 = The processor does not perform a page table walk. If a TLB miss occurs with Translation Table
Base Register 0 in use, the processor returns a Section Translation Fault.
[3]
-
UNP/SBZ.
[2:0]
N
Specifies the boundary size of Translation Table Base Register 0:
b000 = 16KB, reset value
b001 = 8KB
b010 = 4KB
b011 = 2KB
b100 = 1KB
b101 = 512-byte
b110 = 256-byte
b111 = 128-byte.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-58 lists the results of attempted access for each mode.
Table 3-58 Results of access to the Translation Table Base Control Register
Secure Privileged              Non-secure Privileged                 User
Read          Write            Read              Write
Secure data   Secure data      Non-secure data   Non-secure data     Undefined exception
To use the Translation Table Base Control Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c2
•
CRm set to c0
•
Opcode_2 set to 2.
For example:
MRC p15, 0, <Rd>, c2, c0, 2 ; Read Translation Table Base Control Register
MCR p15, 0, <Rd>, c2, c0, 2 ; Write Translation Table Base Control Register
A translation table base register is selected like this:
•
If N is set to 0, always use Translation Table Base Register 0. This is the default case at
reset. It is backwards compatible with ARMv5 and earlier processors.
•
If N is set greater than 0, and bits [31:32-N] of the VA are all 0, use Translation Table Base
Register 0, otherwise use Translation Table Base Register 1. N must be in the range 0-7.
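The selection rule above, together with the N and PD0/PD1 fields of Table 3-57, can be sketched as follows. This is an illustrative model rather than a description of the hardware implementation, and the function names are hypothetical.

```python
def select_ttbr(va, n):
    """Return 0 or 1: the Translation Table Base Register used for this VA."""
    if not 0 <= n <= 7:
        raise ValueError("N must be in the range 0-7")
    if not 0 <= va <= 0xFFFFFFFF:
        raise ValueError("VA must be a 32-bit value")
    if n == 0:
        return 0                      # reset behavior: always TTBR0
    top_bits = va >> (32 - n)         # bits [31:32-N] of the VA
    return 0 if top_bits == 0 else 1


def ttbr0_boundary_bytes(n):
    """Boundary size of TTBR0 for a given N: 16KB for N=0 down to 128 bytes for N=7."""
    if not 0 <= n <= 7:
        raise ValueError("N must be in the range 0-7")
    return (16 * 1024) >> n


def walk_disabled(ttbcr, ttbr):
    """True if PD0 (bit 4) or PD1 (bit 5) disables the page table walk for the chosen register."""
    if ttbr not in (0, 1):
        raise ValueError("ttbr must be 0 or 1")
    return bool((ttbcr >> (4 + ttbr)) & 1)
```

With N set to 1, for example, addresses below 0x80000000 use Translation Table Base Register 0 and all other addresses use Translation Table Base Register 1.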
Note
The ARM1176JZ-S processor cannot page table walk from level one cache. Therefore, if C is
set to 1, to ensure coherency, you must either store page tables in Inner write-through memory
or, if in Inner write-back, you must clean the appropriate cache entries after modification so that
the mechanism for the hardware page table walks sees them.
3.2.16
c3, Domain Access Control Register
The purpose of the Domain Access Control Register is to hold the access permissions for a
maximum of 16 domains.
Table 3-59 lists the purposes of the individual bits in the Domain Access Control Register.
The Domain Access Control Register is:
•
in CP15 c3
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-35 shows the bit arrangement of the Domain Access Control Register.
[31:30] D15 | [29:28] D14 | [27:26] D13 | [25:24] D12 | [23:22] D11 | [21:20] D10 | [19:18] D9 | [17:16] D8 |
[15:14] D7 | [13:12] D6 | [11:10] D5 | [9:8] D4 | [7:6] D3 | [5:4] D2 | [3:2] D1 | [1:0] D0
Figure 3-35 Domain Access Control Register format
Table 3-59 lists how the bit values correspond with the Domain Access Control Register
functions.
Table 3-59 Domain Access Control Register bit functions
Bits
Field
name
Function
-
D<n> (a)
The purpose of the fields D15-D0 in the register is to define the access permissions for each one of
the 16 domains. These domains can be either sections, large pages or small pages of memory:
b00 = No access, reset value. Any access generates a domain fault.
b01 = Client. Accesses are checked against the access permission bits in the TLB entry.
b10 = Reserved. Any access generates a domain fault.
b11 = Manager. Accesses are not checked against the access permission bits in the TLB entry, so a
permission fault cannot be generated.
a. n is the Domain number in the range between 0 and 15
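As an illustration of the two-bit field layout shown in Figure 3-35, the following sketch reads and updates the access permission of one domain in a Domain Access Control Register value. The helper names are hypothetical, and the code only manipulates an integer; it does not access CP15.

```python
# Encodings of each two-bit D<n> field, as listed in Table 3-59.
DOMAIN_ACCESS = {
    0b00: "No access",
    0b01: "Client",
    0b10: "Reserved",
    0b11: "Manager",
}


def get_domain_access(dacr, n):
    """Return the access type of domain n (0-15) from a DACR value."""
    if not 0 <= n <= 15:
        raise ValueError("domain number must be in the range 0-15")
    return DOMAIN_ACCESS[(dacr >> (2 * n)) & 0b11]


def set_domain_access(dacr, n, access):
    """Return a new DACR value with the field for domain n replaced by a two-bit access value."""
    if not 0 <= n <= 15:
        raise ValueError("domain number must be in the range 0-15")
    if access not in DOMAIN_ACCESS:
        raise ValueError("access must be a two-bit value")
    shift = 2 * n
    return (dacr & ~(0b11 << shift) & 0xFFFFFFFF) | (access << shift)
```

Starting from the reset value 0, `set_domain_access(0, 0, 0b01)` gives 0x00000001, which makes domain 0 a Client domain and leaves every other domain at No access.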
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-60 lists the results of attempted access for each mode.
Table 3-60 Results of access to the Domain Access Control Register
Secure Privileged              Non-secure Privileged                 User
Read          Write            Read              Write
Secure data   Secure data      Non-secure data   Non-secure data     Undefined exception
To use the Domain Access Control Register read or write CP15 c3 with:
•
Opcode_1 set to 0
•
CRn set to c3
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c3, c0, 0 ; Read Domain Access Control Register
MCR p15, 0, <Rd>, c3, c0, 0 ; Write Domain Access Control Register
3.2.17
c5, Data Fault Status Register
The purpose of the Data Fault Status Register is to hold the source of the last data fault.
Table 3-61 lists the purposes of the individual bits in the Data Fault Status Register.
The Data Fault Status Register is:
•
in CP15 c5
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-36 shows the bit arrangement in the Data Fault Status Register.
[31:13] UNP/SBZ | [12] SD | [11] RW | [10] S | [9:8] 0 | [7:4] Domain | [3:0] Status
Figure 3-36 Data Fault Status Register format
Table 3-61 shows how the bit values correspond with the Data Fault Status Register functions.
Table 3-61 Data Fault Status Register bit functions
Bits
Field
name
Function
[31:13]
-
UNP/SBZ.
[12]
SD
Indicates whether an AXI Decode or Slave error caused an abort. This bit is only valid
for external aborts. For all other aborts this bit Should Be Zero. See Fault status and
address on page 6-34:
0 = AXI Decode error caused the abort, reset value
1 = AXI Slave error caused the abort.
[11]
RW
Indicates whether a read or write access caused an abort:
0 = Read access caused the abort, reset value
1 = Write access caused the abort.
[10]
S
Part of the Status field. See Bits [3:0] in this table. The reset value is 0.
[9:8]
-
Always read as 0. Writes ignored.
[7:4]
Domain
Indicates which of the 16 domains, D15-D0, was being accessed when the data fault occurred.
Takes values 0-15. The reset value is 0.
[3:0] with
bit[10] = 0
Status
Indicates type of fault generated.
See Fault status and address on page 6-34 for full details of Domain and FAR validity,
and priorities:
b0000 = no function, reset value
b0001 = Alignment fault
b0010 = Instruction debug event fault
b0011 = Access Bit fault on Section
b0100 = Instruction cache maintenance operation fault
b0101 = Translation Section fault
b0110 = Access Bit fault on Page
b0111 = Translation Page fault
b1000 = Precise external abort
b1001 = Domain Section fault
b1010 = no function
b1011 = Domain Page fault
b1100 = External abort on translation, first level
b1101 = Permission Section fault
b1110 = External abort on translation, second level
b1111 = Permission Page fault.
[3:0] with
bit[10] = 1
Status
Indicates type of fault generated.
See Fault status and address on page 6-34 for full details of Domain and FAR validity,
and priorities:
b0000 = no function, reset value
b0001 = no function
b0010 = no function
b0011 = no function
b0100 = no function
b0101 = no function
b0110 = Imprecise external abort
b0111 = no function
b1000 = no function
b1001 = no function
b1010 = no function
b1011 = no function
b1100 = no function
b1101 = no function
b1110 = no function
b1111 = no function.
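The encodings in Table 3-61 can be illustrated with a small decoder for a value read from the Data Fault Status Register. This is a sketch only: the function name and the shape of the returned dictionary are hypothetical, and the Domain and SD fields are returned as read, without applying the validity rules described in Fault status and address on page 6-34.

```python
# Status encodings from Table 3-61, keyed by (bit[10], bits[3:0]).
# Combinations not listed here are "no function".
DFSR_STATUS = {
    (0, 0b0001): "Alignment fault",
    (0, 0b0010): "Instruction debug event fault",
    (0, 0b0011): "Access Bit fault on Section",
    (0, 0b0100): "Instruction cache maintenance operation fault",
    (0, 0b0101): "Translation Section fault",
    (0, 0b0110): "Access Bit fault on Page",
    (0, 0b0111): "Translation Page fault",
    (0, 0b1000): "Precise external abort",
    (0, 0b1001): "Domain Section fault",
    (0, 0b1011): "Domain Page fault",
    (0, 0b1100): "External abort on translation, first level",
    (0, 0b1101): "Permission Section fault",
    (0, 0b1110): "External abort on translation, second level",
    (0, 0b1111): "Permission Page fault",
    (1, 0b0110): "Imprecise external abort",
}


def decode_dfsr(dfsr):
    """Split a DFSR value into the fields described in Table 3-61."""
    s_bit = (dfsr >> 10) & 1          # bit [10], upper part of the Status field
    status = dfsr & 0xF               # bits [3:0]
    return {
        "fault": DFSR_STATUS.get((s_bit, status), "no function"),
        "domain": (dfsr >> 4) & 0xF,              # bits [7:4]
        "write": bool((dfsr >> 11) & 1),          # bit [11], RW
        "axi_slave_error": bool((dfsr >> 12) & 1) # bit [12], SD
    }
```

For example, `decode_dfsr(0x805)` reports a Translation Section fault caused by a write access.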
Table 3-62 lists the results of attempted access for each mode.
Table 3-62 Results of access to the Data Fault Status Register
Secure Privileged              Non-secure Privileged                 User
Read          Write            Read              Write
Secure data   Secure data      Non-secure data   Non-secure data     Undefined exception
Note
When the SCR EA bit is set, see c1, Secure Configuration Register on page 3-52, the processor
writes to the Secure Data Fault Status Register on a Secure Monitor entry caused by an external
abort.
To use the Data Fault Status Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c5
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c5, c0, 0 ; Read Data Fault Status Register
MCR p15, 0, <Rd>, c5, c0, 0 ; Write Data Fault Status Register
3.2.18
c5, Instruction Fault Status Register
The purpose of the Instruction Fault Status Register (IFSR) is to hold the source of the last
instruction fault.
Table 3-63 on page 3-67 lists the purposes of the individual bits in IFSR.
The Instruction Fault Status Register is:
•
in CP15 c5
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-37 shows the bit arrangement of the Instruction Fault Status Register.
[31:13] UNP/SBZ | [12] SD | [11] SBZ | [10] 0 | [9:4] UNP/SBZ | [3:0] Status
Figure 3-37 Instruction Fault Status Register format
Table 3-63 lists how the bit values correspond with the Instruction Fault Status Register
functions.
Table 3-63 Instruction Fault Status Register bit functions
Bits
Field
name
Function
[31:13]
-
UNP/SBZ.
[12]
SD
Indicates whether an AXI Decode or Slave error caused an abort. This bit is only valid for
external aborts. For all other aborts this bit Should Be Zero. See Fault status and address on
page 6-34:
0 = AXI Decode error caused the abort, reset value
1 = AXI Slave error caused the abort.
[11]
-
UNP/SBZ.
[10]
-
Part of the Status field, see bits [3:0] in this table.
Always 0.
[9:4]
-
UNP/SBZ.
[3:0] with
bit[10] = 0
Status
Indicates type of fault generated.
See Fault status and address on page 6-34 for full details of Domain and FAR validity, and
priorities:
b0000 = no function, reset value
b0001 = Alignment fault
b0010 = Instruction debug event fault
b0011 = Access Bit fault on Section
b0100 = no function
b0101 = Translation Section fault
b0110 = Access Bit fault on Page
b0111 = Translation Page fault
b1000 = Precise external abort
b1001 = Domain Section fault
b1010 = no function
b1011 = Domain Page fault
b1100 = External abort on translation, first level
b1101 = Permission Section fault
b1110 = External abort on translation, second level
b1111 = Permission Page fault.
Table 3-64 lists the results of attempted access for each mode.
Table 3-64 Results of access to the Instruction Fault Status Register
Secure Privileged              Non-secure Privileged                 User
Read          Write            Read              Write
Secure data   Secure data      Non-secure data   Non-secure data     Undefined exception
Note
When the SCR EA bit is set, see c1, Secure Configuration Register on page 3-52, the processor
writes to the Secure Instruction Fault Status Register on a Secure Monitor entry caused by an
external abort.
To use the IFSR read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c5
•
CRm set to c0
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c5, c0, 1 ; Read Instruction Fault Status Register
MCR p15, 0, <Rd>, c5, c0, 1 ; Write Instruction Fault Status Register
3.2.19
c6, Fault Address Register
The purpose of the Fault Address Register (FAR) is to hold the Modified Virtual Address (MVA)
of the fault when a precise abort occurs.
The FAR is:
•
in CP15 c6
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
The Fault Address Register bits [31:0] contain the MVA that the precise abort occurred on. The
reset value is 0.
Table 3-65 lists the results of attempted access for each mode.
Table 3-65 Results of access to the Fault Address Register
Secure Privileged              Non-secure Privileged                 User
Read          Write            Read              Write
Secure data   Secure data      Non-secure data   Non-secure data     Undefined exception
To use the FAR read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c6
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c6, c0, 0 ; Read Fault Address Register
MCR p15, 0, <Rd>, c6, c0, 0 ; Write Fault Address Register
A write to this register sets the FAR to the value of the data written. This is useful for a debugger
to restore the value of the FAR.
The ARM1176JZ-S processor also updates the FAR on debug exception entry because of
watchpoints, see Effect of a debug event on CP15 registers on page 13-34 for more details.
3.2.20
c6, Watchpoint Fault Address Register
Access to the Watchpoint Fault Address register through the system control coprocessor is
deprecated, see CP14 c6, Watchpoint Fault Address Register (WFAR) on page 13-12.
3.2.21
c6, Instruction Fault Address Register
The purpose of the Instruction Fault Address Register (IFAR) is to hold the address of
instructions that cause a prefetch abort.
The IFAR is:
•
in CP15 c6
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
The Instruction Fault Address Register bits [31:0] contain the Instruction Fault MVA. The reset
value is 0.
Table 3-66 lists the results of attempted access for each mode.
Table 3-66 Results of access to the Instruction Fault Address Register
Secure Privileged              Non-secure Privileged                 User
Read          Write            Read              Write
Secure data   Secure data      Non-secure data   Non-secure data     Undefined exception
To use the IFAR read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c6
•
CRm set to c0
•
Opcode_2 set to 2.
For example:
MRC p15, 0, <Rd>, c6, c0, 2 ; Read Instruction Fault Address Register
MCR p15, 0, <Rd>, c6, c0, 2 ; Write Instruction Fault Address Register
A write to this register sets the IFAR to the value of the data written. This is useful for a debugger
to restore the value of the IFAR.
3.2.22
c7, Cache operations
The purpose of c7 is to:
•
control these operations:
—
clean and invalidate instruction and data caches, including range operations
—
prefetch instruction cache line
—
Flush Prefetch Buffer
—
flush branch target address cache
—
virtual to physical address translation.
•
implement the Data Synchronization Barrier (DSB) operation
•
implement the Data Memory Barrier (DMB) operation
•
implement the Wait For Interrupt clock control function.
Note
Cache operations also depend on:
•
the C, W, I and RR bits, see c1, Control Register on page 3-44.
•
the RA and RV bits, see c1, Auxiliary Control Register on page 3-49.
The following cache operations globally flush the BTAC:
•
Invalidate Entire Instruction Cache
•
Invalidate Both Caches.
c7 consists of one 32-bit register that performs 28 functions. Figure 3-38 shows the arrangement
of the 24 functions in this group that operate with the MCR and MRC instructions.
CRn c7, Opcode_1 0:
CRm   Opcode_2   Data    Function                                              Access
c0    4          SBZ     Wait For Interrupt (WFI)                              Write only
c4    0          -       PA Register                                           Read/write
c5    0          SBZ     Invalidate Entire Instruction Cache                   Write only
c5    1          MVA     Invalidate Instruction Cache Line (using MVA)         Write only
c5    2          Index   Invalidate Instruction Cache Line (using Index)       Write only
c5    4          SBZ     Flush Prefetch Buffer                                 Write only, accessible in User mode
c5    6          SBZ     Flush Entire Branch Target Cache                      Write only
c5    7          MVA     Flush Branch Target Cache Entry                       Write only
c6    0          SBZ     Invalidate Entire Data Cache                          Write only
c6    1          MVA     Invalidate Data Cache Line (using MVA)                Write only
c6    2          Index   Invalidate Data Cache Line (using Index)              Write only
c7    0          SBZ     Invalidate Both Caches                                Write only
c8    0-3        -       VA to PA Translation in the current world             Write only
c8    4-7        -       VA to PA Translation in the other world               Write only
c10   0          SBZ     Clean Entire Data Cache                               Write only
c10   1          MVA     Clean Data Cache Line (using MVA)                     Write only
c10   2          Index   Clean Data Cache Line (using Index)                   Write only
c10   4          SBZ     Data Synchronization Barrier (DSB)                    Write only, accessible in User mode
c10   5          SBZ     Data Memory Barrier (DMB)                             Write only, accessible in User mode
c10   6          -       Cache Dirty Status Register                           Read-only
c13   1          MVA     Prefetch Instruction Cache Line                       Write only
c14   0          SBZ     Clean and Invalidate Entire Data Cache                Write only
c14   1          MVA     Clean and Invalidate Data Cache Line (using MVA)      Write only
c14   2          Index   Clean and Invalidate Data Cache Line (using Index)    Write only
Data: SBZ = Should Be Zero, MVA = Using MVA, Index = Using Set and Index.
Figure 3-38 Cache operations
Figure 3-39 on page 3-71 shows the arrangement of the 4 functions in this group that operate
with the MCRR instruction.
CRn c7, Opcode_1 0:
CRm   Data   Function
c5    VA     Invalidate Instruction Cache Range
c6    VA     Invalidate Data Cache Range
c12   VA     Clean Data Cache Range (accessible in User mode)
c14   VA     Clean and Invalidate Data Cache Range
Data: VA = Using VA.
Figure 3-39 Cache operations with MCRR instructions
Note
•
Writing c7 with a combination of CRm and Opcode_2 not listed in Figure 3-38 on
page 3-70 or CRm not listed in Figure 3-39 results in an Undefined exception apart from
the following operations, that are architecturally defined as unified cache operations and
have no effect:
—
MCR p15,0,<Rd>,c7,c7,{1-7}
—
MCR p15,0,<Rd>,c7,c11,{0-7}
—
MCR p15,0,<Rd>,c7,c15,{0-7}.
•
In the ARM1176JZ-S processor, reading from c7, except for reads from the Cache Dirty
Status Register or PA Register, causes an Undefined instruction trap.
•
Writes to the Cache Dirty Status Register cause an Undefined exception.
•
If Opcode_1 = 0, these instructions are applied to a level one cache system. All other
Opcode_1 values are reserved.
•
All accesses to c7 can only be executed in a privileged mode of operation, except Data
Synchronization Barrier, Flush Prefetch Buffer, Data Memory Barrier, and Clean Data
Cache Range. These can be operated in User mode. Attempting to execute a privileged
instruction in User mode results in the Undefined instruction trap being taken.
There are three ways to use c7:
•
For the Cache Dirty Status Register, read c7 with the MRC instruction.
•
For range operations use the MCRR instruction with the value of CRm to select the
required operation.
•
For all other operations use the MCR instruction to write to c7 with the combination of
CRm and Opcode_2 to select the required operation.
Depending on the operation you require set <Rd> for MCR instructions or <Rd> and
<Rn> for MCRR instructions to:
—
Virtual Address (VA)
—
Modified Virtual Address (MVA)
—
Set and Index
—
Should Be Zero.
Invalidate, Clean, and Prefetch operations
The purposes of the invalidate, clean, and prefetch operations that c7 provides are to:
•
Invalidate part or all of the Data or Instruction caches
•
Clean part or all of the Data cache
•
Clean and Invalidate part or all of the Data cache
•
Prefetch code into the Instruction cache.
The terms used to describe the invalidate, clean, and prefetch operations are as defined in the
Caches and Write Buffers chapter of the ARM Architecture Reference Manual.
For details of the behavior of c7 in the Secure and Non-secure worlds, see TrustZone behavior
on page 3-77.
When it controls invalidate, clean, and prefetch operations c7 appears as a 32-bit write only
register. There are four possible formats for the data that you write to the register that depend on
the specific operation:
•
Set and Index format
•
MVA
•
VA
•
SBZ.
Set and Index format
Figure 3-40 shows the Set and Index format for invalidate and clean operations.
[31:30] Set | [29:S+5] SBZ/UNP | [S+4:5] Index | [4:1] SBZ/UNP | [0] 0
Figure 3-40 c7 format for Set and Index
Table 3-67 lists how the bit values correspond with the Cache Operation functions
for Set and Index format operations.
Table 3-67 Functional bits of c7 for Set and Index
Bits
Field name
Function
[31:30]
Set
Selects the cache set to operate on, from the four cache sets.
Value is the cache set number.
[29:S+5]
-
UNP/SBZ.
[S+4:5]
Index
Selects the cache line to operate on.
Value is the index number.
[4:1]
-
SBZ.
[0]
0
For the ARM1176JZ-S, this Should Be Zero.
The value of S in Table 3-67 depends on the cache size. Table 3-68 lists the
relationship of cache sizes and S.
Table 3-68 Cache size and S parameter dependency
Cache size    S
4KB           5
8KB           6
16KB          7
32KB          8
64KB          9
The value of S is given by:
$$ S = \log_2\left(\frac{\text{cache size}}{\text{Associativity} \times \text{line length in bytes}}\right) $$
See c0, Cache Type Register on page 3-21 for details of instruction and data cache
size.
Note
If the data is stated to be Set and Index format, see Figure 3-40 on page 3-72, it
identifies the cache line that the operation applies to by specifying the cache Set
that it belongs to and what its Index is within the Set. The Set corresponds to the
number of the cache way, and the Index number corresponds to the line number
within a cache way.
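The relationship between cache size, S, and the Set and Index format can be sketched as follows. The defaults of four ways and 32-byte lines are assumptions inferred from the four cache sets in Table 3-67 and the values in Table 3-68, and the function names are hypothetical.

```python
def s_parameter(cache_bytes, associativity=4, line_bytes=32):
    """Compute S = log2(cache size / (Associativity x line length in bytes))."""
    lines_per_way = cache_bytes // (associativity * line_bytes)
    if lines_per_way <= 0 or lines_per_way & (lines_per_way - 1):
        raise ValueError("lines per way must be a power of two")
    return lines_per_way.bit_length() - 1


def encode_set_index(cache_set, index, s):
    """Build the Set and Index word: Set in bits [31:30], Index in bits [S+4:5]."""
    if not 0 <= cache_set <= 3:
        raise ValueError("Set selects one of four cache sets")
    if not 0 <= index < (1 << s):
        raise ValueError("Index does not fit in bits [S+4:5]")
    return (cache_set << 30) | (index << 5)   # bits [4:0] stay zero
```

For a 16KB cache, `s_parameter(16 * 1024)` returns 7, so Index occupies bits [11:5], and `encode_set_index(3, 0, 7)` selects the first line of Set 3.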
MVA format
Figure 3-41 shows the MVA format for invalidate, clean, and prefetch operations.
[31:5] Modified virtual address | [4:0] SBZ
Figure 3-41 c7 format for MVA
Table 3-69 lists how the bit values correspond with the Cache Operation functions
for MVA format operations.
Table 3-69 Functional bits of c7 for MVA
Bits
Field name
Function
[31:5]
MVA
Specifies address to invalidate, clean, or prefetch.
Holds the MVA of the cache line.
[4:0]
-
Ignored. This means that the lower 5 bits of MVA are ignored and these bits are not used for the
cache operations. Only the top bits are necessary to determine whether or not the cache line is
present in the cache. Even if the MVA is not aligned to the cache line, the cache operation is
performed by ignoring the lower 5 bits.
Note
•
Invalidation and cleaning operations have no effect if they miss in the
cache.
•
If the corresponding entry is not in the TLB, these instructions can cause a
TLB miss exception or hardware page table walk, depending on the miss
handling mechanism.
•
For the cache control operations, the MVAs that are passed to the cache are
not translated by the FCSE extension.
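Because the lower 5 bits of the MVA are ignored, any address within a 32-byte line selects the same cache line. The following sketch illustrates this; the helper names are hypothetical, and the line length of 32 bytes follows from the 5 ignored bits.

```python
LINE_BYTES = 32  # 2**5, because bits [4:0] of the MVA are ignored


def cache_line_base(mva):
    """Return the MVA with bits [4:0] cleared, the part used to find the cache line."""
    return mva & 0xFFFFFFFF & ~(LINE_BYTES - 1)


def same_cache_line(mva_a, mva_b):
    """True if both addresses select the same cache line."""
    return cache_line_base(mva_a) == cache_line_base(mva_b)
```

For example, a line operation issued with 0x8000101F affects the same line as one issued with 0x80001000.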
VA format
Figure 3-42 on page 3-74 shows the VA format for invalidate and clean
operations. All VA format operations use the MCRR instruction.
[31:5] Virtual address | [4:0] SBZ
Figure 3-42 Format of c7 for VA
Table 3-70 lists how the bit values correspond with the Cache Operation functions
for VA format operations.
Table 3-70 Functional bits of c7 for VA format
Bits
Field name
Function
[31:5]
Virtual address
Specifies the start or end address to invalidate or clean.
Holds the true VA of the start or end of a memory block before any modification by FCSE.
[4:0]
-
SBZ.
You can perform invalidate, clean, and prefetch operations on:
•
single cache lines
•
entire caches
•
address ranges in cache.
Note
•
Clean, invalidate, and clean and invalidate operations apply regardless of the lock applied
to entries.
•
An explicit flush of the relevant lines in the branch target cache must be performed after
invalidation of Instruction Cache lines or the results are Unpredictable. There is no impact
on security. This is not required after an entire Instruction Cache invalidation because the
entire branch target cache is flushed automatically.
•
A small number of CP15 c7 operations can be executed by code while in User mode.
Attempting to execute a privileged operation in User mode using CP15 c7 results in an
Undefined instruction trap being taken.
To determine if the cache is dirty use the Cache Dirty Status Register, see Cache Dirty Status
Register on page 3-78.
Entire cache
Table 3-71 lists the instructions and operations that you can use to clean and
invalidate the entire cache.
Table 3-71 Cache operations for entire cache
Instruction                     Data   Function
MCR p15, 0, <Rd>, c7, c5, 0     SBZ    Invalidate Entire Instruction Cache.
                                       Also flushes the branch target cache and globally flushes the BTAC.
MCR p15, 0, <Rd>, c7, c6, 0     SBZ    Invalidate Entire Data Cache.
MCR p15, 0, <Rd>, c7, c7, 0     SBZ    Invalidate Both Caches.
                                       Also flushes the branch target cache and globally flushes the BTAC.
MCR p15, 0, <Rd>, c7, c10, 0    SBZ    Clean Entire Data Cache.
MCR p15, 0, <Rd>, c7, c14, 0    SBZ    Clean and Invalidate Entire Data Cache.
Register c7 specifies operations for cleaning the entire Data Cache, and also for
performing a clean and invalidate of the entire Data Cache. These are blocking
operations that can be interrupted. If they are interrupted, the R14 value that is
captured on the interrupt is the address of the instruction that launched the cache
clean operation + 4. This enables the standard return mechanism for interrupts to
restart the operation.
If it is essential that the cache is clean, or clean and invalid, for a particular
operation, the sequence of instructions for cleaning, or cleaning and invalidating,
the cache for that operation must handle the arrival of an interrupt at any time
when interrupts are not disabled. This is because interrupts can write to a
previously clean cache. For this reason, the Cache Dirty Status Register indicates
if the cache has been written to since the last clean of the cache was started, see
Cache Dirty Status Register on page 3-78. You can interrogate the Cache Dirty
Status Register to determine if the cache is clean, and if this is done while
interrupts are disabled, the following operations can rely on having a clean cache.
The following sequence shows this approach:
         ; interrupts are assumed to be enabled at this point
Loop1    MOV R1, #0
         MCR p15, 0, R1, c7, c10, 0   ; Clean (or Clean & Invalidate) Cache
         MRS R2, CPSR
         CPSID iaf                    ; Disable interrupts
         MRC p15, 0, R1, c7, c10, 6   ; Read Cache Dirty Status Register
         ANDS R1, R1, #1              ; Check if it is clean
         BEQ UseClean
         MSR CPSR, R2                 ; Re-enable interrupts
         B Loop1                      ; - clean the cache again
UseClean Do_Clean_Operations          ; Perform whatever operation relies on
                                      ; the cache being clean/invalid.
                                      ; To reduce impact on interrupt
                                      ; latency, this sequence should be
                                      ; short
         MSR CPSR, R2                 ; Re-enable interrupts
The long cache clean operation is performed with interrupts enabled throughout
this routine.
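The control flow of the sequence above can be modeled in a few lines. This is a behavioral sketch under stated assumptions, not processor code: `FakeCache` is a hypothetical stand-in in which a fixed number of interrupt handlers write to the cache before interrupts are disabled, and bit 0 of the value it returns plays the role of the Cache Dirty Status Register bit tested by the ANDS instruction.

```python
class FakeCache:
    """Hypothetical model: some interrupts dirty the cache while it is being cleaned."""

    def __init__(self, dirtying_interrupts):
        self.dirty = True
        self.irq_enabled = True
        self.pending = dirtying_interrupts

    def clean(self):
        # The long clean operation runs with interrupts enabled.
        self.dirty = False
        if self.irq_enabled and self.pending > 0:
            # An interrupt handler wrote to the cache after the clean started.
            self.pending -= 1
            self.dirty = True

    def dirty_status(self):
        return int(self.dirty)        # bit 0 set means the cache is dirty

    def set_irq(self, enabled):
        self.irq_enabled = enabled


def clean_until_clean(cache, critical_section):
    """Repeat the clean until the dirty status reads clean with interrupts disabled."""
    attempts = 0
    while True:
        attempts += 1
        cache.clean()                 # interrupts enabled during the clean
        cache.set_irq(False)          # disable interrupts before checking
        if cache.dirty_status() & 1 == 0:
            critical_section()        # short work that relies on a clean cache
            cache.set_irq(True)       # re-enable interrupts
            return attempts
        cache.set_irq(True)           # dirty again: re-enable and retry
```

In this model, two dirtying interrupts cause the clean to be issued three times before the critical section runs.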
Single cache lines
There are two ways to perform invalidate or clean operations on cache lines:
•
by use of Set and Index format
•
by use of MVA format.
Table 3-72 lists the instructions and operations that you can use for single cache
lines.
Table 3-72 Cache operations for single lines
Instruction                     Data        Function
MCR p15, 0, <Rd>, c7, c5, 1     MVA         Invalidate Instruction Cache Line, using MVA
MCR p15, 0, <Rd>, c7, c5, 2     Set/Index   Invalidate Instruction Cache Line, using Index
MCR p15, 0, <Rd>, c7, c6, 1     MVA         Invalidate Data Cache Line, using MVA
MCR p15, 0, <Rd>, c7, c6, 2     Set/Index   Invalidate Data Cache Line, using Index
MCR p15, 0, <Rd>, c7, c10, 1    MVA         Clean Data Cache Line, using MVA
MCR p15, 0, <Rd>, c7, c10, 2    Set/Index   Clean Data Cache Line, using Index
MCR p15, 0, <Rd>, c7, c13, 1    MVA         Prefetch Instruction Cache Line
MCR p15, 0, <Rd>, c7, c14, 1    MVA         Clean and Invalidate Data Cache Line, using MVA
MCR p15, 0, <Rd>, c7, c14, 2    Set/Index   Clean and Invalidate Data Cache Line, using Index
Example 3-1 shows how to use Clean and Invalidate Data Cache Line with Set
and Index to clean and invalidate one whole cache way, in this example, way 3.
The example works with any cache size because it reads the cache size from the
Cache Type Register.
Example 3-1 Clean and Invalidate Data Cache Line with Set and Index
           MRC p15,0,R0,c0,c0,1      ; Read cache type reg
           AND R0,R0,#0x1C0000       ; Extract D cache size
           MOV R0,R0, LSR #18        ; Move to bottom bits
           ADD R0,R0,#7              ; Get Index loop max
           MOV R1,#3:SHL:30          ; Set up Set = 3
           MOV R2,#0                 ; Set up Index counter
           MOV R3,#1
           MOV R3,R3, LSL R0         ; Set up Index loop max
index_loop
           ORR R4,R2,R1              ; Set and Index format
           MCR p15,0,R4,c7,c14,2     ; Clean&inval D cache line
           ADD R2,R2,#1:SHL:5        ; Increment Index
           CMP R2,R3                 ; Done all index values?
           BNE index_loop            ; Loop until done
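To make the loop bounds of Example 3-1 concrete, the following sketch lists the Set and Index words that the loop writes for one way, using the same mask and shifts as the assembly. It is an illustrative model only: `way_clean_words` is a hypothetical name, and the argument stands for a value read from the Cache Type Register.

```python
def way_clean_words(cache_type_reg, way=3):
    """Return every Set and Index word that Example 3-1 writes for one way."""
    size_field = (cache_type_reg & 0x1C0000) >> 18   # AND, then LSR #18
    index_limit = 1 << (size_field + 7)              # ADD #7, then 1 LSL R0
    set_bits = way << 30                             # Set in bits [31:30]
    # The index counter advances by 1 << 5 (one cache line) until it reaches the limit.
    return [set_bits | index for index in range(0, index_limit, 1 << 5)]
```

With a size field of 5, the loop limit is 1 << 12, so the list holds 128 words, from 0xC0000000 to 0xC0000FE0 for way 3. That matches the 128 lines per way of a 16KB cache with S = 7.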
Address ranges
Table 3-73 lists the instructions and operations that you can use to clean and
invalidate the address ranges in cache.
Table 3-73 Cache operations for address ranges
Instruction                                       Data   Function
MCRR p15,0,<End Address>,<Start Address>,c5       VA     Invalidate Instruction Cache Range
MCRR p15,0,<End Address>,<Start Address>,c6       VA     Invalidate Data Cache Range
MCRR p15,0,<End Address>,<Start Address>,c12      VA     Clean Data Cache Range (a)
MCRR p15,0,<End Address>,<Start Address>,c14      VA     Clean and Invalidate Data Cache Range
a. This operation is accessible in both User and privileged modes of operation. All other operations listed
here are only accessible in privileged modes of operation.
The operations in Table 3-73 on page 3-76 can only be performed using an
MCRR or MCRR2 instruction, and all other operations to these registers are
ignored.
The End Address and Start Address in Table 3-73 on page 3-76 are the true VAs
before any modification by the Fast Context Switch Extension (FCSE). These
addresses are translated by the FCSE logic. Each of the range operations operates
between cache lines containing the Start Address and the End Address, inclusive
of Start Address and End Address.
Because the least significant address bits are ignored, the transfer automatically
adjusts to a line length multiple spanning the programmed addresses.
The Start Address is the first VA of the block transfer. It uses the VA bits [31:5].
The End Address is the VA where the block transfer stops. This address is at the
start of the line containing the last address to be handled by the block transfer. It
uses the VA bits [31:5].
If the Start Address is greater than the End Address the effect is architecturally Unpredictable.
The ARM1176JZ-S processor does not perform cache operations in this case. All block
transfers are interruptible. When block transfers are interrupted, the R14 value that is captured
is the address of the instruction that launched the block operation + 4. This enables the standard
return mechanism for interrupts to restart the operation.
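As an illustration of how a range operation rounds to whole cache lines, the following C sketch counts the lines between a Start Address and an End Address using only VA bits [31:5]. The function name is hypothetical. Returning 0 when the Start Address is greater than the End Address models the ARM1176JZ-S behavior described above; the architecture itself leaves that case Unpredictable.

```c
#include <assert.h>
#include <stdint.h>

/* Only VA bits [31:5] take part in a range operation. */
#define CACHE_LINE_MASK 0xFFFFFFE0u

/* Number of cache lines covered by an inclusive range, or 0 if start > end. */
static uint32_t range_line_count(uint32_t start_va, uint32_t end_va)
{
    if (start_va > end_va) {
        return 0u; /* models "does not perform cache operations" */
    }
    uint32_t first_line = start_va & CACHE_LINE_MASK;
    uint32_t last_line  = end_va & CACHE_LINE_MASK;
    return ((last_line - first_line) >> 5) + 1u;
}
```

Two addresses inside the same 32-byte line give a count of 1, while a range that crosses a line boundary by a single byte gives 2.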
Exception behavior
The blocking block transfers cause a Data Abort on a translation fault if a valid page table entry
cannot be fetched. The FAR indicates the address that caused the fault, and the DFSR indicates
the reason for the fault.
TrustZone behavior
TrustZone affects cache operations as follows:
Secure world operations
In the Secure world cache operations can affect both Secure and Non-secure
cache lines:
•
Clean, invalidate, and clean and invalidate operations affect all cache lines
regardless of their status as locked or unlocked.
•
For clean, invalidate, and clean and invalidate operations with the Set and
Index format, the selected cache line is affected regardless of the Secure
tag.
•
For MVA operations clean, invalidate, and clean and invalidate:
— when the MVA is marked as Non-secure in the page table, only Non-secure entries are affected
— when the MVA is marked as Secure in the page table, only Secure entries are affected.
Non-secure world operations
In the Non-secure world:
•
Clean, invalidate, and clean and invalidate operations only affect
Non-secure cache lines regardless of the method used.
•
Any attempt to access Secure cache lines is ignored.
•
Invalidate Entire Data Cache and Invalidate Both Caches operations cause
an Undefined exception. This prevents invalidating lockdown entries that
might be configured as Secure.
— the Invalidate Both Caches operation globally flushes the BTAC.
•
Invalidate Entire Instruction Cache operations:
— cause an Undefined exception if lockdown entries are reserved for the Secure world
— affect all Secure and Non-secure cache entries if the lockdown entries are not reserved for the Secure world
— globally flush the BTAC.
Cache Dirty Status Register
The purpose of the Cache Dirty Status Register is to indicate when the cache is dirty.
The Cache Dirty Status Register is:
•
in CP15 c7
•
a 32-bit read-only register, banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-43 shows the arrangement of bits in the Cache Dirty Status Register.
[31:1] UNP/SBZ | [0] C
Figure 3-43 Cache Dirty Status Register format
Table 3-74 lists how the bit value corresponds with the Cache Dirty Status Register function.
Table 3-74 Cache Dirty Status Register bit functions
Bits     Field name   Function
[31:1]   -            UNP/SBZ.
[0]      C            The C bit indicates if the cache is dirty.
                      0 = indicates that no write has hit the cache since the last cache clean, clean and invalidate, or
                      invalidate all operation, or reset, successfully left the cache clean. This is the reset value.
                      1 = indicates that the cache might contain dirty data.
The Cache Dirty Status Register behaves in this way with regard to the Secure and Non-secure
cache:
•
clean, invalidate, and clean and invalidate operations of the whole cache in the Non-secure
world clear the Non-secure Cache Dirty Status Register
•
clean, invalidate, and clean and invalidate operations of the whole cache in the Secure
world clear both the Secure and Non-secure Cache Dirty Status Registers
•
if the core is in the Non-secure world or targets Non-secure data from the Secure world,
stores that write a dirty bit in the cache set both the Secure and the Non-secure Cache Dirty
Status Register
•
all stores that write a dirty bit in the cache set the Secure Cache Dirty Status Register.
All writes and User mode reads of the Cache Dirty Status Register cause an Undefined
exception.
To use the Cache Dirty Status Register read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c7
•
CRm set to c10
•
Opcode_2 set to 6.
For example:
MRC p15, 0, <Rd>, c7, c10, 6 ; Read Cache Dirty Status Register.
Flush operations
Table 3-75 lists the flush operations and instructions available through c7.
Table 3-75 Cache operations flush functions
Instruction                    Data      Function
MCR p15, 0, <Rd>, c7, c5, 4    SBZ       Flush Prefetch Buffer (a).
MCR p15, 0, <Rd>, c7, c5, 6    SBZ       Flush Entire Branch Target Cache (b).
MCR p15, 0, <Rd>, c7, c5, 7    MVA (c)   Flush Branch Target Cache Entry with MVA.
a. These operations are accessible in both User and privileged modes of operation. All
other operations are only accessible in privileged modes of operation.
b. This operation is accessible in both Privileged and User modes of operation when in
Debug state.
c. The range of MVA bits used in this function is different to the range of bits used in other
functions that have MVA data.
The Flush Branch Target Entry using MVA operation uses a different MVA format to that used
by Clean and Invalidate operations. Figure 3-44 shows the MVA format for the Flush Branch
Target Entry operation.
[31:3] MVA | [2:0] SBZ
Figure 3-44 c7 format for Flush Branch Target Entry using MVA
Table 3-76 lists how the bit values correspond with the Flush Branch Target Entry using MVA
functions.
Table 3-76 Flush Branch Target Entry using MVA bit functions
Bits     Field name   Function
[31:3]   MVA          Specifies address to flush.
                      Holds the MVA of the Branch Target Cache line.
[2:0]    -            SBZ.
Note
The MVA does not have to be cache line aligned.
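A minimal C sketch of the operand format in Table 3-76, assuming a hypothetical helper name: the MVA occupies bits [31:3] and bits [2:0] are written as zero.

```c
#include <assert.h>
#include <stdint.h>

/* Build the <Rd> value for Flush Branch Target Cache Entry with MVA:
 * bits [31:3] carry the MVA, bits [2:0] are SBZ. */
static uint32_t btac_flush_operand(uint32_t mva)
{
    return mva & 0xFFFFFFF8u;
}
```

Because only bits [2:0] are cleared, an address that is not aligned to a cache line still yields a valid operand.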
Flushing the prefetch buffer has the effect that all instructions occurring in program order after
this instruction are fetched from the memory system after the execution of this instruction,
including the level one cache or TCM. This operation is useful for ensuring the correct execution
of self-modifying code. See Explicit Memory Barriers on page 6-25.
VA to PA translation operations
The purpose of the VA to PA translation operations is to provide a Secure means to determine
address translation in the Secure and Non-secure worlds and for address translation between the
Secure and Non-secure worlds. VA to PA translations operate through:
•
PA Register
•
VA to PA translation in the current world on page 3-82
•
VA to PA translation in the other world on page 3-83.
PA Register
The purpose of the PA Register is to hold:
•
the PA after a successful translation
•
the source of the abort for an unsuccessful translation.
Table 3-77 on page 3-81 lists the purpose of the bits of the PA Register for successful
translations and Table 3-78 on page 3-82 lists the purpose of the bits of the PA Register for
unsuccessful translations.
The PA Register is:
•
in CP15 c7
•
a 32-bit read/write register banked in Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-45 shows the format of the PA Register for successful translations.
[31:10] PA | [9] NS | [8] P | [7] SH | [6:4] INNER | [3:2] OUTER | [1] - | [0] 0
Figure 3-45 PA Register format for successful translation
Figure 3-46 on page 3-81 shows the format of the PA register for aborted translations.
[31:7] UNP/SBZ | [6:1] FSR[12,10,3:0] | [0] 1
Figure 3-46 PA Register format for aborted translation
Table 3-77 lists the functional bits of the PA Register for successful translation.
Table 3-77 PA Register for successful translation bit functions
Bits      Field name   Function
[31:10]   PA           Translated physical address.
[9]       NS           Indicates the state of the NS Attribute bit in the page table:
                       0 = Secure memory
                       1 = Non-secure memory.
[8]       P            Not used in the ARM1176JZ-S processor.
                       UNP/SBZ.
[7]       SH           Indicates shareable memory:
                       0 = Non-shared
                       1 = Shared.
[6:4]     INNER        Indicates the inner attributes from the page table:
                       b000 = Noncacheable
                       b001 = Strongly Ordered
                       b010 = Reserved
                       b011 = Device
                       b100 = Reserved
                       b101 = Reserved
                       b110 = Inner Write-through, no allocate on write
                       b111 = Inner Write-back, no allocate on write.
[3:2]     OUTER        Indicates the outer attributes from the page table:
                       b00 = Noncacheable
                       b01 = Write-back, allocate on write
                       b10 = Write-through, no allocate on write
                       b11 = Write-back, no allocate on write.
[1]       -            Reserved.
                       UNP/SBZ.
[0]       -            Indicates that the translation succeeded:
                       0 = Translation successful.
Table 3-78 lists the functional bits of the PA Register for aborted translation.
Table 3-78 PA Register for unsuccessful translation bit functions
Bits     Field name       Function
[31:7]   -                UNP/SBZ.
[6:1]    FSR[12,10,3:0]   Holds the FSR bits for the aborted address, see c5, Data Fault Status Register on page 3-64
                          and c5, Instruction Fault Status Register on page 3-66.
                          FSR bits [12], [10], and [3:0].
[0]      -                Indicates that the translation aborted:
                          1 = Translation aborted.
Attempts to access the PA Register in User mode result in an Undefined exception.
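The two PA Register layouts in Table 3-77 and Table 3-78 can be told apart by bit [0]. The C sketch below decodes a value that has already been read from the register. The structure and function names are hypothetical, and the code does not interpret the attribute encodings beyond extracting the fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     aborted;    /* bit [0]: 1 = translation aborted            */
    uint32_t pa_bits;    /* bits [31:10], kept in place; success only   */
    bool     non_secure; /* bit [9]; success only                       */
    bool     shared;     /* bit [7]; success only                       */
    uint8_t  inner;      /* bits [6:4]; success only                    */
    uint8_t  outer;      /* bits [3:2]; success only                    */
    uint8_t  fsr_bits;   /* bits [6:1] = FSR[12,10,3:0]; abort only     */
} pa_reg_fields;

/* Split a PA Register value into the fields listed in the two tables. */
static pa_reg_fields decode_pa_register(uint32_t value)
{
    pa_reg_fields f = {0};
    f.aborted = (value & 1u) != 0u;
    if (f.aborted) {
        f.fsr_bits = (uint8_t)((value >> 1) & 0x3Fu);
    } else {
        f.pa_bits    = value & 0xFFFFFC00u;
        f.non_secure = ((value >> 9) & 1u) != 0u;
        f.shared     = ((value >> 7) & 1u) != 0u;
        f.inner      = (uint8_t)((value >> 4) & 0x7u);
        f.outer      = (uint8_t)((value >> 2) & 0x3u);
    }
    return f;
}
```

Checking bit [0] first matters because bits [6:1] mean different things in the two layouts.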
Note
The VA to PA translation can only generate an abort to the core if the operation failed because
an external abort occurred on the possible page table request. In this case, the processor updates
the Secure or Non-secure version of the PA register, depending on the Secure or Non-secure
state of the core when the operation was issued. The processor also updates the Data Fault Status
Register and the Fault Address Register:
•
if the EA bit in the Secure Configuration Register is set, the Secure versions of the two
registers are updated and the processor traps the abort into Secure Monitor mode
•
if the EA bit in the Secure Configuration Register is not set, the processor updates the
Secure or Non-secure versions of the two registers, depending on the Secure or
Non-secure state of the core when the operation was issued.
For all other cases when the VA to PA operation fails, the processor only updates the PA register,
Secure or Non-secure version, depending on the Secure or Non-secure state of the core when
the operation was issued, with the Fault Status Register encoding and bit[0] set. The Data Fault
Status Register and Fault Address Register remain unchanged and the processor does not send
an abort to the core.
To use the PA Register read or write CP15 c7 with:
•
Opcode_1 set to 0
•
CRn set to c7
•
CRm set to c4
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c7, c4, 0 ; Read PA Register
MCR p15, 0, <Rd>, c7, c4, 0 ; Write PA Register
VA to PA translation in the current world
The purpose of the VA to PA translation in the current world is to translate the address with the
current virtual mapping for either Secure or Non-secure worlds.
The VA to PA translation in the current world operations use:
•
CP15 c7
•
four, 32-bit write-only operations common to the Secure and Non-secure worlds
•
operations accessible in privileged modes only
The operations work for privileged or User access permissions. They return information in the
PA Register: abort information when the translation is unsuccessful, or page table information
when the translation succeeds.
Attempts to access the VA to PA translation operations in the current world in User mode result
in an Undefined exception.
To use the VA to PA translation in the current world write CP15 c7 with:
•
Opcode_1 set to 0
•
CRn set to c7
•
CRm set to c8
•
Opcode_2 set to:
— 0 for privileged read permission
— 1 for privileged write permission
— 2 for User read permission
— 3 for User write permission.
General register <Rn> contains the VA for translation. The result returns in the PA Register, for
example:
MCR p15,0,<Rn>,c7,c8,3 ;get VA = <Rn> and run VA-to-PA translation
                       ;with User write permission.
                       ;if the selected page table has the
                       ;User write permission, the PA is loaded
                       ;in PA register, otherwise abort information is
                       ;loaded in PA Register
MRC p15,0,<Rd>,c7,c4,0 ;read in <Rd> the PA value
Note
The VA that this operation uses is the true VA not the MVA.
VA to PA translation in the other world
The purpose of the VA to PA translation in the other world is to translate the address with the
current virtual mapping in the Non-secure world while the core is in the Secure world.
The VA to PA translation in the other world operations use:
•
CP15 c7
•
four, 32-bit write-only operations in the Secure world only
•
operations accessible in privileged modes only.
The operations work in the Secure world for Non-secure privileged or Non-secure User access
permissions. They return information in the PA Register: abort information when the translation
is unsuccessful, or page table information when the translation succeeds.
Attempts to access the VA to PA translation operations in the other world in any Non-secure or
User mode result in an Undefined exception.
To use the VA to PA translation in the other world write CP15 c7 with:
•
Opcode_1 set to 0
•
CRn set to c7
•
CRm set to c8
•
Opcode_2 set to:
— 4 for privileged read permission
— 5 for privileged write permission
— 6 for User read permission
— 7 for User write permission.
General register <Rn> contains the VA for translation. The result returns in the PA Register, for
example:
MCR p15,0,<Rn>,c7,c8,4 ;get VA = <Rn> and run Non-secure translation
                       ;with Non-secure privileged read permission.
                       ;if the selected page table has the
                       ;privileged read permission, the PA is loaded
                       ;in PA register, otherwise abort information is
                       ;loaded in PA Register
MRC p15,0,<Rd>,c7,c4,0 ;read in <Rd> the PA value
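The eight Opcode_2 values listed for the current-world and other-world translation operations follow a regular pattern, which the C sketch below makes explicit. The function name and its three parameters are hypothetical. The mapping simply reproduces the two lists, values 0 to 3 and 4 to 7.

```c
#include <assert.h>
#include <stdbool.h>

/* Opcode_2 for a VA to PA translation operation (CRn = c7, CRm = c8):
 *   bit 2 = other world, bit 1 = User permission, bit 0 = write permission. */
static unsigned va_to_pa_opcode2(bool other_world, bool user, bool write)
{
    return (other_world ? 4u : 0u) | (user ? 2u : 0u) | (write ? 1u : 0u);
}
```

For instance, a current-world check for User write permission selects Opcode_2 = 3, and an other-world check for privileged read permission selects Opcode_2 = 4, matching the two examples above.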
Data Synchronization Barrier operation
The purpose of the Data Synchronization Barrier operation is to ensure that all outstanding
explicit memory transactions complete before any following instructions begin. This ensures
that data in memory is up to date before the processor executes any more instructions.
Note
The Data Synchronization Barrier operation is synonymous with Drain Write Buffer and Data
Write Barrier in earlier versions of the architecture.
The Data Synchronization Barrier operation is:
•
in CP15 c7
•
32-bit write-only access, common to both Secure and Non-secure worlds
•
accessible in both User and Privileged modes.
Table 3-79 lists the results of attempted access for each mode.
Table 3-79 Results of access to the Data Synchronization Barrier operation
Read                  Write
Undefined exception   Data
To use the Data Synchronization Barrier operation write CP15 with <Rd> SBZ and:
•
Opcode_1 set to 0
•
CRn set to c7
•
CRm set to c10
•
Opcode_2 set to 4.
For example:
MCR p15,0,<Rd>,c7,c10,4 ; Data Synchronization Barrier operation.
For more details, see Explicit Memory Barriers on page 6-25.
Note
The W bit that normally enables the Write Buffer is not implemented in ARM1176JZ-S
processors, see c1, Control Register on page 3-44.
This instruction acts as an explicit memory barrier. This instruction completes when all explicit
memory transactions occurring in program order before this instruction are completed. No
instructions occurring in program order after this instruction are executed until this instruction
completes. Therefore, no explicit memory transactions occurring in program order after this
instruction are started until this instruction completes. See Explicit Memory Barriers on
page 6-25.
It can be used instead of Strongly Ordered memory when the timing of specific stores to the
memory system has to be controlled. For example, when a store to an interrupt acknowledge
location must be completed before interrupts are enabled.
The Data Synchronization Barrier operation can be performed in both privileged and User
modes of operation.
Data Memory Barrier operation
The purpose of the Data Memory Barrier operation is to ensure that all outstanding explicit
memory transactions complete before any following explicit memory transactions begin. This
ensures that data in memory is up to date before any memory transaction that depends on it.
The Data Memory Barrier operation is:
•
in CP15 c7
•
a 32-bit write only operation, common to the Secure and Non-secure worlds
•
accessible in User and Privileged mode.
Table 3-80 lists the results of attempted access for each mode.
Table 3-80 Results of access to the Data Memory Barrier operation
Read                  Write
Undefined exception   Data
To use the Data Memory Barrier operation write CP15 with <Rd> SBZ and:
•
Opcode_1 set to 0
•
CRn set to c7
•
CRm set to c10
•
Opcode_2 set to 5.
For example:
MCR p15,0,<Rd>,c7,c10,5 ; Data Memory Barrier Operation.
For more details, see Explicit Memory Barriers on page 6-25.
Wait For Interrupt operation
The purpose of the Wait For Interrupt operation is to put the processor into a low-power state,
see Standby mode on page 10-3.
The Wait For Interrupt operation is:
•
in CP15 c7
•
32-bit write only access, common to Secure and Non-secure worlds
•
accessible in privileged modes only.
Table 3-81 lists the results of attempted access for each mode.
Table 3-81 Results of access to the Wait For Interrupt operation
Secure Privileged                          Non-secure Privileged                      User
Read                  Write                Read                  Write
Undefined exception   Wait For Interrupt   Undefined exception   Wait For Interrupt   Undefined exception
To use the Wait For Interrupt operation write CP15 with <Rd> SBZ and:
•
Opcode_1 set to 0
•
CRn set to c7
•
CRm set to c0
•
Opcode_2 set to 4.
For example:
MCR p15,0,<Rd>,c7,c0,4 ; Wait For Interrupt.
This puts the processor into a low-power state and stops it executing following instructions until
an interrupt, an imprecise external abort, or a debug request occurs, regardless of whether the
interrupts or external imprecise aborts are disabled by the masks in the CPSR. When an interrupt
does occur, the MCR instruction completes. If interrupts are enabled, the IRQ or FIQ handler is
entered as normal. The return link in R14_irq or R14_fiq contains the address of the MCR
instruction plus 8, so that the normal instruction used for interrupt return (SUBS PC,R14,#4)
returns to the instruction following the MCR.
3.2.23
c8, TLB Operations Register
The purpose of the TLB Operations Register is to either:
•
invalidate all the unlocked entries in the TLB
•
invalidate all TLB entries for an area of memory before the MMU remaps it
•
invalidate all TLB entries that match an ASID value.
These operations can be performed on either:
•
Instruction TLB
•
Data TLB
•
Unified TLB.
Note
The ARM1176JZ-S processor has a unified TLB. Any TLB operations specified for the
Instruction or Data TLB perform the equivalent operation on the unified TLB.
The TLB Operations Register is:
•
in CP15 c8
•
a 32-bit write-only register banked for Secure and Non-secure world operations
•
accessible in privileged modes only.
Table 3-82 lists the results of attempted access for each mode.
Table 3-82 Results of access to the TLB Operations Register
Secure Privileged                   Non-secure Privileged                   User
Read                  Write         Read                  Write
Undefined exception   Secure data   Undefined exception   Non-secure data   Undefined exception
To access the TLB Operations Register write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c8
•
CRm set to:
— c5, Instruction TLB
— c6, Data TLB
— c7, Unified TLB
•
Opcode_2 set to:
— 0, Invalidate TLB unlocked entries
— 1, Invalidate TLB Entry by MVA
— 2, Invalidate TLB Entry on ASID Match.
For example, to invalidate all the unlocked entries in the Instruction TLB:
MCR p15,0,<Rd>,c8, c5,0 ; Write TLB Operations Register
Functions that update the contents of the TLB occur in program order. Therefore, an explicit
data access before the TLB function uses the old TLB contents, and an explicit data access after
the TLB function uses the new TLB contents. For instruction accesses, TLB updates are
guaranteed to have taken effect before the next pipeline flush. This includes Flush Prefetch
Buffer operations and exception return sequences.
Invalidate TLB unlocked entries
Invalidate TLB unlocked entries invalidates all the unlocked entries in the TLB. This function
causes a flush of the prefetch buffer. Therefore, all instructions that follow are fetched after the
TLB invalidation.
Invalidate TLB Entry by MVA
You can use Invalidate TLB Entry by MVA to invalidate all TLB entries for an area of memory
before you remap.
You must perform an Invalidate TLB Entry by MVA for an MVA in each area that you want to
remap, whether it is a section, small page, or large page.
This function invalidates a TLB entry that matches the provided MVA and ASID, or a global
TLB entry that matches the provided MVA.
This function invalidates a matching locked entry.
The Invalidate TLB Entry by MVA operation uses an MVA and ASID as an argument.
Figure 3-47 on page 3-88 shows the format of this.
[31:12] Modified virtual address | [11:8] SBZ | [7:0] ASID
Figure 3-47 TLB Operations Register MVA and ASID format
Invalidate TLB Entry on ASID Match
This is a single interruptible operation that invalidates all TLB entries that match the provided
ASID value.
This function invalidates locked entries but does not invalidate entries marked as global.
In this processor this operation takes several cycles to complete and the instruction is
interruptible. When interrupted the R14 state is set to indicate that the MCR instruction has not
executed. Therefore, R14 points to the address of the MCR + 4. The interrupt routine then
automatically restarts at the MCR instruction. If the processor interrupts and later restarts this
operation, any entries fetched into the TLB by the interrupt that uses the provided ASID are
invalidated by the restarted invalidation.
The Invalidate TLB Entry on ASID Match function requires an ASID as an argument.
Figure 3-48 shows the format of this.
[31:8] SBZ | [7:0] ASID
Figure 3-48 TLB Operations Register ASID format
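The two operand formats for the TLB operations, MVA with ASID and ASID alone, can be composed as in the following C sketch. The helper names are hypothetical. The field positions are those of Figure 3-47 and Figure 3-48: MVA in bits [31:12], ASID in bits [7:0], and all other bits zero.

```c
#include <assert.h>
#include <stdint.h>

/* Operand for Invalidate TLB Entry by MVA:
 * bits [31:12] = MVA, bits [11:8] = SBZ, bits [7:0] = ASID. */
static uint32_t tlb_mva_asid_operand(uint32_t mva, uint8_t asid)
{
    return (mva & 0xFFFFF000u) | (uint32_t)asid;
}

/* Operand for Invalidate TLB Entry on ASID Match:
 * bits [31:8] = SBZ, bits [7:0] = ASID. */
static uint32_t tlb_asid_operand(uint8_t asid)
{
    return (uint32_t)asid;
}
```

Masking the MVA to bits [31:12] keeps the SBZ field clear even when the caller passes an address that is not page aligned.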
3.2.24
c9, Data and instruction cache lockdown registers
The purpose of the data and instruction cache lockdown registers is to provide a means to lock
down the caches and therefore provide some control over pollution that applications might
cause. With these registers you can lock down each cache way independently.
There are two cache lockdown registers:
•
one Data Cache Lockdown Register
•
one Instruction Cache Lockdown Register.
The cache lockdown registers are:
•
in CP15 c9
•
two 32-bit read/write registers, common to the Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-49 shows the bit arrangement of the cache lockdown registers.
[31:4] SBO | [3:0] L bit for each cache way
Figure 3-49 Instruction and data cache lockdown register formats
Table 3-83 lists how the bit values correspond with the cache lockdown registers functions.
Table 3-83 Instruction and data cache lockdown register bit functions
Bits     Field name      Function
[31:4]   SBO             UNP on reads, SBO on writes.
[3:0]    L bit for each  Locks each cache way individually. The L bits for cache ways 3 to 0 are bits [3:0]
         cache way       respectively. On a line fill to the cache, data is allocated to unlocked cache ways as
                         determined by the standard replacement algorithm. Data is not allocated to locked cache
                         ways. If a cache way is not implemented, then the L bit for that way is hardwired to 1, and
                         writes to that bit are ignored.
                         0 indicates that this cache way is not locked. Allocation to this cache way is determined by
                         the standard replacement algorithm. This is the reset state.
                         1 indicates that this cache way is locked. No allocation is performed to this cache way.
The lockdown behavior depends on the CL bit, see c1, Non-Secure Access Control Register on
page 3-55. If the CL bit is not set, the Lockdown entries are reserved for the Secure world.
Table 3-84 lists the results of attempted access for each mode.
Table 3-84 Results of access to the Instruction and Data Cache Lockdown Register
CL bit value   Secure Privileged   Non-secure Privileged                       User
               Read    Write       Read                  Write
0              Data    Data        Undefined exception   Undefined exception   Undefined exception
1              Data    Data        Data                  Data                  Undefined exception
The Data Cache Lockdown Register only supports the Format C method of lockdown. This
method is a cache way based scheme that gives a traditional lockdown function to lock critical
regions in the cache.
A locking bit for each cache way determines if the normal cache allocation mechanisms,
Random or Round-Robin, can access that cache way. For details of the RR bit, that controls the
selection of Random or Round-Robin cache policy, see c1, Control Register on page 3-44.
ARM1176JZ-S processors have an associativity of 4. With all ways locked, the ARM1176JZ-S
processor behaves as if only ways 3 to 1 are locked and way 0 is unlocked.
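The L-bit encoding can be illustrated with a short C sketch. The names are hypothetical. It assumes four ways, writes ones to the SBO bits [31:4], and models the statement above that a register with all four ways locked behaves as if way 0 were unlocked.

```c
#include <assert.h>
#include <stdint.h>

#define LOCKDOWN_SBO_BITS 0xFFFFFFF0u /* bits [31:4] are SBO on writes */
#define LOCKDOWN_L_MASK   0x0000000Fu /* L bits for ways 3 to 0        */

/* Register value that leaves only one way open to allocation:
 * L = 0 for the chosen way, L = 1 for every other way. */
static uint32_t lockdown_all_but_way(unsigned way)
{
    assert(way < 4u);
    return LOCKDOWN_SBO_BITS | (LOCKDOWN_L_MASK & ~(1u << way));
}

/* L bits the processor effectively applies: with all ways locked,
 * it behaves as if only ways 3 to 1 are locked. */
static uint32_t effective_locked_ways(uint32_t reg_value)
{
    uint32_t l_bits = reg_value & LOCKDOWN_L_MASK;
    return (l_bits == LOCKDOWN_L_MASK) ? 0xEu : l_bits;
}
```

For example, opening only way 2 gives the value 0xFFFFFFFB, and a value with all four L bits set is treated as 0xE.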
To use the Instruction and Data Cache Lockdown Registers read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c9
•
CRm set to c0
•
Opcode_2 set to:
— 0, for Data Cache
— 1, for Instruction Cache.
For example:
MRC p15, 0, <Rd>, c9, c0, 0 ; Read Data Cache Lockdown Register
MCR p15, 0, <Rd>, c9, c0, 0 ; Write Data Cache Lockdown Register
MRC p15, 0, <Rd>, c9, c0, 1 ; Read Instruction Cache Lockdown Register
MCR p15, 0, <Rd>, c9, c0, 1 ; Write Instruction Cache Lockdown Register
The system must only change a cache lockdown register when it is certain that all outstanding
accesses that might cause a cache line fill are complete. For this reason, the processor must
perform a Data Synchronization Barrier operation before the cache lockdown register changes,
see Data Synchronization Barrier operation on page 3-84.
The following procedure for lock down into a data or instruction cache way i, with N cache
ways, using Format C, ensures that only the target cache way i is locked down.
This is the architecturally defined method for locking data or instructions into caches:
1.
Ensure that no processor exceptions can occur during the execution of this procedure, by
disabling interrupts. If this is not possible, all code and data or instructions used by any
exception handlers that can be called must meet the conditions specified in step 2.
2.
Ensure that all data or instructions used by the following code, apart from the data or
instructions that are to be locked down, are either:
•
in a Noncacheable area of memory, including the TCM
•
in an already locked cache way.
3.
Ensure that the data or instructions to be locked down are in a Cacheable area of memory.
4.
Ensure that the data or instructions to be locked down are not already in the cache, using
cache Clean and/or Invalidate instructions as appropriate, see c7, Cache operations on
page 3-69.
5.
Enable allocation to the target cache way by writing to the Instruction or Data Cache
Lockdown Register, with the CRm field set to 0, setting L equal to 0 for bit i and L equal
to 1 for all other ways.
6.
Ensure that the memory cache line is loaded into the cache by using an LDR instruction
to load a word from the memory cache line, for each of the cache lines to be locked down
in cache way i.
To lock down an instruction cache use the c7 Prefetch Instruction Cache Line operation
to fetch the memory cache line, see Invalidate, Clean, and Prefetch operations on
page 3-71.
7.
Write to the Instruction or Data Cache Lockdown Register, setting L to 1 for bit i and
restoring all the other bits to the values they had before this routine was started.
3.2.25
c9, Data TCM Region Register
The purpose of the Data TCM Region Register is to describe the physical base address and size
of the Data TCM region and to provide a mechanism to enable it.
The Data TCM Region Register is:
•
in CP15 c9
•
a 32-bit read/write register common to Secure and Non-secure worlds
•
accessible in privileged modes only.
If the processor is configured to have 2 Data TCMs, each TCM has a separate Data TCM Region
Register. The TCM Selection Register determines the register in use.
Figure 3-50 on page 3-91 shows the bit arrangement for the Data TCM Region Register.
[31:12] Base address (physical address) | [11:7] SBZ/UNP | [6:2] Size | [1] SBZ | [0] En
Figure 3-50 Data TCM Region Register format
Table 3-85 lists how the bit values correspond with the Data TCM Region Register functions.
Table 3-85 Data TCM Region Register bit functions
Bits      Field name     Function
[31:12]   Base address   Contains the physical base address of the TCM.
                         The base address must be aligned to the size of the TCM.
                         Any bits in the range [(log2(RAMSize)-1):12] are ignored. The base address is 0 at Reset.
[11:7]    -              UNP/SBZ.
[6:2]     Size           Indicates the size of the TCM on reads (a). All other values are reserved:
                         b00000 = 0KB
                         b00011 = 4KB
                         b00100 = 8KB
                         b00101 = 16KB
                         b00110 = 32KB.
[1]       -              UNP/SBZ.
[0]       En             Indicates if the TCM is enabled.
                         0 = TCM disabled, reset value
                         1 = TCM enabled.
a. On writes this field is ignored. For more details see Tightly-coupled memory on page 7-7.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Note
When the NS access bit is 0 for Data TCM, see c9, Data TCM Non-secure Control Access
Register on page 3-94, attempts to access the Data TCM Region Register from the Non-secure
world cause an Undefined exception.
Table 3-86 lists the results of attempted access for each mode.
Table 3-86 Results of access to the Data TCM Region Register
NS access bit value   Secure Privileged   Non-secure Privileged                       User
                      Read    Write       Read                  Write
0                     Data    Data        Undefined exception   Undefined exception   Undefined exception
1                     Data    Data        Data                  Data                  Undefined exception
To use the Data TCM Region Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c9
•
CRm set to c1
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c9, c1, 0 ; Read Data TCM Region Register
MCR p15, 0, <Rd>, c9, c1, 0 ; Write Data TCM Region Register
Attempting to change the Data TCM Region Register while a DMA operation is running has
Unpredictable effects but there is no impact on security.
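As a sketch of how software might interpret a value read from the Data TCM Region Register, the C code below extracts the base address, Size, and En fields. The structure and function names are hypothetical. Size encodings other than the five listed in the bit functions table are reported as reserved rather than guessed at.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t base;          /* bits [31:12], physical base address */
    uint32_t size_kb;       /* decoded Size field, in KB           */
    bool     size_reserved; /* true for a reserved Size encoding   */
    bool     enabled;       /* bit [0], En                         */
} tcm_region;

/* Decode a TCM Region Register value that has already been read. */
static tcm_region decode_tcm_region(uint32_t value)
{
    tcm_region r = {0};
    uint32_t size_field = (value >> 2) & 0x1Fu;

    r.base    = value & 0xFFFFF000u;
    r.enabled = (value & 1u) != 0u;

    if (size_field == 0u) {
        r.size_kb = 0u;                       /* b00000 = 0KB         */
    } else if (size_field >= 3u && size_field <= 6u) {
        r.size_kb = 1u << (size_field - 1u);  /* b00011..b00110: 4KB to 32KB */
    } else {
        r.size_reserved = true;
    }
    return r;
}
```

The shift in the middle branch reproduces the listed encodings: b00011 gives 4KB, b00100 gives 8KB, b00101 gives 16KB, and b00110 gives 32KB.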
3.2.26
c9, Instruction TCM Region Register
The purpose of the Instruction TCM Region Register is to describe the physical base address
and size of the Instruction TCM region and to provide a mechanism to enable it.
Table 3-87 lists the purposes of the individual bits of the Instruction TCM Region Register.
The Instruction TCM Region Register is:
•
in CP15 c9
•
a 32-bit read/write register common to Secure and Non-secure worlds
•
accessible in privileged modes only.
If the processor is configured to have 2 Instruction TCMs, each TCM has a separate Instruction
TCM Region Register. The TCM Selection Register determines the register in use.
Figure 3-51 shows the bit arrangement for the Instruction TCM Region Register.
[31:12] Base address (physical address) | [11:7] SBZ/UNP | [6:2] Size | [1] SBZ | [0] En
Figure 3-51 Instruction TCM Region Register format
Table 3-87 lists how the bit values correspond with the Instruction TCM Region Register
functions.
Table 3-87 Instruction TCM Region Register bit functions
Bits      Field name     Function
[31:12]   Base address   Contains the physical base address of the TCM. The base address must be
                         aligned to the size of the TCM. Any bits in the range
                         [(log2(RAMSize)-1):12] are ignored. The base address is 0 at Reset.
[11:7]    -              UNP/SBZ.
[6:2]     Size           Indicates the size of the TCM on reads (a). All other values are reserved:
                         b00000 = 0KB
                         b00011 = 4KB
                         b00100 = 8KB
                         b00101 = 16KB
                         b00110 = 32KB.
[1]       -              UNP/SBZ.
[0]       En             Indicates if the TCM is enabled:
                         0 = TCM disabled.
                         1 = TCM enabled.
                         The reset value of this bit depends on the value of the INITRAM static
                         configuration signal. If INITRAM is HIGH then this bit resets to 1. If
                         INITRAM is LOW then this bit resets to 0. For more information see
                         Static configuration signals on page A-4.
a. On writes this field is ignored. For more details see Tightly-coupled memory on page 7-7.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
The value of the En bit at Reset depends on the INITRAM signal:
•
INITRAM LOW sets En to 0
•
INITRAM HIGH sets En to 1.
When INITRAM is HIGH this enables the Instruction TCM directly from reset, with a Base
address of 0x00000. When the processor comes out of reset, it executes the instructions in the
Instruction TCM instead of fetching instructions from external memory, except when the
processor uses high vectors.
Note
When the NS access bit is 0 for Instruction TCM, see c9, Instruction TCM Non-secure Control
Access Register on page 3-95, attempts to access the Instruction TCM Region Register from the
Non-secure world cause an Undefined exception.
Table 3-88 lists the results of attempted access for each mode.
Table 3-88 Results of access to the Instruction TCM Region Register
NS access   Secure Privileged   Non-secure Privileged                     User
bit value   Read    Write       Read                 Write
0           Data    Data        Undefined exception  Undefined exception  Undefined exception
1           Data    Data        Data                 Data                 Undefined exception
To use the Instruction TCM Region Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c9
•
CRm set to c1
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c9, c1, 1 ; Read Instruction TCM Region Register
MCR p15, 0, <Rd>, c9, c1, 1 ; Write Instruction TCM Region Register
Attempting to change the Instruction TCM Region Register while a DMA operation is running has
Unpredictable effects but there is no impact on security.
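The rule that the base address must be aligned to the size of the TCM can be checked in software before the register is written. The following is a minimal sketch, assuming the TCM size in KB is already known, for example from the Size field. The function names are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if base is aligned to a TCM of size_kb kilobytes.
 * size_kb is assumed to be one of the documented sizes: 4, 8, 16 or 32. */
static int tcm_base_is_aligned(uint32_t base, uint32_t size_kb)
{
    uint32_t size_bytes = size_kb * 1024u;
    return size_bytes != 0u && (base & (size_bytes - 1u)) == 0u;
}

/* Builds a value for an MCR write: base address in bits [31:12], En in bit [0].
 * The Size field is left as zero because it is ignored on writes. */
static uint32_t tcm_region_value(uint32_t base, int enable)
{
    return (base & 0xFFFFF000u) | (enable ? 1u : 0u);
}
```

Checking alignment first avoids relying on the processor ignoring the low-order base address bits.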
3.2.27 c9, Data TCM Non-secure Control Access Register
The purpose of the Data TCM Non-secure Control Access Register is to:
•
set access permission to the Data TCM Region Register
•
define data in the Data TCM as Secure or Non-secure.
The Data TCM Non-secure Control Access Register is:
•
in CP15 c9
•
a 32-bit read/write register in the Secure world only
•
accessible in privileged modes only.
If the processor is configured to have 2 Data TCMs, each TCM has a separate Data TCM
Non-secure Control Access Register. The TCM Selection Register determines the register in
use.
Figure 3-52 shows the bit arrangement for the Data TCM Non-secure Control Access Register.
[31:1] SBZ | [0] NS access
Figure 3-52 Data TCM Non-secure Control Access Register format
Table 3-89 lists how the bit values correspond with the register functions.
Table 3-89 Data TCM Non-secure Control Access Register bit functions
Bits     Field name   Function
[31:1]   -            UNP/SBZ.
[0]      NS access    Makes Data TCM invisible to the Non-secure world and makes TCM data Secure.
                      0 = Data TCM Region Register only accessible in the Secure world. Data TCM
                      only visible in the Secure world and only when the NS Attribute in the page
                      table is 0. The reset value is 0.
                      1 = Data TCM Region Register accessible in the Secure and Non-secure worlds.
                      Data TCM is visible in the Non-secure world, and also in the Secure world if
                      the NS Attribute in the page table is 1.
Table 3-90 lists the effect on TCM operations for different combinations of operating world and
NS bits.
Table 3-90 Effects of NS items for data TCM operation
World        NS access   NS page table   Region visible   Control                            Data
Secure       0           1               No               -                                  -
Secure       1           0               No               -                                  -
Secure       0           0               Yes              Secure privileged only             Secure only
Secure       1           1               Yes              Secure and Non-secure privileged   Non-secure only
Non-secure   1           X               Yes              Secure and Non-secure privileged   Non-secure only
Non-secure   0           X               No               -                                  -
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Attempts to access the Data TCM Non-secure Control Access Register in modes other than
Secure privileged result in an Undefined exception.
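The combinations in Table 3-86 and Table 3-90 can be summarized as two small predicates. The following C sketch is a software model only, written to make the tables easier to check. The function and parameter names are hypothetical.

```c
#include <stdbool.h>

/* Model of Table 3-90: is the Data TCM region visible to an access?
 * secure_world : true for an access made in the Secure world
 * ns_access    : NS access bit of the Non-secure Control Access Register
 * ns_page      : NS attribute from the page table (ignored in the Non-secure world) */
static bool tcm_region_visible(bool secure_world, bool ns_access, bool ns_page)
{
    if (!secure_world)
        return ns_access;          /* Non-secure world: visible only if NS access = 1 */
    return ns_access == ns_page;   /* Secure world: NS access must match the NS attribute */
}

/* Model of Table 3-86: does an access to the Data TCM Region Register return
 * or accept data (true), or cause an Undefined exception (false)? */
static bool tcm_region_reg_accessible(bool secure_world, bool privileged, bool ns_access)
{
    if (!privileged)
        return false;              /* User mode: always an Undefined exception */
    return secure_world || ns_access;
}
```

The model does not cover the CP15SDISABLE case, where Secure privileged writes also result in an Undefined exception.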
To use the Data TCM Non-secure Control Access Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c9
•
CRm set to c1
•
Opcode_2 set to 2.
For example:
MRC p15,0,<Rd>,c9,c1,2 ; Read Data TCM Non-secure Control Access Register
MCR p15,0,<Rd>,c9,c1,2 ; Write Data TCM Non-secure Control Access Register
3.2.28 c9, Instruction TCM Non-secure Control Access Register
The purpose of the Instruction TCM Non-secure Control Access Register is to:
•
set access permission to the Instruction TCM Region Register
•
define instructions in the Instruction TCM as Secure or Non-secure.
The Instruction TCM Non-secure Control Access Register is:
•
in CP15 c9
•
a 32-bit read/write register in the Secure world only
•
accessible in privileged modes only.
If the processor is configured to have 2 Instruction TCMs, each TCM has a separate Instruction
TCM Non-secure Control Access Register. The TCM Selection Register determines the register
in use.
Figure 3-53 on page 3-96 shows the bit arrangement for the Instruction TCM Non-secure
Control Access Register.
[31:1] SBZ | [0] NS access
Figure 3-53 Instruction TCM Non-secure Control Access Register format
Table 3-91 lists how the bit values correspond with the register functions.
Table 3-91 Instruction TCM Non-secure Control Access Register bit functions
Bits     Field name   Function
[31:1]   -            UNP/SBZ.
[0]      NS access    Makes Instruction TCM invisible to the Non-secure world and makes TCM data
                      Secure.
                      0 = Instruction TCM Region Register only accessible in the Secure world.
                      Instruction TCM only visible in the Secure world and only when the NS
                      Attribute in the page table is 0. The reset value is 0.
                      1 = Instruction TCM Region Register accessible in the Secure and Non-secure
                      worlds. Instruction TCM is visible in the Non-secure world, and also in the
                      Secure world if the NS Attribute in the page table is 1.
Table 3-92 lists the effect on TCM operations for different combinations of operating world, and
NS bits.
Table 3-92 Effects of NS items for instruction TCM operation
World        NS access   NS page table   Region visible   Control                            Data
Secure       0           1               No               -                                  -
Secure       1           0               No               -                                  -
Secure       0           0               Yes              Secure privileged only             Secure only
Secure       1           1               Yes              Secure and Non-secure privileged   Non-secure only
Non-secure   1           X               Yes              Secure and Non-secure privileged   Non-secure only
Non-secure   0           X               No               -                                  -
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Attempts to access the Instruction TCM Non-secure Control Access Register in modes other
than Secure Privileged result in an Undefined exception.
To use the Instruction TCM Non-secure Control Access Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c9
•
CRm set to c1
•
Opcode_2 set to 3.
For example:
MRC p15,0,<Rd>,c9,c1,3 ; Read Instruction TCM Non-secure Control Access Register
MCR p15,0,<Rd>,c9,c1,3 ; Write Instruction TCM Non-secure Control Access Register
3.2.29 c9, TCM Selection Register
The purpose of the TCM Selection Register is to determine the bank of CP15 registers related
to TCM configuration in use. These banks consist of:
•
c9, Data TCM Region Register on page 3-90
•
c9, Instruction TCM Region Register on page 3-92
•
c9, Data TCM Non-secure Control Access Register on page 3-94
•
c9, Instruction TCM Non-secure Control Access Register on page 3-95.
The TCM Selection Register is:
•
in CP15 c9
•
a 32-bit read/write register banked in the Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-54 shows the bit arrangement for the TCM Selection Register.
[31:2] SBZ | [1:0] TCM number
Figure 3-54 TCM Selection Register format
Table 3-93 lists how the bit values correspond with the TCM Selection Register functions.
Table 3-93 TCM Selection Register bit functions
Bits     Field name   Function
[31:2]   -            UNP/SBZ.
[1:0]    TCM number   Selects the bank of CP15 registers related to TCM configuration.
                      Attempts to select a bank related to a TCM that does not exist are ignored:
                      b00 = TCM 0, reset value.
                      b01 = TCM 1. When there is only one TCM on both Instruction and Data
                      sides, write access is ignored.
                      b10 = Write access ignored.
                      b11 = Write access ignored.
In the Secure world, accesses to the TCM Region Registers and TCM Non-secure Control Access
Registers use the bank of CP15 registers related to TCM configuration that the Secure TCM
Selection Register selects. In the Non-secure world, accesses to the TCM Region Registers use
the bank of CP15 registers related to TCM configuration that the Non-secure TCM Selection
Register selects.
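The write-ignore rules for the TCM number field in Table 3-93 can be expressed as a small software model. This is only a sketch under stated assumptions: the function name tcm_select_write is hypothetical, the written value is assumed to have bits [31:2] as zero, and a write of b00 is assumed always to be accepted.

```c
#include <stdint.h>

/* Model of a write to the TCM number field, bits [1:0].
 * current   : value the field holds before the write
 * requested : value written to the register
 * num_itcm, num_dtcm : number of Instruction and Data TCMs implemented
 * Returns the value the field holds after the write. */
static uint32_t tcm_select_write(uint32_t current, uint32_t requested,
                                 unsigned num_itcm, unsigned num_dtcm)
{
    requested &= 0x3u;                    /* only bits [1:0] are modelled */
    if (requested == 0u)
        return 0u;                        /* b00 = TCM 0 */
    if (requested == 1u && (num_itcm > 1u || num_dtcm > 1u))
        return 1u;                        /* b01 = TCM 1, if a second TCM exists */
    return current;                       /* b10, b11, or no TCM 1: write ignored */
}
```

Because the register is banked, a complete model would keep one such field for the Secure world and one for the Non-secure world.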
Table 3-94 lists the results of attempted access for each mode.
Table 3-94 Results of access to the TCM Selection Register
Secure Privileged           Non-secure Privileged               User
Read          Write         Read              Write
Secure data   Secure data   Non-secure data   Non-secure data   Undefined exception
To use the TCM Selection Register read or write CP15 c9 with:
•
Opcode_1 set to 0
•
CRn set to c9
•
CRm set to c2
•
Opcode_2 set to 0.
For example:
MRC p15,0,<Rd>,c9,c2,0 ; Read TCM Selection Register
MCR p15,0,<Rd>,c9,c2,0 ; Write TCM Selection Register
3.2.30 c9, Cache Behavior Override Register
The purpose of the Cache Behavior Override Register is to control cache write through and line
fill behavior for interruptible cache operations, or during debug. The register enables you to
ensure that the contents of caches do not change, for example in debug.
The Cache Behavior Override Register is:
•
in CP15 c9
•
a 32-bit read/write register. Table 3-95 lists the access for each bit in the Secure and
Non-secure worlds
•
accessible in privileged modes only.
Figure 3-55 shows the bit arrangement for the Cache Behavior Override Register.
[31:6] SBZ | [5] S_WT | [4] S_IL | [3] S_DL | [2] NS_WT | [1] NS_IL | [0] NS_DL
Figure 3-55 Cache Behavior Override Register format
Table 3-95 lists how the bit values correspond to the Cache Behavior Override Register.
Table 3-95 Cache Behavior Override Register bit functions
Bits     Field name   Access        Function
[31:6]   -            -             UNP/SBZ.
[5]      S_WT         Secure only   Defines write-through behavior for regions marked as Secure write-back:
                                    0 = Do not force write-through, normal operation, reset value
                                    1 = Force write-through.
[4]      S_IL         Secure only   Defines Instruction Cache linefill behavior for Secure regions:
                                    0 = Instruction Cache linefill enabled, normal operation, reset value
                                    1 = Instruction Cache linefill disabled.
[3]      S_DL         Secure only   Defines Data Cache linefill behavior for Secure regions:
                                    0 = Data Cache linefill enabled, normal operation, reset value
                                    1 = Data Cache linefill disabled.
[2]      NS_WT        Common        Defines write-through behavior for regions marked as Non-secure write-back:
                                    0 = Do not force write-through, normal operation, reset value
                                    1 = Force write-through.
[1]      NS_IL        Common        Defines Instruction Cache linefill behavior for Non-secure regions:
                                    0 = Instruction Cache linefill enabled, normal operation, reset value
                                    1 = Instruction Cache linefill disabled.
[0]      NS_DL        Common        Defines Data Cache linefill behavior for Non-secure regions:
                                    0 = Data Cache linefill enabled, normal operation, reset value
                                    1 = Data Cache linefill disabled.
Table 3-96 lists the actions that result from attempted access for each mode.
Table 3-96 Results of access to the Cache Behavior Override Register
Bits                Secure Privileged access   Non-secure Privileged access   User access
                                               Read           Write
Secure only [5:3]   Data                       Read As Zero   Ignored         Undefined exception
Common [2:0]        Data                       Data           Data            Undefined exception
To use the Cache Behavior Override Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c9
•
CRm set to c8
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c9, c8, 0 ; Read Cache Behavior Override Register
MCR p15, 0, <Rd>, c9, c8, 0 ; Write Cache Behavior Override Register
You might use the Cache Behavior Override Register during, for example, a clean all or clean
and invalidate all operation in the Non-secure world. Such an operation might not prevent fast
interrupts to the Secure world if the FW bit is clear, see c1, Secure Configuration Register on
page 3-52. In this case, the Secure world can read or write the Non-secure locations in the
cache, potentially causing the cache to contain valid or dirty Non-secure entries when the
Non-secure clean or clean and invalidate all operation completes. To avoid this kind of problem,
the Secure side must not allocate Non-secure entries into the cache and must treat all writes to
Non-secure regions that hit in the cache as write-through.
Note
Three bits, nWT, nIL and nDL, are also defined for Debug state in CP14, see CP14 c10, Debug
State Cache Control Register on page 13-23, and apply to all Secure and Non-secure regions.
The CP14 register has precedence over the CP15 register when the core is in Debug state, and
the CP15 register has precedence over the CP14 register in functional states.
For more information on cache debug, see Chapter 13 Debug.
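The per-world access rules of Table 3-96 can be modelled with bit masks. The following C sketch is illustrative only: the macro and function names are hypothetical, and the model covers privileged accesses, not the User mode Undefined exception.

```c
#include <stdint.h>

#define CBOR_NS_DL       (1u << 0)  /* Non-secure Data Cache linefill disable */
#define CBOR_NS_IL       (1u << 1)  /* Non-secure Instruction Cache linefill disable */
#define CBOR_NS_WT       (1u << 2)  /* force write-through for Non-secure write-back regions */
#define CBOR_S_DL        (1u << 3)  /* Secure Data Cache linefill disable */
#define CBOR_S_IL        (1u << 4)  /* Secure Instruction Cache linefill disable */
#define CBOR_S_WT        (1u << 5)  /* force write-through for Secure write-back regions */
#define CBOR_SECURE_ONLY 0x38u      /* bits [5:3] */
#define CBOR_COMMON      0x07u      /* bits [2:0] */

/* Value returned by a privileged read of register contents reg. */
static uint32_t cbor_read(uint32_t reg, int secure)
{
    uint32_t readable = secure ? (CBOR_SECURE_ONLY | CBOR_COMMON) : CBOR_COMMON;
    return reg & readable;          /* Secure only bits read as zero when Non-secure */
}

/* Register contents after a privileged write of value. */
static uint32_t cbor_write(uint32_t reg, uint32_t value, int secure)
{
    uint32_t writable = secure ? (CBOR_SECURE_ONLY | CBOR_COMMON) : CBOR_COMMON;
    return (reg & ~writable) | (value & writable);  /* other bits are left unchanged */
}
```

Under this model, Secure code that wants to avoid disturbing Non-secure cache contents would set CBOR_NS_WT, CBOR_NS_IL and CBOR_NS_DL together, which is one reading of the scenario described above.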
3.2.31 c10, TLB Lockdown Register
The purpose of the TLB Lockdown Register is to control where hardware page table walks place
the TLB entry in either:
•
the set associative region of the TLB
•
the lockdown region of the TLB, and if in the lockdown region, the entry to write.
Table 3-97 lists the purposes of the individual bits in the TLB Lockdown Register.
The TLB Lockdown Register is:
•
in CP15 c10
•
a 32-bit read/write register common to Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-56 shows the bit arrangement of the TLB Lockdown Register.
[31:29] SBZ | [28:26] Victim | [25:1] SBZ/UNP | [0] P
Figure 3-56 TLB Lockdown Register format
Table 3-97 lists how the bit values correspond with the TLB Lockdown Register functions.
Table 3-97 TLB Lockdown Register bit functions
Bits      Field name   Function
[31:29]   -            UNP/SBZ.
[28:26]   Victim       Specifies the entry in the lockdown region where a subsequent hardware page
                       table walk can place a TLB entry. The reset value is 0.
                       0-7, defines the Lockdown region for the TLB entry.
[25:1]    -            UNP/SBZ.
[0]       P            Determines if subsequent hardware page table walks place a TLB entry in the
                       lockdown region or in the set associative region of the TLB:
                       0 = Place the TLB entry in the set associative region of the TLB, reset value.
                       1 = Place the TLB entry in the lockdown region of the TLB as defined by the
                       Victim bits [28:26].
The TLB lockdown behavior depends on the TL bit, see c1, Non-Secure Access Control Register
on page 3-55. If the TL bit is not set, the Lockdown entries are reserved for the Secure world.
Table 3-98 lists the results of attempted access for each mode.
Table 3-98 Results of access to the TLB Lockdown Register
TL bit      Secure Privileged   Non-secure Privileged                     User
value       Read    Write       Read                 Write
0           Data    Data        Undefined exception  Undefined exception  Undefined exception
1           Data    Data        Data                 Data                 Undefined exception
The lockdown region of the TLB contains eight entries. TLB organization on page 6-4 describes
the structure of the TLB.
The Invalidate TLB unlocked entries operation does not invalidate TLB entries in the lockdown
region.
Invalidate TLB Entry by MVA and Invalidate TLB Entry on ASID Match operations invalidate
any TLB entries that correspond to the MVA or ASID given in Rd, if they are in the lockdown
region or if they are in the set-associative region of the TLB. See c8, TLB Operations Register
on page 3-86 for a description of the TLB invalidate operations.
The victim automatically increments after any page table walk that results in an entry being
written into the lockdown part of the TLB.
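A value for the TLB Lockdown Register can be assembled from the Victim and P fields of Table 3-97. The following C sketch shows the field arithmetic only. The macro and function names are hypothetical, and the resulting value would still have to be written with an MCR to CP15 c10.

```c
#include <assert.h>
#include <stdint.h>

#define TLB_LOCK_P            (1u << 0)                      /* P, bit [0] */
#define TLB_LOCK_VICTIM_SHIFT 26u                            /* Victim, bits [28:26] */
#define TLB_LOCK_VICTIM_MASK  (0x7u << TLB_LOCK_VICTIM_SHIFT)

/* Builds a register value that selects lockdown entry victim and sets or clears P. */
static uint32_t tlb_lockdown_value(unsigned victim, int preserve)
{
    assert(victim < 8u);            /* the lockdown region contains eight entries */
    return ((uint32_t)victim << TLB_LOCK_VICTIM_SHIFT) | (preserve ? TLB_LOCK_P : 0u);
}

/* Extracts the Victim field from a value read from the register. */
static unsigned tlb_lockdown_victim(uint32_t reg)
{
    return (unsigned)((reg & TLB_LOCK_VICTIM_MASK) >> TLB_LOCK_VICTIM_SHIFT);
}
```

Reading the Victim field back after a locked-down page table walk gives the next entry that would be written.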
To use the TLB Lockdown Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c10
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c10, c0, 0 ; Read TLB Lockdown Register
MCR p15, 0, <Rd>, c10, c0, 0 ; Write TLB Lockdown Register.
Example 3-2 is a code sequence that locks down an entry to the current victim.
Example 3-2 Lock down an entry to the current victim
ADR R1,LockAddr         ; set R1 to the value of the address to be locked down
MCR p15,0,R1,c8,c7,1    ; invalidate TLB single entry to ensure that
                        ; LockAddr is not already in the TLB
MRC p15,0,R0,c10,c0,0   ; read the lockdown register
ORR R0,R0,#1            ; set the preserve bit
MCR p15,0,R0,c10,c0,0   ; write to the lockdown register
LDR R1,[R1]             ; TLB misses, and entry is loaded
MRC p15,0,R0,c10,c0,0   ; read the lockdown register (victim
                        ; increments)
BIC R0,R0,#1            ; clear preserve bit
MCR p15,0,R0,c10,c0,0   ; write to the lockdown register
3.2.32 c10, Memory region remap registers
The purpose of the memory region remap registers is to remap memory region attributes
encoded by the TEX[2:0], C, and B bits in the page tables that the Data side, Instruction side,
and DMA use. For details of memory remap, see Memory region attributes on page 6-14.
The memory region remap registers are:
•
in CP15 c10
•
two 32-bit read/write registers banked for the Secure and Non-secure worlds:
— the Primary Region Remap Register
— the Normal Memory Remap Register.
•
accessible in privileged modes only.
These registers apply to all memory accesses and this includes accesses from the Data side,
Instruction side, and DMA. Table 3-99 lists the purposes of the individual bits in the Primary
Region Remap Register. Table 3-101 on page 3-103 lists the purposes of the individual bits in
the Normal Memory Remap Register.
Note
The behavior of the memory region remap registers depends on the TEX remap bit, see c1,
Control Register on page 3-44.
Figure 3-57 shows the arrangement of the bits in the Primary Region Remap Register.
[31:20] UNP/SBZ | [19:16] four 1-bit shareable attribute remap fields | [15:0] eight 2-bit primary memory type remap fields, see Table 3-99
Figure 3-57 Primary Region Remap Register format
Table 3-99 lists the functional bits of the Primary Region Remap Register.
Table 3-99 Primary Region Remap Register bit functions
Bits      Field name   Function (a)
[31:20]   -            UNP/SBZ
[19]      -            Remaps shareable attribute when S=1 for Normal regions (b). 1 = reset value
[18]      -            Remaps shareable attribute when S=0 for Normal regions (b). 0 = reset value
[17]      -            Remaps shareable attribute when S=1 for Device regions (b). 0 = reset value
[16]      -            Remaps shareable attribute when S=0 for Device regions (b). 1 = reset value
[15:14]   -            Remaps {TEX[0],C,B} = b111. b10 = reset value
[13:12]   -            Remaps {TEX[0],C,B} = b110. b00 = reset value
[11:10]   -            Remaps {TEX[0],C,B} = b101. b10 = reset value
[9:8]     -            Remaps {TEX[0],C,B} = b100. b10 = reset value
[7:6]     -            Remaps {TEX[0],C,B} = b011. b10 = reset value
[5:4]     -            Remaps {TEX[0],C,B} = b010. b10 = reset value
[3:2]     -            Remaps {TEX[0],C,B} = b001. b01 = reset value
[1:0]     -            Remaps {TEX[0],C,B} = b000. b00 = reset value
a. The reset values ensure that no remapping occurs at reset.
b. Shareable attributes can map for both shared and non-shared memory. If the Shared bit
read from the TLB or page tables is 0, then the bit remaps to the Not Shared attributes
in this register. If the Shared bit read from the TLB or page tables is 1, then the bit
remaps to the Shared attributes of this register.
Table 3-100 lists the encoding of the remapping for the primary memory type.
Table 3-100 Encoding for the remapping of the primary memory type
Encoding   Memory type
b00        Strongly ordered
b01        Device
b10        Normal
b11        UNP, normal
Figure 3-58 shows the arrangement of the bits in the Normal Memory Remap Register.
[31:16] eight 2-bit Outer attribute remap fields | [15:0] eight 2-bit Inner attribute remap fields, see Table 3-101
Figure 3-58 Normal Memory Remap Register format
Table 3-101 lists how the bit values correspond with the Normal Memory Remap Register
functions.
Table 3-101 Normal Memory Remap Register bit functions
Bits      Field name   Function (a)
[31:30]   -            Remaps Outer attribute for {TEX[0],C,B} = b111. b01 = reset value
[29:28]   -            Remaps Outer attribute for {TEX[0],C,B} = b110. b00 = reset value
[27:26]   -            Remaps Outer attribute for {TEX[0],C,B} = b101. b01 = reset value
[25:24]   -            Remaps Outer attribute for {TEX[0],C,B} = b100. b00 = reset value
[23:22]   -            Remaps Outer attribute for {TEX[0],C,B} = b011. b11 = reset value
[21:20]   -            Remaps Outer attribute for {TEX[0],C,B} = b010. b10 = reset value
[19:18]   -            Remaps Outer attribute for {TEX[0],C,B} = b001. b00 = reset value
[17:16]   -            Remaps Outer attribute for {TEX[0],C,B} = b000. b00 = reset value
[15:14]   -            Remaps Inner attribute for {TEX[0],C,B} = b111. b01 = reset value
[13:12]   -            Remaps Inner attribute for {TEX[0],C,B} = b110. b00 = reset value
[11:10]   -            Remaps Inner attribute for {TEX[0],C,B} = b101. b10 = reset value
[9:8]     -            Remaps Inner attribute for {TEX[0],C,B} = b100. b00 = reset value
[7:6]     -            Remaps Inner attribute for {TEX[0],C,B} = b011. b11 = reset value
[5:4]     -            Remaps Inner attribute for {TEX[0],C,B} = b010. b10 = reset value
[3:2]     -            Remaps Inner attribute for {TEX[0],C,B} = b001. b00 = reset value
[1:0]     -            Remaps Inner attribute for {TEX[0],C,B} = b000. b00 = reset value
a. The reset values ensure that no remapping occurs at reset.
Table 3-102 lists the encoding for the Inner or Outer cacheable attribute bit fields I0 to I7 and
O0 to O7.
Table 3-102 Remap encoding for Inner or Outer cacheable attributes
Encoding   Cacheable attribute
b00        Noncacheable
b01        Write-back, allocate on write
b10        Write-through, no allocate on write
b11        Write-back, no allocate on write
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-103 lists the results of attempted access for each mode.
Table 3-103 Results of access to the memory region remap registers
Secure Privileged           Non-secure Privileged               User
Read          Write         Read              Write
Secure data   Secure data   Non-secure data   Non-secure data   Undefined exception
To use the memory region remap registers read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c10
•
CRm set to c2
•
Opcode_2 set to:
— 0, Primary Region Remap Register
— 1, Normal Memory Remap Register.
For example:
MRC p15, 0, <Rd>, c10, c2, 0 ; Read Primary Region Remap Register
MCR p15, 0, <Rd>, c10, c2, 0 ; Write Primary Region Remap Register
MRC p15, 0, <Rd>, c10, c2, 1 ; Read Normal Memory Remap Register
MCR p15, 0, <Rd>, c10, c2, 1 ; Write Normal Memory Remap Register
Memory remap occurs in two stages:
1.
The processor uses the Primary Region Remap Register to remap the primary memory
type, Normal, Device, or Strongly Ordered, and the shareable attribute.
2.
For memory regions that the Primary Region Remap Register defines as Normal memory,
the processor uses the Normal Memory Remap Register to remap the inner and outer
cacheable attributes.
The behavior of the memory region remap registers depends on the TEX remap bit, see c1,
Control Register on page 3-44. If the TEX remap bit is set, the entries in the memory region
remap registers remap each possible value of the TEX[0], C and B bits in the page tables. You
can therefore set your own definitions for these values. If the TEX remap bit is clear, the memory
region remap registers are not used and no memory remapping takes place. For more
information see Memory region attributes on page 6-14.
The memory region remap registers are expected to remain static during normal operation.
When you write to the memory region remap registers, you must invalidate the TLB and perform
an IMB operation before you can rely on the new written values. You must also stop the DMA
if it is running or queued.
Note
You cannot remap the NS bit. This is for security reasons.
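The two-stage lookup can be illustrated by extracting the 2-bit remap fields for a given {TEX[0],C,B} value. The following C sketch is illustrative only. The function names are hypothetical, and the two reset constants are assembled here from the per-field reset values listed in Table 3-99 and Table 3-101 rather than quoted from elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Reset values assembled from the per-field reset values in Tables 3-99 and 3-101. */
#define PRRR_RESET 0x00098AA4u
#define NMRR_RESET 0x44E048E0u

/* idx is the 3-bit value {TEX[0],C,B}, in the range 0 to 7. */

/* Stage 1: primary memory type, encoded as in Table 3-100. */
static unsigned prrr_mem_type(uint32_t prrr, unsigned idx)
{
    assert(idx < 8u);
    return (prrr >> (2u * idx)) & 0x3u;
}

/* Stage 2: Inner cacheable attribute, encoded as in Table 3-102. */
static unsigned nmrr_inner(uint32_t nmrr, unsigned idx)
{
    assert(idx < 8u);
    return (nmrr >> (2u * idx)) & 0x3u;
}

/* Stage 2: Outer cacheable attribute, encoded as in Table 3-102. */
static unsigned nmrr_outer(uint32_t nmrr, unsigned idx)
{
    assert(idx < 8u);
    return (nmrr >> (16u + 2u * idx)) & 0x3u;
}
```

The Inner and Outer results are only meaningful for entries that stage 1 maps to Normal memory.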
3.2.33 c11, DMA identification and status registers
The purpose of the DMA identification and status registers is to define:
•
the DMA channels that are physically implemented on the particular device
•
the current status of the DMA channels.
Processes that handle DMA can read these registers to determine the physical resources
implemented and their availability.
The DMA identification and status registers are:
•
in CP15 c11
•
four 32-bit read-only registers common to Secure and Non-secure worlds
•
accessible only in privileged modes.
Figure 3-59 shows the format of DMA identification and status registers 0-3.
[31:2] UNP | [1] CH1 | [0] CH0
Figure 3-59 DMA identification and status registers format
Table 3-104 lists how the bit values correspond with the DMA identification and status registers.
Table 3-104 DMA identification and status register bit functions
Bits     Field name   Function
[31:2]   -            UNP/SBZ
[1]      CH1          Provides information on DMA Channel 1 functions:
                      0 = DMA Channel 1 function (a) disabled
                      1 = DMA Channel 1 function (a) enabled.
[0]      CH0          Provides information on DMA Channel 0 functions:
                      0 = DMA Channel 0 function (a) disabled
                      1 = DMA Channel 0 function (a) enabled.
a. See Table 3-105 for the function of the channel that Opcode_2 of the MRC instruction determines.
Table 3-105 lists the Opcode_2 values used to select the DMA channel function.
Table 3-105 DMA Identification and Status Register functions
Opcode_2   Function
0          Indicates channel present:
           0 = the channel is not Present
           1 = the channel is Present.
1          Indicates channel queued:
           0 = the channel is not Queued
           1 = the channel is Queued.
2          Indicates channel running:
           0 = the channel is not Running
           1 = the channel is Running.
3          Indicates channel interrupting:
           0 = the channel is not Interrupting
           1 = the channel is Interrupting, through completion or an error.
4-7        Reserved. Results in an Undefined exception.
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can only access these registers in Privileged modes.
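Values read from the DMA identification and status registers carry one bit per channel, so testing a channel reduces to a shift and mask. The following C sketch is illustrative only. The names are hypothetical, and the notion of an idle channel, meaning neither Queued nor Running, is an assumption introduced for the example rather than a term this manual defines.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode_2 values that select which status a read reports (Table 3-105). */
enum dma_status_kind {
    DMA_PRESENT      = 0,
    DMA_QUEUED       = 1,
    DMA_RUNNING      = 2,
    DMA_INTERRUPTING = 3
};

/* Returns 1 if the status bit for channel 0 or 1 is set in reg. */
static int dma_channel_flag(uint32_t reg, unsigned channel)
{
    assert(channel < 2u);           /* only CH0 and CH1 are defined */
    return (int)((reg >> channel) & 1u);
}

/* Hypothetical helper: a channel is treated as idle when it is neither
 * Queued nor Running, given the values read with Opcode_2 = 1 and 2. */
static int dma_channel_idle(uint32_t queued_reg, uint32_t running_reg, unsigned channel)
{
    return !dma_channel_flag(queued_reg, channel) &&
           !dma_channel_flag(running_reg, channel);
}
```

The two register values must be read close together for the combined result to be meaningful, because channel state can change between the reads.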
Table 3-106 lists the results of attempted access for each mode.
Table 3-106 Results of access to the DMA identification and status registers
DMA bit   Secure Privileged            Non-secure Privileged                     User
          Read   Write                 Read                 Write
0         Data   Undefined exception   Undefined exception  Undefined exception  Undefined exception
1         Data   Undefined exception   Data                 Undefined exception  Undefined exception
To access the DMA identification and status registers in a privileged mode read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c0
•
Opcode_2 set to:
— 0, Present
— 1, Queued
— 2, Running
— 3, Interrupting.
For example:
MRC p15, 0, <Rd>, c11, c0, 0 ; Read DMA Identification and Status Register present
MRC p15, 0, <Rd>, c11, c0, 1 ; Read DMA Identification and Status Register queued
MRC p15, 0, <Rd>, c11, c0, 2 ; Read DMA Identification and Status Register running
MRC p15, 0, <Rd>, c11, c0, 3 ; Read DMA Identification and Status Register interrupting.
3.2.34 c11, DMA User Accessibility Register
The purpose of the DMA User Accessibility Register is to determine if a User mode process can
access the registers for each channel.
The DMA User Accessibility Register is:
•
in CP15 c11
•
a 32-bit read/write register common to the Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-60 shows the bit arrangement for the DMA User Accessibility Register.
[31:2] SBZ/UNP | [1] U1 | [0] U0
Figure 3-60 DMA User Accessibility Register format
Table 3-107 lists how the bit values correspond with the DMA User Accessibility Register.
Table 3-107 DMA User Accessibility Register bit functions
Bits     Field name   Function
[31:2]   -            UNP/SBZ.
[1]      U1           Indicates if a User mode process can access the registers for channel 1:
                      0 = User mode cannot access channel 1. User mode accesses cause an Undefined
                      exception. This is the reset value.
                      1 = User mode can access channel 1.
[0]      U0           Indicates if a User mode process can access the registers for channel 0:
                      0 = User mode cannot access channel 0. User mode accesses cause an Undefined
                      exception. This is the reset value.
                      1 = User mode can access channel 0.
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can only access this register in Privileged modes.
Table 3-108 lists the results of attempted access for each mode.
Table 3-108 Results of access to the DMA User Accessibility Register
DMA bit     Secure Privileged   Non-secure Privileged                     User
            Read    Write       Read                 Write
0           Data    Data        Undefined exception  Undefined exception  Undefined exception
1           Data    Data        Data                 Data                 Undefined exception
To access the DMA User Accessibility Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c1
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c11, c1, 0 ; Read DMA User Accessibility Register
MCR p15, 0, <Rd>, c11, c1, 0 ; Write DMA User Accessibility Register
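As an illustration only, the following C sketch builds a value for the DMA User Accessibility Register and tests a U bit. The function and macro names are hypothetical and are not part of any ARM-supplied library. The sketch only computes the 32-bit value, and privileged code would still have to transfer it with the MCR instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from Table 3-107. Names are illustrative. */
#define DMA_UAR_U0 (1u << 0) /* User mode access to channel 0 */
#define DMA_UAR_U1 (1u << 1) /* User mode access to channel 1 */

/* Build a value for MCR p15, 0, <Rd>, c11, c1, 0.
 * Bits [31:2] are SBZ, so they are left as zero. */
uint32_t dma_uar_encode(bool user_ch0, bool user_ch1)
{
    uint32_t value = 0;
    if (user_ch0) {
        value |= DMA_UAR_U0;
    }
    if (user_ch1) {
        value |= DMA_UAR_U1;
    }
    return value;
}

/* Report whether User mode can access the registers of one channel. */
bool dma_uar_user_can_access(uint32_t uar, unsigned channel)
{
    assert(channel <= 1u); /* only channels 0 and 1 are described */
    return ((uar >> channel) & 1u) != 0;
}
```

For example, dma_uar_encode(true, false) yields 0x1, which opens only channel 0 to User mode.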
The registers that you can access in User mode when the U bit = 1 for the current channel are:
•
c11, DMA enable registers on page 3-110
•
c11, DMA Control Register on page 3-111
•
c11, DMA Internal Start Address Register on page 3-114
•
c11, DMA External Start Address Register on page 3-115
•
c11, DMA Internal End Address Register on page 3-116
•
c11, DMA Channel Status Register on page 3-117.
You can access the DMA channel Number Register, see c11, DMA Channel Number Register
on page 3-109, in User mode when the U bit for any channel is 1.
The contents of these registers must be preserved on a task switch if the registers are
User-accessible.
If the U bit for the currently selected channel is set to 0, and a User process attempts to access
any of these registers the processor takes an Undefined instruction trap.
3.2.35 c11, DMA Channel Number Register
The purpose of the DMA Channel Number Register is to select a DMA channel.
Table 3-109 lists the purposes of the individual bits in the DMA Channel Number Register.
The DMA Channel Number Register is:
•
in CP15 c11
•
a 32-bit read/write register common to Secure and Non-secure worlds
•
accessible in user and privileged modes.
Figure 3-61 shows the bit arrangement for the DMA Channel Number Register.
Bits [31:1] SBZ/UNP, bit [0] CN
Figure 3-61 DMA Channel Number Register format
Table 3-109 lists how the bit values correspond with the DMA Channel Number Register.
Table 3-109 DMA Channel Number Register bit functions
Bits      Field name   Function
[31:1]    -            UNP/SBZ.
[0]       CN           Indicates DMA Channel selected:
                       0 = DMA Channel 0 selected, reset value
                       1 = DMA Channel 1 selected.
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can access this register in User mode if the U bit, see c11,
DMA User Accessibility Register on page 3-107, for any channel is set to 1. Table 3-110 lists the
results of attempted access for each mode.
Table 3-110 Results of access to the DMA Channel Number Register
U1 and U0 bits | DMA bit | Secure Privileged Read or Write | Non-secure Privileged Read or Write | Secure User Read or Write | Non-secure User Read or Write
Both 0 | 0 | Data | Undefined exception | Undefined exception | Undefined exception
Both 0 | 1 | Data | Data | Undefined exception | Undefined exception
Either or both 1 | 0 | Data | Undefined exception | Data | Undefined exception
Either or both 1 | 1 | Data | Data | Data | Data
To access the DMA Channel Number Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c2
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c11, c2, 0 ; Read DMA Channel Number Register
MCR p15, 0, <Rd>, c11, c2, 0 ; Write DMA Channel Number Register
3.2.36 c11, DMA enable registers
The purpose of the DMA enable registers is to start, stop or clear DMA transfers for each
channel implemented.
The DMA enable registers are:
•
in CP15 c11
•
three 32-bit write only registers for each DMA channel common to Secure and
Non-secure worlds
•
accessible in user and privileged modes.
The commands that operate through the registers are:
Stop
The DMA channel ceases to do memory accesses as soon as possible after the
level one DMA issues the instruction. This acceleration approach cannot be used
for DMA transactions to or from memory regions marked as Device. The DMA
can issue a Stop command when the channel status is Running. The DMA channel
can take several cycles to stop after the DMA issues a Stop instruction. The
channel status remains at Running until the DMA channel stops. The channel
status is set to Complete or Error at the point that all outstanding memory accesses
complete. The Start Address Registers contain the addresses the DMA requires to
restart the operation when the channel stops.
If the Stop command occurs when the channel status is Queued, the channel status
changes to Idle. The Stop command has no effect if the channel status is not
Running or Queued.
c11, DMA Channel Status Register on page 3-117 describes the DMA channel
status.
Start
The Start command causes the channel to start DMA transfers. If the other DMA
channel is not in operation the channel status is set to Running on the execution
of a Start command. If the other DMA channel is in operation the channel status
is set to Queued.
A channel is in operation if either:
•
its channel status is Queued
•
its channel status is Running
•
its channel status is Complete or Error, with either the Internal or External
Address Error Status indicating an Error.
c11, DMA Channel Status Register on page 3-117 describes DMA channel status.
Clear
The Clear command causes the channel status to change from Complete or Error
to Idle. It also clears:
•
all the Error bits for that DMA channel
•
the interrupt that is set by the DMA channel as a result of an error or
completion, see c11, DMA Control Register on page 3-111 for more details.
The Clear command does not change the contents of the Internal and External
Start Address Registers. A Clear command has no effect when the channel status
is Running or Queued.
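The effect of the three commands on the channel status can be summarized as a small state model. The C sketch below is illustrative only: the type and function names are hypothetical, it assumes Start is issued to a channel that is not already Running or Queued, and it models Stop by returning the final status, ignoring the cycles during which the hardware still reports Running.

```c
#include <assert.h>
#include <stdbool.h>

/* Status encodings, as reported in bits [1:0] of the DMA Channel Status Register. */
typedef enum {
    DMA_IDLE = 0,
    DMA_QUEUED = 1,
    DMA_RUNNING = 2,
    DMA_COMPLETE_OR_ERROR = 3
} dma_status_t;

/* A channel is in operation if it is Queued, Running, or finished with an error. */
bool dma_in_operation(dma_status_t status, bool has_error)
{
    return status == DMA_QUEUED || status == DMA_RUNNING ||
           (status == DMA_COMPLETE_OR_ERROR && has_error);
}

/* Start: Running if the other channel is not in operation, otherwise Queued. */
dma_status_t dma_after_start(bool other_in_operation)
{
    return other_in_operation ? DMA_QUEUED : DMA_RUNNING;
}

/* Stop: Queued becomes Idle, Running ends as Complete or Error, else no effect. */
dma_status_t dma_after_stop(dma_status_t status)
{
    assert((unsigned)status <= 3u); /* must be a valid 2-bit encoding */
    if (status == DMA_QUEUED) {
        return DMA_IDLE;
    }
    if (status == DMA_RUNNING) {
        return DMA_COMPLETE_OR_ERROR;
    }
    return status;
}

/* Clear: Complete or Error becomes Idle, no effect in any other status. */
dma_status_t dma_after_clear(dma_status_t status)
{
    assert((unsigned)status <= 3u);
    return status == DMA_COMPLETE_OR_ERROR ? DMA_IDLE : status;
}
```

Keeping the transitions as pure functions makes it straightforward to check a driver's expectations without access to the hardware.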
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can access these registers in User mode if the U bit, see
c11, DMA User Accessibility Register on page 3-107, for the currently selected channel is set to
1. Table 3-111 lists the results of attempted access for each mode.
Table 3-111 Results of access to the DMA enable registers
U bit | DMA bit | Secure Privileged Read | Secure Privileged Write | Non-secure Privileged Read | Non-secure Privileged Write | Secure User Read | Secure User Write | Non-secure User Read | Non-secure User Write
0 | 0 | Undefined exception | Data | Undefined exception | Undefined exception | Undefined exception | Undefined exception | Undefined exception | Undefined exception
0 | 1 | Undefined exception | Data | Undefined exception | Data | Undefined exception | Undefined exception | Undefined exception | Undefined exception
1 | 0 | Undefined exception | Data | Undefined exception | Undefined exception | Undefined exception | Data | Undefined exception | Undefined exception
1 | 1 | Undefined exception | Data | Undefined exception | Data | Undefined exception | Data | Undefined exception | Data
To access a DMA Enable Register set the DMA Channel Number Register to the appropriate
DMA channel and write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c3
•
Opcode_2 set to:
— 0, Stop
— 1, Start
— 2, Clear.
For example:
MCR p15, 0, <Rd>, c11, c3, 0 ; Stop DMA Enable Register
MCR p15, 0, <Rd>, c11, c3, 1 ; Start DMA Enable Register
MCR p15, 0, <Rd>, c11, c3, 2 ; Clear DMA Enable Register
Debug implications for the DMA
The level one DMA behaves as a separate engine from the processor core, and when started,
works autonomously. When the level one DMA has channels with the status of Running or
Queued, these channels continue to run, or start running, even if a debug mechanism stops the
processor. This can cause the contents of the TCM to change while the processor stops in debug.
To avoid this situation you must ensure the level one DMA issues a Stop command to stop
Running or Queued channels when entering debug.
3.2.37 c11, DMA Control Register
The purpose of the DMA Control Register for each channel is to control the operations of that
DMA channel. Table 3-112 on page 3-112 lists the purposes of the individual bits in the DMA
Control Register.
The DMA Control Register is:
•
in CP15 c11
•
one 32-bit read/write register for each DMA channel common to Secure and Non-secure
worlds
•
accessible in user and privileged modes.
Figure 3-62 shows the bit arrangement for the DMA Control Register.
Bit [31] TR, bit [30] DT, bit [29] IC, bit [28] IE, bit [27] FT, bit [26] UM, bits [25:20] UNP/SBZ,
bits [19:8] ST, bits [7:2] UNP/SBZ, bits [1:0] TS
Figure 3-62 DMA Control Register format
Table 3-112 lists how the bit values correspond with the DMA Control Register.
Table 3-112 DMA Control Register bit functions
Bits      Field name   Function
[31]      TR           Indicates target TCM:
                       0 = Data TCM, reset value
                       1 = Instruction TCM.
[30]      DT           Indicates direction of transfer:
                       0 = Transfer from level two memory to the TCM, reset value
                       1 = Transfer from the TCM to the level two memory.
[29]      IC           Indicates whether the DMA channel must assert an interrupt on completion of the
                       DMA transfer, or if the DMA is stopped by a Stop command, see c11, DMA enable
                       registers on page 3-110.
                       The interrupt is deasserted, from this source, if the processor performs a Clear
                       operation on the channel that caused the interrupt. For more details see c11, DMA
                       enable registers on page 3-110.
                       The U bit(a) has no effect on whether an interrupt is generated on completion:
                       0 = No Interrupt on Completion, reset value
                       1 = Interrupt on Completion.
[28]      IE           Indicates that the DMA channel must assert an interrupt on an error.
                       The interrupt is deasserted, from this source, when the channel is set to Idle with
                       a Clear operation, see c11, DMA enable registers on page 3-110:
                       0 = No Interrupt on Error, if the U bit is 0, reset value
                       1 = Interrupt on Error, regardless of the U bit(a). All DMA transactions on channels
                       that have the U bit set to 1 Interrupt on Error regardless of the value written to
                       this bit.
[27]      FT           Read As One, Write ignored.
                       In the ARM1176JZ-S this bit has no effect.
[26]      UM           Indicates that the permission checks are based on the DMA being in User or
                       privileged mode.
                       The UM bit is provided so that the User mode can be emulated by a privileged mode
                       process. For a User mode process the setting of the UM bit is irrelevant and behaves
                       as if set to 1:
                       0 = Transfer is a privileged transfer, reset value
                       1 = Transfer is a User mode transfer.
[25:20]   -            UNP/SBZ.
[19:8]    ST           Indicates the increment on the external address between each consecutive access of
                       the DMA. A Stride of zero, reset value, indicates that the external address is not
                       to be incremented. This is designed to facilitate the accessing of volatile
                       locations such as a FIFO.
                       The Stride is interpreted as a positive number, or zero.
                       The internal address increment is not affected by the Stride, but is fixed at the
                       transaction size.
                       The stride value is in bytes.
                       The value of the Stride must be aligned to the Transaction Size, otherwise this
                       results in a bad parameter error, see c11, DMA Channel Status Register on page 3-117.
[7:2]     -            UNP/SBZ.
[1:0]     TS           Indicates the size of the transactions that the DMA channel performs. This is
                       particularly important for Device or Strongly Ordered memory locations because it
                       ensures that accesses to such memory occur at their programmed size:
                       b00 = Byte, reset value
                       b01 = Halfword
                       b10 = Word
                       b11 = Doubleword, 8 bytes.
a. See c11, DMA User Accessibility Register on page 3-107.
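To make the field positions concrete, the following C sketch packs and unpacks a DMA Control Register value. It is a minimal illustration and the function names are hypothetical. The FT bit is left as zero because writes to it are ignored, and the sketch does not perform the MCR or MRC access itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack the writable fields of the DMA Control Register.
 * tr: TR [31], dt: DT [30], ic: IC [29], ie: IE [28], um: UM [26],
 * stride: ST [19:8] in bytes, ts: TS [1:0] (0 = byte ... 3 = doubleword). */
uint32_t dma_ctrl_encode(bool tr, bool dt, bool ic, bool ie, bool um,
                         uint32_t stride, unsigned ts)
{
    assert(stride <= 0xFFFu); /* ST is a 12-bit field */
    assert(ts <= 3u);         /* TS is a 2-bit field */
    return ((uint32_t)tr << 31) | ((uint32_t)dt << 30) |
           ((uint32_t)ic << 29) | ((uint32_t)ie << 28) |
           ((uint32_t)um << 26) | (stride << 8) | (uint32_t)ts;
}

/* Extract the Stride, in bytes. */
uint32_t dma_ctrl_stride(uint32_t reg)
{
    return (reg >> 8) & 0xFFFu;
}

/* Transaction size in bytes: 1, 2, 4, or 8. */
unsigned dma_ctrl_ts_bytes(uint32_t reg)
{
    return 1u << (reg & 0x3u);
}

/* The Stride must be a multiple of the transaction size, otherwise the
 * channel reports a bad parameter error. A Stride of zero always passes. */
bool dma_ctrl_stride_ok(uint32_t reg)
{
    return dma_ctrl_stride(reg) % dma_ctrl_ts_bytes(reg) == 0;
}
```

For example, a word-sized transfer into the Instruction TCM with Interrupt on Completion and a 4-byte Stride encodes as 0xA0000402.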
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can access this register in User mode if the U bit, see c11,
DMA User Accessibility Register on page 3-107, for the currently selected channel is set to 1.
Table 3-113 lists the results of attempted access for each mode.
Table 3-113 Results of access to the DMA Control Register
U bit | DMA bit | Secure Privileged Read or Write | Non-secure Privileged Read or Write | Secure User Read or Write | Non-secure User Read or Write
0 | 0 | Data | Undefined exception | Undefined exception | Undefined exception
0 | 1 | Data | Data | Undefined exception | Undefined exception
1 | 0 | Data | Undefined exception | Data | Undefined exception
1 | 1 | Data | Data | Data | Data
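The access results for the per-channel read/write registers follow one rule: a Non-secure access needs the DMA bit set, and a User mode access needs the U bit for the currently selected channel set. The C sketch below expresses that rule as a predicate. It is an interpretation of the table rather than a description of the hardware logic, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if the access returns or accepts data, false if it would
 * cause an Undefined exception.
 * secure:     the access is made from the Secure world
 * privileged: the access is made from a privileged mode
 * dma_bit:    DMA bit of the Non-Secure Access Control Register
 * uar:        value of the DMA User Accessibility Register
 * channel:    channel selected in the DMA Channel Number Register */
bool dma_channel_reg_access_ok(bool secure, bool privileged, bool dma_bit,
                               uint32_t uar, unsigned channel)
{
    assert(channel <= 1u);
    bool u_bit = ((uar >> channel) & 1u) != 0;

    if (!secure && !dma_bit) {
        return false; /* Non-secure world has no access to the DMA */
    }
    if (!privileged && !u_bit) {
        return false; /* channel is not User-accessible */
    }
    return true;
}
```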
To access the DMA Control Register set the DMA Channel Number Register to the appropriate
DMA channel and read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c4
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c11, c4, 0 ; Read DMA Control Register
MCR p15, 0, <Rd>, c11, c4, 0 ; Write DMA Control Register
While the channel has the status of Running or Queued, any attempt to write to the DMA
Control Register results in architecturally Unpredictable behavior. For ARM1176JZ-S
processors writes to the DMA Control Register have no effect when the DMA channel is
running or queued.
3.2.38 c11, DMA Internal Start Address Register
The purpose of the DMA Internal Start Address Register for each channel is to define the first
address in the TCM for that channel. That is, it defines the first address that data transfers go to
or from.
The DMA Internal Start Address Register is:
•
in CP15 c11
•
one 32-bit read/write register for each DMA channel common to Secure and Non-secure
worlds
•
accessible in user and privileged modes.
The DMA Internal Start Address Register bits [31:0] contain the Internal Start VA.
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can access this register in User mode if the U bit, see c11,
DMA User Accessibility Register on page 3-107, for the currently selected channel is set to 1.
Table 3-114 lists the results of attempted access for each mode.
Table 3-114 Results of access to the DMA Internal Start Address Register
U bit | DMA bit | Secure Privileged Read or Write | Non-secure Privileged Read or Write | Secure User Read or Write | Non-secure User Read or Write
0 | 0 | Data | Undefined exception | Undefined exception | Undefined exception
0 | 1 | Data | Data | Undefined exception | Undefined exception
1 | 0 | Data | Undefined exception | Data | Undefined exception
1 | 1 | Data | Data | Data | Data
To access the DMA Internal Start Address Register set the DMA Channel Number Register to
the appropriate DMA channel and read or write CP15 c11 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c5
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c11, c5, 0 ; Read DMA Internal Start Address Register
MCR p15, 0, <Rd>, c11, c5, 0 ; Write DMA Internal Start Address Register
The Internal Start Address is a VA. Page tables describe the physical mapping of the VA when
the channel starts.
The memory attributes for that VA are used in the transfer, so memory permission faults might
be generated. The Internal Start Address must lie within a TCM, otherwise an error is reported
in the DMA Channel Status Register. The marking of memory locations in the TCM as being
Device results in Unpredictable effects. The global system behavior, but not the security, can be
affected.
The contents of this register do not change while the DMA channel is Running. When the
channel is stopped because of a Stop command, or an error, it contains the address required to
restart the transaction. On completion, it contains the address equal to the Internal End Address.
The Internal Start Address must be aligned to the transaction size set in the DMA Control
Register or the processor generates a bad parameter error.
3.2.39 c11, DMA External Start Address Register
The purpose of the DMA External Start Address Register for each channel is to define the first
address in external memory for that DMA channel. That is, it defines the first address that data
transfers go to or from.
The DMA External Start Address Register is:
•
in CP15 c11
•
one 32-bit read/write register for each DMA channel common to Secure and Non-secure
worlds
•
accessible in user and privileged modes.
The DMA External Start Address Register bits [31:0] contain the External Start VA.
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can access this register in User mode if the U bit, see c11,
DMA User Accessibility Register on page 3-107, for the currently selected channel is set to 1.
Table 3-115 lists the results of attempted access for each mode.
Table 3-115 Results of access to the DMA External Start Address Register
U bit | DMA bit | Secure Privileged Read or Write | Non-secure Privileged Read or Write | Secure User Read or Write | Non-secure User Read or Write
0 | 0 | Data | Undefined exception | Undefined exception | Undefined exception
0 | 1 | Data | Data | Undefined exception | Undefined exception
1 | 0 | Data | Undefined exception | Data | Undefined exception
1 | 1 | Data | Data | Data | Data
To access the DMA External Start Address Register set the DMA Channel Number Register to
the appropriate DMA channel and read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c6
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c11, c6, 0 ; Read DMA External Start Address Register
MCR p15, 0, <Rd>, c11, c6, 0 ; Write DMA External Start Address Register
The External Start Address is a VA. You must describe its physical mapping in the page
tables at the time that the channel is started. The memory attributes for that VA are used in the
transfer, so memory permission faults might be generated.
The External Start Address must lie in the external memory outside the level one memory
system otherwise the results are Unpredictable. The global system behavior, but not the security,
can be affected.
The contents of this register do not change while the DMA channel is Running. When the channel
stops because of a Stop command, or an error, it contains the address that the DMA requires to
restart the transaction. On completion, it contains the address equal to the final address of the
transfer accessed plus the Stride.
If the External Start Address does not align with the transaction size that is set in the Control
Register, the processor generates a bad parameter error.
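The way the external address advances can be illustrated with a short C sketch. It assumes that transaction i, counted from zero, accesses the External Start Address plus i times the Stride, which is consistent with the Stride being the increment between consecutive accesses. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* External address accessed by transaction number index, counted from zero.
 * A Stride of zero keeps every access at ext_start, as for a FIFO. */
uint32_t dma_external_addr(uint32_t ext_start, uint32_t stride, uint32_t index)
{
    assert(stride <= 0xFFFu); /* Stride is a 12-bit field in the DMA Control Register */
    return ext_start + index * stride;
}

/* Register contents on completion of count transactions: the final address
 * accessed plus the Stride, which equals ext_start + count * stride. */
uint32_t dma_external_after_completion(uint32_t ext_start, uint32_t stride,
                                       uint32_t count)
{
    return dma_external_addr(ext_start, stride, count);
}
```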
3.2.40 c11, DMA Internal End Address Register
The purpose of the DMA Internal End Address Register for each channel is to define the final
internal address for that channel. That is, the end address of the data transfer.
The DMA Internal End Address Register is:
•
in CP15 c11
•
one 32-bit read/write register for each DMA channel common to Secure and Non-secure
worlds
•
accessible in user and privileged modes.
The DMA Internal End Address Register bits [31:0] contain the Internal End VA.
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. The processor can access this register in User mode if the U bit, see c11,
DMA User Accessibility Register on page 3-107, for the currently selected channel is set to 1.
Table 3-116 lists the results of attempted access for each mode.
Table 3-116 Results of access to the DMA Internal End Address Register
U bit | DMA bit | Secure Privileged Read or Write | Non-secure Privileged Read or Write | Secure User Read or Write | Non-secure User Read or Write
0 | 0 | Data | Undefined exception | Undefined exception | Undefined exception
0 | 1 | Data | Data | Undefined exception | Undefined exception
1 | 0 | Data | Undefined exception | Data | Undefined exception
1 | 1 | Data | Data | Data | Data
To access the DMA Internal End Address Register set the DMA Channel Number Register to
the appropriate DMA channel and read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c7
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c11, c7, 0 ; Read DMA Internal End Address Register
MCR p15, 0, <Rd>, c11, c7, 0 ; Write DMA Internal End Address Register
The Internal End Address is the final internal address, modulo the transaction size, that the
DMA is to access plus the transaction size. Therefore, the Internal End Address is the first,
incremented, address that the DMA does not access.
If the Internal End Address is the same as the Internal Start Address, the DMA transfer
completes immediately without performing transactions.
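Because the Internal End Address is the first address that the DMA does not access, it behaves like an exclusive upper bound. The C sketch below shows the arithmetic under the assumption that both addresses are aligned to the transaction size and that the end address is not below the start address. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Internal End Address for count transactions of size (1 << ts) bytes,
 * where ts is the TS field value: 0 = byte, 1 = halfword, 2 = word,
 * 3 = doubleword. */
uint32_t dma_internal_end(uint32_t start, uint32_t count, unsigned ts)
{
    assert(ts <= 3u);
    return start + (count << ts);
}

/* Number of transactions between start and the exclusive end address.
 * Equal addresses give zero, so the transfer completes immediately. */
uint32_t dma_transaction_count(uint32_t start, uint32_t end, unsigned ts)
{
    assert(ts <= 3u);
    assert(end >= start);
    return (end - start) >> ts;
}

/* Check the alignment that avoids a bad parameter error. */
bool dma_addr_aligned(uint32_t addr, unsigned ts)
{
    assert(ts <= 3u);
    return (addr & ((1u << ts) - 1u)) == 0;
}
```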
When the transaction associated with the final internal address has completed, the whole DMA
transfer is complete.
The Internal End Address is a VA. Page tables describe the physical mapping of the VA when
the channel starts.
The memory attributes for that VA are used in the transfer, so memory permission faults might
be generated. The Internal End Address must lie within a TCM, otherwise an error is reported
in the DMA Channel Status Register. The marking of memory locations in the TCM as being
Device results in Unpredictable effects. The global system behavior, but not the security, can be
affected.
The Internal End Address must be aligned to the transaction size set in the DMA Control
Register or the processor generates a bad parameter error.
3.2.41 c11, DMA Channel Status Register
The purpose of the DMA Channel Status Register for each channel is to define the status of the
most recently started DMA operation on that channel.
The DMA Channel Status Register is:
•
in CP15 c11
•
one 32-bit read-only register for each DMA channel common to Secure and Non-secure
worlds
•
accessible in user and privileged modes.
Figure 3-63 shows the bit arrangement for the DMA Channel Status Register.
Bits [31:17] SBZ/UNP, bit [16] ESX[0], bits [15:14] SBZ/UNP, bit [13] ISX[0], bit [12] BP,
bits [11:7] ES, bits [6:2] IS, bits [1:0] Status
Figure 3-63 DMA Channel Status Register format
Table 3-117 lists the functions of the bits in the DMA Channel Status Register.
Table 3-117 DMA Channel Status Register bit functions
Bits      Field name   Function
[31:17]   -            UNP/SBZ.
[16]      ESX[0]       The ESX[0] bit adds a SLVERR or DECERR qualifier to the ES encoding. Only
                       predictable on ES encodings of b11010, b11100, and b11110, otherwise UNP/SBZ.
                       For the predictable encodings:
                       0 = DECERR
                       1 = SLVERR.
[15:14]   -            UNP/SBZ.
[13]      ISX[0]       The ISX[0] bit adds a SLVERR or DECERR qualifier to the IS encoding. Only
                       predictable on IS encodings of b11100 and b11110, otherwise UNP/SBZ.
                       For the predictable encodings:
                       0 = DECERR
                       1 = SLVERR.
[12]      BP(a)        Indicates whether the DMA parameters are conditioned inappropriately or acceptable:
                       0 = DMA parameters are acceptable, reset value
                       1 = DMA parameters are conditioned inappropriately.
[11:7]    ES           Indicates the status of the External Address Error. All other encodings are Reserved:
                       b00000 = No error, reset value
                       b00xxx = No error
                       b01001 = Unshared data error
                       b11010 = External Abort, can be imprecise
                       b11100 = External Abort on translation of first-level page table
                       b11110 = External Abort on translation of second-level page table
                       b10011 = Access Bit fault on section
                       b10110 = Access Bit fault on page
                       b10101 = Translation fault, section
                       b10111 = Translation fault, page
                       b11001 = Domain fault, section
                       b11011 = Domain fault, page
                       b11101 = Permission fault, section
                       b11111 = Permission fault, page.
[6:2]     IS           Indicates the status of the Internal Address Error. All other encodings are Reserved:
                       b00000 = No error, reset value
                       b00xxx = No error
                       b01000 = TCM out of range
                       b11100 = External Abort on translation of first-level page table
                       b11110 = External Abort on translation of second-level page table
                       b10011 = Access Bit fault on section
                       b10110 = Access Bit fault on page
                       b10101 = Translation fault, section
                       b10111 = Translation fault, page
                       b11001 = Domain fault, section
                       b11011 = Domain fault, page
                       b11101 = Permission fault, section
                       b11111 = Permission fault, page.
[1:0]     Status       Indicates the status of the DMA channel:
                       b00 = Idle, reset value
                       b01 = Queued
                       b10 = Running
                       b11 = Complete or Error.
a. The external start and end addresses and the Stride must all be multiples of the transaction size. If this is not the case, the BP
bit is set to 1, and the DMA channel does not start.
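As a sketch of how software might interpret a value read from the DMA Channel Status Register, the following C helpers extract each field. The names are hypothetical. The error test relies on the observation that every no-error encoding of ES and IS has its top two bits clear (b00xxx), and treats a set BP bit as an error as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field extraction, using the bit positions listed for the register. */
unsigned dma_csr_status(uint32_t csr) { return csr & 0x3u; }         /* [1:0]  */
unsigned dma_csr_is(uint32_t csr)     { return (csr >> 2) & 0x1Fu; } /* [6:2]  */
unsigned dma_csr_es(uint32_t csr)     { return (csr >> 7) & 0x1Fu; } /* [11:7] */
bool dma_csr_bp(uint32_t csr)   { return ((csr >> 12) & 1u) != 0; }  /* [12]   */
bool dma_csr_isx0(uint32_t csr) { return ((csr >> 13) & 1u) != 0; }  /* [13]   */
bool dma_csr_esx0(uint32_t csr) { return ((csr >> 16) & 1u) != 0; }  /* [16]   */

/* True if BP is set, or if IS or ES holds an encoding other than b00xxx. */
bool dma_csr_has_error(uint32_t csr)
{
    return dma_csr_bp(csr) ||
           (dma_csr_is(csr) >> 3) != 0 ||
           (dma_csr_es(csr) >> 3) != 0;
}

/* Human-readable name for a Status field value. */
const char *dma_status_name(unsigned status)
{
    static const char *const names[4] = {
        "Idle", "Queued", "Running", "Complete or Error"
    };
    assert(status <= 3u);
    return names[status];
}
```

For example, a value of 0x23 decodes as Status b11 with IS b01000, that is, a completed channel reporting TCM out of range.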
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. These registers can be accessed in User mode if the U bit, see c11, DMA
User Accessibility Register on page 3-107, for the currently selected channel is set to 1.
Table 3-118 lists the results of attempted access for each mode.
Table 3-118 Results of access to the DMA Channel Status Register
U bit | DMA bit | Secure Privileged Read | Secure Privileged Write | Non-secure Privileged Read | Non-secure Privileged Write | Secure User Read | Secure User Write | Non-secure User Read | Non-secure User Write
0 | 0 | Data | Undefined exception | Undefined exception | Undefined exception | Undefined exception | Undefined exception | Undefined exception | Undefined exception
0 | 1 | Data | Undefined exception | Data | Undefined exception | Undefined exception | Undefined exception | Undefined exception | Undefined exception
1 | 0 | Data | Undefined exception | Undefined exception | Undefined exception | Data | Undefined exception | Undefined exception | Undefined exception
1 | 1 | Data | Undefined exception | Data | Undefined exception | Data | Undefined exception | Data | Undefined exception
To access the DMA Channel Status Register set DMA Channel Number Register to the
appropriate DMA channel and read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c8
•
Opcode_2 set to 0.
MRC p15, 0, <Rd>, c11, c8, 0 ; Read DMA Channel Status Register
In the event of an error, the appropriate Start Address Register contains the address that faulted,
unless the error is an external error, which bits [11:7] report as b11010.
A channel with the state of Queued changes to Running automatically if the other channel, if
implemented, changes to Idle, or Complete or Error, with no error.
When a channel completes all of the transfers of the DMA, so that all changes to memory
locations caused by those transfers are visible to other observers, its status is changed from
Running to Complete or Error. This change does not happen before the external accesses from
the transfer complete.
If the processor attempts to access memory locations that are not marked as shared, then the ES
bits signal an Unshared error for either:
•
a DMA transfer in User mode
•
a DMA transfer that has the UM bit set in the DMA Control Register.
A DMA transfer where the external address is within the range of the TCM also results in an
Unshared data error.
If the DMA channel is configured Secure, in the event of an error the processor asserts the
nDMASIRQ pin provided it is not masked. If the channel is configured Non-secure, in the event
of an error the processor asserts the nDMAIRQ pin, provided it is not masked. In the event of
an external abort on a page table walk caused by the DMA, the processor asserts the
nDMAEXTERRIRQ output.
3.2.42 c11, DMA Context ID Register
The DMA Context ID Register for each channel contains the processor 32-bit Context ID of the
process that uses that channel.
The DMA Context ID Register is:
•
in CP15 c11
•
a 32-bit read/write register for each DMA channel common to Secure and Non-secure
worlds
•
accessible in privileged modes only.
Figure 3-64 shows the arrangement of bits in the DMA Context ID Register.
Bits [31:8] PROCID, bits [7:0] ASID
Figure 3-64 DMA Context ID Register format
Table 3-119 lists how the bit values correspond with the DMA Context ID Register functions.
Table 3-119 DMA Context ID Register bit functions
Bits      Field name   Function
[31:8]    PROCID       Extends the ASID to form the process ID and identify the current process.
                       Holds the process ID value.
[7:0]     ASID         Holds the ASID of the current process and identifies the current ASID.
                       Holds the ASID value.
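The split between the two fields can be shown with a short C sketch. It assumes the layout of Figure 3-64, with the ASID in the bottom eight bits and PROCID in the remaining 24 bits. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine PROCID (24 bits) and ASID (8 bits) into a Context ID value. */
uint32_t dma_context_id(uint32_t procid, uint32_t asid)
{
    assert(procid <= 0xFFFFFFu); /* PROCID occupies bits [31:8] */
    assert(asid <= 0xFFu);       /* ASID occupies the bottom eight bits */
    return (procid << 8) | asid;
}

/* Extract the ASID used when VAs are translated to physical addresses. */
uint32_t dma_context_asid(uint32_t context_id)
{
    return context_id & 0xFFu;
}

/* Extract the PROCID field. */
uint32_t dma_context_procid(uint32_t context_id)
{
    return context_id >> 8;
}
```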
Access in the Non-secure world depends on the DMA bit, see c1, Non-Secure Access Control
Register on page 3-55. Table 3-120 lists the results of attempted access for each mode.
Table 3-120 Results of access to the DMA Context ID Register
DMA bit | Secure Privileged Read | Secure Privileged Write | Non-secure Privileged Read | Non-secure Privileged Write | User
0 | Data | Data | Undefined exception | Undefined exception | Undefined exception
1 | Data | Data | Data | Data | Undefined exception
To access the DMA Context ID register in a privileged mode set the DMA Channel Number
Register to the appropriate DMA channel and read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c11
•
CRm set to c15
•
Opcode_2 set to 0.
MRC p15, 0, <Rd>, c11, c15, 0 ; Read DMA Context ID Register
MCR p15, 0, <Rd>, c11, c15, 0 ; Write DMA Context ID Register
As part of the initialization of the DMA channel, the process that uses that channel writes the
processor Context ID to the DMA Context ID Register. Where the channel is designated as a
User-accessible channel, the privileged process, that initializes the channel for use in User
mode, must write the Context ID at the same time that the software writes to the U bit for the
channel.
The process that translates VAs to physical addresses uses the ASID stored in the bottom eight
bits of the Context ID register to enable different VA maps to co-exist. Attempts to write this
register while the DMA channel is Running or Queued have no effect.
Only privileged processes can read this register. This provides anonymity of the DMA channel
usage from User processes. On a context switch, where the state of the DMA is stacked and
restored, the saved state must include this register.
If a user process attempts to access this privileged register the processor takes an Undefined
instruction trap.
3.2.43 c12, Secure or Non-secure Vector Base Address Register
The purpose of the Secure or Non-secure Vector Base Address Register is to hold the base
address for exception vectors in the Secure and Non-secure worlds. For more information, see
Exceptions on page 2-36.
The Secure or Non-secure Vector Base Address Register is:
•
in CP15 c12
•
a 32-bit read/write register banked in Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-65 shows the arrangement of bits in the register.
Bits [31:5] Vector base address, bits [4:0] SBZ
Figure 3-65 Secure or Non-secure Vector Base Address Register format
Table 3-121 lists how the bit values correspond with the Secure or Non-secure Vector Base
Address Register functions.
Table 3-121 Secure or Non-secure Vector Base Address Register bit functions
Bits      Field name            Function
[31:5]    Vector base address   Determines the location that the core branches to on an exception.
                                Holds the base address. The reset value is 0.
[4:0]     SBZ                   UNP/SBZ.
When an exception occurs in the Secure world, the core branches to address:
Secure Vector_Base_Address + Exception_Vector_Address.
When an exception occurs in the Non-secure world, the core branches to address:
Non-secure Vector_Base_Address + Exception_Vector_Address.
When high vectors are enabled, regardless of the value of the register the core branches to:
0xFFFF0000 + Exception_Vector_Address
You can configure IRQ, FIQ, and External abort exceptions to branch to Secure Monitor mode,
see c1, Secure Configuration Register on page 3-52. In this case the processor uses the Monitor
Vector Base Address, see c12, Monitor Vector Base Address Register, to calculate the branch
address. The Reset exception always branches to 0x00000000, regardless of the value of the
Vector Base Address except when the processor uses high vectors.
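The branch address rules above can be sketched as a small model. This is an illustrative sketch only, not a description of the hardware implementation. The function and table names are hypothetical, the offsets are the standard ARM exception vector offsets, and the sketch assumes that the exception is not routed to Secure Monitor mode.

```python
# Standard ARM exception vector offsets (Exception_Vector_Address).
EXCEPTION_VECTOR_OFFSET = {
    "reset": 0x00,
    "undefined": 0x04,
    "svc": 0x08,
    "prefetch_abort": 0x0C,
    "data_abort": 0x10,
    "irq": 0x18,
    "fiq": 0x1C,
}

HIGH_VECTOR_BASE = 0xFFFF0000


def exception_branch_address(vector_base, exception, high_vectors=False):
    """Return the address the core branches to for an exception.

    vector_base models the banked Vector Base Address Register of the
    current world. Hypothetical helper, not part of any ARM software.
    """
    offset = EXCEPTION_VECTOR_OFFSET[exception]
    if high_vectors:
        # High vectors override the register value for every exception.
        return HIGH_VECTOR_BASE + offset
    if exception == "reset":
        # Reset ignores the Vector Base Address Register.
        return 0x00000000
    # Bits [4:0] of the register are SBZ, so mask them off.
    return ((vector_base & 0xFFFFFFE0) + offset) & 0xFFFFFFFF
```

For example, with a vector base of 0x00100000 and normal vectors, this model places the IRQ vector at 0x00100018.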
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-122 lists the results of attempted access for each mode.
Table 3-122 Results of access to the Secure or Non-secure Vector Base Address Register
Mode                   Read                 Write
Secure Privileged      Secure data          Secure data
Non-secure Privileged  Non-secure data      Non-secure data
User                   Undefined exception  Undefined exception
To use the Secure or Non-secure Vector Base Address Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c12
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c12, c0, 0 ; Read Secure or Non-secure Vector Base Address Register
MCR p15, 0, <Rd>, c12, c0, 0 ; Write Secure or Non-secure Vector Base Address Register
3.2.44
c12, Monitor Vector Base Address Register
The purpose of the Monitor Vector Base Address Register is to hold the base address for the
Secure Monitor exception vector. For more information, see Exceptions on page 2-36.
The Monitor Vector Base Address Register is:
•
in CP15 c12
•
a 32-bit read/write register in the Secure world only
•
accessible in Secure privileged modes only.
Figure 3-66 shows the arrangement of bits in the register.
Bits [31:5] Monitor vector base address, bits [4:0] SBZ
Figure 3-66 Monitor Vector Base Address Register format
Table 3-123 lists how the bit values correspond with the Monitor Vector Base Address Register
functions.
Table 3-123 Monitor Vector Base Address Register bit functions
Bits     Field name                   Function
[31:5]   Monitor vector base address  Determines the location that the core branches to on a
                                      Secure Monitor mode exception.
                                      Holds the base address. The reset value is 0.
[4:0]    SBZ                          UNP/SBZ.
When an exception branches to the Secure Monitor mode, the core branches to address:
Monitor_Base_Address + Exception_Vector_Address.
The Secure Monitor Call Exception caused by an SMC instruction branches to Secure Monitor
mode. You can configure IRQ, FIQ, and External abort exceptions to branch to Secure Monitor
mode, see c1, Secure Configuration Register on page 3-52. These are the only exceptions that
can branch to Secure Monitor mode and that use the Monitor Vector Base Address Register to
calculate the branch address. For more information about exceptions, see Exception vectors on
page 2-48.
Note
The Monitor Vector Base Address Register is 0x00000000 at reset. The Secure boot code must
program the register with an appropriate value for the Secure Monitor.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-124 lists the results of attempted access for each mode.
Table 3-124 Results of access to the Monitor Vector Base Address Register
Mode                   Read                 Write
Secure Privileged      Data                 Data
Non-secure Privileged  Undefined exception  Undefined exception
User                   Undefined exception  Undefined exception
To use the Monitor Vector Base Address Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c12
•
CRm set to c0
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c12, c0, 1 ; Read Monitor Vector Base Address Register
MCR p15, 0, <Rd>, c12, c0, 1 ; Write Monitor Vector Base Address Register
3.2.45
c12, Interrupt Status Register
The purpose of the Interrupt Status Register is to:
•
reflect the state of the nFIQ and nIRQ pins on the processor
•
reflect the state of external aborts.
The Interrupt Status Register is:
•
in CP15 c12
•
a 32-bit read-only register common to Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-67 shows the arrangement of bits in the register.
Bits [31:9] SBZ, bit [8] A, bit [7] I, bit [6] F, bits [5:0] SBZ
Figure 3-67 Interrupt Status Register format
Table 3-125 lists how the bit values correspond with the Interrupt Status Register functions.
Table 3-125 Interrupt Status Register bit functions
Bits     Field name  Function (a)
[31:9]   -           SBZ.
[8]      A           Indicates when an external abort is pending:
                     0 = No abort, reset value
                     1 = Abort pending.
[7]      I           Indicates when an IRQ is pending:
                     0 = no IRQ, reset value
                     1 = IRQ pending.
[6]      F           Indicates when an FIQ is pending:
                     0 = no FIQ, reset value
                     1 = FIQ pending.
[5:0]    -           SBZ.
a. The reset values depend on external signals.
Note
•
The F and I bits directly reflect the state of the nFIQ and nIRQ pins respectively, but are
the inverse state.
•
The A bit is set when an external abort occurs and automatically clears when the abort is
taken.
Table 3-126 lists the results of attempted access for each mode.
Table 3-126 Results of access to the Interrupt Status Register
Mode                   Read                 Write
Secure Privileged      Data                 Undefined exception
Non-secure Privileged  Data                 Undefined exception
User                   Undefined exception  Undefined exception
The A, I, and F bits map to the same format as the CPSR so that you can use the same mask for
these bits.
The Secure Monitor can poll these bits to detect the exceptions before it completes context
switches. This can reduce interrupt latency.
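As an illustration of the shared A, I, and F bit positions, the following sketch decodes a value read from the Interrupt Status Register. It is a model only. The names are hypothetical, and the pin helper assumes that nFIQ and nIRQ are active-LOW, as the inverse relationship described in the Note implies.

```python
# Bit positions shared by the Interrupt Status Register and the CPSR.
A_BIT = 1 << 8  # external abort pending
I_BIT = 1 << 7  # IRQ pending
F_BIT = 1 << 6  # FIQ pending
AIF_MASK = A_BIT | I_BIT | F_BIT


def decode_interrupt_status(isr_value):
    """Split an Interrupt Status Register value into its pending flags."""
    return {
        "abort_pending": bool(isr_value & A_BIT),
        "irq_pending": bool(isr_value & I_BIT),
        "fiq_pending": bool(isr_value & F_BIT),
    }


def status_from_pins(nfiq, nirq):
    """Model the F and I bits from the nFIQ and nIRQ pin levels (0 or 1).

    The bits are the inverse of the pin state: a LOW pin reads as pending.
    """
    value = 0
    if nirq == 0:
        value |= I_BIT
    if nfiq == 0:
        value |= F_BIT
    return value
```

A Secure Monitor that polls the register could apply `AIF_MASK` to the value it reads, in the same way that it masks the CPSR.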
To use the Interrupt Status Register read CP15 with:
•
Opcode_1 set to 0
•
CRn set to c12
•
CRm set to c1
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c12, c1, 0 ; Read Interrupt Status Register
3.2.46
c13, FCSE PID Register
The c13, Context ID Register on page 3-127 replaces the FCSE PID Register. Use of the FCSE
PID Register is deprecated.
The FCSE PID Register is:
•
in CP15 c13
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Writing to this register globally flushes the BTAC.
Figure 3-68 shows the arrangement of bits in the register.
Bits [31:25] FCSE PID, bits [24:0] SBZ
Figure 3-68 FCSE PID Register format
Table 3-127 lists how the bit values correspond with the FCSE PID Register functions.
Table 3-127 FCSE PID Register bit functions
Bits     Field name  Function
[31:25]  FCSE PID    The purpose of the FCSE PID Register is to provide the ProcID for fast
                     context switch memory mappings. The MMU uses the contents of this
                     register to map memory addresses in the range 0-32MB.
                     Identifies a specific process for fast context switch.
                     Holds the ProcID. The reset value is 0.
[24:0]   -           Reserved.
                     SBZ.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-128 lists the results of attempted access for each mode.
Table 3-128 Results of access to the FCSE PID Register
Mode                   Read                 Write
Secure Privileged      Secure data          Secure data
Non-secure Privileged  Non-secure data      Non-secure data
User                   Undefined exception  Undefined exception
To use the FCSE PID Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c13
•
CRm set to c0
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c13, c0, 0 ; Read FCSE PID Register
MCR p15, 0, <Rd>, c13, c0, 0 ; Write FCSE PID Register
To change the ProcID and perform a fast context switch, write to the FCSE PID Register. You
do not have to flush the contents of the TLB after the switch because the TLB still holds the valid
address tags.
From zero to six instructions after the MCR that writes the ProcID might be fetched with the old
ProcID:
{ProcID = 0}
MOV R0, #1:SHL:25       ; Fetched with ProcID = 0
MCR p15,0,R0,c13,c0,0   ; Fetched with ProcID = 0
A0 (any instruction)    ; Fetched with ProcID = 0/1
A1 (any instruction)    ; Fetched with ProcID = 0/1
A2 (any instruction)    ; Fetched with ProcID = 0/1
A3 (any instruction)    ; Fetched with ProcID = 0/1
A4 (any instruction)    ; Fetched with ProcID = 0/1
A5 (any instruction)    ; Fetched with ProcID = 0/1
A6 (any instruction)    ; Fetched with ProcID = 1
Note
You must not rely on this behavior for future compatibility. An IMB must be executed between
changing the ProcID and fetching from locations that are translated by the ProcID.
Addresses issued by the ARM1176JZ-S processor in the range 0-32MB are translated by the
ProcID. Address A becomes A + (ProcID x 32MB). This translated address, the MVA, is used
by the MMU. Addresses higher than 32MB are not translated. The ProcID is a seven-bit field,
enabling 128 x 32MB processes to be mapped.
Note
If ProcID is 0, as it is on Reset, then there is a flat mapping between the ARM1176JZ-S
processor and the MMU.
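The translation from a virtual address to an MVA can be sketched as follows. This is a minimal model of the arithmetic described above, not of the MMU hardware, and the function names are hypothetical.

```python
FCSE_REGION_SIZE = 32 * 1024 * 1024  # 32MB


def fcse_pid_register_value(proc_id):
    """Place a seven-bit ProcID in bits [31:25] of the FCSE PID Register."""
    if not 0 <= proc_id <= 127:
        raise ValueError("ProcID is a seven-bit field")
    return proc_id << 25


def modified_virtual_address(va, fcse_pid_register):
    """Return the MVA that the MMU sees for a processor-issued address."""
    proc_id = (fcse_pid_register >> 25) & 0x7F
    if va < FCSE_REGION_SIZE:
        # Addresses in the bottom 32MB are relocated by the ProcID.
        return va + proc_id * FCSE_REGION_SIZE
    # Addresses at or above 32MB are not translated.
    return va
```

With a ProcID of 0 the function returns every address unchanged, which corresponds to the flat mapping that applies after Reset.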
Figure 3-69 on page 3-127 shows how addresses are mapped using the FCSE PID Register.
[Figure: the virtual address (VA) issued by the processor, in the range 0 to 32MB, is relocated
using C13 to one of the 32MB blocks, numbered 0 to 127, of the 4GB modified virtual address
(MVA) space that is input to the MMU.]
Figure 3-69 Address mapping with the FCSE PID Register
3.2.47
c13, Context ID Register
The purpose of the Context ID Register is to provide information on the current ASID and
process ID, for example for the ETM and debug logic.
Table 3-129 on page 3-128 lists the purposes of the individual bits of the Context ID Register.
Debug logic uses the ASID information to enable process-dependent breakpoints and
watchpoints.
The Context ID Register is:
•
in CP15 c13
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Writing to this register globally flushes the BTAC.
Figure 3-70 shows the arrangement of bits in the Context ID Register.
Bits [31:8] PROCID, bits [7:0] ASID
Figure 3-70 Context ID Register format
Table 3-129 lists how the bit values correspond with the Context ID Register functions.
Table 3-129 Context ID Register bit functions
Bits     Field name  Function
[31:8]   PROCID      Extends the ASID to form the process ID and identify the current process.
                     The value is the Process ID. The reset value is 0.
[7:0]    ASID        Holds the ASID of the current process to identify the current ASID.
                     The value is the ASID. The reset value is 0.
Table 3-130 lists the results of attempted access for each mode.
Table 3-130 Results of access to the Context ID Register
Mode                   Read                 Write
Secure Privileged      Secure data          Secure data
Non-secure Privileged  Non-secure data      Non-secure data
User                   Undefined exception  Undefined exception
The current ASID value in the Context ID Register is exported to the MMU.
To use the Context ID Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c13
•
CRm set to c0
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c13, c0, 1 ; Read Context ID Register
MCR p15, 0, <Rd>, c13, c0, 1 ; Write Context ID Register
You must ensure that software performs a Data Synchronization Barrier operation before
changes to this register. This ensures that all accesses are related to the correct context ID.
You must execute an IMB instruction immediately after changes to the Context ID Register. You
must not attempt to execute any instructions that are from an ASID-dependent memory region
between the change to the register and the IMB instruction. Code that updates the ASID must
execute from a global memory region.
You must program each process with a unique number to ensure that ETM and debug logic can
correctly distinguish between processes.
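The layout of the Context ID Register, with PROCID in bits [31:8] and the ASID in bits [7:0] as Figure 3-70 shows, can be illustrated with a short sketch. The helper names are hypothetical. The sketch models only the packing of the value and not the barrier and IMB sequence that must surround a write to the register.

```python
def pack_context_id(procid, asid):
    """Build a Context ID Register value from a 24-bit PROCID and 8-bit ASID."""
    if not 0 <= procid < (1 << 24):
        raise ValueError("PROCID is a 24-bit field")
    if not 0 <= asid < (1 << 8):
        raise ValueError("ASID is an 8-bit field")
    return (procid << 8) | asid


def unpack_context_id(value):
    """Return (PROCID, ASID) from a Context ID Register value."""
    return (value >> 8) & 0xFFFFFF, value & 0xFF
```

Because the whole 32-bit value identifies the process to the ETM and debug logic, an operating system would typically choose a PROCID that is unique for each process even when ASIDs are reused.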
3.2.48
c13, Thread and process ID registers
The purpose of the thread and process ID registers is to provide locations to store the IDs of
software threads and processes for OS management purposes.
The thread and process ID registers are:
•
in CP15 c13
•
three 32-bit read/write registers banked for Secure and Non-secure worlds:
— User Read/Write Thread and Process ID Register
— User Read Only Thread and Process ID Register
— Privileged Only Thread and Process ID Register.
•
each accessible in different modes:
— User Read/Write: read/write in User and privileged modes
— User Read Only: read only in User mode, read/write in privileged modes
— Privileged Only: read/write in privileged modes only.
Table 3-131 lists the results of attempted access to each register for each mode.
Table 3-131 Results of access to the thread and process ID registers
Mode                   Access  User Read/Write (a)  User Read Only (a)   Privileged Only (a)
Secure Privileged      Read    Secure data          Secure data          Secure data
Secure Privileged      Write   Secure data          Secure data          Secure data
Non-secure Privileged  Read    Non-secure data      Non-secure data      Non-secure data
Non-secure Privileged  Write   Non-secure data      Non-secure data      Non-secure data
Secure User            Read    Secure data          Secure data          Undefined exception
Secure User            Write   Secure data          Undefined exception  Undefined exception
Non-secure User        Read    Non-secure data      Non-secure data      Undefined exception
Non-secure User        Write   Non-secure data      Undefined exception  Undefined exception
a. The register names are:
- User Read/Write Thread and Process ID Register
- User Read Only Thread and Process ID Register
- Privileged Only Thread and Process ID Register.
To use the thread and process ID registers read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c13
•
CRm set to c0
•
Opcode_2 set to:
— 2, User Read/Write Thread and Process ID Register
— 3, User Read Only Thread and Process ID Register
— 4, Privileged Only Thread and Process ID Register.
For example:
MRC p15, 0, <Rd>, c13, c0, 2 ; Read User Read/Write Thread and Proc. ID Register
MCR p15, 0, <Rd>, c13, c0, 2 ; Write User Read/Write Thread and Proc. ID Register
MRC p15, 0, <Rd>, c13, c0, 3 ; Read User Read Only Thread and Proc. ID Register
MCR p15, 0, <Rd>, c13, c0, 3 ; Write User Read Only Thread and Proc. ID Register
MRC p15, 0, <Rd>, c13, c0, 4 ; Read Privileged Only Thread and Proc. ID Register
MCR p15, 0, <Rd>, c13, c0, 4 ; Write Privileged Only Thread and Proc. ID Register
Reading or writing the thread and process ID registers has no effect on processor state or
operation. These registers provide OS support and must be managed by the OS.
You must clear the contents of all thread and process ID registers on process switches to prevent
data leaking from one process to another. This is important to ensure the security of secure data.
The reset value of these registers is 0.
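The access rules in Table 3-131 can be summarized in a small model. This is a sketch for illustration only. The register keys and the function name are hypothetical, and the returned strings mirror the table entries.

```python
UNDEFINED = "Undefined exception"

# Per register: (privileged read, privileged write, User read, User write).
_ACCESS = {
    "user_read_write": (True, True, True, True),
    "user_read_only": (True, True, True, False),
    "privileged_only": (True, True, False, False),
}


def access_result(register, secure, privileged, write):
    """Return the Table 3-131 result for one access to a thread and process ID register."""
    priv_read, priv_write, user_read, user_write = _ACCESS[register]
    if privileged:
        allowed = priv_write if write else priv_read
    else:
        allowed = user_write if write else user_read
    if not allowed:
        return UNDEFINED
    # The registers are banked, so each world sees its own copy.
    return "Secure data" if secure else "Non-secure data"
```

For instance, a Non-secure User write to the User Read Only register yields an Undefined exception in this model, while a read from the same mode returns the Non-secure copy.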
3.2.49
c15, Peripheral Port Memory Remap Register
The purpose of the Peripheral Port Memory Remap Register is to remap the memory attributes
to Non-Shared Device. This forces access to the peripheral port and overrides what is
programmed in the page tables. The remapping happens both with the MMU enabled and with
the MMU disabled, therefore you can remap the peripheral port even when you do not use the
MMU. The Peripheral Port Memory Remap Register has the highest priority, higher than that
of the Primary and Normal memory remap registers.
Table 3-132 on page 3-131 lists the purposes of the individual bits in the Peripheral Port
Memory Remap Register.
The Peripheral Port Memory Remap Register is:
•
in CP15 c15
•
a 32-bit read/write register banked for Secure and Non-secure worlds
•
accessible in privileged modes only.
Figure 3-71 shows the arrangement of the bits in the register.
Bits [31:12] Base address, bits [11:5] UNP/SBZ, bits [4:0] Size
Figure 3-71 Peripheral Port Memory Remap Register format
Table 3-132 lists how the bit values correspond with the functions of the Peripheral Port
Memory Remap Register.
Table 3-132 Peripheral Port Memory Remap Register bit functions
Bits     Field name    Function
[31:12]  Base Address  Gives the physical base address of the region of memory for remapping to
                       the peripheral port. If the processor uses the Peripheral Port Memory
                       Remap Register while the MMU is disabled, the virtual base address is
                       equal to the physical base address that is used.
                       The assumption is that the Base Address is aligned to the size of the
                       remapped region. Any bits in the range [(log2(Region size)-1):12] are
                       ignored.
                       The value is the base address. The reset value is 0.
[11:5]   -             UNP/SBZ
[4:0]    Size          Indicates the size of the memory region that the peripheral port is
                       remapped to:
                       b00000 = 0KB (a)
                       b00011 = 4KB
                       b00100 = 8KB
                       b00101 = 16KB
                       b00110 = 32KB
                       b00111 = 64KB
                       b01000 = 128KB
                       b01001 = 256KB
                       b01010 = 512KB
                       b01011 = 1MB
                       b01100 = 2MB
                       b01101 = 4MB
                       b01110 = 8MB
                       b01111 = 16MB
                       b10000 = 32MB
                       b10001 = 64MB
                       b10010 = 128MB
                       b10011 = 256MB
                       b10100 = 512MB
                       b10101 = 1GB
                       b10110 = 2GB.
                       All other values are reserved.
a. The reset value, indicating that no remapping is to take place.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
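The Size encoding in Table 3-132 follows a regular pattern: each step above b00011 doubles the 4KB region. The sketch below illustrates that pattern and the way low-order base address bits are ignored. The function names are hypothetical, addresses are physical addresses, and rejecting a misaligned base is a choice made for this sketch rather than documented processor behavior.

```python
def region_size_bytes(size_field):
    """Decode the five-bit Size field into a region size in bytes."""
    if size_field == 0:
        return 0  # reset value, no remapping takes place
    if 3 <= size_field <= 22:
        return 4096 << (size_field - 3)  # b00011 = 4KB up to b10110 = 2GB
    raise ValueError("reserved Size encoding")


def encode_remap_register(base_address, size_field):
    """Build a register value from an aligned base address and a Size field."""
    size = region_size_bytes(size_field)
    if size and base_address % size:
        raise ValueError("base address must be aligned to the region size")
    return (base_address & 0xFFFFF000) | size_field


def in_remapped_region(address, register_value):
    """Return True if a physical address falls in the remapped region."""
    size = region_size_bytes(register_value & 0x1F)
    if size == 0:
        return False
    # Base address bits below log2(size) are ignored.
    base = register_value & 0xFFFFF000 & ~(size - 1)
    return base <= address < base + size
```

For example, a 1MB region at 0x80000000 corresponds to a register value of 0x8000000B in this model.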
Table 3-133 lists the results of attempted access for each mode.
Table 3-133 Results of access to the Peripheral Port Remap Register
Mode                   Read                 Write
Secure Privileged      Secure data          Secure data
Non-secure Privileged  Non-secure data      Non-secure data
User                   Undefined exception  Undefined exception
To use the Peripheral Port Memory Remap Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c15
•
CRm set to c2
•
Opcode_2 set to 4.
For example:
MRC p15, 0, <Rd>, c15, c2, 4 ; Read Peripheral Port Memory Remap Register
MCR p15, 0, <Rd>, c15, c2, 4 ; Write Peripheral Port Memory Remap Register
3.2.50
c15, Secure User and Non-secure Access Validation Control Register
The purpose of the Secure User and Non-secure Access Validation Control Register is to
control:
•
access to the system validation registers in User mode and in the Non-secure world
•
access to the performance monitor unit registers in User mode.
Table 3-134 lists the purpose of the individual bits in the register.
The Secure User and Non-secure Access Validation Control Register is:
•
in CP15 c15
•
a 32-bit read/write register in the Secure world only
•
accessible in privileged modes only.
Figure 3-72 shows the bit arrangement for the Secure User and Non-secure Access Validation
Control Register.
Bits [31:1] SBZ, bit [0] V
Figure 3-72 Secure User and Non-secure Access Validation Control Register format
Table 3-134 lists how the bit values correspond with the Secure User and Non-secure Access
Validation Control Register functions.
Table 3-134 Secure User and Non-secure Access Validation Control Register bit functions
Bits     Field name  Function
[31:1]   -           UNP/SBZ.
[0]      V           Controls access to system validation registers from User and Non-secure
                     modes, and to performance monitor registers in User mode.
                     0 = system validation registers accessible only from Secure privileged
                     modes, performance monitor registers accessible only from privileged
                     modes. The reset value is 0.
                     1 = system validation and performance monitor registers accessible from
                     any mode.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
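The effect of the V bit described in Table 3-134 can be expressed as a simple predicate. This is an illustrative sketch only. The function name and the two register class strings are hypothetical labels for the system validation registers and the performance monitor registers.

```python
def can_access(register_class, secure, privileged, v_bit):
    """Return True if the V bit setting permits access from the given mode."""
    if register_class not in ("system_validation", "performance_monitor"):
        raise ValueError("unknown register class")
    if v_bit == 1:
        # Both register classes are accessible from any mode.
        return True
    if register_class == "system_validation":
        # Secure privileged modes only.
        return secure and privileged
    # Performance monitor registers: any privileged mode.
    return privileged
```

With the reset value of 0, a Non-secure privileged mode can therefore reach the performance monitor registers but not the system validation registers.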
Table 3-135 lists the results of attempted access for each mode.
Table 3-135 Results of access to the Secure User and Non-secure Access Validation Control Register
Mode                   Read                 Write
Secure Privileged      Data                 Data
Non-secure Privileged  Undefined exception  Undefined exception
User                   Undefined exception  Undefined exception
To access the Secure User and Non-secure Access Validation Control Register read or write
CP15 with:
•
Opcode_1 set to 0
•
CRn set to c15
•
CRm set to c9
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c15, c9, 0 ; Read Secure User and Non-secure Access Validation Control Register
MCR p15, 0, <Rd>, c15, c9, 0 ; Write Secure User and Non-secure Access Validation Control Register
3.2.51
c15, Performance Monitor Control Register
The purpose of the Performance Monitor Control Register is to control the operation of:
•
the Cycle Counter Register
•
the Count Register 0
•
the Count Register 1.
Table 3-136 on page 3-134 lists the purpose of the individual bits in the register.
The Performance Monitor Control Register is:
•
in CP15 c15
•
a 32-bit read/write register common to Secure and Non-secure worlds
•
accessible in User and Privileged modes.
Figure 3-73 shows the bit arrangement for the Performance Monitor Control Register.
Bits [31:28] SBZ/UNP, bits [27:20] EvtCount0, bits [19:12] EvtCount1, bit [11] X, bit [10] CCR,
bit [9] CR1, bit [8] CR0, bit [7] SBZ, bit [6] ECC, bit [5] EC1, bit [4] EC0, bit [3] D,
bit [2] C, bit [1] P, bit [0] E
Figure 3-73 Performance Monitor Control Register format
Table 3-136 lists how the bit values correspond with the Performance Monitor Control Register.
Table 3-136 Performance Monitor Control Register bit functions
Bits     Field name  Function
[31:28]  -           UNP/SBZ.
[27:20]  EvtCount0   Identifies the source of events for Count Register 0.
                     Table 3-137 on page 3-135 lists the values, functions and EVNTBUS bit
                     position for Count Register 0. The reset value is 0.
[19:12]  EvtCount1   Identifies the source of events for Count Register 1.
                     Table 3-137 on page 3-135 lists the values and the bit functions for
                     Count Register 1. The reset value is 0.
[11]     X           Enable Export of the events to the event bus to an external monitoring
                     block, such as the ETM to trace events:
                     0 = Export disabled, EVNTBUS held at 0x0, reset value
                     1 = Export enabled, EVNTBUS driven by the events.
[10]     CCR         Cycle Counter Register overflow flag:
                     0 = For reads, no overflow, reset value. For writes, no effect.
                     1 = For reads, overflow occurred. For writes, clear this bit.
[9]      CR1         Count Register 1 overflow flag:
                     0 = For reads, no overflow, reset value. For writes, no effect.
                     1 = For reads, overflow occurred. For writes, clear this bit.
[8]      CR0         Count Register 0 overflow flag:
                     0 = For reads, no overflow, reset value. For writes, no effect.
                     1 = For reads, overflow occurred. For writes, clear this bit.
[7]      -           UNP/SBZ.
[6]      ECC         Used to enable and disable Cycle Counter interrupt reporting:
                     0 = Disable interrupt, reset value
                     1 = Enable interrupt.
[5]      EC1         Used to enable and disable Count Register 1 interrupt reporting:
                     0 = Disable interrupt, reset value
                     1 = Enable interrupt.
[4]      EC0         Used to enable and disable Count Register 0 interrupt reporting:
                     0 = Disable interrupt, reset value
                     1 = Enable interrupt.
[3]      D           Cycle count divider:
                     0 = Counts every processor clock cycle, reset value
                     1 = Counts every 64th processor clock cycle.
[2]      C           Cycle Counter Register Reset. Reset on write, Unpredictable on read:
                     0 = No action, reset value
                     1 = Reset the Cycle Counter Register to 0x0.
[1]      P           Count Register 1 and Count Register 0 Reset. Reset on write,
                     Unpredictable on read:
                     0 = No action, reset value
                     1 = Reset both Count Registers to 0x0.
[0]      E           Enable all counters:
                     0 = All counters disabled, reset value
                     1 = All counters enabled.
The Performance Monitor Control Register:
•
controls the events that Count Register 0 and Count Register 1 count
•
indicates the counter that overflowed
•
enables and disables the report of interrupts
•
extends Cycle Counter Register counting by six more bits, so that the number of cycles
between counter rollovers is 2^38
•
resets all counters to zero
•
enables the entire performance monitoring mechanism.
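The field positions in Table 3-136 can be illustrated with a sketch that assembles a control value and decodes the overflow flags. It is a model for illustration only, and all function and parameter names are hypothetical. The rollover helper reflects a 32-bit counter that counts either every cycle or every 64th cycle.

```python
def build_pmcr(evt_count0=0, evt_count1=0, export=False, irq_ccr=False,
               irq_cr1=False, irq_cr0=False, divide_by_64=False,
               reset_cycle_counter=False, reset_count_registers=False,
               enable=False):
    """Assemble a Performance Monitor Control Register value."""
    if not (0 <= evt_count0 <= 0xFF and 0 <= evt_count1 <= 0xFF):
        raise ValueError("event numbers are eight-bit fields")
    value = (evt_count0 << 20) | (evt_count1 << 12)
    value |= int(export) << 11                # X
    value |= int(irq_ccr) << 6                # ECC
    value |= int(irq_cr1) << 5                # EC1
    value |= int(irq_cr0) << 4                # EC0
    value |= int(divide_by_64) << 3           # D
    value |= int(reset_cycle_counter) << 2    # C
    value |= int(reset_count_registers) << 1  # P
    value |= int(enable)                      # E
    return value


def overflow_flags(pmcr_value):
    """Decode the CCR, CR1, and CR0 overflow flags from a value read back."""
    return {
        "ccr": bool(pmcr_value & (1 << 10)),
        "cr1": bool(pmcr_value & (1 << 9)),
        "cr0": bool(pmcr_value & (1 << 8)),
    }


def cycles_between_rollovers(divide_by_64):
    """Processor cycles between Cycle Counter Register rollovers."""
    return (2 ** 32) * (64 if divide_by_64 else 1)
```

For example, selecting event 0x07 for Count Register 0 and event 0x0B for Count Register 1 with all counters enabled gives 0x0070B001 in this model.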
Table 3-137 lists the events that can be monitored using the Performance Monitor Control
Register.
Table 3-137 Performance monitoring events
EVNTBUS       Event             Event definition
bit position  number
-             0xFF              An increment each cycle.
-             0x26              Procedure return instruction executed and return address predicted
                                incorrectly. The procedure return address was restored to the return
                                stack following the prediction being identified as incorrect.
-             0x25              Procedure return instruction executed and return address predicted.
                                The procedure return address was popped off the return stack and the
                                core branched to this address.
-             0x24              Procedure return instruction executed. The procedure return address
                                was popped off the return stack.
-             0x23              Procedure call instruction executed. The procedure return address was
                                pushed on to the return stack.
-             0x22              If both ETMEXTOUT[0] and ETMEXTOUT[1] signals are asserted then the
                                count is incremented by two. If either signal is asserted then the
                                count increments by one.
-             0x21              ETMEXTOUT[1] signal was asserted for a cycle.
-             0x20              ETMEXTOUT[0] signal was asserted for a cycle.
[19]          0x12              Write Buffer drained because of a Data Synchronization Barrier
                                operation or Strongly Ordered operation.
[18]          0x11              Stall because of a full Load Store Unit request queue. This event
                                takes place each clock cycle when the condition is met. A high
                                incidence of this event indicates the LSU is often waiting for
                                transactions to complete on the external bus.
[17]          0x10              Explicit external data accesses, Data Cache linefills, Noncacheable,
                                write-through.
[16]          0xF               Main TLB miss.
[15:14]       0xD               Software changed the PC. This event occurs any time the PC is changed
                                by software and there is not a mode change. For example, a MOV
                                instruction with PC as the destination triggers this event. Executing
                                a SVC from User mode does not trigger this event, because it incurs a
                                mode change. If EVNTBUS bit [15] is HIGH, two software PC changes
                                occurred in this clock cycle and the count increments by two.
[13]          0xC               Data cache write-back. This event occurs once for each half line of
                                four words that are written back from the cache.
[12]          0xB               Data cache miss. Does not include Cache Operations.
[11]          0xA               Data cache access. Does not include Cache Operations. This event
                                occurs for each nonsequential access to a cache line, regardless of
                                whether or not the location is cacheable.
[10]          0x9               Data cache access. Does not include Cache Operations. This event
                                occurs for each nonsequential access to a cache line, for cacheable
                                locations.
[9:8]         0x7               Instruction executed. If EVNTBUS bit [9] is HIGH, two instructions
                                were executed in this clock cycle and the count increments by two.
[7]           0x6               Branch mispredicted.
[6]           -                 Reserved.
[5]           0x5               Branch instruction executed, branch might or might not have changed
                                program flow.
[4]           0x4               Data MicroTLB miss.
[3]           0x3               Instruction MicroTLB miss.
[2]           0x2               Stall because of a data dependency. This event occurs every cycle
                                when the condition is present.
[1]           0x1               Stall because instruction buffer cannot deliver an instruction. This
                                can indicate an Instruction Cache miss or an Instruction MicroTLB
                                miss. This event occurs every cycle when the condition is present.
[0]           0x0               Instruction cache miss.
                                Note
                                This event counts all instruction cache misses, including any
                                speculative access that would be a cache miss. If the instruction
                                that caused a speculative access is not executed then there might not
                                be a fetch from external memory. This can happen, for example, if the
                                code branches round the instruction. This means that the value
                                returned in this counter can be much larger than the number of
                                external memory accesses caused by instruction cache misses.
-             All other values  Reserved. Unpredictable behavior.
Access to the Performance Monitor Control Register in User mode depends on the V bit, see
c15, Secure User and Non-secure Access Validation Control Register on page 3-132. The
Performance Monitor Control Register is always accessible in Privileged modes. Table 3-138
lists the results of attempted access for each mode.
Table 3-138 Results of access to the Performance Monitor Control Register
V bit  Secure Privileged  Non-secure Privileged  User
       Read   Write       Read   Write           Read                 Write
0      Data   Data        Data   Data            Undefined exception  Undefined exception
1      Data   Data        Data   Data            Data                 Data
To access the Performance Monitor Control Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c15
•
CRm set to c12
•
Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c15, c12, 0 ; Read Performance Monitor Control Register
MCR p15, 0, <Rd>, c15, c12, 0 ; Write Performance Monitor Control Register
If this unit generates an interrupt, the processor asserts the pin nPMUIRQ. You can route this
pin to an external interrupt controller for prioritization and masking. This is the only mechanism
that signals this interrupt to the core. When asserted, this interrupt can only be cleared if bit 0 of
the Performance Monitor Control Register is high.
There is a delay of three cycles between an enable of the counter and the start of the event
counter. The information used to count events is taken from various pipeline stages. This means
that the absolute counts recorded might vary because of pipeline effects. This has negligible
effect except in cases where the counters are enabled for a very short time.
In addition to the two counters within the processor, most of the events that Table 3-137 on
page 3-135 lists are available on an external bus, EVNTBUS. You can connect this bus to the
ETM unit or other external trace hardware to enable the events to be monitored. If you do not
want this functionality, set the X bit in the Performance Monitor Control Register to 0. In Debug
state the EVNTBUS is masked to zero.
3.2.52
c15, Cycle Counter Register
The purpose of the Cycle Counter Register is to count the core clock cycles.
The Cycle Counter Register:
•
is in CP15 c15
•
is a 32-bit read/write register common to Secure and Non-secure worlds
•
counts up and can trigger an interrupt on overflow.
The Cycle Counter Register bits[31:0] contain the count value. The reset value is 0.
You can use this register in conjunction with the Performance Monitor Control Register and the
two Count Registers to provide a variety of useful metrics that enable you to optimize system
performance.
Access to the Cycle Counter Register in User mode depends on the V bit, see c15, Secure User
and Non-secure Access Validation Control Register on page 3-132. The Cycle Counter Register
is always accessible in Privileged modes. Table 3-139 lists the results of attempted access for
each mode.
Table 3-139 Results of access to the Cycle Counter Register
V bit  Secure Privileged  Non-secure Privileged  User
       Read   Write       Read   Write           Read                 Write
0      Data   Data        Data   Data            Undefined exception  Undefined exception
1      Data   Data        Data   Data            Data                 Data
To access the Cycle Counter Register read or write CP15 with:
•
Opcode_1 set to 0
•
CRn set to c15
•
CRm set to c12
•
Opcode_2 set to 1.
For example:
MRC p15, 0, <Rd>, c15, c12, 1 ; Read Cycle Counter Register
MCR p15, 0, <Rd>, c15, c12, 1 ; Write Cycle Counter Register
The value in the Cycle Counter Register is zero at Reset.
You can use the Performance Monitor Control Register to set the Cycle Counter Register to zero.
You can use the Performance Monitor Control Register to configure the Cycle Counter Register
to count every 64th clock cycle.
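As an illustration of how two readings of the Cycle Counter Register might be turned into an elapsed cycle count, the following C sketch works on values that have already been read with the MRC instruction. It is a minimal sketch under stated assumptions: the function name and the div64 parameter are hypothetical, and it assumes the counter wrapped at most once between the two readings.

```c
#include <assert.h>
#include <stdint.h>

/* Core clock cycles between two readings of the 32-bit Cycle Counter Register.
 * div64 (hypothetical parameter) is 1 when the counter was configured to count
 * every 64th clock cycle, 0 otherwise. Assumes at most one wrap-around. */
static uint64_t ccnt_elapsed_cycles(uint32_t start, uint32_t end, int div64)
{
    assert(div64 == 0 || div64 == 1);
    uint32_t ticks = end - start;   /* unsigned subtraction is modulo 2^32 */
    return div64 ? (uint64_t)ticks * 64u : (uint64_t)ticks;
}
```

Because the pipeline effects mentioned earlier can shift absolute counts slightly, results for very short measured intervals should be treated as approximate.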
3.2.53 c15, Count Register 0
The purpose of the Count Register 0 is to count instances of an event that the Performance
Monitor Control Register selects.
The Count Register 0:
• is in CP15 c15
• is a 32-bit read/write register common to Secure and Non-secure worlds
• counts up and can trigger an interrupt on overflow.
Count Register 0 bits [31:0] contain the count value. The reset value is 0.
You can use this register in conjunction with the Performance Monitor Control Register, the
Cycle Counter Register, and Count Register 1 to provide a variety of useful metrics that enable
you to optimize system performance.
Note
• In Debug state the counter is disabled.
• When the core is in a mode where non-invasive debug is not permitted, set by SPNIDEN
and the SUNIDEN bit, see c1, Secure Debug Enable Register on page 3-54, the processor
does not count events.
Access to the Count Register 0 in User mode depends on the V bit, see c15, Secure User and
Non-secure Access Validation Control Register on page 3-132. The Count Register 0 is always
accessible in Privileged modes. Table 3-140 lists the results of attempted access for each mode.
Table 3-140 Results of access to the Count Register 0
         Secure Privileged   Non-secure Privileged   User
V bit    Read    Write       Read    Write           Read                  Write
0        Data    Data        Data    Data            Undefined exception   Undefined exception
1        Data    Data        Data    Data            Data                  Data
To access Count Register 0 read or write CP15 with:
• Opcode_1 set to 0
• CRn set to c15
• CRm set to c12
• Opcode_2 set to 2.
For example:
MRC p15, 0, <Rd>, c15, c12, 2    ; Read Count Register 0
MCR p15, 0, <Rd>, c15, c12, 2    ; Write Count Register 0
The value in Count Register 0 is 0 at Reset.
You can use the Performance Monitor Control Register to set Count Register 0 to zero.
3.2.54 c15, Count Register 1
The purpose of the Count Register 1 is to count instances of an event that the Performance
Monitor Control Register selects.
The Count Register 1:
• is in CP15 c15
• is a 32-bit read/write register common to Secure and Non-secure worlds
• counts up and can trigger an interrupt on overflow.
Count Register 1 bits [31:0] contain the count value. The reset value is 0.
You can use this register in conjunction with the Performance Monitor Control Register, the
Cycle Counter Register, and Count Register 0 to provide a variety of useful metrics that enable
you to optimize system performance.
Note
• In Debug state the counter is disabled.
• When the core is in a mode where non-invasive debug is not permitted, set by SPNIDEN
and the SUNIDEN bit, see c1, Secure Debug Enable Register on page 3-54, the processor
does not count events.
Access to the Count Register 1 in User mode depends on the V bit, see c15, Secure User and
Non-secure Access Validation Control Register on page 3-132. The Count Register 1 is always
accessible in Privileged modes. Table 3-141 lists the results of attempted access for each mode.
Table 3-141 Results of access to the Count Register 1
         Secure Privileged   Non-secure Privileged   User
V bit    Read    Write       Read    Write           Read                  Write
0        Data    Data        Data    Data            Undefined exception   Undefined exception
1        Data    Data        Data    Data            Data                  Data
To access Count Register 1 read or write CP15 with:
• Opcode_1 set to 0
• CRn set to c15
• CRm set to c12
• Opcode_2 set to 3.
For example:
MRC p15, 0, <Rd>, c15, c12, 3    ; Read Count Register 1
MCR p15, 0, <Rd>, c15, c12, 3    ; Write Count Register 1
The value in Count Register 1 is 0 at Reset.
You can use the Performance Monitor Control Register to set Count Register 1 to zero.
3.2.55 c15, System Validation Counter Register
The purpose of the System Validation Counter Register is to count core clock cycles to trigger
a system validation event.
The System Validation Counter Register is:
• in CP15 c15
• a 32-bit read/write register common to the Secure and Non-secure worlds
• accessible in User and Privileged modes.
The System Validation Counter Register consists of one 32-bit register that performs four
functions. Table 3-142 lists the arrangement of the functions in this group. The reset value is 0.
Table 3-142 System validation counter register operations
CRn   Opcode_1   CRm   Opcode_2   R/W   Operation
c15   0          c12   1          R/W   Reset counter
                       2          R/W   Interrupt counter
                       3          R/W   Fast interrupt counter
                       7          W     External debug request counter
The reset, interrupt, and fast interrupt counters are 32 bits wide. The external debug request
counter is 6 bits wide. Figure 3-74 on page 3-141 shows the arrangement of bits for the external
debug request counter.
Bits [31:6]   SBZ/UNP
Bits [5:0]    Counter value
Figure 3-74 System Validation Counter Register format for external debug request counter
Table 3-143 lists the results of attempted access for each mode. Access in Secure User mode and
in the Non-secure world depends on the V bit, see c15, Secure User and Non-secure Access
Validation Control Register on page 3-132.
Table 3-143 Results of access to the System Validation Counter Register
                                    Secure Privileged        Non-secure Privileged                       User
Function                   V bit    Read            Write    Read                  Write                 Read                  Write
Reset, interrupt, and      0        Data            Data     Undefined exception   Undefined exception   Undefined exception   Undefined exception
fast interrupt counters    1        Data            Data     Data                  Data                  Data                  Data
External debug request     0        Unpredictable   Data     Undefined exception   Undefined exception   Undefined exception   Undefined exception
counter                    1        Unpredictable   Data     Unpredictable         Data                  Unpredictable         Data
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
To use the System Validation Counter Register read or write CP15 with:
• Opcode_1 set to 0
• CRn set to c15
• CRm set to c12
• Opcode_2 set to:
— 1, Read/write reset counter
— 2, Read/write interrupt counter
— 3, Read/write fast interrupt counter
— 7, Write external debug request counter.
For example:
MRC p15, 0, <Rd>, c15, c12, 1    ; Read reset counter
MCR p15, 0, <Rd>, c15, c12, 1    ; Write reset counter
MRC p15, 0, <Rd>, c15, c12, 2    ; Read interrupt counter
MCR p15, 0, <Rd>, c15, c12, 2    ; Write interrupt counter
MRC p15, 0, <Rd>, c15, c12, 3    ; Read fast interrupt counter
MCR p15, 0, <Rd>, c15, c12, 3    ; Write fast interrupt counter
MCR p15, 0, <Rd>, c15, c12, 7    ; Write external debug request counter
A read or write to the System Validation Counter Register with a value of Opcode_2 other than
1, 2, 3, or 7 has no effect.
When the system starts the counters they count up, incrementing by one on each core clock
cycle, until they wrap around. When the counters wrap around they cause the specified event to
occur. See c15, System Validation Operations Register on page 3-142.
The reset, interrupt, and fast interrupt counters reuse the Cycle Counter Register, Count Register
0 and Count Register 1 of the System performance monitor registers respectively, see System
performance monitor on page 3-10. You must not use the System Validation Counter Register
when the System Performance Monitor Registers are in use.
The reset, interrupt, and fast interrupt counters are read/write. The external debug request
counter is write only. Attempts to read the external debug request counter return 0x00000000
regardless of the actual value of the counter.
3.2.56 c15, System Validation Operations Register
The purpose of the System Validation Operations Register is to start and stop system validation
counters to trigger a system validation event.
The System Validation Operations Register is:
• in CP15 c15
• a 32-bit read/write register common to the Secure and Non-secure worlds
• accessible in User and Privileged modes.
The System Validation Operations Register consists of one 32-bit register that performs 16
functions. Table 3-144 lists the arrangement of the functions in this group.
Table 3-144 System Validation Operations Register functions
CRn   Opcode_1   CRm   Opcode_2   R/W   Operation
c15   0          c13   1          W     Start reset counter
                       2          W     Start interrupt counter
                       3          W     Start reset and interrupt counters
                       4          W     Start fast interrupt counter
                       5          W     Start reset and fast interrupt counters
                       6          W     Start interrupt and fast interrupt counters
                       7          W     Start reset, interrupt and fast interrupt counters
c15   1          c13   0-7        W     Start external debug request counter
c15   2          c13   1          W     Stop reset counter
                       2          W     Stop interrupt counter
                       3          W     Stop reset and interrupt counters
                       4          W     Stop fast interrupt counter
                       5          W     Stop reset and fast interrupt counters
                       6          W     Stop interrupt and fast interrupt counters
                       7          W     Stop reset, interrupt and fast interrupt counters
c15   3          c13   0-7        W     Stop external debug request counter
A write to the System Validation Operations Register with a combination of Opcode_1 and
Opcode_2 that Table 3-144 does not list has no effect. A read from the System Validation
Operations Register returns 0x00000000.
The reset value of this register is 0.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-145 lists the results of attempted access for each mode. Access in Secure User mode and
in the Non-secure world depends on the V bit, see c15, Secure User and Non-secure Access
Validation Control Register on page 3-132.
Table 3-145 Results of access to the System Validation Operations Register
         Secure Privileged        Non-secure Privileged                       User
V bit    Read            Write    Read                  Write                 Read                  Write
0        Unpredictable   Data     Undefined exception   Undefined exception   Undefined exception   Undefined exception
1        Unpredictable   Data     Unpredictable         Data                  Unpredictable         Data
To use the System Validation Operations Register write CP15 with <Rd> set to SBZ and:
• Opcode_1 set to:
— 0, Start reset, interrupt, or fast interrupt counters
— 1, Start external debug request counter
— 2, Stop reset, interrupt, or fast interrupt counters
— 3, Stop external debug request counter.
• CRn set to c15
• CRm set to c13
• Opcode_2 set to:
— 1, Reset counter
— 2, Interrupt counter
— 3, Reset and interrupt counters
— 4, Fast interrupt counter
— 5, Reset and fast interrupt counters
— 6, Interrupt and fast interrupt counters
— 7, Reset, interrupt and fast interrupt counters
— Any value, External debug request counter.
For example:
MCR p15, 0, <Rd>, c15, c13, 1    ; Start reset counter
MCR p15, 0, <Rd>, c15, c13, 2    ; Start interrupt counter
MCR p15, 0, <Rd>, c15, c13, 3    ; Start reset and interrupt counters
MCR p15, 0, <Rd>, c15, c13, 4    ; Start fast interrupt counter
MCR p15, 0, <Rd>, c15, c13, 5    ; Start reset and fast interrupt counters
MCR p15, 0, <Rd>, c15, c13, 6    ; Start interrupt and fast interrupt counters
MCR p15, 0, <Rd>, c15, c13, 7    ; Start reset, interrupt and fast interrupt counters
MCR p15, 1, <Rd>, c15, c13, 0    ; Start external debug request counter
MCR p15, 2, <Rd>, c15, c13, 1    ; Stop reset counter
MCR p15, 2, <Rd>, c15, c13, 2    ; Stop interrupt counter
MCR p15, 2, <Rd>, c15, c13, 3    ; Stop reset and interrupt counters
MCR p15, 2, <Rd>, c15, c13, 4    ; Stop fast interrupt counter
MCR p15, 2, <Rd>, c15, c13, 5    ; Stop reset and fast interrupt counters
MCR p15, 2, <Rd>, c15, c13, 6    ; Stop interrupt and fast interrupt counters
MCR p15, 2, <Rd>, c15, c13, 7    ; Stop reset, interrupt and fast interrupt counters
MCR p15, 3, <Rd>, c15, c13, 0    ; Stop external debug request counter
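The Opcode_2 values for the reset, interrupt, and fast interrupt counters behave like a 3-bit mask, so the Opcode_1 and Opcode_2 pair for any listed operation can be derived rather than looked up. The following C sketch shows one way to do that. The enum, struct, and function names are hypothetical, and the sketch only computes the opcode pair, it does not issue the MCR instruction.

```c
#include <assert.h>

/* Counter selection bits, matching the Opcode_2 values in Table 3-144
 * (hypothetical names). */
enum { SYSVAL_RESET = 1u, SYSVAL_IRQ = 2u, SYSVAL_FIQ = 4u };

struct sysval_op {
    unsigned opcode_1;
    unsigned opcode_2;
};

/* Start (start != 0) or stop (start == 0) any combination of the reset,
 * interrupt and fast interrupt counters. */
static struct sysval_op sysval_counters_op(int start, unsigned counters)
{
    assert(counters >= 1u && counters <= 7u);
    struct sysval_op op = { start ? 0u : 2u, counters };
    return op;
}

/* Start or stop the external debug request counter. Any Opcode_2 value is
 * accepted by the register, 0 is used here as in the examples. */
static struct sysval_op sysval_edbgrq_op(int start)
{
    struct sysval_op op = { start ? 1u : 3u, 0u };
    return op;
}
```

For example, stopping the reset and fast interrupt counters together gives Opcode_1 = 2 and Opcode_2 = 5, which matches the listing.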
You use the System Validation Operations Register to start and stop the reset, interrupt, fast
interrupt, and external debug request counters. When the system starts any of these counters,
they count up incrementing by one every core clock cycle, until they wrap around. When the
counters wrap around they cause nVALRESET, nVALIRQ, nVALFIQ, or VALEDBGRQ to
go LOW depending on the operation. You can use these outputs to generate system Reset,
Interrupt request, Fast Interrupt request, or External Debug Request events. You can use the
System Validation Counter Register to set the start value of the counters, see c15, System
Validation Counter Register on page 3-140. Any number of events can occur simultaneously.
When you use the System Validation Operations Register to start a counter, there is a delay of
one clock cycle, which generally corresponds to one instruction, before the count begins. If you
require an event to occur on the next instruction, insert a NOP instruction between the MCR
instruction to the System Validation Operations Register that starts the counter and the
instruction on which you want the event to occur.
You must leave two clock cycles, which generally correspond to two instructions, between a write
to a counter with the System Validation Counter Register and the start of that count with the
System Validation Operations Register.
After the system stops the reset, interrupt or fast interrupt counters, or after handling the events
they cause, you must explicitly clear the counters to return them to their System performance
monitoring function. To do this set bits in <Rd> and write to the Performance Monitor Control
Register to clear the relevant overflow flags:
• bit [10] to clear the reset counter
• bit [9] to clear the fast interrupt counter
• bit [8] to clear the interrupt counter.
You must carry out this operation with a read-modify-write sequence to avoid changes to other
bits, see c15, Performance Monitor Control Register on page 3-133. You do not have to clear
the external debug request counter explicitly in this way because it is not used for system
performance monitoring.
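The modify step of that read-modify-write sequence can be sketched in C as a pure function that computes the value to write back to the Performance Monitor Control Register. This is a minimal sketch, not reference code: the macro and function names are hypothetical, and it assumes that writing 0 to an overflow flag bit leaves that flag unchanged, so flags that were not requested are masked out of the value read before the requested bits are set.

```c
#include <assert.h>
#include <stdint.h>

/* Overflow flag bits named in the text (hypothetical macro names). */
#define PMCR_FLAG_RESET_CTR  (1u << 10)  /* reset counter */
#define PMCR_FLAG_FIQ_CTR    (1u << 9)   /* fast interrupt counter */
#define PMCR_FLAG_IRQ_CTR    (1u << 8)   /* interrupt counter */
#define PMCR_FLAG_MASK \
    (PMCR_FLAG_RESET_CTR | PMCR_FLAG_FIQ_CTR | PMCR_FLAG_IRQ_CTR)

/* Given the value read from the register, return the value to write back so
 * that only the requested flags are set in the written word.
 * ASSUMPTION: a 0 written to a flag bit has no effect on that flag. */
static uint32_t pmcr_value_to_clear(uint32_t pmcr_read, uint32_t flags)
{
    assert((flags & ~PMCR_FLAG_MASK) == 0u);
    return (pmcr_read & ~PMCR_FLAG_MASK) | flags;
}
```

All bits outside [10:8] are passed through unchanged, which is the point of using a read-modify-write sequence.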
The reset, interrupt, and fast interrupt counters reuse the Cycle Counter Register, Count Register
0 and Count Register 1 of the System performance monitor registers respectively, see System
performance monitor on page 3-10. As a result you must not perform read or write operations
to the System Validation Counter Register when the System performance monitor registers are
in use.
The System Validation Operations Register is write only and attempts to read this register are
reserved and return 0x00000000.
To schedule system validation events follow this procedure:
1. Modify the Secure User and Non-secure Access Validation Control Register to permit
access from User or Non-secure modes if this is required.
2. Use the System Validation Counter Register to load the required counter with 0xFFFFFFFF
minus the number of core clock cycles to wait before the event occurs.
3. Use the System Validation Operations Register to start the required counter.
4. Use the System Validation Operations Register to stop the required counter,
after the event has occurred or as necessary.
5. Use the Performance Monitor Control Register to reset the counters and return them to
System performance monitoring functionality.
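The arithmetic in the counter-loading step of this procedure can be sketched in C as follows. The function name is hypothetical, and the sketch covers only the 32-bit reset, interrupt, and fast interrupt counters. It computes the value to load and does not perform the CP15 write.

```c
#include <assert.h>
#include <stdint.h>

/* Value to load into a 32-bit system validation counter so that the event
 * occurs after cycles_to_wait core clock cycles: 0xFFFFFFFF minus the
 * number of cycles, as the procedure describes. */
static uint32_t sysval_counter_load_value(uint32_t cycles_to_wait)
{
    return 0xFFFFFFFFu - cycles_to_wait;
}
```

When choosing the cycle count, remember the start-up delay and the two clock cycles required between writing a counter and starting it.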
3.2.57 c15, System Validation Cache Size Mask Register
The purpose of the System Validation Cache Size Mask Register is to change the apparent size
of the caches and TCMs as they appear to the processor, for validation by simulation. It does not
change the physical size of the caches and TCMs in a manufactured device.
The System Validation Cache Size Mask Register is:
• in CP15 c15
• a 32-bit read/write register common to the Secure and Non-secure worlds
• accessible in User and Privileged modes.
Figure 3-75 shows the arrangement of bits for the System Validation Cache Size Mask Register.
Bit [31]       Write enable
Bits [30:15]   SBZ
Bits [14:12]   DTCM
Bit [11]       SBZ
Bits [10:8]    ITCM
Bit [7]        SBZ
Bits [6:4]     DCache
Bit [3]        SBZ
Bits [2:0]     ICache
Figure 3-75 System Validation Cache Size Mask Register format
Table 3-146 lists how the bit values correspond with the System Validation Cache Size Mask
Register functions.
Table 3-146 System Validation Cache Size Mask Register bit functions
Bits      Field name     Function
[31]      Write enable   Enables the update of the Cache and TCM sizes:
                         0 = The Cache and TCM sizes are not changed, reset value.
                         1 = The Cache and TCM sizes take the new values that the DTCM, ITCM, DCache and
                         ICache fields of this register specify.
                         Note
                         This bit is write access only and Read As Zero.
[30:15]   SBZ            UNP/SBZ.
[14:12]   DTCM           Specifies apparent size of Data TCM and apparent number of Data TCM banks, as it
                         appears to the processor. All other values are reserved:
                         b000 = Not present
                         b011 = 1 bank, 4KB
                         b100 = 2 banks, 4KB each
                         b101 = 2 banks, 8KB each
                         b110 = 2 banks, 16KB each
                         b111 = 2 banks, 32KB each.
[11]      SBZ            UNP/SBZ.
[10:8]    ITCM           Specifies apparent size of Instruction TCM and apparent number of Instruction TCM
                         banks, as it appears to the processor. All other values are reserved:
                         b000 = Not present
                         b011 = 1 bank, 4KB
                         b100 = 2 banks, 4KB each
                         b101 = 2 banks, 8KB each
                         b110 = 2 banks, 16KB each
                         b111 = 2 banks, 32KB each.
[7]       SBZ            UNP/SBZ.
[6:4]     DCache         Specifies apparent size of Data Cache, as it appears to the processor. All other
                         values are reserved:
                         b011 = 4KB
                         b100 = 8KB
                         b101 = 16KB
                         b110 = 32KB
                         b111 = 64KB.
[3]       SBZ            UNP/SBZ.
[2:0]     ICache         Specifies apparent size of Instruction Cache, as it appears to the processor. All
                         other values are reserved:
                         b011 = 4KB
                         b100 = 8KB
                         b101 = 16KB
                         b110 = 32KB
                         b111 = 64KB.
At reset, the values in the System Validation Cache Size Mask Register are the correct values
for the implemented caches and TCMs.
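To illustrate how the fields of Table 3-146 combine into a single register value, the following C sketch builds a value with the Write enable bit set. It is a sketch for a validation simulation environment only: the function names are hypothetical, and the TCM fields are passed as raw 3-bit encodings from the table.

```c
#include <assert.h>
#include <stdint.h>

/* Map a cache size in KB to the DCache/ICache field encoding.
 * Returns 0 for sizes that the table lists as reserved. */
static unsigned cache_size_field(unsigned size_kb)
{
    switch (size_kb) {
    case 4:  return 3u;   /* b011 */
    case 8:  return 4u;   /* b100 */
    case 16: return 5u;   /* b101 */
    case 32: return 6u;   /* b110 */
    case 64: return 7u;   /* b111 */
    default: return 0u;   /* reserved */
    }
}

/* Build a register value with Write enable (bit [31]) set.
 * dtcm and itcm are the 3-bit encodings from Table 3-146. */
static uint32_t cache_size_mask_value(unsigned dtcm, unsigned itcm,
                                      unsigned dcache, unsigned icache)
{
    assert(dtcm <= 7u && itcm <= 7u && dcache <= 7u && icache <= 7u);
    return (1u << 31) | ((uint32_t)dtcm << 12) | ((uint32_t)itcm << 8) |
           ((uint32_t)dcache << 4) | (uint32_t)icache;
}
```

For example, masking to a 16KB Data Cache and an 8KB Instruction Cache with both TCMs not present gives 0x80000054.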
Access to the System Validation Cache Size Mask Register in Secure User mode and in the
Non-secure world depends on the V bit, see c15, Secure User and Non-secure Access Validation
Control Register on page 3-132. Table 3-147 lists the results of attempted access for each mode.
Table 3-147 Results of access to the System Validation Cache Size Mask Register
         Secure Privileged   Non-secure Privileged                       User
V bit    Read    Write       Read                  Write                 Read                  Write
0        Data    Data        Undefined exception   Undefined exception   Undefined exception   Undefined exception
1        Data    Data        Data                  Data                  Data                  Data
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
To use the System Validation Cache Size Mask Register read or write CP15 with:
• Opcode_1 set to 0
• CRn set to c15
• CRm set to c14
• Opcode_2 set to 0.
For example:
MRC p15, 0, <Rd>, c15, c14, 0    ; Read System Validation Cache Size Mask Register
MCR p15, 0, <Rd>, c15, c14, 0    ; Write System Validation Cache Size Mask Register
You can use the System Validation Cache Size Mask Register, in a validation simulation
environment, to perform validation with cache and TCM sizes that appear to be a different size
from those that are actually implemented. The validation environment for the processor contains
validation RAMs that support cache and TCM size masking using this register. When you write
to the System Validation Cache Size Mask Register, the processor behaves as though the caches
and TCMs are the sizes that are written to the register. The sizes written to the register are
reflected in:
• The sizes of the cache and TCM RAMs.
• The sizes of the caches in the Cache Type Register, see c0, Cache Type Register on
page 3-21, the number of Instruction and Data TCM banks in the TCM Status Register,
see c0, TCM Status Register on page 3-24, the sizes of the TCMs in the Instruction TCM
Region Register, see c9, Instruction TCM Region Register on page 3-92, and the Data
TCM Region Register, see c9, Data TCM Region Register on page 3-90.
• The number and use of cache master valid bits, see Cache Master Valid Registers on
page 3-8.
• The hazard detection logic that prevents the same line being allocated twice into the
caches.
• The DMA. If the TCMs are both masked as not present, then the DMA also appears not
to be present.
Note
You must not modify the System Validation Cache Size Mask Register in a manufactured
device. Physical RAMs do not support cache and TCM size masking. Therefore, any attempt to
mask cache and TCM sizes using this register causes address aliasing effects and problems with
cache master valid bits, that result in incorrect operation and Unpredictable effects.
3.2.58 c15, Instruction Cache Master Valid Register
The purpose of the Instruction Cache Master Valid Register is to save and restore the instruction
cache master valid bits on entry to and exit from dormant mode, see Dormant mode on
page 10-4. You might also use this register during debug.
The Instruction Cache Master Valid Register is:
• in CP15 c15
• a 32-bit read/write register in the Secure world only
• accessible in privileged modes only.
The number of Master Valid bits in the register is a function of the cache size. There is one
Master Valid bit for each 8 cache lines:
Master Valid bits = cache size / (line length in bytes × 8)
For instance, there are 64 Master Valid bits for a 16KB cache. You can access Master Valid bits
through 32-bit registers indexed using Opcode_2. The maximum number of 32-bit registers
required for the largest cache size, 64KB, is 8. The Master Valid bits fill the registers from the
LSB of the lowest numbered register upwards.
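The relationship between cache size, the number of Master Valid bits, and the number of 32-bit registers needed to hold them can be sketched in C as follows. The function names are hypothetical. The line length is left as a parameter, and a 32-byte line is assumed in the usage below because it reproduces the figure of 64 Master Valid bits quoted for a 16KB cache.

```c
#include <assert.h>

/* Number of Master Valid bits: one bit for each 8 cache lines. */
static unsigned master_valid_bits(unsigned cache_bytes, unsigned line_bytes)
{
    assert(line_bytes != 0u);
    return cache_bytes / (line_bytes * 8u);
}

/* Number of 32-bit Master Valid registers needed to hold those bits,
 * rounded up so that a partly used register still counts. */
static unsigned master_valid_regs(unsigned cache_bytes, unsigned line_bytes)
{
    return (master_valid_bits(cache_bytes, line_bytes) + 31u) / 32u;
}
```

With a 32-byte line, a 64KB cache needs 256 Master Valid bits held in 8 registers, so Opcode_2 ranges from 0 to 7 for the largest cache size.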
Writes to unimplemented Valid bits have no effect, and reads return 0. The reset value is 0.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Attempts to access the register in modes other than Secure privileged result in an Undefined
exception.
To use the Instruction Cache Master Valid Register read or write CP15 with:
• Opcode_1 set to 3
• CRn set to c15
• CRm set to c8
• Opcode_2 set to <Register Number>.
For example:
MRC p15, 3, <Rd>, c15, c8, <Register Number>    ; Read Instruction Cache Master Valid Register
MCR p15, 3, <Rd>, c15, c8, <Register Number>    ; Write Instruction Cache Master Valid Register
The <Register Number> field of the instruction designates one of the registers required to
capture all the Valid bits. The highest Register Number is one less than the number of times 8KB
divides into the cache size.
3.2.59 c15, Data Cache Master Valid Register
The purpose of the Data Cache Master Valid Register is to save and restore the Data cache
master valid bits on entry to and exit from dormant mode, see Dormant mode on page 10-4. You
might also use this register during debug.
The Data Cache Master Valid Register is:
• in CP15 c15
• a 32-bit read/write register in the Secure world only
• accessible in privileged modes only.
The number of Master Valid bits in the register is a function of the cache size. There is one
Master Valid bit for each 8 cache lines:
Master Valid bits = cache size / (line length in bytes × 8)
For instance, there are 64 Master Valid bits for a 16KB cache. You can access Master Valid bits
through 32-bit registers indexed using Opcode_2. The maximum number of 32-bit registers
required for the largest cache size, 64KB, is 8. The Master Valid bits fill the registers from the
LSB of the lowest numbered register upwards.
Writes to unimplemented Valid bits have no effect, and reads return 0. The reset value is 0.
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Attempts to access the register in modes other than Secure privileged result in an Undefined
exception.
To use the Data Cache Master Valid Register read or write CP15 with:
• Opcode_1 set to 3
• CRn set to c15
• CRm set to c12
• Opcode_2 set to <Register Number>.
For example:
MRC p15, 3, <Rd>, c15, c12, <Register Number>    ; Read Data Cache Master Valid Register
MCR p15, 3, <Rd>, c15, c12, <Register Number>    ; Write Data Cache Master Valid Register
The <Register Number> field of the instruction designates one of the registers required to
capture all the Valid bits. The highest Register Number is one less than the number of times 8KB
divides into the cache size.
3.2.60 c15, TLB lockdown access registers
The purpose of the TLB lockdown access registers is to provide read and write access to the
contents of the lockdown region of the TLB. The processor requires these registers to enable it
to save state before it enters Dormant mode, see Dormant mode on page 10-4. You might also
use these registers for debug.
The TLB lockdown access registers are:
• in CP15 c15
• four 32-bit read/write registers in the Secure world only:
— TLB Lockdown Index Register
— TLB Lockdown VA Register
— TLB Lockdown PA Register
— TLB Lockdown Attributes Register.
• accessible in privileged modes only.
The four registers have different bit arrangements and functions. Figure 3-76 shows the
arrangement of bits in the TLB Lockdown Index Register.
Bits [31:3]   UNP/SBZ
Bits [2:0]    Index
Figure 3-76 TLB Lockdown Index Register format
Table 3-148 lists how the bit values correspond with the TLB Lockdown Index Register
functions.
Table 3-148 TLB Lockdown Index Register bit functions
Bits     Field name   Function
[31:3]   -            UNP/SBZ.
[2:0]    Index        Selects the lockdown entry of the eight TLB lockdown entries to read or write when
                      accessing other TLB lockdown access registers.
                      Select lockdown entry 0 to 7.
Figure 3-77 shows the arrangement of bits in the TLB Lockdown VA Register.
Bits [31:12]   VA
Bits [11:10]   SBZ
Bit [9]        G
Bit [8]        SBZ
Bits [7:0]     ASID
Figure 3-77 TLB Lockdown VA Register format
Table 3-149 lists how the bit values correspond with the TLB Lockdown VA Register functions.
Table 3-149 TLB Lockdown VA Register bit functions
Bits      Field name   Function
[31:12]   VA           Holds the VA of this page table entry.
[11:10]   -            UNP/SBZ.
[9]       G            Defines if this page table entry is global, applies to all ASIDs, or
                       application-specific, ASID must match on lookups:
                       0 = Application-specific entry
                       1 = Global entry.
[8]       -            UNP/SBZ.
[7:0]     ASID         Holds the ASID for application-specific page table entries. For global entries,
                       this field Should Be Zero.
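As an illustration of the field layout in Table 3-149, the following C sketch packs a TLB Lockdown VA Register value. The function name is hypothetical. It follows the table in writing zero to the ASID field for global entries and to the UNP/SBZ bits.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a TLB Lockdown VA Register value.
 * va:     virtual address, only bits [31:12] are used
 * global: nonzero for a global entry (G = 1)
 * asid:   ASID for application-specific entries, ignored when global */
static uint32_t tlb_lockdown_va_pack(uint32_t va, int global, unsigned asid)
{
    assert(asid <= 0xFFu);
    uint32_t value = va & 0xFFFFF000u;   /* VA in bits [31:12] */
    if (global)
        value |= 1u << 9;                /* G bit, ASID field left as zero */
    else
        value |= (uint32_t)asid;         /* ASID in bits [7:0] */
    return value;
}
```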
Figure 3-78 shows the arrangement of bits in the TLB Lockdown PA Register.
Bits [31:12]   PA
Bits [11:10]   SBZ
Bit [9]        NSA
Bit [8]        NSTID
Bits [7:6]     Size
Bits [5:4]     SBZ
Bit [3]        APX
Bits [2:1]     AP
Bit [0]        V
Figure 3-78 TLB Lockdown PA Register format
Table 3-150 lists how the bit values correspond with the TLB Lockdown PA Register functions.
Table 3-150 TLB Lockdown PA Register bit functions
Bits      Field name   Function
[31:12]   PA           Holds the PA of this page table entry.
[11:10]   -            UNP/SBZ.
[9]       NSA          Defines whether memory accesses in the memory region that this page table entry
                       describes are Secure or Non-secure accesses. This matches the Secure or Non-secure
                       state of the memory being accessed. If the NSTID bit is set, the NSA bit is also set
                       regardless of the written value. This ensures that Non-secure page table entries can
                       only access Non-secure memory, but Secure page table entries can access Secure or
                       Non-secure memory:
                       0 = Memory accesses are Secure
                       1 = Memory accesses are Non-secure.
[8]       NSTID        Defines page table entry as Secure or Non-secure:
                       0 = Entry is Secure
                       1 = Entry is Non-secure.
[7:6]     Size         Defines the size of the memory region that this page table entry describes:
                       b00 = 16MB supersection
                       b01 = 4KB page
                       b10 = 64KB page
                       b11 = 1MB section.
[5:4]     -            UNP/SBZ.
[3]       APX          Access permissions extension bit.
                       Defines the access permissions for this page table entry. See Table 3-151.
[2:1]     AP           Access permissions, or first sub-page access permissions if the page table entry
                       supports sub-pages.
[0]       V            Indicates if this page table entry is valid:
                       0 = Entry is not valid
                       1 = Entry is valid.
Table 3-151 lists the encoding for the access permissions for bit fields APX and AP.
Table 3-151 Access permissions APX and AP bit fields encoding
APX   AP    Supervisor permissions   User permissions   Access type
0     b00   No access                No access          All accesses generate a permission fault
0     b01   Read/write               No access          Supervisor access only
0     b10   Read/write               Read only          Writes in user mode generate permission faults
0     b11   Read/write               Read/write         Full access
1     b00   No access                No access          Domain fault encoded field
1     b01   Read only                No access          Supervisor read only
1     b10   Read only                Read only          Supervisor/User read only
1     b11   Read only                Read only          Supervisor/User read only
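The PA register layout and the permission encodings in Table 3-151 can be combined into a small decoder. The following C sketch is illustrative only: the struct and function names are hypothetical, and it decodes a value that has already been read from the TLB Lockdown PA Register.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a TLB Lockdown PA Register value (hypothetical struct). */
struct tlb_lockdown_pa {
    uint32_t pa;      /* bits [31:12], kept in place as an address */
    unsigned nsa;     /* bit [9] */
    unsigned nstid;   /* bit [8] */
    unsigned size;    /* bits [7:6] */
    unsigned apx;     /* bit [3] */
    unsigned ap;      /* bits [2:1] */
    unsigned valid;   /* bit [0] */
};

static struct tlb_lockdown_pa tlb_lockdown_pa_decode(uint32_t reg)
{
    struct tlb_lockdown_pa e;
    e.pa    = reg & 0xFFFFF000u;
    e.nsa   = (reg >> 9) & 1u;
    e.nstid = (reg >> 8) & 1u;
    e.size  = (reg >> 6) & 3u;
    e.apx   = (reg >> 3) & 1u;
    e.ap    = (reg >> 1) & 3u;
    e.valid = reg & 1u;
    return e;
}

/* Region size in bytes for a Size field encoding. */
static uint32_t tlb_lockdown_size_bytes(unsigned size)
{
    static const uint32_t bytes[4] = {
        16u * 1024u * 1024u,  /* b00 = 16MB supersection */
        4u * 1024u,           /* b01 = 4KB page */
        64u * 1024u,          /* b10 = 64KB page */
        1024u * 1024u         /* b11 = 1MB section */
    };
    assert(size <= 3u);
    return bytes[size];
}

/* Table 3-151: nonzero if a User-mode write is permitted. */
static int tlb_lockdown_user_can_write(unsigned apx, unsigned ap)
{
    return apx == 0u && ap == 3u;
}

/* Table 3-151: nonzero if a Supervisor write is permitted. */
static int tlb_lockdown_supervisor_can_write(unsigned apx, unsigned ap)
{
    return apx == 0u && ap != 0u;
}
```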
Figure 3-79 shows the arrangement of bits in the TLB Lockdown Attributes Register.
Bits [31:30]   AP3
Bits [29:28]   AP2
Bits [27:26]   AP1
Bit [25]       SPV
Bits [24:11]   SBZ
Bits [10:7]    Domain
Bit [6]        XN
Bits [5:3]     TEX
Bit [2]        C
Bit [1]        B
Bit [0]        S
Figure 3-79 TLB Lockdown Attributes Register format
Table 3-152 lists how the bit values correspond with the TLB Lockdown Attributes Register
functions.
Table 3-152 TLB Lockdown Attributes Register bit functions
Bits      Field name   Function
[31:30]   AP3          Sub-page access permissions for the fourth sub-page. If the page table entry does
                       not support sub-pages this field Should Be Zero.
[29:28]   AP2          Sub-page access permissions for the third sub-page. If the page table entry does
                       not support sub-pages this field Should Be Zero.
[27:26]   AP1          Sub-page access permissions for the second sub-page. If the page table entry does
                       not support sub-pages this field Should Be Zero.
[25]      SPV          Indicates that this page table entry supports sub-pages. Page table entries that
                       support sub-pages must be marked as Global, see c15, TLB lockdown access registers
                       on page 3-149:
                       0 = Sub-pages are not valid
                       1 = Sub-pages are valid.
[24:11]   SBZ          UNP/SBZ.
[10:7]    Domain       Specifies the Domain number for the page table entry.
[6]       XN           Specifies Execute Never attribute: when set, the contents of the memory region that
                       this page table entry describes cannot be executed as code. An attempt to execute
                       an instruction in this region results in a permission fault:
                       0 = Can execute
                       1 = Cannot execute.
[5:3]     TEX          TEX[2:0] bits. Describes the memory region attributes. See Memory region attributes
                       on page 6-14.
[2]       C            C bit. Describes the memory region attributes. See Memory region attributes on
                       page 6-14.
[1]       B            B bit. Describes the memory region attributes. See Memory region attributes on
                       page 6-14.
[0]       S            Indicates if the memory region that this page table entry describes is shareable:
                       0 = Region is not shared
                       1 = Region is shared.
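The field positions in Table 3-152 can be illustrated with a C sketch that encodes a TLB Lockdown Attributes Register value. The struct and function names are hypothetical. Following the table, the sketch writes zero to AP1, AP2, and AP3 when the entry does not support sub-pages.

```c
#include <assert.h>
#include <stdint.h>

/* Field values for a TLB Lockdown Attributes Register (hypothetical struct). */
struct tlb_lockdown_attr {
    unsigned ap3, ap2, ap1;  /* sub-page permissions, 2 bits each */
    unsigned spv;            /* 1 = sub-pages are valid */
    unsigned domain;         /* 4 bits */
    unsigned xn;             /* 1 = cannot execute */
    unsigned tex;            /* 3 bits */
    unsigned c, b, s;        /* 1 bit each */
};

static uint32_t tlb_lockdown_attr_encode(const struct tlb_lockdown_attr *a)
{
    assert(a->ap3 <= 3u && a->ap2 <= 3u && a->ap1 <= 3u);
    assert(a->domain <= 15u && a->tex <= 7u);
    assert(a->spv <= 1u && a->xn <= 1u && a->c <= 1u && a->b <= 1u && a->s <= 1u);

    uint32_t value = ((uint32_t)a->domain << 7) | ((uint32_t)a->xn << 6) |
                     ((uint32_t)a->tex << 3) | ((uint32_t)a->c << 2) |
                     ((uint32_t)a->b << 1) | (uint32_t)a->s;
    if (a->spv) {
        /* Sub-page fields are only meaningful when SPV = 1. */
        value |= ((uint32_t)a->ap3 << 30) | ((uint32_t)a->ap2 << 28) |
                 ((uint32_t)a->ap1 << 26) | (1u << 25);
    }
    return value;
}
```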
Attempts to write to this register in Secure Privileged mode when CP15SDISABLE is HIGH
result in an Undefined exception, see TrustZone write access disable on page 2-9.
Table 3-153 lists the results of attempted access for each mode.
Table 3-153 Results of access to the TLB lockdown access registers
Secure Privileged     Non-secure Privileged                         User
Read      Write       Read                  Write
Data      Data        Undefined exception   Undefined exception     Undefined exception
To read or write a TLB Lockdown entry, you must use this procedure:
1. Write TLB Lockdown Index Register to select the required TLB Lockdown entry.
2. Read or write TLB Lockdown VA Register.
3. Read or write TLB Lockdown Attributes Register.
4. Read or write TLB Lockdown PA Register. For writes, this sets the valid bit, enabling the
   complete new entry to be used.
This procedure must not be interruptible, so your code must disable interrupts before it accesses
the TLB lockdown access registers.
Note
Software must avoid the creation of inconsistencies between the main TLB entries and the
entries already loaded in the micro-TLBs.
To use the TLB lockdown access registers read or write CP15 with:
• Opcode_1 set to 5
• CRn set to c15
• CRm set to:
  — c4, TLB Lockdown Index Register
  — c5, TLB Lockdown VA Register
  — c6, TLB Lockdown PA Register
  — c7, TLB Lockdown Attributes Register.
• Opcode_2 set to 2.
For example:
MRC p15, 5, <Rd>, c15, c4, 2 ; Read TLB Lockdown Index Register
MCR p15, 5, <Rd>, c15, c4, 2 ; Write TLB Lockdown Index Register
MRC p15, 5, <Rd>, c15, c5, 2 ; Read TLB Lockdown VA Register
MCR p15, 5, <Rd>, c15, c5, 2 ; Write TLB Lockdown VA Register
MRC p15, 5, <Rd>, c15, c6, 2 ; Read TLB Lockdown PA Register
MCR p15, 5, <Rd>, c15, c6, 2 ; Write TLB Lockdown PA Register
MRC p15, 5, <Rd>, c15, c7, 2 ; Read TLB Lockdown Attributes Register
MCR p15, 5, <Rd>, c15, c7, 2 ; Write TLB Lockdown Attributes Register
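When checking a disassembly or hand-assembling these accesses, it can help to see how the operands map onto the instruction word. The following is a minimal sketch in C, assuming the standard ARM-state coprocessor register transfer encoding with the AL condition. The function name is hypothetical and the code is a host-side illustration only.

```c
#include <stdint.h>

/*
 * Encode an ARM-state MRC (is_read != 0) or MCR (is_read == 0) with
 * condition AL. Field positions: opc1 [23:21], L [20], CRn [19:16],
 * Rd [15:12], coprocessor number [11:8], opc2 [7:5], CRm [3:0].
 */
uint32_t cp_reg_transfer(int is_read, unsigned cp, unsigned opc1, unsigned rd,
                         unsigned crn, unsigned crm, unsigned opc2)
{
    return 0xEE000010u                           /* cond = AL, bits [27:24] = 1110, bit [4] = 1 */
         | ((uint32_t)(opc1 & 7u) << 21)
         | ((uint32_t)(is_read ? 1u : 0u) << 20)
         | ((uint32_t)(crn & 15u) << 16)
         | ((uint32_t)(rd & 15u) << 12)
         | ((uint32_t)(cp & 15u) << 8)
         | ((uint32_t)(opc2 & 7u) << 5)
         | (uint32_t)(crm & 15u);
}
```

For example, `cp_reg_transfer(1, 15, 5, 0, 15, 4, 2)` corresponds to `MRC p15, 5, R0, c15, c4, 2`, the read of the TLB Lockdown Index Register.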
Example 3-3 is a code sequence that stores all 8 TLB Lockdown entries to memory, and later
restores them to the TLB Lockdown region. You might use sequences similar to this for entry
into Dormant mode.
Example 3-3 Save and restore all TLB Lockdown entries
        ADR     R1,TLBLockAddr          ; Set R1 to save address
        MOV     R0,#0                   ; Initialize counter
        CPSID   aif                     ; Disable interrupts
TLBLockSave
        MCR     p15,5,R0,c15,c4,2       ; Set TLB Lockdown Index
        MRC     p15,5,R2,c15,c5,2       ; Read TLB Lockdown VA
        MRC     p15,5,R3,c15,c7,2       ; Read TLB Lockdown Attrs
        MRC     p15,5,R4,c15,c6,2       ; Read TLB Lockdown PA
        STMIA   R1!,{R2-R4}             ; Save TLB Lockdown entry
        ADD     R0,R0,#1                ; Increment counter
        CMP     R0,#8                   ; Saved all 8 entries?
        BNE     TLBLockSave             ; Loop until all saved
        CPSIE   aif                     ; Re-enable interrupts

        ; insert other code here

        ADR     R1,TLBLockAddr          ; Set R1 to save address
        MOV     R0,#0                   ; Initialize counter
        CPSID   aif                     ; Disable interrupts
TLBLockLoad
        LDMIA   R1!,{R2-R4}             ; Load TLB Lockdown entry
        MCR     p15,5,R0,c15,c4,2       ; Set TLB Lockdown Index
        MCR     p15,5,R2,c15,c5,2       ; Write TLB Lockdown VA
        MCR     p15,5,R3,c15,c7,2       ; Write TLB Lockdown Attrs
        MCR     p15,5,R4,c15,c6,2       ; Write TLB Lockdown PA
        ADD     R0,R0,#1                ; Increment counter
        CMP     R0,#8                   ; Restored all 8 entries?
        BNE     TLBLockLoad             ; Loop until all restored
        CPSIE   aif                     ; Re-enable interrupts
Chapter 4
Unaligned and Mixed-endian Data Access Support
This chapter describes the unaligned and mixed-endianness data access support for the processor.
It contains the following sections:
• About unaligned and mixed-endian support on page 4-2
• Unaligned access support on page 4-3
• Endian support on page 4-6
• Operation of unaligned accesses on page 4-13
• Mixed-endian access support on page 4-17
• Instructions to reverse bytes in a general-purpose register on page 4-20
• Instructions to change the CPSR E bit on page 4-21.
4.1
About unaligned and mixed-endian support
The processor executes the ARM architecture v6 instructions that support mixed-endian access
in hardware, and assist unaligned data accesses. The extensions to ARMv6 that support
unaligned and mixed-endian accesses include the following:
• CP15 Register c1 has a U bit that enables unaligned support. This bit was specified as zero
  in previous architectures, and resets to zero for legacy-mode compatibility.
• Architecturally defined unaligned word and halfword access specification for hardware
  implementation.
• Byte reverse instructions that operate on general-purpose register contents to support
  signed/unsigned halfword data values.
• Separate instruction and data endianness, with instructions fixed as little-endian format,
  naturally aligned, but with legacy support for 32-bit word-invariant binary images and
  ROM.
• A PSR endian control flag, the E-bit, set to the value of the EE bit on exception entry, see
  c1, Control Register on page 3-44, that adds a byte-reverse operation to the entire load and
  store instruction space as data is loaded into and stored back out of the register file. In
  previous architectures this Program Status Register bit was specified as zero. It is not set
  in legacy code written to conform to architectures prior to ARMv6.
• ARM and Thumb instructions to set and clear the E-bit explicitly.
• A byte-invariant addressing scheme to support fine-grain big-endian and little-endian
  shared data structures, to conform to a shared memory standard.
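The effect of the byte reverse instructions in this list can be modeled on a host in C. The sketch below assumes the ARMv6 REV, REV16, and REVSH instructions, which the section on instructions to reverse bytes in a general-purpose register describes. The lowercase function names are hypothetical stand-ins, not processor interfaces.

```c
#include <stdint.h>

/* REV: reverse the byte order of a 32-bit word. */
uint32_t rev(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: reverse the two bytes within each halfword independently. */
uint32_t rev16(uint32_t x)
{
    return ((x >> 8) & 0x00FF00FFu) | ((x << 8) & 0xFF00FF00u);
}

/* REVSH: reverse the bytes of the low halfword, then sign-extend to 32 bits. */
int32_t revsh(uint32_t x)
{
    uint32_t h = ((x >> 8) & 0xFFu) | ((x & 0xFFu) << 8);
    return (int32_t)h - ((h & 0x8000u) ? 0x10000 : 0);
}
```

REVSH covers the signed halfword case and REV16 the unsigned case, because REV16 leaves the upper halfword in place rather than extending the sign.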
The original ARM architecture was designed as little-endian. This provides a consistent address
ordering of bits, bytes, words, cache lines, and pages, and is assumed by the documentation of
instruction set encoding and memory and register bit significance. Subsequently, big-endian
support was added to enable big-endian byte addressing of memory. A little-endian
nomenclature is used for bit-ordering and byte addressing throughout this manual.
Note
In the TrustZone architecture you can only modify the B bit in the Secure world. The A, U and
EE bits are banked for the Secure and Non-secure worlds, see c1, Control Register on page 3-44.
This means that you can only change the endian behavior of the memory system of the
processor, that the B bit controls, in the Secure world. The B bit is expected to have a static
value.
Unaligned data access, that the U bit controls, the value of the E bit in the CPSR on exceptions,
that the EE bit controls, and strict alignment of data, that the A bit controls, can differ in the
Secure and Non-Secure worlds.
4.2
Unaligned access support
Instructions must always be aligned as follows:
• ARM 32-bit instructions must be word boundary aligned, Address [1:0] = b00
• Thumb 16-bit instructions must be halfword boundary aligned, Address [0] = 0.
The following sections describe unaligned data access support:
• Legacy support
• ARMv6 extensions
• Legacy and ARMv6 configurations on page 4-4
• Legacy data access in ARMv6 (U=0) on page 4-4
• Support for unaligned data access in ARMv6 (U=1) on page 4-4
• ARMv6 unaligned data access restrictions on page 4-5.
4.2.1
Legacy support
For ARM architectures prior to ARM architecture v6, data access to non-aligned word and
halfword data was treated as aligned from the memory interface perspective. That is, the address
is treated as truncated, with Address[1:0] treated as zero for word accesses and Address[0]
treated as zero for halfword accesses.
Load single word ARM instructions are also architecturally defined to rotate right the word
aligned data transferred by a non word-aligned access, see the ARM Architecture Reference
Manual.
Alignment fault checking is specified for processors with architecturally compliant Memory
Management Units (MMUs), under control of CP15 Register c1 A control bit, bit 1. When a
transfer is not naturally aligned to the size of data transferred a Data Abort is signaled with an
Alignment fault status code, see ARM Architecture Reference Manual for more details.
4.2.2
ARMv6 extensions
ARMv6 adds unaligned word and halfword load and store data access support. When enabled,
one or more memory accesses are used to generate the required transfer of adjacent bytes
transparently, apart from a potentially greater access time where the transaction crosses a
word-boundary.
The memory management specification defines a programmable mechanism to enable
unaligned access support. This is controlled and programmed using the CP15 Register c1 U
control bit, bit 22.
Non word-aligned load and store multiple/double, semaphore, synchronization, and
coprocessor accesses always signal a Data Abort with an Alignment Fault status code when the U
bit is set.
Strict alignment checking is also supported in ARMv6, under control of the CP15 Register c1
A control bit, bit [1], and signals a Data Abort with Alignment Fault Status Code if a 16-bit
access is not halfword aligned or a single 32-bit load/store transfer is not word aligned.
ARMv6 alignment fault detection is a mandatory function associated with address generation
rather than optionally supported in external memory management hardware.
4.2.3
Legacy and ARMv6 configurations
Table 4-1 summarizes the unaligned access handling.
Table 4-1 Unaligned access handling
CP15 register c1:
U bit   A bit   Unaligned access model
0       0       Legacy ARMv5. See Legacy data access in ARMv6 (U=0).
0       1       Legacy natural alignment check.
1       0       ARMv6 unaligned half/word access, else strict word alignment check.
1       1       ARMv6 strict half/word alignment check.
4.2.4
Legacy data access in ARMv6 (U=0)
The processor emulates earlier architecture unaligned accesses to memory as follows:
• If the A bit is asserted, alignment faults occur for:
  Halfword access    Address[0] is 1.
  Word access        Address[1:0] is not b00.
  LDRD or STRD       Address[2:0] is not b000.
  Multiple access    Address[1:0] is not b00.
• If alignment faults are enabled and the access is not aligned then the Data Abort vector is
  entered with an Alignment Fault status code.
• If no alignment fault is enabled, that is, if bit 1 of CP15 Register c1, the A bit, is not set:
  Byte access        Memory interface uses full Address[31:0].
  Halfword access    Memory interface uses Address[31:1]. Address[0] asserted as 0.
  Word access        Memory interface uses Address[31:2]. Address[1:0] asserted as 0.
  — ARM load word data is the aligned read data rotated right by the byte offset
    denoted by Address[1:0], see the ARM Architecture Reference Manual.
  — ARM and Thumb load-multiple accesses are always treated as aligned. No rotation of
    read data.
  — ARM and Thumb store word and store multiple are treated as aligned. No rotation of
    write data.
  — ARM load and store doubleword operations are treated as 64-bit aligned.
For more information, see Operation of unaligned accesses on page 4-13.
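The legacy word load rule, with alignment faults disabled, can be illustrated with a small host-side model in C. This is a sketch under stated assumptions: memory is a little-endian byte array indexed by address, and the function names are hypothetical.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n is reduced modulo 32). */
uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/*
 * Model of a legacy (U=0, A=0) ARM LDR: the memory interface uses
 * Address[31:2], and the aligned word is rotated right by
 * 8 * Address[1:0] bits before it reaches the register.
 */
uint32_t legacy_ldr(const uint8_t *mem, uint32_t addr)
{
    uint32_t a = addr & ~3u;
    uint32_t w = (uint32_t)mem[a]
               | ((uint32_t)mem[a + 1] << 8)
               | ((uint32_t)mem[a + 2] << 16)
               | ((uint32_t)mem[a + 3] << 24);
    return ror32(w, 8u * (addr & 3u));
}
```

With the bytes 0x11, 0x22, 0x33, 0x44 at addresses 0 to 3, a load from address 1 returns 0x11443322 in this model: the byte at the requested address ends up in bits [7:0], but the upper bytes wrap around within the same aligned word rather than coming from the next word.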
4.2.5
Support for unaligned data access in ARMv6 (U=1)
The processor memory interfaces can generate unaligned low order byte address offsets only for
halfword and single word load and store operations, and byte accesses unless the A bit is set.
These accesses produce an alignment fault if the A bit is set, and for some of the cases that
ARMv6 unaligned data access restrictions on page 4-5 describes.
If alignment faults are enabled and the access is not aligned then the Data Abort vector is entered
with an Alignment Fault status code.
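The four configurations in Table 4-1 can be derived from a CP15 c1 Control Register value as in the following C sketch. The bit positions, U at bit 22 and A at bit 1, are the ones given earlier in this section. The enumeration and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    MODEL_LEGACY_V5,           /* U=0, A=0 */
    MODEL_LEGACY_ALIGN_CHECK,  /* U=0, A=1 */
    MODEL_V6_UNALIGNED,        /* U=1, A=0 */
    MODEL_V6_STRICT            /* U=1, A=1 */
} align_model;

/* Classify the unaligned access model selected by a Control Register value. */
align_model unaligned_model(uint32_t cp15_c1)
{
    unsigned u = (cp15_c1 >> 22) & 1u;  /* U bit, bit 22 */
    unsigned a = (cp15_c1 >> 1) & 1u;   /* A bit, bit 1 */

    if (u)
        return a ? MODEL_V6_STRICT : MODEL_V6_UNALIGNED;
    return a ? MODEL_LEGACY_ALIGN_CHECK : MODEL_LEGACY_V5;
}
```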
4.2.6
ARMv6 unaligned data access restrictions
The following restrictions apply for ARMv6 unaligned data access:
• Accesses are not guaranteed atomic. They might be synthesized out of a series of aligned
  operations in a shared memory system without guaranteeing locked transaction cycles.
• Unaligned accesses loading the PC produce an alignment trap.
• Accesses typically take a greater number of cycles to complete compared to a naturally
  aligned transfer. The real-time implications must be carefully analyzed and key data
  structures might require their alignment to be adjusted for optimum performance.
• Accesses can abort on either or both halves of an access where this occurs over a page
  boundary. The Data Abort handler must handle restartable aborts carefully after an
  Alignment Fault status code is signaled.
As a result, shared memory schemes must not rely on seeing monotonic updates of non-aligned
data of loads, stores, and swaps for data items greater than byte width. Unaligned access
operations must not be used for accessing Device memory-mapped registers, and must be used
with care in Shared memory structures that are protected by aligned semaphores or
synchronization variables.
An Unalignment trap occurs for unaligned accesses to Strongly Ordered or Device memory when both:
• the MMU is enabled, that is CP15 c1 bit 0, M bit, is 1
• the Subpage AP bits are disabled, that is CP15 c1 bit 23, XP bit, is 1.
Swap and synchronization primitives, and multiple-word or coprocessor accesses, produce an
alignment fault regardless of the setting of the A bit.
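Software that wants to know in advance whether an unaligned access is split, and so is exposed to the non-atomicity and two-abort restrictions above, can test the address range against a boundary. The following C sketch is illustrative only: the function name is hypothetical, and the 4096-byte page size in the usage note is an assumption, not a statement about any particular page table entry.

```c
#include <stdint.h>

/*
 * Return nonzero if an access of `size` bytes starting at `addr` extends
 * past the end of the `boundary`-sized block that contains `addr`.
 * `boundary` must be a power of two and `size` must be in 1..boundary.
 */
int spans_boundary(uint32_t addr, uint32_t size, uint32_t boundary)
{
    return ((addr & (boundary - 1u)) + size) > boundary;
}
```

For example, `spans_boundary(addr, 4, 4)` reports whether a word access crosses a word boundary and so might need more than one memory access, and `spans_boundary(addr, 4, 4096)` reports whether it straddles an assumed 4KB page.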
4.3
Endian support
The architectural specification of unaligned data representations is defined in terms of bytes
transferred between memory and register, regardless of bus width and bus endianness.
Little-endian data items are described using lower-case byte labeling bX…b0, byteX to byte 0,
and a pointer is always treated as pointing to the least significant byte of the addressed data.
Byte invariant, BE-8, big-endian data items are described using upper-case byte labeling
B0…BX, BYTE0 to BYTEX, and a pointer is always treated as pointing to the most significant
byte of the addressed data.
4.3.1
Load unsigned byte, endian independent
The addressed byte is loaded from memory into the low eight bits of the general-purpose register
and the upper 24 bits are zeroed, as Figure 4-1 shows.
[Figure: Memory byte b at Address A[31:0] is loaded into Register bits [7:0]. Register bits [31:8] are 0.]
Figure 4-1 Load unsigned byte
4.3.2
Load signed byte, endian independent
The addressed byte is loaded from the memory into the low eight bits of the general-purpose
register and the sign bit is extended into the upper 24 bits of the register as Figure 4-2 shows.
[Figure: Memory byte b at Address A[31:0] is loaded into Register bits [7:0]. Register bits [31:8] are se.]
Figure 4-2 Load signed byte
In Figure 4-2, se means sign extension of bit [7] of byte b.
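The difference between the unsigned and signed byte loads can be shown with a short host-side model in C. Memory is assumed to be a byte array indexed by address, and the function names are hypothetical.

```c
#include <stdint.h>

/* Unsigned byte load: the byte fills bits [7:0], bits [31:8] are zero. */
uint32_t load_unsigned_byte(const uint8_t *mem, uint32_t addr)
{
    return mem[addr];
}

/* Signed byte load: bit [7] of the byte is copied into bits [31:8]. */
int32_t load_signed_byte(const uint8_t *mem, uint32_t addr)
{
    uint8_t b = mem[addr];
    return (int32_t)b - ((b & 0x80u) ? 0x100 : 0);
}
```

Neither function depends on an endianness setting, because a single byte has no byte order.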
4.3.3
Store byte, endian independent
The low eight bits of the general-purpose register are stored into the addressed byte in memory,
as Figure 4-3 on page 4-7 shows.
[Figure: Register bits [7:0], byte b, are stored to the Memory byte at Address A[31:0]. Register bits [31:8], marked x, are not stored.]
Figure 4-3 Store byte
4.3.4
Load unsigned halfword, little-endian
The addressed byte-pair is loaded from memory into the low 16 bits of the general-purpose
register, and the upper 16 bits are zeroed so that the least-significant addressed byte in memory
appears in bits [7:0] of the ARM register, as Figure 4-4 shows.
[Figure: Memory holds b0 (lsbyte) at Address A[31:0] and b1 (msbyte) at A+1. Register bits [31:16] are 0, bits [15:8] are b1, bits [7:0] are b0.]
Figure 4-4 Load unsigned halfword, little-endian
If strict alignment fault checking is enabled and Address bit 0 is not zero, then a Data Abort is
generated and the MMU returns a Misaligned fault in the Fault Status Register.
4.3.5
Load unsigned halfword, big-endian
The addressed byte-pair is loaded from memory into the low 16 bits of the general-purpose
register, and the upper 16 bits are zeroed so that the most-significant addressed byte in memory
appears in bits [15:8] of the ARM register, as Figure 4-5 on page 4-8 shows.
[Figure: Memory holds B0 (msbyte) at Address A[31:0] and B1 (lsbyte) at A+1. Register bits [31:16] are 0, bits [15:8] are B0, bits [7:0] are B1.]
Figure 4-5 Load unsigned halfword, big-endian
If strict alignment fault checking is enabled and Address bit 0 is not zero, then a Data Abort is
generated and the MMU returns a Misaligned fault in the Fault Status Register.
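The two unsigned halfword loads differ only in which addressed byte becomes the most significant. A minimal host-side sketch in C, assuming a byte array indexed by address and hypothetical function names:

```c
#include <stdint.h>

/* Little-endian: the byte at addr goes to bits [7:0], addr+1 to bits [15:8]. */
uint32_t load_halfword_le(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8);
}

/* Big-endian: the byte at addr goes to bits [15:8], addr+1 to bits [7:0]. */
uint32_t load_halfword_be(const uint8_t *mem, uint32_t addr)
{
    return ((uint32_t)mem[addr] << 8) | (uint32_t)mem[addr + 1];
}
```

The sketch does not model the alignment check, so it accepts any address.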
4.3.6
Load signed halfword, little-endian
The addressed byte-pair is loaded from memory into the low 16-bits of the general-purpose
register, so that the least-significant addressed byte in memory appears in bits [7:0] of the ARM
register and the upper 16 bits are sign-extended from bit 15, as Figure 4-6 shows.
[Figure: Memory holds b0 (lsbyte) at Address A[31:0] and b1 (msbyte) at A+1. Register bits [31:16] are se1, bits [15:8] are b1, bits [7:0] are b0.]
Figure 4-6 Load signed halfword, little-endian
In Figure 4-6, se1 means sign extension of register bit 15, that is, bit [7] of byte b1.
If strict alignment fault checking is enabled and Address bit 0 is not zero, then a Data Abort is
generated and the MMU returns a Misaligned fault in the Fault Status Register.
4.3.7
Load signed halfword, big-endian
The addressed byte-pair is loaded from memory into the low 16-bits of the general-purpose
register, so that the most significant addressed byte in memory appears in bits [15:8] of the ARM
register and bits [31:16] replicate the sign bit in bit 15, as Figure 4-7 on page 4-9 shows.
[Figure: Memory holds B0 (msbyte) at Address A[31:0] and B1 (lsbyte) at A+1. Register bits [31:16] are SE0, bits [15:8] are B0, bits [7:0] are B1.]
Figure 4-7 Load signed halfword, big-endian
In Figure 4-7, SE0 means sign extension of register bit 15, that is, bit [7] of byte B0.
If strict alignment fault checking is enabled and Address bit 0 is not zero, then a Data Abort is
generated and the MMU returns a Misaligned fault in the Fault Status Register.
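For the signed halfword loads, the sign bit comes from a different memory byte in each format: b1 at the higher address for little-endian, B0 at the addressed byte for big-endian. The following C model is a sketch with hypothetical names, using a byte array indexed by address.

```c
#include <stdint.h>

/* Sign-extend a 16-bit value held in the low half of h. */
int32_t sign_extend16(uint32_t h)
{
    return (int32_t)(h & 0xFFFFu) - ((h & 0x8000u) ? 0x10000 : 0);
}

/* Little-endian: the sign is bit [7] of the byte at addr+1. */
int32_t load_signed_halfword_le(const uint8_t *mem, uint32_t addr)
{
    return sign_extend16((uint32_t)mem[addr] | ((uint32_t)mem[addr + 1] << 8));
}

/* Big-endian: the sign is bit [7] of the byte at addr. */
int32_t load_signed_halfword_be(const uint8_t *mem, uint32_t addr)
{
    return sign_extend16(((uint32_t)mem[addr] << 8) | (uint32_t)mem[addr + 1]);
}
```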
4.3.8
Store halfword, little-endian
The low 16 bits of the general-purpose register are stored into the memory with bits [7:0] written
to the addressed byte in memory, bits [15:8] to the incremental byte address in memory, as
Figure 4-8 shows.
[Figure: Register bits [7:0], b0 (lsbyte), are stored to Memory at Address A[31:0] and bits [15:8], b1 (msbyte), to A+1. Register bits [31:16], marked x, are not stored.]
Figure 4-8 Store halfword, little-endian
If strict alignment fault checking is enabled and Address bit 0 is not zero, then a Data Abort is
generated and the MMU returns a Misaligned fault in the Fault Status Register.
4.3.9
Store halfword, big-endian
The low 16 bits of the general-purpose register are stored into the memory with bits [15:8]
written to the addressed byte in memory, bits [7:0] to the incremental byte address in memory,
as Figure 4-9 on page 4-10 shows.
[Figure: Register bits [15:8], B0 (msbyte), are stored to Memory at Address A[31:0] and bits [7:0], B1 (lsbyte), to A+1. Register bits [31:16], marked x, are not stored.]
Figure 4-9 Store halfword, big-endian
If strict alignment fault checking is enabled and Address bit 0 is not zero, then a Data Abort is
generated and the MMU returns a Misaligned fault in the Fault Status Register.
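The halfword stores are the mirror image of the loads. In this C sketch, which uses hypothetical names and a byte array as memory, only the low 16 bits of the register value reach memory.

```c
#include <stdint.h>

/* Little-endian: bits [7:0] go to addr, bits [15:8] to addr+1. */
void store_halfword_le(uint8_t *mem, uint32_t addr, uint32_t reg)
{
    mem[addr]     = (uint8_t)(reg & 0xFFu);
    mem[addr + 1] = (uint8_t)((reg >> 8) & 0xFFu);
}

/* Big-endian: bits [15:8] go to addr, bits [7:0] to addr+1. */
void store_halfword_be(uint8_t *mem, uint32_t addr, uint32_t reg)
{
    mem[addr]     = (uint8_t)((reg >> 8) & 0xFFu);
    mem[addr + 1] = (uint8_t)(reg & 0xFFu);
}
```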
4.3.10
Load word, little-endian
The addressed byte-quad is loaded from memory into the 32-bit general-purpose register so that
the least-significant addressed byte in memory appears in bits [7:0] of the ARM register, as
Figure 4-10 shows.
[Figure: Memory holds b0 (lsbyte) at Address A[31:0], b1 at A+1, b2 at A+2, and b3 (msbyte) at A+3. Register bits [31:24] are b3, bits [23:16] are b2, bits [15:8] are b1, bits [7:0] are b0.]
Figure 4-10 Load word, little-endian
If strict alignment fault checking is enabled and Address bits [1:0] are not zero, then a Data
Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register.
4.3.11
Load word, big-endian
The addressed byte-quad is loaded from memory into the 32-bit general-purpose register so that
the most significant addressed byte in memory appears in bits [31:24] of the ARM register, as
Figure 4-11 on page 4-11 shows.
[Figure: Memory holds B0 (msbyte) at Address A[31:0], B1 at A+1, B2 at A+2, and B3 (lsbyte) at A+3. Register bits [31:24] are B0, bits [23:16] are B1, bits [15:8] are B2, bits [7:0] are B3.]
Figure 4-11 Load word, big-endian
If strict alignment fault checking is enabled and Address bits [1:0] are not zero, then a Data
Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register.
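The word loads follow the same pattern over four bytes. A minimal C model, assuming a byte array indexed by address and hypothetical function names:

```c
#include <stdint.h>

/* Little-endian: the byte at addr is the least significant byte. */
uint32_t load_word_le(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}

/* Big-endian: the byte at addr is the most significant byte. */
uint32_t load_word_be(const uint8_t *mem, uint32_t addr)
{
    return ((uint32_t)mem[addr] << 24)
         | ((uint32_t)mem[addr + 1] << 16)
         | ((uint32_t)mem[addr + 2] << 8)
         | (uint32_t)mem[addr + 3];
}
```

The two results are byte reversals of each other for the same four memory bytes.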
4.3.12
Store word, little-endian
The 32-bit general-purpose register is stored to four bytes in memory where bits [7:0] of the
ARM register are transferred to the least-significant addressed byte in memory, as Figure 4-12
shows.
[Figure: Register bits [7:0], b0 (lsbyte), are stored to Memory at Address A[31:0], bits [15:8], b1, to A+1, bits [23:16], b2, to A+2, and bits [31:24], b3 (msbyte), to A+3.]
Figure 4-12 Store word, little-endian
If strict alignment fault checking is enabled and Address bits [1:0] are not zero, then a Data
Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register.
4.3.13
Store word, big-endian
The 32-bit general-purpose register is stored to four bytes in memory where bits [31:24] of the
ARM register are transferred to the most-significant addressed byte in memory, as Figure 4-13
on page 4-12 shows.
[Figure: Register bits [31:24], B0 (msbyte), are stored to Memory at Address A[31:0], bits [23:16], B1, to A+1, bits [15:8], B2, to A+2, and bits [7:0], B3 (lsbyte), to A+3.]
Figure 4-13 Store word, big-endian
If strict alignment fault checking is enabled and Address bits [1:0] are not zero, then a Data
Abort is generated and the MMU returns a Misaligned fault in the Fault Status Register.
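The word stores can be modeled in the same way as the loads. This C sketch uses hypothetical names and a byte array as memory.

```c
#include <stdint.h>

/* Little-endian: bits [7:0] are written to the lowest address. */
void store_word_le(uint8_t *mem, uint32_t addr, uint32_t reg)
{
    for (unsigned i = 0; i < 4u; i++)
        mem[addr + i] = (uint8_t)(reg >> (8u * i));
}

/* Big-endian: bits [31:24] are written to the lowest address. */
void store_word_be(uint8_t *mem, uint32_t addr, uint32_t reg)
{
    for (unsigned i = 0; i < 4u; i++)
        mem[addr + i] = (uint8_t)(reg >> (8u * (3u - i)));
}
```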
4.3.14
Load double, load multiple, load coprocessor (little-endian, E = 0)
The access is treated as a series of incrementing aligned word loads from memory. The data is
treated as load word data, see Load word, little-endian on page 4-10, where the lowest two
address bits are zeroed. If strict alignment fault checking is enabled and effective Address
bits[1:0] are not zero, then a Data Abort is generated and the MMU returns an Alignment fault
in the Fault Status Register.
4.3.15
Load double, load multiple, load coprocessor (big-endian, E=1)
The access is treated as a series of incrementing aligned word loads from memory. The data is
treated as load word data, see Load word, big-endian on page 4-11, where the lowest two
address bits are zeroed. If strict alignment fault checking is enabled and effective Address
bits[1:0] are not zero, then a Data Abort is generated and the MMU returns an Alignment fault
in the Fault Status Register.
4.3.16
Store double, store multiple, store coprocessor (little-endian, E=0)
The access is treated as a series of incrementing aligned word stores to memory. The data is
treated as store word data, see Store word, little-endian on page 4-11, where the lowest two
address bits are zeroed. If strict alignment fault checking is enabled and effective Address
bits[1:0] are not zero, then a Data Abort is generated and the MMU returns an Alignment fault
in the Fault Status Register.
4.3.17
Store double, store multiple, store coprocessor (big-endian, E=1)
The access is treated as a series of incrementing aligned word stores to memory. The data is
treated as store word data, see Store word, big-endian, where the lowest two address bits are
zeroed. If strict alignment fault checking is enabled and effective Address bits[1:0] are not zero,
then a Data Abort is generated and the MMU returns an Alignment fault in the Fault Status
Register.
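The treatment of double, multiple, and coprocessor transfers as a series of incrementing aligned word accesses can be sketched for the load case as follows. This C model covers only the case where no alignment fault is taken. The function name is hypothetical, memory is a byte array indexed by address, and `e_bit` stands for the E bit value.

```c
#include <stdint.h>

/*
 * Load `count` consecutive words starting at the word-aligned form of
 * `addr`. Each word is assembled as little-endian (e_bit == 0) or
 * big-endian (e_bit != 0) load word data.
 */
void load_multiple(const uint8_t *mem, uint32_t addr, int e_bit,
                   uint32_t *regs, unsigned count)
{
    uint32_t a = addr & ~3u;  /* lowest two address bits are zeroed */

    for (unsigned i = 0; i < count; i++, a += 4u) {
        uint32_t b0 = mem[a], b1 = mem[a + 1], b2 = mem[a + 2], b3 = mem[a + 3];
        regs[i] = e_bit ? (b0 << 24) | (b1 << 16) | (b2 << 8) | b3
                        : b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);
    }
}
```

Unlike the legacy single word load, no rotation is applied when the low address bits are nonzero.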
4.4
Operation of unaligned accesses
This section describes alignment faults and operation of non-faulting accesses of the processor.
Table 4-2 lists the memory access types.
The mechanism for the support of unaligned loads or stores is that if either the Base register or
the index offset of the address is misaligned, then the processor takes two cycles to issue the
instruction. If the resulting address is misaligned, then the instruction performs multiple
memory accesses in ascending order of address.
There is no support for misaligned accesses being atomic, and misaligned accesses to Device
memory might result in Unpredictable behavior.
Table 4-3 on page 4-14 lists details of when an alignment fault must occur for an access and of
when the behavior of an access is architecturally Unpredictable. When an access does not
generate an alignment fault, and is not Unpredictable, details of the precise memory locations
that are accessed are also given in the table.
The access type descriptions used in Table 4-3 on page 4-14 are determined from the load/store
instruction that Table 4-2 lists.
Table 4-2 Memory access types
Access type   ARM instructions
Byte          LDRB, LDRBT, STRB, STRBT
BSync         SWPB, LDREXB, STREXB
Halfword      LDRH, LDRSH, STRH
HWSync        LDREXH, STREXH
WLoad         LDR, LDRT, SWP, load access if U is set to 0
WStore        STR, STRT, SWP, store access if U is set to 0
WSync         LDREX, STREX, SWP, either access if U is set to 1
Two-word      LDRD, STRD
Multi-word    LDC, LDM, RFE, SRS, STC, STM
DWSync        LDREXD, STREXD
The following terminology is used to describe the memory locations accessed:
Byte[X]
This means the byte whose address is X in the current endianness model. The
correspondence between the endianness models is that Byte[A] in the LE
endianness model, Byte[A] in the BE-8 endianness model, and Byte[A EOR 3] in
the BE-32 endianness model are the same actual byte of memory.
Halfword[X] This means the halfword consisting of the bytes whose addresses are X and X+1
in the current endianness model, combined to form a halfword in little-endian
order in the LE endianness model or in big-endian order in the BE-8 or BE-32
endianness model.
Word[X]
This means the word consisting of the bytes whose addresses are X, X+1, X+2,
and X+3 in the current endianness model, combined to form a word in
little-endian order in the LE endianness model or in big-endian order in the BE-8
or BE-32 endianness model.
Note
It is a consequence of these definitions that if X is word-aligned, Word[X]
consists of the same four bytes of actual memory in the same order in the LE and
BE-32 endianness models.
Align(X)
This means X AND 0xFFFFFFFC. That is, X with its least significant two bits forced
to zero to make it word-aligned.
There is no difference between Addr and Align(Addr) on lines where Addr[1:0]
is set to b00. You can use this to simplify the control of when the least significant
bits are forced to zero.
For the Two-word and Multi-word access types, the Memory accessed column only specifies the
lowest word accessed. Subsequent words have addresses constructed by successively
incrementing the address of the lowest word by 4, and are constructed using the same endianness
model as the lowest word.
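The Byte[X] correspondence and the Note about word-aligned Word[X] can be checked with a small C model. In this sketch the byte array holds actual memory as the LE model addresses it, so Byte[X] in the BE-32 model is the array element at X EOR 3. All function names are hypothetical.

```c
#include <stdint.h>

/* Array index of the actual byte that is Byte[x] in the BE-32 model. */
uint32_t be32_byte_index(uint32_t x)
{
    return x ^ 3u;
}

/* Align(X): X with its least significant two bits forced to zero. */
uint32_t align_word(uint32_t x)
{
    return x & 0xFFFFFFFCu;
}

/* Word[x] in the LE model: bytes x..x+3 combined in little-endian order. */
uint32_t word_le(const uint8_t *mem, uint32_t x)
{
    uint32_t w = 0;
    for (unsigned i = 0; i < 4u; i++)
        w |= (uint32_t)mem[x + i] << (8u * i);
    return w;
}

/* Word[x] in the BE-32 model: Byte[x]..Byte[x+3] combined in big-endian order. */
uint32_t word_be32(const uint8_t *mem, uint32_t x)
{
    uint32_t w = 0;
    for (unsigned i = 0; i < 4u; i++)
        w = (w << 8) | mem[be32_byte_index(x + i)];
    return w;
}
```

For a word-aligned X the two word functions return the same value, which is the word-invariant property that the Note describes.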
Table 4-3 Unalignment fault occurrence when access behavior is architecturally unpredictable
A  U  Addr[2:0]         Access types                        Architectural behavior  Memory accessed      Note
0  0  -                 -                                   -                       -                    Legacy, no alignment faulting
0  0  bxxx              Byte, BSync                         Normal                  Byte[Addr]           -
0  0  bxx0              Halfword                            Normal                  Halfword[Addr]       -
0  0  bxx1              Halfword                            Unpredictable           -                    Halfword[Align16(Addr)]; Operation unaffected by Addr[0]
0  0  bxx0              HWSync                              Normal                  Halfword[Addr]       -
0  0  bxx1              HWSync                              Unpredictable           -                    Halfword[Align16(Addr)]; Operation unaffected by Addr[0]
0  0  bxxx              WLoad                               Normal                  Word[Align32(Addr)]  Loaded data rotated by 8*Addr[1:0] bits
0  0  bxxx              WStore                              Normal                  Word[Align32(Addr)]  Operation unaffected by Addr[1:0]
0  0  bx00              WSync                               Normal                  Word[Addr]           -
0  0  bxx1, bx1x        WSync                               Unpredictable           -                    Word[Align32(Addr)]
0  0  bxxx              Multi-word                          Normal                  Word[Align32(Addr)]  Operation unaffected by Addr[1:0]
0  0  b000              Two-word                            Normal                  Word[Addr]           -
0  0  bxx1, bx1x, b1xx  Two-word                            Unpredictable           -                    Same as LDM2 or STM2
0  0  b000              DWSync                              Normal                  Word[Addr]           -
0  0  bxx1, bx1x, b1xx  DWSync                              Unpredictable           -                    DWord[Align64(Addr)]; Operation unaffected by Addr[2:0]
0  1  -                 -                                   -                       -                    ARMv6 unaligned support
0  1  bxxx              Byte, BSync                         Normal                  Byte[Addr]           -
0  1  bxxx              Halfword                            Normal                  Halfword[Addr]       -
0  1  bxx0              HWSync                              Normal                  Halfword[Addr]       -
0  1  bxx1              HWSync                              Alignment fault         -                    -
0  1  bxxx              WLoad, WStore                       Normal                  Word[Addr]           -
0  1  bx00              WSync, Multi-word, Two-word         Normal                  Word[Addr]           -
0  1  bxx1, bx1x        WSync, Multi-word, Two-word         Alignment fault         -                    -
0  1  b000              DWSync                              Normal                  Word[Addr]           -
0  1  bxx1, bx1x, b1xx  DWSync                              Alignment fault         -                    -
1  x  -                 -                                   -                       -                    Full alignment faulting
1  x  bxxx              Byte, BSync                         Normal                  Byte[Addr]           -
1  x  bxx0              Halfword, HWSync                    Normal                  Halfword[Addr]       -
1  x  bxx1              Halfword, HWSync                    Alignment fault         -                    -
1  x  bx00              WLoad, WStore, WSync, Multi-word    Normal                  Word[Addr]           -
1  x  bxx1, bx1x        WLoad, WStore, WSync, Multi-word    Alignment fault         -                    -
1  x  b000              Two-word                            Normal                  Word[Addr]           -
1  0  b100              Two-word                            Alignment fault         -                    -
1  1  b100              Two-word                            Normal                  Word[Addr]           -
1  x  bxx1, bx1x        Two-word                            Alignment fault         -                    -
1  x  b000              DWSync                              Normal                  Word[Addr]           -
1  x  bxx1, bx1x, b1xx  DWSync                              Alignment fault         -                    -
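The Architectural Behavior column of Table 4-3 can be expressed as a lookup function, which is useful when checking whether a given access is expected to fault. The following C sketch is one reading of the table, not a definitive implementation. The type and function names are hypothetical, and it does not model the overriding cases for PC loads or Device memory.

```c
#include <stdint.h>

typedef enum {
    ACC_BYTE, ACC_BSYNC, ACC_HALFWORD, ACC_HWSYNC, ACC_WLOAD, ACC_WSTORE,
    ACC_WSYNC, ACC_TWO_WORD, ACC_MULTI_WORD, ACC_DWSYNC
} access_type;

typedef enum { BEH_NORMAL, BEH_UNPREDICTABLE, BEH_ALIGNMENT_FAULT } behavior;

/* Architectural behavior for an access, given the A and U bits of CP15 c1. */
behavior classify_access(unsigned a_bit, unsigned u_bit, uint32_t addr,
                         access_type t)
{
    unsigned lo1 = addr & 1u, lo2 = addr & 3u, lo3 = addr & 7u;
    /* Misaligned result for the types that are never Normal when misaligned. */
    behavior bad = (a_bit || u_bit) ? BEH_ALIGNMENT_FAULT : BEH_UNPREDICTABLE;

    switch (t) {
    case ACC_BYTE:
    case ACC_BSYNC:
        return BEH_NORMAL;
    case ACC_HALFWORD:
        if (!lo1) return BEH_NORMAL;
        if (a_bit) return BEH_ALIGNMENT_FAULT;
        return u_bit ? BEH_NORMAL : BEH_UNPREDICTABLE;
    case ACC_HWSYNC:
        return lo1 ? bad : BEH_NORMAL;
    case ACC_WLOAD:
    case ACC_WSTORE:
        return (lo2 && a_bit) ? BEH_ALIGNMENT_FAULT : BEH_NORMAL;
    case ACC_WSYNC:
        return lo2 ? bad : BEH_NORMAL;
    case ACC_MULTI_WORD:
        return (lo2 && (a_bit || u_bit)) ? BEH_ALIGNMENT_FAULT : BEH_NORMAL;
    case ACC_TWO_WORD:
        if (lo2) return bad;
        if (lo3 == 0u) return BEH_NORMAL;
        /* Addr[2:0] is b100: Normal only when U is set. */
        return u_bit ? BEH_NORMAL : bad;
    case ACC_DWSYNC:
        return lo3 ? bad : BEH_NORMAL;
    }
    return BEH_UNPREDICTABLE; /* not reached for valid access types */
}
```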
The following causes override the behavior specified in Table 4-3 on page 4-14:
• An LDR instruction that loads the PC, has Addr[1:0] != b00, and is specified in the table
  as having Normal behavior instead has Unpredictable behavior.
  The reason why this applies only to LDR is that most other load instructions are
  Unpredictable regardless of alignment if the PC is specified as their destination register.
  The exceptions are the ARM LDM and RFE instructions, and the Thumb POP instruction. If
  Addr[1:0] != b00 for these instructions, the effective address of the transfer has its two
  least significant bits forced to 0 if A is set to 0 and U is set to 0. Otherwise the behavior
  specified in Table 4-3 on page 4-14 is either Unpredictable or Alignment Fault regardless
  of the destination register.
• Any WLoad, WStore, WSync, Two-word, or Multi-word instruction that accesses Device
  memory, has Addr[1:0] != b00, and is listed in Table 4-3 on page 4-14 as having Normal
  behavior instead has Unpredictable behavior.
• Any Halfword instruction that accesses Device memory, has Addr[0] != 0, and is specified
  in the table as having Normal behavior instead has Unpredictable behavior.
4.5
Mixed-endian access support
The following sections describe mixed-endian data access:
• Legacy fixed instruction and data endianness
• ARMv6 support for mixed-endian data
• Instructions to change the CPSR E bit on page 4-21.
For more information, see the ARM Architecture Reference Manual.
4.5.1
Legacy fixed instruction and data endianness
Prior to ARMv6 the endianness of instructions and data is locked together, and the
configuration of the processor and the external memory system must either be hard-wired or
programmed in the first few instructions of the bootstrap code.
Where the endianness is configurable under program control, the MMU provides a mechanism
in CP15 c1 to set the B bit, that enables byte addressing renaming with 32-bit words. This model
of big-endian access, called BE-32 in this document, relies on a word-invariant view of memory
where an aligned 32-bit word reads and writes the same word of data in memory when
configured as either big-endian or little-endian.
For more information, see Endianness on page 8-42.
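The word-invariant model can be illustrated with a minimal C sketch. It assumes a byte array that holds memory as a little-endian access sees it; the function names are hypothetical and are not part of any ARM interface. Under BE-32, an aligned word access is unchanged, a byte access at address A reaches the byte that a little-endian access finds at A XOR 3, and an aligned halfword access reaches the halfword at A XOR 2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 'mem' holds bytes in little-endian byte-lane order. */

/* Little-endian byte address reached by a BE-32 byte access at 'addr'. */
static uint32_t be32_byte_addr(uint32_t addr)
{
    return addr ^ 3u;
}

/* Little-endian halfword address reached by an aligned BE-32 halfword access. */
static uint32_t be32_halfword_addr(uint32_t addr)
{
    assert((addr & 1u) == 0);   /* legacy model: halfwords are aligned */
    return addr ^ 2u;
}

/* Byte load as seen by software running with the B bit set. */
static uint8_t be32_load_byte(const uint8_t *mem, uint32_t addr)
{
    return mem[be32_byte_addr(addr)];
}

/* An aligned word reads the same value in LE and BE-32 (word-invariant). */
static uint32_t le_load_word(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```

With the word 0x11223344 stored at address 0, a BE-32 byte load from address 0 returns 0x11, the most significant byte, while the aligned word value is the same in both configurations.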
This behavior is still provided for legacy software when the U bit in CP15 Register c1 is zero,
as Table 4-4 lists.
Table 4-4 Legacy endianness using CP15 c1
U   B   Instruction endianness   Data endianness   Description
0   0   LE                       LE                LE, reset condition
0   1   BE-32                    BE-32             Legacy BE, 32-bit word-invariant
4.5.2
ARMv6 support for mixed-endian data
In ARMv6 the instruction and data endianness are separated:
•
instructions are fixed little-endian
•
data accesses can be either little-endian or big-endian as controlled by bit 9, the E bit, of
the Program Status Register.
The value of the E bit on any exception entry, including reset, is determined by the EE bit of
CP15 Register c1.
Fixed little-endian Instructions
Instructions must be naturally aligned and are always treated as being stored in memory in
little-endian format. That is, the PC points to the least-significant-byte of the instruction.
Instructions must be treated as data by exception handlers, decoding SVC calls and Undefined
instructions, for example.
Instructions can also be written as data by debuggers, Just-In-Time (JIT) compilers, or in
operating systems that update exception vectors.
Mixed-endian data access
The operating system typically has a required endian representation of internal data structures,
but applications and device drivers have to work with data shared with other processors, DSP or
DMA interfaces, that might have fixed big-endian or little-endian data formatting.
A byte-invariant addressing mechanism is provided that enables the load/store architecture to be
qualified by the CPSR E bit, which provides byte reversing of big-endian data into, and out of, the
processor register bank transparently. This byte-invariant big-endian representation is referred
to as BE-8 in this document.
Mixed-endian configuration supported on page 4-19 describes the effect on byte, halfword,
word, and multi-word accesses of setting the CPSR E bit when the U bit enables unaligned
support.
Byte data access
The same physical byte in memory is accessed whether big-endian, BE-8, or little-endian:
•
unsigned byte load as Load unsigned byte, endian independent on page 4-6 describes
•
signed byte load as Load signed byte, endian independent on page 4-6 describes
•
byte store as Store byte, endian independent on page 4-6 describes.
Halfword data access
The same two physical bytes in memory are accessed whether big-endian, BE-8, or little-endian.
Big-endian halfword load data is byte-reversed as read into the processor register to ensure
little-endian internal representation, and similarly is byte-reversed on store to memory:
•
unsigned halfword load as Load unsigned halfword, little-endian on page 4-7, LE, and
Load unsigned halfword, big-endian on page 4-7, BE-8 describe
•
signed halfword load as Load signed halfword, little-endian on page 4-8, LE, and Load
signed halfword, big-endian on page 4-8, BE-8 describe
•
halfword store as Store halfword, little-endian on page 4-9, LE, and Store halfword,
big-endian on page 4-9, BE-8 describe.
Word data access
The same four physical bytes in memory are accessed whether big-endian, BE-8, or
little-endian. Big-endian word load data is byte reversed as read into the processor register to
ensure little-endian internal representation, and similarly is byte-reversed on store to memory:
•
word load as Load word, little-endian on page 4-10, LE, and Load word, big-endian on
page 4-10, BE-8 describe
•
word store as Store word, little-endian on page 4-11, LE, and Store word, big-endian on
page 4-11, BE-8 describe.
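The byte, halfword, and word behavior described above can be sketched in C as follows. This is an illustrative model only: memory is a plain byte array, the `e_bit` parameter stands in for the CPSR E bit, and all function names are hypothetical. The same physical bytes are read in both cases, and only the order in which they are assembled into the register value changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte load: the same byte is returned whatever the E bit is. */
static uint8_t load_byte(const uint8_t *mem, size_t size, size_t addr, int e_bit)
{
    (void)e_bit;                /* endian independent */
    assert(addr < size);
    return mem[addr];
}

/* Halfword load: byte-reversed relative to LE when e_bit is set (BE-8). */
static uint16_t load_halfword(const uint8_t *mem, size_t size, size_t addr, int e_bit)
{
    assert(addr + 2 <= size);
    const uint8_t *p = mem + addr;
    if (e_bit)
        return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
    return (uint16_t)(((unsigned)p[1] << 8) | p[0]);
}

/* Word load: lowest address is the most significant byte when e_bit is set. */
static uint32_t load_word(const uint8_t *mem, size_t size, size_t addr, int e_bit)
{
    assert(addr + 4 <= size);
    const uint8_t *p = mem + addr;
    if (e_bit)
        return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
             | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```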
Mixed-endian configuration supported
This behavior is enabled when the U bit in CP15 Register c1 is set. This is only supported when
the B bit in CP15 Register c1 is reset, as Table 4-5 lists.
Table 4-5 Mixed-endian configuration
U   B   E   Instruction endianness   Data endianness   Description
1   0   0   LE                       LE                LE instructions, little-endian data load/store. Unaligned data access permitted.
1   0   1   LE                       BE-8              LE instructions, big-endian data load/store. Unaligned data access permitted.
1   1   0   BE-32                    BE-32             Legacy BE instructions/data.
1   1   1   -                        -                 Reserved.
4.5.3
Reset values of the U, B, and EE bits
Table 4-6 lists how the values of the BIGENDINIT and UBITINIT pins determine the
values of the U, B, and EE bits at reset. The pins determine the reset value of the B bit and both
the Secure and Non-secure reset values of the U and EE bits.
Table 4-6 B bit, U bit, and EE bit settings
BIGENDINIT   UBITINIT   B   U   EE
0            0          0   0   0
0            1          0   1   0
1            0          1   0   0
1            1          0   1   1
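The four rows of Table 4-6 reduce to three simple Boolean relationships. The following C sketch encodes them; the structure and function names are hypothetical and only model the table, not any hardware interface.

```c
#include <assert.h>

/* Reset values of the CP15 c1 B, U, and EE bits (model of Table 4-6). */
struct reset_bits {
    unsigned b;
    unsigned u;
    unsigned ee;
};

static struct reset_bits reset_endian_bits(unsigned bigendinit, unsigned ubitinit)
{
    struct reset_bits r;

    assert(bigendinit <= 1u && ubitinit <= 1u);
    r.u  = ubitinit;                        /* U follows UBITINIT */
    r.b  = bigendinit & (ubitinit ^ 1u);    /* BE-32 only when U is 0 */
    r.ee = bigendinit & ubitinit;           /* BE-8 data only when U is 1 */
    return r;
}
```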
4.6
Instructions to reverse bytes in a general-purpose register
When an application or device driver has to interface to memory-mapped peripheral registers or
shared-memory DMA structures that are not the same endianness as that of the internal data
structures, or the endianness of the Operating System, an efficient way of being able to explicitly
transform the endianness of the data is required. The following new instructions are added to the
ARM and Thumb instruction sets to provide this functionality:
•
reverse word, 4 bytes, register, for transforming big- and little-endian 32-bit
representations
•
reverse halfword and sign-extend, for transforming signed 16-bit representations
•
reverse packed halfwords in a register, for transforming big- and little-endian 16-bit
representations.
ARM1176JZ-S instruction set summary on page 1-30 describes these instructions.
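In ARMv6 these three operations are the REV, REVSH, and REV16 instructions. As a rough illustration of their effect on a register value, the following C functions model each one. The function names are chosen for this sketch, and the code is a behavioral model rather than a replacement for the instructions.

```c
#include <stdint.h>

/* REV: reverse the four bytes of a word. */
static uint32_t rev(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u) | (x << 24);
}

/* REV16: reverse the bytes within each of the two packed halfwords. */
static uint32_t rev16(uint32_t x)
{
    return ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu);
}

/* REVSH: reverse the bytes of the low halfword, then sign-extend to 32 bits. */
static int32_t revsh(uint32_t x)
{
    uint32_t h = ((x & 0xFFu) << 8) | ((x >> 8) & 0xFFu);

    return (h & 0x8000u) ? (int32_t)h - 0x10000 : (int32_t)h;
}
```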
4.6.1
All load and store operations
All load and store instructions take account of the CPSR E bit. Data is transferred directly to
registers when E = 0, and byte reversed if E = 1 for halfword, word, or multiple word transfers.
Operation:
When CPSR[<E-bit>] = 1 then byte reverse load/store data
4.7
Instructions to change the CPSR E bit
ARM and Thumb instructions are provided to set and clear the E-bit efficiently:
SETEND BE
Sets the CPSR E bit
SETEND LE
Resets the CPSR E bit.
These are specified as unconditional operations to minimize pipelined implementation
complexity.
ARM1176JZ-S instruction set summary on page 1-30 describes these instructions.
Chapter 5
Program Flow Prediction
This chapter describes how program flow prediction locates branches in the instruction stream and
the strategies used for determining if a branch is likely to be taken or not. It also describes the two
architecturally-defined SVC functions required for backwards-compatibility with earlier
architectures for flushing the Prefetch Unit (PU) buffers. It contains the following sections:
•
About program flow prediction on page 5-2
•
Branch prediction on page 5-4
•
Return stack on page 5-7
•
Memory Barriers on page 5-8
•
ARM1176JZ-S IMB implementation on page 5-10.
5.1
About program flow prediction
Program flow prediction in the processor is carried out by:
The integer core
Implements static branch prediction and the Return Stack.
The Prefetch Unit
Implements dynamic branch prediction.
The processor is responsible for handling branches the first time they are executed, that is, when
no historical information is available for dynamic prediction by the PU.
The integer core makes static predictions about the likely outcome of a branch early in its
pipeline and then resolves those predictions when the outcome of conditional execution is
known. Condition codes are evaluated at three points in the integer core pipeline, and branches
are resolved as soon as the flags are guaranteed not to be modified by a preceding instruction.
When a branch is resolved, the integer core passes information to the PU so that it can make a
Branch Target Address Cache (BTAC) allocation or update an existing entry as appropriate. The
integer core is also responsible for identifying likely procedure calls and returns to predict the
returns. It can handle nested procedures up to three deep.
The integer core includes:
•
a Static Branch Predictor (SBP)
•
a Return Stack (RS)
•
branch resolution logic
•
a BTAC update interface to the PU
•
a BTAC allocate interface to the PU.
The processor PU is responsible for fetching instructions from the memory system as required
by the integer core, and coprocessors. The PU buffers up to seven instructions in its FIFO to:
•
detect branch instructions ahead of the integer core requirement
•
dynamically predict those that it considers are to be taken
•
provide branch folding of predicted branches if possible
•
identify unconditional procedure return instructions.
This reduces the cycle time of the branch instructions, so increasing processor performance.
The PU includes:
•
a BTAC
•
branch update and allocate logic
•
a Dynamic Branch Predictor (DBP), and associated update mechanism
•
branch folding logic.
It is responsible for providing the integer core with instructions, and for requesting cache
accesses. The pattern of cache accesses is based on the predicted instruction stream as
determined by the dynamic branch prediction mechanism or the integer core flush mechanism.
The BTAC can:
•
be globally flushed by a CP15 instruction
•
have individual entries flushed by a CP15 instruction
•
be enabled or disabled by a CP15 instruction.
For details of CP15 instructions see c7, Cache operations on page 3-69 and Flush operations on
page 3-79.
The BTAC is globally flushed for:
•
Main TLB FCSE PID changes
•
Main TLB context ID changes
•
Global instruction cache invalidation
•
Switches by the integer core from Non-secure to Secure state.
When the processor switches from the Secure to the Non-secure state the Secure Monitor code
is responsible for flushing the BTAC if necessary.
The PU prefetches all instruction types regardless of the state of the integer core. That is, it
performs prefetches in ARM state, Thumb state, and Jazelle state. However the rate at which the
PU is drained is state-dependent, and the functioning of the branch prediction hardware is a
function of the state. Branch prediction is performed in all three states, but branch folding
operates only in ARM and Thumb states.
The PU is responsible for fetching the instruction stream as dictated by:
•
the Program Counter
•
the dynamic branch predictor
•
static prediction results in the integer core
•
procedure calls and returns signaled by the Return Stack residing in the integer core
•
exceptions, instruction aborts, and interrupts signaled by the integer core.
5.2
Branch prediction
In ARM processors that have no PU, the target of a branch is not known until the end of the
Execute stage. At the Execute stage it is known whether or not the branch is taken. The best
performance is obtained by predicting all branches as not taken and filling the pipeline with the
instructions that follow the branch in the current sequential path. In ARM processors without a
PU, an untaken branch requires one cycle and a taken branch requires three or more cycles.
Branch prediction enables the detection of branch instructions before they enter the integer core.
This permits the use of a branch prediction scheme that closely models actual conditional branch
behavior.
The increased pipeline length of the ARM1176JZ-S processor makes the performance penalty
of any changes in program flow, such as branches or other updates to the PC, more significant
than was the case on the ARM9TDMI or ARM1020T processors. Therefore, a significant
amount of hardware is dedicated to prediction of these changes. Two major classes of program
flow are addressed in the ARM1176JZ-S prediction scheme:
1.
Branches, including BL, and BLX immediate, where the target address is a fixed offset
from the program counter. The prediction amounts to an examination of the probability
that a branch passes its condition codes. These branches are handled in the Branch
Predictors.
2.
Loads, Moves, and ALU operations writing to the PC, that can be identified as being likely
to be a return from a procedure call. Two identifiable cases are Loads to the PC from an
address derived from R13, the stack pointer, and Moves or ALU operations to the PC
derived from R14, the Link Register. In these cases, if the calling operation can also be
identified, the likely return address can be stored in a hardware implemented stack, termed
a Return Stack (RS). Typical calling operations are BL and BLX instructions. In addition
Moves or ALU operations to the Link Register from the PC are often preludes to a branch
that serves as a calling operation. The Link Register value derived is the value required for
the RS. This was most commonly done on ARMv4T, before the BLX <register>
instruction was introduced in ARMv5T.
Branch prediction is required in the design to reduce the integer core CPI loss that arises from
the longer pipeline. To improve the branch prediction accuracy, a combination of static and
dynamic techniques is employed. It is possible to disable each of the predictors separately.
5.2.1
Enabling program flow prediction
The enabling of program flow prediction is controlled by the CP15 Register c1 Z bit, bit 11, that
is set to 0 on Reset. See c1, Control Register on page 3-44. The return stack, dynamic predictor,
and static predictor can also be individually controlled using the Auxiliary Control Register. See
c1, Auxiliary Control Register on page 3-49.
5.2.2
Dynamic branch predictor
The first line of branch prediction in the processor is dynamic, through a simple BTAC. It is
virtually addressed and holds virtual target addresses. In addition, a two bit value holds the
prediction history of the branch. If the address mappings change, this cache must be flushed. A
dynamic branch predictor flush is included in the CP15 coprocessor control instructions. Direct
dynamic branch predictor flushes from the main TLB and the integer core are also included.
A BTAC works by storing the existence of branches at particular locations in memory. The
branch target address and a prediction of whether or not the branch might be taken are also stored.
The BTAC provides dynamic prediction of branches, including BL and BLX instructions, in
ARM, Thumb, and Jazelle states. The BTAC is a 128-entry direct-mapped cache structure used
for allocation of Branch Target Addresses for resolved branches. The BTAC uses a 2-bit
saturating prediction history scheme to provide the dynamic branch prediction. When a branch
has been allocated into the BTAC, it is evicted only in the case of a capacity clash, that is, by
another branch at the same index.
The prediction is based on the previous behavior of this branch. The four possible states of the
prediction bits are:
•
strongly predict branch taken
•
weakly predict branch taken
•
weakly predict branch not taken
•
strongly predict branch not taken.
The history is updated for each occurrence of the branch. This updating is scheduled by the
integer core when the branch has been resolved.
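A 2-bit saturating history of this kind is commonly modeled as a counter that moves one step toward "taken" or "not taken" on each resolved branch and stops at either end. The following C sketch shows that textbook scheme. The numeric encoding of the four states and the function names are assumptions made for this illustration; the exact encoding and transitions used by the hardware are not given here.

```c
#include <assert.h>

/* Assumed encoding of the four prediction states. */
enum history_state {
    STRONG_NOT_TAKEN = 0,
    WEAK_NOT_TAKEN   = 1,
    WEAK_TAKEN       = 2,
    STRONG_TAKEN     = 3
};

/* Prediction made from the current history. */
static int history_predict_taken(unsigned state)
{
    assert(state <= STRONG_TAKEN);
    return state >= WEAK_TAKEN;
}

/* Update after the branch is resolved; the counter saturates at both ends. */
static unsigned history_update(unsigned state, int taken)
{
    assert(state <= STRONG_TAKEN);
    if (taken)
        return (state < STRONG_TAKEN) ? state + 1u : state;
    return (state > STRONG_NOT_TAKEN) ? state - 1u : state;
}
```

With this scheme a single mispredicted iteration of a loop moves a strongly taken branch to weakly taken, so the next prediction is still taken.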
Branch entries are allocated into the BTAC after having been resolved at Execute. BTAC hits
enable branch prediction with zero cycle delay. When a BTAC hit occurs, the Branch Target
Address stored in the BTAC is used as the Program Counter for the next Fetch. Both branches
resolved taken and not taken are allocated into the BTAC. This enables the BTAC to do the most
useful amount of work and improves performance for tight backward branching loops.
5.2.3
Static branch predictor
The second level of branch prediction in the processor uses static branch prediction that is based
solely on the characteristics of a branch instruction. It does not make use of any history
information. The scheme used in the ARM1176JZ-S processor predicts that all forward
conditional branches are not taken and all backward branches are taken. Around 65% of all
branches are preceded by enough non-branch cycles to be completely predicted.
Branch prediction is performed only when the Z bit in CP15 Register c1 is set to 1. See c1,
Control Register on page 3-44 for details of this register. Dynamic prediction works on the basis
of caching the previously seen branches in the BTAC, and like all caches suffers from the
compulsory miss that exists on the first encountering of the branch by the predictor. A second
static predictor is added to the design to counter these misses, and to deal with any capacity and
conflict misses in the BTAC. The static predictor amounts to an early evaluation of branches in
the pipeline, combined with a predictor based on the direction of the branches to handle the
evaluation of condition codes that are not known at the time of the handling of these branches.
Only items that have not been predicted in the dynamic predictor are handled by the static
predictor.
The static branch predictor is hard-wired with backward branches being predicted as taken, and
forward branches as not taken. The SBP looks at the MSB of the branch offset to determine the
branch direction. Statically predicted taken branches incur a one-cycle delay before the target
instructions start refilling the pipeline. The SBP works in both ARM and Thumb states. The SBP
does not function in Jazelle state.
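The direction test can be illustrated for ARM-state branches, where B and BL carry a signed 24-bit word offset in bits [23:0] and bit 23 is therefore the sign of the offset. The sketch below is a simplified model under that assumption: it covers only the ARM-state immediate branch encoding, ignores Thumb encodings and the handling of unconditional branches, and uses hypothetical function names.

```c
#include <assert.h>
#include <stdint.h>

/* ARM-state immediate branch: bits [27:25] are 0b101. */
static int is_arm_immediate_branch(uint32_t insn)
{
    return ((insn >> 25) & 7u) == 5u;
}

/* Static prediction from the offset sign: backward taken, forward not taken. */
static int sbp_predict_taken(uint32_t insn)
{
    assert(is_arm_immediate_branch(insn));
    return (int)((insn >> 23) & 1u);    /* MSB of the 24-bit offset */
}
```

For example, 0x1AFFFFFC is a BNE with a negative offset, such as the branch that closes a loop, and is predicted taken, while 0x0A000002 is a forward BEQ and is predicted not taken.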
5.2.4
Branch folding
Branch folding is a technique where, on the prediction of most branches, the branch instruction
is completely removed from the instruction stream presented to the execution pipeline. Branch
folding can significantly improve the performance of branches, taking the CPI for branches
significantly lower than 1.
Branch folding only operates in ARM and Thumb states.
Branch folding is done for all dynamically predicted branches, except that branch folding is not
done for:
•
BL and BLX instructions, to avoid losing the link
•
predicted branches onto branches
•
branches that are breakpointed or have generated an abort when fetched.
5.2.5
Incorrect predictions and correction
Branches are resolved at or before the Ex3 stage of the integer core pipeline. A misprediction
causes the pipeline to be flushed, and the correct instruction stream to be fetched. If branch
folding is implemented, the failure of the condition codes of a folded branch causes the
instruction that follows the folded branch to fail. Whenever a potentially incorrect prediction is
made, the following information, necessary for recovering from the error, is stored:
•
a fall-through address in the case of a predicted taken branch instruction
•
the branch target address in the case of a predicted not taken branch instruction.
The PU passes the conditional part of any optimized branch into the integer core. This enables
the integer core to compare these bits with the processor flags and determine if the prediction
was correct or not. If the prediction was incorrect, the integer core flushes the PU and requests
that prefetching begins from the stored recovery address.
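The choice of recovery address can be summarized in a few lines of C. This is a sketch only: it assumes ARM state, where the fall-through address is the branch address plus 4, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Address stored when a prediction is made: the path that was NOT followed.
 * Assumes ARM state, so the instruction after the branch is 4 bytes on.
 */
static uint32_t recovery_address(uint32_t branch_addr, uint32_t target,
                                 int predicted_taken)
{
    return predicted_taken ? branch_addr + 4u : target;
}

/* A prediction is wrong when the resolved direction differs from it. */
static int mispredicted(int predicted_taken, int actually_taken)
{
    return (predicted_taken != 0) != (actually_taken != 0);
}
```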
5.3
Return stack
A return stack is used for predicting the class of program flow changes that includes loads,
moves, and ALU operations that write to the PC and that can be identified as being likely to be a
procedure call or return.
The return stack is a three-entry circular buffer used for the prediction of procedure calls and
procedure returns. Only unconditional procedure returns are predicted.
When a procedure call instruction is predicted, the return address is taken from the Execute stage
of the pipeline and pushed onto the return stack. The instructions recognized as procedure calls
are:
•
BL <dest>
•
BLX <dest>
•
BLX <reg>.
The first two instructions are predicted by the BTAC, unless they result in a BTAC miss. The
third instruction is not predicted. The SBP predicts unconditional procedure calls as taken, and
conditional procedure calls as not taken.
When a procedure return instruction is predicted, an instruction fetch from the location at the
top of the return stack occurs, and the return stack is popped. The instructions recognized as
procedure returns are:
•
BX R14
•
LDM sp!, {...,pc}
•
LDR pc, [sp...].
The SBP only predicts procedure returns that are always predicted as taken.
Two classes of return stack mispredictions can exist:
•
condition code failures of the return operation
•
incorrect return location.
In addition, an empty return stack gives no prediction.
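The behavior of a three-entry circular return stack can be modeled in C as shown below. The sketch assumes that a push onto a full stack overwrites the oldest entry, which is the usual behavior of a circular buffer but is not stated explicitly here. The type and function names are hypothetical.

```c
#include <stdint.h>

#define RS_ENTRIES 3u

struct return_stack {
    uint32_t addr[RS_ENTRIES];
    unsigned top;       /* index of the next free slot */
    unsigned count;     /* number of valid entries, at most RS_ENTRIES */
};

static void rs_init(struct return_stack *rs)
{
    rs->top = 0;
    rs->count = 0;
}

/* Procedure call predicted: record the return address. */
static void rs_push(struct return_stack *rs, uint32_t return_addr)
{
    rs->addr[rs->top] = return_addr;    /* overwrites the oldest entry when full */
    rs->top = (rs->top + 1u) % RS_ENTRIES;
    if (rs->count < RS_ENTRIES)
        rs->count++;
}

/* Procedure return predicted: returns 0 if the stack is empty (no prediction). */
static int rs_pop(struct return_stack *rs, uint32_t *predicted)
{
    if (rs->count == 0)
        return 0;
    rs->top = (rs->top + RS_ENTRIES - 1u) % RS_ENTRIES;
    rs->count--;
    *predicted = rs->addr[rs->top];
    return 1;
}
```

In this model a fourth nested call discards the outermost return address, so only the three innermost returns are predicted and the fourth return finds the stack empty.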
5.4
Memory Barriers
Memory barrier is the general term applied to an instruction, or sequence of instructions, used
to force synchronization events by a processor with respect to retiring load/store instructions in
a processor core. A memory barrier is used to guarantee completion of preceding load/store
instructions to the programmers model, flushing of any prefetched instructions prior to the
event, or both. The ARMv6 architecture mandates three explicit barrier instructions in the
System Control Coprocessor to support the memory order model, see the ARM Architecture
Reference Manual, and requires these instructions to be available in both Privileged and User
modes:
•
Data Memory Barrier, see Data Memory Barrier operation on page 3-85
•
Data Synchronization Barrier, see Data Synchronization Barrier operation on page 3-84
•
Prefetch Flush, see Flush operations on page 3-79.
Note
The Data Synchronization Barrier operation is synonymous with Drain Write Buffer and Data
Write Barrier in earlier versions of the architecture.
These instructions might be sufficient on their own, or might have to be used in conjunction with
cache and memory management maintenance operations, operations that are only available in
Privileged modes.
5.4.1
Instruction Memory Barriers (IMBs)
Because it is impossible to entirely avoid self-modifying code, it is necessary to define a
sequence of operations that can be used in the middle of a self-modifying code sequence to make
it execute reliably. This sequence is called an Instruction Memory Barrier (IMB), and might
depend both on the ARM processor implementation and on the memory system implementation.
The IMB sequence must be executed after the new instructions have been stored to memory and
before they are executed, for example, after a program has been loaded and before its entry point
is branched to. Any self-modifying code sequence that does not use an IMB in this way has
Unpredictable behavior.
An IMB might be included in-line where required, however, it is recommended that software is
designed so that the IMB sequence is provided as a call to an easily replaceable system
dependencies module. This eases porting across different architecture variants, ARM
processors, and memory systems.
IMB sequences can include operations that are only usable from Privileged processor modes,
such as the cache cleaning and invalidation operations supplied by the system control
coprocessor. To enable User mode programs access to privileged IMB sequences, it is
recommended that they are supplied as operating system calls, invoked by SVC instructions. For
systems that use the 24-bit immediate in an SVC instruction to specify the required operating
system service, there are default values as follows:
    SVC 0xF00000    ; the general case
    SVC 0xF00001    ; where the system can take advantage of specifying an
                    ; affected address range
These are recommended for general use unless an operating system has good reason to choose
differently, to align with a broader range of operating system specific system services.
The SVC 0xF00000 call takes no parameters, does not return a result, and, apart from the fact that
a SVC instruction is used for the call, rather than a BL instruction, uses the same calling
conventions as a call to a C function with prototype:
void IMB(void);
The SVC 0xF00001 call uses similar calling conventions to those used by a call to a C function
with prototype:
void IMB_Range(unsigned long start_addr, unsigned long end_addr);
Where the address range runs from start_addr (inclusive) to end_addr (exclusive). When the
standard ARM Procedure Call Standard is used, this means that start_addr is passed in R0 and
end_addr in R1.
The execution time cost of an IMB can be very large, many thousands of clock cycles, even
when a small address range is specified. For small scale uses of self-modifying code, this is
likely to lead to a major loss of performance. It is therefore recommended that self-modifying
code is only used where it is unavoidable and/or it produces sufficiently large execution time
benefits to offset the cost of the IMB.
5.5
ARM1176JZ-S IMB implementation
For the ARM1176JZ-S processor:
•
executing the SVC instruction is sufficient to cause IMB operation
•
both the IMB and the IMBRange instructions flush all stored information about the instruction
stream.
Note
The IMB implementation described here applies to the ARM1020T and later processors,
including the ARM1176JZ-S.
This means that all IMB instructions can be implemented in the operating system by returning
from the IMB or IMBRange service routine, and that the IMB and IMBRange service routines can be
exactly the same. The following service routine code can be used:
IMB_SVC_handler
IMBRange_SVC_handler
        MOVS    PC, R14_svc     ; Return to the code after the SVC call
Note
•
In new code, you are strongly encouraged to use the IMBRange instruction whenever the
changed area of code is small, even if there is no distinction between it and the IMB
instruction on ARM1176JZ-S processors. Future processors might implement the
IMBRange instruction in a more efficient and faster manner, and code migrated from the
ARM1176JZ-S core is likely to benefit when executed on these processors.
•
ARM1176JZ-S processors implement a Flush Prefetch Buffer operation that is
user-accessible and acts as an IMB. For more details see c7, Cache operations on
page 3-69.
5.5.1
Execution of IMB instructions
This section comprises three examples that show what can happen during the execution of IMB
instructions. The pseudocode in square brackets shows what happens to execute the IMB (or
IMBRange) instruction in the SVC handler.
Example 5-1 shows how code that loads a program from a disk, and then branches to the entry
point of that program, must execute an IMB instruction between loading the program and trying
to execute it.
Example 5-1 Loading code from disk
IMB     EQU     0xF00000
        .
        .
        ; code that loads program from disk
        .
        .
        SVC     IMB
            [branch to IMB service routine]
            [perform processor-specific operations to execute IMB]
            [return to code]
        .
        MOV     PC, entry_point_of_loaded_program
        .
        .
Compiled BitBlt routines optimize large copy operations by constructing and executing a
copying loop that has been optimized for the exact operation wanted. When writing such a
routine an IMB is required between the code that constructs the loop and the actual execution of
the constructed loop. Example 5-2 shows this.
Example 5-2 Running BitBlt code
IMBRange EQU    0xF00001
        .
        .
        ; code that constructs loop code
        ; load R0 with the start address of the constructed loop
        ; load R1 with the end address of the constructed loop
        SVC     IMBRange
            [branch to IMBRange service routine]
            [read registers R0 and R1 to set up address range parameters]
            [perform processor-specific operations to execute IMBRange]
            [within address range]
            [return to code]
        ; start of loop code
        .
        .
When writing a self-decompressing program, an IMB must be issued after the routine that
decompresses the bulk of the code and before the decompressed code starts to be executed.
Example 5-3 shows this.
Example 5-3 Self-decompressing code
IMB     EQU     0xF00000
        .
        .
        ; copy and decompress bulk of code
        SVC     IMB
        ; start of decompressed code
        .
        .
        .
Chapter 6
Memory Management Unit
This chapter describes the Memory Management Unit (MMU) and how it is used. It contains the
following sections:
•
About the MMU on page 6-2
•
TLB organization on page 6-4
•
Memory access sequence on page 6-7
•
Enabling and disabling the MMU on page 6-9
•
Memory access control on page 6-11
•
Memory region attributes on page 6-14
•
Memory attributes and types on page 6-20
•
MMU aborts on page 6-27
•
MMU fault checking on page 6-29
•
Fault status and address on page 6-34
•
Hardware page table translation on page 6-36
•
MMU descriptors on page 6-43
•
MMU software-accessible registers on page 6-53.
6.1
About the MMU
The processor MMU works with the cache memory system to control accesses to and from
external memory. The MMU also controls the translation of virtual addresses to physical
addresses.
The processor implements an ARMv6 MMU enhanced with TrustZone features to provide
address translation and access permission checks for all ports of the processor. The MMU
controls table-walking hardware that accesses translation tables in main memory. In each world,
Secure and Non-secure, a single set of two-level page tables stored in main memory controls the
contents of the instruction and data side Translation Lookaside Buffers (TLBs). The finished
virtual address to physical address translation is put into the TLB, associated with a Non-secure
Table IDentifier (NSTID) that permits Secure and Non-secure entries to co-exist. The TLBs are
enabled in each world from a single bit in CP15 Control Register c1, providing a single address
translation and protection scheme from software.
The MMU features are:
• standard ARMv6 MMU mapping sizes, domains, and access protection scheme
• mapping sizes are 4KB, 64KB, 1MB, and 16MB
• the access permissions for 1MB sections and 16MB supersections are specified for the
  entire section
• you can specify access permissions for 64KB large pages and 4KB small pages separately
  for each quarter of the page, these quarters are called subpages
• 16 domains
• one 64-entry unified TLB and a lockdown region of eight entries
• you can mark entries as a global mapping, or associated with a specific application space
  identifier to eliminate the requirement for TLB flushes on most context switches
• access permissions extended to enable Privileged read-only and Privileged or User
  read-only modes to be simultaneously supported
• memory region attributes to mark pages shared by multiple processors
• hardware page table walks
• separate Secure and Non-secure entries and page tables
• Non-secure memory attribute
• possibility to restrict the eight lockdown entries to the Secure world.
The MMU memory system architecture enables fine-grained control of a memory system. This
is controlled by a set of virtual to physical address mappings and associated memory properties
held within one or more structures known as TLBs within the MMU. The contents of the TLBs
are managed through hardware translation lookups from a set of translation tables in memory.
To prevent requiring a TLB invalidation on a context switch, you can mark each virtual to
physical address mapping as being associated with a particular application space, or as global
for all application spaces. Only global mappings and those for the current application space are
enabled at any time. By changing the Application Space IDentifier (ASID) you can alter the
enabled set of virtual to physical address mappings.
TrustZone extensions enable the system to mark each entry in the TLB as Secure or Non-secure
with the NSTID. At any time the processor only enables entries with an NSTID that matches the
Security state of the current application.
The set of memory properties associated with each TLB entry includes:
Memory access permission control
This controls if a program has no-access, read-only access, or read/write access
to the memory area. When an access is attempted without the required
permission, a memory abort is signaled to the processor. The level of access
possible can also be affected by whether the program is running in User mode, or
a privileged mode, and by the use of domains. See Memory access control on
page 6-11 for more details.
Memory region attributes
These describe properties of a memory region. Examples include Strongly
Ordered, Device, cacheable Write-Through, and cacheable Write-Back. If an
entry for a virtual address is not found in a TLB then a set of translation tables in
memory is automatically searched by hardware to create a TLB entry. This
process is known as a translation table walk. If the processor is in ARMv5
backwards-compatible mode some new features, such as ASIDs, are not
available. The MMU architecture also enables specific TLB entries to be locked
down in a TLB. This ensures that accesses to the associated memory areas never
require looking up by a translation table walk. This minimizes the worst-case
access time to code and data for real-time routines.
Non-secure memory region attribute
This attribute is a TrustZone security extension to the existing ARMv6 MMU. It
defines when the target memory is Secure or Non-secure. See NS attribute on
page 6-19 for a detailed explanation of this bit.
6.2 TLB organization
The following sections describe the TLB organization:
• MicroTLB
• Main TLB on page 6-5
• TLB control operations on page 6-5
• Page-based attributes on page 6-5
• Supersections on page 6-6.
6.2.1 MicroTLB
The first level of caching for the page table information is a small MicroTLB of ten entries that
is implemented on each of the instruction and data sides. These entities are implemented in
logic, providing a fully associative lookup of the virtual addresses in a cycle. This means that a
MicroTLB miss signal is returned at the end of the DC1 cycle. In addition to the virtual address,
an Address Space IDentifier (ASID) and a NSTID are used to distinguish different address
mappings that might be in use.
The current ASID is a small identifier, eight bits in size, that is programmed using CP15 when
different address mappings are required. A memory mapping for a page or section can be
marked as being global or referring to a specific ASID. The MicroTLB uses the current ASID
in the comparisons of the lookup for all pages for which the global bit is not set.
The NSTID consists of one bit, and is automatically set when a new entry is written. The entry
is marked as Secure when the MicroTLB request is Secure, that is, when the request is made
while the core is in Secure Monitor mode, whatever the value of the NS bit in the CP15 SCR
register, or in any other Secure mode, with the NS bit in the CP15 SCR = 0.
The MicroTLB returns the physical address to the cache for the address comparison, and also
checks the protection attributes in sufficient time to signal a Data Abort in the DC2 cycle. An
additional set of attributes, to be used by the cache line miss handler, are provided by the
MicroTLB. The timing requirements for these are less critical than for the physical address and
the abort checking.
You can configure MicroTLB replacement to be round-robin or random. By default the
round-robin replacement algorithm is used. The random replacement algorithm is designed to
be selected for rare pathological code that causes extreme use of the MicroTLB. With such code,
you can often improve the situation by using a random replacement algorithm for the
MicroTLB. You can only select random replacement of the MicroTLB if random cache
selection is in force, as set by the Control Register RR bit. If the RR bit is 0, then you can select
random replacement of the MicroTLB by setting the Auxiliary Control Register bit 3. This
register is only accessible in Secure Privileged modes.
Note
The RR bit is common to the Secure and Non-secure worlds.
All main TLB maintenance operations affect both the instruction and data MicroTLBs, causing
them to be flushed.
The virtual addresses held in the MicroTLB include the FCSE translation from Virtual Address
(VA) to Modified Virtual Address (MVA). For more information see the ARM Architecture
Reference Manual. The process of loading the MicroTLB from the main TLB includes the
FCSE translation if appropriate.
6.2.2 Main TLB
The main TLB is the second layer in the TLB structure that catches the misses from the
MicroTLBs. It provides a centralized source for translation entries.
Misses from the instruction and data MicroTLBs are handled by a unified main TLB, that is
accessed only on MicroTLB misses. Accesses to the main TLB take a variable number of cycles,
according to competing requests between each of the MicroTLBs and other
implementation-dependent factors. Entries in the lockable region of the main TLB are lockable
at the granularity of a single entry, as c10, TLB Lockdown Register on page 3-100 describes.
Main TLB implementation
The main TLB is implemented as a combination of two elements:
• A fully-associative array of eight elements, that is lockable.
  You can restrict this region to store Secure entries only, that is entries with NSTID=0,
  when the TL bit is clear in the NSAC register, see c1, Non-Secure Access Control Register
  on page 3-55.
  Note
  - If you clear the TL bit, after creating some NS entries in the Lockdown region, this
    does not invalidate these entries. The TL bit prevents the creation of new NS entries
    in the Lockdown region.
  - The TL bit has no influence on the Read/Write Lockdown entry operations, VA PA
    or Attributes, in the system control coprocessor, see c15, TLB lockdown access
    registers on page 3-149. When the TL bit is set, the processor can write an NS entry
    in the Lockdown region with the Write Lockdown operation of the system control
    coprocessor.
• A low-associativity Tag RAM and DataRAM structure similar to that used in the Cache.
The implementation of the low-associativity region is a 64-entry 2-way associative structure.
Depending on the RAMs available, you can implement this as either:
• four 32-bit wide RAMs
• two 64-bit wide RAMs
• a single 128-bit wide RAM.
Main TLB misses
Main TLB misses are handled in hardware by the two level page table walk mechanism, as used
on previous ARM processors. See c8, TLB Operations Register on page 3-86.
Note
Automatic page table walks might be disabled by PD0 and PD1 bits in the TTB Control register.
6.2.3 TLB control operations
c8, TLB Operations Register on page 3-86 and c10, TLB Lockdown Register on page 3-100
describe the TLB control operations.
6.2.4 Page-based attributes
Memory access control on page 6-11 describes the page-based attributes for access protection.
Memory region attributes on page 6-14 and Memory attributes and types on page 6-20 describe
the memory types and page-based cache control attributes. The processor interprets the Shared
bit in the MMU for regions that are Cacheable as making the accesses Noncacheable. This
ensures memory coherency without incurring the cost of dedicated cache coherency hardware.
Behavior with MMU disabled on page 6-9 describes the behavior of the memory system when
the MMU is disabled.
6.2.5 Supersections
Supersections are defined using a first level descriptor in the page tables, similar to the way a
Section is defined. Because each first level page table entry covers a 1MB region of virtual
memory, the 16MB supersections require that 16 identical copies of the first level descriptor of
the supersection exist in the first level page table.
Every supersection is defined to have its Domain as 0.
Supersections can be specified regardless of whether subpages are enabled or not, as controlled
by the CP15 Control Register XP bit, bit [23]. This bit is duplicated as Secure and Non-secure,
so that supersections can be enabled or disabled separately in each world. Figure 6-6 on
page 6-38 and Figure 6-9 on page 6-41 show the page table formats of supersections.
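The replication requirement described above can be illustrated with a short C sketch. It assumes a full 4096-entry first-level table (one entry per 1MB of a 4GB virtual address space) and treats the supersection descriptor as an opaque, already-encoded 32-bit value. The function and macro names are illustrative only and are not part of any ARM-supplied interface.

```c
#include <stddef.h>
#include <stdint.h>

#define L1_ENTRIES            4096u  /* assumption: full 4GB first-level table */
#define SUPERSECTION_ENTRIES  16u    /* 16MB supersection / 1MB per entry */

/* Writes the same descriptor into the 16 consecutive first-level entries
 * that cover the 16MB-aligned region containing va.
 * Returns the index of the first entry written. */
static size_t write_supersection(uint32_t l1_table[L1_ENTRIES],
                                 uint32_t va, uint32_t desc)
{
    /* va >> 20 is the first-level index; round it down to a multiple of 16 */
    size_t first = (size_t)(va >> 20) & ~(size_t)(SUPERSECTION_ENTRIES - 1u);

    for (size_t i = 0; i < SUPERSECTION_ENTRIES; i++) {
        l1_table[first + i] = desc;
    }
    return first;
}
```

Because all 16 entries hold the same value, a table walk for any 1MB slice of the supersection returns the same descriptor.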
6.3 Memory access sequence
When the processor generates a memory access, the MMU:
1. Performs a lookup for a mapping for the requested virtual address and current ASID and
   current world, Secure or Non-secure, in the relevant Instruction or Data MicroTLB.
2. If step 1 misses, performs a lookup for a mapping for the requested virtual address and
   current ASID and current world, Secure or Non-secure, in the main TLB.
If no global mapping, or mapping for the currently selected ASID, or no matching NSTID, for
the virtual address can be found in the TLBs then a translation table walk is automatically
performed by hardware, unless Page Table Walks are disabled by the PD0 or PD1 bits in the
TTB Control register, in which case the processor returns a Section Translation fault. See
Hardware page table translation on page 6-36.
If a matching TLB entry is found then the information it contains is used as follows:
1. The access permission bits and the domain are used to determine if the access is permitted.
   If the access is not permitted the MMU signals a memory abort, otherwise the access is
   enabled to proceed. Memory access control on page 6-11 describes how this is done.
2. The memory region attributes control the cache and write buffer, and determine if the
   access is Secure or Non-secure, cached, uncached, or device, and if it is shared, as Memory
   region attributes on page 6-14 describes.
3. The physical address is used for any access to external or tightly coupled memory to
   perform Tag matching for cache entries.
6.3.1 TLB match process
Each TLB entry contains a virtual address, a page size, a physical address, and a set of memory
properties. Each is marked as being associated with a particular application space, or as global
for all application spaces. Register c13 in CP15 determines the currently selected application
space. This register is duplicated as Secure and Non-secure to enable fast switching between
Secure and Non-secure applications. Each entry is also associated with the Secure or
Non-secure world by the NSTID.
A TLB entry matches if the NSTID matches the Secure or Non-secure request state of the MMU
request, if bits [31:N] of the Virtual Address match, where N is log2 of the page size for the
TLB entry, and if the entry is either marked as global or its Application Space IDentifier
(ASID) matches the current ASID. The behavior of a TLB if two or more entries match at any
time, including global
and ASID-specific entries, is Unpredictable. The operating system must ensure that, at most,
one TLB entry matches at any time. With respect to operation in the Secure and Non-secure
worlds, multiple matching can only occur on entries with the same NSTID, that is a Non-secure
entry and a Secure entry can never be hit simultaneously.
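The match conditions above can be summarized as a small software model. This is a sketch only: the structure layout and names are hypothetical and do not describe how the hardware stores an entry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one TLB entry; field names are illustrative. */
struct tlb_entry {
    uint32_t va_base;    /* virtual base address of the mapped block */
    unsigned log2_size;  /* N: 12 (4KB), 16 (64KB), 20 (1MB) or 24 (16MB) */
    uint8_t  asid;       /* ignored when global is set */
    bool     global;     /* entry applies to all application spaces */
    bool     nstid_ns;   /* true: Non-secure entry, false: Secure entry */
};

/* Returns true when the entry satisfies all three match conditions:
 * NSTID, virtual address bits [31:N], and global-or-ASID. */
static bool tlb_entry_matches(const struct tlb_entry *e, uint32_t va,
                              uint8_t current_asid, bool request_ns)
{
    uint32_t mask = ~((UINT32_C(1) << e->log2_size) - 1u); /* keeps bits [31:N] */

    if (e->nstid_ns != request_ns) {
        return false;
    }
    if ((va & mask) != (e->va_base & mask)) {
        return false;
    }
    return e->global || e->asid == current_asid;
}
```

The model makes the Unpredictable case easy to see: software must never install two entries for which this function returns true for the same request.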
A TLB can store entries based on the following four block sizes:
Supersections
Consist of 16MB blocks of memory.
Sections
Consist of 1MB blocks of memory.
Large pages
Consist of 64KB blocks of memory.
Small pages
Consist of 4KB blocks of memory.
Supersections, sections, and large pages are supported to permit mapping of a large region of
memory while using only a single entry in a TLB. If no mapping for an address is found within
the TLB, then the translation table is automatically read by hardware, if not disabled with PD0
and PD1 bits in the TTB Control register, and a mapping is placed in the TLB. See Hardware
page table translation on page 6-36 for more details.
6.3.2 Virtual to physical translation mapping restrictions
You can use the processor MMU architecture in conjunction with virtually indexed physically
tagged caches. For details of any mapping page table restrictions for virtual to physical
addresses see Restrictions on page table mappings page coloring on page 6-41.
6.3.3 Tightly-Coupled Memory
There are no page table restrictions for mappings to the Tightly-Coupled Memory (TCM). For
details of the TCM see Tightly-coupled memory on page 7-7.
6.4 Enabling and disabling the MMU
You can enable and disable the MMU by writing the M bit, bit 0, of the CP15 Control Register
c1. On reset, this bit is cleared to 0, disabling the MMU. This bit, in addition to most of the
MMU control parameters, is duplicated as Secure and Non-secure, to ensure a clear and distinct
memory management policy in each world.
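As a rough sketch of the bit manipulation involved, the following C helpers compute the value to write back to the CP15 Control Register. They operate on a plain 32-bit value; the register read and write themselves (MRC/MCR p15, 0, Rd, c1, c0, 0 in a privileged mode) are omitted so that the logic can be checked on any host. The macro and function names are illustrative. The C bit (bit 2) and I bit (bit 12) positions are those given in this section.

```c
#include <stdint.h>

#define CTRL_M  (UINT32_C(1) << 0)   /* MMU enable */
#define CTRL_C  (UINT32_C(1) << 2)   /* Data Cache enable */
#define CTRL_I  (UINT32_C(1) << 12)  /* Instruction Cache enable */

/* Value to write back to enable the MMU, optionally re-enabling the
 * Instruction Cache in the same write. */
static uint32_t ctrl_with_mmu_enabled(uint32_t ctrl, int enable_icache)
{
    ctrl |= CTRL_M;
    if (enable_icache) {
        ctrl |= CTRL_I;
    }
    return ctrl;
}

/* Value to write back to disable the MMU. The Data Cache is disabled in
 * the same write, which satisfies the "before, or at the same time as"
 * ordering requirement. */
static uint32_t ctrl_with_mmu_disabled(uint32_t ctrl)
{
    return ctrl & ~(CTRL_M | CTRL_C);
}
```

Because the Control Register bits are duplicated per world, the same computation applies to whichever copy the current world accesses.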
6.4.1 Enabling the MMU
To enable the MMU in one world you must:
1. Program all relevant CP15 registers of the corresponding world.
2. Program first-level and second-level descriptor page tables as required.
3. Disable and invalidate the Instruction Cache for the corresponding world. You can then
   re-enable the Instruction Cache when you enable the MMU.
4. Enable the MMU by setting bit 0 in the CP15 Control Register in the corresponding world.
6.4.2 Disabling the MMU
To disable the MMU in one world proceed as follows:
1. Clear bit 2 to 0 in the CP15 Control Register c1 of the corresponding world, to disable the
   Data Cache. You must disable the Data Cache in the corresponding world before, or at the
   same time as, disabling the MMU.
   Note
   If the MMU is enabled, then disabled, and subsequently re-enabled in the same world, the
   contents of the TLBs for this world are preserved. If these are now invalid, you must
   invalidate the TLBs in the corresponding world before you re-enable the MMU, see c8,
   TLB Operations Register on page 3-86.
2. Clear bit 0 to 0 in the CP15 Control Register c1 of the corresponding world.
6.4.3 Behavior with MMU disabled
When the MMU is disabled, the Data Cache is disabled and memory accesses are treated as
follows for the corresponding world:
• When the TEX remap bit, bit [28] in the CP15 Control Register, is reset to 0, behavior is
  backward compatible:
  - All data accesses are treated as Strongly Ordered. The value of the C bit, bit [2] in
    the CP15 Control Register of the corresponding world, Should Be Zero.
  - All instruction accesses are treated as Cacheable if the I bit, bit [12] of the CP15
    Control Register of the corresponding world, is set to 1, and Strongly Ordered if the
    I bit is reset to 0.
• When the TEX remap bit, bit [28] in the CP15 Control Register, is set to 1:
  - all accesses are treated with the same parameters, independently of the C and I bit
    values
  - those parameters depend on the programming of the PRRR and NMRR registers,
    see TexRemap=1 configuration on page 6-16 for more information on this behavior.
  Note
  By default, the PRRR and NMRR registers are reset so that all accesses are treated
  as Strongly Ordered.
The other parameters of the MMU behavior when disabled, independent of the TEX remap
configuration, are:
• No memory access permission or Access bit checks are performed, and no aborts are
  generated by the MMU.
• The physical address for every access is equal to its virtual address. This is known as a flat
  address mapping.
• The NS attribute for the target memory region is equal to the state, Secure or Non-secure,
  of the request, that is Secure requests are considered to target Secure memory.
• The FCSE PID Should Be Zero when the MMU is disabled. This is the reset value of the
  FCSE PID. If the MMU is to be disabled the FCSE PID must be cleared.
• All CP15 MMU and cache operations can be executed even when the MMU is disabled.
• Accesses to the TCMs work as normal if the TCMs are enabled.
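The MMU-disabled rules can be condensed into a small decision function. The following C sketch is a software model under stated assumptions: the enumeration and function names are hypothetical, and the TexRemap=1 case is reported only as "defined by PRRR and NMRR" because the resulting attributes depend on how those registers are programmed.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTRL_I          (UINT32_C(1) << 12)  /* Instruction Cache enable */
#define CTRL_TEX_REMAP  (UINT32_C(1) << 28)  /* TEX remap enable */

enum mmu_off_type {
    STRONGLY_ORDERED,
    CACHEABLE,
    REMAP_DEFINED      /* attributes come from PRRR and NMRR */
};

/* Memory type applied to an access while the MMU is disabled. */
static enum mmu_off_type mmu_off_type(uint32_t ctrl, bool is_instruction)
{
    if (ctrl & CTRL_TEX_REMAP) {
        return REMAP_DEFINED;      /* same treatment for all accesses */
    }
    if (is_instruction && (ctrl & CTRL_I)) {
        return CACHEABLE;
    }
    return STRONGLY_ORDERED;
}

/* Flat address mapping: the physical address equals the virtual address. */
static uint32_t mmu_off_translate(uint32_t va)
{
    return va;
}
```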
6.5 Memory access control
Access to a memory region is controlled by:
• Domains
• Access permissions
• Execute never bits in the TLB entry on page 6-12.
6.5.1 Domains
A domain is a collection of memory regions. In compliance with the ARM Architecture and the
TrustZone Security Extensions, the ARM1176JZ-S supports 16 Domains in the Secure world
and 16 Domains in the Non-secure world. Domains provide support for multi-user operating
systems. All regions of memory have an associated domain.
A domain is the primary access control mechanism for a region of memory and defines the
conditions when an access can proceed. The domain determines whether:
•
access permissions are used to qualify the access
•
access is unconditionally permitted to proceed
•
access is unconditionally aborted.
In the latter two cases, the access permission attributes are ignored.
Each page table entry and TLB entry contains a field that specifies the domain that the entry is
in. Access to each domain is controlled by a 2-bit field in the Domain Access Control Register,
CP15 c3. Each field enables access to an entire domain to be changed very quickly, so that whole
memory areas can be efficiently swapped in and out of virtual memory. Two kinds of domain
access are supported:
Clients
Clients are users of domains in that they execute programs and access data. They
are guarded by the access permissions of the TLB entries for that domain.
A client is a domain user, and each access has to be checked against the access
permission settings for each memory block and the system protection bit, the S
bit, and the ROM protection bit, the R bit, in CP15 Control Register c1. Table 6-1
on page 6-12 lists the access permissions.
Managers
Managers control the behavior of the domain, the current sections and pages in
the domain, and the domain access. They are not guarded by the access
permissions for TLB entries in that domain.
Because a manager controls the domain behavior, each access has only to be
checked to be a manager of the domain.
One program can be a client of some domains, and a manager of some other domains, and have
no access to the remaining domains. This enables flexible memory protection for programs that
access different memory resources.
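A minimal sketch of reading and updating one domain field follows. The field encodings (b00 No access, b01 Client, b10 Reserved, b11 Manager) and the placement of domain n in bits [2n+1:2n] are taken from the ARM architecture definition of the Domain Access Control Register rather than from this section, and the function names are illustrative.

```c
#include <stdint.h>

enum domain_access {
    DOMAIN_NO_ACCESS = 0,  /* b00: any access generates a domain fault */
    DOMAIN_CLIENT    = 1,  /* b01: access permissions are checked */
    DOMAIN_RESERVED  = 2,  /* b10 */
    DOMAIN_MANAGER   = 3   /* b11: access permissions are not checked */
};

/* Extracts the 2-bit access field for one of the 16 domains. */
static enum domain_access dacr_get(uint32_t dacr, unsigned domain)
{
    return (enum domain_access)((dacr >> (2u * (domain & 15u))) & 3u);
}

/* Returns a new register value with one domain field replaced. */
static uint32_t dacr_set(uint32_t dacr, unsigned domain, enum domain_access a)
{
    unsigned shift = 2u * (domain & 15u);

    return (dacr & ~(UINT32_C(3) << shift)) | ((uint32_t)a << shift);
}
```

Changing a single field in this way alters access to every page in that domain without touching the page tables, which is what makes domain switching fast.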
6.5.2 Access permissions
The access permission bits control access to the corresponding memory region. If an access is
made to an area of memory without the required permissions, then a permission fault is raised.
The access permissions are determined by a combination of the AP and APX bits in the page
table, and the S and R bits in CP15 Control Register c1. For page tables not supporting the APX
bit, the value 0 is used.
You do not have to flush the TLB for new S and R bit values to take effect. Access
permissions of entries in the TLB are automatically affected by the new S and R values.
Note
The use of the S and R bits is deprecated.
Table 6-1 lists the encoding of the access permission bits.
Table 6-1 Access permission bit encoding
APX  AP[1:0]  Privileged permissions                 User permissions
0    b00      No access, recommended use.            No access, recommended use.
              Read-only when S=1 and R=0 or when     Read-only when S=0 and R=1,
              S=0 and R=1, deprecated.               deprecated.
0    b01      Read/write.                            No access.
0    b10      Read/write.                            Read-only.
0    b11      Read/write.                            Read/write.
1    b00      Reserved.                              Reserved.
1    b01      Read-only.                             No access.
1    b10      Read-only.                             Read-only.
1    b11      Read-only.                             Read-only.
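The encoding in Table 6-1 maps naturally onto a lookup table. The following C sketch decodes APX and AP[1:0] using the recommended meaning of APX=0, AP=b00, that is, it ignores the deprecated S and R bits. The type and function names are illustrative.

```c
#include <stdint.h>

enum perm { PERM_NONE, PERM_RO, PERM_RW, PERM_RESERVED };

struct ap_result {
    enum perm privileged;
    enum perm user;
};

/* Decodes APX and AP[1:0] as listed in Table 6-1.
 * Assumption: the deprecated S and R bit behavior is not modeled. */
static struct ap_result ap_decode(unsigned apx, unsigned ap)
{
    static const struct ap_result table[8] = {
        /* APX = 0 */
        { PERM_NONE,     PERM_NONE     },  /* b00 */
        { PERM_RW,       PERM_NONE     },  /* b01 */
        { PERM_RW,       PERM_RO       },  /* b10 */
        { PERM_RW,       PERM_RW       },  /* b11 */
        /* APX = 1 */
        { PERM_RESERVED, PERM_RESERVED },  /* b00 */
        { PERM_RO,       PERM_NONE     },  /* b01 */
        { PERM_RO,       PERM_RO       },  /* b10 */
        { PERM_RO,       PERM_RO       },  /* b11 */
    };

    return table[((apx & 1u) << 2) | (ap & 3u)];
}
```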
Restricted access permissions and the access bit
The Access bit is an ARMv6 enhancement, for full details see Access bit fault on page 6-32.
Some operating systems only use a restricted set of the access permissions:
• APX and AP[1:0] = b111, Read-Only for both Privileged and Unprivileged code
• APX and AP[1:0] = b011, Read-Write for both Privileged and Unprivileged code
• APX and AP[1:0] = b101, Read-Only for Privileged code, No Access for Unprivileged
• APX and AP[1:0] = b001, Read-Write for Privileged code, No Access for Unprivileged.
For such OSs the encoding of the Read-Only or Read-Write and the User or Kernel access
permissions are orthogonal:
• APX selects the Read-Only or Read-Write permission
• AP[1] selects the User or Kernel access.
In this case, the AP[0] bit provides Access bit information so that software can optimize the
memory management algorithm.
The Access bit behaves in this way except in the deprecated case that uses the S and R bits, that
is when the S and R bits have opposite values, and when APX and AP[1:0] = b000.
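Under the restricted model, each of the three bits can be read independently. The sketch below shows that orthogonal view; the structure and function names are hypothetical, and the meaning of each AP[0] value is deliberately left uninterpreted because it is defined in Access bit fault on page 6-32.

```c
#include <stdbool.h>

struct restricted_ap {
    bool read_only;        /* APX: 1 = Read-Only, 0 = Read-Write */
    bool user_accessible;  /* AP[1]: 1 = Privileged and User, 0 = Privileged only */
    bool ap0;              /* AP[0]: carries the Access bit information */
};

/* Splits APX and AP[1:0] into the three independent properties used by
 * operating systems that follow the restricted permission model. */
static struct restricted_ap restricted_ap_decode(unsigned apx, unsigned ap)
{
    struct restricted_ap r;

    r.read_only = (apx & 1u) != 0;
    r.user_accessible = (ap & 2u) != 0;
    r.ap0 = (ap & 1u) != 0;
    return r;
}
```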
6.5.3 Execute never bits in the TLB entry
Each memory region can be tagged as not containing executable code. If the Execute Never, XN,
bit of the TLB entry is set to 1, then any attempt to execute an instruction in that region results
in a permission fault. If the XN bit is cleared, then code can execute from that memory region.
When the MMU is in ARMv5 mode, see the XP bit in c1, Control Register on page 3-44, the
descriptors do not contain the XN bit, and all pages are executable. In ARMv6 mode, XP bit =1,
the descriptors specify the XN attribute, see Figure 6-7 on page 6-39 and Figure 6-8 on
page 6-40.
6.6 Memory region attributes
Each TLB entry has an associated set of memory region attributes. These control:
• accesses to the caches
• how the write buffer is used
• if the memory region is shareable
• if the targeted memory is Secure or not.
6.6.1 C and B bit, and type extension field encodings
The ARMv6 MMU architecture originally defined five bits to describe all of the options for
inner and outer cachability. These five bits, the Type Extension Field, TEX[2:0], Cacheable, C,
and Bufferable, B bits, are set in the descriptors.
Few applications make use of all these options simultaneously. For this reason, a new
configuration bit, TEX remap, bit [28] in the CP15 Control Register, permits the core to support
a smaller number of options by using only the TEX[0], C and B bits.
The OS can configure this subset of options through a remap mechanism for these TEX[0], C,
and B bits. The TEX[2:1] bits in the descriptor then become 2 OS managed page table bits.
Additionally, certain page tables contain the Shared bit, S, used to determine if the memory
region is Shared or not. If not present in the descriptor, the Shared bit is assumed to be 0,
Non-Shared. In the TexRemap=1 configuration, the Shared bit can be remapped too.
For TrustZone support, the TEX remap bit is duplicated as Secure and Non-secure versions, so
it is possible to configure in each world the options that are available to the core.
The TLB does not cache the effect of the TEX remap bit on page tables. As a result, there is no
requirement for the processor to invalidate the TLB on a change of the TEX remap bit to rely on
the effect of those changes taking place.
Note
The terms Inner and Outer in this document represent the levels of caches that can be built in a
system. Inner refers to the innermost caches, including level one. Outer refers to the outermost
caches. The boundary between Inner and Outer caches is defined in the implementation of a
cached system. Inner must always include level one. In a system with three levels of caches, an
example is for the Inner attributes to apply to level one and level two, while the Outer attributes
apply to level three. In a two-level system, it is envisaged that Inner always applies to level one
and Outer to level two.
In the processor, Inner refers to level one and the ARSIDEBAND[4:1], for read, and
AWSIDEBAND[4:1], for writes, signals show the Inner Cacheable values.
ARCACHE, for reads, and AWCACHE, for writes, show the Outer Cacheable properties.
TexRemap=0 configuration
This is the standard ARMv6 configuration. The five TEX[2:0], C, and B bits are used to encode
the memory region type. For page tables formats with no TEX field, you must use the value
3'b000.
The S bit in the descriptors only applies to Normal, that is not Device and not Strongly Ordered
memory. Table 6-2 summarizes the TEX[2:0], C, and B encodings used in the page table
formats, and the value of the shareable attribute of the concerned page:
Table 6-2 TEX field, and C and B bit encodings used in page table formats
Page table encodings
TEX   C  B  Description                          Memory type       Page shareable?
b000  0  0  Strongly Ordered                     Strongly Ordered  Shared (a)
b000  0  1  Shared Device                        Device            Shared (a)
b000  1  0  Outer and Inner Write-Through,       Normal            s (b)
            No Allocate on Write
b000  1  1  Outer and Inner Write-Back,          Normal            s (b)
            No Allocate on Write
b001  0  0  Outer and Inner Noncacheable         Normal            s (b)
b001  0  1  Reserved                             -                 -
b001  1  0  Reserved                             -                 -
b001  1  1  Outer and Inner Write-Back,          Normal            s (b)
            Allocate on Write (c)
b010  0  0  Non-Shared Device                    Device            Non-shared
b010  0  1  Reserved                             -                 -
b010  1  X  Reserved                             -                 -
b011  X  X  Reserved                             -                 -
b1BB  A  A  Cached memory.                       Normal            s (b)
            BB = Outer policy,
            AA = Inner policy.
            See Table 6-3.
a. Shared, regardless of the value of the S bit in the page table.
b. s is Shared if the value of the S bit in the page table is 1, or Non-shared if the value of the S bit is 0 or not present.
c. The cache does not implement allocate on write.
The Inner and Outer cache policy bits control the operation of memory accesses to the external
memory:
• The C and B bits are described as the AA bits and define the Inner cache policy
• The TEX[1:0] bits are described as the BB bits and define the Outer cache policy.
Table 6-3 shows how the MMU and cache interpret the cache policy bits.
Table 6-3 Cache policy bits
BB or AA bits  Cache policy
b00            Noncacheable
b01            Write-Back cached, Write Allocate
b10            Write-Through cached, No Allocate on Write
b11            Write-Back cached, No Allocate on Write
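Tables 6-2 and 6-3 together define a decode from the five TEX, C, and B bits to a memory type and cache policies. The following C sketch models that decode for the TexRemap=0 configuration. It does not model the shareable attribute, and all type and function names are illustrative.

```c
#include <stdint.h>

enum mem_type { MT_STRONGLY_ORDERED, MT_DEVICE, MT_NORMAL, MT_RESERVED };

enum cache_policy {            /* numeric values follow Table 6-3 */
    CP_NONCACHEABLE = 0,
    CP_WB_WA        = 1,       /* Write-Back, Write Allocate */
    CP_WT_NOWA      = 2,       /* Write-Through, No Allocate on Write */
    CP_WB_NOWA      = 3        /* Write-Back, No Allocate on Write */
};

struct region_attr {
    enum mem_type type;
    enum cache_policy inner;   /* meaningful for MT_NORMAL only */
    enum cache_policy outer;   /* meaningful for MT_NORMAL only */
};

/* Decodes TEX[2:0], C and B as listed in Table 6-2 (TexRemap=0). */
static struct region_attr decode_tex_cb(unsigned tex, unsigned c, unsigned b)
{
    struct region_attr r = { MT_RESERVED, CP_NONCACHEABLE, CP_NONCACHEABLE };
    unsigned cb = ((c & 1u) << 1) | (b & 1u);

    tex &= 7u;
    if (tex & 4u) {                      /* b1BB: AA = {C,B}, BB = TEX[1:0] */
        r.type = MT_NORMAL;
        r.inner = (enum cache_policy)cb;
        r.outer = (enum cache_policy)(tex & 3u);
    } else if (tex == 0u) {
        switch (cb) {
        case 0: r.type = MT_STRONGLY_ORDERED; break;
        case 1: r.type = MT_DEVICE; break;                      /* Shared Device */
        case 2: r.type = MT_NORMAL; r.inner = r.outer = CP_WT_NOWA; break;
        default: r.type = MT_NORMAL; r.inner = r.outer = CP_WB_NOWA; break;
        }
    } else if (tex == 1u) {
        if (cb == 0u) {
            r.type = MT_NORMAL;                                 /* Noncacheable */
        } else if (cb == 3u) {
            r.type = MT_NORMAL;
            r.inner = r.outer = CP_WB_WA;
        }
    } else if (tex == 2u && cb == 0u) {
        r.type = MT_DEVICE;                                     /* Non-Shared Device */
    }
    return r;
}
```

For example, TEX=b101 with C=1 and B=0 decodes to Normal memory with an Inner Write-Through policy and an Outer Write-Back, Write Allocate policy.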
You can choose the write allocation policy that an implementation supports. The Allocate On
Write and No Allocate On Write cache policies indicate the preferred allocation policy for a
memory region, but you must not rely on the memory system implementing that policy. The
processor does not support Inner Allocate on Write.
Not all Inner and Outer cache policies are mandatory. Table 6-4 lists possible implementation
options.
Table 6-4 Inner and Outer cache policy implementation options
Cache policy         Implementation options                       Supported by the processor
Inner Noncacheable   Mandatory.                                   Yes
Inner Write-Through  Mandatory.                                   Yes
Inner Write-Back     Optional. If not supported, the memory       Yes
                     system must implement this as Inner
                     Write-Through.
Outer Noncacheable   Mandatory.                                   System-dependent
Outer Write-Through  Optional. If not supported, the memory       System-dependent
                     system must implement this as Outer
                     Noncacheable.
Outer Write-Back     Optional. If not supported, the memory       System-dependent
                     system must implement this as Outer
                     Write-Through.
When the MMU is off and TexRemap=0:
• All data accesses are treated as Shared, Inner Strongly Ordered, Outer Non-Cacheable.
• Instruction accesses are treated as Non-Shared, Inner and Outer Write-Through, No
  Allocate on Write, when the Instruction Cache is on, I=1, bit [12], see c1, Control Register
  on page 3-44.
  Instruction accesses are treated as Shared, Inner Strongly Ordered, Outer Non-Cacheable,
  when the Instruction Cache is off, see Behavior with MMU disabled on page 6-9.
TexRemap=1 configuration
Only three bits, TEX[0], C, and B, are relevant in this configuration. The OS can use the
TEX[2:1] bits to manage the page tables.
In this configuration the processor provides the OS with a remap capability for the memory
attribute. Two CP15 registers, the Primary Region Remap Register (PRRR) and the Normal
Memory Region Remap Register (NMRR) come into effect.
You can access the memory region remap registers of the MMU with:
MCR/MRC {cond} p15, 0, Rd, c10, c2, 0 for the Primary Region Remap register and MCR/MRC
{cond} p15, 0, Rd, c10, c2, 1 for the Normal Memory Region Remap register, see c10, Memory
region remap registers on page 3-101.
The remapping applies to all sources of MMU requests, that is the two registers are applicable
to Data, Instruction and DMA requests.
For TrustZone support, the PRRR and NMRR registers are duplicated as Secure and
Non-secure versions, and the processor uses the appropriate one for the remapping depending on
whether the MMU request is Secure or not.
The PRRR and NMRR registers are expected to be static throughout operation.
However, if the PRRR or NMRR registers are modified in one world, the changes take effect
immediately and enable each of the entries contained in the main TLB to be remapped, without
the requirement to invalidate the TLB.
The remap capability has two levels:
1.
The first level, the Primary Region Remap, enables remap of the primary memory type,
Normal, Device or Strongly Ordered. See Table 6-5.
2.
After primary remapping, any region remapped as Normal memory has the Inner and
Outer cacheable attributes remapped by the Normal Memory Region Remap register. See
Table 6-5. To provide maximum flexibility, this level of remapping permits regions that
were originally not Normal memory to be remapped independently.
Similarly, if the remapped memory type is Device or Normal memory, the S bit in the
descriptor is independently remapped according to one of the PRRR[19:16] bits. See Table 6-6
on page 6-18.
Table 6-5 summarizes the parts of the PRRR and NMRR that are used to remap the different
memory region attributes.
Table 6-5 Effect of remapping memory with TEX remap = 1

Page table encodings                 Inner cache attributes    Outer cache attributes
TEX    C    B     Memory type        when mapped as Normal     when mapped as Normal
XX0    0    0     PRRR[1:0]          NMRR[1:0]                 NMRR[17:16]
XX0    0    1     PRRR[3:2]          NMRR[3:2]                 NMRR[19:18]
XX0    1    0     PRRR[5:4]          NMRR[5:4]                 NMRR[21:20]
XX0    1    1     PRRR[7:6]          NMRR[7:6]                 NMRR[23:22]
XX1    0    0     PRRR[9:8]          NMRR[9:8]                 NMRR[25:24]
XX1    0    1     PRRR[11:10]        NMRR[11:10]               NMRR[27:26]
XX1    1    0     PRRR[13:12]        NMRR[13:12]               NMRR[29:28]
XX1    1    1     PRRR[15:14]        NMRR[15:14]               NMRR[31:30]
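The field positions in Table 6-5 follow a regular pattern: if n is the 3-bit value formed from TEX[0], C, and B, the memory type is in PRRR[2n+1:2n], the inner attributes in NMRR[2n+1:2n], and the outer attributes in NMRR[2n+17:2n+16]. The following C fragment is a minimal sketch of that lookup. The type and function names are hypothetical and chosen for illustration only; the fragment models the table in software and does not access CP15.

```c
#include <stdint.h>

/* Hypothetical helper type: the three 2-bit remap fields that one
 * page table encoding selects when TEX remap = 1 (Table 6-5). */
typedef struct {
    unsigned type;   /* PRRR[2n+1:2n]      primary memory type    */
    unsigned inner;  /* NMRR[2n+1:2n]      inner cache attributes */
    unsigned outer;  /* NMRR[2n+17:2n+16]  outer cache attributes */
} remap_fields;

/* Row index n of Table 6-5, formed from TEX[0], C and B. */
static unsigned remap_index(unsigned tex0, unsigned c, unsigned b)
{
    return ((tex0 & 1u) << 2) | ((c & 1u) << 1) | (b & 1u);
}

/* Extract the raw 2-bit fields for one page table encoding. */
static remap_fields remap_lookup(uint32_t prrr, uint32_t nmrr,
                                 unsigned tex0, unsigned c, unsigned b)
{
    unsigned shift = 2u * remap_index(tex0, c, b);
    remap_fields f;

    f.type  = (prrr >> shift) & 3u;
    f.inner = (nmrr >> shift) & 3u;
    f.outer = (nmrr >> (shift + 16u)) & 3u;
    return f;
}
```

The returned values are the raw encodings that Table 6-7 and Table 6-8 describe. The inner and outer fields are only meaningful when the type field selects Normal memory.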
Table 6-6 lists how the memory type, the value of the S bit in the page table attributes, and the
primary remap region register determine how the pages can be shared.
Table 6-6 Values that remap the shareable attribute

                     Shareable attribute when:
Memory type          S=0           S=1
Strongly Ordered     Shareable     Shareable
Device               PRRR[16]      PRRR[17]
Normal               PRRR[18]      PRRR[19]
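Table 6-6 can be read as a bit selection on the PRRR: the remapped memory type picks a pair of bits, and the S bit from the descriptor picks one bit of that pair. The following C fragment sketches this selection. The enum and function names are hypothetical, and the memory type passed in is assumed to be the type obtained after primary remapping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Remapped primary memory type (hypothetical names). */
enum mem_type {
    MEM_STRONGLY_ORDERED,
    MEM_DEVICE,
    MEM_NORMAL
};

/* Shareable attribute after remapping, following Table 6-6.
 * s is the S bit taken from the page table descriptor. */
static bool remapped_shareable(uint32_t prrr, enum mem_type type, unsigned s)
{
    switch (type) {
    case MEM_DEVICE:
        return ((prrr >> (16u + (s & 1u))) & 1u) != 0u;  /* PRRR[16] or PRRR[17] */
    case MEM_NORMAL:
        return ((prrr >> (18u + (s & 1u))) & 1u) != 0u;  /* PRRR[18] or PRRR[19] */
    default:
        return true;  /* Strongly Ordered is always Shareable */
    }
}
```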
Table 6-7 lists the encoding used for each region in the PRRR register, bits [15:0].
Table 6-7 Primary region type encoding

Region                                          Encoding
Strongly Ordered                                b00
Device                                          b01
Normal Memory                                   b10
Unpredictable, normal memory for ARM1176JZ-S    b11
Table 6-8 lists the encoding used for each Inner or Outer Cacheable attribute in the NMRR
register, bits [31:0].
Table 6-8 Inner and outer region remap encoding

Inner or Outer Region                Encoding
Non-Cacheable                        b00
WriteBack, WriteAllocate             b01
WriteThrough, Non-Write Allocate     b10
WriteBack, Non-WriteAllocate         b11
When the MMU is off, the remapping takes place according to the settings in PRRR[1:0],
PRRR[19], PRRR[17], NMRR[1:0], and NMRR[17:16] as appropriate.
In this case, the S bit is treated as if it is 1 prior to remapping. This behavior takes place
regardless of whether or not the instruction cache is enabled.
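As an illustration of the MMU-off case, the following C fragment sketches how the attributes might be derived from the two registers: the first row of Table 6-5 is always used, and the S bit is taken as 1, so that PRRR[17] applies to Device and PRRR[19] to Normal memory. The structure and function names are hypothetical. Folding the b11 type encoding into Normal follows the ARM1176JZ-S behavior noted in Table 6-7.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for the MMU-off remap case. */
typedef struct {
    unsigned type;       /* Table 6-7 encoding, with b11 folded into b10 */
    bool     shareable;  /* Table 6-6, with S treated as 1               */
    unsigned inner;      /* Table 6-8 encoding, used for Normal only     */
    unsigned outer;      /* Table 6-8 encoding, used for Normal only     */
} mmu_off_attrs;

static mmu_off_attrs mmu_off_remap(uint32_t prrr, uint32_t nmrr)
{
    mmu_off_attrs a;

    a.type = prrr & 3u;              /* PRRR[1:0] */
    if (a.type == 3u) {
        a.type = 2u;                 /* b11 behaves as Normal on ARM1176JZ-S */
    }
    a.inner = nmrr & 3u;             /* NMRR[1:0] */
    a.outer = (nmrr >> 16) & 3u;     /* NMRR[17:16] */

    if (a.type == 1u) {
        a.shareable = ((prrr >> 17) & 1u) != 0u;  /* Device, S=1 */
    } else if (a.type == 2u) {
        a.shareable = ((prrr >> 19) & 1u) != 0u;  /* Normal, S=1 */
    } else {
        a.shareable = true;                       /* Strongly Ordered */
    }
    return a;
}
```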
Note
•
The reset value for each field of the PRRR and NMRR makes the MMU behave as if no
remapping occurs, that is Strongly Ordered regions are remapped as Strongly Ordered and
so on.
•
For security reasons, the NS Attribute bit has no remap capability.
6.6.2
Shared
This bit indicates that the memory region can be shared by multiple processors. For a full
explanation of the Shared attribute see Memory attributes and types on page 6-20.
6.6.3
NS attribute
The NS attribute is a TrustZone extension to the V6 MMU. It is specified in the L1 descriptors,
in position 19 for sections and supersections, and in position 3 for coarse pages. It defines if the
targeted memory region corresponding to the page is Secure or Non-secure, that is if this
memory region is accessed with Secure or with Non-secure rights. This bit is ignored in the
Non-secure world.
When the MMU is off, the NS Attribute is equal to the state, Secure or Non-secure, of the MMU
request.
When the NS Attribute is set to 1, the access is performed with Non-secure rights:
•
If the access is cacheable, it can only hit a cache line whose NS-Tag is Non-secure. If this
access causes a linefill, then the created line in the cache has its NS Tag set to 1,
Non-secure.
•
The access can only hit TCM configured as Non-secure.
•
If the access goes external to the core, then it is marked as Non-secure with AxPROT[1]
= Non-secure.
The NS Attribute is specified in the L1 descriptors, in position 19 for sections and supersections,
and in position 3 for coarse pages. The bit contained in Non-secure descriptors is always ignored,
so that all Non-secure entries in the TLB, that is entries with NSTID=1 (Non-secure), have the NS
Attribute=1 (Non-secure). This ensures that the Non-secure world always performs accesses with
Non-secure rights.
Note
This rule is also true when a new entry is created in the Lockdown region with the CP15
Read/Write PA in TLB Lockdown region operation. For this operation, when an entry is written
with NSTID=1, then the corresponding NS Attribute of the entry is forced to 1. See c15, TLB
lockdown access registers on page 3-149.
With this mechanism, only the Secure world can perform Secure accesses, and consequently is
the only one permitted to access Secure memory. The Secure world can also access Non-secure
memory, by setting the NS Attribute appropriately in the corresponding descriptor. The
Non-secure world can only access Non-secure memory.
There is no check of the NS Attribute internally, and therefore the system cannot generate an
error because of a wrong NS Attribute. Only external aborts can be generated, if the system has
implemented this feature.
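The following C fragment sketches how the NS Attribute of a TLB entry could be derived from a first-level descriptor, as described in this section. The function and enum names are hypothetical. The assignment of bits [1:0] = b01 to coarse page tables and b10 to sections and supersections is taken from the ARMv6 first-level descriptor format and is not stated in this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Kind of first-level descriptor, decoded from bits [1:0]. */
enum l1_kind {
    L1_INVALID,  /* b00 or b11 */
    L1_COARSE,   /* b01: coarse page table */
    L1_SECTION   /* b10: section or supersection */
};

static enum l1_kind l1_descriptor_kind(uint32_t desc)
{
    switch (desc & 3u) {
    case 1u:  return L1_COARSE;
    case 2u:  return L1_SECTION;
    default:  return L1_INVALID;
    }
}

/* NS Attribute held in the TLB entry. In the Non-secure world the
 * descriptor bit is ignored and the attribute is forced to 1. */
static bool effective_ns(uint32_t desc, bool nonsecure_world)
{
    if (nonsecure_world) {
        return true;
    }
    switch (l1_descriptor_kind(desc)) {
    case L1_COARSE:   return ((desc >> 3) & 1u) != 0u;   /* bit 3  */
    case L1_SECTION:  return ((desc >> 19) & 1u) != 0u;  /* bit 19 */
    default:          return false;  /* invalid descriptor: placeholder value */
    }
}
```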
6.7
Memory attributes and types
The processor provides a set of memory attributes that have characteristics that are suited to
particular devices, including memory devices, that can be contained in the memory map. The
ordering of accesses for regions of memory is also defined by the memory attributes. There are
three mutually exclusive main memory type attributes:
•
Strongly Ordered
•
Device
•
Normal.
These are used to describe the memory regions. The marking of the same memory locations as
having two different attributes in the MMU, for example using synonyms in a virtual to physical
address mapping, results in Unpredictable behavior but this does not break security. Table 6-9
lists a summary of the memory attributes.
Table 6-9 Memory attributes

Memory type        Shared or     Other attributes        Description
attribute          Non-shared
Strongly Ordered   -             -                       All memory accesses to Strongly Ordered memory occur in
                                                         program order. Some backwards compatibility constraints
                                                         exist with ARMv5 instructions that change the CPSR
                                                         interrupt masks. See Strongly Ordered memory attribute
                                                         on page 6-23. All Strongly Ordered accesses are assumed
                                                         to be shared.
Device             Shared        -                       Designed to handle memory-mapped peripherals that are
                                                         shared by several processors.
                   Non-shared    -                       Designed to handle memory-mapped peripherals that are
                                                         used only by a single processor.
Normal             Shared        Noncacheable/           Designed to handle normal memory that is shared between
                                 Write-Through           several processors.
                                 Cacheable/
                                 Write-Back Cacheable
                   Non-shared    Noncacheable/           Designed to handle normal memory that is used only by a
                                 Write-Through           single processor.
                                 Cacheable/
                                 Write-Back Cacheable
6.7.1
Normal memory attribute
The Normal memory attribute is defined on a per-page basis in the MMU and provides memory
access orderings that are suitable for normal memory. This type of memory stores information
without side effects. Normal memory can be writable or read-only. For writable normal memory,
unless there is a change to the physical address mapping:
•
a load from a specific location returns the most recently stored data at that location for the
same processor
•
two loads from a specific location, without a store in between, return the same data for
each load.
For read-only normal memory:
•
two loads from a specific location return the same data for each load.
This behavior describes most memory used in a system, and the term memory-like is used to
describe this sort of memory. In this section, writable normal memory and read-only normal
memory are not distinguished.
Regions of memory with the Normal attribute can be Shared or Non-Shared, on a per-page basis
in the MMU. The marking of the same memory locations as being Shared Normal and Non-Shared
Normal in the MMU, for example by the use of synonyms in a virtual to physical address mapping,
results in Unpredictable behavior but this does not break security.
All explicit accesses to memory marked as Normal must comply with the ordering requirements
described in Ordering requirements for memory accesses on page 6-23. Accesses to Normal
memory conform to the Weakly Ordered model of memory ordering. A description of this model
is in standard texts describing memory ordering issues.
Shared Normal memory
The Shared Normal memory attribute is designed to describe normal memory that can be
accessed by multiple processors or other system masters. A region of memory marked as Shared
Normal is one where the effect of interposing a cache, or caches, on the memory system is
entirely transparent. Implementations can use a variety of mechanisms to support this, from not
caching accesses in shared regions to more complex hardware schemes for cache coherency for
those regions. The processor does not cache shareable locations at level one. In systems that
implement a TCM, the regions of memory covered by the TCM must not be marked as Shared.
The attributes for these regions are remapped to Inner and Outer Write-Back Non-Shared.
Writes to Shared Normal memory might not be atomic. That is, all observers might not see the
writes occurring at the same time. To preserve coherence where two writes are made to the same
location, the order of those writes must be seen to be the same by all observers. Reads to Shared
Normal memory that are aligned in memory to the size of the access are atomic.
Non-Shared Normal memory
The Non-Shared Normal memory attribute describes normal memory that can be accessed only
by a single processor. A region of memory marked as Non-Shared Normal does not have any
requirement to make the effect of a cache transparent.
Cacheable Write-Through, Cacheable Write-Back, and Noncacheable
In addition to marking a region of Normal memory as being Shared or Non-Shared, a region of
memory marked as Normal can also be marked on a per-page basis in an MMU as being one of:
•
Cacheable Write-Through
•
Cacheable Write-Back
•
Noncacheable.
This marking is independent of the marking of a region of memory as being Shared or
Non-Shared, and indicates the required handling of the data region for reasons other than
the requirements of shared data. Because the processor does not cache shareable locations at
level one, a region of memory that is marked as being both Cacheable and Shared is not cached
by the processor at level one. Marking the same memory locations as having different Cacheable
attributes, for example by the use of synonyms in a virtual to physical address mapping, results
in Unpredictable behavior but does not break security.
6.7.2
Device memory attribute
The Device memory attribute is defined for memory locations where an access to the location
can cause side effects, or where the value returned for a load can vary depending on the number
of loads performed. Memory-mapped peripherals and I/O locations are typical examples of
areas of memory that you must mark as Device. The marking of a region of memory as Device
is performed on a per-page basis in the MMU.
Accesses to memory-mapped locations that have side effects that apply to memory locations
that are Normal memory might require Memory Barriers to ensure correct execution. An
example where this might be an issue is the programming of the control registers of a memory
controller while accesses are being made to the memories controlled by the controller.
Instruction fetches must not be performed to areas of memory containing read-sensitive devices,
because there is no ordering requirement between instruction fetches and explicit accesses.
As a result, instruction fetches from such devices can result in Unpredictable behavior. Up to 64
bytes can be prefetched sequentially ahead of the current instruction being executed. To allow
for this, read-sensitive devices must be placed in the memory map so that this prefetching
cannot reach them.
Explicit accesses from the processor to regions of memory marked as Device occur at the size
and order defined by the instruction. The number of location accesses is specified by the
program. The processor does not repeat accesses to such locations when there is only one access
in the program, that is, the accesses are not restartable.
An example of where a repeat access might be required is before and after an interrupt to enable
the interrupt to abandon a slow access. You must ensure these optimizations are not performed
on regions of memory marked as Device. If a memory operation that causes multiple
transactions, such as an LDM or an unaligned memory access, crosses a 4KB address boundary,
then it can perform more accesses than are specified by the program, regardless of one or both
of the areas being marked as Device.
For this reason, accesses to volatile memory devices must not be made using single instructions
that cross a 4KB address boundary. This restriction is expected to constrain the placement of
such devices in the memory map of a system, rather than to require a compiler to be aware of
the alignment of memory accesses. In addition, address locations marked as Device are
not held in a cache.
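When laying out a memory map, or when reviewing driver code, the 4KB restriction can be checked with simple address arithmetic. The following C fragment is a minimal sketch. The function name is hypothetical, and the length is assumed to be the total number of bytes transferred by the single instruction, for example 4 times the number of registers for an LDM.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if an access of len bytes starting at addr touches
 * more than one 4KB region, that is, crosses a 4KB address boundary. */
static bool crosses_4kb(uint32_t addr, size_t len)
{
    uint32_t last;

    if (len == 0) {
        return false;
    }
    last = addr + (uint32_t)(len - 1);   /* address of the last byte */
    return (addr >> 12) != (last >> 12);
}
```

For example, an 8-byte LDM that starts at 0x1FFC crosses the boundary at 0x2000, while a 4-byte aligned load at the same address does not.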
Shared memory attribute
Regions of Memory marked as Device are also distinguished by the Shared attribute in the
MMU. These memory regions can be marked as:
•
Shared Device
•
Non-Shared Device.
Explicit accesses to memory with each of the sets of attributes occur in program order relative
to other explicit accesses to the same set of attributes. All explicit accesses to memory marked
as Device must comply with the ordering requirements described in Ordering requirements
for memory accesses on page 6-23. The marking of the same memory location as
being Shared Device and Non-Shared Device in an MMU, for example by the use of synonyms
in a virtual to physical address mapping, results in Unpredictable behavior but this does not
break security.
An example of an implementation where the Shared attribute is used to distinguish memory
accesses is an implementation that supports a local bus for its private peripherals, while system
peripherals are situated on the main system bus. Such a system can have more predictable access
times for local peripherals such as watchdog timers or interrupt controllers. For shared device
memory, the data of a write is visible to all observers before the end of a Data Synchronization
Barrier memory barrier. For non-shared device memory, the data of a write is visible to the
processor before the end of a Data Synchronization Barrier memory barrier. See Explicit
Memory Barriers on page 6-25.
6.7.3
Strongly Ordered memory attribute
Another memory attribute, Strongly Ordered, is defined on a per-page basis in the MMU.
Accesses to memory marked as Strongly Ordered have a strong memory-ordering model with
respect to all explicit memory accesses from that processor. An access to memory marked as
Strongly Ordered acts as a memory barrier to all other explicit accesses from that processor,
until the point at which the access is complete, that is, until it has changed the state of the
target location or data has been returned. In addition, an access to memory marked as Strongly
Ordered must complete before the end of a Memory Barrier. See Explicit Memory Barriers on
page 6-25. To maintain backwards compatibility with the ARMv5 architecture, any ARMv5
instructions that implicitly or explicitly change the interrupt masks in the CPSR, and that
appear in program order after a Strongly Ordered access, must wait for the Strongly Ordered
memory access to complete.
These instructions are MSR with the control field mask bit set, and the flag-setting variants of
arithmetic and logical instructions whose destination register is R15, which copy the SPSR to
the CPSR. This requirement exists only for backwards compatibility with previous versions of the
ARM architecture, and the behavior is deprecated in ARMv6. Programs must not rely on this
behavior, but instead include an explicit Memory Barrier between the memory access and the
following instruction. See Explicit Memory Barriers on page 6-25.
The processor does not require an explicit memory barrier in this situation, but for future
compatibility it is recommended that programmers insert a memory barrier.
Explicit accesses from the processor to memory marked as Strongly Ordered occur at their
program size, and the number of accesses that occur to such locations is the number that are
specified by the program. Implementations must not repeat accesses to such locations when
there is only one access in the program. That is, the accesses are not restartable.
If a memory operation that causes multiple transactions, such as LDM or an unaligned memory
access, crosses a 4KB address boundary, then it might perform more accesses than are specified
by the program regardless of one or both of the areas being marked as Strongly Ordered.
For this reason, it is important that accesses to volatile memory devices are not made using
single instructions that cross a 4KB address boundary. Address locations marked as Strongly
Ordered are not held in a cache, and are treated as Shared memory locations. For Strongly
Ordered memory, the data and side effects of a write are visible to all observers before the end
of a Data Synchronization Barrier memory barrier. See Explicit Memory Barriers on page 6-25.
6.7.4
Ordering requirements for memory accesses
The various memory types defined in this section have restrictions in the memory orderings that
are permitted.
Ordering requirements for two accesses
The order of any two explicit architectural memory accesses where one or more are to memory
marked as Non-Shared must obey the ordering requirements that Figure 6-1 on page 6-24 lists.
Figure 6-1 lists the memory ordering between two explicit accesses A1 and A2, where A1
occurs before A2 in program order. The symbols used in the figure are as follows:
<
Accesses must occur strictly in program order. That is, A1 must occur strictly
before A2. It must be impossible to tell otherwise from observation of the
read/write values and side effects caused by the memory accesses.
?
Accesses can occur in any order, provided that the requirements of uniprocessor
semantics are met, for example respecting dependencies between instructions
within a single processor.
A1 \ A2                    Normal  Device read         Strongly  Normal  Device write        Strongly
                           read    Non-Shared  Shared  Ordered   write   Non-Shared  Shared  Ordered
                                                       read                                  write
Normal read                ?       ?           ?       <         ?a      ?           ?       <
Device read, Non-Shared    ?       <           ?       <         ?       <           ?       <
Device read, Shared        ?       ?           <       <         ?       ?           <       <
Strongly Ordered read      <       <           <       <         <       <           <       <
Normal write               ?       ?           ?       <         ?       ?           ?       <
Device write, Non-Shared   ?       <           ?       <         ?       <           ?       <
Device write, Shared       ?       ?           <       <         ?       ?           <       <
Strongly Ordered write     <       <           <       <         <       <           <       <
a. The processor orders the normal read ahead of normal write.
Figure 6-1 Memory ordering restrictions
There are no ordering requirements for implicit accesses to any type of memory.
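The entries of Figure 6-1 reduce to a short rule: two explicit accesses must occur strictly in program order if either of them is to Strongly Ordered memory, or if both are to Device memory with the same Shared attribute. Whether each access is a read or a write does not change the entry, apart from the footnote on a Normal read followed by a Normal write. The following C fragment sketches this rule. The enum and function names are hypothetical.

```c
#include <stdbool.h>

/* Memory type of an explicit access (hypothetical names). */
enum access_type {
    ACC_NORMAL,
    ACC_DEVICE_NONSHARED,
    ACC_DEVICE_SHARED,
    ACC_STRONGLY_ORDERED
};

/* Returns true where Figure 6-1 shows "<" for A1 followed by A2,
 * and false where it shows "?". */
static bool must_stay_in_program_order(enum access_type a1,
                                       enum access_type a2)
{
    if (a1 == ACC_STRONGLY_ORDERED || a2 == ACC_STRONGLY_ORDERED) {
        return true;
    }
    if (a1 == ACC_DEVICE_NONSHARED && a2 == ACC_DEVICE_NONSHARED) {
        return true;
    }
    if (a1 == ACC_DEVICE_SHARED && a2 == ACC_DEVICE_SHARED) {
        return true;
    }
    return false;
}
```

A false result means only that the architecture imposes no order beyond uniprocessor semantics. Software that needs a stronger guarantee must use an explicit memory barrier.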
Definition of program order of memory accesses
The program order of instruction execution is defined as the order of the instructions in the
control flow trace. Two explicit memory accesses in an execution can either be:
Ordered
Denoted by <. If the accesses are Ordered, then they must occur strictly in
order.
Weakly Ordered
Denoted by <=. If the accesses are Weakly Ordered, then they must occur
in order or simultaneously.
The rules for determining this for two accesses A1 and A2 are:
1.
If A1 and A2 are generated by two different instructions, then:
•
A1 < A2 if the instruction that generates A1 occurs before the instruction that
generates A2 in program order.
•
A2 < A1 if the instruction that generates A2 occurs before the instruction that
generates A1 in program order.
2.
If A1 and A2 are generated by the same instruction, then:
•
If A1 and A2 are the load and store generated by a SWP or SWPB instruction, then:
—
A1 < A2 if A1 is the load and A2 is the store
—
A2 < A1 if A2 is the load and A1 is the store.
•
If A1 and A2 are two word loads generated by an LDC, LDRD, or LDM instruction,
or two word stores generated by an STC, STRD, or STM instruction, but excluding
LDM or STM instructions whose register list includes the PC, then:
—
A1 <= A2 if the address of A1 is less than the address of A2
—
A2 <= A1 if the address of A2 is less than the address of A1.
•
If A1 and A2 are two word loads generated by an LDM instruction whose register
list includes the PC or two word stores generated by an STM instruction whose
register list includes the PC, then the program order of the memory operations is not
defined.
Multiple load and store instructions, such as LDM, LDRD, STM, and STRD, generate multiple
word accesses, each of which is treated as a separate access when determining ordering.
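The rules for two accesses generated by the same instruction can be sketched as a small decision function. The following C fragment is illustrative only. All names are hypothetical, and the case of equal addresses is reported as not defined because the rules above do not cover it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Class of the single instruction that generates both accesses. */
enum same_insn_kind {
    KIND_SWP,            /* SWP or SWPB: one load and one store          */
    KIND_MULTI_NO_PC,    /* LDC, LDRD, LDM, STC, STRD, STM without the PC */
    KIND_MULTI_WITH_PC   /* LDM or STM whose register list includes PC   */
};

/* Program order relation between A1 and A2. */
enum program_order {
    A1_LT_A2,           /* A1 <  A2: Ordered        */
    A2_LT_A1,           /* A2 <  A1: Ordered        */
    A1_LE_A2,           /* A1 <= A2: Weakly Ordered */
    A2_LE_A1,           /* A2 <= A1: Weakly Ordered */
    ORDER_NOT_DEFINED
};

static enum program_order same_insn_order(enum same_insn_kind kind,
                                          bool a1_is_load,
                                          uint32_t addr1, uint32_t addr2)
{
    switch (kind) {
    case KIND_SWP:
        /* The load of a swap always precedes its store. */
        return a1_is_load ? A1_LT_A2 : A2_LT_A1;
    case KIND_MULTI_NO_PC:
        if (addr1 < addr2) {
            return A1_LE_A2;
        }
        if (addr2 < addr1) {
            return A2_LE_A1;
        }
        return ORDER_NOT_DEFINED;   /* equal addresses: not covered */
    default:
        return ORDER_NOT_DEFINED;   /* register list includes the PC */
    }
}
```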
6.7.5
Explicit Memory Barriers
This section describes two explicit Memory Barrier operations:
•
Data Memory Barrier
•
Data Synchronization Barrier.
In addition, to ensure correct operation where the processor writes code, an explicit Flush
Prefetch Buffer operation is provided.
These operations are implemented by writing to the CP15 Cache operation register c7. For
details on how to use this register see c7, Cache operations on page 3-69. For more information
on explicit memory barriers, see the ARM Architecture Reference Manual.
Data Memory Barrier
This memory barrier ensures that all explicit memory transactions occurring in program order
before this instruction are completed. No explicit memory transactions occurring in program
order after this instruction are started until this instruction completes. Other instructions can
complete out of order with the Data Memory Barrier instruction.
Data Synchronization Barrier
This memory barrier completes when all explicit memory transactions occurring in program
order before this instruction are completed. No explicit memory transactions occurring in
program order after this instruction are started until this instruction completes. In fact, no
instructions occurring in program order after the Data Synchronization Barrier complete, or
change the interrupt masks, until this instruction completes.
Flush Prefetch Buffer
The Flush Prefetch Buffer operation flushes the pipeline in the processor, so that all instructions
following the pipeline flush are fetched from memory, including the cache, after the instruction
has been completed. Combined with Data Synchronization Barrier, and potentially invalidating
the Instruction Cache, this ensures that any instructions written by the processor are executed.
This guarantee is required as part of the mechanism for handling self-modifying code.
Performing a Data Synchronization Barrier operation and invalidating the Instruction Cache and
Branch Target Cache are also required for the handling of self-modifying code. The Flush
Prefetch Buffer is guaranteed to perform this function, while alternative methods of performing
the same task, such as a branch instruction, can be optimized in the hardware to avoid the
pipeline flush, for example, by using a branch predictor.
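The following C fragment sketches wrappers for the three operations described in this section, using the CP15 c7 encodings: c7, c10, 4 for Data Synchronization Barrier, c7, c10, 5 for Data Memory Barrier, and c7, c5, 4 for Flush Prefetch Buffer, each written with a zero value. The wrapper names, the macro, and the publish() helper are hypothetical. The inline assembly assumes a GNU-compatible compiler targeting ARM state. On any other target the macro falls back to a compiler-only barrier so that the sketch still builds, but that fallback does not provide the hardware guarantees.

```c
#include <stdint.h>

#if defined(__GNUC__) && defined(__arm__) && !defined(__thumb__)
/* Write zero to a CP15 c7 operation, ARM state only. */
#define CP15_C7_WRITE(crm, op2) \
    __asm__ volatile("mcr p15, 0, %0, c7, " #crm ", " #op2 \
                     : : "r"(0) : "memory")
#elif defined(__GNUC__)
/* Host fallback (assumption): compiler barrier only. */
#define CP15_C7_WRITE(crm, op2) __asm__ volatile("" : : : "memory")
#else
#define CP15_C7_WRITE(crm, op2) ((void)0)
#endif

static inline void data_sync_barrier(void)     { CP15_C7_WRITE(c10, 4); }
static inline void data_memory_barrier(void)   { CP15_C7_WRITE(c10, 5); }
static inline void flush_prefetch_buffer(void) { CP15_C7_WRITE(c5, 4); }

/* Example use: make a data write complete before the flag write that
 * announces it starts. */
static inline void publish(volatile uint32_t *data,
                           volatile uint32_t *flag, uint32_t value)
{
    *data = value;
    data_memory_barrier();
    *flag = 1u;
}
```

The Data Memory Barrier is sufficient in publish() because only the relative order of the two explicit memory accesses matters. A Data Synchronization Barrier is the stronger choice when later instructions of any kind must wait.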
6.7.6
Backwards compatibility
The ARMv6 memory attributes are significantly different from those in previous versions of the
architecture. Table 6-10 lists the interpretation of the earlier memory types in the light of this
definition.
Table 6-10 Memory region backwards compatibility

Previous architectures                   ARMv6 attribute
NCNB, Noncacheable, Non Bufferable       Strongly Ordered
NCB, Noncacheable, Bufferable            Shared Device
Write-Through, Cacheable, Bufferable     Non-Shared Normal, Write-Through Cacheable
Write-Back, Cacheable, Bufferable        Non-Shared Normal, Write-Back Cacheable
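For code that ports region definitions from an earlier architecture, Table 6-10 can be expressed as a lookup. The following C fragment is a minimal sketch with hypothetical type and function names. It encodes only the four rows of the table.

```c
#include <stdbool.h>

/* Memory region types of previous architectures (Table 6-10). */
enum legacy_type { LEGACY_NCNB, LEGACY_NCB, LEGACY_WT, LEGACY_WB };

enum v6_type  { V6_STRONGLY_ORDERED, V6_DEVICE, V6_NORMAL };
enum v6_cache { V6_NOT_APPLICABLE, V6_WRITE_THROUGH, V6_WRITE_BACK };

/* Hypothetical result type: the ARMv6 interpretation of a legacy type. */
typedef struct {
    enum v6_type  type;
    bool          shared;
    enum v6_cache cache;   /* only meaningful for Normal memory */
} v6_attr;

static v6_attr legacy_to_v6(enum legacy_type t)
{
    v6_attr a = { V6_STRONGLY_ORDERED, true, V6_NOT_APPLICABLE };

    switch (t) {
    case LEGACY_NCNB:   /* Strongly Ordered, always treated as shared */
        break;
    case LEGACY_NCB:    /* Shared Device */
        a.type = V6_DEVICE;
        break;
    case LEGACY_WT:     /* Non-Shared Normal, Write-Through Cacheable */
        a.type = V6_NORMAL;
        a.shared = false;
        a.cache = V6_WRITE_THROUGH;
        break;
    case LEGACY_WB:     /* Non-Shared Normal, Write-Back Cacheable */
        a.type = V6_NORMAL;
        a.shared = false;
        a.cache = V6_WRITE_BACK;
        break;
    }
    return a;
}
```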
6.8
MMU aborts
Mechanisms that can cause the processor to take an exception because of a memory access are:
MMU fault
The MMU detects a restriction and signals the processor.
Debug abort
Monitor debug-mode debug is enabled and a breakpoint or a watchpoint
has been detected.
External abort
The external memory system signals an illegal or faulting memory access.
Collectively these are called aborts. Accesses that cause aborts are said to be aborted. If the
memory request that aborts is an instruction fetch, then a Prefetch Abort exception is raised if
and when the processor attempts to execute the instruction corresponding to the aborted access.
If the aborted access is a data access or a cache maintenance operation, a Data Abort exception
is raised.
All Data Aborts, and aborts caused by cache maintenance operations, cause the Data Fault
Status Register (DFSR) to be updated so that you can determine the cause of the abort.
For all Data Aborts, except external aborts that are not on translation, the Fault Address
Register (FAR) is updated with the address that caused the abort. External Data Aborts, other
than on translation, can all be imprecise and therefore the FAR does not contain the address of
the abort. See Imprecise Data Abort mask in the CPSR/SPSR on page 2-47 for more details on
imprecise Data Aborts.
For all prefetch aborts the processor updates the Instruction Fault Address Register (IFAR) with
the address of the instruction that causes the abort.
When the EA bit is set, see c1, Secure Configuration Register on page 3-52, all external aborts
are trapped to the Secure Monitor mode, and only the Secure versions of the FSR and FAR
registers are updated. In all other cases, the FAR or FSR registers are updated in the world
corresponding to the state of the core that caused the aborted access. For example if the core is
in Secure state, the Secure version of the FAR and FSR are updated, even in the case when the
aborted access has been performed with NS rights because of the NS Attribute being Non-secure
in the MMU.
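The rule for which world's fault registers are updated can be sketched as follows. The names are hypothetical, and the fragment models only the selection described in the paragraph above, not the exception entry itself.

```c
#include <stdbool.h>

enum world { WORLD_SECURE, WORLD_NONSECURE };

/* Selects the world whose FSR and FAR are updated on an abort.
 * scr_ea is the EA bit of the Secure Configuration Register and
 * core_secure is the state of the core that made the access. */
static enum world fault_register_world(bool external_abort,
                                       bool scr_ea,
                                       bool core_secure)
{
    if (external_abort && scr_ea) {
        return WORLD_SECURE;   /* trapped to Secure Monitor mode */
    }
    /* The NS Attribute of the access does not affect the choice. */
    return core_secure ? WORLD_SECURE : WORLD_NONSECURE;
}
```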
6.8.1
External aborts
External memory errors are defined as those that occur in the memory system other than those
that are detected by an MMU. External memory errors are expected to be extremely rare and are
likely to be fatal to the running process. Examples of events that can cause an external memory
error are:
•
an uncorrectable parity or ECC error on a level two memory structure
•
a Non-secure access to Secure memory.
External abort on instruction fetch
Externally generated errors during an instruction prefetch are precise in nature, and are only
recognized by the processor if it attempts to execute the instruction fetched from the location
that caused the error. The resulting failure is reported in the Instruction Fault Status Register if
no higher priority abort, including a Data Abort, has taken place.
The IFAR is updated with the address of the instruction that causes the abort.
External abort on data read/write
Externally generated errors during a data read or write can be imprecise. This means that
R14_abt on entry into the abort handler on such an abort might not hold an address that is related
to the instruction that caused the exception. Correspondingly, external aborts can be
unrecoverable. See Aborts on page 2-45 for more details.
The Fault Address Register is updated with an invalid value, all zeros, on an imprecise external
abort on a data access.
If a precise external abort occurs during a multiple load or store operation, the FAR in the
appropriate world is always updated with the base address of an AXI burst.
External abort on VA to PA translation operation
For VA to PA translation operations, the only case when an external abort can be asserted is
during the page table walk.
In this case, the external abort is precise, and both the DFSR and the FAR are updated in the
world, Secure or Non-secure, that generated the VA to PA translation operation. This is in
addition to the standard abort mechanism for VA to PA translation operations, which
updates the PA register of the corresponding world with the appropriate FSR encoding.
External abort on a hardware page table walk
An external abort occurring on a hardware page table access must be returned with the page
table data. Such aborts are precise. The FAR is updated on an external abort on a hardware page
table walk on a data access, and the IFAR is updated on an external abort on a hardware page
table walk on an instruction access. The appropriate Fault Status Register indicates that this has
occurred.
6.9
MMU fault checking
During the processing of a section or page, the MMU behaves differently because it is checking
for faults. The MMU can generate these faults:
•
Alignment fault on page 6-32
•
Translation fault on page 6-32
•
Access bit fault on page 6-32
•
Domain fault on page 6-33
•
Permission fault on page 6-33.
Aborts that are detected by the MMU are taken before any external memory access takes place.
Alignment fault checking is enabled by the A bit in the CP15 Control Register. This bit is
duplicated in the Secure and Non-secure worlds for the support of TrustZone. Alignment fault
checking is independent of the MMU being enabled. Translation, access bit, domain, and
permission faults are only generated when the MMU is enabled.
The access control mechanisms of the MMU detect the conditions that produce these faults. If
a fault is detected as the result of a memory access, the MMU aborts the access and signals the
fault condition to the processor. The MMU retains status and address information about faults
generated by data accesses in DFSR and FAR, see Fault status and address on page 6-34. The
MMU does not retain status about faults generated by instruction fetches.
An access violation for a given memory access inhibits any corresponding external access, and
an abort is returned to the processor.
6.9.1
Fault checking sequence
Figure 6-2 and Figure 6-3 on page 6-31 show the fault checking sequence for translation table
managed TLB modes.
[Flowchart. Starting from the virtual address, the checks and the faults they can raise are, in order:
address alignment, when alignment checking is enabled (Alignment fault); page table walk
disabled (Section translation fault); external abort on the first-level descriptor fetch
(Translation external abort, first level); descriptor fault (Section/Page translation abort);
access bit fault (Section/Page access flag fault). The sequence continues at connector A.]
Figure 6-2 Translation table managed TLB fault checking sequence part 1
[Flowchart. Continuing from connector A. For a page, the MMU gets the second-level descriptor,
which can raise a Translation external abort (2nd level), a Page translation fault for an invalid
descriptor, or a Page access bit fault. For both sections and pages, the domain check raises a
Section or Page domain fault for No access, and distinguishes the Client and Manager access
types. An Alignment fault is raised when the condition MMU on, Strongly ordered or Device,
Unaligned access is true. The access permission check raises a Section permission fault or a
Sub-page permission fault on a violation. If no fault is raised, the result is the physical address.]
Figure 6-3 Translation table managed TLB fault checking sequence part 2
6.9.2
Alignment fault
An alignment fault occurs if the processor has attempted to access a particular data memory size
at an address location that is not aligned with that size.
Operation of unaligned accesses on page 4-13 describes the conditions for generating
Alignment faults.
Alignment checks are performed with the MMU both enabled and disabled.
6.9.3
Translation fault
There are two types of translation fault:
Section
A section translation fault occurs if:
•
The TLB tries to perform a page table walk but the page table walk is
disabled by one of the PD0 or PD1 bits. For more details, see Hardware
page table translation on page 6-36.
•
The TLB fetches a first level translation table descriptor, and this first level
descriptor is invalid. This is the case when bits[1:0] of this descriptor are
b00 or b11.
Page
A page translation fault occurs if the TLB fetches a second-level translation table
descriptor and this descriptor is marked as invalid, bits [1:0] = b00.
6.9.4
Access bit fault
When the Force AP bit, see c1, Control Register on page 3-44 bit [29], is set then AP[0]
indicates if there is an Access Bit Fault.
This bit is only taken into account when the MMU is in ARMv6 mode, that is XP=1, bit [23] in
the CP15 Control register.
In the configuration XP=1 and ForceAP=1, the OS uses only bits APX and AP[1] as Access
Permission bits, and AP[0] becomes an Access Bit, see Access permissions on page 6-11. The
Access Bit records recent TLB access to a page, or section, and the OS can use this to optimize
memory management algorithms.
In the ARM1176JZ-S processor the Access Bit must be managed by the software.
Reading a page table entry into the TLB when the Access Bit is 0 causes an Access Bit fault.
This fault is readily distinguished from other faults that the TLB generates and this permits fast
setting of the Access Bit in software.
The processor can generate two kinds of Access Bit fault:
•
Section Access Bit fault, when the Access Bit, AP[0], is contained in a first level
translation table descriptor
•
Page Access Bit fault, when the Access Bit, AP[0], is contained in a second level
translation table descriptor
The Force AP bit is banked in the Secure and Non-secure copies of the CP15 Control Register
for TrustZone support.
The Force AP and XP bits are expected to be static throughout operations.
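Because the Access Bit is managed by software, an Access Bit fault handler typically sets AP[0] in the faulting descriptor and retries the access. The following is a minimal sketch of that descriptor update only, assuming XP=1 and ForceAP=1 and the AP field positions shown in the ARMv6 descriptor figures (AP at bits [11:10] of a section descriptor and bits [5:4] of a page descriptor). The macro and function names are hypothetical.

```c
#include <stdint.h>

/* AP[0] of a section or supersection descriptor (AP is bits [11:10]). */
#define L1_SECTION_AP0 (1u << 10)
/* AP[0] of a large or extended small page descriptor (AP is bits [5:4]). */
#define L2_PAGE_AP0    (1u << 4)

/* Nonzero if a section descriptor would cause a Section Access Bit fault. */
static int section_access_bit_clear(uint32_t l1_desc)
{
    return (l1_desc & L1_SECTION_AP0) == 0;
}

/* Return the section descriptor with its Access Bit set. */
static uint32_t section_set_access_bit(uint32_t l1_desc)
{
    return l1_desc | L1_SECTION_AP0;
}

/* Nonzero if a page descriptor would cause a Page Access Bit fault. */
static int page_access_bit_clear(uint32_t l2_desc)
{
    return (l2_desc & L2_PAGE_AP0) == 0;
}

/* Return the page descriptor with its Access Bit set. */
static uint32_t page_set_access_bit(uint32_t l2_desc)
{
    return l2_desc | L2_PAGE_AP0;
}
```

A real handler must also write the updated descriptor back to the translation table and carry out whatever cache and TLB maintenance the system requires. That part is not shown here.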
Any change in the Force AP or XP bit configuration to enable or disable the generation of
Access Bit faults takes effect immediately. If a TLB lookup hits an entry that was created
before Access Bit fault generation was enabled, and that entry contains AP[0]=0, the TLB
generates an Access Bit fault.
6.9.5
Domain fault
There are two types of domain fault:
Section
For a section the domain is checked when the first-level descriptor is returned.
Page
For a page the domain is checked when the second-level descriptor is returned.
For each type, the first-level descriptor indicates the domain in CP15 c3, the Domain Access
Control Register, to select. If the selected domain has bit 0 set to 0 indicating either no access
or reserved, then a domain fault occurs.
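The domain check can be sketched in C as follows. This assumes the usual layout of the Domain Access Control Register, with sixteen two-bit fields and domain n held in bits [2n+1:2n]. That layout is not restated in this section. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Two-bit access field values in the Domain Access Control Register. */
enum domain_access {
    DOMAIN_NO_ACCESS = 0, /* b00: any access generates a domain fault */
    DOMAIN_CLIENT    = 1, /* b01: access permissions are checked */
    DOMAIN_RESERVED  = 2, /* b10: reserved, also generates a domain fault */
    DOMAIN_MANAGER   = 3  /* b11: access permissions are not checked */
};

/* Return the access field for one of the 16 domains (assumed layout). */
static enum domain_access dacr_field(uint32_t dacr, unsigned domain)
{
    assert(domain < 16);
    return (enum domain_access)((dacr >> (2 * domain)) & 0x3u);
}

/* A domain fault occurs when bit 0 of the selected field is 0. */
static int causes_domain_fault(uint32_t dacr, unsigned domain)
{
    return ((unsigned)dacr_field(dacr, domain) & 0x1u) == 0;
}
```

For example, with a register value of 0x0000000D, domain 0 is Client, domain 1 is Manager, and every other domain generates a domain fault.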
6.9.6
Permission fault
If the two-bit domain field returns Client, the access permission check is performed on the
access permission field in the TLB entry. A permission fault occurs if the access permission
check fails.
6.9.7
Debug event
When Monitor debug-mode debug is enabled, an abort can be taken as the result of a breakpoint
on an instruction access or a watchpoint on a data access. In both cases the memory system
completes the access before the abort is taken. If an abort is taken when in Monitor debug-mode
debug then the appropriate FSR, IFSR or DFSR, is updated to indicate a debug abort.
If a watchpoint is taken the WFAR is set to the address that caused the watchpoint. Watchpoints
are not taken precisely because following instructions can run underneath load and store
multiples.
6.10
Fault status and address
Table 6-11 lists the encodings for the Fault Status Register.
Table 6-11 Fault Status Register encoding
Priority  Sources                                            FSR[10,3:0]  Domain   FSR[12]
Highest   Alignment                                          b00001       Invalid  SBZ
          TLB miss                                           b00000       Invalid  SBZ
          Instruction cache maintenance operation fault (a)  b00100       Invalid  SBZ
          External abort on translation, first-level         b01100       Invalid  SLVERR !DECERR
          External abort on translation, second-level        b01110       Valid    SLVERR !DECERR
          Translation, Section                               b00101       Invalid  SBZ
          Translation, Page                                  b00111       Valid    SBZ
          Access Bit Fault, Force AP only, Section           b00011       Valid    SBZ
          Access Bit Fault, Force AP only, Page              b00110       Valid    SBZ
          Domain, Section                                    b01001       Valid    SBZ
          Domain, Page                                       b01011       Valid    SBZ
          Permission, Section                                b01101       Valid    SBZ
          Permission, Page                                   b01111       Valid    SBZ
          Precise external abort                             b01000       Valid    SLVERR !DECERR
          Imprecise external abort                           b10110       Invalid  SLVERR !DECERR
          Parity error exception, not supported              b11000       Invalid  SBZ
Lowest    Instruction debug event                            b00010       Valid    SBZ
a. These aborts cannot be signaled with the IFSR because they do not occur on the instruction side.
Note
All other Fault Status encodings are reserved.
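As an illustration of how software might use Table 6-11, the following sketch assembles the five-bit status value from bit 10 and bits [3:0] of a Fault Status Register value and maps it to the source named in the table. Reading the register itself with a CP15 access is not shown, and the function names are hypothetical.

```c
#include <stdint.h>

/* Combine FSR bit 10 and bits [3:0] into the 5-bit status of Table 6-11. */
static unsigned fsr_status(uint32_t fsr)
{
    return (unsigned)((((fsr >> 10) & 0x1u) << 4) | (fsr & 0xFu));
}

/* Map a Fault Status Register value to the source listed in Table 6-11. */
static const char *fsr_source(uint32_t fsr)
{
    switch (fsr_status(fsr)) {
    case 0x01: return "Alignment";
    case 0x00: return "TLB miss";
    case 0x04: return "Instruction cache maintenance operation fault";
    case 0x0C: return "External abort on translation, first-level";
    case 0x0E: return "External abort on translation, second-level";
    case 0x05: return "Translation, Section";
    case 0x07: return "Translation, Page";
    case 0x03: return "Access Bit fault, Section";
    case 0x06: return "Access Bit fault, Page";
    case 0x09: return "Domain, Section";
    case 0x0B: return "Domain, Page";
    case 0x0D: return "Permission, Section";
    case 0x0F: return "Permission, Page";
    case 0x08: return "Precise external abort";
    case 0x16: return "Imprecise external abort";
    case 0x18: return "Parity error exception, not supported";
    case 0x02: return "Instruction debug event";
    default:   return "Reserved";
    }
}
```

The Domain and FSR[12] columns are not decoded here. A handler that needs them must first check that the table marks the domain field as valid for the reported source.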
If a translation abort occurs during a Data Cache maintenance operation by virtual address, then
a Data Abort is taken and the DFSR indicates the reason. The FAR indicates the faulting address,
and the IFAR indicates the address of the instruction causing the abort.
If a translation abort occurs during an Instruction Cache maintenance operation by virtual
address, then a Data Abort is taken, and an Instruction Cache Maintenance Operation Fault is
indicated in the DFSR. The IFSR indicates the reason. The FAR indicates the faulting address,
and the IFAR indicates the address of the instruction causing the abort.
Domain and fault address information is only available for data accesses. For instruction aborts
R14 must be used to determine the faulting address. You can determine the domain information
by performing a TLB lookup for the faulting address and extracting the domain field.
Table 6-12 on page 6-35 lists a summary of the abort vector that is taken, and the Fault Status
and Fault Address Registers that are updated for each abort type.
Table 6-12 Summary of aborts
                                                                       Register updated?
Abort type                                 Abort taken     Precise?  IFSR     IFAR  DFSR     FAR      WFAR
Instruction MMU fault                      Prefetch Abort  Yes       Yes      Yes   No       No       No
Instruction debug abort                    Prefetch Abort  Yes       Yes      No    No       No       No
Instruction external abort on translation  Prefetch Abort  Yes (a)   Yes (a)  Yes   No       No       No
Instruction external abort                 Prefetch Abort  Yes (a)   Yes (a)  Yes   No       No       No
Instruction cache maintenance operation    Data Abort      Yes       Yes      No    Yes      Yes      No
Data MMU fault                             Data Abort      Yes       No       No    Yes      Yes      No
Data debug abort                           Data Abort      No        No       No    Yes      Yes      Yes
Data external abort on translation         Data Abort      Yes (a)   No       No    Yes (a)  Yes (a)  No (a)
Data external abort                        Data Abort      No (b)    No       No    Yes (a)  Yes      No
Data cache maintenance operation           Data Abort      Yes       No       No    Yes      Yes      No
a. When the EA bit is set, the updated FSR or FAR is always Secure.
b. Data Aborts can be precise, see External aborts on page 6-27 for more details.
6.11
Hardware page table translation
The processor MMU implements the hardware page table walking mechanism from ARMv4
and ARMv5 cached processors with the exception of the removal of the fine page table
descriptor and the addition of the page table walk disable bits in the TTB Control register.
The processor implements the page table walk disable feature. Two bits, PD0 and PD1, are
implemented in the TTB Control register. These bits are banked for the Secure and Non-secure
worlds for the support of TrustZone.
Each time a TLB miss occurs, the TLB computes the parameters for an automatic hardware page
table walk. The address of the page table walk is computed from TTB0 or TTB1, see First-level
descriptor address on page 6-43. If the address is computed with TTB0, and the PD0 bit is set
in the TTB Control register of the corresponding world, or if the address is computed using
TTB1 and the PD1 bit is set, then the processor does not perform the automatic hardware page
table walk, and it generates a Section translation fault instead.
With this feature, only a small portion of the memory can be mapped in one world, for example
the Secure world, if the code that runs in this world is expected to be small. This gives the system
a simple way to avoid using a lot of memory to store full page tables.
When hardware page table walks are not disabled, the processor performs the page table walk
in the usual way. A hardware page table walk occurs whenever there is a TLB miss. Processor
hardware page table walks do not cause a read from the level one Unified/Data Cache or the
TCM. The P, RGN, S, and C bits in the Translation Table Base Registers determine the memory
region attributes for the page table walk.
Two formats of page tables are supported:
•
A backwards-compatible format supporting subpage access permissions. These have been
extended so that certain page table entries support extended region types and with the NS
Attribute bit for TrustZone.
•
ARMv6 format, not supporting sub-page access permissions, but with support for
ARMv6 MMU features. The NS Attribute bit for TrustZone has also been added. These
features are:
— extended region types
— global and process specific pages
— more access permissions
— marking of Shared and Non-Shared regions
— marking of Execute-Never regions.
Additionally, two translation table base registers are provided in each world. On a TLB miss,
the Translation Table Base Control Register, CP15 c2 that is also duplicated in each world, and
the top bits of the virtual address determine if the first or second translation table base is used.
See c2, Translation Table Base Control Register on page 3-61 for details. The first-level
descriptor indicates whether the access is to a section or to a page table. If the access is to a page
table, the processor MMU fetches a second-level descriptor.
A coarse page table holds 256 32-bit entries, each describing 4KB of virtual address space, so the table is 1KB in size. You can determine the page type by
examining bits [1:0] of the second-level descriptor. For both first and second level descriptors if
bits [1:0] are b00, the associated virtual addresses are unmapped, and attempts to access them
generate a translation fault. Software can use bits [31:2] for its own purposes in such a
descriptor, because they are ignored by the hardware. Where appropriate, ARM Limited
recommends that bits [31:2] continue to hold valid access permissions for the descriptor.
For both level 1 and level 2 page table walks, the processor performs external accesses with
Secure or Non-secure rights depending on the Secure or Non-secure state of the MMU request
that causes the page table walk. This ensures that Secure translation table descriptors are always
fetched from a Secure memory, and that Non-secure translation table descriptors are always
fetched from Non-secure memory.
6.11.1
Backwards-compatible page table translation subpage AP bits enabled
When the CP15 Control Register c1 bit 23 is set to 0, the subpage AP bits are enabled and the
page table formats are backwards-compatible with ARMv4 and ARMv5 MMU architectures.
This bit is duplicated as Secure and Non-secure versions so that the system can enable or disable
subpages independently in each world.
All mappings are treated as global, and executable, XN = 0. All Normal memory is Non-Shared.
Device memory can be Shared or Non-Shared as determined by the TEX bits and the C and B
bits. For large and small pages, there can be four subpages defined with different access
permissions. For a large page, the subpage size is 16KB and is accessed using bits [15:14] of the
page index of the virtual address. For a small page, the subpage size is 1KB and is accessed
using bits [11:10] of the page index of the virtual address.
The use of subpage AP bits where AP3, AP2, AP1, and AP0 contain different values is
deprecated.
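The subpage selection described above can be sketched as follows. The sketch assumes the backwards-compatible second-level descriptor layout in which AP0 occupies bits [5:4], AP1 bits [7:6], AP2 bits [9:8], and AP3 bits [11:10]. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Subpage index within a 64KB large page: virtual address bits [15:14]. */
static unsigned large_page_subpage(uint32_t va)
{
    return (va >> 14) & 0x3u;
}

/* Subpage index within a 4KB small page: virtual address bits [11:10]. */
static unsigned small_page_subpage(uint32_t va)
{
    return (va >> 10) & 0x3u;
}

/* Return the two-bit APn field for subpage n of a large or small page. */
static unsigned subpage_ap(uint32_t l2_desc, unsigned subpage)
{
    assert(subpage < 4);
    return (l2_desc >> (4 + 2 * subpage)) & 0x3u;
}
```

For a small page descriptor with AP3 to AP0 set to 3, 2, 1, and 0, an access at offset 0xC00 in the page falls in subpage 3 and uses AP3.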
Backwards-compatible page table format
Figure 6-4 shows a backwards-compatible format first-level descriptor.
Translation fault
    [31:2] Ignored, [1:0] b00
Coarse page table
    [31:10] Coarse page table base address, [9] P, [8:5] Domain, [4] SBZ, [3] NS, [2] SBZ,
    [1:0] b01
Section (1MB)
    [31:20] Section base address, [19] NS, [18] 0, [17:15] SBZ, [14:12] TEX, [11:10] AP,
    [9] P, [8:5] Domain, [4] 0, [3] C, [2] B, [1:0] b10
Supersection (16MB)
    [31:24] Supersection base address, [23:20] SBZ, [19] NS, [18] 1, [17:15] SBZ,
    [14:12] TEX, [11:10] AP, [9] P, [8:5] Ignored, [4] 0, [3] C, [2] B, [1:0] b10
Reserved
    [1:0] b11
Figure 6-4 Backwards-compatible first-level descriptor format
If the P bit is supported and set for the memory region, it indicates to the system memory
controller that this memory region has ECC enabled. ARM1176JZ-S processors do not support
the P bit.
When bits [1:0] of the first-level descriptor are b01, the descriptor points to a second-level page
table, called a Coarse page table. Figure 6-5 on page 6-38 shows the backwards-compatible
second-level descriptor formats.
Translation fault
    [31:2] Ignored, [1:0] b00
Large page (64KB)
    [31:16] Large page base address, [15] SBZ, [14:12] TEX, [11:10] AP3, [9:8] AP2,
    [7:6] AP1, [5:4] AP0, [3] C, [2] B, [1:0] b01
Small page (4KB)
    [31:12] Small page base address, [11:10] AP3, [9:8] AP2, [7:6] AP1, [5:4] AP0, [3] C,
    [2] B, [1:0] b10
Extended small page (4KB)
    [31:12] Extended small page base address, [11:9] SBZ, [8:6] TEX, [5:4] AP, [3] C, [2] B,
    [1:0] b11
Figure 6-5 Backwards-compatible second-level descriptor format
For extended small page table entries without a TEX field you must use the value b000. For
details of TEX encodings see C and B bit, and type extension field encodings on page 6-14.
Note
For any Supersection description in a first-level page table, and any Large page description in a
second-level page table:
•
you must repeat the description in 16 consecutive page table locations
•
the first description must occur on a 16-word boundary
For more information see the ARM Architecture Reference Manual.
Figure 6-6 shows an overview of the section, supersection, and page translation process using
backwards-compatible descriptors.
[Diagram not reproduced. It shows the following translation paths:
The translation table base, indexed by VA[31:20], selects a first-level descriptor.
    b00: invalid.
    b01: coarse page table, base address from L1D[31:10], indexed by VA[19:12] to select a
    second-level descriptor:
        b00: invalid.
        b01: 64KB large page, base address from L2D[31:16], indexed by VA[15:0], divided into
        four 16KB subpages.
        b10: 4KB small page, base address from L2D[31:12], indexed by VA[11:0], divided into
        four 1KB subpages.
        b11: 4KB extended small page, base address from L2D[31:12], indexed by VA[11:0].
    b10 with bit 18 = 0: 1MB section, base address from L1D[31:20], indexed by VA[19:0].
    b10 with bit 18 = 1: 16MB supersection, base address from L1D[31:24], indexed by VA[23:0].]
Figure 6-6 Backwards-compatible section, supersection, and page translation
6.11.2
ARMv6 page table translation subpage AP bits disabled
When the CP15 Control Register c1 Bit 23 is set to 1 in the corresponding world, the subpage
AP bits are disabled and the page tables have support for ARMv6 MMU features. Four new page
table bits are added to support these features:
•
The Not-Global (nG) bit, determines if the translation is marked as global (0), or
process-specific (1) in the TLB. For process-specific translations the translation is
inserted into the TLB using the current ASID, from the ContextID Register, CP15 c13.
•
The Shared (S) bit, determines if the translation is for Non-Shared (0), or Shared (1)
memory. This only applies to Normal memory regions. Device memory can be Shared or
Non-Shared as determined by the TEX bits and the C and B bits.
•
The Execute-Never (XN) bit, determines if the region is Executable (0) or Not-executable
(1).
•
Three access permission bits. The access permissions extension (APX) bit, provides an
extra access permission bit.
All ARMv6 page table mappings support the TEX field.
ARMv6 page table format
Whether or not subpages are enabled, all first-level descriptors have been enhanced with the
addition of the NS Attribute bit to enable the support of TrustZone.
Figure 6-7 shows the format of an ARMv6 first-level descriptor when subpages are disabled.
Translation fault
    [31:2] Ignored, [1:0] b00
Coarse page table
    [31:10] Coarse page table base address, [9] P, [8:5] Domain, [4] SBZ, [3] NS, [2] SBZ,
    [1:0] b01
Section (1MB)
    [31:20] Section base address, [19] NS, [18] 0, [17] nG, [16] S, [15] APX, [14:12] TEX,
    [11:10] AP, [9] P, [8:5] Domain, [4] XN, [3] C, [2] B, [1:0] b10
Supersection (16MB)
    [31:24] Supersection base address, [23:20] SBZ, [19] NS, [18] 1, [17] nG, [16] S,
    [15] APX, [14:12] TEX, [11:10] AP, [9] P, [8:5] Ignored, [4] XN, [3] C, [2] B, [1:0] b10
Translation fault (Reserved)
    [1:0] b11
Figure 6-7 ARMv6 first-level descriptor formats with subpages disabled
If the P bit is supported and set for the memory region, it indicates to the system memory
controller that this memory region has ECC enabled. ARM1176JZ-S processors do not support
the P bit. In addition to the invalid translation, bits [1:0] = b00, translations for the reserved
entry, bits [1:0] = b11, result in a translation fault.
As shown in Figure 6-7, bits [1:0] of a level 1 page table entry determine the type of the entry:
Bits [1:0] == b00
Translation fault.
Bits [1:0] == b01
The entry points to a second-level page table, called a Coarse page table.
Figure 6-8 on page 6-40 shows the formats of the possible entries in the Coarse
page table.
Bits [1:0] == b10
The entry points to either a 1MB Section of memory or a 16MB Supersection
of memory. Bit [18] of the descriptor selects between a Section and a
Supersection. For details of supersections see Supersections on page 6-6.
Note
You must repeat any Supersection description in 16 consecutive page table
locations, with the first description occurring on a 16-word boundary. For more
information see the ARM Architecture Reference Manual.
Bits [1:0] == b11
Reserved.
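The first-level entry types listed above can be distinguished with a short decode routine. This is a minimal sketch that uses only bits [1:0] and bit [18] as described. The enumeration and function names are hypothetical.

```c
#include <stdint.h>

enum l1_type {
    L1_FAULT,        /* bits [1:0] == b00: translation fault */
    L1_COARSE,       /* bits [1:0] == b01: points to a Coarse page table */
    L1_SECTION,      /* bits [1:0] == b10, bit [18] == 0: 1MB Section */
    L1_SUPERSECTION, /* bits [1:0] == b10, bit [18] == 1: 16MB Supersection */
    L1_RESERVED      /* bits [1:0] == b11: reserved, gives a translation fault */
};

/* Classify a first-level descriptor from bits [1:0] and bit [18]. */
static enum l1_type l1_descriptor_type(uint32_t desc)
{
    switch (desc & 0x3u) {
    case 0x1:
        return L1_COARSE;
    case 0x2:
        return (desc & (1u << 18)) ? L1_SUPERSECTION : L1_SECTION;
    case 0x3:
        return L1_RESERVED;
    default:
        return L1_FAULT;
    }
}
```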
Figure 6-8 shows the format of an ARMv6 second-level descriptor.
Translation fault
    [31:2] Ignored, [1:0] b00
Large page (64KB)
    [31:16] Large page base address, [15] XN, [14:12] TEX, [11] nG, [10] S, [9] APX,
    [8:6] SBZ, [5:4] AP, [3] C, [2] B, [1:0] b01
Extended small page (4KB)
    [31:12] Extended small page base address, [11] nG, [10] S, [9] APX, [8:6] TEX, [5:4] AP,
    [3] C, [2] B, [1] 1, [0] XN
Figure 6-8 ARMv6 second-level descriptor format
As shown in Figure 6-8, bits [1:0] of a second-level descriptor determine the type of the
descriptor:
Bits [1:0] == b00
Translation fault.
Bits [1:0] == b01
The entry points to a 64KB Large page in memory.
Note
You must repeat any Large page description in 16 consecutive page table
locations, with the first description occurring on a 16-word boundary. For more
information see the ARM Architecture Reference Manual.
Bits [1:0] == b1x
The entry points to a 4KB Extended small page in memory.
Bit [0] of the entry is the XN bit for the entry.
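A decode routine for ARMv6 second-level descriptors might look like the following sketch. It takes the Extended small page XN bit from bit [0] as stated above, and assumes that the Large page XN bit is bit [15], as the descriptor format figure indicates. The type and function names are hypothetical.

```c
#include <stdint.h>

enum l2_type {
    L2_FAULT,          /* bits [1:0] == b00: translation fault */
    L2_LARGE_PAGE,     /* bits [1:0] == b01: 64KB Large page */
    L2_EXT_SMALL_PAGE  /* bits [1:0] == b1x: 4KB Extended small page */
};

struct l2_info {
    enum l2_type type;
    int xn; /* 1 if the page is marked Execute-Never */
};

/* Classify an ARMv6 format second-level descriptor and extract XN. */
static struct l2_info l2_decode_armv6(uint32_t desc)
{
    struct l2_info info = { L2_FAULT, 0 };

    if (desc & 0x2u) {
        info.type = L2_EXT_SMALL_PAGE;
        info.xn = (int)(desc & 0x1u);         /* XN is bit [0] */
    } else if (desc & 0x1u) {
        info.type = L2_LARGE_PAGE;
        info.xn = (int)((desc >> 15) & 0x1u); /* XN assumed at bit [15] */
    }
    return info;
}
```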
Figure 6-9 on page 6-41 shows an overview of the section, supersection, and page translation
process using ARMv6 descriptors.
[Diagram not reproduced. It shows the following translation paths:
The translation table base, indexed by VA[31:20], selects a first-level descriptor.
    b00: invalid.
    b01: coarse page table, base address from L1D[31:10], indexed by VA[19:12] to select a
    second-level descriptor:
        b00: invalid.
        b01: 64KB large page, base address from L2D[31:16], indexed by VA[15:0].
        b1 followed by XN: 4KB extended small page, base address from L2D[31:12], indexed by
        VA[11:0].
    b10 with bit 18 = 0: 1MB section, base address from L1D[31:20], indexed by VA[19:0].
    b10 with bit 18 = 1: 16MB supersection, base address from L1D[31:24], indexed by VA[23:0].]
Figure 6-9 ARMv6 section, supersection, and page translation
6.11.3
Restrictions on page table mappings page coloring
The processor uses virtually indexed, physically addressed caches. To prevent alias problems
where cache sizes greater than 16KB have been implemented, you must restrict the mapping of
pages that remap virtual address bits [13:12].
•
for the Instruction Cache, the Isize P bit, bit[11], of the Cache Type Register CP15 c0,
indicates if this is necessary
•
for the Data Cache, the Dsize P bit, bit[23], of the Cache Type Register CP15 c0, indicates
if this is necessary.
See c0, Cache Type Register on page 3-21 for more information.
This restriction, referred to as page coloring, enables the virtual address bits[13:12] to be used
to index into the cache without requiring hardware support to avoid alias problems.
For pages marked as Non-Shared, if bit 11 or bit 23 of the Cache Type Register is set, the
restriction applies to pages that remap virtual address bits [13:12] and might cause aliasing
problems when 4KB pages are used. To prevent this you must ensure the following restrictions
are applied:
1.
If multiple virtual addresses are mapped onto the same physical address, then bits [13:12]
of all the virtual addresses must be equal to each other and to bits [13:12] of the
physical address. The same physical address can be mapped by TLB entries of
different page sizes, including page sizes over 4KB. Imposing this requirement on the
virtual address is called page coloring.
2.
Alternatively, if all mappings to a physical address are of a page size equal to 4KB, then
the restriction that bits [13:12] of the virtual address must equal bits [13:12] of the
physical address is not necessary. Bits [13:12] of all virtual address aliases must still be
equal.
There is no restriction on the more significant bits in the virtual address equalling those in the
physical address.
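An OS could check a proposed set of aliases against these two restrictions before installing the mappings. The following is a minimal sketch of such a check. The function names, and the idea of passing the aliases as an array, are illustrative assumptions rather than anything the processor defines.

```c
#include <stddef.h>
#include <stdint.h>

/* The page color of an address is the value of bits [13:12]. */
static unsigned page_color(uint32_t addr)
{
    return (addr >> 12) & 0x3u;
}

/* Restriction 1: every virtual alias has the same color as the
 * physical address. Returns 1 if the restriction is met. */
static int aliases_match_physical(uint32_t pa, const uint32_t *vas, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (page_color(vas[i]) != page_color(pa))
            return 0;
    }
    return 1;
}

/* Restriction 2, for when all mappings use 4KB pages: the aliases need
 * only have the same color as each other. Returns 1 if they do. */
static int aliases_match_each_other(const uint32_t *vas, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (page_color(vas[i]) != page_color(vas[0]))
            return 0;
    }
    return 1;
}
```

These checks only matter when the Cache Type Register indicates that the restriction applies, and they are unnecessary when the cache size is restricted to 16KB.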
Avoiding the page coloring restriction
The processor provides the ability to restrict the cache size to 16KB so that software does not
have to support the page coloring restriction on mapping, see CZ bit in c1, Auxiliary Control
Register on page 3-49.
Note
Setting the CZ flag in the CP15 Auxiliary Control Register does not affect the contents of the
CP15 Cache Type Register. However, when the CZ flag is set, all caches are limited to 16KB,
even if a larger cache size is specified in the CP15 Cache Type Register.
6.12
MMU descriptors
To support sections and pages, the processor MMU uses a two-level descriptor definition. The
first-level descriptor indicates whether the access is to a section or to a page table. If the access
is to a page table, the processor MMU determines the page table type and fetches a second-level
descriptor.
6.12.1
First-level descriptor address
The ARM1176 contains:
•
two Translation Table Base Registers, TTBR0 and TTBR1
•
one Translation Table Base Control Register (TTBCR).
On a TLB miss, the top bits of the modified virtual address determine whether the first or second
Translation Table Base is used. Figure 6-10 on page 6-44 shows the creation of a first-level
descriptor address.
The expected use of two translation tables is to reduce the cost of OS context switches by
enabling the OS, and each individual task or process, to have its own pagetable without
consuming much memory.
In this model, the virtual address space is divided into two regions:
•
0x0 -> 1<<(32-N) that TTBR0 controls
•
1<<(32-N) -> 4GB that TTBR1 controls.
The value of N is set in the TTBCR. If N is zero, then TTBR0 is used for all addresses, and that
gives legacy v5 behavior. If N is not zero, the OS and memory mapped IO are located in the
upper part of the memory map, TTBR1, and the tasks or processes all occupy the same virtual
address space in the lower part of the memory, TTBR0.
The TTBCR, TTBR0, and TTBR1 registers used for this process are banked. Depending on the
state of the MMU requests that cause a page table walk, either Secure or Non-secure registers
are used.
The translation table that TTBR0 points to can be truncated because it must only cover the first
1<<(32-N) bytes of memory. The first entry always corresponds to address 0x0, so this
mechanism is more efficient if processes start at a low virtual address such as 0x0 or 0x8000.
Table 6-13 lists the translation table size.
Table 6-13 Translation table size
N  Upper boundary  Translation table 0 size
0  4GB             16KB, 4096 entries, v5 behavior, TTBR1 not used.
1  2GB             8KB, 2048 entries
2  1GB             4KB, 1024 entries
3  512MB           2KB, 512 entries
4  256MB           1KB, 256 entries
5  128MB           512B, 128 entries
6  64MB            256B, 64 entries
7  32MB            128B, 32 entries
The OS can maintain a different pagetable for each process, and update TTBR0 on a context
switch. Using a truncated pagetable means that much less space is required to store the
individual process page tables. Different processes can have different size pagetables, that is,
different values of N, by updating the TTBCR during the context switch.
It is not required that the OS pagetables that TTBR1 points to are updated on a context switch.
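The sizes in Table 6-13 follow directly from N, because each increment of N halves the region that TTBR0 covers. The following sketch computes the boundary and table size for a given N. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper boundary of the region that TTBR0 controls, 1 << (32 - N).
 * A 64-bit result is used because N = 0 gives 4GB. */
static uint64_t ttb0_upper_boundary(unsigned n)
{
    assert(n <= 7);
    return (uint64_t)1 << (32 - n);
}

/* Number of first-level entries in translation table 0. */
static uint32_t ttb0_table_entries(unsigned n)
{
    assert(n <= 7);
    return 4096u >> n;
}

/* Size in bytes of translation table 0, at four bytes per entry. */
static uint32_t ttb0_table_bytes(unsigned n)
{
    return ttb0_table_entries(n) * 4u;
}
```

For N = 3, for example, this gives a 512MB boundary and a 2KB table of 512 entries, matching the table.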
Figure 6-10 shows how to create a first level descriptor address.
The PD0 and PD1 bits in TTBCR can be used to prevent pagetable walks from either TTBR. In
particular, disabling walks from TTBR1 and setting TTBR0 to the address of a truncated
translation table can minimize the overhead otherwise incurred in unused translation table
entries.
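The selection between TTBR0 and TTBR1 and the formation of the first-level descriptor address can be modeled in C as below. This is a sketch of the address arithmetic only. It takes the register values and N as plain arguments, does not model the PD0 and PD1 bits or the Secure and Non-secure banking, and uses a hypothetical function name.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the first-level descriptor address for a modified virtual
 * address (MVA), given the TTBR0 and TTBR1 values and N from the TTBCR. */
static uint32_t l1_descriptor_address(uint32_t ttbr0, uint32_t ttbr1,
                                      unsigned n, uint32_t mva)
{
    assert(n <= 7);

    if (n > 0 && (mva >> (32 - n)) != 0) {
        /* {TTBR1[31:14], MVA[31:20], 00} */
        return (ttbr1 & 0xFFFFC000u) | ((mva >> 20) << 2);
    }

    /* {TTBR0[31:14-N], MVA[31-N:20], 00} */
    uint32_t base_mask = ~((1u << (14 - n)) - 1u);
    uint32_t index = (mva >> 20) & ((1u << (12 - n)) - 1u);
    return (ttbr0 & base_mask) | (index << 2);
}
```

With N = 1, an MVA of 0x80000000 is translated through TTBR1, and an MVA of 0x00100000 is translated through TTBR0 using entry 1 of the truncated table.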
[Figure not reproduced. It shows how the translation base from TTBR0 or TTBR1 is combined with
the first-level table index from the modified virtual address (MVA):
If (N > 0 && MVA[31:32-N] != 0)
    First-level descriptor address = {TTBR1[31:14], MVA[31:20], 00}
else
    First-level descriptor address = {TTBR0[31:14-N], MVA[31-N:20], 00}
Where N is the value set in the Translation Table Base Control Register, c2.]
Figure 6-10 Creating a first-level descriptor address
6.12.2
First-level descriptor
Using the first-level descriptor address, a request is made to external memory. This returns the
first-level descriptor. Bits [1:0] of the first-level descriptor indicate the access type, as
Table 6-14 lists.
Table 6-14 Access types from first-level descriptor bit values
Bit values  Access type
b00         Translation fault
b01         Page table base address
b10         Section base address
b11         Reserved, results in translation fault
First-level translation fault
If bits [1:0] of the first-level descriptor are b00 or b11, a translation fault is generated. This
generates an abort to the processor, either a Prefetch Abort for the instruction side or a Data
Abort for the data side, see MMU fault checking on page 6-29.
If the first level descriptor describes a section or supersection when the Force AP bit is set and
the MMU is in ARMv6 mode, Access bit faults might be generated if AP[0]=0.
First-level page table address
If bits [1:0] of the first-level descriptor are b01, then a page table walk is required. Second-level
page table walk on page 6-47 describes this process.
First-level section base address
If bits [1:0] of the first-level descriptor are b10, a request to a section memory block has
occurred. Figure 6-11 on page 6-46 shows the translation process for a 1MB section using
ARMv6 format, AP bits disabled.
[Figure not reproduced. It shows the following steps:
First-level descriptor address = {Translation base [31:14], First-level table index MVA[31:20], 00}
First-level descriptor = [31:20] Section base address, [19] NS, [18] 0, [17] nG, [16] S, [15] APX,
    [14:12] TEX, [11:10] AP, [9] P, [8:5] Domain, [4] XN, [3] C, [2] B, [1:0] b10
Physical address = {Section base address [31:20], Section index MVA[19:0]}]
Figure 6-11 Translation for a 1MB section, ARMv6 format
Following the first-level descriptor translation, the physical address is used to transfer the
requested data between the processor and external memory. This is done only after the
domain and access permission checks are performed on the first-level descriptor for the section.
Memory access control on page 6-11 describes these checks.
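The final step of a section translation is a simple bit-field merge. The following sketch forms the physical address for a 1MB section, and for a 16MB supersection using the L1D[31:24] base and VA[23:0] index given in the translation overview figures. No domain or permission checking is modeled, and the function names are hypothetical.

```c
#include <stdint.h>

/* Physical address for a 1MB section:
 * {section base address [31:20], MVA[19:0]}. */
static uint32_t section_physical_address(uint32_t l1_desc, uint32_t mva)
{
    return (l1_desc & 0xFFF00000u) | (mva & 0x000FFFFFu);
}

/* Physical address for a 16MB supersection:
 * {supersection base address [31:24], MVA[23:0]}. */
static uint32_t supersection_physical_address(uint32_t l1_desc, uint32_t mva)
{
    return (l1_desc & 0xFF000000u) | (mva & 0x00FFFFFFu);
}
```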
Figure 6-12 shows the translation process for a 1MB section using backwards-compatible
format, AP bits enabled.
[Figure not reproduced. It shows the following steps:
First-level descriptor address = {Translation base [31:14], First-level table index MVA[31:20], 00}
First-level descriptor = [31:20] Section base address, [19] NS, [17:15] SBZ, [14:12] TEX,
    [11:10] AP, [9] P, [8:5] Domain, [4] 0, [3] C, [2] B, [1:0] b10
Physical address = {Section base address [31:20], Section index MVA[19:0]}]
Figure 6-12 Translation for a 1MB section, backwards-compatible format
6.12.3
Second-level page table walk
If bits [1:0] of the first-level descriptor are b01, then a page table walk is required. The
MMU requests the second-level page table descriptor from external memory. Figure 6-13 shows
how the second-level page table address is generated.
[Figure not reproduced. It shows the following steps:
First-level descriptor address = {Translation base [31:14], First-level table index MVA[31:20], 00}
First-level descriptor = [31:10] Coarse page table base address, [9] P, [8:5] Domain, [4] SBZ,
    [3] NS, [2] SBZ, [1:0] b01
Second-level descriptor address = {Coarse page table base address [31:10],
    Second-level table index MVA[19:12], 00}]
Figure 6-13 Generating a second-level page table address
When the page table address is generated, a request is made to external memory for the
second-level descriptor.
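The second-level descriptor address is formed from the Coarse page table base address in bits [31:10] of the first-level descriptor and the second-level table index in MVA[19:12]. A minimal sketch of that calculation follows. The function name is hypothetical.

```c
#include <stdint.h>

/* Second-level descriptor address:
 * {coarse page table base address [31:10], MVA[19:12], 00}. */
static uint32_t l2_descriptor_address(uint32_t l1_desc, uint32_t mva)
{
    uint32_t table_base = l1_desc & 0xFFFFFC00u; /* drop P, Domain, NS, type */
    uint32_t index = (mva >> 12) & 0xFFu;        /* one of 256 entries */
    return table_base | (index << 2);            /* four bytes per entry */
}
```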
Bits [1:0] of the second-level descriptor indicate the access type, as Table 6-15 lists.
Table 6-15 Access types from second-level descriptor bit values
Descriptor format     Bit values  Access type
Both                  b00         Translation fault
Backwards-compatible  b01         64KB large page
ARMv6                 b01         64KB large page
Backwards-compatible  b10         4KB small page
ARMv6                 b1XN        4KB extended small page
Backwards-compatible  b11         4KB extended small page
Second-level translation fault
If bits [1:0] of the second-level descriptor are b00, then a translation fault is generated. This
generates an abort to the processor, either a Prefetch Abort for the instruction side or a Data
Abort for the data side, see MMU fault checking on page 6-29.
When the Force AP bit is set and the MMU is in ARMv6 mode, Access bit faults might be generated
if AP[0]=0 and the second-level descriptor describes a large page, a small page, or an extended
small page.
Second-level large page base address
If bits [1:0] of the second-level descriptor are b01, then a large page table walk is required.
Figure 6-14 shows the translation process for a 64KB large page using ARMv6 format, AP bits
disabled.
[Figure: page table walk diagram. The first-level and second-level descriptor addresses are
generated as Figure 6-13 shows. Second-level descriptor = page base address in bits [31:16], with
XN, TEX, nG, S, APX, SBZ, AP, C, and B fields and bits [1:0] = b01. Physical address = page base
address in bits [31:16], page index from modified virtual address bits [15:0].]
Figure 6-14 Large page table walk, ARMv6 format
Figure 6-15 on page 6-49 shows the translation process for a 64KB large page, or a 16KB large
page subpage, using backwards-compatible format, AP bits enabled.
[Figure: page table walk diagram. The first-level and second-level descriptor addresses are
generated as Figure 6-13 shows. Second-level descriptor = page base address in bits [31:16], bit
[15] = 0, TEX in bits [14:12], AP3 in bits [11:10], AP2 in bits [9:8], AP1 in bits [7:6], AP0 in
bits [5:4], C, B, and bits [1:0] = b01. Physical address = page base address in bits [31:16], page
index from modified virtual address bits [15:0].]
Figure 6-15 Large page table walk, backwards-compatible format
Using backwards-compatible format descriptors, the 64KB large page is generated by setting all
of the AP bit pairs to the same value, AP3=AP2=AP1=AP0. If any one of the pairs is different,
then the 64KB large page is converted into four 16KB large page subpages. The subpage access
permission bits are chosen using the virtual address bits [15:14].
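A minimal C sketch of the large page translation and subpage selection follows, assuming the backwards-compatible descriptor layout of Figure 6-15 with AP0 in bits [5:4] up to AP3 in bits [11:10]. The function names are hypothetical.

```c
#include <stdint.h>

/* Physical address for a 64KB large page: page base address from
 * descriptor bits [31:16], page index from address bits [15:0]. */
uint32_t large_page_physical_address(uint32_t l2_descriptor, uint32_t va)
{
    return (l2_descriptor & 0xFFFF0000u) | (va & 0x0000FFFFu);
}

/* Access permission bits for the 16KB subpage that contains va.
 * Address bits [15:14] select one of AP0..AP3, where APn occupies
 * descriptor bits [5 + 2n : 4 + 2n]. */
unsigned large_page_subpage_ap(uint32_t l2_descriptor, uint32_t va)
{
    unsigned subpage = (va >> 14) & 0x3u;
    return (l2_descriptor >> (4u + 2u * subpage)) & 0x3u;
}
```

When all four AP pairs hold the same value, the selection returns that value for every address in the page, which matches the single 64KB page case.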
Second-level small page table walk
If bits [1:0] of the second-level descriptor are b10 for backwards-compatible format, then a
small page table walk is required.
Figure 6-16 on page 6-50 shows the translation process for a 4KB small page or a 1KB small
page subpage using backwards-compatible format descriptors, AP bits enabled.
[Figure: page table walk diagram. The first-level and second-level descriptor addresses are
generated as Figure 6-13 shows. Second-level descriptor = small page base address in bits [31:12],
AP3 in bits [11:10], AP2 in bits [9:8], AP1 in bits [7:6], AP0 in bits [5:4], C, B, and bits
[1:0] = b10. Physical address = page base address in bits [31:12], page index from modified
virtual address bits [11:0].]
Figure 6-16 4KB small page or 1KB small subpage translations, backwards-compatible format
Using backwards-compatible descriptors, the 4KB small page is generated by setting all of the
AP bit pairs to the same value, AP3=AP2=AP1=AP0. If any one of the pairs is different, then
the 4KB small page is converted into four 1KB small page subpages. The subpage access
permission bits are chosen using the virtual address bits [11:10].
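The AP3=AP2=AP1=AP0 condition can be checked with a short C sketch. It assumes the four AP pairs occupy descriptor bits [11:4], as Figure 6-16 shows, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when AP3 = AP2 = AP1 = AP0 in a backwards-compatible small page
 * descriptor, so the page is protected as one 4KB page. */
bool small_page_ap_uniform(uint32_t l2_descriptor)
{
    uint32_t ap_fields = (l2_descriptor >> 4) & 0xFFu; /* AP3..AP0 */
    uint32_t ap0 = ap_fields & 0x3u;
    /* Multiplying by 0x55 replicates the 2-bit AP0 value into all four pairs. */
    return ap_fields == ap0 * 0x55u;
}

/* Size in bytes of the region that one AP pair protects:
 * 4096 for a uniform page, 1024 when the page is split into subpages. */
uint32_t small_page_protection_granule(uint32_t l2_descriptor)
{
    return small_page_ap_uniform(l2_descriptor) ? 4096u : 1024u;
}
```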
Second-level extended small page table walk
If bits [1:0] of the second-level descriptor are b1XN for ARMv6 format descriptors, or b11 for
backwards-compatible descriptors, then an extended small page table walk is required.
Figure 6-17 on page 6-51 shows the translation process for a 4KB extended small page using
ARMv6 format descriptors, AP bits disabled.
[Figure: page table walk diagram. The first-level and second-level descriptor addresses are
generated as Figure 6-13 shows. Second-level descriptor = extended small page base address in
bits [31:12], with nG, S, APX, TEX, AP, C, and B fields, bit [1] = 1, and XN in bit [0]. Physical
address = page base address in bits [31:12], page index from modified virtual address bits
[11:0].]
Figure 6-17 4KB extended small page translations, ARMv6 format
Figure 6-18 on page 6-52 shows the translation process for a 4KB extended small page or a 1KB
extended small page subpage using backwards-compatible format descriptors, AP bits enabled.
[Figure: page table walk diagram. The first-level and second-level descriptor addresses are
generated as Figure 6-13 shows. Second-level descriptor = extended small page base address in
bits [31:12], SBZ in bits [11:9], TEX in bits [8:6], AP in bits [5:4], C, B, and bits [1:0] = b11.
Physical address = page base address in bits [31:12], page index from modified virtual address
bits [11:0].]
Figure 6-18 4KB extended small page or 1KB extended small subpage translations, backwards-compatible format
Using backwards-compatible descriptors, the 4KB extended small page is generated by setting
all of the AP bit pairs to the same value, AP3=AP2=AP1=AP0. If any one of the pairs is
different, then the 4KB extended small page is converted into four 1KB extended small page
subpages. The subpage access permission bits are chosen using the virtual address bits [11:10].
6.13 MMU software-accessible registers
The MMU is controlled by the system control coprocessor, CP15, registers. Table 6-16 lists the
system control coprocessor registers and references to their detailed descriptions. For more
information on the system control coprocessor, see Chapter 3 System Control Coprocessor.
Table 6-16 CP15 register functions
Register                                    Cross reference
TLB Type Register                           c0, TLB Type Register on page 3-25
Control Register                            c1, Control Register on page 3-44
Non-Secure Access Control Register          c1, Non-Secure Access Control Register on page 3-55
Translation Table Base Register 0           c2, Translation Table Base Register 0 on page 3-57
Translation Table Base Register 1           c2, Translation Table Base Register 1 on page 3-59
Translation Table Base Control Register     c2, Translation Table Base Control Register on page 3-61
Domain Access Control Register              c3, Domain Access Control Register on page 3-63
Data Fault Status Register (DFSR)           c5, Data Fault Status Register on page 3-64
Instruction Fault Status Register (IFSR)    c5, Instruction Fault Status Register on page 3-66
Fault Address Register (FAR)                c6, Fault Address Register on page 3-68 and MMU fault checking on page 6-29
Instruction Fault Address Register (IFAR)   c6, Instruction Fault Address Register on page 3-69 and MMU fault checking on page 6-29
TLB Operations Register                     c8, TLB Operations Register on page 3-86
TLB Lockdown Register                       c10, TLB Lockdown Register on page 3-100
Primary Region Remap Register               c10, Memory region remap registers on page 3-101
Normal Memory Remap Register                c10, Memory region remap registers on page 3-101
FCSE PID Register                           c13, FCSE PID Register on page 3-125
ContextID Register                          c13, Context ID Register on page 3-127
Peripheral Port Remap Register              c15, Peripheral Port Memory Remap Register on page 3-130
TLB Lockdown Index Register                 c15, TLB lockdown access registers on page 3-149
TLB Lockdown VA Register                    c15, TLB lockdown access registers on page 3-149
TLB Lockdown PA Register                    c15, TLB lockdown access registers on page 3-149
TLB Lockdown Attributes Register            c15, TLB lockdown access registers on page 3-149
Note
All the CP15 MMU registers, except CP15 c8, contain state that you read from using MRC
instructions and write to using MCR instructions. Registers c5 and c6 are also written by the
MMU. Reading CP15 c8 results in an Undefined exception.
The debug control coprocessor CP14 also influences the MMU when in Debug state. Table 6-17
lists the registers that affect the MMU.
Table 6-17 CP14 register functions
Register                            Cross reference
Debug State MMU Control Register    CP14 c11, Debug State MMU Control Register on page 13-24
Debug State Cache Control Register  CP14 c10, Debug State Cache Control Register on page 13-23
Chapter 7 Level One Memory System
This chapter describes the processor level one memory system. It contains the following sections:
• About the level one memory system on page 7-2
• Cache organization on page 7-3
• Tightly-coupled memory on page 7-7
• DMA on page 7-10
• TCM and cache interactions on page 7-12
• Write buffer on page 7-16.
7.1 About the level one memory system
The processor level one memory system consists of:
• separate Instruction and Data Caches in a Harvard arrangement
• separate Instruction and Data Tightly-Coupled Memory (TCM) areas
• a DMA system for accessing the TCMs
• a Write Buffer
• two MicroTLBs, backed by a main TLB.
Each cache line can contain Secure or Non-secure data. In parallel with each of the caches is an
area of dedicated RAM on both the instruction and data sides. These regions are referred to as
TCM. You can implement 0, 1 or 2 TCMs on each of the Instruction and Data sides.
You can configure each TCM to contain Secure or Non-secure data. Each TCM has a dedicated
base address that you can place anywhere in the physical address map, and does not have to be
backed by memory implemented externally. The Instruction and Data TCMs have separate base
addresses. A DMA mechanism can access TCMs and this enables loads from or stores to
another location in memory while the processor core is running.
The MMU provides the facilities required by sophisticated operating systems to deliver
protected virtual memory environments and demand paging. It also supports real-time tasks
with features that provide predictable execution time.
A full MMU handles address translation for each of the instruction and data sides. The MMU is
responsible for protection checking, address translation, and memory attributes, some of which
can be passed to the level two memory system. The cache stores each Non-secure memory
region attribute, NS attribute, along with each cache line as an NS Tag.
The processor caches memory translations in MicroTLBs for each of the instruction and data
sides and for the DMA, with a single main TLB backing the MicroTLBs.
7.2 Cache organization
Each cache is implemented as a four-way set associative cache of configurable size. The caches
are virtually indexed and physically tagged. You can configure the cache sizes in the range of
4KB to 64KB. Both the Instruction Cache and the Data Cache can provide two words per cycle for
all requesting sources.
Each cache way is architecturally limited to 16KB in size, because of the limitations of the
virtually indexed, physically tagged implementation. The number of cache ways is fixed at four,
but the cache way size can vary between 1KB and 16KB in powers of 2. The line length is not
configurable and is fixed at eight words per line.
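The cache geometry that follows from these figures can be worked through in C. This is an illustrative sketch with hypothetical names. The page coloring flag assumes 4KB pages, so that a way larger than 4KB takes index bits from above the page offset.

```c
#include <stdbool.h>
#include <stdint.h>

enum { CACHE_WAYS = 4, CACHE_LINE_BYTES = 32 }; /* four ways, eight words per line */

typedef struct {
    uint32_t way_bytes;           /* size of one cache way */
    uint32_t sets;                /* number of lines in each way */
    bool     needs_page_coloring; /* assumption: index extends above a 4KB page offset */
} cache_geometry;

/* Derive the geometry for a configured cache size of 4KB to 64KB. */
cache_geometry cache_geometry_for(uint32_t cache_kb)
{
    cache_geometry g;
    g.way_bytes = cache_kb * 1024u / CACHE_WAYS;
    g.sets = g.way_bytes / CACHE_LINE_BYTES;
    g.needs_page_coloring = g.way_bytes > 4096u;
    return g;
}
```

For example, a 16KB cache has 4KB ways of 128 lines each, so its index fits within a 4KB page offset, while a 64KB cache has 16KB ways of 512 lines each.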
Write operations must occur after the Tag RAM reads and associated address comparisons are
complete. A three-entry Write Buffer is included in the cache to enable the written words to be
held until they can be written to cache. One or two words can be written in a single store
operation. The addresses of these outstanding writes provide an additional input to the Tag RAM
comparison for reads.
To avoid a critical path from the Tag RAM comparison to the enable signals for the data RAMs,
there is a minimum of one cycle of latency between the determination of a hit to a particular
way, and the start of writing to the data RAM of that way. This requires the Data Cache Write
Buffer to hold three entries, for back-to-back writes. Accesses that read the dirty bits must also
check the Data Cache Write Buffer for pending writes that result in dirty bits being set. The
cache dirty bits for the Data Cache are updated when the Data Cache Write Buffer data is written
to the RAM. This requires the dirty bits to be held as a separate storage array, because the Tag
arrays are not accessed during the data RAM writes and so cannot be written at that time. A
separate array also permits the dirty bits to be implemented as a small RAM.
The other main operations performed by the cache are cache line refills and Write-Back. These
occur to particular cache ways, that are determined at the point of the detection of the cache miss
by the victim selection logic.
To reduce overall power consumption, the number of full cache reads is reduced by the
sequential nature of many cache operations, especially on the instruction side. On a cache read
that is sequential to the previous cache read, only the data RAM set that was previously read is
accessed, if the read is within the same cache line. The Tag RAM is not accessed at all during
this sequential operation.
To further reduce unnecessary power consumption, only the addressed words within a
cache line are read at any time. With the required 64-bit read interface, this is achieved by
disabling half of the RAMs on occasions when only a 32-bit value is required. The
implementation uses two 32-bit wide RAMs to implement the cache data RAM shown in
Figure 7-1 on page 7-4, with the words of each line folded into the RAMs on an odd and even
basis. This means that cache refills can take several cycles, depending on the cache line lengths.
The cache line length is eight words.
The control of the level one memory system and the associated functionality, together with other
system wide control attributes are handled through the system control coprocessor, CP15.
Chapter 3 System Control Coprocessor describes this.
Figure 7-1 on page 7-4 shows the block diagram of the cache subsystem. It does not show the
cache refill paths.
[Figure: block diagram. Blocks: CP15 interface, MicroTLB, TAGRAM, DATARAM, TCM, Comparator, and
Way select. Inputs: virtual address, write data, Write buffer data (1-2 words), Write buffer
addresses, and RAMSet base address and size. Outputs: Cache hit, Data out, and MicroTLB miss and
Data Abort.]
Figure 7-1 Level one cache block diagram
7.2.1 Features of the cache system
The level one cache system has the following features:
• The cache is a Harvard implementation.
• The caches are lockable at a granularity of a cache way, using Format C lockdown. See
  Cache control and configuration on page 3-7.
• Cache replacement policies are Pseudo-Random or Round-Robin, as controlled by the RR
  bit in CP15 register c1. Round-Robin uses a single counter for all sets, that selects the way
  used for replacement.
• Cache line allocation uses the cache replacement algorithm when all cache lines are valid.
  If one or more lines is invalid, then the invalid cache line with the lowest way number is
  allocated to in preference to replacing a valid cache line. This mechanism does not
  allocate to locked cache ways unless all cache ways are locked. See Cache miss handling
  when all ways are locked down on page 7-6.
• Cache lines can contain either Secure or Non-secure data and the NS Tag, that the
  MicroTLB provides, indicates when the cache line comes from Secure or Non-secure
  memory.
• Cache lines can be either Write-Back or Write-Through, determined by the MicroTLB
  entry.
• Only read allocation is supported.
• The cache can be disabled independently from the TCM, under control of the appropriate
  bits in CP15 c1. The cache can be disabled in Secure state while enabled in Non-secure
  state and enabled in Secure state while disabled in Non-secure state.
  The CL bit in the system control coprocessor, see c1, Non-Secure Access Control Register
  on page 3-55, reserves cache lockdown registers for Secure world operation. When the CL
  bit is 0 the cache lockdown registers are only available in the Secure world. When the CL
  bit is 1 they are available for both Secure and Non-secure operation.
• Data cache misses are nonblocking with three outstanding Data Cache misses being
  supported.
• Streaming of sequential data from LDM and LDRD operations, and for sequential
  instruction fetches is supported.
7.2.2 Cache functional description
The cache and TCM exist to perform associative reads and writes on requested addresses. The
steps involved in this for reads are as follows:
1. The lower bits of the virtual address are used as the virtual index for the Tag and RAM
   blocks, including the TCM.
2. In parallel the MicroTLB is accessed to perform the virtual to physical address translation.
3. The physical addresses read from the Tag RAMs and the TCM base address register, and
   the Write Buffer address registers, in parallel with the NS Tag, are compared with the
   physical address from the MicroTLB. The processor also compares the NS Tag, that the
   processor stores in the Tag RAMs along with the physical address, with the NS attribute
   from the MicroTLB. Both comparisons form hit signals for each of the cache ways.
4. The hit signals are used to select the data from the cache way that has a hit. Any bytes
   contained in both the data RAMs and the Write Buffer entries are taken from the Write
   Buffer. If two or three Write Buffer entries are to the same bytes, the most recently written
   bytes are taken.
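Steps 3 and 4 of the read sequence can be modeled in software as follows. This is a behavioral sketch only: the hardware performs the comparisons in parallel, and the structure layouts, the valid flags, and the byte-enable encoding are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid;    /* assumption: line holds valid data */
    uint32_t phys_tag; /* physical address tag stored in the Tag RAM */
    bool     ns_tag;   /* NS Tag stored along with the physical address */
} tag_entry;

typedef struct {
    bool     valid;
    uint32_t word_addr;   /* physical address of the pending 32-bit word */
    uint8_t  byte_enable; /* bit i set: byte lane i is pending */
    uint32_t data;
} wb_entry;

/* Step 3: a way hits only if both the physical tag and the NS Tag match.
 * Returns the hitting way, or -1 on a miss. */
int cache_lookup_way(const tag_entry ways[4], uint32_t phys_tag, bool ns_attr)
{
    for (int w = 0; w < 4; ++w) {
        if (ways[w].valid && ways[w].phys_tag == phys_tag &&
            ways[w].ns_tag == ns_attr)
            return w;
    }
    return -1;
}

/* Step 4: bytes pending in the Write Buffer override the data RAM.
 * Entries are ordered oldest first, so the most recent write wins. */
uint32_t read_with_forwarding(uint32_t ram_word, uint32_t word_addr,
                              const wb_entry wb[], int entries)
{
    uint32_t result = ram_word;
    for (int i = 0; i < entries; ++i) {
        if (!wb[i].valid || wb[i].word_addr != word_addr)
            continue;
        for (int lane = 0; lane < 4; ++lane) {
            if (wb[i].byte_enable & (1u << lane)) {
                uint32_t mask = 0xFFu << (8 * lane);
                result = (result & ~mask) | (wb[i].data & mask);
            }
        }
    }
    return result;
}
```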
The steps for writes are as follows:
1. The lower bits of the virtual address are used as the virtual index for the Tag blocks.
2. In parallel, the MicroTLB is accessed to perform the virtual to physical address
   translation.
3. The physical addresses read from the Tag RAMs and the TCM base address register are
   compared with the physical address from the MicroTLB. The processor also compares the
   NS Tag, that it stores in the Tag RAMs along with the physical address, with the NS
   attribute from the MicroTLB. Both comparisons form hit signals for each of the cache
   ways.
4. If a cache way, or the TCM, has recorded a hit, then the write data is written to an entry
   in the Cache Write Buffer, along with the cache way, or TCM, that it must take place to.
5. The contents of the Cache Write Buffer are held until a subsequent write or CP15
   operation requires space in the Write Buffer. At this point the oldest entry in the Cache
   Write Buffer is written into the cache.
7.2.3 Cache control operations
c7, Cache operations on page 3-69 describes the cache control operations that are supported by
the processor. The processor supports all the block cache control operations in hardware.
Note
• The cache operations executed in Secure state might affect all cache lines but cache
  operations executed in Non-secure state only affect Non-secure lines.
• You can restrict the functional size of each cache to 16KB, even when the physical cache
  is larger. This enables the processor to run software that does not support ARMv6 page
  coloring restrictions. You enable this feature with the CZ bit, see c1, Auxiliary Control
  Register on page 3-49.
  For more information about ARMv6 page coloring see Restrictions on page table
  mappings page coloring on page 6-41.
7.2.4 Cache miss handling
A cache miss results in the requests required to do the line fill being made to the level two
interface, with a Write-Back occurring if the line to be replaced contains dirty data.
The Write-Back data is transferred to the Write Buffer, which is arranged to handle this data as a
sequential burst. Because of the requirement for nonblocking caches, additional write
transactions can occur during the transfer of Write-Back data from the cache to the Write Buffer.
These transactions do not interfere with the burst nature of the Write-Back data. The Write
Buffer is responsible for handling the potential Read After Write (RAW) data hazards that might
exist from a Data Cache line Write-Back. The caches perform critical word-first cache refilling.
The internal bandwidth from the level two data read port to the Data Caches is eight bytes per
cycle, and supports streaming.
Cache miss handling when all ways are locked down
The ARM architecture describes the behavior of the cache as being Unpredictable when all ways
in the cache are locked down. However, for ARM1176JZ-S processors a cache miss is serviced
as if Way 0 is not locked.
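The allocation rules from Features of the cache system, together with the all-ways-locked case, can be sketched as a victim selection function. The bit-mask interface and the way the round-robin counter skips locked ways are assumptions for illustration, not a description of the victim selection logic itself.

```c
#include <stdint.h>

enum { NUM_WAYS = 4 };

/* Select the way to allocate on a cache miss.
 * valid_mask:  bit w set if the line in way w is valid.
 * lock_mask:   bit w set if way w is locked down.
 * rr_counter:  hypothetical Round-Robin replacement counter. */
unsigned select_victim_way(unsigned valid_mask, unsigned lock_mask,
                           unsigned rr_counter)
{
    const unsigned all_ways = (1u << NUM_WAYS) - 1u;

    lock_mask &= all_ways;
    if (lock_mask == all_ways)
        lock_mask &= ~1u; /* all ways locked: service as if Way 0 is not locked */

    unsigned unlocked = all_ways & ~lock_mask;

    /* Prefer the invalid line with the lowest way number. */
    unsigned invalid_unlocked = unlocked & ~valid_mask;
    for (unsigned w = 0; w < NUM_WAYS; ++w) {
        if (invalid_unlocked & (1u << w))
            return w;
    }

    /* All candidate lines are valid: use the replacement counter,
     * skipping locked ways (assumed behavior). */
    for (unsigned i = 0; i < NUM_WAYS; ++i) {
        unsigned w = (rr_counter + i) % NUM_WAYS;
        if (unlocked & (1u << w))
            return w;
    }
    return 0; /* not reached: at least one way is always unlocked here */
}
```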
7.2.5 Cache disabled behavior
If the cache is disabled, then the cache is not accessed for reads or for writes. This ensures that
maximum power savings can be achieved. It is therefore important that before the cache is
disabled, all of the entries are cleaned to ensure that the external memory has been updated. In
addition, if the cache is enabled with valid entries in it, then it is possible that the entries in the
cache contain old data. Therefore, the cache must be disabled with clean and invalid entries.
Cache maintenance operations can be performed even if the cache is disabled. The system can
disable the cache in Secure state when it is enabled in Non-secure state and enable the cache in
Secure state when it is disabled in Non-secure state.
7.2.6 Unexpected hit behavior
An unexpected hit is where the cache reports a hit on a memory location that is marked as
Noncacheable or Shared. The unexpected hit behavior is that these hits are ignored and a level
two access occurs. The unexpected hit is ignored because the cache hit signal is qualified by the
cacheability.
For writes, an unexpected cache hit does not result in the cache being updated. Therefore, writes
appear to be Noncacheable accesses. For a data access, if it lies in the range of memory specified
by the Instruction TCM, then the access is made to that RAM rather than to level two memory.
This applies to both writes and reads.
7.3 Tightly-coupled memory
The TCM is designed to provide low-latency memory that can be used by the processor without
the unpredictability that is a feature of caches.
You can use such memory to hold critical routines, such as interrupt handling routines or
real-time tasks where the indeterminacy of a cache is highly undesirable. In addition you can
use it to hold scratch pad data, data types whose locality properties are not well suited to
caching, and critical data structures such as interrupt stacks.
You can separately configure the size of the Instruction TCM (ITCM) and the size of the Data
TCM (DTCM) to be 0KB, 4KB, 8KB, 16KB, 32KB or 64KB. For each side, ITCM and DTCM:
• If you configure the TCM size to be 4KB you get one TCM, of 4KB, on this side.
• If you configure the TCM size to be larger than 4KB you get two TCMs on this side, each
  of half the configured size. So, for example, if you configure an ITCM size of 16KB you
  get two ITCMs, each of size 8KB.
Table 7-1 lists all possible TCM configurations:
Table 7-1 TCM configurations
Configured TCM size  Number of TCMs  Size of each TCM
0KB                  0               0
4KB                  1               4KB
8KB                  2               4KB
16KB                 2               8KB
32KB                 2               16KB
64KB                 2               32KB
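The mapping in Table 7-1 can be expressed as a small C helper. The type and function names are hypothetical, and the function simply rejects any size that the table does not list.

```c
#include <stdbool.h>

typedef struct {
    unsigned count;   /* number of TCMs on this side */
    unsigned size_kb; /* size of each TCM in KB */
} tcm_layout;

/* Fill *out for a configured TCM size in KB, following Table 7-1.
 * Returns false for a size that is not a valid configuration. */
bool tcm_layout_for(unsigned configured_kb, tcm_layout *out)
{
    switch (configured_kb) {
    case 0:
        out->count = 0;
        out->size_kb = 0;
        return true;
    case 4:
        out->count = 1;
        out->size_kb = 4;
        return true;
    case 8:
    case 16:
    case 32:
    case 64:
        /* Larger than 4KB: two TCMs, each of half the configured size. */
        out->count = 2;
        out->size_kb = configured_kb / 2;
        return true;
    default:
        return false;
    }
}
```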
When there are two TCMs on one side, they are implemented as a single RAM to simplify the
implementation. This RAM then has a size in the 0-64KB range. The lower part of the RAM
corresponds to TCM0 and the upper part corresponds to TCM1.
You can also configure each individual TCM to contain Secure or Non-secure data. You make
this configuration in CP15 register c9, accessible in Secure state only. See c9, Data TCM
Non-secure Control Access Register on page 3-94 and c9, Instruction TCM Non-secure Control
Access Register on page 3-95 for more information. After reset, all TCMs are configured as
Secure.
The TCM Status Register in CP15 c0 describes what TCM options and TCM sizes can be
implemented, see c0, TCM Status Register on page 3-24.
Each Data TCM is implemented in parallel with the Data Cache and each Instruction TCM is
implemented in parallel with the Instruction Cache. Each TCM has a single movable base
address, specified in CP15 register c9, see c9, Data TCM Region Register on page 3-90 and c9,
Instruction TCM Region Register on page 3-92.
The size of each TCM can differ from the size of a cache way, but forms a single contiguous
area of memory. Figure 7-1 on page 7-4 shows the entire level one memory system. To access
each of the TCM region and TCM Access Control registers, the TCM Selection registers are set
to the TCM of interest, see c9, TCM Selection Register on page 3-97.
The base address of each TCM can be placed anywhere in the physical address map, and does
not have to be backed by memory implemented externally. The Instruction and Data TCMs have
separate base addresses.
You can disable each TCM to avoid an access being made to it. This gives a reduction in the
power consumption. You can disable each TCM independently from the enabling of the
associated cache, as determined by CP15 register c9. Disabling a TCM invalidates the base
address, so there is no unexpected hit behavior for the TCM.
The timing of a TCM access is the same as for a cache access. The ARM1176JZ-S processor
does not support wait states on the TCM interfaces.
Table 7-2 lists the access types for TCM configured as Non-secure.
Table 7-2 Access to Non-secure TCM
Access type        NS attribute of corresponding page table  Behavior
Non-secure access  X                                         Access done on TCM
Secure access      0                                         TCM not visible, go to Level 2 memory
Secure access      1                                         Access done on TCM
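The rules in Table 7-2 and Table 7-3 reduce to one comparison, sketched here in C. The function and parameter names are hypothetical. The sketch treats a Non-secure access as Non-secure regardless of the NS attribute, which is how the X entries in the tables read.

```c
#include <stdbool.h>

/* Returns true when the access is done on the TCM, and false when the
 * TCM is not visible to the access.
 * tcm_nonsecure:    the TCM is configured as Non-secure.
 * access_nonsecure: the access is made from the Non-secure world.
 * page_ns_attr:     NS attribute of the corresponding page table entry. */
bool tcm_visible(bool tcm_nonsecure, bool access_nonsecure, bool page_ns_attr)
{
    /* A Secure access takes its security from the page table NS attribute. */
    bool effective_ns = access_nonsecure || page_ns_attr;
    return effective_ns == tcm_nonsecure;
}
```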
Table 7-3 lists the access types for TCM configured as Secure.
Table 7-3 Access to Secure TCM
Access type        NS attribute of corresponding page table  Behavior
Non-secure access  X                                         TCM not visible
Secure access      0                                         Access done on TCM
Secure access      1                                         TCM not visible, go to Level 2 memory
7.3.1 TCM behavior
TCM forms a continuous area of memory that is always valid if the TCM is enabled. The TCM
is used as part of the physical memory map of the system, and is not backed by a level of external
memory with the same physical addresses. For this reason, the TCM behaves differently from
the caches for regions of memory that are marked as being Write-Through Cacheable. In such
regions, no external writes occur in the event of a write to memory locations contained in the
TCM.
7.3.2 Restriction on page table mappings
The TCMs are implemented in a physically indexed, physically addressed manner, giving the
following behavior:
• aliases to the same physical address can exist in memory regions that are held in the TCM.
As a result, the page mapping restrictions for the TCM are less restrictive than for the cache, as
Restrictions on page table mappings page coloring on page 6-41 describes.
7.3.3 Restriction on page table attributes
The page table entries that describe areas of memory that are handled by the TCM are remapped
to normal, non-cacheable, non-shared type.
If the page table entry covers a region larger than the size of the TCM, then the attributes are
ignored for the TCM region but still apply to the rest of the region covered by the page table
entry.
7.4 DMA
The level one DMA provides a background route to transfer blocks of data to or from the TCMs.
It is used to move large blocks, rather than individual words or small structures.
The level one DMA is initiated and controlled by accessing the appropriate CP15 registers and
instructions, see DMA control on page 3-9. These registers are common to the Secure and
Non-secure worlds. DMA channels can be reserved for the Secure world only, or available for
both worlds, see bit [18] in the c1, Non-Secure Access Control Register on page 3-55. This bit
also determines the page tables, Secure or Non-secure, that DMA transfers use. In the
Non-secure world, the read/write access of these DMA registers depends on Non-secure Access
control register bit[18] value. Accessing these registers in the Non-secure world when not
permitted, NSAC[18] clear, results in an Undefined exception.
The value of NSAC[18] is also used during access to the Main TLB for comparison with the
NSTID of the TLB entries:
• When the channel is defined as Non-secure, NSAC[18] set, the Non-secure page tables
  are used. DMA external accesses are done on Non-secure memory regions. For DMA
  internal access, only TCM defined as Non-secure can be accessed.
• When the channel is defined as Secure, NSAC[18] clear, the Secure page tables are used.
  The DMA external or internal access depends on the value of the NS attribute in the
  corresponding descriptors. If the NS attribute in the descriptor, for external access, is
  clear, the DMA channel accesses external Secure memory. If the NS attribute is set, the
  DMA channel accesses external Non-secure memory. For internal access, the page
  descriptor selects the TCM and the DMA performs a security permission check before
  accessing the TCM.
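The effect of NSAC[18] described above can be summarized in a short C sketch. The names are hypothetical, and the functions model only the decisions stated in the text, not the DMA hardware.

```c
#include <stdbool.h>

typedef enum { DMA_REG_ACCESS_OK, DMA_REG_ACCESS_UNDEFINED } dma_reg_access;

/* Accessing the DMA registers from the Non-secure world with NSAC[18]
 * clear results in an Undefined exception. */
dma_reg_access dma_register_access(bool nonsecure_world, bool nsac18)
{
    return (nonsecure_world && !nsac18) ? DMA_REG_ACCESS_UNDEFINED
                                        : DMA_REG_ACCESS_OK;
}

/* NSAC[18] set selects the Non-secure page tables for DMA transfers. */
bool dma_uses_nonsecure_page_tables(bool nsac18)
{
    return nsac18;
}

/* True when a DMA external access targets Non-secure memory. A Non-secure
 * channel always does. A Secure channel follows the NS attribute of the
 * corresponding descriptor. */
bool dma_external_access_nonsecure(bool nsac18, bool descriptor_ns)
{
    return nsac18 ? true : descriptor_ns;
}
```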
The process specifies the internal start and end addresses and external start address, together
with the direction of the DMA. The addresses specified are Virtual Addresses, and the level one
DMA hardware includes translation of Virtual Addresses to Physical Addresses and checking
of protection attributes.
The TLB, that TLB organization on page 6-4 describes, holds the page table entries for the
DMA, and ensures that the entries in a TLB used by the DMA are consistent with the page
tables. Errors, arising from protection checks, are signaled to the processor using an interrupt.
Completion of the DMA can also be configured by software to signal the processor with an
interrupt using the same interrupt to the processor that the error uses. The status of the DMA is
read from the CP15 registers associated with the DMA.
The DMA controller is programmed using the CP15 coprocessor. DMA accesses can only be to
or from the TCM and must not be from areas of memory that can be contained in the caches.
That is, no coherency support is provided in the caches.
The processor implements two DMA channels. Only one channel can be active at a time. The
key features of the DMA system are:
•
the DMA system runs in the background of processor operations
•
DMA progress is accessible from software
•
DMA is programmed with virtual addresses, with a MicroTLB dedicated to the DMA
function
•
you can configure the DMA to work to either the instruction or data RAMs
•
DMA is allocated by a privileged process, enabling User access to control the DMA.
For some DMA events an interrupt is generated. If the channel is configured as Non-secure the
nDMAIRQ signal is asserted, otherwise if the channel is configured as Secure the nDMASIRQ
signal is asserted. When an external access caused by the DMA aborts, the processor asserts
nDMAEXTERRIRQ. You can route these output pins to an external interrupt controller for
prioritization and masking. This is the only mechanism to signal the interrupt to the core. For
more information, see c11, DMA Channel Status Register on page 3-117.
Each DMA channel has its own set of Control and Status Registers. The maximum number of
DMA channels that can be defined is architecturally limited to 2. Only 1 DMA channel can be
active at a time. If the other DMA channel has been started, it is queued to start performing
memory operations after the currently active channel has completed. The level one DMA
behaves as a distinct master from the rest of the processor, and the same mechanisms for
handling Shared memory regions must be used if the external addresses being accessed by the
level one DMA system are also accessed by the rest of the processor.
Memory attributes and types on page 6-20 describes these. If a User mode DMA transfer is
performed using an external address that is not marked as Shared, an error is signaled by the
DMA channel. There is no ordering requirement of memory accesses caused by the level one
DMA relative to those generated by reads and writes by the processor, while a channel is
running. When a channel has completed running, all its transactions are visible to all other
observers in the system.
All memory accesses caused by the DMA occur in the order specified by the DMA channel,
regardless of the memory type. If a DMA access is performed to Strongly Ordered memory, see
Memory attributes and types on page 6-20, then a transaction caused by the DMA prevents any
additional transactions being generated by the DMA until the point when the access is complete.
A transaction is complete when it has changed the state of the target location or data has been
returned to the DMA. If the FCSE PID, the Domain Access Control Register, or the page table
mappings are changed, or the TLB is flushed, while a DMA channel is in the Running or Queued
state, then the DMA channel must be stopped.
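The rule that only one of the two channels can be active, with the other queued until the active channel completes, can be sketched as a toy state model. The class name and the "Idle" state label are assumptions made for this illustration; the real channel status encoding is defined by the c11, DMA Channel Status Register.

```python
class DmaChannels:
    """Toy model: two channels, at most one Running, the other Queued."""

    def __init__(self):
        self.state = ["Idle", "Idle"]  # "Idle" is an illustrative label

    def start(self, ch: int) -> None:
        other = 1 - ch
        if self.state[ch] != "Idle":
            return  # already started; nothing to do in this sketch
        # A channel started while the other one is Running is queued.
        self.state[ch] = "Queued" if self.state[other] == "Running" else "Running"

    def complete(self, ch: int) -> None:
        if self.state[ch] != "Running":
            raise ValueError("only a Running channel can complete")
        self.state[ch] = "Idle"
        other = 1 - ch
        # The queued channel starts after the active channel has completed.
        if self.state[other] == "Queued":
            self.state[other] = "Running"
```

Starting channel 0 and then channel 1 leaves channel 1 Queued until `complete(0)` is called.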
7.5
TCM and cache interactions
In the event that a TCM and a cache both contain the requested address, it is architecturally
Unpredictable which memory the instruction data is returned from. It is expected that such an
event only arises from a failure to invalidate the cache when the base register of the TCM is
changed, and so is clearly a programming error. For a Harvard arrangement of caches and TCM,
data accesses can both read and write any Instruction TCM. This ensures
that accesses to literal pools, Undefined instructions, and SVC numbers are possible, and aids
debugging. For this reason, an Instruction TCM must behave as a unified TCM, but can be
optimized for instruction fetches.
You must not program an Instruction TCM to the same base address as a Data TCM and, if the
two RAMs are different sizes, the regions in physical memory of the two RAMs must not be
overlapped. This is because the resulting behavior is architecturally Unpredictable.
In these cases, you must not rely on the behavior of ARM1176JZ-S processor for code that is
intended to be ported to other ARM platforms.
In all cases, no security consideration is necessary because there cannot be a conflict between
accesses targeting Secure and Non-secure memory. Any cache line or TCM data is marked as
being Secure or Non-secure and no Unpredictable situations can result from this.
7.5.1
Overlapping between TCM regions
Where TCM regions overlap, the access priority is worked out using these rules, starting with
the highest priority rule:
1.
Where there is an overlap between a DTCM and an ITCM, the DTCM has priority for data
accesses.
Note
Instruction accesses to the DTCM are not possible.
2.
Where there is an overlap between two TCMs on the same side, TCM0 has priority. This
means that DTCM0 has priority over DTCM1, and ITCM0 has priority over ITCM1.
This means that, for data accesses, the priority order if all four TCMs overlap is:
1.
DTCM0, highest priority
2.
DTCM1
3.
ITCM0
4.
ITCM1, lowest priority.
For instruction accesses, the priority order is:
1.
ITCM0, highest priority
2.
ITCM1, lowest priority.
These priority rules are not affected by whether the TCMs are Secure or Non-secure. The only
effect of configuring TCMs as Secure or Non-secure is that a Secure TCM cannot overlap a
Non-secure TCM.
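The priority rules above amount to a fixed search order. The following sketch expresses that order directly; the function name `select_tcm` and the use of a set of TCM names to represent hits are assumptions for illustration only.

```python
# Priority orders taken from the rules above, highest priority first.
DATA_PRIORITY = ["DTCM0", "DTCM1", "ITCM0", "ITCM1"]
INSTRUCTION_PRIORITY = ["ITCM0", "ITCM1"]


def select_tcm(hits, instruction_access=False):
    """Return the TCM that services an access, or None if no TCM applies.

    hits -- set of TCM names whose address range contains the access.
    """
    order = INSTRUCTION_PRIORITY if instruction_access else DATA_PRIORITY
    for name in order:
        if name in hits:
            return name
    # Instruction accesses can never be serviced by a DTCM.
    return None
```

A data access that hits both DTCM1 and ITCM0 is serviced by DTCM1, while an instruction access that hits only a DTCM is not serviced by any TCM.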
7.5.2
DMA and core access arbitration
DMA and core accesses to both the Instruction TCM and the Data TCM can occur in parallel.
So as not to disrupt the execution of the core, core-generated accesses have priority over those
requested by the DMA engine, regardless of the security level of the accesses.
7.5.3
Instruction accesses to TCM
If the Instruction TCM and the Instruction Cache both contain the requested instruction address,
the processor returns data from the TCM. The instruction prefetch port of the processor cannot
access the Data TCM. If an instruction prefetch misses the Instruction TCM and Instruction
Cache but hits the Data TCM, then the result is an access to the level two memory.
An IMB must be inserted between a write to an Instruction TCM and the execution of the
instructions that were written. In addition, any branch prediction mechanism must be invalidated or
disabled if a branch in the Instruction TCM is overwritten.
7.5.4
Data accesses to the Instruction TCM
If the Data TCM and the Data Cache both contain the requested data address for a read, the
processor returns data from the Data TCM. For a write, the write occurs to the Data TCM. The
majority of data accesses are expected to go to the Data Cache or to the Data TCM, but it is
necessary for the Instruction TCM to be read or written on occasion.
The Instruction TCM base addresses are read by the processor data port as a possible source for
data for all memory accesses. This increases the data comparisons associated with the data,
compared with the number required for the instruction memory lookup, for the level one
memory hit generation. This functionality is required for reading literal values and for debug
purposes, such as setting software breakpoints.
Access to the Instruction TCM involves a delay of 5-12 cycles in reading or writing the data.
This delay enables the Instruction TCM access to be scheduled to take place only when the
presence of a hit to the Instruction TCM is known. This saves power and avoids unnecessary
delays being inserted into the instruction-fetch side. This delay is applied to all accesses in a
multiple operation in the case of an LDM, an LDCL, an STM, or an STCL.
Literal pool accesses
It can take 5-12 cycles for the data port to read data from the Instruction TCM.
Because the path lengths are short, there might sometimes be an increase in
latency to achieve greater clock speeds. Therefore, avoid literal pool accesses
inside critical loops. This does not affect code in cache, because the literal pool is
loaded into the D cache.
Switching penalty between cache and TCM
Normally, an access to the cache or TCM takes a single cycle. However, it can
take three cycles in certain cases.
To perform a cache or TCM read in a single cycle, the processor speculatively
reads the RAM contents. It does not know if it was the correct RAM until after
the read is complete. To save power, the processor performs a speculative read
either to the TCM or to the cache. If the read is wrong, the processor must repeat
the access to the correct location.
There is a penalty of three clock cycles when the core switches between accessing
cache and TCM, for example if it predicts that the access is in TCM, but it is in fact in
cache. That is, the first non-sequential access to TCM incurs the three-cycle penalty when
the previous access on that side, I-side or D-side, was to cache. Similarly, the first
non-sequential access to cache incurs the three-cycle penalty when the previous
access on that side was to TCM. This is not an issue on the I-side, where code does
not typically branch between TCM and cacheable areas, but can be an issue for
data.
For example, in the following code:
loop
        LDR     r0, [r2],#4     ; reads an item from D-TCM
        LDR     r1, [r3],#4     ; reads an item from D-cache
        ADD     r4, r0, r1      ; perform some calculation on the loaded data
        CMP     r1, r5          ; finished yet?
        BLT     loop
Each iteration of this loop pays the three-cycle penalty twice, because the loads
alternate between cache and TCM. This is an extreme example, of course. Because
of hit-under-miss, this three-cycle penalty might not stall the integer core. If the same
code uses only D-TCM, or only D-cache, each load typically takes one cycle.
This can be important if a performance-critical loop operates on two blocks of
data, one in D-TCM and one in main memory, especially if the data is consumed
in small blocks of a byte or word, rather than multiple words per iteration.
So, if you have all of the Dhrystone code and data in TCM, you get better
performance than if you have nearly all in TCM.
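To estimate how often a data access pattern pays the switching penalty, you can count the transitions between cache and TCM on one side. The following is a rough sketch under simplifying assumptions: it treats every access as non-sequential, ignores hit-under-miss, and assumes the first access has no previous access to switch from. The function name is hypothetical.

```python
def count_switch_penalties(accesses):
    """Count cache/TCM switches in a stream of accesses on one side.

    accesses -- iterable of "tcm" or "cache", in program order.
    Each counted switch corresponds to one three-cycle penalty.
    """
    penalties = 0
    previous = None
    for target in accesses:
        if previous is not None and target != previous:
            penalties += 1
        previous = target
    return penalties
```

For the loop above, the access stream alternates `"tcm"`, `"cache"`, so in the steady state every load is counted, which matches two penalties per iteration. A stream that stays entirely in TCM or entirely in cache counts zero.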
It is not required for instruction port(s) to be able to access the Data TCM. An attempt to access
addresses in the range covered by a Data TCM from an instruction port does not result in an
access to the Data TCM. In this case, the instruction is fetched from main memory. It is
anticipated that such accesses can result in external aborts in some systems, because the address
range might not be supported in main memory.
Instruction TCMs must not be programmed to the same base address as a Data TCM and, if the
RAMs are of different sizes, the regions in physical memory of the two RAMs must not be
overlapped because the resulting behavior is architecturally Unpredictable. If an access is made
to a location that is covered by both an Instruction TCM and a Data TCM, the access is only to
the Data TCM.
Table 7-4 summarizes the results of data accesses to TCM and the cache. This also embodies
the unexpected hit behavior for the cache that Unexpected hit behavior on page 7-6 describes.
In Table 7-4, the Data Cache can only be hit if the memory location being accessed is marked
as being Cacheable and Not shareable. A hit to the Data TCM and Instruction TCM refers to
hitting an address in the range covered by that TCM.
Table 7-4 Summary of data accesses to TCM and caches
Data TCM | Data cache | Instruction TCM a | Read behavior | Write behavior
Hit | Hit | Hit | Read from Data TCM. | Write to Data TCM. No write to the Instruction TCM or Data Cache. No write to level two, even if marked as Write-Through.
Hit | Hit | Miss | Read from Data TCM. | Write to Data TCM. No write to Data Cache. No write to level two even if marked as Write-Through.
Hit | Miss | Hit | Read from Data TCM. No linefill to Data Cache even if marked Cacheable. | Write to Data TCM. No write to Instruction TCM. No write to level two even if marked as Write-Through.
Hit | Miss | Miss | Read from Data TCM. No linefill to Data Cache even if marked Cacheable. | Write to Data TCM. No write to level two even if marked as Write-Through.
Miss | Hit | Hit | Read from Data Cache. | Write to Data Cache. If Write-Through, write to Instruction TCM.
Miss | Hit | Miss | Read from Data Cache. | Write to Data Cache. If Write-Through, write to level two.
Miss | Miss | Hit | Read from Instruction TCM. No cache fill even if marked Cacheable. | Write to Instruction TCM. No write to level two even if marked as Write-Through.
Miss | Miss | Miss | If Cacheable and cache enabled, cache linefill. If Noncacheable or cache disabled, read to level two. | Write to level two.
a. Excludes unexpected hit.
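The read behavior column of Table 7-4 follows a simple priority order: Data TCM, then Data Cache, then Instruction TCM, then level two. The sketch below encodes only that order; the function name is hypothetical, and it does not model linefills, write behavior, or unexpected hits.

```python
def data_read_source(dtcm_hit: bool, dcache_hit: bool, itcm_hit: bool) -> str:
    """Return where a data read is serviced from, following Table 7-4."""
    if dtcm_hit:
        return "Data TCM"
    if dcache_hit:
        return "Data Cache"
    if itcm_hit:
        return "Instruction TCM"
    # All three miss: the data comes from level two, either through a
    # cache linefill or through a direct read, depending on cacheability.
    return "level two"
```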
Table 7-5 summarizes the results of instruction accesses to TCM and the cache. This also
embodies the unexpected hit behavior for the cache that Unexpected hit behavior on page 7-6
describes. In Table 7-5, the Instruction Cache can only be hit if the memory location being
accessed is marked as being Cacheable and not shareable. A hit to the Instruction TCM refers
to hitting an address in the range covered by that TCM.
Table 7-5 Summary of instruction accesses to TCM and caches
Instruction TCM | Instruction cache a | Data TCM | Read behavior
Hit | Hit | Don't care | Read from Instruction TCM. No linefill to Instruction Cache, even if marked Cacheable.
Hit | Miss | Don't care | Read from Instruction TCM. No linefill to Instruction Cache, even if marked Cacheable.
Miss | Hit | Don't care | Read from Instruction Cache.
Miss | Miss | Don't care | If Cacheable and cache enabled, cache linefill. If Noncacheable or cache disabled, read to level two.
a. Excludes unexpected hit.
7.6
Write buffer
All memory writes take place using the Write buffer. To ensure that the Write buffer is not
drained on reads, the following features are implemented:
•
The Write buffer is a FIFO of outstanding writes to memory. It consists of a set of
addresses and a set of data words, together with their size information.
•
If a sequence of data words is contained in the Write buffer, these are denoted as applying
to the same address by the Write buffer storing the size of the store multiple. This reduces
the number of address entries that must be stored in the Write buffer.
•
In addition to this, a separate FIFO of Write-Back addresses and data words is
implemented. Having a separate structure avoids complications associated with
performing an external write while the Write-Through is being handled.
•
The address of a new read access is compared against the addresses in the Write buffer. If
a read is to a location that is already in the Write buffer, the read is blocked until the Write
buffer has drained sufficiently far for that location to be no longer in the Write buffer. The
sequential marker only applies to words in the same 8-word, 8-word aligned block, and
the address comparisons are based on 8-word aligned addresses.
Memory access control on page 6-11 describes the ordering of memory accesses.
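The read-against-write address comparison can be illustrated with a short sketch. It assumes 4-byte words, so an 8-word aligned block is 32 bytes, and it assumes the comparison is made purely at that block granularity. The function names are hypothetical.

```python
WORD_BYTES = 4
BLOCK_BYTES = 8 * WORD_BYTES  # one 8-word aligned block is 32 bytes


def block_of(address: int) -> int:
    """Return the index of the 8-word aligned block containing address."""
    return address // BLOCK_BYTES


def read_must_wait(read_address: int, pending_write_addresses) -> bool:
    """True if a new read matches a block still held in the Write buffer."""
    return any(
        block_of(read_address) == block_of(w) for w in pending_write_addresses
    )
```

Under these assumptions, a read from 0x1004 waits for a pending write to 0x101C because both lie in the block starting at 0x1000, whereas a read from 0x1020 does not.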
Chapter 8
Level Two Interface
The processor is designed to be used within larger chip designs using the Advanced Microcontroller
Bus Architecture (AMBA) AXI protocol. The processor uses the level two interface as its interface
to memory and peripherals. This chapter describes the features of the level two interface not
covered in the AMBA AXI Protocol Specification.
The chapter contains the following sections:
•
About the level two interface on page 8-2
•
Synchronization primitives on page 8-6
•
AXI control signals in the processor on page 8-8
•
Instruction Fetch Interface transfers on page 8-14
•
Data Read/Write Interface transfers on page 8-15
•
Peripheral Interface transfers on page 8-41
•
Endianness on page 8-42.
8.1
About the level two interface
The level two memory interface exists to provide a high-bandwidth interface to second level
caches, on-chip RAM, peripherals, and interfaces to external memory.
It is a key feature in ensuring high system performance, providing a higher bandwidth
mechanism for filling the caches on a cache miss than has existed on previous ARM processors.
The processor level two interconnect system uses the following 64-bit wide AXI interfaces:
•
Instruction Fetch Interface
•
Data Read/Write Interface
•
DMA Interface.
Another interface is also provided, the Peripheral Interface. This is a 32-bit AXI interface.
Figure 8-1 shows the level two interconnect interfaces.
[Figure: the processor contains the level two instruction-side controller with the 64-bit instruction fetch port, the level two data-side controller with the 64-bit data read/write port and the 32-bit peripheral port, and the DMA with the 64-bit DMA port.]
Figure 8-1 Level two interconnect interfaces
These interfaces provide for several simultaneous outstanding transactions, giving the potential
for high performance from level two memory systems that support parallelism, and also for high
utilization of pipelined memories such as SDRAM.
•
No outstanding accesses are issued on the DMA port. The DMA port can issue bursts of
32-bit or 64-bit data when the address is correctly aligned.
•
The data read/write port can issue outstanding accesses. The maximum number of
outstanding accesses it can issue is two reads and two writes, to give a total of four
outstanding accesses.
•
The instruction port can issue outstanding read accesses, up to a maximum of two
outstanding read accesses.
•
No outstanding accesses are issued by the peripheral port.
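The per-port limits above, together with the issuing capabilities listed in Table 8-1, can be captured in a small lookup. This is an illustrative sketch only: the `LIMITS` table and `can_issue` function are hypothetical names, and the combined limit of 2 for the read-only instruction port is an assumption, because Table 8-1 gives that entry as not applicable.

```python
# (max reads, max writes, max combined) transactions in flight per port.
LIMITS = {
    "instruction": (2, 0, 2),  # read-only port; combined limit assumed
    "data": (2, 2, 4),
    "peripheral": (1, 1, 1),
    "dma": (1, 1, 1),
}


def can_issue(port: str, kind: str, reads_in_flight: int, writes_in_flight: int) -> bool:
    """Return True if the port could issue one more 'read' or 'write'."""
    max_reads, max_writes, max_total = LIMITS[port]
    if reads_in_flight + writes_in_flight >= max_total:
        return False
    if kind == "read":
        return reads_in_flight < max_reads
    return writes_in_flight < max_writes
```

With one read and two writes already in flight, the data port can still issue a read, whereas the DMA port cannot issue anything while one of its transactions is in flight.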
Each of the four wide interfaces is an AXI interface, with additional signals to support additional
features for the level two memory system for multi-level cache support.
The processor does not drive the following AXI ID signals:
•
ARIDI
•
ARIDRW
•
AWIDRW
•
WIDRW
•
ARIDP
•
AWIDP
•
WIDP
•
ARIDD
•
AWIDD
•
WIDD.
When you connect the processor in an AXI system, you can choose whatever ID value suits your
system. The only requirement is that AWID and WID must have the same value.
8.1.1
AXI parameters for the level 2 interconnect interfaces
Table 8-1 shows the AXI parameters for the level 2 interconnect interfaces.
Table 8-1 AXI parameters for the level 2 interconnect interfaces
Parameter | Instruction, RO | Data, RW | Peripheral, RW | DMA, RW
Write Issuing Capability | Not applicable | 2 | 1 | 1
Read Issuing Capability | 2 | 2 | 1 | 1
Combined Issuing Capability | Not applicable | 4 | 1 | 1
Write ID Capability | Not applicable | 1 | 1 | 1
Write Interleave Capability | Not applicable | 1 a | 1 a | 1 a
Write ID Width | Not applicable b | Not applicable b | Not applicable b | Not applicable b
Read ID Capability | 1 | 1 | 1 | 1
Read ID Width | Not applicable b | Not applicable b | Not applicable b | Not applicable b
a. The value of 1 means that interleaving or re-ordering cannot occur.
b. The level 2 interconnect interfaces do not implement any AXI ID signals.
8.1.2
Level two instruction-side controller
The level two instruction-side controller contains the level two Instruction Fetch Interface. See
Instruction Fetch Interface.
The level two instruction-side controller handles all instruction-side cache misses including
those for Noncacheable locations. It is responsible for the sequencing of cache operations for
Instruction Cache linefills, making requests for the individual stores through the Prefetch Unit
(PU) to the Instruction Cache. The decoupling involved means that the level two instruction-side
controller contains some buffering.
Instruction Fetch Interface
The Instruction Fetch Interface is a read-only interface that services the Instruction Cache on
cache misses, including the fetching of instructions for the PU that are held in memory marked
as Noncacheable. The interface is optimized for cache linefills rather than individual requests.
8.1.3
Level two data-side controller
The level two data-side controller is responsible for the level two:
•
Data Read/Write Interface
•
Peripheral Interface.
The level two data-side controller handles:
•
All external access requests from the Load Store Unit, including cache misses, data
Write-Through operations, and Noncacheable data.
•
SWP instructions and semaphore operations. It schedules all reads and writes on the two
interfaces, that are closely related.
The level two data-side controller also handles the Peripheral Interface.
The level two data-side controller contains the Refill and Write-Back engines for the Data
Cache. These make requests through the Load Store Unit for the individual cache operations that
are required. The decoupling involved means that the level two data-side controller contains
some buffering. The write buffer is an integral part of the level two data-side controller.
Data Read/Write Interface
The Data Read/Write Interface performs reads and swap reads. It services the Data Cache on
cache misses, and reads noncacheable locations.
The Data Read/Write Interface performs writes and swap writes. It services the writes out of the
Write Buffer. Multiple writes can be queued up as part of this interface.
Peripheral Interface
The Peripheral Interface is a bidirectional AXI interface that services peripheral devices. In
ARM1176JZ-S processors, the Peripheral Interface is used for peripherals that are private to the
processor, such as the Vectored Interrupt Controller or Watchdog Timer. Accesses to regions of
memory that are marked as Device and Non-Shared are routed to the Peripheral Interface in
preference to the Data Read/Write Interface.
Instruction and DMA accesses are not routed to the Peripheral port.
Unaligned accesses and exclusive accesses are not supported by the peripheral port, because
they are not supported in Device memory. The order that accesses are presented on the
Peripheral Interface, relative to those on the Data Read/Write Interface is not defined, other than
Strongly Ordered accesses. For this reason, the peripheral port is expected to be used to access
a bus or memory system that is not accessible through the Data Read/Write port. See c15,
Peripheral Port Memory Remap Register on page 3-130 to find out how to remap data accesses
to a defined address region to the peripheral port. In some systems, designers might not want to
use the Peripheral port to access locations in memory that are marked in the page tables as
Non-Shared Device. In these cases, you can use the Remap Registers to remap Non-Shared
Device to Shared Device, so causing these accesses to be made using the main system memory
ports.
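The routing rule described above can be sketched as a small decision function. The function name and argument encoding are hypothetical, and the sketch does not model the address-range remapping performed by the Peripheral Port Memory Remap Register.

```python
def port_for_access(access: str, memory_type: str = "Normal", shared: bool = False) -> str:
    """Return the level two interface that an access is routed to.

    access      -- "data", "instruction", or "dma"
    memory_type -- memory type from the page tables, for example "Device"
    shared      -- True if the region is marked Shared
    """
    if access == "instruction":
        return "Instruction Fetch Interface"
    if access == "dma":
        return "DMA Interface"
    # Only data accesses to Non-Shared Device memory use the peripheral port.
    if memory_type == "Device" and not shared:
        return "Peripheral Interface"
    return "Data Read/Write Interface"
```

Remapping Non-Shared Device to Shared Device corresponds to calling the function with `shared=True`, which moves the access to the Data Read/Write Interface.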
8.1.4
DMA
The DMA is responsible for:
•
Performing all external memory transactions required by the DMA engine, and for
requesting accesses from the Instruction TCM and Data TCM as required.
•
Queuing the DMA channels as required. The DMA Interface contains several registers
that are CP15 registers dedicated for DMA use, see DMA control on page 3-9 for details.
The DMA contains buffering to enable the decoupling of internal and external requests. This is
because of variable latency between internal and external accesses.
It uses the Prefetch Unit (PU) and the Load Store Unit (LSU) to schedule its accesses to the
TCMs.
DMA Interface
The DMA Interface is a bidirectional interface that services the DMA subsystem for writing and
reading the TCMs. Although the DMA Interface is bidirectional, it is able to produce a stream
of successive accesses that are in the same direction, followed by either an extra stream in the
same direction, or a stream in the opposite direction. Correspondingly the direction turnaround
is not significantly optimized.
The size of the transfer is given in the parameters of the transfer in the CP15 registers. The
transfers are always aligned with the size of the transfer as indicated by the CP15 registers.
8.2
Synchronization primitives
On previous architectures, shared-memory synchronization was supported by the
read-locked-write operations that swap register contents with memory, the SWP and SWPB
instructions. These support basic busy and free semaphore mechanisms. For details of the swap
instructions, and how to use them to implement semaphores, see the ARM Architecture
Reference Manual.
ARMv6 and its extensions introduce support for more comprehensive shared-memory
synchronization primitives that scale for multiple-processor system designs. Two sets of
instructions are introduced that support multiple-processor and shared-memory inter-process
communication:
•
load-exclusive, LDREX, LDREXB, LDREXH, and LDREXD
•
store-exclusive, STREX, STREXB, STREXH, and STREXD.
The exclusive-access instructions rely on the ability to tag a physical address as exclusive-access
for a particular processor. This tag is later used to determine if an exclusive store to an address
occurs.
For non-shared memory regions, the LDREX{B,H,D} and STREX{B,H,D} instructions are
presented to the ports as normal LDR or STR. If a processor does an STR on a memory region
that it has already marked as exclusive, this does not clear the tag. However, if the region has
been marked by another processor, an STR clears the tag.
Other events might cause the tag to be cleared. In particular, for memory regions that are not
shared, it is system dependent whether a store by another processor to a tagged physical address
causes the tag to be cleared.
An external abort on either a load-exclusive or store-exclusive puts the processor into Abort
mode.
For an exclusive read access, the processor considers any response apart from EXOKAY as an
external abort.
For an exclusive write access, the processor considers any error response as an external abort.
An OKAY response sets the returned status value to 1.
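The handling of AXI responses to exclusive accesses can be sketched as follows. The response names OKAY, EXOKAY, SLVERR, and DECERR are the four AXI response encodings. The function names are hypothetical, and the status value of 0 for an EXOKAY write response is inferred from the store-exclusive return value for a store that updates memory.

```python
ERROR_RESPONSES = ("SLVERR", "DECERR")


def exclusive_read_aborts(response: str) -> bool:
    """An exclusive read treats anything other than EXOKAY as an external abort."""
    return response != "EXOKAY"


def exclusive_write_status(response: str):
    """Return the store-exclusive status value, or None for an external abort."""
    if response in ERROR_RESPONSES:
        return None  # external abort
    if response == "OKAY":
        return 1  # the exclusive store did not succeed
    return 0  # EXOKAY: the exclusive store succeeded (inferred)
```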
For SWP and SWPB instructions, if the locked read access receives an error response, the
processor unlocks the bus by performing a dummy normal write access, with all byte strobes
disabled, at the same address as the locked read access.
Note
An external abort on a load-exclusive can leave the processor internal monitor in its exclusive
state and might affect your software. If it does, you must execute a CLREX instruction in your
abort handler to clear the processor internal monitor to an open state.
8.2.1
Load-exclusive instruction
Load-exclusive performs a load from memory and causes the physical address of the access to
be tagged as exclusive-access for the requesting processor. This causes any other physical
address that has been tagged by the requesting processor to no longer be tagged as
exclusive-access.
8.2.2
Store-exclusive instruction
Store-exclusive performs a conditional store to memory. The store only takes place if the
physical address is tagged as exclusive-access for the requesting processor. This operation
returns a status value. If the store updates memory the return value is 0, otherwise it is 1. In both
cases, the physical address is no longer tagged as exclusive-access for any processor.
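The tagging behavior of load-exclusive and store-exclusive can be modeled in a few lines. This is a minimal software sketch of the rules as stated in sections 8.2.1 and 8.2.2, not a description of the monitor hardware. The class name and the dictionary-based memory are assumptions for illustration.

```python
class ExclusiveMonitor:
    """Toy model of exclusive-access tags for several processors."""

    def __init__(self):
        self.tags = {}    # processor id -> tagged physical address
        self.memory = {}  # physical address -> value

    def ldrex(self, cpu: int, address: int) -> int:
        # Tagging a new address replaces any tag this processor already held.
        self.tags[cpu] = address
        return self.memory.get(address, 0)

    def strex(self, cpu: int, address: int, value: int) -> int:
        tagged = self.tags.get(cpu) == address
        if tagged:
            self.memory[address] = value
        # In both cases the address is no longer tagged for any processor.
        for holder in [c for c, a in self.tags.items() if a == address]:
            del self.tags[holder]
        return 0 if tagged else 1
```

If two processors both load-exclusive the same address, the first store-exclusive returns 0 and clears both tags, so the second store-exclusive returns 1 and does not update memory.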
8.2.3
Example of LDREX and STREX usage
This is an example of typical usage. Suppose you are trying to claim a lock:
Lock address : LockAddr
Lock free    : 0x00
Lock taken   : 0xFF

        MOV     R1, #0xFF           ; load the 'lock taken' value
try     LDREX   R0, [LockAddr]      ; load the lock value
        CMP     R0, #0              ; is the lock free?
        STREXEQ R0, R1, [LockAddr]  ; try and claim the lock
        CMPEQ   R0, #0              ; did this succeed?
        BNE     try                 ; no - try again
        . . . .                     ; yes - we have the lock
The typical case, where the lock is free and you have exclusive-access, is six instructions.
8.3
AXI control signals in the processor
This section describes the processor implementation of the AXI control signals.
For additional information about AXI, see the AMBA AXI Protocol Specification.
The AXI protocol is burst-based. Every transaction has address and control information on the
address channel that describes the nature of the data to be transferred. The data is transferred
between master and slave using a write channel to the slave or a read channel to the master. In
write transactions, where all the data flows from the master to the slave, the AXI has an
additional write response channel to enable the slave to signal to the master the completion of
the write transaction.
The AXI protocol permits address information to be issued ahead of the actual data transfer and
enables support for multiple outstanding transactions in addition to out-of-order completion of
transactions.
Figure 8-2 shows how a read transaction uses the read address and read data channels.
[Figure: the master interface sends address and control information to the slave interface on the read address channel, and the slave interface returns a sequence of read data items on the read channel.]
Figure 8-2 Channel architecture of reads
Figure 8-3 shows how a write transaction uses the write address, write data, and write response
channels.
[Figure: the master interface sends address and control information on the write address channel and a sequence of write data items on the write channel, and the slave interface returns a write response on the write response channel.]
Figure 8-3 Channel architecture of writes
8.3.1
Channel definition
Each of the five independent channels consists of a set of information signals and uses a
two-way VALID and READY handshake mechanism.
The information source uses the VALID signal to show when valid data is available on the
channel. The destination uses the READY signal to show when it can accept the data. Both the
read data channel and the write data channel also include a LAST signal to indicate when the
transfer of the final data item within a transaction takes place.
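The VALID and READY handshake means that a transfer takes place only in a cycle where both signals are asserted. The following sketch makes this concrete for sampled per-cycle signal values; the function name and the list-based waveform representation are assumptions for illustration.

```python
def transfer_cycles(valid, ready):
    """Return the cycle numbers in which a channel transfer occurs.

    valid, ready -- sequences of 0/1 values sampled once per clock cycle.
    A transfer occurs in every cycle where both VALID and READY are 1.
    """
    return [
        cycle
        for cycle, (v, r) in enumerate(zip(valid, ready))
        if v and r
    ]
```

With `valid = [0, 1, 1, 1, 0]` and `ready = [1, 0, 1, 1, 1]`, the source waits in cycle 1 because the destination is not ready, and transfers occur in cycles 2 and 3.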
Read Address channel
The read address channel is used in every read transaction and carries all the required read address
and control information for that transaction. The AXI supports the following mechanisms:
•
variable-length bursts, from 1 to 16 data transfers per burst
•
bursts with a transfer size of eight bits up to the maximum data bus width
•
wrapping, incrementing, and fixed address bursts
•
atomic operations, using exclusive and locked access
•
system-level caching and buffering control
•
Secure and privileged access.
Write address channel
The write address channel is used in every write transaction and carries all the required write address
and control information for that transaction. The AXI supports the following mechanisms:
•
variable-length bursts, from 1 to 16 data transfers per burst
•
bursts with a transfer size of eight bits up to the maximum data bus width
•
wrapping, incrementing, and fixed address bursts
•
atomic operations, using exclusive and locked access
•
system-level caching and buffering control
•
Secure and privileged access.
Read data channel
The read data channel conveys both the read data and any read response information from the
slave back to the master. The read data channel includes:
• the data bus, that is 32 bits wide for the Peripheral port, and 64 bits wide for the Data
  Read/Write port, Instruction port, and DMA port
• a read response indicating the completion status of the read transaction.
Write data channel
The write data channel conveys the write data from the master to the slave and includes:
• the data bus, that is 32 bits wide for the Peripheral port, and 64 bits wide for the Data
  Read/Write port, Instruction port, and DMA port
• one byte lane strobe for every eight data bits, indicating the bytes of the data bus that are
  valid.
Write response channel
The write response channel provides a way for the slave to respond to write transactions. All
write transactions use completion signaling.
Note
The completion signal occurs once for each burst, not for each individual data transfer within
the burst.
8.3.2
Signal name suffixes
The signal name for each of the interfaces denotes the interface that it applies to. The signals
have one of these suffixes:
I     Instruction Fetch Interface.
D     DMA Interface.
RW    Data Read/Write Interface.
P     Peripheral Interface.
The second character in the signal name indicates if the data direction is a read, R, or write, W.
For example, AxSIZE[2:0] is called ARSIZEI[2:0] for reads in the Instruction Fetch Interface.
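The naming convention can be sketched as a small helper. This is only an illustration of the rule stated above; the function port_signal_name and the dictionary INTERFACE_SUFFIXES are hypothetical names, and the sketch does not check whether a given interface actually implements the requested direction.

```python
INTERFACE_SUFFIXES = {
    "I": "Instruction Fetch Interface",
    "D": "DMA Interface",
    "RW": "Data Read/Write Interface",
    "P": "Peripheral Interface",
}


def port_signal_name(generic, direction, suffix):
    """Expand a generic name such as 'AxSIZE' for one direction and interface."""
    if direction not in ("R", "W"):
        raise ValueError("direction must be 'R' (read) or 'W' (write)")
    if suffix not in INTERFACE_SUFFIXES:
        raise ValueError("unknown interface suffix: %r" % suffix)
    # The 'x' placeholder is the second character of the generic name.
    return generic.replace("x", direction, 1) + suffix


name = port_signal_name("AxSIZE", "R", "I")  # 'ARSIZEI'
```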
8.3.3
Address channel signals
The address channel control signals in the processor are:
• AxLEN[3:0]
• AxSIZE[2:0]
• AxBURST[1:0]
• AxLOCK[1:0]
• AxCACHE[3:0]
• AxPROT[2:0]
• AxSIDEBAND[4:0].
AxLEN[3:0]
The AxLEN[3:0] signal indicates the number of transfers in a burst. Table 8-2 shows the values
of AxLEN that the processor uses.
Table 8-2 AxLEN[3:0] encoding
AxLEN[3:0]    Number of data transfers
b0000         1
b0001         2
b0010         3
b0011         4
b0100         5
b0101         6
b0110         7
b0111         8
AxSIZE[2:0]
This signal indicates the size of each transfer. Table 8-3 shows the supported transfer sizes.
Table 8-3 AxSIZE[2:0] encoding
AxSIZE[2:0]    Bytes in transfer
b000           1
b001           2
b010           4
b011           8
AxBURST[1:0]
The AxBURST[1:0] signals indicate a fixed, incrementing or wrapping burst. Table 8-4 shows
the burst types that the ARM1176JZ-S processor supports.
Table 8-4 AxBURST[1:0] encoding
AxBURST[1:0]    Burst type    Description
b00             Fixed         Fixed address burst
b01             Incr          Incrementing address burst
b10             Wrap          Incrementing address burst that wraps to a lower address at the wrap boundary
The processor uses:
• Wrapping bursts for some cache line fills
• Incrementing bursts for accesses to Noncacheable memory, including instruction fetches.
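The three encodings above can be combined to work out the shape of a burst. The following Python sketch is an illustration only; decode_burst and BURST_TYPES are hypothetical names, and the size check is limited to the values listed in Table 8-3.

```python
BURST_TYPES = {0b00: "Fixed", 0b01: "Incr", 0b10: "Wrap"}


def decode_burst(axlen, axsize, axburst):
    """Decode AxLEN[3:0], AxSIZE[2:0], and AxBURST[1:0] into a burst description."""
    if not 0 <= axlen <= 0b1111:
        raise ValueError("AxLEN is a 4-bit field")
    if not 0 <= axsize <= 0b011:
        raise ValueError("AxSIZE value not listed in Table 8-3")
    if axburst not in BURST_TYPES:
        raise ValueError("AxBURST value not listed in Table 8-4")
    transfers = axlen + 1            # the field encodes the count minus one
    bytes_per_transfer = 1 << axsize  # 1, 2, 4, or 8 bytes
    return {
        "transfers": transfers,
        "bytes_per_transfer": bytes_per_transfer,
        "total_bytes": transfers * bytes_per_transfer,
        "burst": BURST_TYPES[axburst],
    }


# A four-transfer, 64-bit wrapping burst moves one 32-byte cache line.
line_fill = decode_burst(0b0011, 0b011, 0b10)
```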
AxLOCK[1:0]
The AxLOCK[1:0] signal indicates the lock type of the access. The processor supports all lock
types, but not every port generates them. The Instruction port and the DMA port only generate
Normal accesses. The Data Read/Write port generates all access types: Normal, exclusive, and
locked.
Table 8-5 shows the values of AxLOCK that the processor supports.
Table 8-5 AxLOCK[1:0] encoding
AxLOCK[1:0]    Description
b00            Normal access
b01            Exclusive access
b10            Locked access
AxCACHE[3:0]
The AxCACHE[3:0] signals indicate the bufferable, cacheable, write-through, write-back, and
allocate attributes of the transaction. These attributes are for the level two memory system.
Table 8-6 shows the correspondence between the AxCACHE[3:0] encoding and TLB
cacheable attributes.
Table 8-6 AxCACHE[3:0] encoding
AxCACHE[3:0]    Transaction attributes
b0000           Strongly ordered
b0001           Shared device or non-shared device
b0010           Outer Noncacheable
b0110           Outer write-through, no allocate on write
b0111           Outer write-back, no allocate on write
b1111           Outer write-back, write allocate
AxPROT[2:0]
The AxPROT[2:0] signal indicates the protection level of the transaction, that is if the
transaction is:
• normal or privileged
• Secure or Non-secure
• Data access or Instruction access.
All transactions from the instruction port are marked as instruction accesses, ARPROTI[2] = 1.
Transactions from the DMA port are marked as instruction accesses, AxPROTD[2] = 1, if the
transaction is to or from the Instruction TCM, and as data accesses, AxPROTD[2] = 0, for
transfers to or from the Data TCM.
Transactions on the peripheral and data read/write ports are marked as data accesses.
Table 8-7 shows the supported values for AxPROT[2:0].
Table 8-7 AxPROT[2:0] encoding
Signal       Description
AxPROT[2]    0 = Data access
             1 = Instruction access
AxPROT[1]    0 = Secure
             1 = Non-secure
AxPROT[0]    0 = Normal, User
             1 = Privileged
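The bit assignments in Table 8-7 can be decoded as in the following Python sketch. It is an illustration only, and decode_axprot is a hypothetical helper name.

```python
def decode_axprot(axprot):
    """Decode the three AxPROT[2:0] bits according to Table 8-7."""
    if not 0 <= axprot <= 0b111:
        raise ValueError("AxPROT is a 3-bit field")
    return {
        "access": "Instruction" if axprot & 0b100 else "Data",
        "security": "Non-secure" if axprot & 0b010 else "Secure",
        "privilege": "Privileged" if axprot & 0b001 else "Normal, User",
    }


# b101: a Secure, privileged instruction access.
attrs = decode_axprot(0b101)
```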
AxSIDEBAND[4:0]
The AxSIDEBAND[4:1] signals indicate the bufferable, cacheable, write-through, write-back,
and allocate attributes of the level one memory. AxSIDEBAND[0] indicates the Shared
attribute. Table 8-8 shows the correspondence between the AxSIDEBAND[4:1] encoding and
the TLB cacheable attributes for the Read/Write, Peripheral, and DMA ports.
Table 8-8 AxSIDEBAND[4:1] encoding
AxSIDEBAND[4:1]    Transaction attributes
b0000              Strongly ordered
b0001              Shared device or non-shared device
b0010              Inner Noncacheable
b0110              Inner write-through, no allocate on write
b0111              Inner write-back, no allocate on write
b1111              Inner write-back, write allocate (a)
a. The ARM1176JZ-S processor does not support write allocate.
Table 8-9 shows the correspondence between the ARSIDEBANDI[4:1] encoding and the TLB
cacheable attributes for the Instruction port.
Table 8-9 ARSIDEBANDI[4:1] encoding
ARSIDEBANDI[4:1]    Transaction attributes
b0000               Strongly Ordered
b0001               Device
b0010               Inner Noncacheable
b0110               Inner Cacheable
These signals are not part of the AXI protocol and are added for additional information.
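A system-level model that wants to observe the sideband information for the Read/Write, Peripheral, and DMA ports could split the five bits as in the following Python sketch. The names decode_axsideband and INNER_ATTRIBUTES are hypothetical, and the sketch assumes that AxSIDEBAND[0] = 1 means Shared, a polarity that this section does not state explicitly.

```python
INNER_ATTRIBUTES = {
    0b0000: "Strongly ordered",
    0b0001: "Shared device or non-shared device",
    0b0010: "Inner Noncacheable",
    0b0110: "Inner write-through, no allocate on write",
    0b0111: "Inner write-back, no allocate on write",
    0b1111: "Inner write-back, write allocate",
}


def decode_axsideband(value):
    """Split AxSIDEBAND[4:0] into (inner attributes, shared flag).

    Returns None for the attributes if bits [4:1] hold an encoding
    that does not appear in Table 8-8.
    """
    if not 0 <= value <= 0b11111:
        raise ValueError("AxSIDEBAND is a 5-bit field")
    shared = bool(value & 0b1)  # assumption: 1 = Shared
    inner = INNER_ATTRIBUTES.get((value >> 1) & 0b1111)
    return inner, shared


result = decode_axsideband(0b01111)
```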
8.4
Instruction Fetch Interface transfers
The tables in this section describe the AXI interface behavior for instruction side fetches to
either Cacheable or Noncacheable regions of memory for the following interface signals:
• ARBURSTI[1:0]
• ARLENI[3:0]
• ARADDRI[31:0]
• ARSIZEI[2:0].
See the AMBA AXI Protocol Specification for details of the other AXI signals.
8.4.1
Cacheable fetches
Table 8-10 shows the values of ARADDRI, ARBURSTI, ARSIZEI, and ARLENI for
Cacheable fetches.
Table 8-10 AXI signals for Cacheable fetches
Address[4:0]    ARADDRI    ARBURSTI    ARSIZEI    ARLENI
0x00, word 0    0x00       Incr        64-bit     4 data transfers
0x04, word 1    0x00       Incr        64-bit     4 data transfers
0x08, word 2    0x08       Wrap        64-bit     4 data transfers
0x0C, word 3    0x08       Wrap        64-bit     4 data transfers
0x10, word 4    0x10       Wrap        64-bit     4 data transfers
0x14, word 5    0x10       Wrap        64-bit     4 data transfers
0x18, word 6    0x18       Wrap        64-bit     4 data transfers
0x1C, word 7    0x18       Wrap        64-bit     4 data transfers
8.4.2
Noncacheable fetches
Table 8-11 shows the values of ARADDRI, ARBURSTI, ARSIZEI, and ARLENI for
Noncacheable fetches.
Table 8-11 AXI signals for Noncacheable fetches
Address[4:0]    ARADDRI    ARBURSTI    ARSIZEI    ARLENI
0x00, word 0    0x00       Incr        64-bit     4 data transfers
0x04, word 1    0x04       Incr        64-bit     4 data transfers
0x08, word 2    0x08       Incr        64-bit     3 data transfers
0x0C, word 3    0x0C       Incr        64-bit     3 data transfers
0x10, word 4    0x10       Incr        64-bit     2 data transfers
0x14, word 5    0x14       Incr        64-bit     2 data transfers
0x18, word 6    0x18       Incr        64-bit     1 data transfer
0x1C, word 7    0x1C       Incr        64-bit     1 data transfer
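The values in Table 8-10 and Table 8-11 follow a regular pattern that can be reproduced with the following Python sketch. The pattern is inferred from the tabulated values rather than stated as a rule in this manual, instruction_fetch_signals is a hypothetical helper name, and only the low five address bits are modelled.

```python
def instruction_fetch_signals(address, cacheable):
    """Reproduce the rows of Table 8-10 (cacheable) and Table 8-11 (noncacheable)."""
    offset = address & 0x1F   # Address[4:0], position within the cache line
    doubleword = offset >> 3  # which of the four doublewords is addressed
    if cacheable:
        return {
            "ARADDRI": offset & 0x18,  # aligned to the doubleword
            "ARBURSTI": "Incr" if doubleword == 0 else "Wrap",
            "ARSIZEI": "64-bit",
            "ARLENI": 4,
        }
    return {
        "ARADDRI": offset & 0x1C,  # word address is passed through
        "ARBURSTI": "Incr",
        "ARSIZEI": "64-bit",
        "ARLENI": 4 - doubleword,  # fetch up to the end of the line
    }


cacheable_word_3 = instruction_fetch_signals(0x0C, cacheable=True)
noncacheable_word_3 = instruction_fetch_signals(0x0C, cacheable=False)
```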
8.5
Data Read/Write Interface transfers
The tables in this section describe the AXI interface behavior for Data Read/Write Interface
transfers for the following interface signals:
• AxBURSTRW[1:0]
• AxLENRW[3:0]
• AxSIZERW[2:0]
• AxADDRRW[31:0]
• WSTRBRW[7:0].
8.5.1
Linefills
A linefill comprises four accesses to the Data Cache if no external abort is returned. In the
event of an external abort, the aborted doubleword and all subsequent doublewords are not written
into the Data Cache and the line is never marked as Valid. The four accesses are:
• Write Tag and data doubleword
• Write data doubleword
• Write data doubleword
• Write Valid = 1, Dirty = 0, and data doubleword.
The linefill can only progress to write a doubleword if the location it overwrites does not
contain dirty data. This is determined in one of two ways:
• if the victim cache line is not valid, then there is no danger and the linefill progresses
• if the victim line is valid, a signal encodes the doublewords that are clean, either because
  they were not dirty or they have been cleaned.
The order of words written into the cache is critical-word first, wrapping at the upper cache line
boundary.
Table 8-12 shows the values of ARADDRRW, ARBURSTRW, ARSIZERW, and
ARLENRW for linefills.
Table 8-12 Linefill behavior on the AXI interface
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00-0x07       0x00        Incr         64-bit      4 data transfers
0x08-0x0F       0x08        Wrap         64-bit      4 data transfers
0x10-0x17       0x10        Wrap         64-bit      4 data transfers
0x18-0x1F       0x18        Wrap         64-bit      4 data transfers
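The critical-word-first order described for linefills can be illustrated with the following Python sketch, which lists the Address[4:0] offset of each doubleword in the order a four-transfer, 64-bit burst returns it. linefill_beat_offsets is a hypothetical helper name, and the 32-byte line and 8-byte transfer sizes are taken from Table 8-12.

```python
def linefill_beat_offsets(address, line_bytes=32, beat_bytes=8):
    """Return the line offsets of the doublewords in linefill order.

    The burst starts at the doubleword containing the requested address
    and wraps back to offset 0 at the upper cache line boundary.
    """
    start = (address % line_bytes) // beat_bytes * beat_bytes
    beats = line_bytes // beat_bytes
    return [(start + i * beat_bytes) % line_bytes for i in range(beats)]


order = linefill_beat_offsets(0x14)  # request falls in doubleword 0x10-0x17
```

For a request to offset 0x14 the doublewords arrive in the order 0x10, 0x18, 0x00, 0x08. A request in the first doubleword needs no wrap, which is consistent with the Incr burst type in the first row of Table 8-12.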
8.5.2
Noncacheable LDRB
Table 8-13 shows the values of ARADDRRW, ARBURSTRW, ARSIZERW, and
ARLENRW for Noncacheable LDRBs from bytes 0-7.
Table 8-13 Noncacheable LDRB
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, byte 0    0x00        Incr         8-bit       1 data transfer
0x01, byte 1    0x01        Incr         8-bit       1 data transfer
0x02, byte 2    0x02        Incr         8-bit       1 data transfer
0x03, byte 3    0x03        Incr         8-bit       1 data transfer
0x04, byte 4    0x04        Incr         8-bit       1 data transfer
0x05, byte 5    0x05        Incr         8-bit       1 data transfer
0x06, byte 6    0x06        Incr         8-bit       1 data transfer
0x07, byte 7    0x07        Incr         8-bit       1 data transfer
8.5.3
Noncacheable LDRH
Table 8-14 shows the values of ARADDRRW, ARBURSTRW, ARSIZERW, and
ARLENRW for Noncacheable LDRHs from bytes 0-7.
Table 8-14 Noncacheable LDRH
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, byte 0    0x00        Incr         16-bit      1 data transfer
0x01, byte 1    0x01        Incr         32-bit      1 data transfer
0x02, byte 2    0x02        Incr         16-bit      1 data transfer
0x03, byte 3    0x03        Incr         8-bit       1 data transfer
                0x04        Incr         8-bit       1 data transfer
0x04, byte 4    0x04        Incr         16-bit      1 data transfer
0x05, byte 5    0x05        Incr         32-bit      1 data transfer
0x06, byte 6    0x06        Incr         16-bit      1 data transfer
0x07, byte 7    0x07        Incr         8-bit       1 data transfer
                0x08        Incr         8-bit       1 data transfer
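Table 8-14 shows that the transfers generated for a Noncacheable LDRH depend on the position of the halfword within its word. The following Python sketch reproduces the table rows. The pattern is read off the table rather than quoted from a stated rule, and ldrh_transfers is a hypothetical helper name.

```python
def ldrh_transfers(address):
    """Return the (ARADDRRW, size in bits) pairs for a Noncacheable LDRH."""
    position = address & 0x3  # byte position within the 32-bit word
    if position in (0, 2):
        # Halfword-aligned: a single 16-bit transfer.
        return [(address, 16)]
    if position == 1:
        # Unaligned but inside one word: a single 32-bit transfer.
        return [(address, 32)]
    # Position 3: the halfword crosses a word boundary, so it is
    # read as two separate byte transfers.
    return [(address, 8), (address + 1, 8)]


crossing = ldrh_transfers(0x03)
```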
8.5.4
Noncacheable LDR or LDM1
Table 8-15 shows the values of ARADDRRW, ARBURSTRW, ARSIZERW, and
ARLENRW for Noncacheable LDRs or LDM1s.
Table 8-15 Noncacheable LDR or LDM1
Address[4:0]            ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, byte 0, word 0    0x00        Incr         32-bit      1 data transfer
0x01, byte 1            0x01        Incr         32-bit      1 data transfer
                        0x04        Incr         8-bit       1 data transfer
0x02, byte 2            0x02        Incr         16-bit      1 data transfer
                        0x04        Incr         16-bit      1 data transfer
0x03, byte 3            0x03        Incr         8-bit       1 data transfer
                        0x04        Incr         32-bit      1 data transfer
0x04, byte 4, word 1    0x04        Incr         32-bit      1 data transfer
0x05, byte 5            0x05        Incr         32-bit      1 data transfer
                        0x08        Incr         8-bit       1 data transfer
0x06, byte 6            0x06        Incr         16-bit      1 data transfer
                        0x08        Incr         16-bit      1 data transfer
0x07, byte 7            0x07        Incr         8-bit       1 data transfer
                        0x08        Incr         32-bit      1 data transfer
8.5.5
Noncacheable LDRD or LDM2
Table 8-16 shows the values of ARADDRRW, ARBURSTRW, ARSIZERW, and
ARLENRW for Noncacheable LDRDs or LDM2s addressing words 0 to 6.
A Noncacheable LDRD or LDM2 addressing word 7 is split into two LDRs, as shown in
Table 8-17.
Table 8-16 Noncacheable LDRD or LDM2
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      1 data transfer
0x04, word 1    0x04        Incr         32-bit      2 data transfers
0x08, word 2    0x08        Incr         64-bit      1 data transfer
0x0C, word 3    0x0C        Incr         32-bit      2 data transfers
0x10, word 4    0x10        Incr         64-bit      1 data transfer
0x14, word 5    0x14        Incr         32-bit      2 data transfers
0x18, word 6    0x18        Incr         64-bit      1 data transfer
Table 8-17 Noncacheable LDRD or LDM2 from word 7
Address[4:0]    Operations
0x1C, word 7    LDR from 0x1C + LDR from 0x00
8.5.6
Noncacheable LDM3
The values of ARADDRRW, ARBURSTRW, ARSIZERW, and ARLENRW for
Noncacheable LDM3s addressing words 0 to 5 are shown in:
• Table 8-18 for a load from Strongly Ordered or Device memory
• Table 8-19 for a load from Noncacheable memory or when the cache is disabled.
A Noncacheable LDM3 addressing word 6 or 7 is split into two operations as shown in
Table 8-20.
Table 8-18 Noncacheable LDM3, Strongly Ordered or Device memory
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         32-bit      3 data transfers
0x04, word 1    0x04        Incr         32-bit      3 data transfers
0x08, word 2    0x08        Incr         32-bit      3 data transfers
0x0C, word 3    0x0C        Incr         32-bit      3 data transfers
0x10, word 4    0x10        Incr         32-bit      3 data transfers
0x14, word 5    0x14        Incr         32-bit      3 data transfers
Table 8-19 Noncacheable LDM3, Noncacheable memory or cache disabled
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      2 data transfers
0x04, word 1    0x04        Incr         64-bit      2 data transfers
0x08, word 2    0x08        Incr         64-bit      2 data transfers
0x0C, word 3    0x0C        Incr         64-bit      2 data transfers
0x10, word 4    0x10        Incr         64-bit      2 data transfers
0x14, word 5    0x14        Incr         64-bit      2 data transfers
Table 8-20 Noncacheable LDM3 from word 6 or 7
Address[4:0]    Operations
0x18, word 6    LDM2 from 0x18 + LDR from 0x00
0x1C, word 7    LDR from 0x1C + LDM2 from 0x00
8.5.7
Noncacheable LDM4
The values of ARADDRRW, ARBURSTRW, ARSIZERW, and ARLENRW for
Noncacheable LDM4s addressing words 0 to 4 are shown in:
• Table 8-21 for a load from Strongly Ordered or Device memory
• Table 8-22 for a load from Noncacheable memory or when the cache is disabled.
A Noncacheable LDM4 addressing words 5 to 7 is split into two operations as shown in
Table 8-23.
Table 8-21 Noncacheable LDM4, Strongly Ordered or Device memory
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      2 data transfers
0x04, word 1    0x04        Incr         32-bit      4 data transfers
0x08, word 2    0x08        Incr         64-bit      2 data transfers
0x0C, word 3    0x0C        Incr         32-bit      4 data transfers
0x10, word 4    0x10        Incr         64-bit      2 data transfers
Table 8-22 Noncacheable LDM4, Noncacheable memory or cache disabled
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      2 data transfers
0x04, word 1    0x04        Incr         64-bit      3 data transfers
0x08, word 2    0x08        Incr         64-bit      2 data transfers
0x0C, word 3    0x0C        Incr         64-bit      3 data transfers
0x10, word 4    0x10        Incr         64-bit      2 data transfers
Table 8-23 Noncacheable LDM4 from word 5, 6, or 7
Address[4:0]    Operations
0x14, word 5    LDM3 from 0x14 + LDR from 0x00
0x18, word 6    LDM2 from 0x18 + LDM2 from 0x00
0x1C, word 7    LDR from 0x1C + LDM3 from 0x00
8.5.8
Noncacheable LDM5
The values of ARADDRRW, ARBURSTRW, ARSIZERW, and ARLENRW for
Noncacheable LDM5s addressing words 0 to 3 are shown in:
• Table 8-24 for a load from Strongly Ordered or Device memory
• Table 8-25 for a load from Noncacheable memory or when the cache is disabled.
A Noncacheable LDM5 addressing words 4 to 7 is split into two operations as shown in
Table 8-26.
Table 8-24 Noncacheable LDM5, Strongly Ordered or Device memory
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         32-bit      5 data transfers
0x04, word 1    0x04        Incr         32-bit      5 data transfers
0x08, word 2    0x08        Incr         32-bit      5 data transfers
0x0C, word 3    0x0C        Incr         32-bit      5 data transfers
Table 8-25 Noncacheable LDM5, Noncacheable memory or cache disabled
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      3 data transfers
0x04, word 1    0x04        Incr         64-bit      3 data transfers
0x08, word 2    0x08        Incr         64-bit      3 data transfers
0x0C, word 3    0x0C        Incr         64-bit      3 data transfers
Table 8-26 Noncacheable LDM5 from word 4, 5, 6, or 7
Address[4:0]    Operations
0x10, word 4    LDM4 from 0x10 + LDR from 0x00
0x14, word 5    LDM3 from 0x14 + LDM2 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM3 from 0x00
0x1C, word 7    LDR from 0x1C + LDM4 from 0x00
8.5.9
Noncacheable LDM6
The values of ARADDRRW, ARBURSTRW, ARSIZERW, and ARLENRW for
Noncacheable LDM6s addressing words 0 to 2 are shown in:
• Table 8-27 for a load from Strongly Ordered or Device memory
• Table 8-28 for a load from Noncacheable memory or when the cache is disabled.
A Noncacheable LDM6 addressing words 3 to 7 is split into two operations as shown in
Table 8-29.
Table 8-27 Noncacheable LDM6, Strongly Ordered or Device memory
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      3 data transfers
0x04, word 1    0x04        Incr         32-bit      6 data transfers
0x08, word 2    0x08        Incr         64-bit      3 data transfers
Table 8-28 Noncacheable LDM6, Noncacheable memory or cache disabled
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      3 data transfers
0x04, word 1    0x04        Incr         64-bit      4 data transfers
0x08, word 2    0x08        Incr         64-bit      3 data transfers
Table 8-29 Noncacheable LDM6 from word 3, 4, 5, 6, or 7
Address[4:0]    Operations
0x0C, word 3    LDM5 from 0x0C + LDR from 0x00
0x10, word 4    LDM4 from 0x10 + LDM2 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM3 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM4 from 0x00
0x1C, word 7    LDR from 0x1C + LDM5 from 0x00
8.5.10
Noncacheable LDM7
The values of ARADDRRW, ARBURSTRW, ARSIZERW, and ARLENRW for
Noncacheable LDM7s addressing word 0 or 1 are shown in:
• Table 8-30 for a load from Strongly Ordered or Device memory
• Table 8-31 for a load from Noncacheable memory or when the cache is disabled.
A Noncacheable LDM7 addressing words 2 to 7 is split into two operations as shown in
Table 8-32.
Table 8-30 Noncacheable LDM7, Strongly Ordered or Device memory
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         32-bit      7 data transfers
0x04, word 1    0x04        Incr         32-bit      7 data transfers
Table 8-31 Noncacheable LDM7, Noncacheable memory or cache disabled
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      4 data transfers
0x04, word 1    0x04        Incr         64-bit      4 data transfers
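The ARSIZERW and ARLENRW values in the LDM3 to LDM7 tables above appear to follow a simple pattern, which the following Python sketch reproduces. This is an observation drawn from the tabulated values, not a rule stated in this manual. ldm_burst_shape is a hypothetical helper name, and the sketch covers only loads that are not split across the cache line boundary.

```python
def ldm_burst_shape(num_words, start_word, device_memory):
    """Return (ARSIZERW, number of transfers) for an unsplit LDM3 to LDM7.

    start_word is the word index (0 to 7) selected by Address[4:2].
    device_memory is True for Strongly Ordered or Device memory and
    False for Noncacheable memory or when the cache is disabled.
    """
    if not 3 <= num_words <= 7:
        raise ValueError("sketch covers LDM3 to LDM7 only")
    if not 0 <= start_word <= 7:
        raise ValueError("start_word must be in the range 0 to 7")
    if start_word + num_words > 8:
        raise ValueError("this access is split into several operations")
    if device_memory and (start_word % 2 or num_words % 2):
        # Exactly the requested words are read, one word per transfer.
        return ("32-bit", num_words)
    # Otherwise every doubleword touched by the access is read whole.
    doublewords = (start_word % 2 + num_words + 1) // 2
    return ("64-bit", doublewords)


shape = ldm_burst_shape(4, 1, device_memory=False)
```

For example, an LDM4 from word 1 is a 32-bit burst of four transfers to Device memory, but a 64-bit burst of three transfers to Noncacheable memory.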
Table 8-32 Noncacheable LDM7 from word 2, 3, 4, 5, 6, or 7
Address[4:0]    Operations
0x08, word 2    LDM6 from 0x08 + LDR from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM2 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM3 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM4 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM5 from 0x00
0x1C, word 7    LDR from 0x1C + LDM6 from 0x00
8.5.11
Noncacheable LDM8
Table 8-33 shows the values of ARADDRRW, ARBURSTRW, ARSIZERW, and
ARLENRW for a Noncacheable LDM8 addressing word 0.
A Noncacheable LDM8 addressing words 1 to 7 is split into two operations as shown in
Table 8-34.
Table 8-33 Noncacheable LDM8 from word 0
Address[4:0]    ARADDRRW    ARBURSTRW    ARSIZERW    ARLENRW
0x00, word 0    0x00        Incr         64-bit      4 data transfers
Table 8-34 Noncacheable LDM8 from word 1, 2, 3, 4, 5, 6, or 7
Address[4:0]    Operations
0x04, word 1    LDM7 from 0x04 + LDR from 0x00
0x08, word 2    LDM6 from 0x08 + LDM2 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM3 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM4 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM5 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM6 from 0x00
0x1C, word 7    LDR from 0x1C + LDM7 from 0x00
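The Operations columns of the LDM tables in this section are consistent with splitting a Noncacheable load multiple at every 32-byte cache line boundary. The following Python sketch generates the same operation lists. It is derived from the tables rather than from a stated algorithm, and split_ldm is a hypothetical helper name. As in the tables, addresses are shown as Address[4:0] only, so 0x00 after a split refers to the start of the next line.

```python
WORDS_PER_LINE = 8  # a 32-byte cache line holds eight words


def split_ldm(num_words, start_word):
    """List the operations for a Noncacheable LDM of num_words words.

    start_word is the word index (0 to 7) selected by Address[4:2].
    """
    if not 1 <= num_words <= 16:
        raise ValueError("num_words must be in the range 1 to 16")
    if not 0 <= start_word < WORDS_PER_LINE:
        raise ValueError("start_word must be in the range 0 to 7")
    operations = []
    word = start_word
    remaining = num_words
    while remaining > 0:
        # Take as many words as fit before the end of the current line.
        count = min(remaining, WORDS_PER_LINE - word)
        name = "LDR" if count == 1 else "LDM%d" % count
        operations.append("%s from 0x%02X" % (name, word * 4))
        remaining -= count
        word = 0  # every later operation starts at a line boundary
    return operations


ops = split_ldm(10, 7)
```

An LDM10 from word 7 therefore becomes three operations, while an LDM8 from word 0 fits in one line and is not split.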
8.5.12
Noncacheable LDM9
A Noncacheable LDM9 is split into two operations as shown in Table 8-35.
Table 8-35 Noncacheable LDM9
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDR from 0x00
0x04, word 1    LDM7 from 0x04 + LDM2 from 0x00
0x08, word 2    LDM6 from 0x08 + LDM3 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM4 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM5 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM6 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM7 from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00
8.5.13
Noncacheable LDM10
A Noncacheable LDM10 is split into two or three operations as shown in Table 8-36.
Table 8-36 Noncacheable LDM10
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDM2 from 0x00
0x04, word 1    LDM7 from 0x04 + LDM3 from 0x00
0x08, word 2    LDM6 from 0x08 + LDM4 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM5 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM6 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM7 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM8 from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00 + LDR from 0x00
8.5.14
Noncacheable LDM11
A Noncacheable LDM11 is split into two or three operations as shown in Table 8-37.
Table 8-37 Noncacheable LDM11
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDM3 from 0x00
0x04, word 1    LDM7 from 0x04 + LDM4 from 0x00
0x08, word 2    LDM6 from 0x08 + LDM5 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM6 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM7 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM8 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM8 from 0x00 + LDR from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00 + LDM2 from 0x00
8.5.15
Noncacheable LDM12
A Noncacheable LDM12 is split into two or three operations as shown in Table 8-38.
Table 8-38 Noncacheable LDM12
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDM4 from 0x00
0x04, word 1    LDM7 from 0x04 + LDM5 from 0x00
0x08, word 2    LDM6 from 0x08 + LDM6 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM7 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM8 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM8 from 0x00 + LDR from 0x00
0x18, word 6    LDM2 from 0x18 + LDM8 from 0x00 + LDM2 from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00 + LDM3 from 0x00
8.5.16
Noncacheable LDM13
A Noncacheable LDM13 is split into two or three operations as shown in Table 8-39.
Table 8-39 Noncacheable LDM13
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDM5 from 0x00
0x04, word 1    LDM7 from 0x04 + LDM6 from 0x00
0x08, word 2    LDM6 from 0x08 + LDM7 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM8 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM8 from 0x00 + LDR from 0x00
0x14, word 5    LDM3 from 0x14 + LDM8 from 0x00 + LDM2 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM8 from 0x00 + LDM3 from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00 + LDM4 from 0x00
8.5.17
Noncacheable LDM14
A Noncacheable LDM14 is split into two or three operations as shown in Table 8-40.
Table 8-40 Noncacheable LDM14
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDM6 from 0x00
0x04, word 1    LDM7 from 0x04 + LDM7 from 0x00
0x08, word 2    LDM6 from 0x08 + LDM8 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM8 from 0x00 + LDR from 0x00
0x10, word 4    LDM4 from 0x10 + LDM8 from 0x00 + LDM2 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM8 from 0x00 + LDM3 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM8 from 0x00 + LDM4 from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00 + LDM5 from 0x00
8.5.18
Noncacheable LDM15
A Noncacheable LDM15 is split into two or three operations as shown in Table 8-41.
Table 8-41 Noncacheable LDM15
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDM7 from 0x00
0x04, word 1    LDM7 from 0x04 + LDM8 from 0x00
0x08, word 2    LDM6 from 0x08 + LDM8 from 0x00 + LDR from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM8 from 0x00 + LDM2 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM8 from 0x00 + LDM3 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM8 from 0x00 + LDM4 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM8 from 0x00 + LDM5 from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00 + LDM6 from 0x00
8.5.19
Noncacheable LDM16
A Noncacheable LDM16 is split into two or three operations as shown in Table 8-42.
Table 8-42 Noncacheable LDM16
Address[4:0]    Operations
0x00, word 0    LDM8 from 0x00 + LDM8 from 0x00
0x04, word 1    LDM7 from 0x04 + LDM8 from 0x00 + LDR from 0x00
0x08, word 2    LDM6 from 0x08 + LDM8 from 0x00 + LDM2 from 0x00
0x0C, word 3    LDM5 from 0x0C + LDM8 from 0x00 + LDM3 from 0x00
0x10, word 4    LDM4 from 0x10 + LDM8 from 0x00 + LDM4 from 0x00
0x14, word 5    LDM3 from 0x14 + LDM8 from 0x00 + LDM5 from 0x00
0x18, word 6    LDM2 from 0x18 + LDM8 from 0x00 + LDM6 from 0x00
0x1C, word 7    LDR from 0x1C + LDM8 from 0x00 + LDM7 from 0x00
8.5.20
Half-line Write-Back
Table 8-43 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for half-line Write-Backs over the Data Read/Write Interface.
Table 8-43 Half-line Write-Back
Write address [4:0]    Description                                      AWADDRRW    AWBURSTRW    AWSIZERW    AWLENRW
0x00-0x07              Evicted cache line valid and lower half dirty    0x00        Incr         64-bit      2 data transfers
                       Evicted cache line valid and upper half dirty    0x10        Incr         64-bit      2 data transfers
0x08-0x0F              Evicted cache line valid and lower half dirty    0x08        Wrap         64-bit      2 data transfers
                       Evicted cache line valid and upper half dirty    0x10        Incr         64-bit      2 data transfers
0x10-0x17              Evicted cache line valid and lower half dirty    0x00        Incr         64-bit      2 data transfers
                       Evicted cache line valid and upper half dirty    0x10        Incr         64-bit      2 data transfers
0x18-0x1F              Evicted cache line valid and lower half dirty    0x00        Incr         64-bit      2 data transfers
                       Evicted cache line valid and upper half dirty    0x18        Wrap         64-bit      2 data transfers
8.5.21
Full-line Write-Back
Table 8-44 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for full-line Write-Backs, evicted cache line valid and both halves dirty, over the
Data Read/Write Interface.
Table 8-44 Full-line Write-Back
Write address [4:0]    AWADDRRW    AWBURSTRW    AWSIZERW    AWLENRW
0x00-0x07              0x00        Incr         64-bit      4 data transfers
0x08-0x0F              0x08        Wrap         64-bit      4 data transfers
0x10-0x17              0x10        Wrap         64-bit      4 data transfers
0x18-0x1F              0x18        Wrap         64-bit      4 data transfers
8.5.22
Cacheable Write-Through or Noncacheable STRB
Table 8-45 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STRBs over the Data Read/Write Interface.
Table 8-45 Cacheable Write-Through or Noncacheable STRB
Address[4:0]    AWADDRRW    AWBURSTRW    AWSIZERW    AWLENRW            WSTRBRW
0x00, byte 0    0x00        Incr         8-bit       1 data transfer    b0000 0001
0x01, byte 1    0x01        Incr         8-bit       1 data transfer    b0000 0010
0x02, byte 2    0x02        Incr         8-bit       1 data transfer    b0000 0100
0x03, byte 3    0x03        Incr         8-bit       1 data transfer    b0000 1000
0x04, byte 4    0x04        Incr         8-bit       1 data transfer    b0001 0000
0x05, byte 5    0x05        Incr         8-bit       1 data transfer    b0010 0000
0x06, byte 6    0x06        Incr         8-bit       1 data transfer    b0100 0000
0x07, byte 7    0x07        Incr         8-bit       1 data transfer    b1000 0000
8.5.23
Cacheable Write-Through or Noncacheable STRH
Table 8-46 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STRHs over the Data Read/Write Interface.
Table 8-46 Cacheable Write-Through or Noncacheable STRH
Address[4:0]    AWADDRRW    AWBURSTRW    AWSIZERW    AWLENRW            WSTRBRW
0x00, byte 0    0x00        Incr         16-bit      1 data transfer    b0000 0011
0x01, byte 1    0x01        Incr         32-bit      1 data transfer    b0000 0110
0x02, byte 2    0x02        Incr         16-bit      1 data transfer    b0000 1100
0x03, byte 3    0x03        Incr         8-bit       1 data transfer    b0000 1000
                0x04        Incr         8-bit       1 data transfer    b0001 0000
0x04, byte 4    0x04        Incr         16-bit      1 data transfer    b0011 0000
0x05, byte 5    0x05        Incr         32-bit      1 data transfer    b0110 0000
0x06, byte 6    0x06        Incr         16-bit      1 data transfer    b1100 0000
0x07, byte 7    0x07        Incr         8-bit       1 data transfer    b1000 0000
                0x08        Incr         8-bit       1 data transfer    b0000 0001
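The WSTRBRW values in the STRB and STRH tables above set one strobe bit for each byte written, positioned by the byte lane of the address on the 64-bit data bus. The following Python sketch computes the strobes for a single transfer. write_strobes is a hypothetical helper name, and the sketch takes the number of bytes actually written, which can be smaller than the AWSIZERW transfer size (for example the STRH to 0x01, a 32-bit transfer that writes two bytes).

```python
def write_strobes(address, num_bytes, bus_bytes=8):
    """Return the WSTRBRW[7:0] value for one write transfer.

    address is the address of the first byte written and num_bytes is
    the number of consecutive bytes written in this transfer.
    """
    lane = address % bus_bytes  # byte lane of the first byte
    if num_bytes < 1 or lane + num_bytes > bus_bytes:
        raise ValueError("bytes must fit within one 64-bit transfer")
    return ((1 << num_bytes) - 1) << lane


strb = write_strobes(0x05, 2)
print("b" + format(strb, "08b"))
```

For a halfword written at 0x05 this yields b01100000, matching the corresponding row of Table 8-46.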
8.5.24
Cacheable Write-Through or Noncacheable STR or STM1
Table 8-47 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STRs or STM1s over the Data Read/Write Interface.
Table 8-47 Cacheable Write-Through or Noncacheable STR or STM1
Address[4:0]            AWADDRRW    AWBURSTRW    AWSIZERW    AWLENRW            WSTRBRW
0x00, byte 0, word 0    0x00        Incr         32-bit      1 data transfer    b0000 1111
0x01, byte 1            0x00        Incr         32-bit      1 data transfer    b0000 1110
                        0x04        Incr         8-bit       1 data transfer    b0001 0000
0x02, byte 2            0x02        Incr         16-bit      1 data transfer    b0000 1100
                        0x04        Incr         16-bit      1 data transfer    b0011 0000
0x03, byte 3            0x03        Incr         8-bit       1 data transfer    b0000 1000
                        0x04        Incr         32-bit      1 data transfer    b0111 0000
0x04, byte 4, word 1    0x04        Incr         32-bit      1 data transfer    b1111 0000
0x05, byte 5            0x04        Incr         32-bit      1 data transfer    b1110 0000
                        0x08        Incr         8-bit       1 data transfer    b0000 0001
0x06, byte 6            0x06        Incr         16-bit      1 data transfer    b1100 0000
                        0x08        Incr         16-bit      1 data transfer    b0000 0011
0x07, byte 7            0x07        Incr         8-bit       1 data transfer    b1000 0000
                        0x08        Incr         32-bit      1 data transfer    b0000 0111
0x08, byte 8, word 2    0x08        Incr         32-bit      1 data transfer    b0000 1111
0x0C, word 3            0x0C        Incr         32-bit      1 data transfer    b1111 0000
0x10, word 4            0x10        Incr         32-bit      1 data transfer    b0000 1111
0x14, word 5            0x14        Incr         32-bit      1 data transfer    b1111 0000
0x18, word 6            0x18        Incr         32-bit      1 data transfer    b0000 1111
0x1C, word 7            0x1C        Incr         32-bit      1 data transfer    b1111 0000
8.5.25
Cacheable Write-Through or Noncacheable STRD or STM2
Table 8-48 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and AWLENRW for STRDs or
STM2s to words 0 to 6 over the Data Read/Write Interface.
An STRD or STM2 to word 7 is split into two operations as shown in Table 8-49.
Table 8-48 Cacheable Write-Through or Noncacheable STRD or STM2 to words 0, 1, 2, 3, 4, 5, or 6
Address[4:0]    AWADDRRW    AWBURSTRW    AWSIZERW    AWLENRW             First WSTRBRW
0x00, word 0    0x00        Incr         64-bit      1 data transfer     b1111 1111
0x04, word 1    0x04        Incr         32-bit      2 data transfers    b1111 0000
0x08, word 2    0x08        Incr         64-bit      1 data transfer     b1111 1111
0x0C, word 3    0x0C        Incr         32-bit      2 data transfers    b1111 0000
0x10, word 4    0x10        Incr         64-bit      1 data transfer     b1111 1111
0x14, word 5    0x14        Incr         32-bit      2 data transfers    b1111 0000
0x18, word 6    0x18        Incr         64-bit      1 data transfer     b1111 1111
Table 8-49 Cacheable Write-Through or Noncacheable STM2 to word 7
ARM DDI 0333H
ID012410
Address[4:0]
Operations
0x1C
STR to 0x1C + STR to 0x00
Copyright © 2004-2009 ARM Limited. All rights reserved.
Non-Confidential, Unrestricted Access
8-30
Level Two Interface
8.5.26
Cacheable Write-Through or Noncacheable STM3
Table 8-50 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STM3s to words 0 to 5 over the Data Read/Write Interface.
An STM3 to word 6 or 7 is split into two operations as shown in Table 8-51.
Table 8-50 Cacheable Write-Through or Noncacheable STM3 to words 0, 1, 2, 3, 4, or 5
Address[4:0]  AWADDRRW  AWBURSTRW  AWSIZERW  AWLENRW           First WSTRBRW
0x00, word 0  0x00      Incr       32-bit    3 data transfers  b0000 1111
0x04, word 1  0x04      Incr       32-bit    3 data transfers  b1111 0000
0x08, word 2  0x08      Incr       32-bit    3 data transfers  b0000 1111
0x0C, word 3  0x0C      Incr       32-bit    3 data transfers  b1111 0000
0x10, word 4  0x10      Incr       32-bit    3 data transfers  b0000 1111
0x14, word 5  0x14      Incr       32-bit    3 data transfers  b1111 0000
Table 8-51 Cacheable Write-Through or Noncacheable STM3 to words 6 or 7
Address[4:0]  Operations
0x18, word 6  STM2 to 0x18 + STR to 0x00
0x1C, word 7  STR to 0x1C + STM2 to 0x00
8.5.27
Cacheable Write-Through or Noncacheable STM4
Table 8-52 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STM4s to words 0 to 4 over the Data Read/Write Interface.
An STM4 to words 5 to 7 is split into two operations as shown in Table 8-53.
Table 8-52 Cacheable Write-Through or Noncacheable STM4 to word 0, 1, 2, 3, or 4
Address[4:0]  AWADDRRW  AWBURSTRW  AWSIZERW  AWLENRW           First WSTRBRW
0x00, word 0  0x00      Incr       64-bit    2 data transfers  b1111 1111
0x04, word 1  0x04      Incr       32-bit    4 data transfers  b1111 0000
0x08, word 2  0x08      Incr       64-bit    2 data transfers  b1111 1111
0x0C, word 3  0x0C      Incr       32-bit    4 data transfers  b1111 0000
0x10, word 4  0x10      Incr       64-bit    2 data transfers  b1111 1111
Table 8-53 Cacheable Write-Through or Noncacheable STM4 to word 5, 6, or 7
Address[4:0]  Operations
0x14, word 5  STM3 to 0x14 + STR to 0x00
0x18, word 6  STM2 to 0x18 + STM2 to 0x00
0x1C, word 7  STR to 0x1C + STM3 to 0x00
8.5.28
Cacheable Write-Through or Noncacheable STM5
Table 8-54 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STM5s to words 0 to 3 over the Data Read/Write Interface.
An STM5 to words 4 to 7 is split into two operations as shown in Table 8-55.
Table 8-54 Cacheable Write-Through or Noncacheable STM5 to word 0, 1, 2, or 3
Address[4:0]  AWADDRRW  AWBURSTRW  AWSIZERW  AWLENRW           First WSTRBRW
0x00, word 0  0x00      Incr       32-bit    5 data transfers  b0000 1111
0x04, word 1  0x04      Incr       32-bit    5 data transfers  b1111 0000
0x08, word 2  0x08      Incr       32-bit    5 data transfers  b0000 1111
0x0C, word 3  0x0C      Incr       32-bit    5 data transfers  b1111 0000
Table 8-55 Cacheable Write-Through or Noncacheable STM5 to word 4, 5, 6, or 7
Address[4:0]  Operations
0x10, word 4  STM4 to 0x10 + STR to 0x00
0x14, word 5  STM3 to 0x14 + STM2 to 0x00
0x18, word 6  STM2 to 0x18 + STM3 to 0x00
0x1C, word 7  STR to 0x1C + STM4 to 0x00
8.5.29
Cacheable Write-Through or Noncacheable STM6
Table 8-56 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STM6s to words 0 to 2 over the Data Read/Write Interface.
An STM6 to words 3 to 7 is split into two operations as shown in Table 8-57.
Table 8-56 Cacheable Write-Through or Noncacheable STM6 to word 0, 1, or 2
Address[4:0]  AWADDRRW  AWBURSTRW  AWSIZERW  AWLENRW           First WSTRBRW
0x00, word 0  0x00      Incr       64-bit    3 data transfers  b1111 1111
0x04, word 1  0x04      Incr       32-bit    6 data transfers  b1111 0000
0x08, word 2  0x08      Incr       64-bit    3 data transfers  b1111 1111
Table 8-57 Cacheable Write-Through or Noncacheable STM6 to word 3, 4, 5, 6, or 7
Address[4:0]  Operations
0x0C, word 3  STM5 to 0x0C + STR to 0x00
0x10, word 4  STM4 to 0x10 + STM2 to 0x00
0x14, word 5  STM3 to 0x14 + STM3 to 0x00
0x18, word 6  STM2 to 0x18 + STM4 to 0x00
0x1C, word 7  STR to 0x1C + STM5 to 0x00
8.5.30
Cacheable Write-Through or Noncacheable STM7
Table 8-58 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for STM7s to words 0 or 1 over the Data Read/Write Interface.
An STM7 to words 2 to 7 is split into two operations as shown in Table 8-59.
Table 8-58 Cacheable Write-Through or Noncacheable STM7 to word 0 or 1
Address[4:0]  AWADDRRW  AWBURSTRW  AWSIZERW  AWLENRW           First WSTRBRW
0x00, word 0  0x00      Incr       32-bit    7 data transfers  b0000 1111
0x04, word 1  0x04      Incr       32-bit    7 data transfers  b1111 0000
Table 8-59 Cacheable Write-Through or Noncacheable STM7 to word 2, 3, 4, 5, 6 or 7
Address[4:0]  Operations
0x08, word 2  STM6 to 0x08 + STR to 0x00
0x0C, word 3  STM5 to 0x0C + STM2 to 0x00
0x10, word 4  STM4 to 0x10 + STM3 to 0x00
0x14, word 5  STM3 to 0x14 + STM4 to 0x00
0x18, word 6  STM2 to 0x18 + STM5 to 0x00
0x1C, word 7  STR to 0x1C + STM6 to 0x00
8.5.31
Cacheable Write-Through or Noncacheable STM8
Table 8-60 shows the values of AWADDRRW, AWBURSTRW, AWSIZERW, and
AWLENRW for an STM8 to word 0 over the Data Read/Write Interface.
An STM8 to words 1 to 7 is split into two operations as shown in Table 8-61.
Table 8-60 Cacheable Write-Through or Noncacheable STM8 to word 0
Address[4:0]  AWADDRRW  AWBURSTRW  AWSIZERW  AWLENRW           First WSTRBRW
0x00, word 0  0x00      Incr       64-bit    4 data transfers  b1111 1111
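The unsplit cases in Tables 8-48 to 8-60 share a pattern: a store of an even number of words that starts on an even word uses 64-bit transfers, and every other case uses 32-bit transfers. The following Python sketch captures that pattern. It is inferred from the table rows only, and the function name and return layout are assumptions made for this example.

```python
def stm_axi_attributes(n_words, start_word):
    """Model AWSIZERW, AWLENRW, and the first WSTRBRW for an unsplit STMn.

    n_words    -- number of words stored (1 to 8)
    start_word -- word index 0 to 7, that is Address[4:2]
    Returns (size_bits, data_transfers, first_wstrb).
    """
    if start_word + n_words > 8:
        # These cases are split into several operations, see the
        # Operations tables in this section.
        raise ValueError("store crosses the 8-word boundary and is split")
    if n_words % 2 == 0 and start_word % 2 == 0:
        # 64-bit aligned start and an even word count: 64-bit transfers.
        return (64, n_words // 2, 0xFF)
    # Otherwise one 32-bit transfer per word. The first strobe selects the
    # lower or upper half of the 64-bit bus.
    first_wstrb = 0x0F if start_word % 2 == 0 else 0xF0
    return (32, n_words, first_wstrb)
```

For example, `stm_axi_attributes(6, 2)` gives 64-bit, 3 data transfers, and a first strobe of b1111 1111, which matches the word 2 row of Table 8-56.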
Table 8-61 Cacheable Write-Through or Noncacheable STM8 to word 1, 2, 3, 4, 5, 6, or 7
Address[4:0]  Operations
0x04, word 1  STM7 to 0x04 + STR to 0x00
0x08, word 2  STM6 to 0x08 + STM2 to 0x00
0x0C, word 3  STM5 to 0x0C + STM3 to 0x00
0x10, word 4  STM4 to 0x10 + STM4 to 0x00
0x14, word 5  STM3 to 0x14 + STM5 to 0x00
0x18, word 6  STM2 to 0x18 + STM6 to 0x00
0x1C, word 7  STR to 0x1C + STM7 to 0x00
8.5.32
Cacheable Write-Through or Noncacheable STM9
An STM9 over the Data Read/Write Interface is split into two operations as shown in
Table 8-62.
Table 8-62 Cacheable Write-Through or Noncacheable STM9
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STR to 0x00
0x04, word 1  STM7 to 0x04 + STM2 to 0x00
0x08, word 2  STM6 to 0x08 + STM3 to 0x00
0x0C, word 3  STM5 to 0x0C + STM4 to 0x00
0x10, word 4  STM4 to 0x10 + STM5 to 0x00
0x14, word 5  STM3 to 0x14 + STM6 to 0x00
0x18, word 6  STM2 to 0x18 + STM7 to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00
8.5.33
Cacheable Write-Through or Noncacheable STM10
An STM10 over the Data Read/Write Interface is split into two or three operations as shown in
Table 8-63.
Table 8-63 Cacheable Write-Through or Noncacheable STM10
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STM2 to 0x00
0x04, word 1  STM7 to 0x04 + STM3 to 0x00
0x08, word 2  STM6 to 0x08 + STM4 to 0x00
0x0C, word 3  STM5 to 0x0C + STM5 to 0x00
0x10, word 4  STM4 to 0x10 + STM6 to 0x00
0x14, word 5  STM3 to 0x14 + STM7 to 0x00
0x18, word 6  STM2 to 0x18 + STM8 to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00 + STR to 0x00
8.5.34
Cacheable Write-Through or Noncacheable STM11
An STM11 over the Data Read/Write Interface is split into two or three operations as shown in
Table 8-64.
Table 8-64 Cacheable Write-Through or Noncacheable STM11
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STM3 to 0x00
0x04, word 1  STM7 to 0x04 + STM4 to 0x00
0x08, word 2  STM6 to 0x08 + STM5 to 0x00
0x0C, word 3  STM5 to 0x0C + STM6 to 0x00
0x10, word 4  STM4 to 0x10 + STM7 to 0x00
0x14, word 5  STM3 to 0x14 + STM8 to 0x00
0x18, word 6  STM2 to 0x18 + STM8 to 0x00 + STR to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00 + STM2 to 0x00
8.5.35
Cacheable Write-Through or Noncacheable STM12
An STM12 over the Data Read/Write Interface is split into two or three operations as shown in
Table 8-65.
Table 8-65 Cacheable Write-Through or Noncacheable STM12
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STM4 to 0x00
0x04, word 1  STM7 to 0x04 + STM5 to 0x00
0x08, word 2  STM6 to 0x08 + STM6 to 0x00
0x0C, word 3  STM5 to 0x0C + STM7 to 0x00
0x10, word 4  STM4 to 0x10 + STM8 to 0x00
0x14, word 5  STM3 to 0x14 + STM8 to 0x00 + STR to 0x00
0x18, word 6  STM2 to 0x18 + STM8 to 0x00 + STM2 to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00 + STM3 to 0x00
8.5.36
Cacheable Write-Through or Noncacheable STM13
An STM13 over the Data Read/Write Interface is split into two or three operations as shown in
Table 8-66.
Table 8-66 Cacheable Write-Through or Noncacheable STM13
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STM5 to 0x00
0x04, word 1  STM7 to 0x04 + STM6 to 0x00
0x08, word 2  STM6 to 0x08 + STM7 to 0x00
0x0C, word 3  STM5 to 0x0C + STM8 to 0x00
0x10, word 4  STM4 to 0x10 + STM8 to 0x00 + STR to 0x00
0x14, word 5  STM3 to 0x14 + STM8 to 0x00 + STM2 to 0x00
0x18, word 6  STM2 to 0x18 + STM8 to 0x00 + STM3 to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00 + STM4 to 0x00
8.5.37
Cacheable Write-Through or Noncacheable STM14
An STM14 over the Data Read/Write Interface is split into two or three operations as shown in
Table 8-67.
Table 8-67 Cacheable Write-Through or Noncacheable STM14
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STM6 to 0x00
0x04, word 1  STM7 to 0x04 + STM7 to 0x00
0x08, word 2  STM6 to 0x08 + STM8 to 0x00
0x0C, word 3  STM5 to 0x0C + STM8 to 0x00 + STR to 0x00
0x10, word 4  STM4 to 0x10 + STM8 to 0x00 + STM2 to 0x00
0x14, word 5  STM3 to 0x14 + STM8 to 0x00 + STM3 to 0x00
0x18, word 6  STM2 to 0x18 + STM8 to 0x00 + STM4 to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00 + STM5 to 0x00
8.5.38
Cacheable Write-Through or Noncacheable STM15
An STM15 over the Data Read/Write Interface is split into two or three operations as shown in
Table 8-68.
Table 8-68 Cacheable Write-Through or Noncacheable STM15
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STM7 to 0x00
0x04, word 1  STM7 to 0x04 + STM8 to 0x00
0x08, word 2  STM6 to 0x08 + STM8 to 0x00 + STR to 0x00
0x0C, word 3  STM5 to 0x0C + STM8 to 0x00 + STM2 to 0x00
0x10, word 4  STM4 to 0x10 + STM8 to 0x00 + STM3 to 0x00
0x14, word 5  STM3 to 0x14 + STM8 to 0x00 + STM4 to 0x00
0x18, word 6  STM2 to 0x18 + STM8 to 0x00 + STM5 to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00 + STM6 to 0x00
8.5.39
Cacheable Write-Through or Noncacheable STM16
An STM16 over the Data Read/Write Interface is split into two or three operations as shown in
Table 8-69.
Table 8-69 Cacheable Write-Through or Noncacheable STM16
Address[4:0]  Operations
0x00, word 0  STM8 to 0x00 + STM8 to 0x00
0x04, word 1  STM7 to 0x04 + STM8 to 0x00 + STR to 0x00
0x08, word 2  STM6 to 0x08 + STM8 to 0x00 + STM2 to 0x00
0x0C, word 3  STM5 to 0x0C + STM8 to 0x00 + STM3 to 0x00
0x10, word 4  STM4 to 0x10 + STM8 to 0x00 + STM4 to 0x00
0x14, word 5  STM3 to 0x14 + STM8 to 0x00 + STM5 to 0x00
0x18, word 6  STM2 to 0x18 + STM8 to 0x00 + STM6 to 0x00
0x1C, word 7  STR to 0x1C + STM8 to 0x00 + STM7 to 0x00
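The Operations tables for split stores all follow one rule: the first operation stores up to the end of the current 8-word block, and the remaining words are stored in groups of at most eight words, each starting at Address[4:0] = 0x00. The following Python sketch reproduces the table entries under that reading. The function names are assumptions made for this example, and the sketch models only the split, not the AXI signal values.

```python
def split_stm(n_words, start_word):
    """Split an STMn that starts at word start_word (0 to 7).

    Returns a list of (words, address) pairs, where address is the
    Address[4:0] value at which each operation starts.
    """
    first = min(n_words, 8 - start_word)     # words up to the block boundary
    ops = [(first, start_word * 4)]
    remaining = n_words - first
    while remaining > 0:
        chunk = min(remaining, 8)            # at most one full 8-word block
        ops.append((chunk, 0x00))
        remaining -= chunk
    return ops


def op_name(words):
    """Name a single operation in the style used by the tables."""
    return "STR" if words == 1 else f"STM{words}"


def describe(ops):
    """Format operations as they appear in the Operations column."""
    return " + ".join(f"{op_name(n)} to 0x{addr:02X}" for n, addr in ops)
```

For example, `describe(split_stm(16, 1))` returns the string shown for word 1 in Table 8-69.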
8.6
Peripheral Interface transfers
The tables in this section describe the Peripheral Interface behavior for reads and writes for the
following interface signals:
• AxADDRP[31:0]
• AxBURSTP[1:0]
• AxSIZEP[2:0]
• AxLENP[3:0]
• WSTRBP[3:0], for write accesses.
See the AMBA AXI Protocol Specification for details of the other AXI signals.
Table 8-70 shows the values of AxADDRP, AxBURSTP, AxSIZEP, AxLENP, and WSTRBP
for example Peripheral Interface reads and writes.
Table 8-70 Example Peripheral Interface reads and writes
Example transfer, read or write
AxADDRP
AxBURSTP
AxSIZEP
AxLENP
WSTRBP
Words 0-7
0x00
Incr
32-bit
2 data transfers
b1111
b1111
0x04
0x08
Incr
32-bit
2 data transfers
b1111
0x0C
0x10
Incr
32-bit
2 data transfers
Incr
32-bit
2 data transfers
0x00
Incr
32-bit
2 data transfers
Incr
32-bit
b1111
b1111
0x0C
Words 0-2
0x00
Incr
32-bit
2 data transfers
b1111
b1111
0x04
Words 0-1
b1111
b1111
0x04
0x08
b1111
b1111
0x1C
Words 0-3
b1111
b1111
0x14
0x18
b1111
0x08
Incr
32-bit
1 data transfer
b1111
0x00
Incr
32-bit
2 data transfers
b1111
b1111
0x04
Word 2
0x08
Incr
32-bit
1 data transfer
b1111
Word 0, bytes 0 and 1
0x00
Incr
16-bit
1 data transfer
b0011
Word 1, bytes 2 and 3
0x06
Incr
16-bit
1 data transfer
b1100
Word 2, byte 3
0x0B
Incr
8-bit
1 data transfer
b1000
The peripheral port supports only incrementing bursts of at most two data transfers. It does not
support unaligned accesses.
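As an illustration of the two-transfer limit, the following Python sketch splits a run of consecutive words into incrementing bursts of at most two 32-bit data transfers. The function name is an assumption made for this example, and the sketch assumes that bursts are formed in address order from the first word, which is consistent with the Words 0-7 and Words 0-2 examples but is not stated for other start addresses.

```python
def peripheral_bursts(first_word, n_words):
    """Split a word-aligned access into bursts of at most two transfers.

    first_word -- index of the first 32-bit word accessed
    n_words    -- number of consecutive words accessed
    Returns a list of (axaddrp, data_transfers) pairs.
    """
    bursts = []
    word = first_word
    remaining = n_words
    while remaining > 0:
        length = min(2, remaining)           # port limit: two data transfers
        bursts.append((word * 4, length))
        word += length
        remaining -= length
    return bursts
```

Under these assumptions, an access to words 0 to 2 becomes a two-transfer burst at 0x00 followed by a single transfer at 0x08.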
8.7
Endianness
ARM1176JZ-S processors can be configured in one of three endianness modes of operation
using the U, B, and E bits of the CP15 c1 Control Register, see Mixed-endian access support on
page 4-17.
BE-8 refers to the byte-invariant big-endian configuration. It applies to 16-bit, halfword, and
32-bit, word, quantities only.
Although the data and DMA ports are 64 bits wide, accesses issued on these ports must be
treated as two 32-bit accesses in parallel. The BE-8 configuration applies to each of the two
32-bit words that form the 64-bit data, not to the 64-bit quantity as a whole.
The AXI protocol does not support 32-bit word-invariant big-endian, BE-32, accesses.
Therefore, in this configuration the ARM1176JZ-S processor issues byte-invariant big-endian,
BE-8, accesses on the four ports by swizzling the byte lanes and the byte strobes as Figure 8-4
shows.
[Figure not reproduced: mapping of the DATA[63:0] byte lanes and the STRB[7:0] byte strobes to their swizzled positions.]
Figure 8-4 Swizzling of data and strobes in BE-32 big-endian configuration
Note
If you want to configure the processor for BE-32 mode, it is strongly recommended that you use
the BIGENDINIT and UBITINIT input pins. See c1, Control Register on page 3-44 bit [7].
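The swizzle can be sketched in code. The figure's lane mapping is not recoverable from the text here, so the following Python sketch assumes that the swizzle reverses the four byte lanes within each 32-bit word. That is the usual difference between word-invariant and byte-invariant big-endian data, but it is an assumption and not a statement taken from this section. The function name is also hypothetical.

```python
def swizzle_be32(data, strb):
    """Reverse the byte lanes, and strobes, within each 32-bit word.

    data -- 64-bit value on the data bus
    strb -- 8-bit byte strobe mask, bit n for DATA[8n+7:8n]
    Assumed mapping: lane 0<->3, 1<->2, 4<->7, 5<->6.
    """
    out_data = 0
    out_strb = 0
    for lane in range(8):
        src = lane ^ 0x3                     # mirrored lane in the same word
        byte = (data >> (8 * src)) & 0xFF
        out_data |= byte << (8 * lane)
        out_strb |= ((strb >> src) & 1) << lane
    return out_data, out_strb
```

Applying the function twice returns the original data and strobes, because the assumed mapping is its own inverse.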
8.8
Locked access
The AXI protocol specifies that, when a locked transaction occurs, the master must follow the
locked transaction with an unlocked transaction to remove the lock of the interconnect. For
ARM1176JZ-S processors, this implies that, in the case of an abort received on the read part of
a SWP instruction, the Peripheral port or Data port issues a dummy write access with all byte
strobes LOW at the same address as the read access and with AWLOCK = 00, normal
transaction.
Chapter 9
Clocking and Resets
This chapter describes the clocking and reset options available for the processor. It contains the
following sections:
• About clocking and resets
• Clocking and resets with no IEM
• Clocking and resets with IEM
• Reset modes.
9.1
About clocking and resets
The processor clocking and reset schemes depend on whether the optional IEM is implemented.
This chapter describes how clocking and resets work for processors that implement IEM and for
those that do not.
9.2
Clocking and resets with no IEM
This section describes clocking and resets for the processor with no IEM:
• Processor clocking with no IEM
• Reset with no IEM.
9.2.1
Processor clocking with no IEM
Externally to the processor, you must connect CLKIN and FREECLKIN together.
Logically, the processor has only one clock domain.
The four level two interfaces use dedicated clock enables ACLKENI, ACLKENRW,
ACLKENP, and ACLKEND.
The four clock inputs ACLKI, ACLKRW, ACLKP and ACLKD are not used and must be left
unconnected when you implement the processor.
The SYNCMODEREQ* and SYNCMODEACK* signals are not used and must be left
unconnected.
All clocks can be stopped indefinitely without loss of state.
Figure 9-1 shows the clocks for the processor with no IEM.
[Figure not reproduced: block diagram showing CLKIN, the core, the RAMs, the instruction, data read/write, DMA, and peripheral level 2 interfaces, and the clock enables to level 2.]
Figure 9-1 Processor clocks with no IEM
Read latency penalty with no IEM
With zero-wait-state AXI, a Nonsequential Noncacheable read incurs a six-cycle penalty over a
cache hit on the data side, where a hit returns data in the DC2 cycle, and a five-cycle penalty
over a cache hit on the instruction side.
In the first cycle after the data cache miss, a read-after-write hazard check is performed against
the contents of the Write Buffer. This prevents stalling while waiting for the Write Buffer to
drain. Following that, a request is made to the AXI interface, and subsequently a transfer is
started on the AXI. In the next cycle data is returned to the AXI interface, from where it is
returned first to the level one clock domain before being forwarded to the core. Figure 9-2 shows
this.
[Figure not reproduced: timing diagram. Data side cycles: DC1, DC2, RAW, L2Req, ARVALIDRW, RDATARW, Data to L1, Data to LSU. Instruction side cycles: Fe1, Fe2, L2Req, ARVALIDI, RDATAI, Data to L1, Data to PU.]
Figure 9-2 Read latency with no IEM
The same sequence appears on the I-Side, except that there is less to do in the equivalent RAW
cycle.
9.2.2
Reset with no IEM
The processor has the following reset inputs:
nRESETIN
The nRESETIN signal is the main processor reset that initializes the
majority of the processor logic.
DBGnTRST
The DBGnTRST signal is the DBGTAP reset.
nPORESETIN
The nPORESETIN signal is the power-on reset that initializes the CP14
debug logic. See CP14 registers reset on page 13-25 for details.
nVFPRESETIN
The nVFPRESETIN signal is not connected and you must tie it LOW.
All of these are active LOW signals that reset logic in the processor.
The following reset signals are only used if IEM is implemented. Otherwise, these inputs are not
connected to any logic internally, and you must connect them according to your design rules:
• ARESETIn
• ARESETRWn
• ARESETPn
• ARESETDn.
9.3
Clocking and resets with IEM
This section describes clocking and resets for the processor with IEM:
• Processor clocking with IEM
• Reset with IEM.
9.3.1
Processor clocking with IEM
Externally to the processor, you must connect CLKIN and FREECLKIN together.
It is possible to configure each of the four level two ports to instantiate an IEM register slice so
that the processor can have up to five clock domains, CLKIN, ACLKI, ACLKRW, ACLKP
and ACLKD. Because of the signals SYNCMODEREQI, SYNCMODEREQRW,
SYNCMODEREQP, SYNCMODEREQD, SYNCMODEACKI, SYNCMODEACKRW,
SYNCMODEACKP, and SYNCMODEACKD, it is possible to configure each IEM register
slice to operate synchronously or asynchronously.
The four level two interfaces and the VCore part of the IEM register slices use dedicated clock
enables, ACLKENI, ACLKENRW, ACLKENP, and ACLKEND.
If you configure an IEM register slice to operate asynchronously, its corresponding ACLKEN*
signal must be high. For example, when SYNCMODEACKI is low to indicate asynchronous
operation of the instruction port slice, the ACLKENI signal must be held high accordingly.
All clocks can be stopped indefinitely without loss of state.
Figure 9-3 on page 9-6 shows the clocks for the processor with IEM.
[Figure not reproduced: block diagram showing the processor (core, RAMs, and the four level 2 interfaces clocked by CLKIN), the VIC and debug interfaces, the level shift and clamp cells, and the IEM register slices VCoreSliceI, VCoreSliceRW, VCoreSliceD, and VCoreSliceP with their VSoCSlice counterparts, together with the clock enables and the ACLK clocks to level 2.]
Figure 9-3 Processor clocks with IEM
Synchronization with IEM
When the core runs at maximum performance, the two clocks for the IEM Register Slice are
synchronous. At this point, when frequency and voltage changes have taken effect, the IEM
Register Slice can be bypassed. This removes all the latency that the synchronizers introduce.
The synchronization interface is a simple request and acknowledge system. Figure 9-4 shows
the processor synchronization with such a system.
[Figure not reproduced: timing diagram of Clock, SYNCMODEREQ, and SYNCMODEACK, showing normal FIFO operation, FIFOs closed to new data, FIFOs drain, FIFOs all empty, FIFO multiplexed out, synchronization over, and the return to normal FIFO operation.]
Figure 9-4 Processor synchronization with IEM
When maximum performance is required, SYNCMODEREQ is asserted. When the IEM
register slice receives this signal it closes its FIFOs to new data, subject to the constraints
required by the AXI protocol, waits for the FIFOs to drain, and then switches the multiplexers
so that the AXI master and slave connect directly. The IEM register slice asserts
SYNCMODEACK to acknowledge the direct connection.
For reduced performance levels SYNCMODEREQ is deasserted, and the IEM register slice
switches the multiplexers and deasserts SYNCMODEACK when it has done so. The protocol
for these signals means that it is possible to connect different IEM register slices together. You
can connect SYNCMODEREQ to all the IEM register slices in parallel and AND together the
SYNCMODEACK outputs.
This means that the combined SYNCMODEACK signal only goes high when all the IEM register
slices have asserted their SYNCMODEACK signals. When coming out of bypass mode, all the IEM
register slices take the same number of cycles, so the SYNCMODEACK signals all deassert
at the same time. Alternatively, if necessary, you can daisy chain the IEM register slices together,
so that each slice in the chain only closes its inputs when the previous slice has been multiplexed
out.
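The parallel connection described above can be illustrated with a small behavioral model. The following Python sketch is a simplification made for this example: the class and function names are hypothetical, each slice is assumed to drain one FIFO entry per cycle, and deassertion is modeled as taking a single cycle. It shows only that the ANDed acknowledge goes high after the slowest slice has drained.

```python
class IemRegisterSlice:
    """Toy model of one IEM register slice's request/acknowledge behavior."""

    def __init__(self, fifo_entries):
        self.fifo_entries = fifo_entries     # entries still held in the FIFOs
        self.ack = False                     # models SYNCMODEACK

    def step(self, req):
        """Advance one cycle with SYNCMODEREQ = req."""
        if req:
            if self.fifo_entries > 0:
                self.fifo_entries -= 1       # closed to new data, draining
            else:
                self.ack = True              # empty: bypass and acknowledge
        else:
            self.ack = False                 # leave bypass mode
        return self.ack


def combined_ack(slices):
    """AND together the SYNCMODEACK outputs of all slices."""
    return all(s.ack for s in slices)


def run_handshake(fifo_depths, cycles):
    """Assert the request for a number of cycles and record the ANDed ack."""
    slices = [IemRegisterSlice(depth) for depth in fifo_depths]
    history = []
    for _ in range(cycles):
        for s in slices:
            s.step(True)
        history.append(combined_ack(slices))
    return history
```

With initial FIFO occupancies of 0, 2, and 3 entries, the combined acknowledge in this model first goes high in the fourth cycle, when the fullest slice has drained.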
Read latency penalty for synchronous operation with IEM
When the IEM register slices are instantiated, but are synchronous because SYNCMODEREQ
is asserted, the read latency is the same as if the IEM register slices were not present. See Read
latency penalty with no IEM on page 9-3 and Figure 9-2 on page 9-4.
Read latency penalty for asynchronous operation with IEM
When the IEM register slices are instantiated and in asynchronous mode, data read or write
operations incur additional latency because of the synchronization required for the address and
the data between the core and the AXI system. The exact latency depends on:
• the clock ratios
• the clock alignments
• the latency of the AXI system.
On average, with zero-wait-state AXI the system incurs a penalty of 2.5 additional CLKIN
cycles and 4.5 additional ACLK cycles.
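As a worked example of the average figure, the following Python sketch converts the penalty into nanoseconds for given clock frequencies. The function name and the example frequencies are assumptions chosen for illustration and are not taken from this manual.

```python
def average_iem_read_penalty_ns(clkin_mhz, aclk_mhz):
    """Average extra read latency in asynchronous mode, zero-wait-state AXI.

    Uses the average figures quoted in the text: 2.5 additional CLKIN
    cycles plus 4.5 additional ACLK cycles.
    """
    clkin_period_ns = 1000.0 / clkin_mhz
    aclk_period_ns = 1000.0 / aclk_mhz
    return 2.5 * clkin_period_ns + 4.5 * aclk_period_ns
```

For a hypothetical system with CLKIN at 400MHz and ACLK at 200MHz, the result is 2.5 x 2.5ns + 4.5 x 5ns = 28.75ns.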
Figure 9-5 on page 9-8 shows the latency that the IEM register slices add in a system with
ACLK and CLKIN of the same frequency, but not synchronous. This example AXI system is
zero-wait-state.
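[Figure not reproduced: timing diagram. Data side, CLKIN domain (Core): DC1, DC2, RAW, L2R, AVC, WPA, then SD1, SD2, RDC, L1, LSU. Instruction side, CLKIN domain (Core): Fe1, Fe2, L2R, AVC, WPA, then SD1, SD2, RDC, L1, PU. In both cases the ACLKRW or ACLKI domain (SoC) shows SA1, SA2, AVS, RDS, WPD.]
Figure 9-5 Read latency with IEM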
The latency, from the pipeline cycles associated with cache reading DC1 and DC2 or Fe1 and
Fe2 to the level two AXI interfaces, is the same as that in Figure 9-2 on page 9-4. The level two
AXI interface, on the Core side of the IEM register slice, asserts ARVALIDRW or ARVALIDI
in cycle AVC. The IEM register slice must then synchronize the address to the ACLK clock
domain on the SoC side. The address is written into an address FIFO in cycle WPA. There are
then two synchronization cycles in the ACLK clock domain, SA1 and SA2, and a buffer cycle
before ARVALID is asserted on the SoC side of the IEM register slice in cycle AVS. Read data
returned from the AXI system in cycle RDS passes through the IEM register slice in a similar
way. In the ACLK clock domain, the data is written into a data FIFO in cycle WPD. The data
then synchronizes in the CLKIN clock domain, in cycles SD1 and SD2, and passes through a
buffer cycle before finally passing to the level two interfaces in cycle RDC. When the level two
interfaces of the core receive the data, they then pass it back to the LSU or PU in two cycles, see
Figure 9-2 on page 9-4.
Each of the IEM register slices, except the peripheral port slice, can store multiple items of read
and write data. This means that a burst of data can typically synchronize in fewer cycles than the
same number of individual data items. The number of cycles required to synchronize a burst of
data depends on:
• the length of the burst
• the ratio of the clock frequencies
• the clock that has the higher frequency
• the latency of the AXI system
• if the operation is a read or write.
9.3.2
Reset with IEM
The processor has the following reset inputs:
nRESETIN
The nRESETIN signal is the main processor reset that initializes the
majority of the processor logic.
DBGnTRST
The DBGnTRST signal is the DBGTAP reset.
nPORESETIN
The nPORESETIN signal is the power-on reset that initializes the CP14
debug logic. See CP14 registers reset on page 13-25 for details.
nVFPRESETIN
The nVFPRESETIN signal is not connected and you must tie it LOW.
ARESETIn, ARESETRWn, ARESETPn, ARESETDn
Reset signals for the SoC part of the IEM register slices.
All of these are active LOW signals that reset logic in the processor.
9.4
Reset modes
The reset signals present in the processor design enable you to reset different parts of the design
independently. Table 9-1 lists the reset signals, and the combinations and possible applications
that you can use them in.
Table 9-1 Reset modes
Reset mode       nRESETIN  DBGnTRST  nPORESETIN  Application
Power-on reset   0         x         0           Reset at power up, full system reset. Hard reset or cold reset.
Processor reset  0         x         1           Reset of processor core only, watchdog reset. Soft reset or warm reset.
DBGTAP reset     1         0         1           Reset of DBGTAP logic.
Normal           1         x         1           No reset. Normal run mode.
9.4.1
Power-on reset
You must apply power-on or cold reset to the processor when power is first applied to the
system. In the case of power-on reset, the leading, falling, edge of the reset signals, nRESETIN
and nPORESETIN, does not have to be synchronous to CLKIN. Because the nRESETIN and
nPORESETIN signals are synchronized within the processor, you do not have to synchronize
these signals. Figure 9-6 shows the application of power-on reset.
[Figure not reproduced: timing diagram of CLKIN, nRESETIN, and nPORESETIN.]
Figure 9-6 Power-on reset
It is recommended that you assert the reset signals for at least three CLKIN cycles to ensure
correct reset behavior. Adopting a three-cycle reset eases the integration of other ARM parts into
the system, for example, ARM9TDMI-based designs.
It is not necessary to assert DBGnTRST on power-up.
9.4.2
CP14 debug logic
Because the nPORESETIN signal is synchronized within the processor, you do not have to
synchronize this signal.
9.4.3
Processor reset
A processor or warm reset initializes the majority of the ARM1176JZ-S processor, excluding
the ARM1176JZ-S DBGTAP controller and the EmbeddedICE-RT logic. Processor reset is
typically used for resetting a system that has been operating for some time, for example,
watchdog reset.
Because the nRESETIN signal is synchronized within the processor, you do not have to
synchronize this signal.
9.4.4
DBGTAP reset
DBGTAP reset initializes the state of the processor DBGTAP controller. DBGTAP reset is
typically used by the RealView ICE module for hot connection of a debugger to a system.
DBGTAP reset enables initialization of the DBGTAP controller without affecting the normal
operation of the processor.
Because the DBGnTRST signal is synchronized within the processor, you do not have to
synchronize this signal.
9.4.5
Normal operation
During normal operation, neither processor reset nor power-on reset is asserted. If the DBGTAP
port is not being used, the value of DBGnTRST does not matter.
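The reset modes in Table 9-1 can be summarized as a small decoder. The following Python sketch is illustrative only, and its function name and return strings are assumptions made for this example. Because the Normal row treats DBGnTRST as a don't-care, the sketch reports DBGTAP reset whenever DBGnTRST is LOW with both other signals HIGH. The combination nRESETIN = 1 with nPORESETIN = 0 is not listed in the table, so the sketch returns None for it.

```python
def reset_mode(n_reset_in, dbg_n_trst, n_poreset_in):
    """Map the three active-LOW reset inputs to a Table 9-1 reset mode.

    Each argument is 0 (asserted) or 1 (deasserted).
    Returns the mode name, or None for a combination the table omits.
    """
    if n_reset_in == 0 and n_poreset_in == 0:
        return "Power-on reset"              # DBGnTRST is a don't-care
    if n_reset_in == 0 and n_poreset_in == 1:
        return "Processor reset"             # DBGnTRST is a don't-care
    if n_reset_in == 1 and n_poreset_in == 1:
        return "DBGTAP reset" if dbg_n_trst == 0 else "Normal"
    return None                              # not listed in Table 9-1
```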
Chapter 10
Power Control
This chapter describes the processor power control functions. It contains the following sections:
• About power control
• Power management
• Intelligent Energy Management.
10.1
About power control
The features of the processor that improve energy efficiency include:
• support for Intelligent Energy Management (IEM)
• accurate branch and return prediction, reducing the number of incorrect instruction fetch
and decode operations
• use of physically addressed caches to reduce the number of cache flushes and refills,
saving energy in the system
• use of MicroTLBs to reduce the power consumed in translation and protection look-ups
each cycle
• use of sequential access information by the caches to reduce the number of accesses to the
TagRAMs and to unwanted Data RAMs.
The processor also makes extensive use of gated clocks, and of gates that disable inputs to
unused functional blocks. Only the logic actively in use to perform a calculation consumes any
dynamic power.
10.2
Power management
The processor supports these levels of power management:
• Run mode
• Standby mode
• Shutdown mode
• plus partial support for a fourth level, Dormant mode.
10.2.1
Run mode
Run mode is the normal mode of operation when all of the functionality of the core is available.
10.2.2
Standby mode
Standby mode disables most of the clocks of the device, while keeping the design powered up.
This reduces the power drawn to the static leakage current, plus a tiny clock power overhead
required to enable the device to wake up from the standby state.
The transition from Standby mode to Run mode is caused by the arrival of:
• an interrupt, whether masked or unmasked
• a debug request, only when debug is enabled
• a reset.
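The wake-up conditions listed above can be summarized as a small behavioral model. The following Python sketch is illustrative only. The `Event` names and the `wakes_from_standby` function are hypothetical and do not correspond to any processor register or signal.

```python
from enum import Enum, auto


class Event(Enum):
    """Hypothetical labels for the three wake-up events listed above."""
    INTERRUPT = auto()       # masked or unmasked
    DEBUG_REQUEST = auto()   # only effective when debug is enabled
    RESET = auto()


def wakes_from_standby(event: Event, debug_enabled: bool) -> bool:
    """Return True if the event moves the model from Standby to Run mode."""
    if event is Event.DEBUG_REQUEST:
        # A debug request only wakes the processor when debug is enabled.
        return debug_enabled
    # Interrupts (even masked ones) and resets always wake the processor.
    return event in (Event.INTERRUPT, Event.RESET)
```

For example, `wakes_from_standby(Event.DEBUG_REQUEST, debug_enabled=False)` evaluates to `False`, while an interrupt wakes the model regardless of the debug setting.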
A debug request can be generated externally, using the EDBGRQ pin on the processor, or by a
Debug Halt instruction issued to the processor through the debug scan chains.
Entry into Standby mode is performed by executing the Wait For Interrupt CP15 operation, see
c7, Cache operations on page 3-69. To ensure that the memory system is not affected by the
entry into the Standby state, the following operations are performed:
• A Data Synchronization Barrier operation ensures that all explicit memory accesses
  occurring in program order before the Wait For Interrupt have completed. This avoids any
  possible deadlocks that might be caused in a system where a memory access triggers or
  enables an interrupt that the core is waiting for. This might require some TLB page table
  walks to take place as well.
• The DMA continues running during a Wait For Interrupt and any queued DMA operations
  are executed as normal, before entering Standby mode. This enables an application using
  the DMA to set up the DMA to signal an interrupt when the DMA has completed, and then
  to issue a Wait For Interrupt operation. The degree of power-saving while the DMA is
  running is less than when the DMA is not running.
  If the DMA receives an AXI error response, it can generate an interrupt through
  nDMAEXTERRIRQ, which prevents entry into Standby mode.
• Any other memory accesses that have been started at the time that the Wait For Interrupt
  operation is executed are completed as normal. This ensures that the level two memory
  system does not see any disruption caused by the Wait For Interrupt.
• The debug channel remains active throughout a Wait For Interrupt.
Systems using the VIC interface must ensure that the VIC is not masking any interrupts that are
required for restarting the processor when in this mode of operation.
After the processor clocks have been stopped the signal STANDBYWFI is asserted to indicate
that the processor is in Standby mode.
Note
The core clock does not stop when the core is prepared for debug activity, that is, when either
TCK or JTAGSYNCBYPASS is high.
10.2.3
Shutdown mode
In Shutdown mode the entire device is powered down, and you must externally save all state,
including cache and TCM state. The processor is returned to Run mode by the assertion of
Reset. The state saving must be performed with interrupts disabled, and finish with a Data
Synchronization Barrier operation. When all the state of the processor is saved the processor
must execute a Wait For Interrupt operation. The signal STANDBYWFI is asserted to indicate
that the processor can enter Shutdown mode.
10.2.4
Dormant mode
Dormant mode enables the core to be powered down, leaving the caches and the
Tightly-Coupled Memory (TCM) powered up and maintaining their state.
The software visibility of the Cache Master Valid bits and the TLB lockdown entries is provided
to enable an implementation to be extended for Dormant mode.
The processor includes a placeholder that enables you to include the clamping logic necessary
for the full implementation of Dormant mode.
Considerations for Dormant mode
Dormant mode is only partially supported on the processor, because care is required in
implementing this on a standard synthesizable flow. The RAM blocks that are to remain
powered up must be implemented on a separate power domain, and there is a requirement to
clamp all of the inputs to the RAMs to a known logic level, with the chip enable being held
inactive. This clamping is not implemented in gates as part of the default synthesis flow because
it contributes to a critical path. The RAMCLAMP input is provided to drive this clamping.
Basic clamps are instantiated in the placeholder. They can be changed to explicit gates in the
RAM power domain, or pull-down transistors that clamp the values when the core is powered
down. For implementation details, see the ARM1176JZF-S and ARM1176JZ-S Implementation
Guide.
The RAM blocks that must remain powered up in Dormant mode, if it is implemented, are:
•
all Data RAMs associated with the cache and tightly-coupled memories
•
all TagRAMs associated with the cache
•
all Valid RAMs and Dirty RAMs associated with the cache.
The states of the Branch Target Address Cache and the associative region of the TLB are not
maintained on entry into Dormant mode.
Implementations of the processor can optionally disable RAMs associated with the main TLB,
so that a trade-off can be made between Dormant mode leakage power and the recovery time.
Before entering Dormant mode, the state of the processor, excluding the contents of the RAMs
that remain powered up in dormant mode, must be saved to external memory. These state saving
operations must ensure that the following occur:
• All ARM registers, including CPSR and SPSR registers, are saved.
• Any DMA operations in progress are stopped.
• All CP15 registers are saved, including the DMA state.
• Any locked entries in the main TLB are saved.
• All debug-related state is saved.
• The Master Valid bits for the cache are saved. These are accessed using CP15 register c15
  as c15, Instruction Cache Master Valid Register on page 3-147 describes.
• A Data Synchronization Barrier operation is performed to ensure that all state saving has
  been completed.
• A Wait For Interrupt CP15 operation is executed, enabling the signal STANDBYWFI to
  indicate that the processor can enter Dormant mode.
• On entry into Dormant mode, the Reset signal to the processor must be asserted by the
  external power control mechanism.
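The ordering constraints in the list above can be expressed as a simple check. The following Python sketch is a model, not processor code. The step names are hypothetical labels for the listed operations, and the sketch assumes that the save operations can run in any order provided they all complete before the Data Synchronization Barrier, the Wait For Interrupt, and the assertion of Reset.

```python
from typing import Sequence

# Hypothetical step names, one per save operation in the list above.
SAVE_STEPS = frozenset({
    "save_arm_registers",            # including CPSR and SPSR
    "stop_dma",
    "save_cp15_registers",           # including the DMA state
    "save_locked_tlb_entries",
    "save_debug_state",
    "save_cache_master_valid_bits",
})

# The sequence must finish with these steps, in this order.
FINAL_STEPS = ("data_synchronization_barrier", "wait_for_interrupt", "assert_reset")


def is_valid_dormant_entry(sequence: Sequence[str]) -> bool:
    """Check that every save step runs once, followed by DSB, WFI, then Reset."""
    n = len(SAVE_STEPS)
    saves, final = sequence[:n], tuple(sequence[n:])
    # Equal length plus equal sets means each save step appears exactly once.
    return len(saves) == n and set(saves) == SAVE_STEPS and final == FINAL_STEPS
```

A sequence that executes the Wait For Interrupt before the Data Synchronization Barrier, or that omits one of the save steps, is rejected by this check.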
Transition from Dormant state to Run state is triggered by the external power controller
asserting Reset to the processor until the power to the processor is restored. When power has
been restored the core leaves reset and, by interrogating the external power controller, can
determine that the saved state must be restored.
10.2.5
Communication to the Power Management Controller
The Power Management Controller in your system must power the power domains of the
processor up and down. The Power Management Controller must
be a memory-mapped controller. The ARM1176JZ-S processor accesses this controller using
Strongly-Ordered accesses.
The STANDBYWFI signal can also be used to signal to the Power Management Controller that
the ARM1176JZ-S processor is ready to have its power state changed. STANDBYWFI is
asserted in response to a Wait For Interrupt operation.
Note
The Power Management Controller must not power down any of the processor power domains
unless STANDBYWFI is asserted.
10.3
Intelligent Energy Management
This section describes the provision of IEM in the ARM1176JZ-S processors:
• Purpose of IEM
• Structure of IEM
• Operation of IEM
• Use of IEM.
Note
The ARM1176JZ-S processor is IEM enabled but the level of support for the technology
depends on the specific implementation.
For information on clocks and resets with IEM, see Clocking and resets with IEM on page 9-5.
10.3.1
Purpose of IEM
The purpose of IEM technology is to provide a dynamic optimization between processor
performance and power consumption.
10.3.2
Structure of IEM
The ARM1176JZ-S processor provides a number of features that enable the processor voltage
to vary relative to the voltage of the rest of the system. For this purpose the processor optionally
implements:
• Placeholders for level shifters and clamps for some inputs and outputs including:
— the debug interface
— interrupt signals including the VIC interface
— resets
— clocks.
• IEM register slices for the AXI level two interfaces.
Note
The ETM and coprocessor interfaces do not implement level shifters or clamps.
Figure 10-1 on page 10-7 shows the basic structure for IEM in the processor.
[Figure not reproduced. Labels in the block diagram: power domains VDD SoC, VDD RAM, and
VDD core; Processor, Core, and RAMs; Coprocessor interface and ETM interface; Up and Down level
shifters and clamps; RAMCLAMP and CPUCLAMP; CLKIN, CLK, clock enables, and Level 2 ACLK
clocks; test, debug, VIC, and other inputs and outputs; Instruction, Data read/write, DMA, and
Peripheral level 2 interfaces with register slices VCoreSliceI, VCoreSliceRW, VCoreSliceD,
VCoreSliceP, VSoCSliceI, VSoCSliceRW, VSoCSliceD, and VSoCSliceP.]
Figure 10-1 IEM structure
10.3.3
Operation of IEM
IEM balances performance and power consumption by dynamic alteration of the processor
clock frequency and supply voltage. CPUCLAMP is provided to control the clamp cells
between VCore and VSoC. Figure 10-1 shows this.
10.3.4
Use of IEM
To use IEM the processor must be implemented with appropriate register slices and included in
a SoC that contains an Intelligent Energy Controller (IEC™). For example systems, see the
Intelligent Energy Controller Technical Overview.
IEM is functionally transparent to the user.
Chapter 11
Coprocessor Interface
This chapter describes the coprocessor interface of the ARM1176JZ-S processor. It contains the
following sections:
• About the coprocessor interface
• Coprocessor pipeline
• Token queue management
• Token queues
• Data transfer
• Operations
• Multiple coprocessors.
11.1
About the coprocessor interface
The processor supports the connection of on-chip coprocessors through an external coprocessor
interface. All types of coprocessor instruction are supported.
The ARM instruction set supports the connection of 16 coprocessors, numbered 0-15, to an
ARM processor. In the processor, the following coprocessor numbers are reserved:
CP10    VFP control
CP11    VFP control
CP14    Debug and ETM control
CP15    System control.
You can use CP0-9, CP12, and CP13 for your own external coprocessors.
The processor is designed to pass instructions to several coprocessors and exchange data with
them. These coprocessors are intended to run in step with the core and are pipelined in a similar
way to the core. Instructions are passed out of the Fetch stage of the core pipeline to the
coprocessor and decoded. The decoded instruction is passed down its own pipeline. Coprocessor
instructions can be canceled by the core if a condition code fails, or the entire coprocessor
pipeline can be flushed in the event of a mispredicted branch. Load and store data are also
required to pass between the core Load Store Unit (LSU) and the coprocessor pipeline.
The coprocessor interface operates over a two-cycle delay. Any signal passing from the core to
the coprocessor, or from the coprocessor to the core, is given a whole clock cycle to propagate
from one to the other. This means that a signal crossing the interface is clocked out of a register
on one side of the interface and clocked directly into another register on the other side. No
combinatorial process must intervene. This constraint exists because the core and coprocessor
can be placed a considerable distance apart and generous timing margins are necessary to cover
signal propagation times. This delay in signal propagation makes it difficult to maintain pipeline
synchronization, ruling out a tightly-coupled synchronization method.
The processor implements a token-based pipeline synchronization method that enables some
slack between the two pipelines, while ensuring that the pipelines are correctly aligned for
crucial transfers of information.
11.2
Coprocessor pipeline
The coprocessor interface achieves loose synchronization between the two pipelines by
exchanging tokens from one pipeline to the other. These tokens pass down queues between the
pipelines and can carry additional information. In most cases the primary purpose of the queue
is to carry information about the instruction being processed, or to inform one pipeline of events
occurring in the other.
Tokens are generated whenever a coprocessor instruction passes out of a pipeline stage
associated with a queue into the next stage. These tokens are picked up by the partner stage in
the other pipeline, and used to enable the corresponding instruction in that stage to move on. The
movement of coprocessor instructions down each pipeline is matched exactly by the movement
of tokens along the various queues that connect the pipelines.
If a pipeline stage has no associated queue, the instruction contained within it moves on in the
normal way. The coprocessor interface is data-driven rather than control-driven.
11.2.1
Coprocessor instructions
Each coprocessor might only execute a subset of all possible coprocessor instructions.
Coprocessors reject those instructions they cannot handle. Table 11-1 lists all the coprocessor
instructions supported by the processor and gives a brief description of each. For more details
of coprocessor instructions, see the ARM Architecture Reference Manual.
Table 11-1 Coprocessor instructions
Instruction  Data transfer  Vectored  Description
CDP          None           No        Processes information already held within the coprocessor
MRC          Store          No        Transfers information from the coprocessor to the core registers
MCR          Load           No        Transfers information from the core registers to the coprocessor
MRRC         Store          No        Transfers information from the coprocessor to a pair of
                                      registers in the core
MCRR         Load           No        Transfers information from a pair of registers in the core to
                                      the coprocessor
STC          Store          Yes       Transfers information from the coprocessor to memory and
                                      might be iterated to transfer a vector
LDC          Load           Yes       Transfers information from memory to the coprocessor and
                                      might be iterated to transfer a vector
The coprocessor instructions fall into three groups:
• loads
• stores
• processing instructions.
The load and store instructions enable information to pass between the core and the coprocessor.
Some of them might be vectored. This enables several values to be transferred in a single
instruction. This typically involves the transfer of several words of data between a set of registers
in the coprocessor and a contiguous set of locations in memory.
Other instructions, for example MCR and MRC, transfer data between core and coprocessor
registers. The CDP instruction controls the execution of a specified operation on data already
held within the coprocessor, writing the result back into a coprocessor register, or changing the
state of the coprocessor in some other way. Opcode fields within the CDP instruction determine
the operation that is to be carried out.
The core pipeline handles both core and coprocessor instructions. The coprocessor, on the other
hand, only deals with coprocessor instructions, so the coprocessor pipeline is likely to be empty
for most of the time.
11.2.2
Coprocessor control
The coprocessor communicates with the core using several signals. Most of these signals control
the synchronizing queues that connect the coprocessor pipeline to the core pipeline. Table 11-2
lists the signals used for general coprocessor control.
Table 11-2 Coprocessor control signals
Signal        Description
CLKIN         This is the clock signal from the core.
nRESETIN      This is the reset signal from the core.
ACPNUM[3:0]   This is the fixed number assigned to the coprocessor, and is in the range
              0-13, excluding 10 and 11. Coprocessor numbers 10, 11, 14, and 15 are
              reserved for system control coprocessors.
ACPENABLE     When set, enables the coprocessor to respond to signals from the core.
ACPPRIV       When asserted, indicates that the core is in privileged mode. This might
              affect the execution of certain coprocessor instructions.
11.2.3
Pipeline synchronization
Figure 11-1 on page 11-5 shows an outline of the core and coprocessor pipelines and the
synchronizing queues that communicate between them. Each queue is implemented as a very
short First In First Out (FIFO) buffer.
No explicit flow control is required for the queues, because the pipeline lengths between the
queues limit the number of items any queue can hold at any time. The geometry used means
that only three slots are required in each queue.
The only status information required is a flag to indicate when the queue is empty. This is
monitored by the receiving end of the queue, and determines if the associated pipeline stage can
move on. Any information that the queue carries can also be read and acted on at the same time.
[Figure not reproduced. Labels: core pipeline stages Fe2, De, Iss, Ex1, Ex2, Ex3, and Wb;
coprocessor pipeline stages D, I, and Ex1 to Ex6; Instruction, Length, Cancel, Accept, and
Finish queues between the two pipelines.]
Figure 11-1 Core and coprocessor pipelines
Figure 11-2 provides a more detailed picture of the pipeline and the queues maintained by the
coprocessor.
[Figure not reproduced. Labels: coprocessor Decode stage and pipeline stages D, I, and Ex1 to
Ex6; Instruction, Length, Accept, Cancel, Store data, Load data, and Finish queues, with their
connections to the core pipeline stages and to the LSU Add and Wbls stages.]
Figure 11-2 Coprocessor pipeline and queues
The instruction queue incorporates the instruction decoder and returns the length to the Ex1
stage of the core, using the length queue, which is maintained by the core. The coprocessor I
stage sends a token to the core Ex2 stage through the accept queue, which is also maintained by
the core. This token indicates to the core if the coprocessor is accepting the instruction in its
I stage, or bouncing it.
The core can cancel an instruction currently in the coprocessor Ex1 stage by sending a signal
with the token passed down the cancel queue. When a coprocessor instruction reaches the Ex6
stage it might retire. How it retires depends on the instruction:
• Load instructions retire when they find load data available in the load data queue, see
  Loads on page 11-16
• Store instructions retire as soon as they leave the Ex1 stage, and are removed from the
  pipeline, see Stores on page 11-17
• CDP instructions retire when they read a token passed by the core down the finish queue.
Figure 11-2 on page 11-5 shows how data transfer uses the load data and store data queues, and
Data transfer on page 11-15 explains this.
11.2.4
Pipeline control
The coprocessor pipeline is very similar to the core pipeline, but lacks the fetch stages.
Instructions are passed from the core directly into the Decode stage of the coprocessor pipeline,
which takes the form of a FIFO queue.
The Decode stage then decodes the instruction, rejecting non-coprocessor instructions and any
coprocessor instructions containing a nonmatching coprocessor number.
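The rejection rule applied by the Decode stage can be illustrated with a short Python sketch. It relies on the standard ARM instruction encoding, in which bits [27:24] identify the coprocessor instruction space and bits [11:8] hold the coprocessor number. The function names are hypothetical, and the sketch is simplified: it ignores undefined encodings within the coprocessor space and is not a description of the actual decoder logic.

```python
def is_coprocessor_instruction(instruction: int) -> bool:
    """Return True if a 32-bit ARM instruction word is in the coprocessor space."""
    top = (instruction >> 24) & 0xF
    # 0b1100 and 0b1101 cover LDC, STC, MCRR, and MRRC.
    # 0b1110 covers CDP, MCR, and MRC.
    return top in (0xC, 0xD, 0xE)


def coprocessor_number(instruction: int) -> int:
    """Extract the coprocessor number field, bits [11:8]."""
    return (instruction >> 8) & 0xF


def decode_accepts(instruction: int, acpnum: int) -> bool:
    """Model of the Decode stage filter: keep only instructions for this coprocessor."""
    return (is_coprocessor_instruction(instruction)
            and coprocessor_number(instruction) == acpnum)
```

For example, the word 0xEE000100 encodes a CDP instruction addressed to coprocessor 1, so `decode_accepts(0xEE000100, 1)` is true, whereas 0xE1A00000 (MOV r0, r0) is rejected for every coprocessor number.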
The length of any vectored data transfer is also decided at this point and sent back to the core.
The decoded instruction then passes into the issue (I) stage. This stage decides if this particular
instance of the instruction can be accepted. If it cannot, because it addresses a non-existent
register, the instruction is bounced, informing the core that it cannot be accepted.
If the instruction is both valid and executable, it then passes down the execution pipeline, Ex1
to Ex6. At the bottom of the pipeline, in Ex6, the instruction waits for retirement. It can do this
when it receives a matching token from another queue fed by the core.
Figure 11-3 on page 11-7 shows the coprocessor pipeline, the main fields within each stage, and
the main control signals. Each stage controls the flow of information from the previous stage in
the pipeline by passing its Enable signal back. When a pipeline stage is not enabled, it cannot
accept information from the previous stage.
[Figure not reproduced. Labels: instruction queue and decoder fed from the core pipeline; I, Ex1,
Ex2, and Ex6 stages, each holding a decoded instruction, a tag, and Full and other flags; stage
control blocks for I, Ex1, Ex2, and Ex6 with Enable signals passed back to the previous stage;
Stall D, Stall I, Stall Ex1, and Stall Ex6 inputs. Stages Ex3 to Ex5 are not shown and are the
same as stage Ex2.]
Figure 11-3 Coprocessor pipeline
Each pipeline stage contains a decoded instruction, and a tag, plus a few status flags:
Full flag   This flag is set whenever the pipeline stage contains an instruction.
Dead flag   This flag is set to indicate that the instruction in the stage is a phantom. See
            Cancel operations on page 11-19.
Tail flag   This flag is set to indicate that the instruction is the tail of an iterated instruction.
            See Loads on page 11-16.
There might also be other flags associated with the decoding of the instruction. Each stage is
controlled not only by its own state, but also by external signals and signals from the following
stage, as follows:
Stall       This signal prevents the stage from accepting a new instruction or passing its own
            instruction on, and only affects the D, I, Ex1, and Ex6 stages.
Iterate     This signal indicates that the instruction in the stage must be iterated to implement
            a multiple load/store and only applies to the I stage.
Enable      This signal indicates that the next stage in the pipeline is ready to accept data from
            the current stage.
These signals are combined with the current state of the pipeline to determine if the stage can
accept new data, and what the new state of the stage is going to be. Table 11-3 lists how the new
state of the pipeline stage is derived.
Table 11-3 Pipeline stage update
Stall  Enable input  Iterate  State  Enable  To next stage  Remarks
0      0             X        Empty  1       None           Bubble closing
0      0             X        Full   0       -              Stalled by next stage
0      1             0        Empty  1       None           Normal pipeline movement
0      1             0        Full   1       Current        Normal pipeline movement
0      1             1        Empty  -       -              Impossible
0      1             1        Full   0       Current        Iteration, I stage only
1      X             X        X      0       None           Stalled, D, I, Ex1, and Ex6 only
The Enable input comes from the next stage in the pipeline and indicates if data can be passed
on. In general, if this signal is unasserted the pipeline stage cannot receive new data or pass on
its own contents. However, if the pipeline stage is empty it can receive new data without passing
any data on to the next stage. This is known as bubble closing, because it has the effect of filling
up empty stages in the pipeline by enabling them to move on while lower stages are stalled.
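The stage update rules in Table 11-3 can be written as a single function. The following Python sketch is a behavioral model under the assumption that each table row is evaluated independently once per cycle. The function name and the string "current" are illustrative choices, not signal names from the interface.

```python
from typing import Optional, Tuple


def stage_update(stall: bool, enable_in: bool, iterate: bool,
                 full: bool) -> Tuple[bool, Optional[str]]:
    """Return (enable_out, to_next_stage) for one pipeline stage.

    enable_out     True if the stage can accept data from the previous stage.
    to_next_stage  "current" if the stage passes its instruction on,
                   None if nothing is passed on.
    """
    if stall:
        # Stalled (D, I, Ex1, and Ex6 only): accept nothing, pass nothing.
        return False, None
    if not full:
        if enable_in and iterate:
            # Table 11-3 marks this combination as impossible.
            raise ValueError("an empty stage cannot iterate")
        # An empty stage always accepts new data. With enable_in low this
        # is bubble closing, otherwise it is normal pipeline movement.
        return True, None
    if not enable_in:
        # Full, and the next stage is not ready: stalled by next stage.
        return False, None
    if iterate:
        # I stage only: pass the instruction on but keep it for iteration.
        return False, "current"
    # Normal pipeline movement.
    return True, "current"
```

The bubble closing case is the first row of the table: `stage_update(False, False, False, False)` returns `(True, None)`, so an empty stage accepts new data even though the stage below it is not enabled.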
11.2.5
Instruction tagging
It is sometimes necessary for the core to be able to identify instructions in the coprocessor
pipeline. This is necessary for flushing, see Flush operations on page 11-19, so that the core can
indicate to the coprocessor the instructions that are to be flushed. The core therefore gives each
instruction sent to the coprocessor a tag, drawn from a pool of values large enough that
all the tags in the pipeline at any moment are unique. Sixteen tags are sufficient to achieve this,
requiring a four-bit tag field. Each time a tag is assigned to an instruction, the tag number is
incremented modulo 16 to generate the next tag.
The flushing mechanism is simplified because successive coprocessor instructions have
contiguous tags. The core manages this by only incrementing the tag number when the
instruction passed to the coprocessor is a coprocessor instruction. This is done after sending the
instruction, so the tag changes after a coprocessor instruction is sent, rather than before. It is not
possible to increment the tag before sending the instruction because the core has not yet had time
to decode the instruction to determine what kind of instruction it is. When the coprocessor
Decode stage removes the non-coprocessor instructions, it is left with an instruction stream
carrying contiguous tags. The tags can also be used to verify that the sequence of tokens moving
down the queues matches the sequence of instructions moving down the core and coprocessor
pipelines.
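The tagging scheme described above can be sketched as a small counter model. In the following Python sketch the class name, the method name, and the initial tag value of 0 are assumptions made for illustration. Only the four-bit width, the modulo 16 increment, and the rule that the tag advances after a coprocessor instruction is sent come from the description.

```python
class TagAllocator:
    """Core-side tag counter: four-bit tag, incremented modulo 16."""

    def __init__(self) -> None:
        self.tag = 0  # assumed starting value

    def send(self, is_coprocessor_instruction: bool) -> int:
        """Return the tag attached to the instruction being sent.

        The counter advances only after a coprocessor instruction is sent,
        so non-coprocessor instructions reuse the pending tag.
        """
        tag = self.tag
        if is_coprocessor_instruction:
            self.tag = (self.tag + 1) % 16
        return tag


# Usage: True marks a coprocessor instruction in the fetched stream.
allocator = TagAllocator()
stream = [True, False, False, True, True]
sent = [(allocator.send(is_cp), is_cp) for is_cp in stream]
# Tags that survive once the Decode stage drops non-coprocessor instructions.
coprocessor_tags = [tag for tag, is_cp in sent if is_cp]
```

In this usage example the surviving coprocessor instructions carry the contiguous tags 0, 1, and 2, although two non-coprocessor instructions were sent between them.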
11.2.6
Flush broadcast
If a branch has been mispredicted, it might be necessary for the core to flush both pipelines.
Because this action potentially affects the entire pipeline, it is not passed across in a queue but
is broadcast from the core to the coprocessor, subject to the same timing constraints as the
queues. When the flush signal is received by the coprocessor, it causes the pipeline and the
instruction queue to be cleared up to the instruction triggering the flush. This is explained in
more detail in Flush operations on page 11-19.
11.3
Token queue management
The token queues, all of which are three slots long and function identically, are implemented as
short FIFOs. The following sections describe an example implementation of the queues:
• Queue implementation
• Queue modification
• Queue flushing.
11.3.1
Queue implementation
The queue FIFOs are implemented as three registers, with the current output selected by using
multiplexors. Figure 11-4 shows this arrangement.
[Figure not reproduced. Labels: Interconnect input; buffers A, B, and C with their valid flags;
multiplexors controlled by select signals S0 and S1; outputs Out and V.]
Figure 11-4 Token queue buffers
The queue consists of three registers. Each of these is associated with a flag that indicates if the
register contains valid data. New data are moved into the queue by being written into buffer A
and continue to move along the queue if the next register is empty, or is about to become empty.
If the queue is full, the oldest data, and therefore the first to be read from the queue, occupies
buffer C and the newest occupies buffer A.
The multiplexors also select the current flag, which then indicates whether the selected output
is valid.
11.3.2
Queue modification
The queue is written to on each cycle. Buffer A accepts the data arriving at the interface, and the
buffer A flag accepts the valid bit associated with the data. If the queue is not full, this results in
no loss of data because the contents of buffer A are moved to buffer B during the same cycle.
If the queue is full, then the loading of buffer A is inhibited to prevent loss of data. In any case,
no valid data is presented by the interface when the queue is full, so no data loss ensues.
The state of the three buffer flags is used to decide the buffer that provides the queue output
during each cycle. The output is always provided by the buffer containing the oldest data. This
is buffer C if it is full. Otherwise it is buffer B if that is full, or buffer A if buffer B is empty.
A simple priority encoder, looking at the three flags, can supply the correct multiplexor select
signals. The state of the three flags can also determine how data are moved from one buffer to
another in the queue. Table 11-4 lists how the three flags are decoded.
Table 11-4 Addressing of queue buffers
Flag C  Flag B  Flag A  S1  S0  Remarks
0       0       0       X   X   Queue is empty
0       0       1       0   0   B = A
0       1       0       0   1   C = B
0       1       1       0   1   C = B, B = A
1       0       0       1   X   -
1       0       1       1   X   B = A
1       1       0       1   X   -
1       1       1       1   X   Queue is full. Input inhibited
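The select decoding in Table 11-4 amounts to a priority encoder. The following Python sketch is one possible reading of the table. It assumes that S0 chooses between buffers A and B and that S1 chooses between that result and buffer C, and it represents a don't care value as `None`. The function name is hypothetical.

```python
def select_output(flag_c: bool, flag_b: bool, flag_a: bool):
    """Return (s1, s0, source) for the buffer holding the oldest data.

    None stands for a don't care select value, or for no source
    when the queue is empty.
    """
    if flag_c:
        return 1, None, "C"   # buffer C always wins when it is full
    if flag_b:
        return 0, 1, "B"
    if flag_a:
        return 0, 0, "A"
    return None, None, None   # queue is empty
```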
New data can be moved into buffer A, provided the queue is not full, even if its flag is set,
because the current contents of buffer A are moved to buffer B. When the queue is read, the flag
associated with the buffer providing the information must be cleared. This operation can be
combined with an input operation so that the buffer is overwritten at the end of the cycle during
which it provides the queue output. This can be implemented by using the read enable signal to
mask the flag of the selected stage, making it available for input. Figure 11-5 shows reading and
writing a queue.
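[Figure not reproduced. Timing diagram with traces for Valid input, Buffer A, Flag A, Buffer B,
Flag B, Buffer C, Flag C, Read queue, and Output. Buffer A holds One, Two, Three, and Four in
turn, buffer B holds One, Two, and Three, buffer C holds One and Two, and the output is One.]
Figure 11-5 Queue reading and writing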
Four valid inputs, labeled One, Two, Three, and Four, are written into the queue, and are clocked
into buffer A as they arrive. Figure 11-5 shows how these inputs are clocked from buffer to
buffer until the first input reaches buffer C. At this point a read from the queue is required.
Because buffer C is full, it is chosen to supply the data. Because it is being read, it is free to
accept more input, and so it receives the value Two from buffer B, which in turn receives the value
Three from buffer A. Because buffer A is being emptied by writing to buffer B, it can accept the
value Four from the input.
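The sequence described above can be reproduced with a small cycle-based model of the three-slot queue. The following Python sketch is an illustration of the example implementation, not a description of the actual hardware. The class and method names are hypothetical, and an empty buffer, that is, one whose flag is clear, is represented by `None`.

```python
class TokenQueue:
    """Three-register FIFO: new data enters buffer A, the oldest data leaves first."""

    def __init__(self) -> None:
        self.a = self.b = self.c = None  # None means the buffer flag is clear

    def step(self, data=None, read=False):
        """Advance one clock cycle and return the value read, if any."""
        a, b, c = self.a, self.b, self.c
        out = None
        if read:
            # The oldest valid buffer provides the output and its flag is
            # masked, which frees it to be overwritten in the same cycle.
            if c is not None:
                out, c = c, None
            elif b is not None:
                out, b = b, None
            elif a is not None:
                out, a = a, None
        # Data moves along when the next register is empty or being emptied.
        if c is None and b is not None:
            c, b = b, None
        if b is None and a is not None:
            b, a = a, None
        if data is not None:
            if a is not None:
                raise OverflowError("queue is full, input inhibited")
            a = data
        self.a, self.b, self.c = a, b, c
        return out


queue = TokenQueue()
for value in ("One", "Two", "Three"):
    queue.step(value)
first = queue.step("Four", read=True)
```

After the first three cycles buffer C holds One. The fourth cycle reads One from buffer C while Two, Three, and Four move into buffers C, B, and A respectively.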
11.3.3
Queue flushing
When the coprocessor pipeline is flushed, in response to a command from the core, some of the
queues might also require flushing. There are two possible ways of flushing the queue:
• the entire queue is cleared
• the queue is flushed from a selected buffer, along with all data in the queue newer than the
  data in the selected buffer.
The method used depends on the point when flushing begins in the coprocessor pipeline. See
Flush operations on page 11-19 for more details. A flush command has associated with it a tag
value that indicates where the queue flushing starts. This is matched with the tag carried by
every instruction.
If the queue is to be flushed from a selected buffer, the buffer is chosen by looking for a matching
tag. When this is found, the flag associated with that buffer is cleared, and every flag newer than
the selected one is also cleared. Figure 11-6 shows queue flushing.
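[Figure not reproduced. Labels: Flush all and Flush tag inputs; buffers A, B, and C with tags
Tag A, Tag B, and Tag C; one less-than-or-equal comparator per buffer driving Clear A, Clear B,
and Clear C.]
Figure 11-6 Queue flushing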
Each buffer in the queue has a tag comparator associated with it. The flush tag is presented to
each comparator, to be compared with the tag belonging to each valid instruction held in the
queue. If the flush tag is the same as, or older than, the tag of a queue entry, that entry has its
Full flag cleared to indicate that it is empty. A less-than-or-equal-to comparison is used to
identify the tags that are to be flushed. If the flush tag matches an instruction in a stage of the
pipeline later than the queue matches, the Flush all signal is asserted to clear the entire queue.
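The tag comparison used for flushing can be sketched as follows. Because tags wrap modulo 16, the sketch compares them by wrapped distance and assumes that no more than eight tags are live at the same time. That window size, the function names, and the dictionary representation of the buffers are all assumptions made for this illustration, and the comparison used by a real implementation might differ.

```python
TAG_MODULUS = 16


def is_same_or_newer(tag: int, flush_tag: int) -> bool:
    """Return True if `tag` was issued at or after `flush_tag`.

    Assumption: at most 8 tags are live at once, so a wrapped
    distance below 8 means "same or newer".
    """
    return (tag - flush_tag) % TAG_MODULUS < TAG_MODULUS // 2


def flush_queue(buffers: dict, flush_tag: int, flush_all: bool = False) -> dict:
    """Return a copy of the queue with the flushed entries cleared.

    `buffers` maps a buffer name to the tag of its valid entry,
    or to None if the buffer flag is already clear.
    """
    result = {}
    for name, tag in buffers.items():
        hit = tag is not None and (flush_all or is_same_or_newer(tag, flush_tag))
        result[name] = None if hit else tag
    return result
```

With buffers holding tags 15, 0, and 1 in C, B, and A, a flush from tag 0 clears buffers B and A but keeps the older entry in buffer C, even though its tag is numerically larger.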
11.4
Token queues
The following sections describe each of the synchronizing queues:
•
Instruction queue
•
Length queue
•
Accept queue
•
Cancel queue
•
Finish queue.
11.4.1
Instruction queue
The core passes every instruction fetched from memory across the coprocessor interface, where
it enters the instruction queue. Ideally it only passes on the coprocessor instructions, but has not,
at this stage, had time to decode the instruction.
The coprocessor decodes the instruction on arrival in its own Decode stage and rejects the
non-coprocessor instructions. The core does not require any acknowledgement of the removal
of these instructions because each instruction type is determined within the coprocessor's
Decode stage. This means that the instruction received from the core must be decoded as soon
as it enters the instruction queue. The instruction queue is a modified version of the standard
queue, that incorporates an instruction decoder. Figure 11-7 shows an instruction queue
implementation.
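[Figure 11-7 Instruction queue. Elements shown: input V through the interconnect, Decoder on Buffer A, Buffers A, B, and C with their flags, select signals S0 and S1 on the output multiplexers, and the queue output Out.]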
The decoder decodes the instruction written into buffer A as soon as it arrives. The subsequent
buffers, B and C, receive the decoded version of the instruction in buffer A.
The A flag now indicates that the data in buffer A are valid and represent a coprocessor
instruction. This means that non-coprocessor or unrecognized instructions are immediately
dropped from the instruction queue and are never passed on.
The coprocessor must also compare the coprocessor number field in a coprocessor instruction
with its own number, given by ACPNUM. If the number does not match, the
instruction is invalid. The instruction queue provides an interface to the core through the
following signals, that the core drives:
ACPINSTRV
This signal is asserted when valid data are available from the core. It must
be clocked directly into the buffer A flag, unless the queue is full, in which
case it is ignored.
ACPINSTR[31:0] This is the instruction being passed to the coprocessor from the core, and
must be clocked into buffer A.
ACPINSTRT[3:0] This is the flush tag associated with the instruction in ACPINSTR, and
must be clocked into the tag associated with buffer A.
The instruction queue feeds the issue stage of the coprocessor pipeline, providing a new input
to the pipeline, in the form of a decoded instruction and its associated tag, whenever the queue
is not empty.
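As an illustration of the decode-on-entry behaviour, the following sketch computes the value clocked into the buffer A flag. It is a simplified model, not the implementation. The function names are hypothetical. The bit positions are those of the standard ARM instruction encoding, where bits [27:24] equal to 0b1110 or bits [27:25] equal to 0b110 identify the coprocessor instruction space and bits [11:8] hold the coprocessor number.

```python
def is_coprocessor_instruction(instr):
    """Bits [27:24] == 0b1110 covers CDP, MCR, and MRC.
    Bits [27:25] == 0b110 covers LDC, STC, MCRR, and MRRC."""
    return ((instr >> 24) & 0xF) == 0b1110 or ((instr >> 25) & 0x7) == 0b110


def coprocessor_number(instr):
    """Coprocessor number field, bits [11:8]."""
    return (instr >> 8) & 0xF


def buffer_a_flag(acpinstrv, acpinstr, queue_full, acpnum):
    """Value clocked into the buffer A flag for one cycle.

    The flag is set only for a valid coprocessor instruction whose
    coprocessor number matches acpnum (the value given by ACPNUM).
    """
    if not acpinstrv or queue_full:
        return False
    return (is_coprocessor_instruction(acpinstr)
            and coprocessor_number(acpinstr) == acpnum)


MCR_P15 = 0xEE010F10      # MCR p15, 0, r0, c1, c0, 0
ADD_R0 = 0xE0800001       # ADD r0, r0, r1 (a core instruction)

accepted = buffer_a_flag(True, MCR_P15, False, acpnum=15)
wrong_number = buffer_a_flag(True, MCR_P15, False, acpnum=10)
core_instruction = buffer_a_flag(True, ADD_R0, False, acpnum=15)
```

Here only the first call sets the flag. The second is dropped because the coprocessor number does not match, and the third because the instruction is not a coprocessor instruction.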
11.4.2
Length queue
When a coprocessor has decoded an instruction it knows how long a vectored load/store
operation is. This information is sent with the synchronizing token down the length queue, as
the relevant instruction leaves the instruction queue to enter the issue stage of the pipeline. The
length queue is maintained by the core and the coprocessor communicates with the queue using
the following signals:
CPALENGTH[3:0]
This is the length of a vectored data transfer to or from the coprocessor. It is
determined by the decoder in the instruction queue and asserted as the decoded
instruction moves into the issue stage. If the current instruction does not represent
a vectored data transfer, the length value is set to zero.
CPALENGTHT[3:0]
This is the tag associated with the instruction leaving the instruction queue, and
is copied from the queue buffer supplying the instruction.
CPALENGTHHOLD
This is deasserted when the instruction queue is providing valid information to the
core length queue. Otherwise, the signal is asserted to indicate that no valid data
are available.
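The three length queue signals can be sketched as a function of the instruction leaving the instruction queue. This is an illustrative model. The function name and the dictionary layout are hypothetical, and driving CPALENGTH and CPALENGTHT to zero while CPALENGTHHOLD is asserted is an assumption, because the text does not define their values in that case.

```python
def length_queue_outputs(leaving):
    """leaving: None when no instruction leaves the instruction queue this
    cycle, else a dict with 'tag' and 'vector_length' (0 if not vectored).
    """
    if leaving is None:
        # No valid data: assert HOLD. The other values are assumed zero.
        return {"CPALENGTHHOLD": 1, "CPALENGTH": 0, "CPALENGTHT": 0}
    return {
        "CPALENGTHHOLD": 0,
        "CPALENGTH": leaving["vector_length"] & 0xF,
        "CPALENGTHT": leaving["tag"] & 0xF,
    }


idle = length_queue_outputs(None)
vectored = length_queue_outputs({"tag": 3, "vector_length": 4})
single = length_queue_outputs({"tag": 4, "vector_length": 0})
```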
11.4.3
Accept queue
The coprocessor must decide in the issue stage if it can accept an otherwise valid coprocessor
instruction. It passes this information with the synchronizing token down the accept queue, as
the relevant instruction passes from the issue stage to Ex1.
If an instruction cannot be accepted by the coprocessor it is said to have been bounced. If the
coprocessor bounces an instruction it does not remove the instruction from its pipeline, but
converts it to a phantom. This is explained in more detail in Bounce operations on page 11-19.
The accept queue is maintained by the core and the coprocessor communicates with the queue
using the following signals, that are all driven by the coprocessor:
CPAACCEPT
This is set to indicate that the instruction leaving the coprocessor issue stage has
been accepted.
CPAACCEPTT[3:0]
This is the tag associated with the instruction leaving the issue stage.
CPAACCEPTHOLD
This is deasserted when the issue stage is passing an instruction on to the Ex1
stage, whether it has been accepted or not. Otherwise, the signal is asserted to
indicate that no valid data are available.
11.4.4
Cancel queue
The core might want to cancel an instruction that it has already passed on to the coprocessor.
This can happen if the instruction fails its condition codes, which requires the instruction to be
removed from the instruction stream in both the core and the coprocessor.
The queue, a standard queue, as Token queue management on page 11-9 describes, is maintained
by the coprocessor and is read by the coprocessor Ex1 stage.
The cancel queue provides an interface to the core through the following signals, that are all
driven by the core:
ACPCANCELV
This signal is asserted when valid data are available from the core. It must be
clocked directly into the buffer A flag, unless the queue is full, when it is ignored.
ACPCANCEL
This is the cancel command being passed to the coprocessor from the core, and
must be clocked into buffer A.
ACPCANCELT[3:0]
This is the flush tag associated with the cancel command, and must be clocked
into the tag associated with buffer A.
The coprocessor Ex1 stage reads the cancel queue and acts on the value of the queued
ACPCANCEL signal. If the signal is set, the instruction is removed from the Ex1 stage and is
not passed on to the Ex2 stage.
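A minimal sketch of the Ex1 decision follows. It is illustrative only. The function name ex1_action and the returned strings are hypothetical, and tag checking is omitted for brevity.

```python
from collections import deque


def ex1_action(cancel_queue):
    """Decide what happens to the instruction in Ex1 this cycle.

    cancel_queue holds queued ACPCANCEL values, oldest first.
    Returns 'stall', 'removed', or 'to_ex2'.
    """
    if not cancel_queue:
        return "stall"                 # no token yet, so wait in Ex1
    cancel = cancel_queue.popleft()    # consume one token per instruction
    return "removed" if cancel else "to_ex2"


tokens = deque()
first = ex1_action(tokens)             # queue empty
tokens.append(True)
second = ex1_action(tokens)            # cancel token
tokens.append(False)
third = ex1_action(tokens)             # token without cancel
```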
11.4.5
Finish queue
The finish queue maintains synchronism at the end of the pipeline by providing permission for
CDP instructions in the coprocessor pipeline to retire. The queue, a standard queue, as Token
queue management on page 11-9 describes, is maintained by the coprocessor and is read by the
coprocessor Ex6 stage.
The finish queue provides an interface to the core using the ACPFINISHV signal, that the core
drives.
This signal is asserted to indicate that the instruction in the coprocessor Ex6 stage can retire. It
must be clocked directly into the buffer A flag, unless the queue is full, when it is ignored.
The finish queue is read by the coprocessor Ex6 stage. It can retire a CDP instruction if the finish
queue is not empty.
11.5
Data transfer
Data transfers are managed by the LSU on the core side, and the pipeline itself on the
coprocessor side. Transfers can be a single value or a vector. In the latter case, the coprocessor
effectively converts a multiple transfer into a series of single transfers by iterating the instruction
in the issue stage. This creates an instance of the load/store instruction for each item to be
transferred.
The instruction stays in the coprocessor issue stage while it iterates, creating copies of itself that
move down the pipeline. Figure 11-9 on page 11-16 illustrates this process for a load
instruction.
The first of the iterated instructions, shown in uppercase, is the head and the others, shown in
lowercase, are the tails. In the example shown the vector length is four so there is one head and
three tails. At the first iteration of the instruction, the tail flag is set so that subsequent iterations
send tail instructions down the pipeline. In the example shown in Figure 11-9 on page 11-16,
instruction B has stalled in the Ex1 stage, possibly because the cancel queue is empty,
so instruction C does not iterate during its first cycle in the issue stage and only starts to
iterate after the stall has been removed.
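The head and tail instances produced by iteration can be sketched as follows. This is an illustration only. The function name iterate and the dictionary layout are hypothetical, and treating a non-vectored transfer (length zero) as a single head instance is an assumption.

```python
def iterate(name, length):
    """Instances sent down the pipeline for a transfer of `length` items.

    The head is shown in uppercase and the tails in lowercase, matching
    the convention used in the text.
    """
    count = max(1, length)         # assumed: a non-vectored transfer is one head
    head = {"name": name.upper(), "tail": False}
    tails = [{"name": name.lower(), "tail": True} for _ in range(count - 1)]
    return [head] + tails


instances = iterate("C", 4)
names = [i["name"] for i in instances]
```

For a vector length of four this gives one head and three tails, which all carry the same tag in the real pipeline.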
Figure 11-8 shows the extra paths required for passing data to and from the coprocessor.
[Figure 11-8 Coprocessor data transfer. Coprocessor pipeline stages I and Ex1 to Ex6, with store data passing from the I stage to the LSU Add stage and load data passing from the LSU Wbls stage to the Ex6 stage.]
Two data paths are required:
•
One passes store data from the coprocessor to the core, and this requires a queue, that is
maintained by the core.
•
The other passes load data from the core to the coprocessor and requires no queue, only
two pipeline registers.
Figure 11-9 on page 11-16 shows instruction iteration for loads.
[Figure 11-9 Instruction iteration for loads. Pipeline stages I and Ex1 to Ex6 plotted against time in cycles 1 to 14 for instructions A, B, C, and D, where C is a head followed by three tails c, and bracketed entries mark stalled instructions.]
Only the head instruction is involved in token exchange with the core pipeline, which does not
iterate instructions in this way. The tail instructions pass down the pipeline silently.
When an iterated load/store instruction is cancelled or flushed, all the tail instructions, bearing
the same tag, must be removed from the pipeline. Only the head instruction becomes a phantom
when cancelled. Any tail instruction can be left intact in the pipeline because it has no other
effect.
Because the cancel token is received in the coprocessor Ex1 stage, a cancelled iterated
instruction always consists of a head instruction in Ex1 and a single tail instruction in the issue
stage.
11.5.1
Loads
Load data emerge from the WBls stage of the core LSU and are received by the coprocessor Ex6
stage. Each item in a vectored load is picked up by one instance of the iterated load instruction.
The pipeline timing means that a load instruction is always ready, or arrived a short time ago, in
Ex6 to pick up each data item. If a load instruction has arrived in Ex6, but the load information
has not yet appeared, the load instruction must stall in Ex6, stalling the rest of the coprocessor
pipeline.
The following signals are driven by the core to pass load data across to the coprocessor:
ACPLDVALID
This signal, when set, indicates that the associated data are valid.
ACPLDDATA[63:0]
This is the information passed from the core to the coprocessor.
Load buffers
To achieve correct alignment of the load data with the load instruction in the coprocessor Ex6
stage, the data must be double buffered when they arrive at the coprocessor. Figure 11-10 on
page 11-17 shows an example.
[Figure 11-10 Load data buffering. Data and Valid pass from the core WBls stage, across the interconnect, through two buffer stages to the coprocessor Ex6 stage.]
The load data buffers function as pipeline registers and so require no flow control and are not
required to carry any tags. Only the data and a valid bit are required. For load transfers to work:
•
instructions must always arrive in the coprocessor Ex6 stage coincident with, or before,
the arrival of the corresponding instruction in the core WBls stage
•
finish tokens from the core must arrive at the same time as the corresponding load data
items arrive at the end of the load data pipeline buffers
•
the LSU must see the token from the accept queue before it enables a load instruction to
move on from its Add stage.
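The double buffering can be modelled as two back-to-back pipeline registers that carry only the data and a valid bit. This is a behavioural sketch. The class name is hypothetical, and the exact placement of the two registers relative to the interconnect is an assumption.

```python
class LoadDataBuffers:
    """Two pipeline registers carrying (valid, data), with no flow control."""

    def __init__(self):
        self.stage1 = (False, 0)
        self.stage2 = (False, 0)

    def clock(self, acpldvalid, acplddata):
        """Advance one cycle and return what Ex6 sees after the clock edge."""
        self.stage2 = self.stage1
        self.stage1 = (acpldvalid, acplddata)
        return self.stage2


buffers = LoadDataBuffers()
seen = [
    buffers.clock(True, 0x11),     # core presents the first item
    buffers.clock(True, 0x22),     # core presents the second item
    buffers.clock(False, 0),       # nothing more from the core
    buffers.clock(False, 0),
]
```

Each item appears at the Ex6 end two clock edges after the core presents it, and items that are not picked up simply move out of the registers, which is why no tags are required.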
Loads and flushes
If a flush does not involve the core WBls stage it cannot affect the load data buffers, and the load
transfer completes normally. If a flush is initiated by an instruction in the core WBls stage, this
is not a load instruction because load instructions cannot trigger a flush. Any coprocessor load
instructions behind the flush point find themselves stalled if they get as far as the Ex6 stage, for
the lack of a finish token, so no data transfers can have taken place. Any data in the load data
buffers expires naturally during the flush dead period while the pipeline reloads.
Loads and cancels
If a load instruction is cancelled, both the head and any tails must be removed. Because the
cancellation happens in the coprocessor Ex1 stage, no data transfers can have taken place and
therefore no special measures are required to deal with load data.
Loads and retirement
When a load instruction reaches the bottom of the coprocessor pipeline it must find a data item
at the end of the load data buffer. This applies to both head and tail instructions. Load
instructions do not use the finish queue.
11.5.2
Stores
Store data emerge from the coprocessor issue stage and are received by the core LSU DC1 stage.
Each item of a vectored store is generated because the store instruction iterates in the
coprocessor issue stage. The iterated store instructions then pass down the pipeline but have no
other use, except to act as place markers for flushes and cancels.
The following signals control the transfer of store data across the coprocessor interface:
CPASTDATAV
This signal is asserted when valid data is available from the coprocessor.
CPASTDATAT[3:0]
This is the tag associated with the data being passed to the core.
CPASTDATA[63:0]
This is the information passed from the coprocessor to the core.
ACPSTSTOP
This signal from the core prevents additional transfers from the coprocessor to the
core, and is raised when the store queue, maintained by the core, can no longer
accept any more data. When the signal is deasserted, data transfers can resume.
When ACPSTSTOP is asserted, the data previously placed onto CPASTDATA
must be left there, until new data can be transferred. This enables the core to leave
data on CPASTDATA until there is sufficient space in the store data queue.
Store data queue
Because the store data transfer can be stopped at any time by the LSU, a store data queue is
required. Additionally, because store data vectors can be of arbitrary length, flow control is
required. A queue length of three slots is sufficient to enable flow control to be used without loss
of data.
Stores and flushes
When a store instruction is involved in a flush, the store data queue must be flushed by the core.
Because the queue continues to fill for two cycles after the core notifies the coprocessor of the
flush, because of the signal propagation delay, the core must delay for two cycles before
carrying out the store data queue flush. The dead period after the flush extends sufficiently far
to enable this to be done.
Stores and cancels
If the core cancels a store instruction, the coprocessor must ensure that it sends no store data for
that instruction. It can achieve this by either:
•
delaying the start of the store data until the corresponding cancel token has been received
in the Ex1 stage
•
looking ahead into the cancel queue and starting the store data transfer when the correct token
is seen.
Stores and retirement
Because store instructions do not use the finish token queue they are retired as soon as they leave
the Ex1 stage of the pipeline.
11.6
Operations
This section describes the various operations that can be performed and events that can take
place.
11.6.1
Normal operation
In normal operation the core passes all instructions across to the coprocessor, and then
increments the tag if the instruction was a coprocessor instruction. The coprocessor decodes the
instruction and throws it away if it is not a coprocessor instruction or if it contains the wrong
coprocessor number.
Each coprocessor instruction then passes down the pipeline, sending a token down the length
queue as it moves into the issue stage. The instruction then moves into the Ex1 stage, sending a
token down the accept queue, and remains there until it has received a token from the cancel
queue.
If the cancel token does not request that the instruction is cancelled, and the instruction is not a
store instruction, it moves on to the Ex2 stage. The instruction then moves down the pipeline until it
reaches the Ex6 stage. At this point, it waits to receive a token from the finish queue, that enables
it to retire, unless it is either:
•
a store instruction, where it requires no token from the finish queue
•
a load instruction, where it must wait until load data are available.
Store instructions are removed from the pipeline as soon as they leave the Ex1 stage.
11.6.2
Cancel operations
When the coprocessor instruction reaches the Ex1 stage it looks for a token in the cancel queue.
If the token indicates that the instruction is to be cancelled, it is removed from the pipeline and
does not pass to Ex2. Any tail instruction in the I stage is also removed.
11.6.3
Bounce operations
The coprocessor can reject an instruction by bouncing it when it reaches the issue stage. This
can happen to an instruction that has been accepted as a valid coprocessor instruction by the
decoder, but that is found to be unexecutable by the issue stage, perhaps because it refers to a
non-existent register or operation.
When the bounced instruction leaves the issue stage to move into Ex1, the token sent down the
accept queue has its bounce bit set. This causes the instruction to be removed from the core
pipeline.
When the instruction moves into Ex1 it has its dead bit set, turning it into a phantom. This
enables the instruction to remain in the pipeline to match tokens in the cancel queue.
The core posts a token for the bounced instruction before the coprocessor can bounce it, so the
phantom is required to pick up the token for the bounced instruction. The instruction is
otherwise inert, and has no other effect. The core might already have decided to cancel the
instruction being bounced. In this case, the cancel token causes the phantom to be removed from
the pipeline. If the core does not cancel the phantom it continues to the bottom of the pipeline.
11.6.4
Flush operations
A flush can be triggered by the core in any stage from issue to WBls inclusive. When this
happens a broadcast signal is received by the coprocessor, passing it the tag associated with the
instruction triggering the flush.
Because the tag is changed by the core after each new coprocessor instruction, the tag matches
the first coprocessor instruction following the instruction causing the flush. The coprocessor
must then find the first instruction that has a matching tag, working from the bottom of the
pipeline upwards, and remove all instructions from that point upwards.
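The search from the bottom of the pipeline upwards can be sketched as follows. This is an illustrative model. The function name flush_pipeline and the list representation of the pipeline are hypothetical.

```python
def flush_pipeline(stages, flush_tag):
    """stages: tags ordered from the bottom of the pipeline (Ex6) up to the
    issue stage, with None for an empty stage.

    Finds the first matching tag, working from the bottom upwards, and
    removes that instruction and every instruction above it.
    """
    for i, tag in enumerate(stages):
        if tag == flush_tag:
            return stages[:i] + [None] * (len(stages) - i)
    return list(stages)            # no match in the pipeline stages


#          Ex6 Ex5 Ex4   Ex3 Ex2 Ex1 I
pipeline = [2, 3, None, 4, 4, 4, 5]
flushed = flush_pipeline(pipeline, flush_tag=4)
untouched = flush_pipeline(pipeline, flush_tag=9)
```

In the example the three instructions tagged 4 stand for a head and its tails, which bear the same tag and are removed together with the newer instruction tagged 5.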
Unlike tokens passing down a queue, a flush signal has a fixed delay so that the timing
relationship between a flush in the core and a flush in the coprocessor is known precisely. Most
of the token queues also require flushing and this can also be done using the tags attached to
each instruction. If a match has been found before the stage at the receiving end of a token queue
is passed, then the token queue is cleared.
Otherwise, it must be properly flushed by matching the tags in the queue. This operation must
be performed on all the queues except the finish queue, that is updated in the normal way.
Therefore, the coprocessor must flush the instruction and cancel queues. The flushing operation
can be carried out by the coprocessor as soon as the flush signal is received. The flushing
operation is simplified because the instruction and cancel queues cannot be performing any
other operation. This means that flushing is not required to be combined with queue updates for
these queues.
There is a single cycle following a flush where nothing happens that affects the flushed queues,
and this provides a good opportunity to carry out the queue flushing operation.
The following signals provide the flush broadcast signal from the core:
ACPFLUSH
This signal is asserted when a flush is to be performed.
ACPFLUSHT[3:0]
This is the tag associated with the first instruction to be flushed.
11.6.5
Retirement operations
When an instruction reaches the bottom of the coprocessor pipeline it is retired. How it retires
depends on the kind of instruction it is and if it is iterated, as Table 11-5 lists.
Table 11-5 Retirement conditions
Instruction  Type   Retirement conditions
CDP          -      Must find a token in the finish queue.
MRC          Store  No conditions. Immediate retirement on leaving Ex1.
MCR          Load   All load instructions must find data in the load data pipeline from the core.
MRRC         Store  No conditions. Immediate retirement on leaving Ex1.
MCRR         Load   All load instructions must find data in the load data pipeline from the core.
STC          Store  No conditions. Immediate retirement on leaving Ex1.
LDC          Load   Must find data in the load data pipeline from the core.
Table 11-5 lists the conditions for each coprocessor instruction:
•
all store instructions retire unconditionally on leaving Ex1 because no token is required in
the finish queue
•
CDP instructions require a token in the finish queue
•
all load instructions must pick up data from the load pipeline
•
phantom load instructions retire unconditionally.
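The retirement conditions can be expressed as a small predicate. This is a sketch, not the implementation. The function name and its parameters are hypothetical, and the phantom case is modelled for load instructions only, because that is the only phantom case stated here.

```python
STORE_KINDS = {"MRC", "MRRC", "STC"}     # coprocessor to core
LOAD_KINDS = {"MCR", "MCRR", "LDC"}      # core to coprocessor


def can_retire(kind, finish_tokens=0, load_data_valid=False, phantom=False):
    """Return True if an instruction of the given kind can retire."""
    if kind in STORE_KINDS:
        return True                      # retires on leaving Ex1
    if kind in LOAD_KINDS:
        return phantom or load_data_valid
    if kind == "CDP":
        return finish_tokens > 0         # needs a token in the finish queue
    raise ValueError("not a coprocessor instruction kind: " + kind)


cdp_waiting = can_retire("CDP", finish_tokens=0)
cdp_ready = can_retire("CDP", finish_tokens=1)
load_waiting = can_retire("LDC", load_data_valid=False)
phantom_load = can_retire("LDC", phantom=True)
store = can_retire("STC")
```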
11.7
Multiple coprocessors
There might be more than one coprocessor attached to the core, and so some means is required
for dealing with multiple coprocessors. It is important, for reasons of economy, to ensure that as
little of the coprocessor interface is duplicated. In particular, the coprocessors must share the
length, accept, and store data queues, that the core maintains.
If these queues are to be shared, only one coprocessor can use the queues at any time. This is
achieved by enabling only one coprocessor to be active at any time. This is not a serious
limitation because, in practice, only one coprocessor is in use at any time.
Typically, a coprocessor is driven through driver software, that drives only one coprocessor. Calls
to the driver software, and returns from it, ensure that there are several core instructions between
the use of one coprocessor and the use of a different coprocessor.
11.7.1
Interconnect considerations
If only one coprocessor is permitted to communicate with the core at any time, all coprocessors
can share the coprocessor interface signals from the core. Signals from the coprocessors to the
core can be ORed together, provided that every coprocessor holds its outputs to zero when it is
inactive.
11.7.2
Coprocessor selection
Coprocessors are enabled by a signal ACPENABLE from the core. There are 12 of these
signals, one for each coprocessor. Only one can be active at any time. In addition, instructions
to the coprocessor include the coprocessor number, enabling coprocessors to reject instructions
that do not match their own number. Core instructions are also rejected.
11.7.3
Coprocessor switching
When the core decodes a coprocessor instruction destined for a different coprocessor to that last
addressed, it stalls this instruction until the previous coprocessor instruction has been retired.
This ensures that all activity in the currently selected coprocessor has ceased.
The coprocessor selection is switched, disabling the last active coprocessor and activating the
new coprocessor. The coprocessor that received the new coprocessor instruction must have
ignored it, being disabled. Therefore, the instruction is resent by the core, and is now accepted
by the newly activated coprocessor.
A coprocessor is disabled by the core by setting ACPENABLE LOW for the selected
coprocessor. The coprocessor responds by ceasing all activity and setting all its output signals
LOW.
When the coprocessor is enabled, signaled by setting ACPENABLE HIGH, it must
immediately set the signals CPALENGTHHOLD and CPAACCEPTHOLD HIGH, and
CPASTDATAV LOW, because the pipeline is empty at this point. The coprocessor can then
start normal operation.
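The enable behaviour and the ORing of coprocessor outputs can be sketched together. This is an illustrative model that covers only the three signals named above. The function names idle_outputs and combine are hypothetical.

```python
def idle_outputs(acpenable):
    """Outputs of one coprocessor whose pipeline is empty."""
    if not acpenable:
        # Disabled: cease all activity and hold every output LOW.
        return {"CPALENGTHHOLD": 0, "CPAACCEPTHOLD": 0, "CPASTDATAV": 0}
    # Just enabled: no valid length, accept, or store data yet.
    return {"CPALENGTHHOLD": 1, "CPAACCEPTHOLD": 1, "CPASTDATAV": 0}


def combine(outputs):
    """OR together same-named signals from every coprocessor."""
    merged = {}
    for out in outputs:
        for name, value in out.items():
            merged[name] = merged.get(name, 0) | value
    return merged


# Three coprocessors share the interface and only the second is enabled.
to_core = combine([idle_outputs(False), idle_outputs(True), idle_outputs(False)])
```

Because every disabled coprocessor drives zero, the core sees exactly the outputs of the enabled coprocessor.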
Chapter 12
Vectored Interrupt Controller Port
This chapter describes the vectored interrupt controller port of the processor. It contains the
following sections:
•
About the PL192 Vectored Interrupt Controller
•
About the processor VIC port
•
Timing of the VIC port
•
Interrupt entry flowchart.
12.1
About the PL192 Vectored Interrupt Controller
An interrupt controller is a peripheral that is used to handle multiple interrupt sources. Features
usually found in an interrupt controller are:
•
multiple interrupt request inputs, one for each interrupt source, and one interrupt request
output for the processor interrupt request input
•
software can mask out particular interrupt requests
•
prioritization of interrupt sources for interrupt nesting.
In a system with an interrupt controller having the above features, software is still required to:
•
determine the interrupt source that is requesting service
•
determine where the service routine for that interrupt source is loaded.
A Vectored Interrupt Controller (VIC) does both things in hardware. It supplies the starting
address, vector address, of the service routine corresponding to the highest priority requesting
interrupt source. The PL192 VIC is an Advanced Microcontroller Bus Architecture (AMBA)
Advanced High-performance Bus (AHB) compliant, System-on-Chip (SoC) peripheral that is
developed, tested, and licensed by ARM Limited.
The processor VIC port and the Peripheral Interface enable you to connect a PL192 VIC to the
processor. See ARM PrimeCell Vectored Interrupt Controller (PL192) Technical Reference
Manual for more details.
12.2
About the processor VIC port
Figure 12-1 shows the VIC port and the Peripheral Interface connecting a PL192 VIC and the
processor.
[Figure 12-1 Connection of a VIC to the processor. Processor signals: INTSYNCEN, IRQADDRVSYNCEN, nFIQ, nIRQ, IRQACK, IRQADDRV, IRQADDR[31:2]. VIC signals: nVICSYNCEN, nVICFIQ, nVICIRQ, VICIRQACK, VICIRQADDRV, VICVECTADDROUT[31:2], VICINTSOURCE[(N-1):0], nVICFIQIN, nVICIRQIN, VICVECTADDRIN[31:0].]
Note
Do not be confused by the naming of the IRQADDRVSYNCEN and nVICSYNCEN signals.
Although one is active HIGH and the other is active LOW they are connected to a common
external synchronization disable signal. See the signal descriptions in Table 12-1 for more
information.
The VIC port enables the processor to read the vector address as part of the IRQ interrupt entry.
That is, the processor takes a vector address from this interface instead of using the legacy
0x00000018 or 0xFFFF0018. The VIC port does not support the reading of FIQ vector addresses.
The interrupt interface is designed to handle interrupts asserted by a controller that is clocked
either synchronously or asynchronously to the processor clock. This capability ensures that the
controller can be used in systems that have either a synchronous or asynchronous interface
between the core clock and the AXI clock.
The VIC port consists of the signals that Table 12-1 lists.
Table 12-1 VIC port signals
Signal name      Direction  Description
nFIQ             Input      Active LOW fast interrupt request signal
nIRQ             Input      Active LOW normal interrupt request signal
INTSYNCEN        Input      If this signal is asserted HIGH, the internal nFIQ and nIRQ synchronizers are bypassed and the interface is synchronous
IRQADDRVSYNCEN   Input      If this signal is asserted HIGH, the internal IRQADDRV synchronizer is bypassed and the interface is synchronous
IRQACK           Output     Active HIGH IRQ acknowledge
IRQADDRV         Input      Active HIGH valid signal for the IRQ interrupt vector address below
IRQADDR[31:2]    Input      IRQ interrupt vector address. IRQADDR[31:2] holds the address of the first ARM state instruction in the IRQ handler
IRQACK is driven by the processor to indicate to an external VIC that the processor wants to
read the IRQADDR input.
IRQADDRV is driven by a VIC to tell the processor that the address on the IRQADDR bus is
valid and being held, and so it is safe for the processor to sample it.
IRQACK and IRQADDRV together implement a four-phase handshake between the processor
and a VIC. See Timing of the VIC port on page 12-5 for more details.
12.2.1
Synchronization of the VIC port signals
The AHB system bus clock signal HCLK can run at any frequency, synchronously or
asynchronously to the processor clock signal, CLKIN. The processor VIC port can cope with
any clocking mode.
nFIQ and nIRQ can be connected to either synchronous or asynchronous sources.
Synchronizers are provided internally for the case of asynchronous sources. The Synchronous
Interrupt Enable port, INTSYNCEN, is also provided to enable SoC designers to bypass the
synchronizers if required. Similarly, a synchronizer is provided inside the processor for the
IRQADDRV signal. If this signal is known to be synchronous, the synchronizer can be
bypassed by pulling IRQADDRVSYNCEN HIGH.
These signals enable SoC designers to reduce interrupt latency if it is known that the nFIQ,
nIRQ, or IRQADDRV input is always driven by a synchronous source. When connecting the
PL192 VIC to the processor, INTSYNCEN must be tied LOW regardless of the clocking mode.
This is because the PL192 nVICIRQ and nVICFIQ outputs are completely asynchronous,
because there are combinational paths that cross this device through to these outputs. However,
IRQADDRVSYNCEN must be set depending on the clocking mode.
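The tie-off rule for a PL192 VIC can be summarized in a short helper. This is only a restatement of the text in code form. The function name and its parameter are hypothetical.

```python
def pl192_sync_tieoffs(vic_clock_synchronous):
    """Tie-off values (0 = LOW, 1 = HIGH) when a PL192 VIC is connected.

    INTSYNCEN stays LOW in every clocking mode because the PL192 interrupt
    outputs are asynchronous. IRQADDRVSYNCEN is HIGH only when IRQADDRV is
    known to be synchronous to the processor clock.
    """
    return {
        "INTSYNCEN": 0,
        "IRQADDRVSYNCEN": 1 if vic_clock_synchronous else 0,
    }


synchronous = pl192_sync_tieoffs(True)
asynchronous = pl192_sync_tieoffs(False)
```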
12.2.2
Interrupt handler exit
The software acknowledges an IRQ interrupt handler exit to a VIC by issuing a write to the
vector address register.
12.3
Timing of the VIC port
Figure 12-2 shows a timing example of VIC port operation. In this example IRQC is received
followed by IRQB having a higher priority. The waveforms in Figure 12-2 show an
asynchronous relationship between CLKIN and HCLK, and the delays marked Sync cater for
the delay of the synchronizers. When this interface is used synchronously, these delays are
reduced to being a single cycle of the receiving clock.
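[Figure 12-2 VIC port timing example. Waveforms shown against processor clock edges B1 to B12 and the peripheral port HCLK: interrupt requests IRQC and IRQB, nIRQ, IRQACK, IRQADDRV, and IRQADDR[31:2] carrying the IRQC vector address and then the IRQB vector address, with the point where the address is sampled. Sync marks the synchronizer delays.]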
Figure 12-2 illustrates the basic handshake mechanism that operates between the processor and
a PL192 VIC:
1.
An IRQC interrupt request occurs, causing the PL192 VIC to assert the processor nIRQ
input by driving it LOW.
2.
The processor samples the nIRQ input LOW and initiates an interrupt entry sequence.
3.
Another IRQB interrupt request of higher priority than IRQC occurs.
4.
Between B3 and B4, the processor decides that the pending interrupt is an IRQ rather than
a FIQ and asserts the IRQACK signal.
5.
At B4 the VIC samples IRQACK HIGH and starts generating IRQADDRV. The VIC
can still change IRQADDR to the IRQB vector address while IRQADDRV is LOW.
6.
At B6 the VIC asserts IRQADDRV while IRQADDR is set to the IRQB vector address.
IRQADDR is held until the processor acknowledges it has sampled it, even if a higher
priority interrupt is received while the VIC is waiting.
7.
Around B8 the processor samples the value of the IRQADDR input bus and deasserts
IRQACK.
8.
When the VIC samples IRQACK LOW, it stacks the priority of the IRQB interrupt and
deasserts IRQADDRV. It also deasserts nIRQ if there are no higher priority interrupts
pending.
9.
When the processor samples IRQADDRV LOW, it knows it can sample the nIRQ input
again. Therefore, if the VIC requires some time for deasserting nIRQ, it must ensure that
IRQADDRV stays HIGH until nIRQ has been deasserted.
The clearing of the interrupt is handled in software by the interrupt handling routine. This
enables multiple interrupt sources to share a single interrupt priority. In addition, the interrupt
handling routine must communicate to the VIC that the interrupt currently being handled is
complete, using the memory-mapped or coprocessor-mapped interface, to enable the interrupt
masking to be unwound.
12.3.1
PL192 VIC timing
As its part of the handshake mechanism, the PL192 VIC:
1. Synchronizes IRQACK on its way in if the peripheral port clocking mode is
   asynchronous or bypasses the synchronizers if it is in synchronous mode.
2. Asserts IRQADDRV when an address is ready at IRQADDR, and holds that address
   until IRQACK is sampled LOW, even if higher priority interrupts come along.
3. Stacks the priority that corresponds to the vector address present at IRQADDR when it
   samples the IRQACK signal LOW, while IRQADDRV is HIGH.
4. Clears IRQADDRV so the processor can recognize another interrupt. If nIRQ is also to
   be deasserted at this point because there are no higher priority interrupts pending, it is
   deasserted before or at the same time as IRQADDRV to ensure that the processor does
   not take the same interrupt again.
12.3.2
Core timing
As its part of the handshake mechanism, the core:
1. Starts an interrupt entry sequence when it samples the nIRQ signal asserted.
2. Determines if an FIQ or an IRQ is going to be taken. This happens after the interrupt entry
   sequence is started. If it decides that an IRQ is going to be taken, it starts the VIC port
   handshake by asserting IRQACK. If it decides that the interrupt is an FIQ, then it does
   not assert IRQACK and the VIC port handshake is not initiated.
3. Ignores the value of the nFIQ input until the IRQ interrupt entry sequence is completed
   if it has decided that the interrupt is an IRQ.
4. Samples the IRQADDR input bus when both IRQACK and IRQADDRV are sampled
   asserted. The interrupt entry sequence proceeds with this value of IRQADDR.
5. Ignores the nIRQ signal while IRQADDRV is HIGH. This gives the VIC time to deassert
   the nIRQ signal if there is no higher priority interrupt pending.
6. Ignores the nFIQ signal while IRQADDRV is HIGH.
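The two halves of the handshake can be sketched as a small software model. This is an illustrative simplification, not a description of the hardware: it assumes a single clock with no synchronizer delays, evaluates the VIC before the core in each step, and models only two sources. All type names, function names, and vector addresses (0x2000 for IRQB, 0x3000 for IRQC) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { SRC_IRQB = 0, SRC_IRQC = 1, NUM_SRC = 2 }; /* lower index = higher priority */

typedef struct {
    /* Port signals */
    bool nirq;             /* active LOW: false means asserted */
    bool irqack;           /* driven by the core */
    bool irqaddrv;         /* driven by the VIC */
    uint32_t irqaddr;      /* driven by the VIC */
    /* VIC state */
    bool pending[NUM_SRC];
    uint32_t vector[NUM_SRC];
    int held_src;          /* source whose address is on IRQADDR */
    int in_service;        /* stacked priority, -1 if none */
    /* Core state */
    bool in_entry;         /* interrupt entry sequence in progress */
    bool irq_masked;       /* models IRQs being disabled after entry */
    uint32_t sampled_addr;
} vic_port_model_t;

/* Highest-priority pending source that the stacked priority does not mask. */
static int highest_pending(const vic_port_model_t *m)
{
    for (int i = 0; i < NUM_SRC; i++) {
        if (m->pending[i] && (m->in_service < 0 || i < m->in_service)) {
            return i;
        }
    }
    return -1;
}

static void vic_step(vic_port_model_t *m)
{
    if (m->irqaddrv) {
        if (!m->irqack) {
            /* IRQACK sampled LOW: stack the priority, then release the port. */
            m->in_service = m->held_src;
            m->nirq = (highest_pending(m) < 0);
            m->irqaddrv = false;
        }
        return; /* otherwise hold IRQADDR, even if a higher priority arrives */
    }
    int src = highest_pending(m);
    m->nirq = (src < 0);
    if (src >= 0) {
        m->held_src = src;
        m->irqaddr = m->vector[src]; /* may still change while IRQADDRV is LOW */
        if (m->irqack) {
            m->irqaddrv = true;
        }
    }
}

static void core_step(vic_port_model_t *m)
{
    if (m->in_entry) {
        if (m->irqack && m->irqaddrv) {
            m->sampled_addr = m->irqaddr;
            m->irqack = false;
            m->in_entry = false;
            m->irq_masked = true;
        }
    } else if (!m->irqaddrv && !m->nirq && !m->irq_masked) {
        m->in_entry = true; /* the model assumes an IRQ, not an FIQ */
        m->irqack = true;
    }
}

/* Replays the Figure 12-2 scenario and returns the vector the core sampled. */
static uint32_t run_vic_handshake_example(void)
{
    vic_port_model_t m = { .nirq = true, .held_src = -1, .in_service = -1,
                           .vector = { [SRC_IRQB] = 0x2000u, [SRC_IRQC] = 0x3000u } };
    m.pending[SRC_IRQC] = true;   /* IRQC arrives */
    vic_step(&m); core_step(&m);  /* nIRQ goes LOW, core asserts IRQACK */
    m.pending[SRC_IRQB] = true;   /* higher priority IRQB arrives */
    vic_step(&m); core_step(&m);  /* VIC asserts IRQADDRV, core samples IRQADDR */
    vic_step(&m); core_step(&m);  /* VIC stacks priority, releases IRQADDRV and nIRQ */
    assert(!m.irqaddrv && !m.irqack && m.nirq);
    return m.sampled_addr;
}
```

In this model, run_vic_handshake_example() returns the IRQB vector, because IRQB arrives while IRQADDRV is still LOW and so replaces the IRQC address before the core samples it.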
12.4
Interrupt entry flowchart
Figure 12-3 shows all the decisions and actions required to complete interrupt entry. For more
information on interrupt entry, see Exception vectors on page 2-48.
[Figure 12-3: flowchart of interrupt entry. The decision boxes test IRQADDRV together with VE,
the nFIQ and nIRQ inputs against the F and I masks, the FIQ and IRQ bits in the SCR, the VE and
V bits, and whether the processor is in Secure state. When VE is 1 on the IRQ path, the
processor takes IRQACK HIGH and waits for IRQADDRV. The flowchart ends in four action paths:
FIQ taken in FIQ mode: SPSR_fiq = CPSR, LR_fiq = RA+4, CPSR[4:0] = FIQ mode, CPSR[5] = ARM
state, CPSR[7] = FIQs and IRQs disabled, PC[31:0] = 0xFFFF001C, SBA + 0x1C, or NSBA + 0x1C.
FIQ taken in Secure Monitor mode: SPSR_mon = CPSR, LR_mon = RA+4, CPSR[4:0] = MON mode,
CPSR[5] = ARM state, CPSR[7] = FIQs and IRQs disabled, PC[31:0] = MBA + 0x1C.
IRQ taken in Secure Monitor mode: SPSR_mon = CPSR, LR_mon = RA+4, CPSR[4:0] = MON mode,
CPSR[5] = ARM state, CPSR[7] = IRQs disabled, PC[31:0] = MBA + 0x18.
IRQ taken in IRQ mode: SPSR_irq = CPSR, LR_irq = RA+4, CPSR[4:0] = IRQ mode, CPSR[5] = ARM
state, CPSR[7] = IRQs disabled, PC[31:0] = {IRQADDR[31:2], 0b00}, 0xFFFF0018, SBA + 0x18, or
NSBA + 0x18.]
Figure 12-3 Interrupt entry sequence
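The vector selection at the bottom of the interrupt entry sequence can be sketched as a pure function. This is one reading of Figure 12-3 rather than a definitive statement of the hardware behavior. In particular, it assumes that an FIQ is considered before an IRQ, and that the SCR routing check is made before the VE check. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool fiq_pending;   /* nFIQ asserted and not masked by F */
    bool irq_pending;   /* nIRQ asserted and not masked by I */
    bool scr_fiq;       /* FIQ bit in the SCR */
    bool scr_irq;       /* IRQ bit in the SCR */
    bool ve;            /* VE bit: use the VIC port for IRQ vectors */
    bool v;             /* V bit: high vectors */
    bool secure;        /* processor is in Secure state */
    uint32_t sba;       /* Secure vector base address */
    uint32_t nsba;      /* Non-secure vector base address */
    uint32_t mba;       /* Monitor vector base address */
    uint32_t irqaddr;   /* value sampled from the IRQADDR bus */
} irq_entry_inputs_t;

/* Returns the PC value loaded at the end of interrupt entry. */
static uint32_t interrupt_entry_pc(const irq_entry_inputs_t *in)
{
    assert(in->fiq_pending || in->irq_pending);

    bool is_fiq = in->fiq_pending;              /* assumption: FIQ wins */
    uint32_t offset = is_fiq ? 0x1Cu : 0x18u;

    if (is_fiq ? in->scr_fiq : in->scr_irq) {
        return in->mba + offset;                /* taken in Secure Monitor mode */
    }
    if (!is_fiq && in->ve) {
        return in->irqaddr & ~UINT32_C(3);      /* {IRQADDR[31:2], 0b00} */
    }
    if (in->v) {
        return UINT32_C(0xFFFF0000) + offset;   /* high vectors */
    }
    return (in->secure ? in->sba : in->nsba) + offset;
}
```

For example, with only an IRQ pending, VE set, and 0x12345679 on the address bus, the function returns 0x12345678 because the two low-order bits are forced to zero.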
Chapter 13
Debug
This chapter describes the processor debug unit, which assists development of application
software, operating systems, and hardware. It contains the following sections:
• Debug systems
• About the debug unit
• Debug registers
• CP14 registers reset
• CP14 debug instructions
• External debug interface
• Changing the debug enable signals
• Debug events
• Debug exception
• Debug state
• Debug communications channel
• Debugging in a cached system
• Debugging in a system with TLBs
• Monitor debug-mode debugging
• Halting debug-mode debugging
• External signals.
13.1
Debug systems
The processor forms one component of a debug system that interfaces from the high-level
debugging performed by you, to the low-level interface supported by the processor. Figure 13-1
shows a typical system.
Debug
host
Host computer running RealView™ Debugger
Protocol
converter
for example, RealView™ ICE
Debug
target
Development system containing ARM1176JZ-S
Figure 13-1 Typical debug system
This typical system has three parts:
• The debug host
• The protocol converter
• The processor.
13.1.1
The debug host
The debug host is a computer, for example a personal computer, running a software debugger
such as RealView Debugger. The debug host enables you to issue high-level commands such as
set breakpoint at location XX, or examine the contents of memory from 0x0-0x100.
13.1.2
The protocol converter
The debug host is connected to the processor development system using an interface, for
example an RS232. The messages broadcast over this connection must be converted to the
interface signals of the processor. This function is performed by a protocol converter, for
example, RealView ICE.
13.1.3
The processor
The processor, with debug unit, is the lowest level of the system. The debug extensions enable
you to:
• stall program execution
• examine its internal state and the state of the memory system
• resume program execution.
The debug host and the protocol converter are system-dependent.
13.2
About the debug unit
The processor debug unit assists in debugging software running on the processor. You can use
the processor debug unit, in combination with a software debugger program, to debug:
• application software
• operating systems
• ARM processor based hardware systems.
The debug unit enables you to:
• stop program execution
• examine and alter processor and coprocessor state
• examine and alter memory and input/output peripheral state
• restart the processor core.
You can debug the processor in the following ways:
• Halting debug-mode debugging
• Monitor debug-mode debugging
• Trace debugging. See Chapter 15 Trace Interface Port for interfacing with an ETM.
The processor debug interface is based on the IEEE Standard Test Access Port and
Boundary-Scan Architecture.
13.2.1
Halting debug-mode debugging
When the processor debug unit is in Halting debug-mode, the processor halts and enters Debug
state when a debug event, such as a breakpoint, occurs. When the processor is in Debug state,
an external host can examine and modify its state using the DBGTAP.
In Debug state you can examine and alter processor state, processor registers, coprocessor state,
memory, and input/output locations through the DBGTAP. This mode is intentionally invasive
to program execution. Halting debug-mode debugging requires:
• external hardware to control the DBGTAP
• a software debugger to provide the user interface to the debug hardware.
See CP14 c1, Debug Status and Control Register (DSCR) on page 13-7 to learn how to set the
processor debug unit into Halting debug-mode.
13.2.2
Monitor debug-mode debugging
When the processor debug unit is in Monitor debug-mode, the processor takes a Debug
exception instead of halting. A special piece of software, a debug monitor target, can then take
control to examine or alter the processor state. Monitor debug-mode is essential in real-time
systems where the core cannot be halted to collect information. Examples are engine controllers
and servo mechanisms in hard drive controllers, where the code cannot be stopped without
physically damaging the components.
When debugging in Monitor debug-mode the processor stops execution of the current program
and starts execution of a debug monitor target. The state of the processor is preserved in the same
manner as all ARM exceptions. See the ARM Architecture Reference Manual for details of
exceptions and exception priorities. The debug monitor target communicates with the debugger to access
processor and coprocessor state, and to access memory contents and input/output peripherals.
Monitor debug-mode requires a debug monitor program to interface between the debug
hardware and the software debugger.
When debugging in Monitor debug-mode, you can program new debug events through CP14.
This coprocessor is the software interface of all the debug resources such as the breakpoint and
watchpoint registers. See CP14 c1, Debug Status and Control Register (DSCR) on page 13-7 to
learn how to set the processor debug unit into Monitor debug-mode.
Note
Monitor debug-mode, used for debugging, is not the same as Secure Monitor mode.
13.2.3
Secure Monitor mode and debug
Debug can be restricted to one of three levels: Non-secure only, Non-secure and Secure User
only, or any Secure or Non-secure level. This lets you prevent access to Secure parts of the
system while still permitting Non-secure, and optionally Secure User, parts to be debugged. The
level is controlled by the SPIDEN and SPNIDEN signals and by the two bits SUIDEN and
SUNIDEN in the Secure Debug Enable Register in the system control coprocessor. See External
debug interface on page 13-28 and c1, Secure Debug Enable Register on page 3-54.
Invasive debug
Invasive debug is debug where the system can be both observed and controlled,
like all of the debug features in this section that enable you to halt the processor
and examine and modify registers and memory.
SPIDEN and SUIDEN control invasive debug permissions.
Non-invasive debug
Non-invasive debug is debug where the system can be observed but not affected.
The ETM interface, the System Performance Monitor, and the DBGTAP program
counter sample register provide non-invasive debug.
SPNIDEN and SUNIDEN control non-invasive debug permissions.
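The three permission levels can be sketched as a small decision function. This is a simplified illustration under stated assumptions: the privileged enable (SPIDEN or SPNIDEN) is assumed to permit debug everywhere in the Secure world, the user enable (SUIDEN or SUNIDEN) is assumed to add Secure User only, and any other enable inputs or interaction between the invasive and non-invasive pairs are ignored. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CTX_NON_SECURE,
    CTX_SECURE_USER,
    CTX_SECURE_PRIVILEGED
} debug_context_t;

/*
 * priv_enable models SPIDEN (invasive) or SPNIDEN (non-invasive).
 * user_enable models SUIDEN (invasive) or SUNIDEN (non-invasive).
 */
static bool secure_debug_permitted(debug_context_t ctx,
                                   bool priv_enable, bool user_enable)
{
    switch (ctx) {
    case CTX_NON_SECURE:
        return true;                       /* not restricted by these controls */
    case CTX_SECURE_USER:
        return priv_enable || user_enable;
    case CTX_SECURE_PRIVILEGED:
        return priv_enable;
    }
    assert(!"invalid debug context");
    return false;
}
```

With both enables clear, only the Non-secure context is permitted, which corresponds to the Non-secure only level.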
13.2.4
Virtual addresses and debug
Unless otherwise stated, all addresses in this chapter are Modified Virtual Addresses (MVA) as
the ARM Architecture Reference Manual describes. For example, the Breakpoint Value
Registers (BVR) and Watchpoint Value Registers (WVR) must be programmed with MVAs.
The terms Instruction Modified Virtual Address (IMVA) and Data Modified Virtual Address
(DMVA), where used, mean the MVA corresponding to an instruction address and the MVA
corresponding to a data address respectively.
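As a brief illustration of how an MVA is formed, the following sketch assumes the Fast Context Switch Extension translation that the ARM Architecture Reference Manual defines: a virtual address whose bits [31:25] are zero is relocated by the 7-bit process ID, and any other address is used unchanged. The function name is hypothetical, and the process ID is passed as a plain 7-bit value rather than as a register image.

```c
#include <assert.h>
#include <stdint.h>

/* Forms the Modified Virtual Address for a virtual address and a 7-bit PID. */
static uint32_t fcse_mva(uint32_t va, uint32_t fcse_pid)
{
    assert(fcse_pid < 128u);              /* the process ID is 7 bits wide */
    if ((va >> 25) == 0u) {
        return va | (fcse_pid << 25);     /* relocate the low 32MB window */
    }
    return va;                            /* other addresses are unchanged */
}
```

A debugger that wants a breakpoint at virtual address 0x00008000 in a process with ID 1 would therefore program the BVR with 0x02008000.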
13.2.5
Programming the debug unit
The processor debug unit is programmed using CoProcessor 14 (CP14). CP14 provides:
• instruction address comparators for triggering breakpoints
• data address comparators for triggering watchpoints
• a bidirectional Debug Communication Channel (DCC)
• all other state information associated with processor debug.
CP14 is accessed using coprocessor instructions in Monitor debug-mode, and using certain debug
scan chains in Debug state. See Chapter 14 Debug Test Access Port to learn how to access the
processor debug unit using scan chains.
13.3
Debug registers
Table 13-1 lists definitions of terms used in register descriptions.
Table 13-1 Terms used in register descriptions
Term                    Description
R                       Read-only. Written values are ignored. However, it is written as 0 or preserved
                        by writing the same value previously read from the same fields on the same
                        processor.
W                       Write-only. This bit cannot be read. Reads return an Unpredictable value.
RW                      Read or write.
C                       Cleared on read. This bit is cleared whenever the register is read.
UNP/SBZP                Unpredictable or Should Be Zero or Preserved (SBZP). A read to this bit returns
                        an Unpredictable value. It is written as 0 or preserved by writing the same
                        value previously read from the same fields on the same processor. These bits
                        are usually reserved for future expansion.
Core view               This column defines the core access permission for a given bit.
External view           This column defines the DBGTAP debugger view of a given bit.
Read/write attributes   This is used when the core and the DBGTAP debugger view are the same.
On a power-on reset, all the CP14 debug registers take the values indicated by the Reset value
column in the register bit field definition tables:
• Table 13-4 on page 13-8
• Table 13-6 on page 13-14
• Table 13-11 on page 13-18
• Table 13-14 on page 13-21
• Table 13-16 on page 13-22.
In these tables, - means an Undefined reset value.
13.3.1
Accessing debug registers
To access the CP14 debug registers you must set Opcode_1 and CRn to 0. The Opcode_2 and
CRm fields of the coprocessor instructions are used to encode the CP14 debug register number,
where the register number is {<Opcode_2>, <CRm>}, that is, Opcode_2 supplies the upper three
bits and CRm supplies the lower four bits.
Table 13-2 lists the CP14 debug register map. All of these registers are also accessible as scan
chains from the DBGTAP.
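The register number encoding can be sketched as a pair of helper functions. This is an illustrative sketch with hypothetical function names. It only computes the number and its two fields, and does not issue any coprocessor instruction.

```c
#include <assert.h>

/* Concatenates the 3-bit Opcode_2 and 4-bit CRm fields: {Opcode_2, CRm}. */
static unsigned cp14_debug_reg_number(unsigned opcode_2, unsigned crm)
{
    assert(opcode_2 <= 7u && crm <= 15u);
    return (opcode_2 << 4) | crm;
}

/* Splits a CP14 debug register number back into its two fields. */
static void cp14_debug_reg_fields(unsigned reg, unsigned *opcode_2, unsigned *crm)
{
    assert(reg <= 127u);
    *opcode_2 = reg >> 4;
    *crm = reg & 0xFu;
}
```

For example, Opcode_2 = b100 and CRm = b0101 give register c69, and Opcode_2 = b101 with CRm = b0000 gives register c80.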
Table 13-2 CP14 debug register map
Binary address               Register
Opcode_2     CRm             number       CP14 debug register name             Abbreviation
b000         b0000           c0           Debug ID Register                    DIDR
b000         b0001           c1           Debug Status and Control Register    DSCR
b000         b0010-b0100     c2-c4        Reserved                             -
b000         b0101           c5           Data Transfer Register               DTR
b000         b0110           c6           Watchpoint Fault Address Register    WFAR
b000         b0111           c7           Vector Catch Register                VCR
b000         b1000-b1001     c8-c9        Reserved                             -
b000         b1010           c10          Debug State Cache Control Register   DSCCR
b000         b1011           c11          Debug State MMU Control Register     DSMCR
b000         b1100-b1111     c12-c15      Reserved                             -
b001-b011    b0000-b1111     c16-c63      Reserved                             -
b100         b0000-b0101     c64-c69      Breakpoint Value Registers           BVRy (a)
b100         b0110-b1111     c70-c79      Reserved                             -
b101         b0000-b0101     c80-c85      Breakpoint Control Registers         BCRy (a)
b101         b0110-b1111     c86-c95      Reserved                             -
b110         b0000-b0001     c96-c97      Watchpoint Value Registers           WVRy (a)
b110         b0010-b1111     c98-c111     Reserved                             -
b111         b0000-b0001     c112-c113    Watchpoint Control Registers         WCRy (a)
b111         b0010-b1111     c114-c127    Reserved                             -
a. y is the decimal representation for the binary number CRm.
Note
All the debug resources required for Monitor debug-mode debugging are accessible through
CP14 registers. For Halting debug-mode debugging some additional resources are required. See
Chapter 14 Debug Test Access Port.
13.3.2
CP14 c0, Debug ID Register (DIDR)
The Debug ID Register is a read-only register that defines the configuration of debug registers
in a system. Figure 13-2 shows the format of the Debug ID Register.
[Figure 13-2: register bit layout. [31:28] WRP, [27:24] BRP, [23:20] Context, [19:16] Version,
[15:12] Debug architecture revision, [11:8] UNP/SBZ, [7:4] Variant, [3:0] Revision.]
Figure 13-2 Debug ID Register format
For the ARM1176JZ-S processor:
• DIDR[31:8] has the value 0x15121x
• the value of DIDR[7:0] is determined by fields in the CP15 c0 Main ID Register, as
  described in the field descriptions in Table 13-3.
Table 13-3 lists the bit field definitions for the Debug ID Register.
Table 13-3 Debug ID Register bit field definition
                    Read/write
Bits      Field     attributes   Description
[31:28]   WRP       R            Number of Watchpoint Register Pairs:
                                 b0000 = 1 WRP
                                 b0001 = 2 WRPs
                                 …
                                 b1111 = 16 WRPs.
                                 For the ARM1176JZ-S processor these bits are b0001 (2 WRPs).
[27:24]   BRP       R            Number of Breakpoint Register Pairs:
                                 b0000 = Reserved. The minimum number of BRPs is 2.
                                 b0001 = 2 BRPs
                                 b0010 = 3 BRPs
                                 …
                                 b1111 = 16 BRPs.
                                 For the ARM1176JZ-S processor these bits are b0101 (6 BRPs).
[23:20]   Context   R            Number of Breakpoint Register Pairs with context ID comparison
                                 capability:
                                 b0000 = 1 BRP has context ID comparison capability
                                 b0001 = 2 BRPs have context ID comparison capability
                                 …
                                 b1111 = 16 BRPs have context ID comparison capability.
                                 For the ARM1176JZ-S processor these bits are b0001 (2 BRPs).
[19:16]   Version   R            Debug architecture version. 0x2 denotes v6.1.
[15:12]   -         R            Debug architecture revision. 0x1 denotes TrustZone features.
[11:8]    -         UNP/SBZP     Reserved.
[7:4]     Variant   R            Implementation-defined variant number, incremented on major
                                 revisions of the product. This field is identical to bits [23:20]
                                 of the CP15 c0 Main ID Register, see c0, Main ID Register on
                                 page 3-20.
[3:0]     Revision  R            Implementation-defined revision number, incremented on minor
                                 revisions of the product. This field is identical to bits [3:0]
                                 of the CP15 c0 Main ID Register, see c0, Main ID Register on
                                 page 3-20.
The reason for duplicating the Variant and Revision fields here is that the Debug ID Register is
accessible through scan chain 0. This enables an external debugger to determine the variant and
revision numbers without stopping the core.
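The field layout in Table 13-3 can be sketched as a decoder. This is a minimal illustration with hypothetical type and function names. It takes a DIDR value that has already been read, because how the value is obtained (a CP14 read or a scan chain access) is outside the sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned wrps;                 /* number of Watchpoint Register Pairs */
    unsigned brps;                 /* number of Breakpoint Register Pairs */
    unsigned context_brps;         /* BRPs with context ID comparison */
    unsigned debug_arch_version;   /* DIDR[19:16] */
    unsigned debug_arch_revision;  /* DIDR[15:12] */
    unsigned variant;              /* DIDR[7:4] */
    unsigned revision;             /* DIDR[3:0] */
} didr_fields_t;

static didr_fields_t didr_decode(uint32_t didr)
{
    didr_fields_t f;

    /* The BRP encoding b0000 is reserved. */
    assert(((didr >> 24) & 0xFu) != 0u);

    /* The three count fields hold the count minus one. */
    f.wrps = ((didr >> 28) & 0xFu) + 1u;
    f.brps = ((didr >> 24) & 0xFu) + 1u;
    f.context_brps = ((didr >> 20) & 0xFu) + 1u;
    f.debug_arch_version = (didr >> 16) & 0xFu;
    f.debug_arch_revision = (didr >> 12) & 0xFu;
    /* Bits [11:8] are UNP/SBZP and are ignored. */
    f.variant = (didr >> 4) & 0xFu;
    f.revision = didr & 0xFu;
    return f;
}
```

As an example, decoding the hypothetical value 0x15121007 (bits [11:8] assumed to read as zero, variant 0, revision 7) gives 2 WRPs, 6 BRPs, and 2 BRPs with context ID comparison.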
13.3.3
CP14 c1, Debug Status and Control Register (DSCR)
The Debug Status and Control Register contains status and configuration information about the
state of the debug system. Figure 13-3 on page 13-8 shows the format of the Debug Status and
Control Register.
[Figure 13-3: register bit layout. [31] UNP/SBZP, [30] rDTRfull, [29] wDTRfull, [28:20] UNP/SBZP,
[19] Imprecise data abort ignored, [18] Non-Secure World status, [17] Not Secure Privileged
Non-Invasive Debug enable, SPNIDEN input pin, [16] Not Secure Privileged Invasive Debug Enable,
SPIDEN input pin, [15] Monitor debug-mode enable, [14] Mode select, [13] Execute ARM instruction
enable, [12] User mode access to DCC control, [11] Interrupts disable, [10] DbgAck, [9] Power
down disable, [8] Sticky Undefined flag, [7] Sticky imprecise Data Aborts flag, [6] Sticky
precise Data Abort flag, [5:2] Method of debug entry, [1] Core restarted, [0] Core halted.]
Figure 13-3 Debug Status and Control Register format
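Using the bit positions that Table 13-4 defines, the debug-mode selection and the DTR flags can be sketched as follows. This is an illustrative decoder with hypothetical names. It assumes that no debug-mode is active when bits 14 and 15 are both clear, and it operates on a DSCR value that has already been read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    DEBUG_MODE_NONE,
    DEBUG_MODE_MONITOR,
    DEBUG_MODE_HALTING
} debug_mode_t;

/* Returns the state of a single DSCR bit. */
static bool dscr_bit(uint32_t dscr, unsigned bit)
{
    assert(bit < 32u);
    return ((dscr >> bit) & 1u) != 0u;
}

static debug_mode_t dscr_debug_mode(uint32_t dscr)
{
    if (dscr_bit(dscr, 14u)) {
        return DEBUG_MODE_HALTING;   /* bit 14 set: Halting selected and enabled */
    }
    if (dscr_bit(dscr, 15u)) {
        return DEBUG_MODE_MONITOR;   /* bit 14 clear and bit 15 set */
    }
    return DEBUG_MODE_NONE;
}

/* rDTRfull is bit 30 and wDTRfull is bit 29. */
static bool dscr_rdtr_full(uint32_t dscr) { return dscr_bit(dscr, 30u); }
static bool dscr_wdtr_full(uint32_t dscr) { return dscr_bit(dscr, 29u); }
```

Note that bit 14 takes precedence in this sketch: a value with both bits 14 and 15 set decodes as Halting debug-mode, because Monitor debug-mode requires bit 14 to be clear.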
Table 13-4 lists the bit field definitions for the Debug Status and Control Register.
Table 13-4 Debug Status and Control Register bit field definitions
                      External       Reset
Bits      Core view   view           value        Description
[31]      UNP/SBZP    UNP/SBZP       -            Reserved.
[30]      R           R              0            The rDTRfull flag:
                                                  0 = rDTR empty
                                                  1 = rDTR full.
                                                  This flag is automatically set on writes by the DBGTAP
                                                  debugger to the rDTR and is cleared on reads by the core of
                                                  the same register. No writes to the rDTR are enabled if the
                                                  rDTRfull flag is set.
[29]      R           R              0            The wDTRfull flag:
                                                  0 = wDTR empty
                                                  1 = wDTR full.
                                                  This flag is automatically cleared on reads by the DBGTAP
                                                  debugger of the wDTR and is set on writes by the core to the
                                                  same register.
[28:20]   UNP/SBZP    UNP/SBZP       -            Reserved.
[19]      R           R              0            Imprecise Data Aborts Ignored. This read-only bit is set by
                                                  the core in Debug state following a Data Memory Barrier
                                                  operation, and cleared on exit from Debug state. When set,
                                                  the core does not act on imprecise data aborts. However, the
                                                  sticky imprecise data abort bit is set if an imprecise data
                                                  abort occurs when in Debug state.
[18]      R           R              0            Non-secure World Status bit:
                                                  0 = The processor is in Secure state. NS bit = 0 or Secure
                                                  Monitor mode.
                                                  1 = The processor is in Non-secure state. NS bit = 1 and not
                                                  Secure Monitor mode.
[17]      R           R              n/a          Not Secure Privilege Non-Invasive Debug Enable, SPNIDEN,
                                                  input pin:
                                                  0 = SPNIDEN input pin is HIGH
                                                  1 = SPNIDEN input pin is LOW.
[16]      R           R              n/a          Not Secure Privilege Invasive Debug Enable, SPIDEN, input
                                                  pin:
                                                  0 = SPIDEN input pin is HIGH
                                                  1 = SPIDEN input pin is LOW.
[15]      RW          R              0            The Monitor debug-mode enable bit:
                                                  0 = Monitor debug-mode disabled
                                                  1 = Monitor debug-mode enabled.
                                                  For the core to take a debug exception, Monitor debug-mode
                                                  has to be both selected and enabled, bit 14 clear and bit 15
                                                  set.
[14]      R           RW             0            Mode select bit:
                                                  0 = Monitor debug-mode selected
                                                  1 = Halting debug-mode selected and enabled.
[13]      R           RW             0            Execute ARM instruction enable bit:
                                                  0 = Disabled
                                                  1 = Enabled.
                                                  If this bit is set, the core can be forced to execute ARM
                                                  instructions in Debug state using the Debug Test Access
                                                  Port. If this bit is set when the core is not in Debug
                                                  state, the behavior of the processor is architecturally
                                                  Unpredictable. For ARM1176JZ-S processors it has no effect.
[12]      RW          R              0            User mode access to comms channel control bit:
                                                  0 = User mode access to comms channel enabled
                                                  1 = User mode access to comms channel disabled.
                                                  If this bit is set and a User mode process tries to access
                                                  the DIDR, DSCR, or the DTR, the Undefined instruction
                                                  exception is taken. Because accessing the rest of CP14 debug
                                                  registers is never possible in User mode, see Executing CP14
                                                  debug instructions on page 13-27, setting this bit means
                                                  that a User mode process cannot access any CP14 debug
                                                  register.
[11]      R           RW             0            Interrupts bit:
                                                  0 = Interrupts enabled
                                                  1 = Interrupts disabled.
                                                  If this bit is set, the IRQ and FIQ input signals are
                                                  inhibited. (a)
[10]      R           RW             0            DbgAck bit.
                                                  If this bit is set, the DBGACK