Lecture 19: Virtual memory - practice
Computer Architecture and Systems Programming (252-0061-00)
Timothy Roscoe
Herbstsemester 2012
© Systems Group | Department of Computer Science | ETH Zürich
Last Time: Address Translation
[Figure: single-level address translation. The page table base register (PTBR) holds the page table address for the current process. The virtual address splits into a virtual page number (VPN) and virtual page offset (VPO); the VPN indexes the page table, whose entry holds a valid bit and a physical page number (PPN). Valid bit = 0 means the page is not in memory (page fault). The physical address is the PPN concatenated with the physical page offset (PPO = VPO).]
Last Time: Page Fault
[Figure: page fault handling. (1) CPU sends the VA to the MMU; (2) MMU sends the PTE address (PTEA) to cache/memory; (3) the PTE is returned; (4) the valid bit is 0, so the MMU raises a page fault exception; (5) the handler selects a victim page and writes it to disk if dirty; (6) the handler pages in the new page from disk and updates the PTE; (7) the handler returns and the faulting instruction is re-executed.]
TLB Hit
[Figure: (1) CPU sends the VA to the MMU; (2) MMU sends the VPN to the TLB; (3) the TLB returns the PTE; (4) MMU sends the PA to cache/memory; (5) data is returned to the CPU.]
A TLB hit eliminates a memory access.
TLB Miss
[Figure: (1) CPU sends the VA to the MMU; (2) MMU sends the VPN to the TLB, which misses; (3) MMU sends the PTE address (PTEA) to cache/memory; (4) the PTE is returned and installed in the TLB; (5) MMU sends the PA to cache/memory; (6) data is returned to the CPU.]
A TLB miss incurs an additional memory access (for the PTE).
Fortunately, TLB misses are rare.
Today
• A note on Terminology
• Virtual memory (VM)
– Multi-level page tables
• Case study: VM system on P6
• Historical aside: VM system on the VAX
• x86-64 and 64-bit paging
• Performance optimization for VM system
Terminology
• Virtual Page may refer to:
  – A page-aligned region of virtual address space, and
  – The contents thereof (not the same thing: the contents might currently be on disk)
• Physical Page:
– Page-aligned region of physical memory (RAM)
• Physical Frame (=Physical Page)
– Alternative terminology
– Page = contents, Frame = container
– Page size may be ≠ frame size (rarely)
Today
• A note on Terminology
• Virtual memory (VM)
– Multi-level page tables
• Case study: VM system on P6
• Historical aside: VM system on the VAX
• x86-64 and 64-bit paging
• Performance optimization for VM system
Multi-Level Page Tables
• Given:
  – 4 KB page size
  – 48-bit address space
  – 4-byte PTEs
• Problem:
  – A single linear page table would need 256 GB!
    • 2^48 * 2^-12 * 2^2 = 2^38 bytes
• Common solution: multi-level page tables
  – Example: 2-level page table
    • Level 1 table: each PTE points to a level 2 page table; stays in memory
    • Level 2 tables: each PTE points to a page; paged in and out like other data
[Figure: a level 1 table, resident in memory, whose entries point to level 2 tables that are paged in and out.]
2-Level Page Table Hierarchy
[Figure: level 1 page table, level 2 page tables, and virtual memory. Level 1 PTEs 0 and 1 point to level 2 tables mapping VP 0 … VP 2047: 2K allocated VM pages for code and data. Level 1 PTEs 2–7 are null: 6K unallocated VM pages needing no level 2 tables. Level 1 PTE 8 points to a level 2 table that is mostly null PTEs (1023 unallocated pages) plus one PTE mapping VP 9215: 1 allocated VM page for the stack.]
Translating with a k-level Page Table
[Figure: the n-bit virtual address is split into VPN 1, VPN 2, …, VPN k and a p-bit VPO. VPN 1 indexes the level 1 page table, whose entry points to a level 2 page table, and so on; the level k page table entry holds the PPN. The m-bit physical address is the PPN concatenated with the PPO (= VPO).]
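To make the decomposition concrete, here is a minimal C sketch of such a k-level walk over an in-memory tree of tables. The constants, struct layout, and function name are made up for illustration and do not correspond to any particular MMU.

```c
/* Sketch of a k-level page table walk (illustrative only). Each VPN field is
 * VPN_BITS wide and indexes one level of the tree; the last level yields the
 * PPN, which is combined with the VPO to form the physical address. */
#include <stdint.h>
#include <stddef.h>

#define LEVELS     4            /* k                      */
#define VPN_BITS   9            /* bits of VPN per level  */
#define PAGE_SHIFT 12           /* p = log2(page size)    */
#define ENTRIES    (1u << VPN_BITS)

typedef struct table {
    struct table *next[ENTRIES];  /* levels 1..k-1: pointers to lower tables */
    uint64_t      ppn[ENTRIES];   /* level k: physical page numbers          */
} table_t;

/* Returns the physical address for va, or (uint64_t)-1 on a "page fault". */
uint64_t translate(table_t *root, uint64_t va)
{
    table_t *t = root;
    for (int level = 0; level < LEVELS; level++) {
        /* Extract this level's VPN field, most significant field first. */
        int shift = PAGE_SHIFT + (LEVELS - 1 - level) * VPN_BITS;
        size_t idx = (va >> shift) & (ENTRIES - 1);
        if (level < LEVELS - 1) {
            if (t->next[idx] == NULL)
                return (uint64_t)-1;            /* missing lower-level table */
            t = t->next[idx];
        } else {
            uint64_t vpo = va & ((1u << PAGE_SHIFT) - 1);
            return (t->ppn[idx] << PAGE_SHIFT) | vpo;   /* PPN ‖ PPO */
        }
    }
    return (uint64_t)-1;   /* not reached */
}
```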
Today
• A note on Terminology
• Virtual memory (VM)
– Multi-level page tables
• Case study: VM system on P6
• Historical aside: VM system on the VAX
• x86-64 and 64-bit paging
• Performance optimization for VM system
Intel P6
• Internal designation for successor to Pentium
– Which had internal designation P5
• Fundamentally different from Pentium
– Out-of-order, superscalar operation
• Resulting processors
– Pentium Pro (1996)
– Pentium II (1997)
• L2 cache on same chip
– Pentium III (1999)
• The Pentium 4 had a different microarchitecture
  – but a similar memory system
  – P4 abandoned by Intel in 2005 for the P6-based Core 2 Duo
P6 Memory System
[Figure: processor package containing the CPU, instruction fetch unit, L1 i-cache with instruction TLB, L1 d-cache with data TLB, and a bus interface unit; the L2 cache sits on a dedicated cache bus; DRAM is reached over the external system bus (e.g. PCI).]
• 32-bit address space, 4 KB page size
• L1, L2, and TLBs: 4-way set associative
• Instruction TLB: 32 entries, 8 sets
• Data TLB: 64 entries, 16 sets
• L1 i-cache and d-cache: 16 KB each, 32 B line size, 128 sets
• L2 cache: unified, 128 KB–2 MB
Review of Abbreviations
• Components of the virtual address (VA)
  – TLBI: TLB index
  – TLBT: TLB tag
  – VPO: virtual page offset
  – VPN: virtual page number
• Components of the physical address (PA)
  – PPO: physical page offset (same as VPO)
  – PPN: physical page number
  – CO: byte offset within cache line
  – CI: cache index
  – CT: cache tag
Overview: P6 Address Translation
[Figure: the CPU issues a 32-bit virtual address (VA), split into a 20-bit VPN and a 12-bit VPO. The VPN is further split into a 16-bit TLBT and a 4-bit TLBI and looked up in the TLB (16 sets, 4 entries/set). On a TLB miss, the VPN is split into two 10-bit fields VPN1 and VPN2 and the page tables are walked starting from the PDBR: the PDE selects a page table, and the PTE supplies the 20-bit PPN. The physical address (PPN ‖ 12-bit PPO) is then split into a 20-bit CT, 7-bit CI, and 5-bit CO for the L1 cache (128 sets, 4 lines/set); on an L1 miss the access goes to L2 and DRAM, and the 32-bit result returns to the CPU.]
P6 2-level Page Table Structure
• Page directory
  – 1024 4-byte page directory entries (PDEs) that point to page tables
  – One page directory per process
  – Page directory must be in memory when its process is running
  – Always pointed to by PDBR
  – Large page support:
    • Make the PD the page table
    • Fixes page size to 4 MB (why?)
• Page tables:
  – 1024 4-byte page table entries (PTEs) that point to pages
  – Size: exactly one page
  – Page tables can be paged in and out
[Figure: one page directory of 1024 PDEs pointing to up to 1024 page tables of 1024 PTEs each.]
P6 Page Directory Entry (PDE)
Format when P=1 (page table present):
  bits 31–12: page table physical base address | 11–9: Avail | 8: G | 7: PS | 6: – | 5: A | 4: CD | 3: WT | 2: U/S | 1: R/W | 0: P=1
• Page table physical base address: 20 most significant bits of the physical page table address (forces page tables to be 4 KB aligned)
• Avail: these bits available for system programmers
• G: global page (don't evict from TLB on task switch)
• PS: page size 4 KB (0) or 4 MB (1)
• A: accessed (set by MMU on reads and writes, cleared by software)
• CD: cache disabled (1) or enabled (0)
• WT: write-through or write-back cache policy for this page table
• U/S: user or supervisor mode access
• R/W: read-only or read-write access
• P: page table is present in memory (1) or not (0)
Format when P=0 (page table not present):
  bits 31–1: available for OS (page table location in secondary storage) | 0: P=0
P6 Page Table Entry (PTE)
Format when P=1 (page present):
  bits 31–12: page physical base address | 11–9: Avail | 8: G | 7: 0 | 6: D | 5: A | 4: CD | 3: WT | 2: U/S | 1: R/W | 0: P=1
• Page base address: 20 most significant bits of the physical page address (forces pages to be 4 KB aligned)
• Avail: available for system programmers
• G: global page (don't evict from TLB on task switch)
• D: dirty (set by MMU on writes)
• A: accessed (set by MMU on reads and writes)
• CD: cache disabled or enabled
• WT: write-through or write-back cache policy for this page
• U/S: user/supervisor
• R/W: read/write
• P: page is present in physical memory (1) or not (0)
Format when P=0 (page not present):
  bits 31–1: available for OS (page location in secondary storage) | 0: P=0
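As a small illustration of the PDE/PTE formats on the last two slides, the following C sketch defines the flag bits and shows how an entry's base address and present bit could be unpacked or assembled. The macro and function names are invented for this example, not taken from any real kernel.

```c
/* Illustrative helpers for unpacking P6 PDE/PTE bits as described above. */
#include <stdint.h>
#include <stdbool.h>

#define X86_P    (1u << 0)   /* present                        */
#define X86_RW   (1u << 1)   /* 1 = read/write, 0 = read-only  */
#define X86_US   (1u << 2)   /* 1 = user, 0 = supervisor-only  */
#define X86_WT   (1u << 3)   /* write-through cache policy     */
#define X86_CD   (1u << 4)   /* cache disabled                 */
#define X86_A    (1u << 5)   /* accessed                       */
#define X86_D    (1u << 6)   /* dirty (PTE only)               */
#define X86_PS   (1u << 7)   /* page size 4 MB (PDE only)      */
#define X86_G    (1u << 8)   /* global                         */

static inline bool     entry_present(uint32_t e) { return e & X86_P; }
static inline uint32_t entry_base(uint32_t e)    { return e & 0xFFFFF000u; } /* bits 31..12 */

/* Build a PTE mapping a 4 KB-aligned physical address, user-writable. */
static inline uint32_t make_pte(uint32_t paddr)
{
    return (paddr & 0xFFFFF000u) | X86_P | X86_RW | X86_US;
}
```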
Representation of VM Addr. Space
• Simplified example: 16-page virtual address space
• Flags per entry:
  – P: Is the entry in physical memory?
  – M: Has this part of the VA space been mapped?
[Figure: a page directory whose PDEs point to page tables PT 0, PT 2, and PT 3, each entry marked with its P and M flags. Pages 0–15 of the virtual address space are shown in one of three states: in memory (P=1, M=1, with a memory address), on disk (P=0, M=1, with a disk address), or unmapped (P=0, M=0).]
P6 TLB Translation
[Figure: the same address-translation diagram, focusing on the TLB: the 20-bit VPN is split into a 16-bit TLBT and a 4-bit TLBI and looked up in the TLB (16 sets, 4 entries/set); a hit yields the 20-bit PPN directly, a miss falls back to the page-table walk via PDBR, PDE, and PTE.]
P6 TLB
• TLB entry (not all documented, so this is speculative):
    PPN (20) | TLBTag (16) | V (1) | G (1) | S (1) | W (1) | D (1)
  – V: indicates a valid (1) or invalid (0) TLB entry
  – TLBTag: disambiguates entries cached in the same set
  – PPN: translation of the address indicated by index & tag
  – G: page is "global" according to PDE, PTE
  – S: page is "supervisor-only" according to PDE, PTE
  – W: page is writable according to PDE, PTE
  – D: PTE has already been marked "dirty" (once is enough)
• Structure of the data TLB:
  – 16 sets, 4 entries/set
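A minimal C sketch of how a lookup in a TLB organized like this data TLB might proceed, assuming the speculative entry format above; the struct layout and function names are illustrative only.

```c
/* Sketch of a lookup in a 16-set, 4-way TLB with the entry fields above. */
#include <stdint.h>
#include <stdbool.h>

#define TLB_SETS 16
#define TLB_WAYS 4

struct tlb_entry {
    unsigned ppn : 20;   /* physical page number          */
    unsigned tag : 16;   /* TLBT                          */
    unsigned v   : 1;    /* valid                         */
    unsigned g   : 1;    /* global                        */
    unsigned s   : 1;    /* supervisor-only               */
    unsigned w   : 1;    /* writable                      */
    unsigned d   : 1;    /* dirty already recorded in PTE */
};

static struct tlb_entry tlb[TLB_SETS][TLB_WAYS];

/* Look up a 20-bit VPN: returns true and fills *ppn on a hit. */
bool tlb_lookup(uint32_t vpn, uint32_t *ppn)
{
    uint32_t tlbi = vpn & (TLB_SETS - 1);   /* low 4 bits: set index */
    uint32_t tlbt = vpn >> 4;               /* high 16 bits: tag     */
    for (int way = 0; way < TLB_WAYS; way++) {
        struct tlb_entry *e = &tlb[tlbi][way];
        if (e->v && e->tag == tlbt) {
            *ppn = e->ppn;
            return true;                    /* TLB hit */
        }
    }
    return false;                           /* TLB miss: walk the page tables */
}
```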
Translating with the P6 TLB
1. Partition the VPN into TLBT and TLBI.
2. Is the PTE for this VPN cached in set TLBI?
3. Yes (TLB hit): check permissions, build the physical address.
4. No (TLB miss): read the PTE (and the PDE, if it is not cached either) from memory and build the physical address.
[Figure: the virtual address (20-bit VPN, 12-bit VPO) is split into a 16-bit TLBT and a 4-bit TLBI; on a miss, the page-table walk supplies the PDE and PTE (the case where only the PDE is cached is labelled a "partial TLB hit"); the 20-bit PPN and 12-bit PPO form the physical address.]
P6 TLB Translation
[Figure: the same address-translation diagram, now focusing on the page-table walk taken on a TLB miss: the VPN is split into two 10-bit fields VPN1 and VPN2, the PDBR locates the page directory, the PDE selected by VPN1 locates a page table, and the PTE selected by VPN2 supplies the 20-bit PPN.]
Translating with P6 Page Tables (case 1/1)
• Page table and page both present
[Figure: PDBR → page directory (PDE p=1) → page table (PTE p=1) → data page in memory; VPN1 and VPN2 index the directory and table, and the 20-bit PPN plus 12-bit PPO form the physical address.]
• MMU action:
  – MMU builds the physical address and fetches the data word
• OS action:
  – None
Translating with P6 Page Tables (case 1/0)
• Page table present, page missing
[Figure: PDBR → page directory (PDE p=1) → page table (PTE p=0); the data page is on disk.]
• MMU action:
  – Page fault exception
  – Handler receives the following arguments:
    • %eip that caused the fault
    • VA that caused the fault
    • Whether the fault was caused by a non-present page or a page-level protection violation
      – Read/write
      – User/supervisor
Translating with P6 Page Tables (case 1/0, cont.)
• OS action:
  – Check for a legal virtual address
  – Read the PTE through the PDE
  – Find a free physical page (swapping out the current page if necessary)
  – Read the virtual page from disk into the physical page
  – Adjust the PTE to point to the physical page, set p=1
  – Restart the faulting instruction by returning from the exception handler
[Figure: after handling, PDBR → PDE p=1 → PTE p=1 → data page now in memory.]
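The following C-flavoured sketch mirrors these OS actions for case 1/0. All helper functions and the PTE encoding details are hypothetical placeholders standing in for real kernel machinery, not an actual API.

```c
/* Minimal sketch of the OS actions for case 1/0 (page table present, page
 * missing). vm_area_legal, walk_pde_to_pte, alloc_frame, and disk_read are
 * hypothetical helpers that just mirror the steps listed above. */
#include <stdint.h>

typedef uint32_t pte_t;

extern int      vm_area_legal(uint32_t va);      /* is the VA in a mapped region? */
extern pte_t   *walk_pde_to_pte(uint32_t va);    /* read the PTE through the PDE  */
extern uint32_t alloc_frame(void);               /* free frame; may evict a victim */
extern void     disk_read(uint32_t disk_addr, uint32_t frame);

void handle_page_fault(uint32_t fault_va)
{
    if (!vm_area_legal(fault_va)) {
        /* illegal address: signal the process instead of paging in */
        return;
    }
    pte_t *pte = walk_pde_to_pte(fault_va);      /* PDE has p=1 in this case      */
    uint32_t frame = alloc_frame();              /* may swap out the current page */
    disk_read(*pte >> 1, frame);                 /* p=0 PTE holds the disk address
                                                    in bits 31..1 (slide format)  */
    *pte = (frame << 12) | 0x7;                  /* PPN | U/S | R/W | p=1         */
    /* returning from the exception restarts the faulting instruction */
}
```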
Translating with P6 Page Tables (case 0/1)
• Page table missing, page present
[Figure: PDBR → page directory (PDE p=0); the page table is on disk, but the data page it maps (PTE p=1) is in memory.]
• Introduces a consistency issue
  – Potentially every page-out requires an update of the on-disk page table
• Linux disallows this
  – If a page table is swapped out, swap out its data pages too
Translating with P6 Page Tables (case 0/0)
• Page table and page both missing
[Figure: PDBR → page directory (PDE p=0); both the page table (PTE p=0) and the data page are on disk.]
• MMU action:
  – Page fault
Translating with P6 Page Tables (case 0/0, cont.)
• OS action:
  – Swap in the page table
  – Restart the faulting instruction by returning from the handler
• Like case 1/0 from here on
  – Two disk reads in total
[Figure: after the first fault, PDBR → PDE p=1 → PTE p=0; the data page is still on disk and will fault again as in case 1/0.]
P6 L1 Cache Access
[Figure: the same address-translation diagram, now focusing on the cache access: the physical address is split into a 20-bit CT, 7-bit CI, and 5-bit CO for the L1 cache (128 sets, 4 lines/set); an L1 miss goes to L2 and DRAM, and the 32-bit result returns to the CPU.]
L1 Cache Access
• Partition the physical address into CO, CI, and CT
• Use CT to determine whether the line containing the word at address PA is cached in set CI
• Yes: extract the word at byte offset CO and return it to the processor
• No: check L2
[Figure: physical address (20-bit CT, 7-bit CI, 5-bit CO) indexing the L1 cache (128 sets, 4 lines/set); on a miss the access goes to L2 and DRAM, on a hit 32 bits of data return.]
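A small, self-contained C example of this partitioning, using the P6 field widths (5-bit CO, 7-bit CI, 20-bit CT):

```c
/* Splitting a P6 physical address into CO, CI, and CT. Purely illustrative. */
#include <stdint.h>
#include <stdio.h>

#define CO_BITS 5
#define CI_BITS 7

int main(void)
{
    uint32_t pa = 0x12345678u;                              /* example physical address */
    uint32_t co =  pa             & ((1u << CO_BITS) - 1);  /* byte offset within line   */
    uint32_t ci = (pa >> CO_BITS) & ((1u << CI_BITS) - 1);  /* set index                 */
    uint32_t ct =  pa >> (CO_BITS + CI_BITS);               /* tag                       */
    printf("CT=%05x CI=%02x CO=%02x\n", ct, ci, co);
    return 0;
}
```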
Speeding Up L1 Access
[Figure: the 12 VPO bits of the virtual address are identical to the 12 PPO bits of the physical address ("no change" under address translation), and the CI and CO fields fall entirely within them; only the 20-bit CT comes from the translated PPN, which is checked against the tag.]
• Observation
  – The bits that determine CI are identical in the virtual and physical address
  – So the cache can be indexed while address translation is taking place
  – Generally we hit in the TLB, so the PPN bits (CT bits) are available next for the tag check
  – "Virtually indexed, physically tagged"
  – The cache is carefully sized to make this possible
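The "carefully sized" constraint can be stated numerically: each way of the cache (sets × line size) must fit within one page, so that CI and CO lie entirely inside the untranslated offset bits. A tiny C check of the P6 numbers, using the figures from the earlier slides:

```c
/* Why the P6 L1 can be virtually indexed, physically tagged:
 * 128 sets * 32 B lines = 4096 B = one 4 KB page, so CI + CO = 7 + 5 = 12
 * bits, exactly the untranslated page-offset bits. */
#include <assert.h>

int main(void)
{
    const int sets = 128, line_bytes = 32, page_bytes = 4096;
    assert(sets * line_bytes <= page_bytes);   /* per-way footprint fits in a page */
    return 0;
}
```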
Today
• A note on Terminology
• Virtual memory (VM)
– Multi-level page tables
• Case study: VM system on P6
• Historical aside: VM system on the VAX
• x86-64 and 64-bit paging
• Performance optimization for VM system
Historical aside: virtual page tables
• Same problem: a linear page table can be large.
• On the VAX:
  – Page size = 512 bytes (so offset = 9 bits)
  – Virtual address space = 32 bits
    ⇒ page table index = 23 bits
• Virtual address layout: 2-bit segment selector (ss) | 21-bit Addr VPN | 9-bit Addr VPO
• Segments:
  – 00  P0  User program text and data
  – 01  P1  User stack
  – 10  S0  System: kernel and page tables
  – 11  S1  Unused (reserved)
Historical aside: virtual page tables
• Same problem: a linear page table can be large.
• On the VAX:
  – Page size = 512 bytes (so offset = 9 bits)
  – Virtual address space = 32 bits
    ⇒ page table index = 23 bits
    ⇒ page table size = 8388608 entries
  – Each PTE = 4 bytes (32 bits)
    ⇒ 32 Mbytes per page table (i.e. per process!)
  – Too much memory in those days…
• Solution: put the linear table into virtual memory!
  – Of course, most of the PTEs are not used
    • Invalid translations save space
  – The TLB hides most of the double lookups
[Figure: the linear page table lives in virtual memory (system space) rather than physical memory, so looking up a PTE requires another virtual → physical translation.]
VAX translation process
• Virtual address requested (in seg. Px: user space):
    [ 0x (2-bit segment) | Addr VPN (21 bits) | Addr VPO (9 bits) ]
• Virtual address of the PTE (in seg. S0: system space):
    [ 10 (S0) | PxTB + Addr VPN | 00 ]
  – the trailing 00 because each PTE is 4 bytes
  – "PxTB + Addr VPN" is actually an addition ⇒ more bits than shown here
VAX translation process
• Virtual address requested (in seg. Px: user space):
    [ 0x | Addr VPN (21) | Addr VPO (9) ]
• Virtual address of the PTE (in seg. S0: system space):
    [ 10 (S0) | PxTB + Addr VPN | 00 ]
• Physical address of the PTE mapping the PTE we want:
    [ SPTB + PTE VPN | PTE VPO | 00 ]
  – PTE VPN/VPO come from the PTE's S0 virtual address; the system page table (base SPTB) lives in physical memory
  – again actually an addition ⇒ more bits than shown here
VAX translation process
• The full chain of accesses:
  1. Virtual address requested (in seg. Px: user space): [ 0x | Addr VPN | Addr VPO ]
  2. Virtual address of the PTE (in seg. S0: system space): [ 10 | PxTB + Addr VPN | 00 ]
  3. Physical address of the PTE mapping the PTE we want: [ SPTB + PTE VPN | 00 ] → load
  4. The loaded entry gives PTE PFN; the physical address of the PTE we want is [ PTE PFN | PTE VPO ] → load
  5. The PTE gives the physical address of the value we want: [ Addr PFN | Addr PFO ] → load
  6. Memory value (at last!)
VAX translation process
• Not so bizarre after all: this really is a 2-level page table!
  – SPTB plays the role of the page table base address
  – The high 14 bits of the VPN are the 1st-level index (into the system page table, together with PxTB)
  – The low 7 bits of the VPN are the 2nd-level index (within one 512-byte page of user PTEs)
  – followed by the 9-bit VPO
• If you can really understand why this is the case, you'll have no problem understanding virtual memory systems.
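As a concrete, simplified restatement of the first step, this C sketch forms the S0 virtual address of the PTE for a user address; constants and names are illustrative, not the VAX's actual register interface.

```c
/* Sketch of the VAX lookup described above: to translate a user virtual
 * address, first form the *virtual* address of its PTE in system (S0) space,
 * then translate that via the system page table (based at SPTB). */
#include <stdint.h>

#define PAGE_SHIFT 9    /* 512-byte pages */
#define PTE_SIZE   4    /* 4-byte PTEs    */

/* Virtual address (in S0 space) of the PTE for user address 'va', given the
 * per-process page table base pxtb (itself an S0 virtual address). */
uint32_t pte_vaddr(uint32_t va, uint32_t pxtb)
{
    uint32_t vpn = (va & 0x3FFFFFFFu) >> PAGE_SHIFT;  /* 21-bit Addr VPN */
    return pxtb + vpn * PTE_SIZE;                     /* lands in S0     */
}
```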
Today
• A note on Terminology
• Virtual memory (VM)
– Multi-level page tables
• Case study: VM system on P6
• Historical aside: VM system on the VAX
• x86-64 and 64-bit paging
• Performance optimization for VM system
x86-64 Paging
• Origin
  – AMD's way of extending x86 to a 64-bit instruction set
  – Intel has followed with "EM64T"
• Requirements
  – 48-bit virtual address
    • 256 terabytes (TB)
    • Not yet ready for a full 64 bits
      – Nobody can buy that much DRAM yet
      – Mapping tables would be huge
      – A multi-level array map may not be the right data structure
  – 52-bit physical address = 40 bits for the PPN
    • Requires 64-bit table entries
  – Keep the traditional x86 4 KB page size, and the same size for page tables
    • (4096 bytes per PT) / (8 bytes per PTE) = only 512 entries per page
x86-64 Paging
[Figure: the 48-bit virtual address is split into four 9-bit fields VPN1–VPN4 and a 12-bit VPO. A base register (BR) locates the Page Map Level 4 Table; its entry (PML4E) locates the Page Directory Pointer Table (PDPE), which locates the Page Directory Table (PDE), which locates the Page Table (PTE). The PTE supplies the 40-bit PPN, which together with the 12-bit PPO forms the physical address.]
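A short C example of splitting a 48-bit virtual address into the four 9-bit indices and the 12-bit offset shown in the figure; the variable names are illustrative.

```c
/* Decomposing a 48-bit x86-64 virtual address into the four 9-bit table
 * indices and the 12-bit page offset. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t va   = 0x00007f8a12345678ull;        /* example canonical address */
    unsigned vpo  =  va        & 0xFFF;           /* bits 11..0                */
    unsigned pt   = (va >> 12) & 0x1FF;           /* page table index          */
    unsigned pd   = (va >> 21) & 0x1FF;           /* page directory index      */
    unsigned pdpt = (va >> 30) & 0x1FF;           /* page dir. pointer index   */
    unsigned pml4 = (va >> 39) & 0x1FF;           /* page map level 4 index    */
    printf("PML4=%u PDPT=%u PD=%u PT=%u VPO=0x%x\n", pml4, pdpt, pd, pt, vpo);
    return 0;
}
```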
Today
• A note on Terminology
• Virtual memory (VM)
– Multi-level page tables
• Case study: VM system on P6
• Historical aside: VM system on the VAX
• x86-64 and 64-bit paging
• Performance optimization for VM system
Large Pages
[Figure: with large pages, the 32-bit address splits into a 10-bit VPN/PPN and a 22-bit VPO/PPO, versus the usual 20-bit VPN/PPN and 12-bit VPO/PPO.]
• 4 MB on 32-bit, 2 MB on 64-bit
• Simplify address translation
• Useful for programs with very large, contiguous working sets
  – Reduces compulsory TLB misses
• How to use (Linux) – see the sketch below:
  – hugetlbfs support (since at least 2.6.16)
  – Use libhugetlbfs
    • {m,c,re}alloc replacements
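Beyond libhugetlbfs, one simple way to experiment with large pages on Linux is mmap's MAP_HUGETLB flag (a real Linux flag, though added in kernels later than the 2.6.16 hugetlbfs support mentioned above; it requires the administrator to reserve huge pages, e.g. via /proc/sys/vm/nr_hugepages). A minimal sketch:

```c
/* Request one explicit 2 MB huge page via mmap(MAP_HUGETLB). Fails cleanly
 * if no huge pages are reserved or the kernel lacks support. */
#define _GNU_SOURCE
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t len = 2 * 1024 * 1024;    /* one 2 MB huge page (x86-64) */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");
        return EXIT_FAILURE;
    }
    ((char *)p)[0] = 42;             /* touch it: backed by a single TLB entry */
    munmap(p, len);
    return EXIT_SUCCESS;
}
```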
Buffering: Example MMM
• Matrix-matrix multiplication (MMM), blocked for the cache
[Figure: c = a * b + c, computed block-wise with block size B x B.]
• Assume blocking for the L2 cache
  – say, 512 KB = 2^19 B = 2^16 doubles = C
  – 3B^2 < C means B ≈ 150
Buffering: Example MMM (cont.)
• But: look at one iteration
[Figure: c = a * b + c with block size B = 150; assume a matrix row is > 4 KB = 512 doubles. Each row of the block is used O(B) times, but with O(B^2) operations between uses.]
• Consequence
  – Each row is on a different page
  – More rows than TLB entries: TLB thrashing
  – Solution: buffering = copy the block to contiguous memory (see the sketch below)
    • O(B^2) copy cost for O(B^3) operations
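A minimal C sketch of the buffering step: copying one B x B block of a row-major matrix into a contiguous buffer before the block is reused; lda and the function name are assumptions for illustration.

```c
/* Copy a B x B block of a large row-major matrix into a small contiguous
 * buffer, so the block touches O(B) pages once instead of on every pass. */
#include <stddef.h>

#define B 150                        /* block size from the slide */

/* Copy block a[i0..i0+B-1][j0..j0+B-1] into buf (B x B, contiguous).
 * 'lda' is the leading dimension (row length) of the big matrix. */
void copy_block(const double *a, size_t lda, size_t i0, size_t j0,
                double buf[B][B])
{
    for (size_t i = 0; i < B; i++)            /* O(B^2) copy cost ...       */
        for (size_t j = 0; j < B; j++)
            buf[i][j] = a[(i0 + i) * lda + (j0 + j)];
}                                             /* ... amortized over O(B^3)
                                                 multiply-add operations    */
```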
Next time: I/O Devices
• What is a device?
• Registers
– Example: NS16550 UART
• Interrupts
• Direct Memory Access (DMA)
• PCI (Peripheral Component Interconnect)