Dispatcher

Week 3:
Process Dispatch / Mackerel
Advanced Operating Systems
(263-3800-00L)
Timothy Roscoe
Autumn Semester 2012
http://www.systems.ethz.ch/courses/fall2012/AOS
© Systems Group | Department of Computer Science | ETH Zürich
Milestone 2
• Processor interrupts
– More device access
• Simple scheduler
– Round-robin: we may see better policies later
• Process dispatch
– Upcalls
– User-level thread scheduling
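A round-robin policy can be sketched in a few lines of C. This is only an illustration under assumed names: `struct rr_sched`, `MAX_TASKS` and `rr_pick_next` are hypothetical and not part of the course code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 4   /* hypothetical fixed-size task table */

struct rr_sched {
    int runnable[MAX_TASKS];  /* non-zero means task i may run */
    size_t current;           /* index of the task that ran last */
};

/* Pick the next runnable task after `current`, wrapping around.
 * Returns the task index, or -1 if nothing is runnable (idle). */
int rr_pick_next(struct rr_sched *s)
{
    for (size_t step = 1; step <= MAX_TASKS; step++) {
        size_t candidate = (s->current + step) % MAX_TASKS;
        if (s->runnable[candidate]) {
            s->current = candidate;
            return (int)candidate;
        }
    }
    return -1;
}
```

Because the search starts one past the task that ran last, every runnable task gets a turn before any task runs twice.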
Week 3:
Process Dispatch
Definitions
• Scheduling: deciding which task to run (later this
semester)
• Dispatch: how the chosen task starts (or resumes)
execution
• Event: a notification to a task that something happened
• Transport: conveying (non-trivial) data to a task
• IPC: general inter-process communication usually
combines dispatch, notification, transport
• These are hard, if not impossible, to separate entirely.
The problem
• Threads are a programming language abstraction
– (different, possibly parallel activities)
⇒ should be lightweight, in the language runtime
• Threads are a kernel abstraction
– (virtual or physical processors)
– way to manage “big” resources like CPUs
• Threads need to communicate (either within or
between address spaces)
– Thread and IPC performance critical to applications
What are the options?
1. Implement multiple threads over a single
“virtual CPU” (kernel thread)
– Perhaps many VCPUs per process
2. One user-level thread per “virtual CPU”
– Multiple kernel threads in a process
3. Some combination of the above
– Multiplex user threads over kernel threads
[Figure: many-to-one threads; layers labelled User and Kernel above CPU 0 and CPU 1]
1. Many-to-one threads
• Early “thread libraries”
– Green threads (original Java VM)
– GNU Portable Threads
– Standard student exercise: implement them!
• Sometimes called “pure user-level threads”
– No kernel support required
– Also (confusingly) “Lightweight Processes”
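As a starting point for the exercise, pure user-level threads can be sketched with the `getcontext`/`makecontext`/`swapcontext` family from `<ucontext.h>` (available in glibc). The names `worker`, `ult_demo` and `trace` are invented for this sketch, and the "scheduler" is just a loop that visits each thread twice. Each thread's stack is allocated on the heap.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define ULT_STACK_SIZE (64 * 1024)

static ucontext_t main_ctx;       /* context of the "scheduler" loop */
static ucontext_t thread_ctx[2];  /* two user-level threads */
static int trace[8];              /* order in which code ran */
static int trace_len;

/* Thread body: log, yield to the scheduler, log again, then finish. */
static void worker(int id)
{
    trace[trace_len++] = id;
    swapcontext(&thread_ctx[id], &main_ctx);  /* cooperative yield */
    trace[trace_len++] = id + 10;
}

/* Run both threads round-robin until each has finished.
 * Returns the number of trace entries, or -1 if allocation fails. */
int ult_demo(void)
{
    void *stacks[2] = { malloc(ULT_STACK_SIZE), malloc(ULT_STACK_SIZE) };
    if (stacks[0] == NULL || stacks[1] == NULL) {
        free(stacks[0]);
        free(stacks[1]);
        return -1;
    }
    trace_len = 0;
    for (int i = 0; i < 2; i++) {
        getcontext(&thread_ctx[i]);
        thread_ctx[i].uc_stack.ss_sp = stacks[i];   /* heap-allocated stack */
        thread_ctx[i].uc_stack.ss_size = ULT_STACK_SIZE;
        thread_ctx[i].uc_link = &main_ctx;          /* resume here on return */
        makecontext(&thread_ctx[i], (void (*)(void))worker, 1, i);
    }
    for (int round = 0; round < 2; round++)
        for (int i = 0; i < 2; i++)
            swapcontext(&main_ctx, &thread_ctx[i]);
    free(stacks[0]);
    free(stacks[1]);
    return trace_len;
}
```

No kernel involvement is needed for the switches themselves, which is why a blocking system call in one thread would stall both.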
Address space layout for user level threads
[Figure: address space containing Text, Data, BSS, the main Stack, and stacks for Threads 1 to 3; annotation: “Just allocate on the heap”]
User-level threads
Older Unices, etc.
• High performance
– 10x procedure call
• Scalable
• Flexible (application-specific)
• Treat a process as a virtual processor
• But it isn’t:
– Page faults
– I/O
– Multiprocessors
[Figure: one-to-one threads; layers labelled User and Kernel above CPU 0 and CPU 1]
2. One-to-one user threads
• Every user thread is/has a kernel thread.
• Equivalent to:
– multiple processes sharing an address space
– Except that “process” now refers to a group of
threads
• Most modern OS threads packages:
– Linux, Solaris, Windows XP, MacOSX, etc.
One-to-one user threads
[Figure: address space containing Text, Data, BSS, the main Stack, and stacks for Threads 1 to 3]
Kernel threads
Linux, Vista, L4, etc.
• Excellent integration with the OS
• Slow
– similar to process switch time
• Inflexible
– kernel policy
– Evidence: people implemented user-level threads
over kernel threads anyway
⇒ same old problems...
Comparison
User-level threads:
• Cheap to create and destroy
• Fast to context switch
• Can block entire process
– Not just on system calls
– Page faults!
One-to-one threads:
• Easier to schedule
• Nicely handles blocking
• Memory usage (kernel stack)
• Slow to switch
– Requires kernel crossing
[Figure: many-to-many threads; layers labelled User and Kernel above CPU 0 and CPU 1]
3. Many-to-many threads
• Multiplex user-level threads over several
kernel-level threads
• Can “pin” user thread to kernel thread for
performance/predictability
• Thread migration costs are “interesting”…
Issues with modern threads
• Hard for a runtime to tell:
– How many user-level threads can run at a time
(i.e. how many physical cores are allocated)
– Which user-level threads can run
(i.e. which physical cores are allocated)
• Severely limits flexibility of user-level
scheduler
– Can critically impact performance of parallel
applications.
Scheduler activations
• Basic mechanism: upcall to the ULS from the
kernel
– Context for this: a scheduler activation
• Structurally like a kernel thread but . . .
– created on-demand in response to events
(blocking, preemption, etc.)
• User level threads package built on top
• Hardware: DEC SRC Firefly workstation (7-processor VAX)
Scheduler activations speedup
[Figure not reproduced]
Scheduler activations memory footprint
[Figure not reproduced]
Psyche threads
• Similar: aims to remove kernel from most thread
scheduling decisions, reflect kernel-level events to user
space
• Kernel and ULS share data structures (read/write, read-only)
• Kernel upcalls ULS (“software interrupts”) in a virtual
processor for:
– Timer expiration
– Imminent preemption (err...)
– Start of blocking system call
– Unblocking of a system call
• Shared data structure standardises interface for
blocking/unblocking threads
Psyche data structures
Interesting features of Psyche
• Threads given warning of imminent preemption
– Is there a problem here?
• Upcalls can be nested (stack)
– Likewise?
• Upcalls can be disabled or queued
• Lots of user space data structures to be pinned
• Unlike Scheduler Activations, doesn’t handle
(e.g.) page faults
Nemesis dispatch
• Avoid nested upcalls
– Activation handler
doesn’t need to be
reentrant
• Per-domain data
structures
– all user read/write!
• Almost identical in K42,
Barrelfish, . . .
Deschedule / Preemption
• If resume bit == 0:
– Processor state → activation slot (saved)
• Else:
– Processor state → resume slot (saved)
• Enter the scheduler
Dispatch / Reschedule
• If resume bit == 0:
– resume bit ← 1
– Jump to activation addr
on activation stack
(small!)
• Else:
– Processor state ←
resume slot
• c.f. disabling interrupts
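The deschedule and dispatch rules can be modelled in plain C. This is a sketch, not Nemesis code: `struct domain`, `struct cpu_state` and both function names are hypothetical, and it assumes that deschedule saves the processor state into whichever slot the resume bit selects.

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_state { unsigned long pc; unsigned long sp; };

struct domain {
    bool resume;                        /* the "resume bit" */
    struct cpu_state activation_slot;   /* saved here while resume == 0 */
    struct cpu_state resume_slot;       /* saved here while resume == 1 */
    unsigned long activation_addr;      /* entry point of the handler */
    unsigned long activation_sp;        /* top of the small activation stack */
};

/* Deschedule / preemption: save state into the slot the bit selects. */
void deschedule(struct domain *d, const struct cpu_state *cpu)
{
    if (!d->resume)
        d->activation_slot = *cpu;
    else
        d->resume_slot = *cpu;
}

/* Dispatch / reschedule: return the state the CPU continues with. */
struct cpu_state dispatch(struct domain *d)
{
    if (!d->resume) {
        d->resume = true;   /* no further upcalls until the ULS clears it */
        return (struct cpu_state){ .pc = d->activation_addr,
                                   .sp = d->activation_sp };
    }
    return d->resume_slot;  /* pick up exactly where we were preempted */
}
```

Setting the bit on entry is what lets the activation handler be non-reentrant: a preemption inside the handler is saved to the resume slot and later resumed, rather than triggering a second upcall.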
User-level schedulers in
Nemesis
• Upcall handler gets activations on reschedule
• Resume always set on activation
⇒ no need for reentrant ULS
• Picks a context slot to run from
– Slots are a cache for thread contexts
• Clears resume bit and resumes context
– Implementation: Alpha PALmode call (2 pipeline
drains)
– Must be atomic (or must it?)
• All implemented in user-level library
Dispatch in Barrelfish
• Activations: separate “dispatcher” per process per core
– Avoid Psyche-like complexity
• No activation stack: disable mechanism (à la Nemesis)
• Multiple upcall entries (from K42):
– Preemption / reschedule
– Page fault
– Exception
– etc.
• User-level thread schedulers span address spaces
across cores (dispatchers)
Summary of dispatch
• Plenty of ways to deliver processor to an
application
• Expose underlying scheduler decisions
⇒ give more control to user-level thread scheduler
• On uniprocessor (e.g. Nemesis) gives flexibility
• On multiprocessor (e.g. Psyche) gives
performance across cores
Week 3:
Domain Specific Languages:
Mackerel
The problem
• C is a pain to write OS code in.
• 2 classes of problem:
1. Lack of automatic resource mgmt
2. Hard to express high‐level semantics
High‐level languages to the
rescue?
• Write your OS in Java/Eiffel/C#/etc.
– Has been tried. Several times.
• Problems:
– Lose all control over resource management
– Explicit layout / memory access becomes hard
– Still can’t express high‐level semantics
• (OS code is highly specialized)
– Sufficiently‐expressive languages too slow and too
abstract
• (e.g. Haskell)
Extend C?
• Promising approach:
– NesC: TinyOS’s C dialect with support for modules,
events [Gay 2003]
– Deputy: extensions to C using type inference for
static checks [e.g. Anderson 2009]
– Ivy: evolving C as a language [Brewer 2005]
• So far, little uptake (poor toolchain support?)
Domain specific languages
• Old idea
• Very broad applicability (not just OSes)
• Guy Steele: “design your system as if you were
designing a language anyway”.
• Build a “little language” tailored for the task at
hand
• Generate C which is then compiled with the OS
• In Barrelfish, we use DSLs extensively (4 so far,
and counting)
Typical domain specific language workflow
DSL code → DSL compiler → C code → C compiler → Binary
Advantages
• Highly specialized: capture the exact semantics you want!
• Can check and enforce useful invariants
• Small, easy to learn
• Can be very fast (faster than a programmer could write)
• Dramatically reduces devel/debug time
Of course, there is a downside:
• Lot of effort to write the compiler
• Complicates toolchain management
• May make the code look somewhat alien...
Examples of DSLs in Operating
Systems
• Communication interface definition
• Hardware register access
• Protocol stack design (Click, Prolac)
• Capability type system specification
• Error code definitions
• ...
Hardware register access
Accessing hardware registers is generally fiddly code
• Lots of bit manipulation (registers have many fields)
• Poor C support
– word size, sign extension, volatile semantics
– bitfield structs are implementation specific!
• Consequences of errors are bad
– Very hard to find bugs
– Frequently hangs entire machine
• C code to manipulate registers is tedious to write
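The kind of code in question looks roughly like the following: explicit shifts and masks on fixed-width types, with every access going through a `volatile` pointer. The field position (3 bits starting at bit 4) and all names here are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field: 3 bits wide, starting at bit 4 of a 32-bit register. */
#define FIELD_SHIFT 4u
#define FIELD_MASK  (UINT32_C(0x7) << FIELD_SHIFT)

/* volatile forces exactly one load/store per call, in program order. */
static inline uint32_t reg_read(const volatile uint32_t *reg) { return *reg; }
static inline void reg_write(volatile uint32_t *reg, uint32_t v) { *reg = v; }

/* Extract the field from the current register contents. */
uint32_t field_get(const volatile uint32_t *reg)
{
    return (reg_read(reg) & FIELD_MASK) >> FIELD_SHIFT;
}

/* Read-modify-write: replace the field, keep every other bit. */
void field_set(volatile uint32_t *reg, uint32_t value)
{
    uint32_t v = reg_read(reg);
    v = (v & ~FIELD_MASK) | ((value << FIELD_SHIFT) & FIELD_MASK);
    reg_write(reg, v);
}
```

Even in this tiny case there are three places to get a constant wrong, and a real device has dozens of registers with many fields each.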
Devil Example: Logitech Busmouse
• Hand‐written macros: [listing not reproduced]
• Programmer usage idioms: [listing not reproduced]
• Device specified in the Devil DSL: [listing not reproduced]
What’s Devil generating? [listing not reproduced]
What the programmer gets to write: [listing not reproduced]
Other Devil features
• Pre‐ and post‐conditions
– E.g. index registers used to access other register
banks
– Semaphores which must be held before writing a
register
• “Variables”
– values which are combinations (usually
concatenations) of register values
Mackerel
Barrelfish’s answer to Devil
• Things have changed somewhat in the meantime:
– Lots of address space: index registers are less frequent
⇒ pre‐conditions less important
– Register address spaces more useful (PCI, memory, IO)
– Registers are wider (32 or 64 bits)
⇒ meaningful values rarely split across h/w fields
– Complex devices communicate using descriptor rings
⇒ In‐memory data structures just as important as
registers
Mackerel features
• Goal: specifications should be as close to datasheet
descriptions as possible.
• Basic constructs specify:
– Individual registers
– Register types
– Register arrays
– In‐memory data types
– Collections of constant values
• Makes extensive use of the C compiler’s type system and
inlining
• Comments are incorporated in C printf‐like code
Error messages
• Checks for common mistakes in transcribing
data sheet:
– Register sizes
– Duplicate or overlapping addresses
– Etc.
/barrelfish/devices/omap/omap_uart.dev:251:14: Type 'omap_uart.ACR'
(Implicit type of Auxiliary control register) is a Register Of Unusual
Size (31 bits)
make: *** [armv7/include/dev/omap/omap_uart_dev.h] Error 1
Mackerel features
Mackerel generates:
• C constant definitions for all constant values
• C Type definitions for all register and data types
• Functions to read/write all registers
• Functions to read/write all register and data type
fields
• Functions to snprintf:
– Register values
– Data type values
– Entire device state!
Example device specification
device omap_uart msbfirst ( addr base ) "OMAP4430 UART" {

    register THR wo addr (base, 0x0) "Transmit holding" {
        _       24 ro;
        thr     8 wo    "Transmit holding register";
    };

    register RHR ro also addr (base, 0x0) "Recv holding" {
        _       24 ro;
        rhr     8 ro    "Receive holding register";
    };
    …
};
Annotations:
• omap_uart: device name
• msbfirst: order of bit fields
• ( addr base ): addresses
• THR: register name
• wo: default is write only
• "Transmit holding": description. N.B. Not a comment!
• _: “don’t care” name, never accessed by user code
• also: overlaps another register
[from omap/omap_uart.dev]
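To make the specification concrete, here is a hand-written sketch of the style of C that such a description could correspond to. The names (`uart_t`, `uart_thr_write`, `uart_rhr_read`) are invented for this example and are not Mackerel's actual generated identifiers; the bit positions follow from `msbfirst`, which puts the 24-bit `_` field in the upper bits and the 8-bit data field in the low byte.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device handle holding the `base` argument. */
typedef struct { volatile uint8_t *base; } uart_t;

#define UART_THR_OFFSET 0x0u   /* write-only view of offset 0 */
#define UART_RHR_OFFSET 0x0u   /* read-only view of the same offset ("also") */

/* THR is `wo`: only a write accessor exists, nothing reads it back. */
static inline void uart_thr_write(uart_t *u, uint8_t ch)
{
    /* The upper 24 bits are the `_` field; they are written as zero. */
    *(volatile uint32_t *)(u->base + UART_THR_OFFSET) = (uint32_t)ch;
}

/* RHR is `ro`: only a read accessor exists. */
static inline uint8_t uart_rhr_read(uart_t *u)
{
    uint32_t raw = *(volatile uint32_t *)(u->base + UART_RHR_OFFSET);
    return (uint8_t)(raw & 0xFFu);   /* low 8 bits hold `rhr` */
}
```

Two registers sharing one address is legal only because of the `also` keyword; without it the overlap would be reported as a transcription mistake.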
Example register field attributes
Attribute   Meaning
ro          Read-only: generate no code for writes
wo          Write-only: generate no code for reads
rw          Read/write: default for most registers/fields
rsvd        Reserved: always preserve contents (default for ‘_’)
mbz         Must be zero (can’t be read)
mb1         Must be ones
rw1c        Readable, write ‘1’ to clear
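The `rw1c` and `rsvd` attributes matter because a naive read-modify-write is wrong for them. The sketch below uses an invented register layout (4 `rw1c` flag bits, 12 `rw` bits, 16 `rsvd` bits) and a small software model of the hardware's reaction to a write; none of these names come from Mackerel.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_RW1C_MASK UINT32_C(0x0000000F)  /* hypothetical rw1c flags */
#define STATUS_RW_MASK   UINT32_C(0x0000FFF0)  /* hypothetical rw bits */
#define STATUS_RSVD_MASK UINT32_C(0xFFFF0000)  /* hypothetical rsvd bits */

/* Value to write so that only `flags` are cleared: rw and rsvd bits are
 * written back unchanged, all other rw1c positions are written as 0. */
uint32_t status_clear_value(uint32_t current, uint32_t flags)
{
    uint32_t v = current & ~STATUS_RW1C_MASK;
    return v | (flags & STATUS_RW1C_MASK);
}

/* Software model of the register: a written 1 clears an rw1c flag,
 * rw bits take the written value, rsvd bits keep their contents. */
uint32_t status_model_write(uint32_t current, uint32_t written)
{
    uint32_t cleared = current & ~(written & STATUS_RW1C_MASK);
    return (cleared & ~STATUS_RW_MASK) | (written & STATUS_RW_MASK);
}
```

Writing back the value just read would clear every pending flag at once, which is the kind of bug the attribute lets the generated accessors avoid.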
Constants
constants sw_flow "Software flow control" {
    none        = 0b00  "No transmit flow control";
    xon1xoff1   = 0b10  "XON1, XOFF1";
    xon2xoff2   = 0b01  "XON2, XOFF2";
    xon12xoff12 = 0b11  "XON1, XON2: XOFF1, XOFF2";
};

register EFR wo also addr (base, 0x8) "Enhanced feature register" {
    _                   24 ro;
    auto_cts_en         1   "Enable auto-CTS flow control";
    auto_rts_en         1   "Enable auto-RTS flow control";
    special_char_detect 1   "Enable special character detect";
    enhanced_en         1   "Enable writing to IER";
    sw_tx_flow_control  2 type(sw_flow) "Software TX flow control";
    sw_rx_flow_control  2 type(sw_flow) "Software RX flow control";
};
Annotation: 0b00, 0b10, etc. are binary literals.
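In C terms, the constants block corresponds to an enumeration and the register fields to shift positions. The sketch below assumes EFR belongs to the `msbfirst` device shown earlier, so the first declared field sits in the most significant bits; the identifiers are illustrative, not the names Mackerel emits.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the `sw_flow` constants block. */
enum sw_flow {
    SW_FLOW_NONE        = 0x0,  /* 0b00 */
    SW_FLOW_XON1XOFF1   = 0x2,  /* 0b10 */
    SW_FLOW_XON2XOFF2   = 0x1,  /* 0b01 */
    SW_FLOW_XON12XOFF12 = 0x3   /* 0b11 */
};

/* Bit positions assuming msbfirst: 24 reserved bits occupy 31..8. */
#define EFR_AUTO_CTS_EN_SHIFT         7
#define EFR_AUTO_RTS_EN_SHIFT         6
#define EFR_SPECIAL_CHAR_DETECT_SHIFT 5
#define EFR_ENHANCED_EN_SHIFT         4
#define EFR_SW_TX_SHIFT               2
#define EFR_SW_RX_SHIFT               0

/* Compose an EFR value from a few of its fields; the rest stay 0. */
uint32_t efr_value(int auto_cts, int auto_rts,
                   enum sw_flow tx, enum sw_flow rx)
{
    return ((uint32_t)(auto_cts != 0) << EFR_AUTO_CTS_EN_SHIFT)
         | ((uint32_t)(auto_rts != 0) << EFR_AUTO_RTS_EN_SHIFT)
         | (((uint32_t)tx & 0x3u) << EFR_SW_TX_SHIFT)
         | (((uint32_t)rx & 0x3u) << EFR_SW_RX_SHIFT);
}
```

Typing the two flow-control fields with `type(sw_flow)` is what would let a generated API accept the enumeration rather than a bare integer.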
Register types
register rssim rw addr(base, 0x5864) "RSS interrupt mask" type(uint32);
register rssir rw addr(base, 0x5868) "RSS interrupt request" type(uint32);

regtype wakeup "Wakeup register" {
    lnkc    1 "Link status change";
    mag     1 "Magic packet";
    ex      1 "Directed exact";
    mc      1 "Directed multicast";
    bc      1 "Broadcast";
    arp     1 "ARP request packet";
    ipv4    1 "Directed IPv4";
    ipv6    1 "Directed IPv6";
    _       7 mbz;
    notco   1 "Ignore TCO/management packets";
    flx0    1 "Flexible filter 0 enable";
    flx1    1 "Flexible filter 1 enable";
    flx2    1 "Flexible filter 2 enable";
    flx3    1 "Flexible filter 3 enable";
    _       12 mbz;
};

register wufc rw addr(base, 0x5808) "Wakeup filter control" type(wakeup);
register wus ro addr(base, 0x5810) "Wakeup status" type(wakeup);
Annotation: type(uint32) uses one of the builtin “raw” types.
[from e1000.dev]
Register arrays
regarray vfta rw addr(base, 0x5600)[128]
    "VLAN filter table array" type(uint32);

regarray rah rw addr(base, 0x5404)[16;8] "Receive address high" {
    rah     16 "Receive address high";
    asel    2 type(addrsel) "Address select";
    _       13 mbz;
    av      1 "Address valid";
};
Annotations:
• [128]: 128 contiguous registers
• [16;8]: 16 registers (4 bytes each), one every 8 bytes
[from e1000.dev]
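The two array forms differ only in their stride, which a short address calculation makes explicit. Function and macro names below are hypothetical; the offsets, counts and the 8-byte stride are taken from the two `regarray` declarations.

```c
#include <assert.h>
#include <stdint.h>

#define VFTA_OFFSET 0x5600u
#define VFTA_COUNT  128u
#define RAH_OFFSET  0x5404u
#define RAH_COUNT   16u
#define RAH_STRIDE  8u      /* the ";8" in [16;8] */

/* Byte offset of vfta[i]: contiguous 32-bit registers, so stride 4. */
uint32_t vfta_offset(uint32_t i)
{
    assert(i < VFTA_COUNT);
    return VFTA_OFFSET + i * (uint32_t)sizeof(uint32_t);
}

/* Byte offset of rah[i]: 4-byte registers placed every 8 bytes. */
uint32_t rah_offset(uint32_t i)
{
    assert(i < RAH_COUNT);
    return RAH_OFFSET + i * RAH_STRIDE;
}
```

The gaps between `rah` entries are where a second, interleaved register array can live, which is why the stride is part of the declaration.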
Data types: in-memory
structures
datatype legacy_rdesc lsbfirst(64) "Legacy rx descriptor" {
    addr        64 "Buffer address";
    length      16 "Packet length";
    checksum    16 "Packet checksum";
    // Status
    dd          1 "Descriptor done";
    eop         1 "End of packet";
    ixsm        1 "Ignore checksum indication";
    vp          1 "Packet is 802.1q (matched VET)";
    udpcs       1 "UDP checksum calculated on packet";
    tcpcs       1 "TCP checksum calculated on packet";
    ipcs        1 "IPv4 checksum calculated on packet";
    pif         1 "Passed in-exact filter";
    // Errors
    ce          1 "CRC or alignment error";
    se          1 "Symbol error";
    seq         1 "Sequence error";
    _           2;
    tcpe        1 "TCP/UDP checksum error";
    ipe         1 "IPv4 checksum error";
    rxe         1 "RX data error";
    // VLAN tag field
    vlan        12 "VLAN id";
    cr          1 "Canonical form indicator";
    pri         3 "802.1p priority";
};
Annotation: lsbfirst(64) means the structure is accessed with a “natural” word size of 64 bits.
[from e1000.dev]
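A shift-based reading of this descriptor can be sketched as follows. It assumes that `lsbfirst(64)` lays fields out from the least significant bit of consecutive 64-bit words, so `addr` fills the first word and `length` starts at bit 0 of the second. The struct and accessor names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit words: the buffer address, then all remaining fields. */
struct legacy_rdesc_raw { uint64_t addr; uint64_t fields; };

/* Extract `width` bits (width < 64) starting at bit `shift`. */
static inline uint64_t extract(uint64_t word, unsigned shift, unsigned width)
{
    return (word >> shift) & ((UINT64_C(1) << width) - 1);
}

uint16_t rdesc_length(const struct legacy_rdesc_raw *d)
{ return (uint16_t)extract(d->fields, 0, 16); }

uint16_t rdesc_checksum(const struct legacy_rdesc_raw *d)
{ return (uint16_t)extract(d->fields, 16, 16); }

int rdesc_dd(const struct legacy_rdesc_raw *d)    /* first status bit */
{ return (int)extract(d->fields, 32, 1); }

int rdesc_eop(const struct legacy_rdesc_raw *d)   /* second status bit */
{ return (int)extract(d->fields, 33, 1); }

uint16_t rdesc_vlan(const struct legacy_rdesc_raw *d)
{ return (uint16_t)extract(d->fields, 48, 12); }

uint8_t rdesc_pri(const struct legacy_rdesc_raw *d)
{ return (uint8_t)extract(d->fields, 61, 3); }
```

The status byte (bits 32 to 39), error byte (40 to 47) and VLAN tag (48 to 63) add up to exactly 64 bits with the two 16-bit fields, which is the sort of size check the Mackerel compiler performs automatically.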
C API
See the manual!
Note two backends:
• Bitfield driver: deprecated (why?)
• Shift driver: generates masks and shifts
• Use snprintf()-like functions for debugging
Mackerel: some figures
Lines of code (using David Wheeler’s
SLOCCount):
• 2359 lines of Haskell for the Mackerel
compiler
• 1028 lines of Mackerel for the e1000
specification
• 23762 lines of C generated from e1000.dev
If DSLs are so good...
How come we don’t see more of them in OS
research?
• Quite hard to design a good one
– Except Mackerel, all the DSLs in Barrelfish were
designed after we had an initial C implementation and
understood the functionality.
• Perception: the effort of designing, building, and
implementing a DSL usually outweighs the benefit of using it
– With yesterday’s tools, there is some truth in this
– But . . .
Building a DSL: what does it
take?
DSLs are basically simple compilers:
1. Parser
– Used to be tedious to write
– Gloriously easy these days
– E.g. combinator‐based Monadic parsing in Haskell
2. Back‐end C code generator
– Rather more difficult . . .
Writing a backend for a DSL
The backend takes an AST and generates C code
• Basically: concatenate a set of strings into a C file
• Better: encode subset of C syntax into functional
combinators
Easier. But still:
• Writing code through a level of indirection
• Only captures syntax of C, not intended semantics.
• Can’t automate tests
• Error‐prone
• Annoying to debug
• Ultimately, no assurance it works
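The "concatenate strings" style of backend is easy to picture. The sketch below is written in C purely for illustration (the Mackerel compiler itself is written in Haskell), and `struct field` and `emit_getter` are invented names. It formats the text of one field accessor into a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for one AST node: a named bit field (width < 32). */
struct field { const char *name; unsigned shift; unsigned width; };

/* Emit the text of a C getter for one field into `out`.
 * Returns what snprintf returns: the length the full text needs. */
int emit_getter(char *out, size_t cap, const char *reg, const struct field *f)
{
    return snprintf(out, cap,
        "static inline uint32_t %s_%s_get(uint32_t v)\n"
        "{\n"
        "    return (v >> %u) & 0x%xu;\n"
        "}\n",
        reg, f->name, f->shift, (1u << f->width) - 1u);
}
```

Nothing here knows that the output is C: a typo inside the format string still compiles in the generator and only fails later, when the generated file is built. That is the indirection the list above complains about.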
Filet‐o‐Fish
Filet‐o‐Fish is . . .
• Tool for writing C code generators
• Embedding of a subset of C in Haskell
• Notation for expressing DSL semantics
• Library for creating provably‐correct C code from
semantic specifications
• Used in Barrelfish for (to date) 2 DSLs:
– Fugu defines error codes and an error stack
– Hamlet defines capability type system
Hamlet: specifying the capability type
system
(Yes, Hamlet really is a type of fish)
Recall that Barrelfish uses typed, partitioned
capabilities
• For each capability, we must specify:
– Physical layout in memory
– What it can be retyped to and from
– Valid invocations on the capability
– What happens when it is passed between domains
– etc.
• We capture all this information in a Hamlet
specification
How Filet‐o‐Fish compiles
Hamlet
Defining semantics instead of
syntax:
What does FoF look like?
• For the previous example, Haskell resembles:
validateRetypeCode destType (srcType, validTypes) =
    do srcTypeV <- srcType
       validTypesP <- sequenceSem validTypes
       return (srcTypeV,
               (do returnc $ condition validTypesP))
    where
      condition validTypes = fold orType false validTypes
      orType acc srcType = acc .||. (destType .==. srcType)
Using QuickCheck to test DSLs
Check randomly‐generated ASTs against
semantic assertions:
Affecting the OS design
• Hamlet makes it easy to add new capability types
to Barrelfish
– Led us to encode more functionality into the type
system
– E.g. different cap types for page table levels (On all
architectures)
• Type system enforces page table correctness
• Can encode multiple physical address spaces, etc.
• We expect to push further functionality into
capability system…
Summary
• Used appropriately, DSLs:
– Reduce code complexity
– Though this is rarely, if ever, actually evaluated
• DSLs perhaps seen more as a means to an end...
– Reduce bugs
– Capture (and check) high‐level semantics of the
domain
– Facilitate automated testing and/or correctness proofs
Extra references
• Birrell, A. D. and Nelson, B. J. (1984). Implementing remote
procedure calls. ACM Trans. Comput. Syst., 2(1):39–59.
• Dagand, P.‐E., Baumann, A., and Roscoe, T. (2009).
Filet‐o‐Fish: Practical and Dependable Domain‐Specific
Languages for OS Development. In Proc. 5th Workshop on
Programming Languages and Operating Systems (PLOS
2009).
• Eide, E., Frei, K., Ford, B., Lepreau, J., and Lindstrom, G.
(1997). Flick: A flexible, optimizing IDL compiler. In PLDI,
pages 44–56.
• Hamilton, G. and Kougiouris, P. (1994). The Spring nucleus:
A microkernel for objects. Technical report, Sun
Microsystems Laboratories.
• David Gay, Philip Levis, Robert von Behren, Matt Welsh, Eric Brewer,
and David Culler. 2003. The nesC language: A holistic approach to
networked embedded systems. In Proceedings of the ACM
SIGPLAN 2003 conference on Programming language design and
implementation (PLDI '03). ACM, New York, NY, USA, 1‐11
• Zachary R. Anderson, David Gay, and Mayur Naik. 2009. Lightweight
annotations for controlling sharing in concurrent data structures.
In Proceedings of the 2009 ACM SIGPLAN conference on
Programming language design and implementation (PLDI '09)
• Eric Brewer, Jeremy Condit, Bill McCloskey, and Feng Zhou. 2005.
Thirty years is long enough: getting beyond C. In Proceedings of
the 10th conference on Hot Topics in Operating Systems ‐ Volume
10 (HOTOS'05), Vol. 10. USENIX Association, Berkeley, CA, USA,
14‐14.