Dispatcher

Week 3: Process Dispatch / Mackerel
Advanced Operating Systems (263-3800-00L)
Timothy Roscoe
Autumn Semester 2012
http://www.systems.ethz.ch/courses/fall2012/AOS
© Systems Group | Department of Computer Science | ETH Zürich

Milestone 2
• Processor interrupts
  – More device access
• Simple scheduler
  – Round-robin: we may see better policies later
• Process dispatch
  – Upcalls
  – User-level thread scheduling

Week 3: Process Dispatch

Definitions
• Scheduling: deciding which task to run (later this semester)
• Dispatch: how the chosen task starts (or resumes) execution
• Event: a notification to a task that something happened
• Transport: conveying (non-trivial) data to a task
• IPC: general inter-process communication usually combines dispatch, notification, transport
• These are hard, if not impossible, to separate entirely.

The problem
• Threads are a programming language abstraction
  – (different, possibly parallel activities)
  ⇒ should be lightweight, in the language runtime
• Threads are a kernel abstraction
  – (virtual or physical processors)
  – way to manage “big” resources like CPUs
• Threads need to communicate (either within or between address spaces)
  – Thread and IPC performance critical to applications

What are the options?
1. Implement multiple threads over a single “virtual CPU” (kernel thread)
  – Perhaps many VCPUs per process
2. One user-level thread per “virtual CPU”
  – Multiple kernel threads in a process
3. Some combination of the above
  – Multiplex user threads over kernel threads

1. Many-to-one threads
[Figure: thread-mapping diagram; labels: User, Kernel, CPU 0, CPU 1]

1. Many-to-one threads
• Early “thread libraries”
  – Green threads (original Java VM)
  – GNU Portable Threads
  – Standard student exercise: implement them!
• Sometimes called “pure user-level threads”
  – No kernel support required
  – Also (confusingly) “Lightweight Processes”

Address space layout for user level threads
[Figure: address space layout; labels: Text, Data, BSS, Stack, Thread 1 stack, Thread 2 stack, Thread 3 stack; note: “Just allocate on the heap”]

User-level threads
Older Unices, etc.
• High performance
  – 10x procedure call
• Scalable
• Flexible (application-specific)
• Treat a process as a virtual processor
• But it isn’t:
  – Page faults
  – I/O
  – Multiprocessors

2. One-to-one threads
[Figure: thread-mapping diagram; labels: User, Kernel, CPU 0, CPU 1]

2. One-to-one user threads
• Every user thread is/has a kernel thread.
• Equivalent to:
  – multiple processes sharing an address space
  – Except that “process” now refers to a group of threads
• Most modern OS threads packages:
  – Linux, Solaris, Windows XP, MacOSX, etc.

One-to-one user threads
[Figure: address space layout; labels: Text, Data, BSS, Stack, Thread 1 stack, Thread 2 stack, Thread 3 stack]

Kernel threads
Linux, Vista, L4, etc.
• Excellent integration with the OS
• Slow
  – similar to process switch time
• Inflexible
  – kernel policy
  – Evidence: people implemented user-level threads over kernel threads anyway
  ⇒ same old problems...

Comparison
User-level threads:
• Cheap to create and destroy
• Fast to context switch
• Can block entire process
  – Not just on system calls
  – Page faults!
One-to-one threads:
• Easier to schedule
• Nicely handles blocking
• Memory usage (kernel stack)
• Slow to switch
  – Requires kernel crossing

3. Many-to-many threads
[Figure: thread-mapping diagram; labels: User, Kernel, CPU 0, CPU 1]

3. Many-to-many threads
• Multiplex user-level threads over several kernel-level threads
• Can “pin” user thread to kernel thread for performance/predictability
• Thread migration costs are “interesting”…

Issues with modern threads
• Hard for a runtime to tell:
  – How many user-level threads can run at a time (i.e. how many physical cores are allocated)
  – Which user-level threads can run (i.e.
which physical cores are allocated)
• Severely limits flexibility of user-level scheduler
  – Can critically impact performance of parallel applications.

Scheduler activations
• Basic mechanism: upcall to the ULS (user-level scheduler) from the kernel
  – Context for this: a scheduler activation
• Structurally like a kernel thread but . . .
  – created on-demand in response to events (blocking, preemption, etc.)
• User level threads package built on top
• Hardware: DEC SRC Firefly workstation (7-processor VAX)

Scheduler activations speedup

Scheduler activations memory footprint

Psyche threads
• Similar: aims to remove kernel from most thread scheduling decisions, reflect kernel-level events to user space
• Kernel and ULS share data structures (read/write, read-only)
• Kernel upcalls ULS (“software interrupts”) in a virtual processor for:
  – Timer expiration
  – Imminent preemption (err...)
  – Start of blocking system call
  – Unblocking of a system call
• Shared data structure standardises interface for blocking/unblocking threads

Psyche data structures

Interesting features of Psyche
• Threads given warning of imminent preemption
  – Is there a problem here?
• Upcalls can be nested (stack)
  – Likewise?
• Upcalls can be disabled or queued
• Lots of user space data structures to be pinned
• Unlike Scheduler Activations, doesn’t handle (e.g.) page faults

Nemesis dispatch
• Avoid nested upcalls
  – Activation handler doesn’t need to be reentrant
• Per-domain data structures
  – all user read/write!
• Almost identical in K42, Barrelfish, . . .

Deschedule / Preemption
• If resume bit == 0:
  – Processor state → activation slot
• Else:
  – Processor state → resume slot
• Enter the scheduler

Dispatch / Reschedule
• If resume bit == 0:
  – resume bit ← 1
  – Jump to activation addr on activation stack (small!)
• Else:
  – Processor state ← resume slot
• cf.
disabling interrupts

User-level schedulers in Nemesis
• Upcall handler gets activations on reschedule
• Resume always set on activation
  ⇒ no need for reentrant ULS
• Picks a context slot to run from
  – Slots are a cache for thread contexts
• Clears resume bit and resumes context
  – Implementation: Alpha PALmode call (2 pipeline drains)
  – Must be atomic (or must it?)
• All implemented in user-level library

Dispatch in Barrelfish
• Activations: separate “dispatcher” per process per core
  – Avoid Psyche-like complexity
• No activation stack: disable mechanism (à la Nemesis)
• Multiple upcall entries (from K42):
  – Preemption / reschedule
  – Page fault
  – Exception
  – etc.
• User-level thread schedulers span address spaces across cores (dispatchers)

Summary of dispatch
• Plenty of ways to deliver processor to an application
• Expose underlying scheduler decisions
  ⇒ give more control to user-level thread scheduler
• On uniprocessor (e.g. Nemesis) gives flexibility
• On multiprocessor (e.g. Psyche) gives performance across cores

Week 3: Domain Specific Languages: Mackerel

The problem
• C is a pain to write OS code in.
• 2 classes of problem:
  1. Lack of automatic resource mgmt
  2. Hard to express high-level semantics

High-level languages to the rescue?
• Write your OS in Java/Eiffel/C#/etc.
  – Has been tried. Several times.
• Problems:
  – Lose all control over resource management
  – Explicit layout / memory access becomes hard
  – Still can’t express high-level semantics
    • (OS code is highly specialized)
  – Sufficiently-expressive languages too slow and too abstract
    • (e.g. Haskell)

Extend C?
• Promising approach:
  – NesC: TinyOS’s C dialect with support for modules, events [Gay 2003]
  – Deputy: extensions to C using type inference for static checks [e.g. Anderson 2009]
  – Ivy: evolving C as a language [Brewer 2005]
• So far, little uptake (poor toolchain support?)
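
Looking back at the Nemesis “Deschedule / Preemption” and “Dispatch / Reschedule” slides from the first half of this lecture, the resume-bit protocol can be sketched in C. This is a simulation under stated assumptions, not Nemesis code: the names cpu_state_t, domain_t, deschedule, dispatch and uls_resume are hypothetical, processor state is reduced to a program counter and a stack pointer, and descheduling is read as saving the processor state into the selected slot.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the saved processor state. */
typedef struct {
    unsigned long pc;
    unsigned long sp;
} cpu_state_t;

/* Hypothetical per-domain structure shared by kernel and user level. */
typedef struct {
    bool resume;                  /* the "resume bit" */
    cpu_state_t activation_slot;  /* used when resume bit == 0 */
    cpu_state_t resume_slot;      /* used when resume bit == 1 */
    unsigned long activation_addr; /* entry point of the upcall handler */
    unsigned long activation_sp;   /* top of the (small) activation stack */
} domain_t;

/* Kernel side: the domain loses the processor. The resume bit selects
 * which slot receives the interrupted state. */
static void deschedule(domain_t *d, const cpu_state_t *cpu)
{
    if (!d->resume) {
        d->activation_slot = *cpu;
    } else {
        d->resume_slot = *cpu;
    }
    /* ... the kernel would now enter the scheduler ... */
}

/* Kernel side: the domain gets the processor back.
 * Returns true if the domain is entered via an upcall. */
static bool dispatch(domain_t *d, cpu_state_t *cpu)
{
    if (!d->resume) {
        d->resume = true;               /* no further upcalls until cleared */
        cpu->pc = d->activation_addr;   /* jump to the activation handler */
        cpu->sp = d->activation_sp;     /* on the activation stack */
        return true;
    }
    *cpu = d->resume_slot;              /* plain resume, no upcall */
    return false;
}

/* User side: the user-level scheduler has picked a context slot; it
 * clears the resume bit and resumes that context. On real hardware
 * these two steps raise the atomicity question noted on the slide. */
static void uls_resume(domain_t *d, cpu_state_t *cpu, const cpu_state_t *ctx)
{
    d->resume = false;
    *cpu = *ctx;
}
```

With the resume bit clear, deschedule followed by dispatch produces an upcall and sets the bit. A second preemption while the handler runs saves into the resume slot, so the next dispatch simply resumes the handler rather than re-entering it, which is why the upcall handler need not be reentrant.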

Domain specific languages
• Old idea
• Very broad applicability (not just OSes)
• Guy Steele: “design your system as if you were designing a language anyway”.
• Build a “little language” tailored for the task at hand
• Generate C which is then compiled with the OS
• In Barrelfish, we use DSLs extensively (4 so far, and counting)

Typical domain specific language workflow
DSL code → DSL compiler → C code → C compiler → Binary

Advantages
• Highly specialized: capture the exact semantics you want!
• Can check and enforce useful invariants
• Small, easy to learn
• Can be very fast (faster than a programmer could write)
• Dramatically reduces devel/debug time
Of course, there is a downside:
• Lot of effort to write the compiler
• Complicates toolchain management
• May make the code look somewhat alien...

Examples of DSLs in Operating Systems
• Communication interface definition
• Hardware register access
• Protocol stack design (Click, Prolac)
• Capability type system specification
• Error code definitions
• ...

Hardware register access
Accessing hardware registers is generally fiddly code
• Lots of bit manipulation (registers have many fields)
• Poor C support
  – word size, sign extension, volatile semantics
  – bitfield structs are implementation specific!
• Consequences of errors are bad
  – Very hard to find bugs
  – Frequently hangs entire machine
• C code to manipulate registers is tedious to write

Devil Example: Logitech Busmouse
• Hand-written macros:
• Programmer usage idioms:

Devil Example: Logitech Busmouse
• Device specified in the Devil DSL:

What’s Devil generating?

What the programmer gets to write:

Other Devil features
• Pre- and post-conditions
  – E.g.
index registers used to access other register banks
  – Semaphores which must be held before writing a register
• “Variables”
  – values which are combinations (usually concatenations) of register values

Mackerel
Barrelfish’s answer to Devil
• Things have changed somewhat in the meantime:
  – Lots of address space: index registers are less frequent
    ⇒ pre-conditions less important
  – Register address spaces more useful (PCI, memory, IO)
  – Registers are wider (32 or 64 bits)
    ⇒ meaningful values rarely split across h/w fields
  – Complex devices communicate using descriptor rings
    ⇒ In-memory data structures just as important as registers

Mackerel features
• Goal: specifications should be as close to datasheet descriptions as possible.
• Basic constructs specify:
  – Individual registers
  – Register types
  – Register arrays
  – In-memory data types
  – Collections of constant values
• Make extensive use of C compiler’s type system and inlining
• Comments are incorporated in C printf-like code

Error messages
• Checks for common mistakes in transcribing data sheet:
  – Register sizes
  – Duplicate or overlapping addresses
  – Etc.

    /barrelfish/devices/omap/omap_uart.dev:251:14: Type 'omap_uart.ACR' (Implicit type of Auxiliary control register) is a Register Of Unusual Size (31 bits)
    make: *** [armv7/include/dev/omap/omap_uart_dev.h] Error 1

Mackerel features
Mackerel generates:
• C constant definitions for all constant values
• C type definitions for all register and data types
• Functions to read/write all registers
• Functions to read/write all register and data type fields
• Functions to snprintf:
  – Register values
  – Data type values
  – Entire device state!

Example device specification

    device omap_uart msbfirst ( addr base ) "OMAP4430 UART" {
        register THR wo addr (base, 0x0) "Transmit holding" {
            _    24 ro;
            thr   8 wo  "Transmit holding register";
        };
        register RHR ro also addr (base, 0x0) "Recv holding" {
            _    24 ro;
            rhr   8 ro  "Receive holding register";
        };
        …
    };
    [from omap/omap_uart.dev]

Slide annotations:
• omap_uart: device name
• msbfirst: order of bit fields
• ( addr base ): addresses
• THR: register
• wo: default: write only
• "Transmit holding": description (N.B. not a comment!)
• _: “Don’t care”: name never accessed by user code
• also: overlaps another register

Example register field attributes

    Attribute   Meaning
    ro          Read-only: generate no code for writes
    wo          Write-only: generate no code for reads
    rw          Read/write: default for most registers/fields
    rsvd        Reserved: always preserve contents (default for ‘_’)
    mbz         Must be zero (can’t be read)
    mb1         Must be ones
    rw1c        Readable, write ‘1’ to clear

Constants

    constants sw_flow "Software flow control" {
        none        = 0b00 "No transmit flow control";
        xon1xoff1   = 0b10 "XON1, XOFF1";
        xon2xoff2   = 0b01 "XON2, XOFF2";
        xon12xoff12 = 0b11 "XON1, XON2: XOFF1, XOFF2";
    };

    register EFR wo also addr (base, 0x8) "Enhanced feature register" {
        _                   24 ro;
        auto_cts_en          1 "Enable auto-CTS flow control";
        auto_rts_en          1 "Enable auto-RTS flow control";
        special_char_detect  1 "Enable special character detect";
        enhanced_en          1 "Enable writing to IER";
        sw_tx_flow_control   2 type(sw_flow) "Software TX flow control";
        sw_rx_flow_control   2 type(sw_flow) "Software RX flow control";
    };

Slide annotation:
• 0b00, 0b10, ...: binary literals

Register types

    register rssim rw addr(base, 0x5864) "RSS interrupt mask" type(uint32);
    register rssir rw addr(base, 0x5868) "RSS interrupt request" type(uint32);

    regtype wakeup "Wakeup register" {
        lnkc   1 "Link status change";
        mag    1 "Magic packet";
        ex     1 "Directed exact";
        mc     1 "Directed multicast";
        bc     1 "Broadcast";
        arp    1 "ARP request packet";
        ipv4   1 "Directed IPv4";
        ipv6   1 "Directed IPv6";
        _      7 mbz;
        notco  1 "Ignore TCO/management packets";
        flx0   1 "Flexible filter 0 enable";
        flx1   1 "Flexible filter 1 enable";
        flx2   1 "Flexible filter 2 enable";
        flx3   1 "Flexible filter 3 enable";
        _     12 mbz;
    };

    register wufc rw addr(base, 0x5808) "Wakeup filter control" type(wakeup);
    register wus ro addr(base, 0x5810) "Wakeup status" type(wakeup);
    [from e1000.dev]

Slide annotation:
• type(uint32): builtin “raw” types

Register arrays

    regarray vfta rw addr(base, 0x5600)[128] "VLAN filter table array" type(uint32);

    regarray rah rw addr(base, 0x5404)[16;8] "Receive address high" {
        rah   16 "Receive address high";
        asel   2 type(addrsel) "Address select";
        _     13 mbz;
        av     1 "Address valid";
    };
    [from e1000.dev]

Slide annotations:
• [128]: 128 contiguous registers
• [16;8]: 16 registers (4 bytes each), one every 8 bytes

Data types: in-memory structures

    datatype legacy_rdesc lsbfirst(64) "Legacy rx descriptor" {
        addr     64 "Buffer address";
        length   16 "Packet length";
        checksum 16 "Packet checksum";
        // Status
        dd        1 "Descriptor done";
        eop       1 "End of packet";
        ixsm      1 "Ignore checksum indication";
        vp        1 "Packet is 802.1q (matched VET)";
        udpcs     1 "UDP checksum calculated on packet";
        tcpcs     1 "TCP checksum calculated on packet";
        ipcs      1 "IPv4 checksum calculated on packet";
        pif       1 "Passed in-exact filter";
        // Errors
        ce        1 "CRC or alignment error";
        se        1 "Symbol error";
        seq       1 "Sequence error";
        _         2;
        tcpe      1 "TCP/UDP checksum error";
        ipe       1 "IPv4 checksum error";
        rxe       1 "RX data error";
        // VLAN tag field
        vlan     12 "VLAN id";
        cr        1 "Canonical form indicator";
        pri       3 "802.1p priority";
    };
    [from e1000.dev]

Slide annotation:
• lsbfirst(64): accessed with a “natural” word size of 64 bits

C API
See the manual! Note two backends:
• Bitfield driver: deprecated (why?)
• Shift driver: generates masks and shifts
• Use snprintf()-like functions for debugging

Mackerel: some figures
Lines of code (using David Wheeler’s SLOCCount):
• 2359 lines of Haskell for the Mackerel compiler
• 1028 lines of Mackerel for the e1000 specification
• 23762 lines of C generated from e1000.dev

If DSLs are so good...
How come we don’t see more of them in OS research?
• Quite hard to design a good one
  – Except Mackerel, all the DSLs in Barrelfish were designed after we had an initial C implementation and understood the functionality.
• Perception: the benefit of a DSL rarely outweighs the cost of designing, building, and implementing it
  – With yesterday’s tools, there is some truth in this
  – But . . .

Building a DSL: what does it take?
DSLs are basically simple compilers:
1. Parser
  – Used to be tedious to write
  – Gloriously easy these days
  – E.g. combinator-based monadic parsing in Haskell
2. Back-end C code generator
  – Rather more difficult . . .

Writing a backend for a DSL
The backend takes an AST and generates C code
• Basically: concatenate a set of strings into a C file
• Better: encode subset of C syntax into functional combinators
Easier. But still:
• Writing code through a level of indirection
• Only captures syntax of C, not intended semantics.
• Can’t automate tests
• Error-prone
• Annoying to debug
• Ultimately, no assurance it works

Filet-o-Fish
Filet-o-Fish is . . .
• Tool for writing C code generators
• Embedding of a subset of C in Haskell
• Notation for expressing DSL semantics
• Library for creating provably-correct C code from semantic specifications
• Used in Barrelfish for (to date) 2 DSLs:
  – Fugu defines error codes and an error stack
  – Hamlet defines capability type system

Hamlet: specifying the capability type system
(Yes, Hamlet really is a type of fish)
Recall that Barrelfish uses typed, partitioned capabilities
• For each capability, we must specify:
  – Physical layout in memory
  – What it can be retyped to and from
  – Valid invocations on the capability
  – What happens when it is passed between domains
  – etc.
• We capture all this information in a Hamlet specification

How Filet-o-Fish compiles Hamlet

Defining semantics instead of syntax:

What does FoF look like?
• For the previous example, Haskell resembles:

    validateRetypeCode destType (srcType, validTypes) =
        do srcTypeV <- srcType
           validTypesP <- sequenceSem validTypes
           return (srcTypeV,
                   (do returnc $ condition validTypesP))
        where condition validTypes = fold orType false validTypes
              orType acc srcType = acc .||. (destType .==. srcType)

Using QuickCheck to test DSLs
Check randomly-generated ASTs against semantic assertions:

Affecting the OS design
• Hamlet makes it easy to add new capability types to Barrelfish
  – Led us to encode more functionality into the type system
  – E.g. different cap types for page table levels (on all architectures)
• Type system enforces page table correctness
• Can encode multiple physical address spaces, etc.
• We expect to push further functionality into capability system…

Summary
• Used appropriately:
  – Reduce code complexity
  – Though rarely, if ever, actually evaluated
• DSLs perhaps seen more as a means to an end...
  – Reduce bugs
  – Capture (and check) high-level semantics of the domain
  – Facilitate automated testing and/or correctness proofs

Extra references
• Birrell, A. D. and Nelson, B. J. (1984). Implementing remote procedure calls. ACM Trans. Comput. Syst., 2(1):39–59.
• Dagand, P.-E., Baumann, A., and Roscoe, T. (2009). Filet-o-Fish: Practical and Dependable Domain-Specific Languages for OS Development. In Proc. 5th Workshop on Programming Languages and Operating Systems (PLOS 2009).
• Eide, E., Frei, K., Ford, B., Lepreau, J., and Lindstrom, G. (1997). Flick: A flexible, optimizing IDL compiler. In PLDI, pages 44–56.
• Hamilton, G. and Kougiouris, P. (1994). The Spring nucleus: A microkernel for objects. Technical report, Sun Microsystems Laboratories.
• David Gay, Philip Levis, Robert von Behren, Matt Welsh, Eric Brewer, and David Culler. 2003. The nesC language: A holistic approach to networked embedded systems. In Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation (PLDI '03). ACM, New York, NY, USA, 1-11.
• Zachary R. Anderson, David Gay, and Mayur Naik. 2009. Lightweight annotations for controlling sharing in concurrent data structures.
In Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation (PLDI '09).
• Eric Brewer, Jeremy Condit, Bill McCloskey, and Feng Zhou. 2005. Thirty years is long enough: getting beyond C. In Proceedings of the 10th conference on Hot Topics in Operating Systems - Volume 10 (HOTOS'05), Vol. 10. USENIX Association, Berkeley, CA, USA, 14-14.
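
Backup example for the “C API” slide: a shift-driver style accessor is just masks and shifts wrapped in inline functions. The sketch below hand-writes such accessors for the EFR register from the “Constants” slide, assuming the msbfirst layout places sw_rx_flow_control in bits 1:0, sw_tx_flow_control in bits 3:2 and enhanced_en in bit 4. Every C name here (efr_t, field_mask, efr_sw_tx_insert and so on) is invented for illustration and is not the naming Mackerel generates; the Mackerel manual documents the real API.

```c
#include <stdint.h>

/* Hypothetical C type for the raw 32-bit EFR register value. */
typedef uint32_t efr_t;

/* Values of the sw_flow constants group from the specification. */
enum sw_flow {
    SW_FLOW_NONE        = 0x0,  /* 0b00 */
    SW_FLOW_XON2XOFF2   = 0x1,  /* 0b01 */
    SW_FLOW_XON1XOFF1   = 0x2,  /* 0b10 */
    SW_FLOW_XON12XOFF12 = 0x3   /* 0b11 */
};

/* Assumed field positions (msbfirst: the last-listed field is the LSB). */
enum {
    EFR_SW_RX_SHIFT = 0,       EFR_SW_RX_WIDTH = 2,
    EFR_SW_TX_SHIFT = 2,       EFR_SW_TX_WIDTH = 2,
    EFR_ENHANCED_EN_SHIFT = 4, EFR_ENHANCED_EN_WIDTH = 1
};

/* Mask covering 'width' bits starting at bit 'shift' (width < 32). */
static inline uint32_t field_mask(unsigned shift, unsigned width)
{
    return ((UINT32_C(1) << width) - 1u) << shift;
}

/* Read one field out of a register value. */
static inline uint32_t field_extract(efr_t v, unsigned shift, unsigned width)
{
    return (v & field_mask(shift, width)) >> shift;
}

/* Return a copy of 'v' with one field replaced; other bits are preserved. */
static inline efr_t field_insert(efr_t v, unsigned shift, unsigned width,
                                 uint32_t field)
{
    uint32_t m = field_mask(shift, width);
    return (v & ~m) | ((field << shift) & m);
}

/* Typed per-field accessors, in the spirit of generated code. */
static inline enum sw_flow efr_sw_tx_extract(efr_t v)
{
    return (enum sw_flow)field_extract(v, EFR_SW_TX_SHIFT, EFR_SW_TX_WIDTH);
}

static inline efr_t efr_sw_tx_insert(efr_t v, enum sw_flow f)
{
    return field_insert(v, EFR_SW_TX_SHIFT, EFR_SW_TX_WIDTH, (uint32_t)f);
}

static inline enum sw_flow efr_sw_rx_extract(efr_t v)
{
    return (enum sw_flow)field_extract(v, EFR_SW_RX_SHIFT, EFR_SW_RX_WIDTH);
}

static inline efr_t efr_enhanced_en_insert(efr_t v, int on)
{
    return field_insert(v, EFR_ENHANCED_EN_SHIFT, EFR_ENHANCED_EN_WIDTH,
                        on ? 1u : 0u);
}
```

Because the accessors operate on plain integer values rather than C bitfield structs, the bit layout is fixed by the shifts and masks instead of being left to the compiler, which is one plausible answer to the slide's “deprecated (why?)” question about the bitfield driver.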