Deterministic Parallelism via Liquid Effects

Ming Kawaguchi    Patrick Rondon    Alexander Bakst∗    Ranjit Jhala
University of California, San Diego
{mwookawa,prondon,abakst,jhala}@cs.ucsd.edu

Abstract

Shared memory multithreading is a popular approach to parallel programming, but it is also fiendishly hard to get right. We present Liquid Effects, a type-and-effect system based on refinement types which allows for fine-grained, low-level, shared memory multithreading while statically guaranteeing that a program is deterministic. Liquid Effects records the effect of an expression as a formula in first-order logic, making our type-and-effect system highly expressive. Further, effects like Read and Write are recorded in Liquid Effects as ordinary uninterpreted predicates, leaving the effect system open to extension by the user. By building our system as an extension to an existing dependent refinement type system, our system gains precise value- and branch-sensitive reasoning about effects. Finally, our system exploits the Liquid Types refinement type inference technique to automatically infer refinement types and effects. We have implemented our type-and-effect checking techniques in CSOLVE, a refinement type inference system for C programs. We demonstrate how CSOLVE uses Liquid Effects to prove the determinism of a variety of benchmarks.

Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification – Validation; D.1.3 [Programming Techniques]: Concurrent Programming – Parallel Programming

General Terms Languages, Reliability, Verification

Keywords Liquid Types, Type Inference, Dependent Types, C, Safe Parallel Programming, Determinism

1. Introduction

How do we program multi-core hardware? While many models have been proposed, the model of multiple sequential threads concurrently executing over a single shared memory remains popular due to its efficiency, its universal support in mainstream programming languages, and its conceptual simplicity.
Unfortunately, shared memory multithreading is fiendishly hard to get right, due to the inherent non-determinism of thread scheduling. Unless the programmer is exceptionally vigilant, concurrent accesses to shared data can result in non-deterministic behaviors in the program, potentially yielding difficult-to-reproduce “heisenbugs” whose appearance depends on obscurities in the bowels of the operating system’s scheduler and which are hence notoriously hard to isolate and fix.

Determinism By Default. One way forward is to make parallel programs “deterministic by default” [2] such that, no matter how threads are scheduled, program behavior remains the same, eliminating unruly heisenbugs and allowing the programmer to reason about their parallel programs as if they were sequential. In recent years, many static and dynamic approaches have been proposed for ensuring determinism in parallel programs. While dynamic approaches allow arbitrary data sharing patterns, they also impose non-trivial run-time overheads and hence are best suited to situations where there is relatively little sharing [5]. In contrast, while static approaches have nearly no run-time overhead, they have been limited to high-level languages like Haskell [7] and Java [2] and require that shared structures be refactored into specific types or classes for which deterministic compilation strategies exist, thereby restricting the scope of sharing patterns.

∗ This work was supported by NSF grants CCF-0644361, NSF-0964702, NSF-1029783, and a generous gift from Microsoft Research.

Liquid Effects. In this paper, we present Liquid Effects, a type-and-effect system based on refinement type inference which uses recent advances in SMT solvers to provide the best of both worlds: it allows for fine-grained shared memory multithreading in low-level languages with program-specific data access patterns, and yet statically guarantees that programs are deterministic.
Any system that meets these goals must satisfy several criteria. First, the system must support precise and expressive effect specifications. To precisely characterize the effect of program statements, it must reason about complex heap access patterns, including those not foreseen by the system’s designers, and it must be able to reason about relations between program values like loop bounds and array segment sizes. Second, the system must be extensible with user-defined effects to track effects which are domain-specific and thus could not be foreseen by the system’s designers. Finally, the system must support effect inference, so that the programmer is not overwhelmed by the burden of providing effect annotations.

1. Precise and Expressive Effect Specifications. Our Liquid Effects type-and-effect system expresses the effect of a statement on the heap as a formula in first-order logic which classifies which addresses in the heap are accessed and with what effect they were accessed, for example, whether the data was read or written. Expressing effects as first-order formulas makes our type-and-effect system highly expressive: for example, using the decidable theory of linear arithmetic, it is simple to express heap access patterns like processing chunks of an array in parallel or performing strided accesses in a number of separate threads. Using first-order formulas for effects ensures that our system can express complex access patterns not foreseen by the system’s designers, and allows us to incorporate powerful off-the-shelf SMT solvers in our system for reasoning about effects. Further, by building our type-and-effect system as an extension to an existing dependent refinement type system and allowing effect formulas to reference program variables, our system gains precise value- and branch-sensitive reasoning about effects.

2. Extensibility.
Our effect formulas specify how locations on the heap are affected using effect labeling predicates like Read and Write. These effect labeling predicates are ordinary uninterpreted predicates in first-order logic; our system does not treat effect labeling predicates specially. Thus, we are able to provide extensibility by parameterizing our system over a set of user-defined effect labeling predicates. Further, we allow the user to give commutativity declarations specifying which pairs of effects have benign interactions when they simultaneously occur in separate threads of execution. This enables users to track domain-specific effects not envisioned by the designers of the type-and-effect system.

3. Effect Inference. Our type rules can be easily recast as an algorithm for generating constraints over unknown refinement types and effect formulas; these constraints can then be solved using the Liquid Types technique for invariant inference, thus yielding a highly-automatic type-based method for proving determinism.

To illustrate the utility of Liquid Effects, we have implemented our type-and-effect checking techniques in CSOLVE, a refinement type inference system for C programs based on Liquid Types. We demonstrate how CSOLVE uses Liquid Effects to prove the determinism of a variety of benchmarks from the literature requiring precise, value-aware tracking of both built-in and user-defined heap effects while imposing a low annotation burden on the user. As a result, CSOLVE opens the door to efficient, deterministic programming in mainstream languages like C.

2. Overview

We start with a high-level overview of our approach. Figures 1 and 2 show two functions, sum1 and sum2 respectively, which add up the elements of an array of size 2^k in-place by dividing the work up into independent sub-computations that can be executed in parallel. At the end of both procedures, the first element of the array contains the final sum. In this section, we demonstrate how our system seamlessly combines path-sensitive reasoning using refinement types, heap effect tracking, and SMT-based reasoning to prove that sum1 and sum2 are deterministic.

2.1 Contiguous Partitions

The function sum1 sums array a of length len using the auxiliary function sumBlock, which sums the elements of the contiguous segment of array a consisting of len-many elements and beginning at index i. Given an array segment of length len, sumBlock computes the sum of the segment’s elements by dividing it into two contiguous subsegments which are recursively summed using sumBlock. The recursive calls place the sum of each subsegment in the first element of the segment; sumBlock computes its final result by summing the first element of each subsegment and storing it in the first element of the entire array segment.

  void sumBlock (char *a, int i, int len) {
    if (len <= 1) return;
    int hl = len / 2;
    cobegin {
      sumBlock (a, i, hl);
      sumBlock (a, i + hl, len - hl);
    }
    a[i] += a[i + hl];
  }

  int sum1 (char *a, int len) {
    sumBlock (a, 0, len);
    return a[0];
  }

Figure 1. Parallel Array Summation With Contiguous Partitions

Cobegin-Blocks. To improve performance, we can perform the two recursive calls inside sumBlock in parallel. This is accomplished using the cobegin construct, which evaluates each statement in a block in parallel. Note that the recursive calls both read and write to the array passed to sumBlock; thus, to prove that the program is deterministic, we have to show that the part of the array that is written by each call does not overlap with the portion of the array that is read by the other. In the following, we demonstrate how our system is able to show that the recursive calls access disjoint parts of the array and thus that the function sumBlock is deterministic.

Effect Formulas. We compute the region of the array accessed by sumBlock as an effect formula that describes which pointers into the array are accessed by a call to sumBlock with parameters i and len. We note that the final line of sumBlock writes element i of a. We state the effect of this write as the formula¹

    Σa[i] ≐ ν = a + i

We interpret the above formula as “the pointer ν accessed by the expression a[i] is equal to the pointer obtained by incrementing pointer a by i.” The final line of sumBlock also reads element i + hl of a. The effect of this read is stated as a similar formula:

    Σa[i+hl] ≐ ν = a + i + hl

The total effect of the final line of sumBlock, stating which portion of the heap it accesses, is then given by the disjunction of the two effect formulas:

    Σa[i] ⊕ Σa[i+hl] ≐ ν = a + i ∨ ν = a + i + hl

The above formula says that the heap locations accessed in the final line of sumBlock are exactly the locations corresponding to a[i] and a[i + hl]. Having determined the effect of the final line of sumBlock as a formula over the local variables a, i, and hl, we make two observations to determine the effect of the entire sumBlock function as a formula over its arguments a, i, and len. First, we observe that, in the case where len = 2, a[i + hl] = a[i + len − 1]. From this, we can see inductively that a call to sumBlock with index i and length len will access elements a[i] to a[i + len − 1]. Thus, we can soundly ascribe sumBlock the effect

    ΣsumBlock ≐ a + i ≤ ν ∧ ν < a + i + len.

That is, sumBlock will access all elements in the given array segment.

Determinism via Effect Disjointness. We are now ready to prove that the recursive calls within sumBlock access disjoint regions of the heap and thus that sumBlock behaves deterministically when the calls are executed in parallel.
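As a concrete sanity check, the behavior claimed above can be simulated. The following Python sketch (an illustration, not part of CSOLVE or the type system) runs sumBlock sequentially, records the set of indices each recursive call touches, and confirms that the two cobegin branches access disjoint regions, all of which fall in the region described by ΣsumBlock:

```python
def sum_block(a, i, length, accessed):
    """Sum a[i:i+length] into a[i]; record every index read or written."""
    if length <= 1:
        return
    hl = length // 2
    left, right = set(), set()
    sum_block(a, i, hl, left)                 # first cobegin branch
    sum_block(a, i + hl, length - hl, right)  # second cobegin branch
    assert left.isdisjoint(right)             # the branches never interfere
    accessed.update(left, right)
    a[i] += a[i + hl]            # combining step: writes a[i], reads a[i+hl]
    accessed.update({i, i + hl})

a = list(range(8))
acc = set()
sum_block(a, 0, len(a), acc)
assert a[0] == sum(range(8))     # the sum lands in a[0]
assert acc == set(range(8))      # all accesses fall in [i, i + len)
```

The final assertion mirrors the effect ΣsumBlock: every access of the call with arguments (0, len) lies in the segment [i, i + len).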
We determine the effect of the first recursive call by instantiating sumBlock’s effect, i.e., by replacing the formals a, i, and len with the actuals a, i, and hl, respectively:

    ΣCall 1 ≐ a + i ≤ ν ∧ ν < a + i + hl

The resulting effect states that the first recursive call only accesses array elements a[i] through a[i + hl − 1]. We determine the effect of the second recursive call with a similar instantiation of sumBlock’s effect:

    ΣCall 2 ≐ a + i + hl ≤ ν ∧ ν < a + i + len

All that remains is to show that the effects ΣCall 1 and ΣCall 2 are disjoint. We do so by asking an SMT solver to prove the unsatisfiability of the conjoined access intersection formula ΣUnsafe = ΣCall 1 ∧ ΣCall 2, whose inconsistency establishes that the intersection of the sets of pointers ν accessed by both recursive calls is empty.

¹ Here we associate effects with program expressions like a[i]. However, in the technical development and our implementation, effects are associated with heap locations in order to soundly account for aliasing (section 4).

  declare effect Accumulate;
  declare Accumulate commutes with Accumulate;

  void accumLog (char *l, int j)
    effect (&l[j], Accumulate(v) && !Read(v) && !Write(v));

  void sumStride (char *a, int stride, char *log) {
    foreach (i, 0, THREADS) {
      for (int j = i; j < stride; j += THREADS) {
        a[j] += a[j + stride];
        accumLog (log, j);
      }
    }
  }

  int sum2 (char *a, int len) {
    log = (char *) malloc (len);
    for (int stride = len/2; stride > 0; stride /= 2)
      sumStride (a, stride, log);
    return a[0];
  }

Figure 2. Parallel Array Summation With Strided Partitions

2.2 Complex Partitions

In the previous example, function sum1 computed the sum of an array’s elements by recursively subdividing the array into halves and operating on each half separately before combining the results. We were able to demonstrate that sum1 is deterministic by showing that the concurrent recursive calls to sumBlock operate on disjoint contiguous segments of the array.
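The interval reasoning behind that SMT query can also be replayed by brute force. The Python sketch below (the base address and sampling ranges are arbitrary choices for illustration, not from the paper) encodes the two instantiated effects as predicates and checks that no pointer ν satisfies both, i.e., that ΣCall 1 ∧ ΣCall 2 is unsatisfiable on all sampled values:

```python
def call1(nu, a, i, hl):
    # Sigma_Call1: a + i <= nu < a + i + hl
    return a + i <= nu < a + i + hl

def call2(nu, a, i, hl, length):
    # Sigma_Call2: a + i + hl <= nu < a + i + len
    return a + i + hl <= nu < a + i + length

a = 1000                         # arbitrary base address (an assumption)
for i in range(0, 20):
    for length in range(2, 20):
        hl = length // 2
        for nu in range(a, a + 64):
            assert not (call1(nu, a, i, hl) and call2(nu, a, i, hl, length))
```

An SMT solver discharges the same obligation symbolically, for all values at once, rather than by enumeration.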
We now show that our system is capable of proving determinism even when the array access patterns are significantly more complex. Function sum2 uses the auxiliary function sumStride, which sums an array’s elements using the following strategy: First, the array is divided into two contiguous segments. Each element in the first half is added to the corresponding element in the second half, and the element in the first half is replaced with the result. The problem is now reduced to summing only the first half of the array, which proceeds in the same fashion, until we can reduce the problem no further, at which point the sum of all the array elements is contained in the first element of the array.

Foreach Blocks. To increase performance, the pairwise additions between corresponding elements in the first and second halves of the array are performed in parallel, with the level of parallelism determined by a compile-time constant THREADS. The parallel threads are spawned using the foreach construct. The statement foreach (i, l, u) s; executes statement s once with each possible binding of i in the range [l, u), in parallel. The foreach loop within sumStride spawns THREADS-many threads. Thread i processes elements i, i + THREADS, i + 2 ∗ THREADS, etc., of array a. While this strided decomposition may appear contrived, it is commonly used in GPU programming to maximize memory bandwidth by enabling “adjacent” threads to access adjacent array cells via memory coalescing [1].

To prove that the foreach loop within sumStride is deterministic, we must show that no location that is written in one iteration is either read from or written to in another iteration. (We ignore the effect of the call to the function accumLog for now and return to it later.)
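For intuition, the strided schedule can be simulated sequentially. The following Python sketch (THREADS = 4 is an arbitrary choice; the accumLog instrumentation is omitted) mirrors sumStride and sum2 from Figure 2 and checks that the strided reduction computes the correct sum:

```python
THREADS = 4  # compile-time constant, as in the paper

def sum_stride(a, stride):
    # "Thread" i handles j = i, i + THREADS, i + 2*THREADS, ... below stride.
    for i in range(THREADS):
        for j in range(i, stride, THREADS):
            a[j] += a[j + stride]

def sum2(a):
    # Halve the stride until the whole sum is folded into a[0].
    stride = len(a) // 2
    while stride > 0:
        sum_stride(a, stride)
        stride //= 2
    return a[0]

a = list(range(16))
assert sum2(a) == sum(range(16))
```

In the real program the outer loop over i runs as THREADS parallel threads; the simulation simply fixes one schedule, which suffices once determinism is proved.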
Note that this differs from the situation with sum1: in order to demonstrate that sum1 is deterministic, it was not necessary to consider read and write effects separately; it was enough to know that recursive calls to sumBlock operated on disjoint portions of the heap. However, the access pattern of a single iteration of the foreach loop in sumStride is considerably more complicated, and reasoning about noninterference between loop iterations will require tracking not only which locations are affected but precisely how they are affected, that is, whether they are read from or written to.

Labeled Effect Formulas. We begin by computing the effect of each iteration of the loop on the heap, tracking which locations are accessed as well as which effects are performed at each location. We first compute the effect of each iteration of the inner for loop, which adds the value of a[j + stride] to a[j]. We will use the unary effect labeling predicates Read(ν) and Write(ν) to record the fact that a location in the heap has been read from or written to, respectively. We capture the effect of the read of a[j + stride] with the effect formula

    Σ1 ≐ Read(ν) ∧ ¬Write(ν) ∧ ν = a + j + stride.

Note that we have used the effect label predicates Read and Write to record not only that we have accessed a[j + stride] but also that this location was only read from, not written to. We record the effect of the write to a[j] with a similar formula:

    Σ2 ≐ Write(ν) ∧ ¬Read(ν) ∧ ν = a + j

As before, the overall effect of these two operations performed sequentially, and thus the effect of a single iteration of the inner for loop, is the disjunction of the two effect formulas above:

    Σj ≐ Σ1 ∨ Σ2

Iteration Effect Formulas.
Having computed the effect of a single iteration of sumStride’s inner for loop, the next step in proving that sumStride is deterministic is to compute the effect of each iteration of the outer foreach loop, with our goal being to show that the effects of any two distinct iterations must be disjoint.

We begin by generalizing the effect formula we computed for a single iteration of the inner for loop to an effect formula describing the effect of the entire for loop. The effect formula describing the entire for loop is an effect formula that does not reference the loop induction variable j, but is implied by the conjunction of the loop body’s single-iteration effect and any loop invariants that apply to the loop induction variable j. Note that the induction variable j enjoys the loop invariant

    j < stride ∧ j ≡ i mod THREADS

Given this invariant for j, we can now summarize the effect of the for loop as an effect formula which does not reference the loop induction variable j but is implied by the conjunction of the above invariant on j and the single-iteration effect formula Σj:

    Σfor ≐ (Write(ν) ⇒ (ν − a) ≡ i mod THREADS)
         ∧ (Write(ν) ⇒ ν < a + stride)
         ∧ (Read(ν) ⇒ ν ≥ a + stride)                    (1)

We have now computed the effect of sumStride’s inner for loop, and thus the effect of a single iteration of sumStride’s outer foreach loop.

Effect Projection. We define an effect projection operator π(Σ, E) which returns the restriction of the effect formula Σ to the effect label E. In essence, the effect projection is the formula implied by the conjunction of Σ and E(ν). For example, the projection of the Write effect of iteration i of the foreach loop represents the addresses written by thread i.

Determinism via Iteration Effect Disjointness. To prove that the foreach loop is deterministic, we must show that values that are read and written in one iteration are not overwritten in another.
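Before walking through the formal queries, both the summary (1) and the disjointness obligations can be checked by brute force on concrete values. The Python sketch below (base address, stride, and THREADS = 4 are arbitrary choices for illustration) enumerates each thread’s labeled effect set, confirms that it satisfies the summary, and confirms the pairwise write-write and write-read disjointness that the following argument establishes symbolically:

```python
THREADS = 4

def iteration_effect(a, j, stride):
    # Sigma_j: one Read of a[j + stride] and one Write of a[j]
    return {("Read", a + j + stride), ("Write", a + j)}

def thread_effect(a, i, stride):
    # Union of Sigma_j over the inner loop of "thread" i
    eff = set()
    for j in range(i, stride, THREADS):
        eff |= iteration_effect(a, j, stride)
    return eff

def proj(sigma, label):
    # pi(Sigma, E): pointers carrying effect label E
    return {nu for (l, nu) in sigma if l == label}

a, stride = 1000, 8
effs = [thread_effect(a, i, stride) for i in range(THREADS)]
for i in range(THREADS):
    # every concrete access satisfies the summary (1) for thread i
    for nu in proj(effs[i], "Write"):
        assert (nu - a) % THREADS == i and nu < a + stride
    for nu in proj(effs[i], "Read"):
        assert nu >= a + stride
    # distinct threads: write-write and write-read projections are disjoint
    for i2 in range(THREADS):
        if i != i2:
            assert not (proj(effs[i], "Write") & proj(effs[i2], "Write"))
            assert not (proj(effs[i], "Write") & proj(effs[i2], "Read"))
```

The type system performs the same checks once and for all via SMT queries over the effect formulas, rather than by enumerating concrete pointers.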
First, we show that no two foreach iterations (i.e., threads) write to the same location, or, in other words, that distinct iterations write through disjoint sets of pointers. We establish this fact by asking the SMT solver to prove the unsatisfiability of the write-write intersection formula

    ΣUnsafe = π(Σfor, Write) ∧ π(Σfor, Write)[i ↦ i′] ∧ i ≠ i′

in an environment where 0 ≤ i, i′ < THREADS. The formula is indeed unsatisfiable as, from Equation 1,

    π(Σfor, Write) = (ν − a) ≡ i mod THREADS ∧ ν < a + stride        (2)

and so, after substituting i′, the query formula is

    (ν − a) ≡ i mod THREADS ∧ (ν − a) ≡ i′ mod THREADS

which is unsatisfiable when i ≠ i′.

Next, we prove that no heap location is read in one iteration (i.e., thread) and written in another. As before, we verify that the intersection of the pointers read in one iteration with those written in another iteration is empty. Concretely, we ask the SMT solver to prove the unsatisfiability of the write-read intersection formula

    ΣUnsafe = π(Σfor, Write) ∧ π(Σfor, Read)[i ↦ i′] ∧ i ≠ i′

where, from Equation 1, the locations read in a single iteration of the foreach loop are described by the Read projection

    π(Σfor, Read) = ν ≥ a + stride

After substituting the write effect from Equation 2, we can see that the write-read intersection formula is unsatisfiable, as

    ν < a + stride ∧ ν ≥ a + stride

is inconsistent. Thus, by proving the disjointness of the write-write and write-read effects, we have proved that the foreach loop inside sumStride is deterministic.

2.3 User-Defined Effects

By expressing our effect labeling predicates as ordinary uninterpreted predicates in first-order logic, we open the door to allowing the user to define their own effects. Such user-defined effects are first-class: we reason about them using the same mechanisms as the built-in predicates Read and Write, and effect formulas over user-defined predicates are equally as expressive as those over the built-in predicates.

We return to the sumStride function to demonstrate the use of user-defined effects. The for loop in sumStride tracks how many times the entry a[j] is written by incrementing the j-th value in the counter array log using the externally-defined function accumLog. We assume that the accumLog function is implemented so that its operation is atomic. Thus, using accumLog to increment the count of the same element in two distinct iterations of the foreach loop does not cause non-determinism.

Specifying Effects. We create a new user-defined effect predicate to track the effect of the accumLog function using the statement

    declare effect Accumulate;

This extends the set of effect predicates that our system tracks to

    E = {Read, Write, Accumulate}.

We then annotate the prototype of the accumLog function to specify that accumLog causes the Accumulate effect to occur on the j-th entry of its parameter array l:

    void accumLog (char *l, int j)
      effect (&l[j], Accumulate(v) && !Read(v) && !Write(v));

Specifying Commutativity. We specify that our user effect Accumulate does not cause nondeterminism even when it occurs on the same location in two separate threads using the commutativity annotation

    declare Accumulate commutes with Accumulate;

This annotation extends the set of pairs of effects that commute with each other, that is, which can occur in either order without affecting the result of the computation. In particular, the commutativity annotation above extends the set of commutable pairs to

    C = {(Read, Read), (Accumulate, Accumulate)}.

The extended set formally specifies that, in addition to pairs of Read effects (included by default), pairs of Accumulate effects also commute, and hence may be allowed to occur simultaneously.

Generalized Effect Disjointness. Finally, we generalize our method for ensuring determinism.
We check the disjointness of two effect formulas Σ1 and Σ2 by asking the SMT solver to prove the unsatisfiability of the effect intersection formula:

    ΣUnsafe = ∃(E1, E2) ∈ (E × E \ C). π(Σ1, E1) ∧ π(Σ2, E2)

That is, for each possible combination of effect predicates that have not been explicitly declared to commute, we check that the two effect sets are disjoint when projected on those effects. Thus, by declaring an Accumulate effect which is commutative, we are able to verify the determinism of sum2.

Effect Inference. In the preceding, we gave explicit loop invariants and heap effects where necessary. Later in this paper, we explain how we use Liquid Type inference [28] to reduce the annotation burden on the programmer by automatically inferring the loop invariants and heap effects given above.

Outline. The remainder of this paper is organized as follows: In section 3, we give the syntax of LCE programs, informally explain their semantics, and give the syntax of LCE’s types. We define the type system of LCE in section 4. In section 5, we give an overview of the results of applying an implementation of a typechecker for LCE to a series of examples from the literature. We review related work in section 6.

3. Syntax and Semantics

In this section, we present the syntax of LCE programs, informally discuss their semantics, and give the syntax of LCE types.

3.1 Programs

The syntax of LCE programs is shown in Figure 3. Most of the expression forms for expressing sequential computation are standard or covered extensively in previous work [28]; the expression forms for parallel computation are new to this work.
  Values              v ::= x                          variable
                          | n                          integer
                          | &n                         pointer

  Pure Expressions    a ::= v                          value
                          | a1 ◦ a2                    arithmetic operation
                          | a1 +p a2                   pointer arithmetic
                          | a1 ∼ a2                    comparison

  Expressions         e ::= a                          pure expression
                          | ∗v                         heap read
                          | ∗v1 := v2                  heap write
                          | if v then e1 else e2       if-then-else
                          | f(v)                       function call
                          | malloc(v)                  memory allocation
                          | let x = e1 in e2           let binding
                          | letu x = unfold v in e     location unfold
                          | fold l                     location fold
                          | e1 ∥ e2                    parallel composition
                          | foreach x in v1 to v2 {e}  parallel iteration

  Function Declarations  F ::= f(xi) { e }
  Programs               P ::= F e

Figure 3. Syntax of LCE programs

Values. The set of LCE values v includes program variables, integer constants n, and constant pointer values &n representing addresses in the heap. With the exception of the null pointer value &0, constant pointer values do not appear in source programs.

Expressions. The set of pure LCE expressions a includes values v, integer arithmetic expressions a1 ◦ a2, where ◦ is one of the standard arithmetic operators +, −, ∗, etc., pointer arithmetic expressions a1 +p a2, and comparisons a1 ∼ a2, where ∼ is one of the comparison operators =, <, >, etc. Following standard practice, zero and non-zero values represent falsity and truth, respectively. The set of LCE expressions e includes the pure expressions a as well as all side-effecting operations. The expression forms for pointer read and write, if-then-else, function call, memory allocation, and let binding are all standard. The location unfold and fold expressions are used to support type-based reasoning about the heap using strong updates. As they have no effect on the dynamic semantics of LCE programs, we defer discussion of location unfold and fold expressions to section 4.

Parallel Expressions. The set of LCE expressions includes two forms for expressing parallel computations. The first, parallel composition e1 ∥ e2, evaluates expressions e1 and e2 in parallel and returns when both subexpressions have evaluated to values. The second parallel expression form is the parallel iteration expression foreach x in v1 to v2 {e}, where v1 and v2 are integer values. A parallel iteration expression is evaluated by evaluating the body expression e once for each possible binding of variable x to an integer in the range [v1, v2). All iterations are evaluated in parallel, and the parallel iteration expression returns when all iterations have finished evaluating. Both forms of parallel expressions are evaluated solely for their side effects.

Functions and Programs. A function declaration f(xi) { e } declares a function f with arguments xi whose body is the expression e. The return value of the function is the value of the expression. A LCE program consists of a sequence of function declarations F followed by an expression e which is evaluated in the environment containing the previously-declared functions.

  Base Types          b ::= int(i)                     integer
                          | ref(l, i)                  pointer

  Refinement Formulas p ::= φ(v)

  Refinement Types    τ ::= {ν : b | p}

  Indices             i ::= n                          constant
                          | n+m                        sequence

  Blocks              c ::= i : τ

  Heap Locations      l ::= l̃                          abstract location
                          | lj                         concrete location

  Heap Types          h ::= ε                          empty heap
                          | h ∗ l ↦ c                  extended heap

  Heap Effects        Σ ::= ε                          empty effect
                          | Σ ∗ l̃ ↦ p                  extended effect

  Function Schemes    σ ::= (xi : τi)/h1 → τ/h2/Σ

Figure 4. Syntax of LCE types

3.2 Types

The syntax of LCE types is shown in Figure 4.

Base Types. The base types b of LCE include integer types int(i) and pointer types ref(l, i). The integer type int(i) describes an integer whose value is in the set described by the index i; the set of indices is described below. The pointer type ref(l, i) describes a pointer value to the heap location l whose offset from the beginning of location l is an integer belonging to the set described by index i.

Indices.
The language of LCE base types uses indices i to approximate the values of integers and the offsets of pointers from the starts of the heap locations where they point. There are two forms of indices. The first, the singleton index n, describes the set of integers {n}. The second, the sequence index n+m, represents the sequence of offsets {n + lm | l ≥ 0}.

Refinement Types. The LCE refinement types τ are formed by joining a base type b with a refinement formula p to form the refinement type {ν : b | p}. The distinguished value variable ν refers to the value described by this type. The refinement formula p is a logical formula φ(ν, v) over LCE values v and the value variable ν. The type {ν : b | p} describes all values v such that p[ν ↦ v] is a valid formula. Our refinements are first-order formulas with equality, uninterpreted functions, and linear arithmetic. When it is unambiguous from the context, we use b to abbreviate {ν : b | true}.

Blocks. A block c describes the contents of a heap location as a sequence of bindings from indices i to refinement types τ. Each binding i : τ states that a value of type τ is contained in the block at each offset n from the start of the block which is described by the index i. Bindings of the form m : τ, i.e., where types are bound to singleton indices, refer to exactly one element within the block; as such, we allow the refinement formulas within the same block to refer to the value at offset m using the syntax @m. We require all indices in a block to be disjoint.

Heap Types. Heap types h statically describe the contents of run-time heaps as sets of bindings from heap locations l to blocks c. A heap type is either the empty heap ε or a heap h extended with a binding from location l to block c, written h ∗ l ↦ c. The location l is either an abstract location l̃, which corresponds to arbitrarily many run-time heap locations, or a concrete location lj, which corresponds to exactly one run-time heap location.
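The two index denotations above can be made concrete with a small sketch (Python, for illustration only; the sampling bound is an artifact of the simulation, since the sequence index denotes an infinite set):

```python
def singleton(n):
    # The singleton index n denotes the set {n}.
    return {n}

def sequence(n, m, bound=100):
    # The sequence index n+m denotes {n, n + m, n + 2m, ...};
    # we sample only a finite prefix.
    return {n + l * m for l in range(bound)}

assert singleton(4) == {4}
assert 0 in sequence(0, 4) and 8 in sequence(0, 4)
assert 6 not in sequence(0, 4)   # 6 is not of the form 0 + 4l
```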
The distinction between abstract and concrete locations is used to implement a form of sound strong updates on heap types in the presence of aliasing, unbounded collections, and concurrency; we defer detailed discussion to section 4. Locations may be bound at most once in a heap.

Heap Effects. A heap effect Σ describes the effect of an expression on a set of heap locations as a set of bindings from heap locations l to effect formulas p. A heap effect is either the empty effect ε or an effect Σ extended with a binding from location l̃ to an effect formula p over the in-scope program variables and the value variable ν, written Σ ∗ l̃ ↦ p. Intuitively, an effect binding l̃ ↦ p describes the effect of an expression on location l̃ as the set of pointers v such that the formula p[ν ↦ v] is valid. Note that we only bind abstract locations in heap effects.

Effect Formulas. The formula portion of an effect binding can describe the effect of an expression with varying degrees of precision, from simply describing whether the expression accesses the location at all, to describing which offsets into a location are accessed, to describing which offsets into a location are accessed and how they are accessed (e.g., whether they are read or written). For example, if we use the function BB(ν) to refer to the beginning of the block where ν points, we can record the fact that expression e accesses the first ten items in location l̃ (for either reading or writing) with the effect binding l̃ ↦ BB(ν) ≤ ν ∧ ν < BB(ν) + 10, i.e., by stating that all pointers used by e to access location l̃ satisfy the above formula. To record that an expression does not access location l̃ at all, we ascribe it the effect formula false.

We add further expressiveness to our language of effect formulas by enriching it with effect-specific predicates used to describe the particular effects that occurred within a location. We use the unary predicates Read(v) and Write(v) to indicate that pointer v was read from or written to, respectively. Using these predicates, we can precisely state how an expression depends upon or affects the contents of a heap location. For example, we can specify that an expression does not write to a location l̃ using the effect binding l̃ ↦ ¬Write(ν). On the other hand, we may specify precisely which offsets into l̃ are written by using Write to guard a predicate describing a set of pointers into l̃. For example, the effect binding l̃ ↦ Write(ν) ⇒ (BB(ν) ≤ ν ∧ ν < BB(ν) + 10) describes an expression which writes to the first ten elements of l̃. Effect predicates like Read and Write may only appear in heap effect bindings; they may not appear in type refinements.

Function Schemes. A LCE function scheme (xi : τi)/h1 → τ/h2/Σ is the type of functions which take arguments xi of corresponding types τi in a heap of type h1, return values of type τ in a heap of type h2, and incur side-effects described by the heap effect Σ. The types of function parameters may depend on the other function parameters, and the type of the input heap may depend on the parameters. The types of the return value, output heap, and heap effect may depend on the types of the parameters. We implicitly quantify over all the location names appearing in a function scheme.

4. Type System

In this section, we present the typing rules of LCE. We begin with a discussion of LCE's type environments. We then discuss LCE's expression typing rules, paying particular attention to how the rules carefully track side effects. Finally, we describe how we use tracked effects to ensure determinism.

Type Environments. Our typing rules make use of three types of environments. A local environment, Γ, is a sequence of type bindings of the form x : τ recording the types of in-scope variables and guard formulas of the form p recording any branch conditions under which an expression is evaluated.
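As a concrete reading of the effect bindings discussed above, the following C sketch (our own model; the function names and the flag encoding are illustrative, not CSOLVE syntax) treats an effect formula as a boolean predicate over an accessed offset ν, with bb modeling BB(ν) and is_write modeling the Write predicate:

```c
#include <assert.h>

/* Each function decides whether one access satisfies an example
 * effect formula; v is the accessed pointer offset, bb models BB(v),
 * and is_write models the Write(v) predicate (0 = read, 1 = write). */

/* l~ |-> BB(v) <= v /\ v < BB(v) + 10: any access (read or write)
 * must fall on the first ten elements of the block. */
int accesses_first_ten(long bb, long v) {
    return bb <= v && v < bb + 10;
}

/* l~ |-> Write(v) => (BB(v) <= v /\ v < BB(v) + 10): writes are
 * confined to the first ten elements; reads are unconstrained. */
int writes_only_first_ten(long bb, long v, int is_write) {
    return !is_write || (bb <= v && v < bb + 10);
}

/* l~ |-> !Write(v): the location is never written, only read. */
int read_only(int is_write) {
    return !is_write;
}
```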
A global environment, Φ, is a sequence of function type bindings of the form f : S which maps functions to their refined type signatures. Finally, an effect environment Π is a pair (E, C). The first component of the pair, E, is a set of effect label predicates. The second component, C, is a set of pairs of effect label predicates such that (E1, E2) ∈ C states that effect E1 commutes with effect E2. By default, E includes the built-in effects, Read and Write, and C includes the pair (Read, Read), which states that competing reads to the same address in different threads produce a result that is independent of their ordering. In our typing rules, we implicitly assume a single global effect environment Π. Local environments are well-formed if each bound type or guard formula is well-formed in the environment that precedes it. An effect environment (E, C) is well-formed as long as C is symmetric and only references effects in the set E, i.e., C ⊆ E × E, and (E1, E2) ∈ C iff (E2, E1) ∈ C.

Γ ::= ∅ | x : τ; Γ | a; Γ        (Local Environment)
Φ ::= ∅ | f : S; Φ               (Global Environment)
E ::= {Read, Write} | E; E       (Effect Labels)
C ::= (Read, Read) | (E, E); C   (Commutativity Pairs)
Π ::= (E, C)                     (Effect Environment)

4.1 Typing Judgments

We now discuss the rules for ensuring type well-formedness, subtyping, and typing expressions. Due to space constraints, we only present the formal definitions of the most pertinent rules, and defer the remainder to the accompanying technical report [16].

Subtyping Judgments. The rules for subtyping refinement types are straightforward: type τ1 is a subtype of τ2 in environment Γ, written Γ ⊢ τ1 <: τ2, if 1) τ1's refinement implies τ2's under the assumptions recorded as refinement types and guard predicates in Γ and 2) τ1's index is included in τ2's when both are interpreted as sets. Rule [<:-ABSTRACT] allows us to cast pointers to concrete locations to pointers to their corresponding abstract locations.
Heap type h1 is a subtype of h2 in environment Γ, written Γ ⊢ h1 <: h2, if both heaps share the same domain and, for each location l, the block bound to l in h1 is a subtype of the block bound to l in h2. Subtyping between blocks is pairwise subtyping between the types bound to each index in each block; we use substitutions to place bindings to offsets @i in the environment. Because effect formulas are first-order formulas, subtyping between effects is simply subtyping between refinement types: an effect Σ1 is a subtype of Σ2 in environment Γ, written Γ ⊢ Σ1 <: Σ2, if each effect formula bound to a location in Σ1 implies the formula bound to the same location in Σ2 under the assumptions in Γ.

Type Well-Formedness. The type well-formedness rules of LCE ensure that all refinement and effect formulas are well-scoped and that heap types and heap effects are well-defined maps over locations. These rules are straightforward; we briefly discuss them below, deferring their formal definitions to [16]. A heap effect Σ is well-formed with respect to environment Γ and heap h, written Γ, h ⊢ Σ, if it binds only abstract locations which are bound in h, binds a location at most once, and binds locations to effect formulas which are well-scoped in Γ. Further, an effect formula p is well-formed in an effect environment (E, C) if all effect names in p are present in E. A refinement type τ is well-formed with respect to Γ, written Γ ⊢ τ, if its refinement predicate references only variables contained in Γ. A heap type h is well-formed in environment Γ, written Γ ⊢ h, if it binds any location at most once, binds a corresponding abstract location for each concrete location it binds, and contains only well-formed refinement types.
A “world” consisting of a refinement type, heap type, and heap effect is well-formed with respect to environment Γ, written Γ ⊢ τ/h/Σ, if, with respect to the environment Γ, the type and heap type are well-formed and the heap effect Σ is well-formed with respect to the heap h. The rule for determining the well-formedness of function type schemes is standard.

Pure Expression Typing. The rules for typing a pure expression a in an environment Γ, written Γ ⊢ a : τ, are straightforward, and are deferred to [16]. These rules assign each pure expression a refinement type that records the exact value of the expression.

Expression Typing and Effect Tracking. Figure 5 shows the rules for typing expressions which explicitly manipulate heap effects to track or compose side-effects. Each expression is typed with a refinement type, a refinement heap assigning types to the heap's contents after evaluating the expression, and an effect which records how the expression accesses each heap location as a mapping from locations to effect formulas. We use the abbreviation void to indicate the type {ν : int(0) | true}. Pointer dereference expressions ∗v are typed by the rule [T-READ]. The value v is typed to find the location and index into which it points in the heap; the type of the expression is the type at this location and index. The rule records the read's effect in the heap by creating a new binding for v's abstract location, l̃, and uses the auxiliary function logEffect to create an effect formula which states that the only location that is accessed by this dereference is exactly that pointed to by v (ν = v), that the location is read (Read(ν)), and that no other effect occurs. The rules for typing heap-mutating expressions of the form ∗v1 := v2, [T-SUPD] and [T-WUPD], are similar to [T-READ].
The two rules for mutation differ only in whether they perform a strong update on the type of the heap, i.e., writing through a pointer with a singleton index strongly updates the type of the heap, while writing through a pointer with a sequence index does not.

Expressions are sequenced using the let construct, typed by rule [T-LET]. The majority of the rule is standard; we discuss only the portion concerning effects. The rule types expressions e1 and e2 to yield their respective effects Σ1 and Σ2. The effect of the entire let expression is the composition of the two effects, Σ1 ⊕ Σ2, defined in Figure 5. We check that Σ2 is well-formed in the initial environment to ensure that the variable x does not escape its scope.

Rule [T-PAR] types the parallel composition expression e1 ∥ e2. Expressions e1 and e2 are typed to obtain their effects Σ1 and Σ2. We then use the OK judgment, defined in Figure 5, to verify that effects Σ1 and Σ2 commute, i.e., that the program remains deterministic regardless of the interleaving of the two concurrently-executing expressions. We give e1 ∥ e2 the effect Σ1 ⊕ Σ2. We require that the input heap for a parallel composition expression must be abstract, that is, contain only bindings for abstract locations; this forces the expressions which are run in parallel to unfold any locations they access, which in turn enforces several invariants that will prevent the type system from unsoundly assuming invariants in one thread which may be broken by the heap writes performed in another. We elaborate on this below.

[Figure 5: Typing rules for the effect-tracking expression forms ([T-READ], [T-SUPD], [T-WUPD], [T-LET], [T-PAR], [T-FOREACH], [T-UNFOLD], [T-FOLD]), the OK noninterference judgments, and the effect operations:
  Σ1 ⊕ Σ2 ≐ {l ↦ LU(Σ1, l) ∨ LU(Σ2, l)}
  LU(Σ, l) ≐ Σ(l) if l ∈ Dom(Σ), and ⊥ otherwise
  π(p, E) ≐ p[E ↦ Check] ∧ Check(ν)
  logEffect(v, E) ≐ E(ν) ∧ ν = v ∧ ⋀_{E′ ∈ E∖{E}} ¬E′(ν)]

Rule [T-FOREACH] types the foreach parallel loop expression.
The loop induction variable i ranges over values between v1 and v2, exclusive; we type the body expression e in the environment enriched with an appropriate binding for i and compute the resulting heap and per-iteration effect Σ. We require that the heap is loop-invariant. To ensure that the behavior of the foreach loop is deterministic, we must check that the effects of distinct iterations do not interfere. Hence, we check non-interference at two arbitrary, distinct iterations by adding a binding for a fresh name j to the environment with the invariant that i ≠ j, and verifying with OK that the effect at an iteration i, Σ, commutes with the effect at a distinct iteration j, Σ[i ↦ j]. We return Σ̂′, which subsumes Σ but is well-formed in the original environment, ensuring that the loop induction variable i does not escape its scope. As with [T-PAR], we require that the input heap is abstract, for the same reasons.

Strong Updates and Concurrency. The rules for location unfold and fold, [T-UNFOLD] and [T-FOLD], are used to implement a local non-aliasing discipline that allows us to perform strong updates on the types of heap locations if we know that only one pointer to the location is accessed at a time. This mechanism is similar to freeze/thaw or adopt/focus [34, 10]. These rules have been treated in previous work [28]; we now discuss how they handle effect tracking and give a brief recap of their handling of strong updates. Rule [T-UNFOLD] types the letu construct for unfolding a pointer to an abstract location, l̃, to obtain a pointer to a corresponding concrete location, lj, bound to the variable x. The block corresponding to the concrete location lj is constructed by creating a Skolem variable xj for each binding of a singleton index nj to a type τj; because the binding is a singleton index in a concrete location, it corresponds to exactly one datum on the heap.
We apply an appropriate substitution to the elements within the block, then typecheck the body expression e with respect to the extended heap and enriched environment. Well-formedness constraints ensure that an abstract location is never unfolded twice in the same scope and that x does not escape its scope. We allow two concurrently-executing expressions to unfold, and thus simultaneously access, the same abstract location. When a location is unfolded, our type system records the refinement types bound to each of its singleton indices in the environment. If we do not take care, the refinement types bound to singleton indices and recorded in the environment may be invalidated by an effect (e.g., a write) occurring in a simultaneously-executing expression. Thus, an expression which unfolds a location implicitly depends on the data whose invariants are recorded in its environment at the time of unfolding. To capture these implicit dependencies, when a pointer v is unfolded, we conservatively record a pseudo-read [33] at each singleton offset nk within the block into which v points by recording in the heap effect that the pointer BB(v) + nk is read, where BB(v) is the beginning of the memory block where v points. This ensures that the invariant recorded in the environment at the time of unfolding is preserved, as any violating writes would be caught by the determinism check.

Rule [T-FOLD] removes a concrete location from the heap so that a different pointer to the same abstract location may be unfolded. The rule ensures that the currently-unfolded concrete location's block is a subtype of the corresponding abstract location's so that we can be sure that the abstract location's invariant holds when it is unfolded later. The fold expression has no heap effect.

Program Typing. The remaining expression forms do not manipulate heap effects; as their typing judgments are straightforward and covered in previous work [28], we defer their definitions to [16].
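As an illustration of the obligation behind the [T-FOREACH] rule discussed above, consider a loop whose iteration i writes only a[i], so its per-iteration effect on a's location is Write(ν) ∧ ν = a + i. The C sketch below (our own simulation with hypothetical names; CSOLVE discharges this obligation statically with an SMT check, not by running the loop) records each iteration's writes and checks that the write sets of distinct iterations are disjoint:

```c
#include <assert.h>
#include <string.h>

#define N 8

/* writes[k] counts how many loop iterations wrote offset k; the
 * access pattern is deterministic under any interleaving only if no
 * offset is written by two distinct iterations. */
static int writes[N];

/* Body of a foreach-style loop: iteration i writes only a[i]. */
static void loop_body(int *a, int i) {
    a[i] = 2 * i;
    writes[i]++;
}

/* Run all iterations (any order would do) and check that every
 * offset was written at most once, i.e., that the effects of
 * iterations i != j do not interfere. */
int run_and_check(void) {
    int a[N];
    memset(writes, 0, sizeof writes);
    for (int i = 0; i < N; i++)
        loop_body(a, i);
    for (int k = 0; k < N; k++)
        if (writes[k] > 1)
            return 0;
    return a[3] == 6;  /* spot-check the computed contents */
}
```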
We note that the rule for typing if expressions records the value of the branch condition in the environment when typing each branch. The judgments for typing function declarations and programs in LCE are straightforward and are deferred to [16].

Program        LOC  Quals  Annot  Changes  T (s)
Reduce          39      7      7      N/A    8.8
SumReduce       39      1      3        0    1.8
QuickSort       73      3      4      N/A    5.9
MergeSort       95      6      7        0   32.4
K-Means-Core   135      3      5        4   22.9
IDEA           222      7      5        3   59.7

Table 1. (LOC) is the number of source code lines as reported by sloccount, (Quals) is the number of logical qualifiers required to prove memory safety and determinism, (Annot) is the number of annotated function signatures plus the number of effect and commutativity declarations, (Changes) is the number of program modifications, and (T) is the time in seconds taken for verification.

4.2 Handling Effects

We now describe the auxiliary definitions used by the typing rules of subsection 4.1 for checking effect noninterference.

Combining Effects. In Figure 5, we define the ⊕ operator for combining effects. Our decision to encode effects as formulas in first-order logic makes the definition of this operator especially simple: for each location l, the combined effect of Σ1 and Σ2 on location l is the disjunction of the effects on l recorded in Σ1 and Σ2. We use the auxiliary function LU(Σ, l) to look up the effect formula for location l in Σ; if there is no binding for l in Σ, the false formula, indicating no effect, is returned instead.

Effects Checking. We check the noninterference of heap effects Σ1, Σ2 using the OK judgment, defined in Figure 5. The effects are noninterfering if the formulas bound to locations in both heap effects are pairwise noninterfering. Two effect formulas p1 and p2 are noninterfering as follows. For each pair of non-commuting effects E1 and E2, i.e., each pair in the set (E × E) \ C, we project the set of pointers in p1 (resp., p2) which are accessed with effect E1 (resp., E2) using the effect projection operator π.
Now, if the conjunction of the projected formulas is unsatisfiable, the effect formulas are noninterfering.

User-Defined Effects and Commutativity. We allow our set of effects E to be extended with additional effect label predicates by the user. To specify how these effects interact, commutativity annotations can be provided that indicate which effects do not result in nondeterministic behavior when run simultaneously. Further, the user may override the effect of any function by providing an effect Σ as an annotation, allowing additional flexibility to specify domain-specific effects or to override the effect system if needed.

5. Evaluation

We have implemented the techniques in this paper as an extension to CSOLVE, a refinement type-based verifier for C. CSOLVE takes as input a C program and a set of logical qualifiers, or formulas over the value variable ν, to use in refinement and effect inference. CSOLVE then checks the program both for memory safety errors (e.g., out-of-bounds array accesses, null pointer dereferences) and non-determinism. If CSOLVE determines that the program is free from such errors, it outputs an annotation file containing the types of program variables, heaps, functions, and heap effects. If CSOLVE cannot determine that the program is safe, it outputs an error message with the line number of the potential error and the inferred types of the program variables in scope at that location.

Type and Effect Inference. CSOLVE infers refinement types and effect formulas using the predicate abstraction-based Liquid Types [28] technique; we give a high-level overview here.
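To make the noninterference machinery of subsection 4.2 concrete, here is a small C model (our own finite-domain sketch, not CSOLVE's implementation, which discharges these checks with an SMT solver): an effect formula over one location is approximated by bitmasks of possibly-read and possibly-written offsets, ⊕ becomes bitwise union, and OK checks that every non-commuting pair of effects, here all pairs except (Read, Read), touches disjoint offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Effect of an expression on one heap location, over a toy domain of
 * 64 offsets: bit k of .read (.write) is set iff offset k may be
 * read (written). */
typedef struct { uint64_t read, write; } effect;

/* Sigma1 (+) Sigma2: the combined effect is the disjunction of the
 * two effects -- here, the union of the access sets. */
effect combine(effect a, effect b) {
    effect r = { a.read | b.read, a.write | b.write };
    return r;
}

/* OK(a, b) with commutativity set C = {(Read, Read)}: for each pair
 * of effects not in C -- (Read, Write), (Write, Read), and
 * (Write, Write) -- the projected access sets must not intersect
 * (the conjunction of the projections must be unsatisfiable). */
int ok(effect a, effect b) {
    return (a.read  & b.write) == 0
        && (a.write & b.read)  == 0
        && (a.write & b.write) == 0;
}
```

Two expressions that both read offset 0 pass the check, while two that both write it do not, mirroring the default commutativity pair (Read, Read) of the effect environment.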
We note that there are expressions whose refinement types and heap effects cannot be constructed from the types and heap effects of subexpressions or from bindings in the heap and environment, but instead must be synthesized. For example, the function typing rule requires us to synthesize types, heaps, and heap effects for functions. This is comparable to inferring pre- and post-conditions for functions (and, similarly, loop invariants), which reduces to determining appropriate formulas for refinement types and effects. To make inference tractable, we require that formulas contained in synthesized types, heaps, and heap effects are liquid, i.e., conjunctions of logical qualifier formulas provided by the user. These qualifiers are templates for predicates that may appear in types or effects. We read our inference rules as an algorithm for constructing subtyping constraints which, when solved, yield a refinement typing for the program.

<region r1,r2,r3 | r1:* # r3:*, r2:* # r3:*>
void merge(DPJArrayInt<r1> a, DPJArrayInt<r2> b, DPJArrayInt<r3> c)
  reads r1:*, r2:* writes r3:* {
  if (a.length <= merge_size) seq_merge(a, b, c);
  else {
    int ha = a.length / 2, sb = split(a.get(ha), b);
    final DPJPartitionInt<r1> a_split = new DPJPartitionInt<r1>(a, ha);
    final DPJPartitionInt<r2> b_split = new DPJPartitionInt<r2>(b, sb);
    final DPJPartitionInt<r3> c_split = new DPJPartitionInt<r3>(c, ha+sb);
    cobegin {
      merge(a_split.get(0), b_split.get(0), c_split.get(0));
      merge(a_split.get(1), b_split.get(1), c_split.get(1));
    }}}

Figure 6. DPJ merge function

Methodology. To evaluate our approach, we drew from the set of Deterministic Parallel Java (DPJ) benchmarks reported in [2]. For those which were parallelized using the methods we support, namely heap effects and commutativity annotations, we verified determinism and memory safety using our system. Our goals for the evaluation were to show that our system is as expressive as DPJ's for these benchmarks while requiring less program restructuring.

Results.
The results of running CSOLVE on the following benchmarks are presented in Table 1. Reduce is the program given in section 2. SumReduce initializes an array using parallel iteration, then sums the results using a parallel divide-and-conquer strategy. MergeSort divides the input array into quarters, recursively sorting each in parallel. It then merges the two pairs of sorted quarters in parallel, and finally merges the two sorted halves to yield a single sorted array. Each of the final merges is performed recursively in parallel. QuickSort is a standard in-place quicksort adapted from a sequential version included with DPJ. We parallelized the algorithm by partitioning the input array around its median, then recursively sorting each partition in parallel. K-Means-Core was adapted from the C implementation in the STAMP benchmarks [23] that was ported for DPJ; we omitted the nearest-neighbor computation, as it contained no parallelism. IDEA is an encryption/decryption kernel ported to C from DPJ.

Changes. Some benchmarks required modifications in order to conform to the current implementation of CSOLVE; this does not indicate inherent limitations in the technique. These include: multidimensionalizing flattened arrays, writing stubs for matrix malloc, expanding structure fields (K-Means-Core), and soundly abstracting non-linear operations and #define-ing configuration parameters to constants to get around a bug in the SMT solver (IDEA).

qualif Q(V: ptr): &&[_ <= V; V {<, >=} (_ + _ + _)]

void merge(int* ARRAY LOC(L) a, int* ARRAY LOC(L) b,
           int la, int lb, int* ARRAY c) {
  if (la <= merge_size) {
    seq_merge(a, b, la, lb, c);
  } else {
    int ha = la / 2, sb = split(a[ha], b, lb);
    cobegin {
      merge(a, b, ha, sb, c);
      merge(a + ha, b + sb, la - ha, lb - sb, c + ha + sb);
    }}}

Figure 7. CSOLVE merge function
Annotations. CSOLVE requires two classes of annotations to compute base types for the program. First, the annotation ARRAY is required to distinguish pointers to arrays from singleton references. Second, as base type inference is intraprocedural, we must additionally specify which pointer parameters may be aliased by annotating them with a location parameter of the form LOC(L).

Qualitative Comparison. Since the kinds of annotations are very different, a quantitative comparison between CSOLVE and DPJ is not meaningful. Instead, we illustrate the kinds of annotations and qualitatively compare them. Figure 6 contains DPJ code implementing the recursive merge routine from MergeSort, including region annotations and required wrapper classes; Figure 7 contains CSOLVE code implementing the same. In this code, merge takes two arrays a and b and recursively splits them, sequentially merging them into array c once a target size is reached. In particular, each statement in the cobegin writes to a different contiguous interval of c.

In DPJ, verifying this invariant requires the following. First, all arrays must be wrapped with the API class DPJArrayInt, and each contiguous subarray must then be wrapped with another class, DPJPartitionInt. This must be done because DPJ does not support precise effects over partitions of native Java arrays. Second, the method must be explicitly declared to be polymorphic over named regions r1, r2, and r3, corresponding to the memory locations in which the formals reside. Finally, merge must be explicitly annotated with the appropriate effects, i.e., that it reads r1 and r2 and writes r3.

In CSOLVE, verifying this invariant requires the following. First, we specify that a and b are potentially aliased arrays (the annotation ARRAY LOC(L)). Second, we specify the qualifiers used to synthesize refinement types and heap effects via the line starting with qualif. This specification says that a predicate of the given form may appear in a type or effect, with each wildcard replaced by any program identifier in scope for that type or effect. Using the given qualifier, CSOLVE infers that a and b's location is only read, and that c's location is only written at indices described by the formula c ≤ ν < c + la + lb, which suffices to prove determinism.

Unlike DPJ, CSOLVE does not require invasive changes to code (e.g., explicit array partitioning), and hence supports domain- and program-specific sharing patterns (e.g., the memory coalescing pattern of Figure 2). However, this comes at the cost of providing qualifiers. In future work, abstract interpretation may help lessen this burden.

6. Related Work

We limit our discussion to static mechanisms for checking and enforcing determinism, in particular the literature that pertains to reasoning about heap disjointness: type-and-effect systems, semantic determinism checking, and other closely related techniques.

Named Regions. Region-based approaches assign references (or objects) to distinct segments of the heap, which are explicitly named and manipulated by the programmer. This approach was introduced in the context of adding impure computations to a functional language, and developed as a means of controlling the side-effects incurred by segments of code in sequential programs [20, 21]. Effects were also investigated in the context of safe, programmer-controlled memory management [31, 15, 19]. Ideas from this work led to the notion of abstract heap locations [34, 13, 10] and our notions of fold and unfold.

Ownership Types. In the OO setting, regions are closely related to ownership types, which use the class hierarchy of the program to separate the heap into disjoint, nested regions [9, 29]. In addition to isolation, ownership types can be used to track effects [8], and to reason about data races and deadlocks [6, 4, 22]. Such techniques can be used to enforce determinism [30], but regions and ownership relations are not enough to enforce fine-grained separation. Instead, we must precisely track relations between program variables.
We are inspired by DPJ [2], which shows how some sharing patterns can be captured in a dependent region system. However, we show that the full expressiveness of refinement type inference and SMT solvers can be brought to bear to enable complex, low-level sharing with static determinism guarantees.

Checking Properties of Multithreaded Programs. Several authors have looked into type-and-effect systems for checking other properties of multithreaded programs. For example, [11, 24] show how types can be used to prevent races, and [12] describes an effect discipline that encodes Lipton's reduction method for proving atomicity. Our work focuses on the higher-level semantic property of determinism. Nevertheless, it would be useful to understand how race-freedom and atomicity could be used to establish determinism. Others [27, 3] have looked at proving that different blocks of operations commute. In future work, we could use these to automatically generate effect labels and commutativity constraints.

Program Logics and Abstract Interpretation. There is a vast literature on the use of logic (and abstract interpretation) to reason about sets of addresses (i.e., the heap). The literature on logically reasoning about the heap includes the pioneering work in TVLA [35], separation logic [26], and the direct encoding of heaps in first-order logic [17, 18], or with explicit sets of addresses called dynamic frames [14]. These logics have also been applied to reason about concurrency and parallelism: [25] looks at using separation logic to obtain (deterministic) parallel programs, and [35] uses abstract interpretation to find loop invariants for multithreaded, heap-manipulating Java programs. The above look at analyzing disjointness for linked data structures. The most closely related to our work is [32], which uses intra-procedural numeric abstract interpretation to determine the set of array indices used by different threads, and then checks disjointness over the domains.
Our work differs in that we show how to consolidate all the above lines of work into a uniform location-based heap abstraction (for separation logic-style disjoint structures) with type refinements that track finer-grained invariants. Unlike [25], we can verify access patterns that require sophisticated arithmetic reasoning, and unlike [32] we can check separation between disjoint structures, and even indices drawn from compound structures like arrays, lists, and so on. Our type system allows "context-sensitive" reasoning about (recursive) procedures. Further, first-order refinements allow us to verify domain-specific sharing patterns via first-class effect labels [21].

References

[1] Nvidia CUDA programming guide.
[2] S. V. Adve, S. Heumann, R. Komuravelli, J. Overbey, P. Simmons, H. Sung, and M. Vakilian. A type and effect system for deterministic parallel Java. In OOPSLA, 2009.
[3] F. Aleen and N. Clark. Commutativity analysis for software parallelization: letting program transformations see the big picture. In ASPLOS, 2009.
[4] Z. R. Anderson, D. Gay, R. Ennals, and E. A. Brewer. SharC: checking data sharing strategies for multithreaded C. In PLDI, 2008.
[5] A. Aviram, S.-C. Weng, S. Hu, and B. Ford. Efficient system-enforced deterministic parallelism. In OSDI, 2010.
[6] C. Boyapati, R. Lee, and M. C. Rinard. Ownership types for safe programming: preventing data races and deadlocks. In OOPSLA, pages 211–230, 2002.
[7] M. Chakravarty, G. Keller, R. Lechtchinsky, and W. Pfannenstiel. Nepal: Nested data parallelism in Haskell. In Euro-Par, 2001.
[8] D. G. Clarke and S. Drossopoulou. Ownership, encapsulation and the disjointness of type and effect. In OOPSLA, 2002.
[9] D. G. Clarke, J. Noble, and J. M. Potter. Simple ownership types for object containment. In ECOOP, 2001.
[10] R. DeLine and M. Fähndrich. Enforcing high-level protocols in low-level software. In PLDI, 2001.
[11] C. Flanagan and S. N. Freund. Type-based race detection for Java. In PLDI, pages 219–232, 2000.
[12] C. Flanagan and S. Qadeer. A type and effect system for atomicity. In PLDI, 2003.
[13] J. Foster, T. Terauchi, and A. Aiken. Flow-sensitive type qualifiers. In PLDI, 2002.
[14] B. Jacobs, F. Piessens, J. Smans, K. R. M. Leino, and W. Schulte. A programming model for concurrent object-oriented programs. TOPLAS, 2008.
[15] T. Jim, J. G. Morrisett, D. Grossman, M. W. Hicks, J. Cheney, and Y. Wang. Cyclone: A safe dialect of C. In USENIX, 2002.
[16] M. Kawaguchi, P. Rondon, A. Bakst, and R. Jhala. Liquid effects: Technical report. http://goto.ucsd.edu/~rjhala/liquid.
[17] S. K. Lahiri and S. Qadeer. Back to the future: revisiting precise program verification using SMT solvers. In POPL, 2008.
[18] S. K. Lahiri, S. Qadeer, and D. Walker. Linear maps. In PLPV, 2011.
[19] C. Lattner and V. S. Adve. Automatic pool allocation: improving performance by controlling data structure layout in the heap. In PLDI, pages 129–142, 2005.
[20] K. R. M. Leino, A. Poetzsch-Heffter, and Y. Zhou. Using data groups to specify and check side effects, 2002.
[21] D. Marino and T. D. Millstein. A generic type-and-effect system. In TLDI, pages 39–50, 2009.
[22] J.-P. Martin, M. Hicks, M. Costa, P. Akritidis, and M. Castro. Dynamically checking ownership policies in concurrent C/C++ programs. In POPL, pages 457–470, 2010.
[23] C. C. Minh, J. Chung, C. Kozyrakis, and K. Olukotun. STAMP: Stanford transactional applications for multi-processing. In IISWC, pages 35–46, 2008.
[24] P. Pratikakis, J. S. Foster, and M. W. Hicks. Locksmith: context-sensitive correlation analysis for race detection. In PLDI, 2006.
[25] M. Raza, C. Calcagno, and P. Gardner. Automatic parallelization with separation logic. In ESOP, pages 348–362, 2009.
[26] J. C. Reynolds. Separation logic: A logic for shared mutable data structures. In LICS, pages 55–74, 2002.
[27] M. C. Rinard and P. C. Diniz. Commutativity analysis: A new analysis technique for parallelizing compilers. TOPLAS, 19(6), 1997.
[28] P. Rondon, M. Kawaguchi, and R. Jhala. Low-level liquid types. In POPL, pages 131–144, 2010.
[29] M. Smith. Towards an effects system for ownership domains. In ECOOP Workshop FTfJP, 2005.
[30] T. Terauchi and A. Aiken. A capability calculus for concurrency and determinism. TOPLAS, 30, 2008.
[31] M. Tofte and J.-P. Talpin. A theory of stack allocation in polymorphically typed languages, 1993.
[32] M. T. Vechev, E. Yahav, R. Raman, and V. Sarkar. Automatic verification of determinism for structured parallel programs. In SAS, pages 455–471, 2010.
[33] J. Voung, R. Chugh, R. Jhala, and S. Lerner. Dataflow analysis for concurrent programs using data race detection. In PLDI, 2008.
[34] D. Walker and J. Morrisett. Alias types for recursive data structures. pages 177–206, 2000.
[35] E. Yahav and M. Sagiv. Verifying safety properties of concurrent heap-manipulating programs. TOPLAS, 32(5), 2010.