Cutting corners cheaply, or how to remove Steiner points

Lior Kamma∗   Robert Krauthgamer∗   Huy L. Nguyễn†

April 4, 2013

Abstract

Our main result is that the Steiner Point Removal (SPR) problem can always be solved with polylogarithmic distortion, which resolves in the affirmative a question posed by Chan, Xia, Konjevod, and Richa (2006). Specifically, we prove that for every edge-weighted graph $G = (V, E, w)$ and a subset of terminals $T \subseteq V$, there is a graph $G' = (T, E', w')$ that is isomorphic to a minor of $G$, such that for every two terminals $u, v \in T$, the shortest-path distances between them in $G$ and in $G'$ satisfy $d_{G,w}(u,v) \le d_{G',w'}(u,v) \le O(\log^6 |T|) \cdot d_{G,w}(u,v)$. Our existence proof actually gives a randomized polynomial-time algorithm.

Our proof features a new variant of metric decomposition. It is well-known that every finite metric space $(X, d)$ admits a $\beta$-separating decomposition for $\beta = O(\log |X|)$, which roughly means for every desired diameter bound $\Delta > 0$ there is a randomized partitioning of $X$, which satisfies the following separation requirement: for every $x, y \in X$, the probability they lie in different clusters of the partition is at most $\beta\, d(x,y)/\Delta$. We introduce an additional requirement, which is the following tail bound: for every shortest path $P$ of length $d(P) \le \Delta/\beta$, the number of clusters of the partition that meet the path $P$, denoted $Z_P$, satisfies $\Pr[Z_P > t] \le 2e^{-\Omega(t)}$ for all $t > 0$.

∗ Weizmann Institute of Science, {lior.kamma,robert.krauthgamer}@weizmann.ac.il. This work was supported in part by a US-Israel BSF grant #2010418, and by the Citi Foundation. Part of this work was done while visiting Microsoft Research New England.
† Princeton University, hlnguyen@princeton.edu. This work was supported in part by NSF CCF 0832797, and a Gordon Wu Fellowship. Part of this work was done while interning at Microsoft Research New England.
1 Introduction

Graph compression describes the transformation of a given graph $G$ into a small graph $G'$ that preserves certain features (quantities) of $G$, such as distances or cut values. Notable examples of this genre include graph spanners, distance oracles, cut sparsifiers, and spectral sparsifiers; see e.g. [PS89, TZ05, BK96, BSS09] and references therein. The algorithmic utility of such graph transformations is clear: once the “compressed” graph $G'$ is computed (as a preprocessing step), further processing can be performed on $G'$ instead of on $G$, using fewer resources such as runtime and memory, or achieving better accuracy (when the solution is approximate). See more in Section 1.3.

Within this context, we study vertex-sparsification, where $G$ has a designated subset of vertices $T$, and the goal is to reduce the number of vertices in the graph while maintaining certain properties of $T$. A prime example of this genre is vertex-sparsifiers that preserve terminal versions of (multicommodity) cut and flow problems, a successful direction that was initiated by Moitra [Moi09] and extended in several followups [LM10, CLLM10, MM10, EGK+10, Chu12]. Our focus here is different, on preserving distances, a direction that was seeded by Gupta [Gup01] more than a decade ago.

Steiner Point Removal (SPR). Let $G = (V, E, w)$ be an edge-weighted graph¹ and let $T = \{t_1, \ldots, t_k\} \subseteq V$ be a designated set of $k$ terminals. Here and throughout, $d_{G,w}(\cdot,\cdot)$ denotes the shortest-path metric induced by $w$ on the vertices of $G$. The Steiner Point Removal problem asks to construct on the terminals a new graph $G' = (T, E', w')$ such that (i) distances between the terminals are distorted by at most a factor $\alpha \ge 1$, formally
\[ \forall u, v \in T, \qquad d_{G,w}(u,v) \le d_{G',w'}(u,v) \le \alpha \cdot d_{G,w}(u,v); \]
and (ii) the graph $G'$ is (isomorphic to) a minor of $G$.
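Requirement (i) can be checked mechanically for any candidate $G'$. The following is a minimal sketch, not part of the paper's construction: it assumes both graphs are connected and given as adjacency dictionaries, and the helper names `make_graph`, `dijkstra`, and `spr_distortion` are hypothetical. It does not check requirement (ii).

```python
import heapq
from itertools import combinations


def make_graph(edges):
    """Build an undirected adjacency dict u -> {v: weight} from (u, v, w) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def dijkstra(adj, source):
    """Shortest-path distances from source to every reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def spr_distortion(g, g_prime, terminals):
    """Smallest alpha for which requirement (i) holds, or None if G' does not dominate G."""
    alpha = 1.0
    for u, v in combinations(terminals, 2):
        d = dijkstra(g, u)[v]
        d_prime = dijkstra(g_prime, u)[v]
        if d_prime < d - 1e-12:
            return None  # domination d_G <= d_G' is violated
        alpha = max(alpha, d_prime / d)
    return alpha


# Illustration: a unit-weight star with Steiner center "x" and terminals a, b, c.
star = make_graph([("x", "a", 1.0), ("x", "b", 1.0), ("x", "c", 1.0)])
# Contracting the edge (x, b) leaves the path a - b - c; each edge gets weight d_G = 2.
path = make_graph([("a", "b", 2.0), ("b", "c", 2.0)])
alpha = spr_distortion(star, path, ["a", "b", "c"])
```

In this toy instance the pair $(a, c)$ is stretched from distance 2 to distance 4, so the sketch reports distortion 2.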
This formulation of the SPR problem was proposed by Chan, Xia, Konjevod, and Richa [CXKR06, Section 5], who posed the problem of bounding the distortion $\alpha$ (existentially and/or using an efficient algorithm). Our main result is to resolve their open question.

Requirement (ii) above expresses structural similarity between $G$ and $G'$; for instance, if $G$ is planar then so is $G'$. The SPR formulation above actually came about as a generalization of a result of Gupta [Gup01], which achieves $\alpha = 8$ for the case where $G$ is a tree,² and this factor of 8 was proved by [CXKR06] to be tight. The upper bound for trees was later extended by Basu and Gupta [BG08], who achieve distortion $\alpha = O(1)$ for the larger class of outerplanar graphs.

How to construct minors. We now describe a general methodology that is natural for the SPR problem. The first step constructs a minor $G'$ with vertex set $T$, but without any edge weights, and is prescribed by Definition 1.1. The second step determines edge weights $w'$ such that $d_{G',w'}$ dominates $d_{G,w}$ on the terminals $T$, and is given in Definition 1.2. These steps are illustrated in Figure 1. Our definitions are actually more general (anticipating the technical sections), and consider $G'$ whose vertex set is sandwiched between $T$ and $V$. Define a partial partition of $V$ to be a collection $V_1, \ldots, V_k$ of pairwise disjoint subsets of $V$.

¹ Throughout, all graphs are undirected and all edge weights are positive.
² In fact, Gupta [Gup01] only argued that $G'$ is a tree. Chan et al. [CXKR06] observed later that this same $G'$ is actually a minor of $G$.

Figure 1: $G$ is a 9-cycle with 3 terminals and unit edge weights. Its terminal-centered minor and the standard restriction edge weights are shown on the right.

Definition 1.1 (Terminal-Centered Minor). Let $G = (V, E)$ be a graph with $k$ terminals $T = \{t_1, \ldots, t_k\}$, and let $V_1$, . . .
, $V_k$ be a partial partition of $V$, such that each induced subgraph $G[V_j]$ is connected and contains $t_j$. The graph $G' = (V', E')$ obtained by contracting each $G[V_j]$ into a single vertex that is identified with $t_j$, is called the terminal-centered minor of $G$ induced by $V_1, \ldots, V_k$.

By identifying the “contracted super-node” $V_j$ with $t_j$, we may think of the vertex-set $V'$ as containing $T$ and (possibly) some vertices from $V \setminus T$. A terminal-centered minor $G'$ of $G$ can also be described by a mapping $f : V \to T \cup \{\perp\}$, such that $f|_T \equiv \mathrm{id}$ and $f^{-1}(\{t_j\})$ is connected in $G$ for all $j \in [k]$. Indeed, simply let $V_j = f^{-1}(\{t_j\})$ for all $j \in [k]$, and thus $V \setminus (\cup_j V_j) = f^{-1}(\{\perp\})$.

Definition 1.2 (Standard Restriction). Let $G = (V, E, w)$ be an edge-weighted graph with terminal set $T$, and let $G' = (V', E')$ be a terminal-centered minor of $G$. The standard restriction of $w$ to $G'$ is the edge weight $w'$ given by the respective distances in $G$, formally
\[ \forall (x, y) \in E', \qquad w'_{xy} := d_{G,w}(x, y). \]

This edge weight $w'$ is optimal in the sense that $d_{G',w'}$ dominates $d_{G,w}$ (where it is defined, i.e., on $V'$), and the weight of each edge $(x, y) \in E'$ is minimal under this domination condition.

1.1 Main Result

Our main result below gives an efficient algorithm that achieves $\mathrm{polylog}(k)$ distortion for the SPR problem. Its proof spans Sections 3 and 4, though the former one contains the heart of the matter.

Theorem 1.3. Let $G = (V, E, w)$ be an edge-weighted graph with $k$ terminals $T \subseteq V$. Then there exists a terminal-centered minor $G' = (T, E', w')$ of $G$, such that $G'$ has distortion $O(\log^6 k)$, i.e.,
\[ \forall u, v \in T, \qquad 1 \le \frac{d_{G',w'}(u,v)}{d_{G,w}(u,v)} \le O(\log^6 k). \]
Moreover, $w'$ is the standard restriction of $w$, and $G'$ is computable in randomized polynomial time.
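To make Definitions 1.1 and 1.2 concrete, the following minimal sketch builds a terminal-centered minor with standard-restriction weights from a mapping $f$. It is an illustration under assumptions, not the construction behind Theorem 1.3: $f$ is taken to be total (every vertex is assigned to a terminal, so $V' = T$), each preimage $f^{-1}(\{t_j\})$ is assumed to be connected and is not checked, and the function names are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Shortest-path distances from source; adj maps u -> {v: weight}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def terminal_centered_minor(adj, f):
    """Contract each class f^{-1}(t) into t and weight every surviving edge (a, b)
    by d_G(a, b), i.e. the standard restriction of w (Definition 1.2).

    Assumes f assigns every vertex to a terminal and each class is connected.
    """
    dist_from = {}
    minor = {t: {} for t in set(f.values())}
    for u in adj:
        for v in adj[u]:
            a, b = f[u], f[v]
            if a != b:  # edge between two different contracted classes
                if a not in dist_from:
                    dist_from[a] = dijkstra(adj, a)
                minor[a][b] = dist_from[a][b]
    return minor


# A unit-weight 9-cycle with terminals 0, 3, 6; each vertex goes to its nearest terminal.
cycle = {v: {(v - 1) % 9: 1.0, (v + 1) % 9: 1.0} for v in range(9)}
f = {0: 0, 1: 0, 8: 0, 2: 3, 3: 3, 4: 3, 5: 6, 6: 6, 7: 6}
minor = terminal_centered_minor(cycle, f)
```

With this particular (assumed) partition the minor is a triangle on the three terminals, and every edge receives weight 3, the distance between its endpoints in the cycle.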
This theorem answers a question of [CXKR06].³ The only distortion lower bound known for general graphs is a factor of 8 (which actually holds for trees) [CXKR06], and thus it remains a challenging open question whether $O(1)$ distortion can be achieved in general graphs.

Our proof of Theorem 1.3 begins similarly to [EGK+10], by iterating over the “distance scales” $2^i$, going from the smallest distance $d_{G,w}(u,v)$ among all terminals $u, v \in T$, towards the largest such distance. Each iteration $i$ first employs a “stochastic decomposition”, which is basically a randomized procedure that partitions $V$ into so-called “clusters” (disjoint subsets) whose diameter is at most $2^i$. Then, some clusters are contracted to a nearby terminal, which must be “adjacent” to the cluster; this way, the current graph is a minor of the previous iteration’s graph, and thus also of the initial $G$. Let us now skip some technical details about the connectivity of the clusters, which is important for the contractions to be legitimate, and just say that at a high level, after iteration $i$ is executed, we roughly expect “areas” of radius proportional to $2^i$ around the terminals to be contracted. As $i$ increases, these areas get larger until eventually all the vertices are contracted into terminals, at which point the weights are set to be the standard restriction.

The main challenge is to control the distortion, and this is where we crucially deviate from [EGK+10] (and differ from all previous work). In their randomized construction of a minor $G'$, for every two terminals $u, v \in T$ it is shown that $G'$ contains a $u$-$v$ path of expected length at most $O(\log k)\, d_G(u,v)$. Consequently, they design a distribution $D$ over minors $G'$, such that the stretch $d_{G'}(u,v)/d_G(u,v)$ between any $u, v \in T$ has expectation at most $O(\log k)$.⁴ In contrast, in our randomized construction of $G'$, the stretch between $u, v \in T$ is polylogarithmic with high probability, say at least $1 - 1/k^3$.
Applying a simple union bound over the $\binom{k}{2}$ terminal pairs, we can then obtain a single graph $G'$ achieving a polylogarithmic distortion. Technically, these bounds follow by fixing in $G$ a shortest path $P$ between two terminals $u, v \in T$, and then tracking the execution of the randomized algorithm to analyze how $P$ evolves into a $u$-$v$ path $P'$ in $G'$. In [EGK+10], the length of $P'$ is analyzed in expectation, which, by linearity of expectation, follows from analyzing a single edge. In contrast, we provide for it a high-probability bound, which inevitably must consider (anti)correlations along the path.

The next section features a new tool that we developed in our quest for high-probability bounds, and which may be of independent interest. For the sake of clarity, we provide below a vanilla version that excludes technical complications such as terminals, strong diameter, and consistency between scales. The proof of Theorem 1.3 actually does require these complications, and thus cannot use the generic form described below.

³ We remark that distortion $k$ can be achieved relatively easily, by considering a mapping $f$ that sends every vertex into its nearest terminal, and taking the minor corresponding to $f$ with standard-restriction edge weights.

1.2 A Key Technique: Metric Decomposition with Concentration

Metric decomposition. Let $(X, d)$ be a metric space, and let $\Pi$ be a partition of $X$. Every $S \in \Pi$ is called a cluster, and for every $x \in X$, we use $\Pi(x)$ to denote the unique cluster $S \in \Pi$ such that $x \in S$. In general, a stochastic decomposition of the metric $(X, d)$ is a distribution $\mu$ over partitions of $X$, although we usually impose additional requirements. The following definition is perhaps the most basic version, often called a separating decomposition or a Lipschitz decomposition.

Definition 1.4. A metric space $(X, d)$ is called $\beta$-decomposable if for every $\Delta > 0$ there is a probability distribution $\mu$ over partitions of $X$, satisfying the following requirements:
(a). Diameter bound: for all $\Pi \in \mathrm{supp}(\mu)$ and all $S \in \Pi$, $\mathrm{diam}(S) \le \Delta$.
(b). Separation probability: for all $x, y \in X$, $\Pr_{\Pi \sim \mu}[\Pi(x) \ne \Pi(y)] \le \frac{\beta\, d(x,y)}{\Delta}$.

⁴ But it is possible that no $G' \in \mathrm{supp}(D)$ achieves a low stretch simultaneously for all $u, v \in T$.

Bartal [Bar96] proved that every $n$-point metric is $O(\log n)$-decomposable, and that this bound is tight. We remark that by now there is a rich literature on metric decompositions, and different variants of this notion may involve terminals, or (in a graphical context) connectivity requirements inside each cluster, see e.g. [Bar96, CKR01, FRT04, Bar04, LN05, GNR10, EGK+10, MN07, AGMW10, KR11].

Degree of separation. Let $P = (x_0, x_1, \ldots, x_\ell)$ be a shortest path between $x_0, x_\ell \in X$, i.e., a sequence of points in $X$ such that $\sum_{i \in [\ell]} d(x_{i-1}, x_i) = d(x_0, x_\ell)$. We denote its length by $d(P) := d(x_0, x_\ell)$, and say that $P$ meets a cluster $S \subseteq X$ if $S \cap P \ne \emptyset$. Given a partition $\Pi$ of $X$, define the degree of separation $Z_P(\Pi)$ as the number of different clusters in the partition $\Pi$ that meet $P$. Formally
\[ Z_P(\Pi) := \sum_{S \in \Pi} \mathbb{1}_{\{P \text{ meets } S\}}. \tag{1} \]
Throughout, we omit the partition $\Pi$ when it is clear from the context. When we consider a random partition $\Pi \sim \mu$, the corresponding $Z_P = Z_P(\Pi)$ is actually a random variable. If this distribution $\mu$ satisfies requirement (b) of Definition 1.4, then
\[ \mathbb{E}_{\Pi \sim \mu}[Z_P] \le 1 + \sum_{i \in [\ell]} \Pr_{\Pi \sim \mu}[\Pi(x_{i-1}) \ne \Pi(x_i)] \le 1 + \sum_{i \in [\ell]} \frac{\beta\, d(x_{i-1}, x_i)}{\Delta} = 1 + \frac{\beta\, d(P)}{\Delta}. \tag{2} \]
But what about the concentration of $Z_P$? More precisely, can every finite metric be decomposed, such that every shortest path $P$ admits a tail bound on its degree of separation $Z_P$?

A tail bound. We answer this last question in the affirmative using the following theorem. We prove it, or actually a stronger version that does involve terminals, in Section 2.

Theorem 1.5.
For every $n$-point metric space $(X, d)$ and every $\Delta > 0$ there is a probability distribution $\mu$ over partitions of $X$ that satisfies, for $\beta = O(\log n)$, requirements (a)-(b) of Definition 1.4, and furthermore

(c). Degree of separation: For every shortest path $P$ of length $d(P) \le \Delta/\beta$,
\[ \forall t \ge 1, \qquad \Pr_{\Pi \sim \mu}[Z_P > t] \le 2e^{-\Omega(t)}. \tag{3} \]

The tail bound (3) can be compared to a naive estimate that holds for every $\beta$-decomposition $\mu$: using (2) we have $\mathbb{E}[Z_P] \le 2$, and then by Markov’s inequality $\Pr[Z_P \ge t] \le 2/t$.

1.3 Related Work

Applications. Vertex-sparsification, and the “graph compression” approach in general, is obviously beneficial when $G'$ can be computed from $G$ very efficiently, say in linear time, and then $G'$ may be computed on the fly rather than in advance. But compression may be valuable also in scenarios that require the storage of many graphs, like archiving and backups, or rely on low-throughput communication, like distributed or remote processing. For instance, the succinct nature of $G'$ may be indispensable for computations performed frequently, say on a smartphone, with preprocessing done in advance on a powerful machine.

We do not have new theoretical applications that leverage our SPR result, although we anticipate these will be found later. Either way, we believe this line of work will prove technically productive, and may influence, e.g., work on metric embeddings and on approximate min-cut/max-flow theorems.

Probabilistic SPR. Here, the objective is not to find a single graph $G' = (T, E', w')$, but rather a distribution $D$ over graphs $G' = (T, E', w')$, such that every graph $G' \in \mathrm{supp}(D)$ is isomorphic to a minor of $G$ and its distances $d_{G',w'}$ dominate $d_{G,w}$ (on $T \times T$), and such that the distortion inequalities hold in expectation, that is,
\[ \forall u, v \in T, \qquad \mathbb{E}_{G' \sim D}[d_{G',w'}(u,v)] \le \alpha \cdot d_{G,w}(u,v). \]
This problem, first posed in [CXKR06], was answered in [EGK+10] with $\alpha = O(\log |T|)$.

Distance Preserving Minors.
This problem differs from SPR in that $G'$ may contain a few non-terminals, but all terminal distances should be preserved exactly. Formally, the objective is to find a small graph $G' = (V', E', w')$ such that (i) $G'$ is isomorphic to a minor of $G$; (ii) $T \subseteq V' \subseteq V$; and (iii) for every $u, v \in T$, $d_{G',w'}(u,v) = d_{G,w}(u,v)$. This problem was originally defined by Krauthgamer and Zondiner [KZ12], who showed an upper bound $|V'| \le O(|T|^4)$ for general graphs, and a lower bound of $\Omega(|T|^2)$ that holds even for planar graphs.

2 Metric Decomposition with Concentration

We now prove a slightly stronger result than that of Theorem 1.5, stated as Theorem 2.2. Let $(X, d)$ be a metric space, and let $T = \{t_1, \ldots, t_k\} \subseteq X$ be a designated set of terminals. Recall that a partial partition $\Pi$ of $X$ is a collection of pairwise disjoint subsets of $X$. For a shortest path $P$ in $X$, define $Z_P = Z_P(\Pi)$ using Eqn. (1), which is similar to before, except that now $\Pi$ is a partial partition. We first extend Definition 1.4.

Definition 2.1. We say that $X$ is $\beta$-terminal-decomposable with concentration if for every $\Delta > 0$ there is a probability distribution $\mu$ over partial partitions of $X$, satisfying the following properties.
• Diameter Bound: For all $\Pi \in \mathrm{supp}(\mu)$ and all $S \in \Pi$, $\mathrm{diam}(S) \le \Delta$.
• Separation Probability: For every $x, y \in X$,
\[ \Pr_{\Pi \sim \mu}\big[\exists S \in \Pi \text{ such that } |S \cap \{x, y\}| = 1\big] \le \frac{\beta\, d(x,y)}{\Delta}. \]
• Terminal Cover: For every $\Pi \in \mathrm{supp}(\mu)$, $T \subseteq \bigcup_{S \in \Pi} S$.
• Degree of Separation: For every shortest path $P$ and for every $t \ge 1$,
\[ \Pr_{\Pi \sim \mu}\Big[Z_P > \frac{t\beta}{\Delta} \max\{d(P), \Delta/\beta\}\Big] \le O\Big(\min\Big\{k\beta, \Big\lceil \frac{d(P)\beta}{\Delta} \Big\rceil\Big\}\Big)\, e^{-\Omega(t)}. \]

Theorem 2.2. Every finite metric space with $k$ terminals is $(4 \log k)$-terminal-decomposable with concentration.

Define the truncated exponential with parameters $\lambda, \Delta > 0$, denoted $\mathrm{Texp}(\lambda, \Delta)$, to be the distribution given by the probability density function
\[ g_{\lambda,\Delta}(x) = \frac{1}{\lambda(1 - e^{-\Delta/\lambda})}\, e^{-x/\lambda} \qquad \text{for all } x \in [0, \Delta). \]
We are now ready to prove Theorem 2.2.
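As an aside, the truncated exponential defined above is easy to sample by inverting its cumulative distribution function $F(x) = (1 - e^{-x/\lambda})/(1 - e^{-\Delta/\lambda})$. The sketch below is only an illustration; the function names are hypothetical and nothing in the proof depends on this particular sampling method.

```python
import math
import random


def texp_pdf(x, lam, delta):
    """Density g_{lam,delta}(x) of Texp(lam, delta); zero outside [0, delta)."""
    if not 0.0 <= x < delta:
        return 0.0
    return math.exp(-x / lam) / (lam * (1.0 - math.exp(-delta / lam)))


def sample_texp(lam, delta, rng=random):
    """Draw one sample from Texp(lam, delta) by inverse-CDF sampling.

    Solving F(x) = u for u uniform in [0, 1) gives
    x = -lam * ln(1 - u * (1 - exp(-delta / lam))), which lies in [0, delta).
    """
    u = rng.random()
    return -lam * math.log(1.0 - u * (1.0 - math.exp(-delta / lam)))


# Example: radii for Delta = 3 and lam = Delta / ln(k) with k = 20 terminals.
rng = random.Random(1)
delta = 3.0
lam = delta / math.log(20)
radii = [sample_texp(lam, delta, rng) for _ in range(5)]
```

Whether $\log$ in the paper denotes the natural logarithm is an assumption of this example; a different base only changes $\lambda$ by a constant factor.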
For simplicity of notation, we prove the result with cluster diameter at most $2\Delta$ instead of $\Delta$. Fix a desired diameter bound $\Delta > 0$, and set for the rest of the proof $\lambda := \frac{\Delta}{\log k}$ and $g := g_{\lambda,\Delta}$. For $x \in X$ and $r > 0$, we use the standard notation of a closed ball $B(x, r) := \{y \in X : d(x, y) \le r\}$. We define the distribution $\mu$ via the following procedure that samples a partial partition $\Pi$ of $X$.

for $j = 1, 2, \ldots, k$ do
    choose independently at random $R_j \sim \mathrm{Texp}(\lambda, \Delta)$, and let $B_j = B(t_j, R_j)$.
    set $S_j = B_j \setminus \bigcup_{m=1}^{j-1} B_m$.
return $\Pi = \{S_1, \ldots, S_k\} \setminus \{\emptyset\}$.

The diameter bound and terminal cover properties hold by construction. The proof of the separation probability property is identical to the one in [Bar96, Section 3]. The following two lemmas prove the degree of separation property, which will conclude the proof of Theorem 2.2. Fix a shortest path $P$ in $X$, and let us assume that $t/2$ is a positive integer; a general $t \ge 1$ can be reduced to this case up to a loss in the unspecified constant.

Lemma 2.3. If $d(P) < \lambda$, then $\Pr[Z_P > t] \le 2e^{-\Omega(t)}$.

Proof. Split the $k$ terminals into $J_{\mathrm{far}} := \{j \in [k] : d(t_j, P) > \Delta - 2\lambda\}$ and $J_{\mathrm{near}} := [k] \setminus J_{\mathrm{far}}$. Define random variables $Z_{\mathrm{far}} := \#\{j \in J_{\mathrm{far}} : B_j \cap P \ne \emptyset\}$ and $Z_{\mathrm{near}} := \#\{j \in J_{\mathrm{near}} : S_j \cap P \ne \emptyset\}$. Then $Z_P \le Z_{\mathrm{far}} + Z_{\mathrm{near}}$ and
\[ \Pr[Z_P > t] \le \Pr[Z_{\mathrm{far}} + Z_{\mathrm{near}} > t] \le \Pr[Z_{\mathrm{far}} > t/2] + \Pr[Z_{\mathrm{near}} > t/2]. \]
For every $j \in J_{\mathrm{far}}$,
\[ \Pr[B_j \cap P \ne \emptyset] \le \Pr[R_j \ge \Delta - 2\lambda] = \int_{\Delta - 2\lambda}^{\Delta} g(x)\, dx = \frac{k}{k-1}\Big(e^{-\frac{\Delta - 2\lambda}{\lambda}} - e^{-\frac{\Delta}{\lambda}}\Big) \le \frac{8}{k}, \]
and therefore $\mathbb{E}[Z_{\mathrm{far}}] \le 8$. Since $Z_{\mathrm{far}}$ is the sum of independent indicators, by a Chernoff bound $\Pr[Z_{\mathrm{far}} > t/2] \le 2^{-t/2}$ for all $t \ge 32e$. For smaller $t$, observe that $\Pr[Z_{\mathrm{far}} = 0] \ge (1 - 8/k)^k \ge \Omega(1)$, and thus for every $t \ge 1$ we have $\Pr[Z_{\mathrm{far}} > t/2] \le e^{-\Omega(t)}$.

Next, consider which balls among $\{B_j : j \in J_{\mathrm{near}}\}$ have non-empty intersection with $P$, and let $j_1 < j_2 < \ldots$ denote their indices.
Formally, we condition henceforth on an event $\mathcal{E}$ that determines whether $R_j \ge d(t_j, P)$ occurs or not for each $j \in J_{\mathrm{near}}$. For $a = 1, 2, \ldots$, let $Y_a$ be an indicator for the event that ball $B_{j_a}$ does not contain $P$. Then
\[ \Pr[Y_a = 1 \mid \mathcal{E}] = \Pr\big[P \not\subseteq B_{j_a} \mid P \cap B_{j_a} \ne \emptyset\big] \le \Pr\big[R_{j_a} < d(t_{j_a}, P) + \lambda \mid R_{j_a} > d(t_{j_a}, P)\big] \le \frac{1 - e^{-1}}{1 - e^{-2}} \le \frac{3}{4}. \]
Having conditioned on $\mathcal{E}$, the event $\{Z_{\mathrm{near}} > t/2\}$ implies that $Y_a = 1$ for all $a \in [t/2]$, and since these random variables $Y_a$ are independent, $\Pr[Z_{\mathrm{near}} > t/2 \mid \mathcal{E}] \le (3/4)^{t/2} \le e^{-Ct}$ for an appropriate constant $C > 0$. The last inequality holds for all such events $\mathcal{E}$ (with the same constant $C > 0$), and thus also without any such conditioning. Altogether, we conclude that $\Pr[Z_P > t] \le 2e^{-\Omega(t)}$.

Lemma 2.4. If $d(P) \ge \lambda$, then $\Pr[Z_P > t\, d(P)/\lambda] \le O\big(\min\big\{k \log k, \big\lceil \frac{d(P)}{\lambda} \big\rceil\big\}\big)\, e^{-\Omega(t)}$.

Proof. Treating $P$ as a continuous path, subdivide it into $r := \lceil d(P)/\lambda \rceil$ segments, say segments of equal length that are (except for the last one) half open and half closed. The induced subpaths $P_1, \ldots, P_r$ of $P$ are disjoint (as subsets of $X$) and have length at most $\lambda$ each, though some of the subpaths may contain only one or even zero points of $X$. Since every cluster that meets $P$ meets at least one $P_i$, we have $Z_P \le \sum_{i \in [r]} Z_{P_i}$, so we can apply a union bound and then Lemma 2.3 on each $P_i$, to obtain
\[ \Pr[Z_P > t\, d(P)/\lambda] \le \Pr\big[\exists i \in [r] \text{ such that } Z_{P_i} > t/2\big] \le O\Big(\frac{d(P)}{\lambda}\Big) \cdot e^{-\Omega(t)}. \]
Furthermore, for every $j \in [k]$, let $A_j := \{i \in [r] : P_i \cap B(t_j, \Delta) \ne \emptyset\}$, and since $P$ is a shortest path, $|A_j| \le 4\Delta/\lambda = 4 \log k$. Observe that $Z_{P_i} = 0$ (with certainty) for all $i \notin \cup_j A_j$, hence
\[ \Pr[Z_P > t\, d(P)/\lambda] \le \Pr\big[\exists i \in \cup_{j \in [k]} A_j \text{ such that } Z_{P_i} > t/2\big] \le 4k \log k \cdot e^{-\Omega(t)}. \]

3 Terminal-Centered Minors: Main Construction

This section proves Theorem 1.3 when $D := \frac{\max_{u,v \in T} d_G(u,v)}{\min_{u,v \in T} d_G(u,v)}$ satisfies the following assumption.

Assumption 3.1. $D \le 2^{k^3}$.

By scaling all edge weights, we may further assume that $\min_{u,v \in T} d_G(u,v) = 1$.

Notation 1. Let $V_1, \ldots, V_k \subseteq V$. For $S \subseteq [k]$, denote $V_S := \bigcup_{j \in S} V_j$.
In addition, denote $V_\perp := V \setminus V_{[k]}$ and $V_{\perp+j} := V_\perp \cup V_j$ for all $j \in [k]$.

We now present a randomized algorithm that, given a graph $G = (V, E, w)$ and terminals $T \subset V$, constructs a terminal-centered minor $G'$ as stated in Theorem 1.3. This algorithm maintains a partial partition $\{V_1, V_2, \ldots, V_k\}$ of $V$, starting with $V_j = \{t_j\}$ for all $j \in [k]$. The sets grow monotonically during the execution of the algorithm. We may also think of the algorithm as if it maintains a mapping $f : V \to T \cup \{\perp\}$, starting with $f|_T = \mathrm{id}$ and gradually assigning a value in $T$ to additional vertices, which correspond to the set $V_{[k]}$. Thus, we will also refer to the vertices in $V_{[k]}$ as assigned, and to vertices in $V_\perp$ as unassigned. The heart of the algorithm is two nested loops (lines 4-9). During every iteration of the outer loop, the inner loop performs $k$ iterations, one for every terminal $t_j$. Every inner-loop iteration picks a random radius (from an exponential distribution) and “grows” $V_j$ to that radius (but without overlapping any other set). Every outer-loop iteration increases the expectation of the radius distribution. Eventually, all nodes are assigned, i.e. $\{V_1, V_2, \ldots, V_k\}$ is a partition of $V$.

Definition 3.2. Let $U \subseteq V$; by $G[U]$ we denote the subgraph of $G$ induced by $U$, with induced edge lengths (i.e. $w|_{E(G[U])}$). For a subgraph $H$ of $G$ with induced edge lengths, a vertex $v \in V(H)$ and $r > 0$, denote $B_H(v, r) := \{u \in V(H) : d_H(u, v) \le r\}$, where $d_H$ is the shortest path metric in $H$ induced by $w$.

Input: $G = (V, E, w)$, $T = \{t_1, \ldots, t_k\} \subseteq V$
Output: A partition $\{V_1, V_2, \ldots, V_k\}$ of $V$.
1: set $b \leftarrow 1 + 1/(35 \log k)$
2: for every $j \in [k]$ set $V_j \leftarrow \{t_j\}$, $r_j \leftarrow 0$.
3: set $i \leftarrow 0$. // $i$ is the iteration number of the outer loop.
4: while $V_{[k]} \ne V$ do
5:     $i \leftarrow i + 1$.
6:     for all $j \in [k]$ do
7:         choose independently at random $R_j^i \sim \exp(b^i)$.
8:         $r_j \leftarrow r_j + R_j^i$.
9:         $V_j \leftarrow V_j \cup B_{G[V_{\perp+j}]}(t_j, r_j)$. // Actually, this is the same as $V_j \leftarrow B_{G[V_{\perp+j}]}(t_j, r_j)$.
10: return $\{V_1, V_2, \ldots, V_k\}$.
Algorithm 1: Partitioning $V$

Claim 3.3. The following properties hold throughout the execution of the algorithm.
1. For all $j \in [k]$, $V_j$ is connected in $G$, and $t_j \in V_j$.
2. For every $j_1, j_2 \in [k]$, if $j_1 \ne j_2$, then $V_{j_1} \cap V_{j_2} = \emptyset$.
3. For every outer loop iteration $i$ and every $j \in [k]$, if $V_j'$ denotes the set $V_j$ in the beginning of the $i$th iteration (of the outer loop), and $V_j''$ denotes the set $V_j$ at the end of that iteration, then $V_j' \subseteq V_j''$.

In what follows, we analyze the stretch in distance between a fixed pair of terminals. We show that with probability at least $1 - 1/k^3$, the distance between these terminals in $G'$ is at most $O(\log^6 k)$ times their distance in $G$. By a union bound over all $\binom{k}{2}$ pairs of terminals, we deduce Theorem 1.3. Let $s, t \in T$, and let $P^*$ be a shortest $st$-path in $G$. Due to the triangle inequality, we may focus on pairs which satisfy $V(P^*) \cap T = \{s, t\}$, where $V(P^*)$ is the node set of $P^*$. We denote $\ell := w(P^*) = d_{G,w}(s, t)$.

3.1 High Level Analysis

Following an execution of the algorithm, we maintain a (dynamic) path $P$ between $s$ and $t$. In a sense, in every step of the algorithm, $P$ simulates an $st$-path in the terminal-centered minor induced by $V_1, V_2, \ldots, V_k$. In the beginning of the execution, set $P$ to be simply $P^*$. During the course of the execution, we update $P$ to maintain two invariants: At every step of the algorithm, the weight of $P$ is an upper bound on the distance between $s$ and $t$ in the terminal-centered minor induced by $V_1, \ldots, V_k$ (in that step). In addition, if $I$ is a subpath of $P$ whose inner vertices are all unassigned, then $I$ is a subpath of $P^*$. Throughout the analysis, we think of $P$ as directed from $s$ to $t$, thus inducing a linear ordering of the vertices in $P$.

Definition 3.4. A subpath of $P$ will be called active if it is a maximal subpath whose inner vertices are unassigned.

We now describe how $P$ is updated during the execution of the algorithm.
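Before turning to the path updates, it may help to see Algorithm 1 in executable form. The following is a minimal sketch under stated assumptions, not the authors' implementation: $\exp(b^i)$ is read as an exponential distribution with mean $b^i$, $\log$ is taken to be the natural logarithm, the graph is connected with $k \ge 2$ terminals, and the names `ball` and `partition` are hypothetical.

```python
import heapq
import math
import random


def ball(adj, allowed, center, radius):
    """B_H(center, radius), where H is the subgraph of G induced by `allowed`."""
    dist = {center: 0.0}
    heap = [(0.0, center)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if v in allowed and nd <= radius and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return set(dist)


def partition(adj, terminals, rng):
    """Sketch of Algorithm 1; returns a dict mapping every vertex to its terminal."""
    k = len(terminals)
    b = 1.0 + 1.0 / (35.0 * math.log(k))   # line 1 (natural log is an assumption)
    owner = {t: t for t in terminals}      # line 2: V_j = {t_j}
    r = {t: 0.0 for t in terminals}        # line 2: r_j = 0
    i = 0                                  # line 3
    while len(owner) < len(adj):           # line 4: some vertex is still unassigned
        i += 1                             # line 5
        for t in terminals:                # line 6
            r[t] += rng.expovariate(1.0 / b ** i)   # lines 7-8: radius increment, mean b^i
            # Line 9: grow V_j inside G[V_perp + j], i.e. unassigned vertices plus V_j.
            allowed = {v for v in adj if owner.get(v, t) == t}
            for v in ball(adj, allowed, t, r[t]):
                owner[v] = t
    return owner                           # line 10


# Example run on a unit-weight 9-cycle with terminals 0, 3, 6.
cycle = {v: {(v - 1) % 9: 1.0, (v + 1) % 9: 1.0} for v in range(9)}
owner = partition(cycle, [0, 3, 6], random.Random(0))
```

The returned classes are random, so no particular output is claimed here; what the sketch does guarantee, in line with Claim 3.3, is that the classes are disjoint, that each contains its own terminal, and that each is connected in $G$.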
Consider line 9 of the algorithm for the $i$th iteration of the outer loop, and some $j \in [k]$. We say that the ball $B = B_{G[V_{\perp+j}]}(t_j, r_j)$ punctures an active subpath $A$ of $P$, if there is an inner node of $A$ that belongs to the ball. If $B$ does not puncture any active subpath of $P$, we do not change $P$. Otherwise, denote by $u, v$ the first and last unassigned nodes (possibly not in the same active subpath) in $V(P) \cap B$ respectively. Then we do the following:

1. We replace the entire subpath of $P$ between $u$ and $v$ with a concatenation of a shortest $ut_j$-path and a shortest $t_jv$-path that lie in $B$; this is possible, since $G[B]$ is connected, and $u, t_j, v \in B$. This addition to $P$ will be called a detour from $u$ to $v$ through $t_j$. The process is illustrated in Figures 2(a)-2(c). Beginning with $P^*$, the figure describes the update after the first four balls. Note that the detour might not be a simple path. It is also worth noting that here $u$ and $v$ may belong to different active subpaths of $P$. For example, in Figure 2(c), the new ball punctures two active subpaths, and therefore in Figure 2(d), the detour goes from a node in one active subpath to a node in another active subpath. Note that in this case, we remove from $P$ portions which are not active.

2. If, for some iteration $i' < i$ of the outer loop, and for some $u', v' \in V(P)$, we added a detour from $u'$ to $v'$ through $t_j$ in iteration $i'$, we keep only one detour through $t_j$, from the first of the nodes $u, u'$ to the last of the nodes $v, v'$. For example, in Figure 2(d), the ball centered in $t_3$ punctures an active subpath. Only one detour is kept in Figure 2(e).

The total weight of all detours during the execution will be called the additional weight to $P$ (ignoring portions of $P$ that are deleted from $P$). Denote the set of active subpaths of $P$ in the beginning of the $i$th iteration of the outer loop by $\mathcal{A}_i$. Let $V_1^{\mathrm{fin}}$, . . .
, $V_k^{\mathrm{fin}}$ be the partition returned by the algorithm, let $G'$ be the terminal-centered minor induced by that partition, and let $w'$ be the standard restriction of $w$ to $G'$. Denote by $P^{\mathrm{fin}}$ the path obtained at the end of the execution.

Claim 3.5. At every step of the algorithm the following holds:
1. The weight of $P$ is an upper bound on the distance between $s$ and $t$ in the terminal-centered minor induced by $V_1, \ldots, V_k$. Moreover, once $\mathcal{A}_i = \emptyset$ (namely, $P$ has no active subpaths), the additional weight of $P$ is an upper bound on the distance between $s$ and $t$ in the terminal-centered minor induced by $V_1^{\mathrm{fin}}, \ldots, V_k^{\mathrm{fin}}$ (actually, from this point on, $P = P^{\mathrm{fin}}$).
2. If $A$ is a subpath of $P$, whose inner points are all in $V_\perp$, then $A$ is a subpath of $P^*$.
3. If $A_1, A_2$ are two different active subpaths of $P$, they are internally disjoint.
4. $|\mathcal{A}_i| \le k$ for all $i$.

Proof. Follows easily by induction on $i, j$.

Figure 2: Updating $P$. (a) We begin with $P = P^*$. (b) In every iteration of the inner loop, we grow a terminal-centered ball. (c) Replacing a portion of $P$ with a detour. (d) Endpoints of a detour can belong to different active subpaths. (e) Joining detours.

Corollary 3.6. $d_{G',w'}(s, t) \le w(P^{\mathrm{fin}})$.

Let $A \in \mathcal{A}_i$. During the execution of the inner loop, $A$ is either removed from $P$ entirely, or some subpaths of $A$ remain active (perhaps $A$ remains active entirely). Therefore, for every $A' \in \mathcal{A}_{i+1}$, either $A'$ is a non-trivial subpath of $A$ (by non-trivial we mean $|V(A')| \ge 3$), or $A'$ and $A$ are internally disjoint. Therefore there is a laminar structure on $\bigcup_i \mathcal{A}_i$. We describe this structure using a tree $\mathcal{T}$, whose node set is $\{\langle i, A \rangle \mid A \in \mathcal{A}_i\}$.
The root of $\mathcal{T}$ is $\langle 1, P^* \rangle$, and for every $i$ and every $A \in \mathcal{A}_i$, the children of $\langle i, A \rangle$, if any, are all pairs $\langle i+1, A' \rangle$, where $A' \in \mathcal{A}_{i+1}$ is a subpath of $A$. Whenever we update $P$ we log the weight of the detour by charging it to one of the nodes of $\mathcal{T}$ as follows. Let $u, v$ be unassigned nodes in $P$, and consider a detour from $u$ to $v$ in the $i$th iteration of the outer loop for some $i$. The weight of the detour is charged to the (unique) active subpath $A \in \mathcal{A}_i$ such that $u \in A$. For every $i$ and $A \in \mathcal{A}_i$, denote the total weight charged to $\langle i, A \rangle$ by $w_{i,A}$ (if the node is never charged, the weight of the node is set to 0). Therefore,
\[ w(P^{\mathrm{fin}}) \le w(P^*) + \sum_{\langle i, A \rangle \in \mathcal{T}} w_{i,A}. \tag{4} \]
Equation (4) together with Corollary 3.6 imply that if we show that with high probability, the total weight charged to the tree is at most $O(\log^6 k) \cdot \ell$, we can deduce Theorem 1.3. For the rest of this section, we therefore prove the following lemma.

Lemma 3.7. With probability at least $1 - O(k^{-3})$, the total weight charged to the tree is at most $O(\log^6 k)\, \ell$.

Let $p = 1/100$. An active subpath $A \in \mathcal{A}_i$ will be called short if $w(A) \le p\, b^i$. Otherwise, $A$ will be called long (note that $P^* \in \mathcal{A}_1$ is long).

Definition 3.8. Let $i > 1$. Let $A \in \mathcal{A}_i$ be a short subpath. Denote by $\mathcal{T}_{i,A}$ the subtree of $\mathcal{T}$ rooted in $\langle i, A \rangle$. If, in addition, the parent of $\langle i, A \rangle$ in $\mathcal{T}$ consists of a long subpath, $\mathcal{T}_{i,A}$ will be called a short subtree of $\mathcal{T}$.

3.2 The Behavior of Short Subtrees

Clearly, every node of a short subtree of $\mathcal{T}$ consists of a short active subpath. The height and weight of short subtrees will play an important role in the analysis of the height and weight of $\mathcal{T}$, and therefore we begin by analyzing short subtrees. Let $i_0 > 1$. Let $A \in \mathcal{A}_{i_0}$ be a short subpath, and assume in addition that $\mathcal{T}_{i_0,A}$ is a short subtree of $\mathcal{T}$.

3.2.1 The Effect of a Single Ball

Fix some $i \ge i_0$, and some $j \in [k]$.
Let $X$ denote the number of active subpaths of $A$ in the beginning of the $j$th iteration of the inner loop (during the $i$th iteration of the outer loop). Let $X'$ denote the number of active subpaths of $A$ in the end of the $j$th iteration. Denote by $B$ the ball considered in this iteration, namely $B := B_{G[V_{\perp+j}]}(t_j, r_j)$.

Proposition 3.9. With certainty, $X' \le X + 1$.

Proof. Let $A_1, A_2, \ldots, A_X$ be the subpaths of $A$ which are active in the beginning of the $j$th iteration, ordered by their location on $P$. For $\alpha \in [X]$ denote by $u_\alpha, v_\alpha$ the first and last unassigned nodes in $A_\alpha$, respectively. If $B$ does not puncture any of these subpaths, then $X' \le X < X + 1$ (note that subpaths of $A$ can still be removed if $B$ punctures active subpaths of $P$ not contained in $A$). So assume $B$ punctures $A_\alpha$. Assume first that $A_\alpha$ is the only subpath of $P$ which is active and is punctured by $B$. Then there are three options: If both $u_\alpha, v_\alpha \in B$, then $A_\alpha$ is replaced and removed entirely from $P$ when adding the detour, and $X' \le X - 1 < X + 1$. If $u_\alpha \in B$ and $v_\alpha \notin B$, let $v'$ be the last node in $V(A_\alpha) \cap B$; then the $u_\alpha v'$ segment of $A_\alpha$ is replaced, and the segment $v' v_\alpha$ remains active. Therefore $X' \le X < X + 1$. The argument is similar if $u_\alpha \notin B$ and $v_\alpha \in B$. Otherwise, some of the inner portion of $A$ is replaced by a non-active path, and both end segments of $A_\alpha$ remain active, therefore $X' = X + 1$.

Next, assume the ball punctures several active subpaths of $A$, and maybe more subpaths of $P$. Denote by $I_\alpha, I_\beta$ the first and last subpaths of $A$ punctured by $B$. Denote by $u$ the first node in $V(I_\alpha) \cap B$, and by $v$ the last node in $V(I_\beta) \cap B$. When updating $P$, the entire subpath of $P$ between $u$ and $v$ is removed. Thus $X' \le X - (\beta - \alpha + 1) \le X < X + 1$.

For the next proposition, we need the following definition, which is a generalization of the notion of a ball puncturing an active subpath. Recall that a ball $B$ is said to puncture an active subpath $A'$ if there is an inner node in $A'$ that belongs to the ball.
Definition 3.10. Let $A'$ be an active subpath of $P$. We say that a ball $B$ reaches $A'$ if $V(A') \cap B \ne \emptyset$.

By the definition of the balls in the algorithm, $B$ may reach $A'$ and not puncture it if and only if $A'$ has one endpoint in $V_j$.

Proposition 3.11. $\Pr[X' \ge X \mid B \text{ reaches an active subpath of } A] \le p$.

Proof. Let $A'$ be an active subpath of $A$ closest to $t_j$. Let $d = d_{G[V_{\perp+j}]}(t_j, A') = d_{G[V_{\perp+j}]}(t_j, A)$ be the distance in $G[V_{\perp+j}]$ between $t_j$ and a nearest node in $A'$. Assume that the ball $B$ punctures an active subpath of $A$. Following the analysis of the previous proof, if $X' \ge X$, then $B$ punctures exactly one active subpath of $A$ and does not cover it, or $B$ punctures exactly two such active subpaths and does not cover both of them. In either case, $B$ punctures $A'$ and does not cover it, implying that $r_j + R_j^i \ge d$ and $r_j + R_j^i < d + p b^i$ (recall that $A$ is short, and therefore so is $A'$). If $B$ reaches an active subpath of $A$ and does not puncture it, we get the same conclusion, since this means $r_j + R_j^i \ge r_j \ge d$. By the memoryless property,

\[ \Pr[X' \ge X \mid B \text{ reaches an active subpath of } A] \le \Pr[R_j^i < d - r_j + p b^i \mid R_j^i \ge d - r_j] \le 1 - e^{-p} \le p. \]

3.2.2 The Effect of a Sequence of Balls

Consider the first $N$ balls that reach some active subpath of $A$, starting from the beginning of iteration $i_0$ of the outer loop, and perhaps during several iterations of that loop. For every $a \in [N]$, let $Y_a$ be the indicator random variable for the event that the $a$th ball reaching an active subpath of $A$ decreased the number of active subpaths. In this notation, Proposition 3.11 states that

\[ \forall a \in [N]. \quad \Pr[Y_{a+1} = 1 \mid Y_1, \dots, Y_a] \ge 1 - p. \]

Let $Y = \sum_{a \in [N]} Y_a$ and let $Z \sim \mathrm{Bin}(N, 1 - p)$. A simple induction on $N$ implies the following claim.

Claim 3.12. $\forall k.\ \Pr[Y > k] \ge \Pr[Z > k]$.

Lemma 3.13. With probability at least $1 - 1/k^8$, after $70 \log k$ balls reach some active subpath of $A$, $A$ has no active subpath. In addition, the height of $\mathcal{T}_{i_0,A}$ is at most $35 \log k$.

Proof. Assume $N = 70 \log k$.
Whenever $Y_a = 0$, the number of active subpaths increases by at most 1, and whenever $Y_a = 1$, the number of active subpaths decreases by at least 1. Hence, if $Y > N/2$, then $A$ has no active subpaths. Therefore, by the Chernoff bound,

\[ \Pr[A \text{ has no active subpath after } N \text{ balls reach } A] \ge \Pr[Y > N/2] \ge \Pr[Z > N/2] \ge 1 - 1/k^8. \]

Next, consider some $i \ge i_0$. If $A$ has an active subpath at the end of the $i$th iteration of the outer loop, then at least two times during the $i$th iteration an active subpath of $A$ is reached by a ball. After $35 \log k$ iterations of the outer loop, if $A$ has an active subpath, then $N \ge 70 \log k$. By the same arguments as in the first part of the proof,

\[ \Pr[\text{the height of } \mathcal{T}_{i_0,A} \text{ is at most } N/2] \ge \Pr[Y > N/2] \ge \Pr[Z > N/2] \ge 1 - 1/k^8. \]

We denote by $E_1$ the event that for every $i$ and every $A \in \mathcal{A}_i$, if $\mathcal{T}_{i,A}$ is a short subtree then after at most $70 \log k$ balls reach an active subpath of $A$, $A$ has no more active subpaths, and in addition the height of $\mathcal{T}_{i,A}$ is at most $35 \log k$.

Lemma 3.14. $\Pr[E_1] \ge 1 - 1/k^3$.

Proof. Fix some $i$ and $A \in \mathcal{A}_i$. Assume that $\mathcal{T}_{i,A}$ is a short subtree. By definition, $\langle i, A \rangle$ has no short ancestor. For $i' = \log_b(\ell/p) \le \log_b(D/p)$, $P^*$ itself is short, since $b^{i'} = \ell/p$, and thus all tree nodes in level $i'$ (and lower) are short. Therefore, $i \le i'$. Since there are at most $k$ nodes in every level of the tree, the number of short subtrees of $\mathcal{T}$ is at most $k \cdot \log_b(D/p) = k \cdot (\log_b D + \log_b 100) \le k^4 \log k + O(k \log k) \le O(k^4 \log k)$. By the previous lemma and a union bound over all short subtrees, the result follows.

Since every node in level $\log_b(\ell/p)$ of the tree belongs to some short subtree, we get the following corollary.

Corollary 3.15. With probability at least $1 - 2/k^3$, the height of $\mathcal{T}$ is at most $\log_b(\ell/p) + 30 \log k \le 10 \log_b D$.

We denote by $E_2$ the event that for all $i \le 10 \log_b D$ and $j \in [k]$, the radius of the $j$th ball of the $i$th iteration of the outer loop is at most $280\, b^{i+1} \log^2 k$.

Lemma 3.16. $\Pr[E_2] \ge 1 - 1/k^3$.

Proof.
Fix $i \le 10 \log_b D$ and $j \in [k]$. Then $\Pr[R_j^i > 8 b^i \log k] \le 1/k^8$. By Assumption 3.1, $10 \log_b D = O(\log D \log k) = O(k^3 \log k)$. Thus, by a union bound over all values of $i$ and $j$ in question,

\[ \Pr[\forall i, j.\ R_j^i \le 8 b^i \log k] \ge 1 - \frac{k \cdot 10 \log_b D}{k^8} \ge 1 - \frac{1}{k^3}. \]

It follows that, with probability at least $1 - 1/k^3$, the radius of the $j$th ball of the $i$th iteration of the outer loop is at most $\sum_{i' \le i} 8 b^{i'} \log k \le 280\, b^{i+1} \log^2 k$.

Summing everything up, we can now bound with high probability the weights of all short subtrees of $\mathcal{T}$.

Claim 3.17. Conditioned on the events $E_1$ and $E_2$, for every $i_0 > 1$ and $A \in \mathcal{A}_{i_0}$, if $\mathcal{T}_{i_0,A}$ is a short subtree of $\mathcal{T}$, then the total weight charged to nodes of $\mathcal{T}_{i_0,A}$ is at most $39200\, e\, b^{i_0+1} \log^3 k$ with certainty.

Proof. Conditioned on $E_1$, at most $70 \log k$ detours are charged to nodes of every short subtree, and for $i = i_0 + 35 \log k$ there are no more active subpaths of $A$. Conditioned on $E_2$, the most expensive detour is of weight at most $560\, b^{i_0 + 35 \log k + 1} \log^2 k$. We get that the total weight charged to nodes of the subtree is at most $70 \log k \cdot 560\, b^{i_0+1} \cdot b^{35 \log k} \cdot \log^2 k \le 39200\, e\, b^{i_0+1} \log^3 k$.

3.3 Bounding The Weight Of $\mathcal{T}$

We are now ready to bound the total weight charged to the tree. We sum the weights in each level of the tree from the root down. In each level we sum the weights charged to long active subpaths, and the weights of short subtrees rooted in that level. More formally, for every $i \le \log_b(\ell/p)$, denote by $l_i$ the total weight charged to nodes of the form $\langle i, A \rangle$, where $A \in \mathcal{A}_i$ is a long active subpath. Denote by $s_i$ the total weight charged to short subtrees rooted in the $i$th level of $\mathcal{T}$. For every $i \ge \log_b(\ell/p)$ and every $A \in \mathcal{A}_i$, $A$ is short and therefore $\langle i, A \rangle$ belongs to some short subtree rooted in level at most $\log_b(\ell/p)$. Therefore,

\[ \sum_{\langle i,A \rangle \in \mathcal{T}} w_{i,A} = \sum_{i=1}^{\log_b(\ell/p)} (l_i + s_i) \tag{5} \]

Since $\ell \ge 1$, the root of $\mathcal{T}$, namely $\langle 1, P^* \rangle$, consists of a long path. Therefore, $s_1 = 0$. We can therefore rearrange Equation (5) to get the following.
\[ \sum_{\langle i,A \rangle \in \mathcal{T}} w_{i,A} = \sum_{i=1}^{\log_b(\ell/p)} (l_i + s_{i+1}) \tag{6} \]

Let $i \le \log_b(\ell/p)$. Let $A \in \mathcal{A}_i$ be a long active subpath, that is, $w(A) \ge p b^i$. Thinking of $A$ as a continuous path, divide $A$ into $w(A)/(p b^i)$ segments of length $p b^i$. Some segments may contain no nodes. Let $I$ be a segment of $A$, and assume $I$ contains nodes (otherwise, no cost is charged to $A$ on account of detours from $I$). Following the same arguments as in Propositions 3.9 and 3.11, we get the following result.

Lemma 3.18. With probability at least $1 - 1/k^8$, no more than $70 \log k$ balls puncture $I$.

Denote by $E_3$ the event that for every $i \ge \log_b(\ell/k^2)$ and for every long active subpath $A \in \mathcal{A}_i$, in the division of $A$ into segments of length $p b^i$, every such segment is reached by at most $70 \log k$ balls.

Lemma 3.19. $\Pr[E_3] \ge 1 - 1/k^3$.

Proof. Since for every $i \ge \log_b(\ell/p)$ every $A \in \mathcal{A}_i$ is short, the number of relevant iterations (of the outer loop) is at most $\log_b(\ell/p) - \log_b(\ell/k^2) = \log_b(k^2/p) \le O(\log^2 k)$. For every $i \ge \log_b(\ell/k^2)$ and every long path $A \in \mathcal{A}_i$, the number of segments of $A$ is at most $w(A)/(p b^i) \le w(A)/(p \ell/k^2) \le k^2/p$. Therefore the number of relevant segments for all $i \ge \log_b(\ell/k^2)$ and for all long $A \in \mathcal{A}_i$ is at most $O(k^2 \log^2 k)$. Applying a union bound over all relevant segments, the result follows.

Since $\Pr[E_1] \ge 1 - 1/k^3$ and $\Pr[E_2] \ge 1 - 1/k^3$, we get the following corollary.

Corollary 3.20. $\Pr[E_1 \wedge E_2 \wedge E_3] \ge 1 - O(k^{-3})$.

It follows that it is enough for us to prove that, conditioned on $E_1$, $E_2$ and $E_3$, with probability 1 the total weight charged to the tree is at most $O(\log^6 k)\,\ell$.

Lemma 3.21. Conditioned on $E_2$ and $E_3$, $l_i \le 560\, b^{i+1} k \log^2 k$. In addition, if $i \ge \log_b(\ell/k^2)$, then $l_i \le O(\log^3 k) \cdot \ell$ with probability 1.

Proof. To see the first bound, observe that by the update process of $P$, at most $k$ detours are added to $P$ during the $i$th iteration. Conditioned on $E_2$, each one of them is of weight at most $560\, b^{i+1} \log^2 k$. To see the second bound, let $A \in \mathcal{A}_i$ be a long active subpath.
The additional weight resulting from detours from vertices of $A$ is at most the number of segments of $A$ of length $p b^i$, times the additional weight charged to each segment. Therefore, the additional weight is at most

\[ w_{i,A} \le \frac{w(A)}{p b^i} \cdot 70 \log k \cdot 560\, b^{i+1} \log^2 k = \frac{39200\, b}{p} \cdot \log^3 k \cdot w(A). \]

Since all paths in $\mathcal{A}_i$ are internally disjoint subpaths of $P^*$, we get:

\[ l_i = \sum_{\text{long } A \in \mathcal{A}_i} w_{i,A} \le O(\log^3 k) \cdot \sum_{\text{long } A \in \mathcal{A}_i} w(A) \le O(\log^3 k) \cdot \ell. \]

Lemma 3.22. Conditioned on events $E_1$, $E_2$ and $E_3$, $s_{i+1} \le 39200\, e\, b^{i+2} k \log^3 k$. In addition, if $i \ge \log_b(\ell/k^2)$, then $s_{i+1} \le O(\log^4 k) \cdot \ell$.

Proof. Conditioned on $E_1$ and $E_2$, we proved in Claim 3.17 that the total weight charged to a short subtree rooted in level $i + 1$ is at most $39200\, e\, b^{i+2} \log^3 k$ with certainty. Since there are at most $k$ such subtrees, the first bound follows. To get the second bound, note that by the definition of a short subtree, for every short subtree $\mathcal{T}'$ rooted at level $i + 1$, the parent of the root of $\mathcal{T}'$ consists of a long active subpath $A$ of level $i$. Conditioned on $E_3$, every segment of $A$ is intersected by at most $70 \log k$ balls. Therefore, $\langle i, A \rangle$ can have at most $(w(A)/(p b^i)) \cdot 70 \log k$ children, and in particular at most that many children consisting of short active subpaths. The cost of a short subtree rooted in level $i + 1$ of $\mathcal{T}$ is at most $39200\, e\, b^{i+2} \log^3 k$. Thus the total cost of all short subtrees rooted in children of $\langle i, A \rangle$ is bounded by

\[ \frac{w(A)}{p b^i} \cdot 70 \log k \cdot 39200\, e\, b^{i+2} \log^3 k \le O(\log^4 k) \cdot w(A). \]

Summing over all (internally disjoint) long subpaths of level $i$, the result follows.

We now turn to the proof of Lemma 3.7.

Proof of Lemma 3.7. Since $\Pr[E_1 \wedge E_2 \wedge E_3] \ge 1 - O(k^{-3})$, it is enough to show that, conditioned on $E_1$, $E_2$ and $E_3$, the total weight charged to the tree is at most $O(\log^6 k)\,\ell$ with certainty. Recall that $\sum_{\langle i,A \rangle \in \mathcal{T}} w_{i,A} = \sum_{i=1}^{\log_b(\ell/p)} (l_i + s_{i+1})$. Following Lemmas 3.21 and 3.22 we get that

\[ \sum_{i=1}^{\log_b(\ell/k^2)} (l_i + s_{i+1}) \le \sum_{i=1}^{\log_b(\ell/k^2)} \left( 560\, b^{i+1} k \log^2 k + 39200\, e\, b^{i+2} k \log^3 k \right) \le O(k \log^3 k) \sum_{i=1}^{\log_b(\ell/k^2)} b^i = O(k \log^3 k) \cdot \frac{b}{b-1} \cdot \frac{\ell}{k^2} = o(1) \cdot \ell \tag{7} \]

In addition,

\[ \sum_{i=\log_b(\ell/k^2)+1}^{\log_b(\ell/p)} (l_i + s_{i+1}) \le \sum_{i=\log_b(\ell/k^2)+1}^{\log_b(\ell/p)} \left( O(\log^3 k)\,\ell + O(\log^4 k)\,\ell \right) \le O(\log^4 k)\,\ell \cdot \left( \log_b(\ell/p) - \log_b(\ell/k^2) \right) \le O(\log^4 k)\,\ell \cdot O(\log^2 k) = O(\log^6 k) \cdot \ell \tag{8} \]

Before describing the generalized algorithm, used to discard Assumption 3.1, we wish to note the following fact, which follows directly from the analysis of Algorithm 1.

Remark 1. By slightly changing the analysis of Algorithm 1 (without changing the actual algorithm) we can get that for every $\tilde{k} \ge k$ and every $d \ge 3$, the algorithm achieves a stretch factor of $O(d^2 \log^6 \tilde{k})$ with probability $1 - O(\tilde{k}^{-d})$.

4 Terminal-Centered Minors: Extension to General Case

In this section, we complete the proof of Theorem 1.3 by reducing it to the special case where Assumption 3.1 holds (which is proven in Section 3). Let us outline the recursive algorithm for the general case. The algorithm first rescales the edge weights of the graph so that the minimal terminal distance is 1. If $D < 2^{k^3}$ then we apply Algorithm 1 and we are done. Otherwise, we construct a set of at most $k - 1$ well-separated and low-diameter balls whose union contains all terminals. Then, for each of the balls, we apply Algorithm 1 on the graph induced by that ball. Each ball is then contracted into a "super-terminal". We apply the algorithm recursively on the resulting graph $\tilde{G}$ with the set of super-terminals as the terminal set. Returning from the recursion, we "stitch" together the output of Algorithm 1 on the balls in the original graph with the output of the recursive call on $\tilde{G}$, to construct a partition of $V$ as required. The exact algorithm is described in Section 4.1. Before that, we need a few definitions.
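As an illustration only (not the authors' implementation), the first two steps of the outline above, rescaling the edge weights and testing whether $D < 2^{k^3}$, might be sketched in Python as follows. The graph representation (a dictionary of adjacency dictionaries) and the names `dijkstra` and `rescale_and_test` are assumptions made for this example; the sketch also assumes a connected graph with positive weights and at least two terminals. The threshold is compared in logarithmic form so that $2^{k^3}$ is never formed explicitly.

```python
import heapq
import math


def dijkstra(graph, source):
    """Shortest-path distances from source; graph is {u: {v: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def rescale_and_test(graph, terminals):
    """Rescale so the minimum terminal distance is 1, then test D < 2^(k^3).

    Returns (rescaled_graph, D, use_algorithm_1). Hypothetical helper:
    the name and return convention are not taken from the paper.
    """
    k = len(terminals)
    pair_dists = []
    for idx, s in enumerate(terminals):
        dist = dijkstra(graph, s)
        pair_dists.extend(dist[t] for t in terminals[idx + 1:])
    d_min = min(pair_dists)
    scaled = {u: {v: w / d_min for v, w in nbrs.items()}
              for u, nbrs in graph.items()}
    D = max(pair_dists) / d_min  # largest terminal distance after rescaling
    # Compare logarithms instead of forming 2**(k**3) explicitly.
    use_algorithm_1 = math.log2(D) < k ** 3
    return scaled, D, use_algorithm_1
```

For example, on the path a-b-c with edge weights 2 and 6 and all three vertices as terminals, the rescaled weights are 1 and 3, the largest terminal distance is $D = 4$, and the test selects the direct application of Algorithm 1.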
Assume that the edge weights are already scaled so that the minimum terminal distance is 1. Denote by $\mathcal{D}$ the set of all distances between terminals, rounded down to the nearest powers of 2. Note that $|\mathcal{D}| < k^2$. Consider the case $D > 2^{k^3}$. There must exist $0 \le m_0 \le k^3 - k$ such that $\mathcal{D} \cap \{2^{m_0}, 2^{m_0+1}, \dots, 2^{m_0+k}\} = \emptyset$. Define $R := \{(x, y) \in T^2 : d_G(x, y) < 2^{m_0}\}$.

Claim 4.1. $R$ is an equivalence relation.

Proof. Reflexivity and symmetry of $R$ follow directly from the definition of a metric. To see that $R$ is transitive, let $x, y, z \in T$, and assume $(x, y), (y, z) \in R$. Therefore $d_G(x, y) < 2^{m_0}$ and $d_G(y, z) < 2^{m_0}$. By the triangle inequality, $d_G(x, z) < 2^{m_0+1}$. Since $\mathcal{D} \cap \{2^{m_0}, \dots, 2^{m_0+k}\} = \emptyset$, $d_G(x, z) < 2^{m_0}$, and therefore $(x, z) \in R$.

For every equivalence class $U \in T/R$, we pick an arbitrary $u \in U$, and define $\hat{U} = B_G(u, 2^{m_0})$.

Claim 4.2. $\mathcal{U} := \{\hat{U}\}_{U \in T/R}$ is a partial partition of $V$. Moreover, for every $U \in T/R$, $U \subseteq \hat{U}$, and $G[\hat{U}]$ is connected and of diameter at most $2^{m_0+1} < 2^{k^3}$.

Proof. Let $U \in T/R$. Let $u \in U$ be such that $\hat{U} = B_G(u, 2^{m_0})$. For every $x \in U$, by the definition of $R$, $d(x, u) < 2^{m_0}$, and thus $x \in \hat{U}$. Therefore $U \subseteq \hat{U}$. By the definition of a ball, $G[\hat{U}]$ is connected and of diameter at most $2^{m_0+1} < 2^{k^3}$. To see that $\mathcal{U}$ is a partial partition of $V$, take $U' \in T/R$ such that $U \ne U'$, and let $u' \in U'$ be such that $\hat{U}' = B_G(u', 2^{m_0})$. Since $(u, u') \notin R$, $d_G(u, u') \ge 2^{m_0}$, and since $\mathcal{D} \cap \{2^{m_0}, \dots, 2^{m_0+k}\} = \emptyset$, $d_G(u, u') \ge 2^{m_0+k+1}$, thus $\hat{U} \cap \hat{U}' = \emptyset$.

4.1 Detailed Algorithm

Input: $G = (V, E, w)$, $T = \{t_1, \dots, t_k\} \subseteq V$
Output: A partition $\{V_1, V_2, \dots, V_k\}$ of $V$.
1: rescale the edge weights so that the minimal terminal distance is 1.
2: if $D := \max_{u,v \in T} d_G(u, v) \le 2^{k^3}$ then
3:     run Algorithm 1, and return its output.
4: else
5:     define $R$ and $\mathcal{U}$ as above.
6:     for all $\hat{U} \in \mathcal{U}$ do
7:         run Algorithm 1 independently on $G[\hat{U}]$.
8:         contract $\hat{U}$ to a single "super-terminal", maintaining edge weights of all remaining edges.
9:     denote the resulting graph $\tilde{G}$.
10:     run Algorithm 2 recursively on $\tilde{G}$ with the set of super-terminals.
11:     for all super-terminals $u \in \tilde{G}$ do
12:         for all vertices $v$ assigned to $u$ in the recursive call do
13:             assign $v$ to its nearest terminal among the nodes contracted to $u$ in line 8.
14:     return the resulting partition of $V$.
Algorithm 2: Partitioning $V$ - The General Case

It is clear that Algorithm 2 returns a partition of $V$. In addition, since every level of recursion decreases the number of terminals in the graph, the depth of the recursion is at most $k$. During each level of the recursion, Algorithm 1 is invoked at most $k$ times. Therefore, Algorithm 1 is invoked at most $k^2$ times, each time on a set of at most $k$ terminals. By Remark 1, setting $d = 5$, we get that there exists $C_0 > 0$ such that every time the algorithm is invoked, it achieves a weight stretch factor of at most $C_0 \log^6 k$, with probability at least $1 - O(k^{-3})$. Applying a union bound, we get that with high probability the stretch bound is obtained in all invocations of the algorithm. It remains to show that this suffices to achieve the desired stretch factor in $G$.

Lemma 4.3. With probability at least $1 - 1/k$, on a graph with $\tau \le k$ terminals, the algorithm obtains a stretch factor of at most $C_0 \log^6 k + \log^6 k \cdot 2^{-k} \sum_{k' \le \tau} 2(k')^2 \le 2 C_0 \log^6 k$.

Proof. It is enough to show that, conditioned on the event that every invocation of Algorithm 1 achieves a stretch factor of at most $C_0 \log^6 k$, the generalized algorithm achieves the desired stretch factor. We prove this by induction on $k$. For $k = 2$ the result is clear. Assuming correctness for every $\tilde{k} < k$, we prove correctness for $k$. Let $s, t \in T$. If $(s, t) \in R$, then $s$ and $t$ are in the same set in $\mathcal{U}$, and by the conditioning, the stretch factor of the distance between $s$ and $t$ is at most $C_0 \log^6 k$. This does not change in steps 4-7 of the algorithm. Otherwise, let $\tilde{s}, \tilde{t}$ be the terminals (or super-terminals) associated with $s$ and $t$ in $\tilde{G}$, respectively.
Denote $d = d_{G,w}(s, t)$ and $\tilde{d} = d_{\tilde{G},\tilde{w}}(\tilde{s}, \tilde{t})$. Denote by $G' = (V', E', w')$ the terminal-centered minor induced by the partition returned by the recursive call, and by $G'' = (V'', E'', w'')$ the terminal-centered minor induced by the partition returned in the final step of the algorithm. Denote $d' = d_{G',w'}(\tilde{s}, \tilde{t})$ and $d'' = d_{G'',w''}(s, t)$. Let $u$ be a super-terminal on a shortest path $P'$ between $\tilde{s}$ and $\tilde{t}$ in $G'$. Let $P''$ be the path obtained from $P'$ in $G''$ in the following manner. In the place of every super-terminal $u$ in $P'$, originating in some node set $\hat{U}$, we add a path between the corresponding terminals in $\hat{U}$ (based on the terminal-centered minor constructed for $G[\hat{U}]$ in step 3). The edges of $P'$ are also replaced with corresponding edges in $G''$. Recall that in $G'$, the weight of every edge is the distance between its endpoints in $\tilde{G}$ (by the definition of a terminal-centered minor). In $G''$ the weight of every edge is the distance between its endpoints in $G$. The path $P'$ contains at most $k - 1$ edges. In $G''$ the weight of each such edge increases by at most $k 2^{m_0}$. In addition, every expansion of a super-terminal adds at most $k 2^{m_0}$ to the path. Therefore, $w''(P'') \le w'(P') + 2k^2 2^{m_0} \le d' + d \cdot 2k^2 2^{-k}$. By the induction hypothesis,

\[ d' \le \Big( C_0 \log^6 k + \log^6 k \cdot 2^{-k} \sum_{k' \le k-1} 2(k')^2 \Big) \tilde{d}. \]

Since $\tilde{d} \le d$, we get that

\[ d'' \le \Big( C_0 \log^6 k + \log^6 k \cdot 2^{-k} \sum_{k' \le k-1} 2(k')^2 \Big) d + d \cdot 2k^2 2^{-k} \le \Big( C_0 \log^6 k + \log^6 k \cdot 2^{-k} \sum_{k' \le k} 2(k')^2 \Big) d. \]

References

[AGMW10] I. Abraham, C. Gavoille, D. Malkhi, and U. Wieder. Strong-diameter decompositions of minor free graphs. Theor. Comp. Sys., 47(4):837–855, November 2010.
[Bar96] Y. Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In 37th Annual Symposium on Foundations of Computer Science, pages 184–193. IEEE, 1996.
[Bar04] Y. Bartal. Graph decomposition lemmas and their role in metric embedding methods. In 12th Annual European Symposium on Algorithms, volume 3221 of LNCS, pages 89–97. Springer, 2004.
[BG08] A. Basu and A. Gupta. Steiner point removal in graph metrics. Unpublished manuscript, available from http://www.math.ucdavis.edu/~abasu/papers/SPR.pdf, 2008.
[BK96] A. A. Benczúr and D. R. Karger. Approximating s-t minimum cuts in Õ(n^2) time. In 28th Annual ACM Symposium on Theory of Computing, pages 47–55. ACM, 1996.
[BSS09] J. D. Batson, D. A. Spielman, and N. Srivastava. Twice-Ramanujan sparsifiers. In 41st Annual ACM Symposium on Theory of Computing, pages 255–262. ACM, 2009.
[Chu12] J. Chuzhoy. On vertex sparsifiers with Steiner nodes. In 44th Symposium on Theory of Computing, pages 673–688. ACM, 2012.
[CKR01] G. Calinescu, H. Karloff, and Y. Rabani. Approximation algorithms for the 0-extension problem. In 12th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 8–16. SIAM, 2001.
[CLLM10] M. Charikar, T. Leighton, S. Li, and A. Moitra. Vertex sparsifiers and abstract rounding algorithms. In 51st Annual Symposium on Foundations of Computer Science, pages 265–274. IEEE Computer Society, 2010.
[CXKR06] T. Chan, D. Xia, G. Konjevod, and A. Richa. A tight lower bound for the Steiner point removal problem on trees. In 9th International Workshop on Approximation, Randomization, and Combinatorial Optimization, volume 4110 of Lecture Notes in Computer Science, pages 70–81. Springer, 2006.
[EGK+10] M. Englert, A. Gupta, R. Krauthgamer, H. Räcke, I. Talgam-Cohen, and K. Talwar. Vertex sparsifiers: New results from old techniques. In 13th International Workshop on Approximation, Randomization, and Combinatorial Optimization, volume 6302 of Lecture Notes in Computer Science, pages 152–165. Springer, 2010.
[FRT04] J. Fakcharoenphol, S. Rao, and K. Talwar. A tight bound on approximating arbitrary metrics by tree metrics. J. Comput. Syst. Sci., 69(3):485–497, 2004.
[GNR10] A. Gupta, V. Nagarajan, and R. Ravi. Improved approximation algorithms for requirement cut. Operations Research Letters, 38(4):322–325, 2010.
[Gup01] A. Gupta. Steiner points in tree metrics don't (really) help. In 12th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 220–227. SIAM, 2001.
[KR11] R. Krauthgamer and T. Roughgarden. Metric clustering via consistent labeling. Theory of Computing, 7(5):49–74, 2011.
[KZ12] R. Krauthgamer and T. Zondiner. Preserving terminal distances using minors. In 39th International Colloquium on Automata, Languages, and Programming, volume 7391 of Lecture Notes in Computer Science, pages 594–605. Springer, 2012.
[LM10] F. T. Leighton and A. Moitra. Extensions and limits to vertex sparsification. In 42nd ACM Symposium on Theory of Computing, STOC, pages 47–56. ACM, 2010.
[LN05] J. R. Lee and A. Naor. Extending Lipschitz functions via random metric partitions. Inventiones Mathematicae, 1:59–95, 2005.
[MM10] K. Makarychev and Y. Makarychev. Metric extension operators, vertex sparsifiers and Lipschitz extendability. In 51st Annual Symposium on Foundations of Computer Science, pages 255–264. IEEE, 2010.
[MN07] M. Mendel and A. Naor. Ramsey partitions and proximity data structures. J. Eur. Math. Soc., 9(2):253–275, 2007.
[Moi09] A. Moitra. Approximation algorithms for multicommodity-type problems with guarantees independent of the graph size. In 50th Annual Symposium on Foundations of Computer Science, FOCS, pages 3–12. IEEE, 2009.
[PS89] D. Peleg and A. A. Schäffer. Graph spanners. J. Graph Theory, 13(1):99–116, 1989.
[TZ05] M. Thorup and U. Zwick. Approximate distance oracles. J. ACM, 52(1):1–24, 2005.
