Coordinates for Instant Image Cloning

Zeev Farbman
Hebrew University
Gil Hoffer
Tel Aviv University
Yaron Lipman
Princeton University
Daniel Cohen-Or
Tel Aviv University
Dani Lischinski
Hebrew University
Figure 1: (a) Source patch. (b) Laplace membrane. (c) Mean-value membrane. (d) Target image. (e) Poisson cloning. (f) Mean-value cloning. Poisson cloning smoothly interpolates the error along the boundary of the source and the target regions across the entire cloned region (the resulting membrane is shown in (b)), yielding a seamless composite (e). A qualitatively similar membrane (c) may be achieved via transfinite interpolation, without solving a linear system. (f) Seamless cloning obtained instantly using the mean-value interpolant.
Abstract
Seamless cloning of a source image patch into a target image is an important and useful image editing operation, which has received considerable research attention in recent years. This operation is typically carried out by solving a Poisson equation with Dirichlet boundary conditions, which smoothly interpolates the discrepancies between the boundary of the source patch and the target across the entire cloned area. In this paper we introduce an alternative, coordinate-based approach, where rather than solving a large linear system to perform the aforementioned interpolation, the value of the interpolant at each interior pixel is given by a weighted combination of values along the boundary. More specifically, our approach is based on Mean-Value Coordinates (MVC). The use of coordinates is advantageous in terms of speed, ease of implementation, small memory footprint, and parallelizability, enabling real-time cloning of large regions, and interactive cloning of video streams. We demonstrate a number of applications and extensions of the coordinate-based framework.
Keywords: gradient domain, image editing, mean-value coordinates, Poisson equation, matting, seamless cloning, stitching
1 Introduction
A wide variety of image and video editing tasks may be effectively accomplished by gradient domain techniques, which operate directly on the gradient field of an image [Fattal et al. 2002; Pérez et al. 2003; Levin et al. 2004; Agarwala et al. 2004; McCann and Pollard 2008]. One of the most useful gradient domain tools is Poisson cloning: seamless insertion of a source image patch into a target image (Figure 1). This operation has attracted significant research attention in recent years [Pérez et al. 2003; Agarwala et al. 2004; Wang et al. 2004; Jia et al. 2006] and it is featured in professional image editing products [Georgiev 2004].
All gradient domain techniques eventually solve a large sparse linear system, the Poisson equation. This motivated a number of
works proposing fast Poisson solvers for various scenarios [Szeliski
2006; Agarwala 2007; Kazhdan and Hoppe 2008] and for solving
the Poisson equation on the GPU [Bolz et al. 2003; McCann and
Pollard 2008].
In this paper, we introduce a new, coordinate-based approach that
performs seamless cloning, as well as a number of other related operations in a direct manner, without ever having to form and solve
systems of equations. Our approach is fast, straightforward to implement, and features a small memory footprint. The bulk of the
computation may be performed completely in parallel, making it
an ideal candidate for a GPU implementation.
When performing Poisson cloning, one typically solves the Poisson equation, where the gradients inside the cloned region come
from the source patch, and the Dirichlet boundary conditions are
prescribed by the target image. Pérez et al. [2003] observed that
solving this Poisson equation is equivalent to solving the Laplace
equation with the Dirichlet boundary conditions set to the difference along the boundary between the source patch and the target
image. In other words, Poisson cloning constructs a harmonic
(or membrane) interpolant that smoothly spreads the discrepancies
along the boundary to the entire cloned area. While the gradient
field of this membrane has minimal L2-norm, there is no evidence
that this particular membrane is necessarily optimal from the perceptual standpoint. Thus, our key idea is to construct a different
smooth interpolating membrane directly, i.e., without solving a linear system. While the membrane we construct is not identical to
the harmonic one, our final results are nevertheless typically indistinguishable from Poisson cloning (Figure 1).
Specifically, our objective is to find a harmonic-like interpolant to
some values along the boundary. Recent advances in the field of
transfinite interpolation allow solving this problem using generalized barycentric coordinates. An important instance is Floater’s
Mean-Value Coordinates (MVC) [Floater 2003]. These coordinates
were specifically designed for constructing smooth harmonic-like
interpolants by mimicking the mean-value property of harmonic
functions, and they are given by a simple closed-form formula.
Thus, the resulting membrane may be evaluated in parallel for any
point inside the region at a cost linear in the number of boundary
vertices.
We further observe that due to the smoothness of the membrane
away from the boundary, it is not necessary to evaluate it at each
and every pixel inside the cloned area. Instead, it suffices to evaluate the membrane only at the vertices of an adaptive mesh, and
obtain the values at the remaining pixels by linear interpolation. A
similar optimization was recently utilized by Agarwala [2007] to
solve large Poisson systems, such as those arising in gradient domain stitching, with a small memory footprint. Another important
optimization that we introduce is adaptive hierarchical sampling of
the boundary.
After presenting the use of mean-value coordinates for seamless
image cloning in Section 3, describing an efficient implementation
on the CPU and on the GPU, and comparing to existing approaches,
we go on to present a number of applications and extensions of our
approach (Section 5). Specifically, we discuss real-time interactive
seamless video cloning, seamless stitching of large panoramas, removal of “smudging” artifacts that sometimes occur with seamless
cloning, and MVC-based matte extraction.
In summary, our specific contributions are:
• A new, coordinate-based method for seamless cloning, which is
easy to implement, features a small memory footprint, and is
highly parallelizable.
• Real-time seamless cloning and healing of still images and video
sequences on the CPU, as well as on the GPU.
• Extensions to related operations, such as seamless stitching and
matting.
2 Background
Gradient domain methods
Psychologists have long known that the human visual system
is much more sensitive to local contrasts than to absolute luminances or to slow changes in the luminance [Land and McCann
1971; Palmer 1999]. In particular, slow luminance changes, which
are suppressed by the human visual system as part of lightness constancy, may often be superimposed over an image without a noticeable effect.
Gradient domain methods take advantage of the above properties,
and modify images by manipulating their gradient field to perform
a variety of tasks, ranging from shadow removal [Weiss 2001; Finlayson et al. 2002], to tone mapping [Fattal et al. 2002], seamless
stitching [Levin et al. 2004; Agarwala et al. 2004], image cloning
[Pérez et al. 2003; Georgiev 2004; Jia et al. 2006], seamless video
editing [Wang et al. 2004], and, recently, gradient domain painting
[McCann and Pollard 2008].
Reconstructing a new image from the modified gradient field typically requires solving the Poisson equation, which yields the image
whose gradient field is closest (in the L2-norm sense) to the modified one, subject to some boundary conditions. For example, in
Poisson cloning [Pérez et al. 2003], the gradient field (sometimes
referred to as the guidance field) inside the cloned region is taken
from the source image, while the values of the target image along
the boundary of the cloned region are used to define the Dirichlet
boundary conditions for the equation.
Solving the Poisson equation for large images is a computational
and memory intensive task. Agarwala [2007] observed that in the
case of gradient domain stitching, one essentially solves for an offset function that is smooth away from the seams. This makes it possible to obtain an accurate solution by constructing a reduced linear
system using an adaptive quadtree subdivision of the domain. This
method has been shown to be significantly faster and more scalable
than general Poisson solvers for stitching large images. We also
take advantage of smoothness and use an adaptive mesh to speed
up our computation and to make it scalable; however, in contrast to
Agarwala we avoid solving a linear system altogether.
McCann and Pollard [2008] describe a fast
GPU implementation of a multi-grid Poisson solver, with which
they achieve real-time interactive performance for gradient domain
image editing operations, including seamless cloning. While their
system outperforms previous methods, it does involve a substantial
memory footprint, and the authors report that performance drops
down once this footprint exceeds the available video memory.
Mean-Value Coordinates
Recently, there has been significant interest in using generalized
barycentric coordinates for solving transfinite interpolation problems [Wachspress 1975; Floater 2003; Warren 1996]. In his seminal paper, Floater [2003] introduced the Mean-Value Coordinates
(MVC), which are motivated by the Mean-Value Theorem for harmonic functions. These coordinates approximate a harmonic-like
solution to the boundary interpolation problem. They are well-defined over the entire plane for arbitrary planar polygons without self-intersections, smooth (C∞, except at the polygon vertices
where they are C0), and invariant under similarity transformations
[Hormann and Floater 2006]. Mean-value coordinates have also been extended to 3D polyhedra and used for space deformation [Ju et al.
2005; Floater et al. 2005; Joshi et al. 2007]. In this work, we explore the novel use of MVC as a computationally attractive alternative for solving the Poisson equation in certain image editing tasks.
In the remainder of this section, we quickly define the 2D mean-value interpolant, and refer the reader to the references mentioned
above for detailed derivations in 2D and in 3D. Consider a closed
2D polygonal boundary curve (with counter-clockwise ordering)
$\partial P = (p_0, p_1, \ldots, p_m = p_0)$, $p_i \in \mathbb{R}^2$. The mean-value coordinates
of a point $x \in \mathbb{R}^2$ with respect to $\partial P$ are given by

$$\lambda_i(x) = \frac{w_i}{\sum_{j=0}^{m-1} w_j}, \qquad i = 0, \ldots, m-1, \tag{1}$$

where

$$w_i = \frac{\tan(\alpha_{i-1}/2) + \tan(\alpha_i/2)}{\lVert p_i - x \rVert}, \tag{2}$$

and $\alpha_i$ is the angle $\angle p_i, x, p_{i+1}$ (see Figure 2). Once computed,
these coordinates may be used to smoothly interpolate any function
$f$ defined at the boundary vertices:

$$\tilde{f}(x) = \sum_{i=0}^{m-1} \lambda_i(x)\, f(p_i). \tag{3}$$
Figure 2: Angle definitions for mean-value coordinates (vertices p_{i−1}, p_i, p_{i+1}, the point x, and the angles α_{i−1}, α_i).
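Equations (1) and (2) translate almost directly into array code. The following is a minimal NumPy sketch, not the authors' implementation: the function name `mean_value_coordinates` is hypothetical, the polygon is assumed to be given in counter-clockwise order with the closing vertex implicit, and the point is assumed to lie strictly inside the polygon (a point on a vertex or an edge would need special handling).

```python
import numpy as np

def mean_value_coordinates(x, polygon):
    """Mean-value coordinates (eqs. 1-2) of point x w.r.t. a closed polygon.

    polygon: (m, 2) array of vertices p_0..p_{m-1}, counter-clockwise;
    the closing vertex p_m = p_0 is implicit. x is assumed strictly interior.
    """
    p = np.asarray(polygon, dtype=float)
    d = p - np.asarray(x, dtype=float)        # vectors p_i - x
    r = np.linalg.norm(d, axis=1)             # distances ||p_i - x||
    d_next = np.roll(d, -1, axis=0)           # vectors p_{i+1} - x
    # Signed angle alpha_i at x between p_i and p_{i+1}.
    cross = d[:, 0] * d_next[:, 1] - d[:, 1] * d_next[:, 0]
    dot = (d * d_next).sum(axis=1)
    alpha = np.arctan2(cross, dot)
    t = np.tan(alpha / 2.0)
    # w_i = (tan(alpha_{i-1}/2) + tan(alpha_i/2)) / ||p_i - x||
    w = (np.roll(t, 1) + t) / r
    return w / w.sum()                        # lambda_i = w_i / sum_j w_j
```

Evaluating the interpolant of eq. (3) is then a dot product, for example `mean_value_coordinates(x, polygon) @ boundary_values`, where `boundary_values` holds f(p_i).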
3 Mean-Value Seamless Cloning

Algorithm 1 MVC Seamless Cloning
 1: {Preprocessing stage}
 2: for each pixel x ∈ Ps do
 3:     {Compute the mean-value coordinates of x w.r.t. ∂Ps}
 4:     λ0(x), . . . , λm−1(x) = MVC(x, y, ∂Ps)
 5: end for
 6: for each new Pt do
 7:     {Compute the differences along the boundary}
 8:     for each vertex pi of ∂Pt do
 9:         diffi = f∗(pi) − g(pi)
10:     end for
11:     for each pixel x ∈ Pt do
12:         {Evaluate the mean-value interpolant at x}
13:         r(x) = ∑_{i=0}^{m−1} λi(x) · diffi
14:         f(x) = g(x) + r(x)
15:     end for
16: end for

In this section we explain in detail how mean-value coordinates
may be used to perform instant seamless image cloning.
Let S ⊂ R2 be the domain of the source image and T ⊂ R2 be the
domain of the target image for cloning. Let us denote by g : S →
R, f ∗ : T → R the source and target image intensities over their
respective domains. Let Ps ⊂ S denote the source patch that we
would like to clone seamlessly into Pt ⊂ T . We assume that these
patches are isomorphic, and that their boundaries, ∂Ps and ∂Pt , are
polygonal curves with the same number of vertices.
Poisson cloning computes a function f : Pt → R by solving the
Poisson equation:
$$\Delta f = \operatorname{div} \nabla g \quad \text{with Dirichlet boundary conditions} \quad f|_{\partial P_t} = f^*. \tag{4}$$
In other words, Poisson cloning seeks a function f that agrees with
the target image f ∗ on the boundary of the target region ∂Pt , whose
gradient field is as close as possible to that of the source image g.
Pérez et al. [2003] noted that solving the above Poisson equation is
equivalent to solving the Laplace equation:
$$\Delta \tilde{f} = 0 \quad \text{with Dirichlet boundary conditions} \quad \tilde{f}|_{\partial P_t} = f^* - g. \tag{5}$$
The final outcome of the cloning is then simply defined as
$$f = g + \tilde{f}. \tag{6}$$
This formulation reveals that Poisson cloning in fact constructs a
smooth membrane (a harmonic function) f˜ that interpolates the difference f ∗ − g between the target and source images on the boundary of Pt across the entire region.
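The equivalence of the two formulations can be checked by substituting the membrane into eq. (4). This is a short derivation, assuming that g is twice differentiable so that div ∇g = ∆g, and that f solves eq. (4):

```latex
\begin{aligned}
\Delta \tilde{f} &= \Delta (f - g) = \Delta f - \Delta g
                  = \operatorname{div} \nabla g - \Delta g = 0
                  \quad \text{in } P_t, \\
\tilde{f}\,\big|_{\partial P_t} &= f\,\big|_{\partial P_t} - g\,\big|_{\partial P_t}
                  = f^* - g .
\end{aligned}
```

The source image therefore enters only through its values on the boundary, which is what makes a direct boundary interpolant a candidate replacement for the harmonic membrane.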
As stated earlier, we propose to construct a similar smooth interpolating membrane f˜ in an entirely different manner, using mean-value interpolation, as described below. The most obvious advantage of using the mean-value interpolant is that there exists a simple
closed-form formula for constructing it, hence eliminating the need
to solve a large linear system.
Consider a point $x \in P_t$, where $P_t$ has the boundary $\partial P_t = (p_0, p_1, \ldots, p_m = p_0)$.
The mean-value interpolant attaining the values $f^* - g$ at the
boundary $\partial P_t$ is given at point $x$ by

$$r(x) = \sum_{i=0}^{m-1} \lambda_i(x)\,(f^* - g)(p_i), \tag{7}$$

where $\lambda_i(x)$, $i = 0, \ldots, m-1$, are the mean-value coordinates of $x$ with
respect to $\partial P_t$, as defined by equations (1–2). The result of mean-value cloning is then given, similarly to eq. (6), by

$$f = g + r. \tag{8}$$
An unoptimized mean-value cloning procedure is given in pseudocode in Alg. 1. This routine precomputes the mean-value coordinates of each pixel inside the source patch Ps once the patch is
selected and then repeatedly performs mean-value interpolation for
each location Pt in the target image. It is easy to see that the number of operations is O(nm), where n is the number of pixels in the
cloned region, while m is the number of boundary pixels. Since the
mean-value coordinates are precomputed and stored, the memory
footprint is also O(nm). To make MVC cloning fast and scalable,
we introduce two optimizations, which are described below.
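The two stages of Alg. 1 can be sketched with dense arrays as follows. This is an illustrative sketch of the unoptimized procedure under simplifying assumptions, not the authors' code: the helper names (`mvc`, `precompute_coordinates`, `clone`) are hypothetical, a single color channel is used, pixels are passed as coordinate arrays rather than images, and interior points are assumed not to coincide with boundary vertices.

```python
import numpy as np

def mvc(x, boundary):
    """Mean-value coordinates of x w.r.t. a counter-clockwise closed boundary."""
    d = boundary - x
    r = np.linalg.norm(d, axis=1)
    dn = np.roll(d, -1, axis=0)
    alpha = np.arctan2(d[:, 0] * dn[:, 1] - d[:, 1] * dn[:, 0],
                       (d * dn).sum(axis=1))
    t = np.tan(alpha / 2.0)
    w = (np.roll(t, 1) + t) / r
    return w / w.sum()

def precompute_coordinates(interior, boundary):
    """Preprocessing stage (lines 1-5): one row of m coordinates per pixel."""
    return np.stack([mvc(x, boundary) for x in interior])   # shape (n, m)

def clone(lam, g_interior, g_boundary, f_star_boundary):
    """Per-target stage (lines 6-16) for one placement of the patch."""
    diff = f_star_boundary - g_boundary    # diff_i = f*(p_i) - g(p_i)
    r = lam @ diff                         # r(x) = sum_i lambda_i(x) * diff_i
    return g_interior + r                  # f(x) = g(x) + r(x)
```

The (n, m) matrix `lam` makes the O(nm) time and memory cost explicit: each new target position costs one matrix-vector product, while the matrix itself is computed once per selection.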
Adaptive mesh. The mean-value interpolant is very smooth
away from the boundary of the cloned region. Thus, for all practical purposes, much of the computation in Alg. 1 may be avoided
by constructing an adaptive triangular mesh over Ps. We use the
CGAL [Cgal 2007] library to generate the adaptive mesh. An example is shown in Figure 3. Once the mesh is available, we only
need to compute and store the mean-value coordinates (line 4 in
Alg. 1) at each mesh vertex. Likewise, the evaluation of the interpolant (line 13 in Alg. 1) is also only performed at the mesh vertices, and the value at each pixel is obtained by linear interpolation
of the three values at the vertices of the containing triangle. The
number of these vertices is in practice roughly linear in the number
of boundary pixels. This reduces the total complexity of computing
the coordinates and of evaluating the interpolant to O(m²), enabling
interactive performance when cloning regions of moderate size.
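The per-pixel linear interpolation inside a mesh triangle can be written in a few lines. The sketch below is only illustrative: the function names are hypothetical, and the mesh construction itself (done with CGAL in the paper) and the point-in-triangle lookup are assumed to be handled elsewhere.

```python
import numpy as np

def barycentric(p, a, b, c):
    """Barycentric coordinates of point p in the triangle (a, b, c)."""
    edges = np.column_stack((b - a, c - a))   # 2x2 matrix of edge vectors
    u, v = np.linalg.solve(edges, p - a)      # p = a + u*(b - a) + v*(c - a)
    return np.array([1.0 - u - v, u, v])

def interpolate_pixel(p, tri_vertices, vertex_values):
    """Linearly interpolate membrane values stored at a triangle's vertices.

    tri_vertices: (3, 2) array of vertex positions;
    vertex_values: length-3 array of membrane values r at those vertices.
    """
    weights = barycentric(p, *tri_vertices)
    return weights @ vertex_values
```

Because the barycentric weights depend only on the pixel and the mesh, they can be computed once per selection and reused for every target position.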
Hierarchical boundary sampling. A further significant speedup
is achieved by hierarchically sampling the boundary, rather than
using all of the boundary pixels. This idea is inspired by adaptive hierarchical approaches, such as fast particle simulation algorithms [Carrier et al. 1988] and hierarchical radiosity [Hanrahan
et al. 1991]. Similarly to Coulomb potential fields and solid angles,
the mean-value weight of each boundary vertex decays quickly with
distance. Thus, an accurate approximation of the membrane may
be achieved by sampling the boundary with density that is inversely
proportional to the distance, as demonstrated in Figure 3. In practice, only a constant number of boundary vertices are used when
computing the coordinates and the membrane at each mesh vertex,
reducing the total cost of these operations to O(m).
Specifically, we first construct a 1D hierarchy over the sequence of
boundary pixels. Each coarser level in the hierarchy is obtained by
dropping every other point in the previous (finer) level. Note that
by this construction, if a vertex is present at some coarse level in
the hierarchy, it is also present in all the finer levels. The process
stops once the number of points in the coarsest level falls below a
predefined constant (16 in our implementation).
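The construction of the 1D hierarchy can be sketched as follows. This is a minimal illustration with a hypothetical function name; it works on boundary pixel indices only, and it assumes that the levels are numbered from the coarsest (k = 0) to the finest.

```python
def build_boundary_hierarchy(num_boundary_pixels, min_points=16):
    """Return lists of boundary indices, coarsest level first.

    Each coarser level keeps every other point of the next finer one, so a
    vertex present at a coarse level is present at all finer levels too.
    Subdivision stops once a level has fewer than min_points points.
    """
    levels = [list(range(num_boundary_pixels))]    # finest level: all pixels
    while len(levels[-1]) >= min_points:
        levels.append(levels[-1][::2])             # drop every other point
    return levels[::-1]                            # coarsest level first
```

For a boundary of 100 pixels this yields levels of 13, 25, 50, and 100 points, with an index step that doubles at each coarser level.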
Figure 3: An adaptive triangular mesh constructed over the region to be cloned. The red dots on the boundary show the positions of boundary vertices that were selected by adaptive hierarchical subsampling for the mesh vertex indicated in blue.

Next, for each mesh vertex $x$, we traverse the hierarchy from the top (coarse) level down. Let $p^k_{i-s}$, $p^k_i$, $p^k_{i+s}$ be three consecutive vertices at the k-th level of the hierarchy, where $s$ is the index step between successive vertices at that level. If all of the following three conditions hold:

$$\lVert x - p^k_i \rVert > \varepsilon_{dist}, \qquad \angle p^k_{i-s}, x, p^k_i < \varepsilon_{ang}, \qquad \angle p^k_i, x, p^k_{i+s} < \varepsilon_{ang},$$

then the mean-value weight (2) corresponding to $p^k_i$ at $x$ is sufficiently small and no further refinement of the boundary is necessary around $p^k_i$. If this is not the case, denser sampling is required in order to provide a better approximation of the membrane. Therefore, we insert two additional points, and repeat the same test for each of the three vertices at the next (finer) level: $p^{k+1}_{i-s/2}$, $p^{k+1}_i$, and $p^{k+1}_{i+s/2}$. In our current implementation we set the distance and angle thresholds $\varepsilon_{dist}$ and $\varepsilon_{ang}$ to:

$$\varepsilon_{dist} = \frac{\#\,\text{boundary pixels}}{16 \cdot 2.5^k} \qquad \text{and} \qquad \varepsilon_{ang} = 0.75 \cdot 0.8^k,$$

where k is the current depth in the hierarchy (k = 0 at the coarsest level). While these expressions were found to provide a good tradeoff between speed and visual quality in our experiments, they are not necessarily optimal, and could benefit from further tuning.

When given an error function f∗ − g on the boundary, care must be taken to avoid aliasing due to subsampling of the boundary. Thus, we progressively low-pass filter f∗ − g to obtain adequately bandlimited values at each hierarchy level, before computing the interpolants at any of the mesh vertices.

With both of the above optimizations in place, the total cost of computing the MVC coordinates and of evaluating the membrane (lines 4 and 13 in Alg. 1) becomes roughly linear in the number of boundary pixels, O(m), which in practice grows as O(√n), where n is the number of cloned pixels. Of course, because we linearly interpolate the membrane values to all n pixels, the asymptotic behavior is still O(n), similarly to Agarwala [2007].

4 Implementation and Performance
We have implemented MVC cloning both on the CPU and on the GPU. Both implementations target the interactive seamless cloning scenario, where the user first selects an image region to clone and then moves it across the target image, while the seamlessly cloned result is instantly generated and displayed at each target position.

CPU implementation. Once a selection has been made, a short pre-processing stage takes place, during which the adaptive mesh is created, and a vector of MVC coordinates is computed and stored with each mesh vertex. We also precompute, for each pixel in the selected region, the index of the mesh triangle containing this pixel and the three barycentric coordinates with respect to the containing triangle. As the region is moved to each new target location, we compute the error f∗ − g at each boundary point, evaluate the mean-value membrane r(x) at each mesh vertex, linearly interpolate to each pixel, using the precomputed barycentric coordinates, and finally compute the sum g + r.
For a region with 133K interior pixels and 1,562 boundary pixels,
the preprocessing stage takes 0.3 seconds (on a single core of an
AMD Athlon 2.5GHz). The interactive cloning then proceeds at
rates exceeding 90 updates per second. More timings and statistics
are given in Table 1. As expected, the number of mesh vertices
grows linearly with the length of the boundary, but the number of
boundary points sampled by each vertex remains roughly constant,
thanks to the adaptive hierarchical subsampling scheme. As cloned
regions become larger, the computation of barycentric coordinates
eventually dominates preprocessing time, and the cloning time becomes dominated by the linear interpolation step. Thus, for large
regions, performing MVC cloning is almost as cheap as performing
a linear interpolation at each pixel. The memory footprint is modest, consisting mainly of storing the barycentric coordinates and a
mesh triangle index for each pixel.
GPU implementation. MVC cloning is trivially parallelizable,
since the membrane evaluation at each mesh vertex is performed
completely independently of the other vertices. Our current GPU
implementation also uses an adaptive mesh to approximate the
membrane, and performs the hierarchical boundary subsampling.
The adaptive mesh and the vector of MVC coordinates at each mesh
vertex are precomputed on the host CPU as before, but it is no
longer necessary to precompute and store the barycentric coordinates of each pixel, further reducing the memory footprint. At each
frame, a simple vertex shader (30 lines of GLSL) evaluates the error membrane r(x) at each mesh vertex, the rasterizing hardware
linearly interpolates these values to each pixel, and a trivial fragment shader (6 lines of GLSL) computes the final value of g + r.
This results in seamless cloning at roughly 134 frames per second
on a mobile GPU (NVIDIA GeForce 9600M GT), when cloning a
region with 133K interior pixels. The speed advantage of the GPU
implementation over the CPU increases with the size of the cloned
region (see Table 1).
Table 1: Performance statistics for MVC cloning. Times exclude disk I/O and sending the images to the graphics subsystem. Cloning rate is the number of region updates per second.

#cloned pixels   #bdry pixels   #mesh vertices   coords/vertex   prep. time (s)   cloning rate (CPU)   cloning rate (GPU)
        51,820          1,113            2,063           38.63             0.15                199.0                  163
       133,408          1,562            2,963           44.21             0.30                 92.1                  134
       465,134          2,683            5,323           45.50             0.63                 22.6                   82
     1,076,572          4,145            8,241           44.59             1.16                  9.7                   44
     4,248,461          8,133           16,369           57.71             3.63                  2.7                   26
    12,328,289         14,005           28,240           58.68             8.99                 0.94                    −

Comparison with previous approaches
We are not aware of any existing system that is able to perform
seamless cloning on the CPU at the rates reported above. Testimonials by other researchers [Pérez et al. 2003; McCann and Pollard
2008], as well as our own experiments, indicate that common Poisson solvers on the CPU are able to handle regions with 256² pixels
at a rate of 3–5 solutions per second. Another possibility, which we
have not seen mentioned in the literature, is to precompute a factorization of the Poisson equation matrix during the preprocessing
stage, and then quickly compute the solution via back-substitution
at each target location. In our experiments, for a region with 125K
pixels, computing the back-substitution takes 0.3 seconds. Thus, all
of the above are significantly slower than the rates we are able to
achieve.
McCann and Pollard [2008] also demonstrate real-time seamless
cloning as one of the features of their gradient-domain painting system, reporting rates of 20 multigrid V-cycles per second for a one
megapixel image. A screen captured session with their system is
included in the accompanying video. Note that while the seamless
cloning indeed takes place in real time, there is a fair amount of noticeable flicker, as the cloned region is dragged about. The flicker
may be attributed to two factors: (i) the Poisson equation is solved
over the entire image (with Neumann boundary conditions), thus
the position of the cloned patch has a global effect on the result; (ii)
the result is updated after each V-cycle, which is not always sufficient to achieve complete visual convergence. In fairness, it should
be noted that region cloning is but one feature among several supported by the gradient-domain painting system. It is reasonable to
assume that a GPU-based multigrid solver would perform better
and avoid flicker, if applied to the cloned region only (with Dirichlet boundary conditions). Still, solving the Poisson equation on the
GPU is a much more involved task than MVC cloning, and has a
significantly larger memory footprint.
Figure 4: Object removal with Poisson cloning (middle) and MVC cloning (right). Top left: original image; bottom left: source patch. The corresponding membranes are visualized using a colormap. Although the visualization reveals some numerical differences between the membranes (RMS difference of about 0.015), it is difficult to see the difference between the resulting images.

5 Results and Extensions
MVC vs. Poisson. We have compared MVC cloning to Poisson
cloning on a large number of examples, using a variety of images
and differently shaped cloning regions. Our conclusion is that although the corresponding membranes are by no means identical,
the outcome of the cloning is typically difficult to tell apart visually. Even when (subtle) differences are visible, it is usually difficult to prefer one outcome over another. The differences between
the two kinds of membranes tend to be smaller for convex shapes,
such as a rectangle or a disk, as demonstrated in Figure 1. Cloning
more concave regions, such as the one shown in Figure 4, typically
results in more significant differences between the membranes, but
the results are difficult to tell apart. The differences between the
membranes become most apparent for extremely concave regions.
For example, consider the synthetic example shown in the top row
of Figure 5. Here, the goal is to fill an omega-shaped hole. While
the Laplace membrane succeeds in eliminating the hole with almost
no visible trace, MVC interpolation is less successful. The reason is
that the MVC membrane in each half of the shape is affected by values along the boundary of the opposite half, despite there being no
lines of sight (inside the shape) between the two halves. However,
in a more typical scenario with less extreme gradients and some
texture, the results become comparable in quality, even though the
same concave region is used (Figure 5, bottom row).
Instant seamless cloning. The gains we achieve in performance
translate into a significantly different interactive experience for the
user. To illustrate this, the accompanying video includes a real-time
screen capture of seamless cloning with the Patch tool in Photoshop CS4. Note that while the user is dragging the patch around no
seamless cloning takes place, and thus the user is unable to assess
the result of the operation in real time. There is also a noticeable delay from the time the patch is dropped in its target position until the
final result appears. In contrast, when cloning with our approach,
the seamless cloning result is displayed instantly, greatly assisting
the user in selecting a suitable target position.
Figure 5: Poisson vs. MVC over a highly concave region. Left:
input image; Middle: Poisson cloning; Right: MVC cloning.
Figure 6: More seamless cloning results, obtained by rotating and
scaling the source patch (left: original, right: after cloning).
As was mentioned earlier, MVC are invariant under similarity transformations. Thus, during an interactive cloning session, the source
region may be rotated and scaled without the need to repeat the
preprocessing. Again, we found the ability to do this with instant
feedback extremely helpful. Two results obtained with the use of
such transformations are shown in Figure 6.
5.1 Mean-Value Video Cloning
Given the speed of MVC cloning, it is only natural to consider applying it to seamless cloning of video. Seamless video cloning has
been attempted before by Wang et al. [2004], by forming and solving a 3D Poisson equation over the entire 3D space-time volume of
the video. Since our goal is to clone interactively, while both the
source clip and the target video are continuously playing, we opt
instead for a frame-by-frame solution, with temporal smoothing
between consecutive interpolating membranes to ensure temporal
coherence.
In our current implementation, the shape of the source video patch
and its position in the source video frames are kept fixed. We store
with each mesh vertex (in addition to its MVC coordinates) a set
of its membrane values in the last k frames. To form a membrane
for the current frame we compute a weighted average of these values, with the weights of older frames decaying with time as ∆t^(−0.75),
where ∆t is the distance in frames between the current frame and
the older one. However, for seamless results the membrane must respond quickly to changing discrepancies between the source and the
target along the boundary. Thus, the weight of older membranes in
the temporal averaging is further reduced at vertices near the boundary (by a factor of 2^(−d), where d is the normalized distance of a
vertex to the boundary).
The accompanying video demonstrates some results of interactive
seamless video cloning (captured in real time). Snapshots from
the interactive session are shown in Figure 7. Seamless cloning
of video is a much more challenging task than cloning in still images: inserting and removing objects can be a time-consuming and
frustrating task. Therefore, the kind of real-time feedback provided
by our approach is instrumental to the user’s ability to achieve satisfactory results.
5.2 MVC Stitching
Gradient-domain stitching and seamless cloning are closely related.
For example, stitching may be done by setting up a guiding field
that uses the gradients of the images being stitched away from the
seams, and the average of the gradient at pixels along the seams
[Agarwala 2007]. Solving the Poisson equation then yields a seamlessly stitched composite. This approach typically uses Neumann
boundary conditions, which prescribe the value of the derivative
normal to the boundary, and thus result in free-floating boundaries.
Since our MVC cloning machinery is based on interpolation of
boundary values across the domain, it is suitable for Dirichlet
boundary conditions, rather than Neumann boundary conditions.
Nevertheless, it is quite easy to adapt our approach to perform
stitching. Suppose that our goal is to stitch together two images
A and B along a given seam. This may be accomplished by keeping
one image, say B, fixed, and adding to A a smooth offset function, which interpolates the error between the two images along the
seam and gradually goes to zero away from the seam. Specifically,
we construct a polygonal boundary around image A consisting of
the pixels on the seam between the images, as well as the “free”
(unconstrained) corners of A. We set the offset values to the difference B − A for each boundary vertex (pixel) along the seam, and to
zero at the free corners of A. Note that the pixel values along the
free edges of A are not constrained to any particular offset value,
and the offsets along these edges vary smoothly between B − A at
the seam and zero at the free corners. This idea easily extends to
any number of images, by computing a similar offset membrane for
each image in the composite. Figure 8 shows an example result
computed using the approach described above. This 7.5 Mpixel
image took 3.7 seconds to stitch, which is slightly faster than the
times reported by Agarwala [2007]. Additional experiments (up to
33 Mpixels) indicated that the cost of stitching with our approach
grows linearly with the length of the seams, with stitching rates of
above 1 Mpixel per second. The memory footprint is also linear in
the length of the seams.
5.3 Selective Boundary Suppression
It is well known that Poisson cloning works best when the error along the boundary of the cloned region is nearly constant, or
changes smoothly. When this is not the case, there is a visible
“smudging” of the error from the boundary into the cloned area
(Figure 9). Mixing source and target gradients [Pérez et al. 2003]
and optimizing the boundary [Jia et al. 2006] offer a solution to
this problem in some, but not all, cases. For example, in cases such
as the one shown in Figure 9, no adjustment of the boundary is able
to avoid the problem, since any boundary must cut across the trunk
of the tree or the ground on which it stands in the source image. An
alternative solution is to construct a smooth membrane which does
not attempt to interpolate large errors on the boundary. A similar
workaround has been used in the context of gradient domain fusion
[Agarwala et al. 2004].
Sections of the boundary where the error is too large to be interpolated may be detected automatically, or indicated by the user. In
our current implementation, we let the user paint over a portion of
the boundary that causes an undesirable artifact in the cloned result.
This signals the cloning routine that the marked boundary vertices
should not participate in the membrane evaluation. The mean-value
weights λi corresponding to these vertices are then set to zero (and
the remaining weights are, of course, re-normalized accordingly).
Note that this does not involve recomputing the mesh, or the coordinates at each mesh vertex. Figure 9 and the accompanying video
show an example of a result obtained in this manner, demonstrating that selective boundary suppression provides the user with more
control over the result of the cloning operation and widens the range
of scenarios where seamless cloning is possible.
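The suppression step amounts to masking and re-normalizing precomputed weights. The following is a minimal sketch, assuming the mean-value weights are stored as a dense array with one row per evaluation point and one column per boundary vertex; the function name `suppress_and_renormalize` and the sample numbers are hypothetical.

```python
import numpy as np

def suppress_and_renormalize(lam, suppressed):
    """Zero the weights of marked boundary vertices and re-normalize each row.

    lam:        (n_points, n_boundary) mean-value weights, each row summing to 1.
    suppressed: boolean mask of length n_boundary (True = vertex painted by user).
    """
    lam = np.asarray(lam, dtype=float)
    keep = ~np.asarray(suppressed, dtype=bool)
    masked = lam * keep                          # zero the suppressed columns
    total = masked.sum(axis=1, keepdims=True)
    if np.any(total <= 0.0):
        raise ValueError("all contributing boundary vertices were suppressed")
    return masked / total

# One evaluation point, four boundary vertices; vertex 2 carries a large error.
lam = np.array([[0.1, 0.2, 0.3, 0.4]])
boundary_error = np.array([1.0, 1.0, 9.0, 1.0])
mask = np.array([False, False, True, False])

before = lam @ boundary_error                    # membrane value with smudging
after = suppress_and_renormalize(lam, mask) @ boundary_error
```

Only the stored weights are touched, so the mesh and the coordinates themselves need not be recomputed, in line with the note above.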
Figure 9: Selective boundary suppression. Left: source patch
with boundary cutting across an object. Right: regular seamless
cloning results in smudging (left tree), which is removed via selective boundary suppression (right tree, see also the video).
Figure 7: Seamless video cloning: snapshots from an interactive session. A bird is duplicated (left), another bird is removed (middle), a
large rock to the left of the bear is removed (right).
Figure 8: MVC stitching. Top: composite with seams; Middle: MVC membrane; Bottom: seamlessly stitched panorama.
5.4 MVC Matting
Poisson matting [Sun et al. 2004] is a gradient-domain technique for
extracting the matte of a foreground object. Given an image I, the
goal is to estimate the matte α and the foreground and background
color functions F and B, such that:
\[ I = \alpha F + (1 - \alpha) B. \tag{9} \]
In order to accomplish this task, the user provides a trimap: a map
classifying the image pixels into three disjoint regions: “definitely
foreground” ΩF , “definitely background” ΩB , and the “unknown
region” Ω between them, which contains the boundary of the foreground object.
Poisson matting relies on the assumption that in the unknown region, both the foreground color F and the background color B vary
smoothly. Thus, the gradients in this region are assumed to be due
to the matte α. More precisely, the matte gradient field over Ω is
approximated as:
\[ \nabla\alpha \approx \frac{1}{F - B} \nabla I \tag{10} \]
The matte is therefore estimated by solving the Poisson equation
\[ \Delta\alpha = \operatorname{div}\left(\frac{\nabla I}{F - B}\right), \quad \text{such that } \alpha = \begin{cases} 1 & \text{on } \partial\Omega_F \\ 0 & \text{on } \partial\Omega_B \end{cases} \tag{11} \]
In general, the above equation is not equivalent to a Laplace equation, because the vector field ∇I/(F − B) is not conservative. However, if F and B vary smoothly in the unknown region, as assumed
by Poisson matting, we may approximate it by a conservative field:
\[ \frac{\nabla I}{F - B} \approx \nabla \frac{I}{F - B}. \tag{12} \]
Thus, defining g = I/(F − B), we obtain that solving the Poisson
equation (11) is approximately equivalent to solving the Laplace
equation:
\[ \Delta\tilde{\alpha} = 0, \quad \text{such that } \tilde{\alpha} = \begin{cases} 1 - g & \text{on } \partial\Omega_F \\ 0 - g & \text{on } \partial\Omega_B, \end{cases} \tag{13} \]
and obtaining the alpha matte as: α = g + α̃. Exactly as before,
it is possible to compute a similar membrane interpolant by using
mean-value coordinates, instead of solving a linear system.
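To see why the decomposition α = g + α̃ recovers the Poisson matting problem, the substitution can be written out explicitly. The short derivation below only restates equations (11) to (13); it adds no assumption beyond the smoothness of F and B already made in the text.

```latex
\begin{align*}
\Delta\alpha
  &= \Delta(g + \tilde{\alpha})
   = \Delta g + \Delta\tilde{\alpha}
   = \Delta g
  && \text{since } \Delta\tilde{\alpha} = 0 \text{ by (13)} \\
  &= \operatorname{div}\,\nabla\frac{I}{F - B}
   \approx \operatorname{div}\frac{\nabla I}{F - B}
  && \text{by (12)} \\[4pt]
\alpha\big|_{\partial\Omega_F} &= g + (1 - g) = 1,
\qquad
\alpha\big|_{\partial\Omega_B} = g + (0 - g) = 0 .
\end{align*}
```

The first two lines reproduce the right-hand side of (11) up to the approximation (12), and the last line reproduces its boundary conditions.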
(a) input image
(b) trimap
Specifically, given a trimap, we need to estimate g = I/(F − B) on
the boundaries of the unknown region Ω. We use mean-value interpolation to obtain these estimates. The background colors B are smoothly interpolated from their known values along the boundary ∂ΩB , while
the foreground colors F are smoothly extrapolated outward from their known values along
∂ΩF . Here, we take advantage of the fact that mean-value coordinates are well-defined and smooth over the entire plane (except on
the boundary itself) [Hormann and Floater 2006].
Figure 10 shows an input image and a corresponding trimap, as
well as the resulting mattes produced by Poisson matting and by our
approach. It may be seen that, although the mattes are not identical,
they are quite similar. It should be noted that Poisson matting is
not the best matting method available today (see [Levin et al. 2008;
Wang and Cohen 2007], where Poisson matting is compared with
more state-of-the-art methods). However, when attempting to clone
an object over a non-homogeneous target image, the kind of matte
that we are able to obtain with our approach is often sufficient for
a convincing composite. Figure 11c shows a case where seamless
cloning fails to produce a satisfactory result. Compare this result
with Figure 11d, where the transparency across the cloned region is
modulated by the matte computed using our approach: the overall
appearance of the eagle matches the target image, but the smudging
of the surrounding background is avoided.
(c) Cloning the eagle over a non-homogeneous image.
(d) Applying a matte to the cloned region.
Figure 11: Matted cloning.
(a) input image
(b) trimap
6 Discussion and Conclusions
Using the general framework of mean-value coordinates, we have
presented a new approach for seamless cloning of images and
video, stitching, and matting. We have demonstrated a number of
advantages that our approach offers over existing techniques.
Limitations. One limitation of our approach is that it is not applicable to every scenario where the Poisson equation might be used,
as it relies on the ability to decompose the solution into a sum of
a smooth interpolating membrane and a known function. Thus, we
do not currently see a way of applying it to tasks such as gradient-domain HDR compression [Fattal et al. 2002], or Poisson cloning
with mixed gradients [Pérez et al. 2003], where the resulting guiding field is not conservative.
(c) Matte from [Sun et al. 2004]
(d) MVC matte
Figure 10: A comparison with Poisson matting.
Interestingly, since we have an estimate of F and B at every point
inside the unknown region, it is also possible to estimate α directly
from these values:
\[ \alpha = \frac{I - B}{F - B}. \tag{14} \]
The results are not identical, but comparable to those obtained as
described earlier, so this observation merits further investigation.
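Equation (14) is a per-pixel division, so it is easy to sketch. The snippet below is a hypothetical illustration on single-channel values; the guard against a vanishing denominator (`eps`) and the clipping of the result to [0, 1] are assumptions added here and are not described in the text.

```python
import numpy as np

def direct_alpha(I, F, B, eps=1e-6):
    """Estimate alpha = (I - B) / (F - B) per pixel, as in equation (14).

    I, F, B are arrays of the same shape (single channel assumed).
    eps and the final clipping are illustrative safeguards only.
    """
    I, F, B = (np.asarray(a, dtype=float) for a in (I, F, B))
    denom = F - B
    # Where F and B nearly coincide, alpha is ill-defined; avoid dividing by ~0.
    safe = np.where(np.abs(denom) < eps, eps, denom)
    return np.clip((I - B) / safe, 0.0, 1.0)

# Synthetic check: compose I from a known matte, then recover it.
F = np.array([0.8, 0.9])
B = np.array([0.2, 0.1])
alpha_true = np.array([0.25, 0.5])
I = alpha_true * F + (1.0 - alpha_true) * B
alpha_est = direct_alpha(I, F, B)
```

With exact F and B the matte is recovered exactly; in practice F and B are only interpolated estimates, which is presumably why the results differ from the membrane-based matte.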
Another limitation, already pointed out earlier, is that seamless
cloning (be it MVC-based or Poisson-based) only works well when
the texture in the surrounding target region is sufficiently similar to
the texture near the boundaries of the source patch. This becomes
particularly visible in some video cloning examples, where the textures should match both spatially and temporally for satisfactory
results.
Future Work. Our current implementation of video cloning was
meant as a proof of concept. We believe that a better, specialized
user interface is needed in order to effectively work with seamless
video cloning. The user should be able to adjust the shape of the
source region for cloning, or the region where an object is to be
removed, since having to use a fixed region throughout a video
clip is quite limiting. It would also be interesting to investigate
whether constructing a 3D interpolant (in the space-time volume of
the video) offers any advantages over our current temporal smoothing scheme.
As pointed out in Section 1, a variety of generalized barycentric coordinate schemes have been proposed in recent years. In this paper
we chose to focus on MVC, but it might be interesting to explore
how some of these other schemes compare with MVC in the context of seamless cloning. For example, the higher order barycentric
coordinates proposed by Langer and Seidel [2008] enable interpolation of both values and derivatives on the boundary.
Future work should also examine the possibility of using a
coordinate-based approach to perform cloning of volumes, light fields, and other high-dimensional data sets, as well as seek additional applications of this powerful framework.
Acknowledgments: This work was supported in part by grants
from the Israel Ministry of Science, and from the Israel Science
Foundation founded by the Israel Academy of Sciences and Humanities. The authors would also like to thank the anonymous reviewers for their comments.
References
Agarwala, A., Dontcheva, M., Agrawala, M., Drucker, S., Colburn, A., Curless, B., Salesin, D., and Cohen, M. 2004. Interactive digital photomontage. ACM Trans. Graph. 23, 3, 294–302.
Agarwala, A. 2007. Efficient gradient-domain compositing using quadtrees. ACM Trans. Graph. 26, 3, 94.
Bolz, J., Farmer, I., Grinspun, E., and Schröder, P. 2003. Sparse matrix solvers on the GPU: conjugate gradients and multigrid. ACM Trans. Graph. 22, 3, 917–924.
Carrier, J., Greengard, L., and Rokhlin, V. 1988. A fast adaptive multipole algorithm for particle simulations. SIAM Journal on Scientific and Statistical Computing 9, 669–686.
CGAL, 2007. Computational Geometry Algorithms Library. http://www.cgal.org.
Fattal, R., Lischinski, D., and Werman, M. 2002. Gradient domain high dynamic range compression. ACM Trans. Graph. 21, 3, 249–256.
Finlayson, G. D., Hordley, S. D., and Drew, M. S. 2002. Removing shadows from images. In Proc. ECCV, Springer-Verlag, London, UK, vol. IV, 823–836.
Floater, M. S., Kós, G., and Reimers, M. 2005. Mean value coordinates in 3d. Comput. Aided Geom. Des. 22, 7, 623–631.
Floater, M. S. 2003. Mean value coordinates. Comput. Aided Geom. Des. 20, 1, 19–27.
Georgiev, T. 2004. Photoshop healing brush: a tool for seamless cloning. In Workshop on Applications of Computer Vision (ECCV 2004), 1–8.
Hanrahan, P., Salzman, D., and Aupperle, L. 1991. A rapid hierarchical radiosity algorithm. Computer Graphics (SIGGRAPH ’91 Proceedings) 25, 4 (July), 197–206.
Hormann, K., and Floater, M. S. 2006. Mean value coordinates for arbitrary planar polygons. ACM Transactions on Graphics 25, 4, 1424–1441.
Jia, J., Sun, J., Tang, C.-K., and Shum, H.-Y. 2006. Drag-and-drop pasting. ACM Trans. Graph. 25, 3 (July), 631–637.
Joshi, P., Meyer, M., DeRose, T., Green, B., and Sanocki, T. 2007. Harmonic coordinates for character articulation. ACM Trans. Graph. 26, 3, 71.
Ju, T., Schaefer, S., and Warren, J. 2005. Mean value coordinates for closed triangular meshes. ACM Trans. Graph. 24, 3, 561–566.
Kazhdan, M. M., and Hoppe, H. 2008. Streaming multigrid for gradient-domain operations on large images. ACM Trans. Graph. 27, 3.
Land, E. H., and McCann, J. J. 1971. Lightness and Retinex Theory. J. Opt. Soc. Amer. 61 (Jan.), 1–11.
Langer, T., and Seidel, H.-P. 2008. Higher order barycentric coordinates. Computer Graphics Forum (Eurographics 2008) 27, 2, 459–466.
Levin, A., Zomet, A., Peleg, S., and Weiss, Y. 2004. Seamless image stitching in the gradient domain. In Proc. ECCV, Springer-Verlag, vol. IV, 377–389.
Levin, A., Lischinski, D., and Weiss, Y. 2008. A closed-form solution to natural image matting. IEEE Trans. Pattern Anal. Mach. Intell. 30, 2, 228–242.
McCann, J., and Pollard, N. S. 2008. Real-time gradient-domain painting. ACM Transactions on Graphics (SIGGRAPH 2008) 27, 3 (Aug.).
Palmer, S. E. 1999. Vision Science: Photons to Phenomenology. The MIT Press, May.
Pérez, P., Gangnet, M., and Blake, A. 2003. Poisson image editing. ACM Trans. Graph. 22, 3, 313–318.
Sun, J., Jia, J., Tang, C.-K., and Shum, H.-Y. 2004. Poisson matting. ACM Trans. Graph. 23, 3, 315–321.
Szeliski, R. 2006. Locally adapted hierarchical basis preconditioning. ACM Trans. Graph. 25, 3, 1135–1143.
Wachspress, E. L. 1975. A Rational Finite Element Basis. Academic Press, New York.
Wang, J., and Cohen, M. F. 2007. Optimized color sampling for robust matting. In Proc. CVPR, 1–8.
Wang, H., Raskar, R., and Ahuja, N. 2004. Seamless video editing. In Proc. ICPR ’04, IEEE Computer Society, Washington, DC, USA, vol. 3, 858–861.
Warren, J. 1996. Barycentric coordinates for convex polytopes. Advances in Computational Mathematics 6, 2, 97–108.
Weiss, Y. 2001. Deriving intrinsic images from image sequences. In Proc. ICCV, 68–75.