Automatic Tracking of Interactive Virtual Players by Cameras Using a Voronoi Freespace Representation

Martin Otten, Heinrich Müller

Forschungsbericht Nr. 807/2006, January 2006

[6] S.M. Drucker and D. Zeltzer. CamDroid: A system for implementing intelligent camera control. In Proc. ACM Symp. on Interactive 3D Graphics 1995, pp. 139–144, 1995
[7] D.H. Eberly. 3D game engine design. Morgan Kaufmann Publishers, San Francisco, 2001
[8] H. Edelsbrunner. Algorithms in combinatorial geometry. Springer-Verlag, Berlin, 1987
[9] S.F. Frisken, R.N. Perry, A.P. Rockwood, and T.R. Jones. Adaptively sampled distance fields: A general representation of shape for computer graphics. In Proc. SIGGRAPH 2000, pp. 249–254, 2000
[10] B. Geiger and R. Kikinis. Simulation of endoscopy. In Computer Vision, Virtual Reality and Robotics in Medicine, Lecture Notes in Computer Science 905, pp. 277–281. Springer-Verlag, 1995
[11] N. Halper, R. Helbing, and Th. Strothotte. A camera engine for computer games: managing the trade-off between constraint satisfaction and frame coherence. Computer Graphics Forum 20(3):C-174–C-183, 2001
[12] H. Hoppe, T. DeRose, T. Duchamp, J. McDonald, and W. Stuetzle. Mesh optimization. In Proc. SIGGRAPH 1993, pp. 19–26, 1993
[13] J.C. Latombe. Robot motion planning. 3rd ed., Kluwer Academic Publishers, Boston, MA, 1993
[14] T.-Y. Li, T.-H. Yu, and Y.-C. Shie. On-line planning for an intelligent observer in a virtual factory. Computer Science Department, National Chengchi University, Taipei, Taiwan, 2000. Available from www3.nccu.edu.tw/~li/Publication
[15] H. Pfister, M. Zwicker, J. van Baar, and M. Gross. Surfels: Surface elements as rendering primitives. In Proc. SIGGRAPH 2000, pp. 335–342, 2000
[16] S. Rabin (ed.). AI game programming wisdom. Charles River Media Inc., Hingham, Mass., 2002
[17] S. Teller and C.H. Sequin. Visibility processing for interactive walkthroughs. In Proc. SIGGRAPH 1993, pp. 61–69, 1993
[18] Valve Software. Half-Life.
Available from http://www.valve-erc.com

used. With respect to camera control, just the principle has been outlined, exemplified by a simple objective function and a simple heuristic solution. With this objective function, good camera paths can already be achieved if the free parameters are set properly. Additional work might be invested in this topic. For example, other techniques of camera control, such as that of Halper et al. [11], might be combined with the VFS-rep.

A more significant extension is to more than one player or to additional moving objects in the scene. In this case more than just one frame P might be taken into consideration by the objective function. For this purpose, the actors or moving objects might be enveloped by a union of balls whose centers are the origins of frames. Then, for instance, collisions between them and the camera can be avoided by requiring that the frame origin of the camera keeps a certain distance from each of the ball centers. This might be achieved e.g. by some sort of repelling functions.

Martin Otten, Heinrich Müller
Universität Dortmund, FB Informatik LS VII, D-44221 Dortmund, Germany
Tel.: +49 - 231 - 755 6134, Fax.: +49 - 231 - 755 6321
e-Mail: mueller@ls7.informatik.uni-dortmund.de

References

[1] F. Aurenhammer. Voronoi diagrams – a survey of a fundamental geometric data structure. ACM Computing Surveys 23(3):345–406, 1991
[2] C.B. Barber, D.P. Dobkin, and H.T. Huhdanpaa. The Quickhull algorithm for convex hulls. ACM Trans. on Mathematical Software 22(4):469–483, 1996. Available from http://www.geom.umn.edu/software/qhull
[3] C. Rocchini, P. Cignoni, C. Montani, P. Pingi, and R. Scopigno. A low cost 3D scanner based on structured light. Computer Graphics Forum 20(3):C-299–C-308, 2001
[4] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf.
Computational geometry: algorithms and applications (2nd edition). Springer-Verlag, Berlin, 2000
[5] S. Drucker. Intelligent camera control for graphical environments. Ph.D. Thesis, Massachusetts Institute of Technology Media Lab, 1994

Abstract

The problem of automatic camera control consists of continuously following a virtual player with a virtual camera in a virtual environment in order to show the player and its local environment in a suitable way. Interactively controlled players, as they occur e.g. in 3D computer games, pose a particular challenge. We present a data structure called Voronoi freespace representation (VFS-rep). The VFS-rep efficiently supports a class of camera control strategies based on local objective functions. Its main feature is the combination of a roadmap with a freespace representation. Besides collision avoidance and visibility estimation, which use the freespace information, the roadmap of the VFS-rep helps if the camera has lost the player. In this case, the camera can move continuously along the roadmap to that branch of freespace to which the player disappeared. In this way, undesired discontinuous jumps of the camera to the new location of the player, which can be observed in games, become rare events, in particular in complex environments.

Figure 7: Example of a critical situation in which the VFS-rep helps. The player disappears at a corner, but is found again by the camera.

Quantity                                          Value
# polygons of the input scene                     4 722
# sampling points                                 75 407
# Voronoi vertices                                440 256
# inner Voronoi vertices                          228 077
# inner Voronoi faces                             119 283
# inner Voronoi triangles                         339 867
# inner Voronoi vertices after reduction          5 853
# inner Voronoi triangles after reduction         14 046
time of sampling and Voronoi diagram calc. / s    51
time of reduction / h                             1:05

Figure 6: Quantitative properties of the scene datacore. "Inner" means the part of the mesh inside the datacore hull.

440 KB.
During the game, only about 1% of the overall computing time of the game has been required by camera control. Figure 7 shows three frames of a critical situation in the datacore scene in which the VFS-rep has helped. The player disappears at a corner, but is found again by the camera.

7 Conclusions

We have presented an approach to automatic observation of a virtual player in a 3D computer game by a camera. The VFS-rep allows the camera to find the player on a continuous path if the view gets lost. We have demonstrated the usefulness of the approach by an implementation. While calculation of the VFS-rep in a pre-processing step needs some time, the VFS-rep allows on-line tracking, including collision avoidance and visibility estimation, in real time and needs just a minor portion of the computing time of the game.

The emphasis of the paper has been the presentation of the VFS-rep and the related algorithms. The VFS-rep is calculated from a point-sampling of the input scene. Thus this approach is particularly suited for point-based representations of the scene, which is of relevance if scanned data and point-based rendering are

1 Introduction

The presentation of the views which are relevant for the players or observers is an important issue of computer games. We think of computer games which take place in a 3D virtual world [7, 16]. One or more players may interact with objects of the scene or with other players. For this purpose the real players are visually represented by virtual players which are controlled interactively by the behavior of the real player. In addition to the players there might be observers who are not actively involved in the game but who are interested in the event. A well-known, net-based game of this type is Half-Life [18]. The visual link between the players or observers and the game is established by virtual cameras.
The virtual cameras might be interactively controlled by the players or observers, or they might automatically present suitable views of the current configuration of the game. The latter is of interest for players, who usually have to focus their attention on the game. Restricted net or server capacity, which excludes simultaneous interactive access by all observers, might be another reason for automatic camera control.

One approach to keeping the relevant events of the game in view is to assign a virtual camera to the virtual player. The virtual camera follows the player automatically and yields views of the player and the player's current environment which fulfill prescribed requirements.

Technically, the problem of player tracking can be formalized as follows. Given are a scene S of obstacles, an online generated sequence $P_i$, $i = 0, 1, \ldots$, of locations of a player P, and a camera C. Wanted is a sequence of camera configurations $C_i$, $i = 0, 1, \ldots$, so that $C_i$ is in the freespace of S and sees $P_i$ in a suitable manner. "Seeing in a suitable manner" is expressed by an objective or cost function c which depends on S, the location of P, the location of C, and further parameters supplied by the camera control engine of the game. The further parameters in particular concern the view of C on P. But the distance of the camera path from the obstacles of the scene or the smoothness of the camera path might be influenced, too.

The main contribution of this paper is a data structure called Voronoi freespace representation (VFS-rep). A particular strength of the VFS-rep is the possibility of efficiently calculating a freespace path between the current configuration of the camera and the accompanied player. This is useful if visibility between the camera and the player is lost or becomes poor. In this case, the camera can continuously follow the player along the freespace path in order to reach a new position with a better view.
In this way, undesired discontinuous jumps of the camera to the new location of the player are rare events. While calculation of the VFS-rep in a pre-processing step needs some time, the VFS-rep allows on-line tracking, including collision avoidance and visibility estimation, in real time and needs just a minor portion of the computing time of a typical game.

Chapter 2 surveys related work. Chapter 3 specifies the player tracking problem precisely and outlines the basic approach using an example which is typical for a class of tracking approaches supported by the VFS-rep. Chapter 4 defines the VFS-rep and shows how it can be calculated efficiently. Chapter 5 presents basic algorithms of tracking, in particular concerning distance and visibility calculation, and shows how the VFS-rep can be used to minimize the objective function which is the core of the concept of player tracking. Chapter 6 compiles data of an empirical analysis of the presented solution. Chapter 7 concludes the paper.

2 Related Work

Halper et al. [11] give a good survey of the state of the art of camera control in computer games, to which we refer instead of a recapitulation. They also work out how the requirements on camera control in games differ from those in cinematography and computer animation. A particular difference is that in games the scene is influenced interactively, so that perfect planning in advance is not possible. Only estimated predictions of the behavior of e.g. the players might be used in order to optimize the camera behavior.

Figure 5: The test scene datacore. A closer view of the original and the reduced Voronoi mesh.
rial structure of the mesh than on its size, so that some variance can be noticed even on meshes of equal size.

At first glance, this time requirement seems considerable. However, it should be noticed that preprocessing has to be invested for every scene just once. For that reason, we did not try to speed up the implementation.

The space requirements of the resulting VFS-rep data structure has been about

There are three types of data commonly used by camera control which, according to this difference, have to be calculated on-line in real time: data about the freespace around the camera, for example the distance of the camera to the obstacles of the scene; information about visibility, in particular concerning the visibility between the camera and the player; and additional constraints which restrict the motion path of the camera.

In the approach by Halper et al. [11], the freespace and visibility are determined on-line by using the possibilities of rendering libraries like OpenGL and the related graphics hardware. Constraints are defined by augmenting the given scene on-line by additional geometry not visible to the user. This approach does not use a preprocessed explicit data structure concerning freespace, visibility, and constraints. Those data are determined "on the fly". This requires considerable computing power for more complex scenes. The computing resources are taken from the graphics hardware and thus might reduce the possibilities available for rendering. As an alternative for complex or large scenes, preprocessing-based approaches to visibility calculation known from interactive walkthroughs in virtual realities might be applied in order to diminish this problem [17].

A problem related to camera control is motion planning in robotics. A standard formulation of the problem of motion planning is: move a robot from a given start configuration into a desired goal configuration without collision with the surrounding scene.
A quite general relation to camera control is that a collision-free path has to be found here, too. Many solutions concern the case of static obstacles, that is, the scene is static and just the robot is dynamic. In many games, the situation is quite similar: the environment is static and just the players change their positions.

Approaches to motion planning in static scenes often consist of two phases: a preprocessing phase, in which the scene is preprocessed in order to allow an efficient execution of the second phase, and path finding. A good introduction to this topic is given by Latombe [13].

A first approach is to represent the complete freespace by a union of cells (regular/non-regular, adaptive/non-adaptive, hierarchical/non-hierarchical). A further possibility is to augment the cells with information about the distance to the closest obstacles. This leads to distance fields [9]. The cells define the vertices of a graph which are connected by an edge if the corresponding cells are neighboring. The second phase consists of finding a path in the graph from the cell of the source configuration to the cell of the goal configuration.

Another method uses potential fields. In this case, the obstacles of the scene get repulsive fields, whereas the goal gets an attracting field. The superposition of the fields yields a force vector at every location in freespace. The robot reaches the goal configuration by following the force vectors.

Figure 4: The test scene datacore. The first picture shows the scene from outside by its polygons. The middle picture shows the set of sampling points, and the third picture visualizes the reduced VFS-rep by the edges of the resulting Voronoi triangles.

Sometimes the freespace is reduced to a subspace or roadmap. Examples are the medial axis and the visibility graph. The medial axis consists of all points of the freespace which have equal distance to at least two different points on obstacles.
The visibility graph connects pairs of sampling points on the surface by an edge if they see each other. The medial axis and the visibility graph again define graphs which are traversed in the path finding phase.

These methods can be applied to camera control, too, in order to avoid collisions of the camera or to test for visibility. Drucker [5, 6] combines a visibility graph for global planning with some sort of potential approach for local planning. Li et al. [14] augment a rasterized cell representation of freespace by visibility information stored at every cell for rasterized viewing directions.

Our approach is also inspired by the methods of robot motion planning. It extends the medial axis roadmap to a representation of the complete freespace by a covering of cells. The advantage is that a basic roadmap for the camera path is available which helps if the view of the player gets lost. In this case, the camera can move along the roadmap to that branch of freespace to which the player disappeared. However, because of the requirements on the view of the camera on the player, the medial axis as a roadmap is too restricted. In order to choose a suitable location, the information about freespace can be used. The information about freespace is also useful in order to calculate the visibility of the player with respect to the camera, and to avoid collisions of the camera with the scene.

Figure 3: The architecture of the computer game Half-Life.

3 Requirements and basic approach

Let us first recall the version of the camera problem treated in the following. The input consists of a scene S of obstacles, an online generated sequence $P_i$, $i = 0, 1, \ldots$, of locations of a player P, and a camera C. The output is an on-line, real-time generated sequence of camera configurations $C_i$ so that $C_i$ is in the freespace of S and sees $P_i$ in a suitable manner.

The camera C is represented by a 3D orthogonal frame $\{o_C, x_C, y_C, z_C\}$ in the world coordinate system of S (Figure 1).
$o_C$ is the origin which serves as viewpoint. $x_C$, $y_C$, $z_C$ are mutually orthogonal vectors among which $z_C$ defines the view axis, and $x_C$, $y_C$ span the image plane. The player P is represented by an orthogonal frame $\{o_P, x_P, y_P, z_P\}$ in space, for instance with $o_P$ as the center of the head, $z_P$ as a vector from $o_P$ towards the face, and $y_P$ as a vector from $o_P$ towards the top of the head.

Figure 1: Definition of an ideal view of the camera on the player.

For the empirical analysis, a PC with an 800 MHz Celeron CPU, 384 MB RAM, and Windows 2000 has been used. The program is written in C++. Figure 4, middle, shows the result of the calculation of the VFS-rep based on uniform point sampling of datacore. A reduced VFS-rep is shown in Figure 4, bottom. The non-reduced VFS-rep is not shown because of its density of line segments. Figure 5 gives a closer view of the meshes. Figure 6 compiles some statistical data of this calculation. The data show that mesh reduction has a considerable effect.

Sampling and calculation of the Voronoi diagram of the resulting sampling points by QHULL [2] required 51 s. The subsequent mesh reduction has taken 1:05 h. Similar computation times could be observed for other scenes. The calculation times of mesh reduction, however, seem to be more sensitive to the combinato-

purpose. The chosen vertex is the second or the last but one vertex, respectively, of p, and the rest of p is constructed as a shortest path between them.

We have not yet described the solutions chosen for all the component functions of c. They are:

for $c_{vis}$: see above,

for $c_{dist}$ and $c_{angle}$: $T_{t,dist/angle} := o_C - (o_P + d_{opt} \cdot z_P)$, where $z_P$ is assumed to have length 1,

for $c_{\Delta dist}$ and $c_{\Delta angle}$: $T_{t,\Delta dist/angle} := o_C - o_C^-$, where $o_C^-$ is the camera location preceding $o_C$, and

for $c_{safe}$: $T_{t,safe} := v - o_C$, where v is that vertex of a freespace cell $S_f(v)$ of which $o_C$ is an element, for which $w(\|v - o_C\|)\, d_f(v)$ is minimum.
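The heuristic of chapter 5.3 combines the per-component translations listed above into a single step by a weighted average and shortens the step if it would leave the freespace. The following is a minimal sketch of that combination, not the authors' implementation. The function names, the dictionary layout, and the `in_freespace` predicate are hypothetical, and the repeated halving is a simplified stand-in for the shortening rule of the paper, which takes the point half-way to the freespace exit point of the ray.

```python
from typing import Callable, Dict, Tuple

Vec3 = Tuple[float, float, float]


def blend_translations(proposals: Dict[str, Vec3],
                       weights: Dict[str, float]) -> Vec3:
    """Weighted average of the per-component proposals T_{t,j}.

    `proposals` maps a component name (e.g. "dist", "safe") to its
    translation vector; `weights` holds the engine-controlled weights.
    Both layouts are assumptions made for this sketch.
    """
    total = sum(weights[j] for j in proposals)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    return tuple(
        sum(weights[j] * proposals[j][k] for j in proposals) / total
        for k in range(3)
    )


def shorten_step(o_c: Vec3, step: Vec3,
                 in_freespace: Callable[[Vec3], bool],
                 max_halvings: int = 16) -> Vec3:
    """Halve the step until o_c + step lies in freespace.

    `in_freespace` is a hypothetical predicate; with a VFS-rep it would
    be answered by the distance calculation of chapter 5.1.
    """
    for _ in range(max_halvings):
        target = tuple(o + s for o, s in zip(o_c, step))
        if in_freespace(target):
            return step
        step = tuple(0.5 * s for s in step)
    return (0.0, 0.0, 0.0)  # give up: stay at the current position
```

Raising the weight of one component, for example the visibility component while the player is lost, pulls the blended step towards that component's proposal without changing the code path.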
6 Performance evaluation

We have implemented a camera control module based on this approach within the Half-Life environment [18]. Half-Life is a network-based 3D computer game. It works according to the client-server concept (Figure 3). With Half-Life, a free software development kit is provided which allows implementing custom game logic in the server and the functionality of the clients. The camera engine is part of a client.

For Half-Life, several 3D game scenes are available. We have used some of them for the evaluation of our camera control solution: snark pit, stalkyard, rapidcore, datacore, frenzy, lambdabunker, subtransit, and undertow. Figure 4, top, shows the scene datacore from outside. In the following we use this scene as a typical example.

A reasonable heuristic for the camera view is to hold $z_C$, $z_P$, and $o_P - o_C$ collinear and equally directed during the motion. In this way, the camera follows the head of the player from behind. Furthermore, the vector $u_C$ is always perpendicular to the z-axis of the world coordinate system of S. The direction of $v_C$ is chosen so that $v_C$ has a positive component in the direction of the z-axis of the coordinate system of S.

The relation between two consecutive camera configurations C and $C^+$, here $C := C_i$ and $C^+ := C_{i+1}$, is described by a transformation T, that is $C^+ = T(C)$. "Seeing in a suitable manner" is expressed by an objective or cost function c which depends on S, the next configuration $P^+$ of the player, the current configuration C of the camera, the unknown new configuration $C^+ = T(C)$ of the camera corresponding to $P^+$, and further parameters c supplied by the camera control engine. The parameters c allow the control engine to influence the camera behavior globally. c comprises for instance the desired distance between the camera and the player, which might be changed depending on the current location.
The goal of optimization is to find a transformation $T_{opt}$ which minimizes c, that is

$$T_{opt} = \arg\min_T \; c(S, P, P^+, C, T(C), c)$$

where the minimum is taken over all feasible transformations T. A transformation is feasible if the freespace constraint is satisfied, that is, T(C) is in the freespace of S.

In the following simple example of a cost or objective function, we assume T to be a rigid motion. The internal camera parameters defining the perspective mapping are held constant. The sample cost function consists of several components,

$$c(S, P, P^+, C, T(C), c) := c_{vis}(S, P^+, T(C)) \cdot \big( c_{dist}(S, P^+, T(C)) + c_{angle}(S, P^+, T(C)) + c_{\Delta dist}(S, C, T(C)) + c_{\Delta angle}(C, T(C)) + c_{safe}(S, P, P^+, C, T(C)) \big)$$

where

$c_{vis}(S, P^+, T(C)) = 1/\gamma$, where $\gamma$ is the opening angle of a maximum view cone at $o_C^+$ with axis in direction of $o_P^+ - o_C^+$, so that no obstacle of S is between $C^+$ and $P^+$ in the cone. If $o_P^+$ is not visible from $o_C^+$, then $\gamma$ is set to 0.

$c_{dist}(P^+, T(C), d_{opt})$ = the difference between the desired distance $d_{opt}$ between the camera and $o_P^+$ and the actual distance $\|o_C^+ - o_P^+\|$.

$c_{angle}(P^+, T(C))$ = the absolute angle between the desired direction $z_P^+$ of the camera view on $o_P^+$ and the actual view direction $o_C^+ - o_P^+$.

$c_{\Delta dist}(C, T(C))$ = the absolute difference between $\|o_C^+ - o_C\|$ and $\|o_C - o_C^-\|$, where $o_C^-$ is the camera location preceding $o_C$.

$c_{\Delta angle}(C, T(C))$ = the absolute angle between the vectors $o_C^+ - o_C$ and

segment between $o_C$ and $o_P$, which is completely in the freespace. However, the exact calculation of this cone is complicated. For that reason, $\gamma$ is replaced with another heuristic measure which is sufficient for the purpose. If visibility between $o_C$ and $o_P$ has been detected, we take the minimum of the estimated distances of the $b_i$, $i = 1, \ldots, m$, to S, where the $b_i$ are the points calculated by the visibility procedure.
The estimated distance is taken as $d_f(\bar{b}_i) - \|b_i - \bar{b}_i\|$ where $\bar{b}_i$ is a nearest neighbor of $b_i$ on the triangle $v_{i+1}$.

5.3 Minimization of the objective function

We use a simple heuristic in order to find a translational component $T_{t,opt}$ of $T_{opt}$ which yields an approximate minimum of the objective function c of chapter 3. The approach is to choose an optimal or at least a favorable solution $T_{t,j}$ for every component $c_j$, $j \in J := \{vis, dist, angle, \Delta dist, \Delta angle, safe\}$, of c, and to take $T_{t,opt}$ as a weighted average of the $T_{t,j}$. If $T_{t,opt}$ yields a point outside the freespace, it is shortened so that the resulting point $o_C^+$ stays in the freespace. A possibility is to take $o_C^+$ half-way between $o_C$ and the point at which the ray from $o_C$ towards $o_C^*$ leaves the freespace. The weights can be controlled by the camera engine.

If the value of the visibility component $c_{vis}$, evaluated at $o_C^+$, is less than a given threshold provided by the camera engine, the procedure is iterated for $o_C^+$. Otherwise, a procedure searching for the player is initiated.

The searching procedure calculates a shortest polygonal path p on the VFS-rep mesh V(S) between $o_C^+$ and $o_P^+$. The second vertex b of the resulting path is taken to define the translational component $T_{t,vis}$, which contributes to the overall translational component by $T_{t,vis} := b - o_C$. The contribution of $T_{t,vis}$ is strengthened by increasing its weight if the player does not become (sufficiently) visible in subsequent steps. In this way, the camera is finally forced onto the roadmap provided by the triangular mesh V(S). If the player does not become visible even then within a given time limit, the camera executes a jump to the player.

$c_{safe}(S, C, T(C), w) = w(\|o_C^+ - o_C\|) / d_{free}(o_C^+)$, where $w(\cdot)$ is a monotone function controllable by the camera engine.

The shortest path is searched by the A*-algorithm [13].
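The shortest-path search on the mesh can be illustrated with a generic A* implementation. This is a minimal sketch under stated assumptions: the mesh V(S) is given as a dictionary of vertex coordinates plus an adjacency dictionary (both hypothetical layouts), edge costs are Euclidean lengths, and the straight-line distance to the goal serves as heuristic. The paper does not state which heuristic its A* search uses.

```python
import heapq
import math


def a_star(vertices, adjacency, start, goal):
    """Shortest path from `start` to `goal` along mesh edges.

    vertices:  dict vertex id -> (x, y, z)      (assumed layout)
    adjacency: dict vertex id -> iterable of neighboring vertex ids
    Returns the list of vertex ids on the path, or None if unreachable.
    """
    def dist(a, b):
        return math.dist(vertices[a], vertices[b])

    # Heap entries are (f = g + h, g, vertex); h is the straight-line
    # distance, which never overestimates the remaining path length.
    open_heap = [(dist(start, goal), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g, u = heapq.heappop(open_heap)
        if u in closed:
            continue  # outdated heap entry
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        closed.add(u)
        for v in adjacency[u]:
            g_new = g + dist(u, v)
            if g_new < best_cost.get(v, math.inf):
                best_cost[v] = g_new
                parent[v] = u
                heapq.heappush(open_heap, (g_new + dist(v, goal), g_new, v))
    return None
```

With a path `p = a_star(...)` in hand, the second vertex `p[1]` plays the role of b in $T_{t,vis} := b - o_C$.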
With the possible exception of the first and last point, the vertices of p belong to V(S), and its edges are edges of V(S). If $o_C^+$ or $o_P^+$, respectively, is not a vertex of V(S), a vertex v of a triangle of V(S) is selected which has the respective point in its freespace. The vertex with the shortest distance to $o_C^+$ or $o_P^+$, respectively, is taken for this

$o_C - o_C^-$, where $o_C^-$ is the camera location preceding $o_C$.

The model behind $c_{safe}$ is that the faster the camera moves, expressed by $w(\|o_C^+ - o_C\|)$, the higher the distance $d_{free}(o_C^+)$ to the obstacles of the scene should be. The other functions demand that the player should be visible to the camera ($c_{vis}$), that the camera should hold a given distance and orientation to the player ($c_{dist}$, $c_{angle}$), and that the camera has a certain inertia ($c_{\Delta dist}$, $c_{\Delta angle}$).

By applying the definition of the camera view given above, just the translational component of T remains as an open parameter which can be used for minimization.

For the solution of this "tracking version", two cases are distinguished. If c is not in the freespace of $v^-$, the triangles $v'$ adjacent to $v^-$ are determined for which c is in the freespace of $v'$. If at least one is found, let $\bar{c}$ be a point on $v'$ closest to c. Then the maximum of the values $d_f(\bar{c}) - \|c - \bar{c}\|$ over the triangles $v'$ is reported as $d_f(c)$. If none is found, this fact is reported. It means that c is not in freespace, or that the step size of tracking has possibly been too big.

If c is in the freespace of $v^-$, the same calculation is made for $v^-$ instead of the adjacent $v'$.

At the beginning of tracking, a brute force initialization is performed. All Voronoi triangles v are tested for membership of c in their freespace. The test can be performed by checking, for a nearest neighboring point $\bar{c}$ of c on v, whether the free distance $d_f(\bar{c})$ of $\bar{c}$ exceeds the distance between c and $\bar{c}$.
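The membership test and the brute force initialization can be sketched as follows. The sketch makes two assumptions that the text does not state: the free distance on a triangle is obtained by linear (barycentric) interpolation of the values stored at its three vertices, and triangles are passed as plain coordinate triples. The closest-point routine is the standard region-based method for point-triangle distance; all function names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def closest_point_barycentric(p, a, b, c):
    """Barycentric coordinates of the point on triangle abc closest to p."""
    ab, ac, ap = _sub(b, a), _sub(c, a), _sub(p, a)
    d1, d2 = _dot(ab, ap), _dot(ac, ap)
    if d1 <= 0 and d2 <= 0:
        return (1.0, 0.0, 0.0)                 # vertex region of a
    bp = _sub(p, b)
    d3, d4 = _dot(ab, bp), _dot(ac, bp)
    if d3 >= 0 and d4 <= d3:
        return (0.0, 1.0, 0.0)                 # vertex region of b
    vc = d1 * d4 - d3 * d2
    if vc <= 0 and d1 >= 0 and d3 <= 0:
        t = d1 / (d1 - d3)
        return (1.0 - t, t, 0.0)               # edge region of ab
    cp = _sub(p, c)
    d5, d6 = _dot(ab, cp), _dot(ac, cp)
    if d6 >= 0 and d5 <= d6:
        return (0.0, 0.0, 1.0)                 # vertex region of c
    vb = d5 * d2 - d1 * d6
    if vb <= 0 and d2 >= 0 and d6 <= 0:
        t = d2 / (d2 - d6)
        return (1.0 - t, 0.0, t)               # edge region of ac
    va = d3 * d6 - d5 * d4
    if va <= 0 and (d4 - d3) >= 0 and (d5 - d6) >= 0:
        t = (d4 - d3) / ((d4 - d3) + (d5 - d6))
        return (0.0, 1.0 - t, t)               # edge region of bc
    denom = 1.0 / (va + vb + vc)
    v, w = vb * denom, vc * denom
    return (1.0 - v - w, v, w)                 # interior of the face


def free_distance_estimate(c, tri, df):
    """Return d_f(c_bar) - ||c - c_bar||, or None if c is not in the
    freespace of the triangle. `df` holds the free distances at the three
    vertices; interpolating them linearly is an assumption of this sketch."""
    u, v, w = closest_point_barycentric(c, *tri)
    c_bar = tuple(u * tri[0][k] + v * tri[1][k] + w * tri[2][k] for k in range(3))
    slack = (u * df[0] + v * df[1] + w * df[2]) - math.dist(c, c_bar)
    return slack if slack > 0 else None


def brute_force_init(c, triangles, dfs):
    """Index of the first triangle whose freespace contains c, else None."""
    for i, (tri, df) in enumerate(zip(triangles, dfs)):
        if free_distance_estimate(c, tri, df) is not None:
            return i
    return None
```

During tracking, the same `free_distance_estimate` would only be evaluated for the previous triangle and its neighbors, which is what keeps the per-frame cost low compared to the full scan in `brute_force_init`.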
5.2 Visibility calculation

The task of visibility calculation is to check whether the line segment between the viewpoint $o_C$ of the camera and the point of interest $o_P$ of the player is completely in the freespace. It is assumed that a triangle v is given so that $o_C$ is in its freespace $S_f(v)$. The problem is solved by calculating a sequence of points $c_i$, $i = 0, \ldots, m$, so that $c_0 = o_C$, $c_m = o_P$, $c_i$ is in the freespace $S_f(v_i)$ of a triangle $v_i$ of V(S), and $c_i$ is in $S_f(v_{i-1})$, for $i > 0$, too.

The sequence is constructed by successively finding the triangles $v_i$, as follows. If $o_P$ is in $S_f(v_{i-1})$ of the triangle $v_{i-1}$ of the current point $c_{i-1}$, the algorithm terminates. Otherwise, the neighboring triangle v of $v_{i-1}$ with the farthest intersection point c of the freespace $S_f(v)$ with the ray from $c_{i-1}$ towards $o_P$ is determined. Then $c_i := c$ and $v_i := v$. If the iteration has reached $o_P$ after termination, visibility between $o_C$ and $o_P$ is reported. Otherwise, the exit point c of the ray with respect to the freespace of $v_{i-1}$ is reported. In this case, the camera has lost the player. Then $c_0$ is used as the starting point of a search by the camera, which is described in chapter 5.3.

4 Calculation of the Voronoi freespace representation (VFS-rep)

The problem of calculating the Voronoi freespace representation (VFS-rep) is to find, for a given 3D scene S of polygonal obstacles enveloped in a bounding volume, a Voronoi freespace representation of the freespace of S inside the bounding volume.

We solve the problem in three steps. The first two steps are sampling of the obstacles of S and calculation of the freespace on the resulting set of data points. The third step, data reduction, is not obligatory, but usually improves the space and time requirements of the tracking phase significantly.

Before we start with the description of the solution, we briefly recall the definition of Voronoi diagrams; more details can be found e.g.
in the survey by Aurenhammer [1] and the books by Edelsbrunner [8] or de Berg et al. [4]. Given a finite set of disjoint sites in d-dimensional Euclidean space, the Voronoi region (or Voronoi cell) of a site s is the region of all points in space that are closer to s than to any other site. The Voronoi diagram is the decomposition of the space into Voronoi cells.

Figure 2, top, shows the Voronoi diagram in the case of 2D points as sites. In the case of points, the Voronoi cells are convex polytopes (convex polygons in the plane). Their vertices are called Voronoi points, their edges Voronoi edges, and their faces Voronoi faces. Evidently, the boundaries of the Voronoi cells have maximal distance to the sites and coincide with the medial axis of the freespace between the sites.

The component $c_{vis}$ of the objective function of camera control in chapter 3 depends on the opening angle $\gamma$ of a maximum view cone in which the camera sees the player without occlusions caused by the scene. A possibility is to use the opening angle of the maximum cone with tip at $o_C$ and axis on the line

is removed and the resulting hole is triangulated. After every edge swapping, the free distance values $d_f$ of the vertices of the involved Voronoi triangles are updated. The new values of $d_f$ are chosen so that they yield a new freespace which is inside the original one. Several cases have to be distinguished for this purpose. The main parameters used are the distance between the original edge and the edge resulting from swapping, and the locations of the points on both edges between which the minimum distance is reached.

The mesh reduction algorithm begins by evaluating the cost function for every vertex of the mesh V(S). The cost of a vertex is calculated by tentatively eliminating the vertex from the mesh and calculating the difference between the new and the old local freespace volume. Then the vertices are arranged into a priority queue according to increasing costs.
Vertices are eliminated in the order of the priority queue as long as their costs are less than a given threshold. After elimination of a vertex, the costs of the involved neighboring vertices are updated, and the vertices are re-inserted into the priority queue according to their new costs.

Figure 2: Approximate calculation of a medial axis of a polygonal scene, illustrated on a 2D example. The first picture shows the sampling points and the resulting Voronoi diagram. The second picture shows an approximation of the medial axis obtained by removing the Voronoi edges induced by closely neighboring sampling points. The white area indicates the region of interest inside a bounding volume which corresponds to the polygon. The medial axis outside this region is omitted. The circles indicate the property of the points on the medial axis to have equal distance to at least two sampling points.

4.1 Sampling of the scene

The goal of sampling is to replace the scene S of polygonal obstacles with a set of point obstacles P. In this way, the curved boundaries of the Voronoi cells are approximated by piecewise flat boundaries, as shown in Figure 2, bottom. The advantage is that algorithms for point Voronoi diagrams are much easier to implement. A disadvantage is that a good approximation needs a large number of sampling points. However, as we will see, the efficiency of algorithms for

5 Camera control

In the following we show how the VFS-rep of a given scene S can be used for camera control. We first describe how the distance of the camera to the obstacles of S and the visibility between the camera and the player can be calculated with the VFS-rep. Then we present a heuristic for the minimization of the objective function c of chapter 3.

5.1 Distance calculation

The task of distance calculation is to find, for an arbitrary point c, a value $d_f(c)$ so that the ball centered at c with radius $d_f(c)$ belongs to freespace.
df(c) should not be too far from the maximum possible radius. If no such value exists, this fact should be reported, too. During camera tracking, the following version of distance calculation is relevant. A predecessor c− of c and some additional data are known. The additional data consist of a triangle v− of the VFS-rep so that c− is in the freespace of v−.

point Voronoi diagrams and the memory resources of today's PCs make this approach practical also in three dimensions. The approximation of Voronoi diagrams by point sampling has also been used by, e.g., Geiger et al. [10].

A side effect of the sampling approach is that it harmonizes perfectly with point-based modeling[15]. If the scene is already represented by a cloud of points, the sampling step is not necessary. In particular, scene geometry acquired by high-resolution point-based 3D-scanners[3] may be used without surface reconstruction.

The offset d0 is subtracted in the definition of df(vi) so that the resulting freespace Sf(v) does not intersect S. If Sf(v) intersected S, there would be a point on S at distance ≥ d0 from every sampling point, in contradiction to the sampling condition.

The definition of df(vi) yields an approximation of the freespace by sets Sf(v) of constant thickness each. This means that the function df is discontinuous. Furthermore, several values df(v) need to be stored at every Voronoi point v, one for each incident triangle. A continuous df can be obtained by assigning the minimum of these values to v. Although this shrinks the freespace somewhat, we have used this representation. It has turned out to be sufficient for our purpose.

4.3 Reduction of the Voronoi freespace representation

Uniform sampling of sufficient density produces a considerable number of triangles in the mesh V(S) of the VFS-rep.
The sampling density is chosen according to the requirements of the narrowest interesting regions of the freespace, and in this way it often exceeds the requirements of large regions of freespace. In large regions, a lower number of larger Voronoi triangles would be sufficient. We achieve this goal by mesh reduction. We reduce V(S) by vertex elimination according to an approach inspired by the algorithm of Hoppe et al.[12]. In contrast to the original algorithm, we use just edge swapping for degree reduction. Vertex splittings or edge contractions are excluded since they would introduce new vertices. In the main phase of the algorithm, only vertices with manifold neighborhood are removed, that is, vertices without non-manifold incident edges. At the end, non-manifold vertices of a specific simple type are removed, too.

The point sampling has to satisfy the following

Sampling condition. Let ds be a function on S which assigns a nonnegative sampling distance to every point on S. The sampling condition is satisfied if every point q on S has a neighboring sampling point at distance less than ds(q).

A particular type of sampling which exemplifies this general definition is uniform sampling. For uniform sampling, ds(q) := d0 for all q ∈ S, where d0 > 0 is a constant. This reduces the sampling condition to the requirement that every point q ∈ S has a neighboring sampling point p ∈ S at distance less than the given bound d0 > 0.

The bound d0 defines a tolerance which has to be fulfilled in order that a region in space is considered as interesting freespace. This means that small openings or environments of concave corners are ignored. The value of d0 defines the amount of tolerance. A possible choice is to make d0 dependent on the size of the player.

The energy function which controls vertex elimination in the algorithm by Hoppe et al. is replaced with a cost function based on the volume of the freespace.
The decision on vertex elimination is based on the difference between the new volume of the resulting freespace and the old volume of the freespace of the replaced triangles. If the difference exceeds a given threshold, the operation is not executed.

The removal of a vertex v starts with swapping edges incident to v. The goal is to reduce the degree of v to three. If this goal is achieved, v

Another example is medial-axis adaptive sampling. Medial-axis adaptive sampling is defined by ds(q) := max{c0 · dm(q), d0}, where dm(q) is the distance of q from the medial axis of S, and 0 < c0 < 1 and d0 > 0 are given constants. d0 plays the same role as for uniform sampling. The density of the sampling points depends on the distance from the medial axis, and thus on the extent of freespace in the environment of a surface region. If the freespace is narrow, the sampling density is high, and if there is much space, the points are sampled at a low density.

A difficulty with medial-axis adaptive sampling is that the medial axis usually is not known in advance. It is an interesting open question beyond the scope of this paper to work out this sampling approach, possibly based on cost-efficient estimates of the medial axis. In this paper, and in our implementation, we have used uniform sampling.

The triangles t of the input scene S are sampled independently as follows. First, those edges of t of length ≥ 2d0 are subdivided. Then t is triangulated so that the subdivision points are included in the resulting triangulation. The procedure is iterated on the resulting triangles. The vertices of the resulting triangulation are the desired sampling points.

4.2 Calculation of the freespace

The VFS-rep consists of a spatial triangular mesh V(S). V(S) is an approximation of the medial axis of the original scene S, that is, of the surfaces of the Voronoi diagram of S which separate the Voronoi cells.
The triangular mesh V(S) need not be manifold, that is, edges with more than two incident triangles may exist. In the 2D-analogue of figure 2, bottom, the polygonal chain of the medial axis corresponds to the triangular mesh in space, and the branching points of the medial axis correspond to non-manifold edges in space.

V(S) results by triangulation from Voronoi faces of the point-based Voronoi diagram which are specified later. For that reason we call these triangles Voronoi triangles. Each vertex vi of a Voronoi triangle v refers to a so-called free distance value df(vi), i = 0, 1, 2. By barycentric interpolation, a distance value $d_f(v) := \sum_{i=0}^{2} \mu_i \cdot d_f(v_i)$ can be assigned to every point v of the triangle v, where the µi, i = 0, 1, 2, are obtained by solving the equation $v = \sum_{i=0}^{2} \mu_i \cdot v_i$. The freespace Sf(v) of v is defined as the union of all balls with center m on v and radius df(m). Hence the free distance values have to satisfy the constraint that Sf(v) has an empty intersection with the scene S.

In the data structure of a VFS-rep, every Voronoi triangle refers to its three Voronoi vertices. Every Voronoi vertex refers to a list of its incident Voronoi triangles.

The VFS-rep is calculated as follows.

1. Calculate the Voronoi diagram of the sampled scene of obstacles, including, for every Voronoi point, the input point at minimum distance.
2. Remove all Voronoi faces whose generating sampling points p1, p2 satisfy d(p1, p2) < 2 · d0, or which are outside of the bounding volume of the scene.
3. Triangulate the remaining Voronoi faces into Voronoi triangles.
4. Assign a free distance value df to every Voronoi vertex.

For the calculation of Voronoi diagrams in step 1, several efficient algorithms are known[8, 1, 4]. The approach we use in our implementation is to lift the input points onto a hyper-paraboloid in 4D-space. The "bottom part" of the convex hull of the resulting points yields a Delaunay triangulation of the input points.
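The lifting construction can be illustrated one dimension lower, for points in the plane lifted onto the paraboloid z = x² + y² in 3D. The following brute-force sketch is an assumption-laden illustration and not the QHULL-based implementation of the report: it tests every triple of points and keeps those whose lifted plane has no other lifted point strictly below it, which is exactly the condition for a facet of the lower convex hull. The names `lift`, `lifted_det` and `delaunay_by_lifting` are hypothetical.

```python
from itertools import combinations

def lift(p):
    """Lift a 2D point onto the paraboloid z = x^2 + y^2."""
    x, y = p
    return (x, y, x * x + y * y)

def lifted_det(a, b, c, d):
    """3x3 determinant of the lifted differences a-d, b-d, c-d.

    If a, b, c project to a counter-clockwise triangle, the value is positive
    exactly when the lifted point d lies strictly below the plane through
    a, b, c (equivalently: d lies inside the circumcircle of the triangle).
    """
    ax, ay, az = (a[i] - d[i] for i in range(3))
    bx, by, bz = (b[i] - d[i] for i in range(3))
    cx, cy, cz = (c[i] - d[i] for i in range(3))
    return (ax * (by * cz - bz * cy)
            - ay * (bx * cz - bz * cx)
            + az * (bx * cy - by * cx))

def delaunay_by_lifting(points):
    """Return the Delaunay triangles of 2D points as sorted index triples."""
    lifted = [lift(p) for p in points]
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        (ax, ay), (bx, by), (cx, cy) = points[i], points[j], points[k]
        area2 = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        if area2 == 0:
            continue                      # collinear points span no triangle
        a, b, c = lifted[i], lifted[j], lifted[k]
        if area2 < 0:
            b, c = c, b                   # enforce counter-clockwise order
        others = (m for m in range(len(points)) if m not in (i, j, k))
        if all(lifted_det(a, b, c, lifted[m]) <= 0 for m in others):
            triangles.append((i, j, k))   # facet of the lower convex hull
    return triangles
```

The sketch takes time proportional to the fourth power of the number of points and is meant only to make the geometric criterion concrete; a convex hull code is the practical choice.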
The dual graph of the Delaunay triangulation is the Voronoi diagram. The convex hull can, e.g., be calculated using the QHULL software[2].

In step 2, the generating points p1 and p2 of a face are the input points whose Voronoi regions share this face. The idea behind this choice of Voronoi faces to be removed is that faces induced by points p1 and p2 at distance ≥ 2d0 do not contain points on surfaces of S. The reason is that all points on such faces have a distance of at least d0 to all sampling points, since p1 and p2 are closest sampling points by definition. According to the sampling condition, points without a sampling point at distance less than d0 are not on S. On the other hand, it happens that Voronoi faces or parts thereof in freespace are lost. These faces, however, should usually be very close to the surface of S and thus are not relevant in our application.

The Voronoi faces are plane polygons. Thus the triangulation in step 3 can be performed straightforwardly. We have used the Delaunay triangulation[8, 4], which yields triangulations of the Voronoi faces since they are convex. An advantage of the Delaunay triangulation is that it should avoid "thin" triangles. This, however, does not necessarily hold for triangles at the boundary of the triangulation, and other approaches to triangulation might be used instead.

The free distance value df(v) of step 4 is chosen so that the convex hull of the spheres of radius df(vi) at the vertices vi of a Voronoi triangle v, i = 0, 1, 2, is a subset of the freespace of S. For the case of uniform sampling, this constraint is satisfied by df(vi) = ||p2 − p1||/2 − d0, i = 0, 1, 2, where p1 and p2 are the sampling points inducing the Voronoi face from which v has emerged. The term ||p2 − p1||/2 comes from the fact that p1 and p2 belong to the closest sampling points of every point v on v, and that ||v − p1|| = ||v − p2|| ≥ ||p2 − p1||/2. Hence this term yields an empty intersection with the sampling points.
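The face filter of step 2 and the free distance value of step 4 for uniform sampling amount to two short formulas. The following sketch states them as code under the assumption that each Voronoi face is given by its two generating sampling points; the function names `keep_face` and `free_distance_uniform` are hypothetical, and the bounding-volume test of step 2 is left out.

```python
import math

def keep_face(p1, p2, d0):
    """Step 2 (distance part): keep a Voronoi face only if its generating
    sampling points p1, p2 are at least 2*d0 apart."""
    return math.dist(p1, p2) >= 2.0 * d0

def free_distance_uniform(p1, p2, d0):
    """Step 4 for uniform sampling: df = ||p2 - p1|| / 2 - d0 for all three
    vertices of a Voronoi triangle that emerged from the face of p1 and p2."""
    return math.dist(p1, p2) / 2.0 - d0
```

For every face that passes `keep_face`, the value returned by `free_distance_uniform` is nonnegative, so the two steps are consistent with each other.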
