Performance Isolation in Virtualized Machines

Rahul C. Gundecha (06329017)
M. Tech Stage-II Report
Department of Computer Science and Engineering
Indian Institute of Technology, Bombay
Email: rahul[at]cse.iitb.ac.in
Guide: Prof. Varsha Apte (varsha[at]cse.iitb.ac.in)
Date: January 8, 2007

CONTENTS

I. Introduction
II. Problem description
   II-A. Virtualization
   II-B. Scheduling of virtual machines
   II-C. Performance isolation and application QoS
   II-D. Problem definition
III. Solution approach
   III-A. Related work
   III-B. Feedback control theory basics
   III-C. Preliminary results
   III-D. Architecture of QoS aware virtualized environment
   III-E. Feedback control system design
IV. Testbed for QoS aware virtualized environment
   IV-A. Components of the testbed
   IV-B. Implementation issues
   IV-C. Workload description
V. Performance evaluation
VI. Conclusion
   VI-A. Future work
References

I. INTRODUCTION

The task of predicting and maintaining system performance and planning capacity is becoming difficult due to the increased complexity of IT applications and infrastructure. Service providers host applications from different enterprise clients on a shared pool of hardware resources. Clients negotiate a service contract in the form of a Service Level Agreement (SLA) with the service provider, which records the formal terms of the contract and its performance guarantees. The performance guarantees include QoS (quality of service) requirements such as the desired response time or throughput of the application. Degraded performance leads to penalty costs for SLA violations as well as dissatisfied clients, which ultimately results in financial loss for the service provider. Over-provisioning of hardware resources has always been the easiest choice for service providers to avoid performance problems, but it leads to inefficient and costlier resource management.

Nowadays server virtualization is heavily used to build IT infrastructure, as it allows sharing of resources among different applications while providing an isolated environment, called a virtual machine, for each application. A virtual machine hosts an OS (operating system) in a secured, isolated environment consisting of virtual CPU, main memory and IO devices. Virtual machine monitors (VMMs) such as VMware and Xen perform the tasks of protection and resource allocation among the individual virtual machines. Benefits of server virtualization include consolidation of multiple OSes on a single physical server and live migration of a virtual machine from one physical server to another. With these capabilities, managing a server farm becomes easier and more cost effective. However, sharing of resources should not cause the performance of an application to be adversely affected by the other applications running on the same hardware.
Gupta et al. describe performance isolation as the scenario in which the performance of a client application remains the same regardless of the type and amount of workload of the other applications sharing the resources. Performance isolation is an important goal in any shared hosting environment, such as a virtualized environment. It can be achieved by properly allocating resources among competing virtual machines. The VMM allocates a share of resources such as CPU and main memory to each virtual machine. For example, the CPU scheduler in Xen accepts two parameters, named weight and cap, for each virtual machine. Weight represents the relative share of a virtual machine, whereas cap represents the upper bound on CPU consumption by that virtual machine. Performance isolation can be achieved by setting appropriate values of resource management parameters such as weight and cap for each virtual machine.

The dynamic nature of the workload should be considered while modeling the performance behavior of the applications residing in virtual machines. Client SLAs change frequently, and clients are continuously added and removed. The same holds for the underlying hardware infrastructure, which is frequently scaled or upgraded with new hardware components. With so many sources of dynamics, delivering QoS to the applications hosted in virtual machines becomes more complex. Our study focuses on devising a mechanism for computing the share of resources to be allocated to each virtual machine such that the desired QoS is delivered to the applications running inside the virtual machines. In this study, we apply feedback control theory to maintain the performance of the applications running inside virtual machines. Feedback control performs online analysis of the system and attempts to maintain the output of the system around the desired values.
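To make the weight and cap parameters concrete, the following sketch builds the `xm sched-credit` command used to set them from Domain-0 under Xen's credit scheduler. The domain name and values here are illustrative, not taken from the report's experiments.

```python
def sched_credit_cmd(domain, weight, cap):
    """Build the `xm sched-credit` command that sets the relative
    weight and the CPU cap (in percent of one CPU) for a domain.
    A cap of 0 means "no cap" in the credit scheduler."""
    return ["xm", "sched-credit", "-d", domain,
            "-w", str(weight), "-c", str(cap)]

# A relative weight of 256 and a cap of 50% of one CPU for domain vm3:
cmd = sched_credit_cmd("vm3", 256, 50)
print(" ".join(cmd))
```

In a live deployment this command list would be passed to the shell from the host OS; here we only construct it, since the exact invocation depends on the Xen tools version installed.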
In the virtualized environment scenario, the output refers to the QoS requirements of the clients that need to be satisfied. The controller in a feedback system computes the values of input parameters that affect the working of the system, which in turn affects the output delivered by the system. In a virtualized environment, the input parameters are resource management parameters such as the main memory allocated to a guest OS, or scheduler-specific parameters such as weight, cap and time-slice for a guest OS.

In the first stage of this Master's project, we analyzed the feasibility of using feedback control on a simple testbed consisting of a virtual machine hosting a webserver. The results showed an approximately linear trend in the throughput of the webserver against variations in cap. This result motivated the use of linear models from feedback control theory to model the system and design a feedback control system. In the second stage, we designed a feedback control system that tunes the resource management parameters of each application such that QoS is maintained at the desired level. In our architecture of a QoS aware virtualized environment, we designed and implemented three modules: controller, capacity analyzer and sensors. Sensors measure the QoS delivered to the applications. Controllers compute the new value of the cap of a virtual machine depending upon the measurements received from the sensors. The capacity analyzer verifies whether the resource demands of all the virtual machines running on the same physical server can be fulfilled. If not, an appropriate action, such as migration of a virtual machine to another physical server, needs to be taken to maintain the QoS of the applications; otherwise the QoS of some or all applications will degrade.
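A minimal sketch of the feasibility check performed by the capacity analyzer. The function names and the proportional-scaling fallback are our illustration under simple assumptions, not the report's implementation: caps are expressed in percent of one CPU, so a four-core server has a total capacity of 400.

```python
def feasible(caps, total_capacity):
    """Return True if the requested caps (percent of one CPU each)
    fit within the physical server's total CPU capacity."""
    return sum(caps) <= total_capacity

def scale_down(caps, total_capacity):
    """If the demands are infeasible, one simple fallback (before
    resorting to migration) is to shrink all caps proportionally."""
    total = sum(caps)
    if total <= total_capacity:
        return list(caps)
    factor = total_capacity / total
    return [c * factor for c in caps]

print(feasible([180, 100, 90], 400))   # 370 <= 400, fits on 4 cores
print(scale_down([300, 200], 400))     # scaled proportionally
```

In the report's architecture the preferred action on infeasibility is migration rather than scaling, since shrinking caps would degrade some application's QoS.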
We plan to work on the issues regarding migration of virtual machines to avoid degraded QoS in the third stage of this Master's project. Mainly, our work will focus on devising an algorithm that monitors the applications and, depending upon resource demands, decides which virtual machines to migrate and to which physical servers.

We evaluated the controller by deploying a testbed consisting of two instances of a two-tier application on a shared set of resources using virtualization. The testbed also included client machines for continuous load generation to simulate a real world scenario. Each application tier is deployed in a different virtual machine. The testbed includes sensors in the form of proxy servers which measure the response time delivered by the application. The results of the experiments carried out on the testbed demonstrate the effectiveness of the designed feedback control system.

Section II begins with an introduction to virtualization and the concept of performance isolation, and then defines the problem statement and goals of this Master's project. Section III presents the work done in the first stage of the project, covering feedback control theory basics and some preliminary results, and then describes our approach for developing the solution using control theory. Section IV describes the deployment details of the testbed used in the study. Section V presents the performance evaluation results. The report ends with the conclusion and future work.

II. PROBLEM DESCRIPTION

This section starts by describing the basics of virtualization. Subsequently we discuss the performance issues occurring in a virtualized environment, and then define the problem statement for this Master's project.

A. Virtualization

The term virtualization refers to the abstraction of resources. The user or the software process is not aware of the actual characteristics of the resource.
Rather, they get a view of the resource which is more familiar or more manageable to them. Our concern here is server/software virtualization, more popularly known as the virtual machine environment. Figure 1 shows a virtualized environment. Let us review some basic terms in server virtualization.

• Virtual Machine (VM): A virtual environment created by the VMM (described below), which simulates all the hardware resources needed by an operating system. The OS running in such an environment is called a guest OS. A guest OS has a virtual view of the underlying hardware.

• Virtual Machine Monitor (VMM/hypervisor): The interface between the guest OS and the underlying hardware. All administrative tasks, such as adding a new guest OS and allocating resources to each guest OS, are done through the VMM. Popular examples of VMMs are VMware and Xen. In our study, we have used the open source VMM Xen.

Fig. 1. Virtualized environment

• Host OS: The native OS running on the given hardware. The VMM is installed on the host OS, which has all privileges on the given hardware.

In simpler terms, we can describe virtualization as follows. The actual physical resources are divided into logical partitions, and each logical partition is allocated to some guest OS. Each guest OS runs independently on its partition. To the host OS, guest OSes are like normal processes running on it. The VMM interface is available in the host OS, through which the guest OSes are managed. The term domain is used interchangeably with virtual machine: the host OS is often called Domain-0, whereas guest OSes are called DomUs.

B. Scheduling of virtual machines

There are a number of alternatives for CPU scheduling in Xen, such as Borrowed Virtual Time (BVT), Simple Earliest Deadline First (SEDF) and the Credit scheduler, which schedule the virtual machines on the available set of processors.
The latest scheduler for Xen is the credit scheduler, a proportional fair-share SMP (symmetric multiprocessor) scheduler. Each domain (including the host OS) is assigned a number of virtual CPUs (VCPUs), a weight and a cap value. Weight denotes the share of a domain and is directly proportional to the CPU requirement of the domain. The cap specifies the maximum amount of CPU a domain will be able to consume, even if there is idle CPU. Thus the credit scheduler works in non-work-conserving mode when the sum of the caps of all domains is less than the available CPU capacity.

Each CPU manages a local run queue of runnable VCPUs, sorted by VCPU priority. A VCPU's priority can be over or under, depending upon whether that VCPU has exceeded its fair share of CPU in the ongoing accounting period. An accounting thread computes how many credits each virtual machine has earned and recomputes the credits. Until a VCPU consumes its allotted credits, its priority is under. A scheduling decision is taken when a VCPU blocks or completes its time slice, which is 30 ms by default. On each CPU, the next VCPU to run is picked from the head of the run queue. When a CPU does not find a VCPU of priority under on its local run queue, it looks on other CPUs for a VCPU with priority under. This load balancing mechanism guarantees that each domain receives its fair share of CPU, and no CPU remains idle when there is runnable work in the system.

C. Performance isolation and application QoS

In a virtualized environment, multiple software servers are hosted together on a single shared platform, and each server may belong to a different owner. For each server, the QoS requirements are expressed by the client through a Service Level Agreement (SLA) with the service provider. The task of the service provider is to maintain performance such that no client's SLA is violated. SLA violations have pre-specified penalty costs associated with them.
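The credit accounting described above can be sketched in simplified form. Real Xen accounting is per-VCPU and tick-driven; this toy model only illustrates the under/over priority rule that drives run-queue ordering.

```python
def burn_credits(credits, consumed):
    """Debit consumed CPU time from a VCPU's credit balance and
    return (remaining_credits, priority). A VCPU that still holds
    credits is 'under' its fair share; one that has exhausted them
    is 'over' and is scheduled only after all 'under' VCPUs."""
    remaining = credits - consumed
    priority = "under" if remaining > 0 else "over"
    return remaining, priority

print(burn_credits(300, 120))  # (180, 'under')
print(burn_credits(300, 350))  # (-50, 'over')
```

The load-balancing step then follows directly: a CPU whose local queue contains no 'under' VCPU steals one from another CPU's queue.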
QoS crosstalk occurs when maintaining QoS for one client results in degraded QoS for another client. Performance guarantees for the applications running inside virtual machines can be fulfilled only if there is performance isolation across the virtual machines. Figure 2 pictorially depicts the virtual machine environment scenario.

Fig. 2. Applications running inside a virtualized environment

Performance isolation as described by [?] is as follows: "Resource consumption by any of the virtual machines should not affect the promised performance guarantees to other virtual machines running on the same hardware." Over-provisioning of resources can be the simplest way to achieve performance isolation, but then the whole point of using virtualization is lost. The ultimate aim is to increase the benefit to the service provider through better resource utilization, under the constraint of delivering QoS to each client. Hence a better solution than over-provisioning is required. Let us see one example which illustrates this problem.

In an earlier study we showed that the behavior of applications running inside virtual machines is unpredictable when an IO load is running on at least one virtual machine.

Fig. 3. Effect of mixed load on performance

The experiment analyzed the effect of mixed-load applications on each other's performance. One application is CPU intensive and the other is a webserver. We first ran an experiment with only the webserver running in virtual machine vm3. The next experiment was carried out with the CPU intensive application running in vm2 while vm3 hosted the webserver. In both experiments we did not set a cap for the virtual machines. In both cases, CPU consumption by vm3 was the same, 180%, whereas in the second experiment vm2 consumed 100% CPU.
The testbed had four processor cores, so there was still some CPU capacity left. But the readings show a drastic change in the throughput of the webserver in the second experiment. Although vm3's CPU consumption was the same in both experiments, the quality of service (QoS) delivered was affected by the presence of the other virtual machine.

The experiment described above used a simple setup. In a real-life scenario, the situation can get worse in the presence of tens or hundreds of virtual machines sharing a pool of resources. Each virtual machine may host a different kind of application, with different workload patterns and different levels of desired quality of service. A change in any of the software components, such as a virtual machine or an application characteristic, or a change in any hardware resource, can adversely affect performance. Several studies have revealed a compelling need for a better performance isolation mechanism in Xen. This is also evident from the fact that three schedulers, Borrowed Virtual Time (BVT), Simple Earliest Deadline First (SEDF) and the Credit scheduler, have been proposed for virtual machine scheduling in Xen in the past four years. Lack of performance isolation causes degraded and unpredictable application performance. With this motivation, we define the problem in the following way.

D. Problem definition

Our work is in the context of providing performance isolation across virtual machines sharing resources. The most important objective of our work is to devise a mechanism to set resource management parameters for the virtual machines in such a way that the applications running inside the virtualized environment can deliver the clients' QoS guarantees. The client QoS requirements need to be translated into resource management parameters. Another important objective is to improve resource utilization under the constraint of maintaining client QoS.
This objective is important from the perspective of the service providers. For example, the client QoS requirements can be expressed in terms of the desired response time of the application, and the resource management parameter to be tuned can be the scheduler parameter cap of the virtual machine hosting the application. The value of cap represents the upper limit on CPU consumption by a virtual machine. The challenge is to design a robust mechanism for setting the cap of a virtual machine so as to maintain the response time of the application even in the presence of other workloads or variations in the operating environment.

III. SOLUTION APPROACH

In this section we present our mechanism to compute the resource management parameters of the virtual machines so as to deliver QoS to the applications running inside the virtualized environment. We applied a feedback control theoretic approach to develop the solution. The basic idea of feedback control systems is that they work on the basis of the feedback they receive from the system at runtime; therefore building a very accurate model of the system is not necessary. Also, as feedback control works on feedback from a running system, it can respond quickly to variations occurring in the system. Another alternative for developing the solution is queueing theory. But queueing models do not handle feedback and are not good at characterizing transient behavior under overload. Also, queueing models perform off-line predictive analysis, whereas feedback control theory performs online analysis, which makes it more robust to changes in the operating environment.

A. Related work

Recently there has been a lot of research work involving control theoretic approaches to computing problems. One of the initial attempts to apply control theory to computing problems was by Keshav, who proposed a flow control mechanism in networks by controlling the sending rate of the source.
Kamra et al. developed an admission controller for multi-tier web applications which tunes the dropping probability to maintain the response time around the desired value. Abdelzaher et al. proposed a mechanism for tuning the shares of cache allocated to each class of requests at a proxy cache server to deliver differentiated service. Ramamritham et al. designed a controller which chooses the time instances for pulling dynamic data from the data sources such that the desired level of temporal coherency is maintained. One of the attempts to provide quality of service (QoS) to multimedia applications was the Nemesis operating system. Nemesis prevents QoS crosstalk by moving the majority of operating system services into the application itself and using feedback control to monitor application performance.

B. Feedback control theory basics

Feedback control theory can be very useful for modeling the dynamic behavior of the virtualized environment. Feedback control theory models the input/output relationships of a system. An important benefit of feedback control is that it does not require accurate system models: the controller uses feedback to maintain performance by correcting errors.

Let us see how a feedback control system works. Figure 4 shows a basic feedback control system. A control system diagram is very different from the architectural diagram of a system. Control diagrams depict the flow of data and control signals through the system and the various transformations the signals undergo, whereas architectural diagrams depict the functional components involved in the system. Some keywords used in feedback control theory are as follows:

• Target system: The system which is being controlled.
• Reference input: The desired value of the output from the system.
• Control error: The difference between the reference input and the measured output.
• Control input: A variable whose value affects the behavior of the target system.
• Controller: Computes the value of the control input so as to keep the measured output equal to the reference input.
• Disturbance input: Other factors that affect the target system, e.g. administrative tasks.
• Noise input: An effect that changes the measured output produced by the target system.
• Transducer: Transforms the measured output into some desired form. The transducer may also average the output, depending upon the design of the feedback control system.

Given a target system, our task is to design a controller that adjusts the control input so that the measured output becomes equal to the reference input. Feedback control works by tuning the values of parameters called control inputs: the system variables or parameters which affect the working of the system, and hence the values of the output from the system. So the main idea in a feedback control system is to monitor the output from the system and,

Fig. 4. Typical feedback control system

depending upon the value of the current output, set new values of the input parameters. The task of the controller is to model the input-output relationship of the system so that the desired responses from the system can be achieved by setting proper values of the input parameters.

For example, consider the case of a web server for which we want to maintain the response time at some desired value. One of the configuration parameters of the webserver is the number of threads running in it. Increasing the number of threads will decrease the response time. Hence we can tune the number of threads to achieve a certain desired response time. One way to tune these kinds of systems is to use feed-forward control, also called open loop control.
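The loop just described can be sketched as a generic proportional controller. All names here are illustrative, and the toy target system (whose output simply equals the control input) stands in for any real system with a monotone input-output relationship.

```python
def control_loop(reference, measure, actuate, gain, steps):
    """A simple proportional feedback loop: measure the output,
    compute the control error against the reference input, and
    nudge the control input in proportion to the error."""
    u = 0.0  # control input
    for _ in range(steps):
        y = measure(u)            # measured output of the target system
        error = reference - y     # control error
        u += gain * error         # proportional correction
        actuate(u)                # apply new control input
    return u

# Toy target system: the output follows the control input directly.
state = {"u": 0.0}
final = control_loop(10.0,
                     measure=lambda u: u,
                     actuate=lambda u: state.update(u=u),
                     gain=0.5, steps=50)
print(round(final, 3))  # converges to the reference input, 10.0
```

With gain 0.5 the error halves every iteration, so the control input converges geometrically to the reference; too large a gain would instead cause oscillation, which is why controller tuning matters.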
As there is a direct relation between the number of threads in the web server and the response time, we could directly set the number of threads for a given desired response time. But this approach needs exhaustive experimentation to develop an accurate model of the system describing the relationship between the number of threads and the response time. This can work well in a truly isolated environment, but in practice such environments are rare. In the case of web servers, the usage patterns can change due to many factors, such as the time of day or a specific festival period. The operating system itself might also have scheduled administrative tasks running for some time. Feed-forward systems cannot maintain the desired output value in the presence of disturbances or changes in the system or operating environment.

Fig. 5. Typical feed-forward system

In a virtualized environment, each virtual machine has a separate execution environment, but the virtual machines depend on each other because of contention for the shared resources. So a change in any software component, such as an application, a virtual machine or an application characteristic, or in any hardware resource, can adversely affect performance. A technique like feed-forward modeling cannot control the system output in such a dynamic environment, whereas feedback control has the advantage of monitoring the current system state and thus can respond to the dynamics in the system.

C. Preliminary results

We carried out some preliminary experiments in the virtualized environment for illustration, as part of the first stage of this Master's project. We varied the cap of a domain (the maximum percentage of CPU that can be allocated to the domain) as the input, and measured the throughput of the webserver running inside a virtual machine as the output of the system.

Fig. 6.
Illustrative feedback control system for a virtualized environment

For the experiment we kept only a single virtual machine running, which hosted an Apache webserver. Experiments were conducted on an Intel(R) Xeon(TM) dual CPU 2.80GHz machine, with httperf used for load generation. Virtual machine monitor Xen 3.0.3 was used for the experiment. Research in the area of control theory has demonstrated that a linear model approximating the system works well. As shown in figure 7, webserver throughput is approximately linear in the cap. This implies that linear approximations from feedback control theory can be applied to the virtualized environment. This result motivated the development of a feedback control system for the virtualized environment as part of the second stage of this Master's project.

Fig. 7. Graph of application performance vs. scheduler parameter

D. Architecture of QoS aware virtualized environment

The architecture proposed in our work is independent of the virtual machine monitor (VMM) used, so any VMM solution such as VMware Workstation, Xen or MS Virtual Server can be used. Figure 8 shows the architecture of the QoS aware virtualized environment. Datacenters host a number of physical servers which are shared among multiple client applications. Each tier of each application is deployed in a different virtual machine. As described by Liu et al., the same tiers of all applications are generally kept on the same physical servers; by "same tiers" we mean tiers performing the same function, not merely tiers with the same number. As shown in the architecture,

Fig. 8. Architecture of QoS aware virtualized environment

all the virtual machines containing tier 1 of an application are placed on physical server 1, virtual machines of tier 2 on physical server 2, and so on. Hence for n-tier applications there will be at least n physical servers. Placement of these tiers is subject to resource availability on the given physical server.
A virtual machine monitor runs on each physical server and manages the virtual machines on that server. For simplicity we have not shown the host OS or VMM in the architecture diagram; figure 1 shows a virtualized environment with the VMM and host OS. Apart from these usual components of the virtualized environment, we add three modules: controller, capacity analyzer and sensors.

The sensor module is deployed in tier 1 of every application. As the name suggests, the task of the sensor is to carry out measurements. The sensor monitors each request coming to the application and measures the values of interest. The measured values can include QoS parameters such as the response time delivered to each request and the throughput of the application. The other tasks of the sensor include transforming the measured output into a form that is further used by the controller; the transformation can include summarizing the measured data, storing history data, and so on.

The controller and capacity analyzer modules are deployed in the host OS on each physical server. The controller module receives the values of the QoS parameters from the sensors. The task of the controller is to compute new values of the resource management parameters for each virtual machine; in this architecture, we compute the resource management parameter values for each virtual machine separately. The computed values for each virtual machine are then supplied to the capacity analyzer, which verifies whether the resource demands of all virtual machines together can be satisfied on the given physical server. Note that each physical server has separate instances of the controller and capacity analyzer running. After verification by the capacity analyzer, the resource management parameter values are forwarded to the virtual machine monitor, which acts as the actuator to set these values.
The following subsection describes the feedback control system, covering these three modules in depth.

E. Feedback control system design

Figure 9 depicts our design of the feedback control system for the virtualized environment. For simplicity we assume the number of applications and the number of tiers of every application to be two each. Note that each physical server has a separate instance of this feedback control system.

For this study we focus on maintaining the response time delivered by the application. Response time is the time between the arrival of a request at the server and the departure of the request after successful service by the server; delay over the network between the server and the client is not included in the response time measurement. Hence we have one reference input, in the form of the desired response time for an application. In this study we use the cap of the virtual machine hosting the application as the control input; the cap puts an upper limit on the CPU consumption of a virtual machine. We model the system using multiple SISOs, where SISO stands for single-input single-output system. There is one SISO per virtual machine of each application running on a physical server.

Fig. 9. Feedback control system for virtualized environment

As shown in the figure, the virtual machine environment hosts two applications in different virtual machines. The feedback control system receives the desired response time for each application as the reference input from the user. This input is entirely the choice of the user and describes the desired quality of service. The response time delivered by each application is measured by sensors present in the virtual machines. This measured output is then given to a transducer which computes an exponential average of the response time. Exponential averaging is useful to avoid responding to temporary fluctuations in the system.
The exponential averaging technique updates the average response time as follows:

avg_response_time = α × current_response_time + (1 − α) × old_avg_response_time

where α denotes the exponential factor. The value of α can be configured by the system administrator depending on the desired responsiveness to changes in the system. The exponentially averaged response time value is provided to the controller along with the desired response time value. We implemented a PID (Proportional-Integral-Derivative) controller [?]. The controller computes the new value of the cap for the virtual machine. The controller computes the cap for the two applications separately; hence logically there are two controllers running on a given physical server, and we show two controllers in the figure.

The values computed by both controllers are fed to the capacity analyzer, which verifies whether the resource demands of the virtual machines running on the same physical server are feasible. If the resource demands exceed the capacity of the physical server, then we need to allocate more hardware resources or discard some workloads. Allocating new hardware resources can be done by migrating virtual machines to a different physical server. Virtual machine migration, supported by many virtual machine monitors, allows runtime migration of a virtual machine from one physical server to another. This concept is discussed in detail in the future work section.

IV. TESTBED FOR QOS AWARE VIRTUALIZED ENVIRONMENT

This section describes the testbed deployed for carrying out the experiments. We designed and deployed the components of the testbed so as to resemble a real world scenario. In this section we also discuss some of the issues that occurred in the deployment process. For building the testbed, we used the open source solution Xen 3.0.3.

A.
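A sketch of the transducer and controller combined, using the exponential-averaging formula above together with a discrete PID update. The gains, α and the clamping range are illustrative; the report does not state its tuned values, and the sign convention (response time above target raises the cap) is an assumption consistent with the cap/throughput relationship observed earlier.

```python
class PIDController:
    """Discrete PID controller producing a new cap value from the
    error between the desired and (smoothed) measured response time."""

    def __init__(self, kp, ki, kd, alpha):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.alpha = alpha          # exponential-averaging factor
        self.avg = None             # smoothed response time (transducer state)
        self.integral = 0.0
        self.prev_error = 0.0

    def smooth(self, sample):
        """Transducer: avg = alpha * current + (1 - alpha) * old_avg."""
        if self.avg is None:
            self.avg = sample
        else:
            self.avg = self.alpha * sample + (1 - self.alpha) * self.avg
        return self.avg

    def update(self, desired_rt, measured_rt, cap):
        rt = self.smooth(measured_rt)
        # Response time above target => positive error => raise the cap.
        error = rt - desired_rt
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        cap += self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(1, min(100, cap))  # keep the cap in a valid percent range

pid = PIDController(kp=0.5, ki=0.1, kd=0.05, alpha=0.3)
new_cap = pid.update(desired_rt=0.2, measured_rt=0.5, cap=40)
print(new_cap)
```

Each SISO loop in the design would hold one such controller instance; the capacity analyzer then checks the caps produced by all instances on a server before they are actuated.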
Components of the testbed For demonstration of the work we have used two-tier systems with apache web server at frontend and MySQL database server connected at the backend. Apache server hosted the two-tier WebCalender application which has web and database tiers. We used httperf for load generation. We have used two instances of the same two-tier system to demonstrate how we can deliver differential quality of service to each of the application. We created four virtual machines by using Xen. Two of the virtual machines are hosting one apache server each and two other virtual machines are hosting one MySQL server each. Fig. 10. Testbed for QoS aware virtualized environment Following subsection discusses the hardware components of the testbed and how the software components are deployed on the hardware. The testbed setup is shown in the figure ??. Our testbed consists of two machines each with following configurations are used for hosting the servers. • Server1 : Intel(R) Xeon(TM) dual CPU 2.80GHz processor, 2 GB main memory. • Server2 : AMD Athlon(tm) dual core processor 3.0GHz, 1 GB of main memory. Generally data centers put same tiers of different applications on the same physical server. We adopted this design by putting virtual machines hosting the web tiers on server1 and virtual machines hosting the database tiers on server2. Apart from the above datacenter design, we have used 2 client machines to emulate behavior of real workload 15 with the help of continuous load generation using httperf. Requests are having exponential distribution. All of the machines are running with linux2.6. All of the machines are connected with 100Mbps ethernet. As discussed in earlier section, we designed two controllers each of which is running in the host OS on each of the physical servers. Each of the virtual machine hosting the web tier also hosts a http proxy named Muffin which acts as sensor. Muffin simply forwards the requests coming from the clients to the web server. 
We modified the source code of Muffin to measure the response time of the web server, as described in detail in the next subsection. This proxy, acting as a sensor, gives the response time measurements to the controller running in the host OS. This controller also communicates these response time values to the other controller running in the host OS on server2, which hosts the virtual machines corresponding to the database servers. The proxy Muffin is written in Java, whereas the utilities for extracting response time values from Muffin log files and the controller itself are implemented in C and shell script. Communication among the machines for the exchange of values and parameters is done using socket programming. For deploying the WebCalender application, we installed the Apache web server and PHP on the virtual machines hosting the web tier, and MySQL on the virtual machines hosting the database tiers.

B. Implementation issues

The work required for solving the problem under study consists of measuring the QoS parameters and computing the values of the tunable parameters based on those measurements. Load generators provide the response time values for the requests they issue, but in a practical scenario service providers cannot rely on the clients to report the QoS being delivered to them. With this in mind, we designed our virtualized environment so that the QoS parameters are monitored within the hosting environment.

Fig. 11. Verifying the response time values given by Tomcat logs

Monitoring a QoS parameter like response time is easiest if the web server itself monitors incoming requests and logs the measured value for each of them. We therefore investigated some of the available web servers and used the Tomcat server, which is supposed to provide the response time for each request.
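For reference, Tomcat exposes per-request timing through its access log valve; a sketch of the relevant configuration is below. The directory and file names are illustrative; the `%D` pattern code records the time taken to service the request, in milliseconds.

```xml
<!-- server.xml (inside the <Host> element): log each request together
     with its processing time. %D is the servicing time in milliseconds. -->
<Valve className="org.apache.catalina.valves.AccessLogValve"
       directory="logs" prefix="access_log." suffix=".txt"
       pattern="%h %t %r %s %b %D" resolveHosts="false"/>
```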
To record the measurements, the access log valve element needs to be set in the Tomcat configuration file. We carried out some experiments using response time measurements extracted from the access log files provided by Tomcat. After examining these measurements we concluded that the response time values reported by Tomcat are incorrect. The graphs in figure 11 show our measurements of response time from the Tomcat logs as well as from the load generator. We then surveyed some alternatives for response time measurement. To deploy our system with minimal changes to existing software, we use the lightweight HTTP proxy Muffin to measure the response time delivered by the web server. Figure 12 shows the flow of a request to application1 running inside our testbed. As shown in the testbed figure earlier, application1 has its web tier running inside vm1 and its database tier running inside vm3; the virtual machines vm1 and vm3 run on two different physical servers.

Fig. 12. Response time measurement and flow of a request through the testbed

Figure 12 shows how a request flows from the client through our testbed and back to the client. For simplicity we show only the virtual machines corresponding to the first application; for application2 the request flow and the response time measurement mechanism are the same. The proxy server runs on the same OS as the web server, and every request goes through this proxy, which communicates with the web server. Although Muffin itself does not measure response time, it is open source, so we could customize it to our measurement needs. We modified the source code of Muffin to log the response time for each request coming to the web server. Hence the proxy server Muffin acts as the sensor for our control system. It periodically sends the response time values to the controller running in the host OS on the same physical server.
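Muffin itself is written in Java; the measurement logic we added amounts to the following, shown here in C to match our other utilities. The function names are hypothetical, and the real sensor writes to Muffin's log files rather than to stderr.

```c
#include <stdio.h>
#include <sys/time.h>

/* Wall-clock time in milliseconds. */
static long long now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Stand-in for the proxy handing a request to the web server and
 * waiting for the reply. */
static void forward_to_webserver(void) { /* ... */ }

/* The sensor timestamps the request when it arrives at the proxy and
 * again when the reply comes back; the difference is logged as the
 * response time. Network delay between client and proxy is excluded. */
long long measure_response_time(void (*forward)(void))
{
    long long start = now_ms();
    forward();                                   /* request serviced here */
    long long elapsed = now_ms() - start;
    fprintf(stderr, "response_time_ms=%lld\n", elapsed);
    return elapsed;
}
```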
The controller communicates this response time value to the other controller instance running on physical server2. Once both controllers have the response time values, they compute the cap values for the virtual machines residing on the same physical server as the controller; each controller carries out this computation independently of the other. The overhead of running Muffin is very small and can be neglected.

C. Workload description

The nature of the workload deployed in the virtual machines has an impact on the QoS delivered. In our earlier study [RD] we showed that when multiple virtual machines (VMs) run CPU-intensive tasks, each VM gets its fair share of the CPU. In the case of IO-intensive tasks, however, although each VM still gets its fair share, the QoS provided to the application changes drastically depending on the other virtual machines: the resource usage pattern of one VM affects the performance of applications running in other VMs. Hence we deployed the two-tier WebCalender application, with its two tiers in two separate virtual machines hosted on two different physical machines, which mirrors the practical scenario in data centers. This workload exercises different IO tasks, such as querying the database and the flow of requests through the network, since the two tiers of the application are located in two different virtual machines.

V. PERFORMANCE EVALUATION

This section presents the results of our experiments to evaluate the feedback control system. The deployment details and workload description are given in the preceding section. For comparison, we first carried out an experiment without the controller running. The caps of the virtual machines were fixed at 47 and 40 respectively, which are the average values of the cap set in the second experiment, carried out with the controller running.
The desired response time values were 180 ms and 220 ms for the two web servers respectively, the same as in the second experiment with the controller, described below. The graph in figure 13 shows the response times delivered by the web servers running in the virtual machines. The table in figure 14 gives a clearer picture of the results of this experiment. We classified the % error values into ranges; generally, any error with a magnitude over 10% might not be tolerable by the clients. As shown in the table, the error magnitude exceeds 10% for 35% and 25% of the measurements respectively.

Fig. 13. Evaluation of virtual machines without controller

Fig. 14. Summarized result of virtual machines without controller

Figure 15 plots the response times delivered by both web servers running in the virtualized environment. This second experiment was carried out in the same environment as the first experiment, which had no control. The reference inputs were again 180 ms and 220 ms for the two web servers respectively. Graph a) shows the response time values delivered by the web servers against time; the values are generally very close to the desired response times. Graph b) plots the % error between the actual and desired response times against time. Graph c) plots the exponentially averaged response time against time, i.e., the values produced by the transducer and used by the controller in computing the new cap value. This experiment used an exponential factor of 0.2. As shown in the table, application1 delivers response time with an error magnitude of less than 10% for 86% of the time, whereas application2 does so for 89% of the time; in the first experiment these values were 65% and 75% respectively.

Fig. 15. Evaluation of the feedback controller

Fig. 16. Summarized result of evaluation of the feedback controller

This shows that the controller is able to deliver the desired response time. Both applications deliver response times with an error magnitude of less than 5% for around 60% of the time each. The table also lists the error values when the exponentially averaged response time is compared with the reference response time: in this comparison we got only 4% and 2% error for application1 and application2 respectively.

To illustrate the robustness of the feedback controller, we carried out the experiments in the presence of a disturbance. We periodically execute a thread on the virtual machine where application1 is running, i.e., on vm1. The thread is a CPU-hogging loop that alternately sleeps and performs some computation. All other parameters, the reference inputs, and the environment were the same as in the experiment explained above. As shown in the table in figure 18, application1 delivers response time with an error magnitude of less than 10% for 85.75% of the time, whereas application2 does so for 86.5% of the time. Both applications deliver response times with an error magnitude of less than 5% for around 50% of the time each.

Fig. 17. Evaluation of the feedback controller in presence of disturbance

Fig. 18. Summarized result of evaluation of the feedback controller in presence of disturbance

The error values when the exponentially averaged response time is compared with the reference response time are 0.73% and 0% for application1 and application2 respectively. These results show that our controller is robust in the presence of disturbances. The graphs in figure 19 show the values of the cap set by the controller: graph a) shows the cap values in the first experiment, carried out without any disturbance, whereas graph b) shows the cap values from the second experiment.
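The per-sample metric behind these summary tables is straightforward; a sketch of the computation (function names ours, for illustration) is:

```c
#include <math.h>

/* Percent error of a measured response time relative to the reference. */
double percent_error(double measured, double reference)
{
    return 100.0 * (measured - reference) / reference;
}

/* Fraction of samples whose error magnitude exceeds 10%, the range we
 * treat as intolerable in the summary tables. */
double frac_over_10pct(const double measured[], int n, double reference)
{
    int over = 0;
    for (int i = 0; i < n; i++)
        if (fabs(percent_error(measured[i], reference)) > 10.0)
            over++;
    return (double)over / n;
}
```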
From these graphs we can infer that the cap curve for virtual machine vm1 in the second experiment is shifted upward relative to the corresponding curve in the first experiment. This happens because vm1 carries extra load in the form of the periodically running thread, which represents the disturbance in the system. The average cap values in the first experiment are 46.98 and 40.48 for vm1 and vm2 respectively, whereas in the second experiment they are 48.85 and 40.18. This shows that in the second experiment the CPU demand of vm1 increased by about 4% due to the disturbance.

Fig. 19. Variations in the cap of virtual machines

If the reference response time values are very low, the sum of the caps of all virtual machines on a physical server can exceed the capacity of the physical server. This means the given physical server is not capable of delivering the required QoS to all the applications sharing it. In our current setup we display an error on the console and deliver degraded service to one application. As part of future work, we intend to extend the solution to this scenario by using virtual machine migration; the future work part of the next section describes this idea in detail.

VI. CONCLUSION

In this study we described the problem of delivering QoS to applications running inside a virtualized environment. Our work focused on devising a mechanism for computing the share of resources to be allocated to each virtual machine such that the desired QoS is delivered to the applications running inside the virtual machines. As part of the first stage of this Master's project, we defined the problem and carried out some initial experiments; we also studied the basics of feedback control theory, which we applied to solve the problem. In the second stage of this Master's project, we designed the feedback control system for the virtualized environment.
We designed and implemented the controller, sensor, and capacity analyzer modules as parts of the control system. Sensors measure the QoS delivered by the applications. The controller uses these QoS values to decide new values of resource management parameters such as the cap of a virtual machine. The capacity analyzer verifies whether the resource demands of all applications can be fulfilled by the given physical server. We intend to extend the capacity analyzer module in the third stage of this Master's project; the subsequent subsection discusses this idea in detail. We evaluated the performance of the proposed control system by deploying two-tier applications in the virtualized environment testbed. We carried out the experiments with the desired response time of the application as the reference input and the cap of the virtual machines in which the application resides as the control input. We implemented the sensor for carrying out response time measurements at the servers. The results of the experiments show that the control system is able to set the cap values accurately even in the presence of disturbance.

A. Future work

Recently, Liu et al. also presented work on the issue under our study. They designed and implemented an optimal multivariate controller for delivering differentiated services in a virtualized environment. Their problem-solving approach is similar to ours: both approaches use feedback control theory to set the cap of a virtual machine so as to meet application QoS guarantees. The work presented by Liu et al. performs online estimation of the controller parameters using the recursive least-squares (RLS) method; they use an RLS estimator with directional forgetting to learn the controller parameters, which makes their controller more robust to the dynamics in the system. Our future work will involve applying similar mathematical techniques in our feedback control system design.
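For reference, the basic RLS update with a forgetting factor λ estimates the model parameters θ̂ from the regressor vector φ(k) and the measured output y(k); the directional-forgetting variant used by Liu et al. modifies the covariance update, so the equations below are only a sketch of the standard form:

```latex
\begin{align*}
K(k) &= \frac{P(k-1)\,\phi(k)}{\lambda + \phi^{T}(k)\,P(k-1)\,\phi(k)},\\
\hat{\theta}(k) &= \hat{\theta}(k-1) + K(k)\,\bigl(y(k) - \phi^{T}(k)\,\hat{\theta}(k-1)\bigr),\\
P(k) &= \frac{1}{\lambda}\,\bigl(P(k-1) - K(k)\,\phi^{T}(k)\,P(k-1)\bigr).
\end{align*}
```

A smaller λ forgets old data faster, making the estimator more responsive to changes in system dynamics.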
One more difference between the two solutions lies in the feedback control system design. In our work we model the system as multiple SISOs, whereas Liu et al. model it as a MIMO (multiple-input, multiple-output) system. If the quality of service is expressed in relative terms, their approach can maintain the relative QoS for all applications; hence, under heavy load each application suffers proportionately, which ensures that relative QoS is delivered accurately. Our approach guarantees QoS defined in absolute terms: the capacity analyzer makes sure that each application gets sufficient resources to meet its QoS requirements. If the resource demands exceed the capacity of the server, we have the option of providing degraded performance, discarding some of the workload, or migrating virtual machines to another physical server. Kochut et al. proposed an analytical model of virtual machine migration. As part of future work we plan to devise an algorithm, based on this analytical model, as part of the capacity analyzer. The algorithm will choose the virtual machines that need to be migrated from a given physical server and decide to which physical servers these virtual machines will migrate, subject to the constraint that each application's QoS must still be delivered. Another important factor is the overhead of migration; the algorithm should find an optimized solution considering these factors. While allocating resources among competing applications, the reward associated with each application can also be considered. The reward can represent the benefit or loss, in terms of the pricing of the service, depending on the QoS delivered to the application. Using more hardware resources can help deliver the QoS for all applications, but it may not be effective from a cost perspective.
We plan to devise an algorithm that optimizes the possible benefit from the given infrastructure by allocating appropriate amounts of resources to the applications. In recent times there has been increased concern about the electric power usage of data centers; we intend to include this power factor in the optimization problem described above.

REFERENCES

Tarek Abdelzaher, Kang G. Shin, and Nina Bhatti. User-level QoS-adaptive resource management in server end-systems. IEEE Transactions on Computers, 52.
Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the art of virtualization. Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP), 2003.
VMware site. http://www.vmware.com/.
Diwaker Gupta, Ludmila Cherkasova, Rob Gardner, and Amin Vahdat. Enforcing performance isolation across virtual machines in Xen. Middleware 2006: Proceedings of the ACM/IFIP/USENIX 7th International Middleware Conference, 2006.
Official Xen project site. http://www.cl.cam.ac.uk/research/srg/netos/xen/.
Sujay Parekh, Dawn M. Tilbury, Joseph L. Hellerstein, and Yixin Diao. Feedback Control of Computing Systems. John Wiley and Sons, Inc., 2004.
Andrzej Kochut and Kirk Beaty. On strategies for dynamic resource management in virtualized server environments. MASCOTS 2007: IEEE/ACM International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, October 2007.
httperf. http://www.hpl.hp.com/research/linux/httperf/.
Website of the Muffin proxy server. http://muffin.doit.org/.
I. M. Leslie, D. McAuley, R. Black, T. Roscoe, P. Barham, D. Evers, R. Fairbairns, and E. Hyden. The design and implementation of an operating system to support distributed multimedia applications (Nemesis). IEEE Journal on Selected Areas in Communications, 1996.
Rahul Gundecha. Measurement-based evaluation of virtualization platforms. Technical report, Indian Institute of Technology, Bombay, April 2007.
Ludmila Cherkasova, Diwaker Gupta, and Amin Vahdat. When virtual is harder than real: resource allocation challenges in virtual machine based IT environments. Technical report, HP Laboratories Palo Alto, February 2007.
Pradeep Padala, Xiaoyun Zhu, Zhikui Wang, Sharad Singhal, and Kang G. Shin. Performance evaluation of virtualization technologies for server consolidation. Technical report, HP Laboratories Palo Alto, April 2007.
Zhikui Wang, Xiaoyun Zhu, Pradeep Padala, and Sharad Singhal. Capacity and performance overhead in dynamic resource allocation to virtual containers. Technical report, HP Laboratories Palo Alto, April 2007.
S. Keshav. A control-theoretic approach to flow control. Proceedings of ACM SIGCOMM, 1991.
Abhinav Kamra, Vishal Misra, and Erich M. Nahum. Yaksha: a self-tuning controller for managing the performance of 3-tiered web sites. Proceedings of the 12th International Workshop on Quality of Service (IWQoS), 2004.
Ying Lu, Avneesh Saxena, and Tarek F. Abdelzaher. Differentiated caching services: a control-theoretical approach. International Conference on Distributed Computing Systems, 2001.
Ratul K. Majumdar, Krithi Ramamritham, Ravi N. Banavar, and Kannan M. Moudgalya. Disseminating dynamic data with QoS guarantee in a wide area network: a practical control theoretic approach. 10th IEEE Real-Time and Embedded Technology and Applications Symposium, 2004.
Xue Liu, Xiaoyun Zhu, Pradeep Padala, Zhikui Wang, and Sharad Singhal. Optimal multivariate control for differentiated services on a shared hosting platform. Proceedings of the 46th IEEE Conference on Decision and Control (CDC'07), December 2007.