Improving Virtual Machine Live Migration via Application-level Workload Analysis

Artur Baruchi and Edson Toshimi Midorikawa
University of Sao Paulo, LAHPC - Sao Paulo, Brazil

Marco A. S. Netto
IBM Research, Sao Paulo, Brazil

Abstract—Virtual Machine (VM) live migration is key for implementing resource management policies that optimize metrics such as server utilization, energy consumption, and quality-of-service. A fundamental challenge for VM live migration is its impact on both the user and the resource provider sides, including service downtime and high network utilization. Several VM live migration studies have been published in the literature; however, they mostly consider only system-level metrics such as CPU, memory, and network usage to trigger VM migrations. This paper introduces ALMA, an Application-aware Live Migration Architecture that explores application-level information, in addition to the traditional system-level metrics, to determine the best time to perform a migration. In experiments with three real applications, by considering application characteristics to trigger the VM live migration, we observed a substantial reduction in data transferred over the network of up to 42% and a decrease in total live migration time of up to 63%.

Keywords-Live Migration; Cloud Computing; Performance Prediction; Virtualization

I. INTRODUCTION

Virtual Machine (VM) live migration techniques try to satisfy a given objective, such as cost reduction through the consolidation of several workloads onto a few servers, or a performance increase of an application via load balancing. In general, VM migrations happen without any analysis of the VM workload and state. Moreover, live migration techniques and optimizations usually do not consider the state of the data center (e.g., live migrations in progress and current network traffic). The absence of a control for knowing when is the right moment to migrate a virtual machine can lead to resource waste, poor customer experience, and even Service Level Agreement (SLA) penalties.

Our hypothesis is that choosing the right moment to trigger a live migration can lead to significant improvements, such as decreasing the VM downtime or avoiding network congestion. Live migration techniques are very sensitive to memory usage (at least the pre-copy and post-copy algorithms). Hence, identifying the type of resource the VM is currently using (e.g., memory, CPU, I/O) can help speed up a live migration. One key aspect we explore in this paper is that applications may have cycles in which they utilize some resources more than others. For instance, scientific applications can have parallel processes that need to synchronize at time intervals, or web servers can be more utilized during the day than during the night.

This paper introduces the Application-aware Live Migration Architecture (ALMA), which supports live migration policies, considers the application-level workload, and carries out cycle identification. The architecture exploits the fact that understanding application characteristics can assist in better live migration decisions. The paper also presents an evaluation of the architecture by investigating the main metrics (VM downtime, total migration time, and network data transfer) in a set of experiments. The experiments also show that knowing the live migration overhead can help evaluate, beforehand, whether a VM migration will be worthwhile. Compared to existing work, we couple objective functions (consolidation, load balancing, etc.) with live migration controls, and identify application resource consumption cycles to trigger VM live migrations.
Therefore, the main contributions of this paper are:
• Introduction of an architecture for VM live migration that considers the application-level workload and performs cycle identification;
• A method to quantify and predict application degradation during live migration and to identify application cycles using the Fast Fourier Transformation;
• Evaluation of the architecture considering metrics related to quality-of-service and resource management, using real applications from different domains and a testbed with real servers.

II. SYSTEM ARCHITECTURE

A. Architecture Overview

In an architecture with no live migration control, once the new VM-to-Host map¹ is computed, it is submitted to the hosts without any control and is subject to problems like network congestion. In architectures where a live migration control exists, it is implemented in the hypervisor layer and does not interact with the objective function module. Usually, this control is designed to avoid network congestion and does not consider the application's behavior. Another important difference between our proposed architecture and these two is the evaluated metrics: most VM live migration architectures make decisions according to system metrics, not according to application characteristics.

Our architecture, the Application-aware Live Migration Architecture (ALMA), computes the objective function for the current VM map. The new map is transferred to a module called the Live Migration Control Engine (LMCE), which decides when to migrate the VMs. The orchestration of the live migration aims to minimize the network and application overhead and, mainly, unnecessary migrations.

¹ VM-to-Host map: selection of Hosts to run a given group of VMs.

B. Live Migration Control Engine

The Live Migration Control Engine (LMCE) module sits between the physical hosts and the objective function computation module.
Once the new Host-to-VM map is computed, LMCE analyzes which VMs can be migrated according to the application workloads² and the cyclic analyses. Based on data collected from the VMs, LMCE can decide when and which VMs are the best candidates for migration. This information can also be used in the future for cycle identification. If the application presents a cyclic behavior (e.g., the synchronization barriers of a parallel application, which are network intensive), LMCE can avoid live migrations during that time interval.

LMCE accepts the configuration of two time constraints. The first is the maximum time allowed to postpone a live migration. The second is the live migration cost, since the provider (or the cloud customer) can adjust an acceptable overhead for the application. LMCE analyzes the application resource consumption behaviour (e.g., memory, CPU, I/O) over time in order to trigger or postpone a live migration requested by the objective function computation.

C. Migration Cost Prediction

We define overhead as the amount of additional time needed to finish the workload once the live migration is committed. To compute the overhead, it is necessary to normalize the execution time across the hosts involved: since hosts can have different processor technologies, the time to finish a given workload differs as well. In this paper we considered only processor-related metrics. First, the processor ratio between the hosts is computed. Once the ratio is known, the overhead O(A→B) of the live migration from host A to host B is given by:

O(A→B) = Tm(A→B) − tA + tTB − TB

where:
• Tm(A→B): total execution time of the application workload when migrating from host A to B;
• tA: application elapsed time executing on host A;
• tTB: application elapsed time executing on host A, converted to time on host B (using the processor ratio);
• TB: total execution time of the entire application workload on host B (without live migration occurrences).
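As a minimal sketch, the overhead formula can be computed as follows (the helper name is hypothetical, and the direction of the processor ratio is our assumption, since the paper does not fix a convention; here tTB = tA × ratio):

```python
def migration_overhead(tm_ab, t_a, t_b_total, ratio):
    """Overhead O(A->B) = Tm(A->B) - tA + tTB - TB.

    tm_ab:     total runtime of the workload when migrating from A to B
    t_a:       elapsed time already spent on host A
    t_b_total: total runtime of the whole workload on host B alone
    ratio:     processor ratio converting host-A time into host-B time
               (assumed here as tTB = tA * ratio)
    """
    t_tb = t_a * ratio  # elapsed time on A expressed in host-B time
    return tm_ab - t_a + t_tb - t_b_total
```

For example, if a workload that would take 100 s on host B alone already ran 30 s on a host A that is half as fast (ratio 0.5, so those 30 s correspond to 15 s of host-B time) and finishes in 120 s after migrating, the overhead is 120 − 30 + 15 − 100 = 5 s.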
The prediction P(A→B) of how long a workload will run when migrating a virtual machine from host A to B, given that the workload has already been running for a certain time interval on host A, is:

P(A→B) = tA + TB − tTB + O(A→B)

² Application workload: the stage the application is in regarding the type of resource consumption, such as CPU, memory, and I/O.

This prediction model considers the migration overhead and, more importantly, the hardware differences of the hosts involved in the live migration (the source host and the target host). As observed by Birke et al., the Cloud is built on servers of different generations with different capacities and performance; hence, this scenario should be part of any live migration prediction strategy and evaluation.

D. Cycle Identification

Many application workloads have a cyclic (or temporal) behaviour pattern. Knowing a likely application workload behavior in advance can be useful for live migration strategies: an application that is about to stress a given resource type (CPU, memory, I/O) can have its migration request postponed to the near future, when its resource consumption is known to be more appropriate for a live migration.

The estimation of the cycle size uses the Fast Fourier Transformation (FFT), which is used in other fields of science (such as physics) to identify cyclic patterns in natural events. This analysis is done by storing the application workload history; the collected data is submitted to the FFT, which estimates the cycle size. The cycle is then split into two parts, one with propitious live migration moments and the other with moments that are not good for live migration (by splitting in two we reduce the search space for the next step). Finally, to determine where in the cycle the application currently is, in terms of resource consumption, we calculate the modulo (the remainder of the division) of the current instant by the size of the cycle (the pseudocode is presented in Figure 1).
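The prediction formula can be sketched the same way (a hypothetical helper, again assuming the ratio converts host-A elapsed time into host-B time, so that tTB = tA × ratio):

```python
def predict_runtime(t_a, t_b_total, ratio, overhead):
    """Predicted total runtime P(A->B) = tA + TB - tTB + O(A->B).

    t_a:       elapsed time already spent on host A
    t_b_total: total runtime of the whole workload on host B alone
    ratio:     processor ratio (assumed convention: tTB = tA * ratio)
    overhead:  migration overhead O(A->B) from the previous formula
    """
    t_tb = t_a * ratio  # host-A elapsed time in host-B terms
    return t_a + t_b_total - t_tb + overhead
```

With the numbers from the overhead example (tA = 30, TB = 100, ratio = 0.5, overhead = 5), the predicted runtime is 30 + 100 − 15 + 5 = 120 s.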
Consider an application with a cyclic behavior whose metric values are collected twice an hour, resulting in 48 samples a day. Each sample is composed of workload details of various resources, such as memory, CPU, and I/O usage. Based on these metrics, we classify each sample as suitable or unsuitable for live migration (e.g., at moments with a high paging rate, we classify the workload as unsuitable). The samples are submitted to the FFT, which gives us the cycle size. The FFT could return a cycle size of 8 hours, i.e., every 8 hours the workload restarts. During the 8-hour cycle, we may observe several oscillations between suitable and unsuitable moments for live migration. Hence, knowing the cycle size and how the oscillation between suitable and unsuitable moments occurs inside the cycle, we can estimate at which moments to migrate a VM.

For LMCE to work properly and make the right decisions, it needs data from three sources:
• User Information: the application deadline, which can come from either a cloud service provider or an end user;
• Objective Function Information: a new set of Host-to-VM maps must be provided;
• Virtual Machine Application Classification Information: VMs must send data about the application classification to LMCE. This classification can have two values, Live Migration (LM) and Non Live Migration (NLM), which indicate whether the moment is suitable for migration or not, respectively. This data can be sent at different frequencies; the more data available, the better the accuracy of the application classification.

Require: An array C with classification data from a VM for a certain time interval. Each classification sample must be chronologically ordered.
  CycleSize ← FFT(C)                       ▷ Find the cycle size using FFT.
  LMCount ← 1
  NLMCount ← 1
  for i = 1 to CycleSize do                ▷ Split cycle into two arrays: ArrayLM and ArrayNLM.
    if C[i] == LM then
      ArrayLM[LMCount] ← i
      LMCount ← LMCount + 1
    else
      ArrayNLM[NLMCount] ← i
      NLMCount ← NLMCount + 1
    end if
  end for
  now ← CurrentMoment mod CycleSize        ▷ Find in which moment inside the cycle we are.
  if find(now, ArrayNLM) then
    nextLM ← findNextBigger(now, ArrayLM)  ▷ Find the next moment, greater than now, in ArrayLM.
    remainingTime ← nextLM − now
  else
    remainingTime ← 0                      ▷ Inside a LM moment.
  end if
  return remainingTime

Fig. 1: Algorithm to identify workload cycles.

III. EVALUATION

This section presents the experiments with ALMA. We used the most common live migration (LM) metrics in two sets of experiments. The first set is based on artificial benchmarks, with well-defined behaviour and artificial cycles. The second set comprises three real applications, with their own cycle patterns. In addition, we present the evaluation of the prediction model.

A. Testbed Configuration

We built a Cloud environment composed of five physical servers and a Network Attached Storage (NAS). We connected all components to a 24-port switch and created three networks, separate from each other: one network for live migrations only, one for NAS data transfer, and one for administrative tasks. We configured ten virtual machines with three configuration profiles: the Small configuration has one vCPU and 768MB of memory, the Medium configuration has two vCPUs and 1GB of memory, and the Large one has two vCPUs and 2GB of memory. The software configuration comprises OpenSuse Linux 12.1 with the 3.1.10 kernel on the physical hosts. Xen 4.1.3 was used as the hypervisor, and the virtual machines were installed with CentOS 5.9 and kernel 2.6.18. ALMA was implemented in Perl (modules used for the cycle calculation) and Python (due to a better API with Xen).

The evaluation of the proposed architecture uses the following metrics:
• Total Migration Time: the time, in seconds, between the submission of the migration and the moment the VM is completely released from the source host. This data was collected using the Xen log in debug mode;
• Downtime Duration: the time interval, in seconds, in which the VM is unreachable. This metric was collected using the ICMP protocol; time intervals in which requests received no answer were counted as downtime;
• Network Data Transfer: the amount of data, in MB, transferred over the network during the live migration. This data was collected from the switch;
• Cycle Identification Accuracy: how accurately, in percentage, the Fast Fourier Transformation estimated the cycle size. This metric is the difference between the calculated cycle and the measured one; the closer to zero, the better the accuracy.

The benchmarks used to create artificial cycles are described in Table II.

B. Benchmark Experiments

In this experiment set, we compare ALMA against a traditional consolidation (called SysConsolidation), which consolidates all VMs, at specific moments during the workload, onto two hosts (Host B and Host E). Figure 4 presents the VM placements after the consolidation. We ran this test 10 times; at a given moment the VMs were consolidated, and during the tests the workload ran from start to finish. After that, we submitted consolidations at the same specific moments, but under ALMA control.

TABLE I: Benchmark workload experiments (seconds).

Metric-Policy        | vm01_D     | vm02_C     | vm03_A
MigrTime-SysConsol   | 31.7 ±21.2 | 98.9 ±32.6 | 29.5 ±17.9
MigrTime-ALMA        | 15.3 ±4.9  | 37.7 ±0.6  | 14.0 ±5.2
Downtime-SysConsol   | 23.4 ±12.5 | 23.5 ±6.3  | 16.0 ±5.4
Downtime-ALMA        | 22.9 ±15.1 | 16.4 ±6.3  | 14.5 ±4.4

Fig. 2: Cycle Identification Accuracy: Benchmark Exp.
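A compact Python sketch of the cycle logic of Figure 1 follows. A naive DFT stands in for a library FFT, and the wrap-around search for the next LM moment is our own simplification; all function names are hypothetical, not part of ALMA:

```python
import cmath

def dominant_period(samples):
    """Estimate the cycle size of a numeric sample series via a naive DFT
    (a stand-in for the FFT used by ALMA): pick the frequency bin with the
    largest magnitude and convert it back to a period in samples."""
    n = len(samples)
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        coef = sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                   for t in range(n))
        if abs(coef) > best_mag:
            best_k, best_mag = k, abs(coef)
    return n // best_k  # samples per cycle

def time_to_next_lm(classification, cycle_size, now):
    """classification: one cycle of 'LM'/'NLM' labels.
    Returns how many samples remain until the next suitable LM moment
    (0 if the current moment is already suitable)."""
    pos = now % cycle_size  # where we are inside the cycle
    if classification[pos] == 'LM':
        return 0
    for step in range(1, cycle_size + 1):
        if classification[(pos + step) % cycle_size] == 'LM':
            return step
    return -1  # no LM moment anywhere in the cycle
```

For a square-wave workload with period 8 (four suitable samples followed by four unsuitable ones), `dominant_period` recovers the cycle size of 8, and `time_to_next_lm` reports how long a migration must be postponed from any position in the cycle.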
TABLE II: Benchmark descriptions.

Benchmark | Workload Type | Description
SPEC      | CPU Bound     | Used the twolf benchmark, which spends most of its execution time in internal loops doing mathematical calculations.
NAS NPB   | Memory Bound  | Performs several changes in memory during execution and consumes a considerable amount of RAM. We used class D of NPB.
IOZone    | I/O Bound     | Random reads and writes in files larger than physical memory (to avoid cache effects), in blocks of 4KB.

The workload was configured in three VMs (vm02_C, vm03_A and vm01_D, the darker VMs in Figure 4) in order to improve the accuracy of the measured metrics. The other VMs were idle during the experiment, but all VMs were migrated to create noise in the network.

Figure 3 presents the moments of consolidation using SysConsolidation (dashed red lines). The blue line represents the workload of the VMs over time and the classification of each interval: valleys represent Non Live Migration (NLM) moments and peaks represent Live Migration (LM) moments. When the SysConsolidation was submitted, all VMs were migrated simultaneously to consolidate onto Host B and Host E. When using ALMA, the consolidation was postponed to a more favourable moment to migrate the VM (black lines). The figure shows that ALMA was able to identify and submit the LM at moments suitable for the application workload (during the peaks). The green line in Figure 3 is the cycle size estimated by the FFT; the pattern before the green line repeats throughout the workload, showing that the FFT approximates the benchmark workloads well.

Fig. 3: Evaluation of consolidation moments for benchmark workloads: ALMA consolidations tend to be closer to the beginning of, or inside, the suitable LM moments.

Fig. 4: VM placement after consolidation (hosts Host_A to Host_E attached to a NAS server via separate NFS, LM, and Data VLANs).

When using ALMA, we improved the first analysed metric, Total Migration Time (Table I), by up to 61% (vm02_C). The total migration time was greatly reduced because of the reduction in the amount of data transferred over the network. In the best-case scenario, vm02_C took 112 seconds to migrate without ALMA, against 39 seconds with it: without ALMA, the source host transferred 98% more data in 17 cycles of the pre-copy algorithm, whereas with ALMA the copy took 30 cycles but the amount of data transferred was substantially reduced.

For the second evaluated metric, Downtime (Table I), there is no improvement. In some cases the average showed a small improvement, but the standard deviation was virtually the same. The reason is that setting up the network infrastructure created by the VMM is not part of the pre-copy migration algorithm; it is an independent process that takes place just after the LM finishes, and it is influenced much more by the computational resources of the involved hosts than by the LM algorithm.

For the Data Transferred metric, we observed a considerable improvement: using ALMA, the reduction in data transferred over the network was about 42% (about 5GB less data transferred).

Finally, the Cycle Identification Accuracy metric (Figure 2) shows the error between the FFT cycle calculation and the cycle size actually measured. The workload submitted to vm02_C presented the highest error, due to the BT benchmark (a memory-intensive workload), which fluctuates more during its execution. Nevertheless, this error (about 6%) does not affect the ALMA benefits.
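Since downtime is derived from unanswered ICMP requests, the measurement described above can be sketched as follows (a hypothetical helper for post-processing a ping trace, not part of ALMA):

```python
def downtime_from_pings(replies, interval=1.0):
    """replies: chronological booleans (True = ICMP echo reply received),
    one per probe sent every `interval` seconds.
    Returns the longest run of missed replies, in seconds, as the
    estimated VM downtime during a live migration."""
    longest = run = 0.0
    for replied in replies:
        run = 0.0 if replied else run + interval
        longest = max(longest, run)
    return longest
```

For example, a trace with three consecutive missed probes at one-second intervals yields an estimated downtime of 3 seconds.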
C. Experiments with Real Applications

In this experiment set, ALMA is compared using three applications. We chose two scientific applications, with intensive usage of memory and I/O (the major part of the workload) and some periods of CPU usage. The third application is a database running the TPC-H workload. The applications are summarized in Table III.

TABLE III: Application descriptions.

Application  | VM     | Description
OpenModeller | vm03_A | Biological scientific application. It aims to find the likelihood of a species occurring given the topography, vegetation, and climate of a region.
BRAMS        | vm02_C | Brazilian atmospheric modelling application, used for weather forecasting.
TPC-H        | vm01_D | Simulates a decision support system (Business Intelligence). It is composed of 22 queries that access a huge amount of data.

Figure 5 plots the workload classification (blue line) over time. As in the previous test set, red lines represent SysConsolidation and black lines represent consolidations postponed by ALMA; the green line is the cycle size. The figure shows that ALMA was able to identify the cycle pattern of the application and trigger the migration at the proper moment.

Fig. 5: Evaluation of consolidation moments for application workloads: ALMA consolidations tend to be closer to the beginning of, or inside, the suitable LM moments.

TABLE IV: Real application workload experiments (seconds).

Metric-Policy        | vm03_A     | vm02_C     | vm01_D
MigrTime-SysConsol   | 28.7 ±3.4  | 43.2 ±1.6  | 27.3 ±9.9
MigrTime-ALMA        | 10.8 ±0.4  | 36.6 ±0.7  | 10.1 ±1.1
Downtime-SysConsol   | 19.1 ±10.1 | 20.1 ±9.9  | 19.0 ±6.9
Downtime-ALMA        | 16.8 ±9.0  | 22.5 ±11.9 | 20.3 ±10.9

Fig. 6: Cycle Identification Accuracy: Application Exp.
Next, we present the Total Migration Time when using no control over the LM and when using the ALMA architecture (Table IV). We observe an improvement of up to 67% (vm03_A and vm01_D). The improvement of vm02_C was not as significant due to hardware differences of the target hosts: the vm02_C machine is consolidated on Host B, which has less memory, so to reserve the amount of memory for vm02_C, Host B invoked the Balloon Driver³ many times (several calls to the Balloon Driver were logged in the Xen log file). Even with these constant calls, we observed an improvement of 15%.

As in the previous test set, we did not observe any improvement in the Downtime metric, for the same reasons (Table IV): the statistical differences are irrelevant and no difference can be observed between ALMA and SysConsolidation.

The Data Transfer during the LM was improved by 20% when using ALMA to postpone the LM, a reduction of up to 2.3GB of data when the ALMA architecture was used.

Finally, the precision of the FFT cycle estimation is presented in Figure 6. As can be observed, the accuracy of the FFT is degraded when dealing with real workloads, due to the expected fluctuation in the applications' behaviour during their execution. Even with this degraded accuracy, the error is still low (up to 7%).

³ Balloon Driver: a mechanism used to force the guest Operating System to give up some unused memory pages. The unused pages return to the VMM, which can allocate them to other VMs. This mechanism allows overcommitting the memory available in the host among the guests.

D. Prediction Model Experiment

To evaluate the prediction model, we executed the same applications of the second test set. We chose different moments to trigger the LM, calculated the time the workload would take to finish using our prediction model, and compared it with the actual measured time. The abscissa of Figure 7 represents the moment (after the workload started) at which the LM was submitted; the ordinate is the elapsed time to finish the workload. The prediction model showed good accuracy: the average error for vm03_A was about 4 minutes (±1.69), 2 minutes (±1.07) for vm02_C, and 4 minutes (±2.86) for vm01_D. We observed that the prediction accuracy decreases during intensive I/O workloads (vm01_D running the TPC-H workload and vm03_A running OpenModeller). This is due to the non-deterministic behaviour of I/O operations: the I/O requests issued by the VM can be answered at different times, depending on several variables. Memory and CPU workloads have a better-behaved execution profile, improving the accuracy of the prediction.

Fig. 7: Prediction model experiments: (a) application prediction on vm03_A; (b) application prediction on vm02_C; (c) application prediction on vm01_D.

IV. CONCLUDING REMARKS

This paper presented an architecture that allows the coexistence of objective functions and a controlled live migration. The main differences from the existing literature are (1) the use of application-level metrics, in addition to system-level metrics, to evaluate the workload and avoid workloads that harm the live migration process; (2) a cycle identification that can postpone or avoid unnecessary migrations; and (3) a prediction model that evaluates the migration cost and hardware differences to avoid live migrations that could potentially cause an SLA or QoS violation. We also explored the use of the Fast Fourier Transformation to identify application cycles and assist migration decisions. Our main findings are that (1) when the application behaviour is considered for live migration, there are considerable reductions in total live migration time and in data transferred over the network, and (2) hardware differences and migration costs should be part of live migration prediction models, as considering them improves the predictions.

ACKNOWLEDGMENT

This work has been supported and partially funded by FINEP / MCTI, under subcontract no. 03.14.0062.00.

REFERENCES

[1] Clark et al., "Live migration of virtual machines," in NSDI, 2005.
[2] H. Jin, L. Deng, S. Wu, X. Shi, and X. Pan, "Live virtual machine migration with adaptive memory compression," in Cluster Computing and Workshops, 2009.
[3] M. M. Theimer, K. A. Lantz, and D. R. Cheriton, "Preemptable remote execution facilities for the V-System," SIGOPS Oper. Syst. Rev., 1985.
[4] M. R. Hines, U. Deshpande, and K. Gopalan, "Post-copy live migration of virtual machines," SIGOPS Oper. Syst. Rev., 2009.
[5] G. Khanna, K. Beaty, G. Kar, and A. Kochut, "Application performance management in virtualized server environments," in NOMS, 2006.
[6] N. Bobroff, A. Kochut, and K. Beaty, "Dynamic placement of virtual machines for managing SLA violations," in IM, 2007.
[7] W. Voorsluys et al., "Cost of virtual machine live migration in clouds: A performance evaluation," in Cloud Computing, 2009.
[8] S. Mehta and A. Neogi, "ReCon: A tool to recommend dynamic server consolidation in multi-cluster data centers," in NOMS, 2008.
[9] T. C. Ferreto et al., "Server consolidation with migration control for virtualized data centers," Future Generation Computer Systems, 2011.
[10] A. Stage and T. Setzer, "Network-aware migration control and scheduling of differentiated virtual machine workloads," in ICSE Cloud, 2009.
[11] K. Yang et al., "An optimized control strategy for load balancing based on live migration of virtual machine," in ChinaGrid, 2011.
[12] T. Wood et al., "Sandpiper: Black-box and gray-box resource management for virtual machines," Computer Networks, 2009.
[13] M. Seki et al., "Selfish virtual machine live migration causes network instability," in APSITT, 2012.
[14] R. Birke et al., "State-of-the-practice in data center virtualization: Toward a better understanding of VM usage," in DSN, 2013.
[15] A. Khan et al., "Workload characterization and prediction in the cloud: A multiple time series approach," in NOMS, 2012.
[16] P. Leelipushpam and J. Sharmila, "Live VM migration techniques in cloud environment: A survey," in ICT, 2013.
[17] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, "Xen and the art of virtualization," SIGOPS Oper. Syst. Rev., vol. 37, no. 5, pp. 164–177, 2003.
[18] N. Mirghafori, M. Jacoby, and D. Patterson, "Truth in SPEC benchmarks," SIGARCH Comput. Archit. News, vol. 23, no. 5, pp. 34–42, Dec. 1995.
[19] D. H. Bailey et al., "The NAS parallel benchmarks: Summary and preliminary results," in SC, 1991.
[20] V. Tarasov, S. Bhanage, E. Zadok, and M. Seltzer, "Benchmarking file system benchmarking: It *is* rocket science," in HotOS, 2011.
[21] M. Souza Munoz et al., "openModeller: A generic approach to species' potential distribution modelling," GeoInformatica, 2011.
[22] S. R. Freitas et al., "The coupled aerosol and tracer transport model to the Brazilian developments on the Regional Atmospheric Modeling System (CATT-BRAMS) Part 1: Model description and evaluation," Atmospheric Chemistry and Physics, 2009.
[23] M. Kandaswamy and R. Knighten, "I/O phase characterization of TPC-H query operations," in IPDS, 2000.