Lifetime-aware, Fault-aware and Energy-aware SDN and CDC:
Optimal Formulation and Solutions
SPRITZ-CLUSIT Workshop on Future Systems Security and Privacy, 2017
Mohammad Shojafar
Consorzio Nazionale Interuniversitario per le Telecomunicazioni (CNIT), Department
of Electronic Engineering, University of Rome Tor Vergata, Italy
October 25, 2017
Content
1. Problem 1: Lifetime-aware CHW machine states
2. Problem 2: CDC power state management and maintenance cost
3. Problem 3: Fault-aware SFC in SDN
4. Conclusion and future Directions
Problem 1: Lifetime-aware CHW machine states
Introduction Problem 1
i) The goal of green networking is to exploit power
management policies to reduce the network energy cost.
ii) One of the main components in the network is Commodity
Hardware (CHW).
iii) CHW devices can be efficiently managed by varying their power
states, for example by using Sleep Mode (SM), in order to limit their
electricity consumption. Doing so affects the device lifetime in both
the short and the long term.
iv) We plan to address:
    i) CHW temperature variations [1]
    ii) CHW lifetime-aware ISP networks [2]
[1] A measurement-based analysis of temperature variations introduced by power management on Commodity HardWare, In 19th IEEE ICTON, pp. 1-4, 2017.
[2] Lifetime-Aware ISP Networks: Optimal Formulation and Solutions, IEEE/ACM Transactions on Networking, 2017.
Problem 1- CHW temperature variations
i) We investigate the impact on temperature of power-state
variations on the CHW.
ii) The study is based on real measurements on a simple testbed: we
characterize the temperature of the CPU and the RAM when an
SM state is triggered.
iii) The abrupt stopping of the fans triggered by SM tends to spread the
heat over the components, thus producing a temperature
transient on them before a steady state is reached.
iv) We plug the retrieved temperature measurements into a well-known
failure model, showing that the CHW failure rate is reduced by a
factor of 5 when the number of transitions between Active Mode (AM) and SM
states is more than 20 per day and the SM duration is on the order
of 800 [s].
Problem 1- Testbed
We select the following devices:
i) one server, which is used as CHW for our experiments;
ii) an iDRAC interface (which is installed on the CHW) to obtain
temperature measurements of the motherboard CPU;
iii) a power meter, used to measure the power consumption of the
CHW;
iv) a thermal camera, which is used to measure the surface
temperature of CPU and RAM components;
v) a Linux-based PC acting as a measurement collector from the
iDRAC interface, the power meter and the thermal camera;
vi) a switch and Ethernet cables to connect the iDRAC interface, the
power meter, and the thermal camera to the measurement
collector PC.
Problem 1- Testbed HW
Table: Testbed HW
Type | Description
Server | Dell PowerEdge T320 with Intel Xeon (8 cores, 16 threads) at 2.10 GHz, 48 GB RAM, iDRAC 7, Ubuntu 14.04 LTS
Power Meter | Raritan DPXR20A-16
Thermal Camera | FLIR A325 Camera
Measurement Collector | Commodity PC with Ubuntu 12.04 LTS
Problem 1- Testbed SW
SW Tool | Scope | From | To
snmpget | Obtain the CHW power consumption | Measurement collector PC | Power meter (Ethernet Port)
ipmitool | Obtain the CHW CPU temperature | Measurement collector PC | Server iDRAC Interface (Ethernet Port)
stress | Load CHW CPU and/or memory resources | CHW Terminal | CHW Operating System
FLIR IR Monitor | Measure the temperature of CHW Components | Measurement collector PC | Thermal Camera (Ethernet Port)
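As a rough illustration of how the command-line tools in the Testbed SW table might be invoked, the Python sketch below only builds the command lines and prints them; it does not run anything. The host addresses, credentials, SNMP community, and OID are hypothetical placeholders that do not come from the slides (in particular, the OID must be looked up in the power meter's MIB), and the helper names are invented for this example.

```python
import shlex

# Hypothetical placeholders: none of these values come from the slides.
POWER_METER_HOST = "192.0.2.10"
IDRAC_HOST = "192.0.2.20"
POWER_OID = "POWER_METER_OID_PLACEHOLDER"  # take the real OID from the meter's MIB


def snmpget_power_cmd(host=POWER_METER_HOST, community="public", oid=POWER_OID):
    """Command to read one SNMP object (e.g. power) from the power meter."""
    return ["snmpget", "-v", "2c", "-c", community, host, oid]


def ipmitool_temperature_cmd(host=IDRAC_HOST, user="USER", password="PASSWORD"):
    """Command to list temperature sensors through the iDRAC (IPMI over LAN)."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password,
            "sdr", "type", "Temperature"]


def stress_cmd(cpu_workers, timeout_s):
    """Command to load the CHW CPU with a given number of worker processes."""
    return ["stress", "--cpu", str(cpu_workers), "--timeout", f"{timeout_s}s"]


if __name__ == "__main__":
    for cmd in (snmpget_power_cmd(), ipmitool_temperature_cmd(), stress_cmd(8, 600)):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` on the measurement collector PC (or, for `stress`, on the CHW itself) once real addresses and credentials are filled in.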
Problem 1- Results
Figure: AM - SM Impact (CPU Temperature and Ambient Temperature [Celsius], Server Power Consumption [W], vs. Time [s])
Problem 1- Results..
Figure: CPU Load Impact (Server Power [W], CPU Temperature and Ambient Temperature [Celsius], vs. Number of CPU processes)
Problem 1- Results-CPU Temperature variation
Figure: CPU temperature at ts = 0 [s], 600 [s], 1200 [s], and 1800 [s]
Problem 1- Results-RAM Temperature variation
Figure: RAM temperature at ts = 0 [s], 600 [s], 1200 [s], and 1800 [s]
Problem 1- Acceleration Factor (AF) vs TMP variations
\[ AF = \left(\frac{f_{STD}}{f_{SM}}\right)^{-m} \left(\frac{\delta_{STD}}{\delta_{SM}}\right)^{-n} e^{\frac{E_a}{K}\left(\frac{1}{T^{max}_{STD}} - \frac{1}{T^{max}_{SM}}\right)} \tag{1} \]
Figure: AF vs. Temperature variations (axes: Temperature Variation, Number of Transitions per Day)
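To make the AF expression concrete, here is a minimal Python sketch. It assumes Eq. (1) has the form AF = (f_STD / f_SM)^(-m) * (delta_STD / delta_SM)^(-n) * exp((E_a / K) * (1 / T_STD^max - 1 / T_SM^max)), where f is read here as the transition frequency, delta as the temperature variation, T^max as the maximum temperature in kelvin, and K as the Boltzmann constant; these readings are assumptions, since the slide does not define the symbols. The parameter values in the example are made up for illustration and are not measurements from the testbed.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(f_std, f_sm, delta_std, delta_sm,
                        t_max_std, t_max_sm, m, n, e_a):
    """AF per the assumed form of Eq. (1); temperatures in kelvin, e_a in eV."""
    frequency_term = (f_std / f_sm) ** (-m)
    swing_term = (delta_std / delta_sm) ** (-n)
    arrhenius_term = math.exp(
        (e_a / BOLTZMANN_EV_PER_K) * (1.0 / t_max_std - 1.0 / t_max_sm)
    )
    return frequency_term * swing_term * arrhenius_term


if __name__ == "__main__":
    # Illustrative, made-up parameters (not values from the slides).
    params = dict(m=0.33, n=1.9, e_a=0.12)
    # Identical STD and SM conditions: every term equals 1.
    print(acceleration_factor(1, 1, 10, 10, 330.0, 330.0, **params))
    # More transitions, larger temperature swing, higher peak temperature in SM.
    print(acceleration_factor(1, 20, 10, 30, 330.0, 340.0, **params))
```

Under this reading, each of the three terms is greater than 1 whenever the SM condition has more frequent transitions, a larger temperature variation, or a higher maximum temperature than the STD condition.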
Problem 1- CHW lifetime-Aware ISP Networks
\[ \min \frac{1}{|E|} \sum_{(i,j) \in E} AF_{i,j}^{t} \tag{2} \]
1. given the set of CHW switches N, their connections E, and the traffic in
each time slot t
2. subject to connectivity and maximum link utilization at each t
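The structure of this problem can be sketched with a brute-force search on a toy network. The Python example below is a simplification under stated assumptions, not the OPT-ENH or AFA method: each link gets a hypothetical AF value for the active state and for the sleep state, only the connectivity constraint is enforced (the maximum link utilization constraint is omitted for brevity), and the objective is taken to be the AF averaged over all links. The function names and input values are invented for this example.

```python
from itertools import product


def is_connected(nodes, active_links):
    """Depth-first search: True if active_links connect all nodes."""
    adjacency = {v: set() for v in nodes}
    for u, v in active_links:
        adjacency[u].add(v)
        adjacency[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)


def best_link_states(nodes, links, af_active, af_sleep):
    """Enumerate link power states; return (average AF, set of active links)."""
    best = None
    for states in product((True, False), repeat=len(links)):
        active = [link for link, on in zip(links, states) if on]
        if not is_connected(nodes, active):
            continue  # connectivity constraint violated
        avg_af = sum(af_active[link] if on else af_sleep[link]
                     for link, on in zip(links, states)) / len(links)
        if best is None or avg_af < best[0]:
            best = (avg_af, set(active))
    return best


if __name__ == "__main__":
    nodes = {1, 2, 3}
    links = [(1, 2), (2, 3), (1, 3)]
    af_active = {link: 1.0 for link in links}        # hypothetical values
    af_sleep = {(1, 2): 1.5, (2, 3): 1.2, (1, 3): 0.8}  # hypothetical values
    print(best_link_states(nodes, links, af_active, af_sleep))
```

Exhaustive enumeration grows as 2^|E|, so it is only usable on toy instances; it is meant to show the shape of the objective and the connectivity constraint.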
Problem 1- Simulation Setup
Figure: Network Characterization
Problem 1- Results
Figure: AF vs. HW parameters for Optimal (OPT-ENH) and Heuristic (AFA)
Problem 1- Results..
Figure: Computation Time vs. HW parameters for Optimal (OPT-ENH) and
Heuristic (AFA)
Problem 1- Results..
Figure: Average AF vs. Computation Time for Optimal and Heuristic (AFA)
in Network Type Germany17
Problem 2: CDC power state management and maintenance cost [3]
[3] An Optimal Approach to Reduce Electricity and Maintenance Costs in Cloud Data Centers, IEEE Transactions on Sustainable Computing, in press, 2017.
Introduction Problem 2
i) Data Centers (DCs) are widespread worldwide and sustain
a variety of applications, such as web browsing, streaming, high
definition videos, and cloud storage.
ii) DCs can be put in Active Mode (AM) or Sleep Mode (SM) to reduce
energy and electricity usage.
iii) Transitions between AM and SM over long-term periods cause
maintenance costs [paid by the content provider] and increase the
failure rate.
iv) We present:
    i) a model to compute the maintenance costs, given the variation over
    time of the power states for a set of servers;
    ii) an optimal formulation of the problem of jointly reducing the Cloud Data Center (CDC)
    electricity consumption and the related maintenance costs;
    iii) tests on a realistic case study, which clearly show that our solution is able to
    wisely trade off maintenance and electricity costs in order to
    provide monetary savings for the content provider.
Problem 2- CDC Architecture
Figure: Cloud Data Center Architecture (Allocation Manager, Network Manager, Data Center Network Core, pods of switches, and Physical Servers (PS) each hosting VMs on a hypervisor; data traffic and configuration data flows).
Problem 2- Overall Formulation
The Optimal Maintenance and Electricity Costs (OMEC)
problem, which aims at minimizing the costs for each time slot (TS) t, is formulated as follows:
\[ \min \left[ C^{TOT}(t) = C_M^{TOT}(t) + C_E^{TOT}(t) \right] \tag{3} \]
subject to:
    Maintenance Costs Computation
    Electricity Costs Computation
    VM Allocation Constraint
    Maximum CPU Capacity
    Maximum Memory Capacity    (4)
under control variables: x_{ij}(t) ∈ {0, 1}, O_i(t) ∈ {0, 1}.
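The trade-off between electricity and maintenance costs can be illustrated with a toy, single-slot version of the problem. The Python sketch below is not the OMEC model itself: it assumes a hypothetical cost model in which electricity is a fixed cost per powered-on server and maintenance is a fixed cost per power-state transition with respect to the previous slot. The assignment of each VM to exactly one server plays the role of x_{ij}(t), and the derived on/off flag plays the role of O_i(t). All names and numbers are invented for this example.

```python
from itertools import product


def omec_slot(vms, servers, prev_on, energy_cost, transition_cost):
    """Pick the VM-to-server assignment with the lowest cost for one time slot.

    vms: {vm: (cpu, mem)}; servers: {server: (cpu_cap, mem_cap)}
    prev_on: {server: 0 or 1}, power state in the previous slot.
    Returns (total cost, {vm: server}, {server: 0 or 1}) or None if infeasible.
    """
    vm_ids, server_ids = list(vms), list(servers)
    best = None
    for choice in product(server_ids, repeat=len(vm_ids)):
        # choice[k] is the server hosting VM k (each VM on exactly one server).
        cpu = {s: 0.0 for s in server_ids}
        mem = {s: 0.0 for s in server_ids}
        for vm, s in zip(vm_ids, choice):
            cpu[s] += vms[vm][0]
            mem[s] += vms[vm][1]
        if any(cpu[s] > servers[s][0] or mem[s] > servers[s][1]
               for s in server_ids):
            continue  # violates a CPU or memory capacity constraint
        on = {s: int(s in choice) for s in server_ids}  # server is on if it hosts a VM
        electricity = sum(on[s] * energy_cost[s] for s in server_ids)
        maintenance = sum(abs(on[s] - prev_on[s]) * transition_cost[s]
                          for s in server_ids)
        total = electricity + maintenance
        if best is None or total < best[0]:
            best = (total, dict(zip(vm_ids, choice)), on)
    return best


if __name__ == "__main__":
    vms = {"vm1": (2, 4), "vm2": (2, 4)}
    servers = {"s1": (4, 8), "s2": (4, 8)}
    prev_on = {"s1": 1, "s2": 0}
    energy_cost = {"s1": 10, "s2": 8}      # hypothetical cost per slot
    transition_cost = {"s1": 5, "s2": 5}   # hypothetical cost per transition
    print(omec_slot(vms, servers, prev_on, energy_cost, transition_cost))
```

In this toy instance, server s2 is cheaper to run, but moving both VMs there would trigger two power-state transitions, so keeping them on s1 gives the lowest total cost.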
Problem 2- Results..
Figure: Histogram of the occurrence of migration events (Number of Events vs. Migrations per TS) for OMEC, OC
and OEA with |S| = 5, |VM| = 15 and |T| = 1 [year].
Problem 2- Results..
Figure: Per-server results for servers S1 to S5 under OMEC, OEA, and OC. Panels: Total Transition Duration [h], Total Transitions, Total AF, Maintenance Cost per server. Axis labels: \tau_i^{SM}(T) [h], \rho_i(T), AF_i^{TOT}(T), PS Maint. Cost [$].
Problem 2- Results..
Figure: Electricity Costs C_E^{TOT}(T), Maintenance Costs C_M^{TOT}(T), and total costs C^{TOT}(T) in [$] for Ψ = 0.01, Ψ = 0.02, and Ψ = 0.1.
Problem 3: Fault-aware SFC in SDN [4]
[4] Joint Energy Efficient and QoS-aware Path Allocation and VNF Placement for Service Function Chaining, IEEE Transactions on Network and Service Management, under review, 2017.
Introduction Problem 3
i) A Service Function Chain (SFC) represents the set of
network/service functions that need to be associated to a given
flow.
ii) Software Defined Networking (SDN) provides a powerful
infrastructure to implement SFC.
iii) We jointly consider the problem of flow rerouting and server energy
consumption in SFC context.
iv) We present:
    i) the main objective, which is to minimize the network energy consumption
    while the required Virtual Network Functions (VNFs) are properly delivered to the traffic flows;
    ii) a mathematical formulation of the resource reallocation problem, which is a
    cross-layer optimization problem considering energy and SFC
    parameters;
    iii) a suboptimal heuristic to solve this optimization
    problem, and a comparison of the optimal solution and the heuristic
    approach in terms of different metrics and computation time.
Problem 3- Architecture..
Figure: System Architecture (Common Routing Algorithm as Configuration Element, Proposed Algorithm as Reconfiguration Element, North Bound Protocol, Centralized SDN Controller, South Bound Protocol, network nodes 1 to 5).
Problem 3- Overall Formulation
The Optimal Network Reconfiguration (ONR) problem, which
aims at minimizing the energy consumption of the servers for each TS
t, is formulated as follows:
\[ \min_{O} \sum_{i=1}^{N} O_i \cdot E_i \tag{5} \]
subject to:
    Flow Conservation Constraints
    Server Utilization Constraints
    Link Utilization Constraints
    VNF/SFC Constraints    (6)
under control variables: x_{ij}(t) ∈ {0, 1}, O_i(t) ∈ {0, 1}.
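Two ingredients of this formulation can be sketched in a few lines of Python: the objective of Eq. (5) and a flow conservation check for a single unit flow. This is an illustrative sketch under assumptions, not the ONR model: the link variables are read here as x_{ij} = 1 when the directed link (i, j) carries the flow, and the server, link-utilization, and VNF/SFC constraints are not modelled. The function names and example values are invented.

```python
def satisfies_flow_conservation(nodes, used_links, source, destination):
    """Check out-flow minus in-flow: +1 at source, -1 at destination, 0 elsewhere.

    used_links: directed links (u, v) with x_uv = 1 for a single unit flow.
    """
    balance = {v: 0 for v in nodes}
    for u, v in used_links:
        balance[u] += 1  # one unit leaves u
        balance[v] -= 1  # one unit enters v
    expected = {v: 0 for v in nodes}
    expected[source] += 1
    expected[destination] -= 1
    return balance == expected


def server_energy(server_on, server_energy_cost):
    """Objective of Eq. (5): sum of O_i * E_i over all servers."""
    return sum(server_on[i] * server_energy_cost[i] for i in server_on)


if __name__ == "__main__":
    nodes = [1, 2, 3, 4]
    print(satisfies_flow_conservation(nodes, [(1, 2), (2, 4)], 1, 4))  # valid path
    print(satisfies_flow_conservation(nodes, [(1, 2), (3, 4)], 1, 4))  # broken path
    # Hypothetical on/off states O_i and energy values E_i for three servers.
    print(server_energy({1: 1, 2: 0, 3: 1}, {1: 100, 2: 120, 3: 90}))
```

In an ILP solver, the same node-balance condition would be written as one equality constraint per node and flow, and `server_energy` would become the linear objective over the O_i variables.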
Problem 3- Simulation Setup
Table: Hardware Configuration.
Name | Description
Processor | Intel Core(TM) i5-2410M CPU 2.30 GHz
IDE | Standard SATA AHCI Controller
RAM | 4.00 GB
System Type | 64-bit Operating System, Windows 10
Problem 3- Network Topology
Figure: Abilene Network Topology (nodes 1 to 11).
Problem 3- Results
Figure: Average Power Consumption and Path Length for ONR vs. Heuristic
Network Reconfiguration (HNR)
Problem 3- Results..
Figure: Average Link Utilization (%), Average Server Utilization (%), Maximum Link Utilization (%), and Maximum Server Utilization (%) vs. Iteration for ONR and HNR.
Summary and Conclusions
In problem 1:
i) We have performed a measurement campaign over a CHW server
to retrieve the temperature and the power consumption for
different power states.
ii) When the server is put in SM, the power consumption drops almost
immediately to nearly 0, while the temperatures of the
components exhibit a transient, which has almost died out after
1800 [s].
iii) We have shown that the lifetime varies over time, and also across
the different devices, using the AF of each CHW switch.
Summary and Conclusions...
In problem 2:
i) We jointly address maintenance costs and electricity consumption in a CDC by acting
on the PS power states and the VM allocation.
ii) We account for CPU processing, the amount of transferred data, and
the VM migrations.
In problem 3:
i) We formulate the problem of SFC in an SDN-based network, with the
goal of reducing the overall energy consumption, as an Integer
Linear Programming (ILP) problem.
ii) We control the link and server congestion by putting constraints on
their maximum utilization.
iii) The proposed ONR and HNR solutions were compared in terms of
power consumption, average path length, link/server utilization,
and computational complexity.
Future Directions...
i) How can the problems be integrated with more sophisticated large-scale
scenarios? [CNIT is working on this within new EU projects.]
ii) How can multi-discipline security issues be handled in these
problems? [Cisco and Google are working on this.]
iii) Any other suggestions are welcome!
Thanks for listening. Q?
Project Link: http://superfluidity.eu/
References...
1) A measurement-based analysis of temperature variations
introduced by power management on Commodity HardWare, In
19th IEEE ICTON, pp. 1-4, 2017.
2) Lifetime-Aware ISP Networks: Optimal Formulation and Solutions,
IEEE/ACM Transactions on Networking, 2017.
3) An Optimal Approach to Reduce Electricity and Maintenance
Costs in Cloud Data Centers, IEEE Transactions on Sustainable
Computing, in press, 2017.
4) Joint Energy Efficient and QoS-aware Path Allocation and VNF
Placement for Service Function Chaining, IEEE Transactions on
Network and Service Management, under review, 2017.