Electrical Appliances Identification and Clustering using Novel Turn-on
Transient Features
Mohamed Nait Meziane1, Abdenour Hacine-Gharbi2, Philippe Ravier1, Guy Lamarque1,
Jean-Charles Le Bunetel3 and Yves Raingeaud3
1 PRISME Laboratory, University of Orléans, 12 rue de Blois, 45067 Orléans, France
2 LMSE Laboratory, University of Bordj Bou Arréridj, Elanasser, 34030 Bordj Bou Arréridj, Algeria
3 GREMAN Laboratory, UMR 7347 CNRS - University of Tours, 20 avenue Monge, 37200 Tours, France
{mohamed.nait-meziane, philippe.ravier}@univ-orleans.fr, gharbi07@yahoo.fr, {lebunetel, yves.raingeaud}@univ-tours.fr
Keywords:
Electrical Appliances Identification and Clustering, Energy Disaggregation, Non-Intrusive Load Monitoring (NILM), Sequential Forward Search (SFS) Algorithm, Supervised and Unsupervised Classification,
Turn-on Transient Features, Wrappers Feature Selection.
Abstract:
Due to the growing need for detailed consumption information in the context of energy efficiency, different energy disaggregation methods, also called Non-Intrusive Load Monitoring (NILM) methods, have been proposed.
These methods may be subdivided into supervised and unsupervised approaches. Electrical appliance classification is one of the tasks a NILM system should perform. Depending on the chosen NILM approach, the
classification task consists of either identifying the appliances or grouping them into clusters. In this paper,
we present the results of appliance identification and clustering using the Controlled On/Off Loads Library
(COOLL) dataset. We use novel features extracted from a recently proposed turn-on transient current model
for both identification and clustering. The results show that the amplitude-related features of this model are
the most suited for appliance identification (giving a classification rate (CR) of 98.57%) whereas the envelope-related features are the most adapted for appliance clustering.
1 INTRODUCTION
In the context of energy efficiency, Non-Intrusive
Load Monitoring (NILM) approaches aim to provide, in a non-intrusive manner, detailed energy consumption information. This detailed information helps increase
consumers' awareness of their energy consumption behavior,
along with other benefits. The benefits of such consumption feedback were discussed
in several previous works (Fischer, 2008) (Darby,
2010) (Hancke et al., 2012). Interest in this field,
pioneered by Hart’s work during the mid-1980s (Hart,
1985) (Hart, 1989) (Hart, 1992), has grown
rapidly in recent years, starting around
2010 (Parson, 2016).
Along with the appliance working periods and the
consumed energy, appliance class is a very important
output of a NILM system. NILM approaches may be
classified using different criteria (Zeifman and Roth,
2011). One possible taxonomy is subdividing the
approaches into supervised and unsupervised (Zoha
et al., 2012) depending on the chosen strategy for
appliance-related information inference.
Supervised NILM approaches are the most commonly found in the literature. These approaches need
labeled data for the training of the appliance classifier.
A major drawback of these approaches is their lack of robustness with respect to unseen appliances, especially
when the training dataset is small.
To alleviate this problem, an alternative is the use
of unsupervised NILM approaches (Bonfigli et al.,
2015). These approaches try to solve the NILM problem (i.e. to obtain detailed consumption information) without a priori information (Zoha et al., 2012).
These approaches face several challenges (Goncalves
et al., 2011). Nevertheless, they are better suited to
solving the NILM problem in real-case scenarios where
unseen and different appliance types may be encountered.
A mid-way approach that is worth mentioning
is the semi-supervised approach (Barsim and Yang,
2015) (Gillis and Morsi, 2016). This approach is a
mix of supervised and unsupervised approaches where a training step (supervised) helps in
the prediction of the appliance type using an unsupervised approach.
Meziane, M., Hacine-Gharbi, A., Ravier, P., Lamarque, G., Bunetel, J-C. and Raingeaud, Y.
Electrical Appliances Identification and Clustering using Novel Turn-on Transient Features.
DOI: 10.5220/0006245706470654
In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2017), pages 647-654
ISBN: 978-989-758-222-6
Copyright © 2017 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved
According to the above-mentioned approaches,
the electrical appliance classification problem can be
subdivided into two sub-problems: identification and
clustering. Identification is a supervised problem.
Having a set of data (called training dataset), labeled
with appliance type, the task is to identify the class
(appliance type) to which a new appliance belongs.
Clustering, on the other hand, is an unsupervised
problem. Having an unlabeled set of data (the appliance types are unknown), the task is to find clusters,
or groups, that define classes corresponding to appliances with common characteristics.
The aim of this paper is to present the results
of both appliance identification and clustering using novel features extracted from a recently proposed
turn-on transient current model (Nait Meziane et al.,
2015). Conclusions on the usefulness of these features for both tasks are drawn.
The paper is organized as follows: section 2
describes the features used for appliance classification and the corresponding turn-on transient current
model. This section also discusses the relevance of
these features for the classification task and motivates
the use of some chosen features instead of all the estimated ones. Section 3 presents the identification and
clustering results. It also gives a brief description
of the dataset used. The paper is concluded in section 4 where conclusions are drawn and some possible tracks for the improvement of the presented work
are given.
2 TURN-ON TRANSIENT FEATURES
2.1 Turn-on Transient Model
The work presented in this paper is based on
modeling the turn-on transient current signal using a recently proposed parametric mathematical
model (Nait Meziane et al., 2015). The parameters
of this model are then estimated and used as features for electrical appliances classification. One of
the goals of this work is to assess the usefulness of
these model parameters for classification. For simplicity, we suppose herein stationary amplitudes and
phases in contrast to the more general model presented in (Nait Meziane et al., 2015).
According to this mathematical model, the turn-on transient current is an amplitude-modulated sum of sinusoids that can be written as:
$$ s(t) = e(t)\, s_s(t) + w(t), \tag{1} $$
where
$$ s_s(t) = \sum_{i=1}^{d} A_i \cos(2\pi f_i t + \phi_i), \tag{2} $$
$$ e(t) = \begin{cases} A_0\, e^{\mathbf{b}^T \mathbf{t}} + 1, & \text{if } t \geq 0 \\ 0, & \text{otherwise} \end{cases} \tag{3} $$
and w(t) is an additive white Gaussian noise
(AWGN)¹. s_s(t) is a sum of undamped sinusoids such
that Ai (≥ 0), φi ∈ [−π, π] and fi are their amplitudes,
phases and frequencies, respectively. The number
of sinusoids d is supposed fixed and known a priori.
The sinusoid frequencies are also known and are odd-order
harmonics such that fi = (2i − 1) f0, i = 1, . . . , d
(f0 = 50 Hz is the fundamental frequency, also called
mains frequency).
The amplitude modulation, or envelope, e(t) describes the current amplitude variation from
turn-on until the steady state is reached. b =
[b1, . . . , bn]^T is a vector of n polynomial coefficients
and t = [t, . . . , t^n]^T is a time vector such that b^T t is an
nth-degree polynomial that allows the model amplitude variation to be tuned to the real signal amplitude variation.
A0 is a parameter that specifies the initial amplitude of
e(t), i.e. when t = 0, e(t = 0) = A0 + 1.
For the work presented in this paper, and after different tests on real signals, we chose d = 5 harmonics
and a polynomial order n = 3. These values provide a
good fit between the model and real signals.
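As an illustration of Eqs. (1) to (3), the noise-free model can be sketched in a few lines of Python. This is only a minimal sketch using the standard library. The function names and the numerical parameter values are hypothetical and chosen for illustration; they are not estimates from COOLL.

```python
import math

def envelope(t, a0, b):
    """Envelope e(t) of Eq. (3): a0 * exp(b1*t + ... + bn*t**n) + 1 for t >= 0, else 0."""
    if t < 0:
        return 0.0
    exponent = sum(bj * t ** (j + 1) for j, bj in enumerate(b))
    return a0 * math.exp(exponent) + 1.0

def steady_state(t, amplitudes, phases, f0=50.0):
    """Sum of sinusoids s_s(t) of Eq. (2) at the odd harmonics f_i = (2i - 1) * f0."""
    return sum(
        a * math.cos(2.0 * math.pi * (2 * i - 1) * f0 * t + phi)
        for i, (a, phi) in enumerate(zip(amplitudes, phases), start=1)
    )

def transient_current(t, a0, b, amplitudes, phases, f0=50.0):
    """Noise-free model current e(t) * s_s(t) of Eq. (1), without the w(t) term."""
    return envelope(t, a0, b) * steady_state(t, amplitudes, phases, f0)

# Hypothetical parameter values (d = 5 harmonics, n = 3), for illustration only.
A0 = 4.0
B = [-20.0, 5.0, -1.0]
AMPLITUDES = [2.0, 0.5, 0.2, 0.1, 0.05]
PHASES = [0.0, 0.3, -0.4, 1.0, -1.2]

initial_envelope = envelope(0.0, A0, B)  # equals A0 + 1, as stated in the text
```

With a decaying exponent such as the one assumed here, the envelope tends to 1, so the Ai play the role of the steady-state harmonic amplitudes.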
All the model parameters can be gathered in one vector θ = [A0, b1, b2, b3, A1, A2, A3, A4, A5,
φ1, φ2, φ3, φ4, φ5]^T. The vector θ is estimated using a nonlinear least-squares
optimization algorithm called trust-region reflective (Coleman and Li, 1994). A detailed description
of the model, the theoretical limits of the variance of
the estimated parameters (Cramér-Rao Bounds) and
the estimation algorithm will be considered in an upcoming work.
2.2 Relevance of the Turn-on Transient
Features for Classification
In this sub-section we will analyze the features in
order to select the most relevant ones for appliance
classification amongst the elements of the θ vector.
First, we do a pre-analysis for these features to get
an insight and conclude on their usefulness for appliance classification. Then, we compare the conclusions with the results of a feature selection algorithm.
1 This assumption is verified for the measurements of the
COOLL as shown in (Nait Meziane et al., 2016) where the
measurement system used to create COOLL is described.
Note that one of the main properties that make a feature useful for us is a low variability of its
value across different measurement instances of
the same appliance. Hence, we pay special attention to this property in the sequel.
2.2.1 Pre-analysis of Features for Selection
Phase-related Features. The phase features φi, i =
1, . . . , d specify the position (in radians) of the sinusoids cos(2π fi t + φi) at t = 0 with respect to the 2π
time-cycle. These features are subject to an ambiguity in their definition. For example, the solution of
cos(φi) = 1 is φi = 2πm with m being an integer. This means that the solution is not unique and
a set of solutions exists. This problem can be alleviated by only keeping the solutions that are in the
range [−π, π]. Still, we will end up having two solutions to choose from (for example, the solutions of
cos(φi) = 1/2 are π/3 and −π/3).
Moreover, when working on real signals, the
slightest nonstationarity
(especially the very small variations of the mains frequency,
which are usually less than 0.5 % of 50 Hz) seemed
to negatively affect the estimated phase values (Figure 1).
For all the above-mentioned reasons, we chose not
to use the phase features for the classification.
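The two ambiguities described above can be checked numerically. The short sketch below is not part of the original study, and the helper name wrap_phase is hypothetical. It wraps an angle to the principal interval, then shows that a sign ambiguity remains inside that interval.

```python
import math

def wrap_phase(phi):
    """Map any angle (radians) to its equivalent in the interval [-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))

# Adding whole turns (multiples of 2*pi) does not change the wrapped phase.
wrapped = wrap_phase(math.pi / 3 + 4 * math.pi)

# Inside [-pi, pi] a sign ambiguity remains: both phases give the same cosine.
same_cosine = math.isclose(math.cos(math.pi / 3), math.cos(-math.pi / 3))
```

Wrapping removes the 2πm ambiguity but cannot remove the ± ambiguity.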
Amplitude-related Features. Unlike the phase features, the amplitude features Ai, i = 1, . . . , d, representing the amplitudes of the sinusoids (Eq. (2)), are
much more stable (Figure 2). The estimated values
also show a variability between the estimated amplitudes of different appliances, even for appliances of
the same type. This suggests that these features are a
good candidate for appliance identification rather than
clustering.
In the sequel, these features are kept for use in
appliance classification.
Figure 1: An example showing the instability of the estimated phase features φi from measurements of the same appliance: “an electric saw” from the COOLL dataset. The
x-axis represents different action delays (0 to 19 ms), i.e. different measurements of the same appliance. Between action
delays the mains frequency is likely to slightly vary, which
affects the phase estimates.
Figure 2: An example showing the stability of the estimated
amplitude features Ai from measurements of the same appliance: “an electric saw” from the COOLL dataset. The
x-axis represents different action delays (0 to 19 ms), i.e. different measurements of the same appliance.
Figure 3: Functions g1(t) and g2(t).
Envelope-related Features. The envelope features
are A0 and bj, j = 1, . . . , n. Whereas we ideally
seek time-independent features for appliance classification, all bj depend on the time reference t0 except
bn. To illustrate this, we consider the following exponent of a simulated envelope with n = 2:
$$ g_1(t) = b_1 t + b_2 t^2, \quad t \in [0, 1]\ \text{s} \tag{4} $$
with b1 = 0.5 and b2 = −2 representing the polynomial coefficients. Our time reference is t0 = 0 s. Suppose now that we shift our time reference from t0 = 0 s
to t0 = 0.3 s (i.e. we define a new time interval starting at
0.3 s). We then obtain the function g2(t) = g1(t + t0),
which is the portion of g1(t) on the newly defined interval (Figure 3). Estimating the parameters b'1 and b'2
of g2(t), we find that b'1 = −0.7 and b'2 = −2. This
corresponds to:
$$ \begin{aligned} g_2(t) &= b_1 (t + t_0) + b_2 (t + t_0)^2 \\ &= (b_1 t_0 + b_2 t_0^2) + (b_1 + 2 b_2 t_0)\, t + b_2 t^2 \\ &= b'_0 + b'_1 t + b'_2 t^2. \end{aligned} \tag{5} $$
The parameter that is not affected by the time reference shift is b2. A similar reasoning
for n > 2 shows that the only time reference-independent
parameter is the last coefficient bn, which is therefore adapted for use in
appliance classification.
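The shift of Eq. (5) generalizes to any degree n through the binomial expansion of (t + t0)^j. The sketch below is an illustration added here, not the authors' estimation code, and the function name shift_polynomial is hypothetical. It recomputes the coefficients of the shifted exponent and reproduces the numerical example above.

```python
from math import comb

def shift_polynomial(b, t0):
    """Coefficients [b'_0, ..., b'_n] of g(t + t0) for g(t) = b[0]*t + ... + b[n-1]*t**n."""
    n = len(b)
    shifted = [0.0] * (n + 1)
    for j, bj in enumerate(b, start=1):
        # Binomial expansion: (t + t0)**j = sum over m of C(j, m) * t**m * t0**(j - m)
        for m in range(j + 1):
            shifted[m] += bj * comb(j, m) * t0 ** (j - m)
    return shifted

# Example of the text: b1 = 0.5, b2 = -2, time reference shifted by t0 = 0.3 s.
b0_new, b1_new, b2_new = shift_polynomial([0.5, -2.0], 0.3)
```

For this example the call yields b'1 = −0.7 and b'2 = −2 as in the text, together with b'0 = b1 t0 + b2 t0² = −0.03. For any degree, only the last coefficient is left unchanged by the shift.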
Note also that this time reference shift generates
a new term b'0 that we can pull out of the exponent
and multiply by A0 to get A'0 = A0 e^{b'0}. Practically, we
always take the time reference as the time where the
current is at its extremum (max or min). Therefore,
the estimated A0 (or A'0) specifies the highest amplitude the current can reach and is important to keep.
Finally, we select for the appliance classification
A0 and b3 along with Ai , i = 1, . . . , 5.
2.2.2 Feature Selection using a Wrapper Approach for Identification
Following the pre-analysis sub-section and in order
to check the soundness of the selected features, we
propose hereafter to use a wrapper-based algorithm
to perform the feature selection task.
The goal is to select the most relevant set of features from the set of all available features (14 in our
case) for appliance identification. We propose
to use the wrapper-based sequential forward search
(SFS) algorithm (Kohavi and John, 1997). At each selection step, this algorithm adds the “relevant” feature that gives the highest possible classification rate (CR). This selection procedure was used
in (Hacine-Gharbi et al., 2015) and is summarized in
Algorithm 1.
Algorithm 1: Wrapper-based sequential forward search (SFS) algorithm.
1. F = {A0, b1, b2, b3, A1, . . . , A5, φ1, . . . , φ5}, S = {}, I = 14 (initial number of parameters), j = 1 (iteration index).
2. • Evaluate the classification rate CR for each feature fl ∈ F.
   • Select the first feature fπ1 such that: fπ1 = arg max_{fl ∈ F} CR(fl).
   • F = F − {fπ1}, S = {fπ1}.
3. • j = j + 1.
   • For each fl ∈ F, evaluate CR using S ∪ {fl}.
   • Select the feature fπj such that: fπj = arg max_{fl ∈ F} CR(S ∪ {fl}).
   • F = F − {fπj}, S = S ∪ {fπj}.
4. Repeat step 3 until j = I.
5. Give the output S that yields the maximum CR.
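The steps of Algorithm 1 can be sketched as follows. This is a minimal illustration in which the scoring function stands in for the classification rate CR obtained with a classifier. The toy scores and all names are hypothetical and serve only to make the example runnable.

```python
def sequential_forward_search(features, score):
    """Greedy SFS: return (best_subset, best_score, selection_order)."""
    remaining = list(features)
    selected = []
    best_subset, best_score = [], float("-inf")
    while remaining:
        # Step 3: try each remaining feature together with those already selected.
        candidate = max(remaining, key=lambda f: score(selected + [f]))
        remaining.remove(candidate)
        selected.append(candidate)
        current = score(selected)
        # Step 5: remember the subset that gave the highest score so far.
        if current > best_score:
            best_subset, best_score = list(selected), current
    return best_subset, best_score, selected

# Toy stand-in for the classification rate (hypothetical values, illustration only).
TOY_GAIN = {"A1": 0.6, "A2": 0.2, "A3": 0.1, "phi1": -0.05, "b1": -0.1}

def toy_score(subset):
    return sum(TOY_GAIN[f] for f in subset)

best, best_cr, order = sequential_forward_search(TOY_GAIN, toy_score)
```

In a real setting, score would train and test a classifier on the candidate subset, which makes the wrapper approach costly, since step 3 alone requires one evaluation per remaining feature.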
The chosen classifier is based on the k nearest
neighbors (k-NN) algorithm. In order to study the
effect of the value of k on the result, we propose to
apply the selection procedure for k = 1, 2, . . . , 10.
The feature selection procedure is conducted using the COOLL dataset (sub-section 3.1). COOLL
was divided by keeping the first 10 measurement instances from each appliance for the training and the
remaining 10 instances for the test (each appliance
having 20 instances). This yielded 420 measurement
instances for training and 420 measurement instances
for the test.
Table 1 gives the corresponding indexes of the initial set of features that will be used to show the feature
selection result.
Table 1: Initial set of features and corresponding indexes.
fl    index    fl    index
A0    1        A4    8
b1    2        A5    9
b2    3        φ1    10
b3    4        φ2    11
A1    5        φ3    12
A2    6        φ4    13
A3    7        φ5    14
The S matrix (Eq. (6)) gives the selection result
where each row k corresponds to a specific k nearest
neighbors choice and each column j corresponds to
the set of selected features, by relevance, at each iteration j (Algorithm 1). The matrix elements are the
indexes of the initial set of features (Table 1).
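$$ S = \begin{pmatrix}
5 & 6 & 7 & 8 & 9 & 1 & 2 & 4 & 3 & 10 & 11 & 13 & 12 & 14 \\
5 & 6 & 7 & 8 & 9 & 1 & 2 & 4 & 3 & 10 & 11 & 13 & 12 & 14 \\
5 & 6 & 7 & 8 & 9 & 1 & 2 & 4 & 3 & 10 & 11 & 14 & 12 & 13 \\
5 & 6 & 9 & 8 & 7 & 1 & 2 & 3 & 4 & 11 & 10 & 14 & 13 & 12 \\
5 & 6 & 7 & 8 & 9 & 1 & 2 & 4 & 3 & 11 & 10 & 13 & 14 & 12 \\
5 & 6 & 7 & 8 & 9 & 2 & 1 & 4 & 3 & 11 & 10 & 14 & 13 & 12 \\
5 & 6 & 7 & 8 & 9 & 2 & 1 & 4 & 3 & 13 & 10 & 12 & 11 & 14 \\
5 & 6 & 7 & 8 & 9 & 1 & 2 & 3 & 4 & 11 & 10 & 13 & 12 & 14 \\
5 & 6 & 7 & 8 & 9 & 1 & 2 & 3 & 4 & 11 & 14 & 12 & 10 & 13 \\
5 & 6 & 7 & 8 & 9 & 1 & 2 & 3 & 4 & 11 & 10 & 12 & 13 & 14
\end{pmatrix} \tag{6} $$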
In Eq. (6), and for each row, the element that corresponds to the maximum CR is highlighted. The selected features are, then, the set that corresponds to
this element (following Algorithm 1, for a specific
row, each element represents the last added feature
to S ). In the first row, for example, the highlighted element is the fourth element indexed 8. Hence, the selected features are A1 , A2 , A3 and A4 with corresponding indexes 5, 6, 7, and 8, respectively. To better illustrate this, Figure 4 gives the CR corresponding to
rows 1, 6 and 10 of matrix S.
The selection results (see highlighted elements
of matrix S) indicate that the algorithm selected the
amplitude-related features as the most relevant for the
identification. This is in agreement with the result
of sub-section 2.2.1 where the amplitude-related features were suspected to be good candidates for appliance identification.
Note also that the phases are the least adapted for
appliance identification which, again, is in agreement
with the observations made in sub-section 2.2.1.
(a) CR variation corresponding to rows k = 1, 6 and 10 (representing also the number k of chosen nearest neighbors) of
matrix S (Eq. (6)).
(b) Zoom of Figure 4a. The maximum values of CR for 1-NN, 6-NN and 10-NN are found, respectively, for j = 4, 5
and 3.
Figure 4: Classification rate CR (%) as a function of the iteration index j (see Algorithm 1; j is also the column index of
matrix S (Eq. (6))).
3 ELECTRICAL APPLIANCES IDENTIFICATION AND CLUSTERING
In this section we give the identification and clustering results right after giving a brief description of the
COOLL dataset used.
In order to assess the usefulness of the model parameters for the classification, we do three tests for
both the identification and the clustering. In the first
test we only use the envelope-related features A0 and
b3. In the second one, we use solely the amplitude-related features Ai and in the last one we use all seven
features.
3.1 COOLL Dataset Description
Controlled On/Off Loads Library (COOLL) is a
dataset of electrical current and voltage
measurements sampled at a high rate (840 current measurements and 840
voltage measurements) representing the consumption of individual appliances. The measurements were taken in
June 2016 in the PRISME laboratory of the University of Orléans, France. The appliances are mainly
controllable appliances (i.e. we can precisely control their turn-on/off time instants). 42 appliances of
12 types were measured at a 100 kHz sampling frequency (Table 2). A more detailed description of this
dataset and its specificities can be found in (Picon
et al., 2016).
Table 2: COOLL dataset summary. Source: (Picon et al.,
2016).
N°   Appliance type    # of appliances   # of current signals (20 per appliance)
1    Drill             6                 120
2    Fan               2                 40
3    Grinder           2                 40
4    Hair dryer        4                 80
5    Hedge trimmer     3                 60
6    Lamp              4                 80
7    Paint stripper    1                 20
8    Planer            1                 20
9    Router            1                 20
10   Sander            3                 60
11   Saw               8                 160
12   Vacuum cleaner    7                 140
     Total             42                840
Note that we will use the current measurements
for the classification. The voltage measurements
(also sampled at 100 kHz) show low variability from
one appliance to another; they are therefore less suited to
classification and are discarded in this study.
The measurements of COOLL are 6 seconds long
with a pre-turn-on duration of 0.5 second and a
post-turn-off duration of 1 second. Each appliance
has 20 measurement instances. Each instance corresponds to a specific action delay (a turn-on delay with respect to
the mains voltage time-cycle) ranging from 0 to 19 ms
with a step of 1 ms (Picon et al., 2016).
3.2 Identification
For the appliance identification task presented hereafter, we use the supervised algorithm k-NN. This algorithm allows the prediction of the class of a new
point (in the feature space representing an appliance)
by computing relative distances between this latter
and points in its neighborhood. The appliance class is,
then, decided with a majority vote between the classes
of the k nearest neighbors.
The parameters to be fixed are the number k of
nearest neighbors to consider and the distance metric.
For our tests, we chose the Euclidean distance and k =
10. A low k value (< 5) may degrade the robustness
of the classifier by increasing the risk of classifying
using isolated points (the extreme case being k =1).
Since we already know, for our dataset, that a well-formed group should contain 20 points (the 20 measurement instances of the same appliance), we chose
to compare each new point with half of the number
of points we are supposed to find in its neighborhood,
hence k = 10.
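The decision rule described above (Euclidean distance, majority vote among the k nearest neighbors) can be sketched as follows. The feature vectors and labels are hypothetical two-dimensional examples, not COOLL features, and ties in the vote are resolved here by the order in which labels are encountered, which is an assumption of this sketch.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=10):
    """Majority vote among the k training points closest (Euclidean) to query."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional feature vectors, for illustration only.
POINTS = [(1.0, 0.1), (1.1, 0.2), (0.9, 0.1), (5.0, 2.0), (5.2, 2.1), (4.9, 1.9)]
LABELS = ["drill", "drill", "drill", "saw", "saw", "saw"]

predicted = knn_predict(POINTS, LABELS, (1.05, 0.15), k=3)
```

With k = 1 the prediction depends on a single, possibly isolated, training point, which is the robustness issue mentioned in the text.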
Since our dataset is not big enough to have a lot of
measurement instances of all appliances in both the
training and the test datasets, and in order to get more
reliable test results, we chose to evaluate the identification performance using K-fold cross-validation
with K = 10. The K-fold cross-validation consists
of dividing the dataset, randomly, into K sets, doing
K tests and averaging the results. For each test, we
choose one different set from all the K sets for the test
keeping the others for the training. We repeat this K
times and we average the obtained results.
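A minimal sketch of the K-fold procedure follows. The evaluate callback, which would train and test the classifier on the given index sets, is a hypothetical placeholder. The fixed seed and the interleaved fold construction are assumptions made only to keep the example reproducible.

```python
import random

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Randomly split range(n_samples) into n_folds disjoint test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[f::n_folds] for f in range(n_folds)]

def cross_validate(n_samples, evaluate, n_folds=10, seed=0):
    """Average evaluate(train_idx, test_idx) over the n_folds train/test splits."""
    folds = k_fold_indices(n_samples, n_folds, seed)
    scores = []
    for f, test_idx in enumerate(folds):
        # Every fold except the current one is used for training.
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / n_folds

folds = k_fold_indices(840, 10)  # 840 measurements -> 10 test sets of 84
```

Each measurement is used exactly once for testing and K − 1 times for training.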
Figure 5 gives the confusion matrix for the identification using the parameters A0 and b3 . With a classification rate (CR) of 82.98%, the identification gives
several bad results which indicates that these features
are not the most adapted for appliance identification.
Figure 6 gives the confusion matrix for the identification only using the features Ai . We note the (very)
good CR value of 98.57%. This suggests that these
features are more adapted to appliance identification
than the envelope-related features.
Figure 7, on the other hand, shows the
identification confusion matrix obtained using
all seven selected features (A0, b3 and Ai). With a CR of 90.95%,
the use of all the features improves on the result of the
envelope-related features alone but degrades the result of the amplitude-related features alone.
Figure 5: Confusion matrix (true classes versus predicted classes) for identification using A0 and
b3. Classification rate CR = 82.98%.
Figure 6: Confusion matrix (true classes versus predicted classes) for identification using Ai.
Classification rate CR = 98.57%.
Figure 7: Confusion matrix (true classes versus predicted classes) for identification using A0, b3
and Ai. Classification rate CR = 90.95%.
In conclusion, the identification
results confirm that the features Ai are well suited to
the identification task whereas A0
and b3 are not.
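The reported classification rates can be cross-checked from the confusion matrices, since CR is the number of correctly identified measurements (the matrix diagonal) divided by the total number of measurements. The sketch below uses the diagonal counts as read from Figures 5 to 7, in the class order of Table 2. The variable and function names are hypothetical.

```python
# Diagonal counts (correctly identified measurements per class), in the class
# order of Table 2, as read from the confusion matrices of Figures 5 to 7.
DIAGONALS = {
    "A0 and b3": [97, 39, 24, 71, 33, 76, 20, 20, 8, 58, 117, 134],
    "Ai": [120, 39, 40, 80, 54, 80, 20, 20, 20, 55, 160, 140],
    "A0, b3 and Ai": [103, 39, 31, 77, 32, 79, 20, 20, 13, 60, 150, 140],
}
TOTAL_MEASUREMENTS = 840  # 42 appliances x 20 measurement instances

def classification_rate(diagonal, total):
    """Percentage of measurements lying on the confusion matrix diagonal."""
    return 100.0 * sum(diagonal) / total

rates = {
    name: round(classification_rate(diag, TOTAL_MEASUREMENTS), 2)
    for name, diag in DIAGONALS.items()
}
```

The three rates obtained this way are 82.98%, 98.57% and 90.95%, in agreement with the values quoted in the captions.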
3.3 Clustering
For the clustering, we use one of the best-known unsupervised algorithms, namely k-means. It requires the
user to specify a priori the number of clusters
to be formed. We chose k = 12 clusters based on
the number of appliance types in the COOLL
dataset.
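For reference, a basic k-means iteration (Lloyd's algorithm) is sketched below. The paper does not state which implementation or initialization was used, so the random initialization from the data points, the fixed seed and the toy data are assumptions of this sketch.

```python
import math
import random

def k_means(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: return (centroids, labels) for a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
    return centroids, labels

# Hypothetical, well-separated two-dimensional points, for illustration only.
DATA = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(DATA, k=2)
```

The cluster indices returned by such an algorithm are arbitrary, which is why the clustering results are read as a table of true classes versus found clusters rather than as a classification rate.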
Figure 8 shows the results as a confusion matrix
using A0 and b3. The algorithm formed
one big cluster (cluster 1) and several smaller clusters. The smaller clusters contain only lamps whereas
the big cluster contains mostly motor-driven appliances. This result is very interesting. The appliances
of the COOLL dataset are mostly motor-driven appliances (i.e. appliances with a motor that is responsible for the main task the appliance is supposed to
perform) that were gathered inside that big cluster.
This indicates that the envelope-related features are
suitable for distinguishing appliances having different
working principles.
Lamps have a different working principle than the
motor-driven loads. One reason that may explain the scatter of the remaining (non-motor-driven) appliances
over different clusters is that the lamps are
of different types that have different working principles. Actually, the four lamps of COOLL are, respectively, a 1.6 W light emitting diode (LED), a 15 W
compact fluorescent lamp (CFL), a 105 W halogen
lamp (HL) and a 100 W halogen lamp (HL).
The tests we did show that the feature b3 is related
to the envelope amplitude decrease rate. Higher (negative) values indicate faster amplitude decrease.
Note also (Figure 8) that 46 lamp measurements were grouped
inside the big cluster. These seemingly correspond to
lamps with no transient (Figure 9) (the current goes
almost directly from zero to the steady-state and no
transition is observed; they are most likely LED and
CFL lamps) and, hence, are different from lamps with
high amplitude variation transients (halogen lamps).
Figure 10 shows the clustering result using the Ai
features. Clearly, these features are not adapted for
the clustering since no link between the found clusters and the true classes is clear and no deterministic pattern seems visible to distinguish well-defined
clusters.
The result of Figure 11 shows that the use of all
available features still allows us to retrieve the motor-driven cluster even with the use of the Ai features, in
contrast to what happened with identification.
Figure 8: Confusion matrix (true classes versus found clusters) for clustering using A0 and b3.
Figure 9: Turn-on transient current of a light emitting diode
(LED). The interval [0, 1] s represents the pre-turn-on period whereas the interval [5, 6] s represents the post-turn-off
period.
Figure 10: Confusion matrix (true classes versus found clusters) for clustering using Ai.
Figure 11: Confusion matrix (true classes versus found clusters) for clustering using A0, b3 and
Ai.
4 CONCLUSIONS
In this paper, novel features extracted from a recently
proposed mathematical model of the turn-on transient current were presented and used
to classify electrical appliances. These features were
analyzed in order to select a set of features
that is relevant for appliance classification. From a
set of fourteen features, seven were selected. A sequential forward search (SFS) wrapper-based selection algorithm was also used and its results validated
the soundness of the previously selected features.
The Controlled On/Off Loads Library (COOLL)
was used for the classification. A comparison between the appliance identification and clustering results using the turn-on transient features was conducted. The results indicate that the amplitude-based
features Ai , i = 1, . . . , 5 are the most relevant for appliance identification whereas the envelope-based features A0 and b3 are the most relevant for appliance
clustering.
Future work may further investigate the robustness of the obtained results by testing the classification on other datasets that are larger than COOLL
and contain other families of appliance types (TV,
washing machines, refrigerator, etc.). Other problems
like model selection (parameters d and n) for the turn-on transient current model may also be addressed.
ACKNOWLEDGEMENTS
This study was supported by the Région Centre-Val
de Loire (France) as part of the project MDE–MAC3
(Contract n° 2012 00073640).
REFERENCES
Barsim, K. S. and Yang, B. (2015). Toward a semi-supervised non-intrusive load monitoring system for
event-based energy disaggregation. In 2015 IEEE
Global Conference on Signal and Information Processing (GlobalSIP), pages 58–62.
Bonfigli, R., Squartini, S., Fagiani, M., and Piazza, F.
(2015). Unsupervised algorithms for non-intrusive
load monitoring: An up-to-date overview. In Environment and Electrical Engineering (EEEIC), 2015 IEEE
15th International Conference on, pages 1175–1180.
Coleman, T. F. and Li, Y. (1994). On the convergence of
interior-reflective Newton methods for nonlinear minimization subject to bounds. Mathematical Programming, 67(1):189–224.
Darby, S. (2010). Smart metering: what potential for householder engagement? Building Research & Information, 38(5):442–457.
Fischer, C. (2008). Feedback on household electricity consumption: a tool for saving energy? Energy efficiency,
1(1):79–104.
Gillis, J. M. and Morsi, W. G. (2016). Non-intrusive load
monitoring using semi-supervised machine learning
and wavelet design. IEEE Transactions on Smart
Grid, PP(99):1–8.
Goncalves, H., Ocneanu, A., Berges, M., and Fan, R.
(2011). Unsupervised disaggregation of appliances
using aggregated consumption data. In The 1st KDD
Workshop on Data Mining Applications in Sustainability (SustKDD).
Hacine-Gharbi, A., Petit, M., Ravier, P., and Némo, F.
(2015). Prosody based automatic classification of the
uses of French oui as convinced or unconvinced uses.
In International Conference on Pattern Recognition
Applications and Methods (ICPRAM), number ISBN
978-989-758-077-2, pages 349–354.
Hancke, G. P., Hancke Jr, G. P., et al. (2012). The role of
advanced sensing in smart cities. Sensors, 13(1):393–
425.
Hart, G. W. (1985). Prototype nonintrusive appliance load
monitor. In MIT Energy Laboratory Technical Report,
and Electric Power Research Institute Technical Report.
Hart, G. W. (1989). Residential energy monitoring and
computerized surveillance via utility power flows.
Technology and Society Magazine, IEEE, 8(2):12–16.
Hart, G. W. (1992). Nonintrusive appliance load monitoring. Proceedings of the IEEE, 80(12):1870–1891.
Kohavi, R. and John, G. H. (1997). Wrappers for feature
subset selection. Artificial intelligence, 97(1):273–
324.
Nait Meziane, M., Picon, T., Ravier, P., Lamarque, G.,
Le Bunetel, J.-C., and Raingeaud, Y. (2016). A
measurement system for creating datasets of on/off-controlled electrical loads. In Conference on Environment and Electrical Engineering (EEEIC), Proceedings of the 16th IEEE International, pages 2579–2583.
Nait Meziane, M., Ravier, P., Lamarque, G., Abed-Meraim,
K., Le Bunetel, J.-C., and Raingeaud, Y. (2015). Modeling and estimation of transient current signals. In
Signal Processing Conference (EUSIPCO), 2015 Proceedings of the 23rd European, pages 2005–2009.
Parson, O. (2016). Overview of the NILM field.
http://blog.oliverparson.co.uk/2015/03/overview-of-nilm-field.html.
Picon, T., Nait Meziane, M., Ravier, P., Lamarque, G., Novello, C., Le Bunetel, J.-C., and Raingeaud, Y. (2016).
COOLL: Controlled on/off loads library, a public
dataset of high-sampled electrical signals for appliance identification. arXiv preprint arXiv:1611.05803.
Zeifman, M. and Roth, K. (2011). Nonintrusive appliance load monitoring: Review and outlook. Consumer
Electronics, IEEE Transactions on, 57(1):76–84.
Zoha, A., Gluhak, A., Imran, M. A., and Rajasegarar, S.
(2012). Non-intrusive load monitoring approaches
for disaggregated energy sensing: a survey. Sensors,
12(12):16838–16866.