Optimisation of monitoring networks for water systems
Information theory, value of information and public participation
José Leonardo Alfonso Segura
OPTIMISATION OF MONITORING NETWORKS
FOR WATER SYSTEMS
INFORMATION THEORY, VALUE OF INFORMATION AND
PUBLIC PARTICIPATION
DISSERTATION
Submitted in fulfilment of the requirements of
the Board for Doctorates of Delft University of Technology
and of the Academic Board of the UNESCO-IHE Institute for
Water Education for the Degree of DOCTOR
to be defended in public
on Tuesday 16th of November 2010 at 15:00 hours
in Delft, The Netherlands
by
José Leonardo ALFONSO SEGURA
born in Bogotá, Colombia.
Master of Science in Hydroinformatics
UNESCO-IHE Delft, the Netherlands
This dissertation has been approved by the supervisor:
Prof. dr. R.K. Price
Members of the Awarding Committee:
Chairman                       Rector Magnificus, TU Delft, The Netherlands
Vice-chairman                  Rector UNESCO-IHE, The Netherlands
Prof. dr. N.C. van de Giesen   TU Delft, The Netherlands
Prof. dr. A.W. Heemink         TU Delft, The Netherlands
Prof. dr. S. Uhlenbrook        UNESCO-IHE/VU Amsterdam, The Netherlands
Prof. dr. V.P. Singh           Texas A and M University, USA
Prof. dr. R.K. Price           TU Delft/UNESCO-IHE, The Netherlands (supervisor)
Dr. A. Lobbrecht               UNESCO-IHE, Hydrologic, The Netherlands
Prof. dr. H. H. G. Savenije    TU Delft, The Netherlands (reserve)
CRC Press/Balkema is an imprint of the Taylor & Francis Group, an informa business
©2010, José Leonardo Alfonso Segura
All rights reserved. No part of this publication or the information contained
herein may be reproduced, stored in a retrieval system, or transmitted in any
form or by any means, electronic, mechanical, by photocopying, recording or
otherwise, without written prior permission from the publisher.
Although care is taken to ensure the integrity and quality of this publication
and the information therein, no responsibility is assumed by the publishers nor
the author for any damage to property or persons as a result of operation or
use of this publication and/or the information contained herein.
Published by:
CRC Press/Balkema
PO Box 447, 2300 AK Leiden, The Netherlands
e-mail: Pub.NL@taylorandfrancis.com
www.crcpress.com www.taylorandfrancis.co.uk www.balkema.nl
ISBN 978-0-415-61580-8 (Taylor & Francis Group)
Dedicated with endless love
to my wife Sandra and our children
Valentina and Sebastian
Summary
A vital part of modern water management is the measurement of different processes of
the hydrologic cycle. For this purpose, monitoring networks provide data that is analysed
to help managers make informed decisions about a water system of interest. However,
there are a number of main challenges regarding the design and evaluation of monitoring
networks, which range from establishing proper temporal and spatial scales to defining
their scope at minimum costs.
In theory, these difficulties are generally addressed by scientists when developing new
approaches for monitoring network design. However, in practice, the data collected by
the existing monitoring networks remains, in general, inadequate for understanding and
explaining the dynamics of natural systems. Although this may be because the criteria to
establish the final monitoring network are driven in practice by non-scientific aspects, such
as political and social viewpoints, new approaches that involve the nature of the decision
maker and stakeholder participation can reduce the gap between theory and practice.
In this Ph.D. research, funded by Delft Cluster, Delfland Water Board and UNESCO-IHE, innovative methods to design and evaluate monitoring networks are addressed. The
main idea is to maximise the performance of water systems by optimising the information
content that can be obtained from monitoring networks. When we talk about performance
of water systems, we refer to the classification of how the water “behaves” with respect
to each particular water use and interest within a particular system. The estimation of this
performance should drive decisions made about the system. Additionally, when we talk
about maximising information content, we refer to the task of seeking those potential
locations where the placement of a set of data collection devices can give the best
indication of the state of the system at any point within the water system.
Three stages form the pillars of the research: Information Theory, Value of Information,
and data collection by public participation. The first pillar is an information theory-based
method for locating water-level monitoring stations. It places a set of monitoring points
so as to reduce the uncertainty of water-level measurements in a water system, where
uncertainty is understood from the perspective of Shannon's information theory.
Two main methods were developed. The first is the Water Level Monitoring
Design in Polders (WMP) method, in which the monitors are located one by one keeping
the information content of the set as high as possible and the mutual information between
pairs of monitors as low as possible. The second approach considers a Multi Objective
Optimization Problem (MOOP), in which two practical situations are considered: 1) the
costs of placing new monitors, and 2) the cost of placing monitors too close to hydraulic
structures. The costs were considered in terms of informative units, for which additional
terms affecting the objective functions were introduced. In addition, a third approach that
considers the problem of locating discharge monitors in the Magdalena River by
complementing the MOOP with a rank-based greedy algorithm, is presented.
The second pillar considers a method to place monitoring devices according to the value
that the user of these devices assigns to the information collected, based on its
usefulness for making decisions. We use concepts of Value of Information, defined as the
expected difference between the utility of choosing a particular action given the prior
beliefs about the state of the system and the utility of choosing the action given additional
information coming from an informative device (monitor). The approach selects the most
valuable set of monitors for a particular water system, taking into consideration the
decision-maker’s prior beliefs, the consequences of the system performance and the
quality of the informational message that the monitoring network would provide. This
new approach is characterised by a theoretical development for establishing the value of
the location for one monitor and its extension to the case for n monitors. In addition, it is
proposed that the probabilistic variables needed to calculate the Value of Information are
estimated through the use of models.
The third pillar of the research, driven by a significant practical component, aims to
explore new possibilities of gathering information with mobile phones and to improve
models with this data. Nowadays mobile phones cannot be considered as merely
telephones for voice transmission anymore: they have become devices that combine PC
software, digital cameras, calculators and agendas and that are able to provide Internet,
radio, television and fax services. In particular, mobile phones may be used by the public
for water level monitoring. The idea is to take advantage of the benefits of public
participation in monitoring, identified by diverse authors, which include the creation of
public awareness on environmental issues, the improvement of collaboration among all
stakeholders, the cost effectiveness of data collection activities and the high coverage in
space and time. An important contribution in this respect is the Mobile Monitoring
Experiment (MoMoX), an experiment carried out during 2010 in the polders of Pijnacker
with participants of diverse affiliations, including residents near the water level gauges.
Two main case studies of very different nature are used to test the developed methods
and theories. The first is the polders of Pijnacker, a typical low-lying, flat and highly
controlled regional system in the Netherlands, located to the East of the Delfland area. The region
is mainly rural, with some urban development and greenhouses. From the hydrologic
point of view, four major polders, subdivided into 127 smaller polders, are hydrologically
independent response units, with unique target water levels. The system is controlled by a
number of pumping stations, inlets and fixed weirs that are operated in order to keep the
water levels in the canal network between limits defined by the water management of the
region.
The second case study corresponds to the Magdalena River, the major river system in
Colombia and the largest river discharging into the Caribbean Sea. It runs for about 1,530
kilometres from South to North, draining a catchment that covers 24% of the area of the
country and is home to 77% of its population. The area of interest for this research is the
middle and low Magdalena, which are important for the country not only in terms of
economic activities such as navigation, fish production and agriculture, but also because
these regions suffer most from flooding.
The results of this research demonstrate that monitoring networks can be evaluated and
designed by considering new criteria, such as the information content, the nature of the
information user and the potential of current mobile phone technology for data collection.
In particular, new methodologies for optimising monitoring networks have been developed
that couple models with concepts of Information Theory and aspects of Value of Information,
and a public-based monitoring network for water-level data collection, characterised by
the use of mobile phones, has been configured, tested, run and assessed.
Additionally, this research opens new possibilities for the application of Information
Theory in water resources, in which information has been traditionally quantified in
entropy units. With the inclusion of Value of Information concepts, the entropy-based
methods developed in this thesis, as well as the ones developed in previous studies, can
now be adjusted to consider measuring the information in monetary terms.
Contents
Summary
Chapter 1 Introduction
1.1 Background
1.2 Motivation of this thesis
1.3 Research questions and objectives
1.4 Scope of this thesis
1.5 Thesis outline
Chapter 2 Literature review
2.1 Introduction
2.2 Design and evaluation of monitoring networks
2.3 Information Theory (IT)
2.4 Value of Information (VOI)
2.5 Public participation in monitoring
2.6 Model reliability
Chapter 3 Case study 1: Polders of Pijnacker
3.1 Introduction
3.2 Description of the polder system of the Pijnacker region
3.3 Water level monitoring network
3.4 Description of the model of the Pijnacker polder system
Chapter 4 Case study 2: Magdalena River
4.1 Introduction
4.2 Performance of the Magdalena River
4.3 Development of the hydrodynamic model for the Magdalena River
4.4 Limitations of the model
Chapter 5 Information theory for monitor location
5.1 Introduction
5.2 Information theory-based approach for location of monitoring water level gauges in polders
5.3 Optimizing information measures for the design and evaluation of monitoring networks in polders
5.4 Evaluation of the monitoring network of the Magdalena River
5.5 Conclusions
Chapter 6 Value of information for monitor location
6.1 Introduction
6.2 Definition of variables for VOI estimation
6.3 Value of the location for one monitor
6.4 Value of the locations for two monitors
6.5 Selection of monitor locations based on VOI
6.6 Case studies
6.7 Conclusions
Chapter 7 Public data collection and assessment of model reliability
7.1 Introduction
7.2 Public participation in data collection
7.3 Results of the experiment
7.4 Assessing model errors with public’s data
7.5 Conclusions
Chapter 8 Conclusions and recommendations
8.1 Conclusions
8.2 Recommendations
Chapter 9 References
List of figures
List of tables
Notations
Abbreviations
Acknowledgements
About the author
Samenvatting
Chapter 1
Introduction
Every day we measure things. We look at our watches in order to keep track of time to
avoid running too late; we look at the petrol level to know whether we can reach our
destination with the available fuel; we measure our weight to check how much
time we should spend in the gym; we look into our pockets to know whether we have
enough money to buy something (or to know whether it is time to ask for a salary raise).
In general, we measure because we want to make informed decisions about something.
The path between measuring and making a decision is, however, not straightforward.
Once we measure, we must interpret and analyse the data, and account for errors it may
have. Then, we gain the information that we need to understand and to decide.
A vital part of modern water management is the measurement of the different processes
of the hydrologic cycle. For this purpose, monitoring networks are situated to generate
data with useful information content, which is used by managers of water systems and by
decision-makers to maintain a good performance of their water system.
This chapter addresses the subject of this thesis and the importance of the research
problem, and introduces the motivation to solve it.
1.1 Background
This section describes water systems and their performance, the importance of
monitoring networks and their role in the modelling process, and introduces alternative
means of data collection.
1.1.1 Water systems and their performance
Water systems
A water system can be defined as a set of interconnected components in which water is
present and has demands made of it by diverse users with different interests. Examples of
water systems are the natural rivers, coasts, lakes, and, in general, all the surface water
and groundwater bodies, that are subject to interference by humans (to be used for
example for agriculture, drinking water, hydropower generation, navigation, fishing,
recreation and industry) and required by nature in general (for instance terrestrial and
aquatic ecology). In this thesis, approaches to designing, evaluating and optimising
monitoring networks are applied to two water systems that are very different in nature:
namely, rivers and polder systems.
Perhaps the most important water systems are the natural rivers, because of their scope,
the number of users and interests involved from their source to their discharge to lakes or
oceans and because of the complex physical and chemical processes that take place in
their catchment. In fact, ancient civilizations settled in their floodplains and nowadays
they are the focus of development and face enormous sustainability challenges, especially
in developing countries.
In contrast, a completely different type of water system in terms of configuration
(although not in terms of users and interests) is a polder system. A polder is a low-lying
area that is artificially disconnected from the hydrological regime of neighbouring
regions by means of hydraulic structures and dikes, in order to keep the water level
within convenient ranges in the area of interest. Polders drain excess water independently
into higher areas or to the sea.
Several differences between rivers and polders can be distinguished from different
viewpoints. For example, in terms of area, what is drained by a polder system is
considerably smaller than the area of a river catchment, which makes a difference also to
the hydrological regimes in both systems; in terms of drainage type, rivers drain from
higher to lower elevations, whereas polders require pumping stations to drain excess
water to higher elevations; in terms of morphology, rivers “draw” their own stream path
through natural processes, which may induce strong geomorphological changes over time,
while the polder’s canal networks are built artificially and undergo little morphological change.
Performance of water systems
When we talk about performance of water systems, we refer to the classification of how
the water “behaves” with respect to each particular water use and interest. The estimation
of this performance should drive decisions made about the system. According to Loucks
et al. (2005) the performance of water systems can be quantified by selecting broad,
general water management objectives and then splitting them into more detailed ones. If
an objective is not general enough, then there is some risk of overlooking sub-objectives.
As a general rule, broad objectives are those that can be justified with a phrase
like “because it is what we all want”. For example, “protecting public health” and
“increasing economic development” are objectives that everybody agrees with.
A major drawback, however, is that these water management objectives usually
conflict. For example, water quality, the basis for preserving the ecosystem and human
health, is frequently in conflict with human activities such as development
and industrialization, which are the basis of the regional economy; similarly, water
quantity requirements differ among uses such as navigation (which requires
high water levels) and agriculture (which requires specific water level ranges). This is the
reason why it is so important to optimise the performance of water systems.
For the case of the Pijnacker region, these conflicts arise when managing water quantity
and water quality. For instance, navigation, ecology, and recreation require high water
levels, while flood prevention, horticulture and pasture agriculture prefer low water
levels. Similarly, for the case of the Magdalena River, high water levels imply not only
conflicts between flood management and cattle farming, but also between fish
production and illegal land-reclamation activities.
The idea behind measuring the performance of water systems is that decision-makers can
base their decisions on quantitative evidence, even if the decisions themselves are qualitative.
1.1.1 Data, Information, Knowledge, Models
Data is not useful by itself. From the Knowledge Management viewpoint, when data is
processed to make it useful we call it information, and when information is applied we
call it knowledge (Ackoff 1989). Additionally, Abbott (2002) states that models are
encapsulated knowledge that we can use to produce information.
However, models not only transform data into information (e.g., transforming channel
characteristics and flow boundary conditions into stage and discharge at every
computational point), but also need data to improve themselves (e.g., making sure the
model replicates the measurements well through calibration and validation). Cunge
(2003) offers an interesting discussion on why the relationship between data and models
is much more complicated than is often presented by teachers and practitioners in
the water sector. This discussion includes the role of data in calibration and validation of
deterministic models, and proposes good practice for modelling.
Nowadays models are considered essential for decision making, since they provide a way
for planners and managers to predict the behaviour of any planned water system design or
management policy before it is implemented. They are also used to make decisions to
optimally control the current water system, by operating control devices that keep water
variables of interest within predefined ranges. In both cases, the water management
objectives, with all their conflicts, should be taken into account (Lobbrecht 1997).
Models can be used, therefore, to improve the performance of water systems.
However, before they are used for any purpose, models must reliably replicate the state of
the system under consideration. This requires data, so that calibration and
validation can be performed. Data collection is therefore critical,
since data is the first input to the decision-making process (data to generate
information and information to generate decisions). For this reason, monitoring networks
are of paramount importance.
1.1.2 Monitoring networks
A monitoring network can be defined as a set of strategically located measurement
devices that collect data of interest about a water system at a given temporal scale.
Monitoring networks are important because they collect data that, after being interpreted,
provide insights for decision-making. In this thesis we understand monitoring as the
process of observing what is happening in the water system.
Hydrological monitoring networks are, in a broader sense, part of environmental
networks, which have been classified by Estrin et al. (2003) according to the type of
variable measured, whether physical, chemical or biological. In terms of function,
hydrological monitoring networks consist of devices that monitor the
hydrosphere and the atmosphere by measuring the presence of water in the ground, on the
surface and in the atmosphere. As a main objective of monitoring, Hooper et al. (2004)
highlighted the understanding of four basic properties, not only referring to water, but
also to sediments, nutrients and pollutants: mass, residence time, fluxes and flow paths. In
this thesis we concentrate on monitoring networks for surface water levels and flows.
There is a close relationship between what needs to be managed and what needs to be
measured. General objectives of hydrological monitoring are closely linked with the
objectives for the performance of water systems described above. The World
Meteorological Organisation (WMO) stated in 1981 that “the aim of a network is to
provide a density and distribution of stations in a region such that, by interpolation
between data sets at different stations, it will be possible to determine with sufficient
accuracy for practical purposes, the characteristics of the basic hydrological and
meteorological elements anywhere in the region” (Made 1988, page 20). However, the
use of models as interpolators is not explicitly mentioned in WMO guidelines, perhaps
because of the existing difficulties for meteorological modelling.
The questions to be addressed when designing monitoring networks are what to measure,
why, where, how often and with what accuracy (Loucks et al. 2005). First of all, a clear
definition of the objectives of monitoring and the data needs is necessary. This definition
of objectives leads to the monitoring plan, where it is explicitly defined which data is
required, its accuracy and its frequency of monitoring. Temporal and spatial scales are
defined in this plan as well. There exist many methods for design and evaluation of
monitoring networks. Mishra and Coulibaly (2009) provide a comprehensive review of
the available methods, namely statistical methods, entropy-based methods, basin
physiographic characteristics and sampling strategies. In this review, it is demonstrated
that statistical methods are the most developed.
1.1.3 Information Theory and Value of Information
The main concepts of Information Theory were developed in the late 1940s in the field of
communication systems by Shannon (1948), with the aim of providing a quantitative
measure of information. In brief, it states that the less probable (more surprising) an
outcome is, the more information its occurrence conveys. For instance, telling the
readers that the author speaks Spanish will not surprise them, as they might infer so just
by looking at his name (or by hearing his English accent). Now, telling the readers that
the author is a very skilled football player (even if this is not true) will be more surprising
for the readers. Therefore, the statement “the author is a very good football player” conveys
more information to the reader than the statement “the author speaks Spanish”. One point
to clarify in this respect is that Information Theory does not account for the meaning
because “these semantic aspects of communication are irrelevant to the engineering
problem”, (Shannon 1948).
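The idea of surprise as information can be illustrated with a minimal sketch in Python. The function names and the two probabilities below are hypothetical and are chosen only for illustration; they are not values or code used in this thesis. The sketch computes the information content of a single outcome as the negative base-2 logarithm of its probability, and the Shannon entropy of a discrete distribution as the expected information content.

```python
import math
from typing import Sequence


def surprisal(p: float) -> float:
    """Information content, in bits, of an outcome that has probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    return -math.log2(p)


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy, in bits, of a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with zero probability contribute nothing to the entropy.
    return sum(p * surprisal(p) for p in probs if p > 0.0)


# Hypothetical probabilities that a reader would assign to each statement
# before being told (assumed for illustration only).
p_speaks_spanish = 0.95
p_skilled_footballer = 0.05

print("speaks Spanish:    ", surprisal(p_speaks_spanish), "bits")
print("skilled footballer:", surprisal(p_skilled_footballer), "bits")
print("fair coin entropy: ", entropy([0.5, 0.5]), "bits")
```

Under these assumed probabilities the less expected statement carries more bits, which is the sense in which it "adds more information". The measure depends only on the probabilities, not on what the statements mean.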
In contrast, the main concepts of Value of Information were developed in the field of
economics during the late 1960s (Howard 1968), as it was of interest to know the amount
a decision maker would be willing to pay for information prior to making a decision. An
important characteristic of this theory is that the decision-maker’s beliefs are significant
when assigning a value to information and that its value rises as the cost of making the
wrong decision increases. For instance, consider a man who has to decide whether or not
to go to the doctor when he feels unwell. If the man is either a hypochondriac (he has
an excessive worry about having a serious illness) or iatrophobic (he fears going to the
doctor), the decision is already made (the hypochondriac will certainly go and the
iatrophobic will certainly not go). In both cases, any additional information for
making the decision has no value, because the prior beliefs are at extremes. Conversely, if
the man is completely doubtful about what to do, any additional information (for example,
from searching his symptoms online) will be valuable for him to make a decision. Additionally, it is
clear that wrongly deciding to go to the doctor (“paracetamol-water-call in two weeks”)
has relatively less important consequences than wrongly deciding not to go (further
complications, life at risk).
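The doctor example can be sketched numerically in Python. Everything in the sketch is an assumption made for illustration: the utilities, the reliability of the additional information, the prior probabilities and the function names are hypothetical and are not the formulation developed later in this thesis. The sketch computes the value of information as the best expected utility obtainable after receiving a message minus the best expected utility obtainable with the prior beliefs alone.

```python
from typing import Dict

# utility[action][state]: hypothetical consequences of each decision.
# Wrongly staying home while ill is assumed to be ten times worse than
# an unnecessary visit to the doctor.
UTILITY: Dict[str, Dict[str, float]] = {
    "go":   {"ill": 0.0,   "healthy": -1.0},
    "stay": {"ill": -10.0, "healthy": 0.0},
}


def best_expected_utility(weights: Dict[str, float],
                          utility: Dict[str, Dict[str, float]]) -> float:
    """Expected utility of the best action, given (possibly unnormalised)
    weights over the states."""
    return max(
        sum(w * utility[action][state] for state, w in weights.items())
        for action in utility
    )


def value_of_information(prior: Dict[str, float],
                         likelihood: Dict[str, Dict[str, float]],
                         utility: Dict[str, Dict[str, float]]) -> float:
    """Expected gain in utility from observing a message before deciding.

    likelihood[state][message] is the probability of receiving the message
    when the system is in that state.
    """
    without_info = best_expected_utility(prior, utility)
    messages = sorted({m for row in likelihood.values() for m in row})
    with_info = 0.0
    for m in messages:
        # Joint weight of (state, message); choosing the best action per
        # message and summing gives the expected utility with information.
        joint = {s: prior[s] * likelihood[s][m] for s in prior}
        with_info += best_expected_utility(joint, utility)
    return with_info - without_info


def doctor_voi(p_ill: float, reliability: float = 0.8) -> float:
    """Value of an imperfect symptom check for a given prior belief."""
    prior = {"ill": p_ill, "healthy": 1.0 - p_ill}
    likelihood = {
        "ill":     {"looks ill": reliability, "looks fine": 1.0 - reliability},
        "healthy": {"looks ill": 1.0 - reliability, "looks fine": reliability},
    }
    return value_of_information(prior, likelihood, UTILITY)


for p in (0.0, 0.1, 1.0):
    print(f"prior P(ill) = {p:.1f} -> value of information = {doctor_voi(p):.2f}")
```

With these assumed numbers the value is zero when the prior belief is at either extreme, because no message can change the decision, and positive when the man is close to indifferent between going and staying. Increasing the assumed penalty for wrongly staying home changes the prior at which information is most valuable, which reflects the role of the consequences of a wrong decision.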
1.1.4 Public participation in monitoring
Public participation is an interesting option for data collection. The benefits of public
participation in monitoring have been identified by diverse authors for the case of water
quality monitoring (see e.g., Au et al. 2000; Bromenshenk and Preston 1986; Stokes et al.
1990), and include the creation of public awareness on environmental issues, the
improvement of collaboration among all stakeholders, the cost effectiveness of data
collection activities and the high coverage in space and time.
Similarly, disadvantages of voluntary collection have also been identified and include a
lack of confidence in data collection procedures, data quality that is often unknown,
and data that is usually dispersed and unstructured. Gouveia et al. (2004) suggest the
use of Information and Communication Technologies to overcome these problems and
explore non-traditional types of environmental data such as images, sounds and videos. In
this thesis, we explore the use of mobile phones for data collection by the public.
1.2 Motivation of this thesis
There are a number of main challenges regarding the design and evaluation of monitoring
networks, which range from establishing proper temporal and spatial scales to defining
their scope at minimum costs. In theory, these difficulties are generally taken into account
by scientists when developing new approaches for monitoring network design. However,
in practice, it has been reported that the data collected by the existing monitoring
networks remains, in general, inadequate for understanding and explaining the dynamics
of natural systems (Canadian Water Resources et al. 1994; IUCN 1980). This may be
because the criteria used to establish the final monitoring network are driven in practice by non-scientific aspects, such as political and social viewpoints.
Additionally, Mishra and Coulibaly (2009, page 19), point out that “...it is anticipated that
the future will witness a greater and growing demand of hydrometric information for
water resources, environmental, and ecohydrological management”. Provided the
objectives of a monitoring network are defined, it is clear that the ideal, uncertainty-free
monitoring network is one that has an infinite number of monitors¹ in the area of interest,
each one providing data at infinitesimal temporal scale. Naturally, this implies also the
need for infinite resources for operation, management and analysis. Yet another
challenge, then, is to properly select key locations where to collect the data needed to
make proper decisions about the water system with the minimum expense. The cartoon in
Figure 1-1 expresses the concern of not being able to reliably capture the state of the
system through the existing monitoring network.
Figure 1-1. One of the main challenges when designing a monitoring network
It is clear, then, that resolving these main challenges will lead, on the one hand, to
improved water management decisions regarding the performance of water systems and,
on the other hand, to better knowledge about
the variability of natural systems.
¹ In this thesis the terms monitor, device and gauge are used interchangeably to refer to any piece of
equipment used to measure any water-related variable.
1.3 Research questions and objectives
In the context presented above, a main question arises:
How can monitoring networks be optimised (in such a way that information
content and value is maximized), and how can a public-based monitoring network
be configured, with the aim of enhancing the performance of a water system?
In order to give an answer to this question, the following research questions have been
posed:
• How can models and Information Theory concepts be coupled to optimise
  monitoring networks?
• How can models and Value of Information concepts be coupled to optimise
  monitoring networks?
• How can a public-based monitoring network for hydrological data collection be
  configured and how can this data be used to reduce model errors so that an
  improved description of the water system, especially during extreme events, can
  be obtained?
• How can the new approaches be applied to polder systems (such as the Delfland
  system in The Netherlands) and to large river systems (such as the Magdalena
  River)?
The objective of this research is to investigate new methods of optimising monitoring
networks, using concepts of Information Theory, Value of Information and public
participation.
The particular objectives of this research are listed as follows:
• To explore different methodologies for optimising monitoring networks using
  Information Theory concepts and models.
• To develop a methodology for optimising monitoring networks using Value of
  Information concepts and models.
• To configure, test, run and assess a public-based monitoring network for
  water-level data collection, characterised by the use of mobile phones.
• To develop a methodology to use the mobile phone data collected by the public to
  reduce model errors.
• To test the developed methodologies and tools in two case studies in water
  systems with different hydrologic, hydraulic, socioeconomic and political
  conditions.
1.4 Scope of this thesis
1.4.1 Role of models in monitoring network design
There are different ways to infer data at unmeasured sites from the measured
ones. Made (1988) provides examples of different interpolation methods for the case of
designing and evaluating monitoring networks, which include mathematical relations,
statistical methods, physically-based mathematical models and their combinations. In this
thesis, we use physically-based mathematical models to generate dense time series data
sets from a limited set of data in order to account for the physics of the phenomena,
allowing for reliable descriptions of the water system under different conditions.
Subsequently, the resulting data sets are used to design or evaluate the monitoring
network, in such a way that the best points where measurements can be taken are
identified. After this, the designed monitoring network, now in place, will generate data
with a maximized information content (Chapter 5) and with a maximum Value of
Information (Chapter 6), taking into account the predefined water management objectives
for decision making. This data, in turn, can be used to improve the model through
improved calibration and validation. The loop is closed when the improved model is
again used to design or evaluate the previous monitoring network. The process is
repeated until the monitor locations do not change between two consecutive loops. This
iterative process is presented in Figure 1-2.
[Figure 1-2: flowchart linking the monitoring network, limited data (calibration/validation), limited data (run), the model, dense data, design/evaluation, water management objectives and decision-making]
Figure 1-2. Use of models for the design of monitoring networks (this thesis)
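The iterative process described above can be sketched as a loop that stops when the monitor locations no longer change between two consecutive loops. This is only an illustration: the three callables and the toy scores are hypothetical placeholders for the model run, the design step and the recalibration step, not the tools used in this thesis.

```python
def design_until_stable(initial_locations, run_model, design_network,
                        recalibrate, max_loops=20):
    """Repeat model run -> design -> recalibration until locations stop changing."""
    locations = frozenset(initial_locations)
    for loop in range(1, max_loops + 1):
        dense_data = run_model()                        # model generates dense data
        new_locations = frozenset(design_network(dense_data))
        if new_locations == locations:                  # same design in two loops
            return locations, loop
        locations = new_locations
        recalibrate(locations)                          # improve the model with new data
    return locations, max_loops


# Toy stand-ins: the "model" scores four sites; recalibration changes one score.
scores = {"A": 0.9, "B": 0.5, "C": 0.4, "D": 0.1}

def run_model():
    return dict(scores)

def design_network(dense_data):
    # Keep the two highest-scoring sites.
    return sorted(dense_data, key=dense_data.get, reverse=True)[:2]

def recalibrate(locations):
    scores["C"] = 0.6   # pretend that new measurements raise the score of site C

final_locations, loops = design_until_stable({"A", "D"}, run_model,
                                             design_network, recalibrate)
print(sorted(final_locations), loops)
```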
1.4.2 Other means of data collection
Nowadays mobile phones can no longer be considered merely telephones for voice
transmission. They have become devices that combine PC software, digital
cameras, calculators and agendas and that are able to provide Internet, radio, television and
fax services. One of the most important facts regarding these devices is that any person
may own one and that the knowledge of operating one is in the public domain. All of
these characteristics make the mobile phone a cheap device not only for receiving
data but also for sending all kinds of data at conveniently dense spatial and temporal
scales. The option to collect data through mobile phones by means of public participation
is explored in this thesis (Chapter 7). These new approaches for data collection have a
potentially positive impact in developing countries, where mobile phone technology is
accessible to poor people. Mass participation in data collection could lead to
better water management of water systems that lack proper monitoring technology. The
improved water management, in turn, will provide better living conditions to citizens,
confirming the societal component of Hydroinformatics (Abbott 1991; Jonoski 2002).
1.4.3 Scope
The scope in which the questions posed before are answered includes the first loop of the
flowchart shown in Figure 1-2 for the monitoring of water level and discharges. The
monitoring networks are optimised spatially with theoretical approaches (Information
Theory, Value of Information) and complemented temporally with practical methods
(mobile phone-based public participation for data collection), the latter being also used to
explore the possibilities for identifying the sources of error in a model.
1.5 Thesis outline
The research questions posed in the previous section are answered through the
development of six interconnected chapters, which begin with Chapter 2, where the state
of the art is provided through a detailed review of the literature and where the theories
used in the course of the thesis are presented.
In order to develop the proposed methods, two case studies in catchments with different
hydrologic, hydraulic, socioeconomic and political conditions, namely the Polders of
Pijnacker, The Netherlands and the Magdalena River, Colombia, are presented in Chapter
3 and Chapter 4 respectively. In the latter, details about the development of a
hydrodynamic model, a major task carried out during this thesis, are also included.
Subsequently, the proposed new methods for designing, evaluating and optimising
monitoring networks are presented and applied in each case study, for which the use of
hydrodynamic models is significant. The concepts of Information Theory and Value of
Information are applied in Chapter 5 and Chapter 6, respectively.
In order to explore new methods for data collection through the participation of the
inhabitants residing in a particular water system, Chapter 7 is presented. The popular
short message service (SMS) of mobile phones is exploited for both data
collection and model improvement. An important contribution in this respect is the
Mobile Monitoring Experiment (MoMoX), an experiment carried out during 2010 in the
polders of Pijnacker.
Finally, Chapter 8 presents the conclusions and recommendations for each proposed
method.
In order to help the reader to understand the relationship between the chapters, a schema
of the thesis outline is provided in Figure 1-3. First, the Introduction and the Literature
Review work as the base of the three main pillars of the thesis (Information Theory,
Value of Information and Public Participation - Model Reliability). The case studies
(Polders of Pijnacker and the Magdalena River) cross these pillars horizontally, showing
where the developed methods are applied. Finally, the Conclusions are supported by the
whole framework.
Figure 1-3. Outline of the thesis
Chapter 2
Literature Review
2.1 Introduction
Different issues regarding the monitoring of water systems are addressed in this thesis.
For this reason, it is necessary to provide the theoretical framework for the methods
developed further through a detailed review of the relevant literature. In the first place,
the studies related to the design and evaluation of monitoring networks are presented,
which include a number of existing approaches for the design and evaluation of networks
of diverse nature. Subsequently, the basic expressions of Information Theory are
presented along with a number of publications that apply the theory for water-related
problems, including the design and evaluation of monitoring networks for different
purposes. In particular, the review in Information Theory is necessary to introduce the
methods developed in Chapter 5. Thirdly, a review of the concepts, mathematical
development and the application of the Value of Information (VOI) concept is presented
also in the water resources context, which is the starting point of the development of the
methodology presented in Chapter 6. The starting point of the developments given in
Chapter 7 is the combination of two important topics, namely the public participation in
monitoring using Information and Communication Technologies (ICT) and model
reliability and uncertainty. The relevant bibliography of both topics is reviewed at the end
of this chapter.
2.2 Design and evaluation of monitoring networks
One of the most important elements of the planning and management of water resources
is the assessment of these resources, which takes into account the identification of the
sources and the evaluation of their capacity, dependability and quality, implying the
measurement and collection of data of interest (Mishra and Coulibaly 2009). For this
purpose, monitoring networks are designed and sometimes optimized for decision-making according to the water management objectives (Loucks et al. 2005). In the
following sections recent developments in the design and evaluation of monitoring
networks are discussed.
2.2.1 Designing a monitoring network
As is the case for any water system, the estimation of water-related variables at sites
where no measurements take place is the main problem the design of a monitoring
network must address. Therefore, the design of a monitoring network aims to find the
number and spatial distribution of gauges and the temporal interval of their
measurements.
The monitoring networks must be planned in a process that is not static, but that requires
a continuous cycle. Figure 2-1 shows a simple cycle that Mogheir and Singh (2002)
indicated as a first approach monitoring cycle. In summary, information needs come from
what has been defined as important to manage in the water system. The information
strategy is, thus, the subsequent step that leads to a plan for data collection. After this, the
analysis of data and the utilisation of the information are the processes that create the
inputs for water management and decision making, closing the cycle in this way.
A detailed description of the process for designing monitoring networks is given by
WMO (1994), which suggests beginning with the definition of the institutional setup and
the purposes, objectives and priorities of the network. Subsequently, the obtained
network design is subjected to optimization procedures, financial revisions and final
implementation. The last step includes a review of the network which, together with other
feedback mechanisms applied at each of the previous steps, completes the framework.
Figure 2-1. First cyclic approach for monitoring planning
Adopted from Mogheir and Singh (2002)
A number of approaches to monitoring network design can be found in Mogheir et al.
(2006), who classified them as geostatistical and statistical methods. Among these,
variance-based, probability-based and entropy-based methods can be mentioned. A
comprehensive review is presented in the work by Mishra and Coulibaly (2009).
Although there is extensive literature on a number of approaches to rain-gauge network
design (see e.g., Bogardi and Bardossy 1985; Bras and Rodriguez-Iturbe 1976; Moore et
al. 2000; Pardo-Igúzquiza 1998; Rodriguez-Iturbe and Mejia 1974; Sansó and Müller
1997; Yeh et al. 2006), there is comparatively little for the design of water level gauge
networks. Some examples exist, such as interpolation methods to find the minimal spatial
density (Gandin 1965) and the number of stations required for runoff estimation (Karasev
1968). Moss (1974) presented an approach for the design of surface water data
networks that includes the statistical nature of parameter accuracy estimates, but this was
presented as an application of a more general approach described by Moss (1976) for
hydrological monitoring networks. Similarly, Moss and Tasker (1991) performed a
comparison between techniques for designing monitoring networks of stream-gauges, in
which the main variable was the discharge as an estimation of water levels. Husain
(1989) proposed a methodology to select the most important monitoring station out of a
dense set of stations, and also to expand hydrologic networks that have sparse stations. In
particular, he used gamma distributions to estimate the multivariate probability functions
in order to calculate the entropy-based information transmission.
From the design and operation point of view, Made (1988) analyzed a number of methods
to derive water levels at any point along river reaches, provided water level
measurements are available at existing points.
2.2.2 Evaluation of monitoring networks
Existing monitoring networks are evaluated in order to confirm that the objectives for
which the network was designed are met. The result of this assessment may include a
redefinition of the size and scope of the network, which can lead either to the elimination
(due to redundancy or uselessness of the collected data) or the inclusion of additional
monitoring points in places where water-related variables cannot be adequately inferred
from the existing monitors.
In general terms, the same approaches used for the design of monitoring networks are
used for their evaluation. These approaches can be classified as statistically based
methods, information theory methods, user survey approach, hybrid methods,
physiographic components and sampling strategies, which are presented in detail by
Mishra and Coulibaly (2009) and schematized in Figure 2-2.
Two approaches for the design and evaluation of monitoring networks developed in this
thesis are part of the entropy-based methods (see Chapter 5), two of them using
Optimization methods. Yet an additional approach that is not included in the review of
Mishra and Coulibaly (2009), is one that takes into account the Value of Information and
that is also developed in this thesis (see Chapter 6).
2.3 Information Theory (IT)
From the beginning of the development of Information Theory, two problems regarding
information concepts frequently came into view: what is information and how can it be
measured? Even though the first question seems to be crucial for any theory, the second
has been the most developed (Burgin 2003).
An interesting, general definition of information is given by Yankovsky (2000): “Any
interaction between objects during which one object gains some substance and the other
does not lose it, is called Information Interaction. In this case, the substance under
transmission is referred to as Information.”
Figure 2-2. Classification of methods for design and evaluation of monitoring networks
Generally speaking, all developed theories to measure information can be considered as
sub theories of the General Theory of Information (Burgin 2003). Among these theories,
Shannon’s information theory, semantic theory of information, Fisher Information,
qualitative information theory, algorithmic theory of information, pragmatic theory of
information, social information, utility theory of information, economic theory of
information and dynamic theory of information can be found.
The self-information introduced in the Theory of Information (Shannon 1948), the
Kullback-Leibler divergence (as a measure of the differences between a ‘true’ and an
‘arbitrary’ probability distribution), and the Fisher Information (as the amount of
information that an observable random variable carries about an unknown parameter
upon which the likelihood function of the random variable depends) are related
approaches for measuring information. In this thesis the Theory of Information is used to
develop the methods presented in Chapter 5.
2.3.1 Quantifying information
Information theory, as described by Shannon (1948), provides mechanisms for measuring
information $R$, which is a reduction in uncertainty $H(X)$. As the latter is also known as
entropy, information entropy, Shannon entropy or marginal entropy, the terms entropy
and uncertainty will be used interchangeably throughout this thesis. The definition of uncertainty
indicates how surprising it is, on average, to obtain a value $x$ from a random variable $X$ that
can take the possible values $x_1, x_2, \ldots, x_n$, each with probability $p(x_i)$ (Equation (2-1)):

$$H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i) \tag{2-1}$$
The units of uncertainty are given by the base of the logarithm used, being
“nats” if the base is $e$ and “bits” if it is 2. In this thesis the latter will be used. Another
important consideration is that $0 \log 0 = 0$, which is in line with the fact that $x \log x \to 0$
as $x \to 0$, so values with zero probability do not change the uncertainty. An analysis of
Equation (2-1) shows that when all values $x_i$ are equal, the uncertainty of the
variable is zero; the variable is then not random and we are sure what the value $x_{i+1}$ will
be. On the other hand, when all possible values $x_i$ are equally likely then the uncertainty
is maximum (Cover and Thomas 1991). This sense of the predictability of entropy-related
expressions has been exploited in diverse fields such as climate (Majda et al. 2002),
financial time series (Molgedey and Ebeling 2000) and DNA-sequences (Ebeling and
Frommel 1998) among others.
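The two limiting cases of Equation (2-1) can be checked numerically with a minimal sketch that estimates probabilities from relative frequencies. The function name and the two sample records are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    # Values with zero probability never appear in the counts, which is
    # consistent with the convention 0 log 0 = 0.
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(values).values())

constant = [3, 3, 3, 3]   # a single possible value: no uncertainty
uniform = [1, 2, 3, 4]    # four equally likely values: maximum uncertainty

print(entropy_bits(constant), entropy_bits(uniform))
```

For four equally likely values the result equals the logarithm in base 2 of four, which is the largest entropy that a variable with four possible values can have.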
Information $R$ is defined as a reduction in uncertainty (Equation (2-2)):

$$R = H_{before} - H_{after} \tag{2-2}$$
In a flawless communication process (without noise), the receiver is completely certain
about the message that was sent by the emitter, so $H_{after} = 0$ and $R = H_{before}$. This is the
reason why some authors consider (wrongly) that information is the same as entropy or
uncertainty (Schneider 2000).
It is possible to see that entropy is also an alternative measurement of variability or
dispersion, just like the classical variance, but with some additional advantages: variance
is not appropriate to use when the size of the sample is too small (Hart 1971; Mogheir et
al. 2006; Singh 1997; Singh 2000), a situation that is handled by entropy; for discrete
distributions without numerical values the variance becomes totally improper as the use
of different numbered labels is translated into different values of variance, an
inconvenience that entropy prevents, since only probabilities are taken into account. A
more detailed discussion can be found in the work of Wei (1987).
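The point about numbered labels can be illustrated with a short sketch in which the same categorical record is coded with two arbitrary sets of numeric labels. Both label sets are hypothetical.

```python
import math
from collections import Counter
from statistics import pvariance

def entropy_bits(values):
    """Shannon entropy (in bits) from relative frequencies."""
    total = len(values)
    return sum(-(c / total) * math.log2(c / total)
               for c in Counter(values).values())

# The same sequence of three classes, coded with two different label sets.
labels_a = [1, 2, 3, 1, 2, 1]
labels_b = [10, 500, 7, 10, 500, 10]

# The variance depends on the labels chosen; the entropy does not.
print(pvariance(labels_a), pvariance(labels_b))
print(entropy_bits(labels_a), entropy_bits(labels_b))
```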
In some cases, it is necessary to estimate the amount of information shared between two
random variables $X$ and $Y$. A frequently used approach is the mutual information (or
transinformation) $I$, which quantifies the amount of information of one random variable
which is contained in another random variable (Cover and Thomas 1991), and can be
interpreted as the reduction of uncertainty of $X$ due to the knowledge of $Y$:

$$I(X;Y) = \sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j) \log \frac{p(x_i, y_j)}{p(x_i)\,p(y_j)} \tag{2-3}$$
Although the concept of transinformation and its relation with entropy can be depicted
using Venn diagrams, they may not represent positive quantities when analyzing more
than 2 variables (MacKay 2003).
Provided that $p(x, y)$ is the joint distribution of the variables $X$ and $Y$, and that
$p(y \mid x)$ is the probability of $y$ given $x$, other information-related measures for two
variables can be defined (Cover and Thomas 1991), including the joint entropy, given by:

$$H(X,Y) = -\sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j) \log p(x_i, y_j) \tag{2-4}$$
which represents the amount of information that is contained in the two variables taken
together, and the conditional entropy, given by:

$$H(Y \mid X) = -\sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j) \log p(y_j \mid x_i) \tag{2-5}$$

which represents the amount of information content of $Y$ which is not contained in $X$.
In communication systems, $H(X)$ is the information input at the source $X$, $H(Y)$ is the
information output at the receiver $Y$ and $I(X;Y)$ is the amount of information
transferred from $X$ to $Y$ (MacKay 2003). It is interesting that $H(X \mid Y)$ is the amount of
information lost during the transmission (the part of $X$ that never reaches $Y$) and that
$H(Y \mid X)$ is the amount of information that is received as noise (the part received by $Y$ that
was never sent by $X$). It is clear that neither of these values can be negative.
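The relations between these quantities can be verified on a small example. The sketch below estimates all entropies from relative frequencies and uses the standard chain rule $H(X,Y) = H(X) + H(Y \mid X)$; the sent and received sequences are hypothetical.

```python
import math
from collections import Counter

def H(*series):
    """Joint entropy (in bits) of one or more equally long discrete series."""
    records = list(zip(*series))
    n = len(records)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(records).values())

# Hypothetical symbols sent (x) and received (y) through a noisy channel.
x = [0, 0, 1, 1, 0, 1, 0, 1]
y = [0, 0, 1, 0, 0, 1, 1, 1]

h_x, h_y, h_xy = H(x), H(y), H(x, y)
mutual_info = h_x + h_y - h_xy   # I(X;Y): information transferred
loss = h_xy - h_y                # H(X|Y): part of X that never reaches Y
noise = h_xy - h_x               # H(Y|X): part of Y that was never sent by X

print(mutual_info, loss, noise)
```

In this example the source entropy splits into the transferred part and the lost part, that is, $H(X) = I(X;Y) + H(X \mid Y)$.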
Although there are a number of approaches to measure the dependence between
variables, the use of the mutual information has become popular in several fields of
science for a number of reasons. First, because it does not take into account any relational
hypothesis between the variables beforehand, as for instance, the linearity of the Pearson
correlation (Steuer et al. 2002) or the linearity and normality of the ordinary correlation
coefficient r (Singh 2000) do. In other words, mutual information does not measure linear
dependencies but general dependencies, which are more likely to exist in physical and
other natural processes. Second, as in the case of entropy, correlation functions can only
be applied to a sequence of numbers, whereas mutual information can also be applied to a
sequence of symbols (Li 1990). Furthermore, the invariance under transformation
provided by entropy-based relationships is another advantage reported over the ordinary
correlation (Linfoot 1957), as well as the capability to provide not only a quantitative
measure of the information of one gauge station, but also a measure of the transference and
loss of information between stations (Yang and Burn 1994).
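The difference between linear and general dependency can be illustrated with a synthetic pair of variables in which one is fully determined by the other, but not linearly. The data and function names are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(values):
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from relative frequencies."""
    return entropy_bits(x) + entropy_bits(y) - entropy_bits(list(zip(x, y)))

def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

x = [-2, -1, 0, 1, 2] * 4
y = [v * v for v in x]   # y is fully determined by x, but not linearly

print(pearson_r(x, y), mutual_information(x, y))
```

The correlation coefficient is zero for this pair, whereas the mutual information equals the full entropy of the dependent variable.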
One major restriction of the mutual information is that it is only applicable to two random
variables, and it is often necessary to evaluate the dependency among several variables. This
situation is addressed by the multivariate mutual information, a topic first studied by
McGill (1954), who defined interaction information (or co-information), for the case of
three variables, as:
$$\begin{aligned}
I(X;Y;Z) &= -H(X) - H(Y) - H(Z) + \left[ H(X,Y) + H(Y,Z) + H(X,Z) \right] - H(X,Y,Z) \\
         &= I(X,Y;Z) - I(X;Z) - I(Y;Z)
\end{aligned} \tag{2-6}$$
The general expression for $N$ variables, which was extended by Fano (1968) and
reformulated by Han (1980), is:

$$I(X_1; X_2; \ldots; X_N) = I(X_1; X_2; \ldots; X_{N-1} \mid X_N) - I(X_1; X_2; \ldots; X_{N-1}) \tag{2-7}$$
Srinivasa (2005) interprets interaction information as the gain (or loss) in the information
transmitted between a set of variables due to additional knowledge of a new variable,
whereas Jakulin and Bratko (2003) think that it is a measure of the amount of information
that is common to all the variables, but not present in any of them. Fass (2006) interprets
it as the influence of one variable on the amount of information shared between the rest
of the variables. These interpretations are related to the fact that interaction information
can be negative, because the dependency among a set of variables can increase or
decrease with the knowledge of a new variable. For the case of knowing the effect of a
third variable on the correlation of two variables, Jakulin and Bratko (2004) explain a
positive interaction information as a synergy between the original variables and a
negative value as a redundancy among these variables. Similarly, Fass (2006) states that
after knowing the third variable, a resulting positive value “facilitates” or “enhances” the
correlation between the two variables, whereas a negative value “inhibits” or “explains”
this correlation. This author also recognizes that the difficulties in the interpretation of the
possible negativity of the interaction information have been a barrier for its application in
areas like machine learning and psychology (Fass 2006).
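The possible signs of the interaction information can be illustrated with two synthetic cases. The sketch below assumes the sign convention in which a positive value indicates synergy and a negative value indicates redundancy, in line with the interpretation of Jakulin and Bratko quoted above; the data are hypothetical.

```python
import math
from collections import Counter
from itertools import product

def H(*series):
    """Joint entropy (in bits) of one or more equally long discrete series."""
    records = list(zip(*series))
    n = len(records)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(records).values())

def interaction_information(x, y, z):
    """I(X;Y;Z), with the convention that positive values mean synergy."""
    return (-H(x) - H(y) - H(z)
            + H(x, y) + H(y, z) + H(x, z)
            - H(x, y, z))

# Synergy: Z is the exclusive-or of two independent bits X and Y.
x, y = zip(*product([0, 1], repeat=2))
z_xor = [a ^ b for a, b in zip(x, y)]
synergy = interaction_information(x, y, z_xor)

# Redundancy: the three variables are copies of the same bit.
redundancy = interaction_information(x, x, list(x))

print(synergy, redundancy)
```

In the first case neither variable alone says anything about the third, but the two together determine it; in the second case each variable repeats what the others already provide.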
The concept of Total Correlation (McGill 1954; Watanabe 1960), $C(X_1, X_2, \ldots, X_N)$,
provides a direct and effective way of assessing the dependency among multiple
variables:

$$C(X_1, X_2, \ldots, X_N) = \sum_{i=1}^{N} H(X_i) - H(X_1, X_2, \ldots, X_N) \tag{2-8}$$

It can be noted that for the case of $N=2$ Total Correlation is equivalent to the well-known
transinformation (or mutual information). The term $H(X_1, X_2, \ldots, X_N)$ is the
multivariate joint entropy (Eq. (2-4) for the case of $N$ variables) of the set $X_1, X_2, \ldots, X_N$.
Total Correlation can be calculated by following the grouping property of mutual
information (Kraskov et al. 2003). For the case of three variables $X$, $Y$ and $Z$, the
procedure is as follows:
• A new variable $A$ is built up by agglomerating $X$ and $Y$ in such a way
  that $H(A) = H(X, Y)$. The procedure of ‘agglomeration’ consists of placing in $A$
  a unique value for every unique combination of the corresponding records in $X$
  and $Y$. For instance, if X=[1,2,1,2,1] and Y=[2,3,1,3,2], then one of the options to
  agglomerate X and Y to build the variable A is by putting all the corresponding
  digits (or symbols) of X and Y together, i.e., A=[12,23,11,23,12].
• Following the same concept, a new variable $B$ is built by the agglomeration of $A$
  and $Z$, also with the condition $H(B) = H(A, Z)$.
• The mutual information between the selected pairs for agglomeration, i.e.,
  $C(X, Y) = H(X) + H(Y) - H(A)$ and $C(A, Z) = H(A) + H(Z) - H(B)$, is
  calculated.
• The Total Correlation of $X$, $Y$ and $Z$ is calculated by summing up the partial total
  correlations obtained for each built variable,
  i.e., $C(X, Y, Z) = C(X, Y) + C(A, Z)$.
As can be noted, this method does not need to assess $H(X_1, X_2, \ldots, X_N)$ and therefore the
estimation of the joint probability distribution $p(x_1, x_2, \ldots, x_N)$ is not needed. After having
calculated $C(X_1, X_2, \ldots, X_N)$ by following the steps above, the multivariate joint entropy
$H(X_1, X_2, \ldots, X_N)$ can then be calculated from Eq. (2-8) as:

$$H(X_1, X_2, \ldots, X_N) = \sum_{i=1}^{N} H(X_i) - C(X_1, X_2, \ldots, X_N) \tag{2-9}$$
A complete reference to related work can be found in the work of Jakulin and Bratko
(2004) and Fass (2006). The application of the Total Correlation in water resources
problems was first achieved by Alfonso et al. (2010b), a work presented in detail in
section 5.3.
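The agglomeration steps for three variables can be sketched as follows, using the records X and Y of the example above. The record Z, the function names and the use of tuples as agglomerated symbols are assumptions made for illustration only.

```python
import math
from collections import Counter

def entropy_bits(values):
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())

def agglomerate(u, v):
    """One unique symbol (here a tuple) per unique combination of records."""
    return list(zip(u, v))

def pair_correlation(u, v):
    """C(U,V) = H(U) + H(V) - H(agglomeration of U and V)."""
    return entropy_bits(u) + entropy_bits(v) - entropy_bits(agglomerate(u, v))

X = [1, 2, 1, 2, 1]
Y = [2, 3, 1, 3, 2]
Z = [1, 1, 2, 2, 1]   # hypothetical third record, not taken from the text

A = agglomerate(X, Y)   # H(A) = H(X, Y)
B = agglomerate(A, Z)   # H(B) = H(A, Z)
total_correlation = pair_correlation(X, Y) + pair_correlation(A, Z)

# Joint entropy recovered from the Total Correlation, as in Eq. (2-9).
joint_entropy = sum(entropy_bits(s) for s in (X, Y, Z)) - total_correlation

print(total_correlation, joint_entropy)
```

The recovered joint entropy can be compared with the entropy of the triples formed directly from X, Y and Z, which gives the same value.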
2.3.2 Uses of IT in the design and evaluation of monitoring networks
The main idea behind the design of any monitoring network is to reduce as much as
possible the uncertainty associated with the estimation of values of a given variable in the
places where it is not directly measured. The concept of uncertainty has been traditionally
linked with statistical variance, even though Amorocho and Espildora (1973) noticed that
it was not an objective index of quality when used, for instance, for comparing predicted
values of a hydrological model and the series of data records. It is interesting to note that
during the same period, a similar observation was made in the field of portfolio
management by Philippatos and Wilson (1972).
Diverse authors have applied information theory concepts to the design or evaluation of
monitoring networks for general purposes (Caselton and Zidek 1984) and for more
specific purposes, such as water quality (Harmancioglu 1999), groundwater quality
(Caselton and Husain 1980; Mogheir et al. 2004; Mogheir and Singh 2002; Mogheir et al.
2006), air pollution (Zidek et al. 2000) and rainfall gauging stations (Krstanovic and
Singh 1992a; Krstanovic and Singh 1992b), among others.
From the point of view of surface water gauging, Husain (1989) presented a method for
network design based on the information-transmitting capabilities of a hydrologic
network in terms of entropy. Yang and Burn (1994) criticized this method, pointing out
that a continuous distribution function is assumed when calculating the entropy-related
measurements. Even though these authors overcame this problem by using a nonparametric estimation of the density distributions, the assumptions of having independent
and identically distributed random variables and the assessment of the smoothing factor
of the kernel parameter still continue to add vagueness to the process. These authors,
nevertheless, presented a normalized version of the mutual information between two
gauges, called the Directional Information Transfer Index ($DIT$), to obtain the fraction of
information transferred from one site to another as a value between zero and one:

$$DIT_{X,Y} = \frac{I(X;Y)}{H(X)} \tag{2.10}$$
The expression was first introduced in 1970 by Coombs, Dawes and Tversky in the field
of Mathematical Psychology under the name Coefficient of Constraint (Fass 2006).
Markus et al. (2003) explained the expression in Equation (2.10) as the information
received by $X$ from $Y$. When $H(Y)$ is used in the denominator of Equation (2.10) they
interpret it as the information sent from $X$ to $Y$. However, these interpretations are
confusing since $DIT$ does not provide a quantification of the transmitted information
content but a normalized version of the transinformation between these variables. For this
reason, when the net information transfer $N$ is introduced as the difference between
information sent and information received (Markus et al. 2003), negative values, which
cannot be interpreted as an information gain, are obtained.
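The effect of the denominator in Equation (2.10) can be seen in a small sketch in which the same transinformation is normalised by the entropy of each gauge in turn. The two discretised records are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(values):
    n = len(values)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(values).values())

def dit(x, y):
    """DIT_{X,Y} = I(X;Y) / H(X), following Equation (2.10)."""
    h_x = entropy_bits(x)
    mutual_info = h_x + entropy_bits(y) - entropy_bits(list(zip(x, y)))
    return mutual_info / h_x

# Hypothetical discretised water-level classes at two gauges.
gauge_x = [1, 1, 2, 2, 3, 3, 4, 4]
gauge_y = [1, 1, 1, 1, 2, 2, 2, 2]

print(dit(gauge_x, gauge_y), dit(gauge_y, gauge_x))
```

Both values are built from the same transinformation; only the normalising entropy changes, which is why their difference should not be read as a net amount of information.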
Markus et al. (2003) presented a comparison between entropy and the least square
method to evaluate stream gauges, in which the DIT of Yang and Burn (1994) was
adopted. The authors faced the problem of selecting the bin size for calculating the
empirical frequency analysis to obtain the probability of a value in a particular interval.
They found that, in spite of the differences in entropy values when changing the bin size,
the ranking of stations in terms of the difference between the information received and
the information sent remained, in general, unaltered. The problem of the bin size was
pointed out from the beginning by Amorocho and Espildora (1973) and has also been
studied in the case of the mutual information calculation of discrete variables by Steuer,
Kurths et al. (2002).
Two difficulties appear recurrently in the studies that use entropy for monitoring network
design. First, there is a problem of establishing the joint probability functions to calculate
mutual information. This has been mainly solved assuming either a Gaussian distribution
of the variables or evaluating the transinformation as a function of the correlation
coefficient r , as suggested by Harmancioglu and Yevjevich (1987). Secondly, for the
multivariate case, several simplifications are made, for example, analyzing mutual
information by pairs of stations and analyzing the resulting 2D transinformation matrices
(Filippini et al. 1994; Mogheir and Singh 2002), or assuming a normal distribution to
calculate the multivariate joint entropy (Krstanovic and Singh 1992).
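For reference, when two variables are assumed to be jointly Gaussian, the transinformation has the well-known closed form I = -0.5 ln(1 - r^2), expressed in nats. The sketch below evaluates it; whether this is exactly the expression used in the cited works is an assumption, and the function name is hypothetical.

```python
import math


def gaussian_transinformation(r):
    """Transinformation (nats) of a bivariate normal pair with correlation r.

    Assumes joint normality; the result is undefined for |r| = 1.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("correlation coefficient must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - r * r)


for r in (0.0, 0.5, 0.9):
    print(r, gaussian_transinformation(r))
```

The expression grows without bound as |r| approaches 1 and vanishes for uncorrelated variables, which is why it only captures linear dependence.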
Estimating the joint probability of multiple variables is a problem encountered in several
fields and diverse methods of approximation have been proposed. The problem is due to
the combinatorial explosion of the number of probabilities to calculate for a large number
of variables. Fass (2006), in his comprehensive literature review shows a number of
proposed approximations, and states that this is a quintessential problem in human and
machine learning. One of them is the Chow-Liu tree (Chow and Liu 1968), which
approximates the joint probability as a product of bivariate distribution functions; the
variable pairs with the largest transinformation are selected to be part of the tree.
Kirshner et al. (2004) applied this method to evaluate discrete time-series for modeling
and forecasting daily precipitation occurrence for networks of rain stations, and
demonstrated some improvements over simpler alternatives such as assuming conditional
independence of the multivariate outputs.
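The tree-selection step described above can be sketched as a maximum spanning tree over pairwise transinformation. The code below is a minimal illustration using Kruskal's algorithm on discrete series; the station names and data are hypothetical, and it does not reproduce the implementation of Chow and Liu (1968) or Kirshner et al. (2004).

```python
import math
from collections import Counter
from itertools import combinations


def entropy(labels):
    """Shannon entropy (bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(x, y):
    """I(X,Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def chow_liu_edges(series):
    """Return the edges (a, b, I) of a maximum spanning tree on pairwise I.

    `series` maps a variable name to its list of discrete observations.
    """
    names = list(series)
    # Candidate edges, largest transinformation first.
    candidates = sorted(
        ((mutual_information(series[a], series[b]), a, b)
         for a, b in combinations(names, 2)),
        reverse=True,
    )
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    tree = []
    for mi, a, b in candidates:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # the edge joins two components: keep it
            parent[root_a] = root_b
            tree.append((a, b, mi))
    return tree


# Hypothetical binary occurrence series at three stations (assumed data).
stations = {
    "A": [0, 0, 1, 1, 0, 1, 0, 1],
    "B": [0, 0, 1, 1, 0, 1, 0, 1],   # identical to A
    "C": [0, 1, 0, 1, 0, 0, 1, 1],   # empirically independent of A
}
tree = chow_liu_edges(stations)
print(tree)
```

The joint distribution is then approximated by the product of the bivariate distributions along the retained edges, so only n - 1 pairs need to be estimated for n variables.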
Even though the method of total correlation expressed in Equation (2-8) for evaluating
information among multiple variables is an indirect approach to evaluate multivariate
joint probability functions, it is a direct, precise method to evaluate information
dependency among multiple variables.
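A minimal sketch of total correlation for discrete series is given below, assuming the standard definition C = sum of the marginal entropies minus the joint entropy; that this matches the referenced equation term by term is an assumption, and the data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a sequence of discrete (hashable) labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def total_correlation(columns):
    """Sum of marginal entropies minus the joint entropy of all columns."""
    joint = list(zip(*columns))          # one tuple per time step
    return sum(entropy(col) for col in columns) - entropy(joint)


# Hypothetical binned records at three stations (assumed data).
a = [0, 0, 1, 1, 0, 1, 0, 1]
b = [0, 0, 1, 1, 0, 1, 0, 1]   # duplicates station a
c = [0, 1, 0, 1, 0, 0, 1, 1]   # empirically independent of a

tc = total_correlation([a, b, c])
print(tc)
```

Note that the joint entropy is estimated here directly from the observed tuples, which is only feasible for short lists of variables and long records; this is the combinatorial limitation discussed above.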
2.4 Value of Information (VOI)
The Value of Information is about how choices made under uncertain conditions affect
the outcome. The basic idea is that a decision maker, conscious of his limited knowledge
to make an informed decision (that is, under uncertainty), is willing to pay for additional
information provided the expected gain exceeds the cost of collecting and analysing it. As
stated by Weinberger (2001), “...the practical value of information derives from its
usefulness in making informed decisions”.
The value of information depends on several factors (Macauley 2005):
• The degree of uncertainty of the decision-maker. If there are a few possible
choices of remedial actions to decide from, then information has a small value
even if it eliminates uncertainty. Conversely, if the costs of the actions have a
high variance (diverge widely), then the information can have a very high value
even if it does not reduce uncertainty.
• The objects or issues that are at risk as an outcome of a decision; VOI depends on
the value of the outcome. For instance, the willingness to pay for data about oil
exploration potential is in part a function of the price of gas. Outcomes can also
be measured in terms of damages caused by floods or diseases caused by
pollution, when no goods or services are considered.
• The cost of using the information to make decisions; analysing (as opposed to
collecting) data can be so expensive that the information ends up having little value.
• The price of the next-best alternative sources of information; sometimes there are
several other substitutes for information (aerial photographs instead of satellite
images, for example).
2.4.1 Definition
The concept of Value of Information was first introduced in the field of economics as a
way to deal with the limitations of knowledge in the decision-making process. Hirshleifer
and Riley (1979) offer a classic overview of general approaches to understanding the
Value of Information. Following their theoretical development, the value of information,
VOI, provided by one message received can be estimated as the difference between the
utility, u, of the action, $a_m$, that is chosen given a particular message, m, and the utility of
the action, $a_0$, that would have been chosen without additional information:
$$\mathrm{VOI}_m = u(a_m, p) - u(a_0, p) \qquad (2.11)$$
Both utilities are calculated following the expected utility rule of von Neumann-Morgenstern,
by summing up the products of probabilities, p, and the consequences, c, resulting from
the selected action, a, as shown in Eq. (2.12).
$$u(a, p_s) = \sum_{s \in S} c_{as}\, p_s \qquad (2.12)$$
where $p_s$ is the perceived probability of state, s, $c_{as}$ is the consequence associated with
the decision of performing the action, a, when the system has a state, s, and S is the set of
possible states of the system.
On the one hand, if a decision maker has to make a decision without information, he/she
will have to rely on his/her perspective about the state of the system. This perspective can
be quantified as the probability $p_s = \pi_s$ of having a particular state, s, in the system. On
the other hand, when new information is available, the decision maker can judge whether
or not to believe the new information, thus updating his/her beliefs, or to reject the new
information. Naturally, the new information has value if the individual uses it to make a
decision, and has no value otherwise. Mathematically, the process for which the decision
maker accepts the new information (i.e., is willing to change his/her belief, $\pi_s$) can be
represented by Bayes' theorem for updating belief as:
$$\pi_{s,m} = \frac{\pi_s\, q_{m,s}}{\sum_{s \in S} \pi_s\, q_{m,s}} \qquad (2.13)$$
where $\pi_{s,m}$ are the posterior or updated beliefs and $q_{m,s}$ is the conditional probability of
receiving the message, m, given the state, s, which is an amount that is generally provided
by the information service, and is an indication of the quality of the message provided.
False positives and false negatives, or type I and type II errors, are accounted for here.
In summary, Eq. (2.14) shows that the Value of Information is, ultimately, a function of
$c_{as}$, the consequences of taking an action, a, given a particular state, s; $\pi_s$, the prior
probability, or the belief before the additional information; and $q_{m,s}$, the conditional
probability of receiving the message, m, given the state, s.
$$\mathrm{VOI} = f(c_{as}, \pi_s, q_{m,s}) \qquad (2.14)$$
Figure 2-3 presents the steps for the estimation of the Value of Information.
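The estimation steps can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the two-state flood example, the consequence values, the message likelihoods and all function names are hypothetical, and the final aggregation (the expectation over messages of the per-message utility gain, each evaluated with the posterior beliefs) is one reading of the procedure rather than a definitive implementation.

```python
def expected_utility(consequences, probs):
    """Expected utility rule: sum of consequence times probability over states."""
    return sum(c * p for c, p in zip(consequences, probs))


def best_action(consequences_by_action, probs):
    """Action with the highest expected utility under the given beliefs."""
    return max(consequences_by_action,
               key=lambda a: expected_utility(consequences_by_action[a], probs))


def posterior(prior, likelihood):
    """Bayes update: returns (posterior beliefs, probability of the message)."""
    joint = [p * q for p, q in zip(prior, likelihood)]
    q_m = sum(joint)
    if q_m == 0:
        return None, 0.0
    return [j / q_m for j in joint], q_m


def value_of_information(consequences_by_action, prior, likelihoods):
    """Expected utility gain from acting on messages instead of the prior."""
    a0 = best_action(consequences_by_action, prior)      # choice without information
    voi = 0.0
    for likelihood in likelihoods.values():
        post, q_m = posterior(prior, likelihood)
        if q_m == 0:
            continue                                      # message never received
        a_m = best_action(consequences_by_action, post)   # choice given message m
        gain = (expected_utility(consequences_by_action[a_m], post)
                - expected_utility(consequences_by_action[a0], post))
        voi += q_m * gain
    return voi


# Hypothetical example: states are (flood, no flood).
prior = [0.2, 0.8]
consequences = {
    "protect": [-10.0, -10.0],     # fixed cost in both states
    "do nothing": [-100.0, 0.0],   # damage only if the flood occurs
}
# q[m][s]: probability of receiving message m given state s.
likelihoods = {"alarm": [0.9, 0.1], "quiet": [0.1, 0.9]}

voi = value_of_information(consequences, prior, likelihoods)
print(round(voi, 6))
```

In this example the prior choice is to protect; only the "quiet" message changes the decision, so the whole value of the information service comes from the costs avoided when no alarm is received.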
One of the difficulties in using the VOI lies in how to assess the probabilities before and
after receiving the new information. Some authors have tried to estimate them empirically
by interviewing decision makers directly (see e.g., Bouma et al. 2009;
Schimmelpfennig and Norton 2003), while other authors have used model outputs (see
Dakins et al. 1996; Lin et al. 1999). The work by Yokota and Thompson (2004) offers a
comprehensive review of various VOI applications in the field of environmental health
risk management decisions.
2.4.2 Use of VOI in water resources
The concept of value of information has been applied in different fields. Examples
include the assessment of supply chains (Gavirneni et al. 1999; Lee et al. 2000), the
exploration of bidders' incentives to gather information in auctions (e.g., Milgrom and
Weber 1982), industrial purchasing decisions (Stigler 1961), system identification
(Fogel and Huang 1982) and the assessment of market distortions in real estate
transactions (Levitt and Syverson 2008), among others.
The concept of VOI has been actively applied in the field of water quality monitoring.
Recent research includes the work by Ammar and Kaluarachchi (2009), who
presented a methodology to optimise groundwater quality monitoring networks, taking
into account vulnerability/probability assessment, environmental health risk, the value of
information (VOI), and redundancy. Bouma et al. (2009) presented an assessment of the
Value of Information for water quality in the North Sea, by combining Bayesian decision
theory with an empirical, stakeholder-oriented approach.
[Flowchart: prior expected utilities and choice of the action a_0; Bayesian update of the beliefs for each message m; choice of the action a_m; utility difference per message; expectation over all messages.]
Figure 2-3. Flowchart for VOI estimation
Shaqadan (2008) developed a framework to reduce the uncertainty in exposure to health
risk due to drinking contaminated groundwater. The author assessed the socioeconomic
value of potential decisions to collect additional information for given variables, and
presented advanced social welfare concepts to understand the social acceptability of
decisions to collect better information.
Wagner et al. (1992) developed a method to contain a plume of groundwater contamination
through the installation and operation of pumping wells, in which
the hydraulic conductivity of the aquifer is the main source of uncertainty. Although
these authors mention VOI as a way to evaluate the decisions, the concept is reduced to
the solution of stochastic programming models and the evaluation of expected values of
optimal solutions. Reichard and Evans (1989) examined the value of groundwater
monitoring in reducing exposure uncertainty for different monitoring strategies.
Other uses of VOI in water resources include surface water quality: Borisova et
al. (2005) examined the price and quantity of different devices and assessed the
expected value of the information obtained for agricultural nitrogen pollution control.
Ramirez et al. (1988) introduced two analytical tools for decision making in flood
control design, namely ex-post analysis and the Value of Information. Roberts et al.
(2009) explored the value of information of early warning systems that detect
contaminant agents of soybean crops and found that it mainly depends on the perceived
risk of being infected and the accuracy of forecasts.
The concept of Value of Information is applied in this thesis to develop a methodology
for placing monitors in a water system (see Chapter 6).
2.5 Public participation in monitoring
From the environmental standpoint, public participation is becoming an interesting option
for the control and management of the environment. A number of examples of public
participation in monitoring can be identified. Au et al. (2000) developed a methodology
to involve the public in the collection of water quality data and found that high school
students were able, after proper training, to reliably produce values for total coliforms,
Escherichia coli and toxicity in waterways. In a similar project, Marcelino (2007)
explored the possibilities of working with children to create a multisensory geographical
information using Google Earth. This project was further extended to use mobile phones
in learning and participatory contexts (Silva et al. 2008).
Regarding environmental monitoring, the framework Environmental Collaborative
Monitoring Networks (Gouveia and Fonseca 2008) was created to channel existing
citizen initiatives that use ICT tools to increase the contribution of volunteered
geographic data. Niinioja et al. (2004) described in detail the results of public
participation in the monitoring of algae in Lake Pyhäjärvi, on the border between
Finland and Russia. Gouveia et al. (2004) suggested overcoming the problems of using
voluntarily collected data (lack of confidence in data collection procedures, data quality
often unknown and data usually dispersed and non-structured) by promoting the use of
Information and Communication Technologies, with an emphasis on tools that
explore non-traditional types of environmental data such as images, sounds and videos in
association with spatial information. Nare et al. (2006) assessed the extent of stakeholder
participation in water quality monitoring and surveillance at the operational level, as well
as indigenous knowledge and practices in water quality monitoring, in Zimbabwe,
where policies and legislation encourage stakeholder participation.
In particular, this thesis concentrates on the use of mobile phones by the public for water
level monitoring (see Chapter 7). The idea is to take advantage of the benefits of public
participation in monitoring, identified by diverse authors for the case of water quality
monitoring (see e.g., Au et al. 2000; Bromenshenk and Preston 1986; Stokes et al. 1990),
which include the creation of public awareness on environmental issues, the improvement
of collaboration among all the stakeholders, the cost effectiveness of data collection
activities and the high coverage in space and time.
2.6 Model reliability
Models are a simplification of reality and this simplification implies that models are
not perfect. It is of particular interest, therefore, to assess the reliability of the models that
are being used for planning, design or operation of water systems. In this regard, it is
important to look at the sources of model uncertainty and review some of the available
methods for uncertainty assessment.
2.6.1 Model uncertainty
Shrestha (2009) summarized the most important sources of uncertainty in rainfall runoff
models, namely observational uncertainty, model uncertainty and parameter uncertainty.
The first refers to the uncertainty of the measurements used as model inputs and
outputs; the second is associated with the simplifications, approximations and/or the lack of
knowledge when building the conceptual structure to describe the physical processes of
interest; the third is related to the model parameters that are generally estimated by
indirect means such as expert judgment or calibration.
Following the definition given by Holling (1978), reliable models provide the confidence
on which to base management decisions. It can then be stated that models are reliable when the
outputs and their associated uncertainty are within acceptable limits. Therefore, one way
to enhance the reliability of a model is to work on the reduction of its uncertainty.
2.6.2 Methods to assess uncertainty
A significant number of studies assess the uncertainty of rainfall-runoff
models. A comprehensive review of the available methods is presented by Pappenberger
(2006), where a decision tree is provided to select the proper method for a given situation.
Five sets of methods can be mentioned: forward uncertainty propagation, model
calibration and conditioning uncertainty, qualitative methods, real-time data assimilation
and sensitivity analysis. The first set includes error propagation equations (see e.g.,
Kunstmann and Kastens 2006), Monte Carlo propagation (EPA 1997), reliability methods
(e.g., Melchers 1999) and fuzzy (e.g., Klir and Smith 2001) and imprecise methods
(Walley 1991); the second includes nonlinear regression (e.g., Kavetski et al. 2002;
Kuczera and Parent 1998), Bayesian methods (e.g., Van Oijen et al. 2005), and
Generalized Likelihood Uncertainty Estimation, GLUE (e.g., Aronica et al. 2002); the
third set includes the Numerical, Unit, Spread, Assessment and Pedigree,
NUSAP (Jeroen et al. 2005) methods.
For real-time data assimilation, the Kalman filter-related methods (e.g., Bertino et al.
2003; Moradkhani et al. 2005b; Vrugt et al. 2005) and the sequential Monte Carlo
analysis can be mentioned (e.g., Moradkhani et al. 2005a); regarding sensitivity analysis
the works by Hall et al. (2005) and Wagener et al. (2003) can be cited.
In this thesis, a method to use data collected from the public to improve model reliability
is presented in Chapter 7.
Chapter 3
Case study 1: Polders of Pijnacker
Two water systems of a completely different nature have been selected to develop the
methods for monitoring network design, namely a polder system, in order to take into
account a flat, highly controlled water system, and a major natural river, in order to
consider uncontrolled stream flows. For the first case, the polder system of the region of
Pijnacker, The Netherlands, is described in this chapter. For the second case, the
Magdalena River, Colombia, is introduced and described in Chapter 4. The methods for
locating monitors are presented in Chapters 5, 6 and 7.
This chapter begins with a description of the general characteristics of a polder system
and the particular strategy of the Delfland Water Board, the local authority that manages
the polders of Pijnacker. This is followed by a description of the water system itself,
including the characteristics of the drainage, the canal network and the control structures.
Next, a description is given of the existing monitoring network, followed by a summary
of a model of the water system, including a review of the rainfall-runoff and
hydrodynamic models.
3.1 Introduction
Polders are developed areas below sea level, which are drained by canals and pumping
stations that discharge excess water either to elevated storage basins or to the sea. In this
way, artificial catchments are formed consisting of the combination of a number of
polders normally delimited by weirs, pump stations or inlet structures.
The Netherlands, with about 20% of its territory below sea level, is famous for its polder
water system configuration. A typical example is the western part of the country, in the
province of Zuid Holland, where important cities such as The Hague and Rotterdam are
located (Figure 3-1).
Local government agencies called "waterschappen" (water boards) or
"hoogheemraadschappen", existing since the 13th century, are organised to manage the
water levels. One of these authorities, the “Hoogheemraadschap van Delfland” (Delfland
Water Board or simply Delfland) is in charge of an area comprising 57 polders covering
about 40,000 ha.
Figure 3-1. Limits of the Delfland Water Board and in the province of Zuid Holland
Delfland is bordered by water bodies to the west (the North Sea) and to the south (namely
the Nieuwe Maas and the Nieuwe Waterweg; Figure 3-2), which are at a higher elevation
than the land. In order to keep the water level within predefined ranges in the area, each
polder drains its excess water into a storage basin through small pumping stations. The
storage basin is formed by sets of canals and lakes that are generally higher than the
polders, from which six main pumping stations, with a total capacity of 54 m3/s,
ultimately drain the excess water into the North Sea, the Nieuwe Maas or the
Nieuwe Waterweg.
The key parameters for the management of Delfland are water level and water quality
(Lobbrecht 1997). These variables affect seven well defined interests: flood prevention,
ecology, horticulture, pasture agriculture, recreation, navigation and operations. A
summary of these interests is shown in Figure 3-3, where their relative priorities are
shown by the chart divisions. It is worth mentioning that these divisions were made to be
used in the operational optimisation process of the water system and that the values are
not determined by the Waterboard.
Figure 3-2. Land uses, main water system components of Delfland region and location of the
polders of Pijnacker. Adopted from Lobbrecht (1997)
[Chart. Legend categories: interests (flood prevention, ecology, horticulture, pasture agriculture, recreation, navigation, operations); key variables (groundwater level, chloride, diffuse BOD, surface-water level); locations (storage basins, high-lying polders, urban polders, glasshouse polders, rural polders, main retention, polder retention).]
Figure 3-3. Interest-weighting chart Delfland water system. Adopted from Lobbrecht (1997), p 183
3.2 Description of the polder system of the Pijnacker region
The region of Pijnacker, with an area of 18.80 km2, is located to the East of the Delfland
area. The region is mainly rural, with some urban development (5.7 km2) and glasshouses
(2.82 km2). Figure 3-4 shows the land use in the region and Figure 3-5 shows, in a
schematic way, a typical profile in the region.
[Map. Legend: land use (urban areas, pasture, glasshouse), pump station, weir, canal, storage basin, polder division.]
Figure 3-4. Land use in the region of Pijnacker
Figure 3-5. Schematic profile of the polders of Pijnacker
The polder system consists of four main polders, namely Polder van Bresland (I), Oude
polder van Pijnacker (II), Noordpolder van Delfgauw (III) and Nieuwe of
drooggemaaktepolder (IV), each being divided into 6, 63, 27 and 31 smaller polders,
respectively (see Figure 3-6). For simplicity, the entire system is referred to as
“Pijnacker” throughout this thesis, as this is the name of the most important village in the
region.
Figure 3-6. Composition of the polder system of Pijnacker and identification of pump stations
3.2.1 Composition of drainage units
The four major polders and their 127 small polders are
hydrologically independent response units, with 28 unique target water levels. The
system has 13 pump stations and 21 fixed weirs, which serve to keep the
water levels in the canal network between limits defined by the water management of the
region (Figure 3-7). The arrows indicate the flow directions at the structures, which
generally go from West to East and from North to South, with the final destination being
the storage basin. The color scale indicates the target water levels, which is also an
indication of the relief of the region.
[Map. Legend: NAP elevation (m) in ten classes from -5.90 to -1.30, pump station, weir, canal, storage basin, structure's flow direction.]
Figure 3-7. Canal network and target water levels in the polders of Pijnacker.
3.2.2 Characteristics of the canal network
The main canals of the Pijnacker region have a total length of about 68.4 km and cover a
surface of 0.45 km2. The geometry of the canal network allows for a maximum storage
volume of about 487,000 m3. In Figure 3-8 the storage curve of the canal network is
presented for different levels, and shows two well-defined sections in which an increase
in storage can be observed. The rate of change of storage drops between the levels -4.5 m
and -4.0 m, because few canals exist with these levels, as can be confirmed by looking at
Figure 3-7.
[Plot: storage volume (m3), from 0 to 500,000, against water level (m), from -7.0 to 1.5.]
Figure 3-8. Storage volume of the Pijnacker’s canal network
3.2.3 Structures of control
Although there are weirs and pumps in the region, the former are not really operated, but
are generally fixed. For this reason, the water levels are controlled by pumps that are
operated with simple on/off controllers. The operational on/off levels and the capacities
of the 13 pump stations shown in Figure 3-6 are presented in Table 3-1. From Figure 3-6
it can also be observed that pumps 1 and 2 drain the polder I, pumps 4 and 5 drain the
polder III, pump 6 drains the polder IV and finally pump 7 drains the polder II into the
storage basin. This means that the remaining pumps have the function of raising the water
from the lowest parts of each polder.
Table 3-1. Characteristics of the pump stations in the Pijnacker polders
Pump    Capacity (m³/s)    On level (m)    Off level (m)
1       0.45               -5.1            -5.2
2       0.64               -2.1            -2.11
3       0.12               -2.27           -2.32
4       0.06               -2.9            -2.95
5       0.53               -2.05           -2.15
6       1.05               -3.1            -3.11
7       2.10               -2.65           -2.75
8       0.49               -3.15           -3.16
9       0.05               -3.45           -3.5
10      0.13               -3.15           -3.16
11      0.60               -5.25           -5.35
12      0.30               -5.9            -6
13      1.00               -5.62           -5.72
Although the weirs have the possibility of being operated by changing their crest level,
this is not a common practice in the current management of the Pijnacker region.
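The behaviour of a simple on/off pump controller can be sketched as follows. The switching rule (start at or above the on level, stop at or below the off level, otherwise keep the current state) is an assumption about how such controllers typically work, not a description of the Delfland control software; the on/off levels are those listed for pump 1 in Table 3-1 and the sequence of water levels is hypothetical.

```python
def pump_state(level, on_level, off_level, running):
    """Return True if the pump should run for the given water level (m).

    Assumed rule: switch on at or above `on_level`, switch off at or below
    `off_level`, and keep the previous state in between (hysteresis band).
    """
    if level >= on_level:
        return True
    if level <= off_level:
        return False
    return running


def simulate(levels, on_level, off_level, running=False):
    """Apply the controller to a sequence of water levels."""
    states = []
    for level in levels:
        running = pump_state(level, on_level, off_level, running)
        states.append(running)
    return states


# Pump 1 levels from Table 3-1; the water level series is assumed.
ON_LEVEL, OFF_LEVEL = -5.1, -5.2
levels = [-5.15, -5.08, -5.15, -5.21, -5.15]
states = simulate(levels, ON_LEVEL, OFF_LEVEL)
print(states)
```

The band between the two levels prevents the pump from switching repeatedly when the water level fluctuates around a single threshold.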
3.3 Water level monitoring network
At present, automatic water level gauges reporting every 15 min are located at the
pumping stations for controlling the pumps. Additionally, manual gauge scales are placed
in the region, some of them located at the pump stations as a backup reference for
operation, and others located elsewhere. The manual gauges are read once a month at
places where no automatic gauges exist.
The limitation of the current water level monitoring network is that it is exclusively
dedicated to the operation of the water system through the control of its pumping stations.
However, as every single polder has an associated target water level, the process of
knowing the current state of every point in the system is difficult and very expensive. For
this reason, the Delfland Water Board may miss out-of-range water levels that affect one
or more water users. One way to estimate the state of the water levels at any point in the
system, therefore, consists of having a reliable model of the network of canals.
Figure 3-9. Location of the existing water level gauges in the Pijnacker polders.
3.4 Description of the model of the Pijnacker polder system
This section describes the model of the Pijnacker polders that was used to develop the
approaches described in Chapters 5, 6 and 7. The objectives, topology, discretization
and components are summarized in the following paragraphs.
3.4.1 General description of the existing hydrodynamic model
The available model was built and instantiated by the Delfland Water Board between
2005 and 2006, for the purpose of evaluating the state of their system under normal
operating scenarios (static and dynamic cases for summer and winter operation in 2006
and system design for 2010) and also under extreme events.
It is a fully 1-D hydrodynamic (HD) model with an attached rainfall-runoff (RR) model
that runs in parallel (both models run simultaneously, sharing information at every time
step). Together, both models include 2,692 link elements that connect 3,300 nodes, 83%
of which belong to components of the RR model, such as friction and boundary elements and
areas representing glasshouses, open water, paved, and unpaved areas. The remaining
17% of the nodes represent flow elements of the HD model, such as boundaries, bridges,
flow calculation points, flow connections (including those connected to the RR model),
cross sections, culverts, fixed calculation points, lateral flows, measurement stations,
pump stations and weirs. The time step for both HD and RR models is set to Δt = 3 min and
the reporting time, also for both models, is 15 minutes. The spatial discretization of the
calculation points of the network is 50 m on average.
3.4.2 Rainfall-runoff connections
The connections between the HD model and the RR model occur at 130 connection
points spread over the canal network. Every single drop-shaped point in Figure 3-10
represents a connection, at which the computation of the hydrological processes
occurring at the corresponding areas is carried out. It is assumed that a rainfall event is
evenly distributed over the area.
[Map. Legend: pump station, weir, canal, storage basin, RR connection points.]
Figure 3-10. Connection points of the hydrodynamic (HD) model and the rainfall-runoff (RR)
model.
Chapter 4
Case study 2: Magdalena River
In order to analyse the performance, validity and implications of the methods for monitor
location developed in Chapters 5, 6 and 7, two water systems of completely different
nature are considered in this Thesis. First, the polder system of the region of Pijnacker,
The Netherlands, introduced and described in Chapter 3, was selected as being
representative of a flat, highly controlled water system. Second, in order to include a
natural stream, the Magdalena River, Colombia, is introduced and described in this
Chapter.
The following section introduces the general characteristics of the Magdalena River, its
catchment, tributaries and wetlands. This is followed by a description of the river users
and the different water interests, as well as a discussion of how the performance of the
river as a water system can be defined. Next, the development of the 1-D hydrodynamic
model on which the developed methods are based is presented. Finally, the limitations of the
model are listed at the end of the chapter.
4.1 Introduction
The Magdalena River, the main river of Colombia (Figure 4-1), runs for about 1,530
kilometres from South to North and flows into the Caribbean Sea. It drains a catchment of
273,459 km2, equivalent to 24% of Colombia, where 77% of the population lives
(Cormagdalena and ONF_Andina 2007). The mean annual discharge at the river mouth is
7,200 m3/s, with a mean low discharge of 4,068 m3/s in March and a mean high discharge
of 10,287 m3/s in November (Restrepo et al. 2006). These figures make the Magdalena
the largest river discharging into the Caribbean Sea.
From a morphologic point of view, the Magdalena River is divided into three regions,
namely the High Magdalena (682 km, from where the river originates until a zone of
rapids near the city of Honda), the Middle Magdalena (500 km, defined from the
Honda’s rapids until the town of El Banco), and the Low Magdalena (430km, defined
from Regidor until the discharge into the Caribbean Sea, at the city of Barranquilla)
(Julius_Berger_Consortium 1926).
This classification is also convenient for navigation, as each sector needs its
own kind of ships. In terms of the mean hydraulic slope, the High Magdalena
is steep, the Middle Magdalena is moderate and the Low Magdalena is almost flat (see Table
4-1).
Figure 4-1. General location of the Magdalena River and its catchment
Table 4-1. Mean hydraulic slope by sectors, Magdalena River
Region              Sector                                Mean hydraulic slope (cm/km)
High Magdalena      Embalse de Betania – Purificación     101
High Magdalena      Purificación – Pto. Salgar            45
Middle Magdalena    Pto. Salgar – Barrancabermeja         41
Middle Magdalena    Barrancabermeja – Regidor             33
Low Magdalena       Regidor – Banco                       16
Low Magdalena       Banco – Magangué                      7
Low Magdalena       Magangué – Calamar                    6
The area of interest for this research is the middle and low Magdalena. These regions are
important for the country not only in terms of economic activities such as navigation, fish
production and agriculture, but also because these regions suffer most from flooding
(Cormagdalena and ONF_Andina 2007; IDEAM 2001).
4.1.1 Tributaries
The main tributaries, located mainly in the middle reach, have an important influence on
the river’s behaviour in terms of discharge. Figure 4-2 presents the main tributaries
located in the middle and low sector of the Magdalena River, along with the most
important cities.
Figure 4-2. Main tributaries and towns of the middle and low Magdalena River
The Cauca River is the main tributary of the Magdalena. In fact, this river is the second
biggest river of Colombia. In the middle reach, the biggest rivers discharging into the
Magdalena are the Carare, Sogamoso, La Miel, Nare and Cimitarra. Although Figure 4-3
shows 21 tributaries, it is worth mentioning that the Magdalena River also receives
discharges from an important number of small streams, some of them functioning as
connections between the river and the wetlands.
[Figure 4-3: bar chart of mean discharge (m3/s, axis from 0 to 2500) for the tributaries Cauca, Cesar, Regla, Cimitarra, Pescado, Caño Balcanes, Nare, Cocorná, La Miel, Claro Del Sur, Pontoná, Doña Juana, Lebrija, Sogamoso, Opón, Carare, Ermitaño, Caño Baul, Palagua, Negro and Velasquez]
Figure 4-3. Mean discharges of the main tributaries of the middle and low sector of the
Magdalena River
4.1.2 Wetlands
The boundary between the middle and the low regions is distinctive for a number of
reasons. First, a relatively sudden slope transition takes place, due to the ending of the
mountainous system of the Andes. Second, the geology of the region, characterized by the
dynamics of the Pacific and the Caribbean tectonic plates and the resulting geologic
faults, generates the so-called “Depresión Momposina” (see Figure 4-4), a depression
zone in which subsidence occurs at a rate between 2 mm and 4 mm per annum (Martinez
1981; Smith 1986; Van der Hammen 1986). Third, the hydraulics of the region,
characterized by the discharges of the rivers Cesar, San Jorge and Cauca (the biggest
tributary of the Magdalena, which divides into the branches Loba and Mompox; see
Figure 4-4), forms a system comprised of hundreds of interconnected “ciénagas” or water
bodies that form a natural reservoir which absorbs the peak flows of the river. Finally, the
sedimentation processes, characterized by the deposition of 21 Mt/yr, yield a
sedimentation rate between 2 mm/yr (Restrepo 2008) and 3 mm/yr (Van der Hammen
1986). As this is of the same order as the subsidence rate, the net elevation change of the
wetland bed is barely visible.
Additionally, it is important to mention that the Momposina depression is an area with a
very important biodiversity in fauna and flora and with a high potential for fish, forest
and crop production (Aguilera 2004; Aguilera 2009). However, this biodiversity is at
risk because of uncontrolled human interventions and the lack of protection policies
(Galvis and Mojica 2007; Múnera et al. 2004; Muñoz et al. 2003; Naranjo et al. 1999). A
comprehensive description of the region can be found in DNP, FAO et al. (2003).
Although it is known that the Momposina depression is one of the biggest inner deltas in
the world (Van der Hammen 1986), there is no clarity on its extent. Restrepo (2005)
states that this “tectonic tray” has an approximate area of 1850 km2, while in another
publication, the same author indicates an area of 800 km2 (Restrepo et al. 2006). In terms
of the area occupied by the wetlands, the Julius Berger Consortium (1926) estimated,
with instruments of the time, an area of “at least” 2,200 km2. In this study, the area of the
wetlands in the Momposina depression, taken from El Banco until 25km to the North of
Plato, is 3,096 km2, which includes the Mojana region and the Zapatosa wetland (365
km2 alone). These estimates come from an analysis of the SRTM Water Body Data
(SWBD) data set.
[Figure 4-4: map with elevation classes (m) from 1 - 44 to 1,249 - 2,000+ and water bodies; labelled towns: Plato, Magangué, Mompox, El Banco, Pinillos and San Martín de Loba]
Figure 4-4. Inner delta and wetlands of the Momposina depression
4.1.3 Description of the existing Monitoring Network
Nowadays, more than 3,300 hydro-meteorological stations of different types are
operating in the Magdalena – Cauca catchment. The methods presented in this thesis for
the case study, however, consider only the water level monitoring network.
The existing water-level gauges for the river were placed initially to support
decision-making concerning local problems in the main populated areas, related to flood control
and navigation, while keeping operation and maintenance costs low. However, from a
global perspective, the information collected by these gauges is of limited use for supporting
decision and policy-making for navigation, flood control and other issues at other points
of the river.
Figure 4-5 shows the location of the existing limnigraphs and limnimeters to monitor the
water levels of the middle and low Magdalena River, and the present age of their records.
It can be observed that Salgar, Berrío, Barrancabermeja, Wilches, El Banco and Calamar
are the oldest stations and therefore these are always included in all the navigation and
flooding studies in the Magdalena River.
Figure 4-5. Age of the existing limnigraphic and limnimetric stations in the middle and low
Magdalena River (based on the year 2010)
During the data collection for this thesis, a number of informal interviews were carried
out with different people involved for years in several studies about the Magdalena River.
They described the following common problems (some also found in the literature) with
the water level monitoring network of the Magdalena River:
• The distance between two consecutive gauges with sufficient historical records is
huge (157 km between Salgar and Berrío, 105 km between Berrío and
Barrancabermeja, 676 km between Barrancabermeja and El Banco and 309 km
between El Banco and Calamar). Even the most recent, intermediate gauges are still
too far apart to draw proper conclusions about water levels in the Magdalena River
(Cormagdalena 2004).
• Some gauges have been resited over time, because of a number of factors that
range from accidents with ships and high flows that wipe out the gauges
to situations that have not been properly reported (Cormagdalena 2004).
• The gauges are read by human observers who must record the levels and report them
in a predefined format. Although there is no documented evidence, situations like
forms being filled in without actually looking at the gauges were mentioned, which add
enormous uncertainty to the quality of the data. Other problems, like illiterate or
non-committed observers and vandalism, were faced even by earlier studies in the
Magdalena River (see e.g., Julius_Berger_Consortium 1926).
• The majority of the gauges have a relative reference level, which means that they do
not provide the absolute water level in a unique elevation system. This situation has at
least three causes: first, the institution in charge of placing, maintaining and operating
the gauges (Institute for Hydrology, Meteorology and Environmental Studies,
IDEAM) is different from the institution in charge of providing maps, levels and the
topographical information of the country (Geographic Institute Agustín Codazzi,
IGAC), and is also different from the institution in charge of making the decisions
about the floods and navigation on the river from a global perspective (Corporación
Autónoma Regional del Río Grande de la Magdalena, CORMAGDALENA). Second,
the gauges were initially installed with relative or arbitrary references, enough to
make decisions of a local nature. Third, there exists little or no interest in
georeferencing the existing gauges to a unique, precise reference system, as can be
inferred from the persistently ignored recommendations of old studies
(Julius_Berger_Consortium 1926; Mitch 1973) as well as relatively new studies (see
e.g., Cormagdalena 2000b; Cormagdalena 2004). Yet an additional reason is that
IDEAM, which is the entity in charge of the hydrological forecasting in the country,
has been using statistical methods that do not require absolute but relative levels
(IDEAM 2005). Indeed, a number of studies to forecast water levels for navigation
purposes in the Magdalena River have used relative water levels (see e.g., Domínguez
et al. 2009; Fernandez et al. 2010; IDEAM 2005; Mitch 1973; Rivera et al. 2004;
Rojas 2006).
• The gauges with absolute datum have actually been referenced using the bench marks
of the closest road projects, which, in turn, are barely referenced to the same absolute
datum. This means that the absolute datum of these gauges can only be considered to
be approximate.
• Some authors consider that the large amount of human intervention in the river during
the last decades, such as dredging for navigation purposes, the construction of dikes
for flood control, and the illegal (and therefore undocumented) closure of the natural
wetland-river connections for land reclamation, has necessarily affected the time
series records of the river, and for this reason any statistical analysis of them should
be done with care (see e.g., Cormagdalena and Fedenavi 2007).
• Additionally, the daily discharge data at some stations are based on rating curves, so
that every (relative) water level record registered at the gauge is converted to a
discharge. Although this is a common, accepted practice, several issues add
uncertainty to these curves. For instance, the changing morphology of the Magdalena
River implies that the sections where these rating curves have been derived have
experienced important modifications. For this reason, some authors have decided to
eliminate old records, in order to guarantee in some way a constant hydraulic section
(see e.g., Cormagdalena 2006; Cormagdalena and Fedenavi 2007).
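The effect of an uncertain gauge zero on rating-curve discharges, mentioned in the last item, can be illustrated with a minimal sketch. It assumes a generic power-law rating curve Q = a (h - h0)^b; this form, the coefficients a and b and the levels used are hypothetical and are not taken from any Magdalena station.

```python
def rating_curve_discharge(level_m, zero_level_m, a=120.0, b=1.6):
    """Generic power-law rating curve Q = a * (h - h0)**b.

    a and b are hypothetical coefficients chosen for illustration only.
    """
    depth = level_m - zero_level_m
    if depth <= 0.0:
        return 0.0  # level at or below the gauge zero: no flow in this sketch
    return a * depth ** b


def relative_error_from_zero_shift(level_m, zero_level_m, shift_m, a=120.0, b=1.6):
    """Relative change in discharge when the gauge zero is off by shift_m."""
    q_reference = rating_curve_discharge(level_m, zero_level_m, a, b)
    q_shifted = rating_curve_discharge(level_m, zero_level_m + shift_m, a, b)
    return (q_shifted - q_reference) / q_reference


if __name__ == "__main__":
    # Hypothetical case: 4 m of water above the gauge zero, zero misplaced by 0.2 m
    err = relative_error_from_zero_shift(level_m=4.0, zero_level_m=0.0, shift_m=0.2)
    print(f"Relative discharge error: {err:.1%}")
```

Under these assumed values a 0.2 m error in the gauge zero changes the computed discharge by several percent, which is one way to see why resited or unreferenced gauges matter for the discharge series.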
4.2 Performance of the Magdalena River
The main water interests of the river, which are in permanent conflict, are navigation,
agriculture and pasture, flood control and fish production. Navigation in the middle and
low Magdalena River is important because it connects the strategic,
productive cities of the interior with the ports of Barranquilla and Cartagena, facilitating
trade on the Caribbean Sea and the Atlantic Ocean. Naturally, low water
levels due to either dry periods or sediment deposits imply economic losses for ships that
get stuck on sand bars along the river. A comprehensive review of the navigation in the
low Magdalena River can be found in Alvarado et al. (2008).
The social and political situation of the people settled in the Magdalena River may
explain the type of decisions concerning the management of the river. A well known
situation is that the land is generally owned by a few powerful families that exploit it for
agriculture and cattle (Aguilera 2004; DNP et al. 2003). In order to get more land, these
people started land-reclamation activities, such as closing the streams that connect the
river and the wetlands. Other activities included the construction of carriage ways for
transportation purposes, which help to close the stream connections. These practices have
affected the ecology of the region negatively, together with fish production. Detailed
environmental impacts in the middle Magdalena River due to human activities can be
found in Cormagdalena (2007) and DNP et al. (2003).
From a flood control perspective, the solutions have considered mainly structural
methods such as lateral dikes along the river. This is not a good practice:
blocking the floodplain decreases the channel conveyance and increases flood stages,
and the construction of dikes eliminates natural storage originally available on the flood plain,
generating an increase of the runoff concentration and of the flood peaks downstream.
Nevertheless, between 2004 and 2009 a total of 517 km of linear concrete walls, piles and
dikes were built along the river (Cormagdalena 2009), which might make the situation worse in
the future. Non-structural methods such as warnings are prepared at national level by the
Institute for Hydrology, Meteorology and Environmental Studies (IDEAM), for local
emergency associations to prepare affected communities for possible flood occurrences.
Although efforts to improve the quality of these warnings are being made, at present they
are still very imprecise.
Finally, fish production has dropped dramatically. Some authors explain this reduction
as a result of agricultural, urban, and industrial development and deforestation in the
river's watershed (see e.g., Abramovitz and Peterson 1996; Silva; Valderrama and Zarate
1989); others attribute it to over-exploitation of fisheries by displaced
people (see e.g., Galvis and Mojica 2007); however, local reports made in the field with
the fishermen (see e.g., Campo 2001; Rodríguez 2001; Romero 2001) conclude that the
drop in fish production is due to the reduction of areas and depths of the wetlands by
human intervention in the river-wetlands connections. A comprehensive diagnosis of the
fishing problems in the Magdalena River can be found in Gualdrón (2006).
As mentioned above, a good performance of the Magdalena River can be defined by the
extent to which the navigation, agriculture and cattle, flooding and fish production fulfil
the expectations of each water user.
4.3 Development of the hydrodynamic model for the Magdalena River
The theoretical approaches for the design and evaluation of monitoring networks
presented in Chapters 5 and 6 require the use of a hydrodynamic model. For this reason,
the hydrodynamics of the middle and low part of the Magdalena River have been
modelled. In this section the details of this model are presented. It is worth mentioning
that the developed model does not include a hydrological component, because the
analysis of the river response under rainfall events is beyond the objectives of this thesis.
However, the developed model may be updated and complemented for other uses.
4.3.1 Data used
The data available to build and instantiate the hydrodynamic model of the Magdalena
River, between Salgar and Calamar, include information of stages and discharges of river
stations, river network and bathymetry obtained for different studies and satellite
floodplain elevations. The details of the data used are described as follows.
Stage and discharge
Daily, multiannual water level and discharge records available for the low and middle
Magdalena River and its tributaries range from 1974 to 2003. However, the most
complete data for the tributaries, which also include 15 out of the 32 existing stations (see
Figure 4-5), are for the year 1995 (see Table 4-2 and Figure 4-6) and therefore the model
has been built for this period.
Figure 4-6: Available hydrologic data records of discharges (Q) and water levels (h) at river
stations for 1995.
Table 4-2. Number of days of 1995 with discharge and stage data and datum of gauges
Station
Salgar
Berrio
Barranca
San Pablo
Regidor
Peñoncito
El Banco
Discharge
Source*
Stage
Magdalena River
365
R
365
365
R
365
365
261
R
261
365
D
365
D
263
321
R
365
Datum
Source**
165.9
104.5
70.5
54.6
23.5
16.6
19.6
A
A
A
A
B
B
A
Tacamocho
Plato
Tenerife
Calamar
365
Santa Ana
San Roque
365
365
Armenia
Magangue
365
365
D
365
297
271
271
365
R
Mompox branch
D
365
D
365
Loba branch
D
D
365
5.62
2.85
0.19
-0.20
B
C
C
A
10.92
15.73
C
C
12.15
9.241
C
C
*R: Rating curve, D: data reported by IDEAM (Cormagdalena 2006); **Datum (masl)
reported by A: Cormagdalena’s website; B: studies by LEH-UN; C: studies by LEH-LF,
Uninorte.
Boundary conditions
The boundaries of the model are the 1995 discharges at Puerto Salgar (upstream) and the
1995 water levels at Calamar (downstream). Additionally, the time series of the 12 main
tributaries of the river in the considered sector (see Figure 4-6 and Figure 4-7) were
included as point sources.
[Figure 4-7: bar chart of discharge (m3/s, axis from 0 to 3000) with two series, "Mean" and "Mean 1995 (model input)", for the tributaries Cauca, Cesar, Regla, Cimitarra, Pescado, Nare, Caño Balcanes, Cocorná, La Miel, Claro Del Sur, Pontoná, Lebrija, Doña Juana, Opón, Sogamoso, Carare, Caño Baul, Palagua, Ermitaño, Negro and Velasquez]
Figure 4-7. Mean discharge of tributaries of the Magdalena River and mean discharges for the
year 1995, used as model inputs.
River network and bathymetry
The model of the Magdalena, composed of the branches Magdalena (Loba) and
Mompox, is built using a network point every 200 meters. In total, the Magdalena branch
contains 4058 points, with chainages ranging from K906+000 at Puerto Salgar (upstream)
to K94+600 at Calamar (downstream). Similarly, the Mompox branch, formed by 1080
points, connects to the Magdalena network at the chainage K402+600 upstream (just after
El Banco) and at K227+800 downstream. The maximum distance between two adjacent
computational points is Δx = 5000 m.
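The number of network points quoted for the Magdalena branch follows directly from the chainages and the 200 m spacing. The short sketch below reproduces that count; the function names are illustrative and the chainage format is assumed to be "K<km>+<m>" as written in the text.

```python
def chainage_to_metres(chainage):
    """Convert a chainage label such as 'K906+000' to metres (906000)."""
    km, metres = chainage.lstrip("K").split("+")
    return int(km) * 1000 + int(metres)


def count_network_points(upstream, downstream, spacing_m=200):
    """Number of equally spaced points between two chainages, both ends included."""
    length_m = abs(chainage_to_metres(upstream) - chainage_to_metres(downstream))
    return length_m // spacing_m + 1


if __name__ == "__main__":
    # Magdalena (Loba) branch: Puerto Salgar to Calamar, one point every 200 m
    print(count_network_points("K906+000", "K94+600"))  # 4058
```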
The bathymetric information used for the hydrodynamic model was obtained during
surveys carried out in 1999 and 2000 for the navigation studies commissioned by Cormagdalena,
the government institution dedicated to the management of the river. Two main
institutions, namely the LEH-LF (Universidad del Norte, Barranquilla) and the LEH-UN
(National University, Bogotá) were in charge of the execution of the studies, the former
in charge of the low Magdalena (Cormagdalena 2000b) and the latter of the middle
Magdalena (Cormagdalena 2000c); together they released the first Magdalena River
Navigation Booklet (Cormagdalena 2000a). Yet another navigation booklet was produced
in 2004, and a project was started to obtain frequent bathymetries for navigation assistance in near
real-time (Alvarado 2006). However, the works of the year 2000 were selected in order to
use satellite elevation data from HydroSheds (HS) by Lehner et al. (2006) to complement
the sections. HS is a hydrologically-corrected version of the Shuttle Radar
Topography Mission (SRTM) elevation data obtained by the Space Shuttle Endeavour
mission (National Aeronautics and Space Administration, NASA) in February 2000.
Cross sections
As the primary focus of the field work was navigation, the information collected was
insufficient for the modelling process. In particular, the work does not include the flood
plain topography, so the cross sections are incomplete. Additionally, the bathymetries are
expressed in terms of depths, so that pilots know how much load they may
transport without running aground in the river.
Unfortunately, only local projects such as river bank protections or flood control works
have information about flood plain topography, and the local nature of these
projects makes it unnecessary to reference the elevations to an absolute datum.
As stated above, all the cross sections available for modelling are limited to the main
channel of the river and therefore the elevation of the embankments is not well defined.
For this reason, it was decided to combine the bathymetry information of the year 2000 with
the HS DTM data, in order to obtain complete cross sections for the model. However, in
cases where there was not enough information or where the available bathymetry did not
have a reliable elevation transformation, the bathymetry obtained during other years was
used. The general procedure to combine the two sources of information is as
follows:
1. Select the place where a cross section is needed.
2. Search for the corresponding water level at the point where the cross section is
located and at the closest referenced gauge during the period of the field work. If
the data of the referenced gauge is not available, replace it by the water level that
is exceeded 50% of the time for the date and year under consideration.
3. Estimate the absolute elevation reference of the cross section, based on the
reading of the closest referenced water level gauge and the hydraulic slope during
the field work.
4. Draw in plan view the cross section line of the bathymetry and extend it for at
least 100 to 500 meters depending on the characteristics of the river.
5. Extract the profile from the HS DTM using the line obtained in step 4.
6. Replace the corresponding points of the HS DTM by the bathymetry points.
This procedure was used to produce 33 cross sections that were included in the model.
An example of the obtained cross sections is shown in Figure 4-8, for a
cross section near La Dorada – Puerto Salgar.
[Figure 4-8: elevation (m, 160 to 190) against distance (m, 0 to 600); series: Bathymetry (8-Mar-03), Hydrosheds (2000), Final Hydroshed data added; annotations: "Ignored Hydrosheds elevation data" and "Gauge value at Puerto Salgar (8 Mar 2003): 168.85m"]
Figure 4-8. Example of a composite cross section near La Dorada - Puerto Salgar
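The procedure used to build the composite cross sections can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the processing actually used for the 33 sections: the section is treated as a list of points along a straight line, the water surface is transferred from the gauge with a constant slope, and all function names and sample values other than the Puerto Salgar gauge reading are hypothetical.

```python
from statistics import median


def reference_level(gauge_reading, historical_levels):
    """Gauge reading during the survey, or the level exceeded 50% of the time
    (the median of the historical levels) when no reading is available."""
    return gauge_reading if gauge_reading is not None else median(historical_levels)


def water_surface_elevation(gauge_level_masl, slope_cm_per_km, distance_upstream_km):
    """Absolute water level at the section, transferred from the gauge.

    distance_upstream_km is positive when the section lies upstream of the gauge.
    """
    return gauge_level_masl + slope_cm_per_km / 100.0 * distance_upstream_km


def composite_cross_section(dtm_profile, bathymetry, water_level_masl):
    """Merge a DTM profile with sounded depths along the same section line.

    dtm_profile: list of (station_m, elevation_masl) extracted from the DTM
    bathymetry:  list of (station_m, depth_m) measured below the water surface
    DTM points inside the sounded stretch are replaced by bed elevations.
    """
    bed = sorted((s, water_level_masl - d) for s, d in bathymetry)
    start, end = bed[0][0], bed[-1][0]
    banks = [(s, z) for s, z in dtm_profile if s < start or s > end]
    return sorted(banks + bed)


if __name__ == "__main__":
    level = reference_level(168.85, historical_levels=[167.9, 168.4, 169.1])
    wl = water_surface_elevation(level, slope_cm_per_km=45, distance_upstream_km=2.0)
    dtm = [(0, 175.0), (100, 172.0), (200, 171.0), (300, 171.0), (400, 173.0), (500, 176.0)]
    soundings = [(150, 2.0), (250, 6.0), (350, 1.5)]
    for station, elevation in composite_cross_section(dtm, soundings, wl):
        print(station, round(elevation, 2))
```

In this sketch the DTM points at stations 200 and 300 are discarded because they fall inside the sounded channel, which corresponds to the "ignored" HS elevations in the example figure.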
Wetlands
At present, the hydrologic and hydraulic information about the wetlands is limited to
what can be inferred from satellite images, aerial photography and cartography, which is
basically the extent of the water bodies. Perhaps the only attempt to establish a
water balance in part of the considered region is the work by Díaz-Granados et al. (2001),
in which the wetlands of the Mojana Region and the Zapatosa wetland, downstream of the Cesar
River, were modelled. However, the authors recognize that their efforts provide only a
“qualitatively valid approximation” of the behaviour of the wetlands, due to the deficient
amount of information, especially regarding the elevation data, a key component that
drives the flow pattern in such a flat area. Additionally, during the information collection
for this thesis, no bathymetry information for the wetlands was found.
From a modelling point of view, the wetlands at the limit between the middle and low
sector of the Magdalena River were simulated using control structures. For this purpose,
the existing wetlands were grouped into four large reservoirs that were taken into
consideration in the model (see Figure 4-9). These reservoirs were located at chainages
K516+800, K403+800, K381+800 and K269+800 from upstream to downstream,
respectively. The main idea behind the control structures is that during the wet season
the excess flow coming from the Magdalena partially discharges into the wetlands through
weirs. Similarly, during the dry season, the weirs allow water to flow from the wetlands
back to the river. Details of the four grouped wetlands are presented in Table 4-3.
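The two-way exchange through a weir can be illustrated with a minimal sketch. It assumes the textbook free-flow formula for a rectangular weir, Q = Cd (2/3) sqrt(2g) b H^1.5, and ignores submergence; this is not necessarily the formulation of the modelling software, and the discharge coefficient, width and levels are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m/s2)


def weir_exchange(river_level, wetland_level, crest_level, width_m, cd=0.6):
    """Signed flow (m3/s) over a weir; positive means river to wetland.

    The head is taken on the higher side, above the crest. Free flow is
    assumed, so the result overestimates the exchange when the weir is drowned.
    """
    if river_level == wetland_level:
        return 0.0  # no gradient, no exchange
    head = max(river_level, wetland_level) - crest_level
    if head <= 0.0:
        return 0.0  # both levels below the crest
    q = cd * (2.0 / 3.0) * math.sqrt(2.0 * G) * width_m * head ** 1.5
    return q if river_level > wetland_level else -q


if __name__ == "__main__":
    # Wet season (river above wetland) and dry season (wetland above river)
    print(round(weir_exchange(26.0, 24.0, crest_level=25.0, width_m=100.0), 1))
    print(round(weir_exchange(24.0, 26.0, crest_level=25.0, width_m=100.0), 1))
```

The crest level controls when the exchange starts, which is why it is a natural calibration parameter for this kind of representation.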
Figure 4-9. Assumed grouping of the wetland system for the model
The elevations and the areas presented in Table 4-3 were obtained from GIS analysis
of the HS DTM data, which is still an approximation. It is clear that, due to the deficient
cartography and elevation information, the exact connections between wetlands and
between rivers and wetlands are unknown, and therefore the real behaviour of the wetlands
in terms of storage volumes and peak flood attenuation of the rivers Magdalena and
Cauca is uncertain, so further investment in monitoring and research in general is needed
(DNP et al. 2003).
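An elevation-area table such as the one used for the grouped wetlands implies a storage volume, which can be approximated by integrating the area over the elevation. The sketch below uses the trapezoidal rule and assumes that the area varies linearly between tabulated elevations. The sample values are the last five elevation-area pairs listed in Table 4-3, taken here to belong to wetland W4; both the attribution and the linear assumption are part of this illustration.

```python
HA_TO_M2 = 10_000.0  # one hectare in square metres


def storage_between_rows(stage_area):
    """Volume (m3) between the lowest and highest tabulated levels.

    stage_area: list of (elevation_masl, area_ha) sorted by elevation.
    """
    volume = 0.0
    for (z0, a0), (z1, a1) in zip(stage_area, stage_area[1:]):
        # trapezoid: mean area of the two levels times the elevation step
        volume += 0.5 * (a0 + a1) * HA_TO_M2 * (z1 - z0)
    return volume


# Elevation (masl) and area (ha) pairs as read from Table 4-3 (assumed W4)
W4 = [(10, 43721), (11, 74600), (12, 99335), (14, 108833), (15, 122234)]

if __name__ == "__main__":
    print(f"{storage_between_rows(W4) / 1e9:.2f} km3 between 10 and 15 masl")
```

Because the areas come from an approximate DTM, any volume obtained this way inherits the same uncertainty.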
4.3.2 Simulation characteristics
The data available allows the user a simulation period from 01/01/1995 at 12:00:00 PM
to 12/31/1995 at 12:00:00 PM. With a calculation time step of 10 minutes and
provided Δx = 5000 m and h = 3 m, the Courant number for stability and accuracy criteria
yields
\[ C_r = \frac{\Delta t \, \sqrt{g h}}{\Delta x} = \frac{(10 \times 60) \, \sqrt{9.81 \times 3}}{5000} = 0.65 < 1.0 \]
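The Courant number quoted above can be checked numerically. The sketch below assumes, as the quoted values do, that the Courant number is based on the wave celerity sqrt(g h) only; the function name is illustrative.

```python
import math


def courant_number(dt_s, dx_m, depth_m, g=9.81):
    """Courant number based on the shallow-water wave celerity sqrt(g * h)."""
    return dt_s * math.sqrt(g * depth_m) / dx_m


if __name__ == "__main__":
    cr = courant_number(dt_s=10 * 60, dx_m=5000.0, depth_m=3.0)
    print(round(cr, 2))  # 0.65
```

With the same grid spacing, a deeper reach gives a larger Courant number, so the value holds only for depths of the order assumed here.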
Table 4-3. Characteristics of the grouped wetlands

Wetland   Description
W1        Accounts for wetland at the discharge of the Lebrija River, Ciénaga Simití and the diversion of the Morales branch
W2        Accounts for the Zapatosa wetland (nearby El Banco) and other water bodies at the left shore of the Magdalena River, in front of El Banco
W3        Accounts for the wetlands of the Mojana region, at both sides of the discharge of the Cauca River into the Magdalena River
W4        Accounts for the wetlands of the Momposina Depression

Wetland   Elevation (masl)   Area (ha)
W1        38                 37,414
          42                 37,414
W2        23                 2,473
          24                 59,864
          25                 78,551
          26                 102,778
          27                 109,411
          28                 113,081
          29                 116,852
W3        10                 31,858
          12                 55,011
          13                 60,571
          17                 85,054
          19                 126,217
W4        10                 43,721
          11                 74,600
          12                 99,335
          14                 108,833
          15                 122,234

4.3.3 Model calibration
Once the model was instantiated, the first runs were aimed at checking the continuity of the
discharges in the river, to verify that the model carries the correct volume of water compared
to the flow measurements given at different points. As a first attempt, no wetlands were included.
Due to the missing data of some tributaries before 30 April 1995, the analysis was carried
out from 1 May 1995. The first results show that at Puerto Berrío the modelled discharges
replicate the measurements very well, but the results become less acceptable at the
downstream stations. At Regidor, for instance, although the general trend of the discharge
curve is acceptable (see Figure 4-10(a)), the shape of the measurement curve is smoother
than the modelled one; also the modelled volume is persistently higher (between 700 and
1000 m3/s that is not reflected in the measurements). At Calamar, additionally, the flow
curve is completely different in terms of trend, quantity and shape; see Figure 4-10(b).
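A continuity check of this kind amounts to comparing modelled and measured discharges day by day at each station. The sketch below computes the mean difference while skipping days without measurements; the series are hypothetical and the function name is illustrative, so it shows the idea rather than the check actually performed.

```python
def mean_bias(modelled, measured):
    """Mean of (modelled - measured), skipping days without a measurement."""
    pairs = [(m, o) for m, o in zip(modelled, measured) if o is not None]
    if not pairs:
        raise ValueError("no overlapping measurements")
    return sum(m - o for m, o in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical daily discharges (m3/s); None marks a missing measurement
    model = [3000.0, 3200.0, 3500.0]
    gauge = [2200.0, None, 2600.0]
    print(mean_bias(model, gauge))  # 850.0
```

A difference that keeps the same sign over the whole period points to a volume problem, such as missing or overestimated inflows, rather than to a timing or roughness problem.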
Figure 4-10. Modelled and measured discharges at Regidor (a) and Calamar (b), first check
[Plot: discharge (m3/s) against time, 01-May-95 to 25-Dec-95, in two panels, Regidor and Calamar, each with the series "measurement" and "model"]
Figure 4-11. Modelled and measured discharges at Regidor (a) and Calamar (b), second check
Although missing inflows may indicate that other minor reaches are not included or that
rainfall over the river may have a large influence, the deficient quality of the rating
curves used to produce the discharge time series could also be an important source of
error. In order to account for the missing inflows, lateral inflows were added into three
sections of the Magdalena River: from Berrío to San Pablo (400 m3/s), from San Pablo to
Regidor (800 m3/s) and from Magangué to Tacamocho (500 m3/s). The results of these
inflows are shown in Figure 4-12.
[Figure 4-12: panels (a) Regidor and (b) Calamar, each with the series "measurement" and "model"]
Figure 4-12. Modelled and measured discharges at Regidor (a) and Calamar (b), final result
For the Mompox branch, the results are presented in Figure 4-13.
[Figure 4-13: discharge (m3/s, 0 to 1,200) against time, 1-May-95 to 25-Dec-95; series "measurement" and "model"]
Figure 4-13. Modelled and measured discharges at Santa Ana station, Mompox branch (final
result)
It is evident that the modelled discharge at Regidor replicates the measurements better with
the new lateral inflows. However, at the downstream boundary (Calamar) neither the
shape of the curve nor the discharges are reproduced well. At this point, therefore, the
need to include the wetlands is evident. On the one hand, the regulatory nature of
the wetlands compensates for the water balance downstream, and on the other hand their
presence attenuates the discharge in time, making the time series smoother, just as they
are observed at the downstream stations.
After the acceptable reproduction of the volume of water at different points of the river,
the modelled water levels were adjusted to the measured ones by changing the roughness
coefficients. The final Manning coefficients are presented in Table 4-4. The
coefficients for the intermediate places were estimated using linear interpolation between
two consecutive stations.
Table 4-4. Resistance number (Manning coefficient) at stations
River Name   Station Name      Chainage (m)   Resistance Number (local values)
Magdalena    Puerto Berrío     759600         0.022
Magdalena    Barrancabermeja   656600         0.048
Magdalena    San Pablo         607800         0.038
Magdalena    Magangué          258600         0.025
Magdalena    Tacamocho         212200         0.025
Magdalena    Plato             167200         0.040
Mompox       San Roque         39300          0.025
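The linear interpolation of the Manning coefficient between consecutive stations can be sketched as follows, using the Magdalena branch values of Table 4-4. The function name and the decision to reject chainages outside the calibrated stations are assumptions of this illustration.

```python
# (chainage_m, Manning n) for the Magdalena branch, from Table 4-4
MAGDALENA_N = [
    (759600, 0.022),  # Puerto Berrío
    (656600, 0.048),  # Barrancabermeja
    (607800, 0.038),  # San Pablo
    (258600, 0.025),  # Magangué
    (212200, 0.025),  # Tacamocho
    (167200, 0.040),  # Plato
]


def manning_at(chainage_m, stations=MAGDALENA_N):
    """Linearly interpolate Manning's n between the two bracketing stations."""
    pts = sorted(stations)  # ascending chainage
    if not pts[0][0] <= chainage_m <= pts[-1][0]:
        raise ValueError("chainage outside the calibrated stations")
    for (x0, n0), (x1, n1) in zip(pts, pts[1:]):
        if x0 <= chainage_m <= x1:
            return n0 + (n1 - n0) * (chainage_m - x0) / (x1 - x0)


if __name__ == "__main__":
    # Midway between San Pablo (0.038) and Barrancabermeja (0.048)
    print(round(manning_at(632200), 3))  # 0.043
```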
Although the water level is not reproduced perfectly by the model, the trend and
values follow the pattern of the measurements, as can be observed in Figure 4-14 for the
Regidor station.
A more detailed description of the developed model and the calibration process can be
found in He (2009), a Master of Science thesis that supports the present dissertation.
[Figure 4-14: water level (m, 30 to 36) against time, 01-May-95 to 25-Dec-95; series "measurement" and "model"]
Figure 4-14. Modelled and measured absolute water levels at Regidor station (final result)
4.4 Limitations of the model
Although a big effort was made to find a model that reproduces the measurements within
acceptable ranges, there is a high uncertainty in a number of inputs that have been
discussed in this chapter and these are summarized below.
Unreferenced water levels
The well-known problem of the absolute reference datum for elevation is perhaps the
main source of uncertainty in the model, because it affects not only the direct topology of
the model such as the bathymetry and the cross sections, but also the boundary
conditions. Additionally, this also affects the measurement time series, used to evaluate
the performance of the model, which require the use of rating curves to transform
registered discharges to referenced water levels.
Cross sections
The cross sections used in the developed model have two sources of uncertainty: first, the
elevation estimate becomes less reliable at points midway between two consecutive stations
that are some distance apart; second, the limited cross section data include only the main channel of
the river. The method designed to complement the cross sections using HS data to
describe the floodplains needs to be checked in the field before being used further. Of
course, the ideal solution would be to carry out topographic surveys that include complete,
referenced cross sections.
Rating curves
The discharge data that comes from the rating curves at different stations is an uncertain
input, on the one hand, because of the vagueness of the zero-level of the gauges (affected
also by the frequent change of their position due to external factors) and, on the other
hand, because of the major modifications (natural and human) to the river’s morphology,
with the consequent effects on the cross sections where the flow measurements are made
(Cormagdalena 2006).
Wetlands
The modelling of the effect of the wetlands on the Magdalena River hydrodynamics is
limited because of the number of assumptions made, which include the tentative location
of the connections between the wetlands and the river and their capacity, the volume that
the wetlands are able to store, and the lack of knowledge of the elevation of each water
body, which restricts the analysis of water balances and the dynamics of the region in
general. The crest levels of the weir structures used to simulate the wetland-river flow
interchange were adjusted until satisfactory shapes of the discharge curves at the stations
located in the branches Loba and Mompox and also at the downstream stations were
obtained.
As a proper validation of the model was not carried out, it is recommended to perform it
by using recent data sets.
Chapter 5
Information Theory for monitor location
A review on Information Theory is presented in Chapter 2, in which the philosophy
behind the concept of information, the approaches to measure it and some of its
applications are described. This introduction, together with the description of the polder
system in the Pijnacker region in The Netherlands in Chapter 3 and of the Magdalena River,
Colombia, in Chapter 4, is the foundation of the methods for determining monitor
locations developed in this Chapter.
This Chapter begins with an introduction that includes the main considerations that are
common in the development of the methods, namely the use of models as data generators
and the description of the estimation of probabilities for the information-related
measures. Next, the developed methods are presented in three sections, each one
presenting a description of the developed methods, a brief recall of the case study under
consideration, the presentation and discussion of the results, and the corresponding
conclusions.
The first section introduces an approach for locating water level gauges for monitoring in
polders using Information Theory with pairwise criteria for dependency estimation. The
case study is in the Pijnacker region. The same problem is addressed in the second
section, in which additional practical considerations are included. The monitor location
problem is posed as a multi-objective optimization problem and solved with an
evolutionary optimization method. The third section considers the problem of locating
discharge monitors in the Magdalena River, applying both a multi-objective optimization
method and a rank-based greedy algorithm. The Chapter finishes with a summary of
the general conclusions obtained.
5.1 Introduction
Two main considerations characterize the methods presented in this Chapter, namely, the
use of models to generate series at each computational point in the water system, and the
discretization of the generated time series to perform the probability calculations. Both of
these considerations are described in the following sections.
5.1.1 Use of models as artificial data generators
The use of models is significant because they are adopted to replicate the real world in
such a way that the states of the system can be reproduced. The models are then used to
generate time series from which the information-related quantities are estimated. One of
the reasons for using models instead of empirical measurements is that the analysis
requires a dense set of points in order to obtain a complete
picture of the behaviour of the system in terms of the information associated with it. The
available measurements are generally limited to a few points, making them insufficient to
analyse and draw conclusions.
It is considered, therefore, that every calculation point within a model is, in principle, a
potential location for a monitoring point within the water system. For this purpose, the
methodologies consider the use of hydrodynamic and rainfall-runoff models to generate a
water level time series at a dense, finite set of calculation points. In this way, a number of
water level records are generated with a predefined record length, from which the
information-related measures are calculated.
It is important to note that the solutions are related to the time step at which the data
records are generated by the model. Therefore, the model must be manipulated according
to the final aim and use of the monitoring network, by considering the proper time and
space scales. Similarly, the resultant monitoring network will be adequate for capturing
the information content of the physics of the runoff process associated with the rainfall
event used to produce the water level time series. In other words, every single rainfall
event has associated with it an optimal monitoring network, because different control
strategies to operate the system take place (e.g., different pump stations will start
pumping at different times and for different periods).
For the case of the Pijnacker (Chapter 3) the water system has been modelled by the
Water Board Delfland to make operational decisions under several scenarios. For the case
of the Magdalena River, the hydrodynamic model has been instantiated and calibrated
within the framework of this thesis (see Chapter 4).
5.1.2 Estimation of probabilities for the calculation of IT quantities
The probabilities required for the estimation of entropy and mutual information are
calculated using the well-known histogram based technique, as described for example by
Steuer et al. (2002); in this way the choice of a probability distribution to fit the
continuous data is avoided. Although there exist a number of nonparametric methods to
estimate mutual information (see e.g., Moon et al. 1995; Sharma 2000), this method uses
bins as an opportunity to take into account water management issues. The subjective
determination of the bin size for the histogram construction (Ruddell and Kumar 2009) is
addressed here using the quantization method introduced below.
5.1.2.1 Quantization
The quantization concept comes from the theory of communication systems, and aims to
convert an analogue (i.e., continuous) signal into a discrete pulse, in order to allow its
digital transmission, by applying the mathematical floor function (denoted as $\lfloor x \rfloor$). The
conversion of an analogue value $x$ to a quantized value $x_q$, which is rounded to the
nearest multiple of $a$, is performed by:
$$x_q = a \left\lfloor \frac{2x + a}{2a} \right\rfloor \qquad \text{(5-1)}$$
The relationship between the bin-size b and the parameter a is given by the quotient of
the difference between the maximum and the minimum of the time series and the value a.
For the context of this Chapter, water level time series are transformed through Equation
(5-1) into “pulses” of discrete information, which produces a regular discretization of
water levels (in terms of levels) at irregular intervals of time. An advantage of the
approach is that the quantized water level series are “noise-free”, in the sense that high-frequency, low-amplitude water level changes (i.e., dynamic waves) generated by
neighbouring pumping stations (see for example the time series in Figure 5-2), are
filtered out. Consequently, the value of a can be seen as the minimum dimensional unit of
water level for which the management of the system becomes critical. This is crucial
when computing Equation (2-1), since high-frequency water level signals give very high
values of entropy, but do not necessarily provide information content for water
management decision-making. It must be noted that even though the quantization alters
the water volume at a point, this is not important since only probabilities of the
occurrence of the values are taken into consideration for the entropy calculations.
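The quantization of Equation (5-1) and the frequency-based entropy estimate can be illustrated with a minimal Python sketch. The function names `quantize` and `entropy` and the sample water levels are hypothetical, chosen only for illustration; the entropy is computed as the Shannon entropy of the relative frequencies of the quantized values, which is assumed here to correspond to Equation (2-1).

```python
import math
from collections import Counter


def quantize(x, a):
    """Round x to the nearest multiple of a, following Equation (5-1)."""
    return a * math.floor((2 * x + a) / (2 * a))


def entropy(values):
    """Shannon entropy in bits from the relative frequencies of the values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


# Hypothetical water levels (cm) with small ripples around two levels
levels = [560.2, 561.1, 559.4, 560.8, 565.3, 566.1, 564.7, 565.9]
quantized = [quantize(x, 5) for x in levels]

print(quantized)           # two distinct levels remain: 560 and 565
print(entropy(levels))     # every raw value is distinct, so entropy is high
print(entropy(quantized))  # the ripples no longer contribute to entropy
```

In this example the raw series has eight distinct values and hence 3 bits of entropy, whereas the quantized series alternates between only two levels and has 1 bit, which is the filtering effect described above.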
In order to show how the results may change because of the selection of the parameter a,
a sensitivity analysis is included at the end of each approach presented in the following
sections.
5.2 Information theory-based approach for location of monitoring
water level gauges in polders
An approach called Water Level Monitoring Design in Polders (WMP) for locating and
evaluating water level gauges in a water-system composed of a highly controlled canal
network is presented here. It consists of five parts: 1) the generation of a time series at a
very dense set of calculation points using a hydrodynamic model; 2) a quantization
method to “clean” the noise from the time series; 3) the use of three different pairwise
criteria to evaluate dependency using squared matrices in a similar fashion to Mogheir
and Singh (2002); 4) a procedure to locate the gauges following a method similar to
Krstanovic and Singh (1992a,b), which aims to find the set of points which together
provide the highest information content and are at the same time the most independent of
each other; and 5) the evaluation of the multivariate dependency with Equation (2-8),
using the grouping property of the Total Correlation (Fass 2006; Kraskov et al. 2003) in
order to establish comparisons between the gauges. The first criterion of the procedure in
4) is evaluated with Equation (2-1) and the second is evaluated by three different pairwise
methods: transinformation (Equation (2-3)), DITXY and DITYX proposed by Yang and
Burn (1994) (Equation (5-2) and Equation (5-3)).
$$DIT_{XY} = \frac{I(X;Y)}{H(X)} \qquad \text{(5-2)}$$
$$DIT_{YX} = \frac{I(X;Y)}{H(Y)} \qquad \text{(5-3)}$$
Although the concept of Transfer Entropy (Schreiber 2000) and its application in
monitoring design (Ruddell and Kumar 2009) offer a new pairwise dependency
criterion, the dynamic analysis of the time series is beyond the scope of this method.
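The pairwise criteria of Equations (5-2) and (5-3) can be sketched as follows. This is only an illustration under stated assumptions: transinformation is computed with the standard identity I(X;Y) = H(X) + H(Y) - H(X,Y), which is assumed to be equivalent to Equation (2-3); the function names and the two short series are hypothetical; and returning a DIT of 0 when the denominator entropy is zero is a convention adopted here for constant series.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits from the relative frequencies of the values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), with H(X,Y) from paired values."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def dit(x, y):
    """Return (DIT_XY, DIT_YX) as in Equations (5-2) and (5-3)."""
    i_xy = mutual_information(x, y)
    h_x, h_y = entropy(x), entropy(y)
    # Assumed convention: a constant series (zero entropy) gives DIT = 0
    return (i_xy / h_x if h_x > 0 else 0.0,
            i_xy / h_y if h_y > 0 else 0.0)


# Hypothetical quantized series (cm): y is a coarser version of x
x = [560, 565, 570, 575, 560, 565, 570, 575]
y = [560, 560, 570, 570, 560, 560, 570, 570]

print(mutual_information(x, y))
print(dit(x, y))
```

Because y is fully determined by x but not the reverse, the two DIT values differ, which illustrates why DITXY and DITYX are treated as separate criteria.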
5.2.1 Description of the WMP methodology
The Water Level Monitoring Design in Polders (WMP) methodology considers two
different conditions when locating the gauges: a) the monitors must be as independent as
possible from each other (that is, have a low pairwise value); b) the monitors must
provide, individually, the highest information content (that is have a high entropy). The
procedure is explained as follows:
a) Read and quantize the water level time series generated by the hydrodynamic
model for each of the calculation points $s_i$ ($i = 1, 2, \ldots, n$, where $n$ is the number
of calculation points). Each point has an associated sequence of values $X_i$.
b) Calculate the marginal entropy $H(X_i)$ for each $s_i$ with Equation (2-1), from which
the values to fulfil condition b) will be taken.
c) For each $s_i$, calculate the mutual information in Equation (2-3) with respect to
each of the remaining points and build the symmetric matrix $T$, in which
$I(X_i; X_i)$ is equivalent to $H(X_i)$ (Cover and Thomas 1991):
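$$T = \begin{bmatrix}
I(X_1; X_1) & I(X_1; X_2) & \cdots & I(X_1; X_n) \\
I(X_2; X_1) & I(X_2; X_2) & \cdots & I(X_2; X_n) \\
\vdots & \vdots & \ddots & \vdots \\
I(X_n; X_1) & I(X_n; X_2) & \cdots & I(X_n; X_n)
\end{bmatrix} \qquad \text{(5-4)}$$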
In this way, the point $s_i$ will have an associated vector of mutual information $v_i$
defined by the ith row (or column) of $T$. The values to fulfil condition a) will be
taken from this matrix.
d) The first monitor $m_1$ is located at the point that provides the highest information
content of the system (i.e., the point with the highest entropy value), so
$m_1 = \max(H(X_i))$.
e) Add m1 to the matrix of the monitoring points M .
f) Recover the mutual information vector $v_1$ of the monitor $m_1$:
$v_1 = I(X_i; X_{m_1}), \quad i \in \{1, 2, \ldots, n\}$
g) The system can then be divided into two sets of points with respect to their
dependency on $m_1$: those that are dependent and those that are independent, $S^{ind}_{m_1}$.
The second monitor $m_2$ must be selected from the latter in order to fulfil
condition a). The set $S^{ind}_{m_1}$ is obtained by looking at the elements of $v_1$ such
that $I(X_i; X_{m_1}) \le \varepsilon$, $\varepsilon$ being the value of transinformation between $X_i$ and
$X_{m_1}$ that is insufficient for the pair to be considered dependent.
h) To fulfil condition b), $m_2$ must have the highest entropy possible of the set
$S^{ind}_{m_1}$, so $m_2 = \max H(X_i \in S^{ind}_{m_1})$.
i) Recover the mutual information vector $v_2$ of the monitor $m_2$:
$v_2 = I(X_i; X_{m_2}), \quad i \in \{1, 2, \ldots, n\}$
j) The next monitor $m_3$ must be selected in a similar way, but now using a modified
set of independent points $S^{ind}_{m_3}$ given by the common set of independent points in
the overlapping transinformation vectors for the previously selected monitors
$m_1$ and $m_2$. Therefore, $v_3 = v_1 \cap v_2$.
k) Set $v_1 = v_3$ and repeat the procedure from step f) until a maximum number of
points is reached or until $m_i$ does not provide a significant information content for
the remaining system (i.e., its marginal entropy is too low).
The matrix T in step c) is replaced by DITXY and DITYX when evaluating the DIT-based
criteria.
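Steps a) to k) can be summarized in a short Python sketch. It is one possible reading of the procedure and not the implementation used in this work: the function `wmp_select`, its parameters `max_monitors` and `min_entropy`, and the four toy series are hypothetical; the input series are assumed to be already quantized; and the threshold for each monitor is taken as the mean of its transinformation vector, the choice adopted in the case study.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits from the relative frequencies of the values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), with H(X,Y) from paired values."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def wmp_select(series, max_monitors, min_entropy=0.0):
    """Greedy monitor selection; returns indices of the selected points."""
    n = len(series)
    h = [entropy(s) for s in series]                          # step b)
    t = [[mutual_information(series[i], series[j])            # step c)
          for j in range(n)] for i in range(n)]
    monitors = [max(range(n), key=lambda i: h[i])]            # steps d), e)
    independent = set(range(n))
    while len(monitors) < max_monitors:
        v = t[monitors[-1]]                                   # steps f), i)
        eps = sum(v) / n                                      # assumed threshold: mean of v
        independent &= {i for i in range(n) if v[i] <= eps}   # steps g), j)
        independent -= set(monitors)
        if not independent:                                   # nothing independent is left
            break
        best = max(sorted(independent), key=lambda i: h[i])   # step h)
        if h[best] <= min_entropy:                            # step k): entropy too low
            break
        monitors.append(best)
    return monitors


# Hypothetical quantized series: s1 duplicates s0, s2 is independent of s0,
# and s3 is constant (a fixed water level boundary).
s0 = [0, 1, 2, 3, 0, 1, 2, 3]
s1 = list(s0)
s2 = [0, 0, 0, 0, 1, 1, 1, 1]
s3 = [5] * 8
print(wmp_select([s0, s1, s2, s3], max_monitors=3))
```

In this toy case the duplicate point is discarded as dependent on the first monitor, the independent point is selected second, and the constant point is rejected because its entropy is zero. For the DIT-based criteria, the matrix `t` would be filled with DITXY or DITYX values instead of transinformation.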
Although the general scheme of the method is based on the studies mentioned in the
previous section, several changes are proposed to make it applicable to highly controlled
water systems. First, the time series of water level are generated by a hydrodynamic
model at a very dense set of computational points, each of which is a potential gauging
site. This avoids the use of empirical, historical measurements, which are not available at
the required density (and at all significant points in a highly controlled polder system).
Second, noisy time series produced by the operation of pumps are filtered, by introducing
the quantization concept. Third, the independent sets of time series are defined in a new
way using the common set of independent points in the overlapping transinformation
vectors for the previously selected monitors. The solutions obtained by each of the
pairwise criteria are then evaluated with the concept of Total Correlation to check the
independency among monitors and the joint information provided by the set.
5.2.2 Case study: Pijnacker region, The Netherlands
The case study is located in a low-lying region of Pijnacker, Delfland, The Netherlands,
which has an area of 18.80 km2, 15 pump stations and 21 fixed weirs that are operated in
order to keep the water levels between limits defined by the water management of the
region. A detailed description of the area is presented in Chapter 3.
In order to apply the WMP approach described above, the following actions are taken.
The water level time-series are generated by a hydrodynamic model built by the Delfland
Water Board to make operational decisions regarding the control structures under several
scenarios. A dense set of n=1520 calculation points separated along the canals by a
distance of 15m on average is used with the rainfall event shown in Figure 5-1.
Figure 5-1. Rainfall event used in the hydrodynamic model (axes: precipitation in mm/hr against time, 1 unit = 15 min).
The records are generated in a 10-day simulation with a time step of 15 minutes (this is
the time step considered by the water board as useful for the management of the area and the
pump operation). Additionally, the parameter a in Equation (5-1) used in the quantization
of the time series is determined by looking at the water level variations that are negligible
for management. In our case, a=5 (cm) is the minimum dimensional unit for which the
water management becomes critical. For example, water level variations of less than 5 cm
can be due to wind, ship movement or dynamic waves generated by the operation of
pumping stations and are not important for water management (that is, such variations
should not require the hydraulic structures to be operated), so they should be
considered as noise in the time series (i.e., these variations should not provide
information content, though they would yield high entropy values without discretization).
As an example, the water level time series at the discharge of one of the pump stations is
presented in Figure 5-2, together with its quantized version, used in the entropy
calculation. Finally, the value of $\varepsilon$ in step g) is considered to be the mean of the vectors $v_i$,
so that half of the points were considered independent and half dependent. This
assumption is valid since only points with low pairwise dependency criteria with respect
to a given point are selected.
5.2.3 Analysis of Results
In order to have an initial idea of the behaviour of the information content of the
Pijnacker region polder system, the maps of marginal entropy and mutual information are
drawn. In the first place the entropy map of the system is shown in Figure 5-3 where
several static information zones can be identified. A few points with null entropy can be
found, which correspond to fixed boundary conditions for water level (that is, pump
stations discharging to big storage bodies with fixed water level, to the southeast of the
area). This map provides a first view of the areas (with high entropy) where it is suitable
to place the first water level monitoring station, from the information content viewpoint.
Figure 5-2. Example of original water level time series and its quantized version (a=5), at a point located downstream of a pumping station (axes: water level in cm against time, 1 unit = 15 min).
Figure 5-3. Entropy map of Pijnacker region
The subsequent monitors must be as independent as possible of this first monitor. In the
second place, in order to highlight the information-dependency in the system, Figure 5-4
shows the DIT index calculated between an arbitrary point (A) and the rest of the system.
In the figure, the darker the point, the greater is the dependency of points with the point
A. The dependency of A on its neighbouring points is evident. However, some other
regions seem to have a strong dependency with A in spite of their hydraulic
disconnection. It can also be noticed that only a few points (the ones with constant water
level during the time of simulation) are completely independent (DIT=0).
Figure 5-4. Directional Information Transfer Index (DIT) map for the point A (bits).
In order to look for a set of N gauging stations that can provide the maximum information
of water levels in the system, the WMP approach is used, having the dependency criteria
$I(X;Y)$, DITXY and DITYX as the basis to create the matrix $T$ of Equation (5-4). In order
to facilitate further analyses, N is initially considered to be 9. For the first
criterion, $I(X;Y)$, Figure 5-5 is obtained, where, besides the monitor locations, the
overlapped $I(X;Y)$ map at each step and the value of the marginal entropy of the placed
monitor are presented. The darker the calculation point, the higher its dependency with
respect to the set of previously selected monitors.
Similarly, Figure 5-6 is obtained applying the second criterion DITXY in the WMP
approach. In this case, a value of 1 was assigned to the dependent points and 0 to the
independent ones, in such a way that the empty areas indicate where the next monitor
must be placed. It can be noticed that the ninth monitor does not have independent points
associated with it, implying that this ninth monitor is not needed. This is also confirmed by
its small value of entropy, $H(X_9) = 1.38$ bits, which does not add much
information to the joint set.
Figure 5-5. Step-by-step solution for the location of water level monitors using $I(X;Y)$ as the
dependency criterion. Entropy of the currently selected point is shown at each step.
Finally, WMP is performed using the third dependency criterion, DITYX, obtaining the
nine monitors presented in Figure 5-7.
A small area to the south with no coverage can be noted, which corresponds to points
with very low entropy values (see Figure 5-3). This implies that any point in the system is
completely independent of any point that belongs to this area (Figure 5-4, for example,
confirms this statement for the case of point A). This situation explains why the ninth
monitor in the previous experiment is not worth being selected.
The location of the sets of monitors obtained by means of the three criteria is shown in
detail in Figure 5-8, where the sequences of monitor selection are not presented for
clarity. The summary of the monitors obtained with the different independency criteria is
presented in Table 5-1, as well as their corresponding values of Total Correlation and
Joint Entropy, the latter being calculated from Equation (2-9).
Figure 5-6. Step-by-step solution for the location of water level monitors using DITXY. Entropy of
the currently selected point is shown at each step (bits).
Figure 5-7. Step-by-step solution for the location of water level monitors using DITYX.
Figure 5-8. Location of water level monitors obtained by the WMP approach, using I(X;Y), DITXY
and DITYX as pairwise dependency criteria.
5.2.4 Discussion
The solutions obtained with the WMP method are discussed and evaluated in this section.
In the first instance, the three pairwise criteria give some similar monitor locations in
terms of spatial distribution. Besides the monitor at point 441 (selected by the three
solutions, since the approach starts with the point with highest entropy), it is noticed that
the monitors at points 733, 133, 426 and 438 have been selected by at least two of the
solutions. Besides this, there are points that are separately identified but are the same
from the practical point of view, such as points 319 and 320, 426 and 876, 719 and 88 as
well as 133 and 144.
It can be noticed, for each dependency criterion used, that every time a monitor is added
to the set, the number of independent points is reduced, and that this reduction becomes
less evident when new monitors are added. A quantitative way of looking at the reduction
of uncertainty in the system when a new monitor is selected is by evaluating the value of
joint entropy at every step. Since the calculation points are not completely independent,
$H(X_1, X_2, \ldots, X_N) \neq \sum_{i=1}^{N} H(X_i)$, so the concept of Total Correlation is needed to
evaluate the multivariate independency; the joint entropy is then obtained by subtracting
the value of Total Correlation, estimated using the grouping property, from the summation
of the marginal entropies.
Figure 5-9 shows that the three solutions have a similar behaviour for both information
content and independency. Furthermore, the reduction in uncertainty is strong only for the
first monitors, because more information is shared among the monitors as new ones
come into the solution. This explains why additional points add little to the reduction of
uncertainty in the system.
Figure 5-9. Evolution of the values of Joint Entropy and Total Correlation as new monitors are
added to the solution set.
In order to allow further analysis, the following values are computed:
$$C(X_1, X_2, \ldots, X_{n=1520}) = 3510 \text{ bits}$$
$$\sum_{i=1}^{n=1520} H(X_i) = 3519.2 \text{ bits}$$
$$H(X_1, X_2, \ldots, X_{n=1520}) = \sum_{i=1}^{n=1520} H(X_i) - C(X_1, X_2, \ldots, X_{n=1520}) = 9.2 \text{ bits}$$
It is clear that the total correlation of all points in the system almost equals the sum of
their marginal entropies. This means that the amount of information shared between all of
the points in the system is practically the same as the total amount of information that
each point adds to the system separately. Moreover, the maximum amount of information
content that can be extracted from the system is 9.2 bits.
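The relationship between joint entropy, marginal entropies and Total Correlation used above can be illustrated with a small Python sketch. Note that the sketch estimates the joint entropy directly from the frequencies of simultaneous value tuples, which is a simpler substitute for the grouping property used in this work and is only practical for a few short series; the function names and the three toy series are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits from the relative frequencies of the values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def joint_entropy(series):
    """H(X1,...,XN) from the frequencies of simultaneous value tuples."""
    return entropy(list(zip(*series)))


def total_correlation(series):
    """C(X1,...,XN) = sum of marginal entropies minus joint entropy."""
    return sum(entropy(s) for s in series) - joint_entropy(series)


# Hypothetical quantized series: s1 duplicates s0, s2 is independent of both
s0 = [0, 1, 2, 3, 0, 1, 2, 3]
s1 = list(s0)
s2 = [0, 0, 0, 0, 1, 1, 1, 1]

print(joint_entropy([s0, s1, s2]))      # information content of the set
print(total_correlation([s0, s1, s2]))  # redundancy introduced by the duplicate

# Consistency check on the system totals quoted above
print(round(3519.2 - 3510, 1))          # joint entropy of all 1520 points, in bits
```

In the toy set the duplicated series adds 2 bits to the sum of marginal entropies but nothing to the joint entropy, so the Total Correlation equals exactly those 2 bits.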
The implication of having C almost as big as the sum of the marginal entropies is that the
calculation points are highly dependent on each other. This is like having a Venn diagram
with 1520 circles of different size that overlap each other almost completely. If all the
circles were independent of each other, then C=0 and they would not overlap. The
problem could be seen as selecting the best N circles that “cover” the total “area” of the n
circles but that at the same time have little “overlapping area”.
Table 5-1. Summary of monitors obtained by each dependency criterion and corresponding values
of joint entropy and total correlation

Criterion | ID of selected monitor (in order): 1, 2, 3, 4, 5, 6, 7, 8, 9 | Joint Entropy H(X1, X2, ..., X9) (bits) | Total Correlation C(X1, X2, ..., X9) (bits)
I(X;Y) | 441, 133, 320, 948, 876, 477, 72, 272, 795 | 7.32 | 16.79
DITXY | 441, 254, 446, 905, 144, 313, 719, 733, 438 | 7.63 | 17.7
DITYX | 441, 133, 426, 319, 88, 730, 29, 733, 673 | 6.82 | 13.88
The results are summarized in Table 5-1, in which the value of the total joint entropy and
the Total Correlation are included for each solution. From Figure 5-9 it can be observed
that with only the first two monitors it is possible to reach 5.5 bits of information content
(60% of the total information content of the system), at a relatively low dependency value
(less than 0.5% of the total correlation of the system). This provides a good criterion for
evaluating the quality of the results.
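As a check on the percentages quoted above, the arithmetic can be written out using the system totals computed earlier in this section (9.2 bits of joint entropy and 3510 bits of total correlation). Expressing the 0.5% bound as an absolute value in bits is an interpretation added here and not a figure stated in the text.

```latex
\frac{5.5\ \text{bits}}{9.2\ \text{bits}} \approx 0.60,
\qquad
0.005 \times 3510\ \text{bits} = 17.55\ \text{bits}
```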
In general, the monitors selected are located next to hydraulic structures. This can be
explained by the fact that these elements provide high, systematic variations to the water
levels in the system, during a precipitation event. Although entropy increases
considerably by water level variations from pump stations, its informative capability is
not important if such variations occur within a small water level range. In this case,
quantization appears to be a promising approach to filter out these noisy signals.
The parameter a of Equation (5-1) can be viewed as the minimum dimensional unit of
water level for which the management of the system becomes critical. In the case of a
typical low-lying polder system in the Netherlands, 5cm is already decisive for water
management in terms of the operation of control structures. Conversely, water level
variations smaller than 5cm (due, say, to wind, ship movement or dynamic waves
generated by the operation of pumping stations), are not important for water management
and so they should be considered as noise in the time series. In other water systems such
as rivers, a=5cm might be too low.
A sensitivity analysis of a is carried out by comparing monitor locations obtained for
integer values of a between 1 and 10, and for 15 and 20 cm. The comparison shows that
for each value of a between 3cm and 8cm a minimum of 55% of the same locations are
shared with the average of the four neighbouring solutions that are the closest in number.
For values of a < 3 cm and a > 10 cm, only a few monitor locations are common. Figure
5-10 shows the average percentage of common monitors when comparing solutions for
different values of a. In general, 25% of the locations are common to all solutions,
although they are not necessarily selected in the same order. The first monitor location is
common to all solutions (which is the one with the highest entropy value, as explained in
step d).
Figure 5-10. Average percentage of common monitor locations comparing the solution obtained
for each value of a with all other calculated solutions (a=1,2,…,10,15,20). Axes: percentage of
common monitor locations (0% to 50%) against the value of a (cm) in Equation (5-1).
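The comparison behind Figure 5-10 can be sketched as follows. The function names and the monitor IDs are hypothetical and are not results of this work; the sketch assumes that each solution is a list of monitor IDs and that the share of common locations is measured relative to the size of the solution being compared.

```python
def common_fraction(solution_a, solution_b):
    """Fraction of the locations in solution_a that also appear in solution_b."""
    return len(set(solution_a) & set(solution_b)) / len(solution_a)


def average_common_percentage(solutions):
    """For each value of a, average the shared percentage against all other solutions."""
    result = {}
    for a, solution in solutions.items():
        others = [s for b, s in solutions.items() if b != a]
        shared = sum(common_fraction(solution, s) for s in others)
        result[a] = 100 * shared / len(others)
    return result


# Hypothetical monitor IDs obtained for three values of a (cm)
solutions = {
    3: [10, 20, 30, 40],
    5: [10, 20, 50, 40],
    8: [10, 60, 50, 70],
}
print(average_common_percentage(solutions))
```

The order in which monitors were selected is ignored on purpose, since the comparison in the text concerns shared locations rather than selection sequence.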
From the hydraulics point-of-view, it is important to note that the monitors located
downstream of a weir cannot give any information about the conditions upstream of the
weir when this is working in a modular regime. On the other hand, when the weir regime
is drowned, the downstream monitoring point may provide additional information to the
system upstream. In general terms, all the weirs break the dependency in information
between upstream and downstream points; in a similar way they break the continuity in
water levels. However, some weirs working in the modular regime may not cause a
discontinuity in information because the same local hydrological information may be
shared upstream and downstream. This situation can be seen in Figure 5-3 and Figure
5-4, in which not all the weirs limit the information areas.
The gauge locations will not change significantly when using different time steps in the
hydrodynamic model, as long as it acceptably replicates the real behaviour of the system.
Certainly, the quantization of the time series as well as the frequency-based method to
estimate the probabilities for the information-related measures would give similar entropy
values. Additionally, the relative nature of the pairwise criteria makes entropy values
unchanged using different time steps.
Even though a value of $\varepsilon$ equal to 0 would define exactly the set of points that is dependent
on a particular point, its use might lead to an empty set of independent points,
because even a small amount of information may be shared between hydraulically
disconnected points. In this case the points can be hydrologically related; for example, a
precipitation event that affects water levels at two hydraulically unrelated points may
induce a correlation between them, so that they share some of their information content and
$I(X_i; X_j) > 0$. This is the reason why the independent points were selected considering
the mean of the independency criteria as a threshold.
The fact that the monitored signals are dependent is, in practice, a desirable condition,
since cross-checking is necessary to validate the data from other stations and to detect
possible errors. Nevertheless, the theoretical design of monitoring networks should still
consider the most independent monitors as priority places for data collection.
5.2.5 Conclusions
A number of “information zones” can be identified in both entropy and transinformation
maps, defined by pump stations and by weirs working in modular regime, which create
discontinuities in the information content upstream and downstream of the structures.
One information zone may include several target water levels.
The solutions obtained with the three pairwise criteria I, DITXY and DITYX have some
monitors in common in terms of spatial distribution. However, the solution obtained with
the DITXY criterion provides the highest value of joint entropy for the set of monitors and
the highest value of Total Correlation. A high value of joint entropy reveals a high
information content. This is the measure that identifies the preferred solution for the
monitors.
On the contrary, the solution obtained with the DITYX criterion provides the lowest
information content and the minimum value of Total Correlation. According to the
conditions mentioned in the procedure for gauge location, this implies that DITYX is more
effective to fulfil condition a) (independency), whereas DITXY is better to fulfil condition
b) (amount of information content). This opens a new possibility to solve the problem of
monitor selection out of a dense set of potential monitors, which is to use multiobjective
optimization where DITXY is to be maximized and DITYX minimized.
Results show that the calculation points may be highly dependent on each other even if
some of them are hydraulically disconnected. This dependency is due to the hydrological
connection between the points, since in relatively small areas the same rainfall events are
shared. Although these hydrological dependencies make the problem of looking for
independent monitors more difficult, the proposed methodology proves to be a suitable
method for this purpose.
The values of marginal entropy are sensitive to the value of a used in Equation (5-1).
Small values lead to high entropy values for locations near pumping stations, whereas
bigger values tend to filter out small disturbances, causing a decrease in entropy.
However, as a affects all the generated time series equally, the selection of the first
monitor location (see step d) in Section 5.2.1) remains the same for
different values of a. On the contrary, transinformation and DIT values do not change at
all, due to the relative nature of the expressions. The value of a=5 (cm), however, is
found to be the minimum dimensional unit for which the water management becomes
critical and seems to be good for keeping the informational property from the
management point of view.
The use of multiobjective optimization techniques to solve the problem is explored in the
next section, using the minimization of the total correlation as a first objective and the
maximization of the joint entropy as a second objective; in addition, practical
considerations as constraints are also included.
5.3 Optimizing Information measures for the design and evaluation of
monitoring networks in polders
A method for siting water level monitors based on information-theory measurements is
presented. The first measurement is Joint Entropy, which evaluates the amount of
information content that a monitoring set is able to collect, and the second measurement
is Total Correlation, which evaluates the level of dependency or redundancy among
monitors in the set. In order to find the most convenient set of places to put monitors
from a large number of potential sites, a Multi Objective Optimization Problem (MOOP)
is posed under two different considerations: 1) taking into account the costs of placing
new monitors, and 2) considering the cost of placing monitors too close to hydraulic
structures. In both cases, the joint entropy of the set is maximized and its total correlation
is minimized. The costs are considered in terms of information theory units, for which
additional terms affecting the objective functions are introduced. The proposed method is
applied in a case study of Delfland region, The Netherlands. Results show that Total
Correlation is an effective way to measure multivariate independency, and that it must be
combined with Joint Entropy to get results that cover a significant proportion of the total
information content of the system.
5.3.1 Optimising Joint Entropy and Total Correlation: justification
Regarding the use of Information Theory in the design of monitoring networks, the
analysis is based on looking for those locations where the information content about a
particular water-related variable is a maximum, so that a monitor device placed there has
“potential information” in the sense that once placed, it would reduce uncertainty by
providing information (Mogheir and Singh 2002). For more than one variable (that is,
more than one monitoring device), Joint Entropy is used, because it represents the
information content of the set of monitoring devices, which can be maximized, as
Caselton and Husain (1980) did for the case of reducing an existing rainfall monitoring
network.
Additionally, minimizing transinformation between monitors is the basis of designing
monitoring networks by applying Information Theory. Naturally, the placement of two
monitors that provide exactly the same information is not optimum. In other words, the
redundancy of the monitors should be as small as possible (Mishra and Coulibaly 2009).
In this Chapter, Total Correlation is used to measure redundancy among multiple
variables.
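The two measures can be illustrated with a minimal sketch. It assumes the standard definition of Total Correlation (sum of the marginal entropies minus the joint entropy) and probabilities estimated by relative frequencies of already discretised series; the function names and the three short series are hypothetical and are not taken from the case study.

```python
from collections import Counter
from math import log2


def entropy(*series):
    """Joint Shannon entropy (bits) of one or more equally long discrete series."""
    n = len(series[0])
    counts = Counter(zip(*series))  # frequency of each joint state
    return -sum((c / n) * log2(c / n) for c in counts.values())


def total_correlation(*series):
    """Sum of marginal entropies minus joint entropy (bits); assumed definition."""
    return sum(entropy(s) for s in series) - entropy(*series)


# Hypothetical discretised water-level classes at three points.
a = [0, 0, 1, 1, 2, 2, 3, 3]
b = [0, 0, 1, 1, 2, 2, 3, 3]  # identical to a: fully redundant monitor
c = [0, 1, 0, 1, 0, 1, 0, 1]  # independent of a

print(entropy(a), total_correlation(a, b), total_correlation(a, c))
```

In this toy case the duplicated series adds redundancy but no joint information, whereas the independent series adds information without adding redundancy, which is the behaviour the two objectives are meant to separate.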
The main contribution of this method is that Joint Entropy and Total Correlation are
independent objectives that must be optimized. A Venn diagram can be used to illustrate
this idea, where the information content of a monitor is represented by the area of a
circle; the information shared between variables corresponds to their overlapping areas
and the information content of the set of variables is described by the total area covered
by the circles (see e.g., Cover and Thomas 1991; Ruddell and Kumar 2009). Suppose, for
example, that ten potential locations are available to place three monitors; the equivalent
Venn diagram might appear as in Figure 5-11(A) which represents the information
content of 10 variables and their common information.
Figure 5-11. Venn diagrams illustrating the proposed optimization problem. (A): Information
content of 10 variables and their common information; (B), (C) and (D): possible solutions for the
selection of three monitor locations (1), obtained by maximizing joint entropy (2) and minimizing
Total Correlation (3).
The task is to select the most informative set of three variables that, simultaneously, are
least interdependent, i.e., the summation of the overlapped areas is a minimum. Three
possible solutions for this generic example are shown in Figure 5-11(B), (C) and (D). The
relationships in terms of information content for each solution are shown in the first row
of Figure 5-11; the area representing the Joint Entropy of each solution set is shown by
the total covered area in the second row of Figure 5-11 (to be maximized) and the
overlapping areas representing Total Correlation is shown in the third row (to be
minimized).
5.3.2 Description of the MOOP methodology
The Multi Objective Optimization Problem (MOOP) is posed to find the set of new
stations $X = \{X_1, X_2, \ldots, X_M\}$ that optimally complement the set of N already existing
stations $E = \{E_1, E_2, \ldots, E_N\}$, in such a way that the joint set of M+N stations $S = \{E, X\}$
provides the maximum possible information content with the minimum shared
information between them. The first objective is described by the joint entropy, Eq. (2-4),
while the second is described by the Total Correlation, which is estimated following the
steps described in the previous section. To understand why the second objective is needed,
consider two water level gauges located extremely close to each other: they would in effect
record the same time series and have the same information content, so both stations
would be completely dependent (redundant).
Additionally, every single station should be placed where the highest information content
can be extracted. Under this consideration, water system points with constant or
quasi-constant water-level records should not be selected because they do not provide any
further information (i.e., water level value does not change, so just one record would be
enough to describe the state of that point). This consideration can be added as a constraint
in the MOOP by excluding low-entropy points from the decision variable set, which in
turn implies a reduction in the search space and therefore a reduction of the
computational effort. The definition of the threshold to identify low-entropy points is
described in the next section.
Taking into account the previous considerations, the multi-objective optimization
problem (MOOP) is mathematically formulated as follows, Eq (5-5).
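$$
\begin{aligned}
\max\ & H(S_{M,N}) = H(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) \\
\min\ & C(S_{M,N}) = C(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) \\
\text{subject to: } & H(X_i) > \text{low-entropy threshold} \\
& N + M = \text{number of monitors}
\end{aligned}
\tag{5-5}
$$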
The threshold used to discard low-entropy points from the search space was
defined by looking at the relative frequency of the entropies in the system. Figure
5-12 shows that more than 50% of the points have an entropy value below 0.1 bits, which
is less than 7% of the entropy of the point with maximum entropy. Discarding these points
reduces the search space dramatically, by a factor of several hundreds. This threshold
value is used in Eq. (5-6) and Eq. (5-7), introduced below.
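A minimal sketch of this filtering step is given here. The point entropies are hypothetical, and the estimate of the search-space reduction rests on two simplifying assumptions that are not stated in the text: that exactly half of the 1520 calculation points reported in section 5.3.5 are discarded, and that all 9 monitors are chosen among the remaining points.

```python
from math import comb


def filter_low_entropy(point_entropy, threshold=0.1):
    """Keep only candidate points whose marginal entropy exceeds the threshold (bits)."""
    return {p: h for p, h in point_entropy.items() if h > threshold}


# Hypothetical marginal entropies (bits) for six calculation points.
point_entropy = {1: 0.00, 2: 0.05, 3: 0.08, 4: 0.40, 5: 1.50, 6: 0.10}
candidates = filter_low_entropy(point_entropy)
discarded_share = 1 - len(candidates) / len(point_entropy)

# Ratio of the number of 9-monitor combinations before and after discarding
# half of the points (illustrative assumption, see the lead-in).
reduction_factor = comb(1520, 9) / comb(760, 9)
print(sorted(candidates), discarded_share, reduction_factor)
```

Under these assumptions the ratio is of the order of 2^9, which is consistent with a reduction by a factor of several hundreds.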
The procedure to estimate the probabilities required for the calculation of Eq. (2-1), (2-4)
and (2-8) follows the frequency analysis described in section 5.2. Moreover, the MOOP
is solved using the Non-dominated Sorting Genetic Algorithm II (NSGA-II) by Deb et al. (2002),
as implemented in the NSGAX software (Barreto et al. 2006).
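The notion of dominance that underlies the search can be sketched as follows. This is only an illustration of Pareto dominance for the two objectives (minimise Total Correlation, maximise Joint Entropy); it is neither NSGA-II nor the NSGAX implementation, and the candidate values are hypothetical.

```python
def dominates(p, q):
    """p and q are (total_correlation, joint_entropy) pairs in bits.

    p dominates q if it is no worse in both objectives and differs from q.
    """
    return p[0] <= q[0] and p[1] >= q[1] and p != q


def pareto_front(solutions):
    """Return the solutions that are not dominated by any other solution."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical (C, H) pairs for five candidate monitoring sets.
sets = [(0.0, 1.5), (0.2, 3.0), (0.6, 3.0), (1.2, 4.2), (1.0, 2.5)]
front = pareto_front(sets)
print(front)
```

The sets (0.6, 3.0) and (1.0, 2.5) are discarded because (0.2, 3.0) is at least as informative with less redundancy; the remaining three sets represent different trade-offs between the objectives.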
[Figure 5-12: histogram of the number of calculation points (vertical axis, 0 to 900) per entropy class (horizontal axis, Entropy in bits, 0 to 1.5).]
Figure 5-12. Definition of low-entropy points to be discarded from the search space for the
optimization, according to the relative frequency of the entropy of the points in the Delfland
system.
5.3.3 Case study: Pijnacker region, The Netherlands
The developed methods were applied in the same case study used in section 5.2, a polder
system in a sub-district of Delfland, in The Netherlands (see details in Chapter 3). The
existing water-level monitoring network fails to provide the proper information about the
state of the system, especially under extreme weather conditions. This is because the
measurements are basically used to check whether the current water levels are between
the on/off levels of the pumps and to operate them accordingly. The measurement points
of the 9 biggest pumping stations are further identified as Existing Pump Monitors
(EPM). The hydrodynamic model used to generate the water level time series was run
with a typical rainfall event for the area. For this study a rainfall event with a 5-year return
period was used (much less extreme than the one used in section 5.2), as it is of particular
interest for the management of the region in the assessment of regular flood risks.
5.3.4 Practical situations included in the optimization problem
Two practical situations occur in the Delfland water system: a) the need to introduce
financial restrictions on installing new monitors, and b) the problem of the accuracy of
measurements taken near hydraulic structures, due to small water level fluctuations.
Although both situations should be considered simultaneously, they are applied
separately in order to facilitate the analysis of the results. For both situations it is assumed
that 9 monitors in total need to be located in the system.
• Approach a)
For the first situation we define an additional term u*M, representing the cost (in
informative units) of having to place M new monitors. The parameter u is a constant with
cost units of bits per new monitor. The term u*M is then added to the total correlation (u
bits of redundancy are added to the set) and subtracted from the joint entropy (u bits of
joint information are subtracted from the set) every time the set contains a new monitor.
In this way, the optima are kept separate for both objectives as M increases. Although u
may differ according to the location of a particular monitor, a constant value is
considered to simplify the problem. For the subsequent experiments, u is defined equal to
1 bit/new monitor; a sensitivity analysis of this parameter is presented at the end of this
section. The resultant optimization problem can be written as:
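$$
\begin{aligned}
\min\ & F_1 = C(S_{M,N}) + uM = C(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) + uM \\
\max\ & F_2 = H(S_{M,N}) - uM = H(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) - uM \\
\text{subject to: } & H(X_i) > 0.1 \text{ bits} \\
& 0 \le M \le 9;\ 0 \le N \le 9;\ M + N = 9
\end{aligned}
\tag{5-6}
$$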
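As a small worked illustration of how the cost term shifts both objectives, the sketch below evaluates F1 and F2 for a hypothetical set. The function name and the numerical values are assumptions chosen for illustration only.

```python
def objectives_a(C, H, M, u=1.0):
    """Penalised objectives of approach a).

    C, H: total correlation and joint entropy of the set (bits).
    M: number of new monitors in the set; u: cost in bits per new monitor.
    Returns (F1, F2), where F1 is minimised and F2 is maximised.
    """
    return C + u * M, H - u * M


# Hypothetical set with C = 0.5 bits, H = 3.0 bits and two new monitors.
f1, f2 = objectives_a(0.5, 3.0, M=2)
print(f1, f2)
```

Each additional new monitor moves the set by u bits away from the ideal point in both objectives, so a new monitor is only worthwhile when the information it contributes outweighs this cost.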
• Approach b)
For the second situation we introduce the term q*v, which represents the cost, in
informative units, of having to place monitors close to hydraulic structures. v is defined
as the number of times the distance ds (minimum distance between a monitor and a
structure, evaluated over all possible combinations of monitors and structures) is violated
by a particular solution; q is a constant with cost units of bits per violation of minimum
distance. It is assumed that ds = 50 m and q = 1 bit/violated ds. As in the first situation, the term
q*v is added to the total correlation and subtracted from the joint entropy of each
evaluated set in order to keep the optima away from the ideal point (min C and max H) as
v increases. The resultant optimization problem can be written as shown in Eq. (5-7).
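$$
\begin{aligned}
\min\ & F_1 = C(S_{M,N}) + qv = C(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) + qv \\
\max\ & F_2 = H(S_{M,N}) - qv = H(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) - qv \\
\text{subject to: } & H(X_i) > 0.1 \text{ bits} \\
& 0 \le M \le 9;\ 0 \le N \le 9;\ M + N = 9
\end{aligned}
\tag{5-7}
$$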
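The counting of violations v can be sketched as below. The sketch assumes straight-line distances between planar coordinates and a strict "closer than ds" test; the text does not specify how distances are measured (they could, for instance, be taken along the channels), and all coordinates and function names are hypothetical.

```python
from math import hypot


def count_violations(monitors, structures, ds=50.0):
    """Count (monitor, structure) pairs closer than the minimum distance ds (metres)."""
    return sum(
        1
        for mx, my in monitors
        for sx, sy in structures
        if hypot(mx - sx, my - sy) < ds
    )


def objectives_b(C, H, v, q=1.0):
    """Penalised objectives of approach b): (F1 to minimise, F2 to maximise)."""
    return C + q * v, H - q * v


# Hypothetical coordinates in metres.
monitors = [(0.0, 0.0), (100.0, 0.0), (400.0, 300.0)]
structures = [(30.0, 0.0), (0.0, 40.0), (130.0, 0.0)]

v = count_violations(monitors, structures)
print(v, objectives_b(0.5, 3.0, v))
```

In this example the first monitor lies within ds of two structures and therefore contributes two violations on its own, which is how a set of 9 monitors can accumulate more than 9 violations.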
In order to solve Eq. (5-6) and Eq. (5-7), the Non-dominated Sorting Genetic Algorithm II
(NSGA-II) (Deb et al. 2002) was used with the following evolutionary parameters: crossover
probability = 90%, mutation probability = 1/9. The population size and the number of
generations were set after several experiments with different values; a population
of 500 and 2000 generations were found convenient, because with bigger
values the solutions do not improve significantly for the two situations.
5.3.5 Analysis of Results
In order to facilitate the analysis, the calculation points of the system have been labelled
with integer numbers from 1 to 1520. It is worth noting that the total joint entropy of the
system (i.e., the joint information contained in these 1520 points as a single set) is
$H_{sys} = H(X_1, X_2, \ldots, X_{1520}) = 4.91$ bits, a value that represents the ideal amount of
information that the network of monitors S should provide. It must be noted that in the
previous approach (section 5.2) a value of Hsys = 9.2 bits was obtained using a rainfall
event with a much higher return period. The results obtained are evaluated with respect to
the value of 4.91 bits.
• Results for approach a)
The Pareto-optimal set of solutions obtained for the first situation is presented in Figure
5-13, where the solutions are characterized according to the number of existing monitors
that were selected in the optimization process (0, 1, 2 and 3 EPM). From this point
onwards, the notation SM,N will be used to show that the set of monitors S is composed of
M new monitors and N existing monitors.
[Figure 5-13: scatter plot of (negative) Joint Entropy (bits), vertical axis from -4.1 to -2.1, against Total Correlation (bits), horizontal axis from 0 to 1.2. Series: 0 EPM, 9 New; 1 EPM, 8 New; 2 EPM, 7 New; 3 EPM, 6 New. Annotated points: Extreme Xa; Extreme Ya (out of scale); WMP method with Ixy, DITyx.]
Figure 5-13 Pareto-optimal set of solutions discriminated by EPM, approach a). Extremes Xa and
Ya are indicated for further analysis. Results obtained with WMP method (Alfonso et al. 2010b)
are also indicated.
The solution S0,9 (corresponding to the full set of EPM) is not presented in the figure
because it is out of the figure scale, but its location can be observed in Figure 5-14. This
solution has a very small total correlation (close to 0) but also a relatively small
information content (1.04 bits): one record taken at each of these monitors would jointly
provide slightly more than one bit of information on average, or about 20% of the
information of the state of the system, meaning that the current monitoring network is far
from optimal.
77
Optimisation of monitoring networks for water systems
Extreme Xa
Extreme Ya
max H(X1,X2,...,X9)
min C(Y1,Y2,...,Y9)
X1: 587
Y1: 827
X2: 991
Y2: 1490
X3: 286
Y3: 1078
X4: 1030
Y4: 1265
X5: 57
Y5: 620
X6: 458
Y6: 704
X7: 42
Y7: 891
X8: 175
Y8: 394
X9: 1204
Y9: 1151
1.5
Y5
X2
Y3
Y4
E9
E5
X7
X8
Y6
E1
E3
X3
Y9
E6
X1
X5
E8
Y1
X4
Y7
S=EPM
E1: 353
Y8
E4
X9
1
X6
Y2
0.5
E2: 1203
E3: 1187
E4: 465
E5: 56
E2
E6: 541
E7: 1337
E7
E8: 669
E9: 842
0
Figure 5-14. Delfland water system with location of solutions for approach a) obtained at the
extremes Xa and Ya of the Pareto frontier of Figure 5-13. Solution for S=EPM is also included.
Scale represents the marginal entropy at each system point estimated with Eq. (2-1).
Several interesting facts can be mentioned (Figure 5-13):
• Solutions for N=4, 5, 6, 7, 8 and 9 are always dominated by other solutions, so
they are not found to be part of the Pareto front.
• Existing monitors 541, 1337, 669 and 842 are not selected in any scenario.
• Solutions with 3 EPM and 6 new monitors (S6,3) always include existing monitors
1187, 465 and 56. The M=6 new monitors make the joint entropy vary between 3 and
3.5 bits (between 60% and 70% of Hsys) and the total correlation between 0.2 and
0.6 bits.
• All these solutions (S6,3) are always dominated by solutions
that consider fewer existing monitors and more new monitors in the final set.
• For the scenario with 2 EPM and 7 new monitors (S7,2) we found solutions with
joint entropy between 3 and 4 bits and total correlation between 0.2 and 1.1 bits.
Only three combinations of two existing monitors, (1187, 56), (1187, 465) and
(1203, 56), are part of the Pareto front of optimal solutions.
• From Figure 5-13 it is clear that this Pareto front is closer to the ideal value (C=0,
HJ=4.91) and therefore it dominates the previously discussed solutions S6,3.
• For the case with 1 EPM and 8 new monitors (S8,1), the resultant Pareto front
dominates practically all the solutions obtained for the previously discussed scenarios.
In order to characterize the solutions, the extremes of the Pareto front are analyzed
(Figure 5-13), where Xa identifies the solution that maximizes the Joint Entropy
(bottom-right extreme) and Ya identifies the solution that minimizes the Total Correlation
(upper-left extreme). The sub-index a is provided to distinguish the solutions obtained for
approach a) from those obtained for approach b). First, the solution at Xa, which maximizes
the joint entropy (plotted as negative joint entropy in Figure 5-13), places (all new)
monitors at S9,0 = (587, 991, 286, 1030, 57, 458, 42, 175,
1204), with joint entropy of 4.18 bits (85% of Hsys) and total correlation of 1.19 bits.
Second, the solution at Ya, which minimizes the total correlation, corresponds to the
selection of (all new) monitors S9,0 = (827, 1490, 1078, 1265, 620, 704, 891, 394, 1151),
which is a set with total correlation of 0.0 and joint entropy of 1.51 bits (30% of Hsys).
These two sets of monitors Xa and Ya, as well as the solution for which S=EPM, are
located in the map of the system in Figure 5-14. As expected, the monitoring sets are
different in spatial terms. The most important monitors of each extreme, in terms of
information, are Xa6 (point 458) and Ya7 (point 891), because they are
located in a zone with high marginal entropy. However, in both extremes Xa and Ya (and
regardless of having excluded the points with the lowest entropy) points with low entropy
are included in the solutions. On one hand, at the extreme Xa, the point Xa2=991 has a
low information content, which means that placing the eight remaining points would be
enough. This situation may lead to a refined criterion to determine the number of
monitors to be placed, so that no assumptions in this regard would be needed and the
second constraint of equations (5-6) and (5-7) could be dropped.
On the other hand, all the monitors of the solution at extreme Ya are located at very low
informative sites, with the exception of point Ya6 = 704 with H(Ya6)=1.5 bits. This explains
why this solution has such a low Total Correlation: low-entropy points are more
independent from the rest than high-entropy points. Naturally, this solution is far from
being a good set for monitoring because it does not provide significant joint information.
• Results of approach b)
Following the same identification pattern used for approach a), the extremes of the
Pareto front in Figure 5-15 are used, with Xb being the solution that maximizes the Joint
Entropy and Yb the solution that minimizes the Total Correlation. Additionally, Figure
5-15 discriminates the solutions by the number of violations v of the minimum distance
ds. Several observations can be made. First, it can be noticed that, in spite of having only
9 monitors to place, some of the solutions have violated the distance rule 10 times, which
means that at least one monitor was too close to more than one weir or pump station. This
was expected because of the high density of hydraulic structures in the area. Second,
solutions with low total correlation are found only when no violations take place. This
implies that non-redundant sets of monitors can only be placed away from
hydraulic structures. However, the price of this independency is that
jointly they collect relatively little information (less than 60% of the information content
of the system); the trade-off between the two information quantities is again evident.
Third, three solutions give the highest joint entropy, and correspond to solutions
with 3, 4 and 5 distance violations.
[Figure 5-15: scatter plot of (negative) Joint Entropy (bits), vertical axis from -4.1 to -2.1, against Total Correlation (bits), horizontal axis from 0 to 1.2. Series: minimum distance not violated, and violated 1 to 10 times. Annotated points: Extreme Xb; Extreme Yb (out of scale); WMP method with Ixy, DITyx.]
Figure 5-15. Pareto-optimal front, approach b), discriminated by the number of times the
minimum distance ds is violated by the solution set. Extremes Xb and Yb are indicated for further
analysis. Results obtained with WMP method (Alfonso et al. 2010b) are also indicated.
A detailed analysis of the optimization is presented in Table 5-2, which categorizes the
500 solutions obtained for approach b) in terms of violations of the minimum distance
due to the presence of pumps and/or weirs. It can be noted that, for solutions that combine
violations by pumps and by weirs, the majority of violations are caused by pumps. For
instance, for 6 violations, the combined possibilities are 5 pumps + 1 weir
(3 solutions), 4 pumps + 2 weirs (22 solutions) and 3 pumps + 3 weirs (18 solutions)
(Table 5-2). In other words, the number of violations due to the proximity of the monitors
to the pumps is generally bigger than the number of violations due to their proximity to
the weirs. This can be explained by the fact that a pump operation adds entropy to the
upstream and downstream neighbouring points while a fixed weir in modular regime
stabilizes the water levels in a way that downstream points reduce their marginal entropy.
However, no trivial pattern in the Pareto front was identified based on the number of
pumps or the combination of pumps and weirs.
The resulting monitoring sets obtained at both extremes of the Pareto front (for maximum
joint entropy and for minimum total correlation) are located in the map of the water
system in Figure 5-16. The solution for the extreme Xb, gives a maximum joint entropy of
4.04 bits, about 82% of Hsys. However, two points appear to be low-informative; these are
Xb2 (point 776) and Xb9 (point 170). Similar to the first situation, it has been found that
nine monitors are not necessary: indeed seven are enough to describe the information
content of the system. For the case of the extreme Yb, we find again a similar situation as
in the first situation: the majority of the computational points have negligible information
content, with the exception of points Yb3 (point 1030) and Yb7 (point 801). This result
shows again that points with very low entropy contribute to decrease the total correlation
of the set. However, as in the previous set, this solution should not be taken into account
because it provides low information content (about 2.0 bits or 40% of Hsys).
Table 5-2. Number of solutions for approach b) with minimum distance violations by pumps and
weirs.
Number of solutions with       Number of solutions with violations by pumps
violations by weirs              0     1     2     3     4     5     6   Total
0                               79    21     8     6     4     -     -     118
1                               30    33    30    30    18     3     -     144
2                                5    16    15    19    22    23     -     100
3                                -     -    14    18    26    31     7      96
4                                -     -     -     -     8    19    15      42
Total                          114    70    67    73    78    76    22     500
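The counts of Table 5-2 can be cross-checked with a short script. The values are transcribed from the table (rows are violations by weirs, columns are violations by pumps); the variable and function names are illustrative assumptions.

```python
# Rows: violations by weirs (0..4); columns: violations by pumps (0..6).
# None marks the combinations shown as "-" in Table 5-2.
table = [
    [79, 21, 8, 6, 4, None, None],
    [30, 33, 30, 30, 18, 3, None],
    [5, 16, 15, 19, 22, 23, None],
    [None, None, 14, 18, 26, 31, 7],
    [None, None, None, None, 8, 19, 15],
]


def solutions_with_total(table, total):
    """Map (pumps, weirs) to the number of solutions with that many violations in total."""
    return {
        (p, w): n
        for w, row in enumerate(table)
        for p, n in enumerate(row)
        if n is not None and p + w == total
    }


six = solutions_with_total(table, 6)
grand_total = sum(n for row in table for n in row if n is not None)

# Violations accumulated over all solutions, split by type of structure.
pump_violations = sum(p * n for row in table for p, n in enumerate(row) if n is not None)
weir_violations = sum(w * n for w, row in enumerate(table) for n in row if n is not None)
print(six, grand_total, pump_violations, weir_violations)
```

Summed over the 500 solutions, the table gives 1247 violations caused by pumps against 800 caused by weirs, which supports the observation that pump proximity is the more frequent cause.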
• Comparison of results with WMP approach
The sets of monitoring networks obtained through the MOOP for both approaches a) and
b) are compared with the method Water Level Monitoring Design in Polders, WMP
(Alfonso et al. 2010b), in which three pairwise criteria, namely Transinformation I(X,Y)
and Directional Information Transfer, DITX,Y and DITY,X (Yang and Burn 1994), are used
to evaluate the independency of the monitoring set. The WMP is a step-based method that
can be classified as a greedy algorithm, in which the next best monitor (with high
information content and low dependency) is selected at each step, given the total number
of monitors. For the sake of comparing results, the WMP method was run for the same
rainfall event used in this section. It must be noted that the WMP method was run with no
constraints, so the resultant monitoring network is composed of only new devices, and
their proximity to hydraulic structures is not taken into account.
The information theory characteristics of the monitoring sets obtained with the WMP
method are included in the Pareto-optimal solutions of Figure 5-13 and Figure 5-15. The
result for I(X;Y) is almost identical to the result for DITYX so they are presented as a single
point in the graphs and the result for DITXY is out of the scale of both graphs. The
immediate conclusion is that the WMP method provides solutions that are part of both the
Pareto-optimal front obtained for sets composed by new monitors only (Figure 5-13) and
the Pareto-optimal front obtained for sets that do not violate the minimum distance to
hydraulic structures.
[Figure 5-16: map of the Delfland water system showing pump stations and fixed weirs; colour scale: marginal entropy from 0 to 1.5 bits. Monitor locations shown on the map:
Extreme Xb, max H(X1,X2,...,X9): Xb1: 354, Xb2: 776, Xb3: 1229, Xb4: 478, Xb5: 450, Xb6: 141, Xb7: 1182, Xb8: 976, Xb9: 170
Extreme Yb, min C(X1,X2,...,X9): Yb1: 975, Yb2: 1145, Yb3: 1030, Yb4: 566, Yb5: 1024, Yb6: 529, Yb7: 801, Yb8: 656, Yb9: 75]
Figure 5-16. Delfland water system with location of solutions for approach b) obtained at the
extremes Xb and Yb of the Pareto frontier of Figure 5-15. Location of existing hydraulic structures
is also included. Scale represents marginal entropy values at each system point (bits).
5.3.6 Discussion
5.3.6.1 Priority of monitors’ placement
One of the characteristics of the multiobjective optimization with genetic algorithms is
that during each step of the process, all the variables (monitors) that belong to each
solution are generated at the same time, regardless of their individual significance in
informative terms. Nevertheless, this is an important issue when placing the monitors,
because it is expected that during their implementation some monitors have a different
priority than others. In order to prioritize the monitors, again from the information
standpoint, it is possible to sort them either by total correlation (in ascending order, so the
monitor that adds the least C to the set is placed first) or by joint entropy (in descending
order, so the monitor that adds the most joint entropy to the set is placed first).
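One possible reading of this prioritisation is a greedy ordering of the monitors of a given solution, sketched below. It assumes that the first monitor is the one with the highest marginal entropy and that each subsequent monitor is the one adding the most joint entropy, or the least total correlation, to the monitors already ordered. The functions, labels and series are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*series):
    """Joint Shannon entropy (bits) of one or more equally long discrete series."""
    n = len(series[0])
    counts = Counter(zip(*series))
    return -sum((c / n) * log2(c / n) for c in counts.values())


def total_correlation(*series):
    """Sum of marginal entropies minus joint entropy (bits); assumed definition."""
    return sum(entropy(s) for s in series) - entropy(*series)


def prioritise(series_by_id, by="entropy"):
    """Greedy placement order for the monitors of one solution.

    by="entropy": next monitor gives the largest joint entropy.
    by="correlation": next monitor gives the smallest total correlation.
    """
    remaining = dict(series_by_id)
    first = max(remaining, key=lambda k: entropy(remaining[k]))
    order = [first]
    del remaining[first]
    while remaining:
        chosen = [series_by_id[k] for k in order]
        if by == "entropy":
            nxt = max(remaining, key=lambda k: entropy(*chosen, remaining[k]))
        else:
            nxt = min(remaining, key=lambda k: total_correlation(*chosen, remaining[k]))
        order.append(nxt)
        del remaining[nxt]
    return order


# Hypothetical discretised series: P2 is partly dependent on P1, P3 is constant.
data = {
    "P1": [0, 0, 1, 1, 2, 2, 3, 3],
    "P2": [0, 0, 0, 1, 0, 0, 0, 1],
    "P3": [0, 0, 0, 0, 0, 0, 0, 0],
}
print(prioritise(data, by="entropy"), prioritise(data, by="correlation"))
```

The constant series adds no redundancy, so it is ranked second when sorting by total correlation, but it adds no information either, so it is ranked last when sorting by joint entropy.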
5.3.6.2 Approach a)
Figure 5-17 shows the behaviour of the information theory values for the solutions of
approach a) obtained at the extremes Xa (left) and Ya (right), sorting them by total
correlation in ascending order (top), and by joint entropy in descending order (bottom).
[Figure 5-17: four panels titled "Progress of informative values as new monitors are added", for Extreme Xa (left) and Extreme Ya (right). Vertical axis: H, C (bits), with curves for Joint Entropy and Total Correlation. Horizontal axis: Monitor ID, sorted by the least addition in Total Correlation (top panels) and by the biggest addition in Joint Entropy (bottom panels).]
Figure 5-17. Progress of information quantities as new monitors are added. Analysis for extremes
Xa and Ya of Figure 5-13.
In all cases, the first monitor is the one with the highest marginal entropy. The top-left
graph shows that the set starts with the monitor 42, H(42)=1.5 bits, and that it is followed
by point 991, with H(991) ~ 0. This means that C(42,991)~0 and H(42,991)=1.5, or, in
other words, that although the monitor 991 is independent from the monitor 42, it also
does not add additional information to what can be inferred from 42 alone. For the case of
Ya (top-right), the first monitor is 704, with a marginal entropy of H(704)=1.5 bits and all
the subsequent monitors add zero total correlation to the set but at the same time they do
not provide any additional information content; in other words, only the monitor 704 has
information content in this case, so that the Figure 5-17 (top-right) is a flat curve.
The bottom of the same figure shows the same results at each extreme, but now the
monitors are sorted by the highest addition in joint entropy. For the case of Xa (bottom-left)
the second point is 1030, with H(1030)=1 bit. However, H(42,1030) is not the same
as H(42) + H(1030) because C(42,1030)>0 and the total correlation curve rises. As
expected, the monitor 991 is the last selected, since it does not add information to the set
(and therefore only the first eight monitors would be placed). For the case of Ya (bottom-right)
it is observed that performing the sorting does not make any sense since the only
point that adds information to the set is point 704. This analysis has profound
implications for the final number of monitors to be located.
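The relation between the quantities quoted for monitors 42, 991 and 1030 can be written out explicitly, assuming the standard two-variable form of Total Correlation, C(X,Y) = H(X) + H(Y) - H(X,Y); the numerical values are those given in the text.

```latex
\begin{aligned}
H(42, 991)  &= H(42) + H(991) - C(42, 991) \approx 1.5 + 0 - 0 = 1.5 \text{ bits}, \\
H(42, 1030) &= H(42) + H(1030) - C(42, 1030) = 2.5 - C(42, 1030) < 2.5 \text{ bits}.
\end{aligned}
```

The first line shows that an independent monitor with negligible entropy leaves the joint entropy unchanged, while the second shows that an informative but partly redundant monitor contributes less than its marginal entropy.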
5.3.6.3 Approach b)
Figure 5-18 presents the previous analysis for the results obtained for the second
situation. In this case it is evident that monitors 776 and 170, obtained in the solution for
the extreme Xb, do not add any information to the set, H(1229)=H(1229,776,170), so
they have no capacity to reduce the total correlation of the set. As in the first situation, the
extreme for which total correlation is a minimum (extreme Yb) shows that only the
points 1030 and 801 are informative points, H(1030,801), and therefore only 7 points
would be needed. This confirms again that the solutions located at this extreme of the
Pareto front should not be considered for the final selected monitoring set.
[Figure 5-18: four panels titled "Progress of informative values as new monitors are added", for Extreme Xb (left) and Extreme Yb (right). Vertical axis: H, C (bits), with curves for Joint Entropy and Total Correlation. Horizontal axis: Monitor ID, sorted by the least addition in Total Correlation (top panels) and by the biggest addition in Joint Entropy (bottom panels).]
Figure 5-18. Progress of information quantities as new monitors are added for solutions of
approach b), for extremes Xb and Yb of Figure 5-15.
5.3.6.4 Sensitivity analysis of the parameters u and q
The values u=1 and q=1 were initially chosen to express costs in
informative units, and their selection may affect the outcomes. To carry out the
sensitivity analysis, Eq. (5-6) was solved for values of u of 0.1, 0.5, 1, 2 and 5 bits/new
monitor. In order to evaluate the quality of the results, the extremes of the obtained
Pareto fronts are analyzed. Additionally, the average of the marginal entropy of each
solution is analyzed, in order to assess the distribution of the entropy among the selected
monitors.
[Figure 5-19: two panels, (1) Joint Entropy (bits) and (2) Total Correlation (bits). Horizontal axis: Cost u (bits / New monitor), with values 0.1, 0.5, 1, 2 and 5. Vertical axis: New monitors in the solution (6, 7, 8 and 9 New). Values are shown in classes of 1 bit.]
Figure 5-19. Sensitivity of the maximum Joint Entropy (1) and Total Correlation (2) due to
variations of the parameter u, discriminated by the number of new monitors in the solution.
Figure 5-19 and Table 5-3 present the sensitivity of the maximum Joint Entropy and of
the Total Correlation to variations of the parameter u, according to the number of new
monitors in the solution. It can be observed that, for solutions having only new monitors,
good results are achieved regardless of the value of u (Joint Entropy between 4.12 and
4.38 bits, or between 84% and 89% of Hsys).
However, solutions considering some existing monitors become less informative as u
decreases. This is because, with a low u, there is little restriction on placing new monitors
and better combinations of monitors are found. An additional effect is that no solution that considers
4 EMPs was found within the solutions for u=0.1, 0.5 and 1.0. Conversely, high values of
u force the inclusion of existing monitors, which generates solutions of lower quality
(lower information content; see Figure 5-19(1)). When looking at the distribution of the
Total Correlation (Figure 5-19(2)), solutions with independent monitors are obtained
when more existing monitors are considered, a situation that again supports the conflicting
nature of the objectives.
Table 5-3. Sensitivity analysis for parameter u.

Criteria for evaluation          EPM    u (bits / new monitor)
of solutions                            0.1    0.5    1      2      5
Max Joint Entropy obtained       0      4.12   4.26   4.31   4.27   4.38
                                 1      3.52   3.84   3.73   4.20   4.37
                                 2      3.04   3.54   2.95   4.12   4.30
                                 3      2.75   2.85   2.10   3.71   4.18
                                 4      -      -      -      3.36   2.61
Min Total Correlation obtained   0      1.27   2.40   1.91   1.60   2.57
                                 1      0.47   0.85   1.06   1.87   2.10
                                 2      0.19   0.58   0.89   1.88   2.54
                                 3      0.13   0.14   0.07   1.43   1.88
                                 4      -      -      -      1.34   0.52

[Figure 5-20 panels: Joint Entropy (bits) and Total Correlation (bits), each plotted against Cost q (bits / violation) and against the number of violations.]
Figure 5-20. Sensitivity of the maximum Joint Entropy (left) and Total Correlation (right) due to
variations of the parameter q, discriminated by the number of violations in the solution.
A similar analysis was carried out for the parameter q (see Figure 5-20 and Table 5-4). In
general, solutions obtained with low values of q seem to be less informative and less
independent, regardless of the number of violations of the distance to hydraulic structures.
Additionally, a value of q = 0.1 gives solutions that do not include a high number of
violations (see Table 5-4).
Table 5-4. Sensitivity analysis for parameter q.

Criteria for evaluation          EPM    u (bits / new monitor)
of solutions                            0.1    0.5    1      2      5
Max Joint Entropy obtained       0      4.12   4.26   4.31   4.27   4.38
                                 1      3.52   3.84   3.73   4.20   4.37
                                 2      3.04   3.54   2.95   4.12   4.30
                                 3      2.75   2.85   2.10   3.71   4.18
                                 4      -      -      -      3.36   2.61
Min Total Correlation obtained   0      1.27   2.40   1.91   1.60   2.57
                                 1      0.47   0.85   1.06   1.87   2.10
                                 2      0.19   0.58   0.89   1.88   2.54
                                 3      0.13   0.14   0.07   1.43   1.88
                                 4      -      -      -      1.34   0.52

5.3.7 Conclusions
An alternative method for siting a set of water level monitors in polders based on
information quantities is presented. The first quantity is joint entropy, which evaluates the
amount of information content that the set is able to collect; the second is total
correlation, which evaluates the level of dependency or redundancy among monitors in
the set. In order to find the most convenient locations to put the monitors from a large
number of potential sites, a multiobjective optimization procedure (MOOP) was posed
under different considerations: one that takes into account the costs of placing a
completely new monitor, and another that considers the cost of placing monitors too close
to hydraulic structures. In both cases, the joint entropy of the set is maximized and its
total correlation is minimized. The costs are considered in terms of information theory
units, for which additional terms u*M and q*v were introduced into the objective
functions.
The following conclusions can be drawn:
- The information measures of Total Correlation and Joint Entropy are two conflicting
quantities, because as the first improves (i.e., monitors are independent among them), the
second deteriorates (i.e., monitors get less information content), and vice versa.
- The existing monitoring network, which reduces to the measurements taken at the
pumping stations to control the switching of the pumps, is not optimal from the
information theory point of view.
- The solutions for which total correlation was considered as a single objective (i.e.,
without joint entropy) are not satisfactory in terms of monitoring, because most of the
monitors in the set are placed at sites with no information content. This means that points
with no information content are the only ones that add the least total
correlation (i.e., the most independence) to a large set of interdependent points. The Pareto
solutions located at the extreme where the total correlation is minimum should, therefore,
be neglected. However, the maximization of joint entropy gives useful results, as the
solutions found can cover between 82% and 85% of the total information content of the
system by selecting fewer points than the originally proposed 9 points.
- The results obtained with the WMP method presented in section 5.2 are part of the
optimal set of networks obtained by solving the MOOP posed in this section. The
solution of the MOOP, however, has two main advantages over the WMP method: first, it
gives a complete picture in terms of options to select a monitoring set; second, it allows
adding constraints to the problem so a wider range of situations can be tested.
- In terms of the practical situations analyzed separately in this section, namely the
financial constraint of having to place new monitors and the accuracy constraint of
having to place them near hydraulic structures, it can be concluded that the best solutions
in information units are numerically very similar (joint entropy equal to 4.18 bits for the
first situation and to 4.04 bits for the second), but spatially different.
5.4 Evaluation of the Monitoring Network of the Magdalena River
Until now only water level series have been used to estimate the Information Theory
quantities for monitoring network design, due to the implications that they have for
polder systems. For the case of natural streams, however, the analysis of discharge time
series is more interesting than for water levels, since the effects of the tributaries on the
behaviour of the river can be important. For this reason, two methodologies to design the
discharge monitoring networks in rivers using concepts of information theory are
presented in this section. The first methodology considers the optimization of Information
Theory quantities and the second considers a new method that is based on ranking the
different possible monitor combinations. The methodologies are tested for the Magdalena
River in Colombia (see Chapter 4) in which the existing monitoring network is also
assessed. In addition, the use of monitors at the tributaries is also explored. The ranking
method is a promising way of finding the extremes of the Pareto fronts generated during
the multiobjective optimization process.
5.4.1 Description of the methodology
The ideal monitoring network would be composed of a set of gauges that provide the
maximum information content and that capture independent information. As
previously mentioned, Total Correlation and Joint Entropy are two conflicting objectives:
when the Joint Entropy of a set of monitors is increased, their Total Correlation tends to
increase as well, and reducing the Total Correlation tends to reduce the Joint Entropy. For
this reason, the best location of N monitors is one that balances both objectives. A
simpler mathematical formulation of the optimization problem than the one introduced
for the case of Delfland in Eq. (5-5) is:
\[
\begin{aligned}
&\min \{ C(X_1, X_2, \ldots, X_N) \} \\
&\max \{ H(X_1, X_2, \ldots, X_N) \}
\end{aligned}
\qquad \text{(5-8)}
\]
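As an illustration of the two objectives in Eq. (5-8), the following minimal Python sketch estimates both quantities from quantized series by frequency counting. It assumes that Total Correlation is the sum of the marginal entropies minus the Joint Entropy (the standard definition); the function names and the toy series are hypothetical and are not taken from the original work.

```python
from collections import Counter
from math import log2


def joint_entropy(series):
    """Joint entropy (bits) of one or more equally long quantized series."""
    n = len(series[0])
    counts = Counter(zip(*series))  # frequency of each joint state
    return -sum((c / n) * log2(c / n) for c in counts.values())


def total_correlation(series):
    """Assumed definition: sum of marginal entropies minus joint entropy."""
    return sum(joint_entropy([s]) for s in series) - joint_entropy(series)


# Hypothetical quantized records at three sites.
x1 = [0, 0, 1, 1]
x2 = [0, 0, 1, 1]  # copy of x1: fully redundant
x3 = [0, 1, 0, 1]  # independent of x1

print(joint_entropy([x1, x2]), total_correlation([x1, x2]))
print(joint_entropy([x1, x3]), total_correlation([x1, x3]))
```

In this toy case the redundant pair carries less joint information and a positive Total Correlation, whereas the independent pair carries more joint information with zero Total Correlation.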
The optimization problem posed in Eq. (5-8) is solved using two different approaches: a)
Multiobjective Optimisation with Genetic Algorithms (MOGA); b) a rank-based
greedy algorithm to optimize both objectives independently. Both approaches are
described below.
5.4.1.1 Multi-objective optimization approach
One way of solving the problem posed in Eq. (5-8) is by looking at it as a multi-objective
optimization problem (as in Section 5.3), which provides as a result a set of solutions
that draws a Pareto front (efficient, non-dominated solutions). Such a front describes the
limits of what is possible in terms of decision criteria, and identifies how an improvement
in one particular criterion is related to losses in other criteria. MOGA has been
successfully used to solve water-related optimization problems (see e.g., Alfonso et al.
2010a; Barreto et al. 2009). A series of experiments were carried out to identify the
optimal set of points for monitoring. In this section, NSGA-II, an elitist non-dominated
sorting genetic algorithm for multi-objective optimization (Deb et al. 2002), is used.
5.4.1.2 Rank-based greedy algorithm
The second approach consists of a greedy algorithm that picks one gauge at a time, namely
the best one in terms of information. The idea is to rank all the potential places to locate the
monitors according to the variation in Joint Entropy and in Total Correlation, separately,
caused by the selection of a new monitor. The algorithm requires the location of the first
monitor to start, which could be defined as the monitor with the highest reduction in
uncertainty, as suggested for instance by Krstanovic and Singh (1992). However, starting
with the point that has the highest information content does not guarantee that the final
set of monitors is the most informative, as is shown by the results. The second monitor is
then chosen from the remaining set of monitors in such a way that it provides either the
highest increment in Joint Entropy or the lowest increment in Total Correlation with
respect to the first point. Flowcharts describing both situations are shown in Figure 5-21.
[Figure 5-21 flowchart content: for Xi ∈ S, (a) Hi = H(X0, X1, ..., Xi-1, Xi); (b) Ci = C(X0, X1, ..., Xi-1, Xi).]
Figure 5-21. Flowchart of the rank-based greedy algorithm for Joint Entropy (a) and Total Correlation (b)
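The Joint Entropy variant of the rank-based procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the names `joint_entropy`, `greedy_by_joint_entropy` and the four toy series are hypothetical, entropies are estimated by simple frequency counting, and ties are resolved by taking the first candidate in dictionary order.

```python
from collections import Counter
from math import log2


def joint_entropy(series):
    """Joint entropy (bits) of equally long quantized series."""
    n = len(series[0])
    counts = Counter(zip(*series))
    return -sum((c / n) * log2(c / n) for c in counts.values())


def greedy_by_joint_entropy(candidates, first, m):
    """Select m monitors (m <= number of candidates), starting from `first`.

    At each step the remaining candidate that yields the highest joint
    entropy together with the monitors already selected is added.
    """
    selected = [first]
    while len(selected) < m:
        remaining = [k for k in candidates if k not in selected]
        best = max(
            remaining,
            key=lambda k: joint_entropy([candidates[s] for s in selected + [k]]),
        )
        selected.append(best)
    return selected


# Hypothetical quantized series at four candidate points.
candidates = {
    "A": [0, 0, 1, 1],
    "B": [0, 0, 1, 1],  # duplicates A
    "C": [0, 1, 0, 1],  # independent of A
    "D": [0, 0, 0, 0],  # constant series: zero entropy
}
network = greedy_by_joint_entropy(candidates, first="A", m=3)
print(network, joint_entropy([candidates[k] for k in network]))
```

The Total Correlation variant would follow the same loop with `min` applied to the total correlation of the enlarged set. In the toy data the third pick is a tie, since neither remaining candidate adds joint entropy.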
5.4.2 Case study: Magdalena River, Colombia
The multiobjective optimization method and the rank-based algorithm are applied to
the monitoring network of the Magdalena River, the main river of Colombia, which runs
for about 1,540 kilometres from South to North through the western half of the country to
the Caribbean Sea. A detailed description of this case study is presented in Chapter 4.
The tributaries of the Magdalena River play an important role in many respects. The
watershed and its main tributaries cover 257,400 km2, which corresponds to 24% of the
total surface of the national territory. The main tributaries, located mainly in the middle
reach, have an important influence on the river’s behaviour in terms of discharge. Figure
5-22 presents the location of the most important cities near the river, the main tributaries
and the available discharge and water level records for 1995.
Figure 5-22: Available hydrologic data records of discharges (Q) and water levels (h) at river stations
for 1995.
The existing water-level gauges for the river were placed initially to support decision-making concerning local problems in the main populated areas, related to flood control
and navigation, while keeping operation and maintenance costs low. However, from a
global perspective, the information collected by these gauges is of limited use for supporting
decision and policy-making for navigation, flood control and other issues at other points
of the river. Therefore, insight is needed on the design of a new monitoring network
while evaluating the existing network in terms of its information content.
The parameter a in Eq. (5-1) is selected in such a way that all of the
tributaries included in the model have significant information content. A large value
of a would make small discharges insignificant in terms of information content, because
the rounding in the expression would transform their discharge series into series
of constant values. On the other hand, too small a value of a would give every
discharge in the river a similar information content, which is not convenient for the
analysis. For these reasons, a was set to 200 m3/s in order to include
all the available discharges of the tributaries. A sensitivity analysis showing how the results
may change with the selection of the parameter a is included at the end of this
section.
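The effect of a on the information content can be illustrated with a short Python sketch. Eq. (5-1) is not reproduced in this section, so the sketch assumes that it rounds each value to the nearest multiple of a, i.e. x_q = a * floor((2x + a) / (2a)); the discharge values and the function names are hypothetical.

```python
from collections import Counter
from math import floor, log2


def quantize(series, a):
    """Round each value to the nearest multiple of a (assumed form of Eq. (5-1))."""
    return [a * floor((2 * x + a) / (2 * a)) for x in series]


def marginal_entropy(series):
    """Shannon entropy (bits) estimated from relative frequencies."""
    n = len(series)
    return -sum((c / n) * log2(c / n) for c in Counter(series).values())


# Hypothetical discharge record in m3/s.
discharge = [410.0, 590.0, 640.0, 820.0, 980.0, 450.0, 770.0, 610.0]

for a in (1, 200, 100000):
    q = quantize(discharge, a)
    print(a, sorted(set(q)), marginal_entropy(q))
```

With a very large a every value collapses into one class and the entropy is zero; with a very small a every value falls in its own class, so all series of the same length reach the same maximum entropy. Only intermediate values of a discriminate between sites.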
5.4.3 Analysis of Results
The model output includes discharge time series for 181 calculation points along the main
river (Loba branch) and 31 points for the Mompox branch. These time series are
quantized using Eq. (5-1) adopting a value of 200 m3/s for a, which corresponds to the
mean discharge of the smallest tributary with available data; so only the effects of inputs
with this magnitude are captured by the entropy analysis.
The pre-design of the monitoring network for the Magdalena River was done using
Information Theory to select a limited number of points from the 181 discharge points on
the main channel where gauge devices are worth placing. As a first insight, the marginal
entropy of each calculation point was estimated using Eq. (2-1) and a map of the entropy
for the Magdalena River was prepared (Figure 5-23).
5.4.3.1 Analysis of the entropy map
Before presenting the solutions of Eq. (5-8) for the design of the monitoring network
using the methods described above, the entropy map (Figure 5-23(a)) obtained for the
discharge time series is compared to the map of discharges (Figure 5-23(b)). Firstly,
entropy increases at points where the tributaries discharge into the river (see for example
the rivers Miel, Negro, Nare, Sogamoso, Cimitarra, Cauca and the convergence of the
branches Mompox and Loba in Figure 5-23(b)). The rivers Opón and Carare do not show
any increment in the entropy, due to their relatively low influence in terms of discharge.
The Mompox branch shows the same entropy along its channel, because no
tributaries flow into it. It is interesting to see that the lowest value of entropy occurs
in this branch. This is because the discharge in this branch ranges from 400 m3/s to
1000 m3/s, so when applying Eq. (5-1) with a=200 m3/s the resulting quantized series has
only four unique values in the frequency analysis and therefore only four terms are
summed when evaluating Eq. (2-1).
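As a worked check of this remark, and assuming that the marginal entropy of Eq. (2-1) is the Shannon entropy with base-2 logarithms, a quantized series with four distinct values and relative frequencies p_1, ..., p_4 satisfies

\[
H(X) = -\sum_{i=1}^{4} p_i \log_2 p_i \;\le\; \log_2 4 = 2 \ \text{bits},
\]

with equality only when the four values are equally frequent. A point whose quantized record takes many more distinct values can reach a correspondingly higher entropy, which is why the Mompox branch appears as the low-entropy zone of the map.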
Figure 5-23. Entropy map for a=200 m3/s in bits (a) and mean discharge map for 1995 in m3/s (b),
for the Magdalena River.
Secondly, entropy decreases when the wetlands interact with the river. As mentioned
above, the wetlands act as a complex system of reservoirs that absorb the peak flows of
the Magdalena River. For this reason, the discharge time series tends to be smooth, the
range between minimum and maximum discharge is lowered and therefore entropy
diminishes. Indeed, entropy is continuously increasing from upstream to downstream,
until the wetland W1, just before the inflow of the Lebrija River. From this point to El
Banco, entropy remains constant because no additional inflows exist. However, a big
change in entropy takes place at El Banco, reducing it to the values reported after the
inflows of the rivers Miel and Negro. This change is due to the connection of the
Magdalena River to the wetland W2 (Figure 5-23), and the bifurcation of the main river
into the Mompox and the Loba branches. It is clear that, after the bifurcation, the Loba
branch contains, on average, a similar flow to that at San Pablo, or about 3800 m3/s,
highlighting again the effects of the wetlands. On the other hand, the effect of the wetland
W3 on the entropy map is opposite to that observed for the wetlands W1
and W2, since it adds entropy to the river. This effect is due to a third, minor bifurcation
called Chicagua (not shown in Figure 5-23), which interacts with the so-called
Momposina Depression, the “island” formed by the Mompox and Loba branches, into
which water is discharged during high water level events in the river. For the year 1995
the flow was mainly from the depression to the Magdalena and therefore it was acting,
on average, as an additional tributary. Although this additional discharge is not
significant for the river (see Figure 5-23(b), between El Banco and the Cauca inflow), it
makes a difference in terms of entropy (see Figure 5-23(a)).
Thirdly, in the middle of the Loba reach, the biggest tributary, the Cauca River, flows
into the Magdalena, greatly increasing its flow and also its entropy, which recovers some
of the entropy absorbed by the wetlands W1 and W2. Additionally, although the wetland
W4 acts in a similar way to W1 and W2 its effect is not apparent in the discharge map or
the entropy map, because the influence of this zone is driven by the significant inflow
from the Cauca River.
Finally, at the point where the branches Loba and Mompox converge, the entropy is a
maximum. As there are no additional inflows or wetlands, this value remains constant
until the most downstream point in Calamar.
5.4.3.2 Results using Multi-objective optimization approach
The multi-objective optimization problem posed in Eq. (5-8) is solved using the
Non-dominated Sorting Genetic Algorithm, NSGA-II (Deb et al. 2002), for which the evolutionary
parameters, namely the population size and the number of generations, must be
specified. Additionally, the number of decision variables (number of monitors to be
placed along the river) must be defined. In order to perform a sensitivity analysis of these
parameters, a number of experiments were carried out, in which five combinations of
population size (P) and number of generations (G) were tested, (P, G): (50, 20),
(50, 50), (100, 20), (100, 50), (200, 50). Additionally, each experiment was carried out
for a number of gauges from 6 to 9. The final solution was determined by selecting the
best solutions (those with high Joint Entropy and low Total Correlation) from the five
Pareto fronts. For comparison purposes, these solutions have been included in each
Pareto front in Figure 5-24 as black dots.
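The selection of the best solutions from several Pareto fronts can be read as a non-dominated filter over the pooled solutions. The following Python sketch shows one way of doing this under that assumption; the (H, C) pairs are hypothetical and the function names are illustrative, not part of NSGA-II or of the original experiments.

```python
def dominates(p, q):
    """True if solution p = (H, C) dominates q: H no lower, C no higher,
    and strictly better in at least one of the two."""
    return p[0] >= q[0] and p[1] <= q[1] and (p[0] > q[0] or p[1] < q[1])


def merge_fronts(fronts):
    """Pool several fronts and keep only the non-dominated (H, C) pairs."""
    pool = {pt for front in fronts for pt in front}
    return sorted(pt for pt in pool if not any(dominates(o, pt) for o in pool))


# Hypothetical fronts from three (P, G) settings, as (Joint Entropy, Total Correlation).
fronts = [
    [(8.30, 28.5), (8.40, 30.0)],
    [(8.35, 28.5), (8.45, 31.5)],
    [(8.40, 30.5)],
]
best_front = merge_fronts(fronts)
print(best_front)
```

The pairwise comparison is quadratic in the number of pooled solutions, which is acceptable for fronts of the size produced by populations of 50 to 200 individuals.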
[Figure 5-24 panels: 6, 7, 8 and 9 decision variables. Axes: Joint Entropy (bits) against Total Correlation (bits). Series: (P, G) = (50, 20), (50, 50), (100, 20), (100, 50), (200, 50).]
Figure 5-24: Solutions for multiobjective optimization approach. Black dots form the best Pareto
front obtained by selecting the best points of the 5 combinations (P, G). Points A, B, C and D are
selected for further analysis for 9 decision variables.
From Figure 5-24 it can be observed that increasing the number of decision variables
translates into a small increment in joint information and into a significant increase in
redundant information. This means that new monitors will not add much
information content beyond what can be deduced from fewer monitors.
Figure 5-25 presents the locations of the solutions A, B, C and D shown in Figure 5-24
for the case of 9 decision variables on the entropy map (bits) of the Magdalena River. The
redundancy of the solutions is evident, especially in the upstream part of the river.
Additionally, Figure 5-26 presents the location of the solutions with the highest value of
Joint Entropy for 6, 7, 8 and 9 monitors.
Several conclusions can be drawn from Figure 5-25. First, in general, the monitors are
located where significant changes in entropy take place. Second, redundancy is reduced
by adding monitors upstream, which do not add extra information content; conversely,
joint information increases as more downstream monitors are added, with a
consequent increase in their dependency. This confirms the trade-off between both
information measures. Finally, monitors are always selected at the Momposina
Depression, especially where the wetlands have connections with the Magdalena and
where the Cauca River discharges. The complex hydraulic conditions make the discharge
change frequently along the river, leading to an increase in information content.
Figure 5-25. Location of selected solutions A, B, C and D of Figure 5-24 for 9 decision variables
[Figure 5-26 panels: 6, 7, 8 and 9 monitors; colour scale from 0.4 to 1.]
Figure 5-26. Location of the most informative (and redundant) solution obtained for 6, 7, 8 and 9
monitors (the rightmost black dots of each Pareto front of Figure 5-24)
Moreover, Figure 5-26 shows the most informative solutions obtained for different
numbers of monitors and how they are redistributed when an additional monitor is
considered in the solution set. There is a regular distribution of the monitors for 6 and 7
decision variables, while for 8 and 9 monitors there is a tendency to locate the monitors
upstream. This can be explained by recalling that all the tributaries, with the exception of
the Cauca River, are located upstream (see Figure 5-22) so that the information collected
at monitors located in this area provides insight into the state of the system downstream.
5.4.3.3 Results using the rank-based greedy algorithm
The algorithms presented in Figure 5-21 were applied to the Magdalena River in order to
find the location of m monitors. To analyse the evolution of the
monitoring network, experiments were carried out for m=5, 6, 7, 8 and 9. The algorithms
were executed using each of the 181 computational points of the
model as the starting point, generating 5 matrices (one for each m) of size 181 x m.
- Ranking by joint entropy
The solution for the monitors with the maximum joint entropy was selected from the
previously generated matrix; the locations of the monitors are plotted in Figure 5-27.
[Figure 5-27 panels: 5, 6, 7, 8 and 9 monitors, numbered by order of selection; colour scale from 0.4 to 1. Values reported for each panel:]

Monitors   H (bits)   C (bits)
5          8.4322     15.818
6          8.4604     20.736
7          8.4717     25.201
8          8.4717     30.357
9          8.4717     35.502
Figure 5-27. Results obtained running the flowchart of Figure 5-21(a). Numbers represent the
order in which each monitor was selected. The colour scale represents entropy (bits).
From Figure 5-27 it can be observed that the first monitor is located on the Loba branch,
just before the convergence with the Mompox branch; this monitor, however, is not the
one with the maximum information content (which is the second point from downstream
to upstream). This means that starting with the monitor with the highest entropy does not
guarantee that the final set of monitors has the maximum joint entropy.
The second monitor is located at the discharge of the Lebrija River and the connection to
the wetland W2; the third monitor, which adds the maximum joint entropy to the previous
set of two monitors, is placed after the discharge of the Nare River; the fourth monitor is
located between the discharges of the rivers Carare and Opón and the fifth one is placed
at the downstream part of the river, completing a set of five monitors with a Joint Entropy
value of H=8.4322 bits. The solution for six monitors includes the same previous five
locations in the same order of selection and adds the sixth at the place where the wetland
W3 is connected, increasing the Joint Entropy of the set to H=8.4604 bits. Similarly,
the solution for seven monitors includes the previous six and adds the seventh at a place
near the city of Berrío, downstream of the third monitor. This makes the Joint Entropy
increase again to H=8.4717 bits.
Until now, every newly selected monitor has added the maximum possible information
content to the previous set, and this monitor has been unique at every step.
However, 20 different candidates arise for monitor number eight and none of them
provides any additional information content to the set of seven monitors, implying that
further monitors are redundant. The location of the eighth and ninth monitors (for which
148 different candidates arose) as shown in Figure 5-27 also confirms that these monitors
are not worth selecting: they all congregate downstream, repeating the information
provided by the fifth monitor.
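The values reported in Figure 5-27 are consistent with this interpretation. Assuming the standard definition of Total Correlation, C(X_1, ..., X_N) = \sum_i H(X_i) - H(X_1, ..., X_N), the change in C caused by adding an eighth monitor to a set of seven is

\[
C_8 - C_7 = H(X_8) - \left[ H(X_1, \ldots, X_8) - H(X_1, \ldots, X_7) \right],
\]

so that when the Joint Entropy does not change (it stays at 8.4717 bits for 7, 8 and 9 monitors) the whole marginal entropy of the new monitor is counted as redundancy:

\[
C_8 - C_7 = 30.357 - 25.201 = 5.156 \ \text{bits}, \qquad
C_9 - C_8 = 35.502 - 30.357 = 5.145 \ \text{bits}.
\]

Under this assumption, the eighth and ninth monitors are themselves informative points (about 5.2 bits each), but everything they measure is already contained in the first seven.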
- Ranking by total correlation
The same exercise was performed for the algorithm presented in Figure 5-21 (b); results
are shown in Figure 5-28. It can be observed that the first monitor is located at the
connection to the wetland W3 and that the subsequent monitors were located at the
upstream part of the river, looking for points with very low information content. This is
because one way of reducing Total Correlation is by adding random variables with very
low (or null) entropy (Alfonso et al. 2010c).
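This property can be verified in a few lines, again assuming the standard definition C(X_1, ..., X_N) = \sum_i H(X_i) - H(X_1, ..., X_N). Let X_0 be a point with null entropy, H(X_0) = 0. Since conditioning cannot increase entropy, 0 \le H(X_0 \mid X_1, \ldots, X_N) \le H(X_0) = 0, and therefore

\[
H(X_1, \ldots, X_N, X_0) = H(X_1, \ldots, X_N) + H(X_0 \mid X_1, \ldots, X_N) = H(X_1, \ldots, X_N),
\]

\[
C(X_1, \ldots, X_N, X_0) = \sum_{i=1}^{N} H(X_i) + 0 - H(X_1, \ldots, X_N) = C(X_1, \ldots, X_N).
\]

A null-entropy point is thus added at no cost in Total Correlation, but it also contributes nothing to the Joint Entropy, which is why a ranking based on Total Correlation alone drifts towards uninformative locations.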
5.4.3.4 Comparison with the existing monitoring stations
The monitoring network formed by the existing stations on the Magdalena River, with
flow data available for the year 1995, was evaluated from an Information Theory
perspective. The set of 9 stations (Salgar, Berrío, San Pablo, Regidor, Peñoncito, El
Banco, Magangué, Tacamocho and Calamar) has a value of Joint Entropy of H=8.3808
bits and a value of Total Correlation of C=34.7464 bits. The performance of this network
can be compared to the results obtained for 9 variables using the multiobjective
optimization approach and by the ranking approach in the Joint Entropy – Total
Correlation space (Figure 5-30). It is observed that the set is not optimal (there exist other
solutions that give better Joint Entropy and Total Correlation values).
Figure 5-28. Results obtained running the flowchart of Figure 5-21(b). Numbers represent the
order in which each monitor was selected. The colour scale represents entropy (bits).
Comparison with monitors located at tributaries
From the practical point of view, discharge measurements are part of the navigation
studies for the Magdalena River. In order to determine the water balance, these
measurements are taken before and after the most important tributaries and at bifurcations
such as the Mompox and Loba branches. In order to evaluate these locations from the
Information Theory perspective, the value of the marginal entropy before and after the
inflows of the eight tributaries included in the model is presented in Figure 5-29. It can
be noted that the information content always increases after every inflow, with the
exception of the Lebrija River, whose discharge takes place near the wetland W1.
Therefore, the straightforward conclusion is to place monitors after the tributaries in order
to get the maximum information content of the river.
However, a comparison between monitors located according to this analysis and the
optimal solutions obtained with the multiobjective optimization method for 8 decision
variables (bottom-right of Figure 5-24), reveals that the monitoring networks obtained
considering the tributaries are sub-optimal; this suggests that the effect of the wetlands,
typically ignored in the measurement campaigns due to the difficulties of monitoring in
such a vast wetland area, the hundreds of small connections river-wetlands and the poor
elevation data available, must be taken into account in order to understand the behaviour
of the river better.
[Figure 5-29 chart: Marginal Entropy (0.60 to 1.00) before and after the tributaries Cauca, Lebrija, Cimitarra, Sogamoso, Opón, Carare, Nare and Miel/Negro. Inset table:]

Monitor location    Total Correlation   Joint Entropy
After tributary     23.70               8.32
Before tributary    22.18               8.31
Figure 5-29. Entropy values before, at and after the main tributaries
To make a general comparison, the monitoring sets located according to
the 8 tributaries of Figure 5-29 are included in the Total Correlation – Joint
Entropy plane for 9 variables in Figure 5-30. It can be observed that both sets (before and
after the tributaries) have a slightly better value of Joint Entropy than the existing
monitors. Naturally, the Total Correlation cannot be compared in this graph because it is
very sensitive to the number of monitors in place.
[Figure 5-30 chart: Joint Entropy H (bits), 7.2 to 8.6, against Total Correlation C (bits), 26 to 36. Series: Multiobjective Optimization; Ranking by C; Ranking by H; Existing monitors; Max H, ranking by H; Min C, ranking by C; After tributaries (8); Before tributaries (8).]
Figure 5-30. Solutions obtained by different methods, Total Correlation – Joint Entropy plane.
5.4.4 Sensitivity analysis of the parameter a
As previously mentioned, the selection of the parameter a in Eq. (5-1) may change the
value of the entropy-related quantities. In order to analyse the implications of these
changes, the entropy map presented in Figure 5-23 is redrawn for different values of a
(see Figure 5-31).
It can be observed that entropy values decrease when the value of a increases, because
there are fewer bins in the frequency analysis and therefore fewer terms in the
sum of Eq. (2-1). However, the relative value of each point with
respect to the others in the same map is, in general, maintained regardless of the value of
a. Therefore the expressions (2-8) and (2-9) yield numerically different values, but
basically the same locations are obtained. This can be seen in Figure 5-31, where the
zones with high entropy are always between the discharges of the tributaries Cimitarra
and Lebrija and also after the convergence of the branches Mompox and Loba. On the
other hand, the zones with low entropy are located in the Mompox branch and at the
upstream part of the river, before the discharge of the rivers Miel and Negro. It can also
be observed that in the wetlands zone the entropy changes in a similar way between
maps. This implies that the monitoring networks generated with the presented
methods do not change significantly when the value of a is changed. It must be noted,
however, that an extreme, unreasonable value of a such as a=1 or a=100,000 m3/s leads to
useless constant entropy maps. It is recommended, therefore, that this value be set
between the lowest and the highest mean flow of the incoming tributaries of interest.
Figure 5-31. Entropy maps for different values of a, Eq. (5-1)
5.4.5 Conclusions and Recommendations
The entropy map for discharge in the Magdalena River shows that entropy increases at
places where the tributaries flow into the river and diminishes at places where there exist
connections to the wetlands.
The series of experiments carried out above gives rise to the following conclusions:
- The selection of high-entropy points for monitoring leads to redundant monitors
and the selection of low-entropy points generates a final set with low information
content. The conflicting nature of these Information Theory quantities promotes
the use of a multiobjective optimization approach. However, the selection of the
final monitoring network is not straightforward if only the generated
Pareto fronts are analysed, and it is still difficult to find an optimal solution that
satisfies both criteria. In order to choose one point from the Pareto front,
additional constraints are needed to determine the relative importance of joint
entropy and total correlation. It is recommended that decision makers find these
additional constraints by considering the requirements of water users.
- Seven monitors is the maximum number of monitors for which the Joint Entropy
continuously increases along the Magdalena River, under the conditions in which
the model was built. Additional monitors are fully redundant and do not add any
further information content.
- The ranking-based methods are useful for finding the extremes of the Pareto
fronts generated by the multiobjective optimization procedure and could be used
in further research to normalise the information quantities and therefore to
evaluate the solutions in a relative way. An interesting finding is that the initial
monitor used to start the algorithms in Figure 5-21 plays a significant role in the
Joint Entropy and the Total Correlation of the final set. Also, starting with the
point with the highest entropy does not guarantee that the final set of monitors has
the maximum information content.
- Although the existing monitoring stations were placed individually to fulfil the
requirements of the cities without assessing the network as a whole, this set
yields acceptable information content; however, there is high
redundancy between monitors. Moreover, its performance is similar to what is
obtained if the monitors are located following the location of the tributaries, as is
normally done during monitoring campaigns.
5.5 Conclusions
The distribution of the information content on a water system is driven by features that
produce changes in the hydraulic conditions of the system. For the case of the polders of
Pijnacker, the features that interfere with the distribution of the information content are
the weirs and the pumps; for the case of the Magdalena River, it is the incidence of its
wetlands and tributaries. However, this does not mean that a monitoring network
configured by placing monitors at the hydraulic structures in the first case or at the
tributaries in the second case will be optimal from an Information Theory perspective.
In all the experiments, the trade-off between Total Correlation and Joint Entropy is
evident. However, a large Joint Entropy can be preferred over a small Total Correlation.
This is because sets of monitors with low dependence are also the least informative.
It does not make sense, therefore, to have a completely independent set of monitors if they
do not provide enough information content individually.
The location with the highest information content of the system is usually the most
dependent on the remaining locations. This implies that once this point is selected, it is
very difficult to find a second point that is informative and at the same time is
independent. This also explains why only a few monitors are needed in spite of having
many possibilities (8 out of 1520 potential monitors in place for the Pijnacker region and
7 out of 181 for the case of the Magdalena River).
The sensitivity of any discretization-based criteria for the assessment of probability
distributions is a well-known difficulty that greatly affects the estimation of
entropy-related quantities. However, these effects are negligible for the methods determining
monitor location due to the relative nature of the developed methods.
Chapter 6
Value of Information for monitor location
This chapter presents a novel approach for locating monitors in a water system using the
concept of Value of Information (VOI). This concept takes into account three main
factors: 1) the belief that the decision-maker has about the state of the water system
before having any information; 2) the consequences associated with the decision of
having to choose among several possible actions given the state of the water system; and
3) the evaluation and update of new information when it becomes available. The
methodology uses water level time series generated by hydrodynamic models at every
computational point, each one being a potential monitor site. The method is tested in two
case studies of completely different nature: the Magdalena River in Colombia, and a
polder system in The Netherlands. It is shown that the methodology can be used as a
complementary approach to existing methods that focus the monitor location problem
exclusively on information-theory.
This Chapter is organised as follows: first, the main considerations are presented in the
introduction, followed by a section that defines the variables for the VOI estimation.
Then an explanation of how monitor locations can be valued according to the VOI theory
is presented. The approach to locating monitors is then explained for the case of one, two
and three monitors and the generalization of the method for n monitors. Subsequently, the
methods are applied to the Magdalena River and to the Pijnacker polders, after a brief test
in a hypothetical, simple case. Finally, the conclusions are presented for this Chapter.
6.1 Introduction
Two main considerations are taken into account when developing the methods presented
in this Chapter: the use of a model as a data generator, in a similar way as presented in
Chapter 5, and the use of the generated data to estimate the probabilities required for the
VOI calculation.
Firstly, in this Chapter the model is used to generate time series from which the
probabilities required to assess the Value of Information are estimated. The model is
required because available measurements are generally limited to a few points making
them insufficient to draw conclusions from their analysis. In contrast, the model
generates a dense set of points. As presented in Chapter 5, every calculation point within
a model is considered as a potential location for a monitoring point within a water
system. The details of the model used for the case of the polders of Pijnacker and for the
case of the Magdalena River in Colombia have been presented in Chapter 3 and in
Chapter 4 respectively.
Secondly, this chapter introduces a procedure to estimate the prior beliefs, πs, and the
conditional probabilities, qm,s, used in the VOI estimation. The assessment of these
parameters is difficult, because the probabilities before and after receiving the new
information are not known. The data generated by the model is used to estimate such
probabilities with the procedure explained in the following section.
6.2 Definition of variables for VOI estimation
As reviewed in Chapter 2, VOI is a function of the consequences, cas, of taking an action,
a, given a particular state, s; of the prior probability, πs, or the belief before the acquisition
of additional information; and of the conditional probability, qm,s, of receiving the
message, m, given the state, s. This means that three sets of data are needed to estimate
the Value of Information: the set of actions a, the set of possible states s and the set of
messages m that the monitors provide.
First, the set a = (a1, a2, ..., aA) contains A actions that are available for the decision
maker in order for him 2 to deal with the state of the system, for example, by turning a
pump on, by releasing a warning or simply doing nothing. Second, the set s = (s1, s2,...,
sS) contains S possible states of the system, namely, flooding or normal water levels, that
should be defined for each point of the system. Third, the set m = (m1, m2,..., mM) contains
M messages that the information service will provide to the decision maker as an
indication of the possible state of the system. Examples of such messages can be
“Danger” or “Relax”. Two actions, two states and two messages are considered in this
Chapter for simplicity.
6.2.1 Estimation of the probabilistic variables πs and qm
In the first place, the prior probabilities associated with the possible states are estimated
using the vector πs, shown in Table 6-1 for the case of two possible states s1 and s2 that
can occur in the given water system. Two values are defined: 1) Number of states sk at
point i, referring to the number of times the state sk has been repeated in the history of the
point i in the water system; 2) Number of records, being the length of the generated time
series at point i, which is generally a constant for all points in the system. Consequently,
each element in the πs vector contains the relative frequency of each state, which can be
regarded as the probability of a particular state occurring at each point of the system, and
in this respect may be treated as the knowledge the decision maker has about his system
for a given state.
2 We imply both male and female when using him/his/he to refer to the decision-maker.
Table 6-1. Definition of the vector πs for two possible states of a water system
State | πs
s1: State 1 | Number of states s1 at point i / number of records
s2: State 2 | Number of states s2 at point i / number of records
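The relative-frequency estimate of Table 6-1 can be sketched in a few lines. This is a minimal illustration, not the code used in this work: the function name `estimate_prior`, the use of a single flood threshold to separate s1 from s2, and the sample series are all assumptions made for the example.

```python
import numpy as np

def estimate_prior(levels, flood_level):
    """Relative frequency of each state at one point i (Table 6-1).

    levels      : simulated water-level series at point i
    flood_level : hypothetical threshold; a record above it counts as s1
    Returns the vector [P(s1), P(s2)].
    """
    levels = np.asarray(levels, dtype=float)
    n_records = levels.size                          # length of the time series
    n_s1 = np.count_nonzero(levels > flood_level)    # number of states s1
    n_s2 = n_records - n_s1                          # number of states s2
    return np.array([n_s1, n_s2]) / n_records

# Illustrative series (made-up values, not model output)
series = [1.2, 0.8, 1.5, 0.9, 0.7, 1.1, 0.6, 0.5]
prior = estimate_prior(series, flood_level=1.0)
print(prior)
```

Three of the eight illustrative records exceed the threshold, so the sketch returns a prior of 0.375 for s1 and 0.625 for s2.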
In the second place, the conditional probabilities qm,s, of receiving the message m, given
the state s, are estimated by checking the relative frequency of the situations presented in
Table 6-2, where x is the location at which messages are produced taking into account the
state at the location, and y is any other point in the system.
Table 6-2. Possible situations of messages at x for given states at y for the case of two
possible states
Situation | Message at x | State at y
1 | Message m1 | State 1
2 | Message m1 | State 2
3 | Message m2 | State 1
4 | Message m2 | State 2
From Table 6-2, bearing in mind that the messages m1 and m2 describe the states s1 and s2
respectively, it is clear that:
• the messages are correct when situations 1 and 4 happen;
• there is a type I error in situation 3;
• there is a type II error in situation 2.
For example, consider that m1 = "Danger" is used to describe the state s1 = "Flood" and
m2 = "No Panic" is used to describe the state s2 = "Normal". In situation 3 in Table 6-2,
the message m2 incorrectly announces that there is no problem, while in situation 2
the message m1 incorrectly announces that a flood event is occurring.
Therefore, the conditional probabilities qm,s may be estimated as stated in Table 6-3,
where the value Num situation i is the number of times a situation i (Table 6-2) happened
in the history of the time series and Num states s is the number of times the particular
state s occurred.
Table 6-3. Definition of conditional probabilities qm,s according to the situations presented
in Table 6-2.
qm,s | m1: Message 1 (at x) | m2: Message 2 (at x)
s1: State 1 at y | Num situation 1 / Num states s1 | Num situation 3 / Num states s1
s2: State 2 at y | Num situation 2 / Num states s2 | Num situation 4 / Num states s2
It can be observed that the elements in each row sum up to one. However, a row will have
a value of zero if the corresponding state does not occur. This means that a monitor at x
makes sense only if it is able to say something about the state at y.
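The counting behind Table 6-2 and Table 6-3 can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `estimate_q`, the threshold arguments and the sample series are hypothetical, and the message at x is assumed to be "m1" whenever the level at x exceeds its own threshold.

```python
import numpy as np

def estimate_q(levels_x, levels_y, flood_x, flood_y):
    """Conditional probabilities q[s, m] = P(message m at x | state s at y).

    Rows are the states at y (s1, s2); columns are the messages at x (m1, m2),
    following the layout of Table 6-3.
    """
    msg_m1 = np.asarray(levels_x, dtype=float) > flood_x     # m1 issued at x
    state_s1 = np.asarray(levels_y, dtype=float) > flood_y   # s1 occurring at y
    counts = np.array([
        [np.sum(msg_m1 & state_s1), np.sum(~msg_m1 & state_s1)],    # situations 1 and 3
        [np.sum(msg_m1 & ~state_s1), np.sum(~msg_m1 & ~state_s1)],  # situations 2 and 4
    ], dtype=float)
    n_states = counts.sum(axis=1, keepdims=True)   # Num states s1, Num states s2
    # A state that never occurs leaves its row at zero instead of dividing by zero
    return np.divide(counts, n_states, out=np.zeros_like(counts), where=n_states > 0)

# Illustrative series (made-up values, not model output)
levels_x = [1.2, 0.8, 1.5, 0.9, 0.7, 1.1]
levels_y = [1.1, 0.9, 1.3, 1.2, 0.6, 0.8]
q_xy = estimate_q(levels_x, levels_y, flood_x=1.0, flood_y=1.0)
q_xx = estimate_q(levels_x, levels_x, flood_x=1.0, flood_y=1.0)
print(q_xy)
print(q_xx)
```

In this sketch each row of `q_xy` sums to one, and `q_xx` (the monitor describing its own location) comes out as the unit matrix.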
6.2.2 Definition of the consequences cas
The consequences, cas, of taking an action, a, given a particular state s, form a matrix that
contains the costs associated with having chosen to perform an action according to the
state the decision maker thinks is happening. Table 6-4 presents an example of this matrix
for the case of two actions and two states.
Table 6-4. Definition of the Cas matrix.
Cas | a1: action 1 | a2: action 2
s1: State 1 | Cost of doing a1 when s1 | Cost of doing a2 when s1
s2: State 2 | Cost of doing a1 when s2 | Cost of doing a2 when s2
Naturally, this matrix depends on the type of water system; its definition is not
straightforward because of the hypothetical character of the damages under different
scenarios, especially for extreme states. However, this consequence matrix can be built
according to the judgment of water board experts (see, e.g., van Andel 2009 p. 116-118).
In order to clarify the calculations, a numerical example of the procedure to calculate the
Value of Information is shown next.
Procedure
Following the flowchart presented in Figure 2.3, a numerical example of the calculation
procedure is presented assuming that the calculation of Table 6-1 and Table 6-2 for a
given point in the system yields, respectively:
\[
\pi_s = \begin{bmatrix} 0.925 \\ 0.075 \end{bmatrix}, \qquad
q_{m,s} = \begin{bmatrix} 0.75 & 0.25 \\ 0.10 & 0.90 \end{bmatrix}
\]
Suppose also that the matrix of consequences Cas (explained in detail further) is:
\[
c_{as} = \begin{bmatrix} -5 & -100 \\ -30 & 0 \end{bmatrix}
\]
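First, the utility of the action that would have been chosen without information, u(a0,πs),
is calculated as the maximum utility given by performing each action a based on the prior
beliefs πs:
\[
u(a_0, \pi_s) = \max_a \left\{ \sum_s c_{as}\,\pi_s \right\}
= \max \left\{ \begin{bmatrix} -5 & -100 \\ -30 & 0 \end{bmatrix}^{T}
\begin{bmatrix} 0.925 \\ 0.075 \end{bmatrix} \right\}
= \max \{ -6.875,\; -92.5 \} = -6.875
\]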
Then, the posterior probabilities are calculated and a total of M*S values are obtained
(one for each combination of states s and messages m):
\[
\pi_{s,m} = \frac{q_{m,s}\,\pi_s}{\sum_s q_{m,s}\,\pi_s}
= \begin{bmatrix}
\dfrac{0.925 \cdot 0.75}{0.925 \cdot 0.75 + 0.075 \cdot 0.10} &
\dfrac{0.925 \cdot 0.25}{0.925 \cdot 0.25 + 0.075 \cdot 0.90} \\[2ex]
\dfrac{0.075 \cdot 0.10}{0.925 \cdot 0.75 + 0.075 \cdot 0.10} &
\dfrac{0.075 \cdot 0.90}{0.925 \cdot 0.25 + 0.075 \cdot 0.90}
\end{bmatrix}
= \begin{bmatrix} 0.989 & 0.774 \\ 0.011 & 0.226 \end{bmatrix}
\]
Next, the expected utility, the probability-weighted average of the utilities of the
associated consequences, is estimated:
\[
u(a, \pi_{s,m}) = \sum_s c_{as}\,\pi_{s,m}
= \begin{bmatrix} -5 & -100 \\ -30 & 0 \end{bmatrix}^{T}
\begin{bmatrix} 0.989 & 0.774 \\ 0.011 & 0.226 \end{bmatrix}
= \begin{bmatrix} -5.267 & -10.648 \\ -98.931 & -77.406 \end{bmatrix}
\]
The decision-maker will choose the action that gives the maximum utility for all possible
states, so:
\[
u(a_m, \pi_{s,m}) = \max_a \left\{
\begin{bmatrix} -5.267 & -10.648 \\ -98.931 & -77.406 \end{bmatrix} \right\}
= \begin{bmatrix} -5.267 & -10.648 \end{bmatrix}
\]
Then, the value of each message is calculated as the difference of the utilities of the
action am that is chosen given the message m, and the utility of the action a0 that would
have been chosen before additional information:
\[
\Delta_m = u(a_m, \pi_{s,m}) - u(a_0, \pi_{s,m})
= \begin{bmatrix} -5.267 & -10.648 \end{bmatrix} - (-6.875)
= \begin{bmatrix} 1.608 & -3.774 \end{bmatrix}
\]
Finally, the Value of Information is the expected utility of the new information:
\[
VOI = \sum_m q_m \Delta_m
= (0.925 \cdot 0.75 + 0.075 \cdot 0.10) \cdot 1.608
+ (0.925 \cdot 0.25 + 0.075 \cdot 0.90) \cdot (-3.774)
\]
\[
VOI \approx 0
\]
In this case, the VOI with the given qm,s is zero because the additional information does
not change the action that would have been chosen, so it brings no gain in expected
utility. This is due to several factors. First, the decision-maker's prior belief has a tight
distribution (his confidence about the state of the system is very high). Second, the
perceived quality of the information service given by the matrix qm,s makes it useless.
Indeed, 25% of the time the message will incorrectly reject the true state (there is a flood
and the message does not show it), while 10% of the time it will incorrectly reject the
false state (there is no flood but the message says there is). Third, the outcome of the
message is in line with what the decision maker believes.
[Contour plot: VOI (scale 0 to 22) for prior and conditional probabilities ranging from 0.01 to 0.99]
Figure 6-1. Variation of the Value of Information when changing the prior probability πs and the
conditional probabilities qm,s for the consequence matrix shown in Table 6-6.
If the same exercise is repeated assuming that the decision maker is completely ignorant
of the state of his/her system (that is, πs = 0.5 for each state), then VOI = 1.62.
Additionally, if the quality of the information service is improved (i.e., qm,s = [1 0 ; 0 1]),
then VOI yields 15.00. The variations of the VOI for the consequence matrix shown in
Table 6-6 are presented in Figure 6-1, where the conditional probabilities qm1,s1 and qm1,s2
are presented on the x-axis and the prior probabilities πs on the y-axis. It can be observed
that the maximum VOI is 22.00 for the given consequence matrix, which is achieved only if the
quality of the message is perfect.
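The numerical example of this section can be reproduced with a short sketch. It is an illustration only, under stated assumptions: the function name `value_of_information` is hypothetical, the arrays are laid out with states as rows (actions or messages as columns), and the value of each message is taken relative to the prior utility of a0, as in the numbers of the worked example.

```python
import numpy as np

def value_of_information(c_as, prior, q_ms):
    """VOI for consequences c_as[s, a], prior[s] and q_ms[s, m] = P(m | s)."""
    c_as = np.asarray(c_as, dtype=float)
    prior = np.asarray(prior, dtype=float)
    q_ms = np.asarray(q_ms, dtype=float)

    # Utility of the action chosen without information
    u_a0 = (c_as.T @ prior).max()

    # Bayes' rule: joint P(s, m), message probabilities P(m), posterior P(s | m)
    joint = q_ms * prior[:, None]
    q_m = joint.sum(axis=0)
    posterior = np.divide(joint, q_m, out=np.zeros_like(joint), where=q_m > 0)

    # Utility of the best action for each message, and the value of each message
    u_am = (c_as.T @ posterior).max(axis=0)
    delta_m = u_am - u_a0

    # Expected value of the messages (a message with q_m = 0 has zero weight)
    return float(np.sum(q_m * delta_m))

c_as = [[-5, -100], [-30, 0]]       # rows: s1, s2; columns: a1, a2
q_ms = [[0.75, 0.25], [0.10, 0.90]]

voi_confident = value_of_information(c_as, [0.925, 0.075], q_ms)
voi_ignorant = value_of_information(c_as, [0.5, 0.5], q_ms)
voi_perfect = value_of_information(c_as, [0.5, 0.5], np.eye(2))
print(voi_confident, voi_ignorant, voi_perfect)
```

With these inputs the sketch gives a VOI of zero (up to rounding) for the confident prior, 1.625 for the uninformed prior, and 15.0 when the information service is perfect, in line with the values quoted above.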
6.3 Value of the location for one monitor
For the subsequent discussion, a series of propositions are presented, taking into account
that x is the location in a water system where a monitoring device is placed to provide
messages about the possible states of any point, y, on which a decision maker bases his
water management related decisions. For simplicity, it is considered, in principle, that the
water system is a linear canal.
• Proposition 1
The value a decision-maker is willing to pay for a monitor located at x to know the state
at any other point y is given by Vx(y) = VOI(cas, πs, qmx,sy).
Vx(y) is a single value if y is one point and is a vector if it is calculated for all points y in
the water system (see Figure 6-2).
• Proposition 2
In order to know the state of the world at point x, the most convenient point to place a
monitor is precisely (and obviously) the point x.
Proposition 2 implies that Vx(x) is a maximum, that is, there exist no other locations
where a monitor can provide a higher value about the state of the system at x (see Figure
6-2). This is because the conditional probabilities qmx,sx estimated as in Table 6-3 yield
the unit matrix.
Moreover, if Vx(y) is calculated individually for all points y in the system, then the curve
for Vx, shown in Figure 6-3(a), is obtained. The curve has a maximum at x and decreases
progressively as y moves away from x, because some of the messages produced at x do
not coincide with the states at y and the conditional probabilities qmx,sy no longer yield
the unit matrix. Therefore, the mean Value of Information that the location
x gives about the state of the entire system, VOIx, is defined as the area below the Vx
curve along the canal of length L:
\[
VOI_x = \int V_x(y)\,dl \tag{6-1}
\]
[Diagram labels: Vy(y) = Vx(x) = f(casx, πsx, qmx,sx); Vx(y) = Vx(y≠x) = f(casy, πsy, qmx,sy); axes: VOI against y]
Figure 6-2. Definition of Vx(x) and Vx(y)
However, from the practical viewpoint, the water system will always have a finite set of
N points and then the resulting curve Vx is a discrete one (Figure 6-3(b)); therefore, the
mean Value of Information that the location x gives about the state of the entire system
can be defined as:
\[
VOI_x = \sum_{y=1}^{N} V_x(y) \tag{6-2}
\]
Figure 6-3. Definition of Vx(y), Vx and VOIx for a monitor located at x to give the state of the
system at y for infinite (a) and finite (b) number of calculation points y
6.4 Value of the locations for two monitors
The best place to locate a monitor to know the state at point y is the point y itself, and any
other monitor located at a and used to know the state of the system at y has a value Va(y),
which is lower than Vy(y). Now, if an individual needs an additional monitor located at b
to know better the state at the same point y then, first of all, he should not pay more than
Vb(y) for it. Additionally, as the existing monitor at a is already giving some information
about y with value Va(y), then the maximum value the individual should be willing to pay
for the additional monitor located at b is Vb(y) - Va(y); see Figure 6-4. This approach is
in agreement with the fact that the value an individual is willing to pay for placing
additional monitors to know the state at the same point y is progressively lower as new
monitors are added.
Figure 6-4. Value of two monitors
Naturally, the difference Vb(y) - Va(y) should be positive, and should not be bigger than
Vy(y); in other words, the ideal location b to place a second monitor is such that
Vb y Va y Vy y . This is the basis of the approach described below.
6.5 Selection of monitor locations based on VOI
In this section the value of a monitor as described in the previous section is used to
develop the method for selecting the best locations for monitoring a water system. The
method is presented in steps involving one monitor at a time.
6.5.1 Locating one new monitor
Proposition 1 introduced in Section 6.3 states that the maximum value a decision-maker
is willing to pay for a monitor to be located at x in order to know the state at point y is
given by Vx(y). If the exercise is repeated allowing the point x to move along the water
system, a family of curves Vx (and therefore a vector of VOIx) is obtained. It follows that
the best place to locate one monitor is where it can provide messages about the state of
the maximum number of points in the system; therefore such a monitor should be located
at x where it has the largest VOIx, namely:
\[
\max \{ VOI_x \} = \max \left\{ \int V_x(y)\,dl \right\} \tag{6-3}
\]
Intuitively, the monitor located at x3 in Figure 6-5 is the one that is more valuable to
capture the state of the largest part of the system, even though Vx1 has the maximum VOI
among the three monitors considered. For this reason, the best monitor is one that
provides the maximum area (or maximum averaged area) below its curve Vx.
Figure 6-5. Selection of the best monitors out of three possibilities
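For a finite set of calculation points, the selection of one monitor reduces to summing each V curve and taking the largest sum. The sketch below assumes that the curves are already stored in a hypothetical matrix V with V[x, y] = Vx(y); the function name and the three made-up curves (which only mimic the situation of Figure 6-5) are illustrative.

```python
import numpy as np

def best_single_monitor(V):
    """Return the index x with the largest VOIx and the vector of all VOIx.

    V[x, y] is assumed to hold Vx(y), the value of a monitor at x for
    describing the state at y.
    """
    V = np.asarray(V, dtype=float)
    voi_x = V.sum(axis=1)             # discrete area below each curve Vx
    return int(np.argmax(voi_x)), voi_x

# Three made-up candidate curves over five points y
V = np.array([
    [10.0, 2.0, 0.0, 0.0, 0.0],   # highest peak, narrow coverage
    [0.0, 6.0, 3.0, 0.0, 0.0],
    [2.0, 4.0, 6.0, 4.0, 2.0],    # lower peak, wide coverage
])
best, voi_x = best_single_monitor(V)
print(best, voi_x)
```

In this example the third candidate is selected even though the first one has the highest peak, because its curve covers more of the system.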
6.5.2 Locating two new monitors
Following the reasoning presented in Section 6.4, the simultaneous location of two
monitors a and b is an optimization problem that must be solved as presented in Eq. (6-4).
This procedure looks for two points in the system such that the VOI of the monitor a
plus the positive area between the curves Va and Vb (Aab) is a maximum (see Figure 6-6).
\[
\max \{ VOI_a + A_{ab} \}, \qquad
A_{ab} = \int (V_b - V_a)\,dl; \quad \{ V_b(y) - V_a(y) \} > 0 \tag{6-4}
\]
Figure 6-6. VOI-related areas to optimise the monitor locations a and b, Eq. (6-4)
6.5.3 Locating three new monitors
The same procedure for two monitors is used in simultaneously locating three monitors a,
b and c. The mathematical expression for the optimization problem is presented in
Eq.(6-5), and depicted in Figure 6-7.
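\[
\begin{aligned}
& \max \{ VOI_a + A_{ab} + A_{abc} \} \\
& VOI_a = \int V_a\,dl \\
& A_{ab} = \int (V_b - V_a)\,dl; \quad \{ V_b(y) - V_a(y) \} > 0 \\
& A_{abc} = \int (V_c - V_b - V_a)\,dl; \quad \{ V_c(y) - V_b(y) - V_a(y) \} > 0
\end{aligned}
\tag{6-5}
\]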
Figure 6-7. VOI-related areas to optimise the monitor locations a, b and c, Eq. (6-5).
6.5.4 Locating N new monitors
The generalization of the optimization problem for the case of N monitors is given by Eq.
(6-6):
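\[
\begin{aligned}
& \max \{ VOI_a + A_{ab} + A_{abc} + \dots + A_{abc \ldots N} \} \\
& VOI_a = \int V_a\,dl \\
& A_{ab} = \int (V_b - V_a)\,dl; \quad \{ V_b(y) - V_a(y) \} > 0 \\
& A_{abc} = \int (V_c - V_b - V_a)\,dl; \quad \{ V_c(y) - V_b(y) - V_a(y) \} > 0 \\
& A_{abc \ldots N} = \int (V_N - \dots - V_c - V_b - V_a)\,dl; \quad
\{ V_N(y) - \dots - V_c(y) - V_b(y) - V_a(y) \} > 0
\end{aligned}
\tag{6-6}
\]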
The application of the procedure for monitor location is described in the case studies
given in the following section.
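The objective of Eq. (6-6) can be evaluated for any ordered set of candidate monitors once the V curves are available. The sketch below is illustrative only: the matrix V (with V[x, y] = Vx(y)), the function names and the exhaustive search over ordered tuples are assumptions made for the example, and exhaustive search is practical only for a small number of candidate points.

```python
from itertools import permutations

import numpy as np

def network_value(V, monitors):
    """Objective of Eq. (6-6) for an ordered tuple of monitors (a, b, c, ...)."""
    V = np.asarray(V, dtype=float)
    first, *rest = monitors
    total = V[first].sum()              # VOI_a: area below the first curve
    subtracted = V[first].copy()        # running sum Va + Vb + ... of earlier monitors
    for k in rest:
        diff = V[k] - subtracted        # e.g. Vc - Vb - Va for the third monitor
        total += diff[diff > 0].sum()   # only the positive part adds value
        subtracted += V[k]
    return float(total)

def best_network(V, n):
    """Exhaustive search over ordered n-tuples of candidate points (small systems only)."""
    return max(permutations(range(len(V)), n), key=lambda m: network_value(V, m))

# Three made-up candidate curves over five points y
V = np.array([
    [10.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 6.0, 3.0, 0.0, 0.0],
    [2.0, 4.0, 6.0, 4.0, 2.0],
])
pair = best_network(V, 2)
print(pair, network_value(V, pair))
```

For the made-up curves above, the best pair combines the first and the third candidates, with an objective value of 26.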
6.6 Case studies
In order to test the VOI approach for siting monitors, experiments in three different water
systems are presented here. The first experiment is carried out for a simple, hypothetical
canal that is controlled by a pump at one end of its reach. Subsequently, the method is
applied to two real water systems of very different nature: the Magdalena River, the
most important river of Colombia, and the canal network of the Pijnacker polders in The
Netherlands.
6.6.1 Canal and pump
Consider a canal that receives the drainage of a big polder area. At the end of the canal
there is a pump station to drain the water out of the polder. The pump operation is based
on water levels measured at its suction side. In order to see how the VOI approach works,
a particular flood level is defined, which changes linearly from the upstream end, where
flooding always occurs, to the downstream end where flooding never occurs (see Figure
6-8). The actions, states and messages to evaluate the value of information are presented
in Table 6-5.
[Plot: elevation (m) against calculation point (1 to 101); series: max water level, mean water level, min water level, flood level definition]
Figure 6-8. Definition of flood levels for the canal-pump experiment compared to the minimum,
mean and maximum water levels obtained by the model
Table 6-5. Definition of actions, states and messages for the canal-pump case
Actions to choose from | Possible 'states of the world' | Messages (from the monitor at x)
a1: Pump On | s1: Flood (anywhere) | m1: Danger
a2: Do Nothing | s2: No flood (anywhere) | m2: Normal
The consequences, cas, of taking an action, a, given a particular state s, are assumed to be
constant for every point of the system. Due to the lack of data, it is also assumed that, on
the one hand, the cost of releasing a warning when there is flooding is -$5 (the costs
associated with communication and mobilization) and is -$30 if there is no flooding (the
damage costs of having turned the pump on when no flooding occurs). On the other hand,
the cost of doing nothing when there is flooding is -$100 (a relatively high cost due to the
disaster associated with a flood that is made worse by the fact that the pump was not
turned on) and is $0 if there is no flooding (see Table 6-6). This is the same consequence
matrix used in the numerical example shown at the end of section 6.2.
Table 6-6. Consequences of doing action a given state s (costs units)
cas | a1 | a2
s1 | -5 | -100
s2 | -30 | 0
Results
Before proceeding with the location of the monitors, an analysis of the value that a
monitor located at x provides when x changes along the canal, is presented. For this
analysis, the prior probabilities, or the beliefs about the states before receiving additional
information (estimated as shown in Table 6-1), are depicted in Figure 6-9:
Figure 6-9. Prior beliefs πs estimated with the Table 6-1
Additionally, the VOIx presented in Eq.(6-2), is calculated for each point in the system.
The results are shown in Figure 6-10, where the flood level definition and the maximum
and minimum water levels are also presented:
From Figure 6-10 it can be seen that the calculation points which are always flooded (1 to
4) and the calculation points which are never flooded (93 to 102) have no value, because
the decision-maker's experience (replicated by the analysis of each time series as shown
in Table 6-1) shows that he is completely certain about the state of the system in these
areas (πs = [0;1] upstream and πs = [1;0] downstream); see Figure 6-9.
[Plot: elevation (m) and VOI against calculation point (1 to 101); series: max water level, min water level, flood level definition, mean VOI]
Figure 6-10. Mean of Eq. (6-2) for all x in the water system and zoomed curve for the no-flood
area
Additionally, the point with the maximum VOIx is located between points 40 and 43,
where πs is very close to the vector [0.5;0.5] (Figure 6-9), implying that the decision
maker is completely uncertain about the state of the system at these points.
Location of one monitor
The solution of Eq. (6-3) yields point 43 as the location that provides the best description
of the state of the system. Its curve V43 is shown in Figure 6-11.
[Plot: Value of Information against calculation point along the canal]
Figure 6-11. V curve for the calculation point with the highest VOIx (point 43)
It can be observed that this point is not able to produce reliable messages about the state
of the system upstream of point 31 and downstream of point 81. The curve has a constant
value at the top because at those points the expected value of the action taken without
information a0 increases and the expected value of the action a to be taken after receiving
the new information decreases, so their difference yields the same numerical value.
Location of two monitors
Similarly, the solution of Eq. (6-4) yields points 37 and 57 as the pair of points that
together are the most valuable to describe the state of the water system. The curves V37
and V57 are presented in Figure 6-12. Point 43, selected for the case of one monitor, is
not selected here because 37 and 57 together provide reliable messages about the state of
the entire system. It can be seen that V57(57) = 19.30 is larger than V43(43) = 15.36, but the
latter provides information for more points in the system than the former. This fact,
however, is modified by the addition of the second monitor 37, with a maximum value of
V37(37) = 13.51.
[Plot: Value of Information against calculation point along the canal]
Figure 6-12. V curves for the monitors 37 and 57, after solving Eq. (6-4)
Location of three monitors
The solution of Eq. (6-5) yields points 4, 38 and 65 and Figure 6-13 shows their V curves.
The selection of point 65 makes sense because V65(65)= 21.69, a value that is very close
to the maximum possible for the considered consequence matrix (see Figure 6-1). Point
38 does the same job as the previously selected point 37, which is to provide information
about the state of the system where there is high uncertainty (in the middle of the reach).
Finally, the point 4 is selected as being complementary to points 65 and 38. This point
covers the area of the canal where there is high certainty, and this explains why its
maximum value V4(4) is so low (0.14).
[Plot: Value of Information against calculation point; curves V4, V38 and V65]
Figure 6-13. V curves for the monitors 4, 38 and 65, after solving Eq. (6-5)
6.6.2 Magdalena River, Colombia
In this case study the location of flow monitors is considered. Details of the Magdalena
River and the developed hydrodynamic model are presented in Chapter 4. The actions,
states and messages required for the VOI analysis are developed here. First, the set of
actions are a1: Release a flood warning; a2: Do nothing. Second, the set of possible states
are s1: Flooding, s2: Normal. Finally, the messages the monitors can provide are m1 =
“Danger”, m2 = “Relax”. For comparison purposes, the consequences, cas, shown in Table
6-7 are the same as those used in the canal-pump example.
Table 6-7. Consequences of doing action a given state s (costs units)
cas | a1 | a2
s1 | -5 | -100
s2 | -30 | 0
So far as costs are concerned, the cost of releasing a warning when there is flooding is
-$5 (the costs associated with communication and mobilization) but is -$30 if there is no
flooding (the political costs of having mobilized people under a false alarm).
Correspondingly, the cost of doing nothing when there is flooding is -$100 (a high cost
due to the disaster associated with a flood for which no warning was issued), but is $0 if
there is no flooding.
Celerity and lagged time series
In the case of the canal and pump presented in Section 6.6.1 the definition of the
conditional probabilities qm,s (Table 6-3), is based on the premise that a message at point
x may have some value for providing the state of the system at a point y, both states
occurring simultaneously. Although this is a correct principle for flat water systems, the
estimation of the conditional probabilities qm,s for the Magdalena River must additionally
take into account the motion of the kinematic wave in the river.
Indeed, in a non-flat system such as a river, a message produced at time t = t0 from a
monitor located at x might provide more value for indicating the state of the system at a
downstream point y some time in the future (t = t0 + Δt) than the value it might provide at
time t = t0. Therefore, in order to account for this issue, we define the matrix qm,s as
follows:
Table 6-8. Definition of the situations for estimation of qm,s for the Magdalena River case
Situation | Message at x at time t0 | State at y at time t0+Δt
1 | "Danger" | "Flooding"
2 | "Danger" | "No Flooding"
3 | "Normal" | "Flooding"
4 | "Normal" | "No Flooding"
The new matrix qm,s is shown in Table 6-9.
Table 6-9. Definition of conditional probabilities qm,s according to the situations
presented in Table 6-8.
qm,s | m1: Danger (at x) at time t0 | m2: Normal (at x) at time t0
s1: Flood at y (at time t0+Δt) | Number of occurrences of situation 1 / Number of "Flooding" at y | Number of occurrences of situation 3 / Number of "Flooding" at y
s2: No Flood at y (at time t0+Δt) | Number of occurrences of situation 2 / Number of "No Flooding" at y | Number of occurrences of situation 4 / Number of "No Flooding" at y
Three of the situations described in Table 6-9 are depicted in Figure 6-14, where the
position of a kinematic (flood) wave is shown at different times. In brief, the message
provided by x at t = t0 successfully describes the state of the system at y at time t = t2;
however, it does not successfully describe the state of the same point at time t = t1.
It can be seen that the definition of the critical level for flooding (i.e., the threshold that
defines the possible states of the system), has an important effect on the quality of the
messages that are produced by x about the state of the system at y.
From a logical point of view, Table 6-9 is estimated by lagging the time series obtained
from the hydrodynamic model at all computational points. The lag time used is equal to
the travel time T of the kinematic wave, which is estimated by:
T = dx / c    (6-7)
where dx is the distance between two consecutive computational points and c is the
celerity of the kinematic wave.
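The lagging procedure can be illustrated with a minimal Python sketch. It assumes flow series at x and y sampled at a fixed time step, one threshold per point, and a lag obtained from the total distance between x and y (the sum of the dx intervals between them). The function names, the treatment of c = 0 as zero lag and the handling of states that never occur are assumptions made for illustration, not the implementation used in this work.

```python
import numpy as np

def travel_time_steps(distance_m, celerity_ms, dt_s):
    """Lag, in whole time steps, for a wave travelling distance_m at celerity_ms."""
    if celerity_ms <= 0:
        return 0  # assumption: c = 0 is treated as the unlagged (simultaneous) case
    return int(round(distance_m / celerity_ms / dt_s))

def lagged_q(flow_x, flow_y, thr_x, thr_y, lag):
    """2x2 matrix q[m, s] in the layout of Table 6-9.

    Rows: m1 'Danger', m2 'Normal' at x (time t0).
    Columns: s1 'Flooding', s2 'No Flooding' at y (time t0 + lag).
    """
    flow_x = np.asarray(flow_x, dtype=float)
    flow_y = np.asarray(flow_y, dtype=float)
    n = len(flow_x) - lag
    if n <= 0:
        raise ValueError("lag is longer than the time series")
    danger = flow_x[:n] > thr_x           # message at x at time t0
    flood = flow_y[lag:lag + n] > thr_y   # state at y at time t0 + lag
    q = np.zeros((2, 2))
    for j, state in enumerate((flood, ~flood)):
        total = state.sum()
        if total == 0:
            continue  # state never occurs: column is left as zeros
        q[0, j] = (danger & state).sum() / total
        q[1, j] = (~danger & state).sum() / total
    return q
```

For a wave that reaches y exactly one time step after passing x, a lag of one step gives an identity matrix (every message matches the later state), whereas a lag of zero does not.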
Figure 6-14. VOI and the effect of lagged time series
Definition of thresholds for the states of the Magdalena River
The states of the Magdalena River are defined for different thresholds in terms of
percentiles at each computational point. The purpose of this definition is to let the
decision maker choose what type of state they wish to monitor. For instance, a threshold
of 80% means that the monitors are placed so as to provide the best information about the
flows above the 80th percentile of the flow time series at each point.
In this way, the sensitivity of the value of information to different thresholds can be
investigated. However, in practice, real thresholds, defined by the water boards and
decision makers, should be used.
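A percentile-based state definition of this kind can be sketched in a few lines of Python. The array layout (time steps in rows, computational points in columns) and the function name are assumptions made for illustration.

```python
import numpy as np

def states_from_percentile(flows, percentile):
    """Boolean state array for a percentile-based threshold.

    flows: array of shape (n_times, n_points).
    Returns True where the flow exceeds the given percentile of the
    time series at that computational point.
    """
    flows = np.asarray(flows, dtype=float)
    thresholds = np.percentile(flows, percentile, axis=0)  # one threshold per point
    return flows > thresholds

# Synthetic single-point series with the values 1 to 10.
flows = np.arange(1, 11, dtype=float).reshape(10, 1)
print(states_from_percentile(flows, 80).sum())  # records above the 80th percentile
print(states_from_percentile(flows, 20).sum())  # records above the 20th percentile
```

In this synthetic series, two of the ten records exceed the 80% threshold and eight exceed the 20% threshold, which illustrates that a high threshold defines a rarely occurring state and a low threshold a frequently occurring one.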
Results
The results are shown for the location of one, two and three monitors in the Magdalena
River, under different celerity values and different state definition thresholds. Before
presenting the results, it is worth analysing the mean Value of Information VOIx
estimated with Eq. (6-2), because it provides a picture of the most important zones in
terms of monitoring the state of the entire system. Figure 6-15, Figure 6-16 and Figure
6-17 have been prepared using state thresholds of 80%, 50% and 20% respectively, and
for different values of celerity.
First, from Figure 6-15 it can be observed that the tributaries and the wetlands affect the
variation of VOIx in different ways; in some cases they increase VOI (for example the
rivers Negro, Nare and Lebrija and wetland W4) and in other cases they decrease it (for
example the rivers Sogamoso and Cauca and the wetlands W1 and W2). In general, the
region between the Lebrija River and the wetland W2 is the one that provides the highest
quality of the messages about the state of the entire system. This is consistent with the
fact that this zone divides the river into two reaches: an upstream reach, where the
majority of the tributaries flow into the river, and the wetland reach, where the river slope
decreases sharply to a low value.
[Figure: curves for c = 0, 0.5, 1.0, 1.5, 2.0, 2.5 and 3.0]
Figure 6-15. Mean Value of Information estimated with Eq. (6-2) for different values of celerity in
the Magdalena River, using a state threshold definition of 80%.
Second, from Figure 6-16 for a state threshold definition of 50% it can be observed that
the mean value of information is more sensitive to smaller disturbances in discharge; this
is especially evident in the upstream reach, in particular for the rivers Cocorná (between
the rivers Miel and Nare), Regla (between the rivers Nare and Carare) and Opón
(between the rivers Carare and Sogamoso). Similarly, the Cesar River becomes important
in the wetland zone, located after the wetland W2 (see Figure 6-16 and Figure 4-2).
[Figure: curves for c = 0, 0.5, 1.0, 1.5, 2.0, 2.5 and 3.0; horizontal axis: calculation point, river]
Figure 6-16. Mean Value of Information estimated with Eq. (6-2) for different values of celerity in
the Magdalena River, using a state threshold definition of 50%.
Finally, for a state threshold of 20% (see Figure 6-17), the VOIx curves for different
celerity values show that the downstream reach of the river (after the connection of the
Mompox branch) becomes as important as the zone between the Lebrija River and the
wetland W2. Note that the mean value of information at the wetland zone between W2 and
W3 decreases significantly, implying that, on average, almost any location is unable to
provide reliable messages about the state of the system in this particular zone.
Conversely, any monitor located between W2 and W3 will be unable to provide reliable
messages about the state of the entire system. In general, the shape of the curves follows
the same behaviour as for the case of the state threshold of 50%, where small increments
of flow influence the mean value of information. However, the differences between the
curves with different celerity values are less significant than for the cases with thresholds
of 50% and 80%.
A very important feature revealed by Figure 6-15, Figure 6-16 and Figure 6-17 is that the
mean value of information drops as the state threshold decreases. This is
because the definitions of the state of the system given in Table 6-8 and Table 6-9 imply
that only the excess of flow is of interest in defining a flood, and therefore in deciding
whether to release a warning or not. Thus, if the flood threshold is defined as the 50th percentile of
the historic flow at every computational point, then the current state of the system can be
deduced with more confidence than for the case of having a flood threshold of 80%.
Alternatively, a threshold of 20% implies that 80% of the time the river is going to
exceed the 20th percentile of the flow at every computational point. Naturally, a very low
threshold indicates that the river is flooded frequently everywhere so that the state of the
system is reasonably known and therefore the need to site monitors for this case is
reduced. Conversely, it is more difficult to know if the river is going to exceed a
threshold that is rarely reached and in this case the monitors will provide a larger value of
information.
[Figure: curves for c = 0, 0.5, 1.0, 1.5, 2.0, 2.5 and 3.0]
Figure 6-17. Mean Value of Information estimated with Eq. (6-2) for different values of celerity in
the Magdalena River, using a state threshold definition of 20%.
In practice, the state definition threshold should be replaced by the real critical flood
levels in order to locate the monitors properly.
Location of monitors
The optimisation problem for the location of one, two and three monitors posed in
expressions (6-3), (6-4) and (6-5) is solved for celerity values between 0.5 m/s and 3 m/s.
Additionally, three different state thresholds (80%, 50% and 20%) are selected in order to
take into account different monitoring objectives. For example, the first threshold is for
selecting the most appropriate locations for monitoring extreme events; the second is for
selecting the locations that provide high quality messages about the “normal” state of the
river (when the river exceeds the historic mean discharge at any point); lastly, the third
threshold is to identify the monitor locations for low-flow monitoring. The results are
summarized in Figure 6-18, Figure 6-19 and Figure 6-20.
Figure 6-18. Results for one, two and three monitor locations for different celerity values and 80%
state threshold definition
Figure 6-19. Results for one, two and three monitor locations for different celerity values and 50%
state threshold definition
[Figure: panels for one, two and three monitors at celerity values of 0.5, 1.0, 1.5, 2.0, 2.5 and 3.0 m/s]
Figure 6-20. Results for one, two and three monitor locations for different celerity values and 20%
state threshold definition
From Figure 6-18 it can be concluded that for one monitor the point that provides the best
quality of the message about the state of the entire system moves upstream as the celerity
increases. However, the monitor is always located in the zone between the point of
discharge of the Lebrija River and the wetland W2, described previously. For a celerity
value of 1 m/s a better distribution of the VOI is found, that is, more points in the river are
described with one monitor. This may be the physical celerity value for the Magdalena
River. However, a single monitor is insufficient to obtain messages about
the state of the system at the wetlands and at the downstream part of the river.
From Figure 6-15 the location of two monitors for different celerity values appears
sensible. Once again, the zone between the discharge of the Lebrija River and the wetland
W2 is selected because it has the highest VOIx value. The second region to place a monitor
is downstream after the wetland zone, supporting in some way the job of the first
monitor.
The result for three monitors can also be anticipated. It can be summarised as placing
one monitor before, one within and one after the wetland zone. Naturally, the second monitor is
placed near the discharge point from the Cauca River. When locating two and three
monitors the effect of the celerity is not significant.
As mentioned above, the value of information decreases as the threshold decreases. This
can also be observed in Figure 6-18, Figure 6-19 and Figure 6-20. From these figures it
can also be concluded that the monitor locations do not change substantially when using
different celerity values, because they are determined largely by the three well-defined
VOI zones into which the river is divided: the upstream reach, which can be monitored at a
single location downstream; the wetland zone, where important discharge fluctuations take
place due to the wetland-river connections and the discharge from the Cauca River; and the
downstream reach, whose state can be monitored by a single point upstream.
6.6.3 Pijnacker region, The Netherlands
The VOI-based method for monitor location is also applied to the case study described in
Chapter 3. For reasons related to computational effort, this section includes two different
experiments: one with simplified inputs for the complete Pijnacker water system, and one
with more complex inputs in a smaller system.
Entire Pijnacker polder system
For the first experiment, the same simplified inputs as for the canal-pump case are used
and therefore the states, actions and messages are as shown in Table 6-5. Also, the
consequence matrix of Table 6-4 is adopted. However, as the flood level information is
available this was used for the experiment below.
The mean Value of Information for the entire Pijnacker polder system was estimated with
Eq. (6-2), obtaining the VOI map shown in Figure 6-21.
Figure 6-21. Mean VOI in the Pijnacker water system, with simplified inputs.
It can be observed that an important percentage of the points provide no value in
terms of describing the state of the entire system. Also, the VOI is dependent on the
location of pumps and weirs. It is interesting that most of these zero-VOI points are
located at the most elevated parts of the system.
The locations of one and of two monitors were determined using the procedure described
in Section 6.5 and are shown in Figure 6-22.
Figure 6-22. Location of one (a) and two (b) monitors for the Pijnacker water system.
It is noted that the solution of Eq. (6-3) for locating one monitor (Figure 6-22-a) yields
two different results, while the solution of Eq. (6-4) for locating two monitors (Figure
6-22-b) yields seven separate results. One of the solutions is shown in both figures.
It is interesting that, for the case of two monitors, the location obtained in the
one-monitor analysis is also selected, and that after the placement of two monitors an
important number of points were still not covered from a VOI point of view. This means
that in this water system, where the water level has a number of discontinuities due to
weirs and pumps, it is too ambitious to expect that the state of the entire system can be
described with only two points. Although solving Eq. (6-5) for the location of three
monitors is not feasible because of the computational resources needed for the
number of calculation points under consideration, the division of the system into several
continuous subsystems can be a way to overcome this issue. For this reason, the
experiment was repeated for a subsystem, a simplification that allows more detailed
inputs to be used, as explained below.
Selected subsystem within Pijnacker polder system
The selected subsystem is located at the North-East part of the Pijnacker water system.
The water level in this area is controlled by four weirs and three pumps. The land is used
mainly for pasture, but glasshouses and urban developments also exist. This subsystem
was modelled with 65 calculation points (see Figure 6-23).
[Figure: map showing land uses (urban, glasshouse, pasture), weirs W1 to W4 and pumps P1 to P3]
Figure 6-23. Selected subsystem of the Pijnacker polder system
The availability of the model data allows three new features to be introduced in the
procedure for locating the monitors: more than two possible states of the system;
different land uses (urban, glasshouse and pasture), each with its own damage function
(consequence matrix); and experience-based consequence matrices. These three aspects,
discussed with staff members of the Delfland Waterboard in November 2009, are summarized
in Table 6-10 and Figure 6-24, where four states, namely severe flood, flood, normal and
drought, are considered. Note, however, that the consequences shown in Table 6-10 do not
represent monetary values but relative costs (the norm is 1000 units, which is the
reference for the worst-case scenario of doing nothing when a severe flood is present).
Table 6-10. Table of consequences Cas for different land uses for the Pijnacker region
                    Urban                   Glasshouse              Pasture
Cas                 a1: Pump1   a2: Pump1   a1: Pump2   a2: Pump2   a1: Pump3   a2: Pump3
                    On          Off         On          Off         On          Off
S1: Severe Flood    500         1000        250         500         50          100
S2: Flood           100         200         50          100         10          20
S3: Normal          0           0           0           0           0           0
S4: Drought         20          10          10          5           2           1
Figure 6-24. Definition of the possible states, land uses and damage function (consequences)
Under these conditions, there are 32 possible situations that may occur in each area of
the water system. These situations are summarized in Table 6-11 below.
Table 6-11. Situations
Situation   Message at x      State at y          State of the pump   Combined
                                                  affecting y         State ID
1           “Severe Flood”    S1: Severe Flood    On                  SF-ON
2           “Severe Flood”    S2: Flood           On                  F-ON
3           “Severe Flood”    S3: Normal          On                  N-ON
4           “Severe Flood”    S4: Drought         On                  D-ON
5           “Severe Flood”    S1: Severe Flood    Off                 SF-OFF
6           “Severe Flood”    S2: Flood           Off                 F-OFF
7           “Severe Flood”    S3: Normal          Off                 N-OFF
8           “Severe Flood”    S4: Drought         Off                 D-OFF
9           “Flood”           S1: Severe Flood    On                  SF-ON
10          “Flood”           S2: Flood           On                  F-ON
11          “Flood”           S3: Normal          On                  N-ON
12          “Flood”           S4: Drought         On                  D-ON
13          “Flood”           S1: Severe Flood    Off                 SF-OFF
14          “Flood”           S2: Flood           Off                 F-OFF
15          “Flood”           S3: Normal          Off                 N-OFF
16          “Flood”           S4: Drought         Off                 D-OFF
17          “Normal”          S1: Severe Flood    On                  SF-ON
18          “Normal”          S2: Flood           On                  F-ON
19          “Normal”          S3: Normal          On                  N-ON
20          “Normal”          S4: Drought         On                  D-ON
21          “Normal”          S1: Severe Flood    Off                 SF-OFF
22          “Normal”          S2: Flood           Off                 F-OFF
23          “Normal”          S3: Normal          Off                 N-OFF
24          “Normal”          S4: Drought         Off                 D-OFF
25          “Drought”         S1: Severe Flood    On                  SF-ON
26          “Drought”         S2: Flood           On                  F-ON
27          “Drought”         S3: Normal          On                  N-ON
28          “Drought”         S4: Drought         On                  D-ON
29          “Drought”         S1: Severe Flood    Off                 SF-OFF
30          “Drought”         S2: Flood           Off                 F-OFF
31          “Drought”         S3: Normal          Off                 N-OFF
32          “Drought”         S4: Drought         Off                 D-OFF
Therefore, there are two associated conditional matrices qm,s, each with 16 elements,
that depend on the current status of the pump downstream of the point under consideration.
For the pump-on status, the qm,s matrix is presented in Table 6-12, whereas the qm,s matrix
for the pump-off status is presented in Table 6-13.
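The construction of the two matrices can be sketched in Python as follows. The sketch assumes that the messages at x and the states at y have already been classified into the four classes and encoded as integers 0 to 3, and that the pump status is available as a boolean series; these encodings and the function name are assumptions made for illustration. The 32 situations correspond to 4 messages x 4 states x 2 pump statuses.

```python
import numpy as np

STATES = ("Severe Flood", "Flood", "Normal", "Drought")

def q_by_pump_status(msg_x, state_y, pump_on):
    """Return (q_on, q_off), each a 4x4 matrix.

    msg_x, state_y: integer sequences with values 0..3 indexing STATES.
    pump_on: boolean sequence, True when the pump affecting y is on.
    q[m, s] = count(message m at x and state s at y) / count(state s at y),
    computed separately for the pump-on and pump-off records.
    """
    msg_x = np.asarray(msg_x, dtype=int)
    state_y = np.asarray(state_y, dtype=int)
    pump_on = np.asarray(pump_on, dtype=bool)
    result = []
    for mask in (pump_on, ~pump_on):
        counts = np.zeros((4, 4))
        # Accumulate one count per record at position (message, state).
        np.add.at(counts, (msg_x[mask], state_y[mask]), 1)
        totals = counts.sum(axis=0)  # occurrences of each state at y
        # States that never occur keep a column of zeros (an assumption).
        q = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
        result.append(q)
    return tuple(result)
```

Each column of a matrix sums to one whenever the corresponding state occurs at least once in the selected records.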
Table 6-12. qm,s matrix for the on-status of the pump downstream of y according to the situations of
Table 6-11
Each entry is n(k) / n(ID), where n(k) is the number of occurrences of situation k and
n(ID) is the number of occurrences of the combined state ID of Table 6-11.
qm,sON                       s1: Severe Flood     s2: Flood            s3: Normal           s4: Drought
                             (at y), pump is on   (at y), pump is on   (at y), pump is on   (at y), pump is on
m1: “Severe Flood” (at x)    n(1) / n(SF-ON)      n(2) / n(F-ON)       n(3) / n(N-ON)       n(4) / n(D-ON)
m2: “Flood” (at x)           n(9) / n(SF-ON)      n(10) / n(F-ON)      n(11) / n(N-ON)      n(12) / n(D-ON)
m3: “Normal” (at x)          n(17) / n(SF-ON)     n(18) / n(F-ON)      n(19) / n(N-ON)      n(20) / n(D-ON)
m4: “Drought” (at x)         n(25) / n(SF-ON)     n(26) / n(F-ON)      n(27) / n(N-ON)      n(28) / n(D-ON)
Table 6-13. qm,s matrix for the off-status of the pump downstream of y according to the situations of
Table 6-11
Each entry is n(k) / n(ID), where n(k) is the number of occurrences of situation k and
n(ID) is the number of occurrences of the combined state ID of Table 6-11.
qm,sOFF                      s1: Severe Flood      s2: Flood             s3: Normal            s4: Drought
                             (at y), pump is off   (at y), pump is off   (at y), pump is off   (at y), pump is off
m1: “Severe Flood” (at x)    n(5) / n(SF-OFF)      n(6) / n(F-OFF)       n(7) / n(N-OFF)       n(8) / n(D-OFF)
m2: “Flood” (at x)           n(13) / n(SF-OFF)     n(14) / n(F-OFF)      n(15) / n(N-OFF)      n(16) / n(D-OFF)
m3: “Normal” (at x)          n(21) / n(SF-OFF)     n(22) / n(F-OFF)      n(23) / n(N-OFF)      n(24) / n(D-OFF)
m4: “Drought” (at x)         n(29) / n(SF-OFF)     n(30) / n(F-OFF)      n(31) / n(N-OFF)      n(32) / n(D-OFF)
Results
With the prior beliefs estimated following Table 6-1, the mean value of information VOIx
is obtained using all the records, and also using only the records for which the
associated pump station (the pump downstream of the current point y) is ON or OFF.
Figure 6-25 presents the results for VOIx under these three scenarios. The most
important observation, apart from the fact that the three maps differ only in a few points, is
that the maximum VOIx is 0.5 units, even though the maximum damage possible is 1000
units, when a severe flood occurs in urban areas (see Table 6-10). This is because the
records are mainly in the normal range at all points.
Figure 6-25. VOIx maps considering different data sets
This implies that monitors will not be so valuable in providing messages about the state
of the entire subsystem. That is, there is such a high confidence in the state of the system,
that it is not worth placing monitors. Nevertheless, for the sake of testing the approach for
monitor location, the expressions (6-3), (6-4) and (6-5) are solved. Because many points
have equal value of information, several solutions were obtained (43 solutions for one
monitor, 303 for two monitors and 343 for three monitors for the pump-related data sets).
The maps in Figure 6-26 show one of the possible solutions for each case.
These results confirm that even though the number of monitors increases, the value of
information about the state of the entire subsystem remains low. However, it is clear that
the monitors tend to concentrate mainly on the urban zone, where the damage function is
the highest.
6.7 Conclusions
The VOI method for locating monitors maximises VOI so that the messages of the monitors
provide information about the state of the entire water system with minimum redundancy
between the monitors.
[Figure: maps for 1, 2 and 3 monitors, for the pump-off and pump-on data sets]
Figure 6-26. Results for the selected subsystem of the Pijnacker water system using calculated
prior beliefs
The VOI approach to locating monitors is very flexible, because it is applicable to any
type of water system, for any water variable and any set of statuses that may occur in the
system. The use of models for data generation is interesting because it permits the
analysis of a dense set of points from which those with the highest VOI can be selected.
There are two main difficulties when applying the method. First, there is the need for a
good definition of consequences or damage functions. Second, a convenient definition of
the states of the system that are to be monitored is required. The first difficulty concerns
the amount of data required to define the damage functions, while the second involves the
clear definition of the objectives of the monitoring network. Nevertheless, it has been
shown that both difficulties can be resolved by analysing different scenarios.
One disadvantage with the method is the huge computational effort needed to solve the
monitor location problem for more than three monitors. A workaround consists of
dividing the water system into several subsystems and solving for each subsystem
separately.
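The growth of the search space can be illustrated with a short Python sketch. It assumes, purely for illustration, that the monitor location problem is solved by evaluating every possible set of candidate points; the function name is hypothetical and the count does not include the cost of evaluating each set.

```python
from math import comb

def n_candidate_sets(n_points, n_monitors):
    """Number of distinct monitor sets under exhaustive enumeration."""
    return comb(n_points, n_monitors)

# 65 calculation points, as in the selected Pijnacker subsystem.
for k in (1, 2, 3, 4):
    print(k, n_candidate_sets(65, k))
```

Under this assumption the number of sets to evaluate grows from 65 for one monitor to 2080 for two, 43680 for three and 677040 for four, which indicates why splitting the system into smaller subsystems reduces the effort.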
It is interesting that in the case of the Magdalena River the VOI is sensitive to the
discharges from the tributaries and in the case of Pijnacker to the location of hydraulic
structures. For the case of natural streams such as the Magdalena River, the inclusion of
the dynamics of the kinematic wave allows for a better location of monitors, depending
on the objective of the monitoring network.
Chapter 7
Public data collection and assessment of model reliability
In the previous chapters, different methods for placing monitoring devices were developed
and tested in the polder system of Pijnacker, described in Chapter 3, and in the Magdalena
River, described in Chapter 4. In Chapter 5 the use of Information Theory concepts
was explored in order to maximise the information content about the particular water
system. A complementary method that takes into account the subjective beliefs of a
decision maker in getting information is presented in Chapter 6 with the maximisation of
the Value of Information. The present chapter explores other methodologies to obtain a
reliable idea of the state of a water system through public data collection and model
validation, which, in turn, looks for an incremental increase in the information content of
our water systems.
Two main topics are distinguished in this chapter. First, the use of public participation in
information collection, particularly by the use of mobile phones; second, the use of the
data collected by this method to improve the reliability of the models for decision
making. The first part describes the whole cycle, from conception to results, of an
experiment called MoMoX (Mobile Monitoring Experiment), which was carried out
during the first half of 2010 in the region of Pijnacker. The second part describes the
methods by which the data collected in the first part can be used to improve models.
Before entering into the details, the introduction describes the main motivations of the
methods.
7.1 Introduction
Mobile phones are devices that have evolved significantly in recent years. More
than just phones, these machines can be considered to be small computers that can offer a
variety of features in addition to their basic communication possibilities. However, it is
not only the technology behind them that makes them powerful, but also the social
consequences derived from the fact that almost every person has one in his/her pocket.
Recently, several researchers have explored different uses of mobile phones in
water-related problems. On the one hand, mobile technologies can be used as a tool for
spreading information, as demonstrated, for example, by Alfonso (2006), who presented a
methodology to use mobile phones to inform people about the proper times to consume
drinking water in an intermittent water distribution system, and Naz (2006), who used
wireless technologies to spread early warning messages of flooding in Dhaka. On the
other hand, mobile phones are also a tool for public participation in monitoring, as
suggested by Silva (2008), who works with children to create a multisensory
geographical information system using mobile phones in learning and participatory
contexts, and Gouveia et al. (2004) who overcame typical problems of voluntary data
collection by promoting the use of ICT.
The motivation of the research presented in this chapter comes from a consideration of
two aspects concerning data collection in extreme events. First, there is the problem of
validating the results of rainfall-runoff models during such events when typical
monitoring devices fail to record and transmit data. This situation leads to the
incompleteness of the data records especially at the time of the peak of the event, which
is actually the moment when a reliable model is needed to define the real extent of the
flooding and to generate reliable information for decision making.
The second aspect is the fact that mobile phones are not simply communication devices
but small computers that are part of a network, can transfer information in various
formats, can accompany a human being, work at any time and can connect to the Internet.
Most importantly, people know how to use them and almost everybody has one. The
implications for monitoring are clear: measurements can be sent and analyzed
immediately, taken at any accessible place and at any time, and as many times as
required. Additionally, minimal training is needed, their use by human observers is
flexible, and there are no maintenance costs or vandalism problems. The data that can
be collected include water levels, rainfall, discharges (given velocity readings as well as
water levels), the operational status of hydraulic structures such as pumps and gates,
flood reports, failures, obstructions, etc.
This chapter is divided into two sections: the first introduces the Mobile
Monitoring Experiment (MoMoX), an experiment using members of the public to
collect water level data in a polder in The Netherlands; the second describes a
methodology to use the collected data in an assessment of model reliability.
7.2 Public participation in data collection
The related literature review was presented in Chapter 2, where a number of projects
with positive experiences in public participation are mentioned. The following section
describes the experience of letting people read water level data and send them through
mobile phones. Findings and limitations are mentioned.
7.2.1 Mobile Monitoring Experiment (MoMoX)
MoMoX stands for Mobile Monitoring Experiment, a new approach to collecting field
data for water management purposes, which was held in the city of Pijnacker and nearby
areas (see Chapter 3). The experiment was divided into three stages (see Table 7-1). The
first stage consisted of two small pilot experiments that took place during February 2010,
one of them held directly in the field, in order to test the technological platform and
correct possible errors. The second stage, with 15 participants and no rainfall event, was
held in the field on 22 May 2010, in order to evaluate the possible errors
coming from people with no experience in water level gauge reading and the performance
of the website. The third stage involved people living or working near the water level
gauges, so that they were able to send messages during or after real rainfall events during
the month of July 2010.
Table 7-1. Description of MoMoX stages
Stage   Description                    Duration   Number of      Type of
                                                  participants   participants
1       a) Pilot test field            2h         4              Colleagues
        b) Pilot test lecture room     2h         9              Students
2       Real field test                1 day      15             Students + colleagues
3       Real field test                1 month    2              Residents
The platform for the experiment consists of a website with details of MoMoX, which
include the location of the gauges in the region, instructions to read water level gauges
and instructions to send an SMS with the water level information. Every participant can
register either through filling in an online form or by sending an SMS with the nickname
of his/her choice. The experiment procedure has the following steps, noting that all the
gauges in the region were previously labelled by staff members of Delfland Waterboard.
• An SMS with the expected rainfall amount for the coming days is prepared by
  Hoogheemraadschap van Delfland, the local water authority, and sent to the field
  operators and to MoMoX.
• An SMS, indicating the start date and time for the experiment and the phone
  number where the messages are to be received, is sent to the registered
  participants.
• The participants start visiting as many water level gauges as possible, read them
  and send messages in a predefined format.
• The SMS data are received and validated automatically on the MoMoX server.
• The data are displayed on the website so that the participants can see their records
  on a map display. A list of the top-10 contributors is continuously displayed on
  the website, to encourage participation.
• The water level information can be immediately accessed by analysts in the
  office. Provided the experts have an Internet connection, they can also check the
  water level behaviour at different places and make appropriate decisions.
Figure 7-1 presents these steps graphically, in the form used in a flyer (produced in both
Dutch and English) to advertise the experiment and to encourage people to participate.
Figure 7-1. Flowchart describing the MoMoX general procedures.
7.2.2 Technology used
The technology and processes used in MoMoX are schematized in Figure 7-2. The
process begins with the generation of a mobile-originated (MO) message by a mobile
user. The SMS is transmitted through the local cellular network to an API
gateway provided by Clickatell, which posts the contents and the metadata
of the SMS to a predefined target address on the web. This post is then captured by a PHP
script located at that target address, which analyses it and extracts, among other
data, the source of the message (mobile number), the text message and its reception time.
Additionally, the PHP script performs the following tasks:
• Extract the gauge identification number and the reading value from the text
  message.
• Using the gauge identification number, the high and low water level limits are
  checked. If the reading value is outside this range, a feedback message is sent
  back to the user, to let him/her know that there may be a mistake in the reading.
• The reading value is converted to absolute elevation using the NAP reference.
  This information is of interest for the operators of the system, since they can
  check whether the target water levels are met with the current control strategies.
• The incoming data is stored in a server database, on which SQL-like commands
  are performed to draw the stage graphs at each measured point using Google
  Visualization API. The map showing the gauge locations is Google Maps-based.
  Every time the user clicks on a gauge, the SQL statement is activated, the
  visualization API is called and the graph is drawn (see Figure 7-3).
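The limit check and the conversion to NAP described above can be sketched as follows. This is only an illustration in Python, not the PHP code used in MoMoX: the gauge table, its field names, the limits, the datum offsets and the feedback texts are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Gauge:
    low: float       # lowest accepted reading, in gauge units (hypothetical)
    high: float      # highest accepted reading, in gauge units (hypothetical)
    unit_m: float    # metres per gauge unit: 0.01 (cm), 0.1 (dm) or 1.0 (m)
    zero_nap: float  # elevation of the gauge zero mark in m NAP (hypothetical)


# Hypothetical gauge table; the real limits and datum offsets are not given here.
GAUGES = {
    8: Gauge(low=-5.0, high=10.0, unit_m=0.1, zero_nap=-5.60),
    13: Gauge(low=-12.0, high=10.0, unit_m=0.01, zero_nap=-4.30),
}


def process_reading(gauge_id: int, reading: float) -> Tuple[Optional[float], Optional[str]]:
    """Return (level in m NAP, None) if accepted, else (None, feedback text)."""
    gauge = GAUGES.get(gauge_id)
    if gauge is None:
        return None, f"Gauge {gauge_id} is not known. Please check the gauge number."
    if not gauge.low <= reading <= gauge.high:
        return None, (
            f"Reading {reading} is outside the range of gauge {gauge_id} "
            f"({gauge.low} to {gauge.high}). Please check it and send it again."
        )
    # Convert the relative reading to an absolute elevation.
    return gauge.zero_nap + reading * gauge.unit_m, None


level, feedback = process_reading(8, 2.0)     # accepted and converted to m NAP
rejected, message = process_reading(8, 25.0)  # outside the limits: feedback text
```

A limit check of this kind only rejects readings that fall outside the gauge scale; a reading that is wrong but still within the limits is stored like any other.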
[Figure 7-2 box labels: PHP; Google. Functions listed: captures the SMS; extracts info; validates info; calculates levels; writes DB; feedback SMS; DB access and queries; visualization (graphs and tables); maps; spreadsheets]
Figure 7-2. Technology behind MoMoX
Figure 7-3. MoMoX website showing gauge 8 info, and the current water level graph.
As the database is being fed in real-time, the graphs are immediately available for
checking. An important way of encouraging participants to send messages is by
identifying the contributor (nickname) of each single point in the graph. This, in addition,
provides an indirect way of identifying the people who send wrong data.
7.2.3 Communication campaign
As the central part of the experiment is the use of the general public to collect data, the
experiment has a strong social component, in which the communication process is key to
its success.
Owing to the trial nature of the first stage, a major communication effort was not
required. The pilot test in the field demanded only a little time from four colleagues at
UNESCO-IHE. For the pilot test in a lecture room, a presentation of the current research
given to the MSc students of Hydroinformatics at UNESCO-IHE was used to recruit
potential participants. During the presentation, a simulation of the real experiment was
carried out. As expected, their registration process was straightforward.
For the second stage, however, a stronger communication campaign was needed in order
to recruit as many participants as possible for a real data collection campaign. The
strategy included, on the one hand, disseminating the message throughout
UNESCO-IHE by email, by personal approach and by running a video on the public
screens at the entrance of the institute; on the other hand, handing out letters to the
residents of the area, preferably those living near the gauges to be read. A total of 85
letters, in Dutch, were distributed. As a result, 28 participants registered through the
online form on the MoMoX website, 2 of them being residents of the area. On the day of
the experiment, however, only 15 people were actually in the field: 12 UNESCO-IHE
students, 1 staff member of the same institute and 2 people with other affiliations.
Unfortunately, the residents who registered did not send any SMSs during the
experiment.
The first two stages of the experiment exposed the need for stronger communication
campaigns to involve participants. For this reason, the third stage consisted of 25
house-by-house personal visits to explain the experiment and to hand out flyers in Dutch.
In order to attract attention, 15 umbrellas were given away to enthusiastic participants.
Approximately half of the people contacted were enthusiastic (especially those with a
scientific background), while the other half were either not interested, not willing to
listen to the proposal or too busy. Key communication players, such as the local
newspaper, schools and the “Vereniging voor Natuur- en Milieubescherming
Pijnacker”, an association for nature conservation in the Pijnacker region, were also
contacted. However, as a result of this campaign, only five residents actually registered.
7.3 Results of the experiment
The results obtained for the three stages mentioned previously, namely the pilot tests, the
experiment of 22nd of May 2010 and the one-month experiment are presented below.
7.3.1 Findings of the pilot tests
Two main pilot tests were carried out during February 2010. The first one, executed by
four participants, consisted of sending messages from four different locations in the field.
During the first test it was found necessary to provide information on the scales of the
gauges, since some of them are in meters, others in decimeters and still others in
centimeters, and therefore very different values were being reported for the same site
(Figure 7-4).
Figure 7-4. Gauges with scales in cm (a), dm (b) and m (c)
In addition, a command to send feedback messages to the participants was found to be
needed in case the values were rejected during the validation procedure. In this way, the
participants have the opportunity to resend the value after an explanation of the possible
sources of the mistake. Finally, it was necessary to give the option to provide specific
information for a given gauge at the user’s request, such as longitude, latitude, scale and
picture (the latter for those with the option of receiving MMS).
The second pilot test was carried out during a Hydroinformatics lecture at UNESCO-IHE,
in which 9 participants with Internet access were involved. The objective of this test
was to check how the platform would handle several messages arriving simultaneously
from different mobile operators and how the website would perform when several
clients access the maps and the graphs at the same time. For this purpose, a visit to the
field was simulated by providing the participants with the messages they would send.
Two main issues were found during the second pilot test:
• Mistakes during the registration of mobile numbers, especially regarding the
  international number format, prevented the system from receiving the inputs of
  two participants.
• Participants subscribed to the mobile carriers KPN Mobiel, Orange Nederland,
  T-Mobile Nederland BV, Tele2, Telfort (O2) and Vodafone (Libertel) were able to
  send messages, while participants with other carriers were not. Unfortunately, this
  issue depends exclusively on the SMS gateway provider (Clickatell), as other
  carriers are not currently included in its service. Similarly, sending SMSs from
  web-based services was not possible.
In spite of these difficulties, the test demonstrated that the platform is robust and
that the graphs and tables are successfully updated within a few seconds after the SMSs
are sent.
7.3.2 Findings of the experiment of May 22
The second stage of the experiment consisted of testing the platform from the field,
during a one-day activity with several participants. The purpose of this experiment was to
assess the reliability of the data sent by each participant after reading the water level
gauges by themselves. For this reason, validation-related errors were corrected offline in
order to separate them from the reading errors, which are the issue of interest. It must be
mentioned that the experiment was carried out on a warm, dry day, and that there was no
evidence of pump operation, so the water levels remained constant all day long.
The validation-related problems found during this experiment are listed as follows:
• One of the participants used a comma (,) as the decimal separator, instead of a
  period (.), generating false data. The resulting gauge value appeared to be the digit
  before the comma, and the gauge ID the digits after the comma.
• A problem related to the negative sign was found when a participant sent a datum
  with the negative sign separated by a space. As the space is used as a field
  separator, the system interpreted this as an error and the datum was not registered.
• One of the participants sent a gauge value, but forgot to add the gauge ID. The
  automatic feedback SMS warned this person and the message was resent.
• Although the validation process includes the check of gauge limits, other
  procedures for avoiding scaling mistakes were found to be needed. For instance
  (see Figure 7-5), one of the participants sent the value 0.2 instead of 2 for
  gauge 8, due to a misreading of the gauge scale; yet, the value of 0.2
  appeared to be within the valid scale range. The same situation occurred at
  gauge 3, where a participant sent -0.1 instead of -0.01.
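The first three problems above are formatting issues that a more tolerant parser could absorb. The following Python sketch illustrates one way to do so; it is not the MoMoX code. The message format `<gauge number> <reading>`, the function name `parse_sms` and the feedback texts are assumptions made for the example.

```python
import math
import re
from typing import Optional, Tuple


def parse_sms(text: str) -> Tuple[Optional[int], Optional[float], Optional[str]]:
    """Parse '<gauge number> <reading>' (assumed format).

    Returns (gauge_id, value, None) on success or (None, None, feedback) on failure.
    """
    cleaned = text.strip().replace(",", ".")              # accept a decimal comma
    cleaned = re.sub(r"([+-])\s+(?=\d)", r"\1", cleaned)  # '- 0.5' becomes '-0.5'
    parts = cleaned.split()
    if len(parts) != 2:
        return None, None, "Please send: <gauge number> <reading>, for example '8 2.5'."
    gauge_text, value_text = parts
    if not gauge_text.isdigit():
        return None, None, "The gauge number must be a whole number."
    try:
        value = float(value_text)
    except ValueError:
        return None, None, "The reading could not be understood as a number."
    if not math.isfinite(value):
        return None, None, "The reading could not be understood as a number."
    return int(gauge_text), value, None
```

Scale mistakes such as 0.2 sent instead of 2 cannot be caught at this stage, because the message is well formed; they need a check against the gauge itself or against neighbouring readings.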
[Figure 7-5 panels: Gauge 3 and Gauge 8; axes: time of day vs. water level (cm)]
Figure 7-5. Examples of validation errors related to gauge scale
However, the limit check prevented the registration of wrong data sent by different
participants on five occasions at gauge 19 and on one occasion at gauge 16. Note that
these gauges have scales in meters (e.g., Figure 7-4c), which points to a particular
difficulty with this type of scale.
On the other hand, problems related to composing the SMS were also found. Some
participants could not find the negative sign among the text options of their mobile
phones when writing the message. Therefore, they either sent a positive value (see e.g.
Figure 7-6) or simply did not send anything.
[Figure 7-6 panel: Gauge 13; axes: time of day vs. water level (cm)]
Figure 7-6. Example of error related to datum without negative sign.
However, the opposite case occurred at gauge 6, where a participant sent a
negative value when the true value was positive. As the datum is within the limits
of the gauge scale, it was accepted by the system (see Figure 7-7). This might have
happened because of a lapse in concentration, perhaps due to tiredness. In fact, physical
effort is needed to access this gauge, as it is located below a bridge in the middle of long
pasture grass, which also makes it difficult to read the gauge comfortably.
[Figure 7-7 panel: Gauge 6; axes: time of day vs. water level (cm)]
Figure 7-7. Example of error related to adding an unnecessary negative sign.
As previously mentioned, the gauges with scales in meters were found to be the most
difficult for the participants to read, probably because the addition of decimal places is a
source of confusion. Other possible reasons are that both gauges (16 and 19) have
small numerals and that they must be read from a considerable distance.
As expected, random errors related to the appreciation of the gauge scales were found.
Some of the resulting water level graphs shown in Figure 7-8 can be used to analyse these
errors.
It can be observed that the random errors are of the order of 2 cm. The same error is
obtained if the outliers identified before (shown in Figures 7-5 to 7-7) are
removed.
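Because the water level was constant during the experiment, the spread of the readings at a gauge can be taken as an estimate of the random reading error. A minimal Python sketch of such an estimate is given below. The readings are invented for illustration and are not data from the experiment, and the outlier rule (a fixed tolerance around the median) is an assumption, not the procedure used in this chapter.

```python
from statistics import median


def split_outliers(readings_cm, tolerance_cm=5.0):
    """Separate readings farther than tolerance_cm from the median (assumed rule)."""
    centre = median(readings_cm)
    kept = [r for r in readings_cm if abs(r - centre) <= tolerance_cm]
    outliers = [r for r in readings_cm if abs(r - centre) > tolerance_cm]
    return kept, outliers


def random_error_cm(readings_cm):
    """Largest absolute deviation from the median of the given readings."""
    centre = median(readings_cm)
    return max(abs(r - centre) for r in readings_cm)


# Hypothetical readings (cm) at one gauge on a day with constant water level.
# The value 7.0 mimics a reading sent without its negative sign.
readings = [-7.0, -8.0, -6.0, -7.0, 7.0, -9.0, -7.0]
kept, outliers = split_outliers(readings)
error = random_error_cm(kept)
```

With these invented values the sign error is set aside as an outlier and the remaining readings deviate by at most 2 cm from their median.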
[Figure 7-8 panels: Gauge 12, Gauge 15, Gauge 16 and Gauge 20; axes: time of day vs. water level (cm, dm or m, depending on the gauge)]
Figure 7-8. Example of random errors due to differences in appreciation
7.3.3 Findings of the experiments with residents
A total of 6 rainfall events occurred during July 2010 in the region of Pijnacker. For each
of these events, a text message was sent to the participants, stating that a rainfall event
was coming and that an SMS was expected from them.
During the whole month, two residents sent SMSs, namely ‘Rein’, who sent 6 SMSs
from gauge number 19, and ‘Eric’, who sent 4 from gauge number 12 (see Figure
7-9).
Figure 7-9. Water level charts obtained with the data sent by the residents and other participants
during the second stage of the experiment
Feedback from participants
Telephone interviews were held at the end of the experiment with those who registered.
Unfortunately, three of them did not answer the phone call, which may imply that they
simply were not available.
The two residents who took part in the experiment explicitly stated that they joined it
because they considered it a good idea that had a clear purpose and required little effort
on their part. Additionally, both participants explained that the holiday period was not a
good time for participating in the experiment.
The interviews made it clear that the flyers and the face-to-face visits were clear enough,
so in principle these did not explain the low participation. However, one of the
participants admitted his initial scepticism about the study, as he thought it was “a kind of
call-game for which I had to pay a lot of money”. Also he stated that “it is disappointing
that my neighbours did not participate because I could not compare or share my inputs,
but nice that I knew when it was going to rain thanks to the SMS”.
Future research should explore the use of an extensive media campaign to encourage
public participation, by reporting the successful results of this first experiment.
7.4 Assessing model errors with public’s data
The second section of this chapter describes how the data gathered by members of the
public can be used for model validation. The proposed method is developed for a subregion of the Pijnacker case study (see details in Chapter 3), in which the water levels for
seven subareas with different elevations are controlled by means of four weirs and three
pump stations.
Areas A1, A2, A5 and A6 are high-level areas that discharge to A4 via weirs 1, 2, 3
and 4, respectively. Area A3 needs pump 1 to discharge to A4 because the latter is at a
higher level. Next, all water collected in A4 is pumped to a higher level (A7) by means of
pump 2, to be finally discharged to the main storage canals by pump 3.
Figure 7-10. Description of the area, absolute elevations and location of hydraulic structures
7.4.1 Description of the validation method
The validation method consists of two components:
a) The creation of a library of patterns in which a large number of models are generated
with different inputs, including extreme events, in a Monte Carlo-like procedure.
b) The selection of the pattern that best describes the pattern depicted by the incoming
public-generated data.
Generation of a library of patterns
In order to generate the library of patterns, the possible variables to be changed are
defined. For the weirs, these variables are the crest level, discharge coefficient and
length; for the pumps, the on and off levels and their capacities are considered; for the
canals, the variables are the roughness coefficients and the dimensions of the cross
sections; in order to vary the external inflows, a single event that is affected by a set of
factors is considered.
The library of patterns is generated by changing all the variables within realistic value
ranges. However, due to the huge number of model runs required, the number of
variables was reduced by selecting the most uncertain variables. In this way, it is assumed
that the (fixed) crest levels and the length of the weirs are known with reasonable
accuracy. Similarly, the on/off levels of the pumps are considered to be well defined.
Additionally, the dimensions of the cross section are assumed to be constant since erosion
and sedimentation processes are not significant in the area.
This means that four groups of variables remain for selection: the weir discharge
coefficients (changed separately for each of the four weir structures), the pump capacities
(different for each pump), the roughness coefficient and the factor for the inflows. In
consequence, a total of 1152 different model runs were generated and executed, and the
corresponding water level time series at the available gauges g1, g2, g3, g17, g18 and g20
(see Figure 7-10) were stored in a library.
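The construction of such a library can be sketched as an exhaustive loop over candidate values of the remaining variables. The Python example below is only an illustration: the candidate values are placeholders (some borrowed from Tables 7-2 and 7-3, the rest invented), the grid sizes were chosen only so that the sketch produces the same count of 1152 runs, and `run_model` is a hypothetical stand-in for a hydrodynamic simulation. The actual sampling scheme is not stated here.

```python
from itertools import product

GAUGE_NAMES = ("g1", "g2", "g3", "g17", "g18", "g20")

# Placeholder candidate values per variable (assumed, not taken from the study).
GRID = {
    "inflow_factor": [0.7, 1.0, 1.5],
    "manning_n": [0.01, 0.02, 0.07],
    "pump1": [0.1, 0.2],
    "pump2": [0.5, 1.0],
    "pump3": [0.5, 1.5],
    "weir1_cd": [1.0, 2.0],
    "weir2_cd": [1.0, 2.0],
    "weir3_cd": [1.0, 2.0],
    "weir4_cd": [1.0, 2.0],
}


def run_model(params):
    """Hypothetical stand-in for one model run: returns a dummy series per gauge."""
    base = -5.5 + 0.1 * params["inflow_factor"] - 0.05 * params["pump3"]
    return {gauge: [round(base + 0.01 * t, 3) for t in range(3)] for gauge in GAUGE_NAMES}


def build_library(grid, simulate):
    """Run the model for every combination of candidate values."""
    names = list(grid)
    library = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        library.append((params, simulate(params)))  # one pattern per combination
    return library


library = build_library(GRID, run_model)  # 3 * 3 * 2**7 = 1152 patterns
```

An exhaustive grid grows multiplicatively with every added variable, which is why the number of variables had to be reduced before the runs were generated.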
Pattern selection using incoming SMS data
The second component of the method for model validation is the selection of the patterns
according to the incoming SMS data from members of the public, as a way to select
good model inputs and to improve the available (offline) model. Here a distinction
between online (real-time) and offline models can be made. In this section, only a method
for offline models is presented.
The procedure to validate an offline model consists of selecting those patterns P that best
fit the SMS-based series according to, for instance, a linear least squares evaluation. If
more than one pattern is selected, they equally represent the SMS data at the gauged
points in the system and the choice of patterns may not be sensitive to some variables.
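A least squares selection of this kind can be sketched in Python as follows. The data layout (a library of `(parameters, series)` pairs and SMS observations as `(time step, level)` pairs per gauge), the function names and the sample values are all assumptions made for the illustration.

```python
def sse(pattern_series, sms_obs):
    """Sum of squared differences at the time steps where an SMS reading exists."""
    total = 0.0
    for gauge, observations in sms_obs.items():
        for step, level in observations:
            total += (pattern_series[gauge][step] - level) ** 2
    return total


def select_patterns(library, sms_obs, tol=1e-12):
    """Return the parameter sets of all patterns that share the lowest error."""
    scores = [sse(series, sms_obs) for _, series in library]
    best = min(scores)
    return [params for (params, _), score in zip(library, scores) if score <= best + tol]


# Hypothetical three-pattern library: the first two differ only in roughness
# and produce the same water levels at the gauged point.
library = [
    ({"n": 0.01, "pump3": 0.5}, {"g1": [-4.40, -4.38, -4.36]}),
    ({"n": 0.07, "pump3": 0.5}, {"g1": [-4.40, -4.38, -4.36]}),
    ({"n": 0.07, "pump3": 1.5}, {"g1": [-4.40, -4.41, -4.42]}),
]
sms_obs = {"g1": [(1, -4.38), (2, -4.37)]}  # (time step, level in m NAP)

selected = select_patterns(library, sms_obs)
```

In this invented case two patterns are returned, and they differ only in `n`, which is the situation in which the choice of patterns is not sensitive to a variable.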
In order to improve the available model used for operation and decision support
(subsequently called the zero-model), the zero-model is compared with every
pattern in P, by comparing the values of the variables used in each of them. Then, the
probability of “goodness” of the zero-model is defined as the number of parameters of
the zero-model that are right (i.e. that match those of the selected patterns) divided by
the total number of parameters. In the case that P has more than one pattern, the
redundant variables are removed from the set and the probability is assessed using the
remaining variables.
7.4.2 Results of the validation method
The zero-model of the region was given. Although the initial idea was to use public-based
SMS data, due to the low participation of the residents and therefore the few data
collected, artificially generated SMS were used to demonstrate the procedure for model
validation. The SMS-based data was generated using the same model with inputs that
were not used for the construction of the library of patterns. The inputs of both models
are shown in Table 7-2.
Table 7-2. Parameters of the zero-model and SMS model, all dimensionless except V3 to V5
(m³/s)

Model       V1             V2    V3     V4     V5     V6      V7      V8      V9
            Inflow factor  n     Pmp 1  Pmp 2  Pmp 3  Weir 1  Weir 2  Weir 3  Weir 4
Zero-model  1.0            0.07  0.1    0.5    1.5    2.0     1.0     1.0     1.0
SMS         1.3            0.04  0.1    0.5    1.5    1.0     2.0     1.0     2.0
From the prebuilt library of patterns, the three patterns that best fit the SMS data were
retrieved; their characteristics are shown in Table 7-3. It can be observed that the model
output is not sensitive to the Manning coefficient and therefore this parameter can be
neglected in the probability calculations that are detailed below.
Table 7-3. Parameters of the patterns that best fit the SMS data

Model       V1             V2    V3     V4     V5     V6      V7      V8      V9
            Inflow factor  n     Pmp 1  Pmp 2  Pmp 3  Weir 1  Weir 2  Weir 3  Weir 4
Pattern 1   1.0            0.01  0.1    0.5    0.5    2.0     1.0     2.0     2.0
Pattern 2   1.0            0.02  0.1    0.5    0.5    2.0     1.0     2.0     2.0
Pattern 3   1.0            0.07  0.1    0.5    0.5    2.0     1.0     2.0     2.0
A visual comparison of the zero-model, the SMS data and the best patterns retrieved, is
presented in Figure 7-11 for the gauged points located in Figure 7-10.
[Figure 7-11 panels: Gauge g1, g2, g3, g17, g18 and g20; legend: Zero-Model, SMS data, Closest patterns (3 found)]
Figure 7-11. Zero-model, SMS data and retrieved patterns for the 6 gauged points
It can be seen that the zero-model differs from the retrieved patterns in the capacity of
pump 3 and in the discharge coefficients of weirs 3 and 4. The probability of
goodness of the zero-model is therefore pg = 6/(9-1) = 3/4; the remaining 25% of
wrongness can then be reduced by adopting the values of V5, V8 and V9 from any of the
retrieved patterns.
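The comparison between the zero-model and the retrieved patterns can be sketched in Python as below, using the values transcribed from Tables 7-2 and 7-3. The function name and the reading of "redundant variables" as those whose value changes among the selected patterns are assumptions of this sketch. Under this literal count, five of the eight compared variables match (V5, V8 and V9 differ), which is not the 6/(9-1) quoted above; the counting convention behind that figure is not spelled out, so the sketch should be read as one possible interpretation only.

```python
ZERO_MODEL = {"V1": 1.0, "V2": 0.07, "V3": 0.1, "V4": 0.5, "V5": 1.5,
              "V6": 2.0, "V7": 1.0, "V8": 1.0, "V9": 1.0}

# Values shared by the three retrieved patterns (Table 7-3); only V2 varies.
COMMON = {"V1": 1.0, "V3": 0.1, "V4": 0.5, "V5": 0.5,
          "V6": 2.0, "V7": 1.0, "V8": 2.0, "V9": 2.0}
PATTERNS = [dict(COMMON, V2=n) for n in (0.01, 0.02, 0.07)]


def goodness(zero, patterns):
    """Return (matching, compared, redundant) variable names.

    Assumed rule: a variable whose value changes among the selected patterns
    is treated as redundant and left out of the comparison.
    """
    redundant = sorted(k for k in zero if len({p[k] for p in patterns}) > 1)
    compared = [k for k in zero if k not in redundant]
    matching = [k for k in compared if zero[k] == patterns[0][k]]
    return matching, compared, redundant


matching, compared, redundant = goodness(ZERO_MODEL, PATTERNS)
p_good = len(matching) / len(compared)
differing = [k for k in compared if k not in matching]
```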
It must be noted that the SMS data that these patterns replicate were artificially
generated using a different inflow factor from the one used in the zero-model. This
implies that the difference in the inflow factor is being compensated by the capacity of
pump 3 and the discharge coefficients of weirs 3 and 4, and therefore the selected
patterns may provide a wrong representation of what is happening in reality. However,
once real, public-based SMS data are available, the method will provide real insights into
the source of errors in the zero-model.
7.5 Conclusions
While traditional monitoring generally provides enough acceptable data to calibrate and
validate models, this is not the case during extreme events. In this chapter we demonstrate
that the combination of public participation and mobile phones provides a
promising way to deal with this problem.
It is necessary to generate a more comprehensive library of patterns in order to cover a
wider range of possible inputs, including extreme events.
The validation of the SMS data coming from members of the public is a main issue that
needs further research. For this, different approaches, such as the use of the improved
model itself in a cyclic procedure, and the use of pattern recognition and image
processing of mobile-originated pictures, are worth exploring. Additionally,
mechanisms to encourage public participation in order to have a denser data set would
help to identify those SMS data that are incorrect.
Chapter 8
Conclusions and recommendations
In this thesis new methods of optimising monitoring networks using concepts of
Information Theory, Value of Information and public participation have been investigated
and tested in two case studies with distinct hydrologic, hydraulic, socioeconomic and
political conditions. The conclusions and the recommendations of the research are
presented in the sections below, following the same order in which the contributions were
developed and applied.
8.1 Conclusions
8.1.1 General conclusions
It has been demonstrated that monitoring networks can be optimised by maximising the
information content and information value, and in addition, that it is possible to configure
a public-based monitoring network with mobile phones. As informed decision-making is
key for adequate water management, the efforts presented to design and evaluate
monitoring networks will lead to an enhancement of the performance of a water system.
Socioeconomic and political conditions drive, in practice, the design and the evolution of
monitoring networks. In developing countries, lack of financial resources is the main
reason for inadequate network density, so their efforts generally concentrate on
optimising the use of the few monitors available. In contrast, in developed countries,
efforts are generally addressed to reduce the size of the existing, dense monitoring
networks.
8.1.2 Information theory for designing monitoring networks
Three new methodologies for optimising monitoring networks by coupling Information
Theory concepts and models were successfully developed and applied, namely the Water
Level Monitoring Design in Polders (WMP), the Multi Objective Optimisation Problem
(MOOP) approach and a rank-based greedy algorithm. The following conclusions can be
made:
The capability of Information Theory to measure information quantitatively is a feature
that is intensively exploited in this thesis for monitoring design. The insights it provides
about the distribution of the information content are invaluable for optimally placing
monitoring devices.
The distribution of the information content for a water system is driven by features that
produce changes in the hydraulic conditions of the system. For the case of the polders of
Pijnacker, the features that interfere with the distribution of the information content are
the weirs and the pumps; for the case of the Magdalena River, it is the incidence of its
wetlands and tributaries. However, this does not mean that a monitoring network
configured by placing monitors at the hydraulic structures in the first case or at the
tributaries in the second case will be optimal from an Information Theory perspective.
In all the experiments, the trade-off between the amount of information content provided
by a set of monitors and the independence among them (mathematically represented by
Joint Entropy and Total Correlation, respectively) is evident. However, a large Joint
Entropy can be preferred over a small Total Correlation, because a set of monitors with
low dependency is also the least informative. In such a case, therefore, it does not make
sense to have a completely independent set of monitors if they do not provide enough
information content individually.
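The two quantities in this trade-off can be computed directly from discretised series. The Python sketch below assumes that the series have already been binned into integer classes; the function names and the sample series are hypothetical. Total Correlation is computed as the sum of the marginal entropies minus the Joint Entropy.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def joint_entropy(series):
    """Joint entropy of several equally long discretised series."""
    return entropy(list(zip(*series)))


def total_correlation(series):
    """Sum of marginal entropies minus the joint entropy."""
    return sum(entropy(s) for s in series) - joint_entropy(series)


# Hypothetical discretised water level series at three candidate locations.
a = [0, 0, 1, 1]
b = [0, 0, 1, 1]  # identical to a: fully dependent
c = [0, 1, 0, 1]  # independent of a

dependent_pair = (joint_entropy([a, b]), total_correlation([a, b]))
independent_pair = (joint_entropy([a, c]), total_correlation([a, c]))
```

In this toy case the pair (a, b) carries 1 bit jointly with 1 bit of shared information, while the pair (a, c) carries 2 bits jointly with none shared.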
The location with the highest information content in the system is usually the most
dependent with regard to the remaining locations. This implies that once this point is
selected, it is very difficult to find a second point that is informative and at the same time
independent. This also explains why only a few monitors are needed in spite of having
many possibilities (8 out of 1520 potential monitors for the Pijnacker region and 7 out of
181 for the case of the Magdalena River).
The sensitivity of any discretization-based criterion for the assessment of probability
distributions is a well-known difficulty that greatly affects the estimation of
entropy-related quantities. However, these effects are negligible for the methods
determining monitor locations due to the relative nature of the developed methods.
8.1.3 Value of Information for designing monitoring networks
It was proved that monitoring networks can be designed by combining the aspects of VOI
and the encapsulated knowledge that the modelling technology offers. The following
conclusions can be drawn from this investigation:
The VOI-based approach to locate monitors is very flexible, because it allows for the
analysis of any type of water system, for any water variable and for any set of states that
may occur in the water system. The use of models for data generation is needed because
it permits the analysis of a dense set of points from which those with the highest value are
selected.
The VOI is sensitive to the discharges from the tributaries in the case of the Magdalena
River, and to the location of hydraulic structures in the case of Pijnacker. This is because
the definition of the states to be monitored in each case study is directly related to the
behaviour of these elements.
The research makes clear that there is a need for a good definition of consequences or
damage functions and for convenient definitions of the states of the system that are to be
monitored. The difficulty in the first case is the amount of data required to define such
damage functions, while for the second it is the clear definition of the objectives of the
monitoring network. Nevertheless, both obstacles can be overcome by analysing different
scenarios, in such a way that a variety of monitoring networks can be produced to
replicate the states of the system that are required for water management.
The huge computational effort needed to solve the monitor location problem for more
than three monitors can be resolved by dividing the water system into several subsystems
and solving the problem for fewer monitors in each subsystem. In this case, it must be
noted that the selected monitoring devices might not be the most valuable to describe the
state of the system of the entire network, but only the state of the subsystem under
consideration. Moreover, this workaround cannot guarantee independence between
monitor locations, and therefore subsequent analyses in this regard should be carried out.
For the case of natural streams such as the Magdalena River, the inclusion of the
dynamics of the kinematic wave allows for a better location of monitors, depending on
the objective of the monitoring network.
8.1.4 Public participation in data collection and model improvement
A public-based monitoring network for water-level data collection, characterised by the
use of mobile phones, has been configured, tested, run and assessed. The main findings of
this stage of the research are listed below.
The data sent by the public via SMS show that random errors are of the order of 2 cm,
once other errors, such as those related to negative signs and scale misreading, are
excluded by validation processes.
The combination of public participation and mobile phones provides a promising
way to deal with the problem of collecting data for calibrating and validating models in
the case of extreme events. The communication campaign to get people involved is,
however, a major factor. For this reason, future research should explore the use of an
extensive media campaign, for example by radio or TV, to encourage public
participation by reporting the successful results of this first experiment.
From the feedback received from the residents, it can be concluded that providing
short-horizon rainfall forecast information, for example 15 min to 2 h in advance, is a
potentially successful business.
Although not officially part of this thesis, a parallel application of the concepts developed
in this part of the research was carried out in the urban basin of the Tunjuelo River in
Bogotá, Colombia, under the supervision of Ir. Carolina Rogéliz, following fruitful
discussions. In this case, low-income residents read rain gauges and send the data
through SMSs, which are evaluated to provide feedback about flood risk.
This positive experience confirms the benefit of public participation and mobile phones
in data collection and corroborates the impact of this research in developing countries.
8.2 Recommendations
The fundamental concepts behind Information Theory and Value of Information proved
to be complementary in the problem of designing and evaluating a monitoring network. It
is expected that the developed methods can be combined into one comprehensive method
that takes into account both information content and decision-making aspects.
Certainly, this research opens up new possibilities for the application of Information
Theory in water resources, in which information has been traditionally quantified in
entropy units. With the inclusion of the Value of Information concept, the entropy-based
methods developed in this thesis, as well as the ones developed in previous studies, can
now consider measuring information in monetary terms. In this regard, the findings of
this thesis can be used as the starting point to find the value of one unit of information.
Regarding the method for improving models with publicly collected data, it is necessary
to generate a more comprehensive library of patterns as described in Section 7.4.1 in
order to cover a wider range of possible inputs, including extreme events. Additionally,
this method should be tested with real SMS coming from the public.
The use of pattern recognition and image processing of mobile-originated pictures is
worth exploring, not only to gather information about the state of the system, but also to
be adapted into a framework for validation of SMS data. Additionally, mechanisms to
encourage public participation in order to have a denser data set would help to identify
those SMS data that are incorrect.
Chapter 9
References
Abbott, M. B. (1991). Hydroinformatics: Information technology and the aquatic
environment, Avebury Technical, Aldershot; Brookfield, USA.
Abbott, M. B. (2002). "On definitions." J. Hydroinformatics, 4(2).
Abramovitz, J. N., and Peterson, J. A. (1996). Imperiled waters, impoverished future: the
decline of freshwater ecosystems, Worldwatch Institute, Washington, D.C.
Ackoff, R. L. (1989). "From data to wisdom." Journal of Applied Systems Analysis,
16(1), 3-9.
Aguilera, M. M. (2004). "La Mojana: riqueza natural y potencial económico."
Documentos de trabajo sobre economía regional.
Aguilera, M. M. (2009). "Ciénaga de Ayapel: Riqueza en biodiversidad y recursos
hídricos." Documentos de trabajo sobre economía regional.
Alfonso, J. L. (2006). "Use of hydroinformatics technology for real time water quality
management and operation of distribution networks. Case study of Villavicencio,
Colombia," MSc Thesis, UNESCO-IHE, Delft, NL.
Alfonso, J. L., Jonoski, A., and Solomatine, D. P. (2010a). "Multiobjective Optimization
of Operational Responses for Contaminant Flushing in Water Distribution
Networks." Journal of Water Resources Planning and Management, 136(1), 48-58.
Alfonso, L., Lobbrecht, A., and Price, R. (2010b). "Information theory-based approach
for location of monitoring water level gauges in polders." Water Resour. Res.,
46(3), W03528.
Alfonso, L., Lobbrecht, A., and Price, R. (2010c). "Optimization of Water Level
Monitoring Network in Polder Systems Using Information Theory." Water
Resour. Res., doi:10.1029/2009WR008953, in press.
Alvarado, M. "Desarrollo de proyecto piloto de navegacion satelital – SNS, entre Puerto
Berrio (k783) y Regidor (k454)." XXII Congreso Latinoamericano de Hidráulica,
Ciudad Guayana, Venezuela.
Alvarado, M., Castro, R., Corredor, H., Mantilla, J. C., Vargas, G., Castro, G., Anaya, H.,
Caycedo, J., Lora, E., Escudero, A., and Roa, G. (2008). Río Magdalena.
Navegación marítima y fluvial (1986-2008), Universidad del Norte, Barranquilla.
Ammar, K., and Kaluarachchi, J. (2009). "Bayesian Method for Groundwater Quality
Monitoring Network Analysis." Journal of Water Resources Planning and
Management, 1, 20.
Amorocho, J., and Espildora, B. (1973). "Entropy in the assessment of uncertainty in
hydrologic systems and models." Water Resources Research, 9(6), 1511-1522.
Aronica, G., Bates, P. D., and Horritt, M. S. (2002). "Assessing the uncertainty in
distributed model predictions using observed binary pattern information within
GLUE." Hydrological Processes, 16(10), 2001-2016.
Au, J., Bagchi, P., Chen, B., Martinez, R., Dudley, S. A., and Sorger, G. J. (2000).
"Methodology for public monitoring of total coliforms, Escherichia coli and
toxicity in waterways by Canadian high school students." Journal of
Environmental Management, 58(3), 213-230.
Barreto, W., Vojinovic, Z., Price, R., and Solomatine, D. (2009). "A Multi Objective
Evolutionary Approach to Rehabilitation of Urban Drainage Systems." Journal of
Water Resources Planning and Management, 1, 38.
Barreto, W. J., Price, R. K., Solomatine, D. P., and Vojinovic, Z. "Approaches to Multi-Objective
Multi-Tier Optimization in Urban Drainage Planning." International
Conference on Hydroinformatics HIC 2006, Nice, France.
Bertino, L., Evensen, G., and Wackernagel, H. (2003). "Sequential data assimilation
techniques in oceanography." International Statistical Review/Revue
Internationale de Statistique, 71(2), 223-241.
Bogardi, I., and Bardossy, A. (1985). "Multicriterion Network Design Using
Geostatistics." Water Resources Research, 21(2).
Borisova, T., Shortle, J., Horan, R. D., and Abler, D. (2005). "Value of information for
water quality management (DOI 10.1029/2004WR003576)." Water Resour. Res.,
41(6), 6004.
Bouma, J. A., van der Woerd, H. J., and Kuik, O. J. (2009). "Assessing the value of
information for water quality management in the North Sea." Journal of
Environmental Management, 90(2), 1280-1288.
Bras, R. L., and Rodriguez-Iturbe, I. (1976). "Network Design for the Estimation of Areal
Mean of Rainfall Events." Water Resources Research, 12(6).
Bromenshenk, J. J., and Preston, E. M. (1986). "Public participation in environmental
monitoring: A means of attaining network capability." Environmental Monitoring
and Assessment, 6(1), 35-47.
Burgin, M. (2003). "Information Theory: a Multifaceted Model of Information." Entropy,
5, 146-160.
Campo, M. (2001). "Proyecto: Recuperación de la pesca artesanal en el Magdalena
Medio. Ciénaga La Tigrera." Cormagdalena, Barrancabermeja.
Canadian Water Resources Association, Mitchell, B., and Shrubsole, D. (1994). Canadian water
management: visions for sustainability, Canadian Water Resources Association /
Association canadienne des ressources hydriques.
Caselton, W. F., and Husain, T. (1980). "Hydrologic Networks: Information
Transmission." Journal of the Water Resources Planning and Management
Division, 106(2), 503-520.
Caselton, W. F., and Zidek, J. V. (1984). "Optimal monitoring network designs."
Statistics & Probability Letters, 2(4), 223-228.
Chow, C., and Liu, C. (1968). "Approximating discrete probability distributions with
dependence trees." Information Theory, IEEE Transactions on, 14(3), 462-467.
Cormagdalena. (2000a). "Cartilla de Navegación del Río Magdalena entre Puerto Salgar
y Barranquilla y el Canal del Dique." LEH-UN, LEH-LF, Bogotá.
Cormagdalena. (2000b). "Estudio de Navegabilidad del Río Magdalena entre La Gloria
(k460) – Puente Pumarejo (k1). Canal del dique." LEH Las Flores, Barranquilla.
Cormagdalena. (2000c). "Estudio de Navegabilidad del Río Magdalena entre Puerto
Salgar y La Gloria." LEH-UN, Bogotá.
Cormagdalena. (2004). "Estudio de Navegabilidad del Río Magdalena, sector La Gloria -
Puerto Salgar / La Dorada. Informe Final. CM-160." LEH-UN, Bogotá.
Cormagdalena. (2006). "Estudio de Caracterización Hidrosedimentológica del Río
Magdalena sector presa de Betania – La Gloria, Volumen III - Hidráulica." LEH-UN, Bogotá.
Cormagdalena. (2009). "Informe Ejecutivo, Obras Magdalena Bajo Jun 2009 (In
Spanish). Executive Report, Works in the Low Magdalena River."
Cormagdalena, and Boada_Sáenz. (2007). "EIA + PMA Encauzamiento del río
Magdalena, tramo Puerto Berrío - Barrancabermeja | informe final / I |." Grupo
Neotrópicos, Medellín.
Cormagdalena, and Fedenavi. (2007). "Estudios y diseños de obras de encauzamiento en
el Río Magdalena en el sector comprendido entre Puerto Berrío y
Barrancabermeja. Informe Final." Boada - Sáenz Ing., Bogotá.
Cormagdalena, and ONF_Andina. (2007). "Plan de Manejo de la Cuenca Magdalena
Cauca. Informe Final Fase 4." 6003 - I04, Fluidis, Bogota.
Cover, T. M., and Thomas, J. A. (1991). "Information Theory." John Wiley, New York.
Cunge, J. A. (2003). "Of data and models." Journal of Hydroinformatics, 5(2), 75-98.
Dakins, M., Toll, J., Small, M., and Brand, K. (1996). "Risk-Based Environmental
Remediation: Bayesian Monte Carlo Analysis and the Expected Value of Sample
Information." Risk Analysis, 16(1), 67-79.
Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T. (2002). "A fast and elitist
multiobjective genetic algorithm: NSGA-II." Evolutionary Computation, IEEE
Transactions on, 6(2), 182-197.
Díaz-Granados, M., Camacho, L. A., and Maestre, A. (2001). "Modelación de balances
hídricos de ciénagas fluviales y costeras colombianas." Revista de Ingeniería
Universidad de los Andes (13), 12-20.
DNP, FAO, and DDT. (2003). "Programa de Desarrollo sostenible de la Región de La
Mojana." Departamento Nacional de Planeación, República de Colombia.
Domínguez, E. A., Angarita, H. A., Ardila, F., and Caicedo, F. M. "Hydrological Risk
Modelling Using Adaptive Operators. Overview and Applications." 8th
International Conference on Hydroinformatics, Concepción, Chile.
Ebeling, W., and Frommel, C. (1998). "Entropy and predictability of information
carriers." Biosystems, 46(1-2), 47-55.
EPA. (1997). "Guiding Principles for Monte Carlo Analysis." Risk assessment Forum.
Estrin, D., Michener, W., and Bonito, G. (2003). "Environmental cyberinfrastructure
needs for distributed sensor networks." Scripps Institute of Oceanography,
University of New Mexico, Albuquerque.
Fano, R. M. (1968). Transmission of information, MIT Press, Cambridge, Mass.
Fass, D. M. (2006). "Human Sensitivity to Mutual Information," Ph.D, Rutgers, The State
University of New Jersey, New Brunswick, New Jersey.
Fernandez, N., Jaimes, W., and Altamiranda, E. (2010). "Neuro-fuzzy modeling for level
prediction for the navigation sector on the Magdalena River (Colombia)." Journal
of Hydroinformatics, 12(1), 36-50.
Filippini, F., Galliani, G., Mantovani, M., and Screpanti, F. (1994). "Optimization criteria
for configuring a network of monitoring stations." Environmental Software, 9(2),
77-88.
Fogel, E., and Huang, Y. F. (1982). "On the value of information in system
identification - Bounded noise case." Automatica, 18(2), 229-238.
Galvis, G., and Mojica, J. I. (2007). "The Magdalena River fresh water fishes and
fisheries." Aquatic Ecosystem Health & Management, 10(2), 127-139.
Gandin, L. S. (1965). Objective Analysis of Meteorological Fields, Israel Program for
Scientific Translations.
Gavirneni, S., Kapuscinski, R., and Tayur, S. (1999). "Value of information in
capacitated supply chains." Management science, 45(1), 16-24.
Gouveia, C., and Fonseca, A. (2008). "New approaches to environmental monitoring: the
use of ICT to explore volunteered geographic information." GeoJournal, 72(3),
185-197.
Gouveia, C., Fonseca, A., Câmara, A., and Ferreira, F. (2004). "Promoting the use of
environmental data collected by concerned citizens through information and
communication technologies." Journal of Environmental Management, 71(2), 135-154.
Gualdrón, M. I. (2006). "Plan de manejo de los recursos ictiológicos y pesqueros en el
Rio Grande de la Magdalena y sus zonas de amortiguación." Cormagdalena,
Barrancabermeja.
Hall, J. W., Tarantola, S., Bates, P. D., and Horritt, M. S. (2005). "Distributed sensitivity
analysis of flood inundation model calibration." Journal of Hydraulic
Engineering, 131, 117.
Han, T. S. (1980). "Multiple mutual informations and multiple interactions in frequency
data." Information and Control, 46(1), 26-45.
Harmancioglu, N., and Yevjevich, V. (1987). "Transfer of Hydrologic Information
Among River Points." Journal of Hydrology, 91(1/2).
Harmancioglu, N. B. (1999). Water Quality Monitoring Network Design, Kluwer
Academic Publishers.
Hart, P. E. (1971). "Entropy and Other Measures of Concentration." Journal of the Royal
Statistical Society. Series A (General), 134(1), 73-85.
He, L. (2009). "Information Theory applied to the Monitoring Network for the
Magdalena River," MSc Thesis, UNESCO-IHE, Delft, NL.
Hirshleifer, J., and Riley, J. G. (1979). "The Analytics of Uncertainty and Information-An
Expository Survey." Journal of Economic Literature, 17(4), 1375-1421.
Holling, C. S. (1978). "Adaptive environmental assessment and management." New York.
Hooper, R. P., Reckhow, K. H., and Band, L. E. (2004). "Designing a network of
hydrologic observatories." 2004 Joint Asia Oceania Geosciences Society 1st
Annual Meeting & APHW 2nd Conference, Singapore.
Howard, R. A. (1968). "The foundations of decision analysis." IEEE Transactions on
Systems Science and Cybernetics, 4(3), 211-219.
Husain, T. (1989). "Hydrologic uncertainty measure and network design." Water
Resources Bulletin, 25(3), 527-534.
IDEAM. (2001). "Geomorfología y susceptibilidad a la inundación del valle fluvial del
Magdalena, sector Barrancabermeja - Bocas de Ceniza." IDEAM, Bogotá.
IDEAM. (2005). Protocolo para la Emisión de los Pronósticos Hidrológicos, Bogotá.
IUCN. (1980). "World conservation strategy: living resource conservation for sustainable
development." IUCN, Gland, Switzerland.
Jakulin, A., and Bratko, I. (2003). "Quantifying and Visualizing Attribute Interactions."
Arxiv preprint cs.AI/0308002.
Jakulin, A., and Bratko, I. (2004). "Testing the significance of attribute interactions."
ACM International Conference Proceeding Series.
Jeroen, P. v. d. S., Matthieu, C., Silvio, F., Penny, K., Jerry, R., and James, R. (2005).
"Combining Quantitative and Qualitative Measures of Uncertainty in Model-Based
Environmental Assessment: The NUSAP System." Risk Analysis, 25(2),
481-492.
Jonoski, A. (2002). Hydroinformatics as sociotechnology: promoting individual
stakeholder participation by using network distributed decision support systems,
Taylor & Francis.
Julius_Berger_Consortium. (1926). "Memoria detallada de los estudios del rio
Magdalena, obras proyectadas para su arreglo y resumen del presupuesto."
Servicio Colombiano de Meteorología e Hidrología, Ministerio de Agricultura,
Colombia, Bogotá.
Karasev, I. F. (1968). "Principles for Distribution and Prospects for Development of a
Hydrologic Network."
Kavetski, D., Franks, S. W., and Kuczera, G. (2002). "Confronting input uncertainty in
environmental modelling." Calibration of Watershed Models, 49–68.
Kirshner, S., Smyth, P., and Robertson, A. W. (2004). "Conditional Chow-Liu tree
structures for modeling discrete-valued vector time series." Proceedings of the
20th conference on Uncertainty in artificial intelligence, 317-324.
Klir, G. J., and Smith, R. M. (2001). "On measuring uncertainty and uncertainty-based
information: recent developments." Annals of Mathematics and Artificial
Intelligence, 32(1), 5-33.
Kraskov, A., Stögbauer, H., Andrzejak, R. G., and Grassberger, P. (2003). "Hierarchical
Clustering Based on Mutual Information." Arxiv preprint q-bio.QM/0311039.
Krstanovic, P. F., and Singh, V. P. (1992). "Evaluation of rainfall networks using
entropy: II. applications." Water Resour Manag, 6, 295-314.
Krstanovic, P. F., and Singh, V. P. (1992). "Evaluation of rainfall networks using
entropy: I. Theoretical development." Water Resources Management, 6(4), 279-293.
Kuczera, G., and Parent, E. (1998). "Monte Carlo assessment of parameter uncertainty in
conceptual catchment models: the Metropolis algorithm." Journal of Hydrology,
211(1-4), 69-85.
Kunstmann, H., and Kastens, M. (2006). "Direct propagation of probability density
functions in hydrological equations." Journal of Hydrology, 325(1-4), 82-95.
Lee, H. L., So, K. C., and Tang, C. S. (2000). "The value of information sharing in a
two-level supply chain." Management science, 46(5), 626-643.
Lehner, B., Verdin, K., and Jarvis, A. (2006). "HydroSHEDS Technical Documentation,
V1.0." WWF, Washington, DC. Available from: www.worldwildlife.org/hydrosheds.
Levitt, S. D., and Syverson, C. (2008). "Market distortions when agents are better
informed: The value of information in real estate transactions." Review of
Economics and Statistics, 90(4), 599–611.
Li, W. (1990). "Mutual information functions versus correlation functions." Journal of
Statistical Physics, 60(5), 823-837.
Lin, C., Gelman, A., Price, P. N., and Krantz, D. H. (1999). "Analysis of local decisions
using hierarchical modeling, applied to home radon measurement and
remediation." Statistical Science, 305-328.
Linfoot, E. H. (1957). "An Informational Measure of Correlation." Information and
Control, 1, 85-89.
Lobbrecht, A. H. (1997). Dynamic Water-System Control: Design and Operation of
Regional Water-Resources Systems, AA Balkema.
Loucks, D. P., van Beek, E., and Stedinger, J. R. (2005). Water resources systems
planning and management, UNESCO - WL Delft Hydraulics, Paris.
Macauley, M. K. (2005). "The Value of Information: A Background Paper on Measuring
the Contribution of Earth Science Applications to National Initiatives."
Discussion Paper 05-26. Washington, DC: Resources for the Future. At
http://www.rff.org (accessed May 2007).
MacKay, D. J. C. (2003). Information Theory, Inference and Learning Algorithms,
Cambridge University Press.
Made, W. J. v. d. (1988). Analysis of some criteria for design and operation of surface
water gauging networks, The Hague.
Majda, A. J., Kleeman, R., and Cai, D. (2002). "A mathematical framework for
quantifying predictability through relative entropy." Meth. Appl. Anal, 9, 425-444.
Marcelino, M. J., Gomes, C. A., Silva, M. J., Gouveia, C., Fonseca, A., Pestana, B., and
Brigas, C. (2007). "[email protected] Internet: Children as Multisensory
Geographic Creators." Computers and Education: E-learning from theory to
practice, B. Fernández Manjon, et al. (eds.), ed.
Markus, M., Vernon Knapp, H., and Tasker, G. D. (2003). "Entropy and generalized least
square methods in assessment of the regional value of streamgages." Journal of
Hydrology, 283(1-4), 107-121.
Martinez, A. (1981). "Subsidencia y geomorfología de la depresión inundable del río
Magdalena." Revista CIAF 6, No 1-3, 319-328, Bogotá, Colombia.
McGill, W. J. (1954). "Multivariate information transmission." Psychometrika, 19(2), 97-116.
Melchers, R. E. (1999). Structural reliability analysis and prediction, John Wiley &
Sons, New York.
Milgrom, P., and Weber, R. J. (1982). "The value of information in a sealed-bid auction."
Journal of Mathematical Economics, 10(1), 105-114.
Mishra, A. K., and Coulibaly, P. (2009). "Developments in hydrometric network design:
A review." Reviews of Geophysics, 47(2).
Mitch, R. M. (1973). "Canal del Dique Survey Project." Misión Técnica
Colombo-Holandesa, NEDECO report, The Hague.
Mogheir, Y., de Lima, J., and Singh, V. P. (2004). "Characterizing the spatial variability
of groundwater quality using the entropy theory: I. Synthetic data." Hydrological
Processes, 18(11), 2165-2179.
Mogheir, Y., and Singh, V. P. (2002). "Application of Information Theory to
Groundwater Quality Monitoring Networks." Water Resources Management,
16(1), 37-49.
Mogheir, Y., Singh, V. P., and de Lima, J. (2006). "Spatial assessment and redesign of a
groundwater quality monitoring network using entropy theory, Gaza Strip,
Palestine." Hydrogeology Journal, 14(5), 700-712.
Molgedey, L., and Ebeling, W. (2000). "Local order, entropy and predictability of
financial time series." The European Physical Journal B-Condensed Matter,
15(4), 733-737.
Moon, Y. I., Rajagopalan, B., and Lall, U. (1995). "Estimation of mutual information
using kernel density estimators." Physical Review E, 52(3), 2318-2321.
Moore, R. J., Jones, D. A., Cox, D. R., and Isham, V. S. (2000). "Design of the HYREX
raingauge network." Hydrology and Earth System Sciences, 4(4), 521-530.
Moradkhani, H., Hsu, K. L., Gupta, H., and Sorooshian, S. (2005a). "Uncertainty
assessment of hydrologic model states and parameters: Sequential data
assimilation using the particle filter." Water Resour. Res, 41, W05012.
Moradkhani, H., Sorooshian, S., Gupta, H. V., and Houser, P. R. (2005b). "Dual state–
parameter estimation of hydrological models using ensemble Kalman filter."
Advances in Water Resources, 28(2), 135-147.
Moss, M. E. (1976). "Design of Surface Water Data Networks for Regional Information."
Hydrological Sciences Bulletin, 21(1).
Moss, M. E., and Karlinger, M. R. (1974). "Surface Water Network Design by
Regression Analysis Simulation." Water Resources Research, 10(3).
Moss, M. E., and Tasker, G. D. (1991). "Intercomparison of hydrological network-design
technologies." Hydrological Sciences Journal/Journal des Sciences
Hydrologiques, 36(3), 209-221.
Múnera, M. B., Daza, J. M., and Páez, V. P. (2004). "Ecología reproductiva y cacería de
la tortuga." Rev. biol. trop, 52(1).
Muñoz, E. M., Ortega, A. M., Bock, B. C., and Páez, V. P. (2003). "Demography and
nesting ecology of green iguana, Iguana iguana (Squamata: Iguanidae), in 2
exploited populations in Depresión Momposina, Colombia." Revista de biología
tropical, 51(1), 229.
Naranjo, L. G., Andrade, G. I., and Ponce, E. (1999). "Humedales interiores de
Colombia." Bases técnicas para su conservación y uso sostenible. Instituto
Humboldt y Ministerio del Medio Ambiente. Bogota.
Nare, L., Love, D., and Hoko, Z. (2006). "Involvement of stakeholders in the water
quality monitoring and surveillance system: The case of Mzingwane Catchment,
Zimbabwe." Physics and Chemistry of the Earth, 31(15-16), 707-712.
Naz, N. N. (2006). "Urban Flood Warning System with wireless technology: Case Study
of Dhaka City – Bangladesh," MSc Thesis, UNESCO-IHE, Delft, NL.
Niinioja, R., Holopainen, A. L., Lepistö, L., Rämö, A., and Turkka, J. (2004). "Public
participation in monitoring programmes as a tool for lakeshore monitoring: the
example of Lake Pyhäjärvi, Karelia, Eastern Finland." Limnologica, 34(1-2), 154-159.
Pappenberger, F., Harvey, H., Beven, K., Hall, J., and Meadowcroft, I. (2006). "Decision
tree for choosing an uncertainty analysis methodology: a wiki experiment
http://www.floodrisknet.org.uk/methods http://www.floodrisk.net."
Hydrological Processes, 20(17), 3793-3798.
Pardo-Igúzquiza, E. (1998). "Optimal selection of number and location of rainfall gauges
for areal rainfall estimation using geostatistics and simulated annealing." Journal
of Hydrology, 210(1-4), 206-220.
Philippatos, G. C., and Wilson, C. J. (1972). "Entropy, market risk, and the selection of
efficient portfolios." Applied Economics, 4(3), 209-220.
Ramirez, J., Adamowicz, W. L., Easter, K. W., and Graham-Tomasi, T. (1988). "Ex Post
Analysis of Flood Control: Benefit-Cost Analysis and the Value of Information."
Water Resources Research, 24(8).
Reichard, E. G., and Evans, J. S. (1989). "Assessing the value of hydrogeologic
information for risk-based remedial action decisions." Water Resour. Res., 25(7).
Restrepo, J. D. (2005). Los sedimentos del río Magdalena: Reflejo de la crisis ambiental,
Universidad EAFIT, Colciencias, Medellín, Colombia.
Restrepo, J. D. (2008). "Applicability of LOICZ catchment-coast continuum in a major
Caribbean basin: The Magdalena River, Colombia." Estuarine, Coastal and Shelf
Science, 77(2), 214-229.
Restrepo, J. D., Zapata, P., Díaz, J. M., Garzón-Ferreira, J., and García, C. B. (2006).
"Fluvial fluxes into the Caribbean Sea and their impact on coastal ecosystems:
The Magdalena River, Colombia." Global and Planetary Change, 50(1-2), 33-49.
Rivera, H., Zamudio, E., and Pinzón, H. "Modelación hidrológica en tiempo real para
soportar las decisiones en el sector de navegación del río Magdalena (in Spanish)
Real time hydrological modeling for navigation sector in Magdalena River." XVI
Seminario Nacional de Hidrología e Hidráulica, Sociedad Colombiana de
Ingenieros, Armenia, Quindío, Colombia.
Roberts, M. J., Schimmelpfennig, D., Livingston, M. J., and Ashley, E. (2009).
"Estimating the Value of an Early-Warning System." Review of Agricultural
Economics, 31(2), 303-329.
Rodriguez-Iturbe, I., and Mejia, J. M. (1974). "The Design of Rainfall Networks in Space
and Time." Water Resources Research, 10(4), 713–728.
Rodríguez, P. (2001). "Proyecto: Recuperación de la pesca artesanal en el Magdalena
Medio. Ciénaga La Victoria." Cormagdalena, Barrancabermeja.
Rojas, N. Y. (2006). "Determinación del flujo de la información hidrológica en tiempo
real en los pronósticos hidrológicos del nivel del agua para la navegación del Río
Magdalena. Informe Final." IDEAM, Bogotá.
Romero, E. M. (2001). "Proyecto: Recuperación Natural de la Oferta Ictiológica y la
pesca artesanal en el Magdalena Medio. Ciénaga Morales." Cormagdalena, La
Gloria, Cesar.
Ruddell, B. L., and Kumar, P. (2009). "Ecohydrologic process networks: 1.
Identification." Water Resour. Res., 45(3), W03419.
Sansó, B., and Müller, P. (1997). Redesigning a Network of Rainfall Stations, Institute of
Statistics and Decision Sciences, Duke University.
Schimmelpfennig, D. E., and Norton, G. W. (2003). "What is the value of agricultural
economics research?" American Journal of Agricultural Economics, 81-94.
Schneider, T. D. (2000). "Information Theory Primer."
Schreiber, T. (2000). "Measuring information transfer." Physical review letters, 85(2),
461-464.
Shannon, C. E. (1948). "A mathematical theory of communication." Bell System
Technical Journal, 27(3), 379–423.
Shaqadan, A. (2008). "Decision Analysis Considering Welfare Impacts in Water
Resources Using the Benefit Transfer Approach," Utah State University, Logan,
Utah.
Sharma, A. (2000). "Seasonal to interannual rainfall probabilistic forecasts for improved
water supply management: Part 3—A nonparametric probabilistic forecast
model." Journal of Hydrology, 239(1-4), 249-258.
Shrestha, D. L. (2009). "Uncertainty Analysis in Rainfall-Runoff Modelling: Application
of Machine Learning Techniques," UNESCO-IHE, TU-Delft, Delft.
Silva, M. I. G. "PLAN DE MANEJO DE LOS RECURSOS ICTIOLÓGICOS Y
PESQUEROS EN EL RIO GRANDE DE LA MAGDALENA Y SUS ZONAS
DE AMORTIGUACIÓN."
Silva, M. J., Pestana, B., and Lopes, J. C. "Using a mobile phone and a geobrowser to
create multisensory geographic information." Proceedings of the 7th international
conference on Interaction design and children, 153-156.
Singh, V. P. (1997). "The use of entropy in hydrology and water resources."
Hydrological Processes, 11(6), 587-626.
Singh, V. P. (2000). "The entropy theory as a tool for modelling and decision-making in
environmental and water resources." Water S. A., 26(1), 1-12.
Smith, D. G. (1986). "Anastomosing river deposits, sedimentation rates and basin
subsidence, Magdalena River, northwestern Colombia, South America."
Sedimentary Geology, 46(3-4), 177-196.
Srinivasa, S. (2005). "Multivariate Mutual Information." University of Notre Dame,
Indiana.
Steuer, R., Kurths, J., Daub, C. O., Weise, J., and Selbig, J. (2002). "The mutual
information: Detecting and evaluating dependencies between variables."
Bioinformatics, 18(2), S231-40.
Stigler, G. J. (1961). "The economics of information." The Journal of Political Economy,
69(3).
Stokes, P., Havas, M., and Brydges, T. (1990). "Public participation and volunteer help in
monitoring programs: An assessment." Environmental Monitoring and
Assessment, 15(3), 225-229.
Valderrama, M., and Zarate, M. (1989). "Some ecological aspects and present state of the
fishery of the Magdalena River basin, Colombia, South America." Canadian
special publication of fisheries and aquatic sciences/Publication speciale
canadienne des sciences halieutiques et aquatiques. 1989.
van Andel, S. J. (2009). Anticipatory Water Management- Using Ensemble Weather
Forecasts for Critical Events, CRC Press, Delft, NL.
Van der Hammen, T. (1986). "Fluctuaciones Holocénicas del nivel de Inundaciones en la
Cuenca del Bajo Magdalena – Cauca – San Jorge (Colombia)." Geología
Norandina (10), 11-18.
Van Oijen, M., Rougier, J., and Smith, R. (2005). "Bayesian calibration of process-based
forest models: bridging the gap between models and data." Tree Physiology,
25(7), 915.
Vrugt, J. A., Diks, C. G. H., Gupta, H. V., Bouten, W., and Verstraten, J. M. (2005).
"Improved treatment of uncertainty in hydrologic modeling: Combining the
strengths of global optimization and data assimilation." Water Resources
Research, 41(1), W01017.
Wagener, T., McIntyre, N., Lees, M. J., Wheater, H. S., and Gupta, H. V. (2003).
"Towards reduced uncertainty in conceptual rainfall-runoff modelling: Dynamic
identifiability analysis." Hydrological Processes, 17(2), 455-476.
Wagner, J. M., Shamir, U., and Nemati, H. R. (1992). "Groundwater quality management
under uncertainty: Stochastic programming approaches and the value of
information." Water Resour. Res, 28(5), 1233-1246.
Walley, P. (1991). "Statistical reasoning with imprecise probabilities." Monographs on
Statistics and Applied Probability.
Watanabe, S. (1960). "Information theoretical analysis of multivariate correlation." IBM
Journal of Research and Development, 4(1), 66-82.
Wei, Y. "Variance, Entropy and Uncertainty Measure." American Statistical Association.
Weinberger, E. D. (2001). "A Theory of Pragmatic Information and Its Application to the
Quasispecies Model of Biological Evolution." Arxiv preprint nlin.AO/0105030.
WMO. (1994). "Guide to Hydrological Practices, Data Acquisition and Processing,
Analysis, Forecasting and Other Applications." 168.
Yang, Y., and Burn, D. H. (1994). "An entropy approach to data collection network
design." Journal of Hydrology, 157(1-4), 307-324.
Yankovsky, S. (2000). "CONCEPTS OF THE GENERAL THEORY OF
INFORMATION."
Yeh, M. S., Lin, Y. P., and Chang, L. C. (2006). "Designing an optimal multivariate
geostatistical groundwater quality monitoring network using factorial kriging and
genetic algorithms." Environmental Geology, 50(1), 101-121.
Yokota, F., and Thompson, K. M. (2004). "Value of Information Analysis in
Environmental Health Risk Management Decisions: Past, Present, and Future."
Risk Analysis, 24(3), 635-650.
Zidek, J. V., Sun, W., and Le, N. D. (2000). "Designing and integrating composite
networks for monitoring multivariate Gaussian pollution fields." Applied Statistics,
49, 63-79.
List of figures
Figure 1-1. One of the main challenges when designing a monitoring network ................ 6
Figure 1-2. Use of models for the design of monitoring networks (this thesis) ................. 8
Figure 1-3. Outline of the thesis ....................................................................................... 10
Figure 2-1. First cyclic approach for monitoring planning
Figure 2-2. Classification of methods for design and evaluation of monitoring networks
Figure 2-3. Flowchart for VOI estimation
Figure 3-1. Limits of the Delfland Water Board and in the province of Zuid Holland
Figure 3-2. Land uses, main water system components of Delfland region and location of the polders of Pijnacker. Adopted from Lobbrecht (1997)
Figure 3-3. Interest-weighting chart Delfland water system. Adopted from Lobbrecht (1997), p 183
Figure 3-4. Land use in the region of Pijnacker
Figure 3-5. Schematic profile of the polders of Pijnacker
Figure 3-6. Composition of the polder system of Pijnacker and identification of pump stations
Figure 3-7. Canal network and target water levels in the polders of Pijnacker.
Figure 3-8. Storage volume of the Pijnacker’s canal network
Table 3-1. Characteristics of the pump stations in the Pijnacker polders
Figure 3-9. Location of the existing water level gauges in the Pijnacker polders.
Figure 3-10. Connection points of the hydrodynamic (HD) model and the rainfall-runoff (RR) model
Figure 4-1. General location of the Magdalena River and its catchment
Table 4-1. Mean hydraulic slope by sectors, Magdalena River
Figure 4-2. Main tributaries and towns of the middle and low Magdalena River
Figure 4-3. Mean discharges of the main tributaries of the middle and low sector of the Magdalena River
Figure 4-4. Inner delta and wetlands of the Momposina depression
Figure 4-5. Age of the existing limnigraphic and limnimetric stations in the middle and low Magdalena River (based on the year 2010)
Figure 4-6. Available hydrologic data records of discharges (Q) and water levels (h) at river stations for 1995.
Table 4-2. Number of days of 1995 with discharge and stage data and datum of gauges
Figure 4-7. Mean discharge of tributaries of the Magdalena River and mean discharges for the year 1995, used as model inputs.
Figure 4-8. Example of a composite cross section near La Dorada - Puerto Salgar
Figure 4-9. Assumed grouping of the wetland system for the model
Table 4-3. Characteristics of the grouped wetlands
Figure 4-10. Modelled and measured discharges at Regidor (a) and Calamar (b), first check
Figure 4-11. Modelled and measured discharges at Regidor (a) and Calamar (b), second check
Figure 4-12. Modelled and measured discharges at Regidor (a) and Calamar (b), final result
Figure 4-13. Modelled and measured discharges at Santa Ana station, Mompox branch (final result)
Figure 4-14. Modelled and measured absolute water levels at Regidor station (final result)
Figure 5-1. Rainfall event used in the hydrodynamic model
Figure 5-2. Example of original water level time-series and its quantized version, at a point located downstream of a pumping station
Figure 5-3. Entropy map of Pijnacker region
Figure 5-4. Directional Information Transfer Index (DIT) map for the point A (bits).
Figure 5-5. Step-by-step solution for the location of water level monitors using I(X;Y) as the dependency criterion. Entropy of the currently selected point is shown at each step
Figure 5-6. Step-by-step solution for the location of water level monitors using DITXY. Entropy of the currently selected point is shown at each step (bits).
Figure 5-7. Step-by-step solution for the location of water level monitors using DITYX.
Figure 5-8. Location of water level monitors obtained by the WMP approach, using I(X;Y), DITXY and DITYX as pairwise dependency criteria
Figure 5-9. Evolution of the values of Joint Entropy and Total Correlation as new monitors are added to the solution set
Figure 5-10. Average percentage of common monitor locations comparing the solution obtained for each value of a with all other calculated solutions (a=1,2,…,10,15,20)
Figure 5-11. Venn diagrams illustrating the proposed optimization problem. (A): Information content of 10 variables and their common information; (B), (C) and (D): possible solutions for the selection of three monitor locations (1), obtained by maximizing joint entropy (2) and minimizing Total Correlation (3)
Figure 5-12. Definition of low-entropy points to be discarded from the search space for the optimization, according to the relative frequency of the entropy of the points in the Delfland system
Figure 5-13. Pareto-optimal set of solutions discriminated by EPM, approach a). Extremes Xa and Ya are indicated for further analysis. Results obtained with WMP method (Alfonso et al. 2010b) are also indicated.
Figure 5-14. Delfland water system with location of solutions for approach a) obtained at the extremes Xa and Ya of the Pareto frontier of Figure 5-13. Solution for S=EPM is also included. Scale represents the marginal entropy at each system point estimated with Eq. (2-1)
Figure 5-15. Pareto-optimal front, approach b), discriminated by the number of times the minimum distance ds is violated by the solution set. Extremes Xb and Yb are indicated for further analysis. Results obtained with WMP method (Alfonso et al. 2010b) are also indicated.
Figure 5-16. Delfland water system with location of solutions for approach b) obtained at the extremes Xb and Yb of the Pareto frontier of Figure 5-15. Location of existing hydraulic structures is also included. Scale represents marginal entropy values at each system point (bits).
Figure 5-17. Progress of information quantities as new monitors are added. Analysis for extremes Xa and Ya of Figure 5-13
Figure 5-18. Progress of information quantities as new monitors are added for solution of approach a), for extremes Xb and Yb of Figure 5-15.
Figure 5-19. Sensitivity of the maximum Joint Entropy (1) and Total Correlation (2) due to variations of the parameter u, discriminated by the number of new monitors in the solution
Figure 5-20. Sensitivity of the maximum Joint Entropy (left) and Total Correlation (right) due to variations of the parameter q, discriminated by the number of new monitors in the solution
Figure 5-21. Flowchart rank-based greedy algorithm for Joint Entropy (a) and Total Correlation (b)
Figure 5-22. Available hydrologic data records of discharges (Q) and water levels (h) at river stations for 1995.
Figure 5-23. Entropy Map for a=200 m3/s in bits (a) and mean discharge map for 1995 in m3/s (b), for the Magdalena River
Figure 5-24. Solutions for multiobjective optimization approach. Black dots form the best Pareto front obtained by selecting the best points of the 5 combinations (P, G). Points A, B, C and D are selected for further analysis for 9 decision variables.
Figure 5-25. Location of selected solutions A, B, C and D of Figure 5-24 for 9 decision variables
Figure 5-26. Location of the most informative (and redundant) solution obtained for 6, 7, 8 and 9 monitors (the rightmost black dots of each Pareto front of Figure 5-24)
Figure 5-27. Results obtained running the flowchart of Figure 5-21(a). Numbers represent the order in which each monitor was selected. The colour scale represents entropy (bits).
Figure 5-28. Results obtained running the flowchart of Figure 5-21(b). Numbers represent the order in which each monitor was selected. The colour scale represents entropy (bits).
Figure 5-29. Entropy values before, at and after the main tributaries
Figure 5-30. Solutions obtained by different methods, Total Correlation – Joint Entropy plane.
Figure 5-31. Entropy maps for different values of a, Eq. (5-1)
Figure 6-1. Variation of the Value of Information when changing the prior probability Ss and the conditional probabilities qm,s for the consequence matrix shown in Table 6-6
Figure 6-2. Definition of Vx(x) and Vx(y)
Figure 6-3. Definition of Vx(y), Vx and VOIx for a monitor located at x to give the state of the system at y for infinite (a) and finite (b) number of calculation points y
Figure 6-4. Value of two monitors
Figure 6-5. Selection of the best monitors out of three possibilities
Figure 6-6. VOI-related areas to optimise the monitor locations a and b, Eq. (6-4)
Figure 6-7. VOI-related areas to optimise the monitor locations a, b and c, Eq. (6-5).
Figure 6-8. Definition of flood levels for the canal-pump experiment compared to the minimum, mean and maximum water levels obtained by the model
Figure 6-9. Prior beliefs Ss estimated with the Table 6-1
Figure 6-10. Mean of Eq. (6-2) for all x in the water system and zoomed curve for the no-flood area
Figure 6-11. V curve for the calculation point with the highest VOIx (point 43)
Figure 6-12. V curves for the monitors 37 and 57, after solving Eq. (6-4)
Figure 6-13. V curves for the monitors 2, 50 and 73, after solving Eq. (6-5)
Table 6-8. Definition of the situations for estimation of qm,s for the Magdalena River case
Figure 6-14. VOI and the effect of lagged time series
Figure 6-15. Mean Value of Information estimated with Eq. (6-2) for different values of celerity in the Magdalena River, using a state threshold definition of 80%
Figure 6-16. Mean Value of Information estimated with Eq. (6-2) for different values of celerity in the Magdalena River, using a state threshold definition of 50%
Figure 6-17. Mean Value of Information estimated with Eq. (6-2) for different values of celerity in the Magdalena River, using a state threshold definition of 20%
Figure 6-18. Results for one, two and three monitor locations for different celerity values and 80% state threshold definition
Figure 6-19. Results for one, two and three monitor locations for different celerity values and 50% state threshold definition
Figure 6-20. Results for one, two and three monitor locations for different celerity values and 20% state threshold definition
Figure 6-21. Mean VOI in the Pijnacker water system, with simplified inputs.
Figure 6-22. Location of one (a) and two (b) monitors for the Pijnacker water system.
Figure 6-23. Selected subsystem of the Pijnacker polder system
Figure 6-24. Definition of the possible states, land uses and damage function (consequences)
Table 6-11. Situations
Table 6-12. qm,s matrix for the on-status of the pump downstream y according to the situations of Table 6-11
Table 6-13. qm,s matrix for the off-status of the pump downstream y according to the situations of Table 6-11
Figure 6-25. VOIx maps considering different data sets
Figure 6-26. Results for the selected subsystem of the Pijnacker water system using calculated prior beliefs
Figure 7-1. Flowchart describing the MoMoX general procedures
Figure 7-2. Technology behind MoMoX
Figure 7-3. MoMoX website showing gauge 8 info, and the current water level graph.
Figure 7-4. Gauges with scales in cm (a), dm (b) and m (c)
Figure 7-5. Examples of validation errors related to gauge scale
Figure 7-6. Example of error related to datum without negative sign.
Figure 7-7. Example of error related to adding an unnecessary negative sign.
Figure 7-8. Example of random errors due to differences in appreciation
Figure 7-9. Water level charts obtained with the data sent by the residents and other participants during the second stage of the experiment
Figure 7-10. Description of the area, absolute elevations and location of hydraulic structures
Figure 7-11. Zero-model, SMS data and retrieved patterns for the 6 gauged points
List of tables
Table 3-1. Characteristics of the pump stations in the Pijnacker polders
Table 4-1. Mean hydraulic slope by sectors, Magdalena River
Table 4-2. Number of days of 1995 with discharge and stage data and datum of gauges
Table 4-3. Characteristics of the grouped wetlands
Table 4-4. Resistance number (Manning coefficient) at stations
Table 5-1. Summary of monitors obtained by each dependency criteria and corresponding values of joint entropy and total correlation
Table 5-2. Number of solutions for approach b) with minimum distance violations by pumps and weirs
Table 5-3. Sensitivity analysis for parameter u.
Table 5-4. Sensitivity analysis for parameter q.
Table 6-1. Definition of the vector Ss for two possible states of a water system
Table 6-2. Possible situations of messages at x for given states at y for the case of two possible states
Table 6-3. Definition of conditional probabilities qm,s according to the situations presented in Table 6-2
Table 6-4. Definition of the Cas matrix.
Table 6-5. Definition of actions, states and messages for the canal-pump case
Table 6-6. Consequences of doing action a given state s (costs units)
Table 6-7. Consequences of doing action a given state s (costs units)
Table 6-8. Definition of the situations for estimation of qm,s for the Magdalena River case
Table 6-9. Definition of conditional probabilities qm,s according to the situations presented in Table 6-8
Table 6-10. Table of consequences Cas for different land uses for the Pijnacker region
Table 6-11. Situations
Table 6-12. qm,s matrix for the on-status of the pump downstream y according to the situations of Table 6-11
Table 6-13. qm,s matrix for the off-status of the pump downstream y according to the situations of Table 6-11
Table 7-1. Description of MoMoX stages
Table 7-2. Parameters of the zero-model and SMS model, all dimensionless except V3 to V5 (m3/s)
Table 7-3. Parameters of the patterns that fit better the SMS data
Notations
a          Quantization coefficient
H(X)       Entropy of the random variable X
I(X;Y)     Mutual information or transinformation between random variables X and Y
H(X,Y)     Joint entropy of the random variables X and Y
C(X,...,Z) Total correlation among the random variables X,...,Z
DITx,y     Directional Information Transfer of a variable transmitted from x to y
T          Dependency matrix built with a given pairwise criteria
v          Dependency vector given by a row or column of matrix T
xq         Quantized value of x
u(a,p)     Utility of the action a chosen with a probability p about the state of the system
cas        Consequences of performing the action a when the system has a state s
qs,m       Conditional probability of receiving the message, m, given the state, s
Ss         Decision-maker’s prior probability about the state of the system
Ss,m       Decision-maker’s posterior beliefs after receiving additional information
Vx(y)      Value of a monitor located at x that provides messages about any other point y
Vx         Value of Information curve of a monitor located at x about the entire water system
VOIx       Value of Information of a monitor located at x about the entire water system
Abbreviations
DIT     Directional Information Transfer
EPM     Existing Pump Monitors
HD      Hydrodynamics
HS      Hydrosheds (hydrologically-corrected version of SRTM elevation data)
ICT     Information and Communication Technologies
IDEAM   Institute for Hydrology, Meteorology and Environmental Studies, Colombia
IT      Information Theory
JH      Joint Entropy
masl    Meters above sea level
MOGA    Multi Objective Optimisation with Genetic Algorithms
MoMoX   Mobile Monitoring Experiment
MOOP    Multi Objective Optimisation Problem, see section 5.3.2
NSGA    Non-dominated Sorting Genetic Algorithm
RR      Rainfall runoff
SMS     Short Message Service
SRTM    Shuttle Radar Topography Mission
VOI     Value of Information
WMO     World Meteorological Organisation
WMP     Water Level Monitoring Design in Polders, see section 5.2.1
Acknowledgements
From the scientific side of this research, I recognise the support of two persons during
these years. In the first place, Prof. Roland K. Price, for whom I have two main reasons
to thank apart from his continuous guidance and support: his capacity to keep my
motivation at a high level during the last few years and the confidence he built in me to
successfully perform independent research. I shall always be grateful to him, especially
because his teaching has made me a better person. In the second place, the supervision of
Dr. Arnold Lobbrecht was key because he always facilitated my research path in many
aspects, including financial, moral and scientific. I would also like to express my
gratitude to the thesis committee members for their interest and valuable comments on
my work. On the financial side, I am very grateful to the Delft Cluster and
Hoogheemraadschap van Delfland for providing the funds necessary for this research.
The methods and theories developed in this thesis have been applied to two case studies,
the polders of Pijnacker and the Magdalena River (Colombia). This would not have been
possible without the support of many people. In the first place, I need to recognise and thank
the Delfland Waterboard staff for their support, in particular Ir. Jan Dragt as the person
responsible for opening and maintaining the connections required with the Waterboard
and for facilitating the interesting discussions we had during the research; Ir. Job van
Dansik for his valuable inputs and positive criticism during my presentations; Ir. Peter
Hollanders for facilitating all the information I needed and for stimulating discussions.
Many thanks also to Ir. Laura Haitel, who guided me through the flat, beautiful landscape
of the polders of Pijnacker and helped me to understand how this complex system works;
to Frank Keijzer, who performed the labelling of the gauges used by the MoMoX
experiment and provided additional useful information, and to Arie Boele, who provided
me with maps of the gauges and other information about the water system.
I feel blessed for the enthusiasm and assistance I received during my time collecting data
in Colombia. I especially thank Ing. Paulino Galindo, from Cormagdalena, who always
supported this project by providing all the information I asked for. I am especially
grateful to Ing. Eduardo Bravo, who introduced me to the secrets of the Magdalena River
at the very early stages of my professional career. His experience, stories and ideas about
how the River should be managed always inspired me; and this thesis was the perfect
excuse for me to get to know more about our River.
The people from the Laboratory of Hydraulics (LEH) of the National University of
Colombia, one of the institutions in charge of performing studies in the Magdalena River,
have been of great help to me. In particular, I would like to thank Ing. Rafael Ortiz, Ing.
William Perdomo, Ing. Crisitan Plazas, Ing. Marcela Rodríguez, and Ing. Pedro León,
who provided the vital links between Colombia and Delft. My visit to Barranquilla was
so short for the amount of data I needed to collect that it would not have been
possible without the kind support of Ing. Manuel Alvarado, director of the Laboratory
of Hydraulics of Las Flores (LEH-LF); I am also grateful to Ing. Holbert Corredor, a very
experienced engineer who knows the problems and the river shape in its lower zone very
well and with whom I had the opportunity to have a very valuable discussion in
Barranquilla. I want to thank Mrs. Myriam Mercado, who was in charge of collecting and
storing the data I needed. During my visit to Barrancabermeja, I had the kind support of
Ing. Martha Isabel Gualdrón and Ing. Claudia Patricia Guevara, who provided me with
many of the documents referenced in Chapter 4. Finally, I must acknowledge Ing. Alvaro
Sanjinés in Bogotá and especially Ing. Jorge Enrique Saenz, who shared with me his
valuable experience of the Magdalena River.
I have had the fortune to guide two MSc students who supported this research. The
first was Liyan He, who succeeded in the difficult task of building the model for the
Magdalena River and applying some of the Information Theory approaches developed in
Chapter 5. I recognise Liyan’s ability to get through huge amounts of data in Spanish.
The second student was Lasantha Rupasinghe, who worked on applying Information
Theory approaches to placing sensors for pollution in water distribution networks.
Although his efforts are not explicitly presented in this thesis due to the different nature
of the system under consideration, he certainly contributed to the modifications of the
developed methods and demonstrated their applicability to different water systems. I
thank Liyan and Lasantha for sharing with me their time and efforts.
Regarding the MoMoX experiment, I express my sincere gratitude to all the people who
supported me in different ways, among others, the MSc students of Hydroinformatics
2009-2011, who participated in the pilot tests of the experiment; Ewoud Kok, who
assisted me in advertising the experiment on UNESCO-IHE’s screens; and the ladies from
the Reception Desk, who took care of the MoMoX advertisement umbrella for some
weeks. The experiment of 22 May was successful thanks to the participants who identified
themselves with the following nicknames: nathasja, vladyman, CarlosV, Arlex, angela,
Juliette, Gaetano, Aleyda, Leo, Erica_Nino, Marwa, Mesgana, Carolina and odsuren. In
the same way, for the experiment held in July in Pijnacker, I thank the residents identified
by the nicknames josa, 2641, Rein, Eric, Joanne and Roos. In particular, I thank Rein and
Eric, who actively participated in the experiment and gave invaluable feedback through
telephone interviews.
I am thankful to two Dutch persons who were particularly important for the completion
of this dissertation. First, Ir. Steven Weijs, with whom I held discussions from the very
early stages of this research, especially regarding Information Theory. Second,
Rosemarijn te Horst, who kindly supported the MoMoX experiment in regard to
communicating with people living in the polder, visits to the Pijnacker region, and
translations of handouts in Dutch.
I am also grateful to Prof. Dimitri Solomatine, Dr. Andreja Jonoski, Dr. Ioana Popescu,
Dr. Biswa Bhattacharya and the entire Hydroinformatics core at UNESCO-IHE. I
particularly thank Ir. Jan Luijendijk, Ir. Judith Kasperma, Dr. Schalk Jan van Andel, Dr.
Arnold Lobbrecht as well as Dr. Peter Kelderman, from Environmental Resources
Department, for helping me with the Dutch translation of the summary and propositions. I
enjoyed the day-by-day discussions with all of them.
Similarly, I want to express my gratitude to my PhD colleagues, including of course the
glorious Tax-Free Employees football team (we are the champions) and those with whom
I spent time discussing some of the topics of this thesis: Solomon Seyoum, Michael Siek,
Durga LaL Shrestra, Fikri Abdulah, Ivan García, Carolina Rogeliz, Mónica Sáenz. My
appreciation goes to all the IHE colleagues whom I have come to know during these years at
UNESCO-IHE.
I need to thank also my colleagues and friends Carlos Velez, Arlex Sánchez, Wilmer
Barreto and Gerald Corzo for their feedback on my research during our lunch breaks and
their inputs to motivate my creativity, not only for my PhD but for my entrepreneurial
ambitions. I am grateful to my friends Fernando, Claudia, Zaira, Javier, Roberto and
Berenice, who were very close to me at the early stages of my PhD and have supported
me during this process; also to Angela, Gaetano, Carmen, Carlos, Diego, Carol, Gianluca,
Diana, Natashja and Ruzika, whom I consider an extension of my family in The
Netherlands. Special thanks go to Patricia Nieto, my mother-in-law, who indirectly
helped me to finalise this document.
Last but not least, I would like to thank my family, father, mother and brothers for their
support from a distance. Above all, my gratitude goes to my beloved wife, Sandra, who has
always supported me unconditionally during all the stages of this research. Cosita,
thank you for your support and patience during these last years. This is your achievement
too. I am grateful for your sacrifice and will always be thankful to you for those two
wonderful beings, Valentina and Sebastian.
About the author
José Leonardo Alfonso Segura was born in 1974 in Bogotá, Colombia. In 1999 he
graduated as a Civil Engineer from the Faculty of Engineering of the National University
of Colombia, where his father taught Fluid Mechanics and Hydraulics for 35 years. After
his graduation, and influenced by his father’s steps, he entered the LEH Laboratory as a
Junior Engineer, in the same University. There he worked on projects concerning the
Magdalena River that included navigation path designs, dredging plans, dynamics of river
morphology, flood and erosion control assessments. At LEH Mr. Alfonso developed his
first PC routines to speed up certain tasks, in particular for data processing.
By the end of 2001 he decided to move to the private sector, working for Aquadatos, a
consultancy company that specialises in mathematical modelling of water distribution
networks and urban drainage systems. There he used his programming skills to create
diverse tools to facilitate data processing, editing drawings and optimising software
usage. In 2003 Mr. Alfonso was promoted to Technical Director, and he continued to
develop ICT tools for the company, while managing groups of junior engineers in diverse
projects, the most important of which was the development of a methodology for assessing
the supply networks of 20 capital cities of Colombia using mathematical models. His
strong affinity for water and computers led him to seek further qualifications and to look
for specialization courses in which both aspects were combined.
He found at UNESCO-IHE, Delft exactly what he was looking for, and in 2004 he started
his MSc studies in Hydroinformatics with a fellowship granted by the Dutch
Government’s Watermil Project, and graduated in 2006 with a thesis that explored the
use of different Hydroinformatics tools for real time management of water quality in
distribution networks, with a case study in Villavicencio, Colombia. Although not in his
original plan when he first arrived in Holland, he decided to continue with PhD studies,
now working on the challenging task of developing approaches for designing and
evaluating monitoring networks. His findings are presented in this thesis.
List of publications
Alfonso, J. L., Jonoski, A., and Solomatine, D. P. (2010a). "Multiobjective
Optimization of Operational Responses for Contaminant Flushing in Water
Distribution Networks." Journal of Water Resources Planning and Management,
136(1), 48-58.
Alfonso, L., Lobbrecht, A., and Price, R. (2010b). "Information theory-based
approach for location of monitoring water level gauges in polders." Water Resources
Research, 46(3), W03528.
Alfonso, L., Lobbrecht, A., and Price, R. (2010c). "Optimization of Water Level
Monitoring Network in Polder Systems Using Information Theory." Water Resources
Research, doi:10.1029/2009WR008953, in press.
Alfonso L., Lobbrecht A., Price, R. (2010d) “Value of Information for locating water
level and flow monitors”, Proc. of 9th International Conference in Hydroinformatics,
Tianjin, China.
Alfonso L., Lobbrecht A., Price, R. (2010e) “Mobile phones for extreme events
modelling validation”, Proc. of 9th International Conference in Hydroinformatics,
Tianjin, China.
Alfonso L., Lobbrecht A., Price, R. (2010f) “Coupling hydrodynamic models and
Value of Information (VOI) for designing stage and flow monitoring networks”.
Submitted.
Alfonso L., Lobbrecht A., Price, R. (2010g) “Information Theory Applied to the
Monitoring Network for the Magdalena River”. Submitted.
Alfonso L., Lobbrecht A., Price, R. (2009) “Locating monitoring water level gauges:
an Information Theory Approach”, Proc. of 8th International Conference in
Hydroinformatics, Concepción, Chile.
Alfonso L. and Lobbrecht A., (2007) “Maximising information content from
monitoring networks for optimal performance of water systems”. Geophysical
Research Abstracts, EGU, vol 9:06836, Wien.
Alfonso, L.; Jonoski, A.; Solomatine, D., (2007) “Optimisation of operational
responses to non-deliberate contamination events in water distribution networks”.
Geophysical Research Abstracts, EGU, vol 9:11567, Wien.
Umbarila P., Alfonso L. (2003) Metodología para la evaluación y monitoreo de la
gestión de redes de abastecimiento (MESGRA). Segunda Conferencia internacional
en uso eficiente del agua EFFICIENT 2003, Tenerife, España.
Bravo E., Alfonso L. (2001) Sistematización del diseño de enrocados para protección
de orillas de ríos por el método de la Universidad de Colorado. Revista AICUN,
Bogotá.
Summary
Measuring the various processes of the hydrological cycle is extremely important for
efficient water management. Monitoring networks provide managers with data for analysing
historical and current water management, giving them the information they need to make
decisions about their water system. Important challenges in designing and evaluating
monitoring networks include finding the right temporal and spatial scales and achieving
the desired coverage at the lowest cost.
The theoretically optimal monitoring network has long been a subject of scientific
research; nevertheless, the data obtained from existing monitoring networks turn out to
be insufficient to explain the dynamics of natural systems. This may be because the
criteria used to establish a monitoring network are often determined by political and
social factors rather than scientific ones. New approaches that take into account the
nature of the decision to be made and the voice of the stakeholders can bridge the gap
between theory and practice.
This doctoral research, funded by Delft Cluster, the Hoogheemraadschap van Delfland
and UNESCO-IHE, addresses innovative methods for designing and evaluating monitoring
networks. The guiding idea is to develop an optimally performing water system on the
basis of optimal information from the monitoring network in that water system. By the
performance of a water system we mean the classification of the behaviour of the water
with respect to each specific water use and interest within the water system. The level
of performance of the water system should determine the decisions that are taken about
it. By maximising information we mean finding those candidate locations where
measurements give the best indication of the state of every point in the water system.
This research rests on three pillars: Information Theory, Value of Information (VOI),
and data collection through public participation.
The first pillar is a method in which the locations of water level monitors are
determined on the basis of Information Theory, with emphasis on reducing uncertainty.
Using Shannon’s Information Theory, it proved possible to determine the monitoring
locations for water levels in a water system such that uncertainty is reduced. Two
methods were developed. The first is the Water Level Monitoring Design in Polders (WMP)
method, in which the monitors are placed one by one in such a way that the information
content of the set of water level measurements remains as large as possible, while the
information shared among the monitors remains as small as possible. The second method is
based on a Multi-Objective Optimization Problem (MOOP), in which two practical
considerations are weighed: 1) the cost of installing new monitoring equipment, and
2) the cost of placing monitors too close to a hydraulic structure. The costs are
expressed in information units, for which additional terms were formulated in the
objective function of the optimisation problem. Furthermore, the MOOP method was
extended with a ‘rank-based greedy’ algorithm and applied to determine the location of
discharge monitors in the Magdalena River.
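The one-by-one placement idea can be illustrated with a minimal sketch. It is not the WMP method as implemented in the thesis: it assumes the water levels at each candidate location have already been quantised into discrete classes, and it uses a simplified criterion in which each new monitor is the one that maximises the joint entropy of the selected set, which favours high information content and penalises information already shared with chosen monitors. The names `entropy`, `joint_entropy`, `greedy_select` and the toy `series` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy in bits of a sequence of hashable symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * log2(c / n) for c in counts.values())


def joint_entropy(series, names):
    """Joint entropy of the named series; each time step becomes one tuple."""
    return entropy(list(zip(*(series[k] for k in names))))


def greedy_select(series, n_monitors):
    """Pick locations one by one, each time maximising the joint entropy."""
    chosen = []
    remaining = sorted(series)  # sorted so that ties are broken by name
    while remaining and len(chosen) < n_monitors:
        best = max(remaining, key=lambda k: joint_entropy(series, chosen + [k]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy quantised water-level classes at three candidate locations (invented data).
series = {
    "A": [0, 0, 1, 1, 2, 2, 3, 3],  # 2 bits of information
    "B": [0, 0, 1, 1, 2, 2, 3, 3],  # identical to A, so fully redundant
    "C": [0, 1, 0, 1, 0, 1, 0, 1],  # 1 bit, but independent of A
}
selected = greedy_select(series, 2)
```

With this toy data the second monitor goes to location C rather than B: although B carries more information on its own, all of it is shared with A, so it adds nothing to the set.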
The second pillar deals with a method for placing monitors according to the value that
the user assigns to the collected information for decision making. For this purpose,
concepts from the theory of Value of Information are used. This value is defined as the
expected difference between the payoff of choosing a particular action on the basis of
the existing estimate of the state of the water system, and the payoff when the action
is chosen taking into account the additional information from the monitor. The method
selects the most valuable set of monitoring locations for a given water system on the
basis of the user’s prior beliefs about the state of the system, the consequences for
the performance of the system, and the quality of the information that the monitoring
network could provide. This new approach is elaborated by developing a theory for
determining the value of placing a single monitoring station and extending it to n
monitoring stations. In addition, it is proposed to estimate the probabilistic variables
needed to compute the VOI with the help of computer simulation models.
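The definition of the Value of Information as an expected difference in payoff can be made concrete with a small worked sketch for a single monitor. It assumes a discrete set of system states, a discrete set of actions, and a monitor that emits discrete messages; the function names, the states, the actions and all numbers are invented for illustration and are not taken from the thesis.

```python
def expected_utility(action, belief, utility):
    """Expected payoff of an action under a belief {state: probability}."""
    return sum(p * utility[action][s] for s, p in belief.items())


def value_of_information(prior, likelihood, utility):
    """Expected payoff with the monitor's message minus payoff with the prior only.

    prior:      {state: P(state)}
    likelihood: {state: {message: P(message | state)}}
    utility:    {action: {state: payoff}}
    """
    actions = list(utility)
    # Best expected payoff when acting on the prior belief alone.
    u_prior = max(expected_utility(a, prior, utility) for a in actions)

    # Expected payoff when the best action is chosen after each message.
    messages = {m for s in likelihood for m in likelihood[s]}
    u_post = 0.0
    for m in messages:
        p_m = sum(prior[s] * likelihood[s].get(m, 0.0) for s in prior)
        if p_m == 0.0:
            continue  # this message can never occur
        posterior = {s: prior[s] * likelihood[s].get(m, 0.0) / p_m for s in prior}
        u_post += p_m * max(expected_utility(a, posterior, utility) for a in actions)
    return u_post - u_prior


# Invented example: decide whether to pump before a possible flood.
prior = {"flood": 0.3, "no_flood": 0.7}
utility = {
    "pump": {"flood": -10.0, "no_flood": -10.0},  # pumping always costs 10
    "wait": {"flood": -100.0, "no_flood": 0.0},   # waiting is costly only if it floods
}
perfect = {"flood": {"high": 1.0}, "no_flood": {"low": 1.0}}
useless = {"flood": {"high": 0.5, "low": 0.5}, "no_flood": {"high": 0.5, "low": 0.5}}

voi_perfect = value_of_information(prior, perfect, utility)
voi_useless = value_of_information(prior, useless, utility)
```

Under these assumed numbers a perfectly informative monitor is worth about 7 payoff units, because it avoids unnecessary pumping in the 70% of cases without a flood, while a monitor whose message is unrelated to the state is worth nothing.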
The third pillar of the research, of great practical value, aims to explore new
possibilities for collecting information with mobile phones and subsequently using it to
improve simulation models. Nowadays a mobile phone can no longer be regarded merely as a
device for making calls: it combines functions such as PC software, digital camera,
calculator and diary, and provides Internet, radio, television and fax services. Mobile
phones can thus also be used to observe water levels with the help of the public. The
idea is to exploit the advantages of involving the public in making observations.
Several authors describe these advantages as: creating public awareness of environmental
problems, improving cooperation among stakeholders, the possibility of cost-benefit
analysis of data collection, and the wide coverage in space and time. An important
contribution to this research was made by the ‘Mobile Monitoring Experiment’ (MoMoX), an
experiment carried out in 2010 in the polders of Pijnacker with participants of varying
degrees of involvement, including residents living close to the locations where water
levels are measured.
Two very different case studies were used to test the developed methods and theories.
The first case study is located in the polders of Pijnacker, a typical low-lying, flat
part of the Netherlands with a highly controlled water system. The area lies on the
eastern side of the management area of the Hoogheemraadschap van Delfland. It is a rural
area with some scattered urban development and greenhouses. Hydrologically, the area
consists of four main polders, subdivided into 127 smaller independent drainage units,
each with its own target water level. The water levels in the canal system are kept
between set limits by a system of pumping stations, inlets and weirs.
The second case study concerns the Magdalena River, which forms the main river system of
Colombia and is the largest river discharging into the Caribbean Sea. The river runs
1530 kilometres from south to north and drains a basin that covers 24% of the country
and is home to 77% of the population. The area studied in this research covers the
middle and lower reaches of the river. This area is important not only because of
economic activities such as navigation, fishing and agriculture, but also because it is
the area most affected by floods.
The results of this research show that monitoring networks could be evaluated and
designed on the basis of new criteria, such as the information content, the type of user
of the information and the possibilities that current mobile technology offers for data
collection. New methods have been developed that can be used to optimise monitoring
networks, combining simulation models with concepts from Information Theory and elements
of Value of Information theory. Finally, a public monitoring network for collecting
water levels using mobile phones was designed, tested and evaluated.
In addition to these results, this research opens up new possibilities for the
application of Information Theory in water management, where information is normally
quantified in entropy units. By using Value of Information concepts, both the
entropy-based methods treated in this research and the methods developed in earlier
research can be adapted so that they can also relate monitoring information to money.
Monitoring networks provide data that is analysed to help managers make informed
decisions about their water systems. Their design and evaluation pose a number of
challenges that must be resolved, among them the restriction of having only a limited
number of monitoring devices.
This book presents innovative methods to design and evaluate monitoring networks.
The main idea is to maximise the performance of water systems by optimising the
information content that can be obtained from monitoring networks. This is done
through the combination of models and two theoretical concepts: Information
Theory, initially developed in the field of communications, and Value of Information,
initially developed in the field of economics. Additionally, the research explores
the possibility of using public participation to gather information with mobile
phones in order to improve models. Two very different case studies are used to test the
developed methods and theories: Pijnacker, a typical low-lying regional polder in
the Netherlands, which is highly controlled, and the Magdalena River, the major river
system in Colombia.
The results of this research demonstrate that monitoring networks can be evaluated
and designed by considering new variables, such as the information content, the user
of the information and the potential of current mobile phones for data collection.