Integrating network monitoring,
automation, and notification
tools to centralize and automate
network processes
Lessons learned from an integrated HP
Network Automation and Network Node
Manager i software deployment with TelAlert®
notification in an MPLS environment
Technical white paper
IT challenges in a world of acquisitions
After years of growth by acquisition, a midsize
U.S.-based energy company was facing formidable
IT challenges.
From a technology standpoint, the company needed to
assimilate diverse network architectures acquired from
multiple companies with offices in several states. From an
operations standpoint, the company needed to reduce the
mean time to repair (MTTR) for network outages in field
offices. Typically, MTTR was an unacceptable two and a
half to three days.
While the company’s petroleum reserves were doubling
in size every four months, budgets did not allow network
operations staff to grow linearly with the company.
The company therefore needed integrated and automated
approaches that would make its network operations
team more productive.
To address these challenges, the company turned to Allen
Corporation of America, a highly regarded professional
services firm and HP Software and Solutions partner.
A comprehensive solution
Working closely with the energy company, Allen
developed a comprehensive network automation
and monitoring solution with integrated notification
capabilities. The solution was designed to leverage
the combined capabilities of HP Network Automation
software, HP Network Node Manager i (NNMi)
software, and the TelAlert notification system.
• HP Network Automation (version 7.5) was selected
for its ability to standardize network configurations.
It tracks, regulates, and automates configuration and
software changes across globally distributed, multivendor networks.
• HP NNMi Advanced (version 8.13) was selected to
centralize network monitoring. It provides tools for
managing unified fault, availability, performance,
and advanced network services for physical and
virtualized network infrastructure.
• HP NNMi was integrated with TelAlert to automate
the process of notifying operations personnel of
network-related issues and to avoid the need to staff
a network operations center (NOC) around the clock.
Figure 1.
Adding devices from HP NNMi to HP Network Automation
To add devices to HP Network Automation’s list of supported devices,
you add the OIDs from HP NNMi.
This paper shares key advice from Allen on three
important aspects of the solution: integrating HP
Network Automation with HP NNMi, monitoring MPLS
networks with HP NNMi, and stabilizing staffing using
TelAlert notification.
Integrating HP NNMi with HP
Network Automation
HP Network Automation and HP NNMi are designed
to work in a complementary manner. HP Network
Automation finds and configures new devices on the
network and then passes the device information to
HP NNMi. Integration takes place at the graphical
user interface (GUI) level. This integration brings HP
Network Automation diagnostics into HP NNMi.
In the background, the two applications share data with
each other. This data integration allows information on
devices to be imported into HP Network Automation from
HP NNMi. To match device records between the two
applications, HP Network Automation must know the
universally unique ID (UUID) that HP NNMi assigns to
each device. This UUID is the tag the two applications
share to make the integration work.
Linking HP Network Automation with HP NNMi
To link HP Network Automation with HP NNMi,
you run a connector installer on the HP Network
Automation server. It connects to HP NNMi and installs
the components there as well. The two applications can
run on a single server or different servers.
HP NNMi is installed first, followed by HP Network
Automation, which configures itself around HP
NNMi. The integration team initially installed the two
applications on a single server, and then later decided
to move HP Network Automation to its own machine.
They discovered that breaking and then re-establishing
the integration creates a lot of extra work. For example,
HP NNMi continues to look for HP Network Automation
on the server the two applications originally shared.
A lesson learned: Think your way through the impact of
putting both applications on a single system, especially
in light of HP NNMi’s memory requirements when
managing a large number of nodes. If you’re not sure
which approach will work best for you, take the safe
route and put the applications on different servers.
Importing HP NNMi devices into HP Network
Automation
To import HP NNMi devices into HP Network
Automation, run nnmimport on the HP NNMi server.
This step queries HP Network Automation for a list of
supported object IDs (OIDs). HP NNMi then sends HP
Network Automation information on only the nodes
with supported OIDs.
In the integration effort, the energy company wanted
all devices from HP NNMi to be sent to HP Network
Automation, which automates the complete operational
lifecycle of network devices. This approach leverages
the comprehensive discovered network inventory in
NNMi. To meet this requirement, the integration team
configured HP Network Automation to essentially pretend
that it supported a broader list of OIDs. The team
did this by adding OIDs to a configuration file on the
Network Automation server, as shown in Figure 1. This
workaround enabled HP Network Automation to receive
information on all the devices in the HP NNMi database.
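The effect of the supported-OID list can be illustrated with a minimal sketch. It does not reproduce nnmimport or the HP Network Automation configuration file. The Node class, the nodes_to_send function, the hostnames, and the OID values are all hypothetical, and the sketch assumes the OID in question is each device's SNMP sysObjectID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    hostname: str
    sys_object_id: str  # assumed to be the SNMP sysObjectID of the device


def nodes_to_send(nodes, supported_oids):
    """Keep only the nodes whose OID is in the supported set."""
    return [n for n in nodes if n.sys_object_id in supported_oids]


# Hypothetical inventory; hostnames and OIDs are made up for illustration.
inventory = [
    Node("hq-core-rtr", "1.3.6.1.4.1.99999.1.1"),
    Node("field-office-sw", "1.3.6.1.4.1.99999.2.7"),
]

# Default behavior: only one device type is on the supported list.
supported = {"1.3.6.1.4.1.99999.1.1"}
default_export = [n.hostname for n in nodes_to_send(inventory, supported)]

# Workaround: widen the supported list to cover every OID in the inventory.
widened = supported | {n.sys_object_id for n in inventory}
full_export = [n.hostname for n in nodes_to_send(inventory, widened)]

print(default_export)
print(full_export)
```

With the narrow list only the first device passes the filter. Once the list is widened, every device in the inventory is passed along, which mirrors the outcome the integration team wanted.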
Monitoring MPLS with HP NNMi
In the discovery process, HP NNMi queries devices to
determine what they connect with, and then creates
a map showing how devices are connected. When
devices are on MPLS networks, HP NNMi doesn’t
have the information it needs to understand how they
are connected to the rest of the environment. That’s
because the switches and routers in a service provider’s
environment are not exposed to customers who are
using the MPLS service.
This means that HP NNMi can’t see the explicit physical
connection at Layer 2 or Layer 3. This reality makes
discovery across virtual boundaries inherently difficult
in MPLS networks. When HP NNMi can’t determine
how devices are connected, it shows them as islands
on a network, as shown in Figure 2.
Figure 2.
Discovery islands
HP NNMi provides a map that shows how devices on the network are connected. If it can’t determine how
devices are connected, HP NNMi shows them as islands.
On an NNMi network map, failed devices turn red
to indicate a critical state. When a device in an MPLS
service provider’s environment fails, a connection in the
NNMi topology breaks, but the device causing it is not
in the NNMi topology, so no device turns red on the
map. In these cases, HP NNMi turns the nodes in the
island isolated by the failed MPLS connection to blue,
to indicate unknown status.
When an MPLS site has many nodes, HP NNMi
administrators often create a container to show just one
symbol on the network map to represent the entire site.
If all of the nodes in this container are blue, indicating
unknown status, the container itself still appears as
green on the network map, suggesting that all is well
with the nodes in the container and providing no
indication on the map that the connectivity to the site
is down. The container would turn red if one of the
nodes within the container turned red. But that doesn’t
happen because HP NNMi doesn’t know the status of
the individual nodes; it only knows that it can’t connect
to the site.
NNMi does post a critical “Root Cause” incident,
indicating that the site is isolated, but the site container
does not change to red by default. In the integration
effort, the customer wanted the container to turn red if
the site was isolated, so operators could see that they
had problems at the site represented by the container.
To make this happen, the Allen team used the
“Important Nodes” node group in HP NNMi to change
the default behavior of the software.
When a node is in the “Important Nodes” group,
it turns red when its status is unknown. With this
understanding in mind, the Allen team created a filter
to automatically populate all of the routers within
MPLS containers into the “Important Nodes” node
group. With that designation, when the status of an
MPLS router becomes unknown, it turns red, as does
the container it is housed in on the network map. The
result is that an MPLS outage now causes the isolated
sites to turn red on all map displays.
Stabilizing staffing using TelAlert
notification
In a time when the demands on network operators are
growing faster than staff headcounts, automation is
one of the keys to containing staffing levels and costs.
This was the case with the energy company, which
couldn’t justify the expense of staffing a NOC around
the clock. Instead, it wanted to leverage an automated
notification system that would work in tandem with HP
NNMi to alert its operators to network problems.
The customer did not want to bombard its operators
with alerts that indicated HP NNMi was in the process
of investigating potential network problems. It felt that
they didn’t need to be made aware of each step in the
process. What’s more, many network incidents are only
momentary issues, such as power glitches, that are
quickly cleared by HP NNMi.
With these thoughts in mind, the company decided
to suppress the step-by-step incident updates that are
issued as HP NNMi progresses through its root-cause
analysis. It wanted the notification system to send out
a single message with a final answer, such as “Node
Down,” if any message had to be sent at all.
To meet these requirements, the Allen team configured
HP NNMi to post messages to TelAlert with a built-in
three-minute delay when specific incidents enter the
“Registered” state. When these incidents change state
to “Closed,” a second call is made to TelAlert to cancel
the message. If the second call is received within three
minutes, TelAlert does not send the message.
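The three-minute delay-and-cancel behavior configured between HP NNMi and TelAlert can be sketched as a small model. This is only an illustration of the logic, not the TelAlert interface or the HP NNMi action configuration. The DelayedNotifier class, its method names, and the incident identifiers are hypothetical, and time is passed in explicitly as seconds so the example stays deterministic.

```python
class DelayedNotifier:
    """Hold each notification for a delay and drop it if cancelled in time."""

    def __init__(self, delay_seconds=180):
        self.delay_seconds = delay_seconds
        self.pending = {}  # incident id -> (due time, message)
        self.sent = []     # messages actually delivered

    def register(self, incident_id, message, now):
        """First call: the incident entered the "Registered" state."""
        self.pending[incident_id] = (now + self.delay_seconds, message)

    def close(self, incident_id):
        """Second call: the incident was "Closed"; cancel it if still pending."""
        return self.pending.pop(incident_id, None) is not None

    def tick(self, now):
        """Deliver every pending message whose delay has expired."""
        due = [i for i, (due_at, _) in self.pending.items() if due_at <= now]
        for incident_id in due:
            _, message = self.pending.pop(incident_id)
            self.sent.append(message)


notifier = DelayedNotifier()
notifier.register("inc-1", "Node Down: field-office-rtr", now=0)
notifier.register("inc-2", "Node Down: hq-core-rtr", now=0)
notifier.close("inc-1")   # cleared within the delay, for example a power glitch
notifier.tick(now=60)     # nothing is due yet
notifier.tick(now=180)    # inc-2 was never closed, so its message goes out
print(notifier.sent)
```

Only the incident that stays open for the full delay produces a message. The incident that closes early is dropped, which is how momentary issues are kept away from operators.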
In one exception to the three-minute-delay rule, the
customer wanted three core network team members to
be alerted to any incident that was kicked off by a
link-down trap. The Allen team programmed the solution to
do this.
New feature in HP NNMi 9
HP NNMi 9 introduces lifecycle transition
actions customized to node groups.
Using this feature, different messages
can be delivered through TelAlert based
on a device’s node group membership.
The case for automated notification
Staffing a NOC on a 24/7 basis is a costly proposition. To guarantee that at least one person is in
your operations center watching the console at all times, you need to have 13 people on payroll.
Why is that? This level of staffing, known as the Rule of 13, allows for a 40-hour week, vacation,
training, and sick time. The rule also acknowledges that you must actually schedule two
people for each shift, so that when one leaves to use the restroom, the NOC remains staffed. It
further assumes that each shift is ten hours long to allow handoffs between shifts, that the normal
work week is four ten-hour days, and that both shifts report for training once a week.
The good news is that the first 13 people will support a workload of up to two people at a time. The
bad news is that after you exceed the load that two people can handle, each additional seat you
need to fill in the operations center will require eight more people on payroll.
Numbers like these are a key driver for automated notification systems that free operators from the
need to be physically present in a NOC.
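The staffing arithmetic above can be written as a short function. This is a sketch of the figures as stated in the text, 13 people for the first two seats and eight more for each additional seat. The function name payroll_needed is a hypothetical label.

```python
def payroll_needed(seats: int) -> int:
    """People on payroll needed to keep `seats` NOC positions filled 24/7.

    Uses the figures stated in the text: 13 people cover up to two
    seats, and each seat beyond the second adds eight more people.
    """
    if seats < 1:
        raise ValueError("at least one seat must be staffed")
    return 13 + 8 * max(0, seats - 2)


for seats in (1, 2, 3, 5):
    print(seats, payroll_needed(seats))
```

Under these figures, a third seat raises the requirement from 13 to 21 people, and a five-seat NOC needs 37.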
Moving forward
In this project, the integration of HP Network
Automation, HP NNMi, and TelAlert notification
yielded a highly automated, customized solution for
network management. This solution is helping the
energy company assimilate acquired technology,
maintain standardized configurations, centralize
network monitoring, and reduce staffing demands with
automated notification.
In addition, the use of HP Network Automation has
helped the company condense processes and resolve
network outages in field offices in less time. With HP
Network Automation, the company has reduced MTTR
by 50 percent.
Looking ahead, the company recognizes that its
networks and IT systems must continue to evolve. In
recognition of this, Allen is working with the company
to lay the groundwork for a planned upgrade to new
versions of HP NNMi and HP Network Automation
software. In addition, the company is looking at other
offerings in the HP Data Center Automation suite and
HP Client Automation suite.
About Allen Corporation
Allen Corporation of America, Inc. is a professional
services company offering industry-leading information
technology, logistics, cyber security, and training
solutions to the private and public sectors. Allen has its
headquarters in Fairfax, Virginia, and offices throughout
the United States and in Europe.
To learn more, visit www.allencorp.com.
For more information
To learn more about HP NNMi, visit
www.hp.com/software/nnmi.
To learn more about HP Network Automation, visit
www.hp.com/go/nasoftware.
To learn more about Allen Corporation, visit
www.allencorp.com.
© Copyright 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only
warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein
should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
TelAlert is a registered trademark of MIR3. MIR3 is a service mark of MIR3, Inc.
4AA2-6870ENW, Created November 2010