Telecom Operations Support Systems for Broadband Services

Invited Paper
Journal of Information Processing Systems, Vol.6, No.1, March 2010
DOI : 10.3745/JIPS.2010.6.1.001
TOSS: Telecom Operations Support Systems for
Broadband Services
Yuan-Kai Chen*, Chang-Ping Hsu*, Chung-Hua Hu*, Rong-Syh Lin*,
Yi-Bing Lin*, Jian-Zhi Lyu*, Wudy Wu* and Heychyi Young*
Abstract—Due to the convergence of voice, data, and video, today’s telecom operators
are facing the complexity of service and network management to offer differentiated
value-added services that meet customer expectations. Without the operations support of
well-developed Business Support System/Operations Support System (BSS/OSS), it is
difficult to timely and effectively provide competitive services upon customer request. In
this paper, a suite of NGOSS-based Telecom OSS (TOSS) is developed for the support of
fulfillment and assurance operations of telecom services and IT services. Four OSS
groups, TOSS-P (intelligent service provisioning), TOSS-N (integrated large-scale network
management), TOSS-T (trouble handling and resolution), and TOSS-Q (end-to-end service
quality management), are organized and integrated following the standard telecom
operation processes (i.e., eTOM). We use IPTV and IP-VPN operation scenarios to show
how these OSS groups co-work to support daily business operations with the benefits of
cost reduction and revenue acceleration.
Keywords—Operations Support System (OSS), New Generation Operations Systems
and Software (NGOSS), enhanced Telecom Operations Map (eTOM), Internet Protocol
Television (IPTV), IP-Virtual Private Network (IP-VPN)
1. INTRODUCTION
Today, telecom operators are facing many challenges introduced by provisioning diversified
and digital convergent services in a fast-changing multi-technology network environment. It is
essential that a telecom operator quickly respond to market and technology changes, satisfy customers’ needs, and reduce operational expenditure (OPEX). To address these issues, the telecom
operator’s Operations Support System (OSS) must meet the following requirements: automated
service provisioning for fast service fulfillment, proactive and reactive monitoring for end-to-end quality assurance, efficient and effective trouble handling, and high flexibility of customization and adjustment for offering new products to market in time.
※ Yi-Bing Lin’s work was supported in part by NSC 97-2221-E-009-143-MY3, NSC 98-2221-E-009-059-MY2, NSC
98-2219-E-009-016-, Intel, Chunghwa Telecom, IBM, ITRI and NCTU joint research center, and MoE ATU plan.
Manuscript received February 16, 2010; accepted March 4, 2010.
Corresponding Author: Yi-Bing Lin
Copyright ⓒ 2010 KIPS (ISSN 1976-913X)
Fig. 1. eTOM framework: (a) Strategy, Infrastructure & Product; (b) Enterprise Management; (c) Operations
To fulfill the above requirements, Chunghwa Telecom has developed a series of OSSs called
Telecom OSS (TOSS). These OSSs support the operational processes for broadband service
delivery and quality assurance where the customer needs for new contents and services are constantly changing, and sometimes unpredictable. The design of TOSS is compliant with the enhanced Telecom Operations Map (eTOM) framework of New Generation Operations Systems
and Software (NGOSS) [1]. As illustrated in Figure 1, eTOM defines a comprehensive telecom
business process model for service providers (e.g., telecom operators and content providers).
There are three major process areas: Strategy, Infrastructure and Product (SIP; Figure 1 (a)) for
planning and life cycle management, Enterprise Management (EM; Figure 1 (b)) for corporate
or business management, and Operations (OPS; Figure 1 (c)) for core operational management.
As the heart of eTOM, the OPS consists of two vertical process groups, i.e., Operation Support
& Readiness (OSR) and Fulfillment, Assurance, Billing (FAB) which are the focal point of
eTOM framework [2, 3]. These vertical process groups represent a view of flow-through of activities where the OSR enables support and automation for FAB real-time processes. The OPS
can also be viewed horizontally through Customer Relationship Management (CRM), Service
Management & Operations (SM&O), Resource Management & Operations (RM&O) and Supplier/Partner Relationship Management (S/PRM) that represent functionally-related activities.
Chunghwa Telecom’s TOSS mainly supports the processes of the OPS in fulfillment and assurance vertical flows overlaid in the CRM, SM&O and RM&O horizontal levels. In this paper,
the terms customer, product, service, and resource follow eTOM definitions: A customer (a
person or a company) purchases products from the service providers. A product may include
one or more services, hardware, processed materials, software or their combinations. Services
are developed by a service provider for sale within products. The same service may be included
in multiple products that are packaged differently with different prices. The resources refer to
the physical or logical resources (including both network and IT) needed for constructing services or products.
Based on the eTOM structure, Section 2 introduces the major subsystems of TOSS, explains
how they map to eTOM, and illustrates the corresponding end-to-end operational scenarios.
Sections 3 - 6 elaborate on the implementations of TOSS subsystems. Section 7 concludes our
work with the operational statistics of TOSS in Chunghwa Telecom.
2. OVERVIEW OF TELECOM OPERATIONS SUPPORT SYSTEM (TOSS)
As illustrated in Figure 2, TOSS contains four major subsystems: TOSS-P (Provisioning),
TOSS-N (Network Management), TOSS-T (Trouble Management) and TOSS-Q (Quality Management).
TOSS-P corresponds to Order Handling (Figure 1 (1)) and Service Configuration & Activation in eTOM (Figure 1 (4)). TOSS-P supports light-weight Order Handling Management
(OHM) to accept customer orders. Since TOSS-P is modularly designed with open interfaces,
the OHM developed by Chunghwa Telecom can be easily replaced by other corporate or commercial order handling systems. TOSS-P implements service design functions that translate customer ordered products into service specifications and design how these services should be configured. After that, it instructs TOSS-N and service platforms (Figure 2 (1)) to perform service
activation.
TOSS-N corresponds to eTOM RM&O processes (Figure 1 (7)-(10)). This subsystem provides a complete carrier-grade solution for management of large heterogeneous telecommunications networks as well as information technology (IT) servers. TOSS-N conducts all resource-level activities including resource activation, resource testing, resource trouble detection, and
resource performance monitoring. In Figure 2, we use the Internet Protocol Television (IPTV)
service [4] as an example for management, where all network elements for constructing the
IPTV service are resources to be managed by TOSS-N. These network elements include the
ATM switches or MPLS-based routers in core networks (Figure 2 (2)); the Digital Subscriber
Line Access Multiplexers (DSLAM), the Multi-Service Access Nodes (MSAN), or the Layer 2
and Layer 3 switches in xDSL or FTTx access networks (Figure 2 (3)); and the end-user devices
such as home gateways or set-top boxes (STBs) located in customer premises (Figure 2 (4)). In
this paper, the terms device and network element are used interchangeably to represent network and
end-user equipment.
Fig. 2. TOSS architecture (and an example of management for IPTV)
TOSS-T corresponds to Problem Handling and Service Problem Management processes in
eTOM (Figure 1 (2) and (5)), which handles troubles reported by customers. Specifically,
TOSS-T generates trouble tickets, dispatches them to the appropriate operators (e.g., network
maintenance operators, field operators, etc.), and tracks them until the problems are resolved and
the services are restored. TOSS-T also interacts with TOSS-N to perform resource testing functions that assist the operators to identify and locate the causes of the service problems.
TOSS-Q corresponds to Customer Quality Management and Service Quality Management in
eTOM (Figure 1 (3) and (6)), which utilizes the network management information monitored by
TOSS-N to conduct service performance analysis based on the views from both service providers and specific customers. For the service provider, TOSS-Q is able to detect regional
service degradation and issue notifications in advance to avoid business disaster. For particular customers, TOSS-Q
detects QoS degradation to avoid service level agreement (SLA) violations. Furthermore, TOSS-Q provides service quality statistics reports for customers who sign an SLA for a guaranteed
quality level.
In summary, TOSS-P works with TOSS-N to support the end-to-end fulfillment process flow and
provide customers with their requested products in a timely manner. TOSS-T and TOSS-Q work with TOSS-N to support the end-to-end assurance process flow, ensuring that the services provided to customers
are continuously available and meet the SLA or Quality of Service (QoS) performance
levels. TOSS-P, TOSS-T and TOSS-Q provide user interfaces to customer service representatives (CSR; i.e., operators in front desk or service center) for handling customer orders and trouble appeals. TOSS-N manages the multi-technology resources and hides the resource management complexity from TOSS-P, TOSS-T and TOSS-Q.
3. INTELLIGENT SERVICE PROVISIONING (TOSS-P)
TOSS-P provides a unified approach to bundle service features across various service platforms (Figure 2 (1)) so that these service features appropriately appear to the customers as products. In addition to the light-weight OHM mentioned in the previous section, TOSS-P includes
two more components. The Designer (Figure 2 (a)) carries out necessary readiness processes (to
design service specifications, provision rules, and so on) before a product is available in the
market. Then the product can be ordered by a customer through the OHM. The Activator (Figure
2 (b)) receives every order request from the OHM and executes provisioning activities according
to the resource budgets and rules. To support end-to-end service provisioning lifecycle management, the Service Inventory (Figure 2 (c)) stores all service definitions and activation knowledge to be accessed by the Designer and the Activator.
3.1 TOSS-P Designer
Before introducing a new or blended service such as IPTV, the Designer implements design
tasks for wrapping up candidate services into new product offerings in the OHM. Specifically,
the Designer proposes candidate service specifications to present the business view of a new
service. For example, an IPTV service specification can be easily augmented with value-added
features such as “payment allowance” or “video sharing” by the Designer to meet customer
needs.
The Designer either creates new service specifications or reuses existing service specifications
in the Service Inventory to generate new composite service specifications. Through the Designer
web-based portal, a product manager retrieves, modifies and creates service specifications. For
example, the manager may create a composite IPTV service including FTTx connectivity and
IPTV multimedia service specifications. As illustrated in Figure 3 (2), a 32-bit object ID
uniquely identifies the specification for FTTx. Through ServiceSpecCharacterizedBy (Figure 3
(3)-(8)), the Designer describes the FTTx service characteristics such as upstream rate, downstream rate, class of service and so on. Under the object ID, these service specification characteristics are retrieved and managed in the Service Inventory. Every service specification characteristic is set up with its parameters. For the FTTx example, the valueFrom (the lower bound of
the upstream rate) is 2 Mbps (Figure 3 (7)) and the valueTo (the upper bound) is 20 Mbps (Figure 3 (8)).
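As a rough illustration of how the valueFrom and valueTo bounds of a characteristic might be held and checked, the following Python sketch models the upstream rate of the FTTx specification. The class name, the characteristic name "upstreamRate", the unit field, and the accepts helper are assumptions made for illustration; they are not taken from the Service Inventory schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecCharacteristic:
    """One service specification characteristic (hypothetical model)."""
    name: str
    unit: str
    value_from: float  # lower bound (valueFrom in Figure 3)
    value_to: float    # upper bound (valueTo in Figure 3)

    def accepts(self, value: float) -> bool:
        """Return True when value lies inside [value_from, value_to]."""
        return self.value_from <= value <= self.value_to


# Upstream rate of the FTTx connectivity specification: 2 Mbps to 20 Mbps.
upstream_rate = SpecCharacteristic("upstreamRate", "Mbps", 2, 20)

print(upstream_rate.accepts(10))  # True
print(upstream_rate.accepts(25))  # False
```

A requested bandwidth can then be checked against the specification before an order is accepted.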
Besides managing service specifications, the Designer defines a provision sequence for every
service. For the IPTV service, the first step of the sequence reserves connectivity and the IP
address of the STB. The second step configures the circuit and user profile at the IPTV application server. This sequence is executed when a customer subscribes to the IPTV product.
The Designer defines error/exception handling in a provision sequence. For example, if the
IPTV application servers are disconnected or temporarily fail during the activation process, an
exception-handling process is initiated to resolve the problem and inform the operators.
The Designer also sets the alert criteria to monitor the status of a provision sequence. For example, the IPTV service platform should finish a provision task within 2 minutes; otherwise TOSS-P will
report an exception alert to the system administrator.
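The provision sequence, its exception handling, and the time-based alert can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function and step names are hypothetical, the steps are stubs, and the elapsed time is checked only after the sequence ends rather than by interrupting a running step.

```python
import time
from typing import Callable, List, Tuple

# Limit taken from the 2-minute example in the text.
TIME_LIMIT_SECONDS = 120.0


def run_provision_sequence(
    steps: List[Tuple[str, Callable[[], None]]],
    time_limit: float = TIME_LIMIT_SECONDS,
) -> List[str]:
    """Run steps in order and return a log of what happened.

    Stops at the first failing step and records an exception entry.
    Records an alert if the whole sequence took longer than time_limit.
    """
    log: List[str] = []
    started = time.monotonic()
    for name, action in steps:
        try:
            action()
            log.append(f"done: {name}")
        except Exception as exc:  # an exception-handling process would start here
            log.append(f"exception: {name}: {exc}")
            break
    if time.monotonic() - started > time_limit:
        log.append("alert: provision sequence exceeded time limit")
    return log


def reserve_connectivity() -> None:
    pass  # stub for reserving connectivity and the STB IP address


def configure_iptv_server() -> None:
    # stub that simulates a disconnected IPTV application server
    raise ConnectionError("IPTV application server unreachable")


log = run_provision_sequence([
    ("reserve connectivity", reserve_connectivity),
    ("configure IPTV server", configure_iptv_server),
])
print(log)
```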
After finishing the design for service specifications, provision sequence and exception handling, the Designer automatically publishes these service specifications to the OHM for the product
manager to wrap up various product offerings (e.g., the IPTV service specifications can be used
to define the Olympic Games package or sports packages for NBA season options).
Fig. 3. FTTx connectivity service specification
3.2 TOSS-P Activator
The Activator accepts customer orders from the OHM. Every customer order includes a product instance that specifies the product name (Figure 4 (2)) such as Multimedia on Demand
(MOD; an IPTV product of Chunghwa Telecom) [4] and a Customer Facing Service (CFS) instance which consists of several characteristics such as the uplink bandwidth of IPTV connectivity (Figure 4 (6)). The specification in Figure 4 (7)-(9) restricts the value in Figure 4 (6) to
the range [2M, 20M] defined in Figure 3 (6)-(8).
Note that the CFS instance is created when a customer subscribes to a product for the first time.
After this order is handled, the updated CFS instance is stored in the Service Inventory. For a
subsequent order requested by the customer (e.g., upgrade bandwidth from 2Mbps to 10Mbps),
the CFS instance of the customer order will be recognized, retrieved from the Service Inventory,
updated, and then stored back in the Service Inventory.
The Request Manager (RM; Figure 2 (d)) receives a customer order from the OHM through a
common Service Activation Interface (SAI) [5]. The SAI supports several options that allow the
OHM to instruct TOSS-P to perform actions (activate, cancel, modify, finalize, etc.) on services
for lifecycle management.
The RM extracts the CFS instance from the customer order and passes it to the Order Manager (OM; Figure 2 (e)). Based on the CFS instance, the OM controls the provision workflow
that implements the Service Configuration & Activation process in eTOM (see Figure 1 (4)).
Specifically, the OM instructs the State Manager (StM; Figure 2 (e)) to run a finite state machine (FSM) for every CFS instance. This FSM is driven by the action specified by the OHM.
Figure 5 illustrates a partial state transition diagram for the FSM. Every time the Activator finishes a customer order, the StM moves the FSM of the CFS instance to a new state, and saves it
in the Service Inventory. When the customer issues the next order, the state stored in the Service
Inventory will be retrieved with the OHM action to drive the FSM.
Fig. 4. Product instance in the customer order
Fig. 5. A partial service operation state transition diagram
Service Provision: When a customer orders a new product, e.g., IPTV, the OHM issues a customer order with action “provision”. The StM creates the FSM for the CFS instance with the
initial state Start. The OM instructs the Service Component Manager (SCM; Figure 2 (f)) to
provision the IPTV service. Specifically, the SCM dispatches two Service Components (SCs;
Figure 2 (g)) to reserve required resources in TOSS-N and service platforms. The connectivity
SC instructs TOSS-N to allocate end-to-end connectivity including core network, access network, loop and the STB at the customer site. The IPTV middleware SC interacts with IPTV service platform to enable user account and subscription profile of particular program packages or
application options.
Finally, the StM moves the FSM from the Start state to Provisioned_Inactivate, and saves the
new state together with the updated CFS instance in the Service Inventory. In this state, all resources required to provision IPTV service are allocated, but not yet activated.
Service Activation: For a service at the Provisioned_Inactivate state, the customer is ready to
activate the product. The customer may request the OHM to issue a customer order with action
“activate”. The OM obtains the CFS instance from the Service Inventory and the customer order,
and requests the SCM to activate the IPTV service. The SCM uses the connectivity and the
IPTV middleware SCs to activate the resources through TOSS-N and service platforms. Then
the StM changes the FSM state from Provisioned_Inactivate to Provisioned_Activate, and saves
the new state together with the updated CFS instance in the Service Inventory. At this point, the
customer can enjoy the product.
Service Deactivation: When the IPTV service is in use (i.e., it is at the Provisioned_Activate
state), the customer may decide to suspend it for a time period. If so, the OM instructs the SCM
to deactivate the service. Specifically, the connectivity and IPTV middleware SCs suspend or
lock all network and service resources. The StM moves the FSM to the Provisioned_Inactivate
state, and saves it in the Service Inventory.
Service Termination: When the IPTV service is at the Provisioned_Inactivate state, the customer may terminate the service. The OM instructs the SCM to “terminate” the underlying network and service resources which will not be activated for this product again. The StM moves
the FSM to the Terminated state. Note that we cannot terminate a service directly from the Provisioned_Activate state. For example, a customer may travel and forget to pay the bill; in that case it is
appropriate to temporarily deactivate the service, and to actually terminate it only after we are
sure that the customer confirms service termination.
Service Modification: When the IPTV service is at the Provisioned_Activate state, the customer may request to change the service profile; e.g., to upgrade the Internet service uplink
bandwidth from 2 Mbps to 4 Mbps or upgrade the class of IPTV service from Standard Definition TV (SDTV) to High Definition TV (HDTV). The OM instructs the SCM to modify the network and service resources. The FSM state remains the same.
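The five operations described above can be summarized as a transition table. The Python sketch below is illustrative only: it covers just the transitions named in the text (Figure 5 is a partial diagram and may contain more), and the action labels "deactivate" and "modify" as well as the function name are assumptions.

```python
from typing import Dict, Tuple

# (current state, action) -> next state, limited to the transitions
# described in the text.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("Start", "provision"): "Provisioned_Inactivate",
    ("Provisioned_Inactivate", "activate"): "Provisioned_Activate",
    ("Provisioned_Activate", "deactivate"): "Provisioned_Inactivate",
    ("Provisioned_Activate", "modify"): "Provisioned_Activate",
    ("Provisioned_Inactivate", "terminate"): "Terminated",
}


def next_state(state: str, action: str) -> str:
    """Return the state after applying action, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} not allowed in state {state!r}"
        ) from None


state = "Start"
for action in ["provision", "activate", "modify", "deactivate", "terminate"]:
    state = next_state(state, action)
print(state)  # Terminated
```

Because the table has no entry for terminating from Provisioned_Activate, a direct termination of an active service is rejected, which matches the restriction stated in the text.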
In the lifecycle of a product, the OM maintains a service order ID linking to the corresponding customer orders from the OHM. The service provider will continually check the execution
status of TOSS-P for the customer orders through this service order ID.
4. INTEGRATED NETWORK MANAGEMENT (TOSS-N)
TOSS-N performs centralized management of large heterogeneous networks that are geographically distributed. TOSS-N consists of six components. The System Connector (Figure 2
(l)) connects TOSS-N to TOSS-P, TOSS-Q, and TOSS-T. By using Java Message Service
(JMS) and web service, the System Connector is implemented on the loosely-coupled service-oriented architecture (SOA) [6, 7]. The Flow-through Network Provisioning & Activation
(FNPA; Figure 2 (m)) processes the requests from TOSS-P to automatically configure and activate the network elements. The Network Status & Performance Analysis (NSPA; Figure 2 (n))
collects measured data and analyzes network status. The Alarm & Ticket Management (A&TM;
Figure 2 (o)) creates event tickets when faults are detected at the network elements, and
issues the fault alarms to the related operators through short messages or e-mails. The Network
Test & Diagnosis (NT&D; Figure 2 (p)) handles network test requests from TOSS-T to detect
abnormal operations of the network elements. The Network Element Adapter (NEA; Figure 2
(q)) monitors and controls broadband network elements with multiple protocols such as SNMP,
TL1, CLI/Telnet, HTTP, CORBA, and TR-069 [8]. In Chunghwa Telecom’s commercial operation, TOSS-N manages hundred types of broadband network elements over 40 vendors.
We use three operation scenarios to illustrate how TOSS-N interacts with TOSS-P, TOSS-T
and TOSS-Q to deal with IPTV service fulfillment and assurance activities.
4.1 Network Provisioning and Activation
To provision a service, TOSS-P issues the network circuit reservation request to TOSS-N.
The System Connector (Figure 2 (l)) dispatches this request to the FNPA (Figure 2 (m)) to check
if the service can be fulfilled with the support of the underlying network. For IPTV service, feasibility checks include the evaluation of network bandwidth (for delivering video data) and the
distance between the service provider’s central office and the customer site. After service feasibility checks, the FNPA reserves the network resources (e.g., specific ports and VLANs of
L2/L3 switches) for this IPTV service. After resource reservation, TOSS-P may activate this
IPTV service by issuing the network activation order to TOSS-N. In response to this request, the
FNPA activates the connectivity circuits from the core network to end-user devices through the
access network. The FNPA then requests the NT&D (Figure 2 (p)) to test the activated network
elements to ensure that the network circuit is ready to transmit IPTV video data.
TOSS-N manages network elements with multiple vendors, models, and technologies. For
IPTV, a variety of broadband network elements (e.g., DSLAM, MSAN, L2/L3 switches and
core routers) are managed and displayed in the TOSS-N management portal shown in Figure 6.
For example, if one clicks on an Alcatel Newbridge 7270 ATM switch in Figure 6 (a), Figure 6
(b) will show the shelf (e.g., P1), slot (e.g., P1-3) and port information of the switch in a hierarchical tree structure. Figure 6 (c) provides function buttons for configuring and monitoring the
network elements. When the operator selects, e.g., the “Network Provision” function in Figure 6
(c), a sub-window (Figure 6 (d)) shows the graphical configuration of the selected ATM switch.
In this sub-window, two rectangles depicted on the left and right sides represent two ATM ports.
The operator may configure an ATM cross-connect circuit by bridging these two ATM ports and
setting up the QoS parameters for this connection.
Fig. 6. Web-based TOSS-N network management
4.2 Network Degradation Detection and Diagnostics
TOSS-N detects if the network performance is degraded to assure overall service quality and
customer satisfaction. If a performance degradation or fault event is detected when a customer is
watching an IPTV channel, the network automatically alarms the A&TM (Figure 2 (o)) through
the NEA (Figure 2 (q)). For example, if an STB detects that its packet loss is greater than a
threshold, it sends a Threshold Crossing Alert (TCA) SNMP trap to TOSS-N. Upon receipt of
this TCA, the A&TM issues an event ticket to the SPM of TOSS-T to organize an end-to-end
test plan (e.g., a number of ICMP pings) to determine the root cause of the service degradation
(details of the SPM are elaborated on in Section 5.2). If necessary, the SPM may request the
NT&D to perform a number of network tests (e.g., reset of the STB) to fix the network fault.
An event ticket can be at one of the following three states: opened, acknowledged, and closed.
When a fault is detected at a network element, an event ticket with the opened state is created by
the A&TM. The operator uses the TOSS-N management portal to manually change an opened
ticket to the acknowledged state if the operator will handle it. An acknowledged ticket becomes
closed when the associated network fault is resolved. A ticket is marked as “minor” or “major”
according to its severity. For example, a ticket of major severity can be “a port of the ATM
switch was broken”. This ticket will be set to the closed state when the ATM port resumes operation or when the operator shuts down the broken ATM port manually.
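The ticket lifecycle can be sketched as a small Python class. This is an illustrative model, not the A&TM implementation: the class and method names are hypothetical, and it assumes the only allowed moves are opened to acknowledged and acknowledged to closed, since the text does not say whether an opened ticket can be closed directly.

```python
from dataclasses import dataclass

# Allowed moves between the three ticket states named in the text
# (assumption: no direct move from opened to closed).
ALLOWED = {
    "opened": {"acknowledged"},
    "acknowledged": {"closed"},
    "closed": set(),
}


@dataclass
class EventTicket:
    description: str
    severity: str          # "minor" or "major"
    state: str = "opened"  # tickets are created in the opened state

    def move_to(self, new_state: str) -> None:
        """Change state, rejecting moves that the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state


ticket = EventTicket("a port of the ATM switch was broken", "major")
ticket.move_to("acknowledged")  # an operator takes the ticket
ticket.move_to("closed")        # the fault is resolved
print(ticket.state)  # closed
```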
4.3 Network Performance Analysis
The NSPA (Figure 2 (n)) periodically collects performance data (e.g., packet loss of an STB)
to generate a series of statistical performance analysis reports. For example, in an IPTV packet
loss analysis report for a group of STBs, every entry represents the number of lost packets per
hour collected from the STB at a particular customer site. The report also includes the packet-loss threshold, and the IPTV service is not acceptable when the number of lost packets in that
hour is greater than the threshold. When TOSS-Q requests TOSS-N to retrieve, for example,
packet loss statistics of an STB for service quality analysis, the NSPA will return this report to
TOSS-Q.
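The threshold check behind such a report can be sketched as follows. The STB identifiers, the hourly counts, and the threshold value are made-up numbers for illustration, and the function name is hypothetical.

```python
from typing import Dict, List


def unacceptable_hours(lost_per_hour: List[int], threshold: int) -> List[int]:
    """Return the hour indexes whose lost-packet count exceeds threshold."""
    return [hour for hour, lost in enumerate(lost_per_hour) if lost > threshold]


# Hypothetical hourly lost-packet counts for two STBs.
report: Dict[str, List[int]] = {
    "stb-A": [3, 0, 120, 7],
    "stb-B": [0, 0, 0, 1],
}
THRESHOLD = 100  # assumed packet-loss threshold per hour

for stb, counts in report.items():
    print(stb, unacceptable_hours(counts, THRESHOLD))
```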
5. TROUBLE HANDLING AND RESOLUTION (TOSS-T)
TOSS-T implements Problem Handling (Figure 1 (2)) and Service Problem Management in
eTOM (Figure 1 (5)). Through two components, Trouble Handling & Dispatching (THD; Figure
2 (h)) and Service Problem Management (SPM; Figure 2 (i)), TOSS-T provides problem handling and resolution functions such as trouble ticket creation, problem diagnosis, activity
dispatching, problem resolution, and activity tracking/reporting.
5.1 Trouble Handling and Dispatching
The Trouble Handling and Dispatching (THD) receives trouble reports from customers, and is
responsible for managing the recovery activities, fixing customer problems, and providing status
of the activities. Specifically, the THD performs trouble report creation, activity dispatching,
activity tracking, history reporting, and report sharing. The details are given below.
5.1.1 Trouble Report Creation
When the THD receives a customer problem, it creates a trouble report. This customer problem can be an event ticket (such as “performance degradation of a device”) from TOSS-N, or a
customer complaint issued from the CSR. For the IPTV service, the THD has defined more than
30 trouble codes to describe various trouble situations. For example, code Z61 represents the
complaint “No available IPTV channels”, code Z11 means “Can’t connect to server”, and Z71
means “Can’t subscribe to video clip”.
An IPTV trouble report can be created through the THD web page, which includes the customer’s IPTV service number, reported IPTV trouble code, customer’s contact information and
the appointed time of a visit for, e.g., STB replacement.
5.1.2 Activity Dispatching
A problem is resolved by carrying out several activities. An activity can be problem isolation/diagnosis, resource maintenance, cable/device repair, customer’s equipment replacement,
and so on. Each activity is assigned to an appropriate operator for execution. After the operator
finishes this activity, a result code is returned to the THD. This result code concludes the execution of the activity. The subsequent activity is dispatched by the THD based on the result code of
the previous activity. To do so, the THD relates every trouble code/result code to a dispatch rule.
Depending on device type, operator’s workload, customer’s address and other factors, the dispatch rule defines the appropriate operators to handle specific activity types. In the previous
example, after an IPTV trouble report was created with trouble code Z61 “No available IPTV
channels” (Figure 7 (a)), the THD applies the corresponding dispatch rule (Figure 7 (b)) to select
an operator in the IPTV operation center and dispatches the “diagnose” activity (Figure 7 (c)) to
him/her to identify the cause of the problem. The operator isolates the problem by using the
SPM functions. If the diagnosis indicates an STB failure, the result code is ZC1 “STB is unavailable” (Figure 7 (d)). Based on this result code, the THD applies the dispatch rule “Duty of
Office” (Figure 7 (e)) to assign an appropriate field operator (the rule option is “Field Maintenance”) to carry out the “Repair” activity (Figure 7 (f)). The field operator then makes a visit to
the customer for STB replacement.
Fig. 7. An example of IPTV activity dispatching
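The code-driven dispatching in this example can be sketched as a lookup from a trouble or result code to an operator group and an activity. Only the two rows from the example are included; the table layout and the function name are assumptions, and real dispatch rules also weigh device type, operator workload, and customer address, which this sketch omits.

```python
from typing import Dict, Tuple

# code -> (operator group, activity), taken from the Z61/ZC1 example.
DISPATCH_RULES: Dict[str, Tuple[str, str]] = {
    "Z61": ("IPTV operation center", "diagnose"),
    "ZC1": ("Field Maintenance", "Repair"),
}


def dispatch(code: str) -> Tuple[str, str]:
    """Return (operator group, activity) for a trouble or result code."""
    if code not in DISPATCH_RULES:
        raise KeyError(f"no dispatch rule for code {code}")
    return DISPATCH_RULES[code]


print(dispatch("Z61"))  # ('IPTV operation center', 'diagnose')
print(dispatch("ZC1"))  # ('Field Maintenance', 'Repair')
```

Chaining the two lookups reproduces the flow in the example: the trouble code selects the first activity, and the result code of that activity selects the next one.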
5.1.3 Activity Tracking, History Reporting, and Report Sharing
The THD monitors all activities of a customer problem to ensure that they are assigned, coordinated and tracked. The THD also records the execution result of an activity in the trouble report. After a problem is resolved, the THD closes the trouble report and notifies the CSR. The
operator will contact the customer to confirm that the resolution is satisfactory. Based on the trouble
report, the THD also generates the statistics for problem analysis and operator workload for future use. The THD shares the trouble report and the progress of problem resolution with other
subsystems such as TOSS-Q. The shared information is presented in the NGOSS SID-based
XML format [9].
5.2 Service Problem Management
To support an operator who diagnoses a service problem, the Service Problem Management
(SPM) offers service problem management through service testing and service impact analysis.
The SPM interacts with TOSS-N to provide testing functions including physical layer link
testing, data-link layer testing and IP layer testing. For example, when an operator in the IPTV
operation center receives a diagnosis activity from the THD, he/she will utilize the SPM to carry
out the following IPTV service tests:
‧Access network tests check DSLAM, Layer 2 and Layer 3 switches. The test functions include loopback test, traffic diverge query, VTU-R query, GESW port query, and so on.
‧Core network tests check ATM switch routers and Broadband Remote Access Servers
(BRAS). The test functions include High Performance Edge Router (HPER) query,
VLAN traffic query, the connectivity test between HPER and BRAS, and so on.
‧Service platform tests include service platform configuration query, user action query,
service lock/unlock, and so on.
‧End-user device tests include STB reset, STB firmware update, STB status query, and so
on.
Through the above tests, the SPM assists an operator to identify the root causes of a problem,
and the result is fed back to the THD.
The SPM also performs service impact analysis when it receives event tickets (e.g., network
degradation detection) from TOSS-N. The SPM analyzes these event tickets to determine which
services are involved, and identifies the customers impacted by these events. The SPM then
sends the analysis result to the THD to create a trouble report for problem resolving.
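Service impact analysis amounts to a lookup from the faulty network elements to the services and customers that depend on them. The Python sketch below shows the idea with a hypothetical in-memory inventory; the element names, customer identifiers, and function name are invented for illustration and do not reflect the SPM data model.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical inventory: network element -> list of (customer, service).
SERVED_BY: Dict[str, List[Tuple[str, str]]] = {
    "switch-1": [("cust-1", "IPTV"), ("cust-2", "IPTV"), ("cust-2", "IP-VPN")],
    "switch-2": [("cust-3", "IPTV")],
}


def impact_of(ticket_elements: List[str]) -> Tuple[Set[str], Set[str]]:
    """Return (affected services, affected customers) for faulty elements."""
    services: Set[str] = set()
    customers: Set[str] = set()
    for element in ticket_elements:
        for customer, service in SERVED_BY.get(element, []):
            services.add(service)
            customers.add(customer)
    return services, customers


services, customers = impact_of(["switch-1"])
print(sorted(services), sorted(customers))
```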
6. QUALITY ASSURANCE MANAGEMENT (TOSS-Q)
Through Service Quality Management (SQM; Figure 2 (j)) and Customer Quality Management (CQM; Figure 2 (k)), TOSS-Q supports quality assurance management by monitoring network element availability, identifying network bottlenecks, and correlating alarms to detect potential problems.
6.1 Service Quality Management
The Service Quality Management (SQM) specifies the levels of services delivered to customers. For example, the service availability of the golden level IP-Virtual Private Network (IP-VPN) Service is 99% and the availability of the platinum level is up to 99.95%. The SQM also
predicts service degradation or network problems on specific customers. The SQM implements
three functions described in the following subsections.
6.1.1 Network Performance Monitoring
The SQM collects performance data from various systems and presents these data as key performance indicators (KPIs) that allow an operator to quickly recognize the service status of a
business department (a set of customer sites that are geographically nearby). Examples of IPTV
KPIs are zap time and IPTV server connection time calculated by the Performance Evaluation
and Testing System (PETS; to be elaborated later). The SQM defines three status levels for a
KPI (i.e., green for normal, yellow for warning and red for critical). For warning and critical
KPIs, the SQM issues notifications to the appropriate operators. For example, Figure 8 illustrates the traffic overload KPI of IPTV service. In this example, the STBs in the business department PCPC (Pan-Chiao area; Figure 8 (a)) receive video streams from an L3 switch PCPC-212 (Figure 8 (b)). Suppose that the customers in PCPC area try to access IPTV channels many
times without success. These heavily repeated actions result in large traffic to PCPC-212, and
therefore the traffic overload KPIs in Figure 8 (a) and (b) are marked yellow.
Fig. 8. The SQM monitoring web page for IPTV
Figure 8 (c) illustrates two video servers PCPC-TVS-402 and PCPC-TVS-403 in the IPTV
service platform. These servers connect to the L3 switch PCPC-212. Assume that some TV
channels are lost in each of the video servers, so these servers are marked red (which
is why the customers fail to access the TV channels). The above abnormal situation results
in a trouble ticket generated for PCPC (Figure 8 (d)). Notation (1:0) means that there is one
ADSL trouble ticket and no leased-line trouble ticket. The trouble ticket suggests that an appropriate operator be dispatched to fix the problems of the video servers. After the problems are
fixed, the red/yellow KPIs will turn green again.
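The three-level KPI status can be sketched as a simple threshold check. This is an illustrative sketch under stated assumptions: the paper does not give threshold values or say how they are configured, so the numbers below are hypothetical, and the function assumes that larger KPI values are worse (as for zap time or traffic load).

```python
def kpi_status(value, warning, critical):
    """Map a KPI value to green/yellow/red; larger values are assumed worse."""
    if value >= critical:
        return "red"      # critical
    if value >= warning:
        return "yellow"   # warning
    return "green"        # normal


def needs_notification(status):
    """Warning and critical KPIs trigger a notification to operators."""
    return status in ("yellow", "red")


# Hypothetical thresholds: zap time in seconds, traffic load as a ratio of capacity.
zap_time_status = kpi_status(1.2, warning=2.0, critical=4.0)
overload_status = kpi_status(0.85, warning=0.80, critical=0.95)
```

With these assumed thresholds, the zap time KPI stays green while the traffic overload KPI turns yellow and would trigger a notification.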
6.1.2 Service Testing
The SQM implements the Performance Evaluation and Testing System (PETS) to measure the
service quality, and detect the problems before the customers are aware of them. The PETS contains many Remote Test Units (RTUs) and a Test Center (TC). An RTU (Figure 9 (a)) is installed at a central office to emulate the STB at a customer site. The RTU automatically executes
test cases that emulate customer behaviors such as downloading a file from the Internet, selecting IPTV channels and watching IPTV programs. The TC server (Figure 9 (b)) periodically collects the test results from the RTUs. If a potential problem (such as failure for IPTV channel
selection) is detected, the SQM will notify appropriate operators to confirm the problems, and
repair malfunctioning network elements or software configurations. There are about 600 RTUs in
Chunghwa Telecom’s central offices, which execute various tests every two minutes to identify
potential problems and service quality degradation. A major advantage of the RTU/TC approach
is that it proactively detects service problems without involving STBs at the customer
sites.
Fig. 9. The RTU and TC configuration
The RTU test results are sent to the TC to generate a statistics report that provides, for example, the failure history in the last ten days. This report indicates when failures were detected. An
operator may check the test results of a particular time period in this report.
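A failure history of this kind can be sketched as a per-day count over the collected test results. The record layout, RTU identifiers, and test names below are assumptions made for illustration; the actual TC report format is not described in this paper.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def failure_history(results, today, days=10):
    """Count failed tests per day for the last `days` days, including `today`.

    `results` is an iterable of (timestamp, rtu_id, test_name, passed) tuples.
    """
    start = today - timedelta(days=days - 1)
    counts = Counter()
    for timestamp, rtu_id, test_name, passed in results:
        day = timestamp.date()
        if not passed and start <= day <= today:
            counts[day] += 1
    # Report every day in the window, including days without failures.
    window = [start + timedelta(days=i) for i in range(days)]
    return {day: counts.get(day, 0) for day in window}


# Hypothetical test results collected from two RTUs.
results = [
    (datetime(2010, 3, 10, 8, 0), "RTU-001", "iptv_channel_select", False),
    (datetime(2010, 3, 10, 8, 2), "RTU-001", "iptv_channel_select", True),
    (datetime(2010, 3, 9, 23, 58), "RTU-002", "http_download", False),
    (datetime(2010, 2, 1, 0, 0), "RTU-002", "http_download", False),
]
history = failure_history(results, today=date(2010, 3, 10))
```

Filtering the same records by a narrower time range would correspond to an operator checking the results of a particular period.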
6.1.3 Service Degradation Detection & Diagnostics
According to performance data and event tickets generated from TOSS-N, trouble tickets
generated from TOSS-T, or test results generated from the PETS, the SQM can conduct service
degradation detection and diagnostics. For example, the SQM periodically counts the number of
trouble tickets created every two hours, and if necessary, sends a notification to alert an operator through short message or email (shown in Figure 10). In this message, the SQM groups
the related tickets by tracing the route of network elements from the service platform to the customer locations of corresponding phone numbers shown in these tickets (Figure 10 (a)). In our
example, four tickets relate to the same switch SKC1-231 (Figure 10 (e)) although the phone
numbers are assigned to different STBs (Figure 10 (b)) connecting to different DSLAMs (Figure
10 (c)). The SQM also detects that an alarm (Figure 10 (f)) occurred in SKC1-231. Therefore, it
concludes that SKC1-231 is the potential source of the problem indicated by these trouble tickets.
Fig. 10. The SQM notification message
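The grouping step in this example can be sketched as follows: index the tickets by every network element on their routes, then keep the elements that are shared by several tickets and also carry an active alarm. This is a simplified illustration rather than the SQM algorithm itself. Only the switch name SKC1-231 comes from the example; the DSLAM and STB names, the route layout, and the `min_tickets` parameter are hypothetical.

```python
from collections import defaultdict


def suspect_elements(ticket_routes, alarmed, min_tickets=2):
    """Return [(element, ticket_ids)] for alarmed elements shared by several tickets.

    `ticket_routes` maps a ticket id to the list of network elements on the
    route from the service platform to the customer location.
    """
    tickets_by_element = defaultdict(set)
    for ticket_id, route in ticket_routes.items():
        for element in route:
            tickets_by_element[element].add(ticket_id)
    suspects = [
        (element, sorted(ids))
        for element, ids in tickets_by_element.items()
        if len(ids) >= min_tickets and element in alarmed
    ]
    # Elements shared by the most tickets come first.
    suspects.sort(key=lambda item: (-len(item[1]), item[0]))
    return suspects


ticket_routes = {
    "T1": ["SKC1-231", "DSLAM-A", "STB-1"],
    "T2": ["SKC1-231", "DSLAM-A", "STB-2"],
    "T3": ["SKC1-231", "DSLAM-B", "STB-3"],
    "T4": ["SKC1-231", "DSLAM-C", "STB-4"],
}
suspects = suspect_elements(ticket_routes, alarmed={"SKC1-231"})
```

Requiring both a shared route element and an alarm keeps elements such as DSLAM-A, which is shared by two tickets but reports no alarm, out of the result.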
6.2 Customer Quality Management
The Customer Quality Management (CQM) implements an Internet portal for enterprises or
customers with SLA contracts to perform quality assurance functions by themselves. This portal
consists of two CQM functions to be described in the following subsections.
6.2.1 Service Performance Monitoring
Similar to the SQM, the CQM monitors network status through TOSS-N. Network elements
monitored by the CQM are either located in the service provider’s central offices or at customer
sites. For example, an IP-VPN customer can use the CQM to measure network performance
(such as round trip delay, packet loss, throughput, and so on) in different time periods according
to the SLA contract. The CQM provides the round trip delay between the Provider Edge routers
(PEs; Figure 11 (c)) located in the central offices and the Customer Edge routers (CEs; Figure
11 (b) and (d)) located at the customer sites. The CQM can also directly retrieve the throughput
data recorded at the PEs. Furthermore, if a pair of CEs (Figure 11 (b) and (d)) are managed by
TOSS-N, the CQM can also provide the end-to-end packet delay statistics.
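As a rough sketch of how round trip delay and packet loss figures could be derived from a series of probes between a PE and a CE, consider the following. The probe representation (a list of round trip times in milliseconds, with `None` for a lost probe) is an assumption for illustration and is not taken from the CQM.

```python
from statistics import mean


def summarize_probes(rtts_ms):
    """Summarize probe results; `None` entries mark lost probes."""
    if not rtts_ms:
        raise ValueError("no probes were sent")
    received = [rtt for rtt in rtts_ms if rtt is not None]
    return {
        "sent": len(rtts_ms),
        "loss_ratio": 1 - len(received) / len(rtts_ms),
        "avg_rtt_ms": mean(received) if received else None,
        "max_rtt_ms": max(received) if received else None,
    }


# Four hypothetical probes in one measurement period, one of them lost.
summary = summarize_probes([40.0, 42.0, None, 38.0])
```

Running the same summary over different time windows would give the per-period statistics that an SLA contract refers to.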
Figure 12 (a) illustrates the IP-VPN topology of an enterprise in the CQM web page. When a
user clicks, for example, the line between Tokyo and Bangkok (1st), it shows the round trip time
and packet loss in the last five minutes (Figure 12 (b)) and the delay statistics in the last 24 hours
(Figure 12 (c)).
Fig. 11. Measurement points of a VPN architecture
Fig. 12. The CQM monitoring window for IP-VPN performance
6.2.2 SLA Reports and Traffic Analysis Reports
The CQM generates SLA reports for, e.g., availability. The availability report summarizes the
availability measurements of a specific site including total up time, total disconnect time and so
on.
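The availability figure in such a report is commonly computed as total up time divided by total measured time, which can then be compared with the service level targets quoted in Section 6.1 (99% for the golden level and 99.95% for the platinum level). The sketch below assumes that definition; the paper does not spell out the exact formula used by the CQM, and the sample durations are hypothetical.

```python
# Availability targets quoted in Section 6.1.
SLA_TARGETS = {"golden": 0.99, "platinum": 0.9995}


def availability(up_seconds, down_seconds):
    """Fraction of the measured time during which the site was connected."""
    total = up_seconds + down_seconds
    if total == 0:
        raise ValueError("no measurement time")
    return up_seconds / total


def meets_sla(up_seconds, down_seconds, level):
    """Check the measured availability against the target of an SLA level."""
    return availability(up_seconds, down_seconds) >= SLA_TARGETS[level]


# Hypothetical site: two hours of total disconnect time in a 30-day month.
month_seconds = 30 * 24 * 3600
down_seconds = 2 * 3600
site_availability = availability(month_seconds - down_seconds, down_seconds)
```

In this example the availability is about 99.72%, which satisfies the golden level target but not the platinum one.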
In Figure 11, the CQM fully monitors the service provider network (from Figure 11 (b) to (d)),
but may not be able to access the devices (e.g., web/ERP/video servers) in the customer networks (Figure 11 (a) and (e)). Through tools such as NetFlow [10], the CQM can analyze the
statistics (e.g., utilization of a web server at the customer network; see Figure 11 (e)) by parsing
the data flow received by the CE router in Figure 11 (d). A data flow is a unidirectional stream
of packets between a given source and destination pair, which is identified by source/destination
IP address/port number, Layer 3 protocol type, type-of-service byte, and input logical interface. The CQM can show, for example, the top 10 sites with the highest TCP/IP traffic in customer
networks, and provide detailed information including the IP address/port number, precedence and traffic in KBytes. Currently, Chunghwa Telecom’s CQM monitors up to 15 million
data flows every day, and the number of flows continues to increase.
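The top-site report can be sketched as an aggregation over exported flow records. The record type below mirrors the flow key fields listed above, but its field names, the sample addresses, and the choice to rank by destination address and port are assumptions for illustration, not the CQM's actual data model.

```python
from collections import Counter
from typing import NamedTuple


class FlowRecord(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int   # Layer 3 protocol type, e.g., 6 for TCP
    tos: int        # type-of-service byte
    input_if: int   # input logical interface
    kbytes: float   # traffic carried by the flow


def top_sites(flows, n=10):
    """Return the n (destination IP, port) pairs receiving the most traffic."""
    totals = Counter()
    for flow in flows:
        totals[(flow.dst_ip, flow.dst_port)] += flow.kbytes
    return totals.most_common(n)


# Hypothetical flow records parsed at a CE router.
flows = [
    FlowRecord("10.0.0.5", "192.0.2.10", 40001, 80, 6, 0, 1, 500.0),
    FlowRecord("10.0.0.6", "192.0.2.10", 40002, 80, 6, 0, 1, 250.0),
    FlowRecord("10.0.0.5", "192.0.2.20", 40003, 443, 6, 0, 1, 300.0),
]
top = top_sites(flows, n=2)
```

Because a flow is unidirectional, traffic in the reverse direction appears as separate records and would need its own aggregation.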
7. CONCLUSIONS
Based on eTOM, this paper presents TOSS, a practical approach to the development and integration of four OSS groups (i.e., TOSS-P, TOSS-N, TOSS-T, and TOSS-Q). The functionalities
of TOSS and the associated operation scenarios related to IPTV and IP-VPN service fulfillment
and assurance are described. From the viewpoint of functional categorization, TOSS provides
customer-centric operations models regarding automated service provisioning, integrated network management, proactive trouble handling, and end-to-end quality assurance to enhance
customer experience. From the viewpoint of managed broadband network, TOSS provides full
coverage of inventory, faults, and performance management of end-user devices, access network,
core network, and service platforms. TOSS has been successfully applied by Chunghwa Telecom to support daily operations of various fixed-line and wireless services such as xDSL/FTTx,
IPTV, IP-VPN, VoIP, FMC, 3G/3.5G [11], and innovative new ICT services such as telematics
and energy saving services.
The financial report disclosed by Chunghwa Telecom shows that operational efficiency
has been achieved: TOSS helps reduce the OPEX (e.g., manpower saving) by streamlining
business operation processes and helps generate revenue by providing the functions for managing new services in a timely manner. Taking IPTV service as an example, 99.81% of service orders can be
fulfilled within one day. The number of IPTV subscriptions in the year 2008 was 172% of that
in the year 2007. The ratio of IPTV service trouble in Dec. 2009 was less than 1%, and the number of complaint calls in regard to IPTV service in Dec. 2009 was 84% of that in Dec. 2008. For
broadband access service, the average access bandwidth per user in the year 2008 was 162% of
that in the year 2007. The number of FTTx subscriptions in the year 2008 was 200% of that in
the year 2007. Specifically, TOSS-N manages more than 160 models of broadband network
elements provided by 45 vendors. 885.26 thousand broadband network elements were managed
by TOSS-N in the year 2009. This amount was 162% of that in the year 2008 and 731% of that
in the year 2007. A total of 20.35 million automatic network activations of broadband services
were performed by TOSS-P and TOSS-N in the year 2009. This amount was 102% of that in the
year 2008, and was equivalent to a saving in manpower of 465 per year. 135.8 million network
testing and diagnostics commands of broadband services were carried out by TOSS-T, TOSS-Q,
and TOSS-N in the year 2009. This amount was 177% of that in the year 2008, and was equivalent to a saving of 1241 in manpower per year. From the commercial operations, our experiences
indicate that TOSS achieves the goals of self-care, customization, complex bundling, and an
end-to-end quality guarantee.
8. ABBREVIATIONS
ATM: Asynchronous Transfer Mode
A&TM: Alarm & Ticket Management
BRAS: Broadband Remote Access Server
CE: Customer Edge Router
CFS: Customer Facing Service
CQM: Customer Quality Management
CRM: Customer Relationship Management
CSR: Customer Service Representatives
DSL: Digital Subscriber Line
DSLAM: DSL Access Multiplexer
FAB: Fulfillment, Assurance, Billing
FNPA: Flow-through Network Provisioning & Activation
FSM: Finite State Machine
FTTx: Fiber to the x
GESW: Giga-bit Ethernet Switch
HDTV: High Definition TV
HPER: High Performance Edge Router
ICT: Information and Communication Technology
IPTV: Internet Protocol Television
IT: Information Technology
JMS: Java Message Service
KPI: Key Performance Indicator
MOD: Multimedia on Demand
MSAN: Multi-Service Access Node
NEA: Network Element Adapter
NSPA: Network Status & Performance Analysis
NT&D: Network Test & Diagnosis
OHM: Order Handling Management
OM: Order Manager
OPEX: Operating Expenditure
OPS: Operations
OSR: Operation Support & Readiness
OSS: Operations Support System
PCPC: Pan-Chiao area
PETS: Performance Evaluation and Testing System
PSTN: Public Switched Telephone Network
QoS: Quality of Service
RM: Request Manager
RTU: Remote Test Unit
SAI: Service Activation Interface
SC: Service Component
SCM: Service Component Manager
SDTV: Standard Definition TV
SI&P: Strategy, Infrastructure & Product
SLA: Service Level Agreement
SNMP: Simple Network Management Protocol
SOA: Service-Oriented Architecture
SPM: Service Problem Management
S/PRM: Supplier/Partner Relationship Management
SQM: Service Quality Management
STB: Set-top Box
StM: State Manager
TC: Test Center
TCA: Threshold Crossing Alert
THD: Trouble Handling & Dispatching
TOSS: Telecom Operations Support System
VPN: Virtual Private Network
VTU-R: VDSL Transceiver Unit - Remote Terminal
xDSL: x Digital Subscriber Line
REFERENCES
[1] TM Forum, New Generation Operations Systems and Software (NGOSS), TM Forum, 2009.
[2] TM Forum, Business Process Framework- enhanced Telecom Operations Map (GB921), R8.0, TM
Forum, 2009.
[3] Y.-B. Lin and S.-I. Sou. Charging for Mobile All-IP Telecommunications. John Wiley and Sons,
2008.
[4] Multimedia on Demand (MOD), http://www.cht.com.tw/.
[5] TM Forum. Multi-Technology Operation Systems Interface (MTOSI) Solution Suite Release 2.0
Service Activation, TM Forum, 2009.
[6] T. Erl. Service-Oriented Architecture (SOA): Concepts, Technology, and Design, Prentice Hall, 2005.
[7] C. H. Hu and S. H. Hsu. “SOA-based Alarm Integration for Multi-Technology Network Fault Management,” IEEE Symposium on Service-Oriented System Engineering, Taiwan, Dec., 2008, pp. 221-226.
[8] Broadband Forum, “TR-069 CPE WAN Management Protocol,” www.broadband-forum.org/technical/.../TR069_Amendment-2.pdf, Broadband Forum, Dec., 2007.
[9] TM Forum, Shared Information/Data (SID) Model – Business View Concepts, Principles, and Domains (GB922), R8.0, TM Forum, 2009.
[10] Cisco IOS Flexible NetFlow Technology White Paper.
[11] Y.-B. Lin and A.-C. Pang. Wireless and Mobile All-IP Networks, Wiley, 2005.
Yuan-Kai Chen
Yuan-Kai Chen received his B.S.C.S.I.E., M.S.C.S.I.E. and Ph.D. degrees from
The National Chiao Tung University, Taiwan, in 1989, 1991 and 2002, respectively. He has been with Chunghwa Telecom since 1991, working in research
and development of fiber transmission equipment, OA&M systems and mobile
broadband technologies in Telecommunication Laboratories. He has been involved in the design and rollout of 2G/3G and WiMAX networks, and the development of mobile value-added services and handset software implementation.
He is now in charge of strategic planning and new business development.
Chang-Ping Hsu
Chang-Ping Hsu received his BS and MS degrees in Control Engineering and
Information Management from National Chiao Tung Univ., Taiwan, in 1997 and
1999, respectively. He has been a researcher at Telecommunication Laboratories since 1999. His research interests include Problem Handling, Workforce
Management, and Service Quality Management for IPTV, VPN, and Information
Technology.
Chung-Hua Hu
Chung-Hua Hu received his Ph.D. degree in Computer Science and Information
Engineering from The National Chiao-Tung University, Taiwan, in 1998. He has
been working in Chunghwa Telecom Laboratories for more than ten years and
has rich experience in developing a large-scale broadband NMS for Chunghwa
Telecom. His current research interests include NGOSS methodology and development, and multi-technology network management.
Rong-Syh Lin
Rong-Syh Lin is the head of Network Operations Technology Department of
Telecommunication Laboratories, Chunghwa Telecom Co., Ltd. He received his
Ph.D. degree in Computer Science from Chiao Tung University, Taiwan, in 1998.
In 1989, he joined Telecommunication Labs as a Researcher. He has focused
upon efficient service/network operations and QoS/QoE innovation, and led
the development of several Integrated Management Systems for broadband
services in CHT, such as xDSL/FTTx, VPN, IPTV, and new ICT services. In
2006, he was appointed as the program manager of CHT/TL NGOSS Evolution, which successfully
transformed and consolidated dozens of BSS/OSSs based on the TMF NGOSS frameworks. His current interest is in BSS/OSS flexibility and agility to accelerate new service growth.
Yi-Bing Lin
Yi-Bing Lin is Dean and Chair professor of The College of Computer Science,
National Chiao Tung University (NCTU). He is a senior technical editor of IEEE
Network. He serves on the editorial boards of IEEE Trans. on Wireless Communications and IEEE Trans. on Vehicular Technology. He is General or Program
Chair for prestigious conferences including ACM MobiCom 2002. He is Guest
Editor for several journals including IEEE Transactions on Computers. Lin is the
author of the books Wireless and Mobile Network Architecture (Wiley, 2001),
Wireless and Mobile All-IP Networks (John Wiley, 2005), and Charging for Mobile All-IP Telecommunications (Wiley, 2008). Lin received numerous research awards including 2005 NSC Distinguished Researcher, 2006 Academic Award of Ministry of Education and 2008 Award for Outstanding contributions in Science and Technology, Executive Yuan. He is on the advisory boards or the review boards of
various government organizations including the Ministry of Economic Affairs, the Ministry of Education,
the Ministry of Transportation and Communications, and the National Science Council. He is a member
of the board of directors, Chunghwa Telecom. Lin is AAAS Fellow, ACM Fellow, IEEE Fellow, and IET
Fellow.
Jian-Zhi Lyu
Jian-Zhi Lyu received his MS degree in Information Engineering and Computer
Science from Feng-Chia University, Taiwan. He has been working in Chunghwa
Telecom Laboratories for several years and works in the area of service quality assurance. His current research interests include quality assurance (such as
CQM, SQM and SLA) for multi-services, NGOSS methodology and OSS development.
Wudy Wu
Wudy Wu joined CHT Labs in 2000. He has been the representative of the CHT
OSS/BSS Research Lab at the TeleManagement Forum (TMF) since 2006. He
successfully led CHT in initiating catalyst programs at the TMF from 2006 to 2008
and won the Excellence Award for Best Catalyst Project Management in 2008. In
CHT, he works in the areas of service design and provisioning and also leads the
development team on current Broadband, NGN, ICT and Cloud services. He has
also hosted an advanced software technology program at CHT Labs for years.
Heychyi Young
Heychyi Young received her M.S. degree in Computer Science from the University of Texas at Austin, U.S.A. in 1990. She worked for Chunghwa Telecom
Laboratories for 20 years. Currently she works as the Project Manager of Network Management Technology at the Network Operations Department. With her
rich experience in telecom operations, she also played a key role in Chunghwa
Telecom’s NGOSS Project as well as the IT Consolidation Project.