Oracle Communications Diameter Signaling Router
DSR 4.x/5.x/6.x 3-tier Disaster Recovery Guide
Release 6.0
E52512-01
July 2014
Oracle Communications Diameter Signaling Router DSR 3-tier Disaster Recovery Procedure, Release 6.0
Copyright © 2012, 2014 Oracle and/or its affiliates. All rights reserved.
This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are
protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use,
copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or
by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is
prohibited.
The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please
report them to us in writing.
If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S.
Government, the following notice is applicable:
U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the
hardware, and/or documentation, delivered to U.S. Government end users are “commercial computer software” pursuant to the
applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure,
modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the
hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. No other rights are
granted to the U.S. Government.
This software or hardware is developed for general use in a variety of information management applications. It is not developed or
intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this
software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and
other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this
software or hardware in dangerous applications.
Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and
are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are
trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.
This software or hardware and documentation may provide access to or information on content, products, and services from third parties.
Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party
content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due
to your access to or use of third-party content, products, or services.
MOS (https://support.oracle.com) is your initial point of contact for all product support and training needs. A representative at
Customer Access Support (CAS) can assist you with MOS registration.
Call the CAS main number at 1-800-223-1711 (toll-free in the US), or call the Oracle Support hotline for your local country
from the list at http://www.oracle.com/us/support/contact/index.html.
See more information on MOS in the Appendix section.
TABLE OF CONTENTS
1 INTRODUCTION .................................................................................................................. 5
1.1 Purpose and Scope......................................................................................................... 5
1.2 References ..................................................................................................................... 5
1.3 Software Release Numbering ......................................................................................... 5
1.4 Acronyms ........................................................................................................................ 6
1.5 Terminology .................................................................................................................... 6
2 GENERAL DESCRIPTION ................................................................................................... 8
2.1 Complete Server Outage (All servers) ............................................................................. 8
2.2 Partial Server Outage with one NO Server Intact and both SOs failed ............................ 8
2.3 Partial Server Outage with both NO Servers failed and one SO server Intact.................. 8
2.4 Partial Server Outage with one NO and one SO Server Intact ........................................ 8
2.5 Partial Service Outage with corrupt database.................................................................. 9
3 PROCEDURE OVERVIEW ................................................................................................. 10
3.1 Required Materials ........................................................................................................ 10
3.2 Disaster Recovery Strategy........................................................................................... 11
4 PROCEDURE PREPARATION ............................................................................................ 13
5 DISASTER RECOVERY PROCEDURE ............................................................................. 15
5.1 Recovering and Restoring System Configuration .......................................................... 16
5.1.1 Recovery Scenario 1 (Complete Server Outage) ............................................................. 16
5.1.2 Recovery Scenario 2 (Partial Server Outage with one NO Server intact and both SOs failed) ............ 33
5.1.3 Recovery Scenario 3 (Partial Server Outage with both NO Servers failed and one SO Server intact) ............ 44
5.1.4 Recovery Scenario 4 (Partial Server Outage with one NO Server and one SO Server Intact) ............ 56
5.1.5 Recovery Scenario 5 (Both NO Servers failed with DR NO available) ............................. 63
5.1.6 Recovery Scenario 6 (Database recovery) ....................................................................... 64
6 RESOLVING USER CREDENTIAL ISSUES AFTER DATABASE RESTORE ................... 69
6.1 Restoring a Deleted User .............................................................................................. 69
6.1.1 To Keep the Restored User .............................................................................................. 69
6.1.2 To Remove the Restored User ......................................................................................... 69
6.2 Restoring a Modified User ............................................................................................. 70
6.3 Restoring an Archive that Does not Contain a Current User.......................................... 70
Appendix A. EAGLEXG DSR 4.x/5.x/6.x Database Backup .............................................. 72
Appendix B. Recovering/Replacing a Failed 3rd party components (Switches, OAs) ....... 76
Appendix C. Switching a DR Site to Primary .................................................................... 79
Recovery Steps .................................................................................................................. 79
Appendix D. Returning a Recovered Site to Primary ........................................................ 81
Recovery Steps .................................................................................................................. 81
Appendix E. Inhibit A and B level replication on C Level servers ...................................... 83
Appendix F. Un Inhibit A and B level replication on C Level servers ................................. 84
Appendix G. Workarounds for Issues/PR not fixed in this release .................................... 85
Appendix H. My Oracle Support (MOS) ............................................................................ 86
List of Figures
Figure 1: Determining Recovery Scenario .................................................................................................. 11
List of Tables
Table 1. Terminology ................................................................................................................................... 6
Table 2. Recovery Scenarios ...................................................................................................................... 13
List of Procedures
Procedure 1. Recovery Scenario 1 ............................................................................................................. 17
Procedure 2. Recovery Scenario 2 ............................................................................................................. 34
Procedure 3. Recovery Scenario 3 ............................................................................................................. 45
Procedure 4. Recovery Scenario 4 ............................................................................................................. 57
Procedure 5. Recovery Scenario 5 ............................................................................................................. 64
Procedure 6. Recovery Scenario 6 ............................................................................................................. 65
Procedure 7. Recovery Scenario 7 ............................................................................................................. 66
Procedure 8. Recovery Scenario 8 ............................................................................................................. 67
Procedure 9: DSR 4.x/5.x/6.x Database Backup ........................................................................................ 72
Procedure 10: Recovering a failed PM&C Server .................................................................................... 76
Procedure 11: Recovering a failed Aggregation Switch (Cisco 4948E / 4948E-F) ................................... 76
Procedure 12: Recovering a failed Enclosure Switch (Cisco 3020) .......................................................... 77
Procedure 13: Recovering a failed Enclosure Switch (HP 6120XG) ......................................................... 77
Procedure 14: Recovering a failed Enclosure Switch (HP 6125XLG, HP 6125G) ................................... 78
Procedure 15: Recovering a failed Enclosure OA...................................................................................... 78
1 INTRODUCTION
1.1 Purpose and Scope
This document describes the procedures used to execute disaster recovery for DSR 4.x/5.x/6.x (3-tier deployments). This includes recovery from the partial or complete loss of one or more DSR 4.x/5.x/6.x servers. The audience for this document includes GPS groups such as Software Engineering, Product Verification, Documentation, and Customer Service, including Software Operations and First Office Application. This document can also be executed by Oracle customers, as long as Oracle Customer Service personnel are involved and/or consulted. This document provides step-by-step instructions to execute disaster recovery for DSR 4.x/5.x/6.x. Executing this procedure also involves referring to and executing procedures in existing support documents.
Note that components dependent on DSR might need to be recovered as well, for example SDS or DIH. To recover those
components, refer to the corresponding Disaster Recovery documentation. ([12] for SDS and [21] chapter 6 for DIH)
Note that this document only covers the disaster recovery scenarios of 3-tier deployments. For 2-tier deployments, refer to
[14] for the proper disaster recovery procedures.
1.2 References
[1] HP Solutions Firmware Upgrade Pack, 795-0000-2xx, v2.2.x (latest recommended, 2.2.4)
[2] Diameter Signaling Router 4.x/5.x Networking Interconnect Technical References, TR007133/4/5/6/7/8/9
[3] TPD Initial Product Manufacture, 909-2130-001
[4] Platform 6.x Configuration Procedure Reference, 909-2249-001
[5] DSR 4.x HP C-class Installation, 909-2228-001
[6] DSR 5.x Base Hardware and Software Installation, 909-2282-001
[7] DSR 6.x Hardware and Software Installation, ES4118
[8] Platform 6.7 Configuration Procedure Reference, 909-2297-001, Tekelec, 2014
[9] DSR Software Installation and Configuration Procedure Part 2/2, 909-2278-001
[10] PM&C 5.x Disaster Recovery, 909-2210-001
[11] Appworks Database Backup and Restore, UG005196
[12] SDS 3.x Disaster Recovery Guide, TR007061
[13] XIH 5.0 Installation and Upgrade Procedure, 909-2265-001
[14] DSR 3.0/4.x/5.x 2-tier Disaster Recovery, 909-2225-001
[15] Policy DRA Activation, WI006835
[16] CPA Activation Feature Work Instruction, WI006780, latest version, Fisher
[17] IPFE Installation and Configuration, WI006837, latest version, Mahoney
[18] DSR Meta Administration Feature Activation, WI006761, latest version, Fisher
[19] DSR FABR Feature Activation, WI006771, latest version, Karmarkar
[20] DSR RBAR Feature Activation, WI006763, latest version, Fisher
[21] Integrated Diameter Intelligence Hub Disaster Recovery Procedure, 909-2266-001, latest version
[22] IPFE 3.0 Feature Activation and Configuration, WI006931, latest version, Mahoney
[23] DSR MAP-Diameter IWF Feature Activation, WI006965
1.3 Software Release Numbering
This procedure applies to all DSR 4.x/5.x/6.x releases.
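If you need to confirm which release a given server is actually running before choosing the matching installation references, the revision utilities normally present on TPD-based servers can be checked from the server command line. This is only a quick sketch; the exact utilities available may vary by platform release:

# appRev
# getPlatRev

appRev reports the installed application product and release, and getPlatRev reports the TPD platform revision.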
1.4 Acronyms
Acronym - Definition
BIOS - Basic Input Output System
CD - Compact Disk
DIH - Diameter Intelligent Hub
DVD - Digital Versatile Disc
EBIPA - Enclosure Bay IP Addressing
FRU - Field Replaceable Unit
HP c-Class - HP blade server offering
iLO - Integrated Lights Out manager
IPM - Initial Product Manufacture (the process of installing TPD on a hardware platform)
MSA - Modular Smart Array
OA - HP Onboard Administrator
OS - Operating System (e.g. TPD)
PM&C - Platform Management & Configuration
SAN - Storage Area Network
SDS - Subscriber Data Server
SFTP - Secure File Transfer Protocol
SNMP - Simple Network Management Protocol
TPD - Tekelec Platform Distribution
TVOE - Tekelec Virtual Operating Environment
VSP - Virtual Serial Port
1.5 Terminology
Table 1. Terminology
Base hardware - Base hardware includes all hardware components (bare metal) and electrical wiring to allow a server to power on.
Base software - Base software includes installing the server's operating system: Tekelec Platform Distribution (TPD).
Failed server - A failed server in disaster recovery context refers to a server that has suffered partial or complete software and/or hardware failure to the extent that it cannot restart or be returned to normal operation and requires intrusive activities to re-install the software and/or hardware.
2 GENERAL DESCRIPTION
The DSR 4.x/5.x/6.x disaster recovery procedure falls into five basic categories, depending primarily on the state of the Network OAM&P servers and System OAM servers:
 Recovery of the entire network from a total outage
   o Both NO servers failed
   o All SO servers failed
 Recovery of one or more servers with at least one Network OAM&P server intact
   o 1 or both NO servers intact
   o 1 or more SO or MP servers failed
 Recovery of the Network OAM&P pair with one or more System OAM servers intact
   o Both NO servers failed
   o 1 or more SO servers intact
 Recovery of one or more servers with at least one Network OAM&P and one Site OAM server intact
   o 1 or both NO servers intact
   o 1 or more SO servers intact
   o 1 SO or 1 or more MP servers failed
 Recovery of one or more servers with corrupt databases that cannot be restored via replication from the Active Parent Node
Note that for disaster recovery of the PM&C server, aggregation switches, OA, or 6120/6125/3020 switches, refer to Appendix B.
For IDIH recovery, refer to [21].
2.1 Complete Server Outage (All servers)
This is the worst case scenario where all the servers in the network have suffered complete software and/or hardware
failure. The servers are recovered using base recovery of hardware and software and then restoring database backups to the
active NO and SO servers. Database backups will be taken from customer offsite backup storage locations (assuming these
were performed and stored offsite prior to the outage). If no backup files are available, the only option is to rebuild the
entire network from scratch. The network data must be reconstructed from whatever sources are available, including
entering all data manually.
2.2 Partial Server Outage with one NO Server Intact and both SOs failed
This case assumes that at least one Network OAM&P server is intact. All SO servers have failed and are recovered using base recovery of hardware and software. The database is restored on the active SO server, and replication then recovers the database of the remaining servers.
2.3 Partial Server Outage with both NO Servers failed and one SO server Intact
In this case, both Network OAM&P servers have suffered complete software and/or hardware failure, but at least one System OAM server is available. The database is restored on the active NO server, and replication then recovers the database of the remaining servers.
2.4 Partial Server Outage with one NO and one SO Server Intact
The simplest case of disaster recovery is when at least one Network OAM&P server and one Site OAM server are intact. All failed servers are recovered using base recovery of hardware and software. Database replication from the active NO and SO servers will recover the database on all servers. (Note: this includes failures of any Disaster Recovery (DR) NOAM&P servers.)
2.5 Partial Service Outage with corrupt database
Case 1: The database is corrupted, the replication channel is inhibited (either manually or because of the comcol upgrade barrier), and a database backup is available.
Case 2: The database is corrupted but the replication channel is active.
Case 3: The database is corrupted, the replication channel is inhibited (either manually or because of the comcol upgrade barrier), and no database backup is available.
These cases are discussed in more detail in Section 5.1.6.
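To help distinguish Case 1 from Case 3 before choosing a recovery scenario, confirm whether a usable database backup of the corrupt server actually exists. A minimal check, assuming the standard AppWorks file management area path (/var/TKLC/db/filemgmt); confirm the path and the backup file names on your system:

# ls -lt /var/TKLC/db/filemgmt | head

If no suitable backup archive is found locally, check the customer's offsite backup storage before concluding that no backup is available (Case 3).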
3 PROCEDURE OVERVIEW
This section lists the materials required to perform the disaster recovery procedures and gives a general overview (the disaster recovery strategy) of the procedures to be executed.
3.1 Required Materials
The following items are needed for disaster recovery:
1. A hardcopy of this document (909-2246-001) and hardcopies of all documents in the reference list: [1] through
[22]
2. Hardcopy of all site surveys performed at the initial installation and network configuration of this customer’s site.
If the site surveys cannot be found, escalate this issue within Oracle Customer Service until the site survey documents
can be located.
3. DSR 4.x/5.x/6.x backup files: electronic backup file (preferred) or hardcopy of all DSR configuration and
provisioning data. Check [11] for more details on the backup procedure.
4. Latest Network Element report: electronic file or hardcopy of Network Element report.
5. Tekelec Platform Distribution (TPD) media (64-bit).
6. Platform Management & Configuration (PM&C) CD-ROM.
7. DSR 4.x/5.x/6.x CD-ROM (or ISO image file on USB flash) of the target release.
8. TVOE platform media (64-bit).
9. The xml configuration files used to configure the switches, available on the PM&C Server.
10. The switch backup files taken after the switch is configured, available on the PM&C Server.
11. The network element XML file used for the blades initial configuration.
12. The HP firmware upgrade Kit
13. NetBackup Files if they exist. This may require the assistance of the customer’s NetBackup administrator.
For all Disaster Recovery scenarios, we assume that the NO Database backup and the SO Database backup were
performed around the same time, and that no synchronization issues exist among them.
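Before starting a recovery, it is also good practice to confirm that the NO and SO backup archives retrieved from offsite storage are intact and belong to the same backup window. A minimal sketch, run on whatever host holds the retrieved copies; the file names are placeholders for your actual backup archives:

$ ls -l NO_backup.tar.bz2 SO_backup.tar.bz2
$ md5sum NO_backup.tar.bz2 SO_backup.tar.bz2
$ tar -tjf NO_backup.tar.bz2 > /dev/null && echo "NO archive is readable"

Compare the checksums against any values recorded when the backups were taken, and compare the file timestamps to confirm that the NO and SO backups were taken at roughly the same time.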
3.2 Disaster Recovery Strategy
Disaster recovery procedure execution is performed as part of a disaster recovery strategy with the basic steps listed below:
1. Evaluate failure conditions in the network and determine that normal operations cannot continue without disaster
recovery procedures. This means the failure conditions in the network match one of the failure scenarios described in
Section 2.
2. Read and review the content in this document.
3. Gather required materials in Section 3.1.
4. From the failure conditions, determine the Recovery Scenario and procedure to follow (using Figure 1 and Table
2).
5. Execute appropriate recovery procedures (listed in Table 2).
Figure 1: Determining Recovery Scenario
(Figure 1 is a decision flowchart. Starting from "Identify all failed servers", it asks: Is the database corrupted? Is replication inhibited on the failed server? Is a recent database backup available that can be restored? Have both NO servers failed? Is a DR NO installed? Have both SO servers failed? The answers route to Recovery Scenario 1 through 8; Table 2 captures the same mapping in tabular form.)
4 PROCEDURE PREPARATION
Disaster recovery procedure execution is dependent on the failure conditions in the network. The severity of the failure
determines the recovery scenario for the network. Use Table 2 below to evaluate the correct recovery scenario and follow
the procedure(s) listed to restore operations.
Note: A failed server in disaster recovery context refers to a server that has suffered partial or complete software and/or
hardware failure to the extent that it cannot restart or be returned to normal operation and requires intrusive activities to reinstall the software and/or hardware.
Table 2. Recovery Scenarios
Recovery Scenario 1
Failure Conditions:
 Both NO servers failed.
 All SO servers failed.
 MP servers may or may not be failed.
Procedure: Execute Section 5.1.1, Procedure 1.

Recovery Scenario 2
Failure Conditions:
 At least 1 NO server is intact and available.
 All SO servers failed.
 MP servers may or may not be failed.
Procedure: Execute Section 5.1.2, Procedure 2.

Recovery Scenario 3
Failure Conditions:
 Both NO servers failed.
 At least 1 SO server out of the Active, StandBy, Spare triplet is intact and available.
 MP servers may or may not be failed.
Procedure: Execute Section 5.1.3, Procedure 3.

Recovery Scenario 4
Failure Conditions:
 At least 1 NO server is intact and available.
 At least 1 SO server out of the Active, StandBy, Spare triplet is intact and available.
 1 or more MP servers have failed.
Procedure: Execute Section 5.1.4, Procedure 4.

Recovery Scenario 5
Failure Conditions:
 Both NO servers failed.
 DR NO is available.
 SO servers may or may not be failed.
 MP servers may or may not be failed.
Procedure: Execute Section 5.1.5, Procedure 5.

Recovery Scenario 6
Failure Conditions:
 Server is intact.
 Database gets corrupted on the server.
 Latest database backup of the corrupt server is present.
 Replication is inhibited (either manually or because of comcol upgrade barrier).
Procedure: Execute Section 5.1.6, Procedure 6.

Recovery Scenario 7
Failure Conditions:
 Server is intact.
 Database gets corrupted on the server.
 Replication is occurring to the server with the corrupted database.
Procedure: Execute Section 5.1.6, Procedure 7.

Recovery Scenario 8
Failure Conditions:
 Server is intact.
 Database gets corrupted on the server.
 Latest database backup of the corrupt server is NOT present.
 Replication is inhibited (either manually or because of comcol upgrade barrier).
Procedure: Execute Section 5.1.6, Procedure 8.
5 DISASTER RECOVERY PROCEDURE
Call the Oracle Customer Care Center at 1-888-367-8552 or 1-919-460-2150 (international) prior to executing this
procedure to ensure that the proper recovery planning is performed.
Before disaster recovery, users must properly evaluate the outage scenario. This check ensures that the correct
procedures are executed for the recovery.
***** WARNING *****
NOTE: Disaster recovery is an exercise that requires collaboration of multiple groups and is expected to be coordinated by the TAC prime. Based on TAC's assessment of the disaster, it may be necessary to deviate from the documented process.
Recovering Base Hardware:
1. Hardware recovery will be executed by Oracle.
2. Base hardware replacement must be controlled by an engineer familiar with the DSR application.
5.1 Recovering and Restoring System Configuration
Disaster recovery requires configuring the system as it was before the disaster and restoration of operational information.
There are eight distinct procedures to choose from depending on the type of recovery needed. Only one of these should be
followed (not all).
5.1.1 Recovery Scenario 1 (Complete Server Outage)
For a complete server outage, NO servers are recovered using recovery procedures of base hardware and software and then
executing a database restore to the active NO server. All other servers are recovered using recovery procedures of base
hardware and software. Database replication from the active NO server will recover the database on these servers. The
major activities are summarized in the list below. Use this list to understand the recovery procedure summary. Do not use
this list to execute the procedure. The actual procedures’ detailed steps are in Procedure 1. The major activities are
summarized as follows:
 Recover Base Hardware and Software for all Blades:
   o Recover the base hardware (by replacing the hardware and executing hardware configuration procedures; see reference [5] for DSR 4.x, reference [6] for DSR 5.x, or reference [7] for DSR 6.x).
   o Recover the Virtual Machines hosting the NOs and SOs (by executing procedures from reference [5] for DSR 4.x, reference [6] for DSR 5.x, or reference [7] for DSR 6.x).
   o Recover the software (by executing installation procedures; see reference [5] for DSR 4.x or reference [9] for DSR 5.x and DSR 6.x).
 Recover the Active NO server by recovering its NO VM image:
   o Recover the NO database.
   o Reconfigure the application.
 Recover the Standby NO server by recovering base hardware and software and/or VM image:
   o Reconfigure the application.
 Recover all SO and MP servers by recovering base hardware and software:
   o Recover the SO database.
   o Reconfigure the application.
   o Reconfigure the signaling interfaces and routes on the MPs (by executing installation procedures, reference [5], for DSR 4.x; for DSR 5.x/6.x, the software will automatically reconfigure the signaling interfaces from the recovered database. Refer to [9] if existing routes need to be altered for DSR 5.x/6.x).
 Restart processes and re-enable provisioning and replication.

Note that disaster recovery actions for any other dependent applications (SDS and DIH) may occur in parallel. These actions can and should be worked simultaneously; doing so allows faster recovery of the complete solution (i.e. the stale database on DP servers will not receive updates until the SDS SO servers are recovered).
Follow procedure below for detailed steps.
Procedure 1. Recovery Scenario 1
This procedure performs recovery if both NO servers have failed and all SO servers have failed. This procedure also covers C Level server failures.
Check off () each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix G before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

1  Recover the Failed Hardware and Software
Recover the failed hardware and software on ALL failed servers:
1. Refer to Appendix G to understand any workarounds required during this procedure.
2. If necessary, refer to [10], PM&C Disaster Recovery, for instructions on how to recover a PM&C server.
3. Gather the documents and required materials listed in Section 3.1.
4. If the failed server is an HP c-Class blade, follow these steps:
a. Remove the failed HP c-Class servers and blades and install replacements.
b. Configure and verify the BIOS on the Blade. Execute procedure “Confirm/Update
Blade Server BIOS Settings” from reference [5] for DSR 4.x or reference [6] for DSR
5.x or reference [7] for DSR 6.x.
c. Execute Procedure “Configure Blade Server iLO Password for Administrator Account”
from [5] for DSR 4.x or reference [6] for DSR 5.x or reference [7] for DSR 6.x to setup
the Administrator account for blade servers.
d. Load any firmware upgrades using [5] for DSR 4.x or reference [6] for DSR 5.x or
reference [7] for DSR 6.x
e. For blades based NOAMPs/SOAMPs execute procedure “Install TVOE on VM Host
Server Blades” from reference [5] for DSR 4.x or reference [6] for DSR 5.x or
reference [7] for DSR 6.x.
f.
For blade based NOAMPs/SOAMPs execute procedure “Configure TVOE on Server
Blades” from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x
5. If the failed server is an RMS-based NOAMP, follow these steps:
a. For RMS based servers, execute Appendix I from [3] to configure all iLO settings,
including the iLO password.
b. If the failed NOAMP is co-located with the PMAC on the first RMS then execute
procedure “Continue TVOE Configuration on First RMS Server” from reference [5] for
DSR 4.x or reference [9] for DSR 5.x/6.x.
c. Otherwise, execute procedure “Configure TVOE on Additional RMS Server(s)” from
reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
6. For NOAMPs execute procedure “Create NOAMP Guest VMs” from reference [5] for DSR 4.x or
reference [9] for DSR 5.x/6.x.
7. For SOAMPs execute procedure “Create SOAM Guest VMs” from reference [5]for DSR 4.x or
reference [9] for DSR 5.x/6.x.
8. IPM all the guests and failed MP servers using procedure “IPM Blades and VMs” from [5] for
DSR 4.x or reference [9] for DSR 5.x/6.x. Instruct any other Application’s personnel to start
recovery procedures on the Guests hosted by the server (parallel recovery).
9. Install the application using procedure “Install the Application Software on the Blades” from
[5]for DSR 4.x or reference [9] for DSR 5.x/6.x.
10. If the recovered server is an Active/Standby or DR NOAMP:
a. If any hardware profiles were manually created, they need to be recreated and copied into the /var/TKLC/appworks/profiles/ directory of the active NOAMP server, the standby NOAMP server, and both DR NOAM servers (if applicable). Follow the appendix "SAMPLE NETWORK ELEMENT AND HARDWARE PROFILES" from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
Repeat this step for all remaining failed blades.
2  Obtain latest database backup and network configuration data
Obtain the most recent database backup file from external backup sources (for example, file servers) or tape backup sources.
From the required materials list in Section 3.1, use site survey documents and the Network Element report (if available) to determine network configuration data.

3  Execute DSR Installation procedures for the first NOAMP
Execute the following procedures from the DSR 4.x/5.x/6.x Installation User's Guide:
1. Verify the networking data for Network Elements. Use the backup copy of network configuration data and site surveys (from Step 2).
2. Execute installation procedures for the first NO server. See reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, procedures "Configure the First NOAMP Server" and "Configure the NOAMP Server Group". If NetBackup is used, execute Procedure 35 from [5] or Procedure 13 from [9].

4  Log into the NO XMI Address
Log into the first NO GUI.

5  Upload the backed-up database file from the remote location into the File Management Area
1. Browse to Main Menu->Status & Manage->Files.
2. Select the Active NO Server. The following screen will appear. Click on "Upload" as shown below and select the "NO Provisioning and Configuration" file backed up after initial installation and provisioning.
3. Click on "Browse", locate the backup file and click on "Open" as shown below.
4. Click on the "Upload" button. The file will take a few seconds to upload depending on the size of the backup data. The file will be visible on the list of entries after the upload is complete.

6  Disable Provisioning
1.
Click on Main Menu->Status & Manage->Database
2.
Disable Provisioning by clicking on “Disable Provisioning” button at the bottom of the
screen as shown below.
3.
A confirmation window will appear, press “OK” to disable Provisioning.
4.
The message “Warning Code 002” will appear.

7  Verify the Archive Contents and Database Compatibility
1.
Select the Active NO Server and click on the “Compare”:
2.
The following screen is displayed; click the radio button for the restored database file that was uploaded as part of Step 5 of this procedure.
3.
Verify that the output window matches the screen below. Note that you will get a database
mismatch regarding the NodeIDs of the blades. That is expected. If that is the only
mismatch, then you can proceed, otherwise stop and contact customer support
NOTE: Archive Contents and Database Compatibilities must be the following:
Archive Contents: Provisioning and Configuration data
Database Compatibility: The databases are compatible.
NOTE: The following is the expected output for the Topology Compatibility Check, since we are restoring from an existing backed-up database to a database with just one NOAMP:
Topology Compatibility
THE TOPOLOGY SHOULD BE COMPATIBLE MINUS THE NODEID.
NOTE: We are restoring a backed-up database onto an empty NOAMP database, so this text is expected in the Topology Compatibility check.
4. If the verification is successful, click the BACK button and continue to the next step in this procedure.

8  Restore the Database
1. Click on Main Menu->Status & Manage->Database.
2. Select the Active NO Server, and click on "Restore" as shown below.
3. The following screen will be displayed. Select the proper backed-up provisioning and configuration file.
4. Click the "OK" button. The following confirmation screen will be displayed.
5. If you get an error that the NodeIDs do not match, that is expected. If no other errors besides the NodeIDs are displayed, select the "Force" checkbox as shown above and click OK to proceed with the DB restore.
6. NOTE: After the restore has started, the user will be logged out of the XMI NO GUI, since the restored topology is old data.
7. Log back into the GUI using the VIP address.
8. Log in using the guiadmin login and password.
9. Wait 5-10 minutes for the system to stabilize with the new topology. Monitor the Info tab for "Success". This indicates that the restore is complete and the system has stabilized.
10. The following alarms must be ignored for NO and MP servers until all the servers are configured: alarms with Type column "REPL", "COLL", "HA" (with mate NOAMP), or "DB" (about Provisioning Manually Disabled).
Do not pay attention to alarms until all the servers in the system are completely restored.
NOTE: The Configuration and Maintenance information will be in the same state it was in when backed up during the initial backup.
9
Restore /etc/hosts file of
active NO

Release 5.X/6.x:
From the recovered NO server command line, execute:
# AppWorks AppWorks_AppWorks updateServerAliases <NO Host Name>
Release 4.X:
Update the /etc/hosts file with the missing entries (or copy it from another server (e.g. SO) if it is
complete on that server)
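For Release 4.X, a minimal sketch of one way to repopulate the file, assuming root SSH access between the servers, that the chosen SO's /etc/hosts is complete, and that the SO hostname shown is a placeholder:

# cp /etc/hosts /etc/hosts.pre-recovery
# scp root@<intact-SO-hostname>:/etc/hosts /etc/hosts
# wc -l /etc/hosts.pre-recovery /etc/hosts

The first command preserves the current file, the second pulls the complete copy from the intact server, and the line counts can be compared to confirm the recovered file is not missing entries.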
10
Re-enable Provisioning
1.
Click on Main Menu->Status & Manage->Database menu item.
2.
Click on the “Enable Provisioning” button. A pop-up window will appear to confirm as
shown below, press OK.

11
Recover standby NO
server.
Recover the standby NO server:
1. Install the second NO server by executing reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, procedure "Configure the Second NOAMP Server", steps 1, 4, 5, 6 and 7 (step 7 only if NetBackup is used). Also execute Procedure 35 from [5] or Procedure 13 from [9] if NetBackup is used.

12  Stop Replication to the C Level servers of this site
Inhibit replication to the working C Level servers which belong to the same site as the failed SO servers, because the recovery of the Active SO would otherwise cause a database wipeout on the C Level servers through replication:
1. Execute Appendix E.

13  Recover active SO server
Recover the active SO server:
1. Install the SO server by executing reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, procedure "Configure the SOAM Servers", steps 1, 2, 3, 5, 6, 7, 8. If you are using NetBackup, also execute step 11.
14  Upload the backed-up SO database file from the remote location into the File Management Area
1. Browse to Main Menu->Status & Manage->Files.
2. Select the Active SO Server. The following screen will appear. Click on "Upload" as shown below and select the "SO Provisioning and Configuration" file backed up after initial installation and provisioning.
3. Click on "Browse", locate the backup file and click on "Open" as shown below.
4. Click on the "Upload" button. The file will take a few seconds to upload depending on the size of the backup data. The file will be visible on the list of entries after the upload is complete.

15  Disable Provisioning
1.
Click on Main Menu->Status & Manage->Database
2.
Disable Provisioning by clicking on “Disable Provisioning” button at the bottom of the
screen as shown below.
3.
A confirmation window will appear, press “OK” to disable Provisioning.
4.
The message “Warning Code 002” will appear.

16  Verify the Archive Contents and Database Compatibility
1.
Login onto the recently recovered Active SO GUI
2.
Click on Main Menu->Status & Manage->Database
3.
Select the Active SO Server and click on the “Compare”:
4.
The following screen is displayed; click the radio button for the restored database file that was uploaded as part of Step 14 of this procedure.
5.
Verify that the output window matches the screen below. Note that you will get a database
mismatch regarding the NodeIDs of the blades. That is expected. If that is the only
mismatch, then you can proceed, otherwise stop and contact customer support
NOTE: Archive Contents and Database Compatibilities must be the following:
Archive Contents: Provisioning and Configuration data
Database Compatibility: The databases are compatible.
NOTE: The following is the expected output for the Topology Compatibility Check, since we are restoring from an existing backed-up database to a database with just one SOAM:
Topology Compatibility
THE TOPOLOGY SHOULD BE COMPATIBLE MINUS THE NODEID.
NOTE: We are restoring a backed-up database onto an empty SOAM database, so this text is expected in the Topology Compatibility check.
6.
If the verification is successful, Click BACK button and continue to next step in this
procedure.
NOTE: Please refer to the workarounds in Appendix G if any problems are encountered in this step.
17
Restore the Database

1.
Login onto the recently recovered Active SO GUI
2.
3.
4.
Click on Main Menu->Status & Manage->Database
Select the Active SO Server, and click on “Restore” as shown below.
The following screen will be displayed. Select the proper back up provisioning and
configuration file.
5.
Click “OK” Button. The following confirmation screen will be displayed.
6.
If you get an error that the NodeIDs do not match, that is expected. If no other errors besides the NodeIDs are displayed, select the "Force" checkbox as shown above and click OK to proceed with the DB restore.
7. NOTE: After the restore has started, the user will be logged out of the XMI SO GUI, since the restored topology is old data.
8. Log back into the GUI using the VIP address.
9. Log in using the guiadmin login and password.
10. Wait for 5-10 minutes for the System to stabilize with the new topology.
11. The following alarms must be ignored for NO and MP servers until all the servers are configured: alarms with Type column "REPL", "COLL", "HA" (with mate SOAM), or "DB" (about Provisioning Manually Disabled).
Do not pay attention to alarms until all the servers in the system are completely restored.
NOTE: The Configuration and Maintenance information will be in the same state it was in when backed up during the initial backup.
18  Re-enable Provisioning
1. Log into the Active NO GUI.
2. Click on the Main Menu->Status & Manage->Database menu item.
3. Click on the "Enable Provisioning" button. A pop-up window will appear to confirm as shown below; press OK.

19  Recover remaining SO servers
Recover the remaining SO servers (standby, spare) by repeating the following steps for each SO server:
1. Install the remaining SO server by executing reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, procedure "Configure the SOAM Servers", steps 1, 4, 5, and 6. Execute step 11 as well if NetBackup is used.
2. If you are using NetBackup, execute Procedure 35 from [5] or Procedure 13 from [9], "Install Netbackup Client".
3. Re-enable replication (if inhibited) to the restored SO by navigating to Main Menu->Status & Manage->Database, then selecting the SO in question and clicking on "Allow Replication".
4. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered server and clicking on "Restart".

20  Start Replication to the C Level servers of this site
Note: Execute this step only if Step 12 was executed; otherwise skip this step.
Un-inhibit (start) replication to the working C Level servers which belong to the same site as the failed SO servers:
1. Execute Appendix F.

21  Recover the C Level Servers (this includes DA-MPs, SBRs, IPFEs, SS7-MPs)
Execute the following procedures from [5] for DSR 4.x or reference [9] for DSR 5.x/6.x FOR EACH server that has been recovered:
1. "Configure MP Blade Servers", steps 1, 5, 6, 7, 8, 9. Also execute steps 10 and 11 if you plan to configure a default route on your MP that uses a signaling (XSI) network instead of the XMI network.
2. FOR DSR 4.X *ONLY*: Reapply the signaling networking configuration by running the following command from the active NO command line for each MP server:
/usr/TKLC/appworks/bin/syncApplConfig <MP_Hostname>
3. If IPFE servers are being recovered, execute Procedure 6 of [22] for any applicable IPFE servers.
Note: If this server is an IPFE server, ensure that ipfeNetUpdate.sh from [22] has been executed.
22  Restart Application Processes
Restart the application by navigating to Status & Manage -> Server, then selecting each server that has been recovered and clicking on Restart at the bottom of the screen.

23  DSR 5.X Recovery Only: Re-Sync NTP if Necessary (Optional)
Navigate to Status & Manage -> Server, then select each server that has been recovered and click NTP Sync.
24  Allow Replication to all Servers
1. Navigate to Status & Manage -> Database.
2. If the "Repl Status" is set to "Inhibited", click on the "Allow Replication" button as shown below, using the following order; otherwise, if none of the servers are inhibited, skip this step and continue with the next step:
   a. Active NOAMP Server
   b. Standby NOAMP Server
   c. Active SOAM Server
   d. Standby SOAM Server
   e. Spare SOAM Server (if applicable)
   f. Active DR NOAM Server
   g. MP/IPFE Servers (if MPs are configured as Active/Standby, start with the Active MP; otherwise the order of the MPs does not matter)
   h. Verify that replication on all servers is allowed. This can be done by clicking on each server and checking that the button below shows "Inhibit Replication" instead of "Allow Replication".

25  Remove Forced Standby
1. Navigate to Status & Manage -> HA.
2. Click on Edit at the bottom of the screen.
3. For each server whose Max Allowed HA Role is set to Standby, set it to Active.
4. Press OK.

26  Fetch and Store the database Report for the newly restored data and save it
1. Navigate to Configuration -> Server, select the active NO server and click on the "Report" button at the bottom of the page. The following screen is displayed:
2. Click on "Save" and save the report to your local machine.
27  DSR 4.X Recovery ONLY: Optimize Comcol memory usage on NO and SO
If recovering a DSR 4.x system, execute this step; otherwise skip to step 29.
Obtain a terminal window connection to the (NO/SO) server console via SSH or iLO. If using SSH,
use the actual IP of the server, not the VIP address.
Execute the following on the command line. Wait until the script completes and you are returned to
the command line:
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage
# sleep 20
# prod.start
# pm.sanity
Sanity check OK: 01/23/13 11:42:20 within 15 secs
Verify that the script finished successfully by checking the exit status:
# echo $?
If anything other than "0" is printed out, halt this procedure and contact Oracle Support.
Repeat this step for the standby NO, D.R. NO (if applicable) servers, and every SO server at
every site.
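The commands above can also be combined into a small script per server so that a failure is caught before proceeding. This is only a sketch built from the commands in this step; save it on the server and run it as root:

#!/bin/sh
# Comcol IDB RAM optimization with a basic failure check (commands taken from this step)
/usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage || { echo "optimization failed - contact Oracle Support"; exit 1; }
sleep 20
prod.start
pm.sanity

pm.sanity should report a "Sanity check OK" line similar to the sample output above; if the optimization script exits non-zero, this wrapper stops immediately so the failure is not masked.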
28  DSR 4.X Recovery ONLY: Optimize Comcol memory usage on DA-MP
SSH to each DA-MP and execute the following command. Note that this command SHOULD NOT be
executed on SBR blades.
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage --force
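Where there are many DA-MPs, the same command can be driven from a single SSH loop. This is only a sketch; the hostnames are placeholders for your actual DA-MP hostnames, root SSH access is assumed, and SBR blades must not appear in the list:

#!/bin/sh
# DA-MP hostnames below are placeholders - list DA-MPs only, never SBR blades
for mp in da-mp-01 da-mp-02 da-mp-03; do
    echo "Optimizing Comcol IDB RAM usage on ${mp}"
    ssh root@"${mp}" "/usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage --force"
done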
29  Verify Replication between servers
For DSR 4.x:
1. Click on Main Menu->Status & Manage->Replication.
2. Verify that replication is occurring between servers.
For DSR 5.x and above:
1. Execute irepstat from the command line:
$ irepstat -m
Output like the following shall be generated:
-- Policy 0 ActStb [DbReplication] ------------------------------------------------
RDU06-MP1 -- Stby
  BC From RDU06-SO1 Active 0 0.50 ^0.17%cpu 42B/s A=none
  CC From RDU06-MP2 Active 0 0.10 ^0.17 0.88%cpu 32B/s A=none
RDU06-MP2 -- Active
  BC From RDU06-SO1 Active 0 0.50 ^0.10%cpu 33B/s A=none
  CC To   RDU06-MP1 Active 0 0.10 0.08%cpu 20B/s A=none
RDU06-NO1 -- Active
  AB To   RDU06-SO1 Active 0 0.50 1%R 0.03%cpu 21B/s
RDU06-SO1 -- Active
  AB From RDU06-NO1 Active 0 0.50 1%R 0.04%cpu 21B/s
  BC To   RDU06-MP1 Active 0 0.50 ^0.04%cpu 24B/s
  BC To   RDU06-MP2 Active 0 0.50 1%R 0.07%cpu 21B/s

30  Verify the Database states
1. Click on Main Menu->Status & Manage->Database.
2. For DSR 4.x, verify that the HA Role is either "Active" or "Standby", and that the status is "Normal".
3. For DSR 5.x/6.x, verify that the OAM Max HA Role is either "Active" or "Standby" for NO and SO servers, that the Application Max HA Role for MPs is "Active", and that the status is "Normal" as shown below.
31  Verify the HA Status
1. Click on Main Menu->Status & Manage->HA.
2. Check the rows for all the MP servers.
3. Verify that the HA Role is either Active or Standby.

32  Verify the local node info
1. Click on Main Menu->Diameter->Configuration->Local Node.
2. Verify that all the local nodes are shown.

33  Verify the peer node info
1. Click on Main Menu->Diameter->Configuration->Peer Node.
2. Verify that all the peer nodes are shown.

34  Verify the Connections info
1. Click on Main Menu->Diameter->Configuration->Connections.
2. Verify that all the connections are shown.

35  Re-enable connections if needed
1. Click on Main Menu->Diameter->Maintenance->Connections.
2. Select each connection and click on the "Enable" button.
3. Verify that the Operational State is Available.

36  Examine All Alarms
1. Click on Main Menu->Alarms & Events->View Active.
2. Examine all active alarms and refer to the on-line help on how to address them. If needed, contact the Oracle Customer Support hotline.

37  Restore GUI Usernames and passwords
If applicable, execute the steps in Section 6 to recover the restored user and group information.

38  Re-activate Optional Features
If optional features (RBAR, FABR, IPFE, CPA, PDRA, SBR, MIWF) were activated, they will need to be de-activated and then re-activated. Refer to [15], [16], [17], [18], [19], [20], [22] or [23] for the appropriate documentation.

39  Re-enable transports if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->Transport Manager->Maintenance->Transport.
2. Select each transport and click on the "Enable" button.
3. Verify that the Operational Status for each transport is Up.

40  Re-enable MAPIWF application if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->'Local SCCP Users'.
2. Click on the "Enable" button corresponding to the MAPIWF Application Name.
3. Verify that the SSN Status is Enabled.

41  Re-enable links if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->Links.
2. Click on the "Enable" button for each link.
3. Verify that the Operational Status for each link is Up.

42  Clear Browser Cache
If the system was restored with DSR 3.0 after running 4.X/5.X, the browser cache needs to be cleared. To do so in IE, navigate to Tools -> Internet Options and click on Delete under browsing history. (For other browsers, refer to their respective documentation/help on how to do so.)

43  Backup and archive all the databases from the recovered system
Execute Appendix A to back up the configuration databases.

Disaster Recovery Procedure is Complete
End of Procedure
5.1.2 Recovery Scenario 2 (Partial Server Outage with one NO Server intact and both
SOs failed)
For a partial server outage with an NO server intact and available, SO servers are recovered using recovery procedures of base hardware and software and then executing a database restore to the active SO server using a database backup file obtained from the SO servers. All other servers are recovered using recovery procedures of base hardware and software. Database replication from the active NO server will recover the database on these servers. The major activities are summarized in the list below. Use this list to understand the recovery procedure summary. Do not use this list to execute the procedure. The actual procedures' detailed steps are in Procedure 2. The major activities are summarized as follows:
 Recover the Standby NO server (if needed) by recovering base hardware, software and the database:
   o Recover the base hardware.
   o Recover the software.
 Recover the Active SO server by recovering base hardware and software:
   o Recover the base hardware.
   o Recover the software.
   o Recover the database.
 Recover any failed SO and MP/IPFE servers by recovering base hardware and software:
   o Recover the base hardware.
   o Recover the software.
   o The database has already been restored at the active SO server and does not require restoration at the SO and MP servers.
Procedure 2. Recovery Scenario 2
This procedure performs recovery if at least 1 NO server is available but all SO servers in a site have failed. This includes any SO server that is in another location.
Check off () each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix G before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

1  Recover Failed servers (if needed)
Recover the failed servers (if needed) by recovering base hardware and software.
1. Refer to Appendix G to understand any workarounds required during this procedure.
2. If necessary, refer to [10], PM&C Disaster Recovery, for instructions on how to recover a PM&C server.
3. Gather the documents and required materials listed in Section 3.1.
4. From the NO VIP GUI, set the server HA state to "Forced Standby" by navigating to Main Menu->HA, then clicking on Edit and setting the "Max Allowed HA Role" to Standby.
5. From the NO VIP GUI, Inhibit replication to the failed server by navigating to Main Menu->Status
& Manage-> Database, then selecting the server in question and clicking on “Inhibit
Replication”.
6. If the failed server is an HP c-Class blade, follow these steps:
a. Remove the failed HP c-Class blade and install the replacement into the enclosure.
b. Configure and verify the BIOS on the Blade. Execute procedure “Confirm/Update
Blade Server BIOS Settings” from reference [5] for DSR 4.x or reference [6] for DSR
5.x or reference [7] for DSR 6.x.
c. Execute Procedure “Configure Blade Server iLO Password for Administrator Account”
from [5] for DSR 4.x or reference [6] for DSR 5.x or reference [7] for DSR 6.x to setup
the Administrator account for blade servers.
d. Upgrade the blade firmware and load any errata updates if needed. Refer to [1] for more details.
e. For blade based NOAMPs/SOAMPs execute procedure “Install TVOE on Server
Blades” from reference [5] for DSR 4.x or reference [6] for DSR 5.x or reference [7] for
DSR 6.x.
f. For blade based NOAMPs/SOAMPs execute procedure “Configure TVOE on Server
Blades” from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
7. If the failed server is an RMS, follow these steps:
a. For RMS based servers, execute Appendix I from [3] to configure all iLO settings,
including the iLO password.
b. If the failed NOAMP is co-located with the PMAC on the first RMS then execute
procedure “Continue TVOE Configuration on First RMS Server” from reference [5] for
DSR 4.x or reference [9] for DSR 5.x/6.x. (RMS based NOAMPs only)
c. Else execute procedure “Configure TVOE on Additional RMS Server(s)” from
reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x. (RMS based NOAMPs
only)
8. For NOAMP server execute procedure “Create NOAMP Guest VMs” from reference [5] for DSR
4.x or reference [9] for DSR 5.x/6.x.
9. For SOAMP server execute procedure “Create SOAMP Guest VMs” from reference [5] for DSR
4.x or reference [9] for DSR 5.x/6.x.
10. IPM all the guests and failed MP servers using procedure “IPM Blades and VMs” from [5] for
DSR 4.x or reference [9] for DSR 5.x/6.x. Instruct any other Application’s personnel to start
recovery procedures on the Guests hosted by the server (parallel recovery).
11. Install the application using procedure “Install the Application Software on the Blades” from [5]
for DSR 4.x or reference [9] for DSR 5.x/6.x.
12. If the recovered server is an Active/Standby or DR NOAMP:
a. If there were Hardware profiles that were manually created, then they need to be recreated and copied into the /var/TKLC/appworks/profiles/ directory of the active NOAMP server, the standby NOAMP server, and both DR NOAM servers (if applicable). Follow the appendix "SAMPLE NETWORK ELEMENT AND HARDWARE PROFILES" from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
Repeat this step for all remaining failed blades.
2

Recover Standby NO servers.
1. Configure the newly installed application by executing procedure "Configure the Second NOAMP Server" from [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, steps 1, 2, 4, 5 and 6.
2. If you are using Netbackup, execute Procedure 35 from [5] or Procedure 13 from [9], "Install Netbackup Client".
3. Re-enable Replication to the restored NO by navigating to Main Menu->Status & Manage->Database, then selecting the NO in question and clicking on "Allow Replication".
4. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered server and clicking on "Restart".
3

Stop Replication to the C Level servers of this site
Inhibit Replication to the working C Level Servers that belong to the same site as the failed SO servers, because the recovery of the Active SO will cause a database wipeout in the C Level servers through replication.
Execute Appendix E.
4

Recover first SO server.
Recover the SO servers:
1. Configure the newly installed application by executing procedure "Configure the SOAM Servers" from [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, steps 1, 2, 4, 5 and 6. Also execute step #11 if you are using NetBackup on your SOAMs.
2. If you are using Netbackup, execute Procedure 35 from [5] or Procedure 13 from [9], "Install Netbackup Client".
3. Re-enable Replication to the restored SO by navigating to Main Menu->Status & Manage->Database, then selecting the SO in question and clicking on "Allow Replication".
4. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered server and clicking on "Restart".
Re-execute this step for all the failed SO servers.
5

Upload the backed up SO database file from Remote location into File Management Area.
1. From the NO GUI, browse to Main Menu->Status & Manage->Files.
2. Select the Active SO Server. The following screen will appear. Click on "Upload" as shown below and select the "SO Provisioning and Configuration" file backed up after initial installation and provisioning.
3. Click on "Browse", locate the backup file and click on "Open" as shown below.
4. Click on the "Upload" button. The file will take a few seconds to upload depending on the size of the backup data. The file will be visible on the list of entries after the upload is complete.
6

Disable Provisioning
1. From the NO GUI, click on Main Menu->Status & Manage->Database.
2. Disable Provisioning by clicking on the "Disable Provisioning" button at the bottom of the screen as shown below.
3. A confirmation window will appear; press "OK" to disable Provisioning.
4. The message "Warning Code 002" will appear.
7

Verify the Archive Contents and Database Compatibility
1. Log in to the Active SO GUI.
2. Click on Main Menu->Status & Manage->Database.
3. Select the Active SO Server and click on "Compare".
4. The following screen is displayed; click the radio button for the restored database file that was uploaded as a part of Step 2 of this procedure.
5. Verify that the output window matches the screen below.
NOTE: Archive Contents and Database Compatibility must be the following:
Archive Contents: Provisioning and Configuration data
Database Compatibility: The databases are compatible.
NOTE: The following is the expected output for the Topology Compatibility Check, since we are restoring from an existing backed up database to a database with just one NOAMP:
Topology Compatibility
THE TOPOLOGY SHOULD BE COMPATIBLE
6. If the verification is successful, click the BACK button and continue to the next step in this procedure.
8

Restore the Database
1. Log in to the recently recovered Active SO GUI.
2. Click on Main Menu->Status & Manage->Database.
3. Select the Active SO Server and click on "Restore" as shown below.
4. The following screen will be displayed. Select the proper backed up provisioning and configuration file.
5. Click the "OK" button. The following confirmation screen will be displayed.
6. If you get an error that the NodeIDs do not match, that is expected. If no other errors besides the NodeIDs are displayed, select the "Force" checkbox as shown above and click OK to proceed with the DB restore.
7. NOTE: After the restore has started, the user will be logged out of the XMI SO GUI since the restored Topology is old data.
8. Log back into the GUI VIP.
9. Log into the GUI using the guiadmin login and password.
10. Wait 5-10 minutes for the System to stabilize with the new topology.
11. The following alarms must be ignored for NO and MP Servers until all the servers are configured: alarms with Type Column as "REPL", "COLL", "HA" (with mate NOAMP), "DB" (about Provisioning Manually Disabled). Do not pay attention to alarms until all the servers in the system are completely restored.
NOTE: The Configuration and Maintenance information will be in the same state it was backed up during initial backup.
9
Re-enable Provisioning
1.
Click on Main Menu->Status & Manage->Database menu item.
2.
Click on the “Enable Provisioning” button. A pop-up window will appear to confirm as
shown below, press OK.

10

Recover remaining SO servers.
Recover the remaining SO servers (standby, spare) by repeating the following steps for each SO Server:
1. Install the remaining SO server by executing reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, Procedure "Configure the SOAM Servers", steps 1-3 and 5-8.
2. If you are using Netbackup, execute Procedure 35 from [5] or Procedure 13 from [9], "Install Netbackup Client".
3. Re-enable Replication (if inhibited) to the restored SO by navigating to Main Menu->Status & Manage->Database, then selecting the SO in question and clicking on "Allow Replication".
4. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered server and clicking on "Restart".
11

Start Replication to the C Level servers of this site
UnInhibit (Start) Replication to the working C Level Servers that belong to the same site as the failed SO servers.
Execute Appendix F.
12

Recover the C Level Server (this includes DA-MP, SBRs, IPFE, SS7-MPs)
Execute the following procedures from [5] for DSR 4.x or reference [9] for DSR 5.x/6.x FOR EACH C Level Server that has been recovered:
1. "Configure MP Blade Servers", Steps 1, 5, 6, 7, 8, 9. Also execute steps 10 and 11 if you plan to configure a default route on your MP that uses a signaling (XSI) network instead of the XMI network.
2. FOR DSR 4.X *ONLY*: Reapply the signaling Networking Configuration by running the following command from the active NO command line for each MP Server (a looped example is sketched after this step):
/usr/TKLC/appworks/bin/syncApplConfig <MP_Hostname>
3. If IPFE servers are being recovered, execute Procedure 6 of [22] for any applicable IPFE servers.
Note: If this server is an IPFE server, then ensure ipfeNetUpdate.sh from [22] has been executed before proceeding with this step.
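If several MP servers were recovered on a DSR 4.x system, the syncApplConfig command in sub-step 2 above can be run for each of them from the active NO in a single loop. The line below is an illustrative sketch only; MP1, MP2 and MP3 are placeholders for the actual recovered MP hostnames:
# for MP in MP1 MP2 MP3; do /usr/TKLC/appworks/bin/syncApplConfig $MP; done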
13

DSR 5.X Recovery Only: Re-Sync NTP if Necessary (Optional)
Navigate to Status & Manage -> Server, then select each server that has been recovered and click NTP Sync.
14

Restart Application Processes
Restart the Application by navigating to Status & Manage -> Server, then selecting each server that has been recovered and clicking on Restart at the bottom of the screen.
15

Allow Replication to all Servers
1. Navigate to Status & Manage -> Database.
2. If the "Repl Status" is set to "Inhibited", click on the "Allow Replication" button as shown below using the following order; otherwise, if none of the servers are inhibited, skip this step and continue with the next step:
a. Active NOAMP Server
b. Standby NOAMP Server
c. Active SOAM Server
d. Standby SOAM Server
e. Spare SOAM Server (if applicable)
f. Active DR NOAM Server
g. MP/IPFE Servers (if MPs are configured as Active/Standby, start with the Active MP, otherwise the order of the MPs does not matter)
Verify that replication on all servers is allowed. This can be done by clicking on each server and checking that the button below shows "Inhibit Replication" instead of "Allow Replication".
16

Remove Forced Standby
1. Navigate to Status & Manage -> HA.
2. Click on Edit at the bottom of the screen.
3. For each server whose Max Allowed HA Role is set to Standby, set it to Active.
4. Press OK.
17

Fetch and Store the database Report for the newly restored data and save it
1. Navigate to Configuration -> Server, select the active NO server and click on the "Report" button at the bottom of the page. The following screen is displayed:
2. Click on "Save" and save the report to your local machine.
18

Optimize Comcol memory usage on recovered NO and SO (DSR 4.x only)
If recovering a DSR 4.x system, execute this step, otherwise skip to step 20.
For each recovered NO or SO, obtain a terminal window connection to the (NO/SO) server console via SSH or iLO. If using SSH, use the actual IP of the server, not the VIP address.
Execute the following on the command line. Wait until the script completes and you are returned to the command line:
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage
# sleep 20
# prod.start
# pm.sanity
Sanity check OK: 01/23/13 11:42:20 within 15 secs
Verify that the script finished successfully by checking the exit status:
# echo $?
If anything other than "0" is printed out, halt this procedure and contact Oracle Support.
Repeat this step for all recovered NO and SO servers at every site.
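If preferred, the same sequence can be entered as a single chained command so that execution stops at the first failure and the exit status checked afterwards covers the whole sequence. This is only an illustrative alternative to typing the commands individually:
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage && sleep 20 && prod.start && pm.sanity
# echo $?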
19

Optimize Comcol memory usage on DA-MP (DSR 4.x only)
SSH to each recovered DA-MP and execute the following command. Note that this command
SHOULD NOT be executed on SBR blades.
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage --force
20

Verify Replication between servers.
For DSR 4.x
1. Click on Main Menu->Status & Manage->Replication.
2. Verify that replication is occurring between servers.
For DSR 5.x and above
3. Execute irepstat:
$ irepstat -m
Output like the below shall be generated:
-- Policy 0 ActStb [DbReplication] --------------------------------------------
RDU06-MP1 -- Stby
  BC From RDU06-SO1 Active  0 0.50 ^0.17%cpu 42B/s A=none
  CC From RDU06-MP2 Active  0 0.10 ^0.17 0.88%cpu 32B/s A=none
RDU06-MP2 -- Active
  BC From RDU06-SO1 Active  0 0.50 ^0.10%cpu 33B/s A=none
  CC To   RDU06-MP1 Active  0 0.10 0.08%cpu 20B/s A=none
RDU06-NO1 -- Active
  AB To   RDU06-SO1 Active  0 0.50 1%R 0.03%cpu 21B/s
RDU06-SO1 -- Active
  AB From RDU06-NO1 Active  0 0.50 ^0.04%cpu 24B/s
  BC To   RDU06-MP1 Active  0 0.50 1%R 0.04%cpu 21B/s
  BC To   RDU06-MP2 Active  0 0.50 1%R 0.07%cpu 21B/s
21

Verify the Database states
1. Click on Main Menu->Status & Manage->Database.
2. Verify that the HA Role is either "Active" or "Standby", and that the status is "Normal".
22

Verify the HA Status
1. Click on Main Menu->Status & Manage->HA.
2. Check the row for all the MP Servers.
3. Verify that the HA Role is either Active or Standby.
23

Verify the local node info
1. Click on Main Menu->Diameter->Configuration->Local Node.
2. Verify that all the local nodes are listed.
24

Verify the peer node info
1. Click on Main Menu->Diameter->Configuration->Peer Node.
2. Verify that all the peer nodes are listed.
25

Verify the Connections info
1. Click on Main Menu->Diameter->Configuration->Connections.
2. Verify that all the connections are listed.
26

Re-enable connections if needed
1. Click on Main Menu->Diameter->Maintenance->Connections.
2. Select each connection and click on the "Enable" button.
3. Verify that the Operational State is Available.
27

Examine All Alarms
1. Click on Main Menu->Alarms & Events->View Active.
2. Examine all active alarms and refer to the on-line help on how to address them. If needed, contact the Oracle Customer Support hotline.
28

Re-activate Optional Features
If optional features (CPA, PDRA, SBR) were activated, they will need to be de-activated and then re-activated. Refer to [15], [16], [17], [18], [19] or [20] for the appropriate documentation.
29

Sync Policy DRA PCRF Data (Only required if Policy DRA application is activated)
If recovering a DSR 5.1 or 6.0 system, and the PDRA application is activated, then execute this step, otherwise skip to step 30.
Obtain a terminal window connection to the Active NOAMP console via SSH using the VIP address.
Follow the steps below:
1. Go to the Appworks bin directory:
# cd /usr/TKLC/appworks/bin/
2. Execute the PCRF sync script in "reportonly" mode to check whether PCRF data syncing is required or not. This is a read-only mode that does not modify the database:
# ./syncPcrfReferencesAfterRestore.sh --reportonly
3. If the Report Summary shows one or more PCRFs "need to be synced", then repeat the script execution but using the "sync" option instead of "reportonly" in order to sync the database. The "sync" option will modify the database:
# ./syncPcrfReferencesAfterRestore.sh --sync
4. This step is only required if step 3 was executed, otherwise skip this step.
Re-execute the PCRF sync script in "reportonly" mode to verify all PCRF data is in sync. Examine the Report Summary output of the script. Verify that the number of "PCRF record(s) processed in total" is equal to the number of "PCRF record(s) already in sync":
# ./syncPcrfReferencesAfterRestore.sh --reportonly
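The report/sync/verify flow of this step can also be condensed into the following illustrative command sequence. It assumes, as described above, that the Report Summary contains the phrase "need to be synced" when a sync pass is required; adjust the match string if the output on your system differs:
# cd /usr/TKLC/appworks/bin/
# ./syncPcrfReferencesAfterRestore.sh --reportonly | grep -q "need to be synced" && ./syncPcrfReferencesAfterRestore.sh --sync
# ./syncPcrfReferencesAfterRestore.sh --reportonly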
30

Re-activate Optional Features
If optional features (RBAR, FABR, IPFE, CPA, PDRA, SBR, MIWF) were activated, they will need to be de-activated and then re-activated. Refer to [15], [16], [17], [18], [19], [20], [22] or [23] for the appropriate documentation.
31

Re-enable transports if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->Transport Manager->Maintenance->Transport.
2. Select each transport and click on the "Enable" button.
3. Verify that the Operational Status for each transport is Up.
32

Re-enable MAPIWF application if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->'Local SCCP Users'.
2. Click on the "Enable" button corresponding to the MAPIWF Application Name.
3. Verify that the SSN Status is Enabled.
33

Re-enable links if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->Links.
2. Click on the "Enable" button for each link.
3. Verify that the Operational Status for each link is Up.
34

Backup and archive all the databases from the recovered system
Execute Appendix A to back up the Configuration databases.
Disaster Recovery Procedure is Complete
End of Procedure
5.1.3 Recovery Scenario 3 (Partial Server Outage with both NO Servers failed and one
SO Server intact)
For a partial server outage with an SO server intact and available, NO servers are recovered using recovery procedures of
base hardware and software and then executing a database restore to the active NO server using a NO database backup file
obtained from external backup sources such as customer servers or Netbackup. All other servers are recovered using
recovery procedures of base hardware and software. Database replication from the active NO/active SO server will recover
the database on these servers. The major activities are summarized in the list below. Use this list to understand the recovery
procedure summary. Do not use this list to execute the procedure. The actual procedures’ detailed steps are in Procedure 3.
The major activities are summarized as follows:
 Recover Active NO server by recovering base hardware, software and the database.
o Recover the base hardware.
o Recover the software.
o Recover the database.
 Recover Standby NO server by recovering base hardware and software.
o Recover the base hardware.
o Recover the software.
o The database has already been restored at the active NO server and does not require restoration at the standby NO server.
 Recover any failed SO and MP servers by recovering base hardware and software.
o Recover the base hardware.
o Recover the software.
o The database is already intact at one SO Server and does not require restoration at the other SO and MP servers.
Follow procedure below for detailed steps.
Procedure 3. Recovery Scenario 3
This procedure performs recovery if both NO servers have failed but 1 or more SO servers are intact. This includes any SO server that is in another location (spare SO server).
Check off () each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix E before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.
1

Recover Failed Hardware and software
Recover the Failed Hardware and Software on ALL failed blades:
1. Refer to Appendix E to understand any workarounds required during this procedure.
2. If necessary, refer to [10], PM&C Disaster Recovery, for instructions on how to recover a PM&C Server.
3. Gather the documents and required materials listed in Section 3.1.
4. If the failed server is an HP c-Class Blade, follow these steps:
a. Remove the failed HP c-Class Blade and install the replacement into the enclosure.
b. Configure and verify the BIOS on the Blade. Execute procedure “Confirm/Update
Blade Server BIOS Settings” from reference [5] for DSR 4.x or reference [6] for DSR
5.x or reference [7] for DSR 6.x.
c. Execute Procedure “Configure Blade Server iLO Password for Administrator Account”
from [5] for DSR 4.x or reference [6] for DSR 5.x or reference [7] for DSR 6.x to setup
the Administrator account for blade servers.
d. Upgrade the blade firmware and load any errata updates if needed. Refer to [1] for
more details.
e. For blade based NOAMPs/SOAMPs execute procedure “Install TVOE on Server
Blades” from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
f. For blade based NOAMPs/SOAMPs execute procedure “Configure TVOE on Server
Blades” from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
5. If the failed server is an RMS, follow these steps:
a. For RMS based servers, execute Appendix I from [3] to configure all iLO settings,
including the iLO password.
b. If the failed NOAMP is co-located with the PMAC on the first RMS then execute
procedure “Continue TVOE Configuration on First RMS Server” from reference [5] for
DSR 4.x or reference [9] for DSR 5.x/6.x. (RMS based NOAMPs only)
c. Else execute procedure “Configure TVOE on Additional RMS Server(s)” from
reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x. (RMS based NOAMPs
only)
6. For NOAMPs execute procedure “Create NOAMP Guest VMs” from reference [5] for DSR 4.x or
reference [9] for DSR 5.x/6.x.
7. For SOAMPs execute procedure “Create SOAM Guest VMs” from reference [5]for DSR 4.x or
reference [9] for DSR 5.x/6.x.
8. IPM all the guests and failed MP servers using procedure “IPM Blades and VMs” from [5] for
DSR 4.x or reference [9] for DSR 5.x/6.x. Instruct any other Application’s personnel to start
recovery procedures on the Guests hosted by the server (parallel recovery).
9. Install the application using procedure “Install the Application Software on the Blades” from [5]
for DSR 4.x or reference [9] for DSR 5.x/6.x.
10. If the recovered server is an Active/Standby or DR NOAMP:
a. If there were Hardware profiles that were manually created, then they need to be recreated and copied into the /var/TKLC/appworks/profiles/ directory of the active NOAMP server, the standby NOAMP server, and both DR NOAM servers (if applicable). Follow the appendix "SAMPLE NETWORK ELEMENT AND HARDWARE PROFILES" from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
Repeat this step for all remaining failed blades.
2

Recover Failed NO servers.
1. Verify the networking data for Network Elements. Use the backup copy of network configuration data and site surveys.
2. Execute configuration procedures for the first NO server. See reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, Procedure "Configure the First NOAMP Server", and "Configure the NOAMP Server Group". If Netbackup is used, execute Procedure 35 from [5] or Procedure 13 from [9].
3

Obtain latest NO database backup and network configuration data.
Obtain the most recent database backup file from external backup sources (ex. file servers, Netbackup) or tape backup sources. Determine network configuration data.
1. Using procedures within your organization's process (ex. IT department recovery procedures), obtain the most recent backup of the DSR 4.x/5.x NO database backup file. If you are using Netbackup, co-ordinate with the customer's Netbackup administrator to obtain the proper backup files.
2. From the required materials list in Section 3.1, use site survey documents and the Network Element report (if available) to determine network configuration data.
4

Login into the NO XMI Address
Log into the first NO GUI.
5

Upload the backed up database file from Remote location into File Management Area.
1. Log into the first NO GUI.
2. Browse to Main Menu->Status & Manage->Files.
3. Select the Active NO Server. The following screen will appear. Click on "Upload" as shown below and select the "Provisioning and Configuration" file backed up in step 2 above.
4. Click on "Browse", locate the backup file and click on "Open" as shown below.
5. Click on the "Upload" button. The file will take a few seconds to upload depending on the size of the backup data. The file will be visible on the list of entries after the upload is complete.
6

Disable Provisioning
1. Click on Main Menu->Status & Manage->Database.
2. Disable Provisioning by clicking on the "Disable Provisioning" button at the bottom of the screen as shown below.
3. A confirmation window will appear; press "OK" to disable Provisioning.
4. The message "Warning Code 002" will appear.
7

Verify the Archive Contents and Database Compatibility
1. Select the Active NO Server and click on "Compare".
2. The following screen is displayed; click the radio button for the restored database file that was uploaded as a part of Step 2 of this procedure.
3. Verify that the output window matches the screen below. Note that you will get a database mismatch regarding the NodeIDs of the blades. That is expected. If that is the only mismatch, then you can proceed; otherwise stop and contact customer support.
NOTE: Archive Contents and Database Compatibility must be the following:
Archive Contents: Provisioning and Configuration data
Database Compatibility: The databases are compatible.
NOTE: The following is the expected output for the Topology Compatibility Check, since we are restoring from an existing backed up database to a database with just one NOAMP:
Topology Compatibility
THE TOPOLOGY SHOULD BE COMPATIBLE MINUS THE NODEID.
NOTE: We are trying to restore a backed up database onto an empty NOAMP database. This is expected text in the Topology Compatibility check.
4. If the verification is successful, click the BACK button and continue to the next step in this procedure.
8

Restore the Database
1. Click on Main Menu->Status & Manage->Database.
2. Select the Active NO Server and click on "Restore" as shown below.
3. The following screen will be displayed. Select the proper backed up provisioning and configuration file.
4. Click the "OK" button. The following confirmation screen will be displayed.
5. If you get an error that the NodeIDs do not match, that is expected. If no other errors besides the NodeIDs are displayed, select the "Force" checkbox as shown above and click OK to proceed with the DB restore.
6. To check the status of the restore process, navigate to Main Menu->Status & Manage->Database; the status will be displayed as shown below.
NOTE: After the restore has started, the user will be logged out of the GUI since the restored Topology is old data.
7. Log back into the GUI VIP.
8. Log into the GUI using the guiadmin login and password.
9. Wait 5-10 minutes for the System to stabilize with the new topology.
10. The following alarms must be ignored for NO and MP Servers until all the servers are configured: alarms with Type Column as "REPL", "COLL", "HA" (with mate NOAMP), "DB" (about Provisioning Manually Disabled). Do not pay attention to alarms until all the servers in the system are completely restored.
NOTE: The Configuration and Maintenance information will be in the same state it was backed up during initial backup.
9
Re-enable Provisioning
1.
Click on Main Menu->Status & Manage->Database menu item.
2.
Click on the “Enable Provisioning” button. A pop-up window will appear to confirm as
shown below, press OK.

10
Restore /etc/hosts file of
active NO

Release 5.X/6.X:
From the recovered NO server command line, execute:
# AppWorks AppWorks_AppWorks updateServerAliases <NO Host Name>
Release 4.X:
Update the /etc/hosts file with the missing entries (or copy it from another server (e.g. SO) if it is
complete on that server)
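For the Release 4.X manual path, one possible way to recover the file is to copy it from a server whose /etc/hosts is known to be complete and review the differences first (fetch the complete copy, compare, back up the current file, then replace it). The commands below are an illustrative sketch only; "so-server" is a placeholder for the hostname or XMI IP of that server:
# scp root@so-server:/etc/hosts /tmp/hosts.so
# diff /etc/hosts /tmp/hosts.so
# cp /etc/hosts /etc/hosts.bak
# cp /tmp/hosts.so /etc/hosts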
11

Recover standby NO server.
Recover the standby NO server:
1. Install the second NO server by executing Procedure "Configure the Second NOAMP Server" from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, steps 1, 4, 5, 6 and 7 (step 7 only if Netbackup is used). Also execute Procedure 35 from [5] or Procedure 13 from [9] if Netbackup is used.
12

Recover SO servers.
Recover the remaining SO servers (standby, spare) by repeating the following steps for each SO Server:
1. Install the remaining SO server by executing reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, Procedure "Configure the SOAM Servers", steps 1, 4, 5 and 6. If you are using Netbackup, execute Procedure 35 from [5] or Procedure 13 from [9], "Install Netbackup Client".
2. Re-enable Replication (if inhibited) to the restored SO by navigating to Main Menu->Status & Manage->Database, then selecting the SO in question and clicking on "Allow Replication".
3. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered server and clicking on "Restart".
13

Recover the C Level Server (this includes DA-MP, SBRs, IPFE, SS7-MPs)
Execute the following procedures from [5] for DSR 4.x or reference [9] for DSR 5.x/6.x FOR EACH C Level Server that has been recovered:
1. "Configure MP Blade Servers", Steps 1, 5, 6, 7, 8, 9. Also execute steps 10 and 11 if you plan to configure a default route on your MP that uses a signaling (XSI) network instead of the XMI network.
2. FOR DSR 4.X *ONLY*: Reapply the signaling Networking Configuration by running the following command from the active NO command line for each MP Server:
/usr/TKLC/appworks/bin/syncApplConfig <MP_Hostname>
3. If IPFE servers are being recovered, execute Procedure 6 of [22] for any applicable IPFE servers.
Note: If this server is an IPFE server, then ensure ipfeNetUpdate.sh from [22] has been executed before proceeding with this step.
14

DSR 5.X Recovery Only: Re-Sync NTP if Necessary (Optional)
Navigate to Status & Manage -> Server, then select each server that has been recovered and click NTP Sync.
15

Restart Application Processes
Restart the Application by navigating to Status & Manage -> Server, then selecting each server that has been recovered and clicking on Restart at the bottom of the screen.
16

Allow Replication to all Servers
1. Navigate to Status & Manage -> Database.
2. If the "Repl Status" is set to "Inhibited", click on the "Allow Replication" button as shown below using the following order; otherwise, if none of the servers are inhibited, skip this step and continue with the next step:
a. Active NOAMP Server
b. Standby NOAMP Server
c. Active SOAMP Server
d. Standby SOAMP Server
e. Spare SOAMP Server (if applicable)
f. Active MP Servers
g. Standby MP Servers
Verify that replication on all servers is allowed. This can be done by clicking on each server and checking that the button below shows "Inhibit Replication" instead of "Allow Replication".
17

Remove Forced Standby
1. Navigate to Status & Manage -> HA.
2. Click on Edit at the bottom of the screen.
3. For each server whose Max Allowed HA Role is set to Standby, set it to Active.
4. Press OK.
18

Fetch and Store the
database Report for the
newly restored data and
save it
1.
Navigate to Configuration-> Server, select the active NO server and click on the “Report”
button at the bottom of the page . The following screen is displayed:
2. Click on “Save” and save the report to your local machine.
19

Optimize Comcol memory usage on recovered NO and SO (DSR 4.x only)
If recovering a DSR 4.x system, execute this step, otherwise skip to step 21.
For each recovered NO or SO, obtain a terminal window connection to the (NO/SO) server console via SSH or iLO. If using SSH, use the actual IP of the server, not the VIP address.
Execute the following on the command line. Wait until the script completes and you are returned to the command line:
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage
# sleep 20
# prod.start
# pm.sanity
Sanity check OK: 01/23/13 11:42:20 within 15 secs
Verify that the script finished successfully by checking the exit status:
# echo $?
If anything other than "0" is printed out, halt this procedure and contact Oracle Support.
Repeat this step for all recovered NO and SO servers at every site.
20

Optimize Comcol memory usage on DA-MP (DSR 4.x only)
SSH to each recovered DA-MP and execute the following command. Note that this command
SHOULD NOT be executed on SBR blades.
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage --force
21

Verify Replication between servers.
For DSR 4.x
1. Click on Main Menu->Status & Manage->Replication.
2. Verify that replication is occurring between servers.
For DSR 5.x and above
3. Execute irepstat:
$ irepstat -m
Output like the below shall be generated:
-- Policy 0 ActStb [DbReplication] --------------------------------------------
RDU06-MP1 -- Stby
  BC From RDU06-SO1 Active  0 0.50 ^0.17%cpu 42B/s A=none
  CC From RDU06-MP2 Active  0 0.10 ^0.17 0.88%cpu 32B/s A=none
RDU06-MP2 -- Active
  BC From RDU06-SO1 Active  0 0.50 ^0.10%cpu 33B/s A=none
  CC To   RDU06-MP1 Active  0 0.10 0.08%cpu 20B/s A=none
RDU06-NO1 -- Active
  AB To   RDU06-SO1 Active  0 0.50 1%R 0.03%cpu 21B/s
RDU06-SO1 -- Active
  AB From RDU06-NO1 Active  0 0.50 ^0.04%cpu 24B/s
  BC To   RDU06-MP1 Active  0 0.50 1%R 0.04%cpu 21B/s
  BC To   RDU06-MP2 Active  0 0.50 1%R 0.07%cpu 21B/s
22

Verify the Database states
1. Click on Main Menu->Status & Manage->Database.
2. Verify that the HA Role is either "Active" or "Standby", and that the status is "Normal".
23

Verify the HA Status
1. Click on Main Menu->Status & Manage->HA.
2. Check the row for all the MP Servers.
3. Verify that the HA Role is either Active or Standby.
24

Verify the local node info
1. Click on Main Menu->Diameter->Configuration->Local Node.
2. Verify that all the local nodes are listed.
25

Verify the peer node info
1. Click on Main Menu->Diameter->Configuration->Peer Node.
2. Verify that all the peer nodes are listed.
26

Verify the Connections info
1. Click on Main Menu->Diameter->Configuration->Connections.
2. Verify that all the connections are listed.
27

Re-enable connections if needed
1. Click on Main Menu->Diameter->Maintenance->Connections.
2. Select each connection and click on the "Enable" button.
3. Verify that the Operational State is Available.
28

Examine All Alarms
1. Click on Main Menu->Alarms & Events->View Active.
2. Examine all active alarms and refer to the on-line help on how to address them. If needed, contact the Oracle Customer Support hotline.
29

Restore GUI Usernames and passwords
If applicable, execute the steps in Section 6 to recover the user and group information restored.
30

Sync Split Scope Data (If PDRA is activated)
If recovering a DSR 5.1 or 6.0 system, and the PDRA application is activated, then execute this step, otherwise skip to step 31.
Obtain a terminal window connection to the Active NOAMP console via SSH using the VIP address.
Follow the steps below:
1. Go to the Appworks bin directory:
# cd /usr/TKLC/appworks/bin/
2. Execute the PCRF sync script in "reportonly" mode to check whether PCRF data syncing is required or not. This is a read-only mode that does not modify the database:
# ./syncPcrfReferencesAfterRestore.sh --reportonly
3. If the Report Summary shows one or more PCRFs "need to be synced", then repeat the script execution but using the "sync" option instead of "reportonly" in order to sync the database. The "sync" option will modify the database:
# ./syncPcrfReferencesAfterRestore.sh --sync
4. This step is only required if step 3 was executed, otherwise skip this step.
Re-execute the PCRF sync script in "reportonly" mode to verify all PCRF data is in sync. Examine the Report Summary output of the script. Verify that the number of "PCRF record(s) processed in total" is equal to the number of "PCRF record(s) already in sync":
# ./syncPcrfReferencesAfterRestore.sh --reportonly
31

Re-activate Optional Features
If optional features (RBAR, FABR, IPFE, CPA, PDRA, SBR, MIWF) were activated, they will need to be de-activated and then re-activated. Refer to [15], [16], [17], [18], [19], [20], [22] or [23] for the appropriate documentation.
32

Re-enable transports if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->Transport Manager->Maintenance->Transport.
2. Select each transport and click on the "Enable" button.
3. Verify that the Operational Status for each transport is Up.
33

Re-enable MAPIWF application if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->'Local SCCP Users'.
2. Click on the "Enable" button corresponding to the MAPIWF Application Name.
3. Verify that the SSN Status is Enabled.
34

Re-enable links if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->Links.
2. Click on the "Enable" button for each link.
3. Verify that the Operational Status for each link is Up.
35

Backup and archive all the databases from the recovered system
Execute Appendix A to back up the Configuration databases.
Disaster Recovery Procedure is Complete
End of Procedure
5.1.4 Recovery Scenario 4 (Partial Server Outage with one NO Server and one SO
Server Intact)
For a partial outage with an NO server and an SO server intact and available, only base recovery of hardware and software
is needed. The intact NO and SO servers are capable of restoring the database via replication to all servers. The major
activities are summarized in the list below. Use this list to understand the recovery procedure summary. Do not use this
list to execute the procedure. The actual procedures’ detailed steps are in Procedure 4. The major activities are
summarized as follows:
 Recover Standby NO server (if necessary) by recovering base hardware and software.
o Recover the base hardware.
o Recover the software.
o The database is intact at the active NO server and does not require restoration at the standby NO server.
 Recover any failed SO and MP servers by recovering base hardware and software.
o Recover the base hardware.
o Recover the software.
o The database is intact at the active NO server and does not require restoration at the SO and MP servers.
 Re-apply signaling networks configuration if the failed blade is an MP.
Follow procedure below for detailed steps.
Procedure 4. Recovery Scenario 4
This procedure performs recovery if at least 1 NO server is intact and available and 1 SO server is intact and available.
Check off () each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix E before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.
1

Recover the Failed Hardware and software
Recover the Failed Hardware and Software on ALL failed blades:
1. Refer to Appendix E to understand any workarounds required during this procedure.
2. If necessary, refer to [10], PM&C Disaster Recovery, for instructions on how to recover a PM&C Server.
3. Gather the documents and required materials listed in Section 3.1.
4. From the NO VIP GUI, set the failed server HA state to "Forced Standby" by navigating to Main Menu->HA, then clicking on Edit and setting the "Max Allowed HA Role" to Standby for the NO in question and pressing OK.
5. From the NO VIP GUI, Inhibit replication to the failed servers by navigating to Main Menu->Status & Manage->Database, then selecting the server in question and clicking on "Inhibit Replication".
6. If the failed server is an HP c-Class Blade, follow these steps:
a. Remove the failed HP c-Class Servers and Blades and install replacements.
b. Configure and verify the BIOS on the Blade. Execute procedure “Confirm/Update
Blade Server BIOS Settings” from reference [5] for DSR 4.x or reference [6]for
DSR 5.x or reference [7] for DSR 6.x.
c. Execute Procedure “Configure Blade Server iLO Password for Administrator
Account” from [5] for DSR 4.x or reference [6] for DSR 5.x or reference [7] for
DSR 6.x to setup the Administrator account for blade servers.
d. Load any firmware upgrades using [5] for DSR 4.x or reference [6] for DSR 5.x
or reference [7] for DSR 6.x
e. For blade based NOAMPs/SOAMPs execute procedure "Install TVOE on VM
Host Server Blades” from reference [5] for DSR 4.x or reference [6] for DSR 5.x
or reference [7] for DSR 6.x.
f.
For blade based NOAMPs/SOAMPs execute procedure “Configure TVOE on
Server Blades” from reference [5]for DSR 4.x or reference [9] for DSR 5.x/6.x
7. If the failed server is an RMS, follow these steps:
a. For RMS based servers, execute Appendix I from [3] to configure all iLO
settings, including the iLO password.
b. If the failed NOAMP is co-located with the PMAC on the first RMS then execute
procedure “Continue TVOE Configuration on First RMS Server” from reference
[5] for DSR 4.x or reference [9] for DSR 5.x/6.x. (RMS based NOAMPs only)
c. Else execute procedure “Configure TVOE on Additional RMS Server(s)” from
reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x. (RMS based NOAMPs
only)
8. For NOAMPs execute procedure “Create NOAMP Guest VMs” from reference [5] for DSR
4.x or reference [9] for DSR 5.x/6.x.
9. For SOAMPs execute procedure “Create SOAM Guest VMs” from reference [5]for DSR 4.x
or reference [9] for DSR 5.x/6.x.
10. IPM all the guests using procedure “IPM Blades and VMs” from [5] for DSR 4.x or
reference [9] for DSR 5.x/6.x. Instruct any other Application’s personnel to start recovery
procedures on the Guests hosted by the server (parallel recovery).
11. Install the application using procedure “Install the Application Software on the Blades” from
[5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
12. If the recovered server is an Active/Standby or DR NOAMP:
a. If there were Hardware profiles that were manually created, then they need to be recreated and copied into the /var/TKLC/appworks/profiles/ directory of the active NOAMP server, the standby NOAMP server, and both DR NOAM servers (if applicable). Follow the appendix "SAMPLE NETWORK ELEMENT AND HARDWARE PROFILES" from reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x.
Repeat this step for all remaining failed blades.
2

Recover failed NOAM servers
1. Configure the newly installed server by executing procedure "Configure the Second NOAMP Server" from [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, steps 1, 2, 4, 5 and 6.
2. If you are using Netbackup, execute Procedure 35 from [5] or Procedure 13 from [9], "Install Netbackup Client".
3. Re-enable Replication to the restored NOAM by navigating to Main Menu->Status & Manage->Database, then selecting the NO in question and clicking on "Allow Replication".
4. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered server and clicking on "Restart".
3

Recover SOAM servers.
Recover the remaining SOAM servers (standby, spare) by repeating the following steps for each SO Server:
1. Install the remaining SO server by executing reference [5] for DSR 4.x or reference [9] for DSR 5.x/6.x, Procedure "Configure the SOAM Servers", steps 1-3 and 5-8.
2. If you are using Netbackup, execute Procedure 35 from [5] or Procedure 13 from [9], "Install Netbackup Client".
3. Re-enable Replication to the restored SO by navigating to Main Menu->Status & Manage->Database, then selecting the SO in question and clicking on "Allow Replication".
4. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered server and clicking on "Restart".
4

Recover the C Level Server (this includes DA-MP, SBRs, IPFE, SS7-MPs)
Execute the following procedures from [5] for DSR 4.x or reference [9] for DSR 5.x/6.x FOR EACH C Level Server that has been recovered:
1. "Configure MP Blade Servers", Steps 1, 5, 6, 7, 8, 9. Also execute steps 10 and 11 if you plan to configure a default route on your MP that uses a signaling (XSI) network instead of the XMI network.
2. Re-enable Replication to the restored MP(s) by navigating to Main Menu->Status & Manage->Database, then selecting the MP in question and clicking on "Allow Replication".
3. (DSR 4.X only) Reapply the signaling Networking Configuration by running the following command from the active NO command line:
/usr/TKLC/appworks/bin/syncApplConfig <Recovered_MP_Hostname>
4. Restart the application by navigating to Main Menu->Status & Manage->Server, then selecting the recovered servers and clicking on "Restart".
5. If IPFE servers are being recovered, execute Procedure 6 of [22] for any applicable IPFE servers.
Note: If this server is an IPFE server, then ensure ipfeNetUpdate.sh from [22] has been executed before proceeding with this step.
5

Remove Forced Standby
1. Navigate to Status & Manage -> HA.
2. Click on Edit at the bottom of the screen.
3. For each server whose Max Allowed HA Role is set to Standby, set it to Active.
4. Press OK.
6

Optimize Comcol memory usage on recovered NO and SO (Execute for DSR 4.x only)
If recovering a DSR 4.x system, execute this step, otherwise skip to step 8.
For each recovered NO or SO, obtain a terminal window connection to the (NO/SO) server console via SSH or iLO. If using SSH, use the actual IP of the server, not the VIP address.
Execute the following on the command line. Wait until the script completes and you are returned to the command line:
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage
# sleep 20
# prod.start
# pm.sanity
Sanity check OK: 01/23/13 11:42:20 within 15 secs
Verify that the script finished successfully by checking the exit status:
# echo $?
If anything other than "0" is printed out, halt this procedure and contact Oracle Support.
Repeat this step for all recovered NO and SO servers at every site.
7

Optimize Comcol memory usage on DA-MP (Execute for DSR 4.X ONLY)
SSH to each recovered DA-MP and execute the following command. Note that this command
SHOULD NOT be executed on SBR blades.
# /usr/TKLC/dsr/bin/optimizeComcolIdbRamUsage --force
8

Verify Replication between servers.
For DSR 4.x
1. Click on Main Menu->Status & Manage->Replication.
2. Verify that replication is occurring between servers.
For DSR 5.x and above
3. Execute irepstat:
$ irepstat -m
Output like the below shall be generated:
-- Policy 0 ActStb [DbReplication] --------------------------------------------
RDU06-MP1 -- Stby
  BC From RDU06-SO1 Active  0 0.50 ^0.17%cpu 42B/s A=none
  CC From RDU06-MP2 Active  0 0.10 ^0.17 0.88%cpu 32B/s A=none
RDU06-MP2 -- Active
  BC From RDU06-SO1 Active  0 0.50 ^0.10%cpu 33B/s A=none
  CC To   RDU06-MP1 Active  0 0.10 0.08%cpu 20B/s A=none
RDU06-NO1 -- Active
  AB To   RDU06-SO1 Active  0 0.50 1%R 0.03%cpu 21B/s
RDU06-SO1 -- Active
  AB From RDU06-NO1 Active  0 0.50 ^0.04%cpu 24B/s
  BC To   RDU06-MP1 Active  0 0.50 1%R 0.04%cpu 21B/s
  BC To   RDU06-MP2 Active  0 0.50 1%R 0.07%cpu 21B/s
9

Verify the Database state of the newly restored blade
1. Click on Main Menu->Status & Manage->Database.
2. For DSR 4.x, verify that the HA Role is either "Active" or "Standby", and that the status is "Normal".
3. For DSR 5.x/6.x, verify that the OAM Max HA Role is either "Active" or "Standby" for NO and SO and the Application Max HA Role for MPs is "Active", and that the status is "Normal" as shown below.
10

Verify the HA Status
1. Click on Main Menu->Status & Manage->HA.
2. Check the row for all the MP Servers.
3. For DSR 4.x, verify that the HA status is either Active or Standby as shown below.
4. For DSR 5.x/6.x, verify that the OAM Max HA Role is either "Active" or "Standby" for NO and SO and the Application Max HA Role for MPs is "Active", and that the status is "Normal" as shown below.
11

Verify the local node info
1. Click on Main Menu->Diameter->Configuration->Local Node.
2. Verify that all the local nodes are listed.
12

Re-install NetBackup (Optional)
1. If NetBackup was previously installed on the system, follow the procedure in [5], Appendix K to reinstall it.
13

Verify the peer node info
1. Click on Main Menu->Diameter->Configuration->Peer Node.
2. Verify that all the peer nodes are listed.
14

Verify the Connections info
1. Click on Main Menu->Diameter->Configuration->Connections.
2. Verify that all the connections are listed.
15

Re-enable connections if needed
1. Click on Main Menu->Diameter->Maintenance->Connections.
2. Select each connection and click on the "Enable" button.
3. Verify that the Operational State is Available.
16

Examine All Alarms
1. Click on Main Menu->Alarms & Events->View Active.
2. Examine all active alarms and refer to the on-line help on how to address them. If needed, contact the Oracle Customer Support hotline.
Note: If alarm "10012: The responder for a monitored table failed to respond to a table change" is raised, the oampAgent needs to be restarted. ssh as root to each server that has that alarm and execute the following:
# pm.set off oampAgent
# pm.set on oampAgent
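If the 10012 alarm is raised on several servers, the two pm.set commands can be issued from one session over SSH. The loop below is an illustrative sketch only; server1 and server2 are placeholders for the hostnames (or XMI IP addresses) of the affected servers:
# for SRV in server1 server2; do ssh root@$SRV "pm.set off oampAgent; pm.set on oampAgent"; done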
17

Sync Split Scope Data (If PDRA is activated)
If recovering a DSR 5.1 or 6.0 system, and the PDRA application is activated, then execute this step, otherwise skip to step 18.
Obtain a terminal window connection to the Active NOAMP console via SSH using the VIP address.
Follow the steps below:
1. Go to the Appworks bin directory:
# cd /usr/TKLC/appworks/bin/
2. Execute the PCRF sync script in "reportonly" mode to check whether PCRF data syncing is required or not. This is a read-only mode that does not modify the database:
# ./syncPcrfReferencesAfterRestore.sh --reportonly
3. If the Report Summary shows one or more PCRFs "need to be synced", then repeat the script execution but using the "sync" option instead of "reportonly" in order to sync the database. The "sync" option will modify the database:
# ./syncPcrfReferencesAfterRestore.sh --sync
4. This step is only required if step 3 was executed, otherwise skip this step.
Re-execute the PCRF sync script in "reportonly" mode to verify all PCRF data is in sync. Examine the Report Summary output of the script. Verify that the number of "PCRF record(s) processed in total" is equal to the number of "PCRF record(s) already in sync":
# ./syncPcrfReferencesAfterRestore.sh --reportonly
18

Re-activate Optional Features
If optional features (RBAR, FABR, IPFE, CPA, PDRA, SBR, MIWF) were activated, they will need to be de-activated and then re-activated. Refer to [15], [16], [17], [18], [19], [20], [22] or [23] for the appropriate documentation.
19

Re-enable transports if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->Transport Manager->Maintenance->Transport.
2. Select each transport and click on the "Enable" button.
3. Verify that the Operational Status for each transport is Up.
20

Re-enable MAPIWF application if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->'Local SCCP Users'.
2. Click on the "Enable" button corresponding to the MAPIWF Application Name.
3. Verify that the SSN Status is Enabled.
21

Re-enable links if needed (Applicable ONLY for DSR 6.x and up)
1. Click on Main Menu->SS7/Sigtran->Maintenance->Links.
2. Click on the "Enable" button for each link.
3. Verify that the Operational Status for each link is Up.
22

Backup and archive all the databases from the recovered system
Execute Appendix A to back up the Configuration databases.
Disaster Recovery Procedure is Complete
End of Procedure
5.1.5 Recovery Scenario 5 (Both NO Servers failed with DR NO available)
For a partial outage with both NO servers failed but a DR NO available, the DR NO is switched from secondary to primary and then the failed NO servers are recovered. The major activities are summarized in the list below. Use this list to understand the recovery procedure summary. Do not use this list to execute the procedure. The actual procedures' detailed steps are in Procedure 5. The major activities are summarized as follows:
 Switch DR NO from secondary to primary.
 Recover the failed NO servers by recovering base hardware and software.
o Recover the base hardware.
o Recover the software.
o The database is intact at the newly active NO server and does not require restoration.
 If applicable, recover any failed SO and MP servers by recovering base hardware and software.
o Recover the base hardware.
o Recover the software.
o The database is intact at the active NO server and does not require restoration at the SO and MP servers.
 Re-apply signaling networks configuration if the failed blade is an MP.
Follow procedure below for detailed steps.
Procedure 5. Recovery Scenario 5
This procedure performs recovery if both NO servers have failed but a DR NO is available.
Check off () each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix E before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.
1

Switch DR NO to Primary
1. Refer to Appendix E to understand any workarounds required during this procedure.
2. Execute Appendix C: Switching a DR Site to Primary to have the DR NO become active.
2

Recover System
If both SO servers have failed, execute Recovery Scenario 2 (Procedure 2), otherwise execute Procedure 4 to recover the system.
3

Switch DR NO back to Secondary
Once the system has been recovered:
Execute Appendix D: Returning a Recovered Site to Primary to have the recovered NO become primary again.
End of Procedure
5.1.6 Recovery Scenario 6 (Database recovery)
Case 1
For a partial outage with:
1. Server having a corrupted database
2. Replication channel from parent is inhibited because of upgrade activity, or
3. Server is in a different release than that of its Active parent because of upgrade activity.
4. Verify that the Server Runtime backup files, performed at the start of the upgrade, are present in the /var/TKLC/db/filemgmt area in the following format:
Backup.DSR.HPC02-NO2.FullDBParts.NETWORK_OAMP.20140524_223507.UPG.tar.bz2
Backup.DSR.HPC02-NO2.FullRunEnv.NETWORK_OAMP.20140524_223507.UPG.tar.bz2
Note: During recovery, the corrupted DB will get replaced by the server Runtime backup. Any configuration done after taking the backup will not be visible post recovery. In case the latest data backup is not available, execute Case 3 for recovery.
Follow procedure 6 for recovery.
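The presence of the Server Runtime backup files described in item 4 can be checked quickly from the command line of the server being recovered. The listing below is an illustrative check only; the actual file names will differ by hostname and timestamp:
# ls -l /var/TKLC/db/filemgmt/Backup.DSR.*.FullDBParts.*.UPG.tar.bz2
# ls -l /var/TKLC/db/filemgmt/Backup.DSR.*.FullRunEnv.*.UPG.tar.bz2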
Procedure 6. Recovery Scenario 6
This procedure performs recovery if the database got corrupted in the system.
Check off () each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix E before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.
1

Change runlevel to 3
1. Refer to Appendix E to understand any workarounds required during this procedure.
2. Execute the init 3 command to bring the system to runlevel 3.
2

Recover System
Execute /usr/TKLC/appworks/sbin/backout_restore and follow the instructions appearing at the console prompt.
3

Change runlevel to 4
Execute the init 6 command to bring the system back to runlevel 4.
4

Verify the server
Execute the pm.getprocs command to verify that the processes are up and running.
5

Sync Split Scope Data (If PDRA is activated)
If recovering a DSR 5.1 or 6.0 system, and the PDRA application is activated, then execute this step, otherwise skip to step 6.
Obtain a terminal window connection to the Active NOAMP console via SSH using the VIP address.
Follow the steps below:
1. Go to the Appworks bin directory:
# cd /usr/TKLC/appworks/bin/
2. Execute the PCRF sync script in "reportonly" mode to check whether PCRF data syncing is required or not. This is a read-only mode that does not modify the database:
# ./syncPcrfReferencesAfterRestore.sh --reportonly
3. If the Report Summary shows one or more PCRFs "need to be synced", then repeat the script execution but using the "sync" option instead of "reportonly" in order to sync the database. The "sync" option will modify the database:
# ./syncPcrfReferencesAfterRestore.sh --sync
4. This step is only required if step 3 was executed, otherwise skip this step.
Re-execute the PCRF sync script in "reportonly" mode to verify all PCRF data is in sync. Examine the Report Summary output of the script. Verify that the number of "PCRF record(s) processed in total" is equal to the number of "PCRF record(s) already in sync":
# ./syncPcrfReferencesAfterRestore.sh --reportonly
6

Backup and archive all
the databases from the
recovered system
Execute Appendix A back up the Configuration databases:
Disaster Recovery Procedure is Complete
End of Procedure
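For reference only (use the checklist above when actually executing), the command-line portion of Procedure 6 on the affected server is, in order:
# init 3                                      (bring the system to runlevel 3)
# /usr/TKLC/appworks/sbin/backout_restore     (follow the console prompts)
# init 6                                      (return the system to runlevel 4)
# pm.getprocs                                 (verify the processes are up and running)
If PDRA is activated, the syncPcrfReferencesAfterRestore.sh script is then run from /usr/TKLC/appworks/bin/ as described in Step 5.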
Case 2
For a partial outage with:
1. Server having a corrupted database,
2. Replication channel not inhibited, or
3. Server on the same release as that of its Active parent.
The recovery steps are summarized as follows:
Procedure 7. Recovery Scenario 7

This procedure performs recovery if the database has become corrupted and the system is in a state in which it can be replicated.
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix E before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1: Login to the console prompt of the server
1. Refer to Appendix E to understand any workarounds required during this procedure.
2. Login to the console prompt of the server.

Step 2: Take the server to the Out of Service state
Execute prod.stop

Step 3: Take the server to the NoDb state
Execute prod.clobber

Step 4: Take the server to the DbUp state and start the application
Execute prod.start

Step 5: Verify the server state
Execute the pm.getprocs command to verify that the processes are up and running.
Execute the irepstat command to verify that the replication channels are up and running.
Execute the inetmstat command to verify that the merging channels are up and running.

Step 6: Backup and archive all the databases from the recovered system
Execute Appendix A to back up the Configuration databases.

Disaster Recovery Procedure is Complete
End of Procedure
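For reference only, the command sequence executed on the affected server in Procedure 7 is:
# prod.stop      (take the server to the Out of Service state)
# prod.clobber   (take the server to the NoDb state)
# prod.start     (take the server to the DbUp state and start the application)
# pm.getprocs    (verify the processes are up and running)
# irepstat       (verify the replication channels are up and running)
# inetmstat      (verify the merging channels are up and running)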
Case 3
For a server outage with:
1. Server having a corrupted database,
2. Replication channel inhibited because of upgrade activity, or
3. Server on a different release than that of its Active parent because of upgrade activity, and
4. Database backup files NOT present on the system.
The recovery steps are summarized as follows:
Procedure 8. Recovery Scenario 8

This procedure performs recovery if the database has become corrupted and there is no database backup available to recover the system.
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Note: If any errors are encountered during the execution of this procedure, refer to the list of known issues in Appendix E before contacting Oracle Customer Support.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1: Identify the Active parent of the server
1. Refer to Appendix E to understand any workarounds required during this procedure.
2. Login to the NO VIP. Go to Configuration->Server Groups and identify the Active replicating parent of this node.
Refer to the replication topology figure below to identify the Active replicating parent.
[Figure: replication hierarchy from the Active/Standby NO (A level) through the Active/Standby/Spare SO (B level) down to the Active MPs (C level)]

Step 2: Identify the Application Version of the Active parent
Login to the NO VIP. Go to the Upgrade screen and look for the “Application Version” of the Active parent identified in Step 1.

Step 3: Install the server again with the DSR ISO for the Application Version identified in Step 2 above
Follow Procedure 4 for recovery of the server.

Step 4: Backup and archive all the databases from the recovered system
Execute Appendix A to back up the Configuration databases.

Disaster Recovery Procedure is Complete
End of Procedure
6 RESOLVING USER CREDENTIAL ISSUES AFTER DATABASE RESTORE
User incompatibilities may introduce security holes or prevent access to the network by administrators. User
incompatibilities are not dangerous to the database, however. Review each user difference carefully to ensure that the
restoration will not impact security or accessibility.
6.1 Restoring a Deleted User
- User 'testuser' exists in the selected backup file but not in the current database.
These users were removed after the backup and archive file was created. They will be reintroduced by system restoration of that file.
6.1.1 To Keep the Restored User
Perform these steps to keep users that will be restored by system restoration.
Before restoration:
 Contact each affected user and notify them that you will reset their password during this maintenance operation.
After restoration:
 Log in and reset the passwords for all users in this category:
1. Navigate to the user administration screen.
(Note: for DSR 5.X, this path is Main Menu: Administration->Access Control->Users)
2. Select the user.
3. Click the Change Password button.
4. Enter a new password.
5. Click the Continue button.
6.1.2 To Remove the Restored User
Perform these steps to remove users that will be restored by system restoration.
After restoration, delete all users in this category:
1. Navigate to the user administration screen.
(Note: for DSR 5.X, this path is Main Menu: Administration->Access Control->Users)
2. Select the user.
3. Click the Delete button.
4. Confirm.
6.2 Restoring a Modified User
These users had their password changed after the backup and archive file was created. The change will be reverted by system restoration of that file.
- The password for user 'testuser' differs between the selected backup file and the current database.
Before restoration:
 Verify that you have access to a user with administrator permissions that is not affected.
 Contact each affected user and notify them that you will reset their password during this maintenance operation.
After restoration:
 Log in and reset the passwords for all users in this category. See the steps in Section 6.1.1 for resetting passwords for a user.
6.3 Restoring an Archive that Does not Contain a Current User
These users were created after the backup and archive file was created. They will be deleted by system restoration of that file.
- User 'testuser' exists in current database but not in the selected backup file.
If the user is no longer desired, do not perform any additional steps. The user is permanently removed.
To re-create the user, do the following:
Before restoration:
 Verify that you have access to a user with administrator permissions that is not affected.
 Contact each affected user and notify them that you will reset their password during this maintenance operation.
 Log in and record the username, group, timezone, comment, and enabled values for each affected user.
After restoration:
 Log in and re-create each of the affected users using the information recorded above:
1. Navigate to the user administration screen.
2. Click the Add New User button.
3. Re-populate all the data for this user.
4. Click the OK button.
 Reset the passwords for all users in this category. See the steps in Section 6.1.1 for resetting passwords for a user.
Appendix A. EAGLEXG DSR 4.x/5.x/6.x Database Backup
Procedure 9: DSR 4.x/5.x/6.x Database Backup

The intent of this procedure is to back up the provisioning and configuration information from an NO or SO server after the disaster recovery is complete and to transfer it to a secure location accessible to TAC.
Prerequisites for this procedure are:
 Network connectivity to the NO XMI address via VPN access to the Customer’s network.
 DSR 4.x “guiadmin” user password.
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1: Login to the NO (or SO) XMI VIP IP Address
Login using the “guiadmin” credentials.
Step 2: Backup Configuration data for the system
1. Browse to the Main Menu->Status & Manage->Database screen.
2. Select the Active NOAMP server and click the “Backup” button.
3. Make sure that the checkbox next to Configuration is checked. Then enter a filename for the backup and press “OK”.
Step 3: Verify the backup file availability
1. Browse to Main Menu->Status & Manage->Files.
2. Select the Active NO (or SO) and click “List Files”.
3. The files in this server’s file management area will be displayed in the work area.
4. Verify the existence of the backed-up configuration file.
Step 4: Download the file to a local machine
1. Click the file link and then click the download button.
2. A file download dialog box will be displayed; click the Save button and save the file to the local machine.

Step 5: Upload the image to a secure location for future disaster recovery of the entire system
Transfer the backed-up image saved in the previous step to a secure location from which the server backup files can be fetched in case of a system disaster recovery.

Step 6: Backup Active SO
For a 3-tier system, repeat Steps 2 through 5 to back up the Active SO; otherwise the database backup of the DSR 4.x/5.x is complete.
Appendix B. Recovering/Replacing Failed 3rd Party Components (Switches, OAs)

Procedure 10: Recovering a failed PM&C Server
The intent of this procedure is to recover a failed PM&C server.
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1
Refer to [10], PM&C Disaster Recovery, for instructions on how to recover a PM&C server.
Procedure 11: Recovering a failed Aggregation Switch (Cisco 4948E / 4948E-F)

The intent of this procedure is to recover a failed Aggregation (4948E / 4948E-F) switch.
Prerequisites for this procedure are:
 A copy of the networking xml configuration files
 A copy of the HP Misc Firmware DVD or ISO
 IP address and hostname of the failed switch
 Rack mount position of the failed switch
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1
1. Remove the old SSH key of the switch from the PMAC by executing the following command from a PMAC command shell:
sudo ssh-keygen -R <4948_switch_ip>
2. Refer to [4], procedure “Replace a failed 4948/4948E/4948E-F switch (c-Class system) (netConfig)” for DSR 4.x/5.x, or [8], procedure “Replace a failed 4948/4948E/4948E-F switch (PM&C Installed) (netConfig)”, for DSR 6.x, to replace a failed Aggregation switch. You will need a copy of the HP Misc Firmware DVD or ISO and of the original networking xml files customized for this installation. These will either be stored on the PM&C in a designated location, or can be obtained from the NAPD.
Procedure 12: Recovering a failed Enclosure Switch (Cisco 3020)

The intent of this procedure is to recover a failed Enclosure (3020) switch.
Prerequisites for this procedure are:
 A copy of the networking xml configuration files
 A copy of the HP Misc Firmware DVD or ISO
 IP address and hostname of the failed switch
 Interconnect Bay position of the enclosure switch
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1
1. Remove the old SSH key of the switch from the PMAC by executing the following command from a PMAC command shell:
sudo ssh-keygen -R <enclosure_switch_ip>
2. Refer to [4], procedure “Reconfigure a failed 3020 switch (netConfig)” for DSR 4.x/5.x, or [8], procedure “Replace a failed 3020 switch (netConfig)”, for DSR 6.x, to replace a failed Enclosure switch. You will need a copy of the HP Misc Firmware DVD or ISO and of the original networking xml files customized for this installation. These will either be stored on the PM&C in a designated location, or can be obtained from the NAPD.
Procedure 13: Recovering a failed Enclosure Switch (HP 6120XG)

The intent of this procedure is to recover a failed Enclosure (6120XG) switch.
Prerequisites for this procedure are:
 A copy of the networking xml configuration files
 A copy of the HP Misc Firmware DVD or ISO
 IP address and hostname of the failed switch
 Interconnect Bay position of the enclosure switch
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1
1. Remove the old SSH key of the switch from the PMAC by executing the following command from a PMAC command shell:
sudo ssh-keygen -R <enclosure_switch_ip>
2. Refer to [4], procedure “Reconfigure a failed HP 6120XG switch (netConfig)” for DSR 4.x/5.x, or [8], procedure “Replace a failed HP (6120XG, 6125G, 6125XLG) switch (netConfig)”, for DSR 6.x, to replace a failed Enclosure switch. You will need a copy of the HP Misc Firmware DVD or ISO and of the original networking xml files customized for this installation. These will either be stored on the PM&C in a designated location, or can be obtained from the NAPD.
Procedure 14: Recovering a failed Enclosure Switch (HP 6125XLG, HP 6125G)

The intent of this procedure is to recover a failed Enclosure (6125XLG / 6125G) switch.
Prerequisites for this procedure are:
 A copy of the networking xml configuration files
 A copy of the HP Misc Firmware DVD or ISO
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1
1. Remove the old SSH key of the switch from the PMAC by executing the following command from a PMAC command shell:
sudo ssh-keygen -R <enclosure_switch_ip>
2. Refer to [4], procedure “Reconfigure a failed HP 6125XG switch (netConfig)” for DSR 4.x/5.x, or [8], procedure “Replace a failed HP (6120XG, 6125G, 6125XLG) switch (netConfig)”, for DSR 6.x, to replace a failed Enclosure switch. You will need a copy of the HP Misc Firmware DVD or ISO and of the original networking xml files customized for this installation. These will either be stored on the PM&C in a designated location, or can be obtained from the NAPD.
Procedure 15: Recovering a failed Enclosure OA

The intent of this procedure is to recover a failed Enclosure Onboard Administrator (OA).
Check off (✓) each step as it is completed. Boxes have been provided for this purpose under each step number.
Should this procedure fail, contact the Oracle Customer Care Center and ask for assistance.

Step 1
Refer to [5] for DSR 4.x, [6] for DSR 5.x, or [7] for DSR 6.x, procedure “Replacing Onboard Administrator in a system with redundant OA”, to replace a failed Enclosure OA.
Appendix C. Switching a DR Site to Primary

Upon the loss of the Primary DSR NO site, the DR NO site should become primary. The following steps are used to perform this switchover.
Preconditions:
• Users cannot access the primary DSR
• Users can still access the DR DSR
• Provisioning clients are disconnected from the primary DSR
• Provisioning has stopped
Recovery Steps
In order to quickly make the DSR GUI accessible and allow provisioning to continue, the DR DSR servers are activated and made to serve as the primary DSR via the following steps.
Step 1: Disable the application on the DR DSR servers
This step ensures that the DR DSR assumes Primary status in a controlled fashion. Disabling the application inhibits provisioning; the application can be restarted after successful validation.
1. Login to the DR DSR GUI as one of the admin users.
2. Select the [Main Menu: Status & Manage → Server] screen.
3. Select the row that has the Active DR DSR server. This highlights the ‘Stop’ button at the bottom.
4. Click the ‘Stop’ button and then click the ‘OK’ button.
At this time, HA switchover causes an automatic logout.
Note: In case HA switchover does not occur, use the Standby DR NO XMI IP to login in step 5.
5. Login to the DR DSR GUI as one of the admin users.
6. Repeat steps 3 to 4 for the other DR DSR server.
7. Verify that the ‘PROC’ column on both DR DSR servers shows ‘Man’, indicating that the application is manually stopped.

Step 2: SSH to the physical IP address of the designated primary DR DSR as root (for DSR 5.x and earlier) or as admusr (for DSR 6.x and later) and make it primary
1. Login via SSH to the physical IP of the chosen primary DR DSR server.
2. Execute the command:
top.setPrimary
This step makes the DR DSR take over as the Primary.
3. The system generates several replication and collection alarms as the replication/collection links to/from the former Primary DSR servers become inactive.

Step 3: Verify replication
1. Monitor the [Main Menu: Status & Manage → Server] screen at the new-Primary DSR.
2. It may take several minutes for replication to complete; afterward the DB and Reporting Status columns should show ‘Normal’.

Step 4: Re-enable the application on the now-Primary DSR using the Active new-Primary DSR GUI
1. Login to the new-Primary DSR GUI as one of the admin users.
2. Select the [Main Menu: Status & Manage → Server] screen.
3. Select the row that has the active new-Primary DSR server. This action highlights the ‘Restart’ button at the bottom.
4. Click the ‘Restart’ button and then click the ‘OK’ button.
5. Verify that the ‘PROC’ column now shows ‘Norm’.
6. Repeat steps 3 to 5 for the standby new-Primary DSR server.
Provisioning connections can now resume to the VIP of the new-Primary DSR.
Step 5: Decrease the durability admin status
1. Lower the durability admin status to (NO pair) to exclude former-Primary DSR servers from the provisioning database durability. A value greater than 2 must be adjusted downward.
a. Login to the new DSR GUI as admin user
b. Select [Main Menu: Administration → General Options]
c. Set durableAdminState to 2 (NO pair)
d. Click the ‘OK’ button

Appendix D. Returning a Recovered Site to Primary

Once a failed site is recovered, the customer might choose to return it to the primary state while returning the current active site to its original DR state. The following steps are used to perform this switchover.
Preconditions:
• Failed Primary DSR site recovered
Recovery Steps
In order to make the DSR GUI accessible and allow provisioning to continue from the recovered site, the recovered DSR servers are returned to service as the primary DSR via the following steps.
Step 1: Disable the application on the currently Active DSR servers
Disabling the application inhibits provisioning; the application can be restarted after successful validation.
1. Login to the Active DSR GUI as one of the admin users.
2. Select the [Main Menu: Status & Manage → Server] screen.
3. Select the row that has the active DSR server. This highlights the ‘Stop’ button at the bottom.
4. Click the ‘Stop’ button and then click the ‘OK’ button.
At this time, HA switchover causes an automatic logout.
5. Login to the Active DSR GUI as one of the admin users.
6. Repeat steps 3 to 4 for the new active DSR server.
7. Verify that the ‘PROC’ column on both DSR servers shows ‘Man’, indicating that the application is manually stopped.

Step 2: Convert the former Primary DSR servers to the new DR DSR
1. SSH to the VIP of the active former-Primary DSR server as root (for DSR 5.x and earlier) or as admusr (for DSR 6.x and later).
2. Execute the command:
top.setSecondary
This step allows the former Primary DSR to become the DR DSR.
3. Monitor the [Main Menu: Status & Manage → Server] screen at the new DR DSR GUI.
4. It may take several minutes for replication to complete; afterward the DB and Reporting Status columns should show ‘Normal’.
Step 3: Start the software on the new DR site
1. Login to the new-DR DSR GUI physical IP as one of the admin users.
2. Select the [Main Menu: Status & Manage → Server] screen.
3. Select the row that has the active new-DR DSR server. This action highlights the ‘Restart’ button at the bottom.
4. Click the ‘Restart’ button and then click the ‘OK’ button.
5. Verify that the ‘PROC’ column now shows ‘Norm’.
6. Repeat steps 3 to 5 for the standby new-DR DSR server.

Step 4: SSH to the VIP address of the to-be-primary DSR as root and make it primary
1. Login via SSH to the VIP of the to-be-primary DSR server as root user.
2. Execute the command:
top.setPrimary
This step makes the DSR take over as the Primary.
3. The system generates several replication and collection alarms as the replication/collection links to/from the former Primary DSR servers become inactive.

Step 5: Re-enable the application on the now-Primary DSR using the Active new-Primary DSR GUI
1. Login to the new-Primary DSR GUI as one of the admin users.
2. Select the [Main Menu: Status & Manage → Server] screen.
3. Select the row that has the active new-Primary DSR server. This action highlights the ‘Restart’ button at the bottom.
4. Click the ‘Restart’ button and then click the ‘OK’ button.
5. Verify that the ‘PROC’ column now shows ‘Norm’.
6. Repeat steps 3 to 5 for the standby new-Primary DSR server.

Step 6: Verify replication
1. Monitor the [Main Menu: Status & Manage → Server] screen at the new-Primary DSR.
2. It may take several minutes for replication to complete; afterward the DB and Reporting Status columns should show ‘Normal’.
Note: the inetmerge process might have to be restarted if replication is taking excessive time. To restart it, SSH to the active site NO and run the following command:
For DSR 4.x/5.x:
# pm.kill inetmerge
For DSR 6.x:
$ sudo pm.kill inetmerge
Step 7: Decrease the durability admin status
1. Lower the durability admin status to (NO pair) to exclude former-Primary DSR servers from the provisioning database durability. A value greater than 2 must be adjusted downward.
a. Login to the new DSR GUI as admin user
b. Select [Main Menu: Administration → General Options]
c. Set durableAdminState to 2 (NO pair)
d. Click the ‘OK’ button

Step 8: Set the durability admin status to include the DR DSR (Optional)
1. If you reduced the durability status in Step 7, raise the durability admin status to its former value (NO + DR NO).
a. Login to the new primary DSR GUI as admin user
b. Select [Main Menu: Administration → General Options]
c. Set durableAdminState to 3 (NO DRNO)
d. Click the ‘OK’ button
2. The new DR DSR servers are now part of the provisioning database durability.

Appendix E. Inhibit A and B level replication on C Level servers
Execute the following commands to inhibit A and B level replication on all C level servers of this site.
Log into the Active NO:
ssh <user>@<Active NO XMI IP>
login as: root/admusr
password: <enter password>
Execute the following command on the active NO:
for i in $(iqt -p -z -h -fhostName NodeInfo where "nodeId like 'C*' and siteId='<NE name of the site>'"); do iset -finhibitRepPlans='A B' NodeInfo where "nodeName='$i'"; done
Note: The NE name of the site can be found by logging into the Active NO GUI and going to the Configuration->Server Groups screen. For example, if ServerSO1 belongs to the site being recovered, then the siteId will be SO_HPC03.
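For illustration only, with the example siteId from the note above (SO_HPC03) substituted for the placeholder, the inhibit command would read:
for i in $(iqt -p -z -h -fhostName NodeInfo where "nodeId like 'C*' and siteId='SO_HPC03'"); do iset -finhibitRepPlans='A B' NodeInfo where "nodeName='$i'"; done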
Note: After executing the above steps to inhibit replication on the MP(s), no alarms are raised on the GUI to indicate that replication on the MPs is disabled. Inhibition of replication on the MPs can be verified by analyzing the NodeInfo output. The inhibitRepPlans field for all the MP servers of the selected site (e.g. site SO_HPC03) shall be set to ‘A B’:
iqt NodeInfo
nodeId      nodeName  hostName  nodeCapability  inhibitRepPlans  siteId
A1386.099   NO1       NO1       Active                           NO_HPC03
B1754.109   SO1       SO1       Active                           SO_HPC03
C2254.131   MP2       MP2       Active          A B              SO_HPC03
C2254.233   MP1       MP1       Active          A B              SO_HPC03
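If the full NodeInfo listing is long, standard shell filtering can narrow it to the site of interest; for example, using the example site above:
# iqt NodeInfo | grep SO_HPC03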
Appendix F. Un-Inhibit A and B level replication on C Level servers
Execute the following commands IF the server is a C level server.
Note: The following steps will allow ‘A and B’ level replication to the recovered C level servers.
Log into the Active NO:
ssh root@<Active NO IP>
login as: root/admusr
password: <enter password>
Execute the following command:
for i in $(iqt -p -z -h -fhostName NodeInfo where "nodeId like 'C*' and siteId='<NE name of the site>'"); do iset -finhibitRepPlans='' NodeInfo where "nodeName='$i'"; done
Note: The NE name of the site can be found by logging into the Active NO GUI and going to the Configuration->Server Groups screen. For example, if ServerSO1 belongs to the site being recovered, then the siteId will be SO_HPC03.
Note: After executing the above steps to enable replication on the MP(s), no indication is raised on the GUI. Enabling of replication on the MPs can be verified by analyzing the NodeInfo output. The inhibitRepPlans field for all the MP servers shall be empty:
iqt NodeInfo
nodeId      nodeName  hostName  nodeCapability  inhibitRepPlans  siteId     excludeTables
A1386.099   NO1       NO1       Active                           NO_HPC03
B1754.109   SO1       SO1       Active                           SO_HPC03
C2254.131   MP2       MP2       Active                           SO_HPC03
C2254.233   MP1       MP1       Active                           SO_HPC03
Note: This allows A and B level replication for the C level servers.
Appendix G. Workarounds for Issues/PRs not fixed in this release

Issue: Inetmerge alarm after force restore (Incorrect NodeID)
Associated PR: 222826
Workaround:
Get the clusterID of the NO using the following command:
# top.myrole
myNodeId=A3603.215
myMasterCapable=true
…
Then update the clusterId field in the RecognizedAuthority table to have the same clusterId:
# ivi RecognizedAuthority
e.g.
iload -ha -xU -frecNum -fclusterId -ftimestamp RecognizedAuthority \
<<'!!!!'
0|A1878|1436913769646
!!!!

Issue: Inetrep alarm after performing disaster recovery
Associated PR: 222827
Workaround:
Restart the Inetrep service on all affected servers using the following commands:
# pm.set off inetrep
# pm.set on inetrep

Issue: Inetsync alarms after performing disaster recovery
Associated PR: 222828
Workaround:
Restart the Inetsync service on all affected servers using the following commands:
# pm.set off inetsync
# pm.set on inetsync

Issue: Active NO /etc/hosts file does not contain server aliases after force restore (Active NO cannot communicate with other servers)
Associated PR: 222829, 234357
Workaround:
Release 5.X: From the recovered NO server command line, execute:
# AppWorks AppWorks_AppWorks updateServerAliases <NO Host Name>
Release 4.X: Update the /etc/hosts file with the missing entries (or copy it from another server, e.g. the SO, if it is complete on that server).
Appendix H. My Oracle Support (MOS)

MOS (https://support.oracle.com) is your initial point of contact for all product support and training needs. A representative at Customer Access Support (CAS) can assist you with MOS registration.
Call the CAS main number at 1-800-223-1711 (toll-free in the US), or call the Oracle Support hotline for your local country from the list at http://www.oracle.com/us/support/contact/index.html.
When calling, there are multiple layers of menu selections. Make the selections in the sequence shown below on the Support telephone menu:
1. For the first set of menu options, select 2, “New Service Request”. You will hear another set of menu options.
2. In this set of menu options, select 3, “Hardware, Networking and Solaris Operating System Support”. A third set of menu options begins.
3. In the third set of options, select 2, “Non-technical issue”. You will then be connected to a live agent who can assist you with MOS registration and provide Support Identifiers. Simply mention that you are a Tekelec customer new to MOS.