Huawei Troubleshooting

S6700 Series Ethernet Switches

V200R001C00

Troubleshooting

Issue 01

Date 2012-03-15

HUAWEI TECHNOLOGIES CO., LTD.

Copyright © Huawei Technologies Co., Ltd. 2012. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.

All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute the warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.

Address: Huawei Industrial Base

Bantian, Longgang

Shenzhen 518129

People's Republic of China

Website: http://www.huawei.com

Email: [email protected]

Issue 01 (2012-03-15) Huawei Proprietary and Confidential

Copyright © Huawei Technologies Co., Ltd.


About This Document

Intended Audience

This document describes the procedure for troubleshooting various services supported by the S6700 in terms of common causes, flowcharts, troubleshooting procedures, alarms and logs, and case studies.

This document is intended for:
- System maintenance engineers
- Commissioning engineers
- Network monitoring engineers

Symbol Conventions

The symbols that may be found in this document are defined as follows.

Symbol     Description
DANGER     Indicates a hazard with a high level of risk, which if not avoided, will result in death or serious injury.
WARNING    Indicates a hazard with a medium or low level of risk, which if not avoided, could result in minor or moderate injury.
CAUTION    Indicates a potentially hazardous situation, which if not avoided, could result in equipment damage, data loss, performance degradation, or unexpected results.
TIP        Indicates a tip that may help you solve a problem or save time.
NOTE       Provides additional information to emphasize or supplement important points of the main text.




Command Conventions

The command conventions that may be found in this document are defined as follows.

Convention           Description
Boldface             The keywords of a command line are in boldface.
Italic               Command arguments are in italics.
[ ]                  Items (keywords or arguments) in brackets [ ] are optional.
{ x | y | ... }      Optional items are grouped in braces and separated by vertical bars. One item is selected.
[ x | y | ... ]      Optional items are grouped in brackets and separated by vertical bars. One item is selected or no item is selected.
{ x | y | ... } *    Optional items are grouped in braces and separated by vertical bars. A minimum of one item or a maximum of all items can be selected.
[ x | y | ... ] *    Optional items are grouped in brackets and separated by vertical bars. Several items or no item can be selected.
&<1-n>               The parameter before the & sign can be repeated 1 to n times.
#                    A line starting with the # sign is a comment.
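The bracket and brace conventions can be read as a rule for enumerating the command lines that a syntax template allows. The following Python sketch is an illustration only and is not part of the product: the function name expand is hypothetical, and the sketch handles only non-nested [ ] and { } groups. The starred forms and &<1-n> are out of scope.

```python
import itertools
import re

# Matches one non-nested group: [ a | b ] (optional) or { a | b } (required).
GROUP = re.compile(r"\[([^\[\]{}]*)\]|\{([^\[\]{}]*)\}")


def expand(template):
    """Return every concrete command line that a syntax template allows.

    Hypothetical helper for illustration. Only non-nested [ ... ] and
    { ... } groups are supported; the starred forms and &<1-n> are not.
    """
    parts = []  # one list of alternatives per slot in the template
    pos = 0
    for match in GROUP.finditer(template):
        parts.append([template[pos:match.start()]])  # literal text before the group
        optional = match.group(1) is not None
        body = match.group(1) if optional else match.group(2)
        choices = [choice.strip() for choice in body.split("|")]
        if optional:
            choices.append("")  # brackets also allow "no item is selected"
        parts.append(choices)
        pos = match.end()
    parts.append([template[pos:]])  # literal text after the last group
    # Combine one choice per slot, then collapse repeated whitespace.
    return [" ".join("".join(combo).split()) for combo in itertools.product(*parts)]
```

For example, expand("display current-configuration interface interface-type [ interface-number ]") returns two forms: one that ends with interface-number and one that omits it.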

Change History

Updates between document issues are cumulative. Therefore, the latest document issue contains all updates made in previous issues.

Changes in Issue 01 (2012-03-15)

Initial commercial release.




Contents

About This Document
1 Collecting Fault Information
2 System
  2.1 CPU Troubleshooting
    2.1.1 CPU Usage Is High
  2.2 Stack Troubleshooting
    2.2.1 Stacking Failures Occur
  2.3 AutoConfig Troubleshooting
    2.3.1 Unconfigured Switch Fails to Obtain an IP Address After Startup
    2.3.2 Unconfigured Switch Fails to Obtain Files
  2.4 ALS Troubleshooting
    2.4.1 Laser Status Does Not Change at ALS Pulse Intervals When a Fiber Link Fails
    2.4.2 Interface Cannot Become Up After the Fiber Link Recovers
  2.5 Telnet Troubleshooting
    2.5.1 The User Fails to Log in to the Server Through Telnet
  2.6 FTP Troubleshooting
    2.6.1 The User Fails to Log in to the Server Through FTP
    2.6.2 The FTP Transmission Fails
    2.6.3 The FTP Transmission Rate Is Low
  2.7 SNMP Troubleshooting
    2.7.1 An SNMP Connection Cannot Be Established
    2.7.2 The NMS Fails to Receive Trap Messages from the Host
  2.8 RMON Troubleshooting
    2.8.1 NM Station Cannot Receive RMON Alarms
  2.9 NQA Troubleshooting
    2.9.1 A UDP Jitter Test Instance Fails to Be Started
    2.9.2 A Drop Record Exists in the UDP Jitter Test Result
    2.9.3 A Busy Record Exists in the UDP Jitter Test Result
    2.9.4 A Timeout Record Exists in the UDP Jitter Test Result
    2.9.5 The UDP Jitter Test Result Is "Failed", "No Result" or "Packet Loss"
  2.10 NTP Troubleshooting
    2.10.1 The Clock is not Synchronized




  2.11 HGMP Troubleshooting
    2.11.1 A Candidate Switch Directly Connected to the Administrator Switch Cannot Be Added to the Cluster
  2.12 LLDP Troubleshooting
    2.12.1 An Interface Cannot Discover Neighbors
  2.13 NAP-based Remote Deployment Troubleshooting
    2.13.1 Fail to Log In to the Newly Deployed Device Through NAP
  2.14 sFlow Troubleshooting
    2.14.1 Target sFlow Collector Used to Receive Counter Sampling Data Cannot Receive Sampling Packets
    2.14.2 Target sFlow Collector Used to Receive Flow Sampling Data Cannot Receive Sampling Packets
3 Physical Connection and Interfaces
  3.1 Ethernet Interface Troubleshooting
    3.1.1 Connected Ethernet Interfaces Down
    3.1.2 An Ethernet Interface Frequently Alternates Between Up and Down
  3.2 Eth-Trunk Interface Troubleshooting
    3.2.1 Eth-Trunk Interface Cannot Forward Traffic
    3.2.2 Troubleshooting Cases
4 LAN
  4.1 VLAN Troubleshooting
    4.1.1 Users in a VLAN Cannot Communicate with Each Other
  4.2 MAC Address Table Troubleshooting
    4.2.1 Correct MAC Address Entries Cannot Be Generated
  4.3 MAC Address Flapping Troubleshooting
    4.3.1 MAC Address Flapping Occurs
  4.4 QinQ Troubleshooting
    4.4.1 Traffic Forwarding Fails on a QinQ Interface
  4.5 MSTP Troubleshooting
    4.5.1 MSTP Topology Change Leads to Service Interruption
  4.6 GVRP Troubleshooting
    4.6.1 No Dynamic VLAN Can Be Created on an Interface
    4.6.2 Dynamic VLAN Flapping Occurs
  4.7 VLAN Mapping Troubleshooting
    4.7.1 Users Cannot Communicate After VLAN Mapping Is Configured
  4.8 SEP Troubleshooting
    4.8.1 Traffic Forwarding Fails on a SEP Link
  4.9 Loop Troubleshooting
    4.9.1 Loops Cause Broadcast Storms
  4.10 Loopback Detection Troubleshooting
    4.10.1 Broadcast Storms Still Exist After Loopback Detection Is Configured
5 IP Services


  5.1 IP Address Troubleshooting
    5.1.1 IP Address Fails to Be Allocated to an Interface
  5.2 DHCP Troubleshooting
    5.2.1 A Client Cannot Obtain an IP Address (the S6700 Functions as the DHCP Server)
    5.2.2 A Client Cannot Obtain an IP Address (the S6700 Functions as the DHCP Relay Agent)
  5.3 DHCPv6 Troubleshooting
    5.3.1 A Client Cannot Obtain an IPv6 Address (the S6700 Functions as the DHCPv6 Relay Agent)
  5.4 IPv6 Troubleshooting
    5.4.1 IPv6 Service Traffic Cannot Be Forwarded
6 IP Forwarding and Routing
  6.1 Layer 2 and Layer 3 Packet Forwarding Troubleshooting
    6.1.1 Fault Location Roadmap
    6.1.2 Layer 2 Packets Are Lost
    6.1.3 Layer 3 Packets Are Lost
  6.2 Ping Troubleshooting
    6.2.1 A Ping Operation Fails
    6.2.2 Troubleshooting Cases
  6.3 Tracert Troubleshooting
    6.3.1 The Tracert Operation Fails
  6.4 OSPF Troubleshooting
    6.4.1 The OSPF Neighbor Relationship Is Down
    6.4.2 The OSPF Neighbor Relationship Cannot Reach the Full State
  6.5 IS-IS Troubleshooting
    6.5.1 The IS-IS Neighbor Relationship Cannot Be Established
    6.5.2 A Device Fails to Learn Specified IS-IS Routes from Its Neighbor
    6.5.3 The IS-IS Neighbor Relationship Flaps
    6.5.4 IS-IS Routes Flap
  6.6 BGP Troubleshooting
    6.6.1 The BGP Peer Relationship Fails to Be Established
    6.6.2 BGP Public Network Traffic Is Interrupted
  6.7 RIP Troubleshooting
    6.7.1 Device Does not Receive Partial or All the Routes
    6.7.2 Device Does not Send Some or All Routes
  6.8 MCE Troubleshooting
    6.8.1 Users on a VPN Cannot Communicate with Each Other
7 Multicast
  7.1 Layer 2 Multicast Troubleshooting
    7.1.1 Users in User VLANs Fail to Receive Multicast Packets (IGMP snooping)



  7.2 Layer 3 Multicast Troubleshooting
    7.2.1 Multicast Traffic Is Interrupted
    7.2.2 The PIM Neighbor Relationship Remains Down
    7.2.3 The RPT on a PIM-SM Network Fails to Forward Data
    7.2.4 The SPT on a PIM-SM Network Fails to Forward Data
    7.2.5 MSDP Peers Cannot Generate Correct (S, G) Entries
8 Security
  8.1 AAA Troubleshooting
    8.1.1 A User Fails in the RADIUS Authentication
    8.1.2 A User Fails in the HWTACACS Authentication
  8.2 ARP Security Troubleshooting
    8.2.1 The ARP Entry of an Authorized User Is Modified Maliciously
    8.2.2 The Gateway Address Is Changed Maliciously
    8.2.3 User Traffic Is Interrupted by a Large Number of Bogus ARP Packets
    8.2.4 IP Address Scanning Occurs
    8.2.5 ARP Learning Fails
  8.3 NAC Troubleshooting
    8.3.1 802.1x Authentication of a User Fails
    8.3.2 802.1x-based Fast Deployment Does Not Take Effect
    8.3.3 MAC Address Authentication of a User Fails
    8.3.4 MAC Address Bypass Authentication of a User Fails
    8.3.5 Web Authentication of a User Fails
  8.4 DHCP Snooping Troubleshooting
    8.4.1 Users Fail to Go Online After DHCP Snooping Is Configured
  8.5 Traffic Suppression Troubleshooting
    8.5.1 Broadcast Suppression Fails to Take Effect on an Interface
  8.6 CPU Defense Troubleshooting
    8.6.1 Protocol Packets Fail to Be Sent to the CPU
    8.6.2 Blacklist Function Fails to Take Effect
    8.6.3 Attack Source Tracing Fails to Take Effect
  8.7 MFF Troubleshooting
    8.7.1 Users Fail to Access the Internet After MFF Is Configured
  8.8 ACL Troubleshooting
    8.8.1 A User-Defined ACL Fails to Take Effect
  8.9 PPPoE+ Troubleshooting
    8.9.1 PPPoE Users Fail to Access the Internet
  8.10 URPF Troubleshooting
    8.10.1 Troubleshooting Cases


9 QoS
  9.1 Traffic Policy Troubleshooting
    9.1.1 Traffic Policy Fails to Take Effect
  9.2 Priority Mapping Troubleshooting
    9.2.1 Packets Enter Incorrect Queues
    9.2.2 Priority Mapping Results Are Incorrect
  9.3 Traffic Policing Troubleshooting
    9.3.1 Traffic Policing Based on Traffic Classifiers Fails to Take Effect
    9.3.2 Interface-based Traffic Policing Results Are Incorrect
  9.4 Traffic Shaping Troubleshooting
    9.4.1 Traffic Shaping Results of Queues Are Incorrect
  9.5 Congestion Avoidance Troubleshooting
    9.5.1 Congestion Avoidance Fails to Take Effect
  9.6 Congestion Management Troubleshooting
    9.6.1 Congestion Management Fails to Take Effect
10 Reliability
  10.1 Smart Link Troubleshooting
    10.1.1 Active/Standby Switchover Failure in a Smart Link Group
    10.1.2 Monitor Link Group Status Is Down
  10.2 VRRP Troubleshooting
    10.2.1 VRRP Group Flaps
    10.2.2 Two Master Devices Exist in a VRRP Group
  10.3 Ethernet OAM Troubleshooting
    10.3.1 MAC Trace Based on Ethernet OAM 802.1ag Fails
    10.3.2 No Unexpected-MEP Alarm Is Generated
  10.4 BFD Troubleshooting
    10.4.1 BFD Session Cannot Go Up
    10.4.2 Interface Forwarding Is Interrupted After a BFD Session Detects a Fault and Goes Down
    10.4.3 Changed BFD Session Parameters Do Not Take Effect
    10.4.4 Dynamic BFD Session Fails to Be Created
  10.5 DLDP Troubleshooting
    10.5.1 DLDP Fails to Detect a Directly Connected Neighbor
  10.6 RRPP Troubleshooting
    10.6.1 RRPP Loop Occurs Temporarily
  10.7 MAC Swap Loopback Troubleshooting


    10.7.1 No Remote Loopback Traffic Is Received by the Tester
    10.7.2 No Local Loopback Traffic Is Received by the Tester
  10.8 ERPS Troubleshooting
    10.8.1 Traffic Forwarding Fails on an ERPS Link




1 Collecting Fault Information

After a device fault occurs, collect fault information first. This helps you locate the fault accurately. The following table lists the commands used to collect fault information.

Item                     Command                          Description
General information      display diagnostic-information   This command collects general information about the system. The command output includes the output of the display current-configuration and display device commands. The general information must be provided when any fault occurs on a network device. Using this command in the Telnet window is recommended. If you run the command on the console port, the command execution lasts a long time and cannot be stopped.
Version information      display version                  -
Hot patch information    display patch-information        -
Device status            display device                   -
System temperature       display environment              -
Current configuration    display current-configuration    This command displays all configuration information on a device. You can specify a regular expression to obtain the required configuration information.
System time              display clock                    -
Logs                     display logbuffer                -
Alarms                   display trapbuffer               -
Interface information    display interface                This command displays the status information about all interfaces, whereas the display current-configuration command displays the configurations of all interfaces. You can also use the display current-configuration interface interface-type [ interface-number ] command to check the configuration of a specified interface.
Memory usage             display memory-usage             -
CPU usage                display cpu-usage                -

Rectifying the Fault

The common troubleshooting methods are:

• Fix or replace the faulty lines.

• Modify the configuration data.

• Restart the system.

After rectifying the fault, you need to run the save command in the user view to save current configurations. Otherwise, the configurations you made will be lost if the device restarts.



2 System

About This Chapter

2.1 CPU Troubleshooting

2.2 Stack Troubleshooting

2.3 AutoConfig Troubleshooting

2.4 ALS Troubleshooting

2.5 Telnet Troubleshooting

2.6 FTP Troubleshooting

2.7 SNMP Troubleshooting

2.8 RMON Troubleshooting

2.9 NQA Troubleshooting

2.10 NTP Troubleshooting

2.11 HGMP Troubleshooting

2.12 LLDP Troubleshooting

This chapter describes common causes of the LLDP fault, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

2.13 NAP-based Remote Deployment Troubleshooting

2.14 sFlow Troubleshooting

This chapter describes common causes of sFlow faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.






2.1 CPU Troubleshooting

2.1.1 CPU Usage Is High

Common Causes

CPU usage is the percentage of a measurement period during which the CPU is executing code. It is an important indicator of device performance.

To view CPU usage, run the display cpu-usage command. CPU usage above 70% is considered high. High CPU usage can cause service faults, for example, BGP route flapping, frequent VRRP active/standby switchovers, and even failed device login. To rectify these service faults, see the troubleshooting section for the affected service.

System CPU usage is high when the CPU usage of some tasks remains high. This fault is commonly caused by one of the following:

• A large number of packets are sent to the CPU when loops or DoS packet attacks occur.

• STP flapping frequently occurs and a large number of TC packets are received, causing the device to frequently delete MAC address entries and ARP entries.

Troubleshooting Flowchart

Figure 2-1 shows the troubleshooting flowchart.

Figure 2-1 CPU usage is high

[Flowchart: The flowchart checks three conditions in turn. If a large number of packets are sent to the CPU, analyze packet features to filter out attack packets. If a large number of TC packets are received, suppress TC-BPDUs. If a loop occurs on the network, eliminate the loop. If none of the conditions applies or the fault persists, seek technical support.]

Troubleshooting Procedure

NOTE

Saving the results of each troubleshooting step is recommended. If troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.

The following procedures can be performed in any sequence.

The command output in the following procedures varies according to the device model. The following procedures describe how to view related information.

Procedure

Step 1 Check the names of tasks with a high CPU usage.

Run the display cpu-usage command to check the CPU usage of each task.

Record the names of tasks with CPU usage exceeding 70%.

NOTE

CPU usage of 70% does not necessarily affect services. Services may not be affected when some tasks consume 70% CPU resources but may be affected when some tasks consume 30% CPU resources. This depends on the actual situation.

<Quidway> display cpu-usage

CPU Usage Stat. Cycle: 60 (Second)

CPU Usage : 22% Max: 95%

CPU Usage Stat. Time : 2036-09-26 10:40:39






CPU utilization for five seconds: 22%: one minute: 22%: five minutes: 22%.

TaskName CPU Runtime(CPU Tick High/Tick Low) Task Explanation

BOX 0% 0/ 588b01 BOX Output

_TIL 0% 0/ 0 Infinite loop event task

_EXC 0% 0/ 0 Exception Agent Task

bcmRX 0% 0/ e97163 bcmRX

VIDL 78% 2/2de643ff DOPRA IDLE

TICK 0% 0/ 346b368

STND 0% 0/ 1c135f STNDStandby task

IPCR 0% 0/ 0 IPCR

VPR 0% 0/ 0 VPR

VPS 0% 0/ 0 VPS

EDBG 0% 0/ 0 EDBG

ECM 0% 0/ fa32e ECM

LOAD 0% 0/ 43cc7 LOAD Load Software

FSP 0% 0/ 1257c9 FSP Stacking

DEV 0% 0/ 0 DEV Device

FCAT 0% 0/ 31cdb FCAT FECD task for catch packet

FOAM 0% 0/ 0 FOAM

FTS 0% 0/ 0 FTS

RTMR 0% 0/ 8fdddf RTMR

IPCQ 0% 0/ 2ef121 IPCQIPC task for single queue

VP 0% 0/ 1630 VP Virtual path task

RPCQ 0% 0/ c32c7 RPCQRemote procedure call

XMON 0% 0/ 0 XMONVxworks system monitor

VFS 0% 0/ 0 VFS Virtual file system

VMON 0% 0/ 0 VMONSystem monitor

HACK 0% 0/ 0 HACKtask for HA ACK

PNGI 0% 0/ 0 PNGI

VT5 0% 0/ 142717 VT5 Line user's task

BFDS 0% 0/ 0 BFDS

1AGAGT 0% 0/ 0 1AGAGT

tExcTask 0% 0/ 0 tS00

tLogTask 0% 0/ 0 tS01

t1 0% 0/ 0 tS02

EHCD_IH 0% 0/ 0 tS03

BusM A 0% 0/ b29ee6 tS04

BULK_CLASS 0% 0/ 0 tS05

BULK_CLASS_IRP 0% 0/ 0 tS06
tBulkClnt 0% 0/ 0 tS09
usbPegasusLib 0% 0/ 0 tS0a
usbPegasusLib_IRP 0% 0/ 0 tS0b
tUsbPgs 0% 0/ 0 tS0c
EpldIntTask 0% 0/ 0 tS0d
tNetTask 0% 0/ 1c4161 tS0e
tMethTask 0% 0/ 801d20 tS0f
tRlogind 0% 0/ 0 tS10
tTelnetd 0% 0/ 0 tS11
tWdbTask 0% 0/ 0 tS12
tShell 0% 0/ 0 tS13
tDcacheUpd 0% 0/ b6b10 tS14
root 0% 0/ 0 tS15
bcmDPC 0% 0/ 0 tS19
bcmL2MOD.0 0% 0/ 21c70 tS1a
bcmCNTR.0 9% 0/40e44673 tS1b
bcmTX 0% 0/ 0 tS1c
bcmXGS3AsyncTX 0% 0/ 0 tS1d
bmLINK.0 1% 0/ b2d2307 tS1e
MACRESTORE 0% 0/ 465d4 tS1f
l2entry_sync 0% 0/ 1280d tS20

MACLIMIT 0% 0/ b403 tS21

INFO 0% 0/ 11ba1 INFOInformation center

SAPP 0% 0/ 290c6 SAPP

NQAC 0% 0/ 0 NQAC

NQAS 0% 0/ 0 NQAS

ALM 0% 0/ 0 ALM Alarm

DEVA 0% 0/ 0 DEVA Device assistant






SRM 0% 0/ 139ebfb SRM System Resource Manage

FIB6 0% 0/ 0 FIB6IPv6 FIB

BFD 0% 0/ 4016e0 BFD Bidirection Forwarding Detect

SNPG 0% 0/ 613918 SNPG Multicast Snooping

OAM1 0% 0/ 0 OAM1 EOAM Adapter

NAP 0% 0/ 0 NAP

EOAM 0% 0/ 3fc3e EOAMEthernet OAM 802.1ag

1731 0% 0/ 51bf4 1731Ethernet OAM Y1731

SLAG 0% 0/ 0 SLAG

MCSW 0% 0/ 0 MCSW Mulitcast Switch

L3M4 0% 0/ 0 L3M4

L3I4 0% 0/ 0 L3I4

NDMB 0% 0/ 0 NDMB

NDIO 0% 0/ 0 NDIO

L3MB 0% 0/ 0 L3MB

L3IO 0% 0/ 0 L3IO

PNGM 0% 0/ 0 PNGM

FECD 0% 0/ c0f34 FECD Forward Equal Class Develope

PPI 0% 0/ b774f PPI Product Process Interface

IFPD 2% 0/ e91f626 IFPD Ifnet Product Adapt

STFW 0% 0/ 0 STFW Super task forward

XQOS 0% 0/ 2cc31 XQOS Quality of service

MIRR 0% 0/ 0 MIRR

SMLK 0% 0/ 24f283 SMLK Smart Link Protocol

CDM 0% 0/ 4b24a CDM

SOCK 0% 0/ 695ed2 SOCKPacket schedule and process

FIB 0% 0/ 0 FIB Forward Information Base

MFIB 0% 0/ c7c5 MFIBMulticast forward info

IFNT 0% 0/ 0 IFNTIfnet task

U 39 0% 0/ 0 U 39 user command process task

VTYD 0% 0/ 2d7d84 VTYDVirtual terminal

RSA 0% 0/ 0 RSA RSA public-key algorithms

AGNT 0% 0/ 0 AGNTSNMP agent task

TRAP 0% 0/ bec50a TRAPSNMP trap task

FMAT 0% 0/ 1365e FMATFault Manage task

NTPT 0% 0/ 275831f NTPTNetwork time protocol task

CFM 0% 0/ 2eaa12 CFM Configuration file management

HS2M 0% 0/ 0 HS2MHigh available task

WEBS 0% 0/ ba23db WEBSERVER

ACL 0% 0/ 22faa ACL

SECE 0% 0/ ddecf8 SECE Security

DEFD 0% 0/ 64e4 DEFD CPU Defend

STRA 0% 0/ 52fc STRA Source Track

MFF 0% 0/ 677a0 MFF MAC Forced Forwarding

LDT 0% 0/ 27928 LDT Loopback Detect

BFDA 0% 0/ 0 BFDA BFD Adapter

MSYN 0% 0/ 279871 MSYN Mac Synchronization

UCM 0% 0/ f559 UCM User Control Management

AM 0% 0/ 748ae AM Address Management

DHCP 0% 0/ 66457 DHCP Dynamic Host Config Protocol

AAA 0% 0/ 0 AAA Authen Account Authorize

TM 0% 0/ 0 TM Transmission Management

RDS 0% 0/ 0 RDS Radius

TACH 0% 0/ 3249c8 TACHWTACACS

WEB 0% 0/ 0 WEB Web

PTAL 0% 0/ 0 PTAL Portal

EAP 0% 0/ 1bb19 EAP Extensible Authen protocol

SAM 0% 0/ 0 SAM Service Agent Module

POE+ 0% 0/ 0 POE+ PPP Over Ethernet Plus

SMAG 0% 0/ 0 SMAG Smart Link Agent

GVRP 0% 0/ 41b05f GVRP Protocol






DLDP 0% 0/ 431a6 DLDP Protocol

LLDP 0% 0/ 3dd71 LLDP Protocol

BPDU 0% 0/ 720f BPDU Adapter

EFMT 0% 0/ 0 EFMTEST 802.3AH Test

ADPT 0% 0/ 0 ADPT Adapter

VT6 0% 0/ 1212c5 VT6 Line user's task


task

ALS 1% 0/ a36acf1 ALS Loss of Signal

SPM 0% 0/ 2f45e2 SPM

VT7 0% 0/ 11f4aa VT7 Line user's task


task

ROUT 0% 0/ 23dec78 ROUTRoute task

UTSK 0% 0/ 0 UTSK

APP 0% 0/ 0 APP

IP 0% 0/ 7c2743 IP

LINK 0% 0/ 9a52e5 LINK

VRPT 0% 0/ 74941 VRPT

HOTT 0% 0/ 0 HOTT

TNQA 0% 0/ 6a583 TNQAC

TTNQ 0% 0/ 0 TTNQAS

TARP 0% 0/ 0 TARPING

L2 0% 0/ 3f21af L2

VRRP 0% 0/ 21fdf3d VRRP

L2_P 0% 0/ 7038bd L2_PR

ARP 0% 0/ 0 ARP

SRMT 3% 0/1ae0649a SRMT System Resource Manage Timer

SRMI 0% 0/ 0 SRMI External Interrupt

VT8 0% 0/ 0 VT8 Line user's task

USB 0% 0/ 0 USB Universal Serial Bus

RMON 0% 0/ 51833 RMONRemote monitoring

MERX 0% 0/ 1659381 MERX Meth Receive task

VT 0% 0/ 0 VT Virtual Transfer

OS 6% 0/2aea53ff Operation System

The following table lists common tasks.

Task     Description
VIDL     Idle task. A higher CPU usage of the VIDL task indicates that the CPU is more idle.
SOCK     Packet receiving and processing task. If this task has a high CPU usage, the CPU receives and processes a large number of protocol packets, which may be caused by IP packet attacks.
RPCQ     Inter-board communication task. The RPCQ task and SOCK task need to be analyzed together. If the device receives a large number of packets and needs to respond to these packets, the RPCQ task has a high CPU usage. This may be caused by packet attacks.
ROUT     Route processing task. When a large number of routes need to be learned or many of them flap, the ROUT task has a high CPU usage. Check whether the routing module is faulty according to the related routing information.
bcmRX    Bottom-layer packet receiving task. A higher CPU usage of the bcmRX task indicates that the CPU receives a larger number of packets.
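The check in Step 1 can be sketched as a small script that scans saved display cpu-usage output for busy tasks. This is a minimal sketch, not a Huawei tool: the function name busy_tasks and the regular expression are assumptions based on the sample output above, and the line format may differ on other device models.

```python
import re

# Assumed line format, taken from the sample output:
#   <task name> <cpu>% <tick high>/<tick low> <explanation>
TASK_LINE = re.compile(
    r"^(?P<name>.+?)\s+(?P<cpu>\d+)%\s+[0-9a-fA-F]+/\s*[0-9a-fA-F]+"
)

def busy_tasks(output, threshold=70, idle_task="VIDL"):
    """Return (task, cpu) pairs whose CPU usage exceeds the threshold.

    The idle task is skipped because a high VIDL value means the CPU is idle.
    """
    busy = []
    for line in output.splitlines():
        match = TASK_LINE.match(line.strip())
        if not match:
            continue  # header, summary, or wrapped description line
        name, cpu = match.group("name"), int(match.group("cpu"))
        if name != idle_task and cpu > threshold:
            busy.append((name, cpu))
    return busy
```

The 70% default mirrors the threshold used in this procedure; as the NOTE in Step 1 points out, a lower value may be more appropriate for some tasks.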

Step 2 Check whether a large number of packets are sent to the CPU.

Run the display cpu-defend statistics command to check statistics about the packets sent to the CPU and focus on the Drop field.

<Quidway> display cpu-defend statistics
Statistics on slot 0:
-------------------------------------------------------------------------------
Packet Type Pass(Bytes) Drop(Bytes) Pass(Packets) Drop(Packets)
-------------------------------------------------------------------------------
arp-miss N/A N/A 0 0
arp-request N/A N/A 5608 0
bgp N/A N/A 0 0
dns N/A N/A 0 0
fib-hit N/A N/A 0 0
ftp N/A N/A 0 0
hotlimit N/A N/A 0 0
http N/A N/A 0 0
hw-tacacs N/A N/A 0 0
icmp N/A N/A 5 0
icmpv6 N/A N/A 0 0
isis N/A N/A 0 0
nd N/A N/A 0 0
ntp N/A N/A 0 0
ospf N/A N/A 0 0
ospfv3 N/A N/A 0 0
radius N/A N/A 0 0
reserved-multicast N/A N/A 0 0
rip N/A N/A 0 0
ripng N/A N/A 0 0
snmp N/A N/A 0 0
ssh N/A N/A 0 0
tcp N/A N/A 0 0
telnet N/A N/A 2046 0
ttl-expired N/A N/A 0 0
-------------------------------------------------------------------------------

• If the value of the Drop field for a certain type of packets is large and CPU usage is high, a packet attack is occurring. Go to step 5.

• If the value of the Drop field is within the specified range, go to step 3.
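To focus on the Drop field as described above, saved command output can be filtered for packet types with a nonzero Drop(Packets) counter. The following is a minimal sketch; the function name dropped_packet_types is hypothetical, and the five-column row layout is assumed from the sample output.

```python
def dropped_packet_types(output):
    """Map packet type -> Drop(Packets) for every row with dropped packets.

    Assumes each data row has five whitespace-separated columns:
    type, Pass(Bytes), Drop(Bytes), Pass(Packets), Drop(Packets).
    """
    dropped = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # title, header, or separator line
        if not (fields[3].isdigit() and fields[4].isdigit()):
            continue
        if int(fields[4]) > 0:
            dropped[fields[0]] = int(fields[4])
    return dropped
```

What counts as a large Drop value depends on the network; the sketch only reports the counters and leaves that judgment to the engineer.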

Step 3 Check whether a large number of TC packets are received.

If STP is enabled on a device, the device deletes MAC address entries and ARP entries when receiving TC-BPDUs. If an attacker sends pseudo TC-BPDUs to attack the device, the device will receive a large number of TC-BPDUs within a short period of time and frequently delete MAC address entries and ARP entries. As a result, CPU usage of the device becomes high.

Run the display stp tc-bpdu statistics command to check statistics about the received TC packets and TCN packets.

<Quidway> display stp tc-bpdu statistics
-------------------------- STP TC/TCN information --------------------------
MSTID Port TC(Send/Receive) TCN(Send/Receive)
0 XGigabitEthernet0/0/2 1/0 0/0

• If a large number of TC packets and TCN packets are received, run the stp tc-protection command in the system view to suppress TC-BPDUs. After this command is used, only three TC packets are processed within a Hello interval by default. Run the stp tc-protection threshold command to set the maximum number of TC packets that can be processed. To change the Hello interval, run the stp timer hello command.

[Quidway] stp tc-protection
[Quidway] stp tc-protection threshold 5
[Quidway] stp timer hello 120

• If a small number of TC packets are received, go to step 4.
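The received TC and TCN counters can be pulled out of saved display stp tc-bpdu statistics output as follows. This is a sketch under the assumption that each data row matches the sample above (MSTID, port, TC sent/received, TCN sent/received); the function name received_tc_counts is hypothetical.

```python
import re

# Assumed row format: "<MSTID> <port> <TC sent>/<TC received> <TCN sent>/<TCN received>"
ROW = re.compile(r"^(\d+)\s+(\S+)\s+(\d+)/(\d+)\s+(\d+)/(\d+)$")

def received_tc_counts(output):
    """Map (MSTID, port) -> (TC received, TCN received)."""
    counts = {}
    for line in output.splitlines():
        match = ROW.match(line.strip())
        if match:
            mstid, port, _tc_sent, tc_recv, _tcn_sent, tcn_recv = match.groups()
            counts[(int(mstid), port)] = (int(tc_recv), int(tcn_recv))
    return counts
```

Comparing two snapshots taken a known time apart shows how fast the counters grow, which is more telling than a single reading.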

Step 4 Check whether loops occur on the network.

When multiple interfaces of a device belong to the same VLAN and a loop occurs between two of these interfaces, packets circulate between the two interfaces in the VLAN. Consequently, CPU usage of the device becomes high.

Run the display this command in the VLAN view to check whether the device is enabled to generate an alarm when MAC address flapping is detected.

[Quidway-vlan7] display this
#
vlan 7
 loop-detect eth-loop alarm-only
#

• If this function is not configured, run the loop-detect eth-loop alarm-only command to configure it. If a loop occurs on the network, an alarm is generated when two interfaces of the device learn the same MAC address entry. For example:

Jan 17 2011 19:40:16 L2_SRV_78 L2IFPPI/4/MFLPVLANALARM:OID 1.3.6.1.4.1.2011.5.25.160.3.7 Loop exists in vlan 7, for flapping mac-address 0000-0000-0004 between port XGE0/0/2 and port XGE0/0/3

Check the interface connection and networking information according to the alarm:

– If no ring network is required, shut down one of the two interfaces according to the networking diagram.

– If the ring network is required, disable the MAC flapping alarm function and enable loop prevention protocols such as STP.

• If the loop-detect eth-loop alarm-only command is used on the device but no alarm is generated, go to step 5.
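When many such alarms are logged, the VLAN, MAC address, and the two ports can be extracted automatically. The sketch below assumes the alarm text matches the example in this step; the function name parse_loop_alarm is hypothetical.

```python
import re

# Pattern assumed from the MFLPVLANALARM example in this step.
LOOP_ALARM = re.compile(
    r"Loop exists in vlan (?P<vlan>\d+), for flapping mac-address "
    r"(?P<mac>[0-9a-fA-F-]+) between port (?P<port_a>\S+) and port (?P<port_b>\S+)"
)

def parse_loop_alarm(log_line):
    """Return the VLAN, MAC address, and ports named in a loop alarm, or None."""
    match = LOOP_ALARM.search(log_line)
    if not match:
        return None
    info = match.groupdict()
    info["vlan"] = int(info["vlan"])
    return info
```

The two port names identify the interfaces whose cabling should be checked against the networking diagram.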

Step 5 Collect the following information and contact Huawei technical support personnel.

• Results of the preceding troubleshooting procedure

• Configuration files, log files, and alarm files of the device

----End

Relevant Alarms and Logs

Relevant Alarms

None



Relevant Logs

• VOSCPU/4/CPU_USAGE_HIGH

2.2 Stack Troubleshooting

2.2.1 Stacking Failures Occur


Common Causes

Switches in a stack can form a ring or chain topology, as shown in Figure 2-2 and Figure 2-3.

The switches are connected using service interfaces.

Figure 2-2 Ring topology

[Figure: SwitchA, SwitchB, SwitchC, and SwitchD form Stack-A and connect to the Ethernet. Legend: stack link, common link.]

Figure 2-3 Chain topology

[Figure: SwitchA, SwitchB, SwitchC, and SwitchD connect to the Ethernet; the stacks are labeled Stack-B and Stack-C. Legend: stack link, common link.]

A stacking failure refers to one of the following situations:

• Switches cannot set up a stack.

• A switch fails to join a stack.

• A stack cannot be set up again after it splits.

This fault is commonly caused by one of the following:

• The stacking function is disabled on the switches.

• A switch does not have any electronic label or has an incorrect electronic label.

• Stack cables are incorrectly connected.

• Some stack cables are faulty.


Figure 2-4 Stack troubleshooting flowchart

[Flowchart: A stacking failure occurs. The flowchart checks the following in turn and, after each corrective action, asks whether the fault is rectified. Is stacking enabled on the switches? If not, enable stacking. Are SI and EI models connected? If so, replace switches and ensure that all switches support stacking. Has the device electronic label been loaded? If not, upload the electronic label or replace the switch. Has the stack card label been loaded? If not, replace the stack card. Are the stack cables correctly connected? If not, connect the stack cables correctly. Are the stack ports working properly? If not, replace the stack card or stack cables.]

Context

CAUTION

Before replacing a switch, power off the switch.

Perform the following steps on each switch where a stacking failure occurs.

Procedure

Step 1 Check that the stacking function is enabled on the switch.

Run the display stack command to check the stack status.

• If the following information is displayed, the stacking function is disabled.

<Quidway> display stack
Error: The stack function is not enabled.

1. Run the stack enable command in the system view of the S6700EI or S6700SI switches to enable the stacking function, and then restart the switches.

2. Check whether the stack cables are installed properly. If a stack cable is loose, reconnect it. S6700s can form a ring stack or chain stack. The ring stack is recommended because it is more stable and reliable.

• If the following information is displayed, the stacking function is enabled. Go to the next step.

<Quidway> display stack
Stack topology type: Link
Stack system MAC: 0200-0001-0000
MAC switch delay time: never
Stack reserved vlanid : 4093
Slot# role Mac address Priority Device type
------ ---- -------------- ------ -------
1 Master 0200-0001-0000 100 S6748-EI

Step 2 Check that the correct electronic label has been loaded to the switch.

Run the display elabel command to view the electronic label.

• If all fields under [Board Properties] are empty, no electronic label is loaded to the switch. Replace the switch.

<Quidway> display elabel
/$[System Integration Version]
/$SystemIntegrationVersion=3.0
[Slot_0]
/$[Board Integration Version]
/$BoardIntegrationVersion=3.0
[Main_Board]
/$[ArchivesInfo Version]
/$ArchivesInfoVersion=3.0
[Board Properties]
BoardType=
BarCode=
Item=
Description=
Manufactured=
VendorName=
IssueNumber=
CLEICode=
BOM=

• If fields under [Board Properties] are not empty, the electronic label has been loaded to the switch. Go to the next step.

<Quidway> display elabel
/$[System Integration Version]
/$SystemIntegrationVersion=3.0
[Slot_0]
/$[Board Integration Version]
/$BoardIntegrationVersion=3.0
[Main_Board]
/$[ArchivesInfo Version]
/$ArchivesInfoVersion=3.0
[Board Properties]
BoardType=
BarCode=21023518320123456789
Item=02351832
Description=Quidway ,, Mainframe
Manufactured=2009-02-05
VendorName=Huawei
IssueNumber=
CLEICode=
BOM=
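The electronic label check in Step 2 amounts to testing whether any field under [Board Properties] has a value. The following sketch works on saved display elabel output; the function names are hypothetical, and the section layout is assumed from the samples above.

```python
def board_properties(output):
    """Collect key=value pairs listed under the [Board Properties] section."""
    props = {}
    in_section = False
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("["):
            # A new bracketed header starts or ends the section of interest.
            in_section = (line == "[Board Properties]")
            continue
        if in_section and "=" in line and not line.startswith("/"):
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props

def elabel_loaded(output):
    """Return True if at least one [Board Properties] field is not empty."""
    return any(board_properties(output).values())
```

A result of False corresponds to the first case in this step, where no electronic label is loaded.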

Step 3 Check that stack cables are correctly connected.

If the switch uses service interfaces and stack ports, check the configuration of the stack ports.

Run the display stack-port { global load-balance | load-balance [ stack-port-id ] | membership [ stack-port-id ] } command to check the configuration of stack ports.

Compare the displayed interfaces with the actual connected interfaces on the switches.

<Quidway> display stack-port membership
stack-port2/1 has 1 ports
---------------------------------------------
XGigabitEthernet2/0/1
stack-port2/2 has 1 ports
---------------------------------------------
XGigabitEthernet2/0/2

• If the displayed interfaces are different from the actual connected interfaces, the stack cables are incorrectly connected. Reconnect the stack cables correctly.

• If the displayed interfaces are the same as the actual connected interfaces, go to the next step.

Step 4 Check that the stack cables are functioning properly.

Run the display stack port all command to check the status of all stack ports.

• If all stack ports are Up, go to the next step.

<Quidway> display stack port all
Show stack port info:
Slot 0:
STACK 1, status: UP, peer: 1
Slot 1:
STACK 2, status: UP, peer: 0

• If a stack port is Down, check whether the device connected to the port has been powered off or is restarting. If so, check the port status after the remote device restarts. If not, replace the stack cable on this port.

<Quidway> display stack port all
Show stack port info:
Slot 0:
STACK 1, status: DOWN, peer: NONE
STACK 2, status: DOWN, peer: NONE
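The Down stack ports can be listed from saved display stack port all output with a short script. This is a minimal sketch; the function name down_stack_ports is hypothetical and the line formats are assumed from the sample output in this step.

```python
import re

SLOT = re.compile(r"^Slot (\d+):")
PORT = re.compile(r"^STACK (\d+), status: (\w+), peer: (\w+)")

def down_stack_ports(output):
    """Return (slot, stack port) pairs whose status is DOWN."""
    down = []
    slot = None
    for line in output.splitlines():
        line = line.strip()
        slot_match = SLOT.match(line)
        if slot_match:
            slot = int(slot_match.group(1))
            continue
        port_match = PORT.match(line)
        if port_match and port_match.group(2).upper() == "DOWN":
            down.append((slot, int(port_match.group(1))))
    return down
```

An empty result means that every listed stack port is Up.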

If the fault persists, collect the following information and contact Huawei technical support personnel.

• Results of the preceding troubleshooting procedure

• Configuration files, log files, and alarm files of the S6700s

----End







Relevant Alarms

• A stack port goes Up: FSP_1.3.6.1.4.1.2011.5.25.183.1.22.1 hwStackLinkUp

• A stack port goes Down: FSP_1.3.6.1.4.1.2011.5.25.183.1.22.2 hwStackLinkDown

• A switch has joined a stack: FSP_1.3.6.1.4.1.2011.5.25.183.1.22.6 hwStackStackMemberAdd

• A switch has left a stack: FSP_1.3.6.1.4.1.2011.5.25.183.1.22.7 hwStackStackMemberLeave

• Logical interfaces with the same ID on two switches are connected: ECM_1.3.6.1.4.1.2011.5.25.183.1.22.9 hwStackLogicStackPortLinkErr

• Physical interfaces of the same logical interface are connected differently: ECM_1.3.6.1.4.1.2011.5.25.183.1.22.10 hwStackPhyStackPortLinkErr

Relevant Logs

None.

No Interface Is Shut Down by DAD in Direct Mode

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when no interfaces of member switches are shut down after a stack splits.

No Interfaces of Member Switches Are Shut Down After a Stack Splits

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when no interfaces of member switches are shut down after a stack splits.

2.3 AutoConfig Troubleshooting

2.3.1 Unconfigured Switch Fails to Obtain an IP Address After Startup

Common Causes

In AutoConfig implementation, the DHCP server and switches can be deployed in the same network segment or in different network segments, as shown in Figure 2-5 and Figure 2-6. No configuration has been performed on SwitchA, SwitchB, and SwitchC.



Figure 2-5 AutoConfig network diagram (in the same network segment)

[Figure: The DHCP server, operator, and FTP/TFTP server connect to an aggregation switch, which connects to SwitchA, SwitchB, and SwitchC.]

Figure 2-6 AutoConfig network diagram (in different network segments)

[Figure: The DHCP server, operator, and FTP/TFTP server connect through a network to a DHCP relay, which connects to SwitchA, SwitchB, and SwitchC.]

An unconfigured switch has started and been running for 5 minutes, but no IP address is allocated to it from the DHCP server.

This fault is commonly caused by one of the following:

• There is a .cfg or .zip file in the flash memory.

• The local interface parameters such as the rate and duplex mode are different from those of the remote interface.

• There is no reachable route between the DHCP server and the switch.

• The DHCP server configuration (IP address pool and option configuration) is incorrect.

• An event prevents the AutoConfig process. For example, the switch joins a Huawei Group Management Protocol (HGMP) cluster, or the switch obtains a file from the USB port.


Figure 2-7 Troubleshooting flowchart for an IP address allocation failure in the AutoConfig process

[Flowchart: A switch fails to obtain an IP address after startup. The flowchart checks the following in turn and, after each corrective action, asks whether the fault is rectified. Is there any .cfg or .zip file in flash? If so, delete the file and reboot the device. Do the local and remote interfaces work in the same mode? If not, modify interface parameters on the remote device. Is the DHCP server reachable? If not, rectify link and route faults. Is the DHCP server configuration correct? If not, configure the IP address pool and options correctly. Does any event prevent AutoConfig? If so, terminate the event and restart the switch if AutoConfig is required. Otherwise, seek technical support.]


Procedure

Step 1 Check whether the switch has a configuration file.

Check whether there is any .cfg or .zip file (other than *web.zip or web.zip) in the flash memory.

Log in to the switch from the console port and run the dir command.

• If a .cfg or .zip file is displayed, run the delete command to delete the file, and then reboot the switch.

• If no .cfg or .zip file is displayed, go to step 2.
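The file-name rule in Step 1 can be expressed as a small filter over the names listed by the dir command. This is a sketch only: the function name blocking_files is hypothetical, and treating file names as case-insensitive is an assumption.

```python
def blocking_files(filenames):
    """Return the .cfg and .zip files that count as configuration files in Step 1.

    Files named web.zip or ending in web.zip are excluded, as described above.
    Case-insensitive matching is an assumption of this sketch.
    """
    blocking = []
    for name in filenames:
        lower = name.lower()
        if lower.endswith(".cfg"):
            blocking.append(name)
        elif lower.endswith(".zip") and not lower.endswith("web.zip"):
            blocking.append(name)
    return blocking
```

For example, a flash listing that contains only a .cc system software file and a Web system file yields an empty list, so Step 2 applies.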

Step 2 Check the physical connection between the local and remote devices.

The connected interfaces must work in the same mode. By default, 10M electrical ports, 100M electrical ports, GE electrical ports, and GE optical ports on Huawei switches work in auto-negotiation mode; 100M optical ports and 10GE optical ports work in non-auto-negotiation mode. If the interfaces work in different modes, modify interface parameters on the remote device, that is, the DHCP relay agent or DHCP server.

If the interfaces work in the same mode, go to step 3.

Step 3 Ensure that there is a reachable route between the DHCP server and the switch.

Packets sent from an unconfigured switch are untagged. The DHCP server can allocate an IP address to the switch only if it can receive the untagged packets.

Check whether any DHCP relay agent is deployed between the DHCP server and the switch to determine whether they are in the same network segment.

• If they are in the same network segment, ensure that the untagged packets sent from the switch can reach the DHCP server, and that the IP address of the DHCP server interface connected to the switch is in the same network segment as the address pool on the DHCP server.

• If they are in different network segments, ensure that:

– On the DHCP relay agent closest to the switch, the interface connected to the switch can process untagged packets.

– The interface IP address is within the range specified in the IP address pool on the DHCP server and is the gateway address of the DHCP server.

– There is a reachable route between the interface and the DHCP server.

If there is a reachable route between the DHCP server and the switch, go to step 4.

Step 4 Check the DHCP server configuration required for AutoConfig.

Item: IP address pool

Solution: If the DHCP server has no IP address pool, configure one according to the IP address plan.

Item: Option 147, Option 143, Option 150, and Option 66

Solution:

a. Check whether Option 147 is configured. If Option 147 is not configured or is set to AutoConfig (case-sensitive), go to step b. If it is set to a value other than AutoConfig, change it to AutoConfig.

b. Check whether Option 143 (FTP server IP address), Option 150 (TFTP server IP address), or Option 66 (TFTP server name) is configured. If none of them is configured, configure one of them. If Option 143 is configured, you must also configure Option 141 (FTP user name) and Option 142 (FTP password). If Option 66 is configured, you must also configure Option 6 (DNS server IP address).

If the DHCP server configuration is correct, go to step 5.
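The option dependencies checked in Step 4 can be summarized as a simple validation routine. The sketch below is illustrative only: the function name check_autoconfig_options and the dictionary representation of the DHCP server options are assumptions, not part of any DHCP server.

```python
def check_autoconfig_options(options):
    """Return a list of problems found in the AutoConfig-related DHCP options.

    `options` maps DHCP option numbers to their configured values
    (a hypothetical representation chosen for this sketch).
    """
    problems = []
    if 147 in options and options[147] != "AutoConfig":
        problems.append("Option 147 must be set to AutoConfig (case-sensitive)")
    if not any(code in options for code in (143, 150, 66)):
        problems.append("configure one of Option 143, Option 150, or Option 66")
    if 143 in options and not (141 in options and 142 in options):
        problems.append("Option 143 requires Option 141 and Option 142")
    if 66 in options and 6 not in options:
        problems.append("Option 66 requires Option 6")
    return problems
```

An empty list means that the options satisfy the rules in the table above; it does not verify that the configured addresses or names are reachable.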

Step 5 Check whether any event conflicting with the AutoConfig process has occurred.

The AutoConfig process stops when any of the following events occurs:

• The switch has joined a Huawei Group Management Protocol (HGMP) cluster. You can log in to the switch and run the display cluster command to check the role of the switch in the cluster.

• The USB port of the switch is connected to a storage device and the switch has obtained a version file or configuration file from the storage device.

When any of the events occurs, determine whether the AutoConfig process is required. If yes, terminate the event and restart the switch. If the fault persists, go to step 6.

Step 6 Collect the following information and contact Huawei technical support personnel.

• Results of the preceding troubleshooting procedure

• Configuration files, log files, and alarm files of the devices

----End


Relevant Alarms

None.

Relevant Logs

None.

2.3.2 Unconfigured Switch Fails to Obtain Files






Common Causes

In the AutoConfig process, after an unconfigured switch is assigned an IP address and obtains the file server IP address, it starts to obtain required files, including the version file, patch file, configuration file, and Web system file. You can view logs on the file server to check whether the switch has successfully obtained the files. Alternatively, run the display autoconfig-status command on the switch. If the AutoConfig status is suspend, the switch has failed to obtain the files.

NOTE

Among the preceding files, the configuration file is mandatory for a switch, and the other files are optional.

If the switch fails to obtain the files, it starts the retry timer and restarts the AutoConfig process when the retry timer expires. The retry interval is 30 minutes during the first 3 days after the first failure to obtain the files, and 2 hours after that. If the switch fails to obtain the files within 30 days, the AutoConfig process stops.
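The retry schedule described above can be written out as a function of the time elapsed since the first failure. This is a sketch under two assumptions that the text does not state explicitly: the 30-day limit is counted from the first failure, and the 2-hour interval starts at exactly 3 days. The function name retry_interval is hypothetical.

```python
from datetime import timedelta

def retry_interval(elapsed):
    """Return the retry interval for a given time since the first failure.

    Returns None once 30 days have passed, when the AutoConfig process stops.
    """
    if elapsed >= timedelta(days=30):
        return None
    if elapsed < timedelta(days=3):
        return timedelta(minutes=30)
    return timedelta(hours=2)
```

For instance, one hour after the first failure the switch would retry every 30 minutes, and five days after it every 2 hours.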

You can also run the autoconfig getting-file restart command in the system view to start the AutoConfig process immediately. If the switch fails to obtain the files after the command is executed, the system attempts to obtain the files at intervals.

This fault is commonly caused by one of the following:

• The file server, a TFTP server or an FTP server, is unreachable. For example, there is no reachable route to the file server or the FTP user name and password are incorrect.

• Option 67 is not configured on the DHCP server and no intermediate file is available on the file server.

• Option 67 is not configured and the intermediate file does not contain the configuration file name, that is, cfgfile.

• The vrpVer field indicating the version name in Option 145 or the intermediate file does not contain the version number.

• The current system software version name is the same as that defined in Option 145 or the intermediate file, but the version numbers are different.

• The flash memory does not have sufficient space for the files.




Figure 2-8 Troubleshooting flowchart for a failure to obtain the files

[Flowchart: AutoConfig is suspended when trying to obtain files. The flowchart checks the following in turn and, after each corrective action, asks whether the fault is rectified. Is the file server reachable? If not, ensure that the file server is reachable. Are Options 67 and 145 configured on the DHCP server? If so, ensure that Option 67 and Option 145 are correct. If not, does the intermediate file exist on the file server? If it does not, upload the intermediate file to the file server or configure Options 67 and 145. Does the file server have the specified files? If not, upload the files specified in the intermediate file or Options 67 and 145 to the file server. Does the switch have sufficient space? If not, delete unnecessary files to free up space.]

NOTE

• The following is the general troubleshooting procedure. You can also log in to the unconfigured switch, run the display autoconfig-status command to check the causes of the failure, and rectify the fault accordingly. After the fault is rectified, run the autoconfig getting-file restart command to restart the AutoConfig process.

• Saving the results of each troubleshooting step is recommended. If troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.

Procedure

Step 1 Check whether the TFTP server or FTP server is reachable.

1. Check the route between the switch and the server.

Use the ping command to check whether the switch can ping the server.

l If the ping operation fails, rectify the link fault according to 6.2.1 A Ping Operation Fails.

l If the ping operation succeeds, go to substep 2.

2. Check the options configured on the DHCP server.

The switch checks for the following options in a DHCP reply packet in sequence. If any of the options is found in the reply packet, the switch processes the packet.

Option        Description

Option 150    TFTP server IP address.

Option 143    FTP server IP address. Option 143 is used with Option 141 (FTP user name) and Option 142 (FTP user password).

Option 66     TFTP server name. Option 66 is used with Option 6 (DNS server IP address).

If the file server is reachable, go to step 2.
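The lookup order described in substep 2 can be sketched as follows. This is a minimal illustration, not the switch's implementation: it assumes that the first option found (150, then 143, then 66) decides which file server is used, and the function name select_file_server and the returned dictionary layout are hypothetical.

```python
# Hypothetical sketch of the option lookup order described in Step 1.
# Assumption: the first option found in the sequence 150, 143, 66 wins.
LOOKUP_ORDER = (150, 143, 66)

# Companion options named in the table above.
COMPANIONS = {150: (), 143: (141, 142), 66: (6,)}
PROTOCOL = {150: "tftp", 143: "ftp", 66: "tftp"}


def select_file_server(options):
    """Return details of the first file-server option present, or None.

    `options` maps DHCP option numbers to their values.
    """
    for opt in LOOKUP_ORDER:
        if opt in options:
            # Report companion options that should accompany this one.
            missing = [c for c in COMPANIONS[opt] if c not in options]
            return {
                "option": opt,
                "protocol": PROTOCOL[opt],
                "server": options[opt],
                "missing": missing,
            }
    return None
```

A non-empty "missing" list points at a DHCP server configuration gap, for example Option 143 configured without the FTP password in Option 142.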

Step 2 Check whether Option 67 and Option 145 are configured on the DHCP server.

l Option 67 specifies the configuration file name; Option 145 specifies the version file name, version number, patch file name, and Web system file name and is optional. If the two options are configured, the switch obtains the files specified by the options. Check whether the file names in the options are the same as those on the file server. Check the contents of the intermediate file and go to step 4.






NOTE

The following is an example of Option 145: vrpfile=S6700V100R006C00.cc;vrpver=V100R006C00;patchfile=S6700pat.pat;webfile=V100R006C00web.zip

In Option 145, the vrpfile field must contain the version number. The file name is in the SxxxxVxxxRxxxCxxSPCxxx.cc format. SPCxxx is optional.

l If neither Option 67 nor Option 145 is configured, the AutoConfig process searches for the configuration file name in the intermediate file. Go to step 3.
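The Option 145 string can be checked offline before it is configured on the DHCP server. The following is a minimal sketch under stated assumptions: the regular expression is one reading of the SxxxxVxxxRxxxCxxSPCxxx.cc naming format, field names are compared case-insensitively, and the function names are hypothetical.

```python
import re

# Assumed reading of the VxxxRxxxCxx[SPCxxx] version pattern.
VERSION_RE = re.compile(r"V\d{3}R\d{3}C\d{2}(SPC\d{3})?", re.IGNORECASE)


def parse_option_145(text):
    """Split 'key=value;key=value' into a dict with lower-case keys."""
    fields = {}
    for item in text.strip().split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        fields[key.strip().lower()] = value.strip()
    return fields


def check_option_145(text):
    """Return a list of problems found in an Option 145 string."""
    fields = parse_option_145(text)
    problems = []
    vrpfile = fields.get("vrpfile", "")
    if vrpfile:
        if not VERSION_RE.search(vrpfile):
            problems.append("vrpfile does not contain a version number")
        if not VERSION_RE.search(fields.get("vrpver", "")):
            problems.append("vrpver does not contain a version number")
    return problems
```

An empty list means that the string passed these two checks only; it does not prove that the named files exist on the file server.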

Step 3 Check whether an intermediate file exists on the file server.

Search for the file named lswnet.cfg.

l If the file does not exist on the file server, upload the file to the file server or configure Option 67 and Option 145 on the DHCP server.

l If the intermediate file exists on the file server, the AutoConfig process searches for the configuration file and other required files in the intermediate file. Check the contents of the intermediate file and go to step 4.

NOTE

The intermediate file contains information about a maximum of 2000 devices. Each device is identified by its MAC address or equipment serial number (ESN). If more than 2000 devices are configured in the intermediate file, the AutoConfig process will be suspended.

The following is an intermediate file sample:

MAC=0001-0203-0405;vrpfile=S6700V100R006C00.cc;vrpVer=V100R006C00;cfgfile=vrpcfg01.cfg;patchfile=S6700pat.pat;webfile=V100R006C00web.zip;

The list contains no space and must end with a semicolon (;). The MAC/ESN field and the cfgfile field are mandatory and the other fields are optional.

Table 2-1 describes the fields in the intermediate file.

Table 2-1 Fields in the intermediate file

Field       Description

MAC/ESN     MAC address or ESN of a device. The AutoConfig process finds each device to configure according to this field.

vrpfile     Version file name. The value of this field must contain the version number. The file name is in the SxxxxVxxxRxxxCxxSPCxxx.cc format. SPCxxx is optional.

vrpVer      Version number.

cfgfile     Configuration file name. The configuration file cannot be a compressed file.

patchfile   Patch file name.

webfile     Web system file name. The Web system file name must end with web.zip, for example, S6700V100R006C00web.zip.
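The rules stated for the intermediate file (no spaces, trailing semicolon, mandatory MAC/ESN and cfgfile fields, webfile ending with web.zip, at most 2000 devices) can be checked before the file is uploaded. The sketch below is illustrative only. It assumes one device entry per line and that the ESN key is spelled ESN; both are assumptions, and the function names are hypothetical.

```python
MAX_DEVICES = 2000  # limit stated in the NOTE for the intermediate file


def check_intermediate_line(line):
    """Return a list of rule violations for one device entry."""
    problems = []
    if " " in line:
        problems.append("contains a space")
    if not line.endswith(";"):
        problems.append("does not end with a semicolon")
    fields = {}
    for item in line.rstrip(";").split(";"):
        key, _, value = item.partition("=")
        fields[key.lower()] = value
    # Assumption: the device key is written as MAC or ESN.
    if "mac" not in fields and "esn" not in fields:
        problems.append("MAC/ESN field is missing")
    if not fields.get("cfgfile"):
        problems.append("cfgfile field is missing")
    webfile = fields.get("webfile")
    if webfile and not webfile.endswith("web.zip"):
        problems.append("webfile does not end with web.zip")
    return problems


def check_intermediate_file(text):
    """Map entry numbers (1-based) to problems; key 0 reports file-level issues."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    report = {}
    if len(lines) > MAX_DEVICES:
        report[0] = ["more than 2000 devices"]
    for number, line in enumerate(lines, start=1):
        problems = check_intermediate_line(line)
        if problems:
            report[number] = problems
    return report
```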

Step 4 Check whether the configuration file, version file, patch file, and Web system file exist on the file server.

Search for the configuration file, version file, patch file, and Web system file specified in Option 67 and Option 145 or in the intermediate file.

l If the specified files do not exist on the file server, upload them to the server.

l If the specified files exist, check whether the vrpVer value in Option 145 or the intermediate file is the same as the current system software version. If they are different, change the vrpVer value to the actual system software version number.

If the specified files exist and the version number is correct, go to step 5.

Step 5 Check whether the switch has sufficient space for the files.

Run the dir command in the user view to check the available space in the storage device. If the space is not enough for the files, run the delete command to delete unneeded files.

NOTE

Alternatively, set opervalue to 1 in Option 146 on the DHCP server so that the switch will delete the previous system files when there is no space for the new files. Exercise caution when setting opervalue to 1. This configuration takes effect after the AutoConfig restarts.

Step 6 After all the files are uploaded to the file server, restart the AutoConfig process.

Use either of the following methods to restart the AutoConfig process:

l Run the autoconfig getting-file restart command in the system view to restart the AutoConfig process immediately.

l Wait for the AutoConfig process to restart automatically after the retry timer expires. The retry timer length is 30 minutes within 3 days after the first failure to obtain the files, and 2 hours after 3 days. If the switch fails to obtain the files within 30 days, it stops the AutoConfig process and does not reset the retry timer. In this case, run the autoconfig getting-file restart command in the system view to restart the AutoConfig process.
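The retry schedule described above can be expressed as a small function, which helps when estimating how long an unattended switch waits before its next attempt. This is a sketch under one assumption that the text does not settle: an elapsed time of exactly 3 days is treated as belonging to the 2-hour phase. The function name retry_interval is hypothetical.

```python
from datetime import timedelta


def retry_interval(since_first_failure):
    """Return the retry timer length, or None once automatic retries stop.

    `since_first_failure` is a timedelta measured from the first failure
    to obtain the files.
    """
    if since_first_failure >= timedelta(days=30):
        # After 30 days the AutoConfig process stops; a manual restart is needed.
        return None
    if since_first_failure < timedelta(days=3):
        return timedelta(minutes=30)
    return timedelta(hours=2)
```

For example, a switch that first failed five hours ago retries every 30 minutes, while one that first failed ten days ago retries every 2 hours.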

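Step 7 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log file, and alarm file of the S6700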


----End


Relevant Alarms

None.




Relevant Logs

None.


2.4 ALS Troubleshooting

2.4.1 Laser Status Does Not Change at ALS Pulse Intervals When a Fiber Link Fails

Common Causes

This fault is commonly caused by one of the following:

l Automatic laser shutdown (ALS) is disabled on an interface.

l The interface has been shut down.

l The optical module fails.

If the laser of an S6700 interface works in automatic restart mode but the laser status does not change at ALS pulse intervals when a fiber link fails, rectify the fault according to Figure 2-9.



Figure 2-9 Troubleshooting flowchart

The flowchart checks, in order, whether ALS is enabled, whether the laser status is always on or always off, whether the interface is shut down, and whether the optical module is faulty.





Procedure

Step 1 Check whether ALS is enabled.

Run the display als configuration command to check the ALS Status field. If the value of the ALS Status field is Enable, ALS is enabled. If the value of the ALS Status field is Disable, ALS is disabled.

l If ALS is disabled, run the als enable command on the interface to enable ALS.

l If ALS is enabled, go to step 2.






Step 2 Check whether the laser status is always on or off.

By default, after ALS is enabled, if there is a fault on the link, the laser automatically turns on and off according to the default pulse interval and width. Run the display als configuration command repeatedly to check the Laser Status field. The value of the Laser Status field should alternate between "on" and "off".

NOTE

By default, a laser works in automatic restart mode, and the ALS pulse interval and width are 100s and 2s. That is, a laser automatically turns on for 2s every 100s. The laser works for a short period of time; therefore, the laser status may be always displayed as off when you run the display als configuration command. Increase the ALS pulse width to check whether the laser status changes.

l If the value of Laser Status is always Off, go to step 3.

l If the value of Laser Status is always On, go to step 5.
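The reason the laser so often appears to be off can be made concrete with a little arithmetic. The sketch below assumes that the laser is on for the pulse width once per pulse interval and that each run of the display command samples the laser at an independent, random moment. Both are simplifying assumptions, and the function names are hypothetical.

```python
def laser_on_fraction(pulse_width_s, pulse_interval_s):
    """Fraction of time the laser is on (duty cycle)."""
    if pulse_interval_s <= 0 or not 0 <= pulse_width_s <= pulse_interval_s:
        raise ValueError("expected 0 <= width <= interval and interval > 0")
    return pulse_width_s / pulse_interval_s


def chance_of_seeing_on(pulse_width_s, pulse_interval_s, checks):
    """Probability that at least one of `checks` random samples sees 'on'."""
    miss = 1 - laser_on_fraction(pulse_width_s, pulse_interval_s)
    return 1 - miss ** checks
```

With the default values, laser_on_fraction(2, 100) is 0.02, so a single check sees the laser on only about 2% of the time. Increasing the pulse width raises that fraction, which is why the NOTE recommends it.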

Step 3 Check whether the interface has been shut down.

l If the interface has been shut down, run the undo shutdown command to restart the interface.

l If the interface has not been shut down, go to step 4.

Step 4 Check whether the optical module fails.

l If the optical module fails, replace the optical module.

l If the optical module is running properly, go to step 5.


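Step 5 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log file, and alarm file of the S6700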

----End


Relevant Alarms

None.

Relevant Logs

None.

2.4.2 Interface Cannot Become Up After the Fiber Link Recovers

Common Causes


This fault is commonly caused by one of the following:

l The interface has been shut down or the optical module is damaged.

l The laser works in manual restart mode but has not been started manually.



l The laser works in auto restart mode but the ALS pulse interval is too long or the pulse width is too short.



The troubleshooting flowchart for this fault checks, in order, whether the interface is shut down, whether the fiber or optical module is faulty, whether the laser works in manual restart mode, and whether the ALS pulse interval and width are correct.




Procedure

Step 1 Check whether the interface has been shut down.



l If the interface has been shut down, run the undo shutdown or restart command to restart the interface.

l If the interface has not been shut down, go to step 2.

Step 2 Check whether the optical module or fiber fails.

l If the optical module fails, replace the optical module.

l If the fiber fails, replace the fiber.

l If the optical module and fiber are running properly, go to step 3.

Step 3 Check whether the laser works in manual restart mode.

Run the display als configuration command to check the Restart Mode field. If the value of the Restart Mode field is Manual, the laser works in manual restart mode. If the value of the Restart Mode field is Auto, the laser works in automatic mode.

l If the lasers at the two ends of the link work in manual restart mode, run the als restart command at one end to manually start the lasers. If the link is still Down, go to step 5.

l If either of the two lasers works in automatic mode, go to step 4.

Step 4 Check whether the ALS pulse interval and width are correct.

Run the display als configuration command on the end where the laser works in automatic mode to check the ALS pulse interval and width.

In the command output, the Interval(s) field indicates the ALS pulse interval and the Width(s) field indicates the ALS pulse width. By default, the ALS pulse interval is 100s and the ALS pulse width is 2s. That is, the laser works for 2 seconds every 100 seconds. If the ALS pulse interval is long, the peer device waits for a long time to receive pulses. During this period, an LoS persists on the interface and the interface cannot go Up.

To change the ALS pulse interval, run the als restart pulse-interval command on the interface.

To change the ALS pulse width, run the als restart pulse-width command on the interface.

If the fault persists, go to step 5.



----End


Relevant Alarms

None.

Relevant Logs

None.






2.5 Telnet Troubleshooting

2.5.1 The User Fails to Log in to the Server Through Telnet

Common Causes

This fault is commonly caused by one of the following:

l The route is unreachable, and the user cannot set up a TCP connection with the server.

l The number of users logging in to the server reaches the upper threshold.

l An ACL is configured in the VTY user interface view.

l The access protocol specified in the VTY user interface view is incorrect. For example, when the access protocol is configured to SSH through the protocol inbound ssh command, the user cannot log in to the server through Telnet.


Figure 2-11 Troubleshooting flowchart for the fault that the client fails to log in to the server through Telnet

The flowchart checks, in order, whether the client can ping the server, whether all the current VTY channels are in use, whether an ACL restricting source IP addresses is configured, whether the user access protocol is set to all or telnet, and whether the authentication mode is configured.


NOTE

Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.




Procedure

Step 1 Check whether the Telnet client can ping the server.

Run the ping command to check the network connectivity. If the ping fails, the Telnet connection cannot be established between the user and server.

If the ping fails, see 6.2.1 A Ping Operation Fails to locate the problem so that the Telnet client can ping the server.

Step 2 Check whether the number of users logging in to the server reaches the upper threshold.

Log in to the server through a console interface and then run the display users command to check whether all the current VTY channels are in use. By default, a maximum of 5 users can log in to the server through VTY channels. Run the display user-interface maximum-vty command to view the allowed maximum number of login users.

<Quidway> display user-interface maximum-vty

Maximum of VTY user:5

<Quidway> display users

User-Intf Delay Type Network Address AuthenStatus AuthorcmdFlag

+ 0 CON 0 00:00:00 no

Username : Unspecified

34 VTY 0 00:13:39 TEL 10.138.78.107 no

Username : Unspecified

If the number of users logging in to the server reaches the upper threshold, you can run the user-interface maximum-vty vty-number command to increase the maximum number of users allowed to log in to the server through VTY channels, for example, to 15.

<Quidway> system-view

[Quidway] user-interface maximum-vty 15
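When the display users output is collected by a script, the comparison against the VTY limit can be automated. The following is a minimal sketch that assumes the output looks like the sample in this step (a numeric index followed by the channel type on each user line). The real column layout may differ between versions, and the function names are hypothetical.

```python
def count_vty_users(display_users_output):
    """Count lines that describe a VTY user in 'display users' output."""
    count = 0
    for line in display_users_output.splitlines():
        # A leading '+' marks the current user; ignore it when tokenizing.
        tokens = line.replace("+", " ").split()
        if len(tokens) >= 2 and tokens[0].isdigit() and tokens[1] == "VTY":
            count += 1
    return count


def vty_full(display_users_output, maximum_vty):
    """Return True if the number of VTY users has reached the limit."""
    return count_vty_users(display_users_output) >= maximum_vty
```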

Step 3 Check whether an ACL is configured in the VTY user interface view.

[Quidway] user-interface vty 0 4

[Quidway-ui-vty0-4] display this

user-interface vty 0 4

acl 2000 inbound

authentication-mode aaa

user privilege level 3

idle-timeout 0 0

If an ACL is configured but the IP address of the client to be permitted is not specified in the ACL, the user cannot log in to the server through Telnet. To enable a user with a specific IP address to log in to the server through Telnet, permit the IP address of the user in the ACL.
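Whether a given client address would pass an ACL can be reasoned about offline. The sketch below models rules as (action, source, wildcard) tuples and assumes that rules are evaluated in order, that the first match decides, and that an address matching no rule is not permitted. The rule representation and function names are hypothetical and do not reflect the device's internal data structures.

```python
import ipaddress


def matches(address, source, wildcard):
    """Wildcard-mask match: bits set to 0 in the wildcard must be equal."""
    addr = int(ipaddress.IPv4Address(address))
    src = int(ipaddress.IPv4Address(source))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (src & care)


def acl_permits(address, rules):
    """Return True if the first matching rule permits the address."""
    for action, source, wildcard in rules:
        if matches(address, source, wildcard):
            return action == "permit"
    # Assumption: no matching rule means the client is not permitted.
    return False
```

For example, with a rule list that denies 10.138.78.107 with wildcard 0.0.0.0 and then permits 10.138.78.0 with wildcard 0.0.0.255, the client 10.138.78.107 is rejected while 10.138.78.20 is accepted.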

Step 4 Check that the access protocol configured in the VTY user interface view is correct.

[Quidway] user-interface vty 0 4

[Quidway-ui-vty0-4] display this

user-interface vty 0 4



idle-timeout 0 0

protocol inbound ssh

Run the protocol inbound { all | ssh | telnet } command to configure the user access protocol.

By default, the user access protocol is Telnet.

l If the user access protocol is SSH, the user cannot log in to the server through Telnet.

l If the user access protocol is "all", the user can log in to the server through Telnet or SSH.

Step 5 Check that the authentication mode is configured in the user interface view.

l If you run the authentication-mode password command to configure the authentication mode for the user logging in to the server through the VTY channel to password, run the set authentication password command to set the authentication password.

l If you run the authentication-mode aaa command to configure the authentication mode to aaa, run the local-user command to add a local user.

Step 6 If the fault persists, collect the following information and contact Huawei technical support personnel:

l Results of the preceding troubleshooting procedures

l Configuration files, log files, and alarm files of the devices

----End


Relevant Alarms

None.

Relevant Logs

SHELL/4/TELNETFAILED:Failed to login through telnet. (Ip=[STRING], UserName=[STRING], Times=[ULONG])

2.6 FTP Troubleshooting

2.6.1 The User Fails to Log in to the Server Through FTP

Common Causes

This fault is commonly caused by one of the following:

l The route between the client and the server is unreachable.

l The FTP server is disabled.

l The FTP server listens on a non-default port, and the client does not specify that port when logging in to the server through FTP.

l The authentication information and working directory of the FTP user are not configured.

l The number of users logging in to the server through FTP reaches the upper threshold.

l An ACL rule is configured on the FTP server to limit client access.


If the client fails to log in to the FTP server, rectify the fault according to Figure 2-12.

Figure 2-12 Troubleshooting flowchart for the fault that the client fails to log in to the FTP server

The flowchart checks, in order, whether the client can ping the server, whether FTP services are enabled, whether the FTP server listens on the default port, whether the FTP user is correctly configured, whether the number of FTP users reaches the upper threshold, and whether an ACL rule is configured on the FTP server. If the fault persists, seek technical support.





Procedure

Step 1 Check that the client and the server can successfully ping each other.

Run the ping command to check whether the client can successfully ping the FTP server.

<Quidway> ping 10.164.39.218

PING 10.164.39.218: 56 data bytes, press CTRL_C to break

Request time out

Request time out

Request time out

Request time out

Request time out

--- 10.164.39.218 ping statistics ---

5 packet(s) transmitted

0 packet(s) received

100.00% packet loss

l If the ping fails, the FTP connection cannot be established between the client and the server. Locate and rectify the connectivity fault so that the FTP client can ping the FTP server.

l If the ping succeeds, go to Step 2.

Step 2 Check that the FTP server is enabled.

Run the display ftp-server command in any view to check the status of the FTP server.

l If the FTP server is disabled, the command output is as follows:

<Quidway> display ftp-server

Info: The FTP server is already disabled.

Run the ftp server enable command in the system view to enable the FTP server.


[Quidway] ftp server enable

Info: Succeeded in starting the FTP server.

l If the FTP server is enabled, the command output is as follows.


FTP server is running

Max user number 5

User count 0

Timeout value(in minute) 30

Listening port 21

Acl number 0

FTP server's source address 0.0.0.0

Go to Step 3.

Step 3 Check that the FTP server listens on the default port.

1. Run the display tcp status command in any view to check the listening status of the current TCP ports, including the default FTP port 21.

<Quidway> display tcp status

TCPCB Tid/Soid Local Add:port Foreign Add:port VPNID State

2a67f47c 6 /1 0.0.0.0:21 0.0.0.0:0 23553 Listening

2b72e6b8 115/4 0.0.0.0:22 0.0.0.0:0 23553 Listening

3265e270 115/1 0.0.0.0:23 0.0.0.0:0 23553 Listening

2a6886ec 115/23 10.137.129.27:23 10.138.77.43:4053 0 Established

2a680aac 115/14 10.137.129.27:23 10.138.80.193:1525 0 Established

2a68799c 115/20 10.137.129.27:23 10.138.80.202:3589 0 Established

2. Run the display ftp-server command in any view to check the port on which the FTP server listens.


FTP server is running

Max user number 5

User count 0

Timeout value(in minute) 30

Listening port 21

Acl number 0

FTP server's source address 0.0.0.0

l If the FTP server does not listen on port 21, run the ftp server port command to set the listening port to 21.


[Quidway] undo ftp server

[Quidway] ftp server port 21

l If the FTP server listens on port 21, go to Step 4.
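If the display tcp status output is captured as text, the listening ports can be extracted with a few lines of code. This is a sketch that assumes each connection occupies a single line ending with its state, as in a terminal wide enough not to wrap the State column. The function name listening_ports is hypothetical.

```python
def listening_ports(tcp_status_output):
    """Return the set of local ports whose state is Listening."""
    ports = set()
    for line in tcp_status_output.splitlines():
        tokens = line.split()
        if not tokens or tokens[-1] != "Listening":
            continue
        # The first token that looks like 'a.b.c.d:port' is the local address.
        for token in tokens:
            if token.count(".") == 3 and ":" in token:
                ports.add(int(token.rsplit(":", 1)[1]))
                break
    return ports
```

If 21 is absent from the result while the FTP server is reported as running, the server is probably listening on a non-default port, and the client must specify that port when connecting.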

Step 4 Check that the authentication information and the authorization directory for the FTP user are configured.

l The name, password, and working directory are mandatory configuration items for an FTP user. A common cause of FTP login failures is that the working directory is not specified.

1. Run the aaa command to enter the AAA view.

2. Run the local-user user-name password { simple | cipher } password command to configure the name and password for a local user.

3. Run the local-user user-name ftp-directory directory command to configure the authorization directory for the FTP user.

l The access type is an optional item. By default, the system supports all access types. If one access type or several access types are configured, the user can log in to the server only through the configured access types.

Run the local-user user-name service-type ftp command to configure the access type to FTP.

l If the authentication information and authorization directory are configured for the FTP user, go to Step 5.

Step 5 Check whether the number of users logging in to the FTP server reaches the upper threshold.

Run the display ftp-users command to check whether the number of users logging in to the FTP server reaches 5.

l If the number of users logging in to the FTP server is greater than or equal to 5, run the quit command in the FTP client view to tear down the connection between a user and the FTP server.

l If the number of users logging in to the FTP server is smaller than 5, go to Step 6.






Step 6 Check whether an ACL rule is configured on the FTP server.

Run the display [ ipv6 ] ftp-server command to check whether an ACL rule is configured on the FTP server.

l If an ACL rule is configured, the system allows only the client with the IP address permitted by the ACL rule to log in to the FTP server. Ensure that the ACL permits the IP address of the client.

l If no ACL rule is configured, go to Step 7.

Step 7 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedures

l Configuration files, log files, and alarm files of the devices

----End


Relevant Alarms

None.

Relevant Logs

FTPS/5/LOGIN_OK:The user succeeded in login. (UserName="[string]", IpAddress=[string], VpnInstanceName="[string]")

FTPS/5/REQUEST:The user had a request. (UserName="[string]", IpAddress=[string], VpnInstanceName="[string]", Request=[string])

2.6.2 The FTP Transmission Fails

Common Causes

This fault is commonly caused by one of the following:

l The source path or the destination path of an FTP connection contains characters that the device does not support, such as a blank space.

l The number of files in the root directory of the FTP server reaches the upper threshold.

l The available space of the root directory of the FTP server is insufficient.





Procedure

Step 1 Check whether the source path or the destination path of the FTP connection contains characters that the device does not support, such as a blank space.

l If it does, change the path.

l If it does not, go to Step 2.

Step 2 Check whether the number of files in the root directory of the FTP server reaches the upper threshold.

At present, a maximum of 40 files can be saved in the root directory of the FTP server. When the number of files in the root directory reaches 40 and unnecessary files are not cleared in time, new files cannot be saved.

Run the dir command on the FTP server to view the number of files in the root directory of the FTP server.

l If the number of files in the root directory of the FTP server is greater than or equal to 40, run the delete command in the user view to delete unnecessary files to release the storage space.

l If the number of files in the root directory of the FTP server is smaller than 40, go to Step 3.

Step 3 Check that the available space of the root directory of the FTP server is sufficient.

Run the dir command on the FTP server to view the available space of the root directory on the FTP server.

l If the space is insufficient, run the delete /unreserved command in the user view to delete unnecessary files.

l If there is sufficient space, go to Step 4.
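The three checks in Steps 1 to 3 can be combined into a single pre-transfer check. The sketch below is illustrative: the file count and free space are assumed to have been read from the dir command output beforehand, the only unsupported character tested is whitespace (the one example the procedure names), and the function name transfer_problems is hypothetical.

```python
MAX_ROOT_FILES = 40  # limit stated in Step 2


def transfer_problems(path, root_file_count, free_bytes, file_size):
    """Return the reasons an FTP transfer is likely to fail."""
    problems = []
    if any(ch.isspace() for ch in path):
        problems.append("path contains a blank space")
    if root_file_count >= MAX_ROOT_FILES:
        problems.append("root directory already holds 40 or more files")
    if file_size > free_bytes:
        problems.append("insufficient free space")
    return problems
```

An empty result means that none of the three common causes applies; it does not rule out other faults such as network instability.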



----End


Relevant Alarms

None.

Relevant Logs

FTPS/3/TRS_FAIL:The user failed to transfer data. (UserName="[string]", IpAddress=[string], VpnInstanceName="[string]")

2.6.3 The FTP Transmission Rate Is Low

Common Causes

This fault is commonly caused by one of the following:



l The storage medium is the flash memory.

l Packets are retransmitted because the network is unstable.




Procedure

Step 1 Check whether the flash memory is used as the storage medium.

The S6700 uses the flash memory as the storage medium. The flash memory is fast to read but slow to write. Table 2-2 shows the FTP transmission data obtained in the laboratory. The data show that, compared with other storage media, the flash memory has the lowest writing rate.

Table 2-2 List of the FTP transmission rate

Item               get             put

Flash - Flash      0.55 kbit/s     0.51 kbit/s

Flash - hda        0.51 kbit/s     16.05 kbit/s

Flash - CFcard     1.63 kbit/s     58.66 kbit/s

hda - Flash        32.19 kbit/s    1.51 kbit/s

hda - hda          32.91 kbit/s    25.70 kbit/s

hda - CFcard       21.33 kbit/s    54.69 kbit/s

CFcard - Flash     51.23 kbit/s    0.55 kbit/s

CFcard - hda       40.19 kbit/s    14.23 kbit/s

CFcard - CFcard    33.21 kbit/s    59.14 kbit/s

Step 2 Check whether packets are retransmitted.

Capture packets on the client PC and analyze them with a packet analysis tool to check whether TCP packets are retransmitted. Packet retransmission is usually caused by network instability.

Figure 2-13 shows packets captured through Ethereal. As shown in the figure, a lot of TCP retransmissions are observed.
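The idea behind spotting retransmissions in a capture can be sketched in a few lines. This is a deliberately simplified heuristic, not what a packet analyzer actually implements: it treats a data segment as a retransmission when the same sequence number has already been seen in the same direction. The tuple layout (source, destination, sequence number, payload length) and the function name are hypothetical.

```python
def count_retransmissions(segments):
    """Count data segments whose (src, dst, seq) was already seen.

    `segments` is an iterable of (src, dst, seq, payload_len) tuples in
    capture order. Segments without payload (pure ACKs) are ignored.
    """
    seen = set()
    retransmissions = 0
    for src, dst, seq, payload_len in segments:
        if payload_len == 0:
            continue
        key = (src, dst, seq)
        if key in seen:
            retransmissions += 1
        else:
            seen.add(key)
    return retransmissions
```

A high count relative to the total number of data segments suggests that the low transfer rate comes from the network rather than from the storage medium.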



Figure 2-13 Diagram of packets captured through Ethereal



----End


Relevant Alarms

None.

Relevant Logs

None.

2.7 SNMP Troubleshooting

2.7.1 An SNMP Connection Cannot Be Established

Common Causes

This fault is commonly caused by one of the following:

l Packets cannot be exchanged between the host and the NMS.

l Configurations are incorrect.



Figure 2-14 Troubleshooting flowchart for the fault that an SNMP connection cannot be established

The flowchart checks whether the host and the NMS can ping each other and whether reachable routes exist between them, uses SNMP debugging on the host to check whether the host receives SNMP messages, and checks whether log messages indicate an SNMP communication failure. If the fault persists, contact Huawei technical support personnel.




Procedure

Step 1 Run the ping command to check whether the host and the NMS can successfully ping each other.

l If the ping succeeds, it indicates that the host and the NMS are reachable. Go to Step 2.



l If the ping fails, locate and rectify the connectivity fault so that the host and the NMS can ping each other.

Step 2 Run the display logbuffer command to check whether login failure logs exist on the host.

l If no login failure log exists on the host, go to Step 3.

l If login failure logs exist on the host, analyze the logs.

Table 2-3 Log description and solution

Log: Failed to login through SNMP, because the version was incorrect. (Ip=[STRING], Times=[ULONG])

Description: The SNMP version used by the NMS to send login requests is not supported on the host.

Solution:

1. Run the display snmp-agent sys-info version command to check whether the host supports the SNMP version used by the NMS to send login requests.

l If the host supports the SNMP version, go to substep 3.

l If the host does not support the SNMP version, go to substep 2.

2. Run the snmp-agent sys-info version command to configure the SNMP version supported by the host.

l If the fault is rectified, go to substep 4.

l If the fault persists, go to substep 3.

3. Contact Huawei technical support personnel.

4. End.

Log: Failed to login through SNMP, because the packet was too large. (Ip=[STRING], Times=[ULONG])

Description: The size of the packet received by the host exceeds the threshold.

Solution:

1. Run the snmp-agent packet max-size command to increase the maximum packet size allowed on the host.

l If the fault persists, go to substep 2.

l If the fault is rectified, go to substep 3.

2. Contact Huawei technical support personnel.

3. End.

Log: Failed to login through SNMP,because messages was failed to be added to the message list. (Ip=[STRING], Times=[ULONG])

Description: The message list is filled up.

Log: Failed to login through SNMP, because of the decoded PDU error. (Ip=[STRING], Times=[ULONG])

Description: An unknown error occurs during packet decoding.

Log: Failed to login through SNMP, because the community was incorrect. (Ip=[STRING], Times=[ULONG])

Description: The community string is incorrect.

Solution:

1. Run the display snmp-agent community command to view the community string configured on the host.

l If the community string used by the NMS to send a login request is the same as that configured on the host, go to substep 3.

l If the community string used by the NMS to send a login request is different from that configured on the host, go to substep 2.

2. Run the snmp-agent community command to configure a read-write community string, which must be identical with that used by the NMS.

3. Contact Huawei technical support personnel.

4. End.

Log: Failed to login through SNMP, because of the ACL filter function. (Ip=[STRING], Times=[ULONG])

Description: The IP address from which the NMS sends a login request is denied by the ACL.

Solution:

1. Run the display acl command to view the ACL configuration on the host.

l If the IP address from which the NMS sends login requests is denied by the ACL, go to substep 2.

l If the IP address from which the NMS sends login requests is permitted by the ACL, go to substep 3.

2. Run the rule command to enable the ACL to permit the IP address from which the NMS sends login requests.

3. Contact Huawei technical support personnel.

4. End.

Log: Failed to login through SNMP, because of the contextname was incorrect. (Ip=[STRING], Times=[ULONG])

Description: The "contextname" in the login request is incorrect.



----End
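When many login-failure logs have to be reviewed, the reason phrase in each log can be mapped to the descriptions in Table 2-3 automatically. The sketch below matches on the reason phrases as they appear in this document; the exact log wording on a given software version is an assumption, and the names REASONS and explain are hypothetical.

```python
# Reason phrases and descriptions taken from Table 2-3.
REASONS = [
    ("version was incorrect",
     "The SNMP version used by the NMS is not supported on the host."),
    ("packet was too large",
     "The size of the packet received by the host exceeds the threshold."),
    ("added to the message list",
     "The message list is filled up."),
    ("decoded PDU error",
     "An unknown error occurs during packet decoding."),
    ("community was incorrect",
     "The community string is incorrect."),
    ("ACL filter function",
     "The NMS IP address is denied by the ACL."),
    ("contextname was incorrect",
     "The contextname in the login request is incorrect."),
]


def explain(log_line):
    """Return the Table 2-3 description for a login-failure log, or None."""
    for phrase, description in REASONS:
        if phrase in log_line:
            return description
    return None
```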


Relevant Alarms

None.

Relevant Logs

SNMP/4/ACL_FAILED

SNMP/4/AR_PAF_FAILED

SNMP/6/CNFM_VERSION_DISABLE

SNMP/4/COMMUNITY_ERR

SNMP/4/CONTEXTNAME_ERR

SNMP/4/DECODE_ERR

SNMP/4/INVAILDVERSION

SNMP/4/MSGTBL_ERR

SNMP/4/PACKET_TOOBIG

SNMP/4/PARSE_ERR

SNMP/4/SNMP_SET

SNMP/4/TRAP_SEND_ERR

SNMP/4/SHORT_VB

2.7.2 The NMS Fails to Receive Trap Messages from the Host


Common Causes

This fault is commonly caused by one of the following:

l The trap message is lost.

l The SNMP configuration on the host is incorrect. As a result, the host is unable to send trap messages.

l No trap message is generated on the host-side service module, or the trap message is generated on the host-side service module, but the format of the trap messages is incorrect. As a result, the trap message cannot be sent.



Figure 2-15 Troubleshooting flowchart used when the NMS fails to receive trap messages from the host

The flowchart checks whether the host is correctly configured. If it is not, reconfigure the host. If it is, observe the system log and rectify the fault according to the manual.


Context

NOTE


Procedure

Step 1 Check whether the SNMP configurations on the host are correct.

l If the SNMP configurations are correct, go to Step 2.

l If the SNMP configurations are incorrect, change the configuration according to the following configuration cases.






Table 2-4 Typical SNMP configurations

Configuration case: Configure a destination host running SNMPv2c, with the destination port being the default 162, the username being huawei, and the IP address being 192.168.1.1.

NOTE: huawei must be an existing username.

Command:

[Quidway] snmp-agent target-host trap address udp-domain 192.168.1.1 params securityname huawei v2c

Configuration case: SNMPv2c, with the destination port being the default 162, the username being huawei, and the IP address being 192.168.1.1. Trap messages are sent through a VPN network named VPN-Test.

Command:

[Quidway] snmp-agent target-host trap address udp-domain 192.168.1.1 udp-port 162 vpn-instance VPN-TEST params securityname huawei v2c

Configuration case: SNMPv3, with the username being huawei. The user belongs to the user group named huawei_group and has Huawei_view as the notify rights (notify-view).

NOTE: With Huawei_view, the user can access all nodes from the iso subtree.

Command:

# Configure a MIB view.

[Quidway] snmp-agent mib-view included Huawei_view iso

# Configure a user group.

[Quidway] snmp-agent group v3 huawei_group read-view Huawei_view write-view Huawei_view notify-view Huawei_view

# Configure a user.

[Quidway] snmp-agent usm-user v3 huawei huawei_group

Configuration case: SNMPv3, with the username being huawei and the IP address being 192.168.1.1.

Command:

[Quidway] snmp-agent target-host trap address udp-domain 192.168.1.1 params securityname huawei v3

Configuration case: SNMPv3, with the destination port being 163, the username being huawei, and the IP address being 192.168.1.1. Trap messages are sent through a VPN network named VPN-Test.

Command:

[Quidway] snmp-agent target-host trap address udp-domain 192.168.1.1 udp-port 163 vpn-instance VPN-TEST params securityname huawei v3

Step 2 Run the display snmp-agent trap all command to check whether the trap function is enabled on all feature modules.

l If the trap function is not enabled on all feature modules, go to Step 3.



l If the trap function is enabled on all feature modules, go to Step 4.

Step 3 Run the snmp-agent trap enable feature-name trap-name command to enable the host to send trap messages and configure parameters for trap messages.

l If the NMS can receive trap messages sent from the host, go to Step 7.

l If the NMS fails to receive trap messages sent from the host, go to Step 4.

Step 4 Check whether the log message indicating that a specific trap is generated exists on the host.

l If the log message indicating that a specific trap is generated does not exist on the host, it indicates that the trap is not generated. In this case, go to Step 6.

l If the log message indicating that a specific trap is generated exists on the host, it indicates that the trap has been generated but the NMS fails to receive the trap message. In this case, go to Step 5.

NOTE

The log message indicating that a specific trap is generated is as follows:

#Jun 10 2010 09:55:03 Quidway IFNET/2/IF_PVCDOWN:OID 1.3.6.1.6.3.1.1.5.3 Interface 109 turned into DOWN state.
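When many log lines must be searched for trap records, a small script can pick out the fields. The following is a minimal sketch, assuming log lines shaped like the sample in this NOTE (module/severity/mnemonic, followed by "OID", the object identifier, and free text). The pattern and the function name parse_trap_log are hypothetical, and other log formats on the device may not match.

```python
import re

# Pattern for "<MODULE>/<severity>/<MNEMONIC>:OID <oid> <text>", as in the sample line.
TRAP_LOG = re.compile(
    r"(?P<module>[A-Z0-9_]+)/(?P<severity>\d)/(?P<mnemonic>[A-Z0-9_]+):"
    r"\s*OID\s+(?P<oid>\d+(?:\.\d+)+)\s+(?P<text>.*)"
)


def parse_trap_log(line: str):
    """Return the fields of a trap log line, or None if the line does not match."""
    match = TRAP_LOG.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    return fields


sample = (
    "#Jun 10 2010 09:55:03 Quidway IFNET/2/IF_PVCDOWN:"
    "OID 1.3.6.1.6.3.1.1.5.3 Interface 109 turned into DOWN state."
)
fields = parse_trap_log(sample)
```

If parse_trap_log returns None for every line in the log, no trap record of this shape was found, which corresponds to the "trap is not generated" branch of this step.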

Step 5 Configure trap messages to be sent in Inform mode.

NOTE

Trap messages are transmitted through UDP. UDP transmission is unreliable, which may cause trap messages to be lost on the link. The Inform mechanism ensures that trap messages are sent in a reliable manner. For configuration details, refer to the chapter "SNMP Configuration" in the S6700 Series Configuration Guide - Network Management.

l If the NMS can receive trap messages sent from the host, go to Step 7.

l If the NMS fails to receive trap messages sent from the host, go to Step 6.


----End


Relevant Alarms

None.

Relevant Logs

None.

2.8 RMON Troubleshooting

2.8.1 NM Station Cannot Receive RMON Alarms

Common Causes

This fault is commonly caused by one of the following:

l There is no reachable route between the switch and the NM station.

l The SNMP alarm function is not configured correctly.

l rmonStatsTable is not configured.

l The RMON statistics function is not enabled.

l The RMON eventTable is not enabled.

l The RMON alarmTable is not enabled.

l The alarm variables are not configured correctly.


When the traffic that flows in and out of the LAN exceeds the configured threshold, the NM station fails to receive alarms.

Figure 2-16 Troubleshooting flowchart for the failure of the NMS to receive RMON alarms


NOTE







Procedure

Step 1 Check whether there are reachable routes between the switch and the NM station.

Ping the NM station interface from the switch.

l If the ping succeeds, the routes between the switch and the NM station are reachable. Go to Step 2.

l If the ping fails, check the routes between the switch and the NM station. For details of routing

troubleshooting, see the section


.

Step 2 Check that the SNMP alarm function is configured correctly.

Check whether the NM station can receive other alarms. If not, do as follows:

l Run the display snmp-agent trap feature-name rmon all command to check whether the alarm function of the switch is enabled.

l Run the display snmp-agent target-host command to check whether the NM address through which the switch sends alarms is correct.

Step 3 Check that the RMON statistics function is enabled.

Run the display rmon statistics [ gigabitethernet interface-number | xgigabitethernet interface-number ] command on the switch to check whether the RMON statistics function is enabled on the interface. If no statistics are recorded in the table, run the rmon-statistics enable command to enable the RMON statistics function on the interface.

Step 4 Check that rmonStatsTable is configured.

Run the display rmon statistics [ gigabitethernet interface-number | xgigabitethernet interface-number ] command on the switch to check whether rmonStatsTable is configured. If rmonStatsTable is empty, run the rmon statistics entry-number [ owner owner-name ] command to create entries in the table.

Step 5 Check that the RMON eventTable is enabled.

Run the display rmon event [ entry-number ] command on the switch to check whether the RMON eventTable is enabled. If the eventTable is empty, run the rmon event command to create entries in the table.

Step 6 Check that the RMON alarmTable is enabled.

Run the display rmon alarm [ entry-number ] command on the switch to check whether the RMON alarmTable is enabled. If the alarmTable is empty, run the rmon alarm command to create entries in the table.

Step 7 Check that the alarm variables are configured correctly.

Run the display rmon alarm [ entry-number ] command on the switch interface to view the value of the configured alarm variables. On the NM station, check that the values of alarm variables of the interface are consistent with those values configured on the switch interface. If the values are inconsistent, modify the values of the alarm variables to be consistent.

After the preceding operations are complete, if the NM station cannot receive the alarm values of the RMON module on the switch, contact the Huawei technical personnel.
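The order of Step 1 to Step 7 can be captured as a simple checklist that returns the first action still outstanding. The following is a minimal sketch. The observation keys are hypothetical labels chosen for this example, and the engineer supplies each result after running the corresponding display or ping command manually.

```python
# Ordered checks mirroring Step 1 to Step 7; the key names are hypothetical labels.
RMON_CHECKS = [
    ("route_reachable", "Step 1: fix the route to the NM station"),
    ("snmp_alarm_configured", "Step 2: correct the SNMP alarm configuration"),
    ("statistics_enabled", "Step 3: enable RMON statistics on the interface"),
    ("stats_table_configured", "Step 4: create rmonStatsTable entries"),
    ("event_table_configured", "Step 5: create eventTable entries"),
    ("alarm_table_configured", "Step 6: create alarmTable entries"),
    ("alarm_variables_consistent", "Step 7: align the alarm variables"),
]


def next_action(observations: dict) -> str:
    """Return the action for the first check that is not confirmed as passing."""
    for key, action in RMON_CHECKS:
        if not observations.get(key, False):
            return action
    return "All checks pass: contact Huawei technical personnel"
```

A check that has not been performed yet is treated as failing, so the function never skips a step that was not verified.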

----End





Relevant Alarms

None.

Relevant Logs

None.

2.9 NQA Troubleshooting

2.9.1 A UDP Jitter Test Instance Fails to Be Started


Common Causes

This fault is commonly caused by one of the following:

l The mandatory parameter of the test instance is incorrect.


Figure 2-17 Troubleshooting flowchart used when a UDP Jitter test instance fails to be started




NOTE


All the following commands, except the display commands, are used in the NQA test instance view. The display commands can be used in any views.

Procedure

Step 1 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA client or the display this command in the NQA test instance view to check whether the test type is Jitter.

l If the test type is Jitter, go to Step 2.

l If the test type is not Jitter, run the test-type jitter command to configure the test type to UDP Jitter.

– If the fault is rectified, the operation ends.

– If the fault persists, go to Step 2.

Step 2 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA client or the display this command in the NQA test instance view to check whether the destination IP address is configured.

l If the destination IP address is configured, go to Step 3.

l If the destination IP address is not configured, run the destination-address ipv4 ip-address command in the NQA test instance view to configure the destination IP address.

– If the fault persists, go to Step 3.

Step 3 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA client or the display this command in the NQA test instance view to check whether the destination port is configured.

l If the destination port is configured, go to Step 4.

l If the destination port is not configured, run the destination-port port-number command in the NQA test instance view to configure the destination port.



----End


Relevant Alarms

None.




Relevant Logs

None.

2.9.2 A Drop Record Exists in the UDP Jitter Test Result


Common Causes

If the UDP jitter test result has drop records, the value of the "Drop operation number" field in the display nqa results command output is not 0.

This fault is commonly caused by one of the following:

l The destination IP address does not exist or the route to the network segment to which the destination IP address belongs does not exist in the routing table.

l The source IP address is incorrect.


Figure 2-18 Troubleshooting flowchart used when a drop record exists in the UDP jitter test


NOTE


Procedure

Step 1 Run the display ip routing-table command on the NQA client to check whether the route along the test path exists.



l If the route exists, run the ping command to check whether devices can successfully ping each other.

– If devices can successfully ping each other, go to Step 2.

– If devices cannot successfully ping each other, see


.

l If the route does not exist, run the corresponding command to reconfigure the route.

Step 2 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA client or the display this command in the NQA test instance view to check whether the source IP address is configured.

l If the source IP address is configured, run the display ip interface brief command on the NQA client to check whether the interface configured with the source IP address exists.

– If the interface exists, run the display ip routing-table command on the NQA server to check whether the route to the source IP address exists.

– If the route exists, run the ping command to check whether the source IP address is reachable.

– If the source IP address is reachable, go to Step 3.

–

If the source IP address is unreachable, see


.

– If the route does not exist, run the corresponding command to reconfigure the route.

– If the interface configured with the source IP address does not exist, run the corresponding command to reconfigure IP addresses and recheck the configuration about NQA.

l If the source IP address is not configured, go to Step 3.


----End


Relevant Alarms

None.

Relevant Logs

None.

2.9.3 A Busy Record Exists in the UDP Jitter Test Result

Common Causes

If the UDP jitter test result has busy records, the value of the "System busy operation number" field in the display nqa results command output is not 0.

This fault is commonly caused by one of the following:

l The VPN route instance that is configured in the UDP Jitter test instance is unreachable.



Figure 2-19 Troubleshooting flowchart used when a busy record exists in the UDP jitter test


NOTE


Procedure

Step 1 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA client or the display this command in the NQA test instance view to check whether the VPN instance is configured.

l If the VPN instance is configured, go to Step 2.

l If the VPN instance is not configured, go to Step 3.

Step 2 Run the ping -vpn-instance vpn-instance-name command on the NQA client to check whether the destination address is reachable.

l If the destination address is reachable, go to Step 3.

l If the destination address is unreachable, see the section


.


----End


Relevant Alarms

None.




Relevant Logs

None.

2.9.4 A Timeout Record Exists in the UDP Jitter Test Result


Common Causes

If the UDP jitter test result has timeout records, the value of the "operation timeout number" field in the display nqa results command output is not 0.

This fault is commonly caused by one of the following:

l The destination address does not exist, but the route to the network segment of the destination address exists in the routing table.

l The value of the parameter "nqa-jitter tag-version" is 2, and the receiver is not configured with a UDP server.


Figure 2-20 Troubleshooting flowchart used when a timeout record exists in the UDP jitter test


NOTE


Unless otherwise stated, all the following commands, except display commands that can be run in all views, need to be run in the NQA test instance view.






Procedure

Step 1 Run the ping command on the NQA client to check whether the route to the destination address is reachable.

l If the route to the destination address is reachable, go to Step 2.

l If the route to the destination address is unreachable, see the section 6.2.1 A Ping Operation Fails.

Step 2 Run the display this command in the system view on the NQA client to check whether the value of the parameter "nqa-jitter tag-version" is 2. When the value of this parameter is set to 1 (the default value), this parameter is not displayed in the configuration file. This parameter is displayed in the configuration file when its value is set to 2.

l If the value of the parameter "nqa-jitter tag-version" is 2, go to Step 3.

l If the value of the parameter "nqa-jitter tag-version" is not 2, go to Step 4.

Step 3 Run the display nqa-server command on the NQA server to check whether the nqa-server udpecho ip-address port-number command has been configured on the NQA server.

l If the nqa-server udpecho ip-address port-number command has been configured on the NQA server and the server is in the Active state, go to Step 4.

l If the nqa-server udpecho ip-address port-number command is not configured on the NQA server, run the command to configure the NQA server. Note that the IP address of the NQA server must be identical to the destination IP address configured through the destination-address ipv4 ip-address command on the NQA client. Also, the port number configured on the NQA server must be identical to that configured through the destination-port port-number command on the NQA client.


–



----End


Relevant Alarms

None.

Relevant Logs

None.

2.9.5 The UDP Jitter Test Result Is "Failed", "No Result" or "Packet Loss"



Common Causes

The UDP jitter test result displayed in the display nqa results command output can be "failed", "no result", or "packet loss". In the command output:

l If the "Completion" field is displayed as "failed", the test fails.

l If the "Completion" field is displayed as "no result", the test has no result.

l If the "lost packet ratio" field is not 0%, packet loss occurs.

This fault is commonly caused by one of the following:

l A drop record exists in the UDP jitter test result.

l A busy record exists in the UDP jitter test result.

l A timeout record exists in the UDP jitter test result.

l The TTL expires.

l The parameter frequency is incorrect.

l The parameter fail-percent is incorrect.
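The three field rules at the start of this section can be expressed as a short function. The following is a minimal sketch, assuming the values of the "Completion" and "lost packet ratio" fields have already been read from the display nqa results output. The function name classify_jitter_result is hypothetical.

```python
def classify_jitter_result(completion: str, lost_packet_ratio: float) -> list:
    """Return the symptoms described in this section for one test result.

    completion: text of the "Completion" field.
    lost_packet_ratio: value of the "lost packet ratio" field, in percent.
    """
    findings = []
    status = completion.strip().lower()
    if status == "failed":
        findings.append("failed")
    elif status == "no result":
        findings.append("no result")
    if lost_packet_ratio > 0:
        findings.append("packet loss")
    return findings
```

An empty list means that none of the three symptoms applies. A result can show two symptoms at once, for example "failed" together with "packet loss".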



Figure 2-21 Troubleshooting flowchart used when the UDP Jitter test result is "failed", "no result", or "packet loss"


NOTE


All the following commands, except the display commands, are used in the NQA test instance view. The display commands can be used in any views.

Procedure

Step 1 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA client or the display this command in the NQA test instance view to check whether the TTL is configured.



l If the TTL is configured, you can run the ttl number command in the NQA test instance view to set the value of the TTL to 255. If the fault persists after the TTL is set to 255, go to Step 2.

l If the TTL is not configured, you can run the ttl number command in the NQA test instance view to set the value of the TTL to 255. If the fault persists after the TTL is set to 255, go to Step 2.

Step 2 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA agent or the display this command in the NQA test instance view to check whether the parameter frequency is configured.

l If the parameter frequency is configured, compare the value of frequency with that of (interval x probe-count x jitter-packetnum). To ensure that the UDP Jitter test instance can complete normally, the value of frequency must be greater than that of (interval x probe-count x jitter-packetnum). If the value of frequency is not greater than that of (interval x probe-count x jitter-packetnum), run the frequency interval command in the NQA test instance view to increase the value of frequency.

l If the frequency is not configured or the fault persists after a proper frequency value is set, go to Step 3.
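The frequency rule in this step is a simple inequality and can be checked before the test instance is restarted. The following is a minimal sketch, assuming that frequency and interval are first converted to the same time unit, because the units used by the device commands may differ. The function names are hypothetical.

```python
def min_test_duration(interval, probe_count, jitter_packetnum):
    """Time needed to send all packets: interval x probe-count x jitter-packetnum."""
    return interval * probe_count * jitter_packetnum


def frequency_is_valid(frequency, interval, probe_count, jitter_packetnum):
    """The frequency must be strictly greater than the time needed for one test."""
    return frequency > min_test_duration(interval, probe_count, jitter_packetnum)
```

For example, with an interval of 20 ms, a probe-count of 3, and a jitter-packetnum of 20, one test needs 1200 ms. A frequency of 1000 ms is therefore too small, and a frequency of 2000 ms satisfies the rule.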

Step 3 Run the display nqa-agent admin-name test-name [ verbose ] command on the NQA agent or the display this command in the NQA test instance view to check whether the parameter fail-percent is configured.

l If the fail-percent is configured, run the undo fail-percent command in the NQA test instance view to delete the fail-percent. If the fault persists after the fail-percent is deleted, go to Step 4.

l If the fail-percent is not configured, go to Step 4.


----End


Relevant Alarms

None.

Relevant Logs

None.

2.10 NTP Troubleshooting

2.10.1 The Clock is not Synchronized




Common Causes

This fault is commonly caused by one of the following:

l The link flaps.

l The link is faulty.


Context

NOTE


Procedure

Step 1 Check the NTP status.

[Quidway] display ntp-service status

clock status: unsynchronized

clock stratum: 16

reference clock ID: none

nominal frequency: 100.0000 Hz

actual frequency: 99.9995 Hz

clock precision: 2^18

clock offset: 0.0000 ms

root delay: 0.00 ms

root dispersion: 0.00 ms

peer dispersion: 0.00 ms

reference time: 14:25:55.477 UTC Jun 9 2010(CFBA22F3.7A4B76F6)

The "clock status" field is displayed as "unsynchronized", indicating that the local system clock is not synchronized with any NTP server or a reference clock.
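When the status must be checked on several devices, the command output can be turned into key-value pairs. The following is a minimal sketch, assuming output lines of the form "name: value" as in the sample above and assuming that a synchronized clock is reported with the value "synchronized". The function names are hypothetical.

```python
def parse_ntp_status(output: str) -> dict:
    """Split each "name: value" line at the first colon."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_synchronized(fields: dict) -> bool:
    # Assumption: the device prints "synchronized" once a clock source is selected.
    return fields.get("clock status") == "synchronized"


sample = """\
clock status: unsynchronized
clock stratum: 16
reference clock ID: none
reference time: 14:25:55.477 UTC Jun 9 2010(CFBA22F3.7A4B76F6)
"""
status = parse_ntp_status(sample)
```

Splitting at the first colon only keeps time values such as the "reference time" field intact.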

Step 2 Check the status of the NTP connection.

[Quidway] display ntp-service sessions

The value of the "reference" field is 0.0.0.0, indicating that the local system clock is not synchronized with any NTP server.

Step 3 Run the ping command on the NTP client to check the status of the link to the NTP server.

[Quidway] ping 20.1.14.1


Request time out

Request time out

Request time out

Request time out

Request time out

--- 20.1.14.1 ping statistics ---



100.00% packet loss

l The displayed information "100.00% packet loss" indicates that the link is faulty. To locate the fault, refer to


.

l If the packet loss percentage is greater than 0.00% but less than 100.00%, the link flaps. To locate the fault, refer to 6.2.1 A Ping Operation Fails.

l If the packet loss percentage is 0.00%, the link is normal. Then proceed to step 4.
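The three cases above depend only on the packet loss percentage in the ping statistics. The following is a minimal sketch, assuming the statistics contain a line ending in "% packet loss" as in the sample output. The pattern and the function names are hypothetical.

```python
import re

# Matches the percentage in a statistics line such as "100.00% packet loss".
LOSS = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss_percent(output: str):
    """Return the packet loss percentage, or None if the line is not found."""
    match = LOSS.search(output)
    return float(match.group(1)) if match else None


def classify_link(loss: float) -> str:
    """Map the packet loss percentage to the three cases of this step."""
    if loss >= 100.0:
        return "faulty"
    if loss > 0.0:
        return "flapping"
    return "normal"
```

A link classified as "normal" lets the procedure continue with the next step. The other two results point to link troubleshooting.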





----End


Relevant Alarms

None.

Relevant Logs

The following log information indicates that the clock source with which the local device synchronizes is lost.

NTP/4/SOURCE_LOST

The following log information indicates that the local clock has synchronized with a clock source.

NTP/4/LEAP_CHANGE

NTP/4/STRATUM_CHANGE

NTP/4/PEER_SELE

2.11 HGMP Troubleshooting

2.11.1 A Candidate Switch Directly Connected to the Administrator Switch Cannot Be Added to the Cluster

Common Causes

Two switches are directly connected. A cluster is created on one switch. The other switch, that is, a candidate switch, cannot be added to the cluster, and there is no prompt on the administrator switch.

This fault is commonly caused by one of the following:

l Packets cannot be exchanged between the administrator switch and candidate switch because either of the interfaces connecting them is Down.

l The basic Layer 2 forwarding configuration is incorrect.

l Layer 2 packet forwarding or transparent transmission of packets fails.

l Packets cannot be exchanged between the administrator switch and candidate switch because either of the interfaces that the packets pass through is blocked by a ring protocol.

l The cluster, NDP, or NTDP is incorrectly configured.

l The candidate switch has been added to the cluster and still remains in the cluster, and the new cluster to which the candidate switch is added has a different name from the current cluster.

l Authentication of the candidate switch fails due to inconsistent super passwords of the candidate switch and administrator switch.


Figure 2-22 Troubleshooting flowchart for the fault that a candidate switch directly connected to the administrator switch cannot be added to the cluster


NOTE


Procedure

Step 1 Check that basic configurations of the administrator and candidate switches are correct.

HGMP packets can be exchanged only when Layer 2 forwarding is normal. You need to ensure that the administrator and candidate switches are correctly configured so that they can exchange Layer 2 packets.

Ensure that the two switches are configured as follows:

l The two directly connected interfaces are added to the same VLAN.

l The VLAN is the cluster management VLAN, which is specified by running the mngvlanid vlan-id command in the cluster view. In addition, vlan-id specifies the VLAN to which the interfaces belong.

l The two interfaces are added to the VLAN in the same manner. For example, the port trunk allow-pass vlan vlan-id command is run on both interfaces with vlan-id being the same.

If the preceding configurations are correct, run the display vlan vlan-id command on both the administrator and candidate switches to check whether interfaces in the VLAN are Up. For example,

[Quidway] display vlan 1000

--------------------------------------------------------------------------------

U: Up; D: Down; TG: Tagged; UT: Untagged;

MP: Vlan-mapping; ST: Vlan-stacking;

#: ProtocolTransparent-vlan; *: Management-vlan;

--------------------------------------------------------------------------------

VID Type Ports

--------------------------------------------------------------------------------

1000 common TG:XGE0/0/1(U)

VID Status Property MAC-LRN Statistics Description

--------------------------------------------------------------------------------

1000 enable default enable disable VLAN 01000

l If the interfaces are Down, the physical link may be faulty. In this case, rectify the physical link fault.

l If the interfaces are Up, the Layer 2 protocol is normal. If the fault persists, either cluster configurations or packet processing at layers above Layer 2 may be incorrect. Go to Step 2.
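The Up or Down state of each interface appears in parentheses after the interface name in the "Ports" column of the display vlan output. The following is a minimal sketch that extracts these states, assuming port entries of the form name(U) or name(D), optionally prefixed with TG: or UT: as in the sample output. The pattern and the function name port_states are hypothetical.

```python
import re

# A port entry such as "TG:XGE0/0/1(U)" or "XGE0/0/2(D)".
PORT = re.compile(r"(?:(?:TG|UT):)?([A-Za-z][\w/.-]*)\((U|D)\)")


def port_states(output: str) -> dict:
    """Map each interface name to True (Up) or False (Down)."""
    return {name: state == "U" for name, state in PORT.findall(output)}


sample = "1000 common TG:XGE0/0/1(U)"
states = port_states(sample)
```

Any interface mapped to False corresponds to the Down case above and calls for a check of the physical link.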

Step 2 Check that the Layer 2 ring protocols on the interfaces of administrator and candidate switches run normally.

l If STP is enabled on administrator and candidate switches, check whether the interfaces running HGMP protocol are blocked by STP. Run the display stp brief command to check the interface status. For example,

[Quidway] display stp brief

MSTID Port Role STP State Protection

0 XGigabitEthernet0/0/1 ROOT FORWARDING NONE

0 XGigabitEthernet0/0/2 DESI FORWARDING NONE





If packets can be forwarded normally, the "STP State" field is displayed as FORWARDING on the interfaces running HGMP. If the "STP State" field is displayed as DISCARDING, the interface is blocked by STP and cannot forward HGMP packets. In this case, change the STP priority by running the stp priority priority-level command in the system view so that the switch can be elected as the root bridge and the interface can leave the DISCARDING state. priority-level ranges from 0 to 61440. The smaller the value, the higher the priority. The device with the smallest priority value is elected as the root bridge of the ring.

If the interfaces running HGMP are in the FORWARDING state, STP on the interfaces runs normally.

l If RRPP is configured on both administrator and candidate switches, check whether the interfaces running HGMP protocol are blocked by RRPP. Run the display rrpp verbose domain domain-index command to check the interface status. For example,

[Quidway] display rrpp verbose domain 1

Domain Index : 1

Control VLAN : major 1000 sub 1001

Protected VLAN : Reference Instance 1

Hello Timer : 1 sec(default is 1 sec) Fail Timer : 6 sec(default is 6 sec)

RRPP Ring : 1

Ring Level : 0

Node Mode : Master

Ring State : Failed

Is Enabled : Enable Is Actived : Yes

Primary port : XGigabitEthernet0/0/3 Port status: UP

Secondary port : XGigabitEthernet0/0/4 Port status: DOWN

If the "Port status" field is displayed as BLOCK, cluster packets on the interfaces running HGMP are blocked by RRPP. RRPP blocks secondary ports only. You need to change the blocked interface to a non-secondary port so that the interface leaves the blocked state.

If the interfaces running HGMP are in the UP state, RRPP on the interfaces runs normally. Go to Step 3.

NOTE

Only one ring protocol, in general, is configured on an interface. Check which ring protocol is configured on the interface before checking the interface status.
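For the STP check in this step, it can help to predict which switch becomes the root bridge before changing a priority. The following is a minimal sketch, assuming standard STP behavior in which the bridge with the smallest priority value wins and the smaller MAC address breaks a tie. The tie-break rule is not stated in this document, and the Bridge type and the elect_root function are hypothetical.

```python
from typing import NamedTuple


class Bridge(NamedTuple):
    name: str
    priority: int  # value set with stp priority; smaller means higher priority
    mac: str       # bridge MAC address, for example "0018-82af-fc38"


def elect_root(bridges):
    """Return the bridge with the smallest (priority, MAC address) pair."""
    return min(bridges, key=lambda b: (b.priority, b.mac.lower().replace("-", "")))


admin = Bridge("administrator", 32768, "0018-82af-fc38")
candidate = Bridge("candidate", 4096, "001c-2334-2312")
root = elect_root([admin, candidate])
```

In this example the candidate switch is elected because 4096 is smaller than 32768. With equal priorities, the administrator switch would win because its MAC address is smaller.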

Step 3 Check that basic NDP functions are normal.

Run the display ndp command on both the administrator and candidate switches to check whether NDP can successfully discover neighbors. If NDP can discover neighbors, information about the directly connected neighbors can be displayed. For example,

<Quidway> display ndp

Neighbor discovery protocol is enabled.

Neighbor Discovery Protocol Ver: 1, Hello Timer: 60(s), Aging Timer: 180(s)

Interface: XGigabitEthernet0/0/2

Status: Enabled, Packets Sent: 114, Packets Received: 108, Packets Error: 0

Neighbor 1: Aging Time: 174(s)

MAC Address : 0018-8203-39d8

Port Name : XGigabitEthernet0/0/1

Software Version: Version 5.70 V100R005C00SPC001

Device Name : Quidway

Port Duplex : FULL

Product Ver : S6700

If NDP cannot discover neighbors, check that NDP is configured as follows:

l NDP is globally enabled on both switches by running the ndp enable command in the system view.

l NDP is enabled on the two directly connected interfaces by running the ndp enable command in the interface view.

CAUTION

Debugging affects the performance of the system. Therefore, after debugging, run the undo debugging all command to disable it immediately.

If the NDP configurations are correct whereas NDP still cannot discover neighbors, collect the debugging information displayed by running the following commands and then contact Huawei technical support personnel.

l Run the terminal monitor and terminal debugging commands in the user view to enable monitoring debugging.

l Run the debugging ndp packet interface interface-type interface-number command in the user view to enable NDP debugging and collect the debugging information in three minutes.

If NDP can discover neighbors, go to Step 4.

Step 4 Check that basic NTDP functions are normal.

Check that NTDP is configured as follows:

l NTDP is globally enabled on both switches by running the ntdp enable command in the system view.

l NTDP is enabled on the two directly connected interfaces by running the ntdp enable command in the interface view.

l The cluster management VLAN is configured by running the mngvlanid vlan-id command in the cluster view. In addition, vlan-id specifies the VLAN to which the interface belongs.

If the NTDP configurations are incorrect, correctly configure NTDP.

If the NTDP configurations are correct, run the ntdp explore command on the administrator and candidate switches to discover topologies. After five seconds, run the display ntdp device-list command on the two switches to check whether NTDP can discover topologies. If NTDP can discover topologies, information about neighbors is displayed. For example:

[Quidway] display ntdp device-list

The device-list of NTDP:

------------------------------------------------------------------------------

MAC HOP IP PLATFORM

------------------------------------------------------------------------------

001c-2334-2312 1 1.1.1.2/24 S6700

0018-82af-fc38 0 1.1.1.1/24 S6700
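The device list above can also be checked with a script. The following Python sketch parses rows in the format shown in the sample. The function names are hypothetical, and the sketch assumes that the entry with hop 0 is the local device, so a topology counts as discovered only if at least one entry has a hop greater than 0.

```python
import re

# Sample rows copied from the device list shown above.
SAMPLE = """\
The device-list of NTDP:
------------------------------------------------------------------------------
MAC HOP IP PLATFORM
------------------------------------------------------------------------------
001c-2334-2312 1 1.1.1.2/24 S6700
0018-82af-fc38 0 1.1.1.1/24 S6700
"""

# One row: MAC address, hop count, IP address/mask length, platform.
ROW = re.compile(
    r"^([0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4})\s+(\d+)\s+(\S+)\s+(\S+)$"
)


def parse_ntdp_device_list(text):
    """Return one dict per device row (hypothetical helper)."""
    devices = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            mac, hop, ip, platform = match.groups()
            devices.append(
                {"mac": mac, "hop": int(hop), "ip": ip, "platform": platform}
            )
    return devices


def topology_discovered(devices):
    """Assumption: hop 0 is the local device; any hop > 0 is a neighbor."""
    return any(device["hop"] > 0 for device in devices)


if __name__ == "__main__":
    found = parse_ntdp_device_list(SAMPLE)
    print(len(found), "devices, topology discovered:", topology_discovered(found))
```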

If NTDP cannot discover topologies, collect the debugging information displayed by running the following commands on the two switches and then contact Huawei technical support personnel.




l Run the debugging ntdp all command in the user view to enable NTDP debugging.

l Run the ntdp explore command to discover topologies and the display ntdp device-list command to display the topologies.

If NTDP discovers topologies, go to Step 5.

NOTE

l A switch can be added to the cluster only if it has been discovered by NTDP on the administrator switch.

l Switches do not forward received NDP packets, so ring protocols cannot block NDP packets. Switches forward received NTDP packets, so NTDP packets may be blocked by ring protocols.

Step 5 Check that the basic cluster function is normal.

Check whether the cluster function is configured as follows:

l The cluster function is globally enabled on both switches by running the cluster enable command in the system view.

l VLANIF interfaces of the cluster management VLAN are configured on both switches by running the interface vlanif vlan-id command in the system view. vlan-id must be the same as that in the mngvlanid command configured in the cluster view.

l An available IP pool is configured on the administrator switch by running the ip-pool administrator-ip-address mask command in the cluster view.

l The IP addresses manually assigned to the VLANIF interfaces of the management VLAN do not reside in the IP pool configured by using the ip-pool command.

l Either no super password is configured on the administrator and candidate switches, or the same super password is configured on both.
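The IP pool condition in the preceding list can be verified offline. The following Python sketch uses the standard ipaddress module. The function name and the sample addresses are hypothetical, and the sketch assumes that the pool covers the whole subnet defined by the administrator IP address and mask.

```python
import ipaddress


def vlanif_addresses_in_pool(pool_address, pool_mask, vlanif_addresses):
    """Return the manually assigned addresses that fall inside the IP pool.

    Assumption: the pool is the subnet that contains pool_address with the
    given mask (dotted mask or mask length). An empty result means that the
    condition in the checklist is met.
    """
    pool = ipaddress.ip_network(f"{pool_address}/{pool_mask}", strict=False)
    return [
        address
        for address in vlanif_addresses
        if ipaddress.ip_address(address) in pool
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    conflicts = vlanif_addresses_in_pool(
        "10.0.0.1", "255.255.255.0", ["10.0.0.5", "192.168.1.1"]
    )
    print("Addresses inside the pool:", conflicts)
```

Any address returned by the function must be changed, or the pool must be moved to a different network segment.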

If the cluster configurations are incorrect, correctly configure the cluster function.

If the cluster configurations are correct, run the undo cluster enable command and then the cluster enable command on the switch to ensure that the switch does not belong to any cluster. Then delete the cluster on the administrator switch and create a new cluster. Check whether the candidate switch can be added to the new cluster.

l Run the undo build command in the cluster view to delete the existing cluster.

l Run the auto-build command to create a new cluster.

If the candidate switch still cannot be added to the cluster, collect the debugging information displayed by running the following commands on the two switches and then contact Huawei technical support personnel.


l Run the debugging cluster all command in the user view to enable cluster debugging.

l Manually add the candidate switch to the cluster by running the add-member mac-address mac-address command in the cluster view, and collect the command output displayed within 10 seconds.



----End





Relevant Alarms

HGMP/4/ClstMemStusChg:OID:[oid],DeviceID:[string], Role:[integer].

Relevant Logs

None.


2.12 LLDP Troubleshooting

This chapter describes common causes of the LLDP fault, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

2.12.1 An Interface Cannot Discover Neighbors

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the LLDP failure.

Common Causes

This fault is commonly caused by one of the following:

l The physical link is faulty.

l The LLDP function is not enabled.

l The BPDU function is not enabled on the interface.

l The LLDP transparent transmission function is not properly configured on the interface.





Figure 2-23 LLDP troubleshooting flowchart

[Flowchart: An interface fails to discover a neighbor. Check in sequence whether the physical links function properly (if not, rectify the link fault), whether LLDP is enabled (if not, enable LLDP), whether BPDU is enabled (if not, enable BPDU), and whether LLDP transparent transmission is configured properly (if not, configure it). After each corrective action, check whether the fault is rectified.]


NOTE


After you run the display lldp neighbor brief command, the output information shows that the device does not discover any neighbor.

Procedure

Step 1 Check that the physical links between devices function properly.

Run the display interface interface-type interface-number command to view the value of current state.

l If the value is DOWN, the link is faulty. Rectify the link fault.

l If the value is UP, the link functions properly. Go to step 2.

Step 2 Check that LLDP is enabled.

By default, if global LLDP is enabled, LLDP is enabled on all interfaces. To disable LLDP on an interface, run the undo lldp enable command in the interface view.



1. Check that the global LLDP function is enabled.

Run the display current-configuration command to check whether the output information contains the lldp enable command.

l If the lldp enable command cannot be found, run the lldp enable command to enable the global LLDP function.

l If the lldp enable command is contained, go to substep 2.

2. Check whether LLDP is disabled on the interface.

Run the display this command in the interface view to check whether the output information contains the undo lldp enable command.

l If the undo lldp enable command is contained, run the lldp enable command to enable the LLDP function on the interface.

l If the command is not found, go to step 3.

Step 3 Check whether BPDU is enabled on the interface.

Run the display this command in the interface view to check whether the output information contains the bpdu enable command.

l If not, LLDP packets are not sent to the CPU, and thus the interface cannot discover neighbors. Run the bpdu enable command to enable BPDU.

l If yes, go to step 4.

Step 4 Check whether LLDP transparent transmission is configured properly.

By default, LLDP transparent transmission is disabled on an interface. You can run the display this command in the interface view to check whether LLDP transparent transmission is enabled.

If the output information contains l2protocol-tunnel lldp enable, LLDP transparent transmission is enabled on the interface.

l If the interface has only one neighbor, LLDP transparent transmission must be disabled on the LLDP-enabled device; otherwise, the interface cannot discover the neighbor.

l If the interface has multiple neighbors, LLDP transparent transmission must be disabled on the LLDP-enabled device, but enabled on the intermediate device; otherwise, the interface cannot discover neighbors.

– To enable LLDP transparent transmission, run the l2protocol-tunnel lldp enable command in the interface view.

– To disable LLDP transparent transmission, run the l2protocol-tunnel lldp disable command in the interface view.

l If the configuration is incorrect, modify the configuration.

l If the configuration is correct, go to step 5.
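The configuration checks in Steps 2 to 4 can be combined into one pass over saved command output. The following Python sketch is illustrative only: the function name and the is_intermediate flag are hypothetical, and the inputs are assumed to be the text of the display current-configuration output and of the display this output in the interface view.

```python
def lldp_config_issues(global_config, interface_config, is_intermediate=False):
    """Return a list of LLDP configuration problems (hypothetical helper).

    is_intermediate is an assumed flag: True for a device that only relays
    LLDP packets between LLDP-enabled devices, False for an LLDP endpoint.
    """
    global_lines = [line.strip() for line in global_config.splitlines()]
    interface_lines = [line.strip() for line in interface_config.splitlines()]
    issues = []

    # Step 2, substep 1: global LLDP must be enabled.
    if "lldp enable" not in global_lines:
        issues.append("global LLDP is not enabled")
    # Step 2, substep 2: LLDP must not be disabled on the interface.
    if "undo lldp enable" in interface_lines:
        issues.append("LLDP is disabled on the interface")
    # Step 3: BPDU must be enabled so that LLDP packets reach the CPU.
    if "bpdu enable" not in interface_lines:
        issues.append("BPDU is not enabled on the interface")
    # Step 4: transparent transmission is off on endpoints, on for relays.
    tunnel = "l2protocol-tunnel lldp enable" in interface_lines
    if is_intermediate and not tunnel:
        issues.append("LLDP transparent transmission is not enabled")
    if not is_intermediate and tunnel:
        issues.append("LLDP transparent transmission must be disabled")
    return issues


if __name__ == "__main__":
    # Hypothetical configuration fragments for illustration only.
    print(lldp_config_issues(
        "sysname Quidway\nlldp enable\n",
        "interface XGigabitEthernet0/0/1\n bpdu enable\n",
    ))
```

An empty list means that none of the configuration causes described in this procedure applies.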



----End





Relevant Alarms

None.

Relevant Logs

None.


2.13 NAP-based Remote Deployment Troubleshooting

2.13.1 Fail to Log In to the Newly Deployed Device Through NAP

Common Causes

This fault is commonly caused by one of the following:

l The NAP configuration on the local device is incorrect.

l The connection between the master and slave devices has not been established or is unstable.



Figure 2-24 Troubleshooting flowchart for a failure to log in to the newly deployed device through NAP

[Flowchart: Login to the newly deployed device through NAP fails. Decision points: Is the interface set as the NAP master interface (if not, set it as the NAP master interface)? Is the interface Up (if not, bring it Up)? Is the NAP neighbor relationship set up? Does the peer device support NAP? Are the slave attributes enabled on the peer? Does the peer have no configuration? Are IP addresses assigned? Do the IP addresses conflict (if so, reconfigure the IP address or IP pool)? After each corrective action, check whether the fault is rectified.]


NOTE

l Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.

l Before troubleshooting, ensure that the new device with empty configuration supports NAP.

If you do not know whether the device to be logged in to has configurations, contact the on-site Huawei technical support personnel to confirm that the device does not have any configuration.

Procedure

Step 1 Check that the current interface is the NAP master interface.

Run the display nap interface command in any view to check the Port property field.

l If Master is displayed in this field, go to Step 2.

l If the value displayed in this field is not Master, run the nap port master command in the corresponding interface view to configure the interface as the NAP master interface.

If the interface cannot be configured with the nap port master command, the interface does not support NAP. Choose an interface of another type.

NOTE

Currently, Ethernet and Gigabit Ethernet interfaces support NAP.

Step 2 Check that the NAP master interface is in the DETECTING state.

Run the display nap interface command in any view to check the Current status field.

l If DETECTING is displayed in this field, run the display interface command to view the status of the NAP master interface.

– If the NAP master interface is Down, check whether the new device is physically connected and whether the current NAP master interface is connected to the new device.

– If the NAP master interface is Up, go to Step 4.

l If the value displayed in this field is not DETECTING, go to Step 3.

Step 3 Check that the NAP neighbor has obtained an IP address.

Run the display nap interface command in any view to check the Current status field.

l If Established is displayed in this field and the IP addresses of the NAP master and slave interfaces keep changing, the IP addresses allocated from the IP address pool conflict. Do as follows based on the number of master interfaces:

– If only one master interface exists, run the nap ip-pool command in the system view to configure the IP address pool.

– If two or more master interfaces exist, run the nap ip-address local local-ip peer peer-ip mask-length command in the current master interface view to configure IP addresses for the master and slave interfaces.

l If IP-ASSIGNED is displayed in this field, an IP address has been allocated to the NAP neighbor. Go to Step 4.
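The decisions in Steps 1 to 3 can be summarized as a small decision function. The following Python sketch is illustrative only: the function name and its parameters are hypothetical, and the field values (Master, DETECTING, Established, IP-ASSIGNED) are taken from the steps above.

```python
def nap_next_action(port_property, current_status, interface_up=True,
                    addresses_changing=False, master_count=1):
    """Map the fields of the display nap interface output to the next action.

    All parameter names are hypothetical; the branches follow Steps 1 to 3.
    """
    # Step 1: the interface must be the NAP master interface.
    if port_property != "Master":
        return "configure the interface as the NAP master interface (nap port master)"
    # Step 2: in the DETECTING state, the physical status decides.
    if current_status == "DETECTING":
        if not interface_up:
            return "check the physical connection to the new device"
        return "collect information and contact technical support"
    # Step 3: changing addresses in the Established state indicate a conflict.
    if current_status == "Established" and addresses_changing:
        if master_count == 1:
            return "reconfigure the IP address pool (nap ip-pool)"
        return "configure IP addresses on the master interface (nap ip-address)"
    if current_status == "IP-ASSIGNED":
        return "collect information and contact technical support"
    return "no action is described for this state"


if __name__ == "__main__":
    print(nap_next_action("Master", "DETECTING", interface_up=False))
    print(nap_next_action("Master", "Established",
                          addresses_changing=True, master_count=2))
```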

Step 4 Collect the following information, and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedures

l Configuration files, log files, and alarm files of the devices

----End


Relevant Alarms

NAP/4/NAP_STATUSCHANGE:OID 1.3.6.1.4.1.2011.5.25.206.3.1 Index [integer], the status of the nap port [octet] has changed to [integer], and the AbnormalReason is [integer].

Relevant Logs

NAP/6/GOTONEIGHBOR:Connected to the device on the slave interface end through the main interface[STRING].

2.14 sFlow Troubleshooting

This chapter describes common causes of sFlow faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

2.14.1 Target sFlow Collector Used to Receive Counter Sampling Data Cannot Receive Sampling Packets

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when a target sFlow collector fails to receive sampling packets.

Common Causes

This fault is commonly caused by one of the following:

l The global configuration of the sFlow agent and sFlow collector is incorrect.

l The sampling configuration on the sFlow agent interface is incorrect.

l The target sFlow collector is unreachable.


This section describes the troubleshooting flowchart for a failure to receive sampling packets.

The troubleshooting roadmap is as follows:

l Check whether the global configuration of the sFlow agent and sFlow collector is correct.

l Check whether the sampling configuration on the sFlow agent interface is correct.

l Check whether there is a reachable route from the sFlow agent to the sFlow collector.

Figure 2-25 Counter sampling troubleshooting flowchart

[Flowchart: The sFlow collector cannot receive sampling packets. Check in sequence whether the counter sampling parameters are set correctly (if not, set them correctly) and whether the sFlow agent can ping the sFlow collector (if not, rectify the ping failure). After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check that counter sampling parameters are correct.

Before enabling counter sampling, set parameters for an sFlow agent and sFlow collector.

Run the display sflow [ slot slot-id ] command in any view to check the sFlow configuration.

You must configure an sFlow collector and the target collector that receives counter sampling data. If an sFlow agent has no IP address configured, the sFlow agent uses the outbound interface address as the source IP address.

l If counter sampling parameters are not set, set the parameters according to the sFlow configuration guide. If the aging time of the sFlow collector configured in the system is reached, the sFlow collector cannot receive counter sampling packets.

l If counter sampling parameters are set correctly and the fault persists, go to step 2.

Step 2 Check whether the sFlow agent can ping the sFlow collector.

Run the ping sFlow collector IP address command in the view of the sFlow agent to check whether the sFlow collector can be pinged.

The sFlow agent needs to send sampling packets to the sFlow collector, so there must be a reachable route from the sFlow agent to the sFlow collector. If the ping operation fails, rectify the fault according to 6.2 Ping Troubleshooting.

If the target collector still cannot receive sFlow packets after the ping failure is rectified, run the debugging sflow packet command in the diagnosis view to debug sFlow packets. You can set a long sampling interval to reduce debugging information output and prevent high CPU usage. The following results may be displayed:

l If debugging information is displayed within the sampling interval, the sFlow agent has sampling packets and sends them to the sFlow collector. If the fault persists, go to step 3.

l If no debugging information is displayed within the sampling interval, the sFlow agent has no sampling packets or does not send them to the sFlow collector, so the sFlow collector does not receive sampling packets. Go to step 3.


Step 3 Collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, logs, and alarms of the S6700

----End


Relevant Alarms

None.

Relevant Logs

None.

2.14.2 Target sFlow Collector Used to Receive Flow Sampling Data Cannot Receive Sampling Packets

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when a target sFlow collector fails to receive sampling packets.

Common Causes

This fault is commonly caused by one of the following:

l The global configuration of the sFlow agent and sFlow collector is incorrect.

l The sampling configuration on the sFlow agent interface is incorrect.

l The target sFlow collector is unreachable.

l The sFlow agent interface does not work properly.

l The interface cannot receive or send traffic correctly.


The troubleshooting roadmap is as follows:

l Check whether the global configuration of the sFlow agent and sFlow collector is correct.

l Check whether the sampling configuration on the sFlow agent interface is correct.

l Check whether there is a reachable route from the sFlow agent to the sFlow collector.

l Check whether the sFlow agent interface status is Up.

l Check whether there are statistics on received and sent packets on the interface.

Figure 2-26 Flow sampling troubleshooting flowchart

[Flowchart: The sFlow collector cannot receive sampling packets. Check in sequence whether the flow sampling parameters are set correctly (if not, set them correctly), whether the sFlow agent can ping the sFlow collector (if not, rectify the ping fault), whether the interface is Up (if not, rectify the interface fault), and whether there are traffic statistics on the interface (if not, rectify the interface traffic fault). After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check that flow sampling parameters are correct.

Before enabling flow sampling, set parameters for an sFlow agent and sFlow collector.

Run the display sflow [ slot slot-id ] command in any view to check the sFlow configuration.

You must configure an sFlow collector and the target sFlow collector that receives counter sampling data. If an sFlow agent has no IP address configured, the sFlow agent uses the outbound interface address as the source IP address.

l If flow sampling parameters are not set, set the parameters according to the sFlow configuration guide. If the aging time of the sFlow collector configured in the system is reached, the sFlow collector cannot receive counter sampling packets. The target collector used to receive flow sampling data must be configured on the interface. By default, flow sampling is performed in the inbound and outbound directions. Other parameters can use default values.

l If flow sampling parameters are set correctly and the fault persists, go to step 2.

Step 2 Check whether the sFlow agent can ping the sFlow collector.

Run the ping sFlow collector IP address command in the view of the sFlow agent to check whether the sFlow collector can be pinged.

The sFlow agent needs to send sampling packets to the sFlow collector, so there must be a reachable route from the sFlow agent to the sFlow collector. If the ping operation fails, rectify the fault according to 6.2 Ping Troubleshooting.


Step 3 Check whether the interface configured with flow sampling is Up.

Run the display this interface command to view the interface status.

l If the interface is Down, rectify the fault according to 3.1.1 Connected Ethernet Interfaces Down.

l If the fault persists, go to step 4.

Step 4 Check whether there is incoming or outgoing traffic on the interface.

Run the display this interface command to view the interface status.

[Quidway-XGigabitEthernet0/0/1] display this interface
GigabitEthernet0/0/1 current state : UP
Line protocol current state : UP
Description:
Switch Port, PVID : 1, TPID : 8100(Hex), The Maximum Frame Length is 9216
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 00e0fc01-9845
Last physical up time : 2011-10-24 19:11:58
Last physical down time : 2011-10-24 19:11:47
Current system time: 2011-10-24 19:21:19
Port Mode: COMMON FIBER
Speed : 1000, Loopback: NONE
Duplex: FULL, Negotiation: ENABLE
Mdi : NORMAL
Last 300 seconds input rate 657184 bits/sec, 641 packets/sec
Last 300 seconds output rate 0 bits/sec, 0 packets/sec
Input peak rate 8660824 bits/sec, Record time: 2011-10-24 19:20:59
Output peak rate 1360 bits/sec, Record time: 2011-10-24 19:19:26
Input: 201246 packets, 25758890 bytes
Unicast: 201237, Multicast: 0
Broadcast: 9, Jumbo: 0
Discard: 0, Total Error: 0
CRC: 0, Giants: 0
Jabbers: 0, Fragments: 0
Runts: 0, DropEvents: 0
Alignments: 0, Symbols: 0
Ignoreds: 0, Frames: 0
Output: 394043847 packets, 28371156976 bytes
Discard: 8008, Total Error: 0
Collisions: 0, ExcessiveCollisions: 0
Late Collisions: 0, Deferreds: 0
Buffers Purged: 0
Input bandwidth utilization threshold : 100.00%
Output bandwidth utilization threshold: 100.00%
Input bandwidth utilization : 0.66%
Output bandwidth utilization : 0.00%

The preceding information shows the statistics on incoming and outgoing packets on the interface. If flow sampling is configured in the inbound or outbound direction and the sFlow collector cannot receive sampling packets, run the debugging sflow packet command in the diagnosis view to debug sFlow packets. Note that if the sampling ratio is large and the traffic is light, it takes a long time for the sFlow collector to receive sampling packets. The following results may be displayed:

l If debugging information is displayed, the sFlow agent has sampling packets and sends them to the sFlow collector. A small sampling ratio may result in a large amount of debugging information and high CPU usage. If the sFlow collector cannot receive sFlow packets, go to step 5.

l If there is no debugging information, the sFlow agent has no sampling packets or does not send them to the sFlow collector. Go to step 5.
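How long it takes for a sample to appear can be estimated from the packet rate on the interface. The following Python sketch is a rough estimate only: it assumes that a sampling ratio of N means that, on average, one packet out of every N is sampled. The function name and the ratio of 4096 are hypothetical, and the rate of 641 packets/sec is the input rate from the sample output in this step.

```python
def expected_seconds_between_samples(sampling_ratio, packets_per_second):
    """Estimate the average wait for one flow sample (hypothetical helper).

    Assumption: on average, one packet in every sampling_ratio packets is
    sampled. With no traffic, no sample is ever produced.
    """
    if packets_per_second <= 0:
        return float("inf")
    return sampling_ratio / packets_per_second


if __name__ == "__main__":
    # 641 packets/sec comes from the sample output; 4096 is a hypothetical ratio.
    wait = expected_seconds_between_samples(4096, 641)
    print(f"Average wait for one sample: about {wait:.1f} seconds")
```

If the estimated wait is much longer than the observation period, the lack of sampling packets does not necessarily indicate a fault.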


Step 5 Collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, logs, and alarms of the S6700

----End


Relevant Alarms

None.

Relevant Logs

None.



3 Physical Connection and Interfaces


3.1 Ethernet Interface Troubleshooting

This chapter describes common causes of Ethernet interface faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

3.2 Eth-Trunk Interface Troubleshooting

This chapter describes common causes of Eth-Trunk interface faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.






3.1 Ethernet Interface Troubleshooting

This chapter describes common causes of Ethernet interface faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

3.1.1 Connected Ethernet Interfaces Down

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when Ethernet interfaces between two devices cannot turn Up.

Common Causes

This fault is commonly caused by one of the following:

l The devices are powered off or the cable between the interfaces is not properly connected.

l The Ethernet interfaces are manually shut down.

l The fiber between the interfaces is too long or the attenuation is high.

l The interfaces, interface modules, or devices are faulty.




Figure 3-1 Troubleshooting flowchart for Ethernet interfaces in Down state

[Flowchart: An Ethernet interface is Down. Check in sequence whether the device is powered on and the cable is well connected (if not, power on the device and connect the cable properly), whether the interface is manually shut down (if so, run undo shutdown on the interface), whether the link and interface module work properly (if not, replace the cable or interface module), and whether the device hardware is faulty (if so, replace the hardware). After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check that the local and remote switches are powered on and that the cable and interface modules are installed properly.

If the fault persists, go to Step 2.

Step 2 Check that the interfaces are not manually shut down.

Run the interface interface-type interface-number command in the system view to enter the interface view, and then run the display this command to check the interface status.

If an interface was shut down by using the shutdown command, run the undo shutdown command in the interface view.

NOTE

If a Monitor Link group is configured on a switch, all downlink interfaces in the group are shut down when the uplink interface is deleted from the group or turns Down. If the uplink interface turns down, rectify the fault on the uplink interface.


If the fault persists, go to Step 3.

Step 3 Check that the interface modules and the link between the interfaces work properly.

Check item: Fiber working status
Criteria: The tester shows that optical signals are sent and received successfully. In a loopback test, the two interfaces are Up.
Follow-up operation: If optical signals cannot be sent or received, replace the fibers. If the fault persists, replace the optical modules.

Check item: Types of optical modules and fibers
Criteria: The fiber type matches the optical module type. For details about mappings between optical module types and fiber types, see "List of Optical Interface Attributes" in the hardware description.
Follow-up operation: If the fiber type does not match the optical module type, replace the optical modules or fibers.

Check item: Fiber length and maximum transmission distance of optical modules
Criteria: The fiber length is smaller than the maximum transmission distance of the optical modules. For the maximum transmission distance supported by different optical modules, see "List of Optical Interface Attributes" in the hardware description.
Follow-up operation: If the fiber length exceeds the maximum transmission distance of the modules, shorten the distance between the devices or use optical modules with a larger transmission distance.

Check item: Optical signal attenuation
Criteria: The tester shows that the optical signal attenuation is in the allowed range. For the attenuation range, see "List of Optical Interface Attributes" in the hardware description.
Follow-up operation: If the attenuation is high, replace the fibers. If the fault persists, shorten the distance between the devices and use shorter fibers.
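The criteria above can be applied to recorded measurements with a short script. The following Python sketch is illustrative only: the function name, the parameter names, and the sample values are hypothetical, and the actual limits must be taken from "List of Optical Interface Attributes" in the hardware description.

```python
def fiber_follow_ups(signal_ok, types_match, fiber_length_m, max_distance_m,
                     attenuation_db, max_attenuation_db):
    """Return the follow-up operations for the failed check items.

    All parameters are hypothetical inputs recorded by the engineer; an
    empty list means that every check item meets its criteria.
    """
    actions = []
    if not signal_ok:
        actions.append("replace the fibers; if the fault persists, "
                       "replace the optical modules")
    if not types_match:
        actions.append("replace the optical modules or fibers")
    if fiber_length_m > max_distance_m:
        actions.append("shorten the distance between the devices or use "
                       "optical modules with a larger transmission distance")
    if attenuation_db > max_attenuation_db:
        actions.append("replace the fibers; if the fault persists, "
                       "shorten the distance and use shorter fibers")
    return actions


if __name__ == "__main__":
    # Hypothetical measurements for illustration only.
    for action in fiber_follow_ups(True, True, 400, 300, 9.0, 7.0):
        print("Follow-up:", action)
```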



If the fault persists, go to Step 4.

Step 4 Check whether the local or remote device has a hardware fault.

Connect the devices using other interfaces. If the fault persists, go to Step 5.



----End


Relevant Alarms

None.

Relevant Logs

None.

3.1.2 An Ethernet Interface Frequently Alternates Between Up and Down

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when an Ethernet interface frequently alternates between Up and Down.

Common Causes

This fault is commonly caused by one of the following:

l The cable is not properly connected to the interface.

l The fiber connected to the interface is too long or the attenuation is high.

l The local and remote interfaces, interface modules, or devices are faulty.




Figure 3-2 Troubleshooting flowchart for an Ethernet interface frequently alternating between Up and Down

[Flowchart: An interface frequently alternates between Up and Down. Check in sequence whether the cable and interface module are well installed (if not, install them properly), whether the link and interface module work properly (if not, replace the cable or interface module), and whether the device hardware is faulty (if so, replace the hardware). After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check that the cable and interface modules are properly installed on the local and remote devices.


If the fault persists, go to Step 2.

Step 2 Check that the interface modules and the link between the interfaces work properly.



Check item: Fiber working status
Criteria: The tester shows that optical signals are sent and received successfully. In a loopback test, the two interfaces are Up.
Follow-up operation: If optical signals cannot be sent or received, replace the fibers. If the fault persists, replace the optical modules.

Check item: Types of optical modules and fibers
Criteria: The fiber type matches the optical module type. For details about mappings between optical module types and fiber types, see "List of Optical Interface Attributes" in the hardware description.
Follow-up operation: If the fiber type does not match the optical module type, replace the optical modules or fibers.

Check item: Fiber length and maximum transmission distance of optical modules
Criteria: The fiber length is smaller than the maximum transmission distance of the optical modules. For the maximum transmission distance supported by different optical modules, see "List of Optical Interface Attributes" in the hardware description.
Follow-up operation: If the fiber length exceeds the maximum transmission distance of the modules, shorten the distance between the devices or use optical modules with a larger transmission distance.

Check item: Optical signal attenuation
Criteria: The tester shows that the optical signal attenuation is in the allowed range. For the attenuation range, see "List of Optical Interface Attributes" in the hardware description.
Follow-up operation: If the attenuation is high, replace the fibers. If the fault persists, shorten the distance between the devices and use shorter fibers.

If the fault persists, go to Step 3.

Step 3 Check whether the local or remote device has a hardware fault.

l Connect the twisted pair or fibers to another interface.


If the fault persists, go to Step 4.



----End


Relevant Alarms

None.



Relevant Logs

None.

3.2 Eth-Trunk Interface Troubleshooting

This chapter describes common causes of Eth-Trunk interface faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

3.2.1 Eth-Trunk Interface Cannot Forward Traffic

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that an Eth-Trunk interface cannot forward traffic.

Common Causes

After an Eth-Trunk interface is configured, it cannot forward traffic.

This fault is commonly caused by one of the following:

l Eth-Trunk member interfaces are faulty.

l Configurations of Eth-Trunk member interfaces on the two ends are inconsistent.

l The number of Up Eth-Trunk member interfaces is smaller than the lower threshold.

l Negotiation between member interfaces of the Eth-Trunk interface in static LACP mode fails.


On the network shown in Figure 3-3, the Eth-Trunk interface cannot forward traffic.

Figure 3-3 Eth-Trunk network diagram

[Network diagram: SwitchA and SwitchB are connected through interfaces XGE0/0/8, XGE0/0/9, and XGE0/0/10 on each switch. The links are bundled into Eth-Trunk1.]

The troubleshooting roadmap is as follows:

l Check that Eth-Trunk member interfaces work properly.

l Check information about Eth-Trunk member interfaces on both ends.

l Check that the number of Up member interfaces is greater than the configured lower threshold.

l Check that LACP negotiation succeeds if the Eth-Trunk interface is in static LACP mode.

[Figure 3-4: Flowchart for an Eth-Trunk interface that cannot forward traffic. Check in sequence whether the Eth-Trunk member interfaces work properly (if not, check the physical links connecting member interfaces and rectify the link fault), whether member interfaces on both ends are consistent (if not, modify the configuration), whether the number of Up member interfaces is below the lower threshold (if so, change the lower threshold), and whether negotiation between Eth-Trunk interfaces working in static LACP mode fails (if so, locate the cause of the negotiation failure and modify the configuration). If the fault persists, collect information.]




Procedure

Step 1 Check that Eth-Trunk member interfaces work properly.

Run the display eth-trunk 1 command in any view to check the status of the Eth-Trunk interface.

[Quidway] display eth-trunk 1
Eth-Trunk1's state information is:
WorkingMode: NORMAL         Hash arithmetic: According to SA-XOR-DA
Least Active-linknumber: 1  Max Bandwidth-affected-linknumber: 4
Operate status: down        Number Of Up Port In Trunk: 0
--------------------------------------------------------------------------------
PortName                      Status      Weight
XGigabitEthernet0/0/8         Down        1
XGigabitEthernet0/0/9         Down        1
XGigabitEthernet0/0/10        Down        1

l If a member interface is Down, troubleshoot the physical interface first, following the troubleshooting procedure for physical interfaces.

l If the member interfaces are Up, verify that each cable is connected to the correct interface. If the cables are connected correctly, go to Step 2.
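The check in Step 1 can also be scripted when many trunks have to be inspected. The following is a minimal sketch, assuming the command output has already been captured as text in the layout shown above; the function name parse_eth_trunk and the sample string are illustrative and are not part of the S6700 software.

```python
import re

# Captured output of "display eth-trunk 1", copied from the example above.
SAMPLE = """\
Eth-Trunk1's state information is:
WorkingMode: NORMAL         Hash arithmetic: According to SA-XOR-DA
Least Active-linknumber: 1  Max Bandwidth-affected-linknumber: 4
Operate status: down        Number Of Up Port In Trunk: 0
--------------------------------------------------------------------------------
PortName                      Status      Weight
XGigabitEthernet0/0/8         Down        1
XGigabitEthernet0/0/9         Down        1
XGigabitEthernet0/0/10        Down        1
"""


def parse_eth_trunk(text):
    """Return the trunk status, the Up-port count, and the status of each member."""
    status = re.search(r"Operate status:\s*(\w+)", text).group(1).lower()
    up_count = int(re.search(r"Number Of Up Port In Trunk:\s*(\d+)", text).group(1))
    # Member rows are assumed to look like "<interface> <status> <weight>".
    rows = re.findall(r"^(XGigabitEthernet\S+)[ \t]+(\w+)[ \t]+\d+[ \t]*$", text, re.M)
    members = {name: state.lower() for name, state in rows}
    return {"status": status, "up_count": up_count, "members": members}


info = parse_eth_trunk(SAMPLE)
down_members = [name for name, state in info["members"].items() if state != "up"]
print(info["status"], info["up_count"], down_members)
```

Any interface listed in down_members is a candidate for physical-layer troubleshooting before the remaining steps are attempted.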

Step 2 Check information about Eth-Trunk member interfaces on both ends.

Check information about member interfaces of the Eth-Trunk interface on Switch A and Switch B.

[SwitchA] display eth-trunk 1
WorkingMode: NORMAL         Hash arithmetic: According to SA-XOR-DA
Operate status: up          Number Of Up Port In Trunk: 3
--------------------------------------------------------------------------------
XGigabitEthernet0/0/8         up          1

[SwitchB] display eth-trunk 1
Operate status: up          Number Of Up Port In Trunk: 2
--------------------------------------------------------------------------------
PortName                      Status      Weight
XGigabitEthernet0/0/9         up          1

l If the number of member interfaces of the Eth-Trunk interface on Switch A is different from that on Switch B, add the required physical interfaces to the Eth-Trunk interface so that both ends have the same number of members.

l If the number of member interfaces of the Eth-Trunk interface on Switch A is the same as that on Switch B, go to Step 3.
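The comparison in Step 2 amounts to a set difference between the member lists of the two ends. The following sketch assumes that identically numbered interfaces are cabled to each other, as in Figure 3-3; the member lists are hypothetical example data, and member_mismatch is an illustrative helper name.

```python
def member_mismatch(members_a, members_b):
    """Return the interfaces that are Eth-Trunk members on only one end."""
    only_a = sorted(set(members_a) - set(members_b))
    only_b = sorted(set(members_b) - set(members_a))
    return only_a, only_b


# Hypothetical member lists: XGE0/0/10 was never added to the trunk on SwitchB.
switch_a = ["XGigabitEthernet0/0/8", "XGigabitEthernet0/0/9", "XGigabitEthernet0/0/10"]
switch_b = ["XGigabitEthernet0/0/8", "XGigabitEthernet0/0/9"]

only_a, only_b = member_mismatch(switch_a, switch_b)
print("Missing on SwitchB:", only_a)
print("Missing on SwitchA:", only_b)
```

An interface reported as missing on one end is the one to add to the Eth-Trunk interface on that end.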

Step 3 Check whether the Eth-Trunk interface is configured with a lower threshold of Up member interfaces.

Run the display eth-trunk 1 command on Switch A and Switch B to view the configuration of the Eth-Trunk interface.




Least Active-linknumber: 4  Max Bandwidth-affected-linknumber: 4
Operate status: down        Number Of Up Port In Trunk: 3
--------------------------------------------------------------------------------

The preceding command output shows that the lower threshold of Up member interfaces of the Eth-Trunk interface has been set to 4. However, only 3 member interfaces are Up, which causes the Eth-Trunk interface to go Down.

l If the Eth-Trunk interface is configured with a lower threshold of Up member interfaces and the threshold is greater than the actual number of Up member interfaces, lower the threshold so that it does not exceed the number of Up member interfaces.

l If the Eth-Trunk interface is not configured with a lower threshold of Up member interfaces, or the number of Up member interfaces already meets the threshold, go to Step 4.
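The rule applied in Step 3 can be expressed as a single comparison. The sketch below assumes that the Eth-Trunk interface can be Up only when the number of Up member interfaces is at least the Least Active-linknumber value, which matches the example above (3 Up members against a threshold of 4); threshold_check is an illustrative name.

```python
import re

# Fragment of "display eth-trunk 1" output, copied from the example above.
SAMPLE = """\
Least Active-linknumber: 4  Max Bandwidth-affected-linknumber: 4
Operate status: down        Number Of Up Port In Trunk: 3
"""


def threshold_check(text):
    """Compare the number of Up member interfaces with the lower threshold."""
    least = int(re.search(r"Least Active-linknumber:\s*(\d+)", text).group(1))
    up = int(re.search(r"Number Of Up Port In Trunk:\s*(\d+)", text).group(1))
    return {"least_active": least, "up_ports": up, "meets_threshold": up >= least}


result = threshold_check(SAMPLE)
print(result)
```

When meets_threshold is False, either more member links must come Up or the threshold must be lowered.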

Step 4 Check whether Eth-Trunk interfaces work in static LACP mode.

Run the display eth-trunk 1 command on Switch A and Switch B to view the configuration of the Eth-Trunk interface.



Local:
LAG ID: 1                   WorkingMode: STATIC
Preempt Delay: Disabled     Hash arithmetic: According to SA-XOR-DA
System Priority: 32768      System ID: 0018-826f-fc7a
Least Active-linknumber: 1  Max Active-linknumber: 4
Operate status: down        Number Of Up Port In Trunk: 0
--------------------------------------------------------------------------------
ActorPortName          Status     PortType PortPri PortNo PortKey PortState Weight
XGigabitEthernet0/0/8  UnSelected 10G      32768   264    305     11100010  1

Partner:
--------------------------------------------------------------------------------
ActorPortName          SysPri SystemID       PortPri PortNo PortKey PortState
XGigabitEthernet0/0/8  32768  0018-823c-c473 32768   2056   305     11100010
XGigabitEthernet0/0/9  32768  0018-823c-c473 32768   2057   305     11100010
XGigabitEthernet0/0/10 32768  0018-823c-c473 32768   2058   305     11100010

l If the Eth-Trunk interface is configured to work in static LACP mode and no physical interface is selected, LACP negotiation is unsuccessful. Possible causes for unsuccessful LACP negotiation are as follows:

– Member interfaces fail, causing timeout of LACP protocol packets.

Connect the cable to another idle interface and add the interface to the Eth-Trunk.

– The Eth-Trunk interface on one end is configured to work in static LACP mode, whereas the Eth-Trunk interface on the other end is not.

Correct the configurations of the two ends of the Eth-Trunk link to make them consistent.

After the configurations are corrected and LACP negotiation succeeds, the output of the display eth-trunk 1 command is as follows:

[SwitchB] display eth-trunk 1
Local:
LAG ID: 1                   WorkingMode: STATIC
Preempt Delay: Disabled     Hash arithmetic: According to SA-XOR-DA
System Priority: 32768      System ID: 0018-826f-fc7a
Least Active-linknumber: 1  Max Active-linknumber: 4
Operate status: up          Number Of Up Port In Trunk: 3
--------------------------------------------------------------------------------
ActorPortName          Status     PortType PortPri PortNo PortKey PortState Weight
XGigabitEthernet0/0/8  Selected   10G      32768   264    305     11111100  1

Partner:
--------------------------------------------------------------------------------
ActorPortName          SysPri SystemID       PortPri PortNo PortKey PortState
XGigabitEthernet0/0/8  32768  0018-823c-c473 32768   2056   305     11111100
XGigabitEthernet0/0/9  32768  0018-823c-c473 32768   2057   305     11111100
XGigabitEthernet0/0/10 32768  0018-823c-c473 32768   2058   305     11111100

If LACP negotiation fails after the configurations are corrected, go to Step 5.

l If the Eth-Trunk interface is not configured to work in static LACP mode, go to Step 5.
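The PortState column in the static LACP output changes from 11100010 to 11111100 when negotiation succeeds. The following sketch decodes such a string into the eight state flags defined for LACP in IEEE 802.1AX. It assumes that the digits are listed starting with LACP_Activity and ending with Expired; this ordering is an assumption that should be confirmed against the command reference, and decode_port_state is an illustrative name.

```python
# LACP state flags, in the order assumed for the PortState string (assumption:
# first digit = LACP_Activity, last digit = Expired).
FLAG_NAMES = (
    "LACP_Activity", "LACP_Timeout", "Aggregation", "Synchronization",
    "Collecting", "Distributing", "Defaulted", "Expired",
)


def decode_port_state(bits):
    """Map an 8-digit PortState string to named boolean flags."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError(f"expected 8 binary digits, got {bits!r}")
    return {name: digit == "1" for name, digit in zip(FLAG_NAMES, bits)}


def is_forwarding(bits):
    """A port carries traffic only when it is in sync, collecting, and distributing."""
    state = decode_port_state(bits)
    return state["Synchronization"] and state["Collecting"] and state["Distributing"]


before = decode_port_state("11100010")  # value shown while negotiation fails
after = decode_port_state("11111100")   # value shown after negotiation succeeds
print([name for name, flag in before.items() if flag])
print([name for name, flag in after.items() if flag])
```

Under this assumed ordering, the failing value has Defaulted set and Synchronization cleared, which is consistent with a port that has not received valid LACP information from its partner.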



----End


Relevant Alarms

None.

Relevant Logs

None.

3.2.2 Troubleshooting Cases

Traffic Is Not Load Balanced Between Eth-Trunk Member Interfaces Due to the Incorrect Load Balancing Mode

Fault Symptom

As shown in Figure 3-5, SwitchA and SwitchB communicate by using an Eth-Trunk. All interfaces on SwitchA and SwitchB belong to the same VLAN. After the display interface command is run on SwitchA, the command output shows that the outgoing traffic rate on XGE0/0/1 is 800 Mbit/s and the outgoing traffic rate on XGE0/0/2 is 200 Mbit/s. That is, outgoing traffic is not load balanced between XGE0/0/1 and XGE0/0/2.



Figure 3-5 Network diagram of Eth-Trunk load balancing

[Figure: Switch A and Switch B are connected through XGE0/0/1 and XGE0/0/2 on each switch; the two links form Eth-Trunk1.]

Fault Analysis

1. Run the display current-configuration command on the Switches to check the configuration of Eth-Trunk 1. The command outputs show that the load balancing mode of Eth-Trunk 1 is src-dst-ip. That is, load balancing is performed based on the Exclusive-OR result of source and destination IP addresses. SwitchA and SwitchB communicate at Layer 2; therefore, the load balancing mode is not applicable to this scenario.

This fault is caused by the incorrect load balancing mode.
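The effect of the hash key on load balancing can be illustrated with a toy model. The sketch below does not reproduce the hash algorithm of the S6700, which is not described in this document; it only assumes an Exclusive-OR of the last byte of the two addresses, taken modulo the number of member links. The flows are hypothetical example data chosen so that the IP address pairs always produce the same result while the MAC address pairs vary.

```python
import ipaddress
from collections import Counter


def xor_link(src_key, dst_key, link_count):
    """Toy selection: XOR the last byte of both keys, modulo the link count."""
    return (src_key[-1] ^ dst_key[-1]) % link_count


def mac_bytes(mac):
    """Convert a MAC address in xxxx-xxxx-xxxx notation to bytes."""
    return bytes.fromhex(mac.replace("-", ""))


def ip_bytes(ip):
    """Convert a dotted-decimal IPv4 address to bytes."""
    return ipaddress.ip_address(ip).packed


# Hypothetical flows: (source MAC, destination MAC, source IP, destination IP).
flows = [
    ("0000-0000-0001", "0000-0000-0002", "10.0.0.1", "10.0.0.3"),
    ("0000-0000-0003", "0000-0000-0005", "10.0.0.5", "10.0.0.7"),
    ("0000-0000-0004", "0000-0000-0009", "10.0.0.9", "10.0.0.11"),
    ("0000-0000-0006", "0000-0000-000a", "10.0.0.13", "10.0.0.15"),
]

by_ip = Counter(xor_link(ip_bytes(s), ip_bytes(d), 2) for _, _, s, d in flows)
by_mac = Counter(xor_link(mac_bytes(s), mac_bytes(d), 2) for s, d, _, _ in flows)
print("flows per link with IP key: ", dict(by_ip))
print("flows per link with MAC key:", dict(by_mac))
```

In this model all four flows land on the same link when the IP addresses are used as the key, whereas the MAC addresses spread them over both links. The distribution therefore depends on how much the chosen key varies across the traffic.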

Procedure

Step 1 Run the system-view command on SwitchA to enter the system view.

Step 2 Run the interface eth-trunk trunk-id command to enter the view of Eth-Trunk 1.

Step 3 Run the load-balance src-dst-mac command to set the load balancing mode to src-dst-mac.

Run the display interface [ interface-type [ interface-number ] ] command on SwitchA to check the traffic rates on XGE0/0/1 and XGE0/0/2. You can see that traffic is load balanced properly on the two interfaces.

----End

Summary

The Switches can communicate at Layer 2 or Layer 3 by using Eth-Trunk 1.

Figure 3-6 shows the Layer 3 communication scenario. Eth-Trunk 1 belongs to VLAN 10. SwitchA functions as the gateway of PCA, and SwitchB functions as the gateway of PCB. IP addresses of PCA and PCB are in different network segments. To enable PCA to communicate with PCB, you must configure a route to the network segment 3.1.1.0 and set the next hop address of the route to 1.1.1.2 on SwitchA. In addition, configure a route to the network segment 2.1.1.0 and set the next hop address of the route to 1.1.1.1 on SwitchB. In the Layer 3 communication scenario, routes must be configured properly.
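The routing requirement in the Layer 3 scenario can be checked with a small lookup model. The following sketch assumes /24 masks, as shown for the VLANIF interfaces in Figure 3-6, and uses a hypothetical host address 3.1.1.10 for PCB; next_hop is an illustrative helper, not a device command.

```python
import ipaddress


def next_hop(routes, destination):
    """Longest-prefix match over (network, next hop) pairs; None if no route matches."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)[1]


# Routes on SwitchA: two directly connected segments plus the required static route.
direct_routes = [
    (ipaddress.ip_network("2.1.1.0/24"), "direct"),  # VLANIF20, toward PCA
    (ipaddress.ip_network("1.1.1.0/24"), "direct"),  # VLANIF10, over Eth-Trunk 1
]
static_route = (ipaddress.ip_network("3.1.1.0/24"), "1.1.1.2")

pcb = "3.1.1.10"  # hypothetical address of PCB in VLAN 30
without_static = next_hop(direct_routes, pcb)
with_static = next_hop(direct_routes + [static_route], pcb)
print(without_static, with_static)
```

Without the static route, SwitchA has no matching entry for the segment of PCB; with it, traffic is sent to 1.1.1.2 over Eth-Trunk 1. The same check applies in the reverse direction on SwitchB for the segment 2.1.1.0.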




Figure 3-6 Layer 3 communication using Eth-Trunk 1

[Figure: PCA is in VLAN 20 behind Switch A (VLANIF20, 2.1.1.1/24). PCB is in VLAN 30 behind Switch B (VLANIF30, 3.1.1.1/24). Switch A (VLANIF10, 1.1.1.1/24) and Switch B (VLANIF10, 1.1.1.2/24) are connected through Eth-Trunk1 in VLAN 10.]

Figure 3-7 shows the Layer 2 communication scenario. All interfaces on SwitchA and SwitchB belong to VLAN 10. In the Layer 2 communication scenario, you do not need to configure routes.

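Figure 3-7 Layer 2 communication using Eth-Trunk 1

[Figure: PCA is connected to Switch A and PCB is connected to Switch B; the two switches are connected through Eth-Trunk1, and all interfaces belong to VLAN 10.]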

In the Layer 3 communication scenario, select the IP address-based load balancing modes. In the Layer 2 communication scenario, select the MAC address-based load balancing modes.

Devices at the Two Ends of an Eth-Trunk Cannot Ping Each Other Due to Inconsistent Aggregation Modes

Fault Symptom

As shown in Figure 3-8, SwitchA is an S6700, and SwitchB is a non-Huawei device. An Eth-Trunk consisting of two XGE links is configured between the two devices. After the configuration, the devices cannot ping the management IP address of each other.



Figure 3-8 Network diagram of an Eth-Trunk

[Figure: SwitchA (VLANIF1, 1.1.1.1/24) and SwitchB (VLANIF1, 1.1.1.2/24) are connected through an Eth-Trunk; the interface is Eth-Trunk 1 on each switch.]

Fault Analysis

1. Run the display current-configuration interface eth-trunk command on SwitchA and SwitchB. The command outputs show that the Eth-Trunk interfaces on the two ends belong to the same VLAN.

2. Check the connection between the member interfaces. The member interfaces on SwitchA are correctly connected to the member interfaces on SwitchB.

3. Run the display interface command on SwitchA and SwitchB to check the status of the member interfaces. All the member interfaces are in Up state.

4. Run the display trunkmembership eth-trunk command on SwitchA and SwitchB to check the number of member interfaces in the Eth-Trunk. The two ends contain the same number of member interfaces.

5. Run the display mac-address command on SwitchA and SwitchB to check their MAC address tables. The command outputs show that SwitchA learns the MAC address of SwitchB, but SwitchB does not learn the MAC address of SwitchA. This suggests that the negotiation between the two ends has failed. On the network, LACP is enabled on SwitchB, whereas SwitchA uses the manual aggregation mode. SwitchA does not respond to the LACP negotiation request sent by SwitchB; therefore, the Eth-Trunk is Down.

NOTE

l SwitchA receives the LACP negotiation request from SwitchB; therefore, SwitchA learns the MAC address of SwitchB.

l SwitchA discards the LACP negotiation request because LACP is disabled. As a result, SwitchB cannot learn the MAC address of SwitchA.

l LACP negotiation fails because SwitchA does not respond to the LACP packet sent from SwitchB. Therefore, the Eth-Trunk on SwitchB is in Block state, and the two devices cannot learn ARP entries of each other.

Procedure

Step 1 Disable LACP on SwitchB.

SwitchA and SwitchB can ping each other successfully.

----End

Summary

When connecting a Huawei switch to a non-Huawei switch by using an Eth-Trunk, ensure that the two switches use the same link aggregation mode.



Two Ends of an Eth-Trunk Cannot Communicate Because They Have Different Numbers of Member Interfaces

Fault Symptom

Figure 3-9 shows the network diagram of an Eth-Trunk.

Figure 3-9 Networking diagram of Eth-Trunk

[Figure: SwitchA and SwitchB are connected through XGE0/0/1 and XGE0/0/2 on each switch; the links form Eth-Trunk 1.]

SwitchA and SwitchB cannot communicate with each other.

Fault Analysis

1. Run the display current-configuration interface eth-trunk command on SwitchA and SwitchB to check the VLANs that the Eth-Trunk interfaces belong to. The command outputs show that the Eth-Trunk interfaces on the two ends belong to the same VLAN.

2. Check the connection between the member interfaces. The member interfaces on SwitchA are correctly connected to the member interfaces on SwitchB.

3. Run the display interface command on SwitchA and SwitchB to check the status of the member interfaces. All the member interfaces are in Up state.

4. Run the display trunkmembership eth-trunk command on SwitchA and SwitchB to check the number of member interfaces. The Eth-Trunk interface on SwitchA contains two member interfaces, but the Eth-Trunk interface on SwitchB contains only one member interface (XGE0/0/1). The numbers of member interfaces on the two devices are different, so they cannot communicate with each other.

Procedure

Step 1 Run the system-view command on SwitchB to enter the system view.

Step 2 Run the interface interface-type interface-number command to enter the view of XGE0/0/2.

Step 3 Run the eth-trunk trunk-id command to add XGE0/0/2 to Eth-Trunk 1.

Step 4 Run the return command to return to the user view, and then run the save command to save the configuration.

After the preceding operations are completed, SwitchA and SwitchB can communicate with each other.

----End




Summary


The two ends of an Eth-Trunk must have the same number of member interfaces; otherwise, the two ends cannot communicate with each other.



4 LAN

4.1 VLAN Troubleshooting

This chapter describes common causes of VLAN faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.2 MAC Address Table Troubleshooting

This chapter describes common causes of MAC address table faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.3 MAC Address Flapping Troubleshooting

This section describes common causes of MAC address flapping, and provides the troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

4.4 QinQ Troubleshooting

This chapter describes common causes of QinQ faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

4.5 MSTP Troubleshooting

This chapter describes common causes of MSTP faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.6 GVRP Troubleshooting

This chapter describes common causes of Generic VLAN Registration Protocol (GVRP) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.7 VLAN Mapping Troubleshooting

This chapter describes common causes of VLAN mapping faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

4.8 SEP Troubleshooting

This chapter describes common causes of SEP faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.9 Loop Troubleshooting

This chapter describes common causes of loops, and provides the corresponding troubleshooting procedures, alarms, and logs.




4.10 Loopback Detection Troubleshooting

This chapter describes common causes of Loopback Detection faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.


4.1 VLAN Troubleshooting

This chapter describes common causes of VLAN faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.1.1 Users in a VLAN Cannot Communicate with Each Other

This section describes common causes of the communication failure between users in a port-based VLAN, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

Common Causes

This fault is commonly caused by one of the following:

l The link between users is faulty.

l The interfaces connected to the users are shut down manually or the physical interfaces are damaged.

l The switch learns incorrect MAC addresses.

l Port isolation is configured on the switch.

l Incorrect static Address Resolution Protocol (ARP) entries are configured on the user terminals.

l Incorrect mappings between interfaces and MAC addresses are configured on the switch.

NOTE

If users in different VLANs cannot communicate with each other, rectify the fault according to the IP Forwarding Troubleshooting.


Figure 4-1 Troubleshooting flowchart for communication failure between users in a port-based VLAN

[Flowchart: (1) Are the user interfaces in the VLAN Up? If not, bring the interfaces to the Up state. (2) Are the terminal IP addresses correct? If not, modify the terminal IP addresses. (3) Are the learned MAC address entries correct? If not, check whether the VLAN configuration is correct and modify it if necessary. (4) Is port isolation configured? If so, disable port isolation. (5) Are the static ARP entries on the terminals correct? If not, modify the static ARP entries. After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check that the interfaces connected to the user terminals are in Up state.

Run the display interface interface-type interface-number command in any view to check the status of the interfaces.

l If the interface is in Down state, rectify the fault according to Connected Ethernet Interfaces Down.

l If the interface is Up, go to Step 2.

Step 2 Check whether the IP addresses of user terminals are in the same network segment.

l If they are in different network segments, change the IP addresses of the user terminals.

l If they are in the same network segment, go to Step 3.
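Whether two terminals are in the same network segment depends on both the address and the mask. The following sketch uses the Python standard library to make the comparison; the addresses are hypothetical examples, and same_segment is an illustrative name.

```python
import ipaddress


def same_segment(addr_a, addr_b):
    """Return True if two addresses in address/prefix notation share one network."""
    net_a = ipaddress.ip_interface(addr_a).network
    net_b = ipaddress.ip_interface(addr_b).network
    return net_a == net_b


# Hypothetical terminal addresses.
print(same_segment("192.168.10.5/24", "192.168.10.77/24"))  # same segment
print(same_segment("192.168.10.5/24", "192.168.11.5/24"))   # different segments
```

Terminals whose masks differ are also reported as being in different segments, because their network definitions do not match.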

Step 3 Check that the MAC address entries on the Switch are correct.

Run the display mac-address command on the Switch to check whether the MAC addresses, interfaces, and VLANs in the learned MAC address entries are correct. If the learned MAC address entries are incorrect, run the undo mac-address mac-address vlan vlan-id command to delete the current entries so that the Switch can learn MAC address entries again.

After the MAC address table is updated, check the MAC address entries again.

l If the MAC address entries are incorrect, go to Step 4.

l If the MAC address entries are correct, go to Step 5.

Step 4 Check that the VLAN is properly configured.

l Check the VLAN configuration according to the following table.

Check Item: Whether the VLAN has been created

Method: Run the display vlan vlan-id command in any view to check whether the VLAN has been created. If not, run the vlan command to create the VLAN.

Check Item: Whether the interfaces are added to the VLAN

Method: Run the display vlan vlan-id command in any view to check whether the VLAN contains the interfaces. If not, add the interfaces to the VLAN.

NOTE

If the interfaces are located on different switches, add the interfaces connecting the switches to the VLAN.

l Add an access interface to the VLAN by using either of the following methods:

NOTE

The default type of a Switch interface is hybrid. To change the interface type to access, run the port link-type access command in the interface view.

1. Run the port default vlan command in the interface view.

2. Run the port command in the VLAN view.

l Add a trunk interface to the VLAN.

NOTE

The default type of a Switch interface is hybrid. To change the interface type to trunk, run the port link-type trunk command in the interface view.

Run the port trunk allow-pass vlan command in the interface view.

l Add a hybrid interface to the VLAN by using either of the following methods:

NOTE

The default type of a Switch interface is hybrid. To change the interface type back to hybrid, run the port link-type hybrid command in the interface view.

1. Run the port hybrid tagged vlan command in the interface view.

2. Run the port hybrid untagged vlan command in the interface view.

Check Item: Whether connections between interfaces and user terminals are correct

Method: Check the connections between interfaces and user terminals according to the network plan. If any user terminal is connected to an incorrect interface, connect it to the correct interface.

After the preceding operations:

– If the MAC address entries are correct, go to Step 5.

– If the MAC address entries are incorrect, go to Step 7.

Step 5 Check whether port isolation is configured.

Run the interface interface-type interface-number command in the system view to enter the interface view, and then run the display this command to check whether port isolation is configured on the interface.



l If port isolation is configured, run the undo port-isolate enable command on the interface to disable port isolation.

l If port isolation is not configured, go to Step 6.

Step 6 Check whether correct static Address Resolution Protocol (ARP) entries are configured on the user terminals.

l If the static ARP entries are incorrect, modify them.

l If the static ARP entries are correct, go to Step 7.



----End


Relevant Alarms

None.

Relevant Logs

None.

4.2 MAC Address Table Troubleshooting

This chapter describes common causes of MAC address table faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.2.1 Correct MAC Address Entries Cannot Be Generated

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the MAC address table fault.

Common Causes

This fault is commonly caused by one of the following:

l The device fails to learn correct MAC address entries because of incorrect configuration.

l The learned MAC addresses are updated frequently because of a loop on the network.

l The MAC address learning function on the interface is disabled.

l Blackhole MAC address entries and MAC address learning limit are configured on the interface.

l The number of learned MAC addresses exceeds the maximum.

MAC address entries cannot be generated on the device, so Layer 2 forwarding fails.

The troubleshooting roadmap is as follows:

l Check the binding relationship between the outbound interface and the VLAN.

l Check whether a loop occurs on the network.

l Check whether the configurations on the interface conflict or MAC address learning limit is configured on the interface.

l Check whether the number of learned MAC addresses exceeds the limit.

Figure 4-2 Troubleshooting flowchart for MAC address entries that cannot be generated

[Flowchart: (1) Is the configuration incorrect? If so, bind the MAC address, interface, and VLAN correctly. (2) Does a loop exist? If so, remove the loop. (3) Is MAC address learning disabled? If so, enable MAC address learning. (4) Is a blackhole MAC address entry or MAC address learning limit configured? If so, delete the blackhole MAC address entry or the MAC address learning limit. (5) Does the number of MAC address entries exceed the limit? If so, delete some MAC address entries. After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check that the configurations on the interface are correct.

Run the display mac-address command in any view to check whether the binding relationships between the MAC address, VLAN, and interface are correct.

<Quidway> display mac-address 0025-9e80-2494
-------------------------------------------------------------------------------
MAC Address    VLAN/VSI    Learned-From    Type
-------------------------------------------------------------------------------
0025-9e80-2494 1/-         XGE0/0/1        dynamic
-------------------------------------------------------------------------------
Total items displayed = 1

l If not, re-configure the binding relationships between the MAC address, VLAN, and interface.

l If yes, go to Step 2.
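The binding check in Step 1 can be automated by parsing the table rows and comparing them with the expected binding. The sketch below assumes the four-column layout shown above; parse_mac_table and the expected tuple are illustrative and come from the sample output, not from a real network plan.

```python
import re

# Captured output of "display mac-address", copied from the example above.
SAMPLE = """\
-------------------------------------------------------------------------------
MAC Address    VLAN/VSI    Learned-From    Type
-------------------------------------------------------------------------------
0025-9e80-2494 1/-         XGE0/0/1        dynamic
-------------------------------------------------------------------------------
Total items displayed = 1
"""

# A row is assumed to be: MAC address, VLAN/VSI, learned-from interface, entry type.
ROW = re.compile(
    r"^([0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4})[ \t]+(\d+)/\S+[ \t]+(\S+)[ \t]+(\S+)[ \t]*$",
    re.M | re.I,
)


def parse_mac_table(text):
    """Return one dictionary per MAC address entry found in the output."""
    return [
        {"mac": mac.lower(), "vlan": int(vlan), "port": port, "type": kind}
        for mac, vlan, port, kind in ROW.findall(text)
    ]


entries = parse_mac_table(SAMPLE)
expected = ("0025-9e80-2494", 1, "XGE0/0/1")  # binding expected by the network plan
binding_ok = any((e["mac"], e["vlan"], e["port"]) == expected for e in entries)
print(entries, binding_ok)
```

If binding_ok is False, the entry was learned on an unexpected interface or VLAN and the binding needs to be corrected.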

Step 2 Check whether a loop on the network causes MAC address flapping.

If a loop exists on the network, use either of the following methods to prevent MAC address flapping:

l Remove the loop from the network.

l Run the loop-detect eth-loop command in the VLAN view to enable the MAC flapping detection function. The S6700 checks whether a MAC address moves from one interface to another in the VLAN. If MAC address flapping occurs, the S6700 blocks the interface or MAC address.

If no loop exists, go to Step 3.

Step 3 Check that MAC address learning is enabled.

Check whether MAC address learning is enabled in the interface view and the VLAN view.

[Quidway-XGigabitEthernet0/0/1] display this
#
interface XGigabitEthernet0/0/1
 mac-address learning disable
 port hybrid tagged vlan 10
 undo negotiation auto
#
return

[Quidway-vlan10] display this
#
vlan 10
 mac-address learning disable
#
return

If the command output contains mac-address learning disable, MAC address learning is disabled on the interface or VLAN.

l If MAC address learning is disabled, run the undo mac-address learning disable command in the interface view or VLAN view to enable MAC address learning.

l If MAC address learning is enabled on the interface, go to Step 4.
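When the configuration of many interfaces or VLANs has to be checked, the presence of the mac-address learning disable line can be tested programmatically. The following is a minimal sketch that works on captured display this output; learning_disabled is an illustrative name, and the sample text is copied from the interface example above.

```python
def learning_disabled(display_this_output):
    """Return True if the section contains the 'mac-address learning disable' line."""
    return any(
        line.strip() == "mac-address learning disable"
        for line in display_this_output.splitlines()
    )


INTERFACE_CFG = """\
#
interface XGigabitEthernet0/0/1
 mac-address learning disable
 port hybrid tagged vlan 10
 undo negotiation auto
#
return
"""

print(learning_disabled(INTERFACE_CFG))
```

The comparison is made on whole lines so that other commands containing the same words are not matched by accident.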

Step 4 Check whether any blackhole MAC address entry or MAC address limiting is configured.

If a blackhole MAC address entry or MAC address limiting is configured, the interface discards packets.

l Blackhole MAC address entry

Run the display mac-address blackhole command to check whether any blackhole MAC address entry is configured.

[Quidway] display mac-address blackhole
-------------------------------------------------------------------------------
MAC Address    VLAN/VSI    Learned-From    Type
-------------------------------------------------------------------------------
0001-0001-0001 3333/-      -               blackhole
-------------------------------------------------------------------------------
Total items displayed = 1

If a blackhole MAC address entry is displayed, run the undo mac-address blackhole command to delete it.

l MAC address limiting on the interface or VLAN

– Run the display this command in the interface view or VLAN view. If the command output contains mac-limit maximum, the number of learned MAC addresses is limited. Run either of the following commands:

  – Run the undo mac-limit command in the interface view or VLAN view to disable MAC address limiting.

  – Run the mac-limit command in the interface view or VLAN view to increase the maximum number of learned MAC addresses.

– Run the display this command in the interface view. If the command output contains port-security max-mac-num or port-security enable, the number of secure dynamic MAC addresses is limited on the interface. Run either of the following commands:

  – Run the undo port-security enable command in the interface view to disable port security.

  – Run the port-security max-mac-num command in the interface view to increase the maximum number of secure dynamic MAC addresses on the interface.

NOTE

By default, the limit on the number of secure dynamic MAC addresses is 1 after port security is enabled.


Step 5 Check whether the number of learned MAC addresses has reached the maximum supported by the S6700.

Run the display mac-address summary command to check the number of MAC addresses in the MAC address table.

l If the number of learned MAC addresses has reached the maximum supported by the S6700, no MAC address entry can be created. Run the display mac-address command to view MAC address entries.

– If the number of MAC addresses learned on an interface is much greater than the number of devices on the network connected to the interface, a user on the network may maliciously update the MAC address table. Check the device connected to the interface:

  – If the interface is connected to a switch, run the display mac-address command on the switch to view its MAC address table. Locate the interface connected to the malicious user according to the displayed MAC address entries. If the interface that you find is connected to another switch, repeat this step until you find the interface connected to the malicious user.

  – If the interface is connected to a computer, perform either of the following operations after obtaining permission of the administrator:

    – Disconnect the computer. When the attack stops, connect the computer to the network again.

    – Run the port-security enable command on the interface to enable port security or run the mac-limit command to set the maximum number of MAC addresses that the interface can learn to 1.

  – If the interface is connected to a hub, perform either of the following operations:

    – Configure port mirroring and use a packet capture tool to observe packets received by the interface. Analyze the packet types to locate the attacking computer. Disconnect the computer after obtaining permission of the administrator. When the attack stops, connect the computer to the hub again.

    – Disconnect computers connected to the hub one by one after obtaining permission of the administrator. If the fault is rectified after a computer is disconnected, the computer is the attacker. After it stops the attack, connect it to the hub again.

– If the number of MAC addresses on the interface is equal to or smaller than the number of devices connected to the interface, the number of devices connected to the S6700 has exceeded the maximum supported by the S6700. Adjust network deployment.

l If the number of MAC addresses has not reached the maximum supported by the S6700, go to Step 6.
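The comparison between learned MAC addresses and connected devices in Step 5 can be sketched as a per-interface count. The document does not quantify "much greater", so the factor of 10 used below is an arbitrary assumption, and the entries and device counts are hypothetical example data.

```python
from collections import Counter


def suspicious_ports(entries, expected_devices, factor=10):
    """Flag interfaces whose learned-MAC count far exceeds the attached device count.

    entries: iterable of (mac, interface) pairs taken from the MAC address table.
    expected_devices: number of devices known to be connected to each interface.
    factor: assumed multiplier that stands in for "much greater".
    """
    learned = Counter(port for _mac, port in entries)
    return sorted(
        port
        for port, count in learned.items()
        if count > factor * expected_devices.get(port, 1)
    )


# Hypothetical table: 50 addresses on XGE0/0/1, 2 addresses on XGE0/0/2.
entries = [(f"0000-0000-{i:04x}", "XGE0/0/1") for i in range(50)]
entries += [("0000-0001-0001", "XGE0/0/2"), ("0000-0001-0002", "XGE0/0/2")]
expected = {"XGE0/0/1": 3, "XGE0/0/2": 2}

flagged = suspicious_ports(entries, expected)
print(flagged)
```

An interface that is flagged is the starting point for tracing the device that is filling the MAC address table.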


Step 6 Collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log file, and alarm file of the S6700

----End


Relevant Alarms

None.

Relevant Logs

None.






4.3 MAC Address Flapping Troubleshooting

This section describes common causes of MAC address flapping, and provides the troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

4.3.1 MAC Address Flapping Occurs

Common Causes

This fault is commonly caused by one of the following:

l A loop exists on the network.

l An unauthorized user uses the MAC address of an authorized user to connect to the network.


When MAC address flapping occurs, rectify the fault according to the following flowchart.



Figure 4-3 MAC address flapping troubleshooting flowchart

[Flowchart: (1) Is MAC address flapping detection enabled? If not, enable MAC address flapping detection. (2) Is MAC address flapping detection disabled in the VLAN? If so, enable MAC address flapping detection in the VLAN. (3) Are MAC address flapping records displayed? If so, shut down the interface where flapping occurs.]


Procedure

Step 1 Check that MAC address flapping detection is enabled.

Run the display mac-address flapping command in the user view to check whether MAC address flapping detection is enabled.

<Quidway> display mac-address flapping
Mac-address Flapping Configurations :
-------------------------------------------------
Flapping detection          : Enable
Aging time(sec)             : 300
Quit-vlan Recover time(min) : 10
Exclude vlan-list           : 10
-------------------------------------------------

l If the Flapping detection field displays Disable, run the mac-address flapping detection command to enable MAC address flapping detection.

l If the Flapping detection field displays Enable, go to Step 2.

Step 2 Check whether MAC address flapping detection is disabled in the VLAN where MAC address flapping occurs.

Run the display mac-address flapping command in the user view to check whether MAC address flapping detection is disabled in the VLAN.

<Quidway> display mac-address flapping
Mac-address Flapping Configurations :
-------------------------------------------------
Flapping detection          : Enable
Aging time(sec)             : 300
Quit-vlan Recover time(min) : 10
Exclude vlan-list           : 10
-------------------------------------------------

l If the VLAN is included in Exclude vlan-list, the switch does not check for MAC address flapping in this VLAN. The preceding information shows that the switch does not check for MAC address flapping in VLAN 10. Run the undo mac-address flapping detection exclude vlan { { vlan-id1 [ to vlan-id2 ] } &<1-10> | all } command in the VLAN view to enable MAC address flapping detection in the VLAN.

l If no VLAN is displayed in Exclude vlan-list, the switch checks for MAC address flapping in all VLANs. Go to Step 3.

NOTE

By default, the switch checks for MAC address flapping in all VLANs. To disable MAC address flapping detection in a VLAN, run the mac-address flapping detection exclude vlan { vlan-id1 [ to vlan-id2 ] } &<1-10> command in the VLAN view.
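Steps 1 and 2 read two fields from the same command output. The sketch below extracts both from captured text. It assumes that each field appears on a single "name : value" line and that excluded VLANs are listed as plain numbers, as in the example above; how VLAN ranges are displayed is not shown in this document, so ranges are not handled. The function names are illustrative.

```python
import re

# Captured output of "display mac-address flapping", copied from the example above.
SAMPLE = """\
Mac-address Flapping Configurations :
-------------------------------------------------
Flapping detection          : Enable
Aging time(sec)             : 300
Quit-vlan Recover time(min) : 10
Exclude vlan-list           : 10
-------------------------------------------------
"""


def parse_flapping_config(text):
    """Return (detection enabled, set of excluded VLAN IDs)."""
    pairs = re.findall(r"^([^:\n]+):(.*)$", text, re.M)
    fields = {name.strip(): value.strip() for name, value in pairs}
    enabled = fields.get("Flapping detection", "").lower() == "enable"
    excluded = {int(tok) for tok in fields.get("Exclude vlan-list", "").split() if tok.isdigit()}
    return enabled, excluded


def detection_active(text, vlan_id):
    """Detection applies to a VLAN only if it is enabled globally and not excluded."""
    enabled, excluded = parse_flapping_config(text)
    return enabled and vlan_id not in excluded


print(parse_flapping_config(SAMPLE))
print(detection_active(SAMPLE, 10), detection_active(SAMPLE, 100))
```

For the sample output, detection is enabled globally but does not apply to VLAN 10, which matches the conclusion drawn in Step 2.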

Step 3 Check MAC address flapping records on the switch.

Run the display mac-address flapping record command to check MAC address flapping records.

<Quidway> display mac-address flapping record

S : start time

E : end time

(Q) : quit vlan

(D) : error down

-------------------------------------------------------------------------------

Move-Time VLAN MAC-Address Original-Port Move-Ports MoveNum

-------------------------------------------------------------------------------

S:2011-12-23 15:37:02 100 0019-5b35-0da8 XGE0/0/1 XGE0/0/2 521

E:2011-12-23 15:46:32

-------------------------------------------------------------------------------

Total items on slot 0: 1

If no MAC address flapping record is displayed, go to Step 4.

If a record is displayed, check between which interfaces the MAC address is flapping. The preceding information shows that MAC address flapping has occurred in VLAN 100. The MAC address is first learned on XGE0/0/1 and then on XGE0/0/2. Check the physical connection between the two interfaces. Disconnect the cable between the interfaces or shut down XGE0/0/2. If the fault persists, go to Step 4.
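The fields of a flapping record can also be extracted by a script, for example to list the interfaces involved. The following Python sketch is a minimal example that assumes the record layout shown in the preceding output (one line starting with "S:" per record). The names FlapRecord and parse_records are hypothetical.

```python
import re
from typing import NamedTuple

class FlapRecord(NamedTuple):
    start: str
    vlan: int
    mac: str
    original_port: str
    move_ports: str
    move_num: int

# One record line: S:<date> <time> <VLAN> <MAC> <Original-Port> <Move-Ports> <MoveNum>
RECORD = re.compile(
    r"^S:(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\d+)\s+"
    r"([0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4})\s+(\S+)\s+(\S+)\s+(\d+)\s*$",
    re.IGNORECASE | re.MULTILINE,
)

def parse_records(output):
    """Return one FlapRecord for each 'S:' line in the command output."""
    return [
        FlapRecord(start, int(vlan), mac, orig, moved, int(num))
        for start, vlan, mac, orig, moved, num in RECORD.findall(output)
    ]

SAMPLE = """\
Move-Time            VLAN MAC-Address    Original-Port Move-Ports MoveNum
S:2011-12-23 15:37:02 100 0019-5b35-0da8 XGE0/0/1 XGE0/0/2 521
E:2011-12-23 15:46:32
"""

for record in parse_records(SAMPLE):
    print(record.vlan, record.mac, record.original_port, record.move_ports)
```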


Step 4 Collect the following information and contact Huawei technical support personnel.

l Networking diagram

l Results of the preceding troubleshooting procedure

l Configuration file, logs, and alarms of the switch

----End



Relevant Alarms

l MAC address flapping in VLANs: L2IFPPI_1.3.6.1.4.1.2011.5.25.160.3.7_hwMflpVlanAlarm

Relevant Logs

None

4.4 QinQ Troubleshooting

This chapter describes common causes of QinQ faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

4.4.1 Traffic Forwarding Fails on a QinQ Interface

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for a traffic forwarding failure on a QinQ interface of the S6700.

Common Causes

This fault is commonly caused by one of the following:

l The protocol ID in the QinQ outer VLAN tag configured on the S6700 interface cannot be identified by the device directly connected to the interface.

l The outer VLAN is not created, so the interface cannot be added to the VLAN.

l The S6700 interface is not added to the outer VLAN in untagged mode.

l The VLAN tag of received packets is not in the VLAN tag range set on the S6700, so the S6700 cannot identify the packets.

l The S6700 interface and the remote interface in the same VLAN cannot communicate with each other.







Figure 4-4 Troubleshooting flowchart for a traffic forwarding failure on a QinQ interface

[Flowchart: the checks, in order, are whether the protocol ID in the outer VLAN tag is correct, whether the outer VLAN exists, whether the interface is added to the outer VLAN correctly, whether the VLAN tag of user packets is in the allowed range, and whether the interfaces in the VLAN can communicate.]






Procedure

Step 1 Check that the protocol ID in the QinQ outer VLAN tag configured on the S6700 interface can be identified by the device directly connected to the interface.

NOTE

l The default protocol ID in the outer VLAN tag is 0x8100 on the S6700.

l If the protocol ID on the S6700 interface cannot be identified by the remote device directly connected to the interface, the remote device discards the QinQ packets sent from the S6700 interface.

Run the display current-configuration interface interface-type interface-number command on the S6700 to view the protocol ID in the outer VLAN tag configured on the interface.

l If qinq protocol protocol-id is displayed, the protocol ID has been changed to protocol-id.

l If qinq protocol protocol-id is not displayed, the default protocol ID 0x8100 is used.

Check the protocol ID in the outer VLAN tag on the remote interface.

l If the protocol ID on the remote interface is the same as that on the S6700 interface, go to step 2.

l If the protocol ID on the remote interface is different from that on the S6700 interface, run the qinq protocol protocol-id command in the interface view on the S6700 to set the protocol ID to be the same as that on the remote interface.
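The rule in this step (a configured qinq protocol value overrides the default 0x8100) can be expressed as a small script. The following Python sketch is illustrative only. It assumes that the interface configuration has been saved as text and that the protocol ID appears as a hexadecimal value, with or without a 0x prefix, after qinq protocol; the exact display format is an assumption. The function name qinq_protocol_id is hypothetical.

```python
import re

DEFAULT_PROTOCOL_ID = 0x8100  # default on the S6700, as stated in the NOTE above

def qinq_protocol_id(interface_config):
    """Return the protocol ID in effect on the interface."""
    match = re.search(
        r"^\s*qinq protocol\s+(?:0x)?([0-9a-fA-F]+)\s*$",
        interface_config,
        re.MULTILINE,
    )
    # Assumption: the value is displayed in hexadecimal.
    return int(match.group(1), 16) if match else DEFAULT_PROTOCOL_ID

changed = "interface XGigabitEthernet0/0/1\n qinq protocol 9100\n"
unchanged = "interface XGigabitEthernet0/0/1\n port hybrid untagged vlan 10\n"

print(hex(qinq_protocol_id(changed)))
print(hex(qinq_protocol_id(unchanged)))
```

Comparing the returned value with the protocol ID expected by the remote device shows whether the qinq protocol command needs to be run.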

Step 2 Check that the specified outer VLAN exists.

NOTE

If the specified outer VLAN has not been created, the S6700 cannot add an outer VLAN tag to packets.

Run the display vlan vlan-id command on the S6700 to check whether the outer VLAN exists.

l If "Error: The VLAN does not exist" is displayed, the specified outer VLAN has not been created. Run the vlan vlan-id command to create the VLAN.

l If the outer VLAN exists, go to step 3.

Step 3 Check that the S6700 interface is added to the outer VLAN correctly.

NOTE

l By default, the type of an interface is hybrid.

l It is recommended that you retain the default interface type and add the interface to the outer VLAN in untagged mode. The trunk interface type is not recommended.

Run the display current-configuration interface command on the S6700 to check whether the S6700 interface is added to the outer VLAN correctly.



Field | Description | Follow-up Procedure

port hybrid untagged vlan vlan-id | The interface type is hybrid and the interface has been added to the outer VLAN in untagged mode. | Perform step 4.

port hybrid tagged vlan vlan-id | The interface is added to the outer VLAN in tagged mode. | Run the port hybrid untagged vlan vlan-id command to add the interface to the outer VLAN.

No preceding information displayed | The interface is not added to the outer VLAN. | Run the port hybrid untagged vlan vlan-id command to add the interface to the outer VLAN.

Step 4 Check that the VLAN tag of user packets is in the VLAN tag range specified on the S6700.

Run the display current-configuration interface command on the S6700 to check the configuration of the interface connected to the downstream device.

If port vlan-stacking vlan vlan-id1 to vlan-id2 stack-vlan vlan-id3 is displayed, the S6700 adds outer VLAN tag vlan-id3 to the packets carrying VLAN tags vlan-id1 to vlan-id2.

l If the VLAN tag of user packets is in the range vlan-id1 to vlan-id2, go to step 5.

l If the VLAN tag of user packets is not in the specified range, run the port vlan-stacking vlan vlan-id1 to vlan-id2 stack-vlan vlan-id3 command to change the VLAN tag range. Ensure that the VLAN tag range includes the VLAN tag of user packets.
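The range check in this step can be sketched in Python as follows. The example is illustrative only and assumes that the interface configuration has been saved as text and contains lines in the port vlan-stacking format described above. The function name outer_vlan_for is hypothetical.

```python
import re

STACKING = re.compile(
    r"port vlan-stacking vlan (\d+)(?: to (\d+))? stack-vlan (\d+)"
)

def outer_vlan_for(interface_config, inner_vlan):
    """Return the outer VLAN added to packets tagged inner_vlan, or None."""
    for low, high, outer in STACKING.findall(interface_config):
        # A single VLAN (no "to" part) is treated as a one-element range.
        if int(low) <= inner_vlan <= int(high or low):
            return int(outer)
    return None

CONFIG = """\
interface XGigabitEthernet0/0/1
 port hybrid untagged vlan 10
 port vlan-stacking vlan 100 to 200 stack-vlan 10
"""

print(outer_vlan_for(CONFIG, 150))  # inside the configured range
print(outer_vlan_for(CONFIG, 300))  # outside the configured range
```

A result of None for the VLAN tag carried by user packets indicates that the VLAN tag range must be changed.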

Step 5 Check that the S6700 interface and the remote interface in the same VLAN can communicate with each other.

Perform ping operations on the two interfaces.

l If the interfaces cannot ping each other, rectify the fault according to 4.1 VLAN Troubleshooting.

l If the interfaces can ping each other, go to step 6.


Step 6 Collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration files, log files, and alarm files of the S6700

----End


Relevant Alarms

None.

Relevant Logs

None.






4.5 MSTP Troubleshooting

This chapter describes common causes of MSTP faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.5.1 MSTP Topology Change Leads to Service Interruption

Common Causes

When the topology on an MSTP network changes, services are interrupted.

This fault is commonly caused by one of the following:

l MSTP is incorrectly configured.

l Physical links flap, triggering a large number of TC messages.

l An MSTP-aware device receives MSTP TC messages from clients or transparently transmitted MSTP TC messages.


Changing MSTP topology leads to service interruption on the network shown in Figure 4-5.




Figure 4-5 Networking diagram of MSTP

[Networking diagram: S1, S2, S3, and S4 form a ring through XGE0/0/1 and XGE0/0/2. The root switch is S1 for the CIST (MSTI0) and MSTI1, and S2 for MSTI2. Each instance has one blocked port.]

The troubleshooting roadmap is as follows:

l Check that the MSTP status is correct.

l Check whether the device has received TC messages.

l Check that no physical interface on the device alternates between Up and Down.

l Check that the MSTP convergence mode is Normal.

Figure 4-6 Troubleshooting flowchart for service interruption due to changes in MSTP topology

[Flowchart: the checks, in order, are whether the MSTP status is correct, whether MSTP recalculation is performed, whether a physical interface on the device alternates between Up and Down, and whether the MSTP convergence mode is Normal. If the fault persists, collect information.]




Procedure

Step 1 Check the status of interfaces on MSTP devices.

Check the role of each MSTP-enabled port in each instance.


On the network shown in Figure 4-5, there is only one MSTP ring, which means that each instance can have only one blocked interface. Run the display stp brief command on each device to check whether the status of each port is normal.



Run the display stp brief command in any view to check the MSTP status on S1. As shown in Figure 4-5, in instances 0 and 1, S1 functions as a root bridge and all ports on S1 are designated ports. In instance 2, one port on S1 is a designated port and the other port is a root port. Both ports are in the Forwarding state.

[S1] display stp brief









Run the display stp brief command in any view to check the MSTP status on S2. As shown in Figure 4-5, in instance 2, S2 functions as a root bridge and all ports on S2 are designated ports. In other instances, one port on S2 is a designated port and the other port is a root port. Both ports are in the Forwarding state.










Run the display stp brief command in any view to check the MSTP status on S3. As shown in Figure 4-5, in instance 2, one port on S3 is an Alternate port and the other port is a root port. The Alternate port is blocked and in the Discarding state. In other instances, one port on S3 is a designated port and the other port is a root port. Both ports are in the Forwarding state.



0 XGigabitEthernet0/0/1 DEST FORWARDING NONE


1 XGigabitEthernet0/0/1 DEST FORWARDING NONE


2 XGigabitEthernet0/0/1 ALTE DISCARDING NONE



Run the display stp brief command in any view to check the MSTP status on S4. As shown in Figure 4-5, in instance 0, one port on S4 is an Alternate port and the other port is a root port. The Alternate port is blocked and in the Discarding state. In instance 2, one port on S4 is a designated port and the other port is a root port. Both ports are in the Forwarding state.








2 XGigabitEthernet0/0/2 ROOT FORWARDING NONE

l On the network shown in Figure 4-5, each instance has only one port in the Discarding state, and all other ports are in the Forwarding state. If several ports in an instance are in the Discarding state, an MSTP calculation error has occurred. To solve this problem, go to Step 6.

l If the MSTP status is correct, go to Step 2.

Step 2 Check that the MSTP configuration is correct.

Run the display stp region-configuration command to view mappings between VLANs and instances.






[S1] display stp region-configuration

Oper Configuration:

Format selector :0

Region name :huawei

Revision level :0

Instance Vlans Mapped

0 21 to 4094

1 1 to 10

2 11 to 20

l Check whether mappings between VLANs and instances are correct. If the mapping between a VLAN and an instance is incorrect, run the instance command to map the VLAN to the specified spanning tree instance, and then run the active region-configuration command to activate the mapping.

Run the display current-configuration command to view the MSTP configuration in the configuration file of the device.

l Check interface configurations to confirm that MSTP-enabled interfaces have been configured with the command (for example, bpdu enable) to enable protocol packets to be sent to the CPU.

l Check whether MSTP is disabled on the interfaces connecting to user terminals or the interfaces are configured as edge interfaces.

l If an MSTP-enabled device is configured with a BPDU tunnel, check whether the BPDU tunnel configuration is correct. For BPDU tunnel configurations, see the chapter "BPDU Tunnel Configuration" in the S6700 Configuration Guide - Ethernet.

l Check whether interfaces are added to VLANs correctly. For VLAN configurations, see the chapter "VLAN Configuration" in the S6700 Configuration Guide - Ethernet.

l If the MSTP configuration is correct, go to Step 3.
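To compare the VLAN-to-instance mappings of several switches, the table printed by display stp region-configuration can be converted into a lookup table. The following Python sketch is illustrative only. It assumes the layout shown in the preceding output and that each instance occupies a single line; VLAN lists that wrap onto continuation lines are not handled. The function name vlan_to_instance is hypothetical.

```python
import re

def vlan_to_instance(output):
    """Map each VLAN ID to its MSTI, based on the 'Instance Vlans Mapped' table."""
    mapping = {}
    in_table = False
    for line in output.splitlines():
        if line.strip().startswith("Instance"):
            in_table = True  # rows after this header belong to the table
            continue
        fields = line.split(None, 1)
        if not in_table or len(fields) < 2 or not fields[0].isdigit():
            continue
        instance = int(fields[0])
        for first, last in re.findall(r"(\d+)(?:\s+to\s+(\d+))?", fields[1]):
            for vlan in range(int(first), int(last or first) + 1):
                mapping[vlan] = instance
    return mapping

SAMPLE = """\
Oper Configuration:
Format selector :0
Region name :huawei
Revision level :0
Instance Vlans Mapped
0 21 to 4094
1 1 to 10
2 11 to 20
"""

mapping = vlan_to_instance(SAMPLE)
print(mapping[5], mapping[15], mapping[100])
```

Switches in the same MST region should produce identical mappings; a difference indicates that the instance configuration on one of them needs to be corrected.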

Step 3 Check that no MSTP recalculation is performed.

Run the display stp command in any view to check whether the device has received TC messages.

[S1] display stp

-------[CIST Global Info][Mode MSTP]-------

CIST Bridge :57344.00e0-fc00-1597

Bridge Times :Hello 2s MaxAge 20s FwDly 15s MaxHop 20

CIST Root/ERPC :0 .0018-826f-fc7a / 20000

CIST RegRoot/IRPC :57344.00e0-fc00-1597 / 0

CIST RootPortId :128.2

BPDU-Protection :disabled

TC or TCN received :0

TC count per hello :0

STP Converge Mode :Normal

Time since last TC :2 days 14h:16m:15s

-------[MSTI 1 Global Info]-------

MSTI Bridge ID :4096.00e0-fc00-1597

MSTI RegRoot/IRPC :4096.00e0-fc00-1597 / 0

MSTI RootPortId :0.0

Master Bridge :57344.00e0-fc00-1597

Cost to Master :0

TC received :0

TC count per hello :2

l If values of the TC or TCN received, TC count per hello, TC received, and TC count per hello fields in the command output increase, the device has received TC messages and the network topology has changed. In this case, view log messages MSTP/6/SET_PORT_DISCARDING and MSTP/6/SET_PORT_FORWARDING to check whether the role of an MSTP-enabled port has changed.

– If the port role does not change, go to Step 4.

– If the port role changes, go to Step 6.

NOTE

If a multi-process has been created on the device and TC notification has been configured in the multi-process, a TC message is sent to process 0 when the topology of the multi-process changes. The TC message instructs devices in process 0 to refresh their MAC address tables and ARP tables. In this manner, devices on the network can re-select links to forward traffic, ensuring uninterrupted traffic forwarding.

l If the values of the TC or TCN received, TC count per hello, TC received, and TC count per hello fields in the command output are all 0, the device has not received any TC message. In this case, contact Huawei technical support personnel.
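Whether the TC-related fields increase can be determined by comparing two captures of the display stp output taken some time apart. The following Python sketch is illustrative only. It assumes that both captures contain the same fields in the same order, as in the preceding output. The names tc_counters and tc_increased are hypothetical.

```python
import re

TC_FIELD = re.compile(
    r"^\s*(TC or TCN received|TC received|TC count per hello)\s*:\s*(\d+)\s*$",
    re.MULTILINE,
)

def tc_counters(output):
    """Return the TC-related values in the order they appear in the output."""
    return [int(value) for _name, value in TC_FIELD.findall(output)]

def tc_increased(before, after):
    """Return True if any TC-related value is larger in the second capture."""
    return any(new > old
               for old, new in zip(tc_counters(before), tc_counters(after)))

BEFORE = """\
TC or TCN received :0
TC count per hello :0
TC received :0
TC count per hello :2
"""
AFTER = BEFORE.replace("TC or TCN received :0", "TC or TCN received :5")

print(tc_increased(BEFORE, AFTER))   # the CIST counter grew between captures
print(tc_increased(BEFORE, BEFORE))  # no change between captures
```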

Step 4 Check that no interface on the device alternates between Up and Down.

View the log message IFNET/4/IF_STATE to check whether an MSTP-enabled port alternates between Up and Down.

l If an MSTP-enabled interface alternates between Up and Down, the interface is flapping. If a physical interface frequently alternates between Up and Down, the MSTP status of the device on the network becomes unstable. As a result, a large number of TC messages are generated, ARP entries and MAC address entries are frequently deleted, and services are interrupted. Run the shutdown command on the flapping interface. If services are not restored after the flapping interface is shut down, go to Step 5.

l If no interface flaps, go to Step 5.

Step 5 Check that the MSTP convergence mode is Normal.

Run the display stp command in any view to check the MSTP convergence mode of the device.

[S1] display stp

-------[CIST Global Info][Mode MSTP]-------

CIST Bridge :57344.00e0-fc00-1597

Bridge Times :Hello 2s MaxAge 20s FwDly 15s MaxHop 20

CIST Root/ERPC :0 .0018-826f-fc7a / 20000

CIST RegRoot/IRPC :57344.00e0-fc00-1597 / 0

CIST RootPortId :128.2

BPDU-Protection :disabled

TC or TCN received :0

TC count per hello :0

STP Converge Mode :Normal

Time since last TC :2 days 14h:16m:15s

-------[MSTI 1 Global Info]-------

MSTI Bridge ID :4096.00e0-fc00-1597

MSTI RegRoot/IRPC :4096.00e0-fc00-1597 / 0

MSTI RootPortId :0.0

Master Bridge :57344.00e0-fc00-1597

Cost to Master :0

TC received :0

TC count per hello :2

l If the convergence mode is Normal, go to Step 6.

l If the convergence mode is Fast, run the stp converge normal command to change the convergence mode to Normal. If services are not restored after the convergence mode is changed, go to Step 6.



----End



Relevant Alarms

MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.1 hwMstpiPortStateForwarding

MSTP_1.3.6.1.4.1.2011.5.25.42.4.2.2 hwMstpiPortStateDiscarding

MSTP_1.3.6.1.2.1.17.0.2 topologyChange

Relevant Logs

MSTP/6/RECEIVE_MSTITC

VOSCPU/4/CPU_USAGE_HIGH


4.6 GVRP Troubleshooting

This chapter describes common causes of Generic VLAN Registration Protocol (GVRP) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.6.1 No Dynamic VLAN Can Be Created on an Interface

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for a dynamic VLAN creation failure on an interface.

Common Causes


This fault is commonly caused by one of the following:

l The link between the GVRP-enabled devices is faulty.

l The interface registration mode is incorrect.





Figure 4-7 Troubleshooting flowchart for a dynamic VLAN creation failure

[Flowchart: the checks, in order, are whether the interface is Up, whether the GVRP configuration is correct, whether a static VLAN is created, and whether the interface can send and receive GVRP packets. The corrective actions are to bring the interface Up, modify the GVRP configuration, create a static VLAN, and change the STP priority or cancel the ACL that filters GVRP packets.]




Procedure

Step 1 Check that the GVRP-enabled interface is Up.

Run the display interface interface-type interface-number command in any view to check the interface status.

l If the interface is in Down state, rectify the fault according to Interfaces Down. If the interface remains in Down state, go to Step 5.

l If the interface is Up, go to Step 2.

Step 2 Check that the GVRP configuration is correct.

Check the GVRP configuration according to the following table.

Check Item: Whether the Bridge Protocol Data Unit (BPDU) function is enabled

Method: Run the display this command in the interface view to check whether the BPDU function is enabled on the interface.

#
interface XGigabitEthernet0/0/1
 bpdu enable
#

If the BPDU function is not enabled on the interface, run the bpdu enable command in the interface view to enable the BPDU function.

Check Item: Whether GVRP is enabled

Method:

l Check whether global GVRP is enabled.

By default, global GVRP is disabled. Run the display gvrp status command in the system view to check whether global GVRP is enabled.

<Quidway> display gvrp status

GVRP is enabled

If global GVRP is not enabled, run the gvrp command in the system view to enable GVRP.

l Check whether GVRP is enabled on the interface.

Run the display this command in the interface view to check whether GVRP is enabled on the interface.

#
interface XGigabitEthernet0/0/1
 gvrp
#

If GVRP is not enabled on the interface, run the gvrp command in the interface view to enable GVRP.

Check Item: Whether the interface is added to VLANs by using the port trunk allow-pass vlan command

Method: Run the display this command in the interface view to check the VLANs that the interface belongs to.

 port link-type trunk
 port trunk allow-pass vlan 20 100
 gvrp
#

NOTE

The default type of an interface is hybrid. To change the interface type to trunk, run the port link-type trunk command in the interface view.

If the interface is not added to VLANs as a trunk interface, run the port trunk allow-pass vlan command in the interface view to add the interface to the correct VLANs.



Check Item: Whether the interface registration mode is correct

Method: Run the display this command in the interface view to check the interface registration mode.

NOTE

A GVRP interface supports three registration modes:

l Normal: In this mode, the GVRP interface can register and deregister VLANs, and transmit dynamic VLAN registration information and static VLAN registration information.

l Fixed: In this mode, the GVRP interface is disabled from registering and deregistering VLANs and can transmit only the static registration information. If a trunk interface works in fixed registration mode, it allows only the manually configured VLANs even if it is configured to allow all the VLANs.

l Forbidden: In this mode, the GVRP interface is disabled from registering and deregistering VLANs and can transmit only information about VLAN 1. If a trunk interface works in forbidden registration mode, it allows only VLAN 1 even if it is configured to allow all the VLANs.

The default registration mode of an interface is normal. If the registration mode is fixed or forbidden, run the gvrp registration command to change the registration mode to normal.

 port trunk allow-pass vlan 20 100
 gvrp
 gvrp registration forbidden
#
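The registration mode check can be automated for a saved interface configuration. The following Python sketch is illustrative only. It assumes, as described above, that an interface with gvrp configured and no gvrp registration line uses the default normal mode. The function name gvrp_registration_mode is hypothetical.

```python
import re

def gvrp_registration_mode(interface_config):
    """Return 'normal', 'fixed', or 'forbidden', or None if GVRP is not enabled."""
    lines = [line.strip() for line in interface_config.splitlines()]
    if "gvrp" not in lines:
        return None  # GVRP is not enabled on the interface
    for line in lines:
        match = re.fullmatch(r"gvrp registration (normal|fixed|forbidden)", line)
        if match:
            return match.group(1)
    return "normal"  # default mode when no registration line is displayed

CONFIG = """\
 port trunk allow-pass vlan 20 100
 gvrp
 gvrp registration forbidden
#
"""

print(gvrp_registration_mode(CONFIG))
```

A result other than normal means that the interface cannot register dynamic VLANs, and the registration mode must be changed.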

Step 3 Check that static VLANs are created.

Run the display this command in the system view to check whether static VLANs are created.

If no static VLAN is created, dynamic VLANs cannot be created.

[Quidway] display this

#
vlan batch 1 to 10
#

l If no static VLAN is created, create static VLANs.

l If static VLANs are created, go to Step 4.

Step 4 Check the statistics on GVRP packets sent, received, and discarded on the interface.

Run the display garp statistics command in the user view to check whether there are statistics on GVRP packets sent, received, and discarded on the interface.

Run the display garp statistics command repeatedly to view the change of packet statistics.

NOTE

If the GVRP packets are sent and received by the interface and no GVRP packet is discarded, GVRP functions properly.

<Quidway> display garp statistics

GARP statistics on port XGigabitEthernet0/0/1

Number of GVRP frames received : 0

Number of GVRP frames transmitted : 0

Number of frames discarded : 0



l If GVRP functions properly, go to Step 5.

l If GVRP packets are discarded, check the following items.

Check Item: The interface is blocked.

Method: Run the display stp brief command to check whether the interface is blocked by the Spanning Tree Protocol (STP). If the STP state of the interface is DISCARDING, the interface is blocked. Dynamic VLANs cannot be created on a blocked interface. Run the stp port priority command to change the interface priority. After the spanning tree is recalculated, the interface will be unblocked.

Check Item: GVRP packets are filtered out.

Method: Check whether ACLs are configured on the local and remote devices to filter out GVRP packets. If such an ACL is configured, delete it.

If the fault persists, go to Step 5.



----End


Relevant Alarms

None.

Relevant Logs

None.

4.6.2 Dynamic VLAN Flapping Occurs

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for dynamic VLAN flapping.

Common Causes

This fault is commonly caused by incorrect settings of Generic Attribute Registration Protocol (GARP) timers.



If Layer 2 forwarding fails after GVRP is configured, dynamic VLAN flapping may occur.

Rectify the fault according to the following flowchart.




Figure 4-8 Troubleshooting flowchart for dynamic VLAN flapping

[Flowchart: the checks, in order, are whether the LeaveAll timers are the same on the two devices, whether the LeaveAll timer value is too small, whether the Leave timer value is too small, and whether the Join timer value is too large. The corrective actions are to set the same LeaveAll timer on the devices, increase the LeaveAll timer value, increase the Leave timer value, and reduce the Join timer value.]




Procedure

Step 1 Check that GARP timers are set properly.

After GVRP is enabled, a switch uses the default values of GARP timers. Improper GARP timer settings may cause dynamic VLAN flapping. When setting the GARP timers on a device, consider the timer values of other devices on the network.

Devices of different vendors have different performance. If many static VLANs are configured but the LeaveAll timer is small, dynamic VLAN flapping may occur.

Run the display garp timer command in the user view to check the values of GARP timers.

<Quidway> display garp timer

GARP timers on port XGigabitEthernet0/0/1




GARP JoinTime : 20 centiseconds

GARP LeaveTime : 60 centiseconds

GARP LeaveAllTime : 1000 centiseconds

GARP HoldTime : 10 centiseconds

The recommended values of the GARP timers are as follows:

l GARP Join timer: 600 centiseconds (6 seconds)

l GARP Leave timer: 3000 centiseconds (30 seconds)

l GARP LeaveAll timer: 12000 centiseconds (2 minutes)

l GARP Hold timer: 100 centiseconds (1 second)

If the GARP timers are not set properly, adjust the timer values according to the following figure.

[Figure: relationship among the GARP timers, showing the LeaveAll period, the Leave time, and two Join times.]
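The timer values can be checked by a script before they are adjusted. The following Python sketch is illustrative only. It assumes that the figure expresses the ordering LeaveAll timer > Leave timer > 2 x Join timer; this reading of the figure labels is an assumption, not a statement taken from the text. The names parse_garp_timers and timer_problems are hypothetical.

```python
import re

def parse_garp_timers(output):
    """Return timer values in centiseconds, keyed by Join, Leave, LeaveAll, Hold."""
    return {name: int(value)
            for name, value in re.findall(r"GARP (\w+)Time\s*:\s*(\d+)", output)}

def timer_problems(timers):
    """List violations of the assumed ordering LeaveAll > Leave > 2 x Join."""
    problems = []
    if timers["Leave"] <= 2 * timers["Join"]:
        problems.append("Leave timer should exceed twice the Join timer")
    if timers["LeaveAll"] <= timers["Leave"]:
        problems.append("LeaveAll timer should exceed the Leave timer")
    return problems

SAMPLE = """\
GARP timers on port XGigabitEthernet0/0/1
GARP JoinTime : 20 centiseconds
GARP LeaveTime : 60 centiseconds
GARP LeaveAllTime : 1000 centiseconds
GARP HoldTime : 10 centiseconds
"""

timers = parse_garp_timers(SAMPLE)
print(timers)
print(timer_problems(timers))
```

Both the values in the sample output and the recommended values listed above satisfy the assumed ordering.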

Step 2 Run the garp timer leaveall command to set the global LeaveAll timer to be the same as that of other devices on the network.

Step 3 Run the garp timer command to adjust the values of the Leave timer and Join timer on interfaces.

Step 4 If the fault persists, contact Huawei technical personnel.

----End


Relevant Alarms

None.

Relevant Logs

None.

4.7 VLAN Mapping Troubleshooting

This chapter describes common causes of VLAN mapping faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.






4.7.1 Users Cannot Communicate After VLAN Mapping Is Configured

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when users cannot communicate after VLAN mapping is configured.

Common Causes

As shown in Figure 4-9, users in VLAN 6 need to communicate with users in VLAN 5 over an ISP network. The carrier assigns VLAN 10 as the stack VLAN (S-VLAN). Single-tag VLAN mapping is configured on XGE0/0/1 of SwitchC and SwitchD to map customer VLANs (C-VLANs) 5 and 6 to S-VLAN 10.

Figure 4-9 VLAN mapping networking diagram

[Networking diagram: SwitchA, SwitchB, SwitchC, and SwitchD with interfaces XGE0/0/1, XGE0/0/2, and XGE0/0/3; user VLANs 6 and 5; ISP network in VLAN 10; user hosts 172.16.0.1/16, 172.16.0.2/16, 172.16.0.3/16, 172.16.0.5/16, 172.16.0.6/16, and 172.16.0.7/16.]

After VLAN mapping is configured on the interfaces, users in different VLANs cannot communicate with each other. This fault is commonly caused by one of the following:

l The translated VLAN (specified by map-vlan) has not been created.

l The interfaces configured with VLAN mapping are not added to the translated VLAN.

l The translated VLAN ID configured on SwitchC and SwitchD is different from the S-VLAN ID assigned by the carrier.

l The interfaces configured with VLAN mapping are faulty.





Figure 4-10 VLAN mapping troubleshooting flowchart

[Flowchart: the checks, in order, are whether the translated VLAN has been created, whether the VLAN mapping interface is added to the translated VLAN in tagged mode, whether the translated VLAN ID is the same as the S-VLAN ID assigned by the carrier, and whether the user-side interfaces are in the C-VLANs.]




Procedure

Step 1 Check that the translated VLAN (specified by map-vlan) has been created.

Run the display vlan command on SwitchC and SwitchD to check whether the translated VLAN has been created.

l If the translated VLAN has not been created, run the vlan command to create it.

l If the translated VLAN has been created, go to step 2.

Step 2 Check that the interfaces configured with VLAN mapping have been added to the translated VLAN in tagged mode.

Run the display this command on the interfaces configured with VLAN mapping to check whether the interfaces have been added to the translated VLAN in tagged mode.

NOTE

l VLAN mapping can only be configured on a trunk or hybrid interface, and the interface must be added to the translated VLAN in tagged mode.

l If a range of original VLANs is specified by vlan-id1 to vlan-id2 on an interface, the interface must be added to all the original VLANs in tagged mode, and the translated VLAN cannot have a VLANIF interface.

l Limiting MAC address learning on an interface may affect N:1 VLAN mapping on the interface.

l If the interfaces configured with VLAN mapping have not been added to the translated VLAN in tagged mode, run the port trunk allow-pass vlan or port hybrid tagged vlan command in the interface views to add the interfaces to the translated VLAN in tagged mode.

l If the interfaces have been added to the translated VLAN in tagged mode, go to step 3.

Step 3 Check that the translated VLAN ID configured on SwitchC and SwitchD is the same as the S-VLAN ID assigned by the carrier.

Run the display this command on the interfaces configured with VLAN mapping to check whether the translated VLAN ID is the same as the S-VLAN ID assigned by the carrier.

l If the translated VLAN ID on an interface is different from the S-VLAN ID assigned by the carrier, run the undo port vlan-mapping vlan command on the interface to delete the VLAN mapping configuration, and then run the port vlan-mapping vlan command to set the translated VLAN ID to the S-VLAN ID.

l If the translated VLAN ID is the same as the S-VLAN ID assigned by the carrier, go to step 4.
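The comparison in this step can be scripted for a saved interface configuration. The following Python sketch is illustrative only. It assumes that mappings appear in the configuration as port vlan-mapping vlan vlan-id1 [ to vlan-id2 ] map-vlan vlan-id3; this line format is an assumption based on the keywords named in this section. The function name wrong_map_vlans is hypothetical.

```python
import re

MAPPING = re.compile(
    r"port vlan-mapping vlan (\d+)(?: to (\d+))? map-vlan (\d+)"
)

def wrong_map_vlans(interface_config, s_vlan):
    """Return (first C-VLAN, last C-VLAN, map-vlan) for mappings that differ from s_vlan."""
    return [
        (int(low), int(high or low), int(mapped))
        for low, high, mapped in MAPPING.findall(interface_config)
        if int(mapped) != s_vlan
    ]

GOOD = "interface XGigabitEthernet0/0/1\n port vlan-mapping vlan 6 map-vlan 10\n"
BAD = "interface XGigabitEthernet0/0/1\n port vlan-mapping vlan 6 map-vlan 20\n"

print(wrong_map_vlans(GOOD, 10))  # no mismatch with S-VLAN 10
print(wrong_map_vlans(BAD, 10))   # the mapping that must be reconfigured
```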

Step 4 Check that the user-side interfaces are in the C-VLANs.

Run the display vlan vlan-id command on SwitchA and SwitchB to check whether the user-side interfaces are in the C-VLANs.

l If the user-side interfaces are not in the C-VLANs, run the port trunk allow-pass vlan, port hybrid tagged vlan, or port default vlan command to add the interfaces to the C-VLANs.

l If the user-side interfaces are in the C-VLANs, go to step 5.


Step 5 Collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration files, log files, and alarm files of the switches

----End


Relevant Alarms

None.




Relevant Logs

None.


4.8 SEP Troubleshooting

This chapter describes common causes of SEP faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

4.8.1 Traffic Forwarding Fails on a SEP Link

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the fault that traffic forwarding fails on a SEP link.

Possible Causes

After SEP is configured on a ring network, traffic cannot be forwarded normally.

The possible causes are:

l The SEP configuration is incorrect.

l The corresponding port is not added to the data VLANs.

l The physical port fails.


The troubleshooting roadmap is as follows:

l Check whether the SEP topology and SEP port status are normal.

l Check whether the ports on the ring network are added to the data VLANs.

l Check whether any physical port on the ring network is in Down state.

l Check whether any fault occurs on the physical ports.

Figure 4-11 Flowchart for troubleshooting forwarding failure on a SEP link

[Flowchart: the checks, in order, are whether the SEP topology and port status are normal, whether a physical port is Down, whether a physical port fails, and whether the port allows the data VLANs. The corrective actions are to run undo shutdown in the interface view, rectify the fault of the physical port, and add the port to the data VLANs.]


NOTE

Save the result of every step and report the information to Huawei technical personnel if the fault cannot be rectified.

Procedure

Step 1 Check whether the SEP topology and SEP port status are normal.

Normally, a SEP segment contains a primary edge port, a secondary edge port, and some common ports.

Run the display sep topology [ segment segment-id ] [ verbose ] command to check the topology information of a SEP segment.



l If the SEP topology or the port status is abnormal:

– Check whether the SEP configuration is correct. For the correct SEP configuration, see "SEP Configuration" in the S6700 Series Ethernet Switches Configuration Guide - Ethernet. If the SEP configuration is incorrect, modify the configuration.

– If the SEP configuration is correct, go to Step 2.

l If the SEP topology and port status are normal, go to Step 4.

Step 2 Check the status of interfaces on the SEP segment.

Run the display interface command in any view to check the interface status.

<Quidway> display interface XGigabitEthernet 0/0/1

XGigabitEthernet0/0/1 current state : DOWN

Line protocol current state : DOWN

Description:HUAWEI, Quidway Series, XGigabitEthernet0/0/1 Interface

Switch Port, PVID : 1, TPID : 8100(Hex), The Maximum Frame Length is 1600

IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 000b-0918-8bc1

Port Mode: COMMON COPPER

Speed : 10, Loopback: NONE

Duplex: HALF, Negotiation: ENABLE

Mdi : AUTO

Last 300 seconds input rate 0 bits/sec, 0 packets/sec

Last 300 seconds output rate 0 bits/sec, 0 packets/sec

Input peak rate 0 bits/sec, Record time: -

Output peak rate 0 bits/sec, Record time: -


Unicast : 0, Multicast : 0

Broadcast : 0, Jumbo : 0

CRC : 0, Giants : 0

Jabbers : 0, Fragments : 0

Runts : 0, DropEvents : 0

Alignments : 0, Symbols : 0

Ignoreds : 0, Frames : 0

Discard : 0, Total Error : 0




Collisions : 0, Deferreds : 0

Late Collisions: 0, ExcessiveCollisions: 0

Buffers Purged : 0


Input bandwidth utilization threshold : 100.00%

Output bandwidth utilization threshold: 100.00%

Input bandwidth utilization : 0.00%

Output bandwidth utilization : 0.00%

l If an interface is in Down state, run the display this command in the corresponding interface view to check whether the interface is shut down.

– If the interface is shut down, run the undo shutdown command in the interface view.

– If not, go to Step 3.

l If the interface status is Up, go to Step 4.
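The Step 2 branches can be sketched in a few lines of Python. This is only an illustration: the function names (interface_states, step2_action) are hypothetical, and the parser assumes the two state fields appear with the labels shown in the sample output above. Output from other software versions may differ.

```python
import re


def interface_states(output):
    """Extract the physical and line-protocol states from `display interface` text."""
    # Collapse all whitespace so a state wrapped onto the next line still matches.
    flat = " ".join(output.split())
    protocol = re.search(r"Line protocol current state\s*:\s*(\S+)", flat)
    # The physical state is the "current state" field not preceded by "Line protocol".
    physical = re.search(r"(?<!Line protocol )current state\s*:\s*(\S+)", flat)
    return {
        "physical": physical.group(1).upper() if physical else None,
        "protocol": protocol.group(1).upper() if protocol else None,
    }


def step2_action(states, is_shut_down):
    """Map the Step 2 branches to the next action (wording is illustrative)."""
    if states["physical"] == "UP":
        return "go to Step 4"
    if is_shut_down:
        return "run undo shutdown in the interface view"
    return "go to Step 3"


# First lines of the sample output, including a state wrapped onto its own line.
SAMPLE = """XGigabitEthernet0/0/1 current state :
DOWN
Line protocol current state : DOWN
"""

if __name__ == "__main__":
    states = interface_states(SAMPLE)
    print(states, "->", step2_action(states, is_shut_down=False))
```

Whether the interface is shut down still has to be read from the display this output; the sketch takes that answer as a parameter.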

Step 3 Check whether any fault occurs on the physical port.

l If a fault occurs on the physical port, see the section 3.1.1 Connected Ethernet Interfaces Down to rectify the fault.

l If the physical port is normal, go to Step 4.

Step 4 Check whether the ports on the SEP segment are added to the data VLANs.

Run the display this command in the interface view to check whether an interface allows specified data VLANs.



[Quidway] interface xgigabitethernet 0/0/1

[Quidway-XGigabitEthernet0/0/1] display this

stp disable

sep segment 1 edge primary

#

return

l If the interface does not allow the specified data VLANs, add the interface to the VLANs.

l If the interface allows the specified data VLANs, go to Step 5.

Step 5 Collect the following information and contact Huawei technical personnel.

l The results of the preceding steps.

l Configuration files, logs, and alarms of the devices.

----End
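As a compact summary, the five steps above can be written as one decision function. This is a minimal sketch: the function name, parameter names, and returned strings are hypothetical, and each boolean stands for the answer an engineer obtains from the corresponding display command.

```python
def sep_next_action(*, topology_ok, config_ok, port_up, port_shut_down,
                    port_faulty, allows_data_vlans):
    """Return the next action for a SEP forwarding failure, following Steps 1 to 5."""
    if not topology_ok:
        # Step 1: an abnormal topology or port status points to configuration first.
        if not config_ok:
            return "modify the SEP configuration"
        # Step 2: check the interface status.
        if not port_up:
            if port_shut_down:
                return "run undo shutdown in the interface view"
            # Step 3: the interface is Down but not shut down.
            if port_faulty:
                return "rectify the physical port fault"
    # Step 4: the port must allow the data VLANs.
    if not allows_data_vlans:
        return "add the interface to the data VLANs"
    # Step 5: nothing left to check locally.
    return "collect information and contact Huawei technical personnel"


if __name__ == "__main__":
    print(sep_next_action(topology_ok=False, config_ok=True, port_up=False,
                          port_shut_down=True, port_faulty=False,
                          allows_data_vlans=True))
```

The function returns only the first action that applies, so it would be called again after each corrective action, which mirrors the "Fault rectified?" checks in the flowchart.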


Relevant Alarms

None.

Relevant Logs

None.

4.9 Loop Troubleshooting

This chapter describes common causes of loops, and provides the corresponding troubleshooting procedures, alarms, and logs.

4.9.1 Loops Cause Broadcast Storms

This section provides a step-by-step troubleshooting procedure for broadcast storms caused by loops.

Common Causes

Loops on a network cause broadcast storms and may also lead to the following problems:

l Users cannot log in to the switch remotely.

l The display interface command output shows a large number of broadcast packets received on one or more interfaces.

l It takes a long time to log in to the switch from the serial port.

l The switch CPU usage exceeds 70%.

l A large number of ICMP packets are lost in ping tests.

l Interface indicators of interfaces in the VLAN where a loop has occurred blink at a higher frequency than usual.

l A large number of broadcast packets are captured on PCs on the network.



l Loop alarms are generated when loop detection is enabled.

This fault is commonly caused by one of the following:

l Cables are connected incorrectly. Figure 4-12 and Figure 4-13 show loops caused by incorrect cable connections.

– As shown in Figure 4-12, interfaces in the same VLAN on SwitchB are connected, causing a loop.

In this scenario, locate the loop as follows:

– Enable loopback detection on SwitchA and configure SwitchA to generate an alarm when detecting a loop. Locate the device, VLAN, and interface where the loop has occurred according to alarm messages. If a loop alarm is generated on the interface connected to SwitchB, the loop has occurred on SwitchB. If a loop alarm is generated on an interface not connected to SwitchB, the loop has occurred on SwitchA. On the Switch where the loop has occurred, view broadcast packet statistics on interfaces or observe interface indicators to locate the interface that may encounter a loop. Run the shutdown command on the interface or remove the cable from the interface and check whether the broadcast storm disappears. If the broadcast storm disappears, the loop has occurred on this interface.

– On SwitchA, run the shutdown command on the interface connected to SwitchB or remove the cable from the interface. If the broadcast storm persists, the loop has occurred on SwitchA. If the broadcast storm disappears, the loop has occurred on SwitchB. On the Switch where the loop has occurred, view broadcast packet statistics on interfaces or observe interface indicators to locate the interface that may encounter a loop. Run the shutdown command on the interface or remove the cable from the interface and check whether the broadcast storm disappears. If the broadcast storm disappears, the loop has occurred on this interface.

– As shown in Figure 4-13, interfaces connecting SwitchD, SwitchE, and SwitchF belong to the same VLAN. SwitchE is mistakenly connected to SwitchF; therefore, a loop occurs.

In this scenario, locate the loop as follows:

– Enable loopback detection on SwitchC and configure SwitchC to generate an alarm when detecting a loop. Locate the Switch where the loop has occurred according to alarm messages. If a loop alarm is generated on the interface connected to SwitchD, the loop may have occurred on SwitchD, SwitchE, or SwitchF. If no loop alarm is generated, the loop has occurred on SwitchC. On the Switch where the loop has occurred, view broadcast packet statistics on interfaces or observe interface indicators to locate the interface that may encounter a loop. Run the shutdown command on the interface or remove the cable from the interface and check whether the broadcast storm disappears. If the broadcast storm disappears, the loop has occurred on this interface.



Figure 4-12 Loop on a switch caused by incorrect cable connections

Figure 4-13 Loop between switches caused by incorrect cable connections

l Configurations of network devices are incorrect.

Figure 4-14 shows a typical networking where incorrect configurations cause loops. As shown in Figure 4-14, interfaces connecting SwitchA and SwitchB and interfaces connecting SwitchA and SwitchC allow packets from VLAN X to pass through. Interfaces connecting SwitchB and SwitchC should not allow packets from VLAN X to pass through; however, a user incorrectly adds the two interfaces to VLAN X, causing a loop on the network.


– View broadcast packet statistics on interfaces or observe interface indicators to locate the interface that may encounter a broadcast storm. Run the shutdown command on the interface or remove the cable from the interface and check whether the broadcast storm disappears. If the broadcast storm disappears, the loop has occurred on this interface. Check whether the VLAN configuration on the interface is incorrect.



Figure 4-14 Incorrect configurations cause a loop

Loop Occurs Because of Incorrect Cable Connections

NOTE


Procedure

Step 1 Locate the interfaces where a broadcast storm has occurred.

Use either of the following methods:

l Check the indicator of each interface. If the indicator of an interface is blinking at a higher frequency than usual, a broadcast storm may have occurred on the interface.

l Run the display interface brief command to check the inbound and outbound bandwidth usages in a recent period of time on each interface.

In the command output, InUti indicates the inbound bandwidth usage, and OutUti indicates the outbound bandwidth usage. If both the inbound and outbound bandwidth usages on an interface approach 100%, a broadcast storm may have occurred on the interface.
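The InUti and OutUti check can be automated against saved command output. The following Python sketch makes several assumptions: the output is a whitespace-separated table whose header row contains InUti and OutUti, the sample text is hypothetical, and the 90% threshold is an arbitrary stand-in for "approaches 100%".

```python
def storm_suspects(brief_output, threshold=90.0):
    """Return interfaces whose inbound AND outbound usage both reach the threshold."""
    suspects = []
    header = None
    for line in brief_output.splitlines():
        fields = line.split()
        if "InUti" in fields and "OutUti" in fields:
            header = fields  # remember the column order from the header row
            continue
        if header is None or len(fields) != len(header):
            continue  # skip banners, legends, and malformed rows
        row = dict(zip(header, fields))
        try:
            in_uti = float(row["InUti"].rstrip("%"))
            out_uti = float(row["OutUti"].rstrip("%"))
        except ValueError:
            continue  # non-numeric placeholder in a usage column
        if in_uti >= threshold and out_uti >= threshold:
            suspects.append(fields[0])
    return suspects


# Hypothetical sample; real output may contain extra banner and legend lines.
SAMPLE_BRIEF = """Interface PHY Protocol InUti OutUti inErrors outErrors
XGigabitEthernet0/0/1 up up 99.80% 99.75% 0 0
XGigabitEthernet0/0/2 up up 0.01% 0.02% 0 0
XGigabitEthernet0/0/3 down down 0% 0% 0 0
"""

if __name__ == "__main__":
    print(storm_suspects(SAMPLE_BRIEF))
```

An interface flagged this way is only a candidate; the indicator check and the shutdown test in Step 2 are still needed to confirm a broadcast storm.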

Step 2 Locate the device where a loop has occurred.

NOTE

To check whether a loop has occurred on a device, run the shutdown command on the interface where a broadcast storm has occurred or remove the cable from the interface. This operation will interrupt services on this interface and can be performed only after gaining the network administrator's permission. After the loop is removed, run the undo shutdown command to enable the interface.

l If broadcast storms occur on multiple interfaces and each of the interfaces is connected to a Switch, the loop may occur between Switches. Go to step 3.

l If a broadcast storm has occurred on a single interface and the interface is not connected to any Switch, the loop has occurred on the local Switch. Go to step 3.

l If the interface is connected to a Switch, the loop may have occurred on the local Switch or the Switch connected to the interface.

– Use the loopback detection protocol to locate the device where the loop has occurred.



NOTE

Before configuring the loopback detection protocol, locate the VLAN where the loop has occurred in either of the following ways:

l Check the VLAN to which the interface encountering a broadcast storm belongs.

l Check the VLAN to which the PCs encountering a broadcast storm belong.

– Enable loopback detection in the VLAN where the loop has occurred and configure the Switch to generate an alarm when a loop is detected. For details on how to configure loopback detection, see "loopback detection" in the S6700 Series Ethernet Switches Configuration Guide - Ethernet. If an LDT_1.3.6.1.4.1.2011.5.25.174.3.3 hwLdtPortLoopDetect alarm is generated, the interface indicated in the alarm message is the interface where the loop has occurred. If the interface indicated in the alarm message is the interface connected to a Switch, the loop has occurred on one of the downstream Switches connected to the interface. In this case, repeat the preceding operations until the Switch where the loop has occurred is located. Go to step 3.

If this alarm is not generated, the loop has occurred on the local Switch.

– Run the shutdown command on the interface connected to the local Switch, and check whether the broadcast storm persists on the local Switch and the entire network.

– If the broadcast storm persists on the local Switch but disappears on the downstream Switch, the loop has occurred on the local Switch. Go to step 3.

– If the broadcast storm has occurred on an interface that is not connected to any Switch, the loop has occurred on the Switch where this interface resides. Go to step 3.

– If the broadcast storm disappears on the entire network, the loop has occurred between Switches. Go to step 3.

If the interface where a broadcast storm has occurred is connected to downstream Switches, and these Switches also encounter a broadcast storm, repeat the preceding operations on the Switches until the Switch where the loop has occurred is located.

Step 3 Locate the interfaces where the loop has occurred and remove the loop.

l If the loop has occurred on a single Switch, the loop is generated because two interfaces in the same VLAN on the Switch are directly connected. Remove the loop as follows:

– Check whether the interface where a broadcast storm has occurred is connected to another interface on the local Switch. If yes, remove the network cable between the interfaces.

– Run the shutdown command on the interface where a broadcast storm has occurred. If the broadcast storm disappears, and another interface on the local Switch goes Down, there is a loop between the two interfaces. Remove the network cable between the interfaces after gaining the network administrator's permission.

l If the loop exists between Switches, check for incorrect cable connections between Switches according to the network plan. Check the cable connection of each interface encountering a broadcast storm. If the connection between an interface and the remote device does not conform to the network plan, remove the cable from the interface.

If the fault persists after the preceding operations are complete, go to step 4.



----End



Loop Occurs Because of Incorrect Configuration

NOTE


Procedure

Step 1 Locate the interfaces where a broadcast storm has occurred.

Locate the interfaces where a broadcast storm has occurred on all the network devices encountering a broadcast storm.

l Check the indicator on each interface. If the indicator on an interface is blinking at a higher frequency than usual, a broadcast storm may have occurred on the interface.

l Run the display interface brief command to check the inbound and outbound bandwidth usages in a recent period of time on each interface.

In the command output, InUti indicates the inbound bandwidth usage, and OutUti indicates the outbound bandwidth usage. If both the inbound and outbound bandwidth usages on an interface approach 100%, a broadcast storm may have occurred on the interface.

Step 2 Identify and modify the incorrect configurations.

Check the VLANs to which the interfaces encountering a broadcast storm belong and confirm with the network administrator about the devices that should reject packets from these VLANs.

Modify the VLAN configurations on the devices. If the fault persists, go to step 3.



----End


Relevant Alarms

LDT_1.3.6.1.4.1.2011.5.25.174.3.3 hwLdtPortLoopDetect

Relevant Logs

None.

4.10 Loopback Detection Troubleshooting

This chapter describes common causes of Loopback Detection faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.



4.10.1 Broadcast Storms Still Exist After Loopback Detection Is Configured

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when broadcast storms still exist after loopback detection is configured.

Common Causes


This fault is commonly caused by one of the following:

l Loopback detection is configured on an incorrect interface.

l The default VLAN of the loopback detection interface is not the VLAN where a loop occurs.

l The system is not configured to block or shut down the interface when detecting a loop.


Figure 4-15 Loopback detection troubleshooting flowchart




NOTE


Procedure

Step 1 Check that loopback detection is configured on the correct interface.

Run the display this command on the Switch interface connected to the network with a loop. If the command output contains loopback-detect enable, loopback detection is enabled on the interface.

l If the command output does not contain loopback-detect enable, run the loopback-detect enable command in the interface view or system view to enable loopback detection.

l If the command output contains loopback-detect enable, go to step 2.

Step 2 Check whether the VLAN where the loop occurs is the default VLAN of the interface that receives broadcast packets.

l If the VLAN where the loop occurs is not the default VLAN of the interface, perform either of the following operations:

– If the interface has been added to multiple VLANs in untagged mode, run the port trunk allow-pass vlan or port hybrid tagged vlan command on the local interface and remote interface to add the interfaces to these VLANs in tagged mode. Run the loopback-detect packet vlan vlan-id command on the local interface to specify these VLAN IDs for loopback detection packets.

– If the interface has been added to multiple VLANs in tagged mode, run the loopback-detect packet vlan vlan-id command on the interface to specify these VLAN IDs for loopback detection packets.

l If the VLAN where the loop occurs is the default VLAN of the interface, go to step 3.

Step 3 Check whether the system is configured to block or shut down the interface when a loopback is detected.

Run the display loopback-detect command in the system view to check the loopback detection configuration. Check whether the value of the Action field is block or shutdown.

NOTE

If the action is set to block, the interface can recover automatically after the loop is removed. If the action is set to shutdown, the interface cannot recover automatically after the loop is removed.

l If the action is not block or shutdown, run the loopback-detect action command in the interface view to set the action to block or shutdown.

l If the action is block or shutdown, go to step 4.



----End
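The three checks in this procedure can be expressed as a small Python function that inspects saved display this output. This is a sketch under stated assumptions: the function name and return strings are hypothetical, the default VLAN (pvid) and the configured action are passed in by the caller, and the test for VLAN IDs already listed after loopback-detect packet vlan is an added convenience that the procedure itself does not describe.

```python
def loopback_detect_next_action(config_lines, loop_vlan, pvid, action):
    """Walk Steps 1 to 3 and return the first corrective action that applies."""
    lines = [line.strip() for line in config_lines]

    # Step 1: loopback detection must be enabled on the interface.
    if "loopback-detect enable" not in lines:
        return "run loopback-detect enable"

    # Step 2: a loop outside the default VLAN needs explicit detection VLAN IDs.
    if loop_vlan != pvid:
        monitored = set()
        for line in lines:
            if line.startswith("loopback-detect packet vlan "):
                # Assumption: VLAN IDs follow the keyword as plain integers.
                monitored.update(int(t) for t in line.split()[3:] if t.isdigit())
        if loop_vlan not in monitored:
            return "run loopback-detect packet vlan for the loop VLAN"

    # Step 3: only block or shutdown actually stops the storm.
    if action not in ("block", "shutdown"):
        return "run loopback-detect action to set block or shutdown"

    return "continue to the next step"


if __name__ == "__main__":
    # Hypothetical interface configuration.
    config = ["interface XGigabitEthernet0/0/1", " loopback-detect enable"]
    print(loopback_detect_next_action(config, loop_vlan=20, pvid=1, action="block"))
```

Because the function stops at the first problem it finds, it is meant to be re-run after each configuration change.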





Relevant Alarms

None.

Relevant Logs

None.

5 IP Services

5.1 IP Address Troubleshooting

5.2 DHCP Troubleshooting

5.3 DHCPv6 Troubleshooting

5.4 IPv6 Troubleshooting

5.1 IP Address Troubleshooting

5.1.1 IP Address Fails to Be Allocated to an Interface

Common Causes

This fault is commonly caused by one of the following:

l The IP address or subnet mask of the interface is incorrect.

l The IP address on the interface conflicts with another existing IP address.

l The number of secondary IP addresses on an interface has exceeded the maximum, so no more secondary IP address can be set.

l The interface has been configured with IP address unnumbered so that it cannot be configured with a secondary IP address.

l The interface has been configured with the same secondary IP address. It should be configured with a different secondary IP address.


Figure 5-1 Troubleshooting flowchart for a failure to allocate an IP address to an interface

NOTE


Procedure

Step 1 Check the error message and rectify the fault according to Table 5-1.



148



Table 5-1 Error messages and troubleshooting methods

Error Message

Error: The specified IP address is invalid.

Description

The IP address or subnet mask is incorrect.

Error: The specified address conflicts with another address.

Error: The specified primary address does not exist.

Error: Please delete the sub address in the interface view first.

Error: The specified address cannot be deleted because it is not the primary address of this interface.

Error: The specified sub address does not exist.

The IP address conflicts with an IP address that has been used by another interface.

The primary IP address to be deleted does not exist.

NOTE

One interface has only one primary IP address. If a primary IP address has been set on an interface when a new primary IP address is set, the original primary IP address is deleted and the new primary IP address takes effect.

The secondary IP address cannot be set.

Error: Please configure the primary address in the interface view first.

Error: The number of addresses of the specified interface reached the upper limit ().

The number of secondary

IP addresses on an interface exceeds the maximum, so no more secondary IP address can be set.

NOTE

A maximum of secondary IP addresses can be set on an interface by default.

The primary IP address cannot be deleted.

The command used to delete the primary IP address cannot delete the secondary IP address.

The secondary IP address to be deleted does not exist.

-

Troubleshooting Method

Configure the IP address or subnet mask correctly.

l The IP address type must be

Class A, Class B, or Class C.

Allocate another IP address to the interface.

You do not need to delete the primary IP address.

First configure the primary IP address.

To delete a primary IP address, delete all the secondary IP addresses on the interface first.

Run the undo ip address ip-

address { mask | mask-length }

sub command to delete the secondary IP address.

You do not need to delete the secondary IP address.



149



Error Message

Error: The address already exists.

Description

The interface has been configured with the same secondary IP address. It should be configured with a different secondary IP address.

Troubleshooting Method

Allocate a different secondary IP address to the interface.

Step 2 If the preceding error messages are not displayed but the IP address fails to be allocated to an interface, contact Huawei technical support personnel.

----End
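Two of the checks behind the error messages above, the address class and a conflict with an existing address, can be tried offline before the address is configured. The Python sketch below uses the standard ipaddress module. The function names are hypothetical, and treating "conflict" as overlapping subnets is an assumption; the exact validation rules the switch applies may differ.

```python
import ipaddress


def address_class(ip):
    """Classify an IPv4 address by its first octet (classful addressing)."""
    first_octet = int(ipaddress.IPv4Address(ip)) >> 24
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    if first_octet < 240:
        return "D"
    return "E"


def overlapping(new_interface, existing_interfaces):
    """Return the existing addresses whose subnet overlaps the new address's subnet."""
    new_net = ipaddress.ip_interface(new_interface).network
    return [
        existing for existing in existing_interfaces
        if ipaddress.ip_interface(existing).network.overlaps(new_net)
    ]


if __name__ == "__main__":
    print(address_class("224.0.0.5"))  # a Class D address is not configurable on an interface
    print(overlapping("192.168.1.10/24", ["192.168.1.1/24", "10.0.0.1/8"]))
```

An address outside Classes A to C corresponds to the first row of Table 5-1, and a non-empty result from overlapping corresponds to the second row.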


Relevant Alarms

None.

Relevant Logs

None.

5.2 DHCP Troubleshooting

5.2.1 A Client Cannot Obtain an IP Address (the S6700 Functions as the DHCP Server)

Common Causes

This fault is commonly caused by one of the following:

l A fault occurs on the link between the DHCP client and the DHCP server.

l DHCP is disabled on the S6700.

l The DHCP address allocation mode is not set on the VLANIF interface of the S6700.

l When IP addresses are allocated from the global address pool, the global address pool and the IP address of the VLANIF interface are in different network segments.

l When IP addresses are allocated from the global address pool:

– If the client and server are located on the same network segment and no relay agent is deployed, the IP addresses in the global address pool and the VLANIF interface IP address on the S6700 are on different network segments.

– If the client and server are located on different network segments and a relay agent is deployed, the IP addresses in the global address pool and the VLANIF interface IP address on the relay agent are on different network segments.

l There is no available address in the address pool.


Figure 5-2 Troubleshooting flowchart for the failure to allocate an IP address from the DHCP server to a client




NOTE


Procedure

Step 1 Check whether a fault occurs on the link between the client and the DHCP server.

l If the client and server are on the same network segment and no relay agent is deployed, configure an IP address for the client network adapter connecting the client and the server.

Ensure that the IP address of the network adapter and the VLANIF interface IP address are on the same network segment. Ping the VLANIF interface IP address from the client.

– If the ping operation fails, the link is faulty. Rectify the link fault according to 6.2.1 A Ping Operation Fails.

– If the ping operation succeeds, go to step 2.

l If the client and server are on different network segments and a relay agent is deployed, ping the links between the client and the relay agent and between the relay agent and the server.

– If the ping operation fails, the link is faulty. Rectify the link fault according to 6.2.1 A Ping Operation Fails.

– If the ping operation succeeds, go to step 2.

Step 2 Check that DHCP is enabled.

NOTE

If DHCP is disabled, the S6700 does not process DHCP messages sent by the DHCP client.

Run the display current-configuration | include dhcp enable command to check whether DHCP is enabled. By default, DHCP is disabled.

l If no DHCP information is displayed, DHCP is disabled. Run the dhcp enable command to enable DHCP.

l If dhcp enable is displayed, DHCP is enabled. Go to step 3.

Step 3 Check whether DHCP address allocation mode is set on the VLANIF interface of the S6700.

NOTE

If the DHCP address allocation mode is not set on the VLANIF interface of the S6700, the client cannot obtain an IP address in DHCP mode.

Run the display this command in the S6700 interface view to check whether the DHCP address allocation mode is set.

Information Displayed | Description | Subsequent Operation

dhcp select global | The S6700 allocates IP addresses to DHCP clients from the global address pool on the VLANIF interface. | Perform step 4.

dhcp select interface | The S6700 allocates IP addresses to DHCP clients from the interface address pool on the VLANIF interface. | Perform step 5.

No information displayed | The DHCP address allocation mode is not set on the VLANIF interface. | Run the dhcp select global or dhcp select interface command to set the DHCP address allocation mode on the VLANIF interface.
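The mapping from the displayed configuration to the subsequent operation is a simple lookup. A minimal Python sketch follows; the function name is hypothetical, and the input is assumed to be the lines of the display this output for the VLANIF interface.

```python
def dhcp_mode_next_step(display_this_lines):
    """Map the DHCP address allocation mode on a VLANIF interface to the next operation."""
    lines = {line.strip() for line in display_this_lines}
    if "dhcp select global" in lines:
        return "perform step 4"
    if "dhcp select interface" in lines:
        return "perform step 5"
    # Neither command is present: the allocation mode is not set.
    return "run dhcp select global or dhcp select interface"


if __name__ == "__main__":
    # Hypothetical VLANIF configuration.
    print(dhcp_mode_next_step(["interface Vlanif10", " dhcp select global"]))
```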

Step 4 Check whether addresses in the global address pool and the IP address of the VLANIF interface are on the same network segment.

1. Run the display ip pool command to check whether the global address pool has been created.

l If the global address pool has not been created, run the ip pool ip-pool-name and network ip-address [ mask { mask | mask-length } ] commands to create a global address pool and set the range of IP addresses that can be dynamically allocated.

l If the global address pool has been created, obtain the value of ip-pool-name. Then go to substep 2.

2. Run the display ip pool name ip-pool-name command to check whether any IP address in the global address pool is on the same network segment as the VLANIF interface IP address.

l If the client and server are located on the same network segment and no relay agent is deployed:

– If any address in the global address pool and the VLANIF interface IP address on the S6700 are located on different network segments, run the ip address ip-address command to change the VLANIF interface IP address to be on the same network segment as any address in the global address pool.

– If any address in the global address pool and the VLANIF interface IP address on the S6700 are located on the same network segment, perform step 5.

l If the client and server are located on different network segments and a relay agent is deployed:

– If any address in the global address pool and the VLANIF interface IP address on the relay agent are located on different network segments, run the ip address ip-address command to change the VLANIF interface IP address to be on the same network segment as any address in the global address pool.

– If any address in the global address pool and the VLANIF interface IP address on the relay agent are located on the same network segment, perform step 5.
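The network segment comparison in this step can be verified offline with Python's standard ipaddress module. In this sketch the function name and the sample addresses are hypothetical; the pool is given as the network configured with the network command, and the VLANIF address is the one on the S6700 or, when a relay agent is deployed, on the relay agent.

```python
import ipaddress


def same_segment(pool_network, vlanif_address):
    """Return True if the VLANIF interface IP address lies inside the pool's network."""
    # strict=False tolerates a pool written with host bits set, such as 10.1.1.1/24.
    pool = ipaddress.ip_network(pool_network, strict=False)
    vlanif = ipaddress.ip_interface(vlanif_address)
    return vlanif.ip in pool


if __name__ == "__main__":
    print(same_segment("10.1.1.0/24", "10.1.1.1/24"))   # same segment
    print(same_segment("10.1.1.0/24", "10.1.2.1/24"))   # different segments
```

A False result corresponds to the branch that changes the VLANIF interface IP address; a True result corresponds to "perform step 5".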

Step 5 Check whether the address pool contains available IP addresses.

Run the display ip pool name ip-pool-name command to check the usage of IP addresses in the global or interface address pool.

l If the value of Idle(Expired) is equal to 0, no IP address can be allocated from the address pool.



– If the S6700 allocates IP addresses to clients from the global address pool on the VLANIF interface, re-create a global address pool where the network segment can be connected to the previous network segment but cannot overlap with the previous network segment.

– If the S6700 allocates IP addresses to clients from the interface address pool on the VLANIF interface, reconfigure an IP address for the VLANIF interface. This IP address and the previous IP address must be on different network segments.

l If the value of Idle(Expired) is greater than 0, there are idle(expired) IP addresses. Go to step 6.



----End


Relevant Alarms

None.

Relevant Logs

None.

5.2.2 A Client Cannot Obtain an IP Address (the S6700 Functions as the DHCP Relay Agent)

Common Causes

This fault is commonly caused by one of the following:

l The link between the client and the DHCP server is faulty.

– The link between the client and the DHCP relay agent is faulty.

– The link between the DHCP relay agent and the DHCP server is faulty.

l DHCP is disabled on the S6700 globally. As a result, the DHCP function does not take effect.

l The DHCP relay function is disabled on the S6700. As a result, the DHCP relay function does not take effect.

l The DHCP relay agent is not bound to the DHCP server.

– The DHCP server IP address is not configured on the DHCP relay agent.

– The VLANIF interface on the DHCP relay agent is not bound to a DHCP server group or the bound DHCP server group contains no DHCP server.

l The configurations of other devices along the link are incorrect.



Figure 5-3 Troubleshooting flowchart for the failure to allocate IP addresses using the DHCP relay agent


NOTE


Procedure

Step 1 Check whether a fault occurs between the DHCP client and the DHCP server.

1. Check whether a fault occurs between the DHCP client and the DHCP relay agent.

Manually configure an IP address on the DHCP client to be on the same network segment as the user-side VLANIF interface of the DHCP relay agent. This IP address must be different from allocated IP addresses. Then ping the peer device from the IP address to check whether the link works properly.

l If the ping operation fails, rectify the fault on the link according to 6.2.1 A Ping Operation Fails.

2. Check whether a fault occurs between the DHCP relay agent and the DHCP server.

Run the ping -a source-ip-address destination-ip-address command on the DHCP relay agent. source-ip-address specifies the IP address of the user-side interface of the DHCP relay agent and destination-ip-address specifies the IP address of the DHCP server.

l If the ping operation fails, rectify the fault on the link according to 6.2.1 A Ping Operation Fails.

l If the ping operation succeeds, go to step 2.

Step 2 Check whether DHCP is enabled globally on the DHCP relay agent.

NOTE

If DHCP is not enabled globally, the S6700 does not process DHCP messages sent by DHCP clients.



l If no information is displayed, DHCP is disabled. In this case, run the dhcp enable command to enable DHCP.

l If the dhcp enable command is displayed, DHCP is enabled. Go to step 3.

Step 3 Check that the DHCP relay function is enabled.

NOTE

l If the DHCP relay function is disabled, the DHCP client cannot obtain an address on another network segment.

l If the address allocation mode (global/interface) and relay are configured on the S6700 simultaneously, the S6700 preferentially functions as the DHCP server. When the DHCP server fails to allocate IP addresses, the S6700 functions as the DHCP relay agent.

In the view of the VLANIF interface on the S6700, run the display this command to check whether the DHCP relay function is enabled.

l If dhcp select relay is displayed, the DHCP relay function is enabled. Go to step 4.

l If no information is displayed, the DHCP relay function is disabled. Run the dhcp select relay command to enable the DHCP relay function.

Step 4 Check that the DHCP relay agent is bound to the DHCP server.

NOTE

If the DHCP relay agent is not bound to the DHCP server, no DHCP server can allocate IP addresses to DHCP clients connected to the DHCP relay agent.

In the view of the VLANIF interface on the S6700, run the display this command to check whether the DHCP relay agent is bound to the DHCP server.

l If dhcp relay server-ip ip-address is displayed, the DHCP server IP address is configured on the DHCP relay agent. Then go to step 6.

l If dhcp relay server-select group-name is displayed, the VLANIF interface on the DHCP relay agent is bound to a DHCP server group. Then go to step 5.

l If no information is displayed, the DHCP server IP address is not configured on the DHCP relay agent. Configure the DHCP server by using either of the following methods:

– Run the dhcp relay server-ip ip-address command to configure the DHCP server IP address on the DHCP relay agent.

– Run the dhcp relay server-select group-name command to bind the VLANIF interface to a DHCP server group and run the dhcp-server command to add a DHCP server to the DHCP server group.
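The checks in steps 3 and 4 can be sketched as a small script that reads saved display this output. This is only an illustration: the function name relay_binding, the return labels, and the sample interface configuration (including the address and group name) are assumptions, not part of the S6700 software.

```python
def relay_binding(display_this_output: str) -> str:
    """Classify the DHCP relay state of a VLANIF interface.

    The input is the text printed by 'display this' in the VLANIF
    interface view. The return labels are hypothetical names.
    """
    lines = [line.strip() for line in display_this_output.splitlines()]
    # Step 3: relay must be enabled on the interface.
    if "dhcp select relay" not in lines:
        return "relay-disabled"
    # Step 4: a server IP address configured directly on the interface.
    if any(line.startswith("dhcp relay server-ip ") for line in lines):
        return "server-ip"
    # Step 4: the interface is bound to a DHCP server group (then check step 5).
    if any(line.startswith("dhcp relay server-select ") for line in lines):
        return "server-group"
    return "unbound"


# Hypothetical interface configuration used only for illustration.
sample = """\
interface Vlanif10
 ip address 10.1.1.1 255.255.255.0
 dhcp select relay
 dhcp relay server-select group1
"""
print(relay_binding(sample))
```

A result of "server-group" corresponds to continuing with step 5, and "unbound" corresponds to the case in which no DHCP server is configured.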

Step 5 Check that DHCP servers are added to the DHCP server group.

NOTE

If the VLANIF interface of the DHCP relay agent is bound to the DHCP server group but no DHCP server is added to the DHCP server group, no DHCP server can allocate IP addresses to DHCP clients connected to the DHCP relay agent.

Run the display dhcp server group group-name command to check whether DHCP servers are added to the DHCP server group.

l If the Server-IP field is displayed, DHCP servers are added to the DHCP server group. Go to step 6.

l If the Server-IP field is not displayed, no DHCP server is added to the DHCP server group. Run the dhcp-server command to add DHCP servers to the DHCP server group.

Step 6 Check that the configurations of other devices along the link between the DHCP client and the DHCP server are correct, including the DHCP server, DSLAMs, LAN switches, and clients.

Check whether the configurations of other devices along the link are correct. If not, modify related configurations. After the preceding steps are complete, if the client still cannot obtain an IP address, go to step 7.

NOTE

For details on how to check the configurations of the DHCP server, see 5.2.1 A Client Cannot Obtain an IP Address (the S6700 Functions as the DHCP Server).



----End


Relevant Alarms

None.

Relevant Logs

None.






5.3 DHCPv6 Troubleshooting

5.3.1 A Client Cannot Obtain an IPv6 Address (the S6700 Functions as the DHCPv6 Relay Agent)

Common Causes

This fault is commonly caused by one of the following:

l The S6700 does not have an IPv6 license.

l DHCP is disabled on the S6700 globally.

l The IPv6 packet forwarding function is disabled on the S6700.

l IPv6 is disabled on the VLANIF interface of the S6700.

l The DHCPv6 relay function is disabled on the VLANIF interface of the S6700.

l The link between the DHCPv6 client and the DHCPv6 server is faulty.

– The link between the DHCPv6 client and the DHCPv6 relay agent is faulty.

– The link between the DHCPv6 relay agent and the DHCPv6 server is faulty.

l The configurations of other devices along the link are incorrect.


Figure 5-4 Troubleshooting flowchart for a failure to allocate an IPv6 address

[Flowchart: a client cannot obtain an IPv6 address from the DHCPv6 server through the DHCPv6 relay agent. Checks, in order: the IPv6 license is available (if not, purchase the license); DHCP is enabled globally on the relay agent; IPv6 packet forwarding is enabled; IPv6 is enabled on the VLANIF interface; DHCPv6 relay is enabled on the VLANIF interface; the link between the client and the server is not faulty; the configurations of other devices are correct. After each corrective action, check whether the fault is rectified.]



Procedure

Step 1 Check whether the DHCPv6 relay agent has the IPv6 license.



NOTE

Generally, the commands related to IPv6 functions can be used on a newly purchased device, but the IPv6 functions are not implemented. To implement IPv6 functions on the S6700, purchase the license from Huawei local office.

Run the display license command to check whether the DHCPv6 relay agent has the license for IPv6 functions. IPv6 functions on the S6700 are controlled by the license.

l If the DHCPv6 relay agent has the license for IPv6 functions, go to step 2.

l If the DHCPv6 relay agent does not have the license for IPv6 functions, purchase the license from Huawei local office.

Step 2 Check whether DHCP is enabled globally on the DHCPv6 relay agent.

NOTE

If DHCP is not enabled globally, the S6700 does not process DHCPv6 packets sent by DHCPv6 clients.



l If no information is displayed, DHCP is disabled. In this case, run the dhcp enable command to enable DHCP.

l If dhcp enable is displayed, DHCP is enabled. Go to step 3.

Step 3 Check whether IPv6 packet forwarding is enabled on the DHCPv6 relay agent.

Run the display current-configuration | include ipv6 command to check whether IPv6 packet forwarding is enabled. By default, IPv6 packet forwarding is disabled.

l If ipv6 is displayed, IPv6 packet forwarding is enabled. Go to step 4.

l If ipv6 is not displayed, IPv6 packet forwarding is disabled. Run the ipv6 command to enable IPv6 packet forwarding.

Step 4 Check whether IPv6 is enabled on the VLANIF interface of the DHCPv6 relay agent.

Run the display this command in the view of the VLANIF interface on the client side to check whether IPv6 is enabled. By default, IPv6 is disabled.

l If ipv6 enable is displayed, IPv6 is enabled. Go to step 5.

l If ipv6 enable is not displayed, IPv6 is disabled. Run the ipv6 enable command in the VLANIF interface view to enable IPv6.

Step 5 Check whether DHCPv6 relay is enabled on the VLANIF interface of the DHCPv6 relay agent.

NOTE

If DHCPv6 relay is disabled, the DHCPv6 client cannot obtain an IPv6 address on another network segment.

Run the display this command in the view of the VLANIF interface on the client side to check whether DHCPv6 relay is enabled. By default, DHCPv6 relay is disabled.

l If dhcpv6 relay destination ipv6-address is displayed, DHCPv6 relay is enabled. Go to step 6.

l If dhcpv6 relay destination ipv6-address is not displayed, DHCPv6 relay is disabled. Run the dhcpv6 relay destination ipv6-address command in the VLANIF interface view to enable DHCPv6 relay.
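Steps 2 to 5 each look for one configuration line, so the sequence can be sketched as a short checker that reports the first step that fails. This is a sketch under stated assumptions: the function names, the split into "global" and "interface" line lists, and the sample addresses are hypothetical; only the keywords come from the steps above.

```python
# (step number, configuration keyword, where to look, exact match required)
CHECKS = [
    (2, "dhcp enable", "global", True),
    (3, "ipv6", "global", True),
    (4, "ipv6 enable", "interface", True),
    (5, "dhcpv6 relay destination", "interface", False),
]


def has_line(lines, keyword, exact):
    """Return True if a configuration line equals or starts with the keyword."""
    for line in lines:
        line = line.strip()
        if line == keyword or (not exact and line.startswith(keyword + " ")):
            return True
    return False


def first_failed_step(global_lines, interface_lines):
    """Return the number of the first failing step, or None if all pass."""
    scopes = {"global": global_lines, "interface": interface_lines}
    for step, keyword, scope, exact in CHECKS:
        if not has_line(scopes[scope], keyword, exact):
            return step
    return None


# Hypothetical configuration: DHCPv6 relay is missing on the interface.
print(first_failed_step(["dhcp enable", "ipv6"], [" ipv6 enable"]))
```

If the function returns None, the configuration on the relay agent passes steps 2 to 5 and the link checks in step 6 are the next place to look.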

Step 6 Check whether a fault occurs on the link between the DHCPv6 client and the DHCPv6 server.



1. Check whether a fault occurs on the link between the DHCPv6 client and the DHCPv6 relay agent.

Manually set an IPv6 address on the DHCPv6 client that is on the same network segment as the user-side interface of the DHCPv6 relay agent and that does not conflict with allocated IPv6 addresses. Then ping the peer device from this IPv6 address to check whether the link works properly.

l If the ping operation fails, rectify the fault on the link according to 6.2.1 A Ping Operation Fails.

2. Check whether the link between the DHCPv6 relay agent and the DHCPv6 server is faulty.

Run the ping ipv6 -a source-ip-address destination-ip-address command on the DHCPv6 relay agent. source-ip-address specifies the IPv6 address of the VLANIF interface of the DHCPv6 relay agent, and destination-ip-address specifies the IPv6 address of the DHCPv6 server.

l If the ping operation fails, rectify the fault on the link according to 6.2.1 A Ping Operation Fails.

Step 7 Check whether the configurations of other devices along the link are correct, including the DHCPv6 server, DSLAM, LAN switch, and clients.

Check whether the configurations of other devices along the link are correct. If not, modify related configurations. After the preceding steps, if the client still cannot obtain an IPv6 address, go to step 8.



----End


Relevant Alarms

None.

Relevant Logs

None.

5.4 IPv6 Troubleshooting

5.4.1 IPv6 Service Traffic Cannot Be Forwarded

Common Causes




This fault is commonly caused by one of the following:

l The IPv6 routing configuration is incorrect.

l The switch cannot obtain the neighbor entry corresponding to the next hop.

l The protocol status of the interface is not Up.



[Troubleshooting flowchart: IPv6 traffic cannot be forwarded. Checks, in order: whether the device can be pinged (if so, check the ACL rule); whether IPv6 is enabled (if not, enable IPv6); whether routes exist (if not, reconfigure routing); whether the neighbor entry is correct; whether the protocol status of the interface is Up (if not, correct the configuration so that the protocol status becomes Up). After each corrective action, check whether the fault is rectified.]





Procedure

Step 1 Check that the source address and destination address can be pinged on the switch.

Run the ping ipv6 command to check whether the source address and the destination address can be pinged on the switch.

l If the ping operation succeeds, go to step 2.

l If the ping operation fails, go to step 3.


Step 2 Check whether an ACL configured on the switch matches packets.

Run the display acl above all command to check whether a user-defined ACL matches service traffic. Capture packets. Check whether the information (such as the IP address, MAC address, DSCP priority, VLAN ID, and 802.1p priority) in the packets matches the rule in the user-defined ACL.

l If yes, run the rule command to change the rule in the user-defined ACL.

l If not, go to step 7.

Step 3 Check that IPv6 is enabled on the switch.

Check whether IPv6 is enabled in the system view and in the interface view. By default, IPv6 is disabled in the system view and in the interface view.

l If IPv6 is not enabled, enable it as follows:

– Run the display current-configuration | include ipv6 command in the system view to check whether the ipv6 field exists. If not, run the ipv6 command.

– Run the display ipv6 interface interface-type interface-number command to check whether the IPv6 is enabled field exists. If not, run the ipv6 enable command in the interface view.

l If IPv6 is enabled, go to step 4.

Step 4 Check that there are routes to the destination address on the switch.

Run the display ipv6 routing-table command to check whether there are routes to the destination address in the IPv6 routing table on the switch. The following information indicates that there are routes to the destination address.

Routing Table : Public
        Destinations : 1        Routes : 1

Destination  : ::1                  PrefixLength : 128
NextHop      : ::1                  Preference   : 0
Cost         : 0                    Protocol     : Static
RelayNextHop : ::                   TunnelID     : 0x0
Interface    : Vlanif10             Flags        : D

l If there are no routes to the destination address, check whether the routing configuration is correct. If not, configure the IPv6 routing according to the Quidway S6700 Ethernet Switches Configuration Guide - IP Routing.

l If there are routes to the destination address, go to step 5.

Step 5 Check whether the neighbor entry learned by the switch is correct.

Run the display ipv6 neighbors command to check the neighbor entry.

l If there is no neighbor entry, the switch has not obtained the neighbor entry corresponding to the next hop. Go to step 6.

l If there is the neighbor entry corresponding to the next hop, the next hop is reachable. Go to step 6.

Step 6 Check whether the IPv6 protocol status on the VLANIF interface of the switch is Up.

l If the IPv6 protocol status on the VLANIF interface of the switch is Down, check the following items.

Check Item: Physical status
Solution: If the VLANIF interface status is Down, the corresponding physical interface may be Down. Rectify the fault according to 3.1.1 Connected Ethernet Interfaces Down.

Check Item: Mode of adding an interface to a VLAN
Solution: Run the display vlan vlan-id command to check whether the modes of adding interfaces to VLANs at both ends are the same. An interface can be added to a VLAN in untagged or tagged mode. If the modes are different, change the configurations to be the same.

Check Item: Address status
Solution: Run the display ipv6 interface brief command to check the IPv6 address status. If the IPv6 address status is DUPLICATE, IPv6 addresses conflict. Locate the device with the conflicting IPv6 address and reconfigure an IPv6 address.

NOTE

A newly configured IPv6 address is in the TENTATIVE state for a short time. When the IPv6 address is in the TENTATIVE state, addresses do not conflict.

l If the protocol status of the interface is Up, go to step 7.

----End


Relevant Alarms

None.

Relevant Logs

None.



6 IP Forwarding and Routing


6.1 Layer 2 and Layer 3 Packet Forwarding Troubleshooting

6.2 Ping Troubleshooting

This chapter describes common causes of a ping failure, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

6.3 Tracert Troubleshooting

This chapter describes common causes of a Tracert failure, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

6.4 OSPF Troubleshooting

6.5 IS-IS Troubleshooting

6.6 BGP Troubleshooting

6.7 RIP Troubleshooting

6.8 MCE Troubleshooting

This chapter describes common causes of Multi-VPN-Instance CE (MCE) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.






6.1 Layer 2 and Layer 3 Packet Forwarding Troubleshooting

6.1.1 Fault Location Roadmap

During Layer 2 and Layer 3 traffic forwarding, packet loss often occurs. To locate the fault, perform the following operations:

1. Locate the device where packets are lost.

2. Locate the cause.

3.


l Locate the device where packets are lost.

1. Run the display interface interface-type interface-number command in the interface view to check the statistics on received and sent packets. The command output indicates that packets are not lost on the local device.


XGigabitEthernet 0/0/1 current state : UP
Line protocol current state : UP
Description:HUAWEI, Quidway Series, XGigabitEthernet 0/0/1 Interface
Switch Port,PVID : 10,The Maximum Frame Length is 9216
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0018-2000-0140
Last physical up time : 2010-02-02 13:00:36 UTC+08:00
Last physical down time : 2010-02-02 10:48:49 UTC+08:00
Port Mode: COMMON FIBER
Speed : 1000, Loopback: NONE
Duplex: FULL, Negotiation: ENABLE
Mdi : NORMAL
Last 300 seconds input rate 200 bits/sec, 0 packets/sec
Last 300 seconds output rate 192 bits/sec, 0 packets/sec
Input peak rate 9488 bits/sec,Record time: 2010-02-02 13:00:39
Output peak rate 161305720 bits/sec,Record time: 2010-02-03 19:27:42

14931
0
Discard: 0, Total Error: 0
CRC: 0, Giants: 0
Jabbers: 0, Throttles: 0
Runts: 0, DropEvents: 0
Alignments: 0, Symbols: 0
Ignoreds: 0, Frames: 0

307332535
0
Discard: 0, Total Error: 0
Collisions: 0, ExcessiveCollisions: 0
Late Collisions: 0, Deferreds: 0
Buffers Purged: 0

Input bandwidth utilization threshold : 100.00%
Output bandwidth utilization threshold: 100.00%
Input bandwidth utilization : 0.01%
Output bandwidth utilization : 0.01%

If packets are received and sent correctly (the value of Discard and Error fields is not increasing), the local device is running properly. Check whether the next node along the forwarding path discards packets by using the preceding method.
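Whether the Discard and Total Error counters are increasing can only be judged by comparing two readings taken some time apart. The following sketch compares two saved copies of the display interface output; the function names and the shortened sample text are assumptions made for illustration.

```python
import re

# Matches the "Discard: N, Total Error: N" lines of the Input and Output sections.
COUNTER_LINE = re.compile(r"Discard:\s*(\d+),\s*Total Error:\s*(\d+)")


def error_counters(output):
    """Return (discard, total_error) pairs in the order they appear."""
    return [(int(d), int(e)) for d, e in COUNTER_LINE.findall(output)]


def counters_increasing(before, after):
    """Return True if any Discard or Total Error value grew between readings."""
    pairs = zip(error_counters(before), error_counters(after))
    return any(
        new_d > old_d or new_e > old_e
        for (old_d, old_e), (new_d, new_e) in pairs
    )


# Shortened, hypothetical readings: input counters first, output counters second.
first_reading = "Discard: 0, Total Error: 0\nDiscard: 0, Total Error: 0\n"
second_reading = "Discard: 12, Total Error: 0\nDiscard: 0, Total Error: 0\n"
print(counters_increasing(first_reading, second_reading))
```

If the comparison reports no growth, the local device is forwarding correctly and the same comparison can be repeated on the next node along the path.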

2. Apply a traffic policy to the inbound interface and outbound interface of the device where packet loss occurs. Collect statistics on packets of specified type in the inbound and outbound interfaces and check whether these packets are discarded on the local device.

For example, to collect statistics on ICMP packets with the source address of 10.142.132.248 and destination address of 10.142.132.81 on XGigabitEthernet0/0/2, run the following commands.


[Quidway] acl 3009
[Quidway-acl-adv-3009] rule 5 permit icmp source 10.142.132.248 0 destination 10.142.132.81 0
[Quidway-acl-adv-3009] quit
[Quidway] traffic classifier icmp
[Quidway-classifier-icmp] if-match acl 3009
[Quidway-classifier-icmp] quit
[Quidway] traffic behavior icmp
[Quidway-behavior-icmp] statistic enable
[Quidway-behavior-icmp] quit
[Quidway] traffic policy icmp
[Quidway-trafficpolicy-icmp] classifier icmp behavior icmp
[Quidway-trafficpolicy-icmp] quit
[Quidway] interface XGigabitEthernet 0/0/2
[Quidway-XGigabitEthernet0/0/2] traffic-policy icmp outbound
[Quidway-XGigabitEthernet0/0/2] quit
[Quidway] display traffic policy statistics interface XGigabitEthernet 0/0/2 outbound


Traffic policy outbound: icmp
Rule number: 1
Current status: OK!
Board : 2
Item                Packets             Bytes
----------------------------------------------
Matched             0                   0
+--Passed           0                   0
+--Dropped          0                   0
+--Filter           0                   0
+--URPF             -                   -
+--CAR              0                   0

– If the value of the Dropped field is not 0 in the inbound direction, the local device or the upstream device may be faulty. Locate the fault again.

– If the value of the Dropped field is not 0 in the outbound direction, packet loss occurs on the local device.

– If the number of forwarded packets in the inbound direction is the same as the number of forwarded packets in the outbound direction, no packet loss occurs on the local device. Check whether packets are lost on the downstream device.

NOTE

l Configure different matching rules in the traffic classifier to collect statistics on packets of specified type. For example, run the if-match cvlan-id command to configure a matching rule for classifying traffic based on the inner and outer VLAN IDs of QinQ packets so that the statistics on QinQ packets are collected.

l Run the reset traffic policy statistics { global [ slot slot-id ] | interface interface-type interface-number | vlan vlan-id } { inbound | outbound } command to clear the statistics.
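The three rules for reading the Dropped field can be sketched as a function over the inbound and outbound statistics of the same traffic policy. The function names and sample values are hypothetical, and treating the Passed counter as the "number of forwarded packets" is an assumption of this sketch.

```python
import re


def counter(stats_output, name):
    """Return the Packets value of a '+--<name>' row in the statistics output."""
    match = re.search(rf"\+--{re.escape(name)}\s+(\d+)\s+(\d+)", stats_output)
    if match is None:
        raise ValueError(f"row {name!r} not found")
    return int(match.group(1))


def locate_loss(inbound_stats, outbound_stats):
    """Apply the three rules for the Dropped field, in the order given above."""
    if counter(inbound_stats, "Dropped") > 0:
        return "local or upstream device may be faulty"
    if counter(outbound_stats, "Dropped") > 0:
        return "packet loss occurs on the local device"
    # Assumption: 'Passed' is the count of forwarded packets.
    if counter(inbound_stats, "Passed") == counter(outbound_stats, "Passed"):
        return "no loss on the local device; check the downstream device"
    return "inconclusive"


# Hypothetical statistics: 10 packets pass inbound, 3 are dropped outbound.
inbound = "Matched 10 980\n+--Passed 10 980\n+--Dropped 0 0\n"
outbound = "Matched 10 980\n+--Passed 7 686\n+--Dropped 3 294\n"
print(locate_loss(inbound, outbound))
```

Clearing the statistics before each test run keeps the inbound and outbound counters comparable.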

l Locate the cause according to the type of discarded packets.

– If Layer 2 packets are lost, see 6.1.2 Layer 2 Packets Are Lost.

– If Layer 3 packets are lost, see the procedure for lost Layer 3 packets (Figure 6-2).

l



6.1.2 Layer 2 Packets Are Lost

Common Causes

This fault is commonly caused by one of the following:

l The interface is not working properly. For example, the physical status of the interface is Down; the interface works in half duplex mode; the auto-negotiation status on the interface is different from that on the remote interface.

l The interface is blocked by STP, RRPP, Smart Link, or loop detection.

l The interface is not added to a specified VLAN; therefore, it does not allow packets from the VLAN to pass through.

l Incorrect MAC addresses are learned.

l There are MAC address configurations that cause packet loss, for example:

– MAC address learning is disabled on the interface and the interface is configured to drop packets with unknown source MAC addresses.

– The interface is configured with MAC address limiting rules and discards packets with new source MAC addresses if the number of learned MAC addresses reaches the limit.

– A static MAC address is configured.

– A blackhole MAC address is configured.

– Port security is enabled on the interface.

l The interface is configured to discard the packets that do not match any selective QinQ or VLAN mapping entry.

l The interface is configured to discard incoming tagged packets.

l The interface is not configured with the BPDU function; therefore, it cannot transparently transmit BPDUs.


Figure 6-1 Layer 2 packets are lost

[Flowchart: Layer 2 packets are lost. Checks, in order: the interfaces work properly (if not, ensure that the interfaces are Up and work in full duplex mode, and auto-negotiation is enabled); the interface is not blocked (if it is, modify the configuration so that the interface is not blocked by protocols such as STP and RRPP); the VLAN configuration is correct; the MAC address is learned correctly (if not, reconfigure the mappings between MAC address, VLAN ID, and interface); the MAC address configuration is correct; BPDUs are transmitted (if not, enable the interface to send BPDUs). After each corrective action, check whether the fault is rectified.]



Procedure

Step 1 Check that the interfaces in communication are working properly.

Run the display interface interface-type interface-number command on the local device and remote device to check that the interfaces in communication are working properly.


XGigabitEthernet0/0/1 current state : UP

Line protocol current state : UP

Description:HUAWEI, Quidway Series, XGigabitEthernet0/0/1 Interface

Switch Port,PVID : 10,The Maximum Frame Length is 9216

IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0018-2000-0140

Last physical up time : 2010-02-02 13:00:36 UTC+08:00

Last physical down time : 2010-02-02 10:48:49 UTC+08:00

Port Mode: COMMON FIBER


Duplex: FULL, Negotiation: ENABLE

---- More ----

If an interface is Down, rectify the interface fault according to 3.1.1 Connected Ethernet Interfaces Down.

If the interfaces are Up and working properly, go to step 2.

Step 2 Check whether the local interface is blocked by STP, RRPP, Smart Link, or loop detection.

STP and RRPP are used as examples.

l If STP is configured, check whether the interface is blocked by STP. Run the display stp brief command to check the status of the interface.






The value of STP state should be FORWARDING. If the value of STP state is DISCARDING, the interface is blocked. Run the stp priority priority-level command to change the STP priority of the local switch so that the local switch is selected as the root switch and the interface is not blocked.

The STP priority ranges from 0 to 61440. A smaller value indicates a higher priority. Ensure that the local switch has the smallest priority value so that it can be selected as the root switch.

If the value of STP state is FORWARDING, the interface is not blocked.

l If RRPP is configured, check whether the interface is blocked by RRPP. Run the display rrpp verbose domain domain-index command to check the interface status.


Domain Index : 1




RRPP Ring : 1

Ring Level : 0

Node Mode : Master

Ring State : Failed




If the value of Port status is BLOCK, the interface is blocked. The preceding information indicates that the secondary interface is blocked by RRPP. Modify the RRPP configuration to configure the interface as the primary interface so that it can forward packets.

If the value of Port status is Up, the interface is not blocked.






NOTE

Generally, an interface cannot run multiple ring protocols. If a ring protocol is configured on the interface, check the protocol type and the interface status.

l If the interface is blocked, modify the RRPP configuration to allow the interface to forward packets. For details, see S6700 Series Ethernet Switches Configuration Guide.

l If the interface is not blocked, go to step 3.
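The effect of the stp priority command in the STP case above can be illustrated with a small model of root election: the bridge with the smallest priority value becomes the root. The sketch assumes standard STP behaviour that the text does not state, namely that priorities are multiples of 4096 and that equal priorities are resolved in favour of the lower MAC address. The switch names and the second MAC address are hypothetical.

```python
def valid_priority(priority):
    """STP priority must lie in 0..61440 (assumed to be a multiple of 4096)."""
    return 0 <= priority <= 61440 and priority % 4096 == 0


def elect_root(bridges):
    """Return the name of the bridge with the lowest (priority, MAC) pair.

    bridges maps a switch name to a (priority, mac) tuple, where mac uses
    the xxxx-xxxx-xxxx notation shown in the command output.
    """
    def bridge_id(name):
        priority, mac = bridges[name]
        return (priority, int(mac.replace("-", ""), 16))

    return min(bridges, key=bridge_id)


# Hypothetical two-switch network: the local switch is set to priority 0.
bridges = {
    "local": (0, "0018-2000-0140"),
    "peer": (32768, "0018-2000-0001"),
}
print(elect_root(bridges))
```

With priority 0 the local switch wins the election regardless of its MAC address, which is why lowering the priority value unblocks its interfaces.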

Step 3 Check that the VLAN configuration on the interface is correct.

Run the display vlan vlan-id command to check whether the interface is added to any VLAN in untagged or tagged mode.

NOTE

l If the interface is configured with a PVID, the interface adds the PVID to untagged incoming packets.

l If selective QinQ is configured on the interface, add the interface to the VLAN specified by the outer VLAN tag that replaces the original outer VLAN tag of the packets.

<Quidway> display vlan 10

--------------------------------------------------------------------------------

U: Up; D: Down; TG: Tagged; UT: Untagged;

MP: Vlan-mapping; ST: Vlan-stacking;

#: ProtocolTransparent-vlan; *: Management-vlan;

--------------------------------------------------------------------------------

VID Type Ports

--------------------------------------------------------------------------------

10   common  UT:XGE0/0/1(D)
             TG:XGE0/0/2(U)

VID  Status  Property      MAC-LRN Statistics Description
--------------------------------------------------------------------------------
10   enable  default       enable  disable    VLAN 0010

l If the interface is not added to the specified VLAN, add the interface to the specified VLAN. For details, see "VLAN Configuration" in the S6700 Series Ethernet Switches Configuration Guide - Ethernet.

l If the interface has been added to the specified VLAN, go to step 4.

Step 4 Check that the MAC address of Layer 2 packets is learned correctly.

Run the display mac-address command in the system view to check whether the bindings between the MAC address, VLAN, and interface are correct. If selective QinQ is configured on an interface, the source MAC addresses of interfaces in the VLAN specified by the replaced outer VLAN tag of the packets are learned.

<Quidway> display mac-address 0000-0000-0033

-------------------------------------------------------------------------------

MAC Address VLAN/VSI Learned-From Type

-------------------------------------------------------------------------------

0000-0000-0033 100/- XGE0/0/2 dynamic

-------------------------------------------------------------------------------

Total items displayed = 1

l If the source MAC address is not learned, reconfigure the bindings between the MAC address, VLAN ID, and interface number.

l If the MAC address is learned correctly, go to step 5.
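Checking a binding in step 4 amounts to finding a row with the expected MAC address, VLAN, and interface in the display mac-address output. The following sketch parses the four-column rows shown above; the function names are hypothetical and the parser assumes that every entry row has exactly four whitespace-separated fields.

```python
import re

MAC_FORMAT = re.compile(r"[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}")


def parse_mac_entries(output):
    """Extract MAC address entries from 'display mac-address' output."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 4 or not MAC_FORMAT.fullmatch(fields[0]):
            continue  # header, separator, or summary line
        vlan = fields[1].split("/")[0]  # "100/-" -> "100"
        if not vlan.isdigit():
            continue
        entries.append({
            "mac": fields[0].lower(),
            "vlan": int(vlan),
            "interface": fields[2],
            "type": fields[3],
        })
    return entries


def binding_is_correct(output, mac, vlan, interface):
    """Return True if the expected MAC/VLAN/interface binding is present."""
    return any(
        e["mac"] == mac.lower() and e["vlan"] == vlan and e["interface"] == interface
        for e in parse_mac_entries(output)
    )


sample = """\
MAC Address    VLAN/VSI    Learned-From    Type
0000-0000-0033 100/-       XGE0/0/2        dynamic
Total items displayed = 1
"""
print(binding_is_correct(sample, "0000-0000-0033", 100, "XGE0/0/2"))
```

A False result for the expected interface corresponds to the case in which the binding must be reconfigured.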

Step 5 Check whether any MAC address configuration causes packet loss.






Check Item: MAC address learning is disabled on the interface and the interface is configured to drop packets with unknown source MAC addresses.
Check Method: Run the display this command in the interface view. Check whether the command output contains information "mac-address learning disable action discard."
Description: If this configuration is performed on the interface, the interface discards packets whose source MAC addresses do not match any MAC addresses.

Check Item: A static MAC address is configured.
Check Method: Run the display mac-address static command to view static MAC addresses.
Description: If a static MAC address is configured, only the interface bound to the static MAC address processes the packets with this MAC address. Other interfaces discard the packets with this MAC address.

Check Item: A blackhole MAC address is configured.
Check Method: Run the display mac-address blackhole command to view blackhole MAC addresses.
Description: When a blackhole MAC address is configured, the system discards a packet if the source or destination MAC address of the packet is the blackhole MAC address.

Check Item: Port security is configured.
Check Method: Run the display this command on the interface. Check whether the command output contains information "port-security enable."
Description: After port security is enabled on an interface, MAC addresses learned by the interface change to secure dynamic MAC addresses. When the maximum number of secure dynamic MAC addresses learned on an interface reaches the limit (the value is 1 by default), the interface does not learn new MAC addresses. It discards packets with new source MAC addresses.

l If packets are lost because of incorrect MAC address configurations, modify the configurations. For details, see "MAC Address Configuration" in the S6700 Series Ethernet Switches Configuration Guide - Ethernet.

l If the configurations are correct, go to step 6.

Step 6 Check whether configurations affecting packet forwarding are performed on the interface.

Run the display this command in the interface view to view the configuration on the interface.



qinq vlan-translation miss-drop
port discard tagged-packet
#
return

l If the qinq vlan-translation miss-drop command is used on an interface configured with selective QinQ and VLAN mapping, the interface discards the received packets that do not match any selective QinQ or VLAN mapping entry.

l If the port discard tagged-packet command is used, the interface discards incoming tagged packets.

l If packets are discarded because of either of the preceding configurations, run the undo port discard tagged-packet or undo qinq vlan-translation miss-drop command to cancel the configuration.
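The check in step 6 can be sketched as a lookup that maps each of the two discard configurations to its undo command. The function name and the sample interface configuration are assumptions; the command strings are the ones named in this step.

```python
# Configuration line found in 'display this' -> command that cancels it.
UNDO_COMMANDS = {
    "qinq vlan-translation miss-drop": "undo qinq vlan-translation miss-drop",
    "port discard tagged-packet": "undo port discard tagged-packet",
}


def undo_commands(display_this_output):
    """Return the undo commands for discard configurations that are present."""
    present = {line.strip() for line in display_this_output.splitlines()}
    return [undo for command, undo in UNDO_COMMANDS.items() if command in present]


# Hypothetical interface configuration containing both discard settings.
sample = """\
#
interface XGigabitEthernet0/0/1
 qinq vlan-translation miss-drop
 port discard tagged-packet
#
return
"""
for command in undo_commands(sample):
    print(command)
```

An empty result means that neither configuration is present on the interface, so the procedure continues with the next step.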


Step 7 Check whether the packets are BPDUs.

Generally, the destination MAC address of BPDUs is 01:80:C2:00:00:xx. By default, an interface discards received BPDUs. To configure the interface to transparently transmit BPDUs, configure Layer 2 protocol transparent transmission. For details, see "Layer 2 Protocol Transparent Transmission Configuration" in S6700 Series Ethernet Switches Configuration Guide - Ethernet.
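When captured packets are examined, the 01:80:C2:00:00:xx pattern can be tested with a short helper. This is a minimal sketch: the function name is hypothetical, and it accepts both colon-separated and xxxx-xxxx-xxxx notation as a convenience.

```python
import re


def is_bpdu_destination(mac):
    """Return True if the MAC address matches 01:80:C2:00:00:xx."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    # 12 hexadecimal digits in total, the first 10 fixed by the pattern.
    return len(digits) == 12 and digits.startswith("0180c20000")


print(is_bpdu_destination("01:80:C2:00:00:00"))
print(is_bpdu_destination("00:18:20:00:01:40"))
```

Packets whose destination address passes this test are subject to the default BPDU discard behaviour described in this step.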




----End


Relevant Alarms

None.

Relevant Logs

None.


Common Causes

This fault is commonly caused by one of the following:

l The interface is not working properly. For example, the physical status of the interface is Down; the interface works in half duplex mode; the auto-negotiation status on the interface is different from that on the remote interface.

l The interface is blocked by STP, RRPP, or loop detection.

l The route is unreachable.

l The local device does not learn the ARP entry mapping the remote device.

l The traffic policy applied to the interface, VLAN, VLANIF interface, or system contains the deny action.

l Traffic suppression is configured on the interface or in the VLAN.


Figure 6-2 Layer 3 packets are lost

[Flowchart: Layer 3 packets are lost. Checks, in order: the interfaces work properly (if not, ensure that the interfaces are Up and work in full duplex mode, and auto-negotiation is enabled); the interface is not blocked (if it is, modify the configuration so that the interface is not blocked by protocols such as STP and RRPP); routes exist (if not, modify the route configuration); the ARP entry exists (if not, see "Ping Operation Failed"); there are no configurations that lead to packet loss (if there are, delete or modify the configuration). After each corrective action, check whether the fault is rectified.]



Procedure

Step 1 Check that the interfaces in communication are working properly.

Run the display interface interface-type interface-number command on the local device and remote device to check that the interfaces in communication are working properly.


XGigabitEthernet0/0/1 current state : UP


Description:HUAWEI, Quidway Series, XGigabitEthernet0/0/1 Interface

Switch Port,PVID : 10,The Maximum Frame Length is 9216


Last physical up time : 2010-02-02 13:00:36 UTC+08:00

Last physical down time : 2010-02-02 10:48:49 UTC+08:00

Port Mode: COMMON FIBER


Duplex: FULL, Negotiation: ENABLE

---- More ----

If an interface is Down, rectify the interface fault according to 3.1.1 Connected Ethernet Interfaces Down.

If the interfaces are Up and working properly, go to step 2.

Step 2 Check whether the local interface is blocked by STP, RRPP, Smart Link, or loop detection.

STP and RRPP are used as examples.

l If STP is configured, check whether the interface is blocked by STP. Run the display stp brief command to check the status of the interface.






The value of STP state should be FORWARDING. If the value of STP state is

DISCARDING, the interface is blocked. Run the stp priority priority-level command to change the STP priority of the local switch so that the local switch is selected as the root switch and the interface is not blocked.

The STP priority ranges from 0 to 61440, in steps of 4096. A smaller value indicates a higher priority. Ensure that the local switch has the smallest priority value so that it can be selected as the root switch.

If the value of STP state is FORWARDING, the interface is not blocked.

l If RRPP is configured, check whether the interface is blocked by RRPP. Run the display rrpp verbose domain domain-index command to check the interface status.


Domain Index : 1




RRPP Ring : 1

Ring Level : 0

Node Mode : Master

Ring State : Failed




If the value of Port status is BLOCK, the interface is blocked. The preceding information indicates that the secondary interface is blocked by RRPP. Modify the RRPP configuration to configure the interface as the primary interface so that it can forward packets.

If the value of Port status is Up, the interface is not blocked.






NOTE

Generally, an interface cannot run multiple ring protocols. If a ring protocol is configured on the interface, check the protocol type and the interface status.

l If the interface is blocked, modify the RRPP configuration to allow the interface to forward packets. For details, see S6700 Series Ethernet Switches Configuration Guide.

l If the interface is not blocked, go to step 3.
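The STP guidance in step 2 relies on root election: the bridge with the smallest priority value becomes the root, and the bridge MAC address breaks ties. The following is a minimal Python sketch of that rule, not device code. The Bridge type, the switch names, and the MAC addresses are hypothetical values chosen for illustration.

```python
from typing import NamedTuple, Sequence


class Bridge(NamedTuple):
    name: str      # hypothetical label, for illustration only
    priority: int  # STP priority, 0 to 61440 in steps of 4096
    mac: str       # bridge MAC address, for example "0025-9e80-2494"


def validate_priority(priority: int) -> None:
    """Reject values that are not valid STP bridge priorities."""
    if not 0 <= priority <= 61440 or priority % 4096 != 0:
        raise ValueError(f"invalid STP priority: {priority}")


def elect_root(bridges: Sequence[Bridge]) -> Bridge:
    """Return the bridge with the lowest (priority, MAC) pair."""
    for bridge in bridges:
        validate_priority(bridge.priority)
    return min(
        bridges,
        key=lambda b: (b.priority, int(b.mac.replace("-", ""), 16)),
    )


switches = [
    Bridge("local", 32768, "0025-9e80-2494"),
    Bridge("peer", 32768, "0025-9e80-248e"),
]
# With equal priorities, the lower MAC address wins.
current_root = elect_root(switches)
# Lowering the local priority makes the local switch the root.
new_root = elect_root([switches[0]._replace(priority=0), switches[1]])
```

In this sketch, the peer is the root at first because its MAC address is lower. Setting the local priority to 0 changes the outcome, which is the effect that the stp priority command is used for in this step.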

Step 3 Check the routes.

Check the routes along the forwarding path. Check whether the local end has a route to the remote end and the remote end has a return route.

l Run the display ip routing-table ip-address command at the local end to check whether there is a reachable route to the remote end. If yes, the following information is displayed:

<Quidway> display ip routing-table 10.1.1.2

Route Flags: R - relay, D - download to fib

------------------------------------------------------------------------------

--------------


Summary Count : 1

Destination/Mask Proto Pre Cost Flags NextHop Interface

10.1.1.0/24 Direct 0 0 D 10.1.1.2 Vlanif10

If not, the display ip routing-table ip-address command does not display any information.

l Run the display fib ip-address command to view the FIB table.

<Quidway> display fib 10.10.1.0

Destination/Mask Nexthop Flag TimeStamp Interface TunnelID

10.1.1.0/24 10.1.1.2 U t[198452] Vlanif10 0x0

l If the route cannot be found, check whether the routing protocol configuration is correct according to S6700 Series Ethernet Switches Configuration Guide - IP Routing.

l If the route is found, go to step 4.
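The route check in step 3 depends on longest-prefix matching: among all entries that contain the destination, the entry with the longest mask is used. The sketch below illustrates that lookup with the Python standard library. The route list is a hypothetical example, and the function is an illustration rather than the switch's actual lookup code.

```python
import ipaddress
from typing import Optional, Sequence, Tuple

Route = Tuple[str, str]  # (destination/mask, next hop)


def longest_match(routes: Sequence[Route], destination: str) -> Optional[Route]:
    """Return the most specific route that contains the destination, or None."""
    dest = ipaddress.ip_address(destination)
    matching = [
        (ipaddress.ip_network(prefix), next_hop)
        for prefix, next_hop in routes
        if dest in ipaddress.ip_network(prefix)
    ]
    if not matching:
        return None  # comparable to an empty display ip routing-table result
    network, next_hop = max(matching, key=lambda entry: entry[0].prefixlen)
    return str(network), next_hop


# Hypothetical routing table for illustration.
table = [("10.1.0.0/16", "10.1.0.1"), ("10.1.1.0/24", "10.1.1.2")]
hit = longest_match(table, "10.1.1.2")
miss = longest_match(table, "192.168.1.1")
```

A None result corresponds to the case in which the command displays no information, so the routing configuration needs to be checked.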

Step 4 Check whether the local end has learned the ARP entry from the remote end.

Run the display arp all command to check whether the local end has learned the ARP entry from the remote end.

<Quidway> display arp all

IP ADDRESS MAC ADDRESS EXPIRE(M) TYPE INTERFACE VPN-INSTANCE

VLAN/CEVLAN

------------------------------------------------------------------------------

112.112.112.3 00d0-d0c7-ec21 S-- XGE0/0/1

12/-

8.1.1.1 00d0-d0c7-ec21 I - Vlanif8 vpna

112.112.112.1 00e0-fc17-004a 14 D-0 XGE0/0/1

12/-

7.8.60.10 00d0-d0c7-ec21 I - Vlanif60

4.1.1.1 00d0-d0c7-ec21 I - Vlanif4

------------------------------------------------------------------------------

Total:7 Dynamic:2 Static:1 Interface:3

NOTE

In the preceding information, view the EXPIRE and TYPE columns. If the EXPIRE field of an ARP entry has a value and the TYPE field contains D, the ARP entry is a dynamic ARP entry, for example 112.112.112.1. S indicates a static ARP entry, for example 112.112.112.3. I indicates the ARP entry of a local interface.

l If the local end does not learn the ARP entry from the remote end, rectify the fault according to "Ping Operation Failed".



l If the local end has learned the ARP entry from the remote end, go to step 5.
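The rule in the NOTE of step 4 (D means dynamic, S means static, I means local interface) can be applied mechanically to saved command output. The sketch below is a minimal Python parser. It assumes a whitespace-separated layout like the sample in this step, and the column spacing in SAMPLE is an assumption, so adjust the parsing if the output on your device differs.

```python
import ipaddress

# Sample lines modeled on the display arp all output in this step.
SAMPLE = """\
IP ADDRESS      MAC ADDRESS     EXPIRE(M) TYPE INTERFACE   VPN-INSTANCE
112.112.112.3   00d0-d0c7-ec21            S--  XGE0/0/1
                                          12/-
8.1.1.1         00d0-d0c7-ec21            I -  Vlanif8     vpna
112.112.112.1   00e0-fc17-004a  14        D-0  XGE0/0/1
                                          12/-
"""

KINDS = {"D": "dynamic", "S": "static", "I": "interface"}


def parse_arp(output: str) -> dict:
    """Map each IP address to its MAC address, expiry time, and entry kind."""
    entries = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # separator or wrapped VLAN/CEVLAN line
        try:
            ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # header or summary line
        has_expire = fields[2].isdigit()
        if has_expire and len(fields) < 4:
            continue  # malformed line
        type_field = fields[3] if has_expire else fields[2]
        entries[fields[0]] = {
            "mac": fields[1],
            "expire": int(fields[2]) if has_expire else None,
            "kind": KINDS.get(type_field[0], "unknown"),
        }
    return entries


entries = parse_arp(SAMPLE)
```

If the peer address is missing from the parsed entries, the local end has not learned the ARP entry from the remote end.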

Step 5 Check whether the interface, VLAN, VLANIF interface, or system has configurations that lead to packet loss.

Check whether the traffic policy is correctly applied and whether the traffic behavior and traffic classifier in the traffic policy have configurations leading to packet loss.

l Run the display traffic-policy applied-record policy-name command to check the traffic policy record.

<Quidway> display traffic-policy applied-record p1

-------------------------------------------------

Policy Name: p1

Policy Index: 3

Classifier:c1 Behavior:b1

-------------------------------------------------

*interface XGigabitEthernet0/0/3

traffic-policy p1 inbound

slot 3 : success

*interface XGigabitEthernet0/0/1


slot 1 : success

*vlan 100


slot 1 : fail

slot 3 : fail

*system

traffic-policy p1 global inbound

slot 1 : success

slot 3 : success

-------------------------------------------------

Policy total applied times: 4.

l Run the display traffic policy user-defined command to check the traffic policy configuration.

<Quidway> display traffic policy user-defined

User Defined Traffic Policy Information:

Policy: p1

Classifier: default-class

Behavior: be

-none-

Classifier: c1

Behavior: b1

Committed Access Rate:

CIR 1000 (Kbps), PIR 2000 (Kbps), CBS 125000 (byte), PBS 250000 (byte)

Color Mode: color Blind

Conform Action: pass

Yellow Action: pass

Exceed Action: discard

l Run the display traffic behavior user-defined behavior-name command to check whether the traffic behavior has configurations leading to packet loss. For example:

<Quidway> display traffic behavior user-defined b1

User Defined Behavior

Information:

Behavior: b1

Deny

l Run the display traffic classifier user-defined [ classifier-name ] command to check the traffic classifier configuration.

<Quidway> display traffic classifier user-defined

User Defined Classifier Information:

Classifier: c1

Precedence: 5

Operator: OR

Rule(s) : if-match acl 3000



l Run the display acl { acl-number | all } command to check whether the ACL contains the deny rule.

<Quidway> display acl 3000

Advanced ACL 3000, 1 rule

Acl's step is 5

rule 5 deny ip source 10.10.10.1 0

l If the configurations are incorrect, modify the configurations according to S6700 Series Ethernet Switches Configuration Guide - QoS.
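To judge whether a deny rule such as rule 5 affects the lost traffic, compare the packet's source address with the rule address under the wildcard mask. Wildcard bits set to 0 must match, and bits set to 1 are ignored. A wildcard of 0 therefore matches exactly one host. The following Python sketch illustrates the comparison. The function name is hypothetical and the code is not part of any device software.

```python
import ipaddress


def wildcard_match(address: str, rule_address: str, wildcard: str = "0") -> bool:
    """Return True if address matches rule_address under the wildcard mask."""
    addr = int(ipaddress.IPv4Address(address))
    rule = int(ipaddress.IPv4Address(rule_address))
    # "0" is the short form of the wildcard 0.0.0.0 (match a single host).
    wc = 0 if wildcard == "0" else int(ipaddress.IPv4Address(wildcard))
    care_bits = ~wc & 0xFFFFFFFF  # bits that must be equal
    return (addr & care_bits) == (rule & care_bits)


# rule 5 deny ip source 10.10.10.1 0
denied = wildcard_match("10.10.10.1", "10.10.10.1", "0")
not_denied = wildcard_match("10.10.10.2", "10.10.10.1", "0")
# A wider wildcard (hypothetical rule) covers the whole /24 segment.
segment = wildcard_match("10.10.10.77", "10.10.10.0", "0.0.0.255")
```

In the sample ACL, only packets from 10.10.10.1 match the deny rule. Combined with traffic classifier c1 and a deny behavior, such packets are dropped.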


Step 6 Check whether the interface or VLAN has traffic suppression configurations.

l Run the display this command in the interface view to view the configuration on the interface.


#

interface XGigabitEthernet0/0/2

unicast-suppression cir 100 cbs 18800

broadcast-suppression cir 100 cbs 18800

#

return

l Run the display this command in the VLAN view to view the configuration in the VLAN.


#

vlan 2

broadcast-suppression qoscar1

unicast-suppression qoscar1

#

return

l If the configurations are incorrect, modify the configurations.




----End


Relevant Alarms

None.

Relevant Logs

None.







This chapter describes common causes of a ping failure, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.


This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for a ping failure.

Common Causes

This fault is commonly caused by one of the following:

l The ping operation is incorrectly performed.


l Routes are unreachable.

l ARP entries cannot be learned correctly.

l The source end or destination end of the ping operation discards ICMP packets.


The troubleshooting roadmap is as follows:

l Check the routes between two ends.

l Check whether the device has learned the ARP entry of the peer.

l Check whether ICMP packets are discarded.

l Check whether the physical links or interfaces function properly.





Figure: Troubleshooting flowchart for a ping failure. After the fault is located, the flowchart checks the following items in order and asks whether the fault is rectified after each corrective action:

l Does transmission time out? If so, add -t to the ping command.

l Is the ping operation correct? If not, use the correct ping command.

l Is a local attack defense policy configured? If so, delete the attack defense policy.

l Is the physical and protocol status Up?

l Is a route available? If not, rectify the route fault.

l Is the ARP entry learned from the peer? If not, rectify the ARP fault.

l Is the ping packet sent to the LPU?

l Is the ARP packet sent and received?

l Does the ARP packet count change?

l Is the ICMP packet sent and received?





Procedure

Step 1 Locate the fault.

A ping operation is completed by three devices: sender SwitchA (source end), intermediate device SwitchB, and receiver SwitchC (destination end). Perform ping operations on each link to find the device where the fault occurs. Figure 6-4 shows a typical network of ping operation.

If you fail to log in to SwitchB, run the ping -a 10.1.1.1 10.1.2.4 command with the source address on SwitchA to ping SwitchC. If SwitchA fails to ping SwitchC, run the ping -a 10.1.1.1 10.1.1.2 command on SwitchA to ping SwitchB. Then you know where the fault occurs. Assume that the fault occurs between SwitchA and SwitchB.

Figure 6-4 Network diagram

The ping packet travels from SwitchA (source, XGE0/0/1) through VLANIF 10 to SwitchB (middle device, XGE0/0/1 and XGE0/0/2), and then through VLANIF 20 to SwitchC (destination, XGE0/0/1).

Device     Physical interface   VLANIF interface   IP address

SwitchA    XGE0/0/1             VLANIF 10          10.1.1.1

SwitchB    XGE0/0/1             VLANIF 10          10.1.1.2

SwitchB    XGE0/0/2             VLANIF 20          10.1.2.3

SwitchC    XGE0/0/1             VLANIF 20          10.1.2.4

Step 2 Check whether the data transmission on the link times out.

Run the ping -t time-value -v destination-address command to check the transmission duration on the link.

NOTE

-t indicates the response timeout interval. The default value is 2000 ms. -v indicates the unexpected response packet type. By default, the value is empty.

A successful ping operation means that the sender can receive the response packet within the specified period. If the sender fails to receive the response packet within the specified period, the ping operation fails. Therefore, by using the ping command with the -t and -v parameters, you can know whether the ping failure is caused by transmission timeout. If the following information is displayed, the transmission times out:

<SwitchA> ping -v -t 1 10.1.1.2


Request time out

Error: Sequence number = 1 is less than the correct = 2!

To rectify the fault caused by transmission timeout, increase the -t value. If the fault persists, go to step 3.

NOTE

If the ping succeeds only when -t has a large value, check the device status and link status, and ensure that the devices and links work properly.

If you ping a private address from a PE, run the ping -vpn-instance vpn-name destination-address command on the PE. -vpn-instance vpn-name specifies the VPN instance to which the destination address belongs.
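The approach in step 2 is to retry the ping with a larger -t value and note the smallest value at which replies arrive. The Python sketch below shows that search with a simulated probe in place of a real ping, so it runs without a device. The probe function, the round-trip time, and the list of timeouts are assumptions for illustration.

```python
from typing import Callable, Iterable, Optional


def first_working_timeout(
    probe: Callable[[int], bool], timeouts_ms: Iterable[int]
) -> Optional[int]:
    """Return the smallest timeout (ms) for which the probe succeeds, or None."""
    for timeout in sorted(timeouts_ms):
        if probe(timeout):
            return timeout
    return None


def simulated_probe(rtt_ms: int) -> Callable[[int], bool]:
    """Stand-in for a ping: succeeds only if the reply arrives in time."""
    return lambda timeout_ms: rtt_ms <= timeout_ms


# A link with a 1500 ms round-trip time fails with -t 1 but works with -t 2000.
slow_link = first_working_timeout(simulated_probe(1500), [1, 2000, 5000])
# If no timeout helps, the failure is not a timeout problem; go to step 3.
dead_link = first_working_timeout(simulated_probe(60000), [1, 2000, 5000])
```

If the ping succeeds only with a large timeout, the link works but is slow, which is why the NOTE recommends checking the device status and link status.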

Step 3 Check that the ping operation is performed correctly.

1. If you run the ping -f command, the ping packets are not fragmented. Check whether the MTU of any outbound interface along the path is smaller than the size of the ping packet. If so, the ping packet is discarded. Change the size of the ping packet to a value smaller than the MTU. If the ping packet is not discarded but the fault persists, go to substep 2. To view the MTU value on an interface, run the following command:

<SwitchA> display interface xgigabitethernet 0/0/1

current state : UP


Last line protocol up time: 2008-08-30 10:56:22

Description:HUAWEI, Quidway Series, xgigabitethernet 0/0/1 Interface

Route Port,The Maximum Transmit Unit is 1500, Hold timer is 10(sec)

2. If you run the ping -i command and specify a broadcast interface, such as an Ethernet interface, as the outbound interface, the destination address must be the address of a directly connected interface. If the destination address is not the address of a directly connected interface, use another ping command. If the fault persists, go to step 4.

NOTE

-f indicates that the ping packet is not fragmented. -i interface-name specifies the outbound interface of the ping packet. The destination address is then used as the next hop address.
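The MTU check for ping -f is simple arithmetic: the packet on the wire is the ICMP payload plus an 8-byte ICMP header and a 20-byte IPv4 header without options. The sketch below assumes that the payload size is the value you set on the ping command and that it excludes both headers. Verify this against the command reference for your software version.

```python
from typing import Sequence

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def fits_without_fragmentation(payload_bytes: int, path_mtus: Sequence[int]) -> bool:
    """Return True if the ping packet passes every outbound interface unfragmented."""
    packet_bytes = payload_bytes + ICMP_HEADER_BYTES + IPV4_HEADER_BYTES
    return packet_bytes <= min(path_mtus)


def max_payload(path_mtus: Sequence[int]) -> int:
    """Largest payload that fits the smallest MTU along the path."""
    return min(path_mtus) - ICMP_HEADER_BYTES - IPV4_HEADER_BYTES


# With an MTU of 1500 on every hop, a 1472-byte payload fits and 1473 does not.
ok = fits_without_fragmentation(1472, [1500, 1500])
too_big = fits_without_fragmentation(1473, [1500, 1500])
# A hypothetical 1400-byte hop lowers the limit for the whole path.
limit = max_payload([1500, 1400])
```

The smallest MTU along the path sets the limit, so one hop with a reduced MTU is enough to make ping -f fail for large packets.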

Step 4 Check whether a local attack defense policy is configured on the device where the fault occurs.

If the device has been attacked by ICMP packets, the rate limit for ICMP packets sent to the CPU has been reduced or these packets may have been dropped to protect against attacks. As a result, a ping failure occurs.

Run the display current-configuration | include cpu-defend command to check whether the configuration file contains cpu-defend policy.

l If a CPU attack defense policy is configured, run the display cpu-defend policy policy-number and display cpu-defend car commands to check whether:

– The IP addresses in the ping operation have been added to the blacklist.

– CAR is configured. If CAR is configured, check whether the bandwidth value is too small.

If the IP addresses are in the blacklist or the bandwidth value is too small, delete the preceding configurations and run a ping command again. If the ping operation still fails, go to step 5.

l If no CPU attack defense policy is configured, go to step 5.





Step 5 Check whether the physical status of the interfaces is Up.

Run the display this interface command in an interface view to view the physical status of the interface.

[SwitchA-xgigabitethernet 0/0/1] display this interface

xgigabitethernet 0/0/1 current state : UP

l If the physical status of the interfaces is Up, go to step 6.

l If the physical status of the interface is Down, perform the following operations:

– Check whether the interface is shut down.

– Check whether the interface is correctly connected.

If the interface is shut down, run the undo shutdown command in the interface view.

If the interface is not connected properly, connect it according to 3 Physical Connection and Interfaces.

If the fault persists after you perform the preceding operations, go to step 6.

Step 6 Check whether the protocol status of the interfaces is Up.

Run the display this interface command in an interface view to view the protocol status of the interface.

[SwitchA-xgigabitethernet 0/0/1] display this interface

xgigabitethernet 0/0/1 current state : UP

Line protocol current state : UP

l If the protocol status is Down, perform the following operations:

Check whether the interface is an Ethernet interface, whether the VLANIF interfaces have IP addresses, and whether the IP addresses of the directly connected interfaces are in the same network segment.

NOTE

The masked interface addresses must be in the same network segment.

l If the protocol status is Up, check whether the directly connected interfaces can ping each other. If not, go to step 7.
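The NOTE in step 6 requires the masked interface addresses to be in the same network segment. A quick way to verify this offline is to apply each mask and compare the resulting networks, as in the minimal Python sketch below. The /24 masks in the usage lines are assumptions, because Figure 6-4 does not state the mask lengths.

```python
import ipaddress


def same_segment(local: str, peer: str) -> bool:
    """Return True if both address/mask pairs yield the same network."""
    local_if = ipaddress.ip_interface(local)
    peer_if = ipaddress.ip_interface(peer)
    return local_if.network == peer_if.network


# SwitchA and SwitchB on VLANIF 10 (mask length assumed to be 24).
direct = same_segment("10.1.1.1/24", "10.1.1.2/24")
# SwitchA and the VLANIF 20 address of SwitchB are in different segments.
indirect = same_segment("10.1.1.1/24", "10.1.2.3/24")
# A mask mismatch also fails the check even though the addresses are close.
mismatch = same_segment("10.1.1.1/24", "10.1.1.2/25")
```

The last case is a reminder to compare the masks as well as the addresses.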

Step 7 Check the routes.

Check whether SwitchA has a route to SwitchB and SwitchB has a return route.

l Run the display ip routing-table ip-address command on SwitchA to check whether there is a reachable route to the peer. If yes, the following information is displayed:

<SwitchA> display ip routing-table 10.1.1.2


------------------------------------------------------------------------------

--------------


Summary Count : 1


10.1.1.0/24 Direct 0 0 D 10.1.1.1 Vlanif10

If no, no information is displayed.

l Run the display fib ip-address command to view the FIB table.

l If the routing entry cannot be found, check the routing protocol configuration.

l If the routing entry is found, go to step 8.

Step 8 Check whether the device has learned the ARP entry from the peer.

Run the display arp all command to check whether the device has learned the ARP entry from the peer.

<SwitchA> display arp all

IP ADDRESS MAC ADDRESS EXPIRE(M) TYPE INTERFACE VPN-INSTANCE

VLAN/CEVLAN

------------------------------------------------------------------------------

192.168.100.114 00aa-004d-b045 20 D-1 xgigabitethernet 0/0/1

10.1.1.1 0000-0000-1122 I - Vlanif10

NOTE

In the preceding information, view the EXPIRE and TYPE columns. If the EXPIRE field of an ARP entry has a value and the TYPE field contains D, the ARP entry is a dynamic ARP entry, for example 192.168.100.114. S indicates a static ARP entry. I indicates the ARP entry of a local interface, for example 10.1.1.1.

l If SwitchA has learned the ARP entry from the peer, the fault is rectified.

l If SwitchA cannot learn the ARP entry from the peer, go to step 9.

Step 9 Check whether the ping packet is sent to the LPU.

If the interface sends too many packets, configure an advanced ACL to filter packets. The advanced ACL specifies the peer address as the destination address of packets.

[SwitchA] acl 3000

[SwitchA-acl-adv-3000] rule permit ip destination 10.1.1.2 0

Perform the ping operation.

<SwitchA> ping -c 1000 10.1.1.2


Request time out

Request time out

Request time out

Request time out

Request time out

Request time out

Request time out

Enable debugging to view the sent IP packets.

<SwitchA> debugging ip packet acl 3000

<SwitchA> terminal monitor

Info:Current terminal monitor is on

<SwitchA> terminal debugging

Info:Current terminal debugging is on

*0.3438047 Quidway IP/8/debug_case:

Sending, interface = OURSENDPKT, version = 4, headlen = 20, tos = 0, pktlen = 84, pktid = 0, offset = 0, ttl = 255, protocol = 1, checksum = 0, s = 0.0.0.0, d = 10.1.1.2 prompt: Transfering the packet from slot 0

The preceding information indicates that the ping packet has been sent by the MPU. Check whether the ARP packet is sent and received normally.

Step 10 Check whether the device has sent and received the ARP packet successfully.

Run the debugging arp packet interface xgigabitethernet 0/0/1 command.

<SwitchA> debugging arp packet

<SwitchA> terminal monitor


<SwitchA> terminal debugging


If the device has sent and received ARP packets successfully, the following information is displayed:

*0.781949290 SwitchA ARP/8/arp_send:Slot=1;Send an ARP Packet, operation : 1, sender_eth_addr :0000-5ec4-1602,sender_ip_addr : 10.1.1.1, target_eth_addr : 0000-0000-0000, target_ip_addr :100.1.1.2

*0.781949540 SwitchA ARP/8/arp_rcv:Slot=5;Receive an ARP Packet, operation : 2,sender_eth_addr :0000-5ec4-1603, sender_ip_addr : 10.1.1.2, target_eth_addr : 00e0-fc70-824f, target_ip_addr :100.1.1.1

If the ping operation is successful, both the request and reply packets are displayed. If only the request packet is displayed, or neither the request packet nor the reply packet is displayed, the ping operation fails.

If the IP layer sends and receives packets successfully, run the debugging ethernet packet arp interface vlanif10 command to check whether the link layer can send and receive packets.

<SwitchA> debugging ethernet packet arp interface vlanif10

<SwitchA> terminal monitor


<SwitchA> terminal debugging



*0.11763937 SwitchA ETH/8/eth_send:Slot=1;Send an Eth Packet, interface : vlanif10, ethformat: 0, length: 42, prototype: 0806 arp, src_eth_addr : 0000-5ec4-1602, dst_eth_addr : ffff-ffff-ffff

*0.11763937 SwitchA ETH/8/eth_rcv:Slot=1;Receive an Eth Packet, interface : vlanif10,eth format: 0, length: 42, prototype: 0806 arp, src_eth_addr: 0000-5ec4-1603, dst_eth_addr:0000-5ec4-1602

The preceding information indicates that the link layer successfully sends and receives ARP request packets. Go to step 11.

Step 11 Check whether the number of sent and received packets is correct.

Run the display this interface command in the interface view or run the display interface interface-type interface-number command multiple times to view the count of packets.

To check the count of ARP request packets, view the number of sent broadcast packets; to check the count of ARP reply packets, view the number of received unicast packets.

[SwitchA-xgigabitethernet 0/0/1] display this interface

xgigabitethernet 0/0/1 current state : UP


Description:HUAWEI, Quidway Series, xgigabitethernet 0/0/1 Interface


IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0025-9e80-2494



Duplex: FULL, Negotiation: ENABLE

Mdi : AUTO



Input peak rate 0 bits/sec,Record time: -

Output peak rate 7072 bits/sec,Record time: 2011-03-28 05:41:20


Unicast : 0,Multicast : 0

Broadcast : 0,Jumbo : 0

CRC : 35,Giants : 0

Jabbers : 0,Fragments : 13

Runts : 0,DropEvents : 0

Alignments : 54,Symbols : 89

Ignoreds : 0,Frames : 0

Discard : 0,Total Error : 191


Unicast : 0,Multicast : 182544

Broadcast : 0,Jumbo : 0

Collisions : 0,Deferreds : 0

Late Collisions: 0,ExcessiveCollisions: 0

Buffers Purged : 0

Discard : 0,Total Error : 0








If the number of sent and received packets is correct, go to step 12. If any of the following situations occurs, record the fault location procedure, displayed debugging information, and statistics on interfaces. Then contact Huawei technical support personnel.

l The ARP debugging information is not displayed. When this situation occurs, the IP layer fails to send ARP request or reply packet.

l The ARP debugging information is displayed, but the number of sent broadcast packets does not increase. When this situation occurs, the link layer fails to send packets.

l The ARP debugging information is displayed and the number of sent broadcast packets increases, but the number of received unicast packets does not increase. When this situation occurs, the IP layer and the link layer successfully send the ARP request packet, but no ARP reply packet is received on the interface.
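The three situations in step 11 can be told apart by taking two counter snapshots, one before and one after the ping, and comparing the differences. The following Python sketch encodes that decision. The dictionary keys out_broadcast and in_unicast are hypothetical names for the sent broadcast and received unicast counters read from the display output.

```python
from typing import Mapping


def diagnose(
    arp_debug_seen: bool,
    before: Mapping[str, int],
    after: Mapping[str, int],
) -> str:
    """Classify the ARP exchange from debugging output and counter deltas."""
    sent = after["out_broadcast"] - before["out_broadcast"]
    received = after["in_unicast"] - before["in_unicast"]
    if not arp_debug_seen:
        return "IP layer failed to send the ARP packet"
    if sent <= 0:
        return "link layer failed to send the ARP request"
    if received <= 0:
        return "ARP request sent but no ARP reply received"
    return "counters are correct, go to step 12"


# Hypothetical snapshots: requests leave the interface, but nothing returns.
snapshot_1 = {"out_broadcast": 100, "in_unicast": 40}
snapshot_2 = {"out_broadcast": 110, "in_unicast": 40}
result = diagnose(True, snapshot_1, snapshot_2)
```

Any result other than the last one matches a situation in which the fault location procedure, debugging information, and interface statistics need to be recorded.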

Step 12 Check whether ICMP packets are sent and received successfully.

If the ARP entries are correct and the VLANIF interfaces update the routing tables, but the fault persists, perform the following operations:

l Run the debugging ip packet acl acl-number command in the user view to check the sent and received IP packets.

l Run the debugging ip icmp [ verbose ] command to collect more information for fault location.

If the fault persists, collect the following information and contact Huawei technical support personnel:

l Results of the preceding troubleshooting procedure

l Configuration files, log files, and alarm files of the switches

----End


Relevant Alarms

None.

Relevant Logs

None.


Pinging a Directly Connected Device Fails Because of an Incorrect ARP Entry

Fault Symptom

As shown in Figure 6-5, the device connected to SwitchB is replaced by SwitchA. After the network adjustment, SwitchA cannot ping SwitchB, and the OSPF neighbor status on SwitchA is Exchange. After SwitchA is replaced by the original device, the fault is rectified.




Figure 6-5 Network diagram of directly connected devices

SwitchA (XGE0/0/1, VLANIF 20, 1.1.1.1/24) is directly connected to SwitchB (XGE0/0/1, VLANIF 20, 1.1.1.2/24) in OSPF Area 0.

Fault Analysis

1. The original device can ping SwitchB, indicating that the link between the two devices functions properly. SwitchA and SwitchB are directly connected, so the fault is not caused by routing problems. The fault may be caused by errors in ARP learning.

2. Run the display arp all command on SwitchA to check the ARP table.

<SwitchA> display arp all


VLAN

------------------------------------------------------------------------------

1.1.1.1 0025-9e80-2494 I - Vlanif20

1.1.1.2 0025-9e80-248e 18 D-0 XGE0/0/1

33

------------------------------------------------------------------------------


The preceding information shows that SwitchA has learned the ARP entry of SwitchB.

3. Run the display arp all command on SwitchB to check the ARP table.

<SwitchB> display arp all


VLAN

------------------------------------------------------------------------------

1.1.1.2 0025-9e80-248e I - Vlanif20

1.1.1.1 0016-ecb9-0eb2 S-- XGE0/0/1

33

------------------------------------------------------------------------------


In the ARP table, the IP address of SwitchA (1.1.1.1) maps to MAC address 0016-ecb9-0eb2. The ARP entry type is S, indicating a static ARP entry. According to the ARP table of SwitchA, the actual MAC address of 1.1.1.1 is 0025-9e80-2494, not 0016-ecb9-0eb2.

This static ARP entry was configured before the network adjustment. The ARP entry is not updated after the network adjustment; therefore, SwitchA cannot ping SwitchB.
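The comparison made in this analysis can be automated when both ARP tables are available. A static entry on one device is stale if its MAC address differs from the MAC address that the owning device reports for its own interface (type I). The Python sketch below uses the two tables from this case. The tuple layout is an assumption made for illustration.

```python
from typing import List, Sequence, Tuple

ArpEntry = Tuple[str, str, str]  # (IP address, MAC address, TYPE column)


def stale_static_entries(
    owner_table: Sequence[ArpEntry], peer_table: Sequence[ArpEntry]
) -> List[Tuple[str, str, str]]:
    """Return (ip, configured_mac, actual_mac) for each mismatched static entry."""
    actual = {ip: mac for ip, mac, kind in owner_table if kind.startswith("I")}
    return [
        (ip, mac, actual[ip])
        for ip, mac, kind in peer_table
        if kind.startswith("S") and ip in actual and actual[ip] != mac
    ]


switch_a = [
    ("1.1.1.1", "0025-9e80-2494", "I -"),
    ("1.1.1.2", "0025-9e80-248e", "D-0"),
]
switch_b = [
    ("1.1.1.2", "0025-9e80-248e", "I -"),
    ("1.1.1.1", "0016-ecb9-0eb2", "S--"),
]
stale = stale_static_entries(switch_a, switch_b)
```

For this case, the result lists the static entry for 1.1.1.1 on SwitchB together with the MAC address that it should contain.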

Procedure

Step 1 Run the system-view command on SwitchB to enter the system view.

Step 2 Run the undo arp static ip-address command to delete the static ARP entry.






NOTE

After the static ARP entry is deleted, SwitchA can ping SwitchB. A new static ARP entry needs to be configured to prevent ARP attacks.

Step 3 Run the arp static ip-address mac-address vid vlan-id interface interface-type interface-number command to configure the correct static ARP entry.

SwitchA can ping SwitchB. Run the display ospf peer command to check the status of the OSPF neighbor. The OSPF neighbor is in Full state.

<SwitchA> display ospf peer

OSPF Process 1 with Router ID 11.11.11.105

Neighbors

Area 0.0.0.0 interface 1.1.1.1(Vlanif33)'s neighbors

Router ID: 2.1.1.1.168.10.2 Address:

1.1.1.2

State: Full Mode:Nbr is Master Priority: 1

DR: 1.1.1.2 BDR: 2.1.1.1 MTU: 0

Dead timer due in 34 sec

Retrans timer interval: 8

Neighbor is up for 00:28:17

Authentication Sequence: [ 0 ]

----End

Summary

If a static ARP entry is configured on a device, modify the ARP entry after the MAC address changes. If SwitchB is a non-Huawei device and you cannot log in to SwitchB to check the configuration, ping SwitchB from SwitchA and configure the mirroring function to analyze packets transmitted between SwitchA and SwitchB. Check whether the destination MAC addresses of the packets are correct.

A Switch Can Be Pinged but Cannot Be Accessed

Fault Symptom

As shown in Figure 6-6, SwitchC successfully pings the IP address of VLANIF 20 on SwitchA but cannot connect to SwitchA using Telnet.

Figure 6-6 Network diagram of ping and Telnet

SwitchA (XGE0/0/1, VLANIF 20, 1.1.1.1/24) connects to SwitchB (XGE0/0/1, VLANIF 20, 1.1.1.2/24) in VLAN 20. SwitchB (XGE0/0/2, VLANIF 30, 2.1.1.2/24) connects to SwitchC (XGE0/0/1, VLANIF 30, 2.1.1.1/24) in VLAN 30.






Fault Analysis

1. The Switch supports the fast ICMP reply function. This function enables the Switch to quickly respond to the ICMP echo request packet destined for its own IP address. If this function is enabled on SwitchA (it is enabled by default), SwitchA can respond to ICMP Echo packets even if it is not configured with the route to 2.1.1.1. The ping operation succeeds, indicating that the link between SwitchA and SwitchC functions properly. However, routes between SwitchA and SwitchC may be faulty.

2. Run the tracert 1.1.1.1 command on SwitchC to check routes from SwitchC to SwitchA.

traceroute to 1.1.1.1(1.1.1.1), max hops: 30 ,packet length: 40

1 2.1.1.2 10 ms 1 ms 1 ms

2 * * *

3 * * *

4 * * *

5 * * *

6 * * *

7 * * *

8 * * *

9 * * *

10 * * *

11 * * *

12 * * *

13 * * *

14 * * *

15 * * *

16 * * *

17 * * *

18 * * *

19 * * *

20 * * *

21 * * *

22 * * *

23 * * *

24 * * *

25 * * *

26 * * *

27 * * *

28 * * *

29 * * *

30 * * *

The preceding information shows that SwitchB is reachable from SwitchC but SwitchA is unreachable. The possible cause is that the route to 2.1.1.1 is not configured or is configured incorrectly.

3. Run the telnet 2.1.1.2 command on SwitchC to log in to SwitchB. Run the telnet 1.1.1.1 command on SwitchB to log in to SwitchA. The Telnet operations are successful, indicating that the Telnet configuration on SwitchA is correct.

4. Run the display ip routing-table 2.1.1.1 command on SwitchA to check the routing table. The routing table contains no longest-match entry for destination IP address 2.1.1.1. Run the undo icmp-reply fast command on SwitchA to disable the fast ICMP reply function. Ping SwitchA from SwitchC. The ping operation fails.

In conclusion, SwitchC can ping SwitchA because the fast ICMP reply function is enabled on SwitchA. After the function is disabled, SwitchC fails to ping SwitchA, and it cannot connect to SwitchA using Telnet, because SwitchA does not have a route to 2.1.1.1.
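This case shows that reachability needs a route in both directions: one from the client to the server and a return route from the server to the client. The sketch below checks both directions with the Python standard library. The routing table contents are assumptions for illustration. The SwitchC table in particular is hypothetical, because the case does not show it.

```python
import ipaddress
from typing import Sequence


def has_route(routing_table: Sequence[str], destination: str) -> bool:
    """Return True if any prefix in the table contains the destination."""
    dest = ipaddress.ip_address(destination)
    return any(dest in ipaddress.ip_network(prefix) for prefix in routing_table)


def two_way_reachable(
    client_table: Sequence[str], client_ip: str,
    server_table: Sequence[str], server_ip: str,
) -> bool:
    """Both the forward route and the return route must exist."""
    return has_route(client_table, server_ip) and has_route(server_table, client_ip)


switch_c_routes = ["2.1.1.0/24", "1.1.1.0/24"]   # assumed
switch_a_before = ["1.1.1.0/24"]                  # no route to 2.1.1.1
switch_a_after = ["1.1.1.0/24", "2.1.1.0/24"]     # after the static route is added

before = two_way_reachable(switch_c_routes, "2.1.1.1", switch_a_before, "1.1.1.1")
after = two_way_reachable(switch_c_routes, "2.1.1.1", switch_a_after, "1.1.1.1")
```

A successful ping does not prove that the return route exists when fast ICMP reply is enabled, so check the routing table on the destination device directly.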

Procedure

Step 1 Run the system-view command on SwitchA to enter the system view.

Step 2 Run the ip route-static 2.1.1.0 255.255.255.0 1.1.1.2 command to configure a static route to network segment 2.1.1.0/24 with next hop 1.1.1.2.






Then SwitchC can connect to SwitchA using Telnet.

----End

Summary

If you cannot log in to a device due to routing problems but the link to the device functions properly, log in to each device along the link to locate the fault.

The Switch supports the fast ICMP reply function. Before locating the fault, disable this function.

This fault may also be caused by one of the following:

l The Telnet service is not configured on SwitchA or is configured incorrectly.

– The user authentication mode is incorrect or only the Secure Shell (SSH) login mode is configured in the VTY user view.

– RADIUS or HWTACACS authentication is configured but the user information is not configured on the authentication server.

l An ACL is configured on SwitchA or an intermediate device to filter out Telnet protocol packets.

NOTE

The default Telnet port number is 23.

l The number of online users reaches the maximum number allowed by SwitchA. Log in to SwitchA from the console port, and then run the display current-configuration command to check the maximum number of users. Run the display users command to check the number of current Telnet users and check whether it reaches the maximum.

Ping Operation Succeeds in One Direction but Fails in the Other Direction Due to Fast ICMP Reply

Fault Symptom

As shown in Figure 6-7, SwitchA is located between the gateway GPRS support node (GGSN) and SwitchB. The GGSN cannot ping the loopback address of SwitchB, but SwitchB can ping the loopback address of the GGSN. They use their loopback addresses as source IP addresses of the Internet Control Message Protocol (ICMP) packets.


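Figure 6-7 Network diagram

The GGSN (Loopback0 8.8.8.8/32, interface 10.0.2.1/24, MAC 0018-2000-0011), SwitchA (VLANIF10 10.0.2.2/24, MAC 0018-2000-0022), and SwitchB (VLANIF10 10.0.2.3/24, Loopback0 1.1.1.1/32, MAC 0018-2000-0033) are in the same network segment, with SwitchA between the GGSN and SwitchB.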



The following static routes are configured:

l Static route on the GGSN: The destination address is 1.1.1.1 and the next hop is 10.0.2.2.

l Static route on SwitchA: The destination address is 1.1.1.1 and the next hop is 10.0.2.3.

l Static route on SwitchB: The destination address is 8.8.8.8 and the next hop is 10.0.2.1 (interface address on the GGSN).

Fault Analysis

1. SwitchB can ping the loopback address of the GGSN, so the link between SwitchB and the GGSN is functioning properly. The packets captured on SwitchB show that SwitchB receives an ICMP destination unreachable packet sent from 10.0.2.2 (VLANIF 10 address of SwitchA). This may be caused by incorrect forwarding entries.

2. SwitchB sends an ICMP reply packet with the destination IP address 8.8.8.8 and destination MAC address 0018-2000-0022. The ICMP reply packet reaches SwitchA, but SwitchA discards the packet because it does not have a route to 8.8.8.8. The ping operation fails.

SwitchA has a static route with the destination IP address 1.1.1.1 and next hop 10.0.2.3. When receiving an ICMP packet sent from the GGSN to SwitchB, SwitchA uses this static route to forward the packet. According to the ARP table, SwitchA replaces the source MAC address in the Ethernet header with its own MAC address and replaces the destination MAC address with the MAC address of SwitchB.

<SwitchA> display arp


VLAN/CEVLAN

------------------------------------------------------------------------------

10.0.2.3 0018-2000-0033 20 D-0 Eth-Trunk3

10/-

Run the display current-configuration command on SwitchB. The command output contains icmp-reply fast, indicating that fast ICMP reply is enabled.

The fast ICMP reply function speeds up response to ICMP requests. If this function is enabled on the S6700, the S6700 uses the source MAC address of the ICMP request packet as the destination MAC address of the ICMP reply packet, and uses the destination MAC address of the ICMP request packet as the source MAC address of the ICMP reply packet.

After receiving an ICMP request packet, SwitchB sends an ICMP reply packet. The source MAC address of the ICMP reply packet is the MAC address of SwitchB and the destination MAC address is the MAC address of SwitchA. When this packet reaches SwitchA, SwitchA needs to forward it at Layer 3. However, there is no route to the GGSN, so SwitchA sends a destination unreachable ICMP packet to SwitchB. The ping operation fails.

3. SwitchB can ping 8.8.8.8 by using the source IP address 1.1.1.1.

The following information shows that SwitchB has learned the ARP entry corresponding to the IP address of the GGSN.

<SwitchB> display arp


VLAN/CEVLAN

------------------------------------------------------------------------------

10.0.2.3 0018-2000-0033 20 I - Vlanif10

10.0.2.1 0018-2000-0011 12 D-0 Eth-Trunk3

10/-

In the ICMP request sent by SwitchB, the source MAC address is the MAC address of SwitchB, and the destination MAC address is the MAC address of the GGSN. SwitchA transparently transmits this packet at Layer 2 without changing the MAC addresses in the Ethernet header.

Therefore, SwitchB can ping the loopback address of the GGSN.

To rectify the fault, disable the fast ICMP reply function on SwitchB or change the next hop address in the static route on the GGSN.

Procedure

l Disable the fast ICMP reply function on SwitchB.

– Run the system-view command to enter the system view.

– Run the undo icmp-reply fast command to disable the fast ICMP reply function.

l Change the next hop address in the static route on the GGSN to the IP address of VLANIF 10 on SwitchB.

After the configuration is complete, the GGSN can use the loopback address as the source IP address to ping the loopback IP address of SwitchB successfully.

----End

Summary

If fast ICMP reply is enabled on the S6700, the S6700 uses the source MAC address of the ICMP request packet as the destination MAC address of the ICMP reply packet, and uses the destination MAC address of the ICMP request packet as the source MAC address of the ICMP reply packet. The reply is therefore sent back to the device that forwarded the request, and that device may have no reachable route for the reply packet.

Therefore, routes must be configured in both directions of a path.
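The difference between a fast ICMP reply and a normally routed reply can be sketched in a few lines of Python. This is only an illustrative model of the case, not the switch's implementation: the MAC addresses and the route to 8.8.8.8 are taken from the case, while the dictionaries and the function name reply_dst_mac are hypothetical.

```python
# Illustrative model of how SwitchB chooses the reply's destination MAC.
# MAC addresses come from the case; the function name is hypothetical.
GGSN_MAC = "0018-2000-0011"
SWITCH_A_MAC = "0018-2000-0022"

# SwitchB's view: static route to 8.8.8.8 via 10.0.2.1, and its ARP table.
SWITCH_B_ROUTES = {"8.8.8.8": "10.0.2.1"}
SWITCH_B_ARP = {"10.0.2.1": GGSN_MAC}


def reply_dst_mac(request_src_mac, reply_dst_ip, fast_reply):
    """Return the destination MAC SwitchB puts on the ICMP reply."""
    if fast_reply:
        # Fast ICMP reply: reuse the request's source MAC,
        # skipping the route and ARP lookup.
        return request_src_mac
    # Normal reply: look up the route, then resolve the next hop via ARP.
    next_hop = SWITCH_B_ROUTES[reply_dst_ip]
    return SWITCH_B_ARP[next_hop]


# The request from the GGSN was routed by SwitchA, so its source MAC
# is SwitchA's MAC by the time it reaches SwitchB.
with_fast = reply_dst_mac(SWITCH_A_MAC, "8.8.8.8", fast_reply=True)
without_fast = reply_dst_mac(SWITCH_A_MAC, "8.8.8.8", fast_reply=False)
print(with_fast)     # handed to SwitchA, which has no route to 8.8.8.8
print(without_fast)  # sent at Layer 2 straight to the GGSN
```

With fast reply enabled, the reply depends on SwitchA having a route back to 8.8.8.8. With it disabled, SwitchB's own static route decides where the reply goes.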

6.3 Tracert Troubleshooting

This chapter describes common causes of a Tracert failure, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

6.3.1 The Tracert Operation Fails

Common Causes

This fault is commonly caused by one of the following:

l Routing entries or ARP entries are incorrect.

l Tracert packets are modified. As a result, these packets are dropped because they fail validity check at the network layer.


Figure 6-8 Troubleshooting flowchart for a tracert failure

The flowchart checks, in order, whether FIB and ARP entries are correct, whether the local end receives ICMP error packets, whether the destination end receives Tracert packets, and whether the destination end replies with ICMP error packets.



Procedure

Step 1 Check that FIB entries and ARP entries are correct.

Run the display fib command on each device to check whether there is a route to the destination address.

l If there is no route to the destination address, see OSPF Troubleshooting or IS-IS Troubleshooting.

l If there is a route to the destination address and Tracert packets are transmitted over an Ethernet link, run the display arp all command to check whether the required ARP entry exists. If the required ARP entry does not exist, go to Step 3. If the fault persists, go to Step 2.

Step 2 Check that the device sending Tracert packets (the source end) does not receive ICMP error packets.

Run the display icmp statistics command on the source end to check whether it receives ICMP error packets.

<Quidway> display icmp statistics

Input: bad formats 0 bad checksum 0

echo 13 destination unreachable 18

source quench 0 redirects 43

echo reply 697 parameter problem 0

timestamp 0 information request 0

mask requests 0 mask replies 0

time exceeded 12

Mping request 0 Mping reply 0

Output:echo 704 destination unreachable 93326

source quench 0 redirects 0

echo reply 13 parameter problem 0

timestamp 0 information reply 0

mask requests 0 mask replies 0

time exceeded 0

Mping request 0 Mping reply 0

During the tracert operation, run the display icmp statistics command several times and compare the results. If the combined increase in the Destination Unreachable and Time Exceeded counters in the Input field equals the number of Tracert packets sent, the source end receives the ICMP error packets but then discards them. In this case, contact Huawei technical support personnel. Otherwise, go to Step 3.



----End
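The counter comparison in Step 2 can be sketched as follows. This is a minimal sketch, assuming the command output has been captured as text in the layout shown above. The function names are hypothetical, and the second snapshot is made up for illustration.

```python
import re


def input_error_count(stats_text):
    """Sum the Input 'destination unreachable' and 'time exceeded' counters."""
    # Keep only the Input section, which ends where "Output:" begins.
    input_section = stats_text.split("Output:")[0]
    unreachable = re.search(r"destination unreachable\s+(\d+)", input_section)
    exceeded = re.search(r"time exceeded\s+(\d+)", input_section)
    return int(unreachable.group(1)) + int(exceeded.group(1))


def errors_match_probes(before, after, probes_sent):
    """True if the Input error counters grew by exactly the number of probes."""
    return input_error_count(after) - input_error_count(before) == probes_sent


BEFORE = """Input: bad formats 0 bad checksum 0
echo 13 destination unreachable 18
time exceeded 12
Output:echo 704 destination unreachable 93326
time exceeded 0"""

# Hypothetical second snapshot taken after sending 3 Tracert packets.
AFTER = BEFORE.replace("time exceeded 12", "time exceeded 14").replace(
    "destination unreachable 18", "destination unreachable 19")

print(input_error_count(BEFORE))              # 30
print(errors_match_probes(BEFORE, AFTER, 3))  # True
```

A result of True corresponds to the case in which the source end receives the ICMP error packets but discards them.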


None.

6.4 OSPF Troubleshooting

6.4.1 The OSPF Neighbor Relationship Is Down

Common Causes

This fault is commonly caused by one of the following:

l BFD is faulty.

l The other device is faulty.

l CPU usage on the MPU or LPU of the faulty device is too high.

l The link is faulty.

l The interface is not Up.

l The IP addresses of the two devices on both ends of the link are on different network segments.

l The router IDs of the two devices conflict.

l The area types of the two devices are inconsistent.

l The parameter settings of the two devices are inconsistent.


After OSPF is configured on the network, it is found that the OSPF neighbor relationship is Down.

Figure 6-9 Troubleshooting flowchart for the fault that the OSPF neighbor relationship is Down

The flowchart starts from the value of the NeighborDownImmediate reason field in logs or alarms (Inactivity, Kill Neighbor, 1-Wayhello Received, or SequenceNum Mismatch) and leads to checking the configurations of the devices at both ends of the link, the interface and BFD, or the remote device.



NOTE

Saving the results of each troubleshooting step is recommended. If you are unable to correct the fault, you will have a record of your actions to provide Huawei technical support personnel.

Procedure

Step 1 Check logs to find the cause of the fault.

Run the display logbuffer command, and you can find the following log information:

NBR_DOWN_REASON(l): Neighbor state leaves full or changed to Down. (ProcessId=[USHORT], NeighborRouterId=[IPADDR], NeighborAreaId=[ULONG], NeighborInterface=[STRING],NeighborDownImmediate reason=[STRING], NeighborDownPrimeReason=[STRING], NeighborChangeTime=[STRING])

Check the NeighborDownImmediate reason field, which records the cause of the fault. The possible causes of the fault are as follows:

l Neighbor Down Due to Inactivity

If a device does not receive a Hello packet from its neighbor within the timeout period, the OSPF neighbor relationship goes Down. In this case, go to Step 2.

l Neighbor Down Due to Kill Neighbor

If the interface is Down, BFD is Down, or the reset ospf process command is run, the OSPF neighbor relationship goes Down. In this case, check the NeighborDownPrimeReason field to determine the specific cause of the fault.

– If the value of the NeighborDownPrimeReason field is Physical Interface State Change, it indicates that the interface status has changed. In this case, run the display interface [ interface-type [ interface-number ] ] command to check the interface status, and then troubleshoot the interface fault (See the section 3.1 Ethernet Interface Troubleshooting).

– If the value of the NeighborDownPrimeReason field is BFD Session Down, it indicates that the BFD session status is Down. In this case, troubleshoot the BFD fault (See the section BFD Session Cannot Go Up).

– If the value of the NeighborDownPrimeReason field is OSPF Process Reset, it indicates that the reset ospf process command has been run. The OSPF process is restarting. Wait until OSPF re-establishes the OSPF neighbor relationship.

l Neighbor Down Due to 1-Wayhello Received or Neighbor Down Due to SequenceNum Mismatch

When the OSPF status on the remote device goes Down first, the remote device sends a 1-Way Hello packet to the local device, causing OSPF on the local device to go Down. In this case, troubleshoot the fault that caused OSPF on the remote device to go Down.

l In other cases, go to Step 9.
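The decision logic of Step 1 can be sketched as a small log parser. This is a minimal sketch, not a supported tool. The sample log line uses the field names from the format above with made-up values, the function names are hypothetical, and falling back to Step 9 for an unrecognized NeighborDownPrimeReason value is an assumption.

```python
import re

# Hypothetical log line: the field names follow the format above,
# the values are made up for illustration.
SAMPLE = ("NBR_DOWN_REASON(l): Neighbor state leaves full or changed to Down. "
          "(ProcessId=1, NeighborRouterId=2.2.2.2, NeighborAreaId=0, "
          "NeighborInterface=Vlanif10,"
          "NeighborDownImmediate reason=Neighbor Down Due to Kill Neighbor, "
          "NeighborDownPrimeReason=BFD Session Down, "
          "NeighborChangeTime=2012-03-15 10:00:00)")


def parse_fields(log_line):
    """Split the parenthesized key=value list into a dictionary."""
    body = re.search(r"\((ProcessId=.*)\)\s*$", log_line).group(1)
    fields = {}
    for part in body.split(","):
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def next_action(fields):
    """Map the immediate and prime reasons to the action given in Step 1."""
    immediate = fields["NeighborDownImmediate reason"]
    prime = fields["NeighborDownPrimeReason"]
    if immediate == "Neighbor Down Due to Inactivity":
        return "go to Step 2"
    if immediate == "Neighbor Down Due to Kill Neighbor":
        return {
            "Physical Interface State Change": "troubleshoot the interface",
            "BFD Session Down": "troubleshoot BFD",
            "OSPF Process Reset": "wait for OSPF to re-establish the neighbor",
        }.get(prime, "go to Step 9")  # assumed fallback for unknown values
    if immediate in ("Neighbor Down Due to 1-Wayhello Received",
                     "Neighbor Down Due to SequenceNum Mismatch"):
        return "troubleshoot the remote device"
    return "go to Step 9"


print(next_action(parse_fields(SAMPLE)))  # troubleshoot BFD
```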

Step 2 Check that the link between the two devices is normal.

Step 3 Check that the CPU usage is within the normal range.

Run the display cpu-usage command to check whether the CPU usage of the faulty device is higher than 60%. If the CPU usage is too high, OSPF cannot send and receive protocol packets normally, causing the neighbor relationship to flap. In this case, go to Step 9. If the CPU usage is within the normal range, go to Step 4.

Step 4 Check that the interface status is Up.

Run the display interface [ interface-type [ interface-number ] ] command to check the physical status of the interface. If the physical status of the interface is Down, troubleshoot the interface fault (See the section 3.1 Ethernet Interface Troubleshooting).

If the physical status of the interface is Up, run the display ospf interface command to check whether the OSPF status of the interface is Down. The normal status is DR, BDR, DR Other, or P2P.

<Quidway> display ospf interface


Interfaces

Area: 0.0.0.0

IP Address Type State Cost Pri DR BDR

192.1.1.1 Broadcast DR 1 1 192.1.1.1 0.0.0.0

l If the OSPF status of the interface is Down, run the display ospf cumulative command to check whether the number of interfaces with OSPF enabled in the OSPF process exceeds the upper threshold. If so, reduce the number of interfaces with OSPF enabled. For details about the upper threshold on the number of interfaces, see the PAF/License file of the product.

<Quidway> display ospf cumulative


Cumulations

IO Statistics

Type Input Output

Hello 0 86

DB Description 0 0

Link-State Req 0 0

Link-State Update 0 0

Link-State Ack 0 0

SendPacket Peak-Control: (Disabled)

ASE: (Disabled)

LSAs originated by this router

Router: 1

Network: 0

Sum-Net: 0

Sum-Asbr: 0

External: 0

NSSA: 0

Opq-Link: 0

Opq-Area: 0

Opq-As: 0

LSAs Originated: 1 LSAs Received: 0

Routing Table:

Intra Area: 1 Inter Area: 0 ASE: 0

Up Interface Cumulate: 1

l If the OSPF status of the interface is not Down, go to Step 5.

Step 5 If the interface is connected to a broadcast network or an NBMA network, ensure that the IP addresses of the two devices are on the same network segment.

l If the IP addresses of the two devices are on different network segments, modify the IP addresses of the devices to ensure that the IP addresses are on the same network segment.

l If the IP addresses of the two devices are on the same network segment, go to Step 6.
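The network segment check in Step 5 can be illustrated with Python's standard ipaddress module. This is a minimal sketch: the function name is hypothetical, and the peer addresses and the /24 mask are assumptions chosen for illustration (only 192.1.1.1 appears in the sample output above).

```python
import ipaddress


def same_segment(local, remote):
    """True if two interface addresses (with masks) share one network."""
    return (ipaddress.ip_interface(local).network
            == ipaddress.ip_interface(remote).network)


# Hypothetical addresses and masks for the two ends of the link.
print(same_segment("192.1.1.1/24", "192.1.1.2/24"))  # True
print(same_segment("192.1.1.1/24", "192.1.2.2/24"))  # False
```

Because the comparison includes the mask, two addresses with different mask lengths are also reported as being on different segments.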

Step 6 Check that the MTUs of the interfaces on both ends are consistent.

If the ospf mtu-enable command is run on interfaces on both ends, the MTUs of the two interfaces must be consistent. If the MTUs are inconsistent, the OSPF neighbor relationship cannot be established.



l If the MTUs of the two interfaces are inconsistent, run the mtu mtu command in the interface view to make the MTUs of the two interfaces consistent.

l If the MTUs of the two interfaces are consistent, go to Step 7.

Step 7 Check whether there is an interface with a priority that is not 0.

On broadcast and NBMA network segments, there must be at least one interface with a priority that is not 0 to ensure that the DR can be correctly elected. Otherwise, the OSPF neighbor relationship can only reach the two-way state.

Run the display ospf interface command to view the interface priority.



Interfaces

Area: 0.0.0.0

IP Address Type State Cost Pri DR BDR

1.1.1.41 Broadcast DR 1 1 1.1.1.41 0.0.0.0

Step 8 Ensure that the OSPF configurations on the two devices are correct.

1.

Check whether the OSPF router IDs of the two devices are the same.

<Quidway> display ospf brief

OSPF Process 1 with Router ID 1.1.1.1

OSPF Protocol Information

If the IDs are the same, run the ospf router-id router-id command to modify the OSPF router IDs of the two devices. The router ID of each device should be unique within an AS.

If the router IDs are not the same, proceed with this step.

2.

Check whether the OSPF area configurations on the two devices are consistent.



Interfaces

Area: 0.0.0.0

IP Address Type State Cost Pri DR BDR

111.1.1.1 Broadcast BDR 1 1 111.1.1.2 111.1.1.1

If the OSPF area configurations on the two devices are inconsistent, modify the OSPF Area.

If they are consistent, proceed with this step.

3.

Check whether other OSPF configurations on the two devices are consistent.

Run the display ospf error command every 10 seconds for 5 minutes.

<Quidway> display ospf error


OSPF error statistics

General packet errors:

0 : IP: received my own packet 0 : Bad packet

0 : Bad version 0 : Bad checksum

0 : Bad area id 0 : Drop on unnumbered interface

0 : Bad virtual link 0 : Bad authentication type

0 : Bad authentication key 0 : Packet too small

0 : Packet size > ip length 0 : Transmit error

0 : Interface down 0 : Unknown neighbor

HELLO packet errors:

0 : Netmask mismatch 0 : Hello timer mismatch

0 : Dead timer mismatch 0 : Extern option mismatch

0 : Router id confusion 0 : Virtual neighbor unknown

0 : NBMA neighbor unknown 0 : Invalid Source Address

l Check the Bad authentication type field. If the value of this field keeps increasing, the OSPF authentication types of the two devices that establish the neighbor relationship are inconsistent. In this case, run the area-authentication-mode command to configure the same authentication type for the two devices.

l Check the Hello timer mismatch field. If the value of this field keeps increasing, the values of the Hello timers on the two devices that establish the neighbor relationship are inconsistent. In this case, check the interface configurations of the two devices and run the ospf timer hello command to set the same value for the Hello timers.

l Check the Dead timer mismatch field. If the value of this field keeps increasing, the values of the dead timers on the two devices that establish the neighbor relationship are inconsistent. In this case, check the interface configurations of the two devices and run the ospf timer dead command to set the same value for the dead timers.

l Check the Extern option mismatch field. If the value of this field keeps increasing, the area types of the two devices that establish the neighbor relationship are inconsistent (the area type of one device is common area, and the area type of the other device is stub area or NSSA). In this case, configure the same area type for the two devices (in the OSPF area view, the stub command indicates the area type is stub and the nssa command indicates the area type is NSSA).


If the fault persists, go to Step 9.
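Watching for counters that keep increasing can be sketched as a comparison of two captured outputs of the display ospf error command. This is a minimal sketch under stated assumptions: the output is available as text in the two-column layout shown above, the function names are hypothetical, and the second snapshot is made up for illustration.

```python
import re

# One "value : name" pair; a line in the output holds two such pairs.
COUNTER = re.compile(r"(\d+)\s*:\s*(.+?)(?=\s+\d+\s*:\s|$)")

# Counter name -> command named in this step for fixing the mismatch.
FIXES = {
    "Bad authentication type": "area-authentication-mode",
    "Hello timer mismatch": "ospf timer hello",
    "Dead timer mismatch": "ospf timer dead",
    "Extern option mismatch": "stub / nssa",
}


def parse_counters(output):
    """Return {counter name: value} from 'display ospf error' text."""
    counters = {}
    for line in output.splitlines():
        for value, name in COUNTER.findall(line):
            counters[name.strip()] = int(value)
    return counters


def increasing(before, after):
    """Names of counters that grew between two snapshots."""
    old, new = parse_counters(before), parse_counters(after)
    return sorted(name for name in new if new[name] > old.get(name, 0))


FIRST = """0 : Bad authentication key 0 : Packet too small
0 : Packet size > ip length 0 : Transmit error
0 : Netmask mismatch 0 : Hello timer mismatch
0 : Dead timer mismatch 0 : Extern option mismatch"""

# Hypothetical later snapshot in which one counter has increased.
SECOND = FIRST.replace("0 : Hello timer mismatch", "7 : Hello timer mismatch")

for name in increasing(FIRST, SECOND):
    print(name, "->", FIXES.get(name, "see Step 9"))
```

In practice the comparison would be repeated across every snapshot collected during the 5-minute observation period.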

Step 9 Contact Huawei technical support personnel and provide them with the following information.


----End


Relevant Alarms

OSPF_1.3.6.1.2.1.14.16.2.2 ospfNbrStateChange

Relevant Logs

OSPF/4/NBR_DOWN_REASON

6.4.2 The OSPF Neighbor Relationship Cannot Reach the Full State

Common Causes

This fault is commonly caused by one of the following:

l The link is faulty and the OSPF packets are dropped.

l The configuration of the dr-priority on the interfaces is incorrect.

l The OSPF MTUs of the local device and its neighbor are different.


Figure 6-10 Troubleshooting flowchart for the fault that the OSPF neighbor relationship cannot reach the Full state

The flowchart checks the status of the OSPF neighbor relationship: whether the status can be displayed, and whether the neighbor relationship stays in the Down, Init, 2-Way, Exstart, or Exchange state.

Procedure

Step 1 Troubleshoot the fault based on the status of the OSPF neighbor relationship.

l The status of the OSPF neighbor relationship cannot be displayed.

If the status of the OSPF neighbor relationship cannot be displayed, see The OSPF Neighbor Relationship Is Down.


l The neighbor relationship is always in the Down state.

Run the display interface [ interface-type [ interface-number ] ] command to check the physical status of the interface. If the physical status of the interface is Down, troubleshoot the interface fault.

If the physical status of the interface is Up, run the display ospf interface command to check whether the OSPF status of the interface is Up (such as DR, BDR, DR Other, or P2P).



Interfaces

Area: 0.0.0.0

IP Address Type State Cost Pri DR BDR

192.1.1.1 Broadcast DR 1 1 192.1.1.1 0.0.0.0

– If the OSPF status of the interface is Up, go to Step 2.

– If the OSPF status of the interface is Down, run the display ospf cumulative command to check whether the number of interfaces with OSPF enabled in the OSPF process exceeds the upper threshold. If so, reduce the number of interfaces with OSPF enabled.

<Quidway> display ospf cumulative


Cumulations

IO Statistics

Type Input Output

Hello 0 86

DB Description 0 0

Link-State Req 0 0

Link-State Update 0 0

Link-State Ack 0 0

SendPacket Peak-Control: (Disabled)

ASE: (Disabled)

LSAs originated by this router

Router: 1

Network: 0

Sum-Net: 0

Sum-Asbr: 0

External: 0

NSSA: 0

Opq-Link: 0

Opq-Area: 0

Opq-As: 0

LSAs Originated: 1 LSAs Received: 0

Routing Table:

Intra Area: 1 Inter Area: 0 ASE: 0

Up Interface Cumulate: 1

l The neighbor relationship is always in the Init state.

If the status of the neighbor relationship is always displayed as Init, the remote device cannot receive Hello packets from the local device. In this case, check whether the link or the remote device is faulty.

l The neighbor relationship is always in the 2-way state.

If the status of the neighbor relationship is always displayed as 2-way, run the display ospf interface command to check whether the DR priorities of the interfaces with OSPF enabled are 0.





Interfaces

Area: 0.0.0.0

IP Address Type State Cost Pri DR BDR

111.1.1.1 Broadcast DROther 1 0 111.1.1.2 0.0.0.0

– If the DR priorities of the interfaces with OSPF enabled are 0 and the state is DROther, neither the local device nor its neighbor is the DR or BDR, and they do not need to exchange LSAs. In this case, no action is required.

– If the DR priorities of the interfaces with OSPF enabled are not 0, go to Step 2.

l The neighbor relationship is always in the Exstart state.

If the status of the neighbor relationship is always displayed as Exstart, it indicates that the devices are exchanging DD packets but fail to synchronize LSDBs, which occurs in the following cases:

– Packets that are too long cannot be normally sent and received.

Run the ping -s 1500 neighbor-address command to check the sending and receiving of packets that are too long. If the two devices fail to ping each other, solve the link problem first.

– The OSPF MTUs of the two devices are different.

If the ospf mtu-enable command is run on the OSPF interfaces, check whether the OSPF MTUs on the two interfaces are the same. If they are not the same, change the MTUs of the interfaces so that they match.

If the fault persists, go to Step 2.

l The neighbor relationship is always in the Exchange state.

If the status of the neighbor relationship is always displayed as Exchange, the two devices are exchanging DD packets. In this case, follow the troubleshooting procedure provided for when the neighbor relationship is in the Init state. If the fault persists, go to Step 2.

l The neighbor relationship is always in the Loading state.

CAUTION

Restarting OSPF causes the re-establishment of all neighbor relationships in the OSPF process and the temporary interruption of services.

If the neighbor relationship is always in the Loading state, run the reset ospf process-id process command to restart the OSPF process.

If the fault persists, go to Step 2.
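The state-by-state guidance in Step 1 can be condensed into a lookup table. This is only a paraphrase of the checks above, written as a hypothetical helper: the dictionary and function names are not part of any product tool.

```python
# Neighbor state -> first check suggested by Step 1 (paraphrased).
CHECKS = {
    "Down": "check the physical and OSPF status of the interface",
    "Init": "check the link and the remote device",
    "2-Way": "check whether the DR priorities of the interfaces are 0",
    "Exstart": "check large-packet reachability and the OSPF MTUs",
    "Exchange": "check the link and the remote device",
    "Loading": "restart the OSPF process (interrupts services)",
}


def first_check(state):
    """Return the first check for a neighbor stuck in the given state."""
    if state is None:
        # No neighbor entry is displayed at all.
        return "see The OSPF Neighbor Relationship Is Down"
    # Match case-insensitively because the text writes both 2-Way and 2-way.
    for name, check in CHECKS.items():
        if name.lower() == state.lower():
            return check
    return "go to Step 2"


print(first_check("2-way"))
print(first_check(None))
```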

Step 2 Collect the following information and contact Huawei technical support personnel.


----End







Relevant Alarms

OSPF_1.3.6.1.2.1.14.16.2.2 ospfNbrStateChange

OSPF_1.3.6.1.2.1.14.16.2.8 ospfIfRxBadPacket

OSPF_1.3.6.1.2.1.14.16.2.16 ospfIfStateChange

Relevant Logs

None.


Routes Are Abnormal Because the FA Fields in Type 5 LSAs Are Set Incorrectly

Fault Symptom


In the networking shown in Figure 6-11, Switch C is a non-Huawei device. Switch A and Switch B are two switches. Switch A and Switch B each have two upstream XGE interfaces and are each configured with two static routes.

l Switch A

[SwitchA] ip route-static 0.0.0.0 0.0.0.0 192.168.0.69

[SwitchA] ip route-static 0.0.0.0 0.0.0.0 192.168.0.65

l Switch B

[SwitchB] ip route-static 0.0.0.0 0.0.0.0 192.168.0.5

[SwitchB] ip route-static 0.0.0.0 0.0.0.0 192.168.0.1

Switch A and Switch B advertise default routes to Switch C in an unforced manner. Normally, Switch C has a default external route to Switch A and another default external route to Switch B. Switch C, however, has a route to only one of Switch A and Switch B in the following situations:

l The static route with the next hop 192.168.0.65 on Switch A is deleted, and other configurations remain unchanged. In this case, Switch C has an OSPF default route to only Switch B.

l The static route with the next hop 192.168.0.1 on Switch B is deleted, and other configurations remain unchanged. In this case, Switch C has an OSPF default route to only Switch A.

Figure 6-11 Network diagram of the networking where routes on a device are abnormal

XGE0/0/2 XGE0/0/1 XGE0/0/2 XGE0/0/1

SwitchA

192.168.1.253

SwitchB

192.168.1.254

SwitchC






Fault Analysis

1. Run the undo ip route-static 0.0.0.0 0.0.0.0 192.168.0.65 command on Switch A, and then view the details about the corresponding LSA on Switch C. The FA field of the LSA is incorrectly set by Switch A. In this case, Switch C has an OSPF default route to only Switch B, because Switch C finds that the route to address 192.168.0.69 is unreachable when performing SPF calculation.

2. Run the undo ip route-static 0.0.0.0 0.0.0.0 192.168.0.1 command on Switch B, and then view the details about the corresponding LSA on Switch C. The FA field of the LSA is incorrectly set by Switch B. In this case, Switch C has an OSPF default route to only Switch A, because Switch C finds that the route to address 192.168.0.5 is unreachable when performing SPF calculation.

3. The preceding analysis shows that the root cause of the fault is that Switch A and Switch B incorrectly set the FA fields in the corresponding LSAs.

The rules the switch uses to fill in the FA fields of LSAs and calculate routes are as follows:

l When the value of the FA field of a Type 5 LSA is 0.0.0.0, the router that receives the LSA knows that the router sending the LSA is an advertising router (that is, an ASBR), and calculates the next hop.

l When all of the following conditions are met, an ASBR fills in an address other than 0.0.0.0 in the FA field of a Type 5 LSA, and the router that receives the LSA calculates the next hop based on the value of the FA field:

a. OSPF is enabled on the interface connecting the ASBR to an external network.

b. The interface connecting the ASBR to an external network is not configured as a silent interface.

c. The network type of the interface connecting the ASBR to an external network is not P2P or P2MP.

d. The address of the interface connecting the ASBR to an external network is within the network address range advertised by OSPF.

If any of the preceding conditions is not met, the FA field of an LSA is set to 0.0.0.0.
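The four conditions can be expressed as a single check. The following Python sketch is illustrative only and is not the switch's implementation. The ExternalInterface class and the forwarding_address function are hypothetical, the interface address 192.168.0.70 is an assumed address within 192.168.0.68/30, and using the next hop of the external route as the FA value is inferred from this case, in which Switch C looks up a route to 192.168.0.69.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class ExternalInterface:
    """Interface connecting the ASBR to the external network (hypothetical)."""
    address: str
    ospf_enabled: bool
    silent: bool
    network_type: str  # for example "broadcast", "nbma", "p2p", "p2mp"


def forwarding_address(interface, next_hop, advertised_networks):
    """Return the FA an ASBR would set in a Type 5 LSA under the four rules."""
    address = ipaddress.ip_address(interface.address)
    in_advertised_range = any(
        address in ipaddress.ip_network(net) for net in advertised_networks)
    if (interface.ospf_enabled
            and not interface.silent
            and interface.network_type not in ("p2p", "p2mp")
            and in_advertised_range):
        # All four conditions hold: a non-zero FA is advertised
        # (assumed here to be the next hop of the external route).
        return next_hop
    return "0.0.0.0"


# Switch A: "network 192.168.0.68 0.0.0.3" corresponds to 192.168.0.68/30.
uplink = ExternalInterface("192.168.0.70", True, False, "broadcast")
print(forwarding_address(uplink, "192.168.0.69", ["192.168.0.68/30"]))

# Changing the network type to P2P breaks condition c.
uplink.network_type = "p2p"
print(forwarding_address(uplink, "192.168.0.69", ["192.168.0.68/30"]))
```

Each of the remedies in the procedure that follows works by making one of the four conditions false, or by making the advertised FA reachable from Switch C.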

Procedure

Step 1 Do as follows to rectify the fault:

l Check the data configuration on Switch A and Switch B. The following information can be found:

– The network 192.168.0.68 0.0.0.3 command rather than the network 192.168.0.64 0.0.0.3 command is run in the OSPF process on Switch A.

– The network 192.168.0.4 0.0.0.3 command rather than the network 192.168.0.0 0.0.0.3 command is run in the OSPF process on Switch B.

l In the OSPF process on Switch A, delete the network command used to advertise the network segment to which the next hop of the configured static route corresponds. Perform the same operation on Switch B. Then, the fault is rectified.

l Run the ospf network-type p2p command on the interface specified in the network command run on Switch A to change the network type of the interface. Then, perform the same operation on Switch B. After that, the fault is rectified.

l Set the corresponding interface on Switch A to be a silent interface, or ensure that Switch C has reachable routes to all the next hops of the static routes on Switch A. Perform the same operation on Switch B. Then, the fault is rectified.

----End

Summary

The network segment addresses and interface types of OSPF interfaces must be correctly configured. This allows the switch to correctly fill in the FA field in a Type 5 LSA and calculate routes based on defined rules.

The Switch Receives Two LSAs with the Same LS ID but Fails to Calculate a Route Based on One of the LSAs

Fault Symptom


In the networking shown in Figure 6-12, traffic is unevenly distributed between the path from Switch A to the BAS and the path from Switch B to the BAS. Load balancing between the path Switch A -> BAS -> destination and the path Switch A -> Switch B -> BAS -> destination must be configured for the traffic transmitted from Switch A to the network segment to which the BAS is connected.

Figure 6-12 Network diagram of the switch receiving two LSAs with the same LS ID but failing to calculate a route based on one of the LSAs

SwitchA SwitchB

10.1.2.26

BAS

10.1.3.1

10.1.1.0

Static route destined for

10.1.1.0

The following uses traffic sent to network segment 10.1.1.0 as an example.

On Switch B, a static route to 10.1.1.0 is configured and OSPF is configured to import static routes. Switch A receives an ASE LSA with the LS ID 10.1.1.0 from Switch B and an ASE LSA with the same LS ID from the BAS. Switch A can calculate a route based on the LSA received from the BAS, but fails to calculate a route based on the LSA received from Switch B.

Fault Analysis

The possible causes are as follows:

1. Device configurations are incorrect.

2. The FA field in the LSA sent by Switch B is 10.1.2.26. The LSA is not calculated because the FA field of the LSA is incorrect.

3. The conditions required to generate routes for load balancing are not met.

Based on the analysis of the preceding possible causes, it can be concluded:

1. The configurations of the devices are normal.

2. The FA field of the LSA meets the condition for route calculation.

<SwitchA> ping 10.1.3.1


Reply from 10.1.3.1: bytes=56 Sequence=1 ttl=255 time=1 ms








0.00% packet loss

round-trip min/avg/max = 1/1/1 ms

<SwitchA> display ip routing-table 10.1.3.1


------------------------------------------------------------------------------


Summary Count : 2


10.1.3.1/32 O_ASE 150 1 D 10.1.2.45


O_ASE 150 1 D 10.1.2.49


<SwitchA> ping 10.1.2.26



0.00% packet loss

round-trip min/avg/max = 1/1/1 ms

<SwitchA> display ip routing-table 10.1.2.26

10.1.2.24/30 OSPF 10 101 D 10.1.2.45


OSPF 10 101 D 10.1.2.49 XGigabitEthernet0/0/6

3. On this network, the costs of LSAs are 1. Compare the cost of the route to the ASBR and the cost of the route to the FA.

For Type 2 ASE LSAs, OSPF equal-cost routes can be generated when the following conditions are met:

a. The costs of LSAs are the same.

b. The cost of the route to the ASBR is the same as the cost of the route to the FA.

On the network, the cost of the route to the FA is 101.

l For the LSA with the FA field 0.0.0.0, the cost of the route to the ASBR at 10.1.3.1 is 1.

l For the LSA with an FA field other than 0.0.0.0, the cost of the route to the FA at 10.1.2.26 is 101.

The LSA with the FA field set is not calculated because the cost of the route to its FA (101) is higher than the cost of the route to the ASBR (1), which gives the LSA a lower priority. As a result, equal-cost routes cannot be formed.
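The comparison can be sketched as follows. This is a simplified model of the two conditions above, not the full OSPF route selection. The function name usable_lsas is hypothetical, and the costs are the ones given in this case.

```python
def usable_lsas(lsas):
    """Keep only the Type 2 ASE LSAs that tie for the best comparison key.

    Each LSA is (name, lsa_cost, target_cost), where target_cost is the
    cost of the route to the ASBR (FA 0.0.0.0) or to the FA.
    """
    best = min((lsa_cost, target_cost) for _, lsa_cost, target_cost in lsas)
    return [name for name, lsa_cost, target_cost in lsas
            if (lsa_cost, target_cost) == best]


# Costs from this case: both LSAs have cost 1; the route to the ASBR
# (BAS, 10.1.3.1) costs 1 and the route to the FA 10.1.2.26 costs 101.
before = [("BAS", 1, 1), ("SwitchB", 1, 101)]

# If both LSAs carried an FA reached with cost 101, the costs would tie.
after = [("BAS", 1, 101), ("SwitchB", 1, 101)]

print(usable_lsas(before))  # ['BAS']
print(usable_lsas(after))   # ['BAS', 'SwitchB']
```

Only when both LSAs tie on both costs do both survive, which is the condition for forming equal-cost routes.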






Procedure

Step 1 To form equal-cost routes on the network, do as follows:

On the BAS, run the network command to enable OSPF on the next-hop interface of the route to 10.1.1.0. Run the ospf cost command to set the cost of the interface to 100 so that the interface advertises LSAs with the FA field as the address of the interface.

Then, there will be two LSAs with FA fields on Switch A. The cost of the route to one FA and the cost of the route to the other FA are both 101. Thus, equal-cost routes can be formed.

----End

Summary

To form equal-cost routes, ensure that the interfaces advertise LSAs with the FA field set to the addresses of the interfaces, and set the interface costs so that the costs of the routes to the FAs are the same.

The OSPF Neighbor Relationship Cannot Be Established Between Two Devices Because the Link Between the Devices Is Faulty

Fault Symptom

In the networking shown in Figure 6-13, the OSPF neighbor relationship cannot be established between Switch A and its neighbor, and the neighbor is in the Exchange state.

Figure 6-13 Network diagram of the networking where the neighbor relationship cannot be established between two devices

10.1.1.0

SwitchA

Device of another manufacturer

Fault Analysis

The possible causes are as follows:

l The OSPF configurations are improper.

l Parameters of the two devices are incorrectly set.

l The OSPF packets are lost.

Check the configuration of Switch A and find that Switch A is correctly configured.

Check the OSPF parameters on the corresponding interfaces and find that the OSPF parameters on the interfaces are set correctly.

Run the related debugging command on Switch B and find that MTU negotiation fails.

The MTUs on the two devices are 4470. The debugging ospf packet dd command, however, shows that the MTU contained in the packet received by Switch B is 0, which indicates that the MTU is not set on the peer device. It is concluded that the link is not working normally.






Procedure

Step 1 Replace the faulty board on Switch B.

----End

Summary

Run the following command on Switch A to ping the peer device. Packet loss occurs.

<SwitchA> ping 10.1.1.0


Request time out




Request time out




40.00% packet loss

Ensure that the link between intermediate transmission devices is normal. Collect traffic statistics from Switch A. It is found that packet loss does not occur on Switch A. Thus, packet loss may be occurring on the board of the peer device or on the link.

Collect traffic statistics on the peer device. It is found that packet loss occurs on the board on Switch B because the board is faulty.

Sometimes, OSPF packets are not received. In this case, check connectivity at the link layer first. Enable OSPF debugging with commands such as debugging ospf packet and debugging ospf event to locate the fault, or run the display ospf error command to view the various OSPF error statistics. If the OSPF configuration is correct, run the debugging ip packet command to check whether packets are successfully forwarded at the IP layer.

An OSPF Routing Loop Occurs Because Router IDs of Devices Conflict

Fault Symptom

In the networking shown in Figure 6-14, OSPF multi-instance is run between PEs and CEs. The CEs are Layer 3 switches of other manufacturers. The PEs deliver OSPF default routes to interwork the networks of two cities. PE1 and PE2 are connected to the same UMG. The same IP address, 10.1.1.33, is set for the interface connecting PE1 to the UMG and the interface connecting PE2 to the UMG, and the two interfaces are bound to the VPN instance of the UMG. Normally, the link between the UMG and PE2 is Down. The two interfaces with the IP address 10.1.1.33 on the two PEs cannot be in the Up state at the same time.

CE1 can successfully ping PE1, and CE2 can successfully ping PE2. When a CE pings a remote peer or a device on the remote network, packet loss occasionally occurs.






Figure 6-14 Network diagram of an OSPF routing loop that occurs because router IDs of the devices conflict

[Figure: PE1 and CE1 in CityA; PE2 and CE2 in CityB]

Fault Analysis

1. 10.1.1.33 is the largest IP address in the VPN instance to which the two PEs are bound, and the following command is run to configure OSPF multi-instance:

<PE1> ospf 4 vpn-instance www

PE1 and PE2 select 10.1.1.33 as their router ID.

2. On CE1, the router ID of PE1 is 10.1.1.33; on CE2, the router ID of PE2 is also 10.1.1.33.

3. Debugging information on the CEs shows that a device with the router ID 10.1.1.33 sends LSAs every five seconds and that the sequence numbers of the LSAs keep increasing.

4. The CEs receive LSAs sent by two devices with the same router ID. This causes the OSPF default routes in the routing tables of the CEs to change constantly. When the default route of CE1 is learned by CE2 and the default route of CE2 is learned by CE1, a routing loop occurs. As a result, routes are unreachable and packet loss occurs.
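The router ID conflict analyzed above can be reproduced offline. The following is a minimal sketch, assuming (as in this case) that a process without an explicit router ID takes the largest interface IP address in the VPN instance; real router ID selection on a device may follow additional rules. The function names and all addresses other than 10.1.1.33 are hypothetical.

```python
from collections import defaultdict
from ipaddress import IPv4Address


def default_router_id(interface_ips):
    """Simplified rule from this case: the largest interface IP address becomes the router ID."""
    return str(max(IPv4Address(ip) for ip in interface_ips))


def find_router_id_conflicts(devices):
    """Map each router ID that is selected by more than one device to the device names."""
    by_id = defaultdict(list)
    for name, ips in devices.items():
        by_id[default_router_id(ips)].append(name)
    return {rid: names for rid, names in by_id.items() if len(names) > 1}


# 10.1.1.33 is from the case; the other addresses are made up for illustration.
devices = {
    "PE1": ["10.1.1.33", "10.1.1.1"],
    "PE2": ["10.1.1.33", "10.1.1.5"],
}
print(find_router_id_conflicts(devices))
```

Running such a check against planned addressing before deployment shows which processes need an explicitly configured router ID.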

Procedure

Step 1 Run the ospf 4 router-id 10.2.2.9 vpn-instance www command on PE1 to set the router ID of the OSPF multi-instance process to an address that is unique to PE1, and run the ospf 4 router-id 10.2.2.10 vpn-instance www command on PE2 to set the router ID of the OSPF multi-instance process to an address that is unique to PE2.

[PE1] ospf 4 router-id 10.2.2.9 vpn-instance www

[PE2] ospf 4 router-id 10.2.2.10 vpn-instance www

Step 2 Restart the OSPF process associated with the VPN instance on PE1, and then perform the same operation on PE2. Services are restored after both OSPF processes restart.

----End

Summary

When OSPF multi-instance is configured, specify a unique router ID for the OSPF process on each PE.

6.5 IS-IS Troubleshooting

6.5.1 The IS-IS Neighbor Relationship Cannot Be Established






Common Causes

This fault is commonly caused by one of the following:

l IS-IS cannot normally send or receive Hello packets due to a device fault or a link fault.

l The devices at both ends of the link are configured with the same system ID.

l The MTUs configured on the interfaces at both ends of the link are different or the MTU of an interface is smaller than the length of a Hello packet to be sent.

l The IP addresses of the two interfaces at both ends of the link are on different network segments.

l The authentication configurations on the IS-IS interfaces at both ends of the link are inconsistent.

l The IS-IS levels of the interfaces at both ends of the link are inconsistent.

l The area addresses of the devices at both ends of the link are inconsistent when the devices establish the IS-IS Level-1 neighbor relationship.


Figure 6-15 Flowchart for troubleshooting the fault that the IS-IS neighbor relationship cannot be established

[Flowchart: check in turn whether Hello packets are normally sent and received (if not, check devices and intermediate links), whether the physical status of the interface is Up (if not, check the link status of the interface), whether the IS-IS status of the interface is Up (if not, check the IP address and MTU of the interface), and whether the IS-IS configuration is correct (if not, modify the IS-IS configuration). After each corrective action, check whether the fault is rectified.]




Procedure

Step 1 Check the status of IS-IS interfaces.

Run the display isis interface command to check the state of interfaces enabled with IS-IS (the value of the IPv4.State item).

l If the state is Mtu:Up/Lnk:Dn/IP:Dn, go to Step 2.

l If the state is Mtu:Dn/Lnk:Up/IP:Up, run the display current-configuration interface interface-type [ interface-number ] command to check the MTUs on the interfaces. Run the display current-configuration configuration isis command to check the lengths of LSPs in an IS-IS process.

On a P2P interface, the LSP length should not be greater than the MTU on the P2P interface.

On a broadcast interface, the MTU on the interface minus the LSP length should be equal to or greater than 3. If the condition is not met, run the lsp-length command in the IS-IS view to change the LSP length, or run the mtu command in the interface view to change the MTU. If the MTUs on the interfaces at the two ends of the link are different, change them to the same value.

If the fault is still not rectified, go to Step 4.

l If the state is Down, run the display current-configuration configuration isis command to check the configuration of the IS-IS process. Check whether the NET is configured in the IS-IS process. If not, run the network-entity command in the IS-IS process to configure it. If the fault is still not rectified, go to Step 2.

l If the state is Up, go to Step 4.
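The MTU and LSP length conditions given in Step 1 can be checked with a few lines of code. The following is a minimal sketch of those two rules only; the function name and the sample values are hypothetical.

```python
def lsp_length_fits(mtu: int, lsp_length: int, interface_type: str) -> bool:
    """Apply the Step 1 rules: P2P needs lsp_length <= mtu; broadcast needs mtu - lsp_length >= 3."""
    if interface_type == "p2p":
        return lsp_length <= mtu
    if interface_type == "broadcast":
        return mtu - lsp_length >= 3
    raise ValueError(f"unknown interface type: {interface_type}")


# Hypothetical values: a 1500-byte MTU with two candidate LSP lengths.
print(lsp_length_fits(1500, 1497, "broadcast"))
print(lsp_length_fits(1500, 1498, "broadcast"))
```

If the function returns False for an interface, either lower the LSP length with the lsp-length command or raise the MTU with the mtu command, as described in Step 1.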

Step 2 Check that the interface status is Up.

Run the display ip interface [ interface-type [ interface-number ] ] command to check the status of specified interfaces.

l If the interface link status (the Line protocol current state field in the output) is not Up, troubleshoot the interface fault. If the fault is still not rectified, go to Step 3.

l If the interface status is Up, go to Step 3.

Step 3 Check that the IP addresses of the two interfaces at both ends of the link are on the same network segment.

l If the IP addresses of the two interfaces are on different network segments, change the IP addresses of the two interfaces to ensure that the two IP addresses are on the same network segment. If the fault is still not rectified, go to Step 4.

l If the IP addresses of the two interfaces are on the same network segment, go to Step 4.
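The network segment comparison in Step 3 can be done mechanically once the address and mask of each interface are known. The following is a minimal sketch using the Python standard library; the function name and the sample addresses are hypothetical.

```python
from ipaddress import ip_interface


def same_segment(local: str, remote: str) -> bool:
    """Return True if two interface addresses (in address/mask-length form) share one network."""
    return ip_interface(local).network == ip_interface(remote).network


# Hypothetical interface addresses at the two ends of a link.
print(same_segment("10.1.1.1/24", "10.1.1.2/24"))
print(same_segment("10.1.1.1/24", "10.1.2.1/24"))
```

Note that the sketch treats different mask lengths as different segments, which is stricter than comparing only the address bits.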

Step 4 Check that IS-IS can normally receive and send Hello packets.

Run the display isis statistics packet [ interface interface-type interface-number ] command to check whether IS-IS can normally receive and send Hello packets.

NOTE

The default interval at which IS-IS sends Hello packets is 10s. Therefore, run this command every 10s to check whether the packet statistics increase (L1 IIH or L2 IIH).

On a broadcast interface, Hello packets have IS-IS levels, and therefore you can view the statistics about Hello packets based on the levels of established neighbor relationships. On a P2P interface, Hello packets have no IS-IS levels and are recorded as L2 IIH packets.

l If the number of received Hello packets does not increase for a certain period, check whether the IS-IS packets are lost.

– For a broadcast interface, run the debugging ethernet packet isis interface-type interface-number command. The following information indicates that the interface can normally receive and send IS-IS Hello packets.

*0.75124950 HUAWEI ETH/7/eth_rcv:Receive an Eth Packet, interface : Vlanif10, eth format: 3, length: 60, protoctype: 8000 isis, src_eth_addr: 00e0-fc37-08c1, dst_eth_addr: 0180-c200-0015

*0.75124950 HUAWEI ETH/7/eth_send:Send an Eth Packet, interface : Vlanif10, eth format: 3, length: 112, protoctype: 8000 isis, src_eth_addr: 00e0-fc26-f9d9, dst_eth_addr : 0180-c200-0015

NOTE

If the DIS field shown in the output of the display isis interface interface-type interface-number command is "--", the interface type is P2P. Otherwise, the interface type is Broadcast.

If the device cannot normally receive and send Hello packets, go to Step 9.

l If the device can normally receive Hello packets, go to Step 5.

– If the interfaces at both ends of the link are trunk interfaces, check whether the numbers of member interfaces in the Up state in the two trunk interfaces are the same. If the numbers are different, add the required physical interfaces to the trunk interface correctly. Otherwise, go to Step 2.

– If the interfaces at both ends of the link are not trunk interfaces, go to Step 2.

Step 5 Check that the devices at both ends of the link are configured with different system IDs.

Run the display current-configuration configuration isis command to check whether the system IDs of the two devices are the same.

l If the system IDs of the two devices are the same, set different system IDs for the two devices.

l If the system IDs of the two devices are different, go to Step 6.

Step 6 Check that the IS-IS levels of the two devices at both ends of the link match.

Run the display current-configuration configuration isis | include is-level command to check the levels of the IS-IS processes on the two devices. Then, run the display current-configuration interface interface-type interface-number | include isis circuit-level command to check whether the IS-IS levels of the interfaces at both ends of the link match. The IS-IS neighbor relationship can be established only when the IS-IS levels of the two interfaces match.

l If the IS-IS levels of the two devices do not match, run the is-level command in the IS-IS view to set matching IS-IS levels for the two devices, or run the isis circuit-level command in the interface view to change the levels of related interfaces.

l If the IS-IS levels of the two devices match, go to Step 7.
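The level check in Step 6 amounts to asking whether the two interfaces share at least one IS-IS level. The following is a minimal sketch, assuming the three level names level-1, level-2, and level-1-2; the table and function names are hypothetical, and the sketch ignores the area address requirement for Level-1 neighbors that is covered in Step 7.

```python
# Levels that each configured value allows an interface to run.
LEVELS = {
    "level-1": {1},
    "level-2": {2},
    "level-1-2": {1, 2},
}


def common_levels(local: str, remote: str) -> set:
    """Return the IS-IS levels shared by both ends; an empty set means the levels do not match."""
    return LEVELS[local] & LEVELS[remote]


print(common_levels("level-1-2", "level-2"))
print(common_levels("level-1", "level-2"))
```

An empty result corresponds to the first bullet of Step 6: change the process level or the interface circuit level so that the two ends overlap.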

Step 7 Check that the area addresses of the two devices at both ends of the link are the same.

When the area addresses of the two devices are different, the alarm ISIS_1.3.6.1.3.37.2.0.12 isisAreaMismatch is generated.

NOTE

If two devices at both ends of a link establish a Level-1 neighbor relationship, ensure that the two devices are in the same area.

An IS-IS process can be configured with a maximum of three area addresses. As long as one of the area addresses of the local IS-IS process is the same as one of the area addresses of the remote IS-IS process, the Level-1 neighbor relationship can be established.

When the IS-IS Level-2 neighbor relationship is established between two devices, you do not need to determine whether the area addresses of the two devices match.
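The Level-1 area address rule in this note can be expressed as a set intersection. The following is a minimal sketch, assuming area addresses are given as strings; the function name and the sample area addresses are hypothetical.

```python
MAX_AREA_ADDRESSES = 3  # Limit stated in the note above.


def level1_areas_match(local_areas, remote_areas) -> bool:
    """Return True if the two IS-IS processes share at least one area address."""
    for areas in (local_areas, remote_areas):
        if not 1 <= len(set(areas)) <= MAX_AREA_ADDRESSES:
            raise ValueError("each process needs one to three area addresses")
    return bool(set(local_areas) & set(remote_areas))


# Hypothetical area addresses.
print(level1_areas_match(["49.0001", "49.0002"], ["49.0002"]))
print(level1_areas_match(["49.0001"], ["49.0003"]))
```

The check applies only to Level-1 neighbor relationships; as the note states, Level-2 neighbors do not need matching area addresses.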

l If the area addresses of the two devices are different, run the network-entity command in the IS-IS view to set the same area address for the two devices.

l If the area addresses of the two devices at both ends of the link are the same, go to Step 8.

Step 8 Check that the authentication configurations of the two devices at both ends of the link are the same.



If the authentication types of the two devices are different, the alarm ISIS_1.3.6.1.3.37.2.0.9 isisAuthenticationTypeFailure or the alarm ISIS_1.3.6.1.3.37.2.0.10 isisAuthenticationFailure is generated.

Run the display current-configuration interface interface-type interface-number | include isis authentication-mode command to check whether the IS-IS authentication configurations of the two interfaces at both ends of the link are the same.

l If the authentication types on the two interfaces are different, run the isis authentication-mode command in the view of each of the two interfaces to set the same authentication type for the two interfaces.

l If the authentication passwords on the two interfaces are different, run the isis authentication-mode command in the view of each of the two interfaces to set the same authentication password for the two interfaces.

l If the authentication configurations of the two devices are the same, go to Step 9.



----End


Relevant Alarms

ISIS_1.3.6.1.3.37.2.0.12 isisAreaMismatch

ISIS_1.3.6.1.3.37.2.0.9 isisAuthenticationTypeFailure

ISIS_1.3.6.1.3.37.2.0.10 isisAuthenticationFailure

Relevant Logs

None.

6.5.2 A Device Fails to Learn Specified IS-IS Routes from Its Neighbor

Common Causes

This fault is commonly caused by one of the following:

l Another routing protocol whose priority is higher than that of IS-IS advertises the same routes as those advertised by IS-IS.

l The preferences of the imported external routes are low, and therefore the imported external routes are not preferred.

l The IS-IS cost styles of the two devices are inconsistent.

l The IS-IS neighbor relationship is not normally established between the two devices.

l The two devices are configured with the same system ID.

l The authentication configurations of the two devices are inconsistent.

l LSP loss occurs due to a device fault or a link fault.


After IS-IS is configured on the network, it is found that the device cannot learn specified IS-IS routes from its neighbor.

The troubleshooting roadmap is as follows:

l Check whether another protocol also learns specified routes.

l Check whether IS-IS calculates routes.

l Check whether IS-IS LSDBs are synchronized.

l Check whether the IS-IS configuration is correct.

Figure 6-16 Flowchart for troubleshooting the fault that a device cannot learn IS-IS routes

[Flowchart: a device fails to learn specified routes from its neighbor. Check whether the specified routes exist in the IS-IS routing table and whether another routing protocol advertises the same routes, whether the specified routes are advertised (if not, check the IS-IS configuration of the device that advertises the routes), whether the IS-IS LSDBs are synchronized (if not, check the IS-IS configuration), whether the IS-IS cost styles are consistent (if not, ensure that the cost styles on both ends of the link are consistent), and whether the IS-IS neighbor relationship is normally established (if not, troubleshoot the neighbor relationship fault). After each corrective action, check whether the fault is rectified.]





Procedure

Step 1 Check that the IS-IS routing table of the device that fails to learn specified routes is correct.

Run the display isis route command to view the IS-IS routing table.

l If the specified routes exist in the IS-IS routing table, run the display ip routing-table ip-address [ mask | mask-length ] verbose command to check whether routes advertised by a routing protocol whose priority is higher than that of IS-IS exist in the routing table.

NOTE

If the value of the State field of a route is Active Adv, it indicates that the route is an active route.

If there are multiple routes that have the same prefix but are advertised by different routing protocols, the route advertised by the routing protocol with the highest priority is preferred as the active route.

– If there are such routes in the routing table, adjust the configuration based on the network planning.

– If there are no such routes in the routing table, go to Step 6.

l If there is no specified route in the IS-IS routing table, go to Step 2.
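The selection of the active route among routes with the same prefix, described in the note in Step 1, can be illustrated as follows. This is a minimal sketch that assumes a smaller preference value means a higher priority; the preference values shown are illustrative assumptions and must be checked against the Pre field in the actual display ip routing-table output.

```python
def active_route(candidates):
    """Pick the (protocol, preference) pair with the smallest preference value."""
    return min(candidates, key=lambda route: route[1])


# Assumed preference values for two routes to the same prefix (hypothetical).
routes = [("ISIS", 15), ("OSPF", 10)]
print(active_route(routes))
```

With these assumed values, the route from the other protocol is selected and the IS-IS route stays inactive, which is the situation Step 1 checks for.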

Step 2 Check that the specified IS-IS routes are advertised.

On the device that advertises specified routes, run the display isis lsdb local verbose command to check whether LSPs generated by the device carry the specified routes.

l If the LSPs do not carry the specified routes, check whether the configurations of the device are correct, for example, whether IS-IS is enabled on associated interfaces.

NOTE

If the specified routes are imported external routes, run the display ip routing-table protocol protocol verbose command to check whether the external routes are active routes.

l If the LSPs carry the specified routes, go to Step 3.

Step 3 Check that IS-IS LSDBs are synchronized.

On the device that fails to learn specified IS-IS routes, run the display isis lsdb command to check whether the device learns LSPs from the device that advertises specified routes.

NOTE

LSPID identifies an LSP, and Seq Num is the sequence number of an LSP. The greater the sequence number, the newer the LSP.

l If the LSDB of the device that fails to learn specified IS-IS routes does not have specified LSPs, do as follows as required:

– If the alarm ISIS_1.3.6.1.3.37.2.0.9 isisAuthenticationTypeFailure or the alarm ISIS_1.3.6.1.3.37.2.0.10 isisAuthenticationFailure is generated, it indicates that the authentication types or authentication passwords of the device that fails to learn specified routes and the device that advertises the specified routes are inconsistent. In this case, set the same authentication type and authentication password for the two devices.

– If neither the alarm ISIS_1.3.6.1.3.37.2.0.9 isisAuthenticationTypeFailure nor the alarm ISIS_1.3.6.1.3.37.2.0.10 isisAuthenticationFailure is generated, check whether devices or intermediate links are faulty.

l If the LSDB of the device that fails to learn specified IS-IS routes contains specified LSPs, but the Seq Num fields of the LSPs are different from those in the output of the display isis lsdb local verbose command, and the values of the Seq Num fields keep increasing, it indicates that there is another device configured with the same system ID as the device that advertises specified routes on the network. In this case, the alarm ISIS_1.3.6.1.3.37.2.0.8 isisSequenceNumberSkip is generated, and you need to check the IS-IS configurations on the devices on the network.

l If the LSDB of the device that fails to learn specified IS-IS routes contains specified LSPs, but the Seq Num fields of the LSPs are inconsistent and the values of the Seq Num fields remain unchanged, it indicates that the LSPs may be discarded during transmission. In this case, you need to check whether devices or intermediate links are faulty.

l If the LSDB of the device that fails to learn specified IS-IS routes contains specified LSPs and the Seq Num fields of the LSPs are consistent, go to Step 4.
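The four outcomes of Step 3 can be summarized as a small decision function. The following is a minimal sketch, assuming the Seq Num on the advertising device is read once and the Seq Num on the learning device is sampled several times (None when the LSP is absent); the function name, return labels, and sample values are hypothetical.

```python
def diagnose_lsp(advertised_seq, learned_samples):
    """Classify LSDB synchronization from Seq Num readings taken on the learning device."""
    present = [seq for seq in learned_samples if seq is not None]
    if not learned_samples or learned_samples[-1] is None:
        return "missing"              # LSP absent: check authentication alarms, devices, links
    if present[-1] == advertised_seq:
        return "synchronized"         # Seq Num consistent: go to Step 4
    if present[-1] > present[0]:
        return "duplicate-system-id"  # Seq Num differs and keeps increasing
    return "lsp-loss"                 # Seq Num differs and remains unchanged


print(diagnose_lsp(0x10, [0x20, 0x25, 0x2A]))
print(diagnose_lsp(0x10, [0x0E, 0x0E]))
```

The labels only point to the matching bullet in Step 3; the corrective actions remain those described there.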

Step 4 Check whether the IS-IS cost styles of the two devices are consistent.

Run the display current-configuration configuration isis command on the device that advertises specified routes and the device that fails to learn specified IS-IS routes respectively to check whether the IS-IS cost styles (the cost-style command) of the two devices are consistent.

NOTE

Two devices can learn routes from each other only when the IS-IS cost styles of the two devices match.

The IS-IS cost styles are classified as follows:

l narrow: indicates that the packets with the cost style being narrow can be received and sent.

l narrow-compatible: indicates that the packets with the cost style being narrow or wide can be received but only the packets with the cost style being narrow can be sent.

l compatible: indicates that the packets with the cost style being narrow or wide can be received and sent.

l wide-compatible: indicates that the packets with the cost style being narrow or wide can be received but only the packets with the cost style being wide can be sent.

l wide: indicates that the packets with the cost style being wide can be received and sent.

If the cost style of one device is narrow and the cost style of the other device is wide or wide-compatible, or the cost style of one device is narrow-compatible and the cost style of the other device is wide, the two devices cannot interwork.
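The cost style descriptions above can be modeled as a pair of send and receive sets per style. The following is a minimal sketch that assumes two devices interwork when each can receive at least one style that the other sends; this rule is an interpretation of the list above, and the table and function names are hypothetical. Under this assumption, the only failing combinations are the three named in this note.

```python
# Cost styles that each setting sends and accepts, taken from the list above.
SEND = {
    "narrow": {"narrow"},
    "narrow-compatible": {"narrow"},
    "compatible": {"narrow", "wide"},
    "wide-compatible": {"wide"},
    "wide": {"wide"},
}
RECEIVE = {
    "narrow": {"narrow"},
    "narrow-compatible": {"narrow", "wide"},
    "compatible": {"narrow", "wide"},
    "wide-compatible": {"narrow", "wide"},
    "wide": {"wide"},
}


def can_interwork(a: str, b: str) -> bool:
    """Assumed rule: each side must accept at least one style that the other side sends."""
    return bool(SEND[a] & RECEIVE[b]) and bool(SEND[b] & RECEIVE[a])


print(can_interwork("narrow", "wide-compatible"))
print(can_interwork("narrow-compatible", "wide-compatible"))
```

Even where the sketch reports that two styles can interwork, the procedure below recommends setting the same cost style on both devices.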

l If the IS-IS cost styles on the two devices are inconsistent, run the cost-style command to set the same IS-IS cost style for the two devices.

l If the IS-IS cost styles on the two devices are consistent, go to Step 5.

Step 5 Check that the IS-IS neighbor relationship is normally established.

Run the display isis peer command on every device on the path to check whether the IS-IS neighbor relationships are normally established.

l If the State field is not Up, troubleshoot the fault according to The IS-IS Neighbor Relationship Cannot Be Established.

l If the State field is Up, go to Step 6.



----End





Relevant Alarms

ISIS_1.3.6.1.3.37.2.0.8 isisSequenceNumberSkip

ISIS_1.3.6.1.3.37.2.0.9 isisAuthenticationTypeFailure

ISIS_1.3.6.1.3.37.2.0.10 isisAuthenticationFailure

Relevant Logs

None.

6.5.3 The IS-IS Neighbor Relationship Flaps


Common Causes

This fault is commonly caused by one of the following:

l Packet loss occurs because the link is unstable or devices work abnormally.

l The member interfaces of the trunk interface are incorrectly connected.


After IS-IS is configured on the network, it is found that the IS-IS neighbor relationship flaps.

Figure 6-17 Troubleshooting flowchart when IS-IS neighbors flap

[Flowchart: the IS-IS neighbor relationship flaps. Check log information to identify the change type of the IS-IS neighbor relationship. If the neighbor relationship is Down because the Hold timer expires, check the local device and the intermediate link. If the status of the neighbor relationship changes between Up and Init, check the local device and the intermediate link. If the status of the neighbor relationship is MULTIPLE_P2P_ADJ, check that member interfaces of the trunk interface are correctly connected. After each corrective action, check whether the fault is rectified.]




Procedure

Step 1 Check the change type of the IS-IS neighbor relationship.

When the IS-IS neighbor relationship changes, the alarm ISIS_1.3.6.1.3.37.2.0.17 isisAdjacencyChange and the log ISIS/4/ADJ_CHANGE_LEVEL are generated.

NOTE

The log ISIS/4/ADJ_CHANGE_LEVEL is recorded only when the log-peer-change command is run in the IS-IS process.

l If the log-peer-change command is run in the IS-IS process, you can view the value of the ChangeType field in the log information.

– If the value of the ChangeType field is HOLDTIMER_EXPIRED, it indicates that the local device cannot normally receive Hello packets from its neighbor. In this case, you need to check whether packet loss occurs because the local device or the intermediate link is faulty.

– If the value of the ChangeType field changes between 3_WAY_INIT and 3_WAY_UP (for P2P interfaces) or is NEW_L1_ADJ or NEW_L2_ADJ (for broadcast interfaces), it indicates that the status of the neighbor relationship changes between Up and Init. This is because the remote device cannot normally receive Hello packets from the local device. In this case, check whether packet loss occurs because the intermediate link or the remote device is faulty.

– If the value of the ChangeType field is MULTIPLE_P2P_ADJ and the interface is an IP-Trunk interface, check whether the member interfaces of the trunk interface are correctly connected.

– In other cases, go to Step 2.

l If the log-peer-change command is not run, run the display isis peer command repeatedly, and then view the values of the State and HoldTime fields to identify the change type of the IS-IS neighbor relationship.

– When the neighbor relationship flaps, if the value of the State field remains unchanged, the value of the HoldTime field keeps decreasing, and the neighbor relationship is deleted after the value of the HoldTime field decreases to 0, it indicates that the local device cannot normally receive Hello packets from the remote device. In this case, you need to check whether packet loss occurs because the intermediate link or the local device is faulty.

– When the neighbor relationship flaps, if the value of the State field changes between Up and Init, it indicates that the remote device cannot normally receive Hello packets from the local device. In this case, you need to check whether packet loss occurs because the intermediate link or the remote device is faulty.

– In other cases, go to Step 2.
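The ChangeType values listed in Step 1 map to a small set of checks. The following is a minimal sketch of that mapping for use when many ISIS/4/ADJ_CHANGE_LEVEL log entries must be triaged; the function name and the returned text are hypothetical, and extracting the ChangeType value from the log line is left out because the exact log layout is not shown in this section.

```python
UP_INIT_TYPES = {"3_WAY_INIT", "3_WAY_UP", "NEW_L1_ADJ", "NEW_L2_ADJ"}


def diagnose_change_type(change_type: str, is_ip_trunk: bool = False) -> str:
    """Map a ChangeType value to the check recommended in Step 1."""
    if change_type == "HOLDTIMER_EXPIRED":
        return "check the local device and the intermediate link"
    if change_type in UP_INIT_TYPES:
        return "check the intermediate link and the remote device"
    if change_type == "MULTIPLE_P2P_ADJ" and is_ip_trunk:
        return "check the trunk member interface connections"
    return "go to Step 2"


print(diagnose_change_type("HOLDTIMER_EXPIRED"))
print(diagnose_change_type("MULTIPLE_P2P_ADJ", is_ip_trunk=True))
```

For the Up and Init case, the sketch does not verify that the value actually alternates over time; that still has to be confirmed from consecutive log entries.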



----End


Relevant Alarms

ISIS_1.3.6.1.3.37.2.0.17 isisAdjacencyChange

Relevant Logs

ISIS/4/ADJ_CHANGE_LEVEL

6.5.4 IS-IS Routes Flap




Common Causes


This fault is commonly caused by one of the following:

l The IS-IS neighbor relationship flaps.

l The two devices import the same external routes to IS-IS, and the preferences of the imported external routes are lower than those of IS-IS routes.

l The two devices are configured with the same system ID.


After IS-IS is configured on the network, it is found that IS-IS routes flap.

Figure 6-18 Troubleshooting flowchart when IS-IS routes flap

[Flowchart: IS-IS routes flap. Check the routing table and identify the changed attributes of routes. If the outbound interface or cost of the route changes, ensure that the IS-IS neighbor relationship does not flap. If a specified route appears intermittently in the routing table, ensure that external routes do not flap and that the IS-IS configuration is correct. After each corrective action, check whether the fault is rectified.]





Procedure

Step 1 Check the details about route flapping.

Run the display ip routing-table ip-address verbose command to check the details about route flapping, such as, the routing protocol from which active routes are learned and the changed attributes of routes during route flapping.

l If the value of the TunnelID field changes after route flapping, check whether the MPLS LSP flaps. If the MPLS LSP flaps, see LDP LSP Flapping to rectify the fault.

l If the Cost or Interface field of a route changes, check whether the IS-IS neighbor relationship established between devices on the path flaps. If so, see The IS-IS Neighbor Relationship Flaps.

l If a route appears intermittently in the routing table (the value of the Age field changes), run the display isis lsdb verbose command to identify the LSP that carries the route. Then, run the display isis lsdb lsp-id verbose command to check the updates of the LSP.

– If the LSP always carries the specified route, check whether the IS-IS neighbor relationship established between devices on the path flaps. If so, see The IS-IS Neighbor Relationship Flaps.

– If the value of the Seq Num field of the LSP constantly increases, check whether the two devices are configured with the same system ID.

– If the value of the Seq Num field of the LSP constantly increases and the route appears intermittently before and after the LSP is updated, perform Step 2 on the device that generates the LSP.

NOTE

In the output of the display isis lsdb lsp-id verbose command, the IP-Internal field or the +IP-Internal field indicates the IP address of the device that generates the LSP.

l If the value of the Protocol field of the route changes, go to Step 2.

Step 2 Check the external routes imported by IS-IS.

If specified routes are external routes imported by IS-IS, run the display ip routing-table ip-address verbose command on the device where IS-IS imports the external routes to view details about route flapping.

l If the active routes in the routing table are IS-IS routes rather than the external routes to be imported by IS-IS, it indicates that other IS-IS devices advertise the same routes. In this case, you need to modify the priorities of routing protocols based on network planning, or configure a route filtering policy in the IS-IS view to control the routes to be added to the IP routing table.

l In other cases, go to Step 3.



----End






Relevant Alarms

None.

Relevant Logs

None.


An Upper-layer Device Cannot Learn IS-IS Routes Due to Differences in the Types of Routes Imported by IS-IS on a Huawei Device and a Non-Huawei Device

Fault Symptom


In the networking shown in Figure 6-19, Switch B and Switch C are located at the core layer and are connected to two SR devices, Switch A and Switch D. Switch D is a non-Huawei device. To implement load balancing, Switch A and Switch D are configured on the same network, and direct routes and static routes are imported to IS-IS in the related IS-IS processes. After the configuration, Switch B and Switch C can learn routes only from Switch D.

Figure 6-19 Diagram of the network where devices cannot learn IS-IS routes

[Figure: SwitchB and SwitchC, each connected to SwitchA and SwitchD]

Fault Analysis

By default, the type of static routes imported by IS-IS on Switch D is internal and the cost of the routes equals the original cost of the imported routes, whereas the type of static routes imported by IS-IS on Switch A is external and the cost of the routes equals the sum of the original cost of the imported routes and 64. Because the costs are different, Switch B and Switch C select routes only from Switch D rather than from Switch A.
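The cost difference described above can be worked through numerically. The following is a minimal sketch of the two cost rules stated in this case (narrow cost style); the function name and the original cost of 10 are hypothetical.

```python
EXTERNAL_COST_OFFSET = 64  # Added to imported routes of type external, as stated above.


def imported_route_cost(original_cost: int, cost_type: str) -> int:
    """Return the IS-IS cost of an imported route for the given cost type."""
    if cost_type == "internal":
        return original_cost
    if cost_type == "external":
        return original_cost + EXTERNAL_COST_OFFSET
    raise ValueError(f"unknown cost type: {cost_type}")


# The same static route imported on Switch D (internal) and Switch A (external).
print(imported_route_cost(10, "internal"))
print(imported_route_cost(10, "external"))
```

Because the two results differ, the route through Switch D always has the lower cost, so load balancing between Switch A and Switch D cannot take place until both use the same cost type.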






NOTE

This problem occurs only when the cost style is narrow.

Procedure

Step 1 Run the system-view command on Switch A to enter the system view.

Step 2 Run the isis process-id command to enter the IS-IS view.

Step 3 Run the import-route direct cost-type internal command to configure IS-IS to import direct routes and set cost-type to internal.

Step 4 Run the import-route static cost-type internal command to configure IS-IS to import static routes and set cost-type to internal.

NOTE

After the cost type is changed from external to internal, the cost of the imported routes equals the original cost of the imported routes, rather than the sum of the original cost and 64.

After the preceding operations, run the display isis route command on Switch B and Switch C to view routing information. You can find that there are two IS-IS routes to the same network segment and that load balancing is performed by Switch A and Switch D.

----End

Summary

In the networking with devices of different manufacturers, note the implementation differences between the devices.

6.6 BGP Troubleshooting

6.6.1 The BGP Peer Relationship Fails to Be Established

Common Causes

The BGP peer relationship fails to be established if the BGP peer relationship cannot enter the Established state.

This fault is commonly caused by one of the following:

l BGP packets fail to be forwarded.

l An ACL is configured to filter packets with the destination port TCP port 179.

l The peer router ID conflicts with the local router ID.

l The peer AS number is incorrect.

l Loopback interfaces are used to establish the BGP peer relationship, but the peer connect-interface command is not configured.

l Loopback interfaces are used to establish the EBGP peer relationship, but the peer ebgp-max-hop command is not configured.

l The number of routes sent by the peer exceeds the upper limit that is specified by the peer route-limit command.

l The peer ignore command is configured on the peer.

l The address families of devices on both ends are inconsistent.


The BGP peer relationship fails to be established after the BGP protocol is configured.

Figure 6-20 Troubleshooting flowchart for the failure to establish the BGP peer relationship

[Flowchart: the BGP peer relationship fails to be established. Check whether the ping operation succeeds; if not, check the routes used to establish the BGP peer relationship. Check whether the configurations are correct; if not, correct the configurations. After each corrective action, check whether the fault is rectified.]




Procedure

Step 1 Run the ping command to check whether BGP peers can successfully ping each other.

l If BGP peers can successfully ping each other, there are available routes between BGP peers and link transmission is normal. In this case, go to Step 2.






NOTE

Run the ping -a source-ip-address -s packetsize host command to check the connectivity between the devices on both ends. Because the source address is specified in this command, the result shows whether the two devices have available routes to each other. Specifying the size of the ping packet also shows whether large ping packets can be transmitted normally over the link.

l If the ping operation fails, check whether the routing table of each device contains a route to the other device.

– If there are no routes to the peer, check the associated routing protocol configurations. For details, see the section The Ping Operation Fails.

– If there are routes to the peer, contact Huawei technical support personnel.

Step 2 Check that no ACL is configured to filter the packets with the destination port TCP port 179.

Run the display acl all command on the two devices to check whether an ACL is configured to filter the packets with the destination port TCP port 179.

<Quidway> display acl all
 Total nonempty ACL number is 1
Advanced ACL 3001, 2 rules
Acl's step is 5
 rule 5 deny tcp source-port eq bgp
 rule 10 deny tcp destination-port eq bgp

l If an ACL is configured to filter the packets with the destination port TCP port 179, run the undo rule rule-id destination-port command and the undo rule rule-id source-port command in the Advanced ACL view to delete the configuration.

l If no ACL is configured to filter the packets with the destination port TCP port 179, go to Step 3.
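The ACL check in Step 2 can also be done on saved command output. The following is a minimal sketch, assuming the output has the same layout as the display acl all sample above; the function name find_bgp_deny_rules and the acceptance of "179" as an alternative to the bgp keyword are illustrative assumptions, not part of the product.

```python
import re

# Matches rule lines such as "rule 5 deny tcp source-port eq bgp".
# The layout is assumed from the sample output; "179" is accepted as an
# alternative spelling of the port (an assumption for illustration).
RULE_RE = re.compile(
    r"^\s*rule\s+(\d+)\s+deny\s+tcp\b.*\b(source-port|destination-port)\s+eq\s+(bgp|179)\b"
)
ACL_RE = re.compile(r"^\s*Advanced ACL\s+(\d+)")


def find_bgp_deny_rules(acl_output):
    """Return (acl_number, rule_id, port_keyword) for rules that deny BGP packets."""
    hits = []
    current_acl = None
    for line in acl_output.splitlines():
        header = ACL_RE.match(line)
        if header:
            current_acl = int(header.group(1))
            continue
        rule = RULE_RE.match(line)
        if rule and current_acl is not None:
            hits.append((current_acl, int(rule.group(1)), rule.group(2)))
    return hits


sample = """\
Total nonempty ACL number is 1
Advanced ACL 3001, 2 rules
Acl's step is 5
 rule 5 deny tcp source-port eq bgp
 rule 10 deny tcp destination-port eq bgp
"""

for acl, rule_id, keyword in find_bgp_deny_rules(sample):
    print(f"ACL {acl} rule {rule_id} denies TCP {keyword} bgp")
```

Each reported rule is a candidate for deletion in the Advanced ACL view, as described in Step 2.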

Step 3 Check that the peer router ID does not conflict with the local router ID.

View information about BGP peers to check whether the peer and local router IDs conflict. For example, if the IPv4 unicast peer relationship fails to be established, run the display bgp peer command to check whether the peer router ID conflicts with the local router ID. In the following example command output, the local router ID is 223.5.0.109.

<Quidway> display bgp peer
 BGP local router ID : 223.5.0.109
 Local AS number : 41976
 Total number of peers : 12                 Peers in established state : 4

  Peer            V    AS  MsgRcvd  MsgSent  OutQ  Up/Down       State PrefRcv
  8.9.0.8         4   100     1601     1443     0 23:21:56 Established   10000
  9.10.0.10       4   200     1565     1799     0 23:15:30 Established    9999

l If the peer router ID conflicts with the local router ID, run the router id command in the BGP view to change the two router IDs to different values. Generally, a loopback interface address is used as the local router ID.

l If the peer router ID does not conflict with the local router ID, go to Step 4.

Step 4 Check that the peer AS number is configured correctly.

Run the display bgp peer command on each device to check whether the displayed peer AS number is the same as the remote AS number.

<Quidway> display bgp peer
 BGP local router ID : 223.5.0.109
 Local AS number : 41976
 Total number of peers : 12                 Peers in established state : 4

  Peer            V    AS  MsgRcvd  MsgSent  OutQ  Up/Down       State PrefRcv
  8.9.0.8         4   100     1601     1443     0 23:21:56 Established   10000
  9.10.0.10       4   200     1565     1799     0 23:15:30 Established    9999

l If the peer AS number is incorrectly configured, change it to be the same as the remote AS number.

l If the peer AS number is configured correctly, go to Step 5.
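When many peers are configured, the comparison in Step 4 can be scripted. The following is a minimal sketch, assuming the peer rows have the nine columns shown in the display bgp peer sample above. The function names, the expected_as mapping, and the third sample row (a peer in Active state) are hypothetical and used only for illustration.

```python
def parse_bgp_peers(output):
    """Parse peer rows from text shaped like the sample display bgp peer output."""
    peers = []
    for line in output.splitlines():
        fields = line.split()
        # A peer row is assumed to have 9 columns and to start with an IPv4 address.
        if len(fields) == 9 and fields[0].count(".") == 3 and fields[1].isdigit():
            peers.append({"peer": fields[0], "as": int(fields[2]), "state": fields[7]})
    return peers


def peers_to_check(output, expected_as):
    """Return peers whose AS differs from the expected value or that are not Established."""
    problems = []
    for entry in parse_bgp_peers(output):
        want = expected_as.get(entry["peer"])
        wrong_as = want is not None and want != entry["as"]
        if wrong_as or entry["state"] != "Established":
            problems.append(entry["peer"])
    return problems


sample = """\
 BGP local router ID : 223.5.0.109
 Local AS number : 41976
 Total number of peers : 12    Peers in established state : 4

  Peer            V    AS  MsgRcvd  MsgSent  OutQ  Up/Down       State PrefRcv
  8.9.0.8         4   100     1601     1443     0 23:21:56 Established   10000
  9.10.0.10       4   200     1565     1799     0 23:15:30 Established    9999
  10.1.1.1        4   300        0        0     0 00:00:12      Active       0
"""

# Hypothetical record of the AS numbers actually used by the remote devices.
expected_as = {"8.9.0.8": 100, "9.10.0.10": 300}
print(peers_to_check(sample, expected_as))
```

A peer reported by this sketch still needs to be checked manually, because the displayed AS number is the locally configured value, not the value announced by the remote device.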

Step 5 Check whether BGP configurations affect the establishment of the BGP peer relationship.

Run the display current-configuration configuration bgp command to check BGP configurations.

Item: peer connect-interface interface-type interface-number

Description: If two devices use loopback interfaces to establish the BGP peer relationship, run the peer connect-interface command to specify the associated loopback interface as the source interface that sends BGP packets.

Item: peer ebgp-max-hop hop-count

Description: When two directly connected devices use loopback interfaces to establish the EBGP peer relationship or two indirectly connected devices establish the EBGP peer relationship, run the peer ebgp-max-hop command and specify the maximum number of hops between the two devices.

l When two directly connected devices use loopback interfaces to establish the EBGP peer relationship, the hop count can be any number greater than 1.

l When two indirectly connected devices establish the EBGP peer relationship, specify the number of hops based on the actual situation.

Item: peer route-limit limit

Description: If the peer route-limit limit command is configured, check whether the number of routes sent by the peer exceeds the upper limit that is specified by limit. If the number of routes exceeds the upper limit, reduce the number of routes to be sent by the peer, and run the reset bgp ip-address command to reset the BGP peer relationship and trigger the re-establishment of the BGP peer relationship.

Item: peer ignore

Description: If the peer ignore command is configured on the peer, the peer is temporarily not required to establish the BGP peer relationship with the local device. To establish the BGP peer relationship between the peer and the local device, run the undo peer ignore command on the peer.

Item: Address family capability

Description: Check whether the address family capabilities of devices on both ends match. For example, to establish a BGP VPNv4 peer relationship, the peer enable command must be configured in the BGP-VPNv4 address families of both devices. If the peer enable command is configured on only one device, the BGP peer relationship on the other device is displayed as No neg.
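Two of the items checked in Step 5 can be spotted mechanically in saved configuration text. The following is a minimal sketch, assuming the configuration uses the "bgp as-number" and "peer address keyword ..." line layout shown in this section; the function name check_bgp_peer_config and the sample configuration are hypothetical. It flags only EBGP peers that use a loopback source interface without peer ebgp-max-hop, and peers with peer ignore configured.

```python
def check_bgp_peer_config(config):
    """Flag peer settings that can prevent the BGP peer relationship from coming up."""
    local_as = None
    peers = {}
    for raw in config.splitlines():
        words = raw.split()
        if len(words) == 2 and words[0] == "bgp":
            local_as = words[1]
        elif len(words) >= 3 and words[0] == "peer":
            entry = peers.setdefault(
                words[1], {"as": None, "keywords": set(), "loopback": False}
            )
            entry["keywords"].add(words[2])
            if words[2] == "as-number" and len(words) >= 4:
                entry["as"] = words[3]
            if words[2] == "connect-interface" and len(words) >= 4:
                entry["loopback"] = words[3].lower().startswith("loopback")

    findings = []
    for address, entry in sorted(peers.items()):
        is_ebgp = entry["as"] is not None and entry["as"] != local_as
        if is_ebgp and entry["loopback"] and "ebgp-max-hop" not in entry["keywords"]:
            findings.append(f"{address}: EBGP over loopback without ebgp-max-hop")
        if "ignore" in entry["keywords"]:
            findings.append(f"{address}: peer ignore is configured")
    return findings


# Hypothetical configuration used only to exercise the checks.
sample_config = """\
bgp 100
 peer 2.2.2.2 as-number 200
 peer 2.2.2.2 connect-interface LoopBack0
 peer 3.3.3.3 as-number 100
 peer 3.3.3.3 ignore
"""

for finding in check_bgp_peer_config(sample_config):
    print(finding)
```

The sketch cannot tell whether a peer address is a loopback address on the remote device, so a missing peer connect-interface command still has to be verified against the network plan.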

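Step 6 Contact Huawei technical support personnel and provide them with the following information.

l Results of the preceding troubleshooting procedure.

l Configuration files, log files, and alarm files of the devices.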


----End


Relevant Alarms

BGP_1.3.6.1.2.1.15.7.2 bgpBackwardTransition

Relevant Logs

BGP/3/STATE_CHG_UPDOWN

BGP/3/WRONG_ROUTERID

BGP/3/WRONG_AS

6.6.2 BGP Public Network Traffic Is Interrupted

Common Causes

This troubleshooting case describes how to clear the fault that traffic to be transmitted through BGP public network routes is interrupted when the BGP peer relationship is normal.

This fault is commonly caused by one of the following:

l Routes are inactive because the next hops are unreachable.

l Routes fail to be advertised or received because routing policies are incorrectly configured.

l The received routes are dropped because there is an upper limit on the number of routes on the device.


BGP public network traffic is interrupted after the BGP protocol is configured. Figure 6-21 shows the troubleshooting flowchart.

Figure 6-21 Troubleshooting flowchart for interruption of BGP public network traffic

[Flowchart: 1. Is the next hop of the route reachable? If not, ensure that the next hop is reachable. 2. Is the routing policy configured correctly? If not, correctly configure the routing policy. 3. Does the number of routes exceed the upper limit? If so, reduce the number of routes. After each action, check whether the fault is rectified; if it is, the procedure ends.]


Procedure

Step 1 Verify that the next hops for the routes are reachable.

Run the display bgp routing-table network { mask | mask-length } command on the device that sends routes (that is, the local device) to check whether the target route is active and whether it has been sent to the peer. network specifies the prefix of the target route.

Assume that the target route is a route to 13.0.0.0/8. The following command output shows that this route is valid and has been selected and sent to the peer at 3.3.3.3; the original next hop and iterated next hop of this route are 1.1.1.1 and 172.1.1.1 respectively.






<Quidway> display bgp routing-table 13.0.0.0 8
 BGP local router ID : 23.1.1.2
 Local AS number : 100
 Paths: 1 available, 1 best, 1 select
 BGP routing table entry information of 13.0.0.0/8:
 From: 1.1.1.1 (121.1.1.1)
 Route Duration: 4d21h29m39s
 Relay IP Nexthop: 172.1.1.1
 Relay IP Out-Interface: GigabitEthernet1/0/2
 Original nexthop: 1.1.1.1
 Qos information : 0x0
 AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, active, pre 255
 Aggregator: AS 100, Aggregator ID 121.1.1.1
 Advertised to such 1 peers:
    3.3.3.3

l If the target route is inactive, check whether there is a route to the original next hop in the IP routing table. If there is no route to the original next hop, the BGP route is not advertised because the next hop of the BGP route is unreachable. Then, find out why there is no route to the original next hop (this fault is generally associated with IGP or static routes).

l If the target route is active and has been selected but there is no information indicating that this route has been sent to the peer, go to Step 2 to check the outbound policy applied to the local device.

Run the display bgp routing-table network { mask | mask-length } command on the peer to check whether it has received the target route.

l If the peer has received the target route, perform Step 1 again to check whether the next hop of the route is reachable and whether this route has been selected.

l If the peer has not received the target route, go to Step 2 to check the inbound policy applied to the peer.
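The decision made in Step 1 depends on three facts in the command output: whether the route is active, whether it is selected, and whether it has been advertised. The following is a minimal sketch of that decision, assuming the output has the same layout as the display bgp routing-table sample above; the function names summarize_bgp_route and next_action and the returned text are illustrative assumptions.

```python
import re

IPV4_RE = re.compile(r"\d+\.\d+\.\d+\.\d+")


def summarize_bgp_route(detail):
    """Extract the route flags and the advertised peers from detailed route output."""
    flags = set()
    advertised = []
    in_advertised = False
    for line in detail.splitlines():
        text = line.strip()
        if text.startswith("AS-path"):
            # The attribute line is a comma-separated list that contains the flags.
            flags = {part.strip() for part in text.split(",")}
            in_advertised = False
        elif text.startswith("Advertised to such"):
            in_advertised = True
        elif in_advertised and IPV4_RE.fullmatch(text):
            advertised.append(text)
    return {
        "active": "active" in flags,
        "selected": {"best", "select"} <= flags,
        "advertised_to": advertised,
    }


def next_action(summary):
    """Map the summary to the action described in Step 1."""
    if not summary["active"]:
        return "check the route to the original next hop"
    if summary["selected"] and not summary["advertised_to"]:
        return "go to Step 2 and check the outbound policy"
    return "check whether the peer has received the route"


sample = """\
 Original nexthop: 1.1.1.1
 AS-path Nil, origin incomplete, localpref 100, pref-val 0, valid, internal, best, select, active, pre 255
 Aggregator: AS 100, Aggregator ID 121.1.1.1
 Advertised to such 1 peers:
    3.3.3.3
"""

summary = summarize_bgp_route(sample)
print(summary, next_action(summary))
```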

Step 2 Check that routing policies are configured correctly.

Run the display current-configuration configuration bgp command on the local device and the peer to check whether inbound and outbound policies are configured.

<Quidway> display current-configuration configuration bgp
#
bgp 100
 peer 1.1.1.1 as-number 100
 #
 ipv4-family unicast
  undo synchronization
  filter-policy ip-prefix aaa import
  filter-policy ip-prefix aaa export
  peer 1.1.1.1 enable
  peer 1.1.1.1 filter-policy acl-name acl-name import
  peer 1.1.1.1 filter-policy acl-name acl-name export
  peer 1.1.1.1 as-path-filter 1 import
  peer 1.1.1.1 as-path-filter 1 export
  peer 1.1.1.1 ip-prefix prefix-name import
  peer 1.1.1.1 ip-prefix prefix-name export
  peer 1.1.1.1 route-policy policy-name import
  peer 1.1.1.1 route-policy policy-name export
return

l If inbound and outbound policies are configured on the two devices, check whether the target route is filtered by these policies. For detailed configurations of a routing policy, see the S6700 Series Ethernet Switches Configuration Guide - IP Routing.

l If inbound and outbound policies are not configured on the two devices, go to Step 3.






Step 3 Check that the number of routes is lower than the upper limit.

Run the display current-configuration configuration bgp | include peer destination-address command or the display current-configuration configuration bgp | include peer group-name command on the peer to check whether an upper limit on the number of routes to be received is configured on the peer.

For example, if the upper limit is set to 5, subsequent routes are dropped and a log is recorded after the peer receives five routes from the local device at 1.1.1.1.

<Quidway> display current-configuration configuration bgp | include peer 1.1.1.1
 peer 1.1.1.1 route-limit 5 alert-only
 peer 1.1.1.1 enable

If the peer is added to a peer group, there may be no configurations of the upper limit in the command output.

<Quidway> display current-configuration configuration bgp | include peer 1.1.1.1
 peer 1.1.1.1 group IBGP
 peer 1.1.1.1 enable
 peer 1.1.1.1 group IBGP

In this case, run the display current-configuration configuration bgp | include peer group-name command to check the configuration of this peer group.

<Quidway> display current-configuration configuration bgp | include peer IBGP
 peer IBGP route-limit 5 alert-only
 peer IBGP enable

If the log BGP/3/ROUTPRIX_EXCEED is generated when traffic is interrupted, the target route is dropped because the upper limit is exceeded. In this case, increase the upper limit.

NOTE

Changing the upper limit on the number of routes to be received from a peer interrupts the BGP peer relationship. Therefore, reducing the number of sent routes by configuring route summarization on the local device is recommended.
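The lookup order in Step 3 (first the peer, then the peer group that the peer belongs to) can be expressed as a short routine. The following is a minimal sketch, assuming configuration lines in the "peer name keyword value" layout shown above; the function name effective_route_limit is hypothetical.

```python
def effective_route_limit(config, peer):
    """Return (owner, limit) for a peer, falling back to its peer group, or None."""
    limits = {}
    groups = {}
    for raw in config.splitlines():
        words = raw.split()
        if len(words) >= 4 and words[0] == "peer":
            if words[2] == "route-limit" and words[3].isdigit():
                limits[words[1]] = int(words[3])
            elif words[2] == "group":
                groups[words[1]] = words[3]
    if peer in limits:
        return peer, limits[peer]
    group = groups.get(peer)
    if group in limits:
        return group, limits[group]
    return None


sample_config = """\
 peer IBGP route-limit 5 alert-only
 peer IBGP enable
 peer 1.1.1.1 group IBGP
 peer 1.1.1.1 enable
"""

print(effective_route_limit(sample_config, "1.1.1.1"))
```

With the sample configuration, the limit is found on the peer group IBGP rather than on the peer itself, which is the case described in Step 3.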

Step 4 Contact Huawei technical support personnel and provide them with the following information.

l Results of the preceding troubleshooting procedure.

l Configuration files, log files, and alarm files of the devices.

----End


Relevant Alarms

BGP_1.3.6.1.4.1.2011.5.25.177.1.3.1 hwBgpPeerRouteNumThresholdExceed

Relevant Logs

BGP/3/ROUTPRIX_EXCEED







Traffic Traverses the Egress Device of an AS Because BGP Delivers Default Routes with Different MEDs

Fault Symptom

In Figure 6-22, Switch A and Switch B reside on the backbone network. EBGP peer relationships are established between devices in AS 100 and AS 200. IBGP peer relationships are established between devices inside each AS. After Switch A and Switch B advertise BGP default routes, detailed information about BGP default routes on Switch C shows that outgoing traffic from AS 200 is directed to Switch D. That is, the next hop of BGP default routes is Switch D. Consequently, outgoing traffic from AS 200 traverses Switch C.

Figure 6-22 Outgoing traffic traversing the egress device of an AS

[Figure: Switch A and Switch B in AS 100 each deliver the default route 0.0.0.0, with a MED value, to Switch C and Switch D in AS 200. The figure also marks the default route and the direction of outgoing traffic.]

Fault Analysis

Run the display bgp routing-table 0.0.0.0 command on Switch C to view detailed information about BGP default routes. The command output will show that different MEDs are set for Switch A and Switch B. As a result, outgoing traffic from AS 200 traverses Switch C.

Procedure

Step 1 Run the system-view command on Switch B or Switch A to enter the system view.






Step 2 Run the bgp as-number command to enter the BGP view.

Step 3 Run the ipv4-family unicast command to enter the BGP-IPv4 unicast address family view.

Step 4 Run the default med med command to modify the default MED of the BGP routes, and make sure the MED of BGP routes on Switch A matches the MED of BGP routes on Switch B.

After performing the preceding operations, run the display bgp routing-table 0.0.0.0 command on Switch C to view detailed information about BGP default routes. The command output will show that outgoing traffic from AS 200 is transmitted through Switch C. This indicates that the fault has been cleared.

----End

Summary

When there are multiple egress devices between two ASs, set the same MED for the advertised default routes. When the delivered default routes have the same route attributes, such as the local-preference and MED, BGP prefers the route learned from the EBGP peer, so each egress device forwards outgoing traffic over its own EBGP link.
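The effect of the MED on this case can be illustrated with a simplified route comparison. The following is a minimal sketch, not the full BGP route selection process: it compares only the local-preference, AS_Path length, MED, and peer type, in that order. The MED values and the assumption that Switch C learns one default route from an EBGP peer and one through Switch D over IBGP are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    via: str
    local_pref: int
    as_path_len: int
    med: int
    ebgp: bool


def best_path(paths):
    """Pick a path using a simplified subset of the BGP selection rules.

    Order: higher local-preference, shorter AS_Path, lower MED, EBGP over IBGP.
    """
    return min(paths, key=lambda p: (-p.local_pref, p.as_path_len, p.med, not p.ebgp))


# Different MEDs (hypothetical values): the lower MED wins before the peer type is compared.
different_med = [
    Path("EBGP peer", local_pref=100, as_path_len=1, med=88, ebgp=True),
    Path("Switch D (IBGP)", local_pref=100, as_path_len=1, med=50, ebgp=False),
]

# Same MED: the comparison falls through to the peer type, and the EBGP route wins.
same_med = [
    Path("EBGP peer", local_pref=100, as_path_len=1, med=88, ebgp=True),
    Path("Switch D (IBGP)", local_pref=100, as_path_len=1, med=88, ebgp=False),
]

print(best_path(different_med).via)
print(best_path(same_med).via)
```

With different MEDs, Switch C selects the route through Switch D; with the same MED, it selects the route from its own EBGP peer, which is the behavior the procedure restores.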

BGP Peer Relationship Goes Down Because of Route Iteration

Route iteration is enabled by default. On a network, route iteration may cause exceptions, for example, the BGP peer relationship goes Down.

Fault Symptom

As shown in Figure 6-23, there are two links between SwitchA and SwitchB. SwitchA and SwitchB establish the BGP peer relationship by using loopback interfaces. After VLANIF 10 on SwitchA goes Down, the BGP peer relationship between SwitchA and SwitchB goes Down and remains in OpenSent state. SwitchA, however, can successfully ping the IP address of the loopback interface on SwitchB.

Figure 6-23 Route iteration causing the BGP peer relationship to go Down

[Figure: SwitchA and SwitchB are connected through two links.

SwitchA: Loopback0 20.0.0.1, VLANIF10 192.168.0.2/30, VLANIF20 192.168.1.2/30

SwitchB: Loopback0 10.0.0.1, VLANIF10 192.168.0.1/30, VLANIF20 192.168.1.1/30]

Fault Analysis

1. After VLANIF 10 on SwitchA goes Down, run the display ip routing-table ip-address command on SwitchA to check the IP routing table. The command output shows that there are two equal-cost routes whose next-hop addresses are 10.0.0.1 and whose outbound interfaces are VLANIF 20 and Null0. Before VLANIF 10 on SwitchA goes Down, the outbound interfaces of the two routes whose next-hop addresses are 10.0.0.1 are VLANIF 20 and VLANIF 10.

Run the display bgp peer command on SwitchA to check the BGP peer relationship, and you can see that the BGP peer at 10.0.0.1 is in OpenSent state.

2. Route iteration may cause outbound interfaces of equal-cost routes to change. If route iteration does not occur, after VLANIF 10 on SwitchA goes Down, only one of the two equal-cost routes exists, that is, the route with the outbound interface VLANIF 20.

3. Check the configuration of SwitchA and analyze why the outbound interface is iterated to Null0. The configuration shows that static routes with a 32-bit mask to the address (10.0.0.1) of the loopback interface on SwitchB are configured on SwitchA.

ip route-static 10.0.0.1 255.255.255.255 192.168.1.1
ip route-static 10.0.0.1 255.255.255.255 192.168.0.1

After VLANIF 10 on SwitchA goes Down, the preceding static route configuration causes SwitchA to iterate routes. Check whether there are routes to 192.168.0.1 in the routing table. By checking the configuration file, you can see the following static route configuration:

ip route-static 192.168.0.0 255.255.255.0 NULL0 preference 255

Because the next hop 192.168.0.1 now matches this route, the outbound interface of one of the two equal-cost routes becomes Null0.

4. Analyze why the BGP peer relationship goes Down after one outbound interface becomes Null0. After VLANIF 10 goes Down, the two upstream routes of SwitchA are as follows:

Destination/Mask    Proto  Pre  Cost  NextHop    Interface
10.0.0.1/32         BGP    100  0     10.0.0.1   Vlanif20
                    BGP    100  0     10.0.0.1   NULL0

In this case, SwitchA can successfully ping the IP address 10.0.0.1 of the loopback interface on SwitchB. In normal situations, the BGP peer relationship remains Up. Because there are two links between SwitchA and SwitchB, hash calculation is triggered when packets are exchanged between the two devices. If you run the ping command without specifying the source address, the outbound interface calculated by the hash algorithm is VLANIF 20, in which case the ping operation succeeds. If you run the ping command and specify loopback interface address 20.0.0.1 as the source address on SwitchA, the outbound interface calculated by the hash algorithm is VLANIF 10, in which case the ping operation fails.

Loopback interface addresses are used to establish the BGP peer relationship between SwitchA and SwitchB. VLANIF 10 is now iterated to the outbound interface Null0. Therefore, the BGP peer relationship between SwitchA and SwitchB goes Down.

To rectify the fault, disable route iteration on SwitchA.
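The iteration described in the fault analysis can be reproduced with a longest-prefix-match lookup. The following is a minimal sketch, assuming a simplified routing table that contains only the direct routes of the two VLANIF interfaces and the Null0 static route from the configuration above; the function names and the table layout are illustrative assumptions, not the device's implementation.

```python
import ipaddress


def longest_match(table, address):
    """Return the route with the longest prefix that contains the address, or None."""
    addr = ipaddress.ip_address(address)
    candidates = [r for r in table if addr in ipaddress.ip_network(r["prefix"])]
    if not candidates:
        return None
    return max(candidates, key=lambda r: ipaddress.ip_network(r["prefix"]).prefixlen)


def iterated_interface(table, next_hop):
    """Resolve a static route's next hop to an outbound interface (one level of iteration)."""
    route = longest_match(table, next_hop)
    return route["interface"] if route else None


def routing_table(vlanif10_up):
    """Build the simplified table for SwitchA, with or without the VLANIF 10 direct route."""
    table = [
        {"prefix": "192.168.1.0/30", "interface": "Vlanif20"},  # direct route
        {"prefix": "192.168.0.0/24", "interface": "NULL0"},     # configured static route
    ]
    if vlanif10_up:
        table.append({"prefix": "192.168.0.0/30", "interface": "Vlanif10"})  # direct route
    return table


for state in (True, False):
    table = routing_table(vlanif10_up=state)
    print(state, iterated_interface(table, "192.168.0.1"), iterated_interface(table, "192.168.1.1"))
```

While VLANIF 10 is Up, the next hop 192.168.0.1 matches the more specific direct route. After VLANIF 10 goes Down, the direct route disappears and the next hop matches the 24-bit static route, so the outbound interface becomes Null0.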

Procedure

Step 1 Run the system-view command on SwitchA to enter the system view.

Step 2 Run the undo ip route-static 10.0.0.1 255.255.255.255 192.168.1.1 and undo ip route-static 10.0.0.1 255.255.255.255 192.168.0.1 commands to delete the original static route configurations.

Step 3 Run the ip route-static 10.0.0.1 255.255.255.255 vlanif20 192.168.1.1 and ip route-static 10.0.0.1 255.255.255.255 vlanif10 192.168.0.1 commands to configure static routes and specify next hops and outbound interfaces for them.

With this configuration, after VLANIF 10 on SwitchA goes Down, the static route with the outbound interface VLANIF 10 becomes unreachable and is deleted from the routing table. Then, all packets destined for SwitchB are sent through VLANIF 20 only.

Step 4 Run the display bgp peer command, and you can see that the BGP peer at 10.0.0.1 is in Established state. The BGP peer relationship is established correctly. The fault is rectified.

----End






Summary

Route iteration is enabled by default. Ensure that route iteration will not cause exceptions on a network.

Static Routes Do Not Take Effect Because of the Relay Depth

Fault Symptom

As shown in Figure 6-24, Switch A and Switch B are connected through two XGE links and establish the EBGP peer relationship. The following two static routes are configured on Switch A:

ip route-static 2.2.2.2 255.255.255.255 Vlanif10 10.1.1.2
ip route-static 2.2.2.2 255.255.255.255 10.1.2.2

The routing table shows that routes to Switch B have only one outbound interface, VLANIF 10.

Figure 6-24 Static routes failing to take effect

[Figure: SwitchA and SwitchB are connected through two links.

SwitchA: Loopback0 1.1.1.1/32, VLANIF10 10.1.1.1/24, VLANIF20 10.1.2.1/24

SwitchB: Loopback0 2.2.2.2/32, VLANIF10 10.1.1.2/24, VLANIF20 10.1.2.2/24]

Fault Analysis

Because the static route configured through the ip route-static 2.2.2.2 255.255.255.255 vlanif10 10.1.1.2 command is specified with an outbound interface, route relay is not required and the relay depth is 0. Because no outbound interface is specified for the other static route configured through the ip route-static 2.2.2.2 255.255.255.255 10.1.2.2 command, route relay needs to be performed one time and the relay depth is 1.

BGP selects the static route with the smallest relay depth. Therefore, BGP selects the static route with the relay depth 0, and the outbound interface of the static route configured through the ip route-static 2.2.2.2 255.255.255.255 10.1.2.2 command becomes VLANIF 10.
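The selection rule in the fault analysis can be written as a small filter. The following is a minimal sketch, assuming only the rule stated above (relay depth 0 when an outbound interface is specified, 1 otherwise); the function name select_by_relay_depth and the dictionary layout are hypothetical.

```python
def relay_depth(route):
    """Relay depth as described above: 0 with an outbound interface, otherwise 1."""
    return 0 if route["out_interface"] else 1


def select_by_relay_depth(routes):
    """Keep only the routes that have the smallest relay depth."""
    smallest = min(relay_depth(r) for r in routes)
    return [r for r in routes if relay_depth(r) == smallest]


# Configuration before the fix: only one route specifies an outbound interface.
before = [
    {"next_hop": "10.1.1.2", "out_interface": "Vlanif10"},
    {"next_hop": "10.1.2.2", "out_interface": None},
]

# Configuration after the fix: both routes specify an outbound interface.
after = [
    {"next_hop": "10.1.1.2", "out_interface": "Vlanif10"},
    {"next_hop": "10.1.2.2", "out_interface": "Vlanif20"},
]

print([r["next_hop"] for r in select_by_relay_depth(before)])
print([r["next_hop"] for r in select_by_relay_depth(after)])
```

Before the fix only the route through 10.1.1.2 is kept; after the fix both routes have relay depth 0 and both are kept.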

Procedure

Step 1 Run the system-view command on Switch A to enter the system view.

Step 2 Run the undo ip route-static 2.2.2.2 255.255.255.255 10.1.2.2 command to delete the static route.

Step 3 Run the ip route-static 2.2.2.2 255.255.255.255 vlanif20 10.1.2.2 command to configure a static route with an outbound interface.



After the preceding operations, both static routes are selected when BGP selects static routes with the smallest relay depth. Therefore, you can find two outbound interfaces, VLANIF 10 and VLANIF 20, when checking the routing table of Switch A.

----End

Summary

When configuring static routes, specify outbound interfaces for them. In this way, route relay is avoided.

6.7 RIP Troubleshooting

6.7.1 Device Does not Receive Partial or All the Routes

Common Causes

This fault is commonly caused by one of the following:

l The incoming interface is not enabled with RIP.

l The incoming interface is not in Up state.

l The version number sent by the peer does not match the version that the local interface can receive.

l The interface is disabled from receiving RIP packets.

l A policy used to filter the received RIP routes is configured.

l The metric of the received routes is 16 or larger.

l Other protocols have learned the same routes in the routing table.

l The number of the received routes exceeds the upper limit.

l The MTU value of the incoming interface is less than 532.

l The authentication configurations of the sending and receiving interfaces do not match.

If a switch receives only some routes or no routes, or the display ip routing-table command does not display routes learned by RIP, refer to the troubleshooting flowchart shown in Figure 6-25.



Figure 6-25 RIP route receiving troubleshooting flowchart

[Flowchart: The device does not receive some or all routes. The following checks are made in sequence: 1. Is RIP enabled on the ingress? If not, enable it. 2. Is the ingress normal? If not, ensure the normal state of the ingress. 3. Are the version numbers the same? If not, ensure the same version number on the sending and receiving interfaces. 4. Is undo rip input configured? If so, cancel the undo rip input command. 5. Is a filtering policy configured? If so, ensure that the policy does not filter out received packets. 6. Is rip metricin configured, and is the metric larger than 16? If so, reduce the value of rip metricin. 7. Are there other better routes? After each action, check whether the fault is rectified; if it is, the procedure ends.]

Procedure

Step 1 Check that the incoming interface is enabled with RIP.

The network command is used to specify the interface network segment. Only the interface enabled with RIP can receive and send the RIP routing information.

Run the display current-configuration configuration rip command to check information about the network segment where RIP is enabled. Check whether RIP is enabled on the incoming interface.

The network address enabled by the network command must be that of the natural network segment.

Step 2 Check that the incoming interface works normally.

Run the display interface command to check the operating status of the incoming interface:

l If the current physical status of the interface is Down or Administratively Down, RIP cannot receive any route from the interface.

l If the current protocol status of the interface is Down, the cost of routes learned by RIP from the interface changes to 16, and the routes are then deleted.

Therefore, ensure the normal status of the interface.

Step 3 Check that the version number sent by the peer matches the version that the local interface can receive.

By default, the interface sends only RIP-1 packets, but can receive both RIP-1 and RIP-2 packets.

If the version number of the incoming interface and that of the RIP packet are different, RIP routing information may not be received correctly.

Step 4 Check whether the undo rip input command is configured on the incoming interface.

The rip input command enables a specified interface to receive RIP packets.

The undo rip input command disables a specified interface from receiving RIP packets.

If the undo rip input command is configured on the incoming interface, RIP packets received on the interface are not processed. Therefore, the routing information cannot be received.

Step 5 Check whether a policy used to filter received RIP routes is configured.

The filter-policy import command is used to filter the received RIP routes. If an ACL is used, run the display current-configuration configuration acl-basic command to check whether the RIP routes learned from the neighbor are filtered. If an IP-Prefix list is used to filter routes, run the display ip ip-prefix command to check the configured policy.

If a routing policy is set to filter routes, it must be configured correctly.

Step 6 Check whether the rip metricin command is configured on the incoming interface and whether the resulting metric reaches 16.

The rip metricin command is used to set the metric that is added to a route when the interface receives a RIP packet.

If the resulting metric reaches or exceeds 16, the route is regarded as unreachable and is not added to the routing table.

Step 7 Check whether the metric of the received routes is 16 or larger.

If the metric of a received route is 16 or larger, the route is regarded as unreachable and is not added to the routing table.
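The metric checks in Step 6 and Step 7 follow from the RIP rule that a metric of 16 means unreachable. The following is a minimal sketch of that arithmetic, assuming that the value set by rip metricin is simply added to the metric carried in the received packet and that the result is capped at 16; the function names are hypothetical.

```python
RIP_INFINITY = 16  # In RIP, a metric of 16 means that the destination is unreachable.


def received_metric(advertised_metric, metricin=0):
    """Metric after the incoming interface adds its additional metric, capped at 16."""
    return min(advertised_metric + metricin, RIP_INFINITY)


def is_reachable(advertised_metric, metricin=0):
    """A route can be added to the routing table only if its metric stays below 16."""
    return received_metric(advertised_metric, metricin) < RIP_INFINITY


# A route advertised with metric 14 survives an additional metric of 1 but not of 3.
print(received_metric(14, 1), is_reachable(14, 1))
print(received_metric(14, 3), is_reachable(14, 3))
```

If a route is dropped for this reason, reducing the value set by rip metricin on the incoming interface brings the resulting metric back below 16.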

Step 8 Check whether the authentication configurations on the sending and receiving interfaces match.

Run the display rip process-id statistics interface interface-type interface-number command to check whether packet authentication has failed on the interface.

If packet authentication has failed on the interface, correct the authentication configuration.

Step 9 Check whether other protocols have learned the same routes in the routing table.

Run the display rip process-id route command to check whether routes have been received from the neighbor.

The possible cause is that the RIP route is received correctly but the local device learns the same route from other protocols such as OSPF and IS-IS.

OSPF and IS-IS routes generally have a higher priority (a smaller preference value) than RIP routes. Therefore, routes learned through OSPF or IS-IS are preferred by routing management.

Run the display ip routing-table protocol rip verbose command to view routes in the Inactive state.

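Step 10 If the fault persists, contact Huawei technical support personnel and provide them with the following information.

l Results of the preceding troubleshooting procedure.

l Configuration files, log files, and alarm files of the devices.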


----End


Relevant Alarms

None.

Relevant Logs

None.

6.7.2 Device Does not Send Some or All Routes

Common Causes


This fault is commonly caused by one of the following:

l The outgoing interface is not enabled with RIP.

l The outgoing interface is not in the Up state.

l The silent-interface command is configured on the outgoing interface so that the interface is suppressed from sending RIP packets.

l The undo rip output command is configured on the outgoing interface so that the interface is disabled from sending RIP packets.

l RIP split horizon is enabled on the outgoing interface.

l The policy for filtering imported RIP routes is configured in RIP.

l The physical status of the interface is Down or Administratively Down, or the current status of the protocol on the outgoing interface is Down. The IP address of the interface cannot be added to the advertised routing table for RIP.

l The outgoing interface does not support the multicast or broadcast mode, but packets must be sent to a multicast or broadcast address.

l The MTU value of the outgoing interface is less than 52.

If a switch sends only some routes or no routes, refer to the troubleshooting flowchart shown in Figure 6-26.



Figure 6-26 RIP route sending troubleshooting flowchart

[Flowchart: The device does not send some or all routes. The following checks are made in sequence: 1. Is RIP enabled on the egress? If not, enable it. 2. Is the egress normal? If not, ensure the normal state of the egress. 3. Is silent-interface configured? If so, cancel the silent-interface command. 4. Is undo rip output configured? If so, cancel the undo rip output command. 5. Is split horizon configured? 6. Is a filtering policy configured? If so, ensure that the policy does not filter out routes imported by RIP. 7. Is the local interface normal? If packets are sent to the local interface, ensure the normal state of the local interface. 8. Are there any other problems? Check that multicast is enabled on the interface and that the peer command is configured correctly. After each action, check whether the fault is rectified; if it is, the procedure ends.]

Procedure

Step 1 Check whether the outgoing interface is enabled with RIP.

The network command is used to specify an interface network segment. Only an interface enabled with RIP can receive and send RIP routes.

Run the display current-configuration configuration rip command to check information about a network segment where RIP is enabled. Check whether the outgoing interface is enabled.

The network address enabled by using the network command must be that of the natural network segment.

Step 2 Check whether the outgoing interface works normally.

Run the display interface command to check the operating status of the outgoing interface.

If the physical status of the interface is Down or Administratively Down, or the status of the current protocol is Down, RIP cannot work properly on the interface.

Ensure that the interface is normal.

Step 3 Check whether the silent-interface command is configured on the outgoing interface.

The silent-interface command is used to suppress the interface from sending RIP packets.

The display current-configuration configuration rip command is used to check whether the interface is suppressed from sending RIP packets.

If the silent-interface command is configured, disable suppression on the interface.

Step 4 Check whether the undo rip output command is configured on the outgoing interface.

Run the display current-configuration command on the outgoing interface to check whether the undo rip output command is configured.

The rip output command enables the interface to send RIP packets.

The undo rip output command disables the interface from sending RIP packets.

If the undo rip output command is configured on the outgoing interface, the RIP packet cannot be sent on the interface.

Step 5 Check whether the rip split-horizon command is configured on the outgoing interface.

Run the display current-configuration command on the outgoing interface to view whether the rip split-horizon command is configured. If the command is configured, split-horizon is enabled on the outgoing interface.

By default, split-horizon is enabled on all outgoing interfaces, and the output of the command does not contain configuration items about split-horizon.

For an outgoing interface (such as X.25 or FR) on a Non-Broadcast Multiple Access (NBMA) network, if the command output does not contain a configuration item about split-horizon, split-horizon is not enabled on the outgoing interface.

Split-horizon means that the route learned from an interface is not advertised on the interface.

Split-horizon is used to prevent a loop between adjacent neighbors from forming.
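The split-horizon rule described in this step can be sketched in a few lines of Python. This is only an illustration of the rule, not the switch's implementation; the RipRoute type, its fields, and the interface names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RipRoute:
    prefix: str        # destination network, for example "10.1.0.0/16"
    metric: int        # RIP hop count
    learned_from: str  # interface on which the route was learned ("" if local)


def routes_to_advertise(routes, out_interface, split_horizon=True):
    """Return the routes that may be advertised on out_interface.

    With split-horizon enabled, a route learned from an interface is
    not advertised back out of that same interface.
    """
    if not split_horizon:
        return list(routes)
    return [r for r in routes if r.learned_from != out_interface]
```

With split-horizon enabled, a route learned on Vlanif10 is absent from the update sent on Vlanif10, which is why a missing route in an update is not always a fault.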

Step 6 Check whether a policy that filters imported RIP routes is configured in RIP.

The filter-policy export command configures a filtering policy that applies globally to the routes advertised by RIP.

Only routes that pass the filtering policy can be added to the advertised routing table of RIP.

These routes are advertised through the updated packet.

Step 7 Check the status of the interface when the route is sent to the local interface address.

Run the display interface command to check the operating status of the interface.

If the physical status of the interface is Down or Administratively Down, or the current status of the protocol on the outgoing interface is Down, the IP address of the interface cannot be added to the advertised routing table of RIP. Therefore, the routing information is not sent to the neighbor.

Step 8 Check whether there are other problems.

If the outgoing interface does not support multicast or broadcast mode and a packet needs to be sent to a multicast or broadcast address, this fault will occur.

This potential cause of the fault can be removed by configuring the peer command in the RIP view so that the switch sends packets to unicast addresses.

Step 9 If the fault persists, contact Huawei technical support personnel and provide them with the following information.


----End


Relevant Alarms

None.

Relevant Logs

None.

6.8 MCE Troubleshooting

This chapter describes common causes of Multi-VPN-Instance CE (MCE) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

6.8.1 Users on a VPN Cannot Communicate with Each Other

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when users on a VPN cannot communicate with each other.

Common Causes

As shown in Figure 6-27, an MCE network is composed of Customer Edges (CEs)/Multi-VPN-Instance CEs (MCEs), Provider Edges (PEs), Provider (P) devices, and sites:



l PE: an edge device on a Service Provider (SP) network. A PE is directly connected to a CE. On a Multiprotocol Label Switching (MPLS) network, PEs process all VPN services.

l P: a backbone device on an SP network. A P device is not directly connected to a CE. The P device needs to provide only basic MPLS forwarding capabilities, without maintaining the VPN information.

l CE: an edge device on a customer network, providing interfaces that are directly connected to the SP network. A CE can be a router, a switch, or a host. Usually, CEs cannot detect VPNs and do not need to support MPLS.

l MCE: an edge device on a customer network. An MCE binds a VLANIF interface to a VPN, and creates and maintains an independent Virtual Routing and Forwarding (VRF) table for each VPN. It implements communication between the same departments in different areas and isolates users in different VPNs.

The MCE technology ensures data security between different VPN users and saves network construction costs. The S6700 functions as an MCE.
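The per-VPN isolation described above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the S6700 data plane: the Mce class, the VPN instance names, and the addresses are all hypothetical.

```python
import ipaddress


class Mce:
    """Toy model of an MCE: one routing table (VRF) per VPN instance."""

    def __init__(self):
        self.vlanif_to_vpn = {}  # VLANIF name -> VPN instance name
        self.vrf = {}            # VPN instance name -> {network: next hop}

    def bind(self, vlanif, vpn):
        """Bind a VLANIF interface to a VPN instance."""
        self.vlanif_to_vpn[vlanif] = vpn
        self.vrf.setdefault(vpn, {})

    def add_route(self, vpn, prefix, next_hop):
        self.vrf.setdefault(vpn, {})[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, in_vlanif, destination):
        """Look up destination only in the VRF bound to the incoming VLANIF."""
        table = self.vrf[self.vlanif_to_vpn[in_vlanif]]
        addr = ipaddress.ip_address(destination)
        matches = [net for net in table if addr in net]
        if not matches:
            return None
        # Longest-prefix match inside this VRF only.
        return table[max(matches, key=lambda net: net.prefixlen)]
```

Because each lookup is confined to the VRF of the incoming VLANIF interface, the same prefix can exist in two VPNs with different next hops, and a route missing from one VRF is not found in another.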

Figure 6-27 MCE network diagram (figure omitted; it shows VPN 1 and VPN 2 sites attached through CEs and an MCE to PEs on the service provider's backbone, which contains the P devices)

Users on VPN 1 cannot communicate with each other. This fault is commonly caused by one of the following:

l The route between the CE and the MCE is unreachable.

l The route between the CE or MCE and the host is unreachable.


Figure 6-28 (troubleshooting flowchart, figure omitted): starting from "VPN users cannot communicate", the flowchart checks whether the MCE/CE has a route to remote VPN users, whether the PE has a route to VPN users on the MCE/CE, and whether the local PE advertises the VPN user route to the peer PE.

NOTE


l Ensure that the PE can communicate with the CE and MCE at the network layer. If the PE cannot communicate with the CE or MCE at the network layer, rectify the fault according to 6.2 Ping Troubleshooting.

l Ensure that VPN users can communicate with the corresponding MCE/CE at the network layer. If the VPN users cannot communicate with the MCE/CE at the network layer, rectify the fault according

to


.

l Ensure that PEs can communicate at the network layer. If PEs cannot communicate at the network layer, rectify the fault according to the troubleshooting documentation of the PE.

Procedure

Step 1 Check whether the MCE/CE has a route to remote VPN users.

Run the display ip routing-table ip-address command on the MCE/CE to check whether the MCE/CE has a route to remote VPN users. ip-address specifies the network segment address of the VPN user on the PE.

l If no information is displayed, there is no route to the VPN user on the PE. Go to step 2.

l If information is displayed but the next hop address is not the PE address directly connected to the MCE/CE, check whether VPN users at both ends are located in the same network segment. If yes, re-plan addresses to ensure that VPN users at both ends are located in different network segments.

l If the next hop address is the PE address directly connected to the MCE/CE, there is a route to remote VPN users. Go to step 4.
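The three outcomes of Step 1 can be written down as a small decision helper. The sketch below assumes that the next hop has already been read from the command output by hand; the function names and return strings are hypothetical and only mirror the wording of this step.

```python
import ipaddress


def segments_overlap(local_segment, remote_segment):
    """True if the two VPN user network segments share addresses."""
    local = ipaddress.ip_network(local_segment)
    remote = ipaddress.ip_network(remote_segment)
    return local.overlaps(remote)


def step1_result(next_hop, connected_pe_address):
    """Map the route lookup result on the MCE/CE to the next action.

    next_hop is None when the command displays no information.
    """
    if next_hop is None:
        return "go to step 2"
    if next_hop != connected_pe_address:
        return "check whether VPN users at both ends share a network segment"
    return "go to step 4"
```

For example, step1_result(None, "10.0.0.1") returns "go to step 2", and segments_overlap("192.168.1.0/24", "192.168.1.128/25") returns True, which indicates that the addresses must be re-planned.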

Step 2 Check whether the PE has a route to VPN users on the MCE/CE directly connected to the PE.

Check whether the PE has a route to VPN users on the MCE/CE directly connected to the PE according to the documentation of the PE.

l If the PE does not have a route to VPN users on the MCE/CE, perform the following operations:

– If a static route is used between the MCE/CE and the PE, configure a static route to VPN users on the MCE/CE according to the relevant documentation of the PE.

– If a dynamic routing protocol such as the Routing Information Protocol (RIP), Open Shortest Path First (OSPF), Border Gateway Protocol (BGP), or Intermediate System-to-Intermediate System (IS-IS) is used between the MCE/CE and the PE, the MCE/CE is not advertising the VPN route to the PE. Configure route advertisement on the MCE/CE according to "Configuring a Route Multi-Instance Between an MCE and a PE" in Configuration Guide - IP Routing.

l If the PE has a route to VPN users on the MCE/CE, go to step 3.

Step 3 Check whether the local PE advertises the VPN user route to the peer PE.

Check the PE configuration. Ensure that the local PE advertises the VPN user route to the peer PE. If the local PE does not advertise the VPN user route to the peer PE, configure route advertisement according to the documentation of the PE. If the fault persists, go to step 4.



----End


Relevant Alarms

None.

Relevant Logs

None.



7 Multicast


7.1 Layer 2 Multicast Troubleshooting

This chapter describes common causes of Layer 2 multicast faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

7.2 Layer 3 Multicast Troubleshooting







7.1 Layer 2 Multicast Troubleshooting


7.1.1 Users in User VLANs Fail to Receive Multicast Packets (IGMP snooping)

Common Causes

This fault is commonly caused by one of the following:

l The upstream link or downstream link of the S6700 is broken because the hardware (such as a board, fiber, or network cable) is faulty.

l The Layer 2 multicast configuration of the system or the user VLAN is incorrect, for example, IGMP snooping is not enabled.

l The Layer 2 multicast configurations on the S6700 conflict. Examples include disabled dynamic learning of multicast entries on an interface, a multicast group policy, fast leave on interfaces, igmp-snooping require-router-alert, and Layer 2 multicast filtering.

l The number of Layer 2 multicast entries on the S6700 reaches the limit.


After the Layer 2 multicast is configured, multicast traffic cannot be transmitted to users.

The troubleshooting roadmap is as follows:

l Check whether link failure occurs.

l Check whether the configurations are incorrect or conflict.

Figure 7-1 Troubleshooting flowchart for Layer 2 multicast (figure omitted). Starting from "Layer 2 multicast traffic is interrupted", the flowchart checks whether the link status is normal, whether IGMP snooping is enabled in the VLAN (if not, enable IGMP snooping in the VLAN), whether the multicast VLAN is correctly configured (if not, configure the multicast VLAN correctly), and whether multicast configurations conflict.


Procedure

Step 1 Check that the upstream and downstream links are Up.

Run the display interface brief command in the system view to check whether the Layer 2 multicast service interfaces are in Up state.

l If an interface is in the *down state (Administratively Down), run the undo shutdown command in the interface view to enable the interface.

l If the physical status of the interfaces is Down, check the upstream and downstream physical links.

l If the physical status and protocol status of the interfaces are Up, go to step 2.

Step 2 Check whether IGMP snooping is enabled globally and in a VLAN.






If the output information contains igmp-snooping enable, global IGMP snooping is enabled.

Run the display igmp-snooping configuration command to check the IGMP snooping configuration of the VLAN.

l If the command output does not contain igmp-snooping enable, run the igmp-snooping enable command in the system view and VLAN view to enable IGMP snooping.

l If IGMP snooping has been enabled globally and in the VLAN, go to step 3.

Step 3 Check that the multicast VLAN is properly configured.

Run the display multicast-vlan vlan vlan-id command to check whether the user VLAN is bound to a correct multicast VLAN.

l If the user VLAN is not bound to a correct multicast VLAN, run the multicast-vlan user-vlan command to bind the VLANs correctly. Also check whether IGMP snooping is enabled in the multicast VLAN.

l If the user VLAN has been bound to a correct multicast VLAN, go to step 4.

The following information shows that user VLANs 100 and 200 are bound to multicast VLAN 10.

[Quidway] display multicast-vlan vlan 10
Multicast-vlan        : 10
User-vlan Number      : 2
IGMP snooping state   : Enable
MLD snooping state    : Disable
User-vlan             Snooping-state
-----------------------------------------------
100                   IGMP Enable /MLD Disable
200                   IGMP Enable /MLD Disable
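When many user VLANs are bound, the binding table can be checked with a short script. The following Python sketch parses text laid out like the sample output in this step. The exact column layout of real command output is an assumption, and parse_user_vlans is a hypothetical helper, not a Huawei tool.

```python
import re

# Sample text modeled on the output shown in this step (layout assumed).
SAMPLE = """\
Multicast-vlan        : 10
User-vlan Number      : 2
IGMP snooping state   : Enable
MLD snooping state    : Disable
User-vlan             Snooping-state
-----------------------------------------------
100                   IGMP Enable /MLD Disable
200                   IGMP Enable /MLD Disable
"""

# A user VLAN row: VLAN ID, then "IGMP <state> /MLD <state>".
ROW = re.compile(r"^\s*(\d+)\s+IGMP\s+(\w+)\s*/\s*MLD\s+(\w+)")


def parse_user_vlans(output):
    """Return {user_vlan_id: {"igmp": bool, "mld": bool}} from the output text."""
    result = {}
    for line in output.splitlines():
        match = ROW.match(line)
        if match:
            vlan_id, igmp, mld = match.groups()
            result[int(vlan_id)] = {"igmp": igmp == "Enable", "mld": mld == "Enable"}
    return result
```

A user VLAN that is missing from the returned dictionary is not bound to the multicast VLAN, and an entry whose "igmp" value is False points to IGMP snooping being disabled for that user VLAN.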

Step 4 Check whether the multicast configurations conflict.

Check whether the following configurations exist on the device:

l Disabling dynamic learning of multicast entries on user interfaces or VLANs

If the dynamic learning of router interfaces is disabled in a VLAN, the VLAN does not listen to IGMP Query messages. As a result, no multicast router port is generated. Run the igmp-snooping router-learning command in the VLAN view to enable the dynamic learning of router interfaces in the VLAN.

l Fast leave of member interfaces

The fast leave function can be enabled in a VLAN only when each interface in the VLAN is connected to one host. If fast leave is enabled and a member interface is connected to multiple hosts, the S6700 immediately deletes the forwarding entry of the member interface after receiving an IGMP Leave message from the member interface, without sending a Group-Specific Query message. The other hosts on that interface then stop receiving multicast traffic.

Run the undo igmp-snooping prompt-leave command in the VLAN view to disable the fast leave of member interfaces.

l igmp-snooping require-router-alert

If igmp-snooping require-router-alert has been configured, the S6700 checks the Options field in IGMP packets, and discards the packets that do not contain the Options field.






Run the undo igmp-snooping require-router-alert command in the VLAN view to cancel the igmp-snooping require-router-alert configuration. The S6700 will not check the Options field in IGMP packets.

l Multicast group policy

The multicast group policy limits the multicast groups that the hosts on a VLAN can join.

You can run the display igmp-snooping configuration command in the VLAN view to verify the configuration of the multicast group policy. If an ACL is configured, verify the ACL configuration.

l Layer 2 multicast filtering on interface

If this function is configured on the interface, the interface discards the UDP packets from certain VLANs.

Run the undo multicast-source-deny vlan command in the interface view to disable the Layer 2 multicast filtering function.
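The fast-leave conflict listed in this step can be made concrete with a toy model. The Python sketch below is a simplification under stated assumptions: hosts are plain strings, and without fast leave the remaining hosts are assumed to answer the Group-Specific Query so that the entry is kept. The function name is hypothetical.

```python
def hosts_still_served(hosts_on_port, leaving_host, prompt_leave):
    """Return the hosts on one member interface that still receive the group
    after leaving_host sends an IGMP Leave message.

    prompt_leave=True models fast leave: the forwarding entry of the
    interface is deleted at once, without a Group-Specific Query.
    """
    remaining = set(hosts_on_port) - {leaving_host}
    if prompt_leave:
        # Entry removed immediately: every remaining host loses the traffic.
        return set()
    # Assumption: remaining hosts answer the Group-Specific Query,
    # so the entry stays and they keep receiving the group.
    return remaining
```

With two hosts on an interface, hosts_still_served({"A", "B"}, "A", True) returns an empty set, which matches the symptom of one user's leave interrupting traffic for the others.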



----End


Relevant Alarms

None.

Relevant Logs

None.


This section presents several Layer 2 multicast troubleshooting cases.

Multicast Forwarding Fails When IGMP Snooping Is Disabled

Fault Symptom

As shown in Figure 7-2, the Switch provides the Layer 2 multicast function. After IGMP snooping is configured on the Switch, clients can receive multicast traffic only for two minutes.

When a client initiates a multicast-on-demand operation, a multicast entry is created on the Switch, but is deleted after two minutes. When the multicast entry is deleted, multicast traffic is interrupted.



Figure 7-2 Multicast forwarding fails when IGMP snooping is disabled (figure omitted; it shows a multicast source, an IP/MPLS core, the Switch with interfaces XGE 0/0/1 and XGE 0/0/2 in VLAN 10, and a multicast receiver)

Fault Analysis

1. Only multicast traffic is interrupted, and other service traffic is forwarded normally; therefore, the links function properly.

2. On the network, the router is configured with a static multicast group, so it does not send IGMP Query packets. In addition, the Switch is not configured to send Query packets. Therefore, the client does not receive IGMP Query packets.

When the client initiates the multicast-on-demand service, it sends an IGMP Report packet. Therefore, the Layer 2 multicast entry can be created. By default, IGMP snooping does not update multicast entries by itself; therefore, after the created multicast entry ages out, the multicast entry is not created again unless the client re-initiates the multicast-on-demand service. The default aging time of multicast entries is 130 seconds. The formula is Query interval (60) x Robustness variable (2) + Maximum response time (10). Therefore, the client keeps receiving multicast traffic for only about two minutes.
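The aging-time arithmetic in the fault analysis can be checked with a one-line function. In this sketch the parameter names are assumptions; in particular, the 10-second term is named max_response_time here, following common IGMP terminology.

```python
def entry_aging_time(query_interval=60, robustness=2, max_response_time=10):
    """Aging time in seconds of a dynamically learned multicast entry:
    query interval x robustness variable + the 10-second term."""
    return query_interval * robustness + max_response_time


# Default values from the fault analysis: 60 x 2 + 10 seconds.
DEFAULT_AGING_TIME = entry_aging_time()
```

With the default values the result is 130 seconds, slightly over two minutes, which matches the observed interruption when no Query packets refresh the entry.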

Procedure


Step 2 Run the vlan vlan-id command to enter the VLAN view.

Step 3 Run the igmp-snooping querier enable command to enable IGMP snooping querier for the VLAN.

This command ensures that the Switch sends IGMP Query packets to update the multicast entries after IGMP snooping is enabled.






After the preceding configurations, the client receives multicast traffic and the traffic is not interrupted.

----End

Summary

When IGMP packets from the upstream router cannot reach the S6700 for some reasons, or when the upstream router only has static multicast forwarding entries, you can configure an IGMP querier on the S6700. The IGMP querier then sends IGMP Query packets for the upstream router.

7.2 Layer 3 Multicast Troubleshooting


7.2.1 Multicast Traffic Is Interrupted

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for Layer 3 multicast faults.

Common Causes

This fault is commonly caused by one of the following:

l Route configurations are incorrect.

l VLAN status is incorrect.

l PIM routing entries are not created.

l Multicast forwarding entries are not created.


After the Layer 3 multicast is configured, multicast traffic cannot be transmitted to users.

The troubleshooting roadmap is as follows:

l Check that a route destined for the multicast source is available.

l Check that the VLANs on the inbound and outbound interfaces of the multicast route function properly.

l Check that the PIM routing entries are created.

l Check that the multicast forwarding entries are created.



Troubleshooting flowchart (figure omitted). Starting from "Multicast traffic cannot be transmitted", the flowchart checks whether the route to the multicast source is reachable (if not, configure a static route to the multicast source or enable a routing protocol), whether configurations have been delivered to interface boards, and whether the PIM information table has been generated. It ends with checking whether forwarding entries have been generated and recording the phenomena.


Procedure

Step 1 Check that a route destined for the multicast source is available.

Run the display ip routing-table ip-address command to check whether the local routing table contains a route destined for the multicast source.

NOTE

ip-address specifies the multicast source address.

l If not, configure a route destined for the multicast source.


Step 2 Check that the VLANs on the inbound and outbound interfaces of the multicast forwarding entry function properly.

Run the display interface vlanif vlan-id command to view VLAN status.

l If the VLANs are abnormal, the multicast forwarding entry cannot be created. Rectify the fault according to 4.1 VLAN Troubleshooting.

In the following information, the status of VLANIF 90 is Down.

[Quidway] display interface Vlanif 90
Vlanif90 current state : DOWN
Line protocol current state : DOWN
Description:HUAWEI, Quidway Series, Vlanif90 Interface
Route Port,The Maximum Transmit Unit is 1500
Internet Address is 1.1.1.1/24
IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 0018-2000-0140
Output bandwidth utilization : --

l If the VLAN status is normal, go to step 3.
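When several VLANIF interfaces have to be checked, the two state lines can be extracted from saved command output with a short script. The Python sketch below assumes that the output contains lines of the form "VlanifN current state : X" and "Line protocol current state : X". The second label is an assumption about the output format, and vlanif_states is a hypothetical helper.

```python
import re

# Sample text modeled on the output in this step (labels partly assumed).
SAMPLE = """\
Vlanif90 current state : DOWN
Line protocol current state : DOWN
Internet Address is 1.1.1.1/24
"""


def vlanif_states(output):
    """Return (physical_state, protocol_state); None for a line not found."""
    physical = re.search(r"^\s*Vlanif\d+ current state\s*:\s*(\S+)", output, re.M)
    protocol = re.search(r"^\s*Line protocol current state\s*:\s*(\S+)", output, re.M)
    return (
        physical.group(1) if physical else None,
        protocol.group(1) if protocol else None,
    )


def vlanif_is_up(output):
    """True only when both states are reported as UP."""
    return vlanif_states(output) == ("UP", "UP")
```

For the sample text, vlanif_is_up returns False, so the multicast forwarding entry that uses VLANIF 90 cannot be created until the VLAN fault is rectified.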

Step 3 Check that the PIM routing entries are created.

Run the display pim routing-table command to check whether PIM routing entries are created.

l If not, contact Huawei technical support personnel.


Step 4 Check whether the multicast forwarding entries are created.

Run the display multicast forwarding-table command to check that the multicast forwarding entries are created.

l If the fault persists, record the command output and contact Huawei technical support personnel.


l Results of the preceding troubleshooting procedure

l Configuration files, log files, and alarm files of the devices

----End


Relevant Alarms

None.

Relevant Logs

None.



7.2.2 The PIM Neighbor Relationship Remains Down

Common Causes

This fault is commonly caused by one of the following:

l The interface is physically Down or the link-layer protocol status of the interface is Down.

l PIM is not enabled on the interface.

l PIM configurations on the interface are incorrect.


After PIM network configuration is complete, the PIM neighbor relationship remains Down.

Figure 7-4 Troubleshooting flowchart: the PIM neighbor relationship remains Down (figure omitted). The flowchart checks, in order, whether PIM is enabled on the interface (if not, enable PIM on the interface), whether the PIM status is Up on the interface, whether the interface is physically Up and whether the link status is Up on the interface (if not, refer to the troubleshooting of interface Down), and whether the PIM configurations on the interface are correct (if not, change the PIM configurations on the interface).

NOTE

Saving the results of each troubleshooting step is recommended. If troubleshooting fails to correct the fault, a record of the actions taken will exist to provide to Huawei technical support personnel.

Procedure

Step 1 Check that PIM is enabled on the interface.






Run the display current-configuration interface interface-type interface-number command to check whether PIM is enabled on the interface.

l If PIM is not enabled, enable PIM on the interface.

If "Warning: Please enable multicast routing in the system view first" is prompted when you enable PIM, first run the multicast routing-enable command in the system view to enable the multicast function. Then, go on to enable PIM-SM or PIM-DM on the interface.

l If PIM has been enabled on the interface, go to Step 2.

Step 2 Check that the PIM status of the interface is Up.

Run the display pim interface interface-type interface-number command to check whether the PIM status of the interface is Up.

l If the PIM status is Down, run the display interface interface-type interface-number command to check whether the physical status and link status of the interface are both Up.

1. If the physical status is not Up, rectify the physical fault so that the physical status becomes Up.

2. If the link status is not Up, rectify the link-layer fault so that the link status becomes Up.

l If the PIM status of the interface is Up, go to Step 3.

Step 3 Check that PIM configurations on the interface are correct.

This fault may be caused by the following PIM configurations:

l The IP addresses of directly-connected interfaces are on different network segments.

l PIM silent is configured on the interface.

l A PIM neighbor filtering policy is configured on the interface and the address of the PIM neighbor is filtered out by the policy.

l If the interface is configured to deny Hello messages without Generation IDs, the interface discards all the Hello messages received from PIM neighbors without any Generation IDs. As a result, the PIM neighbor relationship cannot go Up. This case applies to the scenario in which Huawei devices are intercommunicating with non-Huawei devices.

Run the display current-configuration interface interface-type interface-number command to check whether any of the preceding PIM configurations exist on the interface.

l If any of the preceding PIM configurations exist, correct it.

l If the fault persists after the preceding operations are complete, go to Step 4.
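The first configuration problem in this step, directly connected interfaces on different network segments, can be checked offline with the Python standard library. This is a minimal sketch; the addresses are hypothetical and same_segment is an illustrative helper.

```python
import ipaddress


def same_segment(local_interface, neighbor_interface):
    """True if two interface addresses (address/mask-length) are in the same
    network segment, that is, they derive the same network."""
    local = ipaddress.ip_interface(local_interface)
    neighbor = ipaddress.ip_interface(neighbor_interface)
    return local.network == neighbor.network
```

For example, same_segment("10.1.1.1/24", "10.1.1.2/24") is True, whereas same_segment("10.1.1.1/24", "10.1.2.2/24") is False, a case in which the PIM neighbor relationship is not expected to go Up.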



----End


Relevant Alarms

None.

Relevant Logs

PIM/4/NBR_DOWN



7.2.3 The RPT on a PIM-SM Network Fails to Forward Data

Common Causes

This fault is commonly caused by one of the following:

l The unicast route from the multicast device to the RP is unavailable.

l The RP addresses on multicast devices are inconsistent.

l The downstream interface on the multicast device does not receive any (*, G) Join messages.

l PIM-SM is not enabled on interfaces.

l The RPF route to the RP is incorrect, for example, the unicast route contains a loop.

l Configurations are incorrect, for example, the configurations of the TTL, MTU, or multicast boundary are improper.


After a PIM-SM network is configured, the RPT cannot forward data.

Figure 7-5 Troubleshooting flowchart: the RPT on a PIM-SM network fails to forward data (figure omitted). The flowchart checks whether correct (*, G) entries exist, whether the downstream interface has received Join messages, whether PIM-SM is enabled on interfaces (if not, enable PIM-SM on interfaces), whether RP configurations are correct (if not, rectify the faults on the static RP or BSR RP), whether the RPF route to the RP is available (if not, rectify the fault of unicast routes), whether the interface that forwards multicast data is the receiver's DR (if not, re-check the DR), whether a multicast boundary is configured on the interface (if so, remove the configurations of the multicast boundary), and whether a source-policy is configured (if so, remove the configurations of the source-policy or change the configurations of the ACL).



Procedure

Step 1 Check that the PIM routing table contains correct (*, G) entries.

Run the display pim routing-table group-address command on the device to check whether the PIM routing table contains correct (*, G) entries. Focus on checking whether the downstream interface list contains downstream interfaces to forward data to all (*, G) group members.

l If the (*, G) entries exist and are all correct in the PIM routing table, run the display multicast forwarding-table group-address command every 15 seconds to check whether the forwarding table contains (S, G) entries associated with the (*, G) entries and whether the value of the Matched field in the command output keeps increasing.

– If the forwarding table contains associated (S, G) entries and the value of the Matched field keeps increasing, the upstream device can normally forward multicast data to the current device but the current device fails to forward the data downstream, for example, because the TTL value is too small or a forwarding fault has occurred.

– If the forwarding table does not contain associated (S, G) entries or the value of the Matched field remains unchanged, do as follows:

– If the current device is not an RP, the current device has not received any multicast data. The fault may be caused by the upstream device. Check whether the PIM routing table on the upstream device contains correct (S, G) entries.

– If the current device is an RP, the RPT has been set up but the RP fails to receive the multicast data from the multicast source. The fault may be caused by a failure in source's DR registration. In such a case, go to Step 10.

l If the PIM routing table does not contain correct (*, G) entries, go to Step 2.
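The decision in Step 1 depends on whether the Matched counter keeps increasing across samples taken every 15 seconds. The Python sketch below shows that comparison and the resulting branches. It assumes the counter values were noted by hand from successive command outputs; the function names and result strings are hypothetical.

```python
def keeps_increasing(samples):
    """True if every sample is larger than the one before it.

    samples: Matched counter values read every 15 seconds, oldest first.
    At least two samples are needed to see any change.
    """
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


def step1_diagnosis(has_sg_entries, matched_samples, is_rp):
    """Map the Step 1 observations to the branch described in the procedure."""
    if has_sg_entries and keeps_increasing(matched_samples):
        return "data arrives; check local forwarding (TTL, forwarding fault)"
    if is_rp:
        return "RP receives no data; suspect registration of the source's DR"
    return "no data received; check (S, G) entries on the upstream device"
```

For example, samples of 100, 180, and 260 indicate that data is arriving, whereas 100, 100, and 100 on a device that is not the RP point to the upstream device.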

Step 2 Check that the downstream interface has received Join messages.

Run the display pim control-message counters interface interface-type interface-number message-type join-prune command to check whether the number of received Join/Prune messages on the downstream interface keeps increasing.

l If the number of received Join/Prune messages on the downstream interface does not increase, run the display pim control-message counters interface interface-type interface-number message-type join-prune command on the downstream device to check whether the downstream device has sent Join/Prune messages upstream.

– If the command output shows that the number of sent Join/Prune messages keeps increasing, the downstream device has sent Join/Prune messages. The fault may be caused by a failure in PIM neighbor communication. In such a case, go to Step 10.

– If the command output shows that the number of sent Join/Prune messages does not increase, the downstream device is faulty. Locate the fault on the downstream device.

l If the number of received Join/Prune messages on the downstream interface keeps increasing, go to Step 3.

Step 3 Check that PIM-SM is enabled on interfaces.

PIM-SM is easily overlooked on the following interfaces:

l RPF neighboring interface to the RP

l RPF interface to the RP

l Interface directly connected to the shared network segment of user hosts, that is, the downstream interface of the receiver's DR

Run the display pim interface verbose command to check PIM configurations on the interface. Focus on checking whether PIM-SM is enabled on the preceding interfaces.

l If the command output does not contain information about an interface of the device or the PIM mode of an interface is dense, run the pim sm command on the interface.

If the system prompts "Warning: Please enable multicast routing first" when you configure PIM-SM on the interface, run the multicast routing-enable command in the system view to enable the multicast function first, and then enable PIM-SM on the interface.

l If PIM-SM has been enabled on all the interfaces on the device, go to Step 4.

Step 4 Check that the RP information is correct.

Run the display pim rp-info command on the device to check whether the device has learned information about the RP serving a specific group and whether the RP information of the same group on all other devices is consistent.

l If no RP information is displayed or the RP information on the devices is inconsistent, do as follows:

– If the static RP is used on the network, run the static-rp command on all the devices to make information about the RP serving a specific group consistent.

– If the dynamic RP is used, go to Step 10.

l If RP information of a specific group is consistent on all the devices, go to Step 5.

Step 5 Check that an RPF route to the RP is available.

Run the display multicast rpf-info source-address command on the device to check whether there is an RPF route to the RP.

l If the command output does not contain any RPF route to the RP, check the configurations of unicast routes. Run the ping command on the device and the RP to check whether they can ping each other successfully.

l If the command output contains an RPF route to the RP, do as follows:

– If the command output shows that the RPF route is a static multicast route, run the display current-configuration command to check whether the static multicast route is properly configured.

– If the command output shows that the RPF route is a unicast route, run the display ip routing-table command to check whether the unicast route is consistent with the RPF route.

l If the command output contains an RPF route to the RP and the route is properly configured, go to Step 6.
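The idea behind the RPF route check in this step can be sketched as a longest-prefix match against the unicast routing table. The Python example below is a simplified illustration that ignores static multicast routes and route preferences; the table layout, interface names, and addresses are hypothetical.

```python
import ipaddress


def rpf_lookup(unicast_routes, target):
    """Longest-prefix match of target (for example, the RP address).

    unicast_routes: {prefix: (rpf_interface, rpf_neighbor)}
    Returns the matching (rpf_interface, rpf_neighbor) or None.
    """
    addr = ipaddress.ip_address(target)
    best = None
    for prefix, info in unicast_routes.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, info)
    return None if best is None else best[1]


def passes_rpf_check(unicast_routes, target, arrival_interface):
    """True if traffic toward/from target uses the expected RPF interface."""
    entry = rpf_lookup(unicast_routes, target)
    return entry is not None and entry[0] == arrival_interface
```

A result of None corresponds to the case in which no RPF route to the RP exists, and a mismatch between the returned interface and the interface on which packets actually arrive corresponds to an inconsistent unicast route.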

Step 6 Check that the interface that forwards multicast data is a receiver's DR.

Run the display pim interface interface-type interface-number command on the device to check whether the interface that forwards multicast data is a receiver's DR.

l If the DR information in the command output is not marked with local, troubleshoot the involved DR following the preceding steps.

l If the DR information in the command output is marked with local, go to Step 7.

Step 7 Check whether a multicast boundary is configured on the interface.

Run the display current-configuration interface interface-type interface-number command on the device to check whether a multicast boundary is configured on the interface.



l If the configuration of the interface contains multicast boundary, a multicast boundary is configured on the interface. Run the undo multicast boundary { group-address { mask | mask-length } | all } command to delete the configuration of the multicast boundary, or re-plan the network to ensure that no multicast boundary is configured on the RPF interface or the RPF neighboring interface.

l If no multicast boundary is configured on the interface, go to Step 8.

Step 8 Check whether a source policy is configured.

Run the display current-configuration configuration pim command to view the current configurations in the PIM view.

l If the configuration contains source-policy acl-number, a source-based filtering rule is configured. If the received multicast data is denied by the ACL rule, the multicast data is discarded. Run the undo source-policy command to delete the configuration of the ACL rule, or reconfigure the ACL rule to ensure that the required multicast data can be normally forwarded.

l If no source policy is configured, go to Step 9.
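How a source policy can silently drop multicast data is easier to see with a small model of ACL matching. The Python sketch below assumes first-match evaluation and assumes that a source matching no rule is denied. Both are modeling assumptions for illustration; the rule format and function name are hypothetical and do not reproduce the device's ACL syntax.

```python
import ipaddress


def source_permitted(acl_rules, source):
    """Evaluate a source address against ordered (action, prefix) rules.

    acl_rules: list of ("permit" | "deny", "a.b.c.d/len"), checked in order.
    Assumption: the first matching rule decides, and a source that matches
    no rule is treated as denied.
    """
    addr = ipaddress.ip_address(source)
    for action, prefix in acl_rules:
        if addr in ipaddress.ip_network(prefix):
            return action == "permit"
    return False
```

With the rules [("deny", "10.1.1.0/24"), ("permit", "10.1.0.0/16")], data from 10.1.1.5 is discarded while data from 10.1.2.5 is forwarded, so a source inside a denied range explains missing traffic even when all routes are correct.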

Step 9 Check whether the PIM routing table contains correct (*, G) entries.

Run the display pim routing-table group-address command on the device to check whether the PIM routing table contains correct (*, G) entries. For details, see Step 1.

If the fault persists after the preceding troubleshooting procedures are complete, go to Step 10.



----End


Relevant Alarms

None.

Relevant Logs

None.

7.2.4 The SPT on a PIM-SM Network Fails to Forward Data

Common Causes

This fault is commonly caused by one of the following:

l The downstream interface on the multicast device does not receive any (S, G) Join messages.

l PIM-SM is not enabled on the interface.

l The RPF route to the multicast source is incorrect. For example, the unicast route contains a loop.

l Configurations are incorrect. For example, the configurations of the TTL, MTU, switchover threshold, or multicast boundary are improper.


After the PIM-SM network is configured, the SPT fails to forward data.

Figure 7-6 Troubleshooting flowchart: the SPT on a PIM-SM network fails to forward data


Procedure

Step 1 Check that the PIM routing table contains correct (S, G) entries.

Run the display pim routing-table command on the device to check whether the PIM routing table contains correct (S, G) entries.

l If the PIM routing table contains correct (S, G) entries, do as follows:

  – Check whether the entry has an SPT flag.

    – If the multicast group is in the ASM group address range, the SPT switchover is triggered by the RP, and the upstream interface of the RP is a register interface, the RP has received the Register message from the source's DR but the SPT fails to be established. Contact Huawei technical support personnel.

    – If the multicast group is in the ASM group address range, the SPT switchover is triggered by the receiver's DR, and the upstream interface is the RPF interface to the RP rather than the SPT interface to the multicast source, the SPT fails to be established.

      Run the display current-configuration configuration pim command on the receiver's DR to view the current configurations in the PIM view. If the command output contains spt-switch-threshold traffic-rate or spt-switch-threshold infinity, run the undo spt-switch-threshold command to delete the traffic rate configuration, or run the spt-switch-threshold traffic-rate command to configure a proper traffic rate.

  – Check whether the downstream interface list contains downstream interfaces to forward data to all group members.

  – If the (S, G) entries exist and are all correct in the PIM routing table, run the display multicast forwarding-table command to view the (S, G) entries in the forwarding table and check whether the value of the Forwarded field in the command output keeps increasing. The value of the Matched field is not updated in real time, so wait several minutes after running the display multicast forwarding-table command before you compare values.

    – If the value of the Matched field keeps increasing, the upstream device can forward multicast data to the current device but the current device fails to forward the data downstream. Go to Step 9.

    – If the value of the Matched field remains unchanged, do as follows:

      – If the current device is not a source's DR, the current device has not received any multicast data. The fault may be caused by the upstream device. Check whether the PIM routing table on the upstream device contains correct (S, G) entries.

        – If the PIM routing table on the upstream device does not contain correct (S, G) entries, troubleshoot the upstream device following the preceding steps.

        – If the PIM routing table on the upstream device contains correct (S, G) entries, but the value of the Matched field still remains unchanged, go to Step 9.

      – If the current device is already a source's DR, the SPT has been set up but the source's DR fails to forward the multicast data along the SPT. Go to Step 9.

l If the PIM routing table does not contain correct (S, G) entries, go to Step 2.

Step 2 Check that the downstream interface has received Join messages.

NOTE

If the current device is a receiver's DR, skip this step.

If the downstream interface does not receive any (S, G) Join messages, the possible causes are as follows:

l A fault occurs on the downstream interface.

l PIM-SM is not enabled on the downstream interface.

Run the display pim control-message counters interface interface-type interface-number message-type join-prune command to check whether the number of received Join/Prune messages on the downstream interface keeps increasing.

l If the number of received Join/Prune messages on the downstream interface does not increase, run the display pim control-message counters interface interface-type interface-number message-type join-prune command on the downstream device to check whether it has sent Join/Prune messages upstream.

  – If the command output shows that the number of sent Join/Prune messages keeps increasing, the downstream device has sent Join/Prune messages. The fault may be caused by a failure in PIM neighbor communication. In this case, go to Step 9.

  – If the command output shows that the number of sent Join/Prune messages does not increase, the downstream device is faulty. Locate the fault on the downstream device.

l If the number of received Join/Prune messages on the downstream interface keeps increasing, go to Step 3.
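The decision logic in Step 2 can be sketched in Python as follows. This is only an illustration: the function and type names are hypothetical, and each counter is assumed to have been read twice by hand from the display pim control-message counters output, some time apart.

```python
from typing import Optional, Tuple

# (earlier reading, later reading) of one Join/Prune counter.
# Hypothetical helper type; the values are copied by hand from two runs
# of the counters command.
Sample = Tuple[int, int]


def increased(sample: Sample) -> bool:
    """Return True if the counter grew between the two readings."""
    earlier, later = sample
    return later > earlier


def diagnose_join_prune(received_here: Sample,
                        sent_downstream: Optional[Sample] = None) -> str:
    """Map the counter readings to the next action described in Step 2."""
    if increased(received_here):
        return "Join/Prune messages are arriving: go to Step 3"
    if sent_downstream is None:
        return "check the sent Join/Prune counter on the downstream device"
    if increased(sent_downstream):
        return "possible PIM neighbor communication failure: go to Step 9"
    return "downstream device is faulty: locate the fault there"


if __name__ == "__main__":
    # Received counter is flat here, but the downstream device keeps sending.
    print(diagnose_join_prune((120, 120), (40, 55)))
```

The second argument is optional because the downstream device only needs to be checked when the local received counter has stopped increasing.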

Step 3 Check that PIM-SM is enabled on interfaces.

The following interfaces are easily overlooked when PIM-SM is enabled:

l RPF neighboring interface to the multicast source

l RPF interface to the multicast source

NOTE

In PIM-SM network deployment, it is recommended that you enable the multicast function on all devices on the network and enable PIM-SM on all interfaces.

Run the display pim interface verbose command to check the PIM configurations on the interfaces. Focus on whether PIM-SM is enabled on the preceding interfaces.

l If the command output does not contain information about an interface of the device, or the PIM mode of an interface is dense, run the pim sm command on the interface.

  If the system displays "Warning: Please enable multicast routing first" when you configure PIM-SM on the interface, run the multicast routing-enable command in the system view to enable the multicast function, and then run the pim sm command in the interface view to enable PIM-SM on the interface.

l If PIM-SM has been enabled on all the interfaces on the device, go to Step 4.
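As a complement to the display pim interface verbose check, a saved configuration can be scanned offline for interfaces that lack the pim sm command. The Python sketch below assumes that the configuration text uses interface blocks terminated by a "#" line, as in display current-configuration output. The function name, interface names, and addresses are hypothetical. The result lists every interface without pim sm, including interfaces that are not meant to carry multicast, so it needs manual review.

```python
from typing import List, Optional

# Hypothetical sample; interface names and addresses are illustrative only.
SAMPLE_CONFIG = """
#
interface Vlanif10
 ip address 10.1.1.1 255.255.255.0
 pim sm
#
interface Vlanif20
 ip address 10.1.2.1 255.255.255.0
#
interface Vlanif30
 pim dm
#
"""


def interfaces_without_pim_sm(config_text: str) -> List[str]:
    """Return the interfaces whose configuration block has no 'pim sm' line."""
    missing: List[str] = []
    current: Optional[str] = None
    has_pim_sm = False

    def close_block() -> None:
        # Record the interface that just ended if it never enabled PIM-SM.
        if current is not None and not has_pim_sm:
            missing.append(current)

    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("interface "):
            close_block()
            current = line.split(None, 1)[1]
            has_pim_sm = False
        elif line == "#":
            close_block()
            current = None
        elif current is not None and line == "pim sm":
            has_pim_sm = True
    close_block()
    return missing


if __name__ == "__main__":
    print(interfaces_without_pim_sm(SAMPLE_CONFIG))
```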

Step 4 Check that an RPF route to the multicast source is available.

Run the display multicast rpf-info source-address command on the device to check whether there is an RPF route to the multicast source.

l If the command output does not contain any RPF route to the multicast source, check the configurations of unicast routes. Run the ping command on the device and the multicast source to check whether they can ping each other successfully.

l If the command output contains an RPF route to the multicast source, do as follows:

  – If the command output shows that the RPF route is a static multicast route, run the display current-configuration command to check whether the static multicast route is properly configured.

  – If the command output shows that the RPF route is a unicast route, run the display ip routing-table command to check whether the unicast route is consistent with the RPF route.

l If the command output contains an RPF route to the multicast source and the route is properly configured, go to Step 5.

Step 5 Check that the interface that forwards multicast data is the receiver's DR.

Run the display pim interface interface-type interface-number command on the device to check whether the interface that forwards multicast data is a receiver's DR.

l If the DR information in the command output is not marked with local, troubleshoot the involved DR following the preceding steps.

l If the DR information in the command output is marked with local, go to Step 6.

Step 6 Check whether a multicast boundary is configured on the interface.

Run the display current-configuration interface interface-type interface-number command on the device to check whether a multicast boundary is configured on the interface.

l If the configuration of the interface contains multicast boundary, a multicast boundary is configured on the interface. Run the undo multicast boundary { group-address { mask | mask-length } | all } command to delete the multicast boundary configuration, or re-plan the network to ensure that no multicast boundary is configured on the RPF interface or the RPF neighboring interface.

l If no multicast boundary is configured on the interface, go to Step 7.

Step 7 Check whether a source policy is configured.

Run the display current-configuration configuration pim command to view the current configurations in the PIM view.

l If the configuration contains source-policy acl-number, a source filtering rule is configured. If the received multicast data is denied by the ACL rule, the multicast data is discarded. Run the undo source-policy command to delete the configuration, or reconfigure the ACL rule so that the required multicast data can be forwarded.

l If no source policy is configured, go to Step 8.
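To judge whether a particular multicast source is denied by the ACL referenced in source-policy, it can help to test the source address against the ACL rules by hand. The Python sketch below shows wildcard-mask matching, where a 0 bit in the wildcard means the bit must match. It makes several assumptions: rules are given as hypothetical (action, address, wildcard) tuples copied from the ACL, they are evaluated in the listed order with the first match winning, and an address that matches no rule is reported as "no-match" rather than guessed, because how the device treats that case should be confirmed in the command reference.

```python
import ipaddress
from typing import List, Tuple

# (action, source address, wildcard mask). Hypothetical representation of
# basic ACL rules; the addresses below are illustrative only.
Rule = Tuple[str, str, str]

RULES: List[Rule] = [
    ("deny", "10.1.1.0", "0.0.0.255"),
    ("permit", "10.1.0.0", "0.0.255.255"),
]


def matches(source: str, base: str, wildcard: str) -> bool:
    """Compare only the bits that the wildcard mask marks as significant (0 bits)."""
    src = int(ipaddress.IPv4Address(source))
    net = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (src & care) == (net & care)


def evaluate(source: str, rules: List[Rule]) -> str:
    """Return the action of the first matching rule, or 'no-match'."""
    for action, base, wildcard in rules:
        if matches(source, base, wildcard):
            return action
    return "no-match"


if __name__ == "__main__":
    for candidate in ("10.1.1.5", "10.1.2.5", "192.168.0.1"):
        print(candidate, evaluate(candidate, RULES))
```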

Step 8 Check whether the PIM routing table contains correct (S, G) entries.

Run the display pim routing-table command on the device to check whether the PIM routing table contains (S, G) entries. For details, see Step 1.



----End


Relevant Alarms

None.

Relevant Logs

None.

7.2.5 MSDP Peers Cannot Generate Correct (S, G) Entries

Common Causes

This fault is commonly caused by one of the following:

l The MSDP peer to initiate SA messages is not configured on the RP.

l The logical RP is not configured on the devices to be deployed with anycast RP or configurations of the logical RP are incorrect.

l MSDP peer relationships are not set up between every two members in a mesh group.

l The used intra-domain multicast protocol is not PIM-SM.

l The RPF route to the multicast source is incorrect. For example, the unicast route contains a loop.

l Configurations are incorrect. For example, the configurations of the SA policy, import policy, TTL, switchover threshold, or multicast boundary are improper.

l The SA message fails to pass RPF check.


After configurations are complete on a multicast network, MSDP peers cannot generate correct (S, G) entries.

Figure 7-7 Troubleshooting flowchart: MSDP peers cannot generate correct (S, G) entries




NOTE

Saving the results of each troubleshooting step is recommended. If troubleshooting fails to correct the fault, a record of the actions taken will exist to provide to Huawei technical support personnel.

Procedure

Step 1 Check that the status of MSDP peers is Up.

Run the display msdp brief command on the devices setting up an MSDP peer relationship to check whether the status of MSDP peers is Up.

l If the command output shows that the status of MSDP peers is Down, check whether the MSDP peer interfaces are correctly configured and whether the MSDP peers can ping each other successfully. If the ping fails, perform troubleshooting based on The Ping Operation Fails.

l If the MSDP peers are both in the Up state, go to Step 2.

Step 2 Check that SA cache is enabled.

Run the display current-configuration configuration msdp command on MSDP peers to view the current configurations in the MSDP view.

l If the command output shows undo cache-sa-enable, SA cache is disabled in the MSDP view. In this case, run the cache-sa-enable command in the MSDP view to enable SA cache.

l If SA cache has been enabled, go to Step 3.

Step 3 Check that SA messages have reached MSDP peers.

Run the display msdp sa-count command on MSDP peers to check the contents of the SA cache.

l If there is no command output, contact Huawei technical support personnel.

l If the value of the Number of source or Number of group field in the command output is non-zero, SA messages have reached the peers. Go to Step 4.

Step 4 Check whether export policies are configured on the MSDP peers.

Run the display current-configuration configuration msdp command in the MSDP view on the MSDP peers to view the current configurations.

l If export policies are configured on the MSDP peers, do as follows:

  – If the command output shows the peer peer-address sa-policy export command without any parameters, the MSDP peers are disabled from forwarding messages received from the multicast source. Run the undo peer peer-address sa-policy export command to delete the export policy configuration.

  – If the command output shows the peer peer-address sa-policy export acl advanced-acl-number command with an ACL specified, the MSDP peers can forward only the (S, G) entries permitted by the ACL. Check whether ACL-related commands are run on the MSDP peers and whether the (S, G) entries are permitted by the ACL. Run the undo peer peer-address sa-policy export command to delete the configuration, or change the ACL rules.

l If no export policies are configured on the MSDP peers, go to Step 5.



Step 5 Check whether import policies are configured on MSDP peers.

Run the display current-configuration configuration msdp command in the MSDP view on the MSDP peers to view the current configurations.

l If import policies are configured on the MSDP peers, do as follows:

  – If the command output shows the peer peer-address sa-policy import command without any parameters, the MSDP peers are disabled from receiving messages from the multicast source. Run the undo peer peer-address sa-policy import command to delete the import policy configuration.

  – If the command output shows the peer peer-address sa-policy import acl advanced-acl-number command with an ACL specified, the MSDP peers can receive only the (S, G) entries permitted by the ACL. Check whether ACL-related commands are run on the MSDP peers and whether the (S, G) entries are permitted by the ACL. Run the undo peer peer-address sa-policy import command to delete the configuration, or change the ACL rule.

l If no import policies are configured on the MSDP peers, go to Step 6.
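The export and import policy checks in Step 4 and Step 5 both look for the same kind of line in the MSDP view configuration. The Python sketch below classifies those lines. It assumes that the configuration text contains lines in the form peer peer-address sa-policy { import | export } [ acl advanced-acl-number ], as named in the steps above. The function name, peer addresses, and ACL number are hypothetical.

```python
import re
from typing import Dict, List

# Matches the sa-policy forms described in Step 4 and Step 5.
SA_POLICY = re.compile(
    r"^peer\s+(\S+)\s+sa-policy\s+(import|export)(?:\s+acl\s+(\d+))?$"
)

# Hypothetical sample; addresses and ACL number are illustrative only.
SAMPLE_MSDP_CONFIG = """
msdp
 peer 10.1.1.1 connect-interface LoopBack0
 peer 10.1.1.1 sa-policy export
 peer 10.2.2.2 sa-policy import acl 3001
"""


def classify_sa_policies(msdp_config: str) -> List[Dict[str, str]]:
    """List each sa-policy line with the effect described in the procedure."""
    findings: List[Dict[str, str]] = []
    for raw in msdp_config.splitlines():
        match = SA_POLICY.match(raw.strip())
        if match is None:
            continue
        peer, direction, acl = match.groups()
        if acl is None:
            effect = "no ACL specified: all (S, G) information is filtered"
        else:
            effect = f"only (S, G) entries permitted by ACL {acl} pass"
        findings.append({"peer": peer, "direction": direction, "effect": effect})
    return findings


if __name__ == "__main__":
    for finding in classify_sa_policies(SAMPLE_MSDP_CONFIG):
        print(finding)
```

A line without an ACL is the more likely cause of missing (S, G) entries, so those findings are worth checking first.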

Step 6 Check whether the current MSDP peer receives multicast data from the multicast source.

l If the current MSDP peer does not receive multicast data from the multicast source, troubleshoot the upstream device following the preceding steps.

l If the current MSDP peer receives multicast data from the multicast source, go to Step 7.

Step 7 Check whether the current MSDP peer is an RP.

Run the display pim routing-table command on the MSDP peer closest to the multicast source to view the routing table.

l If the (S, G) entry does not have a 2MSDP flag, the MSDP peer is not an RP. Change the configurations of the RP or MSDP peer on the PIM-SM network to ensure that the MSDP peer is an RP.

l If the MSDP peer is an RP, go to Step 8.

Step 8 Check whether import-source policies are configured on the current MSDP peer.

The import-source [ acl acl-number ] command is used to enable an MSDP peer to filter the (S, G) entries to be advertised based on source addresses when creating SA messages. The MSDP peer can control the transmission of multicast source information. By default, SA messages can be used to advertise information about all known multicast sources.

Run the display current-configuration configuration msdp command in the MSDP view on the MSDP peer closest to the multicast source to view the current configurations.

l If import-source policies are configured on the MSDP peer, do as follows:

  – If the command output shows the import-source command without any parameters, the MSDP peer is disabled from advertising multicast source information. Run the undo import-source command to delete the import-source policy configuration.

  – If the command output shows the import-source acl acl-number command with an ACL specified, the MSDP peer advertises only (S, G) information matching the ACL. Check whether ACL-related commands are run on the MSDP peer and whether the (S, G) entries are permitted by the ACL. Run the undo import-source command to delete the configuration, or change the ACL rule.

l If no import-source policies are configured on the MSDP peer, go to Step 9.


Step 9 Collect the following information and contact Huawei technical support personnel.

l Results of the preceding operation procedure

l Configuration files, log files, and alarm files of the device

----End


Relevant Alarms

None.

Relevant Logs

None.


This section provides troubleshooting cases for Layer 3 multicast.



8 Security

8.1 AAA Troubleshooting

This chapter describes common causes of authentication, authorization, and accounting (AAA) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.2 ARP Security Troubleshooting

This chapter describes common causes of ARP faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.3 NAC Troubleshooting

This chapter describes common causes of the network admission control (NAC) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.4 DHCP Snooping Troubleshooting

This chapter describes common causes of Dynamic Host Configuration Protocol (DHCP) snooping faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.5 Traffic Suppression Troubleshooting

This chapter describes common causes of traffic suppression faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

8.6 CPU Defense Troubleshooting

This chapter describes common causes of CPU defense faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.7 MFF Troubleshooting

This chapter describes common causes of MAC Forced Forwarding (MFF) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

8.8 ACL Troubleshooting

This chapter describes common causes of ACL faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.9 PPPoE+ Troubleshooting






This chapter describes common causes of Point-to-Point Protocol over Ethernet (PPPoE) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

8.10 URPF Troubleshooting

This section provides a troubleshooting case for Unicast Reverse Path Forward (URPF).






8.1 AAA Troubleshooting

This chapter describes common causes of authentication, authorization, and accounting (AAA) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.1.1 A User Fails in the RADIUS Authentication

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the Remote Authentication Dial In User Service (RADIUS) authentication failure.

Common Causes

This fault is commonly caused by one of the following:

l The user name or password is incorrect. For example, the user name does not exist, or the user name format (with or without the domain name) is different from the format configured on the RADIUS server.

l The RADIUS configuration on the S6700 is incorrect, including the authentication mode and the RADIUS server template.

l The port number and shared key configured on the S6700 are different from those on the RADIUS server.

l The number of online users reaches the maximum value.

A user fails to pass RADIUS authentication.

The troubleshooting roadmap is as follows:

l Check whether the link between the S6700 and the RADIUS server works properly.

l Check the RADIUS configuration on the S6700, including the domain name, RADIUS server template, authentication mode, and accounting mode.

l Check whether the network access server (NAS) IP address, port number, and shared key configured on the RADIUS server are the same as those configured on the S6700.

Figure 8-1 Troubleshooting flowchart for RADIUS authentication failure



Procedure

Step 1 Run the ping command to check whether the link between the network access server (NAS), namely, the S6700, and the RADIUS server works properly.





Step 2 Check that the RADIUS configuration on the S6700 is correct.

Check the RADIUS configuration to ensure that:

l The authentication scheme bound to the user domain is RADIUS authentication.

l The correct RADIUS server template is bound to the domain. The IP address and port of the authentication server and accounting server are set correctly in the template.

l The user name format and shared key specified in the template are the same as those on the RADIUS server.

The last two items must also be checked against the configuration on the RADIUS server. To do so, go to Step 3.

Action                                                        Command
Check the domain configuration.                               display domain
Check which RADIUS server template is bound to the domain.    display domain name domain-name
Check the authentication scheme bound to the domain.          display authentication-scheme
Check the accounting scheme bound to the domain.              display accounting-scheme
Check the configuration of the RADIUS server template.        display radius-server configuration
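When a saved configuration file is available, the domain-to-template binding can also be checked offline. The Python sketch below is a minimal illustration: it assumes the configuration layout of the sample configuration in Step 3 (sections separated by "#", a domain line followed by a radius-server template-name line), and the function name is hypothetical. The indentation in the sample is not relied on by the parser.

```python
from typing import Dict, Optional

# Sample based on the configuration shown in Step 3 of this procedure.
SAMPLE_CONFIG = """
#
radius-server template radius
 radius-server authentication 1.1.1.1 1645
#
aaa
 authentication-scheme default
 authentication-scheme aaa
  authentication-mode radius
 domain default
 domain default_admin
 domain huawei
  radius-server radius
#
"""


def radius_templates_by_domain(config_text: str) -> Dict[str, Optional[str]]:
    """Map each domain to the RADIUS server template bound to it, or None."""
    bindings: Dict[str, Optional[str]] = {}
    current_domain: Optional[str] = None
    for raw in config_text.splitlines():
        words = raw.split()
        if not words or words[0] == "#":
            # A section separator ends the current domain view.
            current_domain = None
        elif words[0] == "domain" and len(words) == 2:
            current_domain = words[1]
            bindings.setdefault(current_domain, None)
        elif current_domain and words[0] == "radius-server" and len(words) == 2:
            # "radius-server <template-name>" inside a domain binds the template.
            bindings[current_domain] = words[1]
    return bindings


if __name__ == "__main__":
    for domain, template in radius_templates_by_domain(SAMPLE_CONFIG).items():
        print(domain, "->", template)
```

A domain that maps to None has no RADIUS server template bound, which is the condition the procedure asks you to rule out for the user's domain.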

Step 3 Check information about the RADIUS packets sent and received by the S6700.

Run the debugging radius packet command in the user view to enable RADIUS packet debugging. Check whether any RADIUS packets are sent and received by the S6700.

<Quidway> debugging radius packet

<Quidway> terminal debugging

<Quidway> terminal monitor

CAUTION

Debugging affects system performance. After debugging is complete, run the undo debugging all command immediately to disable debugging.

l If no debugging information is displayed, the NAS configuration is incorrect. Check that the RADIUS server template is bound to the domain.

  The following configuration file shows that the RADIUS server template radius is bound to the domain huawei.

#
radius-server template radius
 radius-server authentication 1.1.1.1 1645
#
aaa
 authentication-scheme default
 authentication-scheme aaa
  authentication-mode radius
 authorization-scheme default
 accounting-scheme default
 domain default
 domain default_admin
 domain huawei
  radius-server radius

l If debugging information is displayed, proceed according to the debugging information.

Debugging information:

Nov 10 2010 15:23:34.260.6 Quidway RDS/7/debug2:
Radius Sent a Packet
Server Template: 0
Server IP : 192.168.1.128
Protocol: Standard
......

Solution: The RADIUS module sent an authentication packet. This message indicates that the S6700 can send RADIUS authentication packets.

Debugging information:

Nov 10 2010 15:23:34.260.6 Quidway %%01RDS/4/RDAUTHDOWN(l):
RADIUS authentication server ( IP: 192.168.1.128 Vpn-Instance: -- ) is down!

Solution: The RADIUS authentication server did not send any authentication response packet. This may be because the link between the S6700 and the RADIUS server fails or the RADIUS server has not restarted. Check that the NAS IP address and RADIUS service port numbers configured on the RADIUS server are the same as those configured on the NAS, and that the RADIUS service is enabled.

Debugging information:

Nov 10 2010 15:23:34.260.6 Quidway RDS/7/debug2:
[RDS (Evt):] Send a msg (Auth reject)
Nov 10 2010 15:23:34.260.7 Quidway RDS/7/debug2:
[RDS (Msg):]Msg type :Auth reject
[RDS (Msg):]UserID :16005
[RDS (Msg):]Template no:88.99
[RDS (Msg):]Authmethod :(pap)
[RDS (Msg):]ulSrcMsg :Auth req
[RDS (Msg):]szBitmap :00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Solution: The RADIUS authentication server returned an authentication failure packet. The possible causes of authentication failure are:

l The NAS IP address and the shared key are not configured on the RADIUS server.

l The shared key configured on the RADIUS server is different from the shared key configured on the NAS.

l The user account is not configured on the RADIUS server, or the user name format configured in the RADIUS server template is different from that on the RADIUS server. For example, the NAS sends the user name without the domain name but the RADIUS server requires the user name with the domain name.

l The password entered by the user is different from the password configured on the RADIUS server.

If any of the preceding errors exist, modify the configuration on the RADIUS server.

After configuration modification, check whether the user can pass the authentication. If the fault persists, go to step 4.
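The mapping from debugging output to the next check in Step 3 can be sketched in Python as follows. The sketch assumes that the captured terminal output contains the marker strings shown in the debugging information of this step ("Radius Sent a Packet", "RDAUTHDOWN", and "Auth reject"). The function name and the wording of the returned hints are hypothetical.

```python
from typing import List

# Marker strings taken from the debugging information shown in Step 3,
# paired with a short hint that summarizes the corresponding solution.
MARKERS = [
    ("Radius Sent a Packet",
     "NAS sent a RADIUS authentication packet"),
    ("RDAUTHDOWN",
     "no response from the server: check the link, the NAS IP address and "
     "port numbers on the server, and that the RADIUS service is enabled"),
    ("Auth reject",
     "server rejected the request: check the shared key, user account, "
     "user name format, and password"),
]


def classify_radius_debug(output: str) -> List[str]:
    """Return one hint per marker found in the captured debugging output."""
    if not output.strip():
        return ["no debugging output: check that the RADIUS server template "
                "is bound to the domain"]
    return [hint for marker, hint in MARKERS if marker in output]


if __name__ == "__main__":
    captured = (
        "Nov 10 2010 15:23:34.260.6 Quidway RDS/7/debug2:\n"
        "Radius Sent a Packet\n"
        "Nov 10 2010 15:23:34.260.6 Quidway %%01RDS/4/RDAUTHDOWN(l):\n"
        "RADIUS authentication server ( IP: 192.168.1.128 Vpn-Instance: -- ) is down!\n"
    )
    for hint in classify_radius_debug(captured):
        print(hint)
```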

Step 4 Check whether the number of online users reaches the maximum.

Both the NAS and the RADIUS server limit the number of online users. Run the display access-user command on the S6700 to check the number of online users.

l If the number of online users reaches the maximum, you do not need to take any action. The user can log in after the number of online users falls below the maximum.

l If the number of online users does not reach the maximum, check the maximum number of online users set on the RADIUS server. If the maximum number of online users set on the RADIUS server is not reached, go to Step 5.

Step 5 Check the user type.

l If the user is a Telnet user or an FTP user, rectify the fault according to the troubleshooting section for Telnet login failures or "2.6.1 The User Fails to Log in to the Server Through FTP."

l If the user is a network access user, rectify the fault according to "8.3 NAC Troubleshooting."




Step 6 Collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log file, and alarm file of the S6700

----End



Relevant Alarms

None.

Relevant Logs

None.

8.1.2 A User Fails in the HWTACACS Authentication

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the Huawei Terminal Access Controller Access Control System (HWTACACS) authentication failure.

Common Causes

This fault is commonly caused by one of the following:

l The user name or password is incorrect. For example, the user name does not exist, or the user name format (with or without the domain name) is different from the format configured on the HWTACACS server.

l The HWTACACS configuration on the S6700 is incorrect, including the authentication mode and HWTACACS server template.

l The port number and shared key configured on the S6700 are different from those on the HWTACACS server.

l The number of online users reaches the maximum value.

A user fails to pass HWTACACS authentication.

The troubleshooting roadmap is as follows:

l Check whether the link between the S6700 and the HWTACACS server works properly.

l Check the HWTACACS configuration on the S6700, including the domain name, HWTACACS server template, authentication mode, authorization mode, and accounting mode.

l Check whether the network access server (NAS) IP address, port number, and shared key configured on the HWTACACS server are the same as those configured on the S6700.

Figure 8-2 Troubleshooting flowchart for HWTACACS authentication failure



Procedure

Step 1 Run the ping command to check whether the link between the network access server (NAS), namely, the S6700, and the HWTACACS server works properly.





Step 2 Check the HWTACACS configuration on the S6700.

Check the HWTACACS configuration to ensure that:

l The authentication scheme bound to the user domain is HWTACACS authentication.

l The correct HWTACACS server template is bound to the domain. The IP address and port of the authentication server, authorization server, and accounting server are set correctly in the template.

l The user name format and shared key specified in the template are the same as those on the HWTACACS server.

The last two items must also be checked against the configuration on the HWTACACS server. To do so, go to Step 3.

Action                                                          Command
Check the domain configuration.                                 display domain
Check which HWTACACS server template is bound to the domain.    display domain name domain-name
Check the authentication scheme bound to the domain.            display authentication-scheme
Check the authorization scheme bound to the domain.             display authorization-scheme
Check the accounting scheme bound to the domain.                display accounting-scheme
Check the configuration of the HWTACACS server template.        display hwtacacs-server template

Step 3 Check information about the HWTACACS packets sent and received by the S6700.

Run the debugging hwtacacs all command in the user view to enable HWTACACS packet debugging. Check whether any HWTACACS packets are sent and received by the S6700.

<Quidway> debugging hwtacacs all

<Quidway> terminal debugging

<Quidway> terminal monitor

CAUTION

Debugging affects system performance. After debugging is complete, run the undo debugging all command immediately to disable debugging.

l If no debugging information is displayed, the NAS configuration is incorrect. Check that the HWTACACS server template is applied to the domain.

The following configuration file shows that the HWTACACS server template hwtacacs is bound to the domain huawei.

#
hwtacacs-server template hwtacacs
hwtacacs-server authentication 2.2.2.2
#
aaa

authentication-mode hwtacacs

domain default
domain default_admin
domain huawei

hwtacacs-server hwtacacs
#

l If debugging information is displayed, proceed according to the debugging information.

Debugging Information:

Nov 10 2010 15:43:35.500.6 Quidway TAC/7/Event:HandleReqMsg: Session status is not connect now.
Nov 10 2010 15:43:35.500.7 Quidway TAC/7/Event:statistics: transmit flag: 1-SENDPACKET, server flag: 0authentication, packet flag: 0xff
Nov 10 2010 15:43:35.550.1 Quidway TAC/7/Event:HandleResp: Session status is connect now.
Nov 10 2010 15:43:35.550.2 Quidway TAC/7/Event: Tac packet sending success! version:c0 type:1-authentication sequence:1 flag:1-UNENCRYPTED_FLAG session id:908 length:24 serverIP: 10.138.88.209 vrf:0

Solution:

The HWTACACS module sent an authentication packet. This message indicates that the S6700 can send HWTACACS authentication packets.

Debugging Information:

Nov 10 2010 15:49:18.430.6 Quidway TAC/7/Event:HandleReqMsg: Session status is not connect now.
Nov 10 2010 15:49:18.430.7 Quidway TAC/7/Event:statistics: transmit flag: 1-SENDPACKET, server flag: 0authentication, packet flag: 0xff
Nov 10 2010 15:49:18.480.2 Quidway TAC/7/Event:HandleResp: Session status is connect now.
Nov 10 2010 15:49:18.480.3 Quidway TAC/7/Event: Tac send packet error!

Solution:

The HWTACACS authentication server did not send any authentication response packet. This may be because the link between the S6700 and the HWTACACS server is Down, the HWTACACS server has not restarted, or the HWTACACS server fails.

In this case, check that the NAS IP address and HWTACACS service port numbers configured on the HWTACACS server are the same as those configured on the NAS, and that the HWTACACS service is enabled.

Debugging Information:

Nov 10 2010 16:02:35.760.1 Quidway TAC/7/Event: version:c0 type:AUTHEN_REPLY seq_no:6 flag:UNENCRYPTED_FLAG session_id:0x4ff8 length:6 pstPacketAll->ulDataLen:6 pstAuthenReply:ucStatus=2 ucflags=0 usServerMsgLen=0 usDataLen=0 status:AUTHEN_STATUS_FAIL flag:REPLY_FLAG_ECHO server_msg len:0 data len:0 server_msg: data:

Solution:

The HWTACACS server returned an authentication failure packet. The possible causes of authentication failure are:

l The NAS IP address and the shared key are not configured on the HWTACACS server.

l The shared key configured on the HWTACACS server is different from the shared key configured on the NAS.

l The user account is not configured on the HWTACACS server, or the user name format configured in the HWTACACS server template is different from that on the HWTACACS server. For example, the NAS sends the user name without the domain name but the HWTACACS server requires the user name with the domain name.

l The password entered by the user is different from the password configured on the HWTACACS server.

If any of the preceding errors exist, modify the configuration on the HWTACACS server. After modifying the configuration, check whether the user can pass the authentication. If the fault persists, go to step 4.
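The three debugging cases in this step can be sketched as a small log classifier. The Python sketch below is illustrative only and is not part of the S6700 software: the function name classify_hwtacacs_debug and the returned labels are assumptions, and the matched strings are copied from the sample debugging output in this step.

```python
def classify_hwtacacs_debug(lines):
    """Map captured HWTACACS debugging lines to one of the cases in step 3.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    text = "\n".join(lines)
    if not text.strip():
        # No output at all: check that the template is applied to the domain.
        return "no-debug-output"
    if "AUTHEN_STATUS_FAIL" in text:
        # The server answered, but rejected the authentication request.
        return "server-rejected"
    if "Tac send packet error!" in text:
        # The request could not be delivered, so no response was received.
        return "send-error"
    if "Tac packet sending success!" in text:
        # The S6700 is able to send HWTACACS authentication packets.
        return "request-sent"
    return "unknown"
```

A captured log can be passed in as a list of lines. The order of the checks matters: a rejection from the server is reported even if an earlier line shows a successful send.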

Step 4 Check whether the number of online users reaches the maximum.

Both the NAS and HWTACACS server have a limit on the number of online users. Run the display access-user command on the S6700 to check the number of online users.

l If the number of online users reaches the maximum, you do not need to take any action. The user can log in after the number of online users falls below the maximum.

l If the number of online users does not reach the maximum, check the maximum number of online users set on the HWTACACS server. If the maximum number of online users set on the HWTACACS server is not reached, go to step 5.

Step 5 Check the user type.

l If the user is a Telnet user or an FTP user, rectify the fault according to the section on Telnet login failures or "2.6.1 The User Fails to Log in to the Server Through FTP."

l If the user is a network access user, rectify the fault according to "8.3 NAC Troubleshooting."

Step 6 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log file, and alarm file of the S6700

----End


Relevant Alarms

None.

Relevant Logs

None.


This section presents several AAA troubleshooting cases.


Users Are Forced Offline 10-plus Seconds After They Log In

Fault Symptom


As shown in Figure 8-3, users access the network through Switch B, which authenticates, authorizes, and charges the users.

Originally, Switch B uses the RADIUS protocol to perform authentication and accounting. After the RADIUS server fails, the administrator adopts local authentication temporarily.

Figure 8-3 Networking diagram of user access

[Diagram: users in the domain huawei access the destination network through SwitchA, the network, and SwitchB. The addresses 129.7.66.66/24 and 129.7.66.67/24 are labeled.]



After the configurations, users are forced offline 10-plus seconds after they log in.

Fault Analysis

1. Run the display trapbuffer and display logbuffer commands on Switch B to check whether a trap or a log indicating that users are forced offline is recorded. The following trap information is displayed:

AAA cut user!

2. Run the display current-configuration command on Switch B to check the AAA configuration. The command output shows that local authentication and RADIUS accounting are adopted. Details are as follows:

radius-server template provera
radius-server shared-key xxxxxx

radius-server accounting 129.7.66.66 1646
undo radius-server user-name domain-included
#
aaa
local-user telenor password cipher xxxxxxx

#
authentication-scheme provera
authentication-mode radius local
#

#

accounting-scheme provera
accounting-mode radius
accounting realtime 10
#
domain default
#
domain huawei
authentication-scheme provera
accounting-scheme provera
radius-server provera
#
user-interface vty 0 4

set authentication password cipher xxxxxxx
history-command max-size 256
screen-length 15

Because the RADIUS server is unavailable, real-time accounting fails. You can run the accounting interim-fail command to configure a real-time accounting failure policy, which determines whether users are kept online or forced offline after real-time accounting fails. If the accounting interim-fail command is not configured, Switch B adopts the default setting, that is, forcing users offline when real-time accounting fails.

It can therefore be concluded that RADIUS accounting causes users to be forced offline.

The period after which login users are forced offline is determined by the retransmission timeout period and the number of retransmissions, which are configured by using the radius-server timeout and radius-server retransmit commands respectively. By default, a packet is retransmitted every 5 seconds, up to three times. If retransmission has still failed 15 seconds after login (5 seconds x 3), users are forced offline.
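The arithmetic behind the 15-second figure can be written out as a short Python sketch. The function name forced_offline_delay and its parameter names are assumptions made for illustration; the default values are the ones stated in this section.

```python
def forced_offline_delay(timeout_s=5, retransmit_times=3):
    """Return the approximate delay, in seconds, before users are forced offline.

    timeout_s: retransmission timeout (radius-server timeout), default 5 s.
    retransmit_times: number of retransmissions (radius-server retransmit), default 3.
    """
    return timeout_s * retransmit_times
```

With the defaults the result is 15, which matches the "10-plus seconds" observed in the fault symptom. Changing either value changes the delay proportionally.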






Procedure


Step 1 Run the system-view command to enter the system view.

Step 2 Run the aaa command to enter the AAA view.

Step 3 Run the domain huawei command to enter the view of the domain huawei.

Step 4 Run the undo accounting-scheme provera command to configure the default accounting scheme for users in the domain, that is, non-accounting.

You can use any of the following methods to clear the fault:

l Run the accounting-mode none command to change the accounting mode to non-accounting.

Administrator users such as Telnet users and FTP users are not charged, so you can change their accounting mode to non-accounting.

l Run the accounting interim-fail online command to keep users online when real-time accounting fails.

l Run the undo accounting-scheme provera command to restore the default accounting scheme for the domain, that is, non-accounting.

In this troubleshooting case, Switch B mainly authenticates Telnet users that do not need to be charged, so the non-accounting scheme is applicable. You can run the undo accounting-scheme provera command to configure the non-accounting scheme.

After the preceding configurations, users can log in without being forced offline. The fault is cleared.

----End

Summary

On the access network using AAA authentication, if the remote server is unavailable and local authentication is adopted, the accounting scheme must be non-accounting. Otherwise, users are forced offline.

A User Cannot Pass the HWTACACS Authentication with Valid User Name and Password

Fault Symptom

As shown in Figure 8-4, the four switches at the core layer are in the same autonomous system (AS). They are configured with the Interior Border Gateway Protocol (IBGP), Intermediate System to Intermediate System (IS-IS), AAA, QoS, and the Simple Network Management Protocol (SNMP). The customer wants to configure a private AS number on the switches, replace IBGP with the Exterior Border Gateway Protocol (EBGP), and replace IS-IS with Open Shortest Path First (OSPF). The IS-IS routing table contains only the routes to the IP addresses of connected interfaces and loopback interfaces.

Figure 8-4 HWTACACS authentication at the core layer

[Diagram: SwitchA, SwitchB, SwitchC, and SwitchD, each with a Loopback0 interface (the address 202.97.30.227/32 is labeled), and a TACACS server at 202.102.216.245/24.]

After the configuration, the user fails to pass the Huawei Terminal Access Controller Access-Control System (HWTACACS) authentication by using the valid user name and password.

Fault Analysis

1. Check the user name and password configured on the HWTACACS server. The configured user name and password are the same as those entered by the user.

2. Run the ping command on SwitchA to ping the HWTACACS server. The ping operation is successful.

3. Run the display current-configuration command on SwitchA to check the HWTACACS configuration. The following configuration is displayed in the HWTACACS server template:

hwtacacs-server source-ip 202.97.30.227

In the preceding information, 202.97.30.227 is the IP address of the loopback interface on SwitchA.

This IP address is contained in the IS-IS routing table and is used as the source IP address of HWTACACS packets sent by SwitchA. The IS-IS configuration has been deleted; therefore, the authentication response packets that the HWTACACS server sends to 202.97.30.227 cannot reach SwitchA. This may be the cause of the HWTACACS authentication failure.

4. Run the ping -a 202.97.30.227 202.102.216.245 command on SwitchA to check whether the HWTACACS server (202.102.216.245) can be pinged with the loopback interface address as the source address. The ping operation fails.

5. Run the display ip routing-table command on SwitchA. The command output shows that the IP address of this loopback interface is not advertised by the OSPF protocol.

According to the preceding information, you can confirm that the authentication fails because the IS-IS configuration is deleted and the OSPF protocol does not advertise the loopback interface address.



Procedure


Step 1 Run the system-view command to enter the system view.

Step 2 Run the ospf process-id command to enter the OSPF view.

Step 3 Run the area area-id command to enter the OSPF area view.

Step 4 Run the network address wildcard-mask command to advertise the IP address of the loopback interface.

After the configuration is complete, the user can log in by using the user name and password.
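Before applying a network statement, it can help to confirm that the address and wildcard mask actually cover the loopback address. The following Python sketch shows the general wildcard-mask matching rule (bits set to 1 in the wildcard mask are ignored). The function name covered_by_network is an assumption for illustration, and the sketch does not model any other condition that OSPF applies.

```python
import ipaddress


def covered_by_network(ip, address, wildcard_mask):
    """Return True if ip matches address under the given wildcard mask.

    Bits that are 1 in the wildcard mask are "don't care" bits; all other
    bits of ip and address must be equal.
    """
    ip_int = int(ipaddress.IPv4Address(ip))
    addr_int = int(ipaddress.IPv4Address(address))
    wild_int = int(ipaddress.IPv4Address(wildcard_mask))
    care_bits = ~wild_int & 0xFFFFFFFF
    return (ip_int & care_bits) == (addr_int & care_bits)
```

For the loopback address in this case, covered_by_network("202.97.30.227", "202.97.30.227", "0.0.0.0") is True, whereas a statement that covers only the server segment (202.102.216.0 with wildcard 0.0.0.255) does not match it.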

----End

Summary

Before modifying the routing protocol configuration, record the current configuration. After modifying the configuration, check whether the new configuration meets the network requirements and whether the modification has impacts on other configurations.

A Telnet User Fails to Log In Because the User Account Is Not Configured on the RADIUS Server

Fault Symptom

On the S6700, 802.1x is enabled and the authentication mode is set to Remote Authentication Dial In User Service (RADIUS) authentication. After the configuration, 802.1x users pass the authentication successfully, but a Telnet user fails to log in to the S6700 by using the local user account.

Fault Analysis

1. The 802.1x users pass the authentication, indicating that the link between the S6700 and the RADIUS server works properly.

2. Run the display current-configuration command on the S6700 to check the current configuration.

......

dot1x enable

#

radius-server template remote

radius-server shared-key 123456


radius-server accounting 192.168.1.27 1646

#

......

interface XGigabitEthernet0/0/1

port hybrid pvid vlan 10

dot1x enable

dot1x max-user 1

dot1x port-method port

dot1x reauthenticate

......

aaa


authentication-scheme cams

authentication-mode radius

#







authorization-scheme cams

authorization-mode none

#


accounting-scheme account

#

domain default

authentication-scheme cams

authorization-scheme cams

accounting-scheme cams

radius-server remote

#

......

#

user-interface maximum-vty 15

user-interface con 0

user-interface vty 0 14



idle-timeout 0 0

#

The preceding information indicates that the user domain is default, the authentication mode is RADIUS authentication, and the authorization mode is none. The 802.1x users use port-based 802.1x authentication. The Telnet user fails in the RADIUS authentication. The possible cause is that the user name and password of the Telnet user are not configured on the RADIUS server.

3. Check the configuration of the RADIUS server. The user name and password of the Telnet user are not found on the RADIUS server.

To rectify the fault, add the user name and password of the Telnet user to the RADIUS server, or set the authentication mode of the Telnet user to local authentication.

Procedure

l Add the user name and password of the Telnet user to the RADIUS server. For the configuration procedure, see the configuration guide of the RADIUS server.

l Set the authentication mode of the Telnet user to local authentication on the S6700.

Create a new domain for the Telnet user.


[Quidway] aaa

[Quidway-aaa] domain telnet

[Quidway-aaa-domain-telnet]

Use the default authentication, authorization, and accounting schemes in the domain, that is, local authentication, local authorization, and no accounting.

<Quidway> display domain name telnet

Domain-name : telnet

Domain-state : Active

Authentication-scheme-name : default

Accounting-scheme-name : default

Authorization-scheme-name : -

Service-scheme-name : -

RADIUS-server-template : -

HWTACACS-server-template : -

<Quidway> display authentication-scheme default

Authentication-scheme-name : default






Authentication-method : Local

Authentication-super method : Super authentication-super

<Quidway> display authorization-scheme default

---------------------------------------------------------------------------

Authorization-scheme-name : default

Authorization-method : Local

......

<Quidway> display accounting-scheme default

Accounting-scheme-name : default

Accounting-method : None

Create a local user whose user name contains the domain name. The Telnet user needs to enter the domain name for authentication.


[Quidway] aaa

[Quidway-aaa] local-user telnetuser@telnet password simple 123456

[Quidway-aaa] local-user telnetuser@telnet service-type telnet
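The role of the domain name in the login name can be illustrated with a short Python sketch. It only shows how a name such as telnetuser@telnet separates into a user part and a domain part. The function name split_user_domain is hypothetical, and the fallback to the domain default for names without "@" is an assumption based on the configuration analyzed in this case.

```python
def split_user_domain(login_name, default_domain="default"):
    """Split 'user@domain' into (user, domain).

    A name without '@' is assigned to default_domain (assumption for this
    sketch; in the case above such users were handled in the domain default).
    """
    user, separator, domain = login_name.rpartition("@")
    if not separator:
        return login_name, default_domain
    return user, domain
```

Here split_user_domain("telnetuser@telnet") yields the domain telnet, which uses local authentication, while split_user_domain("telnetuser") yields the domain default, which uses RADIUS authentication in this case.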

----End

Summary

You are advised to use different authentication modes for access users (such as 802.1x users), Telnet users, and Secure Shell (SSH) users. When a Telnet user fails to log in to the S6700, the possible cause is that an incorrect authentication scheme is configured in the VTY user interface view and AAA view of the S6700, or on the remote authentication server.

8.2 ARP Security Troubleshooting

This chapter describes common causes of ARP faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.2.1 The ARP Entry of an Authorized User Is Modified Maliciously

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for malicious modification to the ARP entry of an authorized user.

Common Causes

This fault is commonly caused by the following:

l An attacker sends bogus ARP packets to modify the ARP entry of the authorized user.

An authorized user is disconnected from the Internet, but the links and routes are normal. The possible cause is that an attacker sends bogus ARP packets to modify the ARP entry of the user on the gateway. As a result, this user is disconnected from the network.

Figure 8-5 shows the troubleshooting flowchart.



Figure 8-5 Troubleshooting flowchart for malicious modification to the ARP entry of an authorized user

[Flowchart decision points: Is ARP anti-spoofing configured? If not, configure ARP anti-spoofing. Check the type of ARP anti-spoofing (fixed-mac mode, fixed-all mode, or send-ack mode). Does the switch send ARP requests? Does the switch receive ARP replies? Is the network connection normal? If not, rectify the link fault. Are ARP replies discarded by CPCAR? If so, increase the CIR value. Is the MAC address changed? End.]



Procedure

Step 1 Run the display arp anti-attack configuration entry-check command on the S6700 to check that ARP anti-spoofing is enabled.

l If the following information is displayed, ARP anti-spoofing is not enabled.

ARP anti-attack entry-check mode: disabled

Run the arp anti-attack entry-check { fixed-mac | fixed-all | send-ack } enable command to enable ARP anti-spoofing.

NOTE

Before enabling ARP anti-spoofing, run the reset arp interface vlanif vlan-id command to delete the ARP entries learned by the user-side interface.

l If the mode of ARP anti-spoofing is set to send-ack, go to step 2.

l If the mode of ARP anti-spoofing is set to fixed-mac, go to step 3.

l If the mode of ARP anti-spoofing is set to fixed-all, go to step 4.

Step 2 Perform the following steps to locate the fault in send-ack mode.

1. Capture packets on the user-side interface by configuring port mirroring. If the S6700 does not send any ARP request, go to step 4.

2. If the S6700 sends ARP requests but does not receive any ARP reply, check that the network connection between the S6700 and the user is normal.

3. If the S6700 receives ARP reply packets from the user, run the display cpu-defend statistics packet-type arp-reply command to check statistics about ARP reply packets. If the number of dropped ARP reply packets keeps increasing, the possible cause is that the rate of ARP reply packets exceeds the CPCAR. In this case, increase the committed information rate (CIR) by using the car command.

4.

Step 3 Run the display arp all | include ip-address command to check the modified information in the ARP entry.

If the interface number or VLAN ID is changed, you do not need to take any action because it is normal in fixed-mac mode. If the MAC address is changed, go to step 4.
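For the send-ack check in step 2, "the number of dropped ARP reply packets keeps increasing" means comparing the drop counter across several readings of the display cpu-defend statistics output. The Python sketch below assumes that the counter values have already been read and collected by hand into a list; the function name drops_increasing is hypothetical.

```python
def drops_increasing(drop_counts):
    """Return True if every reading of the drop counter is higher than the last.

    drop_counts: drop counter values taken at successive times, oldest first.
    At least two readings are needed to detect any increase.
    """
    if len(drop_counts) < 2:
        return False
    return all(later > earlier
               for earlier, later in zip(drop_counts, drop_counts[1:]))
```

A steadily growing series such as [10, 15, 22] suggests that the rate of ARP replies exceeds the CPCAR, whereas a flat series such as [10, 10, 10] does not.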



----End


Relevant Alarms

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.2

Relevant Logs

None.






8.2.2 The Gateway Address Is Changed Maliciously

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for gateway address spoofing attacks.

Common Causes

This fault is commonly caused by one of the following:

l An attacker sends bogus gratuitous ARP packets to users. Users change their gateway address after receiving the gratuitous ARP packets.

l An attacker sends bogus ICMP unreachable packets or ICMP redirect packets to users.

An attacker sends gratuitous ARP packets with the source IP address being the IP address of the gateway on the LAN. After receiving the gratuitous ARP packets, hosts on the LAN change their gateway MAC address to the MAC address of the attacker. As a result, the hosts cannot access the network.

Figure 8-6 shows the troubleshooting flowchart.


Figure 8-6 Troubleshooting flowchart for gateway address spoofing

[Flowchart: The gateway address is changed maliciously. Does the switch function as the gateway? If not, configure the switch as the gateway. Is ARP gateway anti-collision configured? If not, configure gateway anti-collision. Are gateway anti-collision entries generated? If so, configure a policy to discard attack packets. End.]


Procedure

Step 1 Check that the S6700 functions as the gateway. If the S6700 is not the gateway, the gateway anti-collision function does not take effect.

You can use either of the following methods to check whether the S6700 is the gateway:

l Run the display arp all command to view the type of the ARP entry corresponding to the gateway IP address.

If the ARP entry type is displayed as I -, the gateway IP address is an interface address on the S6700.

VLAN

------------------------------------------------------------------------------

1.1.1.1 0022-0033-0044 I - Vlanif10

l Run the display ip routing-table gateway address command to check whether a route to the gateway address exists.

If a route to the gateway address is displayed in the command output, the S6700 is the gateway.

<Quidway> display ip routing-table 1.1.1.1


---------------------------------------------------------------------


Summary Count : 1


1.1.1.1/24 Direct 0 0 D 127.0.0.1 InLoopback0

If the S6700 is not the gateway, configure it as the user gateway.

Step 2 Run the display arp anti-attack configuration gateway-duplicate command to check that ARP gateway anti-collision is enabled.

If ARP gateway anti-collision is not enabled, run the arp anti-attack gateway-duplicate enable command to enable this function.

Step 3 Run the display arp anti-attack gateway-duplicate item command to check the anti-collision entries.

l If an entry is displayed, you can find the IP address, MAC address, and source interface of the attacker from the entry. Add the attacker to the blacklist or configure a blackhole MAC entry according to attacker information. Subsequently, packets from the attacker will be discarded.

l If no entry is displayed, go to step 4.


Step 4 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log file, and alarm file of the S6700

----End

Relevant Alarms

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.1

Relevant Logs

None.

8.2.3 User Traffic Is Interrupted by a Large Number of Bogus ARP Packets

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for traffic interruption caused by a large number of bogus ARP packets.

Common Causes

This fault is commonly caused by the following:

l An attacker sends a large number of bogus ARP requests, thus increasing the load of the destination network segment. If Layer 3 interfaces are configured on the S6700, the ARP requests are sent to the CPU, causing a high CPU usage. DoS attacks may also be initiated in this case.

The S6700 uses the CPCAR mechanism to limit the rate of ARP requests sent to the CPU. If an attacker sends a large number of bogus ARP requests, valid ARP requests are also discarded when the bandwidth limit is exceeded. Consequently, user traffic is interrupted.

Figure 8-7 shows the troubleshooting flowchart.



Figure 8-7 Troubleshooting flowchart for traffic interruption caused by bogus ARP packets

[Flowchart: User traffic is interrupted by ARP attack packets. Do user ARP entries exist? Are ARP packets discarded by CPCAR? Is the CPU usage of the switch high? If not, increase the rate limit for ARP requests. If so, find the attack source and discard attack packets. End.]


Procedure

Step 1 Run the display arp all command on the S6700 to view ARP entries of authorized users.

l If ARP entries of authorized users are displayed, the S6700 has learned the ARP entries, and traffic interruption is caused by a short link disconnection. In this case, rectify link faults.

l If no ARP entry is displayed, go to step 2.

Step 2 Run the display cpu-defend statistics packet-type arp-request command to view the statistics about ARP requests.



l If the count of dropped ARP requests is 0, go to step 8.

l If the count of dropped ARP requests is not 0, the rate of ARP requests exceeds the CPCAR rate limit and excess ARP requests are discarded. Go to step 3.

Step 3 Run the display cpu-usage command to check the CPU usage of the main control board.

l If the CPU usage is smaller than 70% but ARP requests are discarded, the rate limit is too small. Go to step 4.

l If the CPU usage exceeds 70%, the CPU may be attacked by ARP packets. Go to step 5.
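The branching in steps 2 and 3 can be summarized as a small decision function. The Python sketch below is illustrative: the function name next_step is hypothetical, the inputs are assumed to have been read manually from the display cpu-defend statistics and display cpu-usage outputs, and a CPU usage of exactly 70% is treated like the "smaller than 70%" case because the procedure does not specify it.

```python
def next_step(dropped_arp_requests, cpu_usage_percent):
    """Return the number of the step to go to after steps 2 and 3.

    dropped_arp_requests: count of ARP requests dropped by CPCAR.
    cpu_usage_percent: CPU usage of the main control board, in percent.
    """
    if dropped_arp_requests == 0:
        # Nothing is dropped by CPCAR.
        return 8
    if cpu_usage_percent > 70:
        # The CPU may be under an ARP attack: locate the attack source.
        return 5
    # Requests are dropped although the CPU is not busy: the limit is too small.
    return 4
```

For example, 1200 dropped requests at 35% CPU usage lead to step 4 (increase the rate limit), while the same drop count at 92% CPU usage leads to step 5.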

Step 4 Run the car command to increase the rate limit for ARP requests.

Run the car command in the attack defense policy view and apply the attack defense policy.

Step 5 Capture packets on the user-side interface, and find the attacker according to the source addresses of ARP requests.

If a lot of ARP requests are sent from a source address, the S6700 considers the source address as an attack source. Then add the source address to the blacklist or configure a blackhole MAC address entry to discard ARP requests sent by the attacker.
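Finding the source that sends most of the ARP requests in a capture amounts to counting requests per source address. The Python sketch below assumes the source addresses have already been extracted from the captured packets into a list. The function name top_arp_sources and the threshold parameter are hypothetical, and a suitable threshold depends on the network.

```python
from collections import Counter


def top_arp_sources(source_addresses, threshold):
    """Return (address, count) pairs for sources with at least threshold requests.

    source_addresses: one entry per captured ARP request (source IP or MAC).
    The result is sorted from the most active source to the least active.
    """
    counts = Counter(source_addresses)
    return [(addr, n) for addr, n in counts.most_common() if n >= threshold]
```

Addresses returned by this function are candidates for the blacklist or a blackhole MAC address entry, after confirming that they do not belong to a legitimate device such as a server.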

Step 6 Run the arp speed-limit source-ip [ ip-address ] maximum maximum command or arp speed-limit source-mac [ mac-address ] maximum maximum command in the system view to set the rate limit for ARP packets from the attack source.

By default, ARP packet suppression based on source IP address and ARP packet suppression based on source MAC address are disabled.

Step 7 Run the display arp anti-attack configuration log-trap-timer command to check whether the ARP log and trap functions are enabled.

By default, the ARP log function is disabled. Run the arp anti-attack log-trap-timer timer command in the system view to enable the ARP log and trap functions. In this way, the S6700 will record logs and send traps when ARP attacks occur.



----End


Relevant Alarms

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.3

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.4

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.5

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.6

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.7

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.11

Relevant Logs

None.






8.2.4 IP Address Scanning Occurs

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for IP address scanning attacks.

Common Causes

This fault is commonly caused by the following:

l An attacker sends a large number of destination unreachable packets to the S6700, and the packets trigger a large number of ARP Miss messages. In addition, the S6700 sends ARP requests to trigger ARP learning, causing a high CPU usage.

An attacker sends a large number of destination unreachable packets to the S6700. The packets are sent to the CPU and trigger a large number of ARP Miss messages. In addition, the S6700 sends ARP requests to trigger ARP learning, causing a high CPU usage.

Figure 8-8 shows the troubleshooting flowchart.


Figure 8-8 Troubleshooting flowchart for IP address scanning

[Flowchart: An IP address scanning attack causes a high CPU usage. Is ARP Miss suppression configured? If not, configure ARP Miss suppression. Is the rate limit for ARP Miss messages too large? If so, reduce the rate limit. End.]


Procedure

Step 1 Run the display cpu-usage command on the S6700 to check the CPU usage of the main control board.

In the command output, ARQ indicates the ARP packet processing task.

Step 2 Run the display arp all command to view the learned ARP entries.

If the MAC address in an ARP entry is in Incomplete state, the S6700 fails to learn the ARP entry.



VLAN

---------------------------------------------------------------------

10.10.10.12 0018-82d2-0e08 I - Vlanif200

10.10.10.13 Incomplete 0 D-0 Vlanif200

3004/-

10.10.10.14 Incomplete 0 D-0 Vlanif200

3004/-

20.20.20.33 000c-76bd-43d6 I - Vlanif300

20.20.20.55 0013-7227-842f 17 D-0 Vlanif300

... 3003/-

Generally, the possible causes are: the S6700 fails to send ARP requests, the ARP requests are discarded during transmission, or no ARP reply is received. If the CPU usage of the ARQ task is high, the S6700 fails to send ARP requests and generates ARP Miss messages. Go to step 3.
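When the ARP table is large, the entries in Incomplete state can be picked out of the saved display arp all output with a few lines of Python. The sketch below assumes the column layout shown in the sample output in this step (IP address first, MAC address or Incomplete second). The function name incomplete_arp_ips is hypothetical.

```python
def incomplete_arp_ips(arp_output):
    """Return the IP addresses whose ARP entry is in Incomplete state.

    arp_output: text of the display arp all output, one entry per line.
    Lines with fewer than two fields (such as VLAN continuation lines)
    are skipped.
    """
    addresses = []
    for line in arp_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "Incomplete":
            addresses.append(fields[0])
    return addresses
```

Applied to the sample output in this step, the function returns 10.10.10.13 and 10.10.10.14, which are the addresses the S6700 failed to resolve.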

Step 3 Capture packets on the user-side interface and check the source addresses of IP packets.

Step 4 Run the display arp anti-attack configuration arpmiss-speed-limit command to view the configuration of ARP Miss suppression.

l If a source IP address is specified in the ARP Miss suppression command, the S6700 checks whether the specified IP address is the source address of the received IP packets. If so, the S6700 limits the rate of ARP Miss messages based on the rate limit configured in this command. If not, the S6700 limits the rate of the ARP Miss messages based on the limit set in the command without a source IP address specified.

l By default, ARP Miss suppression is enabled, and the maximum rate of ARP Miss messages is limited to 500 pps. When the rate of ARP Miss messages triggered by packets from the specified IP address exceeds the limit, the S6700 discards the packets sent from the IP address. You can change the rate limit for ARP Miss messages by running the arp-miss speed-limit source-ip [ ip-address ] maximum maximum command in the system view.
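The way the source-based limit is selected in this step can be modeled as a lookup with a fallback. The Python sketch below is a simplified model, not the device implementation: the function name arp_miss_limit and the dictionary of per-address limits are hypothetical, and the default of 500 pps is the value stated in this step.

```python
def arp_miss_limit(source_ip, per_ip_limits, general_limit=500):
    """Return the ARP Miss rate limit (pps) that applies to source_ip.

    per_ip_limits: limits configured with a specific source IP address.
    general_limit: limit configured without a source IP address
    (500 pps by default).
    """
    return per_ip_limits.get(source_ip, general_limit)
```

With per_ip_limits set to {"10.10.10.2": 200}, packets from 10.10.10.2 are limited to 200 pps, and packets from any other source fall back to the general limit.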

Step 5 Run the display arp anti-attack configuration arpmiss-rate-limit command on the S6700 to view the configuration of ARP Miss suppression.

l If many ARP Miss packets are triggered on an interface, in a VLAN, or on the entire device in a period, the S6700 is busy in broadcasting ARP request packets and its performance deteriorates. After ARP Miss suppression is configured, the S6700 counts ARP Miss packets generated within a specified period and discards excess ARP Miss packets.

l By default, the maximum rate of ARP Miss packets is 100 packets per second. To change the rate limit of ARP Miss packets, run the arp-miss anti-attack rate-limit packet-number [ interval-value ] command in the system view, VLAN view, or interface view.

Step 6 Run the display arp anti-attack configuration log-trap-timer command to check whether the ARP log and trap functions are enabled.

By default, the ARP log function is disabled. Run the arp anti-attack log-trap-timer timer command in the system view to enable the ARP log. In this way, the S6700 will record logs and send traps when ARP attacks occur.



----End


Relevant Alarms

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.8

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.9

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.10

l 1.3.6.1.4.1.2011.5.25.165.2.2.2.12

Relevant Logs

None.

8.2.5 ARP Learning Fails

This section provides a step-by-step troubleshooting procedure for an ARP learning failure on the S6700.

Common Causes

The following table describes the possible causes of an ARP learning failure (assuming that the S6700 sends an ARP request to trigger ARP learning).

Condition: The ARP request is not sent out.
Possible Cause: A large number of ARP requests are triggered by ARP Miss messages, and the S6700 has not processed this ARP request.

Condition: The remote device does not receive the ARP request.
Possible Cause: The link between the S6700 and the remote device is faulty, so the ARP request is discarded on the network.

Condition: The remote device receives the ARP request but discards it.
Possible Cause: The remote device receives a large number of ARP packets. The rate of ARP packets exceeds the CAR, so the device discards the ARP request sent by the S6700.

Condition: The S6700 does not receive the ARP reply sent by the remote device.
Possible Cause: The link between the S6700 and the remote device is faulty, so the ARP reply is discarded on the network.

Condition: The S6700 receives the ARP reply but does not send it to the CPU.
Possible Cause: The rate of ARP packets received by the S6700 exceeds the CPCAR or ARP packet rate limit, so the ARP reply is discarded.

Condition: The ARP reply is sent to the CPU but is discarded.
Possible Cause: The ARP module of the S6700 is faulty.


Figure 8-9 Troubleshooting flowchart for ARP learning failure

[Flowchart not reproduced. Decision points, in order: Does the link between the switch and the remote device function properly? Does the switch process ARP packets correctly? Does the remote device process ARP packets correctly? After each corrective action, check whether the fault is rectified.]





Procedure

Step 1 Check that the link between the S6700 and the remote device works properly.

l Perform ping operations between the S6700 and the remote device. If the ping operations fail, check the routing configuration on the two devices.

l View traffic statistics on the two devices to check whether packets are discarded on the link.

If any device on the link does not support the traffic statistics function, perform a ping test to check whether packets are discarded on the device. If packets are discarded on the link, rectify the link fault.

Step 2 Check that the S6700 processes ARP packets properly.

Run the debugging arp packet interface interface-type interface-number command in the user view to enable ARP debugging. Check whether information about ARP request and ARP reply packets is displayed.

NOTE

In the debugging information, the operation field indicates the ARP packet type. The value 1 indicates ARP request packets and the value 2 indicates ARP reply packets.

l If the S6700 does not send any ARP request packet, rectify the fault according to 8.2.4 IP Address Scanning Occurs.

l If the S6700 does not receive any ARP reply packet, the ARP reply packets sent by the remote device may be discarded by the CPCAR mechanism. Go to step 3.

l If the S6700 receives ARP reply packets, go to step 5.
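When ARP packets are captured on the link instead of being read from debugging output, the same operation field from the NOTE in step 2 can be read from the raw packet. The following sketch assumes that the input is the ARP payload without the Ethernet header; the function name and sample bytes are illustrative.

```python
import struct

ARP_OPERATIONS = {1: "request", 2: "reply"}


def arp_operation(arp_payload: bytes) -> str:
    """Return the ARP packet type from a raw ARP payload (no Ethernet header)."""
    if len(arp_payload) < 8:
        raise ValueError("ARP payload too short")
    # Fixed ARP header: hardware type, protocol type, hardware address length,
    # protocol address length, operation (all in network byte order)
    _htype, _ptype, _hlen, _plen, oper = struct.unpack("!HHBBH", arp_payload[:8])
    return ARP_OPERATIONS.get(oper, f"other ({oper})")


# Ethernet/IPv4 ARP headers followed by zeroed address fields
sample_request = bytes.fromhex("0001080006040001") + bytes(20)
sample_reply = bytes.fromhex("0001080006040002") + bytes(20)
print(arp_operation(sample_request))  # request
print(arp_operation(sample_reply))    # reply
```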

Step 3 Check whether ARP reply packets are discarded.

l Run the display cpu-defend arp-reply statistics command to view statistics about ARP reply packets.

If the Drop value keeps increasing, the rate of ARP reply packets exceeds the CPCAR. Run the car command to increase the CPCAR for ARP reply packets.

l Run the display this command in the interface view, VLAN view, and system view to check whether a rate limit is set for ARP packets.

If the rate limit is set and the rate of ARP packets is high, ARP reply packets may be discarded.

Run the arp anti-attack rate-limit command to increase the rate limit.
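Deciding whether the Drop value "keeps increasing" requires several readings taken a few seconds apart. The following sketch assumes that the Drop values have already been copied by hand from repeated runs of the display command; the function name and the sample numbers are hypothetical.

```python
def drop_is_increasing(samples):
    """Return True if every Drop reading is larger than the one before it."""
    if len(samples) < 2:
        raise ValueError("need at least two readings")
    return all(later > earlier for earlier, later in zip(samples, samples[1:]))


# Made-up Drop readings taken at regular intervals
print(drop_is_increasing([120, 180, 260]))  # True: replies are still being dropped
print(drop_is_increasing([120, 120, 120]))  # False: no new drops since the first reading
```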

Step 4 Check that the remote device processes ARP packets properly.

Check that the remote device receives the ARP request and sends the ARP reply.

If the remote device is a Huawei device, perform step 2 on the device. If the remote device is a non-Huawei device, see the manual of the device.





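Step 5 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration files, log files, and alarm files of the devices

----End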

Relevant Alarms

None.

Relevant Logs

None.

8.3 NAC Troubleshooting

This chapter describes common causes of the network admission control (NAC) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.3.1 802.1x Authentication of a User Fails

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the 802.1x authentication failure.

Common Causes

This fault is commonly caused by one of the following:

l Some parameters are set incorrectly or not set, such as the parameters of 802.1x authentication, AAA authentication domain, authentication server, and authentication server template.

l The user name or password entered by the user is incorrect.

l The number of online users reaches the maximum.


A user fails to pass the 802.1x authentication.

Figure 8-10 Troubleshooting flowchart for 802.1x authentication failure

[Flowchart not reproduced. Decision points, in order: Is 802.1x authentication enabled? Is the 802.1x configuration correct? Is the AAA configuration correct? Are the user name and password correct? Is the maximum number of online users reached? If the maximum is reached, this is not a fault.]




Procedure

Step 1 Check that 802.1x authentication is enabled on the S6700.

Run the display dot1x command to check whether 802.1x authentication is enabled globally or on the user-side interface. If Global 802.1x is Enabled or 802.1x protocol is Enabled is not displayed, 802.1x authentication is not enabled. Run the dot1x enable command to enable 802.1x authentication globally and on the user-side interface.






CAUTION

802.1x authentication and MAC address authentication cannot be enabled on the same interface.

If MAC address authentication is enabled on the interface, the system displays an error message when you run the dot1x enable command.

Step 2 Check that 802.1x authentication is configured correctly.

Run the display dot1x command to check the 802.1x configuration.

The S6700 supports the following authentication methods for 802.1x: Password Authentication Protocol (PAP), Challenge Handshake Authentication Protocol (CHAP), and Extensible Authentication Protocol (EAP). The authentication method is configured by using the dot1x authentication-method command.

l The authentication method on the S6700 must be the same as that on the authentication server.

l EAP authentication and local authentication cannot be configured simultaneously. If the authentication method for 802.1x users is EAP, go to step 3.

l If the authentication method for 802.1x users is PAP, check whether the client supports PAP authentication. If the client does not support PAP authentication, change the authentication method to CHAP or EAP.

Step 3 Check the AAA configuration.

1. Check whether the user name contains the domain name.

l If the user name does not contain the domain name, the user is authenticated in the default domain. In this case, check the authentication template bound to the default domain.

l If the user name contains the domain name, the user should be authenticated in the specified domain. However, if the domain name is not found, the authentication fails. In this case, check the authentication template bound to the specified domain.

2. Check the authentication scheme applied to the user domain on the S6700.

l If RADIUS or HWTACACS authentication is configured for the user domain, check whether the user account and the user attributes are created on the authentication server. For details on RADIUS troubleshooting and HWTACACS troubleshooting, see 8.1.1 A User Fails in the RADIUS Authentication and 8.1.2 A User Fails in the HWTACACS Authentication. For details on checking the authentication server, go to step 4.

l If local authentication is configured for the user domain, run the display local-user command to check whether the local user name and password are created on the S6700. If not, run the local-user command to create the local user name and password.

l If the authentication scheme is none, go to step 6.

3. Run the display accounting-scheme command to check the accounting scheme. If accounting is configured on the S6700 but the authentication server does not support accounting, the user will be forced offline after going online. To allow the user to go online, disable the accounting function in the user domain or run the accounting start-fail online command in the accounting scheme view to configure the S6700 to keep the user online after the accounting fails.
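The domain selection described in substep 1 of this step can be sketched as follows. The sketch assumes that the domain is separated from the user name by "@" and that the default domain is named "default"; both are assumptions for illustration, so check the delimiter and domain names actually configured on the S6700.

```python
def authentication_domain(user_name, configured_domains, default_domain="default"):
    """Return the domain used to authenticate user_name.

    Returns None when the user name carries a domain that is not configured,
    which corresponds to an authentication failure.
    """
    _name, separator, domain = user_name.rpartition("@")
    if not separator:
        # No domain in the user name: the default domain is used
        return default_domain
    return domain if domain in configured_domains else None


domains = {"corp"}  # hypothetical domain configured on the switch
print(authentication_domain("alice", domains))       # default
print(authentication_domain("alice@corp", domains))  # corp
print(authentication_domain("alice@lab", domains))   # None
```

A result of None points to a missing or misspelled domain rather than to a wrong password, which narrows down where to look in the AAA configuration.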

Step 4 Check the configuration of the authentication server.



l If the user information does not exist on the authentication server, create the user name and password on the authentication server.

l If user attributes on the authentication server contain VLAN authorization information but the VLAN is not created on the S6700, user authorization fails. To rectify the fault, create the VLAN.

l If user attributes on the authentication server contain ACL authorization information (ACL number or ACL content), but the ACL is not created on the S6700 or the ACL format is different from that required by the S6700, user authorization fails. To rectify the fault, create the ACL. Ensure that the ACL format used by the authentication server is the same as that required by the S6700.

NOTE

The S6700 requires the following ACL format in the user attributes:

acl acl-num key1 key-value1... keyN key-valueN permit/deny

Field | Description
acl | Delivers the ACL content.
acl-num | Specifies the ACL number. The value ranges from 10000 to 10999.
keyM (1 ≤ M ≤ N) | Indicates a keyword in the ACL, including src-ip (source IP address), srcipmask (mask of source IP address), and tcpsrcport (source TCP port number).
key-valueM (1 ≤ M ≤ N) | Specifies the value of a keyword, which can be an IP address, a mask, or a port number.
permit | Allows users matching the rules to access the network.
deny | Prohibits users matching the rules from accessing the network.

If the display access-user user-id command output contains the user IP address and Dynamic ACL desc (Effective), the ACL specified in the user attribute takes effect.

If the configurations of the S6700 and the authentication server are correct, go to step 5.
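Before an ACL attribute is configured on the authentication server, its format can be checked against the NOTE in this step. The following sketch assumes that the fields are separated by whitespace, which the format line suggests but does not state; the function name and the sample attribute are hypothetical.

```python
def parse_acl_attribute(text):
    """Split an 'acl acl-num key value ... permit/deny' string into its parts."""
    tokens = text.split()
    if len(tokens) < 3 or tokens[0] != "acl":
        raise ValueError("attribute must start with 'acl' and end with permit or deny")
    acl_num = int(tokens[1])
    if not 10000 <= acl_num <= 10999:
        raise ValueError("acl-num must be in the range 10000 to 10999")
    action = tokens[-1]
    if action not in ("permit", "deny"):
        raise ValueError("attribute must end with permit or deny")
    pairs = tokens[2:-1]
    if len(pairs) % 2:
        raise ValueError("every keyword needs a value")
    # Pair each keyword with the value that follows it
    rules = dict(zip(pairs[0::2], pairs[1::2]))
    return {"acl_num": acl_num, "rules": rules, "action": action}


parsed = parse_acl_attribute("acl 10001 src-ip 10.1.1.1 tcpsrcport 80 permit")
print(parsed["acl_num"], parsed["rules"], parsed["action"])
```

A ValueError from this check indicates an attribute that does not follow the documented format, for example an ACL number outside 10000 to 10999.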

Step 5 Check that the user name and password entered by the user are correct.

If RADIUS authentication is used and the authentication method is CHAP or PAP, run the test-aaa command to check whether the user name and password can pass the RADIUS authentication.

l If the authentication fails, check the configuration of the RADIUS server and RADIUS configuration on the S6700. For details, see Troubleshooting Procedure of RADIUS authentication failure.

l If the user passes the authentication, check the option settings on the client or capture packets on the network adapter of the client to check whether the client sends authentication packets correctly.

If the preceding configurations are correct, go to step 6.

Step 6 Run the display dot1x interface interface-type interface-number command on the S6700 to check whether the number of online 802.1x users reaches the maximum.






If the number of online 802.1x users reaches the maximum, the S6700 does not trigger authentication for subsequent users, and subsequent users cannot go online.



----End


Relevant Alarms

l 1.3.6.1.4.1.2011.5.25.40.4.2.1

Relevant Logs

None.

8.3.2 802.1x-based Fast Deployment Does Not Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when the 802.1x-based fast deployment function does not take effect.

Common Causes

This fault is commonly caused by one of the following:

l Some parameters are set incorrectly or not set, such as the 802.1x authentication parameters, AAA authentication domain, and authentication server template.

l The specified free IP subnet is unreachable.

l The redirect-to URL is not in the free IP subnet or cannot be resolved by the DNS server.


A user uses 802.1x authentication but has not installed the authentication client. 802.1x-based fast deployment is configured, but the client download page is not displayed when the user visits a website.

Figure 8-11 Troubleshooting flowchart for ineffective 802.1x-based fast deployment

[Flowchart not reproduced. Decision points, in order: Can a user on an 802.1x-disabled port access the URL? Can the user directly access the client download URL? Is the client download URL in the free IP subnet? Is the user an 802.1x fast authentication user?]


Context

NOTE

Saving the results of each troubleshooting step is recommended. If your troubleshooting fails to correct the fault, you will have a record of your actions to provide to Huawei technical support personnel.

Procedure

Step 1 Check whether users can access the client download URL (redirect-to URL) on an interface without 802.1x enabled.

Select an idle interface on the switch and configure it in the same way as the interface connected to the user, but disable 802.1x on it. Connect a PC to the idle interface and access the client download URL.

l If URL access succeeds, go to step 2.



l If URL access fails, check the link between the switch and the DNS server, and configuration of the DNS server.

Step 2 Check whether the user can access the client download URL.

Enter the client download URL in the browser's address box, and check whether the client download web page is displayed.

l If the download web page is not displayed, run the display dot1x command to check whether the redirect-to URL is configured. If the redirect-to URL (dot1x url) is not configured, configure it and try again. If the redirect-to URL is configured and the web server is working properly, go to step 3.

l If the download web page is displayed, check whether the website URL that the user first accesses is in the free IP subnet. If so, the user access request does not need to be redirected and no action is required. If the website URL is not in the free IP subnet, go to step 5.

Step 3 Check whether the redirect-to URL is in the free IP subnet.

The user's access request cannot be redirected to the client download web page when the redirect-to URL is not in the free IP subnet. In this case, configure a new free IP subnet that includes the redirect-to URL. If the redirect-to URL is in the free IP subnet but the redirection fails, go to step 4.
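The check in step 3 comes down to whether the IP address that the redirect-to URL resolves to falls inside a configured free IP subnet. A minimal sketch of that comparison follows; the function name is hypothetical, and the addresses are taken from documentation ranges rather than from a real deployment.

```python
import ipaddress


def in_free_ip_subnet(server_ip, free_subnets):
    """Return True if server_ip belongs to at least one free IP subnet."""
    address = ipaddress.ip_address(server_ip)
    return any(address in ipaddress.ip_network(subnet) for subnet in free_subnets)


free_subnets = ["192.0.2.0/24"]  # example free IP subnet
print(in_free_ip_subnet("192.0.2.10", free_subnets))    # True
print(in_free_ip_subnet("198.51.100.7", free_subnets))  # False
```

If the result is False for the address of the download server, the free IP subnet needs to be extended as described in step 3.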

Step 4 Check whether the user is an 802.1x-based fast authentication user.

Run the display dot1x command to check whether the user is an 802.1x-based fast authentication user.

l If not, perform the following operation on the user PC: Choose Start > Run, and enter cmd in the Run dialog box. In the displayed command line window, run the arp -d command to clear the ARP entries. Then access the website again.

l If the user is an 802.1x-based fast authentication user, go to step 5.


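Step 5 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log files, and alarm files of the switch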

----End


Relevant Alarms

None

Relevant Logs

None

8.3.3 MAC Address Authentication of a User Fails

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the MAC address authentication failure.



Common Causes

This fault is commonly caused by one of the following:

l Some parameters are set incorrectly or not set, such as the parameters of MAC address authentication, authentication domain, authentication server, and authentication server template.

l The number of online users reaches the maximum.


A user fails to pass the MAC address authentication.

Figure 8-12 Troubleshooting flowchart for MAC address authentication failure

[Flowchart not reproduced. Decision points, in order: Is MAC authentication enabled? Is the user name correct? Are the domain and authentication server template configured correctly? Is the maximum number of online users reached? If the maximum is reached, this is not a fault.]



Context

When MAC address authentication is used, users do not need the dial-up software. The authentication information such as the user name and password is generated according to the MAC addresses of users. Similar to 802.1x authentication troubleshooting, when troubleshooting MAC address authentication, check whether the user name and password on the S6700 are the same as those on the authentication server and whether the domain name in the user name is correct.



Procedure

Step 1 Check that MAC address authentication is enabled on the S6700.

Run the display mac-authen command to check whether MAC address authentication is enabled globally or on the user-side interface. If MAC address authentication is Enabled is not displayed, MAC address authentication is not enabled. Run the mac-authen command to enable MAC address authentication globally and on the user-side interface.

CAUTION

802.1x authentication and MAC address authentication cannot be enabled on the same interface.

If 802.1x authentication is enabled on the interface, the system displays an error message when you run the mac-authen command.

Step 2 Check the configuration of the user name for MAC address authentication.

Run the display this command in the interface view to check the configuration of MAC address authentication on the interface. If MAC address authentication is not configured on the interface, the global configuration is used. Run the display mac-authen command to check the configuration of global MAC address authentication.

MAC address authentication supports two user name formats: fixed user name and MAC address.

l If the user MAC address is used as the user name, the S6700 sends the MAC address of the user terminal as the user name and password to the authentication server. The authentication domain is configured by the mac-authen domain command. If no authentication domain is configured, the default domain is used.

l When the fixed user name contains a domain name, this domain is used as the authentication domain. If the fixed user name does not contain a domain name, the default domain is used as the authentication domain.

Check the authentication server template and AAA schemes bound to the authentication domain.

Go to step 3.


Step 3 Check the AAA configuration.

1. Check the configuration of the authentication server template bound to the domain. Ensure that the IP address and port of the authentication server are set correctly in the template, and that the user name format and shared key specified in the template are the same as those on the authentication server.

2. Check the authentication scheme applied to the user domain on the S6700.

l If RADIUS or HWTACACS authentication is configured for the user domain, check whether the user account and the user attributes are created on the authentication server. For details on RADIUS troubleshooting and HWTACACS troubleshooting, see 8.1.1 A User Fails in the RADIUS Authentication and 8.1.2 A User Fails in the HWTACACS Authentication. For details on checking the authentication server, go to step 4.

3. Run the display accounting-scheme command to check the accounting scheme. If accounting is configured on the S6700 but the authentication server does not support accounting, the user will be forced offline after going online. To allow the user to go online, disable the accounting function in the user domain or run the accounting start-fail online command in the accounting scheme view to configure the S6700 to keep the user online after the accounting fails.

Step 4 Check the configuration of the authentication server.

l If the user information does not exist on the authentication server, create the user name and password on the authentication server.

l If user attributes on the authentication server contain VLAN authorization information but the VLAN is not created on the S6700, user authorization fails. To rectify the fault, create the VLAN.

l If user attributes on the authentication server contain ACL authorization information (ACL number or ACL content), but the ACL is not created on the S6700 or the ACL format is different from that required by the S6700, user authorization fails. To rectify the fault, create the ACL. Ensure that the ACL format used by the authentication server is the same as that required by the S6700.

NOTE

The S6700 requires the following ACL format in the user attributes:

acl acl-num key1 key-value1... keyN key-valueN permit/deny

Field | Description
acl | Delivers the ACL content.
acl-num | Specifies the ACL number. The value ranges from 10000 to 10999.
keyM (1 ≤ M ≤ N) | Indicates a keyword in the ACL, including src-ip (source IP address), srcipmask (mask of source IP address), and tcpsrcport (source TCP port number).
key-valueM (1 ≤ M ≤ N) | Specifies the value of a keyword, which can be an IP address, a mask, or a port number.
permit | Allows users matching the rules to access the network.
deny | Prohibits users matching the rules from accessing the network.

If the display access-user user-id command output contains the user IP address and Dynamic ACL desc (Effective), the ACL specified in the user attribute takes effect.

If the configurations of the S6700 and the authentication server are correct, go to step 5.

Step 5 Run the display mac-authen interface interface-type interface-number command on the S6700 to check whether the number of online MAC address authentication users reaches the maximum.

If the number of online MAC address authentication users reaches the maximum, the S6700 does not trigger authentication for subsequent users, and subsequent users cannot go online.



----End


Relevant Alarms

l 1.3.6.1.4.1.2011.5.25.171.2.1

Relevant Logs

None.

8.3.4 MAC Address Bypass Authentication of a User Fails

This section describes how to troubleshoot a MAC address bypass authentication failure.

In MAC address bypass authentication, a user terminal first sends an Address Resolution Protocol (ARP) packet or a Dynamic Host Configuration Protocol (DHCP) packet to the S6700 to trigger 802.1x authentication. If the S6700 does not receive any 802.1x packet from the terminal within 30 seconds, the S6700 sends the MAC address of the terminal as the user name and password to the authentication server.

After MAC address bypass authentication is configured, the S6700 starts MAC address authentication automatically after a user fails to pass the 802.1x authentication. 802.1x authentication and MAC address authentication cannot be enabled on the same interface. If 802.1x authentication is enabled on the interface, the system displays an error message when you attempt to enable MAC address authentication. You can enable MAC address bypass authentication by using the dot1x mac-bypass command. In MAC address bypass authentication, the terminal MAC address is used as the user name and password. The process of MAC address bypass authentication is the same as the process of MAC address authentication.
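The fallback sequence described above can be summarized in a short sketch. It models only the decision stated in this section (802.1x first, MAC address authentication after 30 seconds without an 802.1x packet); the function name, the return strings, and the handling of the exact 30-second boundary are assumptions.

```python
DOT1X_WAIT_SECONDS = 30  # waiting time stated in this section


def bypass_decision(seconds_since_trigger, dot1x_packet_received):
    """Return the authentication path taken for a terminal on a bypass-enabled interface."""
    if dot1x_packet_received:
        return "802.1x authentication"
    if seconds_since_trigger >= DOT1X_WAIT_SECONDS:
        # The terminal MAC address is then used as the user name and password
        return "MAC address authentication"
    return "waiting for 802.1x packet"


print(bypass_decision(5, True))    # 802.1x authentication
print(bypass_decision(10, False))  # waiting for 802.1x packet
print(bypass_decision(30, False))  # MAC address authentication
```

When a terminal without an 802.1x client fails to go online, the sketch makes clear that no MAC address authentication request should be expected on the server during the waiting time.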






The troubleshooting procedure for MAC address bypass authentication failure is similar to the troubleshooting procedure for MAC address authentication failure. For details, see 8.3.3 MAC Address Authentication of a User Fails.

8.3.5 Web Authentication of a User Fails

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the Web authentication failure.

Common Causes

This fault is commonly caused by one of the following:

l Some parameters are set incorrectly or not set, such as the parameters of Web authentication, authentication domain, authentication server, and authentication server template.

l The Web authentication server is unreachable or unavailable.

l The user name or password entered by the user is incorrect.


A user fails to pass the Web authentication.

Figure 8-13 Troubleshooting flowchart for Web authentication failure

[Flowchart not reproduced. Decision points, in order: Is the link normal? Is Portal authentication configured correctly on the switch? Is the Portal server configured correctly? Are the authentication server template, authentication scheme, user name, and password configured correctly?]




Procedure

Step 1 Run the ping command to check whether the link between the S6700 and the Portal authentication server and the link between the S6700 and the RADIUS or HWTACACS authentication server work properly.






l If the ping operation fails on any link, rectify the fault on the link according to 6.2.1 A Ping Operation Fails.


Step 2 Check that Portal authentication is configured correctly on the S6700.

l Run the display web-auth-server configuration command to check whether the Web authentication server is configured. If not, run the web-auth-server command in the system view to configure a Web authentication server. Run the server-ip and url commands in the web-auth-server view to configure an IP address and URL for the Web authentication server.

You can also run the port and shared-key commands in the web-auth-server view to configure the port number and shared key of the Web authentication server. If these parameters are configured on the S6700, ensure that the parameter settings are the same as those configured on the Web authentication server. If they are not configured on the S6700, the default port number is 50100, and there is no default shared key or URL.

l Run the display this command in the VLANIF interface view to check whether the Portal authentication server is bound to the VLANIF interface. If not, run the web-auth-server (VLANIF interface view) command in the VLANIF interface view to bind the Portal authentication server to the VLANIF interface. Users cannot be authenticated when the S6700 switch functions as a Layer 3 device but Layer 2 Portal authentication is configured. In this case, configure Layer 3 Portal authentication on the S6700.

l Run the display web-auth-server configuration command to check the listening port of Portal packets. Go to step 3.

Step 3 Check the configuration of the Portal authentication server.

l Check whether the S6700 is in the authenticated device list.

l Check whether the listening port of Portal packets is the same as that configured on the S6700.

l Check whether the IP address of the user is in the IP address group of the S6700.

Ensure that the S6700 is in the authenticated device list, the listening port of Portal packets on the S6700 is the same as that configured on the Portal authentication server, and the IP address of the user is in the IP address group of the S6700.


Step 4 Check the AAA configuration.

1. Check the configuration of the authentication server template bound to the domain. Ensure that the IP address and port of the authentication server are set correctly in the template and that the user name format and shared key specified in the template are the same as those on the authentication server.

2. Check the authentication scheme applied to the user domain on the S6700.

l If RADIUS or HWTACACS authentication is configured for the user domain, check that the user name and password are configured on the authentication server. Ensure that the user enters the correct user name and password. For details on RADIUS troubleshooting and HWTACACS troubleshooting, see 8.1.1 A User Fails in the RADIUS Authentication and 8.1.2 A User Fails in the HWTACACS Authentication.

3. Run the display accounting-scheme command to check the accounting scheme. If accounting is configured on the S6700 but the authentication server does not support accounting, the user will be forced offline after going online. To allow the user to go online, disable the accounting function in the user domain or run the accounting start-fail online command in the accounting scheme view to configure the S6700 to keep the user online after the accounting fails.


----End


Relevant Alarms

None.

Relevant Logs

None.

8.4 DHCP Snooping Troubleshooting

This chapter describes common causes of Dynamic Host Configuration Protocol (DHCP) snooping faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.4.1 Users Fail to Go Online After DHCP Snooping Is Configured

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the DHCP snooping fault.

Common Causes

This fault is commonly caused by one of the following:

l The network-side interface connected to the DHCP server is not configured as a trusted interface.

l The number of DHCP users connected to the user-side interface reaches the upper limit.

l The transmission rate of DHCP packets exceeds the upper limit, so the DHCP packets of new users are discarded.




Figure 8-14 DHCP snooping troubleshooting flowchart

[Flowchart not reproduced. Decision points, in order: Is the network-side interface set as trusted? Does the number of DHCP online users reach the limit? Does the DHCP packet rate reach the limit? If the user limit is reached, new users cannot go online; if the rate limit is reached, increase the limit value.]


Procedure

Step 1 Check that the trusted interface is correctly configured.

l Run the display dhcp snooping command to check the VLANs and interfaces enabled with DHCP snooping.

l Run the display dhcp snooping interface command to check whether "dhcp snooping trusted" is displayed under the network-side interface.

l Run the display this command in the VLAN view to check whether "dhcp snooping trusted interface xxx" is displayed.

After DHCP snooping is enabled on an interface, the interface is an untrusted interface by default. When receiving packets from network-side interfaces, the S6700 processes only the DHCP Reply packets received by the trusted interface and discards DHCP Reply packets received by untrusted interfaces. When receiving packets from user-side interfaces, the S6700 forwards the packets only to the trusted interface.



l The network-side interface of the DHCP server must be configured as a trusted interface. If the network-side interface is not a trusted interface, run the dhcp snooping trusted command in the interface view or run the dhcp snooping trusted interface command in the VLAN view to configure the interface as a trusted interface.

l If the trusted interface is correctly configured, go to step 2.

Step 2 Check whether the number of DHCP online users reaches the upper limit.

l Run the display dhcp snooping interface command to check whether "dhcp snooping max-user-number xxx" is displayed under the user-side interface.

l Run the display this command in the VLAN view to check whether "dhcp snooping max-user-number xxx" is displayed.

l Run the display this command in the system view to check whether "dhcp snooping global max-user-number xxx" is displayed.

In the command outputs, max-user-number indicates the maximum number of DHCP users. If this field is not displayed, the default value 1024 is used. If max-user-number is displayed in all the preceding command outputs, the smallest one among the displayed values is used.

Run the display dhcp snooping user-bind all command to view the number of DHCP users on the DHCP snooping-enabled interface. If the number of DHCP users on the interface reaches the upper limit, new users cannot go online.

If the number of DHCP users on the interface is lower than the upper limit, go to step 3.
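The rule for choosing the effective user limit can be sketched as follows. This is an illustrative Python model, not device code: the function names are hypothetical, and the handling of the case where only some of the three views carry a value (take the smallest of those that are configured) is an assumption, because the text only describes the cases where none or all are configured.

```python
from typing import Optional

DEFAULT_MAX_USERS = 1024  # default stated in the text when no limit is displayed


def effective_max_users(interface: Optional[int] = None,
                        vlan: Optional[int] = None,
                        system: Optional[int] = None) -> int:
    """Return the smallest configured limit, or the default if none is set.

    Assumption: when only some views are configured, the smallest of the
    configured values applies.
    """
    configured = [v for v in (interface, vlan, system) if v is not None]
    return min(configured) if configured else DEFAULT_MAX_USERS


def can_go_online(current_users: int, limit: int) -> bool:
    """A new user is refused once the user count has reached the limit."""
    return current_users < limit
```

For example, with limits of 200 (interface), 500 (VLAN), and 1000 (system), the effective limit in this model is 200, and a 201st user on the interface cannot go online.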

Step 3 Check whether the transmission rate of DHCP packets exceeds the upper limit.

Run the display this command in the interface view, VLAN view, and system view to check whether a limit is set for the DHCP packet rate. If the output information does not contain dhcp snooping check dhcp-rate xx, the default rate limit of 100 applies.

The DHCP snooping rate limit can be set in the system view, interface view, and VLAN view.

After a rate limit is set, the number of packets sent to the protocol stack within a certain period of time cannot exceed the limit; otherwise, the excess packets are discarded. The smallest value among the rate limits set in the system view, interface view, and VLAN view takes effect. If users cannot go online because the DHCP snooping rate limit is small, run the dhcp snooping check dhcp-rate command in the system view, interface view, and VLAN view to increase the rate limit values.

If the fault persists after the rate limit values are increased, go to step 4.
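The effect of a small rate limit can be illustrated with a simple fixed-window counter. This is a sketch under stated assumptions: the text does not describe the algorithm the S6700 uses, so the fixed-window model, the class name, and the helper name are all hypothetical. Only the "smallest value takes effect" rule and the default of 100 come from the text.

```python
from typing import Optional


def effective_rate_limit(system: Optional[int] = None,
                         interface: Optional[int] = None,
                         vlan: Optional[int] = None,
                         default: int = 100) -> int:
    """Smallest configured limit wins; fall back to the default of 100."""
    configured = [v for v in (system, interface, vlan) if v is not None]
    return min(configured) if configured else default


class FixedWindowLimiter:
    """Hypothetical model: at most `limit` packets pass per time window."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def new_window(self) -> None:
        """Call at the start of each measurement period."""
        self.count = 0

    def accept(self) -> bool:
        """Return True if the packet passes, False if it is discarded."""
        if self.count < self.limit:
            self.count += 1
            return True
        return False


# 30 DHCP packets arrive in one window while the effective limit is 20.
limiter = FixedWindowLimiter(effective_rate_limit(system=50, interface=20))
results = [limiter.accept() for _ in range(30)]
passed, dropped = results.count(True), results.count(False)
```

In this model 20 packets pass and 10 are discarded, which is why users whose DHCP packets fall into the discarded part cannot go online until the limit is raised.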



----End


Relevant Alarms

None.




Relevant Logs

None.


This section provides a DHCP snooping troubleshooting case.


Users Fail to Obtain IP Addresses After DHCP Snooping Is Enabled

Fault Symptom

As shown in Figure 8-15, DHCP relay is enabled on SwitchA, SwitchB is a non-Huawei device, and DHCP snooping is enabled on SwitchB. Users cannot obtain IP addresses.


[Figure 8-15: network diagram with the labels DHCP Server, SwitchA (DHCP Relay), L2 Network, SwitchB (DHCP Snooping), and User]

Fault Analysis

1. Check the DHCP configurations on SwitchA and SwitchB. The configurations are correct; therefore, DHCP packets may be discarded.

2. Check whether SwitchB correctly processes DHCP Discover packets.

Capture and analyze the packets on SwitchB. The DHCP Discover packets with the source port and destination port being port 67 are discarded before they enter DHCP snooping queues.

3. Check the network. The DHCP snooping-enabled device (SwitchB) is between the DHCP relay and the DHCP server, and the source and destination ports of the DHCP Discover packets sent by the DHCP relay are port 67.

4. Check the packet processing mechanism of SwitchB. SwitchB considers the packets with the source and destination ports being port 67 as invalid DHCP Discover packets, so it discards them.
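The port check that causes this fault can be sketched as follows. DHCP clients send from UDP port 68 to port 67, whereas a relay forwards requests toward the server with both source and destination port 67, as observed in the capture above. The function names are hypothetical and the "strict" check is only a guess at the kind of rule SwitchB applies, since its real implementation is not described.

```python
DHCP_SERVER_PORT = 67
DHCP_CLIENT_PORT = 68


def classify_dhcp_request(src_port: int, dst_port: int) -> str:
    """Classify a DHCP request by its UDP ports."""
    if dst_port != DHCP_SERVER_PORT:
        return "not-a-request"
    if src_port == DHCP_CLIENT_PORT:
        return "from-client"   # 68 -> 67, sent directly by a client
    if src_port == DHCP_SERVER_PORT:
        return "from-relay"    # 67 -> 67, forwarded by a DHCP relay
    return "unexpected-source-port"


def strict_check_accepts(src_port: int, dst_port: int) -> bool:
    """Assumed behavior of SwitchB: only client-originated requests are valid."""
    return classify_dhcp_request(src_port, dst_port) == "from-client"


def relay_aware_check_accepts(src_port: int, dst_port: int) -> bool:
    """A check that also accepts relayed requests (67 -> 67)."""
    return classify_dhcp_request(src_port, dst_port) in ("from-client", "from-relay")
```

Under the strict check a relayed Discover (67 to 67) is rejected, which matches the symptom; a relay-aware check accepts it.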

Procedure

Step 1 Contact the vendor of SwitchB to modify the software so that SwitchB accepts DHCP Discover packets whose source and destination ports are both port 67. The fault is then rectified.

----End

Summary

A typical DHCP network structure is client-relay-server and DHCP snooping is usually enabled on the relay or between the relay and the client. If DHCP snooping is enabled on a network using another structure, consider whether DHCP packet forwarding is affected.

8.5 Traffic Suppression Troubleshooting

This chapter describes common causes of traffic suppression faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

8.5.1 Broadcast Suppression Fails to Take Effect on an Interface

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when broadcast suppression fails to take effect.

Common Causes

This fault is commonly caused by one of the following:

l Broadcast suppression is not configured on interfaces, or the broadcast suppression threshold is set too high.

l Broadcast packets are not discarded on the inbound interface.


When a broadcast storm occurs on an interface and broadcast suppression fails to take effect on the interface, see Figure 8-16.


NOTE

l Run the display interface interface-type interface-number command to check the packets sent and received on the interface. If the rate of incoming or outgoing packets reaches several Mbit/s and most packets are broadcast packets, a broadcast storm occurs on the interface.

l Generally, only ARP packets and DHCP packets are broadcast on a network. If a broadcast storm occurs, the network may have a loop. When many physical loops exist on the network, loop prevention protocols such as the Spanning Tree Protocol (STP) and the Rapid Ring Protection Protocol (RRPP) must be enabled to prevent network loops. When loop prevention is ineffective and network loops occur, broadcast suppression can restore the network. Broadcast suppression is a remedy for broadcast storms. An effective prevention measure for broadcast storms is to eliminate network loops. For details about loop prevention measures, see MSTP Troubleshooting and RRPP Troubleshooting.



[Figure 8-16: flowchart for troubleshooting a broadcast suppression failure. It checks in turn whether the broadcast suppression parameters are configured properly (if not, modify them) and whether broadcast packets are discarded on the inbound interface, asking after each action whether the fault is rectified.]


NOTE


l The troubleshooting procedures for multicast suppression and unknown unicast suppression are similar to that for broadcast suppression.

Procedure

Step 1 Check that traffic suppression is correctly configured on the related interface.

NOTE

The traffic suppression function controls the traffic entering the S6700 from an interface. Traffic suppression must be configured on both the user-side and network-side interfaces on the S6700. If broadcast suppression is configured only on the user-side or network-side interface, the S6700 can only control the broadcast traffic in one direction. When a downstream device sends a large number of broadcast packets to the S6700, broadcast storms still occur if broadcast suppression is not configured on the interface connected to the downstream device.

Normally, broadcast packets are transmitted at a rate less than 1000 kbit/s. Therefore, setting the broadcast suppression threshold to less than 1000 kbit/s is recommended. The formula for calculating the PPS rate is as follows: PPS = CIR x 1000/(84 x 8). The value 84 is the average packet length, including a 60-byte packet body, a 20-byte inter-frame gap and a 4-byte CRC. The value 8 is the number of bits in a byte.
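The conversion formula above can be worked through in a few lines of Python. The function name is hypothetical; the constants (84 bytes per packet on the wire, 8 bits per byte, CIR in kbit/s) are taken from the formula in the text.

```python
BYTES_PER_PACKET = 84  # 60-byte body + 20-byte inter-frame gap + 4-byte CRC
BITS_PER_BYTE = 8


def cir_to_pps(cir_kbps: int) -> int:
    """Convert a CIR in kbit/s to whole packets per second.

    PPS = CIR x 1000 / (84 x 8), rounded down.
    """
    return cir_kbps * 1000 // (BYTES_PER_PACKET * BITS_PER_BYTE)


# The recommended ceiling of 1000 kbit/s corresponds to 1488 packets per second.
pps_at_1000_kbps = cir_to_pps(1000)
```

With these constants, a threshold of 1000 kbit/s equals 1488 pps, so a packets-per-second threshold below that value stays within the recommendation.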

Run the display flow-suppression interface interface-type interface-number command in the user view to check whether the values of rate mode and set rate value in the broadcast field are proper.



l If these values are improper, run the broadcast-suppression { percent-value | packets packets-per-second } command in the interface view to modify the broadcast suppression parameters.

l If these values are proper, go to step 2.

Step 2 Check whether broadcast packets are discarded in the inbound direction of the interface.

You can check whether broadcast packets are discarded in the inbound direction of the interface by using the following methods:

l Run the display interface interface-type interface-number command in the user view to check whether the value of Input bandwidth utilization changes greatly after traffic suppression is configured. Normally, after traffic suppression is configured, the interface bandwidth usage decreases if the interface discards excess packets. If the value of Input bandwidth utilization does not change or changes only slightly, go to step 3.

l Configure another interface (interface B), and add it and the interface configured with traffic suppression (interface A) to the same VLAN. Then check whether the volume of the outgoing traffic on interface B is the same as the volume of the incoming traffic on interface A. If they are the same, no packet is discarded in the inbound direction of interface A. Go to step 3.



----End


Relevant Alarms

None.

Relevant Logs

None.

8.6 CPU Defense Troubleshooting

This chapter describes common causes of CPU defense faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.6.1 Protocol Packets Fail to Be Sent to the CPU

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when protocol packets fail to be sent to the CPU.

Common Causes

This fault is commonly caused by one of the following:

l The inbound interface does not receive any protocol packet.

l A policy is configured on the S6700 to discard protocol packets. For example, a blacklist is configured, or the action taken on the protocol packets to be sent to the CPU is deny.

l Invalid packets attack the CPU.


If a certain function does not work because protocol packets fail to be sent to the CPU, rectify the fault according to Figure 8-17.

Run the display cpu-defend statistics command to check the value of the Pass field to determine whether protocol packets are sent to the CPU.


[Figure 8-17: flowchart for troubleshooting protocol packets that fail to be sent to the CPU. It checks in turn whether the interface receives protocol packets (if not, check and rectify the link fault), whether rules are configured to discard protocol packets (if so, change the rules), and whether invalid packets attack the CPU (if so, configure a blacklist or a traffic policy to prevent invalid packets from being sent to the CPU).]


Procedure

Step 1 Check whether the interface receives protocol packets.



Capture packets on the interface to check whether the interface receives protocol packets.

l If the interface does not receive any protocol packet, run the display interface interface-type interface-number command to check whether the interface is physically Up.

– If the interface is physically Down, see the interface troubleshooting section to rectify the interface fault.

– If the interface is physically Up, go to step 3.

l If the interface receives protocol packets, go to step 2.

Step 2 Check whether a policy is configured to discard protocol packets on the S6700.

NOTE

On the S6700, protocol packets will be discarded and fail to be sent to the CPU in the following situations:

l A blacklist is configured, and protocol packets match the ACL rule of the blacklist.

l The action taken on the protocol packets to be sent to the CPU is deny.

Run the display this command in the system view to check the configured attack defense policy.

Then run the display cpu-defend policy command to check whether a blacklist is configured or whether the action taken on the protocol packets to be sent to the CPU is deny.

l If a blacklist is configured, run the display acl command to check whether protocol packets match rules of the blacklist.

– If protocol packets match the rules, change the rules according to the service plan.

– If no protocol packet matches the rules, go to step 3.

l If the action taken on the protocol packets to be sent to the CPU is deny, run the car command to set the CAR.

l If no blacklist is configured, and the action taken on the protocol packets to be sent to the CPU is not deny, go to step 3.

Step 3 Check statistics about the packets sent to the CPU.

NOTE

If packets of a certain protocol are excessive, for example, because invalid packets are attacking the CPU, other protocol packets cannot be sent to the CPU.

Run the display cpu-defend statistics command to check whether a large number of protocol packets are discarded.

l If a large number of protocol packets are discarded, check whether these packets are invalid attack packets by using the attack source tracing function. If they are invalid attack packets, configure a blacklist or a traffic policy to prevent these packets from being sent to the CPU.

NOTE

l For the configuration of the attack source tracing function, see "Configuring Attack Source Tracing" in the S6700 Series Ethernet Switches Configuration Guide - Security.

l For the configuration of a blacklist, see "Configuring Attack Defense Policies" in the S6700 Series Ethernet Switches Configuration Guide - Security.

l For the configuration of a traffic policy, see "Configuring Attack Defense Policies" in the S6700 Series Ethernet Switches Configuration Guide - Security.

l If no protocol packet is discarded, go to step 4.

l Configuration files, log files, and alarm files of the devices

----End


Relevant Alarms

None.

Relevant Logs

None.

8.6.2 Blacklist Function Fails to Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when the blacklist function fails to take effect.

Common Causes


This fault is commonly caused by one of the following:

l The configured blacklist fails to be applied.

l Packets do not match rules of the blacklist.




Procedure

Step 1 Check that the configured blacklist is applied to the corresponding board successfully.

Run the display cpu-defend policy policy-name command to check whether the blacklist in the attack defense policy has been applied successfully.

<Quidway> display cpu-defend policy 1
 Related slot : <0>
 Configuration :
   Blacklist 1 ACL number : 2001

l If "Related slot : <0>" is displayed in the command output, the attack defense policy has been applied successfully.

l If "Blacklist 1 ACL number : 2001" is displayed in the command output, a blacklist has been configured in the attack defense policy.

Step 2 Check whether packets match rules of the blacklist.

Check the ACL of the blacklist in the displayed attack defense policy information, and then run the display acl acl-number command to check whether service packets match the ACL rule.



l If service packets do not match the ACL rule, run the rule command in the ACL view to modify the ACL rule.

l If service packets match the ACL rule, the blacklist may fail to be applied because ACL resources are insufficient. Go to step 3.



----End


Relevant Alarms

None.

Relevant Logs

None.

8.6.3 Attack Source Tracing Fails to Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when attack source tracing fails to take effect.

Common Causes

This fault is commonly caused by one of the following:

l The attack defense policy configured with attack source tracing is not applied.

l The threshold for attack source tracing is set too high, so the attack source tracing function fails to identify attack packets.


If attack source tracing is configured, but the source of attack packets sent to the CPU cannot be found by using the display auto-defend attack-source command, see Figure 8-18.

NOTE

The attack source tracing function can collect statistics about only DHCP, ICMP, IGMP, TCP, Telnet, and ARP packets as well as packets with the TTL being 1.



[Figure 8-18: flowchart for troubleshooting an attack source tracing failure. It checks in turn whether the attack defense policy is applied correctly (if not, apply it correctly) and whether the threshold for attack source tracing is set too high (if so, set the threshold to a smaller value).]


Procedure

Step 1 Check that the attack defense policy configured with attack source tracing is applied correctly.

Run the display this command in the system view to check whether the cpu-defend-policy global command has been executed.

Alternatively, run the display auto-defend configuration command to check values of the Name and Related Slot fields. The Name field specifies the attack defense policy name, and the Related Slot field specifies the slot of the board where the attack defense policy has been applied.

l If no attack defense policy is applied successfully, run the cpu-defend-policy global command in the system view to apply an attack defense policy.

l If the attack defense policy has been applied successfully, go to step 2.

Step 2 Check whether the threshold for attack source tracing is set too high.

NOTE

If the threshold for attack source tracing is set too high, attack source tracing cannot identify attack packets or collect attack packet statistics.

Run the auto-defend threshold command to set a smaller threshold for attack source tracing.






After a specified period, run the display auto-defend attack-source command to check whether the attack source list is displayed. If no attack source list is displayed, go to step 3.
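Why a high threshold leaves the attack source list empty can be shown with a small counting model. This is a sketch, not the device algorithm: it assumes that tracing counts packets per source within one sampling period and lists a source only when its count exceeds the threshold. The function name and the sample addresses are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def find_attack_sources(packet_sources: Iterable[str], threshold: int) -> Dict[str, int]:
    """Return the sources whose packet count in the period exceeds the threshold."""
    counts = Counter(packet_sources)
    return {src: n for src, n in counts.items() if n > threshold}


# One sampling period: one noisy host and one normal host (made-up addresses).
period = ["10.1.1.2"] * 150 + ["10.1.1.3"] * 5

too_high = find_attack_sources(period, threshold=1000)  # nothing is listed
lowered = find_attack_sources(period, threshold=100)    # the noisy host appears
```

With the threshold at 1000 the list stays empty even though one host sends 150 packets; lowering it to 100 makes that host appear, which is the effect Step 2 relies on.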



----End


Relevant Alarms

l 1.3.6.1.4.1.2011.5.25.165.2.2.1.1

l 1.3.6.1.4.1.2011.5.25.165.2.2.1.2

Relevant Logs

None

8.7 MFF Troubleshooting

This chapter describes common causes of MAC Forced Forwarding (MFF) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

8.7.1 Users Fail to Access the Internet After MFF Is Configured

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when users fail to access the Internet after MFF is configured.

Common Causes

This fault is commonly caused by one of the following:

l No binding table is generated for users.

l User access configurations are incorrect. For example, DHCP snooping is not enabled on the user-side interface; the network-side interface is not configured as a trusted interface; the user address is on a different network segment than the gateway address.

l MFF configurations are incorrect. For example, the user-side interface is not added to the MFF-enabled VLAN, or no network-side interface is configured.

l The S6700 does not receive any ARP reply packet from the gateway because the route from the S6700 to the gateway is unreachable or the link between them is busy.

The troubleshooting roadmap is as follows:

l Check whether MFF host information is generated.

l Check whether the gateway MAC address is learned by the S6700.



[Figure: flowchart for troubleshooting users failing to access the Internet after MFF is configured. It first checks whether host information is generated and whether the gateway MAC address is learned. If no host information is generated, it checks the user binding table, the user configurations, and the MFF configuration. If no gateway MAC address is learned, it checks whether ARP reply packets are received, whether the route to the gateway is reachable, and whether ARP reply packets are discarded (if so, start timed gateway address detection).]


Procedure

Step 1 Run the display mac-forced-forwarding vlan vlan-id command to check generated MFF information.



l If the User IP and User MAC fields are empty, no host information is generated. Go to step 2.

l If the Gateway MAC field is empty, no gateway MAC address is learned. Go to step 3.

Step 2 Check configurations to ensure that MFF host information is generated.

1. Check that user access configurations are correct.

User Type: Dynamic user
Check Item: DHCP snooping is enabled on the user-side interface.
Method: Run the display this command in the user-side interface view to check whether the dhcp snooping enable command has been executed.
Solution: If the dhcp snooping enable command has not been executed, run this command in the interface view. You can also run this command in the VLAN view if the user-side interface has been added to the VLAN.

User Type: Dynamic user
Check Item: The network-side interface is configured as a trusted interface.
Method: Run the display this command in the network-side interface view to check whether the dhcp snooping trusted command has been executed.
Solution: If the dhcp snooping trusted command has not been executed, run this command in the interface view. You can also run the dhcp snooping trusted interface command in the VLAN view if the network-side interface has been added to the VLAN.

User Type: Dynamic user
Check Item: Users can get online.
Method: Run the display dhcp snooping user-bind vlan vlan-id command to check whether DHCP snooping entries exist.
Solution: If there is no DHCP snooping entry for the user IP address, the user cannot get online. Rectify the fault according to 8.4.1 Users Fail to Go Online After DHCP Snooping Is Configured.

User Type: Static user
Check Item: A correct static gateway address is configured.
Method: Run the display this command in the MFF-enabled VLAN view to check whether the mac-forced-forwarding static-gateway ip-address command has been executed and whether the static gateway address is on the same network segment as the static user address.
Solution: If the mac-forced-forwarding static-gateway ip-address command has not been run or the static gateway address is on a different network segment than the static user address, run the mac-forced-forwarding static-gateway ip-address command to configure a static gateway that resides on the same network segment as the static user.

If the fault persists, go to substep 2.

2. Check that MFF configurations are correct.

l Run the display this command in the user-side interface view to check whether the interface is added to the MFF-enabled VLAN. If not, add it to the MFF-enabled VLAN by using commands.

l Run the display this command in the network-side interface view to check whether the mac-forced-forwarding network-port command is run. If it is not run, run this command.

If MFF configurations of both the user-side and network-side interfaces are correct, go to step 3.
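The "same network segment" check for static users in this step can be verified offline with Python's standard ipaddress module. This is a minimal sketch; the function name, the sample addresses, and the /24 prefix length are illustrative assumptions.

```python
import ipaddress


def same_segment(user_ip: str, gateway_ip: str, prefix_len: int) -> bool:
    """Return True if the gateway lies in the user's network segment."""
    # Derive the user's network from its address and prefix length.
    network = ipaddress.ip_interface(f"{user_ip}/{prefix_len}").network
    return ipaddress.ip_address(gateway_ip) in network


ok = same_segment("10.1.1.2", "10.1.1.1", 24)    # gateway in 10.1.1.0/24
bad = same_segment("10.1.1.2", "10.1.2.1", 24)   # gateway in a different segment
```

If the check returns False for the configured static gateway, the static gateway address needs to be changed to one in the user's segment.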

Step 3 Ensure that the S6700 can learn the gateway address.

1. Check whether the S6700 receives an ARP reply packet from the gateway.

Run the debugging ethernet packet arp interface interface-type interface-number command in the user view to check whether the S6700 receives an ARP reply packet from the gateway.

l If the S6700 does not receive any ARP reply packet from the gateway, go to substep 3 for dynamic users, and go to substep 2 for static users.

l If the S6700 receives the ARP reply packet from the gateway, but the gateway MAC address still cannot be learned, go to step 4.

2. Check that the link between the S6700 and the gateway works properly.

Ping the gateway from the S6700 to check whether the route between them is reachable.

l If the ping fails, rectify the link fault.

l If the ping succeeds, go to substep 3.

3. Check whether ARP reply packets are discarded.

l Run the display this command in the interface view, VLAN view, and system view to check whether a rate limit is set for ARP packets.

If the rate limit is set to a low value, ARP reply packets may be discarded. Run the arp anti-attack rate-limit command to increase the rate limit.

l Run the mac-forced-forwarding gateway-detect command in the MFF-enabled VLAN view to enable timed gateway address detection and obtain the gateway MAC address by retransmitting an ARP request packet.

If the gateway MAC address still cannot be learned, go to step 4.



----End


Relevant Alarms

None.

Relevant Logs

None.

8.8 ACL Troubleshooting

This chapter describes common causes of ACL faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

8.8.1 A User-Defined ACL Fails to Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when a user-defined ACL fails to take effect.

Common Causes

This fault is commonly caused by one of the following:

l Packets do not match user-defined ACL rules.

l The traffic policy configured with the user-defined ACL is applied incorrectly. For example, the traffic policy is applied to an incorrect object or applied in an incorrect direction.

l Packets match another traffic policy with a higher priority.

If a user-defined ACL fails to take effect, see Figure 8-20.




336


Troubleshooting


A user-defined

ACL fails to take effect

8 Security

Do packets match the ACL rule?

Yes

Is the traffic policy applied correctly?

Yes

No

No

Change the

ACL rule

Apply the traffic policy correctly

Do packets match another higher-priority traffic policy?

Yes

Change the traffic policy and its rules

No


No

Is fault rectified?

Yes

Is fault rectified?

Yes

No

No

Is fault rectified?

Yes

End


NOTE


Procedure

Step 1 Check whether packets match user-defined ACL rules.

CAUTION

User-defined ACLs match 4-byte information each time. Therefore, configuring 4-byte user-defined ACLs is recommended. If only 2 bytes are specified, the 2 bytes are used as the lower 2 bytes of 4 bytes to match information.

Run the display acl command to view user-defined ACL rules and then capture packets to check whether information in the packets (including the IP address, MAC address, DSCP priority, VLAN ID, and 802.1p priority) matches the user-defined ACL rules.

l If information in the packets does not match the user-defined ACL rules, run the rule command to modify the ACL rules to match the information.

l If information in the packets matches the user-defined ACL rules, go to step 2.

Step 2 Check that the traffic policy configured with the user-defined ACL is applied correctly.

1. Determine the traffic policy configured with the user-defined ACL.

Run the display current-configuration command to view the current configuration file.

Search for the traffic classifier containing the if-match acl acl-number command, and then determine the traffic policy bound to this traffic classifier.

2. Check whether the traffic policy configured with the user-defined ACL is applied correctly.

Run the display traffic-policy applied-record command to check whether the traffic policy is applied to the correct VLAN, LPU, or interface and whether the traffic policy is applied in the correct direction. On the S6700, when a user-defined ACL is applied, the traffic policy can be applied only in the inbound direction to match incoming packets.

l If the traffic policy is applied to an incorrect object, run the traffic-policy command to apply the traffic policy to a correct object.

l If the traffic policy is applied in an incorrect direction, run the undo traffic-policy command to delete the traffic policy, and then run the traffic-policy command to apply the traffic policy in the correct direction.

l If the fault persists, go to step 3.

Step 3 Check whether packets match another traffic policy with a higher priority.

For details, see step 2 in 9.1.1 Traffic Policy Fails to Take Effect.



----End


Relevant Alarms

None.

Relevant Logs

None.


This section provides user-defined ACL troubleshooting cases.

A User-Defined ACL Fails to Limit the Packet Rate

Fault Symptom

As shown in Figure 8-21, PCs access the network from XGE 0/0/1 by using a PPPoE dialer and are authenticated by a RADIUS server. If too many user packets are sent to the RADIUS server, the RADIUS server will stop responding. To solve this problem, the Switch needs to limit the rate of UDP packets from users by using a user-defined ACL. Assume that a user-defined ACL (ACL 5000) containing the rule rule 5 permit l2-head 0x0011 0x00ff 30 is configured. The traffic behavior limits the packet rate to 20 Mbit/s and collects traffic statistics, and the traffic policy is named udp. After these configurations are complete, traffic is not limited and no traffic statistics are collected.


[Figure 8-21: network diagram with the labels Internet, RADIUS Server (10.10.10.1/32), Switch (XGE0/0/1, XGE0/0/2), PC1, PC2 (10.1.1.2/24), and PC3]

Configure the Switch.


[Quidway] acl 5000

[Quidway-acl-user-5000] rule permit l2-head 0x0011 0x00ff 30

[Quidway-acl-user-5000] quit

[Quidway] traffic classifier udp

[Quidway-classifier-udp] if-match acl 5000

[Quidway-classifier-udp] quit

[Quidway] traffic behavior udp

[Quidway-behavior-udp] statistic enable

[Quidway-behavior-udp] car cir 20000

[Quidway-behavior-udp] quit

[Quidway] traffic policy udp

[Quidway-trafficpolicy-udp] classifier udp behavior udp

[Quidway-trafficpolicy-udp] quit

[Quidway] interface xgigabitethernet 0/0/2

[Quidway-XGigabitEthernet0/0/2] traffic-policy udp inbound

After the preceding configurations are complete, use a tester to simulate login for a large number of users and observe outgoing traffic on XGE 0/0/2. Traffic information shows that the traffic rate is still greater than 20 Mbit/s. That is, the rate limit fails to take effect. Then run the display traffic policy statistics interface interface-type interface-number inbound command. The following command output is displayed.

[Quidway-XGigabitEthernet0/0/2] display traffic policy statistics interface xgigabitethernet 0/0/2 inbound
 Traffic policy inbound: udp
 Rule number: 1
 Current status: OK!
 Board : 3
 Item                 Packets           Bytes
 ---------------------------------------------------------------------
 Matched                    0               0
   +--Passed                0               0
   +--Dropped               0               0
     +--Filter              0               0
     +--URPF                -               -
     +--CAR                 0               0

The preceding command output shows that no traffic statistics are collected. That is, packets do not match the traffic policy udp configured with ACL 5000.

Fault Analysis

1. Check whether packets match the ACL rule.

Run the display acl 5000 command on the Switch. The following command output is displayed.

[Switch] display acl 5000
User ACL 5000, 1 rule
Acl's step is 5
 rule 5 permit 0x00000011 0x000000ff 30

Capture packets on XGE 0/0/2 and analyze the packets sent from the Switch to the RADIUS server. Packet information shows that the UDP protocol number is 0x11, and the offset from the Layer 2 header is 30. 0x11 should match the higher 16 bits, but the ACL rule is configured to match the lower 16 bits. As a result, packets fail to match the ACL rule.
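The mismatch can be reproduced with a small model of 4-byte matching. This is an illustrative sketch: the function name is hypothetical, and the four window bytes are made-up sample values in which the second byte carries the protocol number 0x11, consistent with the analysis above that 0x11 falls in the higher 16 bits of the 4-byte window at offset 30.

```python
def rule_matches(window: bytes, value: int, mask: int) -> bool:
    """Compare a 4-byte window against a value under a mask.

    A 2-byte value such as 0x0011 is simply a small integer here, so it
    occupies the lower 16 bits of the 32-bit word, as the text describes.
    """
    if len(window) != 4:
        raise ValueError("user-defined ACLs match exactly 4 bytes at a time")
    word = int.from_bytes(window, "big")
    return (word & mask) == (value & mask)


# Sample window at offset 30 (hypothetical bytes): 0x40, protocol 0x11, 0xAB, 0xCD.
window = bytes([0x40, 0x11, 0xAB, 0xCD])

original_rule = rule_matches(window, 0x0011, 0x00FF)           # compares 0xCD with 0x11
corrected_rule = rule_matches(window, 0x00110000, 0x00FF0000)  # compares 0x11 with 0x11
```

In this model the original rule compares the last byte of the window with 0x11 and fails, whereas the 4-byte form 0x00110000 with mask 0x00ff0000 compares the protocol byte and matches.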

Procedure


Step 2 Run the acl 5000 command to enter the user-defined ACL view.

Step 3 Run the undo rule 5 command to delete ACL rule 5.

Step 4 Run the rule permit l2-head 0x00110000 0x00ff0000 30 command to re-define an ACL rule.

After the preceding configurations are complete, use a tester to simulate login for a large number of users, and observe outgoing traffic on XGE 0/0/2. Traffic information shows that the traffic rate is smaller than 20 Mbit/s. The fault is rectified.

----End

Summary

On the S6700, user-defined ACLs match 4-byte information each time. Therefore, configuring 4-byte user-defined ACLs is recommended. If only 2 bytes are specified, the S6700 uses the 2 bytes as the lower 2 bytes of 4 bytes to match information.

8.9 PPPoE+ Troubleshooting

This chapter describes common causes of Point-to-Point Protocol over Ethernet (PPPoE) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.






8.9.1 PPPoE Users Fail to Access the Internet

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when PPPoE users fail to access the Internet.

Common Causes

This fault is commonly caused by one of the following:

l The link between the PPPoE client and the PPPoE server is faulty.

l The PPPoE+ configuration is incorrect. For example, an uplink interface is an untrusted interface, or the action for processing original fields in PPPoE packets or the format of information added to PPPoE packets is incorrect.




[Figure: flowchart for troubleshooting PPPoE users failing to access the Internet. It checks in turn whether the interface is Up (if not, see "Connected Ethernet Interfaces Down"), whether PPPoE+ is enabled (if not, whether Layer 2 forwarding is correct), whether the action for processing original fields is correct (if not, configure a correct action), and whether the format of information added to packets is correct (if not, configure a correct format).]

Procedure

Step 1 Check that the connected interfaces of the user device, S6700, and PPPoE server are Up.




l If some interfaces are Down, rectify the fault according to "Connected Ethernet Interfaces Down".

l If these interfaces are Up, go to step 2.

Step 2 Check that PPPoE+ is enabled.

Run the display this command in the system view to check whether the pppoe intermediate-agent information enable command was executed. If not, PPPoE+ is disabled.

l If PPPoE+ is disabled, the S6700 directly forwards PPPoE packets at Layer 2. If PPPoE packets are not forwarded from the S6700, or if packets are forwarded at Layer 2 but PPPoE users cannot access the Internet, go to step 4.

l If PPPoE+ is enabled, go to step 3.

Step 3 Check that the PPPoE+ configuration is correct.

1. Check whether the network-side interface connected to the PPPoE server is a trusted interface.

If the network-side interface is not a trusted interface, PPPoE server spoofing may occur or PPPoE packets may be forwarded to non-PPPoE interfaces. As a result, authorized PPPoE users cannot access the Internet.

Run the display this command in the view of the network-side interface to check whether the pppoe uplink-port trusted command was executed.

l If not, the network-side interface is an untrusted interface. Run the pppoe uplink-port trusted command to configure the network-side interface as a trusted interface.

l If so, the network-side interface is a trusted interface. Go to substep 2.

2. Check whether the action for processing original fields in PPPoE packets is correct.

Run the display this command in the system view and in the view of the user-side interface to check whether the pppoe intermediate-agent information policy command was executed in the system and on the interface. If the action is configured both on the interface and in the system, the action configured on the interface takes effect. If the action is configured neither on the interface nor in the system, the system by default replaces the original fields in PPPoE packets according to the configured field format.

l If the action is incorrect, run the pppoe intermediate-agent information policy command to configure a correct action.

l If the action is correct, go to substep 3.

3. Check that the format of information added to PPPoE packets is correct.

Run the display pppoe intermediate-agent information format command to check whether the format of information added to PPPoE packets is supported by the PPPoE server.

l If the format is not supported by the PPPoE server, run the pppoe intermediate-agent information format command to configure a format supported by the PPPoE server.

l If the format is supported by the PPPoE server, go to step 4.



l Configuration file, log file, and alarm file of the S6700

----End
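The precedence described in Step 3 for the action that processes original fields (interface configuration first, then system configuration, then the default replacement) can be sketched as follows. This is an illustrative Python sketch, not device code: the function name effective_policy and the action strings "keep", "drop", and "replace" are assumptions used to model the pppoe intermediate-agent information policy setting.

```python
from typing import Optional

# Per the procedure, original fields are replaced when nothing is configured.
DEFAULT_POLICY = "replace"

def effective_policy(interface_policy: Optional[str],
                     system_policy: Optional[str]) -> str:
    """Return the action that takes effect for packets on one interface."""
    if interface_policy is not None:
        return interface_policy      # interface configuration wins
    if system_policy is not None:
        return system_policy         # otherwise the system-wide configuration
    return DEFAULT_POLICY            # otherwise the default action

both = effective_policy("keep", "drop")
system_only = effective_policy(None, "drop")
neither = effective_policy(None, None)
```

When the observed behavior differs from the planned action, comparing the two display this outputs against this order usually shows which configuration is in force.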


Relevant Alarms

None.

Relevant Logs

None.


8.10 URPF Troubleshooting

This section provides a troubleshooting case for Unicast Reverse Path Forwarding (URPF).


Communication Between Connected Interfaces Is Interrupted Intermittently

Fault Symptom

As shown in Figure 8-23, the network devices are configured with Unicast Reverse Path Forwarding (URPF) and communicate with each other by using the Open Shortest Path First (OSPF) protocol. The path cost between SwitchA and SwitchB is 800; the path cost between SwitchC and SwitchA is 1000; the path cost between SwitchC and SwitchB is 1000.

Figure 8-23 Diagram of connected interfaces

[Figure: SwitchA, SwitchB, and SwitchC are interconnected through their XGE0/0/1 and XGE0/0/2 interfaces.]

When SwitchA pings the IP address of XGE0/0/1 on SwitchC, the ping operation fails intermittently.






Fault Analysis

1. Run the display ip routing-table command on SwitchC. The displayed OSPF routing entries are normal.

2. Analyze the transmission path of the ping packets. When SwitchA pings the IP address of XGE0/0/1 on SwitchC, two paths are available for the ping request packet: SwitchA -> SwitchB -> SwitchC with a cost of 1800 and SwitchA -> SwitchC with a cost of 2000. The first path has a smaller cost, so it is selected. The ping reply packet can be transmitted through the path SwitchC -> SwitchA or SwitchC -> SwitchB -> SwitchA. The costs of the two paths are both 1800, so the two paths are equal-cost paths.

l When the reply packet is transmitted through the same path as the request packet, that is, SwitchC -> SwitchB -> SwitchA, the reply packet passes the URPF check and the ping operation succeeds.

l When the reply packet is transmitted through the other path, the reply packet fails the URPF check and is discarded. In this case, the ping operation fails.

NOTE

When a device receives a packet, it searches the forwarding table according to the destination IP address of the packet. If a route is found, the device forwards the packet through the route; otherwise, the device discards the packet. After URPF is configured, the device obtains the source IP address and the inbound interface of the packet and searches for the forwarding entry with the source IP address as the destination address. If the outbound interface in the forwarding entry is different from the inbound interface of the packet, the device considers the source IP address invalid and discards the packet. In this way, URPF can effectively prevent malicious users from sending packets with bogus source addresses to attack the network.

On this network, URPF is enabled on the connected interfaces and two equal-cost paths are available for ping packets. The ping operation succeeds when the ping reply packet passes through a path and fails when the ping reply packet passes through the other path.

In conclusion, the fault is caused by URPF configured on the connected interfaces.
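The URPF check described in the NOTE above can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions, not the S6700 forwarding logic: the forwarding table is a plain dictionary, the prefix 192.0.2.0/24 and the interface labels to_SwitchB and to_SwitchC are hypothetical, and a source address with no matching route is treated as a failed check.

```python
import ipaddress

def lookup(fib, addr):
    """Longest-prefix match; return the outbound interface or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, out_if in fib.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, out_if)
    return None if best is None else best[1]

def urpf_pass(fib, src_ip, in_if):
    """Pass only if the route back to the source leaves through the inbound interface."""
    out_if = lookup(fib, src_ip)
    return out_if is not None and out_if == in_if

# Hypothetical table on SwitchA: the route to SwitchC's address points toward SwitchB.
fib_a = {"192.0.2.0/24": "to_SwitchB"}

via_b = urpf_pass(fib_a, "192.0.2.1", "to_SwitchB")     # reply follows the request path
direct = urpf_pass(fib_a, "192.0.2.1", "to_SwitchC")    # reply arrives on the other link
unknown = urpf_pass(fib_a, "198.51.100.1", "to_SwitchB")  # no route to the source
```

In this model the same reply packet passes or fails depending only on the interface it arrives on, which matches the intermittent ping failures observed in this case.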

Procedure



Step 3 Run the undo urpf command to disable URPF on the interface.

After URPF is disabled on the connected interfaces, SwitchA can ping the IP address of XGE0/0/1 on SwitchC successfully.

----End

Summary

URPF is recommended on user-side interfaces or network-side interfaces but does not need to be configured on connected interfaces between network devices.



9 QoS


9.1 Traffic Policy Troubleshooting

This chapter describes common causes of traffic policy faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.2 Priority Mapping Troubleshooting

This chapter describes common causes of priority mapping faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.3 Traffic Policing Troubleshooting

This chapter describes common causes of traffic policing faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.4 Traffic Shaping Troubleshooting

This chapter describes common causes of traffic shaping faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.5 Congestion Avoidance Troubleshooting

This chapter describes common causes of congestion avoidance faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.6 Congestion Management Troubleshooting

This chapter describes common causes of congestion management faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.


9.1 Traffic Policy Troubleshooting

This chapter describes common causes of traffic policy faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.1.1 Traffic Policy Fails to Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when the traffic policy fails to take effect.

Common Causes

This fault is commonly caused by one of the following:

l The packets do not match rules of the traffic classifier in the traffic policy.

l The traffic behavior associated with the traffic classifier in the traffic policy is configured incorrectly.

l The traffic policy is applied to an incorrect object.

l The traffic policy conflicts with another applied traffic policy and the packets match rules in the applied traffic policy.


Figure 9-1 Troubleshooting flowchart for ineffective traffic policy

[Flowchart: Traffic policy does not take effect. Decision points: Do packets match traffic classification rules? Is the traffic policy correctly set? Do packets match a higher-priority rule? Is the fault rectified? Actions: modify traffic classification rules; change the object or direction to which the traffic policy is applied; replan the rule to match packets.]


Procedure

Step 1 Check whether packets match traffic classification rules.






Run the display traffic policy statistics command to check the traffic statistics in the system, on an interface, or in a VLAN to which a traffic policy is applied. If the corresponding field is empty, packets do not match traffic classification rules.

NOTE

Before viewing the traffic statistics, you must run the statistic enable command in the traffic behavior view to configure the traffic statistics function.

l If packets match traffic classification rules, go to step 4.

l If packets do not match traffic classification rules, go to step 2.

Step 2 Check that the traffic policy is configured correctly.

1. Check whether the information in the packets matches traffic classification rules.

Check the information in the packets, such as the IP address, MAC address, VLAN ID, and 802.1p priority. Run the display traffic policy user-defined command to view the traffic classifier bound to the traffic policy, and then run the display traffic classifier user-defined command to view the rules in the traffic classifier. Capture the packets on the inbound interface and check whether the packet characteristics match the traffic classification rules.

l If not, modify the rules to match the information in the packets.

l If so, go to substep 2.

2. Run the display traffic policy user-defined policy-name classifier classifier-name command to check whether the traffic behavior associated with the traffic classifier is configured correctly.

l If not, run the traffic behavior command to enter the traffic behavior view and correctly configure the traffic behavior.

l If so, go to substep 3.

3. Check whether the traffic policy is applied correctly.

Run the display traffic-policy applied-record command to check whether the traffic policy is applied successfully, whether it is applied to the correct VLAN, LPU, or interface, and whether it is applied to the correct direction. On the S6700, when the traffic policy is applied to incoming packets, the displayed traffic policy direction should be inbound; when it is applied to outgoing packets, the displayed direction should be outbound.

l If the traffic policy is applied to an incorrect object or direction, run the undo traffic-policy command to unbind the traffic policy from the incorrect object or direction, and then run the traffic-policy command to re-apply the traffic policy.

l If the traffic policy fails to be applied, go to step 4.


Step 3 Check whether packets match another rule that has a higher priority.

Run the display current-configuration command to check whether packets match another rule on the S6700 and the matching order of traffic classifiers in a traffic policy.


l If there is another matching rule and the automatic order is used, check the rule that takes effect:

1. Check the types of traffic classifiers.

The S6700 supports four types of traffic classifiers, which are based on Layer 2 information, Layer 3 information, User-defined Flow (UDF), and Layer 2 and Layer 3 information. The rule priorities of the traffic classifiers in descending order are UDF, Layer 2 and Layer 3 information, Layer 3 information, and Layer 2 information.

The rules in traffic classifiers are defined as follows:

– Layer 2: The traffic classifier contains only Layer 2 rules.

– Layer 3: The traffic classifier contains only Layer 3 rules.

– Layer 2 and Layer 3: The traffic classifier contains Layer 2 and Layer 3 rules. The logical relationship between the rules is OR.

– UDF: The traffic classifier contains only user-defined rules.

Table 9-1 describes the classification of rules in traffic classifiers.

Table 9-1 Classification of rules in traffic classifiers

Type       Rule

Layer 2    if-match acl 4000-4999
           if-match any
           if-match cvlan-8021p
           if-match cvlan-id
           if-match 8021p
           if-match vlan-id
           if-match destination-mac
           if-match source-mac
           if-match inbound-interface
           if-match outbound-interface
           if-match discard
           if-match double-tag
           if-match l2-protocol

Layer 3    if-match acl 2000-2999
           if-match acl 3000-3999
           if-match dscp
           if-match ip-precedence
           if-match protocol
           if-match tcp

UDF        if-match acl 5000-5999

2. Check the object where the traffic policy is applied.

On the S6700, the traffic policy can be applied to the entire system, a VLAN, or an interface. When rules in traffic classifiers are of the same type and the traffic policy is applied to different objects, the traffic policies applied to an interface, a VLAN, and the entire system take effect in descending order of priority.

If the other rule takes precedence over the current rule, the traffic action corresponding to the other rule takes effect. Replan the rule so that the current rule takes effect and other services are not affected. Otherwise, go to step 4.



l If there is another matching rule and the configuration order is used, check the rule that takes effect:

1. Compare the objects to which the traffic policies are applied if the rule and the current rule are bound to different traffic policies. On the S6700, the traffic policies applied to an interface, a VLAN, and the system take effect in descending order of priority. The rule that is bound to the traffic policy with a higher priority takes effect.

2. Determine the sequence in which traffic classifiers were bound to a traffic policy. If the rule and the current rule are bound to the same traffic policy but are in different traffic classifiers, the rule in the traffic classifier that is bound to the traffic policy first takes effect.

3. Determine the sequence in which rules were configured in an ACL. If the rule and the current rule are bound to the same traffic policy, traffic classifier, and ACL, the rule that was configured in the ACL first takes effect.

If another rule takes effect, modify the rule so that the current rule takes effect and other services are not affected. Otherwise, go to step 4.
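The three comparisons used with the configuration order can be condensed into a small Python sketch. It is a simplification under stated assumptions: each candidate rule is a dictionary with the hypothetical keys applied_to, classifier_pos (the position at which its classifier was bound to the policy), and rule_pos (the position at which the rule was configured in the ACL), and the three checks are combined into one lexicographic comparison.

```python
# Lower rank wins: interface, then VLAN, then system, as stated in the procedure.
OBJECT_RANK = {"interface": 0, "vlan": 1, "system": 2}

def winning_rule(candidates):
    """Return the candidate rule that takes effect under the configuration order."""
    return min(
        candidates,
        key=lambda c: (OBJECT_RANK[c["applied_to"]],
                       c["classifier_pos"],
                       c["rule_pos"]),
    )

# Hypothetical rules that all match the same packets.
candidates = [
    {"name": "vlan-policy rule", "applied_to": "vlan",
     "classifier_pos": 0, "rule_pos": 0},
    {"name": "interface rule, second classifier", "applied_to": "interface",
     "classifier_pos": 1, "rule_pos": 0},
    {"name": "interface rule, first classifier", "applied_to": "interface",
     "classifier_pos": 0, "rule_pos": 5},
]
winner = winning_rule(candidates)
```

In this example the rule in the first classifier bound to the interface-level policy takes effect, even though it was configured late in its ACL, because the object and the classifier order are compared before the ACL rule order.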



----End


Relevant Alarms

None.

Relevant Logs

None.


This section provides traffic policy troubleshooting cases.

PBR Based on Traffic Policies Fails to Take Effect

Fault Symptom

As shown in Figure 9-2, policy-based routing (PBR) based on traffic policies is configured on the Switch so that data flows are redirected to the next hop 10.1.1.2/24 when enterprise users access the Web service.



[Figure 9-2: Network diagram. Labels: Enterprise user, LSW, Switch (XGE0/0/1), Router1, Router2, 10.1.1.2/24, 172.1.1.2/24, Internet, Intranet.]

After the configuration, data flows are not redirected to the next hop 10.1.1.2 when enterprise users access the Web service.

Fault Analysis

1. Capture packets on the inbound interface XGE 0/0/1 of the Switch when enterprise users access the Web service. Data flows for enterprise users' access to the Web service can be captured.

2. Run the display ip routing-table command to view the routing table. There is a route to 10.1.1.2/24.

3. Check whether the data flows match another rule with a higher priority.

a. Run the display this command in the view of the inbound interface XGE 0/0/1 to view the traffic policy configuration.

[Switch-XGigabitEthernet0/0/1] display this



port trunk allow-pass vlan 100

traffic-policy tp1 inbound

#

return

b. Run the display traffic policy user-defined command to view the detailed traffic policy configuration.

[Switch] display traffic policy user-defined tp1


Policy: tp1

Classifier: tc1

Operator: AND

Behavior: tb1

Deny

Classifier: tc2

Operator: AND

Behavior: tb2

Redirect:

Redirect ip-nexthop 10.1.1.2

c. Run the display current-configuration command to check the matching order of traffic classifiers in the traffic policy.






Two traffic classifiers are bound to the traffic policy tp1; therefore, you need to check the matching order of traffic classifiers in the traffic policy.

<Quidway> display current-configuration

#

traffic policy tp1

classifier tc1 behavior tb1


The preceding information indicates that the automatic order is used.

NOTE

The automatic order is used by default and is not displayed in the configuration file.

d. Run the display traffic classifier user-defined command to check the configurations of traffic classifiers tc1 and tc2. The system displays the following information:

[Switch] display traffic classifier user-defined tc1


Classifier: tc1

Operator: AND

Rule(s) : if-match any

if-match dscp 6

[Switch] display traffic classifier user-defined tc2


Classifier: tc2

Operator: AND

Rule(s) : if-match acl 3000

e. Run the display acl 3000 command to view the content of ACL 3000.

[Switch] display acl 3000


Acl's step is 5

rule 5 permit tcp destination-port eq www

The preceding information indicates that the matching order of traffic classifiers in the traffic policy is auto and that the traffic policy is bound to two traffic classifiers, tc1 and tc2.

The matching rules of tc1 are if-match any and if-match dscp 6, which together form a Layer 2 and Layer 3 rule. The matching rule of tc2 is if-match acl 3000, which is a Layer 3 rule. On the S6700, if the matching order of traffic classifiers in the traffic policy is auto, a Layer 2 and Layer 3 rule takes precedence over a Layer 3 rule. Therefore, the data flows match tc1, whose behavior tb1 applies the deny action. Such data flows are discarded and cannot be redirected to 10.1.1.2/24.
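The reasoning in this analysis can be reproduced with a minimal Python sketch that classifies each traffic classifier by the types of its rules (following Table 9-1) and sorts classifiers in the automatic order. The function names and the string labels "L2", "L3", "L2+L3", and "UDF" are assumptions for illustration; classifiers that mix user-defined rules with other rule types are not covered by the source and are rejected here.

```python
L2_KEYS = {"any", "cvlan-8021p", "cvlan-id", "8021p", "vlan-id", "destination-mac",
           "source-mac", "inbound-interface", "outbound-interface", "discard",
           "double-tag", "l2-protocol"}
L3_KEYS = {"dscp", "ip-precedence", "protocol", "tcp"}
PRIORITY = ["UDF", "L2+L3", "L3", "L2"]  # descending priority in the automatic order

def rule_layer(rule):
    """Map one 'if-match ...' rule to its type according to Table 9-1."""
    words = rule.split()
    key = words[1]
    if key == "acl":
        number = int(words[2])
        if 4000 <= number <= 4999:
            return "L2"
        if 2000 <= number <= 3999:
            return "L3"
        if 5000 <= number <= 5999:
            return "UDF"
    elif key in L2_KEYS:
        return "L2"
    elif key in L3_KEYS:
        return "L3"
    raise ValueError("rule not listed in Table 9-1: " + rule)

def classifier_type(rules):
    layers = {rule_layer(r) for r in rules}
    if layers == {"L2", "L3"}:
        return "L2+L3"
    if len(layers) == 1:
        return next(iter(layers))
    raise ValueError("combination not covered by this sketch")

def auto_order(classifiers):
    """Return classifier names sorted from highest to lowest priority."""
    return sorted(classifiers,
                  key=lambda name: PRIORITY.index(classifier_type(classifiers[name])))

classifiers = {"tc2": ["if-match acl 3000"],
               "tc1": ["if-match any", "if-match dscp 6"]}
order = auto_order(classifiers)
```

With the configuration from this case, tc1 is evaluated before tc2, so the deny behavior bound to tc1 is applied before the redirect behavior bound to tc2 can be reached.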

Procedure


Step 2 Run the traffic policy tp1 command to enter the view of the traffic policy tp1.

Step 3 Run the undo traffic classifier tc1 command to unbind the traffic classifier tc1 from the traffic policy.

After the preceding operations, when enterprise users access the Web service, data flows are redirected to the next hop 10.1.1.2. The fault is rectified.

NOTE

Before unbinding the traffic classifier tc1 from the traffic policy, ensure that tc1 is not in use. You may need to replan the rule priorities according to the network requirements.

----End




Summary


If PBR based on traffic policies fails to take effect, the possible causes are as follows:

l The data flows do not match rules in the traffic policy.

l The route destined for the next hop does not exist in the routing table.

l The data flows match a rule with a higher priority. For details on how to determine the priorities of rules, see Step 3 in 9.1.1 Traffic Policy Fails to Take Effect.

Re-marking Fails to Take Effect After the Traffic Policy Is Applied to the Super-VLAN

Fault Symptom

As shown in Figure 9-3, the re-marking function is configured on the Switch to re-mark DSCP priorities of user packets. The upstream router then performs unified QoS control of user packets according to the re-marked priorities.


[Figure 9-3: Network diagram. Labels: IP/MPLS core network, Router, Switch (XGE0/0/1, XGE0/0/2), Super-VLAN 2, Sub-VLANs 2000 to 2010.]

After the configuration is complete, packets are captured on the inbound interface XGE 0/0/1 and the outbound interface XGE 0/0/2. The captured packets show that their DSCP priorities remain unchanged, which means that the re-marking function fails to take effect.






Fault Analysis

1. Run the display current-configuration command to check the configuration of the Switch.

The command output is as follows:

<Switch> display current-configuration

traffic classifier temp operator and

 if-match any

traffic behavior temp

 statistic enable

 remark dscp af23

traffic policy temp

 classifier temp behavior temp

vlan 2

#

 traffic-policy temp inbound

 aggregate-vlan

 access-vlan 2000 to 2010

interface XGigabitEthernet0/0/1

 undo port trunk allow-pass vlan 1

 port trunk allow-pass vlan 2000 to 2010

 port trunk allow-pass vlan 2000 to 2010

The preceding information indicates that a traffic policy (the only traffic policy) on the Switch re-marks all packets with AF23 and is applied to the inbound direction in super-VLAN 2. Such a configuration of the Switch is correct.

2. Run the display traffic policy statistics vlan 2 command on the Switch to check whether the traffic policy is matched.

The command output is as follows:

<Switch> display traffic policy statistics vlan 2 inbound verbose classifier-base

Vlan: 2

Traffic policy inbound: temp

Rule number: 1

Current status: OK!

---------------------------------------------------------------------

Classifier: temp operator and

Behavior: temp

Board : 0

Item Packets Bytes

---------------------------------------------------------------------

Matched 0 0

+--Passed 0 0

+--Dropped 0 0

+--Filter 0 0

+--URPF - -

+--CAR 0 0

The preceding information indicates that the traffic policy in super-VLAN 2 is not matched.

3. Analyze the packets captured on XGE0/0/1. The packets carry VLAN 2000, different from super-VLAN 2 to which the traffic policy is applied.

When a traffic policy is applied to the super-VLAN, the traffic policy matches only the packets carrying the super-VLAN ID and it is ineffective for packets in sub-VLANs.






Procedure

Step 1 Run the system-view command on the Switch to enter the system view.

Step 2 Run the traffic classifier temp command to enter the traffic classifier view.

Step 3 Run the if-match vlan-id 2000 to 2010 command to change the rule of the traffic classifier to match all packets in sub-VLANs.

After the preceding steps, packets captured on XGE 0/0/1 and XGE 0/0/2 show that the DSCP priorities are re-marked with AF23. The fault is rectified.

----End

Summary

When a traffic policy is applied to the super-VLAN, the traffic policy matches only the packets carrying the super-VLAN ID but not the sub-VLAN ID. To match the traffic policy with the packets in the super-VLAN, you need to configure the traffic policy to match all the packets in sub-VLANs and apply the traffic policy to interfaces of sub-VLANs.
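When the packets captured on XGE 0/0/1 and XGE 0/0/2 are compared in this case, the DSCP value has to be read from the IP header. The following Python sketch shows one way to do the arithmetic; the function names are assumptions, while the AF encoding (8 x class + 2 x drop precedence) follows the standard Assured Forwarding definition and the DSCP occupies the upper 6 bits of the ToS byte.

```python
def af_dscp(af_class: int, drop_precedence: int) -> int:
    """DSCP value of AFxy: 8 * class + 2 * drop precedence."""
    return 8 * af_class + 2 * drop_precedence

def dscp_from_tos(tos_byte: int) -> int:
    """Extract the 6-bit DSCP from the 8-bit ToS / Traffic Class byte."""
    return tos_byte >> 2

AF23 = af_dscp(2, 3)                 # value expected after "remark dscp af23"
remarked = dscp_from_tos(0x58) == AF23   # ToS byte 0x58 carries AF23
unchanged = dscp_from_tos(0x00) == AF23  # ToS byte 0x00 is best effort, not AF23
```

A captured packet whose ToS byte is 0x58 therefore carries AF23, which is the value expected on XGE 0/0/2 once the traffic classifier matches the sub-VLAN IDs.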

9.2 Priority Mapping Troubleshooting

This chapter describes common causes of priority mapping faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.2.1 Packets Enter Incorrect Queues

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when packets enter incorrect queues.

Common Causes


This fault is commonly caused by one of the following:

l The priority type of packets is different from the priority type trusted by the inbound interface.

l The priority mapping configured in the DiffServ domain trusted by the inbound interface is incorrect.

l There are configurations affecting the queues that packets enter on the inbound interface, including:

– port vlan-stacking

– port vlan-mapping vlan inner-vlan, or port vlan-mapping vlan map-vlan

– trust upstream none

– port link-type dot1q-tunnel

– traffic-policy containing the remark 8021p, remark dscp, remark local-precedence, or remark ip-precedence action

l traffic-policy containing the remark 8021p, remark dscp, remark local-precedence, or remark ip-precedence action in the VLAN that packets belong to.

l There are configurations affecting the queues that packets enter in the system, including:

– traffic-policy where remark 8021p, remark dscp, remark local-precedence, or remark ip-precedence action is defined


Figure 9-4 Troubleshooting flowchart for packets entering incorrect queues

[Flowchart: Packets enter queues not corresponding to priorities. Decision points: Does the priority type trusted by the inbound interface match the packet priority? Is the priority mapping in the DiffServ domain correct? Are there configurations affecting the queues that packets enter? Is the fault rectified? Actions: correctly set the priority type trusted by the inbound interface; correctly set the priority mapping.]


Procedure

Step 1 Check that the priority type of packets is the same as the priority type trusted by the inbound interface.

Run the display this command in the inbound interface view to check the configuration of the trust command on the inbound interface (if the trust command is not used, the system trusts the 802.1p priority in the outer VLAN tag by default). Then, capture packets on the inbound interface, and check whether the priority type of the captured packets is the same as the priority type trusted by the inbound interface.

l If not, run the trust command to modify the priority type trusted by the inbound interface to be the same as the priority type of the captured packets.


Step 2 Check whether the configured priority mapping is correct.

Run the display this command in the inbound interface view and check the configuration of the trust upstream command (if the trust upstream command is not used, the system trusts the default DiffServ domain). Then, run the display diffserv domain name domain-name command to check whether the priority mapping configured in the DiffServ domain trusted by the inbound interface is correct.

l If not, run the ip-dscp-inbound, ip-dscp-outbound, 8021p-inbound, or 8021p-outbound command to correctly configure the priority mapping.


Step 3 Check whether there are configurations affecting the queues that packets enter on the device.

1. Check whether there are configurations affecting the queues that packets enter on the inbound interface.

The following configurations affect the queues that packets enter on the inbound interface:

l If the port vlan-stacking command is used with remark-8021p specified, the priorities of packets are re-marked. The mapping between 802.1p priorities and local priorities may be incorrect and packets may enter incorrect queues.

l If the port vlan-mapping vlan inner-vlan, or port vlan-mapping vlan map-vlan command is used with remark-8021p specified, the priorities of packets are re-marked. The mapping between 802.1p priorities and local priorities may be incorrect and packets may enter incorrect queues.

l If the traffic-policy command where remark local-precedence is defined is used for incoming packets, the system sends packets to queues based on the re-marked priority.

l If the traffic-policy command where remark 8021p, remark ip-precedence, or remark dscp is defined is used, the system maps the re-marked priorities of packets to the local priorities and sends the packets to queues based on the mapped priorities.

l If the trust upstream none command is used, priorities of all the incoming packets are not mapped and the packets enter queues based on the default priority of the interface.

l If the port link-type dot1q-tunnel command is used but the trust 8021p inner command is not used on the interface, all the incoming packets enter queues based on the default priority of the interface.

Run the display this command in the inbound interface view to check whether there are configurations affecting the packets enqueuing on the inbound interface.

l If yes, delete or modify the configurations as required.

l If not, go to substep 2.

2. Check whether there are configurations affecting the queues that packets enter in the VLAN that packets belong to.

The following configurations affect the queues that packets enter:

l If the traffic-policy command where remark local-precedence is defined is used, the system sends packets to queues based on the re-marked priorities.

l If the traffic-policy command where remark 8021p, remark ip-precedence, or remark dscp is defined is used, the system maps the re-marked priorities of packets to the local priorities and sends the packets to queues based on the mapped priorities.

Run the display this command in the view of the VLAN that packets belong to and check whether the configurations affecting the packets enqueuing are performed in the VLAN.

l If yes, delete or modify the configurations as required.

l If not, go to substep 3.

3. Check whether there are configurations affecting the queues that packets enter in the system.

The following configurations affect the queues that packets enter:

l If the traffic-policy command where remark local-precedence is defined is used, the system sends packets to queues based on the re-marked priority.

l If the traffic-policy command where remark 8021p, remark ip-precedence, or remark dscp is defined is used, the system maps the re-marked priorities of packets to the local priorities and sends the packets to queues based on the mapped priorities.

Run the display current-configuration command to check whether the configurations affecting the packets enqueuing are performed in the system.



NOTE

Traffic policies applied to an interface, a VLAN, and the system take effect in descending order of priority.



----End


Relevant Alarms

None.

Relevant Logs

None.

9.2.2 Priority Mapping Results Are Incorrect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when priority mapping results are incorrect.

Common Causes

This fault is commonly caused by one of the following:

l On the inbound interface, packets enter incorrect queues.

l The type of the priority trusted by the outbound interface is incorrect.

l The priority mapping configured in the DiffServ domain trusted by the outbound interface is incorrect.

l There are configurations affecting the priority mapping on the outbound interface. For example:

– undo qos phb marking enable

– trust upstream none

– traffic-policy containing the remark 8021p, remark ip-precedence or remark dscp action


Figure 9-5 Troubleshooting flowchart for incorrect priority mapping

[Flowchart: Priority mapping result is incorrect. Decision points: Do packets enter correct queues on the inbound interface? (If not, see "Packets Enter Incorrect Queues".) Is the priority type trusted by the outbound interface correct? Is the priority mapping on the outbound interface correct? Are there configurations on the outbound interface affecting priority mapping? Is the fault rectified? Actions: correctly set the priority type trusted by the outbound interface; correctly set the priority mapping on the outbound interface.]


Procedure

Step 1 Check that packets enter correct queues on the outbound interface.

Run the display qos queue statistics interface interface-type interface-number command to check whether packets enter correct queues on the outbound interface.

l If not, rectify the fault according to 9.2.1 Packets Enter Incorrect Queues.


Step 2 Check that the priority type trusted by the outbound interface is correct.

Run the display this command in the view of the outbound interface to check whether the trusted priority type set by using the trust command on the outbound interface is correct. (If the trust command is not used, the system trusts the 802.1p priority in the outer VLAN tag by default.)

l If not, run the trust command to correctly configure the priority type trusted by the outbound interface.


Step 3 Check that the priority mapping configured in the DiffServ domain trusted by the outbound interface is correct.

Run the display this command in the view of the outbound interface to check whether the trust upstream command is used. If the trust upstream command is not used, the system trusts the default DiffServ domain by default.

Run the display diffserv domain name domain-name command to check whether the mapping between local priorities and packet priorities complies with service planning.

NOTE

The local priority refers to the mapped priority of the inbound interface.

l If not, run the ip-dscp-outbound or 8021p-outbound command to correctly configure the mapping between local priorities and packet priorities.


Step 4 Check whether the configurations affecting priority mapping are performed on the outbound interface.

The following configurations affect priority mapping on the outbound interface:

l If the undo qos phb marking enable command is used, the system does not perform PHB mapping for outgoing packets on an interface.

l If the trust upstream none command is used, the system does not perform PHB mapping for outgoing packets on an interface.

l If the traffic-policy command where remark 8021p, remark ip-precedence, or remark dscp is defined is used on the outbound interface, the re-marked priority is the packet priority.

Run the display this command in the view of the outbound interface to check whether the configurations affecting priority mapping are performed on the outbound interface.



l If yes, delete or modify the configurations as required.
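The Step 4 check can be run against saved display this output. The following is a minimal Python sketch, not a Huawei tool. The function name find_mapping_overrides and the sample text (including the policy name tp1) are hypothetical; only the three command strings come from this procedure. A traffic-policy line is only flagged for inspection, because the remark action is defined inside the policy and not on the interface.

```python
# Hypothetical helper: scan saved "display this" text from the outbound
# interface for the configurations that Step 4 lists.
AFFECTING_PREFIXES = (
    "undo qos phb marking enable",  # PHB mapping disabled for outgoing packets
    "trust upstream none",          # PHB mapping disabled for outgoing packets
    "traffic-policy",               # may carry a remark action; inspect the policy
)


def find_mapping_overrides(config_text):
    """Return the configuration lines that can affect outbound priority mapping."""
    hits = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith(AFFECTING_PREFIXES):
            hits.append(line)
    return hits


# Illustrative sample only; it is not output taken from this document.
sample = """
interface XGigabitEthernet0/0/3
 trust upstream none
 traffic-policy tp1 outbound
"""
print(find_mapping_overrides(sample))
```

An empty result means none of the three configurations is present on the interface.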




----End


Relevant Alarms

None.

Relevant Logs

None.


This section provides priority mapping troubleshooting cases.

Priority Mapping Is Incorrect Because the Trusted Priority Is Not Set

Fault Symptom

As shown in Figure 9-6, department 1 and department 2 are connected to the enterprise network through the Switch. Packets from department 1 and department 2 carry 802.1p priorities, whereas devices on the enterprise network process packets based on Differentiated Services Code Point (DSCP) priorities. Therefore, you need to configure priority mapping on the Switch to set the DSCP priority of packets from department 1 to 10 and the DSCP priority of packets from department 2 to 63. In this way, the devices can provide different QoS services for packets based on their DSCP priorities.


[Figure 9-6: networking diagram. Department 1 (CE1) and Department 2 (CE2) attach to Switch interfaces XGE0/0/1 (VLAN 100) and XGE0/0/2 (VLAN 200); XGE0/0/3 connects the Switch to the Router on the enterprise network.]

After the configuration, the DSCP priorities of packets from department 1 and department 2 received on the router differ from the planned values.

Fault Analysis

1. Check the priority mapping and the trusted priority of packets on the inbound interface.

a. Capture packets on access interfaces of department 1 and department 2 and analyze packet priorities. It is found that 802.1p priorities of packets are 0.

b. Run the display this command in the views of inbound interfaces XGE 0/0/1 and XGE 0/0/2 to check the configuration of the interfaces.





trust upstream out-ds

#

return

<Quidway> display diffserv domain name out-ds

ip-dscp-outbound be green map 0

ip-dscp-outbound be yellow map 0

ip-dscp-outbound be red map 0

ip-dscp-outbound af1 green map 10

ip-dscp-outbound af1 yellow map 12

ip-dscp-outbound af1 red map 14

ip-dscp-outbound af2 green map 10

ip-dscp-outbound af2 yellow map 10

ip-dscp-outbound af2 red map 10







ip-dscp-outbound ef green map 46

ip-dscp-outbound ef yellow map 46

ip-dscp-outbound ef red map 46

ip-dscp-outbound cs6 green map 48

ip-dscp-outbound cs6 yellow map 48

ip-dscp-outbound cs6 red map 48

ip-dscp-outbound cs7 green map 63

ip-dscp-outbound cs7 yellow map 63

ip-dscp-outbound cs7 red map 63

The preceding information indicates that XGE 0/0/1 is configured with DiffServ domain ds1 and XGE 0/0/2 is configured with DiffServ domain ds2. The interfaces trust the 802.1p priority in the outer VLAN tag by default. Run the display diffserv domain name command to check the configurations of ds1 and ds2.

<Switch> display diffserv domain name ds1

diffserv domain name:ds1

8021p-inbound 0 phb af2 green

8021p-inbound 1 phb af2 green







......

<Switch> display diffserv domain name ds2

diffserv domain name:ds2

8021p-inbound 0 phb cs7 green

8021p-inbound 1 phb cs7 green










......

2. Check the priority mapping and the trusted type of priority on the outgoing interface.

Run the display this command in the XGE0/0/3 interface view to check the interface configuration.





trust upstream out-ds

#

return

<Quidway> display diffserv domain name out-ds

ip-dscp-outbound be green map 0

ip-dscp-outbound be yellow map 0

ip-dscp-outbound be red map 0




ip-dscp-outbound af2 green map 10

ip-dscp-outbound af2 yellow map 10

ip-dscp-outbound af2 red map 10







ip-dscp-outbound ef green map 46

ip-dscp-outbound ef yellow map 46

ip-dscp-outbound ef red map 46

ip-dscp-outbound cs6 green map 48

ip-dscp-outbound cs6 yellow map 48

ip-dscp-outbound cs6 red map 48

ip-dscp-outbound cs7 green map 63

ip-dscp-outbound cs7 yellow map 63

ip-dscp-outbound cs7 red map 63

The preceding information indicates that XGE 0/0/3 is configured with the DiffServ domain out-ds. In out-ds, AF2 is mapped to DSCP 10 and CS7 is mapped to DSCP 63. The mappings are correct.

However, the interface is not configured with the trust command, so XGE 0/0/3 trusts only the 802.1p priority in the outer VLAN tag by default. As a result, the outbound interface XGE 0/0/3 of the Switch does not mark the DSCP priority of outgoing packets based on the priority mapping configured in the out-ds domain, although packets from department 1 should be marked with DSCP 10 and packets from department 2 with DSCP 63.
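The intended two-stage mapping in this case can be traced with a short Python sketch. It is only an illustration of the listings above, not device behavior: the dictionaries copy the 802.1p-to-PHB entries shown for ds1 and ds2 and the PHB-to-DSCP entries shown for out-ds, and the function name expected_dscp is hypothetical.

```python
# Values copied from the ds1, ds2, and out-ds listings in this case.
INBOUND = {
    "ds1": {0: ("af2", "green"), 1: ("af2", "green")},  # XGE 0/0/1, department 1
    "ds2": {0: ("cs7", "green"), 1: ("cs7", "green")},  # XGE 0/0/2, department 2
}
OUTBOUND_OUT_DS = {
    ("af2", "green"): 10,
    ("cs7", "green"): 63,
}


def expected_dscp(inbound_domain, dot1p):
    """Map an 802.1p value to a PHB and color, then to the outgoing DSCP."""
    phb_color = INBOUND[inbound_domain][dot1p]
    return OUTBOUND_OUT_DS[phb_color]


print(expected_dscp("ds1", 0))  # department 1, captured 802.1p priority 0
print(expected_dscp("ds2", 0))  # department 2, captured 802.1p priority 0
```

The two calls give 10 and 63, the planned DSCP values, which is why the analysis treats the domain mappings as correct and looks at the trust setting on XGE 0/0/3.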

Procedure

Step 1 Run the interface xgigabitethernet 0/0/3 command to enter the interface view.

Step 2 Run the trust dscp command to configure the interface to trust the DSCP priority of packets.

After the preceding configurations are complete, simulate users in department 1 and department 2 sending packets to the Switch, and capture packets on outbound interface XGE 0/0/3. The captured packets show that the DSCP priorities of the packets meet requirements. The fault is rectified.

----End

Summary

If the priority mapping is incorrect, check the priority mapping configured on the inbound and outbound interfaces and the priority type that each interface trusts. An error in any of these settings can cause packet priorities to be mapped incorrectly.

9.3 Traffic Policing Troubleshooting

This chapter describes common causes of traffic policing faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.3.1 Traffic Policing Based on Traffic Classifiers Fails to Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when traffic policing based on traffic classifiers fails to take effect.

Traffic policing applies the Committed Access Rate (CAR) or aggregated CAR action of a traffic policy to packet flows. Its troubleshooting roadmap is the same as that for the traffic policy. For details, see Traffic Policy Fails to Take Effect.

9.3.2 Interface-based Traffic Policing Results Are Incorrect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when interface-based traffic policing results are incorrect.

Common Causes

This fault is commonly caused by one of the following:

l The qos lr inbound command is not used on the interface.

l The CAR parameters are set incorrectly.


Figure 9-7 Troubleshooting flowchart for incorrect interface-based traffic policing results

[Flowchart: check, in order, whether interface-based traffic policing is configured and whether the CAR parameters are set correctly. After each correction, check whether the fault is rectified.]


Procedure

Step 1 Check whether the interface-based traffic policing is configured on the interface.

Run the display this command in the interface view to check whether the qos lr inbound command is used.

l If not, run the qos lr inbound command to configure the interface-based traffic policing correctly.


Step 2 Check whether the CAR parameters are set correctly.

Run the display qos lr command to check whether the CAR parameters are set correctly.

NOTE

On the S6700, the granularity of interface-based traffic policing is 8 kbit/s. If the CIR value divided by 8 is greater than or equal to 1 but smaller than 2, the traffic policing rate is (64+8) kbit/s. If the CIR value divided by 8 is greater than or equal to 2 but smaller than 3, the traffic policing rate is (64+8*2) kbit/s, and so on.

l If not, run the qos lr inbound command to set the CAR parameters correctly.



l If yes, go to step 3.
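The rate rule in the Step 2 NOTE can be written out as a small calculation. This Python sketch is a literal reading of that NOTE and nothing more: the function name policing_rate_kbps is hypothetical, and because the NOTE does not say what happens below 8 kbit/s, the sketch refuses such values instead of guessing.

```python
def policing_rate_kbps(cir_kbps):
    """Rate described by the NOTE: 64 + 8 * floor(CIR / 8) kbit/s."""
    steps = cir_kbps // 8  # number of whole 8 kbit/s granules in the CIR
    if steps < 1:
        raise ValueError("the NOTE does not cover CIR values below 8 kbit/s")
    return 64 + 8 * steps


print(policing_rate_kbps(8))   # CIR / 8 is in [1, 2)
print(policing_rate_kbps(16))  # CIR / 8 is in [2, 3)
```

Comparing the configured CIR with the value returned here may help explain a measured rate that differs from the configured one.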



----End


Relevant Alarms

None.

Relevant Logs

None.


This section provides QoS CAR troubleshooting cases.

Traffic Policing Based on Traffic Classifiers Fails to Take Effect

Fault Symptom

As shown in Figure 9-8, a user accesses the Switch through the LSW. The user resides on the network segment 192.168.1.0/24 and the MAC address of the user is 0001-0001-0001. Traffic policing is configured on XGE 0/0/1 of the Switch to limit the maximum transmission rate of incoming traffic to 50 Mbit/s. However, when traffic is transmitted from the LSW to the Switch at the rate of 100 Mbit/s, the Switch still forwards the traffic at the rate of 100 Mbit/s. That is, traffic policing fails to take effect.


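[Figure 9-8: networking diagram. Enterprise users (192.168.1.0/24, MAC address 0001-0001-0001) connect through the LSW to XGE0/0/1 of the Switch; XGE0/0/2 connects the Switch to the Router on the enterprise network.]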

Fault Analysis

1. Check whether the traffic policy is configured for incoming packets on the interface.

Run the display this command in the view of the XGE 0/0/1 interface to view the traffic policy configuration on the interface.


#

interface XGigabitEthernet0/0/1

traffic-policy tp1 inbound

#

return

The preceding information indicates that the traffic policy tp1 is configured for incoming packets on the XGE 0/0/1 interface.

2. Check whether the traffic classifier and the traffic behavior bound to the traffic policy are correct.

Run the display traffic policy user-defined [ policy-name [ classifier classifier-name ] ] command to check whether the traffic policy contains the traffic classifier and the traffic behavior, whether the CAR action is configured, and whether the CAR configuration is correct.

[Quidway] display traffic policy user-defined tp1


Policy: tp1

Classifier: tc1

Operator: AND

Behavior: tb1

Committed Access Rate:

CIR 5000 (Kbps), CBS 625000 (Byte)

PIR 5000 (Kbps), PBS 625000 (Byte)

Green Action : pass

Yellow Action : pass

Red Action : discard

Run the display traffic classifier user-defined command to check whether the rule in the traffic classifier is correct.

[Quidway] display traffic classifier user-defined tc1


Classifier: tc1

Operator: AND

Rule(s) : if-match acl 4000

[Quidway] display acl 4000


Acl's step is 5

rule 5 permit source-mac 0001-0001-0001 ffff-ffff-0fff (0 times matched)

The preceding information indicates that the traffic classifier and the traffic behavior in the traffic policy tp1 are correct.

3. Check whether the information in the packets matches traffic classification rules.

Run the display traffic policy statistics command to check the traffic statistics on XGE 0/0/1 to which the traffic policy is applied. The following information is displayed:

[Quidway] display traffic policy statistics interface xgigabitethernet 0/0/1


Traffic policy inbound: tp1

Rule number: 1

Current status: OK!

---------------------------------------------------------------------
Board : 0
Item                  Packets                  Bytes
---------------------------------------------------------------------
Matched               0                        0
+--Passed             0                        0
+--Dropped            0                        0
+--Filter             0                        0
+--URPF               -                        -
+--CAR                0                        0

The preceding information indicates that packets do not match the traffic classification rules.



4. Check whether another rule with a higher priority is configured for the packets.

Run the display current-configuration command to check the traffic policy in the system.

[Quidway] display current-configuration

#

sysname Quidway

#

acl number 3000

rule 5 permit ip source 192.168.1.0 0.0.0.255

#

acl number 4000

rule 5 permit

rule 10 permit source-mac 0001-0001-0001 ffff-ffff-0fff

#

traffic classifier test operator or

if-match acl 3000

traffic classifier tc1 operator or

if-match acl 4000

#

traffic behavior test

permit

traffic behavior tb1

car cir 50000 pir 50000 cbs 6250000 pbs 6250000 green pass yellow pass red discard

#

traffic policy test

classifier test behavior test

traffic policy tp1


#

traffic-policy test global inbound

#

interface XGigabitEthernet0/0/1

port link-type trunk


traffic-policy tp1 inbound

#

return

The preceding information indicates that the traffic policy test is applied globally on the Switch. It contains the traffic classifier test and the traffic behavior test. The traffic classifier test references ACL 3000, which matches packets whose source IP address is on network segment 192.168.1.0/24. This is a Layer 3 rule. The action defined in the traffic behavior test is permit.

On the S6700, a traffic policy containing Layer 3 ACL rules takes precedence over a traffic policy containing Layer 2 rules. Packets with source MAC address 0001-0001-0001 from 192.168.1.0/24 match the rules in both policies, but only the traffic policy test containing ACL 3000 takes effect. As a result, the packets are forwarded directly and traffic policing fails to take effect.

Procedure

Step 1 Run the undo traffic-policy global inbound command in the system view to disable the traffic policy test.

After the traffic policy test is disabled, transmit traffic to the XGE 0/0/1 interface at the rate of 100 Mbit/s. The XGE 0/0/2 interface forwards the packets at the rate of 50 Mbit/s. The fault is rectified.

----End

Summary

If traffic policing based on traffic classifiers fails to take effect, rectify the fault according to Traffic Policy Fails to Take Effect.



9.4 Traffic Shaping Troubleshooting

This chapter describes common causes of traffic shaping faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.4.1 Traffic Shaping Results of Queues Are Incorrect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when traffic shaping results of queues are incorrect.

Common Causes

The fault symptom may be any of the following:

l Traffic shaping does not take effect.

l The CIR value for traffic shaping in queues cannot be reached.

This fault is commonly caused by one of the following:

l Traffic shaping parameters are set incorrectly.

l The CIR value for traffic shaping on an interface is smaller than the sum of CIR values for traffic shaping in queues on the interface. As a result, the bandwidth of traffic shaping in queues cannot be ensured.

l Packets do not enter queues configured with traffic shaping because the configuration is incorrect. For example, priority mapping is incorrect.

l Each queue uses the combined scheduling mode and excessive packets enter Priority Queuing (PQ) queues. As a result, other queues cannot obtain sufficient bandwidth.

NOTE

In combined scheduling mode, if the bandwidth is insufficient, the Peak Information Rate (PIR) value of other queues cannot be reached. This is a correct traffic shaping result.


Figure 9-9 Troubleshooting flowchart for incorrect traffic shaping results

[Flowchart: check, in order, whether queue shaping parameters are set correctly, whether port shaping is also configured and its CIR value is greater than the total CIR value for queue shaping, whether any traffic policing configuration affects queue shaping, whether packets enter the shaped queues, and whether queues work in combined scheduling mode with too many packets in PQ queues. After each correction, check whether the fault is rectified.]


Procedure

Step 1 Check whether traffic shaping parameters of queues are set correctly.

Run the display this command in the interface view to check whether the qos queue shaping command is used.

l If traffic shaping parameters of queues are set incorrectly or not set, run the qos queue shaping command to set the parameters correctly.

l If traffic shaping parameters of queues are set and the CIR value for traffic shaping on an interface is set by using the qos lr outbound command, go to step 2.

l If traffic shaping parameters of queues are set but the CIR value for traffic shaping on an interface is not set, go to step 3.

Step 2 Check whether the CIR value for traffic shaping on an interface is greater than the sum of CIR values for traffic shaping in queues on the interface.

Compare the CIR value for traffic shaping on an interface with the sum of CIR values for traffic shaping in queues on the interface:

l If the CIR value for traffic shaping on an interface is smaller than the sum of CIR values for traffic shaping in queues on the interface, queues on the interface cannot obtain sufficient bandwidth. The traffic shaping result may be incorrect. In this case, run the qos lr outbound and qos queue shaping commands to modify related parameters accordingly so that the CIR value for traffic shaping on an interface is greater than the sum of CIR values for traffic shaping in queues on the interface.

l If the CIR value for traffic shaping on an interface is greater than the sum of CIR values for traffic shaping in queues on the interface, go to step 3.

Step 3 Check whether traffic policing affecting queue shaping is configured.

1. Check whether interface-based traffic policing is configured on the inbound interface.

If interface-based traffic policing is configured on the inbound interface and its CIR value is smaller than the specified CIR value for queue shaping, queue shaping uses the CIR value for interface-based traffic policing.

Run the display this command in the inbound interface view to check whether the qos lr inbound command is run on the inbound interface and whether its CIR value is smaller than the CIR value for queue shaping.

l If the qos lr inbound command is run and the CIR value of the inbound interface is smaller than the CIR value for queue shaping, disable interface-based traffic policing or modify the configuration so that the CIR value for interface-based traffic policing is greater than the CIR value for queue shaping.

l If the qos lr inbound command is not run, or it is run but the CIR value of the inbound interface is greater than the CIR value for queue shaping, go to substep 2.

2. Check whether class-based traffic policing is configured on the device.

If class-based traffic policing is configured on the device, its CIR value is smaller than the CIR value for queue shaping, and traffic in queues matches the traffic classifier, the CIR value for class-based traffic policing is used as the actual CIR value of queue shaping.

Run the display this command in the system view, inbound interface view, and VLAN view to check whether the traffic-policy command is run:

l If the traffic-policy command is run, run the display traffic policy user-defined command to check whether a CIR value is defined in the traffic policy and whether the CIR value is smaller than the CIR value for queue shaping.

– If the CIR value is set, run the display traffic classifier user-defined command to check whether traffic in queues matches the traffic classifier in the configured traffic policy. If traffic in queues matches the traffic classifier, delete the CIR value defined in the traffic policy or modify the configuration so that the CIR value in the traffic policy is greater than the CIR value for queue shaping. If traffic in queues does not match the traffic classifier, go to step 4.

– If no CIR value is set, go to step 4.

l If the traffic-policy command is not run, go to step 4.

NOTE

l If both interface-based traffic policing and class-based traffic policing are configured, class-based traffic policing takes effect.

l The traffic policies configured in the interface view, VLAN view, and system view take effect in descending order of priority.
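The Step 3 rules can be combined into one calculation of the rate that actually limits a shaped queue. The Python sketch below is a simplified model under stated assumptions, not device logic: the function and parameter names are hypothetical, all rates are in kbit/s, and class-based policing is assumed to replace interface-based policing only when the queue's traffic matches the classifier.

```python
def effective_queue_cir(queue_cir, lr_inbound_cir=None, class_car_cir=None,
                        traffic_matches_classifier=False):
    """Return the CIR (kbit/s) that limits a shaped queue, following Step 3."""
    class_applies = class_car_cir is not None and traffic_matches_classifier
    if class_applies:
        # Per the NOTE: class-based policing takes effect when both are configured.
        policing_cir = class_car_cir
    else:
        policing_cir = lr_inbound_cir
    # A smaller upstream policing CIR overrides the queue shaping CIR.
    if policing_cir is not None and policing_cir < queue_cir:
        return policing_cir
    return queue_cir


print(effective_queue_cir(2000))                       # no policing configured
print(effective_queue_cir(2000, lr_inbound_cir=1000))  # inbound policing is lower
```

If the returned value is lower than the configured queue shaping CIR, a policing configuration is the limiting factor and should be raised or removed as described above.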

Step 4 Check whether packets enter traffic shaping queues.

Run the display qos queue statistics interface interface-type interface-number command to view the packet statistics on each queue on an interface.

l If packets do not enter traffic shaping queues, rectify the fault according to 9.2.1 Packets Enter Incorrect Queues.

l If packets enter traffic shaping queues but excessive packets (for example, the traffic rate on an XGigabitEthernet interface exceeds 1000 Mbit/s, the traffic rate on a GigabitEthernet interface exceeds 100 Mbit/s, or the traffic rate on an Ethernet interface exceeds 10 Mbit/s) enter PQ queues, go to step 5.

l If packets enter traffic shaping queues and excessive packets do not enter PQ queues, go to step 5.

Step 5 Check whether queues on the interface use the combined scheduling mode.

Run the display this command in the interface view to check the scheduling mode used by each queue on the interface.

l If qos wrr or qos drr is used on an interface and qos queue queue-index drr weight 0 or qos queue queue-index wrr weight 0 is configured in a queue, each queue on the interface uses the combined scheduling mode.

In combined scheduling mode, if no bandwidth limit is configured for PQ queues, some packets in WRR or DRR queues may not be processed when PQ queues contain a large number of packets. Because queue shaping is configured on WRR or DRR queues, queue shaping is also affected. In this case, run the qos { pq | drr | wrr } or qos queue queue-index { drr | wrr } weight command to replan the scheduling mode and parameters of each queue, thereby reducing the number of packets that enter PQ queues.






NOTE

In combined scheduling mode, if the bandwidth is insufficient, the PIR value of other queues cannot be reached. This is a correct traffic shaping result.

l If each queue uses the scheduling mode of qos pq or qos wrr/drr, go to step 5.
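The combined-scheduling test in Step 5 can be applied to saved display this output. The following Python sketch is an assumption-laden illustration, not a Huawei tool: the function name uses_combined_scheduling is hypothetical, and the pattern only recognizes the command forms quoted in this step.

```python
import re

# Matches "qos queue <index> drr weight 0" or "qos queue <index> wrr weight 0".
WEIGHT_ZERO = re.compile(r"^qos queue \d+ (drr|wrr) weight 0$")


def uses_combined_scheduling(config_text):
    """True if the interface uses WRR or DRR and some queue has weight 0."""
    lines = [line.strip() for line in config_text.splitlines()]
    weighted_mode = any(line in ("qos wrr", "qos drr") for line in lines)
    zero_weight_queue = any(WEIGHT_ZERO.match(line) for line in lines)
    return weighted_mode and zero_weight_queue


# Illustrative input assembled from command forms that appear in this chapter.
sample = """
qos drr
qos queue 0 drr weight 0
qos queue 2 shaping cir 2000 pir 2000
"""
print(uses_combined_scheduling(sample))
```

A True result means the queues with weight 0 are scheduled in PQ mode, so the traffic volume entering those queues should be checked next.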



----End


Relevant Alarms

None.

Relevant Logs

None.


This section provides traffic shaping troubleshooting cases.

Traffic Shaping Results of Queues Are Incorrect

Fault Symptom

As shown in Figure 9-10, the transmission rate of network-side traffic is greater than the transmission rate supported by the LSW, which may lead to jitter on the downlink interface XGE 0/0/1 of the Switch. To prevent jitter and ensure bandwidth of services, the Switch is configured to send traffic of voice, video, and data services to separate queues. In addition, traffic shaping is configured to:

l Limit the maximum transmission rate of voice services to 128 kbit/s.

l Limit the maximum transmission rate of video services to 2000 kbit/s.

l Limit the maximum transmission rate of data services to 512 kbit/s.



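[Figure 9-10: networking diagram. A phone (802.1p=6), a TV (802.1p=5), and a PC (802.1p=2) in the enterprise connect through the LSW to XGE0/0/1 of the Switch; XGE0/0/2 connects the Switch to the Router on the enterprise network.]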

After the configuration, the bandwidth for voice services and video services is insufficient.

Fault Analysis

1. Check the traffic shaping parameters in queues on the downlink interface XGE 0/0/1.

Run the display this command in the view of the downlink interface XGE 0/0/1 to check the traffic shaping parameters.





qos lr outbound cir 2000 cbs 250000

qos drr

qos queue 0 drr weight 0








qos queue 2 shaping cir 2000 pir 2000



#

return

The preceding information indicates that the outbound interface is configured with traffic shaping and queue shaping. Queue 2, queue 5, and queue 6 use the DRR scheduling mode, and the traffic shaping parameters of each queue are correct. The CIR value for traffic shaping on the interface, however, is smaller than the sum of CIR values for traffic shaping in queue 2, queue 5, and queue 6 on the interface.

On the S6700, if the CIR value for traffic shaping on an interface is smaller than the sum of CIR values for traffic shaping in queues on the interface, the bandwidth of the queues cannot be ensured.






Procedure

Step 1 Run the interface xgigabitethernet 0/0/1 command to enter the interface view.

Step 2 Run the qos lr outbound cir 3000 command to change the CIR for traffic shaping on the interface to 3000 kbit/s so that this value is greater than the sum of CIR values for traffic shaping in queues on the interface.

After the configuration, the bandwidth for voice, video, and data services is sufficient.

----End

Summary

On the Switch, if the CIR value of traffic shaping on an interface is smaller than the sum of the CIR values of queue shaping, the committed rate of the queues cannot be provided.
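The comparison behind this case can be checked with simple arithmetic. The Python sketch below assumes that the three rate limits listed under Fault Symptom (128, 2000, and 512 kbit/s) are the queue shaping CIR values, since the full interface output is not reproduced here. The function name shaping_shortfall is hypothetical.

```python
def shaping_shortfall(interface_cir, queue_cirs):
    """Return how far the summed queue CIRs exceed the interface CIR (0 if they fit)."""
    return max(0, sum(queue_cirs) - interface_cir)


# Assumed queue CIRs in kbit/s: voice, video, and data limits from Fault Symptom.
queue_cirs = [128, 2000, 512]

print(shaping_shortfall(2000, queue_cirs))  # interface CIR before the fix
print(shaping_shortfall(3000, queue_cirs))  # interface CIR after the fix
```

Under that assumption the queues need 2640 kbit/s in total, so an interface CIR of 2000 kbit/s leaves a shortfall of 640 kbit/s, while 3000 kbit/s leaves none.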

9.5 Congestion Avoidance Troubleshooting

This chapter describes common causes of congestion avoidance faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

9.5.1 Congestion Avoidance Fails to Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when congestion avoidance fails to take effect.

Common Causes


This fault is commonly caused by one of the following:

l The interface or queue is not configured with the Weighted Random Early Detection (WRED) drop profile.

l Packets are not colored by using priority mapping or remark local-precedence.

l The parameters corresponding to packet colors are not configured in the WRED drop profile.

l When queue-based congestion avoidance is used, packets do not enter queues configured with WRED drop profiles.


Figure 9-11 Troubleshooting flowchart for ineffective congestion avoidance

[Flowchart: check, in order, whether a WRED drop profile is set, whether packets are colored, whether drop parameters related to packet colors are set, and, if WRED is configured for a queue and not for an interface, whether packets enter the WRED queues. After each correction, check whether the fault is rectified.]







Procedure

Step 1 Check whether the WRED drop profile is configured in the system, on an interface, or in a queue on the interface.

1. Run the display this command in the interface view to check whether the qos wred or qos queue wred command is used.

l If not, go to substep 2.

2. Run the display this command in the system view to check whether the qos queue wred command is used.

l If the qos queue wred command is not used, run the qos queue wred or qos wred command to configure WRED globally or on an interface.

l If the qos queue wred command is used, go to step 2.

NOTE

The WRED drop profile configured in the system takes effect on all the interfaces. If a WRED drop profile is applied to the system and an interface simultaneously, the WRED drop profile applied to the interface takes effect.

If the WRED drop profiles are configured on an interface and in a queue on the interface, the system applies the WRED drop profiles in the queue and on the interface in sequence.

Step 2 Check whether parameters are set correctly in the WRED drop profile.

Run the display drop-profile command to check whether the WRED drop profile contains the parameters related to packet colors configured in step 2.

l If not, run the color command to set the parameters.

l If parameters are set and the qos queue wred command is used on the interface, go to step 3.

l If parameters are set and the qos wred command is used on the interface, go to step 4.

Step 3 Check whether packets enter the queue configured with the WRED drop profile.

Run the display qos queue statistics command to check whether there are packet statistics about the queue configured with the WRED drop profile.


l If not, packets are not entering the queue. Rectify the fault according to 9.2.1 Packets Enter Incorrect Queues.

Step 4 Check whether packets are colored by using priority mapping or traffic action.

Run the display this command in the interface view to check whether the following configurations are performed on the interface.

1. Check whether the traffic-policy command is used on the interface.

l If yes, run the display traffic policy command to view the actions in the traffic policy. Check whether remark local-precedence is configured.

– If remark local-precedence is configured but color is not specified, run the remark local-precedence command to configure color.

– If remark local-precedence is configured and the parameter relevant to the color is configured, the system colors packets based on the configuration. Go to step 5.

– If the action and related parameter are not configured, go to substep 2.

2. Check whether the dei enable command is used on the interface.



l If yes, verify that the system correctly marks packets with colors based on the CFI field (if the CFI field is 1, packets are colored yellow; if the CFI field is 0, packets are colored green). Then go to step 5.

l If not, go to substep 3.

3. Check whether the trust upstream { default | ds-domain-name } command is used on the interface.

If the trust upstream { default | ds-domain-name } command is used, run the display diffserv domain name diffserv-domain-name command to check whether the mappings between packet priorities and colors are correct. If the trust upstream { default | ds-domain-name } command is not used, the system trusts the default DiffServ domain by default. Run the display diffserv domain name default command to check whether the mappings between packet priorities and colors are correct.

l If the mappings are incorrect, run the 8021p-inbound or ip-dscp-inbound command to modify the mappings between packet priorities and colors.

l If the mappings are correct, go to step 5.



----End


Relevant Alarms

None.

Relevant Logs

None.

9.6 Congestion Management Troubleshooting

This chapter describes common causes of congestion management faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

9.6.1 Congestion Management Fails to Take Effect

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when congestion management fails to take effect.

Common Causes


This fault is commonly caused by one of the following:

l The queue scheduling mode is configured incorrectly.

l The weight ratio between WRR or DRR queues is greater than 50:1.

l Packets enter incorrect queues.



If packets in a queue are not scheduled or scheduling results are incorrect, congestion management fails to take effect. Use the troubleshooting flowchart in Figure 9-12.

Figure 9-12 Troubleshooting flowchart for ineffective congestion management

[Flowchart: check, in order, whether the queue scheduling mode is set correctly, whether the weight ratio between WRR/DRR queues is greater than 50:1, and whether packets enter queues correctly. After each correction, check whether the fault is rectified.]


Procedure

Step 1 Check whether the queue scheduling mode is configured correctly.

Run the display this command in the interface view to check whether the queue scheduling mode is correct.






NOTE

When you configure the queue scheduling mode:

l The combined scheduling mode of PQ+WRR or PQ+DRR is recommended. Specifically, delay-sensitive key services are scheduled in PQ mode and other services are scheduled in WRR/DRR mode.

l If each queue is configured with PQ scheduling, delay-sensitive key services enter queues with a higher priority and non-key services enter queues with a lower priority.

l If each queue is configured with WRR/DRR scheduling, key services are assigned higher weights and non-key services are assigned lower weights.

l If the average packet lengths of the different services are similar, use WRR scheduling; if they differ greatly, use DRR scheduling.

l If not, run the qos { pq | wrr | drr } command to reconfigure the queue scheduling mode.


Step 2 Check whether the weight ratio between WRR or DRR queues is too large.

NOTE

In WRR or DRR scheduling, if the weight ratio between queues is greater than 50:1, the WRR or DRR scheduling may be incorrect and congestion management may fail.

Run the display this command in the interface view to check whether the weight ratio is greater than 50:1 in WRR or DRR scheduling.

l If yes, run the qos queue queue-index drr weight weight or qos queue queue-index wrr weight weight command to change the queue weights. Ensure that the weight ratio between any two queues is smaller than 50:1.
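The ratio check in this step can be sketched offline once the weights have been read from the display this output. The following Python sketch is illustrative only: the function name and the dictionary of weights are hypothetical, the weights are typed in by hand, and treating a weight of 0 as "not part of the WRR/DRR group" is an assumption.

```python
from itertools import combinations

MAX_RATIO = 50  # limit named in this procedure (50:1)


def overlarge_weight_pairs(weights, max_ratio=MAX_RATIO):
    """Return (queue_a, queue_b, ratio) for each queue pair whose weight ratio exceeds max_ratio.

    weights maps a queue index to its WRR/DRR weight. Queues with weight 0 are
    skipped (assumption: such queues are not scheduled by WRR/DRR).
    """
    active = {queue: weight for queue, weight in weights.items() if weight > 0}
    bad = []
    for (qa, wa), (qb, wb) in combinations(sorted(active.items()), 2):
        ratio = max(wa, wb) / min(wa, wb)
        if ratio > max_ratio:
            bad.append((qa, qb, ratio))
    return bad


if __name__ == "__main__":
    # Hypothetical weights copied by hand from the interface configuration.
    print(overlarge_weight_pairs({1: 1, 2: 20, 3: 60}))
```

Any pair reported by the function is a candidate for the qos queue weight change described above.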


Step 3 Check whether packets enter queues correctly.

Use a tester to send service packets to the S6700 and run the display qos queue statistics command to view the statistics on queues. Check whether packets enter queues corresponding to the scheduling mode.

l If not, rectify the fault according to


.




----End


Relevant Alarms

None.

Relevant Logs

None.







This section provides congestion management troubleshooting cases.

QoS of Services with a Higher Priority Cannot Be Guaranteed

Fault Symptom

As shown in Figure 9-13, the Switch is connected to the router by using XGE 0/0/2. Voice, video, and data services are transmitted on the network. The 802.1p priorities of voice, video, and data services are 6, 5, and 2 respectively. These services are transmitted to users through the router and the Switch. To ensure QoS of services, congestion management is configured.


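[Figure 9-13: networking diagram. A phone, a PC, and a TV on the enterprise network connect through the LSW to the Switch (XGE0/0/1); the Switch connects to the Router through XGE0/0/2. The traffic carries 802.1p priorities 6, 5, and 2.]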

After the configuration, the QoS of voice and video services with a higher priority cannot be guaranteed, and voice and video signals are interrupted sometimes. That is, congestion management fails to take effect.

Fault Analysis

The possible causes are as follows:

l The packets cannot enter correct queues. As a result, the packets with a lower priority are forwarded but the packets with a higher priority are discarded.

l The scheduling modes and weights of the queues are improper.

1. Check the traffic statistics and scheduling parameters of each queue.

Run the display qos queue statistics command to check the traffic statistics and scheduling parameters of the queues on a specified interface.

<Quidway> display qos queue statistics interface xgigabitethernet 0/0/2

Queue CIR/PIR(kbps) Passed(Packet/Byte) Dropped(Packet/Byte)

---------------------------------------------------------------------------

0 1000000 0 0

1000000 0 0

---------------------------------------------------------------------------




1 1000000 0 0

1000000 0 0

---------------------------------------------------------------------------

2 2000 2457863 0

2000 245786300 0

---------------------------------------------------------------------------

3 1000000 2012324 0

1000000 201232400 0

---------------------------------------------------------------------------

4 1000000 2047189 0

1000000 204718900 0

---------------------------------------------------------------------------

5 512 0 0

512 0 0

---------------------------------------------------------------------------

6 1000000 0 0

1000000 0 0

---------------------------------------------------------------------------

7 128 0 0

128 0 0

---------------------------------------------------------------------------

The preceding information indicates that the packets enter queues AF2, AF3, and AF4.

2. Check the priority mapping and scheduling parameters on the interface.

Run the display this command in the interface view to check whether the priority of the incoming packets is mapped to the priority specified by the DiffServ domain.

l If the packets passing through the interface are IP packets carrying the DSCP priority, run the trust upstream and trust dscp commands on the interface.

l If the packets passing through the interface come from a VLAN, run the trust upstream and trust 8021p commands on the interface.




port trunk allow-pass vlan 100 110 120

qos queue 0 wrr weight 0








trust upstream ds1

# return

The preceding information indicates that XGE 0/0/2 is bound to the DiffServ domain ds1 and trusts the 802.1p priority in the outer VLAN tag. Queues AF1 to AF4 use WRR scheduling and their weights are 10, 20, 30, and 40 respectively. Queues BE, EF, CS6, and CS7 use PQ scheduling.

3. Check the configuration of the DiffServ domain.

Run the display diffserv domain name ds1 command to view the mapping of 802.1p priorities in the DiffServ domain ds1.

<Quidway> display diffserv domain name ds1

diffserv domain name:ds1

8021p-inbound 0 phb be green


8021p-inbound 2 phb AF2 green









8021p-outbound be green map 0

8021p-outbound be yellow map 0

The preceding information indicates that packets with 802.1p priority 6 enter the AF4 queue, packets with 802.1p priority 5 enter the AF3 queue, and packets with 802.1p priority 2 enter the AF2 queue. The mapping is correct.

Queues AF4, AF3, and AF2 use WRR scheduling and their weights are 40, 30, and 20 respectively. When service traffic is light, the QoS of delay-sensitive services can therefore be guaranteed. When service traffic is heavy, however, the delay-sensitive services only receive a weighted share of the bandwidth, so their QoS cannot be guaranteed and voice signals are interrupted sometimes.
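The effect of the weights can be illustrated with a small calculation. The sketch below is an idealized model, not a description of the S6700 scheduler: it assumes sustained congestion, equal packet sizes, and that bandwidth is divided among backlogged WRR queues in proportion to their weights. The function name is hypothetical; the weights are the ones shown in this case.

```python
def wrr_shares(weights, backlogged=None):
    """Idealized bandwidth share of each backlogged WRR queue under sustained congestion."""
    names = list(weights) if backlogged is None else [n for n in weights if n in backlogged]
    total = sum(weights[n] for n in names)
    if total == 0:
        return {}
    return {n: weights[n] / total for n in names}


# Weights from the interface configuration in this case.
WEIGHTS = {"AF1": 10, "AF2": 20, "AF3": 30, "AF4": 40}

if __name__ == "__main__":
    # Only AF2, AF3, and AF4 carry traffic according to the queue statistics.
    for queue, share in wrr_shares(WEIGHTS, backlogged={"AF2", "AF3", "AF4"}).items():
        print(f"{queue}: {share:.1%} of the WRR bandwidth")
```

Under these assumptions the voice traffic in AF4 is limited to a share of the link and competes with the other queues in every round, which is consistent with the interruptions seen under heavy load.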

Procedure

Step 1 Run the diffserv domain ds1 command to enter the view of the DiffServ domain ds1.

Step 2 Run the 8021p-inbound 2 phb af2 green command to map the packets with the 802.1p priority being 2 to the AF2 queue.

Step 3 Run the 8021p-inbound 5 phb ef green command to map the packets with the 802.1p priority being 5 to the EF queue.

Step 4 Run the 8021p-inbound 6 phb cs7 green command to map the packets with the 802.1p priority being 6 to the CS7 queue.

After the configuration, voice signals are transmitted continuously.

----End

Summary

When configuring the DiffServ domain, correctly map the packet priorities to queues.

PQ, WRR, and DRR have their own advantages and disadvantages. If only PQ scheduling is used, the packets in the queues with a low priority cannot obtain bandwidth for a long period of time. If only WRR or DRR scheduling is used, delay-sensitive services cannot be scheduled in time. Therefore, when various services are transmitted on the network, use PQ+WRR or PQ+DRR scheduling.
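The behavior of combined PQ+WRR scheduling can be sketched with a toy model. This is a simplified illustration, not the switch's actual algorithm: PQ queues are always served first in priority order, and the remaining queues are served in rounds in which each queue may send up to its weight in packets. All names and the sample packets are hypothetical.

```python
from collections import deque


def schedule(pq_queues, wrr_queues, weights, budget):
    """Return the order in which packets leave the port, for at most `budget` packets.

    pq_queues: list of (name, deque), ordered from highest to lowest priority.
    wrr_queues: dict mapping queue name to deque.
    weights: dict mapping WRR queue name to packets allowed per round.
    """
    sent = []
    credits = dict(weights)
    while len(sent) < budget:
        packet = None
        # Strict priority: the first non-empty PQ queue always wins.
        for _name, queue in pq_queues:
            if queue:
                packet = queue.popleft()
                break
        if packet is None:
            backlogged = [n for n, q in wrr_queues.items() if q]
            if backlogged and all(credits[n] <= 0 for n in backlogged):
                credits = dict(weights)  # start a new WRR round
            for name in backlogged:
                if credits[name] > 0:
                    packet = wrr_queues[name].popleft()
                    credits[name] -= 1
                    break
        if packet is None:
            break  # nothing left to send
        sent.append(packet)
    return sent


if __name__ == "__main__":
    pq = [("CS7", deque(["voice1", "voice2"])), ("EF", deque(["video1"]))]
    wrr = {"AF2": deque(["data1", "data2", "data3"])}
    print(schedule(pq, wrr, {"AF2": 20}, budget=10))
```

In this model the delay-sensitive packets in the PQ queues always leave before any WRR packet, while the WRR queues still share whatever bandwidth remains.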



10 Reliability


10.1 Smart Link Troubleshooting

This chapter describes common causes of a Smart Link fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedures, alarms, and logs.

10.2 VRRP Troubleshooting

This chapter describes common causes of Virtual Router Redundancy Protocol (VRRP) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

10.3 Ethernet OAM Troubleshooting

This chapter describes common causes of an Ethernet OAM fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

10.4 BFD Troubleshooting

This chapter describes common causes of a BFD fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

10.5 DLDP Troubleshooting

This chapter describes common causes of a DLDP fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

10.6 RRPP Troubleshooting

10.7 MAC Swap Loopback Troubleshooting

This chapter describes common causes of MAC swap loopback faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

10.8 ERPS Troubleshooting

This chapter describes common causes of Ethernet ring protection switching (ERPS) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.





10.1 Smart Link Troubleshooting

This chapter describes common causes of a Smart Link fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedures, alarms, and logs.

10.1.1 Active/Standby Switchover Failure in a Smart Link Group

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the active/standby switchover failure in a Smart Link group.

Common Causes

This fault is commonly caused by one of the following:

l The Smart Link group is configured incorrectly. For example, the Smart Link group function is disabled or member interfaces are not added to the service VLAN.


l Data flows are locked on an interface in the Smart Link group.

l Flush packets are incorrectly received or sent.





Figure 10-1 Smart Link group troubleshooting flowchart

[Flowchart: check in turn whether the member interfaces are Up, whether the Smart Link group status is correct, whether data flows are locked, whether packets are discarded on member interfaces, whether the function of sending Flush packets is enabled, whether member interfaces join the control VLAN, and whether the function of receiving Flush packets is enabled. After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check the member interface status in the Smart Link group.

Run the display interface interface-type interface-number command to view the interface status, that is, the value of the current state field.

l If the value of the current state field is Down, rectify the fault according to Connected Ethernet Interfaces Down.

l If the value of the current state field is Up, the interface is Up. Go to step 2.

Step 2 Check the Smart Link group status.

Run the display smart-link group { all | group-id } command to view the Smart Link group status, that is, the value of the State field.

l If one interface is active and the other interface is inactive, the Smart Link group is in Up state. Go to step 4.

l If the Smart Link group is in Down state, go to step 3.

Step 3 Check whether data flows are locked on an interface in the Smart Link group.

Run the display smart-link group group-id command to check whether data flows are locked on an interface in the Smart Link group, that is, view the value of the Link status field.

l If the value of the Link status field is lock or force, data flows are locked on the master or slave interface in the Smart Link group. Run the undo smart-link { force | lock } command to unlock data flows on an interface in the Smart Link group.

l If the Link status field is not displayed, data flows are not locked on an interface in the Smart Link group. Go to step 8.

Step 4 Check whether packets are discarded on member interfaces in the Smart Link group.

Use the following method to check whether packets are discarded:

Run the ping -c count -t timeout command to view packet loss information in the command output.

NOTE

If the network is unreliable, set the packet transmission count (-c) and timeout (-t) to the upper limits. This makes the test result accurate.

l If there are discarded packets, go to step 5.

l If no packet is discarded, go to step 8.

Step 5 Check whether the function of sending Flush packets is enabled.

Run the display this command in the Smart Link group view to check whether the function of sending Flush packets is enabled.

l If the information "flush send control-vlan vlan-id" is not displayed, run the flush send command to enable the function of sending Flush packets.



l If the information "flush send control-vlan vlan-id" is displayed, go to step 6.

Step 6 Check whether the control VLAN is created. Ensure that member interfaces of the Smart Link group join the control VLAN.

Run the display vlan vlan-id command.

l If the following information is displayed, member interfaces of the Smart Link group join the control VLAN. Go to step 7.

--------------------------------------------------------------------------------
U: Up;         D: Down;         TG: Tagged;         UT: Untagged;
MP: Vlan-mapping;               ST: Vlan-stacking;
#: ProtocolTransparent-vlan;    *: Management-vlan;
--------------------------------------------------------------------------------
VID  Type    Ports
--------------------------------------------------------------------------------
10   common  TG:XGE0/0/3(U)     XGE0/0/2(U)

l If the preceding information is not displayed, create a control VLAN and add member interfaces of the Smart Link group to the control VLAN.

Step 7 Check whether the function of receiving Flush packets is enabled on the peer device.

Run the display this command in the interface view.

l If the information "smart-link flush receive control-vlan vlan-id" is displayed, go to step 8.

l If the information "smart-link flush receive control-vlan vlan-id" is not displayed, run the smart-link flush receive command to enable the function of receiving Flush packets.


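Step 8 If the fault persists, collect the following information and contact Huawei technical support personnel.

l Results of the preceding troubleshooting procedure

l Configuration file, log file, and alarm file of the S6700

l MAC address of the device configured with the Smart Link group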

----End


Relevant Alarms

None.

Relevant Logs

None.

10.1.2 Monitor Link Group Status Is Down

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when the Monitor Link group status is Down.






Common Causes

This fault is commonly caused by one of the following:

l The link is faulty.

l The member interface of the Monitor Link group is not added to the service VLAN.

l The downlink interface is manually shut down.



[Flowchart: Monitor Link group status is Down. Check whether the member interface is Up (if not, rectify the link fault) and whether the member interfaces join the control VLAN (if not, add the member interfaces to the control VLAN). After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check the member interface status in the Monitor Link group.

Run the display monitor-link group group-id command to view the State field.

l If the value of the State field is DOWN, rectify the fault according to Interfaces Down.






NOTE

A link fault, a unidirectional OAM connectivity fault, or a failure to establish OAM connections may occur on the uplink interface. When the uplink interface belongs to a Smart Link group, the uplink interface is considered faulty if neither the master nor the slave interface of the Smart Link group is in active state or if the Smart Link group is not enabled.

l If the value of the State field is UP, go to step 2.

Step 2 Check whether the member interface of the Monitor Link group is added to the service VLAN.

Run the display current-configuration interface interface-type interface-number command in the member interface view to check whether the member interface of the Monitor Link group is added to the service VLAN.

l If the member interface of the Monitor Link group is not added to the service VLAN, add the member interface to the service VLAN.

l If the member interface of the Monitor Link group is added to the service VLAN, go to step 3.



----End


Relevant Alarms

None.

Relevant Logs

None.

10.2 VRRP Troubleshooting

This chapter describes common causes of Virtual Router Redundancy Protocol (VRRP) faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.

10.2.1 VRRP Group Flaps

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when the VRRP group flaps.

Common Causes

This fault is commonly caused by one of the following:

l The interface where VRRP Advertisement packets are transmitted changes between Up and Down frequently.

l The interval for sending VRRP Advertisement packets is too short.

l Packets are discarded on the backup device interface.

l VRRP packets are discarded randomly because congestion occurs.


Figure 10-3 VRRP group troubleshooting flowchart

[Flowchart: VRRP group flaps. Check whether the backup device can receive Advertisement packets, whether the interval for sending VRRP packets is too short (if so, change the interval), whether the link between VRRP devices is faulty (if so, rectify the link fault), whether packets are discarded on the backup device interface, whether packets sent to the CPU are lost, and whether a limit is set for VRRP packets (if so, change the CAR value). After each corrective action, check whether the fault is rectified.]


Procedure

Step 1 Check whether the backup device can receive VRRP Advertisement packets.

Run the debugging vrrp packet command on the backup device to check whether the following information is displayed.

*Aug 27 19:45:04 2009 Quidway VRRP/7/DebugPacket:

Vlanif45 | Virtual Router 45:receiving from 45.1.1.4, priority = 100,timer = 1, auth type is no, SysUptime: (0,121496722)

By default, the master device sends one VRRP Advertisement packet every second.

l If the backup device cannot receive VRRP Advertisement packets, go to step 2.

l If the backup device can receive VRRP Advertisement packets, go to step 6.

Step 2 Check whether the interval for sending VRRP Advertisement packets is too short.

Run the vrrp vrid timer advertise command to set a longer interval for sending VRRP Advertisement packets. Then run the display vrrp command repeatedly on the backup device to view the State field. If the command output does not change, the backup device works stably.

l If the backup device works stably, the original interval for sending VRRP Advertisement packets was too short.

l If the backup device is still unstable, restore the interval for sending VRRP Advertisement packets. Go to step 3.

Step 3 Check whether the link between devices in the VRRP group is faulty.

Run the ping command repeatedly to check whether IP addresses of devices in the VRRP group can be pinged.

l If all the ping operations fail, rectify the fault on the link according to A Ping Operation Fails.

l If some ping operations succeed, loops may occur. Remove the loops.

l If all the ping operations succeed, go to step 4.

Step 4 Check whether packets are discarded on the backup device interface.

Use the following method to check whether packets are discarded:

NOTE

Before running the display interface command, run the reset counters interface command to clear the statistics on the interface.

Run the display interface interface-type interface-number command to check the values of Discard fields under Input and Output.

l If packets are discarded, go to step 5.

l If no packet is discarded, go to step 7.

Step 5 Check whether the limit for VRRP packets is configured on the LPU.






Run the command to check whether the following information is displayed:

----------------------------------------------------------------------
Packet Name Status Cir(Kbps) Cbs(Byte) Queue
----------------------------------------------------------------------
vrrp Enabled

The default CIR value is kbit/s. Each board supports about VRRP groups.

l If there are more than VRRP backup groups, run the car command to change the CAR value to be greater.

l If there are VRRP backup groups or less, go to step 7.

Step 6 Check whether VRRP packets sent to the CPU are discarded.

Run the display cpu-defend statistics slot slot-id command to check whether VRRP packets sent to the CPU are discarded.

l If the value of the Drop(Packets) field is not 0, VRRP packets sent to the CPU are discarded. Record the result and go to step 7.

l If the value of the Drop(Packets) field is 0, VRRP packets sent to the CPU are not discarded. Go to step 7.



----End


Relevant Alarms

None.

Relevant Logs

None.

10.2.2 Two Master Devices Exist in a VRRP Group

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when two master devices exist in a VRRP group.

Common Causes

This fault is commonly caused by one of the following:

l The configurations of the devices in the VRRP group are different.

l The link where VRRP Advertisement packets are transmitted is faulty.

l A loop occurs on the link.

l The VRRP Advertisement packets received by the VRRP group with a lower priority are mistakenly treated as invalid packets and discarded.





Figure 10-4 Troubleshooting flowchart for dual master devices in a VRRP group

[Flowchart: dual master devices exist in a VRRP group. Check whether the configurations of the two devices are the same, whether the backup device can receive VRRP Advertisement packets (if so, collect information about invalid Advertisement packets), and whether interfaces are blocked (if so, change the STP priorities of the interfaces). After each corrective action, check whether the fault is rectified.]



Procedure

Step 1 Check that the configurations of the devices in the VRRP group are the same.

Run the display this command on VLANIF interfaces of the devices to check the configurations.

Field        Method

ip address   Check whether interface IP addresses are on the same network segment. If not, run the ip address command to change IP addresses to be on the same network segment.

vrid         Check whether the virtual router IDs on the interfaces are the same. If not, run the vrrp vrid virtual-router-id virtual-ip virtual-address command to change the virtual router IDs to be the same.

Virtual IP   Check whether virtual IP addresses on the interfaces are the same. If not, run the vrrp vrid virtual-router-id virtual-ip virtual-address command to change the virtual IP addresses to be the same.

TimerRun     Check whether the interfaces are configured with the same interval for sending Advertisement packets. If not, run the vrrp vrid virtual-router-id timer advertise adver-interval command to change the intervals to be the same.

Auth Type    Check whether VRRP packet authentication modes on the interfaces are the same. If not, run the vrrp vrid virtual-router-id authentication-mode { simple key | md5 md5-key } command to change the authentication modes to be the same.
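The comparison in the table can also be done offline once the values have been copied from the display this output of both devices. The following Python sketch is illustrative only: the dictionary keys and the function name are hypothetical, and the values are entered by hand rather than read from the device.

```python
import ipaddress

# Hypothetical keys for the values to compare (all except the interface address).
COMPARED_KEYS = ("vrid", "virtual_ip", "timer_advertise", "auth_type")


def vrrp_mismatches(dev_a, dev_b):
    """Return human-readable descriptions of the settings that differ between two devices."""
    problems = []
    net_a = ipaddress.ip_interface(dev_a["ip_address"]).network
    net_b = ipaddress.ip_interface(dev_b["ip_address"]).network
    if net_a != net_b:
        problems.append(f"ip address: {net_a} and {net_b} are different network segments")
    for key in COMPARED_KEYS:
        if dev_a[key] != dev_b[key]:
            problems.append(f"{key}: {dev_a[key]!r} != {dev_b[key]!r}")
    return problems


if __name__ == "__main__":
    # Sample values typed in by hand; the addresses are placeholders.
    master = {"ip_address": "45.1.1.4/24", "vrid": 45, "virtual_ip": "45.1.1.1",
              "timer_advertise": 1, "auth_type": "none"}
    backup = dict(master, ip_address="45.1.1.5/24", timer_advertise=2)
    for problem in vrrp_mismatches(master, backup):
        print(problem)
```

An empty result means that the compared fields match; any entry in the list points to the row of the table whose command should be used.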


Step 2 Check whether the backup device can receive VRRP Advertisement packets.

Enable debugging on the backup device and check whether the following information is displayed:

*Aug 27 19:45:04 2009 Quidway VRRP/7/DebugPacket:

Vlanif45 | Virtual Router 45:receiving from 45.1.1.4, priority = 100,timer = 1, auth type is no, SysUptime: (0,121496722)

By default, the master device sends one VRRP Advertisement packet every second.

l If the backup device does not receive VRRP Advertisement packets, go to step 3.

l If the backup device receives VRRP Advertisement packets, go to step 5.

Step 3 Check whether any interface on the devices in the VRRP group, or on the devices along the transmission path of VRRP Advertisement packets, is blocked.

Run the display stp brief command to check the STP State field.

l If the value of the STP State field is FORWARDING, the corresponding interface is not blocked. Go to step 4.

l If the value of the STP State field is DISCARDING, the corresponding interface is blocked. Change STP priorities of interfaces to ensure that interconnected interfaces can forward VRRP packets.

Step 4 Run the ping command to check whether the link between devices in the VRRP group is faulty.



l If the ping operation fails, rectify link faults according to A Ping Operation Fails.


Step 5 Check whether the VRRP group with a lower priority receives invalid VRRP Advertisement packets.

Run the display vrrp statistics command to check the Received invalid type packets field.

l If the value of the Received invalid type packets field is not 0, invalid VRRP Advertisement packets are received. Go to step 6.

l If the value of the Received invalid type packets field is 0, invalid VRRP Advertisement packets are not received. Go to step 6.



----End


Relevant Alarms

None.

Relevant Logs

None.


Data Packets Are Discarded on a Network Configured with VRRP

On a network configured with a VRRP backup group, a device connected to VRRP routers is incorrectly configured, causing incorrect MAC address learning and thus packet loss.

Fault Symptom


As shown in Figure 10-5, a VRRP backup group is configured on Switch A and Switch B. Switch A functions as the master device and Switch B functions as the backup device. Switch C is a switch connecting Switch A and Switch B.



Figure 10-5 Networking diagram of a VRRP backup group

[Networking diagram: SwitchA (XGE0/0/2) and SwitchB (XGE0/0/3) connect to SwitchC (XGE0/0/3 and XGE0/0/5). SwitchD, SwitchE, and an Eth-Trunk link also appear in the diagram.]

After the configurations, a large number of packets sent from Switch E to Switch D are discarded.

Fault Analysis

1. Run the display vrrp [ interface interface-type interface-number ] [ virtual-router-id ] statistics command on Switch A and then Switch B to check traffic on XGE0/0/2 of Switch A and XGE0/0/3 of Switch B. A small volume of traffic is transmitted on XGE0/0/2 of Switch A connected to Switch C, and no traffic is transmitted on XGE0/0/3 of Switch B connected to Switch C.

Run the display statistics interface interface-type interface-number command on Switch C to check traffic on XGE0/0/4, XGE0/0/3, and XGE0/0/5. A small volume of traffic is transmitted on XGE0/0/3 connected to XGE0/0/2, and no traffic is transmitted on XGE0/0/5 connected to XGE0/0/3. A large amount of traffic is transmitted on XGE0/0/4.

The statistics show that traffic is dropped on Switch C.

2. Run the display mac-address dynamic command on Switch C to check MAC addresses. The MAC address of Switch A is learned on XGE0/0/4, not on XGE0/0/3 connected to Switch A or XGE0/0/5 connected to Switch B, indicating that the learned MAC address entry is incorrect. For example:

MAC address table of slot 0:
-------------------------------------------------------------------------------
MAC Address    VLAN/       PEVLAN CEVLAN Port            Type      LSP/
               VSI/SI                                              MAC-Tunnel
-------------------------------------------------------------------------------
0000-0a0a-0102 1           -      -      XGE0/0/4        dynamic   -
0000-5e00-0101 1           -      -      XGE0/0/4        dynamic   -
0098-0113-0005 1           -      -      XGE0/0/4        dynamic   -
0018-824f-f5d1 1           -      -      XGE0/0/3        dynamic   -
-------------------------------------------------------------------------------

3. Run the display current-configuration interface interface-type interface-number command on Switch C to check the configuration on XGE0/0/4. For example:

#

interface XGigabitEthernet0/0/4

undo shutdown

loopback internal

portswitch

port default vlan 1

The loopback function has been configured on XGE0/0/4, indicating that XGE0/0/4 loops traffic back after receiving it.

4. Run the display statistics interface interface-type interface-number command on Switch C to check traffic on XGE0/0/3, XGE0/0/4, and XGE0/0/5. A large amount of traffic is transmitted on XGE0/0/4. A small volume of traffic is transmitted on XGE0/0/3. This indicates that traffic loss is caused by the loopback function on XGE0/0/4.

5. Run the display mac-address dynamic command multiple times on Switch C to check MAC addresses. Switch C learns different MAC addresses at different times. For example:

[SwitchC] display mac-address dynamic
MAC address table of slot 0:
-------------------------------------------------------------------------------
MAC Address    VLAN/       PEVLAN CEVLAN Port            Type      LSP/
               VSI/SI                                              MAC-Tunnel
-------------------------------------------------------------------------------
0000-0a0a-0102 1           -      -      XGE0/0/4        dynamic   -
0000-5e00-0101 1           -      -      XGE0/0/4        dynamic   -
0098-0113-0005 1           -      -      XGE0/0/5        dynamic   -
0018-824f-f5d1 1           -      -      XGE0/0/4        dynamic   -
-------------------------------------------------------------------------------
Total matching items on slot 0 displayed = 4

[SwitchC] display mac-address dynamic
MAC address table of slot 0:
-------------------------------------------------------------------------------
MAC Address    VLAN/       PEVLAN CEVLAN Port            Type      LSP/
               VSI/SI                                              MAC-Tunnel
-------------------------------------------------------------------------------
0000-0a0a-0102 1           -      -      XGE0/0/4        dynamic   -
0000-5e00-0101 1           -      -      XGE0/0/3        dynamic   -
0098-0113-0005 1           -      -      XGE0/0/5        dynamic   -
0018-824f-f5d1 1           -      -      XGE0/0/4        dynamic   -
-------------------------------------------------------------------------------
Total matching items on slot 0 displayed = 4
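Comparing successive outputs by eye is error-prone, so the comparison can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the outputs have been saved as text, each entry line starts with the MAC address and carries the port in the fifth whitespace-separated column (as in the tables above), and the function names are hypothetical.

```python
def parse_mac_table(text):
    """Map MAC address -> port for lines such as '0000-5e00-0101 1 - - XGE0/0/4 dynamic -'."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        # A MAC address in this output is 14 characters long with two hyphens.
        if len(fields) >= 6 and len(fields[0]) == 14 and fields[0].count("-") == 2:
            table[fields[0]] = fields[4]
    return table


def moved_macs(first, second):
    """Return {mac: (old_port, new_port)} for entries whose port changed between two snapshots."""
    return {mac: (first[mac], second[mac])
            for mac in first
            if mac in second and first[mac] != second[mac]}


# The two snapshots shown in this step, reduced to their entry lines.
FIRST = """
0000-0a0a-0102 1 - - XGE0/0/4 dynamic -
0000-5e00-0101 1 - - XGE0/0/4 dynamic -
0098-0113-0005 1 - - XGE0/0/5 dynamic -
0018-824f-f5d1 1 - - XGE0/0/4 dynamic -
"""
SECOND = """
0000-0a0a-0102 1 - - XGE0/0/4 dynamic -
0000-5e00-0101 1 - - XGE0/0/3 dynamic -
0098-0113-0005 1 - - XGE0/0/5 dynamic -
0018-824f-f5d1 1 - - XGE0/0/4 dynamic -
"""

if __name__ == "__main__":
    print(moved_macs(parse_mac_table(FIRST), parse_mac_table(SECOND)))
```

An address that keeps moving between ports across snapshots, such as the virtual MAC address in this case, is a sign that frames with that source address arrive on more than one interface.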

In a VRRP backup group, the device with the highest priority functions as the master device. If the IP address of a device is the same as the virtual IP address, that device is considered to have the highest priority and always functions as the master device. The master device sends a VRRP packet to the backup device every second by default. If the backup device fails to receive three consecutive packets from the master device, the backup device preempts the master role and sends a VRRP packet indicating that it has become the master. In normal situations, the backup device does not send VRRP packets.

NOTE

If a device is assigned an IP address the same as the virtual IP address, the device always functions as the master router.
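The "three consecutive packets" rule corresponds to the Master_Down_Interval defined in RFC 3768: three advertisement intervals plus a skew time derived from the priority of the backup device. The sketch below applies that RFC formula; whether the S6700 uses exactly these timers is an assumption, and the function name is hypothetical.

```python
def master_down_interval(adver_interval_s, priority):
    """Seconds a backup waits without Advertisements before becoming master (RFC 3768)."""
    if not 1 <= priority <= 254:
        raise ValueError("a backup device priority must be in the range 1 to 254")
    skew_time = (256 - priority) / 256      # Skew_Time, in seconds
    return 3 * adver_interval_s + skew_time


if __name__ == "__main__":
    # Default 1-second interval and the priority 100 seen in the debugging output.
    print(master_down_interval(1, 100))
```

With the default 1-second interval, a backup device therefore takes over after a little more than three seconds of silence, which is why even brief loss of Advertisement packets can cause a switchover.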

On this network, a packet sent by the master device arrives at the switch. The switch learns the source MAC address (in this example, 0000-5e00-0101), VLAN ID, and interface connected to the master device, and adds them to the MAC address table. The switch searches the MAC address table for the interface connected to the master device, thus forwarding the packet to the backup device. If a VRRP switchover occurs, the backup device becomes the master device and then sends a VRRP packet. After receiving the VRRP packet, the switch learns the MAC address and maps it to another interface connected to the new master device.

On this network, after receiving a VRRP packet, which is sent every second, Switch C learns the MAC address of Switch A and forwards the VRRP packet to all interfaces belonging to VLAN 1. XGE0/0/4 in VLAN 1 receives the VRRP packet and loops it back by using the loopback function. After receiving the returned VRRP packet, Switch C adds the mapping between XGE0/0/4 and 0000-5e00-0101 to the MAC address table, overwriting the previous mapping. In this manner, the newly learned entry overwrites the previous one repeatedly, causing traffic loss.
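The overwriting described above can be reproduced with a toy model of source MAC learning. This is a simplified sketch, not the switch's forwarding logic: it assumes that each received frame simply overwrites the entry for its source MAC address with the ingress port. The function names are hypothetical; the MAC address and ports are those in this case.

```python
VRRP_MAC = "0000-5e00-0101"


def learn(table, src_mac, ingress_port):
    """Source MAC learning: the latest ingress port always replaces the previous entry."""
    table[src_mac] = ingress_port


def simulate(rounds):
    """Return the port recorded for the VRRP MAC after each learning event."""
    table = {}
    history = []
    for _ in range(rounds):
        learn(table, VRRP_MAC, "XGE0/0/3")  # Advertisement arrives from the master device
        history.append(table[VRRP_MAC])
        learn(table, VRRP_MAC, "XGE0/0/4")  # flooded copy is looped back by XGE0/0/4
        history.append(table[VRRP_MAC])
    return history


if __name__ == "__main__":
    print(simulate(2))
```

Between two Advertisement packets the entry points to the looped interface, so traffic destined for the virtual MAC address is sent to XGE0/0/4 instead of the master device.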

Procedure



Step 3 Run the undo loopback command to disable the loopback function on the interface.

After the preceding operations, no traffic is discarded. The fault is cleared.

----End

Summary

Do not enable the loopback function on an interface of a Layer 2 device; otherwise, incorrect MAC addresses are learned.






10.3 Ethernet OAM Troubleshooting

This chapter describes common causes of an Ethernet OAM fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

10.3.1 MAC Trace Based on Ethernet OAM 802.1ag Fails

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for the Ethernet OAM 802.1ag-based MAC trace failure.

Common Causes

As shown in Figure 10-6, SwitchA fails to trace the MAC address of SwitchC based on 802.1ag.

[SwitchA-md-1-ma-1] trace mac-8021ag mac 0018-823c-c449

Tracing the route to 0018-823c-c449 over a maximum of 64 hops:

Request timed out.


SwitchA SwitchB SwitchC


l The destination node SwitchC is not configured with the MEP of the same level as the MEP on SwitchA.

l The MEP level of the intermediate node is the same as or higher than the MEP level of SwitchA.

l The intermediate node does not have a MAC address entry with the destination being SwitchC.







Figure 10-7 Troubleshooting flowchart for the Ethernet OAM 802.1ag-based MAC trace failure

(Flowchart: check in turn whether the 802.1ag versions are the same, whether the MEP levels of SwitchA and SwitchC are the same, whether the MEP level of SwitchB is too high, and whether SwitchB has learned the MAC address of SwitchC. The corresponding actions are to change the 802.1ag version, change the MEP levels, and modify the blackhole MAC or MAC address limit.)

Context

NOTE


Perform the following operations on the device where the fault occurs and on the downstream MEP or MIP of the device.

Procedure

Step 1 Run the display oam global configuration command to check whether all the devices on the trace path use the same 802.1ag version.

l If yes, go to Step 2.

l If not, run the cfm version command to switch the Ethernet CFM 802.1ag version between draft7 and standard2007.






CAUTION

After the IEEE 802.1ag version is changed, the original CFM configuration will be deleted. Therefore, use this command with caution.

– If SwitchA successfully traces the MAC address of SwitchC based on 802.1ag, the fault has been rectified.

– If SwitchA still fails to trace the MAC address of SwitchC, go to Step 2.

Step 2 Run the display this command in the MD view on the source node and the destination node to check whether the MEP levels of the destination node and the source node are the same.

l If yes, go to Step 3.

l If not, delete the MD, run the cfm md command to create a new one, and set the MEP level of the destination node to be the same as that of the source node.

– If SwitchA successfully traces the MAC address of SwitchC based on 802.1ag, the fault has been rectified.

– If SwitchA still fails to trace the MAC address of SwitchC, go to Step 3.

Step 3 Run the display cfm mep command to check whether the MEP level of the intermediate node is the same as or higher than the MEP level of SwitchA.

NOTE

The 802.1ag packets of a low-level MD are discarded in a high-level MD. The 802.1ag packets of a high-level MD can traverse a low-level MD. The 802.1ag packets of an MD cannot be transmitted through an MD of the same level.

l If not, go to Step 4.

l If yes, delete the MD, run the cfm md command to create a new one, and change the MEP level on the intermediate node.

– If SwitchA successfully traces the MAC address of SwitchC based on 802.1ag, the fault has been rectified.

– If SwitchA still fails to trace the MAC address of SwitchC, go to Step 4.
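
The level rule stated in the NOTE of Step 3 can be written as a small decision function. This is a minimal Python sketch for reasoning about a trace path; the function names and the returned strings are illustrative only, and the levels are assumed to be the integers 0 to 7 defined by IEEE 802.1ag.

```python
def md_action(packet_level, md_level):
    """Apply the MD level rule to an 802.1ag packet that meets an MD."""
    if packet_level > md_level:
        return "traverse"             # high-level packet crosses a low-level MD
    if packet_level < md_level:
        return "discard"              # low-level packet is dropped by a high-level MD
    return "blocked (same level)"     # same level: not transmitted through


def trace_can_pass(source_level, intermediate_levels):
    """True if a trace from the source can cross every intermediate node."""
    return all(md_action(source_level, lvl) == "traverse"
               for lvl in intermediate_levels)
```

For example, a trace started at level 5 crosses an intermediate node configured at level 3, but not one configured at level 5 or level 6, which is the condition that Step 3 checks.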

Step 4 Run the display mac-address dynamic unit unit-id command to check whether the intermediate node has a MAC address entry with the destination being SwitchC.

l If yes, go to Step 5.

l If not, run the ping mac-8021ag command to enable the intermediate node to learn the MAC address of SwitchC.

– If SwitchA successfully traces the MAC address of SwitchC based on 802.1ag, the fault has been rectified.

– If SwitchA still fails to trace the MAC address of SwitchC, go to Step 5.



----End







Relevant Alarms

EOAM1AG_1.3.6.1.4.1.2011.5.25.136.1.6.1 hwCfmFaultAlarm

Relevant Logs

EOAM1AG/0/ALARM_CONFIG_ERR

EOAM1AG/3/DEL_MD_ERR

10.3.2 No Unexpected-MEP Alarm Is Generated

This section describes the possible causes of failures to generate unexpected-MEP alarms and provides a step-by-step troubleshooting procedure.

Common Causes

This fault is commonly caused by one of the following:

l The interface with MEP configured is Down.

l The MD levels on the local and remote ends are different.

l The MEPs configured on the local and remote ends belong to different MAs.

l The MAs on the local and remote ends are associated with different VLANs.

l The number of unexpected-MEP alarms saved on the local device has reached the maximum value.


Figure 10-8







Figure 10-8 Troubleshooting flowchart for failures to generate unexpected-MEP alarms


NOTE


Procedure

Step 1 Check whether the interface with an MEP configured is in Down state.

Run the display interface command in any view to check the status of the interface.

<Quidway> display interface xgigabitethernet0/0/1

XGigabitEthernet0/0/1 current state : UP





Description:



Current system time: 2000-11-23 21:57:24-08:00



Duplex: FULL, Negotiation: ENABLE

Mdi : AUTO



Input peak rate 8536 bits/sec, Record time: 2000-11-19 02:24:19

Output peak rate 464 bits/sec, Record time: 2000-11-19 02:24:19


Unicast: 0, Multicast: 20804

Broadcast: 0, Jumbo: 0

Discard: 0, Total Error: 0

CRC: 0, Giants: 0

Jabbers: 0, Fragments: 0

Runts: 0, DropEvents: 0

Alignments: 0, Symbols: 0

Ignoreds: 0, Frames: 0


Unicast: 0, Multicast: 0

Broadcast: 1, Jumbo: 0

Discard: 0, Total Error: 0

Collisions: 0, ExcessiveCollisions: 0

Late Collisions: 0, Deferreds: 0

Buffers Purged: 0



Input bandwidth utilization : 0%

Output bandwidth utilization : 0%

If the interface is in Down state, run the display this command in the interface view to check whether the interface has been shut down.

l If the command output displays shutdown, run the undo shutdown command in the interface view.

l If shutdown is not displayed, go to 3.

l If the interface is in Up state, go to step 2.

Step 2 Check whether the MD levels at both ends are the same. If not, no unexpected-MEP alarm can be generated.

Run the display cfm md command in any view on both ends to check whether the Level field values on both ends are the same.

<SwitchA> display cfm md

The total number of MDs is : 1

------------------------------------------------

MD Name : md

MD Name Format : string

Level : 5

MIP Create-type : none

SenderID TLV-type : defer

MA list :

MA Name : ma

MA Name Format : string

Interval : 1000

VLAN ID : 23

VSI Name : --

L2VC ID : --




<SwitchB> display cfm md

The total number of MDs is : 1

------------------------------------------------

MD Name : md


Level : 6



MA list :

MA Name : ma


Interval : 1000

VLAN ID : 23

VSI Name : --

L2VC ID : --

l If the MD levels are different, set the same MD level for both ends.

l If the MD levels are the same, go to step 3.

Step 3 Check whether the MEPs on both ends belong to the same MA. If not, no unexpected-MEP alarm can be generated.

Run the display cfm ma command in any view to check the MA name on both ends.

<SwitchA> display cfm ma

The total number of MAs is : 1

--------------------------------------------------

MD Name : md


Level : 6



MA Name : ma1


Interval : 1000

Priority : 7

VLAN ID : 23

VSI Name : --

L2VC ID : --

MEP Number : 1

RMEP Number : 1

Suppressing Alarms : No

Sending AIS Packet : No

<SwitchB> display cfm ma


--------------------------------------------------

MD Name : md


Level : 6



MA Name : ma


Interval : 1000

Priority : 7

VLAN ID : 23

VSI Name : --

L2VC ID : --

MEP Number : 1

RMEP Number : 1


Sending AIS Packet : No

l If the MA names are different, configure the same MA for both ends.

l If the MA names are the same, go to step 4.

Step 4 Check whether the MAs on both ends are associated with the same VLAN. If not, no unexpected-MEP alarm can be generated.

Run the display cfm ma command in any view to check the VLANs associated with the MAs on both ends.

<SwitchA> display cfm ma


--------------------------------------------------

MD Name : md


Level : 6



MA Name : ma1


Interval : 1000

Priority : 7

VLAN ID : 23

VSI Name : --

L2VC ID : --

MEP Number : 1

RMEP Number : 1


Sending AIS Packet : No

<SwitchB> display cfm ma


--------------------------------------------------

MD Name : md


Level : 6



MA Name : ma


Interval : 1000

Priority : 7

VLAN ID : 21

VSI Name : --

L2VC ID : --

MEP Number : 1

RMEP Number : 1


Sending AIS Packet : No

l If the MAs are associated with different VLANs, associate the MAs with the same VLAN.

l If the MAs are associated with the same VLAN, go to step 5.
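
The comparisons in Steps 2 to 4 (MD level, MA name, and VLAN ID) can be automated once the output of both ends has been captured as text. The following Python sketch assumes output shaped like the samples above, with one "Field : value" pair per line; the function names are hypothetical and the sample strings are abbreviated copies of the Step 4 output.

```python
def parse_fields(output):
    """Turn 'Field : value' lines into a dictionary; other lines are ignored."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def mismatches(local_output, remote_output,
               keys=("Level", "MA Name", "VLAN ID")):
    """Return the fields whose values differ between the two ends."""
    local = parse_fields(local_output)
    remote = parse_fields(remote_output)
    return [k for k in keys if local.get(k) != remote.get(k)]


switch_a = """MD Name : md
Level : 6
MA Name : ma1
VLAN ID : 23"""

switch_b = """MD Name : md
Level : 6
MA Name : ma
VLAN ID : 21"""

differing = mismatches(switch_a, switch_b)
```

For the two samples, `differing` lists the MA name and the VLAN ID, which are the two settings that Steps 3 and 4 ask you to align.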

Step 5 Check whether the number of unexpected-MEP alarms reaches the maximum value. If so, no unexpected-MEP alarm can be generated.

Run the display cfm error-info error-type unexpected-mep [ md md-name ma ma-name mep-id mep-id ] command in any view to check the number of unexpected-MEP alarms.

<Quidway> display cfm error-info error-type unexpected-mep

The total number of unexpected MEPs is : 1

--------------------------------------------------

MD Name : md

Level : 0




MA Name : ma

MEP ID : 1

Unexpected MEP List:

Unexpected MEP ID : 2

MAC Address : 0025-e644-81a4

l If the number of unexpected-MEP alarms reaches the maximum value, clear unexpected-MEP alarms.

l If the number of unexpected-MEP alarms is smaller than the maximum value, go to step 6.

Step 6 Collect the following information and contact Huawei technical support personnel:

l Results of the preceding troubleshooting procedure

l Configuration file, logs, and alarms of the switch

----End


Relevant Alarms

None

Logs

An unexpected-MEP alarm is generated:

Oct 28 2011 09:58:59 L2PRO.139.107 EOAM1AG/4/UNEXPECTEDMEP:OID 1.3.6.1.4.1.2011.5.25.136.1.6.36 MEP received a CCM with unexpected MEP. (MdIndex=25, MdIndex=25, MaIndex=1, MdIndex=25, MaIndex=1, MepId=2, MdName=md, MaName=ma, MepId=2)

An unexpected-MEP alarm is aged out:

Oct 28 2011 09:59:09 L2PRO.139.107 EOAM1AG/4/UNEXPECTEDMEPCLEARED:OID 1.3.6.1.4.1.2011.5.25.136.1.6.37 MEP did not receive any CCM with unexpected MEP before timeout. (MdIndex=25, MdIndex=25, MaIndex=1, MdIndex=25, MaIndex=1, MepId=2, MdName=md, MaName=ma, MepId=2)

10.4 BFD Troubleshooting

This chapter describes common causes of a BFD fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

10.4.1 BFD Session Cannot Go Up

Common Causes

This fault is commonly caused by one of the following:

l The discriminators of the two devices are inconsistent.

l The link detected by the BFD session fails. As a result, BFD packets cannot be exchanged between the two ends of the BFD session.

l The BFD session status flaps.







Figure 10-9 Troubleshooting flowchart for the fault that a BFD session cannot go Up




Context

NOTE


Procedure

Step 1 Run the display current-configuration command to check whether the configuration of the BFD session has been committed.

l If the commit command is displayed, the configuration of the BFD session has been committed. Then, go to Step 2.

l If the commit command is not displayed, the configuration of the BFD session has not been committed. In this case, run the commit command to commit the configuration. After doing so, run the display bfd session all command to check the State field.

– If the State field is Up, the BFD session is successfully established.

– If the State field is not Up, go to Step 2.

Step 2 Run the display current-configuration command to check whether the discriminators of the two devices are consistent.

l If they are inconsistent, run the undo bfd command to delete the existing BFD session, and then run the bfd bind peer-ip command to create a new BFD session. Finally, run the discriminator { local discr-value | remote discr-value } command to configure the local and remote discriminators. Ensure that the local discriminator on the local end is the same as the remote discriminator on the remote end and the remote discriminator on the local end is the same as the local discriminator on the remote end. Then, go to Step 3.

l If they are consistent, go to Step 4.
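
The cross-matching rule for discriminators in Step 2 is easy to get backwards, so a short Python check may help. It is only a sketch: each end is represented by a hypothetical dictionary holding the values configured with discriminator local and discriminator remote.

```python
def discriminators_consistent(local_end, remote_end):
    """Each end's local discriminator must equal the other end's remote one."""
    return (local_end["local"] == remote_end["remote"]
            and local_end["remote"] == remote_end["local"])


# Hypothetical values read from the configuration of each device.
device_a = {"local": 10, "remote": 20}
device_b = {"local": 20, "remote": 10}
device_c = {"local": 10, "remote": 20}  # copied from device_a by mistake
```

Here device_a and device_b form a valid pair, whereas device_a and device_c do not, even though their configurations look identical.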

Step 3 Run the display bfd session all command to check the State field.

l If the State field is Up, the BFD session is successfully established.

l If the State field is not Up, go to Step 4.

Step 4 Run the display bfd statistics session all command several times to view statistics about the BFD packets of the BFD session.

l If the value of the Received Packets field does not increase, go to Step 5.

l If the value of the Send Packets field does not increase, go to Step 6.

l If the values of Received Packets and Send Packets fields increase, go to Step 9.

l If none of the values of the Received Packets, Send Packets, Received Bad Packets, and Send Bad Packets fields increase, go to Step 7.

l If the value of the Down Count field increases, the BFD session flaps. Then, go to Step 7.
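
The branching in Step 4 depends on how the counters change between two runs of the command. The Python sketch below encodes that decision for two counter snapshots. The field names come from the step above; the function name is hypothetical, and because the listed conditions overlap, the order in which they are tested here is an assumption.

```python
PACKET_FIELDS = ("Received Packets", "Send Packets",
                 "Received Bad Packets", "Send Bad Packets")


def next_step(before, after):
    """Return the step number to go to, given two counter snapshots."""
    def grew(field):
        return after[field] > before[field]

    if grew("Down Count"):
        return 7                      # the session flaps
    if not any(grew(f) for f in PACKET_FIELDS):
        return 7                      # no counter moves at all
    if not grew("Received Packets"):
        return 5                      # nothing is being received
    if not grew("Send Packets"):
        return 6                      # nothing is being sent
    return 9                          # packets flow in both directions


idle = dict.fromkeys(PACKET_FIELDS + ("Down Count",), 0)
only_sending = dict(idle, **{"Send Packets": 10})
both_ways = dict(idle, **{"Send Packets": 10, "Received Packets": 10})
```

Comparing `idle` with `only_sending` leads to Step 5, because packets are sent but none are received.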

Step 5 Run the display bfd statistics session all command several times to check the Received Bad Packets field.

l If the value of this field increases, the BFD packets have been received and discarded. Then, go to Step 9.

l If the value of this field does not increase, the BFD packets have not been received. Then, go to Step 7.

Step 6 Run the display bfd statistics session all command several times to check the Send Bad Packets field.

l If the value of this field increases, the BFD packets sent by the BFD session have been discarded. Then, go to Step 9.

l If the value of this field does not increase, the BFD packets failed to be sent. Then, go to Step 7.

Step 7 Run the display bfd statistics session all command several times. If the BFD session still does not go Up, run the ping command on one end to ping the other end of the BFD session.

l If the ping fails, the link is faulty. See the relevant section to rectify the fault on the link.

l If the ping is successful, view the configurations on the related interfaces.

NOTE

BFD packets are transmitted in the default VLAN. Before a BFD session is established on an interface, configure the interface to allow packets of the default VLAN to pass through.

– If the configurations on the interface are incorrect, correct the configurations.

– If the configurations on the interface are correct, go to Step 8.

Step 8 Run the display current-configuration command to view the min-tx-interval and min-rx-interval fields to check that the BFD detection period is longer than the delay on the link.

l If the BFD detection period is shorter than the delay on the link, run the detect-multiplier, min-rx-interval, and min-tx-interval commands to adjust the values so that the detection period is longer than the delay on the link.

l If the BFD detection period is longer than the delay on the link, go to Step 9.
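
Step 8 compares the detection period with the link delay but does not say how the period follows from the three parameters. As a rough guide, the Python sketch below uses the asynchronous-mode formula from the BFD standard (RFC 5880), which is an assumption here because this document does not state the formula: the detection time is the detect multiplier of the remote end multiplied by the larger of the local receive interval and the remote transmit interval. All names and values are hypothetical.

```python
def detection_time_ms(remote_detect_multiplier, local_min_rx_ms, remote_min_tx_ms):
    """Detection time on the local end, following the RFC 5880 rule."""
    negotiated_interval_ms = max(local_min_rx_ms, remote_min_tx_ms)
    return remote_detect_multiplier * negotiated_interval_ms


def detection_outlasts_delay(detection_ms, link_delay_ms):
    """True when the detection period is longer than the link delay."""
    return detection_ms > link_delay_ms


period = detection_time_ms(remote_detect_multiplier=3,
                           local_min_rx_ms=100,
                           remote_min_tx_ms=200)
```

With these example values the period is 600 ms. Raising the multiplier or either interval lengthens the period, which is the adjustment Step 8 asks for when the link delay is larger.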




----End


Relevant Alarms

l BFD_1.3.6.1.4.1.2011.5.25.38.3.1 hwBfdSessDown

l BFD_1.3.6.1.4.1.2011.5.25.38.3.2 hwBfdSessUp

Relevant Logs

l BFD/4/STACHG_TODWN

l BFD/4/STACHG_TOUP






10.4.2 Interface Forwarding Is Interrupted After a BFD Session Detects a Fault and Goes Down

Common Causes

This fault is commonly caused by the following:

l The BFD session status is associated with the interface status.


Figure 10-10 Troubleshooting flowchart for the fault that the interface forwarding is interrupted after a BFD session detects a fault and goes Down




Context

NOTE


Procedure

Step 1 Run the display interface interface-type interface-number command to check the status of the interface to which the BFD session is bound.

l If the Line protocol current state field displays DOWN(BFD status down), the interface status is set to BFD status down after the BFD session detects a link fault. Then, go to Step 2.

l If the Line protocol current state field displays UP but the interface cannot forward packets, the forwarding module is faulty. See the relevant section to rectify the forwarding fault.

Step 2 Run the display bfd session all command to view the status of the BFD session.

l If the BFD session is Down, go to Step 3.

l If the BFD session is Up, go to Step 4.

Step 3 Run the display current-configuration configuration bfd-session command to check whether the process-interface-status command is configured.

l If the process-interface-status command is configured, the interface is set to DOWN(BFD status down) because the BFD session detected a fault and went Down.

l If the process-interface-status command is not configured, go to Step 4.




----End


Relevant Alarms

None.

Relevant Logs

None.

10.4.3 Changed BFD Session Parameters Do Not Take Effect






Common Causes

This fault is commonly caused by the following:

l After parameters of a BFD session have been changed, the changed configurations are not committed.


Figure 10-11 Troubleshooting flowchart for the fault that the changed BFD session parameters do not take effect


Context

NOTE


Procedure

Step 1 Run the display current-configuration configuration bfd-session command to check whether the commit command is configured.

l If the commit command is configured, the changed BFD session parameters have been committed. Then, go to Step 3.

l If the commit command is not configured, the changed BFD session parameters have not been committed. In this case, run the commit command, and then go to Step 2.

Step 2 Run the display bfd session all command to check whether the BFD session parameters are the specified values.

l If the BFD session parameters are the specified values, the modified parameters have taken effect.

l If the BFD session parameters are not the specified values, go to Step 3.




----End


Relevant Alarms

None.

Relevant Logs

None.

10.4.4 Dynamic BFD Session Fails to Be Created

Common Causes

This fault is commonly caused by one of the following:

l BFD is not enabled for the protocol.

l The route to the peer of the BFD session does not exist in the routing table.

l The interface is prohibited from creating a BFD session.




Figure 10-12 Troubleshooting flowchart for the fault that a dynamic BFD session fails to be created


Context

NOTE


Procedure

Step 1 Run the display current-configuration configuration bfd command to check whether BFD is enabled for the protocol.

l If BFD is not enabled for the protocol, enable BFD. Then, go to Step 2.

l If BFD is enabled, go to Step 3.

Step 2 Run the display bfd session all command to view the State field.

l If the State field in the command output is Up, the BFD session has been created.

l If the State field in the command output is not Up, go to Step 3.

Step 3 Run the display ip routing-table command to check whether the route of the link detected by the BFD session exists.

l If the route exists, go to Step 4.

l If the route does not exist, the BFD session associated with the protocol fails to be created. See the relevant section.

Step 4 Run the interface interface-type interface-number command to enter the view of the interface, and then run the display this command to check whether a command is configured that prevents the interface from dynamically creating a BFD session.

l If such a command is configured, run the undo ospf bfd block command to enable the interface to dynamically create a BFD session. Then, run the display bfd session all command to check whether the BFD session is Up. If the session is not Up, go to Step 5.

l If such a command is not configured, go to Step 5.




----End


Relevant Alarms

None.

Relevant Logs

None.


BFD Session Is Down Because the Interface Does Not Allow Packets from the Default VLAN to Pass Through

Fault Symptom

As shown in Figure 10-13, the Eth-Trunk between the CX600 and the Switch is bound to static BFD. After the configuration, the BFD session becomes Down.



Figure 10-13 (networking diagram: the CX600 and the Switch are connected through an Eth-Trunk; each device is labeled with VLAN 1023 and interfaces XGE0/0/1, XGE0/0/2, and XGE0/0/3)

Fault Analysis

1. Run the display current-configuration configuration bfd-session command on the Switch to check whether the BFD session configuration is committed. If the commit field is displayed, the BFD session configuration is committed.

2. Run the ping command on the Switch to check whether the network is reachable. The ping operation succeeds, indicating that the network is reachable.

3. Run the display current-configuration interface interface-type interface-number command on the Switch to check the interface configuration.

#

interface Eth-Trunk1

#

The preceding information indicates that the interface allows only packets from VLAN 1023 to pass through. On the Switch, BFD packets are transmitted in the default VLAN (that is, VLAN 1), so the BFD session cannot be established.

Procedure

Step 1 Modify the configuration or the PVID of the interface to allow BFD packets from VLAN 1 to pass through.

1. Modify the configuration of the interface so that BFD packets from VLAN 1 can pass through.

a. Run the system-view command on the Switch to enter the system view.

b. Run the interface interface-type interface-number command to enter the Eth-Trunk interface view.

c. Run the port trunk allow-pass vlan 1 command to configure the interface to allow packets from VLAN 1 to pass through.

2. Change the PVID of the interface.

a. Run the system-view command on the Switch to enter the system view.

b. Run the interface interface-type interface-number command to enter the Eth-Trunk interface view.

c. Run the port trunk pvid vlan vlan-id command to configure the default VLAN of the interface.

d. Run the port trunk allow-pass vlan vlan-id command to configure the interface to allow packets from the default VLAN to pass through.






After the preceding operations are complete, check the BFD session status on the Switch. The BFD session is established successfully. The fault is rectified.

----End

Summary

BFD packets are transmitted in the default VLAN. Before a BFD session is established on an interface, configure the interface to allow packets of the default VLAN to pass through.

If the BFD session configuration is not committed or the configurations at two ends of the session are different, the BFD session may be Down. Check the configuration to locate faults.
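
The check performed in this case can be scripted against a saved interface configuration. The Python sketch below collects the VLANs permitted by port trunk allow-pass vlan lines and tests whether the default VLAN is among them. The sample configuration lines are hypothetical (the original output is not reproduced in this case), the "to" range syntax is assumed to follow the usual form of the command, and keywords such as all are not handled.

```python
def allowed_vlans(config_lines):
    """Collect VLAN IDs from 'port trunk allow-pass vlan ...' lines."""
    vlans = set()
    for line in config_lines:
        words = line.split()
        if words[:4] != ["port", "trunk", "allow-pass", "vlan"]:
            continue
        tokens = words[4:]
        i = 0
        while i < len(tokens):
            # 'X to Y' denotes an inclusive range of VLAN IDs.
            if i + 2 < len(tokens) and tokens[i + 1] == "to":
                vlans.update(range(int(tokens[i]), int(tokens[i + 2]) + 1))
                i += 3
            else:
                vlans.add(int(tokens[i]))
                i += 1
    return vlans


def bfd_vlan_allowed(config_lines, default_vlan=1):
    """True if the default VLAN that carries BFD packets is permitted."""
    return default_vlan in allowed_vlans(config_lines)


# Hypothetical interface configuration before and after the fix.
before_fix = ["port link-type trunk", "port trunk allow-pass vlan 1023"]
after_fix = ["port link-type trunk", "port trunk allow-pass vlan 1 1023"]
```

If the PVID of the interface is changed instead, pass the new value as `default_vlan`.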

10.5 DLDP Troubleshooting

This chapter describes common causes of a DLDP fault, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

10.5.1 DLDP Fails to Detect a Directly Connected Neighbor

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure for a failure to detect a directly connected neighbor.

Common Causes

This fault is commonly caused by one of the following:

l The link is faulty.

l The DLDP function is disabled on the remote end.

l DLDP parameters on the local device and the neighbor are different.




Figure 10-14 DLDP troubleshooting flowchart


NOTE


Procedure

Step 1 Check whether the interface that cannot detect a neighbor is Up.

Run the display interface interface-type interface-number command to view the current state field.

l If the interface is Down, rectify the fault according to Connected Ethernet Interfaces Down.

l If the interface is Up, go to step 2.

Step 2 Check that DLDP is enabled.

Run the display dldp command to check whether DLDP is enabled globally, that is, view the DLDP global status field. Run the display this command in the interface view. If the output information contains the dldp enable field, DLDP is enabled on the interface. If the output information does not contain the dldp enable field, the interface works in auto-negotiation mode.

l If DLDP is not enabled globally or on an interface, run the dldp enable command in the corresponding view to enable DLDP.

l If DLDP is enabled, go to step 3.

Step 3 Check whether DLDP parameters at both ends are the same.

Run the display dldp command to view the information in the following table.

Field | Method

DLDP interval | Check whether the intervals for sending DLDP packets at both ends are the same. If the intervals for sending DLDP packets at both ends are different, run the dldp interval interval-value command in the system view on both devices to set the same interval on both devices.

DLDP authentication-mode | Check whether authentication modes and passwords at both ends are the same. If the two devices use different authentication modes or passwords, run the dldp authentication-mode { md5 cipher-value | none | simple cipher-value } command in the system view to set the same authentication mode and password on both devices.




----End


Relevant Alarms

None.

Relevant Logs

None.

10.6 RRPP Troubleshooting

10.6.1 RRPP Loop Occurs Temporarily

Common Causes

After RRPP is configured on a device, a loop occurs temporarily.

This fault is commonly caused by one of the following:



l The configuration is incorrect.

l Values of the Failtime timers configured for nodes along the RRPP ring are different.


Temporary RRPP loop troubleshooting is based on the network shown in Figure 10-15.

Figure 10-15 Networking diagram of RRPP

(Diagram labels: SwitchA, SwitchB, and SwitchC, connected through interfaces XGE0/0/1, XGE0/0/2, XGE0/0/4, and XGE0/0/8.)

The troubleshooting roadmap is as follows:

l Check that every node on the RRPP ring is correctly configured.

l Check that the Failtime timer of every node on the RRPP ring is set to the same value.

Figure 10-16 Troubleshooting flowchart for the fault that an RRPP loop occurs temporarily

NOTE


Procedure

Step 1 Check that every node on the RRPP ring is correctly configured.

Run the display this command in the RRPP view of each node on the RRPP ring to view RRPP configurations.

[RouterA-rrpp-domain-region1] display this

#

rrpp domain 1

control-vlan 100

protected-vlan reference-instance 0

timer hello-timer 1 fail-timer 3

ring 1 node-mode master primary-port XGigabitEthernet0/0/2 secondary-port XGigabitEthernet0/0/4 level 0

ring 1 enable

#

return

Check whether all nodes on the RRPP ring belong to the same domain, whether the nodes are configured with the same control VLAN ID and instance number, and whether the RRPP ring has only one master node.



l If all configurations are correct, go to Step 2.

l If any of the preceding configurations is incorrect, correct the RRPP configuration. For correct configurations, see the chapter "RRPP Configuration" in the S6700 Configuration Guide - LAN Access and MAN Access.

Step 2 Check that the Failtime timer of every node on the RRPP ring is set to the same value.

Run the display rrpp verbose domain domain-id command in any view to check detailed RRPP configurations.

[RouterA-rrpp-domain-region1] display rrpp verbose domain 1

Domain Index : 1


Hello Timer : 1 sec(default is 1 sec) Fail Timer : 3 sec(default is 3 sec)

RRPP Ring : 1

Ring Level : 0

Node Mode : Master

Ring State : Complete

Is Enabled : Enable Is Active : Yes


Secondary port: XGigabitEthernet0/0/2 Port status: BLOCKED

l If the Failtime timers of the nodes on the RRPP ring are set to different values, correct the configurations according to the chapter "RRPP Configuration" in the S6700 Configuration Guide - Reliability.

l If the Failtime timer of every node on the RRPP ring is set to the same value, go to Step 3.
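
The checks in Steps 1 and 2 amount to comparing a handful of settings across all nodes on the ring. The Python sketch below shows one way to do this once the values have been collected from each node. The class, its field names, and the node-mode string "transit" are hypothetical; only the checked conditions come from the steps above.

```python
from dataclasses import dataclass


@dataclass
class RrppNode:
    name: str
    domain: int
    control_vlan: int
    instance: int
    fail_timer: int
    node_mode: str  # "master" or "transit" (assumed labels)


def rrpp_problems(nodes):
    """Return a list of inconsistencies found on the ring."""
    problems = []
    checks = (("domain", "domain"),
              ("control_vlan", "control VLAN"),
              ("instance", "protected instance"),
              ("fail_timer", "Fail timer"))
    for attr, label in checks:
        values = {getattr(node, attr) for node in nodes}
        if len(values) > 1:
            problems.append(f"{label} differs: {sorted(values)}")
    masters = [node.name for node in nodes if node.node_mode == "master"]
    if len(masters) != 1:
        problems.append(f"expected exactly one master node, found {len(masters)}")
    return problems


ring = [
    RrppNode("SwitchA", 1, 100, 0, 3, "master"),
    RrppNode("SwitchB", 1, 100, 0, 3, "transit"),
    RrppNode("SwitchC", 1, 100, 0, 6, "transit"),  # Fail timer set differently
]
```

For this sample ring the only reported problem is the Fail timer mismatch, which corresponds to the second common cause of a temporary loop.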



----End


Relevant Alarms

RRPP_1.3.6.1.4.1.2011.5.25.113.4.2 hwRrppRingFail

Relevant Logs

RRPP/3/FAIL

RRPP/5/PBLK

RRPP/5/RESTORE

10.7 MAC Swap Loopback Troubleshooting

This chapter describes common causes of MAC swap loopback faults, and provides the corresponding troubleshooting flowcharts, troubleshooting procedures, alarms, and logs.






10.7.1 No Remote Loopback Traffic Is Received by the Tester

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when no remote loopback traffic is received by the tester.

Common Causes

As shown in Figure 10-17, remote loopback is configured on XGE0/0/1 of SwitchA. Traffic is sent from a tester to check connectivity between XGE0/0/2 of SwitchB and XGE0/0/1 of SwitchA. The two interfaces are in the same VLAN.

Figure 10-17 Network diagram of remote loopback

[Figure labels: User; SwitchA (XGE0/0/2, XGE0/0/1); Metro; SwitchB (XGE0/0/1, XGE0/0/2); Tester; Link to be detected]

The tester does not receive the loopback traffic sent from SwitchA. This fault is commonly caused by one of the following:

l The remote loopback test has not started.

l The VLAN ID specified for loopback packets is different from the VLAN ID of the remote loopback interface.

l The source and destination MAC addresses of the packets sent from the tester are different from those configured on the remote loopback interface.

l The tester is faulty.

l Packets sent from the tester are not IP packets.

l The link between SwitchA and SwitchB is not functioning properly.

l SwitchA, SwitchB, or either interface module between SwitchA and SwitchB fails.





Figure 10-18 Troubleshooting flowchart for a remote loopback failure

[Flowchart. Start: No remote loopback traffic is received by the tester. Decisions and the action taken on "No", in order: (1) Has the switch connected to the tester received packets from the tester? No: rectify the fault of the tester, cable, or device hardware. (2) Is the remote loopback configuration correct? No: modify the remote loopback configuration. (3) Has the remote loopback switch received packets from the tester? No: ensure that the link between the switches is working properly. End.]




Procedure

Step 1 Check whether SwitchB has received the packets sent from the tester.

Run the display mac-address command on SwitchB to check whether SwitchB has learned the source MAC address of the packets sent from the tester and whether the source MAC address is learned on the interface connected to the tester.

l If SwitchB has not learned the source MAC address of the packets sent from the tester, check that:

– The tester is configured correctly and packets constructed by the tester are IP packets.

– The cable between the tester and SwitchB is functioning properly.

If SwitchB still cannot learn the source MAC address of the packets sent from the tester, connect the tester to another interface of SwitchB. If the fault persists, go to step 4.

l If SwitchB has learned the source MAC address of the packets sent from the tester and the source MAC address is learned on the interface connected to the tester, go to step 2.
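The comparison in Step 1 is easy to get wrong when the tester and the switch write MAC addresses in different notations. The following Python sketch assumes that the MAC address entries have already been extracted from the display mac-address output into (MAC address, interface) pairs; that pair format and the sample values are hypothetical.

```python
def normalize_mac(mac):
    """Reduce a MAC address to 12 lowercase hex digits, whatever the separators."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def learned_on(entries, tester_mac, expected_interface):
    """Return "ok", "wrong-interface", or "not-learned" for the tester MAC address."""
    target = normalize_mac(tester_mac)
    interfaces = [intf for mac, intf in entries if normalize_mac(mac) == target]
    if not interfaces:
        return "not-learned"
    return "ok" if expected_interface in interfaces else "wrong-interface"


# Hypothetical entries as (MAC address, learned-on interface) pairs.
entries = [("000b-0918-8bc1", "XGE0/0/2"), ("0000-5e00-5301", "XGE0/0/1")]
print(learned_on(entries, "00:0b:09:18:8b:c1", "XGE0/0/2"))
```

A result of "not-learned" or "wrong-interface" corresponds to the first branch of Step 1; "ok" corresponds to the second branch.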

Step 2 Check the remote loopback configuration.

Run the display loopback swap-mac information command on SwitchA to check the remote loopback status and take actions accordingly.

Field: Loopback state
Description: Remote loopback status.
l Running: indicates that the loopback test has started.
l stop: indicates that the loopback test has not started.
Action: If this field is displayed as stop, run the loopback swap-mac start command on the remote loopback interface to start the loopback test.

Field: Loopback source MAC
Description: Source MAC address of loopback packets.
Action: If the value of this field is different from the source MAC address of the packets sent from the tester, run the loopback remote swap-mac command on the remote loopback interface to change the source MAC address. Alternatively, change the source MAC address of packets on the tester.

Field: Loopback destination MAC
Description: Destination MAC address of loopback packets.
Action: If the value of this field is different from the destination MAC address of the packets sent from the tester, run the loopback remote swap-mac command on the remote loopback interface to change the destination MAC address. Alternatively, change the destination MAC address of packets on the tester.

Field: Loopback vlan
Description: VLAN ID of loopback packets.
Action: If the value of this field is different from the VLAN ID of the remote loopback interface, run the loopback remote swap-mac command on the remote loopback interface to change the VLAN ID of loopback packets.
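The field-by-field comparison in Step 2 can be expressed as a small Python sketch. It assumes that the four fields of the display loopback swap-mac information output have been copied into a dictionary keyed by field name, and that the tester settings are known; the dictionary layout, the function name, and the sample values are hypothetical.

```python
def loopback_actions(fields, tester, interface_vlan):
    """Return the corrective actions suggested by the Step 2 field checks."""

    def same_mac(a, b):
        # Compare MAC addresses regardless of separators and letter case.
        def strip(mac):
            return mac.lower().replace("-", "").replace(":", "").replace(".", "")
        return strip(a) == strip(b)

    actions = []
    if fields["Loopback state"].lower() != "running":
        actions.append("start the loopback test")
    if not same_mac(fields["Loopback source MAC"], tester["src_mac"]):
        actions.append("align the source MAC address")
    if not same_mac(fields["Loopback destination MAC"], tester["dst_mac"]):
        actions.append("align the destination MAC address")
    if int(fields["Loopback vlan"]) != int(interface_vlan):
        actions.append("align the VLAN ID")
    return actions


# Illustrative values only.
fields = {
    "Loopback state": "stop",
    "Loopback source MAC": "0000-0000-0001",
    "Loopback destination MAC": "0000-0000-0002",
    "Loopback vlan": "10",
}
tester = {"src_mac": "00:00:00:00:00:01", "dst_mac": "00:00:00:00:00:03"}
print(loopback_actions(fields, tester, interface_vlan=10))
```

Each returned action maps to the Action entry of the corresponding field above; an empty list means that the loopback configuration matches the tester.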



Step 3 Check whether SwitchA has received the packets sent from the tester.

Run the display mac-address command on SwitchA to check whether SwitchA has learned the source MAC address of the packets sent from the tester and whether the source MAC address is learned on the remote loopback interface.

l If SwitchA has not learned the source MAC address of the packets sent from the tester, ensure that the link between SwitchA and SwitchB is functioning properly. If SwitchA still cannot learn the source MAC address of the packets sent from the tester, check whether packets of other services can be transmitted between SwitchA and SwitchB.

– If packets of other services cannot be transmitted between SwitchA and SwitchB, interfaces between them may fail. Connect the cable to other interfaces of the switches and add the interfaces to the VLAN. If the fault persists, go to step 4.

– If packets of other services can be transmitted between SwitchA and SwitchB, go to step 4.

l If SwitchA has learned the source MAC address of the packets sent from the tester and the source MAC address is learned on the remote loopback interface, go to step 4.



----End


Relevant Alarms

None.

Relevant Logs

None.






10.7.2 No Local Loopback Traffic Is Received by the Tester

This section describes the troubleshooting flowchart and provides a step-by-step troubleshooting procedure to use when no local loopback traffic is received by the tester.

Common Causes

As shown in Figure 10-19, local loopback is configured on XGE0/0/2 of SwitchA. Traffic is sent from a tester to check connectivity between XGE0/0/2 of SwitchB and XGE0/0/1 of SwitchA. The two interfaces are in the same VLAN.

Figure 10-19 Network diagram of local loopback

[Figure labels: User; SwitchA (XGE0/0/2, XGE0/0/1); Metro; SwitchB (XGE0/0/1, XGE0/0/2); Tester; Link to be detected]

The tester does not receive the loopback traffic sent from SwitchA. This fault is commonly caused by one of the following:

l The local loopback test has not started.

l The VLAN ID specified for loopback packets is different from the VLAN ID of the local loopback interface.

l The source and destination MAC addresses of the packets sent from the tester are different from those configured on the local loopback interface.

l The tester is faulty.

l Packets sent from the tester are not IP packets.

l The link between SwitchA and SwitchB is not functioning properly.

l SwitchA, SwitchB, or either interface module between SwitchA and SwitchB fails.





Figure 10-20 Troubleshooting flowchart for a local loopback failure

[Flowchart. Start: No local loopback traffic is received by the tester. Decisions and the action taken on "No", in order: (1) Has the switch connected to the tester received packets from the tester? No: rectify the fault of the tester, cable, or device hardware. (2) Is the local loopback configuration correct? No: modify the local loopback configuration. (3) Has the local loopback switch received packets from the tester? No: ensure that the link between the switches is working properly. Each action is followed by the decision "Is fault rectified?". End.]




Procedure

Step 1 Check whether SwitchB has received the packets sent from the tester.

Run the display mac-address command on SwitchB to check whether SwitchB has learned the source MAC address of the packets sent from the tester and whether the source MAC address is learned on the interface connected to the tester.

l If SwitchB has not learned the source MAC address of the packets sent from the tester, check that:

– The tester is correctly configured.

– The cable between the tester and SwitchB is functioning properly.

If SwitchB still cannot learn the source MAC address of the packets sent from the tester, connect the tester to another interface of SwitchB. If the fault persists, go to step 4.

l If SwitchB has learned the source MAC address of the packets sent from the tester and the source MAC address is learned on the interface connected to the tester, go to step 2.

Step 2 Check the local loopback configuration.

Run the display loopback swap-mac information command on SwitchA to check the local loopback status and take actions accordingly.

Field: Loopback state
Description: Local loopback status.
l Running: indicates that the loopback test has started.
l stop: indicates that the loopback test has not started.
Action: If this field is displayed as stop, run the loopback swap-mac start command on the local loopback interface to start the loopback test.

Field: Loopback source MAC
Description: Source MAC address of loopback packets.
Action: If the value of this field is different from the source MAC address of the packets sent from the tester, run the loopback local swap-mac command on the local loopback interface to change the source MAC address. Alternatively, change the source MAC address of packets on the tester.

Field: Loopback destination MAC
Description: Destination MAC address of loopback packets.
Action: If the value of this field is different from the destination MAC address of the packets sent from the tester, run the loopback local swap-mac command on the local loopback interface to change the destination MAC address. Alternatively, change the destination MAC address of packets on the tester.

Field: Loopback vlan
Description: VLAN ID of loopback packets.
Action: If the value of this field is different from the VLAN ID of the local loopback interface, run the loopback local swap-mac command on the local loopback interface to change the VLAN ID of loopback packets.



Step 3 Check whether SwitchA has received the packets sent from the tester.

Run the display mac-address command on SwitchA to check whether SwitchA has learned the source MAC address of the packets sent from the tester and whether the source MAC address is learned on the local loopback interface.

l If SwitchA has not learned the source MAC address of the packets sent from the tester, ensure that the link between SwitchA and SwitchB is functioning properly. If SwitchA still cannot learn the source MAC address of the packets sent from the tester, check whether packets of other services can be transmitted between SwitchA and SwitchB.

– If packets of other services cannot be transmitted between SwitchA and SwitchB, interfaces between them may fail. Connect the cable to other interfaces of the switches and add the interfaces to the VLAN. If the fault persists, go to step 4.

– If packets of other services can be transmitted between SwitchA and SwitchB, go to step 4.

l If SwitchA has learned the source MAC address of the packets sent from the tester and the source MAC address is learned on the local loopback interface, go to step 4.
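The order of the three checks in this procedure matters: each check is meaningful only if the previous one has passed. A minimal Python sketch of that ordering is shown below; the function name and its three Boolean parameters are hypothetical and stand for the outcomes of Steps 1 to 3.

```python
def next_action(tester_switch_received, loopback_config_correct, loopback_switch_received):
    """Return the first corrective action that applies, following Steps 1 to 3 in order."""
    if not tester_switch_received:
        # Step 1 failed: SwitchB has not learned the tester MAC address.
        return "Rectify the fault of the tester, cable, or device hardware"
    if not loopback_config_correct:
        # Step 2 failed: loopback state, MAC addresses, or VLAN ID do not match.
        return "Modify the local loopback configuration"
    if not loopback_switch_received:
        # Step 3 failed: SwitchA has not learned the tester MAC address.
        return "Ensure that the link between the switches is working properly"
    return "All three checks passed"


print(next_action(True, False, True))
```

Evaluating the checks in this order avoids changing the loopback configuration when the packets never reach the first switch.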



----End


Relevant Alarms

None.

Relevant Logs

None.






10.8 ERPS Troubleshooting

This chapter describes common causes of Ethernet ring protection switching (ERPS) faults, and provides the corresponding troubleshooting flowchart, troubleshooting procedure, alarms, and logs.

10.8.1 Traffic Forwarding Fails on an ERPS Link

This section describes the possible causes of a traffic forwarding failure on an ERPS link and provides a step-by-step troubleshooting procedure.

Common Causes

This fault is commonly caused by one of the following:

l The link fails.

l The ERPS configuration is incorrect.

l The interface does not allow packets of the specified VLAN to pass.


Figure 10-21 shows the troubleshooting flowchart.





Figure 10-21 ERPS link troubleshooting flowchart

[Flowchart. Start: ERPS link fails to forward traffic. Decisions, in order: (1) Is ERPS link working properly? (2) Is interface Up? No: check that the physical link is normal. (3) Does interface allow packets of the VLAN? No: configure interface to allow packets of the VLAN to pass. End.]




Procedure

Step 1 Check the port roles on the ERPS ring and status of each device on the ring.

An ERPS ring should have only one ring protection link (RPL) owner and all the other ports should be common ports.

Run the display erps [ ring ring-id ] [ verbose ] command in any view to check whether the ERPS status is Idle. (Perform this operation on each device on the ERPS ring.)

[Quidway] display erps ring 1 verbose

Ring ID : 1

Description : Ring 1

Control Vlan : 1025

Protected Instance : 0 to 48

WTR Timer Setting (min) : 1 Running (s) : 0




Guard Timer Setting (csec) : 200 Running (csec) : 0

Holdoff Timer Setting (deciseconds) : 0 Running (deciseconds) : 0

Ring State : Idle

RAPS_MEL : 7

Time since last topology change : 0 days 0h:2m:53s

--------------------------------------------------------------------------------

Port Port Role Port Status Signal Status

--------------------------------------------------------------------------------

XGE0/0/1 Common Forwarding Non-failed

XGE0/0/2 RPL Owner Discarding Non-failed

l If the ERPS status on a device is not Idle, check that the ERPS configuration is correct. Go to step 2.

l If the ERPS status on all devices is Idle, go to step 4.
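Because Step 1 must be repeated on every device, the check can be scripted once the display erps ring verbose output of each device has been saved as text. The following Python sketch assumes the output layout shown above; the function names, the device names, and the "Protection" state used for the second device are illustrative assumptions.

```python
import re


def erps_summary(output):
    """Extract the ring state and the RPL owner ports from one device's output."""
    state = re.search(r"Ring State\s*:\s*(\S+)", output)
    owners = re.findall(r"^(\S+)\s+RPL Owner\b", output, flags=re.MULTILINE)
    return {"ring_state": state.group(1) if state else None, "rpl_owner_ports": owners}


def ring_problems(outputs):
    """Check every device for the Idle state and the ring for a single RPL owner port."""
    problems = []
    owners = []
    for device, text in outputs.items():
        summary = erps_summary(text)
        if summary["ring_state"] != "Idle":
            problems.append(f"{device}: ring state is {summary['ring_state']}, not Idle")
        owners += [(device, port) for port in summary["rpl_owner_ports"]]
    if len(owners) != 1:
        problems.append(f"expected one RPL owner port on the ring, found {len(owners)}")
    return problems


sample = """\
Ring State : Idle
RAPS_MEL : 7
Port Port Role Port Status Signal Status
XGE0/0/1 Common Forwarding Non-failed
XGE0/0/2 RPL Owner Discarding Non-failed
"""
outputs = {
    "SwitchA": sample,
    # Hypothetical second device: no RPL owner port and a non-Idle state.
    "SwitchB": sample.replace("RPL Owner", "Common").replace("Idle", "Protection"),
}
print(ring_problems(outputs))
```

Any reported problem corresponds to the first branch of Step 1; an empty list corresponds to the second branch.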

Step 2 Check whether the interface that fails to forward traffic is in Down state.

Run the display interface command in any view to check the status of the interface.

<Quidway> display interface xgigabitethernet0/0/1

XGigabitEthernet0/0/1 current state : DOWN

Line protocol current state : DOWN

Description:HUAWEI, Quidway Series, XGigabitEthernet0/0/1 Interface


IP Sending Frames' Format is PKTFMT_ETHNT_2, Hardware address is 000b-0918-8bc1



Duplex: HALF, Negotiation: ENABLE

Mdi : AUTO



Input peak rate 0 bits/sec, Record time: -

Output peak rate 0 bits/sec, Record time: -




CRC : 0, Giants : 0

Jabbers : 0, Fragments : 0

Runts : 0, DropEvents : 0

Alignments : 0, Symbols : 0

Ignoreds : 0, Frames : 0





Collisions : 0, Deferreds : 0

Late Collisions: 0, ExcessiveCollisions: 0

Buffers Purged : 0






l If the interface is in Down state, run the display this command in the interface view to check whether the interface has been shut down.

– If the command output displays shutdown, run the undo shutdown command in the interface view.

– If shutdown is not displayed, go to step 3.

l If the interface is in Up state, go to step 4.
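The branching in Step 2 can be sketched in Python as follows. The sketch assumes that the display interface output and the display this output have been saved as text; the first "current state" line is taken as the physical state, as in the sample output above. The function name and the sample display this text are illustrative assumptions.

```python
import re


def interface_next_step(display_interface, display_this):
    """Return the next action for Step 2 based on the two command outputs."""
    # The first "current state" line is the physical state of the interface.
    match = re.search(r"current state\s*:\s*(.+)", display_interface)
    if match is None:
        return "state not found in the output"
    if match.group(1).strip().upper() == "UP":
        return "go to step 4"
    # The interface is Down: look for a shutdown line in the interface configuration.
    if any(line.strip() == "shutdown" for line in display_this.splitlines()):
        return "run undo shutdown in the interface view"
    return "go to step 3"


sample_interface = (
    "XGigabitEthernet0/0/1 current state : DOWN\n"
    "Line protocol current state : DOWN\n"
)
# Hypothetical interface configuration containing a shutdown line.
sample_this = "#\ninterface XGigabitEthernet0/0/1\n shutdown\n stp disable\n erps ring 1\n#\n"
print(interface_next_step(sample_interface, sample_this))
```

The check compares whole lines with "shutdown" so that a line such as undo shutdown is not mistaken for it.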

Step 3 Check that the physical interface is working properly.




If the physical interface is faulty, rectify the fault according to Ethernet Interface Troubleshooting. If the physical interface is working properly, go to step 4.

Step 4 Check that the interface allows data packets of the specified VLAN to pass.

Run the display this command in the interface view to check the VLANs allowed by the interface.





stp disable

erps ring 1

#

If the interface does not allow packets of the specified VLAN to pass, configure it to allow packets of this VLAN to pass.

If the interface allows packets of the specified VLAN to pass, go to step 5.
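The VLAN check in Step 4 can be sketched in Python as follows. The output excerpt above does not show the line that lists the allowed VLANs, so the sketch assumes a trunk interface whose configuration contains a port trunk allow-pass vlan line with single VLAN IDs, ranges written as "x to y", or the keyword all. The sample text and the function name are assumptions, and other link types are not handled.

```python
def allowed_vlans(display_this):
    """Return the set of VLAN IDs allowed by port trunk allow-pass vlan lines."""
    vlans = set()
    for line in display_this.splitlines():
        words = line.split()
        if words[:4] != ["port", "trunk", "allow-pass", "vlan"]:
            continue
        tokens = words[4:]
        i = 0
        while i < len(tokens):
            if tokens[i] == "all":
                return set(range(1, 4095))  # VLAN 1 to VLAN 4094
            start = int(tokens[i])
            if i + 2 < len(tokens) and tokens[i + 1] == "to":
                end = int(tokens[i + 2])  # range written as "x to y"
                i += 3
            else:
                end = start  # single VLAN ID
                i += 1
            vlans.update(range(start, end + 1))
    return vlans


# Hypothetical interface configuration; VLAN 1025 is the control VLAN in the sample output.
sample_this = """\
#
interface XGigabitEthernet0/0/1
 port link-type trunk
 port trunk allow-pass vlan 2 to 10 1025
 stp disable
 erps ring 1
#
"""
vlans = allowed_vlans(sample_this)
print(1025 in vlans, 11 in vlans)
```

If the VLAN that carries the failed traffic is not in the returned set, configure the interface to allow packets of that VLAN to pass.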

Step 5 Collect the following information and contact Huawei technical support personnel:

l Results of the preceding troubleshooting procedure

l Configuration file, logs, and alarms of the switch

----End


Relevant Alarms

None

Relevant Logs

None


