Introduction
This document describes the Method of Procedure (MOP) steps necessary to replace a Fabric Switch Card (FSC) on a Cisco Aggregation Services Router (ASR) 5500 chassis.
Prerequisites
Requirements
Before you proceed with the steps outlined in this MOP, check the current RAID status and collect the Show Support Detail (SSD) output.
Components Used
The information in this document relates exclusively to an FSC as a component of the ASR5500 chassis.
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, make sure that you understand the potential impact of any command.
Background Information
The steps outlined in this document involve collection of the Show Support Detail (SSD), which captures chassis information; verification of the HD Redundant Array of Independent Disks (RAID) status; removal of the card's Hard Drive (HD) from the chassis HD RAID; and installation of the new FSC in the appropriate slot. A chassis operational health check is recommended after successful FSC replacement.
The ASR5500 treats each FSC as one disk subsystem in a RAID 5 configuration, in which the disk subsystem operates in an N+1 mode and can tolerate one FSC failure. In a failure scenario, the capacity remains the same in the degraded state. However, no disk redundancy is available until the failed FSC is replaced and the RAID is restored. In a double-fault scenario, where two FSCs fail before the RAID is restored, the RAID enters a failed state and all data is lost.
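The N+1 behavior described above can be illustrated with a small model. This is only a sketch of generic RAID 5 arithmetic, not of the StarOS implementation: the function and class names are hypothetical, and the four-card, 400 GB layout is an assumption inferred from the nominal sizes in the sample output later in this document (real byte counts differ slightly because of RAID metadata).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidStatus:
    state: str         # "clean", "degraded", or "failed"
    usable_bytes: int  # capacity visible to the system; 0 once the array has failed
    redundant: bool    # True only while every member is in sync


def raid5_status(member_count: int, member_bytes: int, failed_members: int) -> RaidStatus:
    """Model an N+1 (RAID 5) array: one member's worth of space holds parity."""
    if member_count < 3:
        raise ValueError("RAID 5 needs at least three members")
    if not 0 <= failed_members <= member_count:
        raise ValueError("failed_members out of range")

    usable = (member_count - 1) * member_bytes
    if failed_members == 0:
        return RaidStatus("clean", usable, True)
    if failed_members == 1:
        # Capacity is unchanged, but a second failure would now lose data.
        return RaidStatus("degraded", usable, False)
    return RaidStatus("failed", 0, False)


# Hypothetical chassis: four FSCs of 400 GB each (nominal figures, an assumption).
for failed in (0, 1, 2):
    print(failed, raid5_status(4, 400 * 10**9, failed))
```

With one failed member the usable capacity stays at three members' worth, but the redundancy flag drops, which is why the failed FSC should be replaced promptly.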
Replace the FSC
Maintenance Window
This procedure should not impact traffic; however, as a best practice, it is highly recommended that these activities be performed during a maintenance window. A maintenance window of at least one hour is recommended in order to perform and verify the activities outlined in this document.
Procedure
This is a step-by-step procedure to replace an FSC on an ASR5500 chassis.
- Collect SSD: This provides a snapshot baseline for subsequent problem analysis, if applicable, after FSC replacement.
- Run this command in order to determine the status and slot number of the HD RAID hosted on the FSC:
show hd raid verbose
This example shows the output on a chassis where the FSC in slot 17 is to be replaced.
[local]ASR5500 ICSR> show hd raid verbose
Monday October 15 16:11:56 UTC 2012
HD RAID:
State : Available (clean)
Degraded : No
UUID : 12345678:b91db53jd:e5bc12ca:ababab
Size : 1.2TB (1200000073728 bytes)
Action : Idle
<snip> additional outputs suppressed
Card 17
State : In-sync card
Created : Tue Jul 17 06:57:41 2012
Updated : Mon Oct 15 16:11:32 2012
Events : 585
Description : FSC17 SAD1111111X
Size : 400GB (400096755712 bytes)
Disk hd17a
State : In-sync component
Created : Tue Jul 17 06:57:37 2012
Updated : Tue Jul 17 06:57:37 2012
Events : 0
Model : STEC-Z16IZF2D-200UCT
Serial Number : xxxx
Size : 200GB (200049647616 bytes)
Disk hd17b
State : In-sync component
Created : Tue Jul 17 06:57:37 2012
Updated : Tue Jul 17 06:57:37 2012
Events : 0
Model : STEC-Z16IZF2D-200UCT
Serial Number : xxx
- Remove the current card from the RAID with this CLI command. This example removes the FSC in slot 17 from the RAID.
ASR5500# hd raid remove hd17
Are you sure? [Yes|No]: yes
- Physically remove the FSC from the ASR5500 chassis.
- Install the new FSC in the ASR5500 chassis.
- Check the status of the new card with this command. Determine if the card is usable and has passed diagnostics.
For example, display information for FSC in slot 17.
[local]ASR5500 ICSR> show card diag 17
Tuesday October 16 16:12:59 UTC 2012
Card 17: Status
IDEEPROM Magic Number : Good
Card Diagnostics : Pass : None
Last Failure : None
Card Usable : Yes
Current Environment:
Temp: LM87 : 43.00 C
Temp: Lower : 42.00 C (limit 85.00 C)
Temp: Upper : 44.00 C (limit 85.00 C)
Temp: FE600-0 : 53.00 C (limit 100.00 C)
Temp: FE600-1 : 42.00 C (limit 100.00 C)
Temp: MAX6696 : 36.00 C (limit 85.00 C)
Temp: F600 #1 : 37.57 C
Temp: Drive #1 : 55.00 C (limit 75.00 C)
Temp: Drive #2 : 54.00 C (limit 75.00 C)
Voltage: 2.5V : 2.496 V (min 2.380 V, max 2.630 V)
Voltage: 3.3V STANDBY : 3.341 V (min 2.970 V, max 3.630 V)
Voltage: 5.0V : 5.044 V (min 4.750 V, max 5.250 V)
Voltage: 12V : 12.062 V
Voltage: 1.8V : 1.818 V (min 1.700 V, max 1.900 V)
Voltage: 1.0V FE600-0 : 1.048 V
Voltage: 1.0V FE600-1 : 1.038 V
Voltage: 48V-A : 50.500 V
Voltage: 48V-B : 52.100 V
Current: 48V-A : 0.76 A
Current: 48V-B : 1.00 A
Airflow: F600 #1 : 326 FPM
[local]ASR5500 ICSR>
If the new card does not come up, contact Cisco for additional support.
- Insert the new FSC card in the RAID with this CLI.
For example, insert FSC in slot 17 as seen here:
ASR5500# hd raid overwrite hd17
Are you sure? [Yes|No]: yes
[local]ASR5500 ICSR>
- Verify that the RAID is no longer degraded. The rebuild might take approximately one hour to complete after the overwrite command in the previous step is issued.
For example, display the RAID status for the FSC in slot 17.
show hd raid verbose
[local]ASR5500 ICSR> show hd raid verbose
Monday October 15 15:20:52 UTC 2012
HD RAID:
State : Available (clean) <<< available
Degraded : No <<<< not degraded
UUID : 12345678:b91db53jd:e5bc12ca:ababab
Size : 1.2TB (1200000073728 bytes)
Action : Idle
<snip> additional outputs suppressed
Card 17
State : In-sync card <<<<<<<<in-sync card
Created : Tue Jul 17 06:57:41 2012
Updated : Tue Oct 16 16:20:33 2012
Events : 585
Description : FSC17 SAD1111111X
Size : 400GB (400096755712 bytes)
Disk hd17a
State : In-sync component <<<<<<<<
Created : Tue Jul 17 06:57:37 2012
Updated : Tue Jul 17 06:57:37 2012
Events : 0
Model : STEC-Z16IZF2D-200UCT
Serial Number : STM000147A1E
Size : 200GB (200049647616 bytes)
Disk hd17b
State : In-sync component <<<<<<<<<
Created : Tue Jul 17 06:57:37 2012
Updated : Tue Jul 17 06:57:37 2012
Events : 0
Model : STEC-Z16IZF2D-200UCT
Serial Number : 1234
Size : 200GB (200049647616 bytes)
[local]ASR5500 ICSR>
- If the output still shows the RAID is degraded after one hour and 30 minutes, contact Cisco for additional support.
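The wait-or-escalate decision in the last two steps can be sketched as a small check over captured `show hd raid verbose` text. This is an illustrative sketch only: the helper names are hypothetical, the parser looks only at the RAID-level `State` and `Degraded` fields seen in the sample output, and the value `Yes` for a degraded array is an assumption because the samples in this document show only `No`.

```python
import re

# "one hour and 30 minutes" from the procedure, after which Cisco should be contacted.
REBUILD_ESCALATE_MINUTES = 90


def parse_raid_summary(output: str) -> dict:
    """Pull the first State and Degraded values out of captured CLI text."""
    summary = {}
    state = re.search(r"State\s*:\s*(\w+(?:\s*\([^)]*\))?)", output)
    # Assumption: a degraded array reports "Yes"; the source only shows "No".
    degraded = re.search(r"Degraded\s*:\s*(Yes|No)", output)
    if state:
        summary["state"] = state.group(1)
    if degraded:
        summary["degraded"] = degraded.group(1) == "Yes"
    return summary


def next_action(output: str, minutes_since_overwrite: float) -> str:
    """Return "done", "wait", "escalate", or "unparsed" for one captured output."""
    summary = parse_raid_summary(output)
    if "degraded" not in summary:
        return "unparsed"
    if not summary["degraded"]:
        return "done"
    if minutes_since_overwrite >= REBUILD_ESCALATE_MINUTES:
        return "escalate"
    return "wait"


# Text shaped like the RAID-level lines of the sample output in this document.
SAMPLE = "HD RAID:\nState : Available (clean)\nDegraded : No\n"
print(parse_raid_summary(SAMPLE), next_action(SAMPLE, 20))
```

How the text is captured (console log, SSH session, copy and paste) is left out on purpose. The sketch only encodes the decision rule stated in the procedure.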
Health Check
In context local, issue these commands:
show clock
show version
show system uptime
show boot
show context
show cpu table
show port utilization table
show session counters historical all
show subscribers data-rate high
show subscriber summary ggsn-service GGSN2
show subscriber summary ggsn-service GGSN1
show ntp status
show ntp associations
## The commands above are for reference
[local] ASR5X00# show card table all |grep unknown
Should display no output
[local] ASR5X00# show card table | grep offline
Should display no output
[local] ASR5X00# show resources |grep Status
Should display "Within acceptable limits"
[local] ASR5X00# show task resources |grep over
Should display no output
[local] ASR5X00# show alarm outstanding
Monitor for any issues
[local] ASR5X00# show pgw-service all | grep "Status"
Should display STARTED.
[local] ASR5X00# show egtp-service all | grep "Status"
Should display STARTED.
[local] ASR5X00# show crash list
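The expected results listed above can be written down as data and checked against captured output. The following is a minimal sketch under stated assumptions: the function name and the dictionary layout are hypothetical, the command strings are copied from the list above with the spacing around `|` normalized, and the sample output strings are made up for illustration. Commands without a stated pass criterion (`show alarm outstanding`, `show crash list`) are left for manual review.

```python
# Expectations copied from the health check list above.
# None means "no output expected"; a string means "output should contain it".
EXPECTATIONS = {
    "show card table all | grep unknown": None,
    "show card table | grep offline": None,
    "show resources | grep Status": "Within acceptable limits",
    "show task resources | grep over": None,
    'show pgw-service all | grep "Status"': "STARTED",
    'show egtp-service all | grep "Status"': "STARTED",
}


def failed_checks(outputs: dict) -> list:
    """Return the commands whose captured output does not match expectations.

    `outputs` maps a command string to the text captured from it.
    """
    failures = []
    for command, expected in EXPECTATIONS.items():
        text = outputs.get(command)
        if text is None:
            failures.append(command)      # output was never collected
        elif expected is None:
            if text.strip():
                failures.append(command)  # expected silence, got lines
        elif expected not in text:
            failures.append(command)
    return failures


# Hypothetical captured text: silent where silence is expected, matching elsewhere.
captured = {
    cmd: ("" if exp is None else "Status : " + exp)
    for cmd, exp in EXPECTATIONS.items()
}
captured["show card table | grep offline"] = "17: FSC offline"  # made-up failure line
print(failed_checks(captured))
```

Keeping the expectations in one table makes it easy to compare the pre-replacement and post-replacement runs with the same criteria.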
Related Information