Introduction
This document describes three discovery failures that can occur after a B460 M4 motherboard is replaced, along with the solution for each.
Prerequisites
Requirements
This document assumes knowledge of UCS B460 M4 and UCS Manager (UCSM).
Components Used
- B460 M4 Blade Server
- UCS Manager
- Firmware 2.2(3b)
Background
The B460 M4 server consists of two Scalable M4 Blade Modules (B260 M4) and a Scalability Connector that cross-connects the two Blade Modules and allows them to function as a single server. The Blade Module on the bottom is the “Master” and the Blade Module on the top is the “Slave.”
Discovery Problems
Discovery Fails at 3% - Firmware Mismatch
In this failure scenario, the discovery fails at 3% with the Remote Invocation Description "Aggregate blade CIMC firmware version mismatch. Activate same firmware version on both CIMC" as shown in the figure below. This can occur when the replacement motherboard or blade module has a different firmware version than the pre-existing B460 M4 server.
Note: The example below shows a mismatch in CIMC firmware, but the same process applies to mismatched CIMC, BIOS, and Board Controller firmware.
The Overall Status will be Discovery Failed as shown in the figure below.
The mismatched firmware can be checked from the command line (CLI) as shown below. In the output below, the first CIMC is the master and the second is the slave.
UCS-A# show system firmware expand detail
Server 7:
CIMC:
Running-Vers: 2.2(3b)
Package-Vers:
Update-Status: Ready
Activate-Status:
Startup-Vers:
Backup-Vers: 2.2(3a)
Bootloader-Vers: 2.2(3b).33
CIMC:
Running-Vers: 2.2(3a)
Package-Vers:
Update-Status: Ready
Activate-Status:
Startup-Vers:
Backup-Vers: 2.2(3b)
Bootloader-Vers: 2.2(3a).33
CIMC:
Running-Vers: 2.2(3b)
Package-Vers: 2.2(3b)B
Update-Status: Ready
Activate-Status: Ready
Startup-Vers: 2.2(3b)
Backup-Vers: 2.2(3b)
Bootloader-Vers: 2.2(3b).33
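The same check can be scripted once the CLI output has been captured as text. The following is a minimal Python sketch, assuming the output keeps the layout shown above; the names `running_versions` and `has_mismatch` are hypothetical helpers for illustration and are not part of UCS Manager.

```python
import re

# Abridged copy of the output above (only a few fields per CIMC entry).
SAMPLE = """\
Server 7:
CIMC:
Running-Vers: 2.2(3b)
Package-Vers:
Backup-Vers: 2.2(3a)
CIMC:
Running-Vers: 2.2(3a)
Package-Vers:
Backup-Vers: 2.2(3b)
CIMC:
Running-Vers: 2.2(3b)
Package-Vers: 2.2(3b)B
Backup-Vers: 2.2(3b)
"""

# Assumption: field lines look like "Running-Vers: value" and every field
# name is hyphenated, while header lines ("CIMC:", "Server 7:") are not.
FIELD = re.compile(r"^([A-Za-z]+-[A-Za-z]+):\s*(.*)$")


def running_versions(cli_output, component="CIMC"):
    """Return the Running-Vers value of each `component` entry, in order."""
    versions = []
    in_component = False
    for raw in cli_output.splitlines():
        line = raw.strip()
        field = FIELD.match(line)
        if field:
            if in_component and field.group(1) == "Running-Vers":
                versions.append(field.group(2).strip())
        elif line.endswith(":"):
            # A header line starts a new entry; track whether it is ours.
            in_component = line[:-1].strip() == component
    return versions


def has_mismatch(versions):
    """True when the entries do not all run the same version."""
    return len(set(versions)) > 1


if __name__ == "__main__":
    found = running_versions(SAMPLE)
    print(found)
    print("Mismatch detected" if has_mismatch(found) else "Versions match")
```

A script like this only reports the mismatch; the firmware still has to be updated and activated through UCS Manager.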
Solution
In order to recover from this, follow the steps below.
1) Navigate to Equipment > Chassis > Chassis # > Servers > Server # > Installed Firmware tab.
2) Right-click on the component that needs to be updated (e.g. BIOS, CIMC Controller) and select Update Firmware. In this example, the CIMC Controller will be updated to 2.2(3b).
3) Select the correct firmware, check the Force checkbox, and click Apply.
Tip: If it's not clear which version needs to be selected from the dropdown, the server administrator can navigate to Equipment > Firmware Management > Packages, expand ucs-k9-bundle-b-series.VERSION.B.bin and look for "ucs-EXM4." There will be three components: bios (BIOS), brdprog (Board Controller), and cimc (CIMC Controller).
Tip: The board controller firmware cannot be downgraded. If the replacement motherboard comes with a board controller firmware version that is not present in any of the blade series packages in the domain, the network administrator can download a blade series package that contains the needed board controller firmware version. To verify which blade series package contains the needed firmware, review the Release Bundle Contents for Cisco UCS Manager document.
4) Monitor the Installed Firmware tab and wait until the Update Status and Activate Status columns change to Ready and the Backup Version column changes to the correct firmware.
Tip: The server administrator can monitor the update status from Equipment > Chassis > Chassis # > Servers > Server # > Inventory tab > CIMC tab > Update Status
5) Right-click on this same component and select Activate Firmware. Again, select the correct firmware, check the Force checkbox, and click Apply.
6) The Activate Status column in the Installed Firmware tab will change state and eventually return to Ready.
7) The Overall Status in the General tab will change to Inaccessible while the server is rebooting. It should then change to Discovery and go through the discovery process.
Discovery Fails at 5% - Board Controller Firmware Mismatch
In this failure scenario, the discovery fails at 5% with the Remote Invocation Description "Aggregate blade board controller firmware version mismatch. Activate same firmware version on both board controller" as shown in the figure below. This can occur when the replacement motherboard or blade module has a different firmware version than the pre-existing B460 M4 server.
The mismatched firmware can be checked from the command line (CLI) as shown below. In the output below, the entries labeled Master and Slave show the board controller version running on each blade module.
srini-2gfi-96-b-A /chassis/server # show firmware board controller detail
Server 2/7:
Board Controller:
Running-Vers: 2.0 <<<<
Package-Vers: 2.2(7.156)B
Activate-Status: Ready
Board Controller: ( Master)
Running-Vers: 2.0 <<<<
Package-Vers:
Activate-Status:
Board Controller: ( Slave)
Running-Vers: 1.0 <<<<
Package-Vers:
Activate-Status:
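To compare the two blade modules without reading the output by eye, the labeled entries can be pulled out of the captured text. This is a minimal Python sketch, assuming the output keeps the "( Master)" and "( Slave)" labels shown above; `board_controller_versions` is a hypothetical helper name, not a UCS Manager command or API.

```python
import re

# Copy of the output above, including the "<<<<" emphasis markers.
SAMPLE = """\
Server 2/7:
Board Controller:
Running-Vers: 2.0 <<<<
Package-Vers: 2.2(7.156)B
Activate-Status: Ready
Board Controller: ( Master)
Running-Vers: 2.0 <<<<
Package-Vers:
Activate-Status:
Board Controller: ( Slave)
Running-Vers: 1.0 <<<<
Package-Vers:
Activate-Status:
"""

# Matches "Board Controller:" with an optional "( Label)" suffix.
HEADER = re.compile(r"^Board Controller:\s*(?:\(\s*(\w+)\s*\))?\s*$")


def board_controller_versions(cli_output):
    """Map each labeled module ('Master', 'Slave') to its Running-Vers."""
    versions = {}
    role = None
    for raw in cli_output.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            role = header.group(1)  # None for the unlabeled entry
        elif line.startswith("Running-Vers:") and role is not None:
            tokens = line.split(":", 1)[1].split()
            if tokens:
                # Keep only the version, dropping any trailing "<<<<".
                versions[role] = tokens[0]
            role = None
    return versions


if __name__ == "__main__":
    found = board_controller_versions(SAMPLE)
    print(found)
    if found.get("Master") != found.get("Slave"):
        print("Board controller versions differ between the blade modules")
```

The unlabeled first entry is skipped on purpose, since only the per-module entries are needed to see which blade module is behind.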
Solution
In order to recover from this, follow the steps below.
1) In the Navigation pane, click the Equipment tab.
2) On the Equipment tab, click the Equipment node.
3) In the Work pane, click the Firmware Management tab.
4) On the Installed Firmware tab, click Activate Firmware. Cisco UCS Manager GUI opens the Activate Firmware dialog box and verifies the firmware versions for all endpoints in the Cisco UCS domain. This step may take a few minutes, depending on the number of chassis and servers.
5) From the Filter drop-down list on the menu bar of the Activate Firmware dialog box, select Board Controller. Cisco UCS Manager GUI displays all servers that have board controllers in the Activate Firmware dialog box.
6) For the board controller you want to update, select the highest version from the Startup Version drop-down list. Note: Downgrades are not possible; always select the highest version to activate.
7) Click OK.
8) (Optional) You can also use the Force Board Controller Activation option to update the firmware version when you upgrade CPUs with different architectures, for example, when you upgrade from Sandy Bridge to Ivy Bridge CPUs.
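When several versions are offered in step 6, "highest" means highest numerically, which is not the same as alphabetical order. The sketch below illustrates the comparison in Python, assuming board controller versions are plain dotted numbers like the Running-Vers values shown earlier (2.0, 1.0); `version_key` and `highest_version` are hypothetical names, and bundle-style strings such as 2.2(3b) would need a different parser.

```python
def version_key(version):
    """Turn '2.0' into (2, 0) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def highest_version(candidates):
    """Return the numerically highest version string from `candidates`."""
    return max(candidates, key=version_key)


if __name__ == "__main__":
    # A plain string comparison would rank "2.0" above "10.0";
    # the numeric key avoids that mistake.
    print(highest_version(["1.0", "2.0", "10.0"]))
```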
Discovery Fails at 7% - CPU Mismatch
In this failure scenario, the discovery fails at 7% with the Remote Invocation Description "Pre-boot Hardware config failure - Look at POST/diagnostic results" as shown in the figure below.
The Overall Status in the General tab will be Compute Failed.
The POST Results can be verified by clicking the View Post Results under Actions in the General tab. The figure below shows that the problem is due to a CPU Mismatch.
Solution
If the hardware matches between the two blade modules, this could be caused by cached information on the server. An enhancement request (CSCuv27099) exists to clear the cached information from UCS Manager (UCSM). The server administrator can also contact the Cisco Technical Assistance Center (TAC) for a workaround.