Firmware Upgrade Mechanism and Common Issues
Updated: 2025/12/03

Firmware Upgrade Overview

Firmware upgrade involves two operations: starting the upgrade and querying the upgrade progress. An upgrade can be started in two scenarios: local upgrade and remote upgrade. The actual upgrade actions are performed by the individual firmware submodules, while the firmware management module drives the entire firmware upgrade process. The firmware management module therefore interacts with each of the firmware submodules.

Firmware Upgrade Implementation

The firmware upgrade process consists of four stages: initialize, prepare, process, and finish. Common processing actions are inserted between the stages. Each firmware submodule implements three stage-specific actions: prepare, process, and finish.

prepare

After receiving a prepare message from the firmware management component, the submodule confirms that the message is for its own upgrade, determines whether the upgrade is allowed, and returns the prepare result.
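
The prepare handling described above can be sketched as follows, in Python for illustration only. The message layout (a dict with a `firmware_type` key), the function name `handle_prepare`, and the return values are assumptions made for this sketch; they are not the actual interface between the firmware management component and the submodules.

```python
def handle_prepare(message, own_firmware_type, upgrade_allowed):
    """Hypothetical prepare handler for one firmware submodule."""
    # 1. Confirm that the prepare message targets this submodule's firmware.
    if message.get("firmware_type") != own_firmware_type:
        return None  # assumption: messages for other submodules are ignored
    # 2. Decide whether the upgrade is allowed right now.
    # 3. Return the prepare result to the firmware management component.
    return "prepare ok" if upgrade_allowed() else "prepare failed"
```

Here `upgrade_allowed` stands in for whatever checks a real submodule performs before accepting an upgrade.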

process

  • Preprocesses the upgrade package data. The upgrade data, which is encrypted with the AES-128 CBC algorithm, is decrypted and written to a .bin file.
  • Writes the upgrade data to the flash memory. The .bin file is packaged and written to the flash memory and finally output to the corresponding partition.
  • Sets the M3 flag. The BMC upgrade flag is set.

finish

  • Runs the first boot script if it exists.
  • Registers the activation mode for the firmware management component. Activation actions are managed centrally by the firmware management component.

The firmware upgrade workflow is as follows:

The upgrade succeeds only after all three submodule stages (prepare, process, and finish) pass. If the firmware submodule returns an "upgrade failed" result for any stage, the upgrade is considered a failure. Additionally, if the firmware submodule does not return a stage result to the firmware management module within the specified time, the upgrade fails with a timeout.
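
The pass/fail/timeout rule can be illustrated with a small Python sketch. It assumes a hypothetical `send_stage` callback that asks the submodule to run a stage, and a reply queue on which the submodule posts `True` or `False`. The timeout values follow the symptoms listed under Upgrade Timeout Failure in this document (one minute for prepare and finish, one hour for process); everything else is illustrative rather than the real implementation.

```python
import queue

STAGES = ("prepare", "process", "finish")
# Seconds to wait for each stage result (assumed from the timeout symptoms).
STAGE_TIMEOUT_S = {"prepare": 60, "process": 3600, "finish": 60}

def run_upgrade(send_stage, replies, timeouts=STAGE_TIMEOUT_S):
    """Drive the three submodule stages; return (succeeded, detail)."""
    for stage in STAGES:
        send_stage(stage)  # hypothetical: notify the firmware submodule
        try:
            ok = replies.get(timeout=timeouts[stage])
        except queue.Empty:
            # No stage result arrived in time: timeout failure.
            return False, f"{stage}: timeout"
        if not ok:
            # The submodule reported "upgrade failed" for this stage.
            return False, f"{stage}: upgrade failed"
    return True, "upgrade succeeded"
```

A failure or timeout in any stage stops the loop immediately, so later stages are never started.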

Firmware Activation

When a firmware submodule detects that activation conditions are met, it requests the firmware management module to determine whether to activate the firmware. The basic processing flow is as follows:

Parallel Upgrade

Parallel Upgrade Flow

In addition to the common method of upgrading firmware individually, the iBMC supports a parallel upgrade method. The specific upgrade flow is as follows:

Mutual Exclusion Handling

Mutual Exclusion Between Parallel and Serial Upgrades

Upgrade processes (such as firmware activation) must run serially to avoid concurrency conflicts. If the activation interface is called while an activation process is in progress, an error is returned.
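
One way to picture this behavior is a non-blocking lock around the activation call. The sketch below is an assumption about how such a guard could look, not the actual code; the class name `ActivationGuard` and the returned strings are hypothetical.

```python
import threading

class ActivationGuard:
    """Hypothetical guard that allows only one activation at a time."""

    def __init__(self):
        self._lock = threading.Lock()

    def activate(self, do_activate):
        # Non-blocking acquire: a second caller gets an error instead of waiting.
        if not self._lock.acquire(blocking=False):
            return "error: activation already in progress"
        try:
            do_activate()
            return "ok"
        finally:
            self._lock.release()
```

Returning an error instead of blocking matches the description above: the caller learns immediately that an activation is already running.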

Firmware Mutual Exclusion Rules

  • Firmware with the same ComponentID and ComponentIDEx does not support parallel upgrades.
  • Firmware with different ComponentIDEx values may be upgraded in parallel. If a service component does not support parallel upgrade, it must return an error code during the prepare stage.
  • Parallel upgrades are prohibited between hardware objects (such as management modules and interface modules). The BMC must be upgraded first and independently.
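
The first two rules and the BMC rule can be expressed as a simple predicate. This is a sketch under stated assumptions: the `Firmware` record and its `is_bmc` flag are hypothetical, and the hardware-object rule is left out because this document does not say how hardware objects are identified.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Firmware:
    component_id: int
    component_id_ex: int
    is_bmc: bool = False  # hypothetical marker for the BMC firmware

def may_run_in_parallel(a, b):
    """Return True if the two upgrades are candidates for parallel execution."""
    # The BMC must be upgraded first and independently.
    if a.is_bmc or b.is_bmc:
        return False
    # Same ComponentID and ComponentIDEx: parallel upgrade is not supported.
    if (a.component_id, a.component_id_ex) == (b.component_id, b.component_id_ex):
        return False
    # Otherwise parallelism is possible; the service component may still
    # reject it with an error code during the prepare stage.
    return True
```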

Conflict Handling

In conflict scenarios (such as lock failure or a downstream component determining that parallel upgrade is unsupported), the task is moved to the end of the queue to wait. The number of retries is not recorded. When firmware with the same ComponentID and ComponentIDEx is being upgraded, new tasks are queued to wait for a retry.

Resource Limits

  • Maximum parallel tasks: The number of firmware units being upgraded simultaneously is configured through the component self-description record (CSR), with a default limit of 5. The default limit for the waiting queue is 10 tasks. An error is returned if new calls exceed this limit.
  • Storage queue management: Local upgrades require the firmware package to be cached first. Local tasks are prioritized to release resources.
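
The conflict handling and the task limits can be combined in a small scheduler sketch. The defaults of 5 parallel tasks and 10 waiting tasks come from the text above; the class name, method names, and the use of a (ComponentID, ComponentIDEx) tuple as the task key are assumptions for illustration. Package caching and local-task priority are not modeled.

```python
from collections import deque

class UpgradeScheduler:
    """Hypothetical scheduler for parallel firmware upgrade tasks."""

    def __init__(self, max_parallel=5, max_waiting=10):
        self.max_parallel = max_parallel
        self.max_waiting = max_waiting
        self.running = set()    # (ComponentID, ComponentIDEx) keys in progress
        self.waiting = deque()  # tasks waiting to start

    def submit(self, key):
        # New calls beyond the waiting-queue limit are rejected with an error.
        if len(self.waiting) >= self.max_waiting:
            return "error: waiting queue full"
        self.waiting.append(key)
        return "queued"

    def dispatch(self):
        """Start as many waiting tasks as the limits allow; return them."""
        started = []
        for _ in range(len(self.waiting)):
            if len(self.running) >= self.max_parallel:
                break
            key = self.waiting.popleft()
            if key in self.running:
                # Conflict: move the task to the end of the queue.
                # No retry counter is kept.
                self.waiting.append(key)
            else:
                self.running.add(key)
                started.append(key)
        return started

    def finish(self, key):
        self.running.discard(key)
```

Each call to `dispatch` examines every waiting task at most once, so a conflicting task cannot spin the loop; it simply waits for a later call.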

Common Interfaces and Methods for Parallel Firmware Upgrade

There are multiple firmware upgrade interfaces, including web page upgrade, CLI upgrade, Redfish interface upgrade, SNMP interface upgrade, and IPMI command upgrade. The most commonly used methods are web page upgrade and CLI upgrade. For other upgrade methods, refer to the community resources at the Documentation Center.

Web Page Upgrade

Operation path: iBMC Settings > Firmware Upgrade > Firmware Update

CLI Upgrade

After logging in through Xshell, run the command to perform the upgrade.

White-Box Package Upgrade

The product management component interfaces with the firmware management component to complete white-box package upgrades and activation. The specific steps and upgrade flow chart are as follows.

  • Pre-upgrade preparation: The update.cfg file is parsed to determine whether the upgrade package type is a white-box package.
  • Upgrade execution: The filelist.cfg file is parsed to import customization files into specified system directories.
  • Post-upgrade processing: The web_custom.xml file is parsed to complete attribute customization defined in the XML file.

Common Upgrade Issues

Upgrade Error: Invalid Upgrade Package/FirmwareType is nil

The error information in the log is as follows. The firmware type is located externally. This error occurs because the firmware type to be upgraded is not defined in the VPD repository. As a result, the self-discovery process fails to distribute the firmware object to the firmware management component, and the firmware cannot be recognized. To fix this, add the firmware object to platform.sr of the corresponding model in the VPD repository, as shown in the following figure. Ensure that ComponentID and ComponentIDEx match the data in the update.cfg file.

Upgrade Failure Due to Key Decryption Failure

The log indicates that decryption of the key required for the upgrade failed. A possible cause is that, after key_mgmt updates the key, the ksf key no longer matches the upgrade key, so the upgrade key cannot be decrypted. Firmware versions 1.0.30 and later include robustness enhancements that resolve this issue. If the problem occurs, use the following workaround:

  • Delete the environment key.

  • Restart the bmc_core service.

  • After the component service is started, perform the upgrade again.

Error "p12 parse fail" During White-Box Package Upgrade

White-box package production involves SSL certificates, which can be encrypted or unencrypted. This log entry corresponds to an encrypted certificate. Producing an encrypted certificate involves four steps: creating the original encrypted .pfx certificate, confirming the encryption password, removing the password, and setting the encryption password. This error occurs because the password removal step was not performed during certificate production, which leads to a mismatch between the system decryption key and the encryption key.

BIOS Upgrade Fails with or Without Configuration Retention

The log shows "get snapshot failed." Try running the command shown in the following figure to see if a response is returned. If an error is returned, the root cause is that the BIOS resource tree shown in the preceding figure failed to load, which prevented snapshot generation. It has been found that in a Tianchi module environment without a BCU, all resource trees in the BCU related to BIOS upgrade fail to be loaded. The solution is to migrate the BIOS-related resource tree functions in the BCU, preferably along with the product.

Failure to Upgrade V2 to V3

Run ipmcget -d v to obtain the V2 version information. Collect logs, or restart to trigger an upgrade, and then analyze the upgrade-related logs. The log indicates that this BMC version lacks a root certificate, which causes signature verification to fail. This BMC version can neither upgrade from V2 to V3 nor upgrade within V2, so the certificate issue must be resolved first. Solution: Use the image switch function to switch to the BMC version in the other partition, and then attempt the upgrade.

V2 Burning Package Fails to Upgrade to V3

The environment has an older V2 burning BMC. The error log for the V2 to V3 upgrade is as follows. The reason is that older V2 versions support only PKCS signatures and do not support PSS. You must first upgrade to an intermediate version (B version) that supports both PKCS and PSS verification. After that, you can upgrade the BMC to V3.

CPLD Upgraded Successfully While Powered On, but Did Not Take Effect After Power-Off

Symptom: The CPLD was upgraded while the system was powered on, but activation was not triggered after power-off. There is no AC in the environment, and the CPLD version remains unchanged.

Locating steps:

  • Check operation logs to determine the operation time.

  • Search app.log at the corresponding time point to view detailed upgrade logs.

  • Examine logs near the relevant events and observe whether the CPLD activation flow ran as expected.
  1. The CPLD is upgraded while powered on and registers the activation item with the firmware management component.
  2. The environment powers off.
  3. The firmware management module detects the power-off and triggers activation.

In step 1, the activation condition registered by the CPLD is PowerCycle, which indicates that CPLD activation customization was applied. With this customization, power-off and forced power-off do not activate the CPLD. To resolve the issue, remove this customization.

Upgrade Timeout Failure

Three symptoms represent timeout failures in the three stages:

  • Timeout failure in the prepare stage: Symptom: Upgrade progress stays at 5% for one minute and then fails. app.log contains the following entry. This indicates that the firmware submodule did not respond to the firmware management component within one minute during the prepare stage.
  • Timeout failure in the process stage: Symptom: Upgrade progress stays at 15% or higher for one hour and then fails. app.log contains timeout-related log entries.
  • Timeout failure in the finish stage: Symptom: Upgrade progress stays at 99% for one minute and then fails. app.log contains timeout-related log entries.
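
The three symptoms can be told apart from the progress value and the stall duration alone. The helper below is a hypothetical triage aid built only from the numbers listed above; it is not part of the firmware management component.

```python
def classify_timeout(progress_percent, stalled_seconds):
    """Guess which stage timed out from the observed symptom.

    Returns "prepare", "process", "finish", or None if no pattern matches.
    """
    # Progress stuck at 5% for one minute: prepare stage timeout.
    if progress_percent == 5 and stalled_seconds >= 60:
        return "prepare"
    # Progress stuck at 99% for one minute: finish stage timeout.
    if progress_percent == 99 and stalled_seconds >= 60:
        return "finish"
    # Progress stuck at 15% or higher for one hour: process stage timeout.
    if 15 <= progress_percent < 99 and stalled_seconds >= 3600:
        return "process"
    return None
```

The result is only a starting point; the timeout-related entries in app.log remain the authoritative evidence.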