openUBMC dynamically loads PCIe cards of various types in the self-discovery process.
To better understand this chapter, complete the following courses first:
- [√] Component Self-Description Record (CSR)
- [√] Board and Card Adaptation Guide
- [√] Hardware Self-Discovery
Understanding Riser Cards and PCIe Cards
Peripheral Component Interconnect Express (PCIe) is a high-speed serial computer bus standard. Compared with the conventional PCI bus, PCIe provides higher bandwidth and faster, more stable data transmission. In addition, PCIe supports hot-plug and multiplexing, allowing multi-device connection and thereby enhancing system scalability and flexibility. The standard is continuously updated, with each new generation delivering higher bandwidth and faster speeds than the previous one.
A PCIe card is a computer extension card that connects to the computer mainboard using the PCIe bus interface to provide additional functions and performance. PCIe cards can be used to increase the storage capacity, network functions, and graphics processing capabilities of computers. Common PCIe cards include graphics cards, network interface cards, sound cards, storage cards, and RAID cards.
In a server, a PCIe card is usually inserted into the PCIe card slot on a riser card. A riser card is an adapter card used to expand slots of a computer mainboard. It changes the slot direction on the mainboard from vertical to horizontal so that other internal components can be easily accommodated, especially in a small or space-constrained computer chassis.
PCIe Device Loading Process
The hardware loading process is introduced in Hardware Self-Discovery. As the link between hardware components, a Connector ensures that openUBMC can load and distribute objects level by level. For details about Connector, see Board and Card Adaptation Guide.
Riser Card Loading
In a server, a PCIe card is typically inserted into a riser card, so the riser card must be loaded correctly before the PCIe card can be loaded. The CSR of the riser card is therefore defined as follows:
{
"FormatVersion": "x.xx",
"DataVersion": "x.xx",
"ManagementTopology": {
"Anchor": {
"Buses": [
"Hisport_x"
]
},
"Hisport_5": {
"Chips": [
"Eeprom_IEU",
"Pca9545_IEU"
]
},
"Pca9545_IEU": {
"Buses": [
"I2cMux_9545Chan2",
"I2cMux_9545Chan3",
"I2cMux_9545Chan4"
]
},
"I2cMux_9545Chan2": {
"Connectors": [
"Connector_PCIe_1"
]
},
"I2cMux_9545Chan3": {
"Connectors": [
"Connector_PCIe_2",
"Connector_PCIe_3"
]
},
"I2cMux_9545Chan4": {
"Chips": [
"Pca9555_IEU",
"Chip_MCU"
]
}
},
"Objects": {
...
}
}
The preceding example shows the riser card's CSR definition, in which ManagementTopology is mandatory in the CSR syntax. According to the topology, the riser card uses a MUX component to multiplex links and supports multiple PCIe cards. Each Connector must also be defined in Objects. The following uses Connector_PCIe_1 as an example (the other connectors are configured similarly):
"Connector_PCIe_1": {
"Bom": "14140130", // Bom_ID of the downstream component
"Id": "xxxx", // ID of the downstream component
"AuxId": "xxxx", // AuxId of the downstream component
"Slot": 2, // Slot number of the downstream component
"Position": 2, // Position of the downstream component
"Presence": 1, // Presence information
"Buses": [
"I2cMux_9545Chan2"
],
"SystemId": 1,
"SilkText": "RiserCard${Slot}",
"IdentifyMode": 2, // Loading mode: loading mode of non-Tianchi components
"Container": "Component_RiserCard",
"Type": "PCIe" // Downstream component type
},
By configuring the PCIe device's Connector, you can load the device on the riser card. Special onboard NICs can be configured on the BCU in the same way as the preceding Connector. In the preceding configuration, Bom, Id, and AuxId are set, and Presence is set to 1. In this way, the PCIe device corresponding to Bom_Id_AuxId.sr can be loaded.
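To see where each Connector hangs in the riser card's ManagementTopology, the tree can be walked from Anchor downward. The following Python sketch is for illustration only and is not how openUBMC itself parses a CSR. It assumes that the placeholder bus Hisport_x under Anchor resolves to Hisport_5, and the helper name connector_paths is hypothetical.

```python
# Riser card ManagementTopology from the example above.
# Assumption: the Anchor bus placeholder "Hisport_x" resolves to "Hisport_5".
MANAGEMENT_TOPOLOGY = {
    "Anchor": {"Buses": ["Hisport_5"]},
    "Hisport_5": {"Chips": ["Eeprom_IEU", "Pca9545_IEU"]},
    "Pca9545_IEU": {
        "Buses": ["I2cMux_9545Chan2", "I2cMux_9545Chan3", "I2cMux_9545Chan4"]
    },
    "I2cMux_9545Chan2": {"Connectors": ["Connector_PCIe_1"]},
    "I2cMux_9545Chan3": {"Connectors": ["Connector_PCIe_2", "Connector_PCIe_3"]},
    "I2cMux_9545Chan4": {"Chips": ["Pca9555_IEU", "Chip_MCU"]},
}


def connector_paths(topology, node="Anchor", path=()):
    """Yield (connector, path) pairs, where path lists the buses and chips
    passed on the way from the Anchor to the connector."""
    entry = topology.get(node, {})  # leaf chips have no entry of their own
    for connector in entry.get("Connectors", []):
        yield connector, path
    for child in entry.get("Buses", []) + entry.get("Chips", []):
        yield from connector_paths(topology, child, path + (child,))


paths = dict(connector_paths(MANAGEMENT_TOPOLOGY))
for name, path in paths.items():
    print(name, "<-", " / ".join(path))
```

Each connector is reported together with the MUX channel it sits behind, which is the same bus that must appear in the Buses list of the corresponding Connector object.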
Dynamic Loading of PCIe Devices
In a server, a riser card can connect to multiple types of PCIe devices. Different types of riser cards need to be loaded on the same server for different functions. To address this challenge, openUBMC supports loading different PCIe devices by automatically obtaining BDF information.
By configuring the mapping relationship between PCIe slots and CPU resources, along with the PCIe information provided by the BIOS, corresponding types of PCIe devices can be dynamically loaded.
As shown in the following figure, the CPU on the BCU is connected to the UBCDD through the serializer/deserializer (SerDes), and the UBCDD is connected to the ports on the riser card. The PCIe position can be managed by defining the relationship between the three in openUBMC.
- PCIeSlot1/PCIeSlot2: the two PCIe slots on the IEU, corresponding to the two PCIeConnector on the riser
- UBCDD: a connector used to transmit PCIe high-speed signals
- SerDes: serializer/deserializer, which converts parallel data into serial data for transmission and converts received serial data back into parallel data
CPU Resource Ownership
To enable the CPU to manage PCIe resources, openUBMC proposes the following concepts for configuring CSR:
- UnitConfiguration: configures the relationship between the riser card, UBCDD, and SerDes. This is configured in the PSR, with the UID of the riser card as the key. Multiple UnitConfiguration configurations can be set in the same PSR to support multiple riser cards.
- BusinessConnector: business connector, which defines the uplink and downlink bandwidths of the corresponding PCIe slot on the riser card and the bound PCIeConnector. In addition, the UID information is defined to locate the corresponding UnitConfiguration in the PSR.
- PCIeAddrInfo: UnitConfiguration configuration corresponding to the PCIe, which is bound to BusinessConnector
- SerDes: hardware information of the SerDes configured in the BCU CSR, including the Device number of the RootBDF corresponding to each SerDes
The process of establishing PCIe resource management by CPU is called PCIe topology establishment.
PCIe Topology Establishment and RootBDF Calculation
The BDF (Bus, Device, Function) of a PCIe device indicates the position of the device on the PCIe bus. Bus indicates the PCIe bus number to which the device is connected, Device indicates the number of a device on the PCIe bus, and Function indicates the function number of a device.
A BDF address consists of three fields written as bus:device.function, for example, 05:02.1, indicating that the bus number is 5, the device number is 2, and the function number is 1. The BDF address is used to uniquely identify a PCIe device and is the basis for PCIe device communication and access.
In a PCIe topology, each PCIe device has a unique BDF position, while the RootBDF position is determined by the connected PCIe root bus. Therefore, BDF and RootBDF are two different concepts. That is, the RootBDF of a PCIe device may vary in different PCIe topologies, but the BDF position remains unchanged.
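The bus:device.function notation described above can be split into its three fields with a few lines of code. The following Python sketch is for illustration only; the function names parse_bdf and format_bdf are hypothetical. It reads the fields as hexadecimal, which is how tools such as lspci print them, and it does not handle an optional PCI domain prefix.

```python
import re

# bus: up to 2 hex digits, device: up to 2 hex digits, function: 0-7
BDF_PATTERN = re.compile(r"^([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})\.([0-7])$")


def parse_bdf(text):
    """Split 'bus:device.function' into three integers (fields are hex)."""
    match = BDF_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a BDF address: {text!r}")
    bus, device, function = (int(field, 16) for field in match.groups())
    if device > 0x1F:  # the device number is a 5-bit field
        raise ValueError(f"device number out of range: {text!r}")
    return bus, device, function


def format_bdf(bus, device, function):
    """Render three integers back into the bus:device.function notation."""
    return f"{bus:02x}:{device:02x}.{function:x}"


print(parse_bdf("05:02.1"))
print(format_bdf(0xAA, 0x2, 0))
```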
Mapping Between BCU Ports and IEU Ports
Before obtaining the BDF of a PCIe device, you need to determine the mapping between PCIe slots and CPU resources and calculate the RootBDF value. To determine which SerDes a PCIe slot is connected to, and therefore its RootBDF, UnitConfiguration describes the relationship between the BCU and IEU. The PSR contains multiple UnitConfiguration configurations, which together describe the cable connections for this product. Here is an example:
"UnitConfiguration_IEU3": {
"SlotType": "IEU",
"SlotNumber": 3,
"SlotSilkText": "IEUSlot3",
"Configurations": [
{
"UID": "xxxx",
"Index": 0,
"BCUIndex": 1,
"SrcPortName": ["B4c", "B4a"],
"TargetPortID": [17, 49],
"Slot": [7, 8],
"Device": [8, 12]
}
],
"Port1LinkInfo": ""
}
- UID: UID of the IEU card, which is used to match the corresponding PCIeAddrInfo
- Slot: globally unique and provided by hardware
- SrcPortName: Port B4c of the UBCDD on the BCU corresponds to slot 7 of the BCU, and port B4a corresponds to slot 8 of the BCU.
- TargetPortID: Upstream port ID of the riser card (in decimal format)
Configuring SerDes in the BCU CSR
Similarly, the definition of the corresponding SerDes needs to be configured in the BCU CSR, as shown in the following:
"BusinessConnector_CPU2UBCDD2": {
"Direction": "Downstream",
"BCUIndex": "${Slot}",
"Slot": 8,
"LinkWidth": "X16", //Bandwidth: X16
"MaxLinkRate": "PCIe 4.0",
"ConnectorType": "UBCDD",
"SilkText": "CPU2 UBCDD2",
"UpstreamResources": [
{"Name": "SerDes_1_8","ID": 8,"Offset": 0,"Width": 8},
{"Name": "SerDes_1_7","ID": 7,"Offset": 0,"Width": 4},
{"Name": "SerDes_1_10","ID": 10,"Offset": 0,"Width": 4}
],
"ActualResourceOrder": ["SerDes_1_10","SerDes_1_7","SerDes_1_8"],
"Ports": [
{"Name": "B4a","ID": 13,"Offset": 0,"Width": 8},
{"Name": "B4c","ID": 15,"Offset": 8,"Width": 8}
],
"Port1LinkInfo": "",
"Port2LinkInfo": ""
},
- LinkWidth: bandwidth of the PCIe card slot
- UpstreamResources: configures the bandwidth of each SerDes (a PCIe card slot corresponds to multiple SerDes).
- ActualResourceOrder and Ports determine the mapping between SerDes and UBCDD ports in the actual connection. In the preceding configuration, SerDes_1_8 with x8 bandwidth corresponds to B4c (B4d), and SerDes_1_7 and SerDes_1_10 with x4 bandwidth correspond to B4a (B4b).
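The SerDes-to-port mapping stated above can be reproduced by laying the SerDes out lane by lane. The following Python sketch is one reading of the configuration, not the pcie_device implementation: it assumes that the SerDes listed in ActualResourceOrder occupy consecutive lanes starting at lane 0, and that a port covers the lanes from its Offset to Offset + Width. The function name map_serdes_to_ports is hypothetical.

```python
UPSTREAM_RESOURCES = [
    {"Name": "SerDes_1_8", "ID": 8, "Offset": 0, "Width": 8},
    {"Name": "SerDes_1_7", "ID": 7, "Offset": 0, "Width": 4},
    {"Name": "SerDes_1_10", "ID": 10, "Offset": 0, "Width": 4},
]
ACTUAL_RESOURCE_ORDER = ["SerDes_1_10", "SerDes_1_7", "SerDes_1_8"]
PORTS = [
    {"Name": "B4a", "ID": 13, "Offset": 0, "Width": 8},
    {"Name": "B4c", "ID": 15, "Offset": 8, "Width": 8},
]


def map_serdes_to_ports(upstream_resources, actual_order, ports):
    """Return {port name: [SerDes names]} by comparing lane ranges.

    Assumption: SerDes are packed back to back in actual_order, from lane 0.
    """
    widths = {res["Name"]: res["Width"] for res in upstream_resources}
    mapping = {port["Name"]: [] for port in ports}
    start = 0
    for name in actual_order:
        end = start + widths[name]
        for port in ports:
            port_start = port["Offset"]
            port_end = port_start + port["Width"]
            if start < port_end and port_start < end:  # lane ranges overlap
                mapping[port["Name"]].append(name)
        start = end
    return mapping


mapping = map_serdes_to_ports(UPSTREAM_RESOURCES, ACTUAL_RESOURCE_ORDER, PORTS)
print(mapping)
```

Under these assumptions the two x4 SerDes land on B4a and the x8 SerDes lands on B4c, which matches the mapping described in the list above.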
Take SerDes_1_10 as an example. Its properties are defined on the BCU CSR:
"SerDes_1_10": {
"Name": "SerDes_1_10",
"ID": 10,
"SocketID": 1,
"LinkWidth": 4,
"WorkMode": 1,
"ModeConfigs": [
{
"Mode": 1,
"Device": [14,14,15,15],
"ControllerIndex": [1,1,1,1]
}
]
}
- Device: indicates that the device number associated with the SerDes is 14, 14, 15, or 15. It also serves as the portID for indexing RootBDF.
- SocketID: indicates the CPU to which the SerDes is connected.
Configuring SerDes in the IEU CSR
The ports connected to the BCU also need to be defined in the IEU CSR. The configuration here is the definition of BusinessConnector.
"BusinessConnector_1": {
"Name": "Up_1",
"Direction": "Upstream",
"Slot": 1,
"LinkWidth": "X16",
"MaxLinkRate": "PCIe 4.0",
"ConnectorType": "UBCDD",
"Ports": [
{"Name": "Down_1","ID": 49,"Offset": 0,"Width": 8},
{"Name": "Down_2","ID": 17,"Offset": 8,"Width": 8}
]
},
"BusinessConnector_2": {
"Name": "Down_1",
"Direction": "Downstream",
"Slot": 1,
"LinkWidth": "X8",
"MaxLinkRate": "PCIe 4.0",
"ConnectorType": "PCIe CEM",
"UpstreamResources": [
{"Name": "Up_1","ID": 1,"Offset": 0,"Width": 8 }
],
"RefMgmtConnector": "#/Connector_PCIE_1",
"RefPCIeAddrInfo": "#/PcieAddrInfo_1"
},
"PcieAddrInfo_1": {
"Location": "RiserCard${Slot}",
"ComponentType": 8,
"ContainerSlot": "${Slot}",
"ContainerUID": "xxx", // Corresponding to the UID in UnitConfiguration_IEU3
"ContainerUnitType": "IEU",
"GroupPosition": "PcieAddrInfo_1_${GroupPosition}"
},
Configure Ports in BusinessConnector_1 to define the downstream route. The defined ID corresponds to the TargetPortID in the matching UnitConfiguration, and thus corresponds to B4a. In PcieAddrInfo, set ContainerUID to the UID in UnitConfiguration.
The relationship (that is, the topology) between the BCU and IEU ports has been established. The next step is to calculate RootBDF in the pcie_device component repository.
RootBDF Calculation
In pcie_device, the portId and SocketID are used to calculate RootBDF.
1. pcie_biz_connectors_init function: Obtain the first ID of the IEU/SEU upstream port, that is, first_target_port_id in the construct_biz_topo function.
2. match_src_connector function: Find SrcPortName in the corresponding UnitConfiguration based on first_target_port_id, and obtain src_connector in the BCU CSR based on SrcPortName (determined by the is_src_biz_connector function).
3. Obtain SocketID and Device in the SerDes object based on src_connector and offset. Device corresponds to portId, and SocketID corresponds to cpuid.
4. After portId is obtained, RootBDF can be calculated using override_root_bdf + SocketID + portId.
In the preceding case, the values of SocketID and portId are 1 and 14, respectively. You can then query the corresponding RootBDF in the root_bdf table, and the result is [0xAA, 0x2, 0].
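The lookup chain in the steps above can be traced end to end with the example values from this chapter. The following Python sketch is a simplified illustration, not the pcie_device code. Its assumptions: the SerDes in ActualResourceOrder occupy consecutive lanes from lane 0, the lane offset inside the matched SerDes indexes its Device list, and the root_bdf table is keyed by (SocketID, portId). The table here holds only the single entry quoted above, and all function names are hypothetical.

```python
UNIT_CONFIGURATION = {"SrcPortName": ["B4c", "B4a"], "TargetPortID": [17, 49]}
IEU_UPSTREAM_PORTS = [
    {"Name": "Down_1", "ID": 49, "Offset": 0, "Width": 8},
    {"Name": "Down_2", "ID": 17, "Offset": 8, "Width": 8},
]
BCU_CONNECTOR = {
    "UpstreamResources": [
        {"Name": "SerDes_1_8", "Width": 8},
        {"Name": "SerDes_1_7", "Width": 4},
        {"Name": "SerDes_1_10", "Width": 4},
    ],
    "ActualResourceOrder": ["SerDes_1_10", "SerDes_1_7", "SerDes_1_8"],
    "Ports": [
        {"Name": "B4a", "ID": 13, "Offset": 0, "Width": 8},
        {"Name": "B4c", "ID": 15, "Offset": 8, "Width": 8},
    ],
}
# Only SerDes_1_10 is defined in this chapter, so only it is listed here.
SERDES = {"SerDes_1_10": {"SocketID": 1, "Device": [14, 14, 15, 15]}}
# Hypothetical stand-in for the root_bdf table: (SocketID, portId) -> RootBDF.
ROOT_BDF_TABLE = {(1, 14): (0xAA, 0x2, 0)}


def find_src_port_name(first_target_port_id):
    """Step 2: map the IEU upstream port ID to the BCU port name."""
    index = UNIT_CONFIGURATION["TargetPortID"].index(first_target_port_id)
    return UNIT_CONFIGURATION["SrcPortName"][index]


def find_serdes(src_port_name):
    """Step 3: find the SerDes behind the BCU port and the lane offset in it."""
    port = next(p for p in BCU_CONNECTOR["Ports"] if p["Name"] == src_port_name)
    widths = {r["Name"]: r["Width"] for r in BCU_CONNECTOR["UpstreamResources"]}
    start = 0
    for name in BCU_CONNECTOR["ActualResourceOrder"]:
        end = start + widths[name]
        if start <= port["Offset"] < end:
            return name, port["Offset"] - start
        start = end
    raise LookupError(f"no SerDes covers port {src_port_name}")


def root_bdf_for(first_target_port_id):
    """Steps 2 to 4: resolve SocketID and portId, then look up RootBDF."""
    serdes_name, lane = find_serdes(find_src_port_name(first_target_port_id))
    serdes = SERDES[serdes_name]
    port_id = serdes["Device"][lane]
    return ROOT_BDF_TABLE[(serdes["SocketID"], port_id)]


first_target_port_id = IEU_UPSTREAM_PORTS[0]["ID"]  # step 1
print([hex(v) for v in root_bdf_for(first_target_port_id)])
```

With the example data, port ID 49 resolves to B4a, then to SerDes_1_10 with SocketID 1 and portId 14, which is the case worked through in the text.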
PCIe Card Loading
PCIe cards may be hot-swapped in service scenarios. Therefore, a BDF-based solution is designed. That is, the BDF information of each PCIe slot is proactively reported during the BIOS startup. The BMC queries the 4-tuple information of the corresponding BDF from the PMU based on the BDF information, and then loads the corresponding PCIe device CSR. For details, see the following figure.
During initialization, the pcie_device component sends an IPMI command to the BIOS to obtain the 4-tuple (DeviceID, VendorID, SubDeviceID, SubVendorID) of the PCIe card in in-band mode. It then combines the 4-tuple into ID and AuxID to form the sr file name, that is, Bom_{DeviceID+VendorID}_{SubDeviceID+SubVendorID}.sr. Finally, the sr file is loaded.
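The file name pattern above can be illustrated with a small helper. The following Python sketch only shows the order in which the 4-tuple is combined. The exact rendering of each ID (here four lowercase hexadecimal digits) is an assumption that this chapter does not specify, the function name sr_file_name is hypothetical, and the ID values are made up for the example.

```python
def sr_file_name(bom, vendor_id, device_id, sub_vendor_id, sub_device_id):
    """Build Bom_{DeviceID+VendorID}_{SubDeviceID+SubVendorID}.sr.

    Assumption: each ID is rendered as four lowercase hexadecimal digits.
    """
    card_id = f"{device_id:04x}{vendor_id:04x}"          # Id part
    aux_id = f"{sub_device_id:04x}{sub_vendor_id:04x}"   # AuxId part
    return f"{bom}_{card_id}_{aux_id}.sr"


# Made-up IDs, used only to show the concatenation order.
name = sr_file_name("14140130", 0x1234, 0x5678, 0x9ABC, 0xDEF0)
print(name)
```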
Tianchi PCIe Card Loading
Unlike non-Tianchi PCIe cards, a Tianchi PCIe card loads its CSR from the EEPROM. For example:
"Connector_PCIE_SLOT2TianChi": {
"Id": "", // ID of the downstream component
"AuxId": "", // AuxId of the downstream component
"Bom": "14140130", // Bom_ID of the downstream component
"Slot": 2, // Slot number of the downstream component
"Position": 5, // Position of the downstream component
"Presence": "<=/Scanner_Slot2Presence.Value|>expr($1 == 1? 0 : 1)", // The presence signal of the Tianchi component is obtained by the Scanner.
"Buses": [
"I2cMux_Pca9545_PCA9545_2"
],
"SystemId": 1,
"SilkText": "RiserCard${Slot}",
"IdentifyMode": 3, // Loading mode: loading mode of Tianchi components
"Container": "Component_RiserCard",
"Type": "PCIe", // Downstream component type
"IdChipAddr": 160
}
Other configurations are the same as those of non-Tianchi PCIe devices.
PCIe CSR Configuration
The BCU and IEU port mapping is described in PCIe topology establishment. Make sure that the following configurations are correct before configuring the CSR of a new PCIe card:
- Whether UnitConfiguration_IEU is added to the PSR repository and whether the configuration items such as UID and SrcPortName are correct.
- Whether the mapping between SerDes and riser card ports is correctly configured in the BCU CSR.
- Whether Connector and BusinessConnector of the riser card are correctly configured.
After the preceding configuration is complete, you can configure the CSR for the PCIe card. The PCIe card objects must include the PCIeDevice and PCIeCard objects. Other objects are configured depending on the capabilities of the PCIe card. For example, a PCIe NIC requires additional NetworkAdapter and NetworkPort objects, and a GPU requires a GPU object. The basic CSR configuration of a PCIe card is as follows:
{
"FormatVersion": "x.xx",
"DataVersion": "x.xx",
"Unit": {
"Type": "PCIeCard",
"Name": "PCIeCard_1"
},
"ManagementTopology": {
...
},
"Objects": {
"PCIeDevice_1": {
"DeviceName": "PCIe Card $ (name)", // Name of the PCIe card
"FunctionClass": x, // Function classification (0: unknown; 1: RAID; 2: NIC; 3: GPU card...)
"Position": "", // Position of the downstream component
"DiagnosticFault": 0, // Fault flag
"PredictiveFault": 0, // Predictive fault flag
"BandwidthReduction": 0, // Bandwidth reduction property
"LinkSpeedReduced": 0, // Property for reporting PCIe device speed reduction events
"CorrectableError": 0, // Correctable error on the PCIe device. The value is 0 (no correctable error occurred) or 1 (A correctable error occurred).
"UncorrectableError": 0, // Uncorrectable error on the PCIe device. The value is 0 (no uncorrectable error occurred) or 1 (An uncorrectable error occurred).
"FatalError": 0, // Fatal error on the PCIe device. The value is 0 (no fatal error occurred) or 1 (a fatal error occurred).
"Container": "${Container}",
"GroupPosition": "PCIeDevice_${GroupPosition}"
},
"PCIeCard_1": {
"SlotID": "<=/PCIeDevice_1.SlotID",
"NodeID": "<=/PCIeDevice_1.SlotID |> string.format('PCIeCard%s',$1)",
"Name": "xxxx",
"BoardName": "xxx",
"BoardID": 255,
"Description": "xxxx",
"FunctionClass": x, //Function classification (0: unknown; 1: RAID; 2: NIC; 3: GPU card...)
"VendorID": 0000, // Four-digit decimal number (manufacturer ID, one element of the 4-tuple)
"DeviceID": 0000, // Four-digit decimal number (device ID, one element of the 4-tuple)
"SubVendorID": 0000, // Four-digit decimal number (sub-manufacturer ID, one element of the 4-tuple)
"SubDeviceID": 0000, // Four-digit decimal number (sub-device ID, one element of the 4-tuple)
"Position": "<=/PCIeDevice_1.Position", // Position of the downstream component
"LaneOwner": "<=/PCIeDevice_1.SocketID", // Resource ownership. The start value is 1, indicating the CPU to which the current card is connected.
"FirmwareVersion": "N/A",
"Manufacturer": "xxx",
"PartNumber": "xxxxx", // Part number
"RefChip": "#/Chip_xxx",
"Protocol": "xxxx", // Protocol
"MaxFrameLen": 64,
"Model": "xxxx",
"DeviceName": "<=/PCIeDevice_1.DeviceName",
"LinkSpeed": "", // Link speed
"LinkSpeedCapability": "", // Maximum link speed
"PcbID": "#/Accessor_PcbID.Value", // Pcb
"PcbVersion": ".x", // Pcb version
"Health": "", // Health status. 0: normal; 1: minor; 2: major; 3: critical.
"DevBus": "<=/PCIeDevice_1.DevBus", // Bus in device BDF
"DevDevice": "<=/PCIeDevice_1.DevDevice", // Device in device BDF
"DevFunction": "<=/PCIeDevice_1.DevFunction", // Function in device BDF
"SerialNumber": "" // Serial number
},
... // Compile the PCIe card feature object.
}
}
In-Band PCIe Query
As described above, the BDF of a PCIe device is distributed by the BIOS. Therefore, you can query PCIe information through the in-band OS.
dmidecode --type 9
The preceding command obtains PCIe information from the SMBIOS, including the slot, bandwidth, and usage of the PCIe device.
You can also run the lspci -tv command to check the loading status of the in-band PCIe device.
Cable Detection
Alarm Display in Cable Detection
PCIe cable detection checks whether the connections between the UBC and IEU ports comply with the cable connection configuration in PSR.
Cable alarms:
- 0x28000033 (Cable.UBNotPresent) indicates that a cable connection required by the cable configuration of a component is missing.
- 0x28000031 (Cable.UBIncorrectConnection) indicates a wrong connection. For example, port A1 of an IEU is expected to connect to port B1 of a BCU, and port A2 is expected to connect to port B2 of the BCU, but port A1 of the IEU is actually connected to port B2 of the BCU.
- 0x28000035 (Cable.UnitNotSupported) indicates that the port of an IEU is connected to port A of the BCU, while this connection is not required in the configuration.
Cable Detection Process
pcie_device provides a cyclic cable detection solution, which is used to compare the cable connection configuration in the CSR with the actual cable connection on the BCU. It can complete bidirectional detection:
- When the CSR changes, the cable detection can check whether the CSR configuration is correct.
- When the CSR does not change, the cable detection can check whether the current components have cable issues.
The implementation method is to collect topology information through the corresponding SMC command word configured in the BCU CSR and perform the comparison. For example:
[B4a][0x31, 000000104030209999,0]
[B4c][0x11, 00000001040302088888,0]
[A3a][0x31, 0000000104030207777,1]
[A3c][0x11, 0000000104030206666,1]
If the configuration in CSR is different from the preceding topology information, a cable detection alarm is generated.
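The comparison between configured and detected connections can be sketched as follows. This Python example is a simplified model, not the pcie_device implementation. It assumes that the first field of each topology entry is the peer port ID in hexadecimal (0x31 and 0x11 match the TargetPortID values 49 and 17 used earlier in this chapter), it ignores the remaining fields, and it maps the three alarms to the three mismatch cases in the simplest possible way. The function names are hypothetical.

```python
import re

NOT_PRESENT = (0x28000033, "Cable.UBNotPresent")
INCORRECT = (0x28000031, "Cable.UBIncorrectConnection")
NOT_SUPPORTED = (0x28000035, "Cable.UnitNotSupported")

# Matches "[B4a][0x31, ..." and captures the BCU port name and the port ID.
ENTRY = re.compile(r"\[(\w+)\]\[(0x[0-9a-fA-F]+),")


def parse_topology(lines):
    """Return {BCU port name: detected peer port ID} from topology entries."""
    actual = {}
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            actual[match.group(1)] = int(match.group(2), 16)
    return actual


def detect_cable_alarms(expected, actual):
    """Compare configured and detected connections; return (port, alarm) pairs."""
    alarms = []
    for port, target in expected.items():
        if port not in actual:
            alarms.append((port, NOT_PRESENT))    # configured but not detected
        elif actual[port] != target:
            alarms.append((port, INCORRECT))      # detected on the wrong peer
    for port in actual:
        if port not in expected:
            alarms.append((port, NOT_SUPPORTED))  # detected but not configured
    return alarms


# Expected connections taken from SrcPortName / TargetPortID in UnitConfiguration_IEU3.
expected = {"B4c": 17, "B4a": 49}
detected = parse_topology([
    "[B4a][0x31, 000000104030209999,0]",
    "[B4c][0x11, 00000001040302088888,0]",
])
print(detect_cable_alarms(expected, detected))
print(detect_cable_alarms(expected, {"B4a": 17, "B4c": 49}))
```

With the two B4 entries from the example, the detected connections match the configuration and no alarm is produced; swapping the two cables yields an incorrect-connection result for both ports.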
More
For details, see pcie_device.