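This document describes how to adapt a network card (NIC) on openUBMC.
NIC overview
A network card, formally a network interface controller (NIC) and also called a network adapter, network interface card, or LAN adapter, is a piece of hardware designed to let a server communicate over a network.
Because NICs are so important, their technology has kept iterating as infrastructure has grown, producing a wide variety of form factors. In out-of-band management, this diversity has long been a technical challenge for the BMC. Based on the NICs it has already adapted, openUBMC abstracts a common set of management methods so that NICs can be managed uniformly.
NIC types commonly adapted by openUBMC
- Onboard NIC
- Standard NIC card
- OCP NIC
- PCIe NIC
(IB cards, FC cards, and MEZZ cards have been adapted on traditional servers but are rarely needed in current projects.)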
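NIC adaptation workflow
Like other components, each NIC has a CSR file. The CSR file tells openUBMC how to identify the NIC, manage its information, and monitor its status.
What you need to know before adapting a new NIC CSR
- Fixed information such as the NIC name, model, vendor, number of ports, and port type.
- The out-of-band protocols the NIC supports, together with the protocol documentation.
- Which NIC states need monitoring, such as NIC temperature, optical module temperature, and port link status.
- Which sensors the NIC needs and how their readings are obtained.
- Which alarms to report and under what conditions, such as temperature, CE, UCE, IERR, and rate matching.
- Whether the NIC needs its own fan speed control policy.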
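Obtaining basic NIC information
- Confirm the NIC type (PCIe, OCP, standard low-speed-interface NIC, onboard, and so on).
- NIC name and model (from the markings on the card or from vendor documentation).
- Port type and number of ports (from the card's form factor, interface type, and port type).
Configuring advanced NIC information
- Out-of-band protocols supported by the NIC (such as NCSI and MCTP; obtain from vendor documentation).
- NIC temperature, optical module temperature, and port link status (obtain from vendor documentation or from the system hardware engineer).
- Design the corresponding sensors, alarms, and events according to the NIC's capabilities.
- Write the fan speed control policy according to the thermal engineer's guidance document.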
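NIC identification
- Configure the correct SR file name. The file name for a NIC is composed of "BOM+ID+AUXID".
- Onboard NICs and standard NIC cards generally use "BOM+ID" for the file name. In the SR of the component the NIC connects to (such as a carrier board), configure the NIC's connector; the NIC SR file is then loaded according to the presence signal reported by hardware.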
{
"Connector_LOM_1": {
"Bom": "14220246",
"Slot": 1,
"Position": 5,
"Presence": "<=/Scanner_Lom1Pres.Value |> expr(($1 == 1) ? 1 : 0)",
"Id": "255",
"AuxId": "",
"Buses": [
"I2c_2",
"I2cMux_Pca9545_i2c7_chip_1"
],
"SystemId": "${SystemId}",
"ManagerId": "${ManagerId}",
"ChassisId": "${ChassisId}",
"SilkText": "J6008",
"IdentifyMode": 1 // How the downstream component is identified: 3 = Tianchi standard component, 2 = component whose BoardId is not readable (reported), 1 = component whose BoardId is readable
},
"Scanner_Lom1Pres": {
"Chip": "#/Smc_ExpBoardSMC",
"Offset": 134234368, // SMC command word
"Size": 2, // Number of data bytes
"Mask": 1, // Data mask
"Type": 0,
"Period": 2000, // Scan period, in ms
"Value": 0
}
}
- OCP cards and PCIe cards generally use "BOM+ID+AUXID" for the file name, where ID is vendorID+deviceID and AUXID is subVendorID+subDeviceID.
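The Id and AuxId values in the sample connector (`15b31015`, `19e5d13b`) look like two 16-bit PCI IDs, each rendered as four lowercase hex digits and concatenated. The following Python sketch shows that composition under this assumption. The function names are illustrative, and the separator and extension of the real SR file name are not specified here, so the sketch only returns the three name parts.

```python
def pci_id_pair(first: int, second: int) -> str:
    """Concatenate two 16-bit PCI IDs as 8 lowercase hex digits."""
    for value in (first, second):
        if not 0 <= value <= 0xFFFF:
            raise ValueError(f"PCI ID out of 16-bit range: {value:#x}")
    return f"{first:04x}{second:04x}"


def sr_name_parts(bom: str, vendor: int, device: int,
                  sub_vendor: int, sub_device: int) -> tuple:
    """Return the (BOM, ID, AUXID) parts that the SR file name is built from."""
    return (bom,
            pci_id_pair(vendor, device),          # ID = vendorID + deviceID
            pci_id_pair(sub_vendor, sub_device))  # AUXID = subVendorID + subDeviceID


parts = sr_name_parts("14220247", 0x15B3, 0x1015, 0x19E5, 0xD13B)
print(parts)  # ('14220247', '15b31015', '19e5d13b')
```

Zero-padding each ID to four digits matters: an ID such as 0x00a1 would otherwise shorten the string and no longer match the file name.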
{
"Connector_OCP_1": {
"Bom": "14220247",
"Slot": 7,
"Position": 7,
"Presence": 0,
"Id": "15b31015",
"AuxId": "19e5d13b",
"Buses": [
"I2cMux_Pca9545_i2c7_chip_1"
],
"SystemId": "${SystemId}",
"ManagerId": "${ManagerId}",
"ChassisId": "${ChassisId}",
"SilkText": "J6008",
"IdentifyMode": 2
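},
"BusinessConnector_1": {
"Name": "Down_1",
"Direction": "Downstream", // Connector direction: "Downstream" or "Upstream". Downstream connectors map one-to-one to slots; upstream connectors correspond to the UBC ports of the base board (BCU)
"Slot": 1, // Slot index of the downstream connector within this component; used to determine the global slot number once the service topology is built
"LinkWidth": "X4", // PCIe link width of the connector
"MaxLinkRate": "PCIe 4.0", // Highest PCIe specification the connector supports
"ConnectorType": "UBCDD", // Connector type; common values include UBCDD and PCIE CEM
"UpstreamResources": [ // Upstream connector resources for this downstream connector: Name matches the upstream connector's Name, and Offset matches the upstream connector's Port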
{
"Name": "Up_1",
"ID": 255,
"Offset": 0,
"Width": 4
}
],
"RefMgmtConnector": "#/Connector_OCP_1", // Management connector object associated with this downstream connector; used later to load the card's CSR
"RefPCIeAddrInfo": "#/PcieAddrInfo_1" // Slot PcieAddrInfo object associated with this downstream connector
},
"PcieAddrInfo_1": {
"Location": "RiserCard${Slot}", // Location of the PCIe slot
"ComponentType": 83, // Component type; corresponds to COMPONENT_TYPE in the code
"ContainerSlot": "${Slot}",
"ContainerUID": "00000001040302000002", // Container UID
"ContainerUnitType": "EXU", // Container board type
"GroupPosition": "PcieAddrInfo_1_${GroupPosition}",
"ControllerType": 1, // PCIe controller type: 0 = PCIeCore, 1 = NIC, 2 = SAS, 3 = SATA, 4 = ZIP, 5 = SEC
"Segment": 0, // Index for multi-PCI-bridge scenarios; each Segment corresponds to one PCI bus space
"GroupID": 0, // Logical group ID
"SlotID": 1, // Slot number of the PCIe device
"SocketID": 0, // CPU ID: 0 = CPU1, 1 = CPU2
"Bus": 0, // Root port bus number
"Device": 125, // Root port device number
"Function": 0 // Root port function number
}
}
Management topology object
- Configure all bus information for the NIC
- Configure all channel information
- Configure all chip information
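A quick way to sanity-check a management topology is to walk it from the Anchor and list the chips on each bus, then compare them with the objects actually defined in the CSR. The Python sketch below does this for a topology shaped like the sample in this section. The helper names `chips_by_bus` and `undefined_chips` are hypothetical and not part of openUBMC.

```python
import json

# Topology with the same shape as the sample in this section.
topology = json.loads("""
{
    "ManagementTopology": {
        "Anchor": {"Buses": ["I2c_2", "I2cMux_pca9545_chan1"]},
        "I2c_2": {"Chips": ["Smc_ExpBoardSMC"]},
        "I2cMux_pca9545_chan1": {"Chips": ["Chip_NICTempChip", "Eeprom_NIC"]}
    }
}
""")["ManagementTopology"]


def chips_by_bus(topo: dict) -> dict:
    """Map each bus listed under Anchor to the chips attached to it."""
    result = {}
    for bus in topo["Anchor"]["Buses"]:
        if bus not in topo:
            raise KeyError(f"bus {bus} is listed under Anchor but not defined")
        result[bus] = list(topo[bus].get("Chips", []))
    return result


def undefined_chips(topo: dict, csr_objects: set) -> list:
    """Return chips referenced by the topology that have no object in the CSR."""
    referenced = [chip for chips in chips_by_bus(topo).values() for chip in chips]
    return [chip for chip in referenced if chip not in csr_objects]


print(chips_by_bus(topology))
print(undefined_chips(topology, {"Smc_ExpBoardSMC", "Eeprom_NIC"}))
```

With the object set passed in the last line, the second call reports `Chip_NICTempChip`, because the topology references it but no object with that name was supplied.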
{
"ManagementTopology": {
"Anchor": {
"Buses": [
"I2c_2",
"I2cMux_pca9545_chan1"
]
},
"I2c_2": {
"Chips": [
"Smc_ExpBoardSMC"
]
},
"I2cMux_pca9545_chan1": {
"Chips": [
"Chip_NICTempChip",
"Eeprom_NIC"
]
}
}
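}
Component object
- Configure the NIC's FruId
- Configure the NIC type
- Configure the NIC location information
- Configure the NIC part number
- Configure the NIC serial number
- Configure the NIC vendor information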
{
"Component_OCPCard": {
"FruId": 255, // Defaults to 255 when there is no FRU; otherwise reference the FruId of the Fru object
"Instance": "${Slot}",
"Type": 83, // Component type; corresponds to COMPONENT_TYPE in the code
"BoardId": 65535,
"Name": "<=/PCIeDevice_1.DeviceName",
"Location": "<=/PCIeDevice_1.Position",
"Manufacturer": "<=/NetworkAdapter_1.Manufacturer",
"PartNumber": "06310158", // Part number; can be obtained from the FRU data
"UniqueId": "N/A",
"Presence": 1,
"Health": 0,
"Function": "PCIe device",
"GroupId": 1,
"PowerState": 1,
"SerialNumber": "<=/PCIeCard_1.SerialNumber"
}
}
PCIe device object
Defines the basic attributes and type of the PCIe device
- Device name
- Function class
- Device type
- Slot type
{
"PCIeDevice_1": {
"DeviceName": "PCIe Card $ (device model)",
"FunctionClass": 2, // Function class: 0 = unknown, 1 = RAID, 2 = NIC, 3 = GPU card, 4 = storage card (SSD card/M.2 card), 5 = SDI card, 6 = accelerator card, 7 = expansion card (PCIe riser), 8 = FPGA card, 9 = NPU card
"PCIeDeviceType": "MultiFunction", // Device type
"SlotType": "FullLength", // Slot type
"FunctionProtocol": "PCIe", // Protocol type
"FunctionType": "Physical", // Function type
"SlotID": 1, // Slot number of the PCIe device
"SocketID": 0, // CPU ID: 0 = CPU1, 1 = CPU2
"Bus": 0, // Root port bus number
"Device": 125, // Root port device number
"Function": 0, // Root port function number
"Position": "EXU${Slot}" // Device position (container name)
}
}
NIC object
Defines the basic information and attributes of the NIC
- Name
- Description
- Vendor ID
- Device ID
- Manufacturer information
OCP card object:
{
"OCPCard_1": {
"SlotID": "${Slot}",
"NodeID": "<=/NetworkAdapter_1.SlotNumber |> string.format('OCPCard%s',$1)",
"Name": "BCM957414N4140C",
"BoardName": "BCM957414N4140C",
"Description": "BCM957414N4140C",
"FunctionClass": 2,
"VendorID": 5348,
"DeviceID": 5847,
"SubVendorID": 5348,
"SubDeviceID": 16710,
"Position": "<=/PCIeDevice_1.Position",
"LaneOwner": "<=/PCIeDevice_1.SocketID",
"FirmwareVersion": "N/A",
"Manufacturer": "Broadcom",
"PartNumber": "",
"Health": "<=/Component_OCPCard.Health",
"Model": "BCM57414",
"DeviceName": "<=/PCIeDevice_1.DeviceName",
"PcbVersion": ".A",
"Bus": "<=/PCIeDevice_1.Bus",
"Device": "<=/PCIeDevice_1.Device",
"Function": "<=/PCIeDevice_1.Function",
"DevBus": "<=/PCIeDevice_1.DevBus",
"DevDevice": "<=/PCIeDevice_1.DevDevice",
"DevFunction": "<=/PCIeDevice_1.DevFunction",
"SerialNumber": ""
}
}
PCIe card object:
{
"PCIeCard_1": {
"SlotID": "<=/PCIeDevice_1.SlotID",
"NodeID": "<=/PCIeDevice_1.SlotID |> string.format('PCIeCard%s',$1)",
"Name": "SP670",
"BoardName": "SP670",
"BoardID": 255,
"Description": "100Gb/s Dual port card",
"FunctionClass": 2,
"VendorID": 6629,
"DeviceID": 546,
"SubVendorID": 6629,
"SubDeviceID": 161,
"Position": "<=/PCIeDevice_1.Position",
"LaneOwner": "<=/PCIeDevice_1.SocketID",
"FirmwareVersion": "<=/NetworkAdapter_1.FirmwareVersion",
"Manufacturer": "Huawei",
"PartNumber": "02314CRT",
"RefChip": "#/Chip_Hi1822", // Chip associated with attributes obtained over the advanced protocol
"Protocol": "SmBus", // Out-of-band protocol through which attributes such as junction temperature and optical module temperature can be obtained
"MaxFrameLen": 64,
"Model": "Hi1822",
"DeviceName": "<=/PCIeDevice_1.DeviceName",
"LinkSpeed": "<=/NetworkAdapter_1.LinkSpeed",
"LinkSpeedCapability": "<=/NetworkAdapter_1.LinkSpeedCapability",
"PcbID": "#/Accessor_PcbID.Value",
"PcbVersion": ".A",
"Health": "<=/Component_PCIeCard.Health",
"Bus": "<=/PCIeDevice_1.Bus",
"Device": "<=/PCIeDevice_1.Device",
"Function": "<=/PCIeDevice_1.Function",
"DevBus": "<=/PCIeDevice_1.DevBus",
"DevDevice": "<=/PCIeDevice_1.DevDevice",
"DevFunction": "<=/PCIeDevice_1.DevFunction",
"SerialNumber": "<=/FruData_Fru0.BoardSerialNumber"
}
}
Onboard NIC:
```json
{
"BoardNICCard_1": {
"Slot": "${Slot}",
"UID": "00000001C00302023956",
"BoardNICCardName": "BC83ETHB",
"Manufacturer": "<=/NetworkAdapter_1.Manufacturer",
"Type": "NICCard",
"Description": "<=/NetworkAdapter_1.ModelDescription",
"PartNumber": "N/A",
"PcbVersion": "<=/Fru_NICCard.PcbVersion",
"LogicVersion": "N/A",
"SRVersion": "",
"BoardID": "<=/NetworkAdapter_1.BoardID",
"BoardType": "BoardNICCard",
"Number": 1,
"DeviceName": "<=/NetworkAdapter_1.DeviceLocator",
"Position": "<=/NetworkAdapter_1.Position",
"BoardNodeId": "<=/NetworkAdapter_1.SlotNumber |> string.format('chassisNIC%s(SF223D-H)',$1)",
"FruID": "<=/Fru_NICCard.FruId",
"AssociatedResource": "<=/NetworkAdapter_1.AssociatedResource",
"RefComponent": "#/Component_NICCard",
"RefFru": "#/Fru_NICCard"
}
}
```
Network adapter object
- Name
- Description
- Port count
- Supported features
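{
"NetworkAdapter_1": {
"Name": "BCM957414A4142CC", // Adapter name
"Description": "2*25GE", // Adapter description
"NetworkPortCount": "2", // Number of ports
"VendorID": "0x14e4", // Vendor ID
"DeviceID": "0x16d7", // Device ID
"SupportedMctp": true, // Whether MCTP is supported
"SupportedLLDP": true, // Whether LLDP is supported
"HotPluggable": false, // Whether hot plug is supported
"LinkWidth": "N/A", // Link width
"LinkSpeed": "N/A", // Link speed
"Model": "BCM57414", // NIC model; the out-of-band management protocol used to obtain NIC information is loaded according to the model
"FruId": 255,
"ReadyToRemove": false, // Whether the card is ready to be removed; used for notification-based hot plug
"AttentionHotPlugState": 255, // Hot-plug state
"TemperatureCelsius": 0, // NIC temperature
"TemperatureStatus": 0, // Status of temperature acquisition
"BandwidthThresholdPercent": 100, // Bandwidth alarm threshold, in percent
"SpecialPcieCard": false // Whether the card has a special NIC chip topology; currently only the 1822 Ethernet card sets this to true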
}
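}
Configuring basic port information
Network port object
Defines the attributes and capabilities of a network port
- Port ID
- Medium type
- Link capability
- Port type
- Device function type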
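The NetDevFuncType values listed in the configuration comment (1, 2, 4, 8, 16, 32) are all powers of two, which suggests they may be combinable bit flags. This document does not say so explicitly, so treat the following Python sketch as an assumption; the enum and the `describe` helper are illustrative names only.

```python
from enum import IntFlag


class NetDevFuncType(IntFlag):
    """Values taken from the NetDevFuncType comment; 0 means Disabled."""
    ETHERNET = 1
    FC = 2
    ISCSI = 4
    FCOE = 8
    OPA = 16
    IB = 32


def describe(value: int) -> list:
    """List the function names whose bit is set; an empty list means Disabled."""
    return [flag.name for flag in NetDevFuncType if value & flag]


print(describe(1))   # ['ETHERNET']
print(describe(33))  # ['ETHERNET', 'IB']
```

If the field turns out to hold exactly one value at a time, the same table still works as a plain lookup.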
{
"NetworkPort_0": {
"@Parent": "NetworkAdapter_1",
"SystemID": 1,
"PortID": 0, // Port ID
"NetDevFuncType": 1, // Device function type: 0 = Disabled, 1 = Ethernet, 2 = FC, 4 = iSCSI, 8 = FCoE, 16 = OPA, 32 = IB
"MediumType": "FiberOptic", // Medium type: Copper or FiberOptic
"SupportedLinkCapability": "25GE", // Link capability
"BandwidthUsagePercent": 0, // Current bandwidth usage of the port
"Type": 5, // Port type
"LinkStatusValue": 1, // Link status
"LinkStatusNumeric": 255, // Link status (numeric)
"SupportedFuncType": 1,
"FunctionType": 1, // Function type supported by the NIC: 1 = ETHERNET, 2 = FC, 3 = IB
"MACAddress": "00:00:00:00:00:00", // Port MAC address
"PermanentMACAddress": "00:00:00:00:00:00" // Permanent MAC address of the port
}
}
Optical module object
- Channel number
- Temperature
- Power state
- Thresholds
{
"OpticalModule_0": {
"@Parent": "NetworkPort_0",
"ChannelNum": 1,
"TemperatureCelsius": 0,
"PowerState": 0,
"Presence": 0,
"IsSupportedType": 0,
"Temp_UpperThresholdCritical": 125,
"FaultState": 0,
"SupplyVoltage": 0,
"LowerThresholdCritical": 0,
"UpperThresholdCritical": 0
}
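}
Configuring alarms and events
For sensor customization and development, refer to "Sensor Customization and Development". Only NIC-related sensors are shown here as examples.
- Set temperature monitoring thresholds, used for alarms or fan speed control
- Configure voltage and power monitoring, for example as alarms or sensors
Threshold sensor object (ThresholdSensor)
- Six thresholds
- Hysteresis for clearing an alarm
- Data source for the reading
- Reasonable M and RBExp values, so that the computed sensor reading stays within the valid range of one byte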
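To see how M and RBExp interact, it helps to apply the IPMI linear conversion formula, y = (M * x + B * 10^Bexp) * 10^Rexp, where x is the one-byte raw reading. The Python sketch below assumes that the CSR RBExp field is packed like the IPMI SDR byte (high nibble is the signed result exponent, low nibble is the signed B exponent) and that B is 0 when it is not configured. Both points are assumptions, not statements from this document.

```python
def signed_nibble(value: int) -> int:
    """Interpret a 4-bit field as a two's-complement signed integer."""
    value &= 0xF
    return value - 16 if value >= 8 else value


def sensor_reading(raw: int, m: int, rb_exp: int, b: int = 0) -> float:
    """Apply the IPMI linear formula y = (M * raw + B * 10**Bexp) * 10**Rexp."""
    if not 0 <= raw <= 0xFF:
        raise ValueError("raw reading must fit in one byte")
    r_exp = signed_nibble(rb_exp >> 4)  # high nibble: result exponent
    b_exp = signed_nibble(rb_exp)       # low nibble: B exponent
    return (m * raw + b * 10.0 ** b_exp) * 10.0 ** r_exp


# RBExp = 224 = 0xE0 decodes to Rexp = -2 and Bexp = 0.
print(round(sensor_reading(105, m=100, rb_exp=224), 2))  # temperature sample
print(round(sensor_reading(132, m=1, rb_exp=224), 2))    # voltage sample
```

Under these assumptions, the temperature sample (M = 100, RBExp = 224) maps a raw value of 105 to 105, and the voltage sample (M = 1, RBExp = 224) maps raw values 108 and 132 to 1.08 and 1.32, which is plausible for a 1.2 V rail.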
{
"ThresholdSensor_Temp": {
"OwnerId": 32,
"OwnerLun": 0,
"EntityId": "<=/Entity_PCIeCard.Id",
"EntityInstance": "<=/Entity_PCIeCard.Instance",
"Initialization": 127,
"Capabilities": 232, // Conditions under which the sensor takes effect
"SensorType": 1, // Sensor type; see chapter 42 of the IPMI specification
"ReadingType": 1,
"SensorName": "<=/PCIeDevice_1.SlotID |> string.format('PCIe NIC%s Temp',$1)",
"Unit": 128,
"BaseUnit": 1,
"ModifierUnit": 0,
"Analog": 1,
"NominalReading": 25,
"NormalMaximum": 0,
"NormalMinimum": 0,
"MaximumReading": 127,
"MinimumReading": 128,
"Reading": "<=/NetworkAdapter_1.TemperatureCelsius",
"AssertMask": 128, // Assertion capability mask; see Assertion Event Mask in the IPMI specification
"DeassertMask": 28800, // See Deassertion Event Mask in the IPMI specification
"ReadingMask": 2056,
"Linearization": 0,
"M": 100,
"RBExp": 224,
"UpperNoncritical": 105,
"PositiveHysteresis": 2,
"NegativeHysteresis": 2
},
"ThresholdSensor_Voltage": {
"OwnerId": 32,
"OwnerLun": 0,
"EntityId": "<=/Entity_PCIeCard.Id",
"EntityInstance": "<=/Entity_PCIeCard.Instance",
"Initialization": 127,
"Capabilities": 232,
"SensorType": 2,
"ReadingType": 1,
"SensorName": "<=/PCIeDevice_1.SlotID |> string.format('Hi1822_NIC%s_1v2',$1)",
"Unit": 0,
"BaseUnit": 4,
"ModifierUnit": 0,
"Analog": 1,
"NominalReading": 90,
"NormalMaximum": 0,
"NormalMinimum": 0,
"MaximumReading": 127,
"MinimumReading": 0,
"Reading": "<=/Scanner_1v2.Value |> expr($1 / 10)",
"AssertMask": 516,
"DeassertMask": 516,
"ReadingMask": 4626,
"Linearization": 0,
"M": 1,
"RBExp": 224,
"UpperCritical": 132,
"LowerCritical": 108,
"PositiveHysteresis": 2,
"NegativeHysteresis": 2,
"ReadingStatus": "<=/::FruCtrl_1_0.PowerState |> string.cmp($1, 'ON') |> expr($1 ? 0 : 2)"
},
"ThresholdSensor_Power": {
"OwnerId": 32,
"OwnerLun": 0,
"EntityId": "<=/Entity_MainBoard.Id",
"EntityInstance": "<=/Entity_MainBoard.Instance",
"Initialization": 127,
"Capabilities": 232,
"SensorType": 11,
"ReadingType": 1,
"SensorName": "Power",
"SensorIdentifier": "Power",
"BaseUnit": 6,
"Analog": 1,
"NominalReading": 25,
"MaximumReading": 255,
"Reading": "<=/Scanner_Fan_Pwr.Value",
"ReadingStatus": "<=/Scanner_Fan_Pwr.Status",
"SensorNumber": 255,
"M": 5,
"PositiveHysteresis": 1,
"NegativeHysteresis": 1
}
}
Cooling control objects
- Target temperature
- Maximum allowed temperature
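The MonitoringValue expression used in the CoolingRequirement sample, `expr((($1 + 5) > ($2 - 10)) ? ($1 + 5) : ($2 - 10))`, is a ternary that picks the larger of two adjusted readings. The Python sketch below mirrors it, assuming that `$1` and `$2` bind to the two semicolon-separated sources in order; the function name is illustrative.

```python
def monitoring_value(first: float, second: float) -> float:
    """Mirror expr((($1 + 5) > ($2 - 10)) ? ($1 + 5) : ($2 - 10))."""
    return max(first + 5, second - 10)


print(monitoring_value(30, 30))  # 35
print(monitoring_value(20, 50))  # 40
```

In the sample both operands come from the same scanner, so under this reading the result is always the inlet reading plus 5. The second operand only matters when a different temperature source is substituted for it.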
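{
"CoolingConfig_Basic": { // CoolingConfig objects are usually not configured in a card's configuration file
"SmartCoolingState": "Enabled", // Whether smart fan speed control is enabled: "Enabled" or "Disabled"
"SmartCoolingMode": "EnergySaving", // Smart cooling mode: "EnergySaving", "HighPerformance", "LowNoise", "Custom", or "LiquidCooling"
"LevelPercentRange": [20, 100], // Speed level range for smart cooling; 20 to 100 in this sample
"InitLevelInStartup": 100, // Default speed level at startup; must be within LevelPercentRange
"DiskRowTemperatureAvailable": false, // Whether disk temperature is available
"SysHDDsMaxTemperature": 80.0, // Maximum temperature threshold for HDDs
"SysSSDsMaxTemperature": 80.0, // Maximum temperature threshold for SSDs
"SensorLocationSupported": false // Whether the temperature map view is supported
},
"CoolingPolicy_EnergySaving": {
"PolicyIdx": 6, // Linear speed policy Id; PolicyIdx must be globally unique
"ExpCondVal": "EnergySaving", // Expected condition; the CoolingPolicy takes effect only when the actual condition matches the expected one. Values: "EnergySaving", "HighPerformance", "LowNoise", "Custom", "LiquidCooling"
"ActualCondVal": "<=/CoolingConfig_1.SmartCoolingMode", // Actual condition. Values: "EnergySaving", "HighPerformance", "LowNoise", "Custom", "LiquidCooling"
"TemperatureRangeLow": [-127, 20, 30, 40, 50], // Lower bounds of the temperature ranges for the linear speed policy
"TemperatureRangeHigh": [20, 30, 40, 50, 127], // Upper bounds of the temperature ranges for the linear speed policy
"SpeedRangeLow": [20, 32, 70, 100], // Lower bounds of the speed ranges for the linear speed policy
"SpeedRangeHigh": [20, 32, 70, 100], // Upper bounds of the speed ranges for the linear speed policy
"FanType": ["02314BLG 8038+"] // Fan type list; the fan condition is met when the installed fan type is one of the listed types
},
"CoolingRequirement_1_7": {
"RequirementId": 7, // Cooling requirement Id; must be globally unique. The Id currently uses 16 valid bits: the first 8 bits are the base id and the last 8 bits are the slot number
"TemperatureType": 11, // Temperature point type: 1 = Cpu, 2 = Outlet, 3 = Disk, 4 = Memory, 5 = PCH, 6 = VRD, 7 = VDDQ, 8 = NPUHbm, 9 = NPUAiCore, 10 = NPUBoard, 11 = Inlet, 12 = SoCBoardOutlet, 13 = SoCBoardInlet
"MonitoringStatus": "<=/Scanner_Lm75_Inlet.Status", // Status of the temperature sensor: 0 = normal, 1 = abnormal
"MonitoringValue": "<=/Scanner_Lm75_Inlet.Value;<=/Scanner_Lm75_Inlet.Value |> expr((($1 + 5) > ($2 - 10)) ? ($1 + 5) : ($2 - 10))", // Temperature value used for speed control
"FailedValue": 80, // Fan speed applied when the temperature status is abnormal. If not configured, abnormal-state speed control is not triggered; if configured, this fixed speed is applied whenever reading the temperature point fails
"TargetTemperatureCelsius": 50, // Current target value for speed control
"MaxAllowedTemperatureCelsius": 60,
"TargetTemperatureRangeCelsius": [ // Allowed range for a custom target temperature; used to validate a custom target value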
40,
60
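],
"SmartCoolingTargetTemperature": [ // Target temperatures for the EnergySaving/HighPerformance/LowNoise modes; optional
50,
47,
53
],
"CustomSupported": true, // Whether a custom target temperature is supported: true = supported, false = not supported
"CustomTargetTemperatureCelsius": 50, // User-defined target temperature; 255 means invalid
"SensorName": "#/ThresholdSensor_InletTemp.SensorName" // Sensor name
},
"CoolingArea_1_25": {
"AreaId": 25, // Cooling area Id; AreaId must be globally unique
"RequirementIdx": 25, // Id of the cooling requirement associated with this area; matches the RequirementId for U8 or the BaseId for U16
"PolicyIdxGroup": [], // Ambient-temperature speed policy indexes associated with this area
"FanIdxGroup": [ // Set of fan Ids (CoolingFan.FanId) associated with this area, that is, the fans that take part in speed control for this area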
1,
2,
3,
4
]
}
}
Basic event objects
{
"Event_PCIeBandWidth": {
"EventKeyId": "PCIeCard.PCIeCardBandWidthDecreased",
"xxx": "xxxx"
},
"Event_PCIeLinkSpeed": {
"EventKeyId": "PCIeCard.PCIeCardLinkSpeedReduced",
"xxx": "xxxx"
},
"Event_PCIeCardUCE": {
"EventKeyId": "PCIeCard.PCIeCardUncorrectableErr",
"xxx": "xxxx"
}
}
Temperature-related events
{
"Event_OverTemp": {
"EventKeyId": "PCIeCard.PCIeCardOverTemp",
"xxx": "xxxx"
},
"Event_TempFail": {
"EventKeyId": "PcieCard.PCIeCardTempFail",
"xxx": "xxxx"
},
"Event_TempSensorFail": {
"EventKeyId": "PcieCard.PCIeCardTempSensorFail",
"xxx": "xxxx"
}
}
Power-related events
{
"Event_PowerFail": {
"EventKeyId": "PCIeCard.PCIeCardPowerFail",
"xxx": "xxxx"
},
"Event_VoltageAlarm": {
"EventKeyId": "PCIeCard.PCIeCardVoltageAlarm",
"xxx": "xxxx"
},
"Event_PowerOverload": {
"EventKeyId": "PCIeCard.PCIeCardPowerOverload",
"xxx": "xxxx"
}
}
Port-related events
{
"Event_Port1LinkDown": {
"EventKeyId": "Port.PortDisconnected",
"xxx": "xxxx"
},
"Event_OM1PowerAlarm": {
"EventKeyId": "Port.PortOpticalModulePowerAlarm",
"xxx": "xxxx"
},
"Event_OM1SpeedMatch": {
"EventKeyId": "Port.PortOpticalModuleSpeedMismatch",
"xxx": "xxxx"
},
"Event_Port1BWUsageMntr": {
"EventKeyId": "NICCard.SystemNetworkBandwidthUsageHigh",
"xxx": "xxxx"
}
}
Device status events
{
"Event_DevicePresence": {
"EventKeyId": "PCIeCard.PCIeCardPresence",
"xxx": "xxxx"
},
"Event_PowerState": {
"EventKeyId": "PCIeCard.PCIeCardPowerState",
"xxx": "xxxx"
},
"Event_FaultState": {
"EventKeyId": "PCIeCard.PCIeCardFaultState",
"xxx": "xxxx"
}
}
FRU object
{
"Eeprom_NetCard": {
"OffsetWidth": 2,
"AddrWidth": 1,
"Address": 172,
"WriteTmout": 100,
"ReadTmout": 100,
"RwBlockSize": 32,
"WriteInterval": 20,
"HealthStatus": 0
},
"Fru_PCIeCard": {
"PcbId": "PCB ID",
"FruId": 1,
"FruName": "OCP${slot}",
"ConnectorGroupId": "${GroupId}",
"BoardId": 255,
"PowerState": 1,
"Health": 0,
"EepStatus": 1,
"FruDataId": "#/FruData_OCP1"
},
"FruData_OCP1": {
"FruId": 1,
"FruDev": "#/Eeprom_NetCard",
"EepromWp": "#/Accessor_WP.Value",
"BoardPartNumber": "N/A",
"BoardSerialNumber": "N/A",
"StorageType": "EepromV2"
}
}
Chip object
{
"Chip_Hi1822": {
"OffsetWidth": 0,
"AddrWidth": 1, // Address width
"Address": 232, // Address
"WriteTmout": 100, // Write timeout
"ReadTmout": 100, // Read timeout
"HealthStatus": 0, // Health status
"WriteRetryTimes": 2, // Number of retries after a failed write
"ReadRetryTimes": 0
}
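}
FAQ
1. OCP and PCIe cards are not identified
Explanation: OCP and PCIe cards are generally loaded with IdentifyMode set to 2. The overall flow is:
-> During system adaptation, configure the PcieAddrInfo objects for the OCP and PCIe cards on the BMC.
-> The BIOS passes the BDF numbers of both device types to the BMC through WritePcieCardBdfToBmc and WriteOcpCardBdfToBmc.
-> Using the BDF, the BMC queries the IMU for the four-tuple of the corresponding PCIe slot, then sets the Id, AuxId, and presence of the corresponding connector accordingly, which loads the matching CSR.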
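When debugging this flow, it can help to model the last step in isolation: given the four-tuple for a slot, what should the connector end up with? The Python sketch below is a simplified model, not openUBMC code. The `Quad` type, `update_connector`, and the `imu_table` dictionary standing in for the IMU query are all hypothetical, and the 8-hex-digit Id and AuxId format is inferred from the sample connector.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """PCI four-tuple of the device found behind a slot."""
    vendor: int
    device: int
    sub_vendor: int
    sub_device: int


def update_connector(connector: dict, quad: Optional[Quad]) -> dict:
    """Return a copy of the connector with Id, AuxId and Presence filled in."""
    updated = dict(connector)
    if quad is None:  # no device found behind this slot
        updated.update(Id="", AuxId="", Presence=0)
        return updated
    updated["Id"] = f"{quad.vendor:04x}{quad.device:04x}"
    updated["AuxId"] = f"{quad.sub_vendor:04x}{quad.sub_device:04x}"
    updated["Presence"] = 1
    return updated


# Hypothetical stand-in for the IMU query: (bus, device, function) -> four-tuple.
imu_table = {(0, 125, 0): Quad(0x15B3, 0x1015, 0x19E5, 0xD13B)}

connector = {"Bom": "14220247", "Id": "", "AuxId": "", "Presence": 0}
loaded = update_connector(connector, imu_table.get((0, 125, 0)))
print(loaded["Id"], loaded["AuxId"], loaded["Presence"])  # 15b31015 19e5d13b 1
```

If the resulting Bom, Id, and AuxId do not correspond to an existing SR file, or the BDF in PcieAddrInfo does not match what the BIOS reports, the card will not load, so these are the values to compare first.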
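2. The number of ports is incorrect
Explanation: Check that the NetworkPortCount property of the NetworkAdapter object is configured correctly.
Check that the NetworkPort objects match the number of ports. Each port needs one object, and an optical port also needs a corresponding OpticalModule object.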
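These two checks are easy to automate against a parsed CSR. The Python sketch below is a hypothetical lint helper, not an openUBMC tool. It assumes the CSR has already been loaded into a dictionary (with comments stripped) and that objects follow the `NetworkPort_*` / `OpticalModule_*` naming and `@Parent` linkage used in the samples above.

```python
def check_ports(csr: dict, adapter: str = "NetworkAdapter_1") -> list:
    """Return a list of problems found; an empty list means the CSR is consistent."""
    problems = []
    ports = [name for name, obj in csr.items()
             if name.startswith("NetworkPort_") and obj.get("@Parent") == adapter]
    expected = int(csr[adapter]["NetworkPortCount"])  # the sample stores it as a string
    if len(ports) != expected:
        problems.append(
            f"NetworkPortCount is {expected} but {len(ports)} NetworkPort objects were found")
    # Ports that have at least one OpticalModule object pointing at them.
    with_module = {obj.get("@Parent") for name, obj in csr.items()
                   if name.startswith("OpticalModule_")}
    for port in ports:
        if csr[port].get("MediumType") == "FiberOptic" and port not in with_module:
            problems.append(f"{port} is FiberOptic but has no OpticalModule object")
    return problems


csr = {
    "NetworkAdapter_1": {"NetworkPortCount": "2"},
    "NetworkPort_0": {"@Parent": "NetworkAdapter_1", "MediumType": "FiberOptic"},
    "OpticalModule_0": {"@Parent": "NetworkPort_0"},
}
for problem in check_ports(csr):
    print(problem)
```

With this sample the helper reports a count mismatch, because the adapter declares two ports while only `NetworkPort_0` is defined.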
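3. NCSI is unavailable
Explanation: Check the configuration of the NCSIChannelMgmt object or the NCSICapabilities object.
Check that a file with the same name as the NetworkAdapter object's Model exists under src/lualib/hardware_config. That file covers device identification and defines, through the properties table, how each attribute is obtained.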
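4. Data reported by iBMA cannot be matched correctly
Explanation: Check that the corresponding PcieAddrInfo object matches the hardware information. On the OS, the lspci command shows the BDF and other details of the card in each slot.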