labels: Add ASRockRack ROMED8-2T memory topology #232

Closed
csmarshall wants to merge 4 commits into mchehab:master from csmarshall:add-asrockrack-romed8-2t-labels
@csmarshall
Contributor

Add memory slot labels for ASRockRack ROMED8-2T motherboard. This board has 8 memory slots (A1-H1) supporting DDR4 ECC memory. Optimal 4-DIMM configuration uses slots A1, B1, G1, H1.

Each 64GB DIMM spans two memory controller rows (csrow0/csrow1) requiring dual coordinate mapping per physical slot.

Tested on system with 4x 64GB DDR4-3200 ECC modules.

Signed-off-by: Charles Marshall <charles@wozi.com>
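
To make the "dual coordinate mapping" concrete, here is a minimal Python sketch of what it amounts to. The slot-to-channel pairs are read off the --per-rank output further down in this PR; only the four populated slots are covered, since the channels for the other four are not shown here. The `label: mc.csrow.channel, ...;` rendering is an assumption about the labels-file syntax, not a copy of the committed file, and `label_entries` and `render` are hypothetical helper names.

```python
# Slot-to-channel pairs taken from the --per-rank output in this PR.
# Only the four populated slots are known; the rest are left out on purpose.
SLOT_CHANNEL = {"DDR4_A1": 0, "DDR4_B1": 1, "DDR4_G1": 4, "DDR4_H1": 5}
CSROWS_PER_DIMM = (0, 1)  # each 64GB DIMM appears as csrow0 and csrow1


def label_entries(mc=0):
    """Return {label: [(mc, csrow, channel), ...]} for the populated slots."""
    return {
        label: [(mc, csrow, channel) for csrow in CSROWS_PER_DIMM]
        for label, channel in SLOT_CHANNEL.items()
    }


def render(entries):
    """Render entries in an ASSUMED '<label>: mc.csrow.channel, ...;' syntax."""
    lines = []
    for label, coords in entries.items():
        joined = ", ".join(".".join(str(part) for part in c) for c in coords)
        lines.append(f"{label}: {joined};")
    return "\n".join(lines)


print(render(label_entries()))
```

Each label ends up with two coordinates, one per csrow, which is why the same label later shows up twice in the unconsolidated --error-count output.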
@csmarshall
Contributor Author

Here's the layout:

% sudo ras-mc-ctl --layout
          +-----------------------------------------------+
          |                      mc0                      |
          |  csrow0   |  csrow1   |  csrow2   |  csrow3   |
----------+-----------------------------------------------+
channel7: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
channel6: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
----------+-----------------------------------------------+
channel5: | 32768 MB  | 32768 MB  |     0 MB  |     0 MB  |
channel4: | 32768 MB  | 32768 MB  |     0 MB  |     0 MB  |
----------+-----------------------------------------------+
channel3: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
channel2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
----------+-----------------------------------------------+
channel1: | 32768 MB  | 32768 MB  |     0 MB  |     0 MB  |
channel0: | 32768 MB  | 32768 MB  |     0 MB  |     0 MB  |
----------+-----------------------------------------------+
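
As a quick sanity check on the layout, the per-channel sizes can be summed and compared against the 4x 64GB configuration. The sketch below is just arithmetic in Python over the numbers copied from the table; `LAYOUT_MB` is a hand-typed transcription, not something ras-mc-ctl emits.

```python
# channel -> sizes in MB for csrow0..csrow3, transcribed from the layout table
LAYOUT_MB = {
    0: [32768, 32768, 0, 0],
    1: [32768, 32768, 0, 0],
    2: [0, 0, 0, 0],
    3: [0, 0, 0, 0],
    4: [32768, 32768, 0, 0],
    5: [32768, 32768, 0, 0],
    6: [0, 0, 0, 0],
    7: [0, 0, 0, 0],
}

# Keep only channels with memory behind them and total their csrows.
populated = {ch: sum(sizes) for ch, sizes in LAYOUT_MB.items() if sum(sizes)}
total_gb = sum(populated.values()) // 1024

for ch, mb in sorted(populated.items()):
    print(f"channel{ch}: {mb // 1024} GB")
print(f"total: {total_gb} GB")
```

Channels 0, 1, 4 and 5 each carry 2 x 32768 MB = 64 GB, so every populated channel corresponds to one physical DIMM split across two csrows, for 256 GB in total.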

And the dmidecode output (serial numbers redacted):

% sudo dmidecode -t memory -t baseboard
# dmidecode 3.6
Getting SMBIOS data from sysfs.
SMBIOS 3.3.0 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: ASRockRack
        Product Name: ROMED8-2T
        Version: 1.03
        Serial Number: XXXXXXXXXXXXXXX
        Asset Tag:
        Features:
                Board is a hosting board
                Board is removable
                Board is replaceable
        Location In Chassis:
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0

Handle 0x000F, DMI type 10, 6 bytes
On Board Device Information
        Type: Video
        Status: Enabled
        Description:    To Be Filled By O.E.M.

Handle 0x0012, DMI type 41, 11 bytes
Onboard Device
        Reference Designation:  Onboard IGD
        Type: Video
        Status: Enabled
        Type Instance: 1
        Bus Address: 0000:00:02.0

Handle 0x0013, DMI type 41, 11 bytes
Onboard Device
        Reference Designation:  Onboard LAN
        Type: Ethernet
        Status: Enabled
        Type Instance: 1
        Bus Address: 0000:00:19.0

Handle 0x0014, DMI type 41, 11 bytes
Onboard Device
        Reference Designation:  Onboard 1394
        Type: Other
        Status: Enabled
        Type Instance: 1
        Bus Address: 0000:03:1c.2

Handle 0x001F, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Multi-bit ECC
        Maximum Capacity: 4 TB
        Error Information Handle: 0x001E
        Number Of Devices: 8

Handle 0x0027, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x0026
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 64 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL A
        Type: DDR4
        Type Detail: Synchronous Registered (Buffered)
        Speed: 3200 MT/s
        Manufacturer: Micron Technology
        Serial Number: XXXXXXXX
        Asset Tag: Not Specified
        Part Number: 36ASF8G72PZ-3G2B2
        Rank: 2
        Configured Memory Speed: 3200 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V
        Memory Technology: DRAM
        Memory Operating Mode Capability: Volatile memory
        Firmware Version: Unknown
        Module Manufacturer ID: Bank 1, Hex 0x2C
        Module Product ID: Unknown
        Memory Subsystem Controller Manufacturer ID: Unknown
        Memory Subsystem Controller Product ID: Unknown
        Non-Volatile Size: None
        Volatile Size: 64 GB
        Cache Size: None
        Logical Size: None

Handle 0x002A, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x0029
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 64 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL B
        Type: DDR4
        Type Detail: Synchronous Registered (Buffered)
        Speed: 3200 MT/s
        Manufacturer: Micron Technology
        Serial Number: XXXXXXXX
        Asset Tag: Not Specified
        Part Number: 36ASF8G72PZ-3G2B2
        Rank: 2
        Configured Memory Speed: 3200 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V
        Memory Technology: DRAM
        Memory Operating Mode Capability: Volatile memory
        Firmware Version: Unknown
        Module Manufacturer ID: Bank 1, Hex 0x2C
        Module Product ID: Unknown
        Memory Subsystem Controller Manufacturer ID: Unknown
        Memory Subsystem Controller Product ID: Unknown
        Non-Volatile Size: None
        Volatile Size: 64 GB
        Cache Size: None
        Logical Size: None

Handle 0x002D, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x002C
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL C
        Type: Unknown
        Type Detail: Unknown

Handle 0x002F, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x002E
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL D
        Type: Unknown
        Type Detail: Unknown

Handle 0x0031, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x0030
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL E
        Type: Unknown
        Type Detail: Unknown

Handle 0x0033, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x0032
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL F
        Type: Unknown
        Type Detail: Unknown

Handle 0x0035, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x0034
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 64 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL G
        Type: DDR4
        Type Detail: Synchronous Registered (Buffered)
        Speed: 3200 MT/s
        Manufacturer: Micron Technology
        Serial Number: XXXXXXXX
        Asset Tag: Not Specified
        Part Number: 36ASF8G72PZ-3G2B2
        Rank: 2
        Configured Memory Speed: 3200 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V
        Memory Technology: DRAM
        Memory Operating Mode Capability: Volatile memory
        Firmware Version: Unknown
        Module Manufacturer ID: Bank 1, Hex 0x2C
        Module Product ID: Unknown
        Memory Subsystem Controller Manufacturer ID: Unknown
        Memory Subsystem Controller Product ID: Unknown
        Non-Volatile Size: None
        Volatile Size: 64 GB
        Cache Size: None
        Logical Size: None

Handle 0x0038, DMI type 17, 92 bytes
Memory Device
        Array Handle: 0x001F
        Error Information Handle: 0x0037
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 64 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL H
        Type: DDR4
        Type Detail: Synchronous Registered (Buffered)
        Speed: 3200 MT/s
        Manufacturer: Micron Technology
        Serial Number: XXXXXXXX
        Asset Tag: Not Specified
        Part Number: 36ASF8G72PZ-3G2B2
        Rank: 2
        Configured Memory Speed: 3200 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V
        Memory Technology: DRAM
        Memory Operating Mode Capability: Volatile memory
        Firmware Version: Unknown
        Module Manufacturer ID: Bank 1, Hex 0x2C
        Module Product ID: Unknown
        Memory Subsystem Controller Manufacturer ID: Unknown
        Memory Subsystem Controller Product ID: Unknown
        Non-Volatile Size: None
        Volatile Size: 64 GB
        Cache Size: None
        Logical Size: None
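
Since every slot reports the same Locator ("DIMM 0"), the Bank Locator is the field that tells the slots apart. A minimal Python sketch of pulling the populated channels out of text like the above follows; `SAMPLE` is a trimmed excerpt rather than the full dump, and `populated_channels` is a hypothetical helper, not part of rasdaemon or dmidecode.

```python
import re

# Trimmed excerpt of the dmidecode output above (three of the eight devices).
SAMPLE = """\
Memory Device
        Size: 64 GB
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL A

Memory Device
        Size: No Module Installed
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL C

Memory Device
        Size: 64 GB
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL G
"""


def populated_channels(text):
    """Return channel letters of Memory Device blocks that have a module."""
    found = []
    for block in re.split(r"\n\s*\n", text):
        if "Memory Device" not in block:
            continue
        # Build a field dict from "Key: value" lines; lines without a value are skipped.
        fields = dict(
            line.strip().split(": ", 1)
            for line in block.splitlines()
            if ": " in line
        )
        size = fields.get("Size")
        if not size or size.startswith("No Module"):
            continue
        match = re.search(r"CHANNEL (\w)$", fields.get("Bank Locator", ""))
        if match:
            found.append(match.group(1))
    return found


print(populated_channels(SAMPLE))
```

On the full output this would be expected to report channels A, B, G and H, matching the A1, B1, G1, H1 slots named in the description.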

When using --error-count, consolidate entries that share the same
DIMM label and sum their CE/UE counts. This provides cleaner output
when physical DIMMs span multiple ranks, showing one line per
physical DIMM slot instead of duplicate labels.

Add --per-rank option to show each rank separately with location
info (csrow/channel) for cases where detailed per-rank error
tracking is needed.

Signed-off-by: Charles Marshall <charles@wozi.com>

Pad labels to consistent width so CE/UE columns align properly.
Account for location suffix length when using --per-rank.

Signed-off-by: Charles Marshall <charles@wozi.com>
@csmarshall
Contributor Author

On systems where each physical DIMM spans multiple ranks (like the ROMED8-2T, where each 64GB DIMM occupies 2 csrows), the --error-count output listed every rank on its own line under the same label, producing confusing duplicate entries:

% ras-mc-ctl --error-count
Label   CE      UE
DDR4_H1 0       0
DDR4_G1 0       0
DDR4_B1 0       0
DDR4_A1 0       0
DDR4_A1 0       0
DDR4_H1 0       0
DDR4_B1 0       0
DDR4_G1 1       0

Since what you actually care about is "which physical DIMM has errors", consolidating by label and summing the counts gives cleaner, actionable output:

Label   CE      UE
DDR4_A1 0       0
DDR4_B1 0       0
DDR4_G1 1       0
DDR4_H1 0       0
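
The consolidation itself is a group-by-label sum. ras-mc-ctl is a Perl script, so the Python below is only an illustration of the logic, not the patch; `PER_RANK` is transcribed from the duplicated output above and `consolidate` is a hypothetical name.

```python
from collections import defaultdict

# (label, CE, UE) rows in the order the unconsolidated output printed them
PER_RANK = [
    ("DDR4_H1", 0, 0),
    ("DDR4_G1", 0, 0),
    ("DDR4_B1", 0, 0),
    ("DDR4_A1", 0, 0),
    ("DDR4_A1", 0, 0),
    ("DDR4_H1", 0, 0),
    ("DDR4_B1", 0, 0),
    ("DDR4_G1", 1, 0),
]


def consolidate(rows):
    """Sum CE/UE counts per label and return the rows sorted by label."""
    totals = defaultdict(lambda: [0, 0])
    for label, ce, ue in rows:
        totals[label][0] += ce
        totals[label][1] += ue
    return [(label, ce, ue) for label, (ce, ue) in sorted(totals.items())]


print("Label\tCE\tUE")
for label, ce, ue in consolidate(PER_RANK):
    print(f"{label}\t{ce}\t{ue}")
```

The single corrected error on one rank of DDR4_G1 survives the merge as one CE against the physical DIMM.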

The --per-rank option preserves the detailed view (with location info) for cases where you need to see which specific rank within a DIMM is reporting errors.

% ras-mc-ctl --per-rank --error-count
Label                           CE      UE
DDR4_A1 (csrow 0 channel 0)     0       0
DDR4_B1 (csrow 0 channel 1)     0       0
DDR4_G1 (csrow 0 channel 4)     0       0
DDR4_H1 (csrow 0 channel 5)     0       0
DDR4_A1 (csrow 1 channel 0)     0       0
DDR4_B1 (csrow 1 channel 1)     0       0
DDR4_G1 (csrow 1 channel 4)     1       0
DDR4_H1 (csrow 1 channel 5)     0       0
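
For the column alignment, the idea is to compute the pad width from the longest label including its location suffix. Here is a rough Python sketch of that idea under the assumption of space padding; the real script's spacing may differ, and `format_per_rank` is a hypothetical name.

```python
# (label, csrow, channel, CE, UE) rows, transcribed from the output above
ROWS = [
    ("DDR4_A1", 0, 0, 0, 0),
    ("DDR4_B1", 0, 1, 0, 0),
    ("DDR4_G1", 0, 4, 0, 0),
    ("DDR4_H1", 0, 5, 0, 0),
    ("DDR4_A1", 1, 0, 0, 0),
    ("DDR4_B1", 1, 1, 0, 0),
    ("DDR4_G1", 1, 4, 1, 0),
    ("DDR4_H1", 1, 5, 0, 0),
]


def format_per_rank(rows):
    """Return output lines with labels padded so CE/UE line up."""
    names = [f"{label} (csrow {csrow} channel {ch})" for label, csrow, ch, _, _ in rows]
    # The width must account for the location suffix, not just the bare label.
    width = max(len(name) for name in names + ["Label"])
    lines = [f"{'Label':<{width}}  {'CE':<4}UE"]
    for name, (_, _, _, ce, ue) in zip(names, rows):
        lines.append(f"{name:<{width}}  {ce:<4}{ue}")
    return lines


print("\n".join(format_per_rank(ROWS)))
```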

@mchehab
Owner

mchehab commented Feb 27, 2026

Merged, thanks!

@mchehab mchehab closed this Feb 27, 2026