* [PATCH v3 0/5] Cleanup and add support for AMD Family 1Ah-based SOCs
@ 2025-09-09 18:53 Avadhut Naik
2025-09-09 18:53 ` [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime Avadhut Naik
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Avadhut Naik @ 2025-09-09 18:53 UTC (permalink / raw)
To: linux-edac; +Cc: bp, yazen.ghannam, linux-kernel, avadhut.naik
This patchset undertakes some cleanups in the amd64_edac module and adds
support for newer AMD's Family 1Ah-based SOCs.
The first patch removes explicit assignment of the ctl_name string and
instead generates it at runtime using scnprintf.
The second patch removes the NUM_CONTROLLERS macro and instead uses the
max_mcs variable to determine the size of chipselect array.
The third patch adds support for new AMD's Family 1Ah-based SOCs.
The fourth patch modifies the legacy EDAC sysfs interface so that the 16
UMCs supported by some of these new SOCs are appropriately visible through
the interface.
The fifth patch initiates deprecation of the legacy EDAC sysfs interface.
Changes in v2:
- Remove extra tabs.
- Update the legacy EDAC sysfs interface.
Changes in v3:
- Remove explicit assignment of the ctl_name string and instead generate
the string at runtime.
- Remove the NUM_CONTROLLERS macro.
- Initiate deprecation of legacy EDAC sysfs interface.
Links:
v1: https://lore.kernel.org/all/20250722192855.3108575-1-avadhut.naik@amd.com/
v2: https://lore.kernel.org/all/20250807201843.4045761-1-avadhut.naik@amd.com/
Avadhut Naik (5):
EDAC/amd64: Generate ctl_name string at runtime
EDAC/amd64: Remove NUM_CONTROLLERS macro
EDAC/amd64: Add support for AMD family 1Ah-based newer models
EDAC/mc_sysfs: Increase legacy channel support to 16
EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface
drivers/edac/Kconfig | 2 +-
drivers/edac/amd64_edac.c | 73 +++++++++++++++---------------------
drivers/edac/amd64_edac.h | 9 +++--
drivers/edac/edac_mc_sysfs.c | 38 +++++++++++++++++++
4 files changed, 75 insertions(+), 47 deletions(-)
base-commit: 501973598d05fdb1d1089fbf3cf40b605b836e16
--
2.43.0
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime
2025-09-09 18:53 [PATCH v3 0/5] Cleanup and add support for AMD Family 1Ah-based SOCs Avadhut Naik
@ 2025-09-09 18:53 ` Avadhut Naik
2025-09-10 14:51 ` Yazen Ghannam
2025-09-09 18:53 ` [PATCH v3 2/5] EDAC/amd64: Remove NUM_CONTROLLERS macro Avadhut Naik
` (3 subsequent siblings)
4 siblings, 1 reply; 18+ messages in thread
From: Avadhut Naik @ 2025-09-09 18:53 UTC (permalink / raw)
To: linux-edac; +Cc: bp, yazen.ghannam, linux-kernel, avadhut.naik
Currently, the ctl_name string is statically assigned based on the family
and model of the SOC when the amd64_edac module is loaded.
The same, however, is not exactly needed as the string can be generated
and assigned at runtime through scnprintf().
Remove all static assignments and generate the string at runtime. Also,
cleanup the switch cases which now become defunct.
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
---
Changes in v3:
Patch introduced.
---
drivers/edac/amd64_edac.c | 44 +++++++--------------------------------
drivers/edac/amd64_edac.h | 4 +++-
2 files changed, 11 insertions(+), 37 deletions(-)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 07f1e9dc1ca7..3989794e4f29 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3767,6 +3767,8 @@ static int per_family_init(struct amd64_pvt *pvt)
pvt->model = boot_cpu_data.x86_model;
pvt->fam = boot_cpu_data.x86;
pvt->max_mcs = 2;
+ scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), "F%02Xh_M%02Xh",
+ pvt->fam, pvt->model);
/*
* Decide on which ops group to use here and do any family/model
@@ -3779,8 +3781,10 @@ static int per_family_init(struct amd64_pvt *pvt)
switch (pvt->fam) {
case 0xf:
- pvt->ctl_name = (pvt->ext_model >= K8_REV_F) ?
- "K8 revF or later" : "K8 revE or earlier";
+ if (pvt->ext_model >= K8_REV_F)
+ strscpy(pvt->ctl_name, "K8 revF or later", sizeof(pvt->ctl_name));
+ else
+ strscpy(pvt->ctl_name, "K8 revE or earlier", sizeof(pvt->ctl_name));
pvt->f1_id = PCI_DEVICE_ID_AMD_K8_NB_ADDRMAP;
pvt->f2_id = PCI_DEVICE_ID_AMD_K8_NB_MEMCTL;
pvt->ops->map_sysaddr_to_csrow = k8_map_sysaddr_to_csrow;
@@ -3788,7 +3792,6 @@ static int per_family_init(struct amd64_pvt *pvt)
break;
case 0x10:
- pvt->ctl_name = "F10h";
pvt->f1_id = PCI_DEVICE_ID_AMD_10H_NB_MAP;
pvt->f2_id = PCI_DEVICE_ID_AMD_10H_NB_DRAM;
pvt->ops->dbam_to_cs = f10_dbam_to_chip_select;
@@ -3797,12 +3800,10 @@ static int per_family_init(struct amd64_pvt *pvt)
case 0x15:
switch (pvt->model) {
case 0x30:
- pvt->ctl_name = "F15h_M30h";
pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F1;
pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F2;
break;
case 0x60:
- pvt->ctl_name = "F15h_M60h";
pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F1;
pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F2;
pvt->ops->dbam_to_cs = f15_m60h_dbam_to_chip_select;
@@ -3811,7 +3812,6 @@ static int per_family_init(struct amd64_pvt *pvt)
/* Richland is only client */
return -ENODEV;
default:
- pvt->ctl_name = "F15h";
pvt->f1_id = PCI_DEVICE_ID_AMD_15H_NB_F1;
pvt->f2_id = PCI_DEVICE_ID_AMD_15H_NB_F2;
pvt->ops->dbam_to_cs = f15_dbam_to_chip_select;
@@ -3822,12 +3822,10 @@ static int per_family_init(struct amd64_pvt *pvt)
case 0x16:
switch (pvt->model) {
case 0x30:
- pvt->ctl_name = "F16h_M30h";
pvt->f1_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F1;
pvt->f2_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F2;
break;
default:
- pvt->ctl_name = "F16h";
pvt->f1_id = PCI_DEVICE_ID_AMD_16H_NB_F1;
pvt->f2_id = PCI_DEVICE_ID_AMD_16H_NB_F2;
break;
@@ -3836,76 +3834,52 @@ static int per_family_init(struct amd64_pvt *pvt)
case 0x17:
switch (pvt->model) {
- case 0x10 ... 0x2f:
- pvt->ctl_name = "F17h_M10h";
- break;
case 0x30 ... 0x3f:
- pvt->ctl_name = "F17h_M30h";
pvt->max_mcs = 8;
break;
- case 0x60 ... 0x6f:
- pvt->ctl_name = "F17h_M60h";
- break;
- case 0x70 ... 0x7f:
- pvt->ctl_name = "F17h_M70h";
- break;
default:
- pvt->ctl_name = "F17h";
break;
}
break;
case 0x18:
- pvt->ctl_name = "F18h";
break;
case 0x19:
switch (pvt->model) {
case 0x00 ... 0x0f:
- pvt->ctl_name = "F19h";
pvt->max_mcs = 8;
break;
case 0x10 ... 0x1f:
- pvt->ctl_name = "F19h_M10h";
pvt->max_mcs = 12;
pvt->flags.zn_regs_v2 = 1;
break;
- case 0x20 ... 0x2f:
- pvt->ctl_name = "F19h_M20h";
- break;
case 0x30 ... 0x3f:
if (pvt->F3->device == PCI_DEVICE_ID_AMD_MI200_DF_F3) {
- pvt->ctl_name = "MI200";
+ memset(pvt->ctl_name, 0, sizeof(pvt->ctl_name));
+ strscpy(pvt->ctl_name, "MI200", sizeof(pvt->ctl_name));
pvt->max_mcs = 4;
pvt->dram_type = MEM_HBM2;
pvt->gpu_umc_base = 0x50000;
pvt->ops = &gpu_ops;
} else {
- pvt->ctl_name = "F19h_M30h";
pvt->max_mcs = 8;
}
break;
- case 0x50 ... 0x5f:
- pvt->ctl_name = "F19h_M50h";
- break;
case 0x60 ... 0x6f:
- pvt->ctl_name = "F19h_M60h";
pvt->flags.zn_regs_v2 = 1;
break;
case 0x70 ... 0x7f:
- pvt->ctl_name = "F19h_M70h";
pvt->max_mcs = 4;
pvt->flags.zn_regs_v2 = 1;
break;
case 0x90 ... 0x9f:
- pvt->ctl_name = "F19h_M90h";
pvt->max_mcs = 4;
pvt->dram_type = MEM_HBM3;
pvt->gpu_umc_base = 0x90000;
pvt->ops = &gpu_ops;
break;
case 0xa0 ... 0xaf:
- pvt->ctl_name = "F19h_MA0h";
pvt->max_mcs = 12;
pvt->flags.zn_regs_v2 = 1;
break;
@@ -3915,12 +3889,10 @@ static int per_family_init(struct amd64_pvt *pvt)
case 0x1A:
switch (pvt->model) {
case 0x00 ... 0x1f:
- pvt->ctl_name = "F1Ah";
pvt->max_mcs = 12;
pvt->flags.zn_regs_v2 = 1;
break;
case 0x40 ... 0x4f:
- pvt->ctl_name = "F1Ah_M40h";
pvt->flags.zn_regs_v2 = 1;
break;
}
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 17228d07de4c..56999ed3ae56 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -101,6 +101,8 @@
#define ON true
#define OFF false
+#define MAX_CTL_NAMELEN 20
+
/*
* PCI-defined configuration space registers
*/
@@ -362,7 +364,7 @@ struct amd64_pvt {
/* x4, x8, or x16 syndromes in use */
u8 ecc_sym_sz;
- const char *ctl_name;
+ char ctl_name[MAX_CTL_NAMELEN];
u16 f1_id, f2_id;
/* Maximum number of memory controllers per die/node. */
u8 max_mcs;
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 2/5] EDAC/amd64: Remove NUM_CONTROLLERS macro
2025-09-09 18:53 [PATCH v3 0/5] Cleanup and add support for AMD Family 1Ah-based SOCs Avadhut Naik
2025-09-09 18:53 ` [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime Avadhut Naik
@ 2025-09-09 18:53 ` Avadhut Naik
2025-09-10 15:05 ` Yazen Ghannam
2025-09-09 18:53 ` [PATCH v3 3/5] EDAC/amd64: Add support for AMD family 1Ah-based newer models Avadhut Naik
` (2 subsequent siblings)
4 siblings, 1 reply; 18+ messages in thread
From: Avadhut Naik @ 2025-09-09 18:53 UTC (permalink / raw)
To: linux-edac; +Cc: bp, yazen.ghannam, linux-kernel, avadhut.naik
Currently, the NUM_CONTROLLERS macro is only used to statically allocate
the csels array of struct chip_select in struct amd64_pvt.
The size of this array, however, will never exceed the number of UMCs on
the SOC. Since, max_mcs variable in struct amd64_pvt already stores the
number of UMCs on the SOC, the macro can be removed and the static array
can be dynamically allocated instead.
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
---
Changes in v3:
Patch introduced.
---
drivers/edac/amd64_edac.c | 19 +++++++++++++------
drivers/edac/amd64_edac.h | 5 ++---
2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 3989794e4f29..0fade110c3fb 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -4000,30 +4000,34 @@ static int probe_one_instance(unsigned int nid)
if (ret < 0)
goto err_enable;
+ pvt->csels = kcalloc(pvt->max_mcs, sizeof(*pvt->csels), GFP_KERNEL);
+ if (!pvt->csels)
+ goto err_enable;
+
ret = pvt->ops->hw_info_get(pvt);
if (ret < 0)
- goto err_enable;
+ goto err_csels;
ret = 0;
if (!instance_has_memory(pvt)) {
amd64_info("Node %d: No DIMMs detected.\n", nid);
- goto err_enable;
+ goto err_csels;
}
if (!pvt->ops->ecc_enabled(pvt)) {
ret = -ENODEV;
if (!ecc_enable_override)
- goto err_enable;
+ goto err_csels;
if (boot_cpu_data.x86 >= 0x17) {
amd64_warn("Forcing ECC on is not recommended on newer systems. Please enable ECC in BIOS.");
- goto err_enable;
+ goto err_csels;
} else
amd64_warn("Forcing ECC on!\n");
if (!enable_ecc_error_reporting(s, nid, F3))
- goto err_enable;
+ goto err_csels;
}
ret = init_one_instance(pvt);
@@ -4033,7 +4037,7 @@ static int probe_one_instance(unsigned int nid)
if (boot_cpu_data.x86 < 0x17)
restore_ecc_error_reporting(s, nid, F3);
- goto err_enable;
+ goto err_csels;
}
amd64_info("%s detected (node %d).\n", pvt->ctl_name, pvt->mc_node_id);
@@ -4043,6 +4047,8 @@ static int probe_one_instance(unsigned int nid)
return ret;
+err_csels:
+ kfree(pvt->csels);
err_enable:
hw_info_put(pvt);
kfree(pvt);
@@ -4077,6 +4083,7 @@ static void remove_one_instance(unsigned int nid)
/* Free the EDAC CORE resources */
mci->pvt_info = NULL;
+ kfree(pvt->csels);
hw_info_put(pvt);
kfree(pvt);
edac_mc_free(mci);
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 56999ed3ae56..39d30255c767 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -96,7 +96,6 @@
/* Hardware limit on ChipSelect rows per MC and processors per system */
#define NUM_CHIPSELECTS 8
#define DRAM_RANGES 8
-#define NUM_CONTROLLERS 12
#define ON true
#define OFF false
@@ -347,8 +346,8 @@ struct amd64_pvt {
u32 dbam0; /* DRAM Base Address Mapping reg for DCT0 */
u32 dbam1; /* DRAM Base Address Mapping reg for DCT1 */
- /* one for each DCT/UMC */
- struct chip_select csels[NUM_CONTROLLERS];
+ /* Allocate one for each DCT/UMC */
+ struct chip_select *csels;
/* DRAM base and limit pairs F1x[78,70,68,60,58,50,48,40] */
struct dram_range ranges[DRAM_RANGES];
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 3/5] EDAC/amd64: Add support for AMD family 1Ah-based newer models
2025-09-09 18:53 [PATCH v3 0/5] Cleanup and add support for AMD Family 1Ah-based SOCs Avadhut Naik
2025-09-09 18:53 ` [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime Avadhut Naik
2025-09-09 18:53 ` [PATCH v3 2/5] EDAC/amd64: Remove NUM_CONTROLLERS macro Avadhut Naik
@ 2025-09-09 18:53 ` Avadhut Naik
2025-09-10 15:13 ` Yazen Ghannam
2025-09-09 18:53 ` [PATCH v3 4/5] EDAC/mc_sysfs: Increase legacy channel support to 16 Avadhut Naik
2025-09-09 18:53 ` [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface Avadhut Naik
4 siblings, 1 reply; 18+ messages in thread
From: Avadhut Naik @ 2025-09-09 18:53 UTC (permalink / raw)
To: linux-edac; +Cc: bp, yazen.ghannam, linux-kernel, avadhut.naik
Add support for family 1Ah-based models 50h-57h, 90h-9Fh, A0h-AFh, and
C0h-C7h.
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
---
Changes in v2:
1. Remove extra tabs
Changes in v3:
1. Since ctl_name string is now assigned at runtime, group similar models
together.
---
drivers/edac/amd64_edac.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 0fade110c3fb..804d3c4c3f14 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3895,6 +3895,16 @@ static int per_family_init(struct amd64_pvt *pvt)
case 0x40 ... 0x4f:
pvt->flags.zn_regs_v2 = 1;
break;
+ case 0x50 ... 0x57:
+ case 0xc0 ... 0xc7:
+ pvt->max_mcs = 16;
+ pvt->flags.zn_regs_v2 = 1;
+ break;
+ case 0x90 ... 0x9f:
+ case 0xa0 ... 0xaf:
+ pvt->max_mcs = 8;
+ pvt->flags.zn_regs_v2 = 1;
+ break;
}
break;
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 4/5] EDAC/mc_sysfs: Increase legacy channel support to 16
2025-09-09 18:53 [PATCH v3 0/5] Cleanup and add support for AMD Family 1Ah-based SOCs Avadhut Naik
` (2 preceding siblings ...)
2025-09-09 18:53 ` [PATCH v3 3/5] EDAC/amd64: Add support for AMD family 1Ah-based newer models Avadhut Naik
@ 2025-09-09 18:53 ` Avadhut Naik
2025-09-10 15:17 ` Yazen Ghannam
2025-09-09 18:53 ` [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface Avadhut Naik
4 siblings, 1 reply; 18+ messages in thread
From: Avadhut Naik @ 2025-09-09 18:53 UTC (permalink / raw)
To: linux-edac; +Cc: bp, yazen.ghannam, linux-kernel, avadhut.naik
Newer AMD systems can support up to 16 channels per EDAC "mc" device.
These are detected by the EDAC module running on the device, and the
current EDAC interface is appropriately enumerated.
The legacy EDAC sysfs interface however, provides device attributes for
channels 0 through 11 only. Consequently, the last four channels, 12
through 15, will not be enumerated and will not be visible through the
legacy sysfs interface.
Add additional device attributes to ensure that all 16 channels, if
present, are enumerated by and visible through the legacy EDAC sysfs
interface.
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
---
Changes in v2:
Patch introduced.
Changes in v3:
No changes.
---
drivers/edac/edac_mc_sysfs.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 0f338adf7d93..8689631f1905 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -305,6 +305,14 @@ DEVICE_CHANNEL(ch10_dimm_label, S_IRUGO | S_IWUSR,
channel_dimm_label_show, channel_dimm_label_store, 10);
DEVICE_CHANNEL(ch11_dimm_label, S_IRUGO | S_IWUSR,
channel_dimm_label_show, channel_dimm_label_store, 11);
+DEVICE_CHANNEL(ch12_dimm_label, S_IRUGO | S_IWUSR,
+ channel_dimm_label_show, channel_dimm_label_store, 12);
+DEVICE_CHANNEL(ch13_dimm_label, S_IRUGO | S_IWUSR,
+ channel_dimm_label_show, channel_dimm_label_store, 13);
+DEVICE_CHANNEL(ch14_dimm_label, S_IRUGO | S_IWUSR,
+ channel_dimm_label_show, channel_dimm_label_store, 14);
+DEVICE_CHANNEL(ch15_dimm_label, S_IRUGO | S_IWUSR,
+ channel_dimm_label_show, channel_dimm_label_store, 15);
/* Total possible dynamic DIMM Label attribute file table */
static struct attribute *dynamic_csrow_dimm_attr[] = {
@@ -320,6 +328,10 @@ static struct attribute *dynamic_csrow_dimm_attr[] = {
&dev_attr_legacy_ch9_dimm_label.attr.attr,
&dev_attr_legacy_ch10_dimm_label.attr.attr,
&dev_attr_legacy_ch11_dimm_label.attr.attr,
+ &dev_attr_legacy_ch12_dimm_label.attr.attr,
+ &dev_attr_legacy_ch13_dimm_label.attr.attr,
+ &dev_attr_legacy_ch14_dimm_label.attr.attr,
+ &dev_attr_legacy_ch15_dimm_label.attr.attr,
NULL
};
@@ -348,6 +360,14 @@ DEVICE_CHANNEL(ch10_ce_count, S_IRUGO,
channel_ce_count_show, NULL, 10);
DEVICE_CHANNEL(ch11_ce_count, S_IRUGO,
channel_ce_count_show, NULL, 11);
+DEVICE_CHANNEL(ch12_ce_count, S_IRUGO,
+ channel_ce_count_show, NULL, 12);
+DEVICE_CHANNEL(ch13_ce_count, S_IRUGO,
+ channel_ce_count_show, NULL, 13);
+DEVICE_CHANNEL(ch14_ce_count, S_IRUGO,
+ channel_ce_count_show, NULL, 14);
+DEVICE_CHANNEL(ch15_ce_count, S_IRUGO,
+ channel_ce_count_show, NULL, 15);
/* Total possible dynamic ce_count attribute file table */
static struct attribute *dynamic_csrow_ce_count_attr[] = {
@@ -363,6 +383,10 @@ static struct attribute *dynamic_csrow_ce_count_attr[] = {
&dev_attr_legacy_ch9_ce_count.attr.attr,
&dev_attr_legacy_ch10_ce_count.attr.attr,
&dev_attr_legacy_ch11_ce_count.attr.attr,
+ &dev_attr_legacy_ch12_ce_count.attr.attr,
+ &dev_attr_legacy_ch13_ce_count.attr.attr,
+ &dev_attr_legacy_ch14_ce_count.attr.attr,
+ &dev_attr_legacy_ch15_ce_count.attr.attr,
NULL
};
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface
2025-09-09 18:53 [PATCH v3 0/5] Cleanup and add support for AMD Family 1Ah-based SOCs Avadhut Naik
` (3 preceding siblings ...)
2025-09-09 18:53 ` [PATCH v3 4/5] EDAC/mc_sysfs: Increase legacy channel support to 16 Avadhut Naik
@ 2025-09-09 18:53 ` Avadhut Naik
2025-09-10 15:24 ` Yazen Ghannam
4 siblings, 1 reply; 18+ messages in thread
From: Avadhut Naik @ 2025-09-09 18:53 UTC (permalink / raw)
To: linux-edac; +Cc: bp, yazen.ghannam, linux-kernel, avadhut.naik
The legacy sysfs EDAC interface has been made obsolete more than a decade
ago through the introduction of a new per-DIMM interface.
The legacy interface however, hasn't been removed till date.
Begin deprecating it so that it can eventually be removed two releases
later.
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
---
Changes in v3:
Patch introduced.
---
drivers/edac/Kconfig | 2 +-
drivers/edac/edac_mc_sysfs.c | 14 ++++++++++++++
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index b824472208c4..19470f4efee7 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -25,7 +25,7 @@ if EDAC
config EDAC_LEGACY_SYSFS
bool "EDAC legacy sysfs"
- default y
+ default n
help
Enable the compatibility sysfs nodes.
Use 'Y' if your edac utilities aren't ported to work with the newer
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 8689631f1905..3840eef942f8 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -144,6 +144,8 @@ struct dev_ch_attribute {
static ssize_t csrow_ue_count_show(struct device *dev,
struct device_attribute *mattr, char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
+
struct csrow_info *csrow = to_csrow(dev);
return sysfs_emit(data, "%u\n", csrow->ue_count);
@@ -152,6 +154,8 @@ static ssize_t csrow_ue_count_show(struct device *dev,
static ssize_t csrow_ce_count_show(struct device *dev,
struct device_attribute *mattr, char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
+
struct csrow_info *csrow = to_csrow(dev);
return sysfs_emit(data, "%u\n", csrow->ce_count);
@@ -160,6 +164,8 @@ static ssize_t csrow_ce_count_show(struct device *dev,
static ssize_t csrow_size_show(struct device *dev,
struct device_attribute *mattr, char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
+
struct csrow_info *csrow = to_csrow(dev);
int i;
u32 nr_pages = 0;
@@ -172,6 +178,8 @@ static ssize_t csrow_size_show(struct device *dev,
static ssize_t csrow_mem_type_show(struct device *dev,
struct device_attribute *mattr, char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
+
struct csrow_info *csrow = to_csrow(dev);
return sysfs_emit(data, "%s\n", edac_mem_types[csrow->channels[0]->dimm->mtype]);
@@ -180,6 +188,8 @@ static ssize_t csrow_mem_type_show(struct device *dev,
static ssize_t csrow_dev_type_show(struct device *dev,
struct device_attribute *mattr, char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
+
struct csrow_info *csrow = to_csrow(dev);
return sysfs_emit(data, "%s\n", dev_types[csrow->channels[0]->dimm->dtype]);
@@ -189,6 +199,8 @@ static ssize_t csrow_edac_mode_show(struct device *dev,
struct device_attribute *mattr,
char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
+
struct csrow_info *csrow = to_csrow(dev);
return sysfs_emit(data, "%s\n", edac_caps[csrow->channels[0]->dimm->edac_mode]);
@@ -199,6 +211,7 @@ static ssize_t channel_dimm_label_show(struct device *dev,
struct device_attribute *mattr,
char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
struct csrow_info *csrow = to_csrow(dev);
unsigned int chan = to_channel(mattr);
struct rank_info *rank = csrow->channels[chan];
@@ -238,6 +251,7 @@ static ssize_t channel_dimm_label_store(struct device *dev,
static ssize_t channel_ce_count_show(struct device *dev,
struct device_attribute *mattr, char *data)
{
+ pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
struct csrow_info *csrow = to_csrow(dev);
unsigned int chan = to_channel(mattr);
struct rank_info *rank = csrow->channels[chan];
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime
2025-09-09 18:53 ` [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime Avadhut Naik
@ 2025-09-10 14:51 ` Yazen Ghannam
2025-09-10 15:53 ` Naik, Avadhut
0 siblings, 1 reply; 18+ messages in thread
From: Yazen Ghannam @ 2025-09-10 14:51 UTC (permalink / raw)
To: Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On Tue, Sep 09, 2025 at 06:53:10PM +0000, Avadhut Naik wrote:
> Currently, the ctl_name string is statically assigned based on the family
> and model of the SOC when the amd64_edac module is loaded.
>
> The same, however, is not exactly needed as the string can be generated
> and assigned at runtime through scnprintf().
>
> Remove all static assignments and generate the string at runtime. Also,
> cleanup the switch cases which now become defunct.
>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> ---
> Changes in v3:
> Patch introduced.
> ---
> drivers/edac/amd64_edac.c | 44 +++++++--------------------------------
> drivers/edac/amd64_edac.h | 4 +++-
> 2 files changed, 11 insertions(+), 37 deletions(-)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 07f1e9dc1ca7..3989794e4f29 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -3767,6 +3767,8 @@ static int per_family_init(struct amd64_pvt *pvt)
> pvt->model = boot_cpu_data.x86_model;
> pvt->fam = boot_cpu_data.x86;
> pvt->max_mcs = 2;
Newline here, please.
> + scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), "F%02Xh_M%02Xh",
> + pvt->fam, pvt->model);
There are a couple of special cases below.
So I think it may be better to move this part to the end...
>
> /*
> * Decide on which ops group to use here and do any family/model
> @@ -3779,8 +3781,10 @@ static int per_family_init(struct amd64_pvt *pvt)
>
> switch (pvt->fam) {
> case 0xf:
> - pvt->ctl_name = (pvt->ext_model >= K8_REV_F) ?
> - "K8 revF or later" : "K8 revE or earlier";
> + if (pvt->ext_model >= K8_REV_F)
> + strscpy(pvt->ctl_name, "K8 revF or later", sizeof(pvt->ctl_name));
> + else
> + strscpy(pvt->ctl_name, "K8 revE or earlier", sizeof(pvt->ctl_name));
Maybe save this special case to a temporary "char *name".
> pvt->f1_id = PCI_DEVICE_ID_AMD_K8_NB_ADDRMAP;
> pvt->f2_id = PCI_DEVICE_ID_AMD_K8_NB_MEMCTL;
> pvt->ops->map_sysaddr_to_csrow = k8_map_sysaddr_to_csrow;
> @@ -3788,7 +3792,6 @@ static int per_family_init(struct amd64_pvt *pvt)
> break;
>
> case 0x10:
> - pvt->ctl_name = "F10h";
> pvt->f1_id = PCI_DEVICE_ID_AMD_10H_NB_MAP;
> pvt->f2_id = PCI_DEVICE_ID_AMD_10H_NB_DRAM;
> pvt->ops->dbam_to_cs = f10_dbam_to_chip_select;
> @@ -3797,12 +3800,10 @@ static int per_family_init(struct amd64_pvt *pvt)
> case 0x15:
> switch (pvt->model) {
> case 0x30:
> - pvt->ctl_name = "F15h_M30h";
> pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F1;
> pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F2;
> break;
> case 0x60:
> - pvt->ctl_name = "F15h_M60h";
> pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F1;
> pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F2;
> pvt->ops->dbam_to_cs = f15_m60h_dbam_to_chip_select;
> @@ -3811,7 +3812,6 @@ static int per_family_init(struct amd64_pvt *pvt)
> /* Richland is only client */
> return -ENODEV;
> default:
> - pvt->ctl_name = "F15h";
> pvt->f1_id = PCI_DEVICE_ID_AMD_15H_NB_F1;
> pvt->f2_id = PCI_DEVICE_ID_AMD_15H_NB_F2;
> pvt->ops->dbam_to_cs = f15_dbam_to_chip_select;
> @@ -3822,12 +3822,10 @@ static int per_family_init(struct amd64_pvt *pvt)
> case 0x16:
> switch (pvt->model) {
> case 0x30:
> - pvt->ctl_name = "F16h_M30h";
> pvt->f1_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F1;
> pvt->f2_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F2;
> break;
> default:
> - pvt->ctl_name = "F16h";
> pvt->f1_id = PCI_DEVICE_ID_AMD_16H_NB_F1;
> pvt->f2_id = PCI_DEVICE_ID_AMD_16H_NB_F2;
> break;
> @@ -3836,76 +3834,52 @@ static int per_family_init(struct amd64_pvt *pvt)
>
> case 0x17:
> switch (pvt->model) {
> - case 0x10 ... 0x2f:
> - pvt->ctl_name = "F17h_M10h";
> - break;
> case 0x30 ... 0x3f:
> - pvt->ctl_name = "F17h_M30h";
> pvt->max_mcs = 8;
> break;
> - case 0x60 ... 0x6f:
> - pvt->ctl_name = "F17h_M60h";
> - break;
> - case 0x70 ... 0x7f:
> - pvt->ctl_name = "F17h_M70h";
> - break;
> default:
> - pvt->ctl_name = "F17h";
> break;
> }
> break;
>
> case 0x18:
> - pvt->ctl_name = "F18h";
> break;
>
> case 0x19:
> switch (pvt->model) {
> case 0x00 ... 0x0f:
> - pvt->ctl_name = "F19h";
> pvt->max_mcs = 8;
> break;
> case 0x10 ... 0x1f:
> - pvt->ctl_name = "F19h_M10h";
> pvt->max_mcs = 12;
> pvt->flags.zn_regs_v2 = 1;
> break;
> - case 0x20 ... 0x2f:
> - pvt->ctl_name = "F19h_M20h";
> - break;
> case 0x30 ... 0x3f:
> if (pvt->F3->device == PCI_DEVICE_ID_AMD_MI200_DF_F3) {
> - pvt->ctl_name = "MI200";
> + memset(pvt->ctl_name, 0, sizeof(pvt->ctl_name));
> + strscpy(pvt->ctl_name, "MI200", sizeof(pvt->ctl_name));
And this to a temp name too.
> pvt->max_mcs = 4;
> pvt->dram_type = MEM_HBM2;
> pvt->gpu_umc_base = 0x50000;
> pvt->ops = &gpu_ops;
> } else {
> - pvt->ctl_name = "F19h_M30h";
> pvt->max_mcs = 8;
> }
> break;
> - case 0x50 ... 0x5f:
> - pvt->ctl_name = "F19h_M50h";
> - break;
> case 0x60 ... 0x6f:
> - pvt->ctl_name = "F19h_M60h";
> pvt->flags.zn_regs_v2 = 1;
> break;
> case 0x70 ... 0x7f:
> - pvt->ctl_name = "F19h_M70h";
> pvt->max_mcs = 4;
> pvt->flags.zn_regs_v2 = 1;
> break;
> case 0x90 ... 0x9f:
> - pvt->ctl_name = "F19h_M90h";
> pvt->max_mcs = 4;
> pvt->dram_type = MEM_HBM3;
> pvt->gpu_umc_base = 0x90000;
> pvt->ops = &gpu_ops;
> break;
> case 0xa0 ... 0xaf:
> - pvt->ctl_name = "F19h_MA0h";
> pvt->max_mcs = 12;
> pvt->flags.zn_regs_v2 = 1;
> break;
> @@ -3915,12 +3889,10 @@ static int per_family_init(struct amd64_pvt *pvt)
> case 0x1A:
> switch (pvt->model) {
> case 0x00 ... 0x1f:
> - pvt->ctl_name = "F1Ah";
> pvt->max_mcs = 12;
> pvt->flags.zn_regs_v2 = 1;
> break;
> case 0x40 ... 0x4f:
> - pvt->ctl_name = "F1Ah_M40h";
> pvt->flags.zn_regs_v2 = 1;
> break;
> }
...here.
Then check if the name was set already (by the special cases). If not,
then set the generic family/model name.
For example:
char *tmp_name = NULL;
if K8:
if F:
tmp_name = "K8 F";
else:
tmp_name = "K8 E";
if MI200:
tmp_name = "MI200";
if (tmp_name)
scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), tmp_name);
else
scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), "F%02Xh_M%02Xh",
pvt->fam, pvt->model);
> diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
> index 17228d07de4c..56999ed3ae56 100644
> --- a/drivers/edac/amd64_edac.h
> +++ b/drivers/edac/amd64_edac.h
> @@ -101,6 +101,8 @@
> #define ON true
> #define OFF false
>
> +#define MAX_CTL_NAMELEN 20
> +
> /*
> * PCI-defined configuration space registers
> */
> @@ -362,7 +364,7 @@ struct amd64_pvt {
> /* x4, x8, or x16 syndromes in use */
> u8 ecc_sym_sz;
>
> - const char *ctl_name;
> + char ctl_name[MAX_CTL_NAMELEN];
> u16 f1_id, f2_id;
> /* Maximum number of memory controllers per die/node. */
> u8 max_mcs;
> --
Thanks,
Yazen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/5] EDAC/amd64: Remove NUM_CONTROLLERS macro
2025-09-09 18:53 ` [PATCH v3 2/5] EDAC/amd64: Remove NUM_CONTROLLERS macro Avadhut Naik
@ 2025-09-10 15:05 ` Yazen Ghannam
2025-09-10 15:48 ` Naik, Avadhut
0 siblings, 1 reply; 18+ messages in thread
From: Yazen Ghannam @ 2025-09-10 15:05 UTC (permalink / raw)
To: Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On Tue, Sep 09, 2025 at 06:53:11PM +0000, Avadhut Naik wrote:
> Currently, the NUM_CONTROLLERS macro is only used to statically allocate
> the csels array of struct chip_select in struct amd64_pvt.
>
> The size of this array, however, will never exceed the number of UMCs on
> the SOC. Since, max_mcs variable in struct amd64_pvt already stores the
> number of UMCs on the SOC, the macro can be removed and the static array
> can be dynamically allocated instead.
You should note that max_mcs and the csels array are also used in legacy
systems with 'DCTs'.
Those had a max of 2 controllers which we already set in
per_family_init() as the global default. So the legacy systems are
covered by this change too.
Without noting this, it seems like that case may be overlooked.
>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> ---
> Changes in v3:
> Patch introduced.
> ---
> drivers/edac/amd64_edac.c | 19 +++++++++++++------
> drivers/edac/amd64_edac.h | 5 ++---
> 2 files changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 3989794e4f29..0fade110c3fb 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -4000,30 +4000,34 @@ static int probe_one_instance(unsigned int nid)
> if (ret < 0)
> goto err_enable;
>
> + pvt->csels = kcalloc(pvt->max_mcs, sizeof(*pvt->csels), GFP_KERNEL);
> + if (!pvt->csels)
> + goto err_enable;
> +
You can move this allocation to the end of per_family_init(). That's
where we determine 'max_mcs'.
If you do so, then the 'goto' changes below are not needed.
Another option is to put it in hw_info_get() like we do for UMCs. But
that means adding the allocation to three different helper functions
rather than just the one with per_family_init().
> ret = pvt->ops->hw_info_get(pvt);
> if (ret < 0)
> - goto err_enable;
> + goto err_csels;
>
> ret = 0;
> if (!instance_has_memory(pvt)) {
> amd64_info("Node %d: No DIMMs detected.\n", nid);
> - goto err_enable;
> + goto err_csels;
> }
>
> if (!pvt->ops->ecc_enabled(pvt)) {
> ret = -ENODEV;
>
> if (!ecc_enable_override)
> - goto err_enable;
> + goto err_csels;
>
> if (boot_cpu_data.x86 >= 0x17) {
> amd64_warn("Forcing ECC on is not recommended on newer systems. Please enable ECC in BIOS.");
> - goto err_enable;
> + goto err_csels;
> } else
> amd64_warn("Forcing ECC on!\n");
>
> if (!enable_ecc_error_reporting(s, nid, F3))
> - goto err_enable;
> + goto err_csels;
> }
>
> ret = init_one_instance(pvt);
> @@ -4033,7 +4037,7 @@ static int probe_one_instance(unsigned int nid)
> if (boot_cpu_data.x86 < 0x17)
> restore_ecc_error_reporting(s, nid, F3);
>
> - goto err_enable;
> + goto err_csels;
> }
>
> amd64_info("%s detected (node %d).\n", pvt->ctl_name, pvt->mc_node_id);
> @@ -4043,6 +4047,8 @@ static int probe_one_instance(unsigned int nid)
>
> return ret;
>
> +err_csels:
> + kfree(pvt->csels);
This can go in hw_info_put(). We have kfree(pvt->umc) there already.
> err_enable:
> hw_info_put(pvt);
> kfree(pvt);
> @@ -4077,6 +4083,7 @@ static void remove_one_instance(unsigned int nid)
> /* Free the EDAC CORE resources */
> mci->pvt_info = NULL;
>
> + kfree(pvt->csels);
> hw_info_put(pvt);
> kfree(pvt);
> edac_mc_free(mci);
> diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
> index 56999ed3ae56..39d30255c767 100644
> --- a/drivers/edac/amd64_edac.h
> +++ b/drivers/edac/amd64_edac.h
> @@ -96,7 +96,6 @@
> /* Hardware limit on ChipSelect rows per MC and processors per system */
> #define NUM_CHIPSELECTS 8
> #define DRAM_RANGES 8
> -#define NUM_CONTROLLERS 12
>
> #define ON true
> #define OFF false
> @@ -347,8 +346,8 @@ struct amd64_pvt {
> u32 dbam0; /* DRAM Base Address Mapping reg for DCT0 */
> u32 dbam1; /* DRAM Base Address Mapping reg for DCT1 */
>
> - /* one for each DCT/UMC */
> - struct chip_select csels[NUM_CONTROLLERS];
> + /* Allocate one for each DCT/UMC */
> + struct chip_select *csels;
>
> /* DRAM base and limit pairs F1x[78,70,68,60,58,50,48,40] */
> struct dram_range ranges[DRAM_RANGES];
> --
Thanks,
Yazen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/5] EDAC/amd64: Add support for AMD family 1Ah-based newer models
2025-09-09 18:53 ` [PATCH v3 3/5] EDAC/amd64: Add support for AMD family 1Ah-based newer models Avadhut Naik
@ 2025-09-10 15:13 ` Yazen Ghannam
2025-09-10 16:08 ` Naik, Avadhut
0 siblings, 1 reply; 18+ messages in thread
From: Yazen Ghannam @ 2025-09-10 15:13 UTC (permalink / raw)
To: Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On Tue, Sep 09, 2025 at 06:53:12PM +0000, Avadhut Naik wrote:
> Add support for family 1Ah-based models 50h-57h, 90h-9Fh, A0h-AFh, and
> C0h-C7h.
>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> ---
> Changes in v2:
> 1. Remove extra tabs
>
> Changes in v3:
> 1. Since ctl_name string is now assigned at runtime, group similar models
> together.
> ---
> drivers/edac/amd64_edac.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 0fade110c3fb..804d3c4c3f14 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -3895,6 +3895,16 @@ static int per_family_init(struct amd64_pvt *pvt)
> case 0x40 ... 0x4f:
> pvt->flags.zn_regs_v2 = 1;
> break;
> + case 0x50 ... 0x57:
> + case 0xc0 ... 0xc7:
> + pvt->max_mcs = 16;
> + pvt->flags.zn_regs_v2 = 1;
> + break;
> + case 0x90 ... 0x9f:
> + case 0xa0 ... 0xaf:
> + pvt->max_mcs = 8;
> + pvt->flags.zn_regs_v2 = 1;
All of Family 1Ah uses 'zn_regs_v2', so this can go before the models
cases.
The register changes happened in Family 19h, so there are a mix of
models there.
We could be so bold to say 'zn_regs_v2 = (family >= 0x1A)' up top.
Family 19h would not set this, but then the individual model cases can
fix it up.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 4/5] EDAC/mc_sysfs: Increase legacy channel support to 16
2025-09-09 18:53 ` [PATCH v3 4/5] EDAC/mc_sysfs: Increase legacy channel support to 16 Avadhut Naik
@ 2025-09-10 15:17 ` Yazen Ghannam
2025-09-10 15:29 ` Naik, Avadhut
0 siblings, 1 reply; 18+ messages in thread
From: Yazen Ghannam @ 2025-09-10 15:17 UTC (permalink / raw)
To: Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On Tue, Sep 09, 2025 at 06:53:13PM +0000, Avadhut Naik wrote:
> Newer AMD systems can support up to 16 channels per EDAC "mc" device.
> These are detected by the EDAC module running on the device, and the
> current EDAC interface is appropriately enumerated.
>
> The legacy EDAC sysfs interface however, provides device attributes for
> channels 0 through 11 only. Consequently, the last four channels, 12
> through 15, will not be enumerated and will not be visible through the
> legacy sysfs interface.
>
> Add additional device attributes to ensure that all 16 channels, if
> present, are enumerated by and visible through the legacy EDAC sysfs
> interface.
>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> ---
> Changes in v2:
> Patch introduced.
>
> Changes in v3:
> No changes.
> ---
> drivers/edac/edac_mc_sysfs.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 0f338adf7d93..8689631f1905 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -305,6 +305,14 @@ DEVICE_CHANNEL(ch10_dimm_label, S_IRUGO | S_IWUSR,
> channel_dimm_label_show, channel_dimm_label_store, 10);
> DEVICE_CHANNEL(ch11_dimm_label, S_IRUGO | S_IWUSR,
> channel_dimm_label_show, channel_dimm_label_store, 11);
> +DEVICE_CHANNEL(ch12_dimm_label, S_IRUGO | S_IWUSR,
> + channel_dimm_label_show, channel_dimm_label_store, 12);
> +DEVICE_CHANNEL(ch13_dimm_label, S_IRUGO | S_IWUSR,
> + channel_dimm_label_show, channel_dimm_label_store, 13);
> +DEVICE_CHANNEL(ch14_dimm_label, S_IRUGO | S_IWUSR,
> + channel_dimm_label_show, channel_dimm_label_store, 14);
> +DEVICE_CHANNEL(ch15_dimm_label, S_IRUGO | S_IWUSR,
> + channel_dimm_label_show, channel_dimm_label_store, 15);
>
> /* Total possible dynamic DIMM Label attribute file table */
> static struct attribute *dynamic_csrow_dimm_attr[] = {
> @@ -320,6 +328,10 @@ static struct attribute *dynamic_csrow_dimm_attr[] = {
> &dev_attr_legacy_ch9_dimm_label.attr.attr,
> &dev_attr_legacy_ch10_dimm_label.attr.attr,
> &dev_attr_legacy_ch11_dimm_label.attr.attr,
> + &dev_attr_legacy_ch12_dimm_label.attr.attr,
> + &dev_attr_legacy_ch13_dimm_label.attr.attr,
> + &dev_attr_legacy_ch14_dimm_label.attr.attr,
> + &dev_attr_legacy_ch15_dimm_label.attr.attr,
> NULL
> };
>
> @@ -348,6 +360,14 @@ DEVICE_CHANNEL(ch10_ce_count, S_IRUGO,
> channel_ce_count_show, NULL, 10);
> DEVICE_CHANNEL(ch11_ce_count, S_IRUGO,
> channel_ce_count_show, NULL, 11);
> +DEVICE_CHANNEL(ch12_ce_count, S_IRUGO,
> + channel_ce_count_show, NULL, 12);
> +DEVICE_CHANNEL(ch13_ce_count, S_IRUGO,
> + channel_ce_count_show, NULL, 13);
> +DEVICE_CHANNEL(ch14_ce_count, S_IRUGO,
> + channel_ce_count_show, NULL, 14);
> +DEVICE_CHANNEL(ch15_ce_count, S_IRUGO,
> + channel_ce_count_show, NULL, 15);
>
> /* Total possible dynamic ce_count attribute file table */
> static struct attribute *dynamic_csrow_ce_count_attr[] = {
> @@ -363,6 +383,10 @@ static struct attribute *dynamic_csrow_ce_count_attr[] = {
> &dev_attr_legacy_ch9_ce_count.attr.attr,
> &dev_attr_legacy_ch10_ce_count.attr.attr,
> &dev_attr_legacy_ch11_ce_count.attr.attr,
> + &dev_attr_legacy_ch12_ce_count.attr.attr,
> + &dev_attr_legacy_ch13_ce_count.attr.attr,
> + &dev_attr_legacy_ch14_ce_count.attr.attr,
> + &dev_attr_legacy_ch15_ce_count.attr.attr,
> NULL
> };
>
> --
There are many checkpatch warnings here.
Maybe it'd be prudent to note this in the commit message?
Something like "checkpatch warnings ignored to maintain coding style"
and maybe "affected lines are deprecated and will be removed", etc.?
Otherwise, I expect there will be some "checkpatch warning" fixes coming
our way.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface
2025-09-09 18:53 ` [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface Avadhut Naik
@ 2025-09-10 15:24 ` Yazen Ghannam
2025-09-10 17:38 ` Naik, Avadhut
0 siblings, 1 reply; 18+ messages in thread
From: Yazen Ghannam @ 2025-09-10 15:24 UTC (permalink / raw)
To: Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On Tue, Sep 09, 2025 at 06:53:14PM +0000, Avadhut Naik wrote:
> The legacy sysfs EDAC interface has been made obsolete more than a decade
> ago through the introduction of a new per-DIMM interface.
>
> The legacy interface however, hasn't been removed till date.
>
> Begin deprecating it so that it can eventually be removed two releases
> later.
>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> ---
> Changes in v3:
> Patch introduced.
> ---
> drivers/edac/Kconfig | 2 +-
> drivers/edac/edac_mc_sysfs.c | 14 ++++++++++++++
> 2 files changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
> index b824472208c4..19470f4efee7 100644
> --- a/drivers/edac/Kconfig
> +++ b/drivers/edac/Kconfig
> @@ -25,7 +25,7 @@ if EDAC
>
> config EDAC_LEGACY_SYSFS
> bool "EDAC legacy sysfs"
> - default y
> + default n
> help
> Enable the compatibility sysfs nodes.
> Use 'Y' if your edac utilities aren't ported to work with the newer
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 8689631f1905..3840eef942f8 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -144,6 +144,8 @@ struct dev_ch_attribute {
> static ssize_t csrow_ue_count_show(struct device *dev,
> struct device_attribute *mattr, char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> +
> struct csrow_info *csrow = to_csrow(dev);
>
> return sysfs_emit(data, "%u\n", csrow->ue_count);
> @@ -152,6 +154,8 @@ static ssize_t csrow_ue_count_show(struct device *dev,
> static ssize_t csrow_ce_count_show(struct device *dev,
> struct device_attribute *mattr, char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> +
> struct csrow_info *csrow = to_csrow(dev);
>
> return sysfs_emit(data, "%u\n", csrow->ce_count);
> @@ -160,6 +164,8 @@ static ssize_t csrow_ce_count_show(struct device *dev,
> static ssize_t csrow_size_show(struct device *dev,
> struct device_attribute *mattr, char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> +
> struct csrow_info *csrow = to_csrow(dev);
> int i;
> u32 nr_pages = 0;
> @@ -172,6 +178,8 @@ static ssize_t csrow_size_show(struct device *dev,
> static ssize_t csrow_mem_type_show(struct device *dev,
> struct device_attribute *mattr, char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> +
> struct csrow_info *csrow = to_csrow(dev);
>
> return sysfs_emit(data, "%s\n", edac_mem_types[csrow->channels[0]->dimm->mtype]);
> @@ -180,6 +188,8 @@ static ssize_t csrow_mem_type_show(struct device *dev,
> static ssize_t csrow_dev_type_show(struct device *dev,
> struct device_attribute *mattr, char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> +
> struct csrow_info *csrow = to_csrow(dev);
>
> return sysfs_emit(data, "%s\n", dev_types[csrow->channels[0]->dimm->dtype]);
> @@ -189,6 +199,8 @@ static ssize_t csrow_edac_mode_show(struct device *dev,
> struct device_attribute *mattr,
> char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> +
> struct csrow_info *csrow = to_csrow(dev);
>
> return sysfs_emit(data, "%s\n", edac_caps[csrow->channels[0]->dimm->edac_mode]);
> @@ -199,6 +211,7 @@ static ssize_t channel_dimm_label_show(struct device *dev,
> struct device_attribute *mattr,
> char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> struct csrow_info *csrow = to_csrow(dev);
> unsigned int chan = to_channel(mattr);
> struct rank_info *rank = csrow->channels[chan];
> @@ -238,6 +251,7 @@ static ssize_t channel_dimm_label_store(struct device *dev,
> static ssize_t channel_ce_count_show(struct device *dev,
> struct device_attribute *mattr, char *data)
> {
> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
> struct csrow_info *csrow = to_csrow(dev);
> unsigned int chan = to_channel(mattr);
> struct rank_info *rank = csrow->channels[chan];
> --
Depcreated -> Deprecated
And maybe the warning can go in an short inline function? Sorry, I
forgot if this came up already.
Also, "two future releases" is vague. And it may be confusing if this is
backported.
Does anyone have a better suggestion, or is this good as-is?
Thanks,
Yazen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 4/5] EDAC/mc_sysfs: Increase legacy channel support to 16
2025-09-10 15:17 ` Yazen Ghannam
@ 2025-09-10 15:29 ` Naik, Avadhut
0 siblings, 0 replies; 18+ messages in thread
From: Naik, Avadhut @ 2025-09-10 15:29 UTC (permalink / raw)
To: Yazen Ghannam, Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On 9/10/2025 10:17, Yazen Ghannam wrote:
> On Tue, Sep 09, 2025 at 06:53:13PM +0000, Avadhut Naik wrote:
>> Newer AMD systems can support up to 16 channels per EDAC "mc" device.
>> These are detected by the EDAC module running on the device, and the
>> current EDAC interface is appropriately enumerated.
>>
>> The legacy EDAC sysfs interface however, provides device attributes for
>> channels 0 through 11 only. Consequently, the last four channels, 12
>> through 15, will not be enumerated and will not be visible through the
>> legacy sysfs interface.
>>
>> Add additional device attributes to ensure that all 16 channels, if
>> present, are enumerated by and visible through the legacy EDAC sysfs
>> interface.
>>
>> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
>> ---
>> Changes in v2:
>> Patch introduced.
>>
>> Changes in v3:
>> No changes.
>> ---
>> drivers/edac/edac_mc_sysfs.c | 24 ++++++++++++++++++++++++
>> 1 file changed, 24 insertions(+)
>>
>> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
>> index 0f338adf7d93..8689631f1905 100644
>> --- a/drivers/edac/edac_mc_sysfs.c
>> +++ b/drivers/edac/edac_mc_sysfs.c
>> @@ -305,6 +305,14 @@ DEVICE_CHANNEL(ch10_dimm_label, S_IRUGO | S_IWUSR,
>> channel_dimm_label_show, channel_dimm_label_store, 10);
>> DEVICE_CHANNEL(ch11_dimm_label, S_IRUGO | S_IWUSR,
>> channel_dimm_label_show, channel_dimm_label_store, 11);
>> +DEVICE_CHANNEL(ch12_dimm_label, S_IRUGO | S_IWUSR,
>> + channel_dimm_label_show, channel_dimm_label_store, 12);
>> +DEVICE_CHANNEL(ch13_dimm_label, S_IRUGO | S_IWUSR,
>> + channel_dimm_label_show, channel_dimm_label_store, 13);
>> +DEVICE_CHANNEL(ch14_dimm_label, S_IRUGO | S_IWUSR,
>> + channel_dimm_label_show, channel_dimm_label_store, 14);
>> +DEVICE_CHANNEL(ch15_dimm_label, S_IRUGO | S_IWUSR,
>> + channel_dimm_label_show, channel_dimm_label_store, 15);
>>
>> /* Total possible dynamic DIMM Label attribute file table */
>> static struct attribute *dynamic_csrow_dimm_attr[] = {
>> @@ -320,6 +328,10 @@ static struct attribute *dynamic_csrow_dimm_attr[] = {
>> &dev_attr_legacy_ch9_dimm_label.attr.attr,
>> &dev_attr_legacy_ch10_dimm_label.attr.attr,
>> &dev_attr_legacy_ch11_dimm_label.attr.attr,
>> + &dev_attr_legacy_ch12_dimm_label.attr.attr,
>> + &dev_attr_legacy_ch13_dimm_label.attr.attr,
>> + &dev_attr_legacy_ch14_dimm_label.attr.attr,
>> + &dev_attr_legacy_ch15_dimm_label.attr.attr,
>> NULL
>> };
>>
>> @@ -348,6 +360,14 @@ DEVICE_CHANNEL(ch10_ce_count, S_IRUGO,
>> channel_ce_count_show, NULL, 10);
>> DEVICE_CHANNEL(ch11_ce_count, S_IRUGO,
>> channel_ce_count_show, NULL, 11);
>> +DEVICE_CHANNEL(ch12_ce_count, S_IRUGO,
>> + channel_ce_count_show, NULL, 12);
>> +DEVICE_CHANNEL(ch13_ce_count, S_IRUGO,
>> + channel_ce_count_show, NULL, 13);
>> +DEVICE_CHANNEL(ch14_ce_count, S_IRUGO,
>> + channel_ce_count_show, NULL, 14);
>> +DEVICE_CHANNEL(ch15_ce_count, S_IRUGO,
>> + channel_ce_count_show, NULL, 15);
>>
>> /* Total possible dynamic ce_count attribute file table */
>> static struct attribute *dynamic_csrow_ce_count_attr[] = {
>> @@ -363,6 +383,10 @@ static struct attribute *dynamic_csrow_ce_count_attr[] = {
>> &dev_attr_legacy_ch9_ce_count.attr.attr,
>> &dev_attr_legacy_ch10_ce_count.attr.attr,
>> &dev_attr_legacy_ch11_ce_count.attr.attr,
>> + &dev_attr_legacy_ch12_ce_count.attr.attr,
>> + &dev_attr_legacy_ch13_ce_count.attr.attr,
>> + &dev_attr_legacy_ch14_ce_count.attr.attr,
>> + &dev_attr_legacy_ch15_ce_count.attr.attr,
>> NULL
>> };
>>
>> --
>
> There are many checkpatch warnings here.
>
> Maybe it'd be prudent to note this in the commit message?
>
> Something like "checkpatch warnings ignored to maintain coding style"
> and maybe "affected lines are deprecated and will be removed", etc.?
>
> Otherwise, I expect there will be some "checkpatch warning" fixes coming
> our way.
>
Initially, was going to fix the checkpatch warnings. But then noticed that
previous commits to this area had ignored warnings too. So let them be as
is.
In any case, will add about this to the commit message.
> Thanks,
> Yazen
--
Thanks,
Avadhut Naik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 2/5] EDAC/amd64: Remove NUM_CONTROLLERS macro
2025-09-10 15:05 ` Yazen Ghannam
@ 2025-09-10 15:48 ` Naik, Avadhut
0 siblings, 0 replies; 18+ messages in thread
From: Naik, Avadhut @ 2025-09-10 15:48 UTC (permalink / raw)
To: Yazen Ghannam, Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On 9/10/2025 10:05, Yazen Ghannam wrote:
> On Tue, Sep 09, 2025 at 06:53:11PM +0000, Avadhut Naik wrote:
>> Currently, the NUM_CONTROLLERS macro is only used to statically allocate
>> the csels array of struct chip_select in struct amd64_pvt.
>>
>> The size of this array, however, will never exceed the number of UMCs on
>> the SOC. Since, max_mcs variable in struct amd64_pvt already stores the
>> number of UMCs on the SOC, the macro can be removed and the static array
>> can be dynamically allocated instead.
>
> You should note that max_mcs and the csels array are also used in legacy
> systems with 'DCTs'.
>
> Those had a max of 2 controllers which we already set in
> per_family_init() as the global default. So the legacy systems are
> covered by this change too.
>
> Without noting this, it seems like that case may be overlooked.
>
Will mention this in the commit message!
>>
>> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
>> ---
>> Changes in v3:
>> Patch introduced.
>> ---
>> drivers/edac/amd64_edac.c | 19 +++++++++++++------
>> drivers/edac/amd64_edac.h | 5 ++---
>> 2 files changed, 15 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index 3989794e4f29..0fade110c3fb 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -4000,30 +4000,34 @@ static int probe_one_instance(unsigned int nid)
>> if (ret < 0)
>> goto err_enable;
>>
>> + pvt->csels = kcalloc(pvt->max_mcs, sizeof(*pvt->csels), GFP_KERNEL);
>> + if (!pvt->csels)
>> + goto err_enable;
>> +
>
> You can move this allocation to the end of per_family_init(). That's
> where we determine 'max_mcs'.
>
> If you do so, then the 'goto' changes below are not needed.
>
> Another option is to put it in hw_info_get() like we do for UMCs. But
> that means adding the allocation to three different helper functions
> rather than just the one with per_family_init().
>
Had considered moving allocation to per_family_init() while sending
this set. But then didn't since I had stated that I would be adding
this allocation in probe_one_instance().
In any case, will move it to per_family_init().
>> ret = pvt->ops->hw_info_get(pvt);
>> if (ret < 0)
>> - goto err_enable;
>> + goto err_csels;
>>
>> ret = 0;
>> if (!instance_has_memory(pvt)) {
>> amd64_info("Node %d: No DIMMs detected.\n", nid);
>> - goto err_enable;
>> + goto err_csels;
>> }
>>
>> if (!pvt->ops->ecc_enabled(pvt)) {
>> ret = -ENODEV;
>>
>> if (!ecc_enable_override)
>> - goto err_enable;
>> + goto err_csels;
>>
>> if (boot_cpu_data.x86 >= 0x17) {
>> amd64_warn("Forcing ECC on is not recommended on newer systems. Please enable ECC in BIOS.");
>> - goto err_enable;
>> + goto err_csels;
>> } else
>> amd64_warn("Forcing ECC on!\n");
>>
>> if (!enable_ecc_error_reporting(s, nid, F3))
>> - goto err_enable;
>> + goto err_csels;
>> }
>>
>> ret = init_one_instance(pvt);
>> @@ -4033,7 +4037,7 @@ static int probe_one_instance(unsigned int nid)
>> if (boot_cpu_data.x86 < 0x17)
>> restore_ecc_error_reporting(s, nid, F3);
>>
>> - goto err_enable;
>> + goto err_csels;
>> }
>>
>> amd64_info("%s detected (node %d).\n", pvt->ctl_name, pvt->mc_node_id);
>> @@ -4043,6 +4047,8 @@ static int probe_one_instance(unsigned int nid)
>>
>> return ret;
>>
>> +err_csels:
>> + kfree(pvt->csels);
>
> This can go in hw_info_put(). We have kfree(pvt->umc) there already.
>
Okay. Will move it to hw_info_put().
--
Thanks,
Avadhut Naik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime
2025-09-10 14:51 ` Yazen Ghannam
@ 2025-09-10 15:53 ` Naik, Avadhut
0 siblings, 0 replies; 18+ messages in thread
From: Naik, Avadhut @ 2025-09-10 15:53 UTC (permalink / raw)
To: Yazen Ghannam, Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On 9/10/2025 09:51, Yazen Ghannam wrote:
> On Tue, Sep 09, 2025 at 06:53:10PM +0000, Avadhut Naik wrote:
>> Currently, the ctl_name string is statically assigned based on the family
>> and model of the SOC when the amd64_edac module is loaded.
>>
>> The same, however, is not exactly needed as the string can be generated
>> and assigned at runtime through scnprintf().
>>
>> Remove all static assignments and generate the string at runtime. Also,
>> cleanup the switch cases which now become defunct.
>>
>> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
>> ---
>> Changes in v3:
>> Patch introduced.
>> ---
>> drivers/edac/amd64_edac.c | 44 +++++++--------------------------------
>> drivers/edac/amd64_edac.h | 4 +++-
>> 2 files changed, 11 insertions(+), 37 deletions(-)
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index 07f1e9dc1ca7..3989794e4f29 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -3767,6 +3767,8 @@ static int per_family_init(struct amd64_pvt *pvt)
>> pvt->model = boot_cpu_data.x86_model;
>> pvt->fam = boot_cpu_data.x86;
>> pvt->max_mcs = 2;
>
> Newline here, please.
>
Okay.
>> + scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), "F%02Xh_M%02Xh",
>> + pvt->fam, pvt->model);
>
> There are a couple of special cases below.
>
> So I think it may be better to move this part to the end...
>
>>
>> /*
>> * Decide on which ops group to use here and do any family/model
>> @@ -3779,8 +3781,10 @@ static int per_family_init(struct amd64_pvt *pvt)
>>
>> switch (pvt->fam) {
>> case 0xf:
>> - pvt->ctl_name = (pvt->ext_model >= K8_REV_F) ?
>> - "K8 revF or later" : "K8 revE or earlier";
>> + if (pvt->ext_model >= K8_REV_F)
>> + strscpy(pvt->ctl_name, "K8 revF or later", sizeof(pvt->ctl_name));
>> + else
>> + strscpy(pvt->ctl_name, "K8 revE or earlier", sizeof(pvt->ctl_name));
>
> Maybe save this special case to a temporary "char *name".
>
>> pvt->f1_id = PCI_DEVICE_ID_AMD_K8_NB_ADDRMAP;
>> pvt->f2_id = PCI_DEVICE_ID_AMD_K8_NB_MEMCTL;
>> pvt->ops->map_sysaddr_to_csrow = k8_map_sysaddr_to_csrow;
>> @@ -3788,7 +3792,6 @@ static int per_family_init(struct amd64_pvt *pvt)
>> break;
>>
>> case 0x10:
>> - pvt->ctl_name = "F10h";
>> pvt->f1_id = PCI_DEVICE_ID_AMD_10H_NB_MAP;
>> pvt->f2_id = PCI_DEVICE_ID_AMD_10H_NB_DRAM;
>> pvt->ops->dbam_to_cs = f10_dbam_to_chip_select;
>> @@ -3797,12 +3800,10 @@ static int per_family_init(struct amd64_pvt *pvt)
>> case 0x15:
>> switch (pvt->model) {
>> case 0x30:
>> - pvt->ctl_name = "F15h_M30h";
>> pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F1;
>> pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M30H_NB_F2;
>> break;
>> case 0x60:
>> - pvt->ctl_name = "F15h_M60h";
>> pvt->f1_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F1;
>> pvt->f2_id = PCI_DEVICE_ID_AMD_15H_M60H_NB_F2;
>> pvt->ops->dbam_to_cs = f15_m60h_dbam_to_chip_select;
>> @@ -3811,7 +3812,6 @@ static int per_family_init(struct amd64_pvt *pvt)
>> /* Richland is only client */
>> return -ENODEV;
>> default:
>> - pvt->ctl_name = "F15h";
>> pvt->f1_id = PCI_DEVICE_ID_AMD_15H_NB_F1;
>> pvt->f2_id = PCI_DEVICE_ID_AMD_15H_NB_F2;
>> pvt->ops->dbam_to_cs = f15_dbam_to_chip_select;
>> @@ -3822,12 +3822,10 @@ static int per_family_init(struct amd64_pvt *pvt)
>> case 0x16:
>> switch (pvt->model) {
>> case 0x30:
>> - pvt->ctl_name = "F16h_M30h";
>> pvt->f1_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F1;
>> pvt->f2_id = PCI_DEVICE_ID_AMD_16H_M30H_NB_F2;
>> break;
>> default:
>> - pvt->ctl_name = "F16h";
>> pvt->f1_id = PCI_DEVICE_ID_AMD_16H_NB_F1;
>> pvt->f2_id = PCI_DEVICE_ID_AMD_16H_NB_F2;
>> break;
>> @@ -3836,76 +3834,52 @@ static int per_family_init(struct amd64_pvt *pvt)
>>
>> case 0x17:
>> switch (pvt->model) {
>> - case 0x10 ... 0x2f:
>> - pvt->ctl_name = "F17h_M10h";
>> - break;
>> case 0x30 ... 0x3f:
>> - pvt->ctl_name = "F17h_M30h";
>> pvt->max_mcs = 8;
>> break;
>> - case 0x60 ... 0x6f:
>> - pvt->ctl_name = "F17h_M60h";
>> - break;
>> - case 0x70 ... 0x7f:
>> - pvt->ctl_name = "F17h_M70h";
>> - break;
>> default:
>> - pvt->ctl_name = "F17h";
>> break;
>> }
>> break;
>>
>> case 0x18:
>> - pvt->ctl_name = "F18h";
>> break;
>>
>> case 0x19:
>> switch (pvt->model) {
>> case 0x00 ... 0x0f:
>> - pvt->ctl_name = "F19h";
>> pvt->max_mcs = 8;
>> break;
>> case 0x10 ... 0x1f:
>> - pvt->ctl_name = "F19h_M10h";
>> pvt->max_mcs = 12;
>> pvt->flags.zn_regs_v2 = 1;
>> break;
>> - case 0x20 ... 0x2f:
>> - pvt->ctl_name = "F19h_M20h";
>> - break;
>> case 0x30 ... 0x3f:
>> if (pvt->F3->device == PCI_DEVICE_ID_AMD_MI200_DF_F3) {
>> - pvt->ctl_name = "MI200";
>> + memset(pvt->ctl_name, 0, sizeof(pvt->ctl_name));
>> + strscpy(pvt->ctl_name, "MI200", sizeof(pvt->ctl_name));
>
> And this to a temp name too.
>
>> pvt->max_mcs = 4;
>> pvt->dram_type = MEM_HBM2;
>> pvt->gpu_umc_base = 0x50000;
>> pvt->ops = &gpu_ops;
>> } else {
>> - pvt->ctl_name = "F19h_M30h";
>> pvt->max_mcs = 8;
>> }
>> break;
>> - case 0x50 ... 0x5f:
>> - pvt->ctl_name = "F19h_M50h";
>> - break;
>> case 0x60 ... 0x6f:
>> - pvt->ctl_name = "F19h_M60h";
>> pvt->flags.zn_regs_v2 = 1;
>> break;
>> case 0x70 ... 0x7f:
>> - pvt->ctl_name = "F19h_M70h";
>> pvt->max_mcs = 4;
>> pvt->flags.zn_regs_v2 = 1;
>> break;
>> case 0x90 ... 0x9f:
>> - pvt->ctl_name = "F19h_M90h";
>> pvt->max_mcs = 4;
>> pvt->dram_type = MEM_HBM3;
>> pvt->gpu_umc_base = 0x90000;
>> pvt->ops = &gpu_ops;
>> break;
>> case 0xa0 ... 0xaf:
>> - pvt->ctl_name = "F19h_MA0h";
>> pvt->max_mcs = 12;
>> pvt->flags.zn_regs_v2 = 1;
>> break;
>> @@ -3915,12 +3889,10 @@ static int per_family_init(struct amd64_pvt *pvt)
>> case 0x1A:
>> switch (pvt->model) {
>> case 0x00 ... 0x1f:
>> - pvt->ctl_name = "F1Ah";
>> pvt->max_mcs = 12;
>> pvt->flags.zn_regs_v2 = 1;
>> break;
>> case 0x40 ... 0x4f:
>> - pvt->ctl_name = "F1Ah_M40h";
>> pvt->flags.zn_regs_v2 = 1;
>> break;
>> }
>
> ...here.
>
> Then check if the name was set already (by the special cases). If not,
> then set the generic family/model name.
>
> For example:
> char *tmp_name = NULL;
>
> if K8:
> if F:
> tmp_name = "K8 F";
> else:
> tmp_name = "K8 E";
> if MI200:
> tmp_name = "MI200";
>
> if (tmp_name)
> scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), tmp_name);
> else
> scnprintf(pvt->ctl_name, sizeof(pvt->ctl_name), "F%02Xh_M%02Xh",
> pvt->fam, pvt->model);
>
Okay! Will try this out!
--
Thanks,
Avadhut Naik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 3/5] EDAC/amd64: Add support for AMD family 1Ah-based newer models
2025-09-10 15:13 ` Yazen Ghannam
@ 2025-09-10 16:08 ` Naik, Avadhut
0 siblings, 0 replies; 18+ messages in thread
From: Naik, Avadhut @ 2025-09-10 16:08 UTC (permalink / raw)
To: Yazen Ghannam, Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On 9/10/2025 10:13, Yazen Ghannam wrote:
> On Tue, Sep 09, 2025 at 06:53:12PM +0000, Avadhut Naik wrote:
>> Add support for family 1Ah-based models 50h-57h, 90h-9Fh, A0h-AFh, and
>> C0h-C7h.
>>
>> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
>> ---
>> Changes in v2:
>> 1. Remove extra tabs
>>
>> Changes in v3:
>> 1. Since ctl_name string is now assigned at runtime, group similar models
>> together.
>> ---
>> drivers/edac/amd64_edac.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index 0fade110c3fb..804d3c4c3f14 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -3895,6 +3895,16 @@ static int per_family_init(struct amd64_pvt *pvt)
>> case 0x40 ... 0x4f:
>> pvt->flags.zn_regs_v2 = 1;
>> break;
>> + case 0x50 ... 0x57:
>> + case 0xc0 ... 0xc7:
>> + pvt->max_mcs = 16;
>> + pvt->flags.zn_regs_v2 = 1;
>> + break;
>> + case 0x90 ... 0x9f:
>> + case 0xa0 ... 0xaf:
>> + pvt->max_mcs = 8;
>> + pvt->flags.zn_regs_v2 = 1;
>
> All of Family 1Ah uses 'zn_regs_v2', so this can go before the models
> cases.
>
> The register changes happened in Family 19h, so there are a mix of
> models there.
>
> We could be so bold to say 'zn_regs_v2 = (family >= 0x1A)' up top.
>
> Family 19h would not set this, but then the individual model cases can
> fix it up.
>
Okay. Will set zn_regs_v2 at the top of per_family_init() for Family 1Ah.
Will remove its existing usage too.
Will make this a separate patch.
> Thanks,
> Yazen
--
Thanks,
Avadhut Naik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface
2025-09-10 15:24 ` Yazen Ghannam
@ 2025-09-10 17:38 ` Naik, Avadhut
2025-09-15 17:04 ` Yazen Ghannam
0 siblings, 1 reply; 18+ messages in thread
From: Naik, Avadhut @ 2025-09-10 17:38 UTC (permalink / raw)
To: Yazen Ghannam, Avadhut Naik; +Cc: linux-edac, bp, linux-kernel
On 9/10/2025 10:24, Yazen Ghannam wrote:
> On Tue, Sep 09, 2025 at 06:53:14PM +0000, Avadhut Naik wrote:
>> The legacy sysfs EDAC interface has been made obsolete more than a decade
>> ago through the introduction of a new per-DIMM interface.
>>
>> The legacy interface however, hasn't been removed till date.
>>
>> Begin deprecating it so that it can eventually be removed two releases
>> later.
>>
>> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
>> ---
>> Changes in v3:
>> Patch introduced.
>> ---
>> drivers/edac/Kconfig | 2 +-
>> drivers/edac/edac_mc_sysfs.c | 14 ++++++++++++++
>> 2 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
>> index b824472208c4..19470f4efee7 100644
>> --- a/drivers/edac/Kconfig
>> +++ b/drivers/edac/Kconfig
>> @@ -25,7 +25,7 @@ if EDAC
>>
>> config EDAC_LEGACY_SYSFS
>> bool "EDAC legacy sysfs"
>> - default y
>> + default n
>> help
>> Enable the compatibility sysfs nodes.
>> Use 'Y' if your edac utilities aren't ported to work with the newer
>> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
>> index 8689631f1905..3840eef942f8 100644
>> --- a/drivers/edac/edac_mc_sysfs.c
>> +++ b/drivers/edac/edac_mc_sysfs.c
>> @@ -144,6 +144,8 @@ struct dev_ch_attribute {
>> static ssize_t csrow_ue_count_show(struct device *dev,
>> struct device_attribute *mattr, char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> +
>> struct csrow_info *csrow = to_csrow(dev);
>>
>> return sysfs_emit(data, "%u\n", csrow->ue_count);
>> @@ -152,6 +154,8 @@ static ssize_t csrow_ue_count_show(struct device *dev,
>> static ssize_t csrow_ce_count_show(struct device *dev,
>> struct device_attribute *mattr, char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> +
>> struct csrow_info *csrow = to_csrow(dev);
>>
>> return sysfs_emit(data, "%u\n", csrow->ce_count);
>> @@ -160,6 +164,8 @@ static ssize_t csrow_ce_count_show(struct device *dev,
>> static ssize_t csrow_size_show(struct device *dev,
>> struct device_attribute *mattr, char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> +
>> struct csrow_info *csrow = to_csrow(dev);
>> int i;
>> u32 nr_pages = 0;
>> @@ -172,6 +178,8 @@ static ssize_t csrow_size_show(struct device *dev,
>> static ssize_t csrow_mem_type_show(struct device *dev,
>> struct device_attribute *mattr, char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> +
>> struct csrow_info *csrow = to_csrow(dev);
>>
>> return sysfs_emit(data, "%s\n", edac_mem_types[csrow->channels[0]->dimm->mtype]);
>> @@ -180,6 +188,8 @@ static ssize_t csrow_mem_type_show(struct device *dev,
>> static ssize_t csrow_dev_type_show(struct device *dev,
>> struct device_attribute *mattr, char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> +
>> struct csrow_info *csrow = to_csrow(dev);
>>
>> return sysfs_emit(data, "%s\n", dev_types[csrow->channels[0]->dimm->dtype]);
>> @@ -189,6 +199,8 @@ static ssize_t csrow_edac_mode_show(struct device *dev,
>> struct device_attribute *mattr,
>> char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> +
>> struct csrow_info *csrow = to_csrow(dev);
>>
>> return sysfs_emit(data, "%s\n", edac_caps[csrow->channels[0]->dimm->edac_mode]);
>> @@ -199,6 +211,7 @@ static ssize_t channel_dimm_label_show(struct device *dev,
>> struct device_attribute *mattr,
>> char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> struct csrow_info *csrow = to_csrow(dev);
>> unsigned int chan = to_channel(mattr);
>> struct rank_info *rank = csrow->channels[chan];
>> @@ -238,6 +251,7 @@ static ssize_t channel_dimm_label_store(struct device *dev,
>> static ssize_t channel_ce_count_show(struct device *dev,
>> struct device_attribute *mattr, char *data)
>> {
>> + pr_warn_once("Depcreated interface! Will be removed within two future releases. Please switch to using the new interface!\n");
>> struct csrow_info *csrow = to_csrow(dev);
>> unsigned int chan = to_channel(mattr);
>> struct rank_info *rank = csrow->channels[chan];
>> --
>
> Depcreated -> Deprecated
>
Will change.
> And maybe the warning can go in an short inline function? Sorry, I
> forgot if this came up already.
>
> Also, "two future releases" is vague. And it may be confusing if this is
> backported.
>
> Does anyone have a better suggestion, or is this good as-is?
>
How about explicitly stating a release?
6.20, for example.
> Thanks,
> Yazen
--
Thanks,
Avadhut Naik
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface
2025-09-10 17:38 ` Naik, Avadhut
@ 2025-09-15 17:04 ` Yazen Ghannam
2025-09-16 16:09 ` Naik, Avadhut
0 siblings, 1 reply; 18+ messages in thread
From: Yazen Ghannam @ 2025-09-15 17:04 UTC (permalink / raw)
To: Naik, Avadhut; +Cc: Avadhut Naik, linux-edac, bp, linux-kernel
On Wed, Sep 10, 2025 at 12:38:44PM -0500, Naik, Avadhut wrote:
[...]
> > And maybe the warning can go in an short inline function? Sorry, I
> > forgot if this came up already.
> >
> > Also, "two future releases" is vague. And it may be confusing if this is
> > backported.
> >
> > Does anyone have a better suggestion, or is this good as-is?
> >
>
> How about explicitly stating a release?
> 6.20, for example.
>
I think that may be okay, but I'm not sure.
Boris, what do you think?
Thanks,
Yazen
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface
2025-09-15 17:04 ` Yazen Ghannam
@ 2025-09-16 16:09 ` Naik, Avadhut
0 siblings, 0 replies; 18+ messages in thread
From: Naik, Avadhut @ 2025-09-16 16:09 UTC (permalink / raw)
To: Yazen Ghannam; +Cc: Avadhut Naik, linux-edac, bp, linux-kernel
On 9/15/2025 12:04, Yazen Ghannam wrote:
> On Wed, Sep 10, 2025 at 12:38:44PM -0500, Naik, Avadhut wrote:
>
> [...]
>>> And maybe the warning can go in an short inline function? Sorry, I
>>> forgot if this came up already.
>>>
>>> Also, "two future releases" is vague. And it may be confusing if this is
>>> backported.
>>>
>>> Does anyone have a better suggestion, or is this good as-is?
>>>
>>
>> How about explicitly stating a release?
>> 6.20, for example.
>>
>
> I think that may be okay, but I'm not sure.
>
> Boris, what do you think?
>
> Thanks,
> Yazen
Another alternative could be doing something like this:
[ 1340.644001] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
Mention the year instead of release!
--
Thanks,
Avadhut Naik
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2025-09-16 16:10 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-09 18:53 [PATCH v3 0/5] Cleanup and add support for AMD Family 1Ah-based SOCs Avadhut Naik
2025-09-09 18:53 ` [PATCH v3 1/5] EDAC/amd64: Generate ctl_name string at runtime Avadhut Naik
2025-09-10 14:51 ` Yazen Ghannam
2025-09-10 15:53 ` Naik, Avadhut
2025-09-09 18:53 ` [PATCH v3 2/5] EDAC/amd64: Remove NUM_CONTROLLERS macro Avadhut Naik
2025-09-10 15:05 ` Yazen Ghannam
2025-09-10 15:48 ` Naik, Avadhut
2025-09-09 18:53 ` [PATCH v3 3/5] EDAC/amd64: Add support for AMD family 1Ah-based newer models Avadhut Naik
2025-09-10 15:13 ` Yazen Ghannam
2025-09-10 16:08 ` Naik, Avadhut
2025-09-09 18:53 ` [PATCH v3 4/5] EDAC/mc_sysfs: Increase legacy channel support to 16 Avadhut Naik
2025-09-10 15:17 ` Yazen Ghannam
2025-09-10 15:29 ` Naik, Avadhut
2025-09-09 18:53 ` [PATCH v3 5/5] EDAC/mc_sysfs: Begin deprecating legacy sysfs EDAC interface Avadhut Naik
2025-09-10 15:24 ` Yazen Ghannam
2025-09-10 17:38 ` Naik, Avadhut
2025-09-15 17:04 ` Yazen Ghannam
2025-09-16 16:09 ` Naik, Avadhut
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox