* [PATCH 1/3] EDAC: i5000: disable error reporting at teardown and refactor helper
2026-04-30 8:42 [PATCH 0/3] EDAC: fix error reporting handling during init for Tushar Tibude
@ 2026-04-30 8:42 ` Tushar Tibude
2026-04-30 8:42 ` [PATCH 2/3] EDAC: i5100: disable error reporting at teardown and create helper Tushar Tibude
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Tushar Tibude @ 2026-04-30 8:42 UTC (permalink / raw)
To: mchehab, bp, tony.luck, linux-edac, linux-kernel
Cc: qiuxu.zhuo, Tushar Tibude
Error reporting is enabled during initialization. If initialization
fails afterwards, or when the driver is torn down normally, error
reporting stays enabled in the mask register after the driver exits.
Replace i5000_enable_error_reporting() with i5000_set_error_reporting(),
which handles both enabling and disabling. Disable reporting on
initialization failure and at driver exit, before i5000_put_devices()
is called for cleanup.
This leaves the hardware in a clean state by masking the error
reporting bits that are no longer in use before exiting.
Signed-off-by: Tushar Tibude <tushar.tibude1000@gmail.com>
---
drivers/edac/i5000_edac.c | 33 ++++++++++++++++++++++++---------
1 file changed, 24 insertions(+), 9 deletions(-)
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 471b8540d..c0faf55f7 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -352,6 +352,9 @@ struct i5000_pvt {
/* Actual values for this controller */
int maxch; /* Max channels */
int maxdimmperch; /* Max DIMMs per channel */
+
+ /* Hardware error reporting status */
+ bool enabled_error_reporting;
};
/* I5000 MCH error information retrieved from Hardware */
@@ -1302,10 +1305,10 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
}
/*
- * i5000_enable_error_reporting
- * Turn on the memory reporting features of the hardware
+ * i5000_set_error_reporting
+ * Turn on/off the memory reporting features of the hardware
*/
-static void i5000_enable_error_reporting(struct mem_ctl_info *mci)
+static void i5000_set_error_reporting(struct mem_ctl_info *mci, bool enable)
{
struct i5000_pvt *pvt;
u32 fbd_error_mask;
@@ -1316,8 +1319,11 @@ static void i5000_enable_error_reporting(struct mem_ctl_info *mci)
pci_read_config_dword(pvt->branchmap_werrors, EMASK_FBD,
&fbd_error_mask);
- /* Enable with a '0' */
- fbd_error_mask &= ~(ENABLE_EMASK_ALL);
+ /* Enable with 0, disable with 1 */
+ if (enable)
+ fbd_error_mask &= ~(ENABLE_EMASK_ALL);
+ else
+ fbd_error_mask |= ENABLE_EMASK_ALL;
pci_write_config_dword(pvt->branchmap_werrors, EMASK_FBD,
fbd_error_mask);
@@ -1435,17 +1441,19 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
if (i5000_init_csrows(mci)) {
edac_dbg(0, "MC: Setting mci->edac_cap to EDAC_FLAG_NONE because i5000_init_csrows() returned nonzero value\n");
mci->edac_cap = EDAC_FLAG_NONE; /* no csrows found */
+ pvt->enabled_error_reporting = false;
} else {
edac_dbg(1, "MC: Enable error reporting now\n");
- i5000_enable_error_reporting(mci);
+ i5000_set_error_reporting(mci, true);
+ pvt->enabled_error_reporting = true;
}
/* add this new MC control structure to EDAC's list of MCs */
if (edac_mc_add_mc(mci)) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
- /* FIXME: perhaps some code should go here that disables error
- * reporting if we just enabled it
- */
+ /* Disable error reporting if we previously enabled it */
+ if (pvt->enabled_error_reporting)
+ i5000_set_error_reporting(mci, false);
goto fail1;
}
@@ -1503,6 +1511,7 @@ static int i5000_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
static void i5000_remove_one(struct pci_dev *pdev)
{
struct mem_ctl_info *mci;
+ struct i5000_pvt *pvt;
edac_dbg(0, "\n");
@@ -1512,6 +1521,12 @@ static void i5000_remove_one(struct pci_dev *pdev)
if ((mci = edac_mc_del_mc(&pdev->dev)) == NULL)
return;
+ pvt = mci->pvt_info;
+
+ /* Disable error reporting on teardown */
+ if (pvt->enabled_error_reporting)
+ i5000_set_error_reporting(mci, false);
+
/* retrieve references to resources, and free those resources */
i5000_put_devices(mci);
edac_mc_free(mci);
--
2.43.0
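The read-modify-write pattern in the patch above can be modeled in userspace. The following is only a sketch under stated assumptions, not the driver code: `struct demo_dev`, `demo_set_error_reporting()`, `demo_probe()` and the `DEMO_EMASK_ALL` value are hypothetical names chosen for illustration, and a plain struct field stands in for the PCI config register. It shows that a 0 bit enables a source, a 1 bit masks it, and that unrelated bits in the register are preserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask value for illustration; not the driver's ENABLE_EMASK_ALL. */
#define DEMO_EMASK_ALL 0x0000000Fu

/* Userspace stand-in for the device: one 32-bit mask register plus the flag. */
struct demo_dev {
	uint32_t emask;                 /* simulated EMASK_FBD register */
	bool enabled_error_reporting;   /* mirrors the flag added to the pvt struct */
};

/* Read-modify-write: a 0 bit unmasks (enables) a source, a 1 bit masks it. */
static void demo_set_error_reporting(struct demo_dev *dev, bool enable)
{
	uint32_t mask;

	assert(dev != NULL);
	mask = dev->emask;              /* stands in for pci_read_config_dword() */
	if (enable)
		mask &= ~DEMO_EMASK_ALL;
	else
		mask |= DEMO_EMASK_ALL;
	dev->emask = mask;              /* stands in for pci_write_config_dword() */
}

/* Probe-like flow: enable, then undo if the later registration step fails. */
static int demo_probe(struct demo_dev *dev, bool registration_fails)
{
	demo_set_error_reporting(dev, true);
	dev->enabled_error_reporting = true;

	if (registration_fails) {
		/* Same guard as the patch: only undo what was actually enabled. */
		if (dev->enabled_error_reporting)
			demo_set_error_reporting(dev, false);
		return -1;
	}
	return 0;
}
```

Calling `demo_probe()` with `registration_fails` set leaves the simulated register masked again, which is the behavior the old FIXME comment asked for.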
* [PATCH 2/3] EDAC: i5100: disable error reporting at teardown and create helper
From: Tushar Tibude @ 2026-04-30 8:42 UTC (permalink / raw)
To: mchehab, bp, tony.luck, linux-edac, linux-kernel
Cc: qiuxu.zhuo, Tushar Tibude
Error reporting is enabled during init but is not reverted when init
fails. It is also not disabled at normal driver teardown.
Create i5100_set_error_reporting() to enable or disable reporting. Move
the write that enables reporting to after initialization succeeds.
Disable reporting at driver teardown.
Signed-off-by: Tushar Tibude <tushar.tibude1000@gmail.com>
---
drivers/edac/i5100_edac.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index d470afe65..d30919ceb 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,6 +859,21 @@ static void i5100_init_csrows(struct mem_ctl_info *mci)
}
}
+static void i5100_set_error_reporting(struct pci_dev *pdev, bool enable)
+{
+ u32 dw;
+
+ pci_read_config_dword(pdev, I5100_EMASK_MEM, &dw);
+
+ /* Enable with 0, disable with 1 */
+ if (enable)
+ dw &= ~I5100_FERR_NF_MEM_ANY_MASK;
+ else
+ dw |= I5100_FERR_NF_MEM_ANY_MASK;
+
+ pci_write_config_dword(pdev, I5100_EMASK_MEM, dw);
+}
+
/****************************************************************************
* Error injection routines
****************************************************************************/
@@ -1004,11 +1019,6 @@ static int i5100_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
pci_read_config_dword(pdev, I5100_MS, &dw);
ranksperch = !!(dw & (1 << 8)) * 2 + 4;
- /* enable error reporting... */
- pci_read_config_dword(pdev, I5100_EMASK_MEM, &dw);
- dw &= ~I5100_FERR_NF_MEM_ANY_MASK;
- pci_write_config_dword(pdev, I5100_EMASK_MEM, dw);
-
/* device 21, func 0, Channel 0 Memory Map, Error Flag/Mask, etc... */
ch0mm = pci_get_device_func(PCI_VENDOR_ID_INTEL,
PCI_DEVICE_ID_INTEL_5100_21, 0);
@@ -1125,6 +1135,9 @@ static int i5100_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
i5100_setup_debugfs(mci);
+ /* Enable error reporting on success */
+ i5100_set_error_reporting(pdev, true);
+
return ret;
bail_scrub:
@@ -1169,6 +1182,9 @@ static void i5100_remove_one(struct pci_dev *pdev)
priv = mci->pvt_info;
+ /* Disable error reporting at teardown */
+ i5100_set_error_reporting(pdev, false);
+
edac_debugfs_remove_recursive(priv->debugfs);
priv->scrub_enable = 0;
--
2.43.0
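The ordering change in the patch above (enable only once every init step has succeeded) can be sketched in userspace. This is a simplified model under stated assumptions, not the driver: `struct sketch_dev`, the `sketch_*` functions and the `SKETCH_MEM_ANY_MASK` value are hypothetical, and a struct field stands in for the I5100_EMASK_MEM config register. Because the register is not touched until the end of init, the failure path needs no undo step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask bits for illustration; not I5100_FERR_NF_MEM_ANY_MASK. */
#define SKETCH_MEM_ANY_MASK 0x00FFu

struct sketch_dev {
	uint32_t emask_mem;   /* simulated I5100_EMASK_MEM register */
	bool setup_ok;        /* lets a caller force an init-step failure */
};

/* Enable with 0, disable with 1, leaving all other bits untouched. */
static void sketch_set_error_reporting(struct sketch_dev *dev, bool enable)
{
	assert(dev != NULL);
	if (enable)
		dev->emask_mem &= ~(uint32_t)SKETCH_MEM_ANY_MASK;
	else
		dev->emask_mem |= SKETCH_MEM_ANY_MASK;
}

/* Init: every step that can fail runs before reporting is enabled. */
static int sketch_init_one(struct sketch_dev *dev)
{
	if (!dev->setup_ok)
		return -1;   /* bail out before the register was ever touched */

	/* Last step, reached only on success. */
	sketch_set_error_reporting(dev, true);
	return 0;
}

/* Teardown: mask the error sources again before releasing the device. */
static void sketch_remove_one(struct sketch_dev *dev)
{
	sketch_set_error_reporting(dev, false);
}
```

In this model a failed `sketch_init_one()` returns with the register exactly as it found it, which is why this variant needs no tracking flag, unlike the i5000 and i5400 patches in the same series.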
* [PATCH 3/3] EDAC: i5400: disable error reporting at teardown and refactor helper
From: Tushar Tibude @ 2026-04-30 8:42 UTC (permalink / raw)
To: mchehab, bp, tony.luck, linux-edac, linux-kernel
Cc: qiuxu.zhuo, Tushar Tibude
Error reporting is enabled during initialization. If initialization
fails afterwards, or when the driver is torn down normally, error
reporting stays enabled in the mask register after the driver exits.
Replace i5400_enable_error_reporting() with i5400_set_error_reporting(),
which handles both enabling and disabling. Disable reporting on
initialization failure and at driver exit, before i5400_put_devices()
is called for cleanup.
This leaves the hardware in a clean state by masking the error
reporting bits that are no longer in use before exiting.
Signed-off-by: Tushar Tibude <tushar.tibude1000@gmail.com>
---
drivers/edac/i5400_edac.c | 33 ++++++++++++++++++++++++---------
1 file changed, 24 insertions(+), 9 deletions(-)
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index fb49a1d1d..ae4f92989 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -353,6 +353,9 @@ struct i5400_pvt {
/* Actual values for this controller */
int maxch; /* Max channels */
int maxdimmperch; /* Max DIMMs per channel */
+
+ /* Hardware error reporting status */
+ bool enabled_error_reporting;
};
/* I5400 MCH error information retrieved from Hardware */
@@ -1223,10 +1226,10 @@ static int i5400_init_dimms(struct mem_ctl_info *mci)
}
/*
- * i5400_enable_error_reporting
- * Turn on the memory reporting features of the hardware
+ * i5400_set_error_reporting
+ * Turn on/off the memory reporting features of the hardware
*/
-static void i5400_enable_error_reporting(struct mem_ctl_info *mci)
+static void i5400_set_error_reporting(struct mem_ctl_info *mci, bool enable)
{
struct i5400_pvt *pvt;
u32 fbd_error_mask;
@@ -1237,8 +1240,11 @@ static void i5400_enable_error_reporting(struct mem_ctl_info *mci)
pci_read_config_dword(pvt->branchmap_werrors, EMASK_FBD,
&fbd_error_mask);
- /* Enable with a '0' */
- fbd_error_mask &= ~(ENABLE_EMASK_ALL);
+ /* Enable with 0, disable with 1 */
+ if (enable)
+ fbd_error_mask &= ~(ENABLE_EMASK_ALL);
+ else
+ fbd_error_mask |= ENABLE_EMASK_ALL;
pci_write_config_dword(pvt->branchmap_werrors, EMASK_FBD,
fbd_error_mask);
@@ -1319,17 +1325,19 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
if (i5400_init_dimms(mci)) {
edac_dbg(0, "MC: Setting mci->edac_cap to EDAC_FLAG_NONE because i5400_init_dimms() returned nonzero value\n");
mci->edac_cap = EDAC_FLAG_NONE; /* no dimms found */
+ pvt->enabled_error_reporting = false;
} else {
edac_dbg(1, "MC: Enable error reporting now\n");
- i5400_enable_error_reporting(mci);
+ i5400_set_error_reporting(mci, true);
+ pvt->enabled_error_reporting = true;
}
/* add this new MC control structure to EDAC's list of MCs */
if (edac_mc_add_mc(mci)) {
edac_dbg(0, "MC: failed edac_mc_add_mc()\n");
- /* FIXME: perhaps some code should go here that disables error
- * reporting if we just enabled it
- */
+ /* Disable error reporting if we just enabled it */
+ if (pvt->enabled_error_reporting)
+ i5400_set_error_reporting(mci, false);
goto fail1;
}
@@ -1387,6 +1395,7 @@ static int i5400_init_one(struct pci_dev *pdev, const struct pci_device_id *id)
static void i5400_remove_one(struct pci_dev *pdev)
{
struct mem_ctl_info *mci;
+ struct i5400_pvt *pvt;
edac_dbg(0, "\n");
@@ -1397,6 +1406,12 @@ static void i5400_remove_one(struct pci_dev *pdev)
if (!mci)
return;
+ pvt = mci->pvt_info;
+
+ /* Disable error reporting on teardown */
+ if (pvt->enabled_error_reporting)
+ i5400_set_error_reporting(mci, false);
+
/* retrieve references to resources, and free those resources */
i5400_put_devices(mci);
--
2.43.0
* RE: [PATCH 0/3] EDAC: fix error reporting handling during init for
From: Zhuo, Qiuxu @ 2026-04-30 14:18 UTC (permalink / raw)
To: Tushar Tibude, mchehab@kernel.org, bp@alien8.de, Luck, Tony,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
Hi Tushar,
> From: Tushar Tibude <tushar.tibude1000@gmail.com>
> Sent: Thursday, April 30, 2026 4:42 PM
> To: mchehab@kernel.org; bp@alien8.de; Luck, Tony <tony.luck@intel.com>;
> linux-edac@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Tushar Tibude
> <tushar.tibude1000@gmail.com>
> Subject: [PATCH 0/3] EDAC: fix error reporting handling during init for
>
> For some drivers, during initialization error reporting is enabled, but not
> properly handled during teardown or init fail, and this leaves it enabled in the
> register on exit.
>
> This inconsistency was first identified for i7300_edac.c and a patch addressing
> it was reviewed:
>
> Link: https://lore.kernel.org/linux-edac/20260429094806.25097-1-tushar.tibude1000@gmail.com/
>
> It was then discovered across multiple i5xxx drivers:
>
> i5000_edac.c
> i5100_edac.c
> i5400_edac.c
>
> This patch series aims to implement a fix for each of them.
Thanks for the patch.
Since this addresses the same issue with the same simple fix in the same subsystem,
IMHO it's better to combine this series and the missing fix for driver [1] with your
previous fix [2] into a single patch, rather than having 5 separate patches.
But please wait for comments from @Luck, Tony and @Borislav Petkov (AMD) on whether
to use a single patch or multiple patches.
[1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/edac/i7300_edac.c#n1092
[2] https://lore.kernel.org/all/20260429094806.25097-1-tushar.tibude1000@gmail.com/
Thanks!
-Qiuxu
* RE: [PATCH 0/3] EDAC: fix error reporting handling during init for
From: Luck, Tony @ 2026-04-30 22:40 UTC (permalink / raw)
To: Zhuo, Qiuxu, Tushar Tibude, mchehab@kernel.org, bp@alien8.de,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
> Since this addresses the same issue with the same simple fix in the same subsystem,
> IMHO it's better to combine this series and the missing fix for driver [1] with your
> previous fix [2] into a single patch, rather than having 5 separate patches.
>
> But please wait for comments from @Luck, Tony and @Borislav Petkov (AMD) on whether
> to use a single patch or multiple patches.
>
> [1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/edac/i7300_edac.c#n1092
> [2] https://lore.kernel.org/all/20260429094806.25097-1-tushar.tibude1000@gmail.com/
If the patch was only 5 lines of diff, bundling them all in one patch would be good. But each
is around 30+ lines:
drivers/edac/i7300_edac.c | 33 ++++++++++++++++++++++++---------
which would make the combined patch a bit bigger than I like.
I'll look at the individual patches and if they are OK I'll pick them up separately.
-Tony
* RE: [PATCH 0/3] EDAC: fix error reporting handling during init for
From: Zhuo, Qiuxu @ 2026-05-01 2:07 UTC (permalink / raw)
To: Luck, Tony, Tushar Tibude, mchehab@kernel.org, bp@alien8.de,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org
> From: Luck, Tony <tony.luck@intel.com>
> Sent: Friday, May 1, 2026 6:40 AM
> To: Zhuo, Qiuxu <qiuxu.zhuo@intel.com>; Tushar Tibude
> <tushar.tibude1000@gmail.com>; mchehab@kernel.org; bp@alien8.de;
> linux-edac@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [PATCH 0/3] EDAC: fix error reporting handling during init for
>
> > Since this addresses the same issue with the same simple fix in the
> > same subsystem, IMHO it's better to combine this series and the
> > missing fix for driver [1] with your previous fix [2] into a single
> > patch, rather than having 5 separate patches.
> >
> > But please wait for comments from @Luck, Tony and @Borislav Petkov
> > (AMD) on whether to use a single patch or multiple patches.
> >
> > [1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/edac/i7300_edac.c#n1092
> > [2] https://lore.kernel.org/all/20260429094806.25097-1-tushar.tibude1000@gmail.com/
>
> If the patch was only 5 lines of diff, bundling them all in one patch would be
> good. But each is around 30+ lines:
>
> drivers/edac/i7300_edac.c | 33 ++++++++++++++++++++++++---------
>
> which would make the combined patch a bit bigger than I like.
>
> I'll look at the individual patches and if they are OK I'll pick them up
> separately.
>
Thanks @Luck, Tony for the comments.
For the patch series, LGTM.
Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>