* [PATCH] EDAC/versalnet: Report PFN and page offset for DDR errors
@ 2026-03-29 12:44 Shubhrajyoti Datta
2026-04-01 5:58 ` Prasanna Kumar T S M
0 siblings, 1 reply; 4+ messages in thread
From: Shubhrajyoti Datta @ 2026-03-29 12:44 UTC (permalink / raw)
To: linux-edac
Cc: git, shubhrajyoti.datta, Michal Simek, Borislav Petkov, Tony Luck,
linux-kernel
Currently, DDRMC correctable and uncorrectable error events are reported
to EDAC with page frame number (pfn) and offset set to zero.
This information is not useful to locate the address for memory errors.
Compute the physical address from the error information and extract
the page frame number and offset before calling edac_mc_handle_error().
This provides the actual memory location information to the userspace.
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
---
drivers/edac/versalnet_edac.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
index 915bcd6166f7..66df090245be 100644
--- a/drivers/edac/versalnet_edac.c
+++ b/drivers/edac/versalnet_edac.c
@@ -431,8 +431,7 @@ static void handle_error(struct mc_priv *priv, struct ecc_status *stat,
{
union ecc_error_info pinf;
struct mem_ctl_info *mci;
- unsigned long pa;
- phys_addr_t pfn;
+ unsigned long pa, pfn;
int err;
if (WARN_ON_ONCE(ctl_num >= NUM_CONTROLLERS))
@@ -442,27 +441,28 @@ static void handle_error(struct mc_priv *priv, struct ecc_status *stat,
if (stat->error_type == MC5_ERR_TYPE_CE) {
pinf = stat->ceinfo[stat->channel];
+ pa = convert_to_physical(priv, pinf, ctl_num, error_data);
+ pfn = PHYS_PFN(pa);
snprintf(priv->message, sizeof(priv->message),
"Error type:%s Controller %d Addr at %lx\n",
- "CE", ctl_num, convert_to_physical(priv, pinf, ctl_num, error_data));
+ "CE", ctl_num, pa);
edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
- 1, 0, 0, 0, 0, 0, -1,
+ 1, pfn, offset_in_page(pa), 0, 0, 0, -1,
priv->message, "");
}
if (stat->error_type == MC5_ERR_TYPE_UE) {
pinf = stat->ueinfo[stat->channel];
+ pa = convert_to_physical(priv, pinf, ctl_num, error_data);
+ pfn = PHYS_PFN(pa);
snprintf(priv->message, sizeof(priv->message),
"Error type:%s controller %d Addr at %lx\n",
- "UE", ctl_num, convert_to_physical(priv, pinf, ctl_num, error_data));
+ "UE", ctl_num, pa);
edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
- 1, 0, 0, 0, 0, 0, -1,
+ 1, pfn, offset_in_page(pa), 0, 0, 0, -1,
priv->message, "");
- pa = convert_to_physical(priv, pinf, ctl_num, error_data);
- pfn = PHYS_PFN(pa);
-
if (IS_ENABLED(CONFIG_MEMORY_FAILURE)) {
err = memory_failure(pfn, MF_ACTION_REQUIRED);
if (err)
--
2.34.1
* Re: [PATCH] EDAC/versalnet: Report PFN and page offset for DDR errors
2026-03-29 12:44 [PATCH] EDAC/versalnet: Report PFN and page offset for DDR errors Shubhrajyoti Datta
@ 2026-04-01 5:58 ` Prasanna Kumar T S M
2026-04-22 3:59 ` Srivatsa S. Bhat
0 siblings, 1 reply; 4+ messages in thread
From: Prasanna Kumar T S M @ 2026-04-01 5:58 UTC (permalink / raw)
To: Shubhrajyoti Datta, linux-edac
Cc: git, shubhrajyoti.datta, Michal Simek, Borislav Petkov, Tony Luck,
linux-kernel
On 29-03-2026 18:14, Shubhrajyoti Datta wrote:
> Currently, DDRMC correctable and uncorrectable error events are reported
> to EDAC with page frame number (pfn) and offset set to zero.
> This information is not useful to locate the address for memory errors.
>
> Compute the physical address from the error information and extract
> the page frame number and offset before calling edac_mc_handle_error().
> This provides the actual memory location information to the userspace.
>
> Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
> ---
>
> drivers/edac/versalnet_edac.c | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
> index 915bcd6166f7..66df090245be 100644
> --- a/drivers/edac/versalnet_edac.c
> +++ b/drivers/edac/versalnet_edac.c
> @@ -431,8 +431,7 @@ static void handle_error(struct mc_priv *priv, struct ecc_status *stat,
> {
> union ecc_error_info pinf;
> struct mem_ctl_info *mci;
> - unsigned long pa;
> - phys_addr_t pfn;
> + unsigned long pa, pfn;
> int err;
>
> if (WARN_ON_ONCE(ctl_num >= NUM_CONTROLLERS))
> @@ -442,27 +441,28 @@ static void handle_error(struct mc_priv *priv, struct ecc_status *stat,
>
> if (stat->error_type == MC5_ERR_TYPE_CE) {
> pinf = stat->ceinfo[stat->channel];
> + pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> + pfn = PHYS_PFN(pa);
> snprintf(priv->message, sizeof(priv->message),
> "Error type:%s Controller %d Addr at %lx\n",
> - "CE", ctl_num, convert_to_physical(priv, pinf, ctl_num, error_data));
> + "CE", ctl_num, pa);
>
> edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> - 1, 0, 0, 0, 0, 0, -1,
> + 1, pfn, offset_in_page(pa), 0, 0, 0, -1,
> priv->message, "");
> }
>
> if (stat->error_type == MC5_ERR_TYPE_UE) {
> pinf = stat->ueinfo[stat->channel];
> + pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> + pfn = PHYS_PFN(pa);
> snprintf(priv->message, sizeof(priv->message),
> "Error type:%s controller %d Addr at %lx\n",
> - "UE", ctl_num, convert_to_physical(priv, pinf, ctl_num, error_data));
> + "UE", ctl_num, pa);
>
> edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> - 1, 0, 0, 0, 0, 0, -1,
> + 1, pfn, offset_in_page(pa), 0, 0, 0, -1,
> priv->message, "");
> - pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> - pfn = PHYS_PFN(pa);
> -
> if (IS_ENABLED(CONFIG_MEMORY_FAILURE)) {
> err = memory_failure(pfn, MF_ACTION_REQUIRED);
> if (err)
Nit: pa and pfn calculation can be moved out of the if() condition.
Irrespective of the nit, the patch looks good.
Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
* Re: [PATCH] EDAC/versalnet: Report PFN and page offset for DDR errors
2026-04-01 5:58 ` Prasanna Kumar T S M
@ 2026-04-22 3:59 ` Srivatsa S. Bhat
2026-04-27 10:47 ` Shubhrajyoti Datta
0 siblings, 1 reply; 4+ messages in thread
From: Srivatsa S. Bhat @ 2026-04-22 3:59 UTC (permalink / raw)
To: Prasanna Kumar T S M
Cc: Shubhrajyoti Datta, linux-edac, git, shubhrajyoti.datta,
Michal Simek, Borislav Petkov, Tony Luck, linux-kernel
On Wed, Apr 01, 2026 at 11:28:10AM +0530, Prasanna Kumar T S M wrote:
>
>
> On 29-03-2026 18:14, Shubhrajyoti Datta wrote:
> > Currently, DDRMC correctable and uncorrectable error events are reported
> > to EDAC with page frame number (pfn) and offset set to zero.
> > This information is not useful to locate the address for memory errors.
> >
> > Compute the physical address from the error information and extract
> > the page frame number and offset before calling edac_mc_handle_error().
> > This provides the actual memory location information to the userspace.
> >
> > Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
> > ---
> >
[...]
> > if (stat->error_type == MC5_ERR_TYPE_UE) {
> > pinf = stat->ueinfo[stat->channel];
> > + pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> > + pfn = PHYS_PFN(pa);
> > snprintf(priv->message, sizeof(priv->message),
> > "Error type:%s controller %d Addr at %lx\n",
> > - "UE", ctl_num, convert_to_physical(priv, pinf, ctl_num, error_data));
> > + "UE", ctl_num, pa);
> > edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> > - 1, 0, 0, 0, 0, 0, -1,
> > + 1, pfn, offset_in_page(pa), 0, 0, 0, -1,
> > priv->message, "");
> > - pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> > - pfn = PHYS_PFN(pa);
> > -
> > if (IS_ENABLED(CONFIG_MEMORY_FAILURE)) {
> > err = memory_failure(pfn, MF_ACTION_REQUIRED);
> > if (err)
>
> Nit: pa and pfn calculation can be moved out of the if() condition.
>
Hi Shubrajyoti,
Could you revise this patch with a similar cleanup for the versalnet
driver as you did for the versal driver to avoid code duplication,
please?
https://lore.kernel.org/all/20260415060239.733200-1-shubhrajyoti.datta@amd.com/#t
Thank you!
Regards,
Srivatsa
Microsoft Linux Systems Group
* Re: [PATCH] EDAC/versalnet: Report PFN and page offset for DDR errors
2026-04-22 3:59 ` Srivatsa S. Bhat
@ 2026-04-27 10:47 ` Shubhrajyoti Datta
0 siblings, 0 replies; 4+ messages in thread
From: Shubhrajyoti Datta @ 2026-04-27 10:47 UTC (permalink / raw)
To: Srivatsa S. Bhat
Cc: Prasanna Kumar T S M, Shubhrajyoti Datta, linux-edac, git,
Michal Simek, Borislav Petkov, Tony Luck, linux-kernel
On Wed, Apr 22, 2026 at 9:29 AM Srivatsa S. Bhat <srivatsa@csail.mit.edu> wrote:
>
> On Wed, Apr 01, 2026 at 11:28:10AM +0530, Prasanna Kumar T S M wrote:
> >
> >
> > On 29-03-2026 18:14, Shubhrajyoti Datta wrote:
> > > Currently, DDRMC correctable and uncorrectable error events are reported
> > > to EDAC with page frame number (pfn) and offset set to zero.
> > > This information is not useful to locate the address for memory errors.
> > >
> > > Compute the physical address from the error information and extract
> > > the page frame number and offset before calling edac_mc_handle_error().
> > > This provides the actual memory location information to the userspace.
> > >
> > > Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
> > > ---
> > >
>
> [...]
>
> > > if (stat->error_type == MC5_ERR_TYPE_UE) {
> > > pinf = stat->ueinfo[stat->channel];
> > > + pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> > > + pfn = PHYS_PFN(pa);
> > > snprintf(priv->message, sizeof(priv->message),
> > > "Error type:%s controller %d Addr at %lx\n",
> > > - "UE", ctl_num, convert_to_physical(priv, pinf, ctl_num, error_data));
> > > + "UE", ctl_num, pa);
> > > edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> > > - 1, 0, 0, 0, 0, 0, -1,
> > > + 1, pfn, offset_in_page(pa), 0, 0, 0, -1,
> > > priv->message, "");
> > > - pa = convert_to_physical(priv, pinf, ctl_num, error_data);
> > > - pfn = PHYS_PFN(pa);
> > > -
> > > if (IS_ENABLED(CONFIG_MEMORY_FAILURE)) {
> > > err = memory_failure(pfn, MF_ACTION_REQUIRED);
> > > if (err)
> >
> > Nit: pa and pfn calculation can be moved out of the if() condition.
> >
>
> Hi Shubrajyoti,
>
> Could you revise this patch with a similar cleanup for the versalnet
> driver as you did for the versal driver to avoid code duplication,
> please?
> https://lore.kernel.org/all/20260415060239.733200-1-shubhrajyoti.datta@amd.com/#t
Let me know if the below looks fine:
if (stat->error_type == MC5_ERR_TYPE_CE) {
	pinf = stat->ceinfo[stat->channel];
	type = HW_EVENT_ERR_CORRECTED;
}

if (stat->error_type == MC5_ERR_TYPE_UE) {
	pinf = stat->ueinfo[stat->channel];
	type = HW_EVENT_ERR_UNCORRECTED;
}

pa = convert_to_physical(priv, pinf, ctl_num, error_data);
pfn = PHYS_PFN(pa);

snprintf(priv->message, sizeof(priv->message),
	 "Error type:%s Controller %d Addr at %lx\n",
	 type == HW_EVENT_ERR_UNCORRECTED ? "UE" : "CE",
	 ctl_num, pa);

edac_mc_handle_error(type, mci,
		     1, pfn, pa & ~PAGE_MASK, 0, 0, 0, -1,
		     priv->message, "");

if (stat->error_type == MC5_ERR_TYPE_UE) {
	if (IS_ENABLED(CONFIG_MEMORY_FAILURE)) {
		err = memory_failure(pfn, MF_ACTION_REQUIRED);
		if (err)
			edac_dbg(2, "memory_failure() error: %d", err);
		else
			edac_dbg(2, "Poison page at PA 0x%lx\n", pa);
	}
}
>
> Thank you!
>
> Regards,
> Srivatsa
> Microsoft Linux Systems Group
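For reference, the pfn/offset split that the patch passes to edac_mc_handle_error() can be reproduced in userspace. The sketch below is only an illustration. It assumes 4 KiB pages (a page shift of 12), whereas the kernel takes PAGE_SHIFT from the architecture. The sketch_* names are hypothetical and are not part of the versalnet driver. It shows that the shift corresponds to PHYS_PFN(pa), and that masking with ~PAGE_MASK, as in the proposed rework, yields the same value as offset_in_page(pa) in the original patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel derives PAGE_SHIFT per architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical container for the two values handed to the EDAC core. */
struct sketch_err_loc {
	uint64_t pfn;    /* page frame number */
	uint64_t offset; /* byte offset inside that page */
};

/* Split a physical address into page frame number and in-page offset. */
static struct sketch_err_loc sketch_split_pa(uint64_t pa)
{
	struct sketch_err_loc loc;

	loc.pfn = pa >> SKETCH_PAGE_SHIFT;     /* same arithmetic as PHYS_PFN(pa) */
	loc.offset = pa & ~SKETCH_PAGE_MASK;   /* same arithmetic as offset_in_page(pa) */

	/* Sanity check: the two parts recombine to the original address. */
	assert(((loc.pfn << SKETCH_PAGE_SHIFT) | loc.offset) == pa);

	return loc;
}
```

With these assumptions, an address of 0x12345678 splits into pfn 0x12345 and offset 0x678, which is the kind of location information the commit message says was previously reported as zero.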