public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Srivatsa S. Bhat" <srivatsa@csail.mit.edu>
To: "Datta, Shubhrajyoti" <shubhrajyoti.datta@amd.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"git (AMD-Xilinx)" <git@amd.com>,
	"ptsm@linux.microsoft.com" <ptsm@linux.microsoft.com>,
	"shubhrajyoti.datta@gmail.com" <shubhrajyoti.datta@gmail.com>,
	Borislav Petkov <bp@alien8.de>, Tony Luck <tony.luck@intel.com>
Subject: Re: [PATCH] EDAC/versal: Report PFN and page offset for DDR errors
Date: Tue, 28 Apr 2026 00:19:12 +0530	[thread overview]
Message-ID: <ae-vqFb1GvTkj9e0@csail.mit.edu> (raw)
In-Reply-To: <DS2PR12MB9821117564B73D6C805727BB81362@DS2PR12MB9821.namprd12.prod.outlook.com>

On Mon, Apr 27, 2026 at 06:27:36AM +0000, Datta, Shubhrajyoti wrote:
> Public
> 
> > -----Original Message-----
> > From: Srivatsa S. Bhat <srivatsa@csail.mit.edu>
> > Sent: Wednesday, April 22, 2026 9:20 AM
> > To: Datta, Shubhrajyoti <shubhrajyoti.datta@amd.com>
> > Cc: linux-kernel@vger.kernel.org; linux-edac@vger.kernel.org; git (AMD-Xilinx)
> > <git@amd.com>; ptsm@linux.microsoft.com; shubhrajyoti.datta@gmail.com;
> > Borislav Petkov <bp@alien8.de>; Tony Luck <tony.luck@intel.com>
> > Subject: Re: [PATCH] EDAC/versal: Report PFN and page offset for DDR errors
> >
> > Caution: This message originated from an External Source. Use proper caution
> > when opening attachments, clicking links, or responding.
> >
> >
> > On Wed, Apr 15, 2026 at 11:32:39AM +0530, Shubhrajyoti Datta wrote:
> > > Currently, DDRMC correctable and uncorrectable error events are
> > > reported to EDAC with page frame number (pfn) and offset set to zero.
> > > This information is not useful to locate the address for memory errors.
> > >
> > > Compute the physical address from the error information and extract
> > > the page frame number and offset before calling edac_mc_handle_error().
> > > This provides the actual memory location information to the userspace.
> > >
> > > Fixes: 6f15b178cd63 ("EDAC/versal: Add a Xilinx Versal memory
> > > controller driver")
> > > Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
> > > ---
> > >
> > >  drivers/edac/versal_edac.c | 36 +++++++++++++++++-------------------
> > >  1 file changed, 17 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/drivers/edac/versal_edac.c b/drivers/edac/versal_edac.c
> > > index 5a43b5d43ca2..18045f96610e 100644
> > > --- a/drivers/edac/versal_edac.c
> > > +++ b/drivers/edac/versal_edac.c
> > > @@ -414,34 +414,32 @@ static unsigned long convert_to_physical(struct
> > > edac_priv *priv, union ecc_error  static void handle_error(struct
> > > mem_ctl_info *mci, struct ecc_status *stat)  {
> >
> > [...]
> >
> > >       if (stat->error_type == XDDR_ERR_TYPE_CE) {
> >
> > [...]
> >
> > > +     } else if (stat->error_type == XDDR_ERR_TYPE_UE) {
> >
> > [...]
> > > +     } else {
> > > +             return;
> >
> > I like the cleanup contributed by this patch (in terms of reducing code
> > duplication) in addition to the actual fix. However, this patch also introduces a
> > subtle behavior change - the existing code calls
> > memset() to clear out the ecc_status struct unconditionally, but this patch
> > doesn't call memset if the error type is not CE or UE (i.e., in the early return
> > path).
> >
> > Was this change intentional? Wouldn't it potentially cause stale data to be left
> > over in the ecc_status struct, affecting future reuse?
> 
> Hi Srivatsa,
> Thanks for the review.
> 
> The early return in handle_error is intentionally safe.
> get_error_info is always called before handle_error:
> 
>   if (get_error_info(priv))
>           return;
>   handle_error(mci, &priv->stat);
> 
>   get_error_info already guards against non-CE/UE events:
> 
>   if (!eccr0_ceval && !eccr1_ceval && !eccr0_ueval && !eccr1_ueval)
>           return 1;
> 
>   So handle_error is only ever reached when at least one CE or UE
>   error is present, making the else { return; } branch unreachable
>   in practice. The removed memset in that path was overprotective
>   dead code.
> 
>   For future events, get_error_info always reloads the data fresh
>   before handle_error runs, so there is no risk of stale data.
> 

I see, thank you for the clarification!

>   I can add a comment above handle_error noting this precondition if
>   that would help future readers.
> 

Sure, that would be great! Along with that, how about also removing
the else { return; } branch if it is actually unreachable?

Regards,
Srivatsa
Microsoft Linux Systems Group

      reply	other threads:[~2026-04-27 18:49 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-15  6:02 [PATCH] EDAC/versal: Report PFN and page offset for DDR errors Shubhrajyoti Datta
2026-04-16  5:32 ` Prasanna Kumar T S M
2026-04-22  3:49 ` Srivatsa S. Bhat
2026-04-27  6:27   ` Datta, Shubhrajyoti
2026-04-27 18:49     ` Srivatsa S. Bhat [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ae-vqFb1GvTkj9e0@csail.mit.edu \
    --to=srivatsa@csail.mit.edu \
    --cc=bp@alien8.de \
    --cc=git@amd.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ptsm@linux.microsoft.com \
    --cc=shubhrajyoti.datta@amd.com \
    --cc=shubhrajyoti.datta@gmail.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox