Linux EDAC development
From: "Srivatsa S. Bhat" <srivatsa@csail.mit.edu>
To: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Cc: linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org,
	git@amd.com, ptsm@linux.microsoft.com,
	shubhrajyoti.datta@gmail.com,
	Sai Krishna Potthuri <sai.krishna.potthuri@amd.com>,
	Borislav Petkov <bp@alien8.de>, Tony Luck <tony.luck@intel.com>
Subject: Re: [PATCH v2] EDAC/versal: Report PFN and page offset for DDR errors
Date: Wed, 13 May 2026 16:01:48 +0530
Message-ID: <agRTFM3GTU4fuYG6@csail.mit.edu>
In-Reply-To: <20260428102850.1372502-1-shubhrajyoti.datta@amd.com>

On Tue, Apr 28, 2026 at 03:58:50PM +0530, Shubhrajyoti Datta wrote:
> Currently, DDRMC correctable and uncorrectable error events are reported
> to EDAC with the page frame number (PFN) and page offset set to zero,
> so the report cannot be used to locate the failing address.
> 
> Compute the physical address from the error information and extract
> the page frame number and offset before calling edac_mc_handle_error().
> This gives userspace the actual location of the memory error.
> 
> Fixes: 6f15b178cd63 ("EDAC/versal: Add a Xilinx Versal memory controller driver")
> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
> Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>

Reviewed-by: Srivatsa S. Bhat (Microsoft) <srivatsa@csail.mit.edu>

Regards,
Srivatsa
Microsoft Linux Systems Group

> ---
> 
> Changes in v2:
> - Simplify handle_error(), since it is never called for non-CE/UE errors
> - Remove the extra else
> 
>  drivers/edac/versal_edac.c | 38 +++++++++++++++++++-------------------
>  1 file changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/edac/versal_edac.c b/drivers/edac/versal_edac.c
> index 539b46d4f610..5e049cbb3e9b 100644
> --- a/drivers/edac/versal_edac.c
> +++ b/drivers/edac/versal_edac.c
> @@ -513,38 +513,38 @@ static unsigned long convert_to_physical(struct edac_priv *priv, union ecc_error
>   * @stat:	ECC status structure.
>   *
>   * Handles ECC correctable and uncorrectable errors.
> + *
> + * Called after get_error_info(), which filters out
> + * events that are neither CE nor UE. Therefore
> + * stat->error_type is always XDDR_ERR_TYPE_CE or XDDR_ERR_TYPE_UE here.
>   */
>  static void handle_error(struct mem_ctl_info *mci, struct ecc_status *stat)
>  {
>  	struct edac_priv *priv = mci->pvt_info;
> +	enum hw_event_mc_err_type type;
>  	union ecc_error_info pinf;
> +	unsigned long pa, pfn;
>  
>  	if (stat->error_type == XDDR_ERR_TYPE_CE) {
>  		priv->ce_cnt++;
>  		pinf = stat->ceinfo[stat->channel];
> -		snprintf(priv->message, XDDR_EDAC_MSG_SIZE,
> -			 "Error type:%s MC ID: %d Addr at %lx Burst Pos: %d\n",
> -			 "CE", priv->mc_id,
> -			 convert_to_physical(priv, pinf), pinf.burstpos);
> -
> -		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> -				     1, 0, 0, 0, 0, 0, -1,
> -				     priv->message, "");
> -	}
> -
> -	if (stat->error_type == XDDR_ERR_TYPE_UE) {
> +		type = HW_EVENT_ERR_CORRECTED;
> +	} else {
>  		priv->ue_cnt++;
>  		pinf = stat->ueinfo[stat->channel];
> -		snprintf(priv->message, XDDR_EDAC_MSG_SIZE,
> -			 "Error type:%s MC ID: %d Addr at %lx Burst Pos: %d\n",
> -			 "UE", priv->mc_id,
> -			 convert_to_physical(priv, pinf), pinf.burstpos);
> -
> -		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> -				     1, 0, 0, 0, 0, 0, -1,
> -				     priv->message, "");
> +		type = HW_EVENT_ERR_UNCORRECTED;
>  	}
>  
> +	pa = convert_to_physical(priv, pinf);
> +	pfn = PHYS_PFN(pa);
> +	snprintf(priv->message, XDDR_EDAC_MSG_SIZE,
> +		 "Error type:%s MC ID: %d Addr at %lx Burst Pos: %d\n",
> +		 type == HW_EVENT_ERR_UNCORRECTED ? "UE" : "CE", priv->mc_id,
> +		 pa, pinf.burstpos);
> +	edac_mc_handle_error(type, mci,
> +			     1, pfn, offset_in_page(pa), 0, 0, 0, -1,
> +			     priv->message, "");
> +
>  	memset(stat, 0, sizeof(*stat));
>  }
>  
> -- 
> 2.34.1
> 

Thread overview: 2+ messages
2026-04-28 10:28 [PATCH v2] EDAC/versal: Report PFN and page offset for DDR errors Shubhrajyoti Datta
2026-05-13 10:31 ` Srivatsa S. Bhat
