From: Ingo Molnar <mingo@kernel.org>
To: Jane Chu <jane.chu@oracle.com>
Cc: tony.luck@intel.com, bp@alien8.de, tglx@linutronix.de,
	mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org,
	linux-edac@vger.kernel.org, dan.j.williams@intel.com,
	linux-kernel@vger.kernel.org, hch@lst.de, nvdimm@lists.linux.dev
Subject: Re: [PATCH v7] x86/mce: retrieve poison range from hardware
Date: Wed, 3 Aug 2022 10:53:10 +0200
Message-ID: <Yuo3dioqb9mDAOcT@gmail.com>
In-Reply-To: <20220802195053.3882368-1-jane.chu@oracle.com>


* Jane Chu <jane.chu@oracle.com> wrote:

> With Commit 7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine

s/Commit/commit

> poison granularity") that changed nfit_handle_mce() callback to report
> badrange according to 1ULL << MCI_MISC_ADDR_LSB(mce->misc), it's been
> discovered that the mce->misc LSB field is 0x1000 bytes, hence injecting
> 2 back-to-back poisons causes the driver to log 8 badblocks,
> because 0x1000 bytes is eight 512-byte sectors.
> 
> Dan Williams noticed that apei_mce_report_mem_error() hardcodes
> the LSB field to PAGE_SHIFT instead of consulting the input
> struct cper_sec_mem_err record.  So change to rely on hardware whenever
> support is available.
> 
> Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
> 
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
>  arch/x86/kernel/cpu/mce/apei.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
> index 717192915f28..8ed341714686 100644
> --- a/arch/x86/kernel/cpu/mce/apei.c
> +++ b/arch/x86/kernel/cpu/mce/apei.c
> @@ -29,15 +29,26 @@
>  void apei_mce_report_mem_error(int severity, struct cper_sec_mem_err *mem_err)
>  {
>  	struct mce m;
> +	int lsb;
>  
>  	if (!(mem_err->validation_bits & CPER_MEM_VALID_PA))
>  		return;
>  
> +	/*
> +	 * Even if the ->validation_bits are set for address mask,
> +	 * to be extra safe, check and reject an error radius '0',
> +	 * and fall back to the default page size.
> +	 */
> +	if (mem_err->validation_bits & CPER_MEM_VALID_PA_MASK)
> +		lsb = find_first_bit((void *)&mem_err->physical_addr_mask, PAGE_SHIFT);
> +	else
> +		lsb = PAGE_SHIFT;
> +
>  	mce_setup(&m);
>  	m.bank = -1;
>  	/* Fake a memory read error with unknown channel */
>  	m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | MCI_STATUS_MISCV | 0x9f;
> -	m.misc = (MCI_MISC_ADDR_PHYS << 6) | PAGE_SHIFT;
> +	m.misc = (MCI_MISC_ADDR_PHYS << 6) | lsb;

LGTM.
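
For readers following along, here is a rough userspace sketch of the
arithmetic the patch relies on. It is not the kernel code: mask_to_lsb()
is a hypothetical stand-in for find_first_bit(..., PAGE_SHIFT), the helper
names are made up for illustration, and PAGE_SHIFT is assumed to be 12
(4 KiB pages). The two MCI_MISC_* macros are meant to mirror the kernel
definitions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT		12	/* assumption: 4 KiB pages */
#define MCI_MISC_ADDR_PHYS	2	/* meant to mirror the kernel value */
#define MCI_MISC_ADDR_LSB(m)	((m) & 0x3f)

/*
 * Hypothetical stand-in for find_first_bit(&mask, PAGE_SHIFT):
 * index of the lowest set bit below PAGE_SHIFT, or PAGE_SHIFT if
 * none is set, so an all-zero mask falls back to page granularity.
 */
static int mask_to_lsb(uint64_t mask)
{
	for (int bit = 0; bit < PAGE_SHIFT; bit++)
		if (mask & (1ULL << bit))
			return bit;
	return PAGE_SHIFT;
}

/* Build the fake MISC value the way the patched function does. */
static uint64_t make_misc(uint64_t pa_mask, int mask_valid)
{
	int lsb = mask_valid ? mask_to_lsb(pa_mask) : PAGE_SHIFT;

	assert(lsb >= 0 && lsb <= PAGE_SHIFT);
	return ((uint64_t)MCI_MISC_ADDR_PHYS << 6) | (uint64_t)lsb;
}

/* Poison range in bytes, as a consumer such as nfit_handle_mce() sees it. */
static uint64_t poison_bytes(uint64_t misc)
{
	return 1ULL << MCI_MISC_ADDR_LSB(misc);
}

/* Number of 512-byte sectors covered, rounded up. */
static uint64_t poison_sectors(uint64_t misc)
{
	return (poison_bytes(misc) + 511) / 512;
}
```

With a valid mask of 0xffffffffffffff00 the sketch yields lsb 8, i.e. a
256-byte range that rounds up to a single 512-byte sector. Without a valid
mask (the old hardcoded behaviour) it yields lsb 12, i.e. 0x1000 bytes or
8 sectors, which matches the 8 badblocks described in the changelog.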

I suppose this wants to go upstream via the tree the bug came from (NVDIMM 
tree? ACPI tree?), or should we pick it up into the x86 tree?

Thanks,

	Ingo

