From: Jane Chu <jane.chu@oracle.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: david <david@fromorbit.com>,
"Darrick J. Wong" <djwong@kernel.org>,
Christoph Hellwig <hch@infradead.org>,
Vishal L Verma <vishal.l.verma@intel.com>,
Dave Jiang <dave.jiang@intel.com>,
Alasdair Kergon <agk@redhat.com>,
Mike Snitzer <snitzer@redhat.com>,
device-mapper development <dm-devel@redhat.com>,
"Weiny, Ira" <ira.weiny@intel.com>,
Matthew Wilcox <willy@infradead.org>,
Vivek Goyal <vgoyal@redhat.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux NVDIMM <nvdimm@lists.linux.dev>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>, X86 ML <x86@kernel.org>,
"luto@kernel.org" <luto@kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"dave.hansen@intel.com" <dave.hansen@intel.com>
Subject: Re: [PATCH v7 3/6] mce: fix set_mce_nospec to always unmap the whole page
Date: Fri, 15 Apr 2022 16:18:57 +0000 [thread overview]
Message-ID: <b3f0bfcd-9e7c-cc71-e91d-d0f28053dea9@oracle.com> (raw)
In-Reply-To: <CAPcyv4jpwzMPKtzzc=DEbC340+zmzXkj+QtPVxfYbraskLKv8g@mail.gmail.com>
On 4/13/2022 7:32 PM, Dan Williams wrote:
> On Wed, Apr 13, 2022 at 4:36 PM Jane Chu <jane.chu@oracle.com> wrote:
>>
>> On 4/11/2022 4:27 PM, Dan Williams wrote:
>>> On Tue, Apr 5, 2022 at 12:48 PM Jane Chu <jane.chu@oracle.com> wrote:
>>>>
>>>> The set_memory_uc() approach doesn't work well in all cases.
>>>> For example, when "The VMM unmapped the bad page from guest
>>>> physical space and passed the machine check to the guest."
>>>> "The guest gets virtual #MC on an access to that page.
>>>> When the guest tries to do set_memory_uc() and instructs
>>>> cpa_flush() to do clean caches that results in taking another
>>>> fault / exception perhaps because the VMM unmapped the page
>>>> from the guest."
>>>>
>>>> Since the driver has special knowledge to handle NP or UC,
>>>
>>> I think a patch is needed before this one to make this statement true? I.e.:
>>>
>>> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
>>> index ee8d9973f60b..11641f55025a 100644
>>> --- a/drivers/acpi/nfit/mce.c
>>> +++ b/drivers/acpi/nfit/mce.c
>>> @@ -32,6 +32,7 @@ static int nfit_handle_mce(struct notifier_block
>>> *nb, unsigned long val,
>>> */
>>> mutex_lock(&acpi_desc_lock);
>>> list_for_each_entry(acpi_desc, &acpi_descs, list) {
>>> + unsigned int align = 1UL << MCI_MISC_ADDR_LSB(mce->misc);
>>> struct device *dev = acpi_desc->dev;
>>> int found_match = 0;
>>>
>>> @@ -63,8 +64,7 @@ static int nfit_handle_mce(struct notifier_block
>>> *nb, unsigned long val,
>>>
>>> /* If this fails due to an -ENOMEM, there is little we can do */
>>> nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
>>> - ALIGN(mce->addr, L1_CACHE_BYTES),
>>> - L1_CACHE_BYTES);
>>> + ALIGN(mce->addr, align), align);
>>> nvdimm_region_notify(nfit_spa->nd_region,
>>> NVDIMM_REVALIDATE_POISON);
>>>
>>
>> Dan, I tried the above change, and this is what I got after injecting 8
>> back-to-back poisons, then read them and received SIGBUS/BUS_MCEERR_AR,
>> then repair via the v7 patch which works until this change is added.
>>
>> [ 6240.955331] nfit ACPI0012:00: XXX, align = 100
>> [ 6240.960300] nfit ACPI0012:00: XXX, ALIGN(mce->addr,
>> L1_CACHE_BYTES)=1851600400, L1_CACHE_BYTES=40, ALIGN(mce->addr,
>> align)=1851600400
>> [..]
>> [ 6242.052277] nfit ACPI0012:00: XXX, align = 100
>> [ 6242.057243] nfit ACPI0012:00: XXX, ALIGN(mce->addr,
>> L1_CACHE_BYTES)=1851601000, L1_CACHE_BYTES=40, ALIGN(mce->addr,
>> align)=1851601000
>> [..]
>> [ 6244.917198] nfit ACPI0012:00: XXX, align = 1000
>> [ 6244.922258] nfit ACPI0012:00: XXX, ALIGN(mce->addr,
>> L1_CACHE_BYTES)=1851601200, L1_CACHE_BYTES=40, ALIGN(mce->addr,
>> align)=1851602000
>> [..]
>>
>> All 8 poisons remain uncleared.
>>
>> Without further investigation, I don't know why the failure.
>> Could we mark this change to a follow-on task?
>
> Perhaps a bit more debug before kicking this can down the road...
>
> I'm worried that this means that the driver is not accurately tracking
> poison data For example, that last case the hardware is indicating a
> full page clobber, but the old code would only track poison from
> 1851601200 to 1851601400 (i.e. the first 512 bytes from the base error
> address).
I would appear so, but the old code tracking seems to be working
correctly. For example, injecting 8 back-to-back poison, the
/sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0012:00/ndbus0/region0/badblocks
cpatures all 8 of them, that spans 2K range, right? I never had issue
about missing poison in my tests.
>
> Oh... wait, I think there is a second bug here, that ALIGN should be
> ALIGN_DOWN(). Does this restore the result you expect?
That's it, ALIGN_DOWN fixed the issue, thanks!!
I'll add a patch, suggested-by you.
>
> diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
> index ee8d9973f60b..d7a52238a741 100644
> --- a/drivers/acpi/nfit/mce.c
> +++ b/drivers/acpi/nfit/mce.c
> @@ -63,8 +63,7 @@ static int nfit_handle_mce(struct notifier_block
> *nb, unsigned long val,
>
> /* If this fails due to an -ENOMEM, there is little we can do */
> nvdimm_bus_add_badrange(acpi_desc->nvdimm_bus,
> - ALIGN(mce->addr, L1_CACHE_BYTES),
> - L1_CACHE_BYTES);
> + ALIGN_DOWN(mce->addr, align), align);
> nvdimm_region_notify(nfit_spa->nd_region,
> NVDIMM_REVALIDATE_POISON);
>
>
>> The driver knows a lot about how to clear poisons besides hardcoding
>> poison alignment to 0x40 bytes.
>
> It does, but the badblocks tracking should still be reliable, and if
> it's not reliable I expect there are cases where recovery_write() will
> not be triggered because the driver will not fail the
> dax_direct_access() attempt.
thanks!
-jane
next prev parent reply other threads:[~2022-04-15 16:19 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-05 19:47 [PATCH v7 0/6] DAX poison recovery Jane Chu
2022-04-05 19:47 ` [PATCH v7 1/6] x86/mm: fix comment Jane Chu
2022-04-11 22:07 ` Dan Williams
2022-04-12 9:53 ` Borislav Petkov
2022-04-14 1:00 ` Jane Chu
2022-04-14 8:44 ` Borislav Petkov
2022-04-14 21:54 ` Jane Chu
2022-04-05 19:47 ` [PATCH v7 2/6] x86/mce: relocate set{clear}_mce_nospec() functions Jane Chu
2022-04-06 5:01 ` Christoph Hellwig
2022-04-11 22:20 ` Dan Williams
2022-04-14 0:56 ` Jane Chu
2022-04-05 19:47 ` [PATCH v7 3/6] mce: fix set_mce_nospec to always unmap the whole page Jane Chu
2022-04-06 5:02 ` Christoph Hellwig
2022-04-11 23:27 ` Dan Williams
2022-04-13 23:36 ` Jane Chu
2022-04-14 2:32 ` Dan Williams
2022-04-15 16:18 ` Jane Chu [this message]
2022-04-12 10:07 ` Borislav Petkov
2022-04-13 23:41 ` Jane Chu
2022-04-05 19:47 ` [PATCH v7 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops Jane Chu
2022-04-06 5:19 ` Christoph Hellwig
2022-04-06 17:32 ` Jane Chu
2022-04-06 17:45 ` Jane Chu
2022-04-07 5:30 ` Christoph Hellwig
2022-04-11 23:55 ` Dan Williams
2022-04-14 0:48 ` Jane Chu
2022-04-14 0:47 ` Jane Chu
2022-04-12 0:08 ` Dan Williams
2022-04-14 0:50 ` Jane Chu
2022-04-12 4:57 ` Dan Williams
2022-04-12 5:02 ` Christoph Hellwig
2022-04-14 0:51 ` Jane Chu
2022-04-05 19:47 ` [PATCH v7 5/6] pmem: refactor pmem_clear_poison() Jane Chu
2022-04-06 5:04 ` Christoph Hellwig
2022-04-06 17:34 ` Jane Chu
2022-04-12 4:26 ` Dan Williams
2022-04-14 0:55 ` Jane Chu
2022-04-14 2:02 ` Dan Williams
2022-04-05 19:47 ` [PATCH v7 6/6] pmem: implement pmem_recovery_write() Jane Chu
2022-04-06 5:21 ` Christoph Hellwig
2022-04-06 17:33 ` Jane Chu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b3f0bfcd-9e7c-cc71-e91d-d0f28053dea9@oracle.com \
--to=jane.chu@oracle.com \
--cc=agk@redhat.com \
--cc=dan.j.williams@intel.com \
--cc=dave.hansen@intel.com \
--cc=dave.jiang@intel.com \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=dm-devel@redhat.com \
--cc=hch@infradead.org \
--cc=ira.weiny@intel.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=luto@kernel.org \
--cc=nvdimm@lists.linux.dev \
--cc=peterz@infradead.org \
--cc=snitzer@redhat.com \
--cc=vgoyal@redhat.com \
--cc=vishal.l.verma@intel.com \
--cc=willy@infradead.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).