From: jane.chu@oracle.com
To: Jeff Moyer <jmoyer@redhat.com>, Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>,
linux-nvdimm <linux-nvdimm@lists.01.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
"JANE.CHU" <jane.chu@oracle.com>
Subject: Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
Date: Tue, 18 Feb 2020 14:50:43 -0800
Message-ID: <17c0d27e-c23f-b686-1d47-a0ccace03211@oracle.com>
In-Reply-To: <583b5fc2-0358-ea9d-20eb-1323c8cedce2@oracle.com>
On 2/18/20 12:45 PM, jane.chu@oracle.com wrote:
> On 2/18/20 11:50 AM, Jeff Moyer wrote:
>> Dan Williams <dan.j.williams@intel.com> writes:
>>
>>> Right now the kernel does not install a pte on faults that land on a
>>> page with known poison, but only because the error clearing path is so
>>> convoluted and could only claim that fallocate(PUNCH_HOLE) cleared
> >>> errors because that was guaranteed to send 512-byte aligned zeros
>>> down the block-I/O path when the fs-blocks got reallocated. In a world
>>> where native cpu instructions can clear errors the dax write() syscall
>>> case could be covered (modulo 64-byte alignment), and the kernel could
> >>> just let the page be mapped so that the application could attempt its
>>> own fine-grained clearing without calling back into the kernel.
>>
> >> I'm not sure we'd want to allow mapping the PTEs even if there was
>> support for clearing errors via CPU instructions. Any load from a
> >> poisoned page will result in an MCE, and there exists the possibility
>> that you will hit an unrecoverable error (Processor Context Corrupt).
>> It's just safer to catch these cases by not mapping the page, and
>> forcing recovery through the driver.
>>
>> -Jeff
>>
>
> I'm still in the process of trying a number of things before making an
> attempt to respond to Dan's response. But I'm too slow, so I'd like
> to share some concerns I have here.
>
> Suppose a poison in a file is consumed, and the signal handler does the
> repair and recovery as follows: punch a hole of at least 4K, then
> pwrite the correct data into the 'hole', then resume the operation.
> However, the newly allocated pmem block (due to the pwrite to the
> 'hole') is a different, clean physical pmem block, while the poisoned
> block remains unfixed. So we have a provisioning problem, because
> 1. DCPMM is expensive, hence users are likely to provide little spare
>    capacity;
> 2. there is no API between the dax filesystem and the pmem driver for
>    clearing poison at each legitimate point, such as when the
>    filesystem tries to allocate a pmem block or zero out a range.
>
> As DCPMM is used for its performance and capacity in cloud applications,
> the performance-critical code paths include the error handling and
> recovery code paths...
>
> With respect to the new cpu instruction, my concern is about the API
> including the error blast radius as reported in the signal payload.
> Is there a venue where we could discuss this in more detail?

For all the quarantined poison blocks, it's not practical to clear the
poison via ndctl/libndctl at per-namespace granularity, for fear of also
hitting poison that occurred in valid pmem blocks while the data was at
rest.

How can poison ultimately be cleared in a dax-fs under the current
framework? It seems to me that poison needs to be cleared on the go,
automatically.

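For reference, the punch-hole-then-pwrite repair sequence from the quoted
text could be sketched roughly as below. This is only an illustration
under stated assumptions: `repair_block` and `REPAIR_ALIGN` are
hypothetical names, the caller is assumed to hold a known-good copy of
the whole 4K-aligned range, and SIGBUS handling is left out. As noted
above, it only gets the file a fresh block; the poisoned block itself
stays unfixed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: repair is done in units of 4K, per the "at least 4K" above. */
#define REPAIR_ALIGN 4096

/*
 * Hypothetical helper: replace [off, off + len) of fd with known-good data.
 * Returns 0 on success, -1 with errno set on failure.
 */
int repair_block(int fd, off_t off, const void *good, size_t len)
{
    const char *p = good;
    size_t done = 0;

    /* Punching discards the whole range, so insist on aligned input. */
    if (off < 0 || off % REPAIR_ALIGN != 0 ||
        len == 0 || len % REPAIR_ALIGN != 0) {
        errno = EINVAL;
        return -1;
    }

    /* Step 1: punch a hole so the filesystem frees the affected blocks. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  off, (off_t)len) != 0)
        return -1;

    /* Step 2: pwrite the correct data; this allocates new blocks. */
    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, off + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, retry the same chunk */
            return -1;
        }
        done += (size_t)n;
    }

    /* Step 3: make the repaired data durable before resuming. */
    return fsync(fd);
}
```

A caller would invoke something like `repair_block(fd, bad_off & ~4095L,
good_copy, 4096)` once it knows which file offset took the error.
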
Regards,
-jane
>
> Regards,
> -jane
>
>
>