From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: "Kani, Toshimitsu" <toshi.kani@hpe.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH] pmem: report error on clear poison failure
Date: Thu, 13 Oct 2016 13:09:48 -0600
Message-ID: <20161013190948.GA26922@linux.intel.com>
In-Reply-To: <1476382494.20881.58.camel@hpe.com>
On Thu, Oct 13, 2016 at 06:16:29PM +0000, Kani, Toshimitsu wrote:
> On Thu, 2016-10-13 at 10:22 -0700, Dan Williams wrote:
> > On Thu, Oct 13, 2016 at 9:08 AM, Kani, Toshimitsu
> > <toshi.kani@hpe.com> wrote:
> > >
> > > On Thu, 2016-10-13 at 09:01 -0700, Dan Williams wrote:
> > > >
> > > > On Thu, Oct 13, 2016 at 8:54 AM, Toshi Kani <toshi.kani@hpe.com>
> > > > wrote:
> > > > >
> > > > >
> > > > > ACPI Clear Uncorrectable Error DSM function may fail or may be
> > > > > unsupported on a platform. pmem_clear_poison() returns without
> > > > > clearing badblocks in such cases, which leads to a silent data
> > > > > corruption.
> > > > >
> > > > > Change pmem_do_bvec() and pmem_clear_poison() to return -EIO
> > > > > so that filesystem can log an error message.
> > > >
> > > > What's the silent data corruption scenario? If the clear poison
> > > > fails I'm assuming that the poison will still be notified on the
> > > > next read.
> > >
> > > I agree that the data is eventually read, but there is no guarantee
> > > that it is read soon enough, i.e. the user might not access the
> > > data for a long time.
> >
> > ...but that's the same behavior for errors that we don't yet know
> > about. That said, we indeed know that the write failed. I'd feel
> > better about this patch if the justification / impact was clearer in
> > the changelog, because "silent data corruption" is not the impact.
>
> Agreed. How about the following description?
>
> ===
> ACPI Clear Uncorrectable Error DSM function may fail or may be
> unsupported on a platform. pmem_clear_poison() returns without
> clearing badblocks in such cases. This failure is detected at
> the next read (-EIO).
>
> This behavior can lead to an issue when user keeps writing but
> does not read immedicately. For instance, flight recorder file
immediately
> may be only read when it is necessary for troubleshooting.
>
> Change pmem_do_bvec() and pmem_clear_poison() to return -EIO
> so that filesystem can log an error message on a write error.
> ===
>
> Thanks,
> -Toshi
> _______________________________________________
> Linux-nvdimm mailing list
> Linux-nvdimm@lists.01.org
> https://lists.01.org/mailman/listinfo/linux-nvdimm
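For readers following the thread, the behavior the proposed changelog describes can be illustrated with a small userspace sketch. This is not the kernel code: the struct toy_pmem type and the toy_clear_poison(), toy_write() and toy_read() helpers are hypothetical names made up for illustration, and the clear_supported flag is an assumed stand-in for a platform where the clear-error operation fails or is unsupported. The only point it shows is that a write to a known-bad sector returns -EIO at write time when the poison cannot be cleared, rather than leaving the failure to surface at some later read.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define NSECTORS    8
#define SECTOR_SIZE 512

/* Hypothetical stand-in for a pmem device; not the kernel's data structure. */
struct toy_pmem {
	unsigned char data[NSECTORS][SECTOR_SIZE];
	bool bad[NSECTORS];     /* toy version of a badblocks list */
	bool clear_supported;   /* assumption: models a platform that can clear errors */
};

/* Try to clear poison on one sector: 0 on success, -EIO if clearing fails. */
int toy_clear_poison(struct toy_pmem *p, unsigned int sector)
{
	if (!p->clear_supported)
		return -EIO;    /* the sector stays marked bad */
	p->bad[sector] = false;
	return 0;
}

/* Read one sector: a sector still marked bad yields -EIO. */
int toy_read(const struct toy_pmem *p, unsigned int sector, void *buf)
{
	if (sector >= NSECTORS)
		return -EINVAL;
	if (p->bad[sector])
		return -EIO;
	memcpy(buf, p->data[sector], SECTOR_SIZE);
	return 0;
}

/*
 * Write one sector. If the sector is marked bad and the poison cannot be
 * cleared, report -EIO to the caller immediately instead of returning 0.
 */
int toy_write(struct toy_pmem *p, unsigned int sector, const void *buf)
{
	int rc;

	if (sector >= NSECTORS)
		return -EINVAL;
	if (p->bad[sector]) {
		rc = toy_clear_poison(p, sector);
		if (rc)
			return rc;
	}
	memcpy(p->data[sector], buf, SECTOR_SIZE);
	return 0;
}
```

In this sketch a writer that never reads back (the flight recorder case above) still sees the failure, because toy_write() propagates the error from toy_clear_poison() to its caller.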
Thread overview:
2016-10-13 15:54 [PATCH] pmem: report error on clear poison failure Toshi Kani
2016-10-13 16:01 ` Dan Williams
2016-10-13 16:08 ` Kani, Toshimitsu
2016-10-13 17:22 ` Dan Williams
2016-10-13 18:16 ` Kani, Toshimitsu
2016-10-13 19:09 ` Ross Zwisler [this message]
2016-10-13 19:24 ` Dan Williams