From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "Williams, Dan J" <dan.j.williams@intel.com>,
"toshi.kani@hpe.com" <toshi.kani@hpe.com>
Cc: "linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: [PATCH v6 6/6] libnvdimm, btt: rework error clearing
Date: Thu, 24 Aug 2017 23:15:14 +0000
Message-ID: <1503616402.22880.24.camel@intel.com>
In-Reply-To: <CAPcyv4hF-0Asic+sfXGhN+rvWb6mOb2GpSzWAqkr+U=zO-pt=A@mail.gmail.com>
On Thu, 2017-08-24 at 15:11 -0700, Dan Williams wrote:
> On Thu, Aug 24, 2017 at 2:40 PM, Kani, Toshimitsu <toshi.kani@hpe.com>
> wrote:
> > On Thu, 2017-08-24 at 17:07 -0400, Jeff Moyer wrote:
> > > Dan Williams <dan.j.williams@intel.com> writes:
> > >
> > > > > > I hit an infinite clear loop when DSM Clear Uncorrectable
> > > > > > Error
> > > > > > function fails. Haven't looked into the details, but I
> > > > > > suspect
> > > > > > this unconditional retry is the cause of this.
> > > > >
> > > > > Thanks Toshi - that makes sense. I think the right thing to do
> > > > > would be if the DSM fails, return an EIO yes? (Or should we
> > > > > ignore the fact that there was an error, clear ->has_err, and
> > > > > let
> > > > > the write take its course (possibly generate a CMCI)
> > > > >
> > > > > It will still be in the badblock list, and for reads
> > > > > ->rw_bytes
> > > > > will still check and fail them.
> > > > >
> > > > > I'll send out a new series with a fix, but we really need to
> > > > > get
> > > > > a unit test for BTT error clearing, and I'm working on
> > > > > implementing the new error injection DSMs in libndctl and
> > > > > nfit_test to do that.
> > > > >
> > > >
> > > > I think as much as possible we should try to not fail writes.
> > > > Leave
> > > > the badblock entry in place so that we get an error on the next
> > > > read. Upper-level software reacts more aggressively to write
> > > > errors
> > > > than read errors.
> > >
> > > I don't think it's wise to lie about data integrity. If a write
> > > cannot be completed, it *needs* to fail. You can't make any
> > > assumptions about what applications will do with the result.
> >
> > Agreed. pmem driver returns with EIO on write in this scenario as
> > well.
>
> Ah true, I think we had this discussion before and you convinced me to
> go the EIO route then as well. So consider me re-convinced.
I'll hold off on sending this out for now. There is another place where
we attempt to clear errors: during init, if we find existing free-list
blocks that have errors. I think the right thing to do there, if the
clearing fails, is to make the BTT read-only. I really want to test
this properly with craftily injected errors, so I'm going to get that
sorted out first.
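
To make the intended behavior concrete, here is a rough userspace sketch
of the two cases discussed in this thread: the write path returning EIO
when the clear fails instead of retrying forever, and the init path
falling back to read-only. Every name in it (fake_arena,
fake_clear_error, fake_write, fake_init) is a hypothetical stand-in made
up for illustration; none of this is the actual btt.c code, and the real
patch may well look different.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one arena; not a structure from the btt driver. */
struct fake_arena {
	bool has_err;       /* block is still recorded as bad */
	bool clear_works;   /* simulates whether the clear-error DSM succeeds */
	bool read_only;     /* set when init-time clearing fails */
	int clear_attempts; /* number of times a clear was attempted */
};

/* Simulated clear-error call: 0 on success, -EIO on failure. */
static int fake_clear_error(struct fake_arena *a)
{
	assert(a != NULL);
	a->clear_attempts++;
	if (!a->clear_works)
		return -EIO;
	a->has_err = false;
	return 0;
}

/*
 * Write path: attempt the clear exactly once. If it fails, fail the
 * write with -EIO and leave has_err set, so later reads still see the
 * bad block. There is deliberately no retry loop here.
 */
static int fake_write(struct fake_arena *a)
{
	if (a->has_err) {
		int rc = fake_clear_error(a);

		if (rc)
			return rc;
	}
	return 0;
}

/* Init path: if an existing error cannot be cleared, go read-only. */
static void fake_init(struct fake_arena *a)
{
	if (a->has_err && fake_clear_error(a))
		a->read_only = true;
}
```

With clear_works set to false, fake_write() makes a single clear attempt
and returns -EIO, and fake_init() leaves the arena marked read-only.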
-Vishal