From: "Kani, Toshimitsu" <toshi.kani@hpe.com>
To: "dan.j.williams@intel.com" <dan.j.williams@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: Re: [RFC PATCH] dax: add badblocks check to Device DAX
Date: Wed, 3 May 2017 18:46:54 +0000
Message-ID: <1493837209.30303.47.camel@hpe.com>
In-Reply-To: <CAPcyv4gthc8Gc7SUyxqecDrd+dtOfOzY19bs5sY1qMepQKh=kQ@mail.gmail.com>

On Wed, 2017-05-03 at 09:30 -0700, Dan Williams wrote:
> On Wed, May 3, 2017 at 9:09 AM, Kani, Toshimitsu <toshi.kani@hpe.com>
> wrote:
> > On Wed, 2017-05-03 at 08:52 -0700, Dan Williams wrote:
> > > On Wed, May 3, 2017 at 8:31 AM, Toshi Kani <toshi.kani@hpe.com>
> > > wrote:
> > > > > This is an RFC patch seeking suggestions.  It adds support
> > > > > for a badblocks check in Device DAX by using the region-level
> > > > > badblocks list.  This patch is only briefly tested.
> > > > 
> > > > device_dax is a well-isolated self-contained module as it calls
> > > > alloc_dax() with dev_dax, which is private to device_dax.  For
> > > > checking badblocks, it needs to call dax_pmem to check with
> > > > region-level badblocks.
> > > > 
> > > > This patch attempts to keep device_dax self-contained.  It adds
> > > > check_error() to dax_operations, and dax_check_error() as a
> > > > stub with *dev_dax and *dev pointers to convey it to
> > > > dax_pmem.  I am wondering if this is the right direction, or we
> > > > should change the modularity to let dax_pmem call alloc_dax()
> > > > with its dax_pmem (or I completely missed something).
> > > 
> > > The problem is that device-dax guarantees a given fault
> > > > granularity. To make that guarantee we can't fall back from 1G or
> > > 2M mappings due to an error. We also can't reasonably go the
> > > other way and fail mappings that contain a badblock because that
> > > would change the blast radius of a media error to the fault size.
> > 
> > Does it mean we expect users to have CPUs with MCE recovery for
> > > Device DAX?  Can we add an attribute that allows an error check
> > > and fallback?
> 
> Yes, without MCE recovery, consuming an error through a device-dax
> mapping will reboot the system. If an application needs the kernel
> protection it should be using filesystem-dax.
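
To put numbers on the "blast radius" point quoted above, here is a
minimal sketch (not kernel code) of how much memory would become
unusable if every fault-sized mapping containing a badblock were
failed.  The function name blast_radius() and the (start_sector, count)
input layout are assumptions made for illustration; sectors are taken
to be 512 bytes.

```python
SECTOR_SIZE = 512                # bytes per badblocks sector (assumed)
PMD_SIZE = 2 * 1024 * 1024       # 2M fault granularity
PUD_SIZE = 1024 * 1024 * 1024    # 1G fault granularity


def blast_radius(bad_ranges, fault_size):
    """Return (bad_bytes, lost_bytes) for a list of bad sector ranges.

    bad_ranges: iterable of (start_sector, sector_count) tuples,
                assumed not to overlap each other.
    fault_size: mapping granularity in bytes (e.g. PMD_SIZE).

    bad_bytes is the media actually reported bad; lost_bytes is what
    becomes inaccessible if each fault-sized unit touching a bad range
    is failed as a whole.
    """
    bad_bytes = 0
    failed_units = set()
    for start_sector, count in bad_ranges:
        start = start_sector * SECTOR_SIZE
        end = start + count * SECTOR_SIZE   # exclusive end offset
        bad_bytes += end - start
        first = start // fault_size
        last = (end - 1) // fault_size
        failed_units.update(range(first, last + 1))
    return bad_bytes, len(failed_units) * fault_size


if __name__ == "__main__":
    one_bad_sector = [(8, 1)]
    print(blast_radius(one_bad_sector, PMD_SIZE))
    print(blast_radius(one_bad_sector, PUD_SIZE))
```

With a single bad sector, failing the whole mapping turns 512 bytes of
bad media into 2M or 1G of lost address range, which is the growth in
blast radius the quoted reply objects to.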

Understood.  Are we going to provide sysfs "badblocks" for Device DAX
as it is also needed for ndctl clear-error?

Thanks,
-Toshi
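
For reference, a sysfs "badblocks" file of the kind asked about above
could be consumed from user space roughly as follows.  This is only a
sketch: it assumes the same text layout that existing block-layer
badblocks attributes use (one "start_sector sector_count" pair per
line), and parse_badblocks() is a hypothetical helper name.  Whether
Device DAX would expose such a file at all is exactly the open
question.

```python
def parse_badblocks(text):
    """Parse badblocks-style text into a list of (start, count) tuples.

    Each non-empty line is expected to hold two decimal integers:
    the first bad sector and the number of bad sectors that follow.
    """
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        if len(fields) != 2:
            raise ValueError("unexpected badblocks line: %r" % line)
        start, count = int(fields[0]), int(fields[1])
        ranges.append((start, count))
    return ranges


if __name__ == "__main__":
    sample = "8 1\n4096 16\n"             # made-up sample contents
    for start, count in parse_badblocks(sample):
        print("bad range: sector %d, %d sector(s)" % (start, count))
```

An empty file parses to an empty list, which a tool can treat as "no
known media errors" for that device.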
 