public inbox for linux-acpi@vger.kernel.org
From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "toshi.kani@hpe.com" <toshi.kani@hpe.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Cc: "Williams, Dan J" <dan.j.williams@intel.com>,
	"jmoyer@redhat.com" <jmoyer@redhat.com>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"Wysocki, Rafael J" <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH v4 0/6] BTT error clearing rework
Date: Mon, 31 Jul 2017 23:35:08 +0000
Message-ID: <1501544000.4405.5.camel@intel.com>
In-Reply-To: <1501542358.2042.97.camel@hpe.com>

On Mon, 2017-07-31 at 23:15 +0000, Kani, Toshimitsu wrote:
> On Wed, 2017-07-26 at 17:35 -0600, Vishal Verma wrote:
>  :
> > 
> > Clearing errors or badblocks during a BTT write requires sending an
> > ACPI DSM, which means potentially sleeping. Since a BTT IO happens
> > in
> > atomic context (preemption disabled, spinlocks may be held), we
> > cannot perform error clearing in the course of an IO. Due to this
> > error clearing for BTT IOs has hitherto been disabled.
> > 
> > This series fixes these problems by moving the error clearing out of
> > the atomic sections in the BTT.
> > 
> > Also fix a potential deadlock that can occur while clearing errors
> > from either BTT or pmem due to memory allocations in the IO path.
> 
> Hi Vishal,
> 
> I just tested the series (sorry for the delay).  It works nicely when
> doing I/Os to a block device directly.  But I am seeing a lot of write
> errors with filesystem.
> 
> Here is what I did for the testing.
> 
> 1. 'mkfs.ext /dev/pmem0s' and 'mount /dev/pmem0s /mnt/pmem0s'.
> 2. Inject an error to somewhere in the pmem0s device, but not in the
> metadata area at beginning.
> 3. Run the following script.
> ===
> DEV=pmem0s
> set -x
> dd if=/dev/zero of=/mnt/$DEV/1Gfile bs=1M count=1024
> while true; do
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-1
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-2
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-3
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-4
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-5
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-6
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-7
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-8
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-9
> cp /mnt/$DEV/1Gfile /mnt/$DEV/file-10
> done
> ===
> 
> Step 3 clears an error and runs fine with raw and memory modes.  With
> sector mode, however, it ends up with continuous write errors like
> below and does not clear the error.  Do you have any thoughts?
> 
>  EXT4-fs warning (device pmem0s): ext4_end_bio:322: I/O error 10
> writing to inode 17 (offset 1023410176 size 8388608 starting block
> 1834752)
>  Buffer I/O error on device pmem0s, logical block 1834752
>  Buffer I/O error on device pmem0s, logical block 1834753
>  Buffer I/O error on device pmem0s, logical block 1834754
>  :
>  nd_pmem btt0.0: io error in WRITE sector 14680064, len 4096,
>  EXT4-fs warning (device pmem0s): ext4_end_bio:322: I/O error 10
> writing to inode 17 (offset 1031798784 size 1052672 starting block
> 1835008)
>  nd_pmem btt0.0: io error in WRITE sector 14682112, len 4096,
>  EXT4-fs warning (device pmem0s): ext4_end_bio:322: I/O error 10
> writing to inode 17 (offset 1031798784 size 2101248 starting block
> 1835264)
>  :
>  nd_pmem btt0.0: io error in WRITE sector 14698496, len 4096,
>  nd_pmem btt0.0: io error in WRITE sector 14700544, len 4096,
>  nd_pmem btt0.0: io error in WRITE sector 14702592, len 4096,
>  nd_pmem btt0.0: io error in WRITE sector 14704640, len 4096,
>  :

Thanks for testing, Toshi. I will try to reproduce it.
My first guess: could the injected errors be in the BTT metadata area
towards the end of the device?

->rw_bytes can only clear errors on properly aligned writes, and the BTT
metadata writes are too small to clear errors in the metadata.
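
To illustrate the alignment point, here is a minimal userspace sketch,
not the actual libnvdimm code. The 512-byte clearing unit and the helper
name write_can_clear_error() are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: errors can only be cleared in whole 512-byte units. */
#define CLEAR_UNIT 512u

/*
 * Hypothetical helper, not a kernel function: returns true when a write
 * covers only whole clearing units, i.e. both its start offset and its
 * length are multiples of CLEAR_UNIT.
 */
static bool write_can_clear_error(uint64_t offset, size_t size)
{
	return size != 0 &&
	       offset % CLEAR_UNIT == 0 &&
	       size % CLEAR_UNIT == 0;
}
```

Under these assumptions, a 4096-byte data write at an aligned offset
qualifies, while a write of only a few bytes (for example, a small
metadata update) does not, so an error under that metadata would stay
in place and keep failing.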

> 
> Thanks,
> -Toshi
