From: Jens Axboe <axboe@kernel.dk>
To: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Damien Le Moal <Damien.LeMoal@wdc.com>
Subject: Re: I/O hang with v5.16-rc2
Date: Fri, 26 Nov 2021 09:55:44 -0700
Message-ID: <e1b65eee-e8c8-e98d-d2f7-5e35eca46651@kernel.dk>
In-Reply-To: <124f86f8-91db-3a02-702d-5c26b22de107@kernel.dk>

On 11/26/21 9:21 AM, Jens Axboe wrote:
> On 11/26/21 2:53 AM, Shinichiro Kawasaki wrote:
>> I ran my test set on v5.16-rc2 and observed a process hang. The test workload
>> repeats file creation on xfs on dm-zoned. This dm-zoned device sits on top of
>> three dm-linear devices. One is a dm-linear device on a non-zoned NVMe device,
>> which serves as the cache of the dm-zoned device. The other two are dm-linear
>> devices on zoned SMR HDDs. So far, the hang reproduces 100% of the time on my
>> test system.
>>
>> The kernel message [2] reported hanging tasks. In the call stack, I observe
>> wbt_wait(). I also observed an "inflight 1" value in the "rqos/wbt/inflight"
>> attribute in debugfs.
>>
>> # grep -R . /sys/kernel/debug/block/nvme0n1 | grep inflight
>> /sys/kernel/debug/block/nvme0n1/rqos/wbt/inflight:0: inflight 1
>> /sys/kernel/debug/block/nvme0n1/rqos/wbt/inflight:1: inflight 0
>> /sys/kernel/debug/block/nvme0n1/rqos/wbt/inflight:2: inflight 0
>>
>> These symptoms look related to another issue reported to linux-block [1]. As
>> discussed in that thread, I wrote 0 to /sys/block/nvme0n1/queue/wbt_lat_usec.
>> With this setting, the hang disappeared. So the hang I observe also appears
>> to be related to writeback throttling for the NVMe device.
>>
>> I bisected and found that commit 4f5022453acd ("nvme: wire up completion
>> batching for the IRQ path") triggers the hang. I reverted this commit on top
>> of v5.16-rc2 and the hang disappeared.
>>
>> I hope this report helps.
>>
>>
>> [1] https://lore.kernel.org/linux-block/b3ba57a7-d363-9c17-c4be-9dbe86875@panix.com
> 
> Yes looks the same as that one, and that commit was indeed my suspicion
> on what could potentially cause the accounting discrepancy. I'll take a
> look at this.

I sent out a patch in the other thread, please give that a whirl.

-- 
Jens Axboe
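
For anyone who wants to check for the same symptom, the debugfs check quoted
above can be sketched as a small filter. This is only an illustration, not
part of the patch discussed in the thread: wbt_stuck is a hypothetical helper
name, and the input format is assumed to match the grep output shown in the
report. A nonzero counter is normal while I/O is in flight, so it is only
suspicious when the device should be idle.

```shell
#!/bin/sh
# Sketch: list wbt inflight counters that are nonzero.
# Assumed input format (taken from the report):
#   <path>/rqos/wbt/inflight:<idx>: inflight <count>

# Hypothetical helper: reads grep-style lines on stdin and prints only
# those whose trailing "inflight <count>" has a nonzero count.
wbt_stuck() {
    awk 'NF >= 2 && $(NF-1) == "inflight" && $NF + 0 != 0 { print }'
}

# Usage on a live system (needs root and a mounted debugfs):
#   grep -R . /sys/kernel/debug/block/nvme0n1 | grep inflight | wbt_stuck
#
# Workaround mentioned in the report (disables writeback throttling):
#   echo 0 > /sys/block/nvme0n1/queue/wbt_lat_usec
```

Fed the three lines from the report, the filter prints only the
"inflight:0: inflight 1" line.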

