From: Jens Axboe <axboe@kernel.dk>
To: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>
Cc: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
Damien Le Moal <Damien.LeMoal@wdc.com>
Subject: Re: I/O hang with v5.16-rc2
Date: Fri, 26 Nov 2021 09:55:44 -0700
Message-ID: <e1b65eee-e8c8-e98d-d2f7-5e35eca46651@kernel.dk>
In-Reply-To: <124f86f8-91db-3a02-702d-5c26b22de107@kernel.dk>

On 11/26/21 9:21 AM, Jens Axboe wrote:
> On 11/26/21 2:53 AM, Shinichiro Kawasaki wrote:
>> I ran my test set on v5.16-rc2 and observed a process hang. The test workload
>> repeats file creation on xfs on dm-zoned. This dm-zoned device is on top of 3
>> dm-linear devices. One of them is a dm-linear device on a non-zoned NVMe device,
>> used as the cache of the dm-zoned device. The other two are dm-linear devices on
>> zoned SMR HDDs. So far, the hang reproduces 100% of the time on my test system.
>>
>> The kernel message [2] reported hanging tasks. In the call stack, I observed
>> wbt_wait(). I also observed an "inflight 1" value in the "rqos/wbt/inflight"
>> attribute in debugfs.
>>
>> # grep -R . /sys/kernel/debug/block/nvme0n1 | grep inflight
>> /sys/kernel/debug/block/nvme0n1/rqos/wbt/inflight:0: inflight 1
>> /sys/kernel/debug/block/nvme0n1/rqos/wbt/inflight:1: inflight 0
>> /sys/kernel/debug/block/nvme0n1/rqos/wbt/inflight:2: inflight 0
>>
>> These symptoms look related to another issue reported to linux-block [1]. As
>> discussed in that thread, I set /sys/block/nvme0n1/queue/wbt_lat_usec to 0.
>> With this setting, the hang disappeared. So the hang I observe also appears
>> to be related to writeback throttling for the NVMe device.
>>
>> I bisected and found that commit 4f5022453acd ("nvme: wire up completion batching
>> for the IRQ path") is the trigger. I reverted this commit from v5.16-rc2, and
>> the hang disappeared.
>>
>> I hope this report helps.
>>
>>
>> [1] https://lore.kernel.org/linux-block/b3ba57a7-d363-9c17-c4be-9dbe86875@panix.com
>
> Yes looks the same as that one, and that commit was indeed my suspicion
> on what could potentially cause the accounting discrepancy. I'll take a
> look at this.

I sent out a patch in the other thread, please give that a whirl.

--
Jens Axboe