* [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
@ 2025-06-26 20:37 Bart Van Assche
2025-06-27 7:17 ` Christoph Hellwig
2025-07-01 5:12 ` Nilay Shroff
0 siblings, 2 replies; 8+ messages in thread
From: Bart Van Assche @ 2025-06-26 20:37 UTC (permalink / raw)
To: Jens Axboe
Cc: linux-block, Christoph Hellwig, Bart Van Assche, Nilay Shroff,
stable
Every time I run test srp/002 the following deadlock is triggered:
task:multipathd
Call Trace:
<TASK>
__schedule+0x8c1/0x1bf0
schedule+0xdd/0x270
schedule_preempt_disabled+0x1c/0x30
__mutex_lock+0xb89/0x1650
mutex_lock_nested+0x1f/0x30
dm_table_set_restrictions+0x823/0xdf0
__bind+0x166/0x590
dm_swap_table+0x2a7/0x490
do_resume+0x1b1/0x610
dev_suspend+0x55/0x1a0
ctl_ioctl+0x3a5/0x7e0
dm_ctl_ioctl+0x12/0x20
__x64_sys_ioctl+0x127/0x1a0
x64_sys_call+0xe2b/0x17d0
do_syscall_64+0x96/0x3a0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
task:(udev-worker)
Call Trace:
<TASK>
__schedule+0x8c1/0x1bf0
schedule+0xdd/0x270
blk_mq_freeze_queue_wait+0xf2/0x140
blk_mq_freeze_queue_nomemsave+0x23/0x30
queue_ra_store+0x14e/0x290
queue_attr_store+0x23e/0x2c0
sysfs_kf_write+0xde/0x140
kernfs_fop_write_iter+0x3b2/0x630
vfs_write+0x4fd/0x1390
ksys_write+0xfd/0x230
__x64_sys_write+0x76/0xc0
x64_sys_call+0x276/0x17d0
do_syscall_64+0x96/0x3a0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
pending requests to finish. The pending requests do never complete because
the dm-multipath queue_if_no_path option is enabled and the only path in
the dm-multipath configuration is being removed.
Fix this deadlock by removing the queue freezing/unfreezing code from
queue_ra_store().
Freezing the request queue from inside a block layer sysfs store callback
function is essential when modifying parameters that affect how bios or
requests are processed, e.g. parameters that affect bio_split_to_limit().
Freezing the request queue when modifying parameters that do not affect bio
nor request processing is not necessary.
Cc: Nilay Shroff <nilay@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: b07a889e8335 ("block: move q->sysfs_lock and queue-freeze under show/store method")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
Changes compared to v1: made the patch description more detailed.
block/blk-sysfs.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b2b9b89d6967..1f63b184c6e9 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -105,7 +105,6 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
{
unsigned long ra_kb;
ssize_t ret;
- unsigned int memflags;
struct request_queue *q = disk->queue;
ret = queue_var_store(&ra_kb, page, count);
@@ -116,10 +115,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
* calculated from the queue limits by queue_limits_commit_update.
*/
mutex_lock(&q->limits_lock);
- memflags = blk_mq_freeze_queue(q);
disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
mutex_unlock(&q->limits_lock);
- blk_mq_unfreeze_queue(q, memflags);
return ret;
}
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
2025-06-26 20:37 [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute Bart Van Assche
@ 2025-06-27 7:17 ` Christoph Hellwig
2025-06-28 22:58 ` Bart Van Assche
2025-06-30 22:39 ` Bart Van Assche
2025-07-01 5:12 ` Nilay Shroff
1 sibling, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-06-27 7:17 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, linux-block, Christoph Hellwig, Nilay Shroff, stable
On Thu, Jun 26, 2025 at 01:37:13PM -0700, Bart Van Assche wrote:
> This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
> pending requests to finish. The pending requests do never complete because
> the dm-multipath queue_if_no_path option is enabled and the only path in
> the dm-multipath configuration is being removed.
Well, if there are queued never completed bios the freeze will obviously
fail. I don't see how this freeze is special vs other freezes or other
attributes that freeze.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
2025-06-27 7:17 ` Christoph Hellwig
@ 2025-06-28 22:58 ` Bart Van Assche
2025-06-30 22:39 ` Bart Van Assche
1 sibling, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2025-06-28 22:58 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, Nilay Shroff, stable
On 6/27/25 12:17 AM, Christoph Hellwig wrote:
> On Thu, Jun 26, 2025 at 01:37:13PM -0700, Bart Van Assche wrote:
>> This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
>> pending requests to finish. The pending requests do never complete because
>> the dm-multipath queue_if_no_path option is enabled and the only path in
>> the dm-multipath configuration is being removed.
>
> Well, if there are queued never completed bios the freeze will obviously
> fail. I don't see how this freeze is special vs other freezes or other
> attributes that freeze.
Hi Christoph,
There is a difference: there are Linux distros, e.g. openSUSE, that set
the read_ahead_kb attribute from a udev rule. I'm not aware of any Linux
distros that set any of the other attributes from a udev rule for which
the queue gets frozen from the .store callback (nr_requests, nomerges,
rq_affinity, io_poll, io_timeout and wbt_lat_usec).
Thanks,
Bart.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
2025-06-27 7:17 ` Christoph Hellwig
2025-06-28 22:58 ` Bart Van Assche
@ 2025-06-30 22:39 ` Bart Van Assche
2025-07-01 0:42 ` Keith Busch
1 sibling, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2025-06-30 22:39 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, Nilay Shroff, stable
On 6/27/25 12:17 AM, Christoph Hellwig wrote:
> Well, if there are queued never completed bios the freeze will obviously
> fail. I don't see how this freeze is special vs other freezes or other
> attributes that freeze.
Hi Christoph,
Do you perhaps want me to remove the freeze/unfreeze calls from all
sysfs store callbacks from which it is safe to remove these callbacks?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
2025-06-30 22:39 ` Bart Van Assche
@ 2025-07-01 0:42 ` Keith Busch
2025-07-01 1:50 ` Bart Van Assche
0 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2025-07-01 0:42 UTC (permalink / raw)
To: Bart Van Assche
Cc: Christoph Hellwig, Jens Axboe, linux-block, Nilay Shroff, stable
On Mon, Jun 30, 2025 at 03:39:18PM -0700, Bart Van Assche wrote:
> On 6/27/25 12:17 AM, Christoph Hellwig wrote:
> > Well, if there are queued never completed bios the freeze will obviously
> > fail. I don't see how this freeze is special vs other freezes or other
> > attributes that freeze.
>
> Hi Christoph,
>
> Do you perhaps want me to remove the freeze/unfreeze calls from all
> sysfs store callbacks from which it is safe to remove these callbacks?
But don't the remaining attributes that are not safe remain susceptible
to this deadlock?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
2025-07-01 0:42 ` Keith Busch
@ 2025-07-01 1:50 ` Bart Van Assche
2025-07-01 5:08 ` Nilay Shroff
0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2025-07-01 1:50 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Jens Axboe, linux-block, Nilay Shroff, stable
On 6/30/25 5:42 PM, Keith Busch wrote:
> On Mon, Jun 30, 2025 at 03:39:18PM -0700, Bart Van Assche wrote:
>> On 6/27/25 12:17 AM, Christoph Hellwig wrote:
>>> Well, if there are queued never completed bios the freeze will obviously
>>> fail. I don't see how this freeze is special vs other freezes or other
>>> attributes that freeze.
>>
>> Hi Christoph,
>>
>> Do you perhaps want me to remove the freeze/unfreeze calls from all
>> sysfs store callbacks from which it is safe to remove these callbacks?
>
> But don't the remaining attributes that are not safe remain susceptible
> to this deadlock?
For the remaining sysfs attributes the deadlock can be solved by
letting the blk_mq_freeze_queue() call by the sysfs store methods time
out if that call takes too long or by making that call interruptible by
signals like Ctrl-C. I think its better to let functions like
queue_requests_store() fail if an attempt to freeze a request
queue takes longer than it should rather than to trigger a kernel
deadlock.
Bart.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
2025-07-01 1:50 ` Bart Van Assche
@ 2025-07-01 5:08 ` Nilay Shroff
0 siblings, 0 replies; 8+ messages in thread
From: Nilay Shroff @ 2025-07-01 5:08 UTC (permalink / raw)
To: Bart Van Assche, Keith Busch
Cc: Christoph Hellwig, Jens Axboe, linux-block, stable
On 7/1/25 7:20 AM, Bart Van Assche wrote:
> On 6/30/25 5:42 PM, Keith Busch wrote:
>> On Mon, Jun 30, 2025 at 03:39:18PM -0700, Bart Van Assche wrote:
>>> On 6/27/25 12:17 AM, Christoph Hellwig wrote:
>>>> Well, if there are queued never completed bios the freeze will obviously
>>>> fail. I don't see how this freeze is special vs other freezes or other
>>>> attributes that freeze.
>>>
>>> Hi Christoph,
>>>
>>> Do you perhaps want me to remove the freeze/unfreeze calls from all
>>> sysfs store callbacks from which it is safe to remove these callbacks?
>>
>> But don't the remaining attributes that are not safe remain susceptible
>> to this deadlock?
>
> For the remaining sysfs attributes the deadlock can be solved by
> letting the blk_mq_freeze_queue() call by the sysfs store methods time
> out if that call takes too long or by making that call interruptible by
> signals like Ctrl-C. I think its better to let functions like
> queue_requests_store() fail if an attempt to freeze a request
> queue takes longer than it should rather than to trigger a kernel
> deadlock.
>
I think you're proposing to use blk_mq_freeze_queue_wait_timeout()
here (which puts the caller into uninterruptible sleep) and that might
be okay. However, IMO, using TASK_INTERRUPTIBLE may not be worth in
case those sysfs store methods are invoked from udev rules or
application. But lets wait for others to chime in and suggest.
Thanks,
--Nilay
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
2025-06-26 20:37 [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute Bart Van Assche
2025-06-27 7:17 ` Christoph Hellwig
@ 2025-07-01 5:12 ` Nilay Shroff
1 sibling, 0 replies; 8+ messages in thread
From: Nilay Shroff @ 2025-07-01 5:12 UTC (permalink / raw)
To: Bart Van Assche, Jens Axboe; +Cc: linux-block, Christoph Hellwig, stable
On 6/27/25 2:07 AM, Bart Van Assche wrote:
> Every time I run test srp/002 the following deadlock is triggered:
>
> task:multipathd
> Call Trace:
> <TASK>
> __schedule+0x8c1/0x1bf0
> schedule+0xdd/0x270
> schedule_preempt_disabled+0x1c/0x30
> __mutex_lock+0xb89/0x1650
> mutex_lock_nested+0x1f/0x30
> dm_table_set_restrictions+0x823/0xdf0
> __bind+0x166/0x590
> dm_swap_table+0x2a7/0x490
> do_resume+0x1b1/0x610
> dev_suspend+0x55/0x1a0
> ctl_ioctl+0x3a5/0x7e0
> dm_ctl_ioctl+0x12/0x20
> __x64_sys_ioctl+0x127/0x1a0
> x64_sys_call+0xe2b/0x17d0
> do_syscall_64+0x96/0x3a0
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
> </TASK>
> task:(udev-worker)
> Call Trace:
> <TASK>
> __schedule+0x8c1/0x1bf0
> schedule+0xdd/0x270
> blk_mq_freeze_queue_wait+0xf2/0x140
> blk_mq_freeze_queue_nomemsave+0x23/0x30
> queue_ra_store+0x14e/0x290
> queue_attr_store+0x23e/0x2c0
> sysfs_kf_write+0xde/0x140
> kernfs_fop_write_iter+0x3b2/0x630
> vfs_write+0x4fd/0x1390
> ksys_write+0xfd/0x230
> __x64_sys_write+0x76/0xc0
> x64_sys_call+0x276/0x17d0
> do_syscall_64+0x96/0x3a0
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
> </TASK>
>
> This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
> pending requests to finish. The pending requests do never complete because
> the dm-multipath queue_if_no_path option is enabled and the only path in
> the dm-multipath configuration is being removed.
>
> Fix this deadlock by removing the queue freezing/unfreezing code from
> queue_ra_store().
>
> Freezing the request queue from inside a block layer sysfs store callback
> function is essential when modifying parameters that affect how bios or
> requests are processed, e.g. parameters that affect bio_split_to_limit().
> Freezing the request queue when modifying parameters that do not affect bio
> nor request processing is not necessary.
>
> Cc: Nilay Shroff <nilay@linux.ibm.com>
> Cc: stable@vger.kernel.org
> Fixes: b07a889e8335 ("block: move q->sysfs_lock and queue-freeze under show/store method")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
I hope we'd address other sysfs store attributes requiring queue-freeze
in another patch. So with that,
Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-07-01 5:13 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-26 20:37 [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute Bart Van Assche
2025-06-27 7:17 ` Christoph Hellwig
2025-06-28 22:58 ` Bart Van Assche
2025-06-30 22:39 ` Bart Van Assche
2025-07-01 0:42 ` Keith Busch
2025-07-01 1:50 ` Bart Van Assche
2025-07-01 5:08 ` Nilay Shroff
2025-07-01 5:12 ` Nilay Shroff
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox