public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
@ 2025-06-26 20:37 Bart Van Assche
  2025-06-27  7:17 ` Christoph Hellwig
  2025-07-01  5:12 ` Nilay Shroff
  0 siblings, 2 replies; 8+ messages in thread
From: Bart Van Assche @ 2025-06-26 20:37 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Bart Van Assche, Nilay Shroff,
	stable

Every time I run test srp/002 the following deadlock is triggered:

task:multipathd
Call Trace:
 <TASK>
 __schedule+0x8c1/0x1bf0
 schedule+0xdd/0x270
 schedule_preempt_disabled+0x1c/0x30
 __mutex_lock+0xb89/0x1650
 mutex_lock_nested+0x1f/0x30
 dm_table_set_restrictions+0x823/0xdf0
 __bind+0x166/0x590
 dm_swap_table+0x2a7/0x490
 do_resume+0x1b1/0x610
 dev_suspend+0x55/0x1a0
 ctl_ioctl+0x3a5/0x7e0
 dm_ctl_ioctl+0x12/0x20
 __x64_sys_ioctl+0x127/0x1a0
 x64_sys_call+0xe2b/0x17d0
 do_syscall_64+0x96/0x3a0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
 </TASK>
task:(udev-worker)
Call Trace:
 <TASK>
 __schedule+0x8c1/0x1bf0
 schedule+0xdd/0x270
 blk_mq_freeze_queue_wait+0xf2/0x140
 blk_mq_freeze_queue_nomemsave+0x23/0x30
 queue_ra_store+0x14e/0x290
 queue_attr_store+0x23e/0x2c0
 sysfs_kf_write+0xde/0x140
 kernfs_fop_write_iter+0x3b2/0x630
 vfs_write+0x4fd/0x1390
 ksys_write+0xfd/0x230
 __x64_sys_write+0x76/0xc0
 x64_sys_call+0x276/0x17d0
 do_syscall_64+0x96/0x3a0
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
 </TASK>

This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
pending requests to finish. The pending requests do never complete because
the dm-multipath queue_if_no_path option is enabled and the only path in
the dm-multipath configuration is being removed.

Fix this deadlock by removing the queue freezing/unfreezing code from
queue_ra_store().

Freezing the request queue from inside a block layer sysfs store callback
function is essential when modifying parameters that affect how bios or
requests are processed, e.g. parameters that affect bio_split_to_limit().
Freezing the request queue when modifying parameters that do not affect bio
nor request processing is not necessary.

Cc: Nilay Shroff <nilay@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: b07a889e8335 ("block: move q->sysfs_lock and queue-freeze under show/store method")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---

Changes compared to v1: made the patch description more detailed.

 block/blk-sysfs.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b2b9b89d6967..1f63b184c6e9 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -105,7 +105,6 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 {
 	unsigned long ra_kb;
 	ssize_t ret;
-	unsigned int memflags;
 	struct request_queue *q = disk->queue;
 
 	ret = queue_var_store(&ra_kb, page, count);
@@ -116,10 +115,8 @@ queue_ra_store(struct gendisk *disk, const char *page, size_t count)
 	 * calculated from the queue limits by queue_limits_commit_update.
 	 */
 	mutex_lock(&q->limits_lock);
-	memflags = blk_mq_freeze_queue(q);
 	disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
 	mutex_unlock(&q->limits_lock);
-	blk_mq_unfreeze_queue(q, memflags);
 
 	return ret;
 }

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
  2025-06-26 20:37 [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute Bart Van Assche
@ 2025-06-27  7:17 ` Christoph Hellwig
  2025-06-28 22:58   ` Bart Van Assche
  2025-06-30 22:39   ` Bart Van Assche
  2025-07-01  5:12 ` Nilay Shroff
  1 sibling, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-06-27  7:17 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Nilay Shroff, stable

On Thu, Jun 26, 2025 at 01:37:13PM -0700, Bart Van Assche wrote:
> This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
> pending requests to finish. The pending requests do never complete because
> the dm-multipath queue_if_no_path option is enabled and the only path in
> the dm-multipath configuration is being removed.

Well, if there are queued never completed bios the freeze will obviously
fail.  I don't see how this freeze is special vs other freezes or other
attributes that freeze.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
  2025-06-27  7:17 ` Christoph Hellwig
@ 2025-06-28 22:58   ` Bart Van Assche
  2025-06-30 22:39   ` Bart Van Assche
  1 sibling, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2025-06-28 22:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, Nilay Shroff, stable

On 6/27/25 12:17 AM, Christoph Hellwig wrote:
> On Thu, Jun 26, 2025 at 01:37:13PM -0700, Bart Van Assche wrote:
>> This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
>> pending requests to finish. The pending requests do never complete because
>> the dm-multipath queue_if_no_path option is enabled and the only path in
>> the dm-multipath configuration is being removed.
> 
> Well, if there are queued never completed bios the freeze will obviously
> fail.  I don't see how this freeze is special vs other freezes or other
> attributes that freeze.

Hi Christoph,

There is a difference: there are Linux distros, e.g. openSUSE, that set 
the read_ahead_kb attribute from a udev rule. I'm not aware of any Linux
distros that set any of the other attributes from a udev rule for which
the queue gets frozen from the .store callback (nr_requests, nomerges,
rq_affinity, io_poll, io_timeout and wbt_lat_usec).

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
  2025-06-27  7:17 ` Christoph Hellwig
  2025-06-28 22:58   ` Bart Van Assche
@ 2025-06-30 22:39   ` Bart Van Assche
  2025-07-01  0:42     ` Keith Busch
  1 sibling, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2025-06-30 22:39 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, Nilay Shroff, stable

On 6/27/25 12:17 AM, Christoph Hellwig wrote:
> Well, if there are queued never completed bios the freeze will obviously
> fail.  I don't see how this freeze is special vs other freezes or other
> attributes that freeze.

Hi Christoph,

Do you perhaps want me to remove the freeze/unfreeze calls from all
sysfs store callbacks from which it is safe to remove these callbacks?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
  2025-06-30 22:39   ` Bart Van Assche
@ 2025-07-01  0:42     ` Keith Busch
  2025-07-01  1:50       ` Bart Van Assche
  0 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2025-07-01  0:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Nilay Shroff, stable

On Mon, Jun 30, 2025 at 03:39:18PM -0700, Bart Van Assche wrote:
> On 6/27/25 12:17 AM, Christoph Hellwig wrote:
> > Well, if there are queued never completed bios the freeze will obviously
> > fail.  I don't see how this freeze is special vs other freezes or other
> > attributes that freeze.
> 
> Hi Christoph,
> 
> Do you perhaps want me to remove the freeze/unfreeze calls from all
> sysfs store callbacks from which it is safe to remove these callbacks?

But don't the remaining attributes that are not safe remain susceptible
to this deadlock?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
  2025-07-01  0:42     ` Keith Busch
@ 2025-07-01  1:50       ` Bart Van Assche
  2025-07-01  5:08         ` Nilay Shroff
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2025-07-01  1:50 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Nilay Shroff, stable

On 6/30/25 5:42 PM, Keith Busch wrote:
> On Mon, Jun 30, 2025 at 03:39:18PM -0700, Bart Van Assche wrote:
>> On 6/27/25 12:17 AM, Christoph Hellwig wrote:
>>> Well, if there are queued never completed bios the freeze will obviously
>>> fail.  I don't see how this freeze is special vs other freezes or other
>>> attributes that freeze.
>>
>> Hi Christoph,
>>
>> Do you perhaps want me to remove the freeze/unfreeze calls from all
>> sysfs store callbacks from which it is safe to remove these callbacks?
> 
> But don't the remaining attributes that are not safe remain susceptible
> to this deadlock?

For the remaining sysfs attributes the deadlock can be solved by
letting the blk_mq_freeze_queue() call by the sysfs store methods time
out if that call takes too long or by making that call interruptible by
signals like Ctrl-C. I think its better to let functions like
queue_requests_store() fail if an attempt to freeze a request
queue takes longer than it should rather than to trigger a kernel
deadlock.

Bart.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
  2025-07-01  1:50       ` Bart Van Assche
@ 2025-07-01  5:08         ` Nilay Shroff
  0 siblings, 0 replies; 8+ messages in thread
From: Nilay Shroff @ 2025-07-01  5:08 UTC (permalink / raw)
  To: Bart Van Assche, Keith Busch
  Cc: Christoph Hellwig, Jens Axboe, linux-block, stable



On 7/1/25 7:20 AM, Bart Van Assche wrote:
> On 6/30/25 5:42 PM, Keith Busch wrote:
>> On Mon, Jun 30, 2025 at 03:39:18PM -0700, Bart Van Assche wrote:
>>> On 6/27/25 12:17 AM, Christoph Hellwig wrote:
>>>> Well, if there are queued never completed bios the freeze will obviously
>>>> fail.  I don't see how this freeze is special vs other freezes or other
>>>> attributes that freeze.
>>>
>>> Hi Christoph,
>>>
>>> Do you perhaps want me to remove the freeze/unfreeze calls from all
>>> sysfs store callbacks from which it is safe to remove these callbacks?
>>
>> But don't the remaining attributes that are not safe remain susceptible
>> to this deadlock?
> 
> For the remaining sysfs attributes the deadlock can be solved by
> letting the blk_mq_freeze_queue() call by the sysfs store methods time
> out if that call takes too long or by making that call interruptible by
> signals like Ctrl-C. I think its better to let functions like
> queue_requests_store() fail if an attempt to freeze a request
> queue takes longer than it should rather than to trigger a kernel
> deadlock.
> 
I think you're proposing to use blk_mq_freeze_queue_wait_timeout()
here (which puts the caller into uninterruptible sleep) and that might
be okay. However, IMO, using TASK_INTERRUPTIBLE may not be worth in
case those sysfs store methods are invoked from udev rules or 
application. But lets wait for others to chime in and suggest.

Thanks,
--Nilay

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute
  2025-06-26 20:37 [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute Bart Van Assche
  2025-06-27  7:17 ` Christoph Hellwig
@ 2025-07-01  5:12 ` Nilay Shroff
  1 sibling, 0 replies; 8+ messages in thread
From: Nilay Shroff @ 2025-07-01  5:12 UTC (permalink / raw)
  To: Bart Van Assche, Jens Axboe; +Cc: linux-block, Christoph Hellwig, stable



On 6/27/25 2:07 AM, Bart Van Assche wrote:
> Every time I run test srp/002 the following deadlock is triggered:
> 
> task:multipathd
> Call Trace:
>  <TASK>
>  __schedule+0x8c1/0x1bf0
>  schedule+0xdd/0x270
>  schedule_preempt_disabled+0x1c/0x30
>  __mutex_lock+0xb89/0x1650
>  mutex_lock_nested+0x1f/0x30
>  dm_table_set_restrictions+0x823/0xdf0
>  __bind+0x166/0x590
>  dm_swap_table+0x2a7/0x490
>  do_resume+0x1b1/0x610
>  dev_suspend+0x55/0x1a0
>  ctl_ioctl+0x3a5/0x7e0
>  dm_ctl_ioctl+0x12/0x20
>  __x64_sys_ioctl+0x127/0x1a0
>  x64_sys_call+0xe2b/0x17d0
>  do_syscall_64+0x96/0x3a0
>  entry_SYSCALL_64_after_hwframe+0x4b/0x53
>  </TASK>
> task:(udev-worker)
> Call Trace:
>  <TASK>
>  __schedule+0x8c1/0x1bf0
>  schedule+0xdd/0x270
>  blk_mq_freeze_queue_wait+0xf2/0x140
>  blk_mq_freeze_queue_nomemsave+0x23/0x30
>  queue_ra_store+0x14e/0x290
>  queue_attr_store+0x23e/0x2c0
>  sysfs_kf_write+0xde/0x140
>  kernfs_fop_write_iter+0x3b2/0x630
>  vfs_write+0x4fd/0x1390
>  ksys_write+0xfd/0x230
>  __x64_sys_write+0x76/0xc0
>  x64_sys_call+0x276/0x17d0
>  do_syscall_64+0x96/0x3a0
>  entry_SYSCALL_64_after_hwframe+0x4b/0x53
>  </TASK>
> 
> This deadlock happens because blk_mq_freeze_queue_nomemsave() waits for
> pending requests to finish. The pending requests do never complete because
> the dm-multipath queue_if_no_path option is enabled and the only path in
> the dm-multipath configuration is being removed.
> 
> Fix this deadlock by removing the queue freezing/unfreezing code from
> queue_ra_store().
> 
> Freezing the request queue from inside a block layer sysfs store callback
> function is essential when modifying parameters that affect how bios or
> requests are processed, e.g. parameters that affect bio_split_to_limit().
> Freezing the request queue when modifying parameters that do not affect bio
> nor request processing is not necessary.
> 
> Cc: Nilay Shroff <nilay@linux.ibm.com>
> Cc: stable@vger.kernel.org
> Fixes: b07a889e8335 ("block: move q->sysfs_lock and queue-freeze under show/store method")
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
I hope we'd address other sysfs store attributes requiring queue-freeze
in another patch. So with that,

Looks good to me:
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-07-01  5:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-26 20:37 [PATCH v2] block: Fix a deadlock related to modifying the readahead attribute Bart Van Assche
2025-06-27  7:17 ` Christoph Hellwig
2025-06-28 22:58   ` Bart Van Assche
2025-06-30 22:39   ` Bart Van Assche
2025-07-01  0:42     ` Keith Busch
2025-07-01  1:50       ` Bart Van Assche
2025-07-01  5:08         ` Nilay Shroff
2025-07-01  5:12 ` Nilay Shroff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox