All of lore.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Nilay Shroff <nilay@linux.ibm.com>, Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org
Subject: Re: [PATCH 1/2] block: avoid to hold q->limits_lock across APIs for atomic update queue limits
Date: Wed, 18 Dec 2024 06:57:45 -0800	[thread overview]
Message-ID: <9e2ad956-4d20-456f-9676-8ea88dfd116e@kernel.org> (raw)
In-Reply-To: <0fdf7af6-9401-4853-8536-4295a614e6d2@linux.ibm.com>

On 2024/12/18 6:05, Nilay Shroff wrote:
>>> In the above code block, we may then replace blk_mq_freeze_queue() with
>>> queue_limits_commit_update() and similarly replace blk_mq_unfreeze_queue() 
>>> with queue_limits_commit_update().
>>>
>>> {
>>>     queue_limits_start_update()
>>>     ...
>>>     ...
>>>     ...
>>>     queue_limits_commit_update()
>>
>> In sd_revalidate_disk(), blk-mq request is allocated under queue_limits_start_update(),
>> then ABBA deadlock is triggered since blk_queue_enter() implies same lock(freeze lock)
>> from blk_mq_freeze_queue().
>>
> Yeah agreed but I see sd_revalidate_disk() is probably the only exception 
> which allocates the blk-mq request. Can't we fix it? 

If we change where limits_lock is taken now, we will again introduce races
between user config and discovery/revalidation, which is what
queue_limits_start_update() and queue_limits_commit_update() intended to fix in
the first place.

So changing sd_revalidate_disk() is not the right approach.

> One simple way I could think of would be updating queue_limit_xxx() APIs and
> then use it,
> 
> queue_limits_start_update(struct request_queue *q, bool freeze) 
> {
>     ...
>     mutex_lock(&q->limits_lock);
>     if(freeze)
>         blk_mq_freeze_queue(q);
>     ...
> }
> 
> queue_limits_commit_update(struct request_queue *q,
> 		struct queue_limits *lim, bool unfreeze)
> {
>     ...
>     ...
>     if(unfreeze)
>         blk_mq_unfreeze_queue(q);
>     mutex_unlock(&q->limits_lock);
> }
> 
> sd_revalidate_disk()
> {
>     ...
>     ...
>     queue_limits_start_update(q, false);
> 
>     ...
>     ...
> 
>     blk_mq_freeze_queue(q);
>     queue_limits_commit_update(q, lim, true);    

This is overly complicated ... As I suggested, I think that a simpler approach
is to call blk_mq_freeze_queue() and blk_mq_unfreeze_queue() inside
queue_limits_commit_update(). Doing so, no driver should need to directly call
freeze/unfreeze. But that would be a cleanup. Let's first fix the few instances
that have the update/freeze order wrong. As mentioned, the pattern simply needs
to be:

	queue_limits_start_update();
	...
	blk_mq_freeze_queue();
	queue_limits_commit_update();
	blk_mq_unfreeze_queue();

I fixed blk-zoned code last week like this. But I had not noticed that other
places had a similar issue as well.

-- 
Damien Le Moal
Western Digital Research

  reply	other threads:[~2024-12-18 14:57 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-16  8:02 [PATCH 0/2] block: fix deadlock caused by atomic limits update Ming Lei
2024-12-16  8:02 ` [PATCH 1/2] block: avoid to hold q->limits_lock across APIs for atomic update queue limits Ming Lei
2024-12-16 15:49   ` Christoph Hellwig
2024-12-17  1:52     ` Ming Lei
2024-12-17  4:40       ` Christoph Hellwig
2024-12-17  7:05         ` Ming Lei
2024-12-17  7:19           ` Christoph Hellwig
2024-12-17  7:30             ` Ming Lei
2024-12-17 16:07               ` Damien Le Moal
2024-12-18  2:09                 ` Ming Lei
2024-12-18 11:33                   ` Nilay Shroff
2024-12-18 13:40                     ` Ming Lei
2024-12-18 14:05                       ` Nilay Shroff
2024-12-18 14:57                         ` Damien Le Moal [this message]
2024-12-19  6:20                           ` Christoph Hellwig
2024-12-19  7:16                             ` Nilay Shroff
2024-12-21 13:03                             ` Nilay Shroff
2024-12-30  9:02                               ` Ming Lei
2024-12-30 23:29                                 ` Damien Le Moal
2025-01-01 11:17                                   ` Nilay Shroff
2024-12-19  6:17                         ` Christoph Hellwig
2024-12-19  6:15                   ` Christoph Hellwig
2024-12-16  8:02 ` [PATCH 2/2] block: remove queue_limits_cancel_update() Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9e2ad956-4d20-456f-9676-8ea88dfd116e@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=nilay@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.