public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Jens Axboe <axboe@kernel.dk>, linux-block@vger.kernel.org
Subject: Re: [PATCH v2 1/2] block: prevent freeing a zone write plug too early
Date: Wed, 24 Apr 2024 01:16:13 +1000	[thread overview]
Message-ID: <4530f039-0698-4cc0-94ab-5465d3f0e255@kernel.org> (raw)
In-Reply-To: <715fd037-a8e5-4e62-939e-a446087eed2a@kernel.dk>

On 2024/04/24 0:21, Jens Axboe wrote:
> On 4/20/24 1:58 AM, Damien Le Moal wrote:
>> The submission of plugged BIOs is done using a work struct executing the
>> function blk_zone_wplug_bio_work(). This function gets and submits a
>> plugged zone write BIO and is guaranteed to operate on a valid zone
>> write plug (with a reference count higher than 0) on entry as plugged
>> BIOs hold a reference on their zone write plugs. However, once a BIO is
>> submitted with submit_bio_noacct_nocheck(), the BIO may complete before
>> blk_zone_wplug_bio_work(), with the BIO completion trigering a release
>> and freeing of the zone write plug if the BIO is the last write to a
>> zone (making the zone FULL). This potentially can result in the zone
>> write plug being freed while the work is still active.
>>
>> Avoid this by calling flush_work() from disk_free_zone_wplug_rcu().
>>
>> Fixes: dd291d77cc90 ("block: Introduce zone write plugging")
>> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
>> ---
>>  block/blk-zoned.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
>> index 3befebe6b319..685f0b9159fd 100644
>> --- a/block/blk-zoned.c
>> +++ b/block/blk-zoned.c
>> @@ -526,6 +526,8 @@ static void disk_free_zone_wplug_rcu(struct rcu_head *rcu_head)
>>  	struct blk_zone_wplug *zwplug =
>>  		container_of(rcu_head, struct blk_zone_wplug, rcu_head);
>>  
>> +	flush_work(&zwplug->bio_work);
>> +
>>  	mempool_free(zwplug, zwplug->disk->zone_wplugs_pool);
>>  }
> 
> This is totally backwards. First of all, if you actually had work that
> needed flushing at this point, the kernel would bomb spectacularly.
> Secondly, what's the point of using RCU to protect this, if you're now
> needing to flush work from the RCU callback? That's a clear sign that
> something is very wrong here with your references / RCU usage.. The work
> item should hold a reference to it, trying to paper around it like this
> is not going to work at all.
> 
> Why is the work item racing with RCU freeing?!

The work item is a field of the zone write plug. Zone write plugs have
references to them as long as BIOs are in flight and and the zone is not full.
The zone write plug freeing through rcu is triggered by the last write to a zone
that makes the zone full. But the completion of this last write BIO may happen
right after the work issued the BIO with submit_bio_noacct_nocheck() and before
blk_zone_wplug_bio_work() returns, while the work item is still active.

The actual freeing of the plug happens only after the rcu grace period, and I
was not entirely sure if this is enough to guarantee that the work thread is
finished. But checking how the workqueue code processes the work item by calling
the work function (blk_zone_wplug_bio_work() in this case), there is no issue
because the work item (struct work_struct) is not touched once the work function
is called. So there are no issues/races with freeing the zone write plug. I was
overthinking this. My bad. We can drop this patch. Apologies for the noise.


-- 
Damien Le Moal
Western Digital Research


  reply	other threads:[~2024-04-23 15:16 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-20  7:58 [PATCH v2 0/2] Fixes for zone write plugging Damien Le Moal
2024-04-20  7:58 ` [PATCH v2 1/2] block: prevent freeing a zone write plug too early Damien Le Moal
2024-04-22  6:23   ` Christoph Hellwig
2024-04-23  6:12     ` Damien Le Moal
2024-04-23 14:21   ` Jens Axboe
2024-04-23 15:16     ` Damien Le Moal [this message]
2024-04-23 15:36       ` Jens Axboe
2024-04-23 18:19         ` Damien Le Moal
2024-04-24 13:58           ` Jens Axboe
2024-04-20  7:58 ` [PATCH v2 2/2] block: use a per disk workqueue for zone write plugging Damien Le Moal
2024-04-22  6:25   ` Christoph Hellwig
2024-04-23  6:01     ` Damien Le Moal
2024-04-23 15:46 ` (subset) [PATCH v2 0/2] Fixes " Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4530f039-0698-4cc0-94ab-5465d3f0e255@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox