public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>,
	Hannes Reinecke <hare@suse.de>, Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org
Subject: Re: [PATCH 1/8] block: fix zone write plug removal
Date: Tue, 24 Feb 2026 10:57:31 +0900	[thread overview]
Message-ID: <3a8fdb1f-db52-4264-b9c4-e8339c719dc4@kernel.org> (raw)
In-Reply-To: <03e14c32-8592-421f-992a-84301d1ec692@acm.org>

On 2/24/26 5:21 AM, Bart Van Assche wrote:
> On 2/23/26 11:30 AM, Bart Van Assche wrote:
>> On 2/23/26 3:56 AM, Hannes Reinecke wrote:
>>> On 2/21/26 01:44, Damien Le Moal wrote:
>>>> +static inline void disk_check_and_put_zone_wplug(struct blk_zone_wplug
>>>> *zwplug)
>>>> +{
>>>> +    lockdep_assert_not_held(&zwplug->lock);
>>>> +
>>>> +    /*
>>>> +     * Check for the need to remove the zone write plug from the hash list
>>>> +     * when we see that only the caller is referencing the zone write plug.
>>>> +     * Since this is racy without holding the zone write plug lock, this
>>>> +     * check is done again in  disk_should_remove_zone_wplug().
>>>> +     */
>>>> +    if (refcount_read(&zwplug->ref) <= 2) {
>>>> +        struct gendisk *disk = zwplug->disk;
>>>> +        unsigned long flags;
>>>> +
>>>> +        spin_lock_irqsave(&zwplug->lock, flags);
>>>> +        if (disk_should_remove_zone_wplug(disk, zwplug))
>>>> +            disk_remove_zone_wplug(disk, zwplug);
>>>> +        spin_unlock_irqrestore(&zwplug->lock, flags);
>>>> +    }
>>>> +
>>>> +    disk_put_zone_wplug(zwplug);
>>>> +}
>>>> +
>>>>   static void blk_zone_wplug_bio_work(struct work_struct *work);
>>>>   /*
>>>
>>> This looks slightly odd; a simple 'refcount_read()' always trigger alarm
>>> bells with me as it inevitably races with 'refcount_dec()' /
>>> 'refcount_inc()'.
>>> And similar the check 'refcount <= 2' also looks odd; typically one
>>> would check against '< 1' and let the refcount destructor handle
>>> things.
>> Hmm ... I think that the above code is racy. Here is an example:
>>
>> thread 1                                thread 2
>> -------------------------------         -------------------------------
>> refcount_read() returns 3               refcount_read() returns 3
>>
>> disk_should_remove_zone_wplug() is      disk_should_remove_zone_wplug()
>> is not called                           is not called
>>
>> disk_put_zone_wplug() decreases         disk_put_zone_wplug() decreases
>> zwplug->ref from 3 to 2                 zwplug->ref from 2 to 1
>>
>>
>> Result: zwplug->ref became less than 2 without
>> disk_should_remove_zone_wplug() having been called.
> 
> A correction to the above: three threads are required before it becomes
> possible for the zwplug reference count to drop from 3 to zero without
> disk_should_remove_zone_wplug() having been called.

I do not see how your example is possible... If refcount read are form the BIO
work context, we only have one per zone, so there is not such concurrency.
The additional contexts can be from zone reset or zone finish, but these
unconditionally call disk_should_remove_zone_wplug(), so there is no such race.

> 
> Thanks,
> 
> Bart.


-- 
Damien Le Moal
Western Digital Research

  reply	other threads:[~2026-02-24  2:02 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-21  0:44 [PATCH 0/8] Improve zoned (SMR) HDD write throughput Damien Le Moal
2026-02-21  0:44 ` [PATCH 1/8] block: fix zone write plug removal Damien Le Moal
2026-02-23 11:56   ` Hannes Reinecke
2026-02-23 19:30     ` Bart Van Assche
2026-02-23 20:21       ` Bart Van Assche
2026-02-24  1:57         ` Damien Le Moal [this message]
2026-02-21  0:44 ` [PATCH 2/8] block: remove BLK_ZONE_WPLUG_UNHASHED Damien Le Moal
2026-02-23 11:48   ` Hannes Reinecke
2026-02-24  2:04     ` Damien Le Moal
2026-02-21  0:44 ` [PATCH 3/8] block: remove disk_zone_is_full() Damien Le Moal
2026-02-23 11:56   ` Hannes Reinecke
2026-02-24 13:15   ` Johannes Thumshirn
2026-02-21  0:44 ` [PATCH 4/8] block: improve disk_zone_wplug_schedule_bio_work() Damien Le Moal
2026-02-23 11:59   ` Hannes Reinecke
2026-02-23 18:56     ` Bart Van Assche
2026-02-24  2:03     ` Damien Le Moal
2026-02-24 15:00       ` Hannes Reinecke
2026-02-24 15:08         ` Christoph Hellwig
2026-02-24 13:18   ` Johannes Thumshirn
2026-02-21  0:44 ` [PATCH 5/8] block: rename struct gendisk zone_wplugs_lock field Damien Le Moal
2026-02-23 12:00   ` Hannes Reinecke
2026-02-24 13:19   ` Johannes Thumshirn
2026-02-21  0:44 ` [PATCH 6/8] block: allow submitting all zone writes from a single context Damien Le Moal
2026-02-23 12:07   ` Hannes Reinecke
2026-02-24  2:00     ` Damien Le Moal
2026-02-21  0:44 ` [PATCH 7/8] block: default to QD=1 writes for blk-mq rotational zoned devices Damien Le Moal
2026-02-23 12:07   ` Hannes Reinecke
2026-02-21  0:44 ` [PATCH 8/8] Documentation: ABI: stable: document the zoned_qd1_writes attribute Damien Le Moal
2026-02-23 12:07   ` Hannes Reinecke
2026-02-23 17:03 ` [PATCH 0/8] Improve zoned (SMR) HDD write throughput Bart Van Assche
2026-02-24  1:07   ` Damien Le Moal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3a8fdb1f-db52-4264-b9c4-e8339c719dc4@kernel.org \
    --to=dlemoal@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=hare@suse.de \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox