From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>,
linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
linux-scsi@vger.kernel.org,
"Martin K . Petersen" <martin.petersen@oracle.com>,
dm-devel@lists.linux.dev, Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v2 07/28] block: Introduce zone write plugging
Date: Tue, 26 Mar 2024 12:12:03 +0900
Message-ID: <839ebf2a-4dc6-433b-bc47-fd7915ed0ecf@kernel.org>
In-Reply-To: <f3b298bb-b68a-4375-a3b4-fc91229740c1@acm.org>

On 3/26/24 06:53, Bart Van Assche wrote:
> On 3/24/24 21:44, Damien Le Moal wrote:
>> +/*
>> + * Per-zone write plug.
>> + */
>> +struct blk_zone_wplug {
>> +	struct hlist_node	node;
>> +	struct list_head	err;
>> +	atomic_t		ref;
>> +	spinlock_t		lock;
>> +	unsigned int		flags;
>> +	unsigned int		zone_no;
>> +	unsigned int		wp_offset;
>> +	struct bio_list		bio_list;
>> +	struct work_struct	bio_work;
>> +};
>
> Please document what 'lock' protects. Please also document the unit of
> wp_offset.
>
> Since there is an atomic reference count in this data structure, why is
> the flag BLK_ZONE_WPLUG_FREEING required? Can that flag be replaced by
> checking whether or not 'ref' is zero?

Nope, we cannot. The reason is that BIO issuing and zone reset/finish can be
concurrently processed and we need to be ready for a user doing really stupid
things like resetting or finishing a zone while BIOs for that zone are being
issued. When zone reset/finish is processed, the plug is removed from the hash
table, but disk_get_zone_wplug_locked() may still get a reference to it because
we do not have the plug locked yet. Hence the flag, to prevent reusing the plug
for the reset/finished zone that was already removed from the hash table. This
is mentioned with a comment in disk_get_zone_wplug_locked():

	/*
	 * Check that a BIO completion or a zone reset or finish
	 * operation has not already flagged the zone write plug for
	 * freeing and dropped its reference count. In such case, we
	 * need to get a new plug so start over from the beginning.
	 */

The reference count dropping to 0 will then be the trigger for actually freeing
the plug, after all in-flight or plugged BIOs are completed (most likely failed).

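To make the distinction concrete, here is a rough userspace model of the race
described above. It is only a sketch, not the code from the patch: struct
wplug, WPLUG_FREEING, wplug_tryget(), wplug_put(), wplug_reset() and
wplug_get_locked() are hypothetical stand-ins, and C11 atomics plus a pthread
mutex are assumed in place of the kernel's atomic_t and spinlock_t. The point it
illustrates is that a stale pointer can still take a reference while a BIO is in
flight (the count is not zero yet), so only the flag, checked under the plug
lock, says "do not reuse this plug".

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BLK_ZONE_WPLUG_FREEING. */
#define WPLUG_FREEING 0x1u

/* Hypothetical, reduced model of a zone write plug. */
struct wplug {
	atomic_int ref;		/* hash table reference + users */
	pthread_mutex_t lock;	/* protects flags in this model */
	unsigned int flags;
};

/* Take a reference unless the count has already dropped to zero. */
static bool wplug_tryget(struct wplug *p)
{
	int old = atomic_load(&p->ref);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&p->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference. Returns true if this was the last one (time to free). */
static bool wplug_put(struct wplug *p)
{
	int old = atomic_fetch_sub(&p->ref, 1);

	assert(old > 0);
	return old == 1;
}

/*
 * Model of a zone reset/finish: flag the plug as going away and drop the
 * reference that the hash table held. In-flight BIOs may still hold theirs.
 */
static bool wplug_reset(struct wplug *p)
{
	pthread_mutex_lock(&p->lock);
	p->flags |= WPLUG_FREEING;
	pthread_mutex_unlock(&p->lock);
	return wplug_put(p);
}

/*
 * Model of the submission path holding a pointer obtained before the reset.
 * On success the plug is returned locked. On failure the caller must start
 * over and get a new plug.
 */
static bool wplug_get_locked(struct wplug *p)
{
	if (!wplug_tryget(p))
		return false;

	pthread_mutex_lock(&p->lock);
	if (p->flags & WPLUG_FREEING) {
		/* Reference obtained, but the plug was already unhashed. */
		pthread_mutex_unlock(&p->lock);
		wplug_put(p);
		return false;
	}
	return true;
}
```

With one BIO still in flight, wplug_reset() leaves the count at 1, so a test on
"ref == 0" alone would happily hand the stale plug back to a new writer. The
flag check rejects it, and the final wplug_put() from the BIO completion is what
reports that the plug can be freed.
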
>> -void disk_free_zone_bitmaps(struct gendisk *disk)
>> +static bool disk_insert_zone_wplug(struct gendisk *disk,
>> +				   struct blk_zone_wplug *zwplug)
>> +{
>> +	struct blk_zone_wplug *zwplg;
>> +	unsigned long flags;
>> +	unsigned int idx =
>> +		hash_32(zwplug->zone_no, disk->zone_wplugs_hash_bits);
>> +
>> +	/*
>> +	 * Add the new zone write plug to the hash table, but carefully as we
>> +	 * are racing with other submission context, so we may already have a
>> +	 * zone write plug for the same zone.
>> +	 */
>> +	spin_lock_irqsave(&disk->zone_wplugs_lock, flags);
>> +	hlist_for_each_entry_rcu(zwplg, &disk->zone_wplugs_hash[idx], node) {
>> +		if (zwplg->zone_no == zwplug->zone_no) {
>> +			spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
>> +			return false;
>> +		}
>> +	}
>> +	hlist_add_head_rcu(&zwplug->node, &disk->zone_wplugs_hash[idx]);
>> +	spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags);
>> +
>> +	return true;
>> +}
>
> Since this function inserts an element into disk->zone_wplugs_hash[],
> can it happen that another thread removes that element from the hash
> list before this function returns?

No, that cannot happen. Both insertion and deletion of plugs in the hash table
are serialized with disk->zone_wplugs_lock. See disk_remove_zone_wplug().

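As an illustration of that serialization, the following is a minimal userspace
sketch, assuming a pthread mutex in the role of disk->zone_wplugs_lock and a
plain singly linked bucket list instead of the RCU hlist. struct plug, struct
table, NBUCKETS, table_insert() and table_remove() are hypothetical names for
this model only; in particular table_remove() is not the body of
disk_remove_zone_wplug(), which is not quoted in this message.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS 8u	/* hypothetical table size for this model */

struct plug {
	struct plug *next;
	unsigned int zone_no;
};

struct table {
	pthread_mutex_t lock;	/* plays the role of disk->zone_wplugs_lock */
	struct plug *buckets[NBUCKETS];
};

/* Insert p unless a plug for the same zone is already hashed. */
static bool table_insert(struct table *t, struct plug *p)
{
	struct plug **head = &t->buckets[p->zone_no % NBUCKETS];
	struct plug *it;
	bool inserted = true;

	pthread_mutex_lock(&t->lock);
	for (it = *head; it; it = it->next) {
		if (it->zone_no == p->zone_no) {
			/* Lost the race: another context added this zone. */
			inserted = false;
			break;
		}
	}
	if (inserted) {
		p->next = *head;
		*head = p;
	}
	pthread_mutex_unlock(&t->lock);

	return inserted;
}

/* Unlink p if it is hashed. Takes the same lock as table_insert(). */
static bool table_remove(struct table *t, struct plug *p)
{
	struct plug **pp = &t->buckets[p->zone_no % NBUCKETS];
	bool removed = false;

	pthread_mutex_lock(&t->lock);
	for (; *pp; pp = &(*pp)->next) {
		if (*pp == p) {
			*pp = p->next;
			p->next = NULL;
			removed = true;
			break;
		}
	}
	pthread_mutex_unlock(&t->lock);

	return removed;
}
```

Because the duplicate check and the list update happen in one critical section,
and removal takes the same lock, neither operation can observe or modify a
bucket while the other is halfway through.
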
--
Damien Le Moal
Western Digital Research