From: Niklas Cassel <cassel@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, dm-devel@lists.linux.dev,
Mike Snitzer <snitzer@kernel.org>,
Mikulas Patocka <mpatocka@redhat.com>
Subject: Re: [PATCH 3/4] block: Fix zone write plugging handling of devices with a runt zone
Date: Thu, 30 May 2024 09:37:09 +0200
Message-ID: <ZlgspRAZQ4lmqBcC@ryzen.lan>
In-Reply-To: <20240530054035.491497-4-dlemoal@kernel.org>

On Thu, May 30, 2024 at 02:40:34PM +0900, Damien Le Moal wrote:
> A zoned device may have a last sequential write required zone that is
> smaller than other zones. However, all tests to check if a zone write
> plug write offset exceeds the zone capacity use the same capacity
> value stored in the gendisk zone_capacity field. This is incorrect for a
> zoned device with a last runt (smaller) zone.
>
> Add the new field last_zone_capacity to struct gendisk to store the
> capacity of the last zone of the device. blk_revalidate_seq_zone() and
> blk_revalidate_conv_zone() are both modified to get this value when
> disk_zone_is_last() returns true. Similarly to zone_capacity, the value
> is first stored using the last_zone_capacity field of struct
> blk_revalidate_zone_args. Once zone revalidation of all zones is done,
> this is used to set the gendisk last_zone_capacity field.
>
> The checks to determine if a zone is full or if a sector offset in a
> zone exceeds the zone capacity in disk_should_remove_zone_wplug(),
> disk_zone_wplug_abort_unaligned(), blk_zone_write_plug_init_request(),
> and blk_zone_wplug_prepare_bio() are modified to use the new helper
> functions disk_zone_is_full() and disk_zone_wplug_is_full().
> disk_zone_is_full() uses the zone index to determine if the zone being
> tested is the last one of the disk and uses either the disk
> zone_capacity or last_zone_capacity accordingly.
>
> Fixes: dd291d77cc90 ("block: Introduce zone write plugging")
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
> ---
>  block/blk-zoned.c      | 35 +++++++++++++++++++++++++++--------
>  include/linux/blkdev.h |  1 +
>  2 files changed, 28 insertions(+), 8 deletions(-)
>
> diff --git a/block/blk-zoned.c b/block/blk-zoned.c
> index 402a50a1ac4d..52abebf56027 100644
> --- a/block/blk-zoned.c
> +++ b/block/blk-zoned.c
> @@ -455,6 +455,20 @@ static bool disk_zone_is_last(struct gendisk *disk, struct blk_zone *zone)
>  	return zone->start + zone->len >= get_capacity(disk);
>  }
>
> +static bool disk_zone_is_full(struct gendisk *disk,
> +			      unsigned int zno, unsigned int offset_in_zone)

Why not just call the third parameter wp?

> +{
> +	if (zno < disk->nr_zones - 1)
> +		return offset_in_zone >= disk->zone_capacity;
> +	return offset_in_zone >= disk->last_zone_capacity;
> +}
> +
> +static bool disk_zone_wplug_is_full(struct gendisk *disk,
> +				    struct blk_zone_wplug *zwplug)
> +{
> +	return disk_zone_is_full(disk, zwplug->zone_no, zwplug->wp_offset);
> +}
> +
>  static bool disk_insert_zone_wplug(struct gendisk *disk,
>  				   struct blk_zone_wplug *zwplug)
>  {
> @@ -548,7 +562,7 @@ static inline bool disk_should_remove_zone_wplug(struct gendisk *disk,
>  		return false;
>
>  	/* We can remove zone write plugs for zones that are empty or full. */
> -	return !zwplug->wp_offset || zwplug->wp_offset >= disk->zone_capacity;
> +	return !zwplug->wp_offset || disk_zone_wplug_is_full(disk, zwplug);
>  }
>
>  static void disk_remove_zone_wplug(struct gendisk *disk,
> @@ -669,13 +683,12 @@ static void disk_zone_wplug_abort(struct blk_zone_wplug *zwplug)
>  static void disk_zone_wplug_abort_unaligned(struct gendisk *disk,
>  					    struct blk_zone_wplug *zwplug)
>  {
> -	unsigned int zone_capacity = disk->zone_capacity;
>  	unsigned int wp_offset = zwplug->wp_offset;
>  	struct bio_list bl = BIO_EMPTY_LIST;
>  	struct bio *bio;
>
>  	while ((bio = bio_list_pop(&zwplug->bio_list))) {
> -		if (wp_offset >= zone_capacity ||
> +		if (disk_zone_is_full(disk, zwplug->zone_no, wp_offset) ||

Why don't you use disk_zone_wplug_is_full() here?

>  		    (bio_op(bio) != REQ_OP_ZONE_APPEND &&
>  		     bio_offset_from_zone_start(bio) != wp_offset)) {
>  			blk_zone_wplug_bio_io_error(zwplug, bio);
> @@ -914,7 +927,6 @@ void blk_zone_write_plug_init_request(struct request *req)
>  	sector_t req_back_sector = blk_rq_pos(req) + blk_rq_sectors(req);
>  	struct request_queue *q = req->q;
>  	struct gendisk *disk = q->disk;
> -	unsigned int zone_capacity = disk->zone_capacity;
>  	struct blk_zone_wplug *zwplug =
>  		disk_get_zone_wplug(disk, blk_rq_pos(req));
>  	unsigned long flags;
> @@ -938,7 +950,7 @@ void blk_zone_write_plug_init_request(struct request *req)
>  	 * into the back of the request.
>  	 */
>  	spin_lock_irqsave(&zwplug->lock, flags);
> -	while (zwplug->wp_offset < zone_capacity) {
> +	while (!disk_zone_wplug_is_full(disk, zwplug)) {
>  		bio = bio_list_peek(&zwplug->bio_list);
>  		if (!bio)
>  			break;
> @@ -984,7 +996,7 @@ static bool blk_zone_wplug_prepare_bio(struct blk_zone_wplug *zwplug,
>  	 * We know such BIO will fail, and that would potentially overflow our
>  	 * write pointer offset beyond the end of the zone.
>  	 */
> -	if (zwplug->wp_offset >= disk->zone_capacity)
> +	if (disk_zone_wplug_is_full(disk, zwplug))
>  		goto err;
>
>  	if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
> @@ -1561,6 +1573,7 @@ void disk_free_zone_resources(struct gendisk *disk)
>  	kfree(disk->conv_zones_bitmap);
>  	disk->conv_zones_bitmap = NULL;
>  	disk->zone_capacity = 0;
> +	disk->last_zone_capacity = 0;
>  	disk->nr_zones = 0;
>  }
>
> @@ -1605,6 +1618,7 @@ struct blk_revalidate_zone_args {
>  	unsigned long *conv_zones_bitmap;
>  	unsigned int nr_zones;
>  	unsigned int zone_capacity;
> +	unsigned int last_zone_capacity;
>  	sector_t sector;
>  };
>
> @@ -1622,6 +1636,7 @@ static int disk_update_zone_resources(struct gendisk *disk,
>
>  	disk->nr_zones = args->nr_zones;
>  	disk->zone_capacity = args->zone_capacity;
> +	disk->last_zone_capacity = args->last_zone_capacity;
>  	swap(disk->conv_zones_bitmap, args->conv_zones_bitmap);
>  	if (disk->conv_zones_bitmap)
>  		nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap,
> @@ -1673,6 +1688,9 @@ static int blk_revalidate_conv_zone(struct blk_zone *zone, unsigned int idx,
>  		return -ENODEV;
>  	}
>
> +	if (disk_zone_is_last(disk, zone))
> +		args->last_zone_capacity = zone->capacity;
> +
>  	if (!disk_need_zone_resources(disk))
>  		return 0;
>
> @@ -1703,8 +1721,9 @@ static int blk_revalidate_seq_zone(struct blk_zone *zone, unsigned int idx,
>  	 */
>  	if (!args->zone_capacity)
>  		args->zone_capacity = zone->capacity;
> -	if (!disk_zone_is_last(disk, zone) &&
> -	    zone->capacity != args->zone_capacity) {
> +	if (disk_zone_is_last(disk, zone)) {
> +		args->last_zone_capacity = zone->capacity;
> +	} else if (zone->capacity != args->zone_capacity) {
>  		pr_warn("%s: Invalid variable zone capacity\n",
>  			disk->disk_name);
>  		return -ENODEV;
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index aefdda9f4ec7..24c36929920b 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -186,6 +186,7 @@ struct gendisk {
>  	 */
>  	unsigned int nr_zones;
>  	unsigned int zone_capacity;
> +	unsigned int last_zone_capacity;
>  	unsigned long *conv_zones_bitmap;
>  	unsigned int zone_wplugs_hash_bits;
>  	spinlock_t zone_wplugs_lock;
> --
> 2.45.1
>
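
For illustration only, the capacity selection described in the commit
message can be modeled in user space. The sketch below is not kernel code:
struct zoned_disk_model and both model_* functions are hypothetical names
made up for this example, and only the comparison logic is taken from
disk_zone_is_full() in the quoted diff. Capacities and offsets are assumed
to be in sectors.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the zone-related fields of struct gendisk. */
struct zoned_disk_model {
	unsigned int nr_zones;
	unsigned int zone_capacity;      /* every zone except the last one */
	unsigned int last_zone_capacity; /* last zone, possibly a runt */
};

/* Same comparison as disk_zone_is_full() in the quoted patch. */
static bool model_zone_is_full(const struct zoned_disk_model *disk,
			       unsigned int zno, unsigned int offset_in_zone)
{
	if (zno < disk->nr_zones - 1)
		return offset_in_zone >= disk->zone_capacity;
	return offset_in_zone >= disk->last_zone_capacity;
}

/* The check used before the patch: one capacity for all zones. */
static bool model_zone_is_full_old(const struct zoned_disk_model *disk,
				   unsigned int offset_in_zone)
{
	return offset_in_zone >= disk->zone_capacity;
}
```

With nr_zones = 4, zone_capacity = 1024 and last_zone_capacity = 256, an
offset of 256 in zone 3 is reported as full by model_zone_is_full(), while
model_zone_is_full_old() still treats it as writable. That mismatch is the
case the commit message calls incorrect for a device with a runt zone.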