From: Damien Le Moal <dlemoal@kernel.org>
To: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
Keith Busch <keith.busch@wdc.com>, Christoph Hellwig <hch@lst.de>,
dm-devel@lists.linux.dev, Mike Snitzer <snitzer@kernel.org>,
Mikulas Patocka <mpatocka@redhat.com>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
linux-scsi@vger.kernel.org, linux-xfs@vger.kernel.org,
Carlos Maiolino <cem@kernel.org>,
linux-btrfs@vger.kernel.org, David Sterba <dsterba@suse.com>
Subject: [PATCH v2 10/15] block: introduce blkdev_report_zones_cached()
Date: Mon, 3 Nov 2025 22:31:18 +0900 [thread overview]
Message-ID: <20251103133123.645038-11-dlemoal@kernel.org> (raw)
In-Reply-To: <20251103133123.645038-1-dlemoal@kernel.org>
Introduce the function blkdev_report_zones_cached() to provide a fast
report zone built using the blkdev_get_zone_info() function, which gets
zone information from a disk zones_cond array or zone write plugs.
For a large capacity SMR drive, such fast report zone can be completed
in a few milliseconds compared to several seconds completion times
when the report zone is obtained from the device.
The zone report is built in the same manner as with the regular
blkdev_report_zones() function, that is, the first zone reported is the
one containing the specified start sector and the report is limited to
the specified number of zones (nr_zones argument). The information for
each zone in the report is obtained using blkdev_get_zone_info().
For zoned devices that do not use zone write plug resources,
using blkdev_get_zone_info() is inefficient as the zone report would
be very slow, generated one zone at a time. To avoid this,
blkdev_report_zones_cached() falls back to calling
blkdev_do_report_zones() to execute a regular zone report. In this case,
the .report_active field of struct blk_report_zones_args is set to true
to report zone conditions using the BLK_ZONE_COND_ACTIVE condition in
place of the implicit open, explicit open and closed conditions.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
block/blk-zoned.c | 88 +++++++++++++++++++++++++++++++++++-------
include/linux/blkdev.h | 2 +
2 files changed, 77 insertions(+), 13 deletions(-)
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 8d75c9ef53a0..94dde4590a8e 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -73,6 +73,19 @@ struct blk_zone_wplug {
enum blk_zone_cond cond;
};
+static inline bool disk_need_zone_resources(struct gendisk *disk)
+{
+ /*
+ * All request-based zoned devices need zone resources so that the
+ * block layer can automatically handle write BIO plugging. BIO-based
+ * device drivers (e.g. DM devices) are normally responsible for
+ * handling zone write ordering and do not need zone resources, unless
+ * the driver requires zone append emulation.
+ */
+ return queue_is_mq(disk->queue) ||
+ queue_emulates_zone_append(disk->queue);
+}
+
static inline unsigned int disk_zone_wplugs_hash_size(struct gendisk *disk)
{
return 1U << disk->zone_wplugs_hash_bits;
@@ -962,6 +975,68 @@ int blkdev_get_zone_info(struct block_device *bdev, sector_t sector,
}
EXPORT_SYMBOL_GPL(blkdev_get_zone_info);
+/**
+ * blkdev_report_zones_cached - Get cached zones information
+ * @bdev: Target block device
+ * @sector: Sector from which to report zones
+ * @nr_zones: Maximum number of zones to report
+ * @cb: Callback function called for each reported zone
+ * @data: Private data for the callback function
+ *
+ * Description:
+ * Similar to blkdev_report_zones() but instead of calling into the low level
+ * device driver to get the zone report from the device, use
+ * blkdev_get_zone_info() to generate the report from the disk zone write
+ * plugs and zones condition array. Since calling this function without a
+ * callback does not make sense, @cb must be specified.
+ */
+int blkdev_report_zones_cached(struct block_device *bdev, sector_t sector,
+ unsigned int nr_zones, report_zones_cb cb, void *data)
+{
+ struct gendisk *disk = bdev->bd_disk;
+ sector_t capacity = get_capacity(disk);
+ sector_t zone_sectors = bdev_zone_sectors(bdev);
+ unsigned int idx = 0;
+ struct blk_zone zone;
+ int ret;
+
+ if (!cb || !bdev_is_zoned(bdev) ||
+ WARN_ON_ONCE(!disk->fops->report_zones))
+ return -EOPNOTSUPP;
+
+ if (!nr_zones || sector >= capacity)
+ return 0;
+
+ /*
+ * If we do not have any zone write plug resources, fallback to using
+ * the regular zone report.
+ */
+ if (!disk_need_zone_resources(disk)) {
+ struct blk_report_zones_args args = {
+ .cb = cb,
+ .data = data,
+ .report_active = true,
+ };
+
+ return blkdev_do_report_zones(bdev, sector, nr_zones, &args);
+ }
+
+ for (sector = ALIGN_DOWN(sector, zone_sectors);
+ sector < capacity && idx < nr_zones;
+ sector += zone_sectors, idx++) {
+ ret = blkdev_get_zone_info(bdev, sector, &zone);
+ if (ret)
+ return ret;
+
+ ret = cb(&zone, idx, data);
+ if (ret)
+ return ret;
+ }
+
+ return idx;
+}
+EXPORT_SYMBOL_GPL(blkdev_report_zones_cached);
+
static void blk_zone_reset_bio_endio(struct bio *bio)
{
struct gendisk *disk = bio->bi_bdev->bd_disk;
@@ -1771,19 +1846,6 @@ void disk_free_zone_resources(struct gendisk *disk)
disk->nr_zones = 0;
}
-static inline bool disk_need_zone_resources(struct gendisk *disk)
-{
- /*
- * All mq zoned devices need zone resources so that the block layer
- * can automatically handle write BIO plugging. BIO-based device drivers
- * (e.g. DM devices) are normally responsible for handling zone write
- * ordering and do not need zone resources, unless the driver requires
- * zone append emulation.
- */
- return queue_is_mq(disk->queue) ||
- queue_emulates_zone_append(disk->queue);
-}
-
struct blk_revalidate_zone_args {
struct gendisk *disk;
u8 *zones_cond;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 03a594b4dfbc..f0ab02e0a673 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -442,6 +442,8 @@ int blkdev_get_zone_info(struct block_device *bdev, sector_t sector,
#define BLK_ALL_ZONES ((unsigned int)-1)
int blkdev_report_zones(struct block_device *bdev, sector_t sector,
unsigned int nr_zones, report_zones_cb cb, void *data);
+int blkdev_report_zones_cached(struct block_device *bdev, sector_t sector,
+ unsigned int nr_zones, report_zones_cb cb, void *data);
int blkdev_zone_mgmt(struct block_device *bdev, enum req_op op,
sector_t sectors, sector_t nr_sectors);
int blk_revalidate_disk_zones(struct gendisk *disk);
--
2.51.0
next prev parent reply other threads:[~2025-11-03 13:35 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-03 13:31 [PATCH v2 00/15] Introduce cached report zones Damien Le Moal
2025-11-03 13:31 ` [PATCH v2 01/15] block: handle zone management operations completions Damien Le Moal
2025-11-03 13:53 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 02/15] block: freeze queue when updating zone resources Damien Le Moal
2025-11-03 13:46 ` Christoph Hellwig
2025-11-03 13:55 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 03/15] block: cleanup blkdev_report_zones() Damien Le Moal
2025-11-03 13:56 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 04/15] block: introduce disk_report_zone() Damien Le Moal
2025-11-03 14:01 ` Johannes Thumshirn
2025-11-04 0:36 ` kernel test robot
2025-11-03 13:31 ` [PATCH v2 05/15] block: reorganize struct blk_zone_wplug Damien Le Moal
2025-11-03 14:01 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 06/15] block: use zone condition to determine conventional zones Damien Le Moal
2025-11-03 14:52 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 07/15] block: track zone conditions Damien Le Moal
2025-11-03 15:00 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 08/15] block: refactor blkdev_report_zones() code Damien Le Moal
2025-11-03 13:47 ` Christoph Hellwig
2025-11-03 15:01 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 09/15] block: introduce blkdev_get_zone_info() Damien Le Moal
2025-11-03 15:12 ` Johannes Thumshirn
2025-11-03 13:31 ` Damien Le Moal [this message]
2025-11-03 13:31 ` [PATCH v2 11/15] block: introduce BLKREPORTZONESV2 ioctl Damien Le Moal
2025-11-03 15:17 ` Johannes Thumshirn
2025-11-03 22:12 ` Bart Van Assche
2025-11-03 23:01 ` Damien Le Moal
2025-11-04 0:15 ` Damien Le Moal
2025-11-04 1:01 ` Bart Van Assche
2025-11-04 1:20 ` Damien Le Moal
2025-11-04 7:23 ` Johannes Thumshirn
2025-11-04 7:38 ` Damien Le Moal
2025-11-03 13:31 ` [PATCH v2 12/15] block: improve zone_wplugs debugfs attribute output Damien Le Moal
2025-11-03 13:47 ` Christoph Hellwig
2025-11-03 15:18 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 13/15] block: add zone write plug condition to debugfs zone_wplugs Damien Le Moal
2025-11-03 15:23 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 14/15] btrfs: use blkdev_report_zones_cached() Damien Le Moal
2025-11-03 15:26 ` Johannes Thumshirn
2025-11-03 13:31 ` [PATCH v2 15/15] xfs: " Damien Le Moal
2025-11-03 15:27 ` Johannes Thumshirn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251103133123.645038-11-dlemoal@kernel.org \
--to=dlemoal@kernel.org \
--cc=axboe@kernel.dk \
--cc=cem@kernel.org \
--cc=dm-devel@lists.linux.dev \
--cc=dsterba@suse.com \
--cc=hch@lst.de \
--cc=keith.busch@wdc.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=linux-nvme@lists.infradead.org \
--cc=linux-scsi@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=mpatocka@redhat.com \
--cc=snitzer@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).