* [PATCH 0/2] Introduce bdev_zone_is_seq() @ 2024-11-06 23:13 Damien Le Moal 2024-11-06 23:13 ` [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap Damien Le Moal 2024-11-06 23:13 ` [PATCH 2/2] block: Add a public bdev_zone_is_seq() helper Damien Le Moal 0 siblings, 2 replies; 9+ messages in thread From: Damien Le Moal @ 2024-11-06 23:13 UTC (permalink / raw) To: Jens Axboe, linux-block; +Cc: Christoph Hellwig Allow file systems to safely access a block device gendisk bitmap of conventional zones to determine a zone type by: 1) Patch 1 - changing the gendisk conv_zones_bitmap to be RCU protected 2) Patch 2 - Introducing the helper function bdev_zone_is_seq() This is in preparation for use in btrfs to remove the btrfs-managed bitmap of conventional zones and in zoned support for xfs. Damien Le Moal (2): block: RCU protect disk->conv_zones_bitmap block: Add a public bdev_zone_is_seq() helper block/blk-zoned.c | 43 ++++++++++++++++++++++++------------------ include/linux/blkdev.h | 28 ++++++++++++++++++++++++++- 2 files changed, 52 insertions(+), 19 deletions(-) -- 2.47.0 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap 2024-11-06 23:13 [PATCH 0/2] Introduce bdev_zone_is_seq() Damien Le Moal @ 2024-11-06 23:13 ` Damien Le Moal 2024-11-06 23:20 ` Bart Van Assche 2024-11-07 5:40 ` Christoph Hellwig 2024-11-06 23:13 ` [PATCH 2/2] block: Add a public bdev_zone_is_seq() helper Damien Le Moal 1 sibling, 2 replies; 9+ messages in thread From: Damien Le Moal @ 2024-11-06 23:13 UTC (permalink / raw) To: Jens Axboe, linux-block; +Cc: Christoph Hellwig Ensure that a disk revalidation changing the conventional zones bitmap of a disk does not cause invalid memory references when using the disk_zone_is_conv() helper by RCU protecting the disk->conv_zones_bitmap pointer. disk_zone_is_conv() is modified to operate under the RCU read lock and the function disk_set_conv_zones_bitmap() is added to update a disk conv_zones_bitmap pointer using rcu_replace_pointer() with the disk zone_wplugs_lock spinlock held. disk_free_zone_resources() is modified to call disk_update_zone_resources() with a NULL bitmap pointer to free the disk conv_zones_bitmap. disk_set_conv_zones_bitmap() is also used in disk_update_zone_resources() to set the new (revalidated) bitmap and free the old one. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> --- block/blk-zoned.c | 43 ++++++++++++++++++++++++++++++------------ include/linux/blkdev.h | 2 +- 2 files changed, 32 insertions(+), 13 deletions(-) diff --git a/block/blk-zoned.c b/block/blk-zoned.c index a287577d1ad6..7a7855555d6d 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) { - if (!disk->conv_zones_bitmap) - return false; - return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); + bool is_conv; + + rcu_read_lock(); + is_conv = disk->conv_zones_bitmap && + test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); + rcu_read_unlock(); + + return is_conv; } static bool disk_zone_is_last(struct gendisk *disk, struct blk_zone *zone) @@ -1455,6 +1460,25 @@ static void disk_destroy_zone_wplugs_hash_table(struct gendisk *disk) disk->zone_wplugs_hash_bits = 0; } +static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk, + unsigned long *bitmap) +{ + unsigned int nr_conv_zones = 0; + unsigned long flags; + + spin_lock_irqsave(&disk->zone_wplugs_lock, flags); + bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap, + lockdep_is_held(&disk->zone_wplugs_lock)); + if (disk->conv_zones_bitmap) + nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap, + disk->nr_zones); + spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); + + kfree_rcu_mightsleep(bitmap); + + return nr_conv_zones; +} + void disk_free_zone_resources(struct gendisk *disk) { if (!disk->zone_wplugs_pool) @@ -1478,8 +1502,7 @@ void disk_free_zone_resources(struct gendisk *disk) mempool_destroy(disk->zone_wplugs_pool); disk->zone_wplugs_pool = NULL; - bitmap_free(disk->conv_zones_bitmap); - disk->conv_zones_bitmap = NULL; + disk_set_conv_zones_bitmap(disk, NULL); disk->zone_capacity = 0; disk->last_zone_capacity = 0; disk->nr_zones = 0; @@ -1538,17 +1561,15 @@ static int disk_update_zone_resources(struct gendisk *disk, struct blk_revalidate_zone_args *args) { struct request_queue *q = disk->queue; - unsigned int nr_seq_zones, nr_conv_zones = 0; + unsigned int nr_seq_zones, nr_conv_zones; unsigned int pool_size; struct queue_limits lim; disk->nr_zones = args->nr_zones; disk->zone_capacity = args->zone_capacity; disk->last_zone_capacity = args->last_zone_capacity; - swap(disk->conv_zones_bitmap, args->conv_zones_bitmap); - if (disk->conv_zones_bitmap) - nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap, - disk->nr_zones); + nr_conv_zones = + disk_set_conv_zones_bitmap(disk, args->conv_zones_bitmap); if (nr_conv_zones >= disk->nr_zones) { pr_warn("%s: Invalid number of conventional zones %u / %u\n", disk->disk_name, nr_conv_zones, disk->nr_zones); @@ -1817,8 +1838,6 @@ int blk_revalidate_disk_zones(struct gendisk *disk) disk_free_zone_resources(disk); blk_mq_unfreeze_queue(q); - kfree(args.conv_zones_bitmap); - return ret; } EXPORT_SYMBOL_GPL(blk_revalidate_disk_zones); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 6d1413bd69a5..5687eb2a019c 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -195,7 +195,7 @@ struct gendisk { unsigned int nr_zones; unsigned int zone_capacity; unsigned int last_zone_capacity; - unsigned long *conv_zones_bitmap; + unsigned long __rcu *conv_zones_bitmap; unsigned int zone_wplugs_hash_bits; spinlock_t zone_wplugs_lock; struct mempool_s *zone_wplugs_pool; -- 2.47.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap 2024-11-06 23:13 ` [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap Damien Le Moal @ 2024-11-06 23:20 ` Bart Van Assche 2024-11-06 23:44 ` Damien Le Moal 2024-11-07 5:40 ` Christoph Hellwig 1 sibling, 1 reply; 9+ messages in thread From: Bart Van Assche @ 2024-11-06 23:20 UTC (permalink / raw) To: Damien Le Moal, Jens Axboe, linux-block; +Cc: Christoph Hellwig On 11/6/24 3:13 PM, Damien Le Moal wrote: > diff --git a/block/blk-zoned.c b/block/blk-zoned.c > index a287577d1ad6..7a7855555d6d 100644 > --- a/block/blk-zoned.c > +++ b/block/blk-zoned.c > @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, > > static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) > { > - if (!disk->conv_zones_bitmap) > - return false; > - return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); > + bool is_conv; > + > + rcu_read_lock(); > + is_conv = disk->conv_zones_bitmap && > + test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); > + rcu_read_unlock(); > + > + return is_conv; > } The above code can be simplified significantly by using guard(rcu). Otherwise the two patches in this series look good to me. Thanks, Bart. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap 2024-11-06 23:20 ` Bart Van Assche @ 2024-11-06 23:44 ` Damien Le Moal 2024-11-07 0:02 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Damien Le Moal @ 2024-11-06 23:44 UTC (permalink / raw) To: Bart Van Assche, Jens Axboe, linux-block; +Cc: Christoph Hellwig On 11/7/24 08:20, Bart Van Assche wrote: > On 11/6/24 3:13 PM, Damien Le Moal wrote: >> diff --git a/block/blk-zoned.c b/block/blk-zoned.c >> index a287577d1ad6..7a7855555d6d 100644 >> --- a/block/blk-zoned.c >> +++ b/block/blk-zoned.c >> @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, >> >> static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) >> { >> - if (!disk->conv_zones_bitmap) >> - return false; >> - return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); >> + bool is_conv; >> + >> + rcu_read_lock(); >> + is_conv = disk->conv_zones_bitmap && >> + test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); >> + rcu_read_unlock(); >> + >> + return is_conv; >> } > > The above code can be simplified significantly by using guard(rcu). I personally dislike very much annotations that hide code. So unless Jens prefers using guard(rcu), I would prefer leaving the code as it is. > Otherwise the two patches in this series look good to me. > > Thanks, > > Bart. -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap 2024-11-06 23:44 ` Damien Le Moal @ 2024-11-07 0:02 ` Jens Axboe 2024-11-07 0:14 ` Damien Le Moal 0 siblings, 1 reply; 9+ messages in thread From: Jens Axboe @ 2024-11-07 0:02 UTC (permalink / raw) To: Damien Le Moal, Bart Van Assche, linux-block; +Cc: Christoph Hellwig On 11/6/24 4:44 PM, Damien Le Moal wrote: > On 11/7/24 08:20, Bart Van Assche wrote: >> On 11/6/24 3:13 PM, Damien Le Moal wrote: >>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c >>> index a287577d1ad6..7a7855555d6d 100644 >>> --- a/block/blk-zoned.c >>> +++ b/block/blk-zoned.c >>> @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, >>> >>> static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) >>> { >>> - if (!disk->conv_zones_bitmap) >>> - return false; >>> - return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); >>> + bool is_conv; >>> + >>> + rcu_read_lock(); >>> + is_conv = disk->conv_zones_bitmap && >>> + test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); >>> + rcu_read_unlock(); >>> + >>> + return is_conv; >>> } >> >> The above code can be simplified significantly by using guard(rcu). > > I personally dislike very much annotations that hide code. So unless > Jens prefers using guard(rcu), I would prefer leaving the code as it > is. I don't mind it, and I do use it myself when it makes sense - but imho it's up to the person writing the code, particularly when it's their code in the first place. -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap 2024-11-07 0:02 ` Jens Axboe @ 2024-11-07 0:14 ` Damien Le Moal 0 siblings, 0 replies; 9+ messages in thread From: Damien Le Moal @ 2024-11-07 0:14 UTC (permalink / raw) To: Jens Axboe, Bart Van Assche, linux-block; +Cc: Christoph Hellwig On 11/7/24 09:02, Jens Axboe wrote: > On 11/6/24 4:44 PM, Damien Le Moal wrote: >> On 11/7/24 08:20, Bart Van Assche wrote: >>> On 11/6/24 3:13 PM, Damien Le Moal wrote: >>>> diff --git a/block/blk-zoned.c b/block/blk-zoned.c >>>> index a287577d1ad6..7a7855555d6d 100644 >>>> --- a/block/blk-zoned.c >>>> +++ b/block/blk-zoned.c >>>> @@ -350,9 +350,14 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, >>>> >>>> static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) >>>> { >>>> - if (!disk->conv_zones_bitmap) >>>> - return false; >>>> - return test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); >>>> + bool is_conv; >>>> + >>>> + rcu_read_lock(); >>>> + is_conv = disk->conv_zones_bitmap && >>>> + test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); >>>> + rcu_read_unlock(); >>>> + >>>> + return is_conv; >>>> } >>> >>> The above code can be simplified significantly by using guard(rcu). >> >> I personally dislike very much annotations that hide code. So unless >> Jens prefers using guard(rcu), I would prefer leaving the code as it >> is. > > I don't mind it, and I do use it myself when it makes sense - but imho > it's up to the person writing the code, particularly when it's their > code in the first place. OK, then I would prefer leaving the code as is :) -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap 2024-11-06 23:13 ` [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap Damien Le Moal 2024-11-06 23:20 ` Bart Van Assche @ 2024-11-07 5:40 ` Christoph Hellwig 2024-11-07 5:44 ` Damien Le Moal 1 sibling, 1 reply; 9+ messages in thread From: Christoph Hellwig @ 2024-11-07 5:40 UTC (permalink / raw) To: Damien Le Moal; +Cc: Jens Axboe, linux-block, Christoph Hellwig As sparse rcu warnings got in my radar for something else, I did run sparse over this and it complains. It will need the little gem below to fix (and a rebase of the second patch). Otherwise the series looks great and way better than my initial hack, thanks! diff --git a/block/blk-zoned.c b/block/blk-zoned.c index 7a7855555d6d..bf4458b11720 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -350,11 +350,12 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) { + unsigned long *bitmap; bool is_conv; rcu_read_lock(); - is_conv = disk->conv_zones_bitmap && - test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); + bitmap = rcu_dereference(disk->conv_zones_bitmap); + is_conv = bitmap && test_bit(disk_zone_no(disk, sector), bitmap); rcu_read_unlock(); return is_conv; @@ -1467,11 +1468,10 @@ static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk, unsigned long flags; spin_lock_irqsave(&disk->zone_wplugs_lock, flags); + if (bitmap) + nr_conv_zones = bitmap_weight(bitmap, disk->nr_zones); bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap, lockdep_is_held(&disk->zone_wplugs_lock)); - if (disk->conv_zones_bitmap) - nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap, - disk->nr_zones); spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); kfree_rcu_mightsleep(bitmap); ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap 2024-11-07 5:40 ` Christoph Hellwig @ 2024-11-07 5:44 ` Damien Le Moal 0 siblings, 0 replies; 9+ messages in thread From: Damien Le Moal @ 2024-11-07 5:44 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Jens Axboe, linux-block On 11/7/24 14:40, Christoph Hellwig wrote: > As sparse rcu warnings got in my radar for something else, I did > run sparse over this and it complains. It will need the little > gem below to fix (and a rebase of the second patch). Otherwise the > series looks great and way better than my initial hack, thanks! OK. I will integrate this and send v2. > > diff --git a/block/blk-zoned.c b/block/blk-zoned.c > index 7a7855555d6d..bf4458b11720 100644 > --- a/block/blk-zoned.c > +++ b/block/blk-zoned.c > @@ -350,11 +350,12 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, > > static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) > { > + unsigned long *bitmap; > bool is_conv; > > rcu_read_lock(); > - is_conv = disk->conv_zones_bitmap && > - test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); > + bitmap = rcu_dereference(disk->conv_zones_bitmap); > + is_conv = bitmap && test_bit(disk_zone_no(disk, sector), bitmap); > rcu_read_unlock(); > > return is_conv; > @@ -1467,11 +1468,10 @@ static unsigned int disk_set_conv_zones_bitmap(struct gendisk *disk, > unsigned long flags; > > spin_lock_irqsave(&disk->zone_wplugs_lock, flags); > + if (bitmap) > + nr_conv_zones = bitmap_weight(bitmap, disk->nr_zones); > bitmap = rcu_replace_pointer(disk->conv_zones_bitmap, bitmap, > lockdep_is_held(&disk->zone_wplugs_lock)); > - if (disk->conv_zones_bitmap) > - nr_conv_zones = bitmap_weight(disk->conv_zones_bitmap, > - disk->nr_zones); > spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); > > kfree_rcu_mightsleep(bitmap); -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 2/2] block: Add a public bdev_zone_is_seq() helper 2024-11-06 23:13 [PATCH 0/2] Introduce bdev_zone_is_seq() Damien Le Moal 2024-11-06 23:13 ` [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap Damien Le Moal @ 2024-11-06 23:13 ` Damien Le Moal 1 sibling, 0 replies; 9+ messages in thread From: Damien Le Moal @ 2024-11-06 23:13 UTC (permalink / raw) To: Jens Axboe, linux-block; +Cc: Christoph Hellwig Turn the private disk_zone_is_conv() function in blk-zoned.c into a public and documented bdev_zone_is_seq() helper with the inverse polarity of the original function, also adding a check for non-zoned devices so that all file systems can use the helper, even with a regular block device. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> --- block/blk-zoned.c | 16 ++-------------- include/linux/blkdev.h | 26 ++++++++++++++++++++++++++ 2 files changed, 28 insertions(+), 14 deletions(-) diff --git a/block/blk-zoned.c b/block/blk-zoned.c index 7a7855555d6d..a8c1e4106d6d 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -348,18 +348,6 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode, return ret; } -static inline bool disk_zone_is_conv(struct gendisk *disk, sector_t sector) -{ - bool is_conv; - - rcu_read_lock(); - is_conv = disk->conv_zones_bitmap && - test_bit(disk_zone_no(disk, sector), disk->conv_zones_bitmap); - rcu_read_unlock(); - - return is_conv; -} - static bool disk_zone_is_last(struct gendisk *disk, struct blk_zone *zone) { return zone->start + zone->len >= get_capacity(disk); @@ -714,7 +702,7 @@ static bool blk_zone_wplug_handle_reset_or_finish(struct bio *bio, struct blk_zone_wplug *zwplug; /* Conventional zones cannot be reset nor finished. */ - if (disk_zone_is_conv(disk, sector)) { + if (!bdev_zone_is_seq(bio->bi_bdev, sector)) { bio_io_error(bio); return true; } @@ -968,7 +956,7 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) } /* Conventional zones do not need write plugging. */ - if (disk_zone_is_conv(disk, sector)) { + if (!bdev_zone_is_seq(bio->bi_bdev, sector)) { /* Zone append to conventional zones is not allowed. */ if (bio_op(bio) == REQ_OP_ZONE_APPEND) { bio_io_error(bio); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 5687eb2a019c..24fef307d594 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1380,6 +1380,32 @@ static inline bool bdev_is_zone_start(struct block_device *bdev, return bdev_offset_from_zone_start(bdev, sector) == 0; } +/** + * bdev_zone_is_seq - check if a sector belongs to a sequential write zone + * @bdev: block device to check + * @sector: sector number + * + * Check if @sector on @bdev is contained in a sequential write required zone. + */ +static inline bool bdev_zone_is_seq(struct block_device *bdev, sector_t sector) +{ + bool is_seq = false; + +#if IS_ENABLED(CONFIG_BLK_DEV_ZONED) + if (bdev_is_zoned(bdev)) { + struct gendisk *disk = bdev->bd_disk; + + rcu_read_lock(); + is_seq = !disk->conv_zones_bitmap || + !test_bit(disk_zone_no(disk, sector), + disk->conv_zones_bitmap); + rcu_read_unlock(); + } +#endif + + return is_seq; +} + static inline int queue_dma_alignment(const struct request_queue *q) { return q->limits.dma_alignment; -- 2.47.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-11-07 5:44 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-11-06 23:13 [PATCH 0/2] Introduce bdev_zone_is_seq() Damien Le Moal 2024-11-06 23:13 ` [PATCH 1/2] block: RCU protect disk->conv_zones_bitmap Damien Le Moal 2024-11-06 23:20 ` Bart Van Assche 2024-11-06 23:44 ` Damien Le Moal 2024-11-07 0:02 ` Jens Axboe 2024-11-07 0:14 ` Damien Le Moal 2024-11-07 5:40 ` Christoph Hellwig 2024-11-07 5:44 ` Damien Le Moal 2024-11-06 23:13 ` [PATCH 2/2] block: Add a public bdev_zone_is_seq() helper Damien Le Moal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).