* [PATCH 0/3] btrfs: zoned: limitt active zones to max_open_zones
@ 2025-07-07 2:44 Naohiro Aota
2025-07-07 2:44 ` [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target Naohiro Aota
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Naohiro Aota @ 2025-07-07 2:44 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota
When there is no active zone limit, we can technically write into any
number of zones at the same time. However, exceeding the max open zones
can degrade performance. To prevent this, this series set the
max_active_zones to bdev_max_open_zones() if there is no active zone
limit.
Using max_open_zones as an active zone limit enables the active zone
tracking and enforcement on SMR too. The revealed some bugs on the
active zone tracking. This series also includes the fixes for them.
Naohiro Aota (3):
btrfs: zoned: do not select metadata BG as finish target
btrfs: zoned: fix write time activation failure for metadata block
group
btrfs: zoned: limit active zones to max_open_zones
fs/btrfs/zoned.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
--
2.50.0
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target 2025-07-07 2:44 [PATCH 0/3] btrfs: zoned: limitt active zones to max_open_zones Naohiro Aota @ 2025-07-07 2:44 ` Naohiro Aota 2025-07-07 3:29 ` Damien Le Moal 2025-07-07 7:34 ` Johannes Thumshirn 2025-07-07 2:44 ` [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group Naohiro Aota 2025-07-07 2:44 ` [PATCH 3/3] btrfs: zoned: limit active zones to max_open_zones Naohiro Aota 2 siblings, 2 replies; 11+ messages in thread From: Naohiro Aota @ 2025-07-07 2:44 UTC (permalink / raw) To: linux-btrfs; +Cc: Naohiro Aota We call btrfs_zone_finish_one_bg() to zone finish one block group and make a room to activate another block group. Currently, we can choose a metadata block group as a target. But, as we reserve an active metadata block group, we no longer want to select a metadata block group. So, skip it in the loop. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> --- fs/btrfs/zoned.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index bd987c90a05c..6fb4d95412d6 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -2651,7 +2651,7 @@ int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info) spin_lock(&block_group->lock); if (block_group->reserved || block_group->alloc_offset == 0 || - (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM) || + !(block_group->flags & BTRFS_BLOCK_GROUP_DATA) || test_bit(BLOCK_GROUP_FLAG_ZONED_DATA_RELOC, &block_group->runtime_flags)) { spin_unlock(&block_group->lock); continue; -- 2.50.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target 2025-07-07 2:44 ` [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target Naohiro Aota @ 2025-07-07 3:29 ` Damien Le Moal 2025-07-07 7:34 ` Johannes Thumshirn 1 sibling, 0 replies; 11+ messages in thread From: Damien Le Moal @ 2025-07-07 3:29 UTC (permalink / raw) To: Naohiro Aota, linux-btrfs On 7/7/25 11:44 AM, Naohiro Aota wrote: > We call btrfs_zone_finish_one_bg() to zone finish one block group and make > a room to activate another block group. Currently, we can choose a metadata Nit: make a room -> make room > block group as a target. But, as we reserve an active metadata block group, > we no longer want to select a metadata block group. So, skip it in the > loop. > > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target 2025-07-07 2:44 ` [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target Naohiro Aota 2025-07-07 3:29 ` Damien Le Moal @ 2025-07-07 7:34 ` Johannes Thumshirn 1 sibling, 0 replies; 11+ messages in thread From: Johannes Thumshirn @ 2025-07-07 7:34 UTC (permalink / raw) To: Naohiro Aota, linux-btrfs@vger.kernel.org Looks good, Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group 2025-07-07 2:44 [PATCH 0/3] btrfs: zoned: limitt active zones to max_open_zones Naohiro Aota 2025-07-07 2:44 ` [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target Naohiro Aota @ 2025-07-07 2:44 ` Naohiro Aota 2025-07-07 3:33 ` Damien Le Moal 2025-07-07 2:44 ` [PATCH 3/3] btrfs: zoned: limit active zones to max_open_zones Naohiro Aota 2 siblings, 1 reply; 11+ messages in thread From: Naohiro Aota @ 2025-07-07 2:44 UTC (permalink / raw) To: linux-btrfs; +Cc: Naohiro Aota Since commit 13bb483d32ab ("btrfs: zoned: activate metadata block group on write time"), we activate a metadata block group on the write time. If the zone capacity is small enough, we can allocate the entire region before the first write. Then, we hit the btrfs_zoned_bg_is_full() in btrfs_zone_activate() and the activation fails. For a data block group, we already check the fullness condition in the caller side. And, the fullness check is not necessary for metadata's write-time activation. Replace it with a proper WARN. Fixes: 13bb483d32ab ("btrfs: zoned: activate metadata block group on write time") CC: stable@vger.kernel.org # 6.6+ Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> --- fs/btrfs/zoned.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index 6fb4d95412d6..9c354e84ab07 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -2169,10 +2169,13 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group) goto out_unlock; } - /* No space left */ - if (btrfs_zoned_bg_is_full(block_group)) { - ret = false; - goto out_unlock; + if (block_group->flags & BTRFS_BLOCK_GROUP_DATA) { + if (WARN_ON_ONCE(btrfs_zoned_bg_is_full(block_group))) { + ret = false; + goto out_unlock; + } + } else { + WARN_ON_ONCE(block_group->meta_write_pointer != block_group->start); } for (i = 0; i < map->num_stripes; i++) { -- 2.50.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group 2025-07-07 2:44 ` [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group Naohiro Aota @ 2025-07-07 3:33 ` Damien Le Moal 2025-07-08 5:04 ` Naohiro Aota 0 siblings, 1 reply; 11+ messages in thread From: Damien Le Moal @ 2025-07-07 3:33 UTC (permalink / raw) To: Naohiro Aota, linux-btrfs On 7/7/25 11:44 AM, Naohiro Aota wrote: > Since commit 13bb483d32ab ("btrfs: zoned: activate metadata block group on > write time"), we activate a metadata block group on the write time. If the on the write time -> at write time > zone capacity is small enough, we can allocate the entire region before the > first write. Then, we hit the btrfs_zoned_bg_is_full() in > btrfs_zone_activate() and the activation fails. > > For a data block group, we already check the fullness condition in the > caller side. And, the fullness check is not necessary for metadata's > write-time activation. Replace it with a proper WARN. I am very confused by this explanation. If the BG is fully allocated before we issue the first write, we still need to activate that BG, no ? So why the WARN() ? That seems wrong to me. But I may not be understanding your explanation, which means you need to clarify it :) > > Fixes: 13bb483d32ab ("btrfs: zoned: activate metadata block group on write time") > CC: stable@vger.kernel.org # 6.6+ > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> > --- > fs/btrfs/zoned.c | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c > index 6fb4d95412d6..9c354e84ab07 100644 > --- a/fs/btrfs/zoned.c > +++ b/fs/btrfs/zoned.c > @@ -2169,10 +2169,13 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group) > goto out_unlock; > } > > - /* No space left */ > - if (btrfs_zoned_bg_is_full(block_group)) { > - ret = false; > - goto out_unlock; > + if (block_group->flags & BTRFS_BLOCK_GROUP_DATA) { > + if (WARN_ON_ONCE(btrfs_zoned_bg_is_full(block_group))) { > + ret = false; > + goto out_unlock; > + } > + } else { > + WARN_ON_ONCE(block_group->meta_write_pointer != block_group->start); > } > > for (i = 0; i < map->num_stripes; i++) { -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group 2025-07-07 3:33 ` Damien Le Moal @ 2025-07-08 5:04 ` Naohiro Aota 2025-07-08 5:11 ` Damien Le Moal 0 siblings, 1 reply; 11+ messages in thread From: Naohiro Aota @ 2025-07-08 5:04 UTC (permalink / raw) To: Damien Le Moal, Naohiro Aota, linux-btrfs@vger.kernel.org On Mon Jul 7, 2025 at 12:33 PM JST, Damien Le Moal wrote: > On 7/7/25 11:44 AM, Naohiro Aota wrote: >> Since commit 13bb483d32ab ("btrfs: zoned: activate metadata block group on >> write time"), we activate a metadata block group on the write time. If the > > on the write time -> at write time > >> zone capacity is small enough, we can allocate the entire region before the >> first write. Then, we hit the btrfs_zoned_bg_is_full() in >> btrfs_zone_activate() and the activation fails. >> >> For a data block group, we already check the fullness condition in the >> caller side. And, the fullness check is not necessary for metadata's >> write-time activation. Replace it with a proper WARN. > > I am very confused by this explanation. If the BG is fully allocated before we > issue the first write, we still need to activate that BG, no ? So why the > WARN() ? That seems wrong to me. But I may not be understanding your > explanation, which means you need to clarify it :) We activate a data block group before the allocation to simplify the write stage. So, activating a full block group should arise WARN. OTOH, metadata block group is activated on the write stage. So, it is OK to have a full block group to be activated. Instead, it is WARN when we try to activate a partially written (meta_write_pointer != bg->start) block group. > >> >> Fixes: 13bb483d32ab ("btrfs: zoned: activate metadata block group on write time") >> CC: stable@vger.kernel.org # 6.6+ >> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> >> --- >> fs/btrfs/zoned.c | 11 +++++++---- >> 1 file changed, 7 insertions(+), 4 deletions(-) >> >> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c >> index 6fb4d95412d6..9c354e84ab07 100644 >> --- a/fs/btrfs/zoned.c >> +++ b/fs/btrfs/zoned.c >> @@ -2169,10 +2169,13 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group) >> goto out_unlock; >> } >> >> - /* No space left */ >> - if (btrfs_zoned_bg_is_full(block_group)) { >> - ret = false; >> - goto out_unlock; >> + if (block_group->flags & BTRFS_BLOCK_GROUP_DATA) { >> + if (WARN_ON_ONCE(btrfs_zoned_bg_is_full(block_group))) { >> + ret = false; >> + goto out_unlock; >> + } >> + } else { >> + WARN_ON_ONCE(block_group->meta_write_pointer != block_group->start); >> } >> >> for (i = 0; i < map->num_stripes; i++) { ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group 2025-07-08 5:04 ` Naohiro Aota @ 2025-07-08 5:11 ` Damien Le Moal 0 siblings, 0 replies; 11+ messages in thread From: Damien Le Moal @ 2025-07-08 5:11 UTC (permalink / raw) To: Naohiro Aota, linux-btrfs@vger.kernel.org On 7/8/25 2:04 PM, Naohiro Aota wrote: > On Mon Jul 7, 2025 at 12:33 PM JST, Damien Le Moal wrote: >> On 7/7/25 11:44 AM, Naohiro Aota wrote: >>> Since commit 13bb483d32ab ("btrfs: zoned: activate metadata block group on >>> write time"), we activate a metadata block group on the write time. If the >> >> on the write time -> at write time >> >>> zone capacity is small enough, we can allocate the entire region before the >>> first write. Then, we hit the btrfs_zoned_bg_is_full() in >>> btrfs_zone_activate() and the activation fails. >>> >>> For a data block group, we already check the fullness condition in the >>> caller side. And, the fullness check is not necessary for metadata's >>> write-time activation. Replace it with a proper WARN. >> >> I am very confused by this explanation. If the BG is fully allocated before we >> issue the first write, we still need to activate that BG, no ? So why the >> WARN() ? That seems wrong to me. But I may not be understanding your >> explanation, which means you need to clarify it :) > > We activate a data block group before the allocation to simplify the > write stage. So, activating a full block group should arise WARN. > > OTOH, metadata block group is activated on the write stage. So, it is OK > to have a full block group to be activated. Instead, it is WARN when we > try to activate a partially written (meta_write_pointer != bg->start) > block group. OK. Please update the commit message with this to clarify. Also, a comment in the coe would be nice to remind this to whoever read that code :) -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/3] btrfs: zoned: limit active zones to max_open_zones 2025-07-07 2:44 [PATCH 0/3] btrfs: zoned: limitt active zones to max_open_zones Naohiro Aota 2025-07-07 2:44 ` [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target Naohiro Aota 2025-07-07 2:44 ` [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group Naohiro Aota @ 2025-07-07 2:44 ` Naohiro Aota 2025-07-07 3:40 ` Damien Le Moal 2 siblings, 1 reply; 11+ messages in thread From: Naohiro Aota @ 2025-07-07 2:44 UTC (permalink / raw) To: linux-btrfs; +Cc: Naohiro Aota When there is no active zone limit, we can technically write into any number of zones at the same time. However, exceeding the max open zones can degrade performance. To prevent this, set the max_active_zones to bdev_max_open_zones() if there is no active zone limit. Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> --- fs/btrfs/zoned.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index 9c354e84ab07..bdcfabcb35e7 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -418,6 +418,8 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache) zone_info->nr_zones++; max_active_zones = bdev_max_active_zones(bdev); + if (!max_active_zones) + max_active_zones = bdev_max_open_zones(bdev); if (max_active_zones && max_active_zones < BTRFS_MIN_ACTIVE_ZONES) { btrfs_err(fs_info, "zoned: %s: max active zones %u is too small, need at least %u active zones", -- 2.50.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] btrfs: zoned: limit active zones to max_open_zones 2025-07-07 2:44 ` [PATCH 3/3] btrfs: zoned: limit active zones to max_open_zones Naohiro Aota @ 2025-07-07 3:40 ` Damien Le Moal 2025-07-08 5:16 ` Naohiro Aota 0 siblings, 1 reply; 11+ messages in thread From: Damien Le Moal @ 2025-07-07 3:40 UTC (permalink / raw) To: Naohiro Aota, linux-btrfs On 7/7/25 11:44 AM, Naohiro Aota wrote: > When there is no active zone limit, we can technically write into any > number of zones at the same time. However, exceeding the max open zones can > degrade performance. To prevent this, set the max_active_zones to > bdev_max_open_zones() if there is no active zone limit. > > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> > --- > fs/btrfs/zoned.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c > index 9c354e84ab07..bdcfabcb35e7 100644 > --- a/fs/btrfs/zoned.c > +++ b/fs/btrfs/zoned.c > @@ -418,6 +418,8 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache) > zone_info->nr_zones++; > > max_active_zones = bdev_max_active_zones(bdev); > + if (!max_active_zones) > + max_active_zones = bdev_max_open_zones(bdev); max_active_zones = min_not_zero(bdev_max_active_zones(bdev), bdev_max_open_zones(bdev)); And what if the device has no limits at all ? (e.g. no max active limit and no max open limit). In this case, max_active_zones will be zero. Should we perhaps set in that case a default max_active zones ? For information, the block layer zone write plugging defaults to create a 128 zones pull of write plugs if the device has no limits (BLK_ZONE_WPLUG_DEFAULT_POOL_SIZE in block/blk-zoned.c). I would recommend using that limit here too. So something like this: /* Default number of max active zones when the device has no limits. */ #define BTRFS_ZONE_DEFAULT_MAX_ACTIVE 128 max_active_zones = min_not_zero(bdev_max_active_zones(bdev), bdev_max_open_zones(bdev)); if (!max_active_zones && bdev_nr_zones(bdev) > BTRFS_ZONE_DEFAULT_MAX_ACTIVE) max_active_zones = BTRFS_ZONE_DEFAULT_MAX_ACTIVE; > if (max_active_zones && max_active_zones < BTRFS_MIN_ACTIVE_ZONES) { > btrfs_err(fs_info, > "zoned: %s: max active zones %u is too small, need at least %u active zones", -- Damien Le Moal Western Digital Research ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] btrfs: zoned: limit active zones to max_open_zones 2025-07-07 3:40 ` Damien Le Moal @ 2025-07-08 5:16 ` Naohiro Aota 0 siblings, 0 replies; 11+ messages in thread From: Naohiro Aota @ 2025-07-08 5:16 UTC (permalink / raw) To: Damien Le Moal, Naohiro Aota, linux-btrfs@vger.kernel.org On Mon Jul 7, 2025 at 12:40 PM JST, Damien Le Moal wrote: > On 7/7/25 11:44 AM, Naohiro Aota wrote: >> When there is no active zone limit, we can technically write into any >> number of zones at the same time. However, exceeding the max open zones can >> degrade performance. To prevent this, set the max_active_zones to >> bdev_max_open_zones() if there is no active zone limit. >> >> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> >> --- >> fs/btrfs/zoned.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c >> index 9c354e84ab07..bdcfabcb35e7 100644 >> --- a/fs/btrfs/zoned.c >> +++ b/fs/btrfs/zoned.c >> @@ -418,6 +418,8 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache) >> zone_info->nr_zones++; >> >> max_active_zones = bdev_max_active_zones(bdev); >> + if (!max_active_zones) >> + max_active_zones = bdev_max_open_zones(bdev); > > max_active_zones = min_not_zero(bdev_max_active_zones(bdev), > bdev_max_open_zones(bdev)); > > And what if the device has no limits at all ? (e.g. no max active limit and no > max open limit). In this case, max_active_zones will be zero. Should we perhaps > set in that case a default max_active zones ? > > For information, the block layer zone write plugging defaults to create a 128 > zones pull of write plugs if the device has no limits > (BLK_ZONE_WPLUG_DEFAULT_POOL_SIZE in block/blk-zoned.c). I would recommend > using that limit here too. So something like this: > > /* Default number of max active zones when the device has no limits. */ > #define BTRFS_ZONE_DEFAULT_MAX_ACTIVE 128 > > max_active_zones = min_not_zero(bdev_max_active_zones(bdev), > bdev_max_open_zones(bdev)); > if (!max_active_zones && > bdev_nr_zones(bdev) > BTRFS_ZONE_DEFAULT_MAX_ACTIVE) > max_active_zones = BTRFS_ZONE_DEFAULT_MAX_ACTIVE; Indeed. I'll do so in the next version. > > >> if (max_active_zones && max_active_zones < BTRFS_MIN_ACTIVE_ZONES) { >> btrfs_err(fs_info, >> "zoned: %s: max active zones %u is too small, need at least %u active zones", ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-07-08 5:17 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-07-07 2:44 [PATCH 0/3] btrfs: zoned: limitt active zones to max_open_zones Naohiro Aota 2025-07-07 2:44 ` [PATCH 1/3] btrfs: zoned: do not select metadata BG as finish target Naohiro Aota 2025-07-07 3:29 ` Damien Le Moal 2025-07-07 7:34 ` Johannes Thumshirn 2025-07-07 2:44 ` [PATCH 2/3] btrfs: zoned: fix write time activation failure for metadata block group Naohiro Aota 2025-07-07 3:33 ` Damien Le Moal 2025-07-08 5:04 ` Naohiro Aota 2025-07-08 5:11 ` Damien Le Moal 2025-07-07 2:44 ` [PATCH 3/3] btrfs: zoned: limit active zones to max_open_zones Naohiro Aota 2025-07-07 3:40 ` Damien Le Moal 2025-07-08 5:16 ` Naohiro Aota
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox