* [PATCH v4 01/13] btrfs: take btrfs_space_info in btrfs_reserve_data_bytes
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 02/13] btrfs: take struct btrfs_inode in btrfs_free_reserved_data_space_noquota Naohiro Aota
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Take struct btrfs_space_info in btrfs_reserve_data_bytes() to allow
reserving the data from multiple data space_info candidates.
This is a preparation for the following commits and there is no functional
change.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/delalloc-space.c | 4 ++--
fs/btrfs/space-info.c | 10 +++++-----
fs/btrfs/space-info.h | 2 +-
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index 1479be2427cb..c7181779b013 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -123,7 +123,7 @@ int btrfs_alloc_data_chunk_ondemand(const struct btrfs_inode *inode, u64 bytes)
if (btrfs_is_free_space_inode(inode))
flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
- return btrfs_reserve_data_bytes(fs_info, bytes, flush);
+ return btrfs_reserve_data_bytes(fs_info->data_sinfo, bytes, flush);
}
int btrfs_check_data_free_space(struct btrfs_inode *inode,
@@ -144,7 +144,7 @@ int btrfs_check_data_free_space(struct btrfs_inode *inode,
else if (btrfs_is_free_space_inode(inode))
flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
- ret = btrfs_reserve_data_bytes(fs_info, len, flush);
+ ret = btrfs_reserve_data_bytes(fs_info->data_sinfo, len, flush);
if (ret < 0)
return ret;
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 77cc5d4a5a47..3bb7246f40fa 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1836,10 +1836,10 @@ int btrfs_reserve_metadata_bytes(struct btrfs_fs_info *fs_info,
* This will reserve bytes from the data space info. If there is not enough
* space then we will attempt to flush space as specified by flush.
*/
-int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
+int btrfs_reserve_data_bytes(struct btrfs_space_info *space_info, u64 bytes,
enum btrfs_reserve_flush_enum flush)
{
- struct btrfs_space_info *data_sinfo = fs_info->data_sinfo;
+ struct btrfs_fs_info *fs_info = space_info->fs_info;
int ret;
ASSERT(flush == BTRFS_RESERVE_FLUSH_DATA ||
@@ -1847,12 +1847,12 @@ int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
flush == BTRFS_RESERVE_NO_FLUSH);
ASSERT(!current->journal_info || flush != BTRFS_RESERVE_FLUSH_DATA);
- ret = __reserve_bytes(fs_info, data_sinfo, bytes, flush);
+ ret = __reserve_bytes(fs_info, space_info, bytes, flush);
if (ret == -ENOSPC) {
trace_btrfs_space_reservation(fs_info, "space_info:enospc",
- data_sinfo->flags, bytes, 1);
+ space_info->flags, bytes, 1);
if (btrfs_test_opt(fs_info, ENOSPC_DEBUG))
- btrfs_dump_space_info(fs_info, data_sinfo, bytes, 0);
+ btrfs_dump_space_info(fs_info, space_info, bytes, 0);
}
return ret;
}
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index a96efdb5e681..7459b4eb99cd 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -288,7 +288,7 @@ static inline void btrfs_space_info_free_bytes_may_use(
btrfs_try_granting_tickets(space_info->fs_info, space_info);
spin_unlock(&space_info->lock);
}
-int btrfs_reserve_data_bytes(struct btrfs_fs_info *fs_info, u64 bytes,
+int btrfs_reserve_data_bytes(struct btrfs_space_info *space_info, u64 bytes,
enum btrfs_reserve_flush_enum flush);
void btrfs_dump_space_info_for_trans_abort(struct btrfs_fs_info *fs_info);
void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 02/13] btrfs: take struct btrfs_inode in btrfs_free_reserved_data_space_noquota
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 01/13] btrfs: take btrfs_space_info in btrfs_reserve_data_bytes Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 03/13] btrfs: factor out init_space_info() Naohiro Aota
` (11 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
As well as the last patch, take struct btrfs_inode in the function and let
it distinguish which data space it is working on in a later patch. There is
no functional change with this commit.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/delalloc-space.c | 7 ++++---
fs/btrfs/delalloc-space.h | 3 +--
fs/btrfs/inode.c | 4 ++--
fs/btrfs/relocation.c | 3 +--
4 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index c7181779b013..a18895255af9 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -151,7 +151,7 @@ int btrfs_check_data_free_space(struct btrfs_inode *inode,
/* Use new btrfs_qgroup_reserve_data to reserve precious data space. */
ret = btrfs_qgroup_reserve_data(inode, reserved, start, len);
if (ret < 0) {
- btrfs_free_reserved_data_space_noquota(fs_info, len);
+ btrfs_free_reserved_data_space_noquota(inode, len);
extent_changeset_free(*reserved);
*reserved = NULL;
} else {
@@ -168,9 +168,10 @@ int btrfs_check_data_free_space(struct btrfs_inode *inode,
* which we can't sleep and is sure it won't affect qgroup reserved space.
* Like clear_bit_hook().
*/
-void btrfs_free_reserved_data_space_noquota(struct btrfs_fs_info *fs_info,
+void btrfs_free_reserved_data_space_noquota(struct btrfs_inode *inode,
u64 len)
{
+ struct btrfs_fs_info *fs_info = inode->root->fs_info;
struct btrfs_space_info *data_sinfo;
ASSERT(IS_ALIGNED(len, fs_info->sectorsize));
@@ -196,7 +197,7 @@ void btrfs_free_reserved_data_space(struct btrfs_inode *inode,
round_down(start, fs_info->sectorsize);
start = round_down(start, fs_info->sectorsize);
- btrfs_free_reserved_data_space_noquota(fs_info, len);
+ btrfs_free_reserved_data_space_noquota(inode, len);
btrfs_qgroup_free_data(inode, reserved, start, len, NULL);
}
diff --git a/fs/btrfs/delalloc-space.h b/fs/btrfs/delalloc-space.h
index 069005959479..6119c0d3f883 100644
--- a/fs/btrfs/delalloc-space.h
+++ b/fs/btrfs/delalloc-space.h
@@ -18,8 +18,7 @@ void btrfs_free_reserved_data_space(struct btrfs_inode *inode,
void btrfs_delalloc_release_space(struct btrfs_inode *inode,
struct extent_changeset *reserved,
u64 start, u64 len, bool qgroup_free);
-void btrfs_free_reserved_data_space_noquota(struct btrfs_fs_info *fs_info,
- u64 len);
+void btrfs_free_reserved_data_space_noquota(struct btrfs_inode *inode, u64 len);
void btrfs_delalloc_release_metadata(struct btrfs_inode *inode, u64 num_bytes,
bool qgroup_free);
int btrfs_delalloc_reserve_space(struct btrfs_inode *inode,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a88aa23cdbb6..e68c0b4a3410 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2589,7 +2589,7 @@ void btrfs_clear_delalloc_extent(struct btrfs_inode *inode,
!btrfs_is_free_space_inode(inode) &&
!(state->state & EXTENT_NORESERVE) &&
(bits & EXTENT_CLEAR_DATA_RESV))
- btrfs_free_reserved_data_space_noquota(fs_info, len);
+ btrfs_free_reserved_data_space_noquota(inode, len);
percpu_counter_add_batch(&fs_info->delalloc_bytes, -len,
fs_info->delalloc_batch);
@@ -9718,7 +9718,7 @@ ssize_t btrfs_do_encoded_write(struct kiocb *iocb, struct iov_iter *from,
* bytes_may_use.
*/
if (!extent_reserved)
- btrfs_free_reserved_data_space_noquota(fs_info, disk_num_bytes);
+ btrfs_free_reserved_data_space_noquota(inode, disk_num_bytes);
out_unlock:
btrfs_unlock_extent(io_tree, start, end, &cached_state);
out_folios:
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 97c223aa90b6..e0fbfd56fbba 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2749,8 +2749,7 @@ static noinline_for_stack int prealloc_file_extent_cluster(struct reloc_control
btrfs_inode_unlock(inode, 0);
if (cur_offset < prealloc_end)
- btrfs_free_reserved_data_space_noquota(inode->root->fs_info,
- prealloc_end + 1 - cur_offset);
+ btrfs_free_reserved_data_space_noquota(inode, prealloc_end + 1 - cur_offset);
return ret;
}
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 03/13] btrfs: factor out init_space_info()
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 01/13] btrfs: take btrfs_space_info in btrfs_reserve_data_bytes Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 02/13] btrfs: take struct btrfs_inode in btrfs_free_reserved_data_space_noquota Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 04/13] btrfs: spin out do_async_reclaim_{data,metadata}_space() Naohiro Aota
` (10 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Factor out initialization of the space_info struct, which is used in a
later patch. There is no functional change.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/space-info.c | 27 ++++++++++++++++-----------
1 file changed, 16 insertions(+), 11 deletions(-)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 3bb7246f40fa..7334ffa67a86 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -234,19 +234,11 @@ void btrfs_update_space_info_chunk_size(struct btrfs_space_info *space_info,
WRITE_ONCE(space_info->chunk_size, chunk_size);
}
-static int create_space_info(struct btrfs_fs_info *info, u64 flags)
+static void init_space_info(struct btrfs_fs_info *info,
+ struct btrfs_space_info *space_info, u64 flags)
{
-
- struct btrfs_space_info *space_info;
- int i;
- int ret;
-
- space_info = kzalloc(sizeof(*space_info), GFP_NOFS);
- if (!space_info)
- return -ENOMEM;
-
space_info->fs_info = info;
- for (i = 0; i < BTRFS_NR_RAID_TYPES; i++)
+ for (int i = 0; i < BTRFS_NR_RAID_TYPES; i++)
INIT_LIST_HEAD(&space_info->block_groups[i]);
init_rwsem(&space_info->groups_sem);
spin_lock_init(&space_info->lock);
@@ -260,6 +252,19 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
if (btrfs_is_zoned(info))
space_info->bg_reclaim_threshold = BTRFS_DEFAULT_ZONED_RECLAIM_THRESH;
+}
+
+static int create_space_info(struct btrfs_fs_info *info, u64 flags)
+{
+
+ struct btrfs_space_info *space_info;
+ int ret;
+
+ space_info = kzalloc(sizeof(*space_info), GFP_NOFS);
+ if (!space_info)
+ return -ENOMEM;
+
+ init_space_info(info, space_info, flags);
ret = btrfs_sysfs_add_space_info_type(info, space_info);
if (ret)
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 04/13] btrfs: spin out do_async_reclaim_{data,metadata}_space()
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (2 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 03/13] btrfs: factor out init_space_info() Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 05/13] btrfs: factor out check_removing_space_info() Naohiro Aota
` (9 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Factor out the main part of btrfs_async_reclaim_data_space() to
do_async_reclaim_data_space(), so it can take data space_info parameter it
is working on. Do the same for metadata. There is no functional change.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/space-info.c | 45 ++++++++++++++++++++++++++++---------------
1 file changed, 29 insertions(+), 16 deletions(-)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 7334ffa67a86..d6d33ab754ba 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1088,23 +1088,15 @@ static bool maybe_fail_all_tickets(struct btrfs_fs_info *fs_info,
return (tickets_id != space_info->tickets_id);
}
-/*
- * This is for normal flushers, we can wait all goddamned day if we want to. We
- * will loop and continuously try to flush as long as we are making progress.
- * We count progress as clearing off tickets each time we have to loop.
- */
-static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
+static void do_async_reclaim_metadata_space(struct btrfs_space_info *space_info)
{
- struct btrfs_fs_info *fs_info;
- struct btrfs_space_info *space_info;
+ struct btrfs_fs_info *fs_info = space_info->fs_info;
u64 to_reclaim;
enum btrfs_flush_state flush_state;
int commit_cycles = 0;
u64 last_tickets_id;
enum btrfs_flush_state final_state;
- fs_info = container_of(work, struct btrfs_fs_info, async_reclaim_work);
- space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA);
if (btrfs_is_zoned(fs_info))
final_state = RESET_ZONES;
else
@@ -1178,6 +1170,21 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
} while (flush_state <= final_state);
}
+/*
+ * This is for normal flushers, we can wait all goddamned day if we want to. We
+ * will loop and continuously try to flush as long as we are making progress.
+ * We count progress as clearing off tickets each time we have to loop.
+ */
+static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
+{
+ struct btrfs_fs_info *fs_info;
+ struct btrfs_space_info *space_info;
+
+ fs_info = container_of(work, struct btrfs_fs_info, async_reclaim_work);
+ space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA);
+ do_async_reclaim_metadata_space(space_info);
+}
+
/*
* This handles pre-flushing of metadata space before we get to the point that
* we need to start blocking threads on tickets. The logic here is different
@@ -1323,16 +1330,12 @@ static const enum btrfs_flush_state data_flush_states[] = {
ALLOC_CHUNK_FORCE,
};
-static void btrfs_async_reclaim_data_space(struct work_struct *work)
+static void do_async_reclaim_data_space(struct btrfs_space_info *space_info)
{
- struct btrfs_fs_info *fs_info;
- struct btrfs_space_info *space_info;
+ struct btrfs_fs_info *fs_info = space_info->fs_info;
u64 last_tickets_id;
enum btrfs_flush_state flush_state = 0;
- fs_info = container_of(work, struct btrfs_fs_info, async_data_reclaim_work);
- space_info = fs_info->data_sinfo;
-
spin_lock(&space_info->lock);
if (list_empty(&space_info->tickets)) {
space_info->flush = 0;
@@ -1400,6 +1403,16 @@ static void btrfs_async_reclaim_data_space(struct work_struct *work)
spin_unlock(&space_info->lock);
}
+static void btrfs_async_reclaim_data_space(struct work_struct *work)
+{
+ struct btrfs_fs_info *fs_info;
+ struct btrfs_space_info *space_info;
+
+ fs_info = container_of(work, struct btrfs_fs_info, async_data_reclaim_work);
+ space_info = fs_info->data_sinfo;
+ do_async_reclaim_data_space(space_info);
+}
+
void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info)
{
INIT_WORK(&fs_info->async_reclaim_work, btrfs_async_reclaim_metadata_space);
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 05/13] btrfs: factor out check_removing_space_info()
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (3 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 04/13] btrfs: spin out do_async_reclaim_{data,metadata}_space() Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 06/13] btrfs: introduce space_info argument to btrfs_chunk_alloc Naohiro Aota
` (8 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Factor out check_removing_space_info() from btrfs_free_block_groups(). It
sanity checks a to-be-removed space_info. There is no functional change.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/block-group.c | 51 ++++++++++++++++++++++++------------------
1 file changed, 29 insertions(+), 22 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 91807d294366..b700d80089d3 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -4400,6 +4400,34 @@ void btrfs_put_block_group_cache(struct btrfs_fs_info *info)
}
}
+static void check_removing_space_info(struct btrfs_space_info *space_info)
+{
+ struct btrfs_fs_info *info = space_info->fs_info;
+
+ /*
+ * Do not hide this behind enospc_debug, this is actually
+ * important and indicates a real bug if this happens.
+ */
+ if (WARN_ON(space_info->bytes_pinned > 0 ||
+ space_info->bytes_may_use > 0))
+ btrfs_dump_space_info(info, space_info, 0, 0);
+
+ /*
+ * If there was a failure to cleanup a log tree, very likely due
+ * to an IO failure on a writeback attempt of one or more of its
+ * extent buffers, we could not do proper (and cheap) unaccounting
+ * of their reserved space, so don't warn on bytes_reserved > 0 in
+ * that case.
+ */
+ if (!(space_info->flags & BTRFS_BLOCK_GROUP_METADATA) ||
+ !BTRFS_FS_LOG_CLEANUP_ERROR(info)) {
+ if (WARN_ON(space_info->bytes_reserved > 0))
+ btrfs_dump_space_info(info, space_info, 0, 0);
+ }
+
+ WARN_ON(space_info->reclaim_size > 0);
+}
+
/*
* Must be called only after stopping all workers, since we could have block
* group caching kthreads running, and therefore they could race with us if we
@@ -4501,28 +4529,7 @@ int btrfs_free_block_groups(struct btrfs_fs_info *info)
struct btrfs_space_info,
list);
- /*
- * Do not hide this behind enospc_debug, this is actually
- * important and indicates a real bug if this happens.
- */
- if (WARN_ON(space_info->bytes_pinned > 0 ||
- space_info->bytes_may_use > 0))
- btrfs_dump_space_info(info, space_info, 0, 0);
-
- /*
- * If there was a failure to cleanup a log tree, very likely due
- * to an IO failure on a writeback attempt of one or more of its
- * extent buffers, we could not do proper (and cheap) unaccounting
- * of their reserved space, so don't warn on bytes_reserved > 0 in
- * that case.
- */
- if (!(space_info->flags & BTRFS_BLOCK_GROUP_METADATA) ||
- !BTRFS_FS_LOG_CLEANUP_ERROR(info)) {
- if (WARN_ON(space_info->bytes_reserved > 0))
- btrfs_dump_space_info(info, space_info, 0, 0);
- }
-
- WARN_ON(space_info->reclaim_size > 0);
+ check_removing_space_info(space_info);
list_del(&space_info->list);
btrfs_sysfs_remove_space_info(space_info);
}
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 06/13] btrfs: introduce space_info argument to btrfs_chunk_alloc
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (4 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 05/13] btrfs: factor out check_removing_space_info() Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 07/13] btrfs: pass space_info for block group creation Naohiro Aota
` (7 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota
Take a btrfs_space_info argument in btrfs_chunk_alloc(). New block group
will belong to that space_info.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
fs/btrfs/block-group.c | 27 +++++++++++++++++----------
fs/btrfs/block-group.h | 3 ++-
fs/btrfs/extent-tree.c | 6 ++++--
fs/btrfs/space-info.c | 2 +-
fs/btrfs/transaction.c | 5 +++--
5 files changed, 27 insertions(+), 16 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index b700d80089d3..ab5fc5b40f04 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2966,6 +2966,7 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
bool do_chunk_alloc)
{
struct btrfs_fs_info *fs_info = cache->fs_info;
+ struct btrfs_space_info *space_info = cache->space_info;
struct btrfs_trans_handle *trans;
struct btrfs_root *root = btrfs_block_group_root(fs_info);
u64 alloc_flags;
@@ -3018,7 +3019,7 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
*/
alloc_flags = btrfs_get_alloc_profile(fs_info, cache->flags);
if (alloc_flags != cache->flags) {
- ret = btrfs_chunk_alloc(trans, alloc_flags,
+ ret = btrfs_chunk_alloc(trans, space_info, alloc_flags,
CHUNK_ALLOC_FORCE);
/*
* ENOSPC is allowed here, we may have enough space
@@ -3046,15 +3047,15 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
(cache->flags & BTRFS_BLOCK_GROUP_SYSTEM))
goto unlock_out;
- alloc_flags = btrfs_get_alloc_profile(fs_info, cache->space_info->flags);
- ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
+ alloc_flags = btrfs_get_alloc_profile(fs_info, space_info->flags);
+ ret = btrfs_chunk_alloc(trans, space_info, alloc_flags, CHUNK_ALLOC_FORCE);
if (ret < 0)
goto out;
/*
* We have allocated a new chunk. We also need to activate that chunk to
* grant metadata tickets for zoned filesystem.
*/
- ret = btrfs_zoned_activate_one_bg(fs_info, cache->space_info, true);
+ ret = btrfs_zoned_activate_one_bg(fs_info, space_info, true);
if (ret < 0)
goto out;
@@ -3898,8 +3899,15 @@ static int should_alloc_chunk(const struct btrfs_fs_info *fs_info,
int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans, u64 type)
{
u64 alloc_flags = btrfs_get_alloc_profile(trans->fs_info, type);
+ struct btrfs_space_info *space_info;
+
+ space_info = btrfs_find_space_info(trans->fs_info, type);
+ if (!space_info) {
+ ASSERT(0);
+ return -EINVAL;
+ }
- return btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
+ return btrfs_chunk_alloc(trans, space_info, alloc_flags, CHUNK_ALLOC_FORCE);
}
static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags)
@@ -4095,6 +4103,8 @@ static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans
*
* This function, btrfs_chunk_alloc(), belongs to phase 1.
*
+ * @space_info specify which space_info the new chunk should belong to.
+ *
* If @force is CHUNK_ALLOC_FORCE:
* - return 1 if it successfully allocates a chunk,
* - return errors including -ENOSPC otherwise.
@@ -4103,11 +4113,11 @@ static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans
* - return 1 if it successfully allocates a chunk,
* - return errors including -ENOSPC otherwise.
*/
-int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags,
+int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
+ struct btrfs_space_info *space_info, u64 flags,
enum btrfs_chunk_alloc_enum force)
{
struct btrfs_fs_info *fs_info = trans->fs_info;
- struct btrfs_space_info *space_info;
struct btrfs_block_group *ret_bg;
bool wait_for_alloc = false;
bool should_alloc = false;
@@ -4146,9 +4156,6 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags,
if (flags & BTRFS_BLOCK_GROUP_SYSTEM)
return -ENOSPC;
- space_info = btrfs_find_space_info(fs_info, flags);
- ASSERT(space_info);
-
do {
spin_lock(&space_info->lock);
if (force < space_info->force_alloc)
diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index 36937eeab9b8..c01f3af726a1 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -342,7 +342,8 @@ int btrfs_add_reserved_bytes(struct btrfs_block_group *cache,
bool force_wrong_size_class);
void btrfs_free_reserved_bytes(struct btrfs_block_group *cache,
u64 num_bytes, int delalloc);
-int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags,
+int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
+ struct btrfs_space_info *space_info, u64 flags,
enum btrfs_chunk_alloc_enum force);
int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans, u64 type);
void check_system_chunk(struct btrfs_trans_handle *trans, const u64 type);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a7564b39eb5c..1762918dd9a1 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4105,6 +4105,7 @@ static int can_allocate_chunk(struct btrfs_fs_info *fs_info,
static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
struct btrfs_key *ins,
struct find_free_extent_ctl *ffe_ctl,
+ struct btrfs_space_info *space_info,
bool full_search)
{
struct btrfs_root *root = fs_info->chunk_root;
@@ -4159,7 +4160,7 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
return ret;
}
- ret = btrfs_chunk_alloc(trans, ffe_ctl->flags,
+ ret = btrfs_chunk_alloc(trans, space_info, ffe_ctl->flags,
CHUNK_ALLOC_FORCE_FOR_EXTENT);
/* Do not bail out on ENOSPC since we can do more. */
@@ -4572,7 +4573,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
}
up_read(&space_info->groups_sem);
- ret = find_free_extent_update_loop(fs_info, ins, ffe_ctl, full_search);
+ ret = find_free_extent_update_loop(fs_info, ins, ffe_ctl, space_info,
+ full_search);
if (ret > 0)
goto search;
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index d6d33ab754ba..2489c2a16123 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -817,7 +817,7 @@ static void flush_space(struct btrfs_fs_info *fs_info,
ret = PTR_ERR(trans);
break;
}
- ret = btrfs_chunk_alloc(trans,
+ ret = btrfs_chunk_alloc(trans, space_info,
btrfs_get_alloc_profile(fs_info, space_info->flags),
(state == ALLOC_CHUNK) ? CHUNK_ALLOC_NO_FORCE :
CHUNK_ALLOC_FORCE);
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 39e48bf610a1..3f429f0cef6f 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -761,9 +761,10 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
* value here.
*/
if (do_chunk_alloc && num_bytes) {
- u64 flags = h->block_rsv->space_info->flags;
+ struct btrfs_space_info *space_info = h->block_rsv->space_info;
+ u64 flags = space_info->flags;
- btrfs_chunk_alloc(h, btrfs_get_alloc_profile(fs_info, flags),
+ btrfs_chunk_alloc(h, space_info, btrfs_get_alloc_profile(fs_info, flags),
CHUNK_ALLOC_NO_FORCE);
}
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 07/13] btrfs: pass space_info for block group creation
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (5 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 06/13] btrfs: introduce space_info argument to btrfs_chunk_alloc Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 08/13] btrfs: introduce btrfs_space_info sub-group Naohiro Aota
` (6 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota
Add btrfs_space_info parameter to btrfs_make_block_group(), its related
functions and related struct. Passed space_info will have a new block
group.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
fs/btrfs/block-group.c | 31 +++++++++++++++++++++++--------
fs/btrfs/block-group.h | 4 ++--
fs/btrfs/volumes.c | 36 +++++++++++++++++++++++++++++++-----
fs/btrfs/volumes.h | 3 ++-
4 files changed, 58 insertions(+), 16 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index ab5fc5b40f04..70deb3e8739a 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2866,8 +2866,8 @@ static u64 calculate_global_root_id(const struct btrfs_fs_info *fs_info, u64 off
}
struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *trans,
- u64 type,
- u64 chunk_offset, u64 size)
+ struct btrfs_space_info *space_info,
+ u64 type, u64 chunk_offset, u64 size)
{
struct btrfs_fs_info *fs_info = trans->fs_info;
struct btrfs_block_group *cache;
@@ -2921,7 +2921,7 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
* assigned to our block group. We want our bg to be added to the rbtree
* with its ->space_info set.
*/
- cache->space_info = btrfs_find_space_info(fs_info, cache->flags);
+ cache->space_info = space_info;
ASSERT(cache->space_info);
ret = btrfs_add_block_group_cache(cache);
@@ -3910,7 +3910,9 @@ int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans, u64 type)
return btrfs_chunk_alloc(trans, space_info, alloc_flags, CHUNK_ALLOC_FORCE);
}
-static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags)
+static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans,
+ struct btrfs_space_info *space_info,
+ u64 flags)
{
struct btrfs_block_group *bg;
int ret;
@@ -3923,7 +3925,7 @@ static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans
*/
check_system_chunk(trans, flags);
- bg = btrfs_create_chunk(trans, flags);
+ bg = btrfs_create_chunk(trans, space_info, flags);
if (IS_ERR(bg)) {
ret = PTR_ERR(bg);
goto out;
@@ -3971,8 +3973,17 @@ static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans
if (ret == -ENOSPC) {
const u64 sys_flags = btrfs_system_alloc_profile(trans->fs_info);
struct btrfs_block_group *sys_bg;
+ struct btrfs_space_info *sys_space_info;
- sys_bg = btrfs_create_chunk(trans, sys_flags);
+ sys_space_info = btrfs_find_space_info(trans->fs_info, sys_flags);
+ if (!sys_space_info) {
+ ASSERT(0);
+ ret = -EINVAL;
+ btrfs_abort_transaction(trans, ret);
+ goto out;
+ }
+
+ sys_bg = btrfs_create_chunk(trans, sys_space_info, sys_flags);
if (IS_ERR(sys_bg)) {
ret = PTR_ERR(sys_bg);
btrfs_abort_transaction(trans, ret);
@@ -4216,7 +4227,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans,
force_metadata_allocation(fs_info);
}
- ret_bg = do_chunk_alloc(trans, flags);
+ ret_bg = do_chunk_alloc(trans, space_info, flags);
trans->allocating_chunk = false;
if (IS_ERR(ret_bg)) {
@@ -4292,6 +4303,10 @@ static void reserve_chunk_space(struct btrfs_trans_handle *trans,
if (left < bytes) {
u64 flags = btrfs_system_alloc_profile(fs_info);
struct btrfs_block_group *bg;
+ struct btrfs_space_info *space_info;
+
+ space_info = btrfs_find_space_info(fs_info, flags);
+ ASSERT(space_info);
/*
* Ignore failure to create system chunk. We might end up not
@@ -4299,7 +4314,7 @@ static void reserve_chunk_space(struct btrfs_trans_handle *trans,
* the paths we visit in the chunk tree (they were already COWed
* or created in the current transaction for example).
*/
- bg = btrfs_create_chunk(trans, flags);
+ bg = btrfs_create_chunk(trans, space_info, flags);
if (IS_ERR(bg)) {
ret = PTR_ERR(bg);
} else {
diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index c01f3af726a1..35309b690d6f 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -326,8 +326,8 @@ void btrfs_reclaim_bgs(struct btrfs_fs_info *fs_info);
void btrfs_mark_bg_to_reclaim(struct btrfs_block_group *bg);
int btrfs_read_block_groups(struct btrfs_fs_info *info);
struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *trans,
- u64 type,
- u64 chunk_offset, u64 size);
+ struct btrfs_space_info *space_info,
+ u64 type, u64 chunk_offset, u64 size);
void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans);
int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
bool do_chunk_alloc);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index dbbed3f4f88a..18474d93601a 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3423,8 +3423,17 @@ int btrfs_remove_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset)
if (ret == -ENOSPC) {
const u64 sys_flags = btrfs_system_alloc_profile(fs_info);
struct btrfs_block_group *sys_bg;
+ struct btrfs_space_info *space_info;
- sys_bg = btrfs_create_chunk(trans, sys_flags);
+ space_info = btrfs_find_space_info(fs_info, sys_flags);
+ if (!space_info) {
+ ASSERT(0);
+ ret = -EINVAL;
+ btrfs_abort_transaction(trans, ret);
+ goto out;
+ }
+
+ sys_bg = btrfs_create_chunk(trans, space_info, sys_flags);
if (IS_ERR(sys_bg)) {
ret = PTR_ERR(sys_bg);
btrfs_abort_transaction(trans, ret);
@@ -5221,6 +5230,8 @@ struct alloc_chunk_ctl {
u64 stripe_size;
u64 chunk_size;
int ndevs;
+ /* Space_info the block group is going to belong. */
+ struct btrfs_space_info *space_info;
};
static void init_alloc_chunk_ctl_policy_regular(
@@ -5626,7 +5637,8 @@ static struct btrfs_block_group *create_chunk(struct btrfs_trans_handle *trans,
return ERR_PTR(ret);
}
- block_group = btrfs_make_block_group(trans, type, start, ctl->chunk_size);
+ block_group = btrfs_make_block_group(trans, ctl->space_info, type, start,
+ ctl->chunk_size);
if (IS_ERR(block_group)) {
btrfs_remove_chunk_map(info, map);
return block_group;
@@ -5652,7 +5664,8 @@ static struct btrfs_block_group *create_chunk(struct btrfs_trans_handle *trans,
}
struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
- u64 type)
+ struct btrfs_space_info *space_info,
+ u64 type)
{
struct btrfs_fs_info *info = trans->fs_info;
struct btrfs_fs_devices *fs_devices = info->fs_devices;
@@ -5682,6 +5695,7 @@ struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
ctl.start = find_next_chunk(info);
ctl.type = type;
+ ctl.space_info = space_info;
init_alloc_chunk_ctl(fs_devices, &ctl);
devices_info = kcalloc(fs_devices->rw_devices, sizeof(*devices_info),
@@ -5825,7 +5839,9 @@ static noinline int init_first_rw_device(struct btrfs_trans_handle *trans)
struct btrfs_fs_info *fs_info = trans->fs_info;
u64 alloc_profile;
struct btrfs_block_group *meta_bg;
+ struct btrfs_space_info *meta_space_info;
struct btrfs_block_group *sys_bg;
+ struct btrfs_space_info *sys_space_info;
/*
* When adding a new device for sprouting, the seed device is read-only
@@ -5849,12 +5865,22 @@ static noinline int init_first_rw_device(struct btrfs_trans_handle *trans)
*/
alloc_profile = btrfs_metadata_alloc_profile(fs_info);
- meta_bg = btrfs_create_chunk(trans, alloc_profile);
+ meta_space_info = btrfs_find_space_info(fs_info, alloc_profile);
+ if (!meta_space_info) {
+ ASSERT(0);
+ return -EINVAL;
+ }
+ meta_bg = btrfs_create_chunk(trans, meta_space_info, alloc_profile);
if (IS_ERR(meta_bg))
return PTR_ERR(meta_bg);
alloc_profile = btrfs_system_alloc_profile(fs_info);
- sys_bg = btrfs_create_chunk(trans, alloc_profile);
+ sys_space_info = btrfs_find_space_info(fs_info, alloc_profile);
+ if (!sys_space_info) {
+ ASSERT(0);
+ return -EINVAL;
+ }
+ sys_bg = btrfs_create_chunk(trans, sys_space_info, alloc_profile);
if (IS_ERR(sys_bg))
return PTR_ERR(sys_bg);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index e247d551da67..7f314a4487c4 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -715,7 +715,8 @@ struct btrfs_discard_stripe *btrfs_map_discard(struct btrfs_fs_info *fs_info,
int btrfs_read_sys_array(struct btrfs_fs_info *fs_info);
int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info);
struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans,
- u64 type);
+ struct btrfs_space_info *space_info,
+ u64 type);
void btrfs_mapping_tree_free(struct btrfs_fs_info *fs_info);
int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
blk_mode_t flags, void *holder);
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 08/13] btrfs: introduce btrfs_space_info sub-group
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (6 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 07/13] btrfs: pass space_info for block group creation Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 09/13] btrfs: introduce tree-log sub-space_info Naohiro Aota
` (5 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Current code assumes we have only one space_info for each block group type
(DATA, METADATA, and SYSTEM). We sometime needs multiple space_info to
manage special block groups.
One example is handling the data relocation block group for the zoned mode.
That block group is dedicated for writing relocated data and we cannot
allocate any regular extent from that block group, which is implemented in
the zoned extent allocator. That block group still belongs to the normal
data space_info. So, when all the normal data block groups are full and
there are some free space in the dedicated block group, the space_info
looks to have some free space, while it cannot allocate normal extent
anymore. That results in a strange ENOSPC error. We need to have a
space_info for the relocation data block group to represent the situation
properly.
This commit adds a basic infrastructure for having a "sub-group" of a
space_info: creation and removing. A sub-group space_info belongs to one of
the primary space_infos and has the same flags as its parent.
This commit first introduces the relocation data sub-space_info, and the
next commit will introduce tree-log sub-space_info. In the future, it could
be useful to implement tiered storage for btrfs e.g, by implementing a
sub-group space_info for block groups resides on a fast storage.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/block-group.c | 11 +++++++++++
fs/btrfs/space-info.c | 44 +++++++++++++++++++++++++++++++++++++++---
fs/btrfs/space-info.h | 8 ++++++++
fs/btrfs/sysfs.c | 18 ++++++++++++++---
4 files changed, 75 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 70deb3e8739a..e2b376d19299 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -4426,6 +4426,17 @@ static void check_removing_space_info(struct btrfs_space_info *space_info)
{
struct btrfs_fs_info *info = space_info->fs_info;
+ if (space_info->subgroup_id == BTRFS_SUB_GROUP_PRIMARY) {
+ /* This is a top space_info, proceed its children first. */
+ for (int i = 0; i < BTRFS_SPACE_INFO_SUB_GROUP_MAX; i++) {
+ if (space_info->sub_group[i]) {
+ check_removing_space_info(space_info->sub_group[i]);
+ kfree(space_info->sub_group[i]);
+ space_info->sub_group[i] = NULL;
+ }
+ }
+ }
+
/*
* Do not hide this behind enospc_debug, this is actually
* important and indicates a real bug if this happens.
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 2489c2a16123..e45e9db37497 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -249,16 +249,45 @@ static void init_space_info(struct btrfs_fs_info *info,
INIT_LIST_HEAD(&space_info->priority_tickets);
space_info->clamp = 1;
btrfs_update_space_info_chunk_size(space_info, calc_chunk_size(info, flags));
+ space_info->subgroup_id = BTRFS_SUB_GROUP_PRIMARY;
if (btrfs_is_zoned(info))
space_info->bg_reclaim_threshold = BTRFS_DEFAULT_ZONED_RECLAIM_THRESH;
}
+static int create_space_info_sub_group(struct btrfs_space_info *parent,
+ u64 flags, enum btrfs_space_info_sub_group id,
+ int index)
+{
+ struct btrfs_fs_info *fs_info = parent->fs_info;
+ struct btrfs_space_info *sub_group;
+ int ret;
+
+ ASSERT(parent->subgroup_id == BTRFS_SUB_GROUP_PRIMARY);
+ ASSERT(id != BTRFS_SUB_GROUP_PRIMARY);
+
+ sub_group = kzalloc(sizeof(*sub_group), GFP_NOFS);
+ if (!sub_group)
+ return -ENOMEM;
+
+ init_space_info(fs_info, sub_group, flags);
+ parent->sub_group[index] = sub_group;
+ sub_group->parent = parent;
+ sub_group->subgroup_id = id;
+
+ ret = btrfs_sysfs_add_space_info_type(fs_info, sub_group);
+ if (ret) {
+ kfree(sub_group);
+ parent->sub_group[index] = NULL;
+ }
+ return ret;
+}
+
static int create_space_info(struct btrfs_fs_info *info, u64 flags)
{
struct btrfs_space_info *space_info;
- int ret;
+ int ret = 0;
space_info = kzalloc(sizeof(*space_info), GFP_NOFS);
if (!space_info)
@@ -266,6 +295,14 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
init_space_info(info, space_info, flags);
+ if (btrfs_is_zoned(info)) {
+ if (flags & BTRFS_BLOCK_GROUP_DATA)
+ ret = create_space_info_sub_group(space_info, flags,
+ BTRFS_SUB_GROUP_DATA_RELOC, 0);
+ if (ret)
+ return ret;
+ }
+
ret = btrfs_sysfs_add_space_info_type(info, space_info);
if (ret)
return ret;
@@ -561,8 +598,9 @@ static void __btrfs_dump_space_info(const struct btrfs_fs_info *fs_info,
lockdep_assert_held(&info->lock);
/* The free space could be negative in case of overcommit */
- btrfs_info(fs_info, "space_info %s has %lld free, is %sfull",
- flag_str,
+ btrfs_info(fs_info,
+ "space_info %s (sub-group id %d) has %lld free, is %sfull",
+ flag_str, info->subgroup_id,
(s64)(info->total_bytes - btrfs_space_info_used(info, true)),
info->full ? "" : "not ");
btrfs_info(fs_info,
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 7459b4eb99cd..c76929a65475 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -98,8 +98,16 @@ enum btrfs_flush_state {
RESET_ZONES = 12,
};
+enum btrfs_space_info_sub_group {
+ BTRFS_SUB_GROUP_PRIMARY,
+ BTRFS_SUB_GROUP_DATA_RELOC,
+};
+#define BTRFS_SPACE_INFO_SUB_GROUP_MAX 1
struct btrfs_space_info {
struct btrfs_fs_info *fs_info;
+ struct btrfs_space_info *parent;
+ struct btrfs_space_info *sub_group[BTRFS_SPACE_INFO_SUB_GROUP_MAX];
+ int subgroup_id;
spinlock_t lock;
u64 total_bytes; /* total bytes in the space,
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index b9af74498b0c..4667b388e046 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1930,16 +1930,28 @@ void btrfs_sysfs_remove_space_info(struct btrfs_space_info *space_info)
kobject_put(&space_info->kobj);
}
-static const char *alloc_name(u64 flags)
+static const char *alloc_name(struct btrfs_space_info *space_info)
{
+ u64 flags = space_info->flags;
+
switch (flags) {
case BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA:
return "mixed";
case BTRFS_BLOCK_GROUP_METADATA:
+ ASSERT(space_info->subgroup_id == BTRFS_SUB_GROUP_PRIMARY);
return "metadata";
case BTRFS_BLOCK_GROUP_DATA:
- return "data";
+ switch (space_info->subgroup_id) {
+ case BTRFS_SUB_GROUP_PRIMARY:
+ return "data";
+ case BTRFS_SUB_GROUP_DATA_RELOC:
+ return "data-reloc";
+ default:
+ WARN_ON_ONCE(1);
+ return "data (unknown sub-group)";
+ }
case BTRFS_BLOCK_GROUP_SYSTEM:
+ ASSERT(space_info->subgroup_id == BTRFS_SUB_GROUP_PRIMARY);
return "system";
default:
WARN_ON(1);
@@ -1958,7 +1970,7 @@ int btrfs_sysfs_add_space_info_type(struct btrfs_fs_info *fs_info,
ret = kobject_init_and_add(&space_info->kobj, &space_info_ktype,
fs_info->space_info_kobj, "%s",
- alloc_name(space_info->flags));
+ alloc_name(space_info));
if (ret) {
kobject_put(&space_info->kobj);
return ret;
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 09/13] btrfs: introduce tree-log sub-space_info
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (7 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 08/13] btrfs: introduce btrfs_space_info sub-group Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 10/13] btrfs: tweak extent/chunk allocation for space_info sub-space Naohiro Aota
` (4 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
This commit introduces the tree-log sub-space_info, which is sub-space of
metadata space_info and dedicated for tree-log node allocation.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/space-info.c | 4 ++++
fs/btrfs/space-info.h | 1 +
fs/btrfs/sysfs.c | 11 +++++++++--
3 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index e45e9db37497..a34771cf0870 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -299,6 +299,10 @@ static int create_space_info(struct btrfs_fs_info *info, u64 flags)
if (flags & BTRFS_BLOCK_GROUP_DATA)
ret = create_space_info_sub_group(space_info, flags,
BTRFS_SUB_GROUP_DATA_RELOC, 0);
+ else if (flags & BTRFS_BLOCK_GROUP_METADATA)
+ ret = create_space_info_sub_group(space_info, flags,
+ BTRFS_SUB_GROUP_TREELOG, 0);
+
if (ret)
return ret;
}
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index c76929a65475..e04727f30aed 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -101,6 +101,7 @@ enum btrfs_flush_state {
enum btrfs_space_info_sub_group {
BTRFS_SUB_GROUP_PRIMARY,
BTRFS_SUB_GROUP_DATA_RELOC,
+ BTRFS_SUB_GROUP_TREELOG,
};
#define BTRFS_SPACE_INFO_SUB_GROUP_MAX 1
struct btrfs_space_info {
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 4667b388e046..5d93d9dd2c12 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1938,8 +1938,15 @@ static const char *alloc_name(struct btrfs_space_info *space_info)
case BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_DATA:
return "mixed";
case BTRFS_BLOCK_GROUP_METADATA:
- ASSERT(space_info->subgroup_id == BTRFS_SUB_GROUP_PRIMARY);
- return "metadata";
+ switch (space_info->subgroup_id) {
+ case BTRFS_SUB_GROUP_PRIMARY:
+ return "metadata";
+ case BTRFS_SUB_GROUP_TREELOG:
+ return "metadata-treelog";
+ default:
+ WARN_ON_ONCE(1);
+ return "metadata (unknown sub-group)";
+ }
case BTRFS_BLOCK_GROUP_DATA:
switch (space_info->subgroup_id) {
case BTRFS_SUB_GROUP_PRIMARY:
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 10/13] btrfs: tweak extent/chunk allocation for space_info sub-space
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (8 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 09/13] btrfs: introduce tree-log sub-space_info Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 11/13] btrfs: use proper data space_info Naohiro Aota
` (3 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Make the extent allocator and the chunk allocator aware of the sub-space.
It now uses BTRFS_SUB_GROUP_DATA_RELOC sub-space for data relocation block
group, and uses BTRFS_SUB_GROUP_TREELOG for metadata tree-log block group.
And, it needs to check the space_info is the right one when a block group
candidate is given. Also, new block group should now belong to the
specified one.
Now that, block_group->space_info is always set before
btrfs_add_bg_to_space_info(), we no longer need to "find" the space_info.
So, rename the variable name to address that as well.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/block-group.c | 3 +++
fs/btrfs/extent-tree.c | 14 +++++++++++++-
fs/btrfs/space-info.c | 32 +++++++++++++++-----------------
3 files changed, 31 insertions(+), 18 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index e2b376d19299..74d8ecfb599b 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2371,6 +2371,7 @@ static int read_one_block_group(struct btrfs_fs_info *info,
cache->commit_used = cache->used;
cache->flags = btrfs_stack_block_group_flags(bgi);
cache->global_root_id = btrfs_stack_block_group_chunk_objectid(bgi);
+ cache->space_info = btrfs_find_space_info(info, cache->flags);
set_free_space_tree_thresholds(cache);
@@ -2449,6 +2450,7 @@ static int read_one_block_group(struct btrfs_fs_info *info,
btrfs_remove_free_space_cache(cache);
goto error;
}
+
trace_btrfs_add_block_group(info, cache, 0);
btrfs_add_bg_to_space_info(info, cache);
@@ -2493,6 +2495,7 @@ static int fill_dummy_bgs(struct btrfs_fs_info *fs_info)
bg->cached = BTRFS_CACHE_FINISHED;
bg->used = map->chunk_len;
bg->flags = map->type;
+ bg->space_info = btrfs_find_space_info(fs_info, bg->flags);
ret = btrfs_add_block_group_cache(bg);
/*
* We may have some valid block group cache added already, in
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 1762918dd9a1..b920efbc9cfa 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4380,8 +4380,19 @@ static noinline int find_free_extent(struct btrfs_root *root,
trace_btrfs_find_free_extent(root, ffe_ctl);
space_info = btrfs_find_space_info(fs_info, ffe_ctl->flags);
+ if (btrfs_is_zoned(fs_info) && space_info) {
+ /* Use dedicated sub-space_info for dedicated block group users. */
+ if (ffe_ctl->for_data_reloc) {
+ space_info = space_info->sub_group[0];
+ ASSERT(space_info->subgroup_id == BTRFS_SUB_GROUP_DATA_RELOC);
+ } else if (ffe_ctl->for_treelog) {
+ space_info = space_info->sub_group[0];
+ ASSERT(space_info->subgroup_id == BTRFS_SUB_GROUP_TREELOG);
+ }
+ }
if (!space_info) {
- btrfs_err(fs_info, "No space info for %llu", ffe_ctl->flags);
+ btrfs_err(fs_info, "No space info for %llu, tree-log %d, relocation %d",
+ ffe_ctl->flags, ffe_ctl->for_treelog, ffe_ctl->for_data_reloc);
return -ENOSPC;
}
@@ -4403,6 +4414,7 @@ static noinline int find_free_extent(struct btrfs_root *root,
* picked out then we don't care that the block group is cached.
*/
if (block_group && block_group_bits(block_group, ffe_ctl->flags) &&
+ block_group->space_info == space_info &&
block_group->cached != BTRFS_CACHE_NO) {
down_read(&space_info->groups_sem);
if (list_empty(&block_group->list) ||
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index a34771cf0870..1205588b6234 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -358,31 +358,29 @@ int btrfs_init_space_info(struct btrfs_fs_info *fs_info)
void btrfs_add_bg_to_space_info(struct btrfs_fs_info *info,
struct btrfs_block_group *block_group)
{
- struct btrfs_space_info *found;
+ struct btrfs_space_info *space_info = block_group->space_info;
int factor, index;
factor = btrfs_bg_type_to_factor(block_group->flags);
- found = btrfs_find_space_info(info, block_group->flags);
- ASSERT(found);
- spin_lock(&found->lock);
- found->total_bytes += block_group->length;
- found->disk_total += block_group->length * factor;
- found->bytes_used += block_group->used;
- found->disk_used += block_group->used * factor;
- found->bytes_readonly += block_group->bytes_super;
- btrfs_space_info_update_bytes_zone_unusable(found, block_group->zone_unusable);
+ spin_lock(&space_info->lock);
+ space_info->total_bytes += block_group->length;
+ space_info->disk_total += block_group->length * factor;
+ space_info->bytes_used += block_group->used;
+ space_info->disk_used += block_group->used * factor;
+ space_info->bytes_readonly += block_group->bytes_super;
+ btrfs_space_info_update_bytes_zone_unusable(space_info, block_group->zone_unusable);
if (block_group->length > 0)
- found->full = 0;
- btrfs_try_granting_tickets(info, found);
- spin_unlock(&found->lock);
+ space_info->full = 0;
+ btrfs_try_granting_tickets(info, space_info);
+ spin_unlock(&space_info->lock);
- block_group->space_info = found;
+ block_group->space_info = space_info;
index = btrfs_bg_flags_to_raid_index(block_group->flags);
- down_write(&found->groups_sem);
- list_add_tail(&block_group->list, &found->block_groups[index]);
- up_write(&found->groups_sem);
+ down_write(&space_info->groups_sem);
+ list_add_tail(&block_group->list, &space_info->block_groups[index]);
+ up_write(&space_info->groups_sem);
}
struct btrfs_space_info *btrfs_find_space_info(struct btrfs_fs_info *info,
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 11/13] btrfs: use proper data space_info
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (9 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 10/13] btrfs: tweak extent/chunk allocation for space_info sub-space Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 12/13] btrfs: add block_rsv for treelog Naohiro Aota
` (2 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Now that, we have data sub-space for the zoned mode. Tweak some space_info
functions to use proper space_info for a file.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/delalloc-space.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index a18895255af9..f7927657e036 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -111,6 +111,15 @@
* making error handling and cleanup easier.
*/
+static inline struct btrfs_space_info *data_sinfo_for_inode(const struct btrfs_inode *inode)
+{
+ struct btrfs_fs_info *fs_info = inode->root->fs_info;
+
+ if (btrfs_is_zoned(fs_info) && btrfs_is_data_reloc_root(inode->root))
+ return fs_info->data_sinfo->sub_group[SUB_GROUP_DATA_RELOC];
+ return fs_info->data_sinfo;
+}
+
int btrfs_alloc_data_chunk_ondemand(const struct btrfs_inode *inode, u64 bytes)
{
struct btrfs_root *root = inode->root;
@@ -123,7 +132,7 @@ int btrfs_alloc_data_chunk_ondemand(const struct btrfs_inode *inode, u64 bytes)
if (btrfs_is_free_space_inode(inode))
flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
- return btrfs_reserve_data_bytes(fs_info->data_sinfo, bytes, flush);
+ return btrfs_reserve_data_bytes(data_sinfo_for_inode(inode), bytes, flush);
}
int btrfs_check_data_free_space(struct btrfs_inode *inode,
@@ -144,7 +153,7 @@ int btrfs_check_data_free_space(struct btrfs_inode *inode,
else if (btrfs_is_free_space_inode(inode))
flush = BTRFS_RESERVE_FLUSH_FREE_SPACE_INODE;
- ret = btrfs_reserve_data_bytes(fs_info->data_sinfo, len, flush);
+ ret = btrfs_reserve_data_bytes(data_sinfo_for_inode(inode), len, flush);
if (ret < 0)
return ret;
@@ -172,12 +181,10 @@ void btrfs_free_reserved_data_space_noquota(struct btrfs_inode *inode,
u64 len)
{
struct btrfs_fs_info *fs_info = inode->root->fs_info;
- struct btrfs_space_info *data_sinfo;
ASSERT(IS_ALIGNED(len, fs_info->sectorsize));
- data_sinfo = fs_info->data_sinfo;
- btrfs_space_info_free_bytes_may_use(data_sinfo, len);
+ btrfs_space_info_free_bytes_may_use(data_sinfo_for_inode(inode), len);
}
/*
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 12/13] btrfs: add block_rsv for treelog
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (10 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 11/13] btrfs: use proper data space_info Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-04-23 2:43 ` [PATCH v4 13/13] btrfs: reclaim from sub-space space_info Naohiro Aota
2025-05-02 13:08 ` [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups David Sterba
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
We need to add a dedicated block_rsv for tree-log, because the block_rsv
serves for a tree node allocation in btrfs_alloc_tree_block(). Currently,
tree-log tree uses fs_info->empty_block_rsv, which is shared across trees
and points to the normal metadata space_info. Instead, we add a dedicated
block_rsv and that block_rsv can use the dedicated sub-space_info.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/block-rsv.c | 11 +++++++++++
fs/btrfs/block-rsv.h | 1 +
fs/btrfs/delalloc-space.c | 7 +++++--
fs/btrfs/disk-io.c | 1 +
fs/btrfs/fs.h | 2 ++
5 files changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/block-rsv.c b/fs/btrfs/block-rsv.c
index 3f3608299c0b..5ad6de738aee 100644
--- a/fs/btrfs/block-rsv.c
+++ b/fs/btrfs/block-rsv.c
@@ -418,6 +418,9 @@ void btrfs_init_root_block_rsv(struct btrfs_root *root)
case BTRFS_CHUNK_TREE_OBJECTID:
root->block_rsv = &fs_info->chunk_block_rsv;
break;
+ case BTRFS_TREE_LOG_OBJECTID:
+ root->block_rsv = &fs_info->treelog_rsv;
+ break;
default:
root->block_rsv = NULL;
break;
@@ -438,6 +441,14 @@ void btrfs_init_global_block_rsv(struct btrfs_fs_info *fs_info)
fs_info->delayed_block_rsv.space_info = space_info;
fs_info->delayed_refs_rsv.space_info = space_info;
+ /* The treelog_rsv uses a dedicated space_info on the zoned mode. */
+ if (!btrfs_is_zoned(fs_info)) {
+ fs_info->treelog_rsv.space_info = space_info;
+ } else {
+ ASSERT(space_info->sub_group[0]->subgroup_id == BTRFS_SUB_GROUP_TREELOG);
+ fs_info->treelog_rsv.space_info = space_info->sub_group[0];
+ }
+
btrfs_update_global_block_rsv(fs_info);
}
diff --git a/fs/btrfs/block-rsv.h b/fs/btrfs/block-rsv.h
index d12b1fac5c74..79ae9d05cd91 100644
--- a/fs/btrfs/block-rsv.h
+++ b/fs/btrfs/block-rsv.h
@@ -24,6 +24,7 @@ enum btrfs_rsv_type {
BTRFS_BLOCK_RSV_CHUNK,
BTRFS_BLOCK_RSV_DELOPS,
BTRFS_BLOCK_RSV_DELREFS,
+ BTRFS_BLOCK_RSV_TREELOG,
BTRFS_BLOCK_RSV_EMPTY,
BTRFS_BLOCK_RSV_TEMP,
};
diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c
index f7927657e036..62b608bea414 100644
--- a/fs/btrfs/delalloc-space.c
+++ b/fs/btrfs/delalloc-space.c
@@ -115,8 +115,11 @@ static inline struct btrfs_space_info *data_sinfo_for_inode(const struct btrfs_i
{
struct btrfs_fs_info *fs_info = inode->root->fs_info;
- if (btrfs_is_zoned(fs_info) && btrfs_is_data_reloc_root(inode->root))
- return fs_info->data_sinfo->sub_group[SUB_GROUP_DATA_RELOC];
+ if (btrfs_is_zoned(fs_info) && btrfs_is_data_reloc_root(inode->root)) {
+ ASSERT(fs_info->data_sinfo->sub_group[0]->subgroup_id ==
+ BTRFS_SUB_GROUP_DATA_RELOC);
+ return fs_info->data_sinfo->sub_group[0];
+ }
return fs_info->data_sinfo;
}
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 1541fd19d5ce..5286c25f3429 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2823,6 +2823,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
BTRFS_BLOCK_RSV_GLOBAL);
btrfs_init_block_rsv(&fs_info->trans_block_rsv, BTRFS_BLOCK_RSV_TRANS);
btrfs_init_block_rsv(&fs_info->chunk_block_rsv, BTRFS_BLOCK_RSV_CHUNK);
+ btrfs_init_block_rsv(&fs_info->treelog_rsv, BTRFS_BLOCK_RSV_TREELOG);
btrfs_init_block_rsv(&fs_info->empty_block_rsv, BTRFS_BLOCK_RSV_EMPTY);
btrfs_init_block_rsv(&fs_info->delayed_block_rsv,
BTRFS_BLOCK_RSV_DELOPS);
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index bcca43046064..0d5af9732a3c 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -471,6 +471,8 @@ struct btrfs_fs_info {
struct btrfs_block_rsv delayed_block_rsv;
/* Block reservation for delayed refs */
struct btrfs_block_rsv delayed_refs_rsv;
+ /* Block reservation for treelog tree */
+ struct btrfs_block_rsv treelog_rsv;
struct btrfs_block_rsv empty_block_rsv;
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH v4 13/13] btrfs: reclaim from sub-space space_info
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (11 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 12/13] btrfs: add block_rsv for treelog Naohiro Aota
@ 2025-04-23 2:43 ` Naohiro Aota
2025-05-02 13:08 ` [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups David Sterba
13 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2025-04-23 2:43 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Johannes Thumshirn
Modify btrfs_async_{data,metadata}_reclaim() to run the reclaim process on
the sub-spaces as well.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
---
fs/btrfs/space-info.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 1205588b6234..867092de09d1 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1223,6 +1223,10 @@ static void btrfs_async_reclaim_metadata_space(struct work_struct *work)
fs_info = container_of(work, struct btrfs_fs_info, async_reclaim_work);
space_info = btrfs_find_space_info(fs_info, BTRFS_BLOCK_GROUP_METADATA);
do_async_reclaim_metadata_space(space_info);
+ for (int i = 0; i < BTRFS_SPACE_INFO_SUB_GROUP_MAX; i++) {
+ if (space_info->sub_group[i])
+ do_async_reclaim_metadata_space(space_info->sub_group[i]);
+ }
}
/*
@@ -1451,6 +1455,9 @@ static void btrfs_async_reclaim_data_space(struct work_struct *work)
fs_info = container_of(work, struct btrfs_fs_info, async_data_reclaim_work);
space_info = fs_info->data_sinfo;
do_async_reclaim_data_space(space_info);
+ for (int i = 0; i < BTRFS_SPACE_INFO_SUB_GROUP_MAX; i++)
+ if (space_info->sub_group[i])
+ do_async_reclaim_data_space(space_info->sub_group[i]);
}
void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info)
--
2.49.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups
2025-04-23 2:43 [PATCH v4 00/13] btrfs: zoned: split out space_info for dedicated block groups Naohiro Aota
` (12 preceding siblings ...)
2025-04-23 2:43 ` [PATCH v4 13/13] btrfs: reclaim from sub-space space_info Naohiro Aota
@ 2025-05-02 13:08 ` David Sterba
13 siblings, 0 replies; 15+ messages in thread
From: David Sterba @ 2025-05-02 13:08 UTC (permalink / raw)
To: Naohiro Aota; +Cc: linux-btrfs
On Wed, Apr 23, 2025 at 11:43:40AM +0900, Naohiro Aota wrote:
> As discussed in [1], there is a longstanding early ENOSPC issue on the
> zoned mode even with simple fio script. This is also causing blktests
> zbd/009 to fail [2].
>
> [1] https://lore.kernel.org/linux-btrfs/cover.1731571240.git.naohiro.aota@wdc.com/
> [2] https://github.com/osandov/blktests/issues/150
>
> This series is the second part to fix the ENOSPC issue. This series
> introduces "space_info sub-space" and use it split a space_info for data
> relocation block group and metadata tree-log block group.
>
> Current code assumes we have only one space_info for each block group type
> (DATA, METADATA, and SYSTEM). We sometime needs multiple space_info to
> manage special block groups.
>
> One example is handling the data relocation block group for the zoned mode.
> That block group is dedicated for writing relocated data and we cannot
> allocate any regular extent from that block group, which is implemented in
> the zoned extent allocator. That block group still belongs to the normal
> data space_info. So, when all the normal data block groups are full and
> there are some free space in the dedicated block group, the space_info
> looks to have some free space, while it cannot allocate normal extent
> anymore. That results in a strange ENOSPC error. We need to have a
> space_info for the relocation data block group to represent the situation
> properly.
>
> Changes:
> - v4:
> - Pass down space_info from top of call tree, instead querying it at the bottom.
> - Make BTRFS_SUB_GROUP_* id distinct.
> - Use treelog_rsv on regular btrfs too.
> - Fix NULL dereference bug.
> - Some format and commit log fix.
The patches have been in linux-next, I've moved them to for-next. The
non-zoned mode is mostly unaffected.
^ permalink raw reply [flat|nested] 15+ messages in thread