* [PATCH v2 0/3] btrfs: find_free_extent cleanups
@ 2025-10-03 23:41 Leo Martins
From: Leo Martins @ 2025-10-03 23:41 UTC (permalink / raw)
To: linux-btrfs, kernel-team, fstests
The first patch removes redundant RAID loop logic that became obsolete
after previous changes ensured allocations only occur from block groups
with matching RAID level.
The second patch adds comprehensive tracing to find_free_extent() to
improve debugging and performance analysis capabilities.
Here is a bpftrace script I put together to analyze the allocation
behavior, along with output.
Link: https://github.com/loemraw/btrfs-scripts/blob/main/ffe_analyzer.bt
Link: https://github.com/loemraw/btrfs-scripts/blob/main/ffe_analyzer.out
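As a lighter-weight complement to the bpftrace script, the skip reasons
could also be tallied straight from the text trace output. This is only
a sketch: count_skip_reasons is a hypothetical helper that is not part
of the series, and it assumes every event line carries a reason=<NAME>
token, as in the TP_printk format string added by patch 2.

```bash
#!/bin/bash
# Hypothetical helper (not part of the series): tally skip reasons from
# trace lines read on stdin. Assumes each line contains "reason=<NAME>",
# as in the TP_printk format string of patch 2.
count_skip_reasons() {
	# 1. keep only the reason=<NAME> token of each line
	# 2. drop the "reason=" prefix
	# 3. sort so identical reasons are adjacent, then count them
	# 4. print "<NAME> <count>" per reason
	grep -o 'reason=[A-Z_]*' |
		cut -d= -f2 |
		LC_ALL=C sort |
		uniq -c |
		awk '{ print $2, $1 }'
}
```

On a kernel with the series applied, and with the event enabled, this
would typically be fed from the tracefs "trace" file as root; the exact
mount point depends on the system.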
Testing:
- ran xfstests btrfs/auto
- verified trace events output correctly
- new fstest that tests RAID conversions under stress
Change log:
v1 -> v2:
PATCH 1
- re-add full_search
PATCH 2
- standardize naming of skip reasons
- remove prepare_allocation_failure reason as it's not a skip
- add more error_or_values for skip reasons
PATCH 3
- new fstests for raid conversions under stress
Leo Martins (2):
btrfs: remove ffe RAID loop
btrfs: add tracing for find_free_extent skip conditions
fs/btrfs/extent-tree.c | 70 ++++++++++++++++++------------------
fs/btrfs/extent-tree.h | 17 +++++++++
include/trace/events/btrfs.h | 66 ++++++++++++++++++++++++++++++++++
3 files changed, 119 insertions(+), 34 deletions(-)
--
2.47.3
* [PATCH v2 1/3] btrfs: remove ffe RAID loop
@ 2025-10-03 23:41 Leo Martins
From: Leo Martins @ 2025-10-03 23:41 UTC (permalink / raw)
To: linux-btrfs, kernel-team, fstests

This patch removes the RAID loop from find_free_extent since it
is impossible to allocate from a block group with a different
RAID profile.

Historically, we've been able to fulfill allocation requests
from mismatched RAID block groups assuming they provided the
required duplication. For example, a request for RAID0 could be
fulfilled by a RAID1 block group.

2a28468e525f ("btrfs: extent-tree: Make sure we only allocate extents from block groups with the same type")
changed this behavior to skip block groups with different flags
than the request. This makes the duplication compatibility check
redundant since we're going to keep searching regardless.

Signed-off-by: Leo Martins <loemra.dev@gmail.com>
---
 fs/btrfs/extent-tree.c | 32 +-------------------------------
 1 file changed, 1 insertion(+), 31 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a4416c451b25..28b442660014 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4171,13 +4171,8 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
 	if (ffe_ctl->loop >= LOOP_CACHING_WAIT && ffe_ctl->have_caching_bg)
 		return 1;
 
-	ffe_ctl->index++;
-	if (ffe_ctl->index < BTRFS_NR_RAID_TYPES)
-		return 1;
-
 	/* See the comments for btrfs_loop_type for an explanation of the phases. */
 	if (ffe_ctl->loop < LOOP_NO_EMPTY_SIZE) {
-		ffe_ctl->index = 0;
 		/*
 		 * We want to skip the LOOP_CACHING_WAIT step if we don't have
 		 * any uncached bgs and we've already done a full search
@@ -4477,9 +4472,7 @@ static noinline int find_free_extent(struct btrfs_root *root,
 search:
 	trace_btrfs_find_free_extent_search_loop(root, ffe_ctl);
 	ffe_ctl->have_caching_bg = false;
-	if (ffe_ctl->index == btrfs_bg_flags_to_raid_index(ffe_ctl->flags) ||
-	    ffe_ctl->index == 0)
-		full_search = true;
+	full_search = true;
 	down_read(&space_info->groups_sem);
 	list_for_each_entry(block_group,
 			    &space_info->block_groups[ffe_ctl->index], list) {
@@ -4498,30 +4491,7 @@ static noinline int find_free_extent(struct btrfs_root *root,
 		btrfs_grab_block_group(block_group, ffe_ctl->delalloc);
 		ffe_ctl->search_start = block_group->start;
 
-		/*
-		 * this can happen if we end up cycling through all the
-		 * raid types, but we want to make sure we only allocate
-		 * for the proper type.
-		 */
 		if (!block_group_bits(block_group, ffe_ctl->flags)) {
-			u64 extra = BTRFS_BLOCK_GROUP_DUP |
-				BTRFS_BLOCK_GROUP_RAID1_MASK |
-				BTRFS_BLOCK_GROUP_RAID56_MASK |
-				BTRFS_BLOCK_GROUP_RAID10;
-
-			/*
-			 * if they asked for extra copies and this block group
-			 * doesn't provide them, bail. This does allow us to
-			 * fill raid0 from raid1.
-			 */
-			if ((ffe_ctl->flags & extra) && !(block_group->flags & extra))
-				goto loop;
-
-			/*
-			 * This block group has different flags than we want.
-			 * It's possible that we have MIXED_GROUP flag but no
-			 * block group is mixed. Just skip such block group.
-			 */
 			btrfs_release_block_group(block_group, ffe_ctl->delalloc);
 			continue;
 		}
-- 
2.47.3
* Re: [PATCH v2 1/3] btrfs: remove ffe RAID loop
@ 2025-10-15 3:29 Boris Burkov
From: Boris Burkov @ 2025-10-15 3:29 UTC (permalink / raw)
To: Leo Martins; +Cc: linux-btrfs, kernel-team, fstests

On Fri, Oct 03, 2025 at 04:41:57PM -0700, Leo Martins wrote:
> This patch removes the RAID loop from find_free_extent since it
> is impossible to allocate from a block group with a different
> RAID profile.
>
> Historically, we've been able to fulfill allocation requests
> from mismatched RAID block groups assuming they provided the
> required duplication. For example, a request for RAID0 could be
> fulfilled by a RAID1 block group.
>
> 2a28468e525f ("btrfs: extent-tree: Make sure we only allocate extents from block groups with the same type")
> changed this behavior to skip block groups with different flags
> than the request. This makes the duplication compatibility check
> redundant since we're going to keep searching regardless.
>
> Signed-off-by: Leo Martins <loemra.dev@gmail.com>

Reviewed-by: Boris Burkov <boris@bur.io>

> ---
>  fs/btrfs/extent-tree.c | 32 +-------------------------------
>  1 file changed, 1 insertion(+), 31 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index a4416c451b25..28b442660014 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4171,13 +4171,8 @@ static int find_free_extent_update_loop(struct btrfs_fs_info *fs_info,
>  	if (ffe_ctl->loop >= LOOP_CACHING_WAIT && ffe_ctl->have_caching_bg)
>  		return 1;
>
> -	ffe_ctl->index++;
> -	if (ffe_ctl->index < BTRFS_NR_RAID_TYPES)
> -		return 1;
> -
>  	/* See the comments for btrfs_loop_type for an explanation of the phases. */
>  	if (ffe_ctl->loop < LOOP_NO_EMPTY_SIZE) {
> -		ffe_ctl->index = 0;
>  		/*
>  		 * We want to skip the LOOP_CACHING_WAIT step if we don't have
>  		 * any uncached bgs and we've already done a full search
> @@ -4477,9 +4472,7 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  search:
>  	trace_btrfs_find_free_extent_search_loop(root, ffe_ctl);
>  	ffe_ctl->have_caching_bg = false;
> -	if (ffe_ctl->index == btrfs_bg_flags_to_raid_index(ffe_ctl->flags) ||
> -	    ffe_ctl->index == 0)
> -		full_search = true;
> +	full_search = true;
>  	down_read(&space_info->groups_sem);
>  	list_for_each_entry(block_group,
>  			    &space_info->block_groups[ffe_ctl->index], list) {
> @@ -4498,30 +4491,7 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  		btrfs_grab_block_group(block_group, ffe_ctl->delalloc);
>  		ffe_ctl->search_start = block_group->start;
>
> -		/*
> -		 * this can happen if we end up cycling through all the
> -		 * raid types, but we want to make sure we only allocate
> -		 * for the proper type.
> -		 */
>  		if (!block_group_bits(block_group, ffe_ctl->flags)) {
> -			u64 extra = BTRFS_BLOCK_GROUP_DUP |
> -				BTRFS_BLOCK_GROUP_RAID1_MASK |
> -				BTRFS_BLOCK_GROUP_RAID56_MASK |
> -				BTRFS_BLOCK_GROUP_RAID10;
> -
> -			/*
> -			 * if they asked for extra copies and this block group
> -			 * doesn't provide them, bail. This does allow us to
> -			 * fill raid0 from raid1.
> -			 */
> -			if ((ffe_ctl->flags & extra) && !(block_group->flags & extra))
> -				goto loop;
> -
> -			/*
> -			 * This block group has different flags than we want.
> -			 * It's possible that we have MIXED_GROUP flag but no
> -			 * block group is mixed. Just skip such block group.
> -			 */
>  			btrfs_release_block_group(block_group, ffe_ctl->delalloc);
>  			continue;
>  		}
> --
> 2.47.3
>
* [PATCH v2 2/3] btrfs: add tracing for find_free_extent skip conditions
@ 2025-10-03 23:41 Leo Martins
From: Leo Martins @ 2025-10-03 23:41 UTC (permalink / raw)
To: linux-btrfs, kernel-team, fstests

Add detailed tracing to the find_free_extent() function to improve
observability of extent allocation behavior. This patch introduces:

- A new trace event btrfs_find_free_extent_skip_block_group() that
  captures allocation context and skip reasons
- Comprehensive set of FFE_SKIP_BG_* constants defining specific
  reasons why block groups are skipped during allocation
- Trace points at all major skip conditions in the allocator loop

The trace event includes key allocation parameters (root, size, flags,
loop iteration) and block group details (start, length, flags) along
with the specific skip reason and associated error codes.

These trace points will help diagnose allocation performance
issues, understand allocation patterns, and debug extent allocator
behavior.

Signed-off-by: Leo Martins <loemra.dev@gmail.com>
---
 fs/btrfs/extent-tree.c       | 38 +++++++++++++++++++--
 fs/btrfs/extent-tree.h       | 17 ++++++++++
 include/trace/events/btrfs.h | 66 ++++++++++++++++++++++++++++++++++++
 3 files changed, 118 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 28b442660014..3b6d57d39bd5 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4455,6 +4455,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
 				 * target because our list pointers are not
 				 * valid
 				 */
+				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+					FFE_SKIP_BG_REMOVING, 0);
 				btrfs_put_block_group(block_group);
 				up_read(&space_info->groups_sem);
 			} else {
@@ -4466,6 +4468,15 @@ static noinline int find_free_extent(struct btrfs_root *root,
 				goto have_block_group;
 			}
 		} else if (block_group) {
+			if (!block_group_bits(block_group, ffe_ctl->flags))
+				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+					FFE_SKIP_BG_WRONG_FLAGS, block_group->flags & ~ffe_ctl->flags);
+			else if (block_group->space_info != space_info)
+				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+					FFE_SKIP_BG_WRONG_SPACE_INFO, 0);
+			else if (block_group->cached == BTRFS_CACHE_NO)
+				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+					FFE_SKIP_BG_NOT_CACHED, 0);
 			btrfs_put_block_group(block_group);
 		}
 	}
@@ -4481,6 +4492,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
 		ffe_ctl->hinted = false;
 		/* If the block group is read-only, we can skip it entirely. */
 		if (unlikely(block_group->ro)) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_READ_ONLY, 0);
 			if (ffe_ctl->for_treelog)
 				btrfs_clear_treelog_bg(block_group);
 			if (ffe_ctl->for_data_reloc)
@@ -4492,6 +4505,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
 		ffe_ctl->search_start = block_group->start;
 
 		if (!block_group_bits(block_group, ffe_ctl->flags)) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_WRONG_FLAGS, block_group->flags & ~ffe_ctl->flags);
 			btrfs_release_block_group(block_group, ffe_ctl->delalloc);
 			continue;
 		}
@@ -4511,6 +4526,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
 			 * error that caused problems, not ENOSPC.
 			 */
 			if (ret < 0) {
+				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+					FFE_SKIP_BG_CACHE_ERROR, ret);
 				if (!cache_block_group_error)
 					cache_block_group_error = ret;
 				ret = 0;
@@ -4520,18 +4537,26 @@ static noinline int find_free_extent(struct btrfs_root *root,
 		}
 
 		if (unlikely(block_group->cached == BTRFS_CACHE_ERROR)) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_CACHE_ERROR, -EIO);
 			if (!cache_block_group_error)
 				cache_block_group_error = -EIO;
 			goto loop;
 		}
 
-		if (!find_free_extent_check_size_class(ffe_ctl, block_group))
+		if (!find_free_extent_check_size_class(ffe_ctl, block_group)) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_SIZE_CLASS_MISMATCH, block_group->size_class);
 			goto loop;
+		}
 
 		bg_ret = NULL;
 		ret = do_allocation(block_group, ffe_ctl, &bg_ret);
-		if (ret > 0)
+		if (ret > 0) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_DO_ALLOCATION_FAILED, ret);
 			goto loop;
+		}
 
 		if (bg_ret && bg_ret != block_group) {
 			btrfs_release_block_group(block_group, ffe_ctl->delalloc);
@@ -4545,22 +4570,29 @@ static noinline int find_free_extent(struct btrfs_root *root,
 		/* move on to the next group */
 		if (ffe_ctl->search_start + ffe_ctl->num_bytes >
 		    block_group->start + block_group->length) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_SEARCH_BOUNDS, block_group->start + block_group->length);
 			btrfs_add_free_space_unused(block_group,
 					    ffe_ctl->found_offset,
 					    ffe_ctl->num_bytes);
 			goto loop;
 		}
 
-		if (ffe_ctl->found_offset < ffe_ctl->search_start)
+		if (ffe_ctl->found_offset < ffe_ctl->search_start) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_SEARCH_BOUNDS, ffe_ctl->found_offset);
 			btrfs_add_free_space_unused(block_group,
 					ffe_ctl->found_offset,
 					ffe_ctl->search_start - ffe_ctl->found_offset);
+		}
 
 		ret = btrfs_add_reserved_bytes(block_group, ffe_ctl->ram_bytes,
 					       ffe_ctl->num_bytes,
 					       ffe_ctl->delalloc,
 					       ffe_ctl->loop >= LOOP_WRONG_SIZE_CLASS);
 		if (ret == -EAGAIN) {
+			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
+				FFE_SKIP_BG_ADD_RESERVED_FAILED, -EAGAIN);
 			btrfs_add_free_space_unused(block_group,
 					ffe_ctl->found_offset,
 					ffe_ctl->num_bytes);
diff --git a/fs/btrfs/extent-tree.h b/fs/btrfs/extent-tree.h
index e970ac42a871..4f1dc077b7c9 100644
--- a/fs/btrfs/extent-tree.h
+++ b/fs/btrfs/extent-tree.h
@@ -23,6 +23,23 @@ enum btrfs_extent_allocation_policy {
 	BTRFS_EXTENT_ALLOC_ZONED,
 };
 
+/*
+ * Enum for find_free_extent skip reasons used in trace events.
+ * Each enum corresponds to a specific unhappy path in the allocator.
+ */
+enum {
+	FFE_SKIP_BG_REMOVING,
+	FFE_SKIP_BG_READ_ONLY,
+	FFE_SKIP_BG_WRONG_SPACE_INFO,
+	FFE_SKIP_BG_WRONG_FLAGS,
+	FFE_SKIP_BG_NOT_CACHED,
+	FFE_SKIP_BG_CACHE_ERROR,
+	FFE_SKIP_BG_SIZE_CLASS_MISMATCH,
+	FFE_SKIP_BG_DO_ALLOCATION_FAILED,
+	FFE_SKIP_BG_SEARCH_BOUNDS,
+	FFE_SKIP_BG_ADD_RESERVED_FAILED,
+};
+
 struct find_free_extent_ctl {
 	/* Basic allocation info */
 	u64 ram_bytes;
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 7e418f065b94..72aa250983d4 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -103,6 +103,19 @@ struct find_free_extent_ctl;
 	EM( COMMIT_TRANS,		"COMMIT_TRANS")			\
 	EMe(RESET_ZONES,		"RESET_ZONES")
 
+#define FIND_FREE_EXTENT_SKIP_REASONS						\
+	EM( FFE_SKIP_BG_REMOVING,		"BG_REMOVING")			\
+	EM( FFE_SKIP_BG_READ_ONLY,		"BG_READ_ONLY")			\
+	EM( FFE_SKIP_BG_WRONG_SPACE_INFO,	"BG_WRONG_SPACE_INFO")		\
+	EM( FFE_SKIP_BG_WRONG_FLAGS,		"BG_WRONG_FLAGS")		\
+	EM( FFE_SKIP_BG_NOT_CACHED,		"BG_NOT_CACHED")		\
+	EM( FFE_SKIP_BG_CACHE_ERROR,		"BG_CACHE_ERROR")		\
+	EM( FFE_SKIP_BG_SIZE_CLASS_MISMATCH,	"BG_SIZE_CLASS_MISMATCH")	\
+	EM( FFE_SKIP_BG_DO_ALLOCATION_FAILED,	"BG_DO_ALLOCATION_FAILED")	\
+	EM( FFE_SKIP_BG_SEARCH_BOUNDS,		"BG_SEARCH_BOUNDS")		\
+	EMe(FFE_SKIP_BG_ADD_RESERVED_FAILED,	"BG_ADD_RESERVED_FAILED")
+
+
 /*
  * First define the enums in the above macros to be exported to userspace via
  * TRACE_DEFINE_ENUM().
@@ -118,6 +131,7 @@ FI_TYPES
 QGROUP_RSV_TYPES
 IO_TREE_OWNER
 FLUSH_STATES
+FIND_FREE_EXTENT_SKIP_REASONS
 
 /*
  * Now redefine the EM and EMe macros to map the enums to the strings that will
@@ -1388,6 +1402,58 @@ DEFINE_EVENT(btrfs__reserve_extent, btrfs_reserve_extent_cluster,
 	TP_ARGS(block_group, ffe_ctl)
 );
 
+TRACE_EVENT(btrfs_find_free_extent_skip_block_group,
+
+	TP_PROTO(const struct btrfs_root *root,
+		 const struct find_free_extent_ctl *ffe_ctl,
+		 const struct btrfs_block_group *block_group,
+		 int reason,
+		 s64 error_or_value),
+
+	TP_ARGS(root, ffe_ctl, block_group, reason, error_or_value),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	root_objectid		)
+		__field(	u64,	num_bytes		)
+		__field(	u64,	empty_size		)
+		__field(	u64,	flags			)
+		__field(	int,	loop			)
+		__field(	bool,	hinted			)
+		__field(	int,	size_class		)
+		__field(	u64,	bg_start		)
+		__field(	u64,	bg_length		)
+		__field(	u64,	bg_flags		)
+		__field(	int,	reason			)
+		__field(	s64,	error_or_value		)
+	),
+
+	TP_fast_assign_btrfs(root->fs_info,
+		__entry->root_objectid	= btrfs_root_id(root);
+		__entry->num_bytes	= ffe_ctl->num_bytes;
+		__entry->empty_size	= ffe_ctl->empty_size;
+		__entry->flags		= ffe_ctl->flags;
+		__entry->loop		= ffe_ctl->loop;
+		__entry->hinted		= ffe_ctl->hinted;
+		__entry->size_class	= ffe_ctl->size_class;
+		__entry->bg_start	= block_group ? block_group->start : 0;
+		__entry->bg_length	= block_group ? block_group->length : 0;
+		__entry->bg_flags	= block_group ? block_group->flags : 0;
+		__entry->reason		= reason;
+		__entry->error_or_value	= error_or_value;
+	),
+
+	TP_printk_btrfs(
+"root=%llu(%s) len=%llu empty_size=%llu flags=%llu(%s) loop=%d hinted=%d size_class=%d bg=[%llu+%llu] bg_flags=%llu(%s) reason=%s error_or_value=%lld",
+		show_root_type(__entry->root_objectid),
+		__entry->num_bytes, __entry->empty_size, __entry->flags,
+		__print_flags((unsigned long)__entry->flags, "|", BTRFS_GROUP_FLAGS),
+		__entry->loop, __entry->hinted, __entry->size_class,
+		__entry->bg_start, __entry->bg_length, __entry->bg_flags,
+		__print_flags((unsigned long)__entry->bg_flags, "|", BTRFS_GROUP_FLAGS),
+		__print_symbolic(__entry->reason, FIND_FREE_EXTENT_SKIP_REASONS),
+		__entry->error_or_value)
+);
+
 TRACE_EVENT(btrfs_find_cluster,
 
 	TP_PROTO(const struct btrfs_block_group *block_group, u64 start,
-- 
2.47.3
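For anyone wanting to try the new trace point, it would presumably be
switched on through tracefs like any other btrfs event. A minimal
sketch follows; set_ffe_skip_tracing is a hypothetical helper, the
event name is taken from the TRACE_EVENT() above, and the default
mount point /sys/kernel/tracing is an assumption about the system.

```bash
#!/bin/bash
# Hypothetical helper (not part of the patch): turn the new event on (1)
# or off (0) through tracefs.
#   $1: value to write, 1 or 0
#   $2: tracefs mount point (assumed default: /sys/kernel/tracing)
set_ffe_skip_tracing() {
	local value=$1
	local tracefs=${2:-/sys/kernel/tracing}
	# Event path is <system>/<event>; the system name for this header is btrfs.
	local event=btrfs/btrfs_find_free_extent_skip_block_group

	echo "$value" > "$tracefs/events/$event/enable"
}
```

Run as root on a kernel carrying this patch, "set_ffe_skip_tracing 1"
followed by reading trace_pipe in the same tracefs directory should
show one line per skipped block group.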
* Re: [PATCH v2 2/3] btrfs: add tracing for find_free_extent skip conditions
@ 2025-10-15 3:28 Boris Burkov
From: Boris Burkov @ 2025-10-15 3:28 UTC (permalink / raw)
To: Leo Martins; +Cc: linux-btrfs, kernel-team, fstests

On Fri, Oct 03, 2025 at 04:41:58PM -0700, Leo Martins wrote:
> Add detailed tracing to the find_free_extent() function to improve
> observability of extent allocation behavior. This patch introduces:
>
> - A new trace event btrfs_find_free_extent_skip_block_group() that
>   captures allocation context and skip reasons
> - Comprehensive set of FFE_SKIP_BG_* constants defining specific
>   reasons why block groups are skipped during allocation
> - Trace points at all major skip conditions in the allocator loop
>
> The trace event includes key allocation parameters (root, size, flags,
> loop iteration) and block group details (start, length, flags) along
> with the specific skip reason and associated error codes.
>
> These trace points will help diagnose allocation performance
> issues, understand allocation patterns, and debug extent allocator
> behavior.
>

One nit inline, but this looks great, thanks.
Reviewed-by: Boris Burkov <boris@bur.io>

> Signed-off-by: Leo Martins <loemra.dev@gmail.com>
> ---
>  fs/btrfs/extent-tree.c       | 38 +++++++++++++++++++--
>  fs/btrfs/extent-tree.h       | 17 ++++++++++
>  include/trace/events/btrfs.h | 66 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 118 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 28b442660014..3b6d57d39bd5 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -4455,6 +4455,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  				 * target because our list pointers are not
>  				 * valid
>  				 */
> +				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +					FFE_SKIP_BG_REMOVING, 0);
>  				btrfs_put_block_group(block_group);
>  				up_read(&space_info->groups_sem);
>  			} else {
> @@ -4466,6 +4468,15 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  				goto have_block_group;
>  			}
>  		} else if (block_group) {
> +			if (!block_group_bits(block_group, ffe_ctl->flags))
> +				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +					FFE_SKIP_BG_WRONG_FLAGS, block_group->flags & ~ffe_ctl->flags);
> +			else if (block_group->space_info != space_info)
> +				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +					FFE_SKIP_BG_WRONG_SPACE_INFO, 0);
> +			else if (block_group->cached == BTRFS_CACHE_NO)
> +				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +					FFE_SKIP_BG_NOT_CACHED, 0);

nit: I think this would look nicer with the code/data selected by the ifs
and then a single trace call.

>  			btrfs_put_block_group(block_group);
>  		}
>  	}
> @@ -4481,6 +4492,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  		ffe_ctl->hinted = false;
>  		/* If the block group is read-only, we can skip it entirely. */
>  		if (unlikely(block_group->ro)) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_READ_ONLY, 0);
>  			if (ffe_ctl->for_treelog)
>  				btrfs_clear_treelog_bg(block_group);
>  			if (ffe_ctl->for_data_reloc)
> @@ -4492,6 +4505,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  		ffe_ctl->search_start = block_group->start;
>
>  		if (!block_group_bits(block_group, ffe_ctl->flags)) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_WRONG_FLAGS, block_group->flags & ~ffe_ctl->flags);
>  			btrfs_release_block_group(block_group, ffe_ctl->delalloc);
>  			continue;
>  		}
> @@ -4511,6 +4526,8 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  			 * error that caused problems, not ENOSPC.
>  			 */
>  			if (ret < 0) {
> +				trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +					FFE_SKIP_BG_CACHE_ERROR, ret);
>  				if (!cache_block_group_error)
>  					cache_block_group_error = ret;
>  				ret = 0;
> @@ -4520,18 +4537,26 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  		}
>
>  		if (unlikely(block_group->cached == BTRFS_CACHE_ERROR)) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_CACHE_ERROR, -EIO);
>  			if (!cache_block_group_error)
>  				cache_block_group_error = -EIO;
>  			goto loop;
>  		}
>
> -		if (!find_free_extent_check_size_class(ffe_ctl, block_group))
> +		if (!find_free_extent_check_size_class(ffe_ctl, block_group)) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_SIZE_CLASS_MISMATCH, block_group->size_class);
>  			goto loop;
> +		}
>
>  		bg_ret = NULL;
>  		ret = do_allocation(block_group, ffe_ctl, &bg_ret);
> -		if (ret > 0)
> +		if (ret > 0) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_DO_ALLOCATION_FAILED, ret);
>  			goto loop;
> +		}
>
>  		if (bg_ret && bg_ret != block_group) {
>  			btrfs_release_block_group(block_group, ffe_ctl->delalloc);
> @@ -4545,22 +4570,29 @@ static noinline int find_free_extent(struct btrfs_root *root,
>  		/* move on to the next group */
>  		if (ffe_ctl->search_start + ffe_ctl->num_bytes >
>  		    block_group->start + block_group->length) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_SEARCH_BOUNDS, block_group->start + block_group->length);
>  			btrfs_add_free_space_unused(block_group,
>  					    ffe_ctl->found_offset,
>  					    ffe_ctl->num_bytes);
>  			goto loop;
>  		}
>
> -		if (ffe_ctl->found_offset < ffe_ctl->search_start)
> +		if (ffe_ctl->found_offset < ffe_ctl->search_start) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_SEARCH_BOUNDS, ffe_ctl->found_offset);
>  			btrfs_add_free_space_unused(block_group,
>  					ffe_ctl->found_offset,
>  					ffe_ctl->search_start - ffe_ctl->found_offset);
> +		}
>
>  		ret = btrfs_add_reserved_bytes(block_group, ffe_ctl->ram_bytes,
>  					       ffe_ctl->num_bytes,
>  					       ffe_ctl->delalloc,
>  					       ffe_ctl->loop >= LOOP_WRONG_SIZE_CLASS);
>  		if (ret == -EAGAIN) {
> +			trace_btrfs_find_free_extent_skip_block_group(root, ffe_ctl, block_group,
> +				FFE_SKIP_BG_ADD_RESERVED_FAILED, -EAGAIN);
>  			btrfs_add_free_space_unused(block_group,
>  					ffe_ctl->found_offset,
>  					ffe_ctl->num_bytes);
> diff --git a/fs/btrfs/extent-tree.h b/fs/btrfs/extent-tree.h
> index e970ac42a871..4f1dc077b7c9 100644
> --- a/fs/btrfs/extent-tree.h
> +++ b/fs/btrfs/extent-tree.h
> @@ -23,6 +23,23 @@ enum btrfs_extent_allocation_policy {
>  	BTRFS_EXTENT_ALLOC_ZONED,
>  };
>
> +/*
> + * Enum for find_free_extent skip reasons used in trace events.
> + * Each enum corresponds to a specific unhappy path in the allocator.
> + */
> +enum {
> +	FFE_SKIP_BG_REMOVING,
> +	FFE_SKIP_BG_READ_ONLY,
> +	FFE_SKIP_BG_WRONG_SPACE_INFO,
> +	FFE_SKIP_BG_WRONG_FLAGS,
> +	FFE_SKIP_BG_NOT_CACHED,
> +	FFE_SKIP_BG_CACHE_ERROR,
> +	FFE_SKIP_BG_SIZE_CLASS_MISMATCH,
> +	FFE_SKIP_BG_DO_ALLOCATION_FAILED,
> +	FFE_SKIP_BG_SEARCH_BOUNDS,
> +	FFE_SKIP_BG_ADD_RESERVED_FAILED,
> +};
> +
>  struct find_free_extent_ctl {
>  	/* Basic allocation info */
>  	u64 ram_bytes;
> diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
> index 7e418f065b94..72aa250983d4 100644
> --- a/include/trace/events/btrfs.h
> +++ b/include/trace/events/btrfs.h
> @@ -103,6 +103,19 @@ struct find_free_extent_ctl;
>  	EM( COMMIT_TRANS,		"COMMIT_TRANS")			\
>  	EMe(RESET_ZONES,		"RESET_ZONES")
>
> +#define FIND_FREE_EXTENT_SKIP_REASONS						\
> +	EM( FFE_SKIP_BG_REMOVING,		"BG_REMOVING")			\
> +	EM( FFE_SKIP_BG_READ_ONLY,		"BG_READ_ONLY")			\
> +	EM( FFE_SKIP_BG_WRONG_SPACE_INFO,	"BG_WRONG_SPACE_INFO")		\
> +	EM( FFE_SKIP_BG_WRONG_FLAGS,		"BG_WRONG_FLAGS")		\
> +	EM( FFE_SKIP_BG_NOT_CACHED,		"BG_NOT_CACHED")		\
> +	EM( FFE_SKIP_BG_CACHE_ERROR,		"BG_CACHE_ERROR")		\
> +	EM( FFE_SKIP_BG_SIZE_CLASS_MISMATCH,	"BG_SIZE_CLASS_MISMATCH")	\
> +	EM( FFE_SKIP_BG_DO_ALLOCATION_FAILED,	"BG_DO_ALLOCATION_FAILED")	\
> +	EM( FFE_SKIP_BG_SEARCH_BOUNDS,		"BG_SEARCH_BOUNDS")		\
> +	EMe(FFE_SKIP_BG_ADD_RESERVED_FAILED,	"BG_ADD_RESERVED_FAILED")
> +
> +
>  /*
>   * First define the enums in the above macros to be exported to userspace via
>   * TRACE_DEFINE_ENUM().
> @@ -118,6 +131,7 @@ FI_TYPES
>  QGROUP_RSV_TYPES
>  IO_TREE_OWNER
>  FLUSH_STATES
> +FIND_FREE_EXTENT_SKIP_REASONS
>
>  /*
>   * Now redefine the EM and EMe macros to map the enums to the strings that will
> @@ -1388,6 +1402,58 @@ DEFINE_EVENT(btrfs__reserve_extent, btrfs_reserve_extent_cluster,
>  	TP_ARGS(block_group, ffe_ctl)
>  );
>
> +TRACE_EVENT(btrfs_find_free_extent_skip_block_group,
> +
> +	TP_PROTO(const struct btrfs_root *root,
> +		 const struct find_free_extent_ctl *ffe_ctl,
> +		 const struct btrfs_block_group *block_group,
> +		 int reason,
> +		 s64 error_or_value),
> +
> +	TP_ARGS(root, ffe_ctl, block_group, reason, error_or_value),
> +
> +	TP_STRUCT__entry_btrfs(
> +		__field(	u64,	root_objectid		)
> +		__field(	u64,	num_bytes		)
> +		__field(	u64,	empty_size		)
> +		__field(	u64,	flags			)
> +		__field(	int,	loop			)
> +		__field(	bool,	hinted			)
> +		__field(	int,	size_class		)
> +		__field(	u64,	bg_start		)
> +		__field(	u64,	bg_length		)
> +		__field(	u64,	bg_flags		)
> +		__field(	int,	reason			)
> +		__field(	s64,	error_or_value		)
> +	),
> +
> +	TP_fast_assign_btrfs(root->fs_info,
> +		__entry->root_objectid	= btrfs_root_id(root);
> +		__entry->num_bytes	= ffe_ctl->num_bytes;
> +		__entry->empty_size	= ffe_ctl->empty_size;
> +		__entry->flags		= ffe_ctl->flags;
> +		__entry->loop		= ffe_ctl->loop;
> +		__entry->hinted		= ffe_ctl->hinted;
> +		__entry->size_class	= ffe_ctl->size_class;
> +		__entry->bg_start	= block_group ? block_group->start : 0;
> +		__entry->bg_length	= block_group ? block_group->length : 0;
> +		__entry->bg_flags	= block_group ? block_group->flags : 0;
> +		__entry->reason		= reason;
> +		__entry->error_or_value	= error_or_value;
> +	),
> +
> +	TP_printk_btrfs(
> +"root=%llu(%s) len=%llu empty_size=%llu flags=%llu(%s) loop=%d hinted=%d size_class=%d bg=[%llu+%llu] bg_flags=%llu(%s) reason=%s error_or_value=%lld",
> +		show_root_type(__entry->root_objectid),
> +		__entry->num_bytes, __entry->empty_size, __entry->flags,
> +		__print_flags((unsigned long)__entry->flags, "|", BTRFS_GROUP_FLAGS),
> +		__entry->loop, __entry->hinted, __entry->size_class,
> +		__entry->bg_start, __entry->bg_length, __entry->bg_flags,
> +		__print_flags((unsigned long)__entry->bg_flags, "|", BTRFS_GROUP_FLAGS),
> +		__print_symbolic(__entry->reason, FIND_FREE_EXTENT_SKIP_REASONS),
> +		__entry->error_or_value)
> +);
> +
>  TRACE_EVENT(btrfs_find_cluster,
>
>  	TP_PROTO(const struct btrfs_block_group *block_group, u64 start,
> --
> 2.47.3
>
* [PATCH v2 3/3] fstests: btrfs: test RAID conversions under stress
2025-10-03 23:41 [PATCH v2 0/3] btrfs: find_free_extent cleanups Leo Martins
2025-10-03 23:41 ` [PATCH v2 1/3] btrfs: remove ffe RAID loop Leo Martins
2025-10-03 23:41 ` [PATCH v2 2/3] btrfs: add tracing for find_free_extent skip conditions Leo Martins
@ 2025-10-03 23:41 ` Leo Martins
2025-10-04 1:54 ` Qu Wenruo
2025-10-15 3:31 ` Boris Burkov
2 siblings, 2 replies; 10+ messages in thread
From: Leo Martins @ 2025-10-03 23:41 UTC
To: linux-btrfs, kernel-team, fstests

Add test to test btrfs conversion while being stressed. This is
important since btrfs no longer allows allocating from different RAID
block_groups during conversions meaning there may be added enospc
pressure.

Signed-off-by: Leo Martins <loemra.dev@gmail.com>
---
 tests/btrfs/337     | 95 +++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/337.out |  2 +
 2 files changed, 97 insertions(+)
 create mode 100755 tests/btrfs/337
 create mode 100644 tests/btrfs/337.out

diff --git a/tests/btrfs/337 b/tests/btrfs/337
new file mode 100755
index 00000000..fa335ed7
--- /dev/null
+++ b/tests/btrfs/337
@@ -0,0 +1,95 @@
+#! /bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2025 Meta Platforms, Inc. All Rights Reserved.
+#
+# FS QA Test btrfs/337
+#
+# Test RAID profile conversion with concurrent allocations.
+# This combines profile conversion (like btrfs/195) with concurrent
+# fsstress allocations (like btrfs/060-064).
+
+. ./common/preamble
+_begin_fstest auto volume balance scrub raid
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+	_kill_fsstress
+}
+
+. ./common/filter
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+# Zoned btrfs only supports SINGLE profile
+_require_non_zoned_device "${SCRATCH_DEV}"
+
+# Load up the available configs
+_btrfs_get_profile_configs
+declare -a TEST_VECTORS=(
+# $nr_dev_min:$data:$metadata:$data_convert:$metadata_convert
+"4:single:raid1"
+"4:single:raid0"
+"4:single:raid10"
+"4:single:dup"
+"4:single:raid5"
+"4:single:raid6"
+"2:raid1:single"
+)
+
+run_testcase() {
+	IFS=':' read -ra args <<< $1
+	num_disks=${args[0]}
+	src_type=${args[1]}
+	dst_type=${args[2]}
+
+	if [[ ! "${_btrfs_profile_configs[@]}" =~ "$dst_type" ]]; then
+		echo "=== Skipping test: $1 ===" >> $seqres.full
+		return
+	fi
+
+	_scratch_dev_pool_get $num_disks
+
+	echo "=== Running test: $1 (converting $src_type -> $dst_type) ===" >> $seqres.full
+
+	_scratch_pool_mkfs -d$src_type -m$src_type >> $seqres.full 2>&1
+	_scratch_mount
+
+	echo "Creating initial data..." >> $seqres.full
+	_run_fsstress -d $SCRATCH_MNT -w -n 10000 >> $seqres.full 2>&1
+
+	args=`_scale_fsstress_args -p 20 -n 1000 -d $SCRATCH_MNT/stressdir`
+	echo "Starting fsstress: $args" >> $seqres.full
+	_run_fsstress_bg $args
+
+	echo "Starting conversion: $src_type -> $dst_type" >> $seqres.full
+	_run_btrfs_balance_start -f -dconvert=$dst_type $SCRATCH_MNT >> $seqres.full
+	[ $? -eq 0 ] || echo "$1: Failed convert"
+
+	echo "Waiting for fsstress to complete..." >> $seqres.full
+	_wait_for_fsstress
+
+	# Verify the conversion was successful
+	echo "Checking filesystem profile after conversion..." >> $seqres.full
+	$BTRFS_UTIL_PROG filesystem df $SCRATCH_MNT >> $seqres.full
+
+	# Scrub to verify data integrity
+	echo "Scrubbing filesystem..." >> $seqres.full
+	$BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >> $seqres.full 2>&1
+	if [ $? -ne 0 ]; then
+		echo "$1: Scrub found errors"
+	fi
+
+	_scratch_unmount
+	_check_scratch_fs
+	_scratch_dev_pool_put
+}
+
+echo "Silence is golden"
+for i in "${TEST_VECTORS[@]}"; do
+	run_testcase $i
+done
+
+status=0
+exit
diff --git a/tests/btrfs/337.out b/tests/btrfs/337.out
new file mode 100644
index 00000000..d80a9830
--- /dev/null
+++ b/tests/btrfs/337.out
@@ -0,0 +1,2 @@
+QA output created by 337
+Silence is golden
-- 
2.47.3
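For readers following the test logic, here is a minimal standalone sketch of how a colon-separated test vector such as "4:single:raid10" can be split into its fields, together with an exact-word profile check. The helper names parse_vector and has_profile are hypothetical and are not part of the patch. The sketch also assumes that each entry of the profile list looks like "-m raid10 -d raid10", which the patch does not show. Under that assumption, a `=~` substring match would treat "raid1" as present whenever "raid10" is, while a word-by-word comparison would not.

```shell
#!/bin/bash
# Hypothetical helper (not from the patch): split "ndevs:src:dst" on ':'
# and print the three fields separated by spaces.
parse_vector() {
	local IFS=':'
	local -a fields
	read -ra fields <<< "$1"
	echo "${fields[0]} ${fields[1]} ${fields[2]}"
}

# Hypothetical helper (not from the patch): succeed only if the wanted
# profile appears as a whole word in one of the given config strings.
# Assumes config strings of the form "-m <profile> -d <profile>".
has_profile() {
	local want=$1
	shift
	local cfg word
	for cfg in "$@"; do
		for word in $cfg; do
			[ "$word" = "$want" ] && return 0
		done
	done
	return 1
}

# Usage: print the parsed fields of two vectors taken from the test.
for v in "4:single:raid1" "2:raid1:single"; do
	parse_vector "$v"
done
```

Whether the substring behaviour matters in practice depends on which profiles a given setup enables, so this is an illustration of the parsing technique rather than a suggested change to the test.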
* Re: [PATCH v2 3/3] fstests: btrfs: test RAID conversions under stress
2025-10-03 23:41 ` [PATCH v2 3/3] fstests: btrfs: test RAID conversions under stress Leo Martins
@ 2025-10-04 1:54 ` Qu Wenruo
2025-10-06 17:37 ` Leo Martins
2025-10-15 3:31 ` Boris Burkov
1 sibling, 1 reply; 10+ messages in thread
From: Qu Wenruo @ 2025-10-04 1:54 UTC
To: Leo Martins, linux-btrfs, kernel-team, fstests

On 2025/10/4 09:11, Leo Martins wrote:
> Add test to test btrfs conversion while being stressed. This is
> important since btrfs no longer allows allocating from different RAID
> block_groups during conversions meaning there may be added enospc
> pressure.
>
> Signed-off-by: Leo Martins <loemra.dev@gmail.com>

Please do not mix patches for different projects into the same patchset.

A lot of us are using b4 to merge (kernel/progs) patches, which will
merge the whole series, including the one intended to fstests.

Thanks,
Qu
* Re: [PATCH v2 3/3] fstests: btrfs: test RAID conversions under stress
2025-10-04 1:54 ` Qu Wenruo
@ 2025-10-06 17:37 ` Leo Martins
2025-10-06 18:16 ` David Sterba
0 siblings, 1 reply; 10+ messages in thread
From: Leo Martins @ 2025-10-06 17:37 UTC
To: Qu Wenruo; +Cc: linux-btrfs, kernel-team, fstests

On Sat, 4 Oct 2025 11:24:27 +0930 Qu Wenruo <wqu@suse.com> wrote:

> On 2025/10/4 09:11, Leo Martins wrote:
> > Add test to test btrfs conversion while being stressed. This is
> > important since btrfs no longer allows allocating from different RAID
> > block_groups during conversions meaning there may be added enospc
> > pressure.
> >
> > Signed-off-by: Leo Martins <loemra.dev@gmail.com>
>
> Please do not mix patches for different projects into the same patchset.
>
> A lot of us are using b4 to merge (kernel/progs) patches, which will
> merge the whole series, including the one intended to fstests.
>
> Thanks,
> Qu

Got it, sorry about that. Should I resend the patches?
* Re: [PATCH v2 3/3] fstests: btrfs: test RAID conversions under stress
2025-10-06 17:37 ` Leo Martins
@ 2025-10-06 18:16 ` David Sterba
0 siblings, 0 replies; 10+ messages in thread
From: David Sterba @ 2025-10-06 18:16 UTC
To: Leo Martins; +Cc: Qu Wenruo, linux-btrfs, kernel-team, fstests

On Mon, Oct 06, 2025 at 10:37:51AM -0700, Leo Martins wrote:
> On Sat, 4 Oct 2025 11:24:27 +0930 Qu Wenruo <wqu@suse.com> wrote:
>
> > On 2025/10/4 09:11, Leo Martins wrote:
> > > Add test to test btrfs conversion while being stressed. This is
> > > important since btrfs no longer allows allocating from different RAID
> > > block_groups during conversions meaning there may be added enospc
> > > pressure.
> > >
> > > Signed-off-by: Leo Martins <loemra.dev@gmail.com>
> >
> > Please do not mix patches for different projects into the same patchset.
> >
> > A lot of us are using b4 to merge (kernel/progs) patches, which will
> > merge the whole series, including the one intended to fstests.
> >
> > Thanks,
> > Qu
>
> Got it, sorry about that. Should I resend the patches?

Please don't unless there are changes. We can always apply the patches
one by one, but more patch versions in the mailing list need more
tracking and sometimes discussions are torn.
* Re: [PATCH v2 3/3] fstests: btrfs: test RAID conversions under stress
2025-10-03 23:41 ` [PATCH v2 3/3] fstests: btrfs: test RAID conversions under stress Leo Martins
2025-10-04 1:54 ` Qu Wenruo
@ 2025-10-15 3:31 ` Boris Burkov
1 sibling, 0 replies; 10+ messages in thread
From: Boris Burkov @ 2025-10-15 3:31 UTC
To: Leo Martins; +Cc: linux-btrfs, kernel-team, fstests

On Fri, Oct 03, 2025 at 04:41:59PM -0700, Leo Martins wrote:
> Add test to test btrfs conversion while being stressed. This is
> important since btrfs no longer allows allocating from different RAID
> block_groups during conversions meaning there may be added enospc
> pressure.
>

Aside from the patch intermingling stuff, this test looks good to me,
thanks for adding it.

Reviewed-by: Boris Burkov <boris@bur.io>