* [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup
@ 2025-12-31 10:39 Sun YangKai
2025-12-31 10:39 ` [PATCH 1/7] btrfs: change block group reclaim_mark to bool Sun YangKai
` (7 more replies)
0 siblings, 8 replies; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai, Boris Burkov
This series eliminates wasteful periodic reclaim operations that kept running
even after a previous attempt had already failed to reclaim any new space, and
includes several preparatory cleanups.
Patches 1-6 are non-functional changes.
Patch 7 fixes the core issue; details are in its commit message.
Thanks.
CC: Boris Burkov <boris@bur.io>
Sun YangKai (7):
btrfs: change block group reclaim_mark to bool
btrfs: reorder btrfs_block_group members to reduce struct size
btrfs: use proper types for btrfs_block_group fields
btrfs: consolidate reclaim readiness checks in btrfs_should_reclaim()
btrfs: use u8 for reclaim threshold type
btrfs: clarify reclaim sweep control flow
btrfs: fix periodic reclaim condition
fs/btrfs/block-group.c | 29 ++++++++++-----------
fs/btrfs/block-group.h | 22 ++++++++++------
fs/btrfs/space-info.c | 58 ++++++++++++++++++++----------------------
fs/btrfs/space-info.h | 38 +++++++++++++++++----------
fs/btrfs/sysfs.c | 3 ++-
5 files changed, 81 insertions(+), 69 deletions(-)
--
2.51.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/7] btrfs: change block group reclaim_mark to bool
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
@ 2025-12-31 10:39 ` Sun YangKai
2025-12-31 10:39 ` [PATCH 2/7] btrfs: reorder btrfs_block_group members to reduce struct size Sun YangKai
` (6 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
The reclaim_mark field in struct btrfs_block_group was a u64 that was
incremented when marking block groups for reclaim during sweeping, but
the actual counter value was never used - only the zero/non-zero state
mattered for determining if a block group needed reclaim.
Convert it to a bool to properly reflect its usage and reduce memory
footprint by 8 bytes. Update assignments to use true/false instead of
increment and zero.
This reduces sizeof(struct btrfs_block_group) from 632 to 624 bytes.
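The reason only the zero/non-zero state matters can be seen in a small userspace model of the sweep. This is a minimal sketch, not kernel code: struct bg_model, note_allocation() and sweep_once() are hypothetical names, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct btrfs_block_group. */
struct bg_model {
	uint64_t length;
	uint64_t used;
	bool reclaim_mark;	/* set by a sweep, cleared by an allocation */
};

/* Models the allocation path: any allocation clears the mark. */
static void note_allocation(struct bg_model *bg, uint64_t num_bytes)
{
	assert(bg->used + num_bytes <= bg->length);
	bg->used += num_bytes;
	bg->reclaim_mark = false;
}

/*
 * Models one sweep: a group is selected only if it is below the threshold
 * and still carries the mark from the previous sweep. Every group is
 * (re)marked afterwards. Returns the number of groups selected.
 */
static size_t sweep_once(struct bg_model *bgs, size_t n, unsigned int thresh_pct)
{
	size_t selected = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t thresh = bgs[i].length * thresh_pct / 100;

		if (bgs[i].used < thresh && bgs[i].reclaim_mark)
			selected++;
		bgs[i].reclaim_mark = true;
	}
	return selected;
}
```

In this model the first sweep never selects anything, and a group that saw an allocation between two sweeps is skipped once, which is all the old counter was ever used for.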
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/block-group.c | 2 +-
fs/btrfs/block-group.h | 7 ++++++-
fs/btrfs/space-info.c | 2 +-
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index e417aba4c4c7..1dcc5dccef05 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -3726,7 +3726,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
old_val += num_bytes;
cache->used = old_val;
cache->reserved -= num_bytes;
- cache->reclaim_mark = 0;
+ cache->reclaim_mark = false;
space_info->bytes_reserved -= num_bytes;
space_info->bytes_used += num_bytes;
space_info->disk_used += num_bytes * factor;
diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index 5f933455118c..3b3c61b3af2c 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -251,6 +251,12 @@ struct btrfs_block_group {
/* Protected by @free_space_lock. */
bool using_free_space_bitmaps_cached;
+ /*
+ * Marks that this block group has not been used for allocation
+ * between two reclaim sweeps.
+ */
+ bool reclaim_mark;
+
/*
* Number of extents in this block group used for swap files.
* All accesses protected by the spinlock 'lock'.
@@ -270,7 +276,6 @@ struct btrfs_block_group {
struct work_struct zone_finish_work;
struct extent_buffer *last_eb;
enum btrfs_block_group_size_class size_class;
- u64 reclaim_mark;
};
static inline u64 btrfs_block_group_end(const struct btrfs_block_group *block_group)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 7b7b7255f7d8..41f1507f460f 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -2104,7 +2104,7 @@ static void do_reclaim_sweep(struct btrfs_space_info *space_info, int raid)
try_again = false;
reclaim = true;
}
- bg->reclaim_mark++;
+ bg->reclaim_mark = true;
spin_unlock(&bg->lock);
if (reclaim)
btrfs_mark_bg_to_reclaim(bg);
--
2.51.2
* [PATCH 2/7] btrfs: reorder btrfs_block_group members to reduce struct size
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
2025-12-31 10:39 ` [PATCH 1/7] btrfs: change block group reclaim_mark to bool Sun YangKai
@ 2025-12-31 10:39 ` Sun YangKai
2025-12-31 10:39 ` [PATCH 3/7] btrfs: use proper types for btrfs_block_group fields Sun YangKai
` (5 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
Reorder struct btrfs_block_group fields to improve packing and reduce
memory footprint from 624 to 600 bytes (24 bytes saved per instance).
Here's pahole output after this change:
struct btrfs_block_group {
struct btrfs_fs_info * fs_info; /* 0 8 */
struct btrfs_inode * inode; /* 8 8 */
u64 start; /* 16 8 */
u64 length; /* 24 8 */
u64 pinned; /* 32 8 */
u64 reserved; /* 40 8 */
u64 used; /* 48 8 */
u64 delalloc_bytes; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
u64 bytes_super; /* 64 8 */
u64 flags; /* 72 8 */
u64 cache_generation; /* 80 8 */
u64 global_root_id; /* 88 8 */
u64 commit_used; /* 96 8 */
u32 bitmap_high_thresh; /* 104 4 */
u32 bitmap_low_thresh; /* 108 4 */
struct rw_semaphore data_rwsem; /* 112 40 */
/* --- cacheline 2 boundary (128 bytes) was 24 bytes ago --- */
unsigned long full_stripe_len; /* 152 8 */
unsigned long runtime_flags; /* 160 8 */
spinlock_t lock; /* 168 4 */
unsigned int ro; /* 172 4 */
enum btrfs_disk_cache_state disk_cache_state; /* 176 4 */
enum btrfs_caching_type cached; /* 180 4 */
struct btrfs_caching_control * caching_ctl; /* 184 8 */
/* --- cacheline 3 boundary (192 bytes) --- */
struct btrfs_space_info * space_info; /* 192 8 */
struct btrfs_free_space_ctl * free_space_ctl; /* 200 8 */
struct rb_node cache_node; /* 208 24 */
struct list_head list; /* 232 16 */
struct list_head cluster_list; /* 248 16 */
/* --- cacheline 4 boundary (256 bytes) was 8 bytes ago --- */
struct list_head bg_list; /* 264 16 */
struct list_head ro_list; /* 280 16 */
refcount_t refs; /* 296 4 */
atomic_t frozen; /* 300 4 */
struct list_head discard_list; /* 304 16 */
/* --- cacheline 5 boundary (320 bytes) --- */
enum btrfs_discard_state discard_state; /* 320 4 */
int discard_index; /* 324 4 */
u64 discard_eligible_time; /* 328 8 */
u64 discard_cursor; /* 336 8 */
struct list_head dirty_list; /* 344 16 */
struct list_head io_list; /* 360 16 */
struct btrfs_io_ctl io_ctl; /* 376 72 */
/* --- cacheline 7 boundary (448 bytes) --- */
atomic_t reservations; /* 448 4 */
atomic_t nocow_writers; /* 452 4 */
struct mutex free_space_lock; /* 456 32 */
bool using_free_space_bitmaps; /* 488 1 */
bool using_free_space_bitmaps_cached; /* 489 1 */
bool reclaim_mark; /* 490 1 */
/* XXX 1 byte hole, try to pack */
int swap_extents; /* 492 4 */
u64 alloc_offset; /* 496 8 */
u64 zone_unusable; /* 504 8 */
/* --- cacheline 8 boundary (512 bytes) --- */
u64 zone_capacity; /* 512 8 */
u64 meta_write_pointer; /* 520 8 */
struct btrfs_chunk_map * physical_map; /* 528 8 */
struct list_head active_bg_list; /* 536 16 */
struct work_struct zone_finish_work; /* 552 32 */
/* --- cacheline 9 boundary (576 bytes) was 8 bytes ago --- */
struct extent_buffer * last_eb; /* 584 8 */
enum btrfs_block_group_size_class size_class; /* 592 4 */
/* size: 600, cachelines: 10, members: 56 */
/* sum members: 595, holes: 1, sum holes: 1 */
/* padding: 4 */
/* last cacheline: 24 bytes */
};
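The effect of member ordering on padding can be reproduced outside the kernel. The sketch below uses two hypothetical structs (not the real btrfs_block_group) with the same members in a different order; the sizes in the comments assume a typical LP64 ABI such as x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 4-byte members interleaved with 8-byte members. */
struct interleaved {
	void *ptr;
	uint32_t lock;		/* 4 bytes, then 4 bytes of padding on LP64 */
	uint64_t start;
	uint32_t refs;		/* 4 bytes, then 4 bytes of padding on LP64 */
	uint64_t length;
};

/* Same members, with the two 4-byte members placed next to each other. */
struct packed_pairs {
	void *ptr;
	uint64_t start;
	uint64_t length;
	uint32_t lock;		/* shares one 8-byte slot with refs */
	uint32_t refs;
};

/* Grouping same-sized members should never make the struct larger. */
static_assert(sizeof(struct packed_pairs) <= sizeof(struct interleaved),
	      "reordering must not grow the struct");

static size_t bytes_saved(void)
{
	return sizeof(struct interleaved) - sizeof(struct packed_pairs);
}
```

On x86-64 this is expected to save 8 bytes (40 versus 32). On ABIs that align 64-bit integers to 4 bytes the two layouts may end up the same size.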
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/block-group.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index 3b3c61b3af2c..88c2e3a0a5a0 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -118,7 +118,6 @@ struct btrfs_caching_control {
struct btrfs_block_group {
struct btrfs_fs_info *fs_info;
struct btrfs_inode *inode;
- spinlock_t lock;
u64 start;
u64 length;
u64 pinned;
@@ -159,6 +158,8 @@ struct btrfs_block_group {
unsigned long full_stripe_len;
unsigned long runtime_flags;
+ spinlock_t lock;
+
unsigned int ro;
int disk_cache_state;
@@ -178,8 +179,6 @@ struct btrfs_block_group {
/* For block groups in the same raid type */
struct list_head list;
- refcount_t refs;
-
/*
* List of struct btrfs_free_clusters for this block group.
* Today it will only have one thing on it, but that may change
@@ -199,6 +198,8 @@ struct btrfs_block_group {
/* For read-only block groups */
struct list_head ro_list;
+ refcount_t refs;
+
/*
* When non-zero it means the block group's logical address and its
* device extents can not be reused for future block group allocations
@@ -211,10 +212,10 @@ struct btrfs_block_group {
/* For discard operations */
struct list_head discard_list;
+ enum btrfs_discard_state discard_state;
int discard_index;
u64 discard_eligible_time;
u64 discard_cursor;
- enum btrfs_discard_state discard_state;
/* For dirty block groups */
struct list_head dirty_list;
--
2.51.2
* [PATCH 3/7] btrfs: use proper types for btrfs_block_group fields
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
2025-12-31 10:39 ` [PATCH 1/7] btrfs: change block group reclaim_mark to bool Sun YangKai
2025-12-31 10:39 ` [PATCH 2/7] btrfs: reorder btrfs_block_group members to reduce struct size Sun YangKai
@ 2025-12-31 10:39 ` Sun YangKai
2025-12-31 10:39 ` [PATCH 4/7] btrfs: consolidate reclaim readiness checks in btrfs_should_reclaim() Sun YangKai
` (4 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
Convert disk_cache_state and cached fields in struct btrfs_block_group
from int to their respective enum types (enum btrfs_disk_cache_state
and enum btrfs_caching_type). Also change btrfs_block_group_done() to
return bool instead of int since it returns a boolean comparison.
No functional changes intended.
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/block-group.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h
index 88c2e3a0a5a0..bbd9371e33fd 100644
--- a/fs/btrfs/block-group.h
+++ b/fs/btrfs/block-group.h
@@ -162,10 +162,10 @@ struct btrfs_block_group {
unsigned int ro;
- int disk_cache_state;
+ enum btrfs_disk_cache_state disk_cache_state;
/* Cache tracking stuff */
- int cached;
+ enum btrfs_caching_type cached;
struct btrfs_caching_control *caching_ctl;
struct btrfs_space_info *space_info;
@@ -383,7 +383,7 @@ static inline u64 btrfs_system_alloc_profile(struct btrfs_fs_info *fs_info)
return btrfs_get_alloc_profile(fs_info, BTRFS_BLOCK_GROUP_SYSTEM);
}
-static inline int btrfs_block_group_done(const struct btrfs_block_group *cache)
+static inline bool btrfs_block_group_done(const struct btrfs_block_group *cache)
{
smp_mb();
return cache->cached == BTRFS_CACHE_FINISHED ||
--
2.51.2
* [PATCH 4/7] btrfs: consolidate reclaim readiness checks in btrfs_should_reclaim()
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
` (2 preceding siblings ...)
2025-12-31 10:39 ` [PATCH 3/7] btrfs: use proper types for btrfs_block_group fields Sun YangKai
@ 2025-12-31 10:39 ` Sun YangKai
2025-12-31 10:39 ` [PATCH 5/7] btrfs: use u8 for reclaim threshold type Sun YangKai
` (3 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
Move the filesystem state validation from btrfs_reclaim_bgs_work() into
btrfs_should_reclaim() to centralize the reclaim eligibility logic.
Since btrfs_reclaim_bgs_work() is the only caller of
btrfs_should_reclaim(), there is no functional change.
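The consolidated predicate can be pictured with a flattened userspace model. This is only a sketch: struct fs_state_model and its fields are hypothetical stand-ins for the helpers named in the comments, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flattened view of the state the real checks consult. */
struct fs_state_model {
	bool open;			/* stands in for test_bit(BTRFS_FS_OPEN, ...) */
	bool closing;			/* stands in for btrfs_fs_closing() */
	bool zoned;			/* stands in for btrfs_is_zoned() */
	bool zoned_wants_reclaim;	/* stands in for btrfs_zoned_should_reclaim() */
};

/* All eligibility checks live in one place, in the same order as the patch. */
static bool should_reclaim_model(const struct fs_state_model *s)
{
	assert(s != NULL);
	if (!s->open)
		return false;
	if (s->closing)
		return false;
	if (s->zoned)
		return s->zoned_wants_reclaim;
	return true;
}
```

A caller then needs a single early return on !should_reclaim_model(), mirroring what remains in the work function after the move.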
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/block-group.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 1dcc5dccef05..944857bd6af6 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1804,6 +1804,12 @@ static int reclaim_bgs_cmp(void *unused, const struct list_head *a,
static inline bool btrfs_should_reclaim(const struct btrfs_fs_info *fs_info)
{
+ if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
+ return false;
+
+ if (btrfs_fs_closing(fs_info))
+ return false;
+
if (btrfs_is_zoned(fs_info))
return btrfs_zoned_should_reclaim(fs_info);
return true;
@@ -1838,12 +1844,6 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
struct btrfs_space_info *space_info;
LIST_HEAD(retry_list);
- if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
- return;
-
- if (btrfs_fs_closing(fs_info))
- return;
-
if (!btrfs_should_reclaim(fs_info))
return;
--
2.51.2
* [PATCH 5/7] btrfs: use u8 for reclaim threshold type
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
` (3 preceding siblings ...)
2025-12-31 10:39 ` [PATCH 4/7] btrfs: consolidate reclaim readiness checks in btrfs_should_reclaim() Sun YangKai
@ 2025-12-31 10:39 ` Sun YangKai
2025-12-31 10:39 ` [PATCH 6/7] btrfs: clarify reclaim sweep control flow Sun YangKai
` (2 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
The bg_reclaim_threshold field stores a percentage value (0-100), making
u8 a more appropriate type than int. Update the field and related
function return types:
- struct btrfs_space_info::bg_reclaim_threshold
- calc_dynamic_reclaim_threshold()
- btrfs_calc_reclaim_threshold()
The sysfs store function already validates the range is 0-100, ensuring
the cast to u8 is safe.
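Why the range check has to come before the narrowing can be shown in userspace. A minimal sketch, assuming hypothetical helpers store_threshold_model() and unchecked_narrow() that only imitate the shape of the sysfs store path:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the store path: validate, then narrow. */
static int store_threshold_model(uint8_t *dst, int thresh)
{
	assert(dst != NULL);
	if (thresh < 0 || thresh > 100)
		return -EINVAL;
	*dst = (uint8_t)thresh;	/* lossless: 0..100 fits in 8 bits */
	return 0;
}

/* What an unchecked narrowing would do: the value wraps modulo 256. */
static uint8_t unchecked_narrow(int thresh)
{
	return (uint8_t)thresh;
}
```

Without the check, an input of 300 would be stored as 44 and -1 as 255, so the validation is what makes the narrower field type safe.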
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/space-info.c | 6 +++---
fs/btrfs/space-info.h | 12 ++++++------
fs/btrfs/sysfs.c | 3 ++-
3 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 41f1507f460f..cf2c4b7105cf 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -2037,7 +2037,7 @@ static u64 calc_unalloc_target(struct btrfs_fs_info *fs_info)
* Typically with 10 block groups as the target, the discrete values this comes
* out to are 0, 10, 20, ... , 80, 90, and 99.
*/
-static int calc_dynamic_reclaim_threshold(const struct btrfs_space_info *space_info)
+static u8 calc_dynamic_reclaim_threshold(const struct btrfs_space_info *space_info)
{
struct btrfs_fs_info *fs_info = space_info->fs_info;
u64 unalloc = atomic64_read(&fs_info->free_chunk_space);
@@ -2052,11 +2052,11 @@ static int calc_dynamic_reclaim_threshold(const struct btrfs_space_info *space_i
if (unused < data_chunk_size)
return 0;
- /* Cast to int is OK because want <= target. */
+ /* Cast to u8 is OK because want <= target. */
return calc_pct_ratio(want, target);
}
-int btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info)
+u8 btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info)
{
lockdep_assert_held(&space_info->lock);
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 0703f24b23f7..a4fa04d10722 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -132,15 +132,15 @@ struct btrfs_space_info {
/* Chunk size in bytes */
u64 chunk_size;
+ int clamp; /* Used to scale our threshold for preemptive
+ flushing. The value is >> clamp, so turns
+ out to be a 2^clamp divisor. */
+
/*
* Once a block group drops below this threshold (percents) we'll
* schedule it for reclaim.
*/
- int bg_reclaim_threshold;
-
- int clamp; /* Used to scale our threshold for preemptive
- flushing. The value is >> clamp, so turns
- out to be a 2^clamp divisor. */
+ u8 bg_reclaim_threshold;
bool full; /* indicates that we cannot allocate any more
chunks for this space */
@@ -303,7 +303,7 @@ u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo);
void btrfs_space_info_update_reclaimable(struct btrfs_space_info *space_info, s64 bytes);
void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool ready);
-int btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info);
+u8 btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info);
void btrfs_reclaim_sweep(const struct btrfs_fs_info *fs_info);
void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len);
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index f0974f4c0ae4..468c6a9acd3b 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -937,7 +937,8 @@ static ssize_t btrfs_sinfo_bg_reclaim_threshold_store(struct kobject *kobj,
if (thresh < 0 || thresh > 100)
return -EINVAL;
- WRITE_ONCE(space_info->bg_reclaim_threshold, thresh);
+ /* Safe to cast to u8: thresh was checked to be between 0 and 100. */
+ WRITE_ONCE(space_info->bg_reclaim_threshold, (u8)thresh);
return len;
}
--
2.51.2
* [PATCH 6/7] btrfs: clarify reclaim sweep control flow
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
` (4 preceding siblings ...)
2025-12-31 10:39 ` [PATCH 5/7] btrfs: use u8 for reclaim threshold type Sun YangKai
@ 2025-12-31 10:39 ` Sun YangKai
2025-12-31 10:39 ` [PATCH 7/7] btrfs: fix periodic reclaim condition Sun YangKai
2026-01-01 0:13 ` [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Qu Wenruo
7 siblings, 0 replies; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
Replace the try_again flag with will_reclaim and adjust the control flow
to better reflect the intent of the logic. This makes the reclaim
decision easier to follow without changing behavior.
Also prepare for the next patch.
No functional change.
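The claim that the retry logic is unchanged can be checked with a small model that counts passes. This is a sketch under simplifying assumptions: selects[i] is a hypothetical input saying whether pass i would select any block group, and everything else in the sweep is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Old shape: try_again starts true and is cleared on selection or retry. */
static unsigned int passes_old(const bool selects[2], bool urgent)
{
	unsigned int passes = 0;
	bool try_again = true;

again:
	assert(passes < 2);
	if (selects[passes++])
		try_again = false;
	if (try_again && urgent) {
		try_again = false;
		goto again;
	}
	return passes;
}

/* New shape: will_reclaim records a selection, urgent is cleared on retry. */
static unsigned int passes_new(const bool selects[2], bool urgent)
{
	unsigned int passes = 0;
	bool will_reclaim = false;

again:
	assert(passes < 2);
	if (selects[passes++])
		will_reclaim = true;
	if (!will_reclaim && urgent) {
		urgent = false;
		goto again;
	}
	return passes;
}
```

For all eight combinations of inputs both variants make the same number of passes, and neither ever makes more than two. The new variant additionally leaves will_reclaim describing whether anything was selected, which the next patch relies on.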
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/space-info.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index cf2c4b7105cf..b6ec09aea64f 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -2083,7 +2083,7 @@ static void do_reclaim_sweep(struct btrfs_space_info *space_info, int raid)
{
struct btrfs_block_group *bg;
int thresh_pct;
- bool try_again = true;
+ bool will_reclaim = false;
bool urgent;
spin_lock(&space_info->lock);
@@ -2101,7 +2101,7 @@ static void do_reclaim_sweep(struct btrfs_space_info *space_info, int raid)
spin_lock(&bg->lock);
thresh = mult_perc(bg->length, thresh_pct);
if (bg->used < thresh && bg->reclaim_mark) {
- try_again = false;
+ will_reclaim = true;
reclaim = true;
}
bg->reclaim_mark = true;
@@ -2118,8 +2118,8 @@ static void do_reclaim_sweep(struct btrfs_space_info *space_info, int raid)
* If we have any staler groups, we don't touch the fresher ones, but if we
* really need a block group, do take a fresh one.
*/
- if (try_again && urgent) {
- try_again = false;
+ if (!will_reclaim && urgent) {
+ urgent = false;
goto again;
}
--
2.51.2
* [PATCH 7/7] btrfs: fix periodic reclaim condition
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
` (5 preceding siblings ...)
2025-12-31 10:39 ` [PATCH 6/7] btrfs: clarify reclaim sweep control flow Sun YangKai
@ 2025-12-31 10:39 ` Sun YangKai
2026-01-01 0:20 ` Qu Wenruo
2026-01-01 0:13 ` [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Qu Wenruo
7 siblings, 1 reply; 14+ messages in thread
From: Sun YangKai @ 2025-12-31 10:39 UTC (permalink / raw)
To: linux-btrfs; +Cc: Sun YangKai
Problems with the current implementation:
1. reclaimable_bytes is signed while chunk_sz is unsigned, causing
negative reclaimable_bytes to trigger reclaim unexpectedly
2. The "space must be freed between scans" assumption breaks the
two-scan requirement: first scan marks block groups, second scan
reclaims them. Without the second scan, no reclamation occurs.
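Problem 1 follows from C's usual arithmetic conversions and can be reproduced in userspace. The sketch below uses int64_t and uint64_t in place of s64 and u64; ready_mixed() and ready_sign_aware() are hypothetical names, and the second function is shown only for contrast, not as the fix chosen in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A negative 64-bit value becomes a very large unsigned value. */
static_assert((uint64_t)(int64_t)-1 == UINT64_MAX,
	      "negative values wrap to large unsigned values");

/*
 * Mirrors the shape of the removed check. The cast is written out here,
 * but C applies the same conversion implicitly when comparing a signed
 * 64-bit value with an unsigned 64-bit value.
 */
static bool ready_mixed(int64_t reclaimable_bytes, uint64_t chunk_sz)
{
	return (uint64_t)reclaimable_bytes >= chunk_sz;
}

/* A sign-aware comparison, for contrast only. */
static bool ready_sign_aware(int64_t reclaimable_bytes, uint64_t chunk_sz)
{
	return reclaimable_bytes >= 0 && (uint64_t)reclaimable_bytes >= chunk_sz;
}
```

With a 1 GiB chunk size, ready_mixed(-1, chunk_sz) is true while ready_sign_aware(-1, chunk_sz) is false, which is the unexpected trigger described in item 1.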
Instead, track actual reclaim progress: pause reclaim when block groups
will be reclaimed, and resume only when progress is made. This ensures
reclaim continues until no further progress can be made, then resumes when
space_info changes or new reclaimable groups appear.
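The pause/resume scheme described above can be sketched as a userspace state machine. This is a simplified model, not the kernel implementation: struct space_info_model and the *_model() helpers are hypothetical, locking is omitted, and the threshold is a plain field rather than a computed value. The two other resume triggers in the patch (a reclaim that shrinks total_bytes, and a free that makes a block group reclaimable) would simply call resume_model().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the fields the patch adds and consults. */
struct space_info_model {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint8_t threshold;		/* stands in for btrfs_calc_reclaim_threshold() */
	bool paused;
	uint8_t last_threshold;		/* snapshot taken when pausing */
	uint64_t last_unused;		/* snapshot taken when pausing */
};

static uint64_t unused_bytes(const struct space_info_model *si)
{
	assert(si->bytes_used <= si->total_bytes);
	return si->total_bytes - si->bytes_used;
}

static void resume_model(struct space_info_model *si)
{
	si->paused = false;
}

/* Called after a sweep that selected at least one block group. */
static void pause_model(struct space_info_model *si)
{
	if (si->paused)
		return;
	si->paused = true;
	si->last_threshold = si->threshold;
	si->last_unused = unused_bytes(si);
}

/* Decides whether the next periodic sweep should run. */
static bool should_sweep_model(struct space_info_model *si, uint64_t chunk_sz)
{
	bool ret;

	if (!si->paused)
		return true;

	/* Resume once a chunk's worth of space was freed or the threshold moved. */
	ret = unused_bytes(si) >= si->last_unused + chunk_sz ||
	      si->threshold != si->last_threshold;
	if (ret)
		resume_model(si);
	return ret;
}
```

In this model a paused space_info skips sweeps until unused space has grown by at least one chunk size since the pause, or until the threshold differs from the snapshot.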
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/block-group.c | 15 +++++++--------
fs/btrfs/space-info.c | 42 +++++++++++++++++++-----------------------
fs/btrfs/space-info.h | 26 ++++++++++++++++++--------
3 files changed, 44 insertions(+), 39 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 944857bd6af6..39e6f1bf3506 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1871,6 +1871,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
while (!list_empty(&fs_info->reclaim_bgs)) {
u64 used;
u64 reserved;
+ u64 old_total;
int ret = 0;
bg = list_first_entry(&fs_info->reclaim_bgs,
@@ -1936,6 +1937,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
}
spin_unlock(&bg->lock);
+ old_total = space_info->total_bytes;
spin_unlock(&space_info->lock);
/*
@@ -1988,14 +1990,14 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
reserved = 0;
spin_lock(&space_info->lock);
space_info->reclaim_errors++;
- if (READ_ONCE(space_info->periodic_reclaim))
- space_info->periodic_reclaim_ready = false;
spin_unlock(&space_info->lock);
}
spin_lock(&space_info->lock);
space_info->reclaim_count++;
space_info->reclaim_bytes += used;
space_info->reclaim_bytes += reserved;
+ if (space_info->total_bytes < old_total)
+ btrfs_resume_periodic_reclaim(space_info);
spin_unlock(&space_info->lock);
next:
@@ -3730,8 +3732,6 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
space_info->bytes_reserved -= num_bytes;
space_info->bytes_used += num_bytes;
space_info->disk_used += num_bytes * factor;
- if (READ_ONCE(space_info->periodic_reclaim))
- btrfs_space_info_update_reclaimable(space_info, -num_bytes);
spin_unlock(&cache->lock);
spin_unlock(&space_info->lock);
} else {
@@ -3741,12 +3741,11 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
btrfs_space_info_update_bytes_pinned(space_info, num_bytes);
space_info->bytes_used -= num_bytes;
space_info->disk_used -= num_bytes * factor;
- if (READ_ONCE(space_info->periodic_reclaim))
- btrfs_space_info_update_reclaimable(space_info, num_bytes);
- else
- reclaim = should_reclaim_block_group(cache, num_bytes);
+ reclaim = should_reclaim_block_group(cache, num_bytes);
spin_unlock(&cache->lock);
+ if (reclaim)
+ btrfs_resume_periodic_reclaim(space_info);
spin_unlock(&space_info->lock);
btrfs_set_extent_bit(&trans->transaction->pinned_extents, bytenr,
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index b6ec09aea64f..dce21809e640 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -2124,43 +2124,39 @@ static void do_reclaim_sweep(struct btrfs_space_info *space_info, int raid)
}
up_read(&space_info->groups_sem);
-}
-
-void btrfs_space_info_update_reclaimable(struct btrfs_space_info *space_info, s64 bytes)
-{
- u64 chunk_sz = calc_effective_data_chunk_size(space_info->fs_info);
-
- lockdep_assert_held(&space_info->lock);
- space_info->reclaimable_bytes += bytes;
- if (space_info->reclaimable_bytes >= chunk_sz)
- btrfs_set_periodic_reclaim_ready(space_info, true);
-}
-
-void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool ready)
-{
- lockdep_assert_held(&space_info->lock);
- if (!READ_ONCE(space_info->periodic_reclaim))
- return;
- if (ready != space_info->periodic_reclaim_ready) {
- space_info->periodic_reclaim_ready = ready;
- if (!ready)
- space_info->reclaimable_bytes = 0;
+ /*
+ * Temporarily pause periodic reclaim until reclaim makes some progress.
+ * This prevents periodic reclaim from running repeatedly without making progress.
+ */
+ if (will_reclaim) {
+ spin_lock(&space_info->lock);
+ btrfs_pause_periodic_reclaim(space_info);
+ spin_unlock(&space_info->lock);
}
}
static bool btrfs_should_periodic_reclaim(struct btrfs_space_info *space_info)
{
bool ret;
+ u64 chunk_sz;
+ u64 unused;
if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
return false;
if (!READ_ONCE(space_info->periodic_reclaim))
return false;
+ if (!READ_ONCE(space_info->periodic_reclaim_paused))
+ return true;
+
+ chunk_sz = calc_effective_data_chunk_size(space_info->fs_info);
spin_lock(&space_info->lock);
- ret = space_info->periodic_reclaim_ready;
- btrfs_set_periodic_reclaim_ready(space_info, false);
+ unused = space_info->total_bytes - space_info->bytes_used;
+ ret = (unused >= space_info->last_reclaim_unused + chunk_sz ||
+ btrfs_calc_reclaim_threshold(space_info) != space_info->last_reclaim_threshold);
+ if (ret)
+ btrfs_resume_periodic_reclaim(space_info);
spin_unlock(&space_info->lock);
return ret;
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index a4fa04d10722..2ebfe440837b 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -216,12 +216,9 @@ struct btrfs_space_info {
* Periodic reclaim should be a no-op if a space_info hasn't
* freed any space since the last time we tried.
*/
- bool periodic_reclaim_ready;
-
- /*
- * Net bytes freed or allocated since the last reclaim pass.
- */
- s64 reclaimable_bytes;
+ bool periodic_reclaim_paused;
+ u8 last_reclaim_threshold;
+ u64 last_reclaim_unused;
};
static inline bool btrfs_mixed_space_info(const struct btrfs_space_info *space_info)
@@ -301,9 +298,22 @@ void btrfs_dump_space_info_for_trans_abort(struct btrfs_fs_info *fs_info);
void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);
u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo);
-void btrfs_space_info_update_reclaimable(struct btrfs_space_info *space_info, s64 bytes);
-void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool ready);
u8 btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info);
+static inline void btrfs_resume_periodic_reclaim(struct btrfs_space_info *space_info)
+{
+ lockdep_assert_held(&space_info->lock);
+ if (space_info->periodic_reclaim_paused)
+ space_info->periodic_reclaim_paused = false;
+}
+static inline void btrfs_pause_periodic_reclaim(struct btrfs_space_info *space_info)
+{
+ lockdep_assert_held(&space_info->lock);
+ if (!space_info->periodic_reclaim_paused) {
+ space_info->periodic_reclaim_paused = true;
+ space_info->last_reclaim_threshold = btrfs_calc_reclaim_threshold(space_info);
+ space_info->last_reclaim_unused = space_info->total_bytes - space_info->bytes_used;
+ }
+}
void btrfs_reclaim_sweep(const struct btrfs_fs_info *fs_info);
void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len);
--
2.51.2
* Re: [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup
2025-12-31 10:39 [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Sun YangKai
` (6 preceding siblings ...)
2025-12-31 10:39 ` [PATCH 7/7] btrfs: fix periodic reclaim condition Sun YangKai
@ 2026-01-01 0:13 ` Qu Wenruo
2026-01-01 11:54 ` Sun Yangkai
7 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2026-01-01 0:13 UTC (permalink / raw)
To: Sun YangKai, linux-btrfs; +Cc: Boris Burkov
On 2025/12/31 21:09, Sun YangKai wrote:
> This series eliminates wasteful periodic reclaim operations that were occurring
> when already failed to reclaim any new space, and includes several preparatory
> cleanups.
>
> Patch 1-6 are non-functional changes.
>
> Patch 7 fixes the core issue, details are in the commit message.
Please put the fix first and the cleanups after it; this will make backporting much easier.
Thanks,
Qu
>
> Thanks.
>
> CC: Boris Burkov <boris@bur.io>
>
> Sun YangKai (7):
> btrfs: change block group reclaim_mark to bool
> btrfs: reorder btrfs_block_group members to reduce struct size
> btrfs: use proper types for btrfs_block_group fields
> btrfs: consolidate reclaim readiness checks in btrfs_should_reclaim()
> btrfs: use u8 for reclaim threshold type
> btrfs: clarify reclaim sweep control flow
> btrfs: fix periodic reclaim condition
>
> fs/btrfs/block-group.c | 29 ++++++++++-----------
> fs/btrfs/block-group.h | 22 ++++++++++------
> fs/btrfs/space-info.c | 58 ++++++++++++++++++++----------------------
> fs/btrfs/space-info.h | 38 +++++++++++++++++----------
> fs/btrfs/sysfs.c | 3 ++-
> 5 files changed, 81 insertions(+), 69 deletions(-)
>
* Re: [PATCH 7/7] btrfs: fix periodic reclaim condition
2025-12-31 10:39 ` [PATCH 7/7] btrfs: fix periodic reclaim condition Sun YangKai
@ 2026-01-01 0:20 ` Qu Wenruo
2026-01-01 11:44 ` Sun Yangkai
0 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2026-01-01 0:20 UTC (permalink / raw)
To: Sun YangKai, linux-btrfs
On 2025/12/31 21:09, Sun YangKai wrote:
> Problems with current implementation:
> 1. reclaimable_bytes is signed while chunk_sz is unsigned, causing
> negative reclaimable_bytes to trigger reclaim unexpectedly
> 2. The "space must be freed between scans" assumption breaks the
> two-scan requirement: first scan marks block groups, second scan
> reclaims them. Without the second scan, no reclamation occurs.
>
> Instead, track actual reclaim progress: pause reclaim when block groups
> will be reclaimed, and resume only when progress is made. This ensures
> reclaim continues until no further progress can be made, then resumes when
> space_info changes or new reclaimable groups appear.
>
> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
If this is a bug fix, as the title indicates, please add a proper "Fixes:"
tag.
Thanks,
Qu
> ---
> fs/btrfs/block-group.c | 15 +++++++--------
> fs/btrfs/space-info.c | 42 +++++++++++++++++++-----------------------
> fs/btrfs/space-info.h | 26 ++++++++++++++++++--------
> 3 files changed, 44 insertions(+), 39 deletions(-)
>
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index 944857bd6af6..39e6f1bf3506 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -1871,6 +1871,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> while (!list_empty(&fs_info->reclaim_bgs)) {
> u64 used;
> u64 reserved;
> + u64 old_total;
> int ret = 0;
>
> bg = list_first_entry(&fs_info->reclaim_bgs,
> @@ -1936,6 +1937,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> }
>
> spin_unlock(&bg->lock);
> + old_total = space_info->total_bytes;
> spin_unlock(&space_info->lock);
>
> /*
> @@ -1988,14 +1990,14 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
> reserved = 0;
> spin_lock(&space_info->lock);
> space_info->reclaim_errors++;
> - if (READ_ONCE(space_info->periodic_reclaim))
> - space_info->periodic_reclaim_ready = false;
> spin_unlock(&space_info->lock);
> }
> spin_lock(&space_info->lock);
> space_info->reclaim_count++;
> space_info->reclaim_bytes += used;
> space_info->reclaim_bytes += reserved;
> + if (space_info->total_bytes < old_total)
> + btrfs_resume_periodic_reclaim(space_info);
> spin_unlock(&space_info->lock);
>
> next:
> @@ -3730,8 +3732,6 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
> space_info->bytes_reserved -= num_bytes;
> space_info->bytes_used += num_bytes;
> space_info->disk_used += num_bytes * factor;
> - if (READ_ONCE(space_info->periodic_reclaim))
> - btrfs_space_info_update_reclaimable(space_info, -num_bytes);
> spin_unlock(&cache->lock);
> spin_unlock(&space_info->lock);
> } else {
> @@ -3741,12 +3741,11 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
> btrfs_space_info_update_bytes_pinned(space_info, num_bytes);
> space_info->bytes_used -= num_bytes;
> space_info->disk_used -= num_bytes * factor;
> - if (READ_ONCE(space_info->periodic_reclaim))
> - btrfs_space_info_update_reclaimable(space_info, num_bytes);
> - else
> - reclaim = should_reclaim_block_group(cache, num_bytes);
> + reclaim = should_reclaim_block_group(cache, num_bytes);
>
> spin_unlock(&cache->lock);
> + if (reclaim)
> + btrfs_resume_periodic_reclaim(space_info);
> spin_unlock(&space_info->lock);
>
> btrfs_set_extent_bit(&trans->transaction->pinned_extents, bytenr,
> diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
> index b6ec09aea64f..dce21809e640 100644
> --- a/fs/btrfs/space-info.c
> +++ b/fs/btrfs/space-info.c
> @@ -2124,43 +2124,39 @@ static void do_reclaim_sweep(struct btrfs_space_info *space_info, int raid)
> }
>
> up_read(&space_info->groups_sem);
> -}
> -
> -void btrfs_space_info_update_reclaimable(struct btrfs_space_info *space_info, s64 bytes)
> -{
> - u64 chunk_sz = calc_effective_data_chunk_size(space_info->fs_info);
> -
> - lockdep_assert_held(&space_info->lock);
> - space_info->reclaimable_bytes += bytes;
>
> - if (space_info->reclaimable_bytes >= chunk_sz)
> - btrfs_set_periodic_reclaim_ready(space_info, true);
> -}
> -
> -void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool ready)
> -{
> - lockdep_assert_held(&space_info->lock);
> - if (!READ_ONCE(space_info->periodic_reclaim))
> - return;
> - if (ready != space_info->periodic_reclaim_ready) {
> - space_info->periodic_reclaim_ready = ready;
> - if (!ready)
> - space_info->reclaimable_bytes = 0;
> + /*
> + * Temporarily pause periodic reclaim until reclaim makes some progress.
> + * This prevents periodic reclaim from running repeatedly without progress.
> + */
> + if (will_reclaim) {
> + spin_lock(&space_info->lock);
> + btrfs_pause_periodic_reclaim(space_info);
> + spin_unlock(&space_info->lock);
> }
> }
>
> static bool btrfs_should_periodic_reclaim(struct btrfs_space_info *space_info)
> {
> bool ret;
> + u64 chunk_sz;
> + u64 unused;
>
> if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
> return false;
> if (!READ_ONCE(space_info->periodic_reclaim))
> return false;
> + if (!READ_ONCE(space_info->periodic_reclaim_paused))
> + return true;
> +
> + chunk_sz = calc_effective_data_chunk_size(space_info->fs_info);
>
> spin_lock(&space_info->lock);
> - ret = space_info->periodic_reclaim_ready;
> - btrfs_set_periodic_reclaim_ready(space_info, false);
> + unused = space_info->total_bytes - space_info->bytes_used;
> + ret = (unused >= space_info->last_reclaim_unused + chunk_sz ||
> + btrfs_calc_reclaim_threshold(space_info) != space_info->last_reclaim_threshold);
> + if (ret)
> + btrfs_resume_periodic_reclaim(space_info);
> spin_unlock(&space_info->lock);
>
> return ret;
> diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
> index a4fa04d10722..2ebfe440837b 100644
> --- a/fs/btrfs/space-info.h
> +++ b/fs/btrfs/space-info.h
> @@ -216,12 +216,9 @@ struct btrfs_space_info {
> * Periodic reclaim should be a no-op if a space_info hasn't
> * freed any space since the last time we tried.
> */
> - bool periodic_reclaim_ready;
> -
> - /*
> - * Net bytes freed or allocated since the last reclaim pass.
> - */
> - s64 reclaimable_bytes;
> + bool periodic_reclaim_paused;
> + u8 last_reclaim_threshold;
> + u64 last_reclaim_unused;
> };
>
> static inline bool btrfs_mixed_space_info(const struct btrfs_space_info *space_info)
> @@ -301,9 +298,22 @@ void btrfs_dump_space_info_for_trans_abort(struct btrfs_fs_info *fs_info);
> void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);
> u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo);
>
> -void btrfs_space_info_update_reclaimable(struct btrfs_space_info *space_info, s64 bytes);
> -void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool ready);
> u8 btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info);
> +static inline void btrfs_resume_periodic_reclaim(struct btrfs_space_info *space_info)
> +{
> + lockdep_assert_held(&space_info->lock);
> + if (space_info->periodic_reclaim_paused)
> + space_info->periodic_reclaim_paused = false;
> +}
> +static inline void btrfs_pause_periodic_reclaim(struct btrfs_space_info *space_info)
> +{
> + lockdep_assert_held(&space_info->lock);
> + if (!space_info->periodic_reclaim_paused) {
> + space_info->periodic_reclaim_paused = true;
> + space_info->last_reclaim_threshold = btrfs_calc_reclaim_threshold(space_info);
> + space_info->last_reclaim_unused = space_info->total_bytes - space_info->bytes_used;
> + }
> +}
> void btrfs_reclaim_sweep(const struct btrfs_fs_info *fs_info);
> void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len);
>
* Re: [PATCH 7/7] btrfs: fix periodic reclaim condition
2026-01-01 0:20 ` Qu Wenruo
@ 2026-01-01 11:44 ` Sun Yangkai
0 siblings, 0 replies; 14+ messages in thread
From: Sun Yangkai @ 2026-01-01 11:44 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs
On 2026/1/1 08:20, Qu Wenruo wrote:
>
>
> On 2025/12/31 21:09, Sun YangKai wrote:
>> Problems with current implementation:
>> 1. reclaimable_bytes is signed while chunk_sz is unsigned, causing
>> negative reclaimable_bytes to trigger reclaim unexpectedly
>> 2. The "space must be freed between scans" assumption breaks the
>> two-scan requirement: first scan marks block groups, second scan
>> reclaims them. Without the second scan, no reclamation occurs.
>>
>> Instead, track actual reclaim progress: pause reclaim when block groups
>> will be reclaimed, and resume only when progress is made. This ensures
>> reclaim continues until no further progress can be made, then resumes when
>> space_info changes or new reclaimable groups appear.
>>
>> Signed-off-by: Sun YangKai <sunk67188@gmail.com>
>
> If this is a bug fix, as the title indicates, please add a proper "Fixes:" tag.
>
> Thanks,
> Qu
>
Ok. I'll add it in v2.
Thanks,
Sun YangKai
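Problem 1 in the quoted commit message comes from C's usual arithmetic conversions. The following is a minimal userspace sketch, not kernel code: `s64`/`u64` are stand-in typedefs, and `ready_mixed()` and `ready_signed_safe()` are hypothetical helper names that do not appear in the patch. It only illustrates why a negative `reclaimable_bytes` passes a `>= chunk_sz` check once the signed operand is converted to unsigned.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's s64/u64 types (assumption). */
typedef int64_t s64;
typedef uint64_t u64;

/*
 * Same shape as the removed check "reclaimable_bytes >= chunk_sz".
 * The s64 operand is converted to u64 before the comparison, so a
 * small negative value becomes a very large unsigned one.
 */
static int ready_mixed(s64 reclaimable_bytes, u64 chunk_sz)
{
	return reclaimable_bytes >= chunk_sz;
}

/*
 * Hypothetical signed-safe variant, shown only for contrast. The patch
 * does not use this; it drops reclaimable_bytes entirely.
 */
static int ready_signed_safe(s64 reclaimable_bytes, u64 chunk_sz)
{
	return reclaimable_bytes >= 0 && (u64)reclaimable_bytes >= chunk_sz;
}
```

With an assumed chunk size of 1 GiB and a net value of -4096 (more bytes allocated than freed), `ready_mixed()` returns 1 while `ready_signed_safe()` returns 0.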
* Re: [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup
2026-01-01 0:13 ` [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup Qu Wenruo
@ 2026-01-01 11:54 ` Sun Yangkai
2026-01-01 21:14 ` Qu Wenruo
0 siblings, 1 reply; 14+ messages in thread
From: Sun Yangkai @ 2026-01-01 11:54 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Boris Burkov
On 2026/1/1 08:13, Qu Wenruo wrote:
>
>
> On 2025/12/31 21:09, Sun YangKai wrote:
>> This series eliminates wasteful periodic reclaim operations that kept occurring
>> even after reclaim had already failed to free any new space, and includes
>> several preparatory cleanups.
>>
>> Patches 1-6 are non-functional changes.
>>
>> Patch 7 fixes the core issue; details are in the commit message.
>
> Fix first, then cleanup, please; this will make backporting much easier.
>
> Thanks,
> Qu
Sorry to bother you. I have no experience with backporting, so I need some
more guidance here.
The fix patch needs two of the cleanup patches applied. I currently have no idea
what I could do to make backporting easier. Should I also add a "Fixes:" tag to
the two cleanup patches? Or should I squash the two cleanups and the fix together
to make a patch just for backporting?
>>
>> Thanks.
>>
>> CC: Boris Burkov <boris@bur.io>
>>
>> Sun YangKai (7):
>> btrfs: change block group reclaim_mark to bool
>> btrfs: reorder btrfs_block_group members to reduce struct size
>> btrfs: use proper types for btrfs_block_group fields
>> btrfs: consolidate reclaim readiness checks in btrfs_should_reclaim()
>> btrfs: use u8 for reclaim threshold type
>> btrfs: clarify reclaim sweep control flow
>> btrfs: fix periodic reclaim condition
>>
>> fs/btrfs/block-group.c | 29 ++++++++++-----------
>> fs/btrfs/block-group.h | 22 ++++++++++------
>> fs/btrfs/space-info.c | 58 ++++++++++++++++++++----------------------
>> fs/btrfs/space-info.h | 38 +++++++++++++++++----------
>> fs/btrfs/sysfs.c | 3 ++-
>> 5 files changed, 81 insertions(+), 69 deletions(-)
>>
>
* Re: [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup
2026-01-01 11:54 ` Sun Yangkai
@ 2026-01-01 21:14 ` Qu Wenruo
2026-01-03 11:17 ` Sun Yangkai
0 siblings, 1 reply; 14+ messages in thread
From: Qu Wenruo @ 2026-01-01 21:14 UTC (permalink / raw)
To: Sun Yangkai, linux-btrfs; +Cc: Boris Burkov
On 2026/1/1 22:24, Sun Yangkai wrote:
>
>
> On 2026/1/1 08:13, Qu Wenruo wrote:
>>
>>
>> On 2025/12/31 21:09, Sun YangKai wrote:
>>> This series eliminates wasteful periodic reclaim operations that kept occurring
>>> even after reclaim had already failed to free any new space, and includes
>>> several preparatory cleanups.
>>>
>>> Patches 1-6 are non-functional changes.
>>>
>>> Patch 7 fixes the core issue; details are in the commit message.
>>
>> Fix first, then cleanup, please; this will make backporting much easier.
>>
>> Thanks,
>> Qu
>
> Sorry to bother you. I have no experience with backporting, so I need some
> more guidance here.
>
> The fix patch needs two of the cleanup patches applied.
I didn't see anything in the cleanups that significantly changes the
behavior.
Maybe some minor structure member or type changes, but that's all.
Your fix should still work using the older types/members, and that will
make backporting much easier, without the need to backport the cleanups as
dependencies.
> I currently have no idea
> what I could do to make backporting easier. Should I also add a "Fixes:" tag to
> the two cleanup patches?
Definitely not, and those cleanups should only be done after the fix.
A cleanup is not a fix, so it should not carry a "Fixes:" tag.
> Or should I squash the two cleanups and the fix together to
> make a patch just for backporting?
Not that either.
I did a quick, simple reorder, and only minor changes were needed to make it
compile (not tested). The reordered fix is attached.
Keep in mind that during development you should focus on the fix first and
ignore all the unrelated minor problems. That keeps your fix small, which
makes it easier to backport.
>
>>>
>>> Thanks.
>>>
>>> CC: Boris Burkov <boris@bur.io>
>>>
>>> Sun YangKai (7):
>>> btrfs: change block group reclaim_mark to bool
>>> btrfs: reorder btrfs_block_group members to reduce struct size
>>> btrfs: use proper types for btrfs_block_group fields
>>> btrfs: consolidate reclaim readiness checks in btrfs_should_reclaim()
>>> btrfs: use u8 for reclaim threshold type
>>> btrfs: clarify reclaim sweep control flow
>>> btrfs: fix periodic reclaim condition
>>>
>>> fs/btrfs/block-group.c | 29 ++++++++++-----------
>>> fs/btrfs/block-group.h | 22 ++++++++++------
>>> fs/btrfs/space-info.c | 58 ++++++++++++++++++++----------------------
>>> fs/btrfs/space-info.h | 38 +++++++++++++++++----------
>>> fs/btrfs/sysfs.c | 3 ++-
>>> 5 files changed, 81 insertions(+), 69 deletions(-)
>>>
>>
>
[-- Attachment #2: 0001-btrfs-fix-periodic-reclaim-condition.patch --]
[-- Type: text/x-patch, Size: 7633 bytes --]
From 63f2aafcd67f98df1471fbcf4e668880737843d1 Mon Sep 17 00:00:00 2001
Message-ID: <63f2aafcd67f98df1471fbcf4e668880737843d1.1767301600.git.wqu@suse.com>
From: Sun YangKai <sunk67188@gmail.com>
Date: Wed, 31 Dec 2025 18:39:40 +0800
Subject: [PATCH] btrfs: fix periodic reclaim condition
Problems with current implementation:
1. reclaimable_bytes is signed while chunk_sz is unsigned, causing
negative reclaimable_bytes to trigger reclaim unexpectedly
2. The "space must be freed between scans" assumption breaks the
two-scan requirement: first scan marks block groups, second scan
reclaims them. Without the second scan, no reclamation occurs.
Instead, track actual reclaim progress: pause reclaim when block groups
will be reclaimed, and resume only when progress is made. This ensures
reclaim continues until no further progress can be made, then resumes when
space_info changes or new reclaimable groups appear.
Signed-off-by: Sun YangKai <sunk67188@gmail.com>
---
fs/btrfs/block-group.c | 15 +++++++--------
fs/btrfs/space-info.c | 42 +++++++++++++++++++-----------------------
fs/btrfs/space-info.h | 29 ++++++++++++++++++++---------
3 files changed, 46 insertions(+), 40 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index e417aba4c4c7..94a4068cd42a 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1871,6 +1871,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
while (!list_empty(&fs_info->reclaim_bgs)) {
u64 used;
u64 reserved;
+ u64 old_total;
int ret = 0;
bg = list_first_entry(&fs_info->reclaim_bgs,
@@ -1936,6 +1937,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
}
spin_unlock(&bg->lock);
+ old_total = space_info->total_bytes;
spin_unlock(&space_info->lock);
/*
@@ -1988,14 +1990,14 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
reserved = 0;
spin_lock(&space_info->lock);
space_info->reclaim_errors++;
- if (READ_ONCE(space_info->periodic_reclaim))
- space_info->periodic_reclaim_ready = false;
spin_unlock(&space_info->lock);
}
spin_lock(&space_info->lock);
space_info->reclaim_count++;
space_info->reclaim_bytes += used;
space_info->reclaim_bytes += reserved;
+ if (space_info->total_bytes < old_total)
+ btrfs_resume_periodic_reclaim(space_info);
spin_unlock(&space_info->lock);
next:
@@ -3730,8 +3732,6 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
space_info->bytes_reserved -= num_bytes;
space_info->bytes_used += num_bytes;
space_info->disk_used += num_bytes * factor;
- if (READ_ONCE(space_info->periodic_reclaim))
- btrfs_space_info_update_reclaimable(space_info, -num_bytes);
spin_unlock(&cache->lock);
spin_unlock(&space_info->lock);
} else {
@@ -3741,12 +3741,11 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
btrfs_space_info_update_bytes_pinned(space_info, num_bytes);
space_info->bytes_used -= num_bytes;
space_info->disk_used -= num_bytes * factor;
- if (READ_ONCE(space_info->periodic_reclaim))
- btrfs_space_info_update_reclaimable(space_info, num_bytes);
- else
- reclaim = should_reclaim_block_group(cache, num_bytes);
+ reclaim = should_reclaim_block_group(cache, num_bytes);
spin_unlock(&cache->lock);
+ if (reclaim)
+ btrfs_resume_periodic_reclaim(space_info);
spin_unlock(&space_info->lock);
btrfs_set_extent_bit(&trans->transaction->pinned_extents, bytenr,
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index 7b7b7255f7d8..425e656b90d1 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -2124,43 +2124,39 @@ static void do_reclaim_sweep(struct btrfs_space_info *space_info, int raid)
}
up_read(&space_info->groups_sem);
-}
-void btrfs_space_info_update_reclaimable(struct btrfs_space_info *space_info, s64 bytes)
-{
- u64 chunk_sz = calc_effective_data_chunk_size(space_info->fs_info);
-
- lockdep_assert_held(&space_info->lock);
- space_info->reclaimable_bytes += bytes;
-
- if (space_info->reclaimable_bytes >= chunk_sz)
- btrfs_set_periodic_reclaim_ready(space_info, true);
-}
-
-void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool ready)
-{
- lockdep_assert_held(&space_info->lock);
- if (!READ_ONCE(space_info->periodic_reclaim))
- return;
- if (ready != space_info->periodic_reclaim_ready) {
- space_info->periodic_reclaim_ready = ready;
- if (!ready)
- space_info->reclaimable_bytes = 0;
+ /*
+ * Temporarily pause periodic reclaim until reclaim makes some progress.
+ * This prevents periodic reclaim from running repeatedly without progress.
+ */
+ if (!try_again) {
+ spin_lock(&space_info->lock);
+ btrfs_pause_periodic_reclaim(space_info);
+ spin_unlock(&space_info->lock);
}
}
static bool btrfs_should_periodic_reclaim(struct btrfs_space_info *space_info)
{
bool ret;
+ u64 chunk_sz;
+ u64 unused;
if (space_info->flags & BTRFS_BLOCK_GROUP_SYSTEM)
return false;
if (!READ_ONCE(space_info->periodic_reclaim))
return false;
+ if (!READ_ONCE(space_info->periodic_reclaim_paused))
+ return true;
+
+ chunk_sz = calc_effective_data_chunk_size(space_info->fs_info);
spin_lock(&space_info->lock);
- ret = space_info->periodic_reclaim_ready;
- btrfs_set_periodic_reclaim_ready(space_info, false);
+ unused = space_info->total_bytes - space_info->bytes_used;
+ ret = (unused >= space_info->last_reclaim_unused + chunk_sz ||
+ btrfs_calc_reclaim_threshold(space_info) != space_info->last_reclaim_threshold);
+ if (ret)
+ btrfs_resume_periodic_reclaim(space_info);
spin_unlock(&space_info->lock);
return ret;
diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h
index 0703f24b23f7..5040740d36f1 100644
--- a/fs/btrfs/space-info.h
+++ b/fs/btrfs/space-info.h
@@ -216,12 +216,9 @@ struct btrfs_space_info {
* Periodic reclaim should be a no-op if a space_info hasn't
* freed any space since the last time we tried.
*/
- bool periodic_reclaim_ready;
-
- /*
- * Net bytes freed or allocated since the last reclaim pass.
- */
- s64 reclaimable_bytes;
+ bool periodic_reclaim_paused;
+ u64 last_reclaim_unused;
+ int last_reclaim_threshold;
};
static inline bool btrfs_mixed_space_info(const struct btrfs_space_info *space_info)
@@ -300,10 +297,24 @@ int btrfs_reserve_data_bytes(struct btrfs_space_info *space_info, u64 bytes,
void btrfs_dump_space_info_for_trans_abort(struct btrfs_fs_info *fs_info);
void btrfs_init_async_reclaim_work(struct btrfs_fs_info *fs_info);
u64 btrfs_account_ro_block_groups_free_space(struct btrfs_space_info *sinfo);
-
-void btrfs_space_info_update_reclaimable(struct btrfs_space_info *space_info, s64 bytes);
-void btrfs_set_periodic_reclaim_ready(struct btrfs_space_info *space_info, bool ready);
int btrfs_calc_reclaim_threshold(const struct btrfs_space_info *space_info);
+
+static inline void btrfs_resume_periodic_reclaim(struct btrfs_space_info *space_info)
+{
+ lockdep_assert_held(&space_info->lock);
+ if (space_info->periodic_reclaim_paused)
+ space_info->periodic_reclaim_paused = false;
+}
+static inline void btrfs_pause_periodic_reclaim(struct btrfs_space_info *space_info)
+{
+ lockdep_assert_held(&space_info->lock);
+ if (!space_info->periodic_reclaim_paused) {
+ space_info->periodic_reclaim_paused = true;
+ space_info->last_reclaim_threshold = btrfs_calc_reclaim_threshold(space_info);
+ space_info->last_reclaim_unused = space_info->total_bytes - space_info->bytes_used;
+ }
+}
+
void btrfs_reclaim_sweep(const struct btrfs_fs_info *fs_info);
void btrfs_return_free_space(struct btrfs_space_info *space_info, u64 len);
--
2.52.0
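The pause/resume decision in the patch above can be modelled outside the kernel. What follows is a simplified sketch under stated assumptions, not the btrfs implementation: `struct space_info_model` and the `model_*()` functions are hypothetical names, locking is omitted, and the threshold is passed in as a parameter instead of being computed by `btrfs_calc_reclaim_threshold()`. It only mirrors the condition `unused >= last_reclaim_unused + chunk_sz || threshold != last_reclaim_threshold`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of the fields the patch touches; not the kernel struct. */
struct space_info_model {
	uint64_t total_bytes;
	uint64_t bytes_used;
	bool periodic_reclaim_paused;
	uint64_t last_reclaim_unused;
	int last_reclaim_threshold;
};

/* Snapshot the current state when pausing, as the pause helper does. */
static void model_pause(struct space_info_model *si, int threshold)
{
	if (si->periodic_reclaim_paused)
		return;
	si->periodic_reclaim_paused = true;
	si->last_reclaim_threshold = threshold;
	si->last_reclaim_unused = si->total_bytes - si->bytes_used;
}

static void model_resume(struct space_info_model *si)
{
	si->periodic_reclaim_paused = false;
}

/*
 * While paused, allow a sweep only if at least one chunk's worth of
 * space became unused since the snapshot, or the threshold changed.
 */
static bool model_should_reclaim(struct space_info_model *si, int threshold,
				 uint64_t chunk_sz)
{
	uint64_t unused;
	bool ret;

	if (!si->periodic_reclaim_paused)
		return true;

	unused = si->total_bytes - si->bytes_used;
	ret = unused >= si->last_reclaim_unused + chunk_sz ||
	      threshold != si->last_reclaim_threshold;
	if (ret)
		model_resume(si);
	return ret;
}
```

In this model, a paused space_info with unchanged usage and threshold keeps returning false, and it returns true again once `bytes_used` drops by at least `chunk_sz` or the threshold passed in differs from the snapshot.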
* Re: [PATCH 0/7] btrfs: fix periodic reclaim condition with some cleanup
2026-01-01 21:14 ` Qu Wenruo
@ 2026-01-03 11:17 ` Sun Yangkai
0 siblings, 0 replies; 14+ messages in thread
From: Sun Yangkai @ 2026-01-03 11:17 UTC (permalink / raw)
To: Qu Wenruo, linux-btrfs; +Cc: Boris Burkov
On 2026/1/2 05:14, Qu Wenruo wrote:
>
>
> On 2026/1/1 22:24, Sun Yangkai wrote:
>>
>>
>> On 2026/1/1 08:13, Qu Wenruo wrote:
>>>
>>>
>>> On 2025/12/31 21:09, Sun YangKai wrote:
>>>> This series eliminates wasteful periodic reclaim operations that kept occurring
>>>> even after reclaim had already failed to free any new space, and includes
>>>> several preparatory cleanups.
>>>>
>>>> Patches 1-6 are non-functional changes.
>>>>
>>>> Patch 7 fixes the core issue; details are in the commit message.
>>>
>>> Fix first, then cleanup, please; this will make backporting much easier.
>>>
>>> Thanks,
>>> Qu
>>
>> Sorry to bother you. I have no experience with backporting, so I need some
>> more guidance here.
>>
>> The fix patch needs two of the cleanup patches applied.
>
> I didn't see anything in the cleanups that significantly changes the behavior.
>
> Maybe some minor structure member or type changes, but that's all.
>
> Your fix should still work using the older types/members, and that will make
> backporting much easier, without the need to backport the cleanups as
> dependencies.
>
>> I currently have no idea
>> what I could do to make backporting easier. Should I also add a "Fixes:" tag to
>> the two cleanup patches?
>
> Definitely not, and those cleanups should only be done after the fix.
>
> A cleanup is not a fix, so it should not carry a "Fixes:" tag.
>
>> Or should I squash the two cleanups and the fix together to
>> make a patch just for backporting?
>
> Not that either.
>
> I did a quick, simple reorder, and only minor changes were needed to make it
> compile (not tested). The reordered fix is attached.
I've looked at the patch. It will not work correctly as-is, but I get the idea.
I'll move the fix patch to the front of the v2 patch set.
> Keep in mind that during development you should focus on the fix first and
> ignore all the unrelated minor problems. That keeps your fix small, which
> makes it easier to backport.
Got it.
Thanks a lot :)
Sun YangKai