* [PATCH v2 01/15] bcachefs: Introduce btree_leaf_has_triggers_mask
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 02/15] bcachefs: Refactor bch2_gc_btree/bch2_gc_btrees Nikita Ofitserov via B4 Relay
` (14 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/btree_gc.c | 2 +-
fs/bcachefs/btree_types.h | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/bcachefs/btree_gc.c b/fs/bcachefs/btree_gc.c
index 43f294284d57473fd8868c974c4bdd847fcf2636..2b1b6b7ffdf0af7ec0bf4cbaf68cc9a53a20cfce 100644
--- a/fs/bcachefs/btree_gc.c
+++ b/fs/bcachefs/btree_gc.c
@@ -720,7 +720,7 @@ static int bch2_gc_btree(struct btree_trans *trans,
enum btree_id btree, bool initial)
{
struct bch_fs *c = trans->c;
- unsigned target_depth = btree_node_type_has_triggers(__btree_node_type(0, btree)) ? 0 : 1;
+ unsigned target_depth = BIT_ULL(btree) & btree_leaf_has_triggers_mask ? 0 : 1;
int ret = 0;
/* We need to make sure every leaf node is readable before going RW */
diff --git a/fs/bcachefs/btree_types.h b/fs/bcachefs/btree_types.h
index e893eb938bb3da441f0ab0096ced3cc3c2a701e2..ad9bd18fe9b6e51987da74d373830dca98df56ec 100644
--- a/fs/bcachefs/btree_types.h
+++ b/fs/bcachefs/btree_types.h
@@ -840,6 +840,9 @@ static inline bool btree_node_type_has_triggers(enum btree_node_type type)
return BIT_ULL(type) & BTREE_NODE_TYPE_HAS_TRIGGERS;
}
+/* A mask of btree id bits that have triggers for their leaves */
+static const u64 btree_leaf_has_triggers_mask = BTREE_NODE_TYPE_HAS_TRIGGERS >> 1;
+
static const u64 btree_is_extents_mask = 0
#define x(name, nr, flags, ...) |((!!((flags) & BTREE_IS_extents)) << nr)
BCH_BTREE_IDS()
--
2.50.1
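
Note on the shift in this patch: the new mask is the node-type mask shifted right by one. The sketch below illustrates why that works, assuming (as the shift implies) that node type 0 denotes interior nodes and the leaves of btree N have node type N + 1. All demo_/DEMO_ names are hypothetical and are not part of bcachefs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(nr) (1ULL << (nr))

/* Hypothetical btree ids, for illustration only. */
enum demo_btree_id {
	DEMO_BTREE_extents,
	DEMO_BTREE_inodes,
	DEMO_BTREE_dirents,
};

/* Assumed numbering: 0 = interior node, leaves of btree N = N + 1. */
static inline unsigned demo_node_type(unsigned level, enum demo_btree_id id)
{
	return level ? 0 : (unsigned) id + 1;
}

/* Indexed by node type: interior nodes, extents leaves, inodes leaves. */
#define DEMO_NODE_TYPE_HAS_TRIGGERS				\
	(BIT_ULL(0) |						\
	 BIT_ULL(DEMO_BTREE_extents + 1) |			\
	 BIT_ULL(DEMO_BTREE_inodes + 1))

/* Shifting right drops the interior-node bit and re-indexes by btree id. */
static const uint64_t demo_leaf_has_triggers_mask =
	DEMO_NODE_TYPE_HAS_TRIGGERS >> 1;

static inline bool demo_leaf_has_triggers(enum demo_btree_id id)
{
	bool via_mask = BIT_ULL(id) & demo_leaf_has_triggers_mask;
	bool via_type = BIT_ULL(demo_node_type(0, id)) &
			DEMO_NODE_TYPE_HAS_TRIGGERS;

	/* Both lookups must agree for every leaf. */
	assert(via_mask == via_type);
	return via_mask;
}
```

Indexing by btree id directly lets callers such as bch2_gc_btree skip the node-type conversion when they only care about leaves.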
* [PATCH v2 02/15] bcachefs: Refactor bch2_gc_btree/bch2_gc_btrees
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 01/15] bcachefs: Introduce btree_leaf_has_triggers_mask Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 03/15] bcachefs: Improve check_allocations pass speed when not in fsck Nikita Ofitserov via B4 Relay
` (13 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Move the iteration policy (the choice of target depth) into the
caller so that accurate progress reporting becomes possible.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/btree_gc.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/fs/bcachefs/btree_gc.c b/fs/bcachefs/btree_gc.c
index 2b1b6b7ffdf0af7ec0bf4cbaf68cc9a53a20cfce..2a369aebdbe216ed665115e27388ab4821696bf8 100644
--- a/fs/bcachefs/btree_gc.c
+++ b/fs/bcachefs/btree_gc.c
@@ -717,16 +717,12 @@ static int bch2_gc_mark_key(struct btree_trans *trans, enum btree_id btree_id,
static int bch2_gc_btree(struct btree_trans *trans,
struct progress_indicator_state *progress,
- enum btree_id btree, bool initial)
+ enum btree_id btree, unsigned target_depth,
+ bool initial)
{
struct bch_fs *c = trans->c;
- unsigned target_depth = BIT_ULL(btree) & btree_leaf_has_triggers_mask ? 0 : 1;
int ret = 0;
- /* We need to make sure every leaf node is readable before going RW */
- if (initial)
- target_depth = 0;
-
for (unsigned level = target_depth; level < BTREE_MAX_DEPTH; level++) {
struct btree *prev = NULL;
struct btree_iter iter;
@@ -797,7 +793,8 @@ static int bch2_gc_btrees(struct bch_fs *c)
if (IS_ERR_OR_NULL(bch2_btree_id_root(c, btree)->b))
continue;
- ret = bch2_gc_btree(trans, &progress, btree, true);
+ /* We need to make sure every leaf node is readable before going RW */
+ ret = bch2_gc_btree(trans, &progress, btree, 0, true);
}
bch_err_fn(c, ret);
--
2.50.1
* [PATCH v2 03/15] bcachefs: Improve check_allocations pass speed when not in fsck
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 01/15] bcachefs: Introduce btree_leaf_has_triggers_mask Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 02/15] bcachefs: Refactor bch2_gc_btree/bch2_gc_btrees Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 04/15] bcachefs: Refactor/rename btree_type_has_ptrs Nikita Ofitserov via B4 Relay
` (12 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Skip reading unnecessary leaf btree nodes unless running fsck.
For example, this should speed up version upgrade/downgrade when
rebuilding accounting information.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/btree_gc.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/fs/bcachefs/btree_gc.c b/fs/bcachefs/btree_gc.c
index 2a369aebdbe216ed665115e27388ab4821696bf8..b95a3610900de2eecf978c62f3809453455f7a5c 100644
--- a/fs/bcachefs/btree_gc.c
+++ b/fs/bcachefs/btree_gc.c
@@ -793,8 +793,21 @@ static int bch2_gc_btrees(struct bch_fs *c)
if (IS_ERR_OR_NULL(bch2_btree_id_root(c, btree)->b))
continue;
- /* We need to make sure every leaf node is readable before going RW */
- ret = bch2_gc_btree(trans, &progress, btree, 0, true);
+
+ unsigned target_depth = BIT_ULL(btree) & btree_leaf_has_triggers_mask ? 0 : 1;
+
+ /*
+ * In fsck, we need to make sure every leaf node is readable
+ * before going RW, otherwise we can no longer rewind inside
+ * btree_lost_data to repair during the current fsck run.
+ *
+ * Otherwise, we can delay the repair to the next
+ * mount or offline fsck.
+ */
+ if (test_bit(BCH_FS_in_fsck, &c->flags))
+ target_depth = 0;
+
+ ret = bch2_gc_btree(trans, &progress, btree, target_depth, true);
}
bch_err_fn(c, ret);
--
2.50.1
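
Note on the depth policy after patches 02 and 03: the caller now decides how deep the walk goes. A minimal standalone sketch of that decision follows; demo_gc_target_depth and its parameters are hypothetical stand-ins for the inline logic in bch2_gc_btrees, not the actual kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(nr) (1ULL << (nr))

/*
 * Lowest btree level the pass has to visit:
 * 0 = walk down to the leaves, 1 = stop at interior nodes.
 */
static inline unsigned demo_gc_target_depth(unsigned btree,
					    uint64_t leaf_has_triggers_mask,
					    bool in_fsck)
{
	/* Leaves only matter if their keys run triggers... */
	unsigned target_depth =
		(BIT_ULL(btree) & leaf_has_triggers_mask) ? 0 : 1;

	/* ...unless fsck needs every leaf proven readable before going RW. */
	if (in_fsck)
		target_depth = 0;

	return target_depth;
}
```

Outside fsck, a btree whose leaves have no triggers is walked from level 1 upward only, which is where the described speedup would come from.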
* [PATCH v2 04/15] bcachefs: Refactor/rename btree_type_has_ptrs
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (2 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 03/15] bcachefs: Improve check_allocations pass speed when not in fsck Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 05/15] bcachefs: Fix progress reporting for unknown btrees Nikita Ofitserov via B4 Relay
` (11 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
The new name (btree_type_has_data_ptrs) better reflects the meaning.
Also introduce an explicit mask constant (btree_has_data_ptrs_mask).
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/backpointers.c | 2 +-
fs/bcachefs/btree_gc.c | 2 +-
fs/bcachefs/btree_types.h | 8 ++++----
fs/bcachefs/migrate.c | 2 +-
fs/bcachefs/move.c | 2 +-
5 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/fs/bcachefs/backpointers.c b/fs/bcachefs/backpointers.c
index cb25cddb759b84bed48c653605ba195218f2140c..0d585e5662be3f02580558e9a590075ea73193d5 100644
--- a/fs/bcachefs/backpointers.c
+++ b/fs/bcachefs/backpointers.c
@@ -809,7 +809,7 @@ static int bch2_check_extents_to_backpointers_pass(struct btree_trans *trans,
for (enum btree_id btree_id = 0;
btree_id < btree_id_nr_alive(c);
btree_id++) {
- int level, depth = btree_type_has_ptrs(btree_id) ? 0 : 1;
+ int level, depth = btree_type_has_data_ptrs(btree_id) ? 0 : 1;
ret = commit_do(trans, NULL, NULL,
BCH_TRANS_COMMIT_no_enospc,
diff --git a/fs/bcachefs/btree_gc.c b/fs/bcachefs/btree_gc.c
index b95a3610900de2eecf978c62f3809453455f7a5c..2338feb8d8ed4bad85a01a6b9181116d918636b5 100644
--- a/fs/bcachefs/btree_gc.c
+++ b/fs/bcachefs/btree_gc.c
@@ -1238,7 +1238,7 @@ int bch2_gc_gens(struct bch_fs *c)
}
for (unsigned i = 0; i < BTREE_ID_NR; i++)
- if (btree_type_has_ptrs(i)) {
+ if (btree_type_has_data_ptrs(i)) {
c->gc_gens_btree = i;
c->gc_gens_pos = POS_MIN;
diff --git a/fs/bcachefs/btree_types.h b/fs/bcachefs/btree_types.h
index ad9bd18fe9b6e51987da74d373830dca98df56ec..d06991c16d5c24c586e9449efa9553e2b8593bd4 100644
--- a/fs/bcachefs/btree_types.h
+++ b/fs/bcachefs/btree_types.h
@@ -886,15 +886,15 @@ static inline bool btree_type_has_snapshot_field(enum btree_id btree)
return BIT_ULL(btree) & mask;
}
-static inline bool btree_type_has_ptrs(enum btree_id btree)
-{
- const u64 mask = 0
+static const u64 btree_has_data_ptrs_mask = 0
#define x(name, nr, flags, ...) |((!!((flags) & BTREE_IS_data)) << nr)
BCH_BTREE_IDS()
#undef x
;
- return BIT_ULL(btree) & mask;
+static inline bool btree_type_has_data_ptrs(enum btree_id btree)
+{
+ return BIT_ULL(btree) & btree_has_data_ptrs_mask;
}
static inline bool btree_type_uses_write_buffer(enum btree_id btree)
diff --git a/fs/bcachefs/migrate.c b/fs/bcachefs/migrate.c
index 892990b4a6a6b16785e5030e5dc7cf15b05c0fc5..41657deb0e3807f4968272491faea09d05ec2299 100644
--- a/fs/bcachefs/migrate.c
+++ b/fs/bcachefs/migrate.c
@@ -122,7 +122,7 @@ static int bch2_dev_usrdata_drop(struct bch_fs *c,
CLASS(btree_trans, trans)(c);
for (unsigned id = 0; id < BTREE_ID_NR; id++) {
- if (!btree_type_has_ptrs(id))
+ if (!btree_type_has_data_ptrs(id))
continue;
/* Stripe keys have pointers, but are handled separately */
diff --git a/fs/bcachefs/move.c b/fs/bcachefs/move.c
index 4f41f1f6ec6c7fe6af663e11b3479fde02832ef9..1cd81380c7d87ebc446a874f2f4a208442d0c577 100644
--- a/fs/bcachefs/move.c
+++ b/fs/bcachefs/move.c
@@ -820,7 +820,7 @@ static int bch2_move_data(struct bch_fs *c,
unsigned min_depth_this_btree = min_depth;
/* Stripe keys have pointers, but are handled separately */
- if (!btree_type_has_ptrs(id) ||
+ if (!btree_type_has_data_ptrs(id) ||
id == BTREE_ID_stripes)
min_depth_this_btree = max(min_depth_this_btree, 1);
--
2.50.1
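
Note on the mask constant in this patch: it is built at compile time by expanding an x-macro list once per btree. The sketch below shows the same pattern in isolation. The DEMO_ list, its flags and its three entries are made up for illustration; the real list is BCH_BTREE_IDS().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(nr) (1ULL << (nr))

/* Hypothetical per-btree flags. */
#define DEMO_IS_data	(1U << 0)
#define DEMO_IS_extents	(1U << 1)

/* Hypothetical btree list: x(name, id, flags). */
#define DEMO_BTREE_IDS()					\
	x(extents, 0, DEMO_IS_extents | DEMO_IS_data)		\
	x(inodes,  1, 0)					\
	x(reflink, 2, DEMO_IS_extents | DEMO_IS_data)

/* Each entry contributes one bit at position nr if the flag is set. */
static const uint64_t demo_has_data_ptrs_mask = 0
#define x(name, nr, flags) |((uint64_t) (!!((flags) & DEMO_IS_data)) << nr)
	DEMO_BTREE_IDS()
#undef x
	;

static inline bool demo_type_has_data_ptrs(unsigned btree)
{
	return BIT_ULL(btree) & demo_has_data_ptrs_mask;
}
```

Hoisting the mask out of the helper, as the patch does, makes it usable as a plain constant by other code while the helper stays a one-line bit test.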
* [PATCH v2 05/15] bcachefs: Fix progress reporting for unknown btrees
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (3 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 04/15] bcachefs: Refactor/rename btree_type_has_ptrs Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 06/15] bcachefs: Partially fix old device removal with " Nikita Ofitserov via B4 Relay
` (10 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/progress.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/bcachefs/progress.c b/fs/bcachefs/progress.c
index 792fc6fef27018c73168c59857e7f3497c1969f4..541ee951d1c9fc3f11b5a5ac40f1d366829a5c97 100644
--- a/fs/bcachefs/progress.c
+++ b/fs/bcachefs/progress.c
@@ -12,7 +12,7 @@ void bch2_progress_init(struct progress_indicator_state *s,
s->next_print = jiffies + HZ * 10;
- for (unsigned i = 0; i < BTREE_ID_NR; i++) {
+ for (unsigned i = 0; i < btree_id_nr_alive(c); i++) {
if (!(btree_id_mask & BIT_ULL(i)))
continue;
--
2.50.1
* [PATCH v2 06/15] bcachefs: Partially fix old device removal with unknown btrees
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (4 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 05/15] bcachefs: Fix progress reporting for unknown btrees Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 07/15] bcachefs: Fix missing c->usage updates from early recovery Nikita Ofitserov via B4 Relay
` (9 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Handle removal of unknown metadata only; data pointed to by unknown
btrees cannot currently be supported.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/migrate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/bcachefs/migrate.c b/fs/bcachefs/migrate.c
index 41657deb0e3807f4968272491faea09d05ec2299..4aa7dd483cab5a400974b5824bfaf0c0af2e9331 100644
--- a/fs/bcachefs/migrate.c
+++ b/fs/bcachefs/migrate.c
@@ -121,6 +121,7 @@ static int bch2_dev_usrdata_drop(struct bch_fs *c,
{
CLASS(btree_trans, trans)(c);
+ /* FIXME: this does not handle unknown btrees with data pointers */
for (unsigned id = 0; id < BTREE_ID_NR; id++) {
if (!btree_type_has_data_ptrs(id))
continue;
@@ -161,7 +162,7 @@ static int bch2_dev_metadata_drop(struct bch_fs *c,
bch2_bkey_buf_init(&k);
closure_init_stack(&cl);
- for (id = 0; id < BTREE_ID_NR; id++) {
+ for (id = 0; id < btree_id_nr_alive(c); id++) {
bch2_trans_node_iter_init(trans, &iter, id, POS_MIN, 0, 0,
BTREE_ITER_prefetch);
retry:
--
2.50.1
* [PATCH v2 07/15] bcachefs: Fix missing c->usage updates from early recovery
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (5 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 06/15] bcachefs: Partially fix old device removal with " Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-09-01 23:55 ` Kent Overstreet
2025-08-28 14:16 ` [PATCH v2 08/15] bcachefs: Optimize/fix bch2_dev_usage_remove Nikita Ofitserov via B4 Relay
` (8 subsequent siblings)
15 siblings, 1 reply; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Make do_bch2_trans_commit_to_journal_replay apply global FS usage
updates stored in trans->fs_usage_delta, too.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/btree_trans_commit.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/bcachefs/btree_trans_commit.c b/fs/bcachefs/btree_trans_commit.c
index 5fa7f2f9f1e9d0ddd1d4675774321b519313f477..e3f069ad4002e2b007600fb41cd220d13386d37f 100644
--- a/fs/bcachefs/btree_trans_commit.c
+++ b/fs/bcachefs/btree_trans_commit.c
@@ -970,6 +970,7 @@ do_bch2_trans_commit_to_journal_replay(struct btree_trans *trans,
struct bkey_i *accounting;
retry:
+ memset(&trans->fs_usage_delta, 0, sizeof(trans->fs_usage_delta));
percpu_down_read(&c->mark_lock);
for (accounting = btree_trans_subbuf_base(trans, &trans->accounting);
accounting != btree_trans_subbuf_top(trans, &trans->accounting);
@@ -981,6 +982,8 @@ do_bch2_trans_commit_to_journal_replay(struct btree_trans *trans,
if (ret)
goto revert_fs_usage;
}
+ /* Only fatal errors are possible later, so no need to revert this */
+ bch2_trans_account_disk_usage_change(trans);
percpu_up_read(&c->mark_lock);
trans_for_each_update(trans, i) {
--
2.50.1
* Re: [PATCH v2 07/15] bcachefs: Fix missing c->usage updates from early recovery
2025-08-28 14:16 ` [PATCH v2 07/15] bcachefs: Fix missing c->usage updates from early recovery Nikita Ofitserov via B4 Relay
@ 2025-09-01 23:55 ` Kent Overstreet
0 siblings, 0 replies; 19+ messages in thread
From: Kent Overstreet @ 2025-09-01 23:55 UTC (permalink / raw)
To: himikof; +Cc: linux-bcachefs
On Thu, Aug 28, 2025 at 05:16:09PM +0300, Nikita Ofitserov via B4 Relay wrote:
> From: Nikita Ofitserov <himikof@gmail.com>
>
> Make do_bch2_trans_commit_to_journal_replay apply global FS usage
> updates stored in trans->fs_usage_delta, too.
>
> Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
kicked this patch out for now - run a lockdep test
> ---
> fs/bcachefs/btree_trans_commit.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/bcachefs/btree_trans_commit.c b/fs/bcachefs/btree_trans_commit.c
> index 5fa7f2f9f1e9d0ddd1d4675774321b519313f477..e3f069ad4002e2b007600fb41cd220d13386d37f 100644
> --- a/fs/bcachefs/btree_trans_commit.c
> +++ b/fs/bcachefs/btree_trans_commit.c
> @@ -970,6 +970,7 @@ do_bch2_trans_commit_to_journal_replay(struct btree_trans *trans,
>
> struct bkey_i *accounting;
> retry:
> + memset(&trans->fs_usage_delta, 0, sizeof(trans->fs_usage_delta));
> percpu_down_read(&c->mark_lock);
> for (accounting = btree_trans_subbuf_base(trans, &trans->accounting);
> accounting != btree_trans_subbuf_top(trans, &trans->accounting);
> @@ -981,6 +982,8 @@ do_bch2_trans_commit_to_journal_replay(struct btree_trans *trans,
> if (ret)
> goto revert_fs_usage;
> }
> + /* Only fatal errors are possible later, so no need to revert this */
> + bch2_trans_account_disk_usage_change(trans);
> percpu_up_read(&c->mark_lock);
>
> trans_for_each_update(trans, i) {
>
> --
> 2.50.1
>
>
* [PATCH v2 08/15] bcachefs: Optimize/fix bch2_dev_usage_remove
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (6 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 07/15] bcachefs: Fix missing c->usage updates from early recovery Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 09/15] bcachefs: Fix online hidden (sb+journal) data accounting Nikita Ofitserov via B4 Relay
` (7 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Make bch2_dev_usage_remove iterate only through relevant keys instead
of the whole accounting tree. Use the proper bch2_disk_accounting_mod
API to ensure in-memory counters will be zeroed and cleaned up later.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/alloc_background.c | 2 +-
fs/bcachefs/disk_accounting.c | 47 +++++++++++++++++++++++++++++++-----------
fs/bcachefs/disk_accounting.h | 2 +-
3 files changed, 37 insertions(+), 14 deletions(-)
diff --git a/fs/bcachefs/alloc_background.c b/fs/bcachefs/alloc_background.c
index 3fc728efbf5cc0425b966a0d83d97b33a666c531..0a2b630ab8420b404990eb5a4359c9295b926050 100644
--- a/fs/bcachefs/alloc_background.c
+++ b/fs/bcachefs/alloc_background.c
@@ -2397,7 +2397,7 @@ int bch2_dev_remove_alloc(struct bch_fs *c, struct bch_dev *ca)
BTREE_TRIGGER_norun, NULL) ?:
bch2_btree_delete_range(c, BTREE_ID_alloc, start, end,
BTREE_TRIGGER_norun, NULL) ?:
- bch2_dev_usage_remove(c, ca->dev_idx);
+ bch2_dev_usage_remove(c, ca);
bch_err_msg(ca, ret, "removing dev alloc info");
return ret;
}
diff --git a/fs/bcachefs/disk_accounting.c b/fs/bcachefs/disk_accounting.c
index d6c91abcdc4140ef98b407fa580b7cd0cf3b8f56..82ce610a7e40c86d84296a47ef7ea7b5eb82f816 100644
--- a/fs/bcachefs/disk_accounting.c
+++ b/fs/bcachefs/disk_accounting.c
@@ -974,21 +974,44 @@ int bch2_accounting_read(struct bch_fs *c)
return ret;
}
-int bch2_dev_usage_remove(struct bch_fs *c, unsigned dev)
+int bch2_dev_usage_remove(struct bch_fs *c, struct bch_dev *ca)
{
CLASS(btree_trans, trans)(c);
+
+ struct disk_accounting_pos start;
+ disk_accounting_key_init(start, dev_data_type, .dev = ca->dev_idx);
+
+ struct disk_accounting_pos end;
+ disk_accounting_key_init(end, dev_data_type, .dev = ca->dev_idx, .data_type = U8_MAX);
+
return bch2_btree_write_buffer_flush_sync(trans) ?:
- for_each_btree_key_commit(trans, iter, BTREE_ID_accounting, POS_MIN,
- BTREE_ITER_all_snapshots, k, NULL, NULL, 0, ({
- struct disk_accounting_pos acc;
- bpos_to_disk_accounting_pos(&acc, k.k->p);
-
- acc.type == BCH_DISK_ACCOUNTING_dev_data_type &&
- acc.dev_data_type.dev == dev
- ? bch2_btree_bit_mod_buffered(trans, BTREE_ID_accounting, k.k->p, 0)
- : 0;
- })) ?:
- bch2_btree_write_buffer_flush_sync(trans);
+ commit_do(trans, NULL, NULL, 0, ({
+ struct bkey_s_c k;
+ int ret = 0;
+
+ for_each_btree_key_max_norestart(trans, iter, BTREE_ID_accounting,
+ disk_accounting_pos_to_bpos(&start),
+ disk_accounting_pos_to_bpos(&end),
+ BTREE_ITER_all_snapshots, k, ret) {
+ if (k.k->type != KEY_TYPE_accounting)
+ continue;
+
+ struct disk_accounting_pos acc;
+ bpos_to_disk_accounting_pos(&acc, k.k->p);
+
+ const unsigned nr = bch2_accounting_counters(k.k);
+ u64 v[BCH_ACCOUNTING_MAX_COUNTERS];
+ memcpy_u64s_small(v, bkey_s_c_to_accounting(k).v->d, nr);
+
+ bch2_u64s_neg(v, nr);
+
+ ret = bch2_disk_accounting_mod(trans, &acc, v, nr, false);
+ if (ret)
+ break;
+ }
+
+ ret;
+ })) ?: bch2_btree_write_buffer_flush_sync(trans);
}
int bch2_dev_usage_init(struct bch_dev *ca, bool gc)
diff --git a/fs/bcachefs/disk_accounting.h b/fs/bcachefs/disk_accounting.h
index cc73cce98a447d96ad40683b88bfbc12689d8bf2..438628b63a8065e22ec76aebcd69e285f84f41ef 100644
--- a/fs/bcachefs/disk_accounting.h
+++ b/fs/bcachefs/disk_accounting.h
@@ -297,7 +297,7 @@ int bch2_gc_accounting_done(struct bch_fs *);
int bch2_accounting_read(struct bch_fs *);
-int bch2_dev_usage_remove(struct bch_fs *, unsigned);
+int bch2_dev_usage_remove(struct bch_fs *, struct bch_dev *);
int bch2_dev_usage_init(struct bch_dev *, bool);
void bch2_verify_accounting_clean(struct bch_fs *c);
--
2.50.1
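
Note on the approach in this patch: instead of deleting accounting keys directly, each key's counters are copied, negated and applied as a delta, so the in-memory counters go through the normal update path and end at zero. A minimal standalone sketch of that idea follows; the demo_ helpers are hypothetical and only model the arithmetic, not the transaction machinery.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_COUNTERS 3

/* Negate each counter in place (unsigned wraparound, like two's complement). */
static inline void demo_u64s_neg(uint64_t *v, unsigned nr)
{
	for (unsigned i = 0; i < nr; i++)
		v[i] = -v[i];
}

/* Stand-in for the accounting update path: apply a delta to live counters. */
static inline void demo_accounting_mod(uint64_t *counters,
				       const uint64_t *delta, unsigned nr)
{
	for (unsigned i = 0; i < nr; i++)
		counters[i] += delta[i];
}

/* Zero the counters by applying the negated current value as a delta. */
static inline void demo_zero_by_delta(uint64_t *counters, unsigned nr)
{
	uint64_t v[DEMO_MAX_COUNTERS];

	assert(nr <= DEMO_MAX_COUNTERS);
	memcpy(v, counters, nr * sizeof(v[0]));
	demo_u64s_neg(v, nr);
	demo_accounting_mod(counters, v, nr);
}
```

Because the removal is expressed as an ordinary delta, anything that observes accounting updates sees the counters drop to zero rather than silently losing the key.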
* [PATCH v2 09/15] bcachefs: Fix online hidden (sb+journal) data accounting
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (7 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 08/15] bcachefs: Optimize/fix bch2_dev_usage_remove Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-09-02 0:14 ` Kent Overstreet
2025-08-28 14:16 ` [PATCH v2 10/15] bcachefs: Fix outdated documentation comment Nikita Ofitserov via B4 Relay
` (6 subsequent siblings)
15 siblings, 1 reply; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Now the c->usage->hidden counters are kept up to date by the same
trigger-based mechanism as the other ones.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/disk_accounting.c | 12 ++++++++----
fs/bcachefs/disk_accounting.h | 10 +++++++---
2 files changed, 15 insertions(+), 7 deletions(-)
diff --git a/fs/bcachefs/disk_accounting.c b/fs/bcachefs/disk_accounting.c
index 82ce610a7e40c86d84296a47ef7ea7b5eb82f816..1649c0e1a89e8f4f04d122e40553b36f11491df5 100644
--- a/fs/bcachefs/disk_accounting.c
+++ b/fs/bcachefs/disk_accounting.c
@@ -1082,13 +1082,17 @@ void bch2_verify_accounting_clean(struct bch_fs *c)
case BCH_DISK_ACCOUNTING_dev_data_type: {
{
guard(rcu)(); /* scoped guard is a loop, and doesn't play nicely with continue */
+ const enum bch_data_type data_type = acc_k.dev_data_type.data_type;
struct bch_dev *ca = bch2_dev_rcu_noerror(c, acc_k.dev_data_type.dev);
if (!ca)
continue;
- v[0] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].buckets);
- v[1] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].sectors);
- v[2] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].fragmented);
+ v[0] = percpu_u64_get(&ca->usage->d[data_type].buckets);
+ v[1] = percpu_u64_get(&ca->usage->d[data_type].sectors);
+ v[2] = percpu_u64_get(&ca->usage->d[data_type].fragmented);
+
+ if (data_type == BCH_DATA_sb || data_type == BCH_DATA_journal)
+ base.hidden += a.v->d[0] * ca->mi.bucket_size;
}
if (memcmp(a.v->d, v, 3 * sizeof(u64))) {
@@ -1116,7 +1120,7 @@ void bch2_verify_accounting_clean(struct bch_fs *c)
mismatch = true; \
}
- //check(hidden);
+ check(hidden);
check(btree);
check(data);
check(cached);
diff --git a/fs/bcachefs/disk_accounting.h b/fs/bcachefs/disk_accounting.h
index 438628b63a8065e22ec76aebcd69e285f84f41ef..c3f2dc5d0c14241f3c0ae2d367d6da504df2e1ab 100644
--- a/fs/bcachefs/disk_accounting.h
+++ b/fs/bcachefs/disk_accounting.h
@@ -186,11 +186,15 @@ static inline int bch2_accounting_mem_mod_locked(struct btree_trans *trans,
break;
case BCH_DISK_ACCOUNTING_dev_data_type: {
guard(rcu)();
+ const enum bch_data_type data_type = acc_k.dev_data_type.data_type;
struct bch_dev *ca = bch2_dev_rcu_noerror(c, acc_k.dev_data_type.dev);
if (ca) {
- this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].buckets, a.v->d[0]);
- this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].sectors, a.v->d[1]);
- this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].fragmented, a.v->d[2]);
+ this_cpu_add(ca->usage->d[data_type].buckets, a.v->d[0]);
+ this_cpu_add(ca->usage->d[data_type].sectors, a.v->d[1]);
+ this_cpu_add(ca->usage->d[data_type].fragmented, a.v->d[2]);
+
+ if (data_type == BCH_DATA_sb || data_type == BCH_DATA_journal)
+ trans->fs_usage_delta.hidden += a.v->d[0] * ca->mi.bucket_size;
}
break;
}
--
2.50.1
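
Note on the hidden counter in this patch: both hunks derive it from the bucket count (counter 0) of superblock and journal data, multiplied by the device's bucket size. A minimal standalone sketch of that rule follows; the enum values and demo_hidden_delta are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of data types. */
enum demo_data_type {
	DEMO_DATA_free,
	DEMO_DATA_sb,
	DEMO_DATA_journal,
	DEMO_DATA_user,
};

/*
 * Change in hidden sectors caused by a per-device accounting update:
 * only superblock and journal buckets count, at full bucket size.
 */
static inline int64_t demo_hidden_delta(enum demo_data_type type,
					int64_t buckets_delta,
					unsigned bucket_size)
{
	if (type == DEMO_DATA_sb || type == DEMO_DATA_journal)
		return buckets_delta * (int64_t) bucket_size;

	return 0;
}
```

The verification side of the patch applies the same rule when summing expected values, so the two should only disagree if some update path bypasses the trigger.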
* Re: [PATCH v2 09/15] bcachefs: Fix online hidden (sb+journal) data accounting
2025-08-28 14:16 ` [PATCH v2 09/15] bcachefs: Fix online hidden (sb+journal) data accounting Nikita Ofitserov via B4 Relay
@ 2025-09-02 0:14 ` Kent Overstreet
0 siblings, 0 replies; 19+ messages in thread
From: Kent Overstreet @ 2025-09-02 0:14 UTC (permalink / raw)
To: himikof; +Cc: linux-bcachefs
On Thu, Aug 28, 2025 at 05:16:11PM +0300, Nikita Ofitserov via B4 Relay wrote:
> From: Nikita Ofitserov <himikof@gmail.com>
>
> Now the c->usage->hidden counters are kept up to date by the same
> trigger-based mechanism as the other ones.
This pops warnings in bch2_verify_accounting_clean():
00133 ------------[ cut here ]------------
00133 WARNING: CPU: 8 PID: 595 at fs/bcachefs/disk_accounting.c:1144 bch2_verify_accounting_clean+0x8fc/0xac0
00133 Modules linked in:
00133 CPU: 8 UID: 0 PID: 595 Comm: mount.bcachefs Not tainted 6.17.0-rc3-ktest-g92dce30a5a04 #32156 NONE
00133 Hardware name: linux,dummy-virt (DT)
00133 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
00133 pc : bch2_verify_accounting_clean+0x8fc/0xac0
00133 lr : bch2_verify_accounting_clean+0xa60/0xac0
00133 sp : ffffffc081e0b9d0
00133 x29: ffffffc081e0bbe0 x28: ffffffc081e0ba70 x27: ffffffc080e59fc0
00133 x26: 0000000000000008 x25: 0000000000000003 x24: 0000000000000000
00133 x23: ffffff80e6220000 x22: ffffffc081e0bb08 x21: 0000000000000046
00133 x20: 0000000000000000 x19: ffffff80e4540000 x18: 00000000ffffffff
00133 x17: 696d206e65646469 x16: 682e657361625f65 x15: 676173755f736620
00133 x14: 29286e61656c635f x13: 0000000000000000 x12: ffffff80fadeffa8
00133 x11: ffffff80f97f0000 x10: ffffff80fadf0000 x9 : ffffffc0800a86f0
00133 x8 : 0000000000000001 x7 : 00000000005fffe8 x6 : 0000000000000001
00133 x5 : ffffff80fbcf9c90 x4 : 0000000000000000 x3 : 0000000000000000
00133 x2 : 0000000000000000 x1 : ffffff80c1576000 x0 : 0000000000000001
00133 Call trace:
00133 bch2_verify_accounting_clean+0x8fc/0xac0 (P)
00133 bch2_fs_read_only+0x2ec/0x358
00133 bch2_fs_reconfigure+0x98/0x128
00133 reconfigure_super+0x98/0x200
00133 path_mount+0x840/0xab8
00133 __arm64_sys_mount+0x160/0x2c0
00133 invoke_syscall.constprop.0+0x54/0xe0
00133 do_el0_svc+0x44/0xe0
00133 el0_svc+0x18/0x58
00133 el0t_64_sync_handler+0x98/0xe0
00133 el0t_64_sync+0x154/0x158
00133 ---[ end trace 0000000000000000 ]---
>
> Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
> ---
> fs/bcachefs/disk_accounting.c | 12 ++++++++----
> fs/bcachefs/disk_accounting.h | 10 +++++++---
> 2 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/fs/bcachefs/disk_accounting.c b/fs/bcachefs/disk_accounting.c
> index 82ce610a7e40c86d84296a47ef7ea7b5eb82f816..1649c0e1a89e8f4f04d122e40553b36f11491df5 100644
> --- a/fs/bcachefs/disk_accounting.c
> +++ b/fs/bcachefs/disk_accounting.c
> @@ -1082,13 +1082,17 @@ void bch2_verify_accounting_clean(struct bch_fs *c)
> case BCH_DISK_ACCOUNTING_dev_data_type: {
> {
> guard(rcu)(); /* scoped guard is a loop, and doesn't play nicely with continue */
> + const enum bch_data_type data_type = acc_k.dev_data_type.data_type;
> struct bch_dev *ca = bch2_dev_rcu_noerror(c, acc_k.dev_data_type.dev);
> if (!ca)
> continue;
>
> - v[0] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].buckets);
> - v[1] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].sectors);
> - v[2] = percpu_u64_get(&ca->usage->d[acc_k.dev_data_type.data_type].fragmented);
> + v[0] = percpu_u64_get(&ca->usage->d[data_type].buckets);
> + v[1] = percpu_u64_get(&ca->usage->d[data_type].sectors);
> + v[2] = percpu_u64_get(&ca->usage->d[data_type].fragmented);
> +
> + if (data_type == BCH_DATA_sb || data_type == BCH_DATA_journal)
> + base.hidden += a.v->d[0] * ca->mi.bucket_size;
> }
>
> if (memcmp(a.v->d, v, 3 * sizeof(u64))) {
> @@ -1116,7 +1120,7 @@ void bch2_verify_accounting_clean(struct bch_fs *c)
> mismatch = true; \
> }
>
> - //check(hidden);
> + check(hidden);
> check(btree);
> check(data);
> check(cached);
> diff --git a/fs/bcachefs/disk_accounting.h b/fs/bcachefs/disk_accounting.h
> index 438628b63a8065e22ec76aebcd69e285f84f41ef..c3f2dc5d0c14241f3c0ae2d367d6da504df2e1ab 100644
> --- a/fs/bcachefs/disk_accounting.h
> +++ b/fs/bcachefs/disk_accounting.h
> @@ -186,11 +186,15 @@ static inline int bch2_accounting_mem_mod_locked(struct btree_trans *trans,
> break;
> case BCH_DISK_ACCOUNTING_dev_data_type: {
> guard(rcu)();
> + const enum bch_data_type data_type = acc_k.dev_data_type.data_type;
> struct bch_dev *ca = bch2_dev_rcu_noerror(c, acc_k.dev_data_type.dev);
> if (ca) {
> - this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].buckets, a.v->d[0]);
> - this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].sectors, a.v->d[1]);
> - this_cpu_add(ca->usage->d[acc_k.dev_data_type.data_type].fragmented, a.v->d[2]);
> + this_cpu_add(ca->usage->d[data_type].buckets, a.v->d[0]);
> + this_cpu_add(ca->usage->d[data_type].sectors, a.v->d[1]);
> + this_cpu_add(ca->usage->d[data_type].fragmented, a.v->d[2]);
> +
> + if (data_type == BCH_DATA_sb || data_type == BCH_DATA_journal)
> + trans->fs_usage_delta.hidden += a.v->d[0] * ca->mi.bucket_size;
> }
> break;
> }
>
> --
> 2.50.1
>
>
* [PATCH v2 10/15] bcachefs: Fix outdated documentation comment
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (8 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 09/15] bcachefs: Fix online hidden (sb+journal) data accounting Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 11/15] bcachefs: Relax restrictions on the number of accounting counters Nikita Ofitserov via B4 Relay
` (5 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/bcachefs_format.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/fs/bcachefs/bcachefs_format.h b/fs/bcachefs/bcachefs_format.h
index b2de993d802b128cb503740d41028e51c14110ee..0839397105a9d20bcdb99090b508dcff2e0f1886 100644
--- a/fs/bcachefs/bcachefs_format.h
+++ b/fs/bcachefs/bcachefs_format.h
@@ -654,7 +654,6 @@ struct bch_sb_field_ext {
/*
* field 1: version name
* field 2: BCH_VERSION(major, minor)
- * field 3: recovery passess required on upgrade
*/
#define BCH_METADATA_VERSIONS() \
x(bkey_renumber, BCH_VERSION(0, 10)) \
--
2.50.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 11/15] bcachefs: Relax restrictions on the number of accounting counters
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (9 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 10/15] bcachefs: Fix outdated documentation comment Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 12/15] bcachefs: bcachefs_metadata_version_btree_node_accounting Nikita Ofitserov via B4 Relay
` (4 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Make adding/removing accounting counters easier when upgrading by
removing their exact number from bch_accounting key validity invariants.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/disk_accounting.c | 11 ++++++++---
fs/bcachefs/disk_accounting.h | 4 ++--
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/fs/bcachefs/disk_accounting.c b/fs/bcachefs/disk_accounting.c
index 1649c0e1a89e8f4f04d122e40553b36f11491df5..7e3bcda30f3b027cb88886a633680f77efe0b9d0 100644
--- a/fs/bcachefs/disk_accounting.c
+++ b/fs/bcachefs/disk_accounting.c
@@ -239,10 +239,12 @@ int bch2_accounting_validate(struct bch_fs *c, struct bkey_s_c k,
c, accounting_key_junk_at_end,
"junk at end of accounting key");
- bkey_fsck_err_on(bch2_accounting_counters(k.k) != bch2_accounting_type_nr_counters[acc_k.type],
+ const unsigned nr_counters = bch2_accounting_counters(k.k);
+
+ bkey_fsck_err_on(!nr_counters || nr_counters > BCH_ACCOUNTING_MAX_COUNTERS,
c, accounting_key_nr_counters_wrong,
"accounting key with %u counters, should be %u",
- bch2_accounting_counters(k.k), bch2_accounting_type_nr_counters[acc_k.type]);
+ nr_counters, bch2_accounting_type_nr_counters[acc_k.type]);
fsck_err:
return ret;
}
@@ -359,10 +361,13 @@ static int __bch2_accounting_mem_insert(struct bch_fs *c, struct bkey_s_c_accoun
accounting_pos_cmp, &a.k->p) < acc->k.nr)
return 0;
+ struct disk_accounting_pos acc_k;
+ bpos_to_disk_accounting_pos(&acc_k, a.k->p);
+
struct accounting_mem_entry n = {
.pos = a.k->p,
.bversion = a.k->bversion,
- .nr_counters = bch2_accounting_counters(a.k),
+ .nr_counters = bch2_accounting_type_nr_counters[acc_k.type],
.v[0] = __alloc_percpu_gfp(n.nr_counters * sizeof(u64),
sizeof(u64), GFP_KERNEL),
};
diff --git a/fs/bcachefs/disk_accounting.h b/fs/bcachefs/disk_accounting.h
index c3f2dc5d0c14241f3c0ae2d367d6da504df2e1ab..c0d3d7e8fda6172eb04c8b0d9cd4cd879090e480 100644
--- a/fs/bcachefs/disk_accounting.h
+++ b/fs/bcachefs/disk_accounting.h
@@ -216,9 +216,9 @@ static inline int bch2_accounting_mem_mod_locked(struct btree_trans *trans,
struct accounting_mem_entry *e = &acc->k.data[idx];
- EBUG_ON(bch2_accounting_counters(a.k) != e->nr_counters);
+ const unsigned nr = min_t(unsigned, bch2_accounting_counters(a.k), e->nr_counters);
- for (unsigned i = 0; i < bch2_accounting_counters(a.k); i++)
+ for (unsigned i = 0; i < nr; i++)
this_cpu_add(e->v[gc][i], a.v->d[i]);
return 0;
}
--
2.50.1
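
For readers following along outside the kernel tree, the relaxed rule in this patch can be sketched in plain userspace C. This is only an illustration under stated assumptions: `MAX_COUNTERS`, `key_nr_counters_valid()` and `apply_deltas()` are hypothetical stand-ins, not bcachefs symbols, and the per-CPU counters are replaced with a plain array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for BCH_ACCOUNTING_MAX_COUNTERS (value assumed). */
#define MAX_COUNTERS 3

/* Relaxed validity check: any counter count in 1..MAX_COUNTERS is accepted,
 * instead of requiring the exact per-type count. */
static int key_nr_counters_valid(unsigned nr_counters)
{
	return nr_counters && nr_counters <= MAX_COUNTERS;
}

/* Apply a key's deltas to an in-memory entry. Only the counters present on
 * both sides are touched, so an old key with fewer counters (or a newer key
 * with more) no longer trips an assertion. Returns how many were applied. */
static unsigned apply_deltas(int64_t *mem, unsigned mem_nr,
			     const int64_t *key, unsigned key_nr)
{
	assert(mem_nr <= MAX_COUNTERS);

	unsigned nr = key_nr < mem_nr ? key_nr : mem_nr;

	for (unsigned i = 0; i < nr; i++)
		mem[i] += key[i];
	return nr;
}
```

With a one-counter key applied to a three-counter entry, only the first slot changes; the remaining slots keep whatever value they already had.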
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 12/15] bcachefs: bcachefs_metadata_version_btree_node_accounting
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (10 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 11/15] bcachefs: Relax restrictions on the number of accounting counters Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 13/15] bcachefs: Use explicit node counts in progress reporting Nikita Ofitserov via B4 Relay
` (3 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Introduce btree node number accounting for better progress reporting.
This change includes a mandatory upgrade/downgrade.
Add 2 new counters for BCH_DISK_ACCOUNTING_btree: total number of btree
nodes (ignoring replication) and the number of non-leaf btree nodes
(likewise). Those are to be used by recovery progress reporting instead
of estimating them.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/bcachefs_format.h | 3 ++-
fs/bcachefs/buckets.c | 16 +++++++++++-----
fs/bcachefs/disk_accounting_format.h | 10 +++++++++-
fs/bcachefs/sb-downgrade.c | 11 +++++++++--
4 files changed, 31 insertions(+), 9 deletions(-)
diff --git a/fs/bcachefs/bcachefs_format.h b/fs/bcachefs/bcachefs_format.h
index 0839397105a9d20bcdb99090b508dcff2e0f1886..76a2ae7f8d2dca44dfdd31c0d1078c45778d8447 100644
--- a/fs/bcachefs/bcachefs_format.h
+++ b/fs/bcachefs/bcachefs_format.h
@@ -706,7 +706,8 @@ struct bch_sb_field_ext {
x(fast_device_removal, BCH_VERSION(1, 27)) \
x(inode_has_case_insensitive, BCH_VERSION(1, 28)) \
x(extent_snapshot_whiteouts, BCH_VERSION(1, 29)) \
- x(31bit_dirent_offset, BCH_VERSION(1, 30))
+ x(31bit_dirent_offset, BCH_VERSION(1, 30)) \
+ x(btree_node_accounting, BCH_VERSION(1, 31))
enum bcachefs_metadata_version {
bcachefs_metadata_version_min = 9,
diff --git a/fs/bcachefs/buckets.c b/fs/bcachefs/buckets.c
index 021f5cb7998de704be9d135064af2817cb2c52fe..99e928f7799971d052f24ba6043eb54765cf42b9 100644
--- a/fs/bcachefs/buckets.c
+++ b/fs/bcachefs/buckets.c
@@ -749,6 +749,7 @@ static int __trigger_extent(struct btree_trans *trans,
enum btree_iter_update_trigger_flags flags)
{
bool gc = flags & BTREE_TRIGGER_gc;
+ bool insert = !(flags & BTREE_TRIGGER_overwrite);
struct bkey_ptrs_c ptrs = bch2_bkey_ptrs_c(k);
const union bch_extent_entry *entry;
struct extent_ptr_decoded p;
@@ -802,7 +803,7 @@ static int __trigger_extent(struct btree_trans *trans,
if (cur_compression_type &&
cur_compression_type != p.crc.compression_type) {
- if (flags & BTREE_TRIGGER_overwrite)
+ if (!insert)
bch2_u64s_neg(compression_acct, ARRAY_SIZE(compression_acct));
ret = bch2_disk_accounting_mod2(trans, gc, compression_acct,
@@ -835,7 +836,7 @@ static int __trigger_extent(struct btree_trans *trans,
}
if (cur_compression_type) {
- if (flags & BTREE_TRIGGER_overwrite)
+ if (!insert)
bch2_u64s_neg(compression_acct, ARRAY_SIZE(compression_acct));
ret = bch2_disk_accounting_mod2(trans, gc, compression_acct,
@@ -845,12 +846,17 @@ static int __trigger_extent(struct btree_trans *trans,
}
if (level) {
- ret = bch2_disk_accounting_mod2_nr(trans, gc, &replicas_sectors, 1, btree, btree_id);
+ const bool leaf_node = level == 1;
+ s64 v[3] = {
+ replicas_sectors,
+ insert ? 1 : -1,
+ !leaf_node ? (insert ? 1 : -1) : 0,
+ };
+
+ ret = bch2_disk_accounting_mod2(trans, gc, v, btree, btree_id);
if (ret)
return ret;
} else {
- bool insert = !(flags & BTREE_TRIGGER_overwrite);
-
s64 v[3] = {
insert ? 1 : -1,
insert ? k.k->size : -((s64) k.k->size),
diff --git a/fs/bcachefs/disk_accounting_format.h b/fs/bcachefs/disk_accounting_format.h
index 8269af1dbe2a094454f780194f4ece33c4a4e461..730a17ea42431012282cec9d7803b0ac0b1d339d 100644
--- a/fs/bcachefs/disk_accounting_format.h
+++ b/fs/bcachefs/disk_accounting_format.h
@@ -108,7 +108,7 @@ static inline bool data_type_is_hidden(enum bch_data_type type)
x(dev_data_type, 3, 3) \
x(compression, 4, 3) \
x(snapshot, 5, 1) \
- x(btree, 6, 1) \
+ x(btree, 6, 3) \
x(rebalance_work, 7, 1) \
x(inum, 8, 3)
@@ -174,6 +174,14 @@ struct bch_acct_snapshot {
__u32 id;
} __packed;
+/*
+ * Metadata accounting per btree id:
+ * [
+ * total btree disk usage in sectors
+ * total number of btree nodes
+ * number of non-leaf btree nodes
+ * ]
+ */
struct bch_acct_btree {
__u32 id;
} __packed;
diff --git a/fs/bcachefs/sb-downgrade.c b/fs/bcachefs/sb-downgrade.c
index de56a1ee79db202da7ca021fb1f67f4b4820a2a8..bfd06fd5d506169031a2f7cadb5b1697ac40c811 100644
--- a/fs/bcachefs/sb-downgrade.c
+++ b/fs/bcachefs/sb-downgrade.c
@@ -104,7 +104,10 @@
x(inode_has_case_insensitive, \
BIT_ULL(BCH_RECOVERY_PASS_check_inodes), \
BCH_FSCK_ERR_inode_has_case_insensitive_not_set, \
- BCH_FSCK_ERR_inode_parent_has_case_insensitive_not_set)
+ BCH_FSCK_ERR_inode_parent_has_case_insensitive_not_set)\
+ x(btree_node_accounting, \
+ BIT_ULL(BCH_RECOVERY_PASS_check_allocations), \
+ BCH_FSCK_ERR_accounting_mismatch)
#define DOWNGRADE_TABLE() \
x(bucket_stripe_sectors, \
@@ -152,7 +155,11 @@
BIT_ULL(BCH_RECOVERY_PASS_check_allocations), \
BCH_FSCK_ERR_accounting_mismatch, \
BCH_FSCK_ERR_accounting_key_replicas_nr_devs_0, \
- BCH_FSCK_ERR_accounting_key_junk_at_end)
+ BCH_FSCK_ERR_accounting_key_junk_at_end) \
+ x(btree_node_accounting, \
+ BIT_ULL(BCH_RECOVERY_PASS_check_allocations), \
+ BCH_FSCK_ERR_accounting_mismatch, \
+ BCH_FSCK_ERR_accounting_key_nr_counters_wrong)
struct upgrade_downgrade_entry {
u64 recovery_passes;
--
2.50.1
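
As a rough illustration of the three counters this patch maintains per btree id, the delta computed in the `if (level)` branch can be sketched as a standalone function. The name `btree_node_deltas()` is hypothetical; the sector delta is passed through unchanged, as in the patch, and the level is that of the key pointing at the node, so level 1 means the node itself is a leaf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Fill v[] with the deltas for one btree node pointer key:
 *   v[0]: disk usage in sectors (taken as-is from the caller)
 *   v[1]: total number of btree nodes
 *   v[2]: number of non-leaf btree nodes
 * `level` is the level of the key; a key at level 1 points to a leaf. */
static void btree_node_deltas(int64_t v[3], int64_t replicas_sectors,
			      unsigned level, bool insert)
{
	assert(level >= 1);

	const int64_t sign = insert ? 1 : -1;
	const bool leaf_node = level == 1;

	v[0] = replicas_sectors;
	v[1] = sign;
	v[2] = leaf_node ? 0 : sign;
}
```

Inserting a leaf bumps only the total node count, while overwriting an interior node decrements both the total and the non-leaf count.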
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 13/15] bcachefs: Use explicit node counts in progress reporting
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (11 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 12/15] bcachefs: bcachefs_metadata_version_btree_node_accounting Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 14/15] bcachefs: Better progress reporting for btree iteration without leaves Nikita Ofitserov via B4 Relay
` (2 subsequent siblings)
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Also, consider the metadata_replicas option when better
accounting is not available.
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/progress.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/fs/bcachefs/progress.c b/fs/bcachefs/progress.c
index 541ee951d1c9fc3f11b5a5ac40f1d366829a5c97..8181f6d2ca3808f57bef179383bd969d9a375d5f 100644
--- a/fs/bcachefs/progress.c
+++ b/fs/bcachefs/progress.c
@@ -12,6 +12,10 @@ void bch2_progress_init(struct progress_indicator_state *s,
s->next_print = jiffies + HZ * 10;
+ /* This is only an estimation: nodes can have different replica counts */
+ const u32 expected_node_disk_sectors =
+ READ_ONCE(c->opts.metadata_replicas) * btree_sectors(c);
+
for (unsigned i = 0; i < btree_id_nr_alive(c); i++) {
if (!(btree_id_mask & BIT_ULL(i)))
continue;
@@ -19,9 +23,23 @@ void bch2_progress_init(struct progress_indicator_state *s,
struct disk_accounting_pos acc;
disk_accounting_key_init(acc, btree, .id = i);
- u64 v;
- bch2_accounting_mem_read(c, disk_accounting_pos_to_bpos(&acc), &v, 1);
- s->nodes_total += div64_ul(v, btree_sectors(c));
+ struct {
+ u64 disk_sectors;
+ u64 total_nodes;
+ u64 inner_nodes;
+ } v = {0};
+ bch2_accounting_mem_read(c, disk_accounting_pos_to_bpos(&acc),
+ (u64 *)&v, sizeof(v) / sizeof(u64));
+
+ /*
+ * We check for zeros to degrade gracefully when run
+ * with un-upgraded accounting info (missing some counters).
+ */
+
+ if (v.total_nodes != 0)
+ s->nodes_total += v.total_nodes - v.inner_nodes;
+ else
+ s->nodes_total += div_u64(v.disk_sectors, expected_node_disk_sectors);
}
}
--
2.50.1
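
The fallback logic in this patch can be sketched outside the kernel as follows. This is a simplified illustration, not the in-tree code: `struct btree_acct` and `leaf_nodes_estimate()` are hypothetical names, and plain 64-bit division stands in for `div_u64()`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the three per-btree accounting counters. */
struct btree_acct {
	uint64_t disk_sectors;
	uint64_t total_nodes;
	uint64_t inner_nodes;
};

/* Number of leaf nodes to expect for one btree. When the explicit node
 * counters are present (non-zero), use them directly. Otherwise fall back
 * to dividing disk usage by the expected on-disk size of one node, which
 * is only an estimate because nodes can have different replica counts. */
static uint64_t leaf_nodes_estimate(struct btree_acct v,
				    uint32_t metadata_replicas,
				    uint32_t btree_sectors)
{
	const uint64_t per_node = (uint64_t)metadata_replicas * btree_sectors;

	assert(per_node != 0);

	if (v.total_nodes != 0)
		return v.total_nodes - v.inner_nodes;
	return v.disk_sectors / per_node;
}
```

For example, with un-upgraded accounting, 4096 sectors of btree data at 2 replicas and 512-sector nodes comes out to 4 nodes, whereas upgraded counters of 10 total and 3 inner nodes give 7 leaves regardless of disk usage.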
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 14/15] bcachefs: Better progress reporting for btree iteration without leaves
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (12 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 13/15] bcachefs: Use explicit node counts in progress reporting Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-08-28 14:16 ` [PATCH v2 15/15] bcachefs: More accurate progress reporting for inner node iteration Nikita Ofitserov via B4 Relay
2025-09-01 22:33 ` [PATCH v2 00/15] Accounting for accurate progress reporting Kent Overstreet
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/progress.c | 17 +++++++++++++----
fs/bcachefs/progress.h | 12 +++++++++++-
2 files changed, 24 insertions(+), 5 deletions(-)
diff --git a/fs/bcachefs/progress.c b/fs/bcachefs/progress.c
index 8181f6d2ca3808f57bef179383bd969d9a375d5f..7cc16490ffa98fec34ff228cb56bd1af0f42da3d 100644
--- a/fs/bcachefs/progress.c
+++ b/fs/bcachefs/progress.c
@@ -4,9 +4,10 @@
#include "disk_accounting.h"
#include "progress.h"
-void bch2_progress_init(struct progress_indicator_state *s,
- struct bch_fs *c,
- u64 btree_id_mask)
+void bch2_progress_init_inner(struct progress_indicator_state *s,
+ struct bch_fs *c,
+ u64 leaf_btree_id_mask,
+ u64 inner_btree_id_mask)
{
memset(s, 0, sizeof(*s));
@@ -16,6 +17,8 @@ void bch2_progress_init(struct progress_indicator_state *s,
const u32 expected_node_disk_sectors =
READ_ONCE(c->opts.metadata_replicas) * btree_sectors(c);
+ const u64 btree_id_mask = leaf_btree_id_mask | inner_btree_id_mask;
+
for (unsigned i = 0; i < btree_id_nr_alive(c); i++) {
if (!(btree_id_mask & BIT_ULL(i)))
continue;
@@ -31,11 +34,17 @@ void bch2_progress_init(struct progress_indicator_state *s,
bch2_accounting_mem_read(c, disk_accounting_pos_to_bpos(&acc),
(u64 *)&v, sizeof(v) / sizeof(u64));
+ /* Better to estimate as 0 than the total node count */
+ if (inner_btree_id_mask & BIT_ULL(i))
+ s->nodes_total += v.inner_nodes;
+
+ if (!(leaf_btree_id_mask & BIT_ULL(i)))
+ continue;
+
/*
* We check for zeros to degrade gracefully when run
* with un-upgraded accounting info (missing some counters).
*/
-
if (v.total_nodes != 0)
s->nodes_total += v.total_nodes - v.inner_nodes;
else
diff --git a/fs/bcachefs/progress.h b/fs/bcachefs/progress.h
index 972a73087ffe06632abef015e4556d8ee196eb24..91f3453377093e2713aa4198b68e13601223688e 100644
--- a/fs/bcachefs/progress.h
+++ b/fs/bcachefs/progress.h
@@ -20,7 +20,17 @@ struct progress_indicator_state {
struct btree *last_node;
};
-void bch2_progress_init(struct progress_indicator_state *, struct bch_fs *, u64);
+void bch2_progress_init_inner(struct progress_indicator_state *s,
+ struct bch_fs *c,
+ u64 leaf_btree_id_mask,
+ u64 inner_btree_id_mask);
+
+static inline void bch2_progress_init(struct progress_indicator_state *s,
+ struct bch_fs *c, u64 btree_id_mask)
+{
+ bch2_progress_init_inner(s, c, btree_id_mask, 0);
+}
+
void bch2_progress_update_iter(struct btree_trans *,
struct progress_indicator_state *,
struct btree_iter *,
--
2.50.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* [PATCH v2 15/15] bcachefs: More accurate progress reporting for inner node iteration
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (13 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 14/15] bcachefs: Better progress reporting for btree iteration without leaves Nikita Ofitserov via B4 Relay
@ 2025-08-28 14:16 ` Nikita Ofitserov via B4 Relay
2025-09-01 22:33 ` [PATCH v2 00/15] Accounting for accurate progress reporting Kent Overstreet
15 siblings, 0 replies; 19+ messages in thread
From: Nikita Ofitserov via B4 Relay @ 2025-08-28 14:16 UTC (permalink / raw)
To: Kent Overstreet; +Cc: linux-bcachefs, Nikita Ofitserov
From: Nikita Ofitserov <himikof@gmail.com>
Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
---
fs/bcachefs/backpointers.c | 4 +++-
fs/bcachefs/btree_gc.c | 2 +-
fs/bcachefs/migrate.c | 13 +++++++++----
3 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/fs/bcachefs/backpointers.c b/fs/bcachefs/backpointers.c
index 0d585e5662be3f02580558e9a590075ea73193d5..42370aebb7a442ee368df323eb1f4970f2e5f949 100644
--- a/fs/bcachefs/backpointers.c
+++ b/fs/bcachefs/backpointers.c
@@ -804,7 +804,9 @@ static int bch2_check_extents_to_backpointers_pass(struct btree_trans *trans,
struct progress_indicator_state progress;
int ret = 0;
- bch2_progress_init(&progress, trans->c, BIT_ULL(BTREE_ID_extents)|BIT_ULL(BTREE_ID_reflink));
+ bch2_progress_init_inner(&progress, trans->c,
+ btree_has_data_ptrs_mask,
+ ~0ULL);
for (enum btree_id btree_id = 0;
btree_id < btree_id_nr_alive(c);
diff --git a/fs/bcachefs/btree_gc.c b/fs/bcachefs/btree_gc.c
index 2338feb8d8ed4bad85a01a6b9181116d918636b5..c04e88ec5c0ac73c21f01db597b19d1b0c798299 100644
--- a/fs/bcachefs/btree_gc.c
+++ b/fs/bcachefs/btree_gc.c
@@ -780,7 +780,7 @@ static int bch2_gc_btrees(struct bch_fs *c)
int ret = 0;
struct progress_indicator_state progress;
- bch2_progress_init(&progress, c, ~0ULL);
+ bch2_progress_init_inner(&progress, c, ~0ULL, ~0ULL);
enum btree_id ids[BTREE_ID_NR];
for (unsigned i = 0; i < BTREE_ID_NR; i++)
diff --git a/fs/bcachefs/migrate.c b/fs/bcachefs/migrate.c
index 4aa7dd483cab5a400974b5824bfaf0c0af2e9331..d67ca56d3f431980b104ea5528cce79be88f93fb 100644
--- a/fs/bcachefs/migrate.c
+++ b/fs/bcachefs/migrate.c
@@ -266,10 +266,15 @@ int bch2_dev_data_drop_by_backpointers(struct bch_fs *c, unsigned dev_idx, unsig
int bch2_dev_data_drop(struct bch_fs *c, unsigned dev_idx, unsigned flags)
{
struct progress_indicator_state progress;
+ int ret;
+
bch2_progress_init(&progress, c,
- BIT_ULL(BTREE_ID_extents)|
- BIT_ULL(BTREE_ID_reflink));
+ btree_has_data_ptrs_mask & ~BIT_ULL(BTREE_ID_stripes));
+
+ if ((ret = bch2_dev_usrdata_drop(c, &progress, dev_idx, flags)))
+ return ret;
+
+ bch2_progress_init_inner(&progress, c, 0, ~0ULL);
- return bch2_dev_usrdata_drop(c, &progress, dev_idx, flags) ?:
- bch2_dev_metadata_drop(c, &progress, dev_idx, flags);
+ return bch2_dev_metadata_drop(c, &progress, dev_idx, flags);
}
--
2.50.1
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2 00/15] Accounting for accurate progress reporting
2025-08-28 14:16 [PATCH v2 00/15] Accounting for accurate progress reporting Nikita Ofitserov via B4 Relay
` (14 preceding siblings ...)
2025-08-28 14:16 ` [PATCH v2 15/15] bcachefs: More accurate progress reporting for inner node iteration Nikita Ofitserov via B4 Relay
@ 2025-09-01 22:33 ` Kent Overstreet
15 siblings, 0 replies; 19+ messages in thread
From: Kent Overstreet @ 2025-09-01 22:33 UTC (permalink / raw)
To: himikof; +Cc: linux-bcachefs
On Thu, Aug 28, 2025 at 05:16:02PM +0300, Nikita Ofitserov via B4 Relay wrote:
> This patch series introduces new per-btree accounting counters and uses
> them for (hopefully) accurate progress reporting in recovery passes.
> Also includes various assorted bugfixes.
>
> The commit "Relax restrictions on the number of accounting
> counters" is optional, but will likely greatly improve the
> upgrade/tools version mismatch experience. Without it, all btree usage
> accounting will be thrown out and rebuilt on any version mismatch.
>
> The commit "bcachefs_metadata_version_btree_node_accounting"
> introduces the format change along with upgrade/downgrade table entries.
>
> The first ten commits are drive-by fixes/improvements, especially
> "Improve check_allocations pass speed not in fsck", which should make
> future accounting upgrades much faster (including this one).
>
> Signed-off-by: Nikita Ofitserov <himikof@gmail.com>
> ---
> Changes in v2:
> - Reordered the format-change-dependent commits to the end
> - Added the new metadata version, upgrade and downgrade entries
> - Fixed an issue with hidden usage calculation after device removal
> - Link to v1: https://lore.kernel.org/r/20250827-better-progress-v1-0-74c24de7988a@gmail.com
https://evilpiepirate.org/~testdashboard/ci?user=nofitserov&branch=nofitserov
Every patch before "Relax restrictions on the number of accounting
counters" looks good according to the CI; after that the test dashboard
explodes:
https://evilpiepirate.org/~testdashboard/ci?user=nofitserov&branch=nofitserov
And those look like clean and simple bugfixes/refactorings, so I'm going
to pull those into testing.
^ permalink raw reply [flat|nested] 19+ messages in thread