* [PATCH 0/8] Snapshot deletion improvements
@ 2025-05-02 19:59 Kent Overstreet
2025-05-02 19:59 ` [PATCH 1/8] bcachefs: snapshot delete progress indicator Kent Overstreet
` (7 more replies)
0 siblings, 8 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
Snapshot deletion performance has been reported to be an issue, so this
patch series should address that.
The important optimization relies on the fact that if an
extent/dirent/xattr exists, a version of that inode must exist in that
specific snapshot. We can make use of that to avoid a lot of scanning.
Previously only fsck relied on this, so it would spit out a warning and
correct the issue in its in memory data structures if this occurred.
Now, it's turned into a proper fsck_err() with on-disk repair if
detected.
We also want some additional safety with this: if snapshot deletion ever
screws up and doesn't delete all the keys it was supposed to, we want to
be able to automatically repair.
So there's a new on disk format version with an incompatible (i.e. must
be expliticly enabled) feature, where we never delete snapshot keys - we
just mark them as deleted.
This allows fsck to differentiate between "key for deleted snapshot, I
know how to repair this" vs. "key for missing snapshot, we're not sure
what happened and someone should probably take a look".
And snapshot deletion status is now present in sysfs, currently with
btree and position in that btree. We do have per-snapshot-id accounting,
so it might be possible to turn that into a proper progress indicator
later.
Kent Overstreet (8):
bcachefs: snapshot delete progress indicator
bcachefs: Add comments for inode snapshot requirements
bcachefs: kill inode_walker_entry.snapshot
bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot
bcachefs: Skip unrelated snapshot trees in snapshot deletion
bcachefs: BCH_SNAPSHOT_DELETED -> BCH_SNAPSHOT_WILL_DELETE
bcachefs: bcachefs_metadata_version_snapshot_deletion_v2
bcachefs: delete_dead_snapshot_keys_v2()
fs/bcachefs/bcachefs.h | 3 +-
fs/bcachefs/bcachefs_format.h | 3 +-
fs/bcachefs/fsck.c | 78 ++++---
fs/bcachefs/io_write.c | 6 +
fs/bcachefs/sb-errors_format.h | 4 +-
fs/bcachefs/snapshot.c | 389 +++++++++++++++++++++++++--------
fs/bcachefs/snapshot.h | 32 ++-
fs/bcachefs/snapshot_format.h | 4 +-
fs/bcachefs/snapshot_types.h | 56 +++++
fs/bcachefs/subvolume.c | 2 -
fs/bcachefs/subvolume.h | 3 -
fs/bcachefs/subvolume_types.h | 27 ---
fs/bcachefs/super.c | 1 +
fs/bcachefs/sysfs.c | 5 +
fs/bcachefs/xattr.c | 5 +
15 files changed, 445 insertions(+), 173 deletions(-)
create mode 100644 fs/bcachefs/snapshot_types.h
--
2.49.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/8] bcachefs: snapshot delete progress indicator
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
@ 2025-05-02 19:59 ` Kent Overstreet
2025-05-02 19:59 ` [PATCH 2/8] bcachefs: Add comments for inode snapshot requirements Kent Overstreet
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/bcachefs.h | 3 +-
fs/bcachefs/snapshot.c | 132 ++++++++++++++++++++++-------------
fs/bcachefs/snapshot.h | 7 +-
fs/bcachefs/snapshot_types.h | 25 +++++++
fs/bcachefs/subvolume.c | 2 -
fs/bcachefs/subvolume.h | 3 -
fs/bcachefs/super.c | 1 +
fs/bcachefs/sysfs.c | 5 ++
8 files changed, 121 insertions(+), 57 deletions(-)
create mode 100644 fs/bcachefs/snapshot_types.h
diff --git a/fs/bcachefs/bcachefs.h b/fs/bcachefs/bcachefs.h
index 0369dd656d32..cd35d1cf3fbb 100644
--- a/fs/bcachefs/bcachefs.h
+++ b/fs/bcachefs/bcachefs.h
@@ -216,6 +216,7 @@
#include "recovery_passes_types.h"
#include "sb-errors_types.h"
#include "seqmutex.h"
+#include "snapshot_types.h"
#include "time_stats.h"
#include "util.h"
@@ -869,7 +870,7 @@ struct bch_fs {
struct mutex snapshot_table_lock;
struct rw_semaphore snapshot_create_lock;
- struct work_struct snapshot_delete_work;
+ struct snapshot_delete snapshot_delete;
struct work_struct snapshot_wait_for_pagecache_and_delete_work;
snapshot_id_list snapshots_unlinked;
struct mutex snapshots_unlinked_lock;
diff --git a/fs/bcachefs/snapshot.c b/fs/bcachefs/snapshot.c
index 94cf60f76b64..2f2f129ce482 100644
--- a/fs/bcachefs/snapshot.c
+++ b/fs/bcachefs/snapshot.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include "bcachefs.h"
+#include "bbpos.h"
#include "bkey_buf.h"
#include "btree_cache.h"
#include "btree_key_cache.h"
@@ -1346,12 +1347,6 @@ int bch2_snapshot_node_create(struct btree_trans *trans, u32 parent,
* that key to snapshot leaf nodes, where we can mutate it
*/
-struct snapshot_interior_delete {
- u32 id;
- u32 live_child;
-};
-typedef DARRAY(struct snapshot_interior_delete) interior_delete_list;
-
static inline u32 interior_delete_has_id(interior_delete_list *l, u32 id)
{
darray_for_each(*l, i)
@@ -1385,28 +1380,28 @@ static unsigned __live_child(struct snapshot_table *t, u32 id,
return 0;
}
-static unsigned live_child(struct bch_fs *c, u32 id,
- snapshot_id_list *delete_leaves,
- interior_delete_list *delete_interior)
+static unsigned live_child(struct bch_fs *c, u32 id)
{
+ struct snapshot_delete *d = &c->snapshot_delete;
+
rcu_read_lock();
u32 ret = __live_child(rcu_dereference(c->snapshots), id,
- delete_leaves, delete_interior);
+ &d->delete_leaves, &d->delete_interior);
rcu_read_unlock();
return ret;
}
static int delete_dead_snapshots_process_key(struct btree_trans *trans,
struct btree_iter *iter,
- struct bkey_s_c k,
- snapshot_id_list *delete_leaves,
- interior_delete_list *delete_interior)
+ struct bkey_s_c k)
{
- if (snapshot_list_has_id(delete_leaves, k.k->p.snapshot))
+ struct snapshot_delete *d = &trans->c->snapshot_delete;
+
+ if (snapshot_list_has_id(&d->delete_leaves, k.k->p.snapshot))
return bch2_btree_delete_at(trans, iter,
BTREE_UPDATE_internal_snapshot_node);
- u32 live_child = interior_delete_has_id(delete_interior, k.k->p.snapshot);
+ u32 live_child = interior_delete_has_id(&d->delete_interior, k.k->p.snapshot);
if (live_child) {
struct bkey_i *new = bch2_bkey_make_mut_noupdate(trans, k);
int ret = PTR_ERR_OR_ZERO(new);
@@ -1442,14 +1437,13 @@ static int delete_dead_snapshots_process_key(struct btree_trans *trans,
* it doesn't have child snapshot nodes - it's now redundant and we can mark it
* as deleted.
*/
-static int check_should_delete_snapshot(struct btree_trans *trans, struct bkey_s_c k,
- snapshot_id_list *delete_leaves,
- interior_delete_list *delete_interior)
+static int check_should_delete_snapshot(struct btree_trans *trans, struct bkey_s_c k)
{
if (k.k->type != KEY_TYPE_snapshot)
return 0;
struct bch_fs *c = trans->c;
+ struct snapshot_delete *d = &c->snapshot_delete;
struct bkey_s_c_snapshot s = bkey_s_c_to_snapshot(k);
unsigned live_children = 0;
@@ -1460,23 +1454,23 @@ static int check_should_delete_snapshot(struct btree_trans *trans, struct bkey_s
u32 child = le32_to_cpu(s.v->children[i]);
live_children += child &&
- !snapshot_list_has_id(delete_leaves, child);
+ !snapshot_list_has_id(&d->delete_leaves, child);
}
if (live_children == 0) {
- return snapshot_list_add(c, delete_leaves, s.k->p.offset);
+ return snapshot_list_add(c, &d->delete_leaves, s.k->p.offset);
} else if (live_children == 1) {
- struct snapshot_interior_delete d = {
+ struct snapshot_interior_delete n = {
.id = s.k->p.offset,
- .live_child = live_child(c, s.k->p.offset, delete_leaves, delete_interior),
+ .live_child = live_child(c, s.k->p.offset),
};
- if (!d.live_child) {
- bch_err(c, "error finding live child of snapshot %u", d.id);
+ if (!n.live_child) {
+ bch_err(c, "error finding live child of snapshot %u", n.id);
return -EINVAL;
}
- return darray_push(delete_interior, d);
+ return darray_push(&d->delete_interior, n);
} else {
return 0;
}
@@ -1555,39 +1549,48 @@ static int bch2_fix_child_of_deleted_snapshot(struct btree_trans *trans,
return bch2_trans_update(trans, iter, &s->k_i, 0);
}
+static void bch2_snapshot_delete_nodes_to_text(struct printbuf *out, struct snapshot_delete *d)
+{
+ prt_printf(out, "deleting leaves");
+ darray_for_each(d->delete_leaves, i)
+ prt_printf(out, " %u", *i);
+
+ prt_printf(out, " interior");
+ darray_for_each(d->delete_interior, i)
+ prt_printf(out, " %u->%u", i->id, i->live_child);
+}
+
int bch2_delete_dead_snapshots(struct bch_fs *c)
{
if (!test_and_clear_bit(BCH_FS_need_delete_dead_snapshots, &c->flags))
return 0;
struct btree_trans *trans = bch2_trans_get(c);
- snapshot_id_list delete_leaves = {};
- interior_delete_list delete_interior = {};
+ struct snapshot_delete *d = &c->snapshot_delete;
int ret = 0;
/*
* For every snapshot node: If we have no live children and it's not
* pointed to by a subvolume, delete it:
*/
+ mutex_lock(&d->lock);
+ d->running = true;
+ d->pos = BBPOS_MIN;
+
ret = for_each_btree_key(trans, iter, BTREE_ID_snapshots, POS_MIN, 0, k,
- check_should_delete_snapshot(trans, k, &delete_leaves, &delete_interior));
+ check_should_delete_snapshot(trans, k));
+ mutex_unlock(&d->lock);
if (!bch2_err_matches(ret, EROFS))
bch_err_msg(c, ret, "walking snapshots");
if (ret)
goto err;
- if (!delete_leaves.nr && !delete_interior.nr)
+ if (!d->delete_leaves.nr && !d->delete_interior.nr)
goto err;
{
struct printbuf buf = PRINTBUF;
- prt_printf(&buf, "deleting leaves");
- darray_for_each(delete_leaves, i)
- prt_printf(&buf, " %u", *i);
-
- prt_printf(&buf, " interior");
- darray_for_each(delete_interior, i)
- prt_printf(&buf, " %u->%u", i->id, i->live_child);
+ bch2_snapshot_delete_nodes_to_text(&buf, d);
ret = commit_do(trans, NULL, NULL, 0, bch2_trans_log_msg(trans, &buf));
printbuf_exit(&buf);
@@ -1595,19 +1598,21 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
goto err;
}
- for (unsigned btree = 0; btree < BTREE_ID_NR; btree++) {
+ for (d->pos.btree = 0; d->pos.btree < BTREE_ID_NR; d->pos.btree++) {
struct disk_reservation res = { 0 };
- if (!btree_type_has_snapshots(btree))
+ d->pos.pos = POS_MIN;
+
+ if (!btree_type_has_snapshots(d->pos.btree))
continue;
ret = for_each_btree_key_commit(trans, iter,
- btree, POS_MIN,
+ d->pos.btree, POS_MIN,
BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k,
- &res, NULL, BCH_TRANS_COMMIT_no_enospc,
- delete_dead_snapshots_process_key(trans, &iter, k,
- &delete_leaves,
- &delete_interior));
+ &res, NULL, BCH_TRANS_COMMIT_no_enospc, ({
+ d->pos.pos = iter.pos;
+ delete_dead_snapshots_process_key(trans, &iter, k);
+ }));
bch2_disk_reservation_put(c, &res);
@@ -1617,7 +1622,7 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
goto err;
}
- darray_for_each(delete_leaves, i) {
+ darray_for_each(d->delete_leaves, i) {
ret = commit_do(trans, NULL, NULL, 0,
bch2_snapshot_node_delete(trans, *i));
if (!bch2_err_matches(ret, EROFS))
@@ -1634,11 +1639,11 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
ret = for_each_btree_key_commit(trans, iter, BTREE_ID_snapshots, POS_MIN,
BTREE_ITER_intent, k,
NULL, NULL, BCH_TRANS_COMMIT_no_enospc,
- bch2_fix_child_of_deleted_snapshot(trans, &iter, k, &delete_interior));
+ bch2_fix_child_of_deleted_snapshot(trans, &iter, k, &d->delete_interior));
if (ret)
goto err;
- darray_for_each(delete_interior, i) {
+ darray_for_each(d->delete_interior, i) {
ret = commit_do(trans, NULL, NULL, 0,
bch2_snapshot_node_delete(trans, i->id));
if (!bch2_err_matches(ret, EROFS))
@@ -1647,8 +1652,11 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
goto err;
}
err:
- darray_exit(&delete_interior);
- darray_exit(&delete_leaves);
+ mutex_lock(&d->lock);
+ darray_exit(&d->delete_interior);
+ darray_exit(&d->delete_leaves);
+ d->running = false;
+ mutex_unlock(&d->lock);
bch2_trans_put(trans);
if (!bch2_err_matches(ret, EROFS))
bch_err_fn(c, ret);
@@ -1657,7 +1665,7 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
void bch2_delete_dead_snapshots_work(struct work_struct *work)
{
- struct bch_fs *c = container_of(work, struct bch_fs, snapshot_delete_work);
+ struct bch_fs *c = container_of(work, struct bch_fs, snapshot_delete.work);
set_worker_desc("bcachefs-delete-dead-snapshots/%s", c->name);
@@ -1672,10 +1680,27 @@ void bch2_delete_dead_snapshots_async(struct bch_fs *c)
BUG_ON(!test_bit(BCH_FS_may_go_rw, &c->flags));
- if (!queue_work(c->write_ref_wq, &c->snapshot_delete_work))
+ if (!queue_work(c->write_ref_wq, &c->snapshot_delete.work))
enumerated_ref_put(&c->writes, BCH_WRITE_REF_delete_dead_snapshots);
}
+void bch2_snapshot_delete_status_to_text(struct printbuf *out, struct bch_fs *c)
+{
+ struct snapshot_delete *d = &c->snapshot_delete;
+
+ if (!d->running) {
+ prt_str(out, "(not running)");
+ return;
+ }
+
+ mutex_lock(&d->lock);
+ bch2_snapshot_delete_nodes_to_text(out, d);
+ prt_newline(out);
+ mutex_unlock(&d->lock);
+
+ bch2_bbpos_to_text(out, d->pos);
+}
+
int __bch2_key_has_snapshot_overwrites(struct btree_trans *trans,
enum btree_id id,
struct bpos pos)
@@ -1750,3 +1775,10 @@ void bch2_fs_snapshots_exit(struct bch_fs *c)
{
kvfree(rcu_dereference_protected(c->snapshots, true));
}
+
+void bch2_fs_snapshots_init_early(struct bch_fs *c)
+{
+ INIT_WORK(&c->snapshot_delete.work, bch2_delete_dead_snapshots_work);
+ mutex_init(&c->snapshot_delete.lock);
+ mutex_init(&c->snapshots_unlinked_lock);
+}
diff --git a/fs/bcachefs/snapshot.h b/fs/bcachefs/snapshot.h
index 81180181d7c9..24a451bb7024 100644
--- a/fs/bcachefs/snapshot.h
+++ b/fs/bcachefs/snapshot.h
@@ -244,7 +244,6 @@ int bch2_reconstruct_snapshots(struct bch_fs *);
int bch2_check_key_has_snapshot(struct btree_trans *, struct btree_iter *, struct bkey_s_c);
int bch2_snapshot_node_set_deleted(struct btree_trans *, u32);
-void bch2_delete_dead_snapshots_work(struct work_struct *);
int __bch2_key_has_snapshot_overwrites(struct btree_trans *, enum btree_id, struct bpos);
@@ -259,7 +258,13 @@ static inline int bch2_key_has_snapshot_overwrites(struct btree_trans *trans,
return __bch2_key_has_snapshot_overwrites(trans, id, pos);
}
+int bch2_delete_dead_snapshots(struct bch_fs *);
+void bch2_delete_dead_snapshots_work(struct work_struct *);
+void bch2_delete_dead_snapshots_async(struct bch_fs *);
+void bch2_snapshot_delete_status_to_text(struct printbuf *, struct bch_fs *);
+
int bch2_snapshots_read(struct bch_fs *);
void bch2_fs_snapshots_exit(struct bch_fs *);
+void bch2_fs_snapshots_init_early(struct bch_fs *);
#endif /* _BCACHEFS_SNAPSHOT_H */
diff --git a/fs/bcachefs/snapshot_types.h b/fs/bcachefs/snapshot_types.h
new file mode 100644
index 000000000000..bb67a6beb6e3
--- /dev/null
+++ b/fs/bcachefs/snapshot_types.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _BCACHEFS_SNAPSHOT_TYPES_H
+#define _BCACHEFS_SNAPSHOT_TYPES_H
+
+#include "bbpos_types.h"
+#include "subvolume_types.h"
+
+struct snapshot_interior_delete {
+ u32 id;
+ u32 live_child;
+};
+typedef DARRAY(struct snapshot_interior_delete) interior_delete_list;
+
+struct snapshot_delete {
+ struct work_struct work;
+
+ struct mutex lock;
+ snapshot_id_list delete_leaves;
+ interior_delete_list delete_interior;
+
+ bool running;
+ struct bbpos pos;
+};
+
+#endif /* _BCACHEFS_SNAPSHOT_TYPES_H */
diff --git a/fs/bcachefs/subvolume.c b/fs/bcachefs/subvolume.c
index 51ab2ee10706..3c6ba1469de2 100644
--- a/fs/bcachefs/subvolume.c
+++ b/fs/bcachefs/subvolume.c
@@ -730,8 +730,6 @@ int bch2_fs_upgrade_for_subvolumes(struct bch_fs *c)
void bch2_fs_subvolumes_init_early(struct bch_fs *c)
{
- INIT_WORK(&c->snapshot_delete_work, bch2_delete_dead_snapshots_work);
INIT_WORK(&c->snapshot_wait_for_pagecache_and_delete_work,
bch2_subvolume_wait_for_pagecache_and_delete);
- mutex_init(&c->snapshots_unlinked_lock);
}
diff --git a/fs/bcachefs/subvolume.h b/fs/bcachefs/subvolume.h
index ee5e4e5a0fc8..075f55e25c70 100644
--- a/fs/bcachefs/subvolume.h
+++ b/fs/bcachefs/subvolume.h
@@ -77,9 +77,6 @@ bch2_btree_iter_peek_in_subvolume_max_type(struct btree_trans *trans, struct btr
_end, _subvolid, _flags, _k, _do); \
})
-int bch2_delete_dead_snapshots(struct bch_fs *);
-void bch2_delete_dead_snapshots_async(struct bch_fs *);
-
int bch2_subvolume_unlink(struct btree_trans *, u32);
int bch2_subvolume_create(struct btree_trans *, u64, u32, u32, u32 *, u32 *, bool);
diff --git a/fs/bcachefs/super.c b/fs/bcachefs/super.c
index 03efda996348..9626468600af 100644
--- a/fs/bcachefs/super.c
+++ b/fs/bcachefs/super.c
@@ -864,6 +864,7 @@ static struct bch_fs *bch2_fs_alloc(struct bch_sb *sb, struct bch_opts *opts,
bch2_fs_quota_init(c);
bch2_fs_rebalance_init(c);
bch2_fs_sb_errors_init_early(c);
+ bch2_fs_snapshots_init_early(c);
bch2_fs_subvolumes_init_early(c);
INIT_LIST_HEAD(&c->list);
diff --git a/fs/bcachefs/sysfs.c b/fs/bcachefs/sysfs.c
index 1d0c0f24a7b9..adf99a805a62 100644
--- a/fs/bcachefs/sysfs.c
+++ b/fs/bcachefs/sysfs.c
@@ -198,6 +198,7 @@ read_attribute(copy_gc_wait);
sysfs_pd_controller_attribute(rebalance);
read_attribute(rebalance_status);
+read_attribute(snapshot_delete_status);
read_attribute(new_stripes);
@@ -320,6 +321,9 @@ SHOW(bch2_fs)
if (attr == &sysfs_rebalance_status)
bch2_rebalance_status_to_text(out, c);
+ if (attr == &sysfs_snapshot_delete_status)
+ bch2_snapshot_delete_status_to_text(out, c);
+
/* Debugging: */
if (attr == &sysfs_journal_debug)
@@ -466,6 +470,7 @@ struct attribute *bch2_fs_files[] = {
&sysfs_btree_write_stats,
&sysfs_rebalance_status,
+ &sysfs_snapshot_delete_status,
&sysfs_compression_stats,
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/8] bcachefs: Add comments for inode snapshot requirements
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
2025-05-02 19:59 ` [PATCH 1/8] bcachefs: snapshot delete progress indicator Kent Overstreet
@ 2025-05-02 19:59 ` Kent Overstreet
2025-05-02 19:59 ` [PATCH 3/8] bcachefs: kill inode_walker_entry.snapshot Kent Overstreet
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/io_write.c | 6 ++++++
fs/bcachefs/xattr.c | 5 +++++
2 files changed, 11 insertions(+)
diff --git a/fs/bcachefs/io_write.c b/fs/bcachefs/io_write.c
index add141ac45b5..399df8fede8b 100644
--- a/fs/bcachefs/io_write.c
+++ b/fs/bcachefs/io_write.c
@@ -279,6 +279,12 @@ static inline int bch2_extent_update_i_size_sectors(struct btree_trans *trans,
inode_update_flags = 0;
}
+ /*
+ * extents, dirents and xattrs updates require that an inode update also
+ * happens - to ensure that if a key exists in one of those btrees with
+ * a given snapshot ID an inode is also present - so we may have to skip
+ * the nojournal optimization:
+ */
if (inode->k.p.snapshot != iter.snapshot) {
inode->k.p.snapshot = iter.snapshot;
inode_update_flags = 0;
diff --git a/fs/bcachefs/xattr.c b/fs/bcachefs/xattr.c
index b8bc2fb04f15..ea3f87f6fe49 100644
--- a/fs/bcachefs/xattr.c
+++ b/fs/bcachefs/xattr.c
@@ -176,6 +176,11 @@ int bch2_xattr_set(struct btree_trans *trans, subvol_inum inum,
if (ret)
return ret;
+ /*
+ * Besides the ctime update, extents, dirents and xattrs updates require
+ * that an inode update also happens - to ensure that if a key exists in
+ * one of those btrees with a given snapshot ID an inode is also present
+ */
inode_u->bi_ctime = bch2_current_time(c);
ret = bch2_inode_write(trans, &inode_iter, inode_u);
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/8] bcachefs: kill inode_walker_entry.snapshot
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
2025-05-02 19:59 ` [PATCH 1/8] bcachefs: snapshot delete progress indicator Kent Overstreet
2025-05-02 19:59 ` [PATCH 2/8] bcachefs: Add comments for inode snapshot requirements Kent Overstreet
@ 2025-05-02 19:59 ` Kent Overstreet
2025-05-02 19:59 ` [PATCH 4/8] bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot Kent Overstreet
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
redundant
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/fsck.c | 46 ++++++++++++++++++++++------------------------
1 file changed, 22 insertions(+), 24 deletions(-)
diff --git a/fs/bcachefs/fsck.c b/fs/bcachefs/fsck.c
index 7d3dd1a0ae4f..f0aa3c7479e1 100644
--- a/fs/bcachefs/fsck.c
+++ b/fs/bcachefs/fsck.c
@@ -785,12 +785,11 @@ static int ref_visible2(struct bch_fs *c,
#define for_each_visible_inode(_c, _s, _w, _snapshot, _i) \
for (_i = (_w)->inodes.data; _i < (_w)->inodes.data + (_w)->inodes.nr && \
- (_i)->snapshot <= (_snapshot); _i++) \
- if (key_visible_in_snapshot(_c, _s, _i->snapshot, _snapshot))
+ (_i)->inode.bi_snapshot <= (_snapshot); _i++) \
+ if (key_visible_in_snapshot(_c, _s, _i->inode.bi_snapshot, _snapshot))
struct inode_walker_entry {
struct bch_inode_unpacked inode;
- u32 snapshot;
u64 count;
u64 i_size;
};
@@ -824,7 +823,6 @@ static int add_inode(struct bch_fs *c, struct inode_walker *w,
return bch2_inode_unpack(inode, &u) ?:
darray_push(&w->inodes, ((struct inode_walker_entry) {
.inode = u,
- .snapshot = inode.k->p.snapshot,
}));
}
@@ -870,19 +868,19 @@ lookup_inode_for_snapshot(struct bch_fs *c, struct inode_walker *w, struct bkey_
struct inode_walker_entry *i;
__darray_for_each(w->inodes, i)
- if (bch2_snapshot_is_ancestor(c, k.k->p.snapshot, i->snapshot))
+ if (bch2_snapshot_is_ancestor(c, k.k->p.snapshot, i->inode.bi_snapshot))
goto found;
return NULL;
found:
- BUG_ON(k.k->p.snapshot > i->snapshot);
+ BUG_ON(k.k->p.snapshot > i->inode.bi_snapshot);
- if (k.k->p.snapshot != i->snapshot && !is_whiteout) {
+ if (k.k->p.snapshot != i->inode.bi_snapshot && !is_whiteout) {
struct inode_walker_entry new = *i;
- new.snapshot = k.k->p.snapshot;
- new.count = 0;
- new.i_size = 0;
+ new.inode.bi_snapshot = k.k->p.snapshot;
+ new.count = 0;
+ new.i_size = 0;
struct printbuf buf = PRINTBUF;
bch2_bkey_val_to_text(&buf, c, k);
@@ -890,10 +888,10 @@ lookup_inode_for_snapshot(struct bch_fs *c, struct inode_walker *w, struct bkey_
bch_info(c, "have key for inode %llu:%u but have inode in ancestor snapshot %u\n"
"unexpected because we should always update the inode when we update a key in that inode\n"
"%s",
- w->last_pos.inode, k.k->p.snapshot, i->snapshot, buf.buf);
+ w->last_pos.inode, k.k->p.snapshot, i->inode.bi_snapshot, buf.buf);
printbuf_exit(&buf);
- while (i > w->inodes.data && i[-1].snapshot > k.k->p.snapshot)
+ while (i > w->inodes.data && i[-1].inode.bi_snapshot > k.k->p.snapshot)
--i;
size_t pos = i - w->inodes.data;
@@ -1496,21 +1494,21 @@ static int check_i_sectors_notnested(struct btree_trans *trans, struct inode_wal
if (i->inode.bi_sectors == i->count)
continue;
- count2 = bch2_count_inode_sectors(trans, w->last_pos.inode, i->snapshot);
+ count2 = bch2_count_inode_sectors(trans, w->last_pos.inode, i->inode.bi_snapshot);
if (w->recalculate_sums)
i->count = count2;
if (i->count != count2) {
bch_err_ratelimited(c, "fsck counted i_sectors wrong for inode %llu:%u: got %llu should be %llu",
- w->last_pos.inode, i->snapshot, i->count, count2);
+ w->last_pos.inode, i->inode.bi_snapshot, i->count, count2);
i->count = count2;
}
if (fsck_err_on(!(i->inode.bi_flags & BCH_INODE_i_sectors_dirty),
trans, inode_i_sectors_wrong,
"inode %llu:%u has incorrect i_sectors: got %llu, should be %llu",
- w->last_pos.inode, i->snapshot,
+ w->last_pos.inode, i->inode.bi_snapshot,
i->inode.bi_sectors, i->count)) {
i->inode.bi_sectors = i->count;
ret = bch2_fsck_write_inode(trans, &i->inode);
@@ -1821,20 +1819,20 @@ static int check_extent(struct btree_trans *trans, struct btree_iter *iter,
for (struct inode_walker_entry *i = extent_i ?: &darray_last(inode->inodes);
inode->inodes.data && i >= inode->inodes.data;
--i) {
- if (i->snapshot > k.k->p.snapshot ||
- !key_visible_in_snapshot(c, s, i->snapshot, k.k->p.snapshot))
+ if (i->inode.bi_snapshot > k.k->p.snapshot ||
+ !key_visible_in_snapshot(c, s, i->inode.bi_snapshot, k.k->p.snapshot))
continue;
if (fsck_err_on(k.k->p.offset > round_up(i->inode.bi_size, block_bytes(c)) >> 9 &&
!bkey_extent_is_reservation(k),
trans, extent_past_end_of_inode,
"extent type past end of inode %llu:%u, i_size %llu\n%s",
- i->inode.bi_inum, i->snapshot, i->inode.bi_size,
+ i->inode.bi_inum, i->inode.bi_snapshot, i->inode.bi_size,
(bch2_bkey_val_to_text(&buf, c, k), buf.buf))) {
struct btree_iter iter2;
bch2_trans_copy_iter(trans, &iter2, iter);
- bch2_btree_iter_set_snapshot(trans, &iter2, i->snapshot);
+ bch2_btree_iter_set_snapshot(trans, &iter2, i->inode.bi_snapshot);
ret = bch2_btree_iter_traverse(trans, &iter2) ?:
bch2_btree_delete_at(trans, &iter2,
BTREE_UPDATE_internal_snapshot_node);
@@ -1856,8 +1854,8 @@ static int check_extent(struct btree_trans *trans, struct btree_iter *iter,
for (struct inode_walker_entry *i = extent_i ?: &darray_last(inode->inodes);
inode->inodes.data && i >= inode->inodes.data;
--i) {
- if (i->snapshot > k.k->p.snapshot ||
- !key_visible_in_snapshot(c, s, i->snapshot, k.k->p.snapshot))
+ if (i->inode.bi_snapshot > k.k->p.snapshot ||
+ !key_visible_in_snapshot(c, s, i->inode.bi_snapshot, k.k->p.snapshot))
continue;
i->count += k.k->size;
@@ -1939,13 +1937,13 @@ static int check_subdir_count_notnested(struct btree_trans *trans, struct inode_
if (i->inode.bi_nlink == i->count)
continue;
- count2 = bch2_count_subdirs(trans, w->last_pos.inode, i->snapshot);
+ count2 = bch2_count_subdirs(trans, w->last_pos.inode, i->inode.bi_snapshot);
if (count2 < 0)
return count2;
if (i->count != count2) {
bch_err_ratelimited(c, "fsck counted subdirectories wrong for inum %llu:%u: got %llu should be %llu",
- w->last_pos.inode, i->snapshot, i->count, count2);
+ w->last_pos.inode, i->inode.bi_snapshot, i->count, count2);
i->count = count2;
if (i->inode.bi_nlink == i->count)
continue;
@@ -1954,7 +1952,7 @@ static int check_subdir_count_notnested(struct btree_trans *trans, struct inode_
if (fsck_err_on(i->inode.bi_nlink != i->count,
trans, inode_dir_wrong_nlink,
"directory %llu:%u with wrong i_nlink: got %u, should be %llu",
- w->last_pos.inode, i->snapshot, i->inode.bi_nlink, i->count)) {
+ w->last_pos.inode, i->inode.bi_snapshot, i->inode.bi_nlink, i->count)) {
i->inode.bi_nlink = i->count;
ret = bch2_fsck_write_inode(trans, &i->inode);
if (ret)
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 4/8] bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
` (2 preceding siblings ...)
2025-05-02 19:59 ` [PATCH 3/8] bcachefs: kill inode_walker_entry.snapshot Kent Overstreet
@ 2025-05-02 19:59 ` Kent Overstreet
2025-05-02 19:59 ` [PATCH 5/8] bcachefs: Skip unrelated snapshot trees in snapshot deletion Kent Overstreet
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
We're going to be doing some snapshot deletion performance improvements,
and those will strictly require that if an extent/dirent/xattr is
present, an inode is present in that snapshot ID.
We already check for this, but we don't repair it on disk: this patch
adds that repair and turns it into a real fsck_err().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/fsck.c | 44 ++++++++++++++++------------------
fs/bcachefs/sb-errors_format.h | 3 ++-
2 files changed, 23 insertions(+), 24 deletions(-)
diff --git a/fs/bcachefs/fsck.c b/fs/bcachefs/fsck.c
index f0aa3c7479e1..dd88113a7c70 100644
--- a/fs/bcachefs/fsck.c
+++ b/fs/bcachefs/fsck.c
@@ -862,9 +862,9 @@ static int get_inodes_all_snapshots(struct btree_trans *trans,
}
static struct inode_walker_entry *
-lookup_inode_for_snapshot(struct bch_fs *c, struct inode_walker *w, struct bkey_s_c k)
+lookup_inode_for_snapshot(struct btree_trans *trans, struct inode_walker *w, struct bkey_s_c k)
{
- bool is_whiteout = k.k->type == KEY_TYPE_whiteout;
+ struct bch_fs *c = trans->c;
struct inode_walker_entry *i;
__darray_for_each(w->inodes, i)
@@ -875,34 +875,32 @@ lookup_inode_for_snapshot(struct bch_fs *c, struct inode_walker *w, struct bkey_
found:
BUG_ON(k.k->p.snapshot > i->inode.bi_snapshot);
- if (k.k->p.snapshot != i->inode.bi_snapshot && !is_whiteout) {
- struct inode_walker_entry new = *i;
-
- new.inode.bi_snapshot = k.k->p.snapshot;
- new.count = 0;
- new.i_size = 0;
-
- struct printbuf buf = PRINTBUF;
- bch2_bkey_val_to_text(&buf, c, k);
+ struct printbuf buf = PRINTBUF;
+ int ret = 0;
- bch_info(c, "have key for inode %llu:%u but have inode in ancestor snapshot %u\n"
+ if (fsck_err_on(k.k->p.snapshot != i->inode.bi_snapshot,
+ trans, snapshot_key_missing_inode_snapshot,
+ "have key for inode %llu:%u but have inode in ancestor snapshot %u\n"
"unexpected because we should always update the inode when we update a key in that inode\n"
"%s",
- w->last_pos.inode, k.k->p.snapshot, i->inode.bi_snapshot, buf.buf);
- printbuf_exit(&buf);
+ w->last_pos.inode, k.k->p.snapshot, i->inode.bi_snapshot,
+ (bch2_bkey_val_to_text(&buf, c, k),
+ buf.buf))) {
+ struct bch_inode_unpacked new = i->inode;
- while (i > w->inodes.data && i[-1].inode.bi_snapshot > k.k->p.snapshot)
- --i;
+ new.bi_snapshot = k.k->p.snapshot;
- size_t pos = i - w->inodes.data;
- int ret = darray_insert_item(&w->inodes, pos, new);
- if (ret)
- return ERR_PTR(ret);
-
- i = w->inodes.data + pos;
+ ret = __bch2_fsck_write_inode(trans, &new) ?:
+ bch2_trans_commit(trans, NULL, NULL, 0) ?:
+ -BCH_ERR_transaction_restart_nested;
+ goto fsck_err;
}
+ printbuf_exit(&buf);
return i;
+fsck_err:
+ printbuf_exit(&buf);
+ return ERR_PTR(ret);
}
static struct inode_walker_entry *walk_inode(struct btree_trans *trans,
@@ -917,7 +915,7 @@ static struct inode_walker_entry *walk_inode(struct btree_trans *trans,
w->last_pos = k.k->p;
- return lookup_inode_for_snapshot(trans->c, w, k);
+ return lookup_inode_for_snapshot(trans, w, k);
}
static int get_visible_inodes(struct btree_trans *trans,
diff --git a/fs/bcachefs/sb-errors_format.h b/fs/bcachefs/sb-errors_format.h
index 6389f6e0f8dc..82bc1906aa00 100644
--- a/fs/bcachefs/sb-errors_format.h
+++ b/fs/bcachefs/sb-errors_format.h
@@ -216,6 +216,7 @@ enum bch_fsck_flags {
x(inode_str_hash_invalid, 194, 0) \
x(inode_v3_fields_start_bad, 195, 0) \
x(inode_snapshot_mismatch, 196, 0) \
+ x(snapshot_key_missing_inode_snapshot, 314, FSCK_AUTOFIX) \
x(inode_unlinked_but_clean, 197, 0) \
x(inode_unlinked_but_nlink_nonzero, 198, 0) \
x(inode_unlinked_and_not_open, 281, 0) \
@@ -323,7 +324,7 @@ enum bch_fsck_flags {
x(dirent_stray_data_after_cf_name, 305, 0) \
x(rebalance_work_incorrectly_set, 309, FSCK_AUTOFIX) \
x(rebalance_work_incorrectly_unset, 310, FSCK_AUTOFIX) \
- x(MAX, 314, 0)
+ x(MAX, 315, 0)
enum bch_sb_error_id {
#define x(t, n, ...) BCH_FSCK_ERR_##t = n,
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 5/8] bcachefs: Skip unrelated snapshot trees in snapshot deletion
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
` (3 preceding siblings ...)
2025-05-02 19:59 ` [PATCH 4/8] bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot Kent Overstreet
@ 2025-05-02 19:59 ` Kent Overstreet
2025-05-02 19:59 ` [PATCH 6/8] bcachefs: BCH_SNAPSHOT_DELETED -> BCH_SNAPSHOT_WILL_DELETE Kent Overstreet
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
Don't scan keys in inodes for which the snapshot tree doesn't match any
we're deleting from.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/snapshot.c | 35 +++++++++++++++++++++++++++++++++--
fs/bcachefs/snapshot_types.h | 1 +
2 files changed, 34 insertions(+), 2 deletions(-)
diff --git a/fs/bcachefs/snapshot.c b/fs/bcachefs/snapshot.c
index 2f2f129ce482..219cba038778 100644
--- a/fs/bcachefs/snapshot.c
+++ b/fs/bcachefs/snapshot.c
@@ -1432,6 +1432,24 @@ static int delete_dead_snapshots_process_key(struct btree_trans *trans,
return 0;
}
+static bool skip_unrelated_snapshot_tree(struct btree_trans *trans, struct btree_iter *iter)
+{
+ struct bch_fs *c = trans->c;
+ struct snapshot_delete *d = &c->snapshot_delete;
+
+ bool ret = !snapshot_list_has_id(&d->deleting_from_trees,
+ bch2_snapshot_tree(c, iter->pos.snapshot));
+ if (unlikely(ret)) {
+ struct bpos pos = iter->pos;
+ pos.snapshot = 0;
+ if (iter->btree_id != BTREE_ID_inodes)
+ pos.offset = U64_MAX;
+ bch2_btree_iter_set_pos(trans, iter, bpos_nosnap_successor(pos));
+ }
+
+ return ret;
+}
+
/*
* For a given snapshot, if it doesn't have a subvolume that points to it, and
* it doesn't have child snapshot nodes - it's now redundant and we can mark it
@@ -1457,8 +1475,11 @@ static int check_should_delete_snapshot(struct btree_trans *trans, struct bkey_s
!snapshot_list_has_id(&d->delete_leaves, child);
}
+ u32 tree = bch2_snapshot_tree(c, s.k->p.offset);
+
if (live_children == 0) {
- return snapshot_list_add(c, &d->delete_leaves, s.k->p.offset);
+ return snapshot_list_add_nodup(c, &d->deleting_from_trees, tree) ?:
+ snapshot_list_add(c, &d->delete_leaves, s.k->p.offset);
} else if (live_children == 1) {
struct snapshot_interior_delete n = {
.id = s.k->p.offset,
@@ -1470,7 +1491,8 @@ static int check_should_delete_snapshot(struct btree_trans *trans, struct bkey_s
return -EINVAL;
}
- return darray_push(&d->delete_interior, n);
+ return snapshot_list_add_nodup(c, &d->deleting_from_trees, tree) ?:
+ darray_push(&d->delete_interior, n);
} else {
return 0;
}
@@ -1551,6 +1573,10 @@ static int bch2_fix_child_of_deleted_snapshot(struct btree_trans *trans,
static void bch2_snapshot_delete_nodes_to_text(struct printbuf *out, struct snapshot_delete *d)
{
+ prt_printf(out, "deleting from trees");
+ darray_for_each(d->deleting_from_trees, i)
+ prt_printf(out, " %u", *i);
+
prt_printf(out, "deleting leaves");
darray_for_each(d->delete_leaves, i)
prt_printf(out, " %u", *i);
@@ -1611,6 +1637,10 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k,
&res, NULL, BCH_TRANS_COMMIT_no_enospc, ({
d->pos.pos = iter.pos;
+
+ if (skip_unrelated_snapshot_tree(trans, &iter))
+ continue;
+
delete_dead_snapshots_process_key(trans, &iter, k);
}));
@@ -1653,6 +1683,7 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
}
err:
mutex_lock(&d->lock);
+ darray_exit(&d->deleting_from_trees);
darray_exit(&d->delete_interior);
darray_exit(&d->delete_leaves);
d->running = false;
diff --git a/fs/bcachefs/snapshot_types.h b/fs/bcachefs/snapshot_types.h
index bb67a6beb6e3..39fb47f43183 100644
--- a/fs/bcachefs/snapshot_types.h
+++ b/fs/bcachefs/snapshot_types.h
@@ -15,6 +15,7 @@ struct snapshot_delete {
struct work_struct work;
struct mutex lock;
+ snapshot_id_list deleting_from_trees;
snapshot_id_list delete_leaves;
interior_delete_list delete_interior;
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 6/8] bcachefs: BCH_SNAPSHOT_DELETED -> BCH_SNAPSHOT_WILL_DELETE
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
` (4 preceding siblings ...)
2025-05-02 19:59 ` [PATCH 5/8] bcachefs: Skip unrelated snapshot trees in snapshot deletion Kent Overstreet
@ 2025-05-02 19:59 ` Kent Overstreet
2025-05-02 19:59 ` [PATCH 7/8] bcachefs: bcachefs_metadata_version_snapshot_deletion_v2 Kent Overstreet
2025-05-02 20:00 ` [PATCH 8/8] bcachefs: delete_dead_snapshot_keys_v2() Kent Overstreet
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
We're going to be speeding up snapshot deletion, by only having it
process the extents/dirents/xattrs btrees if an inode of a given
snapshot ID was present.
This raises the possibility of 'bkey_in_missing_snapshot' errors popping
up, if we ever accidentally don't do the corresponding inode update, or
if the new algorithm has bugs.
So we'll want to be able to differentiate more definitively between
'snapshot went missing' (and perhaps needs to be reconstructed), and
'key in snapshot that was deleted'.
So instead of deleting snapshot IDs, we'll be adding a new deleted flag
and leaving them permanently.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/snapshot.c | 12 ++++++------
fs/bcachefs/snapshot_format.h | 2 +-
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/bcachefs/snapshot.c b/fs/bcachefs/snapshot.c
index 219cba038778..7349f7f33a4f 100644
--- a/fs/bcachefs/snapshot.c
+++ b/fs/bcachefs/snapshot.c
@@ -213,7 +213,7 @@ void bch2_snapshot_to_text(struct printbuf *out, struct bch_fs *c,
prt_printf(out, "is_subvol %llu deleted %llu parent %10u children %10u %10u subvol %u tree %u",
BCH_SNAPSHOT_SUBVOL(s.v),
- BCH_SNAPSHOT_DELETED(s.v),
+ BCH_SNAPSHOT_WILL_DELETE(s.v),
le32_to_cpu(s.v->parent),
le32_to_cpu(s.v->children[0]),
le32_to_cpu(s.v->children[1]),
@@ -339,7 +339,7 @@ static int __bch2_mark_snapshot(struct btree_trans *trans,
parent - id - 1 < IS_ANCESTOR_BITMAP)
__set_bit(parent - id - 1, t->is_ancestor);
- if (BCH_SNAPSHOT_DELETED(s.v)) {
+ if (BCH_SNAPSHOT_WILL_DELETE(s.v)) {
set_bit(BCH_FS_need_delete_dead_snapshots, &c->flags);
if (c->curr_recovery_pass > BCH_RECOVERY_PASS_delete_dead_snapshots)
bch2_delete_dead_snapshots_async(c);
@@ -748,7 +748,7 @@ static int check_snapshot(struct btree_trans *trans,
}
bool should_have_subvol = BCH_SNAPSHOT_SUBVOL(&s) &&
- !BCH_SNAPSHOT_DELETED(&s);
+ !BCH_SNAPSHOT_WILL_DELETE(&s);
if (should_have_subvol) {
id = le32_to_cpu(s.subvol);
@@ -1062,10 +1062,10 @@ int bch2_snapshot_node_set_deleted(struct btree_trans *trans, u32 id)
}
/* already deleted? */
- if (BCH_SNAPSHOT_DELETED(&s->v))
+ if (BCH_SNAPSHOT_WILL_DELETE(&s->v))
goto err;
- SET_BCH_SNAPSHOT_DELETED(&s->v, true);
+ SET_BCH_SNAPSHOT_WILL_DELETE(&s->v, true);
SET_BCH_SNAPSHOT_SUBVOL(&s->v, false);
s->v.subvol = 0;
err:
@@ -1770,7 +1770,7 @@ static int bch2_check_snapshot_needs_deletion(struct btree_trans *trans, struct
return 0;
struct bkey_s_c_snapshot snap = bkey_s_c_to_snapshot(k);
- if (BCH_SNAPSHOT_DELETED(snap.v) ||
+ if (BCH_SNAPSHOT_WILL_DELETE(snap.v) ||
interior_snapshot_needs_delete(snap))
set_bit(BCH_FS_need_delete_dead_snapshots, &trans->c->flags);
diff --git a/fs/bcachefs/snapshot_format.h b/fs/bcachefs/snapshot_format.h
index aabcd3a74cd9..685a9fe209ab 100644
--- a/fs/bcachefs/snapshot_format.h
+++ b/fs/bcachefs/snapshot_format.h
@@ -15,7 +15,7 @@ struct bch_snapshot {
bch_le128 btime;
};
-LE32_BITMASK(BCH_SNAPSHOT_DELETED, struct bch_snapshot, flags, 0, 1)
+LE32_BITMASK(BCH_SNAPSHOT_WILL_DELETE, struct bch_snapshot, flags, 0, 1)
/* True if a subvolume points to this snapshot node: */
LE32_BITMASK(BCH_SNAPSHOT_SUBVOL, struct bch_snapshot, flags, 1, 2)
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 7/8] bcachefs: bcachefs_metadata_version_snapshot_deletion_v2
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
` (5 preceding siblings ...)
2025-05-02 19:59 ` [PATCH 6/8] bcachefs: BCH_SNAPSHOT_DELETED -> BCH_SNAPSHOT_WILL_DELETE Kent Overstreet
@ 2025-05-02 19:59 ` Kent Overstreet
2025-05-02 20:00 ` [PATCH 8/8] bcachefs: delete_dead_snapshot_keys_v2() Kent Overstreet
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 19:59 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
We're going to be speeding up snapshot deletion, by only having it
process the extents/dirents/xattrs btrees if an inode of a given
snapshot ID was present.
This raises the possibility of 'bkey_in_missing_snapshot' errors popping
up, if we ever accidentally don't do the corresponding inode update, or
if the new algorithm has bugs.
So instead of deleting snapshot IDs, add a new deleted flag, so that
'key in missing snapshot' errors can more definitively tell what
happened and automatically repair.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/bcachefs_format.h | 3 +-
fs/bcachefs/sb-errors_format.h | 3 +-
fs/bcachefs/snapshot.c | 80 ++++++++++++++++++++++++++--------
fs/bcachefs/snapshot.h | 25 ++++++++---
fs/bcachefs/snapshot_format.h | 2 +-
fs/bcachefs/snapshot_types.h | 30 +++++++++++++
fs/bcachefs/subvolume_types.h | 27 ------------
7 files changed, 116 insertions(+), 54 deletions(-)
diff --git a/fs/bcachefs/bcachefs_format.h b/fs/bcachefs/bcachefs_format.h
index 7ce475c565b5..0beff6af7ecf 100644
--- a/fs/bcachefs/bcachefs_format.h
+++ b/fs/bcachefs/bcachefs_format.h
@@ -695,7 +695,8 @@ struct bch_sb_field_ext {
x(stripe_backpointers, BCH_VERSION(1, 22)) \
x(stripe_lru, BCH_VERSION(1, 23)) \
x(casefolding, BCH_VERSION(1, 24)) \
- x(extent_flags, BCH_VERSION(1, 25))
+ x(extent_flags, BCH_VERSION(1, 25)) \
+ x(snapshot_deletion_v2, BCH_VERSION(1, 26))
enum bcachefs_metadata_version {
bcachefs_metadata_version_min = 9,
diff --git a/fs/bcachefs/sb-errors_format.h b/fs/bcachefs/sb-errors_format.h
index 82bc1906aa00..448326c01d13 100644
--- a/fs/bcachefs/sb-errors_format.h
+++ b/fs/bcachefs/sb-errors_format.h
@@ -209,6 +209,7 @@ enum bch_fsck_flags {
x(subvol_to_missing_root, 188, 0) \
x(subvol_root_wrong_bi_subvol, 189, FSCK_AUTOFIX) \
x(bkey_in_missing_snapshot, 190, 0) \
+ x(bkey_in_deleted_snapshot, 315, 0) \
x(inode_pos_inode_nonzero, 191, 0) \
x(inode_pos_blockdev_range, 192, 0) \
x(inode_alloc_cursor_inode_bad, 301, 0) \
@@ -324,7 +325,7 @@ enum bch_fsck_flags {
x(dirent_stray_data_after_cf_name, 305, 0) \
x(rebalance_work_incorrectly_set, 309, FSCK_AUTOFIX) \
x(rebalance_work_incorrectly_unset, 310, FSCK_AUTOFIX) \
- x(MAX, 315, 0)
+ x(MAX, 316, 0)
enum bch_sb_error_id {
#define x(t, n, ...) BCH_FSCK_ERR_##t = n,
diff --git a/fs/bcachefs/snapshot.c b/fs/bcachefs/snapshot.c
index 7349f7f33a4f..f074b9de5024 100644
--- a/fs/bcachefs/snapshot.c
+++ b/fs/bcachefs/snapshot.c
@@ -314,7 +314,9 @@ static int __bch2_mark_snapshot(struct btree_trans *trans,
if (new.k->type == KEY_TYPE_snapshot) {
struct bkey_s_c_snapshot s = bkey_s_c_to_snapshot(new);
- t->live = true;
+ t->state = !BCH_SNAPSHOT_DELETED(s.v)
+ ? SNAPSHOT_ID_live
+ : SNAPSHOT_ID_deleted;
t->parent = le32_to_cpu(s.v->parent);
t->children[0] = le32_to_cpu(s.v->children[0]);
t->children[1] = le32_to_cpu(s.v->children[1]);
@@ -711,6 +713,9 @@ static int check_snapshot(struct btree_trans *trans,
memset(&s, 0, sizeof(s));
memcpy(&s, k.v, min(sizeof(s), bkey_val_bytes(k.k)));
+ if (BCH_SNAPSHOT_DELETED(&s))
+ return 0;
+
id = le32_to_cpu(s.parent);
if (id) {
ret = bch2_snapshot_lookup(trans, id, &v);
@@ -998,7 +1003,7 @@ int bch2_reconstruct_snapshots(struct bch_fs *c)
snapshot_id_list_to_text(&buf, t);
darray_for_each(*t, id) {
- if (fsck_err_on(!bch2_snapshot_exists(c, *id),
+ if (fsck_err_on(bch2_snapshot_id_state(c, *id) == SNAPSHOT_ID_empty,
trans, snapshot_node_missing,
"snapshot node %u from tree %s missing, recreate?", *id, buf.buf)) {
if (t->nr > 1) {
@@ -1023,22 +1028,38 @@ int bch2_reconstruct_snapshots(struct bch_fs *c)
return ret;
}
-int bch2_check_key_has_snapshot(struct btree_trans *trans,
- struct btree_iter *iter,
- struct bkey_s_c k)
+int __bch2_check_key_has_snapshot(struct btree_trans *trans,
+ struct btree_iter *iter,
+ struct bkey_s_c k)
{
struct bch_fs *c = trans->c;
struct printbuf buf = PRINTBUF;
int ret = 0;
+ enum snapshot_id_state state = bch2_snapshot_id_state(c, k.k->p.snapshot);
+
+ /* Snapshot was definitively deleted, this error is marked autofix */
+ if (fsck_err_on(state == SNAPSHOT_ID_deleted,
+ trans, bkey_in_deleted_snapshot,
+ "key in deleted snapshot %s, delete?",
+ (bch2_btree_id_to_text(&buf, iter->btree_id),
+ prt_char(&buf, ' '),
+ bch2_bkey_val_to_text(&buf, c, k), buf.buf)))
+ ret = bch2_btree_delete_at(trans, iter,
+ BTREE_UPDATE_internal_snapshot_node) ?: 1;
- if (fsck_err_on(!bch2_snapshot_exists(c, k.k->p.snapshot),
+ /*
+ * Snapshot missing: we should have caught this with btree_lost_data and
+ * kicked off reconstruct_snapshots, so if we end up here we have no
+ * idea what happened:
+ */
+ if (fsck_err_on(state == SNAPSHOT_ID_empty,
trans, bkey_in_missing_snapshot,
"key in missing snapshot %s, delete?",
(bch2_btree_id_to_text(&buf, iter->btree_id),
prt_char(&buf, ' '),
bch2_bkey_val_to_text(&buf, c, k), buf.buf)))
ret = bch2_btree_delete_at(trans, iter,
- BTREE_UPDATE_internal_snapshot_node) ?: 1;
+ BTREE_UPDATE_internal_snapshot_node) ?: 1;
fsck_err:
printbuf_exit(&buf);
return ret;
@@ -1085,24 +1106,25 @@ static int bch2_snapshot_node_delete(struct btree_trans *trans, u32 id)
struct btree_iter iter, p_iter = {};
struct btree_iter c_iter = {};
struct btree_iter tree_iter = {};
- struct bkey_s_c_snapshot s;
u32 parent_id, child_id;
unsigned i;
int ret = 0;
- s = bch2_bkey_get_iter_typed(trans, &iter, BTREE_ID_snapshots, POS(0, id),
- BTREE_ITER_intent, snapshot);
- ret = bkey_err(s);
+ struct bkey_i_snapshot *s =
+ bch2_bkey_get_mut_typed(trans, &iter, BTREE_ID_snapshots, POS(0, id),
+ BTREE_ITER_intent, snapshot);
+ ret = PTR_ERR_OR_ZERO(s);
bch2_fs_inconsistent_on(bch2_err_matches(ret, ENOENT), c,
"missing snapshot %u", id);
if (ret)
goto err;
- BUG_ON(s.v->children[1]);
+ BUG_ON(BCH_SNAPSHOT_DELETED(&s->v));
+ BUG_ON(s->v.children[1]);
- parent_id = le32_to_cpu(s.v->parent);
- child_id = le32_to_cpu(s.v->children[0]);
+ parent_id = le32_to_cpu(s->v.parent);
+ child_id = le32_to_cpu(s->v.children[0]);
if (parent_id) {
struct bkey_i_snapshot *parent;
@@ -1160,24 +1182,38 @@ static int bch2_snapshot_node_delete(struct btree_trans *trans, u32 id)
*/
struct bkey_i_snapshot_tree *s_t;
- BUG_ON(s.v->children[1]);
+ BUG_ON(s->v.children[1]);
s_t = bch2_bkey_get_mut_typed(trans, &tree_iter,
- BTREE_ID_snapshot_trees, POS(0, le32_to_cpu(s.v->tree)),
+ BTREE_ID_snapshot_trees, POS(0, le32_to_cpu(s->v.tree)),
0, snapshot_tree);
ret = PTR_ERR_OR_ZERO(s_t);
if (ret)
goto err;
- if (s.v->children[0]) {
- s_t->v.root_snapshot = s.v->children[0];
+ if (s->v.children[0]) {
+ s_t->v.root_snapshot = s->v.children[0];
} else {
s_t->k.type = KEY_TYPE_deleted;
set_bkey_val_u64s(&s_t->k, 0);
}
}
- ret = bch2_btree_delete_at(trans, &iter, 0);
+ if (!bch2_request_incompat_feature(c, bcachefs_metadata_version_snapshot_deletion_v2)) {
+ SET_BCH_SNAPSHOT_DELETED(&s->v, true);
+ s->v.parent = 0;
+ s->v.children[0] = 0;
+ s->v.children[1] = 0;
+ s->v.subvol = 0;
+ s->v.tree = 0;
+ s->v.depth = 0;
+ s->v.skip[0] = 0;
+ s->v.skip[1] = 0;
+ s->v.skip[2] = 0;
+ } else {
+ s->k.type = KEY_TYPE_deleted;
+ set_bkey_val_u64s(&s->k, 0);
+ }
err:
bch2_trans_iter_exit(trans, &tree_iter);
bch2_trans_iter_exit(trans, &p_iter);
@@ -1468,6 +1504,9 @@ static int check_should_delete_snapshot(struct btree_trans *trans, struct bkey_s
if (BCH_SNAPSHOT_SUBVOL(s.v))
return 0;
+ if (BCH_SNAPSHOT_DELETED(s.v))
+ return 0;
+
for (unsigned i = 0; i < 2; i++) {
u32 child = le32_to_cpu(s.v->children[i]);
@@ -1524,6 +1563,9 @@ static int bch2_fix_child_of_deleted_snapshot(struct btree_trans *trans,
struct bkey_i_snapshot *s;
int ret;
+ if (!bch2_snapshot_exists(c, k.k->p.offset))
+ return 0;
+
if (k.k->type != KEY_TYPE_snapshot)
return 0;
diff --git a/fs/bcachefs/snapshot.h b/fs/bcachefs/snapshot.h
index 24a451bb7024..69c484b77729 100644
--- a/fs/bcachefs/snapshot.h
+++ b/fs/bcachefs/snapshot.h
@@ -120,21 +120,26 @@ static inline u32 bch2_snapshot_root(struct bch_fs *c, u32 id)
return id;
}
-static inline bool __bch2_snapshot_exists(struct bch_fs *c, u32 id)
+static inline enum snapshot_id_state __bch2_snapshot_id_state(struct bch_fs *c, u32 id)
{
const struct snapshot_t *s = snapshot_t(c, id);
- return s ? s->live : 0;
+ return s ? s->state : SNAPSHOT_ID_empty;
}
-static inline bool bch2_snapshot_exists(struct bch_fs *c, u32 id)
+static inline enum snapshot_id_state bch2_snapshot_id_state(struct bch_fs *c, u32 id)
{
rcu_read_lock();
- bool ret = __bch2_snapshot_exists(c, id);
+ enum snapshot_id_state ret = __bch2_snapshot_id_state(c, id);
rcu_read_unlock();
return ret;
}
+static inline bool bch2_snapshot_exists(struct bch_fs *c, u32 id)
+{
+ return bch2_snapshot_id_state(c, id) == SNAPSHOT_ID_live;
+}
+
static inline int bch2_snapshot_is_internal_node(struct bch_fs *c, u32 id)
{
rcu_read_lock();
@@ -241,7 +246,17 @@ int bch2_snapshot_node_create(struct btree_trans *, u32,
int bch2_check_snapshot_trees(struct bch_fs *);
int bch2_check_snapshots(struct bch_fs *);
int bch2_reconstruct_snapshots(struct bch_fs *);
-int bch2_check_key_has_snapshot(struct btree_trans *, struct btree_iter *, struct bkey_s_c);
+
+int __bch2_check_key_has_snapshot(struct btree_trans *, struct btree_iter *, struct bkey_s_c);
+
+static inline int bch2_check_key_has_snapshot(struct btree_trans *trans,
+ struct btree_iter *iter,
+ struct bkey_s_c k)
+{
+ return likely(bch2_snapshot_exists(trans->c, k.k->p.snapshot))
+ ? 0
+ : __bch2_check_key_has_snapshot(trans, iter, k);
+}
int bch2_snapshot_node_set_deleted(struct btree_trans *, u32);
diff --git a/fs/bcachefs/snapshot_format.h b/fs/bcachefs/snapshot_format.h
index 685a9fe209ab..9bccae1f3590 100644
--- a/fs/bcachefs/snapshot_format.h
+++ b/fs/bcachefs/snapshot_format.h
@@ -16,9 +16,9 @@ struct bch_snapshot {
};
LE32_BITMASK(BCH_SNAPSHOT_WILL_DELETE, struct bch_snapshot, flags, 0, 1)
-
/* True if a subvolume points to this snapshot node: */
LE32_BITMASK(BCH_SNAPSHOT_SUBVOL, struct bch_snapshot, flags, 1, 2)
+LE32_BITMASK(BCH_SNAPSHOT_DELETED, struct bch_snapshot, flags, 2, 3)
/*
* Snapshot trees:
diff --git a/fs/bcachefs/snapshot_types.h b/fs/bcachefs/snapshot_types.h
index 39fb47f43183..a64f4b942655 100644
--- a/fs/bcachefs/snapshot_types.h
+++ b/fs/bcachefs/snapshot_types.h
@@ -3,8 +3,38 @@
#define _BCACHEFS_SNAPSHOT_TYPES_H
#include "bbpos_types.h"
+#include "darray.h"
#include "subvolume_types.h"
+typedef DARRAY(u32) snapshot_id_list;
+
+#define IS_ANCESTOR_BITMAP 128
+
+struct snapshot_t {
+ enum snapshot_id_state {
+ SNAPSHOT_ID_empty,
+ SNAPSHOT_ID_live,
+ SNAPSHOT_ID_deleted,
+ } state;
+ u32 parent;
+ u32 skip[3];
+ u32 depth;
+ u32 children[2];
+ u32 subvol; /* Nonzero only if a subvolume points to this node: */
+ u32 tree;
+ unsigned long is_ancestor[BITS_TO_LONGS(IS_ANCESTOR_BITMAP)];
+};
+
+struct snapshot_table {
+ struct rcu_head rcu;
+ size_t nr;
+#ifndef RUST_BINDGEN
+ DECLARE_FLEX_ARRAY(struct snapshot_t, s);
+#else
+ struct snapshot_t s[0];
+#endif
+};
+
struct snapshot_interior_delete {
u32 id;
u32 live_child;
diff --git a/fs/bcachefs/subvolume_types.h b/fs/bcachefs/subvolume_types.h
index 1549d6daf7af..9d634b906dcd 100644
--- a/fs/bcachefs/subvolume_types.h
+++ b/fs/bcachefs/subvolume_types.h
@@ -2,33 +2,6 @@
#ifndef _BCACHEFS_SUBVOLUME_TYPES_H
#define _BCACHEFS_SUBVOLUME_TYPES_H
-#include "darray.h"
-
-typedef DARRAY(u32) snapshot_id_list;
-
-#define IS_ANCESTOR_BITMAP 128
-
-struct snapshot_t {
- bool live;
- u32 parent;
- u32 skip[3];
- u32 depth;
- u32 children[2];
- u32 subvol; /* Nonzero only if a subvolume points to this node: */
- u32 tree;
- unsigned long is_ancestor[BITS_TO_LONGS(IS_ANCESTOR_BITMAP)];
-};
-
-struct snapshot_table {
- struct rcu_head rcu;
- size_t nr;
-#ifndef RUST_BINDGEN
- DECLARE_FLEX_ARRAY(struct snapshot_t, s);
-#else
- struct snapshot_t s[0];
-#endif
-};
-
typedef struct {
/* we can't have padding in this struct: */
u64 subvol;
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 8/8] bcachefs: delete_dead_snapshot_keys_v2()
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
` (6 preceding siblings ...)
2025-05-02 19:59 ` [PATCH 7/8] bcachefs: bcachefs_metadata_version_snapshot_deletion_v2 Kent Overstreet
@ 2025-05-02 20:00 ` Kent Overstreet
7 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2025-05-02 20:00 UTC (permalink / raw)
To: linux-bcachefs, linux-kernel; +Cc: Kent Overstreet
Since extents, dirents and xattrs require an inode with the
corresponding snapshot ID to exists, we can avoid a lot of scanning by
only scanning those trees for keys to process if the correspending inode
exists.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
fs/bcachefs/snapshot.c | 160 ++++++++++++++++++++++++++++++++++-------
1 file changed, 133 insertions(+), 27 deletions(-)
diff --git a/fs/bcachefs/snapshot.c b/fs/bcachefs/snapshot.c
index f074b9de5024..83b95269d38d 100644
--- a/fs/bcachefs/snapshot.c
+++ b/fs/bcachefs/snapshot.c
@@ -1427,6 +1427,12 @@ static unsigned live_child(struct bch_fs *c, u32 id)
return ret;
}
+static bool snapshot_id_dying(struct snapshot_delete *d, unsigned id)
+{
+ return snapshot_list_has_id(&d->delete_leaves, id) ||
+ interior_delete_has_id(&d->delete_interior, id) != 0;
+}
+
static int delete_dead_snapshots_process_key(struct btree_trans *trans,
struct btree_iter *iter,
struct bkey_s_c k)
@@ -1486,6 +1492,126 @@ static bool skip_unrelated_snapshot_tree(struct btree_trans *trans, struct btree
return ret;
}
+static int delete_dead_snapshot_keys_v1(struct btree_trans *trans)
+{
+ struct bch_fs *c = trans->c;
+ struct snapshot_delete *d = &c->snapshot_delete;
+
+ for (d->pos.btree = 0; d->pos.btree < BTREE_ID_NR; d->pos.btree++) {
+ struct disk_reservation res = { 0 };
+
+ d->pos.pos = POS_MIN;
+
+ if (!btree_type_has_snapshots(d->pos.btree))
+ continue;
+
+ int ret = for_each_btree_key_commit(trans, iter,
+ d->pos.btree, POS_MIN,
+ BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k,
+ &res, NULL, BCH_TRANS_COMMIT_no_enospc, ({
+ d->pos.pos = iter.pos;
+
+ if (skip_unrelated_snapshot_tree(trans, &iter))
+ continue;
+
+ delete_dead_snapshots_process_key(trans, &iter, k);
+ }));
+
+ bch2_disk_reservation_put(c, &res);
+
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static int delete_dead_snapshot_keys_range(struct btree_trans *trans, enum btree_id btree,
+ struct bpos start, struct bpos end)
+{
+ struct bch_fs *c = trans->c;
+ struct snapshot_delete *d = &c->snapshot_delete;
+ struct disk_reservation res = { 0 };
+
+ d->pos.btree = btree;
+ d->pos.pos = POS_MIN;
+
+ int ret = for_each_btree_key_max_commit(trans, iter,
+ btree, start, end,
+ BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k,
+ &res, NULL, BCH_TRANS_COMMIT_no_enospc, ({
+ d->pos.pos = iter.pos;
+ delete_dead_snapshots_process_key(trans, &iter, k);
+ }));
+
+ bch2_disk_reservation_put(c, &res);
+ return ret;
+}
+
+static int delete_dead_snapshot_keys_v2(struct btree_trans *trans)
+{
+ struct bch_fs *c = trans->c;
+ struct snapshot_delete *d = &c->snapshot_delete;
+ struct disk_reservation res = { 0 };
+ int ret = 0;
+
+ struct btree_iter iter;
+ bch2_trans_iter_init(trans, &iter, BTREE_ID_inodes, POS_MIN,
+ BTREE_ITER_prefetch|BTREE_ITER_all_snapshots);
+
+ while (1) {
+ struct bkey_s_c k;
+ ret = lockrestart_do(trans,
+ bkey_err(k = bch2_btree_iter_peek(trans, &iter)));
+ if (ret)
+ break;
+
+ if (!k.k)
+ break;
+
+ d->pos.btree = iter.btree_id;
+ d->pos.pos = iter.pos;
+
+ if (skip_unrelated_snapshot_tree(trans, &iter))
+ continue;
+
+ if (snapshot_id_dying(d, k.k->p.snapshot)) {
+ struct bpos start = POS(k.k->p.offset, 0);
+ struct bpos end = POS(k.k->p.offset, U64_MAX);
+
+ ret = delete_dead_snapshot_keys_range(trans, BTREE_ID_extents, start, end) ?:
+ delete_dead_snapshot_keys_range(trans, BTREE_ID_dirents, start, end) ?:
+ delete_dead_snapshot_keys_range(trans, BTREE_ID_xattrs, start, end);
+ if (ret)
+ break;
+
+ bch2_btree_iter_set_pos(trans, &iter, POS(0, k.k->p.offset + 1));
+ } else {
+ bch2_btree_iter_advance(trans, &iter);
+ }
+ }
+ bch2_trans_iter_exit(trans, &iter);
+
+ if (ret)
+ goto err;
+
+ ret = for_each_btree_key_commit(trans, iter,
+ BTREE_ID_inodes, POS_MIN,
+ BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k,
+ &res, NULL, BCH_TRANS_COMMIT_no_enospc, ({
+ d->pos.btree = iter.btree_id;
+ d->pos.pos = iter.pos;
+
+ if (skip_unrelated_snapshot_tree(trans, &iter))
+ continue;
+
+ delete_dead_snapshots_process_key(trans, &iter, k);
+ }));
+err:
+ bch2_disk_reservation_put(c, &res);
+ return ret;
+}
+
/*
* For a given snapshot, if it doesn't have a subvolume that points to it, and
* it doesn't have child snapshot nodes - it's now redundant and we can mark it
@@ -1666,33 +1792,13 @@ int bch2_delete_dead_snapshots(struct bch_fs *c)
goto err;
}
- for (d->pos.btree = 0; d->pos.btree < BTREE_ID_NR; d->pos.btree++) {
- struct disk_reservation res = { 0 };
-
- d->pos.pos = POS_MIN;
-
- if (!btree_type_has_snapshots(d->pos.btree))
- continue;
-
- ret = for_each_btree_key_commit(trans, iter,
- d->pos.btree, POS_MIN,
- BTREE_ITER_prefetch|BTREE_ITER_all_snapshots, k,
- &res, NULL, BCH_TRANS_COMMIT_no_enospc, ({
- d->pos.pos = iter.pos;
-
- if (skip_unrelated_snapshot_tree(trans, &iter))
- continue;
-
- delete_dead_snapshots_process_key(trans, &iter, k);
- }));
-
- bch2_disk_reservation_put(c, &res);
-
- if (!bch2_err_matches(ret, EROFS))
- bch_err_msg(c, ret, "deleting keys from dying snapshots");
- if (ret)
- goto err;
- }
+ ret = !bch2_request_incompat_feature(c, bcachefs_metadata_version_snapshot_deletion_v2)
+ ? delete_dead_snapshot_keys_v2(trans)
+ : delete_dead_snapshot_keys_v1(trans);
+ if (!bch2_err_matches(ret, EROFS))
+ bch_err_msg(c, ret, "deleting keys from dying snapshots");
+ if (ret)
+ goto err;
darray_for_each(d->delete_leaves, i) {
ret = commit_do(trans, NULL, NULL, 0,
--
2.49.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-05-02 20:00 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-02 19:59 [PATCH 0/8] Snapshot deletion improvements Kent Overstreet
2025-05-02 19:59 ` [PATCH 1/8] bcachefs: snapshot delete progress indicator Kent Overstreet
2025-05-02 19:59 ` [PATCH 2/8] bcachefs: Add comments for inode snapshot requirements Kent Overstreet
2025-05-02 19:59 ` [PATCH 3/8] bcachefs: kill inode_walker_entry.snapshot Kent Overstreet
2025-05-02 19:59 ` [PATCH 4/8] bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot Kent Overstreet
2025-05-02 19:59 ` [PATCH 5/8] bcachefs: Skip unrelated snapshot trees in snapshot deletion Kent Overstreet
2025-05-02 19:59 ` [PATCH 6/8] bcachefs: BCH_SNAPSHOT_DELETED -> BCH_SNAPSHOT_WILL_DELETE Kent Overstreet
2025-05-02 19:59 ` [PATCH 7/8] bcachefs: bcachefs_metadata_version_snapshot_deletion_v2 Kent Overstreet
2025-05-02 20:00 ` [PATCH 8/8] bcachefs: delete_dead_snapshot_keys_v2() Kent Overstreet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox