linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Boris Burkov <boris@bur.io>
To: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: [PATCH v4 05/18] btrfs: flush reservations during quota disable
Date: Wed, 26 Jul 2023 13:38:32 -0700	[thread overview]
Message-ID: <32160328076c60fe8ff76f546b7c013c217c46c8.1690403768.git.boris@bur.io> (raw)
In-Reply-To: <cover.1690403768.git.boris@bur.io>

The following sequence:
enable simple quotas
do some writes
    reserve space
    create ordered_extent
        release rsv (store rsv_bytes in OE, mark QGROUP_RESERVED bits)
disable quotas
enable simple quotas
    set qgroup rsv to 0 on all subvols
ordered_extent finishes
    create delayed ref with rsv_bytes from before
run delayed ref
    record_simple_quota_delta
        free rsv_bytes (0 -> -rsv_delta)

results in us reliably underflowing the subvolume's qgroup rsv counter,
because disabling/re-enabling quotas toggles reservation counters down
to 0, but does not remove other file system state which represents
successful acquisition of qgroup rsv space. Specifically metadata rsv
counters on the root object and rsv_bytes on ordered_extent objects that
have released their reservation as well as the corresponding
QGROUP_RESERVED extent bits.

Normal qgroups gets away with this, I believe because it forces more
work to happen on transaction commit, but I am not certain it is totally
safe from the ordered_extent/leaked extent bit variant. Simple quotas
hits this reliably.

The intent of the fix is to make disable take the time to clear that
external to qgroups state as well: after flipping off the quota bit on
fs_info, flush delalloc and ordered extents, clearing the extent bits
along the way. This makes it so there are no ordered extents or meta
prealloc hanging around from the first enablement period during the second.

Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/qgroup.c | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 558f66994667..18f521716e8d 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -1248,6 +1248,40 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info,
 	return ret;
 }
 
+/*
+ * It is possible to have outstanding ordered extents
+ * which reserved bytes before we disabled. We need to fully flush
+ * delalloc, ordered extents, and a commit to ensure that
+ * we don't leak such reservations, only to have them come back
+ * if we re-enable.
+ *
+ * i.e.:
+ * enable simple quotas
+ * reserve space
+ * release it, store rsv_bytes in OE
+ * disable quotas
+ * enable simple quotas (qgroup rsv are all 0)
+ * OE finishes
+ * run delayed refs
+ * free rsv_bytes, resulting in miscounting or even underflow
+ */
+static int flush_reservations(struct btrfs_fs_info *fs_info)
+{
+	struct btrfs_trans_handle *trans;
+	int ret;
+
+	ret = btrfs_start_delalloc_roots(fs_info, LONG_MAX, false);
+	if (ret)
+		return ret;
+	btrfs_wait_ordered_roots(fs_info, U64_MAX, 0, (u64)-1);
+	trans = btrfs_join_transaction(fs_info->tree_root);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+	btrfs_commit_transaction(trans);
+
+	return ret;
+}
+
 int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_root *quota_root;
@@ -1292,6 +1326,10 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	clear_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
 	btrfs_qgroup_wait_for_completion(fs_info, false);
 
+	ret = flush_reservations(fs_info);
+	if (ret)
+		goto out;
+
 	/*
 	 * 1 For the root item
 	 *
@@ -1353,7 +1391,7 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info)
 	if (ret && trans)
 		btrfs_end_transaction(trans);
 	else if (trans)
-		ret = btrfs_end_transaction(trans);
+		ret = btrfs_commit_transaction(trans);
 	mutex_unlock(&fs_info->cleaner_mutex);
 
 	return ret;
@@ -3957,8 +3995,11 @@ static int __btrfs_qgroup_release_data(struct btrfs_inode *inode,
 	int trace_op = QGROUP_RELEASE;
 	int ret;
 
-	if (btrfs_qgroup_mode(inode->root->fs_info) == BTRFS_QGROUP_MODE_DISABLED)
-		return 0;
+	if (btrfs_qgroup_mode(inode->root->fs_info) == BTRFS_QGROUP_MODE_DISABLED) {
+		extent_changeset_init(&changeset);
+		return clear_record_extent_bits(&inode->io_tree, start, start + len - 1,
+				       EXTENT_QGROUP_RESERVED, &changeset);
+	}
 
 	/* In release case, we shouldn't have @reserved */
 	WARN_ON(!free && reserved);
-- 
2.41.0


  parent reply	other threads:[~2023-07-26 20:40 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-26 20:38 [PATCH v4 00/18] btrfs: simple quotas Boris Burkov
2023-07-26 20:38 ` [PATCH v4 01/18] btrfs: introduce quota mode Boris Burkov
2023-07-26 20:38 ` [PATCH v4 02/18] btrfs: add new quota mode for simple quotas Boris Burkov
2023-07-26 20:38 ` [PATCH v4 03/18] btrfs: expose quota mode via sysfs Boris Burkov
2023-07-26 20:38 ` [PATCH v4 04/18] btrfs: add simple_quota incompat feature to sysfs Boris Burkov
2023-07-26 20:38 ` Boris Burkov [this message]
2023-07-26 20:38 ` [PATCH v4 06/18] btrfs: create qgroup earlier in snapshot creation Boris Burkov
2023-07-26 20:38 ` [PATCH v4 07/18] btrfs: function for recording simple quota deltas Boris Burkov
2023-07-26 20:38 ` [PATCH v4 08/18] btrfs: rename tree_ref and data_ref owning_root Boris Burkov
2023-07-26 20:38 ` [PATCH v4 09/18] btrfs: track owning root in btrfs_ref Boris Burkov
2023-07-26 20:38 ` [PATCH v4 10/18] btrfs: track original extent owner in head_ref Boris Burkov
2023-07-26 20:38 ` [PATCH v4 11/18] btrfs: new inline ref storing owning subvol of data extents Boris Burkov
2023-07-26 20:38 ` [PATCH v4 12/18] btrfs: inline owner ref lookup helper Boris Burkov
2023-07-26 20:38 ` [PATCH v4 13/18] btrfs: record simple quota deltas Boris Burkov
2023-07-26 20:38 ` [PATCH v4 14/18] btrfs: simple quota auto hierarchy for nested subvols Boris Burkov
2023-07-26 20:38 ` [PATCH v4 15/18] btrfs: check generation when recording simple quota delta Boris Burkov
2023-07-26 20:38 ` [PATCH v4 16/18] btrfs: track metadata relocation cow with simple quota Boris Burkov
2023-07-26 20:38 ` [PATCH v4 17/18] btrfs: track data relocation " Boris Burkov
2023-07-26 20:38 ` [PATCH v4 18/18] btrfs: only set QUOTA_ENABLED when done reading qgroups Boris Burkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=32160328076c60fe8ff76f546b7c013c217c46c8.1690403768.git.boris@bur.io \
    --to=boris@bur.io \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).