From: Boris Burkov <boris@bur.io>
To: Qu Wenruo <wqu@suse.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>,
linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH RFC] btrfs: fix qgroup rsv leak in subvol create
Date: Mon, 1 May 2023 20:34:21 -0700 [thread overview]
Message-ID: <20230502033421.GA3175929@zen> (raw)
In-Reply-To: <d6111cfa-1315-2c45-67b3-3946a7229896@suse.com>
On Tue, May 02, 2023 at 06:58:33AM +0800, Qu Wenruo wrote:
>
>
> On 2023/5/2 00:50, Boris Burkov wrote:
> > On Sat, Apr 29, 2023 at 03:18:26PM +0800, Qu Wenruo wrote:
> > >
> > >
> > > On 2023/4/29 07:08, Boris Burkov wrote:
> > > > While working on testing my quota work, I tried running all fstests
> > > > while passing mkfs -R quota. That shook out a failure in btrfs/042.
> > > >
> > > > The failure is a reservation leak detected at umount, and the cause is a
> > > > subtle difficulty with the qgroup rsv release accounting for inode
> > > > creation.
> > >
> > > Mind to give an example of the leakage kernel error message?
> > > As such message would include the type of the rsv.
> > >
> > > >
> > > > The issue stems from a recent change to subvol creation:
> > > > btrfs: don't commit transaction for every subvol create
> > > >
> > > > Specifically, that test creates 10 subvols, and in the mode where we
> > > > commit each time, the logic for dir index item reservation never decides
> > > > that it can undo the reservation. However, if we keep joining the
> > > > previous transaction, this logic kicks in and calls
> > > > btrfs_block_rsv_release without specifying any of the qgroup release
> > > > return counter stuff. As a result, adding the new subvol inode blows
> > > > away the state needed for the later explicit call to
> > > > btrfs_subvolume_release_metadata.
> > >
> > > Is there any reproducer for it?
> >
> > I believe that all you need is to run btrfs/042 with MKFS_OPTIONS set to
> > include "-R quota".
>
> Indeed, now it fails consistently.
>
> Although you don't need the extra mkfs options, as the test itself is
> utilizing qgroups already.
Good point. I assumed I needed the flag because the test passes in
Josef's CI thing, and I was doing the weird thing of testing with quotas
enabled via mkfs. http://toxicpanda.com/btrfs/042.html
But looking at it, it should just be totally broken as the subvols are
created after quota is enabled. I guess Josef must not have rebased past
that patch yet.
>
> Thanks,
> Qu
> >
> > >
> > > By the description it should be pretty simple as long as we create multiple
> > > subvolumes in one transaction.
> > >
> > > I'd like to have some qgroup related trace enabled to show the problem more
> > > explicitly, as I'm not that familiar with the delayed inode code.
> > >
> > > Thanks,
> > > Qu
> > > >
> > > > I suspect this fix is incorrect and will break something to do with
> > > > normal inode creation, but it's an interesting starting point and I
> > > > would appreciate any suggestions or help with how to really fix it,
> > > > without reverting the subvol commit patch. Worst case, I suppose we can
> > > > commit every time if quotas are enabled.
> > > >
> > > > The issue should reproduce on misc-next on btrfs/042 with
> > > > MKFS_OPTIONS="-K -R quota"
> > > > in the config file.
> > > >
> > > > Signed-off-by: Boris Burkov <boris@bur.io>
> > > > ---
> > > > fs/btrfs/delayed-inode.c | 4 +++-
> > > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> > > > index 6b457b010cbc..82b2e86f9bd9 100644
> > > > --- a/fs/btrfs/delayed-inode.c
> > > > +++ b/fs/btrfs/delayed-inode.c
> > > > @@ -1480,6 +1480,7 @@ int btrfs_insert_delayed_dir_index(struct btrfs_trans_handle *trans,
> > > > delayed_node->index_item_leaves++;
> > > > } else if (!test_bit(BTRFS_FS_LOG_RECOVERING, &fs_info->flags)) {
> > > > const u64 bytes = btrfs_calc_insert_metadata_size(fs_info, 1);
> > > > + u64 qgroup_to_release;
> > > > /*
> > > > * Adding the new dir index item does not require touching another
> > > > @@ -1490,7 +1491,8 @@ int btrfs_insert_delayed_dir_index(struct btrfs_trans_handle *trans,
> > > > */
> > > > trace_btrfs_space_reservation(fs_info, "transaction",
> > > > trans->transid, bytes, 0);
> > > > - btrfs_block_rsv_release(fs_info, trans->block_rsv, bytes, NULL);
> > > > + btrfs_block_rsv_release(fs_info, trans->block_rsv, bytes, &qgroup_to_release);
> > > > + btrfs_qgroup_convert_reserved_meta(delayed_node->root, qgroup_to_release);
> > > > ASSERT(trans->bytes_reserved >= bytes);
> > > > trans->bytes_reserved -= bytes;
> > > > }
next prev parent reply other threads:[~2023-05-02 3:34 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-28 23:08 [PATCH RFC] btrfs: fix qgroup rsv leak in subvol create Boris Burkov
2023-04-29 7:18 ` Qu Wenruo
2023-05-01 16:50 ` Boris Burkov
2023-05-01 22:58 ` Qu Wenruo
2023-05-02 3:34 ` Boris Burkov [this message]
2023-05-01 17:09 ` Boris Burkov
2023-05-09 22:29 ` David Sterba
2023-05-09 23:05 ` Qu Wenruo
2023-05-09 23:14 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230502033421.GA3175929@zen \
--to=boris@bur.io \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=quwenruo.btrfs@gmx.com \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.