From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/2] xfs: reserve quota for dir expansion when linking/unlinking files
Date: Wed, 9 Mar 2022 15:33:02 -0800 [thread overview]
Message-ID: <20220309233302.GE8224@magnolia> (raw)
In-Reply-To: <20220309214821.GH661808@dread.disaster.area>
On Thu, Mar 10, 2022 at 08:48:21AM +1100, Dave Chinner wrote:
> On Wed, Mar 09, 2022 at 11:22:26AM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > XFS does not reserve quota for directory expansion when linking or
> > unlinking children from a directory. This means that we don't reject
> > the expansion with EDQUOT when we're at or near a hard limit, which
> > means that unprivileged userspace can use link()/unlink() to exceed
> > quota.
> >
> > The fix for this is nuanced -- link operations don't always expand the
> > directory, and we allow a link to proceed with no space reservation if
> > we don't need to add a block to the directory to handle the addition.
> > Unlink operations generally do not expand the directory (you'd have to
> > free a block and then cause a btree split) and we can defer the
> > directory block freeing if there is no space reservation.
> >
> > Moreover, there is a further bug in that we do not trigger the blockgc
> > workers to try to clear space when we're out of quota.
> >
> > To fix both cases, create a new xfs_trans_alloc_dir function that
> > allocates the transaction, locks and joins the inodes, and reserves
> > quota for the directory. If there isn't sufficient space or quota,
> > we'll switch the caller to reservationless mode. This should prevent
> > quota usage overruns with the least restriction in functionality.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> > fs/xfs/xfs_inode.c | 30 +++++-------------
> > fs/xfs/xfs_trans.c | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > fs/xfs/xfs_trans.h | 3 ++
> > 3 files changed, 97 insertions(+), 22 deletions(-)
>
> Overall looks good, minor nits below:
>
> >
> >
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index 04bf467b1090..a131bbfe74e4 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -1217,7 +1217,7 @@ xfs_link(
> > {
> > xfs_mount_t *mp = tdp->i_mount;
> > xfs_trans_t *tp;
> > - int error;
> > + int error, space_error;
> > int resblks;
> >
> > trace_xfs_link(tdp, target_name);
> > @@ -1236,19 +1236,11 @@ xfs_link(
> > goto std_return;
> >
> > resblks = XFS_LINK_SPACE_RES(mp, target_name->len);
> > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, resblks, 0, 0, &tp);
> > - if (error == -ENOSPC) {
> > - resblks = 0;
> > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, 0, 0, 0, &tp);
> > - }
> > + error = xfs_trans_alloc_dir(tdp, &M_RES(mp)->tr_link, sip, &resblks,
> > + &tp, &space_error);
>
> It's the nospace_error, isn't it? The code reads a lot better when
> it's called that, too.
Fixed.
>
> > if (error)
> > goto std_return;
> >
> > - xfs_lock_two_inodes(sip, XFS_ILOCK_EXCL, tdp, XFS_ILOCK_EXCL);
> > -
> > - xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL);
> > - xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL);
> > -
> > error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK,
> > XFS_IEXT_DIR_MANIP_CNT(mp));
> > if (error)
> > @@ -1267,6 +1259,8 @@ xfs_link(
> >
> > if (!resblks) {
> > error = xfs_dir_canenter(tp, tdp, target_name);
> > + if (error == -ENOSPC && space_error)
> > + error = space_error;
>
> This would be better in the error_return stack, I think. That way
> the transformation only has to be done once, and it will be done for
> all functions that can potentially return ENOSPC.
Ok.
> > if (error)
> > goto error_return;
> > }
> > @@ -2755,6 +2749,7 @@ xfs_remove(
> > xfs_mount_t *mp = dp->i_mount;
> > xfs_trans_t *tp = NULL;
> > int is_dir = S_ISDIR(VFS_I(ip)->i_mode);
> > + int dontcare;
> > int error = 0;
> > uint resblks;
> >
> > @@ -2781,22 +2776,13 @@ xfs_remove(
> > * block from the directory.
> > */
> > resblks = XFS_REMOVE_SPACE_RES(mp);
> > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_remove, resblks, 0, 0, &tp);
> > - if (error == -ENOSPC) {
> > - resblks = 0;
> > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_remove, 0, 0, 0,
> > - &tp);
> > - }
> > + error = xfs_trans_alloc_dir(dp, &M_RES(mp)->tr_remove, ip, &resblks,
> > + &tp, &dontcare);
> > if (error) {
> > ASSERT(error != -ENOSPC);
> > goto std_return;
> > }
>
> So we just ignore -EDQUOT when it is returned in @dontcare? I'd like
> a comment to explain why we don't care about EDQUOT here, because
> the next time I look at this I will have forgotten all about this...
Ok. How about:
/*
* We try to get the real space reservation first, allowing for
* directory btree deletion(s) implying possible bmap insert(s).
* If we can't get the space reservation then we use 0 instead,
* and avoid the bmap btree insert(s) in the directory code by,
* if the bmap insert tries to happen, instead trimming the LAST
* block from the directory.
*
* Ignore EDQUOT and ENOSPC being returned via nospace_error
* because the directory code can handle a reservationless
* update and we don't want to prevent a user from trying to
* free space by deleting things.
*/
error = xfs_trans_alloc_dir(...);
--D
> Cheers,
>
> Dave.
>
> --
> Dave Chinner
> david@fromorbit.com
next prev parent reply other threads:[~2022-03-09 23:33 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-09 19:22 [PATCHSET v2 0/2] xfs: make quota reservations for directory changes Darrick J. Wong
2022-03-09 19:22 ` [PATCH 1/2] xfs: reserve quota for dir expansion when linking/unlinking files Darrick J. Wong
2022-03-09 21:48 ` Dave Chinner
2022-03-09 23:33 ` Darrick J. Wong [this message]
2022-03-10 1:50 ` Dave Chinner
2022-03-09 19:22 ` [PATCH 2/2] xfs: reserve quota for target dir expansion when renaming files Darrick J. Wong
2022-03-09 22:05 ` Dave Chinner
2022-03-09 23:36 ` Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2022-03-10 21:53 [PATCHSET v3 0/2] xfs: make quota reservations for directory changes Darrick J. Wong
2022-03-10 21:53 ` [PATCH 1/2] xfs: reserve quota for dir expansion when linking/unlinking files Darrick J. Wong
2022-03-10 22:28 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220309233302.GE8224@magnolia \
--to=djwong@kernel.org \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox