From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/3] xfs: rework collapse range into an atomic operation
Date: Wed, 18 Dec 2019 13:19:05 -0800 [thread overview]
Message-ID: <20191218211905.GC7489@magnolia> (raw)
In-Reply-To: <20191218121120.GB63809@bfoster>
On Wed, Dec 18, 2019 at 07:11:20AM -0500, Brian Foster wrote:
> On Tue, Dec 17, 2019 at 06:39:58PM -0800, Darrick J. Wong wrote:
> > On Fri, Dec 13, 2019 at 12:12:58PM -0500, Brian Foster wrote:
> > > The collapse range operation uses a unique transaction and ilock
> > > cycle for the hole punch and each extent shift iteration of the
> > > overall operation. While the hole punch is safe as a separate
> > > operation due to the iolock, cycling the ilock after each extent
> > > shift is risky similar to insert range.
> >
> > It is? I thought collapse range was safe because we started by punching
> > out the doomed range and shifting downwards, which eliminates the
> > problems that come with two adjacent mappings that could be combined?
> >
> > <confused?>
> >
>
> This is somewhat vague wording. I don't mean to say the same bug is
> possible on collapse. Indeed, I don't think that it is. What I mean to
> say is just that cycling the ilock generally opens the operation up to
> concurrent extent changes, similar to the behavior that resulted in the
> insert range bug and against the general design principle of the
> operation (as implied by the iolock hold -> flush -> unmap preparation
> sequence).
Oh, ok, you're merely trying to prevent anyone from seeing the inode
metadata while we're in the middle of a collapse-range operation. I
wonder then if we need to take a look at the remap range operations, but
oh wow is that a gnarly mess of inode locking. :)
> IOW, it seems to me that a similar behavior is possible on collapse, it
> just might occur after an extent has been shifted into its new target
> range rather than before. That wouldn't be a corruption/bug because it
> doesn't change the semantics of the shift operation or the content of
> the file, but it's subtle and arguably a misbehavior and/or landmine.
<nod> Ok, just making sure. :)
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
--D
> Brian
>
> > --D
> >
> > > To avoid this problem, make collapse range atomic with respect to
> > > ilock. Hold the ilock across the entire operation, replace the
> > > individual transactions with a single rolling transaction sequence
> > > and finish dfops on each iteration to perform pending frees and roll
> > > the transaction. Remove the unnecessary quota reservation as
> > > collapse range can only ever merge extents (and thus remove extent
> > > records and potentially free bmap blocks). The dfops call
> > > automatically relogs the inode to keep it moving in the log. This
> > > guarantees that nothing else can change the extent mapping of an
> > > inode while a collapse range operation is in progress.
> > >
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > > fs/xfs/xfs_bmap_util.c | 29 +++++++++++++++--------------
> > > 1 file changed, 15 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> > > index 555c8b49a223..1c34a34997ca 100644
> > > --- a/fs/xfs/xfs_bmap_util.c
> > > +++ b/fs/xfs/xfs_bmap_util.c
> > > @@ -1050,7 +1050,6 @@ xfs_collapse_file_space(
> > > int error;
> > > xfs_fileoff_t next_fsb = XFS_B_TO_FSB(mp, offset + len);
> > > xfs_fileoff_t shift_fsb = XFS_B_TO_FSB(mp, len);
> > > - uint resblks = XFS_DIOSTRAT_SPACE_RES(mp, 0);
> > > bool done = false;
> > >
> > > ASSERT(xfs_isilocked(ip, XFS_IOLOCK_EXCL));
> > > @@ -1066,32 +1065,34 @@ xfs_collapse_file_space(
> > > if (error)
> > > return error;
> > >
> > > - while (!error && !done) {
> > > - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0,
> > > - &tp);
> > > - if (error)
> > > - break;
> > > + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0, 0, 0, &tp);
> > > + if (error)
> > > + return error;
> > >
> > > - xfs_ilock(ip, XFS_ILOCK_EXCL);
> > > - error = xfs_trans_reserve_quota(tp, mp, ip->i_udquot,
> > > - ip->i_gdquot, ip->i_pdquot, resblks, 0,
> > > - XFS_QMOPT_RES_REGBLKS);
> > > - if (error)
> > > - goto out_trans_cancel;
> > > - xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
> > > + xfs_ilock(ip, XFS_ILOCK_EXCL);
> > > + xfs_trans_ijoin(tp, ip, 0);
> > >
> > > + while (!done) {
> > > error = xfs_bmap_collapse_extents(tp, ip, &next_fsb, shift_fsb,
> > > &done);
> > > if (error)
> > > goto out_trans_cancel;
> > > + if (done)
> > > + break;
> > >
> > > - error = xfs_trans_commit(tp);
> > > + /* finish any deferred frees and roll the transaction */
> > > + error = xfs_defer_finish(&tp);
> > > + if (error)
> > > + goto out_trans_cancel;
> > > }
> > >
> > > + error = xfs_trans_commit(tp);
> > > + xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > > return error;
> > >
> > > out_trans_cancel:
> > > xfs_trans_cancel(tp);
> > > + xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > > return error;
> > > }
> > >
> > > --
> > > 2.20.1
> > >
> >
>
next prev parent reply other threads:[~2019-12-18 21:19 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-13 17:12 [PATCH 0/3] xfs: hold ilock across insert and collapse range Brian Foster
2019-12-13 17:12 ` [PATCH 1/3] xfs: open code insert range extent split helper Brian Foster
2019-12-17 17:02 ` Allison Collins
2019-12-17 22:21 ` Darrick J. Wong
2019-12-18 2:16 ` Darrick J. Wong
2019-12-24 11:18 ` Christoph Hellwig
2019-12-13 17:12 ` [PATCH 2/3] xfs: rework insert range into an atomic operation Brian Foster
2019-12-17 17:04 ` Allison Collins
2019-12-18 2:37 ` Darrick J. Wong
2019-12-18 12:10 ` Brian Foster
2019-12-18 21:15 ` Darrick J. Wong
2019-12-19 11:55 ` Brian Foster
2019-12-20 20:17 ` Darrick J. Wong
2019-12-23 12:12 ` Brian Foster
2019-12-24 11:28 ` Christoph Hellwig
2019-12-24 16:45 ` Darrick J. Wong
2019-12-24 11:26 ` Christoph Hellwig
2019-12-13 17:12 ` [PATCH 3/3] xfs: rework collapse " Brian Foster
2019-12-18 2:39 ` Darrick J. Wong
2019-12-18 12:11 ` Brian Foster
2019-12-18 21:19 ` Darrick J. Wong [this message]
2019-12-19 11:56 ` Brian Foster
2019-12-24 11:29 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191218211905.GC7489@magnolia \
--to=darrick.wong@oracle.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox