From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Chandan Babu R <chandan.babu@oracle.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Dave Chinner <dchinner@redhat.com>,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 7/9] xfs: check XFS_EOFBLOCKS_RELEASED earlier in xfs_release_eofblocks
Date: Mon, 12 Aug 2024 09:48:12 +1000
Message-ID: <ZrlNvJairwgvACh2@dread.disaster.area>
In-Reply-To: <20240811085952.GB12713@lst.de>

On Sun, Aug 11, 2024 at 10:59:52AM +0200, Christoph Hellwig wrote:
> On Fri, Aug 09, 2024 at 09:03:24AM +1000, Dave Chinner wrote:
> > The test and set here is racy. A long time can pass between the test
> > and the setting of the flag,
> 
> The race window is much tighter due to the iolock, but if we really
> care about the race here, the right fix for that is to keep a second
> check for the XFS_EOFBLOCKS_RELEASED flag inside the iolock.

Right, that's exactly what the code I proposed below does.

> > so maybe this should be optimised to
> > something like:
> > 
> > 	if (inode->i_nlink &&
> > 	    (file->f_mode & FMODE_WRITE) &&
> > 	    (!(ip->i_flags & XFS_EOFBLOCKS_RELEASED)) &&
> > 	    xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
> > 		if (xfs_can_free_eofblocks(ip) &&
> > 		    !xfs_iflags_test_and_set(ip, XFS_EOFBLOCKS_RELEASED))
> > 			xfs_free_eofblocks(ip);
> > 		xfs_iunlock(ip, XFS_IOLOCK_EXCL);
> > 	}
> 
> All these direct i_flags accesses actually are racy too (at least in
> theory).

Yes, but we really don't care about racing against the bit being
set. The flag never gets cleared unless a truncate down occurs, so
we don't really have to care about racing with that case - there
will be no eofblocks to free.

If the test races with another release call setting the flag (i.e.
we see it clear) then we are going to go the slow way and then do
exactly the right thing according to the current bit state once we
hold the IO lock and the i_flags_lock.
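
To illustrate the ordering described above, here is a minimal user-space
sketch. It is not the kernel code: struct demo_inode, demo_release() and
DEMO_EOFBLOCKS_RELEASED are hypothetical names, an atomic_flag stands in
for the IOLOCK trylock, and C11 atomics stand in for
xfs_iflags_test_and_set() and the i_flags_lock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for XFS_EOFBLOCKS_RELEASED. */
#define DEMO_EOFBLOCKS_RELEASED	(1u << 0)

/* Hypothetical stand-in for the parts of the inode this path touches. */
struct demo_inode {
	atomic_uint	flags;		/* plays the role of ip->i_flags */
	atomic_flag	iolock;		/* trylock-only stand-in for the IOLOCK */
	bool		has_eofblocks;	/* what xfs_can_free_eofblocks() would report */
	int		frees;		/* number of times the "free" step ran */
};

/* Returns true only for the caller that actually frees the blocks. */
static bool demo_release(struct demo_inode *ip)
{
	bool freed = false;

	/* Unlocked test: racy, but a stale "clear" only costs the slow path. */
	if (atomic_load_explicit(&ip->flags, memory_order_relaxed) &
	    DEMO_EOFBLOCKS_RELEASED)
		return false;

	/* Like xfs_ilock_nowait(): give up if somebody else holds the lock. */
	if (atomic_flag_test_and_set(&ip->iolock))
		return false;

	/* Authoritative decision: atomic test-and-set made under the lock. */
	if (ip->has_eofblocks &&
	    !(atomic_fetch_or(&ip->flags, DEMO_EOFBLOCKS_RELEASED) &
	      DEMO_EOFBLOCKS_RELEASED)) {
		ip->has_eofblocks = false;
		ip->frees++;
		freed = true;
	}

	atomic_flag_clear(&ip->iolock);
	return freed;
}
```

A caller that loses the race on the unlocked test still ends up at the
test-and-set under the lock, so the free step runs at most once per
setting of the flag.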

> We'd probably be better off moving those over to the atomic
> bitops and only using i_lock for any coordination beyond the actual
> flags.  I'd rather not get into that here for now, even if it is a
> worthwhile project for later.

That doesn't solve the exclusive cacheline access problem Mateusz
reported. It allows us to isolate the bitop updates, but in this
case here the atomic test-and-set op still requires exclusive
cacheline access.

Hence we'd still need test-test-and-set optimisations here to avoid
the exclusive cacheline contention when the bit is already set...
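
For reference, a generic test-test-and-set looks roughly like the sketch
below. It uses C11 atomics in user space rather than the kernel bitops,
and demo_test_and_set(), demo_test_test_and_set() and DEMO_RELEASED are
made-up names for illustration only.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit used only for this illustration. */
#define DEMO_RELEASED	(1u << 0)

/*
 * Plain atomic test-and-set. This is always a read-modify-write, so the
 * CPU generally needs the cacheline in exclusive state even when the bit
 * is already set. Returns true if the bit was already set.
 */
static bool demo_test_and_set(atomic_uint *flags, unsigned int bit)
{
	return atomic_fetch_or(flags, bit) & bit;
}

/*
 * Test-test-and-set. The read-only load can be satisfied from a shared
 * cacheline, so callers that find the bit already set never issue the
 * read-modify-write. Returns true if the bit was already set.
 */
static bool demo_test_test_and_set(atomic_uint *flags, unsigned int bit)
{
	if (atomic_load_explicit(flags, memory_order_relaxed) & bit)
		return true;
	return demo_test_and_set(flags, bit);
}
```

Both helpers return the same answer; the difference is only in which
cacheline state the common "already set" case needs.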

> > I do wonder, though - why do we need to hold the IOLOCK to call
> > xfs_can_free_eofblocks()? The only thing that really needs
> > serialisation is the xfs_bmapi_read() call, and that's done under
> > the ILOCK not the IOLOCK. Sure, xfs_free_eofblocks() needs the
> > IOLOCK because it's effectively a truncate w.r.t. extending writes,
> > but races with extending writes while checking if we need to do that
> > operation aren't really a big deal. Worst case is we take the
> > lock and free the EOF blocks beyond the writes we raced with.
> > 
> > What am I missing here?
> 
> I think the prime part of the story is that xfs_can_free_eofblocks was
> split out of xfs_free_eofblocks, which requires the iolock.  But I'm
> not sure if some of the checks are a little racy without the iolock,

Ok. I think the checks are racy even with the iolock - most of the
checks are for inode metadata that is modified under the ilock (e.g.
i_diflags, i_delayed_blks) or the ip->i_flags_lock (e.g.
VFS_I(ip)->i_size for serialisation with updates via
xfs_dio_write_end_io()). Hence I don't think that holding the IO
lock here makes any difference at all...

> although I doubt it matters in practice as they are all optimizations.
> I'd need to take a deeper look at this, so maybe it's worth a follow
> on together with the changes in i_flags handling.

*nod*

-Dave.

-- 
Dave Chinner
david@fromorbit.com
