From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs@vger.kernel.org
Subject: Re: [RFC PATCH 0/3]: Extreme fragmentation ahoy!
Date: Fri, 8 Feb 2019 08:29:51 -0800
Message-ID: <20190208162951.GN7991@magnolia>
In-Reply-To: <20190208024730.GM14116@dastard>

On Fri, Feb 08, 2019 at 01:47:30PM +1100, Dave Chinner wrote:
> On Thu, Feb 07, 2019 at 10:52:43AM -0500, Brian Foster wrote:
> > On Thu, Feb 07, 2019 at 04:39:41PM +1100, Dave Chinner wrote:
> > > On Wed, Feb 06, 2019 at 09:21:14PM -0800, Darrick J. Wong wrote:
> > > > On Thu, Feb 07, 2019 at 04:08:10PM +1100, Dave Chinner wrote:
> > > > > Hi folks,
> > > > >
> > > > > I've just finished analysing an IO trace from an application
> > > > > generating an extreme filesystem fragmentation problem that started
> > > > > with extent size hints and ended with spurious ENOSPC reports due to
> > > > > massively fragmented files and free space. While the ENOSPC issue
> > > > > looks to have previously been solved, I still wanted to understand
> > > > > how the application had so comprehensively defeated extent size
> > > > > hints as a method of avoiding file fragmentation.
> ....
> > > FWIW, I think the scope of the problem is quite widespread -
> > > anything that does open/something/close repeatedly on a file that is
> > > being written to with O_DSYNC or O_DIRECT appending writes will kill
> > > the post-eof extent size hint allocated space. That's why I suspect
> > > we need to think about not trimming by default and trying to
> > > enumerate only the cases that need to trim eof blocks.
> > >
> >
> > To further this point.. I think the eofblocks scanning stuff came long
> > after the speculative preallocation code and associated release time
> > post-eof truncate.
>
> Yes, I cribbed a bit of the history of the xfs_release() behaviour
> on #xfs yesterday afternoon:
>
> <djwong> dchinner: feel free to ignore this until tomorrow if you want, but /me wonders why we'd want to free the eofblocks at close time at all, instead of waiting for inactivation/enospc/background reaper to do it?
> <dchinner> historic. People doing operations then complaining du didn't match ls
> <dchinner> stuff like that
> <dchinner> There used to be an open file cache in XFS - we'd know exactly when the last reference went away and trim it then
> <dchinner> but that went away when NFS and the dcache got smarter about file handle conversion
> <dchinner> (i.e. that's how we used to make nfs not suck)
> <dchinner> that's when we started doing work in ->release
> <dchinner> it was close enough to "last close" for most workloads it made no difference.
> <dchinner> Except for concurrent NFS writes into the same directory
> <dchinner> and now there's another pathological application that triggers problems
> <dchinner> The NFS exception was prior to having the background reaper
> <dchinner> as these things go the background reaper is relatively recent functionality
> <dchinner> so perhaps we should just leave it to "inode cache expiry or background reaping" and not do it on close at all
>
> > I think the background scanning was initially an
> > enhancement to deal with things like the dirty release optimization
> > leaving these blocks around longer and being able to free up this
> > accumulated space when we're at -ENOSPC conditions.
>
> Yes, amongst other things like slow writes keeping the file open
> forever.....
>
> > Now that we have the
> > scanning mechanism in place (and a 5 minute default background scan,
> > which really isn't all that long), it might be reasonable to just drop
> > the release time truncate completely and only trim post-eof blocks via
> > the bg scan or reclaim paths.
>
> Yeah, that's kinda the question I'm asking here. What's the likely
> impact of not trimming EOF blocks at least on close apart from
> people complaining about df/ls not matching du?
>
> I don't really care about that anymore because, well, reflink/dedupe
> completely break any remaining assumption that du reported space
> consumption is related to the file size (if sparse files weren't
> enough of a hint already)....

Not to mention the deferred inactivation series tracks "space we could
free if we did a bunch of inactivation work" so that we can lie to
statfs and pretend we already did the work. It wouldn't be hard to
include speculative posteof blocks in that too.
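
For anyone who wants to look at the du vs. ls gap from userspace, here is
a minimal Python sketch. It is not taken from this thread, and the helper
names append_in_cycles() and size_vs_allocated() are made up for
illustration. It repeats the open/append/close cycle described above with
O_DSYNC, so that every close drops the last reference to the open file,
then compares st_size (what ls shows) with st_blocks * 512 (roughly what
du shows). It assumes a Unix system where st_blocks counts 512-byte
units, as on Linux.

```python
import os


def append_in_cycles(path, chunk, cycles):
    """Append `chunk` to `path` `cycles` times, reopening the file each time."""
    # O_DSYNC is not defined on every platform; fall back to 0 there.
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | getattr(os, "O_DSYNC", 0)
    for _ in range(cycles):
        fd = os.open(path, flags, 0o644)
        try:
            view = memoryview(chunk)
            while view:
                # os.write() may write fewer bytes than requested.
                written = os.write(fd, view)
                view = view[written:]
        finally:
            # Closing the only descriptor for this open file is the
            # point at which the filesystem's release hook gets to run.
            os.close(fd)


def size_vs_allocated(path):
    """Return (apparent size in bytes, allocated bytes) for `path`."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units (true on Linux).
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    import sys

    target = sys.argv[1]  # file to append to; path is supplied by the caller
    append_in_cycles(target, b"x" * 4096, 64)
    size, allocated = size_vs_allocated(target)
    print(f"size={size} allocated={allocated}")
```

Whether "allocated" ends up larger than "size" depends on the filesystem,
any extent size hint on the file, and the kernel's release-time trimming
behaviour, so the sketch only measures the gap and does not predict it.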

--D

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com