linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/4] xfs: account only rmapbt-used blocks against rmapbt perag res
Date: Fri, 9 Feb 2018 09:49:30 +1100	[thread overview]
Message-ID: <20180208224930.GG20266@dastard> (raw)
In-Reply-To: <20180208131938.GC13929@bfoster.bfoster>

On Thu, Feb 08, 2018 at 08:19:38AM -0500, Brian Foster wrote:
> On Wed, Feb 07, 2018 at 06:20:37PM -0800, Darrick J. Wong wrote:
> > On Wed, Feb 07, 2018 at 09:49:35AM -0500, Brian Foster wrote:
> > > On Tue, Feb 06, 2018 at 04:03:50PM -0800, Darrick J. Wong wrote:
> > > We've since applied it to things like the finobt (which I'm still not
> > > totally convinced was the right thing to do based on the vague
> > > justification for it), which kind of blurs the line between where it's
> > > a requirement vs. nice-to-have/band-aid for me.
> > 
> > I think the finobt reservation is required: Suppose you have a
> > filesystem with a lot of empty files, a lot of single-block files, and a
> > lot of big files such that there's no free space anywhere.  Suppose
> > further that there's an AG where every finobt block is exactly full,
> > there's an inode chunk with exactly 64 inodes in use, and every block in
> > that AG is in use (meaning zero AGFL blocks).  Now find one of the empty
> > inodes in that totally-full chunk and try to free it.  Truncation
> > doesn't free up any blocks, but we have to expand the finobt to add the
> > record for the chunk.  We can't find any blocks in that AG so we shut
> > down.
> > 
> 
> Yes, I suppose the problem makes sense (I wish the original commit had
> such an explanation :/). We do have the transaction block reservation in
> the !perag res case, but I suppose we're susceptible to the same global
> reservation problem as above.
> 
> Have we considered a per-ag + per-transaction mechanism at any point
> through all of this?

That's kind of what I was suggesting to Darrick on IRC a while back.
i.e. the per-ag reservation of at least 4-8MB of space similar to
the global reservation pool we have, and when it dips below that
threshold we reserve more free space.

But yeah, it doesn't completely solve the finobt growth at ENOSPC
problem, but then again the global reservation pool doesn't
completely solve the "run out of free space for IO completion
processing at ENOSPC" problem either. That mechanism is just a
simple solution that is good enough for 99.99% of XFS users, and if
you are outside this there's a way to increase the pool size to make
it more robust (e.g. for those 4096 CPU MPI jobs all doing
concurrent DIO writeback at ENOSPC).

So the question I'm asking here is this: do we need a "perfect
solution" or does a simple, small, dynamic reservation pool provide
"good enough protection" for the vast majority of our users?

> I ask because something that has been in the back
> of my mind (which I think was an idea from Dave originally) for a while
> is to simply queue inactive inode processing when it can't run at a
> particular point in time, but that depends on actually knowing whether
> we can proceed to inactivate an inode or not.

Essentially defer the xfs_ifree() processing step from
xfs_inactive(), right? i.e. leave the inode on the unlinked list
until we've got space to free it? This could be determined by a
simple AG space/resv space check before removign the inode from
the unlinked list...

FWIW, if we keep a list of inactivated but not yet freed inodes for
background processing, we could allocate inodes from that list, too,
simply by removing them from the unlinked list...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2018-02-08 22:49 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-05 17:45 [PATCH 0/4] xfs: rmapbt block and perag reservation fixups Brian Foster
2018-02-05 17:45 ` [PATCH 1/4] xfs: shutdown if block allocation overruns tx reservation Brian Foster
2018-02-08  1:42   ` Darrick J. Wong
2018-02-05 17:45 ` [PATCH 2/4] xfs: account format bouncing into rmapbt swapext " Brian Foster
2018-02-08  1:56   ` Darrick J. Wong
2018-02-08 13:12     ` Brian Foster
2018-02-05 17:46 ` [PATCH 3/4] xfs: rename agfl perag res type to rmapbt Brian Foster
2018-02-08  1:57   ` Darrick J. Wong
2018-02-05 17:46 ` [PATCH 4/4] xfs: account only rmapbt-used blocks against rmapbt perag res Brian Foster
2018-02-07  0:03   ` Darrick J. Wong
2018-02-07 14:49     ` Brian Foster
2018-02-08  2:20       ` Darrick J. Wong
2018-02-08 13:19         ` Brian Foster
2018-02-08 22:49           ` Dave Chinner [this message]
2018-02-09 13:37             ` Brian Foster
2018-02-06 13:10 ` [PATCH] tests/xfs: rmapbt swapext block reservation overrun test Brian Foster
2018-02-06 17:30   ` Darrick J. Wong
2018-02-06 18:50     ` Brian Foster
2018-02-07  4:07     ` Eryu Guan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180208224930.GG20266@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).