public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: John Garry <john.g.garry@oracle.com>
Cc: linux-xfs@vger.kernel.org, ojaswin@linux.ibm.com, ritesh.list@gmail.com
Subject: Re: [PATCH 1/3] xfs: simplify extent allocation alignment
Date: Tue, 2 Apr 2024 16:58:02 +1100
Message-ID: <ZgueamvcnndUUwYd@dread.disaster.area>
In-Reply-To: <9cc5d4da-c1cd-41d3-95d9-0373990c2007@oracle.com>

On Tue, Mar 26, 2024 at 04:08:04PM +0000, John Garry wrote:
> On 20/03/2024 04:35, Dave Chinner wrote:
> 
> For some reason I never received this mail. I just noticed it on
> lore.kernel.org today by chance.
> 
> > On Wed, Mar 13, 2024 at 11:03:18AM +0000, John Garry wrote:
> > > On 06/03/2024 05:20, Dave Chinner wrote:
> > > >    		return false;
> > > > diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> > > > index 0b956f8b9d5a..aa2c103d98f0 100644
> > > > --- a/fs/xfs/libxfs/xfs_alloc.h
> > > > +++ b/fs/xfs/libxfs/xfs_alloc.h
> > > > @@ -46,7 +46,7 @@ typedef struct xfs_alloc_arg {
> > > >    	xfs_extlen_t	minleft;	/* min blocks must be left after us */
> > > >    	xfs_extlen_t	total;		/* total blocks needed in xaction */
> > > >    	xfs_extlen_t	alignment;	/* align answer to multiple of this */
> > > > -	xfs_extlen_t	minalignslop;	/* slop for minlen+alignment calcs */
> > > > +	xfs_extlen_t	alignslop;	/* slop for alignment calcs */
> > > >    	xfs_agblock_t	min_agbno;	/* set an agbno range for NEAR allocs */
> > > >    	xfs_agblock_t	max_agbno;	/* ... */
> > > >    	xfs_extlen_t	len;		/* output: actual size of extent */
> > > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > > > index 656c95a22f2e..d56c82c07505 100644
> > > > --- a/fs/xfs/libxfs/xfs_bmap.c
> > > > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > > > @@ -3295,6 +3295,10 @@ xfs_bmap_select_minlen(
> > > >    	xfs_extlen_t		blen)
> > > 
> > > Hi Dave,
> > > 
> > > >    {
> > > > +	/* Adjust best length for extent start alignment. */
> > > > +	if (blen > args->alignment)
> > > > +		blen -= args->alignment;
> > > > +
> > > 
> > > This change seems to be causing or exposing some issue, in that I find that
> > > I am being allocated an extent which is aligned to but not a multiple of
> > > args->alignment.
> > 
> > Entirely possible the logic isn't correct ;)
> 
> Out of curiosity, how do you guys normally test all this sort of logic?

With difficulty.

Exercising all the weird corner cases is really hard because of the
combinatorial explosion that occurs when you have 20 control
parameters, up to 5 different failure fallback strategies,
behavioural variations with delayed allocation, ENOSPC and AGFL
refilling accounting variations, etc. It's basically impossible to
enumerate and iterate the behaviour space fully. And then we have
filesystem geometry and application concurrency to consider, too.

All of the behaviours up to this point in time are best effort - we
don't guarantee allocation policy is followed when there is not
enough free space to execute the preferred policy - we slowly fall
back to mechanisms that are further from the policy but more likely
to succeed. i.e. as we approach ENOSPC, the allocation policies get
"looser" - they are less restrictive and more variable and don't
give as good results as when there is plenty of free space for the
allocation policy to make good decisions from.
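
To make the "strict policy first, looser policy later" idea concrete,
here is a minimal userspace sketch. It is a toy model, not XFS code:
struct free_extent, struct alloc_policy, example_policies,
try_policy() and alloc_with_fallback() are all hypothetical names
invented for this illustration, and the real allocator has far more
inputs than one free extent and three policies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model: a single free extent, in blocks. */
struct free_extent {
	unsigned int start;
	unsigned int len;
};

/* Hypothetical policy: what one allocation attempt insists on. */
struct alloc_policy {
	const char *name;
	unsigned int alignment;	/* 1 means no start alignment required */
	unsigned int len;	/* length this attempt asks for */
};

/* Ordered from strictest to loosest; the values are illustrative only. */
static const struct alloc_policy example_policies[] = {
	{ "aligned, full length",  4, 80 },
	{ "aligned, min length",   4, 76 },
	{ "unaligned, min length", 1, 76 },
};

/* Try one policy against the free extent; fill *got_start on success. */
static bool try_policy(const struct free_extent *fe,
		       const struct alloc_policy *p,
		       unsigned int *got_start)
{
	unsigned int aligned, slop;

	assert(p->alignment >= 1);
	/* Round the start up to the next multiple of the alignment. */
	aligned = (fe->start + p->alignment - 1) / p->alignment * p->alignment;
	slop = aligned - fe->start;	/* blocks skipped to reach alignment */
	if (slop >= fe->len)
		return false;
	if (fe->len - slop < p->len)
		return false;
	*got_start = aligned;
	return true;
}

/*
 * Walk the policies in order and stop at the first that succeeds.
 * Returns the index of the policy used, or -1 if none fit (the toy
 * equivalent of ENOSPC).
 */
static int alloc_with_fallback(const struct free_extent *fe,
			       const struct alloc_policy *policies,
			       size_t npolicies,
			       unsigned int *got_start)
{
	for (size_t i = 0; i < npolicies; i++)
		if (try_policy(fe, &policies[i], got_start))
			return (int)i;
	return -1;
}
```

With a 79 block free extent starting at block 1, the first policy
fails and the second succeeds at block 4. Shrink the free extent to
78 blocks and only the unaligned policy fits. The result is still a
successful allocation, just a less well placed one.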

As such, I only check that the macro-level behaviour is largely
correct when there is lots of free space. e.g. by doing something
like copying a kernel tree onto a new filesystem, then checking that
inode locality follows directories, block locality follows inodes,
large files are stripe aligned, extent size hint based inodes appear
to have the correct extent sizes, etc.

I then rely on the ENOSPC tests in fstests to find regressions that
might occur when the filesystem is stressed with little free space
available. These are a whole lot better than they used to be; root
cause analysis of ENOSPC corner case bugs has consumed months of my
working life over the past 20 years....

> I found this issue with the small program which I wrote to generate traffic.
> I could not find anything similar.

That's because it's largely impossible to write a test that is
deterministic and works on all possible test configurations. Even
a slight change to the size of the filesystem can result in vastly
different but still 100% correct allocation behaviour....

> > > Firstly, in this same scenario, in xfs_alloc_space_available() we calculate
> > > alloc_len = args->minlen + (args->alignment - 1) + args->alignslop = 76 + (4
> > > - 1) + 0 = 79, and then args->maxlen = 79.
> > 
> > That seems OK, we're doing aligned allocation and this is an ENOSPC
> > corner case so the aligned allocation should get rounded down in
> > xfs_alloc_fix_len() or rejected.
> > 
> > One thought I just had is that the args->maxlen adjustment shouldn't
> > be to "available space" - it should probably be set to args->minlen
> > because that's the aligned 'alloc_len' we checked available space
> > against. That would fix this, because then we'd have args->minlen =
> > args->maxlen = 76.
> > 
> > However, that only addresses this specific case, not the general
> > case of xfs_alloc_fix_len() failing to tail align the allocated
> > extent.
> > 
> > > Then xfs_alloc_fix_len() allows
> > > this as args->len == args->maxlen (=79), even though args->prod, mod = 4, 0.
> > 
> > Yeah, that smells wrong.
> 
> Would it be worth adding a debug assert for prod and mod being honoured from
> the allocator? xfs_alloc_fix_len() does have an assert later on and it does
> not help here.

I don't see any value in that because it's not actually a "fatal"
issue. See above about trading off policy strictness for allocation
success.

Again, this force alignment stuff is a fundamental change in this
behaviour - it wants "hard failure" rather than "trade off" and so
there isn't a general case for asserting that allocation must be
mod/prod aligned. Extent size hints are a -hint-, not a requirement,
and I don't want random assert failures in test systems because
debug kernels start treating hints as "must not fail" requirements.

> > I'd suggest that we've never noticed this until now because we
> > have never guaranteed extent alignment. Hence the occasional
> > short/unaligned extent being allocated in dark ENOSPC corners was
> > never an issue for anyone.
> > 
> > However, introducing a new alignment guarantee turns these sorts of
> > latent non-issues into bugs that need to be fixed. i.e. This is
> > exactly the sort of rare corner case behaviour I expected to be
> > flushed out by guaranteeing and then extensively testing allocation
> > alignments.
> > 
> > If you drop the rlen == args->maxlen check from
> > xfs_alloc_space_available(),
> 
> I assume that you mean xfs_alloc_fix_len()

Yes.

> > the problem should go away and the
> > extent gets trimmed to 76 blocks.
> 
> ..if so, then, yes, it does. We end up with this:
> 
>    0: [0..14079]:      42432..56511      0 (42432..56511)   14080
>    1: [14080..14687]:  177344..177951    0 (177344..177951)   608
>    2: [14688..14719]:  350720..350751    1 (171520..171551)    32

Good, that's how it should work. :) 
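
For anyone following along, the arithmetic in this case can be
sketched in userspace as below. This is a simplified model loosely
based on the checks discussed in this thread, not the kernel source:
struct len_args, space_needed() and fix_len() are hypothetical names,
and the maxlen_shortcut flag exists only so the behaviour with and
without the rlen == maxlen early return can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; field names mirror those quoted in the thread. */
struct len_args {
	unsigned int minlen;
	unsigned int maxlen;
	unsigned int alignment;
	unsigned int alignslop;
	unsigned int prod;	/* length must satisfy len % prod == mod */
	unsigned int mod;
};

/* Space checked for an aligned allocation: minlen plus worst-case slop. */
static unsigned int space_needed(const struct len_args *a)
{
	return a->minlen + (a->alignment - 1) + a->alignslop;
}

/*
 * Trim rlen down so that rlen % prod == mod, if that is possible
 * without going below minlen. With maxlen_shortcut set, a length
 * equal to maxlen is accepted untrimmed, which is the behaviour
 * that let the 79 block extent through.
 */
static unsigned int fix_len(const struct len_args *a, unsigned int rlen,
			    bool maxlen_shortcut)
{
	unsigned int k, trimmed;

	assert(a->mod < a->prod);
	if (a->prod <= 1 || rlen < a->mod ||
	    (a->mod == 0 && rlen < a->prod))
		return rlen;
	if (maxlen_shortcut && rlen == a->maxlen)
		return rlen;

	k = rlen % a->prod;
	if (k == a->mod)
		return rlen;
	if (k > a->mod)
		trimmed = rlen - (k - a->mod);
	else
		trimmed = rlen - a->prod + (a->mod - k);

	/* Trimming would violate minlen: keep the untrimmed length. */
	if (trimmed < a->minlen)
		return rlen;
	return trimmed;
}
```

Plugging in the numbers from this report (minlen = 76, alignment = 4,
alignslop = 0, prod = 4, mod = 0) gives 76 + 3 + 0 = 79 for the space
check. With maxlen = 79 and the shortcut in place, a 79 block extent
is returned as-is. Without the shortcut, 79 % 4 = 3 is trimmed off
and the extent comes back as 76 blocks.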

I'll update the patchset I have with these fixes.

-Dave.
-- 
Dave Chinner
david@fromorbit.com
