From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags
Date: Sat, 18 Mar 2023 11:38:42 +1100 [thread overview]
Message-ID: <ZBUIElGrUQnvxprN@destitution> (raw)
In-Reply-To: <20230316164721.GK11376@frogsfrogsfrogs>
On Thu, Mar 16, 2023 at 09:47:21AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag
> want us to perform a non-blocking scan of the AGs for free space. There
> are no ordering constraints for non-blocking AGF lock acquisition, so
> the scan can freely start over at AG 0 even when minimum_agno > 0.
>
> This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent
> pointer patchset applied and the realtime volume enabled. I observed
> the following sequence as part of an xfs_dir_createname call:
>
> 0. Fragment the free space, then allocate nearly all the free space in
> all AGs except AG 0.
>
> 1. Create a directory in AG 2 and let it grow for a while.
>
> 2. Try to allocate 2 blocks to expand the dirent part of a directory.
> The space will be allocated out of AG 0, but the allocation will not
> be contiguous. This (I think) activates the LOWMODE allocator.
>
> 3. The bmapi call decides to convert from extents to bmbt format and
> tries to allocate 1 block. This allocation request calls
> xfs_alloc_vextent_start_ag with the inode number, which starts the
> scan at AG 2. We ignore AG 0 (with all its free space) and instead
> scrape AG 2 and 3 for more space. We find one block, but this now
> kicks t_highest_agno to 3.
>
> 4. The createname call decides it needs to split the dabtree. It tries
> to allocate even more space with xfs_alloc_vextent_start_ag, but now
> we're constrained to AG 3, and we don't find the space. The
> createname returns ENOSPC and the filesystem shuts down.
>
> This change fixes the problem by making the trylock scan wrap around to
> AG 0 if it doesn't like the AGs that it finds. Since the current
> transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and
> we take space from the AG that has plenty.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> fs/xfs/libxfs/xfs_alloc.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 8999e38e1bed..bd7112d430b6 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -3326,11 +3326,14 @@ xfs_alloc_vextent_iterate_ags(
> uint32_t flags)
> {
> struct xfs_mount *mp = args->mp;
> + xfs_agnumber_t restart_agno = minimum_agno;
> xfs_agnumber_t agno;
> int error = 0;
>
> + if (flags & XFS_ALLOC_FLAG_TRYLOCK)
> + restart_agno = 0;
> restart:
> - for_each_perag_wrap_range(mp, start_agno, minimum_agno,
> + for_each_perag_wrap_range(mp, start_agno, restart_agno,
> mp->m_sb.sb_agcount, agno, args->pag) {
> args->agno = agno;
> error = xfs_alloc_vextent_prepare_ag(args);
> @@ -3369,6 +3372,7 @@ xfs_alloc_vextent_iterate_ags(
> */
> if (flags) {
> flags = 0;
> + restart_agno = minimum_agno;
> goto restart;
> }
Looks good. Pretty much what we talked about on #xfs.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
prev parent reply other threads:[~2023-03-18 0:38 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-16 16:47 [PATCH] xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags Darrick J. Wong
2023-03-18 0:38 ` Dave Chinner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZBUIElGrUQnvxprN@destitution \
--to=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.