From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/6] xfs: bunmapi has unnecessary AG lock ordering issues
Date: Wed, 26 May 2021 23:16:43 -0700 [thread overview]
Message-ID: <20210527061643.GD202121@locust> (raw)
In-Reply-To: <20210527045202.1155628-3-david@fromorbit.com>
On Thu, May 27, 2021 at 02:51:58PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> large directory block size operations are assert failing because
> xfs_bunmapi() is not completely removing fragmented directory blocks
> like so:
>
> XFS: Assertion failed: done, file: fs/xfs/libxfs/xfs_dir2.c, line: 677
> ....
> Call Trace:
> xfs_dir2_shrink_inode+0x1a8/0x210
> xfs_dir2_block_to_sf+0x2ae/0x410
> xfs_dir2_block_removename+0x21a/0x280
> xfs_dir_removename+0x195/0x1d0
> xfs_rename+0xb79/0xc50
> ? avc_has_perm+0x8d/0x1a0
> ? avc_has_perm_noaudit+0x9a/0x120
> xfs_vn_rename+0xdb/0x150
> vfs_rename+0x719/0xb50
> ? __lookup_hash+0x6a/0xa0
> do_renameat2+0x413/0x5e0
> __x64_sys_rename+0x45/0x50
> do_syscall_64+0x3a/0x70
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> We are aborting the bunmapi() pass because of this specific chunk of
> code:
>
> /*
> * Make sure we don't touch multiple AGF headers out of order
> * in a single transaction, as that could cause AB-BA deadlocks.
> */
> if (!wasdel && !isrt) {
> agno = XFS_FSB_TO_AGNO(mp, del.br_startblock);
> if (prev_agno != NULLAGNUMBER && prev_agno > agno)
> break;
> prev_agno = agno;
> }
>
> This is designed to prevent deadlocks in AGF locking when freeing
> multiple extents by ensuring that we only ever lock in increasing
> AG number order. Unfortunately, this also violates the "bunmapi will
> always succeed" semantic that some high level callers depend on,
> such as xfs_dir2_shrink_inode(), xfs_da_shrink_inode() and
> xfs_inactive_symlink_rmt().
>
> This AG lock ordering was introduced back in 2017 to fix deadlocks
> triggered by generic/299 as reported here:
>
> https://lore.kernel.org/linux-xfs/800468eb-3ded-9166-20a4-047de8018582@gmail.com/
>
> This codebase is old enough that it was before we were defering all
> AG based extent freeing from within xfs_bunmapi(). THat is, we never
> actually lock AGs in xfs_bunmapi() any more - every non-rt based
> extent free is added to the defer ops list, as is all BMBT block
> freeing. And RT extents are not RT based, so there's no lock
> ordering issues associated with them.
>
> Hence this AGF lock ordering code is both broken and dead. Let's
> just remove it so that the large directory block code works reliably
> again.
>
> Tested against xfs/538 and generic/299 which is the original test
> that exposed the deadlocks that this code fixed.
>
> Fixes: 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi")
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
Seems reasonable nowadays; will throw it at the test cloud and see what
happens.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> ---
> fs/xfs/libxfs/xfs_bmap.c | 11 -----------
> 1 file changed, 11 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 3f8b6da09261..a3e0e6f672d6 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -5349,7 +5349,6 @@ __xfs_bunmapi(
> xfs_fsblock_t sum;
> xfs_filblks_t len = *rlen; /* length to unmap in file */
> xfs_fileoff_t max_len;
> - xfs_agnumber_t prev_agno = NULLAGNUMBER, agno;
> xfs_fileoff_t end;
> struct xfs_iext_cursor icur;
> bool done = false;
> @@ -5441,16 +5440,6 @@ __xfs_bunmapi(
> del = got;
> wasdel = isnullstartblock(del.br_startblock);
>
> - /*
> - * Make sure we don't touch multiple AGF headers out of order
> - * in a single transaction, as that could cause AB-BA deadlocks.
> - */
> - if (!wasdel && !isrt) {
> - agno = XFS_FSB_TO_AGNO(mp, del.br_startblock);
> - if (prev_agno != NULLAGNUMBER && prev_agno > agno)
> - break;
> - prev_agno = agno;
> - }
> if (got.br_startoff < start) {
> del.br_startoff = start;
> del.br_blockcount -= start - got.br_startoff;
> --
> 2.31.1
>
next prev parent reply other threads:[~2021-05-27 6:16 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-27 4:51 [PATCH 0/6] xfs: bunmapi needs updating for deferred freeing Dave Chinner
2021-05-27 4:51 ` [PATCH 1/6] xfs: btree format inode forks can have zero extents Dave Chinner
2021-05-27 6:15 ` Darrick J. Wong
2021-05-27 4:51 ` [PATCH 2/6] xfs: bunmapi has unnecessary AG lock ordering issues Dave Chinner
2021-05-27 6:16 ` Darrick J. Wong [this message]
2021-05-27 4:51 ` [PATCH 3/6] xfs: xfs_itruncate_extents has no extent count limitation Dave Chinner
2021-05-31 12:55 ` Chandan Babu R
2021-05-31 13:05 ` Chandan Babu R
2021-05-31 23:28 ` Dave Chinner
2021-06-01 6:42 ` Chandan Babu R
2021-05-27 4:52 ` [PATCH 4/6] xfs: add a free space extent change reservation Dave Chinner
2021-05-27 6:38 ` kernel test robot
2021-05-27 6:38 ` kernel test robot
2021-05-27 7:03 ` kernel test robot
2021-05-27 7:03 ` [RFC PATCH] xfs: xfs_allocfree_extent_res can be static kernel test robot
2021-06-02 21:37 ` [PATCH 4/6] xfs: add a free space extent change reservation Darrick J. Wong
2021-05-27 4:52 ` [PATCH 5/6] xfs: factor free space tree transaciton reservations Dave Chinner
2021-06-02 21:36 ` Darrick J. Wong
2021-05-27 4:52 ` [PATCH 6/6] xfs: reduce transaction reservation for freeing extents Dave Chinner
2021-05-27 6:19 ` Darrick J. Wong
2021-05-27 8:52 ` Dave Chinner
2021-05-28 0:01 ` Darrick J. Wong
2021-05-28 2:30 ` Dave Chinner
2021-05-28 5:30 ` Darrick J. Wong
2021-05-31 10:02 ` [PATCH 0/6] xfs: bunmapi needs updating for deferred freeing Chandan Babu R
2021-05-31 22:41 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210527061643.GD202121@locust \
--to=djwong@kernel.org \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox