From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH] xfs: _{attr,data}_map_shared should take ILOCK_EXCL until iread_extents is completely done
Date: Tue, 11 Apr 2023 13:20:35 +1000 [thread overview]
Message-ID: <20230411032035.GZ3223426@dread.disaster.area> (raw)
In-Reply-To: <20230411010638.GF360889@frogsfrogsfrogs>
On Mon, Apr 10, 2023 at 06:06:38PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> While fuzzing the data fork extent count on a btree-format directory
> with xfs/375, I observed the following (excerpted) splat:
>
> XFS: Assertion failed: xfs_isilocked(ip, XFS_ILOCK_EXCL), file: fs/xfs/libxfs/xfs_bmap.c, line: 1208
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 43192 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs]
> Call Trace:
> <TASK>
> xfs_iread_extents+0x1af/0x210 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xchk_dir_walk+0xb8/0x190 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xchk_parent_count_parent_dentries+0x41/0x80 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xchk_parent_validate+0x199/0x2e0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xchk_parent+0xdf/0x130 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xfs_scrub_metadata+0x2b8/0x730 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xfs_scrubv_metadata+0x38b/0x4d0 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xfs_ioc_scrubv_metadata+0x111/0x160 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> xfs_file_ioctl+0x367/0xf50 [xfs 09f66509ece4938760fac7de64732a0cbd3e39cd]
> __x64_sys_ioctl+0x82/0xa0
> do_syscall_64+0x2b/0x80
> entry_SYSCALL_64_after_hwframe+0x46/0xb0
>
> The cause of this is a race condition in xfs_ilock_data_map_shared,
> which performs an unlocked access to the data fork to guess which lock
> mode it needs:
>
> Thread 0 Thread 1
>
> xfs_need_iread_extents
> <observe no iext tree>
> xfs_ilock(..., ILOCK_EXCL)
> xfs_iread_extents
> <observe no iext tree>
> <check ILOCK_EXCL>
> <load bmbt extents into iext>
> <notice iext size doesn't
> match nextents>
> xfs_need_iread_extents
> <observe iext tree>
> xfs_ilock(..., ILOCK_SHARED)
> <tear down iext tree>
> xfs_iunlock(..., ILOCK_EXCL)
> xfs_iread_extents
> <observe no iext tree>
> <check ILOCK_EXCL>
> *BOOM*
>
> Fix this race by adding a flag to the xfs_ifork structure to indicate
> that we have not yet read in the extent records and changing the
> predicate to look at the flag state, not if_height. The memory barrier
> ensures that the flag will not be set until the very end of the
> function.
>
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> fs/xfs/libxfs/xfs_bmap.c | 2 ++
> fs/xfs/libxfs/xfs_inode_fork.c | 2 ++
> fs/xfs/libxfs/xfs_inode_fork.h | 3 ++-
> 3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 34de6e6898c4..5f96e7ce7b4a 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1171,6 +1171,8 @@ xfs_iread_extents(
> goto out;
> }
> ASSERT(ir.loaded == xfs_iext_count(ifp));
> + smp_mb();
> + ifp->if_needextents = 0;
Hmmm - if this is to ensure that everything above is completed
before the clearing of this flag is visible everywhere else, then we
should be able to use load_acquire/store_release semantics? i.e. the
above is
smp_store_release(ifp->if_needextents, 0);
and we use
smp_load_acquire(ifp->if_needextents)
when we need to read the value to ensure that all the changes made
before the relevant stores are also visible?
> return 0;
> out:
> xfs_iext_destroy(ifp);
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> index 6b21760184d9..eadae924dc42 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> @@ -174,6 +174,8 @@ xfs_iformat_btree(
> int level;
>
> ifp = xfs_ifork_ptr(ip, whichfork);
> + ifp->if_needextents = 1;
Hmmm - what's the guarantee that the reader will see ifp->if_format
set correctly if they if_needextents = 1?
Wouldn't it be better to set this at the same time we set the
ifp->if_format value? We clear it unconditionally above in
xfs_iread_extents(), so why not set it unconditionally there, too,
before we start. i.e.
/*
* Set the format before we set needsextents with release
* semantics. This ensures that we can use acquire semantics
* on needextents in xfs_need_iread_extents() and be
* guaranteed to see a valid format value after that load.
*/
ifp->if_format = dip->di_format;
smp_store_release(ifp->if_needextents, 1);
That then means xfs_need_iread_extents() is guaranteed to see a
valid ifp->if_format if ifp->if_needextents is set if we do:
/* returns true if the fork has extents but they are not read in yet. */
static inline bool xfs_need_iread_extents(struct xfs_ifork *ifp)
{
/* see xfs_iread_extents() for needextents semantics */
return smp_load_acquire(ifp->if_needextents) &&
ifp->if_format == XFS_DINODE_FMT_BTREE;
}
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2023-04-11 3:20 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-11 1:06 [PATCH] xfs: _{attr,data}_map_shared should take ILOCK_EXCL until iread_extents is completely done Darrick J. Wong
2023-04-11 3:20 ` Dave Chinner [this message]
2023-04-11 4:18 ` Darrick J. Wong
2023-04-11 21:58 ` Dave Chinner
2023-04-11 5:11 ` Christoph Hellwig
2023-04-11 5:06 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230411032035.GZ3223426@dread.disaster.area \
--to=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=hch@infradead.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox