From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Al Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
Ted Tso <tytso@mit.edu>,
linux-xfs@vger.kernel.org
Subject: Re: Locking issue with directory renames
Date: Fri, 24 Feb 2023 19:46:57 -0800
Message-ID: <Y/mEsfyhNCs8orCY@magnolia>
In-Reply-To: <20230117214457.GG360264@dread.disaster.area>
On Wed, Jan 18, 2023 at 08:44:57AM +1100, Dave Chinner wrote:
> On Tue, Jan 17, 2023 at 01:37:35PM +0100, Jan Kara wrote:
> > Hello!
> >
> > I've come across an interesting issue that was spotted by syzbot [1]. The
> > report is against UDF but AFAICS the problem exists for ext4 as well and
> > possibly other filesystems. The problem is the following: When we are
> > renaming directory 'dir' say rename("foo/dir", "bar/") we lock 'foo' and
> > 'bar' but 'dir' is unlocked because the locking done by vfs_rename() is
> >
> > 	if (!is_dir || (flags & RENAME_EXCHANGE))
> > 		lock_two_nondirectories(source, target);
> > 	else if (target)
> > 		inode_lock(target);
> >
> > However some filesystems (e.g. UDF but ext4 as well, I suspect XFS may be
> > hurt by this as well because it converts among multiple dir formats) need
> > to update parent pointer in 'dir' and nothing protects this update against
> > a race with someone else modifying 'dir'. Now this is mostly harmless
> > because the parent pointer (".." directory entry) is at the beginning of
> > the directory and stable; however, if for example the directory is converted
> > from packed "in-inode" format to "expanded" format as a result of a
> > concurrent operation on 'dir', the filesystem gets corrupted (or crashes, as
> > in the case of UDF).
>
> No, xfs_rename() does not have this problem - we pass four inodes to
> the function - the source directory, source inode, destination
> directory and destination inode.

Um, I think it does have this problem. xfs_readdir thinks it can parse
a shortform inode without taking the ILOCK:

	if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL)
		return xfs_dir2_sf_getdents(&args, ctx);

	lock_mode = xfs_ilock_data_map_shared(dp);
	error = xfs_dir2_isblock(&args, &isblock);

So xfs_dir2_sf_replace can rewrite the shortform structure (or even
convert it to block format!) while readdir is accessing it. Or am I
missing something?
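To make the ordering concern concrete, here is a rough userspace model of
"take the lock first, then test the format". It is only a sketch, not XFS
code and not a proposed patch: every toy_* name is hypothetical, the "lock"
is a plain counter that checks lock discipline in a single thread, and it
provides no real mutual exclusion.

```c
#include <assert.h>

/* Hypothetical stand-ins for the directory data fork formats. */
enum toy_fmt { TOY_FMT_LOCAL, TOY_FMT_BLOCK };

struct toy_inode {
	int		ilock_shared;	/* shared "ILOCK" holders (model only) */
	enum toy_fmt	format;		/* shortform (local) or block */
	unsigned long	dotdot;		/* inode number stored in ".." */
};

static void toy_ilock_shared(struct toy_inode *ip)
{
	ip->ilock_shared++;
}

static void toy_iunlock_shared(struct toy_inode *ip)
{
	assert(ip->ilock_shared > 0);
	ip->ilock_shared--;
}

/* Shortform parser: refuses to run unless the caller holds the lock. */
static unsigned long toy_sf_get_dotdot(const struct toy_inode *ip)
{
	assert(ip->ilock_shared > 0);
	assert(ip->format == TOY_FMT_LOCAL);
	return ip->dotdot;
}

/*
 * Reader: the format is sampled only while the lock is held, so a
 * writer that holds the lock exclusively cannot change it underneath
 * the parser. Returns 0 for non-shortform directories (that path is
 * elided in this model).
 */
static unsigned long toy_readdir_dotdot(struct toy_inode *ip)
{
	unsigned long ino = 0;

	toy_ilock_shared(ip);
	if (ip->format == TOY_FMT_LOCAL)
		ino = toy_sf_get_dotdot(ip);
	toy_iunlock_shared(ip);
	return ino;
}
```

The snippet quoted earlier does these two steps in the opposite order: it
tests the format and parses the shortform data before any lock is taken.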

--D

> In the above case, "dir/" is passed to XFS as the source inode - the
> src_dir is "foo/", the target dir is "bar/" and the target inode is
> null. src_dir != target_dir, so we set the "new_parent" flag. The
> source inode is a directory, so we set the src_is_directory flag,
> too.
>
> We lock all three inodes that are passed. We do various things, then
> run:
>
> 	if (new_parent && src_is_directory) {
> 		/*
> 		 * Rewrite the ".." entry to point to the new
> 		 * directory.
> 		 */
> 		error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot,
> 					target_dp->i_ino, spaceres);
> 		ASSERT(error != -EEXIST);
> 		if (error)
> 			goto out_trans_cancel;
> 	}
>
> which replaces the ".." entry in source inode atomically whilst it
> is locked. Any directory format changes that occur during the
> rename are done while the ILOCK is held, so they appear atomic to
> outside observers that are trying to parse the directory structure
> (e.g. readdir).
>
> > So we'd need to lock 'source' if it is a directory.
>
> Yup, and XFS goes further by always locking the source inode in a
> rename, even if it is not a directory. This ensures the inode being
> moved cannot have its metadata otherwise modified whilst the rename
> is in progress, even if that modification would have no impact on
> the rename. It's a pretty strict interpretation of "rename is an
> atomic operation", but it avoids accidentally missing nasty corner
> cases like the one described above...
>
> > Ideally this would
> > happen in VFS as otherwise I bet a lot of filesystems will get this wrong
> > so could vfs_rename() lock 'source' if it is a dir as well? Essentially
> > this would amount to calling lock_two_nondirectories(source, target)
> > unconditionally but that would become a serious misnomer ;). Al, any
> > thought?
>
> XFS just has a function that allows for an arbitrary number of
> inodes to be locked in the given order: xfs_lock_inodes(). For
> rename, the lock order is determined by xfs_sort_for_rename().
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
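
On the xfs_lock_inodes()/xfs_sort_for_rename() point quoted above, the
general idea of locking every inode involved in a rename in one fixed order
can be modelled roughly as below. This is a userspace sketch under stated
assumptions, not the kernel implementation: the toy_* names are hypothetical,
"locking" is just a flag, and the order is assumed to be ascending inode
number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_inode {
	unsigned long	ino;	/* inode number, used as the sort key */
	int		locked;	/* model-only lock flag */
};

/* qsort comparator over an array of inode pointers, ascending by ino. */
static int toy_cmp_ino(const void *a, const void *b)
{
	const struct toy_inode *x = *(struct toy_inode *const *)a;
	const struct toy_inode *y = *(struct toy_inode *const *)b;

	return (x->ino > y->ino) - (x->ino < y->ino);
}

/*
 * Gather the distinct, non-NULL inodes of a rename into tab[] and sort
 * them. Returns the number of entries (at most 4 in this model).
 */
static size_t toy_sort_for_rename(struct toy_inode *src_dir,
				  struct toy_inode *target_dir,
				  struct toy_inode *src,
				  struct toy_inode *target,
				  struct toy_inode *tab[4])
{
	size_t n = 0;

	tab[n++] = src_dir;
	if (target_dir != src_dir)
		tab[n++] = target_dir;
	tab[n++] = src;
	if (target)
		tab[n++] = target;

	qsort(tab, n, sizeof(tab[0]), toy_cmp_ino);
	return n;
}

/* "Lock" each inode in array order, checking the order is strictly rising. */
static void toy_lock_inodes(struct toy_inode **tab, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(i == 0 || tab[i - 1]->ino < tab[i]->ino);
		assert(!tab[i]->locked);
		tab[i]->locked = 1;
	}
}
```

For rename("foo/dir", "bar/") with no existing target, the table holds
'foo', 'bar' and 'dir', so the directory being moved is locked along with
both parents. Because every caller takes the locks in the same order, two
concurrent renames over the same inodes cannot deadlock against each other.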