Linux EXT4 FS development
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	linux-fsdevel@vger.kernel.org, Miklos Szeredi <miklos@szeredi.hu>,
	"Darrick J. Wong" <djwong@kernel.org>, Ted Tso <tytso@mit.edu>,
	Jaegeuk Kim <jaegeuk@kernel.org>,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/6] fs: Fix directory corruption when moving directories
Date: Fri, 26 May 2023 17:58:08 +0200	[thread overview]
Message-ID: <20230526-schrebergarten-vortag-9cd89694517e@brauner> (raw)
In-Reply-To: <20230525100654.15069-1-jack@suse.cz>

On Thu, May 25, 2023 at 12:16:06PM +0200, Jan Kara wrote:
> Hello,
> 
> this patch set fixes a problem with cross directory renames originally reported
> in [1]. To quickly sum it up some filesystems (so far we know at least about
> ext4, udf, f2fs, ocfs2, likely also reiserfs, gfs2 and others) need to lock the
> directory when it is being renamed into another directory. This is because we
> need to update the parent pointer in the directory in that case and if that
> races with other operation on the directory (in particular a conversion from
> one directory format into another), bad things can happen.
> 
> So far we've done the locking in the filesystem code but recently Darrick
> pointed out [2] that we've missed the RENAME_EXCHANGE case in our ext4 fix.
> That one is particularly nasty because RENAME_EXCHANGE can arbitrarily mix
> regular files and directories and proper lock ordering is not achievable in the
> filesystems alone.
> 
> This patch set adds locking into vfs_rename() so that not only parent
> directories but also moved inodes (regardless whether they are directories or
> not) are locked when calling into the filesystem.

This locking is trauma inducing.

So I was staring at this for a long time and the thing that bothered me
big time was that I couldn't easily figure out how we ended up with the
locking scheme that we have. So I went digging. Corrections and
additions very welcome...

Before 914a6e0ea12 ("Import 2.3.51pre1") locking for rename was based on
s_vfs_rename_sem and i_sem. For both within- and across-directory
renames s_vfs_rename_sem was acquired and the i_sem on the parent
directories was acquired but the target wasn't locked.

Then 914a6e0ea12 ("Import 2.3.51pre1") introduced an additional i_zombie
semaphore to protect against create, link, mknod, mkdir, unlink, and
rmdir on the target. So i_zombie had to be acquired during
vfs_rename_dir() on both parent and the victim but only if the source
was a directory. Back then RENAME_EXCHANGE didn't exist so if source was
a directory then target if it existed must've been a directory as well.

The original reasoning behind only locking the target if the source was
a directory was based on some filesystems not being able to deal with
opened but unlinked directories.

The i_zombie finally died in 1b3d7c93c6d ("[PATCH] (2.5.4) death of
->i_zombie") and a new locking scheme was introduced. The
s_vfs_rename_sem was now only used for across-directory renames. Instead
of having i_zombie and i_sem only i_sem was left. Now, both
vfs_rename_dir(/* if renaming directory */) and vfs_rename_other()
grabbed i_sem on both parents and on the target. So now the target was
always explicitly protected against a concurrent unlink or rmdir
as that would be done as part of the rename operation and that race
would just been awkward to allow afaict. Probably always was.

The locking of source however is a different story. This was introduced
in 6cedba8962f4 ("vfs: take i_mutex on renamed file") to prevent new
delegations from being acquired during rename, link, or unlink. So now
both source and target were locked. The target seemingly because it
would be unlinked in the filesystem's rename method and the source to
prevent new delegations from being acquired. So, since leases can only
be taken on regular files vfs_setlease()/generic_setlease() directories
were never considered for locking. So the lock never had to be acquired
for source directories.

So in any case, under the assumption that the broad strokes are correct
there seems to be no inherent reason why locking source and target if
they're directories couldn't be done if the ordering is well-defined.
Which is what originally made me hesitate. IOW, given my current
knowledge this seems ok.

Frankly, if we end up unconditionally locking source and target we're in
a better place from a pure maintainability perspective as far as I'm
concerned. Even if we end up taking the lock pointlessly for e.g., NFS
with RENAME_EXCHANGE. The last thing we need is more conditions on when
things are locked and why.

/me goes off into the weekend

  parent reply	other threads:[~2023-05-26 15:58 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-25 10:16 [PATCH 0/6] fs: Fix directory corruption when moving directories Jan Kara
2023-05-25 10:16 ` [PATCH 1/6] ext4: Remove ext4 locking of moved directory Jan Kara
2023-05-25 10:16 ` [PATCH 2/6] Revert "udf: Protect rename against modification of moved directory" Jan Kara
2023-05-25 10:16 ` [PATCH 3/6] Revert "f2fs: fix potential corruption when moving a directory" Jan Kara
2023-05-25 10:16 ` [PATCH 4/6] fs: Establish locking order for unrelated directories Jan Kara
2023-05-26  9:45   ` Christian Brauner
2023-05-29 12:41     ` Jan Kara
2023-05-30 12:42       ` Christian Brauner
2023-05-25 10:16 ` [PATCH 5/6] fs: Lock moved directories Jan Kara
2023-05-25 10:16 ` [PATCH 6/6] fs: Restrict lock_two_nondirectories() to non-directory inodes Jan Kara
2023-05-26 12:13   ` Amir Goldstein
2023-05-29 12:42     ` Jan Kara
2023-05-26 15:58 ` Christian Brauner [this message]
2023-05-31 14:09   ` [PATCH 0/6] fs: Fix directory corruption when moving directories Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230526-schrebergarten-vortag-9cd89694517e@brauner \
    --to=brauner@kernel.org \
    --cc=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=jaegeuk@kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=tytso@mit.edu \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox