* [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
@ 2008-07-11 19:47 Sage Weil
2008-07-11 20:53 ` Miklos Szeredi
0 siblings, 1 reply; 8+ messages in thread
From: Sage Weil @ 2008-07-11 19:47 UTC (permalink / raw)
To: linux-fsdevel, viro; +Cc: akpm, hch
Al, Christoph,
Zach just ran into this bug as well. Does this fix look reasonable?
thanks-
sage
----
>From Sage Weil <sage@newdream.net>
d_move() is strangely implemented in that it swaps the position of
new_dentry and old_dentry in the namespace. This is admittedly weird (see
comments for d_move_locked()), but normally harmless: even though
new_dentry swaps places with old_dentry, it is unhashed, and won't be seen
by a subsequent lookup.
However, vfs_rename_dir() doesn't properly account for filesystems with
FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it
unhashes the new_dentry prior to the rename() iop and rehashes it after,
but doesn't account for the possibility that rename() may have swapped
{old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it rehashes
new_dentry (now the old renamed-from name, which d_move() expected to go
away), such that a subsequent lookup will find it.
To correct this, move vfs_rename_dir()'s call to d_move() _before_ the
target inode mutex is dealt with. Since d_move() will have been called
for all filesystems at this point, there is no need to rehash new_dentry
unless the rename failed. (If the rename succeeded, old_dentry should
already be rehashed in the new location.)
The only in-tree filesystems with FS_RENAME_DOES_D_MOVE are ocfs2 and nfs.
My suspicion is that they are not bitten by this particular bug because
the incorrectly rehashed new_dentry gets rejected by d_revalidate().
This was caught by the recently posted POSIX fstest suite, rename/10.t
test 62 (and others) on ceph. With this patch, all tests succeed.
Signed-off-by: Sage Weil <sage@newdream.net>
---
fs/namei.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
--- linux-2.6.25-orig/fs/namei.c 2008-04-16 19:49:44.000000000 -0700
+++ linux/fs/namei.c 2008-04-18 13:59:30.000000000 -0700
@@ -2488,17 +2488,18 @@
error = -EBUSY;
else
error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
+ if (!error)
+ if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE))
+ d_move(old_dentry, new_dentry);
if (target) {
if (!error)
target->i_flags |= S_DEAD;
mutex_unlock(&target->i_mutex);
- if (d_unhashed(new_dentry))
+ /* only rehash new_dentry if the rename failed */
+ if (error && d_unhashed(new_dentry))
d_rehash(new_dentry);
dput(new_dentry);
}
- if (!error)
- if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE))
- d_move(old_dentry,new_dentry);
return error;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
2008-07-11 19:47 [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems Sage Weil
@ 2008-07-11 20:53 ` Miklos Szeredi
2008-07-11 22:12 ` Zach Brown
2008-07-11 22:15 ` Sage Weil
0 siblings, 2 replies; 8+ messages in thread
From: Miklos Szeredi @ 2008-07-11 20:53 UTC (permalink / raw)
To: sage; +Cc: linux-fsdevel, viro, akpm, hch
On Fri, 11 Jul 2008, Sage Weil wrote:
> However, vfs_rename_dir() doesn't properly account for filesystems with
> FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it
> unhashes the new_dentry prior to the rename() iop and rehashes it after,
> but doesn't account for the possibility that rename() may have swapped
> {old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it rehashes
> new_dentry (now the old renamed-from name, which d_move() expected to go
> away), such that a subsequent lookup will find it.
>
> To correct this, move vfs_rename_dir()'s call to d_move() _before_ the
> target inode mutex is dealt with. Since d_move() will have been called
> for all filesystems at this point, there is no need to rehash new_dentry
> unless the rename failed. (If the rename succeeded, old_dentry should
> already be rehashed in the new location.)
I think rehashing the new dentry is bogus, even on error. And it
looks racy with lookup as well.
I wonder what the original reason for that was? Git history doesn't
tell...
So a better fix would be just to remove the rehashing completely.
Does the below patch work for you?
Thanks,
Miklos
---
fs/namei.c | 2 --
1 file changed, 2 deletions(-)
Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c 2008-07-11 22:09:32.000000000 +0200
+++ linux-2.6/fs/namei.c 2008-07-11 22:40:16.000000000 +0200
@@ -2643,8 +2643,6 @@ static int vfs_rename_dir(struct inode *
if (!error)
target->i_flags |= S_DEAD;
mutex_unlock(&target->i_mutex);
- if (d_unhashed(new_dentry))
- d_rehash(new_dentry);
dput(new_dentry);
}
if (!error)
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
2008-07-11 20:53 ` Miklos Szeredi
@ 2008-07-11 22:12 ` Zach Brown
2008-07-18 10:59 ` Miklos Szeredi
2008-07-11 22:15 ` Sage Weil
1 sibling, 1 reply; 8+ messages in thread
From: Zach Brown @ 2008-07-11 22:12 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: sage, linux-fsdevel, viro, akpm, hch
Miklos Szeredi wrote:
> On Fri, 11 Jul 2008, Sage Weil wrote:
>> However, vfs_rename_dir() doesn't properly account for filesystems with
>> FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it
>> unhashes the new_dentry prior to the rename() iop and rehashes it after,
>> but doesn't account for the possibility that rename() may have swapped
>> {old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it rehashes
>> new_dentry (now the old renamed-from name, which d_move() expected to go
>> away), such that a subsequent lookup will find it.
>>
>> To correct this, move vfs_rename_dir()'s call to d_move() _before_ the
>> target inode mutex is dealt with. Since d_move() will have been called
>> for all filesystems at this point, there is no need to rehash new_dentry
>> unless the rename failed. (If the rename succeeded, old_dentry should
>> already be rehashed in the new location.)
>
> I think rehashing the new dentry is bogus, even on error.
So we'd just come back through lookup to repopulate the existing
destination name that vfs_rename_dir() unhashed before calling
->rename() in the case that the rename fails? That seems gross, but
relatively harmless.
> So a better fix would be just to remove the rehashing completely.
> Does the below patch work for you?
It'd work for my case, yeah.
- z
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
2008-07-11 22:12 ` Zach Brown
@ 2008-07-18 10:59 ` Miklos Szeredi
2008-07-18 19:44 ` Zach Brown
0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2008-07-18 10:59 UTC (permalink / raw)
To: zach.brown; +Cc: miklos, sage, linux-fsdevel, viro, akpm, hch
On Fri, 11 Jul 2008, Zach Brown wrote:
> Miklos Szeredi wrote:
> > I think rehashing the new dentry is bogus, even on error.
>
> So we'd just come back through lookup to repopulate the existing
> destination name that vfs_rename_dir() unhashed before calling
> ->rename() in the case that the rename fails? That seems gross, but
> relatively harmless.
We are talking about an _extremely_ rare event. Even the
"vfs_rename_dir() with positive target" is very rare, let alone a
failing one.
If we are going to worry about directory removal failure cases, we
should start with rmdir(), which is a wee bit more common, than the
above case here.
Miklos
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
2008-07-18 10:59 ` Miklos Szeredi
@ 2008-07-18 19:44 ` Zach Brown
0 siblings, 0 replies; 8+ messages in thread
From: Zach Brown @ 2008-07-18 19:44 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: sage, linux-fsdevel, viro, akpm, hch
Miklos Szeredi wrote:
> On Fri, 11 Jul 2008, Zach Brown wrote:
>> Miklos Szeredi wrote:
>>> I think rehashing the new dentry is bogus, even on error.
>> So we'd just come back through lookup to repopulate the existing
>> destination name that vfs_rename_dir() unhashed before calling
>> ->rename() in the case that the rename fails? That seems gross, but
>> relatively harmless.
>
> We are talking about an _extremely_ rare event.
Agreed, which is why I called it relatively harmless.
- z
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
2008-07-11 20:53 ` Miklos Szeredi
2008-07-11 22:12 ` Zach Brown
@ 2008-07-11 22:15 ` Sage Weil
1 sibling, 0 replies; 8+ messages in thread
From: Sage Weil @ 2008-07-11 22:15 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: linux-fsdevel, viro, akpm, hch
On Fri, 11 Jul 2008, Miklos Szeredi wrote:
> On Fri, 11 Jul 2008, Sage Weil wrote:
> > However, vfs_rename_dir() doesn't properly account for filesystems with
> > FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it
> > unhashes the new_dentry prior to the rename() iop and rehashes it after,
> > but doesn't account for the possibility that rename() may have swapped
> > {old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it rehashes
> > new_dentry (now the old renamed-from name, which d_move() expected to go
> > away), such that a subsequent lookup will find it.
> >
> > To correct this, move vfs_rename_dir()'s call to d_move() _before_ the
> > target inode mutex is dealt with. Since d_move() will have been called
> > for all filesystems at this point, there is no need to rehash new_dentry
> > unless the rename failed. (If the rename succeeded, old_dentry should
> > already be rehashed in the new location.)
>
> I think rehashing the new dentry is bogus, even on error. And it
> looks racy with lookup as well.
>
> I wonder what the original reason for that was? Git history doesn't
> tell...
I assume just to leave the dentry in the same stat we originally found it
in?
> So a better fix would be just to remove the rehashing completely.
> Does the below patch work for you?
This would work as well, yeah. I've no real preference, here...
thanks-
sage
>
> Thanks,
> Miklos
>
> ---
> fs/namei.c | 2 --
> 1 file changed, 2 deletions(-)
>
> Index: linux-2.6/fs/namei.c
> ===================================================================
> --- linux-2.6.orig/fs/namei.c 2008-07-11 22:09:32.000000000 +0200
> +++ linux-2.6/fs/namei.c 2008-07-11 22:40:16.000000000 +0200
> @@ -2643,8 +2643,6 @@ static int vfs_rename_dir(struct inode *
> if (!error)
> target->i_flags |= S_DEAD;
> mutex_unlock(&target->i_mutex);
> - if (d_unhashed(new_dentry))
> - d_rehash(new_dentry);
> dput(new_dentry);
> }
> if (!error)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* [patch] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
@ 2008-07-21 11:41 Miklos Szeredi
2008-07-21 19:02 ` Al Viro
0 siblings, 1 reply; 8+ messages in thread
From: Miklos Szeredi @ 2008-07-21 11:41 UTC (permalink / raw)
To: viro; +Cc: sage, zach.brown, linux-fsdevel, linux-kernel
From: Miklos Szeredi <mszeredi@suse.cz>
vfs_rename_dir() doesn't properly account for filesystems with
FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it
unhashes the new_dentry prior to the rename() iop and rehashes it
after, but doesn't account for the possibility that rename() may have
swapped {old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it
rehashes new_dentry (now the old renamed-from name, which d_move()
expected to go away), such that a subsequent lookup will find it.
This was caught by the recently posted POSIX fstest suite, rename/10.t
test 62 (and others) on ceph.
Fix by not rehashing the new dentry. Rehashing would only make sense
if the rename failed (which should happen extremely rarely), but we
cannot handle that case correctly 100% of the time anyway, so...
Reported-by: Sage Weil <sage@newdream.net>
CC: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
fs/namei.c | 2 --
1 file changed, 2 deletions(-)
Index: linux-2.6/fs/namei.c
===================================================================
--- linux-2.6.orig/fs/namei.c 2008-07-21 09:46:07.000000000 +0200
+++ linux-2.6/fs/namei.c 2008-07-21 11:56:01.000000000 +0200
@@ -2574,8 +2574,6 @@ static int vfs_rename_dir(struct inode *
if (!error)
target->i_flags |= S_DEAD;
mutex_unlock(&target->i_mutex);
- if (d_unhashed(new_dentry))
- d_rehash(new_dentry);
dput(new_dentry);
}
if (!error)
^ permalink raw reply [flat|nested] 8+ messages in thread* Re: [patch] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
2008-07-21 11:41 [patch] " Miklos Szeredi
@ 2008-07-21 19:02 ` Al Viro
0 siblings, 0 replies; 8+ messages in thread
From: Al Viro @ 2008-07-21 19:02 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: sage, zach.brown, linux-fsdevel, linux-kernel
On Mon, Jul 21, 2008 at 01:41:47PM +0200, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
>
> vfs_rename_dir() doesn't properly account for filesystems with
> FS_RENAME_DOES_D_MOVE. If new_dentry has a target inode attached, it
> unhashes the new_dentry prior to the rename() iop and rehashes it
> after, but doesn't account for the possibility that rename() may have
> swapped {old,new}_dentry. For FS_RENAME_DOES_D_MOVE filesystems, it
> rehashes new_dentry (now the old renamed-from name, which d_move()
> expected to go away), such that a subsequent lookup will find it.
>
> This was caught by the recently posted POSIX fstest suite, rename/10.t
> test 62 (and others) on ceph.
>
> Fix by not rehashing the new dentry. Rehashing would only make sense
> if the rename failed (which should happen extremely rarely), but we
> cannot handle that case correctly 100% of the time anyway, so...
Lovely. AFAICS, that's a fallout from
commit 349457ccf2592c14bdf13b6706170ae2e94931b1
Author: Mark Fasheh <mark.fasheh@oracle.com>
Date: Fri Sep 8 14:22:21 2006 -0700
[PATCH] Allow file systems to manually d_move() inside of ->rename()
that had allowed that crap for directories. Note that d_rehash() used
to be needed (d_move() would unhash the source otherwise) and d_move()
used to be unconditional until the changeset above.
It's _probably_ OK now, but I'd really like to think about NFS behaviour.
There are subtle traps in that area.
BTW, failing rename() is trivial - just have a non-empty target...
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-07-21 19:02 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-11 19:47 [PATCH] vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems Sage Weil
2008-07-11 20:53 ` Miklos Szeredi
2008-07-11 22:12 ` Zach Brown
2008-07-18 10:59 ` Miklos Szeredi
2008-07-18 19:44 ` Zach Brown
2008-07-11 22:15 ` Sage Weil
-- strict thread matches above, loose matches on Subject: below --
2008-07-21 11:41 [patch] " Miklos Szeredi
2008-07-21 19:02 ` Al Viro
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).