* [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
@ 2009-11-17 22:56 akpm
2009-12-02 0:09 ` Erez Zadok
0 siblings, 1 reply; 7+ messages in thread
From: akpm @ 2009-11-17 22:56 UTC (permalink / raw)
To: viro; +Cc: linux-fsdevel, akpm, rdreier, kirkland, tyhicks
From: Roland Dreier <rdreier@cisco.com>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.31-2-generic #14~rbd3
> ---------------------------------------------
> firefox-3.5/4162 is trying to acquire lock:
> (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
>
> but task is already holding lock:
> (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
>
> other info that might help us debug this:
> 3 locks held by firefox-3.5/4162:
> #0: (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
> #1: (&sb->s_type->i_mutex_key#11/1){+.+.+.}, at: [<ffffffff81139d5a>] lock_rename+0x6a/0xf0
> #2: (&sb->s_type->i_mutex_key#11/2){+.+.+.}, at: [<ffffffff81139d6f>] lock_rename+0x7f/0xf0
>
> stack backtrace:
> Pid: 4162, comm: firefox-3.5 Tainted: G C 2.6.31-2-generic #14~rbd3
> Call Trace:
> [<ffffffff8108ae74>] print_deadlock_bug+0xf4/0x100
> [<ffffffff8108ce26>] validate_chain+0x4c6/0x750
> [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
> [<ffffffff8108d585>] lock_acquire+0xa5/0x150
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff815526ad>] __mutex_lock_common+0x4d/0x3d0
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff8120eaf9>] ? ecryptfs_rename+0x99/0x170
> [<ffffffff81552b36>] mutex_lock_nested+0x46/0x60
> [<ffffffff81139d31>] lock_rename+0x41/0xf0
> [<ffffffff8120eb2a>] ecryptfs_rename+0xca/0x170
> [<ffffffff81139a9e>] vfs_rename_dir+0x13e/0x160
> [<ffffffff8113ac7e>] vfs_rename+0xee/0x290
> [<ffffffff8113c212>] ? __lookup_hash+0x102/0x160
> [<ffffffff8113d512>] sys_renameat+0x252/0x280
> [<ffffffff81133eb4>] ? cp_new_stat+0xe4/0x100
> [<ffffffff8101316a>] ? sysret_check+0x2e/0x69
> [<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
> [<ffffffff8113d55b>] sys_rename+0x1b/0x20
> [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
The trace above is totally reproducible by doing a cross-directory
rename on an ecryptfs directory.
The issue seems to be that sys_renameat() does lock_rename() then calls
into the filesystem; if the filesystem is ecryptfs, then
ecryptfs_rename() again does lock_rename() on the lower filesystem, and
lockdep can't tell that the two s_vfs_rename_mutexes are different. It
seems an annotation like the following is sufficient to fix this (it
does get rid of the lockdep trace in my simple tests); however I would
like to make sure I'm not misunderstanding the locking, hence the CC
list...
Signed-off-by: Roland Dreier <rdreier@cisco.com>
Cc: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/super.c | 1 +
include/linux/fs.h | 1 +
2 files changed, 2 insertions(+)
diff -puN fs/super.c~ecryptfs-another-lockdep-issue fs/super.c
--- a/fs/super.c~ecryptfs-another-lockdep-issue
+++ a/fs/super.c
@@ -95,6 +95,7 @@ static struct super_block *alloc_super(s
s->s_count = S_BIAS;
atomic_set(&s->s_active, 1);
mutex_init(&s->s_vfs_rename_mutex);
+ lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
mutex_init(&s->s_dquot.dqio_mutex);
mutex_init(&s->s_dquot.dqonoff_mutex);
init_rwsem(&s->s_dquot.dqptr_sem);
diff -puN include/linux/fs.h~ecryptfs-another-lockdep-issue include/linux/fs.h
--- a/include/linux/fs.h~ecryptfs-another-lockdep-issue
+++ a/include/linux/fs.h
@@ -1748,6 +1748,7 @@ struct file_system_type {
struct lock_class_key s_lock_key;
struct lock_class_key s_umount_key;
+ struct lock_class_key s_vfs_rename_key;
struct lock_class_key i_lock_key;
struct lock_class_key i_mutex_key;
_
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
2009-11-17 22:56 [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs akpm
@ 2009-12-02 0:09 ` Erez Zadok
2009-12-02 0:13 ` Andrew Morton
0 siblings, 1 reply; 7+ messages in thread
From: Erez Zadok @ 2009-12-02 0:09 UTC (permalink / raw)
To: akpm; +Cc: viro, linux-fsdevel, rdreier, kirkland, tyhicks
Any change this patch will be merged soon. It's helpful for other stackable
file systems as well, and I've verified that it helps my ->rename method
avoid a lockdep_off/on pair.
Thanks,
Erez.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
2009-12-02 0:09 ` Erez Zadok
@ 2009-12-02 0:13 ` Andrew Morton
2009-12-02 0:47 ` Sage Weil
0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2009-12-02 0:13 UTC (permalink / raw)
To: Erez Zadok; +Cc: viro, linux-fsdevel, rdreier, kirkland, tyhicks
On Tue, 1 Dec 2009 19:09:23 -0500
Erez Zadok <ezk@cs.sunysb.edu> wrote:
> Any change this patch will be merged soon. It's helpful for other stackable
> file systems as well, and I've verified that it helps my ->rename method
> avoid a lockdep_off/on pair.
>
Quite a VFS patch backlog here, some held over from 2.6.32:
vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
raw-fix-rawctl-compat-ioctls-breakage-on-amd64-and-itanic.patch
vfs-improve-comment-describing-fget_light.patch
ecryptfs-another-lockdep-issue.patch
fs-anon_inodes-implement-dname.patch
fs-remove-unneeded-dcache_unhashed-tricks.patch
fs-improve-remountro-vs-buffercache-coherency.patch
vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch
vfs-clean-up-real_lookup.patch
vfs-remove-extraneous-null-d_inode-check-from-do_filp_open.patch
I'll send them out again in a week or so. If nothing happens, I shall go
through them and make some merge decisions myself (be afraid!)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
2009-12-02 0:13 ` Andrew Morton
@ 2009-12-02 0:47 ` Sage Weil
2009-12-02 1:40 ` Ian Kent
2009-12-02 9:32 ` Miklos Szeredi
0 siblings, 2 replies; 7+ messages in thread
From: Sage Weil @ 2009-12-02 0:47 UTC (permalink / raw)
To: Andrew Morton
Cc: Erez Zadok, viro, linux-fsdevel, rdreier, kirkland, tyhicks,
raven
On Tue, 1 Dec 2009, Andrew Morton wrote:
> Quite a VFS patch backlog here, some held over from 2.6.32:
>
> vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
This problem will bite ceph without this patch. I could work around it,
but it would be nice to fix the real bug. Al was worried about nfs:
> It's _probably_ OK now, but I'd really like to think about NFS
> behaviour. There are subtle traps in that area.
http://marc.info/?l=linux-kernel&m=121666695629006&w=2
> vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch
> vfs-clean-up-real_lookup.patch
These were waiting on the autofs4 locking changes and legacy autofs
removal, which Ian has posted and appear to be ready.
sage
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
2009-12-02 0:47 ` Sage Weil
@ 2009-12-02 1:40 ` Ian Kent
2009-12-02 9:32 ` Miklos Szeredi
1 sibling, 0 replies; 7+ messages in thread
From: Ian Kent @ 2009-12-02 1:40 UTC (permalink / raw)
To: Sage Weil
Cc: Andrew Morton, Erez Zadok, viro, linux-fsdevel, rdreier, kirkland,
tyhicks
Sage Weil wrote:
> On Tue, 1 Dec 2009, Andrew Morton wrote:
>> Quite a VFS patch backlog here, some held over from 2.6.32:
>>
>> vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
>
> This problem will bite ceph without this patch. I could work around it,
> but it would be nice to fix the real bug. Al was worried about nfs:
>
>> It's _probably_ OK now, but I'd really like to think about NFS
>> behaviour. There are subtle traps in that area.
>
> http://marc.info/?l=linux-kernel&m=121666695629006&w=2
>
>> vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch
>> vfs-clean-up-real_lookup.patch
>
> These were waiting on the autofs4 locking changes and legacy autofs
> removal, which Ian has posted and appear to be ready.
That's right.
The locking changes were sent to Andrew some time ago and they are
present in the mm kernel.
I've been side tracked and only recently posted the autofs removal
patches, no-one objected and I need to send them to Andrew, as in my
normal process.
Ian
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
2009-12-02 0:47 ` Sage Weil
2009-12-02 1:40 ` Ian Kent
@ 2009-12-02 9:32 ` Miklos Szeredi
2009-12-21 19:44 ` Sage Weil
1 sibling, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2009-12-02 9:32 UTC (permalink / raw)
To: Sage Weil
Cc: akpm, ezk, viro, linux-fsdevel, rdreier, kirkland, tyhicks, raven
On Tue, 1 Dec 2009, Sage Weil wrote:
> On Tue, 1 Dec 2009, Andrew Morton wrote:
> > Quite a VFS patch backlog here, some held over from 2.6.32:
> >
> > vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
>
> This problem will bite ceph without this patch. I could work around it,
> but it would be nice to fix the real bug. Al was worried about nfs:
>
> > It's _probably_ OK now, but I'd really like to think about NFS
> > behaviour. There are subtle traps in that area.
>
> http://marc.info/?l=linux-kernel&m=121666695629006&w=2
I had a look at the NFS rename code again, and it does have the look
of of some legacy, especially the comments. I'll will submit a short
series of cleanup patches in that area. But otherwise it looks
perfectly OK, I could not see any traps, see below:
- Filesystems without FS_RENAME_DOES_D_MOVE
o target has external references
dentry_unhash() doesn't unhash it, so it doesn't need to be rehashed
o target doesn't have external references
- rename succeeded
the d_rehash() is basically a NOP, the following d_move() unhash it again
- rename failed
the d_rehash() is an optimization, saves a ->lookup() next time the
target is accessed. This is unnecessary, we don't need to optimize
this failure case
- Filesystems with FS_RENAME_DOES_D_MOVE
o target has external references
- rename succeeded
dentry_unhash() doesn't unhash the target, but d_move() will, it
will also exchange names and parents of the source/target
dentries. In this situation d_rehash will resurrect the target
dentry under the old name, /proc/PID/fd/N symlink won't show it as
"(deleted)" for example. Lookup of the old name will result in
ESTALE, since d_revalidate() fails but the dentry can't be
invalidated because of the external refs.
- rename failed
dentry_unhash() doens't unhash it, filesystem doesn't do d_move(),
so everything is fine, no need to rehash
o target doesn't have external refereces
- rename succeeded
again the target dentry is resurrected in the cache, but the next
lookup will successfully invalidate it. But there's no need to
rehash, it just slows down lookup
- rename failed
the d_rehash() is an optimization, saves a ->lookup() next time the
target is accessed. This is unnecessary
Does anyone see a fault in my analysis?
Thanks,
Miklos
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
2009-12-02 9:32 ` Miklos Szeredi
@ 2009-12-21 19:44 ` Sage Weil
0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2009-12-21 19:44 UTC (permalink / raw)
To: Miklos Szeredi
Cc: akpm, ezk, viro, linux-fsdevel, rdreier, kirkland, tyhicks, raven
On Wed, 2 Dec 2009, Miklos Szeredi wrote:
> On Tue, 1 Dec 2009, Sage Weil wrote:
> > On Tue, 1 Dec 2009, Andrew Morton wrote:
> > > Quite a VFS patch backlog here, some held over from 2.6.32:
> > >
> > > vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
> >
> > This problem will bite ceph without this patch. I could work around it,
> > but it would be nice to fix the real bug. Al was worried about nfs:
> >
> > > It's _probably_ OK now, but I'd really like to think about NFS
> > > behaviour. There are subtle traps in that area.
> >
> > http://marc.info/?l=linux-kernel&m=121666695629006&w=2
>
> I had a look at the NFS rename code again, and it does have the look
> of of some legacy, especially the comments. I'll will submit a short
> series of cleanup patches in that area. But otherwise it looks
> perfectly OK, I could not see any traps, see below:
>
> - Filesystems without FS_RENAME_DOES_D_MOVE
>
> o target has external references
> dentry_unhash() doesn't unhash it, so it doesn't need to be rehashed
>
> o target doesn't have external references
>
> - rename succeeded
> the d_rehash() is basically a NOP, the following d_move() unhash it again
>
> - rename failed
> the d_rehash() is an optimization, saves a ->lookup() next time the
> target is accessed. This is unnecessary, we don't need to optimize
> this failure case
Yes.
> - Filesystems with FS_RENAME_DOES_D_MOVE
>
> o target has external references
>
> - rename succeeded
> dentry_unhash() doesn't unhash the target, but d_move() will, it
> will also exchange names and parents of the source/target
> dentries. In this situation d_rehash will resurrect the target
> dentry under the old name, /proc/PID/fd/N symlink won't show it as
> "(deleted)" for example. Lookup of the old name will result in
> ESTALE, since d_revalidate() fails but the dentry can't be
> invalidated because of the external refs.
- Assuming d_revalidate() fails, yes.
- If the name is stort (<32 chars or 40 chars, depending on arch), the
name is copied, not swapped, so the target will get rehashed at the same
name.
> - rename failed
> dentry_unhash() doens't unhash it, filesystem doesn't do d_move(),
> so everything is fine, no need to rehash
>
> o target doesn't have external refereces
>
> - rename succeeded
> again the target dentry is resurrected in the cache, but the next
> lookup will successfully invalidate it. But there's no need to
> rehash, it just slows down lookup
Again, only assuming d_revalidate() explicitly fails on renamed-over
dentries.
> - rename failed
> the d_rehash() is an optimization, saves a ->lookup() next time the
> target is accessed. This is unnecessary
>
> Does anyone see a fault in my analysis?
Otherwise, this looks right to me.
sage
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-12-21 19:44 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-17 22:56 [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs akpm
2009-12-02 0:09 ` Erez Zadok
2009-12-02 0:13 ` Andrew Morton
2009-12-02 0:47 ` Sage Weil
2009-12-02 1:40 ` Ian Kent
2009-12-02 9:32 ` Miklos Szeredi
2009-12-21 19:44 ` Sage Weil
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).