* [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
@ 2009-11-17 22:56 akpm
2009-12-02 0:09 ` Erez Zadok
0 siblings, 1 reply; 7+ messages in thread
From: akpm @ 2009-11-17 22:56 UTC (permalink / raw)
To: viro; +Cc: linux-fsdevel, akpm, rdreier, kirkland, tyhicks
From: Roland Dreier <rdreier@cisco.com>
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.31-2-generic #14~rbd3
> ---------------------------------------------
> firefox-3.5/4162 is trying to acquire lock:
> (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
>
> but task is already holding lock:
> (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
>
> other info that might help us debug this:
> 3 locks held by firefox-3.5/4162:
> #0: (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
> #1: (&sb->s_type->i_mutex_key#11/1){+.+.+.}, at: [<ffffffff81139d5a>] lock_rename+0x6a/0xf0
> #2: (&sb->s_type->i_mutex_key#11/2){+.+.+.}, at: [<ffffffff81139d6f>] lock_rename+0x7f/0xf0
>
> stack backtrace:
> Pid: 4162, comm: firefox-3.5 Tainted: G C 2.6.31-2-generic #14~rbd3
> Call Trace:
> [<ffffffff8108ae74>] print_deadlock_bug+0xf4/0x100
> [<ffffffff8108ce26>] validate_chain+0x4c6/0x750
> [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
> [<ffffffff8108d585>] lock_acquire+0xa5/0x150
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff815526ad>] __mutex_lock_common+0x4d/0x3d0
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
> [<ffffffff8120eaf9>] ? ecryptfs_rename+0x99/0x170
> [<ffffffff81552b36>] mutex_lock_nested+0x46/0x60
> [<ffffffff81139d31>] lock_rename+0x41/0xf0
> [<ffffffff8120eb2a>] ecryptfs_rename+0xca/0x170
> [<ffffffff81139a9e>] vfs_rename_dir+0x13e/0x160
> [<ffffffff8113ac7e>] vfs_rename+0xee/0x290
> [<ffffffff8113c212>] ? __lookup_hash+0x102/0x160
> [<ffffffff8113d512>] sys_renameat+0x252/0x280
> [<ffffffff81133eb4>] ? cp_new_stat+0xe4/0x100
> [<ffffffff8101316a>] ? sysret_check+0x2e/0x69
> [<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
> [<ffffffff8113d55b>] sys_rename+0x1b/0x20
> [<ffffffff81013132>] system_call_fastpath+0x16/0x1b
The trace above is totally reproducible by doing a cross-directory
rename on an ecryptfs directory.
The issue seems to be that sys_renameat() does lock_rename() then calls
into the filesystem; if the filesystem is ecryptfs, then
ecryptfs_rename() again does lock_rename() on the lower filesystem, and
lockdep can't tell that the two s_vfs_rename_mutexes are different. It
seems an annotation like the following is sufficient to fix this (it
does get rid of the lockdep trace in my simple tests); however I would
like to make sure I'm not misunderstanding the locking, hence the CC
list...
Signed-off-by: Roland Dreier <rdreier@cisco.com>
Cc: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
fs/super.c | 1 +
include/linux/fs.h | 1 +
2 files changed, 2 insertions(+)
diff -puN fs/super.c~ecryptfs-another-lockdep-issue fs/super.c
--- a/fs/super.c~ecryptfs-another-lockdep-issue
+++ a/fs/super.c
@@ -95,6 +95,7 @@ static struct super_block *alloc_super(s
s->s_count = S_BIAS;
atomic_set(&s->s_active, 1);
mutex_init(&s->s_vfs_rename_mutex);
+ lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
mutex_init(&s->s_dquot.dqio_mutex);
mutex_init(&s->s_dquot.dqonoff_mutex);
init_rwsem(&s->s_dquot.dqptr_sem);
diff -puN include/linux/fs.h~ecryptfs-another-lockdep-issue include/linux/fs.h
--- a/include/linux/fs.h~ecryptfs-another-lockdep-issue
+++ a/include/linux/fs.h
@@ -1748,6 +1748,7 @@ struct file_system_type {
struct lock_class_key s_lock_key;
struct lock_class_key s_umount_key;
+ struct lock_class_key s_vfs_rename_key;
struct lock_class_key i_lock_key;
struct lock_class_key i_mutex_key;
_
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs 2009-11-17 22:56 [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs akpm @ 2009-12-02 0:09 ` Erez Zadok 2009-12-02 0:13 ` Andrew Morton 0 siblings, 1 reply; 7+ messages in thread From: Erez Zadok @ 2009-12-02 0:09 UTC (permalink / raw) To: akpm; +Cc: viro, linux-fsdevel, rdreier, kirkland, tyhicks Any change this patch will be merged soon. It's helpful for other stackable file systems as well, and I've verified that it helps my ->rename method avoid a lockdep_off/on pair. Thanks, Erez. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs 2009-12-02 0:09 ` Erez Zadok @ 2009-12-02 0:13 ` Andrew Morton 2009-12-02 0:47 ` Sage Weil 0 siblings, 1 reply; 7+ messages in thread From: Andrew Morton @ 2009-12-02 0:13 UTC (permalink / raw) To: Erez Zadok; +Cc: viro, linux-fsdevel, rdreier, kirkland, tyhicks On Tue, 1 Dec 2009 19:09:23 -0500 Erez Zadok <ezk@cs.sunysb.edu> wrote: > Any change this patch will be merged soon. It's helpful for other stackable > file systems as well, and I've verified that it helps my ->rename method > avoid a lockdep_off/on pair. > Quite a VFS patch backlog here, some held over from 2.6.32: vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch raw-fix-rawctl-compat-ioctls-breakage-on-amd64-and-itanic.patch vfs-improve-comment-describing-fget_light.patch ecryptfs-another-lockdep-issue.patch fs-anon_inodes-implement-dname.patch fs-remove-unneeded-dcache_unhashed-tricks.patch fs-improve-remountro-vs-buffercache-coherency.patch vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch vfs-clean-up-real_lookup.patch vfs-remove-extraneous-null-d_inode-check-from-do_filp_open.patch I'll send them out again in a week or so. If nothing happens, I shall go through them and make some merge decisions myself (be afraid!) ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs 2009-12-02 0:13 ` Andrew Morton @ 2009-12-02 0:47 ` Sage Weil 2009-12-02 1:40 ` Ian Kent 2009-12-02 9:32 ` Miklos Szeredi 0 siblings, 2 replies; 7+ messages in thread From: Sage Weil @ 2009-12-02 0:47 UTC (permalink / raw) To: Andrew Morton Cc: Erez Zadok, viro, linux-fsdevel, rdreier, kirkland, tyhicks, raven On Tue, 1 Dec 2009, Andrew Morton wrote: > Quite a VFS patch backlog here, some held over from 2.6.32: > > vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch This problem will bite ceph without this patch. I could work around it, but it would be nice to fix the real bug. Al was worried about nfs: > It's _probably_ OK now, but I'd really like to think about NFS > behaviour. There are subtle traps in that area. http://marc.info/?l=linux-kernel&m=121666695629006&w=2 > vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch > vfs-clean-up-real_lookup.patch These were waiting on the autofs4 locking changes and legacy autofs removal, which Ian has posted and appear to be ready. sage ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs 2009-12-02 0:47 ` Sage Weil @ 2009-12-02 1:40 ` Ian Kent 2009-12-02 9:32 ` Miklos Szeredi 1 sibling, 0 replies; 7+ messages in thread From: Ian Kent @ 2009-12-02 1:40 UTC (permalink / raw) To: Sage Weil Cc: Andrew Morton, Erez Zadok, viro, linux-fsdevel, rdreier, kirkland, tyhicks Sage Weil wrote: > On Tue, 1 Dec 2009, Andrew Morton wrote: >> Quite a VFS patch backlog here, some held over from 2.6.32: >> >> vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch > > This problem will bite ceph without this patch. I could work around it, > but it would be nice to fix the real bug. Al was worried about nfs: > >> It's _probably_ OK now, but I'd really like to think about NFS >> behaviour. There are subtle traps in that area. > > http://marc.info/?l=linux-kernel&m=121666695629006&w=2 > >> vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch >> vfs-clean-up-real_lookup.patch > > These were waiting on the autofs4 locking changes and legacy autofs > removal, which Ian has posted and appear to be ready. That's right. The locking changes were sent to Andrew some time ago and they are present in the mm kernel. I've been side tracked and only recently posted the autofs removal patches, no-one objected and I need to send them to Andrew, as in my normal process. Ian ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs 2009-12-02 0:47 ` Sage Weil 2009-12-02 1:40 ` Ian Kent @ 2009-12-02 9:32 ` Miklos Szeredi 2009-12-21 19:44 ` Sage Weil 1 sibling, 1 reply; 7+ messages in thread From: Miklos Szeredi @ 2009-12-02 9:32 UTC (permalink / raw) To: Sage Weil Cc: akpm, ezk, viro, linux-fsdevel, rdreier, kirkland, tyhicks, raven On Tue, 1 Dec 2009, Sage Weil wrote: > On Tue, 1 Dec 2009, Andrew Morton wrote: > > Quite a VFS patch backlog here, some held over from 2.6.32: > > > > vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch > > This problem will bite ceph without this patch. I could work around it, > but it would be nice to fix the real bug. Al was worried about nfs: > > > It's _probably_ OK now, but I'd really like to think about NFS > > behaviour. There are subtle traps in that area. > > http://marc.info/?l=linux-kernel&m=121666695629006&w=2 I had a look at the NFS rename code again, and it does have the look of of some legacy, especially the comments. I'll will submit a short series of cleanup patches in that area. But otherwise it looks perfectly OK, I could not see any traps, see below: - Filesystems without FS_RENAME_DOES_D_MOVE o target has external references dentry_unhash() doesn't unhash it, so it doesn't need to be rehashed o target doesn't have external references - rename succeeded the d_rehash() is basically a NOP, the following d_move() unhash it again - rename failed the d_rehash() is an optimization, saves a ->lookup() next time the target is accessed. This is unnecessary, we don't need to optimize this failure case - Filesystems with FS_RENAME_DOES_D_MOVE o target has external references - rename succeeded dentry_unhash() doesn't unhash the target, but d_move() will, it will also exchange names and parents of the source/target dentries. In this situation d_rehash will resurrect the target dentry under the old name, /proc/PID/fd/N symlink won't show it as "(deleted)" for example. Lookup of the old name will result in ESTALE, since d_revalidate() fails but the dentry can't be invalidated because of the external refs. - rename failed dentry_unhash() doens't unhash it, filesystem doesn't do d_move(), so everything is fine, no need to rehash o target doesn't have external refereces - rename succeeded again the target dentry is resurrected in the cache, but the next lookup will successfully invalidate it. But there's no need to rehash, it just slows down lookup - rename failed the d_rehash() is an optimization, saves a ->lookup() next time the target is accessed. This is unnecessary Does anyone see a fault in my analysis? Thanks, Miklos ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs 2009-12-02 9:32 ` Miklos Szeredi @ 2009-12-21 19:44 ` Sage Weil 0 siblings, 0 replies; 7+ messages in thread From: Sage Weil @ 2009-12-21 19:44 UTC (permalink / raw) To: Miklos Szeredi Cc: akpm, ezk, viro, linux-fsdevel, rdreier, kirkland, tyhicks, raven On Wed, 2 Dec 2009, Miklos Szeredi wrote: > On Tue, 1 Dec 2009, Sage Weil wrote: > > On Tue, 1 Dec 2009, Andrew Morton wrote: > > > Quite a VFS patch backlog here, some held over from 2.6.32: > > > > > > vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch > > > > This problem will bite ceph without this patch. I could work around it, > > but it would be nice to fix the real bug. Al was worried about nfs: > > > > > It's _probably_ OK now, but I'd really like to think about NFS > > > behaviour. There are subtle traps in that area. > > > > http://marc.info/?l=linux-kernel&m=121666695629006&w=2 > > I had a look at the NFS rename code again, and it does have the look > of of some legacy, especially the comments. I'll will submit a short > series of cleanup patches in that area. But otherwise it looks > perfectly OK, I could not see any traps, see below: > > - Filesystems without FS_RENAME_DOES_D_MOVE > > o target has external references > dentry_unhash() doesn't unhash it, so it doesn't need to be rehashed > > o target doesn't have external references > > - rename succeeded > the d_rehash() is basically a NOP, the following d_move() unhash it again > > - rename failed > the d_rehash() is an optimization, saves a ->lookup() next time the > target is accessed. This is unnecessary, we don't need to optimize > this failure case Yes. > - Filesystems with FS_RENAME_DOES_D_MOVE > > o target has external references > > - rename succeeded > dentry_unhash() doesn't unhash the target, but d_move() will, it > will also exchange names and parents of the source/target > dentries. In this situation d_rehash will resurrect the target > dentry under the old name, /proc/PID/fd/N symlink won't show it as > "(deleted)" for example. Lookup of the old name will result in > ESTALE, since d_revalidate() fails but the dentry can't be > invalidated because of the external refs. - Assuming d_revalidate() fails, yes. - If the name is stort (<32 chars or 40 chars, depending on arch), the name is copied, not swapped, so the target will get rehashed at the same name. > - rename failed > dentry_unhash() doens't unhash it, filesystem doesn't do d_move(), > so everything is fine, no need to rehash > > o target doesn't have external refereces > > - rename succeeded > again the target dentry is resurrected in the cache, but the next > lookup will successfully invalidate it. But there's no need to > rehash, it just slows down lookup Again, only assuming d_revalidate() explicitly fails on renamed-over dentries. > - rename failed > the d_rehash() is an optimization, saves a ->lookup() next time the > target is accessed. This is unnecessary > > Does anyone see a fault in my analysis? Otherwise, this looks right to me. sage ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-12-21 19:44 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-11-17 22:56 [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs akpm 2009-12-02 0:09 ` Erez Zadok 2009-12-02 0:13 ` Andrew Morton 2009-12-02 0:47 ` Sage Weil 2009-12-02 1:40 ` Ian Kent 2009-12-02 9:32 ` Miklos Szeredi 2009-12-21 19:44 ` Sage Weil
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).