linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
@ 2009-11-17 22:56 akpm
  2009-12-02  0:09 ` Erez Zadok
  0 siblings, 1 reply; 7+ messages in thread
From: akpm @ 2009-11-17 22:56 UTC (permalink / raw)
  To: viro; +Cc: linux-fsdevel, akpm, rdreier, kirkland, tyhicks

From: Roland Dreier <rdreier@cisco.com>

 >  =============================================
 >  [ INFO: possible recursive locking detected ]
 >  2.6.31-2-generic #14~rbd3
 >  ---------------------------------------------
 >  firefox-3.5/4162 is trying to acquire lock:
 >   (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
 >
 >  but task is already holding lock:
 >   (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
 >
 >  other info that might help us debug this:
 >  3 locks held by firefox-3.5/4162:
 >   #0:  (&s->s_vfs_rename_mutex){+.+.+.}, at: [<ffffffff81139d31>] lock_rename+0x41/0xf0
 >   #1:  (&sb->s_type->i_mutex_key#11/1){+.+.+.}, at: [<ffffffff81139d5a>] lock_rename+0x6a/0xf0
 >   #2:  (&sb->s_type->i_mutex_key#11/2){+.+.+.}, at: [<ffffffff81139d6f>] lock_rename+0x7f/0xf0
 >
 >  stack backtrace:
 >  Pid: 4162, comm: firefox-3.5 Tainted: G         C 2.6.31-2-generic #14~rbd3
 >  Call Trace:
 >   [<ffffffff8108ae74>] print_deadlock_bug+0xf4/0x100
 >   [<ffffffff8108ce26>] validate_chain+0x4c6/0x750
 >   [<ffffffff8108d2e7>] __lock_acquire+0x237/0x430
 >   [<ffffffff8108d585>] lock_acquire+0xa5/0x150
 >   [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
 >   [<ffffffff815526ad>] __mutex_lock_common+0x4d/0x3d0
 >   [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
 >   [<ffffffff81139d31>] ? lock_rename+0x41/0xf0
 >   [<ffffffff8120eaf9>] ? ecryptfs_rename+0x99/0x170
 >   [<ffffffff81552b36>] mutex_lock_nested+0x46/0x60
 >   [<ffffffff81139d31>] lock_rename+0x41/0xf0
 >   [<ffffffff8120eb2a>] ecryptfs_rename+0xca/0x170
 >   [<ffffffff81139a9e>] vfs_rename_dir+0x13e/0x160
 >   [<ffffffff8113ac7e>] vfs_rename+0xee/0x290
 >   [<ffffffff8113c212>] ? __lookup_hash+0x102/0x160
 >   [<ffffffff8113d512>] sys_renameat+0x252/0x280
 >   [<ffffffff81133eb4>] ? cp_new_stat+0xe4/0x100
 >   [<ffffffff8101316a>] ? sysret_check+0x2e/0x69
 >   [<ffffffff8108c34d>] ? trace_hardirqs_on_caller+0x14d/0x190
 >   [<ffffffff8113d55b>] sys_rename+0x1b/0x20
 >   [<ffffffff81013132>] system_call_fastpath+0x16/0x1b

The trace above is totally reproducible by doing a cross-directory
rename on an ecryptfs directory.

The issue seems to be that sys_renameat() does lock_rename() then calls
into the filesystem; if the filesystem is ecryptfs, then
ecryptfs_rename() again does lock_rename() on the lower filesystem, and
lockdep can't tell that the two s_vfs_rename_mutexes are different.  It
seems an annotation like the following is sufficient to fix this (it
does get rid of the lockdep trace in my simple tests); however I would
like to make sure I'm not misunderstanding the locking, hence the CC
list...

Signed-off-by: Roland Dreier <rdreier@cisco.com>
Cc: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Cc: Dustin Kirkland <kirkland@canonical.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/super.c         |    1 +
 include/linux/fs.h |    1 +
 2 files changed, 2 insertions(+)

diff -puN fs/super.c~ecryptfs-another-lockdep-issue fs/super.c
--- a/fs/super.c~ecryptfs-another-lockdep-issue
+++ a/fs/super.c
@@ -95,6 +95,7 @@ static struct super_block *alloc_super(s
 		s->s_count = S_BIAS;
 		atomic_set(&s->s_active, 1);
 		mutex_init(&s->s_vfs_rename_mutex);
+		lockdep_set_class(&s->s_vfs_rename_mutex, &type->s_vfs_rename_key);
 		mutex_init(&s->s_dquot.dqio_mutex);
 		mutex_init(&s->s_dquot.dqonoff_mutex);
 		init_rwsem(&s->s_dquot.dqptr_sem);
diff -puN include/linux/fs.h~ecryptfs-another-lockdep-issue include/linux/fs.h
--- a/include/linux/fs.h~ecryptfs-another-lockdep-issue
+++ a/include/linux/fs.h
@@ -1748,6 +1748,7 @@ struct file_system_type {
 
 	struct lock_class_key s_lock_key;
 	struct lock_class_key s_umount_key;
+	struct lock_class_key s_vfs_rename_key;
 
 	struct lock_class_key i_lock_key;
 	struct lock_class_key i_mutex_key;
_

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
  2009-11-17 22:56 [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs akpm
@ 2009-12-02  0:09 ` Erez Zadok
  2009-12-02  0:13   ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Erez Zadok @ 2009-12-02  0:09 UTC (permalink / raw)
  To: akpm; +Cc: viro, linux-fsdevel, rdreier, kirkland, tyhicks

Any change this patch will be merged soon.  It's helpful for other stackable
file systems as well, and I've verified that it helps my ->rename method
avoid a lockdep_off/on pair.

Thanks,
Erez.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
  2009-12-02  0:09 ` Erez Zadok
@ 2009-12-02  0:13   ` Andrew Morton
  2009-12-02  0:47     ` Sage Weil
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2009-12-02  0:13 UTC (permalink / raw)
  To: Erez Zadok; +Cc: viro, linux-fsdevel, rdreier, kirkland, tyhicks

On Tue, 1 Dec 2009 19:09:23 -0500
Erez Zadok <ezk@cs.sunysb.edu> wrote:

> Any change this patch will be merged soon.  It's helpful for other stackable
> file systems as well, and I've verified that it helps my ->rename method
> avoid a lockdep_off/on pair.
> 

Quite a VFS patch backlog here, some held over from 2.6.32:

vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
raw-fix-rawctl-compat-ioctls-breakage-on-amd64-and-itanic.patch
vfs-improve-comment-describing-fget_light.patch
ecryptfs-another-lockdep-issue.patch
fs-anon_inodes-implement-dname.patch
fs-remove-unneeded-dcache_unhashed-tricks.patch
fs-improve-remountro-vs-buffercache-coherency.patch
vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch
vfs-clean-up-real_lookup.patch
vfs-remove-extraneous-null-d_inode-check-from-do_filp_open.patch

I'll send them out again in a week or so.  If nothing happens, I shall go
through them and make some merge decisions myself (be afraid!)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
  2009-12-02  0:13   ` Andrew Morton
@ 2009-12-02  0:47     ` Sage Weil
  2009-12-02  1:40       ` Ian Kent
  2009-12-02  9:32       ` Miklos Szeredi
  0 siblings, 2 replies; 7+ messages in thread
From: Sage Weil @ 2009-12-02  0:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Erez Zadok, viro, linux-fsdevel, rdreier, kirkland, tyhicks,
	raven

On Tue, 1 Dec 2009, Andrew Morton wrote:
> Quite a VFS patch backlog here, some held over from 2.6.32:
> 
> vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch

This problem will bite ceph without this patch.  I could work around it, 
but it would be nice to fix the real bug.  Al was worried about nfs:

> It's _probably_ OK now, but I'd really like to think about NFS 
> behaviour.  There are subtle traps in that area.

http://marc.info/?l=linux-kernel&m=121666695629006&w=2

> vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch
> vfs-clean-up-real_lookup.patch

These were waiting on the autofs4 locking changes and legacy autofs 
removal, which Ian has posted and appear to be ready.

sage


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
  2009-12-02  0:47     ` Sage Weil
@ 2009-12-02  1:40       ` Ian Kent
  2009-12-02  9:32       ` Miklos Szeredi
  1 sibling, 0 replies; 7+ messages in thread
From: Ian Kent @ 2009-12-02  1:40 UTC (permalink / raw)
  To: Sage Weil
  Cc: Andrew Morton, Erez Zadok, viro, linux-fsdevel, rdreier, kirkland,
	tyhicks

Sage Weil wrote:
> On Tue, 1 Dec 2009, Andrew Morton wrote:
>> Quite a VFS patch backlog here, some held over from 2.6.32:
>>
>> vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
> 
> This problem will bite ceph without this patch.  I could work around it, 
> but it would be nice to fix the real bug.  Al was worried about nfs:
> 
>> It's _probably_ OK now, but I'd really like to think about NFS 
>> behaviour.  There are subtle traps in that area.
> 
> http://marc.info/?l=linux-kernel&m=121666695629006&w=2
> 
>> vfs-make-real_lookup-do-dentry-revalidation-with-i_mutex-held.patch
>> vfs-clean-up-real_lookup.patch
> 
> These were waiting on the autofs4 locking changes and legacy autofs 
> removal, which Ian has posted and appear to be ready.

That's right.

The locking changes were sent to Andrew some time ago and they are
present in the mm kernel.

I've been side tracked and only recently posted the autofs removal
patches, no-one objected and I need to send them to Andrew, as in my
normal process.

Ian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
  2009-12-02  0:47     ` Sage Weil
  2009-12-02  1:40       ` Ian Kent
@ 2009-12-02  9:32       ` Miklos Szeredi
  2009-12-21 19:44         ` Sage Weil
  1 sibling, 1 reply; 7+ messages in thread
From: Miklos Szeredi @ 2009-12-02  9:32 UTC (permalink / raw)
  To: Sage Weil
  Cc: akpm, ezk, viro, linux-fsdevel, rdreier, kirkland, tyhicks, raven

On Tue, 1 Dec 2009, Sage Weil wrote:
> On Tue, 1 Dec 2009, Andrew Morton wrote:
> > Quite a VFS patch backlog here, some held over from 2.6.32:
> > 
> > vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
> 
> This problem will bite ceph without this patch.  I could work around it, 
> but it would be nice to fix the real bug.  Al was worried about nfs:
> 
> > It's _probably_ OK now, but I'd really like to think about NFS 
> > behaviour.  There are subtle traps in that area.
> 
> http://marc.info/?l=linux-kernel&m=121666695629006&w=2

I had a look at the NFS rename code again, and it does have the look
of of some legacy, especially the comments.  I'll will submit a short
series of cleanup patches in that area.  But otherwise it looks
perfectly OK, I could not see any traps, see below:

- Filesystems without FS_RENAME_DOES_D_MOVE

 o target has external references
   dentry_unhash() doesn't unhash it, so it doesn't need to be rehashed

 o target doesn't have external references

  - rename succeeded
    the d_rehash() is basically a NOP, the following d_move() unhash it again

  - rename failed
    the d_rehash() is an optimization, saves a ->lookup() next time the
    target is accessed.  This is unnecessary, we don't need to optimize
    this failure case

- Filesystems with FS_RENAME_DOES_D_MOVE

 o target has external references

  - rename succeeded
    dentry_unhash() doesn't unhash the target, but d_move() will, it
    will also exchange names and parents of the source/target
    dentries.  In this situation d_rehash will resurrect the target
    dentry under the old name, /proc/PID/fd/N symlink won't show it as
    "(deleted)" for example.  Lookup of the old name will result in
    ESTALE, since d_revalidate() fails but the dentry can't be
    invalidated because of the external refs.

  - rename failed
    dentry_unhash() doens't unhash it, filesystem doesn't do d_move(),
    so everything is fine, no need to rehash

 o target doesn't have external refereces

  - rename succeeded
    again the target dentry is resurrected in the cache, but the next
    lookup will successfully invalidate it.  But there's no need to
    rehash, it just slows down lookup

  - rename failed
    the d_rehash() is an optimization, saves a ->lookup() next time the
    target is accessed.  This is unnecessary

Does anyone see a fault in my analysis?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs
  2009-12-02  9:32       ` Miklos Szeredi
@ 2009-12-21 19:44         ` Sage Weil
  0 siblings, 0 replies; 7+ messages in thread
From: Sage Weil @ 2009-12-21 19:44 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: akpm, ezk, viro, linux-fsdevel, rdreier, kirkland, tyhicks, raven

On Wed, 2 Dec 2009, Miklos Szeredi wrote:

> On Tue, 1 Dec 2009, Sage Weil wrote:
> > On Tue, 1 Dec 2009, Andrew Morton wrote:
> > > Quite a VFS patch backlog here, some held over from 2.6.32:
> > > 
> > > vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
> > 
> > This problem will bite ceph without this patch.  I could work around it, 
> > but it would be nice to fix the real bug.  Al was worried about nfs:
> > 
> > > It's _probably_ OK now, but I'd really like to think about NFS 
> > > behaviour.  There are subtle traps in that area.
> > 
> > http://marc.info/?l=linux-kernel&m=121666695629006&w=2
> 
> I had a look at the NFS rename code again, and it does have the look
> of of some legacy, especially the comments.  I'll will submit a short
> series of cleanup patches in that area.  But otherwise it looks
> perfectly OK, I could not see any traps, see below:
> 
> - Filesystems without FS_RENAME_DOES_D_MOVE
> 
>  o target has external references
>    dentry_unhash() doesn't unhash it, so it doesn't need to be rehashed
> 
>  o target doesn't have external references
> 
>   - rename succeeded
>     the d_rehash() is basically a NOP, the following d_move() unhash it again
> 
>   - rename failed
>     the d_rehash() is an optimization, saves a ->lookup() next time the
>     target is accessed.  This is unnecessary, we don't need to optimize
>     this failure case

Yes.
 
> - Filesystems with FS_RENAME_DOES_D_MOVE
> 
>  o target has external references
> 
>   - rename succeeded
>     dentry_unhash() doesn't unhash the target, but d_move() will, it
>     will also exchange names and parents of the source/target
>     dentries.  In this situation d_rehash will resurrect the target
>     dentry under the old name, /proc/PID/fd/N symlink won't show it as
>     "(deleted)" for example.  Lookup of the old name will result in
>     ESTALE, since d_revalidate() fails but the dentry can't be
>     invalidated because of the external refs.

- Assuming d_revalidate() fails, yes. 
- If the name is stort (<32 chars or 40 chars, depending on arch), the 
name is copied, not swapped, so the target will get rehashed at the same 
name.

>   - rename failed
>     dentry_unhash() doens't unhash it, filesystem doesn't do d_move(),
>     so everything is fine, no need to rehash
> 
>  o target doesn't have external refereces
> 
>   - rename succeeded
>     again the target dentry is resurrected in the cache, but the next
>     lookup will successfully invalidate it.  But there's no need to
>     rehash, it just slows down lookup

Again, only assuming d_revalidate() explicitly fails on renamed-over 
dentries.
 
>   - rename failed
>     the d_rehash() is an optimization, saves a ->lookup() next time the
>     target is accessed.  This is unnecessary
> 
> Does anyone see a fault in my analysis?

Otherwise, this looks right to me.

sage

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-12-21 19:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-17 22:56 [patch 4/9] vfs: add lockdep annotation to s_vfs_rename_key for ecryptfs akpm
2009-12-02  0:09 ` Erez Zadok
2009-12-02  0:13   ` Andrew Morton
2009-12-02  0:47     ` Sage Weil
2009-12-02  1:40       ` Ian Kent
2009-12-02  9:32       ` Miklos Szeredi
2009-12-21 19:44         ` Sage Weil

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).