linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* NFS: possible circular locking  i_mutex <> mmap_sem
@ 2009-06-18  2:32 Wu Fengguang
  2009-06-18  2:52 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Wu Fengguang @ 2009-06-18  2:32 UTC (permalink / raw)
  To: linux-nfs, LKML, Andrew Morton, Ingo Molnar

Hi,

I got a lockdep warning on a stress testing a nfsroot desktop.
The bits puzzled me is how come the page_fault() happens in
generic_file_buffered_write()?

Thanks,
Fengguang

[ 2638.515865] =======================================================
[ 2638.519743] [ INFO: possible circular locking dependency detected ]
[ 2638.519743] 2.6.30-rc8-mm1 #307
[ 2638.519743] -------------------------------------------------------
[ 2638.519743] firefox-bin/3399 is trying to acquire lock:
[ 2638.519743]  (&mm->mmap_sem){++++++}, at: [<ffffffff81548471>] do_page_fault+0x301/0x330
[ 2638.519743]
[ 2638.519743] but task is already holding lock:
[ 2638.519743]  (&sb->s_type->i_mutex_key#6){+.+.+.}, at: [<ffffffff810c2bd2>] generic_file_aio_write+0x52/0xd0
[ 2638.519743]
[ 2638.519743] which lock already depends on the new lock.
[ 2638.519743]
[ 2638.519743]
[ 2638.519743] the existing dependency chain (in reverse order) is:
[ 2638.519743]
[ 2638.519743] -> #1 (&sb->s_type->i_mutex_key#6){+.+.+.}:
[ 2638.519743]        [<ffffffff8107c066>] __lock_acquire+0x12b6/0x1b40
[ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
[ 2638.519743]        [<ffffffff8154328e>] mutex_lock_nested+0x5e/0x390
[ 2638.519743]        [<ffffffff811be15c>] nfs_revalidate_mapping+0xac/0x110    ==> takes i_mutex in nfs_invalidate_mapping()
[ 2638.519743]        [<ffffffff811bba25>] nfs_file_mmap+0x55/0x80
[ 2638.519743]        [<ffffffff810e2bb7>] mmap_region+0x427/0x600
[ 2638.519743]        [<ffffffff810e305e>] do_mmap_pgoff+0x2ce/0x3f0
[ 2638.519743]        [<ffffffff81010c26>] sys_mmap+0x106/0x130                 ==> takes mmap_sem
[ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
[ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff
[ 2638.519743]
[ 2638.519743] -> #0 (&mm->mmap_sem){++++++}:
[ 2638.519743]        [<ffffffff8107c206>] __lock_acquire+0x1456/0x1b40
[ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
[ 2638.519743]        [<ffffffff81543d3b>] down_read+0x4b/0x80
[ 2638.519743]        [<ffffffff81548471>] do_page_fault+0x301/0x330            ==> takes mmap_sem
[ 2638.519743]        [<ffffffff81545b55>] page_fault+0x25/0x30
[ 2638.519743]        [<ffffffff810c1970>] generic_file_buffered_write+0x100/0x320
[ 2638.519743]        [<ffffffff810c202d>] __generic_file_aio_write_nolock+0x25d/0x450
[ 2638.519743]        [<ffffffff810c2be9>] generic_file_aio_write+0x69/0xd0     ==> takes i_mutex
[ 2638.519743]        [<ffffffff811bbd36>] nfs_file_write+0x136/0x230
[ 2638.519743]        [<ffffffff810fc991>] do_sync_write+0xf1/0x140
[ 2638.519743]        [<ffffffff810fd59b>] vfs_write+0xcb/0x1a0
[ 2638.519743]        [<ffffffff810fd760>] sys_write+0x50/0x90
[ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
[ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff
[ 2638.519743]
[ 2638.519743] other info that might help us debug this:
[ 2638.519743]
[ 2638.519743] 1 lock held by firefox-bin/3399:
[ 2638.519743]  #0:  (&sb->s_type->i_mutex_key#6){+.+.+.}, at: [<ffffffff810c2bd2>] generic_file_aio_write+0x52/0xd0
[ 2638.519743]
[ 2638.519743] stack backtrace:
[ 2638.519743] Pid: 3399, comm: firefox-bin Not tainted 2.6.30-rc8-mm1 #307
[ 2638.519743] Call Trace:
[ 2638.519743]  [<ffffffff8107a83e>] print_circular_bug_tail+0xde/0xe0
[ 2638.519743]  [<ffffffff8107c206>] __lock_acquire+0x1456/0x1b40
[ 2638.519743]  [<ffffffff8107b1ed>] ? __lock_acquire+0x43d/0x1b40
[ 2638.519743]  [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
[ 2638.519743]  [<ffffffff81548471>] ? do_page_fault+0x301/0x330
[ 2638.519743]  [<ffffffff81543d3b>] down_read+0x4b/0x80
[ 2638.519743]  [<ffffffff81548471>] ? do_page_fault+0x301/0x330
[ 2638.519743]  [<ffffffff81064165>] ? search_exception_tables+0x25/0x50
[ 2638.519743]  [<ffffffff81548471>] do_page_fault+0x301/0x330
[ 2638.519743]  [<ffffffff81545b55>] page_fault+0x25/0x30
[ 2638.519743]  [<ffffffff810bf919>] ? iov_iter_fault_in_readable+0x49/0x60
[ 2638.519743]  [<ffffffff810c1970>] generic_file_buffered_write+0x100/0x320
[ 2638.519743]  [<ffffffff810c202d>] __generic_file_aio_write_nolock+0x25d/0x450
[ 2638.519743]  [<ffffffff810c2bd2>] ? generic_file_aio_write+0x52/0xd0
[ 2638.519743]  [<ffffffff810c2be9>] generic_file_aio_write+0x69/0xd0
[ 2638.519743]  [<ffffffff811bbd36>] nfs_file_write+0x136/0x230
[ 2638.519743]  [<ffffffff810fc991>] do_sync_write+0xf1/0x140
[ 2638.519743]  [<ffffffff8154506f>] ? _spin_unlock_irqrestore+0x3f/0x70
[ 2638.519743]  [<ffffffff81066ae0>] ? autoremove_wake_function+0x0/0x40
[ 2638.519743]  [<ffffffff810127f0>] ? native_sched_clock+0x20/0x80
[ 2638.519743]  [<ffffffff81012859>] ? sched_clock+0x9/0x10
[ 2638.519743]  [<ffffffff81077c85>] ? lock_release_holdtime+0x35/0x1c0
[ 2638.519743]  [<ffffffff810fd59b>] vfs_write+0xcb/0x1a0
[ 2638.519743]  [<ffffffff810fd760>] sys_write+0x50/0x90
[ 2638.519743]  [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: NFS: possible circular locking  i_mutex <> mmap_sem
  2009-06-18  2:32 NFS: possible circular locking i_mutex <> mmap_sem Wu Fengguang
@ 2009-06-18  2:52 ` Andrew Morton
  2009-06-18  3:02   ` Wu Fengguang
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2009-06-18  2:52 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: linux-nfs, LKML, Ingo Molnar

On Thu, 18 Jun 2009 10:32:08 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote:

> Hi,
> 
> I got a lockdep warning on a stress testing a nfsroot desktop.
> The bits puzzled me is how come the page_fault() happens in
> generic_file_buffered_write()?

write(fd, buf, count).  If the page at *buf isn't present, we take a
fault to instantiate that page so we can copy_to_user() into it.

> Thanks,
> Fengguang
> 
> [ 2638.515865] =======================================================
> [ 2638.519743] [ INFO: possible circular locking dependency detected ]
> [ 2638.519743] 2.6.30-rc8-mm1 #307
> [ 2638.519743] -------------------------------------------------------
> [ 2638.519743] firefox-bin/3399 is trying to acquire lock:
> [ 2638.519743]  (&mm->mmap_sem){++++++}, at: [<ffffffff81548471>] do_page_fault+0x301/0x330
> [ 2638.519743]
> [ 2638.519743] but task is already holding lock:
> [ 2638.519743]  (&sb->s_type->i_mutex_key#6){+.+.+.}, at: [<ffffffff810c2bd2>] generic_file_aio_write+0x52/0xd0
> [ 2638.519743]
> [ 2638.519743] which lock already depends on the new lock.
> [ 2638.519743]
> [ 2638.519743]
> [ 2638.519743] the existing dependency chain (in reverse order) is:
> [ 2638.519743]
> [ 2638.519743] -> #1 (&sb->s_type->i_mutex_key#6){+.+.+.}:
> [ 2638.519743]        [<ffffffff8107c066>] __lock_acquire+0x12b6/0x1b40
> [ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
> [ 2638.519743]        [<ffffffff8154328e>] mutex_lock_nested+0x5e/0x390
> [ 2638.519743]        [<ffffffff811be15c>] nfs_revalidate_mapping+0xac/0x110    ==> takes i_mutex in nfs_invalidate_mapping()
> [ 2638.519743]        [<ffffffff811bba25>] nfs_file_mmap+0x55/0x80
> [ 2638.519743]        [<ffffffff810e2bb7>] mmap_region+0x427/0x600
> [ 2638.519743]        [<ffffffff810e305e>] do_mmap_pgoff+0x2ce/0x3f0
> [ 2638.519743]        [<ffffffff81010c26>] sys_mmap+0x106/0x130                 ==> takes mmap_sem
> [ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
> [ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff

This is the faulty code path.  mmap_sem is supposed to nest inside i_mutex.

> [ 2638.519743] -> #0 (&mm->mmap_sem){++++++}:
> [ 2638.519743]        [<ffffffff8107c206>] __lock_acquire+0x1456/0x1b40
> [ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
> [ 2638.519743]        [<ffffffff81543d3b>] down_read+0x4b/0x80
> [ 2638.519743]        [<ffffffff81548471>] do_page_fault+0x301/0x330            ==> takes mmap_sem
> [ 2638.519743]        [<ffffffff81545b55>] page_fault+0x25/0x30
> [ 2638.519743]        [<ffffffff810c1970>] generic_file_buffered_write+0x100/0x320
> [ 2638.519743]        [<ffffffff810c202d>] __generic_file_aio_write_nolock+0x25d/0x450
> [ 2638.519743]        [<ffffffff810c2be9>] generic_file_aio_write+0x69/0xd0     ==> takes i_mutex
> [ 2638.519743]        [<ffffffff811bbd36>] nfs_file_write+0x136/0x230
> [ 2638.519743]        [<ffffffff810fc991>] do_sync_write+0xf1/0x140
> [ 2638.519743]        [<ffffffff810fd59b>] vfs_write+0xcb/0x1a0
> [ 2638.519743]        [<ffffffff810fd760>] sys_write+0x50/0x90
> [ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
> [ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff
> [ 2638.519743]

<digs around a bit>

afaict this was added by a fix-a-compile-warning patch!

: commit e1ebfd33be068ec933f8954060a499bd22ad6f69
: Author:     Trond Myklebust <Trond.Myklebust@netapp.com>
: AuthorDate: Wed Mar 11 14:37:54 2009 -0400
: Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
: CommitDate: Wed Mar 11 14:37:54 2009 -0400
: 
:     NFS: Kill the "defined but not used" compile error on nommu machines
:     
:     Bryan Wu reports that when compiling NFS on nommu machines he gets a
:     "defined but not used" error on nfs_file_mmap().
:     
:     The easiest fix is simply to get rid of the special casing in NFS, and
:     just always call generic_file_mmap() to set up the file.
:     
:     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
: 
: diff --git a/fs/nfs/file.c b/fs/nfs/file.c
: index 404c19c..1eab9c9 100644
: --- a/fs/nfs/file.c
: +++ b/fs/nfs/file.c
: @@ -64,11 +64,7 @@ const struct file_operations nfs_file_operations = {
:  	.write		= do_sync_write,
:  	.aio_read	= nfs_file_read,
:  	.aio_write	= nfs_file_write,
: -#ifdef CONFIG_MMU
:  	.mmap		= nfs_file_mmap,
: -#else
: -	.mmap		= generic_file_mmap,
: -#endif
:  	.open		= nfs_file_open,
:  	.flush		= nfs_file_flush,
:  	.release	= nfs_file_release,
: @@ -304,11 +300,13 @@ nfs_file_mmap(struct file * file, struct vm_area_struct * vma)
:  	dprintk("NFS: mmap(%s/%s)\n",
:  		dentry->d_parent->d_name.name, dentry->d_name.name);
:  
: -	status = nfs_revalidate_mapping(inode, file->f_mapping);
: +	/* Note: generic_file_mmap() returns ENOSYS on nommu systems
: +	 *       so we call that before revalidating the mapping
: +	 */
: +	status = generic_file_mmap(file, vma);
:  	if (!status) {
:  		vma->vm_ops = &nfs_file_vm_ops;
: -		vma->vm_flags |= VM_CAN_NONLINEAR;
: -		file_accessed(file);
: +		status = nfs_revalidate_mapping(inode, file->f_mapping);
:  	}
:  	return status;
:  }

It's odd/worrying that this problem sat in linux-next for a month
and nobody noticed.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: NFS: possible circular locking  i_mutex <> mmap_sem
  2009-06-18  2:52 ` Andrew Morton
@ 2009-06-18  3:02   ` Wu Fengguang
  2009-06-18 16:48     ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Wu Fengguang @ 2009-06-18  3:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-nfs@vger.kernel.org, LKML, Ingo Molnar

On Thu, Jun 18, 2009 at 10:52:51AM +0800, Andrew Morton wrote:
> On Thu, 18 Jun 2009 10:32:08 +0800 Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > Hi,
> > 
> > I got a lockdep warning on a stress testing a nfsroot desktop.
> > The bits puzzled me is how come the page_fault() happens in
> > generic_file_buffered_write()?
> 
> write(fd, buf, count).  If the page at *buf isn't present, we take a
> fault to instantiate that page so we can copy_to_user() into it.

Thanks. I was cheated by the pagefault_disable() call, but the real
fault happened some time before at iov_iter_fault_in_readable().

> > 
> > [ 2638.515865] =======================================================
> > [ 2638.519743] [ INFO: possible circular locking dependency detected ]
> > [ 2638.519743] 2.6.30-rc8-mm1 #307
> > [ 2638.519743] -------------------------------------------------------
> > [ 2638.519743] firefox-bin/3399 is trying to acquire lock:
> > [ 2638.519743]  (&mm->mmap_sem){++++++}, at: [<ffffffff81548471>] do_page_fault+0x301/0x330
> > [ 2638.519743]
> > [ 2638.519743] but task is already holding lock:
> > [ 2638.519743]  (&sb->s_type->i_mutex_key#6){+.+.+.}, at: [<ffffffff810c2bd2>] generic_file_aio_write+0x52/0xd0
> > [ 2638.519743]
> > [ 2638.519743] which lock already depends on the new lock.
> > [ 2638.519743]
> > [ 2638.519743]
> > [ 2638.519743] the existing dependency chain (in reverse order) is:
> > [ 2638.519743]
> > [ 2638.519743] -> #1 (&sb->s_type->i_mutex_key#6){+.+.+.}:
> > [ 2638.519743]        [<ffffffff8107c066>] __lock_acquire+0x12b6/0x1b40
> > [ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
> > [ 2638.519743]        [<ffffffff8154328e>] mutex_lock_nested+0x5e/0x390
> > [ 2638.519743]        [<ffffffff811be15c>] nfs_revalidate_mapping+0xac/0x110    ==> takes i_mutex in nfs_invalidate_mapping()
> > [ 2638.519743]        [<ffffffff811bba25>] nfs_file_mmap+0x55/0x80
> > [ 2638.519743]        [<ffffffff810e2bb7>] mmap_region+0x427/0x600
> > [ 2638.519743]        [<ffffffff810e305e>] do_mmap_pgoff+0x2ce/0x3f0
> > [ 2638.519743]        [<ffffffff81010c26>] sys_mmap+0x106/0x130                 ==> takes mmap_sem
> > [ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
> > [ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff
> 
> This is the faulty code path.  mmap_sem is supposed to nest inside i_mutex.

This could be a long stand bug: the nfs_revalidate_mapping() call was
there in the very beginning of git history.

> > [ 2638.519743] -> #0 (&mm->mmap_sem){++++++}:
> > [ 2638.519743]        [<ffffffff8107c206>] __lock_acquire+0x1456/0x1b40
> > [ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
> > [ 2638.519743]        [<ffffffff81543d3b>] down_read+0x4b/0x80
> > [ 2638.519743]        [<ffffffff81548471>] do_page_fault+0x301/0x330            ==> takes mmap_sem
> > [ 2638.519743]        [<ffffffff81545b55>] page_fault+0x25/0x30
> > [ 2638.519743]        [<ffffffff810c1970>] generic_file_buffered_write+0x100/0x320
> > [ 2638.519743]        [<ffffffff810c202d>] __generic_file_aio_write_nolock+0x25d/0x450
> > [ 2638.519743]        [<ffffffff810c2be9>] generic_file_aio_write+0x69/0xd0     ==> takes i_mutex
> > [ 2638.519743]        [<ffffffff811bbd36>] nfs_file_write+0x136/0x230
> > [ 2638.519743]        [<ffffffff810fc991>] do_sync_write+0xf1/0x140
> > [ 2638.519743]        [<ffffffff810fd59b>] vfs_write+0xcb/0x1a0
> > [ 2638.519743]        [<ffffffff810fd760>] sys_write+0x50/0x90
> > [ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
> > [ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff
> > [ 2638.519743]
> 
> <digs around a bit>
> 
> afaict this was added by a fix-a-compile-warning patch!
> 
> : commit e1ebfd33be068ec933f8954060a499bd22ad6f69
> : Author:     Trond Myklebust <Trond.Myklebust@netapp.com>
> : AuthorDate: Wed Mar 11 14:37:54 2009 -0400
> : Commit:     Trond Myklebust <Trond.Myklebust@netapp.com>
> : CommitDate: Wed Mar 11 14:37:54 2009 -0400
> : 
> :     NFS: Kill the "defined but not used" compile error on nommu machines
> :     
> :     Bryan Wu reports that when compiling NFS on nommu machines he gets a
> :     "defined but not used" error on nfs_file_mmap().
> :     
> :     The easiest fix is simply to get rid of the special casing in NFS, and
> :     just always call generic_file_mmap() to set up the file.
> :     
> :     Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
> : 
> : diff --git a/fs/nfs/file.c b/fs/nfs/file.c
> : index 404c19c..1eab9c9 100644
> : --- a/fs/nfs/file.c
> : +++ b/fs/nfs/file.c
> : @@ -64,11 +64,7 @@ const struct file_operations nfs_file_operations = {
> :  	.write		= do_sync_write,
> :  	.aio_read	= nfs_file_read,
> :  	.aio_write	= nfs_file_write,
> : -#ifdef CONFIG_MMU
> :  	.mmap		= nfs_file_mmap,
> : -#else
> : -	.mmap		= generic_file_mmap,
> : -#endif
> :  	.open		= nfs_file_open,
> :  	.flush		= nfs_file_flush,
> :  	.release	= nfs_file_release,
> : @@ -304,11 +300,13 @@ nfs_file_mmap(struct file * file, struct vm_area_struct * vma)
> :  	dprintk("NFS: mmap(%s/%s)\n",
> :  		dentry->d_parent->d_name.name, dentry->d_name.name);
> :  
> : -	status = nfs_revalidate_mapping(inode, file->f_mapping);
> : +	/* Note: generic_file_mmap() returns ENOSYS on nommu systems
> : +	 *       so we call that before revalidating the mapping
> : +	 */
> : +	status = generic_file_mmap(file, vma);
> :  	if (!status) {
> :  		vma->vm_ops = &nfs_file_vm_ops;
> : -		vma->vm_flags |= VM_CAN_NONLINEAR;
> : -		file_accessed(file);
> : +		status = nfs_revalidate_mapping(inode, file->f_mapping);
> :  	}
> :  	return status;
> :  }
> 
> It's odd/worrying that this problem sat in linux-next for a month
> and nobody noticed.

Not the fault of this patch?  The above patch only moves the
nfs_revalidate_mapping() call around a bit.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: NFS: possible circular locking  i_mutex <> mmap_sem
  2009-06-18  3:02   ` Wu Fengguang
@ 2009-06-18 16:48     ` Peter Zijlstra
  2009-09-03  7:17       ` Wu Fengguang
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2009-06-18 16:48 UTC (permalink / raw)
  To: Wu Fengguang; +Cc: Andrew Morton, linux-nfs@vger.kernel.org, LKML, Ingo Molnar

On Thu, 2009-06-18 at 11:02 +0800, Wu Fengguang wrote:

> > > [ 2638.515865] =======================================================
> > > [ 2638.519743] [ INFO: possible circular locking dependency detected ]
> > > [ 2638.519743] 2.6.30-rc8-mm1 #307
> > > [ 2638.519743] -------------------------------------------------------
> > > [ 2638.519743] firefox-bin/3399 is trying to acquire lock:
> > > [ 2638.519743]  (&mm->mmap_sem){++++++}, at: [<ffffffff81548471>] do_page_fault+0x301/0x330
> > > [ 2638.519743]
> > > [ 2638.519743] but task is already holding lock:
> > > [ 2638.519743]  (&sb->s_type->i_mutex_key#6){+.+.+.}, at: [<ffffffff810c2bd2>] generic_file_aio_write+0x52/0xd0
> > > [ 2638.519743]
> > > [ 2638.519743] which lock already depends on the new lock.
> > > [ 2638.519743]
> > > [ 2638.519743]
> > > [ 2638.519743] the existing dependency chain (in reverse order) is:
> > > [ 2638.519743]
> > > [ 2638.519743] -> #1 (&sb->s_type->i_mutex_key#6){+.+.+.}:
> > > [ 2638.519743]        [<ffffffff8107c066>] __lock_acquire+0x12b6/0x1b40
> > > [ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
> > > [ 2638.519743]        [<ffffffff8154328e>] mutex_lock_nested+0x5e/0x390
> > > [ 2638.519743]        [<ffffffff811be15c>] nfs_revalidate_mapping+0xac/0x110    ==> takes i_mutex in nfs_invalidate_mapping()
> > > [ 2638.519743]        [<ffffffff811bba25>] nfs_file_mmap+0x55/0x80
> > > [ 2638.519743]        [<ffffffff810e2bb7>] mmap_region+0x427/0x600
> > > [ 2638.519743]        [<ffffffff810e305e>] do_mmap_pgoff+0x2ce/0x3f0
> > > [ 2638.519743]        [<ffffffff81010c26>] sys_mmap+0x106/0x130                 ==> takes mmap_sem
> > > [ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
> > > [ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff
> > 
> > This is the faulty code path.  mmap_sem is supposed to nest inside i_mutex.
> 
> This could be a long stand bug: the nfs_revalidate_mapping() call was
> there in the very beginning of git history.

http://programming.kicks-ass.net/kernel-patches/mmap-vs-nfs/



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: NFS: possible circular locking  i_mutex <> mmap_sem
  2009-06-18 16:48     ` Peter Zijlstra
@ 2009-09-03  7:17       ` Wu Fengguang
  0 siblings, 0 replies; 5+ messages in thread
From: Wu Fengguang @ 2009-09-03  7:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, linux-nfs@vger.kernel.org, LKML, Ingo Molnar,
	Nick Piggin, Myklebust, Trond

On Fri, Jun 19, 2009 at 12:48:08AM +0800, Peter Zijlstra wrote:
> On Thu, 2009-06-18 at 11:02 +0800, Wu Fengguang wrote:
> 
> > > > [ 2638.515865] =======================================================
> > > > [ 2638.519743] [ INFO: possible circular locking dependency detected ]
> > > > [ 2638.519743] 2.6.30-rc8-mm1 #307
> > > > [ 2638.519743] -------------------------------------------------------
> > > > [ 2638.519743] firefox-bin/3399 is trying to acquire lock:
> > > > [ 2638.519743]  (&mm->mmap_sem){++++++}, at: [<ffffffff81548471>] do_page_fault+0x301/0x330
> > > > [ 2638.519743]
> > > > [ 2638.519743] but task is already holding lock:
> > > > [ 2638.519743]  (&sb->s_type->i_mutex_key#6){+.+.+.}, at: [<ffffffff810c2bd2>] generic_file_aio_write+0x52/0xd0
> > > > [ 2638.519743]
> > > > [ 2638.519743] which lock already depends on the new lock.
> > > > [ 2638.519743]
> > > > [ 2638.519743]
> > > > [ 2638.519743] the existing dependency chain (in reverse order) is:
> > > > [ 2638.519743]
> > > > [ 2638.519743] -> #1 (&sb->s_type->i_mutex_key#6){+.+.+.}:
> > > > [ 2638.519743]        [<ffffffff8107c066>] __lock_acquire+0x12b6/0x1b40
> > > > [ 2638.519743]        [<ffffffff8107c9d1>] lock_acquire+0xe1/0x120
> > > > [ 2638.519743]        [<ffffffff8154328e>] mutex_lock_nested+0x5e/0x390
> > > > [ 2638.519743]        [<ffffffff811be15c>] nfs_revalidate_mapping+0xac/0x110    ==> takes i_mutex in nfs_invalidate_mapping()
> > > > [ 2638.519743]        [<ffffffff811bba25>] nfs_file_mmap+0x55/0x80
> > > > [ 2638.519743]        [<ffffffff810e2bb7>] mmap_region+0x427/0x600
> > > > [ 2638.519743]        [<ffffffff810e305e>] do_mmap_pgoff+0x2ce/0x3f0
> > > > [ 2638.519743]        [<ffffffff81010c26>] sys_mmap+0x106/0x130                 ==> takes mmap_sem
> > > > [ 2638.519743]        [<ffffffff8100bf42>] system_call_fastpath+0x16/0x1b
> > > > [ 2638.519743]        [<ffffffffffffffff>] 0xffffffffffffffff
> > > 
> > > This is the faulty code path.  mmap_sem is supposed to nest inside i_mutex.
> > 
> > This could be a long stand bug: the nfs_revalidate_mapping() call was
> > there in the very beginning of git history.
> 
> http://programming.kicks-ass.net/kernel-patches/mmap-vs-nfs/
> 

I wonder why such an old patch was not merged back in 2007.
Nick and Trond seems to be OK with it. So is me. But now there
are so many merge conflicts because it modified so many files..

I can do some stress tests for it.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-09-03  7:17 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-06-18  2:32 NFS: possible circular locking i_mutex <> mmap_sem Wu Fengguang
2009-06-18  2:52 ` Andrew Morton
2009-06-18  3:02   ` Wu Fengguang
2009-06-18 16:48     ` Peter Zijlstra
2009-09-03  7:17       ` Wu Fengguang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).