Re: 2.6.23-rc1-mm2

Linux NFS development
 help / color / mirror / Atom feed

* Re: 2.6.23-rc1-mm2
       [not found] <20070731230932.a9459617.akpm@linux-foundation.org>
@ 2007-08-03 11:00 ` Marc Dietrich
  2007-08-03 16:38   ` 2.6.23-rc1-mm2 Andrew Morton
  0 siblings, 1 reply; 17+ messages in thread
From: Marc Dietrich @ 2007-08-03 11:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, nfs, Neil Brown


Hi,

Am Wednesday 01 August 2007 08:09 schrieb Andrew Morton:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.
>6.23-rc1-mm2/
>
>
> - the git-block tree remains dropped due to disageement with the Vaio
>
> - git-e1000new was withdrawn by the authors
>
> - git-wireless is back.  It is still a >3MB diff, and appears to compile.
>
> - Is anyone testing the kgdb code in here?

I still get some nfs related locking bug.

I applied 

linux-2.6.23-001-fix_rpciod_down_race.dif
linux-2.6.23-003-fix_locking_regression.dif
linux-2.6.23-004-fix_stateid_regression.dif

=============================================
[ INFO: possible recursive locking detected ]
2.6.23-rc1-mm2 #3
---------------------------------------------
events/0/5 is trying to acquire lock:
 (events){--..}, at: [<c012ed90>] flush_workqueue+0x0/0x70

but task is already holding lock:
 (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0

other info that might help us debug this:
2 locks held by events/0/5:
 #0:  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
 #1:  ((nfs_automount_task).work){--..}, at: [<c012e5c4>] 
run_workqueue+0xd4/0x1e0

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ee42>] __lock_acquire+0xc22/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c012edd9>] flush_workqueue+0x49/0x70
 [<c012ee0d>] flush_scheduled_work+0xd/0x10
 [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
 [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
 [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
 [<c017b38d>] deactivate_super+0x7d/0xa0
 [<c018f94b>] mntput_no_expire+0x4b/0x80
 [<c018fd94>] expire_mount_list+0xe4/0x140
 [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
 [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
 [<c012e61b>] run_workqueue+0x12b/0x1e0
 [<c012f05b>] worker_thread+0x9b/0x100
 [<c0131c72>] kthread+0x42/0x70
 [<c0104c0f>] kernel_thread_helper+0x7/0x18
 =======================

thanks

Marc

-- 
	"The enemy uses unauthorized weapons."
		 Lord Arthur Ponsonby, "Falsehood in Wartime: Propaganda Lies of the First 
World War", 1928

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-03 11:00 ` 2.6.23-rc1-mm2 Marc Dietrich
@ 2007-08-03 16:38   ` Andrew Morton
  2007-08-03 17:03     ` 2.6.23-rc1-mm2 Trond Myklebust
  0 siblings, 1 reply; 17+ messages in thread
From: Andrew Morton @ 2007-08-03 16:38 UTC (permalink / raw)
  To: Marc Dietrich; +Cc: Neil Brown, Johannes Berg, nfs, linux-kernel, Oleg Nesterov

On Fri, 3 Aug 2007 13:00:46 +0200 Marc Dietrich <Marc.Dietrich@ap.physik.uni-giessen.de> wrote:

> 
> Hi,
> 
> Am Wednesday 01 August 2007 08:09 schrieb Andrew Morton:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.
> >6.23-rc1-mm2/
> >
> >
> > - the git-block tree remains dropped due to disageement with the Vaio
> >
> > - git-e1000new was withdrawn by the authors
> >
> > - git-wireless is back.  It is still a >3MB diff, and appears to compile.
> >
> > - Is anyone testing the kgdb code in here?
> 
> I still get some nfs related locking bug.
> 
> I applied 
> 
> linux-2.6.23-001-fix_rpciod_down_race.dif
> linux-2.6.23-003-fix_locking_regression.dif
> linux-2.6.23-004-fix_stateid_regression.dif
> 
> =============================================
> [ INFO: possible recursive locking detected ]
> 2.6.23-rc1-mm2 #3
> ---------------------------------------------
> events/0/5 is trying to acquire lock:
>  (events){--..}, at: [<c012ed90>] flush_workqueue+0x0/0x70
> 
> but task is already holding lock:
>  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
> 
> other info that might help us debug this:
> 2 locks held by events/0/5:
>  #0:  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
>  #1:  ((nfs_automount_task).work){--..}, at: [<c012e5c4>] 
> run_workqueue+0xd4/0x1e0
> 
> stack backtrace:
>  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
>  [<c0105c02>] show_trace+0x12/0x20
>  [<c0105d15>] dump_stack+0x15/0x20
>  [<c013ee42>] __lock_acquire+0xc22/0x1030
>  [<c013f2b1>] lock_acquire+0x61/0x80
>  [<c012edd9>] flush_workqueue+0x49/0x70
>  [<c012ee0d>] flush_scheduled_work+0xd/0x10
>  [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
>  [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
>  [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
>  [<c017b38d>] deactivate_super+0x7d/0xa0
>  [<c018f94b>] mntput_no_expire+0x4b/0x80
>  [<c018fd94>] expire_mount_list+0xe4/0x140
>  [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
>  [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
>  [<c012e61b>] run_workqueue+0x12b/0x1e0
>  [<c012f05b>] worker_thread+0x9b/0x100
>  [<c0131c72>] kthread+0x42/0x70
>  [<c0104c0f>] kernel_thread_helper+0x7/0x18
>  =======================
> 

There is new debugging stuff in -mm: deadlockable usage of workqueue
primitives will now trigger lockdep warnings.  See

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-flushing-deadlocks-with-lockdep.patch
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-work-related-deadlocks-with-lockdep.patch

I am suspecting that running flush_scheduled_work() from within run_workqueue()
isn't good.

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-03 16:38   ` 2.6.23-rc1-mm2 Andrew Morton
@ 2007-08-03 17:03     ` Trond Myklebust
  2007-08-03 17:21       ` 2.6.23-rc1-mm2 Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2007-08-03 17:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Neil Brown, linux-kernel, nfs, Johannes Berg, Oleg Nesterov,
	Marc Dietrich

On Fri, 2007-08-03 at 09:38 -0700, Andrew Morton wrote:
> On Fri, 3 Aug 2007 13:00:46 +0200 Marc Dietrich <Marc.Dietrich@ap.physik.uni-giessen.de> wrote:
> 
> > 
> > Hi,
> > 
> > Am Wednesday 01 August 2007 08:09 schrieb Andrew Morton:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.
> > >6.23-rc1-mm2/
> > >
> > >
> > > - the git-block tree remains dropped due to disageement with the Vaio
> > >
> > > - git-e1000new was withdrawn by the authors
> > >
> > > - git-wireless is back.  It is still a >3MB diff, and appears to compile.
> > >
> > > - Is anyone testing the kgdb code in here?
> > 
> > I still get some nfs related locking bug.
> > 
> > I applied 
> > 
> > linux-2.6.23-001-fix_rpciod_down_race.dif
> > linux-2.6.23-003-fix_locking_regression.dif
> > linux-2.6.23-004-fix_stateid_regression.dif
> > 
> > =============================================
> > [ INFO: possible recursive locking detected ]
> > 2.6.23-rc1-mm2 #3
> > ---------------------------------------------
> > events/0/5 is trying to acquire lock:
> >  (events){--..}, at: [<c012ed90>] flush_workqueue+0x0/0x70
> > 
> > but task is already holding lock:
> >  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
> > 
> > other info that might help us debug this:
> > 2 locks held by events/0/5:
> >  #0:  (events){--..}, at: [<c012e5c4>] run_workqueue+0xd4/0x1e0
> >  #1:  ((nfs_automount_task).work){--..}, at: [<c012e5c4>] 
> > run_workqueue+0xd4/0x1e0
> > 
> > stack backtrace:
> >  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
> >  [<c0105c02>] show_trace+0x12/0x20
> >  [<c0105d15>] dump_stack+0x15/0x20
> >  [<c013ee42>] __lock_acquire+0xc22/0x1030
> >  [<c013f2b1>] lock_acquire+0x61/0x80
> >  [<c012edd9>] flush_workqueue+0x49/0x70
> >  [<c012ee0d>] flush_scheduled_work+0xd/0x10
> >  [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
> >  [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
> >  [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
> >  [<c017b38d>] deactivate_super+0x7d/0xa0
> >  [<c018f94b>] mntput_no_expire+0x4b/0x80
> >  [<c018fd94>] expire_mount_list+0xe4/0x140
> >  [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
> >  [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
> >  [<c012e61b>] run_workqueue+0x12b/0x1e0
> >  [<c012f05b>] worker_thread+0x9b/0x100
> >  [<c0131c72>] kthread+0x42/0x70
> >  [<c0104c0f>] kernel_thread_helper+0x7/0x18
> >  =======================
> > 
> 
> There is new debugging stuff in -mm: deadlockable usage of workqueue
> primitives will now trigger lockdep warnings.  See
> 
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-flushing-deadlocks-with-lockdep.patch
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-work-related-deadlocks-with-lockdep.patch
> 
> I am suspecting that running flush_scheduled_work() from within run_workqueue()
> isn't good.

I'll have a look at this. I suspect that most if not all of our calls to
run_workqueue()/flush_scheduled_work() can now be replaced by more
targeted calls to cancel_work_sync() and cancel_delayed_work_sync().

Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-03 17:03     ` 2.6.23-rc1-mm2 Trond Myklebust
@ 2007-08-03 17:21       ` Oleg Nesterov
  2007-08-06  6:24         ` 2.6.23-rc1-mm2 Johannes Berg
  2007-08-07 21:08         ` 2.6.23-rc1-mm2 Trond Myklebust
  0 siblings, 2 replies; 17+ messages in thread
From: Oleg Nesterov @ 2007-08-03 17:21 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Neil Brown, linux-kernel, nfs, Andrew Morton, Johannes Berg,
	Marc Dietrich

On 08/03, Trond Myklebust wrote:
>
> On Fri, 2007-08-03 at 09:38 -0700, Andrew Morton wrote:
> > > stack backtrace:
> > >  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
> > >  [<c0105c02>] show_trace+0x12/0x20
> > >  [<c0105d15>] dump_stack+0x15/0x20
> > >  [<c013ee42>] __lock_acquire+0xc22/0x1030
> > >  [<c013f2b1>] lock_acquire+0x61/0x80
> > >  [<c012edd9>] flush_workqueue+0x49/0x70
> > >  [<c012ee0d>] flush_scheduled_work+0xd/0x10
> > >  [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
> > >  [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
> > >  [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
> > >  [<c017b38d>] deactivate_super+0x7d/0xa0
> > >  [<c018f94b>] mntput_no_expire+0x4b/0x80
> > >  [<c018fd94>] expire_mount_list+0xe4/0x140
> > >  [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
> > >  [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
> > >  [<c012e61b>] run_workqueue+0x12b/0x1e0
> > >  [<c012f05b>] worker_thread+0x9b/0x100
> > >  [<c0131c72>] kthread+0x42/0x70
> > >  [<c0104c0f>] kernel_thread_helper+0x7/0x18
> > >  =======================
> > > 
> > 
> > There is new debugging stuff in -mm: deadlockable usage of workqueue
> > primitives will now trigger lockdep warnings.  See
> > 
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-flushing-deadlocks-with-lockdep.patch
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.23-rc1/2.6.23-rc1-mm2/broken-out/workqueue-debug-work-related-deadlocks-with-lockdep.patch
> > 
> > I am suspecting that running flush_scheduled_work() from within run_workqueue()
> > isn't good.
> 
> I'll have a look at this. I suspect that most if not all of our calls to
> run_workqueue()/flush_scheduled_work() can now be replaced by more
> targeted calls to cancel_work_sync() and cancel_delayed_work_sync().

Yes, please, if possible.

To avoid a possible confusion: it is still OK if work->func() flushes
its own workqueue, so strictly speaking this trace is false positive,
but it would be very nice if we can get rid of this practice.

Oleg.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-03 17:21       ` 2.6.23-rc1-mm2 Oleg Nesterov
@ 2007-08-06  6:24         ` Johannes Berg
  2007-08-06 10:53           ` 2.6.23-rc1-mm2 Oleg Nesterov
  2007-08-06 11:05           ` 2.6.23-rc1-mm2 Marc Dietrich
  2007-08-07 21:08         ` 2.6.23-rc1-mm2 Trond Myklebust
  1 sibling, 2 replies; 17+ messages in thread
From: Johannes Berg @ 2007-08-06  6:24 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Neil Brown, linux-kernel, Trond Myklebust, nfs,
	Andrew Morton, Ingo Molnar, Marc Dietrich


[-- Attachment #1.1: Type: text/plain, Size: 2735 bytes --]

On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:

> To avoid a possible confusion: it is still OK if work->func() flushes
> its own workqueue, so strictly speaking this trace is false positive,
> but it would be very nice if we can get rid of this practice.

I just had a thought: we could get rid of this warning by using a
read-lock here. That way, flushing from within a work function (which
would be seen as read-after-read recursive lock) won't trigger this
warning. Patch below. This would, however, also get rid of any warnings
for run_workqueue recursion. Which again we may or may not want, the
code inidicates that it should be allowed up to a depth of three.

However, the question whether we should allow flush_workqueue from
within a struct work is mainly an API policy issue; it doesn't hurt to
flush a workqueue from within a work, but it is probably nearer the
intent to use targeted cancel_work_sync() or such. OTOH, one could
imagine situations where multiple different work structs are on that
workqueue belonging to the same subsystem and then the general
flush_scheduled_work() call is the only way to guarantee nothing is on
scheduled at a given point... I don't feel qualified to make the
decision for or against allowing this use of the API at this point.

Marc, do you have an easy way to trigger this warning? Could you verify
that it goes away with the patch below applied?

johannes

---
 kernel/workqueue.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- wireless-dev.orig/kernel/workqueue.c	2007-08-06 08:11:23.297846657 +0200
+++ wireless-dev/kernel/workqueue.c	2007-08-06 08:19:54.727846657 +0200
@@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor
 
 		BUG_ON(get_wq_data(work) != cwq);
 		work_clear_pending(work);
-		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 		lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
 		f(work);
 		lock_release(&lockdep_map, 1, _THIS_IP_);
@@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
 	int cpu;
 
 	might_sleep();
-	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
 	for_each_cpu_mask(cpu, *cpu_map)
 		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
@@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str
 	if (cwq->thread == NULL)
 		return;
 
-	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
+	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
 	lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);
 
 	flush_cpu_workqueue(cwq);


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 315 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/

[-- Attachment #3: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-06  6:24         ` 2.6.23-rc1-mm2 Johannes Berg
@ 2007-08-06 10:53           ` Oleg Nesterov
  2007-08-06 10:58             ` 2.6.23-rc1-mm2 Johannes Berg
  2007-08-06 11:05           ` 2.6.23-rc1-mm2 Marc Dietrich
  1 sibling, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2007-08-06 10:53 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Peter Zijlstra, Neil Brown, linux-kernel, Trond Myklebust, nfs,
	Andrew Morton, Ingo Molnar, Marc Dietrich

On 08/06, Johannes Berg wrote:
>
> On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> 
> > To avoid a possible confusion: it is still OK if work->func() flushes
> > its own workqueue, so strictly speaking this trace is false positive,
> > but it would be very nice if we can get rid of this practice.
> 
> However, the question whether we should allow flush_workqueue from
> within a struct work is mainly an API policy issue; it doesn't hurt to
> flush a workqueue from within a work,

I am not sure, but currently I hope we can forbid this eventually, so I
personally think it is good that your patch complains.

> --- wireless-dev.orig/kernel/workqueue.c	2007-08-06 08:11:23.297846657 +0200
> +++ wireless-dev/kernel/workqueue.c	2007-08-06 08:19:54.727846657 +0200
> @@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor
>  
>  		BUG_ON(get_wq_data(work) != cwq);
>  		work_clear_pending(work);
> -		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  		lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
>  		f(work);
>  		lock_release(&lockdep_map, 1, _THIS_IP_);
> @@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
>  	int cpu;
>  
>  	might_sleep();
> -	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +	lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
>  	for_each_cpu_mask(cpu, *cpu_map)
>  		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
> @@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str
>  	if (cwq->thread == NULL)
>  		return;
>  
> -	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  	lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);
>  
>  	flush_cpu_workqueue(cwq);

But this makes ->lockdep_map meaningless? We always take wq->lockdep_map
for reading, now we can't detect deadlocks.

	read_lock(A);
	lock(B);

vs
	lock(B);
	read_lock(A);

is valid, kernel/lockdep.c should not complain.

No?

Oleg.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-06 10:53           ` 2.6.23-rc1-mm2 Oleg Nesterov
@ 2007-08-06 10:58             ` Johannes Berg
  0 siblings, 0 replies; 17+ messages in thread
From: Johannes Berg @ 2007-08-06 10:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Neil Brown, linux-kernel, Trond Myklebust, nfs,
	Andrew Morton, Ingo Molnar, Marc Dietrich


[-- Attachment #1.1: Type: text/plain, Size: 577 bytes --]

On Mon, 2007-08-06 at 14:53 +0400, Oleg Nesterov wrote:

> But this makes ->lockdep_map meaningless? We always take wq->lockdep_map
> for reading, now we can't detect deadlocks.
> 
> 	read_lock(A);
> 	lock(B);
> 
> vs
> 	lock(B);
> 	read_lock(A);
> 
> is valid, kernel/lockdep.c should not complain.

Ah, hmm. Good point, I guess you can always have multiple read locks.
Then we'd have to make a new parameter or such to get rid of the
recursive locking try message. But if you want to deprecate the API
anyway then this is a good way to find it.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 315 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/

[-- Attachment #3: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-06  6:24         ` 2.6.23-rc1-mm2 Johannes Berg
  2007-08-06 10:53           ` 2.6.23-rc1-mm2 Oleg Nesterov
@ 2007-08-06 11:05           ` Marc Dietrich
  2007-08-06 11:13             ` 2.6.23-rc1-mm2 Johannes Berg
  2007-08-06 16:24             ` 2.6.23-rc1-mm2 Trond Myklebust
  1 sibling, 2 replies; 17+ messages in thread
From: Marc Dietrich @ 2007-08-06 11:05 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Peter Zijlstra, Neil Brown, Ingo Molnar, linux-kernel,
	Trond Myklebust, nfs, Andrew Morton, Oleg Nesterov


Hi,

Am Monday 06 August 2007 08:24 schrieb Johannes Berg:
> On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > To avoid a possible confusion: it is still OK if work->func() flushes
> > its own workqueue, so strictly speaking this trace is false positive,
> > but it would be very nice if we can get rid of this practice.
>
> I just had a thought: we could get rid of this warning by using a
> read-lock here. That way, flushing from within a work function (which
> would be seen as read-after-read recursive lock) won't trigger this
> warning. Patch below. This would, however, also get rid of any warnings
> for run_workqueue recursion. Which again we may or may not want, the
> code inidicates that it should be allowed up to a depth of three.
>
> However, the question whether we should allow flush_workqueue from
> within a struct work is mainly an API policy issue; it doesn't hurt to
> flush a workqueue from within a work, but it is probably nearer the
> intent to use targeted cancel_work_sync() or such. OTOH, one could
> imagine situations where multiple different work structs are on that
> workqueue belonging to the same subsystem and then the general
> flush_scheduled_work() call is the only way to guarantee nothing is on
> scheduled at a given point... I don't feel qualified to make the
> decision for or against allowing this use of the API at this point.
>
> Marc, do you have an easy way to trigger this warning? Could you verify
> that it goes away with the patch below applied?

just booting into X is enough.

I applied the patch, but now I get:

=================================
[ INFO: inconsistent lock state ]
2.6.23-rc1-mm2 #4
---------------------------------
inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
 (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
{softirq-on-W} state was registered at:
  [<c013e870>] __lock_acquire+0x650/0x1030
  [<c013f2b1>] lock_acquire+0x61/0x80
  [<c02db9ac>] _spin_lock+0x2c/0x40
  [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
  [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
  [<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc]
  [<dced3fd4>] a0 [sunrpc]
  [<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc]
  [<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc]
  [<dced65b3>] svc_register+0x93/0x160 [sunrpc]
  [<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc]
  [<dced7053>] svc_create+0x13/0x20 [sunrpc]
  [<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs]
  [<dcf48f36>] nfs_get_client+0x176/0x390 [nfs]
  [<dcf49181>] nfs4_set_client+0x31/0x190 [nfs]
  [<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs]
  [<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs]
  [<c017b444>] vfs_kern_mount+0x94/0x110
  [<c0190a62>] do_mount+0x1f2/0x7d0
  [<c01910a6>] sys_mount+0x66/0xa0
  [<c0104046>] syscall_call+0x7/0xb
  [<ffffffff>] 0xffffffff
irq event stamp: 5277830
hardirqs last  enabled at (5277830): [<c017530a>] kmem_cache_free+0x8a/0xc0
hardirqs last disabled at (5277829): [<c01752d2>] kmem_cache_free+0x52/0xc0
softirqs last  enabled at (5277798): [<c0124173>] __do_softirq+0xa3/0xc0
softirqs last disabled at (5277817): [<c01241d7>] do_softirq+0x47/0x50

other info that might help us debug this:
no locks held by swapper/0.

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ccc3>] print_usage_bug+0x153/0x160
 [<c013d8b9>] mark_lock+0x449/0x620
 [<c013e824>] __lock_acquire+0x604/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c02db9ac>] _spin_lock+0x2c/0x40
 [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
 [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
 [<dcf6bf83>] nfs_free_delegation_callback+0x13/0x20 [nfs]
 [<c012f9ea>] __rcu_process_callbacks+0x6a/0x1c0
 [<c012fb52>] rcu_process_callbacks+0x12/0x30
 [<c0124218>] tasklet_action+0x38/0x80
 [<c0124125>] __do_softirq+0x55/0xc0
 [<c01241d7>] do_softirq+0x47/0x50
 [<c0124605>] irq_exit+0x35/0x40
 [<c0112463>] smp_apic_timer_interrupt+0x43/0x80
 [<c0104a77>] apic_timer_interrupt+0x33/0x38
 [<c02690df>] cpuidle_idle_call+0x6f/0x90
 [<c01023c3>] cpu_idle+0x43/0x70
 [<c02d8c27>] rest_init+0x47/0x50
 [<c03bcb6a>] start_kernel+0x22a/0x2b0
 [<00000000>] 0x0
 =======================

also, sometimes this kernel hangs because of nfs accessing processes remain in 
D state.

Marc

> johannes
>
> ---
>  kernel/workqueue.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> --- wireless-dev.orig/kernel/workqueue.c	2007-08-06 08:11:23.297846657
> +0200 +++ wireless-dev/kernel/workqueue.c	2007-08-06 08:19:54.727846657
> +0200 @@ -272,7 +272,7 @@ static void run_workqueue(struct cpu_wor
>
>  		BUG_ON(get_wq_data(work) != cwq);
>  		work_clear_pending(work);
> -		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +		lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  		lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
>  		f(work);
>  		lock_release(&lockdep_map, 1, _THIS_IP_);
> @@ -395,7 +395,7 @@ void fastcall flush_workqueue(struct wor
>  	int cpu;
>
>  	might_sleep();
> -	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +	lock_acquire(&wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
>  	for_each_cpu_mask(cpu, *cpu_map)
>  		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
> @@ -779,7 +779,7 @@ static void cleanup_workqueue_thread(str
>  	if (cwq->thread == NULL)
>  		return;
>
> -	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
> +	lock_acquire(&cwq->wq->lockdep_map, 0, 0, 1, 2, _THIS_IP_);
>  	lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);
>
>  	flush_cpu_workqueue(cwq);

-- 
	"The enemy uses unauthorized weapons."
		 Lord Arthur Ponsonby, "Falsehood in Wartime: Propaganda Lies of the First 
World War", 1928

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-06 11:05           ` 2.6.23-rc1-mm2 Marc Dietrich
@ 2007-08-06 11:13             ` Johannes Berg
  2007-08-06 16:24             ` 2.6.23-rc1-mm2 Trond Myklebust
  1 sibling, 0 replies; 17+ messages in thread
From: Johannes Berg @ 2007-08-06 11:13 UTC (permalink / raw)
  To: Marc Dietrich
  Cc: Peter Zijlstra, Neil Brown, Ingo Molnar, linux-kernel,
	Trond Myklebust, nfs, Andrew Morton, Oleg Nesterov


[-- Attachment #1.1: Type: text/plain, Size: 633 bytes --]

On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:

> I applied the patch, but now I get:
> 
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc1-mm2 #4
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60

Interesting, but doesn't seem related to this at all. As Oleg just
pointed out this basically disabled checking for workqueue stuff so this
should be looked into by somebody familiar with the NFS code.

johannes

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

[-- Attachment #2: Type: text/plain, Size: 315 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/

[-- Attachment #3: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-06 11:05           ` 2.6.23-rc1-mm2 Marc Dietrich
  2007-08-06 11:13             ` 2.6.23-rc1-mm2 Johannes Berg
@ 2007-08-06 16:24             ` Trond Myklebust
  2007-08-07 12:09               ` [NFS] 2.6.23-rc1-mm2 Marc Dietrich
  1 sibling, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2007-08-06 16:24 UTC (permalink / raw)
  To: Marc Dietrich
  Cc: Andrew Morton, Neil Brown, Ingo Molnar, linux-kernel, nfs,
	Johannes Berg, Oleg Nesterov, Peter Zijlstra

[-- Attachment #1: Type: text/plain, Size: 4592 bytes --]

On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:
> Hi,
> 
> Am Monday 06 August 2007 08:24 schrieb Johannes Berg:
> > On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > > To avoid a possible confusion: it is still OK if work->func() flushes
> > > its own workqueue, so strictly speaking this trace is false positive,
> > > but it would be very nice if we can get rid of this practice.
> >
> > I just had a thought: we could get rid of this warning by using a
> > read-lock here. That way, flushing from within a work function (which
> > would be seen as read-after-read recursive lock) won't trigger this
> > warning. Patch below. This would, however, also get rid of any warnings
> > for run_workqueue recursion. Which again we may or may not want, the
> > code inidicates that it should be allowed up to a depth of three.
> >
> > However, the question whether we should allow flush_workqueue from
> > within a struct work is mainly an API policy issue; it doesn't hurt to
> > flush a workqueue from within a work, but it is probably nearer the
> > intent to use targeted cancel_work_sync() or such. OTOH, one could
> > imagine situations where multiple different work structs are on that
> > workqueue belonging to the same subsystem and then the general
> > flush_scheduled_work() call is the only way to guarantee nothing is on
> > scheduled at a given point... I don't feel qualified to make the
> > decision for or against allowing this use of the API at this point.
> >
> > Marc, do you have an easy way to trigger this warning? Could you verify
> > that it goes away with the patch below applied?
> 
> just booting into X is enough.
> 
> I applied the patch, but now I get:
> 
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.23-rc1-mm2 #4
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>  (rpc_credcache_lock){-+..}, at: [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
> {softirq-on-W} state was registered at:
>   [<c013e870>] __lock_acquire+0x650/0x1030
>   [<c013f2b1>] lock_acquire+0x61/0x80
>   [<c02db9ac>] _spin_lock+0x2c/0x40
>   [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
>   [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
>   [<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc]
>   [<dced3fd4>] a0 [sunrpc]
>   [<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc]
>   [<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc]
>   [<dced65b3>] svc_register+0x93/0x160 [sunrpc]
>   [<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc]
>   [<dced7053>] svc_create+0x13/0x20 [sunrpc]
>   [<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs]
>   [<dcf48f36>] nfs_get_client+0x176/0x390 [nfs]
>   [<dcf49181>] nfs4_set_client+0x31/0x190 [nfs]
>   [<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs]
>   [<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs]
>   [<c017b444>] vfs_kern_mount+0x94/0x110
>   [<c0190a62>] do_mount+0x1f2/0x7d0
>   [<c01910a6>] sys_mount+0x66/0xa0
>   [<c0104046>] syscall_call+0x7/0xb
>   [<ffffffff>] 0xffffffff
> irq event stamp: 5277830
> hardirqs last  enabled at (5277830): [<c017530a>] kmem_cache_free+0x8a/0xc0
> hardirqs last disabled at (5277829): [<c01752d2>] kmem_cache_free+0x52/0xc0
> softirqs last  enabled at (5277798): [<c0124173>] __do_softirq+0xa3/0xc0
> softirqs last disabled at (5277817): [<c01241d7>] do_softirq+0x47/0x50
> 
> other info that might help us debug this:
> no locks held by swapper/0.
> 
> stack backtrace:
>  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
>  [<c0105c02>] show_trace+0x12/0x20
>  [<c0105d15>] dump_stack+0x15/0x20
>  [<c013ccc3>] print_usage_bug+0x153/0x160
>  [<c013d8b9>] mark_lock+0x449/0x620
>  [<c013e824>] __lock_acquire+0x604/0x1030
>  [<c013f2b1>] lock_acquire+0x61/0x80
>  [<c02db9ac>] _spin_lock+0x2c/0x40
>  [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
>  [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
>  [<dcf6bf83>] nfs_free_delegation_callback+0x13/0x20 [nfs]
>  [<c012f9ea>] __rcu_process_callbacks+0x6a/0x1c0
>  [<c012fb52>] rcu_process_callbacks+0x12/0x30
>  [<c0124218>] tasklet_action+0x38/0x80
>  [<c0124125>] __do_softirq+0x55/0xc0
>  [<c01241d7>] do_softirq+0x47/0x50
>  [<c0124605>] irq_exit+0x35/0x40
>  [<c0112463>] smp_apic_timer_interrupt+0x43/0x80
>  [<c0104a77>] apic_timer_interrupt+0x33/0x38
>  [<c02690df>] cpuidle_idle_call+0x6f/0x90
>  [<c01023c3>] cpu_idle+0x43/0x70
>  [<c02d8c27>] rest_init+0x47/0x50
>  [<c03bcb6a>] start_kernel+0x22a/0x2b0
>  [<00000000>] 0x0
>  =======================

That is a different matter. I assume this patch should suffice to fix
the above problem.

Trond


[-- Attachment #2: linux-2.6.23-005-dont_call_rpc_putcred_from_rcu.dif --]
[-- Type: message/rfc822, Size: 2128 bytes --]

From: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: No Subject
Date: Mon, 6 Aug 2007 12:18:34 -0400
Message-ID: <1186417445.6616.24.camel@localhost>

Doing so would require us to introduce bh-safe locks into put_rpccred().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/delegation.c |   21 +++++++++++++++------
 1 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index 20ac403..c55a761 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -20,10 +20,8 @@
 #include "delegation.h"
 #include "internal.h"
 
-static void nfs_free_delegation(struct nfs_delegation *delegation)
+static void nfs_do_free_delegation(struct nfs_delegation *delegation)
 {
-	if (delegation->cred)
-		put_rpccred(delegation->cred);
 	kfree(delegation);
 }
 
@@ -31,7 +29,18 @@ static void nfs_free_delegation_callback(struct rcu_head *head)
 {
 	struct nfs_delegation *delegation = container_of(head, struct nfs_delegation, rcu);
 
-	nfs_free_delegation(delegation);
+	nfs_do_free_delegation(delegation);
+}
+
+static void nfs_free_delegation(struct nfs_delegation *delegation)
+{
+	struct rpc_cred *cred;
+
+	cred = rcu_dereference(delegation->cred);
+	rcu_assign_pointer(delegation->cred, NULL);
+	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
+	if (cred)
+		put_rpccred(cred);
 }
 
 static int nfs_delegation_claim_locks(struct nfs_open_context *ctx, struct nfs4_state *state)
@@ -166,7 +175,7 @@ static int nfs_do_return_delegation(struct inode *inode, struct nfs_delegation *
 	int res = 0;
 
 	res = nfs4_proc_delegreturn(inode, delegation->cred, &delegation->stateid);
-	call_rcu(&delegation->rcu, nfs_free_delegation_callback);
+	nfs_free_delegation(delegation);
 	return res;
 }
 
@@ -448,7 +457,7 @@ restart:
 		spin_unlock(&clp->cl_lock);
 		rcu_read_unlock();
 		if (delegation != NULL)
-			call_rcu(&delegation->rcu, nfs_free_delegation_callback);
+			nfs_free_delegation(delegation);
 		goto restart;
 	}
 	rcu_read_unlock();

[-- Attachment #3: Type: text/plain, Size: 315 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/

[-- Attachment #4: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [NFS] 2.6.23-rc1-mm2
  2007-08-06 16:24             ` 2.6.23-rc1-mm2 Trond Myklebust
@ 2007-08-07 12:09               ` Marc Dietrich
  0 siblings, 0 replies; 17+ messages in thread
From: Marc Dietrich @ 2007-08-07 12:09 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Johannes Berg, Oleg Nesterov, Andrew Morton, Neil Brown, nfs,
	linux-kernel

Am Monday 06 August 2007 18:24 schrieb Trond Myklebust:
> On Mon, 2007-08-06 at 13:05 +0200, Marc Dietrich wrote:
> > Hi,
> >
> > > (...)
> >
> > just booting into X is enough.
> >
> > I applied the patch, but now I get:
> >
> > =================================
> > [ INFO: inconsistent lock state ]
> > 2.6.23-rc1-mm2 #4
> > ---------------------------------
> > inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> > swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
> >  (rpc_credcache_lock){-+..}, at: [<c01dc487>]
> > _atomic_dec_and_lock+0x17/0x60 {softirq-on-W} state was registered at:
> >   [<c013e870>] __lock_acquire+0x650/0x1030
> >   [<c013f2b1>] lock_acquire+0x61/0x80
> >   [<c02db9ac>] _spin_lock+0x2c/0x40
> >   [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
> >   [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
> >   [<dced56c1>] rpcauth_unbindcred+0x21/0x60 [sunrpc]
> >   [<dced3fd4>] a0 [sunrpc]
> >   [<dcecefe0>] rpc_call_sync+0x30/0x40 [sunrpc]
> >   [<dcedc73b>] rpcb_register+0xdb/0x180 [sunrpc]
> >   [<dced65b3>] svc_register+0x93/0x160 [sunrpc]
> >   [<dced6ebe>] __svc_create+0x1ee/0x220 [sunrpc]
> >   [<dced7053>] svc_create+0x13/0x20 [sunrpc]
> >   [<dcf6d722>] nfs_callback_up+0x82/0x120 [nfs]
> >   [<dcf48f36>] nfs_get_client+0x176/0x390 [nfs]
> >   [<dcf49181>] nfs4_set_client+0x31/0x190 [nfs]
> >   [<dcf49983>] nfs4_create_server+0x63/0x3b0 [nfs]
> >   [<dcf52426>] nfs4_get_sb+0x346/0x5b0 [nfs]
> >   [<c017b444>] vfs_kern_mount+0x94/0x110
> >   [<c0190a62>] do_mount+0x1f2/0x7d0
> >   [<c01910a6>] sys_mount+0x66/0xa0
> >   [<c0104046>] syscall_call+0x7/0xb
> >   [<ffffffff>] 0xffffffff
> > irq event stamp: 5277830
> > hardirqs last  enabled at (5277830): [<c017530a>]
> > kmem_cache_free+0x8a/0xc0 hardirqs last disabled at (5277829):
> > [<c01752d2>] kmem_cache_free+0x52/0xc0 softirqs last  enabled at
> > (5277798): [<c0124173>] __do_softirq+0xa3/0xc0 softirqs last disabled at
> > (5277817): [<c01241d7>] do_softirq+0x47/0x50
> >
> > other info that might help us debug this:
> > no locks held by swapper/0.
> >
> > stack backtrace:
> >  [<c0104fda>] show_trace_log_lvl+0x1a/0x30
> >  [<c0105c02>] show_trace+0x12/0x20
> >  [<c0105d15>] dump_stack+0x15/0x20
> >  [<c013ccc3>] print_usage_bug+0x153/0x160
> >  [<c013d8b9>] mark_lock+0x449/0x620
> >  [<c013e824>] __lock_acquire+0x604/0x1030
> >  [<c013f2b1>] lock_acquire+0x61/0x80
> >  [<c02db9ac>] _spin_lock+0x2c/0x40
> >  [<c01dc487>] _atomic_dec_and_lock+0x17/0x60
> >  [<dced55fd>] put_rpccred+0x5d/0x100 [sunrpc]
> >  [<dcf6bf83>] nfs_free_delegation_callback+0x13/0x20 [nfs]
> >  [<c012f9ea>] __rcu_process_callbacks+0x6a/0x1c0
> >  [<c012fb52>] rcu_process_callbacks+0x12/0x30
> >  [<c0124218>] tasklet_action+0x38/0x80
> >  [<c0124125>] __do_softirq+0x55/0xc0
> >  [<c01241d7>] do_softirq+0x47/0x50
> >  [<c0124605>] irq_exit+0x35/0x40
> >  [<c0112463>] smp_apic_timer_interrupt+0x43/0x80
> >  [<c0104a77>] apic_timer_interrupt+0x33/0x38
> >  [<c02690df>] cpuidle_idle_call+0x6f/0x90
> >  [<c01023c3>] cpu_idle+0x43/0x70
> >  [<c02d8c27>] rest_init+0x47/0x50
> >  [<c03bcb6a>] start_kernel+0x22a/0x2b0
> >  [<00000000>] 0x0
> >  =======================
>
> That is a different matter. I assume this patch should suffice to fix
> the above problem.
>
> Trond

yes - it does.

thanks.

Marc


-- 
	"Our cause has a sacred nature."
		 Lord Arthur Ponsonby, "Falsehood in Wartime: Propaganda Lies of the First 
World War", 1928

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-03 17:21       ` 2.6.23-rc1-mm2 Oleg Nesterov
  2007-08-06  6:24         ` 2.6.23-rc1-mm2 Johannes Berg
@ 2007-08-07 21:08         ` Trond Myklebust
  2007-08-07 21:37           ` 2.6.23-rc1-mm2 Oleg Nesterov
  1 sibling, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2007-08-07 21:08 UTC (permalink / raw)
  To: Oleg Nesterov, Neil Brown
  Cc: Neil Brown, linux-kernel, nfs, Andrew Morton, Johannes Berg,
	Marc Dietrich

[-- Attachment #1: Type: text/plain, Size: 788 bytes --]

On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> On 08/03, Trond Myklebust wrote:
> > I'll have a look at this. I suspect that most if not all of our calls to
> > run_workqueue()/flush_scheduled_work() can now be replaced by more
> > targeted calls to cancel_work_sync() and cancel_delayed_work_sync().
> 
> Yes, please, if possible.

All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
for the flush_workqueue()/flush_scheduled_work() calls was to ensure
that the cancel_work()/cancel_delayed_work() calls preceding them have
completed. Nevertheless I've split the conversion into two patches,
since one touches only the NFS code, whereas the other touches the
SUNRPC client and server code.

The two patches have been tested, and appear to work...

Trond



[-- Attachment #2: linux-2.6.23-007-nfs_dont_call_flush_workqueue.dif --]
[-- Type: message/rfc822, Size: 2421 bytes --]

From: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: No Subject
Date: Tue, 7 Aug 2007 15:28:33 -0400
Message-ID: <1186520901.6625.10.camel@heimdal.trondhjem.org>

This will avoid deadlocks of the form:

stack backtrace:
 [<c0104fda>] show_trace_log_lvl+0x1a/0x30
 [<c0105c02>] show_trace+0x12/0x20
 [<c0105d15>] dump_stack+0x15/0x20
 [<c013ee42>] __lock_acquire+0xc22/0x1030
 [<c013f2b1>] lock_acquire+0x61/0x80
 [<c012edd9>] flush_workqueue+0x49/0x70
 [<c012ee0d>] flush_scheduled_work+0xd/0x10
 [<dcf55c0c>] nfs_release_automount_timer+0x2c/0x30 [nfs]
 [<dcf45d8e>] nfs_free_server+0x9e/0xd0 [nfs]
 [<dcf4e626>] nfs_kill_super+0x16/0x20 [nfs]
 [<c017b38d>] deactivate_super+0x7d/0xa0
 [<c018f94b>] mntput_no_expire+0x4b/0x80
 [<c018fd94>] expire_mount_list+0xe4/0x140
 [<c0191219>] mark_mounts_for_expiry+0x99/0xb0
 [<dcf55d1d>] nfs_expire_automounts+0xd/0x40 [nfs]
 [<c012e61b>] run_workqueue+0x12b/0x1e0
 [<c012f05b>] worker_thread+0x9b/0x100
 [<c0131c72>] kthread+0x42/0x70
 [<c0104c0f>] kernel_thread_helper+0x7/0x18
 =======================

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 fs/nfs/namespace.c  |    6 ++----
 fs/nfs/nfs4renewd.c |    5 ++---
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index 7f86e65..aea76d0 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -175,10 +175,8 @@ static void nfs_expire_automounts(struct work_struct *work)
 
 void nfs_release_automount_timer(void)
 {
-	if (list_empty(&nfs_automount_list)) {
-		cancel_delayed_work(&nfs_automount_task);
-		flush_scheduled_work();
-	}
+	if (list_empty(&nfs_automount_list))
+		cancel_delayed_work_sync(&nfs_automount_task);
 }
 
 /*
diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c
index 0505ca1..3ea352d 100644
--- a/fs/nfs/nfs4renewd.c
+++ b/fs/nfs/nfs4renewd.c
@@ -127,16 +127,15 @@ nfs4_schedule_state_renewal(struct nfs_client *clp)
 void
 nfs4_renewd_prepare_shutdown(struct nfs_server *server)
 {
-	flush_scheduled_work();
+	cancel_delayed_work(&server->nfs_client->cl_renewd);
 }
 
 void
 nfs4_kill_renewd(struct nfs_client *clp)
 {
 	down_read(&clp->cl_sem);
-	cancel_delayed_work(&clp->cl_renewd);
+	cancel_delayed_work_sync(&clp->cl_renewd);
 	up_read(&clp->cl_sem);
-	flush_scheduled_work();
 }
 
 /*

[-- Attachment #3: linux-2.6.23-008-sunrpc_dont_call_flush_workqueue.dif --]
[-- Type: message/rfc822, Size: 1399 bytes --]

From: Trond Myklebust <Trond.Myklebust@netapp.com>
Subject: No Subject
Date: Tue, 7 Aug 2007 15:33:01 -0400
Message-ID: <1186520901.6625.11.camel@heimdal.trondhjem.org>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
---

 net/sunrpc/cache.c    |    3 +--
 net/sunrpc/rpc_pipe.c |    3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 01c3c41..ebe344f 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -371,8 +371,7 @@ int cache_unregister(struct cache_detail *cd)
 	}
 	if (list_empty(&cache_list)) {
 		/* module must be being unloaded so its safe to kill the worker */
-		cancel_delayed_work(&cache_cleaner);
-		flush_scheduled_work();
+		cancel_delayed_work_sync(&cache_cleaner);
 	}
 	return 0;
 }
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index 650af06..669e12a 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -132,8 +132,7 @@ rpc_close_pipes(struct inode *inode)
 		rpci->nwriters = 0;
 		if (ops->release_pipe)
 			ops->release_pipe(inode);
-		cancel_delayed_work(&rpci->queue_timeout);
-		flush_workqueue(rpciod_workqueue);
+		cancel_delayed_work_sync(&rpci->queue_timeout);
 	}
 	rpc_inode_setowner(inode, NULL);
 	mutex_unlock(&inode->i_mutex);

[-- Attachment #4: Type: text/plain, Size: 315 bytes --]

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/

[-- Attachment #5: Type: text/plain, Size: 140 bytes --]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-07 21:08         ` 2.6.23-rc1-mm2 Trond Myklebust
@ 2007-08-07 21:37           ` Oleg Nesterov
  2007-08-07 22:05             ` 2.6.23-rc1-mm2 Trond Myklebust
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2007-08-07 21:37 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Neil Brown, linux-kernel, nfs, Andrew Morton, Johannes Berg,
	Marc Dietrich

On 08/07, Trond Myklebust wrote:
>
> On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > On 08/03, Trond Myklebust wrote:
> > > I'll have a look at this. I suspect that most if not all of our calls to
> > > run_workqueue()/flush_scheduled_work() can now be replaced by more
> > > targeted calls to cancel_work_sync() and cancel_delayed_work_sync().
> > 
> > Yes, please, if possible.
> 
> All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
> for the flush_workqueue()/flush_scheduled_work() calls was to ensure
> that the cancel_work()/cancel_delayed_work() calls preceding them have
> completed. Nevertheless I've split the conversion into two patches,
> since one touches only the NFS code, whereas the other touches the
> SUNRPC client and server code.
> 
> The two patches have been tested, and appear to work...

Great!

>  void
>  nfs4_kill_renewd(struct nfs_client *clp)
>  {
>  	down_read(&clp->cl_sem);
> -	cancel_delayed_work(&clp->cl_renewd);
> +	cancel_delayed_work_sync(&clp->cl_renewd);
>  	up_read(&clp->cl_sem);
> -	flush_scheduled_work();
>  }

this looks unsafe to me, the window is very small, but afaics this can
deadlock if called when nfs4_renew_state() has already started, but didn't
take ->cl_sem yet.

Can't we avoid taking clp->cl_sem here?

Btw, unless I missed something, the code without this patch looks incorrect
too: cancel_delayed_work() can fail if the timer expired, but the ->cl_renewd
didn't run yet. In that case nfs4_renew_state() can run and re-schedule itself
after flush_scheduled_work() returns.

Oleg.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-07 21:37           ` 2.6.23-rc1-mm2 Oleg Nesterov
@ 2007-08-07 22:05             ` Trond Myklebust
  2007-08-07 22:20               ` 2.6.23-rc1-mm2 Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2007-08-07 22:05 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Neil Brown, linux-kernel, nfs, Andrew Morton, Johannes Berg,
	Marc Dietrich

On Wed, 2007-08-08 at 01:37 +0400, Oleg Nesterov wrote:
> On 08/07, Trond Myklebust wrote:
> >
> > On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > > On 08/03, Trond Myklebust wrote:
> > > > I'll have a look at this. I suspect that most if not all of our calls to
> > > > run_workqueue()/flush_scheduled_work() can now be replaced by more
> > > > targeted calls to cancel_work_sync() and cancel_delayed_work_sync().
> > > 
> > > Yes, please, if possible.
> > 
> > All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
> > for the flush_workqueue()/flush_scheduled_work() calls was to ensure
> > that the cancel_work()/cancel_delayed_work() calls preceding them have
> > completed. Nevertheless I've split the conversion into two patches,
> > since one touches only the NFS code, whereas the other touches the
> > SUNRPC client and server code.
> > 
> > The two patches have been tested, and appear to work...
> 
> Great!
> 
> >  void
> >  nfs4_kill_renewd(struct nfs_client *clp)
> >  {
> >  	down_read(&clp->cl_sem);
> > -	cancel_delayed_work(&clp->cl_renewd);
> > +	cancel_delayed_work_sync(&clp->cl_renewd);
> >  	up_read(&clp->cl_sem);
> > -	flush_scheduled_work();
> >  }
> 
> this looks unsafe to me, the window is very small, but afaics this can
> deadlock if called when nfs4_renew_state() has already started, but didn't
> take ->cl_sem yet.

Not really. We have removed the nfs_client from the public lists, and we
are guaranteed that there are no more active superblocks attached to it
so nothing can call the reclaimer routine (which is the only routine
that takes a write lock on clp->cl_sem).

> Can't we avoid taking clp->cl_sem here?

Yes, I believe that we can, for the same reasons as above: the race with
the reclaimer is impossible, hence the read lock on cl_sem is redundant.

> Btw, unless I missed something, the code without this patch looks incorrect
> too: cancel_delayed_work() can fail if the timer expired, but the ->cl_renewd
> didn't run yet. In that case nfs4_renew_state() can run and re-schedule itself
> after flush_scheduled_work() returns.

No, that should not be possible. Again, see above: there are no active
superblocks, so clp->cl_superblocks is empty.

Cheers
  Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-07 22:05             ` 2.6.23-rc1-mm2 Trond Myklebust
@ 2007-08-07 22:20               ` Oleg Nesterov
  2007-08-07 23:08                 ` 2.6.23-rc1-mm2 Trond Myklebust
  0 siblings, 1 reply; 17+ messages in thread
From: Oleg Nesterov @ 2007-08-07 22:20 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Neil Brown, linux-kernel, nfs, Andrew Morton, Johannes Berg,
	Marc Dietrich

On 08/07, Trond Myklebust wrote:
>
> On Wed, 2007-08-08 at 01:37 +0400, Oleg Nesterov wrote:
> > On 08/07, Trond Myklebust wrote:
> > >
> > > On Fri, 2007-08-03 at 21:21 +0400, Oleg Nesterov wrote:
> > > > On 08/03, Trond Myklebust wrote:
> > > > > I'll have a look at this. I suspect that most if not all of our calls to
> > > > > run_workqueue()/flush_scheduled_work() can now be replaced by more
> > > > > targeted calls to cancel_work_sync() and cancel_delayed_work_sync().
> > > > 
> > > > Yes, please, if possible.
> > > 
> > > All the NFS and SUNRPC cases appear to be trivial. IOW: the only reason
> > > for the flush_workqueue()/flush_scheduled_work() calls was to ensure
> > > that the cancel_work()/cancel_delayed_work() calls preceding them have
> > > completed. Nevertheless I've split the conversion into two patches,
> > > since one touches only the NFS code, whereas the other touches the
> > > SUNRPC client and server code.
> > > 
> > > The two patches have been tested, and appear to work...
> > 
> > Great!
> > 
> > >  void
> > >  nfs4_kill_renewd(struct nfs_client *clp)
> > >  {
> > >  	down_read(&clp->cl_sem);
> > > -	cancel_delayed_work(&clp->cl_renewd);
> > > +	cancel_delayed_work_sync(&clp->cl_renewd);
> > >  	up_read(&clp->cl_sem);
> > > -	flush_scheduled_work();
> > >  }
> > 
> > this looks unsafe to me, the window is very small, but afaics this can
> > deadlock if called when nfs4_renew_state() has already started, but didn't
> > take ->cl_sem yet.
> 
> Not really. We have removed the nfs_client from the public lists, and we
> are guaranteed that there are no more active superblocks attached to it
> so nothing can call the reclaimer routine (which is the only routine
> that takes a write lock on clp->cl_sem).

Thanks for your explanation. Not that I was able to understand, nfs is a
black magic to me :)

But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled
at the time when nfs4_kill_renewd(), we can deadlock, no? Because
nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
holds this sem, and waits for nfs4_renew_state() completion.

> > Btw, unless I missed something, the code without this patch looks incorrect
> > too: cancel_delayed_work() can fail if the timer expired, but the ->cl_renewd
> > didn't run yet. In that case nfs4_renew_state() can run and re-schedule itself
> > after flush_scheduled_work() returns.
> 
> No, that should not be possible. Again, see above: there are no active
> superblocks, so clp->cl_superblocks is empty.

Yes, thanks. I missed "goto out" in nfs4_renew_state().

Oleg.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-07 22:20               ` 2.6.23-rc1-mm2 Oleg Nesterov
@ 2007-08-07 23:08                 ` Trond Myklebust
  2007-08-07 23:14                   ` 2.6.23-rc1-mm2 Oleg Nesterov
  0 siblings, 1 reply; 17+ messages in thread
From: Trond Myklebust @ 2007-08-07 23:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Neil Brown, linux-kernel, nfs, Andrew Morton, Johannes Berg,
	Marc Dietrich

On Wed, 2007-08-08 at 02:20 +0400, Oleg Nesterov wrote:

> But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
> clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled
> at the time when nfs4_kill_renewd(), we can deadlock, no? Because
> nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
> holds this sem, and waits for nfs4_renew_state() completion.

They both take read locks, which means that they can take them
simultaneously. AFAICS, the deadlock can only occur if something manages
to insert a request for a write lock after nfs4_kill_renewd() takes its
read lock, but before nfs4_renew_state() takes its read lock:

1) nfs4_kill_renewd()		2) nfs4_renew_state()		3) somebody else
-------------------             ------------------		-------------
read lock
wait on (2) to complete
								write lock <waits on (1)>
				read lock <waits on (3),
					because rw_semaphores
					don't allow a read lock
					request to jump a write
					lock request>

however as I explained earlier, the only process that can take a write
lock is the reclaimer daemon, but we _know_ that cannot be running (for
one thing, the reference count on nfs_client is zero, for the other,
there are no superblocks).

Cheers
  Trond


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: 2.6.23-rc1-mm2
  2007-08-07 23:08                 ` 2.6.23-rc1-mm2 Trond Myklebust
@ 2007-08-07 23:14                   ` Oleg Nesterov
  0 siblings, 0 replies; 17+ messages in thread
From: Oleg Nesterov @ 2007-08-07 23:14 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Neil Brown, linux-kernel, nfs, Andrew Morton, Johannes Berg,
	Marc Dietrich

On 08/07, Trond Myklebust wrote:
>
> On Wed, 2007-08-08 at 02:20 +0400, Oleg Nesterov wrote:
> 
> > But. nfs4_renew_state() checks list_empty(&clp->cl_superblocks) under
> > clp->cl_sem? So, if it is possible that clp->cl_renewd was scheduled
> > at the time when nfs4_kill_renewd(), we can deadlock, no? Because
> > nfs4_renew_state() needs clp->cl_sem to complete, but nfs4_kill_renewd()
> > holds this sem, and waits for nfs4_renew_state() completion.
> 
> They both take read locks,

Aaaaaaaaaah. Please ignore me, thanks!

> which means that they can take them
> simultaneously. AFAICS, the deadlock can only occur if something manages
> to insert a request for a write lock after nfs4_kill_renewd() takes its
> read lock, but before nfs4_renew_state() takes its read lock:
> 
> 1) nfs4_kill_renewd()		2) nfs4_renew_state()		3) somebody else
> -------------------             ------------------		-------------
> read lock
> wait on (2) to complete
> 								write lock <waits on (1)>
> 				read lock <waits on (3),
> 					because rw_semaphores
> 					don't allow a read lock
> 					request to jump a write
> 					lock request>
> 
> however as I explained earlier, the only process that can take a write
> lock is the reclaimer daemon, but we _know_ that cannot be running (for
> one thing, the reference count on nfs_client is zero, for the other,
> there are no superblocks).

Oleg.


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2007-08-07 23:14 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20070731230932.a9459617.akpm@linux-foundation.org>
2007-08-03 11:00 ` 2.6.23-rc1-mm2 Marc Dietrich
2007-08-03 16:38   ` 2.6.23-rc1-mm2 Andrew Morton
2007-08-03 17:03     ` 2.6.23-rc1-mm2 Trond Myklebust
2007-08-03 17:21       ` 2.6.23-rc1-mm2 Oleg Nesterov
2007-08-06  6:24         ` 2.6.23-rc1-mm2 Johannes Berg
2007-08-06 10:53           ` 2.6.23-rc1-mm2 Oleg Nesterov
2007-08-06 10:58             ` 2.6.23-rc1-mm2 Johannes Berg
2007-08-06 11:05           ` 2.6.23-rc1-mm2 Marc Dietrich
2007-08-06 11:13             ` 2.6.23-rc1-mm2 Johannes Berg
2007-08-06 16:24             ` 2.6.23-rc1-mm2 Trond Myklebust
2007-08-07 12:09               ` [NFS] 2.6.23-rc1-mm2 Marc Dietrich
2007-08-07 21:08         ` 2.6.23-rc1-mm2 Trond Myklebust
2007-08-07 21:37           ` 2.6.23-rc1-mm2 Oleg Nesterov
2007-08-07 22:05             ` 2.6.23-rc1-mm2 Trond Myklebust
2007-08-07 22:20               ` 2.6.23-rc1-mm2 Oleg Nesterov
2007-08-07 23:08                 ` 2.6.23-rc1-mm2 Trond Myklebust
2007-08-07 23:14                   ` 2.6.23-rc1-mm2 Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox