netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH][RFC] use completions instead of sleep_on for rpciod
@ 2004-02-07 14:44 Christoph Hellwig
  2004-02-08 20:43 ` David S. Miller
  2004-02-09 10:28 ` Olaf Kirch
  0 siblings, 2 replies; 9+ messages in thread
From: Christoph Hellwig @ 2004-02-07 14:44 UTC (permalink / raw)
  To: netdev

[let's hope some sunrpc folks are on this list, too, the nfs list
 mentioned in MAINTAINERS refuses postings from non-subsribers..]

The rpciod shutdown code gives ugly sleep_on without BKL warnings in
-mm.  And it looks indeed somewhat racy.

The easy fix would be to simply use a completion as in the patch below,
but that removes all the signal fuzzing semantics the current code has.
I don't really understand why we want to cancel the operation by
signals, but I think it'd be better to leave that to people familar with
the code anyway..


--- 1.27/net/sunrpc/sched.c	Fri Jun 20 22:16:26 2003
+++ edited/net/sunrpc/sched.c	Wed Feb  4 06:29:02 2004
@@ -71,7 +71,7 @@
  * rpciod-related stuff
  */
 static DECLARE_WAIT_QUEUE_HEAD(rpciod_idle);
-static DECLARE_WAIT_QUEUE_HEAD(rpciod_killer);
+static DECLARE_COMPLETION(rpciod_killer);
 static DECLARE_MUTEX(rpciod_sema);
 static unsigned int		rpciod_users;
 static pid_t			rpciod_pid;
@@ -950,7 +950,6 @@
 static int
 rpciod(void *ptr)
 {
-	wait_queue_head_t *assassin = (wait_queue_head_t*) ptr;
 	int		rounds = 0;
 
 	lock_kernel();
@@ -992,11 +991,11 @@
 		rpciod_killall();
 	}
 
-	rpciod_pid = 0;
-	wake_up(assassin);
-
 	dprintk("RPC: rpciod exiting\n");
 	unlock_kernel();
+
+	rpciod_pid = 0;
+	complete_and_exit(&rpciod_killer, 0);
 	return 0;
 }
 
@@ -1041,7 +1040,7 @@
 	/*
 	 * Create the rpciod thread and wait for it to start.
 	 */
-	error = kernel_thread(rpciod, &rpciod_killer, 0);
+	error = kernel_thread(rpciod, NULL, 0);
 	if (error < 0) {
 		printk(KERN_WARNING "rpciod_up: create thread failed, error=%d\n", error);
 		rpciod_users--;
@@ -1057,8 +1056,6 @@
 void
 rpciod_down(void)
 {
-	unsigned long flags;
-
 	down(&rpciod_sema);
 	dprintk("rpciod_down pid %d sema %d\n", rpciod_pid, rpciod_users);
 	if (rpciod_users) {
@@ -1073,27 +1070,8 @@
 	}
 
 	kill_proc(rpciod_pid, SIGKILL, 1);
-	/*
-	 * Usually rpciod will exit very quickly, so we
-	 * wait briefly before checking the process id.
-	 */
-	clear_thread_flag(TIF_SIGPENDING);
-	yield();
-	/*
-	 * Display a message if we're going to wait longer.
-	 */
-	while (rpciod_pid) {
-		dprintk("rpciod_down: waiting for pid %d to exit\n", rpciod_pid);
-		if (signalled()) {
-			dprintk("rpciod_down: caught signal\n");
-			break;
-		}
-		interruptible_sleep_on(&rpciod_killer);
-	}
-	spin_lock_irqsave(&current->sighand->siglock, flags);
-	recalc_sigpending();
-	spin_unlock_irqrestore(&current->sighand->siglock, flags);
-out:
+	wait_for_completion(&rpciod_killer);
+ out:
 	up(&rpciod_sema);
 }
 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH][RFC] use completions instead of sleep_on for rpciod
  2004-02-07 14:44 [PATCH][RFC] use completions instead of sleep_on for rpciod Christoph Hellwig
@ 2004-02-08 20:43 ` David S. Miller
  2004-02-09 10:28 ` Olaf Kirch
  1 sibling, 0 replies; 9+ messages in thread
From: David S. Miller @ 2004-02-08 20:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: netdev

On Sat, 7 Feb 2004 15:44:05 +0100
Christoph Hellwig <hch@lst.de> wrote:

> The rpciod shutdown code gives ugly sleep_on without BKL warnings in
> -mm.  And it looks indeed somewhat racy.
> 
> The easy fix would be to simply use a completion as in the patch below,
> but that removes all the signal fuzzing semantics the current code has.
> I don't really understand why we want to cancel the operation by
> signals, but I think it'd be better to leave that to people familar with
> the code anyway..

I think your patch is fine, and there are no signal issues (such code looks
merely to be some abberation rather than anything else).

So I've added it to my tree.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH][RFC] use completions instead of sleep_on for rpciod
  2004-02-07 14:44 [PATCH][RFC] use completions instead of sleep_on for rpciod Christoph Hellwig
  2004-02-08 20:43 ` David S. Miller
@ 2004-02-09 10:28 ` Olaf Kirch
  2004-02-09 20:48   ` [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod trond.myklebust
  2004-02-10  1:52   ` [NFS] [PATCH][RFC] use completions instead of sleep_on for rpciod Greg Banks
  1 sibling, 2 replies; 9+ messages in thread
From: Olaf Kirch @ 2004-02-09 10:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: netdev, nfs

On Sat, Feb 07, 2004 at 03:44:05PM +0100, Christoph Hellwig wrote:
> The rpciod shutdown code gives ugly sleep_on without BKL warnings in
> -mm.  And it looks indeed somewhat racy.

Yes, this code is indeed one of the areas that breaks when you use
the sunrpc code without the BKL. Another one was the missing spinlock
in xprt_alloc_xid. The code in fs/nfs/unlink.c is probably also racy
without BKL.

So if you remove the sunrpc BKL be prepared for lots and lots
of bug reports :)

Olaf
-- 
Olaf Kirch     |  Stop wasting entropy - start using predictable
okir@suse.de   |  tempfile names today!
---------------+ 


-------------------------------------------------------
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod
  2004-02-09 10:28 ` Olaf Kirch
@ 2004-02-09 20:48   ` trond.myklebust
  2004-02-09 21:00     ` David S. Miller
  2004-02-10  9:15     ` Olaf Kirch
  2004-02-10  1:52   ` [NFS] [PATCH][RFC] use completions instead of sleep_on for rpciod Greg Banks
  1 sibling, 2 replies; 9+ messages in thread
From: trond.myklebust @ 2004-02-09 20:48 UTC (permalink / raw)
  To: Olaf Kirch, Christoph Hellwig; +Cc: netdev, nfs

På må , 09/02/2004 klokka 11:28, skreiv Olaf Kirch:
> On Sat, Feb 07, 2004 at 03:44:05PM +0100, Christoph Hellwig wrote:
> > The rpciod shutdown code gives ugly sleep_on without BKL warnings in
-mm.  And it looks indeed somewhat racy.
>
> Yes, this code is indeed one of the areas that breaks when you use the
sunrpc code without the BKL. Another one was the missing spinlock in
xprt_alloc_xid. The code in fs/nfs/unlink.c is probably also racy
without BKL.
>
> So if you remove the sunrpc BKL be prepared for lots and lots
> of bug reports :)

Ahem....

xprt_alloc_xid is quite safe. It's always called under xprt->xprt_lock.

The code in unlink.c is only called from dentry_iput() or from
sillyrename(). In either case, there is no possibility for SMP races.

I'll admit that the rpciod killer code is racy, but I'm not sure that
completions are the best answer, as they have at least one nasty feature:
they are not interruptible.

Cheers,
  Trond

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod
  2004-02-09 20:48   ` [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod trond.myklebust
@ 2004-02-09 21:00     ` David S. Miller
  2004-02-10  9:15     ` Olaf Kirch
  1 sibling, 0 replies; 9+ messages in thread
From: David S. Miller @ 2004-02-09 21:00 UTC (permalink / raw)
  To: trond.myklebust; +Cc: okir, hch, netdev, nfs

On Mon, 9 Feb 2004 21:48:59 +0100 (CET)
trond.myklebust@fys.uio.no wrote:

> I'll admit that the rpciod killer code is racy, but I'm not sure that
> completions are the best answer, as they have at least one nasty feature:
> they are not interruptible.

Would you like to back out the change then?  I put it into
Linus's tree already.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [NFS] [PATCH][RFC] use completions instead of sleep_on for rpciod
  2004-02-09 10:28 ` Olaf Kirch
  2004-02-09 20:48   ` [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod trond.myklebust
@ 2004-02-10  1:52   ` Greg Banks
  2004-02-10  9:18     ` Olaf Kirch
  1 sibling, 1 reply; 9+ messages in thread
From: Greg Banks @ 2004-02-10  1:52 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Christoph Hellwig, netdev, nfs

Olaf Kirch wrote:
> 
> [...] The code in fs/nfs/unlink.c is probably also racy
> without BKL.

I had cause to poke around there recently, and it's *definitely* racy
without BKL.  That file would benefit from several modernisations:

1.  move the list of nfs_unlinkdata from a global to a field in nfs_server

2.  add a lock in nfs_server for it instead of implicit BKL in
    nfs_complete_unlink, nfs_async_unlink, nfs_put_unlinkdata

3.  use proper list.h lists

> So if you remove the sunrpc BKL be prepared for lots and lots
> of bug reports :)

Yes but it's worth doing, especially serverside.

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod
  2004-02-09 20:48   ` [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod trond.myklebust
  2004-02-09 21:00     ` David S. Miller
@ 2004-02-10  9:15     ` Olaf Kirch
  2004-02-10 11:47       ` trond.myklebust
  1 sibling, 1 reply; 9+ messages in thread
From: Olaf Kirch @ 2004-02-10  9:15 UTC (permalink / raw)
  To: trond.myklebust; +Cc: Christoph Hellwig, netdev, nfs

On Mon, Feb 09, 2004 at 09:48:59PM +0100, Trond Myklebust wrote:
> > So if you remove the sunrpc BKL be prepared for lots and lots
> > of bug reports :)
> 
> Ahem....
> 
> xprt_alloc_xid is quite safe. It's always called under xprt->xprt_lock.

It wasn't in 2.4.19 or whenever I ran into this. And a per-transport
lock isn't enough to protect a global variable shared by all transports.
Trust me: I spent several weeks debugging this particular problem :)
(See the PS for a somewhat lengthy description of what I remember)

I think that BKL removal at this point is probably not a good idea. Suse
had a BKL removal patch in their SLES8 kernel, and I spent a total of 4-5
weeks hunting down all sorts of bugs related to SMP races. And I don't
think we've seen all of these bugs yet. It might be better to postpone
this for the 2.7 branch.

> The code in unlink.c is only called from dentry_iput() or from
> sillyrename(). In either case, there is no possibility for SMP races.

Are you sure? nfs_dentry_iput explicitly does a lock_kernel before calling
nfs_complete_unlink. I think that's because dentry_iput explicitly drops
the dcache_lock before calling dentry->d_op->d_iput. I wouldn't remove
that BKL :)

And even if you leave that BKL and just remove the one in rpciod, you may
still have a race between rpciod and dput. One CPU may be adding entries
inside nfs_async_unlink (called from dentry_iput()), and at the same
time rpciod can run on another CPU, calling nfs_async_unlink_release,
removing stuff from the nfs_deletes list. The reference counting on
nfs_unlinkdata isn't atomic either.

Olaf

PS: Let me try to put the pieces back together... this was on a PPC
SMP system (so it can take a while for modified values to be
be visible on other CPUs). There were several mounts being stressed
in parallel.

xprt_request_init did a simple "req->rq_xid = xid++;" without any
additional spin locks protecting the xid variable (we have that
spin lock now).

The bug required at least 3 CPUs, and 3 different mounts.

        CPU0            CPU1            CPU2
       --------------------------------------------------------
   |    lock xprt1      lock xprt2
   |                                    lock xprt3
   |    read xid=0
   |                    read xid=0
   |    write xid=1
                                        read xid=1
 time++                                 write xid=2
                        write xid=1
   |                                    unlock xprt3
   |                                    lock xprt3
   |                                    read xid=1
   |                                    write xid=2
   v                                    unlock xprt3

Note that on PPC, modified data can take a long time to become
visible on other CPUs. Cache sync is only guaranteed by operations
such as spin_unlock etc.

Olaf
-- 
Olaf Kirch     |  Stop wasting entropy - start using predictable
okir@suse.de   |  tempfile names today!
---------------+ 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [NFS] [PATCH][RFC] use completions instead of sleep_on for rpciod
  2004-02-10  1:52   ` [NFS] [PATCH][RFC] use completions instead of sleep_on for rpciod Greg Banks
@ 2004-02-10  9:18     ` Olaf Kirch
  0 siblings, 0 replies; 9+ messages in thread
From: Olaf Kirch @ 2004-02-10  9:18 UTC (permalink / raw)
  To: Greg Banks; +Cc: Christoph Hellwig, netdev, nfs

On Tue, Feb 10, 2004 at 12:52:18PM +1100, Greg Banks wrote:
> Yes but it's worth doing, especially serverside.

I won't argue that point. But I've had to deal with the fall-out from
"oh-well-they-seem-to-work-here" type of NFS BKL removal patches, so
I'm quite wary of adding anything at this point in the game :)

Olaf
-- 
Olaf Kirch     |  Stop wasting entropy - start using predictable
okir@suse.de   |  tempfile names today!
---------------+ 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod
  2004-02-10  9:15     ` Olaf Kirch
@ 2004-02-10 11:47       ` trond.myklebust
  0 siblings, 0 replies; 9+ messages in thread
From: trond.myklebust @ 2004-02-10 11:47 UTC (permalink / raw)
  To: Olaf Kirch; +Cc: Christoph Hellwig, netdev, nfs

På ty , 10/02/2004 klokka 10:15, skreiv Olaf Kirch:
> It wasn't in 2.4.19 or whenever I ran into this. And a per-transport
lock isn't enough to protect a global variable shared by all transports.
Trust me: I spent several weeks debugging this particular problem :)
(See the PS for a somewhat lengthy description of what I remember)

Duh... I forgot that the XID was still global...

Does anybody see any problems with moving the XID counter into the struct
xprt? Please bear in mind that the replay caches on modern NFS servers
look at XID+client IP address+port number, so 2 sockets that play the same
XID to the server should not normally conflict.

> I think that BKL removal at this point is probably not a good idea. Suse
had a BKL removal patch in their SLES8 kernel, and I spent a total of
4-5
> weeks hunting down all sorts of bugs related to SMP races. And I don't
think we've seen all of these bugs yet. It might be better to postpone
this for the 2.7 branch.

I totally agree with this...

No extra work has been done on BKL removal in the 2.6 branch, and I know
that the old patch that was up for testing in the 2.4 tree was buggy in
several ways...

> > The code in unlink.c is only called from dentry_iput() or from
sillyrename(). In either case, there is no possibility for SMP races.
>
> Are you sure? nfs_dentry_iput explicitly does a lock_kernel before
calling
> nfs_complete_unlink. I think that's because dentry_iput explicitly drops
the dcache_lock before calling dentry->d_op->d_iput. I wouldn't remove
that BKL :)

Yep, but there is only one inode per dentry, and the VFS rules forbids you
to convert a negative dentry into a positive one (the contrary is allowed,
but only under certain restricted situations).

> And even if you leave that BKL and just remove the one in rpciod, you
may
> still have a race between rpciod and dput. One CPU may be adding entries
inside nfs_async_unlink (called from dentry_iput()), and at the same
time rpciod can run on another CPU, calling nfs_async_unlink_release,
removing stuff from the nfs_deletes list. The reference counting on
nfs_unlinkdata isn't atomic either.

Sure... The unlink.c global lists definitely depend on the BKL for
protection. The latter would need to be replaced by a new global
spinlock.

Cheers,
  Trond

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2004-02-10 11:47 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-02-07 14:44 [PATCH][RFC] use completions instead of sleep_on for rpciod Christoph Hellwig
2004-02-08 20:43 ` David S. Miller
2004-02-09 10:28 ` Olaf Kirch
2004-02-09 20:48   ` [NFS] [PATCH][RFC] use completions instead of sleep_on forrpciod trond.myklebust
2004-02-09 21:00     ` David S. Miller
2004-02-10  9:15     ` Olaf Kirch
2004-02-10 11:47       ` trond.myklebust
2004-02-10  1:52   ` [NFS] [PATCH][RFC] use completions instead of sleep_on for rpciod Greg Banks
2004-02-10  9:18     ` Olaf Kirch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).