Linux NFS development
 help / color / mirror / Atom feed
From: Olaf Kirch <okir@suse.de>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: nfs@lists.sourceforge.net
Subject: Re: RE: Race condition in xprt_disconnect
Date: Tue, 6 Apr 2004 11:24:01 +0200	[thread overview]
Message-ID: <20040406092401.GA29906@suse.de> (raw)
In-Reply-To: <1081197570.2641.133.camel@lade.trondhjem.org>

Hi Trond,

On Mon, Apr 05, 2004 at 04:39:30PM -0400, Trond Myklebust wrote:
> The only way I see that we can have duplicate calls to sock_release() is
> if we are scheduling xprt_socket_autoclose() or xprt_socket_connect()
> more than once.
> Normally xprt_lock_write() should protect us against this, and so AFAICS
> any duplicates must imply that xprt_lock_write() is getting lost before
> xprt_socket_connect() has had a chance to run.

Yes, that's what it looks like. I had another oops now happening
in some selinux function called from tcp_connect. This looks like
two processes in xprt_socket_connect; the first having just opened
the socket and calling connect, and the second one having released
the socket inbetween.

So yes, you're right. My patch is probably not sufficient.

> Presumably this is
> occurring because xprt->snd_task is timing out and then being woken up
> from xprt->pending.

I'm not sure it's a timeout, actually. Looking at the syslog, the NFS
Server closes the connection, and the oops happens one second later.

I've been looking around the code a little more, and here's a
different theory.

 -	server drops connection
 -	tcp_state_change is called and wakes up all tasks with ENOTCONN
 -	rpciod schedules all tasks. The first one gets the write lock,
 	and schedules the xprt_socket_connect worker. Goes to
	sleep on xprt->pending.
 -	xprt_socket_connect runs and calls xprt_close->xprt_disconnect
 	which wakes up all tasks with ENOTCONN. While xprt_socket_connect
	is calling various network functions to create a socket, bind
	and connect it, the following happens on CPU B:
 -	rpciod wakes up, schedules the task that was waiting for the
 	connect. Oops, we already have the send lock (we're snd_task),
	so go ahead and try to reconnect. Thus we schedule
	xprt_socket_connect a second time, which gets run. Note that
	the workqueue stuff resets the pending bit before calling
	xprt_socket_connect. First thing xprt_socket_connect does
	is close xprt->sock.
 -	Back on CPU A, we find the socket we're just trying to
 	connect is gone. Oops.

My patch proposed earlier to change xprt_close to not wake up all tasks
when called from xprt_socket_connect should also prevent this race. The
general case of snd_task waking up before the worker thread has run is
not covered by this, though. But that should only happen due to timeout,
and a 60 second timeout should be sufficient for keventd even on a slow
day.

> Note that in that case we don't actually *want* to schedule a new
> connect(), since the problem is not that the networking operation timed
> out, but rather that keventd is being too slow...

The additional XPRT_PENDING bit may help. It should be sufficient to
set it in xprt_connect and clear it in xprt_connect_status, I think.

Alternatively, it would also help if we canceled the pending work request
in __xprt_release_write. Unfortunately the workqueue api doesn't seem
to support cancellation. Alternatively we could just call
flush_pending_work.

Et ceterum censeo: sunrpc needs a rewrite :)
I'd like to spend some time on it this summer...

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+ 


-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

  reply	other threads:[~2004-04-06  9:24 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-04-05 20:39 Race condition in xprt_disconnect Trond Myklebust
2004-04-06  9:24 ` Olaf Kirch [this message]
2004-04-06 13:51   ` Trond Myklebust
2004-04-06 14:11   ` Trond Myklebust
2004-04-06 14:26     ` Olaf Kirch
2004-04-06 14:36       ` Trond Myklebust
2004-04-06 14:54         ` Trond Myklebust
2004-04-06 15:23           ` Olaf Kirch
2004-04-06 16:02             ` Trond Myklebust
2004-04-06 16:17               ` Olaf Kirch
2004-04-06 15:27       ` Mounting any other os besides RHEW 3 doesn't work Brent M. Clements
2004-04-06 20:05         ` Steve Dickson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040406092401.GA29906@suse.de \
    --to=okir@suse.de \
    --cc=nfs@lists.sourceforge.net \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox