Re: RE: Race condition in xprt_disconnect

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Olaf Kirch <okir@suse.de>
To: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: nfs@lists.sourceforge.net
Subject: Re: RE: Race condition in xprt_disconnect
Date: Tue, 6 Apr 2004 11:24:01 +0200	[thread overview]
Message-ID: <20040406092401.GA29906@suse.de> (raw)
In-Reply-To: <1081197570.2641.133.camel@lade.trondhjem.org>

Hi Trond,

On Mon, Apr 05, 2004 at 04:39:30PM -0400, Trond Myklebust wrote:
> The only way I see that we can have duplicate calls to sock_release() is
> if we are scheduling xprt_socket_autoclose() or xprt_socket_connect()
> more than once.
> Normally xprt_lock_write() should protect us against this, and so AFAICS
> any duplicates must imply that xprt_lock_write() is getting lost before
> xprt_socket_connect() has had a chance to run.

Yes, that's what it looks like. I had another oops now happening
in some selinux function called from tcp_connect. This looks like
two processes in xprt_socket_connect; the first having just opened
the socket and calling connect, and the second one having released
the socket inbetween.

So yes, you're right. My patch is probably not sufficient.

> Presumably this is
> occurring because xprt->snd_task is timing out and then being woken up
> from xprt->pending.

I'm not sure it's a timeout, actually. Looking at the syslog, the NFS
Server closes the connection, and the oops happens one second later.

I've been looking around the code a little more, and here's a
different theory.

 -	server drops connection
 -	tcp_state_change is called and wakes up all tasks with ENOTCONN
 -	rpciod schedules all tasks. The first one gets the write lock,
 	and schedules the xprt_socket_connect worker. Goes to
	sleep on xprt->pending.
 -	xprt_socket_connect runs and calls xprt_close->xprt_disconnect
 	which wakes up all tasks with ENOTCONN. While xprt_socket_connect
	is calling various network functions to create a socket, bind
	and connect it, the following happens on CPU B:
 -	rpciod wakes up, schedules the task that was waiting for the
 	connect. Oops, we already have the send lock (we're snd_task),
	so go ahead and try to reconnect. Thus we schedule
	xprt_socket_connect a second time, which gets run. Note that
	the workqueue stuff resets the pending bit before calling
	xprt_socket_connect. First thing xprt_socket_connect does
	is close xprt->sock.
 -	Back on CPU A, we find the socket we're just trying to
 	connect is gone. Oops.

My patch proposed earlier to change xprt_close to not wake up all tasks
when called from xprt_socket_connect should also prevent this race. The
general case of snd_task waking up before the worker thread has run is
not covered by this, though. But that should only happen due to timeout,
and a 60 second timeout should be sufficient for keventd even on a slow
day.

> Note that in that case we don't actually *want* to schedule a new
> connect(), since the problem is not that the networking operation timed
> out, but rather that keventd is being too slow...

The additional XPRT_PENDING bit may help. It should be sufficient to
set it in xprt_connect and clear it in xprt_connect_status, I think.

Alternatively, it would also help if we canceled the pending work request
in __xprt_release_write. Unfortunately the workqueue api doesn't seem
to support cancellation. Alternatively we could just call
flush_pending_work.

Et ceterum censeo: sunrpc needs a rewrite :)
I'd like to spend some time on it this summer...

Olaf
-- 
Olaf Kirch     |  The Hardware Gods hate me.
okir@suse.de   |
---------------+ 

-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

next prev parent reply	other threads:[~2004-04-06  9:24 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-04-05 20:39 Race condition in xprt_disconnect Trond Myklebust
2004-04-06  9:24 ` Olaf Kirch [this message]
2004-04-06 13:51   ` Trond Myklebust
2004-04-06 14:11   ` Trond Myklebust
2004-04-06 14:26     ` Olaf Kirch
2004-04-06 14:36       ` Trond Myklebust
2004-04-06 14:54         ` Trond Myklebust
2004-04-06 15:23           ` Olaf Kirch
2004-04-06 16:02             ` Trond Myklebust
2004-04-06 16:17               ` Olaf Kirch
2004-04-06 15:27       ` Mounting any other os besides RHEW 3 doesn't work Brent M. Clements
2004-04-06 20:05         ` Steve Dickson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040406092401.GA29906@suse.de \
    --to=okir@suse.de \
    --cc=nfs@lists.sourceforge.net \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.