Linux NFS development
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Jeff Layton <jlayton@redhat.com>
Cc: linux-nfs@vger.kernel.org, nfsv4@linux-nfs.org
Subject: Re: [PATCH] NLM: hold BKL when clearing global lockd task and serv vars
Date: Tue, 8 Apr 2008 12:28:21 -0400
Message-ID: <20080408162821.GA8994@fieldses.org>
In-Reply-To: <20080408092102.2404f5ee@tleilax.poochiereds.net>

On Tue, Apr 08, 2008 at 09:21:02AM -0400, Jeff Layton wrote:
> On Mon, 7 Apr 2008 16:50:27 -0400
> "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Mon, Apr 07, 2008 at 04:22:41PM -0400, Jeff Layton wrote:
> > > On Mon, 7 Apr 2008 13:56:15 -0400
> > > "J. Bruce Fields" <bfields@fieldses.org> wrote:
> > > 
> > > > On Mon, Apr 07, 2008 at 12:45:01PM -0400, Christoph Hellwig wrote:
> > > > > On Mon, Apr 07, 2008 at 09:38:34AM -0400, Jeff Layton wrote:
> > > > > > The global task and serv pointers for lockd are normally protected by
> > > > > > the nlmsvc_mutex. The exception is when the lockd exits abnormally. When
> > > > > > this occurs, these variables are cleared without any locking.
> > > > > 
> > > > > Shouldn't we get rid of the case where it exits abnormally instead?
> > > > 
> > > > I tried to figure out when this could actually occur (when can
> > > > svc_recv() return an error other than -EINTR or -EAGAIN?), and got lost
> > > > in sock_recvmsg():
> > > > 
> > > > 	- svc_recv() itself returns only -EAGAIN or the return from
> > > > 	  ->xpo_recvfrom().
> > > > 	- the only xpo_recvfrom() that's interesting is
> > > > 	  svc_tcp_recvfrom(), which can return the error it gets from
> > > > 	  svc_recvfrom(), which can return the error from
> > > > 	  kernel_recvmsg(), which gets its return from sock_recvmsg().
> > > > 
> > > > Since __sock_recvmsg() has a security hook, it looks like we can end up
> > > > with an -EACCES from selinux?
> > > > 
> > > > So one case would be selinux deciding we weren't allowed to receive
> > > > packets from this socket.  Huh.
> > > 
> > > I got lost there too, but I would suspect that there are other errors
> > > that can bubble up from the lower networking layers as well. Even if
> > > there aren't currently, it's probably still prudent to assume that it's
> > > a possibility and code for it.
> > > 
> > > I tend to think the safest thing is probably to do a long sleep (1s or
> > > so) and retry when we get an error (maybe also a ratelimited printk?).
> > 
> > Yeah, I guess I can't think of anything better.
> > 
> 
> Ok, I went ahead and did patches for this and gave them a quick test
> this morning. Obviously, these are hard to fully unit test since this
> seems to be a very uncommon occurrence.

I suppose this could probably be reproduced with some selinux magic.

> Any thoughts?

If anyone does ever hit this and it doesn't go away, the printk (even
with the ratelimiting) could be pretty annoying, so it might be worth
arranging to print this just once.  But perhaps we can wait and see if
that actually happens.
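
To make the sleep-and-retry idea with a print-once warning concrete,
here is a minimal userspace sketch. It is not the patch under
discussion: recv_with_retry(), struct retry_stats, flaky_recv() and
backoff_ms() are hypothetical names invented for illustration,
nanosleep() stands in for whatever sleep the kernel thread would use,
and the loop is bounded by max_tries only so that it can be exercised
in a test (a real service thread would presumably keep looping until
told to stop).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for svc_recv(): >= 0 on success, negative errno on error. */
typedef int (*recv_fn)(void *ctx);

struct retry_stats {
	int warned;	/* set once the warning has been printed */
	int warnings;	/* number of warnings actually emitted */
	int sleeps;	/* number of times we backed off */
};

/* Userspace stand-in for a kernel sleep; delay is in milliseconds. */
static void backoff_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};
	nanosleep(&ts, NULL);
}

/*
 * Call fn() until it succeeds or max_tries is exhausted.  -EAGAIN and
 * -EINTR are treated as routine and retried immediately.  Any other
 * error is reported once (not once per failure), followed by a sleep
 * before the next attempt.  Returns the last value fn() returned, or
 * -EAGAIN if fn() was never called.
 */
static int recv_with_retry(recv_fn fn, void *ctx, int max_tries,
			   long delay_ms, struct retry_stats *st)
{
	int err = -EAGAIN;
	int i;

	assert(fn != NULL && st != NULL);

	for (i = 0; i < max_tries; i++) {
		err = fn(ctx);
		if (err >= 0)
			return err;
		if (err == -EAGAIN || err == -EINTR)
			continue;
		if (!st->warned) {
			fprintf(stderr, "recv failed: unexpected error %d\n", err);
			st->warned = 1;
			st->warnings++;
		}
		backoff_ms(delay_ms);
		st->sleeps++;
	}
	return err;
}

/* Test double: fails with -EACCES a fixed number of times, then returns 42. */
struct flaky {
	int failures_left;
};

static int flaky_recv(void *ctx)
{
	struct flaky *f = ctx;

	if (f->failures_left > 0) {
		f->failures_left--;
		return -EACCES;
	}
	return 42;
}
```

With a struct flaky set to fail three times and a 1 ms delay, the call
succeeds on the fourth attempt after three sleeps and a single warning.
Keeping the "warned" flag in the caller's struct instead of a static
variable is just a choice that keeps the sketch testable.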

Given what appears to be a very unusual crash, and what I'm assuming is
an impending release, I suppose we should wait for the merge window (but
possibly also submit to 2.6.25.x).

--b.

Thread overview: 14+ messages
2008-04-07 13:38 [PATCH] NFS: hold BKL when clearing nfs_callback_info.task Jeff Layton
2008-04-07 13:38 ` [PATCH] NLM: hold BKL when clearing global lockd task and serv vars Jeff Layton
2008-04-07 16:45   ` Christoph Hellwig
2008-04-07 17:40     ` Jeff Layton
2008-04-07 17:56     ` J. Bruce Fields
2008-04-07 19:08       ` Tom Tucker
2008-04-07 20:22       ` Jeff Layton
2008-04-07 20:50         ` J. Bruce Fields
2008-04-08 13:21           ` Jeff Layton
2008-04-08 16:28             ` J. Bruce Fields [this message]
2008-04-08 17:02               ` Jeff Layton
2008-04-08 19:16               ` Jeff Layton
2008-04-08 20:08               ` Chuck Lever
2008-04-08 20:20                 ` Jeff Layton
