From: Simon Kirby <sim@hostway.ca>
To: Trond Myklebust <trond.myklebust@fys.uio.no>, Greg Banks <gnb@sgi.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: kernel NULL pointer dereference in rpcb_getport_done (2.6.29.4)
Date: Mon, 22 Jun 2009 14:11:26 -0700
Message-ID: <20090622211126.GA564@hostway.ca>
In-Reply-To: <20090621050941.GA17059@hostway.ca>

On Sat, Jun 20, 2009 at 10:09:41PM -0700, Simon Kirby wrote:

> Actually, we just saw another similar crash on another machine which is
> an NFS client from this server (no nfsd running).  Same backtrace, but
> this time RAX was "32322e32352e3031", which is obviously ASCII
> ("22.25.01"), so memory scribbling seems to definitely be happening...

Good news: 2.6.30 seems to have fixed whatever the original scribbling
source was.  I see at least a couple of suspect commits in the log, but
I'm not sure which yet.

However, with 2.6.30, it seems commit
59a252ff8c0f2fa32c896f69d56ae33e641ce7ad is causing a large performance
regression for us: the server's response latency is huge compared to
normal.  I suspected this patch was the culprit, so I overwrote the
instruction that loads SVC_MAX_WAKING before this comparison:

+	if (pool->sp_nwaking >= SVC_MAX_WAKING) {
+		/* too many threads are runnable and trying to wake up */
+		thread_avail = 0;
+	}

...when I raised SVC_MAX_WAKING to 40ish, the problem disappeared for us.

The problem is that with just 72 nfsd processes running, the NFS socket
has a ~1 MB backlog of packets on it, even though "ps" shows most of the
nfsd threads are not blocked.  This is on an 8 core system, with high NFS
packet rates.  More NFS threads (300) made no difference.

As soon as I raised SVC_MAX_WAKING, the load average went up again to
what it normally was before with 2.6.29, but the socket's receive backlog
went down to nearly 0 again, and the request latency is now back to
normal.

I think the issue here is that whatever calls svc_xprt_enqueue() isn't
doing it again as soon as the threads sleep again, but only when the next
packet comes in, or something...

Simon-
