From: Simon Kirby <sim@hostway.ca>
To: Trond Myklebust <trond.myklebust@fys.uio.no>, Greg Banks <gnb@sgi.com>
Cc: linux-nfs@vger.kernel.org
Subject: Re: kernel NULL pointer dereference in rpcb_getport_done (2.6.29.4)
Date: Mon, 22 Jun 2009 14:11:26 -0700
Message-ID: <20090622211126.GA564@hostway.ca>
In-Reply-To: <20090621050941.GA17059@hostway.ca>
On Sat, Jun 20, 2009 at 10:09:41PM -0700, Simon Kirby wrote:
> Actually, we just saw another similar crash on another machine which is
> an NFS client from this server (no nfsd running). Same backtrace, but
> this time RAX was "32322e32352e3031", which is obviously ASCII
> ("22.25.01"), so memory scribbling seems to definitely be happening...
Good news: 2.6.30 seems to have fixed whatever the original scribbling
source was. I see at least a couple of suspect commits in the log, but
I'm not sure which yet.
However, with 2.6.30, it seems commit 59a252ff8c0f2fa32c896f69d56ae33e641ce7ad
is causing us a large performance regression. The server's response
latency is huge compared to normal. I suspected this patch was the
culprit, so I overwrote the instruction that loads SVC_MAX_WAKING before
this comparison:
+	if (pool->sp_nwaking >= SVC_MAX_WAKING) {
+		/* too many threads are runnable and trying to wake up */
+		thread_avail = 0;
+	}
...and once I raised SVC_MAX_WAKING to 40ish, the problem disappeared for us.
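For readers without the patch at hand, the effect of that comparison can be sketched in userspace C. This is only a rough model under stated assumptions, not the sunrpc code: `struct pool_model`, its `idle_threads` field, `thread_available()` and the `max_waking` parameter are hypothetical names introduced for illustration. Only `sp_nwaking`, `thread_avail` and the `>=` comparison are taken from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-pool state; NOT the kernel's struct svc_pool. */
struct pool_model {
    int idle_threads;   /* assumed: threads asleep, waiting for work */
    int sp_nwaking;     /* threads that were woken but are not running yet */
};

/*
 * Mirrors the quoted comparison: even if idle threads exist, report that
 * none is available once the number of waking threads reaches the limit.
 * max_waking plays the role of SVC_MAX_WAKING, made a parameter so that
 * different limits can be compared.
 */
static bool thread_available(const struct pool_model *pool, int max_waking)
{
    assert(pool != NULL);

    bool thread_avail = pool->idle_threads > 0;

    if (pool->sp_nwaking >= max_waking) {
        /* too many threads are runnable and trying to wake up */
        thread_avail = false;
    }
    return thread_avail;
}
```

In this model, a pool with plenty of idle threads still reports none available whenever `sp_nwaking` has reached the limit, so raising the limit directly changes how often an incoming request finds a thread.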
The problem is that with just 72 nfsd processes running, the NFS socket
has a ~1 MB backlog of packets on it, even though "ps" shows most of the
nfsd threads are not blocked. This is on an 8 core system, with high NFS
packet rates. More NFS threads (300) made no difference.
As soon as I raised SVC_MAX_WAKING, the load average went up again to
what it normally was before with 2.6.29, but the socket's receive backlog
went down to nearly 0 again, and the request latency is now back to
normal.
I think the issue here is that whatever calls svc_xprt_enqueue() isn't
doing it again as soon as the threads sleep again, but only when the next
packet comes in, or something...
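The guess above can be illustrated with a toy model. This is a sketch of the hypothesis only and makes no claim about what svc_xprt_enqueue() or its callers actually do: `struct sim`, `try_wake()`, `tick()` and `backlog_after_burst()` are hypothetical, each woken thread is assumed to serve exactly one request per tick, and the burst size and limits are made-up numbers. Only the 72 threads come from the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state: queued requests, sleeping threads, threads woken but not yet run. */
struct sim {
    int backlog;
    int idle;
    int waking;
};

/* One wake attempt: wake a thread only if a request is unclaimed,
 * a thread is idle, and the waking limit has not been reached. */
static void try_wake(struct sim *s, int max_waking)
{
    bool unclaimed = s->backlog > s->waking;

    if (unclaimed && s->idle > 0 && s->waking < max_waking) {
        s->idle--;
        s->waking++;
    }
}

/* One time step: arrivals each trigger a wake attempt, then every woken
 * thread serves one request and goes back to sleep. If recheck_on_sleep
 * is set, each thread that goes back to sleep triggers another attempt. */
static void tick(struct sim *s, int arrivals, int max_waking,
                 bool recheck_on_sleep)
{
    for (int i = 0; i < arrivals; i++) {
        s->backlog++;
        try_wake(s, max_waking);
    }

    assert(s->waking <= s->backlog);   /* never more wakers than requests */

    int served = s->waking;
    s->backlog -= served;
    s->waking = 0;
    s->idle += served;

    if (recheck_on_sleep)
        for (int i = 0; i < served; i++)
            try_wake(s, max_waking);
}

/* Backlog left after one burst followed by ticks with no arrivals. */
static int backlog_after_burst(int burst, int quiet_ticks, int max_waking,
                               bool recheck_on_sleep)
{
    struct sim s = { .backlog = 0, .idle = 72, .waking = 0 };

    tick(&s, burst, max_waking, recheck_on_sleep);
    for (int t = 0; t < quiet_ticks; t++)
        tick(&s, 0, max_waking, recheck_on_sleep);
    return s.backlog;
}
```

In this model, `backlog_after_burst(10, 3, 5, false)` leaves 5 requests queued with all 72 threads idle, because nothing retries the wake-up until another packet arrives. The same burst drains to 0 with either `recheck_on_sleep` set or the limit raised to 40, which is the shape of the behaviour described above.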
Simon-