From: "J. Bruce Fields" <bfields@fieldses.org>
To: Simon Kirby <sim@hostway.ca>
Cc: linux-nfs@vger.kernel.org, Greg Banks <gnb-xTcybq6BZ68@public.gmane.org>
Subject: Re: kernel NULL pointer dereference in rpcb_getport_done (2.6.29.4)
Date: Fri, 10 Jul 2009 18:34:08 -0400
Message-ID: <20090710223408.GR10700@fieldses.org>
In-Reply-To: <20090709172739.GG13617@hostway.ca>

On Thu, Jul 09, 2009 at 10:27:39AM -0700, Simon Kirby wrote:
> Hello,
>
> It seems this email to Greg Banks is bouncing (no longer works at SGI),

Yes, I've cc'd his new address. (But he's on vacation.)

> and I see git commit 59a252ff8c0f2fa32c896f69d56ae33e641ce7ad is still
> in HEAD (and still causing problems for our load).
>
> Can somebody else eyeball this, please? I don't understand enough about
> this particular change to fix the request latency / queue backlogging
> that this patch seems to introduce.
>
> It would seem to me that this patch is flawed because svc_xprt_enqueue()
> is edge-triggered upon the arrival of packets, but the NFS threads
> themselves cannot then pull another request off of the socket queue.
> This patch likely helps with the particular benchmark, but not in our
> load case where there is a heavy mix of cached and uncached NFS requests.

That sounds plausible. I'll need to take some time to look at it.

--b.

>
> Simon-
>
> On Mon, Jun 22, 2009 at 02:11:26PM -0700, Simon Kirby wrote:
>
> > On Sat, Jun 20, 2009 at 10:09:41PM -0700, Simon Kirby wrote:
> >
> > > Actually, we just saw another similar crash on another machine which is
> > > an NFS client from this server (no nfsd running). Same backtrace, but
> > > this time RAX was "32322e32352e3031", which is obviously ASCII
> > > ("22.25.01"), so memory scribbling seems to definitely be happening...
> >
> > Good news: 2.6.30 seems to have fixed whatever the original scribbling
> > source was. I see at least a couple of suspect commits in the log, but
> > I'm not sure which yet.
> >
> > However, with 2.6.30, it seems 59a252ff8c0f2fa32c896f69d56ae33e641ce7ad
> > is causing us a large performance regression. The server's response
> > latency is huge compared to normal. I suspected this patch was the
> > culprit, so I wrote over the instruction that loads SVC_MAX_WAKING before
> > this comparison:
> >
> > + if (pool->sp_nwaking >= SVC_MAX_WAKING) {
> > + /* too many threads are runnable and trying to wake up */
> > + thread_avail = 0;
> > + }
> >
> > ...when I raised SVC_MAX_WAKING to 40ish, the problem for us disappears.
> >
> > The problem is that with just 72 nfsd processes running, the NFS socket
> > has a ~1 MB backlog of packets on it, even though "ps" shows most of the
> > nfsd threads are not blocked. This is on an 8 core system, with high NFS
> > packet rates. More NFS threads (300) made no difference.
> >
> > As soon as I raised SVC_MAX_WAKING, the load average went up again to
> > what it normally was before with 2.6.29, but the socket's receive backlog
> > went down to nearly 0 again, and the request latency is now back to
> > normal.
> >
> > I think the issue here is that whatever calls svc_xprt_enqueue() isn't
> > doing it again as soon as the threads sleep again, but only when the next
> > packet comes in, or something...
> >
> > Simon-
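
A minimal userspace sketch of the hypothesis raised in the quoted message, assuming (as Simon suggests, not as confirmed kernel behavior) that a request refused by the SVC_MAX_WAKING check stays queued until something else pulls it off the socket. It is a toy model, not the sunrpc code: `pool_model`, `enqueue_model`, `thread_runs_model`, and `simulate_backlog` are hypothetical names, and only the comparison against the waking limit is taken from the quoted patch fragment.

```c
/* Hypothetical model of an nfsd thread pool; not the kernel's svc_pool. */
struct pool_model {
    int idle_threads;  /* threads asleep, waiting for work */
    int nwaking;       /* threads woken but not yet running */
    int max_waking;    /* stand-in for SVC_MAX_WAKING */
    int backlog;       /* requests left sitting on the socket */
};

/* Called once per arriving packet (edge-triggered).
 * Returns 1 if a thread was woken for it, 0 if it was left queued. */
int enqueue_model(struct pool_model *p)
{
    int thread_avail = p->idle_threads > 0;

    if (p->nwaking >= p->max_waking) {
        /* too many threads are runnable and trying to wake up */
        thread_avail = 0;
    }
    if (!thread_avail) {
        p->backlog++;  /* assumption: nobody revisits this request */
        return 0;
    }
    p->idle_threads--;
    p->nwaking++;
    return 1;
}

/* A woken thread gets the CPU, serves its one request, and sleeps again.
 * Assumption: it does not look at the backlog before sleeping. */
void thread_runs_model(struct pool_model *p)
{
    if (p->nwaking > 0) {
        p->nwaking--;
        p->idle_threads++;
    }
}

/* Each tick: `arrivals` packets arrive, then up to `runs` woken threads
 * are scheduled. Returns the backlog after `ticks` rounds. */
int simulate_backlog(int threads, int max_waking, int ticks,
                     int arrivals, int runs)
{
    struct pool_model p = {
        .idle_threads = threads,
        .nwaking = 0,
        .max_waking = max_waking,
        .backlog = 0,
    };

    for (int t = 0; t < ticks; t++) {
        for (int i = 0; i < arrivals; i++)
            enqueue_model(&p);
        for (int i = 0; i < runs; i++)
            thread_runs_model(&p);
    }
    return p.backlog;
}
```

In this model, `simulate_backlog(72, 5, 100, 20, 20)` returns 1500 while `simulate_backlog(72, 40, 100, 20, 20)` returns 0: with a low limit most of the 72 threads stay idle yet the queue keeps growing, which is the shape of the symptom described above. The burst size and scheduling rate are made-up parameters chosen only to show the effect.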