public inbox for linux-kernel@vger.kernel.org
From: Tom Tucker <tom@opengridcomputing.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Michael Tokarev <mjt@tls.msk.ru>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-nfs@vger.kernel.org
Subject: Re: 2.6.24: RPC: bad TCP reclen 0x00020090 (large)
Date: Mon, 18 Feb 2008 16:00:36 -0600
Message-ID: <1203372036.24272.54.camel@trinity.ogc.int>
In-Reply-To: <1203369909.24272.44.camel@trinity.ogc.int>


On Mon, 2008-02-18 at 15:25 -0600, Tom Tucker wrote:
> On Mon, 2008-02-18 at 04:58 -0800, Andrew Morton wrote:
> > (suitable cc added)
> > 
> > (regression)
> > 
> > On Wed, 13 Feb 2008 17:02:53 +0300 Michael Tokarev <mjt@tls.msk.ru> wrote:
> > 
> > > Hello!
> > > 
> > > After upgrading to 2.6.24 (from .23), we're seeing A LOT
> > > of messages like the one in $subj in dmesg:
> > > 
> > > Feb 13 13:21:39 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > > Feb 13 13:21:46 paltus kernel: printk: 3586 messages suppressed.
> > > Feb 13 13:21:46 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > > Feb 13 13:21:49 paltus kernel: printk: 371 messages suppressed.
> > > Feb 13 13:21:49 paltus kernel: RPC: bad TCP reclen 0x00020090 (large)
> > > Feb 13 13:21:55 paltus kernel: printk: 2979 messages suppressed.
> > > ...
> > > 
> > > with a Linux NFS server.  The clients are all Linux too, mostly 2.6.23
> > > and some 2.6.22.
> > > 
> > > I found the "offending" piece of code in net/sunrpc/svcsock.c,
> > > in routine svc_tcp_recvfrom() with condition being:
> > > 
> > >    if (svsk->sk_reclen > serv->sv_max_mesg) ...
> 
> The problem might be that the client is setting a bit in the RPC message
> length field that is meant to be interpreted and masked off by the
> server -- and we're not doing it yet. My bet is that 0x20000 is the bit
> we're looking for. I'll poke around...
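
For reference, a minimal sketch of how a record-marking header splits
into its two fields under the ONC RPC record-marking standard (RFC 1831,
carried forward in RFC 5531): the top bit of the 32-bit header flags the
last fragment, and the low 31 bits carry the fragment length. This is a
standalone userspace illustration, not the svcsock.c code; rm_decode()
and the struct are hypothetical names. Read this way, 0x00020090 is a
length of 131216 bytes (128 KiB plus 144), so 0x20000 would be part of
the length rather than a flag. Whether the logged value had the fragment
bit masked off before printing is not something the sketch assumes.

```c
#include <stdint.h>

/* Field layout of the 4-byte RPC record marker (host byte order). */
#define RM_LAST_FRAGMENT 0x80000000u	/* top bit: last fragment of record */
#define RM_LENGTH_MASK   0x7fffffffu	/* low 31 bits: fragment length */

/* Hypothetical helper type, for illustration only. */
struct rpc_record_marker {
	int last_fragment;	/* 1 if the last-fragment bit is set */
	uint32_t length;	/* fragment length in bytes */
};

/* Split a marker (already converted with ntohl()) into its fields. */
static struct rpc_record_marker rm_decode(uint32_t marker)
{
	struct rpc_record_marker rm;

	rm.last_fragment = (marker & RM_LAST_FRAGMENT) != 0;
	rm.length = marker & RM_LENGTH_MASK;
	return rm;
}
```

With this, rm_decode(0x00020090u).length is 131216, and a marker of
0x80020090 would decode to the same length with last_fragment set.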

Never mind. The way this is supposed to work is that the transport is
shut down. The code used to delete the socket directly, but I
reorganized this code to just set the XPT_CLOSE bit and let the normal
close path handle it when it came back through to retry. I'm not sure
exactly what version of the code you have, but it may be that you're
missing the code in the close path that does this, or that path may
simply not work. As a first shot, can you try this patch and tell me if
the messages go away?


diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 1d3e5fc..cf6150a 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -920,6 +920,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
 
  err_delete:
        set_bit(XPT_CLOSE, &svsk->sk_xprt.xpt_flags);
+       svc_delete_xprt(&svsk->sk_xprt);
        return -EAGAIN;
 
  error:
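
To make the theory above concrete, here is a toy userspace model of the
deferred-close idea: the receive path only sets a close flag and returns
-EAGAIN, and the transport is torn down on the next pass only if the
service loop honours that flag. None of this is kernel code; toy_xprt,
TOY_XPT_CLOSE, toy_recvfrom() and toy_service_once() are all
hypothetical names, and the honour_close switch exists only to compare
the two behaviours. It is a sketch of the suspected failure mode, not a
confirmed diagnosis.

```c
#include <errno.h>
#include <stdint.h>

enum { TOY_XPT_CLOSE = 1u << 0 };	/* hypothetical "please close" flag */

/* Hypothetical stand-in for a transport; not the kernel's svc_xprt. */
struct toy_xprt {
	unsigned int flags;
	int open;			/* 1 while the connection is alive */
	uint32_t reclen;		/* pending record length */
	unsigned int bad_reclen_logs;	/* counts "bad reclen" messages */
};

/* Receive path: on an oversized record, log, request close, retry later. */
static int toy_recvfrom(struct toy_xprt *x, uint32_t max_mesg)
{
	if (x->reclen > max_mesg) {
		x->bad_reclen_logs++;
		x->flags |= TOY_XPT_CLOSE;
		return -EAGAIN;
	}
	return (int)x->reclen;
}

/*
 * One pass of the service loop. If honour_close is 0, the close flag is
 * never acted on, so every pass hits the same bad record and logs again.
 */
static int toy_service_once(struct toy_xprt *x, uint32_t max_mesg,
			    int honour_close)
{
	if (!x->open)
		return -ENOTCONN;
	if (honour_close && (x->flags & TOY_XPT_CLOSE)) {
		x->open = 0;	/* the normal close path tears it down */
		return -ENOTCONN;
	}
	return toy_recvfrom(x, max_mesg);
}
```

In this model, repeated passes with honour_close set log the bad length
once and then close the transport, while passes with it cleared log on
every retry and never close, which would match the flood in dmesg.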


> 
> > > 
> > > This happens after a server reboot.  At this point, client(s) are trying
> > > to perform some NFS transaction and fail, and server starts generating
> > > the above messages - till I do a umount followed by mount on all clients.
> > > Before, such a situation (NFS server reboot) was handled transparently,
> > > i.e., there was nothing to do; the mount continued working just fine
> > > once the server came back online.
> > > 
> > > Now, I'm not sure whether it's really a 2.6.24-specific problem or a userspace
> > > problem.  Some time ago we also upgraded nfs-kernel-server (Debian)
> > > package, and the remount-after-nfs-server-reboot problem started to
> > > occur at THAT time (and it is something to worry about as well, I just
> > > had no time to deal with it); but the dmesg spamming only appeared
> > > with 2.6.24.
> > > 
> > > How can I debug the issue further from this point?
> > > 
> > 
> > 



Thread overview: 5+ messages
2008-02-13 14:02 2.6.24: RPC: bad TCP reclen 0x00020090 (large) Michael Tokarev
2008-02-18 12:58 ` Andrew Morton
2008-02-18 13:05   ` Michael Tokarev
2008-02-18 21:25   ` Tom Tucker
2008-02-18 22:00     ` Tom Tucker [this message]
