From: Arto Jantunen <viiru@debian.org>
To: linux-nfs@vger.kernel.org
Subject: Re: Timeout issue (similar to bugs 11061 and 11154), bisected
Date: Tue, 17 Feb 2009 12:38:55 +0200
Message-ID: <87prhh5ow0.fsf@viiru.iki.fi>
In-Reply-To: <1234789459.7708.47.camel-rJ7iovZKK19ZJLDQqaL3InhyD016LWXt@public.gmane.org> (sfid-20090216_151423_482241_46FCAF38) (Trond Myklebust's message of "Mon\, 16 Feb 2009 08\:04\:19 -0500")

Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> On Mon, 2009-02-16 at 13:11 +0200, Arto Jantunen wrote:
>> (I'm not subscribed, so please CC me on any replies)
>> 
>> I seem to have hit a NFS bug while upgrading a machine from Debian
>> Etch to Debian Lenny. I have a NFS server running FreeBSD 7.0 RC1 and
>> a bunch of clients running Linux. The ones running kernel 2.6.18 work
>> perfectly, as do the ones running 2.6.24. The one I upgraded to 2.6.26
>> fails. After 5-15 minutes of working normally the mount dies and I get
>> the usual "nfs: server <server> not responding, still trying" in
>> dmesg. The only way I have found to get the mount back is umount -f &&
>> mount, waiting does not bring it back.
>> 
>> I have tested quite a bunch of different kernel versions, and starting
>> from 25 and ending at the git tree last week they all fail in the same
>> way. Bisecting tracks the problem to commit
>> e06799f958bf7f9f8fae15f0c6f519953fb0257c
>> 
>> I originally thought that it was the same as bug 11154, but the
>> patches attached to that bug do not fix this issue.
>> 
>> Any thoughts, patches, ideas?
>
> That looks like the known problem with the NFS server failing to close
> connections in a timely manner. There is a fix for this in
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git&a=commitdiff&h=69b6ba3712b796a66595cfaf0a5ab4dfe1cf964a
>
> There is also a client side patch that increases the robustness of the
> client when it hits a buggy server, and that causes it to do the
> equivalent of a linger2 timeout. That patch is as of yet not merged into
> mainline, however I've attached it below together with a followup patch
> that makes the timeout configurable...

The client-side patch you attached hides the problem on the server:
after applying it, the mount sticks around. As previously discussed,
the server is running an apparently buggy version of FreeBSD, and I'd
rather not touch it right now since it is in production.

Thanks for your fast response.

-- 
Arto Jantunen

