public inbox for linux-nfs@vger.kernel.org
From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: "Ian Kent" <ikent@redhat.com>,
	"Carlos André" <candrecn@gmail.com>,
	"Linux NFSv4 mailing list" <nfsv4@linux-nfs.org>,
	"NFS list" <linux-nfs@vger.kernel.org>
Subject: Re: AutoFS+NFSv4 server down = LOOOOONG timeout.
Date: Thu, 27 Aug 2009 11:00:35 -0400
Message-ID: <1251385235.5173.13.camel@heimdal.trondhjem.org>
In-Reply-To: <EE444BE9-552C-48EA-9F4E-CB958A8270B0@oracle.com>

On Thu, 2009-08-27 at 10:54 -0400, Chuck Lever wrote:
> On Aug 27, 2009, at 10:52 AM, Trond Myklebust wrote:
> > On Thu, 2009-08-27 at 10:38 -0400, Chuck Lever wrote:
> >> On Aug 27, 2009, at 4:54 AM, Ian Kent wrote:
> >>> Ian Kent wrote:
> >>>> Carlos André wrote:
> >>>>> Hi Ian,
> >>>>>
> >>>>> Thanks for patch and sorry for delay (i'm expecting receive u
> >>>>> reply on
> >>>>> bug track, not here) :)
> >>>>>
> >>>>> But, this patch doesnt worked to me like expected...  :(
> >>>>>
> >>>>>
> >>>>> Firstly I've changed "#MOUNT_WAIT=-1" to "MOUNT_WAIT=10"
> >>>>> and later changed "10" to "2" with same results...
> >>>>> (always restarting service, of course :)
> >>>>>
> >>>>> Then, tried remove "sec=krb5p", and later removed "nfs4" but i got
> >>>>> same results again.
> >>>>>
> >>>>> Or i'm doing something wrong?
> >>>>>
> >>>>>
> >>>>> [root@KSTATION areas]# automount -V
> >>>>>
> >>>>> Linux automount version 5.0.1-0.rc2.131.bz517349.1
> >>>>> [...]
> >>>>>
> >>>>> [root@KSTATION areas]# time ls -la testdown
> >>>>> ls: testedown: No such file or directory
> >>>>>
> >>>>> real    3m9.006s
> >>>>> user    0m0.002s
> >>>>> sys     0m0.000s
> >>>>
> >>>> OK, that isn't behaving the way I expect, I'll have a look.
> >>>>
> >>>>>
> >>>>> LOGGING:
> >>>>> -----------------------------------------
> >>>>> Aug 24 09:23:51 KSTATION automount[20803]: mount_mount: mount(nfs):
> >>>>> calling mount -t nfs4 -s -o rw,acl,sec=krb5p 1.2.3.4:/areas/testdown
> >>>>> /misc/areas/testdown
> >>>>> Aug 24 09:27:00 KSTATION automount[20803]: mount(nfs): nfs: mount
> >>>>> failure 1.2.3.4:/areas/testdown on /misc/areas/testdown
> >>>>> Aug 24 09:27:00 KSTATION automount[20803]: ioctl_send_fail: token = 91
> >>>>> Aug 24 09:27:00 KSTATION automount[20803]: failed to mount
> >>>>> /misc/areas/testdown
> >>>>> -----------------------------------------
> >>>
> >>> Having a look at this I suspect the reason it doesn't work as expected
> >>> is the waitpid(2) we do after sending the TERM signal to the mount
> >>> process (which we have to do) is not returning. This is likely because
> >>> the mount process isn't giving up in a shorter time as it used to.
> >>
> >> You're thinking maybe mount(2) should be as interruptible as the
> >> socket calls that the mount command used to do?  That might be
> >> reasonable, and I can take a look at that.
> >
> > In recent kernels, all those RPC calls should be using TASK_KILLABLE
> > sleep states. SIGTERM should cause them to abort, provided that some
> > process isn't blocking it.
> >
> > Perhaps TASK_KILLABLE could be backported to RHEL-5?
>
> That's pretty extensive, with hooks in the page cache.  I doubt RH
> would go for that.

You don't have to add the hooks in the page cache in order to make mount
interruptible. You just need to replace the sigmask-manipulation in
net/sunrpc and fs/nfs (a.k.a. rpc_clnt_sigmask()/rpc_clnt_sigunmask())
with TASK_KILLABLE.

Alternatively, it might suffice to just turn on the 'intr' flag
temporarily while doing the mount path walk, and then switch it to
whatever default the user actually specified afterwards.
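
For illustration only, the user-space side of the problem discussed above
(send TERM to the mount process, then wait for it) can be sketched as
follows. This is a hypothetical stand-in, not automount's actual code: the
helper names stop_with_deadline() and spawn_stubborn_child() are invented
here, and the "mount" is simulated by a child that simply ignores SIGTERM.
The sketch bounds the wait and falls back to SIGKILL. Note the assumption
that the child can still be killed: a task stuck in an uninterruptible
kernel sleep would not respond to SIGKILL either, which is why the
kernel-side change matters.

```python
import signal
import subprocess
import sys
import time


def stop_with_deadline(proc, grace=2.0):
    """Send SIGTERM, wait at most `grace` seconds, then fall back to SIGKILL.

    Returns the number of the signal that ended the child, or 0 if the
    child exited normally.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        # Bounded wait instead of an open-ended waitpid(2).
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # The child did not react to SIGTERM in time; escalate.
        proc.kill()
        proc.wait()
    return -proc.returncode if proc.returncode < 0 else 0


def spawn_stubborn_child():
    """Hypothetical stand-in for a mount process that does not give up on TERM."""
    code = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", code], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # block until the child has installed SIG_IGN
    return proc


if __name__ == "__main__":
    child = spawn_stubborn_child()
    started = time.monotonic()
    sig = stop_with_deadline(child, grace=1.0)
    child.stdout.close()
    print(f"child ended by signal {sig} after {time.monotonic() - started:.1f}s")
```

With a child that honours SIGTERM, the bounded wait returns promptly and
no escalation happens, which is the behaviour one would hope for from an
interruptible mount.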

Trond


