From: Howard Wilkinson <howard@cohtech.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: autofs@linux.kernel.org,
	For users of Fedora Core releases <fedora-list@redhat.com>,
	nfsv4@linux-nfs.org
Subject: Re: Problem with mount.nfs4 on latest Fedora 10 updates
Date: Fri, 14 Aug 2009 08:20:22 +0100
Message-ID: <4A851036.5090202@cohtech.com>
In-Reply-To: <0DA8A730-698F-4A4F-9294-EBD9D09E3658@oracle.com>

Chuck Lever wrote:
>
> On Aug 13, 2009, at 12:50 PM, Howard Wilkinson wrote:
>
>> I have just upgraded a couple of servers from FC9 to FC10 and I am 
>> seeing a major problem with mount.nfs4. This occurs when autofs calls 
>> the mount program. It then runs at 100% CPU and never terminates.
>>
>> I have VMs that are running similar configuration successfully, so 
>> this is something driven by being on bare metal.
>>
>> Kernel is 2.6.27.29-170.2.78.fc10.i686.PAE
>> nfs-utils is nfs-utils-1.1.4-8.fc10.i386
>> autofs is autofs-5.0.3-41.i386
>>
>> Command running is
>>
>> /sbin/mount.nfs4 battleaxe:/ /hosts/battleaxe -s -o 
>> rw,nosuid,nodev,tcp,rsize=32768,wsize=32768,hard,intr
>>
>> The autofs mount has worked and the directories under 
>> /hosts/battleaxe have been successfully accessed prior to the problem 
>> occurring - I suspect this is a remount after an expire has occurred.
>>
>> Anybody seen this before?
>> Anybody know what I can do to get round this? [I am on the way to 
>> FC11 but will have to live with FC10 for a while (a week or so)]
>> Any extra information I can acquire to diagnose this?
>>
>> There is nothing in the log files to indicate anything going wrong, I 
>> could turn debug on if I knew what to set and which messages to strip 
>> once I do.
>
> You could start with "sudo rpcdebug -m nfs -s mount" and look in 
> /var/log/messages, or you can strace the running mount command.
>
> -- 
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com

The mount.nfs4 involvement is a red herring! It would seem that the 
problem is in the kernel, probably in the NFS4 code path. I have now 
seen bash, df, and cfagent all exhibit the same failure. The processes 
go to 100% CPU and hang, probably in a kernel thread. This happens some 
time after the kernel has booted, so it may still have something to do 
with autofs timing out the mount.

If I revert the kernel (and nothing else) to the latest FC9 version, then 
everything goes back to working as it did before.

Does anybody recognise these symptoms?

I am going to see if an strace will work, but once the system has failed 
it is difficult to get other processes to run to completion.

Howard.
