From: Gerd Bavendiek <gerd.bavendiek@googlemail.com>
To: nfs@lists.sourceforge.net
Subject: [NFS] 2.6.5-7.282 or 2.6.5-7.283: kernel: RPC: error 5 or nsm_mon_unmon: rpc failed, status=-13
Date: Tue, 20 Nov 2007 18:18:20 +0100
Message-ID: <474316DC.3080502@googlemail.com>

Hi,

Yes, this is old software from your point of view, but since I have
not been able to find any helpful information so far, I would like to
ask here.

We run SLES9 SP3 (i.e. 2.6.5-7.244) on many boxes and have started to
update some of these systems to 2.6.5-7.282 or 2.6.5-7.283. Since this
update we sometimes see errors in the RPC layer.

One example is this box with a very simple setup, nothing special:

ad-test:/root>>> grep -i rpc /var/log/messages
Aug 21 13:42:47 ad-test kernel: RPC: error 5 connecting to server localhost
Aug 21 13:42:47 ad-test kernel: RPC: failed to contact portmap (errno -5).
Aug 21 13:50:51 ad-test kernel: RPC: error 5 connecting to server localhost
Aug 21 13:50:51 ad-test kernel: RPC: failed to contact portmap (errno -5).
Aug 21 14:37:50 ad-test kernel: RPC: error 5 connecting to server localhost
Aug 21 14:37:50 ad-test kernel: RPC: failed to contact portmap (errno -5).
Aug 22 15:11:32 ad-test kernel: RPC: error 5 connecting to server localhost
Aug 22 15:11:32 ad-test kernel: RPC: failed to contact portmap (errno -5).
ad-test:/root>>>

Very interesting: RPC program 100024 (status) is missing.

ad-test:/root>>> rpcinfo -p
    program vers proto   port
     100000    2   tcp    111  portmapper
     100000    2   udp    111  portmapper
     100021    1   udp  32799  nlockmgr
     100021    3   udp  32799  nlockmgr
     100021    4   udp  32799  nlockmgr
     100021    1   tcp  33012  nlockmgr
     100021    3   tcp  33012  nlockmgr
     100021    4   tcp  33012  nlockmgr
ad-test:/root>>>

This is a 2.6.5-7.282 kernel, which has been up for 80 days.
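
A check for missing registrations like the one above could be scripted
across many boxes. The following is only a minimal sketch: the helper
name missing_programs and the REQUIRED table are hypothetical, and the
sample text is an abbreviated copy of the ad-test output shown above.

```python
# Hypothetical helper, not part of any NFS tooling: scan `rpcinfo -p` text
# for RPC programs that NFS locking is assumed to depend on.
REQUIRED = {100000: "portmapper", 100021: "nlockmgr", 100024: "status"}


def missing_programs(rpcinfo_text, required=REQUIRED):
    """Return {program_number: name} for required programs not registered."""
    seen = set()
    for line in rpcinfo_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric program number; the header does not.
        if fields and fields[0].isdigit():
            seen.add(int(fields[0]))
    return {num: name for num, name in required.items() if num not in seen}


# Abbreviated output from the ad-test box (one row per protocol kept).
AD_TEST = """\
    program vers proto   port
     100000    2   tcp    111  portmapper
     100000    2   udp    111  portmapper
     100021    1   udp  32799  nlockmgr
     100021    1   tcp  33012  nlockmgr
"""

print(missing_programs(AD_TEST))  # prints {100024: 'status'}
```

In practice the text would come from running rpcinfo -p on each host
rather than from an embedded string.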

Another example, again very simple setup, nothing special:

Aug 15 16:40:22 polyxena kernel: RPC: error 5 connecting to server localhost
Aug 15 16:40:22 polyxena kernel: RPC: failed to contact portmap (errno -5).
Aug 15 16:40:22 polyxena kernel: RPC: error 5 connecting to server localhost
Aug 15 16:40:22 polyxena kernel: RPC: failed to contact portmap (errno -5).

This is a 2.6.5-7.283-smp kernel, uptime 201 days. On this one the
rpcinfo output is fine:

polyxena:/root>>> rpcinfo -p
    program vers proto   port
     100000    2   tcp    111  portmapper
     100000    2   udp    111  portmapper
     100024    1   udp  34281  status
     100021    1   udp  34281  nlockmgr
     100021    3   udp  34281  nlockmgr
     100021    4   udp  34281  nlockmgr
     100024    1   tcp  43627  status
     100021    1   tcp  43627  nlockmgr
     100021    3   tcp  43627  nlockmgr
     100021    4   tcp  43627  nlockmgr
polyxena:/root>>>

We have never seen these errors with 282 or 283 on x86_64. We have never 
seen them with 2.6.5-7.244.

On machines that boot via PXE and have their root file system on
NetApp filers, we see either the error described above _OR_ a second
one:

Sep 16 04:46:57 c02ptec kernel: nsm_mon_unmon: rpc failed, status=-13
Sep 16 04:46:57 c02ptec kernel: lockd: cannot unmonitor 10.172.207.7

This may happen 10 minutes after reboot or after 60 days uptime.

Sep 16 04:48:07 c02ptec kernel: nsm_mon_unmon: rpc failed, status=-13
Sep 16 04:48:07 c02ptec kernel: lockd: cannot monitor 10.172.207.7
Sep 16 04:48:07 c02ptec kernel: lockd: failed to monitor 10.172.207.7
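
The numbers in these messages look like errno values: on Linux, 5 is
EIO and 13 is EACCES. The sketch below decodes them, assuming the
kernel reports the RPC status as a negative errno; the helper name
describe_status is hypothetical.

```python
import errno
import os


def describe_status(status):
    """Map a kernel-style status (assumed negative errno) to (symbol, text)."""
    code = abs(status)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# -5 appears in the portmap messages, -13 in the nsm_mon_unmon messages.
for status in (-5, -13):
    symbol, text = describe_status(status)
    print(status, symbol, text)
```

If that assumption holds, the portmap failures are I/O errors while the
nsm_mon_unmon failures are permission errors, which would suggest two
different failure paths.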

Luckily the application is still alive, so I still have this system
running:

c02ptec:/var/log>>> uptime
   6:04pm  up 65 days 14:22,  0 users,  load average: 0.06, 0.06, 0.07

even though all NFS kernel threads are gone:

c02ptec:/var/log>>> rpcinfo -p
    program vers proto   port
     100000    2   tcp    111  portmapper
     100000    2   udp    111  portmapper
  100033058    1   tcp  39722
  100033057    1   tcp  39737
c02ptec:/var/log>>>

Are there any known issues?

What can I do to provide more information?

Thanks!

Gerd
