From: Carsten Aulbert <carsten.aulbert-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
To: nfs@lists.sourceforge.net
Subject: [NFS] How to set-up a Linux NFS server to handle massive number of requests
Date: Thu, 10 Apr 2008 14:12:58 +0200
Message-ID: <47FE044A.7020008@aei.mpg.de>
Hi all,
We have a pretty extreme problem here, and I am trying to figure out how
to get it done right.
We have a large cluster of 1340 compute nodes, each with an automount
directory that triggers an NFS mount (read-only) when accessed:
$ ypcat auto.data
-fstype=nfs,nfsvers=3,hard,intr,rsize=8192,wsize=8192,tcp &:/data
$ grep auto.data /etc/auto.master
/atlas/data yp:auto.data --timeout=5
So far so good.
When I submit 1000 jobs that each just run md5sum on the very same file
from a single data server, I see very weird effects.
In the standard set-up, many connections reach the server (TCP connection
state SYN_RECV), but these fail after some time and stay in CLOSE_WAIT
state until I restart nfs-kernel-server. Typically it looks like this
(netstat -an):
tcp 0 0 10.20.10.14:687 10.10.2.87:799 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.4.1:823 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.65:656 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.30:650 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.0.71:789 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.4:602 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.1:967 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.3.66:915 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.0.55:620 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.41:835 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.2.29:958 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.12:998 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.30:651 SYN_RECV
tcp 0 0 10.20.10.14:687 10.10.1.4:601 SYN_RECV
tcp 0 0 10.20.10.14:2049 10.10.1.19:846 ESTABLISHED
tcp 45 0 10.20.10.14:687 10.10.0.68:979 CLOSE_WAIT
tcp 45 0 10.20.10.14:687 10.10.3.83:680 CLOSE_WAIT
tcp 89 0 10.20.10.14:687 10.10.0.79:604 CLOSE_WAIT
tcp 0 0 10.20.10.14:2049 10.10.2.6:676 ESTABLISHED
tcp 45 0 10.20.10.14:687 10.10.2.56:913 CLOSE_WAIT
tcp 45 0 10.20.10.14:687 10.10.0.60:827 CLOSE_WAIT
tcp 0 0 10.20.10.14:2049 10.10.3.55:778 ESTABLISHED
tcp 45 0 10.20.10.14:687 10.10.2.86:981 CLOSE_WAIT
tcp 45 0 10.20.10.14:687 10.10.9.13:792 CLOSE_WAIT
tcp 89 0 10.20.10.14:687 10.10.2.93:728 CLOSE_WAIT
tcp 45 0 10.20.10.14:687 10.10.0.20:742 CLOSE_WAIT
tcp 45 0 10.20.10.14:687 10.10.3.44:982 CLOSE_WAIT
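A listing like this is easier to compare between runs when it is tallied by local port and connection state. What follows is only a minimal Python sketch, assuming one connection per line in the column order shown above (protocol, Recv-Q, Send-Q, local address, foreign address, state). The function name tally_states and the four sample lines are illustrative and not part of the actual set-up:

```python
from collections import Counter


def tally_states(netstat_text):
    """Count TCP connections per (local port, state) in `netstat -an` style text."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: proto recv-q send-q local-address foreign-address state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers, UDP/UNIX sockets, and wrapped lines
        # The port is whatever follows the last ':' of the local address.
        local_port = fields[3].rsplit(":", 1)[-1]
        counts[(local_port, fields[5])] += 1
    return counts


# Illustrative input only: four lines taken from the listing above.
sample = """\
tcp 0 0 10.20.10.14:687 10.10.2.87:799 SYN_RECV
tcp 0 0 10.20.10.14:2049 10.10.1.19:846 ESTABLISHED
tcp 45 0 10.20.10.14:687 10.10.0.68:979 CLOSE_WAIT
tcp 89 0 10.20.10.14:687 10.10.0.79:604 CLOSE_WAIT
"""

for (port, state), n in sorted(tally_states(sample).items()):
    print(port, state, n)
```

Feeding it the full output of netstat -an after each test run would show how many connections on each port end up in SYN_RECV or CLOSE_WAIT for a given combination of thread counts.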
I played with different numbers of nfsd threads (ranging from 8 to 1024)
and increased the number of threads for rpc.mountd from 1 to 64, in quite
a few combinations, but so far I have not found a consistent set of
parameters with which 1000 nodes are able to read this file at the same
time.
Any ideas from anyone, or do you need more input from me?
TIA
Carsten
PS: Please Cc me, I'm not yet subscribed.