[NFS] How to set-up a Linux NFS server to handle massive number of requests

* [NFS] How to set-up a Linux NFS server to handle massive number of requests
@ 2008-04-10 12:12 Carsten Aulbert
       [not found] ` <47FE044A.7020008-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Carsten Aulbert @ 2008-04-10 12:12 UTC
  To: nfs

Hi all,

we have a pretty extreme problem here and I am trying to figure out how to 
get it done right.

We have a large cluster consisting of 1340 compute nodes that have an 
automount directory, which subsequently triggers an NFS mount (read-only):

$ ypcat auto.data
-fstype=nfs,nfsvers=3,hard,intr,rsize=8192,wsize=8192,tcp       &:/data

$ grep auto.data /etc/auto.master
/atlas/data          yp:auto.data      --timeout=5

So far so good.

When submitting 1000 jobs that each just run md5sum on the very same file 
from one single data server, I see very weird effects.

In the standard set-up many connections get into the box (TCP connection 
status SYN_RECV), but those fall over after some time and stay in 
CLOSE_WAIT state until I restart the nfs-kernel-server. Typically that 
looks like this (netstat -an):

tcp        0      0 10.20.10.14:687         10.10.2.87:799          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.4.1:823           SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.65:656          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.30:650          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.0.71:789          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.4:602           SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.1:967           SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.3.66:915          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.0.55:620          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.41:835          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.2.29:958          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.12:998          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.30:651          SYN_RECV
tcp        0      0 10.20.10.14:687         10.10.1.4:601           SYN_RECV
tcp        0      0 10.20.10.14:2049        10.10.1.19:846          ESTABLISHED
tcp       45      0 10.20.10.14:687         10.10.0.68:979          CLOSE_WAIT
tcp       45      0 10.20.10.14:687         10.10.3.83:680          CLOSE_WAIT
tcp       89      0 10.20.10.14:687         10.10.0.79:604          CLOSE_WAIT
tcp        0      0 10.20.10.14:2049        10.10.2.6:676           ESTABLISHED
tcp       45      0 10.20.10.14:687         10.10.2.56:913          CLOSE_WAIT
tcp       45      0 10.20.10.14:687         10.10.0.60:827          CLOSE_WAIT
tcp        0      0 10.20.10.14:2049        10.10.3.55:778          ESTABLISHED
tcp       45      0 10.20.10.14:687         10.10.2.86:981          CLOSE_WAIT
tcp       45      0 10.20.10.14:687         10.10.9.13:792          CLOSE_WAIT
tcp       89      0 10.20.10.14:687         10.10.2.93:728          CLOSE_WAIT
tcp       45      0 10.20.10.14:687         10.10.0.20:742          CLOSE_WAIT
tcp       45      0 10.20.10.14:687         10.10.3.44:982          CLOSE_WAIT


I played with different numbers of nfsd threads (ranging from 8 to 1024) 
and increased the number of threads for rpc.mountd from 1 to 64, in quite a 
few combinations, but so far I have not found a consistent set of 
parameters where 1000 nodes are able to read this file at the same time.

Any ideas from anyone or do you need more input from me?

TIA

Carsten

PS: Please Cc me, I'm not yet subscribed.

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
_______________________________________________
Please note that nfs@lists.sourceforge.net is being discontinued.
Please subscribe to linux-nfs@vger.kernel.org instead.
    http://vger.kernel.org/vger-lists.html#linux-nfs



* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found] ` <47FE044A.7020008-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
@ 2008-04-11 23:07   ` J. Bruce Fields
  2008-04-12  6:45     ` Carsten Aulbert
  2008-04-15  4:48     ` Tom Tucker
  0 siblings, 2 replies; 16+ messages in thread
From: J. Bruce Fields @ 2008-04-11 23:07 UTC
  To: Carsten Aulbert; +Cc: nfs

On Thu, Apr 10, 2008 at 02:12:58PM +0200, Carsten Aulbert wrote:
> Hi all,
> 
> we have a pretty extreme problem here and I try to figure out how to get 
> it done right.
> 
> We have a large cluster consisting of 1340 compute nodes that have an 
> automount directory, which subsequently triggers an NFS mount (read-only):
> 
> $ ypcat auto.data
> -fstype=nfs,nfsvers=3,hard,intr,rsize=8192,wsize=8192,tcp       &:/data
> 
> $ grep auto.data /etc/auto.master
> /atlas/data          yp:auto.data      --timeout=5
> 
> So far so good.
> 
> When submitting 1000 jobs just doing an md5sum of the very same file from 
> one single data server, I see very weird effects.
> 
> In the standard set-up many connections get into the box (tcp connection 
> status SYN_RECV) but those fall over after some time and stay in 
> CLOSE_WAIT state until I restart the nfs-kernel-server. Typically that 
> looks like (netstat -an):

That's interesting!  But I'm not sure how to figure this out.

Is it possible to get a network trace that shows what's going on?

What happens on the clients?

What kernel version are you using?

--b.



* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
  2008-04-11 23:07   ` J. Bruce Fields
@ 2008-04-12  6:45     ` Carsten Aulbert
       [not found]       ` <48005A78.9090609-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
  2008-04-15  4:48     ` Tom Tucker
  1 sibling, 1 reply; 16+ messages in thread
From: Carsten Aulbert @ 2008-04-12  6:45 UTC
  To: J. Bruce Fields; +Cc: nfs

Hi,

J. Bruce Fields wrote:
>> In the standard set-up many connections get into the box (tcp connection 
>> status SYN_RECV) but those fall over after some time and stay in 
>> CLOSE_WAIT state until I restart the nfs-kernel-server. Typically that 
>> looks like (netstat -an):
> 
> That's interesting!  But I'm not sure how to figure this out.
> 
> Is it possible to get a network trace that shows what's going on?
>

In principle yes, but
(1) it's huge. I only get this when doing this with 500-1000 clients 
starting at about the same time
(2) It seems that I don't get a full trace, i.e. the sessions seem to be 
incomplete - sometimes I only see a single packet with FIN set. I tried 
doing this both with wireshark running locally and with ntap's capturing 
device.

> What happens on the clients?
> 
In the logs (/var/log/daemon.log) I only see that the mount request 
fails in different ways.

Apr  9 12:07:55 n0078 automount[26838]: >> mount: RPC: Timed out
Apr  9 12:07:55 n0078 automount[26838]: mount(nfs): nfs: mount failure d14:/data on /atlas/data/d14
Apr  9 12:07:55 n0078 automount[26838]: failed to mount /atlas/data/d14
Apr  9 12:18:56 n0078 automount[27977]: >> mount: RPC: Remote system error - Connection timed out
Apr  9 12:18:56 n0078 automount[27977]: mount(nfs): nfs: mount failure d14:/data on /atlas/data/d14

I have not yet run tshark in the background on many nodes to see if I 
can capture the client's view. Would that be beneficial?

> What kernel version are you using?--b.

2.6.24.4 on Debian Etch

Right now, it seems that running 196 nfsd plus 64 threads for mountd 
solves the problem for the time being. Although it would be nice to 
understand these "magic" numbers ;)

Thanks!

Carsten


* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]       ` <48005A78.9090609-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
@ 2008-04-14 17:06         ` J. Bruce Fields
  0 siblings, 0 replies; 16+ messages in thread
From: J. Bruce Fields @ 2008-04-14 17:06 UTC
  To: Carsten Aulbert; +Cc: nfs

On Sat, Apr 12, 2008 at 08:45:12AM +0200, Carsten Aulbert wrote:
> Hi,
>
> J. Bruce Fields wrote:
>>> In the standard set-up many connections get into the box (tcp 
>>> connection status SYN_RECV) but those fall over after some time and 
>>> stay in CLOSE_WAIT state until I restart the nfs-kernel-server. 
>>> Typically that looks like (netstat -an):
>>
>> That's interesting!  But I'm not sure how to figure this out.
>>
>> Is it possible to get a network trace that shows what's going on?
>>
>
> In principle yes, but
> (1) it's huge. I only get this when doing this with 500-1000 clients  
> starting at about the same time
> (2) It seems that I don't get a full trace, i.e. the session seem to be  
> incomplete - sometimes I only see a single packet with FIN set. I tried  
> doing this both with wireshark running locally and with ntap's capturing  
> device.

Yeah, that's not surprising.  You'd probably want to dedicate a machine
to doing the capture, and then I'm not sure what kind of hardware you'd
need for a given network to get everything.  Probably it's not worth it.

>> What happens on the clients?
>>
> In the logs (/var/log/daemon.log) I only see that the mount request  
> fails in different ways.
>
> Apr  9 12:07:55 n0078 automount[26838]: >> mount: RPC: Timed out
> Apr  9 12:07:55 n0078 automount[26838]: mount(nfs): nfs: mount failure d14:/data on /atlas/data/d14
> Apr  9 12:07:55 n0078 automount[26838]: failed to mount /atlas/data/d14
> Apr  9 12:18:56 n0078 automount[27977]: >> mount: RPC: Remote system error - Connection timed out
> Apr  9 12:18:56 n0078 automount[27977]: mount(nfs): nfs: mount failure d14:/data on /atlas/data/d14
>
> I have not yet run tshark in the background on many nodes to see if I  
> can capture the client's view. Would that be beneficial?

Couldn't hurt.

Hauling out TCP/IP Illustrated and refreshing my memory of the TCP state
transition diagram....  So if the server has a lot of connections stuck
in CLOSE_WAIT, that means it got FINs from the clients (perhaps after
they timed out), but never shut down its side of the connection.  Sounds
like a bug in some server-side rpc code.  (Hm.  But all those SYN_RECVs
are the server waiting for a client to ACK its SYN-ACK.  Why are there
so many of those?)

Those connections are actually to port 687, which I assume is mountd
(what does rpcinfo -p say?).  (And probably if you just killed and
restarted mountd, instead of doing a complete
"/etc/init.d/nfs-kernel-server restart", that'd also clear those out.)
In fact, in the example you gave only three out of about 27 connections
(the only ESTABLISHED connections) were to port 2049 (nfsd itself).

So it looks like it's mountd that's not keeping up (and that's leaving
connections sitting around too long), and the mountd processes are
probably what we should be debugging.
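
The per-port breakdown above can be tallied mechanically rather than by
eye.  What follows is only a sketch: the function name
count_by_port_state is made up for illustration, and it assumes
"netstat -ant"-style input with one connection per line, the local
address in field 4 and the state in field 6.

```shell
# Hypothetical helper (not part of nfs-utils): count TCP sockets per local
# port and state. Assumes one connection per line, with the local address
# in field 4 and the state in field 6, as printed by `netstat -ant`.
count_by_port_state() {
    awk '$1 ~ /^tcp/ && NF >= 6 {
             n = split($4, part, ":")      # the port is the last ":"-separated piece
             count[part[n] " " $6]++
         }
         END { for (key in count) print count[key], key }' |
        sort -k2,2n -k3,3
}

# Usage on the server:
#   netstat -ant | count_by_port_state
```

Each output line is "<count> <port> <state>".  A port 687 row dominated
by SYN_RECV and CLOSE_WAIT, next to a port 2049 row that shows only
ESTABLISHED, would match the reading that mountd rather than nfsd is
falling behind; rpcinfo -p is still needed to confirm which service
owns the port.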

>> What kernel version are you using?--b.
>
> 2.6.24.4 on Debian Etch
>
> Right now, it seems that running 196 nfsd plus 64 threads for mountd  
> solves the problem for the time being. Although it would be nice to  
> understand these "magic" numbers ;)

Yes, definitely.  I'm surprised the number of nfsd threads matters much
at all, actually, if mountd is the bottleneck.

--b.


* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
  2008-04-11 23:07   ` J. Bruce Fields
  2008-04-12  6:45     ` Carsten Aulbert
@ 2008-04-15  4:48     ` Tom Tucker
       [not found]       ` <1208234913.17169.50.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
  1 sibling, 1 reply; 16+ messages in thread
From: Tom Tucker @ 2008-04-15  4:48 UTC
  To: J. Bruce Fields; +Cc: nfs, Carsten Aulbert


Maybe this is a TCP_BACKLOG issue?

BTW, with that many mounts, won't you run out of "secure" ports (< 1024)?
If so, you'll need to use 'insecure' as a mount option.





* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]       ` <1208234913.17169.50.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
@ 2008-04-15  5:42         ` Carsten Aulbert
       [not found]           ` <48044055.2060500-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
  2008-04-15 15:12         ` J. Bruce Fields
  1 sibling, 1 reply; 16+ messages in thread
From: Carsten Aulbert @ 2008-04-15  5:42 UTC
  To: Tom Tucker; +Cc: J. Bruce Fields, nfs

Tom Tucker wrote:
> Maybe this is a TCP_BACKLOG issue?
> 

Hmm, Google does not yield much information about this. I think I know 
what that would be; is there a cure or some kernel switch for tuning that?

> BTW, with that many mounts won't you run out of "secure" ports (< 1024),
> so you'll need to use 'insecure' as a mount option.

Not to my knowledge. All connections go to a single port on the server 
box (well, one port per service).  Only the clients may run out of 
privileged ports if they do too much mounting, but mostly this option is 
just for "security" reasons. At least that's my understanding.

By the way, while discussing this issue with my fellow cluster admins, the 
question came up whether there is a guideline/rule of thumb for how many 
nfsd one should run - or, asking the other way round, how to arrive at a 
good compromise.

Our server boxes are pretty big (8 cores, 16 GB memory, 16 disk 
Areca1261 RAID6), so the resources used by the nfsd are not much of an 
issue - I even tested with 1024 nfsd idling around. At some point 
increasing the number does not make much sense because I cannot get the 
data out fast enough or the seeks will likely "kill" the box^Wperformance.

Any thoughts on that?

Cheers

Carsten


* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]           ` <48044055.2060500-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
@ 2008-04-15 13:58             ` J. Bruce Fields
  2008-04-16  2:49             ` Tom Tucker
  1 sibling, 0 replies; 16+ messages in thread
From: J. Bruce Fields @ 2008-04-15 13:58 UTC
  To: Carsten Aulbert; +Cc: nfs

On Tue, Apr 15, 2008 at 07:42:45AM +0200, Carsten Aulbert wrote:
> By the way, discussing this issue with my colleague cluster admins, the 
> question popped up, if there is a guideline/rule of thumb of how many 
> nfsd one should run - or asking the other way round, how to arrive at a 
> good compromise.
> 
> Our server boxes are pretty big (8 cores, 16 GB memory, 16 disk 
> Areca1261 RAID6), so the resources used by the nfsd are not much of an 
> issue - I even tested with 1024 nfsd idling around. At some point 
> increasing the number does not make much sense because I cannot get the 
> data out fast enough or the seeks will likely "kill" the box^Wperformance.
> 
> Any thoughts on that?

The only advice I know of is to check the "th" line in
/proc/net/rpc/nfsd and adjust the number of threads until you can verify
that they're rarely all in use; see

	http://nfs.sourceforge.net/nfs-howto/ar01s05.html#nfsd_daemon_instances
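
For what it's worth, a minimal sketch of reading that line.  The helper
name nfsd_thread_usage is hypothetical, and the field layout is an
assumption based on 2.6-era kernels (th, then the thread count, then the
number of times all threads were busy, then a usage histogram); check it
against your own kernel before trusting the numbers.

```shell
# Hypothetical helper: summarise the "th" line of /proc/net/rpc/nfsd.
# Assumed layout: th <threads> <times-all-threads-busy> <histogram...>
nfsd_thread_usage() {
    awk '$1 == "th" { printf "threads=%d all_busy=%d\n", $2, $3 }' \
        "${1:-/proc/net/rpc/nfsd}"
}

# Usage on the server (sample twice, some minutes apart, under load):
#   nfsd_thread_usage
```

If all_busy keeps climbing between two samples, the threads are
regularly all in use and adding more is probably worthwhile; if it stays
flat, extra threads are unlikely to help.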

--b.


* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]           ` <48044055.2060500-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
  2008-04-15 13:58             ` J. Bruce Fields
@ 2008-04-16  2:49             ` Tom Tucker
  1 sibling, 0 replies; 16+ messages in thread
From: Tom Tucker @ 2008-04-16  2:49 UTC
  To: Carsten Aulbert; +Cc: J. Bruce Fields, nfs


On Tue, 2008-04-15 at 07:42 +0200, Carsten Aulbert wrote:
> 
> Tom Tucker wrote:
> > Maybe this is a TCP_BACKLOG issue?
> > 
> 
> Hmm, Google does not yield much information about this. I think I know 
> what that would be, is there a cure or some kernel switches for tuning that?
> 
> > BTW, with that many mounts won't you run out of "secure" ports (< 1024),
> > so you'll need to use 'insecure' as a mount option.

> Not to my knowledge. All connections go to a single port onto the server 
> box (well, one port per service).  Only the clients may run out of 
> privileged ports if they do too much mounting, but mostly this option is 
> just for "security" reasons. At least that's my understanding.

Yes, you're right...I was being dumb here. Sorry. 




* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]       ` <1208234913.17169.50.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
  2008-04-15  5:42         ` Carsten Aulbert
@ 2008-04-15 15:12         ` J. Bruce Fields
  2008-04-16  2:43           ` Tom Tucker
  1 sibling, 1 reply; 16+ messages in thread
From: J. Bruce Fields @ 2008-04-15 15:12 UTC (permalink / raw)
  To: Tom Tucker; +Cc: nfs, Carsten Aulbert

On Mon, Apr 14, 2008 at 11:48:33PM -0500, Tom Tucker wrote:
> 
> Maybe this is a TCP_BACKLOG issue?

So, looking around.... There seems to be a global limit in
/proc/sys/net/ipv4/tcp_max_syn_backlog (default 1024?); might be worth
seeing what happens if that's increased, e.g., with

	echo 2048 >/proc/sys/net/ipv4/tcp_max_syn_backlog

Though each client does have to make more than one tcp connection, I
wouldn't expect it to be making more than one at a time, so with 1340
clients, and assuming the requests are spread out at least a tiny bit, I
would have thought 1024 would be enough.

Oh, but: Grepping the glibc rpc code, it looks like it calls listen with
second argument SOMAXCONN == 128.  You can confirm that by strace'ing
rpc.mountd -F and looking for the listen call.

And that socket's shared between all the mountd processes, so I guess
that's the real limit.  I don't see an easy way to adjust that.  You'd
also need to increase /proc/sys/net/core/somaxconn first.

But none of this explains why we'd see connections stuck in CLOSE_WAIT
indefinitely?
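
For anyone following along, here is a rough sketch of how the two limits mentioned above could be inspected on the server. The helper names show_limit and effective_backlog are made up for illustration, and the value 1024 in the comments is only an example, not a recommendation. The second helper reflects that the kernel caps the backlog passed to listen() at net.core.somaxconn.

```shell
#!/bin/sh
# Print the current value of a sysctl file, or "n/a" if it is not readable
# (for example on a non-Linux machine).
show_limit() {
    if [ -r "$1" ]; then
        printf '%s = %s\n' "$1" "$(cat "$1")"
    else
        printf '%s = n/a\n' "$1"
    fi
}

# The backlog requested via listen() is capped at net.core.somaxconn,
# so the effective accept backlog is the smaller of the two values.
effective_backlog() {
    requested=$1
    somaxconn=$2
    if [ "$requested" -lt "$somaxconn" ]; then
        echo "$requested"
    else
        echo "$somaxconn"
    fi
}

show_limit /proc/sys/net/ipv4/tcp_max_syn_backlog
show_limit /proc/sys/net/core/somaxconn

# Raising the limits needs root; shown as comments only (example values):
#   echo 2048 >/proc/sys/net/ipv4/tcp_max_syn_backlog
#   echo 1024 >/proc/sys/net/core/somaxconn
```

With a listen() argument of 128, raising somaxconn alone would not help, since effective_backlog 128 1024 still yields 128.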

--b.



* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
  2008-04-15 15:12         ` J. Bruce Fields
@ 2008-04-16  2:43           ` Tom Tucker
       [not found]             ` <1208313790.3521.32.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Tucker @ 2008-04-16  2:43 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: nfs, Carsten Aulbert


On Tue, 2008-04-15 at 11:12 -0400, J. Bruce Fields wrote:
> On Mon, Apr 14, 2008 at 11:48:33PM -0500, Tom Tucker wrote:
> > 
> > Maybe this is a TCP_BACKLOG issue?
> 
> So, looking around.... There seems to be a global limit in
> /proc/sys/net/ipv4/tcp_max_syn_backlog (default 1024?); might be worth
> seeing what happens if that's increased, e.g., with
> 
> 	echo 2048 >/proc/sys/net/ipv4/tcp_max_syn_backlog

I think this represents the collective total for all listening
endpoints. I think we're only talking about mountd. 

Shooting from the hip...

My gray-haired recollection is that the default backlog for a single
listening socket is 10 (SYNs received, not accepted connections). Additional
SYNs received on this endpoint will be dropped, and clients will retry the
SYN as part of normal TCP retransmit...

It might be that the CLOSE_WAITs in the log are _normal_. That is, they
reflect completed mount requests that are in the normal close path. If
they never go away, then that's not normal. Is this the case?

Suppose the 10 is roughly correct. The remaining "jilted" clients will
retransmit their SYN after a randomized exponential backoff. I think you
can imagine that trying 1300+ connections of which only 10 succeed and
then retrying 1300-10 based on a randomized exponential backoff might
get you some pretty bad performance.
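
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a deliberately crude model: all clients send their SYN at the same moment, the server admits one full backlog per wave, and everyone else has to retransmit. The helper name retry_waves is hypothetical; 1340 is the node count from the original post, and 10 and 128 are the two backlog figures mentioned in this thread.

```shell
#!/bin/sh
# Number of connection "waves" needed if each wave admits at most
# `backlog` clients and all remaining clients must retransmit their SYN.
# Uses integer ceiling division: (clients + backlog - 1) / backlog.
retry_waves() {
    clients=$1
    backlog=$2
    echo $(( ($clients + $backlog - 1) / $backlog ))
}

echo "backlog 10:  $(retry_waves 1340 10) waves"
echo "backlog 128: $(retry_waves 1340 128) waves"
```

Under this model a backlog of 10 needs 134 waves and a backlog of 128 needs 11. Real TCP backoff spreads the retries out, so this only indicates the order of magnitude.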

Just a thought -- 




[parent not found: <1208313790.3521.32.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>]

* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]             ` <1208313790.3521.32.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
@ 2008-04-16  2:58               ` J. Bruce Fields
  2008-04-16  3:22                 ` Tom Tucker
  0 siblings, 1 reply; 16+ messages in thread
From: J. Bruce Fields @ 2008-04-16  2:58 UTC (permalink / raw)
  To: Tom Tucker; +Cc: nfs, Carsten Aulbert

On Tue, Apr 15, 2008 at 09:43:10PM -0500, Tom Tucker wrote:
> 
> On Tue, 2008-04-15 at 11:12 -0400, J. Bruce Fields wrote:
> > On Mon, Apr 14, 2008 at 11:48:33PM -0500, Tom Tucker wrote:
> > > 
> > > Maybe this is a TCP_BACKLOG issue?
> > 
> > So, looking around.... There seems to be a global limit in
> > /proc/sys/net/ipv4/tcp_max_syn_backlog (default 1024?); might be worth
> > seeing what happens if that's increased, e.g., with
> > 
> > 	echo 2048 >/proc/sys/net/ipv4/tcp_max_syn_backlog
> 
> I think this represents the collective total for all listening
> endpoints. I think we're only talking about mountd. 

Yes.

> Shooting from the hip...
> 
> My gray-haired recollection is that the default backlog for a single
> listening socket is 10 (SYNs received, not accepted connections). Additional
> SYNs received on this endpoint will be dropped, and clients will retry the
> SYN as part of normal TCP retransmit...
> 
> It might be that the CLOSE_WAIT's in the log are _normal_. That is, they
> reflect completed mount requests that are in the normal close path. If
> they never go away, then that's not normal. Is this the case?

What he said was:

	"those fall over after some time and stay in CLOSE_WAIT state
	until I restart the nfs-kernel-server."

Carsten, are you positive that the same sockets were in CLOSE_WAIT the
whole time you were watching?  And how long was it before you gave up
and restarted?

> Suppose the 10 is roughly correct. The remaining "jilted" clients will
> retransmit their SYN after a randomized exponential backoff. I think you
> can imagine that trying 1300+ connections of which only 10 succeed and
> then retrying 1300-10 based on a randomized exponential backoff might
> get you some pretty bad performance.

Right, could be, but:

...
> > Oh, but: Grepping the glibc rpc code, it looks like it calls listen with
> > second argument SOMAXCONN == 128.  You can confirm that by strace'ing
> > rpc.mountd -F and looking for the listen call.
> > 
> > And that socket's shared between all the mountd processes, so I guess
> > that's the real limit.  I don't see an easy way to adjust that.  You'd
> > also need to increase /proc/sys/net/core/somaxconn first.
> > 
> > But none of this explains why we'd see connections stuck in CLOSE_WAIT
> > indefinitely?

So the limit appears to be more like 128, and (based on my quick look at
the code) that appears to be baked into the glibc rpc code.

Maybe you could code around that in mountd.  Looks like the relevant
code is in nfs-utils/support/nfs/rpcmisc.c:rpc_init().
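
A possible way to double-check the 128 figure on a live system is to pull the backlog argument out of strace output. This is only a sketch: the helper name listen_backlog and the sample line (including file descriptor 7) are made up for illustration, and the strace invocation in the comment assumes the regular mountd has been stopped first.

```shell
#!/bin/sh
# Extract the backlog argument from one line of strace output such as
#   listen(7, 128) = 0
# Prints nothing if the line is not a listen() call.
listen_backlog() {
    printf '%s\n' "$1" |
        sed -n 's/^.*listen([0-9][0-9]*, *\([0-9][0-9]*\)).*$/\1/p'
}

# Intended use, as root (not run here):
#   strace -f -e trace=listen rpc.mountd -F 2>&1 | while read -r line; do
#       listen_backlog "$line"
#   done

# Illustrative sample line only:
listen_backlog 'listen(7, 128) = 0'
```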

--b.


* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
  2008-04-16  2:58               ` J. Bruce Fields
@ 2008-04-16  3:22                 ` Tom Tucker
       [not found]                   ` <1208316166.3521.42.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Tom Tucker @ 2008-04-16  3:22 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: nfs, Carsten Aulbert


On Tue, 2008-04-15 at 22:58 -0400, J. Bruce Fields wrote:
> So the limit appears to be more like 128, and (based on my quick look at
> the code) that appears to be baked into the glibc rpc code.
> 
> Maybe you could code around that in mountd.  Looks like the relevant
> code is in nfs-utils/support/nfs/rpcmisc.c:rpc_init().

If you really need to start 1300 mounts all at once then something needs
to change. BTW even after you get past mountd, the server is going to
get pounded with SYN and RPC_NOP. 

It might be interesting to look at httpd (Apache) to see what it does. I
would think it faces similar traffic flows.




[parent not found: <1208316166.3521.42.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>]

* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]                   ` <1208316166.3521.42.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
@ 2008-04-16 13:45                     ` Chuck Lever
  2008-04-16 14:35                       ` Carsten Aulbert
  2008-05-01 19:47                     ` Dean Hildebrand
  1 sibling, 1 reply; 16+ messages in thread
From: Chuck Lever @ 2008-04-16 13:45 UTC (permalink / raw)
  To: J. Bruce Fields, Tom Tucker, Carsten Aulbert; +Cc: nfs

On Apr 15, 2008, at 11:22 PM, Tom Tucker wrote:
> On Tue, 2008-04-15 at 22:58 -0400, J. Bruce Fields wrote:
>> On Tue, Apr 15, 2008 at 09:43:10PM -0500, Tom Tucker wrote:
>>>
>>> On Tue, 2008-04-15 at 11:12 -0400, J. Bruce Fields wrote:
>>>> On Mon, Apr 14, 2008 at 11:48:33PM -0500, Tom Tucker wrote:
>>>>>
>>>>> Maybe this this is a TCP_BACKLOG issue?
>>>>
>>>> So, looking around.... There seems to be a global limit in
>>>> /proc/sys/net/ipv4/tcp_max_syn_backlog (default 1024?); might be  
>>>> worth
>>>> seeing what happens if that's increased, e.g., with
>>>>
>>>> 	echo 2048 >/proc/sys/net/ipv4/tcp_max_syn_backlog
>>>
>>> I think this represents the collective total for all listening
>>> endpoints. I think we're only talking about mountd.
>>
>> Yes.
>>
>>> Shooting from the hip...
>>>
>>> My gray haired recollection is that the single connection default  
>>> is a
>>> backlog of 10 (SYN received, not accepted connections). Additional  
>>> SYN's
>>> received to this endpoint will be dropped...clients will retry the  
>>> SYN
>>> as part of normal TCP retransmit...
>>>
>>> It might be that the CLOSE_WAIT's in the log are _normal_. That  
>>> is, they
>>> reflect completed mount requests that are in the normal close  
>>> path. If
>>> they never go away, then that's not normal. Is this the case?
>>
>> What he said was:
>>
>> 	"those fall over after some time and stay in CLOSE_WAIT state
>> 	until I restart the nfs-kernel-server."
>>
>> Carsten, are you positive that the same sockets were in CLOSE_WAIT  
>> the
>> whole time you were watching?  And how long was it before you gave up
>> and restarted?
>>
>>> Suppose the 10 is roughly correct. The remaining "jilted" clients  
>>> will
>>> retransmit their SYN after a randomized exponential backoff. I  
>>> think you
>>> can imagine that trying 1300+ connections of which only 10 succeed  
>>> and
>>> then retrying 1300-10 based on a randomized exponential backoff  
>>> might
>>> get you some pretty bad performance.
>>
>> Right, could be, but:
>>
>> ...
>>>> Oh, but: Grepping the glibc rpc code, it looks like it calls listen with
>>>> second argument SOMAXCONN == 128.  You can confirm that by strace'ing
>>>> rpc.mountd -F and looking for the listen call.
>>>>
>>>> And that socket's shared between all the mountd processes, so I guess
>>>> that's the real limit.  I don't see an easy way to adjust that.  You'd
>>>> also need to increase /proc/sys/net/core/somaxconn first.
>>>>
>>>> But none of this explains why we'd see connections stuck in CLOSE_WAIT
>>>> indefinitely?
>>
>> So the limit appears to be more like 128, and (based on my quick look at
>> the code) that appears to be baked in to the glibc rpc code.
>>
>> Maybe you could code around that in mountd.  Looks like the relevant
>> code is in nfs-utils/support/include/rpcmisc.c:rpc_init().
>
> If you really need to start 1300 mounts all at once then something needs
> to change. BTW even after you get past mountd, the server is going to
> get pounded with SYN and RPC_NOP.
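
The limits mentioned in the quoted discussion above can be inspected with a
few lines of Python. This is only a sketch, not something posted in the
thread: the helper names are made up for illustration, and the value 128 is
the listen() backlog reported above for the glibc rpc code. On Linux the
backlog passed to listen() is silently capped at net.core.somaxconn, which
is why both values matter.

```python
from pathlib import Path


def read_sysctl_int(path, default=None):
    """Return the integer stored in a /proc/sys file, or `default` if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return default


def effective_accept_backlog(requested, somaxconn):
    """Linux caps the backlog given to listen() at net.core.somaxconn."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    somaxconn = read_sysctl_int("/proc/sys/net/core/somaxconn")
    syn_backlog = read_sysctl_int("/proc/sys/net/ipv4/tcp_max_syn_backlog")
    print("net.core.somaxconn           =", somaxconn)
    print("net.ipv4.tcp_max_syn_backlog =", syn_backlog)
    if somaxconn is not None:
        # 128 is the value the thread reports glibc's rpc code passing to listen().
        print("effective backlog for listen(fd, 128):",
              effective_accept_backlog(128, somaxconn))
```

Raising somaxconn alone would not help if the caller still asks for 128,
which matches the point made above that the listen() argument would have to
change as well.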

Would it be worth trying UDP, just as an experiment?

Force UDP for the mountd protocol by specifying the "mountproto=udp" option.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

-------------------------------------------------------------------------
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs
_______________________________________________
Please note that nfs@lists.sourceforge.net is being discontinued.
Please subscribe to linux-nfs@vger.kernel.org instead.
    http://vger.kernel.org/vger-lists.html#linux-nfs



* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
  2008-04-16 13:45                     ` Chuck Lever
@ 2008-04-16 14:35                       ` Carsten Aulbert
  0 siblings, 0 replies; 16+ messages in thread
From: Carsten Aulbert @ 2008-04-16 14:35 UTC (permalink / raw)
  To: Chuck Lever; +Cc: J. Bruce Fields, nfs



Chuck Lever wrote:
>
> Force UDP for the mountd protocol by specifying the "mountproto=udp" 
> option.

I'll give that a try as well. I'm currently busy running other benchmarks.
I'll try to get some results by the weekend; if nothing comes from my
side by then, I'm probably buried alive in work, so please send me a
(friendly) reminder.

Thanks already for all the input!

Cheers

Carsten




* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
       [not found]                   ` <1208316166.3521.42.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
  2008-04-16 13:45                     ` Chuck Lever
@ 2008-05-01 19:47                     ` Dean Hildebrand
  2008-05-01 19:51                       ` J. Bruce Fields
  1 sibling, 1 reply; 16+ messages in thread
From: Dean Hildebrand @ 2008-05-01 19:47 UTC (permalink / raw)
  To: Tom Tucker; +Cc: J. Bruce Fields, nfs, Carsten Aulbert


> If you really need to start 1300 mounts all at once then something needs
> to change. BTW even after you get past mountd, the server is going to
> get pounded with SYN and RPC_NOP. 
>   
Just to give my 2 cents after the fact: a new approach is definitely
needed.  For example, a small 10-line MPI program that has a single
client mount the server, calculate the md5sum, and distribute the result
to the other 999 clients would be a much better approach.

Dean
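
The approach suggested above might look roughly like the following. This is
a sketch under assumptions, not code from the thread: it uses Python with
the third-party mpi4py package rather than a C MPI program, and the function
names are invented for illustration. Only rank 0 reads the file (and so only
one node triggers the NFS mount); the checksum is then broadcast to all
other ranks.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def main(path):
    # Assumption: mpi4py is installed and the script is started via an MPI launcher.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    # Only rank 0 touches the file; every other rank waits for the result.
    checksum = md5_of_file(path) if rank == 0 else None
    checksum = comm.bcast(checksum, root=0)
    print(f"rank {rank}: md5 = {checksum}")


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Started with something like "mpirun -np 1000 python md5_bcast.py <file>"
(the script name is hypothetical), this would replace 1000 simultaneous
mounts with a single one.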
> It might be interesting to look at httpd (Apache) to see what it does. I
> would think it faces similar traffic flows.
>
>   
>> --b.




* Re: [NFS] How to set-up a Linux NFS server to handle massive number of requests
  2008-05-01 19:47                     ` Dean Hildebrand
@ 2008-05-01 19:51                       ` J. Bruce Fields
  0 siblings, 0 replies; 16+ messages in thread
From: J. Bruce Fields @ 2008-05-01 19:51 UTC (permalink / raw)
  To: Dean Hildebrand; +Cc: nfs, Carsten Aulbert

On Thu, May 01, 2008 at 12:47:06PM -0700, Dean Hildebrand wrote:
>
>> If you really need to start 1300 mounts all at once then something needs
>> to change. BTW even after you get past mountd, the server is going to
>> get pounded with SYN and RPC_NOP.   
> Just to give my 2 cents after the fact,.. a new approach is definitely  
> needed.  For example, a small 10 line MPI program that has a single  
> client mount the server, calculate the md5sum, and distribute the result  
> to the other 999 clients would be a much better approach....

For that toy example, yes, but we still need to fix whatever's
preventing us from handling 1000 simultaneous mounts.

--b.
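
To get a feel for why many simultaneous connection attempts against a small
listen backlog behave so badly, here is a toy simulation in Python. It is a
sketch under loud assumptions, none of which come from the thread apart
from the client count and the backlog of 128: the server accepts a fixed
number of connections per second, every dropped SYN is retried after a
timeout that starts at 3 seconds and doubles each time, and a client gives
up after 5 retries. Real TCP stacks and mountd are more complicated than
this, and the function name is made up.

```python
from collections import defaultdict


def simulate_mount_storm(clients=1300, backlog=128, accepts_per_sec=50,
                         initial_rto=3, max_retries=5):
    """Toy model of synchronized SYNs hitting a bounded accept queue.

    Time advances in one-second ticks. All parameters except `clients`
    and `backlog` are illustrative assumptions, not measured values.
    """
    pending = defaultdict(list)      # tick -> retry counts of SYNs arriving then
    pending[0] = [0] * clients       # every client connects at the same instant
    queued = connected = gave_up = 0
    last_accept_tick = 0
    tick = 0
    while pending or queued:
        # The server accepts some queued connections, freeing backlog slots.
        accepted = min(queued, accepts_per_sec)
        queued -= accepted
        connected += accepted
        if accepted:
            last_accept_tick = tick
        # SYNs arriving now either fit into the backlog or are dropped.
        for retries in pending.pop(tick, []):
            if queued < backlog:
                queued += 1
            elif retries < max_retries:
                # Dropped: retry after an exponentially growing timeout.
                pending[tick + initial_rto * 2 ** retries].append(retries + 1)
            else:
                gave_up += 1
        tick += 1
    return {"connected": connected, "gave_up": gave_up,
            "seconds": last_accept_tick}


if __name__ == "__main__":
    print(simulate_mount_storm())
```

Because all clients in this model retry at the same moments, each retry wave
again fills the backlog and drops the rest, so the server sits idle most of
the time while clients wait out ever longer timeouts.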




end of thread, other threads:[~2008-05-01 19:51 UTC | newest]

Thread overview: 16+ messages
2008-04-10 12:12 [NFS] How to set-up a Linux NFS server to handle massive number of requests Carsten Aulbert
     [not found] ` <47FE044A.7020008-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
2008-04-11 23:07   ` J. Bruce Fields
2008-04-12  6:45     ` Carsten Aulbert
     [not found]       ` <48005A78.9090609-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
2008-04-14 17:06         ` J. Bruce Fields
2008-04-15  4:48     ` Tom Tucker
     [not found]       ` <1208234913.17169.50.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
2008-04-15  5:42         ` Carsten Aulbert
     [not found]           ` <48044055.2060500-l1a6w7hxd2yELgA04lAiVw@public.gmane.org>
2008-04-15 13:58             ` J. Bruce Fields
2008-04-16  2:49             ` Tom Tucker
2008-04-15 15:12         ` J. Bruce Fields
2008-04-16  2:43           ` Tom Tucker
     [not found]             ` <1208313790.3521.32.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
2008-04-16  2:58               ` J. Bruce Fields
2008-04-16  3:22                 ` Tom Tucker
     [not found]                   ` <1208316166.3521.42.camel-SMNkleLxa3ZimH42XvhXlA@public.gmane.org>
2008-04-16 13:45                     ` Chuck Lever
2008-04-16 14:35                       ` Carsten Aulbert
2008-05-01 19:47                     ` Dean Hildebrand
2008-05-01 19:51                       ` J. Bruce Fields
