* RE: Pb of optimization for a Cluster under Gigabit
@ 2004-04-07 20:58 Lever, Charles
  2004-04-07 21:31 ` Chris Worley
  2004-04-08  2:17 ` Didier CONTIS
  0 siblings, 2 replies; 9+ messages in thread
From: Lever, Charles @ 2004-04-07 20:58 UTC (permalink / raw)
  To: Didier CONTIS; +Cc: nfs

> >do you really need to use the "sync" mount option on
> >the clients?  the "sync" export option on the server
> >should be enough for most applications.
>
> That definitely helped. Now the NFS options passed
> via automount to the client are:
>
> rw,nfsvers=3,udp,hard,intr,rsize=8192,wsize=8192
>
> However, the load of the nfs server still goes up to 3.5

that may be completely normal.

i don't think the load average is a good indication of
how hard your server is working.  is your application
throughput reasonable?  any response time problems?

you should take a walk through the NFS HOWTO, as it has
some good server performance tips.

  http://nfs.sourceforge.net/


-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

* RE: Pb of optimization for a Cluster under Gigabit
@ 2004-04-07  4:39 Lever, Charles
  2004-04-07 20:46 ` Didier CONTIS
  0 siblings, 1 reply; 9+ messages in thread
From: Lever, Charles @ 2004-04-07  4:39 UTC (permalink / raw)
  To: Didier CONTIS; +Cc: nfs

hi didier-

do you really need to use the "sync" mount option on
the clients?  the "sync" export option on the server
should be enough for most applications.

IP fragmentation is normal for any UDP-based protocol,
and your stats don't show any reassembly failures or
timeouts.  btw you can get this information in slightly
friendlier form with "netstat -s".
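
As a rough illustration of the check described above, the following Python sketch pairs each header line of /proc/net/snmp with the value line that follows it and reads off the reassembly counters. The sample text is the output quoted in this thread; the names parse_snmp and SAMPLE are made up for this example, and the rejoining of wrapped lines is only there because the mail client folded the long lines.

```python
# Sketch only: parse /proc/net/snmp-style text into {protocol: {counter: value}}.
# SAMPLE is the output posted in this thread; on a live Linux system one would
# read the text from /proc/net/snmp instead.
SAMPLE = """\
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors
ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests
OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails
FragOKs FragFails FragCreates
Ip: 2 64 34249037 0 0 0 0 217 23273862 31176320 24384 0 0 16472823
5502518 0 0 0 10378060
[...]
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens
AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 0 0 0 0 2532 0 0 0 1 54706 76945 15 0 12
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 23221263 75 217 23165842
"""


def parse_snmp(text):
    """Return {"Ip": {"ReasmFails": 0, ...}, "Tcp": {...}, ...}."""
    # Rejoin lines wrapped by the mail client: a logical line starts with
    # a "Proto:" token, anything else continues the previous line.
    logical = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "[...]":
            continue
        if line.split()[0].endswith(":"):
            logical.append(line)
        elif logical:
            logical[-1] += " " + line

    # Logical lines come in pairs: counter names, then counter values.
    tables = {}
    for header, values in zip(logical[0::2], logical[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError("header/value lines out of step: %r" % proto)
        tables[proto] = dict(zip(names.split(), map(int, numbers.split())))
    return tables


ip = parse_snmp(SAMPLE)["Ip"]
print("reassemblies requested:", ip["ReasmReqds"])
print("reassemblies completed:", ip["ReasmOKs"])
print("reassembly failures:   ", ip["ReasmFails"])  # 0 in the posted stats
```

ReasmReqds counts incoming fragments while ReasmOKs counts datagrams successfully rebuilt, so the two are expected to differ; ReasmFails is the counter worth watching.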

> -----Original Message-----
> From: Didier CONTIS [mailto:didier@ece.gatech.edu]
> Sent: Tuesday, April 06, 2004 11:23 PM
> To: nfs@lists.sourceforge.net
> Subject: [NFS] Pb of optimization for a Cluster under Gigabit
>
> We have a cluster with ~60 Dell PowerEdge 1750 (dual cpu)
> running Red Hat 9.0 (fully patched) connected via Gigabit
> to a stack of Catalyst 3750.
>
> The cluster has a dedicated NFS server also connected
> via Gigabit:
>
> Dell PowerEdge 2650 running AS 2.1 fully patched.
> The unit has a RAID 1 array for the OS and is connected
> via dual Fibre Channel to an EMC CLARiiON SAN. We are
> running PowerPath. The server also has 1 GB of memory.
>
> Its load is always 2 or higher and we have some flaky
> performance when copying files from one NFS partition
> to another from the client.
>
> All the filesystems are exported with sync and mounted
> on the client (via autofs) with:
> rw,sync,hard,intr,rsize=8192,wsize=8192
>
> The time for copying a 40MB file from an NFS partition to a local
> client filesystem is good.
>
> [didier@xfront2 ~]$ time cp jeffay.txt /tmp
> 0.010u 0.190s 0:05.19 3.8%      0+0k 0+0io 115pf+0w
>
> Copying the same file from one NFS partition to another
> via the same client takes more than a couple of minutes.
>
> We are running 96 nfsd threads on the file server with the queue tune-up hack.
>
> The th line under /proc/net/rpc/nfsd
> [...]
> th 96 0 171.110 29.200 5.100 0.000 0.000 0.000 0.000 0.000 0.000 0.000
>
> looks good.
>
> It seems the file server is spending too much time doing IP fragmentation work
> (uptime: 18 hours):
>
> [didier@xnfs1 ~]$ cat /proc/net/snmp
> Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors
> ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests
> OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails
> FragOKs FragFails FragCreates
> Ip: 2 64 34249037 0 0 0 0 217 23273862 31176320 24384 0 0 16472823
> 5502518 0 0 0 10378060
> [...]
> Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens
> AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs
> InErrs OutRsts
> Tcp: 0 0 0 0 2532 0 0 0 1 54706 76945 15 0 12
> Udp: InDatagrams NoPorts InErrors OutDatagrams
> Udp: 23221263 75 217 23165842
>
> Would anyone have any suggestions or recommendations? Should
> I switch rsize/wsize to 1024?
>
> Thanks - Didier
>


* Pb of optimization for a Cluster under Gigabit
@ 2004-04-07  3:22 Didier CONTIS
  2004-04-07  5:28 ` Greg Banks
  2004-04-07 11:40 ` Bogdan Costescu
  0 siblings, 2 replies; 9+ messages in thread
From: Didier CONTIS @ 2004-04-07  3:22 UTC (permalink / raw)
  To: nfs



We have a cluster with ~60 Dell PowerEdge 1750 (dual cpu)
running Red Hat 9.0 (fully patched) connected via Gigabit
to a stack of Catalyst 3750.

The cluster has a dedicated NFS server also connected
via Gigabit:

Dell PowerEdge 2650 running AS 2.1 fully patched.
The unit has a RAID 1 array for the OS and is connected
via dual Fibre Channel to an EMC CLARiiON SAN. We are
running PowerPath. The server also has 1 GB of memory.

Its load is always 2 or higher and we have some flaky
performance when copying files from one NFS partition
to another from the client.

All the filesystems are exported with sync and mounted
on the client (via autofs) with:
rw,sync,hard,intr,rsize=8192,wsize=8192

The time for copying a 40MB file from an NFS partition to a local
client filesystem is good.

[didier@xfront2 ~]$ time cp jeffay.txt /tmp
0.010u 0.190s 0:05.19 3.8%      0+0k 0+0io 115pf+0w

Copying the same file from one NFS partition to another
via the same client takes more than a couple of minutes.

We are running 96 nfsd threads on the file server with the queue tune-up hack.

The th line under /proc/net/rpc/nfsd
[...]
th 96 0 171.110 29.200 5.100 0.000 0.000 0.000 0.000 0.000 0.000 0.000

looks good.
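
For readers unfamiliar with the th line, here is a small Python sketch of how it is commonly read. It assumes the layout described in the NFS HOWTO for kernels of this era: the thread count, the number of times all threads were busy at once, then ten histogram buckets giving the seconds spent at each 10% band of thread utilisation. The function name parse_th and the "top three deciles" rule of thumb are illustrative assumptions, not part of the original post.

```python
# Sketch only: decode the "th" line of /proc/net/rpc/nfsd, assuming the
# layout: th <threads> <times-all-busy> <10 histogram buckets in seconds>.
TH_LINE = "th 96 0 171.110 29.200 5.100 0.000 0.000 0.000 0.000 0.000 0.000 0.000"


def parse_th(line):
    """Return (threads, times_all_busy, buckets) from a th line."""
    fields = line.split()
    if len(fields) != 13 or fields[0] != "th":
        raise ValueError("unexpected th line: %r" % line)
    threads = int(fields[1])
    times_all_busy = int(fields[2])
    # buckets[0] is the 0-10% band, buckets[9] the 90-100% band.
    buckets = [float(value) for value in fields[3:]]
    return threads, times_all_busy, buckets


threads, times_all_busy, buckets = parse_th(TH_LINE)
busy_top = sum(buckets[-3:])  # seconds with more than ~70% of threads in use

print("nfsd threads:", threads)
print("times every thread was busy:", times_all_busy)
print("seconds in the top three deciles:", busy_top)
```

With the numbers posted here, all recorded time falls in the lowest three bands and the top bands are empty, which is consistent with the "looks good" reading: the 96 threads are never close to exhausted.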

It seems the file server is spending too much time doing IP fragmentation work
(uptime: 18 hours):

[didier@xnfs1 ~]$ cat /proc/net/snmp
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors
ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests
OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails
FragOKs FragFails FragCreates
Ip: 2 64 34249037 0 0 0 0 217 23273862 31176320 24384 0 0 16472823
5502518 0 0 0 10378060
[...]
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens
AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 0 0 0 0 2532 0 0 0 1 54706 76945 15 0 12
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 23221263 75 217 23165842

Would anyone have any suggestions or recommendations? Should
I switch rsize/wsize to 1024?

Thanks - Didier
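
The arithmetic behind the rsize/wsize question can be sketched as follows. This is a back-of-the-envelope Python example, not a measurement: it assumes a standard 1500-byte Ethernet MTU (the thread does not say whether jumbo frames are in use), a 20-byte IPv4 header without options, and a placeholder RPC_OVERHEAD of 128 bytes for the RPC/NFS headers, which is a guess rather than a measured value.

```python
# Sketch only: how many IP fragments one NFS-over-UDP read or write of a given
# size turns into. MTU, header sizes and RPC_OVERHEAD are assumptions.
IP_HEADER = 20       # IPv4 header without options
UDP_HEADER = 8
RPC_OVERHEAD = 128   # hypothetical allowance for RPC + NFS headers


def fragments_for(transfer_size, mtu=1500):
    """Number of IP fragments needed for one RPC carrying transfer_size bytes."""
    # Fragment payloads must be a multiple of 8 bytes (except the last one).
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    datagram = transfer_size + RPC_OVERHEAD + UDP_HEADER
    return -(-datagram // per_fragment)  # ceiling division


for size in (1024, 8192):
    print("rsize/wsize=%d -> %d fragment(s) per RPC" % (size, fragments_for(size)))
```

Under these assumptions an 8192-byte transfer needs 6 fragments and a 1024-byte transfer fits in a single packet. Dropping to 1024 would avoid fragmentation but multiplies the number of RPC round trips by eight for the same amount of data, so it is a trade-off rather than an obvious win.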





end of thread, other threads:[~2004-04-08  2:17 UTC | newest]

Thread overview: 9+ messages
2004-04-07 20:58 Pb of optimization for a Cluster under Gigabit Lever, Charles
2004-04-07 21:31 ` Chris Worley
2004-04-08  2:17 ` Didier CONTIS
  -- strict thread matches above, loose matches on Subject: below --
2004-04-07  4:39 Lever, Charles
2004-04-07 20:46 ` Didier CONTIS
2004-04-07  3:22 Didier CONTIS
2004-04-07  5:28 ` Greg Banks
2004-04-07 11:40 ` Bogdan Costescu
2004-04-07 20:38   ` Didier CONTIS
