RE: Pb of optimization for a Cluster under Gigabit

* RE: Pb of optimization for a Cluster under Gigabit
@ 2004-04-07 20:58 Lever, Charles
  2004-04-07 21:31 ` Chris Worley
  2004-04-08  2:17 ` Didier CONTIS
  0 siblings, 2 replies; 9+ messages in thread
From: Lever, Charles @ 2004-04-07 20:58 UTC (permalink / raw)
  To: Didier CONTIS; +Cc: nfs

> >do you really need to use the "sync" mount option on
> >the clients?  the "sync" export option on the server
> >should be enough for most applications.
>
> That definitely helped. Now the NFS options passed
> via automount to the client are:
>
> rw,nfsvers=3,udp,hard,intr,rsize=8192,wsize=8192
>
> However, the load of the nfs server still goes up to 3.5

that may be completely normal.

i don't think the load average is a good indication of
how hard your server is working.  is your application
throughput reasonable?  any response time problems?

you should take a walk through the NFS HOWTO, as it has
some good server performance tips.

  http://nfs.sourceforge.net/


-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* RE: Pb of optimization for a Cluster under Gigabit
  2004-04-07 20:58 Pb of optimization for a Cluster under Gigabit Lever, Charles
@ 2004-04-07 21:31 ` Chris Worley
  2004-04-08  2:17 ` Didier CONTIS
  1 sibling, 0 replies; 9+ messages in thread
From: Chris Worley @ 2004-04-07 21:31 UTC (permalink / raw)
  To: didier; +Cc: nfs

On Wed, 2004-04-07 at 14:58, Lever, Charles wrote:
> > 
> > That definitely helped. Now the NFS options passed
> > via automount to the client are:
> > 
> > rw,nfsvers=3,udp,hard,intr,rsize=8192,wsize=8192
> > 
> > However, the load of the nfs server still goes up to 3.5

Is that option string coming from /proc/mounts, or actually what you're
using?  If it's coming from /proc/mounts, then not all options are
echoed... and, one thing to definitely get the load level up on the
server is using the "noac" option on the clients.
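
A quick way to audit a client for the options mentioned above is to read its mount table. The following is only a minimal sketch: the function names and the sample mount line (server name, export path, mount point) are hypothetical, and, as noted above, the kernel may not echo every option in /proc/mounts.

```python
def nfs_mount_options(mounts_text):
    """Return {mount_point: [options]} for NFS entries in /proc/mounts-style text."""
    mounts = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device mount_point fstype options dump pass
        if len(fields) >= 4 and fields[2] in ("nfs", "nfs4"):
            mounts[fields[1]] = fields[3].split(",")
    return mounts


def costly_options(options, costly=("sync", "noac")):
    """Return the options from `costly` that appear in an option list."""
    return [opt for opt in options if opt in costly]


# Hypothetical sample; on a real client, read the text from /proc/mounts instead.
SAMPLE = """\
/dev/sda1 / ext3 rw 0 0
fileserver:/export/data /mnt/data nfs rw,v3,rsize=8192,wsize=8192,hard,intr,udp,sync,noac 0 0
"""

for mount_point, opts in nfs_mount_options(SAMPLE).items():
    print(mount_point, costly_options(opts))
```

An empty list for a mount only means neither option was listed, not that it is definitely unset.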




* Re: Pb of optimization for a Cluster under Gigabit
  2004-04-07 20:58 Pb of optimization for a Cluster under Gigabit Lever, Charles
  2004-04-07 21:31 ` Chris Worley
@ 2004-04-08  2:17 ` Didier CONTIS
  1 sibling, 0 replies; 9+ messages in thread
From: Didier CONTIS @ 2004-04-08  2:17 UTC (permalink / raw)
  To: nfs

> that may be completely normal.
> i don't think the load average is a good indication of
> how hard your server is working.  is your application
> throughput reasonable?  any response time problems?

I had some timeouts until I rebooted the server
(before sending my e-mail to the mailing list), plus high load.

Currently I just have a high load (I updated the RH AS 2.1 kernel to the
latest rev). I thought it was odd considering
the hardware, and compared with the load I had on the older Linux file
servers of other clusters.

I did look at section 5 of the HOWTO before e-mailing the list
(even though I confused myself by using the sync option on both the server
and client side). I just did not want to keep experimenting with more
nfsd threads or larger rsize/wsize before asking.

To answer someone else, the options as shown by cat /proc/mounts
are:

rw,v3,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=xnfs1...

Something weird I noticed: I switched the file server exports
from sync to async. Of course I got better performance in terms of
response time on the client side (up to 50s saved on some linking
operations during compilation).
In addition, the load on the file server was halved
(I need to do better and longer monitoring, with ganglia for
example). Can such a load decrease be expected? I am surprised.

Two side questions:

1) Is the Red Hat AS 2.1 kernel broken NFS-wise, so that I should speed up
    the upgrade of the file server to AS 3.0? Please note that using RH
    on the file server was imposed by Dell + EMC for the PowerPath
    install (redundant pathing).

2) Does anyone on the list have experience running an NFS server
    connected to an EMC Clariion SAN with their PowerPath software?

Thanks - Didier.


* RE: Pb of optimization for a Cluster under Gigabit
@ 2004-04-07  4:39 Lever, Charles
  2004-04-07 20:46 ` Didier CONTIS
  0 siblings, 1 reply; 9+ messages in thread
From: Lever, Charles @ 2004-04-07  4:39 UTC (permalink / raw)
  To: Didier CONTIS; +Cc: nfs

hi didier-

do you really need to use the "sync" mount option on
the clients?  the "sync" export option on the server
should be enough for most applications.

IP fragmentation is normal for any UDP-based protocol,
and your stats don't show any reassembly failures or
timeouts.  btw you can get this information in slightly
friendlier form with "netstat -s".
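
For anyone who prefers to check the raw counters directly, here is a minimal sketch that pairs the names and values in /proc/net/snmp-style text. The function name is made up for illustration, and the sample text is the Ip and Udp lines from the quoted output below, joined back onto single lines the way the kernel prints them.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {protocol: {counter: value}}."""
    tables = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Counters come in pairs of lines: a line of names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError(f"mismatched lines: {proto!r} vs {value_proto!r}")
        tables[proto] = dict(zip(names.split(), (int(n) for n in numbers.split())))
    return tables


SAMPLE = (
    "Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors "
    "ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests "
    "OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails "
    "FragOKs FragFails FragCreates\n"
    "Ip: 2 64 34249037 0 0 0 0 217 23273862 31176320 24384 0 0 16472823 "
    "5502518 0 0 0 10378060\n"
    "Udp: InDatagrams NoPorts InErrors OutDatagrams\n"
    "Udp: 23221263 75 217 23165842\n"
)

ip = parse_snmp(SAMPLE)["Ip"]
print("reassembly failures:", ip["ReasmFails"])
print("fragments per reassembled datagram:", ip["ReasmReqds"] / ip["ReasmOKs"])
```

A ReasmFails value of zero is what "no reassembly failures" refers to above; the second figure is just the ratio of two counters and should be read as a rough average.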

> -----Original Message-----
> From: Didier CONTIS [mailto:didier@ece.gatech.edu]
> Sent: Tuesday, April 06, 2004 11:23 PM
> To: nfs@lists.sourceforge.net
> Subject: [NFS] Pb of optimization for a Cluster under Gigabit
>
> We have a cluster with ~60 Dell PowerEdge 1750 (dual cpu)
> running Redhat 9.0 (fully patched) connected via Gigabit
> to a stack of Catalyst 3750.
>
> The cluster has a dedicated NFS server also connected
> via Gigabit:
>
> Dell PowerEdge 2650 running AS 2.1 fully patched.
> The unit has a RAID 1 array for the OS and is connected
> via dual Fibre Channel to an EMC Clariion SAN. We are
> running PowerPath. The server also has 1GB of memory.
>
> Its load is always 2 or higher and we have some flaky
> performance when copying files from one NFS partition
> to another from the client.
>
> All the filesystems are exported with sync and mounted
> on the client (via autofs) with:
> rw,sync,hard,intr,rsize=8192,wsize=8192
>
> The time for copying a 40MB file from an NFS partition to the local
> client filesystem is good.
>
> [didier@xfront2 ~]$ time cp jeffay.txt /tmp
> 0.010u 0.190s 0:05.19 3.8%      0+0k 0+0io 115pf+0w
>
> Copying the same file from one NFS partition to another
> via the same client takes more than a couple of minutes.
>
> We are running 96 nfsd on the file server with the Queue tune-up hack.
>
> The th line under /proc/net/rpc/nfsd
> [...]
> th 96 0 171.110 29.200 5.100 0.000 0.000 0.000 0.000 0.000 0.000 0.000
>
> looks good.
>
> It seems the file server is spending too much time doing ip frag work:
> uptime -> 18hours
>
> [didier@xnfs1 ~]$ cat /proc/net/snmp
> Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors
> ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests
> OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails
> FragOKs FragFails FragCreates
> Ip: 2 64 34249037 0 0 0 0 217 23273862 31176320 24384 0 0 16472823
> 5502518 0 0 0 10378060
> [...]
> Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens
> AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
> Tcp: 0 0 0 0 2532 0 0 0 1 54706 76945 15 0 12
> Udp: InDatagrams NoPorts InErrors OutDatagrams
> Udp: 23221263 75 217 23165842
>
> Would anyone have any suggestions or recommendations ? Should
> I switch rsize / wsize to 1024 ?
>
> Thanks - Didier
>

* RE: Pb of optimization for a Cluster under Gigabit
  2004-04-07  4:39 Lever, Charles
@ 2004-04-07 20:46 ` Didier CONTIS
  0 siblings, 0 replies; 9+ messages in thread
From: Didier CONTIS @ 2004-04-07 20:46 UTC (permalink / raw)
  To: nfs

 

>-----Original Message-----
>From: Lever, Charles [mailto:Charles.Lever@netapp.com] 
>Sent: Wednesday, April 07, 2004 12:40 AM
>To: Didier CONTIS
>Cc: nfs@lists.sourceforge.net
>Subject: RE: [NFS] Pb of optimization for a Cluster under Gigabit
>
>hi didier-
>
>do you really need to use the "sync" mount option on
>the clients?  the "sync" export option on the server
>should be enough for most applications.

That definitely helped. Now the NFS options passed
via automount to the client are:

rw,nfsvers=3,udp,hard,intr,rsize=8192,wsize=8192

However, the load of the nfs server still goes up to 3.5

What else could I check for ?

Thanks - Didier.

The output of nfsstat looks like this (after 2 days of uptime):

[didier@xnfs1 ~]$ /usr/sbin/nfsstat 
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
58142076   0          0          0          0       
Server nfs v2:
null       getattr    setattr    root       lookup     readlink   
1      100% 0       0% 0       0% 0       0% 0       0% 0       0% 
read       wrcache    write      create     remove     rename     
0       0% 0       0% 0       0% 0       0% 0       0% 0       0% 
link       symlink    mkdir      rmdir      readdir    fsstat     
0       0% 0       0% 0       0% 0       0% 0       0% 0       0% 

Server nfs v3:
null       getattr    setattr    lookup     access     readlink   
1       0% 15158205 26% 53353   0% 996812  1% 42526   0% 1067    0% 
read       write      create     mkdir      symlink    mknod      
7941851 13% 33614209 57% 139897  0% 5118    0% 865     0% 0       0% 
remove     rmdir      rename     link       readdir    readdirplus
92778   0% 651     0% 1080    0% 394     0% 15436   0% 0       0% 
fsstat     fsinfo     pathconf   commit     
1106    0% 1106    0% 0       0% 75620   0% 

>IP fragmentation is normal for any UDP-based protocol,
>and your stats don't show any reassembly failures or
>timeouts.  btw you can get this information in slightly
>friendlier form with "netstat -s".

I got the following:

[didier@xnfs1 ~]$ netstat -s | more
Ip:
    79283261 total packets received
    0 forwarded
    19816 incoming packets discarded
    58326116 incoming packets delivered
    87711576 requests sent out
    148509 outgoing packets dropped
    30853764 reassemblies required
    9927976 packets reassembled ok
    37294547 fragments created

[......]

Udp:
    58212451 packets received
    190 packets to unknown port received.
    19816 packet receive errors
    57985833 packets sent

[didier@xnfs1 ~]$ netstat -in
Kernel Interface table
Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0   109165      0      0      0   144631      0      0      0 BMRU
eth1       1500   0 79305078      0      0      0 87676856      0      0      0 BMRU
lo        16436   0      299      0      0      0      299      0      0      0 LRU





* Pb of optimization for a Cluster under Gigabit
@ 2004-04-07  3:22 Didier CONTIS
  2004-04-07  5:28 ` Greg Banks
  2004-04-07 11:40 ` Bogdan Costescu
  0 siblings, 2 replies; 9+ messages in thread
From: Didier CONTIS @ 2004-04-07  3:22 UTC (permalink / raw)
  To: nfs



We have a cluster with ~60 Dell PowerEdge 1750 (dual cpu)
running Redhat 9.0 (fully patched) connected via Gigabit
to a stack of Catalyst 3750.

The cluster has a dedicated NFS server also connected
via Gigabit:

Dell PowerEdge 2650 running AS 2.1 fully patched.
The unit has a RAID 1 array for the OS and is connected
via dual Fibre Channel to an EMC Clariion SAN. We are
running PowerPath. The server also has 1GB of memory.

Its load is always 2 or higher and we have some flaky
performance when copying files from one NFS partition
to another from the client.

All the filesystems are exported with sync and mounted
on the client (via autofs) with:
rw,sync,hard,intr,rsize=8192,wsize=8192

The time for copying a 40MB file from an NFS partition to the local
client filesystem is good.

[didier@xfront2 ~]$ time cp jeffay.txt /tmp
0.010u 0.190s 0:05.19 3.8%      0+0k 0+0io 115pf+0w

Copying the same file from one NFS partition to another
via the same client takes more than a couple of minutes.

We are running 96 nfsd on the file server with the Queue tune-up hack.

The th line under /proc/net/rpc/nfsd
[...]
th 96 0 171.110 29.200 5.100 0.000 0.000 0.000 0.000 0.000 0.000 0.000

looks good.
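
To make the reading of that line explicit, here is a minimal sketch of how the th line can be split up. It assumes the layout as I understand it from the NFS HOWTO: the thread count, the number of times all threads were busy at once, then ten histogram buckets of seconds spent with 0-10%, 10-20%, ... 90-100% of the threads busy. The function name is made up for illustration.

```python
def parse_th_line(line):
    """Split an nfsd 'th' line into thread count, all-busy count, and histogram."""
    fields = line.split()
    if len(fields) != 13 or fields[0] != "th":
        raise ValueError("expected 'th' followed by 12 numbers")
    return {
        "threads": int(fields[1]),
        "all_busy_count": int(fields[2]),
        # Assumed meaning: seconds spent in each 10% band of thread utilization.
        "busy_seconds": [float(f) for f in fields[3:]],
    }


stats = parse_th_line(
    "th 96 0 171.110 29.200 5.100 0.000 0.000 0.000 0.000 0.000 0.000 0.000"
)
# Index of the highest utilization band that has any time recorded in it.
highest_band = max(i for i, secs in enumerate(stats["busy_seconds"]) if secs > 0)
print(stats["threads"], stats["all_busy_count"], highest_band)
```

Under that reading, the upper bands being empty and the all-busy count being zero is what suggests the 96 threads are not the bottleneck.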

It seems the file server is spending too much time doing ip frag work:
uptime -> 18hours

[didier@xnfs1 ~]$ cat /proc/net/snmp
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors
ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests
OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails
FragOKs FragFails FragCreates
Ip: 2 64 34249037 0 0 0 0 217 23273862 31176320 24384 0 0 16472823
5502518 0 0 0 10378060
[...]
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens
AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 0 0 0 0 2532 0 0 0 1 54706 76945 15 0 12
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 23221263 75 217 23165842

Would anyone have any suggestions or recommendations ? Should
I switch rsize / wsize to 1024 ?

Thanks - Didier





* Re: Pb of optimization for a Cluster under Gigabit
  2004-04-07  3:22 Didier CONTIS
@ 2004-04-07  5:28 ` Greg Banks
  2004-04-07 11:40 ` Bogdan Costescu
  1 sibling, 0 replies; 9+ messages in thread
From: Greg Banks @ 2004-04-07  5:28 UTC (permalink / raw)
  To: Didier CONTIS; +Cc: nfs

Didier CONTIS wrote:
> 
> All the filesystem are exported with sync and mounted
> on the client (via autofs) with:
> rw,sync,hard,intr,rsize=8192,wsize=8192
> [...]
> Would anyone have any suggestions or recommendations ? [...]

Try removing "sync" from your mount options.

Greg.
-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.



* Re: Pb of optimization for a Cluster under Gigabit
  2004-04-07  3:22 Didier CONTIS
  2004-04-07  5:28 ` Greg Banks
@ 2004-04-07 11:40 ` Bogdan Costescu
  2004-04-07 20:38   ` Didier CONTIS
  1 sibling, 1 reply; 9+ messages in thread
From: Bogdan Costescu @ 2004-04-07 11:40 UTC (permalink / raw)
  To: Didier CONTIS; +Cc: nfs

On Tue, 6 Apr 2004, Didier CONTIS wrote:

> Would anyone have any suggestions or recommendations ? Should
> I switch rsize / wsize to 1024 ?

If the network cards and switch allow it, you could try first using 
Jumbo (9k) Ethernet frames; in such a setup, the 8k NFS packets would 
most likely travel unfragmented between server and clients.
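
The arithmetic behind this can be sketched as follows. It is a rough estimate only: it assumes IPv4 without options, and the 128-byte RPC/NFS header allowance is a hypothetical round number rather than a measured value.

```python
import math

IP_HEADER = 20       # bytes, IPv4 header without options
UDP_HEADER = 8       # bytes
RPC_OVERHEAD = 128   # bytes, hypothetical allowance for RPC/NFS headers


def ip_packets_needed(udp_payload, mtu):
    """Estimate how many IPv4 packets carry one UDP datagram at a given MTU."""
    datagram = UDP_HEADER + udp_payload      # what IP has to carry
    room = mtu - IP_HEADER                   # IP payload space per packet
    if datagram <= room:
        return 1                             # fits in one packet, no fragmentation
    per_fragment = room - (room % 8)         # non-final fragments carry multiples of 8 bytes
    # The final fragment may use the full room; earlier ones use per_fragment.
    return 1 + math.ceil((datagram - room) / per_fragment)


for mtu in (1500, 9000):
    print(mtu, ip_packets_needed(8192 + RPC_OVERHEAD, mtu))
```

With these assumptions an 8k transfer needs several packets at the standard 1500-byte MTU and a single packet at 9000, while a 1024-byte transfer already fits in one packet at 1500.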

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De




* RE: Pb of optimization for a Cluster under Gigabit
  2004-04-07 11:40 ` Bogdan Costescu
@ 2004-04-07 20:38   ` Didier CONTIS
  0 siblings, 0 replies; 9+ messages in thread
From: Didier CONTIS @ 2004-04-07 20:38 UTC (permalink / raw)
  To: nfs

 

>If the network cards and switch allow it, you could try first using 
>Jumbo (9k) Ethernet frames; in such a setup, the 8k NFS packets would 
>most likely travel unfragmented between server and clients.

While about 90% of the hardware supports it, I have some legacy hardware
(nodes and switch) still running at 100Mbit connected to the stack of 3750.

Thanks - Didier.



