NFS Performance issues...

* NFS Performance issues...
@ 2002-03-15 23:44 Matt Heaton
  2002-03-16 15:34 ` Thomas Langås
  0 siblings, 1 reply; 8+ messages in thread
From: Matt Heaton @ 2002-03-15 23:44 UTC (permalink / raw)
  To: nfs

I run a fairly large free/paid hosting service.  We have about 8 webservers
(NFS clients) that connect to a single NFS server box.  We are having SEVERE
performance issues.  The clients are READ ONLY clients.  We have set rsize
to 8192 bytes, but otherwise are using the standard setup.  The NFS server
seems to be able to serve only about 11 Mbit/s of data before it dies.  Is
this typical performance?  We are mostly serving web pages, so most files
are small, although some are bigger (3-5 megabytes).  Both the clients and
the server are Red Hat 7.2 (NFS v3).

I AM LOOKING FOR SOME COMMERCIAL SUPPORT or some guru I can pay to help with these issues.  PLEASE give me a call
at (801) 361-1177 if you would be able to help.  We will pay for the support we need.

Thanks,
Matt Heaton

* Re: NFS Performance issues...
  2002-03-15 23:44 NFS Performance issues Matt Heaton
@ 2002-03-16 15:34 ` Thomas Langås
  2002-03-16 16:54   ` Matt Heaton
  0 siblings, 1 reply; 8+ messages in thread
From: Thomas Langås @ 2002-03-16 15:34 UTC (permalink / raw)
  To: Matt Heaton; +Cc: nfs

Matt Heaton:
> I run a fairly large free/paid hosting service.  We have about 8 webservers
> (NFS clients) that connect to a single NFS server box.  We are having SEVERE
> performance issues.  The clients are READ ONLY clients.  We have set rsize
> to 8192 bytes, but otherwise are using the standard setup.  The NFS server
> seems to be able to serve only about 11 Mbit/s of data before it dies.  Is
> this typical performance?  We are mostly serving web pages, so most files
> are small, although some are bigger (3-5 megabytes).  Both the clients and
> the server are Red Hat 7.2 (NFS v3).

You didn't mention which kernel version you're running.

-- 
Thomas

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

* Re: NFS Performance issues...
  2002-03-16 15:34 ` Thomas Langås
@ 2002-03-16 16:54   ` Matt Heaton
  2002-03-16 17:35     ` Thomas Langås
  0 siblings, 1 reply; 8+ messages in thread
From: Matt Heaton @ 2002-03-16 16:54 UTC (permalink / raw)
  To: nfs

Sorry, we have a 100 Mbit network.  Everything is plugged into a Cisco 2924
switch.  There is also an Alteon load balancer/firewall.  We are running
2.4.7-10 (Red Hat 7.2).  I believe it might be something with the kernel, as
sometimes the memory just spirals down.  Doesn't 7.2 have some virtual memory
problem like that?

Matt

* Re: NFS Performance issues...
  2002-03-16 16:54   ` Matt Heaton
@ 2002-03-16 17:35     ` Thomas Langås
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Langås @ 2002-03-16 17:35 UTC (permalink / raw)
  To: Matt Heaton; +Cc: nfs

Matt Heaton:
> Sorry, we have a 100 Mbit network.  Everything is plugged into a Cisco 2924
> switch.  There is also an Alteon load balancer/firewall.  We are running
> 2.4.7-10 (Red Hat 7.2).  I believe it might be something with the kernel, as
> sometimes the memory just spirals down.  Doesn't 7.2 have some virtual memory
> problem like that?

First of all, try updating to the errata kernel, 2.4.9 (it's in the
updates directory on every Red Hat mirror).

-- 
Thomas

* NFS Performance issues
@ 2005-05-11  5:31 Jeff Block
  2005-05-11  6:14 ` Neil Brown
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jeff Block @ 2005-05-11  5:31 UTC (permalink / raw)
  To: nfs

We seem to be having some major performance problems on our Red Hat
Enterprise Linux 3 boxes.  Some of our machines have RAIDs attached, some
have internal SCSI drives, and some have internal IDE drives.  The one
thing all the boxes have in common is that their Solaris counterparts are
putting them to shame in the NFS performance battle.

Here's some of the info and what we've already tried.
/etc/exports is simple:
/export/data    @all-hosts(rw,sync)

The automounter is used, so the default mount options apply.  The mount
looks like this:
server:/export/data /data/mountpoint nfs rw,v3,rsize=8192,wsize=8192,hard,udp,lock,addr=server 0 0
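
When comparing clients, it can help to pull the transfer sizes and transport out of a mount line programmatically. The following is a minimal sketch, assuming Python 3 is available; the function name `parse_mount_options` is made up for illustration and the input is the mount line quoted above.

```python
def parse_mount_options(fstab_line):
    """Split an fstab-style line and return its option field as a dict."""
    fields = fstab_line.split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 whitespace-separated fields")
    options = {}
    for item in fields[3].split(","):
        key, sep, value = item.partition("=")
        # Flags such as "hard" or "udp" carry no value; store them as True.
        options[key] = value if sep else True
    return options


line = ("server:/export/data /data/mountpoint nfs "
        "rw,v3,rsize=8192,wsize=8192,hard,udp,lock,addr=server 0 0")
opts = parse_mount_options(line)
print("rsize:", opts["rsize"], "wsize:", opts["wsize"], "udp:", "udp" in opts)
```

Running the same function over each client's mount entries makes it easy to spot a host that negotiated a different rsize, wsize, or transport.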

We can't change the rsize and wsize on these mounts because the precompiled
redhat kernel for vers3 maxes out at 8K.  We could of course compile our own
kernel, but doing this for more than a handful of machines can be a big
headache.  

We've tried moving the journaling from RAID devices onto another internal
disk.  This helped a little, but not much.

We have tried async, and that certainly does speed things up, but we are
definitely not comfortable with using async.

The big problem that we are having seems to have to do with copying a bunch
of data from one machine to another.

We have 683MB of test data that we were playing with that represents the
file sizes that our users play with.  There are several small files in this
set, so there are a lot of writes and commits.  Our users generally work with
data sets in the multiple gigabyte range.

Test data - 683 MB

NFS Testing:
Client  | Server  | Storage                 | NFS cp Time (m:ss) | SCP Time (m:ss)
--------+---------+-------------------------+--------------------+----------------
Solaris | Solaris | RAID                    | 1:32               | 1:59
Linux A | Solaris | RAID                    | 0:42               | 2:51
Linux A | Linux B | RAID5 / Journal to SCSI | 3:17               | 2:05
Linux A | Linux B | RAID5 / Journal to RAID | 5:07               | 1:45
Linux A | Linux B | SCSI                    | 3:17               | 1:52
Linux A | Linux B | IDE                     | 1:36               | 2:27

Other Tests

Internal Tests:
Host/Storage      | Host/Storage       | cp Time (m:ss)
------------------+--------------------+---------------
Linux B Int. SCSI | Linux B Ext. RAID5 | 0:37
Sol Int. SCSI     | Sol Ext. RAID5     | 0:35

Network:
Host A  | Host B  | Throughput
--------+---------+-------------
linux A | linux B | 893 Mbit/sec

Probably hard to read, but the bottom line is this:
Copying the 683MB from a linux host to a solaris RAID took 42 seconds.
Copying the same data from a linux host to a linux RAID took 5:07 or 3:17
depending on where the journal is stored.  My SCP times from Linux to Linux
RAID are much quicker than my nfs copies which seems pretty backwards to me.
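
To put the copy times on a common scale, they can be converted to average throughput. This is a small sketch, assuming the times in the tables are minutes:seconds and that "683MB" is the full payload of each run; the helper names are illustrative only.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string such as '5:07' to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def throughput_mb_per_s(size_mb, mmss):
    """Average throughput in MB/s for size_mb copied in the given time."""
    return size_mb / to_seconds(mmss)


SIZE_MB = 683
runs = [
    ("Linux A -> Solaris RAID (NFS cp)", "0:42"),
    ("Linux A -> Linux B RAID5, journal on SCSI (NFS cp)", "3:17"),
    ("Linux A -> Linux B RAID5, journal on RAID (NFS cp)", "5:07"),
    ("Linux A -> Linux B RAID5, journal on RAID (scp)", "1:45"),
]
for label, elapsed in runs:
    print(f"{label}: {throughput_mb_per_s(SIZE_MB, elapsed):.1f} MB/s")
```

For comparison, the measured 893 Mbit/sec link corresponds to roughly 110 MB/s, so none of these runs is close to saturating the network.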

Thanks in advance for the help on this.

Jeff Block
Programmer / Analyst
Radiology Research Computing
University of California, San Francisco


* Re: NFS Performance issues
  2005-05-11  5:31 Jeff Block
@ 2005-05-11  6:14 ` Neil Brown
  2005-05-11 14:15 ` Chris Penney
  2005-05-11 17:25 ` Dan Stromberg
  2 siblings, 0 replies; 8+ messages in thread
From: Neil Brown @ 2005-05-11  6:14 UTC (permalink / raw)
  To: Jeff Block; +Cc: nfs

On Tuesday May 10, jblock@mrsc.ucsf.edu wrote:
> 
> We've tried moving the journaling from RAID devices onto another internal
> disk.  This helped a little, but not much.
> 

Are you using ext3?
Have you tried the "data=journal" mount option?  It speeds up NFS
writes a lot.
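
One way to check which journaling mode each ext3 filesystem is currently mounted with is to look at the option field in /proc/mounts. The sketch below assumes Python 3 and the usual /proc/mounts layout (device, mount point, type, options); the device names in the sample are hypothetical, and some kernels may not list the default data mode at all, in which case the function reports "not shown".

```python
def ext3_data_modes(mounts_text):
    """Map mount point -> data= mode for each ext3 line in /proc/mounts-style text."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mode = "not shown"
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes


# Hypothetical sample; on a real server, read the text from /proc/mounts.
sample = (
    "/dev/sda1 / ext3 rw,data=ordered 0 0\n"
    "/dev/sdb1 /export/data ext3 rw,data=journal 0 0\n"
    "proc /proc proc rw 0 0\n"
)
print(ext3_data_modes(sample))
```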

NeilBrown


* Re: NFS Performance issues
  2005-05-11  5:31 Jeff Block
  2005-05-11  6:14 ` Neil Brown
@ 2005-05-11 14:15 ` Chris Penney
  2005-05-11 17:25 ` Dan Stromberg
  2 siblings, 0 replies; 8+ messages in thread
From: Chris Penney @ 2005-05-11 14:15 UTC (permalink / raw)
  To: nfs; +Cc: Jeff Block

> We have 683MB of test data that we were playing with that represents the
> file sizes that our users play with.  There are several small files in this
> set, so there are a lot of writes and commits.  Our users generally work with
> data sets in the multiple gigabyte range.

This sounds similar to some of the CAE analysis work that some of the
NFS servers I maintain handle.  Our Sun 480s w/ Veritas do a
reasonable job, but the Linux boxes we have blow their doors off.

We are using JFS file systems (which was a huge improvement for us)
and using the 2.6 device-mapper to stripe across four 1TB LUNs.  We
have dual-CPU boxes w/ hyperthreading enabled and use 128 nfs threads.
All clients use a 32k r/wsize (which was also an improvement).  We
don't use async for reliability reasons (I'm not sure that with our
setup it would matter that much).

I also use the following in sysctl.conf:

net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 65536 8388608
net.ipv4.tcp_mem = 8388608 8388608 8388608

I can't say those tuning options are formally tested.  Perhaps
someone with more understanding could comment on them.
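
Before applying settings like these, it may be worth comparing them with what a machine is already running. The sketch below assumes Python 3 and that each dotted key maps to a file under /proc/sys (for example net.core.rmem_max to /proc/sys/net/core/rmem_max); the function names are illustrative. One caveat: the kernel documentation describes net.ipv4.tcp_mem in units of pages rather than bytes, so the value above may be far larger than intended.

```python
import os


def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into {key: [value tokens]}."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.split()
    return settings


def current_value(key, proc_root="/proc/sys"):
    """Read the running value of a sysctl key, or None if it is unavailable."""
    path = os.path.join(proc_root, *key.split("."))
    try:
        with open(path) as handle:
            return handle.read().split()
    except OSError:
        return None


SUGGESTED = """
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 65536 8388608
net.ipv4.tcp_mem = 8388608 8388608 8388608
"""

parsed = parse_sysctl_conf(SUGGESTED)
for key, wanted in parsed.items():
    have = current_value(key)
    status = "same" if have == wanted else "differs"
    print(f"{key}: suggested={wanted} current={have} ({status})")
```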

   Chris

* Re: NFS Performance issues
  2005-05-11  5:31 Jeff Block
  2005-05-11  6:14 ` Neil Brown
  2005-05-11 14:15 ` Chris Penney
@ 2005-05-11 17:25 ` Dan Stromberg
  2 siblings, 0 replies; 8+ messages in thread
From: Dan Stromberg @ 2005-05-11 17:25 UTC (permalink / raw)
  To: Jeff Block; +Cc: strombrg, nfs

You might try upgrading to RHEL 4, or other linux with a 2.6.x kernel.

If you're on a gigabit network, you might try turning on jumbo frames.

NFS is known to have a lot of "back and forth", relative to other
protocols.  For bulk transfers, you're far better off with something
like ftp, ssh, rsync - even on a system with pretty good NFS
performance...  NFS is there for convenience, more than large data
transfers, IMO.

You might try firing up a sniffer against the NFS traffic, comparing
linux->linux, linux->solaris, solaris->linux, solaris->solaris.  If one
pairing has a lot more retries than others, then you know to look for a
network problem.
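
Short of a full packet capture, the client's own RPC counters give a quick retry figure. This is a sketch under stated assumptions: it assumes Python 3 and that /proc/net/rpc/nfs contains a line of the form "rpc <calls> <retransmissions> <authrefresh>", which is the layout I believe the Linux NFS client uses; the sample numbers are invented.

```python
def rpc_retrans_ratio(stats_text):
    """Return retransmissions / calls from the 'rpc' line, or None if absent."""
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "rpc":
            calls, retrans = int(fields[1]), int(fields[2])
            return retrans / calls if calls else 0.0
    return None


def read_client_stats(path="/proc/net/rpc/nfs"):
    """Return the client RPC statistics text, or None if the file is missing."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None


# Invented sample: 1500 retransmissions out of 200000 calls.
sample = "net 0 0 0 0\nrpc 200000 1500 0\n"
print(f"sample retransmission ratio: {rpc_retrans_ratio(sample):.2%}")

live = read_client_stats()
if live is not None and rpc_retrans_ratio(live) is not None:
    print(f"this host: {rpc_retrans_ratio(live):.2%}")
```

Sampling the ratio before and after a test copy on each client/server pairing shows which pairing, if any, is retrying noticeably more than the others.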

You might try UDP if you're using TCP now, or TCP if you're using UDP
now.  Theoretically, TCP should be better for long-haul transfers (lots
of router hops), and UDP should be better for local transfers through a
small (or even nil) number of routers.  But we may be surprised.  :)

If you're getting lots of retries, a smaller blocksize may actually
speed things up.  (But check for network problems first)

You might try benchmarking the same data -locally-, without any network
involved, to see to what extent your RAID situation is contributing to
the slowdown you're seeing.
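
A local benchmark along these lines can be scripted so that every storage type is measured the same way. The sketch below assumes Python 3 on a Unix-like host; it builds a small throwaway data set in a temporary directory purely to demonstrate the timing, whereas a real test would point `time_copy` at the actual 683MB data set and a directory on the disk under test.

```python
import os
import shutil
import tempfile
import time


def tree_size_mb(root):
    """Total size of all files under root, in MiB."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total / (1024 * 1024)


def time_copy(src_dir, dst_dir):
    """Copy src_dir to dst_dir (which must not exist) and return elapsed seconds."""
    start = time.perf_counter()
    shutil.copytree(src_dir, dst_dir)
    if hasattr(os, "sync"):
        os.sync()  # flush dirty pages so the timing includes the disk writes
    return time.perf_counter() - start


# Throwaway demo data: 20 files of 64 KiB each.
work = tempfile.mkdtemp()
src = os.path.join(work, "src")
os.mkdir(src)
for i in range(20):
    with open(os.path.join(src, f"file{i}.dat"), "wb") as handle:
        handle.write(os.urandom(64 * 1024))

size = tree_size_mb(src)
elapsed = time_copy(src, os.path.join(work, "dst"))
print(f"copied {size:.2f} MiB in {elapsed:.3f} s")
shutil.rmtree(work)
```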

FUSE might be an interesting thing to try...  To ditch NFS.  :)  I've
never installed a FUSE-based filesystem, though, let alone benchmarked
one.

Some vendor or other is expected to be releasing some sort of NFS proxy.
I believe it probably functions a bit like "NX", of NoMachine fame: i.e.,
it includes protocol-specific smarts to cache suitable chunks of data on
either side of the transmission, and uses a hash table indexed by
cryptographic hashes to see if something similar was already transferred
recently, in which case the data can simply be pulled from a cache.  That
should reduce the "back and forthing" of NFS significantly, and hence give
much better NFS performance.  Unfortunately, the guy who mentioned this was
under an NDA, so I don't know the name of the vendor.  :(
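
The general idea of a cache indexed by cryptographic hashes can be illustrated in a few lines. This is only a toy sketch of the technique, not a description of the unnamed vendor's product: it assumes Python 3, fixed-size chunks, and SHA-256, and the names `ChunkCache` and `transfer` are made up.

```python
import hashlib


class ChunkCache:
    """Receiver-side store of chunks, keyed by their SHA-256 digest."""

    def __init__(self):
        self._chunks = {}

    def put(self, data):
        self._chunks[hashlib.sha256(data).hexdigest()] = data

    def get(self, digest):
        return self._chunks[digest]


def transfer(data, sender_seen, receiver_cache, chunk_size=4096):
    """Simulate one transfer; return (reassembled bytes, payload bytes sent)."""
    out = bytearray()
    sent = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in sender_seen:
            # First time this content is seen: ship the chunk itself.
            receiver_cache.put(chunk)
            sender_seen.add(digest)
            sent += len(chunk)
        # Otherwise only the digest would cross the wire.
        out += receiver_cache.get(digest)
    return bytes(out), sent


payload = b"A" * 4096 + b"B" * 4096
seen, cache = set(), ChunkCache()
first_copy, first_sent = transfer(payload, seen, cache)
second_copy, second_sent = transfer(payload, seen, cache)
print("first transfer sent", first_sent, "bytes; second sent", second_sent)
```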

HTH.
