From mboxrd@z Thu Jan 1 00:00:00 1970
From: "poczta.dotcom.pl"
Subject: nfs performance problem
Date: Tue, 5 Nov 2002 19:03:43 +0100
Message-ID: <005b01c284f5$ae1d6090$640a010a@winda>

Hi All

I have really slow working nfs.
Server is built on:
- dual pentium Xeon 2GHz,
- 2 3ware 7500 controllers,
- raid 5 on 7 160GB disks on controller 1
- raid 5 on 6 160GB disks on controller 2

Server is used for ~500k maildirs,
all clients (~20) are freebsd boxes. When nfs transfer is around 1MB/s
for writing and 1.5MB/s for reading,
listing user directories, even ones with very few files, takes up to
15 seconds.

I tried the nfs-ALL patch, but it didn't help.

I run 256 nfs daemons,

echo 2097152 > /proc/sys/net/core/rmem_max
echo 2097152 > /proc/sys/net/core/rmem_default

nfsstat output:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
25271043   17885061   0          17885061   0
Server nfs v3:
null       getattr    setattr    lookup       access       readlink
25      0% 1159    0% 89718   0% 11702395 46% 10953437 43% 34868   0%
read       write      create     mkdir      symlink    mknod
925727  3% 226250  0% 44964   0% 1089    0% 2299    0% 0       0%
remove     rmdir      rename     link       readdir    readdirplus
35040   0% 42      0% 121064  0% 535     0% 534720  2% 0       0%
fsstat     fsinfo     pathconf   commit
585179  2% 36      0% 0       0% 12493   0%

mount over tcp gives frequent errors on the freebsd client side:
kernel: nfs send error 32 from nfs server xxx.xxx.xxx.xxx:/export

thanks for any help,
rafal
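As a quick sanity check on the nfsstat counters above, the sketch below recomputes two ratios from the posted numbers: the share of RPC calls counted as bad, and the split between lookup/access requests and read/write requests. The dictionary names and the `share` helper are made up for illustration; only the counter values come from the post, and the script says nothing about why the calls are being counted as bad.

```python
# Counter values copied from the posted nfsstat output.
rpc = {"calls": 25271043, "badcalls": 17885061, "badauth": 0,
       "badclnt": 17885061, "xdrcall": 0}

nfs3 = {
    "null": 25, "getattr": 1159, "setattr": 89718, "lookup": 11702395,
    "access": 10953437, "readlink": 34868, "read": 925727, "write": 226250,
    "create": 44964, "mkdir": 1089, "symlink": 2299, "mknod": 0,
    "remove": 35040, "rmdir": 42, "rename": 121064, "link": 535,
    "readdir": 534720, "readdirplus": 0, "fsstat": 585179, "fsinfo": 36,
    "pathconf": 0, "commit": 12493,
}


def share(part, whole):
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


total_ops = sum(nfs3.values())
bad_pct = share(rpc["badcalls"], rpc["calls"])
lookup_access_pct = share(nfs3["lookup"] + nfs3["access"], total_ops)
read_write_pct = share(nfs3["read"] + nfs3["write"], total_ops)

print(f"bad calls:       {bad_pct:.1f}% of RPC calls")
print(f"lookup + access: {lookup_access_pct:.1f}% of NFSv3 operations")
print(f"read + write:    {read_write_pct:.1f}% of NFSv3 operations")
```

On these numbers roughly 70% of calls are flagged as badcalls (all of them badclnt), and lookup plus access account for close to 90% of the NFSv3 operations, which fits a maildir workload dominated by small metadata requests rather than bulk transfer.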
-------------------------------------------------------
This sf.net email is sponsored by: See the NEW Palm Tungsten T handheld.
Power & Color in a compact size!
http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en
_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ragnar Kjørstad
To: "poczta.dotcom.pl"
Cc: nfs@lists.sourceforge.net
Subject: Re: nfs performance problem
Date: Tue, 5 Nov 2002 20:17:59 +0100
Message-ID: <20021105201758.I23227@vestdata.no>

On Tue, Nov 05, 2002 at 07:03:43PM +0100, poczta.dotcom.pl wrote:
> Hi All
>
> I have really slow working nfs.
> Server is built on:
> - dual pentium Xeon 2GHz,
> - 2 3ware 7500 controllers,
> - raid 5 on 7 160GB disks on controller 1
> - raid 5 on 6 160GB disks on controller 2
>
> Server is used for ~500k maildirs,
> all clients (~20) are freebsd boxes. When nfs transfer is around 1MB/s
> for writing and 1.5MB/s for reading,
> listing user directories, even ones with very few files, takes up to
> 15 seconds.
>
> I tried the nfs-ALL patch, but it didn't help.

How is local performance?
What filesystem do you use?
What's the output from "iostat -x -d -k 60 3"?

--
Ragnar Kjørstad
Big Storage

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "poczta.dotcom.pl"
Subject: Re: nfs performance problem
Date: Tue, 5 Nov 2002 20:55:41 +0100
Message-ID: <007901c28505$51e9c240$640a010a@winda>

----- Original Message -----
From: "Ragnar Kjørstad"
To: "poczta.dotcom.pl"
Sent: Tuesday, November 05, 2002 8:17 PM
Subject: Re: [NFS] nfs performance problem

> On Tue, Nov 05, 2002 at 07:03:43PM +0100, poczta.dotcom.pl wrote:
> > Hi All
> >
> > I have really slow working nfs.
> > Server is built on:
> > - dual pentium Xeon 2GHz,
> > - 2 3ware 7500 controllers,
> > - raid 5 on 7 160GB disks on controller 1
> > - raid 5 on 6 160GB disks on controller 2
> >
> > Server is used for ~500k maildirs,
> > all clients (~20) are freebsd boxes. When nfs transfer is around 1MB/s
> > for writing and 1.5MB/s for reading,
> > listing user directories, even ones with very few files, takes up to
> > 15 seconds.
> >
> > I tried the nfs-ALL patch, but it didn't help.
>
> How is local performance?

local read/writes are ok

> What filesystem do you use?

reiserfs on top of lvm (to be able to get snapshots)

> What's the output from "iostat -x -d -k 60 3"?

below is the output from 'iostat -d 60 3' (my iostat does not take the
'-k' option, and '-d' gives me empty output) - maybe this is enough?

Linux 2.4.19 (hostname)    11/05/02

Device:     tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
dev8-0    16.16      205.52       82.84   3481842   1403528
dev8-1   160.58     1266.39     1346.12  21455002  22805744

Device:     tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
dev8-0     9.23      134.93       16.13      8096       968
dev8-1   150.92      753.20     1715.87     45192    102952

Device:     tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
dev8-0    10.02      135.07       26.53      8104      1592
dev8-1   152.78      763.33     1767.20     45800    106032

rafal
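For readers comparing these figures with the 1MB/s to 1.5MB/s seen over NFS, here is a minimal sketch that converts the second iostat sample into KB/s and an average transfer size. It assumes each block is a 512-byte sector, which is how sysstat documents Blk_read/s on 2.4 kernels; the `summarize` function and its field names are illustrative, not part of any tool.

```python
SECTOR_BYTES = 512  # assumption: one iostat "block" is a 512-byte sector


def summarize(tps, blk_read_s, blk_wrtn_s):
    """Convert one iostat row into KB/s and average KB per transfer."""
    read_kb_s = blk_read_s * SECTOR_BYTES / 1024
    write_kb_s = blk_wrtn_s * SECTOR_BYTES / 1024
    total_kb_s = read_kb_s + write_kb_s
    kb_per_transfer = total_kb_s / tps if tps else 0.0
    return {
        "read_kb_s": read_kb_s,
        "write_kb_s": write_kb_s,
        "total_kb_s": total_kb_s,
        "kb_per_transfer": kb_per_transfer,
    }


# Second 60-second sample from the post: (tps, Blk_read/s, Blk_wrtn/s).
samples = {
    "dev8-0": (9.23, 134.93, 16.13),
    "dev8-1": (150.92, 753.20, 1715.87),
}

for device, row in samples.items():
    s = summarize(*row)
    print(f"{device}: {s['total_kb_s']:.0f} KB/s total, "
          f"{s['kb_per_transfer']:.1f} KB per transfer")
```

Under that assumption dev8-1 is moving a little over 1.2 MB/s in transfers averaging about 8 KB, which is consistent with a workload of many small requests rather than streaming I/O.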

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Matt Heaton"
To: "poczta.dotcom.pl"
Subject: Re: nfs performance problem
Date: Tue, 5 Nov 2002 13:22:25 -0700
Message-ID: <098d01c28509$0df86e20$e2a446a6@user1i6avc9gfx>

I have 3ware cards as well (2 7500 series, 8 drives), just like yours.
Also with 120 GB drives (7200 RPM). Whenever I get 150-200 tps in iostat,
my NFS runs SO SLOW. I also only get 1 MB/s to 1.5 MB/s over NFS. The
local speed seems to be ok, not great though.

MY OPINION is that this is because of seek time on the raid array. I am
serving very small files, just like you are. I am requesting about
200-300 files per second from each NFS server. So even though our
throughput of only 1.5 MB/s isn't high, the number of files per second
is actually quite high, and causes things to slow down because of seek
time issues. PLEASE GIVE US CACHEFS SOMEONE??

Does anyone have experience with IDE Raid arrays that get over 250 tps
in iostat that work fine? I would be VERY VERY VERY interested to find
out.

L8r...

Matt

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Benjamin LaHaise
To: Matt Heaton
Cc: "poczta.dotcom.pl", nfs@lists.sourceforge.net
Subject: Re: nfs performance problem
Date: Tue, 5 Nov 2002 15:39:48 -0500
Message-ID: <20021105153948.G3934@redhat.com>

On Tue, Nov 05, 2002 at 01:22:25PM -0700, Matt Heaton wrote:
> each NFS server. So even though our throughput of only 1.5 MB isn't high.
> The number of files per second is
> actually quite high, and causes things to slow down because of seek time
> issues. PLEASE GIVE US CACHEFS SOMEONE??

How is cachefs going to help? The kernel is already trying to cache data
as much as possible. Once you're trying to serve more data than you have
RAM, things are naturally going to degrade quite significantly as the
system becomes seek bound.

> Does anyone have experience with IDE Raid arrays that get over 250 tps in
> iostat that work fine? I would
> be VERY VERY VERY interested to find out.

Use raid1+0 and you'll be much happier, as read requests will be balanced
over multiple drives (mirroring means the same data can be read from all
of the mirrors). Additionally, you'll have much lower CPU utilization
and writes won't cause all disks in the array to seek for stripe updates.
Read the archives for the past couple of weeks for another example of the
performance increase when going from raid5 to raid1+0.

		-ben

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Matt Heaton"
To: "Benjamin LaHaise"
Subject: Re: nfs performance problem
Date: Tue, 5 Nov 2002 13:46:04 -0700
Message-ID: <09b001c2850c$5eeb0ec0$e2a446a6@user1i6avc9gfx>

Cachefs will help quite a lot in my opinion because it doesn't just store
the files in RAM, it uses the hard drive. So if you have an NFS client
with an extra 5 gig that you can designate as cache, then reads to the
NFS server will go down DRAMATICALLY as they will hit the local cache on
the NFS client's drive.

I agree raid 1+0 should be much faster for writes and a little for reads,
but RAID 5 still reads from all drives simultaneously (it has to read
parity in too, I know), so it can read all 7 drives at once instead of
only 4 drives at once in a raid 1+0 configuration with 8 drives in the
array. I have never used 1+0 so I am only talking about physical drive
layout rather than any personal experience. Are my assumptions correct
that raid 5 does in fact read from all drives at the same time? If so,
reading might be a LITTLE faster on raid 1+0 than raid 5, but it
shouldn't be HUGE. When I contacted 3ware, they basically said the same
thing. I do agree that writes are MUCH faster on 1+0 than raid 5.

Any thoughts?

L8r...

Matt

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Lever, Charles"
To: "'Matt Heaton'"
Cc: nfs@lists.sourceforge.net
Subject: RE: nfs performance problem
Date: Tue, 5 Nov 2002 12:56:07 -0800
Message-ID: <6440EA1A6AA1D5118C6900902745938E07D55061@black.eng.netapp.com>

> Cachefs will help quite a lot in my opinion because it doesn't just
> store the files in RAM, it uses the hard drive. So if you have an NFS
> client with an extra 5 gig that you can designate as cache then reads
> to the NFS server will go down DRAMATICALLY as it will hit local cache
> on the NFS clients drive.

yes, that's true, until you consider that without something like
v4 delegation, the client still has to contact the server to
keep its cache up to date.

matt, you really do want a faster server in this case.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Benjamin LaHaise
To: Matt Heaton
Cc: nfs@lists.sourceforge.net
Subject: Re: nfs performance problem
Date: Tue, 5 Nov 2002 16:24:45 -0500
Message-ID: <20021105162445.A10031@redhat.com>

You're wrong. Reads (not sequential, but scattered) on a 1+0 setup will
be faster as the raid1 driver optimizes requests somewhat, plus 2 (or
more for larger raid1s of disks) drives will be able to service requests
for the same stripe offset on different disks. There is no way a raid5
can service two requests for the same stripe offset at different offsets
in the array. And even for a read-heavy workload, there are still writes
to update metadata (and journal unless you've got a separate journal
device), and the impact of those is that *all* reads to the array have
to suffer from seeks when even the smallest write is active.
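A rough way to put numbers on the RAID 5 versus RAID 1+0 argument above is the textbook write-penalty model: a small random write costs about four disk I/Os on RAID 5 (read old data, read old parity, write new data, write new parity) and two on RAID 1+0 (one write per mirror half). The sketch below is only that back-of-envelope model. The per-disk IOPS figure, the disk count, and the 20% write share are hypothetical inputs, not measurements of a 3ware 7500, and the model ignores controller caching and the way in-flight writes delay reads on the same spindles.

```python
RAID5_WRITE_PENALTY = 4   # read data, read parity, write data, write parity
RAID10_WRITE_PENALTY = 2  # one write to each half of the mirror


def effective_iops(n_disks, disk_iops, write_fraction, write_penalty):
    """Estimate host-visible random IOPS for a given read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw_iops = n_disks * disk_iops
    # Each host read costs one disk I/O; each host write costs write_penalty.
    cost_per_host_io = (1.0 - write_fraction) + write_fraction * write_penalty
    return raw_iops / cost_per_host_io


# Hypothetical inputs chosen only to illustrate the model.
DISKS = 8
DISK_IOPS = 80        # assumed random IOPS for one 7200 RPM IDE drive
WRITE_FRACTION = 0.2  # assumed share of host requests that are writes

raid5 = effective_iops(DISKS, DISK_IOPS, WRITE_FRACTION, RAID5_WRITE_PENALTY)
raid10 = effective_iops(DISKS, DISK_IOPS, WRITE_FRACTION, RAID10_WRITE_PENALTY)

print(f"RAID 5:   ~{raid5:.0f} host IOPS")
print(f"RAID 1+0: ~{raid10:.0f} host IOPS")
```

Even with only a fifth of the requests being writes, the model puts RAID 1+0 about a third ahead of RAID 5 on the same spindles, and the gap widens as the write share grows.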

On a side note, make sure you have the filesystem mounted with the
noatime flag if you can afford losing atimes.

		-ben
--
"Do you seek knowledge in time travel?"

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Lever, Charles"
To: nfs@lists.sourceforge.net
Cc: "'Benjamin LaHaise'"
Subject: RE: nfs performance problem
Date: Tue, 5 Nov 2002 14:09:28 -0800
Message-ID: <6440EA1A6AA1D5118C6900902745938E07D55067@black.eng.netapp.com>

> On a side note, make sure you have the filesystem mounted
> with the noatime flag if you can afford losing atimes.

to be scrupulously clear, that's on the server side, not on the
client side. the NFS client ignores the "noatime" mount option,
since file access timestamps are managed on the server.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ragnar Kjørstad
To: "poczta.dotcom.pl"
Cc: nfs@lists.sourceforge.net
Subject: Re: nfs performance problem
Date: Wed, 6 Nov 2002 00:32:37 +0100
Message-ID: <20021106003237.L23227@vestdata.no>

On Tue, Nov 05, 2002 at 08:55:41PM +0100, poczta.dotcom.pl wrote:
> > How is local performance?
>
> local read/writes are ok
>
> > What filesystem do you use?
>
> reiserfs on top of lvm (to be able to get snapshots)

What reiserfs-version? 3.5 or 3.6?
What on-disk format? 3.5 or 3.6?

There should be an entry in the kernel-log from when the filesystem is
mounted.

Reiserfs-3.5 has some known performance-problems related to NFS - that
may be your problem.

> > What's the output from "iostat -x -d -k 60 3"?
> >
> below is output from 'iostat -d 60 3' (- my iostat does not take '-k'
> option,
> and '-d' gives me empty output) - maybe this is enough?

And not -x either?

iostat from sysstat-4.0.3-2 (from RedHat) includes a "-x" option for
extended output. Most importantly, it will tell how much of the time
each device is busy.

It's unclear from your posts if this is an io-related problem or not.
Unless you're running reiserfs-3.5 my guess is that it is the
IO-performance that is the problem. 3ware 7500 controllers have rather
poor performance on RAID5 - especially for writes.

If that's correct, then switching to a different RAID-level or replacing
the 3ware-controller should solve the problem. SCSI- or FC-RAIDs are
easily 10 or 20 times faster than the 3ware RAID for some types of
operations, and if you want something cheaper there are also different
types of IDE-RAIDs.

--
Ragnar Kjørstad
Big Storage

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "myciel"
Subject: Re: nfs performance problem
Date: Wed, 6 Nov 2002 09:59:01 +0100
Message-ID: <003201c28572$c04ea260$640a010a@winda>

----- Original Message -----
From: "Ragnar Kjørstad"
To: "poczta.dotcom.pl"
Sent: Wednesday, November 06, 2002 12:32 AM
Subject: Re: [NFS] nfs performance problem

> On Tue, Nov 05, 2002 at 08:55:41PM +0100, poczta.dotcom.pl wrote:
> > > How is local performance?
> >
> > local read/writes are ok
> >
> > > What filesystem do you use?
> >
> > reiserfs on top of lvm (to be able to get snapshots)
>
> What reiserfs-version? 3.5 or 3.6?
> What on-disk format? 3.5 or 3.6?

3.6.25

> > > What's the output from "iostat -x -d -k 60 3"?
> > >
> > below is output from 'iostat -d 60 3' (- my iostat does not take '-k'
> > option,
> > and '-d' gives me empty output) - maybe this is enough?
>
> And not -x either?

now I have 4.0.6,
iostat -x -k 60 3 outputs only the column titles every 60 seconds, but
the line where the numbers should be is blank. I get the same on another
machine (fresh gentoo linux)

> iostat from sysstat-4.0.3-2 (from RedHat) includes a "-x" option for
> extended output. Most importantly, it will tell how much of the time
> each device is busy.

anyway local writes are ok

> It's unclear from your posts if this is an io-related problem or not.
> Unless you're running reiserfs-3.5 my guess is that it is the
> IO-performance that is the problem. 3ware 7500 controllers have rather
> poor performance on RAID5 - especially for writes.

reiserfs 3.6.25, raid 5,
ok, I can understand raid 5 is not fast but getting below 2Mbytes/s
is really poor :-(

> If that's correct, then switching to a different RAID-level or replacing
> the 3ware-controller should solve the problem. SCSI- or FC-RAIDs are
> easily 10 or 20 times faster than the 3ware RAID for some types of
> operations, and if you want something cheaper there are also different
> types of IDE-RAIDs.

what kind of IDE-RAID would You suggest?

thanks
rafal mycielski
http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?iso-8859-1?Q?Ragnar_Kj=F8rstad?= Subject: Re: nfs performance problem Date: Wed, 6 Nov 2002 11:16:28 +0100 Sender: nfs-admin@lists.sourceforge.net Message-ID: <20021106111628.Q23227@vestdata.no> References: <005b01c284f5$ae1d6090$640a010a@winda> <20021105201758.I23227@vestdata.no> <007901c28505$51e9c240$640a010a@winda> <20021106003237.L23227@vestdata.no> <003201c28572$c04ea260$640a010a@winda> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Cc: nfs@lists.sourceforge.net Return-path: Received: from stine.vestdata.no ([195.204.68.10]) by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian)) id 189NEn-0001XO-00 for ; Wed, 06 Nov 2002 02:16:41 -0800 To: myciel In-Reply-To: <003201c28572$c04ea260$640a010a@winda>; from myciel@dotcom.pl on Wed, Nov 06, 2002 at 09:59:01AM +0100 Errors-To: nfs-admin@lists.sourceforge.net List-Help: List-Post: List-Subscribe: , List-Id: Discussion of NFS under Linux development, interoperability, and testing. List-Unsubscribe: , List-Archive: On Wed, Nov 06, 2002 at 09:59:01AM +0100, myciel wrote: > > > > What filesystem do you use? > > > > > > reiserfs on top of lvm (to be able to get snapshots) > > > > What reiserfs-version? 3.5 or 3.6? > > What on-disk format? 3.5 or 3.6? >=20 > 3.6.25 That's the reiserfs-version. The on-disk format must be either "3.5" or "3.6". There should be a message at mount-time telling you wich one, or you can use "debugreiserfs " to check it. > > Unnless you're running reiserfs-3.5 my guess is that it is the > > IO-performance that is the problem. 3ware 7500 controllers have rathe= r > > poor performance on RAID5 - especially for writes. 
>=20 > reiserfs 3.6.25, raid 5, > ok, I can understand raid 5 is not fast but getting below 2Mbytes/s > is really poor :-( Well, updating a single byte is a very very expensive operation on raid5. I agree that 2MB/s is worse than one should expect though, so there may be something else going on. The fact that you say local writes are faster could indicate that the problem is not only io-related.=20 Maybe there is some packet-loss? That kills performance on nfs. Is there anything in the kernel-log on the clients to indicate the problem? > > If that's correct, then switching to a different RAID-level or replac= ing > > the 3ware-controller should solve the problem. SCSI- or FC- RAIDS are > > easily 10 or 20 times faster than the 3ware RAID for some types of > > operations, and if you want something cheaper there is also different > > types of IDE-RAIDs. >=20 > what kind of IDE-RAID would You suggest? A BigStorage IDE-RAID of course :) I'll get back to you off-list about that. But it's still not clear if your problems are raid-related or network-related. Possible a combination. --=20 Ragnar Kj=F8rstad Big Storage ------------------------------------------------------- This sf.net email is sponsored by: See the NEW Palm Tungsten T handheld. Power & Color in a compact size! 
http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: "myciel" Subject: Re: nfs performance problem Date: Wed, 6 Nov 2002 12:46:58 +0100 Sender: nfs-admin@lists.sourceforge.net Message-ID: <00b401c2858a$36c4bf80$640a010a@winda> References: <005b01c284f5$ae1d6090$640a010a@winda> <20021105201758.I23227@vestdata.no> <007901c28505$51e9c240$640a010a@winda> <20021106003237.L23227@vestdata.no> <003201c28572$c04ea260$640a010a@winda> <20021106111628.Q23227@vestdata.no> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Return-path: Received: from rtr.pul.pl ([195.116.89.98] helo=poczta.dotcom.pl) by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian)) id 189Ojm-0003bt-00 for ; Wed, 06 Nov 2002 03:52:46 -0800 Received: from winda (office.interia.pl [213.76.176.114]) by poczta.dotcom.pl (Postfix) with ESMTP id 3F7D916ADF for ; Wed, 6 Nov 2002 12:49:57 +0100 (CET) To: Errors-To: nfs-admin@lists.sourceforge.net List-Help: List-Post: List-Subscribe: , List-Id: Discussion of NFS under Linux development, interoperability, and testing. List-Unsubscribe: , List-Archive: > On Wed, Nov 06, 2002 at 09:59:01AM +0100, myciel wrote: > > > > > What filesystem do you use? > > > > > > > > reiserfs on top of lvm (to be able to get snapshots) > > > > > > What reiserfs-version? 3.5 or 3.6? > > > What on-disk format? 3.5 or 3.6? > > > > 3.6.25 > > That's the reiserfs-version. > The on-disk format must be either "3.5" or "3.6". > There should be a message at mount-time telling you wich one, or you can > use "debugreiserfs " to check it. > format 3.6 with standard journal > > > Unnless you're running reiserfs-3.5 my guess is that it is the > > > IO-performance that is the problem. 3ware 7500 controllers have rather > > > poor performance on RAID5 - especially for writes. 
> > > > reiserfs 3.6.25, raid 5, > > ok, I can understand raid 5 is not fast but getting below 2Mbytes/s > > is really poor :-( > > Well, updating a single byte is a very very expensive operation on > raid5. I agree that 2MB/s is worse than one should expect though, so > there may be something else going on. > > The fact that you say local writes are faster could indicate that the > problem is not only io-related. > > Maybe there is some packet-loss? I don't see any packet losses - at least as I can see with ping big packets like 16k or 32k >That kills performance on nfs. > Is there anything in the kernel-log on the clients to indicate the > problem? > yesterday evening I switched from udp to tcp, performace is much better but I get a lot of kernel messages in syslog: Nov 6 12:40:45 intler kernel: RPC request reserved 272 but used 276 Nov 6 12:40:50 intler kernel: RPC request reserved 240 but used 244 Nov 6 12:40:54 intler kernel: RPC request reserved 244 but used 248 what does above mean? > > > > what kind of IDE-RAID would You suggest? > > A BigStorage IDE-RAID of course :) > I'll get back to you off-list about that. ok rafal mycielski ------------------------------------------------------- This sf.net email is sponsored by: See the NEW Palm Tungsten T handheld. Power & Color in a compact size! 
http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: pwitting@Cyveillance.com Subject: Re: nfs performance problem Date: Wed, 6 Nov 2002 12:08:59 -0500 Sender: nfs-admin@lists.sourceforge.net Message-ID: Mime-Version: 1.0 Content-Type: text/plain Return-path: Received: from [63.100.163.69] (helo=mercury.cyveillance.com) by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian)) id 189Tg7-0007Be-00 for ; Wed, 06 Nov 2002 09:09:19 -0800 To: nfs@lists.sourceforge.net Errors-To: nfs-admin@lists.sourceforge.net List-Help: List-Post: List-Subscribe: , List-Id: Discussion of NFS under Linux development, interoperability, and testing. List-Unsubscribe: , List-Archive: > From: =?iso-8859-1?Q?Ragnar_Kj=F8rstad?= > On Tue, Nov 05, 2002 at 08:55:41PM +0100, poczta.dotcom.pl wrote: > > > How is local performance? > > local read/writes are ok > > > What filesystem do you use? > > reiserfs on top of lvm (to be able to get snapshots) I'm just going to throw my two cents in here. It sounds like the box is getting a lot of random IO (~500k maildirs) which means he is most likely seeing a LOT of seek issues, and RAID 5 is probably a really bad choice. Cheap controllers and fast RAID 5 just don't go together, there's too much math going on (the parity bit moves, remember) Matt, you really need to trust the folks on the list on this and try it. You'll lose some capacity (drop from 800GB to 480GB per array), but hopefully the performance boost will be worth it. Now, on the plus side, you're already using LVM, so I have a better idea than RAID 1+0 for your situation. Build six RAID 1 arrays, then use LVM to create a single logical volume out of the group. 
The randomness should mean that if your server receives 4 simultaneous read/write requests, they will most likely each get their own "spindle", and there will be no performance-robbing back-and-forth seeking. The requests get handled faster, so there's less chance of the system starting to thrash with seek requests. I usually think of each RAID volume as a "virtual spindle", capable of handling 1 request at a time. While each individual transaction might be slower (no striping to speed things up), there are six lanes to handle the requests. Server thinking versus workstation thinking.

Hopefully you can experiment with this. If this thing is in production already it may be too late.

From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Baker, Byran" Subject: Re: nfs performance problem Date: Thu, 7 Nov 2002 09:19:44 -0600 Sender: nfs-admin@lists.sourceforge.net Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Return-path: Received: from camaro.bmc.com ([198.207.223.231] helo=creeper.bmc.com) by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian)) id 189oS5-0006Du-00 for ; Thu, 07 Nov 2002 07:20:14 -0800 Received: from ec02-hou.bmc.com (localhost [127.0.0.1]) by creeper.bmc.com (8.10.2/8.10.2) with ESMTP id gA7FJkf26852 for ; Thu, 7 Nov 2002 09:19:46 -0600 (CST) To: "'nfs@lists.sourceforge.net'" Errors-To: nfs-admin@lists.sourceforge.net List-Help: List-Post: List-Subscribe: , List-Id: Discussion of NFS under Linux development, interoperability, and testing. List-Unsubscribe: , List-Archive:

We have a couple of NFS servers with 3ware cards. 
One is a 3ware 7850 (8 Maxtor 160GB - 5400RPM) and one is a 3ware 7500-12, both configured with RAID 5 + Hot Spare. I have all the latest NFS patches applied. If I access an NFS mounted directory on one of these systems and copy a bunch of little files (1-2KB), I get 150-250 tps. If I access larger files (90% > 1MB each) I will normally see 25-60MB/s transfer with 500-700 tps, with peaks of over 1000 tps - A good indication that disk seek is to blame for the problem with the small files. Thanks, -Byran From: "Matt Heaton" To: "poczta.dotcom.pl" , Subject: Re: [NFS] nfs performance problem Date: Tue, 5 Nov 2002 13:22:25 -0700 I have 3ware cards as well (2 7500 Series 8 drive), just like yours. ALso with 120 GIG drives (7200) RPM. Whever I get 150-200 tps in iostat then my NFS runs SO SLOW. I also only get 1 MB, to 1.5 MB over NFS. The local speed seems to be ok, not great though. MY OPINION is that this is because of seek time on the raid array. I am serving very small files, just like you are. I am requesting about 200-300 files per second from each NFS server. So even though our throughput of only 1.5 MB isn't high. The number of files per second is actually quite high, and causes things to slow down because of seek time issues. PLEASE GIVE US CACHEFS SOMEONE?? Does anyone have experience with IDE Raid arrays that get over 250 tps in iostat that work fine? I would be VERY VERY VERY interested to find out. L8r... Matt ------------------------------------------------------- This sf.net email is sponsored by: See the NEW Palm Tungsten T handheld. Power & Color in a compact size! 
http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Matt Heaton" Subject: Re: nfs performance problem Date: Thu, 7 Nov 2002 08:49:09 -0700 Sender: nfs-admin@lists.sourceforge.net Message-ID: <08da01c28675$367ea400$6601a8c0@c1886657a> References: Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Return-path: Received: from rwcrmhc52.attbi.com ([216.148.227.88]) by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian)) id 189ouG-0004K9-00 for ; Thu, 07 Nov 2002 07:49:20 -0800 To: "Baker, Byran" , Errors-To: nfs-admin@lists.sourceforge.net List-Help: List-Post: List-Subscribe: , List-Id: Discussion of NFS under Linux development, interoperability, and testing. List-Unsubscribe: , List-Archive: I didn't understand? What is a good indication of poor seek time? The high number for TPS? I thought a high number meant the drive was handling a huge amount of transactions per second, and not necessarily slow. I would LOVE to have my NFS servers push out 500 TPS on iostat. I have 10 webservers that each ask for between 10-25 files per second from our 3ware 7850 controllers. We get about 200 TPS on the NFS server. Much more and the clients become VERY SLUGGISH, and I get blocked R processes in vmstat on the client. Sometimes 20-40 blocked R processes in vmstat. As soon as I offload some of files from a busy NFS server to another NFS server then the clients settle down and everything works great again, but I only get 1-1.5MB out of each NFS server before this happens! I can get really high TPS out of the server LOCALLY when I do copies etc, its just over NFS with a huge amount of really small files that everything stinks. L8r... Matt > We have a couple of NFS servers with 3ware cards. 
One is a 3ware 7850 (8 > Maxtor 160GB - 5400RPM) and one is a 3ware 7500-12, both configured with > RAID 5 + Hot Spare. I have all the latest NFS patches applied. If I access > an NFS mounted directory on one of these systems and copy a bunch of little > files (1-2KB), I get 150-250 tps. If I access larger files (90% > 1MB each) > I will normally see 25-60MB/s transfer with 500-700 tps, with peaks of over > 1000 tps - A good indication that disk seek is to blame for the problem with > the small files. > > Thanks, > -Byran > > From: "Matt Heaton" > To: "poczta.dotcom.pl" , > > Subject: Re: [NFS] nfs performance problem > Date: Tue, 5 Nov 2002 13:22:25 -0700 > > I have 3ware cards as well (2 7500 Series 8 drive), just like yours. ALso > with 120 GIG drives (7200) > RPM. Whever I get 150-200 tps in iostat then my NFS runs SO SLOW. I also > only get 1 MB, to 1.5 MB > over NFS. The local speed seems to be ok, not great though. MY OPINION is > that this is because of seek time on the raid > array. I am serving very small files, just like you are. I am requesting > about 200-300 files per second from > each NFS server. So even though our throughput of only 1.5 MB isn't high. > The number of files per second is > actually quite high, and causes things to slow down because of seek time > issues. PLEASE GIVE US CACHEFS SOMEONE?? > > Does anyone have experience with IDE Raid arrays that get over 250 tps in > iostat that work fine? I would > be VERY VERY VERY interested to find out. > > L8r... > > Matt > > > ------------------------------------------------------- > This sf.net email is sponsored by: See the NEW Palm > Tungsten T handheld. Power & Color in a compact size! 
> http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en > _______________________________________________ > NFS maillist - NFS@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nfs > > ------------------------------------------------------- This sf.net email is sponsored by: See the NEW Palm Tungsten T handheld. Power & Color in a compact size! http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?iso-8859-1?Q?Ragnar_Kj=F8rstad?= Subject: Re: nfs performance problem Date: Thu, 7 Nov 2002 18:32:06 +0100 Sender: nfs-admin@lists.sourceforge.net Message-ID: <20021107183206.P23227@vestdata.no> References: Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Cc: "'nfs@lists.sourceforge.net'" Return-path: Received: from stine.vestdata.no ([195.204.68.10]) by usw-sf-list1.sourceforge.net with esmtp (Exim 3.31-VA-mm2 #1 (Debian)) id 189qVx-0004qv-00 for ; Thu, 07 Nov 2002 09:32:22 -0800 To: "Baker, Byran" In-Reply-To: ; from Byran_Baker@bmc.com on Thu, Nov 07, 2002 at 09:19:44AM -0600 Errors-To: nfs-admin@lists.sourceforge.net List-Help: List-Post: List-Subscribe: , List-Id: Discussion of NFS under Linux development, interoperability, and testing. List-Unsubscribe: , List-Archive: On Thu, Nov 07, 2002 at 09:19:44AM -0600, Baker, Byran wrote: > We have a couple of NFS servers with 3ware cards. One is a 3ware 7850 = (8 > Maxtor 160GB - 5400RPM) and one is a 3ware 7500-12, both configured wit= h > RAID 5 + Hot Spare. I have all the latest NFS patches applied. If I a= ccess > an NFS mounted directory on one of these systems and copy a bunch of li= ttle > files (1-2KB), I get 150-250 tps. 
If I access larger files (90% > 1MB each)
> I will normally see 25-60MB/s transfer with 500-700 tps, with peaks of over
> 1000 tps - A good indication that disk seek is to blame for the problem with
> the small files.

Just to clarify here: I bet when you transfer 1-2KB files you are able to transfer 150-250 files per second, right?

When you transfer larger files, the files-per-second rate goes down, but the tps rate increases because each file is split into multiple scsi requests.

And yes, I would agree with your conclusion that disk seek is to blame.

Are those numbers for reads or writes? I would expect to see the same tendency, but with stronger effect, for writes than reads.

-- 
Ragnar Kjørstad
Big Storage

From mboxrd@z Thu Jan 1 00:00:00 1970 From: "dean hildebrand" Subject: Re: nfs performance problem Date: Fri, 26 Oct 2007 18:35:56 -0700 Message-ID: References: <20071025131029.GH8334@barnabas.schuldei.org> <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com> <20071025193457.GE4499@jakobus.spotify.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: Andreas Schuldei , nfs@lists.sourceforge.net To: "Talpey, Thomas" Return-path: Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Ilaae-0002IA-2i for nfs@lists.sourceforge.net; Fri, 26 Oct 2007 18:35:52 -0700 Received: from wa-out-1112.google.com ([209.85.146.180]) by mail.sourceforge.net with esmtp (Exim 4.44) id 1Ilaaj-0003jU-Gg for nfs@lists.sourceforge.net; Fri, 26 Oct 2007 18:35:57 -0700 Received: by wa-out-1112.google.com with SMTP id 
k22so1345927waf for ; Fri, 26 Oct 2007 18:35:56 -0700 (PDT) In-Reply-To: List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net >... You might consider moving it forward to 2.6.23, > as there is a significant readahead improvement/fix there. >... > Fengguang Wu's changes are merged in 2.6.23, >...Tom, Improving out-of-order nfsd requests is a great idea. Is there a description of Fengguang Wu's improvements or maybe a link to the specific patches. I tried searching through git and such and not sure if I found the right set of patches. Dean > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Splunk Inc. > Still grepping through log files to find problems? Stop. > Now Search log events and configuration files using AJAX and a browser. > Download your FREE copy of Splunk now >> http://get.splunk.com/ > _______________________________________________ > NFS maillist - NFS@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nfs > ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. 
Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andreas Schuldei Subject: Re: nfs performance problem Date: Sat, 27 Oct 2007 11:25:04 +0200 Message-ID: <20071027092504.GC25508@jakobus.schuldei.org> References: <20071025131029.GH8334@barnabas.schuldei.org> <200710251553.52138.bernd-schubert@gmx.de> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: nfs@lists.sourceforge.net To: Bernd Schubert Return-path: Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Ilhv2-0001vt-Gj for nfs@lists.sourceforge.net; Sat, 27 Oct 2007 02:25:24 -0700 Received: from petrus.schuldei.org ([81.27.3.162] helo=barnabas.schuldei.org) by mail.sourceforge.net with esmtp (Exim 4.44) id 1Ilhv6-0007He-QL for nfs@lists.sourceforge.net; Sat, 27 Oct 2007 02:25:30 -0700 In-Reply-To: <200710251553.52138.bernd-schubert@gmx.de> List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net * Bernd Schubert (bernd-schubert@gmx.de) [071027 01:39]: > Hello Andreas, > > On Thursday 25 October 2007 15:10:29 Andreas Schuldei wrote: > > lotta:/var/disks/sda on /var/disks/sda type nfs > > (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44) lotta:/var/disks/sdb > > try to increase rsize and wsize as much as possible, the maximum can be > adjusted in /proc/fs/nfsd/max_block_size on the nfs server. i cant increase this to more then 1M. is that the hard limit? ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? 
Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Talpey, Thomas" Subject: Re: nfs performance problem Date: Mon, 29 Oct 2007 08:59:53 -0400 Message-ID: References: <20071025131029.GH8334@barnabas.schuldei.org> <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com> <20071025193457.GE4499@jakobus.spotify.net> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: nfs@lists.sourceforge.net To: "dean hildebrand" Return-path: Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1ImUEm-0003zW-FI for nfs@lists.sourceforge.net; Mon, 29 Oct 2007 06:01:01 -0700 Received: from mx2.netapp.com ([216.240.18.37]) by mail.sourceforge.net with esmtp (Exim 4.44) id 1ImUEq-0003MS-VY for nfs@lists.sourceforge.net; Mon, 29 Oct 2007 06:01:06 -0700 In-Reply-To: References: <20071025131029.GH8334@barnabas.schuldei.org> <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com> <20071025193457.GE4499@jakobus.spotify.net> List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net At 09:35 PM 10/26/2007, dean hildebrand wrote: >>... You might consider moving it forward to 2.6.23, >> as there is a significant readahead improvement/fix there. >>... >> Fengguang Wu's changes are merged in 2.6.23, >>...Tom, > >Improving out-of-order nfsd requests is a great idea. Is there a >description of Fengguang Wu's improvements or maybe a link to the >specific patches. I tried searching through git and such and not sure >if I found the right set of patches. 
There are quite a few of them. A big one with lots of background info is in mm/readahead.c, commit 122a21d11cbfda6d1e33cbc8ae9e4c4ee2f1886e Check out the "readahead thrashing" results at the end of the log, especially the comment "the more overall read density, the more possible gain", i.e. it's the density and not the actual order which now drives the readahead. Good stuff. Tom. ------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andreas Schuldei Subject: nfs performance problem Date: Thu, 25 Oct 2007 15:10:29 +0200 Message-ID: <20071025131029.GH8334@barnabas.schuldei.org> Mime-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" To: nfs@lists.sourceforge.net Return-path: Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Il2qC-0003TT-NU for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 06:33:41 -0700 Received: from petrus.schuldei.org ([81.27.3.162] helo=barnabas.schuldei.org) by mail.sourceforge.net with esmtp (Exim 4.44) id 1Il2qH-0000hf-MW for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 06:33:46 -0700 Received: from localhost (localhost [127.0.0.1]) by barnabas.schuldei.org (Postfix) with ESMTP id 5576914C199 for ; Thu, 25 Oct 2007 15:10:35 +0200 (CEST) Received: from barnabas.schuldei.org ([127.0.0.1]) by localhost (barnabas.schuldei.org [127.0.0.1]) (amavisd-new, port 10024) with LMTP id WY8c6jto8sYC for ; Thu, 25 Oct 2007 15:10:31 +0200 (CEST) List-Id: "Discussion of NFS under Linux development, interoperability, and 
testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net

Hi!

I need to tune a nfs server and client. on the server we have several Tbyte of ~2 Mbyte files and we need to transfer them read only to the client. latency and throughput are crucial.

What nfs server should i use? i started with the nfs-kernel-server on top of a kernel 2.6.22 on debian on the server side. the client is a debian etch server (2.6.18 kernel) with a 1 Gbit intel network card (e1000 driver). later on we consider two network cards on both machines to transfer 2 Gbit/s. Jumboframes are an option (how much will they help?)

Right now i have only four disks in the server and i get 50 Mbyte/s out of each of them, simultaneously, for real world loads (random reads across the disk), trying to minimize the seeks by reading the files in one go with

for i in a b h i ; do ( find /var/disks/sd$i -type f | xargs -I° dd if=° bs=2M of=/dev/null status=noxfer 2>/dev/null & ) ; done

so with this (4*50 Mbyte/s) i should be able to saturate both network cards.

accessing the disks with apache2-mpm-worker we get ~90 Mbyte/s out of the server, partly with considerable latency, in the order of magnitude of 10 s. I was hoping to get at least the same performance with much better latency with nfs.
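A rough sanity check of the figures in the message above, as a sketch rather than a measurement. It assumes standard Ethernet framing overhead (38 bytes per frame on the wire) and 40 bytes of IPv4 and TCP headers without options; neither number comes from the thread, and the function name `tcp_payload_rate` is made up for illustration.

```python
# Back-of-the-envelope comparison of disk throughput and link capacity.
# Assumptions (not from the thread):
#   - 38 B of Ethernet overhead per frame: 14 B header + 4 B FCS
#     + 8 B preamble + 12 B inter-frame gap
#   - 40 B of IPv4 + TCP headers, no options
ETH_OVERHEAD = 38
IP_TCP_HEADERS = 40
MB = 1e6  # decimal megabyte


def tcp_payload_rate(link_bits_per_s, mtu):
    """Upper bound on TCP payload bytes/s for one link at a given MTU."""
    payload = mtu - IP_TCP_HEADERS   # application bytes carried per frame
    on_wire = mtu + ETH_OVERHEAD     # bytes of wire time consumed per frame
    return link_bits_per_s / 8 * payload / on_wire


# Figures quoted in the message: four disks at 50 Mbyte/s each.
disk_total = 4 * 50 * MB

for mtu in (1500, 9000):
    one_link = tcp_payload_rate(1e9, mtu)
    print(f"MTU {mtu}: {one_link / MB:.1f} MB/s per link, "
          f"{2 * one_link / MB:.1f} MB/s for two links")
print(f"disks: {disk_total / MB:.0f} MB/s aggregate")
```

Under these assumptions one gigabit link carries at most about 119 MB/s of payload at MTU 1500 and about 124 MB/s at MTU 9000, so jumbo frames gain roughly 4% on the wire; any further benefit would have to come from lower per-packet CPU cost, which this sketch does not model. The 4*50 = 200 MB/s from the disks exceeds one link but stays somewhat below two links combined.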
on the server i start 128 nfs servers (RPCNFSDCOUNT=128) and export the disks like this:

/usr/sbin/exportfs -v
/var/disks/sda (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
/var/disks/sdb (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
/var/disks/sdh (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
/var/disks/sdi (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)

on the client i mount them like this:

lotta:/var/disks/sda on /var/disks/sda type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
lotta:/var/disks/sdb on /var/disks/sdb type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
lotta:/var/disks/sdh on /var/disks/sdh type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
lotta:/var/disks/sdi on /var/disks/sdi type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)

but when i then do the same dd again on the client i get a disappointing 60-70 Mbyte/s altogether. from a single disk i get ~25 Mbyte/s on the client side.

i played with some buffers, /proc/sys/net/core/rmem_max and /proc/sys/net/core/rmem_default, and increased them to 256M on the client.

i was suspecting that the nfs server reads the files in too small chunks and tried to help it with

for i in a h i ; do ( echo $((1024*6)) > /sys/block/sd$i/queue/read_ahead_kb ) ; done

to get it to read in the files in one go.

I would hope to at least double the speed. do you have a benchmark tool that can tell me the latency? i tried iozone and tried forcing it to only do read tests and did not get any helpful error or output at all.
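One way the latency question above could be approached, as a minimal sketch and not a replacement for a real benchmark tool: time each file read and report the time until the first block arrives alongside overall throughput. The function names (`time_read`, `percentile`, `summarize`, `collect_files`) and the 2 MiB block size are assumptions chosen for illustration. The sketch does not drop the client page cache, so repeated runs over the same files would likely look faster than the network really is.

```python
import math
import os
import time


def time_read(path, block_size=2 * 1024 * 1024):
    """Read one file sequentially; return (first_block_s, total_s, nbytes)."""
    start = time.perf_counter()
    first = None
    nbytes = 0
    # buffering=0 gives a raw file object, so each read() is one system call.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if first is None:
                first = time.perf_counter() - start  # latency to first data
            if not chunk:
                break
            nbytes += len(chunk)
    return first, time.perf_counter() - start, nbytes


def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(fraction * len(sorted_values)))
    return sorted_values[rank - 1]


def collect_files(root):
    """All regular files below root, similar to `find root -type f`."""
    return [os.path.join(d, name)
            for d, _, names in os.walk(root) for name in names]


def summarize(paths):
    """Per-file first-block latency percentiles plus aggregate throughput."""
    if not paths:
        raise ValueError("no files to read")
    results = [time_read(p) for p in paths]
    first = sorted(r[0] for r in results)
    total_time = sum(r[1] for r in results)
    total_bytes = sum(r[2] for r in results)
    return {
        "files": len(results),
        "bytes": total_bytes,
        "median_first_block_s": percentile(first, 0.5),
        "p99_first_block_s": percentile(first, 0.99),
        "mbyte_per_s": total_bytes / total_time / 1e6 if total_time else 0.0,
    }
```

Calling `summarize(collect_files("/var/disks/sda"))` once on the server and once on the client would give two sets of numbers to compare. The reads here are strictly sequential, one file at a time, so the throughput figure is not comparable to the four parallel dd loops.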
on the server:

nfsstat
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
98188885   0          0          0          0
Server nfs v3:
null         getattr      setattr      lookup       access       readlink
5599      0% 318417    0% 160       0% 132643    0% 227130    0% 0         0%
read         write        create       mkdir        symlink      mknod
97256921 99% 118313    0% 168       0% 0         0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus
162       0% 0         0% 0         0% 0         0% 0         0% 105556    0%
fsstat       fsinfo       pathconf     commit
0         0% 1270      0% 0         0% 7153      0%

cat /proc/net/rpc/nfsd
rc 0 118803 98069945
fh 0 0 0 0 0
io 3253902194 38428672
th 128 10156908 1462.848 365.212 302.100 252.204 311.632 187.508 142.708 142.132 198.168 648.640
ra 256 97097262 0 0 0 0 0 0 0 0 0 64684
net 98188985 16 98188854 5619
rpc 98188885 0 0 0 0
proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
proc3 22 5599 318417 160 132643 227130 0 97256921 118313 168 0 0 0 162 0 0 0 0 105556 0 1270 0 7153
proc4 2 0 0
proc4ops 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

------------------------------------------------------------------------- This SF.net email is sponsored by: Splunk Inc. Still grepping through log files to find problems? Stop. Now Search log events and configuration files using AJAX and a browser. 
Download your FREE copy of Splunk now >> http://get.splunk.com/ _______________________________________________ NFS maillist - NFS@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs From mboxrd@z Thu Jan 1 00:00:00 1970 From: Bernd Schubert Subject: Re: nfs performance problem Date: Thu, 25 Oct 2007 15:53:51 +0200 Message-ID: <200710251553.52138.bernd-schubert@gmx.de> References: <20071025131029.GH8334@barnabas.schuldei.org> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: Andreas Schuldei To: nfs@lists.sourceforge.net Return-path: Received: from sc8-sf-mx2-b.sourceforge.net ([10.3.1.92] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Il39t-0005ZS-GX for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 06:54:01 -0700 Received: from mail.gmx.net ([213.165.64.20]) by mail.sourceforge.net with smtp (Exim 4.44) id 1Il39x-0000dJ-Bf for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 06:54:07 -0700 In-Reply-To: <20071025131029.GH8334@barnabas.schuldei.org> List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net Hello Andreas, On Thursday 25 October 2007 15:10:29 Andreas Schuldei wrote: > lotta:/var/disks/sda on /var/disks/sda type nfs > (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44) lotta:/var/disks/sdb try to increase rsize and wsize as much as possible, the maximum can be adjusted in /proc/fs/nfsd/max_block_size on the nfs server. > > but when i then do the same dd again on the client i get > disappointing 60-70Mbyte/s altogether. from a single disk i get > ~25Mbytes/s on the client side. > > i played with some buffers /proc/sys/net/core/rmem_max and > /proc/sys/net/core/rmem_default and increased them to 256M on the > client. 
>
> i was suspecting that the nfs server reads the files in too small
> chunks and tried to help it with
>
> for i in a h i ; do ( echo $((1024*6)) > /sys/block/sd$i/queue/read_ahead_kb ) ; done

Are your partitions on lvm or md? This wouldn't help then, AFAIK for lvm you
can't do it via /sys, but must always use blockdev.

>
> to get it to read in the files in one go.
>
> I would hope to at least double the speed. do
> you have a benchmark tool that can tell me the latency? i tried
> iozone and tried forcing it to only do read tests and did not get
> any helpful error or output at all.

First of all, what is your *local* performance?

Cheers,
Bernd

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Schuldei
Subject: Re: nfs performance problem
Date: Thu, 25 Oct 2007 16:39:45 +0200
Message-ID: <20071025141913.GD4499@jakobus.spotify.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: nfs@lists.sourceforge.net
To: Bernd Schubert
Return-path:
Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.91] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Il3sW-00020X-FH for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 07:40:08 -0700
Received: from petrus.schuldei.org ([81.27.3.162] helo=barnabas.schuldei.org) by mail.sourceforge.net with esmtp (Exim 4.44) id 1Il3sZ-0003Js-Nt for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 07:40:14 -0700
List-Id: "Discussion of NFS under Linux development, interoperability, and testing."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: nfs-bounces@lists.sourceforge.net
Errors-To: nfs-bounces@lists.sourceforge.net

* Bernd Schubert (bernd-schubert@gmx.de) [071025 15:54]:
> Hello Andreas,
>
> On Thursday 25 October 2007 15:10:29 Andreas Schuldei wrote:
> > lotta:/var/disks/sda on /var/disks/sda type nfs
> > (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44) lotta:/var/disks/sdb
>
> try to increase rsize and wsize as much as possible, the maximum can be
> adjusted in /proc/fs/nfsd/max_block_size on the nfs server.

with a max_block_size and rsize=512k i did not achieve a
noticeable performance impact. increasing it to 2M decreased the
performance to 12-12Mbytes/s per disk.

do i need to adjust more tunables to benefit from the increased
size?

> > i was suspecting that the nfs server reads the files in too small
> > chunks and tried to help it with
> >
> > for i in a h i ; do ( echo $((1024*6)) > /sys/block/sd$i/queue/read_ahead_kb ) ; done
>
> Are your partitions on lvm or md? This wouldn't help then, AFAIK for lvm you
> can't do it via /sys, but must always use blockdev

they are on xfs directly on disk, no lvm or md involved.

> > to get it to read in the files in one go.
> >
> > I would hope to at least double the speed. do
> > you have a benchmark tool that can tell me the latency? i tried
> > iozone and tried forcing it to only do read tests and did not get
> > any helpful error or output at all.
>
> First of all, what is your *local* performance?

i mentioned that i get 50Mbyte/s out of each disk, even when
doing the dd on all four disks at the same time.

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chuck Lever
Subject: Re: nfs performance problem
Date: Thu, 25 Oct 2007 11:25:37 -0400
Message-ID: <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com>
References: <20071025131029.GH8334@barnabas.schuldei.org>
Mime-Version: 1.0 (Apple Message framework v752.2)
Content-Type: text/plain; charset="iso-8859-1"
Cc: nfs@lists.sourceforge.net
To: Andreas Schuldei
Return-path:
Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.91] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Il5sT-0001yw-Dy for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 09:48:13 -0700
Received: from rgminet01.oracle.com ([148.87.113.118]) by mail.sourceforge.net with esmtps (TLSv1:AES256-SHA:256) (Exim 4.44) id 1Il5sV-0002Ci-OV for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 09:48:18 -0700
In-Reply-To: <20071025131029.GH8334@barnabas.schuldei.org>
List-Id: "Discussion of NFS under Linux development, interoperability, and testing."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: nfs-bounces@lists.sourceforge.net
Errors-To: nfs-bounces@lists.sourceforge.net

On Oct 25, 2007, at 9:10 AM, Andreas Schuldei wrote:
> Hi!
>
> I need to tune a nfs server and client. on the server we have
> several Tbyte of ~2Mbyte files and we need to transfer them read
> only to the client. latency and throughput are crucial.
>
> What nfs server should i use? i started with the
> nfs-kernel-server on top of a kernel 2.6.22 on debian on the
> server side. the client is a debian etch server (2.6.18 kernel)
> with 1Gbyte e1000 intel network driver. later on we consider two
> network cards on both machines to transfer 2Gbit/s. Jumboframes
> are an option (how much will they help?)
>
> Right now i have only four disks in the server and i get 50Mbyte
> out of each of them, simultaneously, for real world loads (random
> reads across the disk, trying to minimize the seeks by reading
> the files in one go with
>
> for i in a b h i ; do ( find /var/disks/sd$i -type f | xargs -I° dd if=° bs=2M of=/dev/null status=noxfer 2>/dev/null & ) ; done
>
> so with this (4*50 Mbyte/s) i should be able to saturate both
> network cards.
>
> accessing the disks with apache2-mpm-worker we get ~90Mbyte/s out
> of the server, partly with considerable latency in the order of
> magnitude of 10s.
>
> I was hoping to get at least the same performance with much
> better latency with nfs.

With a single client, you should not expect to get any better
performance than by running the web service on the NFS server. The
advantage of using NFS under a web service is that you can
transparently scale horizontally. When you add a second or third web
server that serves the same file set, you will see an effective
increase in the size of the data cache between your NFS server's
disks and the web servers.

But don't expect to get better data throughput over NFS than you see
on your local NFS server. If anything, the 10s latency you see when
the web server is on the same system with the disks is indicative of
local file system configuration issues.
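
As a side calculation on the figures quoted above, the disk and network numbers can be compared in a few lines of shell. This is only a sketch: the helper name mbyte_to_mbit is made up for illustration, and it assumes 1 Mbyte/s = 8 Mbit/s at nominal line rate, ignoring TCP and RPC overhead.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): convert MByte/s to Mbit/s,
# ignoring all protocol overhead.
mbyte_to_mbit() {
    echo $(( $1 * 8 ))
}

disks=4         # disks in the server, as stated in the thread
per_disk=50     # MByte/s read locally from each disk
nics=2          # planned number of gigabit cards
nic_mbit=1000   # nominal line rate per card in Mbit/s

disk_total=$(( disks * per_disk ))          # aggregate MByte/s off the disks
disk_mbit=$(mbyte_to_mbit "$disk_total")    # the same figure in Mbit/s
link_mbit=$(( nics * nic_mbit ))            # nominal aggregate link rate

echo "disks: ${disk_total} MByte/s = ${disk_mbit} Mbit/s, links: ${link_mbit} Mbit/s"
```

Under these assumptions the four disks deliver about 1600 Mbit/s, which is more than one gigabit card can carry but somewhat less than two cards at their nominal rate.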
> on the server i start 128 nfs servers (RPCNFSDCOUNT=128) and export
> the disks like this:
>
> /usr/sbin/exportfs -v
> /var/disks/sda (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
> /var/disks/sdb (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
> /var/disks/sdh (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
> /var/disks/sdi (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)

On the server, mounting the web data file systems with "noatime" may
help reduce the number of seeks on the disks.

Also, the "async" export option won't have any effect on reads.

Check your block device configuration as well. You may find that
varying the RAID configuration, file system type (ext3 v. xfs) and
stripe/chunk size could impact your server's performance. You might
find that the deadline disk scheduler performs a little better than
the default cfq scheduler.

It goes without saying that you should make sure your disk subsystem
is healthy. I've found that, for example, SATA drives in hot-swap
enclosures are sometimes affected by silent SATA transport errors
that result in slow performance. Check dmesg carefully to ensure you
are getting the highest possible speed settings. If only one of your
drives is running significantly slower than the others, it will have
a significant impact on the performance of a RAID group.
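
The scheduler check mentioned above could look roughly like the following. It is a minimal sketch that assumes the usual sysfs layout, where /sys/block/<dev>/queue/scheduler lists the available schedulers with the active one in square brackets; the helper name active_scheduler is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the active scheduler from a line such as
# "noop anticipatory deadline [cfq]" (the active entry is in brackets).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Device letters follow the thread (sda, sdb, sdh, sdi); adjust as needed.
for dev in sda sdb sdh sdi; do
    f=/sys/block/$dev/queue/scheduler
    if [ -r "$f" ]; then
        echo "$dev: $(active_scheduler "$(cat "$f")")"
        # To try the deadline scheduler (as root): echo deadline > "$f"
    fi
done
```

The loop only reads the current setting; the commented-out write is left to the administrator because it changes the running system.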
> on the client i mount them like this:
>
> lotta:/var/disks/sda on /var/disks/sda type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
> lotta:/var/disks/sdb on /var/disks/sdb type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
> lotta:/var/disks/sdh on /var/disks/sdh type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
> lotta:/var/disks/sdi on /var/disks/sdi type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)

There are some client-side mount options that might also help. Using
"nocto" and "actimeo=7200" could reduce synchronous NFS protocol
overhead. I also notice a significant amount of readdirplus
traffic. Readdirplus requests are fairly heavyweight, and in this
scenario may be unneeded overhead. Your client might support the
recently added "nordirplus" mount option, which could be helpful.

I wonder if "rsize=32k" is supported - you might want "rsize=32768"
instead. Or better, let the client and server negotiate the maximum
that each supports automatically by leaving this option off. You can
check what options are in effect on each NFS mount point by looking
in /proc/self/mountstats on the client.

Enabling jumbo frames between your NFS server and client will help.
Depending on your NIC, though, it may introduce some instability
(driver and hardware mileage may vary).

Since you currently have only one client, you might consider running
the client and server back-to-back (ie replace any hub or switch with
a simple cross-over link) to eliminate extra network overhead.
Getting a high performance switch when you add more clients is key to
making this configuration scale well -- a $99 special won't cut it.

> but when i then do the same dd again on the client i get
> disappointing 60-70Mbyte/s altogether. from a single disk i get
> ~25Mbytes/s on the client side.

25MB/s is fairly typical for Linux NFS servers.
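
Returning to the mount options discussed earlier in this reply: the options the client actually negotiated can be read from the "opts:" lines in /proc/self/mountstats. The following is a minimal sketch; the helper name mount_opt is made up for illustration, and the exact layout of the opts: line is assumed to be a comma-separated list of options.

```shell
#!/bin/sh
# Hypothetical helper: extract one key=value option from a comma-separated
# NFS option string, e.g. mount_opt rsize "rw,vers=3,rsize=1048576".
mount_opt() {
    printf '%s\n' "$2" | tr ',' '\n' | sed -n "s/^$1=//p"
}

# Print the effective rsize for every "opts:" line the client reports.
if [ -r /proc/self/mountstats ]; then
    sed -n 's/^[[:space:]]*opts:[[:space:]]*//p' /proc/self/mountstats |
    while read -r opts; do
        echo "rsize=$(mount_opt rsize "$opts")"
    done
fi
```

If the printed value is smaller than what was requested at mount time, the client and server settled on a lower maximum.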
> i played with some buffers /proc/sys/net/core/rmem_max and
> /proc/sys/net/core/rmem_default and increased them to 256M on the
> client.

You should consider similar network tuning on the server. Use a
network benchmarking tool like iperf to assist.

> i was suspecting that the nfs server reads the files in too small
> chunks and tried to help it with
>
> for i in a h i ; do ( echo $((1024*6)) > /sys/block/sd$i/queue/read_ahead_kb ) ; done
>
> to get it to read in the files in one go.

Insufficient read-ahead on your server may be an issue here. Read
traffic from the client often arrives at the server out of order,
preventing the server from cleanly detecting sequential reads. I
believe there was a recent change to the NFS server that addresses
this issue.

> I would hope to at least double the speed.

IMO you can do that only by adding more clients.

> do
> you have a benchmark tool that can tell me the latency? i tried
> iozone and tried forcing it to only do read tests and did not get
> any helpful error or output at all.

Use "iozone -a -i 1" to run read tests. You can narrow the test down
to 2MB sequential reads if you want. Take a look at "iozone -h"
output for more details.
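
A narrowed-down iozone run along these lines might be put together as follows. This is a sketch, not a tested recipe: it only prints the command instead of running it, the function name and the test file path are made up, and it assumes iozone's -i (test selection), -r (record size), -s (file size) and -f (test file) options as listed by "iozone -h".

```shell
#!/bin/sh
# Sketch only: print (do not run) an iozone command line for a read test.
iozone_read_cmd() {
    file=$1     # path of the test file (hypothetical)
    record=$2   # record size, e.g. 2m
    size=$3     # file size, e.g. 2g
    # Test 0 (write) creates the file that test 1 (read/re-read) then reads.
    echo "iozone -i 0 -i 1 -r $record -s $size -f $file"
}

iozone_read_cmd /mnt/test/iozone.tmp 2m 2g
```

Because the write pass needs a writable location, on a read-only NFS mount such a run would have to point at a separate read-write export or be done locally on the server.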
> on the server:
>
> nfsstat
> Server rpc stats:
> calls      badcalls   badauth    badclnt    xdrcall
> 98188885   0          0          0          0
>
> Server nfs v3:
> null         getattr      setattr      lookup       access       readlink
> 5599      0% 318417    0% 160       0% 132643    0% 227130    0% 0         0%
> read         write        create       mkdir        symlink      mknod
> 97256921 99% 118313    0% 168       0% 0         0% 0         0% 0         0%
> remove       rmdir        rename       link         readdir      readdirplus
> 162       0% 0         0% 0         0% 0         0% 0         0% 105556    0%
> fsstat       fsinfo       pathconf     commit
> 0         0% 1270      0% 0         0% 7153      0%
>
>
>
> cat /proc/net/rpc/nfsd
> rc 0 118803 98069945
> fh 0 0 0 0 0
> io 3253902194 38428672
> th 128 10156908 1462.848 365.212 302.100 252.204 311.632 187.508 142.708 142.132 198.168 648.640
> ra 256 97097262 0 0 0 0 0 0 0 0 0 64684
> net 98188985 16 98188854 5619
> rpc 98188885 0 0 0 0
> proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> proc3 22 5599 318417 160 132643 227130 0 97256921 118313 168 0 0 0 162 0 0 0 0 105556 0 1270 0 7153
> proc4 2 0 0
> proc4ops 40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Schuldei
Subject: Re: nfs performance problem
Date: Thu, 25 Oct 2007 21:34:57 +0200
Message-ID: <20071025193457.GE4499@jakobus.spotify.net>
References: <20071025131029.GH8334@barnabas.schuldei.org> <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Cc: nfs@lists.sourceforge.net
To: Chuck Lever
Return-path:
Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.91] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1Il8U0-0002YG-IW for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 12:35:08 -0700
Received: from petrus.schuldei.org ([81.27.3.162] helo=barnabas.schuldei.org) by mail.sourceforge.net with esmtp (Exim 4.44) id 1Il8U1-0001i1-Vu for nfs@lists.sourceforge.net; Thu, 25 Oct 2007 12:35:12 -0700
In-Reply-To: <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com>
List-Id: "Discussion of NFS under Linux development, interoperability, and testing."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: nfs-bounces@lists.sourceforge.net
Errors-To: nfs-bounces@lists.sourceforge.net

* Chuck Lever (chuck.lever@oracle.com) [071025 20:25]:
> On Oct 25, 2007, at 9:10 AM, Andreas Schuldei wrote:
> >Hi!
> >
> >I need to tune a nfs server and client. on the server we have
> >several Tbyte of ~2Mbyte files and we need to transfer them read
> >only to the client. latency and throughput are crucial.

Because i have Tbytes of data but only a few Gbytes of RAM my
cache hits are rather unlikely.
> >Right now i have only four disks in the server and i get 50Mbyte
> >out of each of them, simultaneously, for real world loads (random
> >reads across the disk, trying to minimize the seeks by reading
> >the files in one go with
> >
> >for i in a b h i ; do ( find /var/disks/sd$i -type f | xargs -I° dd if=° bs=2M of=/dev/null status=noxfer 2>/dev/null & ) ; done
> >
> >so with this (4*50 Mbyte/s) i should be able to saturate both
> >network cards.

note that this is my server's disk io performance.

> With a single client, you should not expect to get any better performance than by running the web service on the NFS
> server. The advantage of using NFS under a web service is that you can transparently scale horizontally. When you add
> a second or third web server that serves the same file set, you will see an effective increase in the size of the data
> cache between your NFS server's disks and the web servers.

Not with terabytes of data and a distributed access pattern.
Certainly i will have some cache hits but not enough to be able
to serve considerable amounts out of RAM.

> But don't expect to get better data throughput over NFS than you see on your local NFS server.

That is exactly the point. on my server i get 4*50Mbytes =
200Mbyte/s out of the disks (with the above FOR loop around the
find and dd) and when i export on the same server the disks to an
nfs client i all of a sudden lose ~75% of the performance.

> If anything, the 10s
> latency you see when the web server is on the same system with the disks is indicative of local file system
> configuration issues.

how can i measure the latency on the local machine? i would be
very interested in seeing how it behaves latency wise.
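
One crude way to put a number on local read latency is to time whole-file reads with dd. The sketch below assumes GNU date (for the %N nanosecond field) and GNU dd; the function name read_ms is made up, and the default directory is taken from the thread only as an example.

```shell
#!/bin/sh
# Sketch only: time how long it takes to read one file start to finish.
read_ms() {
    start=$(date +%s%N)                      # nanoseconds since the epoch
    dd if="$1" of=/dev/null bs=2M 2>/dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))      # elapsed wall-clock milliseconds
}

# Example: time the first few files under a directory (path is illustrative).
dir=${1:-/var/disks/sda}
if [ -d "$dir" ]; then
    find "$dir" -type f | head -n 5 | while read -r f; do
        echo "$(read_ms "$f") ms  $f"
    done
fi
```

Files that were read recently may be served from the page cache, so a second pass over the same files mostly measures memory speed rather than the disks.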
> >on the server i start 128 nfs servers (RPCNFSDCOUNT=128) and export
> >the disks like this:
> >
> >/usr/sbin/exportfs -v
> >/var/disks/sda (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
> >/var/disks/sdb (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
> >/var/disks/sdh (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
> >/var/disks/sdi (ro,async,wdelay,root_squash,no_subtree_check,anonuid=65534,anongid=65534)
>
> On the server, mounting the web data file systems with "noatime" may help reduce the number of seeks on the disks.

yes, we do that already.

> >on the client i mount them like this:
> >
> >lotta:/var/disks/sda on /var/disks/sda type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
> >lotta:/var/disks/sdb on /var/disks/sdb type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
> >lotta:/var/disks/sdh on /var/disks/sdh type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
> >lotta:/var/disks/sdi on /var/disks/sdi type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44)
>
> There are some client-side mount options that might also help. Using "nocto" and "actimeo=7200" could reduce
> synchronous NFS protocol overhead. I also notice a significant amount of readdirplus traffic. Readdirplus requests are
> fairly heavyweight, and in this scenario may be unneeded overhead. Your client might support the recently added
> "nordirplus" mount option, which could be helpful.
>
> I wonder if "rsize=32k" is supported - you might want "rsize=32768" instead.

i think that gave an effect. now i am in the 90-100Mbyte/s
ballpark and might hit the one-nic (1gbit) bottleneck.

> Or better, let the client and server
> negotiate the maximum that each supports automatically by leaving this option off. You can check what options are in
> effect on each NFS mount point by looking in /proc/self/mountstats on the client.

there it says now, after i specified rsize=2097152:
opts: rw,vers=3,rsize=1048576,wsize=1048576,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,intr,nolock,proto=tcp,timeo=600,retrans=2,sec=sys

i am surprised that it did not protest when it could not parse
the "k". note that it only took 1M chunks. how come?

> Enabling jumbo frames between your NFS server and client will help. Depending on your NIC, though, it may introduce
> some instability (driver and hardware mileage may vary).

i will test that and bonding two nics.

> Insufficient read-ahead on your server may be an issue here. Read traffic from the client often arrives at the server
> out of order, preventing the server from cleanly detecting sequential reads. I believe there was a recent change to
> the NFS server that addresses this issue.

when did that go in? do i need to activate that somehow?

how can i measure the latency on a loaded server? both locally
and over nfs?

/andreas

From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chuck Lever
Subject: Re: nfs performance problem
Date: Fri, 26 Oct 2007 10:18:29 -0400
Message-ID: <4721F735.8050503@oracle.com>
References: <20071025131029.GH8334@barnabas.schuldei.org> <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com> <20071025193457.GE4499@jakobus.spotify.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------030500060600050402070704"
Cc: nfs@lists.sourceforge.net
To: Andreas Schuldei
Return-path:
Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.91] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1IlQ9m-00020v-Ox for nfs@lists.sourceforge.net; Fri, 26 Oct 2007 07:27:26 -0700
Received: from rgminet01.oracle.com ([148.87.113.118]) by mail.sourceforge.net with esmtps (TLSv1:AES256-SHA:256) (Exim 4.44) id 1IlQ9q-00027R-Jl for nfs@lists.sourceforge.net; Fri, 26 Oct 2007 07:27:32 -0700
In-Reply-To: <20071025193457.GE4499@jakobus.spotify.net>
List-Id: "Discussion of NFS under Linux development, interoperability, and testing."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: nfs-bounces@lists.sourceforge.net
Errors-To: nfs-bounces@lists.sourceforge.net

This is a multi-part message in MIME format.
--------------030500060600050402070704
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Andreas Schuldei wrote:
> * Chuck Lever (chuck.lever@oracle.com) [071025 20:25]:
>> On Oct 25, 2007, at 9:10 AM, Andreas Schuldei wrote:
>> With a single client, you should not expect to get any better performance than by running the web service on the NFS
>> > server. The advantage of using NFS under a web service is that you can transparently scale horizontally.
When you add >> > a second or third web server that serves the same file set, you will see an effective increase in the size of the data >> > cache between your NFS server's disks and the web servers. > > Not with terabyte of data and a distributed access pattern. > Certainly i will have some cache hits but not enough to be able > to serv considerable amounts out of RAM. If you don't intend to scale horizontally by adding more clients, then inserting NFS between your disks and your web service is not recommended unless you have security requirement that needs to be solved by administering your data separate from running the web service. >> If anything, the 10s >> latency you see when the web server is on the same system with the disks is indicative of local file system >> configuration issues. > > how can i measure the latency on the local machine? i would be > very interested in seeing how it behaves latency wise. The iostat command has options for displaying disk service time. >>> on the client i mount them like this: >>> >>> lotta:/var/disks/sda on /var/disks/sda type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44) >>> lotta:/var/disks/sdb on /var/disks/sdb type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44) >>> lotta:/var/disks/sdh on /var/disks/sdh type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44) >>> lotta:/var/disks/sdi on /var/disks/sdi type nfs (ro,hard,intr,proto=tcp,rsize=32k,addr=217.213.5.44) >> There are some client-side mount options that might also help. Using "nocto" and "actimeo=7200" could reduce >> synchonous NFS protocol overhead. I also notice a significant amount of readdirplus traffic. Readdirplus requests are >> fairly heavyweight, and in this scenario may be unneeded overhead. Your client might support the recently added >> "nordirplus" mount option, which could be helpful. >> >> I wonder if "rsize=32k" is supported - you might want "rsize=32768" instead. > > i think that gave an effect. 
now i am in the 90-100Mbyte/s > ballpark and might hit the one-nic (1gbit) bottleneck. > >> Or better, let the client and server >> negotiate the maximum that each supports automatically by leaving this option off. You can check what options are in >> effect on each NFS mount point by looking in /proc/self/mountstats on the client. > > there it says now, after i specified rsize=2097152: > opts: rw,vers=3,rsize=1048576,wsize=1048576,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,hard,intr,nolock,proto=tcp,timeo=600,retrans=2,sec=sys > > i am surprised that it did not protest when it could not parse > the "k". note that it it only took 1M chunks. how come? 1MB is the maximum that both your server and client support. Again, if you don't specify rsize at all, the maximum both support will be negotiated automatically. >> Insufficient read-ahead on your server may be an issue here. Read traffic from the client often arrives at the server >> out of order, preventing the server from cleanly detecting sequential reads. I believe there was a recent change to >> the NFS server that addresses this issue. > > when did that go in? do i need to activate that somehow? I don't think activation is needed. Someone else on the list can speak to when it was included in recent kernels or in distributions. However, since you are already close to your network limits, read-ahead is probably not an issue for you. > how can i measure the latency on a loaded server? both locally > and over nfs? The iostat command, as mentioned above, will report on local disk activity. NFS client activity has not been integrated into iostat, but there are some NFS client metrics available in /proc/self/mountstats. I have a pair of Python tools that can extract and display RPC request latency information. 
See:

http://oss.oracle.com/~cel/linux-2.6/2.6.21/iostat-ms

and

http://oss.oracle.com/~cel/linux-2.6/2.6.21/mountstats

--------------030500060600050402070704
Content-Type: text/x-vcard; charset=utf-8; name="chuck.lever.vcf"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="chuck.lever.vcf"

begin:vcard
fn:Chuck Lever
n:Lever;Chuck
org:Oracle Corporation;Corporate Architecture, Linux Projects Group
email;internet:chuck.lever@oracle.com
title:Principal Member of Staff
tel;work:+1 248 614 5091
x-mozilla-html:FALSE
version:2.1
end:vcard

--------------030500060600050402070704--

From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Talpey, Thomas"
Subject: Re: nfs performance problem
Date: Fri, 26 Oct 2007 13:01:37 -0400
Message-ID:
References: <20071025131029.GH8334@barnabas.schuldei.org> <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com> <20071025193457.GE4499@jakobus.spotify.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: nfs@lists.sourceforge.net
To: Andreas Schuldei
Return-path:
Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.91] helo=mail.sourceforge.net) by sc8-sf-list2-new.sourceforge.net with esmtp (Exim 4.43) id 1IlSbU-0004X4-IU for nfs@lists.sourceforge.net;
Fri, 26 Oct 2007 10:04:12 -0700 Received: from mx2.netapp.com ([216.240.18.37]) by mail.sourceforge.net with esmtp (Exim 4.44) id 1IlSbZ-0002nh-Tu for nfs@lists.sourceforge.net; Fri, 26 Oct 2007 10:04:18 -0700 In-Reply-To: <20071025193457.GE4499@jakobus.spotify.net> References: <20071025131029.GH8334@barnabas.schuldei.org> <7B68ECC3-7EBA-442F-9FFD-A0E3F2DCC61A@oracle.com> <20071025193457.GE4499@jakobus.spotify.net> List-Id: "Discussion of NFS under Linux development, interoperability, and testing." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: nfs-bounces@lists.sourceforge.net Errors-To: nfs-bounces@lists.sourceforge.net At 03:34 PM 10/25/2007, Andreas Schuldei wrote: >> But don't expect to get better data throughput over NFS than you see >on your local NFS server. > >That is exactly the point. on my server i get 4*50Mbytes = >200Mbyte/s out of the disks (with the above FOR loop around the >find and dd) and when i export on the same server the disks to an >nfs client i all of a sudden loose ~75% of the performance. Andreas - you mentioned earlier in the thread that you're using 2.6.22 as the server kernel. You might consider moving it forward to 2.6.23, as there is a significant readahead improvement/fix there. When you run a local read, all readahead is perfect. However on older kernels serving NFS, the reordering of reads due to Linux's multithreaded nfsd's triggers several issues which spook the readahead into truly awful behavior. Fengguang Wu's changes are merged in 2.6.23, in testing with NFS/RDMA we now see the same performance over the wire as locally, for any number of readers. My one caveat to this is that your files are relatively small, so I would not expect a truly dramatic speedup from it. It's certainly worth trying, there are other improvements in 2.6.23. Do you have Infiniband or iWARP adapters in your environment? If you're worried about bandwidth and application startup latency, using it could certainly help. 
You'd need to run a 2.6.24-rc1 client though, and some assembly is still required to use the server (it's not yet upstream).

Tom.