Re: Millions of files and directory caching.

From: Chris Dos <chris@chrisdos.com>
To: Matt Heaton <admin@0catch.com>
Cc: nfs@lists.sourceforge.net
Subject: Re: Millions of files and directory caching.
Date: Mon, 21 Oct 2002 11:44:28 -0600
Message-ID: <3DB43CFC.706@chrisdos.com>
In-Reply-To: 086801c278c7$fd7e19c0$6801a8c0@c1886657a

Man, your setup seems extremely close to mine, and it's even for the same 
type of business.  Let me give you a rundown on what I have, and what 
I've been doing.

We had an EMC Symmetrix SAN/NAS that held 5.7 TB worth of disk.  We were 
only using about 550 GB of it, so the decision was to move away from the 
EMC because of ongoing support cost and complexity issues.  The EMC 
was working in a NAS configuration serving files via NFS, and it was 
also sharing some of its disk to a Sun 420R which was then serving 
files via NFS.

The clients are a mix of Solaris 2.6/7.0/8.0 and Redhat 7.2 with the 
stock kernel.  There are three Redhat 7.2 servers (two are updated and 
running the 2.4.18-17 kernel, the other is running the 2.4.7 kernel) 
that serve mail, two Solaris servers that serve web, and one Oracle 
server that did have external disk to the EMC.  The clients are 
connected to the switch at 100BT FD.  The clients use the following 
mounting options:
udp,bg,intr,hard,rsize=8192,wsize=8192

The server built to replace the EMC is built as follows:
Hardware:	Tyan 2462 motherboard with five 64 bit slots
		Dual AMD 2100+ MP Processors
		2 GB PC 2100 RAM
		Two 3Ware 64 bit 7850 8 port RAID cards
		16 Maxtor 160 GB Hard Drives
		3 Netgear 64 bit 621 Gigabit cards
Software:
		Redhat Linux 7.3 (All patches applied)
		Custom 2.4.19 Kernel (I also rolled one using the
		nfs-all patch and the tcp patch, but Oracle didn't
		like starting its database when mounted to it.
		Don't know why.)
Config and Stats:
This server was configured using each 3ware RAID card in a RAID 5, and 
then mirrored via software RAID to the other 3ware RAID card.  This gave 
us 1.2 TB of usable space.  I've moved the data off the EMC to this 
server, and I had to move the Oracle server's database to this 
server as well and export it via NFS.  (The Oracle server, a Sun 420R, 
can only hold two drives.)  The server is connected to the network via 
Gigabit FD.
I'm running 256 NFS threads, and even then, that doesn't seem like 
enough according to the output of /proc/net/rpc/nfsd.  What's the limit 
on the maximum number of threads I can run?
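
A minimal sketch of how the "th" line from /proc/net/rpc/nfsd can be read, 
using the values quoted in this message.  It assumes the 2.4-era layout: 
thread count, number of times all threads were busy at once, then ten 
histogram buckets giving the seconds spent with 0-10% ... 90-100% of the 
threads busy.  That layout is an assumption on the editor's part, not 
something stated in the message, and the function name is made up for 
illustration.

```python
# Sketch: interpret the "th" line of /proc/net/rpc/nfsd.
# ASSUMPTION: 2.4-era field layout (threads, all-busy count, 10 histogram buckets).
TH_LINE = ("th 256 19536 1012.250 787.570 875.870 1350.400 840.930 "
           "220.190 96.470 50.970 20.000 229.690")


def parse_th(line):
    """Return (threads, times_all_busy, histogram_seconds) from a 'th' line."""
    fields = line.split()
    if not fields or fields[0] != "th":
        raise ValueError("not a 'th' line")
    threads = int(fields[1])
    times_all_busy = int(fields[2])
    histogram = [float(x) for x in fields[3:]]
    return threads, times_all_busy, histogram


threads, all_busy, histogram = parse_th(TH_LINE)
total_seconds = sum(histogram)
# Fraction of the recorded time spent in the busiest (90-100%) bucket.
top_bucket_share = histogram[-1] / total_seconds

print(f"threads configured:          {threads}")
print(f"times all threads were busy: {all_busy}")
print(f"share of time in top bucket: {top_bucket_share:.1%}")
```

If that reading of the format is right, a non-zero second field means the 
thread pool was fully busy at least that many times, which would be 
consistent with 256 threads not being enough.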

Output of /proc/net/rpc/nfsd:
rc 21716 821664 85468545
fh 233099 88275136 0 233232 6591060
io 2359266294 1934417182
th 256 19536 1012.250 787.570 875.870 1350.400 840.930 220.190 96.470 
50.970 20.000 229.690
ra 512 4558260 16517 7803 5421 4228 3293 2686 1808 1279 1270 923900
net 86311942 86311942 0 0
rpc 86311925 17 16 1 0
proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
proc3 22 100 47898973 4984 2138510 29020056 120 5542814 390404 117607 
130 0 9 163535 8 103856 62847 516154 205343 17118 772 60 128525
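
The proc3 counters above can be broken down per operation to see what 
the clients are actually asking for.  The sketch below labels the 22 
values with the NFSv3 procedure names in RFC 1813 order; it assumes the 
kernel prints the counters in procedure-number order, and the helper 
name is hypothetical.

```python
# Sketch: label the proc3 counters with NFSv3 procedure names (RFC 1813 order).
# ASSUMPTION: the counters appear in procedure-number order.
NFS3_PROCS = [
    "null", "getattr", "setattr", "lookup", "access", "readlink",
    "read", "write", "create", "mkdir", "symlink", "mknod",
    "remove", "rmdir", "rename", "link", "readdir", "readdirplus",
    "fsstat", "fsinfo", "pathconf", "commit",
]

PROC3_LINE = (
    "proc3 22 100 47898973 4984 2138510 29020056 120 5542814 390404 117607 "
    "130 0 9 163535 8 103856 62847 516154 205343 17118 772 60 128525"
)


def parse_proc3(line):
    """Return {procedure_name: call_count} from a 'proc3' line."""
    fields = line.split()
    if not fields or fields[0] != "proc3":
        raise ValueError("not a 'proc3' line")
    count = int(fields[1])
    values = [int(x) for x in fields[2:2 + count]]
    if len(values) != len(NFS3_PROCS):
        raise ValueError("unexpected number of proc3 counters")
    return dict(zip(NFS3_PROCS, values))


calls = parse_proc3(PROC3_LINE)
total_calls = sum(calls.values())
metadata_share = (calls["getattr"] + calls["access"] + calls["lookup"]) / total_calls

# Print operations from most to least frequent.
for name, n in sorted(calls.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {n:10d} {n / total_calls:6.1%}")
```

On these numbers the per-procedure counts add up to the call count on the 
rpc line, and getattr, access and lookup together make up a bit over 90% 
of all calls, which fits a workload dominated by attribute checks on many 
small files.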

All the exports from the server have these options: rw,no_root_squash,sync
And I'm passing these options to the kernel:
/bin/echo 2097152 > /proc/sys/net/core/rmem_default
/bin/echo 2097152 > /proc/sys/net/core/rmem_max
/bin/echo "65536"  > /proc/sys/fs/file-max
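
Values written under /proc/sys this way are lost at reboot unless a boot 
script repeats them.  As a sketch, the same settings can be expressed as 
/etc/sysctl.conf entries, since a sysctl key is the path below /proc/sys 
with the slashes replaced by dots.  The helper names here are hypothetical.

```python
# Sketch: turn the /proc/sys paths used above into sysctl.conf style entries.
SETTINGS = {
    "/proc/sys/net/core/rmem_default": 2097152,
    "/proc/sys/net/core/rmem_max": 2097152,
    "/proc/sys/fs/file-max": 65536,
}


def proc_path_to_sysctl_key(path):
    """Map /proc/sys/a/b/c to the sysctl key a.b.c."""
    prefix = "/proc/sys/"
    if not path.startswith(prefix):
        raise ValueError(f"not under {prefix}: {path}")
    return path[len(prefix):].replace("/", ".")


def sysctl_conf_lines(settings):
    """Return 'key = value' lines suitable for /etc/sysctl.conf."""
    return [f"{proc_path_to_sysctl_key(p)} = {v}" for p, v in settings.items()]


conf_lines = sysctl_conf_lines(SETTINGS)
for line in conf_lines:
    print(line)
```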

And even with all of this, I'm having issues with this box.  The client 
NFS mounts are extremely slow.  So slow that some services time out and 
the daemon stops altogether.  This is a very bad thing.  So I've 
pulled my hair out, beat my head against the wall, and contemplated 
using a sledgehammer to finish this slow painful death.  I've tried 
connecting the server via 100 FD instead of Gig to see if that would fix 
the problem, nada.  So, I got around to thinking that when I had 
configured older 3ware 6500 RAID cards in a RAID 5 on another server, 
performance sucked.  Converting it to RAID 10 solved that issue.  The 
RAID 5 performance was supposed to be fixed in the 7500 series, but I 
suspected it was not.  So I decided to explore this tangent and pull a 
couple of all-nighters to make this happen.  So....

I broke the Software RAID, reconfigured one of the controllers as RAID 
10, giving me 640 GB of space.  Started copying as much data as I could 
between 10pm-7am Saturday and Sunday night, and as of this morning, I've 
been able to move 1/4 of the mail (one full mount) to the RAID 10. 
Already performance of my NFS has increased.  Customers aren't 
complaining now about slow mail (or no mail access at all for that 
matter).  After tonight I should have all the mail moved over to the 
RAID 10 and I should be able to give you an update tomorrow.  If 
everything goes as planned, I'll move the web sites the next day, and 
then this weekend, I'll reconfigure the controller that is in a RAID 5 
config, to a RAID 10, and then bring up the software RAID 1 between the 
controllers.
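
The capacity figures can be checked with a little arithmetic.  This 
sketch assumes the 16 drives are split evenly, eight 160 GB drives per 
controller, which the message implies but does not state outright; the 
function names are illustrative only.

```python
# Sketch: nominal usable capacity per controller for RAID 5 vs RAID 10.
# ASSUMPTION: 8 drives of 160 GB on each 8-port controller.
DRIVES_PER_CARD = 8
DRIVE_GB = 160


def raid5_usable_gb(drives, size_gb):
    """RAID 5 gives up one drive's worth of space to parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives - 1) * size_gb


def raid10_usable_gb(drives, size_gb):
    """RAID 10 mirrors every drive, so half the raw space is usable."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return (drives // 2) * size_gb


raid5_gb = raid5_usable_gb(DRIVES_PER_CARD, DRIVE_GB)
raid10_gb = raid10_usable_gb(DRIVES_PER_CARD, DRIVE_GB)

print(f"RAID 5 per controller:  {raid5_gb} GB")
print(f"RAID 10 per controller: {raid10_gb} GB")
```

The RAID 10 result matches the 640 GB mentioned above.  The RAID 5 result 
is a nominal figure and lands a little under the 1.2 TB quoted earlier; 
the message does not give enough detail to say where the difference comes 
from.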

So, I think your problem is caused by RAID 5 and not NFS, just like mine 
is.  I'll know more tomorrow.
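
One common explanation for RAID 5 hurting a small-file workload is the 
write penalty: a small random write on RAID 5 typically costs four disk 
I/Os (read data, read parity, write data, write parity), against two on 
RAID 10.  The sketch below is a back-of-the-envelope estimate only.  The 
per-drive figure of 80 random I/Os per second is a made-up assumption for 
illustration, not a measurement from either setup, and controller caching 
and full-stripe writes are ignored.

```python
# Sketch: rough small-random-write throughput, RAID 5 vs RAID 10.
# ASSUMPTION (hypothetical): each IDE drive sustains about 80 random I/Os per second.
ASSUMED_DRIVE_IOPS = 80
DRIVES = 8

# Disk I/Os needed per small random write.
RAID5_WRITE_PENALTY = 4   # read data, read parity, write data, write parity
RAID10_WRITE_PENALTY = 2  # write the block to both halves of a mirror pair


def small_write_iops(drives, drive_iops, penalty):
    """Estimate array-level small random writes per second."""
    return drives * drive_iops / penalty


raid5_iops = small_write_iops(DRIVES, ASSUMED_DRIVE_IOPS, RAID5_WRITE_PENALTY)
raid10_iops = small_write_iops(DRIVES, ASSUMED_DRIVE_IOPS, RAID10_WRITE_PENALTY)

print(f"RAID 5  small writes/s (estimate): {raid5_iops:.0f}")
print(f"RAID 10 small writes/s (estimate): {raid10_iops:.0f}")
```

Whatever per-drive number is plugged in, this model gives RAID 10 twice 
the small-write rate of RAID 5 on the same drives.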

If anyone can see anything wrong with my configs, or knows of other 
optimizations I can make, please let me know.  This is a very high 
profile production environment.  I need to get this thing running 
without a hitch.

	Chris Dos

Matt Heaton wrote:
 > I run a hosting service that hosts about 700,000 websites.  We have 2
 > NFS servers running Redhat 7.2 (2.4.18 custom kernel, no
 > nfs patches).  The servers are 850 GIGS each (IDE RAID 5). The clients
 > are all 7.2 Redhat with custom 2.4.18 kernels on them.  My question is
 > this.  I believe lookups/attribs on the files and directories are
 > slowing down performance considerably because we literally have 4-5
 > million files on each nfs server that we export.  One of the NFS servers
 > is running EXT3 and the other is XFS.  Both work ok, but under heavy
 > loads the clients die because the server can't export stuff fast
 > enough.  The total bandwidth out of each NFS server is LESS than 10
 > Mbit.  The trouble is that I am serving a bunch of SMALL files.  Either
 > I am running out of seek time on my boxes (IDE Raid 850 GIGS per
 > server), or it is taking forever to find the files.
 >
 > Here are my questions.
 >
 > 1) Can I increase the cache on the client side to hold the entire
 > directory structure of both NFS servers?
 >
 > 2) How can I tell if I am just maxing the seek time out on my NFS server?
 >
 > 3) Each NFS server serves about 60-100 files per second.  Is this too
 > many per second?  Could I possibly be maxing
 > out seek time on the NFS servers?  My IDE Raid card is the 3ware 750
 > with 8 individual IDE ports on it.
 >
 > 4) Is there anything like cachefs being developed for linux??  Any other
 > suggestions for persistent client caching for NFS?
 > Free or commercial is fine.
 >
 > Thanks for your answers to some or all of my questions.
 >
 > Matt
 >
 >
