From: Chris Dos <chris@chrisdos.com>
To: nfs@lists.sourceforge.net
Subject: Re: Millions of files and directory caching.
Date: Tue, 22 Oct 2002 15:00:37 -0600
Message-ID: <3DB5BC75.1000509@chrisdos.com>
In-Reply-To: <6440EA1A6AA1D5118C6900902745938E07D54FA9@black.eng.netapp.com>

I wasn't able to get the transfer done last night. Hopefully it'll
finish by tonight and I can let everyone know the results tomorrow.
Does anyone know the maximum number of NFS threads I can run? It
seems like 256 wasn't enough for my platform, so I've upped it to
320. What is the absolute upper limit?
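
For anyone who wants to read the same counters, here is a minimal
sketch of pulling the thread numbers out of the "th" line of
/proc/net/rpc/nfsd. It assumes the layout usually described for
2.4-era kernels: thread count, then the number of times all threads
were busy, then ten buckets of seconds spent at increasing levels of
thread utilisation. The names parse_th_line and ThreadStats are made
up for illustration, and the sample string is the th line from the
output quoted further down.

```python
from typing import List, NamedTuple


class ThreadStats(NamedTuple):
    threads: int            # number of nfsd threads running
    all_busy: int           # times every thread was in use (assumed meaning)
    histogram: List[float]  # seconds per utilisation bucket (assumed meaning)


def parse_th_line(line: str) -> ThreadStats:
    """Parse the 'th' line of /proc/net/rpc/nfsd (field layout is assumed)."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "th":
        raise ValueError("not a 'th' line: %r" % line)
    return ThreadStats(
        threads=int(fields[1]),
        all_busy=int(fields[2]),
        histogram=[float(x) for x in fields[3:]],
    )


# The th line from the quoted output, rejoined onto one line.
SAMPLE = ("th 256 19536 1012.250 787.570 875.870 1350.400 840.930 "
          "220.190 96.470 50.970 20.000 229.690")

stats = parse_th_line(SAMPLE)
# Share of the recorded time that fell into the busiest bucket.
top_bucket_share = stats.histogram[-1] / sum(stats.histogram)
print(stats.threads, stats.all_busy, round(top_bucket_share, 3))
```

On a live server the same function could be fed the matching line
read from /proc/net/rpc/nfsd. If the assumed layout is right, a large
all-busy count or a lot of time in the last bucket would suggest the
threads are saturated, but it says nothing about whether the disks
are the real bottleneck.
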
Chris
Lever, Charles wrote:
> hi chris-
>
> someone recently reported to me that oil companies like RAID 10
> because RAID 5 performance is terrible for NFS servers. seems
> like you are on the right path.
>
>
>
>>Man, your setup seems extremely close to mine, and it's even for the
>>same type of business. Let me give you a rundown of what I have and
>>what I've been doing.
>>
>>We had an EMC Symmetrix SAN/NAS that held 5.7 TB worth of disk. We
>>were only using about 550 GB of it, so the decision was to move away
>>from the EMC because of ongoing support cost and complexity issues.
>>The EMC was working in a NAS configuration serving files via NFS, and
>>it was also sharing some of its disk to a Sun 420R, which was then
>>serving files via NFS.
>>
>>The clients are a mix of Solaris 2.6/7.0/8.0 and Redhat 7.2 with the
>>stock kernel. There are three Redhat 7.2 servers that serve mail (two
>>are updated and running the 2.4.18-17 kernel, the other is running
>>the 2.4.7 kernel), two Solaris servers that serve web, and one Oracle
>>server that did have external disk to the EMC. The clients are
>>connected to the switch at 100BT FD. The clients use the following
>>mount options:
>>udp,bg,intr,hard,rsize=8192,wsize=8192
>>
>>The server built to replace the EMC is built as follows:
>>Hardware:
>>  Tyan 2462 motherboard with five 64 bit slots
>>  Dual AMD 2100+ MP Processors
>>  2 GB PC 2100 RAM
>>  Two 3Ware 64 bit 7850 8 port RAID cards
>>  16 Maxtor 160 GB Hard Drives
>>  3 Netgear 64 bit 621 Gigabit cards
>>Software:
>>  Redhat Linux 7.3 (All patches applied)
>>  Custom 2.4.19 Kernel (I also rolled one using the
>>  nfs-all patch and the tcp patch, but Oracle didn't
>>  like starting its database when mounted to it.
>>  Don't know why)
>>Config and Stats:
>>This server was configured using each 3Ware RAID card in a RAID 5,
>>and then mirrored via software RAID to the other 3Ware RAID card.
>>This gave us 1.2 TB of usable space. I've moved the data off the EMC
>>to this server, and I had to move the Oracle server's database to
>>this server as well and export it via NFS. (The Oracle server, a Sun
>>420R, can only hold two drives.) The server is connected to the
>>network via Gigabit FD.
>>I'm running 256 NFS threads, and even then, that doesn't seem like
>>enough according to the output of /proc/net/rpc/nfsd. What's the
>>limit on the maximum number of threads I can run?
>>
>>Output of /proc/net/rpc/nfsd:
>>rc 21716 821664 85468545
>>fh 233099 88275136 0 233232 6591060
>>io 2359266294 1934417182
>>th 256 19536 1012.250 787.570 875.870 1350.400 840.930 220.190 96.470 50.970 20.000 229.690
>>ra 512 4558260 16517 7803 5421 4228 3293 2686 1808 1279 1270 923900
>>net 86311942 86311942 0 0
>>rpc 86311925 17 16 1 0
>>proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>>proc3 22 100 47898973 4984 2138510 29020056 120 5542814 390404 117607 130 0 9 163535 8 103856 62847 516154 205343 17118 772 60 128525
>>
>>All the exports from the server have these options:
>>rw,no_root_squash,sync
>>And I'm passing these options to the kernel:
>>/bin/echo 2097152 > /proc/sys/net/core/rmem_default
>>/bin/echo 2097152 > /proc/sys/net/core/rmem_max
>>/bin/echo "65536" > /proc/sys/fs/file-max
>>
>>And even with all of this, I'm having issues with this box. The
>>client NFS mounts are extremely slow. So slow that some services time
>>out and the daemon stops altogether. This is a very bad thing. So
>>I've pulled my hair out, beat my head against the wall, and
>>contemplated using a sledgehammer to finish this slow painful death.
>>I've tried connecting the server via 100 FD instead of Gig to see if
>>that would fix the problem: nada. So I got to thinking that when I
>>had configured older 3Ware 6500 RAID cards in a RAID 5 on another
>>server, performance sucked. Converting it to RAID 10 solved that
>>issue. The RAID 5 performance was supposed to be fixed in the 7500
>>series, but I suspected it was not. So I decided to explore this
>>tangent and pull a couple of all-nighters to make this happen. So....
>>
>>I broke the software RAID and reconfigured one of the controllers as
>>RAID 10, giving me 640 GB of space. I started copying as much data as
>>I could between 10pm and 7am Saturday and Sunday night, and as of
>>this morning, I've been able to move 1/4 of the mail (one full mount)
>>to the RAID 10. Already the performance of my NFS has increased.
>>Customers aren't complaining now about slow mail (or no mail access
>>at all, for that matter). After tonight I should have all the mail
>>moved over to the RAID 10, and I should be able to give you an update
>>tomorrow. If everything goes as planned, I'll move the web sites the
>>next day, and then this weekend I'll reconfigure the controller that
>>is in a RAID 5 config to a RAID 10, and then bring up the software
>>RAID 1 between the controllers.
>>
>>So, I think your problem is caused by RAID 5 and not NFS, just like
>>mine is. I'll know more tomorrow.
>>
>>If anyone can see anything wrong with my configs, or other
>>optimizations I can make, please let me know. This is a very high
>>profile production environment. I need to get this thing running
>>without a hitch.
>>
>> Chris Dos
>>
>>Matt Heaton wrote:
>> > I run a hosting service that hosts about 700,000 websites. We have
>> > 2 NFS servers running Redhat 7.2 (2.4.18 custom kernel, no nfs
>> > patches). The servers are 850 GB each (IDE RAID 5). The clients
>> > are all Redhat 7.2 with custom 2.4.18 kernels on them. My question
>> > is this: I believe lookups/attribs on the files and directories
>> > are slowing down performance considerably because we literally
>> > have 4-5 million files on each NFS server that we export. One of
>> > the NFS servers is running EXT3 and the other is XFS. Both work
>> > OK, but under heavy loads the clients die because the server can't
>> > export stuff fast enough. The total bandwidth out of each NFS
>> > server is LESS than 10 Mbit. The trouble is that I am serving a
>> > bunch of SMALL files. Either I am running out of seek time on my
>> > boxes (IDE RAID, 850 GB per server), or it is taking forever to
>> > find the files.
>> >
>> > Here are my questions.
>> >
>> > 1) Can I increase the cache on the client side to hold the entire
>> > directory structure of both NFS servers?
>> >
>> > 2) How can I tell if I am just maxing the seek time out on my NFS
>> > server?
>> >
>> > 3) Each NFS server serves about 60-100 files per second. Is this
>> > too many per second? Could I possibly be maxing out seek time on
>> > the NFS servers? My IDE RAID card is the 3ware 750 with 8
>> > individual IDE ports on it.
>> >
>> > 4) Is there anything like cachefs being developed for Linux? Any
>> > other suggestions for persistent client caching for NFS? Free or
>> > commercial is fine.
>> >
>> > Thanks for your answers to some or all of my questions.
>> >
>> > Matt