From: Greg Whynott <greg@dkp.com>
To: Linux NFS Mailing List <nfs@lists.sourceforge.net>
Subject: render farm NFS server is having hard time staying up.
Date: Tue, 19 Oct 2004 11:35:32 -0400
Message-ID: <41753444.9030300@dkp.com>
Hello Folks,
I'm looking for any information that may help me resolve an NFS
server issue we are seeing. We are seeing about 1-3% corruption on
files written to the array over NFS when under load. Sometimes we see
I/O errors, other times we see this error in the dmesg output: "nfs:
server murdock not responding, timed out", and other times the result is
a bad file.
Here are the details of the environment:
~200-300 dual-CPU render nodes (depending on time of day),
all connected to gigabit network ports.
NFS server is a dual 2.8 GHz P4 with 4 GB of memory.
Autonegotiation is off on the switch ports, which are locked to
1000/full-duplex/flow-control.
render nodes mount the file server(s) with automount using these options:
-rw,insecure,hard,rsize=8192,wsize=8192,intr,timeo=600
RedHat 9 is running on the servers:
2.4.20-8 with big mem support.
export options:
rw,no_root_squash,insecure,sync,no_subtree_check
24 nfsd's fire off at startup.
contents of proc-nfsd:
[root@barney root]# cat /proc/net/rpc/nfsd
rc 6738 70516059 9738836
fh 500 79366229 10104583 667218 0
io 196640402 2028579561
th 24 387656 14064.970 2016.480 615.180 93.980 239.450 152.980 143.640 144.910 2.240 831.600
ra 48 47883 0 0 0 0 74 0 0 0 0 121
net 80270754 80270754 0 0
rpc 80261633 9121 0 9121 0
proc2 18 22 6763 918 0 1406 1 0 0 163637 142 0 0 0 0 1 0 0 11
proc3 22 4 2462879 570357 1141041 5515254 650 48078 69567752 142094 6308 3 0 3 0 71582 0 6417 0 4474 4477 0 547359
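For anyone trying to read the counters above, here is a minimal Python sketch that pulls out a few of them. The field meanings are assumptions based on my understanding of the 2.4-era layout, not something confirmed in this message: `th` is taken as thread count, then the number of times all threads were busy; `net` as total/UDP/TCP packets; `rpc` as calls followed by bad calls; `proc3` as per-procedure NFSv3 call counts in protocol order. The names `parse_nfsd_stats` and `summarize` are made up for illustration.

```python
# Sample lines copied from the /proc/net/rpc/nfsd output in this message
# (each counter line kept on a single line).
NFSD_STATS = """\
th 24 387656 14064.970 2016.480 615.180 93.980 239.450 152.980 143.640 144.910 2.240 831.600
net 80270754 80270754 0 0
rpc 80261633 9121 0 9121 0
proc3 22 4 2462879 570357 1141041 5515254 650 48078 69567752 142094 6308 3 0 3 0 71582 0 6417 0 4474 4477 0 547359
"""

# NFSv3 procedure names in protocol order (assumed to match the proc3 columns).
NFS3_PROCS = [
    "null", "getattr", "setattr", "lookup", "access", "readlink", "read",
    "write", "create", "mkdir", "symlink", "mknod", "remove", "rmdir",
    "rename", "link", "readdir", "readdirplus", "fsstat", "fsinfo",
    "pathconf", "commit",
]


def parse_nfsd_stats(text):
    """Map each line's label (th, net, rpc, ...) to its list of string fields."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:
            stats[fields[0]] = fields[1:]
    return stats


def summarize(stats):
    """Extract a few load indicators; interpretations are assumptions (see above)."""
    th = stats["th"]
    net = [int(x) for x in stats["net"]]
    rpc = [int(x) for x in stats["rpc"]]
    proc3 = [int(x) for x in stats["proc3"][1:]]  # first field is the column count
    calls = dict(zip(NFS3_PROCS, proc3))
    return {
        "threads": int(th[0]),
        "all_threads_busy_events": int(th[1]),
        "udp_share": net[1] / net[0],
        "bad_rpc_calls": rpc[1],
        "write_share": calls["write"] / sum(proc3),
    }


summary = summarize(parse_nfsd_stats(NFSD_STATS))
for key, value in summary.items():
    print(f"{key}: {value}")
```

If those interpretations hold, the sample suggests all 24 threads were busy many times, that the traffic is entirely UDP, and that writes dominate the NFSv3 call mix.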
RedHat 7.3 is running on the render nodes:
2.4.18-.7
The disk arrays connected to the server are Sun T4s in a 6320 array via
dual 2G FC (active/active), 6 trays of 14 disks, hardware RAID 5 horz,
RAID 0 vert. The switches report few errors (counters reset 7 days ago):
Port name is BARNEY
MTU 1518 bytes, encapsulation ethernet
300 second input rate: 23597672 bits/sec, 2266 packets/sec, 2.39% utilization
300 second output rate: 7404080 bits/sec, 7404080 bits/sec is 2025 packets/sec, 0.76% utilization
595831889 packets input, 589820579851 bytes, 0 no buffer
Received 63119 broadcasts, 0 multicasts, 595768764 unicasts
9 input errors, 6 CRC, 0 frame, 0 ignored
3 runts, 0 giants, DMA received 595831869 packets
765643165 packets output, 620030207291 bytes, 0 underruns
Transmitted 57746415 broadcasts, 551424 multicasts, 707345326 unicasts
0 output errors, 0 collisions, DMA transmitted 765643165 packets
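To put "few errors" into numbers, here is a small Python sketch that turns the port counters above into error rates. The values are copied from the output in this message; the helper name `error_rate` is made up for illustration.

```python
# Counters copied from the switch port output in this message.
packets_in = 595831889
input_errors = 9
packets_out = 765643165
output_errors = 0


def error_rate(errors, packets):
    """Return errors per packet, or 0.0 if no packets were counted."""
    return errors / packets if packets else 0.0


in_rate = error_rate(input_errors, packets_in)
out_rate = error_rate(output_errors, packets_out)
print(f"input error rate:  {in_rate:.2e}")
print(f"output error rate: {out_rate:.2e}")
```

An input error rate this low makes the switch port itself an unlikely source of 1-3% file corruption, although it says nothing about drops elsewhere on the path.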
I have added this as part of the system startup:
echo 262144 > /proc/sys/net/core/rmem_default
echo 262144 > /proc/sys/net/core/rmem_max
/etc/init.d/nfs start
echo 65536 > /proc/sys/net/core/rmem_default
echo 65536 > /proc/sys/net/core/rmem_max
This is a render farm where images are rendered and then written out to
the array when complete. At the same time there are people reading files
from the same array. I suspect we are giving our NFS server a DoS of
sorts. My hope is that we can set things up in such a way that if a file
starts to write to the array, it will finish and not write out bogus
data. If the server is too busy, it should reject further connections
rather than handle them incorrectly. Pipe dream?
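As an illustration of the "finish or fail, but never leave bogus data" behaviour described above, here is a minimal Python sketch of a client-side writer. It is an assumption about one possible mitigation, not part of the current setup and not a fix for the server: the function name `write_frame_atomically` and the `.tmp-` prefix are hypothetical. It writes to a temporary file in the target directory, forces the data out with fsync so that write errors surface as exceptions, and only then renames the file into place.

```python
import os
import tempfile


def write_frame_atomically(path, data):
    """Write data so readers see either the old file or the complete new one.

    Hypothetical helper: the temp file lives in the same directory as the
    target so the final rename stays within one filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # fsync raises OSError if the data could not be committed.
            os.fsync(f.fileno())
        # Only a fully written file is ever given the final name.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a partial temp file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a failed write raises an error in the render job instead of leaving a truncated image under the final name. Note that mkstemp creates the file with owner-only permissions, so a real job might need to adjust the mode before the rename.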
thanks very much for your time, if you wish further info please let me
know, I must run off to a meeting,
greg