From: Robert Petkus <rpetkus@bnl.gov>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: xfs@oss.sgi.com
Subject: Re: Poor performance -- poor config?
Date: Wed, 20 Jun 2007 17:16:41 -0400
Message-ID: <46799939.2080503@bnl.gov>
In-Reply-To: <Pine.LNX.4.64.0706201703310.27484@p34.internal.lan>
Justin Piszcz wrote:
>
>
> On Wed, 20 Jun 2007, Robert Petkus wrote:
>
>> Folks,
>> I'm trying to configure a system (server + DS4700 disk array) that
>> offers the highest performance for our application: multiple threads
>> reading and writing 1-2 GB files in 1 MB blocks.
>> DS4700 config:
>> (16) 500 GB SATA disks
>> (3) 4+1 RAID 5 arrays and (1) hot spare == (3) 2TB LUNs.
>> (2) RAID arrays are on controller A, (1) RAID array is on controller B.
>> 512k segment size
>>
>> Server Config:
>> IBM x3550, 9GB RAM, RHEL 5 x86_64 (2.6.18)
>> The (3) LUNs are sdb, sdc {both controller A}, sdd {controller B}
>>
>> My original goal was to use XFS and create a highly optimized
>> config. Here is what I came up with:
>> Create separate partitions for the XFS logs: sdd1, sdd2, sdd3, each
>> 150 MB -- 128 MB is the maximum allowable XFS log size.
>> The XFS "stripe unit" (su) = 512k, to match the DS4700 segment size.
>> The "stripe width" swidth = (n-1)*su = 4*512k = 2048k, i.e. sw=4
>> (swidth is always a multiple of su).
>> The block size is 4k -- the maximum on x86_64, since 4k is the
>> maximum kernel page size.
>>
>> [root@~]# mkfs.xfs -l logdev=/dev/sdd1,size=128m -d su=512k -d sw=4
>> -f /dev/sdb
>> [root@~]# mount -t xfs -o
>> context=system_u:object_r:unconfined_t,noatime,nodiratime,logbufs=8,logdev=/dev/sdd1
>> /dev/sdb /data0
>>
>> And the write performance is lousy compared to ext3 built like so:
>> [root@~]# mke2fs -j -m 1 -b4096 -E stride=128 /dev/sdc
>> [root@~]# mount -t ext3 -o
>> noatime,nodiratime,context="system_u:object_r:unconfined_t:s0",reservation
>> /dev/sdc /data1
>>
>> What am I missing?
>>
>> Thanks!
>>
>
> What speeds are you getting?
dd if=/dev/zero of=/data0/bigfile bs=1024k count=5000
  5242880000 bytes (5.2 GB) copied, 149.296 seconds, 35.1 MB/s

dd if=/data0/bigfile of=/dev/null bs=1024k count=5000
  5242880000 bytes (5.2 GB) copied, 26.3148 seconds, 199 MB/s

iozone.linux -w -r 1m -s 1g -i0 -t 4 -e -w -f /data0/test1
  Children see throughput for 4 initial writers = 28528.59 KB/sec
  Parent sees throughput for 4 initial writers  = 25212.79 KB/sec
  Min throughput per process                    =  6259.05 KB/sec
  Max throughput per process                    =  7548.29 KB/sec
  Avg throughput per process                    =  7132.15 KB/sec

iozone.linux -w -r 1m -s 1g -i1 -t 4 -e -w -f /data0/test1
  Children see throughput for 4 readers = 3059690.19 KB/sec
  Parent sees throughput for 4 readers  = 3055307.71 KB/sec
  Min throughput per process            =  757151.81 KB/sec
  Max throughput per process            =  776032.62 KB/sec
  Avg throughput per process            =  764922.55 KB/sec
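[A ~3 GB/s aggregate read figure is far beyond what a pair of FC-4 links can
deliver, so those reads are almost certainly served from the page cache -- the
four 1 GB test files fit comfortably in 9 GB of RAM. A sketch of re-running the
same tests with O_DIRECT to measure the array itself; this assumes GNU dd's
direct flags and iozone's -I option are available in this environment:]

```shell
# Bypass the page cache with O_DIRECT so RAM cannot serve the blocks back.
dd if=/dev/zero of=/data0/bigfile bs=1024k count=5000 oflag=direct
dd if=/data0/bigfile of=/dev/null bs=1024k count=5000 iflag=direct

# iozone's -I flag requests O_DIRECT for its file I/O:
iozone.linux -I -w -r 1m -s 1g -i0 -t 4 -e -f /data0/test1
iozone.linux -I -w -r 1m -s 1g -i1 -t 4 -e -f /data0/test1
```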
>
> Have you tried SW RAID across the 16 drives? If you do that, XFS will
> auto-optimize per the physical characteristics of the md array.
No, because that would waste an expensive disk array. I've done this
with various JBODs, even a Sun Thumper, with OK results...
>
> Also, most of those mount options besides the logdev/noatime don't do
> much with XFS from my personal benchmarks, you're better off with the
> defaults+noatime.
The security context options are there because I run a strict SELinux
policy, and I need logdev since the log sits on a different disk. BTW,
the same filesystem without a separate log device made no difference in
performance.
>
> What read/write speeds are you getting, and what do you expect? How
> are the drives attached / what type of controller? PCI?
I can get ~3x write performance with ext3. I have a dual-port FC-4 PCIe
HBA connected to (2) IBM DS4700 FC-4 controllers. There is lots of
headroom.
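[For reference, the alignment numbers quoted in both mkfs commands above follow
from the DS4700 layout by simple arithmetic. A quick sanity check -- pure shell,
touches no devices, using only the values given in this thread:]

```shell
# Geometry from the thread: 512k segment per disk, 4+1 RAID5, 4k fs blocks.
segment_kb=512        # DS4700 segment size per disk
data_disks=4          # 4+1 RAID5 -> 4 data disks per stripe
block_kb=4            # filesystem block size

swidth_kb=$((segment_kb * data_disks))   # XFS stripe width (su * sw)
stride=$((segment_kb / block_kb))        # ext3 -E stride, in fs blocks

echo "su=${segment_kb}k sw=${data_disks} swidth=${swidth_kb}k stride=${stride}"
# -> su=512k sw=4 swidth=2048k stride=128
```

Both filesystems are thus being told the same 512k-per-disk, 4-data-disk
geometry, so the write gap is not an alignment mismatch between the two setups.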
--
Robert Petkus
RHIC/USATLAS Computing Facility
Brookhaven National Laboratory
Physics Dept. - Bldg. 510A
Upton, New York 11973
http://www.bnl.gov/RHIC
http://www.acf.bnl.gov
Thread overview: (6 messages)
2007-06-20 20:59 Poor performance -- poor config? Robert Petkus
2007-06-20 21:04 ` Justin Piszcz
2007-06-20 21:16 ` Robert Petkus [this message]
2007-06-20 21:23 ` Justin Piszcz
2007-06-21 6:37 ` Sebastian Brings
2007-06-21 23:59 ` David Chinner