From: Robert Petkus <rpetkus@bnl.gov>
To: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: xfs@oss.sgi.com
Subject: Re: Poor performance -- poor config?
Date: Wed, 20 Jun 2007 17:16:41 -0400	[thread overview]
Message-ID: <46799939.2080503@bnl.gov> (raw)
In-Reply-To: <Pine.LNX.4.64.0706201703310.27484@p34.internal.lan>

Justin Piszcz wrote:
>
>
> On Wed, 20 Jun 2007, Robert Petkus wrote:
>
>> Folks,
>> I'm trying to configure a system (server + DS4700 disk array) that 
>> can offer the highest performance for our application.  We will be 
>> reading and writing multiple threads of 1-2GB files with 1MB block 
>> sizes.
>> DS4700 config:
>> (16) 500 GB SATA disks
>> (3) 4+1 RAID 5 arrays and (1) hot spare == (3) 2TB LUNs.
>> (2) RAID arrays are on controller A, (1) RAID array is on controller B.
>> 512k segment size
>>
>> Server Config:
>> IBM x3550, 9GB RAM, RHEL 5 x86_64 (2.6.18)
>> The (3) LUNs are sdb, sdc {both controller A}, sdd {controller B}
>>
>> My original goal was to use XFS and create a highly optimized 
>> config.  Here is what I came up with:
>> Create separate partitions for XFS log files: sdd1, sdd2, sdd3 each 
>> 150M -- 128MB is the maximum allowable XFS log size.
>> The XFS "stripe unit" (su) = 512k to match the DS4700 segment size
>> The "stripe width" ( (n-1)*sunit )= swidth=2048k  = sw=4 (a multiple 
>> of su)
>> 4k is the max block size allowable on x86_64 since 4k is the max 
>> kernel page size
>>
>> [root@~]# mkfs.xfs -l logdev=/dev/sdd1,size=128m -d su=512k -d sw=4 -f /dev/sdb
>> [root@~]# mount -t xfs -o context=system_u:object_r:unconfined_t,noatime,nodiratime,logbufs=8,logdev=/dev/sdd1 /dev/sdb /data0
>>
>> And the write performance is lousy compared to ext3 built like so:
>> [root@~]# mke2fs -j -m 1 -b4096 -E stride=128 /dev/sdc
>> [root@~]# mount -t ext3 -o noatime,nodiratime,context="system_u:object_r:unconfined_t:s0",reservation /dev/sdc /data1
>>
>> What am I missing?
>>
>> Thanks!
>>
>> -- 
>> Robert Petkus
>> RHIC/USATLAS Computing Facility
>> Brookhaven National Laboratory
>> Physics Dept. - Bldg. 510A
>> Upton, New York 11973
>>
>> http://www.bnl.gov/RHIC
>> http://www.acf.bnl.gov
>>
>>
>
> What speeds are you getting?
dd if=/dev/zero of=/data0/bigfile bs=1024k count=5000
5242880000 bytes (5.2 GB) copied, 149.296 seconds, 35.1 MB/s

dd if=/data0/bigfile of=/dev/null bs=1024k count=5000
5242880000 bytes (5.2 GB) copied, 26.3148 seconds, 199 MB/s
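
A caveat on those numbers: the read is of a file I had just written, so a fair chunk of that 199 MB/s is probably coming back out of the page cache. A rough way to take the cache out of the picture -- assuming the dd on this box supports conv=fdatasync and iflag=direct -- would be something like:

dd if=/dev/zero of=/data0/bigfile bs=1024k count=5000 conv=fdatasync
echo 3 > /proc/sys/vm/drop_caches
dd if=/data0/bigfile of=/dev/null bs=1024k count=5000 iflag=direct

The 35 MB/s write looks disk-bound either way.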

iozone.linux -w -r 1m -s 1g -i0 -t 4 -e -w -f /data0/test1
        Children see throughput for  4 initial writers  =   28528.59 KB/sec
        Parent sees throughput for  4 initial writers   =   25212.79 KB/sec
        Min throughput per process                      =    6259.05 KB/sec
        Max throughput per process                      =    7548.29 KB/sec
        Avg throughput per process                      =    7132.15 KB/sec

iozone.linux -w -r 1m -s 1g -i1 -t 4 -e -w -f /data0/test1
        Children see throughput for  4 readers          = 3059690.19 KB/sec
        Parent sees throughput for  4 readers           = 3055307.71 KB/sec
        Min throughput per process                      =  757151.81 KB/sec
        Max throughput per process                      =  776032.62 KB/sec
        Avg throughput per process                      =  764922.55 KB/sec
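
(The ~3 GB/s reader numbers are almost certainly page cache -- 4 x 1 GB files fit comfortably in 9 GB of RAM. If this iozone build supports -I for direct I/O, something like the following should show what the disks themselves are doing; the -I flag is an assumption about this particular build:

iozone.linux -I -w -r 1m -s 1g -i0 -i1 -t 4 -e -f /data0/test1
)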

>
> Have you tried a SW RAID with the 16 drives, if you do that, XFS will 
> auto-optimize per the physical characteristics of the md array.
No, because that would waste an expensive disk array.  I've done software 
RAID on various JBODs, even a Sun Thumper, with OK results...
>
> Also, most of those mount options besides the logdev/noatime don't do 
> much with XFS from my personal benchmarks, you're better off with the 
> defaults+noatime.
The security context option is there because I run a strict SELinux 
policy, and logdev is needed since the log lives on a different disk.  
BTW, the same filesystem without a separate log device made no 
difference in performance.
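
For what it's worth, xfs_info on the mounted filesystem is a quick way to confirm what mkfs actually recorded -- sunit/swidth (reported in filesystem blocks) and the log size/location:

xfs_info /data0

If the su/sw options took effect it should show something like sunit=128 blks (128 x 4k = 512k) and swidth=512 blks (4 x sunit).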
>
> What speed are you getting reads/writes, what do you expect?  How are 
> the drives attached/what type of controller? PCI?
I get roughly 3x the write performance with ext3.  The server has a 
dual-port FC-4 PCIe HBA connected to the (2) IBM DS4700 FC-4 
controllers, so there is plenty of headroom.
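
One quick way to back up the headroom claim is a raw sequential read straight off a LUN, bypassing the filesystem entirely (again assuming this dd supports iflag=direct):

dd if=/dev/sdb of=/dev/null bs=1024k count=5000 iflag=direct

If that runs near wire speed while the filesystem numbers don't, the bottleneck is above the block layer.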

-- 
Robert Petkus
RHIC/USATLAS Computing Facility
Brookhaven National Laboratory
Physics Dept. - Bldg. 510A
Upton, New York 11973

http://www.bnl.gov/RHIC
http://www.acf.bnl.gov

