From: Stan Hoeppner <stan@hardwarefreak.com>
To: quanjun hu <huquanjun@gmail.com>, xfs@oss.sgi.com
Subject: Re: Problem about very high Average Read/Write Request Time
Date: Sun, 19 Oct 2014 16:16:56 -0500
Message-ID: <54442A48.6050503@hardwarefreak.com>
In-Reply-To: <CALSoAzD4ccHXBuD6mT3ggqMf1j_kDEK-RNMOeRLq+N+NiWVQXg@mail.gmail.com>
On 10/18/2014 04:26 AM, quanjun hu wrote:
> Hi,
> I am using XFS on a RAID5 array (~100 TB) with the log on an external SSD device. The mount information is:
> /dev/sdc on /data/fhgfs/fhgfs_storage type xfs (rw,relatime,attr2,delaylog,logdev=/dev/sdb1,sunit=512,swidth=15872,noquota)
> When doing only reading or only writing, the speed is very fast (~1.5 GB/s), but when doing both the speed is very slow (~100 MB/s), with high r_await (160) and w_await (200000).
> 1. How can I reduce the average request time?
> 2. Can I use an SSD as a read/write cache for XFS?
You apparently have 31 effective 7.2k RPM SATA spindles in RAID5, with a 256 KiB chunk and a 7.75 MiB stripe width. That should yield 3-4.6 GiB/s of streaming throughput, assuming no cable, expander, or HBA limitations, yet you're achieving only a third to a half of it. Which hardware RAID controller is this, and what are its specs--cache RAM, host and back-end cable count and type?
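For anyone checking the math, those figures fall straight out of your mount options--sunit and swidth there are in 512-byte sectors:

  sunit  =   512 sectors x 512 B = 256 KiB chunk
  swidth = 15872 sectors x 512 B = 7936 KiB = 7.75 MiB stripe width
  swidth / sunit = 15872 / 512   = 31 data spindles

At ~100-150 MiB/s of streaming throughput per 7.2k SATA drive (an assumption, but typical), 31 spindles gives the 3-4.6 GiB/s figure above.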
When you say reads or writes are fast individually but mixed read+write is slow, what types of files are you reading and writing, and how many in parallel? That combined pattern is almost certainly causing excessive seeking in the drives, which would explain the slowdown.
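If you can, capture a minute of device stats while the mixed workload runs and post them; a sketch, assuming the sysstat package is installed:

# iostat -xm 5 /dev/sdc /dev/sdb

r_await/w_await in the hundreds of milliseconds combined with low MB/s is the classic signature of a seek-bound array.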
As others mentioned, this isn't an XFS problem. The problem is that your RAID geometry doesn't match your workload: the very wide parity stripe causes excessive seeking under a mixed read+write load due to read-modify-write (RMW) operations. To mitigate this, and to increase resiliency, switch to RAID6 with a smaller chunk. If you need maximum capacity, make a single RAID6 array with a 16 KiB chunk size. With your 32 drives that leaves 30 data spindles, yielding a 480 KiB stripe width, increasing the odds that all writes are a full stripe, and hopefully eliminating much of the RMW problem.
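If you go the single-array route, you'd hand the new geometry to mkfs.xfs explicitly; a sketch, assuming the rebuilt LUN still shows up as /dev/sdc and you keep the external log:

# mkfs.xfs -d su=16k,sw=30 -l logdev=/dev/sdb1 /dev/sdc

Here su is the RAID chunk and sw the number of data spindles, so su*sw equals the 480 KiB stripe width.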
A better option might be three 10-drive RAID6 arrays (plus two hot spares) with a 32 KiB chunk, giving a 256 KiB stripe width, concatenated with mdadm --linear. You'd have 24 spindles of capacity and throughput instead of 31, but no more RMW operations--or at least very few. You'd format the linear md device with:
# mkfs.xfs -d su=32k,sw=8 /dev/mdX
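For completeness, the concatenation step might look like the following--a sketch, assuming your controller exports the three RAID6 LUNs as /dev/sdc, /dev/sdd and /dev/sde (substitute your actual device names):

# mdadm --create /dev/md0 --level=linear --raid-devices=3 /dev/sdc /dev/sdd /dev/sde

Note the explicit su/sw in the mkfs.xfs line above: mkfs can't probe geometry through the linear layer, so you have to tell it the per-array chunk (32 KiB) and data-spindle count (8) yourself.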
As long as your file accesses are spread fairly evenly across at least 3 directories, you should achieve excellent parallel throughput, though single-file streaming throughput will peak at 800-1200 MiB/s--that of 8 drives. With a little understanding of how this setup works, you can write two streaming files and read a third without any of the 3 competing with one another for disk seeks/bandwidth--which is your current problem. Or you could do one read and one write to each of 3 directories, and no pair would interfere with the others. Scale up from here.
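Once it's built you could demonstrate this with something like the following (paths are hypothetical; dir1-dir3 assumed to sit on different arrays, per the layout described below):

# dd if=/dev/zero of=/data/dir1/f1 bs=1M count=20000 oflag=direct &
# dd if=/dev/zero of=/data/dir2/f2 bs=1M count=20000 oflag=direct &
# dd if=/data/dir3/bigfile of=/dev/null bs=1M iflag=direct &

Each stream should sustain roughly 8-drive throughput with little cross-talk, whereas on your current 31-wide stripe all three streams would hit every disk.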
Basically, what we're doing is isolating each RAID LUN behind a set of directories. When you write into one of those directories, the file goes onto only one of the 3 arrays. This confines the RMWs of a given write to a subset of your disks and minimizes the number of seeks generated by parallel accesses.
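If you want to verify which array a given file actually landed on, xfs_bmap can tell you; on a linear concat the block address space maps straight onto the member arrays, first third to array 1 and so on:

# xfs_bmap -v /data/dir1/f1

Treat this as a sketch--exact directory-to-AG placement depends on the allocator (with inode64, new directories rotor across allocation groups), so spot-check a few files before depending on the layout.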
Cheers,
Stan