From: Adam Goryachev <mailinglists@websitemanagers.com.au>
To: Dave Cundiff <syshackmin@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID performance
Date: Thu, 07 Feb 2013 23:49:26 +1100
Message-ID: <5113A2D6.20104@websitemanagers.com.au>
In-Reply-To: <CAKHEz2ZyG5xQC78GykbWOfk9EF=r7jcSe_01P=bH7NuJuFBEvA@mail.gmail.com>
On 07/02/13 22:07, Dave Cundiff wrote:
> On Thu, Feb 7, 2013 at 5:19 AM, Adam Goryachev
> <mailinglists@websitemanagers.com.au> wrote:
>> On 07/02/13 20:07, Dave Cundiff wrote:
>>> On Thu, Feb 7, 2013 at 1:48 AM, Adam Goryachev
>>> <mailinglists@websitemanagers.com.au> wrote:
>>> Why would you plug thousands of dollars of SSD into an onboard
>>> controller? It's probably running off an x1 PCIe link shared with every
>>> other onboard device. An LSI x8, 8-port HBA will run you a few hundred
>>> dollars (less than one SSD) and let you melt your northbridge. At least
>>> on my Supermicro X8DTL boards I had to add active cooling to it or it
>>> would overheat and crash at sustained IO. I can hit 2 - 2.5GB a second
>>> doing large sequential IO with Samsung 840 Pros on a RAID10.
>>
>> Because originally I was just using 4 x 2TB 7200 rpm disks in RAID10, I
>> upgraded to SSD to improve performance (which it did), but hadn't (yet)
>> upgraded the SATA controller because I didn't know if it would help.
>>
>> I'm seeing conflicting information here (buy SATA card or not)...
>
> It's not going to help your remote access any. From your configuration
> it looks like you are limited to 4 gigabits. At least as long as your
> NICs are not in the slot shared with the disks. If they are you might
> get some contention.
>
> http://download.intel.com/support/motherboards/server/sb/g13326004_s1200bt_tps_r2_0.pdf
>
> See page 17 for a block diagram of your motherboard. You have an x4 DMI
> connection that PCIe slot 3, your disks, and every other onboard device
> share. That should be about 1.2GB/s (10 gigabits/s) of bandwidth. Your SSDs
> alone could saturate that if you performed a local operation. Get your
> NICs going at 4Gbit/s and all of a sudden you'll really want that
> SATA card in slot 4 or 5.
OK, I'll have to check that the 4 x 1G Ethernet ports are in slots 4 and 5
now, not using the onboard Ethernet, and not in slot 3...
If I could get close to 4Gbps (i.e., saturate the Ethernet) then I think
I'd be more than happy... I don't see my SSDs running at 400MB/s anyway,
though...
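I guess something like the following should show me which slot each card
actually ended up in (just a sketch; how well the slot labels come through
depends on what the board reports via DMI):

    # show the PCIe tree, so I can see which root port the quad NIC and
    # the onboard SATA controller hang off
    lspci -tv

    # map bus addresses to physical slot labels (slot 3 vs slots 4/5)
    dmidecode -t slot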
>>>> 2) Move from a 5-disk RAID5 to an 8-disk RAID10, giving better data
>>>> protection (can lose up to four drives) and hopefully better performance
>>>> (main concern right now), and same capacity as current.
>>>
>>> I've had strange issues with anything other than RAID1 or 10 with SSD.
>>> Even with the high IO and IOPS rates of SSDs, the parity calcs and extra
>>> writes still seem to penalize you greatly.
>>
>> Maybe this is the single-threaded nature of RAID5 (and RAID10)?
>
> I definitely see that. See below for a FIO run I just did on one of my RAID10s
>
> md2 : active raid10 sdb3[1] sdf3[5] sde3[4] sdc3[2] sdd3[3] sda3[0]
> 742343232 blocks super 1.2 32K chunks 2 near-copies [6/6] [UUUUUU]
>
> seq-read: (g=0): rw=read, bs=64K-64K/64K-64K/64K-64K, ioengine=libaio,
> iodepth=32
> seq-write: (g=2): rw=write, bs=64K-64K/64K-64K/64K-64K,
> ioengine=libaio, iodepth=32
>
> Run status group 0 (all jobs):
> READ: io=4096.0MB, aggrb=2149.3MB/s, minb=2149.3MB/s,
> maxb=2149.3MB/s, mint=1906msec, maxt=1906msec
>
> Run status group 2 (all jobs):
> WRITE: io=4096.0MB, aggrb=1168.7MB/s, minb=1168.7MB/s,
> maxb=1168.7MB/s, mint=3505msec, maxt=3505msec
>
> These drives are pretty fresh and my writes are still a whole GB/s less
> than my reads. It's not for lack of bandwidth either.
Can you please show the command line you used, so I can run a similar
test and compare?
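For reference, my guess at reconstructing it from your output would be
something like the below (just a guess from the job lines above; I'd run
the read pass against the md device and point the write pass at a scratch
file rather than the raw array):

    # sequential 64k reads, queue depth 32, 4GB total, O_DIRECT
    fio --name=seq-read --filename=/dev/md2 --rw=read --bs=64k --size=4g \
        --ioengine=libaio --iodepth=32 --direct=1

    # same again for writes, against a scratch file to avoid eating the array
    fio --name=seq-write --filename=/tmp/fio-test --rw=write --bs=64k \
        --size=4g --ioengine=libaio --iodepth=32 --direct=1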
>>> Also, if your kernel does not have md TRIM support you risk taking a
>>> SEVERE performance hit on writes. Once you complete a full write pass
>>> on your NAND, the SSD controller will require extra time to complete a
>>> write. If your IO is mostly small and random, this can cause your NAND
>>> to become fragmented. If the fragmentation becomes bad enough you'll
>>> be lucky to get one spinning disk's worth of write IO out of all 5
>>> combined.
>>
>> This was the reason I made the partition (for RAID) smaller than the
>> disk, and left the rest unpartitioned. However, as you said, once I've
>> written enough data to fill the raw disk capacity, I still have a
>> problem. Is there some way to instruct the disk (overnight) to TRIM the
>> extra blank space, and do whatever it needs to tidy things up? Perhaps
>> this would help, at least first thing in the morning, even if it isn't
>> enough to get through the day. Potentially I could add a 6th SSD and
>> reduce the partition size across all of them, just so there is more
>> blank space to get through a full day's worth of writes?
>
> There was a script called mdtrim that would use hdparm to manually
> send the proper TRIM commands to the drives. I didn't bother looking
> for a link because it scares me to death and you probably shouldn't
> use it. If it gets the math wrong, random data will disappear from your
> disks.
Doesn't sound good... it would be nice to use smartctl or similar to ask
the drive to "please tidy up now". The drive itself knows that the
unpartitioned space is available.
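At least I can confirm what the drives and the md layer actually advertise
for TRIM before worrying about how to trigger it (just a sketch, assuming a
reasonably recent hdparm and util-linux; /dev/sda standing in for one of
the SSDs):

    # does the drive itself advertise TRIM (Data Set Management)?
    hdparm -I /dev/sda | grep -i trim

    # what discard granularity/limits does each block device (md included) report?
    lsblk -D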
> As for changing partition sizes you really have to know what kinds of
> IO you're doing. If all you're doing is hammering these things with
> tiny IOs 24x7, it's going to end up with terrible write IO. At least my
> SSDs do. If you have a decent mix of small and large it may not
> fragment as badly. I ran random 4k against mine for 2 days before it
> got miserably slow. Reading will always be fine.
Well, if I can re-trim daily, and have enough clean space to work for 2
days, then I should never hit this problem... assuming it loses *that
much* performance...
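Just so I understand what that script would be doing under the hood, I
gather it boils down to hand-feeding LBA ranges to hdparm, something like
the following (purely illustrative, with a made-up start sector and a
placeholder device name; as you say, get the ranges wrong and data silently
disappears, so I won't be running this as-is):

    # DANGEROUS, illustration only: tell the drive that 65535 sectors
    # starting at LBA 1000000000 (a made-up offset that would have to fall
    # inside the unpartitioned region) are unused and may be discarded
    hdparm --please-destroy-my-drive --trim-sector-ranges 1000000000:65535 /dev/sdX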
Thanks,
Adam
--
Adam Goryachev
Website Managers
www.websitemanagers.com.au