From: NeilBrown <neilb@suse.de>
To: John Williams <jwilliams4200@gmail.com>
Cc: stan@hardwarefreak.com, James Plank <plank@cs.utk.edu>,
    Ric Wheeler <rwheeler@redhat.com>,
    Andrea Mazzoleni <amadvance@gmail.com>,
    "H. Peter Anvin" <hpa@zytor.com>,
    Linux RAID Mailing List <linux-raid@vger.kernel.org>,
    Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
    David Brown <david.brown@hesbynett.no>,
    David Smith <creamyfish@gmail.com>
Subject: Re: Triple parity and beyond
Date: Sat, 23 Nov 2013 18:12:08 +1100
Message-ID: <20131123181208.5103bee4@notabene.brown>
In-Reply-To: <CAJBj3vfsbPoke2spzomBQRGqmSG9RCjwfMG1R4mfmJ8SOBZjvw@mail.gmail.com>
On Fri, 22 Nov 2013 21:34:41 -0800 John Williams <jwilliams4200@gmail.com>
wrote:
> On Fri, Nov 22, 2013 at 9:04 PM, NeilBrown <neilb@suse.de> wrote:
>
> > I guess with that many drives you could hit PCI bus throughput limits.
> >
> > A 16-lane PCIe 4.0 could just about give 100MB/s to each of 16 devices. So
> > you would really need top-end hardware to keep all of 16 drives busy in a
> > recovery.
> > So yes: rebuilding a drive in a 16-drive RAID6+ would be slower than in e.g.
> > a 20 drive RAID10.
>
> Not really. A single 8x PCIe 2.0 card has 8 x 500MB/s = 4000MB/s of
> potential bandwidth. That would be 250MB/s per drive for 16 drives.
>
> But quite a few people running software RAID with many drives have
> multiple PCIe cards. For example, in one machine I have three IBM
> M1015 cards (which I got for $75/ea) that are 8x PCIe 2.0. That comes
> to 3 x 500MB/s x 8 = 12GB/s of IO bandwidth.
>
> Also, your math is wrong. PCIe 3.0 is 985 MB/s per lane. If we assume
> PCIe 4.0 would double that, we would have 1970MB/s per lane. So one
> lane of the hypothetical PCIe 4.0 would have enough IO bandwidth to
> give about 120MB/s to each of 16 drives. A single 8x PCIe 4.0 card
> would have 8 times that capability which is more than 15GB/s.

It wasn't my math, it was my reading :-(

16-lane PCIe 4.0 is about 31 GB/sec, so roughly 2 GB/sec per drive for
16 drives. I was reading the "1-lane" number...

>
> Even a single 8x PCIe 3.0 card has potentially over 7GB/s of bandwidth.
>
> Bottom line is that IO bandwidth is not a problem for a system with
> prudently chosen hardware.
>
> More likely is that you would be CPU limited (rather than bus limited)
> in a high-parity rebuild where more than one drive failed. But even
> that is not likely to be too bad, since Andrea's single-threaded
> recovery code can recover two drives at nearly 1GB/s on one of my
> machines. I think the code could probably be threaded to achieve a
> multiple of that running on multiple cores.

Indeed. It seems likely that with modern hardware, the linear write speed
would be the limiting factor for spinning-rust drives.

For SSDs the limit might end up being somewhere else ...
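
The bus figures traded above are easy to re-derive. What follows is only a
rough sketch in Python: the per-lane rates are the ones quoted in this
thread (the PCIe 4.0 value is the assumed doubling of 3.0, not a measured
number), and the names PER_LANE_MB_S, link_mb_s and per_drive_mb_s are made
up for illustration.

```python
# Per-lane PCIe throughput in MB/s, using the figures quoted in this thread.
# The 4.0 value is the thread's assumption (double the 3.0 rate), not a
# measured number.
PER_LANE_MB_S = {"2.0": 500, "3.0": 985, "4.0": 1970}


def link_mb_s(gen, lanes, cards=1):
    """Aggregate bandwidth in MB/s for `cards` links of `lanes` lanes each."""
    return PER_LANE_MB_S[gen] * lanes * cards


def per_drive_mb_s(gen, lanes, drives, cards=1):
    """Bandwidth share per drive if all drives stream at the same time."""
    return link_mb_s(gen, lanes, cards) / drives


if __name__ == "__main__":
    cases = [("2.0", 8), ("3.0", 8), ("4.0", 1), ("4.0", 8), ("4.0", 16)]
    for gen, lanes in cases:
        total = link_mb_s(gen, lanes)
        share = per_drive_mb_s(gen, lanes, drives=16)
        print(f"PCIe {gen} x{lanes}: {total} MB/s total, "
              f"{share:.0f} MB/s per drive (16 drives)")
```

With 16 drives this gives 250 MB/s per drive on a single 8x PCIe 2.0 link
and 1970 MB/s per drive on a 16-lane PCIe 4.0 link, which matches the
numbers above. It ignores protocol overhead and anything the controller or
drives do, so treat it as an upper bound on the bus only.
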
Thanks,
NeilBrown