From: NeilBrown <neilb@suse.de>
To: stan@hardwarefreak.com
Cc: John Williams <jwilliams4200@gmail.com>,
James Plank <plank@cs.utk.edu>, Ric Wheeler <rwheeler@redhat.com>,
Andrea Mazzoleni <amadvance@gmail.com>,
"H. Peter Anvin" <hpa@zytor.com>,
Linux RAID Mailing List <linux-raid@vger.kernel.org>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
David Brown <david.brown@hesbynett.no>,
David Smith <creamyfish@gmail.com>
Subject: Re: Triple parity and beyond
Date: Sat, 23 Nov 2013 16:04:28 +1100
Message-ID: <20131123160428.6f1c5898@notabene.brown>
In-Reply-To: <5290252A.8020508@hardwarefreak.com>
On Fri, 22 Nov 2013 21:46:50 -0600 Stan Hoeppner <stan@hardwarefreak.com>
wrote:
> On 11/22/2013 5:07 PM, NeilBrown wrote:
> > On Thu, 21 Nov 2013 16:57:48 -0600 Stan Hoeppner <stan@hardwarefreak.com>
> > wrote:
> >
> >> On 11/21/2013 1:05 AM, John Williams wrote:
> >>> On Wed, Nov 20, 2013 at 10:52 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
> >>>> On 11/20/2013 8:46 PM, John Williams wrote:
> >>>>> For myself or any machines I managed for work that do not need high
> >>>>> IOPS, I would definitely choose triple- or quad-parity over RAID 51 or
> >>>>> similar schemes with arrays of 16 - 32 drives.
> >>>>
> >>>> You must see a week long rebuild as acceptable...
> >>>
> >>> It would not be a problem if it did take that long, since I would have
> >>> extra parity units as backup in case of a failure during a rebuild.
> >>>
> >>> But of course it would not take that long. Take, for example, a 24 x
> >>> 3TB triple-parity array (21+3) that has had two drive failures
> >>> (perhaps the rebuild started with one failure, but there was soon
> >>> another failure). I would expect the rebuild to take about a day.
> >>
> >> You're looking at today. We're discussing tomorrow's needs. Today's
> >> 6TB 3.5" drives have sustained average throughput of ~175MB/s.
> >> Tomorrow's 20TB drives will be lucky to do 300MB/s. As I said
> >> previously, at that rate a straight disk-disk copy of a 20TB drive takes
> >> 18.6 hours. This is what you get with RAID1/10/51. In the real world,
> >> rebuilding a failed drive in a 3P array of say 8 of these disks will
> >> likely take at least 3 times as long, 2 days 6 hours minimum, probably
> >> more. This may be perfectly acceptable to some, but probably not to all.
> >
> > Could you explain your logic here? Why do you think rebuilding parity
> > will take 3 times as long as rebuilding a copy? Can you measure that sort of
> > difference today?
>
> I've not performed head-to-head timed rebuild tests of mirror vs parity
> RAIDs. I'm making the elapsed guess for parity RAIDs based on posts
> here over the past ~3 years, in which many users reported 16-24+ hour
> rebuild times for their fairly wide (12-16 1-2TB drive) RAID6 arrays.
I guess with that many drives you could hit PCI bus throughput limits.
A 16-lane PCIe 4.0 could just about give 100MB/s to each of 16 devices. So
you would really need top-end hardware to keep all of 16 drives busy in a
recovery.

So yes: rebuilding a drive in a 16-drive RAID6+ would be slower than in e.g.
a 20 drive RAID10.
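
As a rough sanity check of the numbers in this thread, here is a minimal
sketch of the arithmetic. It assumes the shared bus is split evenly between
all busy drives and that the rebuild is limited only by the slower of the
drive and its bus share. The function names and the 1600 MB/s bus budget are
hypothetical, chosen only so that 16 drives get 100MB/s each; nothing here is
measured.

```python
def per_device_mb_s(bus_mb_s: float, devices: int) -> float:
    """Even share of a shared bus budget, in MB/s per device."""
    return bus_mb_s / devices


def copy_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours to stream one whole drive at a sustained rate (decimal TB/MB)."""
    seconds = capacity_tb * 1_000_000 / rate_mb_s
    return seconds / 3600


def rebuild_hours(capacity_tb: float, drive_mb_s: float,
                  bus_mb_s: float, drives_busy: int) -> float:
    """Lower bound on rebuild time: the drive streams at the slower of its
    own sustained rate and its share of the bus. Ignores CPU cost and any
    concurrent user load, so real rebuilds will take longer."""
    rate = min(drive_mb_s, per_device_mb_s(bus_mb_s, drives_busy))
    return copy_hours(capacity_tb, rate)


if __name__ == "__main__":
    # Straight disk-to-disk copy of a 20TB drive at 300MB/s.
    print(f"mirror copy:      {copy_hours(20, 300):.1f} h")
    # Same drive in a 16-drive array behind an assumed 1600 MB/s bus budget.
    print(f"bus-limited:      {rebuild_hours(20, 300, 1600, 16):.1f} h")
```

With these assumed inputs the plain copy comes out at roughly 18.5 hours, and
the bus-limited case at roughly three times that, simply because each drive
is held to 100MB/s instead of 300MB/s.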
>
> This is likely due to their chosen rebuild priority and concurrent user
> load during rebuild. Since this seems to be the norm, instead of giving
> 100% to the rebuild, I thought it prudent to take this into account,
> instead of the theoretical minimum rebuild time.
>
> > Presumably when we have 20TB drives we will also have more cores and quite
> > possibly dedicated co-processors which will make the CPU load less
> > significant.
>
> But (when) will we have the code to fully take advantage of these? It's
> nearly 2014 and we still don't have a working threaded write model for
> levels 5/6/10, though maybe soon. Multi-core mainstream x86 CPUs have
> been around for 8 years now, SMP and ccNUMA systems even longer. So the
> need has been there for a while.
I think we might have that multi-threading now - not sure exactly what is
enabled by default though.

I think it requires more than "need" - it requires "demand", i.e. people
repeatedly expressing the need. We certainly have had that for a while, but
not a very long while.
>
> I'm strictly making an observation (possibly not fully accurate) here.
> I am not casting stones. I'm not a programmer and am thus unable to
> contribute code, only ideas and troubleshooting assistance for fellow
> users. Ergo I have no right/standing to complain about the rate of
> feature progress. I know that everyone hacking md is making the most of
> the time they have available. So again, not a complaint, just an
> observation.
Understood - and thanks for your observation.
NeilBrown