From: Ric Wheeler <rwheeler@redhat.com>
To: David Woodhouse <dwmw2@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	Ric Wheeler <rwheeler@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	chris.mason@oracle.com, linux-btrfs@vger.kernel.org,
	neilb@suse.de, linux-raid@vger.kernel.org, plank@cs.utk.edu
Subject: Re: [PATCH 1/4] md: Factor out RAID6 algorithms into lib/
Date: Sat, 18 Jul 2009 08:49:25 -0400
Message-ID: <4A61C4D5.6020707@redhat.com>
In-Reply-To: <1247918016.22313.138.camel@macbook.infradead.org>

On 07/18/2009 07:53 AM, David Woodhouse wrote:
> On Fri, 2009-07-17 at 11:49 -0400, H. Peter Anvin wrote:
>
>> Ric Wheeler wrote:
>>
>>>> The bottom line is pretty much this: the cost of changing the encoding
>>>> would appear to outweigh the benefit. I'm not trying to claim the Linux
>>>> RAID-6 implementation is optimal, but it is simple and appears to be
>>>> fast enough that the math isn't the bottleneck.
>>>>
>>> Cost? Think about how to get free grad student hours testing out things
>>> that you might or might not want to leverage on down the road :-)
>>>
>> Cost, yes, of changing an on-disk format.
>>
>
> Personally, I don't care about that -- I'm utterly uninterested in the
> legacy RAID-6 setup where it pretends to be a normal disk. I think that
> model is as fundamentally wrong as flash devices making the similar
> pretence.
>
> I'm only interested in what we can use directly within btrfs -- and
> ideally I do want something which gives me an _arbitrary_ number of
> redundant blocks, rather than limiting me to 2. But the legacy code is
> good enough for now¹.
>
> When I get round to wanting more, I was thinking of lifting something
> like http://git.infradead.org/mtd-utils.git?a=blob;f=fec.c to start
> with, and maybe hoping that someone cleverer will come up with something
> better.
>
> The less I have to deal with Galois Fields, the happier I'll be.
>

I think that we are generally fine with the RAID5/6 support given a
small number of drives. The fancier erasure encodings are much more
interesting when you have a large number of drives - for example, we
just ordered 4 shelves of SATA drives (15/shelf) that will be driven by
a single server. You can certainly imagine profiling a lot of
interesting variations with that many things to play with.

Ric





