Linux Btrfs filesystem development
From: Chris Mason <chris.mason@oracle.com>
To: Ric Wheeler <rwheeler@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	Stephan von Krawczynski <skraw@ithnet.com>,
	Christoph Hellwig <hch@infradead.org>, jim owens <jowens@hp.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: Some very basic questions
Date: Wed, 22 Oct 2008 11:33:59 -0400
Message-ID: <1224689639.6448.72.camel@think.oraclecorp.com>
In-Reply-To: <48FF45EE.7010001@redhat.com>

On Wed, 2008-10-22 at 11:25 -0400, Ric Wheeler wrote:
> Avi Kivity wrote:
> > Ric Wheeler wrote:
> >>>
> >>> Well, btrfs is not about duplicating how most storage works today.  
> >>> Spare capacity has significant advantages over spare disks, such as 
> >>> being able to mix disk sizes, RAID levels, and better performance.
> >>
> >> Sure, there are advantages that go in favour of one or the other 
> >> approaches. But btrfs is also about being able to use common hardware 
> >> configurations without having to reinvent where we can avoid it (if 
> >> we have a working RAID or enough drives to do RAID5 with spares or 
> >> RAID6, we want to be able to delegate that off to something else if 
> >> we can).
> >
> > Well, if you have an existing RAID (or have lots of $$$ to buy a new 
> > one), you needn't tell Btrfs about it.  Just be sure not to enable 
> > Btrfs data redundancy, or you'll have redundant redundancy, which is 
> > expensive.
> >
> > What Btrfs enables with its multiple device capabilities is to 
> > assemble a JBOD into a filesystem-level data redundancy system, which 
> > is cheaper, more flexible (per-file data redundancy levels), and 
> > faster (no need for RMW, since you're always COWing).
>
> I think that the btrfs plan is still to push more complicated RAID 
> schemes off to MD (RAID6, etc) so this is an issue even with a JBOD.

At least v1.0 won't have raid6.  Over the longer term I hope to include
it because managing the storage once in btrfs and once in md is going to
be a bit clumsy.  Relying on md also limits the mixed mode functionality
that will allow us to perform well, such as different stripe sizes for
data vs metadata, or mirrored metadata combined with raid6 data.

The goal will be to make a library of raid routines based on md that
other storage will be able to use.  I know Christoph has been interested
in this as well.

But in general, the btrfs raid code can do either spare disks or spare
capacity modes safely.  It enforces the correct number of devices in
each raid mode (as long as the admin doesn't lie to us and feed us
partitions off the same device).
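
To make the device-count check above concrete, here is a minimal
sketch.  It is not btrfs code: the profile names, the minimum counts,
the check_devices() helper and the parent_disk mapping are all
assumptions made up for illustration.  It counts distinct physical
disks rather than partitions, which is the case the admin could
otherwise get wrong.

```python
# Illustrative sketch only; not btrfs code. The minimum device counts
# below are assumptions chosen for the example.
MIN_DEVICES = {"single": 1, "raid0": 2, "raid1": 2, "raid10": 4}


def check_devices(profile, partitions, parent_disk):
    """Return the sorted distinct disks backing `partitions`.

    `parent_disk` is a hypothetical mapping from partition name to the
    disk it lives on; a real tool would have to discover this itself.
    Raises ValueError if the profile cannot be satisfied.
    """
    if profile not in MIN_DEVICES:
        raise ValueError(f"unknown profile: {profile}")
    # Two partitions on one disk count as a single device.
    disks = sorted({parent_disk[p] for p in partitions})
    needed = MIN_DEVICES[profile]
    if len(disks) < needed:
        raise ValueError(
            f"{profile} needs {needed} distinct disks, got {len(disks)}"
        )
    return disks
```

With this sketch, two partitions from different disks pass a raid1
check, while two partitions carved from the same disk are rejected.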

I'll leave the rest up to the admin.  One problem with the spare
capacity model is that drives from the same batch that get hammered on
in the same way tend to die at the same time.  Some shops will sleep
better knowing there's a hot spare and that's fine by me.

-chris