Re: RAID5/6 Implementation - Understanding first

From: Chris Mason <chris.mason@fusionio.com>
To: Tony Plack <tony@plack.net>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: RAID5/6 Implementation - Understanding first
Date: Mon, 18 Feb 2013 20:17:23 -0500
Message-ID: <20130219011723.GE13803@shiny.masoncoding.com>
In-Reply-To: <FB38335C-3CF7-4D82-B7E2-79264D54F0C8@plack.net>

On Mon, Feb 18, 2013 at 04:20:58PM -0700, Tony Plack wrote:
> Chris and team, hats off on the RAID5/6 being at least experimental.
> I have been following your work for a year now, and waiting for these
> days.
> 
> I am trying to get my head wrapped around the architecture for BTRFS
> before I jump in and start recommending code changes to the branch.
> 
> What I am trying to understand is the comments in the GIT commit which
> state:
> 
> 	Read/modify/write is done after the higher levels of the filesystem have
> 	prepared a given bio.  This means the higher layers are not responsible
> 	for building full stripes, and they don't need to query for the topology
> 	of the extents that may get allocated during delayed allocation runs.
> 	It also means different files can easily share the same stripe.
> 
> As I understand it, what we are doing is trying to hide the underlying
> extents architecture to gain some advantages in the higher level code.
> I have been digging in the code, and believe I know the answer to this
> question.  So by "higher levels" does this mean that RMW, snapshots,
> checksums and duplicate detection are all unaware of RAID
> architecture?

Yes, although the allocator is aware of the raid code, and the raid code
is aware that the higher levels are doing copy-on-write.  They also
share the same transaction subsystem, at least until my parity logging
code is complete.

Longer term the two will cooperate more.  For example, when we trigger
read/modify/write in RAID because a sub-stripe write was made to a large
file, we might as well use adjacent blocks from that file to fill the
new stripe.  This will reduce a lot of complexity in terms of small
extent overhead in the rest of the code.
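
To make the cost of a sub-stripe write concrete, here is a minimal sketch
of the parity arithmetic involved. It is not taken from the btrfs code: it
assumes single XOR parity (RAID5-style), and the helper names and the tiny
4-byte blocks are hypothetical, chosen only for illustration. A full-stripe
write can compute parity from the new data alone, while a sub-stripe write
has to read the old data and old parity first.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks byte by byte."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def full_stripe_parity(data_blocks):
    """Full-stripe write: parity comes from the new data only, no reads."""
    return xor_blocks(*data_blocks)


def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Sub-stripe write: read old data and old parity, then fold in the change."""
    return xor_blocks(old_parity, old_data, new_data)


# Hypothetical stripe: three data blocks plus one parity block.
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = full_stripe_parity(stripe)

# Overwrite only the middle block (a sub-stripe write).
new_block = b"\xff\x00\xff\x00"
updated_parity = rmw_parity(parity, stripe[1], new_block)
stripe[1] = new_block

# The incremental update matches recomputing parity over the whole stripe.
assert updated_parity == full_stripe_parity(stripe)

# Any single lost block can be rebuilt from the survivors plus parity.
rebuilt = xor_blocks(updated_parity, stripe[0], stripe[2])
assert rebuilt == stripe[1]
```

Filling the rest of the stripe with adjacent blocks from the same file, as
described above, would turn the second case into the first and avoid the
extra reads.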

-chris
