* RAID5/6 Implementation - Understanding first
From: Tony Plack @ 2013-02-18 23:20 UTC
To: linux-btrfs
Chris and team, hats off on the RAID5/6 being at least experimental. I have been following your work for a year now, and waiting for these days.
I am trying to get my head wrapped around the architecture for BTRFS before I jump in and start recommending code changes to the branch.
What I am trying to understand are the comments in the Git commit, which state:
Read/modify/write is done after the higher levels of the filesystem have
prepared a given bio. This means the higher layers are not responsible
for building full stripes, and they don't need to query for the topology
of the extents that may get allocated during delayed allocation runs.
It also means different files can easily share the same stripe.
As I understand it, we are trying to hide the underlying extent architecture to gain some advantages in the higher-level code. I have been digging in the code, and believe I know the answer to this question. So, by "higher levels," does this mean that RMW, snapshots, checksums, and duplicate detection are all unaware of the RAID architecture?
If so, I might have some points to consider in this space. If not, I will need to dig deeper in the code to understand how some of my concerns can be realized and how I missed the answer to my question.
Thank you for this awesome work you all are doing and thank you for the time to answer.
Anthony Plack
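For readers new to the term, the read/modify/write (RMW) step mentioned in the commit message can be illustrated with the textbook RAID5 parity update. This is a minimal sketch and not the btrfs code: the function names, block sizes, and in-memory "stripe" are all hypothetical, and the real implementation may recompute parity differently. It shows only why a sub-stripe write needs the old contents of the stripe before parity can be rewritten.

```python
from typing import List, Tuple

# Illustrative sketch only; not btrfs code. All names here are hypothetical.


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(data_blocks: List[bytes]) -> bytes:
    """Parity of a full stripe: the XOR of every data block."""
    parity = bytes(len(data_blocks[0]))  # all-zero block
    for block in data_blocks:
        parity = xor_blocks(parity, block)
    return parity


def rmw_update(
    stripe: List[bytes], parity: bytes, index: int, new_block: bytes
) -> Tuple[List[bytes], bytes]:
    """Sub-stripe write: read old data and old parity, modify, write back.

    new_parity = old_parity XOR old_data XOR new_data
    """
    old_block = stripe[index]
    new_parity = xor_blocks(xor_blocks(parity, old_block), new_block)
    new_stripe = list(stripe)
    new_stripe[index] = new_block
    return new_stripe, new_parity


# A stripe with three data blocks; they could belong to different files.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = full_parity(stripe)

# Rewrite only the middle block; the other data blocks are left untouched.
stripe, parity = rmw_update(stripe, parity, 1, b"bbbb")

# The incrementally updated parity matches a full recomputation.
assert parity == full_parity(stripe)
```

Because the update needs only the stripe's blocks and parity, nothing in it depends on which file owns which block, which is consistent with the commit message's point that different files can share a stripe.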
* Re: RAID5/6 Implementation - Understanding first
From: Chris Mason @ 2013-02-19 1:17 UTC
To: Tony Plack; +Cc: linux-btrfs@vger.kernel.org
On Mon, Feb 18, 2013 at 04:20:58PM -0700, Tony Plack wrote:
> Chris and team, hats off on the RAID5/6 being at least experimental.
> I have been following your work for a year now, and waiting for these
> days.
>
> I am trying to get my head wrapped around the architecture for BTRFS
> before I jump in and start recommending code changes to the branch.
>
> What I am trying to understand is the comments in the GIT commit which
> state:
>
> Read/modify/write is done after the higher levels of the filesystem have
> prepared a given bio. This means the higher layers are not responsible
> for building full stripes, and they don't need to query for the topology
> of the extents that may get allocated during delayed allocation runs.
> It also means different files can easily share the same stripe.
>
> As I understand it, what we are doing is trying to hide the underlying
> extents architecture to gain some advantages in the higher level code.
> I have been digging in the code, and believe I know the answer to this
> question. So by "higher levels" does this mean that RMW, snapshots,
> checksums and duplicate detection are all unaware of RAID
> architecture?
Yes, although the allocator is aware of the raid code, and the raid code
is aware that the higher levels are doing copy-on-write. They also
share the same transaction subsystem, at least until my parity logging
code is complete.
Longer term the two will cooperate more. For example, when we trigger
read/modify/write in RAID because a sub-stripe write was made to a large
file, we might as well use adjacent blocks from that file to fill the
new stripe. This will reduce a lot of complexity in terms of small
extent overhead in the rest of the code.
-chris
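The longer-term idea in the reply above, filling a new stripe with adjacent blocks from the same file instead of doing read/modify/write, can be sketched as follows. This is only one possible reading of that paragraph, not code from the btrfs tree: `fill_stripe`, its parameters, and the "extend forward first, then backward" policy are all assumptions made for illustration.

```python
from typing import List

# Illustrative sketch only; not btrfs code. The function name, its
# parameters, and the forward-then-backward policy are hypothetical.


def fill_stripe(
    num_file_blocks: int, start: int, count: int, stripe_width: int
) -> List[int]:
    """Pick the file block indices to place in one new stripe.

    Starts with the blocks being written, [start, start + count), and
    extends the range with adjacent blocks of the same file until the
    stripe is full or the file has no more blocks to offer.
    """
    if count < 1 or count > stripe_width:
        raise ValueError("count must be between 1 and stripe_width")
    if start < 0 or start + count > num_file_blocks:
        raise ValueError("write range lies outside the file")

    lo, hi = start, start + count  # half-open range [lo, hi)
    while hi - lo < stripe_width and (hi < num_file_blocks or lo > 0):
        if hi < num_file_blocks:
            hi += 1  # take the next following block
        else:
            lo -= 1  # no following block left; take a preceding one
    return list(range(lo, hi))


# A 2-block write into a 10-block file, with 4 data blocks per stripe.
blocks = fill_stripe(num_file_blocks=10, start=5, count=2, stripe_width=4)
print(blocks)  # [5, 6, 7, 8]

# A full stripe needs no old data or old parity to be read back.
print(len(blocks) == 4)  # True
```

When the file is too small to fill the stripe, the returned list is shorter than `stripe_width`, and a caller in this sketch would have to fall back to read/modify/write.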