From: Chris Mason <chris.mason@oracle.com>
To: Mike Snitzer <snitzer@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [ANNOUNCE] Btrfs: a copy on write, snapshotting FS
Date: Tue, 12 Jun 2007 16:14:39 -0400
Message-ID: <20070612201439.GI28279@think.oraclecorp.com>
In-Reply-To: <170fa0d20706121253v62d13f70p3694eeab1852092a@mail.gmail.com>

On Tue, Jun 12, 2007 at 03:53:03PM -0400, Mike Snitzer wrote:
> On 6/12/07, Chris Mason <chris.mason@oracle.com> wrote:
> >Hello everyone,
> >
> >After the last FS summit, I started working on a new filesystem that
> >maintains checksums of all file data and metadata.  Many thanks to Zach
> >Brown for his ideas, and to Dave Chinner for his help on
> >benchmarking analysis.
> 
> Chris,
> 
> Given the substantial work that you've already put into btrfs and the
> direction your Todo list details, it feels as though Btrfs will
> quickly provide the features that only Sun's ZFS provides.
> 
> Looking at your Btrfs benchmark and design pages it is clear that
> your motivation is a filesystem that addresses modern concerns
> (performance that doesn't degrade over time, online fsck, fast offline
> fsck, data/metadata checksums, unlimited snapshots, efficient remote
> mirroring, etc).  There is still much "Todo" but you've made very
> impressive progress for the first announcement!
> 
> I have some management oriented questions/comments.
> 
> 1)
> Regarding the direction of Btrfs as it relates to integration with DM.
> The allocation policies, the ease of configuring DM-based
> striping/mirroring, management of large pools of storage all seem to
> indicate that Btrfs will manage the physical spindles internally.
> This is very ZFS-ish (ZFS pools) so I'd like to understand where you
> see Btrfs going in this area.

There's quite a lot of hand waving in that section.  What I'd like to do
is work closely with the LVM/DM/MD maintainers and come up with
something that leverages what Linux already does.  I don't want to
rewrite LVM into the FS, but I do want to make better use of info about
the underlying storage.

> 
> Your initial benchmarks were all done on top of a single disk with an
> LVM stack, yet your roadmap/todo and design speak to a tighter
> integration of the volume management features.  So, long term, is
> traditional LVM/MD functionality to be pulled directly into Btrfs?
> 
> 2)
> The Btrfs notion of subvolumes and snapshots is very elegant and
> provides for fluid management of the filesystem data.  It
> feels as though each subvolume/snapshot is just folded into the parent
> Btrfs volume's namespace.  Was there any particular reason you elected
> to do this?  I can see that it lends itself to allowing snapshots of
> snapshots.  If you could elaborate I'd appreciate it.
> 
Yes, I wanted snapshots to be writable and resnapshottable.  It also
lowers the complexity to keep each snapshot as a subvolume/tree.

Subvolumes are only slightly more expensive than a directory.  So, even
though a subvolume is a coarse-grained unit for a snapshot, you can get
around this by just making more subvolumes.
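
As a rough illustration of why keeping each snapshot as its own tree makes
writable, resnapshottable snapshots cheap, here is a toy in-memory sketch.
It is not Btrfs code and says nothing about the disk format; the names
Node, Subvolume and _cow_write are made up for the example, and the
sharing scheme is only an assumption about how copy on write trees
generally behave.

```python
# Toy in-memory model of copy-on-write subvolumes; NOT the Btrfs format.
# All names here (Node, Subvolume, _cow_write) are hypothetical.

class Node:
    """Directory node mapping names to child Nodes or to file bytes.

    Nodes are treated as immutable once built, so they can be shared.
    """
    def __init__(self, entries=None):
        self.entries = dict(entries or {})


class Subvolume:
    """A subvolume is modeled as nothing more than a pointer to a tree root."""
    def __init__(self, root=None):
        self.root = root if root is not None else Node()

    def snapshot(self):
        # Constant-time: the snapshot starts out sharing the entire tree.
        # The result is itself a Subvolume, so it is writable and can be
        # snapshotted again.
        return Subvolume(self.root)

    def write(self, path, data):
        # Only the nodes along `path` are copied; siblings stay shared.
        parts = path.strip("/").split("/")
        self.root = _cow_write(self.root, parts, data)

    def read(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            node = node.entries[part]
        return node


def _cow_write(node, parts, data):
    """Return a new Node with `data` stored at `parts`, leaving `node` intact."""
    entries = dict(node.entries)          # shallow copy: children are shared
    head, rest = parts[0], parts[1:]
    if rest:
        child = entries.get(head)
        if not isinstance(child, Node):
            child = Node()
        entries[head] = _cow_write(child, rest, data)
    else:
        entries[head] = data
    return Node(entries)
```

In this model a snapshot of a snapshot is just another call to snapshot(),
and a write to any one of them leaves the others untouched.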

> In practice subvolumes and/or snapshots appear to be implicitly
> mounted upon creation (refcount of parent is incremented).  Is this
> correct?  For snapshots, this runs counter to mapping the snapshots'
> data into the namespace of the origin Btrfs (e.g. with a .snapshot
> dir, but this is only useful for read-only snaps).  Having snapshot
> namespaces in terms of monolithic subvolumes puts a less intuitive
> face on N Btrfs snapshots.  The history of a given file/dir feels to
> be lost with this model.

That's somewhat true.  The disk format does have enough information to
show you that history, but cleanly expressing it to the user is a
daunting task.

> 
> Aside from folding snapshot history into the origin's namespace... It
> could be possible to have a mount.btrfs that allows subvolumes and/or
> snapshot volumes to be mounted as unique roots?  I'd imagine a bind
> mount _could_ provide this too?  Anyway, I'm just interested in
> understanding the vision for managing the potentially complex nature
> of a Btrfs namespace.

One option is to put the real btrfs root into some directory
(/sys/fs/btrfs/$device?) and then use tools in userland to mount -o bind
subvolumes from there to wherever they are wanted.  I wanted to wait to
get fancy until I had a better idea of how people would use the feature.
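
A minimal sketch of what such a userland tool might do, assuming the real
root really were exposed under /sys/fs/btrfs/$device (that path is only a
guess above, not an existing interface).  The function name
bind_mount_commands and its parameters are hypothetical; it only builds
the mount command lines and does not run them.

```python
import posixpath


def bind_mount_commands(device, targets, base="/sys/fs/btrfs"):
    """Build `mount -o bind` command lines for a set of subvolumes.

    device  -- name of the directory holding the real btrfs root (assumed)
    targets -- dict mapping subvolume name to the desired mount point
    base    -- hypothetical location of the per-device roots
    """
    commands = []
    for subvol, mountpoint in sorted(targets.items()):
        source = posixpath.join(base, device, subvol)
        # Nothing is executed here; the caller decides whether to run these.
        commands.append(["mount", "-o", "bind", source, mountpoint])
    return commands
```

Each returned list could be handed to something like subprocess.run() by
a privileged helper, which would make a subvolume or snapshot appear as a
root of its own at the chosen mount point.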
> 
> Thanks for doing all this work; I think the Linux community got a much
> needed shot in the arm with this Btrfs announcement.
> 

Thanks for the comments.

-chris
