From: Jeff Layton <jlayton@redhat.com>
To: Josef Bacik <josef@redhat.com>
Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
chris.mason@oracle.com, hch@lst.de, ssorce@redhat.com
Subject: Re: What to do about subvolumes?
Date: Wed, 1 Dec 2010 15:03:52 -0500
Message-ID: <20101201150352.164007ed@tlielax.poochiereds.net>
In-Reply-To: <20101201142136.GD427@dhcp231-156.rdu.redhat.com>
On Wed, 1 Dec 2010 09:21:36 -0500
Josef Bacik <josef@redhat.com> wrote:
> There is one tricky thing. When you create a subvolume, the directory inode
> that is created in the parent subvolume has the inode number of 256. So if you
> have a bunch of subvolumes in the same parent subvolume, you are going to have a
> bunch of directories with the inode number of 256. This is so when users cd
> into a subvolume we can know it's a subvolume and do all the normal voodoo to
> start looking in the subvolume's tree instead of the parent subvolume's tree.
>
> This is where things go a bit sideways. We had serious problems with NFS, but
> thankfully NFS gives us a bunch of hooks to get around these problems.
> CIFS/Samba do not, so we will have problems there, not to mention any other
> userspace application that looks at inode numbers.
A more common use case than CIFS or Samba is going to be things like
backup programs. They commonly look at inode numbers in order to
identify hardlinks and may be horribly confused when there are files
that have a link count >1 and inode numbers that collide with those of
other files.
That probably qualifies as an "enterprise-ready" show stopper...
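To make the failure mode concrete, here is a minimal sketch of how a backup
tool might group hardlinks by (st_dev, st_ino). The Entry type, the paths,
and the device and inode numbers are all made up for illustration; it only
shows what would happen if two subvolumes were to report the same st_dev
while reusing inode numbers, and is not taken from any real backup program.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical subset of stat() results for one path."""
    path: str
    st_dev: int
    st_ino: int
    st_nlink: int


def hardlink_groups(entries):
    """Group paths with link count > 1 by their (st_dev, st_ino) identity."""
    groups = defaultdict(list)
    for e in entries:
        if e.st_nlink > 1:
            groups[(e.st_dev, e.st_ino)].append(e.path)
    # Only groups with more than one path are treated as hardlink sets.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


def sample(dev_a, dev_b):
    """Two subvolumes, each holding a hardlinked pair at inode 257 (made up)."""
    return [
        Entry("subvol_a/x", dev_a, 257, 2),
        Entry("subvol_a/y", dev_a, 257, 2),
        Entry("subvol_b/x", dev_b, 257, 2),
        Entry("subvol_b/y", dev_b, 257, 2),
    ]


# Same st_dev for both subvolumes: all four paths collapse into one bogus group.
shared = hardlink_groups(sample(40, 40))
# Distinct st_dev per subvolume: the two pairs stay separate, as they should.
distinct = hardlink_groups(sample(40, 41))
```

With a shared st_dev the tool would believe all four paths are the same
file and could store only one copy of the data.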
> === What do we do? ===
>
> This is where I expect to see the most discussion. Here is what I want to do
>
> 1) Scrap the 256 inode number thing. Instead we'll just put a flag in the inode
> to say "Hey, I'm a subvolume" and then we can do all of the appropriate magic
> that way. This unfortunately will be an incompatible format change, but the
> sooner we get this addressed the easier it will be in the long run. Obviously
> when I say format change I mean via the incompat bits we have, so old fs's won't
> be broken and such.
>
> 2) Do something like NFS's referral mounts when we cd into a subvolume. Now we
> just do dentry trickery, but that doesn't make the boundary between subvolumes
> clear, so it will confuse people (and samba) when they walk into a subvolume and
> all of a sudden the inode numbers are the same as in the directory behind them.
> With doing the referral mount thing, each subvolume appears to be its own mount
> and that way things like NFS and samba will work properly.
>
Sounds like you're on the right track.
The key concept is really that an inode number should be unique within
the scope of the st_dev. The simplest solution for you here is to give
each subvol its own st_dev and mount it automagically via a shrinkable
mount when someone walks into the directory. In addition to the
examples of this in NFS, CIFS does this for DFS referrals.
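As a rough illustration of why a per-subvolume st_dev helps userspace, the
sketch below shows how a tool could notice that it has walked across a
boundary simply by comparing st_dev with the parent directory. The function
name is hypothetical and this says nothing about how the kernel side would
be implemented; it only assumes that stat() reports a different st_dev on
each side of the boundary.

```python
import os


def is_device_boundary(path):
    """Return True if `path` reports a different st_dev than its parent directory."""
    path = os.path.abspath(path)
    parent = os.path.dirname(path)  # for "/" this is "/" itself
    return os.lstat(path).st_dev != os.lstat(parent).st_dev
```

A tree walker that wants to stay on one filesystem can skip any directory
for which this returns True, and a hardlink detector can safely reuse inode
numbers as keys as long as it also keys on st_dev.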
Today, this is mostly done by hijacking the follow_link operation, but
David Howells proposed some patches a while back to do this via a more
formalized interface. It may be reasonable to target this work on top
of that, depending on the state of those changes...
--
Jeff Layton <jlayton@redhat.com>