From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
Djalal Harouni <tixxdz@gmail.com>, Chris Mason <clm@fb.com>,
Theodore Tso <tytso@mit.edu>,
Josh Triplett <josh@joshtriplett.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Andy Lutomirski <luto@kernel.org>,
Seth Forshee <seth.forshee@canonical.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-security-module@vger.kernel.org,
Dongsu Park <dongsu@endocode.com>,
David Herrmann <dh.herrmann@googlemail.com>,
Miklos Szeredi <mszeredi@redhat.com>,
Alban Crequy <alban.crequy@gmail.com>,
"Serge E. Hallyn" <serge@hallyn.com>,
Phil Estes <estesp@gmail.com>
Subject: Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount
Date: Sat, 18 Feb 2017 19:24:38 -0800 [thread overview]
Message-ID: <1487474678.15793.2.camel@HansenPartnership.com> (raw)
In-Reply-To: <20170217203529.GC15841@redhat.com>
On Fri, 2017-02-17 at 15:35 -0500, Vivek Goyal wrote:
> On Fri, Feb 17, 2017 at 09:34:07AM -0800, James Bottomley wrote:
> > On Fri, 2017-02-17 at 02:55 +0000, Al Viro wrote:
> > > On Thu, Feb 16, 2017 at 07:56:30AM -0800, James Bottomley wrote:
> > >
> > > > > Hi James,
> > > > >
> > > > > Should it be "return d_splice_alias()" so that if we find an
> > > > > alias it is returned back to caller and passed in dentry can
> > > > > be freed. Though I don't know in what cases alias can be
> > > > > found. And if alias is found how do we make sure alias_dentry
> > > > > ->d_fsdata is pointing to new (real dentry).
> > > >
> > > > It probably should be for the sake of the pattern. In our case
> > > > I don't think we can have any root aliases because the root
> > > > dentry is always pinned in the cache, so cache lookup should
> > > > always find it.
> > >
> > > What does that have to do with root dentry? The real reason why
> > > that code works (FVerySVO) is that the damn thing allocates a new
> > > inode every time. Including the hardlinks, BTW.
> >
> > Yes, this is a known characteristic of stacked filesystems. Is
> > there some magic I don't know about that would make it easier to
> > reflect hard links as aliases?
>
> I think overlayfs had the same issue in the beginning and miklos
> fixed it.
>
> commit 51f7e52dc943468c6929fa0a82d4afac3c8e9636
> Author: Miklos Szeredi <mszeredi@redhat.com>
> Date: Fri Jul 29 12:05:24 2016 +0200
>
> ovl: share inode for hard link
That's rather complex, but the principle is simple: use the inode hash
for all upper inodes that may have aliases. Aliasable means the
underlying inode isn't a directory and has i_nlink > 1, so all I have
to do is perform a lookup through the hash if the underlying is
aliasable, invalidate the dentry in d_revalidate if the aliasing
conditions to the underlying change and manually handle hard links and
it should all work.
Like this?
James
---
diff --git a/fs/shiftfs.c b/fs/shiftfs.c
index 5b50447..c659812 100644
--- a/fs/shiftfs.c
+++ b/fs/shiftfs.c
@@ -134,6 +134,7 @@ static int shiftfs_d_weak_revalidate(struct dentry *dentry, unsigned int flags)
static int shiftfs_d_revalidate(struct dentry *dentry, unsigned int flags)
{
struct dentry *real = dentry->d_fsdata;
+ struct inode *reali = d_inode(real), *inode = d_inode(dentry);
int ret;
if (d_unhashed(real))
@@ -146,6 +147,15 @@ static int shiftfs_d_revalidate(struct dentry *dentry, unsigned int flags)
if (d_is_negative(real) != d_is_negative(dentry))
return 0;
+ /*
+ * non dir link count is > 1 and our inode is currently not in
+ * the inode hash => need to drop and reget our dentry to make
+ * sure we're aliasing it correctly.
+ */
+ if (reali &&!S_ISDIR(reali->i_mode) && reali->i_nlink > 1 &&
+ (!inode || inode_unhashed(inode)))
+ return 0;
+
if (!(real->d_flags & DCACHE_OP_REVALIDATE))
return 1;
@@ -285,7 +295,8 @@ static int shiftfs_make_object(struct inode *dir, struct dentry *dentry,
umode_t mode, const char *symlink,
struct dentry *hardlink, bool excl)
{
- struct dentry *real = dir->i_private, *new = dentry->d_fsdata;
+ struct dentry *real = dir->i_private, *new = dentry->d_fsdata,
+ *realhardlink = NULL;
struct inode *reali = real->d_inode, *newi;
const struct inode_operations *iop = reali->i_op;
int err;
@@ -293,6 +304,7 @@ static int shiftfs_make_object(struct inode *dir, struct dentry *dentry,
bool op_ok = false;
if (hardlink) {
+ realhardlink = hardlink->d_fsdata;
op_ok = iop->link;
} else {
switch (mode & S_IFMT) {
@@ -310,7 +322,7 @@ static int shiftfs_make_object(struct inode *dir, struct dentry *dentry,
return -EINVAL;
- newi = shiftfs_new_inode(dentry->d_sb, mode, NULL);
+ newi = shiftfs_new_inode(dentry->d_sb, mode, realhardlink);
if (!newi)
return -ENOMEM;
@@ -320,8 +332,6 @@ static int shiftfs_make_object(struct inode *dir, struct dentry *dentry,
err = -EINVAL; /* shut gcc up about uninit var */
if (hardlink) {
- struct dentry *realhardlink = hardlink->d_fsdata;
-
err = vfs_link(realhardlink, reali, new, NULL);
} else {
switch (mode & S_IFMT) {
@@ -341,7 +351,16 @@ static int shiftfs_make_object(struct inode *dir, struct dentry *dentry,
if (err)
goto out_dput;
- shiftfs_fill_inode(newi, new);
+ if (!hardlink)
+ shiftfs_fill_inode(newi, new);
+ else if (inode_unhashed(newi) && !S_ISDIR(newi->i_mode))
+ /*
+ * although dentry and hardlink now each point to
+ * newi, the link count was 1 when they were created,
+ * so insert into the inode cache now that the link
+ * count has gone above one.
+ */
+ __insert_inode_hash(newi, (unsigned long)d_inode(new));
d_instantiate(dentry, newi);
@@ -569,12 +588,55 @@ static const struct inode_operations shiftfs_inode_ops = {
.listxattr = shiftfs_listxattr,
};
+static int shiftfs_test(struct inode *inode, void *data)
+{
+ struct dentry *d1 = inode->i_private, *d2 = data;
+ struct inode *i1 = d_inode(d1), *i2 = d_inode(d2);
+
+ return i1 && i1 == i2;
+}
+
+static int shiftfs_set(struct inode *inode, void *data)
+{
+ struct dentry *dentry = data;
+
+ shiftfs_fill_inode(inode, dentry);
+
+ return 0;
+}
+
static struct inode *shiftfs_new_inode(struct super_block *sb, umode_t mode,
struct dentry *dentry)
{
struct inode *inode;
+ struct inode *reali = dentry ? d_inode(dentry): NULL;
+ bool use_inode_hash = false;
+
+ /*
+ * Here we hash the inode only if the underlying link count is
+ * greater than one and it's not a directory (meaning the hash
+ * contains all items that might be aliases). We keep this
+ * accurate by checking the underlying link count on
+ * revalidation and forcing a new lookup if the underlying
+ * link count is raised.
+ *
+ * Note: if the link count drops again, we don't remove the
+ * inode from the hash, so the hash contains all inodes that
+ * may be aliases plus a few others.
+ */
+ if (reali)
+ use_inode_hash = ACCESS_ONCE(reali->i_nlink) > 1 &&
+ !S_ISDIR(reali->i_mode);
+
+ if (use_inode_hash) {
+ inode = iget5_locked(sb, (unsigned long)reali, shiftfs_test,
+ shiftfs_set, dentry);
+ if (inode && !(inode->i_state & I_NEW))
+ return inode;
+ } else {
+ inode = new_inode(sb);
+ }
- inode = new_inode(sb);
if (!inode)
return NULL;
@@ -586,7 +648,10 @@ static struct inode *shiftfs_new_inode(struct super_block *sb, umode_t mode,
inode->i_op = &shiftfs_inode_ops;
- shiftfs_fill_inode(inode, dentry);
+ if (use_inode_hash)
+ unlock_new_inode(inode);
+ else
+ shiftfs_fill_inode(inode, dentry);
return inode;
}
next prev parent reply other threads:[~2017-02-19 3:26 UTC|newest]
Thread overview: 82+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-02-04 19:18 [RFC 0/1] shiftfs: uid/gid shifting filesystem (s_user_ns version) James Bottomley
2017-02-04 19:19 ` [RFC 1/1] shiftfs: uid/gid shifting bind mount James Bottomley
2017-02-05 7:51 ` Amir Goldstein
2017-02-06 1:18 ` James Bottomley
2017-02-06 6:59 ` Amir Goldstein
2017-02-06 14:41 ` James Bottomley
2017-02-14 23:03 ` Vivek Goyal
2017-02-14 23:45 ` James Bottomley
2017-02-15 14:17 ` Vivek Goyal
2017-02-16 15:51 ` James Bottomley
2017-02-16 16:42 ` Vivek Goyal
2017-02-16 16:58 ` James Bottomley
2017-02-17 1:57 ` Eric W. Biederman
2017-02-17 8:39 ` Djalal Harouni
2017-02-17 17:19 ` James Bottomley
2017-02-20 4:24 ` Eric W. Biederman
2017-02-22 12:01 ` James Bottomley
2017-02-06 3:25 ` J. R. Okajima
2017-02-06 6:38 ` Amir Goldstein
2017-02-06 16:29 ` James Bottomley
2017-02-06 6:46 ` James Bottomley
2017-02-06 14:50 ` Theodore Ts'o
2017-02-06 15:18 ` James Bottomley
2017-02-06 15:38 ` lkml
2017-02-06 17:32 ` James Bottomley
2017-02-06 21:52 ` J. Bruce Fields
2017-02-07 0:10 ` James Bottomley
2017-02-07 1:35 ` J. Bruce Fields
2017-02-07 19:01 ` James Bottomley
2017-02-07 19:47 ` Christoph Hellwig
2017-02-06 16:24 ` J. R. Okajima
2017-02-21 0:48 ` James Bottomley
2017-02-21 2:57 ` J. R. Okajima
2017-02-21 4:07 ` James Bottomley
2017-02-21 4:34 ` J. R. Okajima
2017-02-07 9:19 ` Christoph Hellwig
2017-02-07 9:39 ` Djalal Harouni
2017-02-07 9:53 ` Christoph Hellwig
2017-02-07 16:37 ` James Bottomley
2017-02-07 17:59 ` Amir Goldstein
2017-02-07 18:10 ` Christoph Hellwig
2017-02-07 19:02 ` James Bottomley
2017-02-07 19:49 ` Christoph Hellwig
2017-02-07 20:05 ` James Bottomley
2017-02-07 21:01 ` Amir Goldstein
2017-02-07 22:25 ` Christoph Hellwig
2017-02-07 23:42 ` James Bottomley
2017-02-08 6:44 ` Amir Goldstein
2017-02-08 11:45 ` Konstantin Khlebnikov
2017-02-08 14:57 ` James Bottomley
2017-02-08 15:15 ` James Bottomley
2017-02-08 1:54 ` Josh Triplett
2017-02-08 15:22 ` James Bottomley
2017-02-09 10:36 ` Josh Triplett
2017-02-09 15:34 ` James Bottomley
2017-02-13 10:15 ` Eric W. Biederman
2017-02-15 9:33 ` Djalal Harouni
2017-02-15 9:37 ` Eric W. Biederman
2017-02-15 10:04 ` Djalal Harouni
2017-02-07 18:20 ` James Bottomley
2017-02-07 19:48 ` Djalal Harouni
2017-02-15 20:34 ` Vivek Goyal
2017-02-16 15:56 ` James Bottomley
2017-02-17 2:55 ` Al Viro
2017-02-17 17:34 ` James Bottomley
2017-02-17 20:35 ` Vivek Goyal
2017-02-19 3:24 ` James Bottomley [this message]
2017-02-20 19:26 ` Vivek Goyal
2017-02-21 0:38 ` James Bottomley
2017-02-17 2:29 ` Al Viro
2017-02-17 17:24 ` James Bottomley
2017-02-17 17:51 ` Al Viro
2017-02-17 20:27 ` Vivek Goyal
2017-02-17 20:50 ` James Bottomley
-- strict thread matches above, loose matches on Subject: below --
2016-05-12 19:06 [RFC 0/1] shiftfs: uid/gid shifting filesystem James Bottomley
2016-05-12 19:07 ` [RFC 1/1] shiftfs: uid/gid shifting bind mount James Bottomley
2016-05-16 19:41 ` Serge Hallyn
2016-05-17 2:28 ` James Bottomley
2016-05-17 3:47 ` Serge E. Hallyn
2016-05-17 10:23 ` James Bottomley
2016-05-17 20:59 ` James Bottomley
2016-05-19 2:28 ` Serge E. Hallyn
2016-05-19 10:53 ` James Bottomley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1487474678.15793.2.camel@HansenPartnership.com \
--to=james.bottomley@hansenpartnership.com \
--cc=alban.crequy@gmail.com \
--cc=clm@fb.com \
--cc=dh.herrmann@googlemail.com \
--cc=dongsu@endocode.com \
--cc=ebiederm@xmission.com \
--cc=estesp@gmail.com \
--cc=josh@joshtriplett.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-security-module@vger.kernel.org \
--cc=luto@kernel.org \
--cc=mszeredi@redhat.com \
--cc=serge@hallyn.com \
--cc=seth.forshee@canonical.com \
--cc=tixxdz@gmail.com \
--cc=tytso@mit.edu \
--cc=vgoyal@redhat.com \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).