public inbox for linux-unionfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/23] Overlayfs inodes index
@ 2017-06-14  7:26 Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 01/23] vfs: introduce inode 'inuse' lock Amir Goldstein
                   ` (23 more replies)
  0 siblings, 24 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel

Miklos,

I've made all the changes we discussed on v2 review and now all
upper/lower hardlinks are consistent. Note that I chose to invalidate
lower indexed dentries instead of updating all aliases on first copy up
(patch 21). That seemed simpler to me. Please let me know if you see any
culprit with that approach. I also re-introduced ovl_update_type()
(patch 8) for easier and more compact storing of the INDEX type flag.

The remaining work on persistent nlink accounting and orphan index
cleanup is left for a differnt posting. Without that complementary work,
the index entries will remain forever. I am not aware of any other
culprit or correctness issues that may arrise from lack of index cleanup.

This work introduces the inodes index opt-in feature, which provides:
- Hardlinks are not broken on copy up
- Infrastructure for overlayfs NFS export

The work is available on my ovl-hardlinks branch [1].
For the curious, documentation about how overlay NFS export would work
with the inodes index is available in the linked commit [2].

The most significant change w.r.t. vfs is that with inodes index,
overlay inodes are hashed and unique throughout the lifecycle of an
overlay object (i.e. across copy up). This is required for not breaking
hardlinks on copy up.

Hardlink copy up tests including coverage for concurrent copy up of
lower hardlinks and lower/upper hardlinks consistency are available on
my xfstests dev branch [3].

This work also fixes constant st_ino for samefs case for lower hardlinks,
so the few constant st_ino tests that fail on v4.12-rc1 due to copy up of
lower hardlinks now pass [3][4].

Thanks,
Amir.

Changes since v2:
- Introduce ovl_inode and ovl_inode mutex
- Synchronize copy up with ovl_inode mutex
- Don't store index dentry in overlay dentry
- Consistency of lower/upper hardlinks by link-up on lookup
- Preemptive copy up before lower hardlink unlink

TODO:
- Persistent overlay nlink accounting
- Whiteout index when overlay nlink drops to zero
- Cleanup stale and orphan index entries on mount
- Document the inodes index feature

[1] https://github.com/amir73il/linux/commits/ovl-hardlinks
[2] https://github.com/amir73il/linux/commits/ovl-nfs-export 
[3] https://github.com/amir73il/xfstests/commits/overlayfs-devel
[4] https://github.com/amir73il/unionmount-testsuite/commits/overlayfs-devel

Amir Goldstein (23):
  vfs: introduce inode 'inuse' lock
  ovl: get exclusive ownership on upper/work dirs
  ovl: relax same fs constrain for ovl_check_origin()
  ovl: generalize ovl_create_workdir()
  ovl: introduce the inodes index dir feature
  ovl: verify upper root dir matches lower root dir
  ovl: verify index dir matches upper dir
  ovl: store path type in dentry
  ovl: cram dentry state booleans into type flags
  ovl: lookup index entry for copy up origin
  ovl: allocate an ovl_inode struct
  ovl: store upper/lower real inode in ovl_inode_info
  ovl: use ovl_inode_init() for initializing new inode
  ovl: hash overlay non-dir inodes by copy up origin inode
  ovl: defer upper dir lock to tempfile link
  ovl: factor out ovl_copy_up_inode() helper
  ovl: generalize ovl_copy_up_locked() using actors
  ovl: generalize ovl_copy_up_one() using actors
  ovl: use ovl_inode mutex to synchronize concurrent copy up
  ovl: implement index dir copy up method
  ovl: link up indexed lower hardlink on lookup
  ovl: fix nlink leak in ovl_rename()
  ovl: adjust overlay inode nlink for indexed inodes

 fs/inode.c               |  50 ++++
 fs/overlayfs/Kconfig     |  20 ++
 fs/overlayfs/copy_up.c   | 592 +++++++++++++++++++++++++++++++++++++----------
 fs/overlayfs/dir.c       |  50 +++-
 fs/overlayfs/inode.c     |  68 +++++-
 fs/overlayfs/namei.c     | 283 ++++++++++++++++++----
 fs/overlayfs/overlayfs.h |  37 +--
 fs/overlayfs/ovl_entry.h |  37 ++-
 fs/overlayfs/super.c     | 308 +++++++++++++++++++++---
 fs/overlayfs/util.c      | 164 ++++++++++---
 include/linux/fs.h       |  14 ++
 11 files changed, 1356 insertions(+), 267 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v3 01/23] vfs: introduce inode 'inuse' lock
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 02/23] ovl: get exclusive ownership on upper/work dirs Amir Goldstein
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel

Added an i_state flag I_INUSE and helpers to set/clear/test the bit.

The 'inuse' lock is an 'advisory' inode lock, that can be used to extend
exclusive create protection beyond parent->i_mutex lock among cooperating
users.

This is going to be used by overlayfs to get exclusive ownership on upper
and work dirs among overlayfs mounts.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/inode.c         | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h | 14 ++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index db5914783a71..6a41b27467eb 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -2120,3 +2120,53 @@ struct timespec current_time(struct inode *inode)
 	return timespec_trunc(now, inode->i_sb->s_time_gran);
 }
 EXPORT_SYMBOL(current_time);
+
+/**
+ * inode_inuse_trylock - try to get an exclusive 'inuse' lock on inode
+ * @inode: inode being locked
+ *
+ * The 'inuse' lock is an 'advisory' lock that can be used to extend exclusive
+ * create protection beyond parent->i_mutex lock among cooperating users.
+ * Used by overlayfs to get exclusive ownership on upper and work dirs among
+ * overlayfs mounts.
+ *
+ * Caller must hold a reference to inode to prevent it from being freed while
+ * it is marked inuse.
+ *
+ * Return true if I_INUSE flag was set by this call.
+ */
+bool inode_inuse_trylock(struct inode *inode)
+{
+	bool locked = false;
+
+	spin_lock(&inode->i_lock);
+	if (!WARN_ON(!atomic_read(&inode->i_count)) &&
+	    !WARN_ON(inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) &&
+	    !(inode->i_state & I_INUSE)) {
+		inode->i_state |= I_INUSE;
+		locked = true;
+	}
+	spin_unlock(&inode->i_lock);
+	return locked;
+}
+EXPORT_SYMBOL(inode_inuse_trylock);
+
+/**
+ * inode_inuse_unlock - release exclusive 'inuse' lock
+ * @inode:	inode inuse to unlock
+ *
+ * Clear the I_INUSE state.
+ *
+ * Caller must hold a reference to inode and must have successfully marked
+ * the inode 'inuse' prior to this call.
+ */
+void inode_inuse_unlock(struct inode *inode)
+{
+	spin_lock(&inode->i_lock);
+	WARN_ON(!atomic_read(&inode->i_count));
+	WARN_ON(inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE));
+	WARN_ON(!(inode->i_state & I_INUSE));
+	inode->i_state &= ~I_INUSE;
+	spin_unlock(&inode->i_lock);
+}
+EXPORT_SYMBOL(inode_inuse_unlock);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index aab10f93ef23..532802120258 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1929,6 +1929,12 @@ static inline bool HAS_UNMAPPED_ID(struct inode *inode)
  *			wb stat updates to grab mapping->tree_lock.  See
  *			inode_switch_wb_work_fn() for details.
  *
+ * I_INUSE		An 'advisory' bit to get exclusive ownership on inode
+ *			using inode_inuse_trylock().  It can be used to extend
+ *			exclusive create protection beyond parent->i_mutex lock.
+ *			Used by overlayfs to get exclusive ownership on upper
+ *			and work dirs among overlayfs mounts.
+ *
  * Q: What is the difference between I_WILL_FREE and I_FREEING?
  */
 #define I_DIRTY_SYNC		(1 << 0)
@@ -1949,6 +1955,7 @@ static inline bool HAS_UNMAPPED_ID(struct inode *inode)
 #define __I_DIRTY_TIME_EXPIRED	12
 #define I_DIRTY_TIME_EXPIRED	(1 << __I_DIRTY_TIME_EXPIRED)
 #define I_WB_SWITCH		(1 << 13)
+#define I_INUSE			(1 << 14)
 
 #define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)
 #define I_DIRTY_ALL (I_DIRTY | I_DIRTY_TIME)
@@ -3258,5 +3265,12 @@ static inline bool dir_relax_shared(struct inode *inode)
 
 extern bool path_noexec(const struct path *path);
 extern void inode_nohighmem(struct inode *inode);
+extern bool inode_inuse_trylock(struct inode *inode);
+extern void inode_inuse_unlock(struct inode *inode);
+
+static inline bool inode_inuse(struct inode *inode)
+{
+	return inode->i_state & I_INUSE;
+}
 
 #endif /* _LINUX_FS_H */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 02/23] ovl: get exclusive ownership on upper/work dirs
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 01/23] vfs: introduce inode 'inuse' lock Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 03/23] ovl: relax same fs constrain for ovl_check_origin() Amir Goldstein
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Bad things can happen if several concurrent overlay mounts try to
use the same upperdir/workdir path.

Try to get the 'inuse' advisory lock on upperdir and workdir.
Fail mount if another overlay mount instance or another user
holds the 'inuse' lock on these directories.

Note that this provides no protection for concurrent overlay
mount that use overlapping (i.e. descendant) upper/work dirs.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/ovl_entry.h |  3 +++
 fs/overlayfs/super.c     | 41 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 34bc4a9f5c61..b0e7ee2ae398 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -21,6 +21,9 @@ struct ovl_fs {
 	struct vfsmount *upper_mnt;
 	unsigned numlower;
 	struct vfsmount **lower_mnt;
+	/* workbasedir is the path at workdir= mount option */
+	struct dentry *workbasedir;
+	/* workdir is the 'work' directory under workbasedir */
 	struct dentry *workdir;
 	long namelen;
 	/* pathnames of lower and upper dirs, for show_options */
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 4882ffb37bae..476f021baf2a 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -165,12 +165,28 @@ static const struct dentry_operations ovl_reval_dentry_operations = {
 	.d_weak_revalidate = ovl_dentry_weak_revalidate,
 };
 
+/* Get exclusive ownership on upper/work dir among overlay mounts */
+static bool ovl_dir_lock(struct dentry *dentry)
+{
+	return inode_inuse_trylock(d_inode(dentry));
+}
+
+static void ovl_dir_unlock(struct dentry *dentry)
+{
+	if (dentry)
+		inode_inuse_unlock(d_inode(dentry));
+}
+
 static void ovl_put_super(struct super_block *sb)
 {
 	struct ovl_fs *ufs = sb->s_fs_info;
 	unsigned i;
 
 	dput(ufs->workdir);
+	ovl_dir_unlock(ufs->workbasedir);
+	dput(ufs->workbasedir);
+	if (ufs->upper_mnt)
+		ovl_dir_unlock(ufs->upper_mnt->mnt_root);
 	mntput(ufs->upper_mnt);
 	for (i = 0; i < ufs->numlower; i++)
 		mntput(ufs->lower_mnt[i]);
@@ -788,9 +804,15 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		if (err)
 			goto out_put_upperpath;
 
+		err = -EBUSY;
+		if (!ovl_dir_lock(upperpath.dentry)) {
+			pr_err("overlayfs: upperdir is in-use by another mount\n");
+			goto out_put_upperpath;
+		}
+
 		err = ovl_mount_dir(ufs->config.workdir, &workpath);
 		if (err)
-			goto out_put_upperpath;
+			goto out_unlock_upperdentry;
 
 		err = -EINVAL;
 		if (upperpath.mnt != workpath.mnt) {
@@ -801,12 +823,20 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			pr_err("overlayfs: workdir and upperdir must be separate subtrees\n");
 			goto out_put_workpath;
 		}
+
+		err = -EBUSY;
+		if (!ovl_dir_lock(workpath.dentry)) {
+			pr_err("overlayfs: workdir is in-use by another mount\n");
+			goto out_put_workpath;
+		}
+
+		ufs->workbasedir = workpath.dentry;
 		sb->s_stack_depth = upperpath.mnt->mnt_sb->s_stack_depth;
 	}
 	err = -ENOMEM;
 	lowertmp = kstrdup(ufs->config.lowerdir, GFP_KERNEL);
 	if (!lowertmp)
-		goto out_put_workpath;
+		goto out_unlock_workdentry;
 
 	err = -EINVAL;
 	stacklen = ovl_split_lowerdirs(lowertmp);
@@ -849,6 +879,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			pr_err("overlayfs: failed to clone upperpath\n");
 			goto out_put_lowerpath;
 		}
+
 		/* Don't inherit atime flags */
 		ufs->upper_mnt->mnt_flags &= ~(MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME);
 
@@ -971,7 +1002,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	mntput(upperpath.mnt);
 	for (i = 0; i < numlower; i++)
 		mntput(stack[i].mnt);
-	path_put(&workpath);
+	mntput(workpath.mnt);
 	kfree(lowertmp);
 
 	if (upperpath.dentry) {
@@ -1011,8 +1042,12 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	kfree(stack);
 out_free_lowertmp:
 	kfree(lowertmp);
+out_unlock_workdentry:
+	ovl_dir_unlock(workpath.dentry);
 out_put_workpath:
 	path_put(&workpath);
+out_unlock_upperdentry:
+	ovl_dir_unlock(upperpath.dentry);
 out_put_upperpath:
 	path_put(&upperpath);
 out_free_config:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 03/23] ovl: relax same fs constrain for ovl_check_origin()
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 01/23] vfs: introduce inode 'inuse' lock Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 02/23] ovl: get exclusive ownership on upper/work dirs Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 04/23] ovl: generalize ovl_create_workdir() Amir Goldstein
                   ` (20 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

For the case of all layers not on the same fs, try to decode the copy up
origin file handle on any of the lower layers.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c | 43 +++++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 1ec1499fc672..4ca6061f7bfa 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -272,28 +272,24 @@ static int ovl_lookup_layer(struct dentry *base, struct ovl_lookup_data *d,
 static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
 			    struct path **stackp, unsigned int *ctrp)
 {
-	struct super_block *same_sb = ovl_same_sb(dentry->d_sb);
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
 	struct vfsmount *mnt;
-	struct dentry *origin;
+	struct dentry *origin = NULL;
+	int i;
 
-	if (!same_sb || !roe->numlower)
-		return 0;
 
-       /*
-	* Since all layers are on the same fs, we use the first layer for
-	* decoding the file handle.  We may get a disconnected dentry,
-	* which is fine, because we only need to hold the origin inode in
-	* cache and use its inode number.  We may even get a connected dentry,
-	* that is not under the first layer's root.  That is also fine for
-	* using it's inode number - it's the same as if we held a reference
-	* to a dentry in first layer that was moved under us.
-	*/
-	mnt = roe->lowerstack[0].mnt;
-
-	origin = ovl_get_origin(upperdentry, mnt);
-	if (IS_ERR_OR_NULL(origin))
-		return PTR_ERR(origin);
+	for (i = 0; i < roe->numlower; i++) {
+		mnt = roe->lowerstack[i].mnt;
+		origin = ovl_get_origin(upperdentry, mnt);
+		if (IS_ERR(origin))
+			return PTR_ERR(origin);
+
+		if (origin)
+			break;
+	}
+
+	if (!origin)
+		return 0;
 
 	BUG_ON(*stackp || *ctrp);
 	*stackp = kmalloc(sizeof(struct path), GFP_TEMPORARY);
@@ -301,6 +297,7 @@ static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
 		dput(origin);
 		return -ENOMEM;
 	}
+
 	**stackp = (struct path) { .dentry = origin, .mnt = mnt };
 	*ctrp = 1;
 
@@ -375,6 +372,16 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 		if (upperdentry && !d.is_dir) {
 			BUG_ON(!d.stop || d.redirect);
+			/*
+			 * Lookup copy up origin by decoding origin file handle.
+			 * We may get a disconnected dentry, which is fine,
+			 * because we only need to hold the origin inode in
+			 * cache and use its inode number.  We may even get a
+			 * connected dentry, that is not under any of the lower
+			 * layers root.  That is also fine for using it's inode
+			 * number - it's the same as if we held a reference
+			 * to a dentry in lower layer that was moved under us.
+			 */
 			err = ovl_check_origin(dentry, upperdentry,
 					       &stack, &ctr);
 			if (err)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 04/23] ovl: generalize ovl_create_workdir()
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (2 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 03/23] ovl: relax same fs constrain for ovl_check_origin() Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 05/23] ovl: introduce the inodes index dir feature Amir Goldstein
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Pass in the subdir name to create and specify if subdir is persistent
or if it should be cleaned up on every mount.

Move fallback to readonly mount on failure to create dir and print of error
message into the helper.

This function is going to be used for creating the persistent 'index' dir
under workbasedir.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/super.c | 45 +++++++++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 476f021baf2a..5114627799e9 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -395,22 +395,25 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 
 #define OVL_WORKDIR_NAME "work"
 
-static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
-					 struct dentry *dentry)
+static struct dentry *ovl_workdir_create(struct super_block *sb,
+					 struct ovl_fs *ufs,
+					 struct dentry *dentry,
+					 const char *name, bool persist)
 {
-	struct inode *dir = dentry->d_inode;
-	struct dentry *work;
+	struct vfsmount *mnt = ufs->upper_mnt;
+	struct inode *dir = NULL;
+	struct dentry *work = NULL;
 	int err;
 	bool retried = false;
 
 	err = mnt_want_write(mnt);
 	if (err)
-		return ERR_PTR(err);
+		goto out_err;
 
+	dir = d_inode(dentry);
 	inode_lock_nested(dir, I_MUTEX_PARENT);
 retry:
-	work = lookup_one_len(OVL_WORKDIR_NAME, dentry,
-			      strlen(OVL_WORKDIR_NAME));
+	work = lookup_one_len(name, dentry, strlen(name));
 
 	if (!IS_ERR(work)) {
 		struct iattr attr = {
@@ -423,6 +426,9 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
 			if (retried)
 				goto out_dput;
 
+			if (persist)
+				goto out_unlock;
+
 			retried = true;
 			ovl_workdir_cleanup(dir, mnt, work, 0);
 			dput(work);
@@ -462,16 +468,25 @@ static struct dentry *ovl_workdir_create(struct vfsmount *mnt,
 		inode_unlock(work->d_inode);
 		if (err)
 			goto out_dput;
+	} else {
+		err = PTR_ERR(work);
+		goto out_err;
 	}
+
 out_unlock:
-	inode_unlock(dir);
+	if (dir)
+		inode_unlock(dir);
 	mnt_drop_write(mnt);
 
 	return work;
 
 out_dput:
 	dput(work);
-	work = ERR_PTR(err);
+out_err:
+	pr_warn("overlayfs: failed to create directory %s/%s (errno: %i); mounting read-only\n",
+		ufs->config.workdir, name, -err);
+	sb->s_flags |= MS_RDONLY;
+	work = NULL;
 	goto out_unlock;
 }
 
@@ -885,14 +900,11 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 		sb->s_time_gran = ufs->upper_mnt->mnt_sb->s_time_gran;
 
-		ufs->workdir = ovl_workdir_create(ufs->upper_mnt, workpath.dentry);
+		ufs->workdir = ovl_workdir_create(sb, ufs, workpath.dentry,
+						  OVL_WORKDIR_NAME, false);
 		err = PTR_ERR(ufs->workdir);
-		if (IS_ERR(ufs->workdir)) {
-			pr_warn("overlayfs: failed to create directory %s/%s (errno: %i); mounting read-only\n",
-				ufs->config.workdir, OVL_WORKDIR_NAME, -err);
-			sb->s_flags |= MS_RDONLY;
-			ufs->workdir = NULL;
-		}
+		if (IS_ERR(ufs->workdir))
+			goto out_put_upper_mnt;
 
 		/*
 		 * Upper should support d_type, else whiteouts are visible.
@@ -1035,6 +1047,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	kfree(ufs->lower_mnt);
 out_put_workdir:
 	dput(ufs->workdir);
+out_put_upper_mnt:
 	mntput(ufs->upper_mnt);
 out_put_lowerpath:
 	for (i = 0; i < numlower; i++)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 05/23] ovl: introduce the inodes index dir feature
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (3 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 04/23] ovl: generalize ovl_create_workdir() Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 06/23] ovl: verify upper root dir matches lower root dir Amir Goldstein
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Create the index dir on mount. The index dir will contain hardlinks to
upper inodes, named after the hex representation of their origin lower
inodes.

The index dir is going to be used to prevent breaking lower hardlinks
on copy up and to implement overlayfs NFS export.

Because the feature is not fully backward compat, enabling the feature
is opt-in by config/module/mount option.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/Kconfig     | 20 +++++++++++++++
 fs/overlayfs/copy_up.c   | 10 +++-----
 fs/overlayfs/overlayfs.h |  2 ++
 fs/overlayfs/ovl_entry.h |  3 +++
 fs/overlayfs/super.c     | 66 ++++++++++++++++++++++++++++++++++++++++++++++--
 fs/overlayfs/util.c      | 15 +++++++++++
 6 files changed, 108 insertions(+), 8 deletions(-)

diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index c0c9683934b7..b7241f273c36 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -23,3 +23,23 @@ config OVERLAY_FS_REDIRECT_DIR
 	  Note, that redirects are not backward compatible.  That is, mounting
 	  an overlay which has redirects on a kernel that doesn't support this
 	  feature will have unexpected results.
+
+config OVERLAY_FS_INDEX
+	bool "Overlayfs: turn on inodes index feature by default"
+	depends on OVERLAY_FS
+	help
+	  If this config option is enabled then overlay filesystems will use
+	  the inodes index dir to map lower inodes to upper inodes by default.
+	  In this case it is still possible to turn off index globally with the
+	  "index=off" module option or on a filesystem instance basis with the
+	  "index=off" mount option.
+
+	  The inodes index feature prevents breaking of lower hardlinks on copy
+	  up and enables NFS export.
+
+	  Note, that the inodes index feature is read-only backward compatible.
+	  That is, mounting an overlay which has an index dir on a kernel that
+	  doesn't support this feature read-only, will not have any negative
+	  outcomes.  However, mounting the same overlay with an old kernel
+	  read-write and then mounting it again with a new kernel, will have
+	  unexpected results.
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7a44533f4bbf..d3047fdc0377 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -233,12 +233,13 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
 	return err;
 }
 
-static struct ovl_fh *ovl_encode_fh(struct dentry *lower, uuid_be *uuid)
+static struct ovl_fh *ovl_encode_fh(struct dentry *lower)
 {
 	struct ovl_fh *fh;
 	int fh_type, fh_len, dwords;
 	void *buf;
 	int buflen = MAX_HANDLE_SZ;
+	uuid_be *uuid = (uuid_be *) &lower->d_sb->s_uuid;
 
 	buf = kmalloc(buflen, GFP_TEMPORARY);
 	if (!buf)
@@ -283,8 +284,6 @@ static struct ovl_fh *ovl_encode_fh(struct dentry *lower, uuid_be *uuid)
 static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 			  struct dentry *upper)
 {
-	struct super_block *sb = lower->d_sb;
-	uuid_be *uuid = (uuid_be *) &sb->s_uuid;
 	const struct ovl_fh *fh = NULL;
 	int err;
 
@@ -293,9 +292,8 @@ static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 	 * so we can use the overlay.origin xattr to distignuish between a copy
 	 * up and a pure upper inode.
 	 */
-	if (sb->s_export_op && sb->s_export_op->fh_to_dentry &&
-	    uuid_be_cmp(*uuid, NULL_UUID_BE)) {
-		fh = ovl_encode_fh(lower, uuid);
+	if (ovl_can_decode_fh(lower->d_sb)) {
+		fh = ovl_encode_fh(lower);
 		if (IS_ERR(fh))
 			return PTR_ERR(fh);
 	}
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 513e25e56eed..9ec93063c3a8 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -191,6 +191,8 @@ void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
 const struct cred *ovl_override_creds(struct super_block *sb);
 struct super_block *ovl_same_sb(struct super_block *sb);
+bool ovl_can_decode_fh(struct super_block *sb);
+struct dentry *ovl_indexdir(struct super_block *sb);
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
 bool ovl_dentry_remote(struct dentry *dentry);
 bool ovl_dentry_weird(struct dentry *dentry);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index b0e7ee2ae398..ddd4b0a874a9 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -14,6 +14,7 @@ struct ovl_config {
 	char *workdir;
 	bool default_permissions;
 	bool redirect_dir;
+	bool index;
 };
 
 /* private information held for overlayfs's superblock */
@@ -25,6 +26,8 @@ struct ovl_fs {
 	struct dentry *workbasedir;
 	/* workdir is the 'work' directory under workbasedir */
 	struct dentry *workdir;
+	/* index directory listing overlay inodes by origin file handle */
+	struct dentry *indexdir;
 	long namelen;
 	/* pathnames of lower and upper dirs, for show_options */
 	struct ovl_config config;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 5114627799e9..d92c7486bdf2 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -34,6 +34,11 @@ module_param_named(redirect_dir, ovl_redirect_dir_def, bool, 0644);
 MODULE_PARM_DESC(ovl_redirect_dir_def,
 		 "Default to on or off for the redirect_dir feature");
 
+static bool ovl_index_def = IS_ENABLED(CONFIG_OVERLAY_FS_INDEX);
+module_param_named(index, ovl_index_def, bool, 0644);
+MODULE_PARM_DESC(ovl_index_def,
+		 "Default to on or off for the inodes index feature");
+
 static void ovl_dentry_release(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -182,6 +187,7 @@ static void ovl_put_super(struct super_block *sb)
 	struct ovl_fs *ufs = sb->s_fs_info;
 	unsigned i;
 
+	dput(ufs->indexdir);
 	dput(ufs->workdir);
 	ovl_dir_unlock(ufs->workbasedir);
 	dput(ufs->workbasedir);
@@ -244,6 +250,12 @@ static int ovl_statfs(struct dentry *dentry, struct kstatfs *buf)
 	return err;
 }
 
+/* Will this overlay be forced to mount/remount ro? */
+static bool ovl_force_readonly(struct ovl_fs *ufs)
+{
+	return (!ufs->upper_mnt || !ufs->workdir);
+}
+
 /**
  * ovl_show_options
  *
@@ -265,6 +277,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 	if (ufs->config.redirect_dir != ovl_redirect_dir_def)
 		seq_printf(m, ",redirect_dir=%s",
 			   ufs->config.redirect_dir ? "on" : "off");
+	if (ufs->config.index != ovl_index_def)
+		seq_printf(m, ",index=%s",
+			   ufs->config.index ? "on" : "off");
 	return 0;
 }
 
@@ -272,7 +287,7 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 {
 	struct ovl_fs *ufs = sb->s_fs_info;
 
-	if (!(*flags & MS_RDONLY) && (!ufs->upper_mnt || !ufs->workdir))
+	if (!(*flags & MS_RDONLY) && ovl_force_readonly(ufs))
 		return -EROFS;
 
 	return 0;
@@ -294,6 +309,8 @@ enum {
 	OPT_DEFAULT_PERMISSIONS,
 	OPT_REDIRECT_DIR_ON,
 	OPT_REDIRECT_DIR_OFF,
+	OPT_INDEX_ON,
+	OPT_INDEX_OFF,
 	OPT_ERR,
 };
 
@@ -304,6 +321,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_DEFAULT_PERMISSIONS,	"default_permissions"},
 	{OPT_REDIRECT_DIR_ON,		"redirect_dir=on"},
 	{OPT_REDIRECT_DIR_OFF,		"redirect_dir=off"},
+	{OPT_INDEX_ON,			"index=on"},
+	{OPT_INDEX_OFF,			"index=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -376,6 +395,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->redirect_dir = false;
 			break;
 
+		case OPT_INDEX_ON:
+			config->index = true;
+			break;
+
+		case OPT_INDEX_OFF:
+			config->index = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -394,6 +421,7 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 }
 
 #define OVL_WORKDIR_NAME "work"
+#define OVL_INDEXDIR_NAME "index"
 
 static struct dentry *ovl_workdir_create(struct super_block *sb,
 					 struct ovl_fs *ufs,
@@ -785,6 +813,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	init_waitqueue_head(&ufs->copyup_wq);
 	ufs->config.redirect_dir = ovl_redirect_dir_def;
+	ufs->config.index = ovl_index_def;
 	err = ovl_parse_opt((char *) data, &ufs->config);
 	if (err)
 		goto out_free_config;
@@ -947,6 +976,13 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			} else {
 				vfs_removexattr(ufs->workdir, OVL_XATTR_OPAQUE);
 			}
+
+			/* Check if upper/work fs supports file handles */
+			if (ufs->config.index &&
+			    !ovl_can_decode_fh(ufs->workdir->d_sb)) {
+				ufs->config.index = false;
+				pr_warn("overlayfs: upper fs does not support NFS export.\n");
+			}
 		}
 	}
 
@@ -976,6 +1012,15 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 			ufs->same_sb = mnt->mnt_sb;
 		else if (ufs->same_sb != mnt->mnt_sb)
 			ufs->same_sb = NULL;
+
+		/*
+		 * The inodes index feature needs to encode and decode file
+		 * handles, so it requires that all layers support NFS export.
+		 */
+		if (ufs->config.index && !ovl_can_decode_fh(mnt->mnt_sb)) {
+			ufs->config.index = false;
+			pr_warn("overlayfs: lower fs does not support NFS export.\n");
+		}
 	}
 
 	/* If the upper fs is nonexistent, we mark overlayfs r/o too */
@@ -984,6 +1029,21 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	else if (ufs->upper_mnt->mnt_sb != ufs->same_sb)
 		ufs->same_sb = NULL;
 
+	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
+		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
+						   OVL_INDEXDIR_NAME, true);
+		err = PTR_ERR(ufs->indexdir);
+		if (IS_ERR(ufs->indexdir))
+			goto out_put_lower_mnt;
+
+		if (!ufs->indexdir)
+			pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
+	}
+
+	/* Show index=off/on in /proc/mounts for any of the reasons above */
+	if (!ufs->indexdir)
+		ufs->config.index = false;
+
 	if (remote)
 		sb->s_d_op = &ovl_reval_dentry_operations;
 	else
@@ -991,7 +1051,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 
 	ufs->creator_cred = cred = prepare_creds();
 	if (!cred)
-		goto out_put_lower_mnt;
+		goto out_put_indexdir;
 
 	/* Never override disk quota limits or use reserved space */
 	cap_lower(cred->cap_effective, CAP_SYS_RESOURCE);
@@ -1041,6 +1101,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	kfree(oe);
 out_put_cred:
 	put_cred(ufs->creator_cred);
+out_put_indexdir:
+	dput(ufs->indexdir);
 out_put_lower_mnt:
 	for (i = 0; i < ufs->numlower; i++)
 		mntput(ufs->lower_mnt[i]);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 809048913889..8e2f308056f1 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -12,6 +12,8 @@
 #include <linux/slab.h>
 #include <linux/cred.h>
 #include <linux/xattr.h>
+#include <linux/exportfs.h>
+#include <linux/uuid.h>
 #include "overlayfs.h"
 #include "ovl_entry.h"
 
@@ -47,6 +49,19 @@ struct super_block *ovl_same_sb(struct super_block *sb)
 	return ofs->same_sb;
 }
 
+bool ovl_can_decode_fh(struct super_block *sb)
+{
+	return (sb->s_export_op && sb->s_export_op->fh_to_dentry &&
+		uuid_be_cmp(*(uuid_be *) &sb->s_uuid, NULL_UUID_BE));
+}
+
+struct dentry *ovl_indexdir(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return ofs->indexdir;
+}
+
 struct ovl_entry *ovl_alloc_entry(unsigned int numlower)
 {
 	size_t size = offsetof(struct ovl_entry, lowerstack[numlower]);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 06/23] ovl: verify upper root dir matches lower root dir
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (4 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 05/23] ovl: introduce the inodes index dir feature Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 07/23] ovl: verify index dir matches upper dir Amir Goldstein
                   ` (17 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

When inodes index feature is enabled, verify that the file handle stored
in upper root dir matches the lower root dir or fail to mount.

If upper root dir has no stored file handle, encode and store the lower
root dir file handle in overlay.origin xattr.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   |  2 +-
 fs/overlayfs/namei.c     | 84 +++++++++++++++++++++++++++++++++++++++++-------
 fs/overlayfs/overlayfs.h |  3 ++
 fs/overlayfs/super.c     | 42 ++++++++++++++++++++++++
 4 files changed, 119 insertions(+), 12 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index d3047fdc0377..ee0b894e1d99 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -233,7 +233,7 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
 	return err;
 }
 
-static struct ovl_fh *ovl_encode_fh(struct dentry *lower)
+struct ovl_fh *ovl_encode_fh(struct dentry *lower)
 {
 	struct ovl_fh *fh;
 	int fh_type, fh_len, dwords;
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 4ca6061f7bfa..1b8017a57a02 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -88,13 +88,11 @@ static int ovl_acceptable(void *ctx, struct dentry *dentry)
 	return 1;
 }
 
-static struct dentry *ovl_get_origin(struct dentry *dentry,
-				     struct vfsmount *mnt)
+static struct ovl_fh *ovl_get_origin_fh(struct dentry *dentry,
+					struct vfsmount *mnt)
 {
 	int res;
 	struct ovl_fh *fh = NULL;
-	struct dentry *origin = NULL;
-	int bytes;
 
 	res = vfs_getxattr(dentry, OVL_XATTR_ORIGIN, NULL, 0);
 	if (res < 0) {
@@ -106,7 +104,7 @@ static struct dentry *ovl_get_origin(struct dentry *dentry,
 	if (res == 0)
 		return NULL;
 
-	fh  = kzalloc(res, GFP_TEMPORARY);
+	fh = kzalloc(res, GFP_TEMPORARY);
 	if (!fh)
 		return ERR_PTR(-ENOMEM);
 
@@ -129,8 +127,6 @@ static struct dentry *ovl_get_origin(struct dentry *dentry,
 	    (fh->flags & OVL_FH_FLAG_BIG_ENDIAN) != OVL_FH_FLAG_CPU_ENDIAN)
 		goto out;
 
-	bytes = (fh->len - offsetof(struct ovl_fh, fid));
-
 	/*
 	 * Make sure that the stored uuid matches the uuid of the lower
 	 * layer where file handle will be decoded.
@@ -138,6 +134,31 @@ static struct dentry *ovl_get_origin(struct dentry *dentry,
 	if (uuid_be_cmp(fh->uuid, *(uuid_be *) &mnt->mnt_sb->s_uuid))
 		goto out;
 
+	return fh;
+
+out:
+	kfree(fh);
+	return NULL;
+
+fail:
+	pr_warn_ratelimited("overlayfs: failed to get origin (%i)\n", res);
+	goto out;
+invalid:
+	pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n", res, fh);
+	goto out;
+}
+
+static struct dentry *ovl_get_origin(struct dentry *dentry,
+				     struct vfsmount *mnt)
+{
+	struct dentry *origin = NULL;
+	struct ovl_fh *fh = ovl_get_origin_fh(dentry, mnt);
+	int bytes;
+
+	if (!fh)
+		return NULL;
+
+	bytes = (fh->len - offsetof(struct ovl_fh, fid));
 	origin = exportfs_decode_fh(mnt, (struct fid *)fh->fid,
 				    bytes >> 2, (int)fh->type,
 				    ovl_acceptable, NULL);
@@ -159,11 +180,8 @@ static struct dentry *ovl_get_origin(struct dentry *dentry,
 	kfree(fh);
 	return origin;
 
-fail:
-	pr_warn_ratelimited("overlayfs: failed to get origin (%i)\n", res);
-	goto out;
 invalid:
-	pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n", res, fh);
+	pr_warn_ratelimited("overlayfs: invalid origin (%*phN)\n", fh->len, fh);
 	goto out;
 }
 
@@ -305,6 +323,50 @@ static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
 }
 
 /*
+ * Verify that an inode matches the origin file handle stored in upper inode.
+ * Return 0 on match, -ESTALE on mismatch, < 0 on error.
+ */
+int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
+		      struct dentry *origin)
+{
+	struct inode *inode = NULL;
+	struct ovl_fh *fh = NULL;
+	struct ovl_fh *ofh = ovl_get_origin_fh(dentry, mnt);
+	int err;
+
+	/* Fail verification with no warning if no valid origin fh */
+	if (!ofh)
+		return -ENODATA;
+
+	if (IS_ERR(ofh)) {
+		err = PTR_ERR(ofh);
+		goto fail;
+	}
+
+	fh = ovl_encode_fh(origin);
+	if (IS_ERR(fh)) {
+		err = PTR_ERR(fh);
+		fh = NULL;
+		goto fail;
+	} else if (fh->len != ofh->len || memcmp(fh, ofh, fh->len)) {
+		err = -ESTALE;
+		goto fail;
+	}
+
+	err = 0;
+out:
+	kfree(ofh);
+	kfree(fh);
+	return err;
+
+fail:
+	inode = d_inode(origin);
+	pr_warn_ratelimited("overlayfs: failed to verify origin (ino=%lu, err=%i) - were layers copied?\n",
+			    inode ? inode->i_ino : 0, err);
+	goto out;
+}
+
+/*
  * Returns next layer in stack starting from top.
  * Returns -1 if this is the last layer.
  */
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 9ec93063c3a8..1eb8250464be 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -235,6 +235,8 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 
 
 /* namei.c */
+int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
+		      struct dentry *origin);
 int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags);
 bool ovl_lower_positive(struct dentry *dentry);
@@ -293,3 +295,4 @@ int ovl_copy_up(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
 int ovl_set_attr(struct dentry *upper, struct kstat *stat);
+struct ovl_fh *ovl_encode_fh(struct dentry *lower);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d92c7486bdf2..e6e0515525d3 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -420,6 +420,41 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 	return 0;
 }
 
+/*
+ * Verify that stored file handle in dir matches origin.
+ * If dir has no stored file handle, encode and store origin file handle.
+ */
+static int ovl_verify_set_origin(struct dentry *dir, struct vfsmount *mnt,
+				 struct dentry *origin, const char *name)
+{
+	const struct ovl_fh *fh = NULL;
+	int err;
+
+	err = ovl_verify_origin(dir, mnt, origin);
+	if (!err)
+		return 0;
+
+	if (err != -ENODATA)
+		goto fail;
+
+	fh = ovl_encode_fh(origin);
+	err = PTR_ERR(fh);
+	if (IS_ERR(fh))
+		goto fail;
+	err = ovl_do_setxattr(dir, OVL_XATTR_ORIGIN, fh, fh->len, 0);
+	if (err)
+		goto fail;
+
+out:
+	kfree(fh);
+	return err;
+
+fail:
+	pr_err("overlayfs: failed to verify %s dir. (err=%i)\n",
+	       name, err);
+	goto out;
+}
+
 #define OVL_WORKDIR_NAME "work"
 #define OVL_INDEXDIR_NAME "index"
 
@@ -1030,6 +1065,13 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		ufs->same_sb = NULL;
 
 	if (!(ovl_force_readonly(ufs)) && ufs->config.index) {
+		/* Verify lower root is upper root origin */
+		err = ovl_verify_set_origin(upperpath.dentry,
+					    ufs->lower_mnt[0], stack[0].dentry,
+					    "lower root");
+		if (err)
+			goto out_put_lower_mnt;
+
 		ufs->indexdir = ovl_workdir_create(sb, ufs, workpath.dentry,
 						   OVL_INDEXDIR_NAME, true);
 		err = PTR_ERR(ufs->indexdir);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 07/23] ovl: verify index dir matches upper dir
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (5 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 06/23] ovl: verify upper root dir matches lower root dir Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 08/23] ovl: store path type in dentry Amir Goldstein
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

An index dir contains persistent hardlinks to files in upper dir.
Therefore, we must never mount an existing index dir with a differnt
upper dir.

Store the upper root dir file handle in index dir inode when index
dir is created and verify the file handle before using an existing
index dir on mount.

Add an 'is_upper' flag to the overlay file handle encoding and set it
when encoding the upper root file handle. This is not critical for index
dir verification, but it is good practice towards a standard overlayfs
file handle format for NFS export.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 12 ++++++++++--
 fs/overlayfs/namei.c     |  4 ++--
 fs/overlayfs/overlayfs.h |  6 ++++--
 fs/overlayfs/super.c     | 20 +++++++++++++++-----
 4 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index ee0b894e1d99..eedc26e15dad 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -233,7 +233,7 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
 	return err;
 }
 
-struct ovl_fh *ovl_encode_fh(struct dentry *lower)
+struct ovl_fh *ovl_encode_fh(struct dentry *lower, bool is_upper)
 {
 	struct ovl_fh *fh;
 	int fh_type, fh_len, dwords;
@@ -272,6 +272,14 @@ struct ovl_fh *ovl_encode_fh(struct dentry *lower)
 	fh->magic = OVL_FH_MAGIC;
 	fh->type = fh_type;
 	fh->flags = OVL_FH_FLAG_CPU_ENDIAN;
+	/*
+	 * When we will want to decode an overlay dentry from this handle
+	 * and all layers are on the same fs, if we get a disconncted real
+	 * dentry when we decode fid, the only way to tell if we should assign
+	 * it to upperdentry or to lowerstack is by checking this flag.
+	 */
+	if (is_upper)
+		fh->flags |= OVL_FH_FLAG_PATH_UPPER;
 	fh->len = fh_len;
 	fh->uuid = *uuid;
 	memcpy(fh->fid, buf, buflen);
@@ -293,7 +301,7 @@ static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 	 * up and a pure upper inode.
 	 */
 	if (ovl_can_decode_fh(lower->d_sb)) {
-		fh = ovl_encode_fh(lower);
+		fh = ovl_encode_fh(lower, false);
 		if (IS_ERR(fh))
 			return PTR_ERR(fh);
 	}
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 1b8017a57a02..61f4988527f4 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -327,7 +327,7 @@ static int ovl_check_origin(struct dentry *dentry, struct dentry *upperdentry,
  * Return 0 on match, -ESTALE on mismatch, < 0 on error.
  */
 int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
-		      struct dentry *origin)
+		      struct dentry *origin, bool is_upper)
 {
 	struct inode *inode = NULL;
 	struct ovl_fh *fh = NULL;
@@ -343,7 +343,7 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
 		goto fail;
 	}
 
-	fh = ovl_encode_fh(origin);
+	fh = ovl_encode_fh(origin, is_upper);
 	if (IS_ERR(fh)) {
 		err = PTR_ERR(fh);
 		fh = NULL;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 1eb8250464be..6fd7c9e748c1 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -38,6 +38,8 @@ enum ovl_path_type {
 /* CPU byte order required for fid decoding:  */
 #define OVL_FH_FLAG_BIG_ENDIAN	(1 << 0)
 #define OVL_FH_FLAG_ANY_ENDIAN	(1 << 1)
+/* Is the real inode encoded in fid an upper inode? */
+#define OVL_FH_FLAG_PATH_UPPER	(1 << 2)
 
 #define OVL_FH_FLAG_ALL (OVL_FH_FLAG_BIG_ENDIAN | OVL_FH_FLAG_ANY_ENDIAN)
 
@@ -236,7 +238,7 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 
 /* namei.c */
 int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
-		      struct dentry *origin);
+		      struct dentry *origin, bool is_upper);
 int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags);
 bool ovl_lower_positive(struct dentry *dentry);
@@ -295,4 +297,4 @@ int ovl_copy_up(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
 int ovl_set_attr(struct dentry *upper, struct kstat *stat);
-struct ovl_fh *ovl_encode_fh(struct dentry *lower);
+struct ovl_fh *ovl_encode_fh(struct dentry *lower, bool is_upper);
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e6e0515525d3..5bbcef1ccf82 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -425,19 +425,20 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
  * If dir has no stored file handle, encode and store origin file handle.
  */
 static int ovl_verify_set_origin(struct dentry *dir, struct vfsmount *mnt,
-				 struct dentry *origin, const char *name)
+				 struct dentry *origin, const char *name,
+				 bool is_upper)
 {
 	const struct ovl_fh *fh = NULL;
 	int err;
 
-	err = ovl_verify_origin(dir, mnt, origin);
+	err = ovl_verify_origin(dir, mnt, origin, is_upper);
 	if (!err)
 		return 0;
 
 	if (err != -ENODATA)
 		goto fail;
 
-	fh = ovl_encode_fh(origin);
+	fh = ovl_encode_fh(origin, is_upper);
 	err = PTR_ERR(fh);
 	if (IS_ERR(fh))
 		goto fail;
@@ -1068,7 +1069,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		/* Verify lower root is upper root origin */
 		err = ovl_verify_set_origin(upperpath.dentry,
 					    ufs->lower_mnt[0], stack[0].dentry,
-					    "lower root");
+					    "lower root", false);
 		if (err)
 			goto out_put_lower_mnt;
 
@@ -1078,8 +1079,17 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 		if (IS_ERR(ufs->indexdir))
 			goto out_put_lower_mnt;
 
-		if (!ufs->indexdir)
+		if (ufs->indexdir) {
+			/* Verify upper root is index dir origin */
+			err = ovl_verify_set_origin(ufs->indexdir,
+						    ufs->upper_mnt,
+						    upperpath.dentry,
+						    "upper root", true);
+			if (err)
+				goto out_put_indexdir;
+		} else {
 			pr_warn("overlayfs: try deleting index dir or mounting with '-o index=off' to disable inodes index.\n");
+		}
 	}
 
 	/* Show index=off/on in /proc/mounts for any of the reasons above */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 08/23] ovl: store path type in dentry
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (6 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 07/23] ovl: verify index dir matches upper dir Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 09/23] ovl: cram dentry state booleans into type flags Amir Goldstein
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

We would like to be able to set type flags during lookup (i.e. INDEX)
instead of deducing them from other state information.

Store the type value in ovl_entry and update the UPPER/MERGE/ORIGIN
flags when needed, so ovl_path_type() just returns the stored value.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     |  1 +
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/ovl_entry.h |  3 +++
 fs/overlayfs/super.c     |  1 +
 fs/overlayfs/util.c      | 20 +++++++++++++++++---
 5 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 61f4988527f4..198dde797a53 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -535,6 +535,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	kfree(stack);
 	kfree(d.redirect);
 	dentry->d_fsdata = oe;
+	ovl_update_type(dentry, d.is_dir);
 	d_add(dentry, inode);
 
 	return NULL;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 6fd7c9e748c1..66ee4116f93b 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -199,6 +199,7 @@ struct ovl_entry *ovl_alloc_entry(unsigned int numlower);
 bool ovl_dentry_remote(struct dentry *dentry);
 bool ovl_dentry_weird(struct dentry *dentry);
 enum ovl_path_type ovl_path_type(struct dentry *dentry);
+void ovl_update_type(struct dentry *dentry, bool is_dir);
 void ovl_path_upper(struct dentry *dentry, struct path *path);
 void ovl_path_lower(struct dentry *dentry, struct path *path);
 enum ovl_path_type ovl_path_real(struct dentry *dentry, struct path *path);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index ddd4b0a874a9..e94fa2638faf 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -40,6 +40,8 @@ struct ovl_fs {
 	struct super_block *same_sb;
 };
 
+enum ovl_path_type;
+
 /* private information held for every overlayfs dentry */
 struct ovl_entry {
 	struct dentry *__upperdentry;
@@ -54,6 +56,7 @@ struct ovl_entry {
 		};
 		struct rcu_head rcu;
 	};
+	enum ovl_path_type __type;
 	unsigned numlower;
 	struct path lowerstack[];
 };
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 5bbcef1ccf82..7a41648ee285 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1140,6 +1140,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	kfree(stack);
 
 	root_dentry->d_fsdata = oe;
+	ovl_update_type(root_dentry, true);
 
 	realinode = d_inode(ovl_dentry_real(root_dentry));
 	ovl_inode_init(d_inode(root_dentry), realinode, !!upperpath.dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 8e2f308056f1..8e743c76aa03 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -91,24 +91,37 @@ bool ovl_dentry_weird(struct dentry *dentry)
 enum ovl_path_type ovl_path_type(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
+
+	return READ_ONCE(oe->__type);
+}
+
+void ovl_update_type(struct dentry *dentry, bool is_dir)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
 	enum ovl_path_type type = 0;
 
+	spin_lock(&dentry->d_lock);
 	if (oe->__upperdentry) {
 		type = __OVL_PATH_UPPER;
-
 		/*
 		 * Non-dir dentry can hold lower dentry of its copy up origin.
 		 */
 		if (oe->numlower) {
 			type |= __OVL_PATH_ORIGIN;
-			if (d_is_dir(dentry))
+			if (is_dir)
 				type |= __OVL_PATH_MERGE;
 		}
 	} else {
 		if (oe->numlower > 1)
 			type |= __OVL_PATH_MERGE;
 	}
-	return type;
+
+	/*
+	 * The UPPER/MERGE/ORIGIN flags can never be cleared during the
+	 * lifetime of a dentry, so don't bother masking them out first.
+	 */
+	oe->__type |= type;
+	spin_unlock(&dentry->d_lock);
 }
 
 void ovl_path_upper(struct dentry *dentry, struct path *path)
@@ -243,6 +256,7 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 	 */
 	smp_wmb();
 	oe->__upperdentry = upperdentry;
+	ovl_update_type(dentry, d_is_dir(dentry));
 }
 
 void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 09/23] ovl: cram dentry state booleans into type flags
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (7 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 08/23] ovl: store path type in dentry Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 10/23] ovl: lookup index entry for copy up origin Amir Goldstein
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Like the other type flags, 'opaque' and 'impure' are states discovered in
lookup, set on certain ops (e.g.: create, rename) and never cleared.

We would like to add more state info to ovl_entry soon (indexed) and
and this state info would be added as type flags. It makes little sense
to have the 'opaque' and 'impure' states occupy a boolean each, so store
these states as type flags.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c     | 15 +++++++--------
 fs/overlayfs/overlayfs.h |  4 ++++
 fs/overlayfs/ovl_entry.h |  2 --
 fs/overlayfs/super.c     |  7 +++----
 fs/overlayfs/util.c      | 25 ++++++++++++++++---------
 5 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 198dde797a53..402e52a98c1a 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -402,8 +402,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct dentry *upperdir, *upperdentry = NULL;
 	unsigned int ctr = 0;
 	struct inode *inode = NULL;
-	bool upperopaque = false;
-	bool upperimpure = false;
+	enum ovl_path_type type = 0;
 	char *upperredirect = NULL;
 	struct dentry *this;
 	unsigned int i;
@@ -457,9 +456,10 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			if (d.redirect[0] == '/')
 				poe = roe;
 		}
-		upperopaque = d.opaque;
-		if (upperdentry && d.is_dir)
-			upperimpure = ovl_is_impuredir(upperdentry);
+		if (d.opaque)
+			type |= __OVL_PATH_OPAQUE;
+		if (upperdentry && d.is_dir && ovl_is_impuredir(upperdentry))
+			type |= __OVL_PATH_IMPURE;
 	}
 
 	if (!d.stop && poe->numlower) {
@@ -527,8 +527,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	}
 
 	revert_creds(old_cred);
-	oe->opaque = upperopaque;
-	oe->impure = upperimpure;
+	oe->__type = type;
 	oe->redirect = upperredirect;
 	oe->__upperdentry = upperdentry;
 	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
@@ -569,7 +568,7 @@ bool ovl_lower_positive(struct dentry *dentry)
 	 * whiteout.
 	 */
 	if (!dentry->d_inode)
-		return oe->opaque;
+		return OVL_TYPE_OPAQUE(oe->__type);
 
 	/* Negative upper -> positive lower */
 	if (!oe->__upperdentry)
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 66ee4116f93b..c4bc1c3ea4e2 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -14,11 +14,15 @@ enum ovl_path_type {
 	__OVL_PATH_UPPER	= (1 << 0),
 	__OVL_PATH_MERGE	= (1 << 1),
 	__OVL_PATH_ORIGIN	= (1 << 2),
+	__OVL_PATH_OPAQUE	= (1 << 3),
+	__OVL_PATH_IMPURE	= (1 << 4),
 };
 
 #define OVL_TYPE_UPPER(type)	((type) & __OVL_PATH_UPPER)
 #define OVL_TYPE_MERGE(type)	((type) & __OVL_PATH_MERGE)
 #define OVL_TYPE_ORIGIN(type)	((type) & __OVL_PATH_ORIGIN)
+#define OVL_TYPE_OPAQUE(type)	((type) & __OVL_PATH_OPAQUE)
+#define OVL_TYPE_IMPURE(type)	((type) & __OVL_PATH_IMPURE)
 
 #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
 #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index e94fa2638faf..6de8725cc38a 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -50,8 +50,6 @@ struct ovl_entry {
 		struct {
 			u64 version;
 			const char *redirect;
-			bool opaque;
-			bool impure;
 			bool copying;
 		};
 		struct rcu_head rcu;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 7a41648ee285..13f2ed09c025 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1129,10 +1129,9 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	mntput(workpath.mnt);
 	kfree(lowertmp);
 
-	if (upperpath.dentry) {
-		oe->__upperdentry = upperpath.dentry;
-		oe->impure = ovl_is_impuredir(upperpath.dentry);
-	}
+	oe->__upperdentry = upperpath.dentry;
+	if (upperpath.dentry && ovl_is_impuredir(upperpath.dentry))
+		oe->__type |= __OVL_PATH_IMPURE;
 	for (i = 0; i < numlower; i++) {
 		oe->lowerstack[i].dentry = stack[i].dentry;
 		oe->lowerstack[i].mnt = ufs->lower_mnt[i];
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 8e743c76aa03..9d2f3bb70e61 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -199,15 +199,12 @@ void ovl_set_dir_cache(struct dentry *dentry, struct ovl_dir_cache *cache)
 
 bool ovl_dentry_is_opaque(struct dentry *dentry)
 {
-	struct ovl_entry *oe = dentry->d_fsdata;
-	return oe->opaque;
+	return OVL_TYPE_OPAQUE(ovl_path_type(dentry));
 }
 
 bool ovl_dentry_is_impure(struct dentry *dentry)
 {
-	struct ovl_entry *oe = dentry->d_fsdata;
-
-	return oe->impure;
+	return OVL_TYPE_IMPURE(ovl_path_type(dentry));
 }
 
 bool ovl_dentry_is_whiteout(struct dentry *dentry)
@@ -219,7 +216,18 @@ void ovl_dentry_set_opaque(struct dentry *dentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
 
-	oe->opaque = true;
+	spin_lock(&dentry->d_lock);
+	oe->__type |= __OVL_PATH_OPAQUE;
+	spin_unlock(&dentry->d_lock);
+}
+
+static void ovl_dentry_set_impure(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	spin_lock(&dentry->d_lock);
+	oe->__type |= __OVL_PATH_IMPURE;
+	spin_unlock(&dentry->d_lock);
 }
 
 bool ovl_redirect_dir(struct super_block *sb)
@@ -372,9 +380,8 @@ int ovl_check_setxattr(struct dentry *dentry, struct dentry *upperdentry,
 int ovl_set_impure(struct dentry *dentry, struct dentry *upperdentry)
 {
 	int err;
-	struct ovl_entry *oe = dentry->d_fsdata;
 
-	if (oe->impure)
+	if (ovl_dentry_is_impure(dentry))
 		return 0;
 
 	/*
@@ -384,7 +391,7 @@ int ovl_set_impure(struct dentry *dentry, struct dentry *upperdentry)
 	err = ovl_check_setxattr(dentry, upperdentry, OVL_XATTR_IMPURE,
 				 "y", 1, 0);
 	if (!err)
-		oe->impure = true;
+		ovl_dentry_set_impure(dentry);
 
 	return err;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 10/23] ovl: lookup index entry for copy up origin
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (8 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 09/23] ovl: cram dentry state booleans into type flags Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 11/23] ovl: allocate an ovl_inode struct Amir Goldstein
                   ` (13 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

When inodes index feature is enabled, lookup in indexdir for the index
entry of lower real inode or copy up origin inode. The index entry name
is the hex representation of the lower inode file handle.

If the index dentry in negative, then either no lower aliases have been
copied up yet, or aliases have been copied up in older kernels and are
not indexed.

If the index dentry for a copy up origin inode is positive, but points
to an inode different than the upper inode, then either the upper inode
has been copied up and not indexed or it was indexed, but since then
index dir was cleared. Either way, that index cannot be used to indentify
the overlay inode.

If a positive dentry that matches the upper inode was found, then it is
safe to use the copy up origin st_ino for upper hardlinks, because all
indexed upper hardlinks are represented by the same overlay inode as the
copy up origin.

Set the INDEX type flag on an indexed dentry. A non-upper overlay dentry
may also have a positive index from copy up of another lower hardlink.
This situation can be tested with the path type macros
OVL_TYPE_INDEX(type) && !OVL_TYPE_UPPER(type).

This is going to be used to prevent breaking hardlinks on copy up.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |   8 +++-
 fs/overlayfs/namei.c     | 102 +++++++++++++++++++++++++++++++++++++++++++++++
 fs/overlayfs/overlayfs.h |   2 +
 3 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index d613e2c41242..b78993abdeea 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -96,11 +96,15 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 
 			WARN_ON_ONCE(stat->dev != lowerstat.dev);
 			/*
-			 * Lower hardlinks are broken on copy up to different
+			 * Lower hardlinks may be broken on copy up to different
 			 * upper files, so we cannot use the lower origin st_ino
 			 * for those different files, even for the same fs case.
+			 * With inodes index enabled, it is safe to use st_ino
+			 * of an indexed hardlinked origin. The index validates
+			 * that the upper hardlink is not broken.
 			 */
-			if (is_dir || lowerstat.nlink == 1)
+			if (is_dir || lowerstat.nlink == 1 ||
+			    OVL_TYPE_INDEX(type))
 				stat->ino = lowerstat.ino;
 		}
 		stat->dev = dentry->d_sb->s_dev;
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 402e52a98c1a..fe84e482d926 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -367,6 +367,84 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
 }
 
 /*
+ * Lookup in indexdir for the index entry of a lower real inode or a copy up
+ * origin inode. The index entry name is the hex representation of the lower
+ * inode file handle.
+ *
+ * If the index dentry in negative, then either no lower aliases have been
+ * copied up yet, or aliases have been copied up in older kernels and are
+ * not indexed.
+ *
+ * If the index dentry for a copy up origin inode is positive, but points
+ * to an inode different than the upper inode, then either the upper inode
+ * has been copied up and not indexed or it was indexed, but since then
+ * index dir was cleared. Either way, that index cannot be used to indentify
+ * the overlay inode.
+ */
+static int ovl_lookup_index(struct dentry *dentry, struct dentry *upper,
+			    struct dentry *lower, struct dentry **indexp)
+{
+	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+	struct ovl_fh *fh;
+	struct dentry *index = NULL;
+	struct inode *inode;
+	char *s, *name = NULL;
+	long namelen = 0;
+	int err;
+
+	if (WARN_ON(!ofs->indexdir))
+		return -ENOENT;
+
+	fh = ovl_encode_fh(lower, false);
+	if (IS_ERR(fh)) {
+		err = PTR_ERR(fh);
+		fh = NULL;
+		goto fail;
+	}
+
+	err = -ENAMETOOLONG;
+	namelen = fh->len * 2;
+	if (namelen > ofs->namelen)
+		goto fail;
+
+	err = -ENOMEM;
+	name = kzalloc(namelen + 1, GFP_TEMPORARY);
+	if (!name)
+		goto fail;
+
+	s  = bin2hex(name, fh, fh->len);
+	namelen = s - name;
+
+	index = lookup_one_len_unlocked(name, ofs->indexdir, namelen);
+	if (IS_ERR(index)) {
+		err = PTR_ERR(index);
+		goto fail;
+	}
+
+	if (upper && d_inode(index) != d_inode(upper)) {
+		inode = d_inode(index);
+		pr_debug("ovl_lookup_index: upper with origin not indexed (%pd2, ino=%lu)\n",
+			 upper, inode ? inode->i_ino : 0);
+		dput(index);
+		index = NULL;
+	}
+
+	*indexp = index;
+	err = 0;
+
+out:
+	kfree(fh);
+	kfree(name);
+	return err;
+
+fail:
+	inode = d_inode(lower);
+	pr_warn_ratelimited("overlayfs: failed inode index lookup (ino=%lu, key=%*s, err=%i); mount with '-o index=off' to disable inodes index.\n",
+			    inode ? inode->i_ino : 0, (int)namelen, name, err);
+	goto out;
+}
+
+/*
  * Returns next layer in stack starting from top.
  * Returns -1 if this is the last layer.
  */
@@ -400,6 +478,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	struct ovl_entry *roe = dentry->d_sb->s_root->d_fsdata;
 	struct path *stack = NULL;
 	struct dentry *upperdir, *upperdentry = NULL;
+	struct dentry *index = NULL;
 	unsigned int ctr = 0;
 	struct inode *inode = NULL;
 	enum ovl_path_type type = 0;
@@ -500,6 +579,27 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 	}
 
+	/* Lookup index by lower inode and verify it matches upper inode */
+	if (ctr && !d.is_dir && ovl_indexdir(dentry->d_sb)) {
+		struct dentry *origin = stack[0].dentry;
+
+		err = ovl_lookup_index(dentry, upperdentry, origin, &index);
+		if (err)
+			goto out_put;
+
+		if (index && d_inode(index)) {
+			type |= __OVL_PATH_INDEX;
+		} else if (upperdentry && d_inode(origin)->i_nlink > 1) {
+			/*
+			 * Non-indexed upper of lower hardlink is a broken
+			 * hardlink - disown origin, because we cannot use
+			 * origin st_ino for a broken hardlink.
+			 */
+			dput(origin);
+			ctr = 0;
+		}
+	}
+
 	oe = ovl_alloc_entry(ctr);
 	err = -ENOMEM;
 	if (!oe)
@@ -531,6 +631,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	oe->redirect = upperredirect;
 	oe->__upperdentry = upperdentry;
 	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
+	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
 	dentry->d_fsdata = oe;
@@ -542,6 +643,7 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 out_free_oe:
 	kfree(oe);
 out_put:
+	dput(index);
 	for (i = 0; i < ctr; i++)
 		dput(stack[i].dentry);
 	kfree(stack);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index c4bc1c3ea4e2..191a41f61839 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -16,6 +16,7 @@ enum ovl_path_type {
 	__OVL_PATH_ORIGIN	= (1 << 2),
 	__OVL_PATH_OPAQUE	= (1 << 3),
 	__OVL_PATH_IMPURE	= (1 << 4),
+	__OVL_PATH_INDEX	= (1 << 5),
 };
 
 #define OVL_TYPE_UPPER(type)	((type) & __OVL_PATH_UPPER)
@@ -23,6 +24,7 @@ enum ovl_path_type {
 #define OVL_TYPE_ORIGIN(type)	((type) & __OVL_PATH_ORIGIN)
 #define OVL_TYPE_OPAQUE(type)	((type) & __OVL_PATH_OPAQUE)
 #define OVL_TYPE_IMPURE(type)	((type) & __OVL_PATH_IMPURE)
+#define OVL_TYPE_INDEX(type)	((type) & __OVL_PATH_INDEX)
 
 #define OVL_XATTR_PREFIX XATTR_TRUSTED_PREFIX "overlay."
 #define OVL_XATTR_OPAQUE OVL_XATTR_PREFIX "opaque"
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 11/23] ovl: allocate an ovl_inode struct
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (9 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 10/23] ovl: lookup index entry for copy up origin Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 12/23] ovl: store upper/lower real inode in ovl_inode_info Amir Goldstein
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

We need some more space to store overlay inode data in memory,
so allocate overlay inodes from a slab of struct ovl_inode.

The ovl_inode struct includes a mutex that is going to be used
for synchronizing copy up of lower hardlinks.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/ovl_entry.h | 12 ++++++++++
 fs/overlayfs/super.c     | 62 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 6de8725cc38a..6c36fa70abe4 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -65,3 +65,15 @@ static inline struct dentry *ovl_upperdentry_dereference(struct ovl_entry *oe)
 {
 	return lockless_dereference(oe->__upperdentry);
 }
+
+struct ovl_inode {
+	/* keep this first */
+	struct inode vfs_inode;
+	/* synchronize copy up and more */
+	struct mutex oi_lock;
+};
+
+static inline struct ovl_inode *OVL_I(struct inode *inode)
+{
+	return (struct ovl_inode *) inode;
+}
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 13f2ed09c025..3fe635d01cc6 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -170,6 +170,34 @@ static const struct dentry_operations ovl_reval_dentry_operations = {
 	.d_weak_revalidate = ovl_dentry_weak_revalidate,
 };
 
+static struct kmem_cache *ovl_inode_cachep;
+
+static struct inode *ovl_alloc_inode(struct super_block *sb)
+{
+	struct inode *inode;
+
+	inode = kmem_cache_alloc(ovl_inode_cachep, GFP_KERNEL);
+	if (!inode)
+		return NULL;
+
+	mutex_init(&OVL_I(inode)->oi_lock);
+
+	return inode;
+}
+
+static void ovl_i_callback(struct rcu_head *head)
+{
+	struct inode *inode = container_of(head, struct inode, i_rcu);
+
+	kmem_cache_free(ovl_inode_cachep, inode);
+}
+
+static void ovl_destroy_inode(struct inode *inode)
+{
+	mutex_destroy(&OVL_I(inode)->oi_lock);
+	call_rcu(&inode->i_rcu, ovl_i_callback);
+}
+
 /* Get exclusive ownership on upper/work dir among overlay mounts */
 static bool ovl_dir_lock(struct dentry *dentry)
 {
@@ -294,12 +322,14 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 }
 
 static const struct super_operations ovl_super_operations = {
+	.alloc_inode	= ovl_alloc_inode,
+	.destroy_inode	= ovl_destroy_inode,
+	.drop_inode	= generic_delete_inode,
 	.put_super	= ovl_put_super,
 	.sync_fs	= ovl_sync_fs,
 	.statfs		= ovl_statfs,
 	.show_options	= ovl_show_options,
 	.remount_fs	= ovl_remount,
-	.drop_inode	= generic_delete_inode,
 };
 
 enum {
@@ -1200,14 +1230,42 @@ static struct file_system_type ovl_fs_type = {
 };
 MODULE_ALIAS_FS("overlay");
 
+static void ovl_inode_init_once(void *foo)
+{
+	struct inode *inode = foo;
+
+	inode_init_once(inode);
+}
+
 static int __init ovl_init(void)
 {
-	return register_filesystem(&ovl_fs_type);
+	int err;
+
+	ovl_inode_cachep = kmem_cache_create("ovl_inode",
+					     sizeof(struct ovl_inode), 0,
+					     SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
+					     ovl_inode_init_once);
+	if (ovl_inode_cachep == NULL)
+		return -ENOMEM;
+
+	err = register_filesystem(&ovl_fs_type);
+	if (err)
+		kmem_cache_destroy(ovl_inode_cachep);
+
+	return err;
 }
 
 static void __exit ovl_exit(void)
 {
 	unregister_filesystem(&ovl_fs_type);
+
+	/*
+	 * Make sure all delayed rcu free inodes are flushed before we
+	 * destroy cache.
+	 */
+	rcu_barrier();
+	kmem_cache_destroy(ovl_inode_cachep);
+
 }
 
 module_init(ovl_init);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 12/23] ovl: store upper/lower real inode in ovl_inode_info
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (10 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 11/23] ovl: allocate an ovl_inode struct Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14 11:00   ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 13/23] ovl: use ovl_inode_init() for initializing new inode Amir Goldstein
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Introduce ovl_inode_info struct that is embedded in ovl_inode
and contains a reference to lowerinode and/or upperinode.

Storing the upper/lower real inode in ovl_inode_info replaces
the method of storing realinode & ISUPPER flag in vfs inode
i_private field.

This will be used for hashing overlay inodes before copy up.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c     |  5 +++--
 fs/overlayfs/overlayfs.h | 13 +------------
 fs/overlayfs/ovl_entry.h | 12 ++++++++++++
 fs/overlayfs/super.c     |  1 +
 fs/overlayfs/util.c      | 28 ++++++++++++++++++++++++----
 5 files changed, 41 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index b78993abdeea..49492b982726 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -13,6 +13,7 @@
 #include <linux/xattr.h>
 #include <linux/posix_acl.h>
 #include "overlayfs.h"
+#include "ovl_entry.h"
 
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
@@ -457,12 +458,12 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 
 static int ovl_inode_test(struct inode *inode, void *data)
 {
-	return ovl_inode_real(inode, NULL) == data;
+	return OVL_I_INFO(inode)->__upperinode == data;
 }
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
-	inode->i_private = (void *) (((unsigned long) data) | OVL_ISUPPER_MASK);
+	OVL_I_INFO(inode)->__upperinode = data;
 	return 0;
 }
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 191a41f61839..46e6730f51f8 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -68,8 +68,6 @@ struct ovl_fh {
 	u8 fid[0];	/* file identifier */
 } __packed;
 
-#define OVL_ISUPPER_MASK 1UL
-
 static inline int ovl_do_rmdir(struct inode *dir, struct dentry *dentry)
 {
 	int err = vfs_rmdir(dir, dentry);
@@ -183,16 +181,6 @@ static inline struct dentry *ovl_do_tmpfile(struct dentry *dentry, umode_t mode)
 	return ret;
 }
 
-static inline struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
-{
-	unsigned long x = (unsigned long) READ_ONCE(inode->i_private);
-
-	if (is_upper)
-		*is_upper = x & OVL_ISUPPER_MASK;
-
-	return (struct inode *) (x & ~OVL_ISUPPER_MASK);
-}
-
 /* util.c */
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
@@ -224,6 +212,7 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
+struct inode *ovl_inode_real(struct inode *inode, bool *is_upper);
 void ovl_inode_update(struct inode *inode, struct inode *upperinode);
 void ovl_dentry_version_inc(struct dentry *dentry);
 u64 ovl_dentry_version_get(struct dentry *dentry);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 6c36fa70abe4..6e240bc5ac1d 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -66,9 +66,16 @@ static inline struct dentry *ovl_upperdentry_dereference(struct ovl_entry *oe)
 	return lockless_dereference(oe->__upperdentry);
 }
 
+/* private information embedded in every overlayfs inode */
+struct ovl_inode_info {
+	struct inode *__upperinode;
+	struct inode *lowerinode;
+};
+
 struct ovl_inode {
 	/* keep this first */
 	struct inode vfs_inode;
+	struct ovl_inode_info info;
 	/* synchronize copy up and more */
 	struct mutex oi_lock;
 };
@@ -77,3 +84,8 @@ static inline struct ovl_inode *OVL_I(struct inode *inode)
 {
 	return (struct ovl_inode *) inode;
 }
+
+static inline struct ovl_inode_info *OVL_I_INFO(struct inode *inode)
+{
+	return &OVL_I(inode)->info;
+}
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 3fe635d01cc6..7b48bfed0121 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -181,6 +181,7 @@ static struct inode *ovl_alloc_inode(struct super_block *sb)
 		return NULL;
 
 	mutex_init(&OVL_I(inode)->oi_lock);
+	memset(OVL_I_INFO(inode), 0, sizeof(struct ovl_inode_info));
 
 	return inode;
 }
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 9d2f3bb70e61..0ce6efd8f19d 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -269,16 +269,36 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 
 void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 {
-	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
-		   (is_upper ? OVL_ISUPPER_MASK : 0));
+	struct ovl_inode_info *oi = OVL_I_INFO(inode);
+
+	if (is_upper) {
+		oi->__upperinode = realinode;
+		oi->lowerinode = NULL;
+	} else {
+		oi->__upperinode = NULL;
+		oi->lowerinode = realinode;
+	}
+}
+
+struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
+{
+	struct ovl_inode_info *oi = OVL_I_INFO(inode);
+	struct inode *realinode;
+
+	realinode = READ_ONCE(oi->__upperinode);
+	if (!realinode)
+		realinode = oi->lowerinode;
+	else if (is_upper)
+		*is_upper = true;
+
+	return realinode;
 }
 
 void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
 	WARN_ON(!upperinode);
 	WARN_ON(!inode_unhashed(inode));
-	WRITE_ONCE(inode->i_private,
-		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
+	WRITE_ONCE(OVL_I_INFO(inode)->__upperinode, upperinode);
 	if (!S_ISDIR(upperinode->i_mode))
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 13/23] ovl: use ovl_inode_init() for initializing new inode
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (11 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 12/23] ovl: store upper/lower real inode in ovl_inode_info Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 14/23] ovl: hash overlay non-dir inodes by copy up origin inode Amir Goldstein
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

In preparation for hashing all overlay inodes, modify the helper
ovl_inode_init() to hash a new non-dir upper overlay inode.

Use this helper for initializing new upper inode in ovl_instantiate()
instead of ovl_inode_update(), because the implementations for new inode
case and copy up case are going to diverge.

Also move ovl_copyattr() into ovl_inode_init() and ovl_inode_get(),
so attributes will be copied only for new inode.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c   |  3 +--
 fs/overlayfs/inode.c |  1 +
 fs/overlayfs/namei.c |  1 -
 fs/overlayfs/util.c  | 11 ++++++++++-
 4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index a63a71656e9b..f87cef27741f 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -156,8 +156,7 @@ static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
 	ovl_dentry_version_inc(dentry->d_parent);
 	ovl_dentry_update(dentry, newdentry);
 	if (!hardlink) {
-		ovl_inode_update(inode, d_inode(newdentry));
-		ovl_copyattr(newdentry->d_inode, inode);
+		ovl_inode_init(inode, d_inode(newdentry), true);
 	} else {
 		WARN_ON(ovl_inode_real(inode, NULL) != d_inode(newdentry));
 		inc_nlink(inode);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 49492b982726..6f0ec691d8d5 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -477,6 +477,7 @@ struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode)
 	if (inode && inode->i_state & I_NEW) {
 		ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
 		set_nlink(inode, realinode->i_nlink);
+		ovl_copyattr(realinode, inode);
 		unlock_new_inode(inode);
 	}
 
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index fe84e482d926..228b9eaa19f6 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -623,7 +623,6 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 		}
 		if (!inode)
 			goto out_free_oe;
-		ovl_copyattr(realdentry->d_inode, inode);
 	}
 
 	revert_creds(old_cred);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 0ce6efd8f19d..9bbcfcb20ec4 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -267,6 +267,12 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 	ovl_update_type(dentry, d_is_dir(dentry));
 }
 
+static void ovl_insert_inode_hash(struct inode *inode, struct inode *realinode)
+{
+	WARN_ON(!inode_unhashed(inode));
+	__insert_inode_hash(inode, (unsigned long) realinode);
+}
+
 void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 {
 	struct ovl_inode_info *oi = OVL_I_INFO(inode);
@@ -274,10 +280,13 @@ void ovl_inode_init(struct inode *inode, struct inode *realinode, bool is_upper)
 	if (is_upper) {
 		oi->__upperinode = realinode;
 		oi->lowerinode = NULL;
+		if (!S_ISDIR(realinode->i_mode))
+			ovl_insert_inode_hash(inode, realinode);
 	} else {
 		oi->__upperinode = NULL;
 		oi->lowerinode = realinode;
 	}
+	ovl_copyattr(realinode, inode);
 }
 
 struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
@@ -300,7 +309,7 @@ void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 	WARN_ON(!inode_unhashed(inode));
 	WRITE_ONCE(OVL_I_INFO(inode)->__upperinode, upperinode);
 	if (!S_ISDIR(upperinode->i_mode))
-		__insert_inode_hash(inode, (unsigned long) upperinode);
+		ovl_insert_inode_hash(inode, upperinode);
 }
 
 void ovl_dentry_version_inc(struct dentry *dentry)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 14/23] ovl: hash overlay non-dir inodes by copy up origin inode
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (12 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 13/23] ovl: use ovl_inode_init() for initializing new inode Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 15/23] ovl: defer upper dir lock to tempfile link Amir Goldstein
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

When inodes index feature is enabled, hash all non-dir inodes by the
address of the copy up origin inode if it is indexed.

Non-upper overlay inodes are hashed by the lower real inode, which is
the copy up origin to be.

This change makes all lower hardlinks and their indexed copy ups be
represented by a single overlay inode and is needed for vfs inode
consistency after hardlinks are no longer broken on copy up.

When hashing a non-upper overlay inode and an index entry already exists
from another lower alias copy up, set the overlay realinode to the indexed
upper inode. This is needed to make the overlay realinode invariant to
the order of lookup between two lower aliases, when only one fo them was
copied up.

Because overlay dentries of lower hardlink aliases have the same overlay
inode, a non indexed copy up of those lower aliases will cause a conflict
when trying to update the realinode to the broken upper hardlink.
A non indexed copy up of an alias that lost in this conflict will return
-EEXIST and drop the overlay dentry. The next lookup of that broken
upper hardlink will return as upper entry with a new overlay inode.

This conflict is going to be handled more gracefully by following patches.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   |  6 +++++-
 fs/overlayfs/inode.c     | 49 ++++++++++++++++++++++++++++++++++++++++++------
 fs/overlayfs/namei.c     | 28 +++++++++++++++++++++++++--
 fs/overlayfs/overlayfs.h |  7 +++++--
 fs/overlayfs/util.c      | 33 ++++++++++++++++++++++++++++----
 5 files changed, 108 insertions(+), 15 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index eedc26e15dad..ae18824c7944 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -418,7 +418,11 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 
 	newdentry = dget(tmpfile ? upper : temp);
 	ovl_dentry_update(dentry, newdentry);
-	ovl_inode_update(d_inode(dentry), d_inode(newdentry));
+	err = ovl_inode_update(d_inode(dentry), d_inode(newdentry));
+	if (err) {
+		/* Broken hardlink - drop cache and return error */
+		d_drop(dentry);
+	}
 
 	/* Restore timestamps on parent (best effort) */
 	ovl_set_timestamps(upperdir, pstat);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 6f0ec691d8d5..28f9a8cc0f61 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -458,22 +458,59 @@ struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev)
 
 static int ovl_inode_test(struct inode *inode, void *data)
 {
-	return OVL_I_INFO(inode)->__upperinode == data;
+	struct ovl_inode_info *oi = OVL_I_INFO(inode);
+	struct ovl_inode_info *info = data;
+	bool indexed = ovl_indexdir(inode->i_sb);
+
+	if (indexed && info->lowerinode != oi->lowerinode)
+		return false;
+
+	if (info->__upperinode == oi->__upperinode)
+		return true;
+
+	/*
+	 * Return same overlay inode when looking up by lower real inode, but
+	 * an indexed overlay inode, that is hashed by the same lower origin
+	 * inode, has already been updated on copy up to a real upper inode.
+	 */
+	return indexed && !info->__upperinode;
 }
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
-	OVL_I_INFO(inode)->__upperinode = data;
+	struct ovl_inode_info *oi = OVL_I_INFO(inode);
+	struct ovl_inode_info *info = data;
+
+	oi->__upperinode = info->__upperinode;
+	oi->lowerinode = info->lowerinode;
+
 	return 0;
 }
 
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode)
-
+struct inode *ovl_get_inode(struct super_block *sb, struct ovl_inode_info *info)
 {
+	struct inode *realinode = info->__upperinode;
+	unsigned long hashval = (unsigned long) realinode;
 	struct inode *inode;
 
-	inode = iget5_locked(sb, (unsigned long) realinode,
-			     ovl_inode_test, ovl_inode_set, realinode);
+	/*
+	 * With inodes index feature enabled, overlay inodes hash key is the
+	 * address of the copy up origin lower inode if it is indexed.
+	 * When hashing a non-upper overlay inode, origin has to be set to
+	 * the real lower inode, which is the copy up origin inode to be.
+	 * Regardless of the inode we use as hash key, we always use the
+	 * uppermost real inode to initialize ovl_inode.
+	 */
+	if (info->lowerinode) {
+		WARN_ON(!ovl_indexdir(sb));
+		hashval = (unsigned long) info->lowerinode;
+		if (!realinode)
+			realinode = info->lowerinode;
+	} else if (WARN_ON(!realinode)) {
+		return NULL;
+	}
+
+	inode = iget5_locked(sb, hashval, ovl_inode_test, ovl_inode_set, info);
 	if (inode && inode->i_state & I_NEW) {
 		ovl_fill_inode(inode, realinode->i_mode, realinode->i_rdev);
 		set_nlink(inode, realinode->i_nlink);
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 228b9eaa19f6..31f5ad774499 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -608,13 +608,37 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	if (upperdentry || ctr) {
 		struct dentry *realdentry;
 		struct inode *realinode;
+		struct inode *originode = NULL;
 
 		realdentry = upperdentry ? upperdentry : stack[0].dentry;
 		realinode = d_inode(realdentry);
+		/*
+		 * When inodes index is enabled, we hash all non-dir inodes
+		 * by the address of the copy up origin inode if it is indexed
+		 * or by the address of the non-upper real inode.
+		 * When inodes index is disabled, or if origin is not indexed,
+		 * we hash non-dir upper inodes by the addess of the real inode.
+		 * When hashing a non-upper overlay inode and index points to
+		 * an upper inode (from another lower alias copy up), set the
+		 * real upper inode to the indexed upper inode.
+		 */
+		if (index) {
+			BUG_ON(!ctr);
+			originode = d_inode(stack[0].dentry);
+			if (!upperdentry)
+				realinode = d_inode(index);
+			else
+				WARN_ON(realinode != d_inode(index));
+		}
 
 		err = -ENOMEM;
-		if (upperdentry && !d_is_dir(upperdentry)) {
-			inode = ovl_get_inode(dentry->d_sb, realinode);
+		if (!d.is_dir && (upperdentry || originode)) {
+			struct ovl_inode_info info = {
+				.__upperinode = realinode,
+				.lowerinode = originode,
+			};
+
+			inode = ovl_get_inode(dentry->d_sb, &info);
 		} else {
 			inode = ovl_new_inode(dentry->d_sb, realinode->i_mode,
 					      realinode->i_rdev);
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 46e6730f51f8..0e801059311a 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -213,7 +213,7 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
 struct inode *ovl_inode_real(struct inode *inode, bool *is_upper);
-void ovl_inode_update(struct inode *inode, struct inode *upperinode);
+int ovl_inode_update(struct inode *inode, struct inode *upperinode);
 void ovl_dentry_version_inc(struct dentry *dentry);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
@@ -264,7 +264,10 @@ int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
 bool ovl_is_private_xattr(const char *name);
 
 struct inode *ovl_new_inode(struct super_block *sb, umode_t mode, dev_t rdev);
-struct inode *ovl_get_inode(struct super_block *sb, struct inode *realinode);
+struct ovl_inode_info;
+struct inode *ovl_get_inode(struct super_block *sb,
+			    struct ovl_inode_info *info);
+
 static inline void ovl_copyattr(struct inode *from, struct inode *to)
 {
 	to->i_uid = from->i_uid;
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 9bbcfcb20ec4..ee6d08f9c7bd 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -303,13 +303,38 @@ struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
 	return realinode;
 }
 
-void ovl_inode_update(struct inode *inode, struct inode *upperinode)
+/*
+ * When inodes index is enabled, we hash all non-dir inodes by the address
+ * of the lower origin inode. We need to take care on concurrent copy up of
+ * different lower hardlinks, that only one alias can set the upper real inode.
+ * Copy up of an alias that lost the ovl_inode_update() race will get -EEXIST
+ * and needs to d_drop() the overlay dentry of that alias, so the next
+ * ovl_lookup() will initialize a new overlay inode for the broken hardlink.
+ */
+int ovl_inode_update(struct inode *inode, struct inode *upperinode)
 {
+	struct ovl_inode_info *oi = OVL_I_INFO(inode);
+	struct inode *realinode;
+	bool indexed = ovl_indexdir(inode->i_sb);
+
 	WARN_ON(!upperinode);
-	WARN_ON(!inode_unhashed(inode));
-	WRITE_ONCE(OVL_I_INFO(inode)->__upperinode, upperinode);
-	if (!S_ISDIR(upperinode->i_mode))
+
+	spin_lock(&inode->i_lock);
+	realinode = oi->__upperinode;
+	if (!realinode)
+		oi->__upperinode = upperinode;
+	spin_unlock(&inode->i_lock);
+
+	if (realinode && realinode != upperinode) {
+		WARN_ON(!indexed);
+		return -EEXIST;
+	}
+
+	/* When inodes index is enabled, inode is hashed before copy up */
+	if (!S_ISDIR(upperinode->i_mode) && !indexed)
 		ovl_insert_inode_hash(inode, upperinode);
+
+	return 0;
 }
 
 void ovl_dentry_version_inc(struct dentry *dentry)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 15/23] ovl: defer upper dir lock to tempfile link
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (13 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 14/23] ovl: hash overlay non-dir inodes by copy up origin inode Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 16/23] ovl: factor out ovl_copy_up_inode() helper Amir Goldstein
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

On copy up of regular file using an O_TMPFILE, lock upper dir only
before linking the tempfile in place.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 35 ++++++++++++++++++-----------------
 fs/overlayfs/util.c    |  3 ++-
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index ae18824c7944..f6570bc127e7 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -336,8 +336,13 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 		.link = link
 	};
 
-	upper = lookup_one_len(dentry->d_name.name, upperdir,
-			       dentry->d_name.len);
+	if (tmpfile) {
+		upper = lookup_one_len_unlocked(dentry->d_name.name, upperdir,
+						dentry->d_name.len);
+	} else {
+		upper = lookup_one_len(dentry->d_name.name, upperdir,
+				       dentry->d_name.len);
+	}
 	err = PTR_ERR(upper);
 	if (IS_ERR(upper))
 		goto out;
@@ -377,16 +382,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 		BUG_ON(upperpath.dentry != NULL);
 		upperpath.dentry = temp;
 
-		if (tmpfile) {
-			inode_unlock(udir);
-			err = ovl_copy_up_data(lowerpath, &upperpath,
-					       stat->size);
-			inode_lock_nested(udir, I_MUTEX_PARENT);
-		} else {
-			err = ovl_copy_up_data(lowerpath, &upperpath,
-					       stat->size);
-		}
-
+		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
 		if (err)
 			goto out_cleanup;
 	}
@@ -409,10 +405,16 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out_cleanup;
 
-	if (tmpfile)
+	if (tmpfile) {
+		inode_lock_nested(udir, I_MUTEX_PARENT);
 		err = ovl_do_link(temp, udir, upper, true);
-	else
+		if (!err)
+			ovl_set_timestamps(upperdir, pstat);
+		pstat = NULL;
+		inode_unlock(udir);
+	} else {
 		err = ovl_do_rename(wdir, temp, udir, upper, 0);
+	}
 	if (err)
 		goto out_cleanup;
 
@@ -425,7 +427,8 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	}
 
 	/* Restore timestamps on parent (best effort) */
-	ovl_set_timestamps(upperdir, pstat);
+	if (pstat)
+		ovl_set_timestamps(upperdir, pstat);
 out2:
 	dput(temp);
 out1:
@@ -496,10 +499,8 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			goto out_done;
 		}
 
-		inode_lock_nested(upperdir->d_inode, I_MUTEX_PARENT);
 		err = ovl_copy_up_locked(workdir, upperdir, dentry, lowerpath,
 					 stat, link, &pstat, true);
-		inode_unlock(upperdir->d_inode);
 		ovl_copy_up_end(dentry);
 		goto out_done;
 	}
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index ee6d08f9c7bd..65c1e09ae0d0 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -256,7 +256,8 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
 
-	WARN_ON(!inode_is_locked(upperdentry->d_parent->d_inode));
+	WARN_ON(!inode_is_locked(upperdentry->d_parent->d_inode) &&
+		!oe->copying);
 	WARN_ON(oe->__upperdentry);
 	/*
 	 * Make sure upperdentry is consistent before making it visible to
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 16/23] ovl: factor out ovl_copy_up_inode() helper
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (14 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 15/23] ovl: defer upper dir lock to tempfile link Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 17/23] ovl: generalize ovl_copy_up_locked() using actors Amir Goldstein
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Factor out helper for copying lower inode data and metadata to temp
upper inode, that is common to copy up using O_TMPFILE and workdir.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 64 +++++++++++++++++++++++++++++---------------------
 1 file changed, 37 insertions(+), 27 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index f6570bc127e7..e642a51f3be4 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -316,6 +316,42 @@ static int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
 	return err;
 }
 
+static int ovl_copy_up_inode(struct dentry *dentry, struct dentry *temp,
+			     struct path *lowerpath, struct kstat *stat)
+{
+	int err;
+
+	if (S_ISREG(stat->mode)) {
+		struct path upperpath;
+
+		ovl_path_upper(dentry, &upperpath);
+		BUG_ON(upperpath.dentry != NULL);
+		upperpath.dentry = temp;
+
+		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
+		if (err)
+			return err;
+	}
+
+	err = ovl_copy_xattr(lowerpath->dentry, temp);
+	if (err)
+		return err;
+
+	inode_lock(temp->d_inode);
+	err = ovl_set_attr(temp, stat);
+	inode_unlock(temp->d_inode);
+	if (err)
+		return err;
+
+	/*
+	 * Store identifier of lower inode in upper inode xattr to
+	 * allow lookup of the copy up origin inode.
+	 */
+	err = ovl_set_origin(dentry, lowerpath->dentry, temp);
+
+	return err;
+}
+
 static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 			      struct dentry *dentry, struct path *lowerpath,
 			      struct kstat *stat, const char *link,
@@ -375,33 +411,7 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out2;
 
-	if (S_ISREG(stat->mode)) {
-		struct path upperpath;
-
-		ovl_path_upper(dentry, &upperpath);
-		BUG_ON(upperpath.dentry != NULL);
-		upperpath.dentry = temp;
-
-		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
-		if (err)
-			goto out_cleanup;
-	}
-
-	err = ovl_copy_xattr(lowerpath->dentry, temp);
-	if (err)
-		goto out_cleanup;
-
-	inode_lock(temp->d_inode);
-	err = ovl_set_attr(temp, stat);
-	inode_unlock(temp->d_inode);
-	if (err)
-		goto out_cleanup;
-
-	/*
-	 * Store identifier of lower inode in upper inode xattr to
-	 * allow lookup of the copy up origin inode.
-	 */
-	err = ovl_set_origin(dentry, lowerpath->dentry, temp);
+	err = ovl_copy_up_inode(dentry, temp, lowerpath, stat);
 	if (err)
 		goto out_cleanup;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 17/23] ovl: generalize ovl_copy_up_locked() using actors
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (15 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 16/23] ovl: factor out ovl_copy_up_inode() helper Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 18/23] ovl: generalize ovl_copy_up_one() " Amir Goldstein
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

ovl_copy_up_locked() is implementing two different copy up methods in one
function with many conditional statements.

Split the function into actors: prepare,commit,cancel.
Implement the actors for the 'workdir' and 'tmpfile' copy up methods.

The aquire,release code for the two methods is still open coded in
ovl_copy_up_one(). Next patch will implement the aquire,release actors.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 250 ++++++++++++++++++++++++++++++++++---------------
 1 file changed, 172 insertions(+), 78 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index e642a51f3be4..39e982acc6c3 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -352,104 +352,197 @@ static int ovl_copy_up_inode(struct dentry *dentry, struct dentry *temp,
 	return err;
 }
 
-static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
-			      struct dentry *dentry, struct path *lowerpath,
-			      struct kstat *stat, const char *link,
-			      struct kstat *pstat, bool tmpfile)
+/*
+ * Context and operations for copying up a single lower file.
+ */
+struct ovl_copy_up_ctx {
+	struct dentry *dentry;
+	struct path *lowerpath;
+	struct kstat *stat;
+	struct kstat pstat;
+	const char *link;
+	struct dentry *upperdir;
+	struct dentry *tempdir;
+	struct dentry *upper;
+	struct dentry *temp;
+};
+
+struct ovl_copy_up_ops {
+	int (*prepare)(struct ovl_copy_up_ctx *);
+	int (*commit)(struct ovl_copy_up_ctx *);
+	void (*cancel)(struct ovl_copy_up_ctx *);
+};
+
+/*
+ * Copy up operations using workdir.
+ * Upper file is created in workdir, copied and moved to upperdir.
+ */
+static int ovl_copy_up_workdir_prepare(struct ovl_copy_up_ctx *ctx)
 {
-	struct inode *wdir = workdir->d_inode;
-	struct inode *udir = upperdir->d_inode;
-	struct dentry *newdentry = NULL;
 	struct dentry *upper = NULL;
 	struct dentry *temp = NULL;
 	int err;
-	const struct cred *old_creds = NULL;
-	struct cred *new_creds = NULL;
 	struct cattr cattr = {
 		/* Can't properly set mode on creation because of the umask */
-		.mode = stat->mode & S_IFMT,
-		.rdev = stat->rdev,
-		.link = link
+		.mode = ctx->stat->mode & S_IFMT,
+		.rdev = ctx->stat->rdev,
+		.link = ctx->link,
 	};
 
-	if (tmpfile) {
-		upper = lookup_one_len_unlocked(dentry->d_name.name, upperdir,
-						dentry->d_name.len);
-	} else {
-		upper = lookup_one_len(dentry->d_name.name, upperdir,
-				       dentry->d_name.len);
-	}
+
+	upper = lookup_one_len(ctx->dentry->d_name.name, ctx->upperdir,
+			       ctx->dentry->d_name.len);
 	err = PTR_ERR(upper);
 	if (IS_ERR(upper))
 		goto out;
 
+	temp = ovl_lookup_temp(ctx->tempdir);
+	if (IS_ERR(temp)) {
+		err = PTR_ERR(temp);
+		goto out_dput_upper;
+	}
+
+	err = ovl_create_real(d_inode(ctx->tempdir), temp, &cattr, NULL, true);
+	if (err)
+		goto out_dput_temp;
+
+	ctx->upper = upper;
+	ctx->temp = temp;
+	return 0;
+
+out_dput_temp:
+	dput(temp);
+out_dput_upper:
+	dput(upper);
+out:
+	return err;
+}
+
+static int ovl_copy_up_workdir_commit(struct ovl_copy_up_ctx *ctx)
+{
+	int err;
+
+	err = ovl_do_rename(d_inode(ctx->tempdir), ctx->temp,
+			    d_inode(ctx->upperdir), ctx->upper, 0);
+	if (err)
+		return err;
+
+	/* After rename, ctx->temp is the upper entry we will use */
+	swap(ctx->temp, ctx->upper);
+
+	/* Restore timestamps on parent (best effort) */
+	ovl_set_timestamps(ctx->upperdir, &ctx->pstat);
+
+	return err;
+}
+
+static void ovl_copy_up_workdir_cancel(struct ovl_copy_up_ctx *ctx)
+{
+	ovl_cleanup(d_inode(ctx->tempdir), ctx->temp);
+}
+
+static const struct ovl_copy_up_ops ovl_copy_up_workdir_ops = {
+	.prepare = ovl_copy_up_workdir_prepare,
+	.commit = ovl_copy_up_workdir_commit,
+	.cancel = ovl_copy_up_workdir_cancel,
+};
+
+/*
+ * Copy up operations using O_TMPFILE.
+ * Upper file is created unlinked, copied and linked to upperdir.
+ */
+static int ovl_copy_up_tmpfile_prepare(struct ovl_copy_up_ctx *ctx)
+{
+	struct dentry *upper;
+	struct dentry *temp;
+
+	upper = lookup_one_len_unlocked(ctx->dentry->d_name.name, ctx->upperdir,
+					ctx->dentry->d_name.len);
+	if (IS_ERR(upper))
+		return PTR_ERR(upper);
+
+	temp = ovl_do_tmpfile(ctx->upperdir, ctx->stat->mode);
+	if (IS_ERR(temp)) {
+		dput(upper);
+		return PTR_ERR(temp);
+	}
+
+	ctx->upper = upper;
+	ctx->temp = temp;
+	return 0;
+}
+
+static int ovl_copy_up_tmpfile_commit(struct ovl_copy_up_ctx *ctx)
+{
+	int err;
+
+	inode_lock_nested(d_inode(ctx->upperdir), I_MUTEX_PARENT);
+	/* link the sucker ;) */
+	err = ovl_do_link(ctx->temp, d_inode(ctx->upperdir), ctx->upper, true);
+	/* Restore timestamps on parent (best effort) */
+	if (!err)
+		ovl_set_timestamps(ctx->upperdir, &ctx->pstat);
+	inode_unlock(d_inode(ctx->upperdir));
+
+	return err;
+}
+
+static void ovl_copy_up_tmpfile_cancel(struct ovl_copy_up_ctx *ctx) { }
+
+static const struct ovl_copy_up_ops ovl_copy_up_tmpfile_ops = {
+	.prepare = ovl_copy_up_tmpfile_prepare,
+	.commit = ovl_copy_up_tmpfile_commit,
+	.cancel = ovl_copy_up_tmpfile_cancel,
+};
+
+static int ovl_copy_up_locked(struct ovl_copy_up_ctx *ctx,
+			      const struct ovl_copy_up_ops *ops)
+{
+	struct dentry *dentry = ctx->dentry;
+	const struct cred *old_creds = NULL;
+	struct cred *new_creds = NULL;
+	int err;
+
 	err = security_inode_copy_up(dentry, &new_creds);
 	if (err < 0)
-		goto out1;
+		return err;
 
 	if (new_creds)
 		old_creds = override_creds(new_creds);
 
-	if (tmpfile)
-		temp = ovl_do_tmpfile(upperdir, stat->mode);
-	else
-		temp = ovl_lookup_temp(workdir);
-	err = 0;
-	if (IS_ERR(temp)) {
-		err = PTR_ERR(temp);
-		temp = NULL;
-	}
-
-	if (!err && !tmpfile)
-		err = ovl_create_real(wdir, temp, &cattr, NULL, true);
+	ctx->upper = ctx->temp = NULL;
+	err = ops->prepare(ctx);
 
 	if (new_creds) {
 		revert_creds(old_creds);
 		put_cred(new_creds);
 	}
-
 	if (err)
-		goto out2;
+		goto out;
 
-	err = ovl_copy_up_inode(dentry, temp, lowerpath, stat);
+	err = ovl_copy_up_inode(dentry, ctx->temp, ctx->lowerpath, ctx->stat);
 	if (err)
-		goto out_cleanup;
+		goto out_cancel;
 
-	if (tmpfile) {
-		inode_lock_nested(udir, I_MUTEX_PARENT);
-		err = ovl_do_link(temp, udir, upper, true);
-		if (!err)
-			ovl_set_timestamps(upperdir, pstat);
-		pstat = NULL;
-		inode_unlock(udir);
-	} else {
-		err = ovl_do_rename(wdir, temp, udir, upper, 0);
-	}
+	err = ops->commit(ctx);
 	if (err)
-		goto out_cleanup;
+		goto out_cancel;
 
-	newdentry = dget(tmpfile ? upper : temp);
-	ovl_dentry_update(dentry, newdentry);
-	err = ovl_inode_update(d_inode(dentry), d_inode(newdentry));
+	ovl_dentry_update(dentry, dget(ctx->upper));
+	err = ovl_inode_update(d_inode(dentry), d_inode(ctx->upper));
 	if (err) {
 		/* Broken hardlink - drop cache and return error */
 		d_drop(dentry);
 	}
 
-	/* Restore timestamps on parent (best effort) */
-	if (pstat)
-		ovl_set_timestamps(upperdir, pstat);
-out2:
-	dput(temp);
-out1:
-	dput(upper);
 out:
+	dput(ctx->temp);
+	dput(ctx->upper);
 	return err;
 
-out_cleanup:
-	if (!tmpfile)
-		ovl_cleanup(wdir, temp);
-	goto out2;
+out_cancel:
+	ops->cancel(ctx);
+	goto out;
 }
 
 /*
@@ -465,37 +558,40 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			   struct path *lowerpath, struct kstat *stat)
 {
 	DEFINE_DELAYED_CALL(done);
-	struct dentry *workdir = ovl_workdir(dentry);
 	int err;
-	struct kstat pstat;
 	struct path parentpath;
 	struct dentry *lowerdentry = lowerpath->dentry;
-	struct dentry *upperdir;
-	const char *link = NULL;
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+	struct ovl_copy_up_ctx ctx = {
+		.dentry = dentry,
+		.lowerpath = lowerpath,
+		.stat = stat,
+		.link = NULL,
+		.tempdir = ovl_workdir(dentry),
+	};
 
-	if (WARN_ON(!workdir))
+	if (WARN_ON(!ctx.tempdir))
 		return -EROFS;
 
 	ovl_do_check_copy_up(lowerdentry);
 
 	ovl_path_upper(parent, &parentpath);
-	upperdir = parentpath.dentry;
+	ctx.upperdir = parentpath.dentry;
 
 	/* Mark parent "impure" because it may now contain non-pure upper */
-	err = ovl_set_impure(parent, upperdir);
+	err = ovl_set_impure(parent, ctx.upperdir);
 	if (err)
 		return err;
 
-	err = vfs_getattr(&parentpath, &pstat,
+	err = vfs_getattr(&parentpath, &ctx.pstat,
 			  STATX_ATIME | STATX_MTIME, AT_STATX_SYNC_AS_STAT);
 	if (err)
 		return err;
 
 	if (S_ISLNK(stat->mode)) {
-		link = vfs_get_link(lowerdentry, &done);
-		if (IS_ERR(link))
-			return PTR_ERR(link);
+		ctx.link = vfs_get_link(lowerdentry, &done);
+		if (IS_ERR(ctx.link))
+			return PTR_ERR(ctx.link);
 	}
 
 	/* Should we copyup with O_TMPFILE or with workdir? */
@@ -509,14 +605,13 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			goto out_done;
 		}
 
-		err = ovl_copy_up_locked(workdir, upperdir, dentry, lowerpath,
-					 stat, link, &pstat, true);
+		err = ovl_copy_up_locked(&ctx, &ovl_copy_up_tmpfile_ops);
 		ovl_copy_up_end(dentry);
 		goto out_done;
 	}
 
 	err = -EIO;
-	if (lock_rename(workdir, upperdir) != NULL) {
+	if (lock_rename(ctx.tempdir, ctx.upperdir) != NULL) {
 		pr_err("overlayfs: failed to lock workdir+upperdir\n");
 		goto out_unlock;
 	}
@@ -526,10 +621,9 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		goto out_unlock;
 	}
 
-	err = ovl_copy_up_locked(workdir, upperdir, dentry, lowerpath,
-				 stat, link, &pstat, false);
+	err = ovl_copy_up_locked(&ctx, &ovl_copy_up_workdir_ops);
 out_unlock:
-	unlock_rename(workdir, upperdir);
+	unlock_rename(ctx.tempdir, ctx.upperdir);
 out_done:
 	do_delayed_call(&done);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 18/23] ovl: generalize ovl_copy_up_one() using actors
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (16 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 17/23] ovl: generalize ovl_copy_up_locked() using actors Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 19/23] ovl: use ovl_inode mutex to synchronize concurrent copy up Amir Goldstein
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

ovl_copy_up_one() is implementing two different locking methods
for regular files and non-regular files.

Implement the actors aquire,release for the 'workdir' and 'tmpfile'
copy up methods and use them to generalize locking in ovl_copy_up_one().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 81 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 56 insertions(+), 25 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 39e982acc6c3..452560a17d7e 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -368,15 +368,38 @@ struct ovl_copy_up_ctx {
 };
 
 struct ovl_copy_up_ops {
+	int (*aquire)(struct ovl_copy_up_ctx *);
 	int (*prepare)(struct ovl_copy_up_ctx *);
 	int (*commit)(struct ovl_copy_up_ctx *);
 	void (*cancel)(struct ovl_copy_up_ctx *);
+	void (*release)(struct ovl_copy_up_ctx *);
 };
 
 /*
  * Copy up operations using workdir.
  * Upper file is created in workdir, copied and moved to upperdir.
  */
+static int ovl_copy_up_workdir_aquire(struct ovl_copy_up_ctx *ctx)
+{
+	int err = -EIO;
+
+	if (lock_rename(ctx->tempdir, ctx->upperdir) != NULL) {
+		pr_err("overlayfs: failed to lock workdir+upperdir\n");
+		goto out_unlock;
+	}
+	if (ovl_dentry_upper(ctx->dentry)) {
+		/* Raced with another copy-up? */
+		err = 1;
+		goto out_unlock;
+	}
+
+	return 0;
+
+out_unlock:
+	unlock_rename(ctx->tempdir, ctx->upperdir);
+	return err;
+}
+
 static int ovl_copy_up_workdir_prepare(struct ovl_copy_up_ctx *ctx)
 {
 	struct dentry *upper = NULL;
@@ -441,16 +464,28 @@ static void ovl_copy_up_workdir_cancel(struct ovl_copy_up_ctx *ctx)
 	ovl_cleanup(d_inode(ctx->tempdir), ctx->temp);
 }
 
+static void ovl_copy_up_workdir_release(struct ovl_copy_up_ctx *ctx)
+{
+	unlock_rename(ctx->tempdir, ctx->upperdir);
+}
+
 static const struct ovl_copy_up_ops ovl_copy_up_workdir_ops = {
+	.aquire = ovl_copy_up_workdir_aquire,
 	.prepare = ovl_copy_up_workdir_prepare,
 	.commit = ovl_copy_up_workdir_commit,
 	.cancel = ovl_copy_up_workdir_cancel,
+	.release = ovl_copy_up_workdir_release,
 };
 
 /*
  * Copy up operations using O_TMPFILE.
  * Upper file is created unlinked, copied and linked to upperdir.
  */
+static int ovl_copy_up_tmpfile_aquire(struct ovl_copy_up_ctx *ctx)
+{
+	return ovl_copy_up_start(ctx->dentry);
+}
+
 static int ovl_copy_up_tmpfile_prepare(struct ovl_copy_up_ctx *ctx)
 {
 	struct dentry *upper;
@@ -489,10 +524,17 @@ static int ovl_copy_up_tmpfile_commit(struct ovl_copy_up_ctx *ctx)
 
 static void ovl_copy_up_tmpfile_cancel(struct ovl_copy_up_ctx *ctx) { }
 
+static void ovl_copy_up_tmpfile_release(struct ovl_copy_up_ctx *ctx)
+{
+	ovl_copy_up_end(ctx->dentry);
+}
+
 static const struct ovl_copy_up_ops ovl_copy_up_tmpfile_ops = {
+	.aquire = ovl_copy_up_tmpfile_aquire,
 	.prepare = ovl_copy_up_tmpfile_prepare,
 	.commit = ovl_copy_up_tmpfile_commit,
 	.cancel = ovl_copy_up_tmpfile_cancel,
+	.release = ovl_copy_up_tmpfile_release,
 };
 
 static int ovl_copy_up_locked(struct ovl_copy_up_ctx *ctx,
@@ -569,6 +611,7 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		.link = NULL,
 		.tempdir = ovl_workdir(dentry),
 	};
+	const struct ovl_copy_up_ops *ops;
 
 	if (WARN_ON(!ctx.tempdir))
 		return -EROFS;
@@ -595,35 +638,23 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	}
 
 	/* Should we copyup with O_TMPFILE or with workdir? */
-	if (S_ISREG(stat->mode) && ofs->tmpfile) {
-		err = ovl_copy_up_start(dentry);
-		/* err < 0: interrupted, err > 0: raced with another copy-up */
-		if (unlikely(err)) {
-			pr_debug("ovl_copy_up_start(%pd2) = %i\n", dentry, err);
-			if (err > 0)
-				err = 0;
-			goto out_done;
-		}
-
-		err = ovl_copy_up_locked(&ctx, &ovl_copy_up_tmpfile_ops);
-		ovl_copy_up_end(dentry);
+	if (S_ISREG(stat->mode) && ofs->tmpfile)
+		ops = &ovl_copy_up_tmpfile_ops;
+	else
+		ops = &ovl_copy_up_workdir_ops;
+
+	err = ops->aquire(&ctx);
+	/* err < 0: interrupted, err > 0: raced with another copy-up */
+	if (unlikely(err)) {
+		pr_debug("%s(%pd2): aquire = %i\n", __func__, dentry, err);
+		if (err > 0)
+			err = 0;
 		goto out_done;
 	}
 
-	err = -EIO;
-	if (lock_rename(ctx.tempdir, ctx.upperdir) != NULL) {
-		pr_err("overlayfs: failed to lock workdir+upperdir\n");
-		goto out_unlock;
-	}
-	if (ovl_dentry_upper(dentry)) {
-		/* Raced with another copy-up?  Nothing to do, then... */
-		err = 0;
-		goto out_unlock;
-	}
+	err = ovl_copy_up_locked(&ctx, ops);
+	ops->release(&ctx);
 
-	err = ovl_copy_up_locked(&ctx, &ovl_copy_up_workdir_ops);
-out_unlock:
-	unlock_rename(ctx.tempdir, ctx.upperdir);
 out_done:
 	do_delayed_call(&done);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 19/23] ovl: use ovl_inode mutex to synchronize concurrent copy up
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (17 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 18/23] ovl: generalize ovl_copy_up_one() " Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 20/23] ovl: implement index dir copy up method Amir Goldstein
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Use the new ovl_inode mutex to synchonize concurrent copy up
instead of the super block copy up workqueue.

Moving the synchronization object from the overlay dentry to
the overlay inode is needed for synchonizing concurrent copy up
of lower hardlinks to the same upper inode.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/ovl_entry.h |  2 --
 fs/overlayfs/super.c     |  1 -
 fs/overlayfs/util.c      | 24 ++++++------------------
 3 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 6e240bc5ac1d..288f52f24755 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -35,7 +35,6 @@ struct ovl_fs {
 	const struct cred *creator_cred;
 	bool tmpfile;
 	bool noxattr;
-	wait_queue_head_t copyup_wq;
 	/* sb common to all layers */
 	struct super_block *same_sb;
 };
@@ -50,7 +49,6 @@ struct ovl_entry {
 		struct {
 			u64 version;
 			const char *redirect;
-			bool copying;
 		};
 		struct rcu_head rcu;
 	};
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 7b48bfed0121..0f36a09df34c 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -878,7 +878,6 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	if (!ufs)
 		goto out;
 
-	init_waitqueue_head(&ufs->copyup_wq);
 	ufs->config.redirect_dir = ovl_redirect_dir_def;
 	ufs->config.index = ovl_index_def;
 	err = ovl_parse_opt((char *) data, &ufs->config);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 65c1e09ae0d0..9c729322d3b0 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -257,7 +257,7 @@ void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry)
 	struct ovl_entry *oe = dentry->d_fsdata;
 
 	WARN_ON(!inode_is_locked(upperdentry->d_parent->d_inode) &&
-		!oe->copying);
+		!mutex_is_locked(&OVL_I(d_inode(dentry))->oi_lock));
 	WARN_ON(oe->__upperdentry);
 	/*
 	 * Make sure upperdentry is consistent before making it visible to
@@ -368,32 +368,20 @@ struct file *ovl_path_open(struct path *path, int flags)
 
 int ovl_copy_up_start(struct dentry *dentry)
 {
-	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
-	struct ovl_entry *oe = dentry->d_fsdata;
 	int err;
 
-	spin_lock(&ofs->copyup_wq.lock);
-	err = wait_event_interruptible_locked(ofs->copyup_wq, !oe->copying);
-	if (!err) {
-		if (oe->__upperdentry)
-			err = 1; /* Already copied up */
-		else
-			oe->copying = true;
+	err = mutex_lock_interruptible(&OVL_I(d_inode(dentry))->oi_lock);
+	if (!err && ovl_dentry_upper(dentry)) {
+		err = 1; /* Already copied up */
+		mutex_unlock(&OVL_I(d_inode(dentry))->oi_lock);
 	}
-	spin_unlock(&ofs->copyup_wq.lock);
 
 	return err;
 }
 
 void ovl_copy_up_end(struct dentry *dentry)
 {
-	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
-	struct ovl_entry *oe = dentry->d_fsdata;
-
-	spin_lock(&ofs->copyup_wq.lock);
-	oe->copying = false;
-	wake_up_locked(&ofs->copyup_wq);
-	spin_unlock(&ofs->copyup_wq.lock);
+	mutex_unlock(&OVL_I(d_inode(dentry))->oi_lock);
 }
 
 bool ovl_check_dir_xattr(struct dentry *dentry, const char *name)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 20/23] ovl: implement index dir copy up method
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (18 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 19/23] ovl: use ovl_inode mutex to synchronize concurrent copy up Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 21/23] ovl: link up indexed lower hardlink on lookup Amir Goldstein
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Implement a copy up method for non-dir objects using index dir to
prevent breaking lower hardlinks on copy up.

This method requires that the inodes index dir feature was enabled and
that all underlying fs support NFS export.

On the first lower hardlink copy up, upper file is created in index dir,
named after the hex representation of the lower origin inode file handle.
On the second lower hardlink copy up, upper file is found in index dir,
by the same lower handle key.
On either case, the upper indexed inode is then linked to the copy up
upper path.

The index entry remains linked for future lower hardlink copy up and for
lower to upper inode map, that is needed for exporting overlayfs to NFS.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 212 +++++++++++++++++++++++++++++++++++++++++++++--
 fs/overlayfs/namei.c     |   4 +-
 fs/overlayfs/overlayfs.h |   3 +
 fs/overlayfs/util.c      |   9 ++
 4 files changed, 218 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 452560a17d7e..1f2007c35755 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -345,7 +345,8 @@ static int ovl_copy_up_inode(struct dentry *dentry, struct dentry *temp,
 
 	/*
 	 * Store identifier of lower inode in upper inode xattr to
-	 * allow lookup of the copy up origin inode.
+	 * allow lookup of the copy up origin inode. We do this last
+	 * to bless the inode, in case it was created in the index dir.
 	 */
 	err = ovl_set_origin(dentry, lowerpath->dentry, temp);
 
@@ -365,6 +366,7 @@ struct ovl_copy_up_ctx {
 	struct dentry *tempdir;
 	struct dentry *upper;
 	struct dentry *temp;
+	bool created;
 };
 
 struct ovl_copy_up_ops {
@@ -537,6 +539,192 @@ static const struct ovl_copy_up_ops ovl_copy_up_tmpfile_ops = {
 	.release = ovl_copy_up_tmpfile_release,
 };
 
+/*
+ * Copy up operations using index dir.
+ * Upper file is created in index dir, copied and linked to upperdir.
+ * The index entry remains to be used for mapping lower inode to upper inode.
+ */
+static int ovl_copy_up_indexdir_aquire(struct ovl_copy_up_ctx *ctx)
+{
+	return ovl_copy_up_start(ctx->dentry);
+}
+
+/*
+ * Set ctx->temp to a positive dentry with the index inode.
+ *
+ * Return 0 if entry was created by us and we need to copy the inode data.
+ *
+ * Return 1 if we found an index inode, in which case, we do not need
+ * to copy the inode data.
+ *
+ * May return -EEXISTS/-ENOENT if corrupt index entries are found.
+ */
+static int ovl_copy_up_indexdir_prepare(struct ovl_copy_up_ctx *ctx)
+{
+	struct dentry *upper;
+	struct dentry *index = NULL;
+	struct inode *inode;
+	int err;
+	struct cattr cattr = {
+		/* Can't properly set mode on creation because of the umask */
+		.mode = ctx->stat->mode & S_IFMT,
+		.rdev = ctx->stat->rdev,
+		.link = ctx->link,
+	};
+
+	upper = lookup_one_len_unlocked(ctx->dentry->d_name.name, ctx->upperdir,
+					ctx->dentry->d_name.len);
+	if (IS_ERR(upper))
+		return PTR_ERR(upper);
+
+	err = ovl_lookup_index(ctx->dentry, NULL, ctx->lowerpath->dentry,
+			       &index);
+	if (err)
+		goto out_dput;
+
+	inode = d_inode(index);
+	if (inode) {
+		/* Another lower hardlink already copied-up? */
+		err = -EEXIST;
+		if ((inode->i_mode & S_IFMT) != cattr.mode)
+			goto out_dput;
+
+		err = -ENOENT;
+		if (!inode->i_nlink)
+			goto out_dput;
+
+		/*
+		 * Verify that found index is a copy up of lower inode.
+		 * If index inode doesn't point back to lower inode via
+		 * origin file handle, then this is either a leftover from
+		 * failed copy up or an index dir entry before copying layers.
+		 * In both cases, we cannot use this index and must fail the
+		 * copy up. The failed copy up case will return -EEXISTS and
+		 * the copying layers case will return -ESTALE.
+		 */
+		err = ovl_verify_origin(index, ctx->lowerpath->mnt,
+					ctx->lowerpath->dentry, false);
+		if (err) {
+			if (err == -ENODATA)
+				err = -EEXIST;
+			goto out_dput;
+		}
+
+		if (inode->i_nlink < 2) {
+			/*
+			 * An orphan index inode can be created by copying up
+			 * a lower hardlink alias and then unlinking it. From
+			 * overlayfs perspective, this inode may still live if
+			 * there are more lower hardlinks and it should contain
+			 * the data of the upper inode that was unlinked. So if
+			 * an orphan inode is found in the index dir and we
+			 * should reuse it on copy up of another lower alias.
+			 *
+			 * TODO: keep account of nlink incremented by copy up
+			 * and account of nlink decremented by lower cover up.
+			 * When copyup_nlink + coverup_nlink == origin_nlink
+			 * and index_nlink == 1, need to replace the index entry
+			 * with a whiteout because all overlay references to the
+			 * index are gone.
+			 */
+			pr_warn_ratelimited("overlayfs: linking to orphan upper (%pd2, ino=%lu)\n",
+					    index, inode->i_ino);
+		}
+
+		/* Link to existing upper without copying lower */
+		err = 1;
+		goto out;
+	}
+
+	inode_lock_nested(d_inode(ctx->tempdir), I_MUTEX_PARENT);
+	err = ovl_create_real(d_inode(ctx->tempdir), index, &cattr, NULL, true);
+	inode_unlock(d_inode(ctx->tempdir));
+	if (err)
+		goto out_dput;
+
+	ctx->created = true;
+out:
+	ctx->upper = upper;
+	ctx->temp = index;
+	return err;
+
+out_dput:
+	pr_warn_ratelimited("overlayfs: failed to create index entry (%pd2, ino=%lu, err=%i)\n",
+			    index ? : ctx->lowerpath->dentry,
+			    inode ? inode->i_ino : 0, err);
+	pr_warn_ratelimited("overlayfs: try clearing index dir or mounting with '-o index=off' to disable inodes index.\n");
+	dput(upper);
+	dput(index);
+	return err;
+}
+
+static int ovl_copy_up_indexdir_commit(struct ovl_copy_up_ctx *ctx)
+{
+	int err;
+
+	inode_lock_nested(d_inode(ctx->upperdir), I_MUTEX_PARENT);
+	/* link the sucker ;) */
+	err = ovl_do_link(ctx->temp, d_inode(ctx->upperdir), ctx->upper, true);
+	/* Restore timestamps on parent (best effort) */
+	if (!err)
+		ovl_set_timestamps(ctx->upperdir, &ctx->pstat);
+	inode_unlock(d_inode(ctx->upperdir));
+
+	if (err)
+		goto out;
+
+	/*
+	 * Overlay inode nlink doesn't account for lower hardlinks that haven't
+	 * been copied up, so we need to update it on copy up. Otherwise, user
+	 * could decrement nlink below zero by unlinking copied up uppers.
+	 * On the first copy up, we set overlay inode nlink to upper inode nlink
+	 * and on following copy ups we increment it. In between, ovl_link()
+	 * could add more upper hardlinks and increment overlay nlink as well.
+	 */
+	if (ctx->created)
+		set_nlink(d_inode(ctx->dentry), d_inode(ctx->temp)->i_nlink);
+	else
+		inc_nlink(d_inode(ctx->dentry));
+
+	/* We can mark dentry is indexed before updating upperdentry */
+	ovl_dentry_set_indexed(ctx->dentry);
+
+out:
+	return err;
+}
+
+static void ovl_copy_up_indexdir_cancel(struct ovl_copy_up_ctx *ctx)
+{
+	struct inode *inode = d_inode(ctx->temp);
+
+	if (WARN_ON(!inode))
+		return;
+
+	/* Cleanup prepared index entry only if we created it */
+	if (!ctx->created)
+		return;
+
+	pr_warn_ratelimited("overlayfs: cleanup bad index (%pd2, ino=%lu, nlink=%u)\n",
+			    ctx->temp, inode->i_ino, inode->i_nlink);
+
+	inode_lock_nested(d_inode(ctx->tempdir), I_MUTEX_PARENT);
+	ovl_cleanup(d_inode(ctx->tempdir), ctx->temp);
+	inode_unlock(d_inode(ctx->tempdir));
+}
+
+static void ovl_copy_up_indexdir_release(struct ovl_copy_up_ctx *ctx)
+{
+	ovl_copy_up_end(ctx->dentry);
+}
+
+static const struct ovl_copy_up_ops ovl_copy_up_indexdir_ops = {
+	.aquire = ovl_copy_up_indexdir_aquire,
+	.prepare = ovl_copy_up_indexdir_prepare,
+	.commit = ovl_copy_up_indexdir_commit,
+	.cancel = ovl_copy_up_indexdir_cancel,
+	.release = ovl_copy_up_indexdir_release,
+};
+
 static int ovl_copy_up_locked(struct ovl_copy_up_ctx *ctx,
 			      const struct ovl_copy_up_ops *ops)
 {
@@ -559,12 +747,16 @@ static int ovl_copy_up_locked(struct ovl_copy_up_ctx *ctx,
 		revert_creds(old_creds);
 		put_cred(new_creds);
 	}
-	if (err)
+	if (err < 0)
 		goto out;
 
-	err = ovl_copy_up_inode(dentry, ctx->temp, ctx->lowerpath, ctx->stat);
-	if (err)
-		goto out_cancel;
+	/* err == 1 means we found an existing hardlinked upper inode */
+	if (!err) {
+		err = ovl_copy_up_inode(dentry, ctx->temp, ctx->lowerpath,
+					ctx->stat);
+		if (err)
+			goto out_cancel;
+	}
 
 	err = ops->commit(ctx);
 	if (err)
@@ -604,12 +796,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	struct path parentpath;
 	struct dentry *lowerdentry = lowerpath->dentry;
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
+	bool indexed = ovl_indexdir(dentry->d_sb) && !S_ISDIR(stat->mode);
 	struct ovl_copy_up_ctx ctx = {
 		.dentry = dentry,
 		.lowerpath = lowerpath,
 		.stat = stat,
 		.link = NULL,
-		.tempdir = ovl_workdir(dentry),
+		.tempdir = indexed ? ovl_indexdir(dentry->d_sb) :
+				     ovl_workdir(dentry),
 	};
 	const struct ovl_copy_up_ops *ops;
 
@@ -637,8 +831,10 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			return PTR_ERR(ctx.link);
 	}
 
-	/* Should we copyup with O_TMPFILE or with workdir? */
-	if (S_ISREG(stat->mode) && ofs->tmpfile)
+	/* Should we copyup with O_TMPFILE, with indexdir or with workdir? */
+	if (indexed)
+		ops = &ovl_copy_up_indexdir_ops;
+	else if (S_ISREG(stat->mode) && ofs->tmpfile)
 		ops = &ovl_copy_up_tmpfile_ops;
 	else
 		ops = &ovl_copy_up_workdir_ops;
diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 31f5ad774499..193177883dc7 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -381,8 +381,8 @@ int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
  * index dir was cleared. Either way, that index cannot be used to indentify
  * the overlay inode.
  */
-static int ovl_lookup_index(struct dentry *dentry, struct dentry *upper,
-			    struct dentry *lower, struct dentry **indexp)
+int ovl_lookup_index(struct dentry *dentry, struct dentry *upper,
+		     struct dentry *lower, struct dentry **indexp)
 {
 	struct ovl_fs *ofs = dentry->d_sb->s_fs_info;
 	struct ovl_fh *fh;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 0e801059311a..28dd0d59d998 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -206,6 +206,7 @@ bool ovl_dentry_is_opaque(struct dentry *dentry);
 bool ovl_dentry_is_impure(struct dentry *dentry);
 bool ovl_dentry_is_whiteout(struct dentry *dentry);
 void ovl_dentry_set_opaque(struct dentry *dentry);
+void ovl_dentry_set_indexed(struct dentry *dentry);
 bool ovl_redirect_dir(struct super_block *sb);
 const char *ovl_dentry_get_redirect(struct dentry *dentry);
 void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
@@ -235,6 +236,8 @@ static inline bool ovl_is_impuredir(struct dentry *dentry)
 /* namei.c */
 int ovl_verify_origin(struct dentry *dentry, struct vfsmount *mnt,
 		      struct dentry *origin, bool is_upper);
+int ovl_lookup_index(struct dentry *dentry, struct dentry *upper,
+		     struct dentry *lower, struct dentry **indexp);
 int ovl_path_next(int idx, struct dentry *dentry, struct path *path, int *idxp);
 struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags);
 bool ovl_lower_positive(struct dentry *dentry);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 9c729322d3b0..0211ec353adc 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -230,6 +230,15 @@ static void ovl_dentry_set_impure(struct dentry *dentry)
 	spin_unlock(&dentry->d_lock);
 }
 
+void ovl_dentry_set_indexed(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	spin_lock(&dentry->d_lock);
+	oe->__type |= __OVL_PATH_INDEX;
+	spin_unlock(&dentry->d_lock);
+}
+
 bool ovl_redirect_dir(struct super_block *sb)
 {
 	struct ovl_fs *ofs = sb->s_fs_info;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 21/23] ovl: link up indexed lower hardlink on lookup
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (19 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 20/23] ovl: implement index dir copy up method Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-19 10:22   ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 22/23] ovl: fix nlink leak in ovl_rename() Amir Goldstein
                   ` (2 subsequent siblings)
  23 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

With inodes index feature, all lower and upper hardlinks point to
the same overlay inode. However, when a lower hardlink is accessed
for read operation, the real inode operated on is not the same inode
as the real inode for read operation on an upper hardlink.

When lookup finds a lower hardlink, which is already indexed by
an earlier upper hardlink copy up, call ovl_copy_up() to link the
indexed upper on top of the lower hardlink and then operate on the
upper real inode to avoid this inconsistency.

Invalidate a lower indexed dentry on dcache lookup, so ovl_lookup()
is called to perform the index link up.

The following test demonstrates the upper/lower hardlinks inconsistency:

$ echo -n a > /lower/foo
$ ln /lower/foo /lower/bar
$ cd /mnt
$ echo -n b >> foo
$ tail foo bar # foo is indexed upper, bar is indexed lower
==> foo <==
ab
==> bar <==
a

$ echo -n c >> bar
$ tail foo bar # both aliases are indexed upper
==> foo <==
abc
==> bar <==
abc

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/namei.c | 11 ++++++++++-
 fs/overlayfs/super.c | 32 +++++++++++++++++++++++++++-----
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/fs/overlayfs/namei.c b/fs/overlayfs/namei.c
index 193177883dc7..2f4c02355060 100644
--- a/fs/overlayfs/namei.c
+++ b/fs/overlayfs/namei.c
@@ -654,13 +654,22 @@ struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 	oe->redirect = upperredirect;
 	oe->__upperdentry = upperdentry;
 	memcpy(oe->lowerstack, stack, sizeof(struct path) * ctr);
-	dput(index);
 	kfree(stack);
 	kfree(d.redirect);
 	dentry->d_fsdata = oe;
 	ovl_update_type(dentry, d.is_dir);
 	d_add(dentry, inode);
 
+	/* Link up indexed lower early for consistent overlay hardlinks */
+	if (OVL_TYPE_INDEX(type) && !upperdentry) {
+		err = ovl_copy_up(dentry);
+		if (err) {
+			pr_warn_ratelimited("overlayfs: failed link up to index (%pd2, index=%pd2, err=%i)\n",
+					    dentry, index, err);
+		}
+	}
+	dput(index);
+
 	return NULL;
 
 out_free_oe:
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 0f36a09df34c..82b036ee9e52 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -117,6 +117,33 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 	return dentry;
 }
 
+static int ovl_dentry_indexed_revalidate(struct dentry *dentry,
+					 unsigned int flags)
+{
+	enum ovl_path_type type = ovl_path_type(dentry);
+	bool is_upper;
+
+	if (d_is_dir(dentry) || d_is_negative(dentry))
+		return 1;
+
+	/*
+	 * Invalidate lower hardlink after it has been indexed by copy up
+	 * of another lower alias. ovl_lookup will trigger copy up of this
+	 * path and link the upper path to the upper index inode.
+	 */
+	ovl_inode_real(d_inode(dentry), &is_upper);
+	if (is_upper && !OVL_TYPE_UPPER(type))
+		return 0;
+
+	return 1;
+}
+
+static const struct dentry_operations ovl_dentry_operations = {
+	.d_release = ovl_dentry_release,
+	.d_real = ovl_d_real,
+	.d_revalidate = ovl_dentry_indexed_revalidate,
+};
+
 static int ovl_dentry_revalidate(struct dentry *dentry, unsigned int flags)
 {
 	struct ovl_entry *oe = dentry->d_fsdata;
@@ -158,11 +185,6 @@ static int ovl_dentry_weak_revalidate(struct dentry *dentry, unsigned int flags)
 	return ret;
 }
 
-static const struct dentry_operations ovl_dentry_operations = {
-	.d_release = ovl_dentry_release,
-	.d_real = ovl_d_real,
-};
-
 static const struct dentry_operations ovl_reval_dentry_operations = {
 	.d_release = ovl_dentry_release,
 	.d_real = ovl_d_real,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 22/23] ovl: fix nlink leak in ovl_rename()
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (20 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 21/23] ovl: link up indexed lower hardlink on lookup Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:26 ` [PATCH v3 23/23] ovl: adjust overlay inode nlink for indexed inodes Amir Goldstein
  2017-06-14  7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

This patch fixes an overlay inode nlink leak in the case where
ovl_rename() renames over a non-dir.

This is not so critical, because overlay inode doesn't rely on
nlink dropping to zero for inode deletion.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index f87cef27741f..cec89d33ab1e 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -890,6 +890,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 	bool old_opaque;
 	bool new_opaque;
 	bool cleanup_whiteout = false;
+	bool new_drop_nlink = false;
 	bool overwrite = !(flags & RENAME_EXCHANGE);
 	bool is_dir = d_is_dir(old);
 	bool new_is_dir = d_is_dir(new);
@@ -951,6 +952,8 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 			flags |= RENAME_EXCHANGE;
 			cleanup_whiteout = true;
 		}
+		if (!new_is_dir && new->d_inode)
+			new_drop_nlink = true;
 	}
 
 	old_upperdir = ovl_dentry_upper(old->d_parent);
@@ -1045,6 +1048,9 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 	if (cleanup_whiteout)
 		ovl_cleanup(old_upperdir->d_inode, newdentry);
 
+	if (new_drop_nlink)
+		drop_nlink(new->d_inode);
+
 	ovl_dentry_version_inc(old->d_parent);
 	ovl_dentry_version_inc(new->d_parent);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v3 23/23] ovl: adjust overlay inode nlink for indexed inodes
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (21 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 22/23] ovl: fix nlink leak in ovl_rename() Amir Goldstein
@ 2017-06-14  7:26 ` Amir Goldstein
  2017-06-14  7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
  23 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:26 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Overlay inode nlink does not account for all the lower hardlinks that
have not been copied up yet. Those are added to nlink as they get copied up
and do not decrement nlink when they get unlinked or renamed over.

As a workaround for correct overlay nlink count, before trying to unlink
or rename over the first lower hardlink, try to copy up that lower and
create the index prior to removal of upper.

An overlay inode nlink of 1 stands for an orphan indexed inode. This is
the case where some of the lower hardlinks were copied up, modified, and
then all copied up upper hardlinks has been unlinked, but there are still
non covered lower hardlinks.

The important thing to take care of is that overlay inode nlink doesn't
drop to zero when there are still upper hardlinks or non covered
lower hardlinks.

Return the overlay inode nlinks for indexed upper inodes on stat(2),
excluding 1 nlink for the index entry.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c   | 43 +++++++++++++++++++++++++++++++++++++++----
 fs/overlayfs/inode.c |  9 +++++++++
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index cec89d33ab1e..6c354bd4483e 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -732,9 +732,42 @@ static int ovl_remove_upper(struct dentry *dentry, bool is_dir)
 	return err;
 }
 
+static bool ovl_is_indexed_lower(struct dentry *dentry)
+{
+	enum ovl_path_type type = ovl_path_type(dentry);
+
+	/*
+	 * With inodes index feature, lower hardlinks are not counted in
+	 * overlay inode nlink. The lower hardlinks are added to overlay nlink
+	 * count as they get copied up and do not decrement overlay nlink when
+	 * they get unlinked or renamed over.
+	 *
+	 * An overlay inode nlink of 1 stands for an orphan indexed inode -
+	 * an inode that was copied up, modified, and then the copied up alias
+	 * has been unlinked. The orphan index can still be linked by another
+	 * lower hardlink copy up.
+	 *
+	 * Removing non indexed lower hardlinks would lead to negative overlay
+	 * nlink if we wanted to account for those removals in overlay nlink.
+	 * As a workaround, before whiteout/rename over of a lower hardlink,
+	 * try to copy up to create the upper index. Creating the upper index
+	 * will initialize the overlay nlink, do it could be dropped if unlink
+	 * or rename succeeds.
+	 * TODO: implement metadata only index copy up when called with
+	 *       ovl_copy_up_flags(dentry, O_PATH).
+	 */
+	if (ovl_indexdir(dentry->d_sb) && !OVL_TYPE_UPPER(type) &&
+	    ovl_dentry_lower(dentry)->d_inode->i_nlink > 1) {
+		ovl_copy_up(dentry);
+		type = ovl_path_type(dentry);
+	}
+
+	return !OVL_TYPE_UPPER(type) && OVL_TYPE_INDEX(type);
+}
+
 static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 {
-	enum ovl_path_type type;
+	bool indexed_lower;
 	int err;
 	const struct cred *old_cred;
 
@@ -746,7 +779,8 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 	if (err)
 		goto out_drop_write;
 
-	type = ovl_path_type(dentry);
+	if (!is_dir)
+		indexed_lower = ovl_is_indexed_lower(dentry);
 
 	old_cred = ovl_override_creds(dentry->d_sb);
 	if (!ovl_lower_positive(dentry))
@@ -757,7 +791,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 	if (!err) {
 		if (is_dir)
 			clear_nlink(dentry->d_inode);
-		else
+		else if (!indexed_lower)
 			drop_nlink(dentry->d_inode);
 	}
 out_drop_write:
@@ -952,7 +986,8 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 			flags |= RENAME_EXCHANGE;
 			cleanup_whiteout = true;
 		}
-		if (!new_is_dir && new->d_inode)
+		/* An indexed lower hardlink is not counted in overlay nlink */
+		if (!new_is_dir && new->d_inode && !ovl_is_indexed_lower(new))
 			new_drop_nlink = true;
 	}
 
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 28f9a8cc0f61..35fda6fb9de3 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -131,6 +131,15 @@ int ovl_getattr(const struct path *path, struct kstat *stat,
 	if (is_dir && OVL_TYPE_MERGE(type))
 		stat->nlink = 1;
 
+	/*
+	 * Return the overlay inode nlinks for indexed upper inodes, excluding
+	 * 1 nlink for the index entry. Overlay inode nlink does not account
+	 * for all the lower hardlinks that have not been copied up yet.
+	 * TODO: add count of non-covered lower hardlinks.
+	 */
+	if (!is_dir && OVL_TYPE_UPPER(type) && OVL_TYPE_INDEX(type))
+		stat->nlink = dentry->d_inode->i_nlink - 1;
+
 out:
 	revert_creds(old_cred);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/23] Overlayfs inodes index
  2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
                   ` (22 preceding siblings ...)
  2017-06-14  7:26 ` [PATCH v3 23/23] ovl: adjust overlay inode nlink for indexed inodes Amir Goldstein
@ 2017-06-14  7:30 ` Miklos Szeredi
  2017-06-14  7:48   ` Amir Goldstein
  23 siblings, 1 reply; 29+ messages in thread
From: Miklos Szeredi @ 2017-06-14  7:30 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: linux-unionfs@vger.kernel.org, linux-fsdevel

On Wed, Jun 14, 2017 at 9:26 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> Miklos,
>
> I've made all the changes we discussed on v2 review and now all
> upper/lower hardlinks are consistent. Note that I chose to invalidate
> lower indexed dentries instead of updating all aliases on first copy up
> (patch 21). That seemed simpler to me. Please let me know if you see any
> culprit with that approach.

If the alias is open, it won't receive the st_ino updates through
fstat(2) since it will still have the ref to the old dentry/old inode.
So it's more correct to update the aliases, but I guess it's also more
complex.


> I also re-introduced ovl_update_type()
> (patch 8) for easier and more compact storing of the INDEX type flag.
>
> The remaining work on persistent nlink accounting and orphan index
> cleanup is left for a differnt posting. Without that complementary work,
> the index entries will remain forever. I am not aware of any other
> culprit or correctness issues that may arrise from lack of index cleanup.
>
> This work introduces the inodes index opt-in feature, which provides:
> - Hardlinks are not broken on copy up
> - Infrastructure for overlayfs NFS export
>
> The work is available on my ovl-hardlinks branch [1].
> For the curious, documentation about how overlay NFS export would work
> with the inodes index is available in the linked commit [2].
>
> The most significant change w.r.t. vfs is that with inodes index,
> overlay inodes are hashed and unique throughout the lifecycle of an
> overlay object (i.e. across copy up). This is required for not breaking
> hardlinks on copy up.
>
> Hardlink copy up tests including coverage for concurrent copy up of
> lower hardlinks and lower/upper hardlinks consistency are available on
> my xfstests dev branch [3].
>
> This work also fixes constant st_ino for samefs case for lower hardlinks,
> so the few constant st_ino tests that fail on v4.12-rc1 due to copy up of
> lower hardlinks now pass [3][4].

Thanks, will have a look though.

Miklos

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 00/23] Overlayfs inodes index
  2017-06-14  7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
@ 2017-06-14  7:48   ` Amir Goldstein
  0 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14  7:48 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs@vger.kernel.org, linux-fsdevel

On Wed, Jun 14, 2017 at 10:30 AM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Wed, Jun 14, 2017 at 9:26 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Miklos,
>>
>> I've made all the changes we discussed on v2 review and now all
>> upper/lower hardlinks are consistent. Note that I chose to invalidate
>> lower indexed dentries instead of updating all aliases on first copy up
>> (patch 21). That seemed simpler to me. Please let me know if you see any
>> culprit with that approach.
>
> If the alias is open, it won't receive the st_ino updates through
> fstat(2) since it will still have the ref to the old dentry/old inode.
> So it's more correct to update the aliases, but I guess it's also more
> complex.
>

fstat(2) and st_ino are not a problem, because st_ino of lower is returned
both before and after link up. The issue is with other attributes like mtime.
However, if the alias is open, then read(2) will read the lower content, so you
may as well say that returning the lower mtime is a feature...

My main concern for indexed lower inconsistency was that ovl_inode_real()
may be upper, while ovl_dentry_real() is lower, especially when checking
overlay object access permissions.
The only places I found where this inconsistency might be exposed are
ovl_get_acl() and ovl_permission() and both those cases are relevant for
lookup and not for the "open alias" case. Right?

Amir.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 12/23] ovl: store upper/lower real inode in ovl_inode_info
  2017-06-14  7:26 ` [PATCH v3 12/23] ovl: store upper/lower real inode in ovl_inode_info Amir Goldstein
@ 2017-06-14 11:00   ` Amir Goldstein
  2017-06-16 11:55     ` Amir Goldstein
  0 siblings, 1 reply; 29+ messages in thread
From: Amir Goldstein @ 2017-06-14 11:00 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Wed, Jun 14, 2017 at 10:26 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> Introduce ovl_inode_info struct that is embedded in ovl_inode
> and contains a reference to lowerinode and/or upperinode.
>
> Storing the upper/lower real inode in ovl_inode_info replaces
> the method of storing realinode & ISUPPER flag in vfs inode
> i_private field.
>
> This will be used for hashing overlay inodes before copy up.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---

> +struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
> +{
> +       struct ovl_inode_info *oi = OVL_I_INFO(inode);
> +       struct inode *realinode;
> +
> +       realinode = READ_ONCE(oi->__upperinode);
> +       if (!realinode)
> +               realinode = oi->lowerinode;
> +       else if (is_upper)
> +               *is_upper = true;

oops, bug, not setting false (caught by ./run --ov rmdir):
        if (is_upper)
                *is_upper = !realinode;

> +
> +       return realinode;
>  }
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 12/23] ovl: store upper/lower real inode in ovl_inode_info
  2017-06-14 11:00   ` Amir Goldstein
@ 2017-06-16 11:55     ` Amir Goldstein
  0 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-16 11:55 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Wed, Jun 14, 2017 at 2:00 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Wed, Jun 14, 2017 at 10:26 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> Introduce ovl_inode_info struct that is embedded in ovl_inode
>> and contains a reference to lowerinode and/or upperinode.
>>
>> Storing the upper/lower real inode in ovl_inode_info replaces
>> the method of storing realinode & ISUPPER flag in vfs inode
>> i_private field.
>>
>> This will be used for hashing overlay inodes before copy up.
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>
>> +struct inode *ovl_inode_real(struct inode *inode, bool *is_upper)
>> +{
>> +       struct ovl_inode_info *oi = OVL_I_INFO(inode);
>> +       struct inode *realinode;
>> +
>> +       realinode = READ_ONCE(oi->__upperinode);
>> +       if (!realinode)
>> +               realinode = oi->lowerinode;
>> +       else if (is_upper)
>> +               *is_upper = true;
>
> oops, bug, not setting false (caught by ./run --ov rmdir):
>         if (is_upper)
>                 *is_upper = !realinode;
>

Embarrassing. I fixed the bug in a very buggy way and did not notice that it
broke my tests... should be:

        realinode = READ_ONCE(oi->__upperinode);
        if (is_upper)
                *is_upper = !!realinode;
        if (!realinode)
                realinode = oi->lowerinode;


Pushed the ovl-hardlinks branch with the fixed version.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v3 21/23] ovl: link up indexed lower hardlink on lookup
  2017-06-14  7:26 ` [PATCH v3 21/23] ovl: link up indexed lower hardlink on lookup Amir Goldstein
@ 2017-06-19 10:22   ` Amir Goldstein
  0 siblings, 0 replies; 29+ messages in thread
From: Amir Goldstein @ 2017-06-19 10:22 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs

On Wed, Jun 14, 2017 at 10:26 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> With inodes index feature, all lower and upper hardlinks point to
> the same overlay inode. However, when a lower hardlink is accessed
> for read operation, the real inode operated on is not the same inode
> as the real inode for read operation on an upper hardlink.
>
> When lookup finds a lower hardlink, which is already indexed by
> an earlier upper hardlink copy up, call ovl_copy_up() to link the
> indexed upper on top of the lower hardlink and then operate on the
> upper real inode to avoid this inconsistency.
>
> Invalidate a lower indexed dentry on dcache lookup, so ovl_lookup()
> is called to perform the index link up.
>
> The following test demonstrates the upper/lower hardlinks inconsistency:
>
> $ echo -n a > /lower/foo
> $ ln /lower/foo /lower/bar
> $ cd /mnt
> $ echo -n b >> foo
> $ tail foo bar # foo is indexed upper, bar is indexed lower
> ==> foo <==
> ab
> ==> bar <==
> a
>
> $ echo -n c >> bar
> $ tail foo bar # both aliases are indexed upper
> ==> foo <==
> abc
> ==> bar <==
> abc
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/namei.c | 11 ++++++++++-
>  fs/overlayfs/super.c | 32 +++++++++++++++++++++++++++-----
>  2 files changed, 37 insertions(+), 6 deletions(-)
>

> +
> +static const struct dentry_operations ovl_dentry_operations = {
> +       .d_release = ovl_dentry_release,
> +       .d_real = ovl_d_real,
> +       .d_revalidate = ovl_dentry_indexed_revalidate,
> +};
> +
>  static int ovl_dentry_revalidate(struct dentry *dentry, unsigned int flags)
>  {

ovl_dentry_indexed_revalidate() should also be checked within this variant.
Will fix for next posting.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2017-06-19 10:22 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-14  7:26 [PATCH v3 00/23] Overlayfs inodes index Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 01/23] vfs: introduce inode 'inuse' lock Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 02/23] ovl: get exclusive ownership on upper/work dirs Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 03/23] ovl: relax same fs constrain for ovl_check_origin() Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 04/23] ovl: generalize ovl_create_workdir() Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 05/23] ovl: introduce the inodes index dir feature Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 06/23] ovl: verify upper root dir matches lower root dir Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 07/23] ovl: verify index dir matches upper dir Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 08/23] ovl: store path type in dentry Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 09/23] ovl: cram dentry state booleans into type flags Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 10/23] ovl: lookup index entry for copy up origin Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 11/23] ovl: allocate an ovl_inode struct Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 12/23] ovl: store upper/lower real inode in ovl_inode_info Amir Goldstein
2017-06-14 11:00   ` Amir Goldstein
2017-06-16 11:55     ` Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 13/23] ovl: use ovl_inode_init() for initializing new inode Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 14/23] ovl: hash overlay non-dir inodes by copy up origin inode Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 15/23] ovl: defer upper dir lock to tempfile link Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 16/23] ovl: factor out ovl_copy_up_inode() helper Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 17/23] ovl: generalize ovl_copy_up_locked() using actors Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 18/23] ovl: generalize ovl_copy_up_one() " Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 19/23] ovl: use ovl_inode mutex to synchronize concurrent copy up Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 20/23] ovl: implement index dir copy up method Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 21/23] ovl: link up indexed lower hardlink on lookup Amir Goldstein
2017-06-19 10:22   ` Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 22/23] ovl: fix nlink leak in ovl_rename() Amir Goldstein
2017-06-14  7:26 ` [PATCH v3 23/23] ovl: adjust overlay inode nlink for indexed inodes Amir Goldstein
2017-06-14  7:30 ` [PATCH v3 00/23] Overlayfs inodes index Miklos Szeredi
2017-06-14  7:48   ` Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox