linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/10] Assorted fixes/enhancements for overlayfs
@ 2010-09-06  0:50 NeilBrown
  2010-09-06  0:50 ` [PATCH 02/10] ovl: minimal remount support NeilBrown
                   ` (10 more replies)
  0 siblings, 11 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

Hi Miklos,
 following are an assortment of suggested changes for overlayfs for
 your consideration.
 They pass some light testing.

Thanks,
NeilBrown


---

NeilBrown (10):
      ovl: small optimisation for ovl_lookup
      ovl: minimal remount support.
      ovl: use correct seek function for directories.
      ovl: initialise is_real before use.
      ovl: make sure fsync is never called on the lower filesystem.
      ovl: rename ovl_fill_cache to ovl_dir_read
      ovl: add initial revalidate support
      VFS: tiny optimisation in open_other handling.
      VFS: Remove read-only checks from dentry_permission
      ovl: Assorted updates to Documentation/filesystems/overlayfs.txt


 Documentation/filesystems/overlayfs.txt |   41 ++++---
 fs/namei.c                              |   15 ---
 fs/open.c                               |    2 
 fs/overlayfs/overlayfs.c                |  186 ++++++++++++++++++++++---------
 4 files changed, 152 insertions(+), 92 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 01/10] ovl: small optimisation for ovl_lookup
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
  2010-09-06  0:50 ` [PATCH 02/10] ovl: minimal remount support NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06 16:57   ` Miklos Szeredi
  2010-09-06  0:50 ` [PATCH 03/10] ovl: use correct seek function for directories NeilBrown
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

If we find a non-directory in the upper layer, there is no point
performing a lookup in the lower layer as whatever is found will just
be ignored anyway.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/overlayfs/overlayfs.c |   17 ++++++++---------
 1 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index 0fe3944..0ddfeec 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -495,7 +495,8 @@ static struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 				dput(upperdentry);
 				upperdentry = NULL;
 				ue->opaque = true;
-			}
+			} else if (!S_ISDIR(upperdentry->d_inode->i_mode))
+				ue->opaque = true;
 			revert_creds(old_cred);
 			put_cred(override_cred);
 		}
@@ -507,14 +508,12 @@ static struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
 			dput(upperdentry);
 			goto out_free;
 		}
-	}
-
-	if (lowerdentry && upperdentry &&
-	    (!S_ISDIR(upperdentry->d_inode->i_mode) ||
-	     !S_ISDIR(lowerdentry->d_inode->i_mode))) {
-		dput(lowerdentry);
-		lowerdentry = NULL;
-		ue->opaque = true;
+		if (lowerdentry && upperdentry &&
+		    !S_ISDIR(lowerdentry->d_inode->i_mode)) {
+			dput(lowerdentry);
+			lowerdentry = NULL;
+			ue->opaque = true;
+		}
 	}
 
 	if (lowerdentry || upperdentry) {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 03/10] ovl: use correct seek function for directories.
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
  2010-09-06  0:50 ` [PATCH 02/10] ovl: minimal remount support NeilBrown
  2010-09-06  0:50 ` [PATCH 01/10] ovl: small optimisation for ovl_lookup NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06  0:50 ` [PATCH 04/10] ovl: initialise is_real before use NeilBrown
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

It is incorrect to call generic_llseek_file on a file from a different
filesystem.  For that we must use the seek function that the
filesystem defines, which is called by vfs_llseek.

Also, we only want to seek the realfile when is_real is true.
Otherwise we just want to update our own f_pos pointer, so use
generic_llseek_file for that.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/overlayfs/overlayfs.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index 4e032e8..306de45 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -307,8 +307,11 @@ static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin)
 	loff_t res;
 	struct ovl_dir_file *od = file->private_data;
 
-	res = generic_file_llseek(od->realfile, offset, origin);
-	file->f_pos = od->realfile->f_pos;
+	if (od->is_real) {
+		res = vfs_llseek(od->realfile, offset, origin);
+		file->f_pos = od->realfile->f_pos;
+	} else
+		res = generic_file_llseek(file, offset, origin);
 
 	return res;
 }

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 02/10] ovl: minimal remount support.
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06  0:50 ` [PATCH 01/10] ovl: small optimisation for ovl_lookup NeilBrown
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

As overlayfs reflects the 'readonly' mount status in write-access to
the upper filesystem, we must handle remount and either drop or take
write access when the ro status changes.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/overlayfs/overlayfs.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index 0ddfeec..4e032e8 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -1685,8 +1685,28 @@ static void ovl_put_super(struct super_block *sb)
 	kfree(ufs);
 }
 
+static int ovl_remount_fs(struct super_block *sb, int *flagsp, char *data)
+{
+	int flags = *flagsp;
+	struct ovl_fs *ufs = sb->s_fs_info;
+
+	/* When remounting rw or ro, we need to adjust the write access to the
+	 * upper fs.
+	 */
+	if (((flags ^ sb->s_flags) & MS_RDONLY) == 0)
+		/* No change to readonly status */
+		return 0;
+
+	if (flags & MS_RDONLY) {
+		mnt_drop_write(ufs->upper_mnt);
+		return 0;
+	} else
+		return mnt_want_write(ufs->upper_mnt);
+}
+
 static const struct super_operations ovl_super_operations = {
 	.put_super	= ovl_put_super,
+	.remount_fs	= ovl_remount_fs,
 };
 
 struct ovl_config {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 04/10] ovl: initialise is_real before use.
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (2 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 03/10] ovl: use correct seek function for directories NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06  0:50 ` [PATCH 08/10] VFS: tiny optimisation in open_other handling NeilBrown
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

The previous patch can use od->is_real before it is properly
initialised is llseek is called before readdir.
So factor out the initialisation of is_real and call it from both
readdir and llseek when f_pos is 0.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/overlayfs/overlayfs.c |   24 +++++++++++++++++-------
 1 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index 306de45..de854e1 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -240,6 +240,17 @@ static int ovl_fill_cache(struct path *realpath, struct ovl_cache_callback *cb,
 	return 0;
 }
 
+static void ovl_dir_reset(struct file *file)
+{
+	struct ovl_dir_file *od = file->private_data;
+	struct ovl_entry *ue = file->f_path.dentry->d_fsdata;
+
+	ovl_cache_free(od->cache);
+	od->cache = NULL;
+
+	od->is_real = (!ue->lowerpath.dentry || !ue->upperpath.dentry);
+}
+
 static int ovl_readdir(struct file *file, void *buf, filldir_t filler)
 {
 	struct ovl_dir_file *od = file->private_data;
@@ -248,14 +259,10 @@ static int ovl_readdir(struct file *file, void *buf, filldir_t filler)
 	loff_t off;
 	int res = 0;
 
-	if (!file->f_pos) {
-		ovl_cache_free(od->cache);
-		od->cache = NULL;
-		od->is_real = false;
-	}
+	if (!file->f_pos)
+		ovl_dir_reset(file);
 
-	if (od->is_real || !ue->lowerpath.dentry || !ue->upperpath.dentry) {
-		od->is_real = true;
+	if (od->is_real) {
 		res = vfs_readdir(od->realfile, filler, buf);
 		file->f_pos = od->realfile->f_pos;
 
@@ -307,6 +314,9 @@ static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin)
 	loff_t res;
 	struct ovl_dir_file *od = file->private_data;
 
+	if (!file->f_pos)
+		ovl_dir_reset(file);
+
 	if (od->is_real) {
 		res = vfs_llseek(od->realfile, offset, origin);
 		file->f_pos = od->realfile->f_pos;



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 05/10] ovl: make sure fsync is never called on the lower filesystem.
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (7 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 06/10] ovl: rename ovl_fill_cache to ovl_dir_read NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06 17:16   ` Miklos Szeredi
  2010-09-06  0:50 ` [PATCH 10/10] ovl: Assorted updates to Documentation/filesystems/overlayfs.txt NeilBrown
  2010-09-06 19:23 ` [PATCH 00/10] Assorted fixes/enhancements for overlayfs Miklos Szeredi
  10 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

If we open a directory O_RDONLY, ->realfile could be on the lower
filesystem.
While fsync is not likely to be called on such a file, we should make
sure never to pass the fsync down as we are treating the lower
filesystem as read-only.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/overlayfs/overlayfs.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index de854e1..f9eea96 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -329,8 +329,15 @@ static loff_t ovl_dir_llseek(struct file *file, loff_t offset, int origin)
 static int ovl_dir_fsync(struct file *file, int datasync)
 {
 	struct ovl_dir_file *od = file->private_data;
+	struct ovl_entry *ue = file->f_path.dentry->d_fsdata;
 
-	return vfs_fsync(od->realfile, datasync);
+	/* realfile could be on lowerdir, but only if this was a read-only open,
+	 * in which case we can ignore the fsync.
+	 */
+	if (file->f_path.dentry == ue->upperpath.dentry)
+		return vfs_fsync(od->realfile, datasync);
+	else
+		return 0;
 }
 
 static int ovl_dir_release(struct inode *inode, struct file *file)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 06/10] ovl: rename ovl_fill_cache to ovl_dir_read
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (6 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 07/10] ovl: add initial revalidate support NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06  0:50 ` [PATCH 05/10] ovl: make sure fsync is never called on the lower filesystem NeilBrown
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

ovl_fill_cache isn't only used for filling the cache - it is also used
to remove all whiteout files.  This can be a bit confusing.
So rename it to a more generic "ovl_dir_read" and rename the data
structure that is passed to it to "ovl_readdir_data".

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/overlayfs/overlayfs.c |   28 ++++++++++++++--------------
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index f9eea96..656cad7 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -88,7 +88,7 @@ struct ovl_cache_entry {
 	bool is_whiteout;
 };
 
-struct ovl_cache_callback {
+struct ovl_readdir_data {
 	struct ovl_cache_entry *list;
 	struct ovl_cache_entry **endp;
 	struct path path;
@@ -102,7 +102,7 @@ struct ovl_dir_file {
 	struct file *realfile;
 };
 
-static int ovl_cache_add_entry(struct ovl_cache_callback *cb,
+static int ovl_cache_add_entry(struct ovl_readdir_data *cb,
 				 const char *name, int namelen, u64 ino,
 				 unsigned int d_type, bool is_whiteout)
 {
@@ -161,7 +161,7 @@ static int ovl_cache_find_entry(struct ovl_cache_entry *start,
 static int ovl_fill_lower(void *buf, const char *name, int namlen,
 			    loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct ovl_cache_callback *cb = buf;
+	struct ovl_readdir_data *cb = buf;
 
 	cb->count++;
 	if (!ovl_cache_find_entry(cb->list, name, namlen))
@@ -173,7 +173,7 @@ static int ovl_fill_lower(void *buf, const char *name, int namlen,
 static int ovl_fill_upper(void *buf, const char *name, int namlen,
 			  loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct ovl_cache_callback *cb = buf;
+	struct ovl_readdir_data *cb = buf;
 	bool is_whiteout = false;
 
 	cb->count++;
@@ -195,7 +195,7 @@ out:
 	return cb->err;
 }
 
-static int ovl_fill_cache(struct path *realpath, struct ovl_cache_callback *cb,
+static int ovl_dir_read(struct path *realpath, struct ovl_readdir_data *cb,
 			  filldir_t filler)
 {
 	const struct cred *old_cred;
@@ -270,15 +270,15 @@ static int ovl_readdir(struct file *file, void *buf, filldir_t filler)
 	}
 
 	if (!od->cache) {
-		struct ovl_cache_callback cb = {
+		struct ovl_readdir_data cb = {
 			.list = NULL,
 			.endp = &cb.list,
 			.path = ue->upperpath,
 		};
 
-		res = ovl_fill_cache(&ue->upperpath, &cb, ovl_fill_upper);
+		res = ovl_dir_read(&ue->upperpath, &cb, ovl_fill_upper);
 		if (!res) {
-			res = ovl_fill_cache(&ue->lowerpath, &cb,
+			res = ovl_dir_read(&ue->lowerpath, &cb,
 					     ovl_fill_lower);
 		}
 		if (res)
@@ -1292,18 +1292,18 @@ static int ovl_check_empty_dir(struct dentry *dentry)
 	int err;
 	struct ovl_entry *ue = dentry->d_fsdata;
 	struct ovl_cache_entry *p;
-	struct ovl_cache_callback cb = {
+	struct ovl_readdir_data cb = {
 		.list = NULL,
 		.endp = &cb.list,
 		.path = ue->upperpath,
 	};
 
 	if (ue->upperpath.dentry) {
-		err = ovl_fill_cache(&ue->upperpath, &cb, ovl_fill_upper);
+		err = ovl_dir_read(&ue->upperpath, &cb, ovl_fill_upper);
 		if (err)
 			return err;
 	}
-	err = ovl_fill_cache(&ue->lowerpath, &cb, ovl_fill_lower);
+	err = ovl_dir_read(&ue->lowerpath, &cb, ovl_fill_lower);
 	if (err)
 		return err;
 
@@ -1330,7 +1330,7 @@ static int ovl_check_empty_dir(struct dentry *dentry)
 static int ovl_unlink_whiteout(void *buf, const char *name, int namlen,
 				 loff_t offset, u64 ino, unsigned int d_type)
 {
-	struct ovl_cache_callback *cb = buf;
+	struct ovl_readdir_data *cb = buf;
 
 	cb->count++;
 	/* check d_type to filter out "." and ".." */
@@ -1352,7 +1352,7 @@ static int ovl_unlink_whiteout(void *buf, const char *name, int namlen,
 static int ovl_remove_whiteouts(struct dentry *dentry)
 {
 	struct ovl_entry *ue = dentry->d_fsdata;
-	struct ovl_cache_callback cb = {
+	struct ovl_readdir_data cb = {
 		.list = NULL,
 		.path = ue->upperpath,
 	};
@@ -1360,7 +1360,7 @@ static int ovl_remove_whiteouts(struct dentry *dentry)
 	if (!ue->upperpath.dentry)
 		return 0;
 
-	return ovl_fill_cache(&ue->upperpath, &cb, ovl_unlink_whiteout);
+	return ovl_dir_read(&ue->upperpath, &cb, ovl_unlink_whiteout);
 }
 
 static int ovl_rmdir(struct inode *dir, struct dentry *dentry)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 07/10] ovl: add initial revalidate support
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (5 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 09/10] VFS: Remove read-only checks from dentry_permission NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-16 14:47   ` Miklos Szeredi
  2010-09-06  0:50 ` [PATCH 06/10] ovl: rename ovl_fill_cache to ovl_dir_read NeilBrown
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

Add dentry_revalidate method and fail validation of either the
upper or lower dentry has been renamed or unlinked directly in the
otherlying filesystem.
This allows such changes to appear promptly in the overlay providing
the file isn't currently in use.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/overlayfs/overlayfs.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 55 insertions(+), 0 deletions(-)

diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index 656cad7..cdeafa7 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -409,9 +409,64 @@ static void ovl_dentry_iput(struct dentry *dentry, struct inode *inode)
 	iput(inode);
 }
 
+static int ovl_still_valid(struct dentry *dentry,
+			     struct dentry *parent, struct dentry *child)
+{
+	/* dentry is in the overlay filesystem
+	 * child is the corresponding dentry in the upper or lower layer
+	 * parent is the corresponding dentry of dentry's parent in the same layer
+	 *
+	 * If child is NULL, the parent must be NULL or negative.
+	 * Otherwise child must be hashed, have parent as the d_parent, and
+	 * have the same name as dentry
+	 *
+	 */
+	struct qstr *qstr;
+	int rv;
+
+	if (child == NULL)
+		return (parent == NULL || parent->d_inode == NULL);
+
+	if (child->d_parent != parent)
+		return 0;
+	if (d_unhashed(child))
+		return 0;
+
+	/* Unfortunately we need d_lock to compare names */
+	spin_lock(&dentry->d_lock);
+	spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED);
+	qstr = &child->d_name;
+	if (parent->d_op && parent->d_op->d_compare)
+		rv = ! (parent->d_op->d_compare(parent, qstr, &dentry->d_name));
+	else
+		rv =  (qstr->len == dentry->d_name.len &&
+		       memcmp(qstr->name, dentry->d_name.name, qstr->len) == 0);
+
+	spin_unlock(&child->d_lock);
+	spin_unlock(&dentry->d_lock);
+	return rv;
+}
+
+static int ovl_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
+{
+	/* We need to invalidate this dentry if the either upperpath or lowerpath
+	 * has been changed
+	 */
+	struct dentry *parent = dget_parent(dentry);
+	struct ovl_entry *ue = dentry->d_fsdata;
+	struct ovl_entry *pue = parent->d_fsdata;
+	int rv;
+
+	rv = (ovl_still_valid(dentry, pue->upperpath.dentry, ue->upperpath.dentry) &&
+	      ovl_still_valid(dentry, pue->lowerpath.dentry, ue->lowerpath.dentry));
+	dput(parent);
+	return rv;
+}
+
 static const struct dentry_operations ovl_dentry_operations = {
 	.d_release = ovl_dentry_release,
 	.d_iput = ovl_dentry_iput,
+	.d_revalidate = ovl_dentry_revalidate,
 };
 
 static struct inode *ovl_new_inode(struct super_block *sb, umode_t mode)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 08/10] VFS: tiny optimisation in open_other handling.
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (3 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 04/10] ovl: initialise is_real before use NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06  0:50 ` [PATCH 09/10] VFS: Remove read-only checks from dentry_permission NeilBrown
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

If IS_ERR(ret), then ret != NULL, so if we are performing the second
test we don't need the first.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/open.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 89e4f34..b986ca1 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -687,7 +687,7 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
 	if (!open && f->f_op && f->f_op->open_other) {
 		/* NULL means keep f, non-error non-null means replace */
 		ret = f->f_op->open_other(f);
-		if (IS_ERR(ret) || ret != NULL)
+		if (ret)
 			goto cleanup_all;
 	} else {
 		if (!open && f->f_op)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 09/10] VFS: Remove read-only checks from dentry_permission
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (4 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 08/10] VFS: tiny optimisation in open_other handling NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06 19:10   ` Miklos Szeredi
  2010-09-07 15:58   ` Dave Hansen
  2010-09-06  0:50 ` [PATCH 07/10] ovl: add initial revalidate support NeilBrown
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, Dave Hansen, linux-kernel

It is not sufficient to depend on the the "filesystem is readonly"
tests in dentry_permission as it does not check if the vfsmnt is
readonly.
All call sites already call mnt_want_write or __mnt_is_readonly which
includes the test on MS_RDONLY.

So remove this test from dentry_permission as it simply duplicates
tests done elsewhere.

This allows ovl_permission to be significantly simplified.


Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/namei.c               |   15 +--------------
 fs/overlayfs/overlayfs.c |   26 ++------------------------
 2 files changed, 3 insertions(+), 38 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 6042564..291a839 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -254,22 +254,9 @@ int dentry_permission(struct dentry *dentry, int mask)
 	struct inode *inode = dentry->d_inode;
 	int retval;
 
-	if (mask & MAY_WRITE) {
-		umode_t mode = inode->i_mode;
-
-		/*
-		 * Nobody gets write access to a read-only fs.
-		 */
-		if (IS_RDONLY(inode) &&
-		    (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
-			return -EROFS;
-
-		/*
-		 * Nobody gets write access to an immutable file.
-		 */
+	if (mask & MAY_WRITE)
 		if (IS_IMMUTABLE(inode))
 			return -EACCES;
-	}
 
 	if (inode->i_op->permission)
 		retval = inode->i_op->permission(dentry, mask);
diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
index cdeafa7..749109a 100644
--- a/fs/overlayfs/overlayfs.c
+++ b/fs/overlayfs/overlayfs.c
@@ -1127,31 +1127,9 @@ static int ovl_dir_getattr(struct vfsmount *mnt, struct dentry *dentry,
 static int ovl_permission(struct dentry *dentry, int mask)
 {
 	struct ovl_entry *ue = dentry->d_fsdata;
-	struct inode *inode;
-	int err;
-
-	if (ue->upperpath.dentry)
-		return dentry_permission(ue->upperpath.dentry, mask);
-
-	inode = ue->lowerpath.dentry->d_inode;
-	if (!(mask & MAY_WRITE) || special_file(inode->i_mode))
-		return dentry_permission(ue->lowerpath.dentry, mask);
-
-	/* Don't check for read-only fs */
-	if (mask & MAY_WRITE) {
-		if (IS_IMMUTABLE(inode))
-			return -EACCES;
-	}
-
-	if (inode->i_op->permission)
-		err = inode->i_op->permission(ue->lowerpath.dentry, mask);
-	else
-		err = generic_permission(inode, mask, inode->i_op->check_acl);
-
-	if (err)
-		return err;
+	struct path *realpath = ovl_path(ue);
 
-	return security_inode_permission(inode, mask);
+	return dentry_permission(realpath->dentry, mask);
 }
 
 static int ovl_create_object(struct dentry *dentry, int mode, dev_t rdev,

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 10/10] ovl: Assorted updates to Documentation/filesystems/overlayfs.txt
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (8 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 05/10] ovl: make sure fsync is never called on the lower filesystem NeilBrown
@ 2010-09-06  0:50 ` NeilBrown
  2010-09-06 19:23 ` [PATCH 00/10] Assorted fixes/enhancements for overlayfs Miklos Szeredi
  10 siblings, 0 replies; 20+ messages in thread
From: NeilBrown @ 2010-09-06  0:50 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

Signed-off-by: NeilBrown <neilb@suse.de>
---
 Documentation/filesystems/overlayfs.txt |   41 ++++++++++++++++---------------
 1 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 02af2d6..4e5f972 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -4,32 +4,33 @@ Overlay Filesystem
 ==================
 
 This document describes a prototype for a new approach to providing
-union-filesystem functionality in Linux.  A union-filesystem tries to
-present the union of two different filesystems as though it were a
-single filesystem.  The result will inevitably fail to look exactly
-like a normal filesystem for various technical reasons.  The
-expectation is that many use cases will be able to ignore these
-differences.
+overlay-filesystem functionality in Linux (sometimes referred to as
+union-filesystems).  An overlay-filesystem tries to present a
+filesystem which is the result over overlaying one filesystem on top
+of the other.
+
+The result will inevitably fail to look exactly like a normal
+filesystem for various technical reasons.  The expectation is that
+many use cases will be able to ignore these differences.
 
 This approach is 'hybrid' because the objects that appear in the
 filesystem do not all appear to belong to that filesystem.  In many
 case an object accessed in the union will be indistinguishable
 from accessing the corresponding object from the original filesystem.
 This is most obvious from the 'st_dev' field returned by stat(2).
-Some objects will report an st_dev from one original filesystem, some
-from the other, directories will report an st_dev from the union
-itself.  Similarly st_ino will only be unique when combined with
-st_dev, and both of these can change over the lifetime of a
-non-directory object.  Many applications and tools ignore these values
-and will not be affected.
+
+While directories will report an st_dev for the overlay-filesystem,
+all non-directory objects will report an st_dev whichever of the
+'lower' or 'upper' filesystem that is providing the object.  Similarly
+st_ino will only be unique when combined with st_dev, and both of
+these can change over the lifetime of a non-directory object.  Many
+applications and tools ignore these values and will not be affected.
 
 Upper and Lower
 ---------------
 
 An overlay filesystem combines two filesystems - an 'upper' filesystem
-and a 'lower' filesystem.  Note that while in set theory, 'union' is a
-commutative operation, in filesystems it is not - the two filesystems
-are treated differently.  When a name exists in both filesystems, the
+and a 'lower' filesystem.  When a name exists in both filesystems, the
 object in the 'upper' filesystem is visible while the object in the
 'lower' filesystem is either hidden or, in the case of directories,
 merged with the 'upper' object.
@@ -48,14 +49,14 @@ trusted.* extended attributes, and must provide valid d_type in
 readdir responses, at least for symbolic links - so NFS is not
 suitable.
 
-A read-only union of two read-only filesystems may use any filesystem
-type.
+A read-only overlay of two read-only filesystems may use any
+filesystem type.
 
 Directories
 -----------
 
-Unioning mainly involved directories.  If a given name appears in both
-upper ad lower filesystems and refers to a non-directory in either,
+Overlaying mainly involved directories.  If a given name appears in both
+upper and lower filesystems and refers to a non-directory in either,
 then the lower object is hidden - the name refers only to the upper
 object.
 
@@ -67,7 +68,7 @@ into a merged directory.  Then whenever a lookup is requested in such
 a merged directory, the lookup is performed in each actual directory
 and the combined result is cached in the dentry belonging to the overlay
 filesystem.  If both actual lookups find directories, both are stored
-and a merged directory is create, otherwise only one is stored: the
+and a merged directory is created, otherwise only one is stored: the
 upper if it exists, else the lower.
 
 Only the lists of names from directories are merged.  Other content

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 01/10] ovl: small optimisation for ovl_lookup
  2010-09-06  0:50 ` [PATCH 01/10] ovl: small optimisation for ovl_lookup NeilBrown
@ 2010-09-06 16:57   ` Miklos Szeredi
  0 siblings, 0 replies; 20+ messages in thread
From: Miklos Szeredi @ 2010-09-06 16:57 UTC (permalink / raw)
  To: NeilBrown; +Cc: miklos, linux-fsdevel, linux-kernel

On Mon, 06 Sep 2010, NeilBrown wrote:
> If we find a non-directory in the upper layer, there is no point
> performing a lookup in the lower layer as whatever is found will just
> be ignored anyway.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/overlayfs/overlayfs.c |   17 ++++++++---------
>  1 files changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
> index 0fe3944..0ddfeec 100644
> --- a/fs/overlayfs/overlayfs.c
> +++ b/fs/overlayfs/overlayfs.c
> @@ -495,7 +495,8 @@ static struct dentry *ovl_lookup(struct inode *dir, struct dentry *dentry,
>  				dput(upperdentry);
>  				upperdentry = NULL;
>  				ue->opaque = true;
> -			}
> +			} else if (!S_ISDIR(upperdentry->d_inode->i_mode))
> +				ue->opaque = true;

Setting opaque unconditionally, without checking whether the lower
file exists or not, means on unlink we'll create a whiteout even when
not necessary.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 05/10] ovl: make sure fsync is never called on the lower filesystem.
  2010-09-06  0:50 ` [PATCH 05/10] ovl: make sure fsync is never called on the lower filesystem NeilBrown
@ 2010-09-06 17:16   ` Miklos Szeredi
  0 siblings, 0 replies; 20+ messages in thread
From: Miklos Szeredi @ 2010-09-06 17:16 UTC (permalink / raw)
  To: NeilBrown; +Cc: miklos, linux-fsdevel, linux-kernel

On Mon, 06 Sep 2010, NeilBrown wrote:
> If we open a directory O_RDONLY, ->realfile could be on the lower
> filesystem.
> While fsync is not likely to be called on such a file, we should make
> sure never to pass the fsync down as we are treating the lower
> filesystem as read-only.

Why not?  fsync() works fine on an r/o filesystem.  It doesn't do
much, probably, but that's not a reason to add additional complexity.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 09/10] VFS: Remove read-only checks from dentry_permission
  2010-09-06  0:50 ` [PATCH 09/10] VFS: Remove read-only checks from dentry_permission NeilBrown
@ 2010-09-06 19:10   ` Miklos Szeredi
  2010-09-08  7:47     ` Neil Brown
  2010-09-07 15:58   ` Dave Hansen
  1 sibling, 1 reply; 20+ messages in thread
From: Miklos Szeredi @ 2010-09-06 19:10 UTC (permalink / raw)
  To: NeilBrown; +Cc: miklos, linux-fsdevel, haveblue, linux-kernel

On Mon, 06 Sep 2010, NeilBrown wrote:
> It is not sufficient to depend on the the "filesystem is readonly"
> tests in dentry_permission as it does not check if the vfsmnt is
> readonly.
> All call sites already call mnt_want_write or __mnt_is_readonly which
> includes the test on MS_RDONLY.

Last time I checked I found some holes (in nfsd IIRC).  That was a
long time ago and things may have changed.

That check could be replaced with a

		if (IS_RDONLY(inode) &&
		    (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
			BUG();

which would catch these cases but only if the superblock was marked
r/o.  Otherwise it's basically impossible to make sure the callers of
the VFS play by the rules.  That was one reason I advocated a
path_... interface for the VFS instead of the current dentry based
one, but Al didn't like it.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 00/10] Assorted fixes/enhancements for overlayfs
  2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
                   ` (9 preceding siblings ...)
  2010-09-06  0:50 ` [PATCH 10/10] ovl: Assorted updates to Documentation/filesystems/overlayfs.txt NeilBrown
@ 2010-09-06 19:23 ` Miklos Szeredi
  10 siblings, 0 replies; 20+ messages in thread
From: Miklos Szeredi @ 2010-09-06 19:23 UTC (permalink / raw)
  To: NeilBrown; +Cc: miklos, linux-fsdevel, linux-kernel

Hi Neil,

On Mon, 06 Sep 2010, NeilBrown wrote:
> Hi Miklos,
>  following are an assortment of suggested changes for overlayfs for
>  your consideration.
>  They pass some light testing.
> 
> Thanks,
> NeilBrown
> 
> 
> ---
> 
> NeilBrown (10):
>       ovl: small optimisation for ovl_lookup
>       ovl: minimal remount support.
>       ovl: use correct seek function for directories.
>       ovl: initialise is_real before use.
>       ovl: make sure fsync is never called on the lower filesystem.
>       ovl: rename ovl_fill_cache to ovl_dir_read
>       ovl: add initial revalidate support
>       VFS: tiny optimisation in open_other handling.
>       VFS: Remove read-only checks from dentry_permission
>       ovl: Assorted updates to Documentation/filesystems/overlayfs.txt

Thanks!

I applied the rest of the patches (except those three I commented on)
and pushed out to the overlayfs branch.

Miklos

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 09/10] VFS: Remove read-only checks from dentry_permission
  2010-09-06  0:50 ` [PATCH 09/10] VFS: Remove read-only checks from dentry_permission NeilBrown
  2010-09-06 19:10   ` Miklos Szeredi
@ 2010-09-07 15:58   ` Dave Hansen
  1 sibling, 0 replies; 20+ messages in thread
From: Dave Hansen @ 2010-09-07 15:58 UTC (permalink / raw)
  To: NeilBrown; +Cc: Miklos Szeredi, linux-fsdevel, Dave Hansen, linux-kernel

On Mon, 2010-09-06 at 10:50 +1000, NeilBrown wrote:
> It is not sufficient to depend on the the "filesystem is readonly"
> tests in dentry_permission as it does not check if the vfsmnt is
> readonly.
> All call sites already call mnt_want_write or __mnt_is_readonly which
> includes the test on MS_RDONLY.
> 
> So remove this test from dentry_permission as it simply duplicates
> tests done elsewhere.

Yeah, I remember auditing all of the dentry_permission() callers and
moving them over to mnt_want_write() and friends as well.  I don't quite
remember why we didn't take this step back then, but it looks sane to
me.

-- Dave

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 09/10] VFS: Remove read-only checks from dentry_permission
  2010-09-06 19:10   ` Miklos Szeredi
@ 2010-09-08  7:47     ` Neil Brown
  2010-09-08  8:58       ` Miklos Szeredi
  0 siblings, 1 reply; 20+ messages in thread
From: Neil Brown @ 2010-09-08  7:47 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, haveblue, linux-kernel

On Mon, 06 Sep 2010 21:10:10 +0200
Miklos Szeredi <miklos@szeredi.hu> wrote:

> On Mon, 06 Sep 2010, NeilBrown wrote:
> > It is not sufficient to depend on the the "filesystem is readonly"
> > tests in dentry_permission as it does not check if the vfsmnt is
> > readonly.
> > All call sites already call mnt_want_write or __mnt_is_readonly which
> > includes the test on MS_RDONLY.
> 
> Last time I checked I found some holes (in nfsd IIRC).  That was a
> long time ago and things may have changed.

nfsd looks OK to me.  I didn't do an exhaustive audit but couldn't find
things that would not still work correctly.


> 
> That check could be replaced with a
> 
> 		if (IS_RDONLY(inode) &&
> 		    (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
> 			BUG();

That wouldn't quite work currently.
sys_faccessat checks __mnt_is_readonly *after* the call to dentry_permission,
so the above would cause it to BUG.  Possibly the __mnt_is_readonly could be
checked before dentry_permission.

However nfsd_permission is a bit more awkward to fix as sometimes it
deliberately wants to ignore read-only-filesystem issues ... but it might
still be possible to work around..

> 
> which would catch these cases but only if the superblock was marked
> r/o.  Otherwise it's basically impossible to make sure the callers of
> the VFS play by the rules.  That was one reason I advocated a
> path_... interface for the VFS instead of the current dentry based
> one, but Al didn't like it.
>

I guess there comes a point were we just have to document the rules and if
someone doesn't play by them - that is a bug...

NeilBrown

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 09/10] VFS: Remove read-only checks from dentry_permission
  2010-09-08  7:47     ` Neil Brown
@ 2010-09-08  8:58       ` Miklos Szeredi
  0 siblings, 0 replies; 20+ messages in thread
From: Miklos Szeredi @ 2010-09-08  8:58 UTC (permalink / raw)
  To: Neil Brown; +Cc: miklos, linux-fsdevel, haveblue, linux-kernel

On Wed, 8 Sep 2010, Neil Brown wrote:
> On Mon, 06 Sep 2010 21:10:10 +0200
> Miklos Szeredi <miklos@szeredi.hu> wrote:
> 
> > On Mon, 06 Sep 2010, NeilBrown wrote:
> > > It is not sufficient to depend on the the "filesystem is readonly"
> > > tests in dentry_permission as it does not check if the vfsmnt is
> > > readonly.
> > > All call sites already call mnt_want_write or __mnt_is_readonly which
> > > includes the test on MS_RDONLY.
> > 
> > Last time I checked I found some holes (in nfsd IIRC).  That was a
> > long time ago and things may have changed.
> 
> nfsd looks OK to me.  I didn't do an exhaustive audit but couldn't find
> things that would not still work correctly.

Looks to me as if nfsd_setattr() and some of its callers fail to
acquire a write ref.

> > 
> > That check could be replaced with a
> > 
> > 		if (IS_RDONLY(inode) &&
> > 		    (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
> > 			BUG();
> 
> That wouldn't quite work currently.
> sys_faccessat checks __mnt_is_readonly *after* the call to dentry_permission,

Ugh.

> so the above would cause it to BUG.  Possibly the __mnt_is_readonly could be
> checked before dentry_permission.
> 
> However nfsd_permission is a bit more awkward to fix as sometimes it
> deliberately wants to ignore read-only-filesystem issues ... but it might
> still be possible to work around..
> 
> > 
> > which would catch these cases but only if the superblock was marked
> > r/o.  Otherwise it's basically impossible to make sure the callers of
> > the VFS play by the rules.  That was one reason I advocated a
> > path_... interface for the VFS instead of the current dentry based
> > one, but Al didn't like it.
> >
> 
> I guess there comes a point were we just have to document the rules and if
> someone doesn't play by them - that is a bug...

Yeah and that's fine as long as the bug shows up relatively easily and
not result in silent fs corruption once in blue moon instead.

If that MS_RDONLY check is removed the above criterion will not be
met.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 07/10] ovl: add initial revalidate support
  2010-09-06  0:50 ` [PATCH 07/10] ovl: add initial revalidate support NeilBrown
@ 2010-09-16 14:47   ` Miklos Szeredi
  2010-09-21  2:40     ` Neil Brown
  0 siblings, 1 reply; 20+ messages in thread
From: Miklos Szeredi @ 2010-09-16 14:47 UTC (permalink / raw)
  To: NeilBrown; +Cc: miklos, linux-fsdevel, linux-kernel

On Mon, 06 Sep 2010, NeilBrown wrote:
> Add dentry_revalidate method and fail validation of either the
> upper or lower dentry has been renamed or unlinked directly in the
> otherlying filesystem.
> This allows such changes to appear promptly in the overlay providing
> the file isn't currently in use.

I fixed up some things in the revalidation logic and tested it out.
There are some unexpected effects, but they boil down to the fact that
busy directories can't be invalidated.  Mostly it works as expected.

However, the "rearange directories so that a/b becomes b/a" trick
still strikes in evil ways.  Consider the following script:

mkdir /upper/a/b
cd /mnt/overlay/a
while true; do
	cd b
	mv /upper/a/b /upper/b
	mv /upper/a /upper/b/a
	cd a
	mv /upper/b/a /upper/a
	mv /upper/b /upper/a/b
done

It will create an ever deeper directory tree on the overlay.

Can this be prevented?  Probably, e.g. lookup should make sure that
each new directory gets a *unique* set of lower and upper dentries
(e.g. by creating hash tables indexed by lower and upper dentries).

Is it worth the trouble?

Any other ideas?

Thanks,
Miklos



> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/overlayfs/overlayfs.c |   55 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 55 insertions(+), 0 deletions(-)
> 
> diff --git a/fs/overlayfs/overlayfs.c b/fs/overlayfs/overlayfs.c
> index 656cad7..cdeafa7 100644
> --- a/fs/overlayfs/overlayfs.c
> +++ b/fs/overlayfs/overlayfs.c
> @@ -409,9 +409,64 @@ static void ovl_dentry_iput(struct dentry *dentry, struct inode *inode)
>  	iput(inode);
>  }
>  
> +static int ovl_still_valid(struct dentry *dentry,
> +			     struct dentry *parent, struct dentry *child)
> +{
> +	/* dentry is in the overlay filesystem
> +	 * child is the corresponding dentry in the upper or lower layer
> +	 * parent is the corresponding dentry of dentry's parent in the same layer
> +	 *
> +	 * If child is NULL, the parent must be NULL or negative.
> +	 * Otherwise child must be hashed, have parent as the d_parent, and
> +	 * have the same name as dentry
> +	 *
> +	 */
> +	struct qstr *qstr;
> +	int rv;
> +
> +	if (child == NULL)
> +		return (parent == NULL || parent->d_inode == NULL);
> +
> +	if (child->d_parent != parent)
> +		return 0;
> +	if (d_unhashed(child))
> +		return 0;
> +
> +	/* Unfortunately we need d_lock to compare names */
> +	spin_lock(&dentry->d_lock);
> +	spin_lock_nested(&child->d_lock, DENTRY_D_LOCK_NESTED);
> +	qstr = &child->d_name;
> +	if (parent->d_op && parent->d_op->d_compare)
> +		rv = ! (parent->d_op->d_compare(parent, qstr, &dentry->d_name));
> +	else
> +		rv =  (qstr->len == dentry->d_name.len &&
> +		       memcmp(qstr->name, dentry->d_name.name, qstr->len) == 0);
> +
> +	spin_unlock(&child->d_lock);
> +	spin_unlock(&dentry->d_lock);
> +	return rv;
> +}
> +
> +static int ovl_dentry_revalidate(struct dentry *dentry, struct nameidata *nd)
> +{
> +	/* We need to invalidate this dentry if the either upperpath or lowerpath
> +	 * has been changed
> +	 */
> +	struct dentry *parent = dget_parent(dentry);
> +	struct ovl_entry *ue = dentry->d_fsdata;
> +	struct ovl_entry *pue = parent->d_fsdata;
> +	int rv;
> +
> +	rv = (ovl_still_valid(dentry, pue->upperpath.dentry, ue->upperpath.dentry) &&
> +	      ovl_still_valid(dentry, pue->lowerpath.dentry, ue->lowerpath.dentry));
> +	dput(parent);
> +	return rv;
> +}
> +
>  static const struct dentry_operations ovl_dentry_operations = {
>  	.d_release = ovl_dentry_release,
>  	.d_iput = ovl_dentry_iput,
> +	.d_revalidate = ovl_dentry_revalidate,
>  };
>  
>  static struct inode *ovl_new_inode(struct super_block *sb, umode_t mode)
> 
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 07/10] ovl: add initial revalidate support
  2010-09-16 14:47   ` Miklos Szeredi
@ 2010-09-21  2:40     ` Neil Brown
  0 siblings, 0 replies; 20+ messages in thread
From: Neil Brown @ 2010-09-21  2:40 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, linux-kernel

On Thu, 16 Sep 2010 16:47:29 +0200
Miklos Szeredi <miklos@szeredi.hu> wrote:

> On Mon, 06 Sep 2010, NeilBrown wrote:
> > Add dentry_revalidate method and fail validation of either the
> > upper or lower dentry has been renamed or unlinked directly in the
> > otherlying filesystem.
> > This allows such changes to appear promptly in the overlay providing
> > the file isn't currently in use.
> 
> I fixed up some things in the revalidation logic and tested it out.
> There are some unexpected effects, but they boil down to the fact that
> busy directories can't be invalidated.  Mostly it works as expected.
> 
> However, the "rearange directories so that a/b becomes b/a" trick
> still strikes in evil ways.  Consider the following script:
> 
> mkdir /upper/a/b
> cd /mnt/overlay/a
> while true; do
> 	cd b
> 	mv /upper/a/b /upper/b
> 	mv /upper/a /upper/b/a
> 	cd a
> 	mv /upper/b/a /upper/a
> 	mv /upper/b /upper/a/b
> done
> 
> It will create an ever deeper directory tree on the overlay.
> 
> Can this be prevented?  Probably, e.g. lookup should make sure that
> each new directory gets a *unique* set of lower and upper dentries
> (e.g. by creating hash tables indexed by lower and upper dentries).
> 
> Is it worth the trouble?

The only real problem here is that an unprivileged user with direct access
to 'upper' could consume unlimited kernel memory without using a
corresponding mount of storage space (so quotas etc wouldn't stop them).

Maybe the easiest counter-measure to that is to keep track of a 'depth' of
each dentry and disallow lookups that go much beyond PATH_MAX???

Either that or rely in the sysadmin to hide the upper/lower filesystems from
untrusted users.  I particularly see value in sysadmins being able to modify
these filesystem - not so much for 'regular users'.

NeilBrown

> 
> Any other ideas?
> 
> Thanks,
> Miklos
> 
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2010-09-21  2:40 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-09-06  0:50 [PATCH 00/10] Assorted fixes/enhancements for overlayfs NeilBrown
2010-09-06  0:50 ` [PATCH 02/10] ovl: minimal remount support NeilBrown
2010-09-06  0:50 ` [PATCH 01/10] ovl: small optimisation for ovl_lookup NeilBrown
2010-09-06 16:57   ` Miklos Szeredi
2010-09-06  0:50 ` [PATCH 03/10] ovl: use correct seek function for directories NeilBrown
2010-09-06  0:50 ` [PATCH 04/10] ovl: initialise is_real before use NeilBrown
2010-09-06  0:50 ` [PATCH 08/10] VFS: tiny optimisation in open_other handling NeilBrown
2010-09-06  0:50 ` [PATCH 09/10] VFS: Remove read-only checks from dentry_permission NeilBrown
2010-09-06 19:10   ` Miklos Szeredi
2010-09-08  7:47     ` Neil Brown
2010-09-08  8:58       ` Miklos Szeredi
2010-09-07 15:58   ` Dave Hansen
2010-09-06  0:50 ` [PATCH 07/10] ovl: add initial revalidate support NeilBrown
2010-09-16 14:47   ` Miklos Szeredi
2010-09-21  2:40     ` Neil Brown
2010-09-06  0:50 ` [PATCH 06/10] ovl: rename ovl_fill_cache to ovl_dir_read NeilBrown
2010-09-06  0:50 ` [PATCH 05/10] ovl: make sure fsync is never called on the lower filesystem NeilBrown
2010-09-06 17:16   ` Miklos Szeredi
2010-09-06  0:50 ` [PATCH 10/10] ovl: Assorted updates to Documentation/filesystems/overlayfs.txt NeilBrown
2010-09-06 19:23 ` [PATCH 00/10] Assorted fixes/enhancements for overlayfs Miklos Szeredi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).