public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* overlayfs non-persistent inodes
@ 2018-01-13  0:19 Niklas Cassel
  2018-01-13  4:38 ` Amir Goldstein
  0 siblings, 1 reply; 2+ messages in thread
From: Niklas Cassel @ 2018-01-13  0:19 UTC (permalink / raw)
  To: amir73il, mszeredi; +Cc: linux-unionfs, linux-kernel

Hello Miklos, Amir

We are having some problems with inotify + overlayfs.

If we start to monitor a directory on an overlayfs,
and get into a low memory situation, or if
/proc/sys/vm/drop_caches is called explicitly,
our inotify stops reporting events.

This appears to be caused by non-persistent inodes.

Looking at the commits, it looked like inode numbers
were persistent since commit
b7a807dc2010 ("ovl: persistent inode number for directories").

However, this doesn't seem to take different file systems
into account:

# cat /proc/version
Linux version 4.15.0-rc7-next-20180109-00003-gd1acc32c9ee5 (niklass@lnxartpec1)
 (gcc version 6.3.0 (crosstool-NG crosstool-ng-1.22.0-610-g21cde94)) #74
 SMP Fri Jan 12 13:23:57 CET 2018
# mkdir -p /test/dir /mnt/tmp
# mount -t tmpfs tmpfs /mnt/tmp
# mkdir /mnt/tmp/.work-test /mnt/tmp/test
# mount -t overlay overlay /test -o lowerdir=/test,upperdir=/mnt/tmp/test,
workdir=/mnt/tmp/.work-test
# ls -lid /test/dir
   6313 drwxr-xr-x    2 root     root            40 Jan  1 00:01 /test/dir
# echo 3 > /proc/sys/vm/drop_caches
[  229.762380] sh (118): drop_caches: 3
# ls -lid /test/dir
   6318 drwxr-xr-x    2 root     root            40 Jan  1 00:01 /test/dir


There are probably a few user space applications that
rely on inotify. If inotify suddenly and silently stops
reporting events after some memory pressure, all kinds
of bugs could be introduced in user space.

I'm guessing that it will not be so easy to figure out
that overlayfs might be the root-cause of their
seemingly random bug.


Regards,
Niklas

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: overlayfs non-persistent inodes
  2018-01-13  0:19 overlayfs non-persistent inodes Niklas Cassel
@ 2018-01-13  4:38 ` Amir Goldstein
  0 siblings, 0 replies; 2+ messages in thread
From: Amir Goldstein @ 2018-01-13  4:38 UTC (permalink / raw)
  To: Niklas Cassel; +Cc: Miklos Szeredi, overlayfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1239 bytes --]

On Sat, Jan 13, 2018 at 2:19 AM, Niklas Cassel <niklas.cassel@axis.com> wrote:
> Hello Miklos, Amir
>
> We are having some problems with inotify + overlayfs.
>
> If we start to monitor a directory on an overlayfs,
> and get into a low memory situation, or if
> /proc/sys/vm/drop_caches is called explicitly,
> our inotify stops reporting events.
>
> This appears to be caused by non-persistent inodes.

Not "non-persistent" inodes, but "non-unique" inodes.
inotify pins overlay inode in memory, but because overlay
directory inodes are not hashed in inode cache, after drop
caches, if nothing holds a reference to overlay dentry,
overlay lookup will allocate a new dentry with a new inode
instead of re-using the inotify pinned inode.

>
> Looking at the commits, it looked like inode numbers
> were persistent since commit
> b7a807dc2010 ("ovl: persistent inode number for directories").
>

Persistent inode *number*, meaning that all non-unique overlay
directory inodes have the same inode number - doesn't help
inotify.

> However, this doesn't seem to take different file systems
> into account:
>

Are you saying that inotify issue is not happening when layers
are on same fs? I doubt that.

Please try attached patch.

Thanks,
Amir.

[-- Attachment #2: 0001-ovl-hash-directory-inodes.patch --]
[-- Type: text/x-patch, Size: 5180 bytes --]

From 9635eb8bfe22542ab543116126a57a65154fb84e Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il@gmail.com>
Date: Mon, 10 Jul 2017 15:55:55 +0300
Subject: [PATCH] ovl: hash directory inodes

fsnotify pins a watched directory inode in cache, but if directory dentry
is released, new lookup will allocate a new dentry and a new inode.
Directory events will be notified on the new inode, while fsnoity listener
is watching the old pinned inode.

Hash all directory inodes to reuse the pinned inode on lookup. Pure upper
dirs are hashes by real upper inode, merge and lower dirs are hashed by
real lower inode.

The reference to lower inode was being held by the lower dentry object
in the overlay dentry (oe->lowerstack[0]). Releasing the overlay dentry
may drop lower inode refcount to zero. Add a refcount on behalf of the
overlay inode to prevent that.

As a by-product, hashing directory inodes also detects multiple
redirected dirs to the same lower dir and uncovered redirected dir
target on and returns -ESTALE on lookup.

Reported-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/inode.c | 31 +++++++++++++++++++++++--------
 fs/overlayfs/super.c |  1 +
 fs/overlayfs/util.c  |  4 ++--
 3 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 00b6b294272a..090f9735f4ed 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -551,7 +551,8 @@ unsigned int ovl_get_nlink(struct dentry *lowerdentry,
 	char buf[13];
 	int err;
 
-	if (!lowerdentry || !upperdentry || d_inode(lowerdentry)->i_nlink == 1)
+	if (!lowerdentry || !upperdentry || d_is_dir(lowerdentry) ||
+	    d_inode(lowerdentry)->i_nlink == 1)
 		return fallback;
 
 	err = vfs_getxattr(upperdentry, OVL_XATTR_NLINK, &buf, sizeof(buf) - 1);
@@ -606,6 +607,16 @@ static int ovl_inode_set(struct inode *inode, void *data)
 static bool ovl_verify_inode(struct inode *inode, struct dentry *lowerdentry,
 			     struct dentry *upperdentry)
 {
+	if (S_ISDIR(inode->i_mode)) {
+		/* Real lower dir moved to upper layer under us? */
+		if (!lowerdentry && ovl_inode_lower(inode))
+			return false;
+
+		/* Lookup of an uncovered redirect origin? */
+		if (!upperdentry && ovl_inode_upper(inode))
+			return false;
+	}
+
 	/*
 	 * Allow non-NULL lower inode in ovl_inode even if lowerdentry is NULL.
 	 * This happens when finding a copied up overlay inode for a renamed
@@ -633,6 +644,7 @@ struct inode *ovl_get_inode(struct dentry *dentry, struct dentry *upperdentry,
 	struct inode *inode;
 	/* Already indexed or could be indexed on copy up? */
 	bool indexed = (index || (ovl_indexdir(dentry->d_sb) && !upperdentry));
+	struct dentry *origin = indexed ? lowerdentry : NULL;
 
 	if (WARN_ON(upperdentry && indexed && !lowerdentry))
 		return ERR_PTR(-EIO);
@@ -641,14 +653,17 @@ struct inode *ovl_get_inode(struct dentry *dentry, struct dentry *upperdentry,
 		realinode = d_inode(lowerdentry);
 
 	/*
-	 * Copy up origin (lower) may exist for non-indexed upper, but we must
-	 * not use lower as hash key in that case.
-	 * Hash inodes that are or could be indexed by origin inode and
-	 * non-indexed upper inodes that could be hard linked by upper inode.
+	 * Copy up origin (lower) may exist for non-indexed non-dir upper, but
+	 * we must not use lower as hash key in that case.
+	 * Hash non-dir that is or could be indexed by origin inode.
+	 * Hash dir that is or could be merged by origin inode.
+	 * Hash pure upper and non-indexed non-dir by upper inode.
 	 */
-	if (!S_ISDIR(realinode->i_mode) && (upperdentry || indexed)) {
-		struct inode *key = d_inode(indexed ? lowerdentry :
-						      upperdentry);
+	if (S_ISDIR(realinode->i_mode))
+		origin = lowerdentry;
+
+	if (upperdentry || origin) {
+		struct inode *key = d_inode(origin ?: upperdentry);
 		unsigned int nlink;
 
 		inode = iget5_locked(dentry->d_sb, (unsigned long) key,
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 5f6d2385c0b3..3387e6d639a5 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -211,6 +211,7 @@ static void ovl_destroy_inode(struct inode *inode)
 	struct ovl_inode *oi = OVL_I(inode);
 
 	dput(oi->__upperdentry);
+	iput(oi->lower);
 	kfree(oi->redirect);
 	ovl_dir_cache_free(inode);
 	mutex_destroy(&oi->lock);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index d6bb1c9f5e7a..06119f34a69d 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -257,7 +257,7 @@ void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
 	if (upperdentry)
 		OVL_I(inode)->__upperdentry = upperdentry;
 	if (lowerdentry)
-		OVL_I(inode)->lower = d_inode(lowerdentry);
+		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
 
 	ovl_copyattr(d_inode(upperdentry ?: lowerdentry), inode);
 }
@@ -273,7 +273,7 @@ void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 	 */
 	smp_wmb();
 	OVL_I(inode)->__upperdentry = upperdentry;
-	if (!S_ISDIR(upperinode->i_mode) && inode_unhashed(inode)) {
+	if (inode_unhashed(inode)) {
 		inode->i_private = upperinode;
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2018-01-13  4:38 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-13  0:19 overlayfs non-persistent inodes Niklas Cassel
2018-01-13  4:38 ` Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox