linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v2] vfs: get_next_ino(), never inum=0
       [not found] <CANn89i+PBEGp=9QGRioa7CUDZmApT-UNa=OJTdz4eu7AyO3Kbw@mail.gmail.com>
@ 2014-05-28 14:06 ` J. R. Okajima
  0 siblings, 0 replies; 3+ messages in thread
From: J. R. Okajima @ 2014-05-28 14:06 UTC (permalink / raw)
  To: linux-fsdevel, dchinner, viro, Eric Dumazet, Hugh Dickins,
	Christoph Hellwig, Andreas Dilger, Jan Kara

It is very rare for get_next_ino() to return zero as a new inode number,
since its return type is unsigned int, but it can certainly happen
eventually once the counter wraps around.

Interestingly, ls(1) and find(1) (actually readdir(3)) do not show a file
whose inum is zero, so users will not be able to find it. This issue may
be especially harmful for tmpfs.

On a very long-lived and busy system, users may frequently create files
on tmpfs. If one of them is unlucky enough to get inum=0, its filename
becomes invisible. If the user remembers the name, the file can still be
used or unlinked by name, since it certainly exists. Otherwise the file
remains on tmpfs silently; no one can touch it. This behaviour looks
like a resource leak.
Worse, if a directory gets inum=0 and a user creates several files under
it, the leaked memory keeps growing, since the user cannot see the names
of any files under that directory, regardless of the children's inums.

There is another unpleasant effect when get_next_ino() wraps around:
duplicated inums. When a file whose inum=100 already exists on tmpfs, a
new file may also get inum=100. I am not sure what will happen when
duplicated inums exist on tmpfs, but I am afraid some tools, such as
backup tools, will not work correctly.
Anyway, that is not an issue in get_next_ino() itself. It should be
fixed in mm/shmem.c separately if it is really necessary.

There are many get_next_ino() callers other than tmpfs, such as several
drivers, anon_inode, autofs4, freevxfs, procfs, pipe, hugetlbfs,
configfs, ramfs, fuse, ocfs2, debugfs, securityfs, cgroup, socket, ipc.
Some of them do not care about inums, so this issue is harmless for
them. But the others may suffer from inum=0. For example, if procfs gets
inum=0 for a task dir (or for one of its children), then several
utilities will not work correctly, including ps(1), lsof(8), etc.

(Essentially the patch is re-written by Eric Dumazet.)

Cc: Eric Dumazet <edumazet@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Andreas Dilger <adilger@dilger.ca>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
---
 fs/inode.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/inode.c b/fs/inode.c
index 567296b..58e7c56 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -840,6 +840,8 @@ unsigned int get_next_ino(void)
 	unsigned int *p = &get_cpu_var(last_ino);
 	unsigned int res = *p;
 
+start:
+
 #ifdef CONFIG_SMP
 	if (unlikely((res & (LAST_INO_BATCH-1)) == 0)) {
 		static atomic_t shared_last_ino;
@@ -849,7 +851,9 @@ unsigned int get_next_ino(void)
 	}
 #endif
 
-	*p = ++res;
+	if (unlikely(!++res))
+		goto start;	/* never zero */
+	*p = res;
 	put_cpu_var(last_ino);
 	WARN(!res, "static inum wrapped around");
 	return res;
-- 
1.7.10.4



* Re: [PATCH v2] vfs: get_next_ino(), never inum=0
  2014-04-29 15:45 [PATCH] " hooanon05g
@ 2014-08-18 18:21 ` Carlos Maiolino
  2014-08-19  0:58   ` J. R. Okajima
  0 siblings, 1 reply; 3+ messages in thread
From: Carlos Maiolino @ 2014-08-18 18:21 UTC (permalink / raw)
  To: linux-fsdevel

This V2 looks very reasonable, and it fixes the problem with files with
inode=0 on tmpfs, which I tested here. So, consider it

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

Cheers
-- 
Carlos


* Re: [PATCH v2] vfs: get_next_ino(), never inum=0
  2014-08-18 18:21 ` [PATCH v2] " Carlos Maiolino
@ 2014-08-19  0:58   ` J. R. Okajima
  0 siblings, 0 replies; 3+ messages in thread
From: J. R. Okajima @ 2014-08-19  0:58 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-fsdevel


Carlos Maiolino:
> This V2 looks very reasonable, and it fixes the problem with files with
> inode=0 on tmpfs, which I tested here. So, consider it
>
> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

Just out of curiosity, how did you notice the inode=0 problem? I think
it is hard for anyone to actually hit it.

Also, after posting the patch, some people reported a bug related to
SysV shm to me. The extra patch below adds SysV shm support. But I do
not like it, since it introduces an additional condition into the very
common path.


J. R. Okajima

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index ca658a8..fda816e 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -25,6 +25,7 @@ struct shmem_inode_info {
 
 struct shmem_sb_info {
 	struct mutex idr_lock;
+	bool idr_nouse;
 	struct idr idr;		    /* manages inode-number */
 	unsigned long max_blocks;   /* How many blocks are allowed */
 	struct percpu_counter used_blocks;  /* How many are allocated */
diff --git a/mm/shmem.c b/mm/shmem.c
index 0aa3b85..5eb75e9 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -648,7 +648,7 @@ static void shmem_evict_inode(struct inode *inode)
 
 	simple_xattrs_free(&info->xattrs);
 	WARN_ON(inode->i_blocks);
-	if (inode->i_ino) {
+	if (!sbinfo->idr_nouse && inode->i_ino) {
 		mutex_lock(&sbinfo->idr_lock);
 		idr_remove(&sbinfo->idr, inode->i_ino);
 		mutex_unlock(&sbinfo->idr_lock);
@@ -1423,19 +1423,24 @@ static struct inode *shmem_get_inode(struct super_block *sb, const struct inode
 			break;
 		}
 
-		/* inum 0 and 1 are unused */
-		mutex_lock(&sbinfo->idr_lock);
-		ino = idr_alloc(&sbinfo->idr, inode, 2, INT_MAX, GFP_NOFS);
-		if (ino > 0) {
-			inode->i_ino = ino;
-			mutex_unlock(&sbinfo->idr_lock);
-			__insert_inode_hash(inode, inode->i_ino);
-		} else {
-			inode->i_ino = 0;
-			mutex_unlock(&sbinfo->idr_lock);
-			iput(inode);	/* shmem_free_inode() will be called */
-			inode = NULL;
-		}
+		if (!sbinfo->idr_nouse) {
+			/* inum 0 and 1 are unused */
+			mutex_lock(&sbinfo->idr_lock);
+			ino = idr_alloc(&sbinfo->idr, inode, 2, INT_MAX,
+					GFP_NOFS);
+			if (ino > 0) {
+				inode->i_ino = ino;
+				mutex_unlock(&sbinfo->idr_lock);
+				__insert_inode_hash(inode, inode->i_ino);
+			} else {
+				inode->i_ino = 0;
+				mutex_unlock(&sbinfo->idr_lock);
+				iput(inode);
+				/* shmem_free_inode() will be called */
+				inode = NULL;
+			}
+		} else
+			inode->i_ino = get_next_ino();
 	} else
 		shmem_free_inode(sb);
 	return inode;
@@ -2560,7 +2565,8 @@ static void shmem_put_super(struct super_block *sb)
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(sb);
 
-	idr_destroy(&sbinfo->idr);
+	if (!sbinfo->idr_nouse)
+		idr_destroy(&sbinfo->idr);
 	percpu_counter_destroy(&sbinfo->used_blocks);
 	mpol_put(sbinfo->mpol);
 	kfree(sbinfo);
@@ -2682,6 +2688,15 @@ static void shmem_destroy_inodecache(void)
 	kmem_cache_destroy(shmem_inode_cachep);
 }
 
+static __init void shmem_no_idr(struct super_block *sb)
+{
+	struct shmem_sb_info *sbinfo;
+
+	sbinfo = SHMEM_SB(sb);
+	sbinfo->idr_nouse = true;
+	idr_destroy(&sbinfo->idr);
+}
+
 static const struct address_space_operations shmem_aops = {
 	.writepage	= shmem_writepage,
 	.set_page_dirty	= __set_page_dirty_no_writeback,
@@ -2814,6 +2829,7 @@ int __init shmem_init(void)
 		printk(KERN_ERR "Could not kern_mount tmpfs\n");
 		goto out1;
 	}
+	shmem_no_idr(shm_mnt->mnt_sb);
 	return 0;
 
 out1:

