From: Oleg Nesterov <oleg@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Louis Rilling <louis.rilling@kerlabs.com>,
	Pavel Emelyanov <xemul@openvz.org>,
	Linux Containers <containers@lists.osdl.org>,
	linux-kernel@vger.kernel.org,
	Daniel Lezcano <dlezcano@fr.ibm.com>
Subject: [PATCH 2/2] pid_ns: refactor the buggy pid_ns_release_proc() logic
Date: Sun, 20 Jun 2010 20:06:37 +0200
Message-ID: <20100620180637.GD17120@redhat.com>
In-Reply-To: <20100620180530.GB17120@redhat.com>

pid_namespace holds ns->proc_mnt, while this vfsmount has a reference to
the namespace via PROC_I(sb->s_root->d_inode)->pid. To break this circle,
/sbin/init does mntput() in pid_ns_release_proc(). See 6f4e6433.
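
To make the circle concrete, here is a small userspace toy model. It is
only a sketch: the toy_* names and the plain integer counters are made up
for illustration and are not the kernel structures or helpers. It assumes
the namespace pins the mount through proc_mnt and the mount pins the
namespace back, so the last task reference alone never frees the
namespace unless someone drops the mount reference explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for pid_namespace and its proc vfsmount. */
struct toy_ns;

struct toy_mnt {
	int count;              /* toy reference count */
	struct toy_ns *ns;      /* counted reference back to the namespace */
	bool dead;              /* set when count reaches zero */
};

struct toy_ns {
	int count;
	struct toy_mnt *proc_mnt;  /* counted reference to the mount */
	bool dead;
};

static void toy_put_ns(struct toy_ns *ns)
{
	if (--ns->count == 0)
		ns->dead = true;
}

static void toy_put_mnt(struct toy_mnt *mnt)
{
	if (--mnt->count == 0) {
		mnt->dead = true;
		toy_put_ns(mnt->ns);    /* a dying mount drops its ns reference */
	}
}

static void toy_setup(struct toy_ns *ns, struct toy_mnt *mnt)
{
	ns->count = 1;          /* held on behalf of the tasks in the ns */
	ns->dead = false;
	ns->proc_mnt = mnt;
	mnt->count = 1;         /* held through ns->proc_mnt */
	mnt->dead = false;
	mnt->ns = ns;
	ns->count++;            /* held by the mount: this closes the circle */
}

/* Returns true if the namespace is freed once the last task is gone. */
static bool toy_last_task_exits(bool init_breaks_cycle)
{
	struct toy_ns ns;
	struct toy_mnt mnt;

	toy_setup(&ns, &mnt);
	if (init_breaks_cycle)
		toy_put_mnt(ns.proc_mnt);   /* models init's mntput() */
	toy_put_ns(&ns);                    /* last task reference goes away */
	return ns.dead;
}
```

In this model toy_last_task_exits(false) leaves the namespace alive
forever, while toy_last_task_exits(true) frees both objects. That is why
the current code depends on init doing the mntput() at the right moment.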

But we have the following problems:

	- Nobody does mntput() if copy_process() fails after
	  pid_ns_prepare_proc().

	- proc_flush_task() checks upid->nr == 1 to verify we are init,
	  but this is wrong if a multi-threaded init does exec.

	- As Louis pointed out, this namespace can have detached
	  EXIT_DEAD tasks which can use ns->proc_mnt after this mntput().

With this patch only pid_namespace has a reference to ns->proc_mnt, and
mntput(ns->proc_mnt) is called by destroy_pid_namespace() paths when we
know that this ns must not have any references (in particular, there are
no pids in this namespace).
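
The same idea as a userspace toy model, again only a sketch: the sketch_*
names are made up, and the single borrowed back pointer stands in for the
root inode's ->pid. The assumption is that the mount takes no reference
on the namespace side, so the final put of the namespace can clear the
back pointer and drop the mount in one place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel structures. */
struct sketch_ns;

struct sketch_mnt {
	int count;              /* toy reference count */
	struct sketch_ns *ns;   /* borrowed pointer, no reference taken */
	bool dead;
};

struct sketch_ns {
	int count;
	struct sketch_mnt *proc_mnt;  /* the only counted reference to the mount */
	bool dead;
};

/* Models the reworked release step: tolerate a missing mount. */
static void sketch_release_proc(struct sketch_ns *ns)
{
	if (ns->proc_mnt) {
		ns->proc_mnt->ns = NULL;        /* clear the borrowed pointer first */
		if (--ns->proc_mnt->count == 0)
			ns->proc_mnt->dead = true;
	}
}

/* Models the destroy path: the last put tears down the mount too. */
static void sketch_put_ns(struct sketch_ns *ns)
{
	if (--ns->count == 0) {
		sketch_release_proc(ns);
		ns->dead = true;
	}
}

/* Returns true if dropping the last ns reference cleans up everything. */
static bool sketch_last_ref_frees_all(bool have_mnt)
{
	struct sketch_ns ns = { .count = 1, .proc_mnt = NULL, .dead = false };
	struct sketch_mnt mnt = { .count = 1, .ns = &ns, .dead = false };

	if (have_mnt)
		ns.proc_mnt = &mnt;
	sketch_put_ns(&ns);
	return ns.dead && (!have_mnt || (mnt.dead && mnt.ns == NULL));
}
```

sketch_last_ref_frees_all() holds both with and without a mount. The
have_mnt == false case corresponds to a namespace whose proc_mnt was
never set up.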

Changes:

	- kill proc_flush_task()->pid_ns_release_proc()

	- change fs/proc/root.c so that we don't create the "artificial"
	  references to the namespace or its pid==1.

	- change destroy_pid_namespace() to call pid_ns_release_proc().

	- change pid_ns_release_proc() to clear s_root->d_inode->pid.
	  The caller is destroy_pid_namespace(), so this pid was already
	  freed.

Reported-by: Louis Rilling <louis.rilling@kerlabs.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/pid_namespace.c |    2 ++
 fs/proc/base.c         |    4 ----
 fs/proc/root.c         |   10 ++++++----
 3 files changed, 8 insertions(+), 8 deletions(-)

--- 35-rc3/kernel/pid_namespace.c~PNS_6_BREAK_CIRCLE	2010-06-20 18:36:00.000000000 +0200
+++ 35-rc3/kernel/pid_namespace.c	2010-06-20 18:50:30.000000000 +0200
@@ -128,6 +128,8 @@ static void destroy_pid_namespace(struct
 {
 	int i;
 
+	pid_ns_release_proc(ns);
+
 	for (i = 0; i < PIDMAP_ENTRIES; i++)
 		kfree(ns->pidmap[i].page);
 	kmem_cache_free(pid_ns_cachep, ns);
--- 35-rc3/fs/proc/base.c~PNS_6_BREAK_CIRCLE	2010-05-28 13:41:41.000000000 +0200
+++ 35-rc3/fs/proc/base.c	2010-06-20 18:51:14.000000000 +0200
@@ -2745,10 +2745,6 @@ void proc_flush_task(struct task_struct 
 		proc_flush_task_mnt(upid->ns->proc_mnt, upid->nr,
 					tgid->numbers[i].nr);
 	}
-
-	upid = &pid->numbers[pid->level];
-	if (upid->nr == 1)
-		pid_ns_release_proc(upid->ns);
 }
 
 static struct dentry *proc_pid_instantiate(struct inode *dir,
--- 35-rc3/fs/proc/root.c~PNS_6_BREAK_CIRCLE	2010-06-19 20:11:03.000000000 +0200
+++ 35-rc3/fs/proc/root.c	2010-06-20 18:58:12.000000000 +0200
@@ -31,7 +31,7 @@ static int proc_set_super(struct super_b
 	struct pid_namespace *ns;
 
 	ns = (struct pid_namespace *)data;
-	sb->s_fs_info = get_pid_ns(ns);
+	sb->s_fs_info = ns;
 	return set_anon_super(sb, NULL);
 }
 
@@ -68,7 +68,7 @@ static int proc_get_sb(struct file_syste
 		struct proc_inode *ei = PROC_I(sb->s_root->d_inode);
 		if (!ei->pid) {
 			rcu_read_lock();
-			ei->pid = get_pid(find_pid_ns(1, ns));
+			ei->pid = find_pid_ns(1, ns);
 			rcu_read_unlock();
 		}
 	}
@@ -83,7 +83,6 @@ static void proc_kill_sb(struct super_bl
 
 	ns = (struct pid_namespace *)sb->s_fs_info;
 	kill_anon_super(sb);
-	put_pid_ns(ns);
 }
 
 static struct file_system_type proc_fs_type = {
@@ -209,5 +208,8 @@ int pid_ns_prepare_proc(struct pid_names
 
 void pid_ns_release_proc(struct pid_namespace *ns)
 {
-	mntput(ns->proc_mnt);
+	if (ns->proc_mnt) {
+		PROC_I(ns->proc_mnt->mnt_sb->s_root->d_inode)->pid = NULL;
+		mntput(ns->proc_mnt);
+	}
 }
