* [REVIEW][PATCH 0/11] pid namespace cleanups and enhancements
@ 2012-11-16 16:32 ` Eric W. Biederman
  0 siblings, 0 replies; 76+ messages in thread
From: Eric W. Biederman @ 2012-11-16 16:32 UTC (permalink / raw)
  To: Linux Containers
  Cc: Andrew Morton, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Oleg Nesterov


This patchset is my pile of pid namespace patches that I have been
sitting on for entirely too long.  I have been running and testing these
changes for a while but if anyone sees any problems please let me know.

Feature-wise, this patchset adds unshare and setns support for the pid
namespace.

Cleanup-wise, this patchset adds an explicit count of how many pids are
hashed in a pid namespace and uses that count to trigger the unmounting
of the internal kernel mount of proc.  The current scheme is buggy and
entirely too clever to continue living.

Some proc bits that were added to support the pid namespace initially
are removed, as they are no longer necessary.

These patches are also available at:
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git pidns-v73

Since some of this work is closely allied with the user namespace bits I
have pending, I intend to merge these changes through my user namespace
tree.

Eric W. Biederman (11):
      procfs: Use the proc generic infrastructure for proc/self.
      procfs: Don't cache a pid in the root inode.
      pidns: Capture the user namespace and filter ns_last_pid
      pidns: Use task_active_pid_ns where appropriate
      pidns: Make the pidns proc mount/umount logic obvious.
      pidns: Don't allow new processes in a dead pid namespace.
      pidns: Wait in zap_pid_ns_processes until pid_ns->nr_hashed == 1
      pidns: Deny strange cases when creating pid namespaces.
      pidns: Add setns support
      pidns: Consolidate initialzation of special init task state
      pidns: Support unsharing the pid namespace.

 arch/powerpc/platforms/cell/spufs/sched.c |    2 +-
 arch/um/drivers/mconsole_kern.c           |    2 +-
 drivers/staging/android/binder.c          |    3 +-
 fs/hppfs/hppfs.c                          |    2 +-
 fs/proc/Makefile                          |    1 +
 fs/proc/base.c                            |  169 +----------------------------
 fs/proc/internal.h                        |    1 +
 fs/proc/namespaces.c                      |    3 +
 fs/proc/root.c                            |   16 +---
 fs/proc/self.c                            |   59 ++++++++++
 include/linux/pid_namespace.h             |   10 ++-
 include/linux/proc_fs.h                   |    1 +
 init/main.c                               |    1 -
 kernel/cgroup.c                           |    2 +-
 kernel/events/core.c                      |    2 +-
 kernel/exit.c                             |   12 --
 kernel/fork.c                             |   42 +++++---
 kernel/nsproxy.c                          |    4 +-
 kernel/pid.c                              |   46 +++++++--
 kernel/pid_namespace.c                    |   99 +++++++++++++----
 kernel/signal.c                           |    2 +-
 kernel/sysctl_binary.c                    |    2 +-
 22 files changed, 231 insertions(+), 250 deletions(-)

* Re: [PATCH 03/11] pidns: Capture the user namespace and filter ns_last_pid
@ 2012-11-19 12:27 Zhao Hongjiang
       [not found] ` <50AA259A.5030007-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 76+ messages in thread
From: Zhao Hongjiang @ 2012-11-19 12:27 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA

On 2012/11/17 0:35, Eric W. Biederman wrote:
> From: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
>
> - Capture the user namespace that creates the pid namespace
> - Use that user namespace to test if it is ok to write to
>   /proc/sys/kernel/ns_last_pid.
>
> Acked-by: Serge Hallyn <serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> Signed-off-by: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
> ---
>  include/linux/pid_namespace.h |    8 +++++---
>  kernel/nsproxy.c              |    2 +-
>  kernel/pid.c                  |    1 +
>  kernel/pid_namespace.c        |   16 +++++++++++-----
>  4 files changed, 18 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
> index 65e3e87..c89c9cf 100644
> --- a/include/linux/pid_namespace.h
> +++ b/include/linux/pid_namespace.h
> @@ -31,6 +31,7 @@ struct pid_namespace {
>  #ifdef CONFIG_BSD_PROCESS_ACCT
>  	struct bsd_acct_struct *bacct;
>  #endif
> +	struct user_namespace *user_ns;
>  	kgid_t pid_gid;
>  	int hide_pid;
>  	int reboot;	/* group exit code if this pidns was rebooted */
> @@ -46,7 +47,8 @@ static inline struct pid_namespace *get_pid_ns(struct pid_namespace *ns)
>  	return ns;
>  }
>
> -extern struct pid_namespace *copy_pid_ns(unsigned long flags, struct pid_namespace *ns);
> +extern struct pid_namespace *copy_pid_ns(unsigned long flags,
> +	struct user_namespace *user_ns, struct pid_namespace *ns);
>  extern void zap_pid_ns_processes(struct pid_namespace *pid_ns);
>  extern int reboot_pid_ns(struct pid_namespace *pid_ns, int cmd);
>  extern void put_pid_ns(struct pid_namespace *ns);
> @@ -59,8 +61,8 @@ static inline struct pid_namespace *get_pid_ns(struct pid_namespace *ns)
>  	return ns;
>  }
>
> -static inline struct pid_namespace *
> -copy_pid_ns(unsigned long flags, struct pid_namespace *ns)
> +static inline struct pid_namespace *copy_pid_ns(unsigned long flags,
> +	struct user_namespace *user_ns, struct pid_namespace *ns)
>  {
>  	if (flags & CLONE_NEWPID)
>  		ns = ERR_PTR(-EINVAL);
> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> index b576f7f..5fe88e1 100644
> --- a/kernel/nsproxy.c
> +++ b/kernel/nsproxy.c
> @@ -84,7 +84,7 @@ static struct nsproxy *create_new_namespaces(unsigned long flags,
>  		goto out_ipc;
>  	}
>
> -	new_nsp->pid_ns = copy_pid_ns(flags, task_active_pid_ns(tsk));
> +	new_nsp->pid_ns = copy_pid_ns(flags, task_cred_xxx(tsk, user_ns), task_active_pid_ns(tsk));
>  	if (IS_ERR(new_nsp->pid_ns)) {
>  		err = PTR_ERR(new_nsp->pid_ns);
>  		goto out_pid;
> diff --git a/kernel/pid.c b/kernel/pid.c
> index aebd4f5..2a624f1 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -78,6 +78,7 @@ struct pid_namespace init_pid_ns = {
>  	.last_pid = 0,
>  	.level = 0,
>  	.child_reaper = &init_task,
> +	.user_ns = &init_user_ns,
>  };
>  EXPORT_SYMBOL_GPL(init_pid_ns);
>
> diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
> index 7b07cc0..7580aa0 100644
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -10,6 +10,7 @@
>
>  #include <linux/pid.h>
>  #include <linux/pid_namespace.h>
> +#include <linux/user_namespace.h>
>  #include <linux/syscalls.h>
>  #include <linux/err.h>
>  #include <linux/acct.h>
> @@ -74,7 +75,8 @@ err_alloc:
>  /* MAX_PID_NS_LEVEL is needed for limiting size of 'struct pid' */
>  #define MAX_PID_NS_LEVEL 32
>
> -static struct pid_namespace *create_pid_namespace(struct pid_namespace *parent_pid_ns)
> +static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns,
> +	struct pid_namespace *parent_pid_ns)
>  {
>  	struct pid_namespace *ns;
>  	unsigned int level = parent_pid_ns->level + 1;
> @@ -102,6 +104,7 @@ static struct pid_namespace *create_pid_namespace(struct pid_namespace *parent_p
>  	kref_init(&ns->kref);
>  	ns->level = level;
>  	ns->parent = get_pid_ns(parent_pid_ns);
> +	ns->user_ns = get_user_ns(user_ns);

Hi Eric,

I noticed that the user_ns reference taken here is never released when the
pid_ns's refcount drops to zero. I sent out a patch to fix this on Nov 2nd.

Am I misunderstanding something?

Hongjiang

>
>  	set_bit(0, ns->pidmap[0].page);
>  	atomic_set(&ns->pidmap[0].nr_free, BITS_PER_PAGE - 1);
> @@ -117,6 +120,7 @@ static struct pid_namespace *create_pid_namespace(struct pid_namespace *parent_p
>
>  out_put_parent_pid_ns:
>  	put_pid_ns(parent_pid_ns);
> +	put_user_ns(user_ns);
>  out_free_map:
>  	kfree(ns->pidmap[0].page);
>  out_free:
> @@ -134,13 +138,14 @@ static void destroy_pid_namespace(struct pid_namespace *ns)
>  	kmem_cache_free(pid_ns_cachep, ns);
>  }
>
> -struct pid_namespace *copy_pid_ns(unsigned long flags, struct pid_namespace *old_ns)
> +struct pid_namespace *copy_pid_ns(unsigned long flags,
> +	struct user_namespace *user_ns, struct pid_namespace *old_ns)
>  {
>  	if (!(flags & CLONE_NEWPID))
>  		return get_pid_ns(old_ns);
>  	if (flags & (CLONE_THREAD|CLONE_PARENT))
>  		return ERR_PTR(-EINVAL);
> -	return create_pid_namespace(old_ns);
> +	return create_pid_namespace(user_ns, old_ns);
>  }
>
>  static void free_pid_ns(struct kref *kref)
> @@ -239,9 +244,10 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
>  static int pid_ns_ctl_handler(struct ctl_table *table, int write,
>  		void __user *buffer, size_t *lenp, loff_t *ppos)
>  {
> +	struct pid_namespace *pid_ns = task_active_pid_ns(current);
>  	struct ctl_table tmp = *table;
>
> -	if (write && !capable(CAP_SYS_ADMIN))
> +	if (write && !ns_capable(pid_ns->user_ns, CAP_SYS_ADMIN))
>  		return -EPERM;
>
>  	/*
> @@ -250,7 +256,7 @@ static int pid_ns_ctl_handler(struct ctl_table *table, int write,
>  	 * it should synchronize its usage with external means.
>  	 */
>
> -	tmp.data = &current->nsproxy->pid_ns->last_pid;
> +	tmp.data = &pid_ns->last_pid;
>  	return proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
>  }
>
>
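A note on the reference-count question raised in the reply above: the following is a minimal userspace sketch in C, not kernel code. The structs are simplified stand-ins that model only the counts, and the release_user_ns flag and the refcount_after_cycle() helper are hypothetical, introduced only to compare teardown with and without a matching put_user_ns().

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel structures; only refcounts are modelled. */
struct user_namespace {
	int refcount;
};

struct pid_namespace {
	int refcount;
	struct user_namespace *user_ns;
};

static struct user_namespace *get_user_ns(struct user_namespace *ns)
{
	ns->refcount++;
	return ns;
}

static void put_user_ns(struct user_namespace *ns)
{
	assert(ns->refcount > 0);
	ns->refcount--;
}

/* Mirrors the hunk in create_pid_namespace(): the new pid namespace pins user_ns. */
static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns)
{
	struct pid_namespace *ns = calloc(1, sizeof(*ns));

	if (!ns)
		return NULL;
	ns->refcount = 1;
	ns->user_ns = get_user_ns(user_ns);
	return ns;
}

/*
 * Teardown. The release_user_ns flag is hypothetical: it selects whether
 * the reference taken at creation time is dropped again.
 */
static void destroy_pid_namespace(struct pid_namespace *ns, int release_user_ns)
{
	if (release_user_ns)
		put_user_ns(ns->user_ns);
	free(ns);
}

/* Drop one pid namespace reference; destroy it when the count reaches zero. */
static void put_pid_ns(struct pid_namespace *ns, int release_user_ns)
{
	if (--ns->refcount == 0)
		destroy_pid_namespace(ns, release_user_ns);
}

/* Returns the user_ns refcount left after one create/put cycle, or -1 on OOM. */
static int refcount_after_cycle(int release_user_ns)
{
	struct user_namespace user_ns = { .refcount = 1 };
	struct pid_namespace *ns = create_pid_namespace(&user_ns);

	if (!ns)
		return -1;
	put_pid_ns(ns, release_user_ns);
	return user_ns.refcount;
}
```

Under these assumptions, refcount_after_cycle(0) leaves the user namespace count one higher than it started, while refcount_after_cycle(1) returns it to its starting value. That is the imbalance the reply asks about: a get_user_ns() at creation needs a matching put_user_ns() somewhere on the destroy path.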



