All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org,
	Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
	Nathan Lynch <ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH] [RFC] c/r: Add UTS support
Date: Sat, 21 Mar 2009 09:51:00 -0500	[thread overview]
Message-ID: <20090321145100.GA23058@hallyn.com> (raw)
In-Reply-To: <m1prgbzgqq.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>

Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
> 
> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >> > What is wrong with Alexey's patch, which simply passes in the values
> >> > themselves?  Do you have another use in mind for the min/max pid
> >> > values?
> >> 
> >> At an implementation level (and I need to look at Alexey's specific patch)
> >> every patch I have seen to date creates their own version of alloc_pidmap.
> >
> > You're right, Alexey's patch creates a new one.
> >
> >> alloc_pidmap already implicitly takes min/max and first value to try
> >> as parameters.  RESERVED_PIDS, pid_max, and pid_ns->last_pid.  So
> >> instead of rewriting alloc_pidmap we should just be able to refactor
> >> alloc_pidmap to take the requisite values.  That should be less code
> >> and easier to maintain.
> >
> > Yeah, that sounds good actually.  Thanks.
> >
> >> Looking at the current implementation we also have the issue that
> >> pid_max is not per pid namespace.  Where it seems to belong.
> >
> > Eh.  It does seem to, but otoh why give userspace knobs it has no use
> > for...  Or, can you think of a case where it'd be useful?
> 
> In general the number of usable pid numbers should be larger in the outer
> pid namespace than in the child pid namespace.  Otherwise it is possible
> for the child to eat all of the possible pid numbers.
> 
> So I think it would be advantageous for to make containers designed to migrate
> to have a small pid_max by default so we know we won't overwhelm others.
> 
> Furthermore since pid_max is a limit on the identifiers allocated no on the
> number of processes it is very much a pid namespace property.

Right, I don't argue that it doesn't seem to belong there.  Well if
you think people would use it, it does seem simple enough to do.
Untested (well compile-tested) patch below just for grins.

> >> > I think that's a good guideline, bad rule.  Certainly possible
> >> > that you're right that this is just pointing to in-kernel
> >> > recreation of process tree as the way to go.  I was getting
> >> > that feeling myself, but then there are still very good reasons
> >> > not to do that, as there are things which each task should do
> >> > before completing sys_restart() which are best done in userspace.
> >> > These include for instance creating virtual nics, and calling
> >> > Oren's suggested 'cr_advise()' system calls.
> >> 
> >> You might be right.   I am behind on that part of the conversation.
> >> 
> >> My general concern is that dividing up the responsibilities between user space
> >> and kernel space seems harder to maintain, and refactor if we don't get something
> >> right the first time.
> >
> > So far we're actually still at the point where the code (Oren's set)
> > could go either way.  A small patch from Alexey can make it swing toward
> > kernel, while Oren's mktree.c userspace restart program swings the other
> > way.
> >
> > And since we're punting on any nested namespaces it actually may stay that way
> > for awhile.
> 
> Interesting.  That sounds fairly fundamental.  If I have some free time I will
> have to take a look.  I'm in favor of a kernel/user space cooperation but I don't
> currently see the benefit of fork processes in user space.

All right I'll wait for you to take a look, rather than repeat
myself :)  The biggest concern IMO is how to create complicated
resources (like a veth tunnel pair) in the kernel case.

thanks,
-serge

From 47303d729ec494add03fbddb47fac9a020d65f00 Mon Sep 17 00:00:00 2001
From: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Date: Sat, 21 Mar 2009 09:22:26 -0500
Subject: [PATCH 1/1] pid_ns: make pid_max a pid_ns property

Remove the pid_max global, and make it a property of the
pid_namespace.  When a pid_ns is created, it inherits
the parent's pid_ns.

Fixing up sysctl (trivial akin to ipc version, but
potentially tedious to get right for all CONFIG*
combinations) is left for later.

Signed-off-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
 include/linux/pid_namespace.h |    1 +
 kernel/pid.c                  |   14 +++++++-------
 kernel/pid_namespace.c        |    6 ++++--
 kernel/sysctl.c               |    4 ++--
 4 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 38d1032..fd7f497 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -30,6 +30,7 @@ struct pid_namespace {
 #ifdef CONFIG_BSD_PROCESS_ACCT
 	struct bsd_acct_struct *bacct;
 #endif
+	int pid_max;
 };
 
 extern struct pid_namespace init_pid_ns;
diff --git a/kernel/pid.c b/kernel/pid.c
index 1b3586f..898fa8b 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -43,8 +43,6 @@ static struct hlist_head *pid_hash;
 static int pidhash_shift;
 struct pid init_struct_pid = INIT_STRUCT_PID;
 
-int pid_max = PID_MAX_DEFAULT;
-
 #define RESERVED_PIDS		300
 
 int pid_max_min = RESERVED_PIDS + 1;
@@ -78,6 +76,7 @@ struct pid_namespace init_pid_ns = {
 	.last_pid = 0,
 	.level = 0,
 	.child_reaper = &init_task,
+	.pid_max = PID_MAX_DEFAULT,
 };
 EXPORT_SYMBOL_GPL(init_pid_ns);
 
@@ -128,11 +127,12 @@ static int alloc_pidmap(struct pid_namespace *pid_ns)
 	struct pidmap *map;
 
 	pid = last + 1;
-	if (pid >= pid_max)
+	if (pid >= pid_ns->pid_max)
 		pid = RESERVED_PIDS;
 	offset = pid & BITS_PER_PAGE_MASK;
 	map = &pid_ns->pidmap[pid/BITS_PER_PAGE];
-	max_scan = (pid_max + BITS_PER_PAGE - 1)/BITS_PER_PAGE - !offset;
+	max_scan = (pid_ns->pid_max + BITS_PER_PAGE - 1)/BITS_PER_PAGE
+			- !offset;
 	for (i = 0; i <= max_scan; ++i) {
 		if (unlikely(!map->page)) {
 			void *page = kzalloc(PAGE_SIZE, GFP_KERNEL);
@@ -164,11 +164,11 @@ static int alloc_pidmap(struct pid_namespace *pid_ns)
 			 * bitmap block and the final block was the same
 			 * as the starting point, pid is before last_pid.
 			 */
-			} while (offset < BITS_PER_PAGE && pid < pid_max &&
-					(i != max_scan || pid < last ||
+			} while (offset < BITS_PER_PAGE && pid < pid_ns->pid_max
+					&& (i != max_scan || pid < last ||
 					    !((last+1) & BITS_PER_PAGE_MASK)));
 		}
-		if (map < &pid_ns->pidmap[(pid_max-1)/BITS_PER_PAGE]) {
+		if (map < &pid_ns->pidmap[(pid_ns->pid_max-1)/BITS_PER_PAGE]) {
 			++map;
 			offset = 0;
 		} else {
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index fab8ea8..1ba3970 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -67,15 +67,17 @@ err_alloc:
 	return NULL;
 }
 
-static struct pid_namespace *create_pid_namespace(unsigned int level)
+static struct pid_namespace *create_pid_namespace(struct pid_namespace *old)
 {
 	struct pid_namespace *ns;
+	unsigned int level = old->level + 1;
 	int i;
 
 	ns = kmem_cache_zalloc(pid_ns_cachep, GFP_KERNEL);
 	if (ns == NULL)
 		goto out;
 
+	ns->pid_max = old->pid_max;
 	ns->pidmap[0].page = kzalloc(PAGE_SIZE, GFP_KERNEL);
 	if (!ns->pidmap[0].page)
 		goto out_free;
@@ -125,7 +127,7 @@ struct pid_namespace *copy_pid_ns(unsigned long flags, struct pid_namespace *old
 	if (flags & CLONE_THREAD)
 		goto out_put;
 
-	new_ns = create_pid_namespace(old_ns->level + 1);
+	new_ns = create_pid_namespace(old_ns);
 	if (!IS_ERR(new_ns))
 		new_ns->parent = get_pid_ns(old_ns);
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c5ef44f..8af16bd 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -48,6 +48,7 @@
 #include <linux/acpi.h>
 #include <linux/reboot.h>
 #include <linux/ftrace.h>
+#include <linux/pid_namespace.h>
 
 #include <asm/uaccess.h>
 #include <asm/processor.h>
@@ -74,7 +75,6 @@ extern int max_threads;
 extern int core_uses_pid;
 extern int suid_dumpable;
 extern char core_pattern[];
-extern int pid_max;
 extern int min_free_kbytes;
 extern int pid_max_min, pid_max_max;
 extern int sysctl_drop_caches;
@@ -643,7 +643,7 @@ static struct ctl_table kern_table[] = {
 	{
 		.ctl_name	= KERN_PIDMAX,
 		.procname	= "pid_max",
-		.data		= &pid_max,
+		.data		= &init_pid_ns.pid_max,
 		.maxlen		= sizeof (int),
 		.mode		= 0644,
 		.proc_handler	= &proc_dointvec_minmax,
-- 
1.5.6.3

  parent reply	other threads:[~2009-03-21 14:51 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-12 17:56 [PATCH] [RFC] c/r: Add UTS support Dan Smith
     [not found] ` <1236880612-15316-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-12 21:29   ` Nathan Lynch
2009-03-12 21:56     ` Dan Smith
     [not found]       ` <87fxhipfrh.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-12 22:48         ` Serge E. Hallyn
     [not found]           ` <20090312224820.GA12723-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
2009-03-12 22:56             ` Dan Smith
     [not found]               ` <87bps6pcyf.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-13  0:12                 ` Serge E. Hallyn
2009-03-18  8:27                 ` Oren Laadan
     [not found]                   ` <49C0B069.6060300-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-03-18  9:01                     ` Cedric Le Goater
2009-03-18 13:49                     ` Serge E. Hallyn
     [not found]                       ` <20090318134932.GC22636-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-18 14:04                         ` Dan Smith
     [not found]                           ` <878wn353mf.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-18 15:46                             ` Cedric Le Goater
     [not found]                               ` <49C1175F.9060600-GANU6spQydw@public.gmane.org>
2009-03-18 15:55                                 ` Dan Smith
     [not found]                                   ` <874oxq6d1x.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-18 16:02                                     ` Cedric Le Goater
2009-03-18 19:50                                 ` Mike Waychison
     [not found]                                   ` <49C1506C.1080500-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-19  0:10                                     ` Eric W. Biederman
     [not found]                                       ` <m1bprye5io.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-19  0:46                                         ` Mike Waychison
     [not found]                                           ` <49C195CF.1080506-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-19  1:06                                             ` Eric W. Biederman
     [not found]                                               ` <m1ab7icodl.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-19  1:51                                                 ` Mike Waychison
     [not found]                                                   ` <49C1A52D.4000503-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-19  3:28                                                     ` Eric W. Biederman
     [not found]                                                       ` <m1iqm6xkc7.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-20 17:26                                                         ` Serge E. Hallyn
     [not found]                                                           ` <20090320172616.GA7203-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-20 19:51                                                             ` Mike Waychison
     [not found]                                                               ` <49C3F3C0.30100-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-20 20:40                                                                 ` Serge E. Hallyn
2009-03-20 20:53                                                                 ` Oren Laadan
2009-03-20 23:26                                                             ` Eric W. Biederman
     [not found]                                                               ` <m1d4cb3he5.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-21  2:38                                                                 ` Serge E. Hallyn
     [not found]                                                                   ` <20090321023834.GA21064-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
2009-03-21  3:39                                                                     ` Eric W. Biederman
     [not found]                                                                       ` <m1prgbzgqq.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-21 14:51                                                                         ` Serge E. Hallyn [this message]
2009-03-12 22:48         ` Daniel Lezcano
     [not found]           ` <49B99144.9000106-GANU6spQydw@public.gmane.org>
2009-03-12 22:58             ` Dan Smith
     [not found]               ` <877i2upcvo.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-12 23:11                 ` Daniel Lezcano
     [not found]                   ` <49B996BC.1090908-GANU6spQydw@public.gmane.org>
2009-03-12 23:13                     ` Dan Smith
     [not found]                       ` <873adipc5l.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-12 23:24                         ` Daniel Lezcano
     [not found]                           ` <49B999A6.2000005-GANU6spQydw@public.gmane.org>
2009-03-13 15:30                             ` Serge E. Hallyn
     [not found]                               ` <20090313153004.GA8317-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-13 15:51                                 ` Daniel Lezcano
     [not found]                                   ` <49BA811C.4070302-GANU6spQydw@public.gmane.org>
2009-03-13 17:15                                     ` Serge E. Hallyn
     [not found]                                       ` <20090313171556.GB10685-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-13 17:53                                         ` Daniel Lezcano
     [not found]                                           ` <49BA9D9C.2030208-GANU6spQydw@public.gmane.org>
2009-03-25 12:01                                             ` Eric W. Biederman
2009-03-13 15:59                         ` Cedric Le Goater
     [not found]                           ` <49BA82CE.4090206-GANU6spQydw@public.gmane.org>
2009-03-13 16:04                             ` Daniel Lezcano
2009-03-18  8:32             ` Oren Laadan
2009-03-18  8:35   ` Oren Laadan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090321145100.GA23058@hallyn.com \
    --to=serge-a9i7lubdfnhqt0dzr+alfa@public.gmane.org \
    --cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
    --cc=danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    --cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
    --cc=ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.