From: "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org,
Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
Nathan Lynch <ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH] [RFC] c/r: Add UTS support
Date: Sat, 21 Mar 2009 09:51:00 -0500 [thread overview]
Message-ID: <20090321145100.GA23058@hallyn.com> (raw)
In-Reply-To: <m1prgbzgqq.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> "Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org> writes:
>
> > Quoting Eric W. Biederman (ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org):
> >> > What is wrong with Alexey's patch, which simply passes in the values
> >> > themselves? Do you have another use in mind for the min/max pid
> >> > values?
> >>
> >> At an implementation level (and I need to look at Alexey's specific patch)
> >> every patch I have seen to date creates their own version of alloc_pidmap.
> >
> > You're right, Alexey's patch creates a new one.
> >
> >> alloc_pidmap already implicitly takes min/max and first value to try
> >> as parameters. RESERVED_PIDS, pid_max, and pid_ns->last_pid. So
> >> instead of rewriting alloc_pidmap we should just be able to refactor
> >> alloc_pidmap to take the requisite values. That should be less code
> >> and easier to maintain.
> >
> > Yeah, that sounds good actually. Thanks.
> >
> >> Looking at the current implementation we also have the issue that
> >> pid_max is not per pid namespace. Where it seems to belong.
> >
> > Eh. It does seem to, but otoh why give userspace knobs it has no use
> > for... Or, can you think of a case where it'd be useful?
>
> In general the number of usable pid numbers should be larger in the outer
> pid namespace than in the child pid namespace. Otherwise it is possible
> for the child to eat all of the possible pid numbers.
>
> So I think it would be advantageous for to make containers designed to migrate
> to have a small pid_max by default so we know we won't overwhelm others.
>
> Furthermore since pid_max is a limit on the identifiers allocated no on the
> number of processes it is very much a pid namespace property.
Right, I don't argue that it doesn't seem to belong there. Well if
you think people would use it, it does seem simple enough to do.
Untested (well compile-tested) patch below just for grins.
> >> > I think that's a good guideline, bad rule. Certainly possible
> >> > that you're right that this is just pointing to in-kernel
> >> > recreation of process tree as the way to go. I was getting
> >> > that feeling myself, but then there are still very good reasons
> >> > not to do that, as there are things which each task should do
> >> > before completing sys_restart() which are best done in userspace.
> >> > These include for instance creating virtual nics, and calling
> >> > Oren's suggested 'cr_advise()' system calls.
> >>
> >> You might be right. I am behind on that part of the conversation.
> >>
> >> My general concern is that dividing up the responsibilities between user space
> >> and kernel space seems harder to maintain, and refactor if we don't get something
> >> right the first time.
> >
> > So far we're actually still at the point where the code (Oren's set)
> > could go either way. A small patch from Alexey can make it swing toward
> > kernel, while Oren's mktree.c userspace restart program swings the other
> > way.
> >
> > And since we're punting on any nested namespaces it actually may stay that way
> > for awhile.
>
> Interesting. That sounds fairly fundamental. If I have some free time I will
> have to take a look. I'm in favor of a kernel/user space cooperation but I don't
> currently see the benefit of fork processes in user space.
All right I'll wait for you to take a look, rather than repeat
myself :) The biggest concern IMO is how to create complicated
resources (like a veth tunnel pair) in the kernel case.
thanks,
-serge
From 47303d729ec494add03fbddb47fac9a020d65f00 Mon Sep 17 00:00:00 2001
From: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Date: Sat, 21 Mar 2009 09:22:26 -0500
Subject: [PATCH 1/1] pid_ns: make pid_max a pid_ns property
Remove the pid_max global, and make it a property of the
pid_namespace. When a pid_ns is created, it inherits
the parent's pid_ns.
Fixing up sysctl (trivial akin to ipc version, but
potentially tedious to get right for all CONFIG*
combinations) is left for later.
Signed-off-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
include/linux/pid_namespace.h | 1 +
kernel/pid.c | 14 +++++++-------
kernel/pid_namespace.c | 6 ++++--
kernel/sysctl.c | 4 ++--
4 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 38d1032..fd7f497 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -30,6 +30,7 @@ struct pid_namespace {
#ifdef CONFIG_BSD_PROCESS_ACCT
struct bsd_acct_struct *bacct;
#endif
+ int pid_max;
};
extern struct pid_namespace init_pid_ns;
diff --git a/kernel/pid.c b/kernel/pid.c
index 1b3586f..898fa8b 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -43,8 +43,6 @@ static struct hlist_head *pid_hash;
static int pidhash_shift;
struct pid init_struct_pid = INIT_STRUCT_PID;
-int pid_max = PID_MAX_DEFAULT;
-
#define RESERVED_PIDS 300
int pid_max_min = RESERVED_PIDS + 1;
@@ -78,6 +76,7 @@ struct pid_namespace init_pid_ns = {
.last_pid = 0,
.level = 0,
.child_reaper = &init_task,
+ .pid_max = PID_MAX_DEFAULT,
};
EXPORT_SYMBOL_GPL(init_pid_ns);
@@ -128,11 +127,12 @@ static int alloc_pidmap(struct pid_namespace *pid_ns)
struct pidmap *map;
pid = last + 1;
- if (pid >= pid_max)
+ if (pid >= pid_ns->pid_max)
pid = RESERVED_PIDS;
offset = pid & BITS_PER_PAGE_MASK;
map = &pid_ns->pidmap[pid/BITS_PER_PAGE];
- max_scan = (pid_max + BITS_PER_PAGE - 1)/BITS_PER_PAGE - !offset;
+ max_scan = (pid_ns->pid_max + BITS_PER_PAGE - 1)/BITS_PER_PAGE
+ - !offset;
for (i = 0; i <= max_scan; ++i) {
if (unlikely(!map->page)) {
void *page = kzalloc(PAGE_SIZE, GFP_KERNEL);
@@ -164,11 +164,11 @@ static int alloc_pidmap(struct pid_namespace *pid_ns)
* bitmap block and the final block was the same
* as the starting point, pid is before last_pid.
*/
- } while (offset < BITS_PER_PAGE && pid < pid_max &&
- (i != max_scan || pid < last ||
+ } while (offset < BITS_PER_PAGE && pid < pid_ns->pid_max
+ && (i != max_scan || pid < last ||
!((last+1) & BITS_PER_PAGE_MASK)));
}
- if (map < &pid_ns->pidmap[(pid_max-1)/BITS_PER_PAGE]) {
+ if (map < &pid_ns->pidmap[(pid_ns->pid_max-1)/BITS_PER_PAGE]) {
++map;
offset = 0;
} else {
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index fab8ea8..1ba3970 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -67,15 +67,17 @@ err_alloc:
return NULL;
}
-static struct pid_namespace *create_pid_namespace(unsigned int level)
+static struct pid_namespace *create_pid_namespace(struct pid_namespace *old)
{
struct pid_namespace *ns;
+ unsigned int level = old->level + 1;
int i;
ns = kmem_cache_zalloc(pid_ns_cachep, GFP_KERNEL);
if (ns == NULL)
goto out;
+ ns->pid_max = old->pid_max;
ns->pidmap[0].page = kzalloc(PAGE_SIZE, GFP_KERNEL);
if (!ns->pidmap[0].page)
goto out_free;
@@ -125,7 +127,7 @@ struct pid_namespace *copy_pid_ns(unsigned long flags, struct pid_namespace *old
if (flags & CLONE_THREAD)
goto out_put;
- new_ns = create_pid_namespace(old_ns->level + 1);
+ new_ns = create_pid_namespace(old_ns);
if (!IS_ERR(new_ns))
new_ns->parent = get_pid_ns(old_ns);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c5ef44f..8af16bd 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -48,6 +48,7 @@
#include <linux/acpi.h>
#include <linux/reboot.h>
#include <linux/ftrace.h>
+#include <linux/pid_namespace.h>
#include <asm/uaccess.h>
#include <asm/processor.h>
@@ -74,7 +75,6 @@ extern int max_threads;
extern int core_uses_pid;
extern int suid_dumpable;
extern char core_pattern[];
-extern int pid_max;
extern int min_free_kbytes;
extern int pid_max_min, pid_max_max;
extern int sysctl_drop_caches;
@@ -643,7 +643,7 @@ static struct ctl_table kern_table[] = {
{
.ctl_name = KERN_PIDMAX,
.procname = "pid_max",
- .data = &pid_max,
+ .data = &init_pid_ns.pid_max,
.maxlen = sizeof (int),
.mode = 0644,
.proc_handler = &proc_dointvec_minmax,
--
1.5.6.3
next prev parent reply other threads:[~2009-03-21 14:51 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-12 17:56 [PATCH] [RFC] c/r: Add UTS support Dan Smith
[not found] ` <1236880612-15316-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-12 21:29 ` Nathan Lynch
2009-03-12 21:56 ` Dan Smith
[not found] ` <87fxhipfrh.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-12 22:48 ` Serge E. Hallyn
[not found] ` <20090312224820.GA12723-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
2009-03-12 22:56 ` Dan Smith
[not found] ` <87bps6pcyf.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-13 0:12 ` Serge E. Hallyn
2009-03-18 8:27 ` Oren Laadan
[not found] ` <49C0B069.6060300-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2009-03-18 9:01 ` Cedric Le Goater
2009-03-18 13:49 ` Serge E. Hallyn
[not found] ` <20090318134932.GC22636-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-18 14:04 ` Dan Smith
[not found] ` <878wn353mf.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-18 15:46 ` Cedric Le Goater
[not found] ` <49C1175F.9060600-GANU6spQydw@public.gmane.org>
2009-03-18 15:55 ` Dan Smith
[not found] ` <874oxq6d1x.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-18 16:02 ` Cedric Le Goater
2009-03-18 19:50 ` Mike Waychison
[not found] ` <49C1506C.1080500-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-19 0:10 ` Eric W. Biederman
[not found] ` <m1bprye5io.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-19 0:46 ` Mike Waychison
[not found] ` <49C195CF.1080506-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-19 1:06 ` Eric W. Biederman
[not found] ` <m1ab7icodl.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-19 1:51 ` Mike Waychison
[not found] ` <49C1A52D.4000503-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-19 3:28 ` Eric W. Biederman
[not found] ` <m1iqm6xkc7.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-20 17:26 ` Serge E. Hallyn
[not found] ` <20090320172616.GA7203-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-20 19:51 ` Mike Waychison
[not found] ` <49C3F3C0.30100-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2009-03-20 20:40 ` Serge E. Hallyn
2009-03-20 20:53 ` Oren Laadan
2009-03-20 23:26 ` Eric W. Biederman
[not found] ` <m1d4cb3he5.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-21 2:38 ` Serge E. Hallyn
[not found] ` <20090321023834.GA21064-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>
2009-03-21 3:39 ` Eric W. Biederman
[not found] ` <m1prgbzgqq.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-03-21 14:51 ` Serge E. Hallyn [this message]
2009-03-12 22:48 ` Daniel Lezcano
[not found] ` <49B99144.9000106-GANU6spQydw@public.gmane.org>
2009-03-12 22:58 ` Dan Smith
[not found] ` <877i2upcvo.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-12 23:11 ` Daniel Lezcano
[not found] ` <49B996BC.1090908-GANU6spQydw@public.gmane.org>
2009-03-12 23:13 ` Dan Smith
[not found] ` <873adipc5l.fsf-FLMGYpZoEPULwtHQx/6qkW3U47Q5hpJU@public.gmane.org>
2009-03-12 23:24 ` Daniel Lezcano
[not found] ` <49B999A6.2000005-GANU6spQydw@public.gmane.org>
2009-03-13 15:30 ` Serge E. Hallyn
[not found] ` <20090313153004.GA8317-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-13 15:51 ` Daniel Lezcano
[not found] ` <49BA811C.4070302-GANU6spQydw@public.gmane.org>
2009-03-13 17:15 ` Serge E. Hallyn
[not found] ` <20090313171556.GB10685-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-03-13 17:53 ` Daniel Lezcano
[not found] ` <49BA9D9C.2030208-GANU6spQydw@public.gmane.org>
2009-03-25 12:01 ` Eric W. Biederman
2009-03-13 15:59 ` Cedric Le Goater
[not found] ` <49BA82CE.4090206-GANU6spQydw@public.gmane.org>
2009-03-13 16:04 ` Daniel Lezcano
2009-03-18 8:32 ` Oren Laadan
2009-03-18 8:35 ` Oren Laadan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090321145100.GA23058@hallyn.com \
--to=serge-a9i7lubdfnhqt0dzr+alfa@public.gmane.org \
--cc=containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org \
--cc=danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
--cc=ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org \
--cc=ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox