From: Andrew Morton <akpm@linux-foundation.org>
To: Andrew Vagin <avagin@openvz.org>
Cc: linux-kernel@vger.kernel.org, Oleg Nesterov <oleg@redhat.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Pavel Emelyanov <xemul@parallels.com>
Subject: Re: [PATCH] pidns: limit the nesting depth of pid namespaces
Date: Tue, 23 Oct 2012 16:56:51 -0700 [thread overview]
Message-ID: <20121023165651.88af399d.akpm@linux-foundation.org> (raw)
In-Reply-To: <1350045042-1369134-1-git-send-email-avagin@openvz.org>
On Fri, 12 Oct 2012 16:30:42 +0400
Andrew Vagin <avagin@openvz.org> wrote:
> 'struct pid' is a "variable sized struct" - a header with an array
> of upids at the end.
>
> A size of the array depends on a level (depth) of pid namespaces. Now
> a level of pidns is not limited, so 'struct pid' can be more than one
> page.
>
> Looks reasonable, that it should be less than a page. MAX_PIS_NS_LEVEL
> is not calculated from PAGE_SIZE, because in this case it depends on
> architectures, config options and it will be reduced, if someone adds a
> new fields in struct pid or struct upid.
>
> I suggest to set MAX_PIS_NS_LEVEL = 32, because it saves ability to
> expand "struct pid" and it's more than enough for all known for me
> use-cases. When someone finds a reasonable use case, we can add a
> config option or a sysctl parameter.
>
> In addition it will reduce effect of another problem, when we have many
> nested namespaces and the oldest one starts dying. zap_pid_ns_processe
> will be called for each namespace and find_vpid will be called for each
> process in a namespace. find_vpid will be called minimum max_level^2 / 2
> times. The reason of that is that when we found a bit in pidmap, we
> can't determine this pidns is top for this process or it isn't.
>
> vpid is a heavy operation, so a fork bomb, which create many nested
> namespace, can do a system inaccessible for a long time.
>
> --- a/kernel/pid_namespace.c
> +++ b/kernel/pid_namespace.c
> @@ -70,12 +70,18 @@ err_alloc:
> return NULL;
> }
>
> +/* MAX_PID_NS_LEVEL is needed for limiting size of 'struct pid' */
> +#define MAX_PID_NS_LEVEL 32
> +
> static struct pid_namespace *create_pid_namespace(struct pid_namespace *parent_pid_ns)
> {
> struct pid_namespace *ns;
> unsigned int level = parent_pid_ns->level + 1;
> int i, err = -ENOMEM;
>
> + if (level > MAX_PID_NS_LEVEL)
> + goto out;
> +
> ns = kmem_cache_zalloc(pid_ns_cachep, GFP_KERNEL);
> if (ns == NULL)
> goto out;
hm, OK. This will kind-of fix the free_pid_ns()-excessive-recursion
issue which we already fixed by other means ;)
I don't think this problem is serious enough to bother backporting into
-stable but I guess we can/should do it in 3.7. Agree?
I think that returning -ENOMEM in response to an excessive nesting
attempt is misleading - the system *didn't* run out of memory. EINVAL
is better?
From: Andrew Morton <akpm@linux-foundation.org>
Subject: pidns-limit-the-nesting-depth-of-pid-namespaces-fix
return -EINVAL in response to excessive nesting, not -ENOMEM
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
kernel/pid_namespace.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff -puN kernel/pid_namespace.c~pidns-limit-the-nesting-depth-of-pid-namespaces-fix kernel/pid_namespace.c
--- a/kernel/pid_namespace.c~pidns-limit-the-nesting-depth-of-pid-namespaces-fix
+++ a/kernel/pid_namespace.c
@@ -78,11 +78,15 @@ static struct pid_namespace *create_pid_
{
struct pid_namespace *ns;
unsigned int level = parent_pid_ns->level + 1;
- int i, err = -ENOMEM;
+ int i;
+ int err;
- if (level > MAX_PID_NS_LEVEL)
+ if (level > MAX_PID_NS_LEVEL) {
+ err = -EINVAL;
goto out;
+ }
+ err = -ENOMEM;
ns = kmem_cache_zalloc(pid_ns_cachep, GFP_KERNEL);
if (ns == NULL)
goto out;
_
next prev parent reply other threads:[~2012-10-23 23:57 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-10-12 12:30 [PATCH] pidns: limit the nesting depth of pid namespaces Andrew Vagin
2012-10-19 14:25 ` Andrey Wagin
2012-10-20 15:21 ` Oleg Nesterov
2012-10-23 23:56 ` Andrew Morton [this message]
2012-10-24 5:38 ` Andrey Wagin
2012-10-24 19:46 ` Andrew Morton
2012-10-25 15:02 ` Andrey Wagin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121023165651.88af399d.akpm@linux-foundation.org \
--to=akpm@linux-foundation.org \
--cc=avagin@openvz.org \
--cc=ebiederm@xmission.com \
--cc=gorcunov@openvz.org \
--cc=linux-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.