* Re: [PATCH] Make core_pattern support namespace
[not found] ` <ce9a21e42a846150a2a482278f5b5a5e0ea27839.1454588184.git.zhaolei-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
@ 2016-02-16 14:26 ` Mateusz Guzik
0 siblings, 0 replies; 8+ messages in thread
From: Mateusz Guzik @ 2016-02-16 14:26 UTC (permalink / raw)
To: Zhao Lei
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
On Tue, Feb 16, 2016 at 07:33:39PM +0800, Zhao Lei wrote:
> Currently, each container shared one copy of coredump setting
> with the host system, if host system changed the setting, each
> running containers will be affected.
>
> Moreover, it is not easy to let each container keeping their own
> coredump setting.
>
> We can use some workaround as pipe program to make the second
> requirement possible, but it is not simple, and both host and
> container are limited to set to fixed pipe program.
> In one word, for host running contailer, we can't change core_pattern
> anymore.
> To make the problem more hard, if a host running more than one
> container product, each product will try to snatch the global
> coredump setting to fit their own requirement.
>
> For container based on namespace design, it is good to allow
> each container keeping their own coredump setting.
>
> It will bring us following benefit:
> 1: Each container can change their own coredump setting
> based on operation on /proc/sys/kernel/core_pattern
> 2: Coredump setting changed in host will not affect
> running containers.
> 3: Support both case of "putting coredump in guest" and
> "putting curedump in host".
>
> Each namespace-based software(lxc, docker, ..) can use this function
> to custom their dump setting.
>
> And this function makes each continer working as separate system,
> it fit for design goal of namespace.
>
Sorry if this is a false alarm, I don't have easy means to test it, but
is not this an immediate privilege escalation?
In particular, if the new pattern was to include a pipe, would not it
cause the kernel to run whatever the namespace put in there, but on the
"host"?
The feature definitely looks useful, but this is another privileged
operation which is now made available to unprivileged users. This poses
some questions:
- should the namespace be allowed to set anything it wants, including
pipes? I would argue the latter should be disallowed for simplicity
- now that the pattern can change at any time by namespace root, it
becomes fishy to parse it for filename generation without any
protection against concurrent modification
- how do you ensure that namespaces which did not explicitely set their
pattern still get updates from the host?
That said, I suggest adding an allocated buffer to hold it or be NULL
otherwise. If the pointer is NULL, the "host" pattern is used.
For dumping purposes the code could just kmalloc, rcu_read_lock, memcpy
and be done with it. Or, if it is acceptable given the size, just have a
buffer on the stack.
Finally, the code setting it could simply xchg the pointer to avoid
locks if there is no good lock to be taken here, and then kfree_rcu the
old buffer.
Just my $0,03.
--
Mateusz Guzik
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <20160216142608.GA16757@mguzik>]
* Re: [PATCH] Make core_pattern support namespace
[not found] ` <20160216142608.GA16757@mguzik>
@ 2016-02-17 20:15 ` Eric W. Biederman
[not found] ` <87k2m33xgj.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-02-18 11:13 ` Zhao Lei
1 sibling, 1 reply; 8+ messages in thread
From: Eric W. Biederman @ 2016-02-17 20:15 UTC (permalink / raw)
To: Mateusz Guzik
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Mateusz Guzik <mguzik-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> writes:
> On Tue, Feb 16, 2016 at 07:33:39PM +0800, Zhao Lei wrote:
>> Currently, each container shared one copy of coredump setting
>> with the host system, if host system changed the setting, each
>> running containers will be affected.
>>
>> Moreover, it is not easy to let each container keeping their own
>> coredump setting.
>>
>> We can use some workaround as pipe program to make the second
>> requirement possible, but it is not simple, and both host and
>> container are limited to set to fixed pipe program.
>> In one word, for host running contailer, we can't change core_pattern
>> anymore.
>> To make the problem more hard, if a host running more than one
>> container product, each product will try to snatch the global
>> coredump setting to fit their own requirement.
>>
>> For container based on namespace design, it is good to allow
>> each container keeping their own coredump setting.
>>
>> It will bring us following benefit:
>> 1: Each container can change their own coredump setting
>> based on operation on /proc/sys/kernel/core_pattern
>> 2: Coredump setting changed in host will not affect
>> running containers.
>> 3: Support both case of "putting coredump in guest" and
>> "putting curedump in host".
>>
>> Each namespace-based software(lxc, docker, ..) can use this function
>> to custom their dump setting.
>>
>> And this function makes each continer working as separate system,
>> it fit for design goal of namespace.
>>
>
> Sorry if this is a false alarm, I don't have easy means to test it, but
> is not this an immediate privilege escalation?
It is. This is why we do not currently have a per namespace setting.
Solving the user mode helper problem is technically a fair amount of
work, if not theoretically challenging.
Eric
^ permalink raw reply [flat|nested] 8+ messages in thread* RE: [PATCH] Make core_pattern support namespace
[not found] ` <20160216142608.GA16757@mguzik>
2016-02-17 20:15 ` Eric W. Biederman
@ 2016-02-18 11:13 ` Zhao Lei
1 sibling, 0 replies; 8+ messages in thread
From: Zhao Lei @ 2016-02-18 11:13 UTC (permalink / raw)
To: 'Mateusz Guzik'
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
Hi, Mateusz Guzik
Thanks for your detailed comment and suggestion on this patch.
> -----Original Message-----
> From: Mateusz Guzik [mailto:mguzik-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org]
> Sent: Tuesday, February 16, 2016 10:26 PM
> To: Zhao Lei <zhaolei-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
> Cc: linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> Subject: Re: [PATCH] Make core_pattern support namespace
>
> On Tue, Feb 16, 2016 at 07:33:39PM +0800, Zhao Lei wrote:
> > Currently, each container shared one copy of coredump setting
> > with the host system, if host system changed the setting, each
> > running containers will be affected.
> >
> > Moreover, it is not easy to let each container keeping their own
> > coredump setting.
> >
> > We can use some workaround as pipe program to make the second
> > requirement possible, but it is not simple, and both host and
> > container are limited to set to fixed pipe program.
> > In one word, for host running contailer, we can't change core_pattern
> > anymore.
> > To make the problem more hard, if a host running more than one
> > container product, each product will try to snatch the global
> > coredump setting to fit their own requirement.
> >
> > For container based on namespace design, it is good to allow
> > each container keeping their own coredump setting.
> >
> > It will bring us following benefit:
> > 1: Each container can change their own coredump setting
> > based on operation on /proc/sys/kernel/core_pattern
> > 2: Coredump setting changed in host will not affect
> > running containers.
> > 3: Support both case of "putting coredump in guest" and
> > "putting curedump in host".
> >
> > Each namespace-based software(lxc, docker, ..) can use this function
> > to custom their dump setting.
> >
> > And this function makes each continer working as separate system,
> > it fit for design goal of namespace.
> >
>
> Sorry if this is a false alarm, I don't have easy means to test it, but
> is not this an immediate privilege escalation?
>
> In particular, if the new pattern was to include a pipe, would not it
> cause the kernel to run whatever the namespace put in there, but on the
> "host"?
>
In my mind, if user don't run vm in with privilege request,
the container manager can do following processing:
1: User set custom core_pattern in starting container, as:
# lxc-start --core-pattern='xxx' -n vm01
or
# docker run --core-pattern='xxx' my_image
or
set CORE_PATTERN in vm's config file
2: Container manager(lxc or docker) mount /proc as rw in init period,
and set core_pattern.
3: Container manager remount /proc as ro, and give vm to user.
User/program in vm can not change core_pattern setting, just as lxc
and docker's current default behavor.
If user run vm with privilege request, it will be a problem.
User/process in vm can change core-pattern setting, if change to a file,
it is not a problem, because it is a file in vm's filesystem.
But if changed to a pipe, it is a problem in current kernel design,
the pipe program is in host's filesystem, it means the vm can change host.
In short:
Run vm in non-privilege | do anything | no problem
Run vm in privilege | change core_pattern to file | no problem
Run vm in privilege | change core_pattern to pipe | IS problem
As a solution, maybe we can modify kernel to search the pipe program
in vm's filesystem, and the pipe problem runs in vm's fs context,
so a vm can only put dump file in its private filesystem, except vm
manager link the dir into host.
Another solution is to only allow vm change core_pattern to file(avoid pipe),
but is looks strange, the vm should have same core_dump setting function
with normal system.
> The feature definitely looks useful, but this is another privileged
> operation which is now made available to unprivileged users. This poses
> some questions:
> - should the namespace be allowed to set anything it wants, including
> pipes? I would argue the latter should be disallowed for simplicity
Yes, current patch does allow, it can be fixed in above way.
> - now that the pattern can change at any time by namespace root, it
> becomes fishy to parse it for filename generation without any
> protection against concurrent modification
Before this patch, kernel can works corrent, so I think kernel already
have lock to avoid changing and reading the cure_pattern buffer in
same time.
This patch only modify the buffer place, so the existence lock will
still can work, I'll confirm it in source.
> - how do you ensure that namespaces which did not explicitely set their
> pattern still get updates from the host?
>
> That said, I suggest adding an allocated buffer to hold it or be NULL
> otherwise. If the pointer is NULL, the "host" pattern is used.
>
> For dumping purposes the code could just kmalloc, rcu_read_lock, memcpy
> and be done with it. Or, if it is acceptable given the size, just have a
> buffer on the stack.
>
> Finally, the code setting it could simply xchg the pointer to avoid
> locks if there is no good lock to be taken here, and then kfree_rcu the
> old buffer.
>
Actually I considered about this way in doing patch, as:
HOST core_pattern=foo
+- VM1 core_pattern=NULL
+- VM1_1 core_pattern=NULL
VM1 and VM1_1 use nearext core_pattern setting in tree(foo).
And if VM1_1 changed core pattern to bar:
HOST core_pattern=foo
+- VM1 core_pattern=NULL
+- VM1_1 core_pattern=bar
VM1 use foo and VM1_1 use bar.
It means vm can always set core_pattern buffer pointer to NULL in creating,
until someone changes it.
But I finally selected to clone core_pattern in creating vm, because if we
see a vm as a independent system, its core_pattern should only changed by
process running in it, and after a vm got created, the host will not able
to change vm's internal setting, except login/run_command in vm's context.
Thanks
Zhaolei
> Just my $0,03.
>
> --
> Mateusz Guzik
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Make core_pattern support namespace
@ 2016-02-16 11:33 Zhao Lei
0 siblings, 0 replies; 8+ messages in thread
From: Zhao Lei @ 2016-02-16 11:33 UTC (permalink / raw)
To: linux-kernel-u79uwXL29TY76Z2rM5mHXA
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA
Currently, each container shared one copy of coredump setting
with the host system, if host system changed the setting, each
running containers will be affected.
Moreover, it is not easy to let each container keeping their own
coredump setting.
We can use some workaround as pipe program to make the second
requirement possible, but it is not simple, and both host and
container are limited to set to fixed pipe program.
In one word, for host running contailer, we can't change core_pattern
anymore.
To make the problem more hard, if a host running more than one
container product, each product will try to snatch the global
coredump setting to fit their own requirement.
For container based on namespace design, it is good to allow
each container keeping their own coredump setting.
It will bring us following benefit:
1: Each container can change their own coredump setting
based on operation on /proc/sys/kernel/core_pattern
2: Coredump setting changed in host will not affect
running containers.
3: Support both case of "putting coredump in guest" and
"putting curedump in host".
Each namespace-based software(lxc, docker, ..) can use this function
to custom their dump setting.
And this function makes each continer working as separate system,
it fit for design goal of namespace.
Test(in lxc):
# In the host
# ----------------
# echo host_core >/proc/sys/kernel/core_pattern
# cat /proc/sys/kernel/core_pattern
host_core
# ulimit -c 1024000
# ./make_dump
Segmentation fault (core dumped)
# ls -l
-rw------- 1 root root 331776 Feb 4 18:02 host_core.2175
-rwxr-xr-x 1 root root 759731 Feb 4 18:01 make_dump
#
# In the container
# ----------------
# cat /proc/sys/kernel/core_pattern
host_core
# echo container_core >/proc/sys/kernel/core_pattern
# ./make_dump
Segmentation fault (core dumped)
# ls -l
-rwxr-xr-x 1 root root 759731 Feb 4 10:45 make_dump
-rw------- 1 root root 331776 Feb 4 10:45 container_core.16
#
# Return to host
# ----------------
# cat /proc/sys/kernel/core_pattern
host_core
# ls
host_core.2175 make_dump make_dump.c
# rm -f host_core.2175
# ./make_dump
Segmentation fault (core dumped)
# ls -l
-rw------- 1 root root 331776 Feb 4 18:49 host_core.2351
-rwxr-xr-x 1 root root 759731 Feb 4 18:01 make_dump
#
Signed-off-by: Zhao Lei <zhaolei-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
---
fs/coredump.c | 3 +--
include/linux/pid_namespace.h | 2 ++
kernel/pid.c | 1 +
kernel/pid_namespace.c | 3 +++
kernel/sysctl.c | 22 ++++++++++++++++------
5 files changed, 23 insertions(+), 8 deletions(-)
diff --git a/fs/coredump.c b/fs/coredump.c
index 9ea87e9..8a7ef9b 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -46,7 +46,6 @@
int core_uses_pid;
unsigned int core_pipe_limit;
-char core_pattern[CORENAME_MAX_SIZE] = "core";
static int core_name_size = CORENAME_MAX_SIZE;
struct core_name {
@@ -183,7 +182,7 @@ put_exe_file:
static int format_corename(struct core_name *cn, struct coredump_params *cprm)
{
const struct cred *cred = current_cred();
- const char *pat_ptr = core_pattern;
+ const char *pat_ptr = current->nsproxy->pid_ns_for_children->core_pattern;
int ispipe = (*pat_ptr == '|');
int pid_in_pattern = 0;
int err = 0;
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 918b117..a5af1e9 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -9,6 +9,7 @@
#include <linux/nsproxy.h>
#include <linux/kref.h>
#include <linux/ns_common.h>
+#include <linux/binfmts.h>
struct pidmap {
atomic_t nr_free;
@@ -45,6 +46,7 @@ struct pid_namespace {
int hide_pid;
int reboot; /* group exit code if this pidns was rebooted */
struct ns_common ns;
+ char core_pattern[CORENAME_MAX_SIZE];
};
extern struct pid_namespace init_pid_ns;
diff --git a/kernel/pid.c b/kernel/pid.c
index 4d73a83..c79c1d5 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -83,6 +83,7 @@ struct pid_namespace init_pid_ns = {
#ifdef CONFIG_PID_NS
.ns.ops = &pidns_operations,
#endif
+ .core_pattern = "core",
};
EXPORT_SYMBOL_GPL(init_pid_ns);
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index a65ba13..16d6d21 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -123,6 +123,9 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns
for (i = 1; i < PIDMAP_ENTRIES; i++)
atomic_set(&ns->pidmap[i].nr_free, BITS_PER_PAGE);
+ strncpy(ns->core_pattern, parent_pid_ns->core_pattern,
+ sizeof(ns->core_pattern));
+
return ns;
out_free_map:
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 97715fd..70f8af5 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -100,7 +100,6 @@
extern int suid_dumpable;
#ifdef CONFIG_COREDUMP
extern int core_uses_pid;
-extern char core_pattern[];
extern unsigned int core_pipe_limit;
#endif
extern int pid_max;
@@ -469,7 +468,7 @@ static struct ctl_table kern_table[] = {
},
{
.procname = "core_pattern",
- .data = core_pattern,
+ .data = NULL,
.maxlen = CORENAME_MAX_SIZE,
.mode = 0644,
.proc_handler = proc_dostring_coredump,
@@ -2301,6 +2300,8 @@ int proc_dointvec_minmax(struct ctl_table *table, int write,
static void validate_coredump_safety(void)
{
#ifdef CONFIG_COREDUMP
+ char *core_pattern = current->nsproxy->pid_ns_for_children->core_pattern;
+
if (suid_dumpable == SUID_DUMP_ROOT &&
core_pattern[0] != '/' && core_pattern[0] != '|') {
printk(KERN_WARNING "Unsafe core_pattern used with "\
@@ -2323,10 +2324,19 @@ static int proc_dointvec_minmax_coredump(struct ctl_table *table, int write,
static int proc_dostring_coredump(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
- int error = proc_dostring(table, write, buffer, lenp, ppos);
- if (!error)
- validate_coredump_safety();
- return error;
+ int ret;
+
+ if (write && *ppos && sysctl_writes_strict == SYSCTL_WRITES_WARN)
+ warn_sysctl_write(table);
+
+ ret = _proc_do_string(current->nsproxy->pid_ns_for_children->core_pattern,
+ table->maxlen, write,
+ (char __user *)buffer, lenp, ppos);
+ if (ret)
+ return ret;
+
+ validate_coredump_safety();
+ return 0;
}
#endif
--
1.8.5.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2016-02-19 10:24 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <ce9a21e42a846150a2a482278f5b5a5e0ea27839.1454588184.git.zhaolei@cn.fujitsu.com>
[not found] ` <ce9a21e42a846150a2a482278f5b5a5e0ea27839.1454588184.git.zhaolei-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2016-02-16 14:26 ` [PATCH] Make core_pattern support namespace Mateusz Guzik
[not found] ` <20160216142608.GA16757@mguzik>
2016-02-17 20:15 ` Eric W. Biederman
[not found] ` <87k2m33xgj.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-02-17 20:54 ` Mateusz Guzik
2016-02-18 11:59 ` Zhao Lei
2016-02-18 20:18 ` Eric W. Biederman
[not found] ` <877fi122o4.fsf-JOvCrm2gF+uungPnsOpG7nhyD016LWXt@public.gmane.org>
2016-02-19 10:24 ` Zhao Lei
2016-02-18 11:13 ` Zhao Lei
2016-02-16 11:33 Zhao Lei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox