* Re: [RFC PATCH 5/5] Add a Signal Control Group Subsystem
[not found] ` <20080423142518.703428301@us.ibm.com>
@ 2008-04-23 15:17 ` Cedric Le Goater
2008-04-23 15:37 ` Paul Menage
0 siblings, 1 reply; 11+ messages in thread
From: Cedric Le Goater @ 2008-04-23 15:17 UTC (permalink / raw)
To: Matt Helsley
Cc: Linux-Kernel, Paul Menage, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers
Hello Matt !
> Add a signal control group subsystem that allows us to send signals to all tasks
> in the control group by writing the desired signal(7) number to the kill file.
>
> NOTE: We don't really need per-cgroup state, but control groups doesn't support
> stateless subsystems yet.
>
> Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> ---
> include/linux/cgroup_signal.h | 28 +++++++++
> include/linux/cgroup_subsys.h | 6 +
> init/Kconfig | 6 +
> kernel/Makefile | 1
> kernel/cgroup_signal.c | 129 ++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 170 insertions(+)
I think there is a small race with new tasks entering the cgroup
while it's being killed, and a _fork op would handle that, no?
Thanks,
C.
* Re: [RFC PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-23 15:17 ` [RFC PATCH 5/5] Add a Signal Control Group Subsystem Cedric Le Goater
@ 2008-04-23 15:37 ` Paul Menage
2008-04-24 7:00 ` Matt Helsley
0 siblings, 1 reply; 11+ messages in thread
From: Paul Menage @ 2008-04-23 15:37 UTC (permalink / raw)
To: Cedric Le Goater
Cc: Matt Helsley, Linux-Kernel, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers
On Wed, Apr 23, 2008 at 8:17 AM, Cedric Le Goater <clg@fr.ibm.com> wrote:
> Hello Matt !
>
> > Add a signal control group subsystem that allows us to send signals to all tasks
> > in the control group by writing the desired signal(7) number to the kill file.
> >
> > NOTE: We don't really need per-cgroup state, but control groups doesn't support
> > stateless subsystems yet.
> >
> > Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> > ---
> > include/linux/cgroup_signal.h | 28 +++++++++
> > include/linux/cgroup_subsys.h | 6 +
> > init/Kconfig | 6 +
> > kernel/Makefile | 1
> > kernel/cgroup_signal.c | 129 ++++++++++++++++++++++++++++++++++++++++++
> > 5 files changed, 170 insertions(+)
>
>
> I think there is a small race with new tasks entering the cgroup
> while it's being killed, and a _fork op would handle that, no?
>
I never saw the actual patch (what lists did it go out to?) but I
suspect that this is one of those operations that's just going to be
inherently racy, and that the API should guarantee to affect all tasks
that are members of the group for the entirety of the operation, but
with no guarantees about what happens to tasks that enter or leave in
the meantime.
Paul
* [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-24 6:47 [RFC][PATCH 0/5] Container Freezer: Reuse Suspend Freezer Matt Helsley
@ 2008-04-24 6:48 ` Matt Helsley
2008-04-24 19:30 ` Paul Jackson
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Matt Helsley @ 2008-04-24 6:48 UTC (permalink / raw)
To: Linux-Kernel
Cc: Cedric Le Goater, Paul Menage, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers
[-- Attachment #1: cgroup-signal/cgroup-signal-implement-signal-subsystem.patch --]
[-- Type: text/plain, Size: 6112 bytes --]
Add a signal control group subsystem that allows us to send signals to all tasks
in the control group by writing the desired signal(7) number to the kill file.
NOTE: We don't really need per-cgroup state, but control groups doesn't support
stateless subsystems yet.
Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
---
include/linux/cgroup_signal.h | 28 +++++++++
include/linux/cgroup_subsys.h | 6 +
init/Kconfig | 6 +
kernel/Makefile | 1
kernel/cgroup_signal.c | 129 ++++++++++++++++++++++++++++++++++++++++++
5 files changed, 170 insertions(+)
Index: linux-2.6.25-mm1/include/linux/cgroup_signal.h
===================================================================
--- /dev/null
+++ linux-2.6.25-mm1/include/linux/cgroup_signal.h
@@ -0,0 +1,28 @@
+#ifndef _LINUX_CGROUP_SIGNAL_H
+#define _LINUX_CGROUP_SIGNAL_H
+/*
+ * cgroup_signal.h - control group freezer subsystem interface
+ *
+ * Copyright IBM Corp. 2007
+ *
+ * Author : Cedric Le Goater <clg@fr.ibm.com>
+ * Author : Matt Helsley <matthltc@us.ibm.com>
+ */
+
+#include <linux/cgroup.h>
+
+#ifdef CONFIG_CGROUP_SIGNAL
+
+struct stateless {
+	struct cgroup_subsys_state css;
+};
+
+static inline struct stateless *cgroup_signal(struct cgroup *cgroup)
+{
+	return container_of(cgroup_subsys_state(cgroup, signal_subsys_id),
+			    struct stateless, css);
+}
+
+#else /* !CONFIG_CGROUP_SIGNAL */
+#endif /* !CONFIG_CGROUP_SIGNAL */
+#endif /* _LINUX_CGROUP_SIGNAL_H */
Index: linux-2.6.25-mm1/kernel/cgroup_signal.c
===================================================================
--- /dev/null
+++ linux-2.6.25-mm1/kernel/cgroup_signal.c
@@ -0,0 +1,129 @@
+/*
+ * cgroup_signal.c - control group signal subsystem
+ *
+ * Copyright IBM Corp. 2007
+ *
+ * Author : Cedric Le Goater <clg@fr.ibm.com>
+ * Author : Matt Helsley <matthltc@us.ibm.com>
+ */
+
+#include <linux/module.h>
+#include <linux/cgroup.h>
+#include <linux/fs.h>
+#include <linux/uaccess.h>
+#include <linux/cgroup_signal.h>
+
+struct cgroup_subsys signal_subsys;
+
+static struct cgroup_subsys_state *signal_create(
+		struct cgroup_subsys *ss, struct cgroup *cgroup)
+{
+	struct stateless *dummy;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return ERR_PTR(-EPERM);
+
+	dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
+	if (!dummy)
+		return ERR_PTR(-ENOMEM);
+	return &dummy->css;
+}
+
+static void signal_destroy(struct cgroup_subsys *ss,
+			   struct cgroup *cgroup)
+{
+	kfree(cgroup_signal(cgroup));
+}
+
+
+static int signal_can_attach(struct cgroup_subsys *ss,
+			     struct cgroup *new_cgroup,
+			     struct task_struct *task)
+{
+	return 0;
+}
+
+static int signal_kill(struct cgroup *cgroup, int signum)
+{
+	struct cgroup_iter it;
+	struct task_struct *task;
+	int retval = 0;
+
+	cgroup_iter_start(cgroup, &it);
+	while ((task = cgroup_iter_next(cgroup, &it))) {
+		retval = send_sig(signum, task, 1);
+		if (retval)
+			break;
+	}
+	cgroup_iter_end(cgroup, &it);
+
+	return retval;
+}
+
+static ssize_t signal_write(struct cgroup *cgroup,
+			    struct cftype *cft,
+			    struct file *file,
+			    const char __user *userbuf,
+			    size_t nbytes, loff_t *unused_ppos)
+{
+	char *buffer;
+	int retval = 0;
+	int value;
+
+	if (nbytes >= PATH_MAX)
+		return -E2BIG;
+
+	/* +1 for nul-terminator */
+	buffer = kmalloc(nbytes + 1, GFP_KERNEL);
+	if (buffer == NULL)
+		return -ENOMEM;
+
+	if (copy_from_user(buffer, userbuf, nbytes)) {
+		retval = -EFAULT;
+		goto free_buffer;
+	}
+	buffer[nbytes] = 0; /* nul-terminate */
+	if (sscanf(buffer, "%d", &value) != 1) {
+		retval = -EIO;
+		goto free_buffer;
+	}
+
+	cgroup_lock();
+
+	if (cgroup_is_removed(cgroup)) {
+		retval = -ENODEV;
+		goto unlock;
+	}
+
+	retval = signal_kill(cgroup, value);
+	if (retval == 0)
+		retval = nbytes;
+unlock:
+	cgroup_unlock();
+free_buffer:
+	kfree(buffer);
+	return retval;
+}
+
+static struct cftype kill_file = {
+	.name = "kill",
+	.write = signal_write,
+	.private = 0,
+};
+
+static int signal_populate(struct cgroup_subsys *ss, struct cgroup *cgroup)
+{
+	return cgroup_add_files(cgroup, ss, &kill_file, 1);
+}
+
+struct cgroup_subsys signal_subsys = {
+	.name = "signal",
+	.create = signal_create,
+	.destroy = signal_destroy,
+	.populate = signal_populate,
+	.subsys_id = signal_subsys_id,
+	.can_attach = signal_can_attach,
+	.attach = NULL,
+	.fork = NULL,
+	.exit = NULL,
+};
Index: linux-2.6.25-mm1/init/Kconfig
===================================================================
--- linux-2.6.25-mm1.orig/init/Kconfig
+++ linux-2.6.25-mm1/init/Kconfig
@@ -328,10 +328,16 @@ config CGROUP_FREEZER
depends on CGROUPS
help
Provides a way to freeze and unfreeze all tasks in a
cgroup
+config CGROUP_SIGNAL
+ bool "control group signal subsystem"
+ depends on CGROUPS
+ help
+ Provides a way to signal all tasks in a cgroup
+
config FAIR_GROUP_SCHED
bool "Group scheduling for SCHED_OTHER"
depends on GROUP_SCHED
default y
Index: linux-2.6.25-mm1/kernel/Makefile
===================================================================
--- linux-2.6.25-mm1.orig/kernel/Makefile
+++ linux-2.6.25-mm1/kernel/Makefile
@@ -47,10 +47,11 @@ obj-$(CONFIG_KEXEC) += kexec.o
obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
obj-$(CONFIG_COMPAT) += compat.o
obj-$(CONFIG_CGROUPS) += cgroup.o
obj-$(CONFIG_CGROUP_DEBUG) += cgroup_debug.o
obj-$(CONFIG_CGROUP_FREEZER) += cgroup_freezer.o
+obj-$(CONFIG_CGROUP_SIGNAL) += cgroup_signal.o
obj-$(CONFIG_CPUSETS) += cpuset.o
obj-$(CONFIG_CGROUP_NS) += ns_cgroup.o
obj-$(CONFIG_UTS_NS) += utsname.o
obj-$(CONFIG_USER_NS) += user_namespace.o
obj-$(CONFIG_PID_NS) += pid_namespace.o
Index: linux-2.6.25-mm1/include/linux/cgroup_subsys.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/cgroup_subsys.h
+++ linux-2.6.25-mm1/include/linux/cgroup_subsys.h
@@ -52,5 +52,11 @@ SUBSYS(devices)
#ifdef CONFIG_CGROUP_FREEZER
SUBSYS(freezer)
#endif
/* */
+
+#ifdef CONFIG_CGROUP_SIGNAL
+SUBSYS(signal)
+#endif
+
+/* */
--
* Re: [RFC PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-23 15:37 ` Paul Menage
@ 2008-04-24 7:00 ` Matt Helsley
0 siblings, 0 replies; 11+ messages in thread
From: Matt Helsley @ 2008-04-24 7:00 UTC (permalink / raw)
To: Paul Menage
Cc: Cedric Le Goater, Linux-Kernel, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers
On Wed, 2008-04-23 at 08:37 -0700, Paul Menage wrote:
> On Wed, Apr 23, 2008 at 8:17 AM, Cedric Le Goater <clg@fr.ibm.com> wrote:
> > Hello Matt !
> >
> > > Add a signal control group subsystem that allows us to send signals to all tasks
> > > in the control group by writing the desired signal(7) number to the kill file.
> > >
> > > NOTE: We don't really need per-cgroup state, but control groups doesn't support
> > > stateless subsystems yet.
> > >
> > > Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> > > ---
> > > include/linux/cgroup_signal.h | 28 +++++++++
> > > include/linux/cgroup_subsys.h | 6 +
> > > init/Kconfig | 6 +
> > > kernel/Makefile | 1
> > > kernel/cgroup_signal.c | 129 ++++++++++++++++++++++++++++++++++++++++++
> > > 5 files changed, 170 insertions(+)
> >
> >
> > I think there is a small race with new tasks entering the cgroup
> > while it's being killed, and a _fork op would handle that, no?
> >
>
> I never saw the actual patch (what lists did it go out to?) but I
Hi Paul,
Sorry about this. MTA issues again :(. I think I've gotten them fixed
*this* time. I've resent them and you should have received your own copy
this time. Please let me know if you still aren't receiving them.
Cheers,
-Matt
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-24 6:48 ` [RFC][PATCH 5/5] Add a Signal Control Group Subsystem Matt Helsley
@ 2008-04-24 19:30 ` Paul Jackson
2008-04-30 7:48 ` Matt Helsley
2008-04-25 6:01 ` Paul Menage
2008-04-25 11:41 ` Cedric Le Goater
2 siblings, 1 reply; 11+ messages in thread
From: Paul Jackson @ 2008-04-24 19:30 UTC (permalink / raw)
To: Matt Helsley
Cc: linux-kernel, containers, clg, pavel, menage, torvalds, linux-pm
> +static struct cftype kill_file = {
> + .name = "kill",
The name "kill" seems ambiguous to me. It suggests that any write
will send some default signal (TERM or KILL?) to all tasks in the
cgroup, rather like the 'killall' command.
I'm guessing that more people, on seeing this file in a cgroup
directory, will guess correctly what it does if it were named
"signal" or "send_signal" or some such.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-24 6:48 ` [RFC][PATCH 5/5] Add a Signal Control Group Subsystem Matt Helsley
2008-04-24 19:30 ` Paul Jackson
@ 2008-04-25 6:01 ` Paul Menage
2008-04-30 8:29 ` Matt Helsley
2008-04-25 11:41 ` Cedric Le Goater
2 siblings, 1 reply; 11+ messages in thread
From: Paul Menage @ 2008-04-25 6:01 UTC (permalink / raw)
To: Matt Helsley
Cc: Linux-Kernel, Cedric Le Goater, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers
I don't think you need cgroup_signal.h. It's only included in
cgroup_signal.c, and doesn't really contain any useful definitions
anyway. You should just use a cgroup_subsys_state object as your state
object, since you'll never need to do anything with it anyway.
>+static struct cgroup_subsys_state *signal_create(
>+ struct cgroup_subsys *ss, struct cgroup *cgroup)
>+{
>+ struct stateless *dummy;
>+
>+ if (!capable(CAP_SYS_ADMIN))
>+ return ERR_PTR(-EPERM);
This is unnecessary.
>+
+ dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
+ if (!dummy)
+ return ERR_PTR(-ENOMEM);
+ return &dummy->css;
+}
This function could be simplified to:
struct cgroup_subsys_state *css;
css = kzalloc(sizeof(*css), GFP_KERNEL);
return css ?: ERR_PTR(-ENOMEM);
>+static int signal_can_attach(struct cgroup_subsys *ss,
>+ struct cgroup *new_cgroup,
>+ struct task_struct *task)
>+{
>+ return 0;
>+}
No need for a can_attach() method if it just returns 0 - that's the default.
>+static int signal_kill(struct cgroup *cgroup, int signum)
>+{
>+ struct cgroup_iter it;
>+ struct task_struct *task;
>+ int retval = 0;
>+
>+ cgroup_iter_start(cgroup, &it);
>+ while ((task = cgroup_iter_next(cgroup, &it))) {
>+ retval = send_sig(signum, task, 1);
>+ if (retval)
>+ break;
>+ }
>+ cgroup_iter_end(cgroup, &it);
>+
>+ return retval;
>+}
cgroup_iter_start() takes a read lock - is send_sig() guaranteed not to sleep?
>+static ssize_t signal_write(struct cgroup *cgroup,
>+ struct cftype *cft,
>+ struct file *file,
>+ const char __user *userbuf,
>+ size_t nbytes, loff_t *unused_ppos)
This should just be a write_u64() method - cgroups will handle the
copying/parsing for you. See e.g.
kernel/sched.c:cpu_shares_write_u64()
>+static struct cftype kill_file = {
>+ .name = "kill",
>+ .write = signal_write,
>+ .private = 0,
>+};
I agree with PaulJ that "signal.send" would be a nicer name for this
than "signal.kill"
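For illustration, the write_u64() suggestion above splits the work in two: the cgroup core copies and parses the number, and the subsystem only checks the value it is handed. A rough userspace model of that split follows. It is not kernel code: model_parse_u64(), model_signal_write_u64() and MODEL_NSIG are names invented here, 64 is only assumed as a typical signal count, and the range check is an addition (the posted patch passes the parsed value straight to send_sig()).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NSIG 64	/* assumption: typical Linux signal count */

/* Stand-in for the core's parsing step: decimal text -> u64. */
int model_parse_u64(const char *buf, uint64_t *val)
{
	char *end;

	if (*buf < '0' || *buf > '9')
		return -EINVAL;	/* strtoull() alone would accept "-1" */
	errno = 0;
	*val = strtoull(buf, &end, 10);
	if (errno)
		return -errno;
	if (*end == '\n')	/* tolerate the newline that echo appends */
		end++;
	return *end == '\0' ? 0 : -EINVAL;
}

/* Stand-in for the subsystem handler: it only sees the parsed number. */
int model_signal_write_u64(uint64_t val, int *signum_out)
{
	if (val == 0 || val > MODEL_NSIG)
		return -EINVAL;
	*signum_out = (int)val;	/* a real handler would signal the tasks here */
	return 0;
}
```

With this shape the handler no longer needs its own buffer allocation, copy_from_user() or sscanf() error paths.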
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-24 6:48 ` [RFC][PATCH 5/5] Add a Signal Control Group Subsystem Matt Helsley
2008-04-24 19:30 ` Paul Jackson
2008-04-25 6:01 ` Paul Menage
@ 2008-04-25 11:41 ` Cedric Le Goater
2008-04-30 18:44 ` Matt Helsley
2 siblings, 1 reply; 11+ messages in thread
From: Cedric Le Goater @ 2008-04-25 11:41 UTC (permalink / raw)
To: Matt Helsley
Cc: Linux-Kernel, Linux Containers, Pavel Machek, Paul Menage,
Linus Torvalds, linux-pm
Matt Helsley wrote:
> Add a signal control group subsystem that allows us to send signals to all tasks
> in the control group by writing the desired signal(7) number to the kill file.
>
> NOTE: We don't really need per-cgroup state, but control groups doesn't support
> stateless subsystems yet.
>
> Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> ---
> include/linux/cgroup_signal.h | 28 +++++++++
> include/linux/cgroup_subsys.h | 6 +
> init/Kconfig | 6 +
> kernel/Makefile | 1
> kernel/cgroup_signal.c | 129 ++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 170 insertions(+)
>
> Index: linux-2.6.25-mm1/include/linux/cgroup_signal.h
> ===================================================================
> --- /dev/null
> +++ linux-2.6.25-mm1/include/linux/cgroup_signal.h
> @@ -0,0 +1,28 @@
> +#ifndef _LINUX_CGROUP_SIGNAL_H
> +#define _LINUX_CGROUP_SIGNAL_H
> +/*
> + * cgroup_signal.h - control group freezer subsystem interface
s/freezer/signal/
> + *
> + * Copyright IBM Corp. 2007
> + *
> + * Author : Cedric Le Goater <clg@fr.ibm.com>
> + * Author : Matt Helsley <matthltc@us.ibm.com>
> + */
> +
> +#include <linux/cgroup.h>
> +
> +#ifdef CONFIG_CGROUP_SIGNAL
> +
> +struct stateless {
> + struct cgroup_subsys_state css;
> +};
I'm not sure it's correct to call this stateless. Imagine you want to send
a SIGKILL to a cgroup: you would expect all tasks to die and the
cgroup to become empty, right?
But if a task is doing clone() while it's being killed by this cgroup
signal subsystem, we can miss the child. This is because there's a
small window in copy_process() where the child is in the cgroup but
not yet visible.
    copy_process()
        cgroup_fork()
        do stuff
        cgroup_fork_callbacks()
        cgroup_post_fork()
            put new task in the list.
(I didn't dig too deeply into the code, though, so I might be missing
something.)
So if we want to send the signal to all tasks in the cgroup, we need
to track the new tasks with a fork callback, just like the freezer :
static void signal_fork(struct cgroup_subsys *ss, struct task_struct *task)
{
}
and, of course, we need to store the signal number to send somewhere.
All this depends on how we want the cgroup signal subsystem to behave.
It could be brainless of course, but it seems to me that the biggest
benefit of such a subsystem is to use the cgroup capability to track
new tasks coming in.
Cheers,
C.
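To make the fork-callback idea concrete, here is a rough userspace model, not kernel code. It assumes the per-cgroup state is just the signal number currently being sent; struct model_group, struct model_task and the model_*() functions are invented for illustration.

```c
#include <stddef.h>

#define MODEL_MAX_TASKS 16

struct model_task {
	int last_signal;	/* 0 = nothing delivered yet */
};

struct model_group {
	int pending_signal;	/* per-cgroup state: 0 = no send in progress */
	struct model_task *tasks[MODEL_MAX_TASKS];
	size_t ntasks;		/* only tasks already linked into the list */
};

/* Begin a send: record the signal, then walk the visible tasks. */
void model_send_begin(struct model_group *g, int signum)
{
	size_t i;

	g->pending_signal = signum;
	for (i = 0; i < g->ntasks; i++)
		g->tasks[i]->last_signal = signum;
}

/* End a send: children forked later are no longer signalled. */
void model_send_end(struct model_group *g)
{
	g->pending_signal = 0;
}

/* Fork callback: a child linked in mid-send still gets the signal. */
int model_fork(struct model_group *g, struct model_task *child)
{
	if (g->ntasks == MODEL_MAX_TASKS)
		return -1;
	if (g->pending_signal)
		child->last_signal = g->pending_signal;
	g->tasks[g->ntasks++] = child;
	return 0;
}
```

A child forked after model_send_end() is not signalled in this model; whether that matters depends on the semantics we want.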
> +static inline struct stateless *cgroup_signal(struct cgroup *cgroup)
> +{
> + return container_of(cgroup_subsys_state(cgroup, signal_subsys_id),
> + struct stateless, css);
> +}
> +
> +#else /* !CONFIG_CGROUP_SIGNAL */
> +#endif /* !CONFIG_CGROUP_SIGNAL */
> +#endif /* _LINUX_CGROUP_SIGNAL_H */
> Index: linux-2.6.25-mm1/kernel/cgroup_signal.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6.25-mm1/kernel/cgroup_signal.c
> @@ -0,0 +1,129 @@
> +/*
> + * cgroup_signal.c - control group signal subsystem
> + *
> + * Copyright IBM Corp. 2007
> + *
> + * Author : Cedric Le Goater <clg@fr.ibm.com>
> + * Author : Matt Helsley <matthltc@us.ibm.com>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/cgroup.h>
> +#include <linux/fs.h>
> +#include <linux/uaccess.h>
> +#include <linux/cgroup_signal.h>
> +
> +struct cgroup_subsys signal_subsys;
> +
> +static struct cgroup_subsys_state *signal_create(
> + struct cgroup_subsys *ss, struct cgroup *cgroup)
> +{
> + struct stateless *dummy;
> +
> + if (!capable(CAP_SYS_ADMIN))
> + return ERR_PTR(-EPERM);
> +
> + dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
> + if (!dummy)
> + return ERR_PTR(-ENOMEM);
> + return &dummy->css;
> +}
> +
> +static void signal_destroy(struct cgroup_subsys *ss,
> + struct cgroup *cgroup)
> +{
> + kfree(cgroup_signal(cgroup));
> +}
> +
> +
> +static int signal_can_attach(struct cgroup_subsys *ss,
> + struct cgroup *new_cgroup,
> + struct task_struct *task)
> +{
> + return 0;
> +}
> +
> +static int signal_kill(struct cgroup *cgroup, int signum)
> +{
> + struct cgroup_iter it;
> + struct task_struct *task;
> + int retval = 0;
> +
> + cgroup_iter_start(cgroup, &it);
> + while ((task = cgroup_iter_next(cgroup, &it))) {
> + retval = send_sig(signum, task, 1);
> + if (retval)
> + break;
> + }
> + cgroup_iter_end(cgroup, &it);
> +
> + return retval;
> +}
> +
> +static ssize_t signal_write(struct cgroup *cgroup,
> + struct cftype *cft,
> + struct file *file,
> + const char __user *userbuf,
> + size_t nbytes, loff_t *unused_ppos)
> +{
> + char *buffer;
> + int retval = 0;
> + int value;
> +
> + if (nbytes >= PATH_MAX)
> + return -E2BIG;
> +
> + /* +1 for nul-terminator */
> + buffer = kmalloc(nbytes + 1, GFP_KERNEL);
> + if (buffer == NULL)
> + return -ENOMEM;
> +
> + if (copy_from_user(buffer, userbuf, nbytes)) {
> + retval = -EFAULT;
> + goto free_buffer;
> + }
> + buffer[nbytes] = 0; /* nul-terminate */
> + if (sscanf(buffer, "%d", &value) != 1) {
> + retval = -EIO;
> + goto free_buffer;
> + }
> +
> + cgroup_lock();
> +
> + if (cgroup_is_removed(cgroup)) {
> + retval = -ENODEV;
> + goto unlock;
> + }
> +
> + retval = signal_kill(cgroup, value);
> + if (retval == 0)
> + retval = nbytes;
> +unlock:
> + cgroup_unlock();
> +free_buffer:
> + kfree(buffer);
> + return retval;
> +}
> +
> +static struct cftype kill_file = {
> + .name = "kill",
> + .write = signal_write,
> + .private = 0,
> +};
> +
> +static int signal_populate(struct cgroup_subsys *ss, struct cgroup *cgroup)
> +{
> + return cgroup_add_files(cgroup, ss, &kill_file, 1);
> +}
> +
> +struct cgroup_subsys signal_subsys = {
> + .name = "signal",
> + .create = signal_create,
> + .destroy = signal_destroy,
> + .populate = signal_populate,
> + .subsys_id = signal_subsys_id,
> + .can_attach = signal_can_attach,
> + .attach = NULL,
> + .fork = NULL,
> + .exit = NULL,
> +};
> Index: linux-2.6.25-mm1/init/Kconfig
> ===================================================================
> --- linux-2.6.25-mm1.orig/init/Kconfig
> +++ linux-2.6.25-mm1/init/Kconfig
> @@ -328,10 +328,16 @@ config CGROUP_FREEZER
> depends on CGROUPS
> help
> Provides a way to freeze and unfreeze all tasks in a
> cgroup
>
> +config CGROUP_SIGNAL
> + bool "control group signal subsystem"
> + depends on CGROUPS
> + help
> + Provides a way to signal all tasks in a cgroup
> +
> config FAIR_GROUP_SCHED
> bool "Group scheduling for SCHED_OTHER"
> depends on GROUP_SCHED
> default y
>
> Index: linux-2.6.25-mm1/kernel/Makefile
> ===================================================================
> --- linux-2.6.25-mm1.orig/kernel/Makefile
> +++ linux-2.6.25-mm1/kernel/Makefile
> @@ -47,10 +47,11 @@ obj-$(CONFIG_KEXEC) += kexec.o
> obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
> obj-$(CONFIG_COMPAT) += compat.o
> obj-$(CONFIG_CGROUPS) += cgroup.o
> obj-$(CONFIG_CGROUP_DEBUG) += cgroup_debug.o
> obj-$(CONFIG_CGROUP_FREEZER) += cgroup_freezer.o
> +obj-$(CONFIG_CGROUP_SIGNAL) += cgroup_signal.o
> obj-$(CONFIG_CPUSETS) += cpuset.o
> obj-$(CONFIG_CGROUP_NS) += ns_cgroup.o
> obj-$(CONFIG_UTS_NS) += utsname.o
> obj-$(CONFIG_USER_NS) += user_namespace.o
> obj-$(CONFIG_PID_NS) += pid_namespace.o
> Index: linux-2.6.25-mm1/include/linux/cgroup_subsys.h
> ===================================================================
> --- linux-2.6.25-mm1.orig/include/linux/cgroup_subsys.h
> +++ linux-2.6.25-mm1/include/linux/cgroup_subsys.h
> @@ -52,5 +52,11 @@ SUBSYS(devices)
> #ifdef CONFIG_CGROUP_FREEZER
> SUBSYS(freezer)
> #endif
>
> /* */
> +
> +#ifdef CONFIG_CGROUP_SIGNAL
> +SUBSYS(signal)
> +#endif
> +
> +/* */
>
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-24 19:30 ` Paul Jackson
@ 2008-04-30 7:48 ` Matt Helsley
2008-04-30 8:18 ` Paul Jackson
0 siblings, 1 reply; 11+ messages in thread
From: Matt Helsley @ 2008-04-30 7:48 UTC (permalink / raw)
To: Paul Jackson
Cc: linux-kernel, containers, clg, pavel, menage, torvalds, linux-pm
On Thu, 2008-04-24 at 14:30 -0500, Paul Jackson wrote:
> > +static struct cftype kill_file = {
> > + .name = "kill",
>
> The name "kill" seems ambiguous to me. It suggests that any write
> will send some default signal (TERM or KILL?) to all tasks in the
> cgroup, rather like the 'killall' command.
>
> I'm guessing that more people, on seeing this file in a cgroup
> directory, will guess correctly what it does if it were named
> "signal" or "send_signal" or some such.
OK, I renamed it signal.send and replaced all uses of "kill" in the code
to indicate that we're actually sending a signal -- not "KILL"ing
things :).
Cheers,
-Matt
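For illustration, a minimal userspace sketch of how the file would be used from C, equivalent to echoing a number into it. cgroup_send_signal() is a name invented here, and the caller supplies the full path to the control file; no particular mount point is assumed.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Write a signal number, as decimal text, to a cgroup control file.
 * Returns 0 on success or a negative errno value on failure.
 */
int cgroup_send_signal(const char *path, int signum)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", signum) < 0)
		err = -EIO;
	/*
	 * stdio buffers the text; the actual write(2), and any error
	 * reported by the kernel side, happens at fclose().
	 */
	if (fclose(f) != 0 && err == 0)
		err = -errno;
	return err;
}
```

A caller would pass something like the path of signal.send inside the target cgroup's directory together with a number from signal(7), for example 19 for SIGSTOP on x86.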
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-30 7:48 ` Matt Helsley
@ 2008-04-30 8:18 ` Paul Jackson
0 siblings, 0 replies; 11+ messages in thread
From: Paul Jackson @ 2008-04-30 8:18 UTC (permalink / raw)
To: Matt Helsley
Cc: linux-kernel, containers, clg, pavel, menage, torvalds, linux-pm
Matt wrote:
> OK, I renamed it signal.send
I'm not familiar with the cgroup subsystem naming conventions,
but modulo that, this looks good to me - thanks!
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-25 6:01 ` Paul Menage
@ 2008-04-30 8:29 ` Matt Helsley
0 siblings, 0 replies; 11+ messages in thread
From: Matt Helsley @ 2008-04-30 8:29 UTC (permalink / raw)
To: Paul Menage
Cc: Linux-Kernel, Cedric Le Goater, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers
On Thu, 2008-04-24 at 23:01 -0700, Paul Menage wrote:
> I don't think you need cgroup_signal.h. It's only included in
> cgroup_signal.c, and doesn't really contain any useful definitions
> anyway. You should just use a cgroup_subsys_state object as your state
> object, since you'll never need to do anything with it anyway.
>
> >+static struct cgroup_subsys_state *signal_create(
> >+ struct cgroup_subsys *ss, struct cgroup *cgroup)
> >+{
> >+ struct stateless *dummy;
> >+
> >+ if (!capable(CAP_SYS_ADMIN))
> >+ return ERR_PTR(-EPERM);
>
> This is unnecessary.
OK, removed.
> >+
> + dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
> + if (!dummy)
> + return ERR_PTR(-ENOMEM);
> + return &dummy->css;
> +}
>
> This function could be simplified to:
>
> struct cgroup_subsys_state *css;
> css = kzalloc(sizeof(*css), GFP_KERNEL);
> return css ?: ERR_PTR(-ENOMEM);
I kept the if() syntax but used cgroup_subsys_state as suggested. I kept
the name "dummy" too to emphasize that except for following the
currently-required form we don't really need the state information.
As a side note, I don't think adding state (which signal was sent or is
to be sent) or a fork handling function prevents races. I tend to agree
that there are lots of races involving adding tasks to or removing them
from the cgroup where we may not signal "everything". I think that from
userspace's perspective we can't solve the races because there will
always be a window between releasing whatever lock we use and returning
to userspace where new tasks can be added.
IMHO the only way to prevent such races would be to allow userspace to
"lock" a cgroup so that no new tasks may be added. That would block/fail
new forks and prevent writing new pids to the tasks file. Otherwise
userspace must always recheck the list to ensure it didn't get any new
entries...
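A rough sketch of that recheck from userspace, assuming the cgroup's tasks file lists one pid per line. model_read_pids() and model_count_new() are names invented for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Read up to `max` pids from a tasks-style file. Returns the count, or -1. */
long model_read_pids(const char *path, long *pids, size_t max)
{
	FILE *f = fopen(path, "r");
	size_t n = 0;

	if (!f)
		return -1;
	while (n < max && fscanf(f, "%ld", &pids[n]) == 1)
		n++;
	fclose(f);
	return (long)n;
}

/* Count the pids in `after` that were not present in `before`. */
size_t model_count_new(const long *before, size_t nb,
		       const long *after, size_t na)
{
	size_t i, j, fresh = 0;

	for (i = 0; i < na; i++) {
		for (j = 0; j < nb && before[j] != after[i]; j++)
			;
		if (j == nb)
			fresh++;
	}
	return fresh;
}
```

The caller would read the list, send the signal, read the list again, and repeat while model_count_new() reports new entries. Even then a task can still join after the final read, which is the window described above.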
> >+static int signal_can_attach(struct cgroup_subsys *ss,
> >+ struct cgroup *new_cgroup,
> >+ struct task_struct *task)
> >+{
> >+ return 0;
> >+}
>
> No need for a can_attach() method if it just returns 0 - that's the default.
Removed.
> >+static int signal_kill(struct cgroup *cgroup, int signum)
> >+{
> >+ struct cgroup_iter it;
> >+ struct task_struct *task;
> >+ int retval = 0;
> >+
> >+ cgroup_iter_start(cgroup, &it);
> >+ while ((task = cgroup_iter_next(cgroup, &it))) {
> >+ retval = send_sig(signum, task, 1);
> >+ if (retval)
> >+ break;
> >+ }
> >+ cgroup_iter_end(cgroup, &it);
> >+
> >+ return retval;
> >+}
>
> cgroup_iter_start() takes a read lock - is send_sig() guaranteed not to sleep?
send_sig() -> send_sig_info() takes the tasklist_lock for read and the
task's sighand->siglock spinlock, neither of which is a sleeping lock.
> >+static ssize_t signal_write(struct cgroup *cgroup,
> >+ struct cftype *cft,
> >+ struct file *file,
> >+ const char __user *userbuf,
> >+ size_t nbytes, loff_t *unused_ppos)
>
> This should just be a write_u64() method - cgroups will handle the
> copying/parsing for you. See e.g.
> kernel/sched.c:cpu_shares_write_u64()
Sure.
> >+static struct cftype kill_file = {
> >+ .name = "kill",
> >+ .write = signal_write,
> >+ .private = 0,
> >+};
>
> I agree with PaulJ that "signal.send" would be a nicer name for this
> than "signal.kill"
OK.
Cheers,
-Matt Helsley
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
2008-04-25 11:41 ` Cedric Le Goater
@ 2008-04-30 18:44 ` Matt Helsley
0 siblings, 0 replies; 11+ messages in thread
From: Matt Helsley @ 2008-04-30 18:44 UTC (permalink / raw)
To: Cedric Le Goater
Cc: Linux-Kernel, Linux Containers, Pavel Machek, Paul Menage,
Linus Torvalds, linux-pm
On Fri, 2008-04-25 at 13:41 +0200, Cedric Le Goater wrote:
> Matt Helsley wrote:
> > Add a signal control group subsystem that allows us to send signals to all tasks
> > in the control group by writing the desired signal(7) number to the kill file.
> >
> > NOTE: We don't really need per-cgroup state, but control groups doesn't support
> > stateless subsystems yet.
> >
> > Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> > ---
> > include/linux/cgroup_signal.h | 28 +++++++++
> > include/linux/cgroup_subsys.h | 6 +
> > init/Kconfig | 6 +
> > kernel/Makefile | 1
> > kernel/cgroup_signal.c | 129 ++++++++++++++++++++++++++++++++++++++++++
> > 5 files changed, 170 insertions(+)
> >
> > Index: linux-2.6.25-mm1/include/linux/cgroup_signal.h
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6.25-mm1/include/linux/cgroup_signal.h
> > @@ -0,0 +1,28 @@
> > +#ifndef _LINUX_CGROUP_SIGNAL_H
> > +#define _LINUX_CGROUP_SIGNAL_H
> > +/*
> > + * cgroup_signal.h - control group freezer subsystem interface
>
> s/freezer/signal/
>
> > + *
> > + * Copyright IBM Corp. 2007
> > + *
> > + * Author : Cedric Le Goater <clg@fr.ibm.com>
> > + * Author : Matt Helsley <matthltc@us.ibm.com>
> > + */
> > +
> > +#include <linux/cgroup.h>
> > +
> > +#ifdef CONFIG_CGROUP_SIGNAL
> > +
> > +struct stateless {
> > + struct cgroup_subsys_state css;
> > +};
>
> I'm not sure this is correct to say so. Imagine you want to send
> a SIGKILL to a cgroup, you would expect all tasks to die and the
> cgroup to become empty. right ?
>
> but if a task is doing clone() while it's being killed by this cgroup
> signal subsystem, we can miss the child. This is because there's a
> small window in copy_process() where the child is in the cgroup and
> not visible yet.
>
> copy_process()
> cgroup_fork()
> do stuff
> cgroup_fork_callbacks()
>
> cgroup_post_fork()
> put new task in the list.
>
> ( I didn't dig too much the code, though. So I might be missing
> something )
>
> So if we want to send the signal to all tasks in the cgroup, we need
> to track the new tasks with a fork callback, just like the freezer :
>
> static void signal_fork(struct cgroup_subsys *ss, struct task_struct *task)
> {
>
> }
>
> and, of course, we need to keep somewhere the signal number we need to
> send.
>
>
> All this depends on how we want the cgroup signal subsystem to behave.
> It could be brainless of course, but it seems to me that the biggest
> benefit of such a subsystem is to use the cgroup capability to track
> new tasks coming in.
>
> Cheers,
>
> C.
Assuming we did this, isn't it still possible to send SIGSTOP to every
task in the cgroup yet still appear to have not stopped every task in
the cgroup:
Task A                                  Task B
echo 19 > signal.send
    record signal
    return -EBUSY from can_attach
    send signals to all the tasks
    return 0 from write syscall
                                        echo newpid > tasks
cat tasks
<Uh oh, not all tasks are stopped...>
Cheers,
-Matt Helsley