* Re: [RFC PATCH 5/5] Add a Signal Control Group Subsystem
From: Cedric Le Goater @ 2008-04-23 15:17 UTC
To: Matt Helsley
Cc: Linux-Kernel, Paul Menage, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers

Hello Matt !

> Add a signal control group subsystem that allows us to send signals to all tasks
> in the control group by writing the desired signal(7) number to the kill file.
>
> NOTE: We don't really need per-cgroup state, but control groups doesn't support
> stateless subsystems yet.
>
> Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> ---
> include/linux/cgroup_signal.h |   28 +++++++++
> include/linux/cgroup_subsys.h |    6 +
> init/Kconfig                  |    6 +
> kernel/Makefile               |    1
> kernel/cgroup_signal.c        |  129 ++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 170 insertions(+)

I think there is a small race with new tasks entering the cgroup
while it's being killed, and a _fork ops would handle that. nop ?

Thanks,

C.
* Re: [RFC PATCH 5/5] Add a Signal Control Group Subsystem
From: Paul Menage @ 2008-04-23 15:37 UTC
To: Cedric Le Goater
Cc: Matt Helsley, Linux-Kernel, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers

On Wed, Apr 23, 2008 at 8:17 AM, Cedric Le Goater <clg@fr.ibm.com> wrote:
> Hello Matt !
>
> > Add a signal control group subsystem that allows us to send signals to all tasks
> > in the control group by writing the desired signal(7) number to the kill file.
> >
> > NOTE: We don't really need per-cgroup state, but control groups doesn't support
> > stateless subsystems yet.
> >
> > Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> > ---
> > include/linux/cgroup_signal.h |   28 +++++++++
> > include/linux/cgroup_subsys.h |    6 +
> > init/Kconfig                  |    6 +
> > kernel/Makefile               |    1
> > kernel/cgroup_signal.c        |  129 ++++++++++++++++++++++++++++++++++++++++++
> > 5 files changed, 170 insertions(+)
>
>
> I think there is a small race with new tasks entering the cgroup
> while it's being killed, and a _fork ops would handle that. nop ?
>

I never saw the actual patch (what lists did it go out to?) but I
suspect that this is one of those operations that's just going to be
inherently racy, and that the API should guarantee to affect all tasks
that are members of the group for the entirety of the operation, but
with no guarantees about what happens to tasks that enter or leave in
the meantime.

Paul
* Re: [RFC PATCH 5/5] Add a Signal Control Group Subsystem
From: Matt Helsley @ 2008-04-24 7:00 UTC
To: Paul Menage
Cc: Cedric Le Goater, Linux-Kernel, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers

On Wed, 2008-04-23 at 08:37 -0700, Paul Menage wrote:
> On Wed, Apr 23, 2008 at 8:17 AM, Cedric Le Goater <clg@fr.ibm.com> wrote:
> > Hello Matt !
> >
> > > Add a signal control group subsystem that allows us to send signals to all tasks
> > > in the control group by writing the desired signal(7) number to the kill file.
> > >
> > > NOTE: We don't really need per-cgroup state, but control groups doesn't support
> > > stateless subsystems yet.
> > >
> > > Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> > > ---
> > > include/linux/cgroup_signal.h |   28 +++++++++
> > > include/linux/cgroup_subsys.h |    6 +
> > > init/Kconfig                  |    6 +
> > > kernel/Makefile               |    1
> > > kernel/cgroup_signal.c        |  129 ++++++++++++++++++++++++++++++++++++++++++
> > > 5 files changed, 170 insertions(+)
> >
> >
> > I think there is a small race with new tasks entering the cgroup
> > while it's being killed, and a _fork ops would handle that. nop ?
> >
>
> I never saw the actual patch (what lists did it go out to?) but I

Hi Paul,

Sorry about this. MTA issues again :(. I think I've gotten them
fixed *this* time. I've resent them and you should have received your
own copy this time. Please let me know if you still aren't receiving
them.

Cheers,
-Matt
* [RFC][PATCH 0/5] Container Freezer: Reuse Suspend Freezer
From: Matt Helsley @ 2008-04-24 6:47 UTC
To: Linux-Kernel
Cc: Cedric Le Goater, Paul Menage, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers
This patchset reuses the container infrastructure and the swsusp freezer to
freeze a group of tasks. I've merely taken Cedric's patches, forward-ported
them to 2.6.25-mm1 and tested the expected common cases.
Changes since v1:
v2 (roughly patches 3 and 5):
Moved the "kill" file into a separate cgroup subsystem (signal) and
its own patch.
Changed the name of the file from freezer.freeze to freezer.state.
Switched from taking 1 and 0 as input to the strings "FROZEN" and
"RUNNING", respectively. This helps keep the interface
human-usable if/when we need to add more states.
Checked that stopped or interrupted is "frozen enough"
Since try_to_freeze() is called upon wakeup of these tasks
this should be fine. This idea comes from recent changes to
the freezer.
Checked that if (task == current) whilst freezing cgroup we're ok
Fixed bug where -EBUSY would always be returned when freezing
Added code to handle userspace retries for any remaining -EBUSY
The freezer subsystem in the container filesystem defines a file named
freezer.state. Writing "FROZEN" to the state file will freeze all tasks in the
cgroup. Subsequently writing "RUNNING" will unfreeze the tasks in the cgroup.
Reading will return the current state.
* Examples of usage :
# mkdir /containers/freezer
# mount -t cgroup -ofreezer,signal freezer /containers/freezer
# mkdir /containers/freezer/0
# echo $some_pid > /containers/freezer/0/tasks
to get status of the freezer subsystem :
# cat /containers/freezer/0/freezer.state
RUNNING
to freeze all tasks in the container :
# echo FROZEN > /containers/freezer/0/freezer.state
# cat /containers/freezer/0/freezer.state
FREEZING
# cat /containers/freezer/0/freezer.state
FROZEN
to unfreeze all tasks in the container :
# echo RUNNING > /containers/freezer/0/freezer.state
# cat /containers/freezer/0/freezer.state
RUNNING
to kill all tasks in the container :
# echo 9 > /containers/freezer/0/signal.kill
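The shell examples above can also be driven from a program. What follows is only a user-space sketch under assumptions: the helper name `write_control_file` is hypothetical and does not come from the patch series, and the path in the usage note is simply the one used in the examples above.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch series): write a decimal
 * number to a control file, the way "echo 9 > file" would.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_control_file(const char *path, int value)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	if (fprintf(f, "%d\n", value) < 0)
		err = -EIO;
	/* Buffered write errors may only show up when the file is closed. */
	if (fclose(f) != 0 && err == 0)
		err = -errno;
	return err;
}
```

With the mount point from the examples, a call such as `write_control_file("/containers/freezer/0/signal.kill", 9)` would be the equivalent of the last shell command, assuming the caller has permission to write the file.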
* Caveats:
- The cgroup moves into the FROZEN state once all tasks in the cgroup are
frozen. This is calculated and changed when the container file
"freezer.state" is read or written.
- Frozen containers will be unfrozen when a system is resumed after
a suspend. This is addressed by a subsequent patch.
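The first caveat, that FROZEN is only computed when freezer.state is read or written, can be modelled in user space. This is a minimal sketch under assumptions, not code from the patches: every name in it (`model_cgroup`, `model_update_state`, `model_read_state`, `model_write_state`) is hypothetical, and the per-task "frozen" flag stands in for whatever the kernel actually checks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names come from the patches. */
enum model_state { MODEL_RUNNING, MODEL_FREEZING, MODEL_FROZEN };

struct model_cgroup {
	enum model_state state;
	bool task_frozen[8];	/* one flag per task in the group */
	size_t ntasks;
};

/* Promote FREEZING to FROZEN only once every task has actually frozen. */
static void model_update_state(struct model_cgroup *cg)
{
	size_t i;

	if (cg->state != MODEL_FREEZING)
		return;
	for (i = 0; i < cg->ntasks; i++)
		if (!cg->task_frozen[i])
			return;
	cg->state = MODEL_FROZEN;
}

/* Reading the state file recomputes the state first, then reports it. */
static const char *model_read_state(struct model_cgroup *cg)
{
	model_update_state(cg);
	switch (cg->state) {
	case MODEL_RUNNING:
		return "RUNNING";
	case MODEL_FREEZING:
		return "FREEZING";
	default:
		return "FROZEN";
	}
}

/* Writing accepts only the two strings named in the cover letter. */
static int model_write_state(struct model_cgroup *cg, const char *buf)
{
	size_t i;

	if (strcmp(buf, "FROZEN") == 0) {
		if (cg->state == MODEL_RUNNING)
			cg->state = MODEL_FREEZING;
	} else if (strcmp(buf, "RUNNING") == 0) {
		cg->state = MODEL_RUNNING;
		for (i = 0; i < cg->ntasks; i++)
			cg->task_frozen[i] = false;
	} else {
		return -1;	/* unknown state string */
	}
	model_update_state(cg);
	return 0;
}
```

In this model a reader can see FREEZING for a while after writing FROZEN, which matches the two consecutive `cat` results in the usage examples.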
* Series
Applies to 2.6.25-mm1
The first patches make the freezer available to all architectures
before implementing the freezer cgroup subsystem.
[RFC PATCH 1/5] Add TIF_FREEZE flag to all architectures
[RFC PATCH 2/5] Make refrigerator always available
[RFC PATCH 3/5] Implement freezer cgroup subsystem
[RFC PATCH 4/5] Skip frozen cgroups during power management resume
[RFC PATCH 5/5] Implement signal cgroup subsystem
Comments are welcome. I'm planning to finish up testing with ptrace'd and
vforking processes and then, if it still seems appropriate, resubmit as a
non-RFC series next.
Cheers,
-Matt Helsley
--
* [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Matt Helsley @ 2008-04-24 6:48 UTC
To: Linux-Kernel
Cc: Cedric Le Goater, Paul Menage, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers

[-- Attachment #1: cgroup-signal/cgroup-signal-implement-signal-subsystem.patch --]
[-- Type: text/plain, Size: 6112 bytes --]

Add a signal control group subsystem that allows us to send signals to all tasks
in the control group by writing the desired signal(7) number to the kill file.

NOTE: We don't really need per-cgroup state, but control groups doesn't support
stateless subsystems yet.

Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
---
 include/linux/cgroup_signal.h |   28 +++++++++
 include/linux/cgroup_subsys.h |    6 +
 init/Kconfig                  |    6 +
 kernel/Makefile               |    1
 kernel/cgroup_signal.c        |  129 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 170 insertions(+)

Index: linux-2.6.25-mm1/include/linux/cgroup_signal.h
===================================================================
--- /dev/null
+++ linux-2.6.25-mm1/include/linux/cgroup_signal.h
@@ -0,0 +1,28 @@
+#ifndef _LINUX_CGROUP_SIGNAL_H
+#define _LINUX_CGROUP_SIGNAL_H
+/*
+ * cgroup_signal.h - control group freezer subsystem interface
+ *
+ * Copyright IBM Corp. 2007
+ *
+ * Author : Cedric Le Goater <clg@fr.ibm.com>
+ * Author : Matt Helsley <matthltc@us.ibm.com>
+ */
+
+#include <linux/cgroup.h>
+
+#ifdef CONFIG_CGROUP_SIGNAL
+
+struct stateless {
+	struct cgroup_subsys_state css;
+};
+
+static inline struct stateless *cgroup_signal(struct cgroup *cgroup)
+{
+	return container_of(cgroup_subsys_state(cgroup, signal_subsys_id),
+			    struct stateless, css);
+}
+
+#else /* !CONFIG_CGROUP_SIGNAL */
+#endif /* !CONFIG_CGROUP_SIGNAL */
+#endif /* _LINUX_CGROUP_SIGNAL_H */
Index: linux-2.6.25-mm1/kernel/cgroup_signal.c
===================================================================
--- /dev/null
+++ linux-2.6.25-mm1/kernel/cgroup_signal.c
@@ -0,0 +1,129 @@
+/*
+ * cgroup_signal.c - control group signal subsystem
+ *
+ * Copyright IBM Corp. 2007
+ *
+ * Author : Cedric Le Goater <clg@fr.ibm.com>
+ * Author : Matt Helsley <matthltc@us.ibm.com>
+ */
+
+#include <linux/module.h>
+#include <linux/cgroup.h>
+#include <linux/fs.h>
+#include <linux/uaccess.h>
+#include <linux/cgroup_signal.h>
+
+struct cgroup_subsys signal_subsys;
+
+static struct cgroup_subsys_state *signal_create(
+	struct cgroup_subsys *ss, struct cgroup *cgroup)
+{
+	struct stateless *dummy;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return ERR_PTR(-EPERM);
+
+	dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
+	if (!dummy)
+		return ERR_PTR(-ENOMEM);
+	return &dummy->css;
+}
+
+static void signal_destroy(struct cgroup_subsys *ss,
+			   struct cgroup *cgroup)
+{
+	kfree(cgroup_signal(cgroup));
+}
+
+
+static int signal_can_attach(struct cgroup_subsys *ss,
+			     struct cgroup *new_cgroup,
+			     struct task_struct *task)
+{
+	return 0;
+}
+
+static int signal_kill(struct cgroup *cgroup, int signum)
+{
+	struct cgroup_iter it;
+	struct task_struct *task;
+	int retval = 0;
+
+	cgroup_iter_start(cgroup, &it);
+	while ((task = cgroup_iter_next(cgroup, &it))) {
+		retval = send_sig(signum, task, 1);
+		if (retval)
+			break;
+	}
+	cgroup_iter_end(cgroup, &it);
+
+	return retval;
+}
+
+static ssize_t signal_write(struct cgroup *cgroup,
+			    struct cftype *cft,
+			    struct file *file,
+			    const char __user *userbuf,
+			    size_t nbytes, loff_t *unused_ppos)
+{
+	char *buffer;
+	int retval = 0;
+	int value;
+
+	if (nbytes >= PATH_MAX)
+		return -E2BIG;
+
+	/* +1 for nul-terminator */
+	buffer = kmalloc(nbytes + 1, GFP_KERNEL);
+	if (buffer == NULL)
+		return -ENOMEM;
+
+	if (copy_from_user(buffer, userbuf, nbytes)) {
+		retval = -EFAULT;
+		goto free_buffer;
+	}
+	buffer[nbytes] = 0; /* nul-terminate */
+	if (sscanf(buffer, "%d", &value) != 1) {
+		retval = -EIO;
+		goto free_buffer;
+	}
+
+	cgroup_lock();
+
+	if (cgroup_is_removed(cgroup)) {
+		retval = -ENODEV;
+		goto unlock;
+	}
+
+	retval = signal_kill(cgroup, value);
+	if (retval == 0)
+		retval = nbytes;
+unlock:
+	cgroup_unlock();
+free_buffer:
+	kfree(buffer);
+	return retval;
+}
+
+static struct cftype kill_file = {
+	.name = "kill",
+	.write = signal_write,
+	.private = 0,
+};
+
+static int signal_populate(struct cgroup_subsys *ss, struct cgroup *cgroup)
+{
+	return cgroup_add_files(cgroup, ss, &kill_file, 1);
+}
+
+struct cgroup_subsys signal_subsys = {
+	.name = "signal",
+	.create = signal_create,
+	.destroy = signal_destroy,
+	.populate = signal_populate,
+	.subsys_id = signal_subsys_id,
+	.can_attach = signal_can_attach,
+	.attach = NULL,
+	.fork = NULL,
+	.exit = NULL,
+};
Index: linux-2.6.25-mm1/init/Kconfig
===================================================================
--- linux-2.6.25-mm1.orig/init/Kconfig
+++ linux-2.6.25-mm1/init/Kconfig
@@ -328,10 +328,16 @@ config CGROUP_FREEZER
 	depends on CGROUPS
 	help
 	  Provides a way to freeze and unfreeze all tasks in a
 	  cgroup
 
+config CGROUP_SIGNAL
+	bool "control group signal subsystem"
+	depends on CGROUPS
+	help
+	  Provides a way to signal all tasks in a cgroup
+
 config FAIR_GROUP_SCHED
 	bool "Group scheduling for SCHED_OTHER"
 	depends on GROUP_SCHED
 	default y
 
Index: linux-2.6.25-mm1/kernel/Makefile
===================================================================
--- linux-2.6.25-mm1.orig/kernel/Makefile
+++ linux-2.6.25-mm1/kernel/Makefile
@@ -47,10 +47,11 @@ obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
 obj-$(CONFIG_COMPAT) += compat.o
 obj-$(CONFIG_CGROUPS) += cgroup.o
 obj-$(CONFIG_CGROUP_DEBUG) += cgroup_debug.o
 obj-$(CONFIG_CGROUP_FREEZER) += cgroup_freezer.o
+obj-$(CONFIG_CGROUP_SIGNAL) += cgroup_signal.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_NS) += ns_cgroup.o
 obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_USER_NS) += user_namespace.o
 obj-$(CONFIG_PID_NS) += pid_namespace.o
Index: linux-2.6.25-mm1/include/linux/cgroup_subsys.h
===================================================================
--- linux-2.6.25-mm1.orig/include/linux/cgroup_subsys.h
+++ linux-2.6.25-mm1/include/linux/cgroup_subsys.h
@@ -52,5 +52,11 @@ SUBSYS(devices)
 #ifdef CONFIG_CGROUP_FREEZER
 SUBSYS(freezer)
 #endif
 
 /* */
+
+#ifdef CONFIG_CGROUP_SIGNAL
+SUBSYS(signal)
+#endif
+
+/* */
--
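The `signal_write()` handler in the patch above parses the user buffer with `sscanf("%d")` and hands the result to `signal_kill()`. As a rough illustration of what stricter parsing could look like, here is a user-space sketch. It is not code from the patch: `model_parse_signal` and `MODEL_MAX_SIGNAL` are hypothetical names, the upper bound of 64 is an assumption about the usual number of Linux signals, and rejecting 0 is a choice made for this sketch only.

```c
#include <errno.h>
#include <stdlib.h>

#define MODEL_MAX_SIGNAL 64	/* assumption: usual Linux signal count */

/*
 * Hypothetical user-space sketch: parse a decimal signal number from a
 * NUL-terminated string. One trailing newline is accepted, since
 * "echo 9 > file" produces it. Returns the signal number on success or
 * a negative errno value on failure.
 */
static int model_parse_signal(const char *buf)
{
	char *end;
	unsigned long val;

	if (*buf < '0' || *buf > '9')
		return -EINVAL;		/* empty input, sign, or whitespace */
	errno = 0;
	val = strtoul(buf, &end, 10);
	if (errno == ERANGE)
		return -EINVAL;		/* does not fit in unsigned long */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	if (val < 1 || val > MODEL_MAX_SIGNAL)
		return -EINVAL;		/* 0 is rejected in this sketch */
	return (int)val;
}
```

Unlike `sscanf("%d")`, this rejects input such as "9x" or "-1" instead of accepting the leading number.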
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Paul Jackson @ 2008-04-24 19:30 UTC
To: Matt Helsley
Cc: linux-kernel, containers, clg, pavel, menage, torvalds, linux-pm

> +static struct cftype kill_file = {
> +	.name = "kill",

The name "kill" seems ambiguous to me. It suggests that any write
will send some default signal (TERM or KILL?) to all tasks in the
cgroup, rather like the 'killall' command.

I'm guessing that more people, on seeing this file in a cgroup
directory, will guess correctly what it does if it were named
"signal" or "send_signal" or some such.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Matt Helsley @ 2008-04-30 7:48 UTC
To: Paul Jackson
Cc: linux-kernel, containers, clg, pavel, menage, torvalds, linux-pm

On Thu, 2008-04-24 at 14:30 -0500, Paul Jackson wrote:
> > +static struct cftype kill_file = {
> > +	.name = "kill",
>
> The name "kill" seems ambiguous to me. It suggests that any write
> will send some default signal (TERM or KILL?) to all tasks in the
> cgroup, rather like the 'killall' command.
>
> I'm guessing that more people, on seeing this file in a cgroup
> directory, will guess correctly what it does if it were named
> "signal" or "send_signal" or some such.

OK, I renamed it signal.send and replaced all uses of "kill" in the code
to indicate that we're actually sending a signal -- not "KILL"ing
things :).

Cheers,
-Matt
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Paul Jackson @ 2008-04-30 8:18 UTC
To: Matt Helsley
Cc: linux-kernel, containers, clg, pavel, menage, torvalds, linux-pm

Matt wrote:
> OK, I renamed it signal.send

I'm not familiar with the cgroup subsystem naming conventions,
but modulo that, this looks good to me - thanks!

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Paul Menage @ 2008-04-25 6:01 UTC
To: Matt Helsley
Cc: Linux-Kernel, Cedric Le Goater, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers

I don't think you need cgroup_signal.h. It's only included in
cgroup_signal.c, and doesn't really contain any useful definitions
anyway. You should just use a cgroup_subsys_state object as your state
object, since you'll never need to do anything with it anyway.

>+static struct cgroup_subsys_state *signal_create(
>+	struct cgroup_subsys *ss, struct cgroup *cgroup)
>+{
>+	struct stateless *dummy;
>+
>+	if (!capable(CAP_SYS_ADMIN))
>+		return ERR_PTR(-EPERM);

This is unnecessary.

>+
+	dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
+	if (!dummy)
+		return ERR_PTR(-ENOMEM);
+	return &dummy->css;
+}

This function could be simplified to:

	struct cgroup_subsys_state *css;
	css = kzalloc(sizeof(*css), GFP_KERNEL);
	return css ?: ERR_PTR(-ENOMEM);

>+static int signal_can_attach(struct cgroup_subsys *ss,
>+			     struct cgroup *new_cgroup,
>+			     struct task_struct *task)
>+{
>+	return 0;
>+}

No need for a can_attach() method if it just returns 0 - that's the default.

>+static int signal_kill(struct cgroup *cgroup, int signum)
>+{
>+	struct cgroup_iter it;
>+	struct task_struct *task;
>+	int retval = 0;
>+
>+	cgroup_iter_start(cgroup, &it);
>+	while ((task = cgroup_iter_next(cgroup, &it))) {
>+		retval = send_sig(signum, task, 1);
>+		if (retval)
>+			break;
>+	}
>+	cgroup_iter_end(cgroup, &it);
>+
>+	return retval;
>+}

cgroup_iter_start() takes a read lock - is send_sig() guaranteed not to sleep?

>+static ssize_t signal_write(struct cgroup *cgroup,
>+			    struct cftype *cft,
>+			    struct file *file,
>+			    const char __user *userbuf,
>+			    size_t nbytes, loff_t *unused_ppos)

This should just be a write_u64() method - cgroups will handle the
copying/parsing for you. See e.g.
kernel/sched.c:cpu_shares_write_u64()

>+static struct cftype kill_file = {
>+	.name = "kill",
>+	.write = signal_write,
>+	.private = 0,
>+};

I agree with PaulJ that "signal.send" would be a nicer name for this
than "signal.kill"
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Matt Helsley @ 2008-04-30 8:29 UTC
To: Paul Menage
Cc: Linux-Kernel, Cedric Le Goater, Oren Laadan, Linus Torvalds,
Pavel Machek, linux-pm, Linux Containers

On Thu, 2008-04-24 at 23:01 -0700, Paul Menage wrote:
> I don't think you need cgroup_signal.h. It's only included in
> cgroup_signal.c, and doesn't really contain any useful definitions
> anyway. You should just use a cgroup_subsys_state object as your state
> object, since you'll never need to do anything with it anyway.
>
> >+static struct cgroup_subsys_state *signal_create(
> >+	struct cgroup_subsys *ss, struct cgroup *cgroup)
> >+{
> >+	struct stateless *dummy;
> >+
> >+	if (!capable(CAP_SYS_ADMIN))
> >+		return ERR_PTR(-EPERM);
>
> This is unnecessary.

OK, removed.

> >+
> +	dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
> +	if (!dummy)
> +		return ERR_PTR(-ENOMEM);
> +	return &dummy->css;
> +}
>
> This function could be simplified to:
>
> struct cgroup_subsys_state *css;
> css = kzalloc(sizeof(*css), GFP_KERNEL);
> return css ?: ERR_PTR(-ENOMEM);

I kept the if() syntax but used cgroup_subsys_state as suggested. I kept
the name "dummy" too to emphasize that except for following the
currently-required form we don't really need the state information.

As a side note, I don't think adding state (which signal was/to
sent/send) or a fork handling function prevents races. I tend to agree
that there are lots of races involving adding/removing tasks to the
cgroup where we may not signal "everything". I think that from
userspace's perspective we can't solve the races because there will
always be a window between releasing whatever lock we use and returning
to userspace where new tasks can be added.

IMHO the only way to prevent such races would be to allow userspace to
"lock" a cgroup so that no new tasks may be added. That would block/fail
new forks and prevent writing new pids to the tasks file. Otherwise
userspace must always recheck the list to ensure it didn't get any new
entries...

> >+static int signal_can_attach(struct cgroup_subsys *ss,
> >+			     struct cgroup *new_cgroup,
> >+			     struct task_struct *task)
> >+{
> >+	return 0;
> >+}
>
> No need for a can_attach() method if it just returns 0 - that's the default.

Removed.

> >+static int signal_kill(struct cgroup *cgroup, int signum)
> >+{
> >+	struct cgroup_iter it;
> >+	struct task_struct *task;
> >+	int retval = 0;
> >+
> >+	cgroup_iter_start(cgroup, &it);
> >+	while ((task = cgroup_iter_next(cgroup, &it))) {
> >+		retval = send_sig(signum, task, 1);
> >+		if (retval)
> >+			break;
> >+	}
> >+	cgroup_iter_end(cgroup, &it);
> >+
> >+	return retval;
> >+}
>
> cgroup_iter_start() takes a read lock - is send_sig() guaranteed not to sleep?

send_sig() -> send_sig_info() hold the tasklist_lock for read and the
task's sighand->siglock spinlock.

> >+static ssize_t signal_write(struct cgroup *cgroup,
> >+			    struct cftype *cft,
> >+			    struct file *file,
> >+			    const char __user *userbuf,
> >+			    size_t nbytes, loff_t *unused_ppos)
>
> This should just be a write_u64() method - cgroups will handle the
> copying/parsing for you. See e.g.
> kernel/sched.c:cpu_shares_write_u64()

Sure.

> >+static struct cftype kill_file = {
> >+	.name = "kill",
> >+	.write = signal_write,
> >+	.private = 0,
> >+};
>
> I agree with PaulJ that "signal.send" would be a nicer name for this
> than "signal.kill"

OK.

Cheers,
-Matt Helsley
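The "userspace must always recheck the list" point in the message above can be illustrated with a small user-space model. This is a sketch under assumptions, not kernel or patch code: the `model_*` names are hypothetical, an array of flags stands in for the cgroup's task list, and each pass only reaches the tasks that are members at that moment.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_TASKS 16

/* Hypothetical user-space model of a group's task list. */
struct model_group {
	bool signalled[MODEL_MAX_TASKS];
	size_t ntasks;
};

/* A task joining the group, e.g. by fork or by a write to "tasks". */
static int model_add_task(struct model_group *g)
{
	if (g->ntasks >= MODEL_MAX_TASKS)
		return -1;
	g->signalled[g->ntasks++] = false;
	return 0;
}

/* One pass: signal every current member. Returns how many were new. */
static size_t model_signal_pass(struct model_group *g)
{
	size_t i, hits = 0;

	for (i = 0; i < g->ntasks; i++) {
		if (!g->signalled[i]) {
			g->signalled[i] = true;
			hits++;
		}
	}
	return hits;
}

/* Count members that have not been signalled yet. */
static size_t model_unsignalled(const struct model_group *g)
{
	size_t i, n = 0;

	for (i = 0; i < g->ntasks; i++)
		if (!g->signalled[i])
			n++;
	return n;
}

/*
 * Recheck loop: repeat passes until one finds nothing new, or until
 * max_passes is reached. Returns the number of passes performed.
 */
static int model_signal_until_quiet(struct model_group *g, int max_passes)
{
	int passes = 0;

	while (passes < max_passes) {
		passes++;
		if (model_signal_pass(g) == 0)
			break;
	}
	return passes;
}
```

Even in this model the loop only narrows the window: a task can still join right after the final, empty pass, which is the argument for being able to "lock" the group.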
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Cedric Le Goater @ 2008-04-25 11:41 UTC
To: Matt Helsley
Cc: Linux-Kernel, Linux Containers, Pavel Machek, Paul Menage,
Linus Torvalds, linux-pm

Matt Helsley wrote:
> Add a signal control group subsystem that allows us to send signals to all tasks
> in the control group by writing the desired signal(7) number to the kill file.
>
> NOTE: We don't really need per-cgroup state, but control groups doesn't support
> stateless subsystems yet.
>
> Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> ---
> include/linux/cgroup_signal.h |   28 +++++++++
> include/linux/cgroup_subsys.h |    6 +
> init/Kconfig                  |    6 +
> kernel/Makefile               |    1
> kernel/cgroup_signal.c        |  129 ++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 170 insertions(+)
>
> Index: linux-2.6.25-mm1/include/linux/cgroup_signal.h
> ===================================================================
> --- /dev/null
> +++ linux-2.6.25-mm1/include/linux/cgroup_signal.h
> @@ -0,0 +1,28 @@
> +#ifndef _LINUX_CGROUP_SIGNAL_H
> +#define _LINUX_CGROUP_SIGNAL_H
> +/*
> + * cgroup_signal.h - control group freezer subsystem interface

s/freezer/signal/

> + *
> + * Copyright IBM Corp. 2007
> + *
> + * Author : Cedric Le Goater <clg@fr.ibm.com>
> + * Author : Matt Helsley <matthltc@us.ibm.com>
> + */
> +
> +#include <linux/cgroup.h>
> +
> +#ifdef CONFIG_CGROUP_SIGNAL
> +
> +struct stateless {
> +	struct cgroup_subsys_state css;
> +};

I'm not sure this is correct to say so. Imagine you want to send
a SIGKILL to a cgroup, you would expect all tasks to die and the
cgroup to become empty. right ?

but if a task is doing clone() while it's being killed by this cgroup
signal subsystem, we can miss the child. This is because there's a
small window in copy_process() where the child is in the cgroup and
not visible yet.

	copy_process()
		cgroup_fork()
		do stuff
		cgroup_fork_callbacks()

		cgroup_post_fork()
			put new task in the list.

( I didn't dig too much the code, though. So I might be missing
something )

So if we want to send the signal to all tasks in the cgroup, we need
to track the new tasks with a fork callback, just like the freezer :

	static void signal_fork(struct cgroup_subsys *ss, struct task_struct *task)
	{

	}

and, of course, we need to keep somewhere the signal number we need to
send.


All this depends on how we want the cgroup signal subsystem to behave.
It could be brainless of course, but it seems to me that the biggest
benefit of such a subsystem is to use the cgroup capability to track
new tasks coming in.

Cheers,

C.

> +static inline struct stateless *cgroup_signal(struct cgroup *cgroup)
> +{
> +	return container_of(cgroup_subsys_state(cgroup, signal_subsys_id),
> +			    struct stateless, css);
> +}
> +
> +#else /* !CONFIG_CGROUP_SIGNAL */
> +#endif /* !CONFIG_CGROUP_SIGNAL */
> +#endif /* _LINUX_CGROUP_SIGNAL_H */
> Index: linux-2.6.25-mm1/kernel/cgroup_signal.c
> ===================================================================
> --- /dev/null
> +++ linux-2.6.25-mm1/kernel/cgroup_signal.c
> @@ -0,0 +1,129 @@
> +/*
> + * cgroup_signal.c - control group signal subsystem
> + *
> + * Copyright IBM Corp. 2007
> + *
> + * Author : Cedric Le Goater <clg@fr.ibm.com>
> + * Author : Matt Helsley <matthltc@us.ibm.com>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/cgroup.h>
> +#include <linux/fs.h>
> +#include <linux/uaccess.h>
> +#include <linux/cgroup_signal.h>
> +
> +struct cgroup_subsys signal_subsys;
> +
> +static struct cgroup_subsys_state *signal_create(
> +	struct cgroup_subsys *ss, struct cgroup *cgroup)
> +{
> +	struct stateless *dummy;
> +
> +	if (!capable(CAP_SYS_ADMIN))
> +		return ERR_PTR(-EPERM);
> +
> +	dummy = kzalloc(sizeof(struct stateless), GFP_KERNEL);
> +	if (!dummy)
> +		return ERR_PTR(-ENOMEM);
> +	return &dummy->css;
> +}
> +
> +static void signal_destroy(struct cgroup_subsys *ss,
> +			   struct cgroup *cgroup)
> +{
> +	kfree(cgroup_signal(cgroup));
> +}
> +
> +
> +static int signal_can_attach(struct cgroup_subsys *ss,
> +			     struct cgroup *new_cgroup,
> +			     struct task_struct *task)
> +{
> +	return 0;
> +}
> +
> +static int signal_kill(struct cgroup *cgroup, int signum)
> +{
> +	struct cgroup_iter it;
> +	struct task_struct *task;
> +	int retval = 0;
> +
> +	cgroup_iter_start(cgroup, &it);
> +	while ((task = cgroup_iter_next(cgroup, &it))) {
> +		retval = send_sig(signum, task, 1);
> +		if (retval)
> +			break;
> +	}
> +	cgroup_iter_end(cgroup, &it);
> +
> +	return retval;
> +}
> +
> +static ssize_t signal_write(struct cgroup *cgroup,
> +			    struct cftype *cft,
> +			    struct file *file,
> +			    const char __user *userbuf,
> +			    size_t nbytes, loff_t *unused_ppos)
> +{
> +	char *buffer;
> +	int retval = 0;
> +	int value;
> +
> +	if (nbytes >= PATH_MAX)
> +		return -E2BIG;
> +
> +	/* +1 for nul-terminator */
> +	buffer = kmalloc(nbytes + 1, GFP_KERNEL);
> +	if (buffer == NULL)
> +		return -ENOMEM;
> +
> +	if (copy_from_user(buffer, userbuf, nbytes)) {
> +		retval = -EFAULT;
> +		goto free_buffer;
> +	}
> +	buffer[nbytes] = 0; /* nul-terminate */
> +	if (sscanf(buffer, "%d", &value) != 1) {
> +		retval = -EIO;
> +		goto free_buffer;
> +	}
> +
> +	cgroup_lock();
> +
> +	if (cgroup_is_removed(cgroup)) {
> +		retval = -ENODEV;
> +		goto unlock;
> +	}
> +
> +	retval = signal_kill(cgroup, value);
> +	if (retval == 0)
> +		retval = nbytes;
> +unlock:
> +	cgroup_unlock();
> +free_buffer:
> +	kfree(buffer);
> +	return retval;
> +}
> +
> +static struct cftype kill_file = {
> +	.name = "kill",
> +	.write = signal_write,
> +	.private = 0,
> +};
> +
> +static int signal_populate(struct cgroup_subsys *ss, struct cgroup *cgroup)
> +{
> +	return cgroup_add_files(cgroup, ss, &kill_file, 1);
> +}
> +
> +struct cgroup_subsys signal_subsys = {
> +	.name = "signal",
> +	.create = signal_create,
> +	.destroy = signal_destroy,
> +	.populate = signal_populate,
> +	.subsys_id = signal_subsys_id,
> +	.can_attach = signal_can_attach,
> +	.attach = NULL,
> +	.fork = NULL,
> +	.exit = NULL,
> +};
> Index: linux-2.6.25-mm1/init/Kconfig
> ===================================================================
> --- linux-2.6.25-mm1.orig/init/Kconfig
> +++ linux-2.6.25-mm1/init/Kconfig
> @@ -328,10 +328,16 @@ config CGROUP_FREEZER
> 	depends on CGROUPS
> 	help
> 	  Provides a way to freeze and unfreeze all tasks in a
> 	  cgroup
>
> +config CGROUP_SIGNAL
> +	bool "control group signal subsystem"
> +	depends on CGROUPS
> +	help
> +	  Provides a way to signal all tasks in a cgroup
> +
> config FAIR_GROUP_SCHED
> 	bool "Group scheduling for SCHED_OTHER"
> 	depends on GROUP_SCHED
> 	default y
>
> Index: linux-2.6.25-mm1/kernel/Makefile
> ===================================================================
> --- linux-2.6.25-mm1.orig/kernel/Makefile
> +++ linux-2.6.25-mm1/kernel/Makefile
> @@ -47,10 +47,11 @@ obj-$(CONFIG_KEXEC) += kexec.o
> obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
> obj-$(CONFIG_COMPAT) += compat.o
> obj-$(CONFIG_CGROUPS) += cgroup.o
> obj-$(CONFIG_CGROUP_DEBUG) += cgroup_debug.o
> obj-$(CONFIG_CGROUP_FREEZER) += cgroup_freezer.o
> +obj-$(CONFIG_CGROUP_SIGNAL) += cgroup_signal.o
> obj-$(CONFIG_CPUSETS) += cpuset.o
> obj-$(CONFIG_CGROUP_NS) += ns_cgroup.o
> obj-$(CONFIG_UTS_NS) += utsname.o
> obj-$(CONFIG_USER_NS) += user_namespace.o
> obj-$(CONFIG_PID_NS) += pid_namespace.o
> Index: linux-2.6.25-mm1/include/linux/cgroup_subsys.h
> ===================================================================
> --- linux-2.6.25-mm1.orig/include/linux/cgroup_subsys.h
> +++ linux-2.6.25-mm1/include/linux/cgroup_subsys.h
> @@ -52,5 +52,11 @@ SUBSYS(devices)
> #ifdef CONFIG_CGROUP_FREEZER
> SUBSYS(freezer)
> #endif
>
> /* */
> +
> +#ifdef CONFIG_CGROUP_SIGNAL
> +SUBSYS(signal)
> +#endif
> +
> +/* */
>
* Re: [RFC][PATCH 5/5] Add a Signal Control Group Subsystem
From: Matt Helsley @ 2008-04-30 18:44 UTC
To: Cedric Le Goater
Cc: Linux-Kernel, Linux Containers, Pavel Machek, Paul Menage,
Linus Torvalds, linux-pm

On Fri, 2008-04-25 at 13:41 +0200, Cedric Le Goater wrote:
> Matt Helsley wrote:
> > Add a signal control group subsystem that allows us to send signals to all tasks
> > in the control group by writing the desired signal(7) number to the kill file.
> >
> > NOTE: We don't really need per-cgroup state, but control groups doesn't support
> > stateless subsystems yet.
> >
> > Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
> > ---
> > include/linux/cgroup_signal.h |   28 +++++++++
> > include/linux/cgroup_subsys.h |    6 +
> > init/Kconfig                  |    6 +
> > kernel/Makefile               |    1
> > kernel/cgroup_signal.c        |  129 ++++++++++++++++++++++++++++++++++++++++++
> > 5 files changed, 170 insertions(+)
> >
> > Index: linux-2.6.25-mm1/include/linux/cgroup_signal.h
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6.25-mm1/include/linux/cgroup_signal.h
> > @@ -0,0 +1,28 @@
> > +#ifndef _LINUX_CGROUP_SIGNAL_H
> > +#define _LINUX_CGROUP_SIGNAL_H
> > +/*
> > + * cgroup_signal.h - control group freezer subsystem interface
>
> s/freezer/signal/
>
> > + *
> > + * Copyright IBM Corp. 2007
> > + *
> > + * Author : Cedric Le Goater <clg@fr.ibm.com>
> > + * Author : Matt Helsley <matthltc@us.ibm.com>
> > + */
> > +
> > +#include <linux/cgroup.h>
> > +
> > +#ifdef CONFIG_CGROUP_SIGNAL
> > +
> > +struct stateless {
> > +	struct cgroup_subsys_state css;
> > +};
>
> I'm not sure this is correct to say so. Imagine you want to send
> a SIGKILL to a cgroup, you would expect all tasks to die and the
> cgroup to become empty. right ?
>
> but if a task is doing clone() while it's being killed by this cgroup
> signal subsystem, we can miss the child. This is because there's a
> small window in copy_process() where the child is in the cgroup and
> not visible yet.
>
> 	copy_process()
> 		cgroup_fork()
> 		do stuff
> 		cgroup_fork_callbacks()
>
> 		cgroup_post_fork()
> 			put new task in the list.
>
> ( I didn't dig too much the code, though. So I might be missing
> something )
>
> So if we want to send the signal to all tasks in the cgroup, we need
> to track the new tasks with a fork callback, just like the freezer :
>
> 	static void signal_fork(struct cgroup_subsys *ss, struct task_struct *task)
> 	{
>
> 	}
>
> and, of course, we need to keep somewhere the signal number we need to
> send.
>
>
> All this depends on how we want the cgroup signal subsystem to behave.
> It could be brainless of course, but it seems to me that the biggest
> benefit of such a subsystem is to use the cgroup capability to track
> new tasks coming in.
>
> Cheers,
>
> C.

Assuming we did this, isn't it still possible to send SIGSTOP to every
task in the cgroup yet still appear to have not stopped every task in
the cgroup:

Task A                                  Task B
echo 19 > signal.send
    record signal
    return -EBUSY from can_attach
    send signals to all the tasks
    return 0 from write syscall
                                        echo newpid > tasks
cat tasks
<Uh oh, not all tasks are stopped...>

Cheers,
-Matt Helsley