* [PATCH, v7] Introduce timer slack controller @ 2011-03-02 16:40 Kirill A. Shutsemov [not found] ` <1299084001-3916-1-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> 0 siblings, 1 reply; 10+ messages in thread From: Kirill A. Shutsemov @ 2011-03-02 16:40 UTC (permalink / raw) To: Paul Menage, Li Zefan Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA, Kirill A. Shutemov From: Kirill A. Shutemov <kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> Changelog: v7: - totally reworked interface and rewritten from scratch (See Documentation/cgroups/timer_slack.txt for more information) v6: - add documentation - use notifier_call_chain() instead of check hook - fix validate_change() - cleanup v5: - -EBUSY on writing to timer_slack.min_slack_ns/max_slack_ns if a child has wider min-max range v4: - hierarchy support - drop dummy_timer_slack_check() - workaround lockdep false (?) positive - allow 0 as timer slack value v3: - rework interface - s/EXPORT_SYMBOL/EXPORT_SYMBOL_GPL/ v2: - fixed with CONFIG_CGROUP_TIMER_SLACK=y v1: - initial revision Kirill A. Shutemov (1): cgroups: introduce timer slack controller Documentation/cgroups/timer_slack.txt | 66 ++++++++++++++++++++ fs/select.c | 7 +-- include/linux/cgroup_subsys.h | 6 ++ include/linux/sched.h | 9 +++ init/Kconfig | 8 +++ kernel/Makefile | 1 + kernel/cgroup_timer_slack.c | 107 +++++++++++++++++++++++++++++++++ kernel/futex.c | 4 +- kernel/hrtimer.c | 2 +- 9 files changed, 202 insertions(+), 8 deletions(-) create mode 100644 Documentation/cgroups/timer_slack.txt create mode 100644 kernel/cgroup_timer_slack.c -- 1.7.4.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <1299084001-3916-1-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org>]
* [PATCH, v7] cgroups: introduce timer slack controller [not found] ` <1299084001-3916-1-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> @ 2011-03-02 16:40 ` Kirill A. Shutsemov [not found] ` <1299084001-3916-2-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> 2011-03-04 19:03 ` Peter Zijlstra 0 siblings, 2 replies; 10+ messages in thread From: Kirill A. Shutsemov @ 2011-03-02 16:40 UTC (permalink / raw) To: Paul Menage, Li Zefan Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA, Kirill A. Shutemov From: Kirill A. Shutemov <kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> Every task_struct has timer_slack_ns value. This value uses to round up poll() and select() timeout values. This feature can be useful in mobile environment where combined wakeups are desired. cgroup subsys "timer_slack" implement timer slack controller. It provides a way to set minimal timer slack value for a group of tasks. If a task belongs to a cgroup with minimal timer slack value higher than task's value, cgroup's value will be applied. Idea-by: Jacob Pan <jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> Signed-off-by: Kirill A. Shutemov <kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> --- Documentation/cgroups/timer_slack.txt | 66 ++++++++++++++++++++ fs/select.c | 7 +-- include/linux/cgroup_subsys.h | 6 ++ include/linux/sched.h | 9 +++ init/Kconfig | 8 +++ kernel/Makefile | 1 + kernel/cgroup_timer_slack.c | 107 +++++++++++++++++++++++++++++++++ kernel/futex.c | 4 +- kernel/hrtimer.c | 2 +- 9 files changed, 202 insertions(+), 8 deletions(-) create mode 100644 Documentation/cgroups/timer_slack.txt create mode 100644 kernel/cgroup_timer_slack.c diff --git a/Documentation/cgroups/timer_slack.txt b/Documentation/cgroups/timer_slack.txt new file mode 100644 index 0000000..16ac066 --- /dev/null +++ b/Documentation/cgroups/timer_slack.txt @@ -0,0 +1,66 @@ +Timer Slack Controller +===================== + +Overview +-------- + +Every task_struct has timer_slack_ns value. This value uses to round up +poll() and select() timeout values. This feature can be useful in +mobile environment where combined wakeups are desired. + +cgroup subsys "timer_slack" implement timer slack controller. It +provides a way to set minimal timer slack value for a group of tasks. +If a task belongs to a cgroup with minimal timer slack value higher than +task's value, cgroup's value will be applied. + +User interface +-------------- + +To get timer slack controller functionality you need to enable it in +kernel configuration: + +CONFIG_CGROUP_TIMER_SLACK=y + +The controller provides only one file: + +# mount -t cgroup -o timer_slack none /sys/fs/cgroup +# ls /sys/fs/cgroup/timer_slack.* +/sys/fs/cgroup/timer_slack.min_slack_ns + +By defeault it's 0: + +# cat /sys/fs/cgroup/timer_slack.min_slack_ns +0 + +You can set it to some value: + +# echo 50000 > /sys/fs/cgroup/timer_slack.min_slack_ns +# cat /sys/fs/cgroup/timer_slack.min_slack_ns +50000 + +Tasks still can set task's value below 50000 using prctl(), but in this +case cgroup's value will be applied. + +Timer slack controller supports hierarchical groups. The only rule: +parent's minimal timer slack value should be less or equal to child's. + +# mkdir /sys/fs/cgroup/a +# cat /sys/fs/cgroup/a/timer_slack.min_slack_ns +50000 +# echo 70000 > /sys/fs/cgroup/a/timer_slack.min_slack_ns +# cat /sys/fs/cgroup/a/timer_slack.min_slack_ns +70000 + +You'll get -EPERM, if you try to set child's timer_slack.min_slack_ns > +parent's timer_slack.min_slack_ns: + +# /bin/echo 40000 > /sys/fs/cgroup/a/timer_slack.min_slack_ns +/bin/echo: write error: Operation not permitted + +Child's value will be adjusted if necessary on parent's value update: + +# echo 100000 > /sys/fs/cgroup/timer_slack.min_slack_ns +# cat /sys/fs/cgroup/timer_slack.min_slack_ns +100000 +# cat /sys/fs/cgroup/a/timer_slack.min_slack_ns +100000 diff --git a/fs/select.c b/fs/select.c index e56560d..a189e4d 100644 --- a/fs/select.c +++ b/fs/select.c @@ -69,7 +69,6 @@ static long __estimate_accuracy(struct timespec *tv) long select_estimate_accuracy(struct timespec *tv) { - unsigned long ret; struct timespec now; /* @@ -81,10 +80,8 @@ long select_estimate_accuracy(struct timespec *tv) ktime_get_ts(&now); now = timespec_sub(*tv, now); - ret = __estimate_accuracy(&now); - if (ret < current->timer_slack_ns) - return current->timer_slack_ns; - return ret; + return clamp(__estimate_accuracy(&now), + get_task_timer_slack(current), LONG_MAX); } diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h index ccefff0..e399228 100644 --- a/include/linux/cgroup_subsys.h +++ b/include/linux/cgroup_subsys.h @@ -66,3 +66,9 @@ SUBSYS(blkio) #endif /* */ + +#ifdef CONFIG_CGROUP_TIMER_SLACK +SUBSYS(timer_slack) +#endif + +/* */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 777d8a5..3751aaa 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2620,6 +2620,15 @@ static inline unsigned long rlimit_max(unsigned int limit) return task_rlimit_max(current, limit); } +#ifdef CONFIG_CGROUP_TIMER_SLACK +extern unsigned long get_task_timer_slack(struct task_struct *tsk); +#else +static inline unsigned long get_task_timer_slack(struct task_struct *tsk) +{ + return tsk->timer_slack_ns; +} +#endif + #endif /* __KERNEL__ */ #endif diff --git a/init/Kconfig b/init/Kconfig index be788c0..bbc4d9c 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -596,6 +596,14 @@ config CGROUP_FREEZER Provides a way to freeze and unfreeze all tasks in a cgroup. +config CGROUP_TIMER_SLACK + bool "Timer slack cgroup controller" + help + Provides a way to set minimal timer slack value for tasks in + a cgroup. + It's useful in mobile devices where certain background apps + are attached to a cgroup and combined wakeups are desired. + config CGROUP_DEVICE bool "Device controller for cgroups" help diff --git a/kernel/Makefile b/kernel/Makefile index 353d3fe..0b60239 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o obj-$(CONFIG_COMPAT) += compat.o obj-$(CONFIG_CGROUPS) += cgroup.o obj-$(CONFIG_CGROUP_FREEZER) += cgroup_freezer.o +obj-$(CONFIG_CGROUP_TIMER_SLACK) += cgroup_timer_slack.o obj-$(CONFIG_CPUSETS) += cpuset.o obj-$(CONFIG_CGROUP_NS) += ns_cgroup.o obj-$(CONFIG_UTS_NS) += utsname.o diff --git a/kernel/cgroup_timer_slack.c b/kernel/cgroup_timer_slack.c new file mode 100644 index 0000000..c300125 --- /dev/null +++ b/kernel/cgroup_timer_slack.c @@ -0,0 +1,107 @@ +#include <linux/cgroup.h> +#include <linux/slab.h> + +struct cgroup_subsys timer_slack_subsys; +struct tslack_cgroup { + struct cgroup_subsys_state css; + unsigned long min_slack_ns; +}; + +static struct tslack_cgroup *cgroup_to_tslack(struct cgroup *cgroup) +{ + struct cgroup_subsys_state *css; + + css = cgroup_subsys_state(cgroup, timer_slack_subsys.subsys_id); + return container_of(css, struct tslack_cgroup, css); +} + +static struct cgroup_subsys_state *tslack_create(struct cgroup_subsys *subsys, + struct cgroup *cgroup) +{ + struct tslack_cgroup *tslack_cgroup; + + tslack_cgroup = kmalloc(sizeof(*tslack_cgroup), GFP_KERNEL); + if (!tslack_cgroup) + return ERR_PTR(-ENOMEM); + + if (cgroup->parent) { + struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent); + tslack_cgroup->min_slack_ns = parent->min_slack_ns; + } else + tslack_cgroup->min_slack_ns = 0UL; + + return &tslack_cgroup->css; +} + +static void tslack_destroy(struct cgroup_subsys *tslack_cgroup, + struct cgroup *cgroup) +{ + kfree(cgroup_to_tslack(cgroup)); +} + +static u64 tslack_read_min(struct cgroup *cgroup, struct cftype *cft) +{ + return cgroup_to_tslack(cgroup)->min_slack_ns; +} + +static int tslack_write_min(struct cgroup *cgroup, struct cftype *cft, u64 val) +{ + struct cgroup *cur; + + if (val > ULONG_MAX) + return -EINVAL; + + /* the min timer slack value should be more or equal than parent's */ + if (cgroup->parent) { + struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent); + if (parent->min_slack_ns > val) + return -EPERM; + } + + cgroup_to_tslack(cgroup)->min_slack_ns = val; + + /* update children's min slack value if needed */ + list_for_each_entry(cur, &cgroup->children, sibling) { + struct tslack_cgroup *child = cgroup_to_tslack(cur); + if (val > child->min_slack_ns) + tslack_write_min(cur, cft, val); + } + + return 0; +} + +static struct cftype files[] = { + { + .name = "min_slack_ns", + .read_u64 = tslack_read_min, + .write_u64 = tslack_write_min, + } +}; + +static int tslack_populate(struct cgroup_subsys *subsys, struct cgroup *cgroup) +{ + return cgroup_add_files(cgroup, subsys, files, ARRAY_SIZE(files)); +} + +struct cgroup_subsys timer_slack_subsys = { + .name = "timer_slack", + .subsys_id = timer_slack_subsys_id, + .create = tslack_create, + .destroy = tslack_destroy, + .populate = tslack_populate, +}; + +unsigned long get_task_timer_slack(struct task_struct *tsk) +{ + struct cgroup_subsys_state *css; + struct tslack_cgroup *tslack_cgroup; + unsigned long ret; + + rcu_read_lock(); + css = task_subsys_state(tsk, timer_slack_subsys.subsys_id); + tslack_cgroup = container_of(css, struct tslack_cgroup, css); + ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns); + rcu_read_unlock(); + + return ret; +} diff --git a/kernel/futex.c b/kernel/futex.c index b766d28..eca8773 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1845,7 +1845,7 @@ static int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, HRTIMER_MODE_ABS); hrtimer_init_sleeper(to, current); hrtimer_set_expires_range_ns(&to->timer, *abs_time, - current->timer_slack_ns); + get_task_timer_slack(current)); } retry: @@ -2242,7 +2242,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags, HRTIMER_MODE_ABS); hrtimer_init_sleeper(to, current); hrtimer_set_expires_range_ns(&to->timer, *abs_time, - current->timer_slack_ns); + get_task_timer_slack(current)); } /* diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c index 0c8d7c0..cdf47ba 100644 --- a/kernel/hrtimer.c +++ b/kernel/hrtimer.c @@ -1542,7 +1542,7 @@ long hrtimer_nanosleep(struct timespec *rqtp, struct timespec __user *rmtp, int ret = 0; unsigned long slack; - slack = current->timer_slack_ns; + slack = get_task_timer_slack(current); if (rt_task(current)) slack = 0; -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 10+ messages in thread
[parent not found: <1299084001-3916-2-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org>]
* Re: [PATCH, v7] cgroups: introduce timer slack controller [not found] ` <1299084001-3916-2-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> @ 2011-03-02 18:40 ` Thomas Gleixner [not found] ` <alpine.LFD.2.00.1103021912090.2701-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org> 0 siblings, 1 reply; 10+ messages in thread From: Thomas Gleixner @ 2011-03-02 18:40 UTC (permalink / raw) To: Kirill A. Shutsemov Cc: Paul Menage, Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote: Not CC'ing me does not avoid another review :) > diff --git a/fs/select.c b/fs/select.c > index e56560d..a189e4d 100644 > --- a/fs/select.c > +++ b/fs/select.c > @@ -69,7 +69,6 @@ static long __estimate_accuracy(struct timespec *tv) > > long select_estimate_accuracy(struct timespec *tv) > { > - unsigned long ret; > struct timespec now; > > /* > @@ -81,10 +80,8 @@ long select_estimate_accuracy(struct timespec *tv) > > ktime_get_ts(&now); > now = timespec_sub(*tv, now); > - ret = __estimate_accuracy(&now); > - if (ret < current->timer_slack_ns) > - return current->timer_slack_ns; > - return ret; > + return clamp(__estimate_accuracy(&now), > + get_task_timer_slack(current), LONG_MAX); > } Can you please split out the get_task_timer_slack() change into a separate patch? Also the function wants to be named different. task_get_effective_timer_slack() or such, so it becomes clear, that it's not just a wrapper around tsk->timer_slack_ns And the places which access tsk->timer_slack_ns directly should be updated with comments, why they are not using the wrapper. > +static int tslack_write_min(struct cgroup *cgroup, struct cftype *cft, u64 val) > +{ > + struct cgroup *cur; > + > + if (val > ULONG_MAX) > + return -EINVAL; > + > + /* the min timer slack value should be more or equal than parent's */ s/should/must/ > + if (cgroup->parent) { > + struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent); New line between variables and code please > + if (parent->min_slack_ns > val) > + return -EPERM; > + } > + > + cgroup_to_tslack(cgroup)->min_slack_ns = val; > + > + /* update children's min slack value if needed */ > + list_for_each_entry(cur, &cgroup->children, sibling) { > + struct tslack_cgroup *child = cgroup_to_tslack(cur); Ditto > + if (val > child->min_slack_ns) > + tslack_write_min(cur, cft, val); > + } So, we can increase the value and propagate it through the groups children, but decreasing it requires to go through all child groups seperately. When I'm trying to optimize something during runtime and chose a too high value I need to go through hoops and loops to make it smaller again. Asymetric behaviour sucks always. > +unsigned long get_task_timer_slack(struct task_struct *tsk) > +{ > + struct cgroup_subsys_state *css; > + struct tslack_cgroup *tslack_cgroup; > + unsigned long ret; > + > + rcu_read_lock(); Did you just remove the odd comment or actually figure out why you need rcu_read_lock() here ? > + css = task_subsys_state(tsk, timer_slack_subsys.subsys_id); > + tslack_cgroup = container_of(css, struct tslack_cgroup, css); > + ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns); > + rcu_read_unlock(); > + > + return ret; > +} Otherwise, it's way more palatable than the last one. Thanks, tglx ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <alpine.LFD.2.00.1103021912090.2701-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org>]
* Re: [PATCH, v7] cgroups: introduce timer slack controller [not found] ` <alpine.LFD.2.00.1103021912090.2701-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org> @ 2011-03-03 6:59 ` Li Zefan [not found] ` <4D6F3C5B.8020307-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> 2011-03-03 7:46 ` Matt Helsley 2011-03-03 8:33 ` Kirill A. Shutemov 2 siblings, 1 reply; 10+ messages in thread From: Li Zefan @ 2011-03-03 6:59 UTC (permalink / raw) To: Thomas Gleixner Cc: Kirill A. Shutsemov, Paul Menage, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA >> +unsigned long get_task_timer_slack(struct task_struct *tsk) >> +{ >> + struct cgroup_subsys_state *css; >> + struct tslack_cgroup *tslack_cgroup; >> + unsigned long ret; >> + >> + rcu_read_lock(); > > Did you just remove the odd comment or actually figure out why you > need rcu_read_lock() here ? > It's necessary to protect against task exiting or task moving between cgroups. >> + css = task_subsys_state(tsk, timer_slack_subsys.subsys_id); >> + tslack_cgroup = container_of(css, struct tslack_cgroup, css); >> + ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns); >> + rcu_read_unlock(); >> + >> + return ret; >> +} > > Otherwise, it's way more palatable than the last one. > > Thanks, > > tglx > ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <4D6F3C5B.8020307-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>]
* Re: [PATCH, v7] cgroups: introduce timer slack controller [not found] ` <4D6F3C5B.8020307-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> @ 2011-03-03 7:34 ` Thomas Gleixner 0 siblings, 0 replies; 10+ messages in thread From: Thomas Gleixner @ 2011-03-03 7:34 UTC (permalink / raw) To: Li Zefan Cc: Kirill A. Shutsemov, Paul Menage, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA On Thu, 3 Mar 2011, Li Zefan wrote: > >> +unsigned long get_task_timer_slack(struct task_struct *tsk) > >> +{ > >> + struct cgroup_subsys_state *css; > >> + struct tslack_cgroup *tslack_cgroup; > >> + unsigned long ret; > >> + > >> + rcu_read_lock(); > > > > Did you just remove the odd comment or actually figure out why you > > need rcu_read_lock() here ? > > > > It's necessary to protect against task exiting or task moving between cgroups. I know, though after the last review I wanted to make sure, that the author understands it as well and not just removed the odd comment just because I ranted about it :) Thanks, tglx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH, v7] cgroups: introduce timer slack controller [not found] ` <alpine.LFD.2.00.1103021912090.2701-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org> 2011-03-03 6:59 ` Li Zefan @ 2011-03-03 7:46 ` Matt Helsley [not found] ` <20110303074609.GL14893-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org> 2011-03-03 8:33 ` Kirill A. Shutemov 2 siblings, 1 reply; 10+ messages in thread From: Matt Helsley @ 2011-03-03 7:46 UTC (permalink / raw) To: Thomas Gleixner Cc: Kirill A. Shutsemov, Paul Menage, Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA On Wed, Mar 02, 2011 at 07:40:01PM +0100, Thomas Gleixner wrote: > On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote: <snip> > > + if (val > child->min_slack_ns) > > + tslack_write_min(cur, cft, val); > > + } > > So, we can increase the value and propagate it through the groups > children, but decreasing it requires to go through all child groups > seperately. One goal is to restrict shrinking timer slacks so that they cannot be less than the parent cgroup's. > > When I'm trying to optimize something during runtime and chose a too > high value I need to go through hoops and loops to make it smaller > again. > > Asymetric behaviour sucks always. Well, in this case I think the asymmetry is less sucky because I don't see any point in imposing the same restrictions on raising the timer slack *except* this symmetry argument. But I haven't played much with timer slacks so I don't know: Is there any case where raising the timer slack would be harmful? Would making the values additive solve the symmetry problem? In other words, the "minimum" you see in a cgroups min_slack_ns file is the minimum in addition to the minimum timer slack of its parent. Then, so long as negative values are disallowed, you can't possibly write values that violate this restriction. We could re-evaluate the resulting minimum timer slack internally during writes to the file so functions using timer slack won't have to walk the cgroup parents to calculate it but it couldn't result in EPERM... Cheers, -Matt Helsley ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <20110303074609.GL14893-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>]
* Re: [PATCH, v7] cgroups: introduce timer slack controller [not found] ` <20110303074609.GL14893-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org> @ 2011-03-03 8:28 ` Thomas Gleixner 0 siblings, 0 replies; 10+ messages in thread From: Thomas Gleixner @ 2011-03-03 8:28 UTC (permalink / raw) To: Matt Helsley Cc: Kirill A. Shutsemov, Paul Menage, Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA On Wed, 2 Mar 2011, Matt Helsley wrote: > On Wed, Mar 02, 2011 at 07:40:01PM +0100, Thomas Gleixner wrote: > > On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote: > > <snip> > > > > + if (val > child->min_slack_ns) > > > + tslack_write_min(cur, cft, val); > > > + } > > > > So, we can increase the value and propagate it through the groups > > children, but decreasing it requires to go through all child groups > > seperately. > > One goal is to restrict shrinking timer slacks so that they > cannot be less than the parent cgroup's. Right, but that's not what the code is doing. > > > > When I'm trying to optimize something during runtime and chose a too > > high value I need to go through hoops and loops to make it smaller > > again. > > > > Asymetric behaviour sucks always. > > Well, in this case I think the asymmetry is less sucky because I don't > see any point in imposing the same restrictions on raising the timer > slack *except* this symmetry argument. But I haven't played much with > timer slacks so I don't know: Is there any case where raising the timer > slack would be harmful? Well, the slack can delay your timer expiry by slack ns. So chosing a too large slack value can affect the functionality of an application. Now imagine you are tuning your system and typo the slack value by an order of magnitude, which makes your apps unhappy. Then you definitely want to be able to undo that w/o going through loops and circles. > Would making the values additive solve the symmetry problem? In other words, > the "minimum" you see in a cgroups min_slack_ns file is the minimum in > addition to the minimum timer slack of its parent. Then, so long as negative > values are disallowed, you can't possibly write values that violate this > restriction. We could re-evaluate the resulting minimum timer slack internally > during writes to the file so functions using timer slack won't have to walk > the cgroup parents to calculate it but it couldn't result in EPERM... That would work, but that wants some interface to read out the effective slack value for a group. Thanks, tglx ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH, v7] cgroups: introduce timer slack controller [not found] ` <alpine.LFD.2.00.1103021912090.2701-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org> 2011-03-03 6:59 ` Li Zefan 2011-03-03 7:46 ` Matt Helsley @ 2011-03-03 8:33 ` Kirill A. Shutemov 2 siblings, 0 replies; 10+ messages in thread From: Kirill A. Shutemov @ 2011-03-03 8:33 UTC (permalink / raw) To: Thomas Gleixner Cc: Paul Menage, Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA On Wed, Mar 02, 2011 at 07:40:01PM +0100, Thomas Gleixner wrote: > On Wed, 2 Mar 2011, Kirill A. Shutsemov wrote: > > Not CC'ing me does not avoid another review :) Sorry for that. It's by mistake. > > diff --git a/fs/select.c b/fs/select.c > > index e56560d..a189e4d 100644 > > --- a/fs/select.c > > +++ b/fs/select.c > > @@ -69,7 +69,6 @@ static long __estimate_accuracy(struct timespec *tv) > > > > long select_estimate_accuracy(struct timespec *tv) > > { > > - unsigned long ret; > > struct timespec now; > > > > /* > > @@ -81,10 +80,8 @@ long select_estimate_accuracy(struct timespec *tv) > > > > ktime_get_ts(&now); > > now = timespec_sub(*tv, now); > > - ret = __estimate_accuracy(&now); > > - if (ret < current->timer_slack_ns) > > - return current->timer_slack_ns; > > - return ret; > > + return clamp(__estimate_accuracy(&now), > > + get_task_timer_slack(current), LONG_MAX); > > } > > Can you please split out the get_task_timer_slack() change into a > separate patch? > > Also the function wants to be named different. > > task_get_effective_timer_slack() or such, so it becomes clear, that > it's not just a wrapper around tsk->timer_slack_ns > > And the places which access tsk->timer_slack_ns directly should be > updated with comments, why they are not using the wrapper. Ok, I'll fix it. > > +static int tslack_write_min(struct cgroup *cgroup, struct cftype *cft, u64 val) > > +{ > > + struct cgroup *cur; > > + > > + if (val > ULONG_MAX) > > + return -EINVAL; > > + > > + /* the min timer slack value should be more or equal than parent's */ > > s/should/must/ Ok. > > + if (cgroup->parent) { > > + struct tslack_cgroup *parent = cgroup_to_tslack(cgroup->parent); > > New line between variables and code please Ok. > > + if (parent->min_slack_ns > val) > > + return -EPERM; > > + } > > + > > + cgroup_to_tslack(cgroup)->min_slack_ns = val; > > + > > + /* update children's min slack value if needed */ > > + list_for_each_entry(cur, &cgroup->children, sibling) { > > + struct tslack_cgroup *child = cgroup_to_tslack(cur); > > Ditto > > > + if (val > child->min_slack_ns) > > + tslack_write_min(cur, cft, val); > > + } > > So, we can increase the value and propagate it through the groups > children, but decreasing it requires to go through all child groups > seperately. > > When I'm trying to optimize something during runtime and chose a too > high value I need to go through hoops and loops to make it smaller > again. > > Asymetric behaviour sucks always. Other option is to remove restriction on the min_slack_ns value and modify get_task_timer_slack() to use the most strict value up by hierarchy. Do you prefer this approach? > > +unsigned long get_task_timer_slack(struct task_struct *tsk) > > +{ > > + struct cgroup_subsys_state *css; > > + struct tslack_cgroup *tslack_cgroup; > > + unsigned long ret; > > + > > + rcu_read_lock(); > > Did you just remove the odd comment or actually figure out why you > need rcu_read_lock() here ? The second ;) > > + css = task_subsys_state(tsk, timer_slack_subsys.subsys_id); > > + tslack_cgroup = container_of(css, struct tslack_cgroup, css); > > + ret = max(tsk->timer_slack_ns, tslack_cgroup->min_slack_ns); > > + rcu_read_unlock(); > > + > > + return ret; > > +} > > Otherwise, it's way more palatable than the last one. Thanks for reviewing. -- Kirill A. Shutemov ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH, v7] cgroups: introduce timer slack controller 2011-03-02 16:40 ` [PATCH, v7] cgroups: introduce " Kirill A. Shutsemov [not found] ` <1299084001-3916-2-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> @ 2011-03-04 19:03 ` Peter Zijlstra 2011-03-04 23:22 ` Kirill A. Shutemov 1 sibling, 1 reply; 10+ messages in thread From: Peter Zijlstra @ 2011-03-04 19:03 UTC (permalink / raw) To: Kirill A. Shutsemov Cc: Paul Menage, Li Zefan, containers, jacob.jun.pan, Arjan van de Ven, linux-kernel, Matt Helsley, Andrew Morton, linux-api On Wed, 2011-03-02 at 18:40 +0200, Kirill A. Shutsemov wrote: > - if (ret < current->timer_slack_ns) > - return current->timer_slack_ns; > - return ret; > + return clamp(__estimate_accuracy(&now), > + get_task_timer_slack(current), LONG_MAX); That actually makes the code worse, how about: min(__estimate_accuracy(), get_task_timer_slack()) ? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH, v7] cgroups: introduce timer slack controller 2011-03-04 19:03 ` Peter Zijlstra @ 2011-03-04 23:22 ` Kirill A. Shutemov 0 siblings, 0 replies; 10+ messages in thread From: Kirill A. Shutemov @ 2011-03-04 23:22 UTC (permalink / raw) To: Peter Zijlstra Cc: Paul Menage, Li Zefan, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, jacob.jun.pan-VuQAYsv1563Yd54FQh9/CA, Arjan van de Ven, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Matt Helsley, Andrew Morton, linux-api-u79uwXL29TY76Z2rM5mHXA On Fri, Mar 04, 2011 at 08:03:42PM +0100, Peter Zijlstra wrote: > On Wed, 2011-03-02 at 18:40 +0200, Kirill A. Shutsemov wrote: > > - if (ret < current->timer_slack_ns) > > - return current->timer_slack_ns; > > - return ret; > > + return clamp(__estimate_accuracy(&now), > > + get_task_timer_slack(current), LONG_MAX); > > That actually makes the code worse, how about: > > min(__estimate_accuracy(), get_task_timer_slack()) ? It's better. Thanks. -- Kirill A. Shutemov ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2011-03-04 23:22 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-03-02 16:40 [PATCH, v7] Introduce timer slack controller Kirill A. Shutsemov [not found] ` <1299084001-3916-1-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> 2011-03-02 16:40 ` [PATCH, v7] cgroups: introduce " Kirill A. Shutsemov [not found] ` <1299084001-3916-2-git-send-email-kirill-oKw7cIdHH8eLwutG50LtGA@public.gmane.org> 2011-03-02 18:40 ` Thomas Gleixner [not found] ` <alpine.LFD.2.00.1103021912090.2701-bi+AKbBUZKagILUCTcTcHdKyNwTtLsGr@public.gmane.org> 2011-03-03 6:59 ` Li Zefan [not found] ` <4D6F3C5B.8020307-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> 2011-03-03 7:34 ` Thomas Gleixner 2011-03-03 7:46 ` Matt Helsley [not found] ` <20110303074609.GL14893-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org> 2011-03-03 8:28 ` Thomas Gleixner 2011-03-03 8:33 ` Kirill A. Shutemov 2011-03-04 19:03 ` Peter Zijlstra 2011-03-04 23:22 ` Kirill A. Shutemov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).