* [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
@ 2009-04-28 6:53 KOSAKI Motohiro
2009-04-28 7:31 ` Bharata B Rao
2009-04-28 22:08 ` Balbir Singh
0 siblings, 2 replies; 15+ messages in thread
From: KOSAKI Motohiro @ 2009-04-28 6:53 UTC (permalink / raw)
To: LKML, Bharata B Rao, Balaji Rao, Dhaval Giani, KAMEZAWA Hiroyuki,
Peter Zijlstra, Balbir Singh, Ingo Molnar, Martin Schwidefsky,
seto.hidetoshi
Cc: kosaki.motohiro
I'm not cpuacct expert. please give me comment.
====================
Subject: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime caching
impact: little performance improvement
cpuacct_update_stats() is called at every tick updating. and it use percpu_counter
for avoiding performance degression.
Unfortunately, it doesn't works on VIRT_CPU_ACCOUNTING=y environment properly.
if VIRT_CPU_ACCOUNTING=y, every tick update much than 1000 cputime.
Thus every percpu_counter_add() makes spinlock grabbing and update non-percpu-variable.
This patch change the batch rule. now, every cpu can store "percpu_counter_bach x jiffies"
cputime in percpu cache.
it mean this patch don't have behavior change if VIRT_CPU_ACCOUNTING=n, but
works well on VIRT_CPU_ACCOUNTING=y too.
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Balaji Rao <balajirrao@gmail.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
kernel/sched.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Index: b/kernel/sched.c
===================================================================
--- a/kernel/sched.c 2009-04-28 14:18:36.000000000 +0900
+++ b/kernel/sched.c 2009-04-28 15:18:07.000000000 +0900
@@ -10117,6 +10117,7 @@ struct cpuacct {
};
struct cgroup_subsys cpuacct_subsys;
+static s32 cpuacct_batch;
/* return cpu accounting group corresponding to this container */
static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
@@ -10146,6 +10147,9 @@ static struct cgroup_subsys_state *cpuac
if (!ca->cpuusage)
goto out_free_ca;
+ if (!cpuacct_batch)
+ cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
+
for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
if (percpu_counter_init(&ca->cpustat[i], 0))
goto out_free_counters;
@@ -10342,7 +10346,7 @@ static void cpuacct_update_stats(struct
ca = task_ca(tsk);
do {
- percpu_counter_add(&ca->cpustat[idx], val);
+ __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
ca = ca->parent;
} while (ca);
rcu_read_unlock();
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 6:53 [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count KOSAKI Motohiro
@ 2009-04-28 7:31 ` Bharata B Rao
2009-04-28 7:38 ` KOSAKI Motohiro
2009-04-28 22:08 ` Balbir Singh
1 sibling, 1 reply; 15+ messages in thread
From: Bharata B Rao @ 2009-04-28 7:31 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: LKML, Balaji Rao, Dhaval Giani, KAMEZAWA Hiroyuki, Peter Zijlstra,
Balbir Singh, Ingo Molnar, Martin Schwidefsky, seto.hidetoshi
On Tue, Apr 28, 2009 at 03:53:32PM +0900, KOSAKI Motohiro wrote:
>
> I'm not cpuacct expert. please give me comment.
>
> ====================
> Subject: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime caching
>
> impact: little performance improvement
>
> cpuacct_update_stats() is called at every tick updating. and it use percpu_counter
> for avoiding performance degression.
>
> Unfortunately, it doesn't works on VIRT_CPU_ACCOUNTING=y environment properly.
> if VIRT_CPU_ACCOUNTING=y, every tick update much than 1000 cputime.
> Thus every percpu_counter_add() makes spinlock grabbing and update non-percpu-variable.
>
> This patch change the batch rule. now, every cpu can store "percpu_counter_bach x jiffies"
> cputime in percpu cache.
> it mean this patch don't have behavior change if VIRT_CPU_ACCOUNTING=n, but
> works well on VIRT_CPU_ACCOUNTING=y too.
Let me try to understand what you are saying...
For archs which define VIRT_CPU_ACCOUNTING, every tick would result
in around 1000 units of cputime updates and since this is much much greater
than percpu_batch_counter, we end up taking spinlock on every tick.
If my above reading of the problem is correct, please look at my below comments.
> Index: b/kernel/sched.c
> ===================================================================
> --- a/kernel/sched.c 2009-04-28 14:18:36.000000000 +0900
> +++ b/kernel/sched.c 2009-04-28 15:18:07.000000000 +0900
> @@ -10117,6 +10117,7 @@ struct cpuacct {
> };
>
> struct cgroup_subsys cpuacct_subsys;
> +static s32 cpuacct_batch;
>
> /* return cpu accounting group corresponding to this container */
> static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
> @@ -10146,6 +10147,9 @@ static struct cgroup_subsys_state *cpuac
> if (!ca->cpuusage)
> goto out_free_ca;
>
> + if (!cpuacct_batch)
> + cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
You essentially end up increasing the batch value from the default value
of max(32, nr_cpus*2).
> +
> for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
> if (percpu_counter_init(&ca->cpustat[i], 0))
> goto out_free_counters;
> @@ -10342,7 +10346,7 @@ static void cpuacct_update_stats(struct
> ca = task_ca(tsk);
>
> do {
> - percpu_counter_add(&ca->cpustat[idx], val);
> + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
And you do this unconditionally which will affect all archs ? So you make
this behaviour default for archs which have VIRT_CPU_ACCOUNTING=n.
BTW, did you observe any real problem with the percpu counter spinlock ?
Regards,
Bharata.
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 7:31 ` Bharata B Rao
@ 2009-04-28 7:38 ` KOSAKI Motohiro
2009-04-28 7:50 ` Bharata B Rao
0 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2009-04-28 7:38 UTC (permalink / raw)
To: bharata
Cc: kosaki.motohiro, LKML, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Balbir Singh, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
> On Tue, Apr 28, 2009 at 03:53:32PM +0900, KOSAKI Motohiro wrote:
> >
> > I'm not cpuacct expert. please give me comment.
> >
> > ====================
> > Subject: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime caching
> >
> > impact: little performance improvement
> >
> > cpuacct_update_stats() is called at every tick updating. and it use percpu_counter
> > for avoiding performance degression.
> >
> > Unfortunately, it doesn't works on VIRT_CPU_ACCOUNTING=y environment properly.
> > if VIRT_CPU_ACCOUNTING=y, every tick update much than 1000 cputime.
> > Thus every percpu_counter_add() makes spinlock grabbing and update non-percpu-variable.
> >
> > This patch change the batch rule. now, every cpu can store "percpu_counter_bach x jiffies"
> > cputime in percpu cache.
> > it mean this patch don't have behavior change if VIRT_CPU_ACCOUNTING=n, but
> > works well on VIRT_CPU_ACCOUNTING=y too.
>
> Let me try to understand what you are saying...
>
> For archs which define VIRT_CPU_ACCOUNTING, every tick would result
> in around 1000 units of cputime updates and since this is much much greater
> than percpu_batch_counter, we end up taking spinlock on every tick.
Correct.
>
> If my above reading of the problem is correct, please look at my below comments.
>
> > Index: b/kernel/sched.c
> > ===================================================================
> > --- a/kernel/sched.c 2009-04-28 14:18:36.000000000 +0900
> > +++ b/kernel/sched.c 2009-04-28 15:18:07.000000000 +0900
> > @@ -10117,6 +10117,7 @@ struct cpuacct {
> > };
> >
> > struct cgroup_subsys cpuacct_subsys;
> > +static s32 cpuacct_batch;
> >
> > /* return cpu accounting group corresponding to this container */
> > static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
> > @@ -10146,6 +10147,9 @@ static struct cgroup_subsys_state *cpuac
> > if (!ca->cpuusage)
> > goto out_free_ca;
> >
> > + if (!cpuacct_batch)
> > + cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
>
> You essentially end up increasing the batch value from the default value
> of max(32, nr_cpus*2).
>
> > +
> > for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
> > if (percpu_counter_init(&ca->cpustat[i], 0))
> > goto out_free_counters;
> > @@ -10342,7 +10346,7 @@ static void cpuacct_update_stats(struct
> > ca = task_ca(tsk);
> >
> > do {
> > - percpu_counter_add(&ca->cpustat[idx], val);
> > + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
>
> And you do this unconditionally which will affect all archs ? So you make
> this behaviour default for archs which have VIRT_CPU_ACCOUNTING=n.
if VIRT_CPU_ACCOUNTING=n, jiffies_to_cputime(n) return n.
IOW, jiffies_to_cputime() don't change value.
and percpu_counter() use percpu_counter_batch as batch.
Then, if VIRT_CPU_ACCOUNTING=n, this patch don't change behavior.
> BTW, did you observe any real problem with the percpu counter spinlock ?
No.
I review percpu_counter() caller recently and it seems stragen usage.
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 7:38 ` KOSAKI Motohiro
@ 2009-04-28 7:50 ` Bharata B Rao
2009-04-28 8:10 ` KOSAKI Motohiro
0 siblings, 1 reply; 15+ messages in thread
From: Bharata B Rao @ 2009-04-28 7:50 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: LKML, Balaji Rao, Dhaval Giani, KAMEZAWA Hiroyuki, Peter Zijlstra,
Balbir Singh, Ingo Molnar, Martin Schwidefsky, seto.hidetoshi
On Tue, Apr 28, 2009 at 04:38:51PM +0900, KOSAKI Motohiro wrote:
> > On Tue, Apr 28, 2009 at 03:53:32PM +0900, KOSAKI Motohiro wrote:
>
> > BTW, did you observe any real problem with the percpu counter spinlock ?
>
> No.
> I review percpu_counter() caller recently and it seems stragen usage.
>
I should have phrased the question better ...
So have you found any performance degradation with any benchmarks/workloads
on archs which define VIRT_CPU_ACCOUNTING due to percpu_counter spinlock
being taken on every tick ? If the answer is no, don't you think we could
wait before making the kind of change you are proposing ?
Regards,
Bharata.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 7:50 ` Bharata B Rao
@ 2009-04-28 8:10 ` KOSAKI Motohiro
2009-04-28 10:37 ` Bharata B Rao
0 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2009-04-28 8:10 UTC (permalink / raw)
To: bharata
Cc: kosaki.motohiro, LKML, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Balbir Singh, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
> On Tue, Apr 28, 2009 at 04:38:51PM +0900, KOSAKI Motohiro wrote:
> > > On Tue, Apr 28, 2009 at 03:53:32PM +0900, KOSAKI Motohiro wrote:
> >
> > > BTW, did you observe any real problem with the percpu counter spinlock ?
> >
> > No.
> > I review percpu_counter() caller recently and it seems stragen usage.
> >
>
> I should have phrased the question better ...
>
> So have you found any performance degradation with any benchmarks/workloads
> on archs which define VIRT_CPU_ACCOUNTING due to percpu_counter spinlock
> being taken on every tick ? If the answer is no, don't you think we could
> wait before making the kind of change you are proposing ?
maybe, I don't understand your point.
I don't oppose you wait something :)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 8:10 ` KOSAKI Motohiro
@ 2009-04-28 10:37 ` Bharata B Rao
2009-04-29 2:31 ` KOSAKI Motohiro
0 siblings, 1 reply; 15+ messages in thread
From: Bharata B Rao @ 2009-04-28 10:37 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: LKML, Balaji Rao, Dhaval Giani, KAMEZAWA Hiroyuki, Peter Zijlstra,
Balbir Singh, Ingo Molnar, Martin Schwidefsky, seto.hidetoshi
On Tue, Apr 28, 2009 at 05:10:47PM +0900, KOSAKI Motohiro wrote:
> > On Tue, Apr 28, 2009 at 04:38:51PM +0900, KOSAKI Motohiro wrote:
> > > > On Tue, Apr 28, 2009 at 03:53:32PM +0900, KOSAKI Motohiro wrote:
> > >
> > > > BTW, did you observe any real problem with the percpu counter spinlock ?
> > >
> > > No.
> > > I review percpu_counter() caller recently and it seems stragen usage.
> > >
> >
> > I should have phrased the question better ...
> >
> > So have you found any performance degradation with any benchmarks/workloads
> > on archs which define VIRT_CPU_ACCOUNTING due to percpu_counter spinlock
> > being taken on every tick ? If the answer is no, don't you think we could
> > wait before making the kind of change you are proposing ?
>
> maybe, I don't understand your point.
My point is, let us not make this change if it is not a real problem that
has been observed on archs which define VIRT_CPU_ACCOUNTING.
Regards,
Bharata.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 10:37 ` Bharata B Rao
@ 2009-04-29 2:31 ` KOSAKI Motohiro
2009-04-29 3:30 ` Bharata B Rao
0 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2009-04-29 2:31 UTC (permalink / raw)
To: bharata
Cc: kosaki.motohiro, LKML, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Balbir Singh, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
> > > > > BTW, did you observe any real problem with the percpu counter spinlock ?
> > > >
> > > > No.
> > > > I review percpu_counter() caller recently and it seems stragen usage.
> > > >
> > >
> > > I should have phrased the question better ...
> > >
> > > So have you found any performance degradation with any benchmarks/workloads
> > > on archs which define VIRT_CPU_ACCOUNTING due to percpu_counter spinlock
> > > being taken on every tick ? If the answer is no, don't you think we could
> > > wait before making the kind of change you are proposing ?
> >
> > maybe, I don't understand your point.
>
> My point is, let us not make this change if it is not a real problem that
> has been observed on archs which define VIRT_CPU_ACCOUNTING.
It's nice joke. but not constructive.
but another idea and another patch are always welcome.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-29 2:31 ` KOSAKI Motohiro
@ 2009-04-29 3:30 ` Bharata B Rao
0 siblings, 0 replies; 15+ messages in thread
From: Bharata B Rao @ 2009-04-29 3:30 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: LKML, Balaji Rao, Dhaval Giani, KAMEZAWA Hiroyuki, Peter Zijlstra,
Balbir Singh, Ingo Molnar, Martin Schwidefsky, seto.hidetoshi
On Wed, Apr 29, 2009 at 11:31:14AM +0900, KOSAKI Motohiro wrote:
> > > > > > BTW, did you observe any real problem with the percpu counter spinlock ?
> > > > >
> > > > > No.
> > > > > I review percpu_counter() caller recently and it seems stragen usage.
> > > > >
> > > >
> > > > I should have phrased the question better ...
> > > >
> > > > So have you found any performance degradation with any benchmarks/workloads
> > > > on archs which define VIRT_CPU_ACCOUNTING due to percpu_counter spinlock
> > > > being taken on every tick ? If the answer is no, don't you think we could
> > > > wait before making the kind of change you are proposing ?
> > >
> > > maybe, I don't understand your point.
> >
> > My point is, let us not make this change if it is not a real problem that
> > has been observed on archs which define VIRT_CPU_ACCOUNTING.
>
> It's nice joke. but not constructive.
I was only asking you if you have seen any real life problem with this
and you said no. In that context, if I re-read my above point, I think
I should have a pretty good sense of humour to consider it as joke :)
Regards,
Bharata.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 6:53 [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count KOSAKI Motohiro
2009-04-28 7:31 ` Bharata B Rao
@ 2009-04-28 22:08 ` Balbir Singh
2009-04-29 2:26 ` KOSAKI Motohiro
2009-04-29 3:21 ` Bharata B Rao
1 sibling, 2 replies; 15+ messages in thread
From: Balbir Singh @ 2009-04-28 22:08 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: LKML, Bharata B Rao, Balaji Rao, Dhaval Giani, KAMEZAWA Hiroyuki,
Peter Zijlstra, Ingo Molnar, Martin Schwidefsky, seto.hidetoshi
* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2009-04-28 15:53:32]:
>
> I'm not cpuacct expert. please give me comment.
>
> ====================
> Subject: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime caching
>
> impact: little performance improvement
>
> cpuacct_update_stats() is called at every tick updating. and it use percpu_counter
> for avoiding performance degression.
>
> Unfortunately, it doesn't works on VIRT_CPU_ACCOUNTING=y environment properly.
> if VIRT_CPU_ACCOUNTING=y, every tick update much than 1000 cputime.
> Thus every percpu_counter_add() makes spinlock grabbing and update non-percpu-variable.
>
> This patch change the batch rule. now, every cpu can store "percpu_counter_bach x jiffies"
> cputime in percpu cache.
> it mean this patch don't have behavior change if VIRT_CPU_ACCOUNTING=n, but
> works well on VIRT_CPU_ACCOUNTING=y too.
>
>
> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
> Cc: Balaji Rao <balajirrao@gmail.com>
> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
> kernel/sched.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> Index: b/kernel/sched.c
> ===================================================================
> --- a/kernel/sched.c 2009-04-28 14:18:36.000000000 +0900
> +++ b/kernel/sched.c 2009-04-28 15:18:07.000000000 +0900
> @@ -10117,6 +10117,7 @@ struct cpuacct {
> };
>
> struct cgroup_subsys cpuacct_subsys;
> +static s32 cpuacct_batch;
>
> /* return cpu accounting group corresponding to this container */
> static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
> @@ -10146,6 +10147,9 @@ static struct cgroup_subsys_state *cpuac
> if (!ca->cpuusage)
> goto out_free_ca;
>
> + if (!cpuacct_batch)
> + cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
> +
I expect cpuacct_batch to be a large number
> for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
> if (percpu_counter_init(&ca->cpustat[i], 0))
> goto out_free_counters;
> @@ -10342,7 +10346,7 @@ static void cpuacct_update_stats(struct
> ca = task_ca(tsk);
>
> do {
> - percpu_counter_add(&ca->cpustat[idx], val);
> + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
This will make the end result very off the real value due to large
batch value per cpu. If we are going to go this route, we should
probably consider using __percpu_counter_sum so that the batch value
does not show data that is way off.
> ca = ca->parent;
> } while (ca);
> rcu_read_unlock();
>
>
--
Balbir
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 22:08 ` Balbir Singh
@ 2009-04-29 2:26 ` KOSAKI Motohiro
2009-04-29 5:44 ` Balbir Singh
2009-04-29 3:21 ` Bharata B Rao
1 sibling, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2009-04-29 2:26 UTC (permalink / raw)
To: balbir
Cc: kosaki.motohiro, LKML, Bharata B Rao, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
Hi
Thanks for reviewing.
> > /* return cpu accounting group corresponding to this container */
> > static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
> > @@ -10146,6 +10147,9 @@ static struct cgroup_subsys_state *cpuac
> > if (!ca->cpuusage)
> > goto out_free_ca;
> >
> > + if (!cpuacct_batch)
> > + cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
> > +
>
> I expect cpuacct_batch to be a large number
Yes.
>
> > for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
> > if (percpu_counter_init(&ca->cpustat[i], 0))
> > goto out_free_counters;
> > @@ -10342,7 +10346,7 @@ static void cpuacct_update_stats(struct
> > ca = task_ca(tsk);
> >
> > do {
> > - percpu_counter_add(&ca->cpustat[idx], val);
> > + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
>
> This will make the end result very off the real value due to large
> batch value per cpu. If we are going to go this route, we should
> probably consider using __percpu_counter_sum so that the batch value
> does not show data that is way off.
No problem.
end-user don't see cputime itself. they see converted time.
cpuacct_stats_show() use cputime64_to_clock_t. it mean
the value less than 10msec don't display.
--------------------------------------------------------
static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
struct cgroup_map_cb *cb)
{
struct cpuacct *ca = cgroup_ca(cgrp);
int i;
for (i = 0; i < CPUACCT_STAT_NSTATS; i++) {
s64 val = percpu_counter_read(&ca->cpustat[i]);
val = cputime64_to_clock_t(val);
cb->fill(cb, cpuacct_stat_desc[i], val);
}
return 0;
}
--------------------------------------------------------
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-29 2:26 ` KOSAKI Motohiro
@ 2009-04-29 5:44 ` Balbir Singh
2009-04-29 6:08 ` KOSAKI Motohiro
0 siblings, 1 reply; 15+ messages in thread
From: Balbir Singh @ 2009-04-29 5:44 UTC (permalink / raw)
To: KOSAKI Motohiro
Cc: LKML, Bharata B Rao, Balaji Rao, Dhaval Giani, KAMEZAWA Hiroyuki,
Peter Zijlstra, Ingo Molnar, Martin Schwidefsky, seto.hidetoshi
* KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2009-04-29 11:26:30]:
> Hi
>
> Thanks for reviewing.
>
> > > /* return cpu accounting group corresponding to this container */
> > > static inline struct cpuacct *cgroup_ca(struct cgroup *cgrp)
> > > @@ -10146,6 +10147,9 @@ static struct cgroup_subsys_state *cpuac
> > > if (!ca->cpuusage)
> > > goto out_free_ca;
> > >
> > > + if (!cpuacct_batch)
> > > + cpuacct_batch = jiffies_to_cputime(percpu_counter_batch);
> > > +
> >
> > I expect cpuacct_batch to be a large number
>
> Yes.
>
>
> >
> > > for (i = 0; i < CPUACCT_STAT_NSTATS; i++)
> > > if (percpu_counter_init(&ca->cpustat[i], 0))
> > > goto out_free_counters;
> > > @@ -10342,7 +10346,7 @@ static void cpuacct_update_stats(struct
> > > ca = task_ca(tsk);
> > >
> > > do {
> > > - percpu_counter_add(&ca->cpustat[idx], val);
> > > + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
> >
> > This will make the end result very off the real value due to large
> > batch value per cpu. If we are going to go this route, we should
> > probably consider using __percpu_counter_sum so that the batch value
> > does not show data that is way off.
>
> No problem.
>
> end-user don't see cputime itself. they see converted time.
> cpuacct_stats_show() use cputime64_to_clock_t. it mean
> the value less than 10msec don't display.
>
Yes, I know, I reviewed Bharata's patch and suggested converting to
clock_t for consistency with other metrics.
> --------------------------------------------------------
> static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
> struct cgroup_map_cb *cb)
> {
> struct cpuacct *ca = cgroup_ca(cgrp);
> int i;
>
> for (i = 0; i < CPUACCT_STAT_NSTATS; i++) {
> s64 val = percpu_counter_read(&ca->cpustat[i]);
My point is, this should probably be percpu_counter_sum(), but that
can be expensive and we were willing to tollerate some inaccuracy due
to batch value, I think your patch adds to the inaccuracy even more,
even though it fixes a genuine problem.
> val = cputime64_to_clock_t(val);
> cb->fill(cb, cpuacct_stat_desc[i], val);
> }
> return 0;
> }
> --------------------------------------------------------
>
>
>
>
>
--
Balbir
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-29 5:44 ` Balbir Singh
@ 2009-04-29 6:08 ` KOSAKI Motohiro
0 siblings, 0 replies; 15+ messages in thread
From: KOSAKI Motohiro @ 2009-04-29 6:08 UTC (permalink / raw)
To: balbir
Cc: kosaki.motohiro, LKML, Bharata B Rao, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
> > > This will make the end result very off the real value due to large
> > > batch value per cpu. If we are going to go this route, we should
> > > probably consider using __percpu_counter_sum so that the batch value
> > > does not show data that is way off.
> >
> > No problem.
> >
> > end-user don't see cputime itself. they see converted time.
> > cpuacct_stats_show() use cputime64_to_clock_t. it mean
> > the value less than 10msec don't display.
> >
>
> Yes, I know, I reviewed Bharata's patch and suggested converting to
> clock_t for consistency with other metrics.
Oh, sorry.
I didn't know this.
> > --------------------------------------------------------
> > static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
> > struct cgroup_map_cb *cb)
> > {
> > struct cpuacct *ca = cgroup_ca(cgrp);
> > int i;
> >
> > for (i = 0; i < CPUACCT_STAT_NSTATS; i++) {
> > s64 val = percpu_counter_read(&ca->cpustat[i]);
>
> My point is, this should probably be percpu_counter_sum(), but that
> can be expensive and we were willing to tollerate some inaccuracy due
> to batch value, I think your patch adds to the inaccuracy even more,
> even though it fixes a genuine problem.
Not expensive.
cpuacct_stats_show() is only called when reading stat file.
it definitely slow-path.
I think we can use percpu_counter_sum().
However, I doubt its worth.
before my patch:
VIRT_CPU_ACCOUNTING=y: accuracy but slow
VIRT_CPU_ACCOUNTING=n: inaccuracy few tick but fast
my patch
VIRT_CPU_ACCOUNTING=y: inaccuracy few tick but fast
VIRT_CPU_ACCOUNTING=n: inaccuracy few tick but fast
if my inaccuracy is wrong, current code is also crap when VIRT_CPU_ACCOUNTING=n.
I only make const accuracy to VIRT_CPU_ACCOUNTING=y and n.
Thought?
Although you still think percpu_counter_sum() is better, I can do it.
>
>
> > val = cputime64_to_clock_t(val);
> > cb->fill(cb, cpuacct_stat_desc[i], val);
> > }
> > return 0;
> > }
> > --------------------------------------------------------
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-28 22:08 ` Balbir Singh
2009-04-29 2:26 ` KOSAKI Motohiro
@ 2009-04-29 3:21 ` Bharata B Rao
2009-04-29 6:04 ` Balbir Singh
1 sibling, 1 reply; 15+ messages in thread
From: Bharata B Rao @ 2009-04-29 3:21 UTC (permalink / raw)
To: Balbir Singh
Cc: KOSAKI Motohiro, LKML, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
On Wed, Apr 29, 2009 at 03:38:54AM +0530, Balbir Singh wrote:
> * KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2009-04-28 15:53:32]:
> >
> > do {
> > - percpu_counter_add(&ca->cpustat[idx], val);
> > + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
>
> This will make the end result very off the real value due to large
> batch value per cpu. If we are going to go this route, we should
> probably consider using __percpu_counter_sum so that the batch value
> does not show data that is way off.
But __percpu_counter_sum takes fbc->lock spinlock and that is the cause of
the problem Kosaki mentions I believe.
Regards,
Bharata.
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-29 3:21 ` Bharata B Rao
@ 2009-04-29 6:04 ` Balbir Singh
2009-04-29 6:09 ` Bharata B Rao
0 siblings, 1 reply; 15+ messages in thread
From: Balbir Singh @ 2009-04-29 6:04 UTC (permalink / raw)
To: Bharata B Rao
Cc: KOSAKI Motohiro, LKML, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
* Bharata B Rao <bharata@linux.vnet.ibm.com> [2009-04-29 08:51:46]:
> On Wed, Apr 29, 2009 at 03:38:54AM +0530, Balbir Singh wrote:
> > * KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2009-04-28 15:53:32]:
> > >
> > > do {
> > > - percpu_counter_add(&ca->cpustat[idx], val);
> > > + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
> >
> > This will make the end result very off the real value due to large
> > batch value per cpu. If we are going to go this route, we should
> > probably consider using __percpu_counter_sum so that the batch value
> > does not show data that is way off.
>
> But __percpu_counter_sum takes fbc->lock spinlock and that is the cause of
> the problem Kosaki mentions I believe.
But this is on the user space read side (show stats), I thought
Kosaki's problem was with per_cpu_counter_add called from update
stats.
--
Balbir
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count
2009-04-29 6:04 ` Balbir Singh
@ 2009-04-29 6:09 ` Bharata B Rao
0 siblings, 0 replies; 15+ messages in thread
From: Bharata B Rao @ 2009-04-29 6:09 UTC (permalink / raw)
To: Balbir Singh
Cc: KOSAKI Motohiro, LKML, Balaji Rao, Dhaval Giani,
KAMEZAWA Hiroyuki, Peter Zijlstra, Ingo Molnar,
Martin Schwidefsky, seto.hidetoshi
On Wed, Apr 29, 2009 at 11:34:15AM +0530, Balbir Singh wrote:
> * Bharata B Rao <bharata@linux.vnet.ibm.com> [2009-04-29 08:51:46]:
>
> > On Wed, Apr 29, 2009 at 03:38:54AM +0530, Balbir Singh wrote:
> > > * KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> [2009-04-28 15:53:32]:
> > > >
> > > > do {
> > > > - percpu_counter_add(&ca->cpustat[idx], val);
> > > > + __percpu_counter_add(&ca->cpustat[idx], val, cpuacct_batch);
> > >
> > > This will make the end result very off the real value due to large
> > > batch value per cpu. If we are going to go this route, we should
> > > probably consider using __percpu_counter_sum so that the batch value
> > > does not show data that is way off.
> >
> > But __percpu_counter_sum takes fbc->lock spinlock and that is the cause of
> > the problem Kosaki mentions I believe.
>
> But this is on the user space read side (show stats), I thought
> Kosaki's problem was with per_cpu_counter_add called from update
> stats.
Right. Agreed.
Regards,
Bharata.
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2009-04-29 6:10 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-28 6:53 [PATCH] cpuacct: VIRT_CPU_ACCOUNTING don't prevent percpu cputime count KOSAKI Motohiro
2009-04-28 7:31 ` Bharata B Rao
2009-04-28 7:38 ` KOSAKI Motohiro
2009-04-28 7:50 ` Bharata B Rao
2009-04-28 8:10 ` KOSAKI Motohiro
2009-04-28 10:37 ` Bharata B Rao
2009-04-29 2:31 ` KOSAKI Motohiro
2009-04-29 3:30 ` Bharata B Rao
2009-04-28 22:08 ` Balbir Singh
2009-04-29 2:26 ` KOSAKI Motohiro
2009-04-29 5:44 ` Balbir Singh
2009-04-29 6:08 ` KOSAKI Motohiro
2009-04-29 3:21 ` Bharata B Rao
2009-04-29 6:04 ` Balbir Singh
2009-04-29 6:09 ` Bharata B Rao
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox