public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] cpuacct: split usage into user_usage and sys_usage
@ 2016-03-17  4:19 Zhao Lei
  2016-03-17  4:19 ` [PATCH v3 1/3] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Zhao Lei @ 2016-03-17  4:19 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Zhao Lei

Sometimes, cpuacct.usage is not detialed enough to user
to see how much usage a group used. We want to know how
much time it used in user mode and how much in kernel mode.

This patch introduce some more files to tell user these information.
 # ls /sys/fs/cgroup/cpuacct/cpuacct.usage*
 /sys/fs/cgroup/cpuacct/cpuacct.usage
 /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu
 /sys/fs/cgroup/cpuacct/cpuacct.usage_user
 /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_user
 /sys/fs/cgroup/cpuacct/cpuacct.usage_sys
 /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_sys

Changelog v2->v3:
1: Remove some unnecessary locks and so some cleanup based on
   suggestion from Peter Zijlstra <peterz@infradead.org>

Changelog v1->v2:
1: Rebase on top of 4.5-rc6
2: Fix little spelling typo in description.
3: Fix line over 80 characters

Yang Dongsheng (2):
  cpuacct: rename parameter in cpuusage_write for readability
  cpuacct: split usage into user_usage and sys_usage.

Zhao Lei (1):
  Some small restruct for cpuacct

 kernel/sched/cpuacct.c | 165 +++++++++++++++++++++++++++++++++++--------------
 kernel/sched/cpuacct.h |   4 +-
 2 files changed, 120 insertions(+), 49 deletions(-)

-- 
1.8.5.1

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/3] cpuacct: rename parameter in cpuusage_write for readability
  2016-03-17  4:19 [PATCH v3 0/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
@ 2016-03-17  4:19 ` Zhao Lei
  2016-03-17  4:19 ` [PATCH v3 2/3] Some small restruct for cpuacct Zhao Lei
  2016-03-17  4:19 ` [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
  2 siblings, 0 replies; 8+ messages in thread
From: Zhao Lei @ 2016-03-17  4:19 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Yang Dongsheng, Zhao Lei

From: Yang Dongsheng <yangds.fnst@cn.fujitsu.com>

The name of 'reset' makes a little confusion in reading, we would
say, if we want to reset usage, return -EINVAL. That's not true.

Actually, we want to say, we only allow user to do a reset. This
patch rename reset to val and add a comment here, making the code
more readable.

From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 kernel/sched/cpuacct.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index dd7cbb5..9c2bbf7 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -145,13 +145,16 @@ static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft)
 }
 
 static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft,
-			  u64 reset)
+			  u64 val)
 {
 	struct cpuacct *ca = css_ca(css);
 	int err = 0;
 	int i;
 
-	if (reset) {
+	/*
+	 * Only allow '0' here to do a reset.
+	 */
+	if (val) {
 		err = -EINVAL;
 		goto out;
 	}
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/3] Some small restruct for cpuacct
  2016-03-17  4:19 [PATCH v3 0/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
  2016-03-17  4:19 ` [PATCH v3 1/3] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei
@ 2016-03-17  4:19 ` Zhao Lei
  2016-03-21 11:17   ` [tip:sched/urgent] sched/cpuacct: Simplify the cpuacct code tip-bot for Zhao Lei
  2016-03-17  4:19 ` [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
  2 siblings, 1 reply; 8+ messages in thread
From: Zhao Lei @ 2016-03-17  4:19 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Zhao Lei

1: Use for() instead of while() loop in some functions
   to make code simple.
2: Use this_cpu_ptr() instead of per_cpu_ptr() to make code
   clean with little performance up.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 kernel/sched/cpuacct.c | 28 +++++-----------------------
 kernel/sched/cpuacct.h |  4 ++--
 2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 9c2bbf7..434c2fa 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -238,23 +238,10 @@ static struct cftype files[] = {
 void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 {
 	struct cpuacct *ca;
-	int cpu;
-
-	cpu = task_cpu(tsk);
 
 	rcu_read_lock();
-
-	ca = task_ca(tsk);
-
-	while (true) {
-		u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
-		*cpuusage += cputime;
-
-		ca = parent_ca(ca);
-		if (!ca)
-			break;
-	}
-
+	for (ca = task_ca(tsk); ca; ca = parent_ca(ca))
+		*this_cpu_ptr(ca->cpuusage) += cputime;
 	rcu_read_unlock();
 }
 
@@ -263,18 +250,13 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
  *
  * Note: it's the caller that updates the account of the root cgroup.
  */
-void cpuacct_account_field(struct task_struct *p, int index, u64 val)
+void cpuacct_account_field(struct task_struct *tsk, int index, u64 val)
 {
-	struct kernel_cpustat *kcpustat;
 	struct cpuacct *ca;
 
 	rcu_read_lock();
-	ca = task_ca(p);
-	while (ca != &root_cpuacct) {
-		kcpustat = this_cpu_ptr(ca->cpustat);
-		kcpustat->cpustat[index] += val;
-		ca = parent_ca(ca);
-	}
+	for (ca = task_ca(tsk); ca != &root_cpuacct; ca = parent_ca(ca))
+		this_cpu_ptr(ca->cpustat)->cpustat[index] += val;
 	rcu_read_unlock();
 }
 
diff --git a/kernel/sched/cpuacct.h b/kernel/sched/cpuacct.h
index ed60562..ba72807 100644
--- a/kernel/sched/cpuacct.h
+++ b/kernel/sched/cpuacct.h
@@ -1,7 +1,7 @@
 #ifdef CONFIG_CGROUP_CPUACCT
 
 extern void cpuacct_charge(struct task_struct *tsk, u64 cputime);
-extern void cpuacct_account_field(struct task_struct *p, int index, u64 val);
+extern void cpuacct_account_field(struct task_struct *tsk, int index, u64 val);
 
 #else
 
@@ -10,7 +10,7 @@ static inline void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 }
 
 static inline void
-cpuacct_account_field(struct task_struct *p, int index, u64 val)
+cpuacct_account_field(struct task_struct *tsk, int index, u64 val)
 {
 }
 
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage
  2016-03-17  4:19 [PATCH v3 0/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
  2016-03-17  4:19 ` [PATCH v3 1/3] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei
  2016-03-17  4:19 ` [PATCH v3 2/3] Some small restruct for cpuacct Zhao Lei
@ 2016-03-17  4:19 ` Zhao Lei
  2016-03-17  8:40   ` Peter Zijlstra
  2 siblings, 1 reply; 8+ messages in thread
From: Zhao Lei @ 2016-03-17  4:19 UTC (permalink / raw)
  To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Yang Dongsheng, Zhao Lei

From: Yang Dongsheng <yangds.fnst@cn.fujitsu.com>

Sometimes, cpuacct.usage is not detialed enough to user
to see how much usage a group used. We want to know how
much time it used in user mode and how much in kernel mode.

This patch introduce some more files to tell user these information.
 # ls /sys/fs/cgroup/cpuacct/cpuacct.usage*
 /sys/fs/cgroup/cpuacct/cpuacct.usage
 /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu
 /sys/fs/cgroup/cpuacct/cpuacct.usage_user
 /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_user
 /sys/fs/cgroup/cpuacct/cpuacct.usage_sys
 /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_sys

From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
---
 kernel/sched/cpuacct.c | 140 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 113 insertions(+), 27 deletions(-)

diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 434c2fa..9075b9d 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -25,11 +25,22 @@ enum cpuacct_stat_index {
 	CPUACCT_STAT_NSTATS,
 };
 
+enum cpuacct_usage_index {
+	CPUACCT_USAGE_USER,	/* ... user mode */
+	CPUACCT_USAGE_SYSTEM,	/* ... kernel mode */
+
+	CPUACCT_USAGE_NRUSAGE,
+};
+
+struct cpuacct_usage {
+	u64	usages[CPUACCT_USAGE_NRUSAGE];
+};
+
 /* track cpu usage of a group of tasks and its child groups */
 struct cpuacct {
 	struct cgroup_subsys_state css;
 	/* cpuusage holds pointer to a u64-type object on every cpu */
-	u64 __percpu *cpuusage;
+	struct cpuacct_usage __percpu *cpuusage;
 	struct kernel_cpustat __percpu *cpustat;
 };
 
@@ -49,7 +60,7 @@ static inline struct cpuacct *parent_ca(struct cpuacct *ca)
 	return css_ca(ca->css.parent);
 }
 
-static DEFINE_PER_CPU(u64, root_cpuacct_cpuusage);
+static DEFINE_PER_CPU(struct cpuacct_usage, root_cpuacct_cpuusage);
 static struct cpuacct root_cpuacct = {
 	.cpustat	= &kernel_cpustat,
 	.cpuusage	= &root_cpuacct_cpuusage,
@@ -68,7 +79,7 @@ cpuacct_css_alloc(struct cgroup_subsys_state *parent_css)
 	if (!ca)
 		goto out;
 
-	ca->cpuusage = alloc_percpu(u64);
+	ca->cpuusage = alloc_percpu(struct cpuacct_usage);
 	if (!ca->cpuusage)
 		goto out_free_ca;
 
@@ -96,20 +107,37 @@ static void cpuacct_css_free(struct cgroup_subsys_state *css)
 	kfree(ca);
 }
 
-static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu)
+static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu,
+				 enum cpuacct_usage_index index)
 {
-	u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
+	struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
 	u64 data;
 
+	/*
+	 * We allow index == CPUACCT_USAGE_NRUSAGE here to read
+	 * the sum of suages.
+	 */
+	BUG_ON(index > CPUACCT_USAGE_NRUSAGE);
+
 #ifndef CONFIG_64BIT
 	/*
 	 * Take rq->lock to make 64-bit read safe on 32-bit platforms.
 	 */
 	raw_spin_lock_irq(&cpu_rq(cpu)->lock);
-	data = *cpuusage;
+#endif
+
+	if (index == CPUACCT_USAGE_NRUSAGE) {
+		int i = 0;
+
+		data = 0;
+		for (i = 0; i < CPUACCT_USAGE_NRUSAGE; i++)
+			data += cpuusage->usages[i];
+	} else {
+		data = cpuusage->usages[index];
+	}
+
+#ifndef CONFIG_64BIT
 	raw_spin_unlock_irq(&cpu_rq(cpu)->lock);
-#else
-	data = *cpuusage;
 #endif
 
 	return data;
@@ -117,69 +145,103 @@ static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu)
 
 static void cpuacct_cpuusage_write(struct cpuacct *ca, int cpu, u64 val)
 {
-	u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
+	struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
+	int i;
 
 #ifndef CONFIG_64BIT
 	/*
 	 * Take rq->lock to make 64-bit write safe on 32-bit platforms.
 	 */
 	raw_spin_lock_irq(&cpu_rq(cpu)->lock);
-	*cpuusage = val;
+#endif
+
+	for (i = 0; i < CPUACCT_USAGE_NRUSAGE; i++)
+		cpuusage->usages[i] = val;
+
+#ifndef CONFIG_64BIT
 	raw_spin_unlock_irq(&cpu_rq(cpu)->lock);
-#else
-	*cpuusage = val;
 #endif
 }
 
 /* return total cpu usage (in nanoseconds) of a group */
-static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft)
+static u64 __cpuusage_read(struct cgroup_subsys_state *css,
+			   enum cpuacct_usage_index index)
 {
 	struct cpuacct *ca = css_ca(css);
 	u64 totalcpuusage = 0;
 	int i;
 
 	for_each_present_cpu(i)
-		totalcpuusage += cpuacct_cpuusage_read(ca, i);
+		totalcpuusage += cpuacct_cpuusage_read(ca, i, index);
 
 	return totalcpuusage;
 }
 
+static u64 cpuusage_user_read(struct cgroup_subsys_state *css,
+			      struct cftype *cft)
+{
+	return __cpuusage_read(css, CPUACCT_USAGE_USER);
+}
+
+static u64 cpuusage_sys_read(struct cgroup_subsys_state *css,
+			     struct cftype *cft)
+{
+	return __cpuusage_read(css, CPUACCT_USAGE_SYSTEM);
+}
+
+static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft)
+{
+	return __cpuusage_read(css, CPUACCT_USAGE_NRUSAGE);
+}
+
 static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft,
 			  u64 val)
 {
 	struct cpuacct *ca = css_ca(css);
-	int err = 0;
-	int i;
+	int cpu;
 
 	/*
 	 * Only allow '0' here to do a reset.
 	 */
-	if (val) {
-		err = -EINVAL;
-		goto out;
-	}
+	if (val)
+		return -EINVAL;
 
-	for_each_present_cpu(i)
-		cpuacct_cpuusage_write(ca, i, 0);
+	for_each_present_cpu(cpu)
+		cpuacct_cpuusage_write(ca, cpu, 0);
 
-out:
-	return err;
+	return 0;
 }
 
-static int cpuacct_percpu_seq_show(struct seq_file *m, void *V)
+static int __cpuacct_percpu_seq_show(struct seq_file *m,
+				     enum cpuacct_usage_index index)
 {
 	struct cpuacct *ca = css_ca(seq_css(m));
 	u64 percpu;
 	int i;
 
 	for_each_present_cpu(i) {
-		percpu = cpuacct_cpuusage_read(ca, i);
+		percpu = cpuacct_cpuusage_read(ca, i, index);
 		seq_printf(m, "%llu ", (unsigned long long) percpu);
 	}
 	seq_printf(m, "\n");
 	return 0;
 }
 
+static int cpuacct_percpu_user_seq_show(struct seq_file *m, void *V)
+{
+	return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_USER);
+}
+
+static int cpuacct_percpu_sys_seq_show(struct seq_file *m, void *V)
+{
+	return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_SYSTEM);
+}
+
+static int cpuacct_percpu_seq_show(struct seq_file *m, void *V)
+{
+	return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_NRUSAGE);
+}
+
 static const char * const cpuacct_stat_desc[] = {
 	[CPUACCT_STAT_USER] = "user",
 	[CPUACCT_STAT_SYSTEM] = "system",
@@ -220,10 +282,26 @@ static struct cftype files[] = {
 		.write_u64 = cpuusage_write,
 	},
 	{
+		.name = "usage_user",
+		.read_u64 = cpuusage_user_read,
+	},
+	{
+		.name = "usage_sys",
+		.read_u64 = cpuusage_sys_read,
+	},
+	{
 		.name = "usage_percpu",
 		.seq_show = cpuacct_percpu_seq_show,
 	},
 	{
+		.name = "usage_percpu_user",
+		.seq_show = cpuacct_percpu_user_seq_show,
+	},
+	{
+		.name = "usage_percpu_sys",
+		.seq_show = cpuacct_percpu_sys_seq_show,
+	},
+	{
 		.name = "stat",
 		.seq_show = cpuacct_stats_show,
 	},
@@ -238,10 +316,18 @@ static struct cftype files[] = {
 void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 {
 	struct cpuacct *ca;
+	int index;
+
+	if (user_mode(task_pt_regs(tsk)))
+		index = CPUACCT_USAGE_USER;
+	else
+		index = CPUACCT_USAGE_SYSTEM;
 
 	rcu_read_lock();
+
 	for (ca = task_ca(tsk); ca; ca = parent_ca(ca))
-		*this_cpu_ptr(ca->cpuusage) += cputime;
+		this_cpu_ptr(ca->cpuusage)->usages[index] += cputime;
+
 	rcu_read_unlock();
 }
 
-- 
1.8.5.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage
  2016-03-17  4:19 ` [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
@ 2016-03-17  8:40   ` Peter Zijlstra
  2016-03-17  9:42     ` Zhao Lei
  0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2016-03-17  8:40 UTC (permalink / raw)
  To: Zhao Lei; +Cc: linux-kernel, Tejun Heo, Yang Dongsheng

On Thu, Mar 17, 2016 at 12:19:44PM +0800, Zhao Lei wrote:
> +static u64 __cpuusage_read(struct cgroup_subsys_state *css,
> +			   enum cpuacct_usage_index index)
>  {
>  	struct cpuacct *ca = css_ca(css);
>  	u64 totalcpuusage = 0;
>  	int i;
>  
>  	for_each_present_cpu(i)
> +		totalcpuusage += cpuacct_cpuusage_read(ca, i, index);
>  
>  	return totalcpuusage;
>  }

Ok, so while looking over this, it mostly uses for_each_present_cpu(),
which is already dubious, but then cpuacct_stats_show() uses
for_each_online_cpu().

Why is this? Why not always for_each_possible_cpu()?

Surely, if you offline a cpu, you still want its stat to be included in
your totals, otherwise your numbers will go backwards when you take a
cpu offline.

Could you double check the logic and maybe fix?

Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage
  2016-03-17  8:40   ` Peter Zijlstra
@ 2016-03-17  9:42     ` Zhao Lei
  2016-03-17 10:12       ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Zhao Lei @ 2016-03-17  9:42 UTC (permalink / raw)
  To: 'Peter Zijlstra'
  Cc: linux-kernel, 'Tejun Heo', 'Yang Dongsheng'

Hi, Peter Zijlstra

> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@infradead.org]
> Sent: Thursday, March 17, 2016 4:40 PM
> To: Zhao Lei <zhaolei@cn.fujitsu.com>
> Cc: linux-kernel@vger.kernel.org; Tejun Heo <htejun@gmail.com>; Yang
> Dongsheng <yangds.fnst@cn.fujitsu.com>
> Subject: Re: [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage
> 
> On Thu, Mar 17, 2016 at 12:19:44PM +0800, Zhao Lei wrote:
> > +static u64 __cpuusage_read(struct cgroup_subsys_state *css,
> > +			   enum cpuacct_usage_index index)
> >  {
> >  	struct cpuacct *ca = css_ca(css);
> >  	u64 totalcpuusage = 0;
> >  	int i;
> >
> >  	for_each_present_cpu(i)
> > +		totalcpuusage += cpuacct_cpuusage_read(ca, i, index);
> >
> >  	return totalcpuusage;
> >  }
> 
> Ok, so while looking over this, it mostly uses for_each_present_cpu(),
> which is already dubious, but then cpuacct_stats_show() uses
> for_each_online_cpu().
> 
> Why is this? Why not always for_each_possible_cpu()?
> 
> Surely, if you offline a cpu, you still want its stat to be included in
> your totals, otherwise your numbers will go backwards when you take a
> cpu offline.
> 
I agree with you that for_each_possible_cpu() is best choice for above
code.

In corrent code,

1: cpuacct.usage_percpu only include present_cpus
  If a cpu is hotplug out, it is not exist in above file.
2: cpuacct.usage only calculate present_cpus
  If a cpu is hotplug out, this value maybe decreased.
3: cpuacct.stat only calculate online cpus
  Obviously wrong.

Above 3 is easy to fix, but better to fix above 1 and 2 together,
in one word, to make ALL statics counts possible_cpu.

The problem is file output, currently,
 # cat cpuacct.usage_percpu
 689076136 1131883300
If we turn to use possible_cpu, above line will have
256 valuse, as:
# cat cpuacct.usage_percpu
 689076136 1131883300 0 0 0 0 0 0 0 0 ...

Or we can only show present_cpu and non_present_cpu with !0 value,
and we need also need output a cpuindex, as:
# cat cpuacct.usage_percpu
  [0] 689076136
  [1] 1131883300
  [3] 11111111
  [50] 22222222
#

It will tell user more accurate information,
but both solution will change current cgroup interface.

So I suggest keeping current using of for_each_present_cpu,
and only modify for_each_online_cpu.

What is your opinion?

Thanks
Zhaolei

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage
  2016-03-17  9:42     ` Zhao Lei
@ 2016-03-17 10:12       ` Peter Zijlstra
  0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2016-03-17 10:12 UTC (permalink / raw)
  To: Zhao Lei; +Cc: linux-kernel, 'Tejun Heo', 'Yang Dongsheng'

On Thu, Mar 17, 2016 at 05:42:48PM +0800, Zhao Lei wrote:

> > Ok, so while looking over this, it mostly uses for_each_present_cpu(),
> > which is already dubious, but then cpuacct_stats_show() uses
> > for_each_online_cpu().
> > 
> > Why is this? Why not always for_each_possible_cpu()?
> > 
> > Surely, if you offline a cpu, you still want its stat to be included in
> > your totals, otherwise your numbers will go backwards when you take a
> > cpu offline.
> > 
> I agree with you that for_each_possible_cpu() is best choice for above
> code.
> 
> In corrent code,
> 
> 1: cpuacct.usage_percpu only include present_cpus
>   If a cpu is hotplug out, it is not exist in above file.
> 2: cpuacct.usage only calculate present_cpus
>   If a cpu is hotplug out, this value maybe decreased.
> 3: cpuacct.stat only calculate online cpus
>   Obviously wrong.
> 
> Above 3 is easy to fix, but better to fix above 1 and 2 together,
> in one word, to make ALL statics counts possible_cpu.
> 
> The problem is file output, currently,
>  # cat cpuacct.usage_percpu
>  689076136 1131883300
> If we turn to use possible_cpu, above line will have
> 256 valuse, as:

Only if you're in a VM or something daft like that. Real hardware will
limit possible to the number of CPUs that are actually possible (usually
the same as present, but physical hotplug and an empty socket can see
possible > present).

And if VM people are annoyed by the silly amount of CPUs, they can limit
the number with the possible_cpus= boot parameter.

> # cat cpuacct.usage_percpu
>  689076136 1131883300 0 0 0 0 0 0 0 0 ...

Such is life, if you actually had 256 CPUs (some people do) you'd get
the same output.

> Or we can only show present_cpu and non_present_cpu with !0 value,
> and we need also need output a cpuindex, as:

> # cat cpuacct.usage_percpu
>   [0] 689076136
>   [1] 1131883300
>   [3] 11111111
>   [50] 22222222

Can't do that, it changes the ABI.

> So I suggest keeping current using of for_each_present_cpu,
> and only modify for_each_online_cpu.
> 
> What is your opinion?

I think they should all be possible, present too can loose bits with
hotplug.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [tip:sched/urgent] sched/cpuacct: Simplify the cpuacct code
  2016-03-17  4:19 ` [PATCH v3 2/3] Some small restruct for cpuacct Zhao Lei
@ 2016-03-21 11:17   ` tip-bot for Zhao Lei
  0 siblings, 0 replies; 8+ messages in thread
From: tip-bot for Zhao Lei @ 2016-03-21 11:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: htejun, hpa, tglx, linux-kernel, mingo, torvalds, peterz, zhaolei

Commit-ID:  73e6aafd9ea81498d31361f01db84a0118da2d1c
Gitweb:     http://git.kernel.org/tip/73e6aafd9ea81498d31361f01db84a0118da2d1c
Author:     Zhao Lei <zhaolei@cn.fujitsu.com>
AuthorDate: Thu, 17 Mar 2016 12:19:43 +0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 21 Mar 2016 11:00:28 +0100

sched/cpuacct: Simplify the cpuacct code

 - Use for() instead of while() loop in some functions
   to make the code simpler.

 - Use this_cpu_ptr() instead of per_cpu_ptr() to make the code
   cleaner and a bit faster.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/d8a7ef9592f55224630cb26dea239f05b6398a4e.1458187654.git.zhaolei@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/cpuacct.c | 28 +++++-----------------------
 kernel/sched/cpuacct.h |  4 ++--
 2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 9c2bbf7..434c2fa 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -238,23 +238,10 @@ static struct cftype files[] = {
 void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 {
 	struct cpuacct *ca;
-	int cpu;
-
-	cpu = task_cpu(tsk);
 
 	rcu_read_lock();
-
-	ca = task_ca(tsk);
-
-	while (true) {
-		u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
-		*cpuusage += cputime;
-
-		ca = parent_ca(ca);
-		if (!ca)
-			break;
-	}
-
+	for (ca = task_ca(tsk); ca; ca = parent_ca(ca))
+		*this_cpu_ptr(ca->cpuusage) += cputime;
 	rcu_read_unlock();
 }
 
@@ -263,18 +250,13 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
  *
  * Note: it's the caller that updates the account of the root cgroup.
  */
-void cpuacct_account_field(struct task_struct *p, int index, u64 val)
+void cpuacct_account_field(struct task_struct *tsk, int index, u64 val)
 {
-	struct kernel_cpustat *kcpustat;
 	struct cpuacct *ca;
 
 	rcu_read_lock();
-	ca = task_ca(p);
-	while (ca != &root_cpuacct) {
-		kcpustat = this_cpu_ptr(ca->cpustat);
-		kcpustat->cpustat[index] += val;
-		ca = parent_ca(ca);
-	}
+	for (ca = task_ca(tsk); ca != &root_cpuacct; ca = parent_ca(ca))
+		this_cpu_ptr(ca->cpustat)->cpustat[index] += val;
 	rcu_read_unlock();
 }
 
diff --git a/kernel/sched/cpuacct.h b/kernel/sched/cpuacct.h
index ed60562..ba72807 100644
--- a/kernel/sched/cpuacct.h
+++ b/kernel/sched/cpuacct.h
@@ -1,7 +1,7 @@
 #ifdef CONFIG_CGROUP_CPUACCT
 
 extern void cpuacct_charge(struct task_struct *tsk, u64 cputime);
-extern void cpuacct_account_field(struct task_struct *p, int index, u64 val);
+extern void cpuacct_account_field(struct task_struct *tsk, int index, u64 val);
 
 #else
 
@@ -10,7 +10,7 @@ static inline void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 }
 
 static inline void
-cpuacct_account_field(struct task_struct *p, int index, u64 val)
+cpuacct_account_field(struct task_struct *tsk, int index, u64 val)
 {
 }
 

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-03-21 11:18 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-03-17  4:19 [PATCH v3 0/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
2016-03-17  4:19 ` [PATCH v3 1/3] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei
2016-03-17  4:19 ` [PATCH v3 2/3] Some small restruct for cpuacct Zhao Lei
2016-03-21 11:17   ` [tip:sched/urgent] sched/cpuacct: Simplify the cpuacct code tip-bot for Zhao Lei
2016-03-17  4:19 ` [PATCH v3 3/3] cpuacct: split usage into user_usage and sys_usage Zhao Lei
2016-03-17  8:40   ` Peter Zijlstra
2016-03-17  9:42     ` Zhao Lei
2016-03-17 10:12       ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox