linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/2] cpustat: use atomic operations to read/update stats
@ 2013-02-22  5:56 Kevin Hilman
  2013-02-22  5:56 ` [PATCH 1/2] cpustat: use accessor functions for get/set/add Kevin Hilman
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Kevin Hilman @ 2013-02-22  5:56 UTC (permalink / raw)
  To: linux-arm-kernel

On 64-bit platforms, reads/writes of the various cpustat fields are
atomic due to native 64-bit loads/stores.  However, on non-64-bit
platforms, reads/writes of the cpustat fields are not atomic and can
lead to inconsistent statistics.

This problem was originally reported by Frederic Weisbecker as a
64-bit limitation with the nsec-granularity cputime accounting for
full dynticks, but we then realized that this problem has been
around for a while and is not specific to the new cputime accounting.

This series fixes this by first converting all access to the cputime
fields to use accessor functions, and then converting the accessor
functions to use the atomic64 functions.

Implemented based on an idea proposed by Frederic Weisbecker.
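
In short, the shape of the conversion (taken from the patches below) is:

	/* before: open-coded access to the per-cpu array */
	user += kcpustat_cpu(i).cpustat[CPUTIME_USER];

	/* after patch 1: same access behind an accessor macro */
	user += kcpustat_cpu_get(i, CPUTIME_USER);

	/* after patch 2, on 32-bit only: the accessor becomes an atomic64 read */
	#define kcpustat_cpu_get(cpu, i) \
		atomic64_read(&kcpustat_cpu(cpu)._cpustat[i])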

Kevin Hilman (2):
  cpustat: use accessor functions for get/set/add
  cpustat: convert to atomic operations

 arch/s390/appldata/appldata_os.c   | 16 +++++++--------
 drivers/cpufreq/cpufreq_governor.c | 18 ++++++++---------
 drivers/cpufreq/cpufreq_ondemand.c |  2 +-
 drivers/macintosh/rack-meter.c     |  6 +++---
 fs/proc/stat.c                     | 40 +++++++++++++++++++-------------------
 fs/proc/uptime.c                   |  2 +-
 include/linux/kernel_stat.h        | 11 ++++++++++-
 kernel/sched/core.c                | 12 +++++-------
 kernel/sched/cputime.c             | 29 +++++++++++++--------------
 9 files changed, 70 insertions(+), 66 deletions(-)

-- 
1.8.1.2


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22  5:56 [PATCH 0/2] cpustat: use atomic operations to read/update stats Kevin Hilman
@ 2013-02-22  5:56 ` Kevin Hilman
  2013-02-22  6:21   ` Viresh Kumar
  2013-02-22 13:38   ` Frederic Weisbecker
  2013-02-22  5:56 ` [PATCH 2/2] cpustat: convert to atomic operations Kevin Hilman
  2013-02-22  8:46 ` [PATCH 0/2] cpustat: use atomic operations to read/update stats Peter Zijlstra
  2 siblings, 2 replies; 25+ messages in thread
From: Kevin Hilman @ 2013-02-22  5:56 UTC (permalink / raw)
  To: linux-arm-kernel

Add some accessor functions in order to facilitate the conversion to
atomic reads/writes of cpustat values.

Signed-off-by: Kevin Hilman <khilman@linaro.org>
---
 arch/s390/appldata/appldata_os.c   | 16 +++++++--------
 drivers/cpufreq/cpufreq_governor.c | 18 ++++++++---------
 drivers/cpufreq/cpufreq_ondemand.c |  2 +-
 drivers/macintosh/rack-meter.c     |  6 +++---
 fs/proc/stat.c                     | 40 +++++++++++++++++++-------------------
 fs/proc/uptime.c                   |  2 +-
 include/linux/kernel_stat.h        |  7 ++++++-
 kernel/sched/core.c                | 12 +++++-------
 kernel/sched/cputime.c             | 29 +++++++++++++--------------
 9 files changed, 66 insertions(+), 66 deletions(-)

diff --git a/arch/s390/appldata/appldata_os.c b/arch/s390/appldata/appldata_os.c
index 87521ba..eff76f8 100644
--- a/arch/s390/appldata/appldata_os.c
+++ b/arch/s390/appldata/appldata_os.c
@@ -113,21 +113,21 @@ static void appldata_get_os_data(void *data)
 	j = 0;
 	for_each_online_cpu(i) {
 		os_data->os_cpu[j].per_cpu_user =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_USER]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_USER));
 		os_data->os_cpu[j].per_cpu_nice =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_NICE]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_NICE));
 		os_data->os_cpu[j].per_cpu_system =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_SYSTEM));
 		os_data->os_cpu[j].per_cpu_idle =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_IDLE]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_IDLE));
 		os_data->os_cpu[j].per_cpu_irq =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_IRQ]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_IRQ));
 		os_data->os_cpu[j].per_cpu_softirq =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_SOFTIRQ]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_SOFTIRQ));
 		os_data->os_cpu[j].per_cpu_iowait =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_IOWAIT]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_IOWAIT));
 		os_data->os_cpu[j].per_cpu_steal =
-			cputime_to_jiffies(kcpustat_cpu(i).cpustat[CPUTIME_STEAL]);
+			cputime_to_jiffies(kcpustat_cpu_get(i, CPUTIME_STEAL));
 		os_data->os_cpu[j].cpu_id = i;
 		j++;
 	}
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 6c5f1d3..ec6c315 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -36,12 +36,12 @@ static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, u64 *wall)
 
 	cur_wall_time = jiffies64_to_cputime64(get_jiffies_64());
 
-	busy_time = kcpustat_cpu(cpu).cpustat[CPUTIME_USER];
-	busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];
-	busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];
-	busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];
-	busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];
-	busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
+	busy_time = kcpustat_cpu_get(cpu, CPUTIME_USER);
+	busy_time += kcpustat_cpu_get(cpu, CPUTIME_SYSTEM);
+	busy_time += kcpustat_cpu_get(cpu, CPUTIME_IRQ);
+	busy_time += kcpustat_cpu_get(cpu, CPUTIME_SOFTIRQ);
+	busy_time += kcpustat_cpu_get(cpu, CPUTIME_STEAL);
+	busy_time += kcpustat_cpu_get(cpu, CPUTIME_NICE);
 
 	idle_time = cur_wall_time - busy_time;
 	if (wall)
@@ -103,7 +103,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 			u64 cur_nice;
 			unsigned long cur_nice_jiffies;
 
-			cur_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE] -
+			cur_nice = kcpustat_cpu_get(j, CPUTIME_NICE) -
 					 cdbs->prev_cpu_nice;
 			/*
 			 * Assumption: nice time between sampling periods will
@@ -113,7 +113,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
 					cputime64_to_jiffies64(cur_nice);
 
 			cdbs->prev_cpu_nice =
-				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+				kcpustat_cpu_get(j, CPUTIME_NICE);
 			idle_time += jiffies_to_usecs(cur_nice_jiffies);
 		}
 
@@ -216,7 +216,7 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
 					&j_cdbs->prev_cpu_wall);
 			if (ignore_nice)
 				j_cdbs->prev_cpu_nice =
-					kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+					kcpustat_cpu_get(j, CPUTIME_NICE);
 		}
 
 		/*
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 7731f7c..ac5d49f 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -403,7 +403,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
 						&dbs_info->cdbs.prev_cpu_wall);
 		if (od_tuners.ignore_nice)
 			dbs_info->cdbs.prev_cpu_nice =
-				kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+				kcpustat_cpu_get(j, CPUTIME_NICE);
 
 	}
 	return count;
diff --git a/drivers/macintosh/rack-meter.c b/drivers/macintosh/rack-meter.c
index cad0e19..e799f3c 100644
--- a/drivers/macintosh/rack-meter.c
+++ b/drivers/macintosh/rack-meter.c
@@ -83,11 +83,11 @@ static inline cputime64_t get_cpu_idle_time(unsigned int cpu)
 {
 	u64 retval;
 
-	retval = kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE] +
-		 kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT];
+	retval = kcpustat_cpu_get(cpu, CPUTIME_IDLE) +
+		 kcpustat_cpu_get(cpu, CPUTIME_IOWAIT);
 
 	if (rackmeter_ignore_nice)
-		retval += kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
+		retval += kcpustat_cpu_get(cpu, CPUTIME_NICE);
 
 	return retval;
 }
diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index e296572..6a24276 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -25,7 +25,7 @@ static cputime64_t get_idle_time(int cpu)
 {
 	cputime64_t idle;
 
-	idle = kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE];
+	idle = kcpustat_cpu_get(cpu, CPUTIME_IDLE);
 	if (cpu_online(cpu) && !nr_iowait_cpu(cpu))
 		idle += arch_idle_time(cpu);
 	return idle;
@@ -35,7 +35,7 @@ static cputime64_t get_iowait_time(int cpu)
 {
 	cputime64_t iowait;
 
-	iowait = kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT];
+	iowait = kcpustat_cpu_get(cpu, CPUTIME_IOWAIT);
 	if (cpu_online(cpu) && nr_iowait_cpu(cpu))
 		iowait += arch_idle_time(cpu);
 	return iowait;
@@ -52,7 +52,7 @@ static u64 get_idle_time(int cpu)
 
 	if (idle_time == -1ULL)
 		/* !NO_HZ or cpu offline so we can rely on cpustat.idle */
-		idle = kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE];
+		idle = kcpustat_cpu_get(cpu, CPUTIME_IDLE);
 	else
 		idle = usecs_to_cputime64(idle_time);
 
@@ -68,7 +68,7 @@ static u64 get_iowait_time(int cpu)
 
 	if (iowait_time == -1ULL)
 		/* !NO_HZ or cpu offline so we can rely on cpustat.iowait */
-		iowait = kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT];
+		iowait = kcpustat_cpu_get(cpu, CPUTIME_IOWAIT);
 	else
 		iowait = usecs_to_cputime64(iowait_time);
 
@@ -95,16 +95,16 @@ static int show_stat(struct seq_file *p, void *v)
 	jif = boottime.tv_sec;
 
 	for_each_possible_cpu(i) {
-		user += kcpustat_cpu(i).cpustat[CPUTIME_USER];
-		nice += kcpustat_cpu(i).cpustat[CPUTIME_NICE];
-		system += kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
+		user += kcpustat_cpu_get(i, CPUTIME_USER);
+		nice += kcpustat_cpu_get(i, CPUTIME_NICE);
+		system += kcpustat_cpu_get(i, CPUTIME_SYSTEM);
 		idle += get_idle_time(i);
 		iowait += get_iowait_time(i);
-		irq += kcpustat_cpu(i).cpustat[CPUTIME_IRQ];
-		softirq += kcpustat_cpu(i).cpustat[CPUTIME_SOFTIRQ];
-		steal += kcpustat_cpu(i).cpustat[CPUTIME_STEAL];
-		guest += kcpustat_cpu(i).cpustat[CPUTIME_GUEST];
-		guest_nice += kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE];
+		irq += kcpustat_cpu_get(i, CPUTIME_IRQ);
+		softirq += kcpustat_cpu_get(i, CPUTIME_SOFTIRQ);
+		steal += kcpustat_cpu_get(i, CPUTIME_STEAL);
+		guest += kcpustat_cpu_get(i, CPUTIME_GUEST);
+		guest_nice += kcpustat_cpu_get(i, CPUTIME_GUEST_NICE);
 		sum += kstat_cpu_irqs_sum(i);
 		sum += arch_irq_stat_cpu(i);
 
@@ -132,16 +132,16 @@ static int show_stat(struct seq_file *p, void *v)
 
 	for_each_online_cpu(i) {
 		/* Copy values here to work around gcc-2.95.3, gcc-2.96 */
-		user = kcpustat_cpu(i).cpustat[CPUTIME_USER];
-		nice = kcpustat_cpu(i).cpustat[CPUTIME_NICE];
-		system = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
+		user = kcpustat_cpu_get(i, CPUTIME_USER);
+		nice = kcpustat_cpu_get(i, CPUTIME_NICE);
+		system = kcpustat_cpu_get(i, CPUTIME_SYSTEM);
 		idle = get_idle_time(i);
 		iowait = get_iowait_time(i);
-		irq = kcpustat_cpu(i).cpustat[CPUTIME_IRQ];
-		softirq = kcpustat_cpu(i).cpustat[CPUTIME_SOFTIRQ];
-		steal = kcpustat_cpu(i).cpustat[CPUTIME_STEAL];
-		guest = kcpustat_cpu(i).cpustat[CPUTIME_GUEST];
-		guest_nice = kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE];
+		irq = kcpustat_cpu_get(i, CPUTIME_IRQ);
+		softirq = kcpustat_cpu_get(i, CPUTIME_SOFTIRQ);
+		steal = kcpustat_cpu_get(i, CPUTIME_STEAL);
+		guest = kcpustat_cpu_get(i, CPUTIME_GUEST);
+		guest_nice = kcpustat_cpu_get(i, CPUTIME_GUEST_NICE);
 		seq_printf(p, "cpu%d", i);
 		seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(user));
 		seq_put_decimal_ull(p, ' ', cputime64_to_clock_t(nice));
diff --git a/fs/proc/uptime.c b/fs/proc/uptime.c
index 9610ac7..fd9c93d 100644
--- a/fs/proc/uptime.c
+++ b/fs/proc/uptime.c
@@ -18,7 +18,7 @@ static int uptime_proc_show(struct seq_file *m, void *v)
 
 	idletime = 0;
 	for_each_possible_cpu(i)
-		idletime += (__force u64) kcpustat_cpu(i).cpustat[CPUTIME_IDLE];
+		idletime += (__force u64) kcpustat_cpu_get(i, CPUTIME_IDLE);
 
 	do_posix_clock_monotonic_gettime(&uptime);
 	monotonic_to_bootbased(&uptime);
diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 66b7078..df8ad75 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -32,7 +32,7 @@ enum cpu_usage_stat {
 };
 
 struct kernel_cpustat {
-	u64 cpustat[NR_STATS];
+	u64 _cpustat[NR_STATS];
 };
 
 struct kernel_stat {
@@ -51,6 +51,11 @@ DECLARE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
 #define kcpustat_this_cpu (&__get_cpu_var(kernel_cpustat))
 #define kstat_cpu(cpu) per_cpu(kstat, cpu)
 #define kcpustat_cpu(cpu) per_cpu(kernel_cpustat, cpu)
+#define kcpustat_cpu_get(cpu, i) (kcpustat_cpu(cpu)._cpustat[i])
+#define kcpustat_cpu_set(cpu, i, val) (kcpustat_cpu(cpu)._cpustat[i] = (val))
+#define kcpustat_cpu_add(cpu, i, val) (kcpustat_cpu(cpu)._cpustat[i] += (val))
+#define kcpustat_this_cpu_set(i, val) (kcpustat_this_cpu->_cpustat[i] = (val))
+#define kcpustat_this_cpu_add(i, val) (kcpustat_this_cpu->_cpustat[i] += (val))
 
 extern unsigned long long nr_context_switches(void);
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 26058d0..4a8234c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8082,19 +8082,17 @@ static int cpuacct_stats_show(struct cgroup *cgrp, struct cftype *cft,
 	s64 val = 0;
 
 	for_each_online_cpu(cpu) {
-		struct kernel_cpustat *kcpustat = per_cpu_ptr(ca->cpustat, cpu);
-		val += kcpustat->cpustat[CPUTIME_USER];
-		val += kcpustat->cpustat[CPUTIME_NICE];
+		val += kcpustat_cpu_get(cpu, CPUTIME_USER);
+		val += kcpustat_cpu_get(cpu, CPUTIME_NICE);
 	}
 	val = cputime64_to_clock_t(val);
 	cb->fill(cb, cpuacct_stat_desc[CPUACCT_STAT_USER], val);
 
 	val = 0;
 	for_each_online_cpu(cpu) {
-		struct kernel_cpustat *kcpustat = per_cpu_ptr(ca->cpustat, cpu);
-		val += kcpustat->cpustat[CPUTIME_SYSTEM];
-		val += kcpustat->cpustat[CPUTIME_IRQ];
-		val += kcpustat->cpustat[CPUTIME_SOFTIRQ];
+		val += kcpustat_cpu_get(cpu, CPUTIME_SYSTEM);
+		val += kcpustat_cpu_get(cpu, CPUTIME_IRQ);
+		val += kcpustat_cpu_get(cpu, CPUTIME_SOFTIRQ);
 	}
 
 	val = cputime64_to_clock_t(val);
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 293b202..a4c0594 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -124,7 +124,7 @@ static inline void task_group_account_field(struct task_struct *p, int index,
 	 * is the only cgroup, then nothing else should be necessary.
 	 *
 	 */
-	__get_cpu_var(kernel_cpustat).cpustat[index] += tmp;
+	kcpustat_this_cpu_add(index, tmp);
 
 #ifdef CONFIG_CGROUP_CPUACCT
 	if (unlikely(!cpuacct_subsys.active))
@@ -175,8 +175,6 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
 static void account_guest_time(struct task_struct *p, cputime_t cputime,
 			       cputime_t cputime_scaled)
 {
-	u64 *cpustat = kcpustat_this_cpu->cpustat;
-
 	/* Add guest time to process. */
 	p->utime += cputime;
 	p->utimescaled += cputime_scaled;
@@ -185,11 +183,12 @@ static void account_guest_time(struct task_struct *p, cputime_t cputime,
 
 	/* Add guest time to cpustat. */
 	if (TASK_NICE(p) > 0) {
-		cpustat[CPUTIME_NICE] += (__force u64) cputime;
-		cpustat[CPUTIME_GUEST_NICE] += (__force u64) cputime;
+		kcpustat_this_cpu_add(CPUTIME_NICE, (__force u64) cputime);
+		kcpustat_this_cpu_add(CPUTIME_GUEST_NICE,
+				      (__force u64) cputime);
 	} else {
-		cpustat[CPUTIME_USER] += (__force u64) cputime;
-		cpustat[CPUTIME_GUEST] += (__force u64) cputime;
+		kcpustat_this_cpu_add(CPUTIME_USER, (__force u64) cputime);
+		kcpustat_this_cpu_add(CPUTIME_GUEST, (__force u64) cputime);
 	}
 }
 
@@ -249,9 +248,7 @@ void account_system_time(struct task_struct *p, int hardirq_offset,
  */
 void account_steal_time(cputime_t cputime)
 {
-	u64 *cpustat = kcpustat_this_cpu->cpustat;
-
-	cpustat[CPUTIME_STEAL] += (__force u64) cputime;
+	kcpustat_this_cpu_add(CPUTIME_STEAL, (__force u64) cputime);
 }
 
 /*
@@ -260,13 +257,12 @@ void account_steal_time(cputime_t cputime)
  */
 void account_idle_time(cputime_t cputime)
 {
-	u64 *cpustat = kcpustat_this_cpu->cpustat;
 	struct rq *rq = this_rq();
 
 	if (atomic_read(&rq->nr_iowait) > 0)
-		cpustat[CPUTIME_IOWAIT] += (__force u64) cputime;
+		kcpustat_this_cpu_add(CPUTIME_IOWAIT, (__force u64) cputime);
 	else
-		cpustat[CPUTIME_IDLE] += (__force u64) cputime;
+		kcpustat_this_cpu_add(CPUTIME_IDLE, (__force u64) cputime);
 }
 
 static __always_inline bool steal_account_process_tick(void)
@@ -344,15 +340,16 @@ static void irqtime_account_process_tick(struct task_struct *p, int user_tick,
 						struct rq *rq)
 {
 	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
-	u64 *cpustat = kcpustat_this_cpu->cpustat;
 
 	if (steal_account_process_tick())
 		return;
 
 	if (irqtime_account_hi_update()) {
-		cpustat[CPUTIME_IRQ] += (__force u64) cputime_one_jiffy;
+		kcpustat_this_cpu_add(CPUTIME_IRQ,
+				      (__force u64) cputime_one_jiffy);
 	} else if (irqtime_account_si_update()) {
-		cpustat[CPUTIME_SOFTIRQ] += (__force u64) cputime_one_jiffy;
+		kcpustat_this_cpu_add(CPUTIME_SOFTIRQ,
+				      (__force u64) cputime_one_jiffy);
 	} else if (this_cpu_ksoftirqd() == p) {
 		/*
 		 * ksoftirqd time do not get accounted in cpu_softirq_time.
-- 
1.8.1.2

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/2] cpustat: convert to atomic operations
  2013-02-22  5:56 [PATCH 0/2] cpustat: use atomic operations to read/update stats Kevin Hilman
  2013-02-22  5:56 ` [PATCH 1/2] cpustat: use accessor functions for get/set/add Kevin Hilman
@ 2013-02-22  5:56 ` Kevin Hilman
  2013-02-22  8:46 ` [PATCH 0/2] cpustat: use atomic operations to read/update stats Peter Zijlstra
  2 siblings, 0 replies; 25+ messages in thread
From: Kevin Hilman @ 2013-02-22  5:56 UTC (permalink / raw)
  To: linux-arm-kernel

On non-64-bit platforms, convert the cpustat fields to atomic64 type
so that reads and updates of cpustats are atomic on those platforms
as well.

On 64-bit platforms, the cpustat fields are left as u64 because native
64-bit loads/stores are already atomic there, and using atomic64_add
would add the overhead of a locked read-modify-write.  We could also
have used atomic64_set(atomic64_read() + delta), but on 32-bit
platforms using the generic 64-bit ops (lib/atomic64.c), that
results in taking the lock twice.
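
To illustrate the cost difference (a sketch, assuming the generic
spinlock-backed lib/atomic64.c implementation; v and delta are
illustrative):

	/* given: atomic64_t v; s64 delta; */

	/* read-modify-write via read + set: takes the hashed lock twice */
	atomic64_set(&v, atomic64_read(&v) + delta);

	/* a single atomic64_add: takes the lock once */
	atomic64_add(delta, &v);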

Signed-off-by: Kevin Hilman <khilman@linaro.org>
---
 include/linux/kernel_stat.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index df8ad75..a433f87 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -32,7 +32,11 @@ enum cpu_usage_stat {
 };
 
 struct kernel_cpustat {
+#ifdef CONFIG_64BIT
 	u64 _cpustat[NR_STATS];
+#else
+	atomic64_t _cpustat[NR_STATS];
+#endif
 };
 
 struct kernel_stat {
@@ -51,11 +55,23 @@ DECLARE_PER_CPU(struct kernel_cpustat, kernel_cpustat);
 #define kcpustat_this_cpu (&__get_cpu_var(kernel_cpustat))
 #define kstat_cpu(cpu) per_cpu(kstat, cpu)
 #define kcpustat_cpu(cpu) per_cpu(kernel_cpustat, cpu)
+#ifdef CONFIG_64BIT
 #define kcpustat_cpu_get(cpu, i) (kcpustat_cpu(cpu)._cpustat[i])
 #define kcpustat_cpu_set(cpu, i, val) (kcpustat_cpu(cpu)._cpustat[i] = (val))
 #define kcpustat_cpu_add(cpu, i, val) (kcpustat_cpu(cpu)._cpustat[i] += (val))
 #define kcpustat_this_cpu_set(i, val) (kcpustat_this_cpu->_cpustat[i] = (val))
 #define kcpustat_this_cpu_add(i, val) (kcpustat_this_cpu->_cpustat[i] += (val))
+#else
+#define kcpustat_cpu_get(cpu, i) atomic64_read(&kcpustat_cpu(cpu)._cpustat[i])
+#define kcpustat_cpu_set(cpu, i, val) \
+	atomic64_set(&kcpustat_cpu(cpu)._cpustat[i], (val))
+#define kcpustat_cpu_add(cpu, i, val) \
+	atomic64_add(val, &kcpustat_cpu(cpu)._cpustat[i])
+#define kcpustat_this_cpu_set(i, val) \
+	atomic64_set(&kcpustat_this_cpu->_cpustat[i], (val))
+#define kcpustat_this_cpu_add(i, val) \
+	atomic64_add(val, &kcpustat_this_cpu->_cpustat[i])
+#endif
 
 extern unsigned long long nr_context_switches(void);
 
-- 
1.8.1.2


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22  5:56 ` [PATCH 1/2] cpustat: use accessor functions for get/set/add Kevin Hilman
@ 2013-02-22  6:21   ` Viresh Kumar
  2013-02-22  7:17     ` Amit Kucheria
  2013-02-22 15:14     ` Kevin Hilman
  2013-02-22 13:38   ` Frederic Weisbecker
  1 sibling, 2 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-22  6:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 22, 2013 at 11:26 AM, Kevin Hilman <khilman@linaro.org> wrote:
> Add some accessor functions in order to facilitate the conversion to
> atomic reads/writes of cpustat values.
>
> Signed-off-by: Kevin Hilman <khilman@linaro.org>
> ---
>  drivers/cpufreq/cpufreq_governor.c | 18 ++++++++---------
>  drivers/cpufreq/cpufreq_ondemand.c |  2 +-

> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 6c5f1d3..ec6c315 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -36,12 +36,12 @@ static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, u64 *wall)
>
>         cur_wall_time = jiffies64_to_cputime64(get_jiffies_64());
>
> -       busy_time = kcpustat_cpu(cpu).cpustat[CPUTIME_USER];
> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];
> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];
> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];
> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];
> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
> +       busy_time = kcpustat_cpu_get(cpu, CPUTIME_USER);
> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SYSTEM);
> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_IRQ);
> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SOFTIRQ);
> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_STEAL);
> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_NICE);
>
>         idle_time = cur_wall_time - busy_time;
>         if (wall)
> @@ -103,7 +103,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>                         u64 cur_nice;
>                         unsigned long cur_nice_jiffies;
>
> -                       cur_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE] -
> +                       cur_nice = kcpustat_cpu_get(j, CPUTIME_NICE) -
>                                          cdbs->prev_cpu_nice;
>                         /*
>                          * Assumption: nice time between sampling periods will
> @@ -113,7 +113,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>                                         cputime64_to_jiffies64(cur_nice);
>
>                         cdbs->prev_cpu_nice =
> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>                         idle_time += jiffies_to_usecs(cur_nice_jiffies);
>                 }
>
> @@ -216,7 +216,7 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
>                                         &j_cdbs->prev_cpu_wall);
>                         if (ignore_nice)
>                                 j_cdbs->prev_cpu_nice =
> -                                       kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> +                                       kcpustat_cpu_get(j, CPUTIME_NICE);
>                 }
>
>                 /*
> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> index 7731f7c..ac5d49f 100644
> --- a/drivers/cpufreq/cpufreq_ondemand.c
> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> @@ -403,7 +403,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
>                                                 &dbs_info->cdbs.prev_cpu_wall);
>                 if (od_tuners.ignore_nice)
>                         dbs_info->cdbs.prev_cpu_nice =
> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>
>         }
>         return count;

For cpufreq:

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

Though I believe you also need this:

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 64ef737..38e3ad7 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -242,7 +242,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
                                                &dbs_info->cdbs.prev_cpu_wall);
                if (cs_tuners.ignore_nice)
                        dbs_info->cdbs.prev_cpu_nice =
-                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+                               kcpustat_cpu_get(j, CPUTIME_NICE);
        }
        return count;
 }

BTW, I don't see kcpustat_cpu() used in

 kernel/sched/core.c                | 12 +++++-------
 kernel/sched/cputime.c             | 29 +++++++++++++--------------

I searched tip/master as well as lnext/master.


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22  6:21   ` Viresh Kumar
@ 2013-02-22  7:17     ` Amit Kucheria
  2013-02-22  7:50       ` Viresh Kumar
  2013-02-22 12:34       ` Frederic Weisbecker
  2013-02-22 15:14     ` Kevin Hilman
  1 sibling, 2 replies; 25+ messages in thread
From: Amit Kucheria @ 2013-02-22  7:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 22, 2013 at 11:51 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On Fri, Feb 22, 2013 at 11:26 AM, Kevin Hilman <khilman@linaro.org> wrote:
>> Add some accessor functions in order to facilitate the conversion to
>> atomic reads/writes of cpustat values.
>>
>> Signed-off-by: Kevin Hilman <khilman@linaro.org>
>> ---
>>  drivers/cpufreq/cpufreq_governor.c | 18 ++++++++---------
>>  drivers/cpufreq/cpufreq_ondemand.c |  2 +-
>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 6c5f1d3..ec6c315 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -36,12 +36,12 @@ static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, u64 *wall)
>>
>>         cur_wall_time = jiffies64_to_cputime64(get_jiffies_64());
>>
>> -       busy_time = kcpustat_cpu(cpu).cpustat[CPUTIME_USER];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
>> +       busy_time = kcpustat_cpu_get(cpu, CPUTIME_USER);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SYSTEM);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_IRQ);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SOFTIRQ);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_STEAL);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_NICE);
>>
>>         idle_time = cur_wall_time - busy_time;
>>         if (wall)
>> @@ -103,7 +103,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>                         u64 cur_nice;
>>                         unsigned long cur_nice_jiffies;
>>
>> -                       cur_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE] -
>> +                       cur_nice = kcpustat_cpu_get(j, CPUTIME_NICE) -
>>                                          cdbs->prev_cpu_nice;
>>                         /*
>>                          * Assumption: nice time between sampling periods will
>> @@ -113,7 +113,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>                                         cputime64_to_jiffies64(cur_nice);
>>
>>                         cdbs->prev_cpu_nice =
>> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>>                         idle_time += jiffies_to_usecs(cur_nice_jiffies);
>>                 }
>>
>> @@ -216,7 +216,7 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
>>                                         &j_cdbs->prev_cpu_wall);
>>                         if (ignore_nice)
>>                                 j_cdbs->prev_cpu_nice =
>> -                                       kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>> +                                       kcpustat_cpu_get(j, CPUTIME_NICE);
>>                 }
>>
>>                 /*
>> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
>> index 7731f7c..ac5d49f 100644
>> --- a/drivers/cpufreq/cpufreq_ondemand.c
>> +++ b/drivers/cpufreq/cpufreq_ondemand.c
>> @@ -403,7 +403,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
>>                                                 &dbs_info->cdbs.prev_cpu_wall);
>>                 if (od_tuners.ignore_nice)
>>                         dbs_info->cdbs.prev_cpu_nice =
>> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>>
>>         }
>>         return count;
>
> For cpufreq:
>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>
> Though I believe you also need this:
>
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index 64ef737..38e3ad7 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -242,7 +242,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
>                                                 &dbs_info->cdbs.prev_cpu_wall);
>                 if (cs_tuners.ignore_nice)
>                         dbs_info->cdbs.prev_cpu_nice =
> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>         }
>         return count;
>  }
>
> BTW, I don't see kcpustat_cpu() used in
>
>  kernel/sched/core.c                | 12 +++++-------
>  kernel/sched/cputime.c             | 29 +++++++++++++--------------
>
> I searched tip/master as well as lnext/master.

Added by Frederic's Adaptive NOHZ patchset?


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22  7:17     ` Amit Kucheria
@ 2013-02-22  7:50       ` Viresh Kumar
  2013-02-22 12:34       ` Frederic Weisbecker
  1 sibling, 0 replies; 25+ messages in thread
From: Viresh Kumar @ 2013-02-22  7:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 22 February 2013 12:47, Amit Kucheria <amit.kucheria@linaro.org> wrote:
> On Fri, Feb 22, 2013 at 11:51 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> BTW, i don't see kcpustat_cpu() used in
>>
>>  kernel/sched/core.c                | 12 +++++-------
>>  kernel/sched/cputime.c             | 29 +++++++++++++--------------
>>
>> I searched tip/master as well as lnext/master.
>
> Added by Frederic's Adaptive NOHZ patchset?

I don't even see them on our unused-nohz-adaptive-tickless-v2 branch :)
Maybe it's from some other more recent work.


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22  5:56 [PATCH 0/2] cpustat: use atomic operations to read/update stats Kevin Hilman
  2013-02-22  5:56 ` [PATCH 1/2] cpustat: use accessor functions for get/set/add Kevin Hilman
  2013-02-22  5:56 ` [PATCH 2/2] cpustat: convert to atomic operations Kevin Hilman
@ 2013-02-22  8:46 ` Peter Zijlstra
  2013-02-22 12:50   ` Frederic Weisbecker
  2 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2013-02-22  8:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2013-02-21 at 21:56 -0800, Kevin Hilman wrote:
> On 64-bit platforms, reads/writes of the various cpustat fields are
> atomic due to native 64-bit loads/stores.  However, on non-64-bit
> platforms, reads/writes of the cpustat fields are not atomic and can
> lead to inconsistent statistics.

Which is a problem how?

> This problem was originally reported by Frederic Weisbecker as a
> 64-bit limitation with the nsec-granularity cputime accounting for
> full dynticks, but we then realized that this problem has been
> around for a while and is not specific to the new cputime accounting.
> 
> This series fixes this by first converting all access to the cputime
> fields to use accessor functions, and then converting the accessor
> functions to use the atomic64 functions.

Argh!! At what cost? 64-bit atomics are expensive. Wouldn't adding
a seqlock be saner?


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22  7:17     ` Amit Kucheria
  2013-02-22  7:50       ` Viresh Kumar
@ 2013-02-22 12:34       ` Frederic Weisbecker
  1 sibling, 0 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2013-02-22 12:34 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 22, 2013 at 12:47:33PM +0530, Amit Kucheria wrote:
> On Fri, Feb 22, 2013 at 11:51 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On Fri, Feb 22, 2013 at 11:26 AM, Kevin Hilman <khilman@linaro.org> wrote:
> >> Add some accessor functions in order to facilitate the conversion to
> >> atomic reads/writes of cpustat values.
> >>
> >> Signed-off-by: Kevin Hilman <khilman@linaro.org>
> >> ---
> >>  drivers/cpufreq/cpufreq_governor.c | 18 ++++++++---------
> >>  drivers/cpufreq/cpufreq_ondemand.c |  2 +-
> >
> >> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> >> index 6c5f1d3..ec6c315 100644
> >> --- a/drivers/cpufreq/cpufreq_governor.c
> >> +++ b/drivers/cpufreq/cpufreq_governor.c
> >> @@ -36,12 +36,12 @@ static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, u64 *wall)
> >>
> >>         cur_wall_time = jiffies64_to_cputime64(get_jiffies_64());
> >>
> >> -       busy_time = kcpustat_cpu(cpu).cpustat[CPUTIME_USER];
> >> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];
> >> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];
> >> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];
> >> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];
> >> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
> >> +       busy_time = kcpustat_cpu_get(cpu, CPUTIME_USER);
> >> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SYSTEM);
> >> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_IRQ);
> >> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SOFTIRQ);
> >> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_STEAL);
> >> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_NICE);
> >>
> >>         idle_time = cur_wall_time - busy_time;
> >>         if (wall)
> >> @@ -103,7 +103,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
> >>                         u64 cur_nice;
> >>                         unsigned long cur_nice_jiffies;
> >>
> >> -                       cur_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE] -
> >> +                       cur_nice = kcpustat_cpu_get(j, CPUTIME_NICE) -
> >>                                          cdbs->prev_cpu_nice;
> >>                         /*
> >>                          * Assumption: nice time between sampling periods will
> >> @@ -113,7 +113,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
> >>                                         cputime64_to_jiffies64(cur_nice);
> >>
> >>                         cdbs->prev_cpu_nice =
> >> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> >> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
> >>                         idle_time += jiffies_to_usecs(cur_nice_jiffies);
> >>                 }
> >>
> >> @@ -216,7 +216,7 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
> >>                                         &j_cdbs->prev_cpu_wall);
> >>                         if (ignore_nice)
> >>                                 j_cdbs->prev_cpu_nice =
> >> -                                       kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> >> +                                       kcpustat_cpu_get(j, CPUTIME_NICE);
> >>                 }
> >>
> >>                 /*
> >> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
> >> index 7731f7c..ac5d49f 100644
> >> --- a/drivers/cpufreq/cpufreq_ondemand.c
> >> +++ b/drivers/cpufreq/cpufreq_ondemand.c
> >> @@ -403,7 +403,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
> >>                                                 &dbs_info->cdbs.prev_cpu_wall);
> >>                 if (od_tuners.ignore_nice)
> >>                         dbs_info->cdbs.prev_cpu_nice =
> >> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> >> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
> >>
> >>         }
> >>         return count;
> >
> > For cpufreq:
> >
> > Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> >
> > Though I believe you also need this:
> >
> > diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> > index 64ef737..38e3ad7 100644
> > --- a/drivers/cpufreq/cpufreq_conservative.c
> > +++ b/drivers/cpufreq/cpufreq_conservative.c
> > @@ -242,7 +242,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
> >                                                 &dbs_info->cdbs.prev_cpu_wall);
> >                 if (cs_tuners.ignore_nice)
> >                         dbs_info->cdbs.prev_cpu_nice =
> > -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> > +                               kcpustat_cpu_get(j, CPUTIME_NICE);
> >         }
> >         return count;
> >  }
> >
> > BTW, I don't see kcpustat_cpu() used in
> >
> >  kernel/sched/core.c                | 12 +++++-------
> >  kernel/sched/cputime.c             | 29 +++++++++++++--------------
> >
> > I searched tip/master as well as lnext/master.
> 
> Added by Frederic's Adaptive NOHZ patchset?

No, Kevin has been using kcpustat_this_cpu_set() from kernel/sched/cputime.c. That's
because it's always written locally.


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22  8:46 ` [PATCH 0/2] cpustat: use atomic operations to read/update stats Peter Zijlstra
@ 2013-02-22 12:50   ` Frederic Weisbecker
  2013-02-22 13:48     ` Peter Zijlstra
                       ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2013-02-22 12:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 22, 2013 at 09:46:07AM +0100, Peter Zijlstra wrote:
> On Thu, 2013-02-21 at 21:56 -0800, Kevin Hilman wrote:
> > On 64-bit platforms, reads/writes of the various cpustat fields are
> > atomic due to native 64-bit loads/stores.  However, on non-64-bit
> > platforms, reads/writes of the cpustat fields are not atomic and can
> > lead to inconsistent statistics.
> 
> Which is a problem how?

So here is a possible scenario, with CPU 0 reading a kcpustat value
while CPU 1 writes it at the same time:

    //Initial value of "cpustat" is 0xffffffff
         == CPU 0 ==           == CPU 1 ==

       //load low part
       mov %eax, [cpustat]
                             inc [cpustat]
                             //Update the high part if necessary
                             jnc 1f
                             inc [cpustat + 4]
                             1:
       //load high part
       mov %edx, [cpustat + 4]


Afterward, CPU 0 will think the value is 0x1ffffffff while it's actually
0x100000000.

atomic64_read() and atomic64_set() are supposed to take care of that, without
even the need for _inc() or _add() parts that use LOCK.
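
For illustration, here is a minimal userspace analogue of that torn
read (hypothetical C11/pthreads code, not kernel code; build 32-bit,
e.g. with -m32, so the u64 access is split into two word-sized loads,
and the race disappears if the counter is made _Atomic):

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

static volatile uint64_t cpustat = 0xffffffffULL;

static void *writer(void *unused)
{
	cpustat++;	/* 0xffffffff -> 0x100000000: both words change */
	return NULL;
}

int main(void)
{
	pthread_t t;
	uint64_t snap;

	pthread_create(&t, NULL, writer, NULL);
	snap = cpustat;	/* torn read may observe 0x1ffffffff on 32-bit */
	pthread_join(t, NULL);
	printf("snapshot=0x%llx final=0x%llx\n",
	       (unsigned long long)snap, (unsigned long long)cpustat);
	return 0;
}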


> 
> > This problem was originally reported by Frederic Weisbecker as a
> > 64-bit limitation with the nsec-granularity cputime accounting for
> > full dynticks, but we then realized that this problem has been
> > around for a while and is not specific to the new cputime accounting.
> > 
> > This series fixes this by first converting all access to the cputime
> > fields to use accessor functions, and then converting the accessor
> > functions to use the atomic64 functions.
> 
> Argh!! At what cost? 64-bit atomics are expensive. Wouldn't adding
> a seqlock be saner?

Not sure. This requires a spinlock in the write side, which is called
from fast paths like the timer interrupt.


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22  5:56 ` [PATCH 1/2] cpustat: use accessor functions for get/set/add Kevin Hilman
  2013-02-22  6:21   ` Viresh Kumar
@ 2013-02-22 13:38   ` Frederic Weisbecker
  2013-02-22 13:58     ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Frederic Weisbecker @ 2013-02-22 13:38 UTC (permalink / raw)
  To: linux-arm-kernel

Looks good, just a minor nit:

On Thu, Feb 21, 2013 at 09:56:43PM -0800, Kevin Hilman wrote:
> diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
> index 66b7078..df8ad75 100644
> --- a/include/linux/kernel_stat.h
> +++ b/include/linux/kernel_stat.h
> @@ -32,7 +32,7 @@ enum cpu_usage_stat {
>  };
>  
>  struct kernel_cpustat {
> -	u64 cpustat[NR_STATS];
> +	u64 _cpustat[NR_STATS];

Why do you need the underscore?


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 12:50   ` Frederic Weisbecker
@ 2013-02-22 13:48     ` Peter Zijlstra
  2013-02-22 13:54       ` Ingo Molnar
  2013-02-22 14:05     ` Peter Zijlstra
  2013-02-22 14:09     ` Peter Zijlstra
  2 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2013-02-22 13:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> atomic64_read() and atomic64_set() are supposed to take care of that,
> without even the need for _inc() or _add() parts that use LOCK.

Are you sure? Generally atomic*_set() is not actually an atomic
operation.


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 13:48     ` Peter Zijlstra
@ 2013-02-22 13:54       ` Ingo Molnar
  2013-02-22 14:04         ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2013-02-22 13:54 UTC (permalink / raw)
  To: linux-arm-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:

> > atomic64_read() and atomic64_set() are supposed to take care 
> > of that, without even the need for _inc() or _add() parts 
> > that use LOCK.
> 
> Are you sure? Generally atomic*_set() is not actually an 
> atomic operation.

as per Documentation/atomic_ops.txt:

        #define atomic_read(v)  ((v)->counter)

 which simply reads the counter value currently visible to the 
 calling thread. The read is atomic in that the return value is 
 guaranteed to be one of the values initialized or modified with 
 the interface operations if a proper implicit or explicit 
 memory barrier is used after possible runtime initialization by 
 any other thread and the value is modified only with the 
 interface operations.

 ...

 Properly aligned pointers, longs, ints, and chars (and unsigned 
 equivalents) may be atomically loaded from and stored to in the 
 same sense as described for atomic_read() and atomic_set().  

 The ACCESS_ONCE() macro should be used to prevent the compiler 
 from using optimizations that might otherwise optimize accesses 
 out of existence on the one hand, or that might create 
 unsolicited accesses on the other.

This is usually a side effect of M[O]ESI cache coherency 
protocols - you can only get a 'split' word access if the word 
crosses cache line boundaries (multiples of 32 bytes, 
generally).
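
(Concretely, the 2013-era x86 atomic_read() is just such a volatile
load; a sketch of the idea, not the verbatim source:)

static inline int atomic_read(const atomic_t *v)
{
	/* plain aligned load; atomicity comes from cache coherency,
	 * not from any LOCK-prefixed instruction */
	return (*(volatile int *)&(v)->counter);
}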

Thanks,

	Ingo


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22 13:38   ` Frederic Weisbecker
@ 2013-02-22 13:58     ` Peter Zijlstra
  2013-02-22 14:03       ` Frederic Weisbecker
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2013-02-22 13:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2013-02-22 at 14:38 +0100, Frederic Weisbecker wrote:
> Looks good, just a minor nit:
> 
> On Thu, Feb 21, 2013 at 09:56:43PM -0800, Kevin Hilman wrote:
> > diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
> > index 66b7078..df8ad75 100644
> > --- a/include/linux/kernel_stat.h
> > +++ b/include/linux/kernel_stat.h
> > @@ -32,7 +32,7 @@ enum cpu_usage_stat {
> >  };
> >  
> >  struct kernel_cpustat {
> > -	u64 cpustat[NR_STATS];
> > +	u64 _cpustat[NR_STATS];
> 
> Why do you need the underscore?

To make sure there's nobody still referencing the data directly instead
of through the accessors.


* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22 13:58     ` Peter Zijlstra
@ 2013-02-22 14:03       ` Frederic Weisbecker
  0 siblings, 0 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2013-02-22 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 22, 2013 at 02:58:51PM +0100, Peter Zijlstra wrote:
> On Fri, 2013-02-22 at 14:38 +0100, Frederic Weisbecker wrote:
> > Looks good, just a minor nit:
> > 
> > On Thu, Feb 21, 2013 at 09:56:43PM -0800, Kevin Hilman wrote:
> > > diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
> > > index 66b7078..df8ad75 100644
> > > --- a/include/linux/kernel_stat.h
> > > +++ b/include/linux/kernel_stat.h
> > > @@ -32,7 +32,7 @@ enum cpu_usage_stat {
> > >  };
> > >  
> > >  struct kernel_cpustat {
> > > -	u64 cpustat[NR_STATS];
> > > +	u64 _cpustat[NR_STATS];
> > 
> > Why do you need the underscore?
> 
> To make sure there's nobody still referencing the data directly instead
> of through the accessors.

Good point!


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 13:54       ` Ingo Molnar
@ 2013-02-22 14:04         ` Peter Zijlstra
  2013-02-22 14:16           ` Ingo Molnar
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2013-02-22 14:04 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2013-02-22 at 14:54 +0100, Ingo Molnar wrote:
> * Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> 
> > > atomic64_read() and atomic64_set() are supposed to take care 
> > > of that, without even the need for _inc() or _add() parts 
> > > that use LOCK.
> > 
> > Are you sure? Generally atomic*_set() is not actually an 
> > atomic operation.
> 
> as per Documentation/atomic_ops.h:

I think the interesting part is:

"The setting is atomic in that the return values of the atomic
operations by
all threads are guaranteed to be correct reflecting either the value
that has
been set with this operation or set with another operation.  A proper
implicit
or explicit memory barrier is needed before the value set with the
operation
is guaranteed to be readable with atomic_read from another thread."

Which does give us the wanted guarantee, however:

I checked arch/x86/include/asm/atomic64_32.h and we use cmpxchg8b for
everything from _set() to _read(), which translates into 'horridly
stupendifyingly slow' for a number of machines, but coherent.
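
(Conceptually, a cmpxchg8b-backed 64-bit read looks like the sketch
below: compare-exchange against an arbitrary expected value, and the
old value that comes back is an atomic snapshot either way. This is a
model using a GCC builtin, not the literal arch/x86 code, and the
function name is made up:)

long long atomic64_read_model(long long *p)
{
	/* if *p == 0 this stores 0 back; in all cases the returned
	 * old value was read atomically by the hardware */
	return __sync_val_compare_and_swap(p, 0LL, 0LL);
}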


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 12:50   ` Frederic Weisbecker
  2013-02-22 13:48     ` Peter Zijlstra
@ 2013-02-22 14:05     ` Peter Zijlstra
  2013-02-22 14:15       ` Frederic Weisbecker
  2013-02-22 14:09     ` Peter Zijlstra
  2 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2013-02-22 14:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> > Which is a problem how?
> 
> So here is a possible scenario, with CPU 0 reading a kcpustat value
> while CPU 1 writes it at the same time:
> 
>     //Initial value of "cpustat" is 0xffffffff
>          == CPU 0 ==           == CPU 1 ==
> 
>        //load low part
>        mov %eax, [cpustat]
>                              inc [cpustat]
>                              //Update the high part if necessary
>                              jnc 1f
>                              inc [cpustat + 4]
>                              1:
>        //load high part
>        mov %edx, [cpustat + 4]
> 
> 
> Afterward, CPU 0 will think the value is 0x1ffffffff while it's
> actually 0x100000000.
> 
> atomic64_read() and atomic64_set() are supposed to take care of that,
> without even the need for _inc() or _add() parts that use LOCK.


Sure, I get that, but again, why is that a problem? Who relies on
these statistics in a way that makes it a problem?


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 12:50   ` Frederic Weisbecker
  2013-02-22 13:48     ` Peter Zijlstra
  2013-02-22 14:05     ` Peter Zijlstra
@ 2013-02-22 14:09     ` Peter Zijlstra
  2013-02-22 14:18       ` Ingo Molnar
  2013-02-22 14:21       ` Eric Dumazet
  2 siblings, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2013-02-22 14:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> > Argh!! At what cost? 64-bit atomics are expensive. Wouldn't adding
> > a seqlock be saner?
> 
> Not sure. This requires a spinlock in the write side, which is called
> from fast paths like the timer interrupt.

A single spinlock is _way_ cheaper than a ton of cmpxchg8b()s to update
a few variables.


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 14:05     ` Peter Zijlstra
@ 2013-02-22 14:15       ` Frederic Weisbecker
  0 siblings, 0 replies; 25+ messages in thread
From: Frederic Weisbecker @ 2013-02-22 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 22, 2013 at 03:05:39PM +0100, Peter Zijlstra wrote:
> On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> > > Which is a problem how?
> > 
> > So here is a possible scenario, CPU 0 reads a kcpustat value, and CPU
> > 1 writes
> > it at the same time:
> > 
> >     //Initial value of "cpustat" is 0xffffffff
> >          == CPU 0 ==           == CPU 1 ==
> > 
> >        //load low part
> >        mov %eax, [cpustat]
> >                              inc [cpustat]
> >                              //Update the high part if necessary
> >                              jnc 1f
> >                              inc [cpustat + 4]
> >                              1:
> >        //load high part
> >        mov %edx, [cpustat + 4]
> > 
> > 
> > Afterward, CPU 0 will think the value is 0x1ffffffff while it's
> > actually
> > 0x100000000.
> > 
> > atomic64_read() and atomic64_set() are supposed to take care of that,
> > without
> > even the need for _inc() or _add() parts that use LOCK.
> 
> 
> Sure, I get that, but again, why is that a problem? Who relies on
> these statistics in a way that makes it a problem?

I guess we want to provide at least some minimal reliability in
/proc/stat. I mean, we don't mind if the read is slightly off; reading
stats from userspace is inherently racy anyway. But if it suddenly
shows a wrong increase of 4 billion which disappears soon after, that
looks like a bug to me.


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 14:04         ` Peter Zijlstra
@ 2013-02-22 14:16           ` Ingo Molnar
  2013-02-22 14:25             ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Ingo Molnar @ 2013-02-22 14:16 UTC (permalink / raw)
  To: linux-arm-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, 2013-02-22 at 14:54 +0100, Ingo Molnar wrote:
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> > 
> > > On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> > 
> > > > atomic64_read() and atomic64_set() are supposed to take care 
> > > > of that, without even the need for _inc() or _add() parts 
> > > > that use LOCK.
> > > 
> > > Are you sure? Generally atomic*_set() is not actually an 
> > > atomic operation.
> > 
> > as per Documentation/atomic_ops.h:
> 
> I think the interesting part is:
> 
> "The setting is atomic in that the return values of the atomic 
> operations by all threads are guaranteed to be correct 
> reflecting either the value that has been set with this 
> operation or set with another operation.  A proper implicit or 
> explicit memory barrier is needed before the value set with 
> the operation is guaranteed to be readable with atomic_read 
> from another thread."
> 
> Which does give us the wanted guarantee, however:
> 
> I checked arch/x86/include/asm/atomic64_32.h and we use 
> cmpxchg8b for everything from _set() to _read(), which 
> translates into 'horridly stupendifyingly slow' for a number 
> of machines, but coherent.

That's a valid concern - and cmpxchg8b is the only 64-bit op 
available on most 32-bit x86 CPUs which does not involve the 
FPU.

Wondering how significant this range of x86 problem boxes will 
be by the time any of these changes reaches upstream and distros 
- and how much 'horridly stupendifyingly slow' is in terms of 
cycles expended.

Thanks,

	Ingo


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 14:09     ` Peter Zijlstra
@ 2013-02-22 14:18       ` Ingo Molnar
  2013-02-22 14:21       ` Eric Dumazet
  1 sibling, 0 replies; 25+ messages in thread
From: Ingo Molnar @ 2013-02-22 14:18 UTC (permalink / raw)
  To: linux-arm-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> > > Argh!! At what cost? 64-bit atomics are expensive. Wouldn't adding
> > > a seqlock be saner?
> > 
> > Not sure. This requires a spinlock in the write side, which is called
> > from fast paths like the timer interrupt.
> 
> A single spinlock is _way_ cheaper than a ton of cmpxchg8b()s 
> to update a few variables.

Every cmpxchg8b() will be roughly as expensive as a spinlock 
acquire+release fastpath.

Thanks,

	Ingo


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 14:09     ` Peter Zijlstra
  2013-02-22 14:18       ` Ingo Molnar
@ 2013-02-22 14:21       ` Eric Dumazet
  2013-02-22 14:23         ` Frederic Weisbecker
  1 sibling, 1 reply; 25+ messages in thread
From: Eric Dumazet @ 2013-02-22 14:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2013-02-22 at 15:09 +0100, Peter Zijlstra wrote:
> On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> > > Argh!! At what cost? 64-bit atomics are expensive. Wouldn't adding
> > > a seqlock be saner?
> > 
> > Not sure. This requires a spinlock in the write side, which is called
> > from fast paths like the timer interrupt.
> 
> A single spinlock is _way_ cheaper than a ton of cmpxchg8b()s to update
> a few variables.

We also have include/linux/u64_stats_sync.h since 2.6.36
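
(For reference, a sketch of how that API could be applied here; the
struct and function names below are illustrative, not from the
patches, while the u64_stats_* calls are the real 2.6.36+ interface:)

#include <linux/u64_stats_sync.h>

struct cpustat_example {
	u64 stat[NR_STATS];
	struct u64_stats_sync syncp;
};

/* write side (local CPU, e.g. the timer tick): on 32-bit this is a
 * seqcount bump, no atomics; on 64-bit it compiles away entirely */
static void cpustat_example_add(struct cpustat_example *s, int i, u64 val)
{
	u64_stats_update_begin(&s->syncp);
	s->stat[i] += val;
	u64_stats_update_end(&s->syncp);
}

/* read side (e.g. /proc/stat): lockless; retries if it overlapped
 * a writer */
static u64 cpustat_example_get(struct cpustat_example *s, int i)
{
	unsigned int seq;
	u64 val;

	do {
		seq = u64_stats_fetch_begin(&s->syncp);
		val = s->stat[i];
	} while (u64_stats_fetch_retry(&s->syncp, seq));

	return val;
}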


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 14:21       ` Eric Dumazet
@ 2013-02-22 14:23         ` Frederic Weisbecker
  2013-02-22 19:01           ` Kevin Hilman
  0 siblings, 1 reply; 25+ messages in thread
From: Frederic Weisbecker @ 2013-02-22 14:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 22, 2013 at 06:21:31AM -0800, Eric Dumazet wrote:
> On Fri, 2013-02-22 at 15:09 +0100, Peter Zijlstra wrote:
> > On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
> > > > Argh!! At what cost? 64-bit atomics are expensive. Wouldn't adding
> > > > a seqlock be saner?
> > > 
> > > Not sure. This requires a spinlock in the write side, which is called
> > > from fast paths like the timer interrupt.
> > 
> > A single spinlock is _way_ cheaper than a ton of cmpxchg8b()s to update
> > a few variables.
> 
> We also have include/linux/u64_stats_sync.h since 2.6.36

Interesting, we should probably use that instead.


* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 14:16           ` Ingo Molnar
@ 2013-02-22 14:25             ` Peter Zijlstra
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2013-02-22 14:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2013-02-22 at 15:16 +0100, Ingo Molnar wrote:
> > I checked arch/x86/include/asm/atomic64_32.h and we use 
> > cmpxchg8b for everything from _set() to _read(), which 
> > translates into 'horridly stupendifyingly slow' for a number 
> > of machines, but coherent.
> 
> That's a valid concern - and cmpxchg8b is the only 64-bit op 
> available on most 32-bit x86 CPUs which does not involve the 
> FPU.
> 
> Wondering how significant this range of x86 problem boxes will 
> be by the time any of these changes reaches upstream and distros 
> - and how much 'horridly stupendifyingly slow' is in terms of 
> cycles expended.

On the !x86 side of things we're implementing (generic) atomic64 using
hashed spinlocks, so there too using a single spinlock around the
entire data structure is a complete win.
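
For those who haven't looked, lib/atomic64.c boils down to roughly
this (simplified sketch):

#define NR_LOCKS 16

/* one lock per hash bucket, shared by all generic atomic64_t's */
static raw_spinlock_t atomic64_lock[NR_LOCKS];

static inline raw_spinlock_t *lock_addr(const atomic64_t *v)
{
	unsigned long addr = (unsigned long) v;

	addr >>= L1_CACHE_SHIFT;
	addr ^= (addr >> 8) ^ (addr >> 16);
	return &atomic64_lock[addr & (NR_LOCKS - 1)];
}

long long atomic64_read(const atomic64_t *v)
{
	unsigned long flags;
	raw_spinlock_t *lock = lock_addr(v);
	long long val;

	raw_spin_lock_irqsave(lock, flags);
	val = v->counter;
	raw_spin_unlock_irqrestore(lock, flags);
	return val;
}

So on those machines every single atomic64 op already takes and
releases a spinlock with IRQs disabled -- per field.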

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/2] cpustat: use accessor functions for get/set/add
  2013-02-22  6:21   ` Viresh Kumar
  2013-02-22  7:17     ` Amit Kucheria
@ 2013-02-22 15:14     ` Kevin Hilman
  1 sibling, 0 replies; 25+ messages in thread
From: Kevin Hilman @ 2013-02-22 15:14 UTC (permalink / raw)
  To: linux-arm-kernel

Viresh Kumar <viresh.kumar@linaro.org> writes:

> On Fri, Feb 22, 2013 at 11:26 AM, Kevin Hilman <khilman@linaro.org> wrote:
>> Add some accessor functions in order to facilitate the conversion to
>> atomic reads/writes of cpustat values.
>>
>> Signed-off-by: Kevin Hilman <khilman@linaro.org>
>> ---
>>  drivers/cpufreq/cpufreq_governor.c | 18 ++++++++---------
>>  drivers/cpufreq/cpufreq_ondemand.c |  2 +-
>
>> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
>> index 6c5f1d3..ec6c315 100644
>> --- a/drivers/cpufreq/cpufreq_governor.c
>> +++ b/drivers/cpufreq/cpufreq_governor.c
>> @@ -36,12 +36,12 @@ static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, u64 *wall)
>>
>>         cur_wall_time = jiffies64_to_cputime64(get_jiffies_64());
>>
>> -       busy_time = kcpustat_cpu(cpu).cpustat[CPUTIME_USER];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];
>> -       busy_time += kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];
>> +       busy_time = kcpustat_cpu_get(cpu, CPUTIME_USER);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SYSTEM);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_IRQ);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_SOFTIRQ);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_STEAL);
>> +       busy_time += kcpustat_cpu_get(cpu, CPUTIME_NICE);
>>
>>         idle_time = cur_wall_time - busy_time;
>>         if (wall)
>> @@ -103,7 +103,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>                         u64 cur_nice;
>>                         unsigned long cur_nice_jiffies;
>>
>> -                       cur_nice = kcpustat_cpu(j).cpustat[CPUTIME_NICE] -
>> +                       cur_nice = kcpustat_cpu_get(j, CPUTIME_NICE) -
>>                                          cdbs->prev_cpu_nice;
>>                         /*
>>                          * Assumption: nice time between sampling periods will
>> @@ -113,7 +113,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
>>                                         cputime64_to_jiffies64(cur_nice);
>>
>>                         cdbs->prev_cpu_nice =
>> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>>                         idle_time += jiffies_to_usecs(cur_nice_jiffies);
>>                 }
>>
>> @@ -216,7 +216,7 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
>>                                         &j_cdbs->prev_cpu_wall);
>>                         if (ignore_nice)
>>                                 j_cdbs->prev_cpu_nice =
>> -                                       kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>> +                                       kcpustat_cpu_get(j, CPUTIME_NICE);
>>                 }
>>
>>                 /*
>> diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
>> index 7731f7c..ac5d49f 100644
>> --- a/drivers/cpufreq/cpufreq_ondemand.c
>> +++ b/drivers/cpufreq/cpufreq_ondemand.c
>> @@ -403,7 +403,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
>>                                                 &dbs_info->cdbs.prev_cpu_wall);
>>                 if (od_tuners.ignore_nice)
>>                         dbs_info->cdbs.prev_cpu_nice =
>> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
>> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>>
>>         }
>>         return count;
>
> For cpufreq:
>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>
> Though I believe you also need this:
>
> diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
> index 64ef737..38e3ad7 100644
> --- a/drivers/cpufreq/cpufreq_conservative.c
> +++ b/drivers/cpufreq/cpufreq_conservative.c
> @@ -242,7 +242,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
>                                                 &dbs_info->cdbs.prev_cpu_wall);
>                 if (cs_tuners.ignore_nice)
>                         dbs_info->cdbs.prev_cpu_nice =
> -                               kcpustat_cpu(j).cpustat[CPUTIME_NICE];
> +                               kcpustat_cpu_get(j, CPUTIME_NICE);
>         }
>         return count;
>  }

Thanks, I missed that one.  I've added it locally.

Kevin

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 0/2] cpustat: use atomic operations to read/update stats
  2013-02-22 14:23         ` Frederic Weisbecker
@ 2013-02-22 19:01           ` Kevin Hilman
  0 siblings, 0 replies; 25+ messages in thread
From: Kevin Hilman @ 2013-02-22 19:01 UTC (permalink / raw)
  To: linux-arm-kernel

Frederic Weisbecker <fweisbec@gmail.com> writes:

> On Fri, Feb 22, 2013 at 06:21:31AM -0800, Eric Dumazet wrote:
>> On Fri, 2013-02-22 at 15:09 +0100, Peter Zijlstra wrote:
>> > On Fri, 2013-02-22 at 13:50 +0100, Frederic Weisbecker wrote:
>> > > > Argh!! at what cost? 64bit atomics are like expensive. Wouldn't
>> > > > adding a seqlock be saner?
>> > > 
>> > > Not sure. This requires a spinlock in the write side which is
>> > > called from fast paths like the timer interrupt.
>> > 
>> > A single spinlock is _way_ cheaper than a ton of cmpxchg8b()s to update
>> > a few variables.
>> 
>> We also have include/linux/u64_stats_sync.h since 2.6.36
>
> Interesting, we should probably use that instead.

OK, I'll spin a version using the u64_stats interface.
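
Roughly the shape I have in mind (a hypothetical sketch, not the
actual respin -- putting the syncp inside kernel_cpustat is an
assumption):

/* hypothetical: add a u64_stats_sync to the per-cpu cpustat struct */
struct kernel_cpustat {
	u64 cpustat[NR_STATS];
	struct u64_stats_sync syncp;
};

static inline u64 kcpustat_cpu_get(int cpu, int i)
{
	struct kernel_cpustat *kcs = &kcpustat_cpu(cpu);
	unsigned int start;
	u64 val;

	do {
		start = u64_stats_fetch_begin(&kcs->syncp);
		val = kcs->cpustat[i];
	} while (u64_stats_fetch_retry(&kcs->syncp, start));

	return val;
}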

Unfortunately, for 32-bit platforms that have atomic 64-bit
loads/stores[1], u64_stats adds some unnecessary overhead, but I'll
look at optimizing u64_stats for those platforms in a follow-up
patch.

Kevin

[1] ARM >= v6k devices have ldrexd/strexd instructions for 64-bit
    loads/stores, which are used by the atomic64 accessors on those
    devices (cf. arch/arm/include/asm/atomic.h:atomic64_read()).
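
    That accessor looks roughly like:

/* ARMv6K+: ldrexd performs a single-copy-atomic 64-bit load, so no
 * lock and no cmpxchg loop are needed for a plain read.
 */
static inline u64 atomic64_read(const atomic64_t *v)
{
	u64 result;

	__asm__ __volatile__("@ atomic64_read\n"
"	ldrexd	%0, %H0, [%1]"
	: "=&r" (result)
	: "r" (&v->counter), "Qo" (v->counter)
	);

	return result;
}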

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2013-02-22 19:01 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-22  5:56 [PATCH 0/2] cpustat: use atomic operations to read/update stats Kevin Hilman
2013-02-22  5:56 ` [PATCH 1/2] cpustat: use accessor functions for get/set/add Kevin Hilman
2013-02-22  6:21   ` Viresh Kumar
2013-02-22  7:17     ` Amit Kucheria
2013-02-22  7:50       ` Viresh Kumar
2013-02-22 12:34       ` Frederic Weisbecker
2013-02-22 15:14     ` Kevin Hilman
2013-02-22 13:38   ` Frederic Weisbecker
2013-02-22 13:58     ` Peter Zijlstra
2013-02-22 14:03       ` Frederic Weisbecker
2013-02-22  5:56 ` [PATCH 2/2] cpustat: convert to atomic operations Kevin Hilman
2013-02-22  8:46 ` [PATCH 0/2] cpustat: use atomic operations to read/update stats Peter Zijlstra
2013-02-22 12:50   ` Frederic Weisbecker
2013-02-22 13:48     ` Peter Zijlstra
2013-02-22 13:54       ` Ingo Molnar
2013-02-22 14:04         ` Peter Zijlstra
2013-02-22 14:16           ` Ingo Molnar
2013-02-22 14:25             ` Peter Zijlstra
2013-02-22 14:05     ` Peter Zijlstra
2013-02-22 14:15       ` Frederic Weisbecker
2013-02-22 14:09     ` Peter Zijlstra
2013-02-22 14:18       ` Ingo Molnar
2013-02-22 14:21       ` Eric Dumazet
2013-02-22 14:23         ` Frederic Weisbecker
2013-02-22 19:01           ` Kevin Hilman
