public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] posix-cpu-timers: clock_gettime(CLOCK_THREAD_CPUTIME_ID) goes backward
@ 2008-12-24  8:50 Hidetoshi Seto
  2008-12-25  8:24 ` Ingo Molnar
  0 siblings, 1 reply; 3+ messages in thread
From: Hidetoshi Seto @ 2008-12-24  8:50 UTC (permalink / raw)
  To: linux-kernel

[1]:
 Timer tick (and task_tick) can interrups between getting sum_exec_runtime
 and task_delta_exec(p).  Since task's runtime is updated in the tick,
 it makes the value of cpu->sched about a tick less than what it should be.
 So, interrupts should be disabled while sampling runtime.

[2]:
 enqueue_task also updates runtime of task running on the rq which
 a task is going to be enqueued.
 So, runqueue should be locked while sampling runtime.

>From user land (@HZ=250), it looks like:
	prev call       current call    diff
        (0.191195713) > (0.191200456) : 4743
        (0.191200456) > (0.191205198) : 4742
        (0.191205198) > (0.187213693) : -3991505
        (0.187213693) > (0.191218688) : 4004995
        (0.191218688) > (0.191223436) : 4748

clock_gettime(CLOCK_PROCESS_CPUTIME_ID) also has same problem,
but it is complicated because it need to handle thread group.
There should be another patch to fix it.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
---
 include/linux/kernel_stat.h |    1 +
 kernel/posix-cpu-timers.c   |    2 +-
 kernel/sched.c              |   27 +++++++++++++++++++++++++++
 3 files changed, 29 insertions(+), 1 deletions(-)

diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h
index 4a145ca..225f27a 100644
--- a/include/linux/kernel_stat.h
+++ b/include/linux/kernel_stat.h
@@ -66,6 +66,7 @@ static inline unsigned int kstat_irqs(unsigned int irq)
 	return sum;
 }
 
+extern unsigned long long task_total_exec(struct task_struct *);
 extern unsigned long long task_delta_exec(struct task_struct *);
 extern void account_user_time(struct task_struct *, cputime_t);
 extern void account_user_time_scaled(struct task_struct *, cputime_t);
diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 4e5288a..8e828b4 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -294,7 +294,7 @@ static int cpu_clock_sample(const clockid_t which_clock, struct task_struct *p,
 		cpu->cpu = virt_ticks(p);
 		break;
 	case CPUCLOCK_SCHED:
-		cpu->sched = p->se.sum_exec_runtime + task_delta_exec(p);
+		cpu->sched = task_total_exec(p);
 		break;
 	}
 	return 0;
diff --git a/kernel/sched.c b/kernel/sched.c
index e4bb1dd..5c7365a 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -4064,6 +4064,33 @@ DEFINE_PER_CPU(struct kernel_stat, kstat);
 EXPORT_PER_CPU_SYMBOL(kstat);
 
 /*
+ * Return total of runtime which already banked and which not yet
+ * @p in case that task is currently running.
+ */
+unsigned long long task_total_exec(struct task_struct *p)
+{
+	unsigned long flags;
+	struct rq *rq;
+	u64 ns = 0;
+
+	rq = task_rq_lock(p, &flags);
+
+	if (task_current(rq, p)) {
+		u64 delta_exec;
+
+		update_rq_clock(rq);
+		delta_exec = rq->clock - p->se.exec_start;
+		if ((s64)delta_exec > 0)
+			ns = delta_exec;
+	}
+	ns += p->se.sum_exec_runtime;
+
+	task_rq_unlock(rq, &flags);
+
+	return ns;
+}
+
+/*
  * Return any ns on the sched_clock that have not yet been banked in
  * @p in case that task is currently running.
  */
-- 
1.6.1.rc4


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] posix-cpu-timers: clock_gettime(CLOCK_THREAD_CPUTIME_ID) goes backward
  2008-12-24  8:50 [PATCH] posix-cpu-timers: clock_gettime(CLOCK_THREAD_CPUTIME_ID) goes backward Hidetoshi Seto
@ 2008-12-25  8:24 ` Ingo Molnar
  2008-12-26  0:07   ` Hidetoshi Seto
  0 siblings, 1 reply; 3+ messages in thread
From: Ingo Molnar @ 2008-12-25  8:24 UTC (permalink / raw)
  To: Hidetoshi Seto; +Cc: linux-kernel


* Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> wrote:

> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -4064,6 +4064,33 @@ DEFINE_PER_CPU(struct kernel_stat, kstat);
>  EXPORT_PER_CPU_SYMBOL(kstat);
>  
>  /*
> + * Return total of runtime which already banked and which not yet
> + * @p in case that task is currently running.
> + */
> +unsigned long long task_total_exec(struct task_struct *p)
> +{
> +	unsigned long flags;
> +	struct rq *rq;
> +	u64 ns = 0;
> +
> +	rq = task_rq_lock(p, &flags);
> +
> +	if (task_current(rq, p)) {
> +		u64 delta_exec;
> +
> +		update_rq_clock(rq);
> +		delta_exec = rq->clock - p->se.exec_start;
> +		if ((s64)delta_exec > 0)
> +			ns = delta_exec;
> +	}
> +	ns += p->se.sum_exec_runtime;
> +
> +	task_rq_unlock(rq, &flags);
> +
> +	return ns;
> +}
> +

ok - the issue is real, but there's a few implementational details. Did 
you know about task_delta_exec()? This new function duplicates that 
function's behavior - and task_delta_exec() is already used by posix 
task/cpu timers.

I find your new task_total_exec() interface better (it's easier to use), 
could you have a look at eliminating task_delta_exec() users and migrating 
them over to task_total_exec()?

Maybe even change the name of sum_exec_runtime over to __sum_exec_runtime, 
to make sure nobody uses the 'raw' value - which is always stale up to one 
tick's worth of execution. Accessing that raw value would only make sense 
after the task has finished executing. (i.e. at cutime/cstime collection 
time)

	Ingo

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] posix-cpu-timers: clock_gettime(CLOCK_THREAD_CPUTIME_ID) goes backward
  2008-12-25  8:24 ` Ingo Molnar
@ 2008-12-26  0:07   ` Hidetoshi Seto
  0 siblings, 0 replies; 3+ messages in thread
From: Hidetoshi Seto @ 2008-12-26  0:07 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel

Ingo Molnar wrote:
> ok - the issue is real, but there's a few implementational details. Did 
> you know about task_delta_exec()? This new function duplicates that 
> function's behavior - and task_delta_exec() is already used by posix 
> task/cpu timers.
> 
> I find your new task_total_exec() interface better (it's easier to use), 
> could you have a look at eliminating task_delta_exec() users and migrating 
> them over to task_total_exec()?

Yes.  There are only 2 users (clock_gettime() on thread timer and that on
process timer), so I'll post new patch with better description soon.

Thanks,
H.Seto


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2008-12-26  0:07 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-12-24  8:50 [PATCH] posix-cpu-timers: clock_gettime(CLOCK_THREAD_CPUTIME_ID) goes backward Hidetoshi Seto
2008-12-25  8:24 ` Ingo Molnar
2008-12-26  0:07   ` Hidetoshi Seto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox