public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] perf/sched: fix for getting task's execute time
@ 2009-12-06 10:57 Xiao Guangrong
  2009-12-06 11:05 ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-06 10:57 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Peter Zijlstra, Frederic Weisbecker, Paul Mackerras, LKML

In current code, we get task's execute time by reading
"/proc/<pid>/sched" file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.
So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
file.

This patch also remove redundant include files since <sys/types.h>
is included in "perf.h"

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 26b782f..b2e910e 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -416,7 +415,7 @@ static u64 get_cpu_usage_nsec_parent(void)
 
 static u64 get_cpu_usage_nsec_self(void)
 {
-	char filename [] = "/proc/1234567890/sched";
+	char filename [] = "/proc/1234567890/task/1234567890/sched";
 	unsigned long msecs, nsecs;
 	char *line = NULL;
 	u64 total = 0;
@@ -425,7 +424,9 @@ static u64 get_cpu_usage_nsec_self(void)
 	FILE *file;
 	int ret;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
+	sprintf(filename, "/proc/%d/task/%d/sched", getpid(),
+		(pid_t)syscall(SYS_gettid));
+
 	file = fopen(filename, "r");
 	BUG_ON(!file);
 
-- 
1.6.1.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 10:57 [PATCH] perf/sched: fix for getting task's execute time Xiao Guangrong
@ 2009-12-06 11:05 ` Peter Zijlstra
  2009-12-06 11:06   ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2009-12-06 11:05 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Ingo Molnar, Frederic Weisbecker, Paul Mackerras, LKML

On Sun, 2009-12-06 at 18:57 +0800, Xiao Guangrong wrote:
> In current code, we get task's execute time by reading
> "/proc/<pid>/sched" file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
> file.
> 
> This patch also remove redundant include files since <sys/types.h>
> is included in "perf.h"

We really should not be using these proc files but instead make sure
this information gets transferred through a tracepoint or similar.

Reading these proc files is too prone to races.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 11:05 ` Peter Zijlstra
@ 2009-12-06 11:06   ` Peter Zijlstra
  2009-12-06 11:50     ` Ingo Molnar
  2009-12-06 17:10     ` Xiao Guangrong
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2009-12-06 11:06 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Ingo Molnar, Frederic Weisbecker, Paul Mackerras, LKML

On Sun, 2009-12-06 at 12:05 +0100, Peter Zijlstra wrote:
> On Sun, 2009-12-06 at 18:57 +0800, Xiao Guangrong wrote:
> > In current code, we get task's execute time by reading
> > "/proc/<pid>/sched" file, it's wrong if the task is created
> > by pthread_create(), because every thread task has same pid.
> > So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
> > file.
> > 
> > This patch also remove redundant include files since <sys/types.h>
> > is included in "perf.h"
> 
> We really should not be using these proc files but instead make sure
> this information gets transferred through a tracepoint or similar.
> 
> Reading these proc files is too prone to races.

We can probably get the runtime by grouping a task-clock swcounter with
an appropriate other event.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 11:06   ` Peter Zijlstra
@ 2009-12-06 11:50     ` Ingo Molnar
  2009-12-06 17:10     ` Xiao Guangrong
  1 sibling, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-12-06 11:50 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Xiao Guangrong, Frederic Weisbecker, Paul Mackerras, LKML


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Sun, 2009-12-06 at 12:05 +0100, Peter Zijlstra wrote:
> > On Sun, 2009-12-06 at 18:57 +0800, Xiao Guangrong wrote:
> > > In current code, we get task's execute time by reading
> > > "/proc/<pid>/sched" file, it's wrong if the task is created
> > > by pthread_create(), because every thread task has same pid.
> > > So, the correct way is reading "/proc/<ppid>/task/<tid>/sched"
> > > file.
> > > 
> > > This patch also remove redundant include files since <sys/types.h>
> > > is included in "perf.h"
> > 
> > We really should not be using these proc files but instead make sure
> > this information gets transferred through a tracepoint or similar.
> > 
> > Reading these proc files is too prone to races.

yeah. Ideally we'd want all valuable information that is available via 
/proc to be available via perf events as well. In the future it should 
be possible to run perf even without /proc mounted for example.

Furthermore it's good for consistency and simplicity as well, plus it's 
faster too to get the information from the perf syscall and mmap-ed 
ringbuffer than to read things via /proc. No need for ASCII conversion, 
fixed record formats, fast streaming and buffering, no read() overhead, 
etc.

> We can probably get the runtime by grouping a task-clock swcounter 
> with an appropriate other event.

Would be lovely.

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] perf/sched: fix for getting task's execute time
  2009-12-06 11:06   ` Peter Zijlstra
  2009-12-06 11:50     ` Ingo Molnar
@ 2009-12-06 17:10     ` Xiao Guangrong
  2009-12-07  7:20       ` [PATCH v2] perf/sched: fix for getting task's execution time Xiao Guangrong
  1 sibling, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-06 17:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Xiao Guangrong, Ingo Molnar, Frederic Weisbecker, Paul Mackerras,
	LKML

Peter Zijlstra wrote:

>> We really should not be using these proc files but instead make sure
>> this information gets transferred through a tracepoint or similar.
>>
>> Reading these proc files is too prone to races.
> 
> We can probably get the runtime by grouping a task-clock swcounter with
> an appropriate other event.
> 

Hi Peter,

Thanks your suggestion.

Actually, we can call getrusage(RUSAGE_THREAD, ru) to get current task's
execute time, and I think this is a simpler way.

I'll send v2 patch later.

Thanks,
Xiao

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] perf/sched: fix for getting task's execution time
  2009-12-06 17:10     ` Xiao Guangrong
@ 2009-12-07  7:20       ` Xiao Guangrong
  2009-12-07  7:30         ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-07  7:20 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker, Paul Mackerras,
	Török Edwin, LKML

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

1: 'perf sched replay' can't work if the kernel not compile with
   'CONFIG_SCHED_DEBUG' option
2: perf tool should depend on proc file system

So, this patch call getrusage() to get task's execution time instead
of reading /proc file

Reported-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |   46 ++++++++++++++-----------------------------
 1 files changed, 15 insertions(+), 31 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..bd57994 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -399,49 +398,34 @@ process_sched_event(struct task_desc *this_task __used, struct sched_atom *atom)
 	}
 }
 
+static u64 get_sum_time(struct rusage *ru)
+{
+	u64 sum;
+
+	sum =  ru->ru_utime.tv_sec*1e9 + ru->ru_utime.tv_usec*1e3;
+	sum += ru->ru_stime.tv_sec*1e9 + ru->ru_stime.tv_usec*1e3;
+	return sum;
+}
+
 static u64 get_cpu_usage_nsec_parent(void)
 {
 	struct rusage ru;
-	u64 sum;
 	int err;
 
 	err = getrusage(RUSAGE_SELF, &ru);
 	BUG_ON(err);
 
-	sum =  ru.ru_utime.tv_sec*1e9 + ru.ru_utime.tv_usec*1e3;
-	sum += ru.ru_stime.tv_sec*1e9 + ru.ru_stime.tv_usec*1e3;
-
-	return sum;
+	return get_sum_time(&ru);
 }
 
 static u64 get_cpu_usage_nsec_self(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
-
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
-
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	struct rusage ru;
+	int err;
 
-	return total;
+	err = getrusage(RUSAGE_THREAD, &ru);
+	BUG_ON(err);
+	return get_sum_time(&ru);
 }
 
 static void *thread_func(void *ctx)
-- 
1.6.1.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2] perf/sched: fix for getting task's execution time
  2009-12-07  7:20       ` [PATCH v2] perf/sched: fix for getting task's execution time Xiao Guangrong
@ 2009-12-07  7:30         ` Ingo Molnar
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
  0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2009-12-07  7:30 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML


* Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote:

> In current code, task's execute time is got by reading
> '/proc/<pid>/sched' file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> 
> This way also has two demerits:
> 
> 1: 'perf sched replay' can't work if the kernel not compile with
>    'CONFIG_SCHED_DEBUG' option
> 2: perf tool should depend on proc file system
> 
> So, this patch call getrusage() to get task's execution time instead
> of reading /proc file

ok, that's better than /proc, but how about using 
PERF_COUNT_SW_TASK_CLOCK instead of rusage, to recover the CPU time 
used? It would be more precise, and it would use the perf API. (Some 
helper functions in tools/perf/lib/ would be nice to make it as easy to 
use as rusage (or even easier if possible).)

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-07  7:30         ` Ingo Molnar
@ 2009-12-09  9:51           ` Xiao Guangrong
  2009-12-09  9:54             ` Xiao Guangrong
                               ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-09  9:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

1: 'perf sched replay' can't work if the kernel not compile
    with 'CONFIG_SCHED_DEBUG' option
2: perf tool should depend on proc file system

So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
execution time instead of reading /proc file

Changelog v2 -> v3:
use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..b12b23a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
 	return sum;
 }
 
-static u64 get_cpu_usage_nsec_self(void)
+static int self_open_counters(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
+	struct perf_event_attr attr;
+	int fd;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
+	memset(&attr, 0, sizeof(attr));
 
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	return total;
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+
+	if (fd < 0)
+		die("Error: sys_perf_event_open() syscall returned"
+		    "with %d (%s)\n", fd, strerror(errno));
+	return fd;
+}
+
+static u64 get_cpu_usage_nsec_self(int fd)
+{
+	u64 runtime;
+	int ret;
+
+	ret = read(fd, &runtime, sizeof(runtime));
+	BUG_ON(ret != sizeof(runtime));
+
+	return runtime;
 }
 
 static void *thread_func(void *ctx)
@@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
 	u64 cpu_usage_0, cpu_usage_1;
 	unsigned long i, ret;
 	char comm2[22];
+	int fd;
 
 	sprintf(comm2, ":%s", this_task->comm);
 	prctl(PR_SET_NAME, comm2);
+	fd = self_open_counters();
 
 again:
 	ret = sem_post(&this_task->ready_for_work);
@@ -462,16 +462,15 @@ again:
 	ret = pthread_mutex_unlock(&start_work_mutex);
 	BUG_ON(ret);
 
-	cpu_usage_0 = get_cpu_usage_nsec_self();
+	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
 
 	for (i = 0; i < this_task->nr_events; i++) {
 		this_task->curr_event = i;
 		process_sched_event(this_task, this_task->atoms[i]);
 	}
 
-	cpu_usage_1 = get_cpu_usage_nsec_self();
+	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
 	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
-
 	ret = sem_post(&this_task->work_done_sem);
 	BUG_ON(ret);
 
-- 
1.6.1.2






^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
@ 2009-12-09  9:54             ` Xiao Guangrong
  2009-12-09  9:59               ` Ingo Molnar
  2009-12-09  9:57             ` Xiao Guangrong
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-09  9:54 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML

Sorry, miss Reported-by, please ignore this mail, i'll resend it

Xiao Guangrong wrote:
> In current code, task's execute time is got by reading
> '/proc/<pid>/sched' file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> 
> This way also has two demerits:
> 
> 1: 'perf sched replay' can't work if the kernel not compile
>     with 'CONFIG_SCHED_DEBUG' option
> 2: perf tool should depend on proc file system
> 
> So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
> execution time instead of reading /proc file
> 
> Changelog v2 -> v3:
> use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
  2009-12-09  9:54             ` Xiao Guangrong
@ 2009-12-09  9:57             ` Xiao Guangrong
  2009-12-09  9:57             ` Ingo Molnar
  2009-12-09 10:03             ` [tip:perf/urgent] perf sched: Fix " tip-bot for Xiao Guangrong
  3 siblings, 0 replies; 13+ messages in thread
From: Xiao Guangrong @ 2009-12-09  9:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

1: 'perf sched replay' can't work if the kernel not compile
    with 'CONFIG_SCHED_DEBUG' option
2: perf tool should depend on proc file system

So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
execution time instead of reading /proc file

Changelog v2 -> v3:
use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion

Reported-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
---
 tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..b12b23a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
 	return sum;
 }
 
-static u64 get_cpu_usage_nsec_self(void)
+static int self_open_counters(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
+	struct perf_event_attr attr;
+	int fd;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
+	memset(&attr, 0, sizeof(attr));
 
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	return total;
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+
+	if (fd < 0)
+		die("Error: sys_perf_event_open() syscall returned"
+		    "with %d (%s)\n", fd, strerror(errno));
+	return fd;
+}
+
+static u64 get_cpu_usage_nsec_self(int fd)
+{
+	u64 runtime;
+	int ret;
+
+	ret = read(fd, &runtime, sizeof(runtime));
+	BUG_ON(ret != sizeof(runtime));
+
+	return runtime;
 }
 
 static void *thread_func(void *ctx)
@@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
 	u64 cpu_usage_0, cpu_usage_1;
 	unsigned long i, ret;
 	char comm2[22];
+	int fd;
 
 	sprintf(comm2, ":%s", this_task->comm);
 	prctl(PR_SET_NAME, comm2);
+	fd = self_open_counters();
 
 again:
 	ret = sem_post(&this_task->ready_for_work);
@@ -462,16 +462,15 @@ again:
 	ret = pthread_mutex_unlock(&start_work_mutex);
 	BUG_ON(ret);
 
-	cpu_usage_0 = get_cpu_usage_nsec_self();
+	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
 
 	for (i = 0; i < this_task->nr_events; i++) {
 		this_task->curr_event = i;
 		process_sched_event(this_task, this_task->atoms[i]);
 	}
 
-	cpu_usage_1 = get_cpu_usage_nsec_self();
+	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
 	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
-
 	ret = sem_post(&this_task->work_done_sem);
 	BUG_ON(ret);
 
-- 
1.6.1.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
  2009-12-09  9:54             ` Xiao Guangrong
  2009-12-09  9:57             ` Xiao Guangrong
@ 2009-12-09  9:57             ` Ingo Molnar
  2009-12-09 10:03             ` [tip:perf/urgent] perf sched: Fix " tip-bot for Xiao Guangrong
  3 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-12-09  9:57 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML


* Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote:

> In current code, task's execute time is got by reading
> '/proc/<pid>/sched' file, it's wrong if the task is created
> by pthread_create(), because every thread task has same pid.
> 
> This way also has two demerits:
> 
> 1: 'perf sched replay' can't work if the kernel not compile
>     with 'CONFIG_SCHED_DEBUG' option
> 2: perf tool should depend on proc file system
> 
> So, this patch use PERF_COUNT_SW_TASK_CLOCK to get task's
> execution time instead of reading /proc file
> 
> Changelog v2 -> v3:
> use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's suggestion
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
> ---
>  tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
>  1 files changed, 27 insertions(+), 28 deletions(-)
> 
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 19f43fa..b12b23a 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -13,7 +13,6 @@
>  #include "util/debug.h"
>  #include "util/data_map.h"
>  
> -#include <sys/types.h>
>  #include <sys/prctl.h>
>  
>  #include <semaphore.h>
> @@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
>  	return sum;
>  }
>  
> -static u64 get_cpu_usage_nsec_self(void)
> +static int self_open_counters(void)
>  {
> -	char filename [] = "/proc/1234567890/sched";
> -	unsigned long msecs, nsecs;
> -	char *line = NULL;
> -	u64 total = 0;
> -	size_t len = 0;
> -	ssize_t chars;
> -	FILE *file;
> -	int ret;
> +	struct perf_event_attr attr;
> +	int fd;
>  
> -	sprintf(filename, "/proc/%d/sched", getpid());
> -	file = fopen(filename, "r");
> -	BUG_ON(!file);
> +	memset(&attr, 0, sizeof(attr));
>  
> -	while ((chars = getline(&line, &len, file)) != -1) {
> -		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
> -			&msecs, &nsecs);
> -		if (ret == 2) {
> -			total = msecs*1e6 + nsecs;
> -			break;
> -		}
> -	}
> -	if (line)
> -		free(line);
> -	fclose(file);
> +	attr.type = PERF_TYPE_SOFTWARE;
> +	attr.config = PERF_COUNT_SW_TASK_CLOCK;
>  
> -	return total;
> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> +
> +	if (fd < 0)
> +		die("Error: sys_perf_event_open() syscall returned"
> +		    "with %d (%s)\n", fd, strerror(errno));
> +	return fd;
> +}
> +
> +static u64 get_cpu_usage_nsec_self(int fd)
> +{
> +	u64 runtime;
> +	int ret;
> +
> +	ret = read(fd, &runtime, sizeof(runtime));
> +	BUG_ON(ret != sizeof(runtime));
> +
> +	return runtime;
>  }
>  
>  static void *thread_func(void *ctx)
> @@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
>  	u64 cpu_usage_0, cpu_usage_1;
>  	unsigned long i, ret;
>  	char comm2[22];
> +	int fd;
>  
>  	sprintf(comm2, ":%s", this_task->comm);
>  	prctl(PR_SET_NAME, comm2);
> +	fd = self_open_counters();
>  
>  again:
>  	ret = sem_post(&this_task->ready_for_work);
> @@ -462,16 +462,15 @@ again:
>  	ret = pthread_mutex_unlock(&start_work_mutex);
>  	BUG_ON(ret);
>  
> -	cpu_usage_0 = get_cpu_usage_nsec_self();
> +	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
>  
>  	for (i = 0; i < this_task->nr_events; i++) {
>  		this_task->curr_event = i;
>  		process_sched_event(this_task, this_task->atoms[i]);
>  	}
>  
> -	cpu_usage_1 = get_cpu_usage_nsec_self();
> +	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
>  	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
> -
>  	ret = sem_post(&this_task->work_done_sem);
>  	BUG_ON(ret);

Very nice - and the code even got shorter a tiny bit.

Applied, thanks!

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] perf/sched: fix for getting task's execution time
  2009-12-09  9:54             ` Xiao Guangrong
@ 2009-12-09  9:59               ` Ingo Molnar
  0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-12-09  9:59 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Xiao Guangrong, Peter Zijlstra, Frederic Weisbecker,
	Paul Mackerras, T??r??k Edwin, LKML


* Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> wrote:

> Sorry, miss Reported-by, please ignore this mail, i'll resend it

I've added Edwin's Reported-by, no need to resend.

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip:perf/urgent] perf sched: Fix for getting task's execution time
  2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
                               ` (2 preceding siblings ...)
  2009-12-09  9:57             ` Ingo Molnar
@ 2009-12-09 10:03             ` tip-bot for Xiao Guangrong
  3 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Xiao Guangrong @ 2009-12-09 10:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, paulus, hpa, mingo, peterz, edwintorok,
	xiaoguangrong, ericxiao.gr, fweisbec, tglx, mingo

Commit-ID:  c0c9e72150c07b4a6766cd24a6f685ec2dc9c343
Gitweb:     http://git.kernel.org/tip/c0c9e72150c07b4a6766cd24a6f685ec2dc9c343
Author:     Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
AuthorDate: Wed, 9 Dec 2009 17:51:30 +0800
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Wed, 9 Dec 2009 10:59:12 +0100

perf sched: Fix for getting task's execution time

In current code, task's execute time is got by reading
'/proc/<pid>/sched' file, it's wrong if the task is created
by pthread_create(), because every thread task has same pid.

This way also has two demerits:

 1: 'perf sched replay' can't work if the kernel is not
    compiled with the 'CONFIG_SCHED_DEBUG' option

 2: perf tool should depend on proc file system

So, this patch uses PERF_COUNT_SW_TASK_CLOCK to get task's
execution time instead of reading /proc file.

Changelog v2 -> v3:
use PERF_COUNT_SW_TASK_CLOCK instead of rusage() as Ingo's
suggestion

Reported-by: Torok Edwin <edwintorok@gmail.com>
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Xiao Guangrong <ericxiao.gr@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <4B1F7322.80103@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 tools/perf/builtin-sched.c |   55 +++++++++++++++++++++----------------------
 1 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 19f43fa..b12b23a 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -13,7 +13,6 @@
 #include "util/debug.h"
 #include "util/data_map.h"
 
-#include <sys/types.h>
 #include <sys/prctl.h>
 
 #include <semaphore.h>
@@ -414,34 +413,33 @@ static u64 get_cpu_usage_nsec_parent(void)
 	return sum;
 }
 
-static u64 get_cpu_usage_nsec_self(void)
+static int self_open_counters(void)
 {
-	char filename [] = "/proc/1234567890/sched";
-	unsigned long msecs, nsecs;
-	char *line = NULL;
-	u64 total = 0;
-	size_t len = 0;
-	ssize_t chars;
-	FILE *file;
-	int ret;
+	struct perf_event_attr attr;
+	int fd;
 
-	sprintf(filename, "/proc/%d/sched", getpid());
-	file = fopen(filename, "r");
-	BUG_ON(!file);
+	memset(&attr, 0, sizeof(attr));
 
-	while ((chars = getline(&line, &len, file)) != -1) {
-		ret = sscanf(line, "se.sum_exec_runtime : %ld.%06ld\n",
-			&msecs, &nsecs);
-		if (ret == 2) {
-			total = msecs*1e6 + nsecs;
-			break;
-		}
-	}
-	if (line)
-		free(line);
-	fclose(file);
+	attr.type = PERF_TYPE_SOFTWARE;
+	attr.config = PERF_COUNT_SW_TASK_CLOCK;
 
-	return total;
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+
+	if (fd < 0)
+		die("Error: sys_perf_event_open() syscall returned"
+		    "with %d (%s)\n", fd, strerror(errno));
+	return fd;
+}
+
+static u64 get_cpu_usage_nsec_self(int fd)
+{
+	u64 runtime;
+	int ret;
+
+	ret = read(fd, &runtime, sizeof(runtime));
+	BUG_ON(ret != sizeof(runtime));
+
+	return runtime;
 }
 
 static void *thread_func(void *ctx)
@@ -450,9 +448,11 @@ static void *thread_func(void *ctx)
 	u64 cpu_usage_0, cpu_usage_1;
 	unsigned long i, ret;
 	char comm2[22];
+	int fd;
 
 	sprintf(comm2, ":%s", this_task->comm);
 	prctl(PR_SET_NAME, comm2);
+	fd = self_open_counters();
 
 again:
 	ret = sem_post(&this_task->ready_for_work);
@@ -462,16 +462,15 @@ again:
 	ret = pthread_mutex_unlock(&start_work_mutex);
 	BUG_ON(ret);
 
-	cpu_usage_0 = get_cpu_usage_nsec_self();
+	cpu_usage_0 = get_cpu_usage_nsec_self(fd);
 
 	for (i = 0; i < this_task->nr_events; i++) {
 		this_task->curr_event = i;
 		process_sched_event(this_task, this_task->atoms[i]);
 	}
 
-	cpu_usage_1 = get_cpu_usage_nsec_self();
+	cpu_usage_1 = get_cpu_usage_nsec_self(fd);
 	this_task->cpu_usage = cpu_usage_1 - cpu_usage_0;
-
 	ret = sem_post(&this_task->work_done_sem);
 	BUG_ON(ret);
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2009-12-09 10:04 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-06 10:57 [PATCH] perf/sched: fix for getting task's execute time Xiao Guangrong
2009-12-06 11:05 ` Peter Zijlstra
2009-12-06 11:06   ` Peter Zijlstra
2009-12-06 11:50     ` Ingo Molnar
2009-12-06 17:10     ` Xiao Guangrong
2009-12-07  7:20       ` [PATCH v2] perf/sched: fix for getting task's execution time Xiao Guangrong
2009-12-07  7:30         ` Ingo Molnar
2009-12-09  9:51           ` [PATCH v3] " Xiao Guangrong
2009-12-09  9:54             ` Xiao Guangrong
2009-12-09  9:59               ` Ingo Molnar
2009-12-09  9:57             ` Xiao Guangrong
2009-12-09  9:57             ` Ingo Molnar
2009-12-09 10:03             ` [tip:perf/urgent] perf sched: Fix " tip-bot for Xiao Guangrong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox