public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] [GIT PULL] function profiler updates
@ 2009-03-26  1:07 Steven Rostedt
  2009-03-26  1:07 ` [PATCH 1/2] tracing: remove on the fly allocator from function profiler Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Steven Rostedt @ 2009-03-26  1:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Mike Galbraith, Arjan van de Ven


Ingo,

Please pull the latest rfc/tip/tracing/function-profiler tree, which can be found at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
rfc/tip/tracing/function-profiler


Steven Rostedt (2):
      tracing: remove on the fly allocator from function profiler
      tracing: add average time in function to function profiler

----
 kernel/trace/ftrace.c |   97 +++++++++++++++++++++++++++++--------------------
 1 files changed, 57 insertions(+), 40 deletions(-)
-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] tracing: remove on the fly allocator from function profiler
  2009-03-26  1:07 [PATCH 0/2] [GIT PULL] function profiler updates Steven Rostedt
@ 2009-03-26  1:07 ` Steven Rostedt
  2009-03-26  1:07 ` [PATCH 2/2] tracing: add average time in function to " Steven Rostedt
  2009-03-26  8:07 ` [PATCH 0/2] [GIT PULL] function profiler updates Ingo Molnar
  2 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2009-03-26  1:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Mike Galbraith, Arjan van de Ven,
	Steven Rostedt

[-- Attachment #1: 0001-tracing-remove-on-the-fly-allocator-from-function-p.patch --]
[-- Type: text/plain, Size: 4782 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Impact: safer code

The on the fly allocator for the function profiler was to save
memory. But at the expense of stability. Although it survived several
tests, allocating from the function tracer is just too risky, just
to save space.

This patch removes the allocator and simply allocates enough entries
at start up.

Each function gets a profiling structure of 40 bytes. With an average
of 20K functions, and this is for each CPU, we have 800K per online
CPU. This is not too bad, at least for non-embedded.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 kernel/trace/ftrace.c |   76 +++++++++++++++++++++++++++---------------------
 1 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a141d84..4d90c91 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -401,6 +401,8 @@ static void ftrace_profile_reset(struct ftrace_profile_stat *stat)
 int ftrace_profile_pages_init(struct ftrace_profile_stat *stat)
 {
 	struct ftrace_profile_page *pg;
+	int functions;
+	int pages;
 	int i;
 
 	/* If we already allocated, do nothing */
@@ -411,22 +413,46 @@ int ftrace_profile_pages_init(struct ftrace_profile_stat *stat)
 	if (!stat->pages)
 		return -ENOMEM;
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+	functions = ftrace_update_tot_cnt;
+#else
+	/*
+	 * We do not know the number of functions that exist because
+	 * dynamic tracing is what counts them. With past experience
+	 * we have around 20K functions. That should be more than enough.
+	 * It is highly unlikely we will execute every function in
+	 * the kernel.
+	 */
+	functions = 20000;
+#endif
+
 	pg = stat->start = stat->pages;
 
-	/* allocate 10 more pages to start */
-	for (i = 0; i < 10; i++) {
+	pages = DIV_ROUND_UP(functions, PROFILES_PER_PAGE);
+
+	for (i = 0; i < pages; i++) {
 		pg->next = (void *)get_zeroed_page(GFP_KERNEL);
-		/*
-		 * We only care about allocating profile_pages, if
-		 * we failed to allocate here, hopefully we will allocate
-		 * later.
-		 */
 		if (!pg->next)
-			break;
+			goto out_free;
 		pg = pg->next;
 	}
 
 	return 0;
+
+ out_free:
+	pg = stat->start;
+	while (pg) {
+		unsigned long tmp = (unsigned long)pg;
+
+		pg = pg->next;
+		free_page(tmp);
+	}
+
+	free_page((unsigned long)stat->pages);
+	stat->pages = NULL;
+	stat->start = NULL;
+
+	return -ENOMEM;
 }
 
 static int ftrace_profile_init_cpu(int cpu)
@@ -460,7 +486,7 @@ static int ftrace_profile_init_cpu(int cpu)
 			ftrace_profile_bits++;
 	}
 
-	/* Preallocate a few pages */
+	/* Preallocate the function profiling pages */
 	if (ftrace_profile_pages_init(stat) < 0) {
 		kfree(stat->hash);
 		stat->hash = NULL;
@@ -516,24 +542,21 @@ static void ftrace_add_profile(struct ftrace_profile_stat *stat,
 	hlist_add_head_rcu(&rec->node, &stat->hash[key]);
 }
 
-/* Interrupts must be disabled calling this */
+/*
+ * The memory is already allocated, this simply finds a new record to use.
+ */
 static struct ftrace_profile *
-ftrace_profile_alloc(struct ftrace_profile_stat *stat,
-		     unsigned long ip, bool alloc_safe)
+ftrace_profile_alloc(struct ftrace_profile_stat *stat, unsigned long ip)
 {
 	struct ftrace_profile *rec = NULL;
 
-	/* prevent recursion */
+	/* prevent recursion (from NMIs) */
 	if (atomic_inc_return(&stat->disabled) != 1)
 		goto out;
 
-	/* Try to always keep another page available */
-	if (!stat->pages->next && alloc_safe)
-		stat->pages->next = (void *)get_zeroed_page(GFP_ATOMIC);
-
 	/*
-	 * Try to find the function again since another
-	 * task on another CPU could have added it
+	 * Try to find the function again since an NMI
+	 * could have added it
 	 */
 	rec = ftrace_find_profiled_func(stat, ip);
 	if (rec)
@@ -555,29 +578,16 @@ ftrace_profile_alloc(struct ftrace_profile_stat *stat,
 	return rec;
 }
 
-/*
- * If we are not in an interrupt, or softirq and
- * and interrupts are disabled and preemption is not enabled
- * (not in a spinlock) then it should be safe to allocate memory.
- */
-static bool ftrace_safe_to_allocate(void)
-{
-	return !in_interrupt() && irqs_disabled() && !preempt_count();
-}
-
 static void
 function_profile_call(unsigned long ip, unsigned long parent_ip)
 {
 	struct ftrace_profile_stat *stat;
 	struct ftrace_profile *rec;
 	unsigned long flags;
-	bool alloc_safe;
 
 	if (!ftrace_profile_enabled)
 		return;
 
-	alloc_safe = ftrace_safe_to_allocate();
-
 	local_irq_save(flags);
 
 	stat = &__get_cpu_var(ftrace_profile_stats);
@@ -586,7 +596,7 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
 
 	rec = ftrace_find_profiled_func(stat, ip);
 	if (!rec) {
-		rec = ftrace_profile_alloc(stat, ip, alloc_safe);
+		rec = ftrace_profile_alloc(stat, ip);
 		if (!rec)
 			goto out;
 	}
-- 
1.6.2

-- 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] tracing: add average time in function to function profiler
  2009-03-26  1:07 [PATCH 0/2] [GIT PULL] function profiler updates Steven Rostedt
  2009-03-26  1:07 ` [PATCH 1/2] tracing: remove on the fly allocator from function profiler Steven Rostedt
@ 2009-03-26  1:07 ` Steven Rostedt
  2009-03-26 22:15   ` Stephen Rothwell
  2009-03-26  8:07 ` [PATCH 0/2] [GIT PULL] function profiler updates Ingo Molnar
  2 siblings, 1 reply; 7+ messages in thread
From: Steven Rostedt @ 2009-03-26  1:07 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Mike Galbraith, Arjan van de Ven,
	Steven Rostedt

[-- Attachment #1: 0002-tracing-add-average-time-in-function-to-function-pr.patch --]
[-- Type: text/plain, Size: 2854 bytes --]

From: Steven Rostedt <srostedt@redhat.com>

Show the average time in the function (Time / Hit)

  Function                               Hit    Time            Avg
  --------                               ---    ----            ---
  mwait_idle                              51    140326.6 us     2751.503 us
  smp_apic_timer_interrupt                47    3517.735 us     74.845 us
  schedule                                10    2738.754 us     273.875 us
  __schedule                              10    2732.857 us     273.285 us
  hrtimer_interrupt                       47    1896.104 us     40.342 us
  irq_exit                                56    1711.833 us     30.568 us
  __run_hrtimer                           47    1315.589 us     27.991 us
  tick_sched_timer                        47    1138.690 us     24.227 us
  do_softirq                              56    1116.829 us     19.943 us
  __do_softirq                            56    1066.932 us     19.052 us
  do_IRQ                                   9    926.153 us      102.905 us

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
---
 kernel/trace/ftrace.c |   21 ++++++++++++++-------
 1 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 4d90c91..c7f4a4b 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -347,8 +347,10 @@ static int function_stat_cmp(void *p1, void *p2)
 static int function_stat_headers(struct seq_file *m)
 {
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	seq_printf(m, "  Function                               Hit    Time\n"
-		      "  --------                               ---    ----\n");
+	seq_printf(m, "  Function                               "
+		   "Hit    Time            Avg\n"
+		      "  --------                               "
+		   "---    ----            ---\n");
 #else
 	seq_printf(m, "  Function                               Hit\n"
 		      "  --------                               ---\n");
@@ -361,12 +363,9 @@ static int function_stat_show(struct seq_file *m, void *v)
 	struct ftrace_profile *rec = v;
 	char str[KSYM_SYMBOL_LEN];
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	static struct trace_seq s;
 	static DEFINE_MUTEX(mutex);
-
-	mutex_lock(&mutex);
-	trace_seq_init(&s);
-	trace_print_graph_duration(rec->time, &s);
+	static struct trace_seq s;
+	unsigned long long avg;
 #endif
 
 	kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
@@ -374,6 +373,14 @@ static int function_stat_show(struct seq_file *m, void *v)
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	seq_printf(m, "    ");
+	avg = rec->time;
+	do_div(avg, rec->counter);
+
+	mutex_lock(&mutex);
+	trace_seq_init(&s);
+	trace_print_graph_duration(rec->time, &s);
+	trace_seq_puts(&s, "    ");
+	trace_print_graph_duration(avg, &s);
 	trace_print_seq(m, &s);
 	mutex_unlock(&mutex);
 #endif
-- 
1.6.2

-- 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] [GIT PULL] function profiler updates
  2009-03-26  1:07 [PATCH 0/2] [GIT PULL] function profiler updates Steven Rostedt
  2009-03-26  1:07 ` [PATCH 1/2] tracing: remove on the fly allocator from function profiler Steven Rostedt
  2009-03-26  1:07 ` [PATCH 2/2] tracing: add average time in function to " Steven Rostedt
@ 2009-03-26  8:07 ` Ingo Molnar
  2 siblings, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2009-03-26  8:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
	Frederic Weisbecker, Mike Galbraith, Arjan van de Ven


* Steven Rostedt <rostedt@goodmis.org> wrote:

> 
> Ingo,
> 
> Please pull the latest rfc/tip/tracing/function-profiler tree, which can be found at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
> rfc/tip/tracing/function-profiler
> 
> 
> Steven Rostedt (2):
>       tracing: remove on the fly allocator from function profiler
>       tracing: add average time in function to function profiler
> 
> ----
>  kernel/trace/ftrace.c |   97 +++++++++++++++++++++++++++++--------------------
>  1 files changed, 57 insertions(+), 40 deletions(-)
> -- 

Pulled, thanks Steve. The on-the-fly allocator was my main 
conceptual worry about this.

How useful this kind of profiling will end up being in practice 
remains to be seen - but it does have a handful of unique properties 
that cannot be dismissed.

It's main downside is its overhead - sampling based profilers are a 
lot more lightweight and allow unintrusive inspection of workloads.

The upside of it are all the things that this profiling overhead 
allows us to do:

 - 100% precise, non-sampling profiling

 - accurate profiles possible with very short, bursty workloads too

 - filtering capabilities of the tracing subsystem allow the 
   flexible shaping of the set of functions and the set of 
   conditions that are profiled: there's the regex filter and 
   there's also the tracepoint based filter expressions.

 - it also allowsuser-space driven filtering: if user-space wants to 
   profile just one specific functionality (say the function calls 
   of one specific library), they can use /debug/tracing/enabled
   fast toggle to turn profiling on/off runtime. This can be done at 
   a very high frequency as well.

 - developers can switch between profiling and tracing mode 
   seemlessly. This allows multiple 'dimensions' of the same 
   workload to be observed, via the same set of facilities.

	Ingo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] tracing: add average time in function to function profiler
  2009-03-26  1:07 ` [PATCH 2/2] tracing: add average time in function to " Steven Rostedt
@ 2009-03-26 22:15   ` Stephen Rothwell
  2009-03-26 22:29     ` Ingo Molnar
  2009-03-26 23:14     ` Steven Rostedt
  0 siblings, 2 replies; 7+ messages in thread
From: Stephen Rothwell @ 2009-03-26 22:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Mike Galbraith,
	Arjan van de Ven, Steven Rostedt

[-- Attachment #1: Type: text/plain, Size: 446 bytes --]

Hi Steve,

On Wed, 25 Mar 2009 21:07:39 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:
>
> From: Steven Rostedt <srostedt@redhat.com>
> 
> Show the average time in the function (Time / Hit)

Just wondering why we need to do this in the kernel ... there is already
enough information for a user to figure these averages out.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] tracing: add average time in function to function profiler
  2009-03-26 22:15   ` Stephen Rothwell
@ 2009-03-26 22:29     ` Ingo Molnar
  2009-03-26 23:14     ` Steven Rostedt
  1 sibling, 0 replies; 7+ messages in thread
From: Ingo Molnar @ 2009-03-26 22:29 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: Steven Rostedt, linux-kernel, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Mike Galbraith,
	Arjan van de Ven, Steven Rostedt


* Stephen Rothwell <sfr@canb.auug.org.au> wrote:

> Hi Steve,
> 
> On Wed, 25 Mar 2009 21:07:39 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > Show the average time in the function (Time / Hit)
> 
> Just wondering why we need to do this in the kernel ... there is 
> already enough information for a user to figure these averages 
> out.

Well, this argument can be applied to a lot of information exposed 
in /proc and /sys. For example in theory there's no need for 
/proc/interrupts at all, because the IRQ mappings are present in the 
syslog and in /sys enumerations and the IRQ counts are present in 
/proc/stat already.

I still find /proc/interrupts very handy, despite purist arguments 
that it is redundant and that it thus should not be there.

So we provide such things when it's easy to do, when it's a 
meaningful piece of information and it is convenient to users and 
developers.

Note that the function profiler is a double-non-default item in 
/debug (you need to enable two non-default options in the .config to 
get it), so you cannot stumble on it accidentally.

	Ingo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] tracing: add average time in function to function profiler
  2009-03-26 22:15   ` Stephen Rothwell
  2009-03-26 22:29     ` Ingo Molnar
@ 2009-03-26 23:14     ` Steven Rostedt
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2009-03-26 23:14 UTC (permalink / raw)
  To: Stephen Rothwell
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker, Mike Galbraith,
	Arjan van de Ven, Steven Rostedt


On Fri, 27 Mar 2009, Stephen Rothwell wrote:

> Hi Steve,
> 
> On Wed, 25 Mar 2009 21:07:39 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > From: Steven Rostedt <srostedt@redhat.com>
> > 
> > Show the average time in the function (Time / Hit)
> 
> Just wondering why we need to do this in the kernel ... there is already
> enough information for a user to figure these averages out.

I also plan on adding a profiler_option file to let you sort by Hit, Time, 
or Avg. I'm working on a project at the University that would like to see 
that. Yes, it can be done by userspace too, but it is very simple to do in 
the kernel (just pick what to sort by since it already has the sorting 
algorithm). Userspace, can do it, but it is much more clumsy at it.

-- Steve


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-03-26 23:14 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-26  1:07 [PATCH 0/2] [GIT PULL] function profiler updates Steven Rostedt
2009-03-26  1:07 ` [PATCH 1/2] tracing: remove on the fly allocator from function profiler Steven Rostedt
2009-03-26  1:07 ` [PATCH 2/2] tracing: add average time in function to " Steven Rostedt
2009-03-26 22:15   ` Stephen Rothwell
2009-03-26 22:29     ` Ingo Molnar
2009-03-26 23:14     ` Steven Rostedt
2009-03-26  8:07 ` [PATCH 0/2] [GIT PULL] function profiler updates Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox