* [PATCH 0/9] Backport of ftrace's function profiler
From: Luis Claudio R. Goncalves @ 2009-05-19 14:36 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
Hello,
The following patches are a backport of Steven Rostedt's ftrace
function profiler, as published in his RFC series posted to LKML. The
backport has been done on top of 2.6.29.3-rt14.
The feature is fully functional, but there is one known problem: a kernel
panic when stressing the function profiler (a fast sequence of start, stop
and read in a loop for 30+ minutes). This problem has a Bugzilla entry at:
https://bugzilla.redhat.com/show_bug.cgi?id=500156
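[ For illustration, a minimal sketch of such a start/stop/read stress loop.
  The debugfs mount point and the trace_stat file names are illustrative
  and may need adjusting for your setup:

    while true; do
            echo 1 > /debug/tracing/function_profile_enabled
            cat /debug/tracing/trace_stat/function* > /dev/null
            echo 0 > /debug/tracing/function_profile_enabled
    done
]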
Luis
--
[ Luis Claudio R. Goncalves Red Hat - Realtime Team ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
* [PATCH 1/9] tracing: add function profiler
From: Luis Claudio R. Goncalves @ 2009-05-19 14:43 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
tracing: add function profiler
Impact: new profiling feature
Backport to 2.6.29.3-rt14.
| commit b64b1cd804a0261e0052e67897c8f634414e94ce
| Author: Steven Rostedt <srostedt@redhat.com>
| Date: Fri Mar 20 12:50:56 2009 -0400
|
| tracing: add function profiler
|
| Impact: new profiling feature
|
| This patch adds a function profiler. In debugfs/tracing/ two new
| files are created.
|
| function_profile_enabled - to enable or disable profiling
|
| trace_stat/functions - the profiled functions.
|
| For example:
|
| echo 1 > /debugfs/tracing/function_profile_enabled
| ./hackbench 50
| echo 0 > /debugfs/tracing/function_profile_enabled
|
| yields:
|
| cat /debugfs/tracing/trace_stat/functions
|
| Function Hit
| -------- ---
| _spin_lock 10106442
| _spin_unlock 10097492
| kfree 6013704
| _spin_unlock_irqrestore 4423941
| _spin_lock_irqsave 4406825
| __phys_addr 4181686
| __slab_free 4038222
| dput 4030130
| path_put 4023387
| unroll_tree_refs 4019532
| [...]
|
| The most hit functions are listed first. Functions that are not
| hit are not listed.
|
| This feature depends on and uses dynamic function tracing. When
| function profiling is disabled, no overhead occurs. But it still
| takes up around 300KB to hold the data, thus it is not recommended
| to keep it enabled on systems low on memory.
|
| When a '1' is echoed into the function_profile_enabled file, the
| counters for each function are reset back to zero. Thus you can see
| which functions are hit most by different programs.
|
| Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
---
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 015a3d2..2acc94d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -153,6 +153,11 @@ struct dyn_ftrace {
unsigned long flags;
struct dyn_ftrace *newlist;
};
+#ifdef CONFIG_FUNCTION_PROFILER
+ unsigned long counter;
+ struct hlist_node node;
+#endif
+
struct dyn_arch_ftrace arch;
};
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index b0a46f8..cc81b96 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -377,6 +377,24 @@ config DYNAMIC_FTRACE
were made. If so, it runs stop_machine (stops all CPUS)
and modifies the code to jump over the call to ftrace.
+config FUNCTION_PROFILER
+ bool "Kernel function profiler"
+ depends on DYNAMIC_FTRACE
+ default n
+ help
+ This option enables the kernel function profiler. When the dynamic
+ function tracing is enabled, a counter is added into the function
+ records used by the dynamic function tracer. A file is created in
+ debugfs called function_profile_enabled which defaults to zero.
+ When a 1 is echoed into this file profiling begins, and when a
+ zero is entered, profiling stops. A file in the trace_stats
+ directory called functions, that show the list of functions that
+ have been hit and their counters.
+
+ This takes up around 320K more memory.
+
+ If in doubt, say N
+
config FTRACE_MCOUNT_RECORD
def_bool y
depends on DYNAMIC_FTRACE
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 7847806..ce74547 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -34,6 +34,7 @@
#include <asm/ftrace.h>
#include "trace.h"
+#include "trace_stat.h"
#define FTRACE_WARN_ON(cond) \
do { \
@@ -261,7 +262,6 @@ struct ftrace_func_probe {
struct rcu_head rcu;
};
-
enum {
FTRACE_ENABLE_CALLS = (1 << 0),
FTRACE_DISABLE_CALLS = (1 << 1),
@@ -309,6 +309,307 @@ static struct dyn_ftrace *ftrace_free_records;
} \
}
+#ifdef CONFIG_FUNCTION_PROFILER
+static struct hlist_head *ftrace_profile_hash;
+static int ftrace_profile_bits;
+static int ftrace_profile_enabled;
+static DEFINE_MUTEX(ftrace_profile_lock);
+
+static void *
+function_stat_next(void *v, int idx)
+{
+ struct dyn_ftrace *rec = v;
+ struct ftrace_page *pg;
+
+ pg = (struct ftrace_page *)((unsigned long)rec & PAGE_MASK);
+
+ again:
+ rec++;
+ if ((void *)rec >= (void *)&pg->records[pg->index]) {
+ pg = pg->next;
+ if (!pg)
+ return NULL;
+ rec = &pg->records[0];
+ }
+
+ if (rec->flags & FTRACE_FL_FREE ||
+ rec->flags & FTRACE_FL_FAILED ||
+ !(rec->flags & FTRACE_FL_CONVERTED) ||
+ /* ignore non hit functions */
+ !rec->counter)
+ goto again;
+
+ return rec;
+}
+
+static void *function_stat_start(void)
+{
+ return function_stat_next(&ftrace_pages_start->records[0], 0);
+}
+
+static int function_stat_cmp(void *p1, void *p2)
+{
+ struct dyn_ftrace *a = p1;
+ struct dyn_ftrace *b = p2;
+
+ if (a->counter < b->counter)
+ return -1;
+ if (a->counter > b->counter)
+ return 1;
+ else
+ return 0;
+}
+
+static int function_stat_headers(struct seq_file *m)
+{
+ seq_printf(m, " Function Hit\n"
+ " -------- ---\n");
+ return 0;
+}
+
+static int function_stat_show(struct seq_file *m, void *v)
+{
+ struct dyn_ftrace *rec = v;
+ char str[KSYM_SYMBOL_LEN];
+
+ kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
+
+ seq_printf(m, " %-30.30s %10lu\n", str, rec->counter);
+ return 0;
+}
+
+static struct tracer_stat function_stats = {
+ .name = "functions",
+ .stat_start = function_stat_start,
+ .stat_next = function_stat_next,
+ .stat_cmp = function_stat_cmp,
+ .stat_headers = function_stat_headers,
+ .stat_show = function_stat_show
+};
+
+static void ftrace_profile_init(int nr_funcs)
+{
+ unsigned long addr;
+ int order;
+ int size;
+
+ /*
+ * We are profiling all functions, lets make it 1/4th of the
+ * number of functions that are in core kernel. So we have to
+ * iterate 4 times.
+ */
+ order = (sizeof(struct hlist_head) * nr_funcs) / 4;
+ order = get_order(order);
+ size = 1 << (PAGE_SHIFT + order);
+
+ pr_info("Allocating %d KB for profiler hash\n", size >> 10);
+
+ addr = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
+ if (!addr) {
+ pr_warning("Could not allocate function profiler hash\n");
+ return;
+ }
+
+ ftrace_profile_hash = (void *)addr;
+
+ /*
+ * struct hlist_head should be a pointer of 4 or 8 bytes.
+ * And a simple bit manipulation can be done, but if for
+ * some reason struct hlist_head is not a mulitple of 2,
+ * then we play it safe, and simply count. This function
+ * is done once at boot up, so it is not that critical in
+ * performance.
+ */
+
+ size--;
+ size /= sizeof(struct hlist_head);
+
+ for (; size; size >>= 1)
+ ftrace_profile_bits++;
+
+ pr_info("Function profiler has %d hash buckets\n",
+ 1 << ftrace_profile_bits);
+
+ return;
+}
+
+static ssize_t
+ftrace_profile_read(struct file *filp, char __user *ubuf,
+ size_t cnt, loff_t *ppos)
+{
+ char buf[64];
+ int r;
+
+ r = sprintf(buf, "%u\n", ftrace_profile_enabled);
+ return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
+static void ftrace_profile_reset(void)
+{
+ struct dyn_ftrace *rec;
+ struct ftrace_page *pg;
+
+ do_for_each_ftrace_rec(pg, rec) {
+ rec->counter = 0;
+ } while_for_each_ftrace_rec();
+}
+
+static struct dyn_ftrace *ftrace_find_profiled_func(unsigned long ip)
+{
+ struct dyn_ftrace *rec;
+ struct hlist_head *hhd;
+ struct hlist_node *n;
+ unsigned long flags;
+ unsigned long key;
+
+ if (!ftrace_profile_hash)
+ return NULL;
+
+ key = hash_long(ip, ftrace_profile_bits);
+ hhd = &ftrace_profile_hash[key];
+
+ if (hlist_empty(hhd))
+ return NULL;
+
+ local_irq_save(flags);
+ hlist_for_each_entry_rcu(rec, n, hhd, node) {
+ if (rec->ip == ip)
+ goto out;
+ }
+ rec = NULL;
+ out:
+ local_irq_restore(flags);
+
+ return rec;
+}
+
+static void
+function_profile_call(unsigned long ip, unsigned long parent_ip)
+{
+ struct dyn_ftrace *rec;
+ unsigned long flags;
+
+ if (!ftrace_profile_enabled)
+ return;
+
+ local_irq_save(flags);
+ rec = ftrace_find_profiled_func(ip);
+ if (!rec)
+ goto out;
+
+ rec->counter++;
+ out:
+ local_irq_restore(flags);
+}
+
+static struct ftrace_ops ftrace_profile_ops __read_mostly =
+{
+ .func = function_profile_call,
+};
+
+static ssize_t
+ftrace_profile_write(struct file *filp, const char __user *ubuf,
+ size_t cnt, loff_t *ppos)
+{
+ unsigned long val;
+ char buf[64];
+ int ret;
+
+ if (!ftrace_profile_hash) {
+ pr_info("Can not enable hash due to earlier problems\n");
+ return -ENODEV;
+ }
+
+ if (cnt >= sizeof(buf))
+ return -EINVAL;
+
+ if (copy_from_user(&buf, ubuf, cnt))
+ return -EFAULT;
+
+ buf[cnt] = 0;
+
+ ret = strict_strtoul(buf, 10, &val);
+ if (ret < 0)
+ return ret;
+
+ val = !!val;
+
+ mutex_lock(&ftrace_profile_lock);
+ if (ftrace_profile_enabled ^ val) {
+ if (val) {
+ ftrace_profile_reset();
+ register_ftrace_function(&ftrace_profile_ops);
+ ftrace_profile_enabled = 1;
+ } else {
+ ftrace_profile_enabled = 0;
+ unregister_ftrace_function(&ftrace_profile_ops);
+ }
+ }
+ mutex_unlock(&ftrace_profile_lock);
+
+ filp->f_pos += cnt;
+
+ return cnt;
+}
+
+static const struct file_operations ftrace_profile_fops = {
+ .open = tracing_open_generic,
+ .read = ftrace_profile_read,
+ .write = ftrace_profile_write,
+};
+
+static void ftrace_profile_debugfs(struct dentry *d_tracer)
+{
+ struct dentry *entry;
+ int ret;
+
+ ret = register_stat_tracer(&function_stats);
+ if (ret) {
+ pr_warning("Warning: could not register "
+ "function stats\n");
+ return;
+ }
+
+ entry = debugfs_create_file("function_profile_enabled", 0644,
+ d_tracer, NULL, &ftrace_profile_fops);
+ if (!entry)
+ pr_warning("Could not create debugfs "
+ "'function_profile_enabled' entry\n");
+}
+
+static void ftrace_add_profile(struct dyn_ftrace *rec)
+{
+ unsigned long key;
+
+ if (!ftrace_profile_hash)
+ return;
+
+ key = hash_long(rec->ip, ftrace_profile_bits);
+ hlist_add_head_rcu(&rec->node, &ftrace_profile_hash[key]);
+}
+
+static void ftrace_profile_release(struct dyn_ftrace *rec)
+{
+ mutex_lock(&ftrace_profile_lock);
+ hlist_del(&rec->node);
+ mutex_unlock(&ftrace_profile_lock);
+}
+
+#else /* CONFIG_FUNCTION_PROFILER */
+static void ftrace_profile_init(int nr_funcs)
+{
+}
+static void ftrace_add_profile(struct dyn_ftrace *rec)
+{
+}
+static void ftrace_profile_debugfs(struct dentry *d_tracer)
+{
+}
+static void ftrace_profile_release(struct dyn_ftrace *rec)
+{
+}
+#endif /* CONFIG_FUNCTION_PROFILER */
+
#ifdef CONFIG_KPROBES
static int frozen_record_count;
@@ -359,8 +660,10 @@ void ftrace_release(void *start, unsigned long size)
mutex_lock(&ftrace_lock);
do_for_each_ftrace_rec(pg, rec) {
if ((rec->ip >= s) && (rec->ip < e) &&
- !(rec->flags & FTRACE_FL_FREE))
+ !(rec->flags & FTRACE_FL_FREE)) {
ftrace_free_rec(rec);
+ ftrace_profile_release(rec);
+ }
} while_for_each_ftrace_rec();
mutex_unlock(&ftrace_lock);
}
@@ -414,6 +717,8 @@ ftrace_record_ip(unsigned long ip)
rec->newlist = ftrace_new_addrs;
ftrace_new_addrs = rec;
+ ftrace_add_profile(rec);
+
return rec;
}
@@ -2157,6 +2462,8 @@ static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer)
"'set_graph_function' entry\n");
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+ ftrace_profile_debugfs(d_tracer);
+
return 0;
}
@@ -2225,6 +2532,8 @@ void __init ftrace_init(void)
if (ret)
goto failed;
+ ftrace_profile_init(count);
+
last_ftrace_enabled = ftrace_enabled = 1;
ret = ftrace_convert_nops(NULL,
--
[ Luis Claudio R. Goncalves Red Hat - Realtime Team ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
* [PATCH 2/9] reduce size of memory in function profiler
From: Luis Claudio R. Goncalves @ 2009-05-19 14:45 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
Impact: reduce size of memory in function profiler
Backport to 2.6.29.3-rt14.
| From: Steven Rostedt <srostedt@redhat.com>
|
| Impact: reduce size of memory in function profiler
|
| The function profiler originally introduced its counters into the
| function records themselves. There are around 20 thousand different
| functions on a normal system, and that adds 20 thousand counters for
| profiling even when they are not needed.
|
| A normal run of the profiler yields only a couple of thousand functions
| executed, depending on what is being profiled. This means we have around
| 18 thousand useless counters.
|
| This patch rectifies this by moving the data out of the function
| records used by dynamic ftrace. Data is preallocated to hold the functions
| when the profiling begins. Checks are made during profiling to see if
| more records should be allocated, and they are allocated if it is safe
| to do so.
|
| This also removes the dependency on dynamic ftrace, and also removes
| the overhead of having it enabled.
|
| Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
include/linux/ftrace.h | 4 -
kernel/trace/Kconfig | 10 +-
kernel/trace/ftrace.c | 440 ++++++++++++++++++++++++++++--------------------
3 files changed, 263 insertions(+), 191 deletions(-)
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 0456c3a..015a3d2 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -153,11 +153,6 @@ struct dyn_ftrace {
unsigned long flags;
struct dyn_ftrace *newlist;
};
-#ifdef CONFIG_FUNCTION_PROFILER
- unsigned long counter;
- struct hlist_node node;
-#endif
-
struct dyn_arch_ftrace arch;
};
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 95e9ad5..8a41360 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -379,20 +379,16 @@ config DYNAMIC_FTRACE
config FUNCTION_PROFILER
bool "Kernel function profiler"
- depends on DYNAMIC_FTRACE
+ depends on FUNCTION_TRACER
default n
help
- This option enables the kernel function profiler. When the dynamic
- function tracing is enabled, a counter is added into the function
- records used by the dynamic function tracer. A file is created in
- debugfs called function_profile_enabled which defaults to zero.
+ This option enables the kernel function profiler. A file is created
+ in debugfs called function_profile_enabled which defaults to zero.
When a 1 is echoed into this file profiling begins, and when a
zero is entered, profiling stops. A file in the trace_stats
directory called functions, that show the list of functions that
have been hit and their counters.
- This takes up around 320K more memory.
-
If in doubt, say N
config FTRACE_MCOUNT_RECORD
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 11f364c..24dac44 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -241,87 +241,48 @@ static void ftrace_update_pid_func(void)
#endif
}
-/* set when tracing only a pid */
-struct pid *ftrace_pid_trace;
-static struct pid * const ftrace_swapper_pid = &init_struct_pid;
-
-#ifdef CONFIG_DYNAMIC_FTRACE
-
-#ifndef CONFIG_FTRACE_MCOUNT_RECORD
-# error Dynamic ftrace depends on MCOUNT_RECORD
-#endif
-
-static struct hlist_head ftrace_func_hash[FTRACE_FUNC_HASHSIZE] __read_mostly;
-
-struct ftrace_func_probe {
- struct hlist_node node;
- struct ftrace_probe_ops *ops;
- unsigned long flags;
- unsigned long ip;
- void *data;
- struct rcu_head rcu;
+#ifdef CONFIG_FUNCTION_PROFILER
+struct ftrace_profile {
+ struct hlist_node node;
+ unsigned long ip;
+ unsigned long counter;
};
-enum {
- FTRACE_ENABLE_CALLS = (1 << 0),
- FTRACE_DISABLE_CALLS = (1 << 1),
- FTRACE_UPDATE_TRACE_FUNC = (1 << 2),
- FTRACE_ENABLE_MCOUNT = (1 << 3),
- FTRACE_DISABLE_MCOUNT = (1 << 4),
- FTRACE_START_FUNC_RET = (1 << 5),
- FTRACE_STOP_FUNC_RET = (1 << 6),
+struct ftrace_profile_page {
+ struct ftrace_profile_page *next;
+ unsigned long index;
+ struct ftrace_profile records[];
};
-static int ftrace_filtered;
+#define PROFILE_RECORDS_SIZE \
+ (PAGE_SIZE - offsetof(struct ftrace_profile_page, records))
-static struct dyn_ftrace *ftrace_new_addrs;
+#define PROFILES_PER_PAGE \
+ (PROFILE_RECORDS_SIZE / sizeof(struct ftrace_profile))
-static DEFINE_MUTEX(ftrace_regex_lock);
-
-struct ftrace_page {
- struct ftrace_page *next;
- int index;
- struct dyn_ftrace records[];
-};
+/* TODO: make these percpu, to prevent cache line bouncing */
+static struct ftrace_profile_page *profile_pages_start;
+static struct ftrace_profile_page *profile_pages;
-#define ENTRIES_PER_PAGE \
- ((PAGE_SIZE - sizeof(struct ftrace_page)) / sizeof(struct dyn_ftrace))
-
-/* estimate from running different kernels */
-#define NR_TO_INIT 10000
-
-static struct ftrace_page *ftrace_pages_start;
-static struct ftrace_page *ftrace_pages;
-
-static struct dyn_ftrace *ftrace_free_records;
-
-/*
- * This is a double for. Do not use 'break' to break out of the loop,
- * you must use a goto.
- */
-#define do_for_each_ftrace_rec(pg, rec) \
- for (pg = ftrace_pages_start; pg; pg = pg->next) { \
- int _____i; \
- for (_____i = 0; _____i < pg->index; _____i++) { \
- rec = &pg->records[_____i];
-
-#define while_for_each_ftrace_rec() \
- } \
- }
-
-#ifdef CONFIG_FUNCTION_PROFILER
static struct hlist_head *ftrace_profile_hash;
static int ftrace_profile_bits;
static int ftrace_profile_enabled;
static DEFINE_MUTEX(ftrace_profile_lock);
+static DEFINE_PER_CPU(atomic_t, ftrace_profile_disable);
+
+#define FTRACE_PROFILE_HASH_SIZE 1024 /* must be power of 2 */
+
+static raw_spinlock_t ftrace_profile_rec_lock =
+ (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
+
static void *
function_stat_next(void *v, int idx)
{
- struct dyn_ftrace *rec = v;
- struct ftrace_page *pg;
+ struct ftrace_profile *rec = v;
+ struct ftrace_profile_page *pg;
- pg = (struct ftrace_page *)((unsigned long)rec & PAGE_MASK);
+ pg = (struct ftrace_profile_page *)((unsigned long)rec & PAGE_MASK);
again:
rec++;
@@ -330,27 +291,22 @@ function_stat_next(void *v, int idx)
if (!pg)
return NULL;
rec = &pg->records[0];
+ if (!rec->counter)
+ goto again;
}
- if (rec->flags & FTRACE_FL_FREE ||
- rec->flags & FTRACE_FL_FAILED ||
- !(rec->flags & FTRACE_FL_CONVERTED) ||
- /* ignore non hit functions */
- !rec->counter)
- goto again;
-
return rec;
}
static void *function_stat_start(void)
{
- return function_stat_next(&ftrace_pages_start->records[0], 0);
+ return function_stat_next(&profile_pages_start->records[0], 0);
}
static int function_stat_cmp(void *p1, void *p2)
{
- struct dyn_ftrace *a = p1;
- struct dyn_ftrace *b = p2;
+ struct ftrace_profile *a = p1;
+ struct ftrace_profile *b = p2;
if (a->counter < b->counter)
return -1;
@@ -369,7 +325,7 @@ static int function_stat_headers(struct seq_file *m)
static int function_stat_show(struct seq_file *m, void *v)
{
- struct dyn_ftrace *rec = v;
+ struct ftrace_profile *rec = v;
char str[KSYM_SYMBOL_LEN];
kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
@@ -387,115 +343,191 @@ static struct tracer_stat function_stats = {
.stat_show = function_stat_show
};
-static void ftrace_profile_init(int nr_funcs)
+static void ftrace_profile_reset(void)
{
- unsigned long addr;
- int order;
- int size;
+ struct ftrace_profile_page *pg;
- /*
- * We are profiling all functions, lets make it 1/4th of the
- * number of functions that are in core kernel. So we have to
- * iterate 4 times.
- */
- order = (sizeof(struct hlist_head) * nr_funcs) / 4;
- order = get_order(order);
- size = 1 << (PAGE_SHIFT + order);
-
- pr_info("Allocating %d KB for profiler hash\n", size >> 10);
+ pg = profile_pages = profile_pages_start;
- addr = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
- if (!addr) {
- pr_warning("Could not allocate function profiler hash\n");
- return;
+ while (pg) {
+ memset(pg->records, 0, PROFILE_RECORDS_SIZE);
+ pg->index = 0;
+ pg = pg->next;
}
- ftrace_profile_hash = (void *)addr;
+ memset(ftrace_profile_hash, 0,
+ FTRACE_PROFILE_HASH_SIZE * sizeof(struct hlist_head));
+}
- /*
- * struct hlist_head should be a pointer of 4 or 8 bytes.
- * And a simple bit manipulation can be done, but if for
- * some reason struct hlist_head is not a mulitple of 2,
- * then we play it safe, and simply count. This function
- * is done once at boot up, so it is not that critical in
- * performance.
- */
+int ftrace_profile_pages_init(void)
+{
+ struct ftrace_profile_page *pg;
+ int i;
- size--;
- size /= sizeof(struct hlist_head);
+ /* If we already allocated, do nothing */
+ if (profile_pages)
+ return 0;
- for (; size; size >>= 1)
- ftrace_profile_bits++;
+ profile_pages = (void *)get_zeroed_page(GFP_KERNEL);
+ if (!profile_pages)
+ return -ENOMEM;
- pr_info("Function profiler has %d hash buckets\n",
- 1 << ftrace_profile_bits);
+ pg = profile_pages_start = profile_pages;
- return;
+ /* allocate 10 more pages to start */
+ for (i = 0; i < 10; i++) {
+ pg->next = (void *)get_zeroed_page(GFP_KERNEL);
+ /*
+ * We only care about allocating profile_pages, if
+ * we failed to allocate here, hopefully we will allocate
+ * later.
+ */
+ if (!pg->next)
+ break;
+ pg = pg->next;
+ }
+
+ return 0;
}
-static ssize_t
-ftrace_profile_read(struct file *filp, char __user *ubuf,
- size_t cnt, loff_t *ppos)
+static int ftrace_profile_init(void)
{
- char buf[64];
- int r;
+ int size;
- r = sprintf(buf, "%u\n", ftrace_profile_enabled);
- return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
-}
+ if (ftrace_profile_hash) {
+ /* If the profile is already created, simply reset it */
+ ftrace_profile_reset();
+ return 0;
+ }
-static void ftrace_profile_reset(void)
-{
- struct dyn_ftrace *rec;
- struct ftrace_page *pg;
+ /*
+ * We are profiling all functions, but usually only a few thousand
+ * functions are hit. We'll make a hash of 1024 items.
+ */
+ size = FTRACE_PROFILE_HASH_SIZE;
- do_for_each_ftrace_rec(pg, rec) {
- rec->counter = 0;
- } while_for_each_ftrace_rec();
+ ftrace_profile_hash =
+ kzalloc(sizeof(struct hlist_head) * size, GFP_KERNEL);
+
+ if (!ftrace_profile_hash)
+ return -ENOMEM;
+
+ size--;
+
+ for (; size; size >>= 1)
+ ftrace_profile_bits++;
+
+ /* Preallocate a few pages */
+ if (ftrace_profile_pages_init() < 0) {
+ kfree(ftrace_profile_hash);
+ ftrace_profile_hash = NULL;
+ return -ENOMEM;
+ }
+
+ return 0;
}
-static struct dyn_ftrace *ftrace_find_profiled_func(unsigned long ip)
+/* interrupts must be disabled */
+static struct ftrace_profile *ftrace_find_profiled_func(unsigned long ip)
{
- struct dyn_ftrace *rec;
+ struct ftrace_profile *rec;
struct hlist_head *hhd;
struct hlist_node *n;
- unsigned long flags;
unsigned long key;
- if (!ftrace_profile_hash)
- return NULL;
-
key = hash_long(ip, ftrace_profile_bits);
hhd = &ftrace_profile_hash[key];
if (hlist_empty(hhd))
return NULL;
- local_irq_save(flags);
hlist_for_each_entry_rcu(rec, n, hhd, node) {
if (rec->ip == ip)
- goto out;
+ return rec;
+ }
+
+ return NULL;
+}
+
+static void ftrace_add_profile(struct ftrace_profile *rec)
+{
+ unsigned long key;
+
+ key = hash_long(rec->ip, ftrace_profile_bits);
+ hlist_add_head_rcu(&rec->node, &ftrace_profile_hash[key]);
+}
+
+/* Interrupts must be disabled calling this */
+static struct ftrace_profile *
+ftrace_profile_alloc(unsigned long ip, bool alloc_safe)
+{
+ struct ftrace_profile *rec = NULL;
+
+ /* prevent recursion */
+ if (atomic_inc_return(&__get_cpu_var(ftrace_profile_disable)) != 1)
+ goto out;
+
+ __raw_spin_lock(&ftrace_profile_rec_lock);
+
+ /* Try to always keep another page available */
+ if (!profile_pages->next && alloc_safe)
+ profile_pages->next = (void *)get_zeroed_page(GFP_ATOMIC);
+
+ /*
+ * Try to find the function again since another
+ * task on another CPU could have added it
+ */
+ rec = ftrace_find_profiled_func(ip);
+ if (rec)
+ goto out_unlock;
+
+ if (profile_pages->index == PROFILES_PER_PAGE) {
+ if (!profile_pages->next)
+ goto out_unlock;
+ profile_pages = profile_pages->next;
}
- rec = NULL;
+
+ rec = &profile_pages->records[profile_pages->index++];
+ rec->ip = ip;
+ ftrace_add_profile(rec);
+
+ out_unlock:
+ __raw_spin_unlock(&ftrace_profile_rec_lock);
out:
- local_irq_restore(flags);
+ atomic_dec(&__get_cpu_var(ftrace_profile_disable));
return rec;
}
+/*
+ * If we are not in an interrupt, or softirq and
+ * and interrupts are disabled and preemption is not enabled
+ * (not in a spinlock) then it should be safe to allocate memory.
+ */
+static bool ftrace_safe_to_allocate(void)
+{
+ return !in_interrupt() && irqs_disabled() && !preempt_count();
+}
+
static void
function_profile_call(unsigned long ip, unsigned long parent_ip)
{
- struct dyn_ftrace *rec;
+ struct ftrace_profile *rec;
unsigned long flags;
+ bool alloc_safe;
if (!ftrace_profile_enabled)
return;
+ alloc_safe = ftrace_safe_to_allocate();
+
local_irq_save(flags);
rec = ftrace_find_profiled_func(ip);
- if (!rec)
- goto out;
+ if (!rec) {
+ rec = ftrace_profile_alloc(ip, alloc_safe);
+ if (!rec)
+ goto out;
+ }
rec->counter++;
out:
@@ -515,11 +547,6 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
char buf[64];
int ret;
- if (!ftrace_profile_hash) {
- pr_info("Can not enable hash due to earlier problems\n");
- return -ENODEV;
- }
-
if (cnt >= sizeof(buf))
return -EINVAL;
@@ -537,7 +564,12 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
mutex_lock(&ftrace_profile_lock);
if (ftrace_profile_enabled ^ val) {
if (val) {
- ftrace_profile_reset();
+ ret = ftrace_profile_init();
+ if (ret < 0) {
+ cnt = ret;
+ goto out;
+ }
+
register_ftrace_function(&ftrace_profile_ops);
ftrace_profile_enabled = 1;
} else {
@@ -545,6 +577,7 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
unregister_ftrace_function(&ftrace_profile_ops);
}
}
+ out:
mutex_unlock(&ftrace_profile_lock);
filp->f_pos += cnt;
@@ -552,6 +585,17 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
return cnt;
}
+static ssize_t
+ftrace_profile_read(struct file *filp, char __user *ubuf,
+ size_t cnt, loff_t *ppos)
+{
+ char buf[64];
+ int r;
+
+ r = sprintf(buf, "%u\n", ftrace_profile_enabled);
+ return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
static const struct file_operations ftrace_profile_fops = {
.open = tracing_open_generic,
.read = ftrace_profile_read,
@@ -577,39 +621,80 @@ static void ftrace_profile_debugfs(struct dentry *d_tracer)
"'function_profile_enabled' entry\n");
}
-static void ftrace_add_profile(struct dyn_ftrace *rec)
-{
- unsigned long key;
-
- if (!ftrace_profile_hash)
- return;
-
- key = hash_long(rec->ip, ftrace_profile_bits);
- hlist_add_head_rcu(&rec->node, &ftrace_profile_hash[key]);
-}
-
-static void ftrace_profile_release(struct dyn_ftrace *rec)
-{
- mutex_lock(&ftrace_profile_lock);
- hlist_del(&rec->node);
- mutex_unlock(&ftrace_profile_lock);
-}
-
#else /* CONFIG_FUNCTION_PROFILER */
-static void ftrace_profile_init(int nr_funcs)
-{
-}
-static void ftrace_add_profile(struct dyn_ftrace *rec)
-{
-}
static void ftrace_profile_debugfs(struct dentry *d_tracer)
{
}
-static void ftrace_profile_release(struct dyn_ftrace *rec)
-{
-}
#endif /* CONFIG_FUNCTION_PROFILER */
+/* set when tracing only a pid */
+struct pid *ftrace_pid_trace;
+static struct pid * const ftrace_swapper_pid = &init_struct_pid;
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+#ifndef CONFIG_FTRACE_MCOUNT_RECORD
+# error Dynamic ftrace depends on MCOUNT_RECORD
+#endif
+
+static struct hlist_head ftrace_func_hash[FTRACE_FUNC_HASHSIZE] __read_mostly;
+
+struct ftrace_func_probe {
+ struct hlist_node node;
+ struct ftrace_probe_ops *ops;
+ unsigned long flags;
+ unsigned long ip;
+ void *data;
+ struct rcu_head rcu;
+};
+
+enum {
+ FTRACE_ENABLE_CALLS = (1 << 0),
+ FTRACE_DISABLE_CALLS = (1 << 1),
+ FTRACE_UPDATE_TRACE_FUNC = (1 << 2),
+ FTRACE_ENABLE_MCOUNT = (1 << 3),
+ FTRACE_DISABLE_MCOUNT = (1 << 4),
+ FTRACE_START_FUNC_RET = (1 << 5),
+ FTRACE_STOP_FUNC_RET = (1 << 6),
+};
+
+static int ftrace_filtered;
+
+static struct dyn_ftrace *ftrace_new_addrs;
+
+static DEFINE_MUTEX(ftrace_regex_lock);
+
+struct ftrace_page {
+ struct ftrace_page *next;
+ int index;
+ struct dyn_ftrace records[];
+};
+
+#define ENTRIES_PER_PAGE \
+ ((PAGE_SIZE - sizeof(struct ftrace_page)) / sizeof(struct dyn_ftrace))
+
+/* estimate from running different kernels */
+#define NR_TO_INIT 10000
+
+static struct ftrace_page *ftrace_pages_start;
+static struct ftrace_page *ftrace_pages;
+
+static struct dyn_ftrace *ftrace_free_records;
+
+/*
+ * This is a double for. Do not use 'break' to break out of the loop,
+ * you must use a goto.
+ */
+#define do_for_each_ftrace_rec(pg, rec) \
+ for (pg = ftrace_pages_start; pg; pg = pg->next) { \
+ int _____i; \
+ for (_____i = 0; _____i < pg->index; _____i++) { \
+ rec = &pg->records[_____i];
+
+#define while_for_each_ftrace_rec() \
+ } \
+ }
+
#ifdef CONFIG_KPROBES
static int frozen_record_count;
@@ -660,10 +745,8 @@ void ftrace_release(void *start, unsigned long size)
mutex_lock(&ftrace_lock);
do_for_each_ftrace_rec(pg, rec) {
if ((rec->ip >= s) && (rec->ip < e) &&
- !(rec->flags & FTRACE_FL_FREE)) {
+ !(rec->flags & FTRACE_FL_FREE))
ftrace_free_rec(rec);
- ftrace_profile_release(rec);
- }
} while_for_each_ftrace_rec();
mutex_unlock(&ftrace_lock);
}
@@ -717,8 +800,6 @@ ftrace_record_ip(unsigned long ip)
rec->newlist = ftrace_new_addrs;
ftrace_new_addrs = rec;
- ftrace_add_profile(rec);
-
return rec;
}
@@ -2462,8 +2543,6 @@ static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer)
"'set_graph_function' entry\n");
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
- ftrace_profile_debugfs(d_tracer);
-
return 0;
}
@@ -2532,8 +2611,6 @@ void __init ftrace_init(void)
if (ret)
goto failed;
- ftrace_profile_init(count);
-
last_ftrace_enabled = ftrace_enabled = 1;
ret = ftrace_convert_nops(NULL,
@@ -2734,6 +2811,9 @@ static __init int ftrace_init_debugfs(void)
if (!entry)
pr_warning("Could not create debugfs "
"'set_ftrace_pid' entry\n");
+
+ ftrace_profile_debugfs(d_tracer);
+
return 0;
}
fs_initcall(ftrace_init_debugfs);
--
1.6.2
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
* [PATCH 3/9] tracing: adding function timings to function profile
From: Luis Claudio R. Goncalves @ 2009-05-19 14:48 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
tracing: adding function timings to function profile
Backport to 2.6.29.3-rt14.
| From: Steven Rostedt <srostedt@redhat.com>
|
| If the function graph trace is enabled, the function profiler will
| use it to take the timing of the functions.
|
| cat /debug/tracing/trace_stat/functions
|
| Function Hit Time
| -------- --- ----
| mwait_idle 127 183028.4 us
| schedule 26 151997.7 us
| __schedule 31 151975.1 us
| sys_wait4 2 74080.53 us
| do_wait 2 74077.80 us
| sys_newlstat 138 39929.16 us
| do_path_lookup 179 39845.79 us
| vfs_lstat_fd 138 39761.97 us
| user_path_at 153 39469.58 us
| path_walk 179 39435.76 us
| __link_path_walk 189 39143.73 us
| [...]
|
| Note the times are skewed due to the function graph tracer not taking
| into account schedules.
|
| Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
kernel/trace/ftrace.c | 93 ++++++++++++++++++++++++++++++++--
kernel/trace/trace.c | 11 ----
kernel/trace/trace.h | 3 +-
kernel/trace/trace_functions_graph.c | 17 +++++-
kernel/trace/trace_output.c | 10 ++++
kernel/trace/trace_output.h | 2 +
6 files changed, 117 insertions(+), 19 deletions(-)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 24dac44..a9ccd71 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -33,7 +33,7 @@
#include <asm/ftrace.h>
-#include "trace.h"
+#include "trace_output.h"
#include "trace_stat.h"
#define FTRACE_WARN_ON(cond) \
@@ -246,6 +246,9 @@ struct ftrace_profile {
struct hlist_node node;
unsigned long ip;
unsigned long counter;
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ unsigned long long time;
+#endif
};
struct ftrace_profile_page {
@@ -303,6 +306,22 @@ static void *function_stat_start(struct tracer_stat *trace)
return function_stat_next(&profile_pages_start->records[0], 0);
}
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+/* function graph compares on total time */
+static int function_stat_cmp(void *p1, void *p2)
+{
+ struct ftrace_profile *a = p1;
+ struct ftrace_profile *b = p2;
+
+ if (a->time < b->time)
+ return -1;
+ if (a->time > b->time)
+ return 1;
+ else
+ return 0;
+}
+#else
+/* not function graph compares against hits */
static int function_stat_cmp(void *p1, void *p2)
{
struct ftrace_profile *a = p1;
@@ -315,11 +334,17 @@ static int function_stat_cmp(void *p1, void *p2)
else
return 0;
}
+#endif
static int function_stat_headers(struct seq_file *m)
{
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ seq_printf(m, " Function Hit Time\n"
+ " -------- --- ----\n");
+#else
seq_printf(m, " Function Hit\n"
" -------- ---\n");
+#endif
return 0;
}
@@ -327,10 +352,25 @@ static int function_stat_show(struct seq_file *m, void *v)
{
struct ftrace_profile *rec = v;
char str[KSYM_SYMBOL_LEN];
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ static struct trace_seq s;
+ static DEFINE_MUTEX(mutex);
+
+ mutex_lock(&mutex);
+ trace_seq_init(&s);
+ trace_print_graph_duration(rec->time, &s);
+#endif
kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
+ seq_printf(m, " %-30.30s %10lu", str, rec->counter);
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ seq_printf(m, " ");
+ trace_print_seq(m, &s);
+ mutex_unlock(&mutex);
+#endif
+ seq_putc(m, '\n');
- seq_printf(m, " %-30.30s %10lu\n", str, rec->counter);
return 0;
}
@@ -534,11 +574,52 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
local_irq_restore(flags);
}
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+static int profile_graph_entry(struct ftrace_graph_ent *trace)
+{
+ function_profile_call(trace->func, 0);
+ return 1;
+}
+
+static void profile_graph_return(struct ftrace_graph_ret *trace)
+{
+ unsigned long flags;
+ struct ftrace_profile *rec;
+
+ local_irq_save(flags);
+ rec = ftrace_find_profiled_func(trace->func);
+ if (rec)
+ rec->time += trace->rettime - trace->calltime;
+ local_irq_restore(flags);
+}
+
+static int register_ftrace_profiler(void)
+{
+ return register_ftrace_graph(&profile_graph_return,
+ &profile_graph_entry);
+}
+
+static void unregister_ftrace_profiler(void)
+{
+ unregister_ftrace_graph();
+}
+#else
static struct ftrace_ops ftrace_profile_ops __read_mostly =
{
.func = function_profile_call,
};
+static int register_ftrace_profiler(void)
+{
+ return register_ftrace_function(&ftrace_profile_ops);
+}
+
+static void unregister_ftrace_profiler(void)
+{
+ unregister_ftrace_function(&ftrace_profile_ops);
+}
+#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+
static ssize_t
ftrace_profile_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *ppos)
@@ -570,11 +651,15 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
goto out;
}
- register_ftrace_function(&ftrace_profile_ops);
+ ret = register_ftrace_profiler();
+ if (ret < 0) {
+ cnt = ret;
+ goto out;
+ }
ftrace_profile_enabled = 1;
} else {
ftrace_profile_enabled = 0;
- unregister_ftrace_function(&ftrace_profile_ops);
+ unregister_ftrace_profiler();
}
}
out:
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 67c6a21..821bf49 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -405,17 +405,6 @@ static ssize_t trace_seq_to_buffer(struct trace_seq *s, void *buf, size_t cnt)
return cnt;
}
-static void
-trace_print_seq(struct seq_file *m, struct trace_seq *s)
-{
- int len = s->len >= PAGE_SIZE ? PAGE_SIZE - 1 : s->len;
-
- s->buffer[len] = 0;
- seq_puts(m, s->buffer);
-
- trace_seq_init(s);
-}
-
/**
* update_max_tr - snapshot all trace buffers from global_trace to max_tr
* @tr: tracer
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index d7410bb..c66ca3b 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -613,6 +613,8 @@ extern unsigned long trace_flags;
/* Standard output formatting function used for function return traces */
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
extern enum print_line_t print_graph_function(struct trace_iterator *iter);
+extern enum print_line_t
+trace_print_graph_duration(unsigned long long duration, struct trace_seq *s);
#ifdef CONFIG_DYNAMIC_FTRACE
/* TODO: make this variable */
@@ -644,7 +646,6 @@ static inline int ftrace_graph_addr(unsigned long addr)
return 1;
}
#endif /* CONFIG_DYNAMIC_FTRACE */
-
#else /* CONFIG_FUNCTION_GRAPH_TRACER */
static inline enum print_line_t
print_graph_function(struct trace_iterator *iter)
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index d28687e..85bba0f 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -426,8 +426,8 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr,
return TRACE_TYPE_HANDLED;
}
-static enum print_line_t
-print_graph_duration(unsigned long long duration, struct trace_seq *s)
+enum print_line_t
+trace_print_graph_duration(unsigned long long duration, struct trace_seq *s)
{
unsigned long nsecs_rem = do_div(duration, 1000);
/* log10(ULONG_MAX) + '\0' */
@@ -464,12 +464,23 @@ print_graph_duration(unsigned long long duration, struct trace_seq *s)
if (!ret)
return TRACE_TYPE_PARTIAL_LINE;
}
+ return TRACE_TYPE_HANDLED;
+}
+
+static enum print_line_t
+print_graph_duration(unsigned long long duration, struct trace_seq *s)
+{
+ int ret;
+
+ ret = trace_print_graph_duration(duration, s);
+ if (ret != TRACE_TYPE_HANDLED)
+ return ret;
ret = trace_seq_printf(s, "| ");
if (!ret)
return TRACE_TYPE_PARTIAL_LINE;
- return TRACE_TYPE_HANDLED;
+ return TRACE_TYPE_HANDLED;
}
/* Case of a leaf function on its call entry */
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 19261fd..a3b6e3f 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -19,6 +19,16 @@ static struct hlist_head event_hash[EVENT_HASHSIZE] __read_mostly;
static int next_event_type = __TRACE_LAST_TYPE + 1;
+void trace_print_seq(struct seq_file *m, struct trace_seq *s)
+{
+ int len = s->len >= PAGE_SIZE ? PAGE_SIZE - 1 : s->len;
+
+ s->buffer[len] = 0;
+ seq_puts(m, s->buffer);
+
+ trace_seq_init(s);
+}
+
enum print_line_t trace_print_bprintk_msg_only(struct trace_iterator *iter)
{
struct trace_seq *s = &iter->seq;
diff --git a/kernel/trace/trace_output.h b/kernel/trace/trace_output.h
index 35c422f..1eac297 100644
--- a/kernel/trace/trace_output.h
+++ b/kernel/trace/trace_output.h
@@ -20,6 +20,8 @@ trace_print_bprintk_msg_only(struct trace_iterator *iter);
extern enum print_line_t
trace_print_printk_msg_only(struct trace_iterator *iter);
+extern void trace_print_seq(struct seq_file *m, struct trace_seq *s);
+
extern int trace_seq_printf(struct trace_seq *s, const char *fmt, ...)
__attribute__ ((format (printf, 2, 3)));
extern int
--
1.6.2
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
* [PATCH 4/9] tracing: make the function profiler per cpu
From: Luis Claudio R. Goncalves @ 2009-05-19 14:52 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
Impact: speed enhancement
Backport to 2.6.29.3-rt14.
| From: Steven Rostedt <srostedt@redhat.com>
|
| Impact: speed enhancement
|
| By making the function profiler record into per-cpu data we not only
| get better readings and avoid races, we also do not have to take any
| locks.
|
| Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
kernel/trace/ftrace.c | 201 ++++++++++++++++++++++++++++++++-----------------
1 files changed, 131 insertions(+), 70 deletions(-)
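[ Backport note: with this change the single trace_stat/functions file is
  replaced by one stat file per CPU, named function0, function1, and so on.
  A rough usage sketch, assuming debugfs is mounted at /debug as in the
  earlier examples (paths are illustrative):

    echo 1 > /debug/tracing/function_profile_enabled
    ./hackbench 50
    echo 0 > /debug/tracing/function_profile_enabled
    for f in /debug/tracing/trace_stat/function*; do
            echo "== $f =="; head -5 "$f"
    done
]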
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a9ccd71..ed1fc50 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -257,28 +257,28 @@ struct ftrace_profile_page {
struct ftrace_profile records[];
};
+struct ftrace_profile_stat {
+ atomic_t disabled;
+ struct hlist_head *hash;
+ struct ftrace_profile_page *pages;
+ struct ftrace_profile_page *start;
+ struct tracer_stat stat;
+};
+
#define PROFILE_RECORDS_SIZE \
(PAGE_SIZE - offsetof(struct ftrace_profile_page, records))
#define PROFILES_PER_PAGE \
(PROFILE_RECORDS_SIZE / sizeof(struct ftrace_profile))
-/* TODO: make these percpu, to prevent cache line bouncing */
-static struct ftrace_profile_page *profile_pages_start;
-static struct ftrace_profile_page *profile_pages;
-
-static struct hlist_head *ftrace_profile_hash;
static int ftrace_profile_bits;
static int ftrace_profile_enabled;
static DEFINE_MUTEX(ftrace_profile_lock);
-static DEFINE_PER_CPU(atomic_t, ftrace_profile_disable);
+static DEFINE_PER_CPU(struct ftrace_profile_stat, ftrace_profile_stats);
#define FTRACE_PROFILE_HASH_SIZE 1024 /* must be power of 2 */
-static raw_spinlock_t ftrace_profile_rec_lock =
- (raw_spinlock_t)__RAW_SPIN_LOCK_UNLOCKED;
-
static void *
function_stat_next(void *v, int idx)
{
@@ -303,7 +303,13 @@ function_stat_next(void *v, int idx)
static void *function_stat_start(void)
{
- return function_stat_next(&profile_pages_start->records[0], 0);
+ struct ftrace_profile_stat *stat =
+ container_of(trace, struct ftrace_profile_stat, stat);
+
+ if (!stat || !stat->start)
+ return NULL;
+
+ return function_stat_next(&stat->start->records[0], 0);
}
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
@@ -374,20 +380,11 @@ static int function_stat_show(struct seq_file *m, void *v)
return 0;
}
-static struct tracer_stat function_stats = {
- .name = "functions",
- .stat_start = function_stat_start,
- .stat_next = function_stat_next,
- .stat_cmp = function_stat_cmp,
- .stat_headers = function_stat_headers,
- .stat_show = function_stat_show
-};
-
-static void ftrace_profile_reset(void)
+static void ftrace_profile_reset(struct ftrace_profile_stat *stat)
{
struct ftrace_profile_page *pg;
- pg = profile_pages = profile_pages_start;
+ pg = stat->pages = stat->start;
while (pg) {
memset(pg->records, 0, PROFILE_RECORDS_SIZE);
@@ -395,24 +392,24 @@ static void ftrace_profile_reset(void)
pg = pg->next;
}
- memset(ftrace_profile_hash, 0,
+ memset(stat->hash, 0,
FTRACE_PROFILE_HASH_SIZE * sizeof(struct hlist_head));
}
-int ftrace_profile_pages_init(void)
+int ftrace_profile_pages_init(struct ftrace_profile_stat *stat)
{
struct ftrace_profile_page *pg;
int i;
/* If we already allocated, do nothing */
- if (profile_pages)
+ if (stat->pages)
return 0;
- profile_pages = (void *)get_zeroed_page(GFP_KERNEL);
- if (!profile_pages)
+ stat->pages = (void *)get_zeroed_page(GFP_KERNEL);
+ if (!stat->pages)
return -ENOMEM;
- pg = profile_pages_start = profile_pages;
+ pg = stat->start = stat->pages;
/* allocate 10 more pages to start */
for (i = 0; i < 10; i++) {
@@ -430,13 +427,16 @@ int ftrace_profile_pages_init(void)
return 0;
}
-static int ftrace_profile_init(void)
+static int ftrace_profile_init_cpu(int cpu)
{
+ struct ftrace_profile_stat *stat;
int size;
- if (ftrace_profile_hash) {
+ stat = &per_cpu(ftrace_profile_stats, cpu);
+
+ if (stat->hash) {
/* If the profile is already created, simply reset it */
- ftrace_profile_reset();
+ ftrace_profile_reset(stat);
return 0;
}
@@ -446,29 +446,45 @@ static int ftrace_profile_init(void)
*/
size = FTRACE_PROFILE_HASH_SIZE;
- ftrace_profile_hash =
- kzalloc(sizeof(struct hlist_head) * size, GFP_KERNEL);
+ stat->hash = kzalloc(sizeof(struct hlist_head) * size, GFP_KERNEL);
- if (!ftrace_profile_hash)
+ if (!stat->hash)
return -ENOMEM;
- size--;
+ if (!ftrace_profile_bits) {
+ size--;
- for (; size; size >>= 1)
- ftrace_profile_bits++;
+ for (; size; size >>= 1)
+ ftrace_profile_bits++;
+ }
/* Preallocate a few pages */
- if (ftrace_profile_pages_init() < 0) {
- kfree(ftrace_profile_hash);
- ftrace_profile_hash = NULL;
+ if (ftrace_profile_pages_init(stat) < 0) {
+ kfree(stat->hash);
+ stat->hash = NULL;
return -ENOMEM;
}
return 0;
}
+static int ftrace_profile_init(void)
+{
+ int cpu;
+ int ret = 0;
+
+ for_each_online_cpu(cpu) {
+ ret = ftrace_profile_init_cpu(cpu);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
/* interrupts must be disabled */
-static struct ftrace_profile *ftrace_find_profiled_func(unsigned long ip)
+static struct ftrace_profile *
+ftrace_find_profiled_func(struct ftrace_profile_stat *stat, unsigned long ip)
{
struct ftrace_profile *rec;
struct hlist_head *hhd;
@@ -476,7 +492,7 @@ static struct ftrace_profile *ftrace_find_profiled_func(unsigned long ip)
unsigned long key;
key = hash_long(ip, ftrace_profile_bits);
- hhd = &ftrace_profile_hash[key];
+ hhd = &stat->hash[key];
if (hlist_empty(hhd))
return NULL;
@@ -489,52 +505,50 @@ static struct ftrace_profile *ftrace_find_profiled_func(unsigned long ip)
return NULL;
}
-static void ftrace_add_profile(struct ftrace_profile *rec)
+static void ftrace_add_profile(struct ftrace_profile_stat *stat,
+ struct ftrace_profile *rec)
{
unsigned long key;
key = hash_long(rec->ip, ftrace_profile_bits);
- hlist_add_head_rcu(&rec->node, &ftrace_profile_hash[key]);
+ hlist_add_head_rcu(&rec->node, &stat->hash[key]);
}
/* Interrupts must be disabled calling this */
static struct ftrace_profile *
-ftrace_profile_alloc(unsigned long ip, bool alloc_safe)
+ftrace_profile_alloc(struct ftrace_profile_stat *stat,
+ unsigned long ip, bool alloc_safe)
{
struct ftrace_profile *rec = NULL;
/* prevent recursion */
- if (atomic_inc_return(&__get_cpu_var(ftrace_profile_disable)) != 1)
+ if (atomic_inc_return(&stat->disabled) != 1)
goto out;
- __raw_spin_lock(&ftrace_profile_rec_lock);
-
/* Try to always keep another page available */
- if (!profile_pages->next && alloc_safe)
- profile_pages->next = (void *)get_zeroed_page(GFP_ATOMIC);
+ if (!stat->pages->next && alloc_safe)
+ stat->pages->next = (void *)get_zeroed_page(GFP_ATOMIC);
/*
* Try to find the function again since another
* task on another CPU could have added it
*/
- rec = ftrace_find_profiled_func(ip);
+ rec = ftrace_find_profiled_func(stat, ip);
if (rec)
- goto out_unlock;
+ goto out;
- if (profile_pages->index == PROFILES_PER_PAGE) {
- if (!profile_pages->next)
- goto out_unlock;
- profile_pages = profile_pages->next;
+ if (stat->pages->index == PROFILES_PER_PAGE) {
+ if (!stat->pages->next)
+ goto out;
+ stat->pages = stat->pages->next;
}
- rec = &profile_pages->records[profile_pages->index++];
+ rec = &stat->pages->records[stat->pages->index++];
rec->ip = ip;
- ftrace_add_profile(rec);
+ ftrace_add_profile(stat, rec);
- out_unlock:
- __raw_spin_unlock(&ftrace_profile_rec_lock);
out:
- atomic_dec(&__get_cpu_var(ftrace_profile_disable));
+ atomic_dec(&stat->disabled);
return rec;
}
@@ -552,6 +566,7 @@ static bool ftrace_safe_to_allocate(void)
static void
function_profile_call(unsigned long ip, unsigned long parent_ip)
{
+ struct ftrace_profile_stat *stat;
struct ftrace_profile *rec;
unsigned long flags;
bool alloc_safe;
@@ -562,9 +577,14 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
alloc_safe = ftrace_safe_to_allocate();
local_irq_save(flags);
- rec = ftrace_find_profiled_func(ip);
+
+ stat = &__get_cpu_var(ftrace_profile_stats);
+ if (!stat->hash)
+ goto out;
+
+ rec = ftrace_find_profiled_func(stat, ip);
if (!rec) {
- rec = ftrace_profile_alloc(ip, alloc_safe);
+ rec = ftrace_profile_alloc(stat, ip, alloc_safe);
if (!rec)
goto out;
}
@@ -583,13 +603,19 @@ static int profile_graph_entry(struct ftrace_graph_ent *trace)
static void profile_graph_return(struct ftrace_graph_ret *trace)
{
- unsigned long flags;
+ struct ftrace_profile_stat *stat;
struct ftrace_profile *rec;
+ unsigned long flags;
local_irq_save(flags);
- rec = ftrace_find_profiled_func(trace->func);
+ stat = &__get_cpu_var(ftrace_profile_stats);
+ if (!stat->hash)
+ goto out;
+
+ rec = ftrace_find_profiled_func(stat, trace->func);
if (rec)
rec->time += trace->rettime - trace->calltime;
+ out:
local_irq_restore(flags);
}
@@ -687,16 +713,51 @@ static const struct file_operations ftrace_profile_fops = {
.write = ftrace_profile_write,
};
+/* used to initialize the real stat files */
+static struct tracer_stat function_stats __initdata = {
+ .name = "functions",
+ .stat_start = function_stat_start,
+ .stat_next = function_stat_next,
+ .stat_cmp = function_stat_cmp,
+ .stat_headers = function_stat_headers,
+ .stat_show = function_stat_show
+};
+
static void ftrace_profile_debugfs(struct dentry *d_tracer)
{
+ struct ftrace_profile_stat *stat;
struct dentry *entry;
+ char *name;
int ret;
+ int cpu;
- ret = register_stat_tracer(&function_stats);
- if (ret) {
- pr_warning("Warning: could not register "
- "function stats\n");
- return;
+ for_each_possible_cpu(cpu) {
+ stat = &per_cpu(ftrace_profile_stats, cpu);
+
+ /* allocate enough for function name + cpu number */
+ name = kmalloc(32, GFP_KERNEL);
+ if (!name) {
+ /*
+ * The files created are permanent, if something happens
+ * we still do not free memory.
+ */
+ kfree(stat);
+ WARN(1,
+ "Could not allocate stat file for cpu %d\n",
+ cpu);
+ return;
+ }
+ stat->stat = function_stats;
+ snprintf(name, 32, "function%d", cpu);
+ stat->stat.name = name;
+ ret = register_stat_tracer(&stat->stat);
+ if (ret) {
+ WARN(1,
+ "Could not register function stat for cpu %d\n",
+ cpu);
+ kfree(name);
+ return;
+ }
}
entry = debugfs_create_file("function_profile_enabled", 0644,
--
1.6.2
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
* [PATCH 5/9] function graph add option to calculate graph time off
From: Luis Claudio R. Goncalves @ 2009-05-19 14:55 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
function graph add option to calculate graph time off
Backport to 2.6.29.3-rt14.
| From: Steven Rostedt <srostedt@redhat.com>
|
| Graph time is the time that a function spends executing other functions.
| Thus if function A calls B, and graph-time is set, then the time reported
| for A includes the time spent in B. This is the default behavior. But if
| graph-time is off, then the time spent executing B is subtracted from A.
|
| Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
include/linux/ftrace.h | 3 +--
kernel/trace/ftrace.c | 21 ++++++++++++++++++++-
kernel/trace/trace.c | 4 +++-
kernel/trace/trace.h | 1 +
kernel/trace/trace_functions_graph.c | 8 ++++----
5 files changed, 29 insertions(+), 8 deletions(-)
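[ Backport note: the new option shows up in trace_options as "graph-time"
  and is on by default. A rough sketch of profiling with it turned off, so
  the reported times are per-function self time rather than inclusive time
  (assuming debugfs is mounted at /debug; paths are illustrative):

    echo nograph-time > /debug/tracing/trace_options
    echo 1 > /debug/tracing/function_profile_enabled
    ./hackbench 50
    echo 0 > /debug/tracing/function_profile_enabled
    cat /debug/tracing/trace_stat/function0
    echo graph-time > /debug/tracing/trace_options
]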
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 015a3d2..9e0a8d2 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -365,6 +365,7 @@ struct ftrace_ret_stack {
unsigned long ret;
unsigned long func;
unsigned long long calltime;
+ unsigned long long subtime;
};
/*
@@ -376,8 +377,6 @@ extern void return_to_handler(void);
extern int
ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth);
-extern void
-ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret);
/*
* Sometimes we don't want to trace a function with the function
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index ed1fc50..71e5fae 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -604,6 +604,7 @@ static int profile_graph_entry(struct ftrace_graph_ent *trace)
static void profile_graph_return(struct ftrace_graph_ret *trace)
{
struct ftrace_profile_stat *stat;
+ unsigned long long calltime;
struct ftrace_profile *rec;
unsigned long flags;
@@ -612,9 +613,27 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
if (!stat->hash)
goto out;
+ calltime = trace->rettime - trace->calltime;
+
+ if (!(trace_flags & TRACE_ITER_GRAPH_TIME)) {
+ int index;
+
+ index = trace->depth;
+
+ /* Append this call time to the parent time to subtract */
+ if (index)
+ current->ret_stack[index - 1].subtime += calltime;
+
+ if (current->ret_stack[index].subtime < calltime)
+ calltime -= current->ret_stack[index].subtime;
+ else
+ calltime = 0;
+ }
+
rec = ftrace_find_profiled_func(stat, trace->func);
if (rec)
- rec->time += trace->rettime - trace->calltime;
+ rec->time += calltime;
+
out:
local_irq_restore(flags);
}
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 821bf49..5d1a16c 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -255,7 +255,8 @@ static DECLARE_WAIT_QUEUE_HEAD(trace_wait);
/* trace_flags holds trace_options default values */
unsigned long trace_flags = TRACE_ITER_PRINT_PARENT | TRACE_ITER_PRINTK |
- TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME;
+ TRACE_ITER_ANNOTATE | TRACE_ITER_CONTEXT_INFO | TRACE_ITER_SLEEP_TIME |
+ TRACE_ITER_GRAPH_TIME;
/**
* trace_wake_up - wake up tasks waiting for trace input
@@ -321,6 +322,7 @@ static const char *trace_options[] = {
"latency-format",
"global-clock",
"sleep-time",
+ "graph-time",
NULL
};
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index c66ca3b..e3429a8 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -693,6 +693,7 @@ enum trace_iterator_flags {
TRACE_ITER_LATENCY_FMT = 0x40000,
TRACE_ITER_GLOBAL_CLK = 0x80000,
TRACE_ITER_SLEEP_TIME = 0x100000,
+ TRACE_ITER_GRAPH_TIME = 0x200000,
};
/*
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 85bba0f..10f6ad7 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -78,13 +78,14 @@ ftrace_push_return_trace(unsigned long ret, unsigned long func, int *depth)
current->ret_stack[index].ret = ret;
current->ret_stack[index].func = func;
current->ret_stack[index].calltime = calltime;
+ current->ret_stack[index].subtime = 0;
*depth = index;
return 0;
}
/* Retrieve a function return address to the trace stack on thread info.*/
-void
+static void
ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret)
{
int index;
@@ -104,9 +105,6 @@ ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret)
trace->calltime = current->ret_stack[index].calltime;
trace->overrun = atomic_read(&current->trace_overrun);
trace->depth = index;
- barrier();
- current->curr_ret_stack--;
-
}
/*
@@ -121,6 +119,8 @@ unsigned long ftrace_return_to_handler(void)
ftrace_pop_return_trace(&trace, &ret);
trace.rettime = trace_clock_local();
ftrace_graph_return(&trace);
+ barrier();
+ current->curr_ret_stack--;
if (unlikely(!ret)) {
ftrace_graph_stop();
--
1.6.2
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 6/9] tracing: remove on the fly allocator from function profiler
2009-05-19 14:36 [PATCH 0/9] Backport of ftrace's function profiler Luis Claudio R. Goncalves
` (4 preceding siblings ...)
2009-05-19 14:55 ` [PATCH 5/9] function graph add option to calculate graph time off Luis Claudio R. Goncalves
@ 2009-05-19 14:58 ` Luis Claudio R. Goncalves
2009-05-19 15:01 ` [PATCH 7/9] tracing: add average time in function to " Luis Claudio R. Goncalves
` (3 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Luis Claudio R. Goncalves @ 2009-05-19 14:58 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
tracing: remove on the fly allocator from function profiler
Impact: safer code
| From: Steven Rostedt <srostedt@redhat.com>
|
| Impact: safer code
|
| The on the fly allocator for the function profiler was to save
| memory. But at the expense of stability. Although it survived several
| tests, allocating from the function tracer is just too risky, just
| to save space.
|
| This patch removes the allocator and simply allocates enough entries
| at start up.
|
| Each function gets a profiling structure of 40 bytes. With an average
| of 20K functions, and this is for each CPU, we have 800K per online
| CPU. This is not too bad, at least for non-embedded.
|
| Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
kernel/trace/ftrace.c | 76 +++++++++++++++++++++++++++---------------------
1 files changed, 43 insertions(+), 33 deletions(-)
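
The sizing described above works out as in this back-of-the-envelope sketch (4K pages assumed; PROFILE_RECORD_SIZE and the macros are stand-ins, not the kernel's exact definitions):

/*
 * Rough cost of allocating all profile records up front: records are
 * packed into whole pages, so round the page count up.
 */
#include <stdio.h>

#define PAGE_SIZE		4096	/* assume 4K pages */
#define PROFILE_RECORD_SIZE	40	/* per-function record, per the changelog */
#define PROFILES_PER_PAGE	(PAGE_SIZE / PROFILE_RECORD_SIZE)
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

int main(void)
{
	int functions = 20000;	/* estimate used when !CONFIG_DYNAMIC_FTRACE */
	int pages = DIV_ROUND_UP(functions, PROFILES_PER_PAGE);

	printf("%d records/page, %d pages, %d KB per online CPU\n",
	       PROFILES_PER_PAGE, pages, pages * PAGE_SIZE / 1024);
	return 0;
}

That prints 102 records/page, 197 pages, 788 KB per online CPU, close to the roughly 800K per CPU quoted in the changelog.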
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index a141d84..4d90c91 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -399,6 +399,8 @@ static void ftrace_profile_reset(struct ftrace_profile_stat *stat)
int ftrace_profile_pages_init(struct ftrace_profile_stat *stat)
{
struct ftrace_profile_page *pg;
+ int functions;
+ int pages;
int i;
/* If we already allocated, do nothing */
@@ -409,22 +411,46 @@ int ftrace_profile_pages_init(struct ftrace_profile_stat *stat)
if (!stat->pages)
return -ENOMEM;
+#ifdef CONFIG_DYNAMIC_FTRACE
+ functions = ftrace_update_tot_cnt;
+#else
+ /*
+ * We do not know the number of functions that exist because
+ * dynamic tracing is what counts them. With past experience
+ * we have around 20K functions. That should be more than enough.
+ * It is highly unlikely we will execute every function in
+ * the kernel.
+ */
+ functions = 20000;
+#endif
+
pg = stat->start = stat->pages;
- /* allocate 10 more pages to start */
- for (i = 0; i < 10; i++) {
+ pages = DIV_ROUND_UP(functions, PROFILES_PER_PAGE);
+
+ for (i = 0; i < pages; i++) {
pg->next = (void *)get_zeroed_page(GFP_KERNEL);
- /*
- * We only care about allocating profile_pages, if
- * we failed to allocate here, hopefully we will allocate
- * later.
- */
if (!pg->next)
- break;
+ goto out_free;
pg = pg->next;
}
return 0;
+
+ out_free:
+ pg = stat->start;
+ while (pg) {
+ unsigned long tmp = (unsigned long)pg;
+
+ pg = pg->next;
+ free_page(tmp);
+ }
+
+ free_page((unsigned long)stat->pages);
+ stat->pages = NULL;
+ stat->start = NULL;
+
+ return -ENOMEM;
}
static int ftrace_profile_init_cpu(int cpu)
@@ -458,7 +484,7 @@ static int ftrace_profile_init_cpu(int cpu)
ftrace_profile_bits++;
}
- /* Preallocate a few pages */
+ /* Preallocate the function profiling pages */
if (ftrace_profile_pages_init(stat) < 0) {
kfree(stat->hash);
stat->hash = NULL;
@@ -514,24 +540,21 @@ static void ftrace_add_profile(struct ftrace_profile_stat *stat,
hlist_add_head_rcu(&rec->node, &stat->hash[key]);
}
-/* Interrupts must be disabled calling this */
+/*
+ * The memory is already allocated, this simply finds a new record to use.
+ */
static struct ftrace_profile *
-ftrace_profile_alloc(struct ftrace_profile_stat *stat,
- unsigned long ip, bool alloc_safe)
+ftrace_profile_alloc(struct ftrace_profile_stat *stat, unsigned long ip)
{
struct ftrace_profile *rec = NULL;
- /* prevent recursion */
+ /* prevent recursion (from NMIs) */
if (atomic_inc_return(&stat->disabled) != 1)
goto out;
- /* Try to always keep another page available */
- if (!stat->pages->next && alloc_safe)
- stat->pages->next = (void *)get_zeroed_page(GFP_ATOMIC);
-
/*
- * Try to find the function again since another
- * task on another CPU could have added it
+ * Try to find the function again since an NMI
+ * could have added it
*/
rec = ftrace_find_profiled_func(stat, ip);
if (rec)
@@ -553,29 +576,16 @@ ftrace_profile_alloc(struct ftrace_profile_stat *stat,
return rec;
}
-/*
- * If we are not in an interrupt, or softirq and
- * and interrupts are disabled and preemption is not enabled
- * (not in a spinlock) then it should be safe to allocate memory.
- */
-static bool ftrace_safe_to_allocate(void)
-{
- return !in_interrupt() && irqs_disabled() && !preempt_count();
-}
-
static void
function_profile_call(unsigned long ip, unsigned long parent_ip)
{
struct ftrace_profile_stat *stat;
struct ftrace_profile *rec;
unsigned long flags;
- bool alloc_safe;
if (!ftrace_profile_enabled)
return;
- alloc_safe = ftrace_safe_to_allocate();
-
local_irq_save(flags);
stat = &__get_cpu_var(ftrace_profile_stats);
@@ -584,7 +594,7 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
rec = ftrace_find_profiled_func(stat, ip);
if (!rec) {
- rec = ftrace_profile_alloc(stat, ip, alloc_safe);
+ rec = ftrace_profile_alloc(stat, ip);
if (!rec)
goto out;
}
--
1.6.2
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 7/9] tracing: add average time in function to function profiler
2009-05-19 14:36 [PATCH 0/9] Backport of ftrace's function profiler Luis Claudio R. Goncalves
` (5 preceding siblings ...)
2009-05-19 14:58 ` [PATCH 6/9] tracing: remove on the fly allocator from function profiler Luis Claudio R. Goncalves
@ 2009-05-19 15:01 ` Luis Claudio R. Goncalves
2009-05-19 15:03 ` [PATCH 8/9] backport function profiler fixes Luis Claudio R. Goncalves
` (2 subsequent siblings)
9 siblings, 0 replies; 13+ messages in thread
From: Luis Claudio R. Goncalves @ 2009-05-19 15:01 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
tracing: add average time in function to function profiler
Show the average time in the function (Time / Hit)
Backport to 2.6.29.3-rt14.
| From: Steven Rostedt <srostedt@redhat.com>
|
| Show the average time in the function (Time / Hit)
|
| Function Hit Time Avg
| -------- --- ---- ---
| mwait_idle 51 140326.6 us 2751.503 us
| smp_apic_timer_interrupt 47 3517.735 us 74.845 us
| schedule 10 2738.754 us 273.875 us
| __schedule 10 2732.857 us 273.285 us
| hrtimer_interrupt 47 1896.104 us 40.342 us
| irq_exit 56 1711.833 us 30.568 us
| __run_hrtimer 47 1315.589 us 27.991 us
| tick_sched_timer 47 1138.690 us 24.227 us
| do_softirq 56 1116.829 us 19.943 us
| __do_softirq 56 1066.932 us 19.052 us
| do_IRQ 9 926.153 us 102.905 us
|
| Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
kernel/trace/ftrace.c | 21 ++++++++++++++-------
1 files changed, 14 insertions(+), 7 deletions(-)
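
The Avg column is just the accumulated time divided by the hit count; do_div() is used because rec->time is 64-bit and a plain 64-by-32 division is not available on every 32-bit architecture. A userspace sketch of the calculation, with a stand-in that only imitates do_div()'s divide-in-place behaviour and sample values taken roughly from the mwait_idle line above:

#include <stdio.h>

/* Stand-in for the kernel's do_div(): divide *n in place, return the remainder. */
static unsigned int do_div_sketch(unsigned long long *n, unsigned int base)
{
	unsigned int rem = *n % base;

	*n /= base;
	return rem;
}

int main(void)
{
	unsigned long long time_ns = 140326600ULL;	/* total time in the function */
	unsigned int counter = 51;			/* hits */
	unsigned long long avg = time_ns;

	do_div_sketch(&avg, counter);			/* avg now holds time / hit */
	printf("avg = %llu ns (%llu.%03llu us)\n",
	       avg, avg / 1000, avg % 1000);
	return 0;
}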
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 4d90c91..c7f4a4b 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -345,8 +345,10 @@ static int function_stat_cmp(void *p1, void *p2)
static int function_stat_headers(struct seq_file *m)
{
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- seq_printf(m, " Function Hit Time\n"
- " -------- --- ----\n");
+ seq_printf(m, " Function "
+ "Hit Time Avg\n"
+ " -------- "
+ "--- ---- ---\n");
#else
seq_printf(m, " Function Hit\n"
" -------- ---\n");
@@ -359,12 +361,9 @@ static int function_stat_show(struct seq_file *m, void *v)
struct ftrace_profile *rec = v;
char str[KSYM_SYMBOL_LEN];
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- static struct trace_seq s;
static DEFINE_MUTEX(mutex);
-
- mutex_lock(&mutex);
- trace_seq_init(&s);
- trace_print_graph_duration(rec->time, &s);
+ static struct trace_seq s;
+ unsigned long long avg;
#endif
kallsyms_lookup(rec->ip, NULL, NULL, NULL, str);
@@ -372,6 +371,14 @@ static int function_stat_show(struct seq_file *m, void *v)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
seq_printf(m, " ");
+ avg = rec->time;
+ do_div(avg, rec->counter);
+
+ mutex_lock(&mutex);
+ trace_seq_init(&s);
+ trace_print_graph_duration(rec->time, &s);
+ trace_seq_puts(&s, " ");
+ trace_print_graph_duration(avg, &s);
trace_print_seq(m, &s);
mutex_unlock(&mutex);
#endif
--
1.6.2
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 8/9] backport function profiler fixes
2009-05-19 14:36 [PATCH 0/9] Backport of ftrace's function profiler Luis Claudio R. Goncalves
` (6 preceding siblings ...)
2009-05-19 15:01 ` [PATCH 7/9] tracing: add average time in function to " Luis Claudio R. Goncalves
@ 2009-05-19 15:03 ` Luis Claudio R. Goncalves
2009-05-19 15:05 ` [PATCH 9/9] ftrace: function profiler band-aid Luis Claudio R. Goncalves
2009-05-19 15:06 ` [PATCH 0/9] Backport of ftrace's function profiler Steven Rostedt
9 siblings, 0 replies; 13+ messages in thread
From: Luis Claudio R. Goncalves @ 2009-05-19 15:03 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
Additional fixes to the backport of ftrace's function profiler: stat_start() now receives its struct tracer_stat pointer, so function_stat_start() can find the enclosing ftrace_profile_stat with container_of().
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
kernel/trace/ftrace.c | 2 +-
kernel/trace/trace_stat.c | 2 +-
kernel/trace/trace_stat.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
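
A minimal standalone illustration of that container_of() pattern follows (structure members trimmed to what the example needs, and a simplified container_of()):

#include <stdio.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct tracer_stat {
	const char *name;
	void *(*stat_start)(struct tracer_stat *trace);
};

struct ftrace_profile_stat {
	int cpu;			/* which CPU this state belongs to */
	struct tracer_stat stat;	/* embedded, registered with trace_stat */
};

static void *function_stat_start(struct tracer_stat *trace)
{
	struct ftrace_profile_stat *stat =
		container_of(trace, struct ftrace_profile_stat, stat);

	printf("starting iteration for cpu %d\n", stat->cpu);
	return stat;	/* the kernel returns its first profile record here */
}

int main(void)
{
	struct ftrace_profile_stat ps = { .cpu = 2 };

	ps.stat.name = "functions";
	ps.stat.stat_start = function_stat_start;
	ps.stat.stat_start(&ps.stat);	/* trace_stat invokes the callback this way */
	return 0;
}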
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index cd91e88..ed1fc50 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -301,7 +301,7 @@ function_stat_next(void *v, int idx)
return rec;
}
-static void *function_stat_start(void)
+static void *function_stat_start(struct tracer_stat *trace)
{
struct ftrace_profile_stat *stat =
container_of(trace, struct ftrace_profile_stat, stat);
diff --git a/kernel/trace/trace_stat.c b/kernel/trace/trace_stat.c
index acdebd7..fdde3a4 100644
--- a/kernel/trace/trace_stat.c
+++ b/kernel/trace/trace_stat.c
@@ -85,7 +85,7 @@ static int stat_seq_init(struct tracer_stat_session *session)
if (!ts->stat_cmp)
ts->stat_cmp = dummy_cmp;
- stat = ts->stat_start();
+ stat = ts->stat_start(ts);
if (!stat)
goto exit;
diff --git a/kernel/trace/trace_stat.h b/kernel/trace/trace_stat.h
index 202274c..f3546a2 100644
--- a/kernel/trace/trace_stat.h
+++ b/kernel/trace/trace_stat.h
@@ -12,7 +12,7 @@ struct tracer_stat {
/* The name of your stat file */
const char *name;
/* Iteration over statistic entries */
- void *(*stat_start)(void);
+ void *(*stat_start)(struct tracer_stat *trace);
void *(*stat_next)(void *prev, int idx);
/* Compare two entries for stats sorting */
int (*stat_cmp)(void *p1, void *p2);
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH 9/9] ftrace: function profiler band-aid
2009-05-19 14:36 [PATCH 0/9] Backport of ftrace's function profiler Luis Claudio R. Goncalves
` (7 preceding siblings ...)
2009-05-19 15:03 ` [PATCH 8/9] backport function profiler fixes Luis Claudio R. Goncalves
@ 2009-05-19 15:05 ` Luis Claudio R. Goncalves
2009-05-19 15:06 ` [PATCH 0/9] Backport of ftrace's function profiler Steven Rostedt
9 siblings, 0 replies; 13+ messages in thread
From: Luis Claudio R. Goncalves @ 2009-05-19 15:05 UTC (permalink / raw)
To: linux-rt-users; +Cc: Steven Rostedt, Ingo Molnar, williams, tglx
Kernel Panic when stressing ftrace function_profiler
https://bugzilla.redhat.com/show_bug.cgi?id=500156
This patch does not fix the whole issue, but it removes a few probable spots
from which the panic could originate.
[root@void ~]# while true
> do echo "starting new test..."
> ./function_profiler.sh sleep 2 > /dev/null
> dmesg -c
> ./function_profiler.sh ls -la > /dev/null
> dmesg -c;
> done
starting new test...
starting new test...
...
Kernel panic - not syncing: Fatal exception
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
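
The two hunks below amount to guarding the Avg computation against a record whose hit count reads as zero (which would make do_div() divide by zero) and bounding the msecs_str conversion. A small sketch of both guards, with an illustrative buffer size and sample values:

#include <stdio.h>

int main(void)
{
	unsigned long long total = 0, avg = total;
	unsigned long counter = 0;		/* e.g. observed mid-reset */
	unsigned long long duration = 123456789ULL;
	char msecs_str[21];

	if (counter)				/* band-aid #1: avoid a divide by zero */
		avg /= counter;

	/* band-aid #2: never write past the end of msecs_str */
	snprintf(msecs_str, sizeof(msecs_str), "%lu", (unsigned long) duration);

	printf("avg=%llu msecs_str=%s\n", avg, msecs_str);
	return 0;
}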
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 8509861..3228291 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -372,7 +372,8 @@ static int function_stat_show(struct seq_file *m, void *v)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
seq_printf(m, " ");
avg = rec->time;
- do_div(avg, rec->counter);
+ if (rec->counter)
+ do_div(avg, rec->counter);
mutex_lock(&mutex);
trace_seq_init(&s);
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 10f6ad7..e352d82 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -436,7 +436,7 @@ trace_print_graph_duration(unsigned long long duration, struct trace_seq *s)
int ret, len;
int i;
- sprintf(msecs_str, "%lu", (unsigned long) duration);
+ snprintf(msecs_str, sizeof(msecs_str), "%lu", (unsigned long) duration);
/* Print msecs */
ret = trace_seq_printf(s, "%s", msecs_str);
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH 0/9] Backport of ftrace's function profiler
2009-05-19 14:36 [PATCH 0/9] Backport of ftrace's function profiler Luis Claudio R. Goncalves
` (8 preceding siblings ...)
2009-05-19 15:05 ` [PATCH 9/9] ftrace: function profiler band-aid Luis Claudio R. Goncalves
@ 2009-05-19 15:06 ` Steven Rostedt
2009-05-19 15:12 ` Luis Claudio R. Goncalves
9 siblings, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2009-05-19 15:06 UTC (permalink / raw)
To: Luis Claudio R. Goncalves; +Cc: linux-rt-users, Ingo Molnar, williams, tglx
On Tue, 19 May 2009, Luis Claudio R. Goncalves wrote:
> Hello,
>
> The following patches are the backport of Steven Rostedt's ftrace's
> function profiler, as published in his RFC series posted to LKML. This
> backport has been done on top of 2.6.29.3-rt14.
>
> The feature is fully functional but we do have one known problem, a kernel
> panic when stressing the function profiler (a fast sequence of start, stop,
> read in loop for 30+ minutes). This problem has a Bugzilla entry at:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=500156
Luis,
I'm fine with this, but if you want me to review the patches, it would
help if you have the reject files that you had to solve (or merge conflict
files). Then I can see if changes were made correctly.
-- Steve
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 0/9] Backport of ftrace's function profiler
2009-05-19 15:06 ` [PATCH 0/9] Backport of ftrace's function profiler Steven Rostedt
@ 2009-05-19 15:12 ` Luis Claudio R. Goncalves
2009-05-19 15:15 ` Steven Rostedt
0 siblings, 1 reply; 13+ messages in thread
From: Luis Claudio R. Goncalves @ 2009-05-19 15:12 UTC (permalink / raw)
To: Steven Rostedt; +Cc: linux-rt-users, Ingo Molnar, williams, tglx
On Tue, May 19, 2009 at 11:06:55AM -0400, Steven Rostedt wrote:
|
| On Tue, 19 May 2009, Luis Claudio R. Goncalves wrote:
|
| > Hello,
| >
| > The following patches are the backport of Steven Rostedt's ftrace's
| > function profiler, as published in his RFC series posted to LKML. This
| > backport has been done on top of 2.6.29.3-rt14.
| >
| > The feature is fully functional but we do have one known problem, a kernel
| > panic when stressing the function profiler (a fast sequence of start, stop,
| > read in loop for 30+ minutes). This problem has a Bugzilla entry at:
| >
| > https://bugzilla.redhat.com/show_bug.cgi?id=500156
|
| Luis,
|
| I'm fine with this, but if you want me to review the patches, it would
| help if you have the reject files that you had to solve (or merge conflict
| files). Then I can see if changes were made correctly.
So I finally found a use for those ugly reject files I always delete... :)
Steven, I will have to redo part of the work to get the reject files again.
Hopefully it won't take long. Thanks for being available to review the
backport!
Luis
--
[ Luis Claudio R. Goncalves Red Hat - Realtime Team ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 0/9] Backport of ftrace's function profiler
2009-05-19 15:12 ` Luis Claudio R. Goncalves
@ 2009-05-19 15:15 ` Steven Rostedt
0 siblings, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2009-05-19 15:15 UTC (permalink / raw)
To: Luis Claudio R. Goncalves; +Cc: linux-rt-users, Ingo Molnar, williams, tglx
On Tue, 19 May 2009, Luis Claudio R. Goncalves wrote:
>
> So I finally found a use for those ugly reject files I always delete... :)
Heh, I have a tar ball of all the rejects I ever had while porting -rt
from one kernel version to another. I do not delete .rej files lightly.
-- Steve
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads: [~2009-05-19 15:19 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-19 14:36 [PATCH 0/9] Backport of ftrace's function profiler Luis Claudio R. Goncalves
2009-05-19 14:43 ` [PATCH 1/9] tracing: add " Luis Claudio R. Goncalves
2009-05-19 14:45 ` [PATCH 2/9] reduce size of memory in " Luis Claudio R. Goncalves
2009-05-19 14:48 ` [PATCH 3/9] tracing: adding function timings to function profile Luis Claudio R. Goncalves
2009-05-19 14:52 ` [PATCH 4/9] tracing: make the function profiler per cpu Luis Claudio R. Goncalves
2009-05-19 14:55 ` [PATCH 5/9] function graph add option to calculate graph time off Luis Claudio R. Goncalves
2009-05-19 14:58 ` [PATCH 6/9] tracing: remove on the fly allocator from function profiler Luis Claudio R. Goncalves
2009-05-19 15:01 ` [PATCH 7/9] tracing: add average time in function to " Luis Claudio R. Goncalves
2009-05-19 15:03 ` [PATCH 8/9] backport function profiler fixes Luis Claudio R. Goncalves
2009-05-19 15:05 ` [PATCH 9/9] ftrace: function profiler band-aid Luis Claudio R. Goncalves
2009-05-19 15:06 ` [PATCH 0/9] Backport of ftrace's function profiler Steven Rostedt
2009-05-19 15:12 ` Luis Claudio R. Goncalves
2009-05-19 15:15 ` Steven Rostedt