linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 00/10] Ftrace functions hashlist
@ 2010-01-22  1:16 Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 01/10] ftrace: Generalize the function hashlist from function profiler Frederic Weisbecker
                   ` (9 more replies)
  0 siblings, 10 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

Hi,

This set exports the function hashlist used by the function
profiler for more general use.

For now it fixes a scalability problem in the function graph
tracer, namely the function start filtering (through the
set_graph_function file).

This is also the beginning of work to allow multiple function
(and function graph) tracer instances to be used in parallel, each
with its own dynamic ftrace parameters. The plan is then to make
these tracers available for wider use, such as perf, through the
trace event API.
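
As a rough illustration of the direction (a sketch only:
my_tracer_wants_ip() is hypothetical, while func_hlist_cpu,
function_find_hlist_node(), function_hlist_record_alloc() and the
graph_start field come from patches 1 and 9), each tracer instance
would keep its own per-function parameters in the shared hashlist and
consult them from its callbacks:

	static int my_tracer_wants_ip(unsigned long ip)
	{
		struct func_hlist *hlist = &__get_cpu_var(func_hlist_cpu);
		struct func_node *rec;

		/* callers run this with interrupts disabled, like the existing callbacks */
		rec = function_find_hlist_node(hlist, ip);
		if (!rec)
			rec = function_hlist_record_alloc(hlist, ip);

		/* graph_start: the per-function parameter added for the graph tracer */
		return rec && rec->graph_start;
	}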

This is still at the RFC stage because, with this patchset, the
set of function filters from set_graph_function only becomes
active once the function graph tracer is enabled. If
set_graph_function is changed in the middle of tracing, the
filters will actually be updated only after you switch to another
tracer and then back to the graph tracer. But that should be
fixed in a further iteration.

Frederic Weisbecker (9):
  ftrace: Generalize the function hashlist from function profiler
  ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph
  ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  ftrace: Ensure buffers are visible to tracing callbacks right away
  ftrace: Drop buffer check in function profiler callbacks
  ftrace: Release the function hlist if we don't need it anymore
  ftrace: Make the function hashlist concurrently usable
  tracing: Use the hashlist for graph function
  ftrace: Factorize search and insertion in the function hashlist

Lai Jiangshan (1):
  tracing: Simplify test for function_graph tracing

 kernel/trace/Makefile                |    1 +
 kernel/trace/ftrace.c                |  378 ++++++++++------------------------
 kernel/trace/functions_hlist.c       |  289 ++++++++++++++++++++++++++
 kernel/trace/functions_hlist.h       |   54 +++++
 kernel/trace/trace.h                 |   21 +--
 kernel/trace/trace_functions_graph.c |   46 ++++-
 6 files changed, 493 insertions(+), 296 deletions(-)
 create mode 100644 kernel/trace/functions_hlist.c
 create mode 100644 kernel/trace/functions_hlist.h



* [RFC PATCH 01/10] ftrace: Generalize the function hashlist from function profiler
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 02/10] ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph Frederic Weisbecker
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

Extract and generalize the function hashlist used by the function
profiler.

Having a general purpose hashlist for kernel functions will help
multiplex dynamic tracing parameters for parallel users of the
function and function graph tracers. It can also help track which
functions each concurrent tracer is tracing.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/Makefile          |    1 +
 kernel/trace/ftrace.c          |  289 ++++-----------------------------------
 kernel/trace/functions_hlist.c |  217 ++++++++++++++++++++++++++++++
 kernel/trace/functions_hlist.h |   38 ++++++
 4 files changed, 286 insertions(+), 259 deletions(-)
 create mode 100644 kernel/trace/functions_hlist.c
 create mode 100644 kernel/trace/functions_hlist.h

diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index d00c6fe..f9804f2 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -58,5 +58,6 @@ obj-$(CONFIG_EVENT_TRACING) += trace_events_filter.o
 obj-$(CONFIG_KPROBE_EVENT) += trace_kprobe.o
 obj-$(CONFIG_KSYM_TRACER) += trace_ksym.o
 obj-$(CONFIG_EVENT_TRACING) += power-traces.o
+obj-$(CONFIG_FUNCTION_PROFILER) += functions_hlist.o
 
 libftrace-y := ftrace.o
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 1e6640f..c050ce3 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -27,13 +27,13 @@
 #include <linux/sysctl.h>
 #include <linux/ctype.h>
 #include <linux/list.h>
-#include <linux/hash.h>
 
 #include <trace/events/sched.h>
 
 #include <asm/ftrace.h>
 #include <asm/setup.h>
 
+#include "functions_hlist.h"
 #include "trace_output.h"
 #include "trace_stat.h"
 
@@ -258,52 +258,19 @@ static void ftrace_update_pid_func(void)
 }
 
 #ifdef CONFIG_FUNCTION_PROFILER
-struct ftrace_profile {
-	struct hlist_node		node;
-	unsigned long			ip;
-	unsigned long			counter;
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	unsigned long long		time;
-#endif
-};
-
-struct ftrace_profile_page {
-	struct ftrace_profile_page	*next;
-	unsigned long			index;
-	struct ftrace_profile		records[];
-};
-
-struct ftrace_profile_stat {
-	atomic_t			disabled;
-	struct hlist_head		*hash;
-	struct ftrace_profile_page	*pages;
-	struct ftrace_profile_page	*start;
-	struct tracer_stat		stat;
-};
-
-#define PROFILE_RECORDS_SIZE						\
-	(PAGE_SIZE - offsetof(struct ftrace_profile_page, records))
 
-#define PROFILES_PER_PAGE					\
-	(PROFILE_RECORDS_SIZE / sizeof(struct ftrace_profile))
-
-static int ftrace_profile_bits __read_mostly;
 static int ftrace_profile_enabled __read_mostly;
 
 /* ftrace_profile_lock - synchronize the enable and disable of the profiler */
 static DEFINE_MUTEX(ftrace_profile_lock);
 
-static DEFINE_PER_CPU(struct ftrace_profile_stat, ftrace_profile_stats);
-
-#define FTRACE_PROFILE_HASH_SIZE 1024 /* must be power of 2 */
-
 static void *
 function_stat_next(void *v, int idx)
 {
-	struct ftrace_profile *rec = v;
-	struct ftrace_profile_page *pg;
+	struct func_node *rec = v;
+	struct func_hlist_page *pg;
 
-	pg = (struct ftrace_profile_page *)((unsigned long)rec & PAGE_MASK);
+	pg = (struct func_hlist_page *)((unsigned long)rec & PAGE_MASK);
 
  again:
 	if (idx != 0)
@@ -323,21 +290,21 @@ function_stat_next(void *v, int idx)
 
 static void *function_stat_start(struct tracer_stat *trace)
 {
-	struct ftrace_profile_stat *stat =
-		container_of(trace, struct ftrace_profile_stat, stat);
+	struct func_hlist *hlist =
+		container_of(trace, struct func_hlist, stat);
 
-	if (!stat || !stat->start)
+	if (!hlist || !hlist->start)
 		return NULL;
 
-	return function_stat_next(&stat->start->records[0], 0);
+	return function_stat_next(&hlist->start->records[0], 0);
 }
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 /* function graph compares on total time */
 static int function_stat_cmp(void *p1, void *p2)
 {
-	struct ftrace_profile *a = p1;
-	struct ftrace_profile *b = p2;
+	struct func_node *a = p1;
+	struct func_node *b = p2;
 
 	if (a->time < b->time)
 		return -1;
@@ -350,8 +317,8 @@ static int function_stat_cmp(void *p1, void *p2)
 /* not function graph compares against hits */
 static int function_stat_cmp(void *p1, void *p2)
 {
-	struct ftrace_profile *a = p1;
-	struct ftrace_profile *b = p2;
+	struct func_node *a = p1;
+	struct func_node *b = p2;
 
 	if (a->counter < b->counter)
 		return -1;
@@ -378,7 +345,7 @@ static int function_stat_headers(struct seq_file *m)
 
 static int function_stat_show(struct seq_file *m, void *v)
 {
-	struct ftrace_profile *rec = v;
+	struct func_node *rec = v;
 	char str[KSYM_SYMBOL_LEN];
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	static DEFINE_MUTEX(mutex);
@@ -407,207 +374,11 @@ static int function_stat_show(struct seq_file *m, void *v)
 	return 0;
 }
 
-static void ftrace_profile_reset(struct ftrace_profile_stat *stat)
-{
-	struct ftrace_profile_page *pg;
-
-	pg = stat->pages = stat->start;
-
-	while (pg) {
-		memset(pg->records, 0, PROFILE_RECORDS_SIZE);
-		pg->index = 0;
-		pg = pg->next;
-	}
-
-	memset(stat->hash, 0,
-	       FTRACE_PROFILE_HASH_SIZE * sizeof(struct hlist_head));
-}
-
-int ftrace_profile_pages_init(struct ftrace_profile_stat *stat)
-{
-	struct ftrace_profile_page *pg;
-	int functions;
-	int pages;
-	int i;
-
-	/* If we already allocated, do nothing */
-	if (stat->pages)
-		return 0;
-
-	stat->pages = (void *)get_zeroed_page(GFP_KERNEL);
-	if (!stat->pages)
-		return -ENOMEM;
-
-#ifdef CONFIG_DYNAMIC_FTRACE
-	functions = ftrace_update_tot_cnt;
-#else
-	/*
-	 * We do not know the number of functions that exist because
-	 * dynamic tracing is what counts them. With past experience
-	 * we have around 20K functions. That should be more than enough.
-	 * It is highly unlikely we will execute every function in
-	 * the kernel.
-	 */
-	functions = 20000;
-#endif
-
-	pg = stat->start = stat->pages;
-
-	pages = DIV_ROUND_UP(functions, PROFILES_PER_PAGE);
-
-	for (i = 0; i < pages; i++) {
-		pg->next = (void *)get_zeroed_page(GFP_KERNEL);
-		if (!pg->next)
-			goto out_free;
-		pg = pg->next;
-	}
-
-	return 0;
-
- out_free:
-	pg = stat->start;
-	while (pg) {
-		unsigned long tmp = (unsigned long)pg;
-
-		pg = pg->next;
-		free_page(tmp);
-	}
-
-	free_page((unsigned long)stat->pages);
-	stat->pages = NULL;
-	stat->start = NULL;
-
-	return -ENOMEM;
-}
-
-static int ftrace_profile_init_cpu(int cpu)
-{
-	struct ftrace_profile_stat *stat;
-	int size;
-
-	stat = &per_cpu(ftrace_profile_stats, cpu);
-
-	if (stat->hash) {
-		/* If the profile is already created, simply reset it */
-		ftrace_profile_reset(stat);
-		return 0;
-	}
-
-	/*
-	 * We are profiling all functions, but usually only a few thousand
-	 * functions are hit. We'll make a hash of 1024 items.
-	 */
-	size = FTRACE_PROFILE_HASH_SIZE;
-
-	stat->hash = kzalloc(sizeof(struct hlist_head) * size, GFP_KERNEL);
-
-	if (!stat->hash)
-		return -ENOMEM;
-
-	if (!ftrace_profile_bits) {
-		size--;
-
-		for (; size; size >>= 1)
-			ftrace_profile_bits++;
-	}
-
-	/* Preallocate the function profiling pages */
-	if (ftrace_profile_pages_init(stat) < 0) {
-		kfree(stat->hash);
-		stat->hash = NULL;
-		return -ENOMEM;
-	}
-
-	return 0;
-}
-
-static int ftrace_profile_init(void)
-{
-	int cpu;
-	int ret = 0;
-
-	for_each_online_cpu(cpu) {
-		ret = ftrace_profile_init_cpu(cpu);
-		if (ret)
-			break;
-	}
-
-	return ret;
-}
-
-/* interrupts must be disabled */
-static struct ftrace_profile *
-ftrace_find_profiled_func(struct ftrace_profile_stat *stat, unsigned long ip)
-{
-	struct ftrace_profile *rec;
-	struct hlist_head *hhd;
-	struct hlist_node *n;
-	unsigned long key;
-
-	key = hash_long(ip, ftrace_profile_bits);
-	hhd = &stat->hash[key];
-
-	if (hlist_empty(hhd))
-		return NULL;
-
-	hlist_for_each_entry_rcu(rec, n, hhd, node) {
-		if (rec->ip == ip)
-			return rec;
-	}
-
-	return NULL;
-}
-
-static void ftrace_add_profile(struct ftrace_profile_stat *stat,
-			       struct ftrace_profile *rec)
-{
-	unsigned long key;
-
-	key = hash_long(rec->ip, ftrace_profile_bits);
-	hlist_add_head_rcu(&rec->node, &stat->hash[key]);
-}
-
-/*
- * The memory is already allocated, this simply finds a new record to use.
- */
-static struct ftrace_profile *
-ftrace_profile_alloc(struct ftrace_profile_stat *stat, unsigned long ip)
-{
-	struct ftrace_profile *rec = NULL;
-
-	/* prevent recursion (from NMIs) */
-	if (atomic_inc_return(&stat->disabled) != 1)
-		goto out;
-
-	/*
-	 * Try to find the function again since an NMI
-	 * could have added it
-	 */
-	rec = ftrace_find_profiled_func(stat, ip);
-	if (rec)
-		goto out;
-
-	if (stat->pages->index == PROFILES_PER_PAGE) {
-		if (!stat->pages->next)
-			goto out;
-		stat->pages = stat->pages->next;
-	}
-
-	rec = &stat->pages->records[stat->pages->index++];
-	rec->ip = ip;
-	ftrace_add_profile(stat, rec);
-
- out:
-	atomic_dec(&stat->disabled);
-
-	return rec;
-}
-
 static void
 function_profile_call(unsigned long ip, unsigned long parent_ip)
 {
-	struct ftrace_profile_stat *stat;
-	struct ftrace_profile *rec;
+	struct func_hlist *hlist;
+	struct func_node *rec;
 	unsigned long flags;
 
 	if (!ftrace_profile_enabled)
@@ -615,13 +386,13 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
 
 	local_irq_save(flags);
 
-	stat = &__get_cpu_var(ftrace_profile_stats);
-	if (!stat->hash || !ftrace_profile_enabled)
+	hlist = &__get_cpu_var(func_hlist_cpu);
+	if (!hlist->hash || !ftrace_profile_enabled)
 		goto out;
 
-	rec = ftrace_find_profiled_func(stat, ip);
+	rec = function_find_hlist_node(hlist, ip);
 	if (!rec) {
-		rec = ftrace_profile_alloc(stat, ip);
+		rec = function_hlist_record_alloc(hlist, ip);
 		if (!rec)
 			goto out;
 	}
@@ -640,14 +411,14 @@ static int profile_graph_entry(struct ftrace_graph_ent *trace)
 
 static void profile_graph_return(struct ftrace_graph_ret *trace)
 {
-	struct ftrace_profile_stat *stat;
+	struct func_hlist *hlist;
 	unsigned long long calltime;
-	struct ftrace_profile *rec;
+	struct func_node *rec;
 	unsigned long flags;
 
 	local_irq_save(flags);
-	stat = &__get_cpu_var(ftrace_profile_stats);
-	if (!stat->hash || !ftrace_profile_enabled)
+	hlist = &__get_cpu_var(func_hlist_cpu);
+	if (!hlist->hash || !ftrace_profile_enabled)
 		goto out;
 
 	calltime = trace->rettime - trace->calltime;
@@ -667,7 +438,7 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
 			calltime = 0;
 	}
 
-	rec = ftrace_find_profiled_func(stat, trace->func);
+	rec = function_find_hlist_node(hlist, trace->func);
 	if (rec)
 		rec->time += calltime;
 
@@ -727,7 +498,7 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
 	mutex_lock(&ftrace_profile_lock);
 	if (ftrace_profile_enabled ^ val) {
 		if (val) {
-			ret = ftrace_profile_init();
+			ret = function_hlist_init();
 			if (ret < 0) {
 				cnt = ret;
 				goto out;
@@ -785,14 +556,14 @@ static struct tracer_stat function_stats __initdata = {
 
 static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
 {
-	struct ftrace_profile_stat *stat;
+	struct func_hlist *hlist;
 	struct dentry *entry;
 	char *name;
 	int ret;
 	int cpu;
 
 	for_each_possible_cpu(cpu) {
-		stat = &per_cpu(ftrace_profile_stats, cpu);
+		hlist = &per_cpu(func_hlist_cpu, cpu);
 
 		/* allocate enough for function name + cpu number */
 		name = kmalloc(32, GFP_KERNEL);
@@ -806,10 +577,10 @@ static __init void ftrace_profile_debugfs(struct dentry *d_tracer)
 			     cpu);
 			return;
 		}
-		stat->stat = function_stats;
+		hlist->stat = function_stats;
 		snprintf(name, 32, "function%d", cpu);
-		stat->stat.name = name;
-		ret = register_stat_tracer(&stat->stat);
+		hlist->stat.name = name;
+		ret = register_stat_tracer(&hlist->stat);
 		if (ret) {
 			WARN(1,
 			     "Could not register function stat for cpu %d\n",
diff --git a/kernel/trace/functions_hlist.c b/kernel/trace/functions_hlist.c
new file mode 100644
index 0000000..37804c4
--- /dev/null
+++ b/kernel/trace/functions_hlist.c
@@ -0,0 +1,217 @@
+/*
+ * Copyright (C) 2007-2010 Steven Rostedt <srostedt@redhat.com>
+ *
+ * Extracted from function profiling in ftrace.c, generalize naming:
+ * 	Copyright (C) 2010 Frederic Weisbecker <fweisbec@gmail.com>
+ */
+
+#include <linux/gfp.h>
+#include "functions_hlist.h"
+#include "trace.h"
+
+#define FUNCTIONS_RECORDS_SIZE						\
+	(PAGE_SIZE - offsetof(struct func_hlist_page, records))
+
+#define RECORDS_PER_PAGE					\
+	(FUNCTIONS_RECORDS_SIZE / sizeof(struct func_node))
+
+#define FUNCTIONS_HLIST_SIZE 1024 /* must be power of 2 */
+
+DEFINE_PER_CPU(struct func_hlist, func_hlist_cpu);
+
+int functions_hash_bits __read_mostly;
+
+static void function_hlist_reset(struct func_hlist *hlist)
+{
+	struct func_hlist_page *pg;
+
+	pg = hlist->pages = hlist->start;
+
+	while (pg) {
+		memset(pg->records, 0, FUNCTIONS_RECORDS_SIZE);
+		pg->index = 0;
+		pg = pg->next;
+	}
+
+	memset(hlist->hash, 0,
+	       FUNCTIONS_HLIST_SIZE * sizeof(struct hlist_head));
+}
+
+static int function_hlist_pages_init(struct func_hlist *hlist)
+{
+	struct func_hlist_page *pg;
+	int functions;
+	int pages;
+	int i;
+
+	/* If we already allocated, do nothing */
+	if (hlist->pages)
+		return 0;
+
+	hlist->pages = (void *)get_zeroed_page(GFP_KERNEL);
+	if (!hlist->pages)
+		return -ENOMEM;
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+	functions = ftrace_update_tot_cnt;
+#else
+	/*
+	 * We do not know the number of functions that exist because
+	 * dynamic tracing is what counts them. With past experience
+	 * we have around 20K functions. That should be more than enough.
+	 * It is highly unlikely we will execute every function in
+	 * the kernel.
+	 */
+	functions = 20000;
+#endif
+
+	pg = hlist->start = hlist->pages;
+
+	pages = DIV_ROUND_UP(functions, RECORDS_PER_PAGE);
+
+	for (i = 0; i < pages; i++) {
+		pg->next = (void *)get_zeroed_page(GFP_KERNEL);
+		if (!pg->next)
+			goto out_free;
+		pg = pg->next;
+	}
+
+	return 0;
+
+ out_free:
+	pg = hlist->start;
+	while (pg) {
+		unsigned long tmp = (unsigned long)pg;
+
+		pg = pg->next;
+		free_page(tmp);
+	}
+
+	free_page((unsigned long)hlist->pages);
+	hlist->pages = NULL;
+	hlist->start = NULL;
+
+	return -ENOMEM;
+}
+
+static int function_hlist_init_cpu(int cpu)
+{
+	struct func_hlist *hlist;
+	int size;
+
+	hlist = &per_cpu(func_hlist_cpu, cpu);
+
+	if (hlist->hash) {
+		/* If the profile is already created, simply reset it */
+		function_hlist_reset(hlist);
+		return 0;
+	}
+
+	/*
+	 * We are profiling all functions, but usually only a few thousand
+	 * functions are hit. We'll make a hash of 1024 items.
+	 */
+	size = FUNCTIONS_HLIST_SIZE;
+
+	hlist->hash = kzalloc(sizeof(struct hlist_head) * size, GFP_KERNEL);
+
+	if (!hlist->hash)
+		return -ENOMEM;
+
+	if (!functions_hash_bits) {
+		size--;
+
+		for (; size; size >>= 1)
+			functions_hash_bits++;
+	}
+
+	/* Preallocate the function profiling pages */
+	if (function_hlist_pages_init(hlist) < 0) {
+		kfree(hlist->hash);
+		hlist->hash = NULL;
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+int function_hlist_init(void)
+{
+	int cpu;
+	int ret = 0;
+
+	for_each_online_cpu(cpu) {
+		ret = function_hlist_init_cpu(cpu);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+struct func_node *
+function_find_hlist_node(struct func_hlist *hlist, unsigned long ip)
+{
+	struct func_node *rec;
+	struct hlist_head *hhd;
+	struct hlist_node *n;
+	unsigned long key;
+
+	key = hash_long(ip, functions_hash_bits);
+	hhd = &hlist->hash[key];
+
+	if (hlist_empty(hhd))
+		return NULL;
+
+	hlist_for_each_entry_rcu(rec, n, hhd, node) {
+		if (rec->ip == ip)
+			return rec;
+	}
+
+	return NULL;
+}
+
+static void function_hlist_add(struct func_hlist *hlist,
+			       struct func_node *rec)
+{
+	unsigned long key;
+
+	key = hash_long(rec->ip, functions_hash_bits);
+	hlist_add_head_rcu(&rec->node, &hlist->hash[key]);
+}
+
+/*
+ * The memory is already allocated, this simply finds a new record to use.
+ */
+struct func_node *
+function_hlist_record_alloc(struct func_hlist *hlist, unsigned long ip)
+{
+	struct func_node *rec = NULL;
+
+	/* prevent recursion (from NMIs) */
+	if (atomic_inc_return(&hlist->disabled) != 1)
+		goto out;
+
+	/*
+	 * Try to find the function again since an NMI
+	 * could have added it
+	 */
+	rec = function_find_hlist_node(hlist, ip);
+	if (rec)
+		goto out;
+
+	if (hlist->pages->index == RECORDS_PER_PAGE) {
+		if (!hlist->pages->next)
+			goto out;
+		hlist->pages = hlist->pages->next;
+	}
+
+	rec = &hlist->pages->records[hlist->pages->index++];
+	rec->ip = ip;
+	function_hlist_add(hlist, rec);
+
+ out:
+	atomic_dec(&hlist->disabled);
+
+	return rec;
+}
diff --git a/kernel/trace/functions_hlist.h b/kernel/trace/functions_hlist.h
new file mode 100644
index 0000000..3f4e485
--- /dev/null
+++ b/kernel/trace/functions_hlist.h
@@ -0,0 +1,38 @@
+#include <linux/hash.h>
+#include <linux/list.h>
+#include "trace_stat.h"
+
+struct func_node {
+	struct hlist_node		node;
+	unsigned long			ip;
+	unsigned long			counter;
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	unsigned long long		time;
+#endif
+};
+
+struct func_hlist_page {
+	struct func_hlist_page		*next;
+	unsigned long			index;
+	struct func_node		records[];
+};
+
+struct func_hlist {
+	atomic_t			disabled;
+	struct hlist_head		*hash;
+	struct func_hlist_page		*pages;
+	struct func_hlist_page		*start;
+	struct tracer_stat		stat;
+};
+
+DECLARE_PER_CPU(struct func_hlist, func_hlist_cpu);
+
+extern int functions_hash_bits;
+
+struct func_node *
+function_find_hlist_node(struct func_hlist *hlist, unsigned long ip);
+
+struct func_node *
+function_hlist_record_alloc(struct func_hlist *hlist, unsigned long ip);
+
+int function_hlist_init(void);
-- 
1.6.2.3



* [RFC PATCH 02/10] ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 01/10] ftrace: Generalize the function hashlist from function profiler Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-22  1:51   ` Steven Rostedt
  2010-01-22  1:16 ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

When we run under dynamic tracing, we know that after calling
unregister_ftrace_graph(), tracing has really stopped because of
the hot patching and use of stop_machine().

But in the static tracing case, we only set stub callbacks. On
archs that have weak memory ordering, this is not sufficient to
assume the old callbacks won't be called right after we leave
unregister_ftrace_graph().

Insert a read/write memory barrier at the end of
unregister_ftrace_graph() so that the code that follows can safely
assume tracing has really stopped. This saves the tracing callbacks
from having to check various states, such as whether the needed
buffers have been allocated.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index c050ce3..94117ec 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3128,6 +3128,18 @@ void unregister_ftrace_graph(void)
 
  out:
 	mutex_unlock(&ftrace_lock);
+
+	/*
+	 * In dynamic tracing case, tracing is really stopped here.
+	 * Otherwise we need to ensure ftrace_graph_return
+	 * and ftrace_graph_entry are seen uptodate, so that code
+	 * that follows unregister_ftrace_graph can safely assume
+	 * tracing has stopped. This avoid wasteful checks of tracing
+	 * states inside tracing callbacks.
+	 */
+#ifndef CONFIG_DYNAMIC_FTRACE
+	smp_mb();
+#endif
 }
 
 /* Allocate a return stack for newly created task */
-- 
1.6.2.3



* [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 01/10] ftrace: Generalize the function hashlist from function profiler Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 02/10] ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-22  2:05   ` Steven Rostedt
  2010-01-22  1:16 ` [RFC PATCH 04/10] ftrace: Ensure buffers are visible to tracing callbacks right away Frederic Weisbecker
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

Every time we enter the function profiler tracing callbacks, we first
check if the function profiling is enabled.

This check is useless because we register the function graph
callbacks only after the hashlist has been initialized.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 94117ec..f258f67 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -381,13 +381,10 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
 	struct func_node *rec;
 	unsigned long flags;
 
-	if (!ftrace_profile_enabled)
-		return;
-
 	local_irq_save(flags);
 
 	hlist = &__get_cpu_var(func_hlist_cpu);
-	if (!hlist->hash || !ftrace_profile_enabled)
+	if (!hlist->hash)
 		goto out;
 
 	rec = function_find_hlist_node(hlist, ip);
@@ -418,7 +415,7 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
 
 	local_irq_save(flags);
 	hlist = &__get_cpu_var(func_hlist_cpu);
-	if (!hlist->hash || !ftrace_profile_enabled)
+	if (!hlist->hash)
 		goto out;
 
 	calltime = trace->rettime - trace->calltime;
-- 
1.6.2.3



* [RFC PATCH 04/10] ftrace: Ensure buffers are visible to tracing callbacks right away
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2010-01-22  1:16 ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 05/10] ftrace: Drop buffer check in function profiler callbacks Frederic Weisbecker
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

A tracer may need to allocate specific buffers before registering
its function tracing callbacks, and these callbacks may then use
those buffers. Because of lazy memory ordering, the callbacks may
otherwise need to check the state of the buffers before doing
anything, which is wasteful on the tracing hot path.

Let's then put an appropriate memory barrier in place before
registering the tracing callbacks.
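
From the tracer's point of view, the pattern this enables looks roughly
like the sketch below (my_tracer_start(), my_data, alloc_my_buffer(),
my_return() and my_entry() are hypothetical; only register_ftrace_graph()
and the barrier it now contains come from this patch):

	static int my_tracer_start(void)
	{
		my_data.buf = alloc_my_buffer();	/* hypothetical allocation */
		if (!my_data.buf)
			return -ENOMEM;

		/*
		 * register_ftrace_graph() now issues an smp_mb() before
		 * installing the callbacks, so my_return()/my_entry() can use
		 * my_data.buf without re-checking it on every hit.
		 */
		return register_ftrace_graph(my_return, my_entry);
	}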

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index f258f67..3cdb35e 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3099,6 +3099,14 @@ int register_ftrace_graph(trace_func_graph_ret_t retfunc,
 		goto out;
 	}
 
+	/*
+	 * The tracing callbacks might use buffers that the tracer could
+	 * have allocated just before. Ensure this is all visible before
+	 * starting the tracing, so that we avoid null pointer checks
+	 * in the tracing hot paths.
+	 */
+	smp_mb();
+
 	ftrace_graph_return = retfunc;
 	ftrace_graph_entry = entryfunc;
 
-- 
1.6.2.3



* [RFC PATCH 05/10] ftrace: Drop buffer check in function profiler callbacks
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2010-01-22  1:16 ` [RFC PATCH 04/10] ftrace: Ensure buffers are visible to tracing callbacks right away Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-25  6:17   ` Lai Jiangshan
  2010-01-22  1:16 ` [RFC PATCH 06/10] ftrace: Release the function hlist if we don't need it anymore Frederic Weisbecker
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

Drop the NULL check on hlist->hash. It is wasteful because
we don't register the tracer if the buffer allocation failed,
and the memory barrier in register_ftrace_graph() ensures it
is visible to the callbacks right away.

Also we know the tracing callbacks won't be called after
unregister_ftrace_graph(), so subsequent buffer resets are safe
too.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c |    5 -----
 1 files changed, 0 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 3cdb35e..dfd8f7c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -384,8 +384,6 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
 	local_irq_save(flags);
 
 	hlist = &__get_cpu_var(func_hlist_cpu);
-	if (!hlist->hash)
-		goto out;
 
 	rec = function_find_hlist_node(hlist, ip);
 	if (!rec) {
@@ -415,8 +413,6 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
 
 	local_irq_save(flags);
 	hlist = &__get_cpu_var(func_hlist_cpu);
-	if (!hlist->hash)
-		goto out;
 
 	calltime = trace->rettime - trace->calltime;
 
@@ -439,7 +435,6 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
 	if (rec)
 		rec->time += calltime;
 
- out:
 	local_irq_restore(flags);
 }
 
-- 
1.6.2.3



* [RFC PATCH 06/10] ftrace: Release the function hlist if we don't need it anymore
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2010-01-22  1:16 ` [RFC PATCH 05/10] ftrace: Drop buffer check in function profiler callbacks Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-25  6:41   ` Lai Jiangshan
  2010-01-22  1:16 ` [RFC PATCH 07/10] ftrace: Make the function hashlist concurrently usable Frederic Weisbecker
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

After we disable the function profiler, the function hashlist
stays in memory. This is wasteful, as nobody needs it anymore
until the next use, if any.

Release it when we disable the function profiler instead of
resetting it on the next use.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c          |    1 +
 kernel/trace/functions_hlist.c |   61 +++++++++++++++++-----------------------
 kernel/trace/functions_hlist.h |    1 +
 3 files changed, 28 insertions(+), 35 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index dfd8f7c..0ded01c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -509,6 +509,7 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
 			 * so this acts like an synchronize_sched.
 			 */
 			unregister_ftrace_profiler();
+			function_hlist_release();
 		}
 	}
  out:
diff --git a/kernel/trace/functions_hlist.c b/kernel/trace/functions_hlist.c
index 37804c4..c79c4c5 100644
--- a/kernel/trace/functions_hlist.c
+++ b/kernel/trace/functions_hlist.c
@@ -21,20 +21,23 @@ DEFINE_PER_CPU(struct func_hlist, func_hlist_cpu);
 
 int functions_hash_bits __read_mostly;
 
-static void function_hlist_reset(struct func_hlist *hlist)
+static void __function_hlist_release(struct func_hlist *hlist)
 {
-	struct func_hlist_page *pg;
-
-	pg = hlist->pages = hlist->start;
+	struct func_hlist_page *pg = hlist->start;
 
 	while (pg) {
-		memset(pg->records, 0, FUNCTIONS_RECORDS_SIZE);
-		pg->index = 0;
+		unsigned long tmp = (unsigned long)pg;
+
 		pg = pg->next;
+		free_page(tmp);
 	}
 
-	memset(hlist->hash, 0,
-	       FUNCTIONS_HLIST_SIZE * sizeof(struct hlist_head));
+	free_page((unsigned long)hlist->pages);
+	hlist->pages = NULL;
+	hlist->start = NULL;
+
+	kfree(hlist->hash);
+	hlist->hash = NULL;
 }
 
 static int function_hlist_pages_init(struct func_hlist *hlist)
@@ -44,10 +47,6 @@ static int function_hlist_pages_init(struct func_hlist *hlist)
 	int pages;
 	int i;
 
-	/* If we already allocated, do nothing */
-	if (hlist->pages)
-		return 0;
-
 	hlist->pages = (void *)get_zeroed_page(GFP_KERNEL);
 	if (!hlist->pages)
 		return -ENOMEM;
@@ -72,26 +71,11 @@ static int function_hlist_pages_init(struct func_hlist *hlist)
 	for (i = 0; i < pages; i++) {
 		pg->next = (void *)get_zeroed_page(GFP_KERNEL);
 		if (!pg->next)
-			goto out_free;
+			return -ENOMEM;
 		pg = pg->next;
 	}
 
 	return 0;
-
- out_free:
-	pg = hlist->start;
-	while (pg) {
-		unsigned long tmp = (unsigned long)pg;
-
-		pg = pg->next;
-		free_page(tmp);
-	}
-
-	free_page((unsigned long)hlist->pages);
-	hlist->pages = NULL;
-	hlist->start = NULL;
-
-	return -ENOMEM;
 }
 
 static int function_hlist_init_cpu(int cpu)
@@ -101,11 +85,8 @@ static int function_hlist_init_cpu(int cpu)
 
 	hlist = &per_cpu(func_hlist_cpu, cpu);
 
-	if (hlist->hash) {
-		/* If the profile is already created, simply reset it */
-		function_hlist_reset(hlist);
-		return 0;
-	}
+	if (WARN_ON_ONCE(hlist->hash))
+		return -EBUSY;
 
 	/*
 	 * We are profiling all functions, but usually only a few thousand
@@ -127,14 +108,24 @@ static int function_hlist_init_cpu(int cpu)
 
 	/* Preallocate the function profiling pages */
 	if (function_hlist_pages_init(hlist) < 0) {
-		kfree(hlist->hash);
-		hlist->hash = NULL;
+		__function_hlist_release(hlist);
 		return -ENOMEM;
 	}
 
 	return 0;
 }
 
+void function_hlist_release(void)
+{
+	int cpu;
+	struct func_hlist *hlist;
+
+	for_each_online_cpu(cpu) {
+		hlist = &per_cpu(func_hlist_cpu, cpu);
+		__function_hlist_release(hlist);
+	}
+}
+
 int function_hlist_init(void)
 {
 	int cpu;
diff --git a/kernel/trace/functions_hlist.h b/kernel/trace/functions_hlist.h
index 3f4e485..8001f95 100644
--- a/kernel/trace/functions_hlist.h
+++ b/kernel/trace/functions_hlist.h
@@ -36,3 +36,4 @@ struct func_node *
 function_hlist_record_alloc(struct func_hlist *hlist, unsigned long ip);
 
 int function_hlist_init(void);
+void function_hlist_release(void);
-- 
1.6.2.3



* [RFC PATCH 07/10] ftrace: Make the function hashlist concurrently usable
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2010-01-22  1:16 ` [RFC PATCH 06/10] ftrace: Release the function hlist if we don't need it anymore Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-22  1:16 ` [PATCH 08/10] tracing: Simplify test for function_graph tracing Frederic Weisbecker
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

Allow the function hashlist to be used by simultaneous users, so
that the function tracers can plug into it for their own needs.

This introduces get_function_hlist()/put_function_hlist(), which
keep track of the hashlist users for allocation and release.

We also introduce function_hlist_reset_profile(), which resets
the function profiling statistics on a function hashlist already
in use.
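
As a reference, here is a minimal sketch of how a user pairs the
refcount, mirroring what ftrace_profile_write() does in this patch
(my_tracer_init() and register_my_callbacks() are hypothetical
placeholders):

	static int my_tracer_init(void)
	{
		int ret;

		/* the first user allocates the per-cpu hashlists */
		ret = get_function_hlist();
		if (ret < 0)
			return ret;

		ret = register_my_callbacks();	/* hypothetical registration step */
		if (ret)
			put_function_hlist();	/* the last user releases them */

		return ret;
	}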

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c          |    5 ++-
 kernel/trace/functions_hlist.c |   56 ++++++++++++++++++++++++++++++++++++++-
 kernel/trace/functions_hlist.h |    5 ++-
 3 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 0ded01c..027743c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -490,11 +490,12 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
 	mutex_lock(&ftrace_profile_lock);
 	if (ftrace_profile_enabled ^ val) {
 		if (val) {
-			ret = function_hlist_init();
+			ret = get_function_hlist();
 			if (ret < 0) {
 				cnt = ret;
 				goto out;
 			}
+			function_hlist_reset_profile();
 
 			ret = register_ftrace_profiler();
 			if (ret < 0) {
@@ -509,7 +510,7 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
 			 * so this acts like an synchronize_sched.
 			 */
 			unregister_ftrace_profiler();
-			function_hlist_release();
+			put_function_hlist();
 		}
 	}
  out:
diff --git a/kernel/trace/functions_hlist.c b/kernel/trace/functions_hlist.c
index c79c4c5..d682213 100644
--- a/kernel/trace/functions_hlist.c
+++ b/kernel/trace/functions_hlist.c
@@ -20,6 +20,39 @@
 DEFINE_PER_CPU(struct func_hlist, func_hlist_cpu);
 
 int functions_hash_bits __read_mostly;
+static atomic_t hlist_refcount;
+
+
+static void function_hlist_reset_profile_cpu(int cpu)
+{
+	struct func_hlist *hlist = &per_cpu(func_hlist_cpu, cpu);
+	struct hlist_head *head;
+	struct hlist_node *node;
+	struct func_node *rec;
+	int i;
+
+	for (i = 0; i < FUNCTIONS_HLIST_SIZE; i++) {
+		head = &hlist->hash[i];
+
+		if (hlist_empty(head))
+			continue;
+
+		hlist_for_each_entry(rec, node, head, node) {
+			rec->counter = 0;
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+			rec->time = 0;
+#endif
+		}
+	}
+}
+
+void function_hlist_reset_profile(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		function_hlist_reset_profile_cpu(cpu);
+}
 
 static void __function_hlist_release(struct func_hlist *hlist)
 {
@@ -115,7 +148,7 @@ static int function_hlist_init_cpu(int cpu)
 	return 0;
 }
 
-void function_hlist_release(void)
+static void function_hlist_release(void)
 {
 	int cpu;
 	struct func_hlist *hlist;
@@ -126,7 +159,7 @@ void function_hlist_release(void)
 	}
 }
 
-int function_hlist_init(void)
+static int function_hlist_init(void)
 {
 	int cpu;
 	int ret = 0;
@@ -140,6 +173,25 @@ int function_hlist_init(void)
 	return ret;
 }
 
+int get_function_hlist(void)
+{
+	int ret = 0;
+
+	if (atomic_inc_return(&hlist_refcount) == 1) {
+		ret = function_hlist_init();
+		if (ret < 0)
+			atomic_dec(&hlist_refcount);
+	}
+
+	return ret;
+}
+
+void put_function_hlist(void)
+{
+	if (!atomic_dec_return(&hlist_refcount))
+		function_hlist_release();
+}
+
 struct func_node *
 function_find_hlist_node(struct func_hlist *hlist, unsigned long ip)
 {
diff --git a/kernel/trace/functions_hlist.h b/kernel/trace/functions_hlist.h
index 8001f95..a4655c7 100644
--- a/kernel/trace/functions_hlist.h
+++ b/kernel/trace/functions_hlist.h
@@ -35,5 +35,6 @@ function_find_hlist_node(struct func_hlist *hlist, unsigned long ip);
 struct func_node *
 function_hlist_record_alloc(struct func_hlist *hlist, unsigned long ip);
 
-int function_hlist_init(void);
-void function_hlist_release(void);
+int get_function_hlist(void);
+void put_function_hlist(void);
+void function_hlist_reset_profile(void);
-- 
1.6.2.3



* [PATCH 08/10] tracing: Simplify test for function_graph tracing
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2010-01-22  1:16 ` [RFC PATCH 07/10] ftrace: Make the function hashlist concurrently usable Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 09/10] tracing: Use the hashlist for graph function Frederic Weisbecker
  2010-01-22  1:16 ` [RFC PATCH 10/10] ftrace: Factorize search and insertion in the function hashlist Frederic Weisbecker
  9 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Lai Jiangshan, Steven Rostedt, Li Zefan,
	Frederic Weisbecker

From: Lai Jiangshan <laijs@cn.fujitsu.com>

For function_graph, a calling function is to be traced only when
it is enabled or it is nested in an enabled function.

The current code uses TSK_TRACE_FL_GRAPH to test whether it is
nested or not. Looking at the code, we can see that:
(trace->depth > 0) <==> (TSK_TRACE_FL_GRAPH is set)

trace->depth states more explicitly that we are nested,
so we use trace->depth directly and simplify the code.

No functionality is changed.
TSK_TRACE_FL_GRAPH is not removed for now; it is left for future use.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4B4DB0B6.7040607@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/trace/trace.h                 |    2 +-
 kernel/trace/trace_functions_graph.c |    8 ++------
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 4df6a77..ce077fb 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -504,7 +504,7 @@ static inline int ftrace_graph_addr(unsigned long addr)
 {
 	int i;
 
-	if (!ftrace_graph_count || test_tsk_trace_graph(current))
+	if (!ftrace_graph_count)
 		return 1;
 
 	for (i = 0; i < ftrace_graph_count; i++) {
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index f225229..616b135 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -215,7 +215,8 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
 	if (!ftrace_trace_task(current))
 		return 0;
 
-	if (!ftrace_graph_addr(trace->func))
+	/* trace it when it is-nested-in or is a function enabled. */
+	if (!(trace->depth || ftrace_graph_addr(trace->func)))
 		return 0;
 
 	local_irq_save(flags);
@@ -228,9 +229,6 @@ int trace_graph_entry(struct ftrace_graph_ent *trace)
 	} else {
 		ret = 0;
 	}
-	/* Only do the atomic if it is not already set */
-	if (!test_tsk_trace_graph(current))
-		set_tsk_trace_graph(current);
 
 	atomic_dec(&data->disabled);
 	local_irq_restore(flags);
@@ -278,8 +276,6 @@ void trace_graph_return(struct ftrace_graph_ret *trace)
 		pc = preempt_count();
 		__trace_graph_return(tr, trace, flags, pc);
 	}
-	if (!trace->depth)
-		clear_tsk_trace_graph(current);
 	atomic_dec(&data->disabled);
 	local_irq_restore(flags);
 }
-- 
1.6.2.3



* [RFC PATCH 09/10] tracing: Use the hashlist for graph function
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2010-01-22  1:16 ` [PATCH 08/10] tracing: Simplify test for function_graph tracing Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  2010-01-25  8:50   ` Lai Jiangshan
  2010-01-22  1:16 ` [RFC PATCH 10/10] ftrace: Factorize search and insertion in the function hashlist Frederic Weisbecker
  9 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

When we set a filter to start tracing from a given function in
the function graph tracer, the filter is stored in a linear array.

This doesn't scale very well; we even limit the number of such
functions to 32.

Now that we have a hashlist of functions, let's put a field inside
each function node so that we can check whether a function is one of
these filters using the hashlist instead of a linear array.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c                |   61 ++++++++++++++++++++++++++++++++++
 kernel/trace/functions_hlist.c       |   29 ++++++++++++++++
 kernel/trace/functions_hlist.h       |    2 +
 kernel/trace/trace.h                 |   21 +-----------
 kernel/trace/trace_functions_graph.c |   41 ++++++++++++++++++++++-
 5 files changed, 133 insertions(+), 21 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 027743c..d719078 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2193,6 +2193,67 @@ static DEFINE_MUTEX(graph_lock);
 int ftrace_graph_count;
 unsigned long ftrace_graph_funcs[FTRACE_GRAPH_MAX_FUNCS] __read_mostly;
 
+static int
+change_graph_function_hlist(unsigned long ip, int val, struct func_hlist *hlist)
+{
+	struct func_node *rec;
+
+	rec = function_find_hlist_node(hlist, ip);
+	if (!rec) {
+		rec = function_hlist_record_alloc(hlist, ip);
+		if (!rec)
+			return 1;
+	}
+
+	rec->graph_start = val;
+
+	return 0;
+}
+
+static int change_graph_function(unsigned long ip, int val)
+{
+	int ret;
+	int cpu;
+	struct func_hlist *hlist;
+
+	for_each_online_cpu(cpu) {
+		hlist = &per_cpu(func_hlist_cpu, cpu);
+		ret = change_graph_function_hlist(ip, val, hlist);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static void clear_graph_functions(void)
+{
+	int i;
+
+	for (i = 0; i < ftrace_graph_count; i++)
+		change_graph_function(ftrace_graph_funcs[i], 0);
+}
+
+int set_graph_functions(void)
+{
+	int i;
+	int ret = 0;
+
+	mutex_lock(&graph_lock);
+
+	for (i = 0; i < ftrace_graph_count; i++) {
+		ret = change_graph_function(ftrace_graph_funcs[i], 1);
+		if (ret) {
+			clear_graph_functions();
+			break;
+		}
+	}
+
+	mutex_unlock(&graph_lock);
+
+	return 0;
+}
+
 static void *
 __g_next(struct seq_file *m, loff_t *pos)
 {
diff --git a/kernel/trace/functions_hlist.c b/kernel/trace/functions_hlist.c
index d682213..7a60265 100644
--- a/kernel/trace/functions_hlist.c
+++ b/kernel/trace/functions_hlist.c
@@ -54,6 +54,35 @@ void function_hlist_reset_profile(void)
 		function_hlist_reset_profile_cpu(cpu);
 }
 
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+static void function_hlist_reset_graph_cpu(int cpu)
+{
+	struct func_hlist *hlist = &per_cpu(func_hlist_cpu, cpu);
+	struct hlist_head *head;
+	struct hlist_node *node;
+	struct func_node *rec;
+	int i;
+
+	for (i = 0; i < FUNCTIONS_HLIST_SIZE; i++) {
+		head = &hlist->hash[i];
+
+		if (hlist_empty(head))
+			continue;
+
+		hlist_for_each_entry(rec, node, head, node)
+			rec->graph_start = 0;
+	}
+}
+
+void function_hlist_reset_graph(void)
+{
+	int cpu;
+
+	for_each_online_cpu(cpu)
+		function_hlist_reset_graph_cpu(cpu);
+}
+#endif
+
 static void __function_hlist_release(struct func_hlist *hlist)
 {
 	struct func_hlist_page *pg = hlist->start;
diff --git a/kernel/trace/functions_hlist.h b/kernel/trace/functions_hlist.h
index a4655c7..39d89b4 100644
--- a/kernel/trace/functions_hlist.h
+++ b/kernel/trace/functions_hlist.h
@@ -8,6 +8,7 @@ struct func_node {
 	unsigned long			counter;
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	unsigned long long		time;
+	int				graph_start;
 #endif
 };
 
@@ -38,3 +39,4 @@ function_hlist_record_alloc(struct func_hlist *hlist, unsigned long ip);
 int get_function_hlist(void);
 void put_function_hlist(void);
 void function_hlist_reset_profile(void);
+void function_hlist_reset_graph(void);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index ce077fb..ff0b01a 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -499,26 +499,7 @@ trace_print_graph_duration(unsigned long long duration, struct trace_seq *s);
 #define FTRACE_GRAPH_MAX_FUNCS		32
 extern int ftrace_graph_count;
 extern unsigned long ftrace_graph_funcs[FTRACE_GRAPH_MAX_FUNCS];
-
-static inline int ftrace_graph_addr(unsigned long addr)
-{
-	int i;
-
-	if (!ftrace_graph_count)
-		return 1;
-
-	for (i = 0; i < ftrace_graph_count; i++) {
-		if (addr == ftrace_graph_funcs[i])
-			return 1;
-	}
-
-	return 0;
-}
-#else
-static inline int ftrace_graph_addr(unsigned long addr)
-{
-	return 1;
-}
+int set_graph_functions(void);
 #endif /* CONFIG_DYNAMIC_FTRACE */
 #else /* CONFIG_FUNCTION_GRAPH_TRACER */
 static inline enum print_line_t
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 616b135..da24add 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -13,6 +13,7 @@
 
 #include "trace.h"
 #include "trace_output.h"
+#include "functions_hlist.h"
 
 struct fgraph_cpu_data {
 	pid_t		last_pid;
@@ -202,6 +203,33 @@ static int __trace_graph_entry(struct trace_array *tr,
 	return 1;
 }
 
+static inline int ftrace_graph_addr(unsigned long addr)
+{
+	struct func_node *rec;
+	struct func_hlist *hlist;
+
+	if (!ftrace_graph_count)
+		return 0;
+
+	hlist = &__get_cpu_var(func_hlist_cpu);
+
+	rec = function_find_hlist_node(hlist, addr);
+	if (!rec) {
+		/*
+		 * TODO: send a retrieval error event
+		 * to keep track of this.
+		 */
+		rec = function_hlist_record_alloc(hlist, addr);
+		if (!rec)
+			return 0;
+	}
+
+	if (rec->graph_start)
+		return 1;
+
+	return 0;
+}
+
 int trace_graph_entry(struct ftrace_graph_ent *trace)
 {
 	struct trace_array *tr = graph_array;
@@ -294,10 +322,20 @@ static int graph_trace_init(struct trace_array *tr)
 	int ret;
 
 	set_graph_array(tr);
+	ret = get_function_hlist();
+	if (ret)
+		return ret;
+
+	function_hlist_reset_graph();
+	set_graph_functions();
+
 	ret = register_ftrace_graph(&trace_graph_return,
 				    &trace_graph_entry);
-	if (ret)
+	if (ret) {
+		put_function_hlist();
 		return ret;
+	}
+
 	tracing_start_cmdline_record();
 
 	return 0;
@@ -307,6 +345,7 @@ static void graph_trace_reset(struct trace_array *tr)
 {
 	tracing_stop_cmdline_record();
 	unregister_ftrace_graph();
+	put_function_hlist();
 }
 
 static int max_bytes_for_cpu;
-- 
1.6.2.3



* [RFC PATCH 10/10] ftrace: Factorize search and insertion in the function hashlist
  2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2010-01-22  1:16 ` [RFC PATCH 09/10] tracing: Use the hashlist for graph function Frederic Weisbecker
@ 2010-01-22  1:16 ` Frederic Weisbecker
  9 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  1:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Steven Rostedt, Li Zefan,
	Lai Jiangshan

Factorize search and insertion operations in the function hashlist
to remove some redundant code.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 kernel/trace/ftrace.c                |   18 ++++++------------
 kernel/trace/functions_hlist.h       |   12 ++++++++++++
 kernel/trace/trace_functions_graph.c |   17 +++++++----------
 3 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index d719078..785d077 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -385,12 +385,9 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
 
 	hlist = &__get_cpu_var(func_hlist_cpu);
 
-	rec = function_find_hlist_node(hlist, ip);
-	if (!rec) {
-		rec = function_hlist_record_alloc(hlist, ip);
-		if (!rec)
-			goto out;
-	}
+	rec = function_get_hlist_node(hlist, ip);
+	if (!rec)
+		goto out;
 
 	rec->counter++;
  out:
@@ -2198,12 +2195,9 @@ change_graph_function_hlist(unsigned long ip, int val, struct func_hlist *hlist)
 {
 	struct func_node *rec;
 
-	rec = function_find_hlist_node(hlist, ip);
-	if (!rec) {
-		rec = function_hlist_record_alloc(hlist, ip);
-		if (!rec)
-			return 1;
-	}
+	rec = function_get_hlist_node(hlist, ip);
+	if (!rec)
+		return 1;
 
 	rec->graph_start = val;
 
diff --git a/kernel/trace/functions_hlist.h b/kernel/trace/functions_hlist.h
index 39d89b4..3baf4bb 100644
--- a/kernel/trace/functions_hlist.h
+++ b/kernel/trace/functions_hlist.h
@@ -40,3 +40,15 @@ int get_function_hlist(void);
 void put_function_hlist(void);
 void function_hlist_reset_profile(void);
 void function_hlist_reset_graph(void);
+
+static inline struct func_node *
+function_get_hlist_node(struct func_hlist *hlist, unsigned long ip)
+{
+	struct func_node *rec;
+
+	rec = function_find_hlist_node(hlist, ip);
+	if (!rec)
+		rec = function_hlist_record_alloc(hlist, ip);
+
+	return rec;
+}
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index da24add..0fa0691 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -213,16 +213,13 @@ static inline int ftrace_graph_addr(unsigned long addr)
 
 	hlist = &__get_cpu_var(func_hlist_cpu);
 
-	rec = function_find_hlist_node(hlist, addr);
-	if (!rec) {
-		/*
-		 * TODO: send a retrieval error event
-		 * to keep track of this.
-		 */
-		rec = function_hlist_record_alloc(hlist, addr);
-		if (!rec)
-			return 0;
-	}
+	/*
+	 * TODO: send a retrieval error event
+	 * to keep track of failures
+	 */
+	rec = function_get_hlist_node(hlist, addr);
+	if (!rec)
+		return 0;
 
 	if (rec->graph_start)
 		return 1;
-- 
1.6.2.3



* Re: [RFC PATCH 02/10] ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph
  2010-01-22  1:16 ` [RFC PATCH 02/10] ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph Frederic Weisbecker
@ 2010-01-22  1:51   ` Steven Rostedt
  2010-01-22  2:04     ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2010-01-22  1:51 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, LKML, Li Zefan, Lai Jiangshan

On Fri, 2010-01-22 at 02:16 +0100, Frederic Weisbecker wrote:
> When we run under dynamic tracing, we know that after calling
> unregister_ftrace_graph(), tracing has really stopped because of
> the hot patching and use of stop_machine().

This is incorrect. Even after unregister_ftrace_graph() with
stop_machine(), we still have no guarantee that a callback is not being
called. This is the reason I use stubs instead of NULLs. The address of
the trace function could have been loaded into a register and the task
then preempted. Even after stop_machine() that trace function can be
called. This is also the reason that I never let modules add hooks to
the function tracer (although I can easily make a wrapper to do so).
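
Roughly, the window looks like this (a simplified sketch, not the real
mcount path; call_graph_entry() is made up, only ftrace_graph_entry and
its trace_func_graph_ent_t type are real):

	/* Simplified sketch of the window; not the real mcount path */
	static void call_graph_entry(struct ftrace_graph_ent *trace)
	{
		trace_func_graph_ent_t entry = ftrace_graph_entry;	/* old value read */

		/*
		 * <-- preempted here; unregister_ftrace_graph() runs to
		 *     completion on another CPU, stop_machine() or not -->
		 */
		entry(trace);		/* the "unregistered" callback still runs */
	}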

> 
> But in the static tracing case, we only set stub callbacks. On
> archs that have weak memory ordering, this is not sufficient to
> assume the old callbacks won't be called right after we leave
> unregister_ftrace_graph().
> 
> Insert a read/write memory barrier at the end of
> unregister_ftrace_graph() so that the code that follows can safely
> assume tracing has really stopped. This saves the tracing callbacks
> from having to check various states, such as whether the needed
> buffers have been allocated.

There's no guarantee, even with an smp_mb(), that a trace function will
not be called after being unregistered.

-- Steve





* Re: [RFC PATCH 02/10] ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph
  2010-01-22  1:51   ` Steven Rostedt
@ 2010-01-22  2:04     ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  2:04 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Ingo Molnar, LKML, Li Zefan, Lai Jiangshan

On Thu, Jan 21, 2010 at 08:51:48PM -0500, Steven Rostedt wrote:
> On Fri, 2010-01-22 at 02:16 +0100, Frederic Weisbecker wrote:
> > When we run under dynamic tracing, we know that after calling
> > unregister_ftrace_graph(), tracing has really stopped because of
> > the hot patching and use of stop_machine().
> 
> This is incorrect. Even after unregister_ftrace_graph() with
> stop_machine(), we still have no guarantee that a callback is not being
> called. This is the reason I use stubs instead of NULLs. The address of
> the trace function could have been loaded into a register and the task
> then preempted. Even after stop_machine() that trace function can be
> called. This is also the reason that I never let modules add hooks to
> the function tracer (although I can easily make a wrapper to do so).



Ah, you are utterly right! I forgot about all that. And it looks like
nothing can easily help this.

I just dream about a magic synchronize_trace().

 
> > 
> > But in static tracing case, we only set stub callbacks. This is
> > not sufficient on archs that have weak memory ordering to assume
> > the older callbacks won't be called right after we leave
> > unregister_ftrace_graph().
> > 
> > Insert a read/write memory barrier in the end of
> > unregister_ftrace_graph() so that the code that follow can safely
> > assume tracing has really stopped. This can avoid its older tracing
> > callbacks to perform checks about various states like ensuring
> > needed buffers have been allocated, etc...
> 
> There's no guarantee, even with a smp_mb() that a trace function will
> not be called after being unregistered.



Yeah, indeed...


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  1:16 ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
@ 2010-01-22  2:05   ` Steven Rostedt
  2010-01-22  2:28     ` Mathieu Desnoyers
  2010-01-22  2:43     ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
  0 siblings, 2 replies; 44+ messages in thread
From: Steven Rostedt @ 2010-01-22  2:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Li Zefan, Lai Jiangshan, Mathieu Desnoyers

On Fri, 2010-01-22 at 02:16 +0100, Frederic Weisbecker wrote:
> Every time we enter the function profiler tracing callbacks, we first
> check if the function profiling is enabled.
> 
> This check is useless because we register the function graph
> callbacks only after the hashlist has been initialized.

Unfortunately, since the previous patch is incorrect, it makes this one
buggy too.

If you remove the check of ftrace_profile_enabled, a call into the
profiling code could have been preempted and still be pending.

Stop machine may remove all calls into the tracing code, but it only
affects new hits. Pending hits may still exist.

If you remove this check and the user re-enables the profiling, then
all PER_CPU hashes will be reset. If, while this is happening, the task
with the pending trace wakes up, it may access the PER_CPU list and
corrupt it.

Now for the reason I Cc'd Paul and Mathieu...

If we had a synchronize_sched()-like function that would wait and return
once all preempted tasks have been scheduled again and have either gone
to userspace or called schedule() directly, then we could actually do
this.

After unregistering the function graph tracer, you call this
"synchronize_tasks()" and it will guarantee that all currently preempted
tasks have either gone to userspace or called schedule() directly.
Then it would be safe to remove this check.

-- Steve

> 
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Li Zefan <lizf@cn.fujitsu.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  kernel/trace/ftrace.c |    7 ++-----
>  1 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 94117ec..f258f67 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -381,13 +381,10 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
>  	struct func_node *rec;
>  	unsigned long flags;
>  
> -	if (!ftrace_profile_enabled)
> -		return;
> -
>  	local_irq_save(flags);
>  
>  	hlist = &__get_cpu_var(func_hlist_cpu);
> -	if (!hlist->hash || !ftrace_profile_enabled)
> +	if (!hlist->hash)
>  		goto out;
>  
>  	rec = function_find_hlist_node(hlist, ip);
> @@ -418,7 +415,7 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
>  
>  	local_irq_save(flags);
>  	hlist = &__get_cpu_var(func_hlist_cpu);
> -	if (!hlist->hash || !ftrace_profile_enabled)
> +	if (!hlist->hash)
>  		goto out;
>  
>  	calltime = trace->rettime - trace->calltime;



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  2:05   ` Steven Rostedt
@ 2010-01-22  2:28     ` Mathieu Desnoyers
  2010-01-22  3:12       ` Steven Rostedt
  2010-01-22  2:43     ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
  1 sibling, 1 reply; 44+ messages in thread
From: Mathieu Desnoyers @ 2010-01-22  2:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Li Zefan, Lai Jiangshan,
	Masami Hiramatsu

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Fri, 2010-01-22 at 02:16 +0100, Frederic Weisbecker wrote:
> > Every time we enter the function profiler tracing callbacks, we first
> > check if the function profiling is enabled.
> > 
> > This check is useless because we register the function graph
> > callbacks only after the hashlist has been initialized.
> 
> Unfortunately, since the previous patch is incorrect, it makes this one
> buggy too.
> 
> If you remove the check to ftrace_profile_enabled, the call to the
> profiled code could have been preempted and pending to be called.
> 
> Stop machine may remove all calls to the tracing, but it only affects
> new hits. Pending hits may still exist.
> 
> If you remove this check, and the user re-enables the profiling, then
> all PER_CPU hashs will be reset. If in the process of this happening,
> the task with the pending trace wakes up, it may access the PER_CPU list
> and corrupt it.
> 
> Now for the reason I Cc'd Paul and Mathieu...
> 
> If we had a synchronize_sched() like function that would wait and return
> when all preempted tasks have been scheduled again and went to either
> userspace or called schedule directly, then we could actually do this.
> 
> After unregistering the function graph trace, you call this
> "synchronize_tasks()" and it will guarantee that all currently preempted
> tasks have either went to userspace or have called schedule() directly.
> Then it would be safe to remove this check.

OK, so basically you need to know when you reach a quiescent state, but
preemption is enabled and there is no RCU read lock taken around these
code paths, am I correct?

With tracepoints, life is easy because I disable preemption around the
calls, so I can use synchronize_sched() to know when the quiescent state
is reached.
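
In code, that pattern looks roughly like this (a minimal sketch, not
the actual tracepoint implementation; my_trace_cb and the helpers
around it are made-up names for illustration):

#include <linux/compiler.h>
#include <linux/preempt.h>
#include <linux/rcupdate.h>

/* Hypothetical hook, stands in for the real tracepoint machinery. */
static void (*my_trace_cb)(unsigned long ip);

static notrace void call_my_trace(unsigned long ip)
{
	void (*cb)(unsigned long);

	preempt_disable_notrace();
	cb = ACCESS_ONCE(my_trace_cb);	/* read once, preemption is off */
	if (cb)
		cb(ip);
	preempt_enable_notrace();
}

static void unregister_my_trace(void)
{
	my_trace_cb = NULL;
	/*
	 * Every caller invokes the callback with preemption disabled,
	 * so once synchronize_sched() returns, no CPU can still be
	 * running an old callback and its buffers can be freed.
	 */
	synchronize_sched();
}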

I recommend looking at kernel/kprobes.c:check_safety(). It uses
freeze_processes()/thaw_processes(), falling back to synchronize_sched(),
for this purpose. Basically, it relies on the "refrigeration" points to
detect such a quiescent state. This trick should do the job for the
function graph tracer too.

I'm adding Masami in CC. He is the one who implemented check_safety(),
and I remember discussing it with him in the past.

Thanks,

Mathieu

> 
> -- Steve
> 
> > 
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Li Zefan <lizf@cn.fujitsu.com>
> > Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> > ---
> >  kernel/trace/ftrace.c |    7 ++-----
> >  1 files changed, 2 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index 94117ec..f258f67 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -381,13 +381,10 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
> >  	struct func_node *rec;
> >  	unsigned long flags;
> >  
> > -	if (!ftrace_profile_enabled)
> > -		return;
> > -
> >  	local_irq_save(flags);
> >  
> >  	hlist = &__get_cpu_var(func_hlist_cpu);
> > -	if (!hlist->hash || !ftrace_profile_enabled)
> > +	if (!hlist->hash)
> >  		goto out;
> >  
> >  	rec = function_find_hlist_node(hlist, ip);
> > @@ -418,7 +415,7 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
> >  
> >  	local_irq_save(flags);
> >  	hlist = &__get_cpu_var(func_hlist_cpu);
> > -	if (!hlist->hash || !ftrace_profile_enabled)
> > +	if (!hlist->hash)
> >  		goto out;
> >  
> >  	calltime = trace->rettime - trace->calltime;
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  2:05   ` Steven Rostedt
  2010-01-22  2:28     ` Mathieu Desnoyers
@ 2010-01-22  2:43     ` Frederic Weisbecker
  2010-01-22  3:05       ` Steven Rostedt
  1 sibling, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-22  2:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, LKML, Li Zefan, Lai Jiangshan, Mathieu Desnoyers

On Thu, Jan 21, 2010 at 09:05:17PM -0500, Steven Rostedt wrote:
> On Fri, 2010-01-22 at 02:16 +0100, Frederic Weisbecker wrote:
> > Every time we enter the function profiler tracing callbacks, we first
> > check if the function profiling is enabled.
> > 
> > This check is useless because we register the function graph
> > callbacks only after the hashlist has been initialized.
> 
> Unfortunately, since the previous patch is incorrect, it makes this one
> buggy too.
> 
> If you remove the check to ftrace_profile_enabled, the call to the
> profiled code could have been preempted and pending to be called.
> 
> Stop machine may remove all calls to the tracing, but it only affects
> new hits. Pending hits may still exist.
> 
> If you remove this check, and the user re-enables the profiling, then
> all PER_CPU hashs will be reset. If in the process of this happening,
> the task with the pending trace wakes up, it may access the PER_CPU list
> and corrupt it.


Indeed.



> Now for the reason I Cc'd Paul and Mathieu...
> 
> If we had a synchronize_sched() like function that would wait and return
> when all preempted tasks have been scheduled again and went to either
> userspace or called schedule directly, then we could actually do this.
> 
> After unregistering the function graph trace, you call this
> "synchronize_tasks()" and it will guarantee that all currently preempted
> tasks have either went to userspace or have called schedule() directly.
> Then it would be safe to remove this check.



Good point!

I fear that would require heavy hooks in the scheduler though...


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  2:43     ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
@ 2010-01-22  3:05       ` Steven Rostedt
  2010-01-25 20:52         ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2010-01-22  3:05 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Ingo Molnar, LKML, Li Zefan, Lai Jiangshan, Mathieu Desnoyers

On Fri, 2010-01-22 at 03:43 +0100, Frederic Weisbecker wrote:

> > Now for the reason I Cc'd Paul and Mathieu...
> > 
> > If we had a synchronize_sched() like function that would wait and return
> > when all preempted tasks have been scheduled again and went to either
> > userspace or called schedule directly, then we could actually do this.
> > 
> > After unregistering the function graph trace, you call this
> > "synchronize_tasks()" and it will guarantee that all currently preempted
> > tasks have either went to userspace or have called schedule() directly.
> > Then it would be safe to remove this check.
> 
> 
> 
> Good point!
> 
> I fear that would require heavy hooks in the scheduler though...
> 

Not a heavy one. We could add a field to the task_struct and just call
something if it is set.


At the start of schedule():

	if (unlikely(current->pcount))
		handle_pcount_waiters(current);


void handle_pcount_waiters(struct task_struct *p)
{
	p->pcount = 0;
	wake_up(&pcount_waiters);
}


and for synchronize_tasks(), just search the task list for tasks that
are on the run queue but not currently running, stamp each one with a
pcount timestamp, and record them in an (allocated) list.

Each time the waiter is woken up, it walks that list, and if a task no
longer carries the pcount timestamp that was stored for it, it removes
that task from the list. When the list is finally empty, the waiter
continues.

This is just a basic idea, I left out a bunch of details, but I'm sure
it is feasible. This type of wait may work for other types of lockless
algorithms too.
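
Fleshing that out a little (purely illustrative: the ->pcount field,
the pcount_waiters wait queue and synchronize_tasks() itself are
hypothetical additions, the timestamp refinement above is reduced to a
simple flag, and the locking subtleties are glossed over):

#include <linux/sched.h>
#include <linux/wait.h>

static DECLARE_WAIT_QUEUE_HEAD(pcount_waiters);

static int pcount_pending(void)
{
	struct task_struct *g, *p;
	int pending = 0;

	read_lock(&tasklist_lock);
	do_each_thread(g, p) {
		if (p->pcount)		/* hypothetical task_struct field */
			pending = 1;
	} while_each_thread(g, p);
	read_unlock(&tasklist_lock);

	return pending;
}

void synchronize_tasks(void)
{
	struct task_struct *g, *p;

	/* Mark every task that is runnable but currently preempted. */
	read_lock(&tasklist_lock);
	do_each_thread(g, p) {
		if (p != current && p->state == TASK_RUNNING && !task_curr(p))
			p->pcount = 1;
	} while_each_thread(g, p);
	read_unlock(&tasklist_lock);

	/*
	 * The hook at the start of schedule() clears ->pcount and wakes
	 * us up, so wait until no marked task is left.
	 */
	wait_event(pcount_waiters, !pcount_pending());
}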

-- Steve




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  2:28     ` Mathieu Desnoyers
@ 2010-01-22  3:12       ` Steven Rostedt
  2010-01-22  4:09         ` Mathieu Desnoyers
  0 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2010-01-22  3:12 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Li Zefan, Lai Jiangshan,
	Masami Hiramatsu

On Thu, 2010-01-21 at 21:28 -0500, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:

> > Now for the reason I Cc'd Paul and Mathieu...
> > 
> > If we had a synchronize_sched() like function that would wait and return
> > when all preempted tasks have been scheduled again and went to either
> > userspace or called schedule directly, then we could actually do this.
> > 
> > After unregistering the function graph trace, you call this
> > "synchronize_tasks()" and it will guarantee that all currently preempted
> > tasks have either went to userspace or have called schedule() directly.
> > Then it would be safe to remove this check.
> 
> OK, so basically you need to know when you reach a quiescent state, but
> preemption is enabled and there is no RCU read lock taken around these
> code paths, am I correct ?
> 
> With tracepoints, life is easy because I disable preemption around the
> calls, so I can use synchronize_sched() to know when quiescent state is
> reached.
> 
> I recommend looking at kernel/kprobes.c:check_safety(). It uses
> thaw_processes() and synchronize_sched() for this purpose. Basically, it
> rely on the "refrigeration" points to detect such quiescent state. This
> trick should do the job for the function graph tracer too.
> 
> I'm adding Masami in CC. He is the one who implemented check_safety(),
> and I remember discussing it with him in the past.

Hmm, interesting. Maybe something like that might work. But what if
CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?

-- Steve



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  3:12       ` Steven Rostedt
@ 2010-01-22  4:09         ` Mathieu Desnoyers
  2010-01-22  4:52           ` Steven Rostedt
  0 siblings, 1 reply; 44+ messages in thread
From: Mathieu Desnoyers @ 2010-01-22  4:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Li Zefan, Lai Jiangshan,
	Masami Hiramatsu

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Thu, 2010-01-21 at 21:28 -0500, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> > > Now for the reason I Cc'd Paul and Mathieu...
> > > 
> > > If we had a synchronize_sched() like function that would wait and return
> > > when all preempted tasks have been scheduled again and went to either
> > > userspace or called schedule directly, then we could actually do this.
> > > 
> > > After unregistering the function graph trace, you call this
> > > "synchronize_tasks()" and it will guarantee that all currently preempted
> > > tasks have either went to userspace or have called schedule() directly.
> > > Then it would be safe to remove this check.
> > 
> > OK, so basically you need to know when you reach a quiescent state, but
> > preemption is enabled and there is no RCU read lock taken around these
> > code paths, am I correct ?
> > 
> > With tracepoints, life is easy because I disable preemption around the
> > calls, so I can use synchronize_sched() to know when quiescent state is
> > reached.
> > 
> > I recommend looking at kernel/kprobes.c:check_safety(). It uses
> > thaw_processes() and synchronize_sched() for this purpose. Basically, it
> > rely on the "refrigeration" points to detect such quiescent state. This
> > trick should do the job for the function graph tracer too.
> > 
> > I'm adding Masami in CC. He is the one who implemented check_safety(),
> > and I remember discussing it with him in the past.
> 
> Hmm, interesting. Maybe something like that might work. But what if
> CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?

Then you may want to make the function tracer depend on CONFIG_FREEZER,
but maybe Masami has other ideas?

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  4:09         ` Mathieu Desnoyers
@ 2010-01-22  4:52           ` Steven Rostedt
  2010-01-22 12:34             ` Mathieu Desnoyers
  0 siblings, 1 reply; 44+ messages in thread
From: Steven Rostedt @ 2010-01-22  4:52 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Li Zefan, Lai Jiangshan,
	Masami Hiramatsu

On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:

> > Hmm, interesting. Maybe something like that might work. But what if
> > CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
> 
> Then you may want to make the function tracer depend on CONFIG_FREEZER,
> but maybe Masami has other ideas ?

Egad, no! This is just to give the users of the function tracer a
guarantee that once tracing is disabled, no more callbacks will be
invoked by the function tracer. Currently, nothing relies on this. But
we may add cases that need it.

In fact, only the users with that requirement would need to do this
trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
just seems to be a strange dependency.

-- Steve



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  4:52           ` Steven Rostedt
@ 2010-01-22 12:34             ` Mathieu Desnoyers
  2010-01-22 14:54               ` Masami Hiramatsu
  2010-01-25 20:58               ` Frederic Weisbecker
  0 siblings, 2 replies; 44+ messages in thread
From: Mathieu Desnoyers @ 2010-01-22 12:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Li Zefan, Lai Jiangshan,
	Masami Hiramatsu

* Steven Rostedt (rostedt@goodmis.org) wrote:
> On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@goodmis.org) wrote:
> 
> > > Hmm, interesting. Maybe something like that might work. But what if
> > > CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
> > 
> > Then you may want to make the function tracer depend on CONFIG_FREEZER,
> > but maybe Masami has other ideas ?
> 
> egad no! This is just to help add guarantees to those that use the
> function tracer that when the tracing is disabled, it is guaranteed that
> no more tracing will be called by the function tracer. Currently,
> nothing relies on this. But we may add cases that might need this.

Yep, identifying a tracer quiescent state could come in handy.

> 
> In fact, only those that need this requirement would need to do this
> trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
> just seems to be a strange dependency.

This makes me wonder (question for Masami)...

static int __kprobes check_safety(void)
{
        int ret = 0;
#if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
        ret = freeze_processes();
        if (ret == 0) {
                struct task_struct *p, *q;
                do_each_thread(p, q) {
                        if (p != current && p->state == TASK_RUNNING &&
                            p->pid != 0) {
                                printk("Check failed: %s is running\n",p->comm);
                                ret = -1;
                                goto loop_end;
                        }
                } while_each_thread(p, q);
        }
loop_end:
        thaw_processes();
#else
        synchronize_sched();
#endif
        return ret;
}

How does that deal with CONFIG_PREEMPT && !CONFIG_FREEZER exactly?

Thanks,

Mathieu

> 
> -- Steve
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22 12:34             ` Mathieu Desnoyers
@ 2010-01-22 14:54               ` Masami Hiramatsu
  2010-01-25 20:58               ` Frederic Weisbecker
  1 sibling, 0 replies; 44+ messages in thread
From: Masami Hiramatsu @ 2010-01-22 14:54 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, Frederic Weisbecker, Ingo Molnar, LKML, Li Zefan,
	Lai Jiangshan

Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
>> On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
>>> * Steven Rostedt (rostedt@goodmis.org) wrote:
>>
>>>> Hmm, interesting. Maybe something like that might work. But what if
>>>> CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
>>>
>>> Then you may want to make the function tracer depend on CONFIG_FREEZER,
>>> but maybe Masami has other ideas ?
>>
>> egad no! This is just to help add guarantees to those that use the
>> function tracer that when the tracing is disabled, it is guaranteed that
>> no more tracing will be called by the function tracer. Currently,
>> nothing relies on this. But we may add cases that might need this.
> 
> Yep, identifying tracer quiescent state can become handy.
> 
>>
>> In fact, only those that need this requirement would need to do this
>> trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
>> just seems to be a strange dependency.
> 
> This makes me wonder (question for Masami)...
> 
> static int __kprobes check_safety(void)
> {
>         int ret = 0;
> #if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
>         ret = freeze_processes();
>         if (ret == 0) {
>                 struct task_struct *p, *q;
>                 do_each_thread(p, q) {
>                         if (p != current && p->state == TASK_RUNNING &&
>                             p->pid != 0) {
>                                 printk("Check failed: %s is running\n",p->comm);
>                                 ret = -1;
>                                 goto loop_end;
>                         }
>                 } while_each_thread(p, q);
>         }
> loop_end:
>         thaw_processes();
> #else
>         synchronize_sched();
> #endif
>         return ret;
> }
> 
> How does that deal with CONFIG_PREEMPT && !CONFIG_FREEZER exactly ?

In that case, it just disables the kprobe booster and never uses this
safety check.

Actually, this safety check is currently only for boosted probes, and
boosting is transparently enabled on all kprobes. This means that we
can fall back to normal kprobes if we can't use the booster (the
probing functionality still exists, it just isn't boosted and becomes
slower).

I'm not so sure how much cost can be saved by dropping the
ftrace_profile_enabled check. I agree that using freeze_processes() will
help you, but unless there is a fallback (e.g. keeping the check only
when CONFIG_PREEMPT && !CONFIG_FREEZER), I don't recommend using it.
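
For example, something along these lines in the profiling callbacks
(illustrative only, showing the parenthetical suggestion above):

	/*
	 * Keep the cheap early-exit check only where we cannot
	 * guarantee quiescence after unregistering.
	 */
#if defined(CONFIG_PREEMPT) && !defined(CONFIG_FREEZER)
	if (!ftrace_profile_enabled)
		return;
#endif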

If we can make a synchronize_tasks(), I'd like to use it in the above
safety check too :-)

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 05/10] ftrace: Drop buffer check in function profiler callbacks
  2010-01-22  1:16 ` [RFC PATCH 05/10] ftrace: Drop buffer check in function profiler callbacks Frederic Weisbecker
@ 2010-01-25  6:17   ` Lai Jiangshan
  2010-01-30 20:47     ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Lai Jiangshan @ 2010-01-25  6:17 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, LKML, Steven Rostedt, Li Zefan

Frederic Weisbecker wrote:
> Drop the null check on hlist->hash. It is wasteful because
> we don't register the tracer if the buffer allocation failed,
> and the memory barrier in register_ftrace_graph() ensure it
> is visible to the callbacks right away.
> 

When it runs on a CPU which has just come up, hlist->hash is not
allocated either.

See ftrace_profile_init().

function_profile_call(), profile_graph_entry() and profile_graph_return()
on a CPU that has just come online are wasteful. I think we need to call
ftrace_profile_init_cpu() (if ftrace_profile_enabled is set) on
CPU_UP_PREPARE.
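
An untested sketch of that kind of notifier (it assumes
ftrace_profile_init_cpu(cpu) returns 0 on success, and that the block
below is registered once with register_cpu_notifier() when the profiler
is set up):

#include <linux/cpu.h>
#include <linux/init.h>
#include <linux/notifier.h>

static int __cpuinit
ftrace_profile_cpu_notify(struct notifier_block *nb,
			  unsigned long action, void *hcpu)
{
	long cpu = (long)hcpu;

	switch (action) {
	case CPU_UP_PREPARE:
	case CPU_UP_PREPARE_FROZEN:
		/* Allocate the per-cpu hash before the CPU goes online. */
		if (ftrace_profile_enabled && ftrace_profile_init_cpu(cpu))
			return NOTIFY_BAD;
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block ftrace_profile_cpu_nb __cpuinitdata = {
	.notifier_call = ftrace_profile_cpu_notify,
};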

> Also we know the tracing callbacks won't be called after
> register_ftrace_graph(), so subsequent buffer resets are safe
> too.
> 
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Li Zefan <lizf@cn.fujitsu.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  kernel/trace/ftrace.c |    5 -----
>  1 files changed, 0 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 3cdb35e..dfd8f7c 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -384,8 +384,6 @@ function_profile_call(unsigned long ip, unsigned long parent_ip)
>  	local_irq_save(flags);
>  
>  	hlist = &__get_cpu_var(func_hlist_cpu);
> -	if (!hlist->hash)
> -		goto out;
>  
>  	rec = function_find_hlist_node(hlist, ip);
>  	if (!rec) {
> @@ -415,8 +413,6 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
>  
>  	local_irq_save(flags);
>  	hlist = &__get_cpu_var(func_hlist_cpu);
> -	if (!hlist->hash)
> -		goto out;
>  
>  	calltime = trace->rettime - trace->calltime;
>  
> @@ -439,7 +435,6 @@ static void profile_graph_return(struct ftrace_graph_ret *trace)
>  	if (rec)
>  		rec->time += calltime;
>  
> - out:
>  	local_irq_restore(flags);
>  }
>  


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 06/10] ftrace: Release the function hlist if we don't need it anymore
  2010-01-22  1:16 ` [RFC PATCH 06/10] ftrace: Release the function hlist if we don't need it anymore Frederic Weisbecker
@ 2010-01-25  6:41   ` Lai Jiangshan
  2010-01-30 21:14     ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Lai Jiangshan @ 2010-01-25  6:41 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, LKML, Steven Rostedt, Li Zefan

Frederic Weisbecker wrote:
> After we disable the function profiler, the function hashlist
> stays in memory. This is wasteful as nobody needs it anymore,
> until the next use if any.
> 
> Release it when we disable the function profiler instead of
> resetting it in the next use.
> 
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Li Zefan <lizf@cn.fujitsu.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  kernel/trace/ftrace.c          |    1 +
>  kernel/trace/functions_hlist.c |   61 +++++++++++++++++-----------------------
>  kernel/trace/functions_hlist.h |    1 +
>  3 files changed, 28 insertions(+), 35 deletions(-)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index dfd8f7c..0ded01c 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -509,6 +509,7 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,

>			/*
>			 * unregister_ftrace_profiler calls stop_machine
>			 * so this acts like an synchronize_sched.
>			 */
>  			unregister_ftrace_profiler();

Unluckily, when !CONFIG_DYNAMIC_FTRACE it calls neither stop_machine()
nor synchronize_sched(). A bug here?

> +			function_hlist_release();



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 09/10] tracing: Use the hashlist for graph function
  2010-01-22  1:16 ` [RFC PATCH 09/10] tracing: Use the hashlist for graph function Frederic Weisbecker
@ 2010-01-25  8:50   ` Lai Jiangshan
  2010-01-30 21:19     ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Lai Jiangshan @ 2010-01-25  8:50 UTC (permalink / raw)
  To: Frederic Weisbecker; +Cc: Ingo Molnar, LKML, Steven Rostedt, Li Zefan

Frederic Weisbecker wrote:
> When we set a filter to start tracing from a given function in
> the function graph tracer, the filter is stored in a linear array.
> 
> It doesn't scale very well, we even limited the number of such
> functions to 32.
> 
> Now that we have a hashlist of functions, lets put a field inside
> each function node so that we can check if a function is one of
> these filters using the hashlist, not a linear array.

The linear array @ftrace_graph_funcs is still used in this patch.
Do we still limit the number of such functions to 32?

[...]
>  #define FTRACE_GRAPH_MAX_FUNCS		32
>  extern int ftrace_graph_count;
>  extern unsigned long ftrace_graph_funcs[FTRACE_GRAPH_MAX_FUNCS];
> -
> -static inline int ftrace_graph_addr(unsigned long addr)
> -{
> -	int i;
> -
> -	if (!ftrace_graph_count)
> -		return 1;

Here it returns 1.

[...]
> +static inline int ftrace_graph_addr(unsigned long addr)
> +{
> +	struct func_node *rec;
> +	struct func_hlist *hlist;
> +
> +	if (!ftrace_graph_count)
> +		return 0;
> +

But in this patch it returns 0 here, so the behavior will change.
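
In other words, to keep the old "no filter set, trace everything"
behaviour, the check would presumably need to look something like this
(a sketch reusing the helpers introduced earlier in this series; error
handling and the preemption context of the per-cpu access are glossed
over):

static inline int ftrace_graph_addr(unsigned long addr)
{
	struct func_hlist *hlist;
	struct func_node *rec;

	if (!ftrace_graph_count)
		return 1;	/* no filter set: start tracing everywhere */

	hlist = &__get_cpu_var(func_hlist_cpu);

	rec = function_get_hlist_node(hlist, addr);
	if (rec && rec->graph_start)
		return 1;

	return 0;
}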

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22  3:05       ` Steven Rostedt
@ 2010-01-25 20:52         ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-25 20:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Ingo Molnar, LKML, Li Zefan, Lai Jiangshan, Mathieu Desnoyers

On Thu, Jan 21, 2010 at 10:05:06PM -0500, Steven Rostedt wrote:
> On Fri, 2010-01-22 at 03:43 +0100, Frederic Weisbecker wrote:
> 
> > > Now for the reason I Cc'd Paul and Mathieu...
> > > 
> > > If we had a synchronize_sched() like function that would wait and return
> > > when all preempted tasks have been scheduled again and went to either
> > > userspace or called schedule directly, then we could actually do this.
> > > 
> > > After unregistering the function graph trace, you call this
> > > "synchronize_tasks()" and it will guarantee that all currently preempted
> > > tasks have either went to userspace or have called schedule() directly.
> > > Then it would be safe to remove this check.
> > 
> > 
> > 
> > Good point!
> > 
> > I fear that would require heavy hooks in the scheduler though...
> > 
> 
> Not a heavy one. We could add a field to the task_struct and just call
> something if it is set.
> 
> 
> At start of schedule()
> 
> 	if (unlikely(current->pcount))
> 		handle_pcount_waiters(current);
> 
> 
> void handle_pcount_waiters(struct task_struct *p)
> {
> 	current->pcount = 0;
> 	wake_up(pcount_waiters);
> }
> 
> 
> and for the synchronize_tasks(), just search the task list for tasks
> that are on the run queue but not running, and add a pcount timestamp
> and record the list of tasks (allocated list).
> 
> After it is woken up, it checks the list of tasks and if a task does not
> have the pcount timestamp that matches what was stored, it removes it
> from the list. When it is finally woken up and does not have any more
> tasks on the list, it continues.
> 
> This is just a basic idea, i left out a bunch of details, but I'm sure
> it is feasible. This type of wait may work for other types of lockless
> algorithms too.
> 
> -- Steve



Sounds like a good idea. That would avoid these wasteful checks and
resident buffers.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-22 12:34             ` Mathieu Desnoyers
  2010-01-22 14:54               ` Masami Hiramatsu
@ 2010-01-25 20:58               ` Frederic Weisbecker
  2010-01-25 22:14                 ` Masami Hiramatsu
  1 sibling, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-25 20:58 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Steven Rostedt, Ingo Molnar, LKML, Li Zefan, Lai Jiangshan,
	Masami Hiramatsu

On Fri, Jan 22, 2010 at 07:34:51AM -0500, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
> > > * Steven Rostedt (rostedt@goodmis.org) wrote:
> > 
> > > > Hmm, interesting. Maybe something like that might work. But what if
> > > > CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
> > > 
> > > Then you may want to make the function tracer depend on CONFIG_FREEZER,
> > > but maybe Masami has other ideas ?
> > 
> > egad no! This is just to help add guarantees to those that use the
> > function tracer that when the tracing is disabled, it is guaranteed that
> > no more tracing will be called by the function tracer. Currently,
> > nothing relies on this. But we may add cases that might need this.
> 
> Yep, identifying tracer quiescent state can become handy.
> 
> > 
> > In fact, only those that need this requirement would need to do this
> > trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
> > just seems to be a strange dependency.
> 
> This makes me wonder (question for Masami)...
> 
> static int __kprobes check_safety(void)
> {
>         int ret = 0;
> #if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
>         ret = freeze_processes();
>         if (ret == 0) {
>                 struct task_struct *p, *q;
>                 do_each_thread(p, q) {
>                         if (p != current && p->state == TASK_RUNNING &&
>                             p->pid != 0) {
>                                 printk("Check failed: %s is running\n",p->comm);
>                                 ret = -1;
>                                 goto loop_end;
>                         }
>                 } while_each_thread(p, q);



How does that deal with kernel threads that don't freeze?

Also, freezing every process seems a bit heavy for this.
Looks like a synchronize_tasks() would be really useful.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-25 20:58               ` Frederic Weisbecker
@ 2010-01-25 22:14                 ` Masami Hiramatsu
  2010-01-26  0:41                   ` Frederic Weisbecker
  0 siblings, 1 reply; 44+ messages in thread
From: Masami Hiramatsu @ 2010-01-25 22:14 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Mathieu Desnoyers, Steven Rostedt, Ingo Molnar, LKML, Li Zefan,
	Lai Jiangshan

Frederic Weisbecker wrote:
> On Fri, Jan 22, 2010 at 07:34:51AM -0500, Mathieu Desnoyers wrote:
>> * Steven Rostedt (rostedt@goodmis.org) wrote:
>>> On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
>>>> * Steven Rostedt (rostedt@goodmis.org) wrote:
>>>
>>>>> Hmm, interesting. Maybe something like that might work. But what if
>>>>> CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
>>>>
>>>> Then you may want to make the function tracer depend on CONFIG_FREEZER,
>>>> but maybe Masami has other ideas ?
>>>
>>> egad no! This is just to help add guarantees to those that use the
>>> function tracer that when the tracing is disabled, it is guaranteed that
>>> no more tracing will be called by the function tracer. Currently,
>>> nothing relies on this. But we may add cases that might need this.
>>
>> Yep, identifying tracer quiescent state can become handy.
>>
>>>
>>> In fact, only those that need this requirement would need to do this
>>> trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
>>> just seems to be a strange dependency.
>>
>> This makes me wonder (question for Masami)...
>>
>> static int __kprobes check_safety(void)
>> {
>>         int ret = 0;
>> #if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
>>         ret = freeze_processes();
>>         if (ret == 0) {
>>                 struct task_struct *p, *q;
>>                 do_each_thread(p, q) {
>>                         if (p != current && p->state == TASK_RUNNING &&
>>                             p->pid != 0) {
>>                                 printk("Check failed: %s is running\n",p->comm);
>>                                 ret = -1;
>>                                 goto loop_end;
>>                         }
>>                 } while_each_thread(p, q);
> 
> 
> 
> How does that deal with kernel threads that don't freeze?

Hmm, right. It can't handle non-freezable kernel threads.

> Also freezing every processes seems a bit of a heavy thing for that.
> Looks like a synchronize_tasks() would be really useful.

Sure :-)
Maybe I'd better remove booster support on preemptive kernels until then.

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-25 22:14                 ` Masami Hiramatsu
@ 2010-01-26  0:41                   ` Frederic Weisbecker
  2010-01-26  1:13                     ` Mathieu Desnoyers
  0 siblings, 1 reply; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-26  0:41 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Mathieu Desnoyers, Steven Rostedt, Ingo Molnar, LKML, Li Zefan,
	Lai Jiangshan

On Mon, Jan 25, 2010 at 05:14:16PM -0500, Masami Hiramatsu wrote:
> Frederic Weisbecker wrote:
> > On Fri, Jan 22, 2010 at 07:34:51AM -0500, Mathieu Desnoyers wrote:
> >> * Steven Rostedt (rostedt@goodmis.org) wrote:
> >>> On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
> >>>> * Steven Rostedt (rostedt@goodmis.org) wrote:
> >>>
> >>>>> Hmm, interesting. Maybe something like that might work. But what if
> >>>>> CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
> >>>>
> >>>> Then you may want to make the function tracer depend on CONFIG_FREEZER,
> >>>> but maybe Masami has other ideas ?
> >>>
> >>> egad no! This is just to help add guarantees to those that use the
> >>> function tracer that when the tracing is disabled, it is guaranteed that
> >>> no more tracing will be called by the function tracer. Currently,
> >>> nothing relies on this. But we may add cases that might need this.
> >>
> >> Yep, identifying tracer quiescent state can become handy.
> >>
> >>>
> >>> In fact, only those that need this requirement would need to do this
> >>> trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
> >>> just seems to be a strange dependency.
> >>
> >> This makes me wonder (question for Masami)...
> >>
> >> static int __kprobes check_safety(void)
> >> {
> >>         int ret = 0;
> >> #if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
> >>         ret = freeze_processes();
> >>         if (ret == 0) {
> >>                 struct task_struct *p, *q;
> >>                 do_each_thread(p, q) {
> >>                         if (p != current && p->state == TASK_RUNNING &&
> >>                             p->pid != 0) {
> >>                                 printk("Check failed: %s is running\n",p->comm);
> >>                                 ret = -1;
> >>                                 goto loop_end;
> >>                         }
> >>                 } while_each_thread(p, q);
> > 
> > 
> > 
> > How does that deal with kernel threads that don't freeze?
> 
> Hmm, right. It can't handle non-freezable kernel threads.
> 
> > Also freezing every processes seems a bit of a heavy thing for that.
> > Looks like a synchronize_tasks() would be really useful.
> 
> Sure :-)
> Maybe, I'd better remove booster support on preemptive kernel until then.


I don't know, as I haven't looked deeper into check_safety(), but does
the fact that we have non-freezable tasks break the assumptions that
make the kprobe booster safe? If so then yeah, maybe deactivate it for
now.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-26  0:41                   ` Frederic Weisbecker
@ 2010-01-26  1:13                     ` Mathieu Desnoyers
  2010-01-26  1:37                       ` Masami Hiramatsu
  0 siblings, 1 reply; 44+ messages in thread
From: Mathieu Desnoyers @ 2010-01-26  1:13 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Masami Hiramatsu, Steven Rostedt, Ingo Molnar, LKML, Li Zefan,
	Lai Jiangshan

* Frederic Weisbecker (fweisbec@gmail.com) wrote:
> On Mon, Jan 25, 2010 at 05:14:16PM -0500, Masami Hiramatsu wrote:
> > Frederic Weisbecker wrote:
> > > On Fri, Jan 22, 2010 at 07:34:51AM -0500, Mathieu Desnoyers wrote:
> > >> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > >>> On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
> > >>>> * Steven Rostedt (rostedt@goodmis.org) wrote:
> > >>>
> > >>>>> Hmm, interesting. Maybe something like that might work. But what if
> > >>>>> CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
> > >>>>
> > >>>> Then you may want to make the function tracer depend on CONFIG_FREEZER,
> > >>>> but maybe Masami has other ideas ?
> > >>>
> > >>> egad no! This is just to help add guarantees to those that use the
> > >>> function tracer that when the tracing is disabled, it is guaranteed that
> > >>> no more tracing will be called by the function tracer. Currently,
> > >>> nothing relies on this. But we may add cases that might need this.
> > >>
> > >> Yep, identifying tracer quiescent state can become handy.
> > >>
> > >>>
> > >>> In fact, only those that need this requirement would need to do this
> > >>> trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
> > >>> just seems to be a strange dependency.
> > >>
> > >> This makes me wonder (question for Masami)...
> > >>
> > >> static int __kprobes check_safety(void)
> > >> {
> > >>         int ret = 0;
> > >> #if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
> > >>         ret = freeze_processes();
> > >>         if (ret == 0) {
> > >>                 struct task_struct *p, *q;
> > >>                 do_each_thread(p, q) {
> > >>                         if (p != current && p->state == TASK_RUNNING &&
> > >>                             p->pid != 0) {
> > >>                                 printk("Check failed: %s is running\n",p->comm);
> > >>                                 ret = -1;
> > >>                                 goto loop_end;
> > >>                         }
> > >>                 } while_each_thread(p, q);
> > > 
> > > 
> > > 
> > > How does that deal with kernel threads that don't freeze?
> > 
> > Hmm, right. It can't handle non-freezable kernel threads.
> > 
> > > Also freezing every processes seems a bit of a heavy thing for that.
> > > Looks like a synchronize_tasks() would be really useful.
> > 
> > Sure :-)
> > Maybe, I'd better remove booster support on preemptive kernel until then.
> 
> 
> I don't know as I haven't looked deeper into check_safety(), but does the
> fact we have non-freezable tasks break the assumptions that make
> kprobes booster safe? If so then yeah, may be deactivate it for now.
> 

In the case of check_safety, it's not a bug per se if a task happens to
be non-freezable. freeze_processes() will likely return a non-zero
value, and the whole check_safety will therefore return that value, so
standard breakpoints will be used instead.

But that doesn't fit with the function graph tracer requirements.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path
  2010-01-26  1:13                     ` Mathieu Desnoyers
@ 2010-01-26  1:37                       ` Masami Hiramatsu
  2010-01-27 21:55                         ` [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y Masami Hiramatsu
  0 siblings, 1 reply; 44+ messages in thread
From: Masami Hiramatsu @ 2010-01-26  1:37 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Frederic Weisbecker, Steven Rostedt, Ingo Molnar, LKML, Li Zefan,
	Lai Jiangshan

Mathieu Desnoyers wrote:
> * Frederic Weisbecker (fweisbec@gmail.com) wrote:
>> On Mon, Jan 25, 2010 at 05:14:16PM -0500, Masami Hiramatsu wrote:
>>> Frederic Weisbecker wrote:
>>>> On Fri, Jan 22, 2010 at 07:34:51AM -0500, Mathieu Desnoyers wrote:
>>>>> * Steven Rostedt (rostedt@goodmis.org) wrote:
>>>>>> On Thu, 2010-01-21 at 23:09 -0500, Mathieu Desnoyers wrote:
>>>>>>> * Steven Rostedt (rostedt@goodmis.org) wrote:
>>>>>>
>>>>>>>> Hmm, interesting. Maybe something like that might work. But what if
>>>>>>>> CONFIG_PREEMPT is enabled but CONFIG_FREEZER is not?
>>>>>>>
>>>>>>> Then you may want to make the function tracer depend on CONFIG_FREEZER,
>>>>>>> but maybe Masami has other ideas ?
>>>>>>
>>>>>> egad no! This is just to help add guarantees to those that use the
>>>>>> function tracer that when the tracing is disabled, it is guaranteed that
>>>>>> no more tracing will be called by the function tracer. Currently,
>>>>>> nothing relies on this. But we may add cases that might need this.
>>>>>
>>>>> Yep, identifying tracer quiescent state can become handy.
>>>>>
>>>>>>
>>>>>> In fact, only those that need this requirement would need to do this
>>>>>> trick. Anyway, we could make those depend on CONFIG_FREEZER, but that
>>>>>> just seems to be a strange dependency.
>>>>>
>>>>> This makes me wonder (question for Masami)...
>>>>>
>>>>> static int __kprobes check_safety(void)
>>>>> {
>>>>>         int ret = 0;
>>>>> #if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
>>>>>         ret = freeze_processes();
>>>>>         if (ret == 0) {
>>>>>                 struct task_struct *p, *q;
>>>>>                 do_each_thread(p, q) {
>>>>>                         if (p != current && p->state == TASK_RUNNING &&
>>>>>                             p->pid != 0) {
>>>>>                                 printk("Check failed: %s is running\n",p->comm);
>>>>>                                 ret = -1;
>>>>>                                 goto loop_end;
>>>>>                         }
>>>>>                 } while_each_thread(p, q);
>>>>
>>>>
>>>>
>>>> How does that deal with kernel threads that don't freeze?
>>>
>>> Hmm, right. It can't handle non-freezable kernel threads.
>>>
>>>> Also freezing every processes seems a bit of a heavy thing for that.
>>>> Looks like a synchronize_tasks() would be really useful.
>>>
>>> Sure :-)
>>> Maybe, I'd better remove booster support on preemptive kernel until then.
>>
>>
>> I don't know as I haven't looked deeper into check_safety(), but does the
>> fact we have non-freezable tasks break the assumptions that make
>> kprobes booster safe? If so then yeah, may be deactivate it for now.
>>
> 
> In the case of check_safety, it's not a bug per se if a task happens to
> be non freezable. freeze_processes() will likely return a non-zero
> value, and the whole check_safety will therefore return that value, so
> standard breakpoints will be used instead.

Hmm, unfortunately no, because freeze_processes() doesn't count
non-freezable threads...

---
                todo = 0;
                read_lock(&tasklist_lock);
                do_each_thread(g, p) {
                        if (frozen(p) || !freezeable(p))
                                continue;

                        if (!freeze_task(p, sig_only))
                                continue;

                        /*
                         * Now that we've done set_freeze_flag, don't
                         * perturb a task in TASK_STOPPED or TASK_TRACED.
                         * It is "frozen enough".  If the task does wake
                         * up, it will immediately call try_to_freeze.
                         */
                        if (!task_is_stopped_or_traced(p) &&
                            !freezer_should_skip(p))
                                todo++;
                } while_each_thread(g, p);
                read_unlock(&tasklist_lock);
                if (!todo || time_after(jiffies, end_time))
                        break;
---

So it can return 0 even if non-freezable threads are running.

Thank you,


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-26  1:37                       ` Masami Hiramatsu
@ 2010-01-27 21:55                         ` Masami Hiramatsu
  2010-01-28  1:08                           ` Mathieu Desnoyers
                                             ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Masami Hiramatsu @ 2010-01-27 21:55 UTC (permalink / raw)
  To: Ingo Molnar, lkml
  Cc: systemtap, DLE, Masami Hiramatsu, Ananth N Mavinakayanahalli,
	Frederic Weisbecker, Ingo Molnar, Jim Keniston, Mathieu Desnoyers,
	Steven Rostedt

Disable the kprobe booster when CONFIG_PREEMPT=y, because we can't
ensure that all kernel threads preempted on a kprobe's boosted slot have
run out of the slot, even when using freeze_processes().

The booster on preemptive kernels will be re-enabled if
synchronize_tasks() or something like it is introduced.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jim Keniston <jkenisto@us.ibm.com>
CC: Mathieu Desnoyers <compudj@krystal.dyndns.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
---

 arch/ia64/kernel/kprobes.c |    2 +-
 arch/x86/kernel/kprobes.c  |    2 +-
 kernel/kprobes.c           |   29 ++---------------------------
 3 files changed, 4 insertions(+), 29 deletions(-)

diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 9adac44..7026b29 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -870,7 +870,7 @@ static int __kprobes pre_kprobes_handler(struct die_args *args)
 		return 1;
 
 ss_probe:
-#if !defined(CONFIG_PREEMPT) || defined(CONFIG_FREEZER)
+#if !defined(CONFIG_PREEMPT)
 	if (p->ainsn.inst_flag == INST_FLAG_BOOSTABLE && !p->post_handler) {
 		/* Boost up -- we can execute copied instructions directly */
 		ia64_psr(regs)->ri = p->ainsn.slot;
diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
index 5b8c750..9453815 100644
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -429,7 +429,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
 				       struct kprobe_ctlblk *kcb)
 {
-#if !defined(CONFIG_PREEMPT) || defined(CONFIG_FREEZER)
+#if !defined(CONFIG_PREEMPT)
 	if (p->ainsn.boostable == 1 && !p->post_handler) {
 		/* Boost up -- we can execute copied instructions directly */
 		reset_current_kprobe();
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index b7df302..9907a03 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -124,30 +124,6 @@ static LIST_HEAD(kprobe_insn_pages);
 static int kprobe_garbage_slots;
 static int collect_garbage_slots(void);
 
-static int __kprobes check_safety(void)
-{
-	int ret = 0;
-#if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
-	ret = freeze_processes();
-	if (ret == 0) {
-		struct task_struct *p, *q;
-		do_each_thread(p, q) {
-			if (p != current && p->state == TASK_RUNNING &&
-			    p->pid != 0) {
-				printk("Check failed: %s is running\n",p->comm);
-				ret = -1;
-				goto loop_end;
-			}
-		} while_each_thread(p, q);
-	}
-loop_end:
-	thaw_processes();
-#else
-	synchronize_sched();
-#endif
-	return ret;
-}
-
 /**
  * __get_insn_slot() - Find a slot on an executable page for an instruction.
  * We allocate an executable page if there's no room on existing ones.
@@ -235,9 +211,8 @@ static int __kprobes collect_garbage_slots(void)
 {
 	struct kprobe_insn_page *kip, *next;
 
-	/* Ensure no-one is preepmted on the garbages */
-	if (check_safety())
-		return -EAGAIN;
+	/* Ensure no-one is interrupted on the garbages */
+	synchronize_sched();
 
 	list_for_each_entry_safe(kip, next, &kprobe_insn_pages, list) {
 		int i;


-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-27 21:55                         ` [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y Masami Hiramatsu
@ 2010-01-28  1:08                           ` Mathieu Desnoyers
  2010-01-28  4:21                           ` Ananth N Mavinakayanahalli
  2010-01-29  9:21                           ` Ingo Molnar
  2 siblings, 0 replies; 44+ messages in thread
From: Mathieu Desnoyers @ 2010-01-28  1:08 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, lkml, systemtap, DLE, Ananth N Mavinakayanahalli,
	Frederic Weisbecker, Jim Keniston, Steven Rostedt

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Disable kprobe booster when CONFIG_PREEMPT=y, because it can't ensure
> that all kernel threads preempted on kprobe's boosted slot run out from
> the slot even using freeze_processes().
> 
> The booster on preemptive kernel will be resumed if synchronize_tasks()
> or something like that is introduced.


Yes, given that the freezer does not deal with non-freezable tasks (as
you pointed out in a different thread), I think we cannot rely on it
with CONFIG_PREEMPT.

Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>

> 
> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Jim Keniston <jkenisto@us.ibm.com>
> CC: Mathieu Desnoyers <compudj@krystal.dyndns.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> ---
> 
>  arch/ia64/kernel/kprobes.c |    2 +-
>  arch/x86/kernel/kprobes.c  |    2 +-
>  kernel/kprobes.c           |   29 ++---------------------------
>  3 files changed, 4 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
> index 9adac44..7026b29 100644
> --- a/arch/ia64/kernel/kprobes.c
> +++ b/arch/ia64/kernel/kprobes.c
> @@ -870,7 +870,7 @@ static int __kprobes pre_kprobes_handler(struct die_args *args)
>  		return 1;
>  
>  ss_probe:
> -#if !defined(CONFIG_PREEMPT) || defined(CONFIG_FREEZER)
> +#if !defined(CONFIG_PREEMPT)
>  	if (p->ainsn.inst_flag == INST_FLAG_BOOSTABLE && !p->post_handler) {
>  		/* Boost up -- we can execute copied instructions directly */
>  		ia64_psr(regs)->ri = p->ainsn.slot;
> diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
> index 5b8c750..9453815 100644
> --- a/arch/x86/kernel/kprobes.c
> +++ b/arch/x86/kernel/kprobes.c
> @@ -429,7 +429,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
>  static void __kprobes setup_singlestep(struct kprobe *p, struct pt_regs *regs,
>  				       struct kprobe_ctlblk *kcb)
>  {
> -#if !defined(CONFIG_PREEMPT) || defined(CONFIG_FREEZER)
> +#if !defined(CONFIG_PREEMPT)
>  	if (p->ainsn.boostable == 1 && !p->post_handler) {
>  		/* Boost up -- we can execute copied instructions directly */
>  		reset_current_kprobe();
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index b7df302..9907a03 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -124,30 +124,6 @@ static LIST_HEAD(kprobe_insn_pages);
>  static int kprobe_garbage_slots;
>  static int collect_garbage_slots(void);
>  
> -static int __kprobes check_safety(void)
> -{
> -	int ret = 0;
> -#if defined(CONFIG_PREEMPT) && defined(CONFIG_FREEZER)
> -	ret = freeze_processes();
> -	if (ret == 0) {
> -		struct task_struct *p, *q;
> -		do_each_thread(p, q) {
> -			if (p != current && p->state == TASK_RUNNING &&
> -			    p->pid != 0) {
> -				printk("Check failed: %s is running\n",p->comm);
> -				ret = -1;
> -				goto loop_end;
> -			}
> -		} while_each_thread(p, q);
> -	}
> -loop_end:
> -	thaw_processes();
> -#else
> -	synchronize_sched();
> -#endif
> -	return ret;
> -}
> -
>  /**
>   * __get_insn_slot() - Find a slot on an executable page for an instruction.
>   * We allocate an executable page if there's no room on existing ones.
> @@ -235,9 +211,8 @@ static int __kprobes collect_garbage_slots(void)
>  {
>  	struct kprobe_insn_page *kip, *next;
>  
> -	/* Ensure no-one is preepmted on the garbages */
> -	if (check_safety())
> -		return -EAGAIN;
> +	/* Ensure no-one is interrupted on the garbages */
> +	synchronize_sched();
>  
>  	list_for_each_entry_safe(kip, next, &kprobe_insn_pages, list) {
>  		int i;
> 
> 
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America), Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-27 21:55                         ` [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y Masami Hiramatsu
  2010-01-28  1:08                           ` Mathieu Desnoyers
@ 2010-01-28  4:21                           ` Ananth N Mavinakayanahalli
  2010-01-29  9:21                           ` Ingo Molnar
  2 siblings, 0 replies; 44+ messages in thread
From: Ananth N Mavinakayanahalli @ 2010-01-28  4:21 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, lkml, systemtap, DLE, Frederic Weisbecker,
	Jim Keniston, Mathieu Desnoyers, Steven Rostedt

On Wed, Jan 27, 2010 at 04:55:31PM -0500, Masami Hiramatsu wrote:
> Disable kprobe booster when CONFIG_PREEMPT=y, because it can't ensure
> that all kernel threads preempted on kprobe's boosted slot run out from
> the slot even using freeze_processes().
> 
> The booster on preemptive kernel will be resumed if synchronize_tasks()
> or something like that is introduced.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>

Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-27 21:55                         ` [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y Masami Hiramatsu
  2010-01-28  1:08                           ` Mathieu Desnoyers
  2010-01-28  4:21                           ` Ananth N Mavinakayanahalli
@ 2010-01-29  9:21                           ` Ingo Molnar
  2010-01-29 11:30                             ` Peter Zijlstra
  2010-01-29 14:52                             ` Masami Hiramatsu
  2 siblings, 2 replies; 44+ messages in thread
From: Ingo Molnar @ 2010-01-29  9:21 UTC (permalink / raw)
  To: Masami Hiramatsu, Thomas Gleixner, Peter Zijlstra,
	Frédéric Weisbecker, Steven Rostedt
  Cc: lkml, systemtap, DLE, Ananth N Mavinakayanahalli,
	Frederic Weisbecker, Jim Keniston, Mathieu Desnoyers,
	Steven Rostedt


* Masami Hiramatsu <mhiramat@redhat.com> wrote:

> Disable kprobe booster when CONFIG_PREEMPT=y, because it can't ensure that 
> all kernel threads preempted on kprobe's boosted slot run out from the slot 
> even using freeze_processes().

hm, this really sucks as it makes preemptible kernels perform worse. Is there 
no better solution?

> The booster on preemptive kernel will be resumed if synchronize_tasks() or 
> something like that is introduced.

such as this one?

	Ingo

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-29  9:21                           ` Ingo Molnar
@ 2010-01-29 11:30                             ` Peter Zijlstra
  2010-01-29 14:52                             ` Masami Hiramatsu
  1 sibling, 0 replies; 44+ messages in thread
From: Peter Zijlstra @ 2010-01-29 11:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Masami Hiramatsu, Thomas Gleixner, Frédéric Weisbecker,
	Steven Rostedt, lkml, systemtap, DLE, Ananth N Mavinakayanahalli,
	Jim Keniston, Mathieu Desnoyers

On Fri, 2010-01-29 at 10:21 +0100, Ingo Molnar wrote:
> * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> 
> > Disable kprobe booster when CONFIG_PREEMPT=y, because it can't ensure that 
> > all kernel threads preempted on kprobe's boosted slot run out from the slot 
> > even using freeze_processes().
> 
> hm, this really sucks as it makes preemptible kernels perform worse. Is there 
> no better solution?

We could pre-allocate 1 slot per kernel thread.
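
One possible shape of that idea, purely as a sketch: the task_struct field and
both hooks below are hypothetical, only get_insn_slot()/free_insn_slot() exist
today.

/*
 * Hypothetical sketch of per-thread boosted-instruction slots
 * (kprobe_boost_slot would be a new kprobe_opcode_t * in task_struct).
 */
static int kprobe_alloc_boost_slot(struct task_struct *tsk)
{
	/* would be called from copy_process() */
	tsk->kprobe_boost_slot = get_insn_slot();
	return tsk->kprobe_boost_slot ? 0 : -ENOMEM;
}

static void kprobe_free_boost_slot(struct task_struct *tsk)
{
	/* would be called from do_exit(); slot is clean, hence dirty=0 */
	if (tsk->kprobe_boost_slot)
		free_insn_slot(tsk->kprobe_boost_slot, 0);
}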


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-29  9:21                           ` Ingo Molnar
  2010-01-29 11:30                             ` Peter Zijlstra
@ 2010-01-29 14:52                             ` Masami Hiramatsu
  2010-01-29 17:08                               ` Mathieu Desnoyers
  1 sibling, 1 reply; 44+ messages in thread
From: Masami Hiramatsu @ 2010-01-29 14:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, Peter Zijlstra, Frédéric Weisbecker,
	Steven Rostedt, lkml, systemtap, DLE, Ananth N Mavinakayanahalli,
	Jim Keniston, Mathieu Desnoyers

Ingo Molnar wrote:
> 
> * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> 
>> Disable kprobe booster when CONFIG_PREEMPT=y, because it can't ensure that 
>> all kernel threads preempted on kprobe's boosted slot run out from the slot 
>> even using freeze_processes().
> 
> hm, this really sucks as it makes preemptible kernels perform worse. Is there 
> no better solution?
> 
>> The booster on preemptive kernel will be resumed if synchronize_tasks() or 
>> something like that is introduced.
> 
> such as this one?

Yes, I think this synchronize_tasks(), which just waits (sleeping) until
all currently preempted tasks have woken up and been scheduled, can ensure safety.
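
For illustration, if such a primitive existed, collect_garbage_slots() could
use it roughly as below; synchronize_tasks() is hypothetical and the rest of
the body is elided.

/* Sketch only: synchronize_tasks() does not exist yet. */
static int __kprobes collect_garbage_slots(void)
{
#ifdef CONFIG_PREEMPT
	/* wait until every preempted task has been scheduled again */
	synchronize_tasks();
#else
	/* no kernel preemption: a scheduler grace period is enough */
	synchronize_sched();
#endif
	/* ... walk kprobe_insn_pages and free unused slots ... */
	return 0;
}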

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-29 14:52                             ` Masami Hiramatsu
@ 2010-01-29 17:08                               ` Mathieu Desnoyers
  2010-01-29 17:15                                 ` Peter Zijlstra
  0 siblings, 1 reply; 44+ messages in thread
From: Mathieu Desnoyers @ 2010-01-29 17:08 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Thomas Gleixner, Peter Zijlstra,
	Frédéric Weisbecker, Steven Rostedt, lkml, systemtap, DLE,
	Ananth N Mavinakayanahalli, Jim Keniston

* Masami Hiramatsu (mhiramat@redhat.com) wrote:
> Ingo Molnar wrote:
> > 
> > * Masami Hiramatsu <mhiramat@redhat.com> wrote:
> > 
> >> Disable kprobe booster when CONFIG_PREEMPT=y, because it can't ensure that 
> >> all kernel threads preempted on kprobe's boosted slot run out from the slot 
> >> even using freeze_processes().
> > 
> > hm, this really sucks as it makes preemptible kernels perform worse. Is there 
> > no better solution?
> > 
> >> The booster on preemptive kernel will be resumed if synchronize_tasks() or 
> >> something like that is introduced.
> > 
> > such as this one?
> 
> Yes, I think this synchronize_tasks(), which just waits (sleeping) until
> all currently preempted tasks have woken up and been scheduled, can ensure safety

If a task is set as stopped, and then preempted before calling schedule,
can this result in a preempted task staying in that state for an
arbitrarily long period of time? Or is there some mechanism prohibiting
that in the scheduler?

Thanks,

Mathieu

> 
> Thank you,
> 
> -- 
> Masami Hiramatsu
> 
> Software Engineer
> Hitachi Computer Products (America), Inc.
> Software Solutions Division
> 
> e-mail: mhiramat@redhat.com
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-29 17:08                               ` Mathieu Desnoyers
@ 2010-01-29 17:15                                 ` Peter Zijlstra
  2010-01-29 17:27                                   ` Mathieu Desnoyers
  0 siblings, 1 reply; 44+ messages in thread
From: Peter Zijlstra @ 2010-01-29 17:15 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Masami Hiramatsu, Ingo Molnar, Thomas Gleixner,
	Frédéric Weisbecker, Steven Rostedt, lkml, systemtap, DLE,
	Ananth N Mavinakayanahalli, Jim Keniston

On Fri, 2010-01-29 at 12:08 -0500, Mathieu Desnoyers wrote:
> 
> If a task is set as stopped, and then preempted before calling schedule,
> can this result in a preempted task staying in that state for an
> arbitrarily long period of time? Or is there some mechanism prohibiting
> that in the scheduler?

PREEMPT_ACTIVE does that:

preempt_schedule()
                add_preempt_count(PREEMPT_ACTIVE);
                schedule();


schedule()
        if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
                if (unlikely(signal_pending_state(prev->state, prev)))
                        prev->state = TASK_RUNNING;
                else
                        deactivate_task(rq, prev, 1);
                switch_count = &prev->nvcsw;
        }



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-29 17:15                                 ` Peter Zijlstra
@ 2010-01-29 17:27                                   ` Mathieu Desnoyers
  2010-01-29 17:32                                     ` Masami Hiramatsu
  0 siblings, 1 reply; 44+ messages in thread
From: Mathieu Desnoyers @ 2010-01-29 17:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Masami Hiramatsu, Ingo Molnar, Thomas Gleixner,
	Frédéric Weisbecker, Steven Rostedt, lkml, systemtap, DLE,
	Ananth N Mavinakayanahalli, Jim Keniston

* Peter Zijlstra (peterz@infradead.org) wrote:
> On Fri, 2010-01-29 at 12:08 -0500, Mathieu Desnoyers wrote:
> > 
> > If a task is set as stopped, and then preempted before calling schedule,
> > can this result in a preempted task staying in that state for an
> > arbitrarily long period of time? Or is there some mechanism prohibiting
> > that in the scheduler?
> 
> PREEMPT_ACTIVE does that:
> 
> preempt_schedule()
>                 add_preempt_count(PREEMPT_ACTIVE);
>                 schedule();
> 
> 
> schedule()
>         if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
>                 if (unlikely(signal_pending_state(prev->state, prev)))
>                         prev->state = TASK_RUNNING;
>                 else
>                         deactivate_task(rq, prev, 1);
>                 switch_count = &prev->nvcsw;
>         }

OK, it looks safe for preemption. Is there any unforeseen weird way a
task can be scheduled out and stopped that would permit it to either:

- stall the algorithm forever (DoS)
- appear as quiescent to the algorithm while its stack would hold return
  pointers to incorrect locations

?

I'm concerned about page faults here.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y
  2010-01-29 17:27                                   ` Mathieu Desnoyers
@ 2010-01-29 17:32                                     ` Masami Hiramatsu
  0 siblings, 0 replies; 44+ messages in thread
From: Masami Hiramatsu @ 2010-01-29 17:32 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Frédéric Weisbecker, Steven Rostedt, lkml, systemtap, DLE,
	Ananth N Mavinakayanahalli, Jim Keniston

Mathieu Desnoyers wrote:
> * Peter Zijlstra (peterz@infradead.org) wrote:
>> On Fri, 2010-01-29 at 12:08 -0500, Mathieu Desnoyers wrote:
>>>
>>> If a task is set as stopped, and then preempted before calling schedule,
>>> can this result in a preempted task staying in that state for an
>>> arbitrarily long period of time? Or is there some mechanism prohibiting
>>> that in the scheduler?
>>
>> PREEMPT_ACTIVE does that:
>>
>> preempt_schedule()
>>                 add_preempt_count(PREEMPT_ACTIVE);
>>                 schedule();
>>
>>
>> schedule()
>>         if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
>>                 if (unlikely(signal_pending_state(prev->state, prev)))
>>                         prev->state = TASK_RUNNING;
>>                 else
>>                         deactivate_task(rq, prev, 1);
>>                 switch_count = &prev->nvcsw;
>>         }
> 
> OK, it looks safe for preemption. Is there any unforeseen weird way a
> task can be scheduled out and stopped that would permit it to either:
> 
> - stall the algorithm forever (DoS)
> - appear as quiescent to the algorithm while its stack would hold return
>   pointers to incorrect locations
> 
> ?
> 
> I'm concerned about page faults here.

The booster also checks whether the instruction can cause a page fault;
if it can, kprobes doesn't boost it.
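
Roughly, that decision looks like the sketch below; insn_may_fault() and
insn_changes_flow() are placeholder helpers standing in for the real
table-driven opcode check (can_boost() on x86).

/*
 * Sketch of the boostability decision made at probe setup time.
 * Helper names are hypothetical; x86 uses a table-driven can_boost().
 */
static void decide_boostable(struct kprobe *p)
{
	if (insn_may_fault(p->ainsn.insn) || insn_changes_flow(p->ainsn.insn))
		p->ainsn.boostable = -1;	/* never boost this probe */
	else
		p->ainsn.boostable = 0;		/* candidate, confirmed later */
}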

Thank you,

-- 
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America), Inc.
Software Solutions Division

e-mail: mhiramat@redhat.com


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 05/10] ftrace: Drop buffer check in function profiler callbacks
  2010-01-25  6:17   ` Lai Jiangshan
@ 2010-01-30 20:47     ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-30 20:47 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Ingo Molnar, LKML, Steven Rostedt, Li Zefan

On Mon, Jan 25, 2010 at 02:17:29PM +0800, Lai Jiangshan wrote:
> Frederic Weisbecker wrote:
> > Drop the null check on hlist->hash. It is wasteful because
> > we don't register the tracer if the buffer allocation failed,
> > and the memory barrier in register_ftrace_graph() ensure it
> > is visible to the callbacks right away.
> > 
> 
> On a CPU which has just come up, hlist->hash is not
> allocated either.
> 
> See ftrace_profile_init().
> 
> function_profile_call(), profile_graph_entry() and profile_graph_return()
> on a just-onlined CPU are wasteful. I think we need to call
> ftrace_profile_init_cpu() (if ftrace_profile_enabled=1) on CPU_UP_PREPARE.



Maybe, yeah, or perhaps we should play with cpu_possible_mask()
instead of cpu_online_mask. In any case, the patch is also buggy for the
reasons given by Steve, so I'm going to drop it until we have a sane
implementation of synchronize_preempted_tasks().
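
For reference, Lai's suggestion could look roughly like the sketch below;
it is not part of the series and assumes the per-cpu init helper
(ftrace_profile_init_cpu() before this series) can be called from a
CPU_UP_PREPARE notifier.

/* Hypothetical hotplug hook for the profiler hashlist, sketch only. */
static int __cpuinit
ftrace_profile_cpu_notify(struct notifier_block *nb, unsigned long action,
			  void *hcpu)
{
	long cpu = (long)hcpu;

	switch (action) {
	case CPU_UP_PREPARE:
	case CPU_UP_PREPARE_FROZEN:
		if (ftrace_profile_enabled && ftrace_profile_init_cpu(cpu))
			return NOTIFY_BAD;
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block ftrace_profile_nb __cpuinitdata = {
	.notifier_call = ftrace_profile_cpu_notify,
};
/* register_cpu_notifier(&ftrace_profile_nb) at init time */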


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 06/10] ftrace: Release the function hlist if we don't need it anymore
  2010-01-25  6:41   ` Lai Jiangshan
@ 2010-01-30 21:14     ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-30 21:14 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Ingo Molnar, LKML, Steven Rostedt, Li Zefan

On Mon, Jan 25, 2010 at 02:41:59PM +0800, Lai Jiangshan wrote:
> Frederic Weisbecker wrote:
> > After we disable the function profiler, the function hashlist
> > stays in memory. This is wasteful as nobody needs it anymore,
> > until the next use if any.
> > 
> > Release it when we disable the function profiler instead of
> > resetting it in the next use.
> > 
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Li Zefan <lizf@cn.fujitsu.com>
> > Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> > ---
> >  kernel/trace/ftrace.c          |    1 +
> >  kernel/trace/functions_hlist.c |   61 +++++++++++++++++-----------------------
> >  kernel/trace/functions_hlist.h |    1 +
> >  3 files changed, 28 insertions(+), 35 deletions(-)
> > 
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index dfd8f7c..0ded01c 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -509,6 +509,7 @@ ftrace_profile_write(struct file *filp, const char __user *ubuf,
> 
> >			/*
> >			 * unregister_ftrace_profiler calls stop_machine
> >			 * so this acts like an synchronize_sched.
> >			 */
> >  			unregister_ftrace_profiler();
> 
> Unfortunately, when !CONFIG_DYNAMIC_FTRACE, it calls neither stop_machine()
> nor synchronize_sched(). Bug here?



Yeah. I guess the synchronize_sched() is there to ensure
that if a user re-enables the profiler, no pending tracing
callback is still touching the hlist while it gets reset.

We may indeed need an explicit synchronize_sched() for
!CONFIG_DYNAMIC_FTRACE.
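
Something as small as the following might do, as an untested sketch of the
fix in ftrace_profile_write():

			/*
			 * unregister_ftrace_profiler calls stop_machine only
			 * with dynamic ftrace, so this acts like a
			 * synchronize_sched in that case.
			 */
			unregister_ftrace_profiler();
#ifndef CONFIG_DYNAMIC_FTRACE
			/*
			 * Without dynamic ftrace there is no stop_machine()
			 * on unregister, so wait for in-flight callbacks
			 * ourselves before the hlist is released.
			 */
			synchronize_sched();
#endif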


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [RFC PATCH 09/10] tracing: Use the hashlist for graph function
  2010-01-25  8:50   ` Lai Jiangshan
@ 2010-01-30 21:19     ` Frederic Weisbecker
  0 siblings, 0 replies; 44+ messages in thread
From: Frederic Weisbecker @ 2010-01-30 21:19 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: Ingo Molnar, LKML, Steven Rostedt, Li Zefan

On Mon, Jan 25, 2010 at 04:50:48PM +0800, Lai Jiangshan wrote:
> Frederic Weisbecker wrote:
> > When we set a filter to start tracing from a given function in
> > the function graph tracer, the filter is stored in a linear array.
> > 
> > It doesn't scale very well, we even limited the number of such
> > functions to 32.
> > 
> > Now that we have a hashlist of functions, lets put a field inside
> > each function node so that we can check if a function is one of
> > these filters using the hashlist, not a linear array.
> 
> The linear array @ftrace_graph_funcs is still used in this patch.
> Do we still limit the number of such functions to 32?
> 
> [...]



Heh, that's right. I should probably make it a list. The array
is only used when we touch the set_graph_function file.



> >  #define FTRACE_GRAPH_MAX_FUNCS		32
> >  extern int ftrace_graph_count;
> >  extern unsigned long ftrace_graph_funcs[FTRACE_GRAPH_MAX_FUNCS];
> > -
> > -static inline int ftrace_graph_addr(unsigned long addr)
> > -{
> > -	int i;
> > -
> > -	if (!ftrace_graph_count)
> > -		return 1;
> 
> Here return 1.
> 
> [...]
> > +static inline int ftrace_graph_addr(unsigned long addr)
> > +{
> > +	struct func_node *rec;
> > +	struct func_hlist *hlist;
> > +
> > +	if (!ftrace_graph_count)
> > +		return 0;
> > +
> 
> But in this patch it returns 0 here, so the behavior will change.


Yeah, I've inverted the check from the tracing callback.

It seems to me that:

	if (ftrace_graph_addr(addr))

more logically means that we trace this addr. Making it
a boolean would perhaps make things clearer?
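
A rough sketch of that boolean form, on top of the hashlist lookup; the
helper and field names below are guesses at this series' API and may differ
from the actual patches:

/* Sketch: return true when the graph tracer should start at this function. */
static inline bool ftrace_graph_addr(unsigned long addr)
{
	struct func_hlist *hlist;
	struct func_node *rec;

	if (!ftrace_graph_count)
		return true;		/* no filter set: trace everything */

	hlist = &__get_cpu_var(func_hlist_cpu);		/* hypothetical per-cpu hlist */
	rec = function_find_hlist_node(hlist, addr);	/* hypothetical lookup */
	if (!rec)
		return false;

	return rec->graph_start;	/* hypothetical per-function filter flag */
}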


^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2010-01-30 21:19 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-22  1:16 [RFC PATCH 00/10] Ftrace functions hashlist Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 01/10] ftrace: Generalize the function hashlist from function profiler Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 02/10] ftrace: Ensure tracing has really stopped before leaving unregister_ftrace_graph Frederic Weisbecker
2010-01-22  1:51   ` Steven Rostedt
2010-01-22  2:04     ` Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
2010-01-22  2:05   ` Steven Rostedt
2010-01-22  2:28     ` Mathieu Desnoyers
2010-01-22  3:12       ` Steven Rostedt
2010-01-22  4:09         ` Mathieu Desnoyers
2010-01-22  4:52           ` Steven Rostedt
2010-01-22 12:34             ` Mathieu Desnoyers
2010-01-22 14:54               ` Masami Hiramatsu
2010-01-25 20:58               ` Frederic Weisbecker
2010-01-25 22:14                 ` Masami Hiramatsu
2010-01-26  0:41                   ` Frederic Weisbecker
2010-01-26  1:13                     ` Mathieu Desnoyers
2010-01-26  1:37                       ` Masami Hiramatsu
2010-01-27 21:55                         ` [PATCH tracing/kprobes] kprobes: Disable booster when CONFIG_PREEMPT=y Masami Hiramatsu
2010-01-28  1:08                           ` Mathieu Desnoyers
2010-01-28  4:21                           ` Ananth N Mavinakayanahalli
2010-01-29  9:21                           ` Ingo Molnar
2010-01-29 11:30                             ` Peter Zijlstra
2010-01-29 14:52                             ` Masami Hiramatsu
2010-01-29 17:08                               ` Mathieu Desnoyers
2010-01-29 17:15                                 ` Peter Zijlstra
2010-01-29 17:27                                   ` Mathieu Desnoyers
2010-01-29 17:32                                     ` Masami Hiramatsu
2010-01-22  2:43     ` [RFC PATCH 03/10] ftrace: Drop the ftrace_profile_enabled checks in tracing hot path Frederic Weisbecker
2010-01-22  3:05       ` Steven Rostedt
2010-01-25 20:52         ` Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 04/10] ftrace: Ensure buffers are visibles to tracing callbacks right away Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 05/10] ftrace: Drop buffer check in function profiler callbacks Frederic Weisbecker
2010-01-25  6:17   ` Lai Jiangshan
2010-01-30 20:47     ` Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 06/10] ftrace: Release the function hlist if we don't need it anymore Frederic Weisbecker
2010-01-25  6:41   ` Lai Jiangshan
2010-01-30 21:14     ` Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 07/10] ftrace: Make the function hashlist concurrently usable Frederic Weisbecker
2010-01-22  1:16 ` [PATCH 08/10] tracing: Simplify test for function_graph tracing Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 09/10] tracing: Use the hashlist for graph function Frederic Weisbecker
2010-01-25  8:50   ` Lai Jiangshan
2010-01-30 21:19     ` Frederic Weisbecker
2010-01-22  1:16 ` [RFC PATCH 10/10] ftrace: Factorize search and insertion in the function hashlist Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).