* [PATCH 0/3] ftrace: new features
@ 2008-11-25 22:34 Steven Rostedt
2008-11-25 22:34 ` [PATCH 1/3] ftrace: add function tracing to single thread Steven Rostedt
` (2 more replies)
0 siblings, 3 replies; 23+ messages in thread
From: Steven Rostedt @ 2008-11-25 22:34 UTC (permalink / raw)
To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker
Ingo,
The first patch adds a new feature to trace a single pid in the
function tracer.
The next two patches let the ftrace_return (graph now?) and the
function tracer run together.
The following patches are in:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git
branch: tip/devel
Steven Rostedt (3):
ftrace: add function tracing to single thread
ftrace: use code patching for ftrace return tracer
ftrace: let function tracing and function return run together
----
Documentation/ftrace.txt | 79 +++++++++++++
arch/x86/kernel/entry_32.S | 5 +
arch/x86/kernel/ftrace.c | 46 ++++++++-
include/linux/ftrace.h | 5 +
kernel/trace/ftrace.c | 260 +++++++++++++++++++++++++++++++++-----------
5 files changed, 331 insertions(+), 64 deletions(-)
^ permalink raw reply [flat|nested] 23+ messages in thread* [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-25 22:34 [PATCH 0/3] ftrace: new features Steven Rostedt @ 2008-11-25 22:34 ` Steven Rostedt 2008-11-25 22:42 ` Andrew Morton 2008-11-26 0:02 ` Frédéric Weisbecker 2008-11-25 22:34 ` [PATCH 2/3] ftrace: use code patching for ftrace return tracer Steven Rostedt 2008-11-25 22:34 ` [PATCH 3/3] ftrace: let function tracing and function return run together Steven Rostedt 2 siblings, 2 replies; 23+ messages in thread From: Steven Rostedt @ 2008-11-25 22:34 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Steven Rostedt [-- Attachment #1: 0001-ftrace-add-function-tracing-to-single-thread.patch --] [-- Type: text/plain, Size: 12047 bytes --] From: Steven Rostedt <srostedt@redhat.com> Impact: feature to function trace a single thread This patch adds the ability to function trace a single thread. The file: /debugfs/tracing/set_ftrace_pid contains the pid to trace. Valid pids are any positive integer. Writing any negative number to this file will disable the pid tracing and the function tracer will go back to tracing all of threads. This feature works with both static and dynamic function tracing. Signed-off-by: Steven Rostedt <srostedt@redhat.com> --- Documentation/ftrace.txt | 79 +++++++++++++++++ kernel/trace/ftrace.c | 209 ++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 262 insertions(+), 26 deletions(-) diff --git a/Documentation/ftrace.txt b/Documentation/ftrace.txt index 35a78bc..de05042 100644 --- a/Documentation/ftrace.txt +++ b/Documentation/ftrace.txt @@ -127,6 +127,8 @@ of ftrace. Here is a list of some of the key files: be traced. If a function exists in both set_ftrace_filter and set_ftrace_notrace, the function will _not_ be traced. + set_ftrace_pid: Have the function tracer only trace a single thread. + available_filter_functions: This lists the functions that ftrace has processed and can trace. These are the function names that you can pass to "set_ftrace_filter" or @@ -1073,6 +1075,83 @@ For simple one time traces, the above is sufficent. For anything else, a search through /proc/mounts may be needed to find where the debugfs file-system is mounted. + +Single thread tracing +--------------------- + +By writing into /debug/tracing/set_ftrace_pid you can trace a +single thread. For example: + +# cat /debug/tracing/set_ftrace_pid +no pid +# echo 3111 > /debug/tracing/set_ftrace_pid +# cat /debug/tracing/set_ftrace_pid +3111 +# echo function > /debug/tracing/current_tracer +# cat /debug/tracing/trace | head + # tracer: function + # + # TASK-PID CPU# TIMESTAMP FUNCTION + # | | | | | + yum-updatesd-3111 [003] 1637.254676: finish_task_switch <-thread_return + yum-updatesd-3111 [003] 1637.254681: hrtimer_cancel <-schedule_hrtimeout_range + yum-updatesd-3111 [003] 1637.254682: hrtimer_try_to_cancel <-hrtimer_cancel + yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel + yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll + yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll +# echo -1 > /debug/tracing/set_ftrace_pid +# cat /debug/tracing/trace |head + # tracer: function + # + # TASK-PID CPU# TIMESTAMP FUNCTION + # | | | | | + ##### CPU 3 buffer started #### + yum-updatesd-3111 [003] 1701.957688: free_poll_entry <-poll_freewait + yum-updatesd-3111 [003] 1701.957689: remove_wait_queue <-free_poll_entry + yum-updatesd-3111 [003] 1701.957691: fput <-free_poll_entry + yum-updatesd-3111 [003] 1701.957692: audit_syscall_exit <-sysret_audit + yum-updatesd-3111 [003] 1701.957693: path_put <-audit_syscall_exit + +If you want to trace a function when executing, you could use +something like this simple program: + +#include <stdio.h> +#include <stdlib.h> +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> +#include <unistd.h> + +int main (int argc, char **argv) +{ + if (argc < 1) + exit(-1); + + if (fork() > 0) { + int fd, ffd; + char line[64]; + int s; + + ffd = open("/debug/tracing/current_tracer", O_WRONLY); + if (ffd < 0) + exit(-1); + write(ffd, "nop", 3); + + fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY); + s = sprintf(line, "%d\n", getpid()); + write(fd, line, s); + + write(ffd, "function", 8); + + close(fd); + close(ffd); + + execvp(argv[1], argv+1); + } + + return 0; +} + dynamic ftrace -------------- diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 53042f1..e8cf629 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -47,6 +47,9 @@ int ftrace_enabled __read_mostly; static int last_ftrace_enabled; +/* ftrace_pid_trace >= 0 will only trace threads with this pid */ +static int ftrace_pid_trace = -1; + /* Quick disabling of function tracer. */ int function_trace_stop; @@ -61,6 +64,7 @@ static int ftrace_disabled __read_mostly; static DEFINE_SPINLOCK(ftrace_lock); static DEFINE_MUTEX(ftrace_sysctl_lock); +static DEFINE_MUTEX(ftrace_start_lock); static struct ftrace_ops ftrace_list_end __read_mostly = { @@ -70,6 +74,7 @@ static struct ftrace_ops ftrace_list_end __read_mostly = static struct ftrace_ops *ftrace_list __read_mostly = &ftrace_list_end; ftrace_func_t ftrace_trace_function __read_mostly = ftrace_stub; ftrace_func_t __ftrace_trace_function __read_mostly = ftrace_stub; +ftrace_func_t ftrace_pid_function __read_mostly = ftrace_stub; static void ftrace_list_func(unsigned long ip, unsigned long parent_ip) { @@ -86,6 +91,21 @@ static void ftrace_list_func(unsigned long ip, unsigned long parent_ip) }; } +static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) +{ + if (current->pid != ftrace_pid_trace) + return; + + ftrace_pid_function(ip, parent_ip); +} + +static void set_ftrace_pid_function(ftrace_func_t func) +{ + /* do not set ftrace_pid_function to itself! */ + if (func != ftrace_pid_func) + ftrace_pid_function = func; +} + /** * clear_ftrace_function - reset the ftrace function * @@ -96,6 +116,7 @@ void clear_ftrace_function(void) { ftrace_trace_function = ftrace_stub; __ftrace_trace_function = ftrace_stub; + ftrace_pid_function = ftrace_stub; } #ifndef CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST @@ -128,20 +149,26 @@ static int __register_ftrace_function(struct ftrace_ops *ops) ftrace_list = ops; if (ftrace_enabled) { + ftrace_func_t func; + + if (ops->next == &ftrace_list_end) + func = ops->func; + else + func = ftrace_list_func; + + if (ftrace_pid_trace >= 0) { + set_ftrace_pid_function(func); + func = ftrace_pid_func; + } + /* * For one func, simply call it directly. * For more than one func, call the chain. */ #ifdef CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST - if (ops->next == &ftrace_list_end) - ftrace_trace_function = ops->func; - else - ftrace_trace_function = ftrace_list_func; + ftrace_trace_function = func; #else - if (ops->next == &ftrace_list_end) - __ftrace_trace_function = ops->func; - else - __ftrace_trace_function = ftrace_list_func; + __ftrace_trace_function = func; ftrace_trace_function = ftrace_test_stop_func; #endif } @@ -182,8 +209,19 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops) if (ftrace_enabled) { /* If we only have one func left, then call that directly */ - if (ftrace_list->next == &ftrace_list_end) - ftrace_trace_function = ftrace_list->func; + if (ftrace_list->next == &ftrace_list_end) { + ftrace_func_t func = ftrace_list->func; + + if (ftrace_pid_trace >= 0) { + set_ftrace_pid_function(func); + func = ftrace_pid_func; + } +#ifdef CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST + ftrace_trace_function = func; +#else + __ftrace_trace_function = func; +#endif + } } out: @@ -192,6 +230,38 @@ static int __unregister_ftrace_function(struct ftrace_ops *ops) return ret; } +static void ftrace_update_pid_func(void) +{ + ftrace_func_t func; + + /* should not be called from interrupt context */ + spin_lock(&ftrace_lock); + + if (ftrace_trace_function == ftrace_stub) + goto out; + + func = ftrace_trace_function; + + if (ftrace_pid_trace >= 0) { + set_ftrace_pid_function(func); + func = ftrace_pid_func; + } else { + if (func != ftrace_pid_func) + goto out; + + set_ftrace_pid_function(func); + } + +#ifdef CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST + ftrace_trace_function = func; +#else + __ftrace_trace_function = func; +#endif + + out: + spin_unlock(&ftrace_lock); +} + #ifdef CONFIG_DYNAMIC_FTRACE #ifndef CONFIG_FTRACE_MCOUNT_RECORD # error Dynamic ftrace depends on MCOUNT_RECORD @@ -545,7 +615,19 @@ static void ftrace_run_update_code(int command) static ftrace_func_t saved_ftrace_func; static int ftrace_start_up; -static DEFINE_MUTEX(ftrace_start_lock); + +static void ftrace_startup_enable(int command) +{ + if (saved_ftrace_func != ftrace_trace_function) { + saved_ftrace_func = ftrace_trace_function; + command |= FTRACE_UPDATE_TRACE_FUNC; + } + + if (!command || !ftrace_enabled) + return; + + ftrace_run_update_code(command); +} static void ftrace_startup(void) { @@ -558,16 +640,8 @@ static void ftrace_startup(void) ftrace_start_up++; command |= FTRACE_ENABLE_CALLS; - if (saved_ftrace_func != ftrace_trace_function) { - saved_ftrace_func = ftrace_trace_function; - command |= FTRACE_UPDATE_TRACE_FUNC; - } - - if (!command || !ftrace_enabled) - goto out; + ftrace_startup_enable(command); - ftrace_run_update_code(command); - out: mutex_unlock(&ftrace_start_lock); } @@ -1262,13 +1336,10 @@ static struct file_operations ftrace_notrace_fops = { .release = ftrace_notrace_release, }; -static __init int ftrace_init_debugfs(void) +static __init int ftrace_init_dyn_debugfs(struct dentry *d_tracer) { - struct dentry *d_tracer; struct dentry *entry; - d_tracer = tracing_init_dentry(); - entry = debugfs_create_file("available_filter_functions", 0444, d_tracer, NULL, &ftrace_avail_fops); if (!entry) @@ -1295,8 +1366,6 @@ static __init int ftrace_init_debugfs(void) return 0; } -fs_initcall(ftrace_init_debugfs); - static int ftrace_convert_nops(struct module *mod, unsigned long *start, unsigned long *end) @@ -1382,12 +1451,100 @@ static int __init ftrace_nodyn_init(void) } device_initcall(ftrace_nodyn_init); +static inline int ftrace_init_dyn_debugfs(struct dentry *d_tracer) { return 0; } +static inline void ftrace_startup_enable(int command) { } # define ftrace_startup() do { } while (0) # define ftrace_shutdown() do { } while (0) # define ftrace_startup_sysctl() do { } while (0) # define ftrace_shutdown_sysctl() do { } while (0) #endif /* CONFIG_DYNAMIC_FTRACE */ +static ssize_t +ftrace_pid_read(struct file *file, char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + char buf[64]; + int r; + + if (ftrace_pid_trace >= 0) + r = sprintf(buf, "%u\n", ftrace_pid_trace); + else + r = sprintf(buf, "no pid\n"); + + return simple_read_from_buffer(ubuf, cnt, ppos, buf, r); +} + +static ssize_t +ftrace_pid_write(struct file *filp, const char __user *ubuf, + size_t cnt, loff_t *ppos) +{ + char buf[64]; + long val; + int ret; + + if (cnt >= sizeof(buf)) + return -EINVAL; + + if (copy_from_user(&buf, ubuf, cnt)) + return -EFAULT; + + buf[cnt] = 0; + + ret = strict_strtol(buf, 10, &val); + if (ret < 0) + return ret; + + mutex_lock(&ftrace_start_lock); + if (ret < 0) { + /* disable pid tracing */ + if (ftrace_pid_trace < 0) + goto out; + ftrace_pid_trace = -1; + + } else { + + if (ftrace_pid_trace == val) + goto out; + + ftrace_pid_trace = val; + } + + /* update the function call */ + ftrace_update_pid_func(); + ftrace_startup_enable(0); + + out: + mutex_unlock(&ftrace_start_lock); + + return cnt; +} + +static struct file_operations ftrace_pid_fops = { + .read = ftrace_pid_read, + .write = ftrace_pid_write, +}; + +static __init int ftrace_init_debugfs(void) +{ + struct dentry *d_tracer; + struct dentry *entry; + + d_tracer = tracing_init_dentry(); + if (!d_tracer) + return 0; + + ftrace_init_dyn_debugfs(d_tracer); + + entry = debugfs_create_file("set_ftrace_pid", 0644, d_tracer, + NULL, &ftrace_pid_fops); + if (!entry) + pr_warning("Could not create debugfs " + "'set_ftrace_pid' entry\n"); + return 0; +} + +fs_initcall(ftrace_init_debugfs); + /** * ftrace_kill - kill ftrace * -- 1.5.6.5 -- ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-25 22:34 ` [PATCH 1/3] ftrace: add function tracing to single thread Steven Rostedt @ 2008-11-25 22:42 ` Andrew Morton 2008-11-25 23:01 ` Steven Rostedt 2008-11-26 0:02 ` Frédéric Weisbecker 1 sibling, 1 reply; 23+ messages in thread From: Andrew Morton @ 2008-11-25 22:42 UTC (permalink / raw) To: Steven Rostedt; +Cc: linux-kernel, mingo, fweisbec, srostedt On Tue, 25 Nov 2008 17:34:22 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > This patch adds the ability to function trace a single thread. > The file: > > /debugfs/tracing/set_ftrace_pid > > contains the pid to trace. What happens if the same pid exists in two or more pid namespaces? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-25 22:42 ` Andrew Morton @ 2008-11-25 23:01 ` Steven Rostedt 2008-11-25 23:31 ` Andrew Morton 0 siblings, 1 reply; 23+ messages in thread From: Steven Rostedt @ 2008-11-25 23:01 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, mingo, fweisbec, srostedt On Tue, 25 Nov 2008, Andrew Morton wrote: > On Tue, 25 Nov 2008 17:34:22 -0500 > Steven Rostedt <rostedt@goodmis.org> wrote: > > > This patch adds the ability to function trace a single thread. > > The file: > > > > /debugfs/tracing/set_ftrace_pid > > > > contains the pid to trace. > > What happens if the same pid exists in two or more pid namespaces? I think we had this discussion before. It tests current->pid, would that be different among the name spaces? -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-25 23:01 ` Steven Rostedt @ 2008-11-25 23:31 ` Andrew Morton 2008-11-26 0:11 ` Steven Rostedt 0 siblings, 1 reply; 23+ messages in thread From: Andrew Morton @ 2008-11-25 23:31 UTC (permalink / raw) To: Steven Rostedt; +Cc: linux-kernel, mingo, fweisbec, srostedt On Tue, 25 Nov 2008 18:01:06 -0500 (EST) Steven Rostedt <rostedt@goodmis.org> wrote: > > On Tue, 25 Nov 2008, Andrew Morton wrote: > > > On Tue, 25 Nov 2008 17:34:22 -0500 > > Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > This patch adds the ability to function trace a single thread. > > > The file: > > > > > > /debugfs/tracing/set_ftrace_pid > > > > > > contains the pid to trace. > > > > What happens if the same pid exists in two or more pid namespaces? > > I think we had this discussion before. Oh. What did we end up concluding? > It tests current->pid, would that > be different among the name spaces? I think those are non-unique. containers@lists.osdl.org would have better ideas.. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-25 23:31 ` Andrew Morton @ 2008-11-26 0:11 ` Steven Rostedt 2008-11-26 0:53 ` Dave Hansen 0 siblings, 1 reply; 23+ messages in thread From: Steven Rostedt @ 2008-11-26 0:11 UTC (permalink / raw) To: Andrew Morton; +Cc: LKML, Ingo Molnar, fweisbec, srostedt, containers On Tue, 25 Nov 2008, Andrew Morton wrote: > On Tue, 25 Nov 2008 18:01:06 -0500 (EST) > Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > On Tue, 25 Nov 2008, Andrew Morton wrote: > > > > > On Tue, 25 Nov 2008 17:34:22 -0500 > > > Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > > This patch adds the ability to function trace a single thread. > > > > The file: > > > > > > > > /debugfs/tracing/set_ftrace_pid > > > > > > > > contains the pid to trace. > > > > > > What happens if the same pid exists in two or more pid namespaces? > > > > I think we had this discussion before. > > Oh. What did we end up concluding? > > > It tests current->pid, would that > > be different among the name spaces? > > I think those are non-unique. containers@lists.osdl.org would have > better ideas.. Added list in CC. I think the end result was, if this file can only be changed by root, then we do not need to worry about namespaces. This file is a privileged file that can only be modified by root. If someday we decide to let non admin users touch this file, then we would need to care about this. This file may actually be modified in the future by users, so this may become an issue. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 0:11 ` Steven Rostedt @ 2008-11-26 0:53 ` Dave Hansen 2008-11-26 1:01 ` Steven Rostedt 0 siblings, 1 reply; 23+ messages in thread From: Dave Hansen @ 2008-11-26 0:53 UTC (permalink / raw) To: Steven Rostedt Cc: Andrew Morton, containers, fweisbec, Ingo Molnar, LKML, srostedt, Eric Biederman, Sukadev Bhattiprolu, Serge E. Hallyn On Tue, 2008-11-25 at 19:11 -0500, Steven Rostedt wrote: > > > > Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > > > > This patch adds the ability to function trace a single thread. > > > > > The file: > > > > > > > > > > /debugfs/tracing/set_ftrace_pid > > > > > > > > > > contains the pid to trace. > > > > > > > > What happens if the same pid exists in two or more pid namespaces? > > > > > > I think we had this discussion before. > > > > Oh. What did we end up concluding? > > > > > It tests current->pid, would that > > > be different among the name spaces? > > > > I think those are non-unique. containers@lists.osdl.org would have > > better ideas.. > > Added list in CC. > > I think the end result was, if this file can only be changed by root, then > we do not need to worry about namespaces. This file is a privileged file > that can only be modified by root. > > If someday we decide to let non admin users touch this file, then we would > need to care about this. This file may actually be modified in the future > by users, so this may become an issue. This really has very little to do with root vs non-root users. In fact, we're working towards having cases where we have many "root" users, even those inside namespaces. It is also quite possible for a normal root user to fork into a new pid namespace. In that case, root simply won't be able to use this feature because something like: echo $$ /debugfs/tracing/set_ftrace_pid just won't work. Let's look at a bit of the code. +static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) +{ + if (current->pid != ftrace_pid_trace) + return; + + ftrace_pid_function(ip, parent_ip); +} One thing this doesn't deal with is pid wraparound. Does that matter? If you want to fix this a bit, instead of saving off the pid_t in ftrace_pid_trace, you should save a 'struct pid'. You can get the 'struct pid' for a particular task by doing a find_get_pid(pid_t). You can then compare that pid_t to current by doing a pid_task(struct_pid_that_i_saved, PIDTYPE_PID). That will also protect against pid wraparound. The find_get_pid() is handy because it will do the pid_t lookup in the context of the current task's pid namespace, which is what you want, I think. -- Dave ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 0:53 ` Dave Hansen @ 2008-11-26 1:01 ` Steven Rostedt 2008-11-26 4:50 ` Eric W. Biederman 2008-11-26 5:20 ` Eric W. Biederman 0 siblings, 2 replies; 23+ messages in thread From: Steven Rostedt @ 2008-11-26 1:01 UTC (permalink / raw) To: Dave Hansen Cc: Andrew Morton, containers, fweisbec, Ingo Molnar, LKML, srostedt, Eric Biederman, Sukadev Bhattiprolu, Serge E. Hallyn On Tue, 25 Nov 2008, Dave Hansen wrote: > > > > I think the end result was, if this file can only be changed by root, then > > we do not need to worry about namespaces. This file is a privileged file > > that can only be modified by root. > > > > If someday we decide to let non admin users touch this file, then we would > > need to care about this. This file may actually be modified in the future > > by users, so this may become an issue. > > This really has very little to do with root vs non-root users. In fact, > we're working towards having cases where we have many "root" users, even > those inside namespaces. It is also quite possible for a normal root > user to fork into a new pid namespace. In that case, root simply won't > be able to use this feature because something like: > > echo $$ /debugfs/tracing/set_ftrace_pid > > just won't work. Let's look at a bit of the code. > > +static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) > +{ > + if (current->pid != ftrace_pid_trace) > + return; > + > + ftrace_pid_function(ip, parent_ip); > +} > > One thing this doesn't deal with is pid wraparound. Does that matter? Should not. This is just a way to trace a particular process. Currently it traces all processes. If we wrap, then we trace the process with the new pid. This should not be an issue. > > If you want to fix this a bit, instead of saving off the pid_t in > ftrace_pid_trace, you should save a 'struct pid'. You can get the > 'struct pid' for a particular task by doing a find_get_pid(pid_t). You > can then compare that pid_t to current by doing a > pid_task(struct_pid_that_i_saved, PIDTYPE_PID). That will also protect > against pid wraparound. > > The find_get_pid() is handy because it will do the pid_t lookup in the > context of the current task's pid namespace, which is what you want, I > think. Nope, we can not call that in this context. ftrace_pid_func is called directly from mcount, without any protection. struct pid *find_get_pid(pid_t nr) { struct pid *pid; rcu_read_lock(); pid = get_pid(find_vpid(nr)); rcu_read_unlock(); return pid; } EXPORT_SYMBOL_GPL(find_get_pid); This means find_get_pid will call mcount which will call ftrace_pid_func, and back again. This can also happen with rcu_read_{un}lock() and get_pid() and find_vpid(). We can not do anything special here. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 1:01 ` Steven Rostedt @ 2008-11-26 4:50 ` Eric W. Biederman 2008-11-26 5:01 ` Steven Rostedt 2008-11-26 5:20 ` Eric W. Biederman 1 sibling, 1 reply; 23+ messages in thread From: Eric W. Biederman @ 2008-11-26 4:50 UTC (permalink / raw) To: Steven Rostedt Cc: Dave Hansen, Andrew Morton, containers, fweisbec, Ingo Molnar, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 4:50 ` Eric W. Biederman @ 2008-11-26 5:01 ` Steven Rostedt 2008-11-26 6:23 ` Eric W. Biederman 0 siblings, 1 reply; 23+ messages in thread From: Steven Rostedt @ 2008-11-26 5:01 UTC (permalink / raw) To: Eric W. Biederman Cc: Dave Hansen, Andrew Morton, containers, fweisbec, Ingo Molnar, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn On Tue, 25 Nov 2008, Eric W. Biederman wrote: > > I'm speechless too. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 5:01 ` Steven Rostedt @ 2008-11-26 6:23 ` Eric W. Biederman 2008-11-26 6:32 ` Ingo Molnar 0 siblings, 1 reply; 23+ messages in thread From: Eric W. Biederman @ 2008-11-26 6:23 UTC (permalink / raw) To: Steven Rostedt Cc: Dave Hansen, Andrew Morton, containers, fweisbec, Ingo Molnar, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn Steven Rostedt <rostedt@goodmis.org> writes: > On Tue, 25 Nov 2008, Eric W. Biederman wrote: > >> >> > > I'm speechless too. I'm a bit tired so probably am pushing to hard. At the same time I don't see a single reason not to use struct pid for what it was designed for. Identifying tasks. pid_t's really only belong at the border. I can see in the tracer when grabbing numbers you might not be able to follow pointers. For that I see justification for using task->pid. For the comparison I just don't see it. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 6:23 ` Eric W. Biederman @ 2008-11-26 6:32 ` Ingo Molnar 2008-11-26 7:02 ` Eric W. Biederman 0 siblings, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2008-11-26 6:32 UTC (permalink / raw) To: Eric W. Biederman Cc: Steven Rostedt, Dave Hansen, Andrew Morton, containers, fweisbec, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn * Eric W. Biederman <ebiederm@xmission.com> wrote: > Steven Rostedt <rostedt@goodmis.org> writes: > > > On Tue, 25 Nov 2008, Eric W. Biederman wrote: > > > > I'm speechless too. > > I'm a bit tired so probably am pushing to hard. > > At the same time I don't see a single reason not to use struct pid > for what it was designed for. Identifying tasks. pid_t's really > only belong at the border. > > I can see in the tracer when grabbing numbers you might not be able > to follow pointers. For that I see justification for using > task->pid. For the comparison I just don't see it. i dont see the point of the complexity you are advocating. 99.9% of the users run a unique PID space. Tracing is about keeping stuff simple. On containers we could also trace the namespace ID (is there an easy ID for the namespace, as an easy extension to the very nice PID concept that Unix introduced decades ago?) and be done with it. Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 6:32 ` Ingo Molnar @ 2008-11-26 7:02 ` Eric W. Biederman 2008-11-26 7:18 ` Ingo Molnar 0 siblings, 1 reply; 23+ messages in thread From: Eric W. Biederman @ 2008-11-26 7:02 UTC (permalink / raw) To: Ingo Molnar Cc: Steven Rostedt, Dave Hansen, Andrew Morton, containers, fweisbec, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn Ingo Molnar <mingo@elte.hu> writes: > i dont see the point of the complexity you are advocating. 99.9% of > the users run a unique PID space. I'm not advocating complexity. I'm advocating using the same APIs as the rest of the kernel, for doing the same functions. > Tracing is about keeping stuff simple. On containers we could also > trace the namespace ID (is there an easy ID for the namespace, as an > easy extension to the very nice PID concept that Unix introduced > decades ago?) and be done with it. I don't really care about the pid namespace in this context. I am just asking that we compare a different field in the task struct. I am asking that we don't accumulate new users of an old crufty bug prone API, for no good reason. I'm asking that we don't be different for no good reason. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 7:02 ` Eric W. Biederman @ 2008-11-26 7:18 ` Ingo Molnar 2008-11-26 8:48 ` Eric W. Biederman 0 siblings, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2008-11-26 7:18 UTC (permalink / raw) To: Eric W. Biederman Cc: Steven Rostedt, Dave Hansen, Andrew Morton, containers, fweisbec, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn * Eric W. Biederman <ebiederm@xmission.com> wrote: > Ingo Molnar <mingo@elte.hu> writes: > > > i dont see the point of the complexity you are advocating. 99.9% of > > the users run a unique PID space. > > I'm not advocating complexity. I'm advocating using the same APIs as > the rest of the kernel, for doing the same functions. > > > Tracing is about keeping stuff simple. On containers we could also > > trace the namespace ID (is there an easy ID for the namespace, as an > > easy extension to the very nice PID concept that Unix introduced > > decades ago?) and be done with it. > > I don't really care about the pid namespace in this context. > > I am just asking that we compare a different field in the task > struct. > > I am asking that we don't accumulate new users of an old crufty bug > prone API, for no good reason. i dont disagree about the change, but i'm curious, what's bug-prone about current->pid? It certainly worked quite well for the first 15 years. Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 7:18 ` Ingo Molnar @ 2008-11-26 8:48 ` Eric W. Biederman 0 siblings, 0 replies; 23+ messages in thread From: Eric W. Biederman @ 2008-11-26 8:48 UTC (permalink / raw) To: Ingo Molnar Cc: Steven Rostedt, Dave Hansen, Andrew Morton, containers, fweisbec, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn Ingo Molnar <mingo@elte.hu> writes: > * Eric W. Biederman <ebiederm@xmission.com> wrote: > >> Ingo Molnar <mingo@elte.hu> writes: >> >> > i dont see the point of the complexity you are advocating. 99.9% of >> > the users run a unique PID space. >> >> I'm not advocating complexity. I'm advocating using the same APIs as >> the rest of the kernel, for doing the same functions. >> >> > Tracing is about keeping stuff simple. On containers we could also >> > trace the namespace ID (is there an easy ID for the namespace, as an >> > easy extension to the very nice PID concept that Unix introduced >> > decades ago?) and be done with it. >> >> I don't really care about the pid namespace in this context. >> >> I am just asking that we compare a different field in the task >> struct. >> >> I am asking that we don't accumulate new users of an old crufty bug >> prone API, for no good reason. > > i dont disagree about the change, but i'm curious, what's bug-prone > about current->pid? It certainly worked quite well for the first 15 > years. Nothing especially serious. - Just plain general sloppiness that you can get into with using numeric pids because you can get away with that you can't get away with if you are dealing with a real data structure. One of which is classic data type confusion. Is a pid stored in an int a pid_t a short or something else. Which leads to the difficult question if you need to change something how do you find and update all of the users. Other cases are the horrible to convert cases where we in practice leaked pids all over the place, got the locking totally all confused simply because we never noticed the races. Code like that is a nightmare to convert to using struct pid *. Even if struct pid pointers are faster to use. - There is the interesting case that the original unix usage of pids and process and session groups with ttys, did not have any problems with pid wrap around. But as pids became more common we added the SIGIO support which theoretically at least could allow send a signal to the wrong process due to pid wrap around. I'm not especially security paranoid but I suspect someone intent on such things could likely find a way to send a signal to a process they usually could not using pid wrap around. - As for current->pid itself it is on my hit list for a different reason. It is one of the very few left overs from the old pid API, where we assume pids numbers are global. Being able to successfully remove it would greatly increase the confidence that we haven't missed something in the pid namespace implementation. The __pgrp and the __session fields in signal_struct are much higher on my hit list. Darn I thought I had already removed them. But unfortunately the final finishing touches on the pid namespace got stalled. Currently the preferred patterns are struct pid pointers internal to the kernel ( any place we are likely to save them ) and pid_t values right on the edge of user space. current->pid is very handy for debugging. Certainly global pid numbers aka pid numbers in the initial pid namespace are the pids we want to print with printk, and in any very light weight tracing. As long as they don't creep into APIs where people turn around and use those those pids I don't have a problem with privileged users seeing them. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 1:01 ` Steven Rostedt 2008-11-26 4:50 ` Eric W. Biederman @ 2008-11-26 5:20 ` Eric W. Biederman 2008-11-26 16:36 ` Steven Rostedt 1 sibling, 1 reply; 23+ messages in thread From: Eric W. Biederman @ 2008-11-26 5:20 UTC (permalink / raw) To: Steven Rostedt Cc: Dave Hansen, Andrew Morton, containers, fweisbec, Ingo Molnar, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn Steven Rostedt <rostedt@goodmis.org> writes: > On Tue, 25 Nov 2008, Dave Hansen wrote: >> > >> > I think the end result was, if this file can only be changed by root, then >> > we do not need to worry about namespaces. This file is a privileged file >> > that can only be modified by root. >> > >> > If someday we decide to let non admin users touch this file, then we would >> > need to care about this. This file may actually be modified in the future >> > by users, so this may become an issue. >> >> This really has very little to do with root vs non-root users. In fact, >> we're working towards having cases where we have many "root" users, even >> those inside namespaces. It is also quite possible for a normal root >> user to fork into a new pid namespace. In that case, root simply won't >> be able to use this feature because something like: >> >> echo $$ /debugfs/tracing/set_ftrace_pid >> >> just won't work. Let's look at a bit of the code. >> >> +static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) >> +{ >> + if (current->pid != ftrace_pid_trace) >> + return; >> + >> + ftrace_pid_function(ip, parent_ip); >> +} >> >> One thing this doesn't deal with is pid wraparound. Does that matter? > > Should not. This is just a way to trace a particular process. Currently > it traces all processes. If we wrap, then we trace the process with the > new pid. This should not be an issue. So. Using raw pid numbers in the kernel is bad form. The internal representation should be struct pid pointers as much as we can make them. I would 100% prefer it if ftrace_pid_func was written in terms of struct pid. That does guarantee you don't have pid wrap around issues. It almost makes it clear +static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) +{ + if (task_pid(current) == ftrace_pid_trace) + return; + + ftrace_pid_function(ip, parent_ip); +} We don't need locks to access the pid of current. >> If you want to fix this a bit, instead of saving off the pid_t in >> ftrace_pid_trace, you should save a 'struct pid'. You can get the >> 'struct pid' for a particular task by doing a find_get_pid(pid_t). You >> can then compare that pid_t to current by doing a >> pid_task(struct_pid_that_i_saved, PIDTYPE_PID). That will also protect >> against pid wraparound. >> >> The find_get_pid() is handy because it will do the pid_t lookup in the >> context of the current task's pid namespace, which is what you want, I >> think. > > Nope, we can not call that in this context. ftrace_pid_func is called > directly from mcount, without any protection. Of course you can't. But at the same time find_get_pid() is always supposed to be called on the user space pid ingress path. > struct pid *find_get_pid(pid_t nr) > { > struct pid *pid; > > rcu_read_lock(); > pid = get_pid(find_vpid(nr)); > rcu_read_unlock(); > > return pid; > } > EXPORT_SYMBOL_GPL(find_get_pid); > > This means find_get_pid will call mcount which will call ftrace_pid_func, > and back again. This can also happen with rcu_read_{un}lock() and > get_pid() and find_vpid(). > > We can not do anything special here. I don't see the whole path. But here is the deal. 1) Using struct pid and the proper find_get_pid() means that a user with the proper capabilities/permissions who happens to be running in a pid namespace can call this and it will just work. 2) The current best practices in the current are to: - call find_get_pid() when you capture a user space pid. - call put_pid() when you are done with it. Perhaps that is just: put_pid(ftrace_pid_trace); ftrace_pid_trace = find_get_pid(user_provided_pid_value); 3) If you ultimately want to support the full gamut: thread/process/process group/session. You will need to use struct pid pointer comparisons. 4) When I looked at the place you were concerned about races a) you were concerned about the wrong race. b) I don't see a race. c) You were filtering for the tid of a linux task not the tgid of a process. So the code didn't seem to be doing what you thought it was doing. 5) I keep thinking current->pid should be removed some day. So let's do this properly if we can. This is a privileged operation so we don't need to support people without the proper capabilities doing this. Or multiple comparisons or anything silly like that. But doing this with struct pid comparisons seems cleaner and more maintainable. And that really should matter. Eric ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 5:20 ` Eric W. Biederman @ 2008-11-26 16:36 ` Steven Rostedt 0 siblings, 0 replies; 23+ messages in thread From: Steven Rostedt @ 2008-11-26 16:36 UTC (permalink / raw) To: Eric W. Biederman Cc: Dave Hansen, Andrew Morton, containers, fweisbec, Ingo Molnar, LKML, srostedt, Sukadev Bhattiprolu, Serge E. Hallyn On Tue, 25 Nov 2008, Eric W. Biederman wrote: > Steven Rostedt <rostedt@goodmis.org> writes: > > > On Tue, 25 Nov 2008, Dave Hansen wrote: > >> > > >> > I think the end result was, if this file can only be changed by root, then > >> > we do not need to worry about namespaces. This file is a privileged file > >> > that can only be modified by root. > >> > > >> > If someday we decide to let non admin users touch this file, then we would > >> > need to care about this. This file may actually be modified in the future > >> > by users, so this may become an issue. > >> > >> This really has very little to do with root vs non-root users. In fact, > >> we're working towards having cases where we have many "root" users, even > >> those inside namespaces. It is also quite possible for a normal root > >> user to fork into a new pid namespace. In that case, root simply won't > >> be able to use this feature because something like: > >> > >> echo $$ /debugfs/tracing/set_ftrace_pid > >> > >> just won't work. Let's look at a bit of the code. > >> > >> +static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) > >> +{ > >> + if (current->pid != ftrace_pid_trace) > >> + return; > >> + > >> + ftrace_pid_function(ip, parent_ip); > >> +} > >> > >> One thing this doesn't deal with is pid wraparound. Does that matter? > > > > Should not. This is just a way to trace a particular process. Currently > > it traces all processes. If we wrap, then we trace the process with the > > new pid. This should not be an issue. > > So. Using raw pid numbers in the kernel is bad form. The internal > representation should be struct pid pointers as much as we can make > them. > > I would 100% prefer it if ftrace_pid_func was written in terms of struct > pid. That does guarantee you don't have pid wrap around issues. > It almost makes it clear > > +static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) > +{ > + if (task_pid(current) == ftrace_pid_trace) > + return; > + > + ftrace_pid_function(ip, parent_ip); > +} > > We don't need locks to access the pid of current. That version does not bother me. I'm not worried about locks as much as I am about recursion. If that "task_pid()" ever became a function that is traced by mcount, then it will end up in a recursive loop, and will crash the system. > > > >> If you want to fix this a bit, instead of saving off the pid_t in > >> ftrace_pid_trace, you should save a 'struct pid'. You can get the > >> 'struct pid' for a particular task by doing a find_get_pid(pid_t). You > >> can then compare that pid_t to current by doing a > >> pid_task(struct_pid_that_i_saved, PIDTYPE_PID). That will also protect > >> against pid wraparound. > >> > >> The find_get_pid() is handy because it will do the pid_t lookup in the > >> context of the current task's pid namespace, which is what you want, I > >> think. > > > > Nope, we can not call that in this context. ftrace_pid_func is called > > directly from mcount, without any protection. > > Of course you can't. But at the same time find_get_pid() is always > supposed to be called on the user space pid ingress path. > > > struct pid *find_get_pid(pid_t nr) > > { > > struct pid *pid; > > > > rcu_read_lock(); > > pid = get_pid(find_vpid(nr)); > > rcu_read_unlock(); > > > > return pid; > > } > > EXPORT_SYMBOL_GPL(find_get_pid); > > > > This means find_get_pid will call mcount which will call ftrace_pid_func, > > and back again. This can also happen with rcu_read_{un}lock() and > > get_pid() and find_vpid(). > > > > We can not do anything special here. > > I don't see the whole path. But here is the deal. > 1) Using struct pid and the proper find_get_pid() means that a user with > the proper capabilities/permissions who happens to be running in a pid > namespace can call this and it will just work. > > 2) The current best practices in the current are to: > - call find_get_pid() when you capture a user space pid. > - call put_pid() when you are done with it. > > Perhaps that is just: > put_pid(ftrace_pid_trace); > ftrace_pid_trace = find_get_pid(user_provided_pid_value); This may be fine. > > 3) If you ultimately want to support the full gamut: > thread/process/process group/session. You will need > to use struct pid pointer comparisons. > > 4) When I looked at the place you were concerned about races > a) you were concerned about the wrong race. > b) I don't see a race. > c) You were filtering for the tid of a linux task not > the tgid of a process. So the code didn't seem to > be doing what you thought it was doing. > > 5) I keep thinking current->pid should be removed some day. > > So let's do this properly if we can. This is a privileged operation > so we don't need to support people without the proper capabilities > doing this. Or multiple comparisons or anything silly like that. But > doing this with struct pid comparisons seems cleaner and more maintainable. And that > really should matter. Just so you understand what I'm concerned about: $ objdump -dr kernel/pid.o [...] 0000025f <find_get_pid>: 25f: 55 push %ebp 260: 89 e5 mov %esp,%ebp 262: 53 push %ebx 263: e8 fc ff ff ff call 264 <find_get_pid+0x5> 264: R_386_PC32 mcount 268: 89 c3 mov %eax,%ebx 26a: b8 01 00 00 00 mov $0x1,%eax looking in arch/x86/kernel/entry_32.S: ENTRY(mcount) cmpl $0, function_trace_stop jne ftrace_stub cmpl $ftrace_stub, ftrace_trace_function jnz trace [...] trace: pushl %eax pushl %ecx pushl %edx movl 0xc(%esp), %eax movl 0x4(%ebp), %edx subl $MCOUNT_INSN_SIZE, %eax call *ftrace_trace_function looking in kerne/trace/ftrace.c: if (ftrace_pid_trace >= 0) { set_ftrace_pid_function(func); func = ftrace_pid_func; } ftrace_trace_function = func; And we then have static void ftrace_pid_func(unsigned long ip, unsigned long parent_ip) { if (current->pid != ftrace_pid_trace) return; ftrace_pid_function(ip, parent_ip); } Now by having ftrace_pid_func call find_get_pid, we have the function flow of... schedule() /* any traced function */ --> mcount --> *ftrace_trace_function == ftrace_pid_func ftrace_pid_func --> find_get_pid --> mcount --> *ftrace_trace_function == ftrace_pid_func ftrace_pid_func --> find_get_pid [ ad infinitum ] The comparison must be very careful not to call anything that will also trace. I can add code to catch this recursion, but this is overhead I do not want to add. Remember, this is called on every function call. If we do the work at the time we set ftrace_pid_trace and we can do simple pointer comparisons in the ftrace_pid_func, I will be happy with that. I'm still learning about this pid namespace, so I'll probably screw it up a few more times ;-) -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-25 22:34 ` [PATCH 1/3] ftrace: add function tracing to single thread Steven Rostedt 2008-11-25 22:42 ` Andrew Morton @ 2008-11-26 0:02 ` Frédéric Weisbecker 2008-11-26 0:16 ` Steven Rostedt 1 sibling, 1 reply; 23+ messages in thread From: Frédéric Weisbecker @ 2008-11-26 0:02 UTC (permalink / raw) To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Steven Rostedt 2008/11/25 Steven Rostedt <rostedt@goodmis.org>: > From: Steven Rostedt <srostedt@redhat.com> > > Impact: feature to function trace a single thread > > This patch adds the ability to function trace a single thread. > The file: > > /debugfs/tracing/set_ftrace_pid That's a good idea! About the pid, I think they are unique (which is not the case for tgid for multiple threads). pid are tgid + number of the thread.... But perhaps I'm wrong.... ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/3] ftrace: add function tracing to single thread 2008-11-26 0:02 ` Frédéric Weisbecker @ 2008-11-26 0:16 ` Steven Rostedt 0 siblings, 0 replies; 23+ messages in thread From: Steven Rostedt @ 2008-11-26 0:16 UTC (permalink / raw) To: Frédéric Weisbecker Cc: linux-kernel, Ingo Molnar, Andrew Morton, Steven Rostedt On Wed, 26 Nov 2008, Fr?d?ric Weisbecker wrote: > 2008/11/25 Steven Rostedt <rostedt@goodmis.org>: > > From: Steven Rostedt <srostedt@redhat.com> > > > > Impact: feature to function trace a single thread > > > > This patch adds the ability to function trace a single thread. > > The file: > > > > /debugfs/tracing/set_ftrace_pid > > > That's a good idea! > > About the pid, I think they are unique (which is not the case for tgid > for multiple threads). > pid are tgid + number of the thread.... > > But perhaps I'm wrong.... Hmm, I can see that there could be times we would want to trace all threads, and times to trace only a single thread. This should default to the tgid. We could echo in "t:pid" to trace per thread. and just "pid" to trace non all threads in a group. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 2/3] ftrace: use code patching for ftrace return tracer 2008-11-25 22:34 [PATCH 0/3] ftrace: new features Steven Rostedt 2008-11-25 22:34 ` [PATCH 1/3] ftrace: add function tracing to single thread Steven Rostedt @ 2008-11-25 22:34 ` Steven Rostedt 2008-11-26 0:13 ` Frédéric Weisbecker 2008-11-26 0:23 ` Steven Rostedt 2008-11-25 22:34 ` [PATCH 3/3] ftrace: let function tracing and function return run together Steven Rostedt 2 siblings, 2 replies; 23+ messages in thread From: Steven Rostedt @ 2008-11-25 22:34 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Steven Rostedt [-- Attachment #1: 0002-ftrace-use-code-patching-for-ftrace-return-tracer.patch --] [-- Type: text/plain, Size: 6605 bytes --] From: Steven Rostedt <rostedt@goodmis.org> Impact: more efficient code for ftrace return tracer This patch uses the dynamic patching, when available, to patch the function return code into the kernel. This patch will ease the way for letting both function tracing and function return tracing run together. Signed-off-by: Steven Rostedt <srostedt@redhat.com> --- arch/x86/kernel/entry_32.S | 5 ++++ arch/x86/kernel/ftrace.c | 46 +++++++++++++++++++++++++++++++++++++++++++- include/linux/ftrace.h | 5 ++++ kernel/trace/ftrace.c | 34 ++++++++++++++------------------ 4 files changed, 70 insertions(+), 20 deletions(-) diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S index 7a934c0..d6f0333 100644 --- a/arch/x86/kernel/entry_32.S +++ b/arch/x86/kernel/entry_32.S @@ -1185,6 +1185,11 @@ ftrace_call: popl %edx popl %ecx popl %eax +#ifdef CONFIG_FUNCTION_RET_TRACER +.globl ftrace_return_call +ftrace_return_call: + jmp ftrace_stub +#endif .globl ftrace_stub ftrace_stub: diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c index bb137f7..81f8581 100644 --- a/arch/x86/kernel/ftrace.c +++ b/arch/x86/kernel/ftrace.c @@ -111,7 +111,6 @@ static void ftrace_mod_code(void) */ mod_code_status = probe_kernel_write(mod_code_ip, mod_code_newcode, MCOUNT_INSN_SIZE); - } void ftrace_nmi_enter(void) @@ -258,6 +257,51 @@ int ftrace_update_ftrace_func(ftrace_func_t func) return ret; } +#ifdef CONFIG_FUNCTION_RET_TRACER +extern void ftrace_return_call(void); + +static int ftrace_mod_jmp(unsigned long ip, + int old_offset, int new_offset) +{ + unsigned char code[MCOUNT_INSN_SIZE]; + + if (probe_kernel_read(code, (void *)ip, MCOUNT_INSN_SIZE)) + return -EFAULT; + + if (code[0] != 0xe9 || old_offset != *(int *)(&code[1])) + return -EINVAL; + + *(int *)(&code[1]) = new_offset; + + if (do_ftrace_mod_code(ip, &code)) + return -EPERM; + + return 0; +} + +int ftrace_enable_ftrace_return_caller(void) +{ + unsigned long ip = (unsigned long)(&ftrace_return_call); + int old_offset, new_offset; + + old_offset = (unsigned long)(&ftrace_stub) - (ip + MCOUNT_INSN_SIZE); + new_offset = (unsigned long)(&ftrace_return_caller) - (ip + MCOUNT_INSN_SIZE); + + return ftrace_mod_jmp(ip, old_offset, new_offset); +} + +int ftrace_disable_ftrace_return_caller(void) +{ + unsigned long ip = (unsigned long)(&ftrace_return_call); + int old_offset, new_offset; + + old_offset = (unsigned long)(&ftrace_return_caller) - (ip + MCOUNT_INSN_SIZE); + new_offset = (unsigned long)(&ftrace_stub) - (ip + MCOUNT_INSN_SIZE); + + return ftrace_mod_jmp(ip, old_offset, new_offset); +} +#endif /* CONFIG_FUNCTION_RET_TRACER */ + int __init ftrace_dyn_arch_init(void *data) { extern const unsigned char ftrace_test_p6nop[]; diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h index 7854d87..73147eb 100644 --- a/include/linux/ftrace.h +++ b/include/linux/ftrace.h @@ -117,6 +117,11 @@ extern void ftrace_call(void); extern void mcount_call(void); #ifdef CONFIG_FUNCTION_RET_TRACER extern void ftrace_return_caller(void); +extern int ftrace_enable_ftrace_return_caller(void); +extern int ftrace_disable_ftrace_return_caller(void); +#else +static inline int ftrace_enable_ftrace_return_caller(void) { } +static inline int ftrace_disable_ftrace_return_caller(void) { } #endif /** diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index e8cf629..03223b5 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -281,6 +281,8 @@ enum { FTRACE_UPDATE_TRACE_FUNC = (1 << 2), FTRACE_ENABLE_MCOUNT = (1 << 3), FTRACE_DISABLE_MCOUNT = (1 << 4), + FTRACE_START_FUNC_RET = (1 << 5), + FTRACE_STOP_FUNC_RET = (1 << 6), }; static int ftrace_filtered; @@ -465,14 +467,7 @@ __ftrace_replace_code(struct dyn_ftrace *rec, int enable) unsigned long ip, fl; unsigned long ftrace_addr; -#ifdef CONFIG_FUNCTION_RET_TRACER - if (ftrace_tracing_type == FTRACE_TYPE_ENTER) - ftrace_addr = (unsigned long)ftrace_caller; - else - ftrace_addr = (unsigned long)ftrace_return_caller; -#else ftrace_addr = (unsigned long)ftrace_caller; -#endif ip = rec->ip; @@ -605,6 +600,11 @@ static int __ftrace_modify_code(void *data) if (*command & FTRACE_UPDATE_TRACE_FUNC) ftrace_update_ftrace_func(ftrace_trace_function); + if (*command & FTRACE_START_FUNC_RET) + ftrace_enable_ftrace_return_caller(); + else if (*command & FTRACE_STOP_FUNC_RET) + ftrace_disable_ftrace_return_caller(); + return 0; } @@ -629,10 +629,8 @@ static void ftrace_startup_enable(int command) ftrace_run_update_code(command); } -static void ftrace_startup(void) +static void ftrace_startup(int command) { - int command = 0; - if (unlikely(ftrace_disabled)) return; @@ -645,10 +643,8 @@ static void ftrace_startup(void) mutex_unlock(&ftrace_start_lock); } -static void ftrace_shutdown(void) +static void ftrace_shutdown(int command) { - int command = 0; - if (unlikely(ftrace_disabled)) return; @@ -1453,8 +1449,8 @@ device_initcall(ftrace_nodyn_init); static inline int ftrace_init_dyn_debugfs(struct dentry *d_tracer) { return 0; } static inline void ftrace_startup_enable(int command) { } -# define ftrace_startup() do { } while (0) -# define ftrace_shutdown() do { } while (0) +static inline void ftrace_startup(int command) { } +static inline void ftrace_shutdown(int command) { } # define ftrace_startup_sysctl() do { } while (0) # define ftrace_shutdown_sysctl() do { } while (0) #endif /* CONFIG_DYNAMIC_FTRACE */ @@ -1585,7 +1581,7 @@ int register_ftrace_function(struct ftrace_ops *ops) } ret = __register_ftrace_function(ops); - ftrace_startup(); + ftrace_startup(0); out: mutex_unlock(&ftrace_sysctl_lock); @@ -1604,7 +1600,7 @@ int unregister_ftrace_function(struct ftrace_ops *ops) mutex_lock(&ftrace_sysctl_lock); ret = __unregister_ftrace_function(ops); - ftrace_shutdown(); + ftrace_shutdown(0); mutex_unlock(&ftrace_sysctl_lock); return ret; @@ -1747,8 +1743,8 @@ int register_ftrace_return(trace_function_return_t func) goto out; } ftrace_tracing_type = FTRACE_TYPE_RETURN; + ftrace_startup(FTRACE_START_FUNC_RET); ftrace_function_return = func; - ftrace_startup(); out: mutex_unlock(&ftrace_sysctl_lock); @@ -1761,7 +1757,7 @@ void unregister_ftrace_return(void) atomic_dec(&ftrace_retfunc_active); ftrace_function_return = (trace_function_return_t)ftrace_stub; - ftrace_shutdown(); + ftrace_shutdown(FTRACE_STOP_FUNC_RET); /* Restore normal tracing type */ ftrace_tracing_type = FTRACE_TYPE_ENTER; -- 1.5.6.5 -- ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 2/3] ftrace: use code patching for ftrace return tracer 2008-11-25 22:34 ` [PATCH 2/3] ftrace: use code patching for ftrace return tracer Steven Rostedt @ 2008-11-26 0:13 ` Frédéric Weisbecker 2008-11-26 0:23 ` Steven Rostedt 1 sibling, 0 replies; 23+ messages in thread From: Frédéric Weisbecker @ 2008-11-26 0:13 UTC (permalink / raw) To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Steven Rostedt 2008/11/25 Steven Rostedt <rostedt@goodmis.org>: > From: Steven Rostedt <rostedt@goodmis.org> > > Impact: more efficient code for ftrace return tracer > > This patch uses the dynamic patching, when available, to patch > the function return code into the kernel. > > This patch will ease the way for letting both function tracing > and function return tracing run together. > > Signed-off-by: Steven Rostedt <srostedt@redhat.com> > --- > arch/x86/kernel/entry_32.S | 5 ++++ > arch/x86/kernel/ftrace.c | 46 +++++++++++++++++++++++++++++++++++++++++++- > include/linux/ftrace.h | 5 ++++ > kernel/trace/ftrace.c | 34 ++++++++++++++------------------ > 4 files changed, 70 insertions(+), 20 deletions(-) > > diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S > index 7a934c0..d6f0333 100644 > --- a/arch/x86/kernel/entry_32.S > +++ b/arch/x86/kernel/entry_32.S > @@ -1185,6 +1185,11 @@ ftrace_call: > popl %edx > popl %ecx > popl %eax > +#ifdef CONFIG_FUNCTION_RET_TRACER > +.globl ftrace_return_call > +ftrace_return_call: > + jmp ftrace_stub > +#endif And you patch ftrace_stub to ftrace_return_caller. I see... Seems a good idea :-) Acked-by: Frederic Weisbecker <fweisbec@gmail.com> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/3] ftrace: use code patching for ftrace return tracer 2008-11-25 22:34 ` [PATCH 2/3] ftrace: use code patching for ftrace return tracer Steven Rostedt 2008-11-26 0:13 ` Frédéric Weisbecker @ 2008-11-26 0:23 ` Steven Rostedt 1 sibling, 0 replies; 23+ messages in thread From: Steven Rostedt @ 2008-11-26 0:23 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Steven Rostedt On Tue, 25 Nov 2008, Steven Rostedt wrote: > From: Steven Rostedt <rostedt@goodmis.org> > int __init ftrace_dyn_arch_init(void *data) > { > extern const unsigned char ftrace_test_p6nop[]; > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h > index 7854d87..73147eb 100644 > --- a/include/linux/ftrace.h > +++ b/include/linux/ftrace.h > @@ -117,6 +117,11 @@ extern void ftrace_call(void); > extern void mcount_call(void); > #ifdef CONFIG_FUNCTION_RET_TRACER > extern void ftrace_return_caller(void); > +extern int ftrace_enable_ftrace_return_caller(void); > +extern int ftrace_disable_ftrace_return_caller(void); > +#else > +static inline int ftrace_enable_ftrace_return_caller(void) { } > +static inline int ftrace_disable_ftrace_return_caller(void) { } > #endif > This is what I get for not testing the off case :-p The above should be: +static inline int ftrace_enable_ftrace_return_caller(void) { return 0; } +static inline int ftrace_disable_ftrace_return_caller(void) { return 0; } I'll fix and push again. -- Steve ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 3/3] ftrace: let function tracing and function return run together 2008-11-25 22:34 [PATCH 0/3] ftrace: new features Steven Rostedt 2008-11-25 22:34 ` [PATCH 1/3] ftrace: add function tracing to single thread Steven Rostedt 2008-11-25 22:34 ` [PATCH 2/3] ftrace: use code patching for ftrace return tracer Steven Rostedt @ 2008-11-25 22:34 ` Steven Rostedt 2 siblings, 0 replies; 23+ messages in thread From: Steven Rostedt @ 2008-11-25 22:34 UTC (permalink / raw) To: linux-kernel Cc: Ingo Molnar, Andrew Morton, Frederic Weisbecker, Steven Rostedt [-- Attachment #1: 0003-ftrace-let-function-tracing-and-function-return-run.patch --] [-- Type: text/plain, Size: 2126 bytes --] From: Steven Rostedt <rostedt@goodmis.org> Impact: feature This patch enables function tracing and function return to run together. I've tested this by enabling the stack tracer and return tracer, where both the function entry and function return are used together with dynamic ftrace. Signed-off-by: Steven Rostedt <srostedt@redhat.com> --- kernel/trace/ftrace.c | 19 ------------------- 1 files changed, 0 insertions(+), 19 deletions(-) diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 03223b5..4d6ce65 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -53,9 +53,6 @@ static int ftrace_pid_trace = -1; /* Quick disabling of function tracer. */ int function_trace_stop; -/* By default, current tracing type is normal tracing. */ -enum ftrace_tracing_type_t ftrace_tracing_type = FTRACE_TYPE_ENTER; - /* * ftrace_disabled is set when an anomaly is discovered. * ftrace_disabled is much stronger than ftrace_enabled. @@ -1575,11 +1572,6 @@ int register_ftrace_function(struct ftrace_ops *ops) mutex_lock(&ftrace_sysctl_lock); - if (ftrace_tracing_type == FTRACE_TYPE_RETURN) { - ret = -EBUSY; - goto out; - } - ret = __register_ftrace_function(ops); ftrace_startup(0); @@ -1728,21 +1720,12 @@ int register_ftrace_return(trace_function_return_t func) mutex_lock(&ftrace_sysctl_lock); - /* - * Don't launch return tracing if normal function - * tracing is already running. - */ - if (ftrace_trace_function != ftrace_stub) { - ret = -EBUSY; - goto out; - } atomic_inc(&ftrace_retfunc_active); ret = start_return_tracing(); if (ret) { atomic_dec(&ftrace_retfunc_active); goto out; } - ftrace_tracing_type = FTRACE_TYPE_RETURN; ftrace_startup(FTRACE_START_FUNC_RET); ftrace_function_return = func; @@ -1758,8 +1741,6 @@ void unregister_ftrace_return(void) atomic_dec(&ftrace_retfunc_active); ftrace_function_return = (trace_function_return_t)ftrace_stub; ftrace_shutdown(FTRACE_STOP_FUNC_RET); - /* Restore normal tracing type */ - ftrace_tracing_type = FTRACE_TYPE_ENTER; mutex_unlock(&ftrace_sysctl_lock); } -- 1.5.6.5 -- ^ permalink raw reply related [flat|nested] 23+ messages in thread
end of thread, other threads:[~2008-11-26 16:36 UTC | newest] Thread overview: 23+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-11-25 22:34 [PATCH 0/3] ftrace: new features Steven Rostedt 2008-11-25 22:34 ` [PATCH 1/3] ftrace: add function tracing to single thread Steven Rostedt 2008-11-25 22:42 ` Andrew Morton 2008-11-25 23:01 ` Steven Rostedt 2008-11-25 23:31 ` Andrew Morton 2008-11-26 0:11 ` Steven Rostedt 2008-11-26 0:53 ` Dave Hansen 2008-11-26 1:01 ` Steven Rostedt 2008-11-26 4:50 ` Eric W. Biederman 2008-11-26 5:01 ` Steven Rostedt 2008-11-26 6:23 ` Eric W. Biederman 2008-11-26 6:32 ` Ingo Molnar 2008-11-26 7:02 ` Eric W. Biederman 2008-11-26 7:18 ` Ingo Molnar 2008-11-26 8:48 ` Eric W. Biederman 2008-11-26 5:20 ` Eric W. Biederman 2008-11-26 16:36 ` Steven Rostedt 2008-11-26 0:02 ` Frédéric Weisbecker 2008-11-26 0:16 ` Steven Rostedt 2008-11-25 22:34 ` [PATCH 2/3] ftrace: use code patching for ftrace return tracer Steven Rostedt 2008-11-26 0:13 ` Frédéric Weisbecker 2008-11-26 0:23 ` Steven Rostedt 2008-11-25 22:34 ` [PATCH 3/3] ftrace: let function tracing and function return run together Steven Rostedt
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.