linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Tom Zanussi <tom.zanussi@linux.intel.com>,
	linux-rt-users@vger.kernel.org,
	linux-trace-users@vger.kernel.org,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Clark Williams <williams@redhat.com>,
	Jiri Olsa <jolsa@redhat.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Subject: Re: [PATCH 00/18] [ANNOUNCE] Dynamically created function based events
Date: Sat, 3 Feb 2018 13:52:33 -0500	[thread overview]
Message-ID: <20180203135233.797c41d6@gandalf.local.home> (raw)
In-Reply-To: <20180202230458.840252014@goodmis.org>


I need to add a "Quick guide" and "Tips and tricks" section to the
document. To explain the arguments better...

Simple args are:

 "int val", "unsigned int val", "char x", "unsigned long addr",

Also the types:

 "s32 val", "u32 val", "s8 x", "u64 addr"

The above are all printed in decimal "%d" or "%u", if you want hex...

 "x32 val", "x8 x", "x64 addr"

If you want to have it use "%pS" to print (symbols)

 "symbol addr" is like: "%pS", (void *)addr

Arrays can be expressed after the type:

 "x8[6] mac" is like: "%x,%x,%x,%x,%x,%x", mac[0], mac[1], mac[2],
                                           mac[3], mac[4], mac[5]

Where mac would be: unsigned char mac[6] type.

Note, arrays of type "char" and "unsigned char" turn into a static
string.

 "char[10] str" is like: "%s", str

Where str is defined as char str[10];

If the argument is a pointer to a structure, you can index into the
structure:

 "x64 ip[16]" is like: "%llx", ((u64 *)ip)[16]

Finally, if an argument is a pointer to a structure, and you want to
get to another structure that it points to, for example

 struct sk_buff *skb;

and you want to get to:

  skb->dev->perm_addr

when the parameter is a pointer to skb.

  (gdb) printf "%d\n", &((struct sk_buff *)0)->dev
16
  (gdb) printf "%d\n", &((struct net_device *)0)->perm_addr
558

The net_device *dev is 16 bytes into sk_buff, and the char array
perm_addr, is 558 bytes into the net_device structure.

Where perm_addr is an array of the mac address.

 "x8[6] perm_addr+16[0]+558"

Which would turn into:

 char *dev = (char **)(((void *)skb)+16)[0];
 char *perm_addr = (char *)(dev+558);

 "%x,%x,%x,%x,%x,%x", perm_addr[0], perm_addr[1], perm_addr[2],
                      perm_addr[3], perm_addr[4], perm_addr[5]

OK, the above is a bit complex ;-) But works nicely.

-- Steve

On Fri, 02 Feb 2018 18:04:58 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> At Kernel Summit back in October, we tried to bring up trace markers, which
> would be nops within the kernel proper, that would allow modules to hook
> arbitrary trace events to them. The reaction to this proposal was less than
> favorable. We were told that we were trying to make a work around for a
> problem, and not solving it. The problem in our minds is the notion of a
> "stable trace event".
> 
> There are maintainers that do not want trace events, or more trace events in
> their subsystems. This is due to the fact that trace events post an
> interface to user space, and this interface could become required by some
> tool. This may cause the trace event to become stable where it must not
> break the tool, and thus prevent the code from changing.
> 
> Or, the trace event may just have to add padding for fields that tools
> may require. The "success" field of the sched_wakeup trace event is one such
> instance. There is no more "success" variable, but tools may fail if it were
> to go away, so a "1" is simply added to the trace event wasting ring buffer
> real estate.
> 
> I talked with Linus about this, and he told me that we already have these
> markers in the kernel. They are from the mcount/__fentry__ used by function
> tracing. Have the trace events be created by these, and see if this will
> satisfy most areas that want trace events.
> 
> I decided to implement this idea, and here's the patch set.
> 
> Introducing "function based events". These are created dynamically by a
> tracefs file called "function_events". By writing a pseudo prototype into
> this file, you create an event.
> 
>  # mount -t tracefs nodev /sys/kernel/tracing
>  # cd /sys/kernel/tracing
>  # echo 'do_IRQ(symbol ip[16] | x64[6] irq_stack[16])' > function_events
>  # cat events/functions/do_IRQ/format
> name: do_IRQ
> ID: 1399
> format:
> 	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
> 	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
> 	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
> 	field:int common_pid;	offset:4;	size:4;	signed:1;
> 
> 	field:unsigned long __parent_ip;	offset:8;	size:8;	signed:0;
> 	field:unsigned long __ip;	offset:16;	size:8;	signed:0;
> 	field:symbol ip;	offset:24;	size:8;	signed:0;
> 	field:x64 irq_stack[6];	offset:32;	size:48;	signed:0;
> 
> print fmt: "%pS->%pS(ip=%pS, irq_stack=%llx:%llx:%llx:%llx:%llx:%llx)", REC->__ip, REC->__parent_ip,
> REC->ip, REC->irq_stack[0], REC->irq_stack[1], REC->irq_stack[2], REC->irq_stack[3], REC->irq_stack[4],
> REC->irq_stack[5]
> 
>  # echo 1 > events/functions/do_IRQ/enable
>  # cat trace
>           <idle>-0     [003] d..3  3647.049344: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.049433: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.049672: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.325709: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.325929: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.325993: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.387571: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.387791: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
>           <idle>-0     [003] d..3  3647.387874: ret_from_intr->do_IRQ(ip=cpuidle_enter_state+0xb1/0x330, irq_stack=ffffffff81665db1,10,246,ffffc900006c3e80,18,ffff88011eae9b40)
> 
> And this is much more powerful than just this. We can show strings, and
> index off of structures into other structures.
> 
>   # echo '__vfs_read(symbol read+40[0]+16)' > function_events
> 
>   # echo 1 > events/functions/__vfs_read/enable
>   # cat trace
>          sshd-1343  [005] ...2   199.734752: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
>          bash-1344  [003] ...2   199.734822: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
>          sshd-1343  [005] ...2   199.734835: vfs_read->__vfs_read(read=tty_read+0x0/0xf0)
>  avahi-daemon-910   [003] ...2   200.136740: vfs_read->__vfs_read(read=          (null))
>  avahi-daemon-910   [003] ...2   200.136750: vfs_read->__vfs_read(read=          (null))
> 
> And even read user space:
> 
>   # echo 'SyS_openat(int dfd, string path, x32 flags, x16 mode)' > function_events
>   # echo 1 > events/functions/enable
>   # grep task_fork /proc/kallsyms 
> ffffffff810d5a60 t task_fork_fair
> ffffffff810dfc30 t task_fork_dl
>   # cat trace
>             grep-1820  [000] ...2  3926.107603: entry_SYSCALL_64_fastpath->SyS_openat(dfd=-100, path=/proc/kallsyms, flags=100, mode=0)
> 
> These are fully functional events. That is, they work with ftrace,
> trace-cmd, perf, histograms, triggers, and eBPF.
> 
> What's next? I need to rewrite the function graph tracer, and be able to add
> dynamic events on function return.
> 
> I made this work with x86 with a simple function that only returns
> 6 function parameters for x86_64 and 3 for x86_32. But this could easily
> be extended.
> 
> Cheers!
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
> ftrace/dynamic-ftrace-events
> 
> Head SHA1: 30fbdffd5d38bd27b04fb911f7158f10a99be8c4
> 
> 
> Steven Rostedt (VMware) (18):
>       tracing: Add function based events
>       tracing: Add documentation for function based events
>       tracing: Add simple arguments to function based events
>       tracing/x86: Add arch_get_func_args() function
>       tracing: Add hex print for dynamic ftrace based events
>       tracing: Add indirect offset to args of ftrace based events
>       tracing: Add dereferencing multiple fields per arg
>       tracing: Add "unsigned" to function based events
>       tracing: Add indexing of arguments for function based events
>       tracing: Make func_type enums for easier comparing of arg types
>       tracing: Add symbol type to function based events
>       tracing: Add accessing direct address from function based events
>       tracing: Add array type to function based events
>       tracing: Have char arrays be strings for function based events
>       tracing: Add string type for dynamic strings in function based events
>       tracing: Add NULL to skip args for function based events
>       tracing: Add indirect to indirect access for function based events
>       tracing/perf: Allow perf to use function based events
> 
> ----
>  Documentation/trace/function-based-events.rst |  426 ++++++++
>  arch/x86/kernel/ftrace.c                      |   28 +
>  include/linux/trace_events.h                  |    2 +
>  kernel/trace/Kconfig                          |   12 +
>  kernel/trace/Makefile                         |    1 +
>  kernel/trace/trace.h                          |   11 +
>  kernel/trace/trace_event_ftrace.c             | 1440 +++++++++++++++++++++++++
>  kernel/trace/trace_probe.h                    |   11 -
>  8 files changed, 1920 insertions(+), 11 deletions(-)
>  create mode 100644 Documentation/trace/function-based-events.rst
>  create mode 100644 kernel/trace/trace_event_ftrace.c
> --
> To unsubscribe from this list: send the line "unsubscribe linux-trace-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


  parent reply	other threads:[~2018-02-03 18:52 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-02 23:04 [PATCH 00/18] [ANNOUNCE] Dynamically created function based events Steven Rostedt
2018-02-02 23:04 ` [PATCH 01/18] tracing: Add " Steven Rostedt
2018-02-05  8:24   ` Jiri Olsa
2018-02-05 15:00     ` Steven Rostedt
2018-02-07  3:09       ` Steven Rostedt
2018-02-07 12:06         ` Jiri Olsa
2018-02-02 23:05 ` [PATCH 02/18] tracing: Add documentation for " Steven Rostedt
2018-02-02 23:05 ` [PATCH 03/18] tracing: Add simple arguments to " Steven Rostedt
2018-02-08 10:18   ` Namhyung Kim
2018-02-08 15:37     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 04/18] tracing/x86: Add arch_get_func_args() function Steven Rostedt
2018-02-05 16:33   ` Masami Hiramatsu
2018-02-05 17:06     ` Steven Rostedt
2018-02-08  5:28   ` Namhyung Kim
2018-02-08 15:29     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 05/18] tracing: Add hex print for dynamic ftrace based events Steven Rostedt
2018-02-02 23:05 ` [PATCH 06/18] tracing: Add indirect offset to args of " Steven Rostedt
2018-02-02 23:05 ` [PATCH 07/18] tracing: Add dereferencing multiple fields per arg Steven Rostedt
2018-02-02 23:05 ` [PATCH 08/18] tracing: Add "unsigned" to function based events Steven Rostedt
2018-02-02 23:05 ` [PATCH 09/18] tracing: Add indexing of arguments for " Steven Rostedt
2018-02-08 10:59   ` Namhyung Kim
2018-02-08 15:43     ` Steven Rostedt
2018-02-08 23:56       ` Namhyung Kim
2018-02-09  0:19         ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 10/18] tracing: Make func_type enums for easier comparing of arg types Steven Rostedt
2018-02-02 23:05 ` [PATCH 11/18] tracing: Add symbol type to function based events Steven Rostedt
2018-02-08 11:03   ` Namhyung Kim
2018-02-08 15:48     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 12/18] tracing: Add accessing direct address from " Steven Rostedt
2018-02-09  0:34   ` Namhyung Kim
2018-02-09  1:10     ` Steven Rostedt
2018-02-09 22:07     ` Steven Rostedt
2018-02-12  2:06       ` Namhyung Kim
2018-02-12 15:47         ` Masami Hiramatsu
2018-02-12 16:47           ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 13/18] tracing: Add array type to " Steven Rostedt
2018-02-03 13:56   ` Masami Hiramatsu
2018-02-03 15:29     ` Steven Rostedt
2018-02-04  3:50       ` Masami Hiramatsu
2018-02-09  1:17   ` Namhyung Kim
2018-02-09  1:54     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 14/18] tracing: Have char arrays be strings for " Steven Rostedt
2018-02-02 23:05 ` [PATCH 15/18] tracing: Add string type for dynamic strings in " Steven Rostedt
2018-02-09  3:15   ` Namhyung Kim
2018-02-09  3:31     ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 16/18] tracing: Add NULL to skip args for " Steven Rostedt
2018-02-02 23:05 ` [PATCH 17/18] tracing: Add indirect to indirect access " Steven Rostedt
2018-02-09  5:13   ` Namhyung Kim
2018-02-09 15:47     ` Steven Rostedt
2018-02-09 17:18       ` Steven Rostedt
2018-02-12  2:15       ` Namhyung Kim
2018-02-12 17:23         ` Steven Rostedt
2018-02-13  9:27           ` Namhyung Kim
2018-02-13 15:28             ` Steven Rostedt
2018-02-02 23:05 ` [PATCH 18/18] tracing/perf: Allow perf to use " Steven Rostedt
2018-02-03 13:38 ` [PATCH 00/18] [ANNOUNCE] Dynamically created " Masami Hiramatsu
2018-02-03 15:27   ` Steven Rostedt
2018-02-04  3:57     ` Masami Hiramatsu
2018-02-03 17:04 ` Mathieu Desnoyers
2018-02-03 19:02   ` Steven Rostedt
2018-02-03 20:52     ` Alexei Starovoitov
2018-02-03 21:08       ` Steven Rostedt
2018-02-03 21:30         ` Alexei Starovoitov
2018-02-04  2:37           ` Namhyung Kim
2018-02-04 15:50         ` Mathieu Desnoyers
2018-02-03 21:17       ` Steven Rostedt
2018-02-03 21:38         ` Alexei Starovoitov
2018-02-04  2:25         ` Namhyung Kim
2018-02-05 15:02           ` Steven Rostedt
2018-02-05 13:53         ` Juri Lelli
2018-02-05 15:07           ` Steven Rostedt
2018-02-03 21:43   ` Linus Torvalds
2018-02-04 15:30     ` Mathieu Desnoyers
2018-02-04 15:47       ` Steven Rostedt
2018-02-04 19:39       ` Linus Torvalds
2018-02-05 10:09         ` Peter Zijlstra
2018-02-05 15:10           ` Steven Rostedt
2018-02-05 15:14         ` Masami Hiramatsu
2018-02-03 18:52 ` Steven Rostedt [this message]
2018-02-05 10:23 ` Juri Lelli
2018-02-05 10:49   ` Daniel Bristot de Oliveira
2018-02-05 15:11     ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180203135233.797c41d6@gandalf.local.home \
    --to=rostedt@goodmis.org \
    --cc=acme@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=alexei.starovoitov@gmail.com \
    --cc=bristot@redhat.com \
    --cc=corbet@lwn.net \
    --cc=jolsa@redhat.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=linux-trace-users@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=tom.zanussi@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).