All of lore.kernel.org
 help / color / mirror / Atom feed
From: Francis Laniel <flaniel@linux.microsoft.com>
To: Masami Hiramatsu <mhiramat@kernel.org>
Cc: linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-trace-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v1 1/1] tracing/kprobe: Add multi-probe support for 'perf_kprobe' PMU
Date: Fri, 18 Aug 2023 20:12:11 +0200	[thread overview]
Message-ID: <5702263.DvuYhMxLoT@pwmachine> (raw)
In-Reply-To: <20230818220537.75ce8210c6a4c80a5a8d16f8@kernel.org>

Hi.

Le vendredi 18 août 2023, 15:05:37 CEST Masami Hiramatsu a écrit :
> On Thu, 17 Aug 2023 13:06:20 +0200
> 
> Francis Laniel <flaniel@linux.microsoft.com> wrote:
> > Hi.
> > 
> > Le jeudi 17 août 2023, 09:50:57 CEST Masami Hiramatsu a écrit :
> > > Hi,
> > > 
> > > On Wed, 16 Aug 2023 18:35:17 +0200
> > > 
> > > Francis Laniel <flaniel@linux.microsoft.com> wrote:
> > > > When using sysfs, it is possible to create kprobe for several kernel
> > > > functions sharing the same name, but of course with different
> > > > addresses,
> > > > by writing their addresses in kprobe_events file.
> > > > 
> > > > When using PMU, if only the symbol name is given, the event will be
> > > > created for the first address which matches the symbol, as returned by
> > > > kallsyms_lookup_name().
> > > 
> > > Do you mean probing the same name symbols? Yes, it is intended behavior,
> > > since it is not always true that the same name function has the same
> > > prototype (it is mostly true but is not ensured), it is better to leave
> > > user to decide which one is what you want to probe.
> > 
> > This is what I meant.
> > I also share your mind regarding leaving the users deciding which one they
> > want to probe but in my case (which I agree is a bit a corner one) it
> > leaded me to misunderstanding as the PMU kprobe was only added to the
> > first ntfs_file_write_iter() which is not the one for ntfs3.
> 
> Hmm, OK. I think in that case (multiple same-name symbols exist) the default
> behavior is rejecting with error message. And optionally, it will probe all
> or them like your patch.

I am not sure to understand.
Can you please precise the default behavior of which software component?

> > > Have you used 'perf probe' tool? It tries to find the appropriate
> > > function
> > > by line number and creates the probe by 'text+OFFSET' style, not by
> > > symbol.
> > > I think this is the correct way to do that, because user will not know
> > > which 'address' of the symbol is what the user want.
> > 
> > 'perf probe' perfectly does the trick, as it would find all the kernel
> > addresses which correspond to the symbol name and create as many probes as
> > corresponding symbols [1]:
> > root@vm-amd64:~# perf probe --add ntfs_file_write_iter
> 
> If you can specify the (last part of) file path as below,
> 
> perf probe --add ntfs_file_write_iter@ntfs3/file.c
> 
> Then it will choose correct one. :)

Nice! TIL thank you! perf is really powerful!

> > Added new events:
> >   probe:ntfs_file_write_iter (on ntfs_file_write_iter)
> >   probe:ntfs_file_write_iter (on ntfs_file_write_iter)
> > 
> > You can now use it in all perf tools, such as:
> >         perf record -e probe:ntfs_file_write_iter -aR sleep 1
> > 
> > root@vm-amd64:~# cat /sys/kernel/tracing/kprobe_events
> > p:probe/ntfs_file_write_iter _text+5088544
> > p:probe/ntfs_file_write_iter _text+5278560
> > 
> > > Thought?
> > 
> > This contribution is basically here to sort of mimic what perf does but
> > with PMU kprobes, as this is not possible to write in a sysfs file with
> > this type of probe.
> 
> OK, I see it is for BPF only. Maybe BPF program can filter correct one
> to access the argument etc.

I am not sure I understand, can you please precise?
The eBPF program will be run when the kprobe will be triggered, so if the 
kprobe is armed for the function (e.g. old ntfs_file_write_iter()), the eBPF 
program will never be called.

> 
> Thank you,
> 
> > > Thank you,
> > > 
> > > > The idea here is to search all kernel functions which match this
> > > > symbol
> > > > and
> > > > create a trace_kprobe for each of them.
> > > > All these trace_kprobes are linked together by sharing the same
> > > > trace_probe.
> > > > 
> > > > Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>
> > > > ---
> > > > 
> > > >  kernel/trace/trace_kprobe.c | 86
> > > >  +++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 86 insertions(+)
> > > > 
> > > > diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> > > > index 1b3fa7b854aa..08580f1466c7 100644
> > > > --- a/kernel/trace/trace_kprobe.c
> > > > +++ b/kernel/trace/trace_kprobe.c
> > > > @@ -1682,13 +1682,42 @@ static int unregister_kprobe_event(struct
> > > > trace_kprobe *tk)>
> > > > 
> > > >  }
> > > >  
> > > >  #ifdef CONFIG_PERF_EVENTS
> > > > 
> > > > +
> > > > +struct address_array {
> > > > +	unsigned long *addrs;
> > > > +	size_t size;
> > > > +};
> > > > +
> > > > +static int add_addr(void *data, unsigned long addr)
> > > > +{
> > > > +	struct address_array *array = data;
> > > > +	unsigned long *p;
> > > > +
> > > > +	array->size++;
> > > > +	p = krealloc(array->addrs,
> > > > +				sizeof(*array->addrs) * array->size,
> > > > +				GFP_KERNEL);
> > > > +	if (!p) {
> > > > +		kfree(array->addrs);
> > > > +		return -ENOMEM;
> > > > +	}
> > > > +
> > > > +	array->addrs = p;
> > > > +	array->addrs[array->size - 1] = addr;
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > > 
> > > >  /* create a trace_kprobe, but don't add it to global lists */
> > > >  struct trace_event_call *
> > > >  create_local_trace_kprobe(char *func, void *addr, unsigned long offs,
> > > >  
> > > >  			  bool is_return)
> > > >  
> > > >  {
> > > >  
> > > >  	enum probe_print_type ptype;
> > > > 
> > > > +	struct address_array array;
> > > > 
> > > >  	struct trace_kprobe *tk;
> > > > 
> > > > +	unsigned long func_addr;
> > > > +	unsigned int i;
> > > > 
> > > >  	int ret;
> > > >  	char *event;
> > > > 
> > > > @@ -1722,7 +1751,64 @@ create_local_trace_kprobe(char *func, void
> > > > *addr,
> > > > unsigned long offs,>
> > > > 
> > > >  	if (ret < 0)
> > > >  	
> > > >  		goto error;
> > > > 
> > > > +	array.addrs = NULL;
> > > > +	array.size = 0;
> > > > +	ret = kallsyms_on_each_match_symbol(add_addr, func, &array);
> > > > +	if (ret)
> > > > +		goto error_free;
> > > > +
> > > > +	if (array.size == 1)
> > > > +		goto end;
> > > > +
> > > > +	/*
> > > > +	 * Below loop allocates a trace_kprobe for each function with the
> > > > same
> > > > +	 * name in kernel source code.
> > > > +	 * All this differente trace_kprobes will be linked together through
> > > > +	 * append_trace_kprobe().
> > > > +	 * NOTE append_trace_kprobe() is called in register_trace_kprobe()
> > 
> > which
> > 
> > > > +	 * is called when a kprobe is added through sysfs.
> > > > +	 */
> > > > +	func_addr = kallsyms_lookup_name(func);
> > > > +	for (i = 0; i < array.size; i++) {
> > > > +		struct trace_kprobe *tk_same_name;
> > > > +		unsigned long address;
> > > > +
> > > > +		address = array.addrs[i];
> > > > +		/* Skip the function address as we already registered it. */
> > > > +		if (address == func_addr)
> > > > +			continue;
> > > > +
> > > > +		/*
> > > > +		 * alloc_trace_kprobe() first considers symbol name, so we set
> > > > +		 * this to NULL to allocate this kprobe on the given address.
> > > > +		 */
> > > > +		tk_same_name = alloc_trace_kprobe(KPROBE_EVENT_SYSTEM, event,
> > > > +						  (void *)address, NULL, offs,
> > > > +						  0 /* maxactive */,
> > > > +						  0 /* nargs */, is_return);
> > > > +
> > > > +		if (IS_ERR(tk_same_name)) {
> > > > +			ret = -ENOMEM;
> > > > +			goto error_free;
> > > > +		}
> > > > +
> > > > +		init_trace_event_call(tk_same_name);
> > > > +
> > > > +		if (traceprobe_set_print_fmt(&tk_same_name->tp, ptype) < 0) {
> > > > +			ret = -ENOMEM;
> > > > +			goto error_free;
> > > > +		}
> > > > +
> > > > +		ret = append_trace_kprobe(tk_same_name, tk);
> > > > +		if (ret)
> > > > +			goto error_free;
> > > > +	}
> > > > +
> > > > +end:
> > > > +	kfree(array.addrs);
> > > > 
> > > >  	return trace_probe_event_call(&tk->tp);
> > > > 
> > > > +error_free:
> > > > +	kfree(array.addrs);
> > > > 
> > > >  error:
> > > >  	free_trace_kprobe(tk);
> > > >  	return ERR_PTR(ret);
> > 
> > ---
> > [1]: https://github.com/torvalds/linux/blob/
> > 57012c57536f8814dec92e74197ee96c3498d24e/tools/perf/util/probe-event.c#L29
> > 89- L2993





  reply	other threads:[~2023-08-18 18:12 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20230816163517.112518-1-flaniel@linux.microsoft.com>
2023-08-16 16:35 ` [RFC PATCH v1 1/1] tracing/kprobe: Add multi-probe support for 'perf_kprobe' PMU Francis Laniel
2023-08-16 18:42   ` Steven Rostedt
2023-08-17 10:59     ` Francis Laniel
2023-08-17 15:13       ` Steven Rostedt
2023-08-18  9:01         ` Francis Laniel
2023-08-18 12:37         ` Masami Hiramatsu
2023-08-18 15:41           ` Steven Rostedt
2023-08-18 18:13             ` Francis Laniel
2023-08-18 18:20               ` Steven Rostedt
2023-08-19  1:15                 ` Masami Hiramatsu
2023-08-19 15:22                   ` Song Liu
2023-08-20  9:32                     ` Masami Hiramatsu
2023-08-20 10:02                       ` Song Liu
2023-08-20 13:16                         ` Masami Hiramatsu
2023-08-21  6:09                           ` Song Liu
2023-08-21 10:01                             ` Masami Hiramatsu
2023-08-21 14:45                               ` Steven Rostedt
2023-08-21 18:07                                 ` Kees Cook
2023-08-21 14:29                         ` Steven Rostedt
2023-08-21 15:19                           ` Masami Hiramatsu
2023-08-21 15:28                             ` Steven Rostedt
2023-08-17  7:50   ` Masami Hiramatsu
2023-08-17 11:06     ` Francis Laniel
2023-08-18 13:05       ` Masami Hiramatsu
2023-08-18 18:12         ` Francis Laniel [this message]
2023-08-19  1:11           ` Masami Hiramatsu
2023-08-20 20:23             ` Jiri Olsa
2023-08-21 12:22               ` Francis Laniel
2023-08-20 20:34             ` Jiri Olsa
2023-08-21 12:24               ` Francis Laniel
2023-08-22 13:13                 ` Jiri Olsa
2023-08-21 12:55             ` Francis Laniel
2023-08-23  0:36               ` Masami Hiramatsu
2023-08-23  9:54                 ` Francis Laniel
2023-08-23 13:45                   ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5702263.DvuYhMxLoT@pwmachine \
    --to=flaniel@linux.microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.