public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Jiri Olsa <olsajiri@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Florent Revest <revest@chromium.org>,
	linux-trace-kernel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	bpf <bpf@vger.kernel.org>, Sven Schnelle <svens@linux.ibm.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Alan Maguire <alan.maguire@oracle.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>, Guo Ren <guoren@kernel.org>
Subject: Re: [PATCH v5 28/34] fprobe: Rewrite fprobe on function-graph tracer
Date: Wed, 20 Dec 2023 10:00:47 +0900	[thread overview]
Message-ID: <20231220100047.e33d862cb869423c2a3a82bf@kernel.org> (raw)
In-Reply-To: <ZYGrB7NsDEWk2liL@krava>

On Tue, 19 Dec 2023 15:39:03 +0100
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Mon, Dec 18, 2023 at 10:17:10PM +0900, Masami Hiramatsu (Google) wrote:
> 
> SNIP
> 
> > -static void fprobe_exit_handler(struct rethook_node *rh, void *data,
> > -				unsigned long ret_ip, struct pt_regs *regs)
> > +static int fprobe_entry(unsigned long func, unsigned long ret_ip,
> > +			struct ftrace_regs *fregs, struct fgraph_ops *gops)
> >  {
> > -	struct fprobe *fp = (struct fprobe *)data;
> > -	struct fprobe_rethook_node *fpr;
> > -	struct ftrace_regs *fregs = (struct ftrace_regs *)regs;
> > -	int bit;
> > +	struct fprobe_hlist_node *node, *first;
> > +	unsigned long *fgraph_data = NULL;
> > +	unsigned long header;
> > +	int reserved_words;
> > +	struct fprobe *fp;
> > +	int used, ret;
> >  
> > -	if (!fp || fprobe_disabled(fp))
> > -		return;
> > +	if (WARN_ON_ONCE(!fregs))
> > +		return 0;
> >  
> > -	fpr = container_of(rh, struct fprobe_rethook_node, node);
> > +	first = node = find_first_fprobe_node(func);
> > +	if (unlikely(!first))
> > +		return 0;
> > +
> > +	reserved_words = 0;
> > +	hlist_for_each_entry_from_rcu(node, hlist) {
> > +		if (node->addr != func)
> > +			break;
> > +		fp = READ_ONCE(node->fp);
> > +		if (!fp || !fp->exit_handler)
> > +			continue;
> > +		/*
> > +		 * Since fprobe can be enabled until the next loop, we ignore the
> > +		 * fprobe's disabled flag in this loop.
> > +		 */
> > +		reserved_words +=
> > +			DIV_ROUND_UP(fp->entry_data_size, sizeof(long)) + 1;
> > +	}
> > +	node = first;
> > +	if (reserved_words) {
> > +		fgraph_data = fgraph_reserve_data(gops->idx, reserved_words * sizeof(long));
> > +		if (unlikely(!fgraph_data)) {
> > +			hlist_for_each_entry_from_rcu(node, hlist) {
> > +				if (node->addr != func)
> > +					break;
> > +				fp = READ_ONCE(node->fp);
> > +				if (fp && !fprobe_disabled(fp))
> > +					fp->nmissed++;
> > +			}
> > +			return 0;
> > +		}
> > +	}
> 
> this looks expensive compared to what we do now.. IIUC due to the graph
> ops limitations (16 ctive ops), you have just single graph ops for fprobe
> and each fprobe registration stores ips into hash which you need to search
> in here to get registered callbacks..?

I think this is not so expensive. Most cases, it only hits 1 fprobe on the
hash. And if the fprobe is only used to hook the entry, reserved_words == 0.

> I wonder would it make sense to allow arbitrary number of active graph_ops
> with the price some callback might fail because there's no stack space so
> each fprobe instance would have its own graph_ops.. and we would get rid
> of the code above (and below) ?

Yeah, actually my first implementation is that. But I realized that doesn't
work, this requires intermediate object which has refcounter because the
"marker" on the shadow stack will be left after unregistering it. We need to
identify which is still available and which is not. And for that purpose,
we may need to introduce similar structure in the fgraph too.

The current multi-fgraph does;

 - if CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=n (called from dedicated mcount
   asm code), it has to loop on all fgraph_ops and check the hash, which is
   inefficient but it can easily push the return trace entry on the shadow
   stack.

 - if CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y (called from ftrace asm code),
   it does not need to loop (that will be done by ftrace) but each handler
   does NOT know who pushed the return trace entry on the shadow stack.
   Thus it has to decode the shadow stack and check it needs to push return
   trace entry or not. And this is hard if the traced function is self-
   recursive call or tail call. To check the recursive call, I introduced
   a bitmap entry on the shadow stack. This bitmap size limits the max
   number of fgraph.

So, unlimit the number of fgraph, we may need to stack the number of fgraph
on the stack and each fgraph callback has to unwind the shadow stack to check
whether their own number is there instead of checking the bit in the bitmap.
That will be more trusted way but maybe slow.

Another option is introducing a pair of pre- and post-callbacks which is called
before and after calling the list/direct call of ftrace_ops. And pre-callback
will push the ret_stack on shadow stack and post-callback will commit or cancel it.
(but this one is hard to design... maybe becomes ugly interface.)

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

  reply	other threads:[~2023-12-20  1:00 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-18 13:11 [PATCH v5 00/34] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph Masami Hiramatsu (Google)
2023-12-18 13:11 ` [PATCH v5 01/34] tracing: Add a comment about ftrace_regs definition Masami Hiramatsu (Google)
2024-01-05 17:12   ` Mark Rutland
2023-12-18 13:11 ` [PATCH v5 02/34] x86: tracing: Add ftrace_regs definition in the header Masami Hiramatsu (Google)
2023-12-18 13:12 ` [PATCH v5 03/34] function_graph: Convert ret_stack to a series of longs Masami Hiramatsu (Google)
2023-12-18 13:12 ` [PATCH v5 04/34] fgraph: Use BUILD_BUG_ON() to make sure we have structures divisible by long Masami Hiramatsu (Google)
2023-12-18 13:12 ` [PATCH v5 05/34] function_graph: Add an array structure that will allow multiple callbacks Masami Hiramatsu (Google)
2023-12-18 13:12 ` [PATCH v5 06/34] function_graph: Allow multiple users to attach to function graph Masami Hiramatsu (Google)
2023-12-19 13:23   ` Jiri Olsa
2023-12-19 15:45     ` Masami Hiramatsu
2023-12-26 15:24       ` Masami Hiramatsu
2023-12-18 13:12 ` [PATCH v5 07/34] function_graph: Remove logic around ftrace_graph_entry and return Masami Hiramatsu (Google)
2023-12-18 13:13 ` [PATCH v5 08/34] ftrace/function_graph: Pass fgraph_ops to function graph callbacks Masami Hiramatsu (Google)
2023-12-18 13:13 ` [PATCH v5 09/34] ftrace: Allow function_graph tracer to be enabled in instances Masami Hiramatsu (Google)
2023-12-18 13:13 ` [PATCH v5 10/34] ftrace: Allow ftrace startup flags exist without dynamic ftrace Masami Hiramatsu (Google)
2023-12-18 13:13 ` [PATCH v5 11/34] function_graph: Have the instances use their own ftrace_ops for filtering Masami Hiramatsu (Google)
2023-12-26  0:20   ` Masami Hiramatsu
2024-01-05 17:09   ` Mark Rutland
2024-01-08  1:14     ` Masami Hiramatsu
2024-01-08 12:25       ` Mark Rutland
2024-01-08 14:21         ` Mark Rutland
2024-01-08 15:03           ` Mark Rutland
2024-01-11 13:47             ` Masami Hiramatsu
2024-01-11  2:15         ` Masami Hiramatsu
2024-01-11 11:01           ` Mark Rutland
2024-01-11 13:45             ` Masami Hiramatsu
2023-12-18 13:13 ` [PATCH v5 12/34] function_graph: Use a simple LRU for fgraph_array index number Masami Hiramatsu (Google)
2023-12-18 13:14 ` [PATCH v5 13/34] function_graph: Add "task variables" per task for fgraph_ops Masami Hiramatsu (Google)
2023-12-18 13:14 ` [PATCH v5 14/34] function_graph: Move set_graph_function tests to shadow stack global var Masami Hiramatsu (Google)
2023-12-18 13:14 ` [PATCH v5 15/34] function_graph: Move graph depth stored data " Masami Hiramatsu (Google)
2023-12-18 13:14 ` [PATCH v5 16/34] function_graph: Move graph notrace bit " Masami Hiramatsu (Google)
2023-12-18 13:15 ` [PATCH v5 17/34] function_graph: Implement fgraph_reserve_data() and fgraph_retrieve_data() Masami Hiramatsu (Google)
2023-12-18 13:15 ` [PATCH v5 18/34] function_graph: Add selftest for passing local variables Masami Hiramatsu (Google)
2023-12-18 13:15 ` [PATCH v5 19/34] function_graph: Add a new entry handler with parent_ip and ftrace_regs Masami Hiramatsu (Google)
2023-12-18 13:15 ` [PATCH v5 20/34] function_graph: Add a new exit " Masami Hiramatsu (Google)
2023-12-18 13:15 ` [PATCH v5 21/34] x86/ftrace: Enable HAVE_FUNCTION_GRAPH_FREGS Masami Hiramatsu (Google)
2023-12-18 13:15 ` [PATCH v5 22/34] tracing: Rename ftrace_regs_return_value to ftrace_regs_get_return_value Masami Hiramatsu (Google)
2024-01-05 17:14   ` Mark Rutland
2024-01-08  1:09     ` Masami Hiramatsu
2023-12-18 13:16 ` [PATCH v5 23/34] arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_FREGS Masami Hiramatsu (Google)
2023-12-18 13:16 ` [PATCH v5 24/34] fprobe: Use ftrace_regs in fprobe entry handler Masami Hiramatsu (Google)
2023-12-19 13:23   ` Jiri Olsa
2023-12-19 13:23   ` Jiri Olsa
2023-12-19 22:51     ` Masami Hiramatsu
2023-12-18 13:16 ` [PATCH v5 25/34] fprobe: Use ftrace_regs in fprobe exit handler Masami Hiramatsu (Google)
2023-12-18 13:16 ` [PATCH v5 26/34] tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs Masami Hiramatsu (Google)
2023-12-18 13:16 ` [PATCH v5 27/34] tracing: Add ftrace_fill_perf_regs() for perf event Masami Hiramatsu (Google)
2023-12-18 13:17 ` [PATCH v5 28/34] fprobe: Rewrite fprobe on function-graph tracer Masami Hiramatsu (Google)
2023-12-19 14:39   ` Jiri Olsa
2023-12-20  1:00     ` Masami Hiramatsu [this message]
2023-12-18 13:17 ` [PATCH v5 29/34] tracing/fprobe: Remove nr_maxactive from fprobe Masami Hiramatsu (Google)
2023-12-18 13:17 ` [PATCH v5 30/34] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS Masami Hiramatsu (Google)
2023-12-18 13:17 ` [PATCH v5 31/34] bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled Masami Hiramatsu (Google)
2023-12-18 13:17 ` [PATCH v5 32/34] selftests: ftrace: Remove obsolate maxactive syntax check Masami Hiramatsu (Google)
2023-12-18 13:18 ` [PATCH v5 33/34] selftests/ftrace: Add a test case for repeating register/unregister fprobe Masami Hiramatsu (Google)
2023-12-18 13:18 ` [PATCH v5 34/34] Documentation: probes: Update fprobe on function-graph tracer Masami Hiramatsu (Google)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231220100047.e33d862cb869423c2a3a82bf@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=acme@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=guoren@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=martin.lau@linux.dev \
    --cc=olsajiri@gmail.com \
    --cc=peterz@infradead.org \
    --cc=revest@chromium.org \
    --cc=rostedt@goodmis.org \
    --cc=svens@linux.ibm.com \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox