BPF List
 help / color / mirror / Atom feed
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Jiri Olsa <olsajiri@gmail.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Florent Revest <revest@chromium.org>,
	linux-trace-kernel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Alan Maguire <alan.maguire@oracle.com>,
	Mark Rutland <mark.rutland@arm.com>
Subject: Re: [PATCH] fgraph: Move trace_clock_local() for return time to function_graph tracer
Date: Tue, 14 Jan 2025 10:18:06 +0900	[thread overview]
Message-ID: <20250114101806.b2778cb01f34f5be9d23ad98@kernel.org> (raw)
In-Reply-To: <20250113195449.72ab5d81@gandalf.local.home>

On Mon, 13 Jan 2025 19:54:49 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Sun, 12 Jan 2025 14:26:35 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > Since the ftrace_graph_ret::rettime is only referred in the function_graph
> > tracer, the trace_clock_local() call in fgraph is just an overhead for
> > other fgraph users.
> > 
> > Move the trace_clock_local() for recording return time to function_graph
> > tracer and the rettime field is just zeroed in the fgraph side.
> > That rettime field is updated by one of the function_graph tracer and
> > cached for other function_graph tracer instances.
> > 
> > According to Jiri's report[1], removing this function will gain fprobe
> > performance ~27%.
> > 
> > [1] https://lore.kernel.org/all/Z3aSuql3fnXMVMoM@krava/
> > 
> > Reported-by: Jiri Olsa <olsajiri@gmail.com>
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> I'd rather just nuke the calltime and rettime from struct ftrace_graph_ret.

Yeah, this one is good to me. But this will make a slightly different time
for the same function when we enable multiple function graph tracer on
different instances (mine records it once and reuse it but this will record
on each instance). We may need to notice it on the document. :)

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thank you,

> 
> How about something like this instead?
> 
> [ based on: https://lore.kernel.org/linux-trace-kernel/20250113183124.61767419@gandalf.local.home/ ]
> 
> I can start testing it (it compiles and boots).
> 
>  include/linux/ftrace.h               |  2 --
>  kernel/trace/fgraph.c                |  1 -
>  kernel/trace/trace.h                 |  4 +++-
>  kernel/trace/trace_entries.h         |  8 +++----
>  kernel/trace/trace_functions_graph.c | 33 ++++++++++++++++------------
>  kernel/trace/trace_irqsoff.c         |  5 +++--
>  kernel/trace/trace_sched_wakeup.c    |  6 +++--
>  7 files changed, 33 insertions(+), 26 deletions(-)
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index aa9ddd1e4bb6..848ddf0d8f89 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -1061,8 +1061,6 @@ struct ftrace_graph_ret {
>  	int depth;
>  	/* Number of functions that overran the depth limit for current task */
>  	unsigned int overrun;
> -	unsigned long long calltime;
> -	unsigned long long rettime;
>  } __packed;
>  
>  struct fgraph_ops;
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 30e3ddc8a8a8..a7e6c8488b7f 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -818,7 +818,6 @@ static unsigned long __ftrace_return_to_handler(struct fgraph_ret_regs *ret_regs
>  		return (unsigned long)panic;
>  	}
>  
> -	trace.rettime = trace_clock_local();
>  #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
>  	trace.retval = fgraph_ret_regs_return_value(ret_regs);
>  #endif
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 9691b47b5f3d..467a8143fbb9 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -911,7 +911,9 @@ extern int __trace_graph_retaddr_entry(struct trace_array *tr,
>  				unsigned long retaddr);
>  extern void __trace_graph_return(struct trace_array *tr,
>  				 struct ftrace_graph_ret *trace,
> -				 unsigned int trace_ctx);
> +				 unsigned int trace_ctx,
> +				 u64 calltime, u64 rettime);
> +
>  extern void init_array_fgraph_ops(struct trace_array *tr, struct ftrace_ops *ops);
>  extern int allocate_fgraph_ops(struct trace_array *tr, struct ftrace_ops *ops);
>  extern void free_fgraph_ops(struct trace_array *tr);
> diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
> index 82fd174ebbe0..fbfb396905a6 100644
> --- a/kernel/trace/trace_entries.h
> +++ b/kernel/trace/trace_entries.h
> @@ -124,8 +124,8 @@ FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry,
>  		__field_packed(	unsigned long,	ret,		retval	)
>  		__field_packed(	int,		ret,		depth	)
>  		__field_packed(	unsigned int,	ret,		overrun	)
> -		__field_packed(	unsigned long long, ret,	calltime)
> -		__field_packed(	unsigned long long, ret,	rettime	)
> +		__field(unsigned long long,	calltime		)
> +		__field(unsigned long long,	rettime			)
>  	),
>  
>  	F_printk("<-- %ps (%d) (start: %llx  end: %llx) over: %d retval: %lx",
> @@ -146,8 +146,8 @@ FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry,
>  		__field_packed(	unsigned long,	ret,		func	)
>  		__field_packed(	int,		ret,		depth	)
>  		__field_packed(	unsigned int,	ret,		overrun	)
> -		__field_packed(	unsigned long long, ret,	calltime)
> -		__field_packed(	unsigned long long, ret,	rettime	)
> +		__field(unsigned long long,	calltime		)
> +		__field(unsigned long long,	rettime			)
>  	),
>  
>  	F_printk("<-- %ps (%d) (start: %llx  end: %llx) over: %d",
> diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
> index 5504b5e4e7b4..7bff133c1543 100644
> --- a/kernel/trace/trace_functions_graph.c
> +++ b/kernel/trace/trace_functions_graph.c
> @@ -270,12 +270,10 @@ __trace_graph_function(struct trace_array *tr,
>  	struct ftrace_graph_ret ret = {
>  		.func     = ip,
>  		.depth    = 0,
> -		.calltime = time,
> -		.rettime  = time,
>  	};
>  
>  	__trace_graph_entry(tr, &ent, trace_ctx);
> -	__trace_graph_return(tr, &ret, trace_ctx);
> +	__trace_graph_return(tr, &ret, trace_ctx, time, time);
>  }
>  
>  void
> @@ -287,8 +285,9 @@ trace_graph_function(struct trace_array *tr,
>  }
>  
>  void __trace_graph_return(struct trace_array *tr,
> -				struct ftrace_graph_ret *trace,
> -				unsigned int trace_ctx)
> +			  struct ftrace_graph_ret *trace,
> +			  unsigned int trace_ctx,
> +			  u64 calltime, u64 rettime)
>  {
>  	struct ring_buffer_event *event;
>  	struct trace_buffer *buffer = tr->array_buffer.buffer;
> @@ -300,6 +299,8 @@ void __trace_graph_return(struct trace_array *tr,
>  		return;
>  	entry	= ring_buffer_event_data(event);
>  	entry->ret				= *trace;
> +	entry->calltime				= calltime;
> +	entry->rettime				= rettime;
>  	trace_buffer_unlock_commit_nostack(buffer, event);
>  }
>  
> @@ -322,10 +323,13 @@ void trace_graph_return(struct ftrace_graph_ret *trace,
>  	struct fgraph_times *ftimes;
>  	unsigned long flags;
>  	unsigned int trace_ctx;
> +	u64 calltime, rettime;
>  	long disabled;
>  	int size;
>  	int cpu;
>  
> +	rettime = trace_clock_local();
> +
>  	ftrace_graph_addr_finish(gops, trace);
>  
>  	if (*task_var & TRACE_GRAPH_NOTRACE) {
> @@ -339,7 +343,7 @@ void trace_graph_return(struct ftrace_graph_ret *trace,
>  
>  	handle_nosleeptime(trace, ftimes, size);
>  
> -	trace->calltime = ftimes->calltime;
> +	calltime = ftimes->calltime;
>  
>  	local_irq_save(flags);
>  	cpu = raw_smp_processor_id();
> @@ -347,7 +351,7 @@ void trace_graph_return(struct ftrace_graph_ret *trace,
>  	disabled = atomic_inc_return(&data->disabled);
>  	if (likely(disabled == 1)) {
>  		trace_ctx = tracing_gen_ctx_flags(flags);
> -		__trace_graph_return(tr, trace, trace_ctx);
> +		__trace_graph_return(tr, trace, trace_ctx, calltime, rettime);
>  	}
>  	atomic_dec(&data->disabled);
>  	local_irq_restore(flags);
> @@ -372,10 +376,8 @@ static void trace_graph_thresh_return(struct ftrace_graph_ret *trace,
>  
>  	handle_nosleeptime(trace, ftimes, size);
>  
> -	trace->calltime = ftimes->calltime;
> -
>  	if (tracing_thresh &&
> -	    (trace->rettime - ftimes->calltime < tracing_thresh))
> +	    (trace_clock_local() - ftimes->calltime < tracing_thresh))
>  		return;
>  	else
>  		trace_graph_return(trace, gops);
> @@ -861,7 +863,7 @@ print_graph_entry_leaf(struct trace_iterator *iter,
>  
>  	graph_ret = &ret_entry->ret;
>  	call = &entry->graph_ent;
> -	duration = graph_ret->rettime - graph_ret->calltime;
> +	duration = ret_entry->rettime - ret_entry->calltime;
>  
>  	func = call->func + iter->tr->text_delta;
>  
> @@ -1142,11 +1144,14 @@ print_graph_entry(struct ftrace_graph_ent_entry *field, struct trace_seq *s,
>  }
>  
>  static enum print_line_t
> -print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s,
> +print_graph_return(struct ftrace_graph_ret_entry *retentry, struct trace_seq *s,
>  		   struct trace_entry *ent, struct trace_iterator *iter,
>  		   u32 flags)
>  {
> -	unsigned long long duration = trace->rettime - trace->calltime;
> +	struct ftrace_graph_ret *trace = &retentry->ret;
> +	u64 calltime = retentry->calltime;
> +	u64 rettime = retentry->rettime;
> +	unsigned long long duration = rettime - calltime;
>  	struct fgraph_data *data = iter->private;
>  	struct trace_array *tr = iter->tr;
>  	unsigned long func;
> @@ -1347,7 +1352,7 @@ print_graph_function_flags(struct trace_iterator *iter, u32 flags)
>  	case TRACE_GRAPH_RET: {
>  		struct ftrace_graph_ret_entry *field;
>  		trace_assign_type(field, entry);
> -		return print_graph_return(&field->ret, s, entry, iter, flags);
> +		return print_graph_return(field, s, entry, iter, flags);
>  	}
>  	case TRACE_STACK:
>  	case TRACE_FN:
> diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
> index a4e799c1e767..1d3646ce9c34 100644
> --- a/kernel/trace/trace_irqsoff.c
> +++ b/kernel/trace/trace_irqsoff.c
> @@ -221,6 +221,7 @@ static void irqsoff_graph_return(struct ftrace_graph_ret *trace,
>  	unsigned long flags;
>  	unsigned int trace_ctx;
>  	u64 *calltime;
> +	u64 rettime;
>  	int size;
>  
>  	ftrace_graph_addr_finish(gops, trace);
> @@ -228,13 +229,13 @@ static void irqsoff_graph_return(struct ftrace_graph_ret *trace,
>  	if (!func_prolog_dec(tr, &data, &flags))
>  		return;
>  
> +	rettime = trace_clock_local();
>  	calltime = fgraph_retrieve_data(gops->idx, &size);
>  	if (!calltime)
>  		return;
> -	trace->calltime = *calltime;
>  
>  	trace_ctx = tracing_gen_ctx_flags(flags);
> -	__trace_graph_return(tr, trace, trace_ctx);
> +	__trace_graph_return(tr, trace, trace_ctx, *calltime, rettime);
>  	atomic_dec(&data->disabled);
>  }
>  
> diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
> index c58292e424d5..c4bbaae2a2a3 100644
> --- a/kernel/trace/trace_sched_wakeup.c
> +++ b/kernel/trace/trace_sched_wakeup.c
> @@ -156,6 +156,7 @@ static void wakeup_graph_return(struct ftrace_graph_ret *trace,
>  	struct trace_array_cpu *data;
>  	unsigned int trace_ctx;
>  	u64 *calltime;
> +	u64 rettime;
>  	int size;
>  
>  	ftrace_graph_addr_finish(gops, trace);
> @@ -163,12 +164,13 @@ static void wakeup_graph_return(struct ftrace_graph_ret *trace,
>  	if (!func_prolog_preempt_disable(tr, &data, &trace_ctx))
>  		return;
>  
> +	rettime = trace_clock_local();
> +
>  	calltime = fgraph_retrieve_data(gops->idx, &size);
>  	if (!calltime)
>  		return;
> -	trace->calltime = *calltime;
>  
> -	__trace_graph_return(tr, trace, trace_ctx);
> +	__trace_graph_return(tr, trace, trace_ctx, *calltime, rettime);
>  	atomic_dec(&data->disabled);
>  
>  	preempt_enable_notrace();
> -- 
> 2.45.2
> 
> -- Steve


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

  reply	other threads:[~2025-01-14  1:18 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-26  5:11 [PATCH v22 00/20] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph Masami Hiramatsu (Google)
2024-12-26  5:11 ` [PATCH v22 01/20] fgraph: Get ftrace recursion lock in function_graph_enter Masami Hiramatsu (Google)
2024-12-26  5:11 ` [PATCH v22 02/20] fgraph: Pass ftrace_regs to entryfunc Masami Hiramatsu (Google)
2024-12-26  5:11 ` [PATCH v22 03/20] fgraph: Replace fgraph_ret_regs with ftrace_regs Masami Hiramatsu (Google)
2024-12-26  5:12 ` [PATCH v22 04/20] fgraph: Pass ftrace_regs to retfunc Masami Hiramatsu (Google)
2024-12-26  5:12 ` [PATCH v22 05/20] fprobe: Use ftrace_regs in fprobe entry handler Masami Hiramatsu (Google)
2024-12-26  5:12 ` [PATCH v22 06/20] fprobe: Use ftrace_regs in fprobe exit handler Masami Hiramatsu (Google)
2024-12-26  5:12 ` [PATCH v22 07/20] tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs Masami Hiramatsu (Google)
2024-12-26  5:12 ` [PATCH v22 08/20] tracing: Add ftrace_fill_perf_regs() for perf event Masami Hiramatsu (Google)
2024-12-26  5:13 ` [PATCH v22 09/20] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS Masami Hiramatsu (Google)
2024-12-26  5:13 ` [PATCH v22 10/20] bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled Masami Hiramatsu (Google)
2024-12-26  5:13 ` [PATCH v22 11/20] ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC Masami Hiramatsu (Google)
2024-12-26  5:13 ` [PATCH v22 12/20] s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC Masami Hiramatsu (Google)
2024-12-26  5:13 ` [PATCH v22 13/20] fprobe: Rewrite fprobe on function-graph tracer Masami Hiramatsu (Google)
2024-12-26  5:14 ` [PATCH v22 14/20] fprobe: Add fprobe_header encoding feature Masami Hiramatsu (Google)
2024-12-26  5:14 ` [PATCH v22 15/20] tracing/fprobe: Remove nr_maxactive from fprobe Masami Hiramatsu (Google)
2024-12-26  5:14 ` [PATCH v22 16/20] selftests: ftrace: Remove obsolate maxactive syntax check Masami Hiramatsu (Google)
2024-12-26  5:14 ` [PATCH v22 17/20] selftests/ftrace: Add a test case for repeating register/unregister fprobe Masami Hiramatsu (Google)
2024-12-26  5:15 ` [PATCH v22 18/20] Documentation: probes: Update fprobe on function-graph tracer Masami Hiramatsu (Google)
2024-12-26  5:15 ` [PATCH v22 19/20] ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr Masami Hiramatsu (Google)
2025-02-03 21:33   ` Gabriel de Perthuis
2025-02-04  9:19     ` Masami Hiramatsu
2024-12-26  5:15 ` [PATCH v22 20/20] bpf: Use ftrace_get_symaddr() for kprobe_multi probes Masami Hiramatsu (Google)
2024-12-27  2:23   ` Steven Rostedt
2024-12-27  2:24     ` Steven Rostedt
2024-12-27 15:24       ` Steven Rostedt
2024-12-31 15:48         ` Masami Hiramatsu
2024-12-31 16:00         ` [PATCH v23] " Masami Hiramatsu (Google)
2025-01-02 13:20 ` [PATCH v22 00/20] tracing: fprobe: function_graph: Multi-function graph and fprobe on fgraph Jiri Olsa
2025-01-11  0:04   ` Andrii Nakryiko
2025-01-14 15:12     ` Jiri Olsa
2025-01-14 19:04       ` Andrii Nakryiko
2025-01-12  4:45   ` Masami Hiramatsu
2025-01-12  5:26   ` [PATCH] fgraph: Move trace_clock_local() for return time to function_graph tracer Masami Hiramatsu (Google)
2025-01-14  0:54     ` Steven Rostedt
2025-01-14  1:18       ` Masami Hiramatsu [this message]
2025-01-14 15:05         ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250114101806.b2778cb01f34f5be9d23ad98@kernel.org \
    --to=mhiramat@kernel.org \
    --cc=alan.maguire@oracle.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=martin.lau@linux.dev \
    --cc=olsajiri@gmail.com \
    --cc=revest@chromium.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox