Linux Trace Kernel
* Interface for enabling context tracking
@ 2025-04-10 18:51 Junxuan Liao
  2025-04-10 19:05 ` Steven Rostedt
  2025-04-10 19:10 ` Paul E. McKenney
  0 siblings, 2 replies; 9+ messages in thread
From: Junxuan Liao @ 2025-04-10 18:51 UTC
  To: Frederic Weisbecker, Paul E. McKenney
  Cc: linux-kernel, linux-trace-kernel, Steven Rostedt,
	Masami Hiramatsu, Mathieu Desnoyers

Hi all,

From what I can tell, the tracepoints context_tracking:user_enter and
user_exit might be useful for performance analysis, e.g. figuring out how
much time is spent handling page faults or in system calls. However,
context tracking is inactive by default, and the only way to enable it is
to turn on nohz_full for some CPUs.
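
A minimal userspace sketch of that kind of analysis, not taken from any
existing tool: assuming the user_exit/user_enter events of one task have
already been parsed into timestamped records (the struct, enum, and function
names below are hypothetical), the time spent in the kernel is roughly the
sum of the intervals from each user_exit to the next user_enter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds, named after the context_tracking tracepoints. */
enum ct_ev { CT_EV_USER_EXIT, CT_EV_USER_ENTER };

/* One already-parsed trace record; this layout is an assumption. */
struct ct_record {
    uint64_t ts_ns;   /* timestamp in nanoseconds */
    enum ct_ev ev;    /* which tracepoint fired */
};

/*
 * Sum the time spent in the kernel. A user_exit (leaving user mode) opens
 * an interval and the next user_enter closes it. A user_enter with no open
 * interval is skipped; a repeated user_exit restarts the interval.
 * Records are assumed to be sorted by timestamp.
 */
static uint64_t kernel_time_ns(const struct ct_record *recs, size_t n)
{
    uint64_t total = 0, entered_at = 0;
    int in_kernel = 0;

    for (size_t i = 0; i < n; i++) {
        if (recs[i].ev == CT_EV_USER_EXIT) {
            entered_at = recs[i].ts_ns;
            in_kernel = 1;
        } else if (in_kernel) {
            assert(recs[i].ts_ns >= entered_at);
            total += recs[i].ts_ns - entered_at;
            in_kernel = 0;
        }
    }
    return total;
}
```

For example, a user_exit at 100 ns followed by a user_enter at 250 ns would
contribute 150 ns of kernel time.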

Would it be a good idea to support turning context tracking on and off
through some interface accessible from userspace?

Thanks,
Junxuan

* Re: Interface for enabling context tracking
  2025-04-10 18:51 Interface for enabling context tracking Junxuan Liao
@ 2025-04-10 19:05 ` Steven Rostedt
  2025-04-10 20:31   ` Junxuan Liao
  2025-04-17 19:10   ` Junxuan Liao
  2025-04-10 19:10 ` Paul E. McKenney
  1 sibling, 2 replies; 9+ messages in thread
From: Steven Rostedt @ 2025-04-10 19:05 UTC
  To: Junxuan Liao
  Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel,
	linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Thu, 10 Apr 2025 13:51:39 -0500
Junxuan Liao <ljx@cs.wisc.edu> wrote:

> Hi all,
> 
> From what I can tell, the tracepoints context_tracking:user_enter and
> user_exit might be useful for performance analysis, e.g. figuring out how
> much time is spent handling page faults or in system calls. However,
> context tracking is inactive by default, and the only way to enable it is
> to turn on nohz_full for some CPUs.
> 
> Would it be a good idea to support turning context tracking on and off
> through some interface accessible from userspace?
>

I think the best thing to do is to add trace events in all areas that enter
and exit the kernel normally (where noinstr is turned off). There's already
one for page faults on entry. It's been on my todo list to add one for page
fault exit (as I do care about how long they last).

I believe the irq vectors also have entry and exit trace events.

What else is missing?

-- Steve

* Re: Interface for enabling context tracking
  2025-04-10 18:51 Interface for enabling context tracking Junxuan Liao
  2025-04-10 19:05 ` Steven Rostedt
@ 2025-04-10 19:10 ` Paul E. McKenney
  2025-04-10 19:32   ` Steven Rostedt
  1 sibling, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2025-04-10 19:10 UTC
  To: Junxuan Liao
  Cc: Frederic Weisbecker, linux-kernel, linux-trace-kernel,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers

On Thu, Apr 10, 2025 at 01:51:39PM -0500, Junxuan Liao wrote:
> Hi all,
> 
> From what I can tell, the tracepoints context_tracking:user_enter and
> user_exit might be useful for performance analysis, e.g. figuring out how
> much time is spent handling page faults or in system calls. However,
> context tracking is inactive by default, and the only way to enable it is
> to turn on nohz_full for some CPUs.
> 
> Would it be a good idea to support turning context tracking on and off
> through some interface accessible from userspace?

There is some in-kernel support for turning the rcu_nocbs portion
of nohz_full on and off on a per-CPU basis, but a given CPU must be
offline in order to do this transition.  Last I heard, there are still
issues preventing this support from being generalized to cover all of
the nohz_full functionality, and I doubt that it would be exposed to
user level until all of nohz_full is supported.

The rcu_nocbs in-kernel functionality is tested regularly.

Are you interested in working on joining the noble quest of getting the
rest of the nohz_full support in place?  (Full disclosure: This stuff
is non-trivial.)

							Thanx, Paul

* Re: Interface for enabling context tracking
  2025-04-10 19:10 ` Paul E. McKenney
@ 2025-04-10 19:32   ` Steven Rostedt
  2025-04-11 17:41     ` Junxuan Liao
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2025-04-10 19:32 UTC
  To: Paul E. McKenney
  Cc: Junxuan Liao, Frederic Weisbecker, linux-kernel,
	linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Thu, 10 Apr 2025 12:10:19 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> There is some in-kernel support for turning the rcu_nocbs portion
> of nohz_full on and off on a per-CPU basis, but a given CPU must be
> offline in order to do this transition.  Last I heard, there are still
> issues preventing this support from being generalized to cover all of
> the nohz_full functionality, and I doubt that it would be exposed to
> user level until all of nohz_full is supported.
> 
> The rcu_nocbs in-kernel functionality is tested regularly.
> 
> Are you interested in working on joining the noble quest of getting the
> rest of the nohz_full support in place?  (Full disclosure: This stuff
> is non-trivial.)

I believe the request is more of just tracing entry and exit from the
kernel, which just needs a simple trace event at the border crossings.

NOHZ_FULL is to allow the kernel infrastructure to know that a CPU has
transitioned states (no need to do RCU or have a tick on that CPU). That's
a much harder task, as you not only need to know the border crossings, you
also need to make sure nothing happens between the locations you mark and
the point where the crossing actually takes place.

-- Steve

* Re: Interface for enabling context tracking
  2025-04-10 19:05 ` Steven Rostedt
@ 2025-04-10 20:31   ` Junxuan Liao
  2025-04-17 19:10   ` Junxuan Liao
  1 sibling, 0 replies; 9+ messages in thread
From: Junxuan Liao @ 2025-04-10 20:31 UTC
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel,
	linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On 4/10/25 2:05 PM, Steven Rostedt wrote:

> I think the best thing to do is to add trace events in all areas that enter
> and exit the kernel normally (where noinstr is turned off). There's already
> one for page faults on entry. It's been on my todo list to add one for page
> fault exit (as I do care about how long they last).

I agree. That's indeed better.

> I believe the irq vectors also have entry and exit trace events.
> 
> What else is missing?

That's everything as far as I know.

Thanks,
Junxuan

* Re: Interface for enabling context tracking
  2025-04-10 19:32   ` Steven Rostedt
@ 2025-04-11 17:41     ` Junxuan Liao
  2025-04-11 21:51       ` Frederic Weisbecker
  0 siblings, 1 reply; 9+ messages in thread
From: Junxuan Liao @ 2025-04-11 17:41 UTC
  To: Steven Rostedt, Paul E. McKenney
  Cc: Frederic Weisbecker, linux-kernel, linux-trace-kernel,
	Masami Hiramatsu, Mathieu Desnoyers


>> Are you interested in working on joining the noble quest of getting the
>> rest of the nohz_full support in place?  (Full disclosure: This stuff
>> is non-trivial.)
> 
> I believe the request is more of just tracing entry and exit from the
> kernel, which just needs a simple trace event at the border crossings.

Yeah I'm more interested in just tracing this for now.

> It's been on my todo list to add one for page
> fault exit (as I do care about how long they last).

I've added a tracepoint similar to page_fault_user for that but I'm not
sure if it's the best way to do it. Should I send a patch for review?

Thanks,
Junxuan

* Re: Interface for enabling context tracking
  2025-04-11 17:41     ` Junxuan Liao
@ 2025-04-11 21:51       ` Frederic Weisbecker
  0 siblings, 0 replies; 9+ messages in thread
From: Frederic Weisbecker @ 2025-04-11 21:51 UTC
  To: Junxuan Liao
  Cc: Steven Rostedt, Paul E. McKenney, linux-kernel,
	linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Fri, Apr 11, 2025 at 12:41:37PM -0500, Junxuan Liao wrote:
> 
> > > Are you interested in working on joining the noble quest of getting the
> > > rest of the nohz_full support in place?  (Full disclosure: This stuff
> > > is non-trivial.)
> > 
> > I believe the request is more of just tracing entry and exit from the
> > kernel, which just needs a simple trace event at the border crossings.
> 
> Yeah I'm more interested in just tracing this for now.
> 
> > It's been on my todo list to add one for page
> > fault exit (as I do care about how long they last).
> 
> I've added a tracepoint similar to page_fault_user for that but I'm not
> sure if it's the best way to do it. Should I send a patch for review?

If you do so, it may be a good idea to remove page_fault_user and
page_fault_kernel and introduce page_fault_user_enter/page_fault_user_exit
and page_fault_kernel_enter/page_fault_kernel_exit.
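
With paired enter/exit events like these, fault latency could be computed by
matching events per CPU. The following is only a userspace sketch under
stated assumptions: the record layout, the MAX_CPUS bound, and the function
name are hypothetical, and the records are assumed to be already parsed and
sorted by timestamp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 64  /* arbitrary bound chosen for this sketch */

/* Hypothetical parsed record for a page_fault_*_enter or *_exit event. */
struct pf_record {
    uint64_t ts_ns;    /* timestamp in nanoseconds */
    unsigned int cpu;  /* CPU the event fired on */
    int is_exit;       /* 0 = *_enter, 1 = *_exit */
};

/*
 * Return the longest enter-to-exit latency seen on any CPU.
 * Events are matched per CPU; an exit without an open enter is skipped,
 * as are records from CPUs outside the tracked range.
 */
static uint64_t max_fault_latency_ns(const struct pf_record *recs, size_t n)
{
    uint64_t open_ts[MAX_CPUS] = { 0 };
    int is_open[MAX_CPUS] = { 0 };
    uint64_t worst = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned int cpu = recs[i].cpu;

        if (cpu >= MAX_CPUS)
            continue;
        if (!recs[i].is_exit) {
            open_ts[cpu] = recs[i].ts_ns;
            is_open[cpu] = 1;
        } else if (is_open[cpu]) {
            assert(recs[i].ts_ns >= open_ts[cpu]);
            uint64_t delta = recs[i].ts_ns - open_ts[cpu];
            if (delta > worst)
                worst = delta;
            is_open[cpu] = 0;
        }
    }
    return worst;
}
```

Matching per CPU rather than globally keeps faults that overlap in time on
different CPUs from being paired with each other.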

But the following is also possible (in which case
trace/events/context_tracking.h should be renamed to trace/events/entry.h):

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index fc61d0205c97..83b1764078f7 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -15,6 +15,8 @@
 
 #include <asm/entry-common.h>
 
+#include <trace/events/context_tracking.h>
+
 /*
  * Define dummy _TIF work flags if not defined by the architecture or for
  * disabled functionality.
@@ -115,6 +117,7 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs)
 	instrumentation_begin();
 	kmsan_unpoison_entry_regs(regs);
 	trace_hardirqs_off_finish();
+	trace_user_exit(0);
 	instrumentation_end();
 }
 
@@ -357,6 +360,7 @@ static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
 static __always_inline void exit_to_user_mode(void)
 {
 	instrumentation_begin();
+	trace_user_enter(0);
 	trace_hardirqs_on_prepare();
 	lockdep_hardirqs_on_prepare();
 	instrumentation_end();
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index fb5be6e9b423..e9395936bded 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -428,9 +428,6 @@ static __always_inline void ct_kernel_enter(bool user, int offset) { }
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
 
-#define CREATE_TRACE_POINTS
-#include <trace/events/context_tracking.h>
-
 DEFINE_STATIC_KEY_FALSE_RO(context_tracking_key);
 EXPORT_SYMBOL_GPL(context_tracking_key);
 
@@ -486,7 +483,6 @@ void noinstr __ct_user_enter(enum ctx_state state)
 			 */
 			if (state == CT_STATE_USER) {
 				instrumentation_begin();
-				trace_user_enter(0);
 				vtime_user_enter(current);
 				instrumentation_end();
 			}
@@ -623,7 +619,6 @@ void noinstr __ct_user_exit(enum ctx_state state)
 			if (state == CT_STATE_USER) {
 				instrumentation_begin();
 				vtime_user_exit(current);
-				trace_user_exit(0);
 				instrumentation_end();
 			}
 





-- 
Frederic Weisbecker
SUSE Labs

* Re: Interface for enabling context tracking
  2025-04-10 19:05 ` Steven Rostedt
  2025-04-10 20:31   ` Junxuan Liao
@ 2025-04-17 19:10   ` Junxuan Liao
  2025-04-17 20:34     ` Steven Rostedt
  1 sibling, 1 reply; 9+ messages in thread
From: Junxuan Liao @ 2025-04-17 19:10 UTC
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel,
	linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers


On 4/10/25 2:05 PM, Steven Rostedt wrote:
> I think the best thing to do is to add trace events in all areas that enter
> and exit the kernel normally (where noinstr is turned off). There's already
> one for page faults on entry. It's been on my todo list to add one for page
> fault exit (as I do care about how long they last).
> 
> I believe the irq vectors also have entry and exit trace events.
> 
> What else is missing?
> 
> -- Steve

I just found out that the syscall exit tracepoints don't always fire
right before the exit to userspace. The kernel can still spend quite
some time in task_work_run after the tracepoints are triggered.
Has that bothered you before?
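
To put a number on that gap, one rough approach is to measure the time
between each syscall exit event and the following return to user mode. This
is a sketch only: it assumes a user_enter event is available on every return
to userspace (which, per this thread, is not the case by default), and the
enum, struct, and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds for one task's already-parsed trace. */
enum ev_kind { EV_SYS_EXIT, EV_USER_ENTER, EV_OTHER };

struct ev_record {
    uint64_t ts_ns;     /* timestamp in nanoseconds */
    enum ev_kind kind;  /* which event this record describes */
};

/*
 * Total time between each syscall exit event and the user_enter that
 * follows it, i.e. kernel work done after the syscall was reported as
 * finished. Records are assumed to be sorted by timestamp.
 */
static uint64_t post_sys_exit_ns(const struct ev_record *recs, size_t n)
{
    uint64_t total = 0, exit_ts = 0;
    int pending = 0;

    for (size_t i = 0; i < n; i++) {
        switch (recs[i].kind) {
        case EV_SYS_EXIT:
            exit_ts = recs[i].ts_ns;
            pending = 1;
            break;
        case EV_USER_ENTER:
            if (pending) {
                assert(recs[i].ts_ns >= exit_ts);
                total += recs[i].ts_ns - exit_ts;
                pending = 0;
            }
            break;
        default:
            break;  /* unrelated events do not close the window */
        }
    }
    return total;
}
```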

--
Junxuan

* Re: Interface for enabling context tracking
  2025-04-17 19:10   ` Junxuan Liao
@ 2025-04-17 20:34     ` Steven Rostedt
  0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2025-04-17 20:34 UTC
  To: Junxuan Liao
  Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel,
	linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Thu, 17 Apr 2025 14:10:53 -0500
Junxuan Liao <ljx@cs.wisc.edu> wrote:

> I just found out that the syscall exit tracepoints don't always fire
> right before the exit to userspace. The kernel can still spend quite
> some time in task_work_run after the tracepoints are triggered.
> Has that bothered you before?

It's been a while, but what I usually do when I want to see entry into the
kernel is to also run:

  trace-cmd set -p function_graph --max-graph-depth 1

That traces only the first function call into the kernel. It obviously now
misses entry and exit from user mode due to noinstr, but if a task_work
function is called, it will usually catch that too.

-- Steve

Thread overview: 9+ messages
2025-04-10 18:51 Interface for enabling context tracking Junxuan Liao
2025-04-10 19:05 ` Steven Rostedt
2025-04-10 20:31   ` Junxuan Liao
2025-04-17 19:10   ` Junxuan Liao
2025-04-17 20:34     ` Steven Rostedt
2025-04-10 19:10 ` Paul E. McKenney
2025-04-10 19:32   ` Steven Rostedt
2025-04-11 17:41     ` Junxuan Liao
2025-04-11 21:51       ` Frederic Weisbecker
