* Interface for enabling context tracking
From: Junxuan Liao @ 2025-04-10 18:51 UTC
To: Frederic Weisbecker, Paul E. McKenney
Cc: linux-kernel, linux-trace-kernel, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers

Hi all,

From what I can tell, tracepoints context_tracking:user_enter and
user_exit might be useful for performance analysis. e.g. Figuring out how
much time is spent handling page faults or in system calls. However
context tracking is by default inactive and the only way to enable it is
to turn on nohz_full for some CPUs.

Is it a good idea to support turning on and off context tracking through
some interface accessible from the userspace?

Thanks,
Junxuan

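As a rough illustration of the analysis described in this message, the sketch below sums time spent in the kernel from a stream of user_exit (kernel entry) and user_enter (return to user mode) timestamps. The `ct_sample` record and `kernel_time_ns()` helper are hypothetical names introduced here, not kernel or trace-cmd APIs, and the sketch assumes the events were already parsed, sorted by time, and belong to a single task.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one parsed trace event (not a kernel structure). */
enum ct_event { CT_USER_EXIT, CT_USER_ENTER };

struct ct_sample {
	uint64_t ts_ns;		/* event timestamp in nanoseconds */
	enum ct_event ev;	/* which tracepoint fired */
};

/*
 * Sum the time spent in the kernel: each user_exit (kernel entry) is
 * paired with the next user_enter (return to user mode). A user_enter
 * with no preceding user_exit is ignored.
 * Assumption: samples are time-ordered and come from one task.
 */
uint64_t kernel_time_ns(const struct ct_sample *s, size_t n)
{
	uint64_t total = 0, entered = 0;
	int in_kernel = 0;

	assert(s != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (s[i].ev == CT_USER_EXIT) {
			entered = s[i].ts_ns;
			in_kernel = 1;
		} else if (in_kernel) {
			total += s[i].ts_ns - entered;
			in_kernel = 0;
		}
	}
	return total;
}
```

Note the naming: user_exit marks leaving user mode, so it opens a kernel interval, and user_enter closes it.
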
* Re: Interface for enabling context tracking
From: Steven Rostedt @ 2025-04-10 19:05 UTC
To: Junxuan Liao
Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Thu, 10 Apr 2025 13:51:39 -0500
Junxuan Liao <ljx@cs.wisc.edu> wrote:

> Hi all,
>
> From what I can tell, tracepoints context_tracking:user_enter and
> user_exit might be useful for performance analysis. e.g. Figuring out how
> much time is spent handling page faults or in system calls. However
> context tracking is by default inactive and the only way to enable it is
> to turn on nohz_full for some CPUs.
>
> Is it a good idea to support turning on and off context tracking through
> some interface accessible from the userspace?
>

I think the best thing to do is to add trace events in all areas that enter
and exit the kernel normally (where noinstr is turned off). There's already
one for page faults on entry. It's been on my todo list to add one for page
fault exit (as I do care for how long they last).

I believe the irq vectors also have entry and exit trace events.

What else is missing?

-- Steve

* Re: Interface for enabling context tracking
From: Junxuan Liao @ 2025-04-10 20:31 UTC
To: Steven Rostedt
Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On 4/10/25 2:05 PM, Steven Rostedt wrote:

> I think the best thing to do is to add trace events in all areas that enter
> and exit the kernel normally (where noinstr is turned off). There's already
> one for page faults on entry. It's been on my todo list to add one for page
> fault exit (as I do care for how long they last).

I agree. That's indeed better.

> I believe the irq vectors also have entry and exit trace events.
>
> What else is missing?

That's everything as far as I know.

Thanks,
Junxuan

* Re: Interface for enabling context tracking
From: Junxuan Liao @ 2025-04-17 19:10 UTC
To: Steven Rostedt
Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On 4/10/25 2:05 PM, Steven Rostedt wrote:

> I think the best thing to do is to add trace events in all areas that enter
> and exit the kernel normally (where noinstr is turned off). There's already
> one for page faults on entry. It's been on my todo list to add one for page
> fault exit (as I do care for how long they last).
>
> I believe the irq vectors also have entry and exit trace events.
>
> What else is missing?
>
> -- Steve

Just found out that the exit tracepoints for syscalls don't always
immediately precede the exit to userspace. The kernel can still spend
quite some time in task_work_run after the tracepoints are triggered.
Has that bothered you before?

--
Junxuan

* Re: Interface for enabling context tracking
From: Steven Rostedt @ 2025-04-17 20:34 UTC
To: Junxuan Liao
Cc: Frederic Weisbecker, Paul E. McKenney, linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Thu, 17 Apr 2025 14:10:53 -0500
Junxuan Liao <ljx@cs.wisc.edu> wrote:

> Just found out that the exit tracepoints for syscalls aren't always
> exactly preceding the exit to userspace. The kernel can still spend
> quite some time in task_work_run after the tracepoints are triggered.
> Has that bothered you before?

It's been a while, but what I usually do when I want to see entry into the
kernel is also to run:

  trace-cmd set -p function_graph --max-graph-depth 1

Which tracks the first function call into the kernel. It obviously now
misses entry and exit from user mode due to noinstr, but if a task_work
function is called, it will usually catch that too.

-- Steve

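To make the gap discussed in this exchange concrete, here is a minimal sketch that measures how long the kernel keeps running after a syscall exit tracepoint fires and before the task actually returns to user mode. It assumes a user_enter event is available at the border crossing, which is what the thread proposes rather than something every kernel provides. The `ev` record and `max_exit_gap_ns()` helper are hypothetical names for illustration, and the input is assumed to be a time-ordered event stream for a single task.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parsed-event kinds; EV_OTHER stands for anything else. */
enum ev_kind { EV_SYS_EXIT, EV_USER_ENTER, EV_OTHER };

struct ev {
	uint64_t ts_ns;		/* event timestamp in nanoseconds */
	enum ev_kind kind;
};

/*
 * Return the largest delay between a syscall exit event and the next
 * user_enter event, i.e. the longest stretch of kernel work (such as
 * task_work) that ran after the syscall exit tracepoint.
 * Assumption: events are time-ordered and come from one task.
 */
uint64_t max_exit_gap_ns(const struct ev *e, size_t n)
{
	uint64_t max_gap = 0, exit_ts = 0;
	int pending = 0;

	assert(e != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (e[i].kind == EV_SYS_EXIT) {
			exit_ts = e[i].ts_ns;
			pending = 1;
		} else if (e[i].kind == EV_USER_ENTER && pending) {
			uint64_t gap = e[i].ts_ns - exit_ts;

			if (gap > max_gap)
				max_gap = gap;
			pending = 0;
		}
	}
	return max_gap;
}
```

A large result would point at work done after the syscall exit tracepoint, which is the case the function_graph approach above can help attribute to a specific function.
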
* Re: Interface for enabling context tracking
From: Paul E. McKenney @ 2025-04-10 19:10 UTC
To: Junxuan Liao
Cc: Frederic Weisbecker, linux-kernel, linux-trace-kernel, Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers

On Thu, Apr 10, 2025 at 01:51:39PM -0500, Junxuan Liao wrote:
> Hi all,
>
> From what I can tell, tracepoints context_tracking:user_enter and
> user_exit might be useful for performance analysis. e.g. Figuring out how
> much time is spent handling page faults or in system calls. However
> context tracking is by default inactive and the only way to enable it is
> to turn on nohz_full for some CPUs.
>
> Is it a good idea to support turning on and off context tracking through
> some interface accessible from the userspace?

There is some in-kernel support for turning the rcu_nocbs portion
of nohz_full on and off on a per-CPU basis, but a given CPU must be
offline in order to do this transition. Last I heard, there are still
issues preventing this support from being generalized to cover all of
the nohz_full functionality, and I doubt that it would be exposed to
user level until all of nohz_full is supported.

The rcu_nocbs in-kernel functionality is tested regularly.

Are you interested in working on joining the noble quest of getting the
rest of the nohz_full support in place? (Full disclosure: This stuff
is non-trivial.)

							Thanx, Paul

* Re: Interface for enabling context tracking
From: Steven Rostedt @ 2025-04-10 19:32 UTC
To: Paul E. McKenney
Cc: Junxuan Liao, Frederic Weisbecker, linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Thu, 10 Apr 2025 12:10:19 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> There is some in-kernel support for turning the rcu_nocbs portion
> of nohz_full on and off on a per-CPU basis, but a given CPU must be
> offline in order to do this transition. Last I heard, there are still
> issues preventing this support from being generalized to cover all of
> the nohz_full functionality, and I doubt that it would be exposed to
> user level until all of nohz_full is supported.
>
> The rcu_nocbs in-kernel functionality is tested regularly.
>
> Are you interested in working on joining the noble quest of getting the
> rest of the nohz_full support in place? (Full disclosure: This stuff
> is non-trivial.)

I believe the request is more of just tracing entry and exit from the
kernel, which just needs a simple trace event at the border crossings.

NOHZ_FULL is to allow the kernel infrastructure to know that a CPU has
transitioned states (no need to do RCU or have a tick on that CPU). That's
a much harder task, as you not only need to know the border crossings, you
also need to make sure nothing happens between the locations you mark and
where the crossing takes place.

-- Steve

* Re: Interface for enabling context tracking
From: Junxuan Liao @ 2025-04-11 17:41 UTC
To: Steven Rostedt, Paul E. McKenney
Cc: Frederic Weisbecker, linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

>> Are you interested in working on joining the noble quest of getting the
>> rest of the nohz_full support in place? (Full disclosure: This stuff
>> is non-trivial.)
>
> I believe the request is more of just tracing entry and exit from the
> kernel, which just needs a simple trace event at the border crossings.

Yeah I'm more interested in just tracing this for now.

> It's been on my todo list to add one for page
> fault exit (as I do care for how long they last).

I've added a tracepoint similar to page_fault_user for that but I'm not
sure if it's the best way to do it. Should I send a patch for review?

Thanks,
Junxuan

* Re: Interface for enabling context tracking
From: Frederic Weisbecker @ 2025-04-11 21:51 UTC
To: Junxuan Liao
Cc: Steven Rostedt, Paul E. McKenney, linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mathieu Desnoyers

On Fri, Apr 11, 2025 at 12:41:37PM -0500, Junxuan Liao wrote:
>
> > > Are you interested in working on joining the noble quest of getting the
> > > rest of the nohz_full support in place? (Full disclosure: This stuff
> > > is non-trivial.)
> >
> > I believe the request is more of just tracing entry and exit from the
> > kernel, which just needs a simple trace event at the border crossings.
>
> Yeah I'm more interested in just tracing this for now.
>
> > It's been on my todo list to add one for page
> > fault exit (as I do care for how long they last).
>
> I've added a tracepoint similar to page_fault_user for that but I'm not
> sure if it's the best way to do it. Should I send a patch for review?

If you do so, it may be a good idea to remove page_fault_user and
page_fault_kernel and introduce page_fault_user_enter/page_fault_user_exit
and page_fault_kernel_enter/page_fault_kernel_exit.

But the following is also possible (and then trace/events/context_tracking.h
should be renamed to trace/events/entry.h):

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index fc61d0205c97..83b1764078f7 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -15,6 +15,8 @@
 
 #include <asm/entry-common.h>
 
+#include <trace/events/context_tracking.h>
+
 /*
  * Define dummy _TIF work flags if not defined by the architecture or for
  * disabled functionality.
@@ -115,6 +117,7 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs)
 	instrumentation_begin();
 	kmsan_unpoison_entry_regs(regs);
 	trace_hardirqs_off_finish();
+	trace_user_exit(0);
 	instrumentation_end();
 }
 
@@ -357,6 +360,7 @@ static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
 static __always_inline void exit_to_user_mode(void)
 {
 	instrumentation_begin();
+	trace_user_enter(0);
 	trace_hardirqs_on_prepare();
 	lockdep_hardirqs_on_prepare();
 	instrumentation_end();
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index fb5be6e9b423..e9395936bded 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -428,9 +428,6 @@ static __always_inline void ct_kernel_enter(bool user, int offset) { }
 
 #ifdef CONFIG_CONTEXT_TRACKING_USER
 
-#define CREATE_TRACE_POINTS
-#include <trace/events/context_tracking.h>
-
 DEFINE_STATIC_KEY_FALSE_RO(context_tracking_key);
 EXPORT_SYMBOL_GPL(context_tracking_key);
 
@@ -486,7 +483,6 @@ void noinstr __ct_user_enter(enum ctx_state state)
 			 */
 			if (state == CT_STATE_USER) {
 				instrumentation_begin();
-				trace_user_enter(0);
 				vtime_user_enter(current);
 				instrumentation_end();
 			}
@@ -623,7 +619,6 @@ void noinstr __ct_user_exit(enum ctx_state state)
 			if (state == CT_STATE_USER) {
 				instrumentation_begin();
 				vtime_user_exit(current);
-				trace_user_exit(0);
 				instrumentation_end();
 			}
 

-- 
Frederic Weisbecker
SUSE Labs