linux-trace-kernel.vger.kernel.org archive mirror
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Alexei Starovoitov <ast@kernel.org>, Yonghong Song <yhs@fb.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	bpf@vger.kernel.org, Joel Fernandes <joel@joelfernandes.org>,
	linux-trace-kernel@vger.kernel.org
Subject: Re: [PATCH 0/8] tracing: Allow system call tracepoints to handle page faults
Date: Tue, 17 Sep 2024 04:49:16 +0900
Message-ID: <20240917044916.c615d25eb4fecc9818d3d376@kernel.org>
In-Reply-To: <20240909201652.319406-1-mathieu.desnoyers@efficios.com>

On Mon,  9 Sep 2024 16:16:44 -0400
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> Wire up the system call tracepoints with Tasks Trace RCU to allow
> the ftrace, perf, and eBPF tracers to handle page faults.
> 
> This series does the initial wire-up allowing tracers to handle page
> faults, but leaves out the actual handling of said page faults as future
> work.
> 
> This series was compile and runtime tested with ftrace and perf syscall
> tracing and raw syscall tracing, adding a WARN_ON_ONCE() in the
> generated code to validate that the intended probes are used for raw
> syscall tracing. The might_fault() added within those probes validates
> that they are called from a context where handling a page fault is OK.

I think this series is valuable in itself.
However, I'm still not sure why ftrace needs to handle page faults.
This allows the syscall trace events themselves to handle page faults, but
the raw-syscall/syscall events only access registers, right?

I think page faults can happen only when those register values are
dereferenced as pointers to data structures, and currently that is done by
probes such as eprobe and fprobe. To handle faults in those probes, we
need to change how they write data into the per-CPU ring buffer.

Currently, those probes reserve an entry in the ring buffer, write the
dereferenced data into that entry, and commit it. Preemption is still
disabled during this reserve-write-commit operation, so we need another
buffer on the stack: dereference into it first, then copy the data into
the ring buffer entry.
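
A rough userspace sketch of that ordering, for illustration only. Every
name here (faultable_copy, rb_reserve, rb_commit, trace_string_arg) is a
hypothetical stand-in and not the actual ring buffer or probe API, and the
preempt_off flag merely models "preemption disabled". The assert() plays
the role that might_fault() would play in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these are kernel interfaces. */
#define RB_SIZE 256
#define MAX_ARG 64

static bool preempt_off;               /* models "preemption disabled" */
static unsigned char rb_data[RB_SIZE]; /* models one per-CPU buffer */
static size_t rb_head;                 /* end of committed data */

/* Models a dereference that may fault and sleep (assumption). */
static size_t faultable_copy(char *dst, const char *src, size_t max)
{
	size_t len = 0;

	assert(!preempt_off);	/* faulting is only OK while preemptible */
	while (len < max - 1 && src[len])
		len++;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return len + 1;		/* bytes stored, including the NUL */
}

/* Reserve space for an entry; preemption stays off until commit. */
static void *rb_reserve(size_t len)
{
	if (len > RB_SIZE - rb_head)
		return NULL;
	preempt_off = true;
	return rb_data + rb_head;
}

/* Publish the entry and re-enable preemption. */
static void rb_commit(size_t len)
{
	rb_head += len;
	preempt_off = false;
}

static int trace_string_arg(const char *uptr)
{
	char tmp[MAX_ARG];	/* bounce buffer on the stack */
	size_t len;
	void *entry;

	/* Step 1: dereference while faults are still allowed. */
	len = faultable_copy(tmp, uptr, sizeof(tmp));

	/* Step 2: reserve-write-commit touches only the stack copy. */
	entry = rb_reserve(len);
	if (!entry)
		return -1;
	memcpy(entry, tmp, len);
	rb_commit(len);
	return 0;
}
```

Calling faultable_copy() between rb_reserve() and rb_commit() would trip
the assert in this model, which is the situation the probes are in today.
The cost of the bounce buffer is one extra copy plus a bounded amount of
stack per event.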

Thank you,


> 
> For eBPF, this series is compile-tested only.
> 
> This series replaces the "Faultable Tracepoints v6" series found at [1].
> 
> Thanks,
> 
> Mathieu
> 
> Link: https://lore.kernel.org/lkml/20240828144153.829582-1-mathieu.desnoyers@efficios.com/ # [1]
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Yonghong Song <yhs@fb.com>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> Cc: bpf@vger.kernel.org
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: linux-trace-kernel@vger.kernel.org
> 
> Mathieu Desnoyers (8):
>   tracing: Declare system call tracepoints with TRACE_EVENT_SYSCALL
>   tracing/ftrace: guard syscall probe with preempt_notrace
>   tracing/perf: guard syscall probe with preempt_notrace
>   tracing/bpf: guard syscall probe with preempt_notrace
>   tracing: Allow system call tracepoints to handle page faults
>   tracing/ftrace: Add might_fault check to syscall probes
>   tracing/perf: Add might_fault check to syscall probes
>   tracing/bpf: Add might_fault check to syscall probes
> 
>  include/linux/tracepoint.h      | 87 +++++++++++++++++++++++++--------
>  include/trace/bpf_probe.h       | 13 +++++
>  include/trace/define_trace.h    |  5 ++
>  include/trace/events/syscalls.h |  4 +-
>  include/trace/perf.h            | 43 ++++++++++++++--
>  include/trace/trace_events.h    | 61 +++++++++++++++++++++--
>  init/Kconfig                    |  1 +
>  kernel/entry/common.c           |  4 +-
>  kernel/trace/trace_syscalls.c   | 36 ++++++++++++--
>  9 files changed, 218 insertions(+), 36 deletions(-)
> 
> -- 
> 2.39.2


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thread overview: 17+ messages
2024-09-09 20:16 [PATCH 0/8] tracing: Allow system call tracepoints to handle page faults Mathieu Desnoyers
2024-09-09 20:16 ` [PATCH 1/8] tracing: Declare system call tracepoints with TRACE_EVENT_SYSCALL Mathieu Desnoyers
2024-09-09 20:16 ` [PATCH 2/8] tracing/ftrace: guard syscall probe with preempt_notrace Mathieu Desnoyers
2024-09-16 18:47   ` Masami Hiramatsu
2024-09-09 20:16 ` [PATCH 3/8] tracing/perf: " Mathieu Desnoyers
2024-09-09 20:16 ` [PATCH 4/8] tracing/bpf: " Mathieu Desnoyers
2024-09-10  0:03   ` Andrii Nakryiko
2024-09-09 20:16 ` [PATCH 5/8] tracing: Allow system call tracepoints to handle page faults Mathieu Desnoyers
2024-09-09 20:16 ` [PATCH 6/8] tracing/ftrace: Add might_fault check to syscall probes Mathieu Desnoyers
2024-09-09 20:16 ` [PATCH 7/8] tracing/perf: " Mathieu Desnoyers
2024-09-09 20:16 ` [PATCH 8/8] tracing/bpf: " Mathieu Desnoyers
2024-09-10  0:02   ` Andrii Nakryiko
2024-09-09 23:53 ` [PATCH 0/8] tracing: Allow system call tracepoints to handle page faults Andrii Nakryiko
2024-09-10  0:36   ` Mathieu Desnoyers
2024-09-11 23:08     ` Andrii Nakryiko
2024-09-16 19:49 ` Masami Hiramatsu [this message]
2024-09-17  9:54   ` Mathieu Desnoyers
