Re: [RFC PATCH] tracing: change syscall number type in struct syscall_trace_*

linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Steven Rostedt <rostedt@goodmis.org>
To: Artem Savkov <asavkov@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, netdev@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-rt-users@vger.kernel.org, Jiri Olsa <jolsa@kernel.org>
Subject: Re: [RFC PATCH] tracing: change syscall number type in struct syscall_trace_*
Date: Tue, 3 Oct 2023 21:38:44 -0400	[thread overview]
Message-ID: <20231003213844.1de0c138@gandalf.local.home> (raw)
In-Reply-To: <20231002135242.247536-1-asavkov@redhat.com>

On Mon,  2 Oct 2023 15:52:42 +0200
Artem Savkov <asavkov@redhat.com> wrote:

> linux-rt-devel tree contains a patch that adds an extra member to struct
> trace_entry. This causes the offset of args field in struct
> trace_event_raw_sys_enter be different from the one in struct
> syscall_trace_enter:

This patch looks like it's fixing the symptom and not the issue. No code
should rely on the two event structures to be related. That's an unwanted
coupling, that will likely cause issues down the road (like the RT patch
you mentioned).


> 
> struct trace_event_raw_sys_enter {
>         struct trace_entry         ent;                  /*     0    12 */
> 
>         /* XXX last struct has 3 bytes of padding */
>         /* XXX 4 bytes hole, try to pack */
> 
>         long int                   id;                   /*    16     8 */
>         long unsigned int          args[6];              /*    24    48 */
>         /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
>         char                       __data[];             /*    72     0 */
> 
>         /* size: 72, cachelines: 2, members: 4 */
>         /* sum members: 68, holes: 1, sum holes: 4 */
>         /* paddings: 1, sum paddings: 3 */
>         /* last cacheline: 8 bytes */
> };
> 
> struct syscall_trace_enter {
>         struct trace_entry         ent;                  /*     0    12 */
> 
>         /* XXX last struct has 3 bytes of padding */
> 
>         int                        nr;                   /*    12     4 */
>         long unsigned int          args[];               /*    16     0 */
> 
>         /* size: 16, cachelines: 1, members: 3 */
>         /* paddings: 1, sum paddings: 3 */
>         /* last cacheline: 16 bytes */
> };
> 
> This, in turn, causes perf_event_set_bpf_prog() fail while running bpf
> test_profiler testcase because max_ctx_offset is calculated based on the
> former struct, while off on the latter:

The above appears to be pointing to the real bug. The "is calculated based
on the former struct while off on the latter" Why are the two being used
together? They are supposed to be *unrelated*!


> 
>   10488         if (is_tracepoint || is_syscall_tp) {
>   10489                 int off = trace_event_get_offsets(event->tp_event);

So basically this is clumping together the raw_syscalls with the syscalls
events as if they are the same. But the are not. They are created
differently. It's basically like using one structure to get the offsets of
another structure. That would be a bug anyplace else in the kernel. Sounds
like it's a bug here too.

I think the issue is with this code, not the tracing code.

We could expose the struct syscall_trace_enter and syscall_trace_exit if
the offsets to those are needed.

-- Steve


>   10490
>   10491                 if (prog->aux->max_ctx_offset > off)
>   10492                         return -EACCES;
>   10493         }
> 
> This patch changes the type of nr member in syscall_trace_* structs to
> be long so that "args" offset is equal to that in struct
> trace_event_raw_sys_enter.
>

next prev parent reply	other threads:[~2023-10-04  1:37 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-02 13:52 [RFC PATCH] tracing: change syscall number type in struct syscall_trace_* Artem Savkov
2023-10-03 22:11 ` Andrii Nakryiko
2023-10-04  7:02   ` Artem Savkov
2023-10-04  1:38 ` Steven Rostedt [this message]
2023-10-04 12:55   ` Artem Savkov
2023-10-05 12:34     ` Artem Savkov
2023-10-12 11:45       ` [RFC PATCH bpf-next] bpf: change syscall_nr type to int in struct syscall_tp_t Artem Savkov
2023-10-12 13:44         ` Steven Rostedt
2023-10-12 23:32           ` Andrii Nakryiko
2023-10-13  5:42             ` [PATCH " Artem Savkov
2023-10-13 19:50               ` patchwork-bot+netdevbpf
2023-10-13  6:01             ` [RFC PATCH " Artem Savkov
2023-10-13 14:00               ` Steven Rostedt
2023-10-13 19:43                 ` Andrii Nakryiko
2023-10-16 15:53                   ` Steven Rostedt
2023-10-13  3:13         ` Rod Webster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231003213844.1de0c138@gandalf.local.home \
    --to=rostedt@goodmis.org \
    --cc=andrii@kernel.org \
    --cc=asavkov@redhat.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jolsa@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).