Re: [PATCH v3 0/3] tracing: Read user data from futex system call trace event

From: Thomas Gleixner <tglx@linutronix.de>
To: Steven Rostedt <rostedt@kernel.org>,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Brian Geffon <bgeffon@google.com>,
	John Stultz <jstultz@google.com>, Ian Rogers <irogers@google.com>,
	Suleiman Souhlal <suleiman@google.com>
Subject: Re: [PATCH v3 0/3] tracing: Read user data from futex system call trace event
Date: Wed, 01 Apr 2026 21:31:19 +0200
Message-ID: <87zf3m7a0o.ffs@tglx>
In-Reply-To: <20260331181349.062575155@kernel.org>

On Tue, Mar 31 2026 at 14:13, Steven Rostedt wrote:
> We are looking at the performance of futexes and require a bit more
> information when tracing them.
>
> The two patches here extend the system call reading of user space to

s/two/three/ :)

> create specific handling of the futex system call. It now reads the
> user space relevant data (the addr, utime and addr2), as well as
> parses the flags. This adds a little smarts to the trace event as
> it only shows the parameters that are relevant, as well as parses
> utime as either a timespec or as val2 depending on the futex_op.
>
> Here's an example of the new output:
>
>  sys_futex(uaddr: 0x56196292e830 (0), FUTEX_WAKE|FUTEX_PRIVATE_FLAG)
>  sys_futex(uaddr: 0x56196292e834 (0x4a7) tid: 1191, FUTEX_UNLOCK_PI|FUTEX_PRIVATE_FLAG)
>  sys_futex(uaddr: 0x56196292e834 (0) tid: 0, FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)
>  sys_futex(uaddr: 0x56196292e830 (0), FUTEX_WAIT|FUTEX_PRIVATE_FLAG)
>  sys_futex(uaddr: 0x56196292e838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, timespec: 0x7ffc1b91a9f0 (163.048528790), uaddr2: 0x56196292e834 (4aa), val3: 0)
>  sys_futex(uaddr: 0x56196292e834 (0x4aa) tid: 1194, FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)
>  sys_futex(uaddr: 0x56196292e838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, timespec: 0x7ffc1b91a9f0 (163.048528790), uaddr2: 0x56196292e834 (800004aa), val3: 0)
>  sys_futex(uaddr: 0x7f7ed6b29990 (0x4ab), FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME)
>  sys_futex(uaddr: 0x56196292e834 (0x800004aa) tid: 1194 (WAITERS), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)
>  sys_futex(uaddr: 0x56196292e838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, timespec: 0x7ffc1b91a9f0 (163.048528790), uaddr2: 0x56196292e834 (800004aa), val3: 0)
>  sys_futex(uaddr: 0x56196292e834 (0x800004aa) tid: 1194 (WAITERS), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)

I understand what you are trying to achieve, but do we really need all
the complexity of decoding and pretty printing in the kernel?

Isn't it sufficient to store and expose the raw data and use post
processing to make it readable?

I've been doing complex futex analysis for two decades with a small set
of python scripts which translate raw text or binary trace data into
human readable information.

I agree that it's useful to have the actual timeout value and other data
which is missing today, but that still does not require all this
customized printing.

The initial idea of having at least some information about the data
entry (type, meaning etc.) in $event/format and using that for kernel
text output and for user space tools to analyze a binary trace was
definitely the right way to go.

But this series now deviates from that because $event/format cannot
carry the information you decode in the kernel. It will still describe
the raw event data, no?

So why not keep the well known and working solution of identifying
the data in the format, printing it raw and leaving the post processing
to user space tools in case there is a need?
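
For illustration only, here is a minimal sketch of what such user space
post processing could look like for the PI futex value word. It is not
one of the scripts mentioned above. The function name decode_pi_word()
is hypothetical; the three mask values are assumed to match the
definitions in include/uapi/linux/futex.h.

```python
# Sketch: decode a raw PI futex value word as read from user space.
# Mask values are assumed to match include/uapi/linux/futex.h.
FUTEX_WAITERS = 0x80000000
FUTEX_OWNER_DIED = 0x40000000
FUTEX_TID_MASK = 0x3FFFFFFF


def decode_pi_word(val):
    """Return a readable string for a raw PI futex word (hypothetical helper)."""
    tid = val & FUTEX_TID_MASK
    state = []
    if val & FUTEX_WAITERS:
        state.append("WAITERS")
    if val & FUTEX_OWNER_DIED:
        state.append("OWNER_DIED")
    out = f"tid: {tid}"
    if state:
        out += " (" + "|".join(state) + ")"
    return out


if __name__ == "__main__":
    # Values taken from the example trace output quoted above.
    for raw in (0x4A7, 0x4AA, 0x800004AA):
        print(f"{raw:#x} -> {decode_pi_word(raw)}")
```

Fed the raw values from the quoted example, this yields the same
"tid: 1194 (WAITERS)" style of annotation without any kernel side
decoding.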

You actually make it harder to do development. Look at the patch series
related to robust futexes:

  https://lore.kernel.org/lkml/20260330114212.927686587@kernel.org/

So your decoding:

>  sys_futex(uaddr: 0x56196292e830 (0), FUTEX_WAKE|FUTEX_PRIVATE_FLAG)

fails to decode the new flag and the usage of uaddr2 unless I go and add
it in the first place _before_ working on the code. Right now it is just
printing op as a hex value and it just works when a new bit is added.

Stick 100 lines of python into tools/tracing and be done with it. I'm
happy to contribute to that.
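
As a rough sketch of the kind of script meant here (an illustration, not
a proposed tools/tracing implementation), the raw op value can be
decoded in user space so that bits the script does not know about are
still shown as hex instead of being lost. The helper name
decode_futex_op() is hypothetical. The command and flag values are
assumed to match include/uapi/linux/futex.h. Treating the low 7 bits as
the command is a simplifying assumption of this sketch; the kernel
defines its command mask as the complement of the known flags.

```python
# Sketch: decode a raw futex op value in user space.
# Values are assumed to match include/uapi/linux/futex.h.
FUTEX_CMDS = {
    0: "FUTEX_WAIT",
    1: "FUTEX_WAKE",
    2: "FUTEX_FD",
    3: "FUTEX_REQUEUE",
    4: "FUTEX_CMP_REQUEUE",
    5: "FUTEX_WAKE_OP",
    6: "FUTEX_LOCK_PI",
    7: "FUTEX_UNLOCK_PI",
    8: "FUTEX_TRYLOCK_PI",
    9: "FUTEX_WAIT_BITSET",
    10: "FUTEX_WAKE_BITSET",
    11: "FUTEX_WAIT_REQUEUE_PI",
    12: "FUTEX_CMP_REQUEUE_PI",
    13: "FUTEX_LOCK_PI2",
}
FUTEX_FLAGS = {
    0x80: "FUTEX_PRIVATE_FLAG",
    0x100: "FUTEX_CLOCK_REALTIME",
}
CMD_BITS = 0x7F  # assumption of this sketch: everything below PRIVATE_FLAG


def decode_futex_op(op):
    """Return 'CMD|FLAG|...' and keep unknown bits visible as hex."""
    cmd = op & CMD_BITS
    flags = op & ~CMD_BITS
    parts = [FUTEX_CMDS.get(cmd, f"cmd={cmd}")]
    for bit, name in FUTEX_FLAGS.items():
        if flags & bit:
            parts.append(name)
            flags &= ~bit
    if flags:
        # A flag bit this script has never heard of, e.g. one added by
        # a patch series under development.
        parts.append(hex(flags))
    return "|".join(parts)


if __name__ == "__main__":
    for raw in (0x81, 0x109, 0x289):
        print(f"{raw:#x} -> {decode_futex_op(raw)}")
```

An unknown bit such as 0x200 in the last value is simply appended as
"0x200", so the output stays usable before the script learns about a
new flag.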

Aside from that:

  Putting the decoder (futex_print_syscall) into the futex code itself
  is admittedly a smart move to offload the work of keeping that up to
  date to the people who are actually working on futexes.

TBH, I'm not interested in dealing with that at all. If you want this
ftrace magic pretty printing, then stick it into kernel/trace or, if
there is a real technical reason (hint: there is none), into
kernel/futex/trace.c and take ownership of it. But please do not burden
others with your fancy toy of the day.

Thanks,

        tglx
