* [PATCH v2 0/3] tracing: Read user data from futex system call trace event
@ 2026-03-10 20:09 Steven Rostedt
2026-03-10 20:09 ` [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data Steven Rostedt
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Steven Rostedt @ 2026-03-10 20:09 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Brian Geffon, John Stultz,
Ian Rogers, Suleiman Souhlal
We are looking at the performance of futexes and require a bit more
information when tracing them.
The two patches here extend the system call reading of user space to
create specific handling of the futex system call. It now reads the
user space relevant data (the addr, utime and addr2), as well as
parses the flags. This adds a little smarts to the trace event as
it only shows the parameters that are relevant, as well as parses
utime as either a timespec or as val2 depending on the futex_op.
Here's an example of the new output:
sys_futex(uaddr: 0x56196292e830 (0), FUTEX_WAKE|FUTEX_PRIVATE_FLAG)
sys_futex(uaddr: 0x56196292e834 (0x4a7) tid: 1191, FUTEX_UNLOCK_PI|FUTEX_PRIVATE_FLAG)
sys_futex(uaddr: 0x56196292e834 (0) tid: 0, FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)
sys_futex(uaddr: 0x56196292e830 (0), FUTEX_WAIT|FUTEX_PRIVATE_FLAG)
sys_futex(uaddr: 0x56196292e838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, timespec: 0x7ffc1b91a9f0 (163.048528790), uaddr2: 0x56196292e834 (4aa), val3: 0)
sys_futex(uaddr: 0x56196292e834 (0x4aa) tid: 1194, FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)
sys_futex(uaddr: 0x56196292e838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, timespec: 0x7ffc1b91a9f0 (163.048528790), uaddr2: 0x56196292e834 (800004aa), val3: 0)
sys_futex(uaddr: 0x7f7ed6b29990 (0x4ab), FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME)
sys_futex(uaddr: 0x56196292e834 (0x800004aa) tid: 1194 (WAITERS), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)
sys_futex(uaddr: 0x56196292e838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, timespec: 0x7ffc1b91a9f0 (163.048528790), uaddr2: 0x56196292e834 (800004aa), val3: 0)
sys_futex(uaddr: 0x56196292e834 (0x800004aa) tid: 1194 (WAITERS), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG)
Changes since v1: https://lore.kernel.org/all/20260303214735.002154462@kernel.org/
- Moved the printing of the futex parameters to kernel/futex/syscall.c to
keep it closer to the actual system call.
- Moved the always_inline functions: futex_cmd_has_timeout()
and futex_cmd_has_addr2() into include/linux/futex.h so that they
can be both used by the futex and tracing syscall code.
Steven Rostedt (3):
tracing: Have futex syscall trace event show specific user data
tracing: Update futex syscall trace event to show more commands
tracing: Show TID and flags for PI futex system call trace event
----
include/linux/futex.h | 38 +++++-
kernel/futex/syscalls.c | 129 ++++++++++++++++++---
kernel/trace/trace_syscalls.c | 261 +++++++++++++++++++++++++++++++++++++++++-
3 files changed, 411 insertions(+), 17 deletions(-)
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data
2026-03-10 20:09 [PATCH v2 0/3] tracing: Read user data from futex system call trace event Steven Rostedt
@ 2026-03-10 20:09 ` Steven Rostedt
2026-03-11 7:41 ` kernel test robot
` (2 more replies)
2026-03-10 20:09 ` [PATCH v2 2/3] tracing: Update futex syscall trace event to show more commands Steven Rostedt
2026-03-10 20:09 ` [PATCH v2 3/3] tracing: Show TID and flags for PI futex system call trace event Steven Rostedt
2 siblings, 3 replies; 10+ messages in thread
From: Steven Rostedt @ 2026-03-10 20:09 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Brian Geffon, John Stultz,
Ian Rogers, Suleiman Souhlal
From: Steven Rostedt <rostedt@goodmis.org>
Add specific reporting of the futex system call. This allows for debugging
the futex code a bit easier. Instead of just showing the values passed
into the futex system call, read the value of the user space memory
pointed to by the addr parameter.
Also make the op parameter more readable by parsing the values to show
what the command is:
futex_requeue_p-3251 [002] ..... 2101.068479: sys_futex(uaddr: 0x55e79a4da834 (0x80000cb1), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
futex_requeue_p-3248 [001] ..... 2101.068970: sys_futex(uaddr: 0x7f859072f990 (0xcb2), FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, val: 3250)
futex_requeue_p-3252 [005] ..... 2101.069108: sys_futex(uaddr: 0x55e79a4da838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, val: 0, timespec: 0x7ffe61076aa0, uaddr2: 0x55e79a4da834, uaddr2: 94453214586932, val3: 0)
futex_requeue_p-3252 [005] ..... 2101.069410: sys_futex(uaddr: 0x55e79a4da834 (0x80000cb1), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1: https://lore.kernel.org/all/20260303214942.428502100@kernel.org/
- Moved the printing of the futex parameters to kernel/futex/syscall.c to
keep it closer to the actual system call.
include/linux/futex.h | 7 +-
kernel/futex/syscalls.c | 89 ++++++++++++++++++
kernel/trace/trace_syscalls.c | 168 +++++++++++++++++++++++++++++++++-
3 files changed, 260 insertions(+), 4 deletions(-)
diff --git a/include/linux/futex.h b/include/linux/futex.h
index 9e9750f04980..9fc47aa01a8b 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -82,6 +82,10 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
u32 __user *uaddr2, u32 val2, u32 val3);
int futex_hash_prctl(unsigned long arg2, unsigned long arg3, unsigned long arg4);
+#ifdef CONFIG_FTRACE_SYSCALLS
+void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u32 *ptr);
+#endif
+
#ifdef CONFIG_FUTEX_PRIVATE_HASH
int futex_hash_allocate_default(void);
void futex_hash_free(struct mm_struct *mm);
@@ -114,7 +118,8 @@ static inline int futex_hash_allocate_default(void)
}
static inline int futex_hash_free(struct mm_struct *mm) { return 0; }
static inline int futex_mm_init(struct mm_struct *mm) { return 0; }
-
+static inline void futex_print_syscall(struct seq_buf *s, int nr_args,
+ unsigned long *args, u32 *ptr) { }
#endif
#endif
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index 743c7a728237..a1cd512aa502 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -171,6 +171,18 @@ static __always_inline bool futex_cmd_has_timeout(u32 cmd)
return false;
}
+static __always_inline bool futex_cmd_has_addr2(u32 cmd)
+{
+ switch (cmd) {
+ case FUTEX_REQUEUE:
+ case FUTEX_CMP_REQUEUE:
+ case FUTEX_WAKE_OP:
+ case FUTEX_WAIT_REQUEUE_PI:
+ return true;
+ }
+ return false;
+}
+
static __always_inline int
futex_init_timeout(u32 cmd, u32 op, struct timespec64 *ts, ktime_t *t)
{
@@ -207,6 +219,83 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
return do_futex(uaddr, op, val, tp, uaddr2, (unsigned long)utime, val3);
}
+#ifdef CONFIG_FTRACE_SYSCALLS
+static const char * __futex_cmds[] =
+{
+ "FUTEX_WAIT", "FUTEX_WAKE", "FUTEX_FD", "FUTEX_REQUEUE",
+ "FUTEX_CMP_REQUEUE", "FUTEX_WAKE_OP", "FUTEX_LOCK_PI",
+ "FUTEX_UNLOCK_PI", "FUTEX_TRYLOCK_PI", "FUTEX_WAIT_BITSET",
+ "FUTEX_WAKE_BITSET", "FUTEX_WAIT_REQUEUE_PI", "FUTEX_CMP_REQUEUE_PI",
+ "FUTEX_LOCK_PI2",
+};
+
+void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u32 *ptr)
+{
+ unsigned int op, cmd;
+ bool done = false;
+
+ for (int i = 0; !done && i < nr_args; i++) {
+
+ if (seq_buf_has_overflowed(s))
+ break;
+
+ switch (i) {
+ case 0:
+ seq_buf_printf(s, "uaddr: 0x%lx", args[i]);
+ if (ptr) {
+ u32 val = *ptr;
+ if (val < 10)
+ seq_buf_printf(s, " (%u)", val);
+ else
+ seq_buf_printf(s, " (0x%x)", val);
+ }
+ continue;
+ case 1:
+ op = args[i];
+ cmd = op & FUTEX_CMD_MASK;
+ if (cmd <= FUTEX_LOCK_PI2)
+ seq_buf_printf(s, ", %s", __futex_cmds[cmd]);
+ else
+ seq_buf_puts(s, ", UNKNOWN");
+
+ if (op & FUTEX_PRIVATE_FLAG)
+ seq_buf_puts(s, "|FUTEX_PRIVATE_FLAG");
+ if (op & FUTEX_CLOCK_REALTIME)
+ seq_buf_puts(s, "|FUTEX_CLOCK_REALTIME");
+ continue;
+ case 3:
+ if (!futex_cmd_has_timeout(cmd)) {
+
+ if (!futex_cmd_has_addr2(cmd)) {
+ done = true;
+ continue;
+ }
+
+ seq_buf_printf(s, ", val2: 0x%x", (u32)(long)args[i]);
+ continue;
+ }
+
+ if (!args[i])
+ continue;
+
+ seq_buf_printf(s, ", timespec: 0x%lx", args[i]);
+ continue;
+ case 4:
+ if (!futex_cmd_has_addr2(cmd)) {
+ done = true;
+ continue;
+ }
+ seq_buf_printf(s, ", uaddr2: 0x%lx", args[i]);
+ continue;
+ case 5:
+ seq_buf_printf(s, ", val3: %lu", args[i]);
+ done = true;
+ continue;
+ }
+ }
+}
+#endif
+
/**
* futex_parse_waitv - Parse a waitv array from userspace
* @futexv: Kernel side list of waiters to be filled
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 37317b81fcda..8cb3af569157 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -6,11 +6,13 @@
#include <linux/slab.h>
#include <linux/kernel.h>
#include <linux/module.h> /* for MODULE_NAME_LEN via KSYM_SYMBOL_LEN */
+#include <linux/futex.h>
#include <linux/ftrace.h>
#include <linux/perf_event.h>
#include <linux/xarray.h>
#include <asm/syscall.h>
+
#include "trace_output.h"
#include "trace.h"
@@ -237,6 +239,27 @@ sys_enter_openat_print(struct syscall_trace_enter *trace, struct syscall_metadat
return trace_handle_return(s);
}
+static enum print_line_t
+sys_enter_futex_print(struct syscall_trace_enter *trace, struct syscall_metadata *entry,
+ struct trace_seq *s, struct trace_event *event, int ent_size)
+{
+ void *end = (void *)trace + ent_size;
+ void *ptr;
+
+ /* Set ptr to the user space copied area */
+ ptr = (void *)trace->args + sizeof(unsigned long) * entry->nb_args;
+ if (ptr + 4 > end)
+ ptr = NULL;
+
+ trace_seq_printf(s, "%s(", entry->name);
+
+ futex_print_syscall(&s->seq, entry->nb_args, trace->args, ptr);
+
+ trace_seq_puts(s, ")\n");
+
+ return trace_handle_return(s);
+}
+
static enum print_line_t
print_syscall_enter(struct trace_iterator *iter, int flags,
struct trace_event *event)
@@ -267,6 +290,10 @@ print_syscall_enter(struct trace_iterator *iter, int flags,
if (!tr || !(tr->trace_flags & TRACE_ITER(VERBOSE)))
return sys_enter_openat_print(trace, entry, s, event);
break;
+ case __NR_futex:
+ if (!tr || !(tr->trace_flags & TRACE_ITER(VERBOSE)))
+ return sys_enter_futex_print(trace, entry, s, event, iter->ent_size);
+ break;
default:
break;
}
@@ -437,6 +464,69 @@ sys_enter_openat_print_fmt(struct syscall_metadata *entry, char *buf, int len)
return pos;
}
+static int __init
+sys_enter_futex_print_fmt(struct syscall_metadata *entry, char *buf, int len)
+{
+ int pos = 0;
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "\"uaddr: 0x%%lx (0x%%lx) cmd=%%s%%s%%s");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " val: 0x%%x timeout/val2: 0x%%llx");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " uaddr2: 0x%%lx val3: 0x%%x\", ");
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " REC->uaddr,");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " REC->__value,");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " __print_symbolic(REC->op & 0x%x, ", FUTEX_CMD_MASK);
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_WAIT\"}, ", FUTEX_WAIT);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_WAKE\"}, ", FUTEX_WAKE);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_FD\"}, ", FUTEX_FD);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_REQUEUE\"}, ", FUTEX_REQUEUE);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_CMP_REQUEUE\"}, ", FUTEX_CMP_REQUEUE);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_WAKE_OP\"}, ", FUTEX_WAKE_OP);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_LOCK_PI\"}, ", FUTEX_LOCK_PI);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_UNLOCK_PI\"}, ", FUTEX_UNLOCK_PI);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_TRYLOCK_PI\"}, ", FUTEX_TRYLOCK_PI);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_WAIT_BITSET\"}, ", FUTEX_WAIT_BITSET);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_WAKE_BITSET\"}, ", FUTEX_WAKE_BITSET);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_WAIT_REQUEUE_PI\"}, ", FUTEX_WAIT_REQUEUE_PI);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_CMP_REQUEUE_PI\"}, ", FUTEX_CMP_REQUEUE_PI);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ "{%d, \"FUTEX_LOCK_PI2\"}),", FUTEX_LOCK_PI2);
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " (REC->op & %d) ? \"|FUTEX_PRIVATE_FLAG\" : \"\",",
+ FUTEX_PRIVATE_FLAG);
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " (REC->op & %d) ? \"|FUTEX_CLOCK_REALTIME\" : \"\",",
+ FUTEX_CLOCK_REALTIME);
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " REC->val, REC->utime,");
+
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " REC->uaddr, REC->val3");
+ return pos;
+}
+
static int __init
__set_enter_print_fmt(struct syscall_metadata *entry, char *buf, int len)
{
@@ -447,6 +537,8 @@ __set_enter_print_fmt(struct syscall_metadata *entry, char *buf, int len)
switch (entry->syscall_nr) {
case __NR_openat:
return sys_enter_openat_print_fmt(entry, buf, len);
+ case __NR_futex:
+ return sys_enter_futex_print_fmt(entry, buf, len);
default:
break;
}
@@ -523,6 +615,21 @@ static void __init free_syscall_print_fmt(struct trace_event_call *call)
kfree(call->print_fmt);
}
+static int __init futex_fields(struct trace_event_call *call, int offset)
+{
+ char *arg;
+ int ret;
+
+ arg = kstrdup("__value", GFP_KERNEL);
+ if (WARN_ON_ONCE(!arg))
+ return -ENOMEM;
+ ret = trace_define_field(call, "u32", arg, offset, sizeof(int), 0,
+ FILTER_OTHER);
+ if (ret)
+ kfree(arg);
+ return ret;
+}
+
static int __init syscall_enter_define_fields(struct trace_event_call *call)
{
struct syscall_trace_enter trace;
@@ -544,6 +651,9 @@ static int __init syscall_enter_define_fields(struct trace_event_call *call)
offset += sizeof(unsigned long);
}
+ if (!ret && meta->syscall_nr == __NR_futex)
+ return futex_fields(call, offset);
+
if (ret || !meta->user_mask)
return ret;
@@ -689,6 +799,48 @@ static int syscall_copy_user_array(char *buf, const char __user *ptr,
return 0;
}
+static int
+syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
+{
+ struct syscall_user_buffer *sbuf;
+ const char __user *ptr;
+ char *buf;
+
+ /* buf_size of zero means user doesn't want user space read */
+ if (!buf_size)
+ return -1;
+
+ /* If the syscall_buffer is NULL, tracing is being shutdown */
+ sbuf = READ_ONCE(syscall_buffer);
+ if (!sbuf)
+ return -1;
+
+ ptr = (char __user *)args[0];
+
+ *buffer = trace_user_fault_read(&sbuf->buf, ptr, 4, NULL, NULL);
+ if (!*buffer)
+ return -1;
+
+ /* Add room for the value */
+ *size += 4;
+
+ buf = *buffer;
+
+ return 0;
+}
+
+static void syscall_put_futex(struct syscall_metadata *sys_data,
+ struct syscall_trace_enter *entry,
+ char *buffer)
+{
+ u32 *ptr;
+
+ /* Place the futex key into the storage */
+ ptr = (void *)entry->args + sizeof(unsigned long) * sys_data->nb_args;
+
+ *ptr = *(u32 *)buffer;
+}
+
static char *sys_fault_user(unsigned int buf_size,
struct syscall_metadata *sys_data,
struct syscall_user_buffer *sbuf,
@@ -905,6 +1057,9 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
if (syscall_get_data(sys_data, args, &user_ptr,
&size, user_sizes, &uargs, tr->syscall_buf_sz) < 0)
return;
+ } else if (syscall_nr == __NR_futex) {
+ if (syscall_get_futex(args, &user_ptr, &size, tr->syscall_buf_sz) < 0)
+ return;
}
size += sizeof(*entry) + sizeof(unsigned long) * sys_data->nb_args;
@@ -921,6 +1076,9 @@ static void ftrace_syscall_enter(void *data, struct pt_regs *regs, long id)
if (mayfault)
syscall_put_data(sys_data, entry, user_ptr, size, user_sizes, uargs);
+ else if (syscall_nr == __NR_futex)
+ syscall_put_futex(sys_data, entry, user_ptr);
+
trace_event_buffer_commit(&fbuffer);
}
@@ -971,14 +1129,18 @@ static int reg_event_syscall_enter(struct trace_event_file *file,
{
struct syscall_metadata *sys_data = call->data;
struct trace_array *tr = file->tr;
+ bool enable_faults;
int ret = 0;
int num;
num = sys_data->syscall_nr;
if (WARN_ON_ONCE(num < 0 || num >= NR_syscalls))
return -ENOSYS;
+
+ enable_faults = sys_data->user_mask || num == __NR_futex;
+
guard(mutex)(&syscall_trace_lock);
- if (sys_data->user_mask) {
+ if (enable_faults) {
ret = syscall_fault_buffer_enable();
if (ret < 0)
return ret;
@@ -986,7 +1148,7 @@ static int reg_event_syscall_enter(struct trace_event_file *file,
if (!tr->sys_refcount_enter) {
ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
if (ret < 0) {
- if (sys_data->user_mask)
+ if (enable_faults)
syscall_fault_buffer_disable();
return ret;
}
@@ -1011,7 +1173,7 @@ static void unreg_event_syscall_enter(struct trace_event_file *file,
WRITE_ONCE(tr->enter_syscall_files[num], NULL);
if (!tr->sys_refcount_enter)
unregister_trace_sys_enter(ftrace_syscall_enter, tr);
- if (sys_data->user_mask)
+ if (sys_data->user_mask || num == __NR_futex)
syscall_fault_buffer_disable();
}
--
2.51.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 2/3] tracing: Update futex syscall trace event to show more commands
2026-03-10 20:09 [PATCH v2 0/3] tracing: Read user data from futex system call trace event Steven Rostedt
2026-03-10 20:09 ` [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data Steven Rostedt
@ 2026-03-10 20:09 ` Steven Rostedt
2026-03-12 5:34 ` Masami Hiramatsu
2026-03-10 20:09 ` [PATCH v2 3/3] tracing: Show TID and flags for PI futex system call trace event Steven Rostedt
2 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2026-03-10 20:09 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Brian Geffon, John Stultz,
Ian Rogers, Suleiman Souhlal
From: Steven Rostedt <rostedt@goodmis.org>
Make the futex syscall trace event a little more smart. Have it read the
futex_op instruction to determine what else it can save and print. For the
appropriate options, it will read the utime (timespec) parameter and show
its output as well as the uaddr2.
futex_requeue_p-1154 [004] ..... 144.568339: sys_futex(uaddr: 0x5652b178d834 (0x482), FUTEX_UNLOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
futex_requeue_p-1162 [002] ..... 144.568696: sys_futex(uaddr: 0x7f763b7fece0 (2), FUTEX_WAIT|FUTEX_PRIVATE_FLAG, val: 2)
futex_requeue_p-1151 [000] ..... 144.568700: sys_futex(uaddr: 0x7f763b7fece0 (0), FUTEX_WAKE|FUTEX_PRIVATE_FLAG, val: 1)
futex_requeue_p-1162 [002] ..... 144.568705: sys_futex(uaddr: 0x7f763b7fece0 (0), FUTEX_WAKE|FUTEX_PRIVATE_FLAG, val: 1)
futex_requeue_p-1151 [000] ..... 144.568715: sys_futex(uaddr: 0x7f764369e990 (0x483), FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, val: 1155)
futex_requeue_p-1155 [005] ..... 144.569420: sys_futex(uaddr: 0x5652b178d838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, val: 0, timespec: 0x7ffdacfba500 (143.890024054), uaddr2: 0x5652b178d834 (0), val3: 0)
futex_requeue_p-1155 [005] ..... 144.569454: sys_futex(uaddr: 0x5652b178d834 (0), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1: https://lore.kernel.org/all/20260303214942.587739736@kernel.org/
- Updated to have the print processing in kernel/futex/syscall.c
include/linux/futex.h | 35 ++++++++-
kernel/futex/syscalls.c | 48 ++++++-------
kernel/trace/trace_syscalls.c | 129 +++++++++++++++++++++++++++++-----
3 files changed, 164 insertions(+), 48 deletions(-)
diff --git a/include/linux/futex.h b/include/linux/futex.h
index 9fc47aa01a8b..976fa257ab5c 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -82,8 +82,35 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
u32 __user *uaddr2, u32 val2, u32 val3);
int futex_hash_prctl(unsigned long arg2, unsigned long arg3, unsigned long arg4);
+static __always_inline bool futex_cmd_has_timeout(u32 cmd)
+{
+ switch (cmd) {
+ case FUTEX_WAIT:
+ case FUTEX_LOCK_PI:
+ case FUTEX_LOCK_PI2:
+ case FUTEX_WAIT_BITSET:
+ case FUTEX_WAIT_REQUEUE_PI:
+ return true;
+ }
+ return false;
+}
+
+static __always_inline bool futex_cmd_has_addr2(u32 cmd)
+{
+ switch (cmd) {
+ case FUTEX_REQUEUE:
+ case FUTEX_CMP_REQUEUE:
+ case FUTEX_WAKE_OP:
+ case FUTEX_WAIT_REQUEUE_PI:
+ return true;
+ }
+ return false;
+}
+
#ifdef CONFIG_FTRACE_SYSCALLS
-void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u32 *ptr);
+void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
+ u32 *ptr1, u32 *ptr2, unsigned long *ts1,
+ unsigned long *ts2);
#endif
#ifdef CONFIG_FUTEX_PRIVATE_HASH
@@ -119,7 +146,11 @@ static inline int futex_hash_allocate_default(void)
static inline int futex_hash_free(struct mm_struct *mm) { return 0; }
static inline int futex_mm_init(struct mm_struct *mm) { return 0; }
static inline void futex_print_syscall(struct seq_buf *s, int nr_args,
- unsigned long *args, u32 *ptr) { }
+ unsigned long *args, u32 *ptr1,
+ u32 *ptr2, unsigned long *ts1,
+ unsigned long *ts2) { }
+static __always_inline bool futex_cmd_has_timeout(u32 cmd) { return false; }
+static __always_inline bool futex_cmd_has_addr2(u32 cmd) { return false; }
#endif
#endif
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index a1cd512aa502..a46706d6bc6c 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -158,31 +158,6 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
return -ENOSYS;
}
-static __always_inline bool futex_cmd_has_timeout(u32 cmd)
-{
- switch (cmd) {
- case FUTEX_WAIT:
- case FUTEX_LOCK_PI:
- case FUTEX_LOCK_PI2:
- case FUTEX_WAIT_BITSET:
- case FUTEX_WAIT_REQUEUE_PI:
- return true;
- }
- return false;
-}
-
-static __always_inline bool futex_cmd_has_addr2(u32 cmd)
-{
- switch (cmd) {
- case FUTEX_REQUEUE:
- case FUTEX_CMP_REQUEUE:
- case FUTEX_WAKE_OP:
- case FUTEX_WAIT_REQUEUE_PI:
- return true;
- }
- return false;
-}
-
static __always_inline int
futex_init_timeout(u32 cmd, u32 op, struct timespec64 *ts, ktime_t *t)
{
@@ -229,7 +204,9 @@ static const char * __futex_cmds[] =
"FUTEX_LOCK_PI2",
};
-void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u32 *ptr)
+void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
+ u32 *ptr1, u32 *ptr2, unsigned long *ts1,
+ unsigned long *ts2)
{
unsigned int op, cmd;
bool done = false;
@@ -242,8 +219,8 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u3
switch (i) {
case 0:
seq_buf_printf(s, "uaddr: 0x%lx", args[i]);
- if (ptr) {
- u32 val = *ptr;
+ if (ptr1) {
+ u32 val = *ptr1;
if (val < 10)
seq_buf_printf(s, " (%u)", val);
else
@@ -279,6 +256,15 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u3
continue;
seq_buf_printf(s, ", timespec: 0x%lx", args[i]);
+
+ if (!ts1 || !ts2)
+ continue;
+
+ if (!*ts1 && !*ts2) {
+ seq_buf_puts(s, " (0)");
+ continue;
+ }
+ seq_buf_printf(s, " (%lu.%09lu)", *ts1, *ts2);
continue;
case 4:
if (!futex_cmd_has_addr2(cmd)) {
@@ -286,6 +272,12 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u3
continue;
}
seq_buf_printf(s, ", uaddr2: 0x%lx", args[i]);
+
+ if (!ptr2)
+ continue;
+
+ seq_buf_printf(s, " (%x)", *ptr2);
+
continue;
case 5:
seq_buf_printf(s, ", val3: %lu", args[i]);
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 8cb3af569157..de1fa97547a3 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -239,21 +239,35 @@ sys_enter_openat_print(struct syscall_trace_enter *trace, struct syscall_metadat
return trace_handle_return(s);
}
+struct futex_data {
+ u32 val1;
+ u32 val2;
+ unsigned long ts1;
+ unsigned long ts2;
+};
+
static enum print_line_t
sys_enter_futex_print(struct syscall_trace_enter *trace, struct syscall_metadata *entry,
struct trace_seq *s, struct trace_event *event, int ent_size)
{
+ struct futex_data *data;
void *end = (void *)trace + ent_size;
- void *ptr;
+ unsigned long *ts1 = NULL, *ts2 = NULL;
+ u32 *ptr1 = NULL, *ptr2 = NULL;
/* Set ptr to the user space copied area */
- ptr = (void *)trace->args + sizeof(unsigned long) * entry->nb_args;
- if (ptr + 4 > end)
- ptr = NULL;
+ data = (void *)trace->args + sizeof(unsigned long) * entry->nb_args;
+ if ((void *)data + sizeof(*data) <= end) {
+ ptr1 = &data->val1;
+ ptr2 = &data->val2;
+ ts1 = &data->ts1;
+ ts2 = &data->ts2;
+ }
trace_seq_printf(s, "%s(", entry->name);
- futex_print_syscall(&s->seq, entry->nb_args, trace->args, ptr);
+ futex_print_syscall(&s->seq, entry->nb_args, trace->args, ptr1, ptr2,
+ ts1, ts2);
trace_seq_puts(s, ")\n");
@@ -472,9 +486,9 @@ sys_enter_futex_print_fmt(struct syscall_metadata *entry, char *buf, int len)
pos += snprintf(buf + pos, LEN_OR_ZERO,
"\"uaddr: 0x%%lx (0x%%lx) cmd=%%s%%s%%s");
pos += snprintf(buf + pos, LEN_OR_ZERO,
- " val: 0x%%x timeout/val2: 0x%%llx");
+ " val: 0x%%x timeout/val2: 0x%%llx (%%lu.%%lu)");
pos += snprintf(buf + pos, LEN_OR_ZERO,
- " uaddr2: 0x%%lx val3: 0x%%x\", ");
+ " uaddr2: 0x%%lx (0x%%lx) val3: 0x%%x\", ");
pos += snprintf(buf + pos, LEN_OR_ZERO,
" REC->uaddr,");
@@ -520,10 +534,12 @@ sys_enter_futex_print_fmt(struct syscall_metadata *entry, char *buf, int len)
FUTEX_CLOCK_REALTIME);
pos += snprintf(buf + pos, LEN_OR_ZERO,
- " REC->val, REC->utime,");
+ " REC->val, REC->utime, REC->__ts1, REC->__ts2,");
pos += snprintf(buf + pos, LEN_OR_ZERO,
- " REC->uaddr, REC->val3");
+ " REC->uaddr,");
+ pos += snprintf(buf + pos, LEN_OR_ZERO,
+ " REC->__value2, REC->val3");
return pos;
}
@@ -626,7 +642,39 @@ static int __init futex_fields(struct trace_event_call *call, int offset)
ret = trace_define_field(call, "u32", arg, offset, sizeof(int), 0,
FILTER_OTHER);
if (ret)
- kfree(arg);
+ goto free;
+ offset += sizeof(int);
+
+ arg = kstrdup("__value2", GFP_KERNEL);
+ if (WARN_ON_ONCE(!arg))
+ return -ENOMEM;
+ ret = trace_define_field(call, "u32", arg, offset, sizeof(int), 0,
+ FILTER_OTHER);
+ if (ret)
+ goto free;
+ offset += sizeof(int);
+
+ arg = kstrdup("__ts1", GFP_KERNEL);
+ if (WARN_ON_ONCE(!arg))
+ return -ENOMEM;
+ ret = trace_define_field(call, "unsigned long", arg, offset,
+ sizeof(unsigned long), 0, FILTER_OTHER);
+ if (ret)
+ goto free;
+ offset += sizeof(long);
+
+ arg = kstrdup("__ts2", GFP_KERNEL);
+ if (WARN_ON_ONCE(!arg))
+ return -ENOMEM;
+ ret = trace_define_field(call, "unsigned long", arg, offset,
+ sizeof(unsigned long), 0, FILTER_OTHER);
+ if (ret)
+ goto free;
+
+ return 0;
+
+free:
+ kfree(arg);
return ret;
}
@@ -799,11 +847,51 @@ static int syscall_copy_user_array(char *buf, const char __user *ptr,
return 0;
}
+struct tp_futex_data {
+ u32 cmd;
+ const u32 __user *val1;
+ const u32 __user *val2;
+ void __user *timeout;
+};
+
+static int syscall_copy_futex(char *buf, const char __user *ptr,
+ size_t size, void *data)
+{
+ struct tp_futex_data *tp_data = data;
+ struct futex_data *fdata = (void *)buf;
+ int cmd = tp_data->cmd & FUTEX_CMD_MASK;
+ int ret;
+
+ memset(fdata, 0, sizeof(*fdata));
+
+ if (tp_data->val1) {
+ ret = __copy_from_user(&fdata->val1, tp_data->val1, 4);
+ if (ret)
+ return -1;
+ }
+
+ if (tp_data->val2 && futex_cmd_has_addr2(cmd)) {
+ ret = __copy_from_user(&fdata->val2, tp_data->val2, 4);
+ if (ret)
+ return -1;
+ }
+
+ if (tp_data->timeout && futex_cmd_has_timeout(cmd)) {
+ /* Copies both ts1 and ts2 */
+ ret = __copy_from_user(&fdata->ts1, tp_data->timeout,
+ sizeof(long) * 2);
+ if (ret)
+ return -1;
+ }
+
+ return 0;
+}
+
static int
syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
{
struct syscall_user_buffer *sbuf;
- const char __user *ptr;
+ struct tp_futex_data tp_data;
char *buf;
/* buf_size of zero means user doesn't want user space read */
@@ -815,14 +903,18 @@ syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
if (!sbuf)
return -1;
- ptr = (char __user *)args[0];
+ tp_data.cmd = args[1];
+ tp_data.val1 = (u32 __user *)args[0];
+ tp_data.val2 = (u32 __user *)args[4];
+ tp_data.timeout = (u64 __user *)args[3];
- *buffer = trace_user_fault_read(&sbuf->buf, ptr, 4, NULL, NULL);
+ *buffer = trace_user_fault_read(&sbuf->buf, NULL, 0,
+ syscall_copy_futex, &tp_data);
if (!*buffer)
return -1;
- /* Add room for the value */
- *size += 4;
+ /* Add room for values */
+ *size += sizeof(struct futex_data);
buf = *buffer;
@@ -833,12 +925,13 @@ static void syscall_put_futex(struct syscall_metadata *sys_data,
struct syscall_trace_enter *entry,
char *buffer)
{
- u32 *ptr;
+ struct futex_data *fdata = (void *)buffer;
+ struct futex_data *data;
/* Place the futex key into the storage */
- ptr = (void *)entry->args + sizeof(unsigned long) * sys_data->nb_args;
+ data = (void *)entry->args + sizeof(unsigned long) * sys_data->nb_args;
- *ptr = *(u32 *)buffer;
+ *data = *fdata;
}
static char *sys_fault_user(unsigned int buf_size,
--
2.51.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v2 3/3] tracing: Show TID and flags for PI futex system call trace event
2026-03-10 20:09 [PATCH v2 0/3] tracing: Read user data from futex system call trace event Steven Rostedt
2026-03-10 20:09 ` [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data Steven Rostedt
2026-03-10 20:09 ` [PATCH v2 2/3] tracing: Update futex syscall trace event to show more commands Steven Rostedt
@ 2026-03-10 20:09 ` Steven Rostedt
2026-03-12 8:15 ` Masami Hiramatsu
2 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2026-03-10 20:09 UTC (permalink / raw)
To: linux-kernel, linux-trace-kernel
Cc: Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers, Andrew Morton,
Thomas Gleixner, Peter Zijlstra, Brian Geffon, John Stultz,
Ian Rogers, Suleiman Souhlal
From: Steven Rostedt <rostedt@goodmis.org>
For the futex system call trace event for FUTEX_LOCK_PI and
FUTEX_UNLOCK_PI commands, show the TID from the value (which is usually in
hex) as well as translate the flags (DIED and WAITERS).
pi_mutex_hammer-1098 [000] ..... 121.876928: sys_futex(uaddr: 0x560f40cc8180 (0x450) tid: 1104, FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0, timespec: 0x7f2f9d4b1e50 (0.000100000))
pi_mutex_hammer-1128 [000] ..... 121.877120: sys_futex(uaddr: 0x560f40cc8180 (0x8000042a) tid: 1066 (WAITERS), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0, timespec: 0x7f2f8e493e50 (0.000100000))
pi_mutex_hammer-1106 [000] ..... 121.877242: sys_futex(uaddr: 0x560f40cc8180 (0x80000452) tid: 1106 (WAITERS), FUTEX_UNLOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
This makes it easier to see the hand off of a mutex and who the owner was.
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1: https://lore.kernel.org/all/20260303174442.548b6524@gandalf.local.home/
- Updated to have the print processing in kernel/futex/syscall.c
kernel/futex/syscalls.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index a46706d6bc6c..24d912d9b20d 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -211,6 +211,9 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
unsigned int op, cmd;
bool done = false;
+ op = args[1];
+ cmd = op & FUTEX_CMD_MASK;
+
for (int i = 0; !done && i < nr_args; i++) {
if (seq_buf_has_overflowed(s))
@@ -225,11 +228,30 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
seq_buf_printf(s, " (%u)", val);
else
seq_buf_printf(s, " (0x%x)", val);
+
+ switch(cmd) {
+ case FUTEX_LOCK_PI:
+ case FUTEX_UNLOCK_PI:
+ seq_buf_printf(s, " tid: %d",
+ val & FUTEX_TID_MASK);
+
+ if (!(val & (FUTEX_OWNER_DIED|FUTEX_WAITERS)))
+ break;
+
+ seq_buf_puts(s, " (");
+ if (val & FUTEX_WAITERS)
+ seq_buf_puts(s, "WAITERS");
+ if (val & FUTEX_OWNER_DIED) {
+ if (op & FUTEX_WAITERS)
+ seq_buf_putc(s, '|');
+ seq_buf_puts(s, "DIED");
+ }
+ seq_buf_putc(s, ')');
+ break;
+ }
}
continue;
case 1:
- op = args[i];
- cmd = op & FUTEX_CMD_MASK;
if (cmd <= FUTEX_LOCK_PI2)
seq_buf_printf(s, ", %s", __futex_cmds[cmd]);
else
--
2.51.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data
2026-03-10 20:09 ` [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data Steven Rostedt
@ 2026-03-11 7:41 ` kernel test robot
2026-03-11 9:03 ` Masami Hiramatsu
2026-03-11 11:23 ` kernel test robot
2 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2026-03-11 7:41 UTC (permalink / raw)
To: Steven Rostedt, linux-kernel, linux-trace-kernel
Cc: oe-kbuild-all, Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers,
Andrew Morton, Linux Memory Management List, Thomas Gleixner,
Brian Geffon, John Stultz, Ian Rogers, Suleiman Souhlal
Hi Steven,
kernel test robot noticed the following build warnings:
[auto build test WARNING on tip/locking/core]
[also build test WARNING on trace/for-next akpm-mm/mm-everything linus/master v7.0-rc3 next-20260310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Steven-Rostedt/tracing-Have-futex-syscall-trace-event-show-specific-user-data/20260311-041422
base: tip/locking/core
patch link: https://lore.kernel.org/r/20260310201036.542627924%40kernel.org
patch subject: [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260311/202603110841.Pxedg2y4-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260311/202603110841.Pxedg2y4-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603110841.Pxedg2y4-lkp@intel.com/
All warnings (new ones prefixed by >>):
kernel/trace/trace_syscalls.c: In function 'syscall_get_futex':
>> kernel/trace/trace_syscalls.c:807:15: warning: variable 'buf' set but not used [-Wunused-but-set-variable]
807 | char *buf;
| ^~~
vim +/buf +807 kernel/trace/trace_syscalls.c
801
802 static int
803 syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
804 {
805 struct syscall_user_buffer *sbuf;
806 const char __user *ptr;
> 807 char *buf;
808
809 /* buf_size of zero means user doesn't want user space read */
810 if (!buf_size)
811 return -1;
812
813 /* If the syscall_buffer is NULL, tracing is being shutdown */
814 sbuf = READ_ONCE(syscall_buffer);
815 if (!sbuf)
816 return -1;
817
818 ptr = (char __user *)args[0];
819
820 *buffer = trace_user_fault_read(&sbuf->buf, ptr, 4, NULL, NULL);
821 if (!*buffer)
822 return -1;
823
824 /* Add room for the value */
825 *size += 4;
826
827 buf = *buffer;
828
829 return 0;
830 }
831
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data
2026-03-10 20:09 ` [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data Steven Rostedt
2026-03-11 7:41 ` kernel test robot
@ 2026-03-11 9:03 ` Masami Hiramatsu
2026-03-11 14:16 ` Steven Rostedt
2026-03-11 11:23 ` kernel test robot
2 siblings, 1 reply; 10+ messages in thread
From: Masami Hiramatsu @ 2026-03-11 9:03 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
Mathieu Desnoyers, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Brian Geffon, John Stultz, Ian Rogers, Suleiman Souhlal
On Tue, 10 Mar 2026 16:09:55 -0400
Steven Rostedt <rostedt@kernel.org> wrote:
> +#ifdef CONFIG_FTRACE_SYSCALLS
> +static const char * __futex_cmds[] =
> +{
> + "FUTEX_WAIT", "FUTEX_WAKE", "FUTEX_FD", "FUTEX_REQUEUE",
> + "FUTEX_CMP_REQUEUE", "FUTEX_WAKE_OP", "FUTEX_LOCK_PI",
> + "FUTEX_UNLOCK_PI", "FUTEX_TRYLOCK_PI", "FUTEX_WAIT_BITSET",
> + "FUTEX_WAKE_BITSET", "FUTEX_WAIT_REQUEUE_PI", "FUTEX_CMP_REQUEUE_PI",
> + "FUTEX_LOCK_PI2",
> +};
> +
[...]
> @@ -437,6 +464,69 @@ sys_enter_openat_print_fmt(struct syscall_metadata *entry, char *buf, int len)
> return pos;
> }
>
> +static int __init
> +sys_enter_futex_print_fmt(struct syscall_metadata *entry, char *buf, int len)
> +{
> + int pos = 0;
> +
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "\"uaddr: 0x%%lx (0x%%lx) cmd=%%s%%s%%s");
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " val: 0x%%x timeout/val2: 0x%%llx");
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " uaddr2: 0x%%lx val3: 0x%%x\", ");
> +
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " REC->uaddr,");
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " REC->__value,");
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " __print_symbolic(REC->op & 0x%x, ", FUTEX_CMD_MASK);
> +
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_WAIT\"}, ", FUTEX_WAIT);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_WAKE\"}, ", FUTEX_WAKE);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_FD\"}, ", FUTEX_FD);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_REQUEUE\"}, ", FUTEX_REQUEUE);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_CMP_REQUEUE\"}, ", FUTEX_CMP_REQUEUE);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_WAKE_OP\"}, ", FUTEX_WAKE_OP);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_LOCK_PI\"}, ", FUTEX_LOCK_PI);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_UNLOCK_PI\"}, ", FUTEX_UNLOCK_PI);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_TRYLOCK_PI\"}, ", FUTEX_TRYLOCK_PI);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_WAIT_BITSET\"}, ", FUTEX_WAIT_BITSET);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_WAKE_BITSET\"}, ", FUTEX_WAKE_BITSET);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_WAIT_REQUEUE_PI\"}, ", FUTEX_WAIT_REQUEUE_PI);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_CMP_REQUEUE_PI\"}, ", FUTEX_CMP_REQUEUE_PI);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + "{%d, \"FUTEX_LOCK_PI2\"}),", FUTEX_LOCK_PI2);
Hmm can we share __futex_cmds[] with kernel/futex/syscalls.c?
Then these could be
for (i = 0; i <= FUTEX_LOCK_PI2; i++)
pos += snprintf(buf + pos, LEN_OR_ZERO,
"{%d, \"%s\"}%s", i, __futex_cmds[i],
i == FUTEX_LOCK_PI2 ? ")," : ", ");
> +
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " (REC->op & %d) ? \"|FUTEX_PRIVATE_FLAG\" : \"\",",
> + FUTEX_PRIVATE_FLAG);
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " (REC->op & %d) ? \"|FUTEX_CLOCK_REALTIME\" : \"\",",
> + FUTEX_CLOCK_REALTIME);
> +
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " REC->val, REC->utime,");
> +
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " REC->uaddr, REC->val3");
> + return pos;
> +}
> +
> static int __init
> __set_enter_print_fmt(struct syscall_metadata *entry, char *buf, int len)
> {
[...]
> @@ -689,6 +799,48 @@ static int syscall_copy_user_array(char *buf, const char __user *ptr,
> return 0;
> }
>
> +static int
> +syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
> +{
> + struct syscall_user_buffer *sbuf;
> + const char __user *ptr;
> + char *buf;
> +
> + /* buf_size of zero means user doesn't want user space read */
> + if (!buf_size)
> + return -1;
> +
> + /* If the syscall_buffer is NULL, tracing is being shutdown */
> + sbuf = READ_ONCE(syscall_buffer);
> + if (!sbuf)
> + return -1;
> +
> + ptr = (char __user *)args[0];
> +
> + *buffer = trace_user_fault_read(&sbuf->buf, ptr, 4, NULL, NULL);
> + if (!*buffer)
> + return -1;
> +
> + /* Add room for the value */
> + *size += 4;
> +
> + buf = *buffer;
As kernel test bot says, this does nothing. (*buffer is already assigned)
> +
> + return 0;
> +}
Thanks,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data
2026-03-10 20:09 ` [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data Steven Rostedt
2026-03-11 7:41 ` kernel test robot
2026-03-11 9:03 ` Masami Hiramatsu
@ 2026-03-11 11:23 ` kernel test robot
2 siblings, 0 replies; 10+ messages in thread
From: kernel test robot @ 2026-03-11 11:23 UTC (permalink / raw)
To: Steven Rostedt, linux-kernel, linux-trace-kernel
Cc: oe-kbuild-all, Masami Hiramatsu, Mark Rutland, Mathieu Desnoyers,
Andrew Morton, Linux Memory Management List, Thomas Gleixner,
Brian Geffon, John Stultz, Ian Rogers, Suleiman Souhlal
Hi Steven,
kernel test robot noticed the following build warnings:
[auto build test WARNING on tip/locking/core]
[also build test WARNING on trace/for-next akpm-mm/mm-everything linus/master v7.0-rc3 next-20260310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Steven-Rostedt/tracing-Have-futex-syscall-trace-event-show-specific-user-data/20260311-041422
base: tip/locking/core
patch link: https://lore.kernel.org/r/20260310201036.542627924%40kernel.org
patch subject: [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data
config: arc-randconfig-001-20260311 (https://download.01.org/0day-ci/archive/20260311/202603111911.zAWpiGlz-lkp@intel.com/config)
compiler: arc-linux-gcc (GCC) 14.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260311/202603111911.zAWpiGlz-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202603111911.zAWpiGlz-lkp@intel.com/
All warnings (new ones prefixed by >>):
kernel/trace/trace_syscalls.c: In function 'syscall_get_futex':
>> kernel/trace/trace_syscalls.c:807:15: warning: variable 'buf' set but not used [-Wunused-but-set-variable]
807 | char *buf;
| ^~~
vim +/buf +807 kernel/trace/trace_syscalls.c
801
802 static int
803 syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
804 {
805 struct syscall_user_buffer *sbuf;
806 const char __user *ptr;
> 807 char *buf;
808
809 /* buf_size of zero means user doesn't want user space read */
810 if (!buf_size)
811 return -1;
812
813 /* If the syscall_buffer is NULL, tracing is being shutdown */
814 sbuf = READ_ONCE(syscall_buffer);
815 if (!sbuf)
816 return -1;
817
818 ptr = (char __user *)args[0];
819
820 *buffer = trace_user_fault_read(&sbuf->buf, ptr, 4, NULL, NULL);
821 if (!*buffer)
822 return -1;
823
824 /* Add room for the value */
825 *size += 4;
826
827 buf = *buffer;
828
829 return 0;
830 }
831
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data
2026-03-11 9:03 ` Masami Hiramatsu
@ 2026-03-11 14:16 ` Steven Rostedt
0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2026-03-11 14:16 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: linux-kernel, linux-trace-kernel, Mark Rutland, Mathieu Desnoyers,
Andrew Morton, Thomas Gleixner, Peter Zijlstra, Brian Geffon,
John Stultz, Ian Rogers, Suleiman Souhlal
On Wed, 11 Mar 2026 18:03:25 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + " __print_symbolic(REC->op & 0x%x, ", FUTEX_CMD_MASK);
> > +
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_WAIT\"}, ", FUTEX_WAIT);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_WAKE\"}, ", FUTEX_WAKE);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_FD\"}, ", FUTEX_FD);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_REQUEUE\"}, ", FUTEX_REQUEUE);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_CMP_REQUEUE\"}, ", FUTEX_CMP_REQUEUE);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_WAKE_OP\"}, ", FUTEX_WAKE_OP);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_LOCK_PI\"}, ", FUTEX_LOCK_PI);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_UNLOCK_PI\"}, ", FUTEX_UNLOCK_PI);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_TRYLOCK_PI\"}, ", FUTEX_TRYLOCK_PI);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_WAIT_BITSET\"}, ", FUTEX_WAIT_BITSET);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_WAKE_BITSET\"}, ", FUTEX_WAKE_BITSET);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_WAIT_REQUEUE_PI\"}, ", FUTEX_WAIT_REQUEUE_PI);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_CMP_REQUEUE_PI\"}, ", FUTEX_CMP_REQUEUE_PI);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + "{%d, \"FUTEX_LOCK_PI2\"}),", FUTEX_LOCK_PI2);
>
> Hmm can we share __futex_cmds[] with kernel/futex/syscalls.c?
> Then these could be
>
> for (i = 0; i <= FUTEX_LOCK_PI2; i++)
> pos += snprintf(buf + pos, LEN_OR_ZERO,
> "{%d, \"%s\"}%s", i, __futex_cmds[i],
> i == FUTEX_LOCK_PI2 ? ")," : ", ");
Hmm, I created the above *before* creating the __futex_cmds. But yes, that
makes sense. Thanks for the suggestion.
>
>
> > +
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + " (REC->op & %d) ? \"|FUTEX_PRIVATE_FLAG\" : \"\",",
> > + FUTEX_PRIVATE_FLAG);
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + " (REC->op & %d) ? \"|FUTEX_CLOCK_REALTIME\" : \"\",",
> > + FUTEX_CLOCK_REALTIME);
> > +
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + " REC->val, REC->utime,");
> > +
> > + pos += snprintf(buf + pos, LEN_OR_ZERO,
> > + " REC->uaddr, REC->val3");
> > + return pos;
> > +}
> > +
> > static int __init
> > __set_enter_print_fmt(struct syscall_metadata *entry, char *buf, int len)
> > {
> [...]
> > @@ -689,6 +799,48 @@ static int syscall_copy_user_array(char *buf, const char __user *ptr,
> > return 0;
> > }
> >
> > +static int
> > +syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
> > +{
> > + struct syscall_user_buffer *sbuf;
> > + const char __user *ptr;
> > + char *buf;
> > +
> > + /* buf_size of zero means user doesn't want user space read */
> > + if (!buf_size)
> > + return -1;
> > +
> > + /* If the syscall_buffer is NULL, tracing is being shutdown */
> > + sbuf = READ_ONCE(syscall_buffer);
> > + if (!sbuf)
> > + return -1;
> > +
> > + ptr = (char __user *)args[0];
> > +
> > + *buffer = trace_user_fault_read(&sbuf->buf, ptr, 4, NULL, NULL);
> > + if (!*buffer)
> > + return -1;
> > +
> > + /* Add room for the value */
> > + *size += 4;
> > +
> > + buf = *buffer;
>
> As kernel test bot says, this does nothing. (*buffer is already assigned)
>
Oops, I think this was a cut-and-paste error :-p
-- Steve
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 2/3] tracing: Update futex syscall trace event to show more commands
2026-03-10 20:09 ` [PATCH v2 2/3] tracing: Update futex syscall trace event to show more commands Steven Rostedt
@ 2026-03-12 5:34 ` Masami Hiramatsu
0 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu @ 2026-03-12 5:34 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
Mathieu Desnoyers, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Brian Geffon, John Stultz, Ian Rogers, Suleiman Souhlal
On Tue, 10 Mar 2026 16:09:56 -0400
Steven Rostedt <rostedt@kernel.org> wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Make the futex syscall trace event a little more smart. Have it read the
> futex_op instruction to determine what else it can save and print. For the
> appropriate options, it will read the utime (timespec) parameter and show
> its output as well as the uaddr2.
>
> futex_requeue_p-1154 [004] ..... 144.568339: sys_futex(uaddr: 0x5652b178d834 (0x482), FUTEX_UNLOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
> futex_requeue_p-1162 [002] ..... 144.568696: sys_futex(uaddr: 0x7f763b7fece0 (2), FUTEX_WAIT|FUTEX_PRIVATE_FLAG, val: 2)
> futex_requeue_p-1151 [000] ..... 144.568700: sys_futex(uaddr: 0x7f763b7fece0 (0), FUTEX_WAKE|FUTEX_PRIVATE_FLAG, val: 1)
> futex_requeue_p-1162 [002] ..... 144.568705: sys_futex(uaddr: 0x7f763b7fece0 (0), FUTEX_WAKE|FUTEX_PRIVATE_FLAG, val: 1)
> futex_requeue_p-1151 [000] ..... 144.568715: sys_futex(uaddr: 0x7f764369e990 (0x483), FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, val: 1155)
> futex_requeue_p-1155 [005] ..... 144.569420: sys_futex(uaddr: 0x5652b178d838 (0), FUTEX_WAIT_REQUEUE_PI|FUTEX_PRIVATE_FLAG, val: 0, timespec: 0x7ffdacfba500 (143.890024054), uaddr2: 0x5652b178d834 (0), val3: 0)
> futex_requeue_p-1155 [005] ..... 144.569454: sys_futex(uaddr: 0x5652b178d834 (0), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This looks good to me.
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Thanks,
> ---
> Changes since v1: https://lore.kernel.org/all/20260303214942.587739736@kernel.org/
>
> - Updated to have the print processing in kernel/futex/syscall.c
>
> include/linux/futex.h | 35 ++++++++-
> kernel/futex/syscalls.c | 48 ++++++-------
> kernel/trace/trace_syscalls.c | 129 +++++++++++++++++++++++++++++-----
> 3 files changed, 164 insertions(+), 48 deletions(-)
>
> diff --git a/include/linux/futex.h b/include/linux/futex.h
> index 9fc47aa01a8b..976fa257ab5c 100644
> --- a/include/linux/futex.h
> +++ b/include/linux/futex.h
> @@ -82,8 +82,35 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
> u32 __user *uaddr2, u32 val2, u32 val3);
> int futex_hash_prctl(unsigned long arg2, unsigned long arg3, unsigned long arg4);
>
> +static __always_inline bool futex_cmd_has_timeout(u32 cmd)
> +{
> + switch (cmd) {
> + case FUTEX_WAIT:
> + case FUTEX_LOCK_PI:
> + case FUTEX_LOCK_PI2:
> + case FUTEX_WAIT_BITSET:
> + case FUTEX_WAIT_REQUEUE_PI:
> + return true;
> + }
> + return false;
> +}
> +
> +static __always_inline bool futex_cmd_has_addr2(u32 cmd)
> +{
> + switch (cmd) {
> + case FUTEX_REQUEUE:
> + case FUTEX_CMP_REQUEUE:
> + case FUTEX_WAKE_OP:
> + case FUTEX_WAIT_REQUEUE_PI:
> + return true;
> + }
> + return false;
> +}
> +
> #ifdef CONFIG_FTRACE_SYSCALLS
> -void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u32 *ptr);
> +void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
> + u32 *ptr1, u32 *ptr2, unsigned long *ts1,
> + unsigned long *ts2);
> #endif
>
> #ifdef CONFIG_FUTEX_PRIVATE_HASH
> @@ -119,7 +146,11 @@ static inline int futex_hash_allocate_default(void)
> static inline int futex_hash_free(struct mm_struct *mm) { return 0; }
> static inline int futex_mm_init(struct mm_struct *mm) { return 0; }
> static inline void futex_print_syscall(struct seq_buf *s, int nr_args,
> - unsigned long *args, u32 *ptr) { }
> + unsigned long *args, u32 *ptr1,
> + u32 *ptr2, unsigned long *ts1,
> + unsigned long *ts2) { }
> +static __always_inline bool futex_cmd_has_timeout(u32 cmd) { return false; }
> +static __always_inline bool futex_cmd_has_addr2(u32 cmd) { return false; }
> #endif
>
> #endif
> diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
> index a1cd512aa502..a46706d6bc6c 100644
> --- a/kernel/futex/syscalls.c
> +++ b/kernel/futex/syscalls.c
> @@ -158,31 +158,6 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
> return -ENOSYS;
> }
>
> -static __always_inline bool futex_cmd_has_timeout(u32 cmd)
> -{
> - switch (cmd) {
> - case FUTEX_WAIT:
> - case FUTEX_LOCK_PI:
> - case FUTEX_LOCK_PI2:
> - case FUTEX_WAIT_BITSET:
> - case FUTEX_WAIT_REQUEUE_PI:
> - return true;
> - }
> - return false;
> -}
> -
> -static __always_inline bool futex_cmd_has_addr2(u32 cmd)
> -{
> - switch (cmd) {
> - case FUTEX_REQUEUE:
> - case FUTEX_CMP_REQUEUE:
> - case FUTEX_WAKE_OP:
> - case FUTEX_WAIT_REQUEUE_PI:
> - return true;
> - }
> - return false;
> -}
> -
> static __always_inline int
> futex_init_timeout(u32 cmd, u32 op, struct timespec64 *ts, ktime_t *t)
> {
> @@ -229,7 +204,9 @@ static const char * __futex_cmds[] =
> "FUTEX_LOCK_PI2",
> };
>
> -void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u32 *ptr)
> +void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
> + u32 *ptr1, u32 *ptr2, unsigned long *ts1,
> + unsigned long *ts2)
> {
> unsigned int op, cmd;
> bool done = false;
> @@ -242,8 +219,8 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u3
> switch (i) {
> case 0:
> seq_buf_printf(s, "uaddr: 0x%lx", args[i]);
> - if (ptr) {
> - u32 val = *ptr;
> + if (ptr1) {
> + u32 val = *ptr1;
> if (val < 10)
> seq_buf_printf(s, " (%u)", val);
> else
> @@ -279,6 +256,15 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u3
> continue;
>
> seq_buf_printf(s, ", timespec: 0x%lx", args[i]);
> +
> + if (!ts1 || !ts2)
> + continue;
> +
> + if (!*ts1 && !*ts2) {
> + seq_buf_puts(s, " (0)");
> + continue;
> + }
> + seq_buf_printf(s, " (%lu.%09lu)", *ts1, *ts2);
> continue;
> case 4:
> if (!futex_cmd_has_addr2(cmd)) {
> @@ -286,6 +272,12 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args, u3
> continue;
> }
> seq_buf_printf(s, ", uaddr2: 0x%lx", args[i]);
> +
> + if (!ptr2)
> + continue;
> +
> + seq_buf_printf(s, " (%x)", *ptr2);
> +
> continue;
> case 5:
> seq_buf_printf(s, ", val3: %lu", args[i]);
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 8cb3af569157..de1fa97547a3 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -239,21 +239,35 @@ sys_enter_openat_print(struct syscall_trace_enter *trace, struct syscall_metadat
> return trace_handle_return(s);
> }
>
> +struct futex_data {
> + u32 val1;
> + u32 val2;
> + unsigned long ts1;
> + unsigned long ts2;
> +};
> +
> static enum print_line_t
> sys_enter_futex_print(struct syscall_trace_enter *trace, struct syscall_metadata *entry,
> struct trace_seq *s, struct trace_event *event, int ent_size)
> {
> + struct futex_data *data;
> void *end = (void *)trace + ent_size;
> - void *ptr;
> + unsigned long *ts1 = NULL, *ts2 = NULL;
> + u32 *ptr1 = NULL, *ptr2 = NULL;
>
> /* Set ptr to the user space copied area */
> - ptr = (void *)trace->args + sizeof(unsigned long) * entry->nb_args;
> - if (ptr + 4 > end)
> - ptr = NULL;
> + data = (void *)trace->args + sizeof(unsigned long) * entry->nb_args;
> + if ((void *)data + sizeof(*data) <= end) {
> + ptr1 = &data->val1;
> + ptr2 = &data->val2;
> + ts1 = &data->ts1;
> + ts2 = &data->ts2;
> + }
>
> trace_seq_printf(s, "%s(", entry->name);
>
> - futex_print_syscall(&s->seq, entry->nb_args, trace->args, ptr);
> + futex_print_syscall(&s->seq, entry->nb_args, trace->args, ptr1, ptr2,
> + ts1, ts2);
>
> trace_seq_puts(s, ")\n");
>
> @@ -472,9 +486,9 @@ sys_enter_futex_print_fmt(struct syscall_metadata *entry, char *buf, int len)
> pos += snprintf(buf + pos, LEN_OR_ZERO,
> "\"uaddr: 0x%%lx (0x%%lx) cmd=%%s%%s%%s");
> pos += snprintf(buf + pos, LEN_OR_ZERO,
> - " val: 0x%%x timeout/val2: 0x%%llx");
> + " val: 0x%%x timeout/val2: 0x%%llx (%%lu.%%lu)");
> pos += snprintf(buf + pos, LEN_OR_ZERO,
> - " uaddr2: 0x%%lx val3: 0x%%x\", ");
> + " uaddr2: 0x%%lx (0x%%lx) val3: 0x%%x\", ");
>
> pos += snprintf(buf + pos, LEN_OR_ZERO,
> " REC->uaddr,");
> @@ -520,10 +534,12 @@ sys_enter_futex_print_fmt(struct syscall_metadata *entry, char *buf, int len)
> FUTEX_CLOCK_REALTIME);
>
> pos += snprintf(buf + pos, LEN_OR_ZERO,
> - " REC->val, REC->utime,");
> + " REC->val, REC->utime, REC->__ts1, REC->__ts2,");
>
> pos += snprintf(buf + pos, LEN_OR_ZERO,
> - " REC->uaddr, REC->val3");
> + " REC->uaddr,");
> + pos += snprintf(buf + pos, LEN_OR_ZERO,
> + " REC->__value2, REC->val3");
> return pos;
> }
>
> @@ -626,7 +642,39 @@ static int __init futex_fields(struct trace_event_call *call, int offset)
> ret = trace_define_field(call, "u32", arg, offset, sizeof(int), 0,
> FILTER_OTHER);
> if (ret)
> - kfree(arg);
> + goto free;
> + offset += sizeof(int);
> +
> + arg = kstrdup("__value2", GFP_KERNEL);
> + if (WARN_ON_ONCE(!arg))
> + return -ENOMEM;
> + ret = trace_define_field(call, "u32", arg, offset, sizeof(int), 0,
> + FILTER_OTHER);
> + if (ret)
> + goto free;
> + offset += sizeof(int);
> +
> + arg = kstrdup("__ts1", GFP_KERNEL);
> + if (WARN_ON_ONCE(!arg))
> + return -ENOMEM;
> + ret = trace_define_field(call, "unsigned long", arg, offset,
> + sizeof(unsigned long), 0, FILTER_OTHER);
> + if (ret)
> + goto free;
> + offset += sizeof(long);
> +
> + arg = kstrdup("__ts2", GFP_KERNEL);
> + if (WARN_ON_ONCE(!arg))
> + return -ENOMEM;
> + ret = trace_define_field(call, "unsigned long", arg, offset,
> + sizeof(unsigned long), 0, FILTER_OTHER);
> + if (ret)
> + goto free;
> +
> + return 0;
> +
> +free:
> + kfree(arg);
> return ret;
> }
>
> @@ -799,11 +847,51 @@ static int syscall_copy_user_array(char *buf, const char __user *ptr,
> return 0;
> }
>
> +struct tp_futex_data {
> + u32 cmd;
> + const u32 __user *val1;
> + const u32 __user *val2;
> + void __user *timeout;
> +};
> +
> +static int syscall_copy_futex(char *buf, const char __user *ptr,
> + size_t size, void *data)
> +{
> + struct tp_futex_data *tp_data = data;
> + struct futex_data *fdata = (void *)buf;
> + int cmd = tp_data->cmd & FUTEX_CMD_MASK;
> + int ret;
> +
> + memset(fdata, 0, sizeof(*fdata));
> +
> + if (tp_data->val1) {
> + ret = __copy_from_user(&fdata->val1, tp_data->val1, 4);
> + if (ret)
> + return -1;
> + }
> +
> + if (tp_data->val2 && futex_cmd_has_addr2(cmd)) {
> + ret = __copy_from_user(&fdata->val2, tp_data->val2, 4);
> + if (ret)
> + return -1;
> + }
> +
> + if (tp_data->timeout && futex_cmd_has_timeout(cmd)) {
> + /* Copies both ts1 and ts2 */
> + ret = __copy_from_user(&fdata->ts1, tp_data->timeout,
> + sizeof(long) * 2);
> + if (ret)
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> static int
> syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
> {
> struct syscall_user_buffer *sbuf;
> - const char __user *ptr;
> + struct tp_futex_data tp_data;
> char *buf;
>
> /* buf_size of zero means user doesn't want user space read */
> @@ -815,14 +903,18 @@ syscall_get_futex(unsigned long *args, char **buffer, int *size, int buf_size)
> if (!sbuf)
> return -1;
>
> - ptr = (char __user *)args[0];
> + tp_data.cmd = args[1];
> + tp_data.val1 = (u32 __user *)args[0];
> + tp_data.val2 = (u32 __user *)args[4];
> + tp_data.timeout = (u64 __user *)args[3];
>
> - *buffer = trace_user_fault_read(&sbuf->buf, ptr, 4, NULL, NULL);
> + *buffer = trace_user_fault_read(&sbuf->buf, NULL, 0,
> + syscall_copy_futex, &tp_data);
> if (!*buffer)
> return -1;
>
> - /* Add room for the value */
> - *size += 4;
> + /* Add room for values */
> + *size += sizeof(struct futex_data);
>
> buf = *buffer;
>
> @@ -833,12 +925,13 @@ static void syscall_put_futex(struct syscall_metadata *sys_data,
> struct syscall_trace_enter *entry,
> char *buffer)
> {
> - u32 *ptr;
> + struct futex_data *fdata = (void *)buffer;
> + struct futex_data *data;
>
> /* Place the futex key into the storage */
> - ptr = (void *)entry->args + sizeof(unsigned long) * sys_data->nb_args;
> + data = (void *)entry->args + sizeof(unsigned long) * sys_data->nb_args;
>
> - *ptr = *(u32 *)buffer;
> + *data = *fdata;
> }
>
> static char *sys_fault_user(unsigned int buf_size,
> --
> 2.51.0
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2 3/3] tracing: Show TID and flags for PI futex system call trace event
2026-03-10 20:09 ` [PATCH v2 3/3] tracing: Show TID and flags for PI futex system call trace event Steven Rostedt
@ 2026-03-12 8:15 ` Masami Hiramatsu
0 siblings, 0 replies; 10+ messages in thread
From: Masami Hiramatsu @ 2026-03-12 8:15 UTC (permalink / raw)
To: Steven Rostedt
Cc: linux-kernel, linux-trace-kernel, Masami Hiramatsu, Mark Rutland,
Mathieu Desnoyers, Andrew Morton, Thomas Gleixner, Peter Zijlstra,
Brian Geffon, John Stultz, Ian Rogers, Suleiman Souhlal
On Tue, 10 Mar 2026 16:09:57 -0400
Steven Rostedt <rostedt@kernel.org> wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> For the futex system call trace event for FUTEX_LOCK_PI and
> FUTEX_UNLOCK_PI commands, show the TID from the value (which is usually in
> hex) as well as translate the flags (DIED and WAITERS).
>
> pi_mutex_hammer-1098 [000] ..... 121.876928: sys_futex(uaddr: 0x560f40cc8180 (0x450) tid: 1104, FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0, timespec: 0x7f2f9d4b1e50 (0.000100000))
> pi_mutex_hammer-1128 [000] ..... 121.877120: sys_futex(uaddr: 0x560f40cc8180 (0x8000042a) tid: 1066 (WAITERS), FUTEX_LOCK_PI|FUTEX_PRIVATE_FLAG, val: 0, timespec: 0x7f2f8e493e50 (0.000100000))
> pi_mutex_hammer-1106 [000] ..... 121.877242: sys_futex(uaddr: 0x560f40cc8180 (0x80000452) tid: 1106 (WAITERS), FUTEX_UNLOCK_PI|FUTEX_PRIVATE_FLAG, val: 0)
>
> This makes it easier to see the hand off of a mutex and who the owner was.
>
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Looks good to me.
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Thanks!
> ---
> Changes since v1: https://lore.kernel.org/all/20260303174442.548b6524@gandalf.local.home/
>
> - Updated to have the print processing in kernel/futex/syscall.c
>
> kernel/futex/syscalls.c | 26 ++++++++++++++++++++++++--
> 1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
> index a46706d6bc6c..24d912d9b20d 100644
> --- a/kernel/futex/syscalls.c
> +++ b/kernel/futex/syscalls.c
> @@ -211,6 +211,9 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
> unsigned int op, cmd;
> bool done = false;
>
> + op = args[1];
> + cmd = op & FUTEX_CMD_MASK;
> +
> for (int i = 0; !done && i < nr_args; i++) {
>
> if (seq_buf_has_overflowed(s))
> @@ -225,11 +228,30 @@ void futex_print_syscall(struct seq_buf *s, int nr_args, unsigned long *args,
> seq_buf_printf(s, " (%u)", val);
> else
> seq_buf_printf(s, " (0x%x)", val);
> +
> + switch(cmd) {
> + case FUTEX_LOCK_PI:
> + case FUTEX_UNLOCK_PI:
> + seq_buf_printf(s, " tid: %d",
> + val & FUTEX_TID_MASK);
> +
> + if (!(val & (FUTEX_OWNER_DIED|FUTEX_WAITERS)))
> + break;
> +
> + seq_buf_puts(s, " (");
> + if (val & FUTEX_WAITERS)
> + seq_buf_puts(s, "WAITERS");
> + if (val & FUTEX_OWNER_DIED) {
> + if (op & FUTEX_WAITERS)
> + seq_buf_putc(s, '|');
> + seq_buf_puts(s, "DIED");
> + }
> + seq_buf_putc(s, ')');
> + break;
> + }
> }
> continue;
> case 1:
> - op = args[i];
> - cmd = op & FUTEX_CMD_MASK;
> if (cmd <= FUTEX_LOCK_PI2)
> seq_buf_printf(s, ", %s", __futex_cmds[cmd]);
> else
> --
> 2.51.0
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-03-12 8:15 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-10 20:09 [PATCH v2 0/3] tracing: Read user data from futex system call trace event Steven Rostedt
2026-03-10 20:09 ` [PATCH v2 1/3] tracing: Have futex syscall trace event show specific user data Steven Rostedt
2026-03-11 7:41 ` kernel test robot
2026-03-11 9:03 ` Masami Hiramatsu
2026-03-11 14:16 ` Steven Rostedt
2026-03-11 11:23 ` kernel test robot
2026-03-10 20:09 ` [PATCH v2 2/3] tracing: Update futex syscall trace event to show more commands Steven Rostedt
2026-03-12 5:34 ` Masami Hiramatsu
2026-03-10 20:09 ` [PATCH v2 3/3] tracing: Show TID and flags for PI futex system call trace event Steven Rostedt
2026-03-12 8:15 ` Masami Hiramatsu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox