Linux Trace Kernel
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
To: Martin Kaiser <martin@kaiser.cx>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	linux-trace-kernel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] tracing: eprobe: read the complete FILTER_PTR_STRING pointer
Date: Thu, 18 Jun 2026 10:52:27 +0900
Message-ID: <20260618105227.c58c85e9cb19bce673d9a79b@kernel.org>
In-Reply-To: <ajJbkeK0zXb8MtcS@akranes.kaiser.cx>

On Wed, 17 Jun 2026 10:32:17 +0200
Martin Kaiser <martin@kaiser.cx> wrote:

> Hiramatsu-san,
> 
> thank you for reviewing my patch.
> 
> Thus wrote Masami Hiramatsu (mhiramat@kernel.org):
> 
> > Ah, this is a bit complicated. It seems to work with sched_switch event
> > as commit f04dec93466a ("tracing/eprobes: Fix reading of string fields"):
> 
> > echo 'e:sw sched/sched_switch comm=$next_comm:string' > dynamic_events
> 
> > #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> > #              | |         |   |||||     |         |
> >               sh-162     [002] d..3.    54.027213: sw: (sched.sched_switch) comm="swapper/2"
> >           <idle>-0       [007] d..3.    54.034573: sw: (sched.sched_switch) comm="rcu_preempt"
> >      rcu_preempt-15      [007] d..3.    54.034589: sw: (sched.sched_switch) comm="swapper/7"
> 
> > Maybe comm is stored as a fixed string information in the event record?
> 
> Yes, this example does not execute my change.
> 
> > /sys/kernel/tracing # cat events/sched/sched_switch/format 
> > name: sched_switch
> > ID: 254
> > format:
> > 	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
> > 	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
> > 	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
> > 	field:int common_pid;	offset:4;	size:4;	signed:1;
> 
> > 	field:char prev_comm[16];	offset:8;	size:16;	signed:0;
> > 	field:pid_t prev_pid;	offset:24;	size:4;	signed:1;
> > 	field:int prev_prio;	offset:28;	size:4;	signed:1;
> > 	field:long prev_state;	offset:32;	size:8;	signed:1;
> > 	field:char next_comm[16];	offset:40;	size:16;	signed:0;
> > 	field:pid_t next_pid;	offset:56;	size:4;	signed:1;
> > 	field:int next_prio;	offset:60;	size:4;	signed:1;
> 
> > But the filename is a pointer.
> 
> > /sys/kernel/tracing # cat events/syscalls/sys_enter_openat/format 
> > name: sys_enter_openat
> > ID: 705
> > format:
> > 	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
> > 	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
> > 	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
> > 	field:int common_pid;	offset:4;	size:4;	signed:1;
> 
> > 	field:int __syscall_nr;	offset:8;	size:4;	signed:1;
> > 	field:int dfd;	offset:16;	size:8;	signed:0;
> > 	field:const char * filename;	offset:24;	size:8;	signed:0;
> > 	field:int flags;	offset:32;	size:8;	signed:0;
> > 	field:umode_t mode;	offset:40;	size:8;	signed:0;
> > 	field:__data_loc char[] __filename_val;	offset:48;	size:4;	signed:0;
> 
> > In this case, the filename field should use __data_loc directly instead of
> > pointing data on the ring buffer.
> 
> > Can you try 
> 
> > echo 'e syscalls.sys_enter_openat $__filename_val:string' > \
> >  		/sys/kernel/tracing/dynamic_events
> 
> > Instead?
> 
> This field is working as expected.
> 
> I still believe that the handling of FILTER_PTR_STRING is not correct. The
> pointer is stored in the ringbuffer as unsigned long and read as a char. This
> gives us a truncated pointer that cannot be dereferenced.

Ah, OK. I understand the problem.

 - The ring buffer and its records should be self-contained.
 - In most cases, events use __data_loc/__rel_loc or a fixed array to store
   strings.
 - Only syscall events expose a char *, which is not recommended but is
   important for debugging user space. (It is not meant to be dereferenced.)

The example usage of FILTER_PTR_STRING actually uses FILTER_STATIC_STRING
now, so FILTER_PTR_STRING has been left broken. (Hmm, but many
"const char *" fields are still in use, especially in the rcu events...)

OK, can you update your patch description to use rcu events?

BTW, I think those should also be decoded from an enum value in the events,
or use __rel_loc, since they are not self-contained. (It's a TODO item.)

> > I think a better solution is fixing the syscall tracer.
> 
> I would say that syscall trace is doing the right thing. The ringbuffer entry
> is a struct syscall_trace_enter, the syscall arguments are unsigned longs.
> They are written in ftrace_syscall_enter, this looks correct to me.

OK, I thought the filename pointed into the ringbuffer, but it actually
points to user space (the raw parameter values are saved). So it is OK.

Eprobe users should not access user space data directly, because that can
cause a page fault in the kernel without a fixup. It may work on x86, but
it doesn't work on other architectures that have a separate address space
for user space. To avoid such mistakes, the syscall event saves the actual
string in the ringbuffer as __filename_val.

Hmm, this must be documented in the eprobe example code...

> 
> A const char * syscall argument is using FILTER_PTR_STRING, the unsigned long
> argument from the ringbuffer is read as a char and then converted to a
> truncated pointer.


Thanks,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

