From: Frederic Weisbecker <fweisbec@gmail.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>,
Ingo Molnar <mingo@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, David Sharp <dhsharp@google.com>,
Justin Teravest <teravest@google.com>,
Laurent Chavey <chavey@google.com>,
Michael Davidson <md@google.com>,
x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/6] trace: trace syscall in its handler not from ptrace handler
Date: Fri, 30 Mar 2012 13:57:17 +0200
Message-ID: <20120330115715.GB13022@somewhere.redhat.com>
In-Reply-To: <4F74C0B2.1010100@zytor.com>
On Thu, Mar 29, 2012 at 01:06:10PM -0700, H. Peter Anvin wrote:
> On 03/29/2012 12:43 PM, Vaibhav Nagarnaik wrote:
> >
> > However, we agree that the syscall tracing as implemented currently is
> > a bit unwieldy. We would want to be a part of the re-designing effort
> > if there is a momentum in the community towards that goal. We would be
> > happy to contribute towards this effort.
> >
>
> I had a long discussion with Frederic over IRC earlier today. We came
> up with the following strawman:
>
> 1. A system call thunk (which could be enabled/disabled by patching the
> syscall table.) This provides an entry and exit hook, and also sets a
> per-thread flag to capture userspace traffic.
>
> 2. Instrumenting get_user/put_user/copy_from_user/copy_to_user to
> capture traffic to userspace. This captures the *full* set of system
> call arguments, including things addressed via pointers. Furthermore,
> it captures the exact versions fed to or returned from the kernel, and
> deals with data-dependent collection like ioctl().
>
> This has to be done with extreme care to avoid introducing overhead in
> the no-tracing case, however, as these functions are extraordinarily
> performance sensitive. This probably will require careful patching in
> the first enable/last disable case.
>
> 3. There will need to be userspace tools written to decode the resulting
> trace buffer. This is pretty much needed anyway, but once you throw in
> complex data structures it becomes even more so. A trace will basically
> consist of:
>
> SYSCALL_ENTRY <syscall number> <arg1..6>
> COPY_FROM_USER <address> <data>
> ...
> COPY_TO_USER <address> <data>
> ...
> SYSCALL_EXIT <return value>
>
> Outputting this in human-readable format requires some reasonably
> sophisticated logic, but the *HUGE* advantage is that not only is all
> the information there, it is *correct by construction*.
>
> -hpa

Note that we already have the relevant tracepoints in place in the
"raw_syscalls" event subsystem. It is generic, with only two tracepoints,
sys_enter and sys_exit, which blindly dump the syscall number, arguments,
and return value:

$ cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format
name: sys_enter
ID: 53
format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;

    field:long id; offset:16; size:8; signed:1;
    field:unsigned long args[6]; offset:24; size:48; signed:0;

print fmt: "NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)", REC->id, REC->args[0], REC->args[1], REC->args[2], REC->args[3], REC->args[4], REC->args[5]

$ cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_exit/format
name: sys_exit
ID: 52
format:
    field:unsigned short common_type; offset:0; size:2; signed:0;
    field:unsigned char common_flags; offset:2; size:1; signed:0;
    field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
    field:int common_pid; offset:4; size:4; signed:1;
    field:int common_padding; offset:8; size:4; signed:1;

    field:long id; offset:16; size:8; signed:1;
    field:long ret; offset:24; size:8; signed:1;

print fmt: "NR %ld = %ld", REC->id, REC->ret
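
To illustrate how a userspace tool could consume these events, here is a rough Python sketch that parses the "field:" lines of a format file and uses the offsets and sizes to decode one raw record. It is only a sketch under assumptions: the helper names (parse_format, decode_record) are made up for this example, the record is assumed to be little-endian as on x86_64, and the sample record is synthetic rather than read from a real trace buffer.

```python
import re
import struct

# Field lines copied from the sys_enter format shown above.
SYS_ENTER_FORMAT = """
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int common_padding; offset:8; size:4; signed:1;
field:long id; offset:16; size:8; signed:1;
field:unsigned long args[6]; offset:24; size:48; signed:0;
"""

# Matches one "field:" line of an event format description.
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*"
    r"size:(?P<size>\d+);\s*signed:(?P<signed>[01]);"
)


def parse_format(text):
    """Return a list of (name, offset, size, signed, count) tuples."""
    fields = []
    for m in FIELD_RE.finditer(text):
        name = m.group("decl").split()[-1]
        count = 1
        arr = re.match(r"(\w+)\[(\d+)\]$", name)
        if arr:  # array field such as args[6]
            name, count = arr.group(1), int(arr.group(2))
        fields.append((name, int(m.group("offset")), int(m.group("size")),
                       m.group("signed") == "1", count))
    return fields


def decode_record(fields, raw):
    """Decode one raw record into a dict (assumes little-endian data)."""
    out = {}
    for name, offset, size, signed, count in fields:
        elem = size // count  # size of one array element
        values = [
            int.from_bytes(raw[offset + i * elem:offset + (i + 1) * elem],
                           "little", signed=signed)
            for i in range(count)
        ]
        out[name] = values if count > 1 else values[0]
    return out


# Synthetic 72-byte record: type 53, pid 1234, syscall 1 with three arguments.
# "4x" is the gap between common_padding (ends at 12) and id (starts at 16).
raw = struct.pack("<HBBii4xq6Q", 53, 0, 0, 1234, 0, 1,
                  1, 0x7FFF0000, 5, 0, 0, 0)
rec = decode_record(parse_format(SYS_ENTER_FORMAT), raw)
print(rec["id"], [hex(a) for a in rec["args"]])
```

Driving the decoder from the format description rather than hardcoding the layout means the same code should handle sys_exit as well, since only the last field differs.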

What we have yet to do is the syscall table patching and the copy_*_user()
tracepoints. But other than these details, the bulk of the remaining work is
in userspace.
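
As an idea of what that userspace side might look like, the following Python sketch folds the event stream from the quoted strawman (SYSCALL_ENTRY, COPY_FROM_USER, COPY_TO_USER, SYSCALL_EXIT) into one record per system call. Everything about the encoding is an assumption made for illustration: no such trace output exists yet, so the one-event-per-line text form, the hex-encoded addresses and data, the Syscall class and the group_events helper are all hypothetical, and the stream is assumed to come from a single thread.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Syscall:
    nr: int
    args: List[int]
    # Each copy is (direction, user address, bytes transferred).
    copies: List[Tuple[str, int, bytes]] = field(default_factory=list)
    ret: Optional[int] = None


def group_events(lines):
    """Fold one thread's event lines into a list of completed Syscall records."""
    done, current = [], None
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        kind = parts[0]
        if kind == "SYSCALL_ENTRY":
            # Hypothetical layout: decimal syscall number, then up to six hex args.
            current = Syscall(nr=int(parts[1]),
                              args=[int(a, 16) for a in parts[2:8]])
        elif kind in ("COPY_FROM_USER", "COPY_TO_USER") and current is not None:
            current.copies.append(
                (kind, int(parts[1], 16), bytes.fromhex(parts[2])))
        elif kind == "SYSCALL_EXIT" and current is not None:
            current.ret = int(parts[1])
            done.append(current)
            current = None
    return done


# Made-up sample: a write-like call whose 5-byte buffer is copied from userspace.
sample = [
    "SYSCALL_ENTRY 1 1 7fff0000 5 0 0 0",
    "COPY_FROM_USER 7fff0000 68656c6c6f",
    "SYSCALL_EXIT 5",
]
calls = group_events(sample)
for call in calls:
    print(call.nr, call.args, call.copies, call.ret)
```

Attaching the copied data to the enclosing entry/exit pair is what would let a pretty-printer show pointer arguments with the exact contents the kernel saw, instead of re-reading user memory after the fact.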