From: Frederic Weisbecker <fweisbec@gmail.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>,
Ingo Molnar <mingo@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, David Sharp <dhsharp@google.com>,
Justin Teravest <teravest@google.com>,
Laurent Chavey <chavey@google.com>,
Michael Davidson <md@google.com>,
x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/6] trace: trace syscall in its handler not from ptrace handler
Date: Fri, 30 Mar 2012 13:57:17 +0200
Message-ID: <20120330115715.GB13022@somewhere.redhat.com>
In-Reply-To: <4F74C0B2.1010100@zytor.com>
On Thu, Mar 29, 2012 at 01:06:10PM -0700, H. Peter Anvin wrote:
> On 03/29/2012 12:43 PM, Vaibhav Nagarnaik wrote:
> >
> > However, we agree that the syscall tracing as implemented currently is
> > a bit unwieldy. We would want to be a part of the re-designing effort
> > if there is a momentum in the community towards that goal. We would be
> > happy to contribute towards this effort.
> >
>
> I had a long discussion with Frederic over IRC earlier today. We came
> up with the following strawman:
>
> 1. A system call thunk (which could be enabled/disabled by patching the
> syscall table.) This provides an entry and exit hook, and also sets a
> per-thread flag to capture userspace traffic.
>
> 2. Instrumenting get_user/put_user/copy_from_user/copy_to_user to
> capture traffic to userspace. This captures the *full* set of system
> call arguments, including things addressed via pointers. Furthermore,
> it captures the exact versions fed to or returned from the kernel, and
> deals with data-dependent collection like ioctl().
>
> This has to be done with extreme care to avoid introducing overhead in
> the no-tracing case, however, as these functions are extraordinarily
> performance sensitive. This probably will require careful patching in
> the first enable/last disable case.
>
> 3. There will need to be userspace tools written to decode the resulting
> trace buffer. This is pretty much needed anyway, but once you throw in
> complex data structures it becomes even more so. A trace will basically
> consist of:
>
> SYSCALL_ENTRY <syscall number> <arg1..6>
> COPY_FROM_USER <address> <data>
> ...
> COPY_TO_USER <address> <data>
> ...
> SYSCALL_EXIT <return value>
>
> Outputting this in human-readable format requires some reasonably
> sophisticated logic, but the *HUGE* advantage is that not only is all
> the information there, it is *correct by construction*.
>
> -hpa
Note that we already have the relevant tracepoints in place in the
"raw_syscalls" event subsystem. It is generic, with only two tracepoints,
sys_enter and sys_exit, which blindly dump the syscall number, arguments
and return value:
$ cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/format
name: sys_enter
ID: 53
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int common_padding; offset:8; size:4; signed:1;
field:long id; offset:16; size:8; signed:1;
field:unsigned long args[6]; offset:24; size:48; signed:0;
print fmt: "NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)", REC->id, REC->args[0], REC->args[1], REC->args[2], REC->args[3], REC->args[4], REC->args[5]
$ cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_exit/format
name: sys_exit
ID: 52
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int common_padding; offset:8; size:4; signed:1;
field:long id; offset:16; size:8; signed:1;
field:long ret; offset:24; size:8; signed:1;
print fmt: "NR %ld = %ld", REC->id, REC->ret
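As a rough illustration of what a userspace decoder would do with these
records, here is a minimal Python sketch that unpacks the two layouts shown
above. It assumes a little-endian x86_64 record with exactly the offsets and
sizes from the format dumps, and that `record` holds only the event payload
(no ring buffer header). The function names are made up for illustration, and
a real tool should parse the format file at runtime rather than hardcode
offsets.

```python
import struct

# Layout transcribed from the format files above (assumed: x86_64, little-endian).
# common_type (u16), common_flags (u8), common_preempt_count (u8),
# common_pid (s32), common_padding (s32), then 4 pad bytes up to offset 16.
COMMON = "<HBBii4x"

SYS_ENTER = struct.Struct(COMMON + "q6Q")  # id at 16, args[6] at 24: 72 bytes
SYS_EXIT = struct.Struct(COMMON + "qq")    # id at 16, ret at 24: 32 bytes


def decode_sys_enter(record):
    """Unpack one raw_syscalls:sys_enter payload into a dict."""
    _type, _flags, _preempt, pid, _pad, nr, *args = SYS_ENTER.unpack(record)
    return {"pid": pid, "nr": nr, "args": args}


def decode_sys_exit(record):
    """Unpack one raw_syscalls:sys_exit payload into a dict."""
    _type, _flags, _preempt, pid, _pad, nr, ret = SYS_EXIT.unpack(record)
    return {"pid": pid, "nr": nr, "ret": ret}


def format_sys_enter(ev):
    # Mirrors the kernel's print fmt: "NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)"
    return "NR %d (%s)" % (ev["nr"], ", ".join("%x" % a for a in ev["args"]))


def format_sys_exit(ev):
    # Mirrors the kernel's print fmt: "NR %ld = %ld"
    return "NR %d = %d" % (ev["nr"], ev["ret"])
```

Hardcoding the struct strings keeps the sketch short, but the event IDs and
offsets can differ between kernels and architectures, which is why the format
files exist in the first place.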
We have yet to do the syscall table patching and the copy_*_user() tracepoints.
But other than these details, the bulk of the remaining work is in userspace.
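To give an idea of that userspace side, here is a minimal sketch of how a
decoder might regroup the strawman event stream quoted above into one record
per syscall. The event names follow the quoted listing; everything else (the
tuple representation, the `group_syscalls` name, a single-thread stream) is an
assumption for illustration, since no such trace format exists yet.

```python
def group_syscalls(events):
    """Group a flat, single-thread event stream into one dict per syscall.

    Each event is a tuple (hypothetical in-memory form):
        ("SYSCALL_ENTRY", nr, args)
        ("COPY_FROM_USER", address, data)
        ("COPY_TO_USER", address, data)
        ("SYSCALL_EXIT", ret)
    """
    current = None
    for ev in events:
        kind = ev[0]
        if kind == "SYSCALL_ENTRY":
            current = {
                "nr": ev[1],
                "args": tuple(ev[2]),
                "from_user": [],  # data the kernel read from userspace
                "to_user": [],    # data the kernel wrote back to userspace
                "ret": None,
            }
        elif current is None:
            # Event seen outside a syscall, e.g. tracing started mid-call.
            continue
        elif kind == "COPY_FROM_USER":
            current["from_user"].append((ev[1], ev[2]))
        elif kind == "COPY_TO_USER":
            current["to_user"].append((ev[1], ev[2]))
        elif kind == "SYSCALL_EXIT":
            current["ret"] = ev[1]
            yield current
            current = None
```

For example, a write(1, buf, 5) would show up as one entry event, one
COPY_FROM_USER carrying the five bytes, and one exit event, and come out of
`group_syscalls()` as a single record. Pretty-printing that record per syscall
number is where the more sophisticated logic would go.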