From: Frederic Weisbecker <fweisbec@gmail.com>
To: David Sharp <dhsharp@google.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
Vaibhav Nagarnaik <vnagarnaik@google.com>,
Ingo Molnar <mingo@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>,
Justin Teravest <teravest@google.com>,
Laurent Chavey <chavey@google.com>,
Michael Davidson <md@google.com>,
x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/6] trace: trace syscall in its handler not from ptrace handler
Date: Fri, 30 Mar 2012 14:06:59 +0200
Message-ID: <20120330120657.GC13022@somewhere.redhat.com>
In-Reply-To: <CAJL_ekvU8sjt0wi6GjrMiqy03kBZ6YzQkz9TO+MQzQAr_-2VnQ@mail.gmail.com>

On Thu, Mar 29, 2012 at 03:40:17PM -0700, David Sharp wrote:
> On Thu, Mar 29, 2012 at 1:06 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> > I had a long discussion with Frederic over IRC earlier today. We came
> > up with the following strawman:
> >
> > 1. A system call thunk (which could be enabled/disabled by patching the
> > syscall table.) This provides an entry and exit hook, and also sets a
> > per-thread flag to capture userspace traffic.
>
> Our goal is for syscall traces to be as fast as regular tracepoints.
> IIRC, what we've found is that much of the extra overhead of syscall
> tracepoints, compared to regular tracepoints, comes from the syscall
> tracing code path being bundled with checks for ptrace and other
> stuff (Vaibhav did all this characterization, he can jump in with
> details if wanted). How much work would this "thunk" have to do that
> is not either recording the trace or calling the syscall?
>
> >
> > 2. Instrumenting get_user/put_user/copy_from_user/copy_to_user to
> > capture traffic to userspace. This captures the *full* set of system
> > call arguments, including things addressed via pointers. Furthermore,
> > it captures the exact versions fed to or returned from the kernel, and
> > deals with data-dependent collection like ioctl().
>
> Do I understand correctly that you are thinking of copying the contents
> of those buffers into the ring buffer? This sounds useful. However, I
> think it should be optional and the number of bytes copied should be
> limited (tunable). On highly utilized systems, we don't always have a
> lot of memory to dedicate to the ring buffer, so filling it with the
> contents of, e.g., the payload of "read" or "write" would not be
> acceptable under those circumstances. And since events in the ring
> buffer can't cross page boundaries, at some threshold this will cause
> an unacceptable level of unutilized space in the ring buffer.
>
> (For context, this is coming from the folks that added "tiny" versions
> of syscall tracepoints that only put 16 bits of arg0 into the ring
> buffer so we can get longer trace durations.)
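
To put rough numbers on the two concerns quoted above (a tunable cap on
copied bytes, and space lost because events can't cross page boundaries),
here is a minimal sketch. The helper names and the 4096-byte figure used
below are assumptions for illustration only; they are not taken from the
patch series, and the usable data area of a real ring buffer page is
somewhat smaller than the page itself.

```c
#include <stddef.h>

/*
 * Hypothetical helper: clamp the number of user-buffer bytes copied
 * into a trace event to a tunable maximum.
 */
static inline size_t capped_copy_len(size_t requested, size_t max_bytes)
{
	return requested < max_bytes ? requested : max_bytes;
}

/*
 * Hypothetical helper: bytes left unused at the end of one page when
 * it is filled with fixed-size events that may not straddle a page
 * boundary. An event that does not fit at all leaves the whole page
 * unused in this simplified model.
 */
static inline size_t page_tail_waste(size_t page_data_bytes, size_t event_bytes)
{
	if (event_bytes == 0 || event_bytes > page_data_bytes)
		return page_data_bytes;
	return page_data_bytes % event_bytes;
}
```

With an assumed 4096 usable bytes per page, page_tail_waste(4096, 2049)
is 2047: only one such event fits, and almost half the page stays empty.
Small events that divide the page evenly waste nothing in this model.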

BTW, since tracing overhead (in terms of volume and throughput) is
important for you guys, have you considered adding some option to ftrace
to omit the "common" fields from the trace record:

	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
	field:int common_pid;	offset:4;	size:4;	signed:1;
	field:int common_padding;	offset:8;	size:4;	signed:1;

I think you talked about that at the last kernel summit. This would be
interesting for everyone.

You can recover the pid from the sched switch events. The rest is
probably useless most of the time.
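
For a sense of scale, the field list above adds up to 12 bytes per
record. A minimal userspace sketch, assuming a struct that simply mirrors
those offsets and sizes (the struct and helper names are made up for
illustration and are not kernel definitions; any other per-event header
in the ring buffer is ignored):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the "common" fields in the format listing. */
struct common_fields {
	uint16_t common_type;		/* offset 0, size 2 */
	uint8_t  common_flags;		/* offset 2, size 1 */
	uint8_t  common_preempt_count;	/* offset 3, size 1 */
	int32_t  common_pid;		/* offset 4, size 4 */
	int32_t  common_padding;	/* offset 8, size 4 */
};

/* Check the layout against the offsets given in the listing. */
static_assert(offsetof(struct common_fields, common_flags) == 2, "flags offset");
static_assert(offsetof(struct common_fields, common_preempt_count) == 3, "preempt offset");
static_assert(offsetof(struct common_fields, common_pid) == 4, "pid offset");
static_assert(offsetof(struct common_fields, common_padding) == 8, "padding offset");
static_assert(sizeof(struct common_fields) == 12, "common fields size");

/* Total bytes spent on common fields across n_events records. */
static inline size_t common_overhead_bytes(size_t n_events)
{
	return n_events * sizeof(struct common_fields);
}

/* Share of one record taken by common fields for a given payload size. */
static inline double common_overhead_fraction(size_t payload_bytes)
{
	size_t hdr = sizeof(struct common_fields);

	return (double)hdr / (double)(hdr + payload_bytes);
}
```

Under these assumptions, a record carrying a 4-byte payload would be 75%
common fields, which is why dropping them matters most for very small
events.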