public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
To: "Frank Ch. Eigler" <fche@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>,
	akpm@linux-foundation.org, Ingo Molnar <mingo@elte.hu>,
	linux-kernel@vger.kernel.org
Subject: [RFC] system-wide in-kernel syscall tracing
Date: Fri, 25 Apr 2008 09:17:17 -0400	[thread overview]
Message-ID: <20080425131717.GA8034@Krystal> (raw)
In-Reply-To: <20080425125607.GC28411@Krystal>

* Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca) wrote:
> > >
> > > Those which are close enough to system call boundary are essentially
> > > strace(1).
> > 
> > Those may not sound worthwhile to put a marker for, BUT, you're
> > ignoring the huge differences of impact and scope.  A system-wide
> > marker-based trace (filtered a la systemtap if desired) can be done
> > with a tiny fraction of system load and none of the disruption caused
> > by an strace of all the processes.
> > 
> 
> I agree with both ;) Actually we need a low-overhead hook in
> syscall_trace(), so we can perform efficient system-wide tracing of
> system calls. I'll dig in this as soon as I find time.
> 
> Basic ideas :
> 
> - I already have the TIF_KERNEL_TRACE thread flag added to all
>   architectures in another patchset.
> - We add a function called on TIF_KERNEL_TRACE, from do_syscall_trace(),
>   which is architecture-specific. It's basically a big switch() for all
>   system calls. syscalls which takes similar types could be grouped
>   together, but I don't think it would be useful at all. It might be
>   better just to add a trace_mark for each so we extract the syscall
>   fields in the marker string.
> - We perform the page fault (caused by strings and structures) reads on
>   the spot, because we prefer not to do this in atomic context.
> - We put a marker, e.g., for x86_32, a pseudo-code like :
> 
> syscall_trace_enter()
> {
>   ...
>   if (test_thread_flag(TIF_KERNEL_TRACE))
>     do_marker_syscall_trace();
>   ...
> }
> 
> do_marker_syscall_trace()
> {
>   char *tmpbuf;
> 
>   switch(regs->orig_ax) {
> 
>   case SYS_OPEN:
>     tmpbuf = vmalloc(4096); /* what size is needed ? */
>     copy_from_user(tmpbuf, regs->bx);
>     trace_mark(sys_open, "filename %p flags %d mode %d",

Actually, I meant :
     trace_mark(sys_open, "filename %s flags %d mode %d",

and it would be even better to pass the __user pointer directly to the
probe to eliminate the copy. I think this could be done by making sure
the memory is faulted-in and locked when we call the trace_mark. It
could require to think of a way to specify a weird format string type
though, so an automated tracer would use strncpy_from_user in atomic and
al instead of trying to dereference the userspace pointer directly.

Mathieu

>       tmpbuf, regs->cx, regs->dx);
>     vfree(tmpbuf);
>     break;
>   }
> }
> 
> Modulo some optimization, what do you think of this ? If someone is
> willing to implement this, I can provide the patchset for
> TIF_KERNEL_TRACE.
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

  reply	other threads:[~2008-04-25 13:17 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-24 15:03 [patch 00/37] Linux Kernel Markers instrumentation for sched-devel.git Mathieu Desnoyers
2008-04-24 15:03 ` [patch 01/37] Stringify support commas Mathieu Desnoyers
2008-04-24 15:03 ` [patch 02/37] x86_64 page fault NMI-safe Mathieu Desnoyers
2008-04-24 15:03 ` [patch 03/37] Change Alpha active count bit Mathieu Desnoyers
2008-04-24 15:03 ` [patch 04/37] Change avr32 " Mathieu Desnoyers
2008-04-24 15:03 ` [patch 05/37] x86 NMI-safe INT3 and Page Fault Mathieu Desnoyers
2008-04-24 15:03 ` [patch 06/37] Kprobes - use a mutex to protect the instruction pages list Mathieu Desnoyers
2008-04-24 15:03 ` [patch 07/37] Kprobes - do not use kprobes mutex in arch code Mathieu Desnoyers
2008-04-24 15:03 ` [patch 08/37] Kprobes - declare kprobe_mutex static Mathieu Desnoyers
2008-04-24 15:03 ` [patch 09/37] Fix sched-devel text_poke Mathieu Desnoyers
2008-04-24 15:03 ` [patch 10/37] Text Edit Lock - Architecture Independent Code Mathieu Desnoyers
2008-04-24 15:03 ` [patch 11/37] Text Edit Lock - kprobes architecture independent support Mathieu Desnoyers
2008-04-24 15:03 ` [patch 12/37] Add all cpus option to stop machine run Mathieu Desnoyers
2008-04-24 15:03 ` [patch 13/37] Immediate Values - Architecture Independent Code Mathieu Desnoyers
2008-04-25 14:55   ` Ingo Molnar
2008-04-26  9:36     ` Ingo Molnar
2008-04-26 11:09       ` Ingo Molnar
2008-04-26 14:17         ` Mathieu Desnoyers
2008-04-24 15:03 ` [patch 14/37] Immediate Values - Kconfig menu in EMBEDDED Mathieu Desnoyers
2008-04-24 15:03 ` [patch 15/37] Immediate Values - x86 Optimization Mathieu Desnoyers
2008-04-24 15:03 ` [patch 16/37] Add text_poke and sync_core to powerpc Mathieu Desnoyers
2008-04-24 15:03 ` [patch 17/37] Immediate Values - Powerpc Optimization Mathieu Desnoyers
2008-04-24 15:03 ` [patch 18/37] Immediate Values - Documentation Mathieu Desnoyers
2008-04-24 15:03 ` [patch 19/37] Immediate Values Support init Mathieu Desnoyers
2008-04-24 15:03 ` [patch 20/37] Immediate Values - Move Kprobes x86 restore_interrupt to kdebug.h Mathieu Desnoyers
2008-04-24 15:03 ` [patch 21/37] Add __discard section to x86 Mathieu Desnoyers
2008-04-24 15:03 ` [patch 22/37] Immediate Values - x86 Optimization NMI and MCE support Mathieu Desnoyers
2008-04-24 15:03 ` [patch 23/37] Immediate Values - Powerpc Optimization NMI " Mathieu Desnoyers
2008-04-24 15:03 ` [patch 24/37] Immediate Values Use Arch NMI and MCE Support Mathieu Desnoyers
2008-04-24 15:03 ` [patch 25/37] Immediate Values - Jump Mathieu Desnoyers
2008-04-24 15:03 ` [patch 26/37] Scheduler Profiling - Use Immediate Values Mathieu Desnoyers
2008-04-24 15:03 ` [patch 27/37] From: Adrian Bunk <bunk@kernel.org> Mathieu Desnoyers
2008-04-28  9:54   ` Adrian Bunk
2008-04-28 12:37     ` Mathieu Desnoyers
2008-04-24 15:03 ` [patch 28/37] Markers - remove extra format argument Mathieu Desnoyers
2008-04-24 15:03 ` [patch 29/37] Markers - define non optimized marker Mathieu Desnoyers
2008-04-24 15:03 ` [patch 30/37] Linux Kernel Markers - Use Immediate Values Mathieu Desnoyers
2008-04-24 15:03 ` [patch 31/37] Markers use imv jump Mathieu Desnoyers
2008-04-24 15:03 ` [patch 32/37] Port ftrace to markers Mathieu Desnoyers
2008-04-24 15:03 ` [patch 33/37] LTTng instrumentation fs Mathieu Desnoyers
2008-04-24 15:03 ` [patch 34/37] LTTng instrumentation ipc Mathieu Desnoyers
2008-04-24 23:02   ` Alexey Dobriyan
2008-04-25  2:15     ` Frank Ch. Eigler
2008-04-25 12:56       ` Mathieu Desnoyers
2008-04-25 13:17         ` Mathieu Desnoyers [this message]
2008-05-04 21:04         ` Alexey Dobriyan
2008-05-04 20:39           ` Frank Ch. Eigler
2008-04-24 15:03 ` [patch 35/37] LTTng instrumentation kernel Mathieu Desnoyers
2008-04-24 15:04 ` [patch 36/37] LTTng instrumentation mm Mathieu Desnoyers
2008-04-28  2:12   ` Masami Hiramatsu
2008-04-24 15:04 ` [patch 37/37] LTTng instrumentation net Mathieu Desnoyers
2008-04-24 15:52   ` Pavel Emelyanov
2008-04-24 16:13     ` Mathieu Desnoyers
2008-04-24 16:30       ` Pavel Emelyanov
2008-04-26 19:38 ` [patch 00/37] Linux Kernel Markers instrumentation for sched-devel.git Peter Zijlstra
2008-04-26 20:11   ` Mathieu Desnoyers
2008-04-27 10:00     ` Peter Zijlstra
2008-05-04 15:08       ` Mathieu Desnoyers
2008-04-28 18:22   ` Ingo Molnar
2008-04-28 18:36   ` Andrew Morton
2008-04-28 18:40     ` Christoph Hellwig
2008-04-28 18:47       ` Andrew Morton
2008-04-28 18:49         ` Christoph Hellwig
2008-04-28 19:01         ` KOSAKI Motohiro
2008-04-28 19:52     ` Peter Zijlstra
2008-04-28 22:25       ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080425131717.GA8034@Krystal \
    --to=mathieu.desnoyers@polymtl.ca \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=fche@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox