From: Frederic Weisbecker <fweisbec@gmail.com>
To: "Kok, Auke" <auke-jan.h.kok@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
powertop ml <power@bughost.org>,
Arjan van de Ven <arjan@linux.intel.com>,
Ingo Molnar <mingo@elte.hu>,
srostedt@redhat.com,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
"Frank Ch. Eigler" <fche@redhat.com>,
Neil Horman <nhorman@tuxdriver.com>
Subject: Re: [PATCH] tracer for sys_open() - sreadahead
Date: Tue, 27 Jan 2009 23:50:49 +0100
Message-ID: <20090127225048.GA4652@nowhere>
In-Reply-To: <20090127224303.GB5850@nowhere>
On Tue, Jan 27, 2009 at 11:43:03PM +0100, Frederic Weisbecker wrote:
> On Tue, Jan 27, 2009 at 12:08:04PM -0800, Kok, Auke wrote:
> >
> > This tracer monitors open() syscalls on regular files. It is a fast,
> > low-overhead alternative to strace, and it neither allows nor
> > requires attaching to each process.
> >
> > The tracer only logs successful calls, as those are the only ones we
> > are currently interested in, and we can determine the absolute path
> > of these files as we log.
> >
> > Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
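
As a rough user-space analogue of what the description above promises (log only
successful opens, together with the absolute path), here is a minimal sketch. It
is not the patch's code: model_log_open() and MODEL_PATH_MAX are hypothetical
names, and it assumes a Linux system with /proc mounted, where the patch itself
uses d_path() inside the kernel.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MODEL_PATH_MAX 4096	/* hypothetical buffer size for this sketch */

/*
 * Open 'path' and, only on success, write a trace-style line with the
 * resolved absolute path into 'out'. Returns the fd, or -1 on failure.
 * 'flags' must not contain O_CREAT, since no mode argument is passed.
 */
int model_log_open(const char *path, int flags, char *out, size_t len)
{
	char link[64];
	char resolved[MODEL_PATH_MAX];
	ssize_t n;
	int fd;

	if (len)
		out[0] = '\0';

	fd = open(path, flags);
	if (fd < 0)
		return -1;	/* failed calls are not logged */

	/* Assumption: /proc/self/fd/<n> is a symlink to the absolute path. */
	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, resolved, sizeof(resolved) - 1);
	if (n < 0) {
		close(fd);
		return -1;
	}
	resolved[n] = '\0';	/* readlink() does not terminate the string */

	snprintf(out, len, "open(\"%s\", %d) = %d\n", resolved, flags, fd);
	return fd;
}
```

The caller owns the returned descriptor and has to close() it. A relative path
passed in comes back resolved, which is the property the tracer relies on.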
>
>
> Hi Auke,
>
> Speaking of a global syscall tracer, I wrote a patch that traces only the syscalls
> with the function graph tracer.
>
> http://lkml.org/lkml/2008/12/30/267
>
> Its approach and purpose differ from those of a tracer dedicated only to syscalls.
> The function graph tracer follows the execution graph of functions and is mostly about
> execution time spent and code flow, whereas a syscall tracer can provide more specific
> information about the syscalls themselves.
>
> So the two do not overlap.
>
> But the low level part of my patch creates a thread flag _TIF_SYSCALL_TRACE which triggers
s/_TIF_SYSCALL_TRACE/_TIF_SYSCALL_FTRACE
_TIF_SYSCALL_TRACE is the one used by ptrace.
> a ptrace hook when set.
> This low-level part can easily be used by all tracers that would like to inspect syscalls.
>
> Only one change is needed: Steven requested that the part inside syscall_trace_enter become
> a tracepoint, making it fully shareable between tracers and easy to turn on and off.
>
> And perhaps the parts that set/clear the flag on all tasks can be shared too.
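
To make the proposed layering concrete, here is a minimal user-space model: a
per-task flag gates the syscall-entry hook, and the hook fans out to whichever
tracers have registered a probe. This is only a sketch of the idea, not kernel
code. Every name in it (model_task, MODEL_TIF_SYSCALL_FTRACE,
model_register_probe and so on) is hypothetical, and a real tracepoint handles
concurrency and unregistration, which this model leaves out.

```c
#include <stddef.h>

#define MODEL_TIF_SYSCALL_FTRACE (1u << 0)	/* stand-in for the thread flag */
#define MODEL_MAX_PROBES 4

struct model_task {
	unsigned int flags;
};

typedef void (*model_probe_t)(const struct model_task *t, long nr, void *data);

struct model_probe_slot {
	model_probe_t fn;
	void *data;
};

static struct model_probe_slot model_probes[MODEL_MAX_PROBES];
static size_t model_nr_probes;

/* Any tracer can attach to the shared hook; returns -1 when the table is full. */
int model_register_probe(model_probe_t fn, void *data)
{
	if (model_nr_probes == MODEL_MAX_PROBES)
		return -1;
	model_probes[model_nr_probes].fn = fn;
	model_probes[model_nr_probes].data = data;
	model_nr_probes++;
	return 0;
}

/* The shareable "set/clear the flag on all tasks" part. */
void model_set_flag_all(struct model_task *tasks, size_t n, int on)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (on)
			tasks[i].flags |= MODEL_TIF_SYSCALL_FTRACE;
		else
			tasks[i].flags &= ~MODEL_TIF_SYSCALL_FTRACE;
	}
}

/* Syscall entry: one flag test on the fast path, probes only when it is set. */
void model_syscall_enter(const struct model_task *t, long nr)
{
	size_t i;

	if (!(t->flags & MODEL_TIF_SYSCALL_FTRACE))
		return;
	for (i = 0; i < model_nr_probes; i++)
		model_probes[i].fn(t, nr, model_probes[i].data);
}

/* Example probe: counts traced syscalls in the counter passed as 'data'. */
void model_count_probe(const struct model_task *t, long nr, void *data)
{
	(void)t;
	(void)nr;
	++*(unsigned long *)data;
}
```

The point of the split is that tasks without the flag pay for a single bit test,
while the set of consumers behind the hook can change without touching the
entry path.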
>
> So we can start with this low-level syscall tracing facility. If you want, I can adapt
> this low-level part and submit a patch this week or next to give you this base
> infrastructure.
>
>
> Once we have it, I think a syscall tracer can be fed new syscall events over
> several patch iterations, starting with open and close :-)
>
> Are you ok with that?
>
> Steven, Ingo, do you agree?
>
>
> >
> > diff --git a/fs/open.c b/fs/open.c
> > index a3a78ce..8cf2a6b 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -30,6 +30,10 @@
> > #include <linux/audit.h>
> > #include <linux/falloc.h>
> >
> > +#include <trace/fs.h>
> > +
> > +DEFINE_TRACE(do_sys_open);
> > +
> > int vfs_statfs(struct dentry *dentry, struct kstatfs *buf)
> > {
> > int retval = -ENODEV;
> > @@ -1040,6 +1044,7 @@ long do_sys_open(int dfd, const char __user *filename, int flags, int mode)
> > fsnotify_open(f->f_path.dentry);
> > fd_install(fd, f);
> > }
> > + trace_do_sys_open(f, flags, mode, fd);
> > }
> > putname(tmp);
> > }
> > diff --git a/include/trace/fs.h b/include/trace/fs.h
> > new file mode 100644
> > index 0000000..870eec2
> > --- /dev/null
> > +++ b/include/trace/fs.h
> > @@ -0,0 +1,11 @@
> > +#ifndef _TRACE_FS_H
> > +#define _TRACE_FS_H
> > +
> > +#include <linux/fs.h>
> > +#include <linux/tracepoint.h>
> > +
> > +DECLARE_TRACE(do_sys_open,
> > + TPPROTO(struct file *filp, int flags, int mode, long fd),
> > + TPARGS(filp, flags, mode, fd));
> > +
> > +#endif
> > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > index e2a4ff6..0400815 100644
> > --- a/kernel/trace/Kconfig
> > +++ b/kernel/trace/Kconfig
> > @@ -149,6 +149,15 @@ config CONTEXT_SWITCH_TRACER
> > This tracer gets called from the context switch and records
> > all switching of tasks.
> >
> > +config OPEN_CLOSE_TRACER
> > + bool "Trace open() calls"
> > + depends on DEBUG_KERNEL
> > + select TRACING
> > + select MARKERS
> > + help
> > + This tracer records open() syscalls. These calls are made when
> > + files are accessed on disk.
> > +
> > config BOOT_TRACER
> > bool "Trace boot initcalls"
> > depends on DEBUG_KERNEL
> > diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
> > index 349d5a9..25cec6c 100644
> > --- a/kernel/trace/Makefile
> > +++ b/kernel/trace/Makefile
> > @@ -20,6 +20,7 @@ obj-$(CONFIG_RING_BUFFER) += ring_buffer.o
> >
> > obj-$(CONFIG_TRACING) += trace.o
> > obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o
> > +obj-$(CONFIG_OPEN_CLOSE_TRACER) += trace_open_close.o
> > obj-$(CONFIG_SYSPROF_TRACER) += trace_sysprof.o
> > obj-$(CONFIG_FUNCTION_TRACER) += trace_functions.o
> > obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o
> > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > index 4d3d381..24c17d2 100644
> > --- a/kernel/trace/trace.h
> > +++ b/kernel/trace/trace.h
> > @@ -30,6 +30,7 @@ enum trace_type {
> > TRACE_USER_STACK,
> > TRACE_HW_BRANCHES,
> > TRACE_POWER,
> > + TRACE_OPEN,
> >
> > __TRACE_LAST_TYPE
> > };
> > diff --git a/kernel/trace/trace_open_close.c b/kernel/trace/trace_open_close.c
> > new file mode 100644
> > index 0000000..4250efc
> > --- /dev/null
> > +++ b/kernel/trace/trace_open_close.c
> > @@ -0,0 +1,148 @@
> > +/*
> > + * trace open calls
> > + * Copyright (C) 2009 Intel Corporation
> > + *
> > + * Based extensively on trace_sched_switch.c
> > + * Copyright (C) 2007 Steven Rostedt <srostedt@redhat.com>
> > + *
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/fs.h>
> > +#include <linux/debugfs.h>
> > +#include <linux/kallsyms.h>
> > +#include <linux/uaccess.h>
> > +#include <linux/ftrace.h>
> > +#include <trace/fs.h>
> > +
> > +#include "trace.h"
> > +
> > +
> > +static struct trace_array *ctx_trace;
> > +static int __read_mostly open_trace_enabled;
> > +static atomic_t open_ref;
> > +
> > +static void probe_do_sys_open(struct file *filp, int flags, int mode, long fd)
> > +{
> > + char *buf;
> > + char *fname;
> > +
> > + if (!atomic_read(&open_ref))
> > + return;
> > +
> > + if (!open_trace_enabled)
> > + return;
> > +
> > + buf = kzalloc(PAGE_SIZE, GFP_KERNEL);
> > + if (!buf)
> > + return;
> > + fname = d_path(&filp->f_path, buf, PAGE_SIZE);
> > +
> > + if (IS_ERR(fname))
> > + goto out;
> > +
> > + ftrace_printk("%s: open(\"%s\", %d, %d) = %ld\n",
> > + current->comm, fname, flags, mode, fd);
> > +out:
> > + kfree(buf);
> > +}
> > +
> > +static void open_trace_reset(struct trace_array *tr)
> > +{
> > + tr->time_start = ftrace_now(tr->cpu);
> > + tracing_reset_online_cpus(tr);
> > +}
> > +
> > +static int open_trace_register(void)
> > +{
> > + int ret;
> > +
> > + ret = register_trace_do_sys_open(probe_do_sys_open);
> > + if (ret) {
> > + pr_info("open trace: Could not activate tracepoint"
> > + " probe to do_open\n");
> > + }
> > +
> > + return ret;
> > +}
> > +
> > +static void open_trace_unregister(void)
> > +{
> > + unregister_trace_do_sys_open(probe_do_sys_open);
> > +}
> > +
> > +static void open_trace_start(void)
> > +{
> > + long ref;
> > +
> > + ref = atomic_inc_return(&open_ref);
> > + if (ref == 1)
> > + open_trace_register();
> > +}
> > +
> > +static void open_trace_stop(void)
> > +{
> > + long ref;
> > +
> > + ref = atomic_dec_and_test(&open_ref);
> > + if (ref)
> > + open_trace_unregister();
> > +}
> > +
> > +void open_trace_start_cmdline_record(void)
> > +{
> > + open_trace_start();
> > +}
> > +
> > +void open_trace_stop_cmdline_record(void)
> > +{
> > + open_trace_stop();
> > +}
> > +
> > +static void open_start_trace(struct trace_array *tr)
> > +{
> > + open_trace_reset(tr);
> > + open_trace_start_cmdline_record();
> > + open_trace_enabled = 1;
> > +}
> > +
> > +static void open_stop_trace(struct trace_array *tr)
> > +{
> > + open_trace_enabled = 0;
> > + open_trace_stop_cmdline_record();
> > +}
> > +
> > +static int open_trace_init(struct trace_array *tr)
> > +{
> > + ctx_trace = tr;
> > +
> > + open_start_trace(tr);
> > + return 0;
> > +}
> > +
> > +static void reset_open_trace(struct trace_array *tr)
> > +{
> > + open_stop_trace(tr);
> > +}
> > +
> > +static struct tracer open_trace __read_mostly =
> > +{
> > + .name = "open",
> > + .init = open_trace_init,
> > + .reset = reset_open_trace,
> > +};
> > +
> > +__init static int init_open_trace(void)
> > +{
> > + int ret = 0;
> > +
> > + if (atomic_read(&open_ref))
> > + ret = open_trace_register();
> > + if (ret) {
> > + pr_info("error registering open trace\n");
> > + return ret;
> > + }
> > + return register_tracer(&open_trace);
> > +}
> > +device_initcall(init_open_trace);
> > +
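
The start/stop pair in the patch follows a common pattern: a reference count
decides when the probe is actually attached and detached, so only the first
start registers and only the last stop unregisters. Below is a minimal
user-space sketch of that pattern using C11 atomics. The model_* names are
hypothetical, and a plain variable stands in for
register_trace_do_sys_open()/unregister_trace_do_sys_open().

```c
#include <stdatomic.h>

static atomic_int model_ref;		/* number of active users */
static int model_registered;		/* 1 while the probe is attached */

static void model_register(void)
{
	model_registered = 1;		/* stand-in for registering the probe */
}

static void model_unregister(void)
{
	model_registered = 0;		/* stand-in for unregistering the probe */
}

void model_trace_start(void)
{
	/* atomic_fetch_add() returns the old value: 0 means first user. */
	if (atomic_fetch_add(&model_ref, 1) == 0)
		model_register();
}

void model_trace_stop(void)
{
	/* An old value of 1 means the count just dropped to zero. */
	if (atomic_fetch_sub(&model_ref, 1) == 1)
		model_unregister();
}

int model_is_registered(void)
{
	return model_registered;
}
```

In the kernel, atomic_inc_return() yields the new value and
atomic_dec_and_test() yields true when the count reaches zero, so the patch's
comparisons (ref == 1, and a non-zero ref after the decrement) express the same
two conditions.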