From: Karim Yaghmour <karim@opersys.com>
To: John Levon <movement@marcelothewonderpenguin.com>
Cc: Linus Torvalds <torvalds@transmeta.com>,
	Andrew Morton <akpm@zip.com.au>,
	linux-kernel@vger.kernel.org,
	Richard Moore <richardj_moore@uk.ibm.com>,
	bob <bob@watson.ibm.com>
Subject: Re: Enhanced profiling support (was Re: vm lock contention reduction)
Date: Tue, 09 Jul 2002 15:56:21 -0400
Message-ID: <3D2B3FE5.2AD6859A@opersys.com>
In-Reply-To: 20020709165754.GA96901@compsoc.man.ac.uk


I've been following oprofile's development for a while and was quite
happy to see a DCPI equivalent come to Linux. The LTT data collection
engine is actually inspired by the one described in the DCPI paper.

As I said earlier, much of the kernel behavior information oprofile
requires is made available by LTT. Instead of redirecting the syscall
table, for example, LTT hooks itself within the syscall path to obtain
all syscall entries and exits (all contained within #ifdef's of course).
The same goes for important events such as forks/reads/writes/scheduling
etc. The mapping of PID to the name of an executable, for instance,
is easily extracted from this information.
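
To illustrate how a PID-to-executable mapping falls out of such a trace,
here is a minimal sketch in Python. The TraceEvent record and the
replay_pid_names() helper are hypothetical and are not the actual LTT
trace format or API; the sketch only assumes that fork, exec and exit
events arrive in order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    """Hypothetical event record; NOT the real LTT trace format."""
    kind: str        # "fork", "exec" or "exit"
    pid: int         # process that generated the event
    child: int = 0   # fork only: PID of the new process
    path: str = ""   # exec only: path of the executable


def replay_pid_names(events):
    """Replay events in order and return {pid: last known executable}.

    Names are kept after a process exits, so samples taken while the
    process was alive can still be attributed afterwards.
    """
    live = {}   # PID -> executable of processes still running
    seen = {}   # PID -> last known executable, kept after exit
    for ev in events:
        if ev.kind == "fork":
            # The child runs the parent's image until it calls exec.
            name = live.get(ev.pid, "<unknown>")
            live[ev.child] = name
            seen[ev.child] = name
        elif ev.kind == "exec":
            live[ev.pid] = ev.path
            seen[ev.pid] = ev.path
        elif ev.kind == "exit":
            live.pop(ev.pid, None)
    return seen
```

Because the name is recorded when the exec event is seen, it remains
available even if the process has exited by the time the samples are
analyzed.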

As with other applications/uses which require insight into the dynamics
of the kernel, it would seem to me that oprofile would greatly benefit
from the integration of the LTT patch in the mainline kernel. If
nothing else, oprofile could use the LTT collection engine to forward
its data to user-space, much like DProbes currently does.

By the same token, the LTT collection engine could replace the slew
of per-driver tracing systems already in the kernel, thereby providing
a uniform tracing system:
drivers/char/ftape/lowlevel/ftape-tracing.c
drivers/char/ip2/ip2trace.c
drivers/char/dtlk.c
drivers/char/mwavedd.h
drivers/char/n_r3964
drivers/acpi/utilities/utdebug.c
drivers/cdrom/sbpcd.c
drivers/isdn/eicon/eicon_mod.c
drivers/scsi/gdth.c
drivers/scsi/megaraid.c
drivers/scsi/qlogicfc.c
drivers/scsi/qlogicisp.c
drivers/net/wavelan.c
drivers/net/skfp/hwmtm.c
drivers/net/pcmcia/wavelan_cs.c
drivers/net/wireless/orinoco.c
drivers/net/wireless/orinoco_cs.c
drivers/video/radeonfb.c
drivers/usb/pwc.h
drivers/usb/hpusbscsi.c
include/linux/jbd.h for fs/ext3/*.c and fs/jbd/*.c
net/irda/irnet/irnet.h
etc.

The above list is but a sample of the files containing actual code
implementing tracing and/or data collection engines. There are a great
many more files that already have trace points in them. A
simple "grep -r TRACE *" provides an interesting insight into the
number of subsystems requiring tracing, each implementing its own
scheme.
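
For a rough per-subsystem view of the same survey, the grep can be
approximated with a short script. This is only a sketch: the function
name count_trace_files() is made up for illustration, and it counts
files containing the string rather than matching lines, so its numbers
will not equal the raw grep output.

```python
import os
from collections import Counter


def count_trace_files(root, needle=b"TRACE"):
    """Count files under root that contain needle, per top-level directory.

    Files directly in root are counted under ".".
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = rel.split(os.sep)[0]
        for name in filenames:
            try:
                with open(os.path.join(dirpath, name), "rb") as fh:
                    if needle in fh.read():
                        counts[top] += 1
            except OSError:
                # Unreadable file (permissions, dangling symlink): skip it.
                continue
    return counts
```

Run against the top of a kernel tree, counts.most_common() lists the
top-level directories with the most files mentioning TRACE.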

It is time to provide a uniform tracing and high-throughput data
collection engine for all to use. LTT has already been field-tested
for these purposes and is very easily extended to include any
additional functionality required.

Any comments/thoughts are greatly appreciated.

Cheers,

Karim

John Levon wrote:
> 
> On Mon, Jul 08, 2002 at 10:52:36AM -0700, Linus Torvalds wrote:
> 
> >  - I'd associate each profiling event with a dentry/offset pair, simply
> >    because that's the highest-level thing that the kernel knows about and
> >    that is "static".
> 
> This makes sense, I think.
> 
> >  - I'd suggest that the profiler explicitly mark the dentries it wants
> >    profiled, so that the kernel can throw away events that we're not
> >    interested in. The marking function would return a cookie to user
> >    space, and increment the dentry count (along with setting the
> >    "profile" flag in the dentry)
> 
> For a system-wide profiler, this needs to be /all/ dentries that get
> mapped in with executable permissions, or we lose any mappings of shared
> libraries we don't know about etc. Essentially, oprofile wants samples
> against any dentry that gets mmap()ed with PROT_EXEC, so this marking
> would really need to happen at mmap() time. Missing out on any dentry
> profiles amounts to data loss in the system profile and has the
> potential to mislead.
> 
> >  - the "cookie" (which would most easily just be the kernel address of the
> >    dentry) would be the thing that we give to user-space (along with
> >    offset) on profile read. The user app can turn it back into a filename.
> >
> > Whether it is the original "mark this file for profiling" phase that saves
> > away the cookie<->filename association, or whether we also have a system
> > call for "return the path of this cookie", I don't much care about.
> > Details, details.
> >
> > Anyway, what would be the preferred interface from user level?
> 
> oprofile currently receives eip-pid pairs, along with the necessary
> syscall tracing info needed to reconstruct file offsets. The above
> scheme removes the dependency on the pid, but this also unfortunately
> throws away some useful information.
> 
> It is often useful to be able to separate out shared-library samples on
> a per-process (and/or per-application) basis. Any really useful profile
> buffer facility really needs to preserve this info, but just including
> the raw pid isn't going to work when user-space can't reconstruct the
> "name" of the pid (where "name" would be something like "/bin/bash") because
> the process exited in the meantime.
> 
> The same goes for kernel samples that happen in process context.
> 
> So this might work well in tandem with some global process-tree tracing
> scheme, but I don't know what form that might take ...
> 
> (Then there are kernel modules, but that's probably best served by
> patching modutils)
> 
> regards
> john
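
The cookie scheme described in the quoted message (mark a dentry, get a
cookie back, receive cookie/offset pairs on profile read, resolve the
cookie to a filename) can be modeled in user space as below. This is a
sketch under stated assumptions and not a proposed kernel interface:
the ProfileTable class is hypothetical, paths stand in for dentries,
and a counter stands in for the dentry's kernel address.

```python
import itertools


class ProfileTable:
    """Toy model of the mark/cookie/offset scheme; not a kernel API."""

    def __init__(self):
        self._next_cookie = itertools.count(1)  # stand-in for dentry addresses
        self._cookie_by_path = {}
        self._path_by_cookie = {}
        self.samples = []  # (cookie, offset) pairs, as handed to user space

    def mark(self, path):
        """Mark a file for profiling and return its cookie."""
        cookie = self._cookie_by_path.get(path)
        if cookie is None:
            cookie = next(self._next_cookie)
            self._cookie_by_path[path] = cookie
            self._path_by_cookie[cookie] = path
        return cookie

    def record(self, path, offset):
        """Record a sample; events for unmarked files are thrown away."""
        cookie = self._cookie_by_path.get(path)
        if cookie is None:
            return False
        self.samples.append((cookie, offset))
        return True

    def path_of(self, cookie):
        """Turn a cookie back into a filename."""
        return self._path_by_cookie[cookie]
```

The model also makes John's objection visible: a sample carries only a
cookie and an offset, so nothing in it says which process was running
when the sample was taken.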

===================================================
                 Karim Yaghmour
               karim@opersys.com
      Embedded and Real-Time Linux Expert
===================================================
