From: Ingo Molnar <mingo@elte.hu>
To: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
linux-arch@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
Stephane Eranian <eranian@googlemail.com>,
Eric Dumazet <dada1@cosmosbay.com>,
Robert Richter <robert.richter@amd.com>,
Arjan van de Ven <arjan@infradead.org>,
Peter Anvin <hpa@zytor.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Steven Rostedt <rostedt@goodmis.org>,
David Miller <davem@davemloft.net>
Subject: Re: [patch 0/3] [Announcement] Performance Counters for Linux
Date: Fri, 5 Dec 2008 09:08:13 +0100
Message-ID: <20081205080813.GA2030@elte.hu>
In-Reply-To: <18744.56857.259756.129894@cargo.ozlabs.ibm.com>
* Paul Mackerras <paulus@samba.org> wrote:
> Ingo Molnar writes:
> >
> > * Paul Mackerras <paulus@samba.org> wrote:
> [snip]
> > > One thing that this sort of thing can't do is to get values from
> > > multiple counters that correlate with each other. For instance, we
> > > would often want to count, say, L2 cache misses and instructions
> > > completed at the same time, and be able to read both counters at very
> > > close to the same time, so that we can measure average L2 cache misses
> > > per instruction completed, which is useful.
> >
> > This can be done in a very natural way with our abstraction, and the
> > "hello.c" example happens to do exactly that:
>
> Has hello.c been posted? I can't find it in any of the posts from you
> or Thomas. Am I just being blind? :)
Sorry, it was late at night when we did the release - monitor.c was
posted - and i just posted hello.c half an hour ago :)
> > aldebaran:~/perf-counter-test> ./hello
> > doing perf_counter_open() call:
> > counter[0]... fd: 3.
> > counter[1]... fd: 4.
> > counter[0] delta: 10866 cycles
> > counter[1] delta: 414 cycles
> > counter[0] delta: 23640 cycles
> > counter[1] delta: 3673 cycles
> > counter[0] delta: 28225 cycles
> > counter[1] delta: 3695 cycles
> >
> > This counts cycles executed and instructions executed, and reads the two
> > counters out at the same time.
>
> Isn't it two separate read() calls to read the two counters? If so,
> the only way the two values are actually going to correspond to the
> same point in time is if the task being monitored is stopped. In which
> case the monitoring task needs to use ptrace or something similar in
> order to make sure that the monitored task is actually stopped.
It doesn't matter in practice.
Also, look at our code: we buffer notification events and do not have to
stop the thread for recording the context information.
Also, if you _do_ care about getting immediate readouts, the _monitoring_
task can be set to higher priority. (not that i'd advocate it in general:
any task stopping or scheduling can destroy a workload's true behavior)
> If the monitored task is not stopped, then the interval between the two
> reads will be sufficient to render the results useless - particularly
> since the monitoring task could get preempted for an arbitrary length
> of time between the two reads. But even if it doesn't, the hundreds of
> cycles between the two reads will introduce considerable imprecision in
> the results.
Even if the two read()s are done apart, stopping a task is _far_ more
intrusive to the event flow of a single application. Most workloads are
multithreaded - so stopping a task causes another task to be scheduled
in, which would not have occurred were the profiling more transparent and
less intrusive.
Furthermore, even for the special case of single task monitoring, a
context-switch is more expensive than a system call.
Furthermore, in most of the practical cases there are very few events
happening between two read()s. The profiling interval is a couple of
orders of magnitude longer than the interval between two read()s.
This 'task has to be stopped' aspect is a red herring that has no
technical basis.
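
To put rough numbers on the interval argument above, here is a minimal
sketch in C. The helper name skew_fraction() and the sample figures used
with it (a 500-cycle gap between the two read()s, a 50,000-cycle profiling
interval) are illustrative assumptions, not measurements from hello.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the fraction of the profiling interval that
 * falls into the gap between two back-to-back read() calls. This is
 * a crude upper bound on how far the two counters can drift apart,
 * assuming events arrive at a roughly steady rate.
 */
static double skew_fraction(uint64_t gap_cycles, uint64_t interval_cycles)
{
    assert(interval_cycles > 0);    /* avoid dividing by zero */
    return (double)gap_cycles / (double)interval_cycles;
}
```

With those assumed figures, skew_fraction(500, 50000) is 0.01: the gap
covers about 1% of the interval, and a longer profiling interval shrinks
it proportionally.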
> There really is value in being able to read all the counters you're
> using in one system call.
It's possible with our code too: what you are asking for is in essence a
sys_read_fds() system call extension - a bit like readv(), but from a
vector of separate fds.
This kind of 'group system call facility' has been suggested several
times in the past - but ... never got anywhere because system calls are
cheap enough that it really does not matter in practice.
It could be implemented, and note that because our code uses a proper
Linux file descriptor abstraction, such a sys_read_fds() facility would
help _other_ applications as well, not just performance counters.
But it brings complications: demultiplexing of error conditions on
individual counters is a real pain with any compound abstraction. We very
consciously went with the 'one fd, one object, one counter' design.
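
As a rough illustration of the error-demultiplexing point, here is a
userspace stand-in for such a facility, written in C. No sys_read_fds()
exists; read_fds() and struct fd_result are hypothetical names, and the
sketch assumes each fd yields one 64-bit value per read(). It simply
issues one read() per fd, so it does not make the readout atomic.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical per-fd result: the value read, or why the read failed. */
struct fd_result {
    uint64_t value;
    int err;    /* 0 on success, errno from read(), or EIO on a short read */
};

/*
 * Userspace stand-in for a hypothetical sys_read_fds(): one read() per
 * fd. Every fd needs its own error slot, because a single return value
 * cannot say which of the fds failed. Returns the number of fds that
 * were read successfully.
 */
static size_t read_fds(const int *fds, struct fd_result *res, size_t n)
{
    size_t ok = 0;

    for (size_t i = 0; i < n; i++) {
        ssize_t got = read(fds[i], &res[i].value, sizeof(res[i].value));

        if (got == (ssize_t)sizeof(res[i].value)) {
            res[i].err = 0;
            ok++;
        } else {
            res[i].err = (got < 0) ? errno : EIO;
            res[i].value = 0;
        }
    }
    return ok;
}
```

A caller has to walk the result array and check err for each fd, which
is the extra bookkeeping that the one-fd-per-counter design avoids.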
Ingo