Re: [RFC PATCH] perf_core: provide a kernel-internal interface to get to performance counters

From: Ingo Molnar <mingo@elte.hu>
To: "Frédéric Weisbecker" <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"K.Prasad" <prasad@linux.vnet.ibm.com>,
	Arjan van de Ven <arjan@infradead.org>,
	"Frank Ch. Eigler" <fche@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] perf_core: provide a kernel-internal interface to get to performance counters
Date: Mon, 5 Oct 2009 11:48:49 +0200
Message-ID: <20091005094849.GA10620@elte.hu>
In-Reply-To: <c62985530910050224u5e755808jc2a25c3dd5c172da@mail.gmail.com>

* Frédéric Weisbecker <fweisbec@gmail.com> wrote:

> 2009/10/5 Ingo Molnar <mingo@elte.hu>:
> >
> > * Peter Zijlstra <peterz@infradead.org> wrote:
> >> Non-trivial.
> >>
> >> Something like this would imply a single output channel for all these
> >> CPUs, and we've already seen that stuffing too many CPUs down one such
> >> channel (using -M) leads to significant performance issues.
> >
> > We could add internal per cpu buffering before it hits any globally 
> > visible output channel. (That has come up when i talked to Frederic 
> > about the function tracer.) We could even have page sized output 
> > (via the introduction of a NOP event that fills up to the next page 
> > edge).
> 
> That looks good for the counting/sampling fast path, but would that 
> scale once it comes to reordering in the globally visible output 
> channel? Such a union has its costs.

Well, reordering always has a cost, and we have multiple models 
regarding where to put that cost.

The first model is 'everything is per cpu' - i.e. completely separate 
event buffers and the reordering is pushed to the user-space 
post-processing stage. This is the most scalable solution - but it can 
also lose information such as the true ordering of events.
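
As an illustration of the user-space post-processing stage in this model, here is a minimal sketch of merging per-cpu event streams by timestamp. The record layout, the names (struct event, struct stream, merge_streams) and the assumption that timestamps are comparable across CPUs are all hypothetical and not taken from the perf code. If the clocks are not synchronized, this merge cannot recover the true ordering, which is the information loss mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record; the layout is an assumption for illustration. */
struct event {
	uint64_t timestamp;	/* assumed to be comparable across CPUs */
	int cpu;
	int id;
};

/* One per-cpu stream, already ordered by timestamp within itself. */
struct stream {
	const struct event *events;
	size_t count;
	size_t pos;		/* index of the next unread event */
};

/*
 * Merge nr_streams per-cpu streams into out[] in timestamp order and
 * return the number of events written. Ties go to the lower stream index.
 * A linear scan over the streams keeps the sketch short; a heap would
 * scale better with many CPUs.
 */
size_t merge_streams(struct stream *streams, size_t nr_streams,
		     struct event *out, size_t out_cap)
{
	size_t written = 0;

	assert(streams && out);

	while (written < out_cap) {
		struct stream *best = NULL;

		for (size_t i = 0; i < nr_streams; i++) {
			struct stream *s = &streams[i];

			if (s->pos == s->count)
				continue;	/* this stream is drained */
			if (!best || s->events[s->pos].timestamp <
				     best->events[best->pos].timestamp)
				best = s;
		}
		if (!best)
			break;			/* all streams are drained */
		out[written++] = best->events[best->pos++];
	}
	return written;
}
```

The kernel side stays entirely per cpu in this sketch; all of the reordering cost is paid here, once per event, by whoever consumes the data.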

The second model is 'event multiplexing' - here we use a single output 
buffer for events. This serializes all output on the same buffer and 
hence is the least scalable one. But it is the easiest one to use: just a 
single channel of output to deal with. It is also the most precise 
solution and it saves the post-processing stage from reordering hassles.
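
To make the serialization point concrete, here is a rough sketch of a single shared output buffer in which every writer reserves space through one atomic head index. It uses C11 atomics in user space purely for illustration; the names (struct shared_buf, shared_buf_init, shared_buf_write) are hypothetical and this is not how the perf output code is implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SHARED_BUF_SIZE 4096	/* arbitrary size chosen for the sketch */

/* Single output buffer shared by all CPUs (hypothetical layout). */
struct shared_buf {
	atomic_size_t head;	/* next free byte; every writer contends here */
	unsigned char data[SHARED_BUF_SIZE];
};

void shared_buf_init(struct shared_buf *buf)
{
	atomic_init(&buf->head, 0);
}

/*
 * Reserve len bytes and copy the record in. Returns the offset of the
 * record, or -1 if it does not fit. The reservation order on head is the
 * global event order, so the reader needs no reordering pass.
 */
long shared_buf_write(struct shared_buf *buf, const void *rec, size_t len)
{
	size_t old = atomic_load(&buf->head);

	assert(buf && rec);

	do {
		if (len > SHARED_BUF_SIZE - old)
			return -1;	/* buffer full, nothing reserved */
	} while (!atomic_compare_exchange_weak(&buf->head, &old, old + len));

	memcpy(buf->data + old, rec, len);
	return (long)old;
}
```

Every event from every CPU goes through the same head word, which is where the scalability cost of this model comes from.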

What i suggested above is a third model: 'short-term per cpu, 
multiplexed into an output channel with page granularity'. It has the 
advantage of being per cpu on a page-granular basis, and the ease of 
use of having only a single output channel.
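
A minimal sketch of this third model, under stated assumptions: each CPU stages records in its own page-sized buffer, and when the next record would not fit, the rest of the page is filled with a single NOP record and the whole page is published to the shared channel. The record header, the 4-byte alignment rule, the 4096-byte page size and all names here (struct rec_hdr, struct cpu_stage, struct page_channel, stage_write, stage_flush) are hypothetical. A real multi-CPU version would also need an atomic reservation of the channel's page slot, which the sketch omits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u		/* assumed page size */
#define EV_NOP 0u		/* hypothetical padding record type */

/* Hypothetical record header; records are padded to 4-byte multiples. */
struct rec_hdr {
	uint16_t type;
	uint16_t size;		/* total record size, header included */
};

/* Per-cpu staging page: only its owning CPU writes here. */
struct cpu_stage {
	unsigned char page[PAGE_SIZE];
	size_t used;
};

/* Shared output channel, consumed in units of whole pages. */
struct page_channel {
	unsigned char *pages;	/* nr_pages * PAGE_SIZE bytes */
	size_t nr_pages;
	size_t next_page;	/* would need atomic reservation across CPUs */
};

/* Pad the staging page with one NOP record and publish the whole page. */
int stage_flush(struct cpu_stage *st, struct page_channel *ch)
{
	size_t left = PAGE_SIZE - st->used;

	if (st->used == 0)
		return 0;		/* nothing staged */
	if (ch->next_page == ch->nr_pages)
		return -1;		/* channel full */
	if (left) {
		/* left is a multiple of 4, so the NOP header always fits */
		struct rec_hdr nop = { EV_NOP, (uint16_t)left };

		memset(st->page + st->used, 0, left);
		memcpy(st->page + st->used, &nop, sizeof(nop));
	}
	memcpy(ch->pages + ch->next_page * PAGE_SIZE, st->page, PAGE_SIZE);
	ch->next_page++;
	st->used = 0;
	return 0;
}

/* Stage one record, flushing the page first if the record does not fit. */
int stage_write(struct cpu_stage *st, struct page_channel *ch,
		uint16_t type, const void *payload, size_t len)
{
	size_t total = (sizeof(struct rec_hdr) + len + 3) & ~(size_t)3;
	struct rec_hdr hdr;

	assert(total <= PAGE_SIZE);
	if (st->used + total > PAGE_SIZE && stage_flush(st, ch) < 0)
		return -1;

	hdr.type = type;
	hdr.size = (uint16_t)total;
	memcpy(st->page + st->used, &hdr, sizeof(hdr));
	memcpy(st->page + st->used + sizeof(hdr), payload, len);
	memset(st->page + st->used + sizeof(hdr) + len, 0,
	       total - sizeof(hdr) - len);
	st->used += total;
	return 0;
}
```

In this sketch the shared channel is touched once per page rather than once per event, and events within a page keep their per-cpu order. Ordering across pages from different CPUs is only as precise as the order in which the pages were published.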

None of these solutions can eliminate the costs and tradeoffs involved. 
What they do is to offer an app a spectrum to choose from.

	Ingo
