From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, Michael Ellerman <mpe@ellerman.id.au>,
Anton Blanchard <anton@au1.ibm.com>,
Stephane Eranian <eranian@google.com>,
Jiri Olsa <jolsa@redhat.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH] perf: Implement read_group() PMU operation
Date: Tue, 17 Feb 2015 00:33:12 -0800 [thread overview]
Message-ID: <20150217083312.GA16221@us.ibm.com> (raw)
In-Reply-To: <20150212155856.GC21418@twins.programming.kicks-ass.net>
Peter Zijlstra [peterz@infradead.org] wrote:
| > --- a/kernel/events/core.c
| > +++ b/kernel/events/core.c
| > @@ -3549,10 +3549,43 @@ static int perf_event_read_group(struct perf_event *event,
|
| You also want perf_output_read_group().
Ok. Will look into it. We currently don't support sampling with
the 24x7 counters but we should make sure that the new interface
fits correctly.
|
| > struct perf_event *leader = event->group_leader, *sub;
| > int n = 0, size = 0, ret = -EFAULT;
| > struct perf_event_context *ctx = leader->ctx;
| > + u64 *valuesp;
| > u64 values[5];
| > + int use_group_read;
| > u64 count, enabled, running;
| > + struct pmu *pmu = event->pmu;
| > +
| > + /*
| > + * If PMU supports group read and group read is requested,
| > + * allocate memory before taking the mutex.
| > + */
| > + use_group_read = 0;
| > + if ((read_format & PERF_FORMAT_GROUP) && pmu->read_group) {
| > + use_group_read++;
| > + }
| > +
| > + if (use_group_read) {
| > + valuesp = kzalloc(leader->nr_siblings * sizeof(u64), GFP_KERNEL);
| > + if (!valuesp)
| > + return -ENOMEM;
| > + }
|
| This seems 'sad', the hardware already knows how many it can maximally
| use at once and can preallocate, right?
Yeah :-) In a subsequent version, I got rid of the allocation by moving
the copy_to_user() into the PMU, but still needed to walk the sibling list.
|
| >
| > mutex_lock(&ctx->mutex);
| > +
| > + if (use_group_read) {
| > + ret = pmu->read_group(leader, valuesp, leader->nr_siblings);
| > + if (ret >= 0) {
| > + size = ret * sizeof(u64);
| > +
| > + ret = size;
| > + if (copy_to_user(buf, valuesp, size))
| > + ret = -EFAULT;
| > + }
| > +
| > + kfree(valuesp);
| > + goto unlock;
| > + }
| > +
| > count = perf_event_read_value(leader, &enabled, &running);
| >
| > values[n++] = 1 + leader->nr_siblings;
|
| Since ->read() has a void return value, we can delay its effect, so I'm
| currently thinking we might want to extend the transaction interface for
| this; give pmu::start_txn() a flags argument to indicate scheduling
| (add) or reading (read).
|
| So we'd end up with something like:
|
| pmu->start_txn(pmu, PMU_TXN_READ);
|
| leader->read();
|
| for_each_sibling()
| sibling->read();
|
| pmu->commit_txn();
So, each of the ->read() calls is really "appending a counter" to a
list of counters that the PMU should read and the values for the counters
(results of the read) are only available after the commit_txn() right?
In which case, perf_event_read_group() would then follow this commit_txn()
with its "normal" read, and the PMU would return the result cached during
->commit_txn(). If so, we need a way to invalidate the cached result ?
|
| after which we can use the values updated by the read calls. The trivial
| no-support implementation lets read do its immediate thing like it does
| now.
|
| A more complex driver can then collect the actual counter values and
| execute one hypercall using its pre-allocated memory.
the hypercall should happen in the ->commit_txn() right ?
|
| So no allocations in the core code, and no sibling iterations in the
| driver code.
|
| Would that work for you?
I think it would,
I am working on breaking up the underlying code along start/read/commit
lines, and hope to have it done later this week and then can experiment
more with this interface.
Appreciate the input.
Sukadev
next prev parent reply other threads:[~2015-02-17 8:33 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-06 2:59 [RFC][PATCH] perf: Implement read_group() PMU operation Sukadev Bhattiprolu
2015-02-12 15:58 ` Peter Zijlstra
2015-02-17 8:33 ` Sukadev Bhattiprolu [this message]
2015-02-17 10:03 ` Peter Zijlstra
2015-02-22 21:04 ` Cody P Schafer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150217083312.GA16221@us.ibm.com \
--to=sukadev@linux.vnet.ibm.com \
--cc=acme@kernel.org \
--cc=anton@au1.ibm.com \
--cc=eranian@google.com \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=mpe@ellerman.id.au \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox