From: Ingo Molnar <mingo@kernel.org>
To: Prarit Bhargava <prarit@redhat.com>
Cc: linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>,
x86@kernel.org, Len Brown <len.brown@intel.com>,
Dasaratharaman Chandramouli
<dasaratharaman.chandramouli@intel.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>,
Denys Vlasenko <dvlasenk@redhat.com>,
Brian Gerst <brgerst@gmail.com>,
Arnaldo Carvalho de Melo <acme@infradead.org>
Subject: Re: [PATCH] x86, msr: Allow read access to /dev/cpu/X/msr
Date: Sat, 27 Jun 2015 10:39:21 +0200
Message-ID: <20150627083921.GA13074@gmail.com>
In-Reply-To: <20150627083354.GA12834@gmail.com>
* Ingo Molnar <mingo@kernel.org> wrote:
> So what's wrong with exposing them as a simplified PMU driver?
>
> That way we only expose the ones we want to - plus tooling can use all the rich
> perf features that can be used around this. (sampling, counting, call chains,
> etc.)
See the code below from Andy, which exposes a single MSR via perf. At the core
of the PMU driver is a single rdmsrl():
+static void aperfmperf_event_start(struct perf_event *event, int flags)
+{
+	u64 now;
+
+	rdmsrl(event->hw.event_base, now);
+	local64_set(&event->hw.prev_count, now);
+}
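To make the accounting around that single read concrete, here is a minimal user-space sketch of the start/update pair. It is not the driver code: struct fake_msr, struct counter_event, read_msr(), counter_start() and counter_update() are hypothetical names, and the "MSR" is just a variable standing in for a free-running hardware counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a free-running MSR such as APERF. */
struct fake_msr {
	uint64_t value;
};

/* Hypothetical per-event state, loosely modeled on prev_count/count. */
struct counter_event {
	const struct fake_msr *msr;
	uint64_t prev_count;	/* last raw value that was read */
	uint64_t count;		/* accumulated ticks */
};

/* Stands in for rdmsrl(): returns the current raw counter value. */
static uint64_t read_msr(const struct fake_msr *msr)
{
	return msr->value;
}

/* start: remember the current raw value as the new baseline. */
static void counter_start(struct counter_event *ev)
{
	assert(ev->msr != NULL);
	ev->prev_count = read_msr(ev->msr);
}

/* update: add the ticks since the previous read, then move the baseline. */
static void counter_update(struct counter_event *ev)
{
	uint64_t now = read_msr(ev->msr);
	uint64_t prev = ev->prev_count;

	ev->prev_count = now;
	/* Unsigned subtraction still yields the right delta across a wrap. */
	ev->count += now - prev;
}
```

The only state needed per event is the previous raw value, which is why such a driver can stay this small: start records a baseline, and every later read, stop or delete folds in the difference.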
Now I think what we really want is to expose not a single MSR but multiple MSRs in
a single driver, i.e. don't have one PMU driver per MSR, but have a driver that
allows the exposure of select MSRs as counters.
There should also be a vendor/family/model filter mechanism, so that certain MSRs
are only exposed on models that are known to support them, etc.
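One possible shape for such a driver, sketched in plain user-space C so it can be compiled and tried on its own: a single table of MSR events, each with a filter hook that decides whether the event is exposed on the running CPU. Everything here is an assumption for illustration (struct cpu_id, struct msr_event, the helper names, and in particular the model check in intel_model_3c_only(), which is not an authoritative support list). The MSR addresses are the architectural ones for TSC (0x10), MPERF (0xe7) and APERF (0xe8), plus SMI_COUNT (0x34) on Intel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU identity; a real driver would use boot_cpu_data. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD };

struct cpu_id {
	enum cpu_vendor vendor;
	unsigned int family;
	unsigned int model;
	bool has_aperfmperf;	/* stands in for X86_FEATURE_APERFMPERF */
};

/* One table entry per exposed MSR; the event config is the table index. */
struct msr_event {
	const char *name;
	uint32_t msr;					/* MSR address */
	bool (*supported)(const struct cpu_id *id);	/* filter hook */
};

static bool always(const struct cpu_id *id)
{
	(void)id;
	return true;
}

static bool needs_aperfmperf(const struct cpu_id *id)
{
	return id->has_aperfmperf;
}

/* Illustrative model filter only, not a real list of supporting models. */
static bool intel_model_3c_only(const struct cpu_id *id)
{
	return id->vendor == VENDOR_INTEL && id->family == 6 &&
	       id->model == 0x3c;
}

static const struct msr_event msr_events[] = {
	{ "tsc",   0x10, always },
	{ "aperf", 0xe8, needs_aperfmperf },
	{ "mperf", 0xe7, needs_aperfmperf },
	{ "smi",   0x34, intel_model_3c_only },
};

#define NR_MSR_EVENTS (sizeof(msr_events) / sizeof(msr_events[0]))

/* Map an event config to its entry; NULL plays the role of -ENOENT. */
static const struct msr_event *msr_event_lookup(const struct cpu_id *id,
						uint64_t config)
{
	assert(id != NULL);
	if (config >= NR_MSR_EVENTS)
		return NULL;
	if (!msr_events[config].supported(id))
		return NULL;
	return &msr_events[config];
}

/* Count how many events would be advertised on this CPU. */
static size_t msr_events_count(const struct cpu_id *id)
{
	size_t i, n = 0;

	for (i = 0; i < NR_MSR_EVENTS; i++)
		if (msr_events[i].supported(id))
			n++;
	return n;
}
```

With this layout, adding another MSR is one more table row plus, if needed, one more filter function, and the event_init path only has to reject configs for which the lookup fails.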
Thanks,
Ingo
----- Forwarded message from Andy Lutomirski <luto@kernel.org> -----
Date: Tue, 28 Apr 2015 14:25:37 -0700
From: Andy Lutomirski <luto@kernel.org>
To: Len Brown <len.brown@intel.com>, Peter Zijlstra <peterz@infradead.org>, "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: Paul Mackerras <paulus@samba.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Andy Lutomirski <luto@kernel.org>
Subject: [RFC] x86, perf: Add an aperfmperf driver
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
This driver seems a little bit silly, but I can imagine it being useful. For
example, I think that turbostat could do some of its work without being
root if we had a driver like this.
Thoughts? Would it make sense at all? Did I wire it up right? This is
the only PMU driver I've ever written, and it could have any number of
issues.
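For illustration, this is roughly how a user-space tool might read one of these counters through perf_event_open(2) instead of /dev/cpu/X/msr. It is a sketch under assumptions: it presumes this RFC driver is loaded and registered as "aperfmperf" with event configs 0 (aperf) and 1 (mperf) as in the patch below, and the helper names read_pmu_type(), open_counter() and read_aperfmperf() are made up for this example. Whether an unprivileged process may open a CPU-wide event also depends on the kernel.perf_event_paranoid setting.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the dynamic PMU type id that the kernel publishes in sysfs. */
static int read_pmu_type(const char *pmu_name)
{
	char path[128];
	int type = -1;
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/bus/event_source/devices/%s/type", pmu_name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}

/* Open a counting (non-sampling) CPU-wide event on the given PMU. */
static int open_counter(int pmu_type, uint64_t config, int cpu)
{
	struct perf_event_attr attr;

	assert(pmu_type >= 0);
	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = (uint32_t)pmu_type;
	attr.config = config;

	/* pid == -1 with cpu >= 0: count on that CPU regardless of task. */
	return (int)syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0);
}

/* Returns 0 and fills *value, or -1 if the PMU or event is unavailable. */
static int read_aperfmperf(uint64_t config, int cpu, uint64_t *value)
{
	int type = read_pmu_type("aperfmperf");	/* name assumed from the RFC */
	ssize_t n;
	int fd;

	if (type < 0)
		return -1;
	fd = open_counter(type, config, cpu);
	if (fd < 0)
		return -1;
	n = read(fd, value, sizeof(*value));
	close(fd);
	return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A real tool would keep the file descriptors open and read each counter twice across an interval, since the interesting quantity is the ratio of the APERF delta to the MPERF delta rather than either raw count.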
arch/x86/kernel/cpu/Makefile | 2 +
arch/x86/kernel/cpu/perf_event_aperfmperf.c | 119 ++++++++++++++++++++++++++++
2 files changed, 121 insertions(+)
create mode 100644 arch/x86/kernel/cpu/perf_event_aperfmperf.c
diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index 80091ae54c2b..fadc822efc90 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -45,6 +45,8 @@ obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE) += perf_event_intel_uncore.o \
perf_event_intel_uncore_snb.o \
perf_event_intel_uncore_snbep.o \
perf_event_intel_uncore_nhmex.o
+obj-$(CONFIG_CPU_SUP_INTEL) += perf_event_aperfmperf.o
+obj-$(CONFIG_CPU_SUP_AMD) += perf_event_aperfmperf.o
endif
diff --git a/arch/x86/kernel/cpu/perf_event_aperfmperf.c b/arch/x86/kernel/cpu/perf_event_aperfmperf.c
new file mode 100644
index 000000000000..6e6d113bd9ce
--- /dev/null
+++ b/arch/x86/kernel/cpu/perf_event_aperfmperf.c
@@ -0,0 +1,119 @@
+#include <linux/perf_event.h>
+
+#define APERFMPERF_EVENT_APERF 0
+#define APERFMPERF_EVENT_MPERF 1
+
+PMU_EVENT_ATTR_STRING(aperf, evattr_aperf, "event=0x00");
+PMU_EVENT_ATTR_STRING(mperf, evattr_mperf, "event=0x01");
+static struct attribute *events_attrs[] = {
+	&evattr_aperf.attr.attr,
+	&evattr_mperf.attr.attr,
+	NULL,
+};
+static struct attribute_group events_attr_group = {
+	.name = "events",
+	.attrs = events_attrs,
+};
+
+PMU_FORMAT_ATTR(event, "config:0-63");
+static struct attribute *format_attrs[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+static struct attribute_group format_attr_group = {
+	.name = "format",
+	.attrs = format_attrs,
+};
+
+static const struct attribute_group *attr_groups[] = {
+	&events_attr_group,
+	&format_attr_group,
+	NULL,
+};
+
+static int aperfmperf_event_init(struct perf_event *event)
+{
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	if (event->attr.config != APERFMPERF_EVENT_APERF &&
+	    event->attr.config != APERFMPERF_EVENT_MPERF)
+		return -ENOENT;
+
+	if (event->attr.config1 != 0)
+		return -ENOENT;
+
+	/* no sampling */
+	if (event->hw.sample_period)
+		return -EINVAL;
+
+	/* unsupported modes and filters */
+	if (event->attr.exclude_user ||
+	    event->attr.exclude_kernel ||
+	    event->attr.exclude_hv ||
+	    event->attr.exclude_idle ||
+	    event->attr.exclude_host ||
+	    event->attr.exclude_guest ||
+	    event->attr.freq ||
+	    event->attr.sample_period) /* no sampling */
+		return -EINVAL;
+
+	event->hw.idx = -1;
+	event->hw.event_base = (event->attr.config == APERFMPERF_EVENT_APERF ?
+				MSR_IA32_APERF : MSR_IA32_MPERF);
+
+	return 0;
+}
+
+static void aperfmperf_event_update(struct perf_event *event)
+{
+	u64 prev;
+	u64 now;
+
+	rdmsrl(event->hw.event_base, now);
+	prev = local64_xchg(&event->hw.prev_count, now);
+	local64_add(now - prev, &event->count);
+}
+
+static void aperfmperf_event_start(struct perf_event *event, int flags)
+{
+	u64 now;
+
+	rdmsrl(event->hw.event_base, now);
+	local64_set(&event->hw.prev_count, now);
+}
+
+static void aperfmperf_event_stop_or_del(struct perf_event *event, int flags)
+{
+	aperfmperf_event_update(event);
+}
+
+static int aperfmperf_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		aperfmperf_event_start(event, flags);
+
+	return 0;
+}
+
+static struct pmu pmu_aperfmperf = {
+	.task_ctx_nr = perf_invalid_context,
+	.attr_groups = attr_groups,
+	.event_init = aperfmperf_event_init,
+	.add = aperfmperf_event_add,
+	.del = aperfmperf_event_stop_or_del,
+	.start = aperfmperf_event_start,
+	.stop = aperfmperf_event_stop_or_del,
+	.read = aperfmperf_event_update,
+};
+
+static int __init aperfmperf_init(void)
+{
+	if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
+		return -ENODEV;
+
+	perf_pmu_register(&pmu_aperfmperf, "aperfmperf", -1);
+
+	return 0;
+}
+device_initcall(aperfmperf_init);
--
2.3.0
----- End forwarded message -----