* [PATCH v4 1/4] perf, x86: Implement IBS event configuration
2011-12-15 16:56 [PATCH v4 0/4] perf, x86: Implement AMD IBS Robert Richter
@ 2011-12-15 16:56 ` Robert Richter
2012-03-08 12:22 ` [tip:perf/x86-ibs] perf/x86: " tip-bot for Robert Richter
2011-12-15 16:56 ` [PATCH v4 2/4] perf, x86: Implement IBS interrupt handler Robert Richter
` (3 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Robert Richter @ 2011-12-15 16:56 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, Stephane Eranian, LKML, Robert Richter
This patch implements perf configuration for AMD IBS. The IBS pmu is
selected using the type attribute in sysfs. There are two types of ibs
pmus, for instruction fetch (IBS_FETCH) and for instruction execution
(IBS_OP):
/sys/bus/event_source/devices/ibs_fetch/type
/sys/bus/event_source/devices/ibs_op/type
Except for the sample period, IBS can only be set up with raw config
values and raw data samples. The event attributes for the syscall
should be programmed like this (IBS_FETCH):
type = get_pmu_type("/sys/bus/event_source/devices/ibs_fetch/type");
memset(&attr, 0, sizeof(attr));
attr.type = type;
attr.sample_type = PERF_SAMPLE_CPU | PERF_SAMPLE_RAW;
attr.config = IBS_FETCH_CONFIG_DEFAULT;
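The helper get_pmu_type() used above is not defined anywhere in this patch. A minimal user-space sketch of what it might look like follows, assuming the sysfs "type" file holds a single decimal integer; the function body and its -1 error convention are illustrative assumptions, not part of the patch:

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): read the dynamic PMU
 * type id that the kernel exports as a decimal number in a sysfs
 * "type" file. Returns the type (>= 0) on success, -1 on any error.
 */
int get_pmu_type(const char *path)
{
	FILE *f = fopen(path, "r");
	int type;

	if (!f)
		return -1;	/* PMU not registered or sysfs not mounted */
	if (fscanf(f, "%d", &type) != 1 || type < 0)
		type = -1;	/* file did not contain a valid id */
	fclose(f);
	return type;
}
```

A negative return value would indicate that the ibs_fetch pmu is not available on the running kernel, in which case attr.type should not be filled in.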
This implementation does not yet support 64 bit counters. It is
limited to the hardware counter bit width which is 20 bits. 64 bit
support can be added later.
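The period limits that follow from the 20 bit counter can be sketched as below. This mirrors the checks in perf_ibs_init() in user-space C; the function name is hypothetical and the 16 bit mask value is an assumption about IBS_FETCH_MAX_CNT and IBS_OP_MAX_CNT, which this patch uses but does not define:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed width of the MaxCnt field: 16 bits holding count bits 19:4. */
#define IBS_MAX_CNT_MASK 0x0000FFFFULL

/*
 * Hypothetical mirror of the sample-period checks in perf_ibs_init().
 * Returns 0 and stores the MaxCnt field value on success, -1 if the
 * period cannot be programmed into the hardware.
 */
int ibs_period_to_max_cnt(uint64_t period, uint64_t *max_cnt)
{
	if (period == 0)
		return -1;	/* a zero max count is rejected */
	if (period & 0x0f)
		return -1;	/* lower 4 bits can not be set */
	if ((period >> 4) & ~IBS_MAX_CNT_MASK)
		return -1;	/* out of range for a 20 bit counter */
	*max_cnt = period >> 4;
	return 0;
}
```

Under these assumptions the largest usable sample period is 0xFFFF0, and every valid period is a multiple of 16.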
V3:
* disable per-task monitoring (mark pmu with perf_invalid_context),
per-task monitoring can be added in a separate patch
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 92 +++++++++++++++++++++++++++--
1 files changed, 85 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 3b8a2d3..36684eb 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -16,12 +16,67 @@ static u32 ibs_caps;
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD)
-static struct pmu perf_ibs;
+#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
+#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
+
+struct perf_ibs {
+ struct pmu pmu;
+ unsigned int msr;
+ u64 config_mask;
+ u64 cnt_mask;
+ u64 enable_mask;
+};
+
+static struct perf_ibs perf_ibs_fetch;
+static struct perf_ibs perf_ibs_op;
+
+static struct perf_ibs *get_ibs_pmu(int type)
+{
+ if (perf_ibs_fetch.pmu.type == type)
+ return &perf_ibs_fetch;
+ if (perf_ibs_op.pmu.type == type)
+ return &perf_ibs_op;
+ return NULL;
+}
static int perf_ibs_init(struct perf_event *event)
{
- if (perf_ibs.type != event->attr.type)
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_ibs *perf_ibs;
+ u64 max_cnt, config;
+
+ perf_ibs = get_ibs_pmu(event->attr.type);
+ if (!perf_ibs)
return -ENOENT;
+
+ config = event->attr.config;
+ if (config & ~perf_ibs->config_mask)
+ return -EINVAL;
+
+ if (hwc->sample_period) {
+ if (config & perf_ibs->cnt_mask)
+ /* raw max_cnt may not be set */
+ return -EINVAL;
+ if (hwc->sample_period & 0x0f)
+ /* lower 4 bits can not be set in ibs max cnt */
+ return -EINVAL;
+ max_cnt = hwc->sample_period >> 4;
+ if (max_cnt & ~perf_ibs->cnt_mask)
+ /* out of range */
+ return -EINVAL;
+ config |= max_cnt;
+ } else {
+ max_cnt = config & perf_ibs->cnt_mask;
+ event->attr.sample_period = max_cnt << 4;
+ hwc->sample_period = event->attr.sample_period;
+ }
+
+ if (!max_cnt)
+ return -EINVAL;
+
+ hwc->config_base = perf_ibs->msr;
+ hwc->config = config;
+
return 0;
}
@@ -34,10 +89,32 @@ static void perf_ibs_del(struct perf_event *event, int flags)
{
}
-static struct pmu perf_ibs = {
- .event_init= perf_ibs_init,
- .add= perf_ibs_add,
- .del= perf_ibs_del,
+static struct perf_ibs perf_ibs_fetch = {
+ .pmu = {
+ .task_ctx_nr = perf_invalid_context,
+
+ .event_init = perf_ibs_init,
+ .add = perf_ibs_add,
+ .del = perf_ibs_del,
+ },
+ .msr = MSR_AMD64_IBSFETCHCTL,
+ .config_mask = IBS_FETCH_CONFIG_MASK,
+ .cnt_mask = IBS_FETCH_MAX_CNT,
+ .enable_mask = IBS_FETCH_ENABLE,
+};
+
+static struct perf_ibs perf_ibs_op = {
+ .pmu = {
+ .task_ctx_nr = perf_invalid_context,
+
+ .event_init = perf_ibs_init,
+ .add = perf_ibs_add,
+ .del = perf_ibs_del,
+ },
+ .msr = MSR_AMD64_IBSOPCTL,
+ .config_mask = IBS_OP_CONFIG_MASK,
+ .cnt_mask = IBS_OP_MAX_CNT,
+ .enable_mask = IBS_OP_ENABLE,
};
static __init int perf_event_ibs_init(void)
@@ -45,7 +122,8 @@ static __init int perf_event_ibs_init(void)
if (!ibs_caps)
return -ENODEV; /* ibs not supported by the cpu */
- perf_pmu_register(&perf_ibs, "ibs", -1);
+ perf_pmu_register(&perf_ibs_fetch.pmu, "ibs_fetch", -1);
+ perf_pmu_register(&perf_ibs_op.pmu, "ibs_op", -1);
printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps);
return 0;
--
1.7.7
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:perf/x86-ibs] perf/x86: Implement IBS event configuration
2011-12-15 16:56 ` [PATCH v4 1/4] perf, x86: Implement IBS event configuration Robert Richter
@ 2012-03-08 12:22 ` tip-bot for Robert Richter
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Robert Richter @ 2012-03-08 12:22 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, eranian, hpa, mingo, robert.richter, peterz, tglx,
mingo
Commit-ID: 510419435c6948fb32959d691bf84eaba41ca474
Gitweb: http://git.kernel.org/tip/510419435c6948fb32959d691bf84eaba41ca474
Author: Robert Richter <robert.richter@amd.com>
AuthorDate: Thu, 15 Dec 2011 17:56:36 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 8 Mar 2012 11:35:21 +0100
perf/x86: Implement IBS event configuration
This patch implements perf configuration for AMD IBS. The IBS
pmu is selected using the type attribute in sysfs. There are two
types of ibs pmus, for instruction fetch (IBS_FETCH) and for
instruction execution (IBS_OP):
/sys/bus/event_source/devices/ibs_fetch/type
/sys/bus/event_source/devices/ibs_op/type
Except for the sample period, IBS can only be set up with raw
config values and raw data samples. The event attributes for the
syscall should be programmed like this (IBS_FETCH):
type = get_pmu_type("/sys/bus/event_source/devices/ibs_fetch/type");
memset(&attr, 0, sizeof(attr));
attr.type = type;
attr.sample_type = PERF_SAMPLE_CPU | PERF_SAMPLE_RAW;
attr.config = IBS_FETCH_CONFIG_DEFAULT;
This implementation does not yet support 64 bit counters. It is
limited to the hardware counter bit width which is 20 bits. 64
bit support can be added later.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323968199-9326-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 92 +++++++++++++++++++++++++++--
1 files changed, 85 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 3b8a2d3..36684eb 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -16,12 +16,67 @@ static u32 ibs_caps;
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD)
-static struct pmu perf_ibs;
+#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
+#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
+
+struct perf_ibs {
+ struct pmu pmu;
+ unsigned int msr;
+ u64 config_mask;
+ u64 cnt_mask;
+ u64 enable_mask;
+};
+
+static struct perf_ibs perf_ibs_fetch;
+static struct perf_ibs perf_ibs_op;
+
+static struct perf_ibs *get_ibs_pmu(int type)
+{
+ if (perf_ibs_fetch.pmu.type == type)
+ return &perf_ibs_fetch;
+ if (perf_ibs_op.pmu.type == type)
+ return &perf_ibs_op;
+ return NULL;
+}
static int perf_ibs_init(struct perf_event *event)
{
- if (perf_ibs.type != event->attr.type)
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_ibs *perf_ibs;
+ u64 max_cnt, config;
+
+ perf_ibs = get_ibs_pmu(event->attr.type);
+ if (!perf_ibs)
return -ENOENT;
+
+ config = event->attr.config;
+ if (config & ~perf_ibs->config_mask)
+ return -EINVAL;
+
+ if (hwc->sample_period) {
+ if (config & perf_ibs->cnt_mask)
+ /* raw max_cnt may not be set */
+ return -EINVAL;
+ if (hwc->sample_period & 0x0f)
+ /* lower 4 bits can not be set in ibs max cnt */
+ return -EINVAL;
+ max_cnt = hwc->sample_period >> 4;
+ if (max_cnt & ~perf_ibs->cnt_mask)
+ /* out of range */
+ return -EINVAL;
+ config |= max_cnt;
+ } else {
+ max_cnt = config & perf_ibs->cnt_mask;
+ event->attr.sample_period = max_cnt << 4;
+ hwc->sample_period = event->attr.sample_period;
+ }
+
+ if (!max_cnt)
+ return -EINVAL;
+
+ hwc->config_base = perf_ibs->msr;
+ hwc->config = config;
+
return 0;
}
@@ -34,10 +89,32 @@ static void perf_ibs_del(struct perf_event *event, int flags)
{
}
-static struct pmu perf_ibs = {
- .event_init= perf_ibs_init,
- .add= perf_ibs_add,
- .del= perf_ibs_del,
+static struct perf_ibs perf_ibs_fetch = {
+ .pmu = {
+ .task_ctx_nr = perf_invalid_context,
+
+ .event_init = perf_ibs_init,
+ .add = perf_ibs_add,
+ .del = perf_ibs_del,
+ },
+ .msr = MSR_AMD64_IBSFETCHCTL,
+ .config_mask = IBS_FETCH_CONFIG_MASK,
+ .cnt_mask = IBS_FETCH_MAX_CNT,
+ .enable_mask = IBS_FETCH_ENABLE,
+};
+
+static struct perf_ibs perf_ibs_op = {
+ .pmu = {
+ .task_ctx_nr = perf_invalid_context,
+
+ .event_init = perf_ibs_init,
+ .add = perf_ibs_add,
+ .del = perf_ibs_del,
+ },
+ .msr = MSR_AMD64_IBSOPCTL,
+ .config_mask = IBS_OP_CONFIG_MASK,
+ .cnt_mask = IBS_OP_MAX_CNT,
+ .enable_mask = IBS_OP_ENABLE,
};
static __init int perf_event_ibs_init(void)
@@ -45,7 +122,8 @@ static __init int perf_event_ibs_init(void)
if (!ibs_caps)
return -ENODEV; /* ibs not supported by the cpu */
- perf_pmu_register(&perf_ibs, "ibs", -1);
+ perf_pmu_register(&perf_ibs_fetch.pmu, "ibs_fetch", -1);
+ perf_pmu_register(&perf_ibs_op.pmu, "ibs_op", -1);
printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps);
return 0;
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v4 2/4] perf, x86: Implement IBS interrupt handler
2011-12-15 16:56 [PATCH v4 0/4] perf, x86: Implement AMD IBS Robert Richter
2011-12-15 16:56 ` [PATCH v4 1/4] perf, x86: Implement IBS event configuration Robert Richter
@ 2011-12-15 16:56 ` Robert Richter
2012-03-08 12:23 ` [tip:perf/x86-ibs] perf/x86: " tip-bot for Robert Richter
2011-12-15 16:56 ` [PATCH v4 3/4] perf, x86: Implement IBS pmu control ops Robert Richter
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Robert Richter @ 2011-12-15 16:56 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, Stephane Eranian, LKML, Robert Richter
This patch implements code to handle ibs interrupts. If ibs data is
available, a raw perf_event data sample is created and sent back to
userland. This patch only implements the storage of ibs data in the
raw sample, but this could be extended in a later patch by generating
generic event data such as the rip from the ibs sampling data.
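On the consumer side, the raw sample payload built by the handler (a u32 caps word followed by the u64 register values) could be decoded roughly as follows. This is a user-space sketch only; struct ibs_sample and ibs_parse_raw() are hypothetical names, and it assumes the size passed in is exactly the raw.size set by the handler:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IBS_REG_COUNT_MAX 8	/* mirrors MSR_AMD64_IBS_REG_COUNT_MAX */

/* Hypothetical user-space view of one decoded IBS raw sample. */
struct ibs_sample {
	uint32_t caps;				/* copy of ibs_caps */
	size_t   nr_regs;			/* number of valid entries in regs[] */
	uint64_t regs[IBS_REG_COUNT_MAX];	/* regs[0] is the control MSR */
};

/*
 * Decode a raw payload laid out as: u32 caps, then N u64 registers.
 * Returns 0 on success, -1 if the size does not match that layout.
 */
int ibs_parse_raw(const void *data, size_t size, struct ibs_sample *out)
{
	const unsigned char *p = data;
	size_t payload;

	if (size < sizeof(uint32_t))
		return -1;
	payload = size - sizeof(uint32_t);
	if (payload % sizeof(uint64_t) ||
	    payload / sizeof(uint64_t) > IBS_REG_COUNT_MAX)
		return -1;

	/* memcpy avoids unaligned loads: the u64s start at byte offset 4 */
	memcpy(&out->caps, p, sizeof(uint32_t));
	out->nr_regs = payload / sizeof(uint64_t);
	memcpy(out->regs, p + sizeof(uint32_t), payload);
	return 0;
}
```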
V2:
* Added bit mask for msr offsets.
* Added caps field to raw sample format.
* Rebase on Don's NMI patches that introduce register_nmi_handler().
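How the bit mask for msr offsets drives the register reads can be illustrated with a small user-space model of the loop in perf_ibs_handle_irq(). next_bit() below is a simplified single-word stand-in for the kernel's find_next_bit(), and ibs_collect_offsets() is a hypothetical name; neither is part of the patch:

```c
#include <assert.h>

/*
 * Simplified single-word stand-in for find_next_bit(): index of the
 * first set bit at or after 'start', or 'size' if there is none.
 * 'size' must not exceed the number of bits in an unsigned long.
 */
unsigned int next_bit(unsigned long mask, unsigned int size,
		      unsigned int start)
{
	unsigned int i;

	for (i = start; i < size; i++)
		if (mask & (1UL << i))
			return i;
	return size;
}

/*
 * Record which MSR offsets the handler loop would read. Offset 0 (the
 * control register) is always read first, and offset 1 is always read
 * by the first loop iteration, as in the patch. Returns the number of
 * registers stored in the sample.
 */
int ibs_collect_offsets(unsigned long mask, unsigned int max,
			unsigned int *offsets)
{
	unsigned int offset = 1;
	int size = 1;

	offsets[0] = 0;
	do {
		offsets[size++] = offset;
		offset = next_bit(mask, max, offset + 1);
	} while (offset < max);

	return size;
}
```

With the fetch mask ((1UL << 3) - 1) and a maximum of 3, this yields offsets 0, 1 and 2, so three registers end up in the raw sample.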
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
arch/x86/include/asm/msr-index.h | 5 ++
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 84 ++++++++++++++++++++++++++++++
2 files changed, 89 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index a6962d9..4e3cd38 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -127,6 +127,8 @@
#define MSR_AMD64_IBSFETCHCTL 0xc0011030
#define MSR_AMD64_IBSFETCHLINAD 0xc0011031
#define MSR_AMD64_IBSFETCHPHYSAD 0xc0011032
+#define MSR_AMD64_IBSFETCH_REG_COUNT 3
+#define MSR_AMD64_IBSFETCH_REG_MASK ((1UL<<MSR_AMD64_IBSFETCH_REG_COUNT)-1)
#define MSR_AMD64_IBSOPCTL 0xc0011033
#define MSR_AMD64_IBSOPRIP 0xc0011034
#define MSR_AMD64_IBSOPDATA 0xc0011035
@@ -134,8 +136,11 @@
#define MSR_AMD64_IBSOPDATA3 0xc0011037
#define MSR_AMD64_IBSDCLINAD 0xc0011038
#define MSR_AMD64_IBSDCPHYSAD 0xc0011039
+#define MSR_AMD64_IBSOP_REG_COUNT 7
+#define MSR_AMD64_IBSOP_REG_MASK ((1UL<<MSR_AMD64_IBSOP_REG_COUNT)-1)
#define MSR_AMD64_IBSCTL 0xc001103a
#define MSR_AMD64_IBSBRTARGET 0xc001103b
+#define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */
/* Fam 15h MSRs */
#define MSR_F15H_PERF_CTL 0xc0010200
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 36684eb..a7ec6bd 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -16,6 +16,11 @@ static u32 ibs_caps;
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD)
+#include <linux/kprobes.h>
+#include <linux/hardirq.h>
+
+#include <asm/nmi.h>
+
#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
@@ -25,6 +30,18 @@ struct perf_ibs {
u64 config_mask;
u64 cnt_mask;
u64 enable_mask;
+ u64 valid_mask;
+ unsigned long offset_mask[1];
+ int offset_max;
+};
+
+struct perf_ibs_data {
+ u32 size;
+ union {
+ u32 data[0]; /* data buffer starts here */
+ u32 caps;
+ };
+ u64 regs[MSR_AMD64_IBS_REG_COUNT_MAX];
};
static struct perf_ibs perf_ibs_fetch;
@@ -101,6 +118,9 @@ static struct perf_ibs perf_ibs_fetch = {
.config_mask = IBS_FETCH_CONFIG_MASK,
.cnt_mask = IBS_FETCH_MAX_CNT,
.enable_mask = IBS_FETCH_ENABLE,
+ .valid_mask = IBS_FETCH_VAL,
+ .offset_mask = { MSR_AMD64_IBSFETCH_REG_MASK },
+ .offset_max = MSR_AMD64_IBSFETCH_REG_COUNT,
};
static struct perf_ibs perf_ibs_op = {
@@ -115,8 +135,71 @@ static struct perf_ibs perf_ibs_op = {
.config_mask = IBS_OP_CONFIG_MASK,
.cnt_mask = IBS_OP_MAX_CNT,
.enable_mask = IBS_OP_ENABLE,
+ .valid_mask = IBS_OP_VAL,
+ .offset_mask = { MSR_AMD64_IBSOP_REG_MASK },
+ .offset_max = MSR_AMD64_IBSOP_REG_COUNT,
};
+static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
+{
+ struct perf_event *event = NULL;
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_sample_data data;
+ struct perf_raw_record raw;
+ struct pt_regs regs;
+ struct perf_ibs_data ibs_data;
+ int offset, size;
+ unsigned int msr;
+ u64 *buf;
+
+ msr = hwc->config_base;
+ buf = ibs_data.regs;
+ rdmsrl(msr, *buf);
+ if (!(*buf++ & perf_ibs->valid_mask))
+ return 0;
+
+ perf_sample_data_init(&data, 0);
+ if (event->attr.sample_type & PERF_SAMPLE_RAW) {
+ ibs_data.caps = ibs_caps;
+ size = 1;
+ offset = 1;
+ do {
+ rdmsrl(msr + offset, *buf++);
+ size++;
+ offset = find_next_bit(perf_ibs->offset_mask,
+ perf_ibs->offset_max,
+ offset + 1);
+ } while (offset < perf_ibs->offset_max);
+ raw.size = sizeof(u32) + sizeof(u64) * size;
+ raw.data = ibs_data.data;
+ data.raw = &raw;
+ }
+
+ regs = *iregs; /* XXX: update ip from ibs sample */
+
+ if (perf_event_overflow(event, &data, &regs))
+ ; /* stop */
+ else
+ /* reenable */
+ wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+
+ return 1;
+}
+
+static int __kprobes
+perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
+{
+ int handled = 0;
+
+ handled += perf_ibs_handle_irq(&perf_ibs_fetch, regs);
+ handled += perf_ibs_handle_irq(&perf_ibs_op, regs);
+
+ if (handled)
+ inc_irq_stat(apic_perf_irqs);
+
+ return handled;
+}
+
static __init int perf_event_ibs_init(void)
{
if (!ibs_caps)
@@ -124,6 +207,7 @@ static __init int perf_event_ibs_init(void)
perf_pmu_register(&perf_ibs_fetch.pmu, "ibs_fetch", -1);
perf_pmu_register(&perf_ibs_op.pmu, "ibs_op", -1);
+ register_nmi_handler(NMI_LOCAL, &perf_ibs_nmi_handler, 0, "perf_ibs");
printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps);
return 0;
--
1.7.7
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:perf/x86-ibs] perf/x86: Implement IBS interrupt handler
2011-12-15 16:56 ` [PATCH v4 2/4] perf, x86: Implement IBS interrupt handler Robert Richter
@ 2012-03-08 12:23 ` tip-bot for Robert Richter
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Robert Richter @ 2012-03-08 12:23 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, eranian, hpa, mingo, robert.richter, peterz, tglx,
mingo
Commit-ID: b7074f1fbd6149eac1ec25063e4a364c39a85473
Gitweb: http://git.kernel.org/tip/b7074f1fbd6149eac1ec25063e4a364c39a85473
Author: Robert Richter <robert.richter@amd.com>
AuthorDate: Thu, 15 Dec 2011 17:56:37 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 8 Mar 2012 11:35:21 +0100
perf/x86: Implement IBS interrupt handler
This patch implements code to handle ibs interrupts. If ibs data
is available, a raw perf_event data sample is created and sent
back to userland. This patch only implements the storage of
ibs data in the raw sample, but this could be extended in a
later patch by generating generic event data such as the rip
from the ibs sampling data.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323968199-9326-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/include/asm/msr-index.h | 5 ++
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 84 ++++++++++++++++++++++++++++++
2 files changed, 89 insertions(+), 0 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index a6962d9..4e3cd38 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -127,6 +127,8 @@
#define MSR_AMD64_IBSFETCHCTL 0xc0011030
#define MSR_AMD64_IBSFETCHLINAD 0xc0011031
#define MSR_AMD64_IBSFETCHPHYSAD 0xc0011032
+#define MSR_AMD64_IBSFETCH_REG_COUNT 3
+#define MSR_AMD64_IBSFETCH_REG_MASK ((1UL<<MSR_AMD64_IBSFETCH_REG_COUNT)-1)
#define MSR_AMD64_IBSOPCTL 0xc0011033
#define MSR_AMD64_IBSOPRIP 0xc0011034
#define MSR_AMD64_IBSOPDATA 0xc0011035
@@ -134,8 +136,11 @@
#define MSR_AMD64_IBSOPDATA3 0xc0011037
#define MSR_AMD64_IBSDCLINAD 0xc0011038
#define MSR_AMD64_IBSDCPHYSAD 0xc0011039
+#define MSR_AMD64_IBSOP_REG_COUNT 7
+#define MSR_AMD64_IBSOP_REG_MASK ((1UL<<MSR_AMD64_IBSOP_REG_COUNT)-1)
#define MSR_AMD64_IBSCTL 0xc001103a
#define MSR_AMD64_IBSBRTARGET 0xc001103b
+#define MSR_AMD64_IBS_REG_COUNT_MAX 8 /* includes MSR_AMD64_IBSBRTARGET */
/* Fam 15h MSRs */
#define MSR_F15H_PERF_CTL 0xc0010200
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 36684eb..a7ec6bd 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -16,6 +16,11 @@ static u32 ibs_caps;
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD)
+#include <linux/kprobes.h>
+#include <linux/hardirq.h>
+
+#include <asm/nmi.h>
+
#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
@@ -25,6 +30,18 @@ struct perf_ibs {
u64 config_mask;
u64 cnt_mask;
u64 enable_mask;
+ u64 valid_mask;
+ unsigned long offset_mask[1];
+ int offset_max;
+};
+
+struct perf_ibs_data {
+ u32 size;
+ union {
+ u32 data[0]; /* data buffer starts here */
+ u32 caps;
+ };
+ u64 regs[MSR_AMD64_IBS_REG_COUNT_MAX];
};
static struct perf_ibs perf_ibs_fetch;
@@ -101,6 +118,9 @@ static struct perf_ibs perf_ibs_fetch = {
.config_mask = IBS_FETCH_CONFIG_MASK,
.cnt_mask = IBS_FETCH_MAX_CNT,
.enable_mask = IBS_FETCH_ENABLE,
+ .valid_mask = IBS_FETCH_VAL,
+ .offset_mask = { MSR_AMD64_IBSFETCH_REG_MASK },
+ .offset_max = MSR_AMD64_IBSFETCH_REG_COUNT,
};
static struct perf_ibs perf_ibs_op = {
@@ -115,8 +135,71 @@ static struct perf_ibs perf_ibs_op = {
.config_mask = IBS_OP_CONFIG_MASK,
.cnt_mask = IBS_OP_MAX_CNT,
.enable_mask = IBS_OP_ENABLE,
+ .valid_mask = IBS_OP_VAL,
+ .offset_mask = { MSR_AMD64_IBSOP_REG_MASK },
+ .offset_max = MSR_AMD64_IBSOP_REG_COUNT,
};
+static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
+{
+ struct perf_event *event = NULL;
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_sample_data data;
+ struct perf_raw_record raw;
+ struct pt_regs regs;
+ struct perf_ibs_data ibs_data;
+ int offset, size;
+ unsigned int msr;
+ u64 *buf;
+
+ msr = hwc->config_base;
+ buf = ibs_data.regs;
+ rdmsrl(msr, *buf);
+ if (!(*buf++ & perf_ibs->valid_mask))
+ return 0;
+
+ perf_sample_data_init(&data, 0);
+ if (event->attr.sample_type & PERF_SAMPLE_RAW) {
+ ibs_data.caps = ibs_caps;
+ size = 1;
+ offset = 1;
+ do {
+ rdmsrl(msr + offset, *buf++);
+ size++;
+ offset = find_next_bit(perf_ibs->offset_mask,
+ perf_ibs->offset_max,
+ offset + 1);
+ } while (offset < perf_ibs->offset_max);
+ raw.size = sizeof(u32) + sizeof(u64) * size;
+ raw.data = ibs_data.data;
+ data.raw = &raw;
+ }
+
+ regs = *iregs; /* XXX: update ip from ibs sample */
+
+ if (perf_event_overflow(event, &data, &regs))
+ ; /* stop */
+ else
+ /* reenable */
+ wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+
+ return 1;
+}
+
+static int __kprobes
+perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
+{
+ int handled = 0;
+
+ handled += perf_ibs_handle_irq(&perf_ibs_fetch, regs);
+ handled += perf_ibs_handle_irq(&perf_ibs_op, regs);
+
+ if (handled)
+ inc_irq_stat(apic_perf_irqs);
+
+ return handled;
+}
+
static __init int perf_event_ibs_init(void)
{
if (!ibs_caps)
@@ -124,6 +207,7 @@ static __init int perf_event_ibs_init(void)
perf_pmu_register(&perf_ibs_fetch.pmu, "ibs_fetch", -1);
perf_pmu_register(&perf_ibs_op.pmu, "ibs_op", -1);
+ register_nmi_handler(NMI_LOCAL, &perf_ibs_nmi_handler, 0, "perf_ibs");
printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps);
return 0;
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v4 3/4] perf, x86: Implement IBS pmu control ops
2011-12-15 16:56 [PATCH v4 0/4] perf, x86: Implement AMD IBS Robert Richter
2011-12-15 16:56 ` [PATCH v4 1/4] perf, x86: Implement IBS event configuration Robert Richter
2011-12-15 16:56 ` [PATCH v4 2/4] perf, x86: Implement IBS interrupt handler Robert Richter
@ 2011-12-15 16:56 ` Robert Richter
2012-03-08 12:23 ` [tip:perf/x86-ibs] perf/x86: " tip-bot for Robert Richter
2011-12-15 16:56 ` [PATCH v4 4/4] perf, x86: Implement 64 bit counter support for IBS Robert Richter
2011-12-22 15:36 ` [PATCH v4 0/4] perf, x86: Implement AMD IBS Robert Richter
4 siblings, 1 reply; 10+ messages in thread
From: Robert Richter @ 2011-12-15 16:56 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, Stephane Eranian, LKML, Robert Richter
Add code to control the IBS pmu. We need to maintain per-cpu
states. Since some states are used and changed by the nmi handler,
access to these states must be atomic.
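The state handling can be modelled in user space with C11 atomics, as a rough sketch. The kernel code uses test_and_set_bit() and test_and_clear_bit() on a per-cpu bitmap; the helpers, the enum ibs_nmi result type and the function names below are assumptions made for illustration and do not touch any hardware:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { IBS_ENABLED = 0, IBS_STARTED = 1, IBS_STOPPING = 2 };

/* Hypothetical classification of an NMI, for illustration only. */
enum ibs_nmi { IBS_NMI_IGNORE, IBS_NMI_LATE, IBS_NMI_SAMPLE };

/* Atomically set a bit and report whether it was already set. */
static bool test_and_set(atomic_ulong *s, int bit)
{
	return atomic_fetch_or(s, 1UL << bit) & (1UL << bit);
}

/* Atomically clear a bit and report whether it was set before. */
static bool test_and_clear(atomic_ulong *s, int bit)
{
	return atomic_fetch_and(s, ~(1UL << bit)) & (1UL << bit);
}

/* Returns true if the caller should now enable the hardware. */
bool ibs_start(atomic_ulong *s)
{
	return !test_and_set(s, IBS_STARTED);
}

/* Returns true if the caller should now disable the hardware. */
bool ibs_stop(atomic_ulong *s)
{
	if (!test_and_clear(s, IBS_STARTED))
		return false;
	/* An NMI may still be in flight; remember that we are stopping. */
	atomic_fetch_or(s, 1UL << IBS_STOPPING);
	return true;
}

/* Decide what an NMI arriving in the current state should do. */
enum ibs_nmi ibs_nmi_classify(atomic_ulong *s)
{
	if (atomic_load(s) & (1UL << IBS_STARTED))
		return IBS_NMI_SAMPLE;	/* normal sampling path */
	if (test_and_clear(s, IBS_STOPPING))
		return IBS_NMI_LATE;	/* interrupt raced with stop */
	return IBS_NMI_IGNORE;		/* not an IBS interrupt */
}
```

The IBS_STOPPING bit is consumed by at most one NMI, so a single late interrupt after a stop can be claimed while later unrelated NMIs are left for other handlers.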
V4:
* fix returned number of handled nmis in perf_ibs_handle_irq()
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 106 +++++++++++++++++++++++++++++-
1 files changed, 103 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index a7ec6bd..40a6d9d 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -24,6 +24,19 @@ static u32 ibs_caps;
#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
+enum ibs_states {
+ IBS_ENABLED = 0,
+ IBS_STARTED = 1,
+ IBS_STOPPING = 2,
+
+ IBS_MAX_STATES,
+};
+
+struct cpu_perf_ibs {
+ struct perf_event *event;
+ unsigned long state[BITS_TO_LONGS(IBS_MAX_STATES)];
+};
+
struct perf_ibs {
struct pmu pmu;
unsigned int msr;
@@ -33,6 +46,7 @@ struct perf_ibs {
u64 valid_mask;
unsigned long offset_mask[1];
int offset_max;
+ struct cpu_perf_ibs __percpu *pcpu;
};
struct perf_ibs_data {
@@ -97,15 +111,66 @@ static int perf_ibs_init(struct perf_event *event)
return 0;
}
+static void perf_ibs_start(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+
+ if (test_and_set_bit(IBS_STARTED, pcpu->state))
+ return;
+
+ wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+}
+
+static void perf_ibs_stop(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+ u64 val;
+
+ if (!test_and_clear_bit(IBS_STARTED, pcpu->state))
+ return;
+
+ set_bit(IBS_STOPPING, pcpu->state);
+
+ rdmsrl(hwc->config_base, val);
+ val &= ~perf_ibs->enable_mask;
+ wrmsrl(hwc->config_base, val);
+}
+
static int perf_ibs_add(struct perf_event *event, int flags)
{
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+
+ if (test_and_set_bit(IBS_ENABLED, pcpu->state))
+ return -ENOSPC;
+
+ pcpu->event = event;
+
+ if (flags & PERF_EF_START)
+ perf_ibs_start(event, PERF_EF_RELOAD);
+
return 0;
}
static void perf_ibs_del(struct perf_event *event, int flags)
{
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+
+ if (!test_and_clear_bit(IBS_ENABLED, pcpu->state))
+ return;
+
+ perf_ibs_stop(event, 0);
+
+ pcpu->event = NULL;
}
+static void perf_ibs_read(struct perf_event *event) { }
+
static struct perf_ibs perf_ibs_fetch = {
.pmu = {
.task_ctx_nr = perf_invalid_context,
@@ -113,6 +178,9 @@ static struct perf_ibs perf_ibs_fetch = {
.event_init = perf_ibs_init,
.add = perf_ibs_add,
.del = perf_ibs_del,
+ .start = perf_ibs_start,
+ .stop = perf_ibs_stop,
+ .read = perf_ibs_read,
},
.msr = MSR_AMD64_IBSFETCHCTL,
.config_mask = IBS_FETCH_CONFIG_MASK,
@@ -130,6 +198,9 @@ static struct perf_ibs perf_ibs_op = {
.event_init = perf_ibs_init,
.add = perf_ibs_add,
.del = perf_ibs_del,
+ .start = perf_ibs_start,
+ .stop = perf_ibs_stop,
+ .read = perf_ibs_read,
},
.msr = MSR_AMD64_IBSOPCTL,
.config_mask = IBS_OP_CONFIG_MASK,
@@ -142,7 +213,8 @@ static struct perf_ibs perf_ibs_op = {
static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
{
- struct perf_event *event = NULL;
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+ struct perf_event *event = pcpu->event;
struct hw_perf_event *hwc = &event->hw;
struct perf_sample_data data;
struct perf_raw_record raw;
@@ -152,6 +224,14 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
unsigned int msr;
u64 *buf;
+ if (!test_bit(IBS_STARTED, pcpu->state)) {
+ /* Catch spurious interrupts after stopping IBS: */
+ if (!test_and_clear_bit(IBS_STOPPING, pcpu->state))
+ return 0;
+ rdmsrl(perf_ibs->msr, *ibs_data.regs);
+ return (*ibs_data.regs & perf_ibs->valid_mask) ? 1 : 0;
+ }
+
msr = hwc->config_base;
buf = ibs_data.regs;
rdmsrl(msr, *buf);
@@ -200,13 +280,33 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
return handled;
}
+static __init int perf_ibs_pmu_init(struct perf_ibs *perf_ibs, char *name)
+{
+ struct cpu_perf_ibs __percpu *pcpu;
+ int ret;
+
+ pcpu = alloc_percpu(struct cpu_perf_ibs);
+ if (!pcpu)
+ return -ENOMEM;
+
+ perf_ibs->pcpu = pcpu;
+
+ ret = perf_pmu_register(&perf_ibs->pmu, name, -1);
+ if (ret) {
+ perf_ibs->pcpu = NULL;
+ free_percpu(pcpu);
+ }
+
+ return ret;
+}
+
static __init int perf_event_ibs_init(void)
{
if (!ibs_caps)
return -ENODEV; /* ibs not supported by the cpu */
- perf_pmu_register(&perf_ibs_fetch.pmu, "ibs_fetch", -1);
- perf_pmu_register(&perf_ibs_op.pmu, "ibs_op", -1);
+ perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch");
+ perf_ibs_pmu_init(&perf_ibs_op, "ibs_op");
register_nmi_handler(NMI_LOCAL, &perf_ibs_nmi_handler, 0, "perf_ibs");
printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps);
--
1.7.7
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [tip:perf/x86-ibs] perf/x86: Implement IBS pmu control ops
2011-12-15 16:56 ` [PATCH v4 3/4] perf, x86: Implement IBS pmu control ops Robert Richter
@ 2012-03-08 12:23 ` tip-bot for Robert Richter
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Robert Richter @ 2012-03-08 12:23 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, eranian, hpa, mingo, robert.richter, peterz, tglx,
mingo
Commit-ID: 4db2e8e6500d9ba6406f2714fa3968b39a325274
Gitweb: http://git.kernel.org/tip/4db2e8e6500d9ba6406f2714fa3968b39a325274
Author: Robert Richter <robert.richter@amd.com>
AuthorDate: Thu, 15 Dec 2011 17:56:38 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 8 Mar 2012 11:35:22 +0100
perf/x86: Implement IBS pmu control ops
Add code to control the IBS pmu. We need to maintain per-cpu
states. Since some states are used and changed by the nmi
handler, access to these states must be atomic.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323968199-9326-4-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 106 +++++++++++++++++++++++++++++-
1 files changed, 103 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index a7ec6bd..40a6d9d 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -24,6 +24,19 @@ static u32 ibs_caps;
#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
+enum ibs_states {
+ IBS_ENABLED = 0,
+ IBS_STARTED = 1,
+ IBS_STOPPING = 2,
+
+ IBS_MAX_STATES,
+};
+
+struct cpu_perf_ibs {
+ struct perf_event *event;
+ unsigned long state[BITS_TO_LONGS(IBS_MAX_STATES)];
+};
+
struct perf_ibs {
struct pmu pmu;
unsigned int msr;
@@ -33,6 +46,7 @@ struct perf_ibs {
u64 valid_mask;
unsigned long offset_mask[1];
int offset_max;
+ struct cpu_perf_ibs __percpu *pcpu;
};
struct perf_ibs_data {
@@ -97,15 +111,66 @@ static int perf_ibs_init(struct perf_event *event)
return 0;
}
+static void perf_ibs_start(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+
+ if (test_and_set_bit(IBS_STARTED, pcpu->state))
+ return;
+
+ wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+}
+
+static void perf_ibs_stop(struct perf_event *event, int flags)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+ u64 val;
+
+ if (!test_and_clear_bit(IBS_STARTED, pcpu->state))
+ return;
+
+ set_bit(IBS_STOPPING, pcpu->state);
+
+ rdmsrl(hwc->config_base, val);
+ val &= ~perf_ibs->enable_mask;
+ wrmsrl(hwc->config_base, val);
+}
+
static int perf_ibs_add(struct perf_event *event, int flags)
{
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+
+ if (test_and_set_bit(IBS_ENABLED, pcpu->state))
+ return -ENOSPC;
+
+ pcpu->event = event;
+
+ if (flags & PERF_EF_START)
+ perf_ibs_start(event, PERF_EF_RELOAD);
+
return 0;
}
static void perf_ibs_del(struct perf_event *event, int flags)
{
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+
+ if (!test_and_clear_bit(IBS_ENABLED, pcpu->state))
+ return;
+
+ perf_ibs_stop(event, 0);
+
+ pcpu->event = NULL;
}
+static void perf_ibs_read(struct perf_event *event) { }
+
static struct perf_ibs perf_ibs_fetch = {
.pmu = {
.task_ctx_nr = perf_invalid_context,
@@ -113,6 +178,9 @@ static struct perf_ibs perf_ibs_fetch = {
.event_init = perf_ibs_init,
.add = perf_ibs_add,
.del = perf_ibs_del,
+ .start = perf_ibs_start,
+ .stop = perf_ibs_stop,
+ .read = perf_ibs_read,
},
.msr = MSR_AMD64_IBSFETCHCTL,
.config_mask = IBS_FETCH_CONFIG_MASK,
@@ -130,6 +198,9 @@ static struct perf_ibs perf_ibs_op = {
.event_init = perf_ibs_init,
.add = perf_ibs_add,
.del = perf_ibs_del,
+ .start = perf_ibs_start,
+ .stop = perf_ibs_stop,
+ .read = perf_ibs_read,
},
.msr = MSR_AMD64_IBSOPCTL,
.config_mask = IBS_OP_CONFIG_MASK,
@@ -142,7 +213,8 @@ static struct perf_ibs perf_ibs_op = {
static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
{
- struct perf_event *event = NULL;
+ struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+ struct perf_event *event = pcpu->event;
struct hw_perf_event *hwc = &event->hw;
struct perf_sample_data data;
struct perf_raw_record raw;
@@ -152,6 +224,14 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
unsigned int msr;
u64 *buf;
+ if (!test_bit(IBS_STARTED, pcpu->state)) {
+ /* Catch spurious interrupts after stopping IBS: */
+ if (!test_and_clear_bit(IBS_STOPPING, pcpu->state))
+ return 0;
+ rdmsrl(perf_ibs->msr, *ibs_data.regs);
+ return (*ibs_data.regs & perf_ibs->valid_mask) ? 1 : 0;
+ }
+
msr = hwc->config_base;
buf = ibs_data.regs;
rdmsrl(msr, *buf);
@@ -200,13 +280,33 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
return handled;
}
+static __init int perf_ibs_pmu_init(struct perf_ibs *perf_ibs, char *name)
+{
+ struct cpu_perf_ibs __percpu *pcpu;
+ int ret;
+
+ pcpu = alloc_percpu(struct cpu_perf_ibs);
+ if (!pcpu)
+ return -ENOMEM;
+
+ perf_ibs->pcpu = pcpu;
+
+ ret = perf_pmu_register(&perf_ibs->pmu, name, -1);
+ if (ret) {
+ perf_ibs->pcpu = NULL;
+ free_percpu(pcpu);
+ }
+
+ return ret;
+}
+
static __init int perf_event_ibs_init(void)
{
if (!ibs_caps)
return -ENODEV; /* ibs not supported by the cpu */
- perf_pmu_register(&perf_ibs_fetch.pmu, "ibs_fetch", -1);
- perf_pmu_register(&perf_ibs_op.pmu, "ibs_op", -1);
+ perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch");
+ perf_ibs_pmu_init(&perf_ibs_op, "ibs_op");
register_nmi_handler(NMI_LOCAL, &perf_ibs_nmi_handler, 0, "perf_ibs");
printk(KERN_INFO "perf: AMD IBS detected (0x%08x)\n", ibs_caps);
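The IBS_STARTED/IBS_STOPPING handling in the hunks above can be modelled in user space. The following is only a sketch under stated assumptions: struct ibs_model and the model_*() helpers are hypothetical names, plain bit operations stand in for the kernel's atomic test_and_set_bit()/test_and_clear_bit(), and the sample path of the real handler is reduced to a single "valid" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-CPU state bits; this is not kernel code. */
enum { IBS_STARTED = 1u << 0, IBS_STOPPING = 1u << 1 };

struct ibs_model {
	unsigned int state;	/* models pcpu->state */
	bool sample_valid;	/* models the valid bit in the control MSR */
};

static void model_start(struct ibs_model *m)
{
	assert(m != NULL);
	m->state |= IBS_STARTED;
}

static void model_stop(struct ibs_model *m)
{
	assert(m != NULL);
	if (!(m->state & IBS_STARTED))
		return;
	/* Leave a marker so that one late NMI can still be claimed. */
	m->state &= ~IBS_STARTED;
	m->state |= IBS_STOPPING;
}

/* Returns the number of handled NMIs (0 or 1). */
static int model_nmi(struct ibs_model *m)
{
	assert(m != NULL);
	if (!(m->state & IBS_STARTED)) {
		/* Not ours unless a stop is still pending. */
		if (!(m->state & IBS_STOPPING))
			return 0;
		m->state &= ~IBS_STOPPING;
		return m->sample_valid ? 1 : 0;
	}
	/* Assumption: the sample path is elided, only validity is modelled. */
	return m->sample_valid ? 1 : 0;
}
```

In this model, the first NMI arriving after model_stop() is still claimed, while any further NMI is reported as unhandled, which is the behaviour the "Catch spurious interrupts" comment describes.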
* [PATCH v4 4/4] perf, x86: Implement 64 bit counter support for IBS
2011-12-15 16:56 [PATCH v4 0/4] perf, x86: Implement AMD IBS Robert Richter
` (2 preceding siblings ...)
2011-12-15 16:56 ` [PATCH v4 3/4] perf, x86: Implement IBS pmu control ops Robert Richter
@ 2011-12-15 16:56 ` Robert Richter
2012-03-08 12:24 ` [tip:perf/x86-ibs] perf/x86: Implement 64-bit " tip-bot for Robert Richter
2011-12-22 15:36 ` [PATCH v4 0/4] perf, x86: Implement AMD IBS Robert Richter
4 siblings, 1 reply; 10+ messages in thread
From: Robert Richter @ 2011-12-15 16:56 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, Stephane Eranian, LKML, Robert Richter
This patch implements 64 bit counter support for IBS. The sampling
period is no longer limited to the hw counter width.
The functions perf_event_set_period() and perf_event_try_update() can
be used as generic functions. They can replace similar code that is
duplicated across architectures.
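As an illustration of the arithmetic in perf_event_try_update(), the shift-based delta can be tried in user space. This is a minimal sketch, not the kernel code: counter_delta() is a hypothetical helper, and the cmpxchg retry logic is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Difference between two raw readings of a counter that is only
 * `width` bits wide. Shifting both values to the top of a 64-bit
 * word makes the subtraction wrap at the counter width.
 */
static uint64_t counter_delta(uint64_t prev_raw, uint64_t new_raw, int width)
{
	int shift;
	uint64_t delta;

	assert(width > 0 && width <= 64);
	shift = 64 - width;

	delta = (new_raw << shift) - (prev_raw << shift);
	return delta >> shift;
}
```

With width = 20, as used by perf_ibs_event_update(), a reading that wrapped from 0xFFFF0 to 0x00010 yields a delta of 0x20 rather than a huge bogus value.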
V2: Added caps check for IBS_OP_CUR_CNT emulation.
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
arch/x86/include/asm/perf_event.h | 2 +
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 204 +++++++++++++++++++++++++++---
2 files changed, 185 insertions(+), 21 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index b50e9d1..bb4744b 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -176,6 +176,8 @@ struct x86_pmu_capability {
#define IBS_FETCH_MAX_CNT 0x0000FFFFULL
/* IbsOpCtl bits */
+/* lower 4 bits of the current count are ignored: */
+#define IBS_OP_CUR_CNT (0xFFFF0ULL<<32)
#define IBS_OP_CNT_CTL (1ULL<<19)
#define IBS_OP_VAL (1ULL<<18)
#define IBS_OP_ENABLE (1ULL<<17)
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 40a6d9d..573d248 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -44,9 +44,11 @@ struct perf_ibs {
u64 cnt_mask;
u64 enable_mask;
u64 valid_mask;
+ u64 max_period;
unsigned long offset_mask[1];
int offset_max;
struct cpu_perf_ibs __percpu *pcpu;
+ u64 (*get_count)(u64 config);
};
struct perf_ibs_data {
@@ -58,6 +60,78 @@ struct perf_ibs_data {
u64 regs[MSR_AMD64_IBS_REG_COUNT_MAX];
};
+static int
+perf_event_set_period(struct hw_perf_event *hwc, u64 min, u64 max, u64 *count)
+{
+ s64 left = local64_read(&hwc->period_left);
+ s64 period = hwc->sample_period;
+ int overflow = 0;
+
+ /*
+ * If we are way outside a reasonable range then just skip forward:
+ */
+ if (unlikely(left <= -period)) {
+ left = period;
+ local64_set(&hwc->period_left, left);
+ hwc->last_period = period;
+ overflow = 1;
+ }
+
+ if (unlikely(left <= 0)) {
+ left += period;
+ local64_set(&hwc->period_left, left);
+ hwc->last_period = period;
+ overflow = 1;
+ }
+
+ if (unlikely(left < min))
+ left = min;
+
+ if (left > max)
+ left = max;
+
+ *count = (u64)left;
+
+ return overflow;
+}
+
+static int
+perf_event_try_update(struct perf_event *event, u64 new_raw_count, int width)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ int shift = 64 - width;
+ u64 prev_raw_count;
+ u64 delta;
+
+ /*
+ * Careful: an NMI might modify the previous event value.
+ *
+ * Our tactic to handle this is to first atomically read and
+ * exchange a new raw count - then add that new-prev delta
+ * count to the generic event atomically:
+ */
+ prev_raw_count = local64_read(&hwc->prev_count);
+ if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
+ new_raw_count) != prev_raw_count)
+ return 0;
+
+ /*
+ * Now we have the new raw value and have updated the prev
+ * timestamp already. We can now calculate the elapsed delta
+ * (event-)time and add that to the generic event.
+ *
+ * Careful, not all hw sign-extends above the physical width
+ * of the count.
+ */
+ delta = (new_raw_count << shift) - (prev_raw_count << shift);
+ delta >>= shift;
+
+ local64_add(delta, &event->count);
+ local64_sub(delta, &hwc->period_left);
+
+ return 1;
+}
+
static struct perf_ibs perf_ibs_fetch;
static struct perf_ibs perf_ibs_op;
@@ -91,18 +165,14 @@ static int perf_ibs_init(struct perf_event *event)
if (hwc->sample_period & 0x0f)
/* lower 4 bits can not be set in ibs max cnt */
return -EINVAL;
- max_cnt = hwc->sample_period >> 4;
- if (max_cnt & ~perf_ibs->cnt_mask)
- /* out of range */
- return -EINVAL;
- config |= max_cnt;
} else {
max_cnt = config & perf_ibs->cnt_mask;
+ config &= ~perf_ibs->cnt_mask;
event->attr.sample_period = max_cnt << 4;
hwc->sample_period = event->attr.sample_period;
}
- if (!max_cnt)
+ if (!hwc->sample_period)
return -EINVAL;
hwc->config_base = perf_ibs->msr;
@@ -111,16 +181,71 @@ static int perf_ibs_init(struct perf_event *event)
return 0;
}
+static int perf_ibs_set_period(struct perf_ibs *perf_ibs,
+ struct hw_perf_event *hwc, u64 *period)
+{
+ int ret;
+
+ /* ignore lower 4 bits in min count: */
+ ret = perf_event_set_period(hwc, 1<<4, perf_ibs->max_period, period);
+ local64_set(&hwc->prev_count, 0);
+
+ return ret;
+}
+
+static u64 get_ibs_fetch_count(u64 config)
+{
+ return (config & IBS_FETCH_CNT) >> 12;
+}
+
+static u64 get_ibs_op_count(u64 config)
+{
+ return (config & IBS_OP_CUR_CNT) >> 32;
+}
+
+static void
+perf_ibs_event_update(struct perf_ibs *perf_ibs, struct perf_event *event,
+ u64 config)
+{
+ u64 count = perf_ibs->get_count(config);
+
+ while (!perf_event_try_update(event, count, 20)) {
+ rdmsrl(event->hw.config_base, config);
+ count = perf_ibs->get_count(config);
+ }
+}
+
+/* Note: The enable mask must be encoded in the config argument. */
+static inline void perf_ibs_enable_event(struct hw_perf_event *hwc, u64 config)
+{
+ wrmsrl(hwc->config_base, hwc->config | config);
+}
+
+/*
+ * We cannot restore the ibs pmu state, so we always need to update
+ * the event while stopping it and then reset the state when starting
+ * again. Thus, ignoring PERF_EF_RELOAD and PERF_EF_UPDATE flags in
+ * perf_ibs_start()/perf_ibs_stop() and instead always do it.
+ */
static void perf_ibs_start(struct perf_event *event, int flags)
{
struct hw_perf_event *hwc = &event->hw;
struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+ u64 config;
- if (test_and_set_bit(IBS_STARTED, pcpu->state))
+ if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
return;
- wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+ WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
+ hwc->state = 0;
+
+ perf_ibs_set_period(perf_ibs, hwc, &config);
+ config = (config >> 4) | perf_ibs->enable_mask;
+ set_bit(IBS_STARTED, pcpu->state);
+ perf_ibs_enable_event(hwc, config);
+
+ perf_event_update_userpage(event);
}
static void perf_ibs_stop(struct perf_event *event, int flags)
@@ -129,15 +254,28 @@ static void perf_ibs_stop(struct perf_event *event, int flags)
struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
u64 val;
+ int stopping;
- if (!test_and_clear_bit(IBS_STARTED, pcpu->state))
- return;
+ stopping = test_and_clear_bit(IBS_STARTED, pcpu->state);
- set_bit(IBS_STOPPING, pcpu->state);
+ if (!stopping && (hwc->state & PERF_HES_UPTODATE))
+ return;
rdmsrl(hwc->config_base, val);
- val &= ~perf_ibs->enable_mask;
- wrmsrl(hwc->config_base, val);
+
+ if (stopping) {
+ set_bit(IBS_STOPPING, pcpu->state);
+ val &= ~perf_ibs->enable_mask;
+ wrmsrl(hwc->config_base, val);
+ WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+ hwc->state |= PERF_HES_STOPPED;
+ }
+
+ if (hwc->state & PERF_HES_UPTODATE)
+ return;
+
+ perf_ibs_event_update(perf_ibs, event, val);
+ hwc->state |= PERF_HES_UPTODATE;
}
static int perf_ibs_add(struct perf_event *event, int flags)
@@ -148,6 +286,8 @@ static int perf_ibs_add(struct perf_event *event, int flags)
if (test_and_set_bit(IBS_ENABLED, pcpu->state))
return -ENOSPC;
+ event->hw.state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
pcpu->event = event;
if (flags & PERF_EF_START)
@@ -164,9 +304,11 @@ static void perf_ibs_del(struct perf_event *event, int flags)
if (!test_and_clear_bit(IBS_ENABLED, pcpu->state))
return;
- perf_ibs_stop(event, 0);
+ perf_ibs_stop(event, PERF_EF_UPDATE);
pcpu->event = NULL;
+
+ perf_event_update_userpage(event);
}
static void perf_ibs_read(struct perf_event *event) { }
@@ -187,8 +329,11 @@ static struct perf_ibs perf_ibs_fetch = {
.cnt_mask = IBS_FETCH_MAX_CNT,
.enable_mask = IBS_FETCH_ENABLE,
.valid_mask = IBS_FETCH_VAL,
+ .max_period = IBS_FETCH_MAX_CNT << 4,
.offset_mask = { MSR_AMD64_IBSFETCH_REG_MASK },
.offset_max = MSR_AMD64_IBSFETCH_REG_COUNT,
+
+ .get_count = get_ibs_fetch_count,
};
static struct perf_ibs perf_ibs_op = {
@@ -207,8 +352,11 @@ static struct perf_ibs perf_ibs_op = {
.cnt_mask = IBS_OP_MAX_CNT,
.enable_mask = IBS_OP_ENABLE,
.valid_mask = IBS_OP_VAL,
+ .max_period = IBS_OP_MAX_CNT << 4,
.offset_mask = { MSR_AMD64_IBSOP_REG_MASK },
.offset_max = MSR_AMD64_IBSOP_REG_COUNT,
+
+ .get_count = get_ibs_op_count,
};
static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
@@ -220,9 +368,9 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
struct perf_raw_record raw;
struct pt_regs regs;
struct perf_ibs_data ibs_data;
- int offset, size;
+ int offset, size, overflow, reenable;
unsigned int msr;
- u64 *buf;
+ u64 *buf, config;
if (!test_bit(IBS_STARTED, pcpu->state)) {
/* Catch spurious interrupts after stopping IBS: */
@@ -257,11 +405,25 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
regs = *iregs; /* XXX: update ip from ibs sample */
- if (perf_event_overflow(event, &data, &regs))
- ; /* stop */
- else
- /* reenable */
- wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+ /*
+ * Emulate IbsOpCurCnt in MSRC001_1033 (IbsOpCtl), not
+ * supported in all cpus. As this triggered an interrupt, we
+ * set the current count to the max count.
+ */
+ config = ibs_data.regs[0];
+ if (perf_ibs == &perf_ibs_op && !(ibs_caps & IBS_CAPS_RDWROPCNT)) {
+ config &= ~IBS_OP_CUR_CNT;
+ config |= (config & IBS_OP_MAX_CNT) << 36;
+ }
+
+ perf_ibs_event_update(perf_ibs, event, config);
+
+ overflow = perf_ibs_set_period(perf_ibs, hwc, &config);
+ reenable = !(overflow && perf_event_overflow(event, &data, &regs));
+ config = (config >> 4) | (reenable ? perf_ibs->enable_mask : 0);
+ perf_ibs_enable_event(hwc, config);
+
+ perf_event_update_userpage(event);
return 1;
}
--
1.7.7
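The IbsOpCurCnt emulation in perf_ibs_handle_irq() can be checked in isolation. The sketch below is only an illustration: IBS_OP_CUR_CNT is taken from the patch, the value of IBS_OP_MAX_CNT is an assumption (it mirrors IBS_FETCH_MAX_CNT from the header hunk), and emulate_op_cur_cnt() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define IBS_OP_CUR_CNT	(0xFFFF0ULL << 32)	/* from the patch */
#define IBS_OP_MAX_CNT	0x0000FFFFULL		/* assumed value */

/* Copy the max count into the current count field, as the handler does. */
static uint64_t emulate_op_cur_cnt(uint64_t config)
{
	config &= ~IBS_OP_CUR_CNT;
	config |= (config & IBS_OP_MAX_CNT) << 36;

	/* The upper 16 bits of the current count now equal the max count. */
	assert(((config & IBS_OP_CUR_CNT) >> 36) == (config & IBS_OP_MAX_CNT));
	return config;
}

/* Same extraction as get_ibs_op_count() in the patch. */
static uint64_t get_ibs_op_count(uint64_t config)
{
	return (config & IBS_OP_CUR_CNT) >> 32;
}
```

Since the max count field holds the period shifted right by 4, the emulated current count reads back as max count times 16, i.e. the full programmed period.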
* [tip:perf/x86-ibs] perf/x86: Implement 64-bit counter support for IBS
2011-12-15 16:56 ` [PATCH v4 4/4] perf, x86: Implement 64 bit counter support for IBS Robert Richter
@ 2012-03-08 12:24 ` tip-bot for Robert Richter
0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Robert Richter @ 2012-03-08 12:24 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, eranian, hpa, mingo, robert.richter, peterz, tglx,
mingo
Commit-ID: db98c5faf8cb350212ea3af786cb3ba0d4e7a01e
Gitweb: http://git.kernel.org/tip/db98c5faf8cb350212ea3af786cb3ba0d4e7a01e
Author: Robert Richter <robert.richter@amd.com>
AuthorDate: Thu, 15 Dec 2011 17:56:39 +0100
Committer: Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 8 Mar 2012 11:35:22 +0100
perf/x86: Implement 64-bit counter support for IBS
This patch implements 64 bit counter support for IBS. The
sampling period is no longer limited to the hw counter width.
The functions perf_event_set_period() and
perf_event_try_update() can be used as generic functions. They
can replace similar code that is duplicated across architectures.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323968199-9326-5-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/include/asm/perf_event.h | 2 +
arch/x86/kernel/cpu/perf_event_amd_ibs.c | 204 +++++++++++++++++++++++++++---
2 files changed, 185 insertions(+), 21 deletions(-)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index e8fb2c7..9cf6696 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -177,6 +177,8 @@ struct x86_pmu_capability {
#define IBS_FETCH_MAX_CNT 0x0000FFFFULL
/* IbsOpCtl bits */
+/* lower 4 bits of the current count are ignored: */
+#define IBS_OP_CUR_CNT (0xFFFF0ULL<<32)
#define IBS_OP_CNT_CTL (1ULL<<19)
#define IBS_OP_VAL (1ULL<<18)
#define IBS_OP_ENABLE (1ULL<<17)
diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 40a6d9d..573d248 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -44,9 +44,11 @@ struct perf_ibs {
u64 cnt_mask;
u64 enable_mask;
u64 valid_mask;
+ u64 max_period;
unsigned long offset_mask[1];
int offset_max;
struct cpu_perf_ibs __percpu *pcpu;
+ u64 (*get_count)(u64 config);
};
struct perf_ibs_data {
@@ -58,6 +60,78 @@ struct perf_ibs_data {
u64 regs[MSR_AMD64_IBS_REG_COUNT_MAX];
};
+static int
+perf_event_set_period(struct hw_perf_event *hwc, u64 min, u64 max, u64 *count)
+{
+ s64 left = local64_read(&hwc->period_left);
+ s64 period = hwc->sample_period;
+ int overflow = 0;
+
+ /*
+ * If we are way outside a reasonable range then just skip forward:
+ */
+ if (unlikely(left <= -period)) {
+ left = period;
+ local64_set(&hwc->period_left, left);
+ hwc->last_period = period;
+ overflow = 1;
+ }
+
+ if (unlikely(left <= 0)) {
+ left += period;
+ local64_set(&hwc->period_left, left);
+ hwc->last_period = period;
+ overflow = 1;
+ }
+
+ if (unlikely(left < min))
+ left = min;
+
+ if (left > max)
+ left = max;
+
+ *count = (u64)left;
+
+ return overflow;
+}
+
+static int
+perf_event_try_update(struct perf_event *event, u64 new_raw_count, int width)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ int shift = 64 - width;
+ u64 prev_raw_count;
+ u64 delta;
+
+ /*
+ * Careful: an NMI might modify the previous event value.
+ *
+ * Our tactic to handle this is to first atomically read and
+ * exchange a new raw count - then add that new-prev delta
+ * count to the generic event atomically:
+ */
+ prev_raw_count = local64_read(&hwc->prev_count);
+ if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
+ new_raw_count) != prev_raw_count)
+ return 0;
+
+ /*
+ * Now we have the new raw value and have updated the prev
+ * timestamp already. We can now calculate the elapsed delta
+ * (event-)time and add that to the generic event.
+ *
+ * Careful, not all hw sign-extends above the physical width
+ * of the count.
+ */
+ delta = (new_raw_count << shift) - (prev_raw_count << shift);
+ delta >>= shift;
+
+ local64_add(delta, &event->count);
+ local64_sub(delta, &hwc->period_left);
+
+ return 1;
+}
+
static struct perf_ibs perf_ibs_fetch;
static struct perf_ibs perf_ibs_op;
@@ -91,18 +165,14 @@ static int perf_ibs_init(struct perf_event *event)
if (hwc->sample_period & 0x0f)
/* lower 4 bits can not be set in ibs max cnt */
return -EINVAL;
- max_cnt = hwc->sample_period >> 4;
- if (max_cnt & ~perf_ibs->cnt_mask)
- /* out of range */
- return -EINVAL;
- config |= max_cnt;
} else {
max_cnt = config & perf_ibs->cnt_mask;
+ config &= ~perf_ibs->cnt_mask;
event->attr.sample_period = max_cnt << 4;
hwc->sample_period = event->attr.sample_period;
}
- if (!max_cnt)
+ if (!hwc->sample_period)
return -EINVAL;
hwc->config_base = perf_ibs->msr;
@@ -111,16 +181,71 @@ static int perf_ibs_init(struct perf_event *event)
return 0;
}
+static int perf_ibs_set_period(struct perf_ibs *perf_ibs,
+ struct hw_perf_event *hwc, u64 *period)
+{
+ int ret;
+
+ /* ignore lower 4 bits in min count: */
+ ret = perf_event_set_period(hwc, 1<<4, perf_ibs->max_period, period);
+ local64_set(&hwc->prev_count, 0);
+
+ return ret;
+}
+
+static u64 get_ibs_fetch_count(u64 config)
+{
+ return (config & IBS_FETCH_CNT) >> 12;
+}
+
+static u64 get_ibs_op_count(u64 config)
+{
+ return (config & IBS_OP_CUR_CNT) >> 32;
+}
+
+static void
+perf_ibs_event_update(struct perf_ibs *perf_ibs, struct perf_event *event,
+ u64 config)
+{
+ u64 count = perf_ibs->get_count(config);
+
+ while (!perf_event_try_update(event, count, 20)) {
+ rdmsrl(event->hw.config_base, config);
+ count = perf_ibs->get_count(config);
+ }
+}
+
+/* Note: The enable mask must be encoded in the config argument. */
+static inline void perf_ibs_enable_event(struct hw_perf_event *hwc, u64 config)
+{
+ wrmsrl(hwc->config_base, hwc->config | config);
+}
+
+/*
+ * We cannot restore the ibs pmu state, so we always need to update
+ * the event while stopping it and then reset the state when starting
+ * again. Thus, ignoring PERF_EF_RELOAD and PERF_EF_UPDATE flags in
+ * perf_ibs_start()/perf_ibs_stop() and instead always do it.
+ */
static void perf_ibs_start(struct perf_event *event, int flags)
{
struct hw_perf_event *hwc = &event->hw;
struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
+ u64 config;
- if (test_and_set_bit(IBS_STARTED, pcpu->state))
+ if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
return;
- wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+ WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
+ hwc->state = 0;
+
+ perf_ibs_set_period(perf_ibs, hwc, &config);
+ config = (config >> 4) | perf_ibs->enable_mask;
+ set_bit(IBS_STARTED, pcpu->state);
+ perf_ibs_enable_event(hwc, config);
+
+ perf_event_update_userpage(event);
}
static void perf_ibs_stop(struct perf_event *event, int flags)
@@ -129,15 +254,28 @@ static void perf_ibs_stop(struct perf_event *event, int flags)
struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
struct cpu_perf_ibs *pcpu = this_cpu_ptr(perf_ibs->pcpu);
u64 val;
+ int stopping;
- if (!test_and_clear_bit(IBS_STARTED, pcpu->state))
- return;
+ stopping = test_and_clear_bit(IBS_STARTED, pcpu->state);
- set_bit(IBS_STOPPING, pcpu->state);
+ if (!stopping && (hwc->state & PERF_HES_UPTODATE))
+ return;
rdmsrl(hwc->config_base, val);
- val &= ~perf_ibs->enable_mask;
- wrmsrl(hwc->config_base, val);
+
+ if (stopping) {
+ set_bit(IBS_STOPPING, pcpu->state);
+ val &= ~perf_ibs->enable_mask;
+ wrmsrl(hwc->config_base, val);
+ WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+ hwc->state |= PERF_HES_STOPPED;
+ }
+
+ if (hwc->state & PERF_HES_UPTODATE)
+ return;
+
+ perf_ibs_event_update(perf_ibs, event, val);
+ hwc->state |= PERF_HES_UPTODATE;
}
static int perf_ibs_add(struct perf_event *event, int flags)
@@ -148,6 +286,8 @@ static int perf_ibs_add(struct perf_event *event, int flags)
if (test_and_set_bit(IBS_ENABLED, pcpu->state))
return -ENOSPC;
+ event->hw.state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
pcpu->event = event;
if (flags & PERF_EF_START)
@@ -164,9 +304,11 @@ static void perf_ibs_del(struct perf_event *event, int flags)
if (!test_and_clear_bit(IBS_ENABLED, pcpu->state))
return;
- perf_ibs_stop(event, 0);
+ perf_ibs_stop(event, PERF_EF_UPDATE);
pcpu->event = NULL;
+
+ perf_event_update_userpage(event);
}
static void perf_ibs_read(struct perf_event *event) { }
@@ -187,8 +329,11 @@ static struct perf_ibs perf_ibs_fetch = {
.cnt_mask = IBS_FETCH_MAX_CNT,
.enable_mask = IBS_FETCH_ENABLE,
.valid_mask = IBS_FETCH_VAL,
+ .max_period = IBS_FETCH_MAX_CNT << 4,
.offset_mask = { MSR_AMD64_IBSFETCH_REG_MASK },
.offset_max = MSR_AMD64_IBSFETCH_REG_COUNT,
+
+ .get_count = get_ibs_fetch_count,
};
static struct perf_ibs perf_ibs_op = {
@@ -207,8 +352,11 @@ static struct perf_ibs perf_ibs_op = {
.cnt_mask = IBS_OP_MAX_CNT,
.enable_mask = IBS_OP_ENABLE,
.valid_mask = IBS_OP_VAL,
+ .max_period = IBS_OP_MAX_CNT << 4,
.offset_mask = { MSR_AMD64_IBSOP_REG_MASK },
.offset_max = MSR_AMD64_IBSOP_REG_COUNT,
+
+ .get_count = get_ibs_op_count,
};
static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
@@ -220,9 +368,9 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
struct perf_raw_record raw;
struct pt_regs regs;
struct perf_ibs_data ibs_data;
- int offset, size;
+ int offset, size, overflow, reenable;
unsigned int msr;
- u64 *buf;
+ u64 *buf, config;
if (!test_bit(IBS_STARTED, pcpu->state)) {
/* Catch spurious interrupts after stopping IBS: */
@@ -257,11 +405,25 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
regs = *iregs; /* XXX: update ip from ibs sample */
- if (perf_event_overflow(event, &data, &regs))
- ; /* stop */
- else
- /* reenable */
- wrmsrl(hwc->config_base, hwc->config | perf_ibs->enable_mask);
+ /*
+ * Emulate IbsOpCurCnt in MSRC001_1033 (IbsOpCtl), not
+ * supported in all cpus. As this triggered an interrupt, we
+ * set the current count to the max count.
+ */
+ config = ibs_data.regs[0];
+ if (perf_ibs == &perf_ibs_op && !(ibs_caps & IBS_CAPS_RDWROPCNT)) {
+ config &= ~IBS_OP_CUR_CNT;
+ config |= (config & IBS_OP_MAX_CNT) << 36;
+ }
+
+ perf_ibs_event_update(perf_ibs, event, config);
+
+ overflow = perf_ibs_set_period(perf_ibs, hwc, &config);
+ reenable = !(overflow && perf_event_overflow(event, &data, &regs));
+ config = (config >> 4) | (reenable ? perf_ibs->enable_mask : 0);
+ perf_ibs_enable_event(hwc, config);
+
+ perf_event_update_userpage(event);
return 1;
}
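To see how a sample period larger than the hardware limit is split up, the clamping in perf_event_set_period() can be simulated in user space. This is a rough sketch under assumptions: struct hw_model and both functions are hypothetical, plain integers replace the local64_t accessors, period_left is assumed to start at sample_period, and each interrupt is assumed to fire exactly after the programmed chunk.

```c
#include <assert.h>
#include <stdint.h>

struct hw_model {
	int64_t period_left;	/* models hwc->period_left */
	int64_t sample_period;	/* models hwc->sample_period */
};

/* Same clamping steps as perf_event_set_period() in the patch. */
static int model_set_period(struct hw_model *hw, int64_t min, int64_t max,
			    uint64_t *count)
{
	int64_t left = hw->period_left;
	int64_t period = hw->sample_period;
	int overflow = 0;

	if (left <= -period) {
		left = period;
		hw->period_left = left;
		overflow = 1;
	}
	if (left <= 0) {
		left += period;
		hw->period_left = left;
		overflow = 1;
	}
	if (left < min)
		left = min;
	if (left > max)
		left = max;

	*count = (uint64_t)left;
	return overflow;
}

/* Number of hardware interrupts until one full sample period has elapsed. */
static int interrupts_per_sample(int64_t sample_period, int64_t max_period)
{
	struct hw_model hw = { sample_period, sample_period };
	uint64_t chunk;
	int irqs = 0;

	assert(sample_period > 0 && max_period >= 16);

	model_set_period(&hw, 16, max_period, &chunk);	/* initial start */
	do {
		hw.period_left -= (int64_t)chunk;	/* counter ran for one chunk */
		irqs++;
	} while (!model_set_period(&hw, 16, max_period, &chunk));

	return irqs;
}
```

With a period of 3000000 and a maximum chunk of 0xFFFF << 4, the model takes three interrupts per reported sample; only the last one signals an overflow, which matches the reenable logic in the handler.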
* Re: [PATCH v4 0/4] perf, x86: Implement AMD IBS
2011-12-15 16:56 [PATCH v4 0/4] perf, x86: Implement AMD IBS Robert Richter
` (3 preceding siblings ...)
2011-12-15 16:56 ` [PATCH v4 4/4] perf, x86: Implement 64 bit counter support for IBS Robert Richter
@ 2011-12-22 15:36 ` Robert Richter
4 siblings, 0 replies; 10+ messages in thread
From: Robert Richter @ 2011-12-22 15:36 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Stephane Eranian, LKML
Ingo,
On 15.12.11 17:56:35, Robert Richter wrote:
> Changes for V4 (only patch #3 modified):
>
> * removed userland implementation from patch set, will post it as
> separate patch series
> * fix returned number of handled nmis in perf_ibs_handle_irq()
> Robert Richter (4):
> perf, x86: Implement IBS event configuration
> perf, x86: Implement IBS interrupt handler
> perf, x86: Implement IBS pmu control ops
> perf, x86: Implement 64 bit counter support for IBS
>
> arch/x86/include/asm/msr-index.h | 5 +
> arch/x86/include/asm/perf_event.h | 2 +
> arch/x86/kernel/cpu/perf_event_amd_ibs.c | 438 +++++++++++++++++++++++++++++-
> 3 files changed, 438 insertions(+), 7 deletions(-)
Assuming that there is agreement about the perf tool ibs
implementation, could you consider applying the (kernel) patches? Not
sure what Arnaldo's plan is for the userland. It would be great to
have both ready for 3.3.
(Even if Peter is on vacation or so: except for the fix in
perf_ibs_handle_irq(), the kernel patches are unchanged and Peter
already agreed to apply them.)
Thanks,
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center