* [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations
@ 2024-12-10 9:34 Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 01/10] perf/amd/ibs: Remove IBS_{FETCH|OP}_CONFIG_MASK macros Ravi Bangoria
` (10 more replies)
0 siblings, 11 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
IBS Fetch and IBS Op PMUs have constraints on supported sample period
values. The IBS hw behavior could be undefined if they are not followed.
Currently, IBS driver does not honor them correctly and thus a malicious
event could cause issues to the system. Fortunately, the IBS hw is very
resilient so far and IBS PMUs are restricted to root only, so the attack
vector is minimal. In any case, these are genuine bugs and must be fixed.
There were some conflicts with Namhyung's IBS patches[1] which are now
in the tip tree, so rebased the series on tip/perf/core (02c56362a7d3).
v2: https://lore.kernel.org/r/20241206051713.991-1-ravi.bangoria@amd.com
v2->v3:
- No code changes
- s/pmu/PMU/g
- Include unit test example in the commit description
[1]: https://lore.kernel.org/r/20241203180441.1634709-1-namhyung@kernel.org
Ravi Bangoria (10):
perf/amd/ibs: Remove IBS_{FETCH|OP}_CONFIG_MASK macros
perf/amd/ibs: Remove pointless sample period check
perf/amd/ibs: Fix ->config to sample period calculation for OP PMU
perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt
perf/amd/ibs: Don't allow freq mode event creation through ->config
interface
perf/amd/ibs: Add PMU specific minimum period
perf/amd/ibs: Add ->check_period() callback
perf/core: Introduce pmu->adjust_period() callback
perf test: Introduce DEFINE_SUITE_EXCLUSIVE()
perf test amd ibs: Add sample period unit test
arch/x86/events/amd/ibs.c | 96 ++-
arch/x86/include/asm/perf_event.h | 1 +
include/linux/perf_event.h | 5 +
kernel/events/core.c | 12 +-
tools/perf/arch/x86/include/arch-tests.h | 1 +
tools/perf/arch/x86/tests/Build | 1 +
tools/perf/arch/x86/tests/amd-ibs-period.c | 953 +++++++++++++++++++++
tools/perf/arch/x86/tests/arch-tests.c | 2 +
tools/perf/tests/tests.h | 10 +
9 files changed, 1055 insertions(+), 26 deletions(-)
create mode 100644 tools/perf/arch/x86/tests/amd-ibs-period.c
--
2.43.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v3 01/10] perf/amd/ibs: Remove IBS_{FETCH|OP}_CONFIG_MASK macros
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 02/10] perf/amd/ibs: Remove pointless sample period check Ravi Bangoria
` (9 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
Definition of these macros are very simple and they are used at only one
place. Get rid of unnecessary redirection.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index e7a8b8758e08..4ca8006d2221 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -28,9 +28,6 @@ static u32 ibs_caps;
#include <asm/nmi.h>
#include <asm/amd-ibs.h>
-#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
-#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
-
/* attr.config2 */
#define IBS_SW_FILTER_MASK 1
@@ -688,7 +685,7 @@ static struct perf_ibs perf_ibs_fetch = {
.read = perf_ibs_read,
},
.msr = MSR_AMD64_IBSFETCHCTL,
- .config_mask = IBS_FETCH_CONFIG_MASK,
+ .config_mask = IBS_FETCH_MAX_CNT | IBS_FETCH_RAND_EN,
.cnt_mask = IBS_FETCH_MAX_CNT,
.enable_mask = IBS_FETCH_ENABLE,
.valid_mask = IBS_FETCH_VAL,
@@ -711,7 +708,7 @@ static struct perf_ibs perf_ibs_op = {
.read = perf_ibs_read,
},
.msr = MSR_AMD64_IBSOPCTL,
- .config_mask = IBS_OP_CONFIG_MASK,
+ .config_mask = IBS_OP_MAX_CNT,
.cnt_mask = IBS_OP_MAX_CNT | IBS_OP_CUR_CNT |
IBS_OP_CUR_CNT_RAND,
.enable_mask = IBS_OP_ENABLE,
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 02/10] perf/amd/ibs: Remove pointless sample period check
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 01/10] perf/amd/ibs: Remove IBS_{FETCH|OP}_CONFIG_MASK macros Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-18 13:35 ` Peter Zijlstra
2024-12-10 9:34 ` [PATCH v3 03/10] perf/amd/ibs: Fix ->config to sample period calculation for OP PMU Ravi Bangoria
` (8 subsequent siblings)
10 siblings, 1 reply; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
Valid perf event sample period value for IBS PMUs (Fetch and Op both)
is limited to multiple of 0x10. perf_ibs_init() has this check:
if (!event->attr.sample_freq && hwc->sample_period & 0x0f)
return -EINVAL;
But it's broken since hwc->sample_period will always be 0 when
event->attr.sample_freq is 0 (irrespective of event->attr.freq value.)
One option to fix this is to change the condition:
- if (!event->attr.sample_freq && hwc->sample_period & 0x0f)
+ if (!event->attr.freq && hwc->sample_period & 0x0f)
However, that will break all userspace tools which have been using IBS
event with sample_period not multiple of 0x10.
Another option is to remove the condition altogether and mask lower
nibble _silently_, same as what current code is inadvertently doing.
I'm preferring this approach as it keeps the existing behavior.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 4ca8006d2221..bd8919e7c3b1 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -307,13 +307,8 @@ static int perf_ibs_init(struct perf_event *event)
if (config & perf_ibs->cnt_mask)
/* raw max_cnt may not be set */
return -EINVAL;
- if (!event->attr.sample_freq && hwc->sample_period & 0x0f)
- /*
- * lower 4 bits can not be set in ibs max cnt,
- * but allowing it in case we adjust the
- * sample period to set a frequency.
- */
- return -EINVAL;
+
+ /* Silently mask off lower nibble. IBS hw mandates it. */
hwc->sample_period &= ~0x0FULL;
if (!hwc->sample_period)
hwc->sample_period = 0x10;
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 03/10] perf/amd/ibs: Fix ->config to sample period calculation for OP PMU
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 01/10] perf/amd/ibs: Remove IBS_{FETCH|OP}_CONFIG_MASK macros Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 02/10] perf/amd/ibs: Remove pointless sample period check Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 04/10] perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt Ravi Bangoria
` (7 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
Instead of using standard perf_event_attr->freq=0 and ->sample_period
fields, IBS event in 'sample period mode' can also be opened by setting
period value directly in perf_event_attr->config in a MaxCnt bit-field
format.
IBS OP MaxCnt bits are defined as:
(high bits) IbsOpCtl[26:20] = IbsOpMaxCnt[26:20]
(low bits) IbsOpCtl[15:0] = IbsOpMaxCnt[19:4]
Perf event sample period can be derived from MaxCnt bits as:
sample_period = (high bits) | ((low_bits) << 4);
However, current code just masks MaxCnt bits and shifts all of them,
including high bits, which is incorrect. Fix it.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index bd8919e7c3b1..f95542b75b91 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -271,7 +271,7 @@ static int perf_ibs_init(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct perf_ibs *perf_ibs;
- u64 max_cnt, config;
+ u64 config;
int ret;
perf_ibs = get_ibs_pmu(event->attr.type);
@@ -313,10 +313,19 @@ static int perf_ibs_init(struct perf_event *event)
if (!hwc->sample_period)
hwc->sample_period = 0x10;
} else {
- max_cnt = config & perf_ibs->cnt_mask;
+ u64 period = 0;
+
+ if (perf_ibs == &perf_ibs_op) {
+ period = (config & IBS_OP_MAX_CNT) << 4;
+ if (ibs_caps & IBS_CAPS_OPCNTEXT)
+ period |= config & IBS_OP_MAX_CNT_EXT_MASK;
+ } else {
+ period = (config & IBS_FETCH_MAX_CNT) << 4;
+ }
+
config &= ~perf_ibs->cnt_mask;
- event->attr.sample_period = max_cnt << 4;
- hwc->sample_period = event->attr.sample_period;
+ event->attr.sample_period = period;
+ hwc->sample_period = period;
}
if (!hwc->sample_period)
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 04/10] perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (2 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 03/10] perf/amd/ibs: Fix ->config to sample period calculation for OP PMU Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 05/10] perf/amd/ibs: Don't allow freq mode event creation through ->config interface Ravi Bangoria
` (6 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
IBS Op uses two counters: MaxCnt and CurCnt. MaxCnt is programmed with
the desired sample period. IBS hw generates sample when CurCnt reaches
to MaxCnt. The size of these counter used to be 20 bits but later they
were extended to 27 bits. The 7 bit extension is indicated by CPUID
Fn8000_001B_EAX[6 / OpCntExt].
perf_ibs->cnt_mask variable contains bit masks for MaxCnt and CurCnt.
But IBS driver does not set upper 7 bits of CurCnt in cnt_mask even
when OpCntExt CPUID bit is set. Fix this.
IBS driver uses cnt_mask[CurCnt] bits only while disabling an event.
Fortunately, CurCnt bits are not read from MSR while re-enabling the
event, instead MaxCnt is programmed with desired period and CurCnt is
set to 0. Hence, we did not see any issues so far.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 3 ++-
arch/x86/include/asm/perf_event.h | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index f95542b75b91..d9c84f1d530f 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1245,7 +1245,8 @@ static __init int perf_ibs_op_init(void)
if (ibs_caps & IBS_CAPS_OPCNTEXT) {
perf_ibs_op.max_period |= IBS_OP_MAX_CNT_EXT_MASK;
perf_ibs_op.config_mask |= IBS_OP_MAX_CNT_EXT_MASK;
- perf_ibs_op.cnt_mask |= IBS_OP_MAX_CNT_EXT_MASK;
+ perf_ibs_op.cnt_mask |= (IBS_OP_MAX_CNT_EXT_MASK |
+ IBS_OP_CUR_CNT_EXT_MASK);
}
if (ibs_caps & IBS_CAPS_ZEN4)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index cb9c4679f45c..aff9fc693b11 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -510,6 +510,7 @@ struct pebs_xmm {
*/
#define IBS_OP_CUR_CNT (0xFFF80ULL<<32)
#define IBS_OP_CUR_CNT_RAND (0x0007FULL<<32)
+#define IBS_OP_CUR_CNT_EXT_MASK (0x7FULL<<52)
#define IBS_OP_CNT_CTL (1ULL<<19)
#define IBS_OP_VAL (1ULL<<18)
#define IBS_OP_ENABLE (1ULL<<17)
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 05/10] perf/amd/ibs: Don't allow freq mode event creation through ->config interface
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (3 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 04/10] perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 06/10] perf/amd/ibs: Add PMU specific minimum period Ravi Bangoria
` (5 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
Most perf_event_attr->config bits directly maps to IBS_{FETCH|OP}_CTL
MSR. Since the sample period is programmed in these control registers,
IBS PMU driver allows opening an IBS event by setting sample period
value directly in perf_event_attr->config instead of using explicit
perf_event_attr->sample_period interface.
However, this logic is not applicable for freq mode events since the
semantics of control register fields are applicable only to fixed
sample period whereas the freq mode event adjusts sample period after
each and every sample. Currently, IBS driver (unintentionally) allows
creating freq mode event via ->config interface, which is semantically
wrong as well as detrimental because it can be misused to bypass
perf_event_max_sample_rate checks.
Don't allow freq mode event creation through perf_event_attr->config
interface.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index d9c84f1d530f..3e7ca1e2f25e 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -315,6 +315,9 @@ static int perf_ibs_init(struct perf_event *event)
} else {
u64 period = 0;
+ if (event->attr.freq)
+ return -EINVAL;
+
if (perf_ibs == &perf_ibs_op) {
period = (config & IBS_OP_MAX_CNT) << 4;
if (ibs_caps & IBS_CAPS_OPCNTEXT)
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 06/10] perf/amd/ibs: Add PMU specific minimum period
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (4 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 05/10] perf/amd/ibs: Don't allow freq mode event creation through ->config interface Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 07/10] perf/amd/ibs: Add ->check_period() callback Ravi Bangoria
` (4 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
0x10 is the minimum sample period for IBS Fetch and 0x90 for IBS Op.
Current IBS PMU driver uses 0x10 for both the PMUs, which is incorrect.
Fix it by adding PMU specific minimum period values in struct perf_ibs.
Also, bail out opening a 'sample period mode' event if the user requested
sample period is less than PMU supported minimum value. For a 'freq mode'
event, start calibrating sample period from PMU specific minimum period.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 3e7ca1e2f25e..7b54b76d39f5 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -86,6 +86,7 @@ struct perf_ibs {
u64 cnt_mask;
u64 enable_mask;
u64 valid_mask;
+ u16 min_period;
u64 max_period;
unsigned long offset_mask[1];
int offset_max;
@@ -308,10 +309,14 @@ static int perf_ibs_init(struct perf_event *event)
/* raw max_cnt may not be set */
return -EINVAL;
- /* Silently mask off lower nibble. IBS hw mandates it. */
- hwc->sample_period &= ~0x0FULL;
- if (!hwc->sample_period)
- hwc->sample_period = 0x10;
+ if (event->attr.freq) {
+ hwc->sample_period = perf_ibs->min_period;
+ } else {
+ /* Silently mask off lower nibble. IBS hw mandates it. */
+ hwc->sample_period &= ~0x0FULL;
+ if (hwc->sample_period < perf_ibs->min_period)
+ return -EINVAL;
+ }
} else {
u64 period = 0;
@@ -329,10 +334,10 @@ static int perf_ibs_init(struct perf_event *event)
config &= ~perf_ibs->cnt_mask;
event->attr.sample_period = period;
hwc->sample_period = period;
- }
- if (!hwc->sample_period)
- return -EINVAL;
+ if (hwc->sample_period < perf_ibs->min_period)
+ return -EINVAL;
+ }
/*
* If we modify hwc->sample_period, we also need to update
@@ -353,7 +358,8 @@ static int perf_ibs_set_period(struct perf_ibs *perf_ibs,
int overflow;
/* ignore lower 4 bits in min count: */
- overflow = perf_event_set_period(hwc, 1<<4, perf_ibs->max_period, period);
+ overflow = perf_event_set_period(hwc, perf_ibs->min_period,
+ perf_ibs->max_period, period);
local64_set(&hwc->prev_count, 0);
return overflow;
@@ -696,6 +702,7 @@ static struct perf_ibs perf_ibs_fetch = {
.cnt_mask = IBS_FETCH_MAX_CNT,
.enable_mask = IBS_FETCH_ENABLE,
.valid_mask = IBS_FETCH_VAL,
+ .min_period = 0x10,
.max_period = IBS_FETCH_MAX_CNT << 4,
.offset_mask = { MSR_AMD64_IBSFETCH_REG_MASK },
.offset_max = MSR_AMD64_IBSFETCH_REG_COUNT,
@@ -720,6 +727,7 @@ static struct perf_ibs perf_ibs_op = {
IBS_OP_CUR_CNT_RAND,
.enable_mask = IBS_OP_ENABLE,
.valid_mask = IBS_OP_VAL,
+ .min_period = 0x90,
.max_period = IBS_OP_MAX_CNT << 4,
.offset_mask = { MSR_AMD64_IBSOP_REG_MASK },
.offset_max = MSR_AMD64_IBSOP_REG_COUNT,
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 07/10] perf/amd/ibs: Add ->check_period() callback
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (5 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 06/10] perf/amd/ibs: Add PMU specific minimum period Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 08/10] perf/core: Introduce pmu->adjust_period() callback Ravi Bangoria
` (3 subsequent siblings)
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
IBS Fetch and IBS Op PMUs have constraints on sample period. The sample
period is verified at the time of opening an event but not at the ioctl()
interface. Hence, a user can open an event with valid period but change
it later with ioctl(). Add a ->check_period() callback to verify the
period provided at ioctl() is also valid.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 7b54b76d39f5..aea893a971b6 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -564,6 +564,28 @@ static void perf_ibs_del(struct perf_event *event, int flags)
static void perf_ibs_read(struct perf_event *event) { }
+static int perf_ibs_check_period(struct perf_event *event, u64 value)
+{
+ struct perf_ibs *perf_ibs;
+ u64 low_nibble;
+
+ if (event->attr.freq)
+ return 0;
+
+ perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+ low_nibble = value & 0xFULL;
+
+ /*
+ * This contradicts with perf_ibs_init() which allows sample period
+ * with lower nibble bits set but silently masks them off. Whereas
+ * this returns error.
+ */
+ if (low_nibble || value < perf_ibs->min_period)
+ return -EINVAL;
+
+ return 0;
+}
+
/*
* We need to initialize with empty group if all attributes in the
* group are dynamic.
@@ -696,6 +718,7 @@ static struct perf_ibs perf_ibs_fetch = {
.start = perf_ibs_start,
.stop = perf_ibs_stop,
.read = perf_ibs_read,
+ .check_period = perf_ibs_check_period,
},
.msr = MSR_AMD64_IBSFETCHCTL,
.config_mask = IBS_FETCH_MAX_CNT | IBS_FETCH_RAND_EN,
@@ -720,6 +743,7 @@ static struct perf_ibs perf_ibs_op = {
.start = perf_ibs_start,
.stop = perf_ibs_stop,
.read = perf_ibs_read,
+ .check_period = perf_ibs_check_period,
},
.msr = MSR_AMD64_IBSOPCTL,
.config_mask = IBS_OP_MAX_CNT,
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 08/10] perf/core: Introduce pmu->adjust_period() callback
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (6 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 07/10] perf/amd/ibs: Add ->check_period() callback Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-18 13:52 ` Peter Zijlstra
2024-12-10 9:34 ` [PATCH v3 09/10] perf test: Introduce DEFINE_SUITE_EXCLUSIVE() Ravi Bangoria
` (2 subsequent siblings)
10 siblings, 1 reply; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
Many hardware PMUs have constraints about sample period. For ex, minimum
supported sample period for IBS Op PMU is 0x90, the sample period must
be multiple of 0x10 for IBS Fetch and IBS Op.
Add an optional callback adjust_period() to struct PMU to allow PMU
specific drivers to adjust sample period calculated by generic code.
This will ensure the sample_period value will always be valid and no
additional code is required in PMU specific drivers to re-adjust the
period.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 11 +++++++++++
include/linux/perf_event.h | 5 +++++
kernel/events/core.c | 12 ++++++++++--
3 files changed, 26 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index aea893a971b6..db6dc7b231e2 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -586,6 +586,15 @@ static int perf_ibs_check_period(struct perf_event *event, u64 value)
return 0;
}
+static u64 perf_ibs_adjust_period(struct perf_event *event, u64 period)
+{
+ struct perf_ibs *perf_ibs = container_of(event->pmu, struct perf_ibs, pmu);
+
+ period &= ~0xFULL;
+
+ return period < perf_ibs->min_period ? perf_ibs->min_period : period;
+}
+
/*
* We need to initialize with empty group if all attributes in the
* group are dynamic.
@@ -719,6 +728,7 @@ static struct perf_ibs perf_ibs_fetch = {
.stop = perf_ibs_stop,
.read = perf_ibs_read,
.check_period = perf_ibs_check_period,
+ .adjust_period = perf_ibs_adjust_period,
},
.msr = MSR_AMD64_IBSFETCHCTL,
.config_mask = IBS_FETCH_MAX_CNT | IBS_FETCH_RAND_EN,
@@ -744,6 +754,7 @@ static struct perf_ibs perf_ibs_op = {
.stop = perf_ibs_stop,
.read = perf_ibs_read,
.check_period = perf_ibs_check_period,
+ .adjust_period = perf_ibs_adjust_period,
},
.msr = MSR_AMD64_IBSOPCTL,
.config_mask = IBS_OP_MAX_CNT,
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 8333f132f4a9..4dcc51f5d2b6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -582,6 +582,11 @@ struct pmu {
* Check period value for PERF_EVENT_IOC_PERIOD ioctl.
*/
int (*check_period) (struct perf_event *event, u64 value); /* optional */
+
+ /*
+ * Adjust period value according to PMU constraints.
+ */
+ u64 (*adjust_period) (struct perf_event *event, u64 period); /* optional */
};
enum perf_addr_filter_action_t {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index b2bc67791f84..e71aded67ce6 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -4192,9 +4192,9 @@ static void perf_adjust_period(struct perf_event *event, u64 nsec, u64 count, bo
if (!sample_period)
sample_period = 1;
- hwc->sample_period = sample_period;
+ hwc->sample_period = event->pmu->adjust_period(event, sample_period);
- if (local64_read(&hwc->period_left) > 8*sample_period) {
+ if (local64_read(&hwc->period_left) > 8*hwc->sample_period) {
if (disable)
event->pmu->stop(event, PERF_EF_UPDATE);
@@ -11519,6 +11519,11 @@ static int perf_event_nop_int(struct perf_event *event, u64 value)
return 0;
}
+static u64 perf_pmu_nop_adjust_period(struct perf_event *event, u64 period)
+{
+ return period;
+}
+
static DEFINE_PER_CPU(unsigned int, nop_txn_flags);
static void perf_pmu_start_txn(struct pmu *pmu, unsigned int flags)
@@ -11856,6 +11861,9 @@ int perf_pmu_register(struct pmu *pmu, const char *name, int type)
if (!pmu->check_period)
pmu->check_period = perf_event_nop_int;
+ if (!pmu->adjust_period)
+ pmu->adjust_period = perf_pmu_nop_adjust_period;
+
if (!pmu->event_idx)
pmu->event_idx = perf_event_idx_default;
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 09/10] perf test: Introduce DEFINE_SUITE_EXCLUSIVE()
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (7 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 08/10] perf/core: Introduce pmu->adjust_period() callback Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-13 19:55 ` Arnaldo Carvalho de Melo
2024-12-10 9:34 ` [PATCH v3 10/10] perf test amd ibs: Add sample period unit test Ravi Bangoria
2024-12-16 6:42 ` [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
10 siblings, 1 reply; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
A variant of DEFINE_SUITE() but sets ->exclusive bit for the test so the
test will be executed sequentially.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
tools/perf/tests/tests.h | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index cb58b43aa063..8aea344536b8 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -81,6 +81,16 @@ struct test_suite {
.test_cases = tests__##_name, \
}
+#define DEFINE_SUITE_EXCLUSIVE(description, _name) \
+ struct test_case tests__##_name[] = { \
+ TEST_CASE_EXCLUSIVE(description, _name),\
+ { .name = NULL, } \
+ }; \
+ struct test_suite suite__##_name = { \
+ .desc = description, \
+ .test_cases = tests__##_name, \
+ }
+
/* Tests */
DECLARE_SUITE(vmlinux_matches_kallsyms);
DECLARE_SUITE(openat_syscall_event);
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH v3 10/10] perf test amd ibs: Add sample period unit test
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (8 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 09/10] perf test: Introduce DEFINE_SUITE_EXCLUSIVE() Ravi Bangoria
@ 2024-12-10 9:34 ` Ravi Bangoria
2024-12-16 6:42 ` [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-10 9:34 UTC (permalink / raw)
To: peterz, mingo, namhyung
Cc: ravi.bangoria, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
IBS Fetch and IBS Op PMUs has various constraints on supported sample
periods. Add perf unit tests to test those.
Running it in parallel with other tests causes intermittent failures.
Mark it exclusive to force it to run sequentially. Sample output on a
Zen5 machine:
Without kernel fixes:
$ sudo ./perf test -vv 112
112: AMD IBS sample period:
--- start ---
test child forked, pid 16675
Using CPUID AuthenticAMD-26-2-1
IBS config tests:
-----------------
Fetch PMU tests:
0xffff : Ok (nr samples: 1001)
0x1000 : Ok (nr samples: 15557)
0xff : Ok (nr samples: 40993)
0x1 : Ok (nr samples: 40543)
0x0 : Ok
0x10000 : Ok
Op PMU tests:
0x0 : Ok
0x1 : Fail
0x8 : Fail
0x9 : Ok (nr samples: 40543)
0xf : Ok (nr samples: 40543)
0x1000 : Ok (nr samples: 18399)
0xffff : Ok (nr samples: 1191)
0x10000 : Ok
0x100000 : Fail (nr samples: 19)
0xf00000 : Fail (nr samples: 1)
0xf0ffff : Fail (nr samples: 2)
0x1f0ffff : Fail (nr samples: 1)
0x7f0ffff : Fail (nr samples: 0)
0x8f0ffff : Ok
0x17f0ffff : Ok
IBS sample period constraint tests:
-----------------------------------
Fetch PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Fail
freq 0, sample_freq 15: Fail
freq 0, sample_freq 16: Ok (nr samples: 1604)
freq 0, sample_freq 17: Ok (nr samples: 1604)
freq 0, sample_freq 143: Ok (nr samples: 1604)
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1471)
freq 0, sample_freq 65520: Ok (nr samples: 2294)
freq 0, sample_freq 65535: Ok (nr samples: 2130)
freq 0, sample_freq 65552: Ok (nr samples: 2253)
freq 0, sample_freq 8388607: Ok (nr samples: 26)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 5)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 5)
freq 1, sample_freq 1234: Ok (nr samples: 7)
freq 1, sample_freq 4103: Ok (nr samples: 32)
freq 1, sample_freq 65520: Ok (nr samples: 530)
freq 1, sample_freq 65535: Ok (nr samples: 406)
freq 1, sample_freq 65552: Ok (nr samples: 129)
freq 1, sample_freq 8388607: Ok
Op PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Fail
freq 0, sample_freq 15: Fail
freq 0, sample_freq 16: Fail
freq 0, sample_freq 17: Fail
freq 0, sample_freq 143: Fail
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1473)
freq 0, sample_freq 65520: Ok (nr samples: 2159)
freq 0, sample_freq 65535: Ok (nr samples: 2185)
freq 0, sample_freq 65552: Ok (nr samples: 2105)
freq 0, sample_freq 8388607: Ok (nr samples: 24)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Fail (nr samples: 4)
freq 1, sample_freq 15: Fail (nr samples: 4)
freq 1, sample_freq 16: Fail (nr samples: 4)
freq 1, sample_freq 17: Fail (nr samples: 4)
freq 1, sample_freq 143: Fail (nr samples: 5)
freq 1, sample_freq 144: Fail (nr samples: 5)
freq 1, sample_freq 145: Fail (nr samples: 5)
freq 1, sample_freq 1234: Fail (nr samples: 7)
freq 1, sample_freq 4103: Fail (nr samples: 33)
freq 1, sample_freq 65520: Fail (nr samples: 554)
freq 1, sample_freq 65535: Fail (nr samples: 561)
freq 1, sample_freq 65552: Fail (nr samples: 551)
freq 1, sample_freq 8388607: Ok
IBS ioctl() tests:
------------------
Fetch PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Fail
ioctl(period = 0xf ): Fail
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Fail
ioctl(period = 0x1f ): Fail
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Fail
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Fail
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Fail
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Fail
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
Op PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Fail
ioctl(period = 0xf ): Fail
ioctl(period = 0x10 ): Fail
ioctl(period = 0x11 ): Fail
ioctl(period = 0x1f ): Fail
ioctl(period = 0x20 ): Fail
ioctl(period = 0x80 ): Fail
ioctl(period = 0x8f ): Fail
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Fail
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Fail
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Fail
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
IBS freq (negative) tests:
--------------------------
freq 1, sample_freq 200000: Fail
IBS L3MissOnly test: (takes a while)
--------------------
Fetch L3MissOnly: Fail (nr_samples: 1358)
Op L3MissOnly: Ok (nr_samples: 1422)
---- end(-1) ----
112: AMD IBS sample period : FAILED!
With kernel fixes:
$ sudo ./perf test -vv 112
112: AMD IBS sample period:
--- start ---
test child forked, pid 11681
Using CPUID AuthenticAMD-26-2-1
IBS config tests:
-----------------
Fetch PMU tests:
0xffff : Ok (nr samples: 942)
0x1000 : Ok (nr samples: 15396)
0xff : Ok (nr samples: 40547)
0x1 : Ok (nr samples: 40543)
0x0 : Ok
0x10000 : Ok
Op PMU tests:
0x0 : Ok
0x1 : Ok
0x8 : Ok
0x9 : Ok (nr samples: 40543)
0xf : Ok (nr samples: 40543)
0x1000 : Ok (nr samples: 18656)
0xffff : Ok (nr samples: 1187)
0x10000 : Ok
0x100000 : Ok (nr samples: 1178)
0xf00000 : Ok (nr samples: 77)
0xf0ffff : Ok (nr samples: 69)
0x1f0ffff : Ok (nr samples: 34)
0x7f0ffff : Ok (nr samples: 7)
0x8f0ffff : Ok
0x17f0ffff : Ok
IBS sample period constraint tests:
-----------------------------------
Fetch PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Ok
freq 0, sample_freq 15: Ok
freq 0, sample_freq 16: Ok (nr samples: 1604)
freq 0, sample_freq 17: Ok (nr samples: 1604)
freq 0, sample_freq 143: Ok (nr samples: 1604)
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1203)
freq 0, sample_freq 1234: Ok (nr samples: 1430)
freq 0, sample_freq 4103: Ok (nr samples: 638)
freq 0, sample_freq 65520: Ok (nr samples: 2018)
freq 0, sample_freq 65535: Ok (nr samples: 2002)
freq 0, sample_freq 65552: Ok (nr samples: 345)
freq 0, sample_freq 8388607: Ok (nr samples: 24)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 5)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 5)
freq 1, sample_freq 1234: Ok (nr samples: 8)
freq 1, sample_freq 4103: Ok (nr samples: 30)
freq 1, sample_freq 65520: Ok (nr samples: 1604)
freq 1, sample_freq 65535: Ok (nr samples: 1604)
freq 1, sample_freq 65552: Ok (nr samples: 1604)
freq 1, sample_freq 8388607: Ok
Op PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Ok
freq 0, sample_freq 15: Ok
freq 0, sample_freq 16: Ok
freq 0, sample_freq 17: Ok
freq 0, sample_freq 143: Ok
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1203)
freq 0, sample_freq 1234: Ok (nr samples: 1435)
freq 0, sample_freq 4103: Ok (nr samples: 1467)
freq 0, sample_freq 65520: Ok (nr samples: 2209)
freq 0, sample_freq 65535: Ok (nr samples: 2230)
freq 0, sample_freq 65552: Ok (nr samples: 2265)
freq 0, sample_freq 8388607: Ok (nr samples: 25)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 5)
freq 1, sample_freq 144: Ok (nr samples: 4)
freq 1, sample_freq 145: Ok (nr samples: 5)
freq 1, sample_freq 1234: Ok (nr samples: 7)
freq 1, sample_freq 4103: Ok (nr samples: 31)
freq 1, sample_freq 65520: Ok (nr samples: 548)
freq 1, sample_freq 65535: Ok (nr samples: 446)
freq 1, sample_freq 65552: Ok (nr samples: 541)
freq 1, sample_freq 8388607: Ok
IBS ioctl() tests:
------------------
Fetch PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Ok
ioctl(period = 0xf ): Ok
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Ok
ioctl(period = 0x1f ): Ok
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Ok
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Ok
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Ok
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Ok
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
Op PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Ok
ioctl(period = 0xf ): Ok
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Ok
ioctl(period = 0x1f ): Ok
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Ok
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Ok
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Ok
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Ok
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
IBS freq (negative) tests:
--------------------------
freq 1, sample_freq 200000: Ok
IBS L3MissOnly test: (takes a while)
--------------------
Fetch L3MissOnly: Ok (nr_samples: 1523)
Op L3MissOnly: Ok (nr_samples: 1771)
---- end(0) ----
112: AMD IBS sample period : Ok
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
tools/perf/arch/x86/include/arch-tests.h | 1 +
tools/perf/arch/x86/tests/Build | 1 +
tools/perf/arch/x86/tests/amd-ibs-period.c | 953 +++++++++++++++++++++
tools/perf/arch/x86/tests/arch-tests.c | 2 +
4 files changed, 957 insertions(+)
create mode 100644 tools/perf/arch/x86/tests/amd-ibs-period.c
diff --git a/tools/perf/arch/x86/include/arch-tests.h b/tools/perf/arch/x86/include/arch-tests.h
index c0421a26b875..4fd425157d7d 100644
--- a/tools/perf/arch/x86/include/arch-tests.h
+++ b/tools/perf/arch/x86/include/arch-tests.h
@@ -14,6 +14,7 @@ int test__intel_pt_hybrid_compat(struct test_suite *test, int subtest);
int test__bp_modify(struct test_suite *test, int subtest);
int test__x86_sample_parsing(struct test_suite *test, int subtest);
int test__amd_ibs_via_core_pmu(struct test_suite *test, int subtest);
+int test__amd_ibs_period(struct test_suite *test, int subtest);
int test__hybrid(struct test_suite *test, int subtest);
extern struct test_suite *arch_tests[];
diff --git a/tools/perf/arch/x86/tests/Build b/tools/perf/arch/x86/tests/Build
index 3227053f3355..db4b7945fc40 100644
--- a/tools/perf/arch/x86/tests/Build
+++ b/tools/perf/arch/x86/tests/Build
@@ -10,6 +10,7 @@ perf-test-$(CONFIG_AUXTRACE) += insn-x86.o
endif
perf-test-$(CONFIG_X86_64) += bp-modify.o
perf-test-y += amd-ibs-via-core-pmu.o
+perf-test-y += amd-ibs-period.o
ifdef SHELLCHECK
SHELL_TESTS := gen-insn-x86-dat.sh
diff --git a/tools/perf/arch/x86/tests/amd-ibs-period.c b/tools/perf/arch/x86/tests/amd-ibs-period.c
new file mode 100644
index 000000000000..573fa8de81c2
--- /dev/null
+++ b/tools/perf/arch/x86/tests/amd-ibs-period.c
@@ -0,0 +1,953 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <sched.h>
+#include <sys/syscall.h>
+#include <sys/mman.h>
+#include <sys/ioctl.h>
+#include <string.h>
+
+#include "arch-tests.h"
+#include "linux/perf_event.h"
+#include "linux/zalloc.h"
+#include "tests/tests.h"
+#include "../perf-sys.h"
+#include "pmu.h"
+#include "pmus.h"
+#include "debug.h"
+#include "util.h"
+#include "strbuf.h"
+#include "../util/env.h"
+
+#define PAGE_SIZE sysconf(_SC_PAGESIZE)
+
+#define PERF_MMAP_DATA_PAGES 32L
+#define PERF_MMAP_DATA_SIZE (PERF_MMAP_DATA_PAGES * PAGE_SIZE)
+#define PERF_MMAP_DATA_MASK (PERF_MMAP_DATA_SIZE - 1)
+#define PERF_MMAP_TOTAL_PAGES (PERF_MMAP_DATA_PAGES + 1)
+#define PERF_MMAP_TOTAL_SIZE (PERF_MMAP_TOTAL_PAGES * PAGE_SIZE)
+
+#define rmb() asm volatile("lfence":::"memory")
+
+enum {
+ FD_ERROR,
+ FD_SUCCESS,
+};
+
+enum {
+ IBS_FETCH,
+ IBS_OP,
+};
+
+struct perf_pmu *fetch_pmu;
+struct perf_pmu *op_pmu;
+unsigned int perf_event_max_sample_rate;
+
+/* Dummy workload to generate IBS samples. */
+static int dummy_workload_1(unsigned long count)
+{
+ int (*func)(void);
+ int ret = 0;
+ char *p;
+ char insn1[] = {
+ 0xb8, 0x01, 0x00, 0x00, 0x00, /* mov 1,%eax */
+ 0xc3, /* ret */
+ 0xcc, /* int 3 */
+ };
+
+ char insn2[] = {
+ 0xb8, 0x02, 0x00, 0x00, 0x00, /* mov 2,%eax */
+ 0xc3, /* ret */
+ 0xcc, /* int 3 */
+ };
+
+ p = zalloc(2 * PAGE_SIZE);
+ if (!p) {
+ printf("malloc() failed. %m");
+ return 1;
+ }
+
+ func = (void *)((unsigned long)(p + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1));
+
+ ret = mprotect(func, PAGE_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC);
+ if (ret) {
+ printf("mprotect() failed. %m");
+ goto out;
+ }
+
+ if (count < 100000)
+ count = 100000;
+ else if (count > 1000000)
+ count = 1000000;
+ while (count--) {
+ memcpy(func, insn1, sizeof(insn1));
+ if (func() != 1) {
+ pr_debug("ERROR insn1\n");
+ ret = -1;
+ goto out;
+ }
+ memcpy(func, insn2, sizeof(insn2));
+ if (func() != 2) {
+ pr_debug("ERROR insn2\n");
+ ret = -1;
+ goto out;
+ }
+ }
+
+out:
+ free(p);
+ return ret;
+}
+
+/* Another dummy workload to generate IBS samples. */
+static void dummy_workload_2(char *perf)
+{
+ char bench[] = " bench sched messaging -g 10 -l 5000 > /dev/null 2>&1";
+ char taskset[] = "taskset -c 0 ";
+ int ret __maybe_unused;
+ struct strbuf sb;
+ char *cmd;
+
+ strbuf_init(&sb, 0);
+ strbuf_add(&sb, taskset, strlen(taskset));
+ strbuf_add(&sb, perf, strlen(perf));
+ strbuf_add(&sb, bench, strlen(bench));
+ cmd = strbuf_detach(&sb, NULL);
+ ret = system(cmd);
+ free(cmd);
+}
+
+static int sched_affine(int cpu)
+{
+ cpu_set_t set;
+
+ CPU_ZERO(&set);
+ CPU_SET(cpu, &set);
+ if (sched_setaffinity(getpid(), sizeof(set), &set) == -1) {
+ pr_debug("sched_setaffinity() failed. [%m]");
+ return -1;
+ }
+ return 0;
+}
+
+static void
+copy_sample_data(void *src, unsigned long offset, void *dest, size_t size)
+{
+ size_t chunk1_size, chunk2_size;
+
+ if ((offset + size) < (size_t)PERF_MMAP_DATA_SIZE) {
+ memcpy(dest, src + offset, size);
+ } else {
+ chunk1_size = PERF_MMAP_DATA_SIZE - offset;
+ chunk2_size = size - chunk1_size;
+
+ memcpy(dest, src + offset, chunk1_size);
+ memcpy(dest + chunk1_size, src, chunk2_size);
+ }
+}
+
+static int rb_read(struct perf_event_mmap_page *rb, void *dest, size_t size)
+{
+ void *base;
+ unsigned long data_tail, data_head;
+
+ /* Casting to (void *) is needed. */
+ base = (void *)rb + PAGE_SIZE;
+
+ data_head = rb->data_head;
+ rmb();
+ data_tail = rb->data_tail;
+
+ if ((data_head - data_tail) < size)
+ return -1;
+
+ data_tail &= PERF_MMAP_DATA_MASK;
+ copy_sample_data(base, data_tail, dest, size);
+ rb->data_tail += size;
+ return 0;
+}
+
+static void rb_skip(struct perf_event_mmap_page *rb, size_t size)
+{
+ size_t data_head = rb->data_head;
+
+ rmb();
+
+ if ((rb->data_tail + size) > data_head)
+ rb->data_tail = data_head;
+ else
+ rb->data_tail += size;
+}
+
+/* Sample period value taken from perf sample must match with expected value. */
+static int period_equal(unsigned long exp_period, unsigned long act_period)
+{
+ return exp_period == act_period ? 0 : -1;
+}
+
+/*
+ * Sample period value taken from perf sample must be >= minimum sample period
+ * supported by IBS HW.
+ */
+static int period_higher(unsigned long min_period, unsigned long act_period)
+{
+ return min_period <= act_period ? 0 : -1;
+}
+
+static int rb_drain_samples(struct perf_event_mmap_page *rb,
+ unsigned long exp_period,
+ int *nr_samples,
+ int (*callback)(unsigned long, unsigned long))
+{
+ struct perf_event_header hdr;
+ unsigned long period;
+ int ret = 0;
+
+ /*
+ * PERF_RECORD_SAMPLE:
+ * struct {
+ * struct perf_event_header hdr;
+ * { u64 period; } && PERF_SAMPLE_PERIOD
+ * };
+ */
+ while (1) {
+ if (rb_read(rb, &hdr, sizeof(hdr)))
+ return ret;
+
+ if (hdr.type == PERF_RECORD_SAMPLE) {
+ (*nr_samples)++;
+ period = 0;
+ if (rb_read(rb, &period, sizeof(period)))
+ pr_debug("rb_read(period) error. [%m]");
+ ret |= callback(exp_period, period);
+ } else {
+ rb_skip(rb, hdr.size - sizeof(hdr));
+ }
+ }
+ return ret;
+}
+
+static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
+ int cpu, int group_fd, unsigned long flags)
+{
+ return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
+}
+
+static void fetch_prepare_attr(struct perf_event_attr *attr,
+ unsigned long config, int freq,
+ unsigned long sample_period)
+{
+ memset(attr, 0, sizeof(struct perf_event_attr));
+
+ attr->type = fetch_pmu->type;
+ attr->size = sizeof(struct perf_event_attr);
+ attr->config = config;
+ attr->disabled = 1;
+ attr->sample_type = PERF_SAMPLE_PERIOD;
+ attr->freq = freq;
+ attr->sample_period = sample_period; /* = ->sample_freq */
+}
+
+static void op_prepare_attr(struct perf_event_attr *attr,
+ unsigned long config, int freq,
+ unsigned long sample_period)
+{
+ memset(attr, 0, sizeof(struct perf_event_attr));
+
+ attr->type = op_pmu->type;
+ attr->size = sizeof(struct perf_event_attr);
+ attr->config = config;
+ attr->disabled = 1;
+ attr->sample_type = PERF_SAMPLE_PERIOD;
+ attr->freq = freq;
+ attr->sample_period = sample_period; /* = ->sample_freq */
+}
+
+struct ibs_configs {
+ /* Input */
+ unsigned long config;
+
+ /* Expected output */
+ unsigned long period;
+ int fd;
+};
+
+/*
+ * Somehow first Fetch event with sample period = 0x10 causes 0
+ * samples. So start with large period and decrease it gradually.
+ */
+struct ibs_configs fetch_configs[] = {
+ { .config = 0xffff, .period = 0xffff0, .fd = FD_SUCCESS },
+ { .config = 0x1000, .period = 0x10000, .fd = FD_SUCCESS },
+ { .config = 0xff, .period = 0xff0, .fd = FD_SUCCESS },
+ { .config = 0x1, .period = 0x10, .fd = FD_SUCCESS },
+ { .config = 0x0, .period = -1, .fd = FD_ERROR },
+ { .config = 0x10000, .period = -1, .fd = FD_ERROR },
+};
+
+struct ibs_configs op_configs[] = {
+ { .config = 0x0, .period = -1, .fd = FD_ERROR },
+ { .config = 0x1, .period = -1, .fd = FD_ERROR },
+ { .config = 0x8, .period = -1, .fd = FD_ERROR },
+ { .config = 0x9, .period = 0x90, .fd = FD_SUCCESS },
+ { .config = 0xf, .period = 0xf0, .fd = FD_SUCCESS },
+ { .config = 0x1000, .period = 0x10000, .fd = FD_SUCCESS },
+ { .config = 0xffff, .period = 0xffff0, .fd = FD_SUCCESS },
+ { .config = 0x10000, .period = -1, .fd = FD_ERROR },
+ { .config = 0x100000, .period = 0x100000, .fd = FD_SUCCESS },
+ { .config = 0xf00000, .period = 0xf00000, .fd = FD_SUCCESS },
+ { .config = 0xf0ffff, .period = 0xfffff0, .fd = FD_SUCCESS },
+ { .config = 0x1f0ffff, .period = 0x1fffff0, .fd = FD_SUCCESS },
+ { .config = 0x7f0ffff, .period = 0x7fffff0, .fd = FD_SUCCESS },
+ { .config = 0x8f0ffff, .period = -1, .fd = FD_ERROR },
+ { .config = 0x17f0ffff, .period = -1, .fd = FD_ERROR },
+};
+
+static int __ibs_config_test(int ibs_type, struct ibs_configs *config, int *nr_samples)
+{
+ struct perf_event_attr attr;
+ int fd, i;
+ void *rb;
+ int ret = 0;
+
+ if (ibs_type == IBS_FETCH)
+ fetch_prepare_attr(&attr, config->config, 0, 0);
+ else
+ op_prepare_attr(&attr, config->config, 0, 0);
+
+ /* CPU0, All processes */
+ fd = perf_event_open(&attr, -1, 0, -1, 0);
+ if (config->fd == FD_ERROR) {
+ if (fd != -1) {
+ close(fd);
+ return -1;
+ }
+ return 0;
+ }
+ if (fd <= -1)
+ return -1;
+
+ rb = mmap(NULL, PERF_MMAP_TOTAL_SIZE, PROT_READ | PROT_WRITE,
+ MAP_SHARED, fd, 0);
+ if (rb == MAP_FAILED) {
+ pr_debug("mmap() failed. [%m]\n");
+ return -1;
+ }
+
+ ioctl(fd, PERF_EVENT_IOC_RESET, 0);
+ ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
+
+ i = 5;
+ while (i--) {
+ dummy_workload_1(1000000);
+
+ ret = rb_drain_samples(rb, config->period, nr_samples,
+ period_equal);
+ if (ret)
+ break;
+ }
+
+ ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
+ munmap(rb, PERF_MMAP_TOTAL_SIZE);
+ close(fd);
+ return ret;
+}
+
+static int ibs_config_test(void)
+{
+ int nr_samples = 0;
+ unsigned long i;
+ int ret = 0;
+ int r;
+
+ pr_debug("\nIBS config tests:\n");
+ pr_debug("-----------------\n");
+
+ pr_debug("Fetch PMU tests:\n");
+ for (i = 0; i < ARRAY_SIZE(fetch_configs); i++) {
+ nr_samples = 0;
+ r = __ibs_config_test(IBS_FETCH, &(fetch_configs[i]), &nr_samples);
+
+ if (fetch_configs[i].fd == FD_ERROR) {
+ pr_debug("0x%-16lx: %-4s\n", fetch_configs[i].config,
+ !r ? "Ok" : "Fail");
+ } else {
+ /*
+ * Although nr_samples == 0 is reported as Fail here,
+ * the failure status is not cascaded up because, we
+ * can not decide whether test really failed or not
+ * without actual samples.
+ */
+ pr_debug("0x%-16lx: %-4s (nr samples: %d)\n", fetch_configs[i].config,
+ (!r && nr_samples != 0) ? "Ok" : "Fail", nr_samples);
+ }
+
+ ret |= r;
+ }
+
+ pr_debug("Op PMU tests:\n");
+ for (i = 0; i < ARRAY_SIZE(op_configs); i++) {
+ nr_samples = 0;
+ r = __ibs_config_test(IBS_OP, &(op_configs[i]), &nr_samples);
+
+ if (op_configs[i].fd == FD_ERROR) {
+ pr_debug("0x%-16lx: %-4s\n", op_configs[i].config,
+ !r ? "Ok" : "Fail");
+ } else {
+ /*
+ * Although nr_samples == 0 is reported as Fail here,
+ * the failure status is not cascaded up because, we
+ * can not decide whether test really failed or not
+ * without actual samples.
+ */
+ pr_debug("0x%-16lx: %-4s (nr samples: %d)\n", op_configs[i].config,
+ (!r && nr_samples != 0) ? "Ok" : "Fail", nr_samples);
+ }
+
+ ret |= r;
+ }
+
+ return ret;
+}
+
+struct ibs_period {
+ /* Input */
+ int freq;
+ unsigned long sample_freq;
+
+ /* Output */
+ int ret;
+ unsigned long period;
+};
+
+struct ibs_period fetch_period[] = {
+ { .freq = 0, .sample_freq = 0, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 1, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 0xf, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 0x10, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 0, .sample_freq = 0x11, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 0, .sample_freq = 0x8f, .ret = FD_SUCCESS, .period = 0x80 },
+ { .freq = 0, .sample_freq = 0x90, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 0, .sample_freq = 0x91, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 0, .sample_freq = 0x4d2, .ret = FD_SUCCESS, .period = 0x4d0 },
+ { .freq = 0, .sample_freq = 0x1007, .ret = FD_SUCCESS, .period = 0x1000 },
+ { .freq = 0, .sample_freq = 0xfff0, .ret = FD_SUCCESS, .period = 0xfff0 },
+ { .freq = 0, .sample_freq = 0xffff, .ret = FD_SUCCESS, .period = 0xfff0 },
+ { .freq = 0, .sample_freq = 0x10010, .ret = FD_SUCCESS, .period = 0x10010 },
+ { .freq = 0, .sample_freq = 0x7fffff, .ret = FD_SUCCESS, .period = 0x7ffff0 },
+ { .freq = 1, .sample_freq = 0, .ret = FD_ERROR, .period = -1 },
+ { .freq = 1, .sample_freq = 1, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0xf, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x10, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x11, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x8f, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x90, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x91, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x4d2, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x1007, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0xfff0, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0xffff, .ret = FD_SUCCESS, .period = 0x10 },
+ { .freq = 1, .sample_freq = 0x10010, .ret = FD_SUCCESS, .period = 0x10 },
+
+ /* ret=-1 because it's beyond default perf_event_max_sample_rate (100000) */
+ { .freq = 1, .sample_freq = 0x7fffff, .ret = FD_ERROR, .period = -1 },
+};
+
+struct ibs_period op_period[] = {
+ { .freq = 0, .sample_freq = 0, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 1, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 0xf, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 0x10, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 0x11, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 0x8f, .ret = FD_ERROR, .period = -1 },
+ { .freq = 0, .sample_freq = 0x90, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 0, .sample_freq = 0x91, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 0, .sample_freq = 0x4d2, .ret = FD_SUCCESS, .period = 0x4d0 },
+ { .freq = 0, .sample_freq = 0x1007, .ret = FD_SUCCESS, .period = 0x1000 },
+ { .freq = 0, .sample_freq = 0xfff0, .ret = FD_SUCCESS, .period = 0xfff0 },
+ { .freq = 0, .sample_freq = 0xffff, .ret = FD_SUCCESS, .period = 0xfff0 },
+ { .freq = 0, .sample_freq = 0x10010, .ret = FD_SUCCESS, .period = 0x10010 },
+ { .freq = 0, .sample_freq = 0x7fffff, .ret = FD_SUCCESS, .period = 0x7ffff0 },
+ { .freq = 1, .sample_freq = 0, .ret = FD_ERROR, .period = -1 },
+ { .freq = 1, .sample_freq = 1, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0xf, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x10, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x11, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x8f, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x90, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x91, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x4d2, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x1007, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0xfff0, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0xffff, .ret = FD_SUCCESS, .period = 0x90 },
+ { .freq = 1, .sample_freq = 0x10010, .ret = FD_SUCCESS, .period = 0x90 },
+
+ /* ret=-1 because it's beyond default perf_event_max_sample_rate (100000) */
+ { .freq = 1, .sample_freq = 0x7fffff, .ret = FD_ERROR, .period = -1 },
+};
+
+static int __ibs_period_constraint_test(int ibs_type, struct ibs_period *period,
+ int *nr_samples)
+{
+ struct perf_event_attr attr;
+ int ret = 0;
+ void *rb;
+ int fd;
+
+ if (period->freq && period->sample_freq >= perf_event_max_sample_rate) {
+ nr_samples = 0;
+ return 0;
+ }
+
+ if (ibs_type == IBS_FETCH)
+ fetch_prepare_attr(&attr, 0, period->freq, period->sample_freq);
+ else
+ op_prepare_attr(&attr, 0, period->freq, period->sample_freq);
+
+ /* CPU0, All processes */
+ fd = perf_event_open(&attr, -1, 0, -1, 0);
+ if (period->ret == FD_ERROR) {
+ if (fd != -1) {
+ close(fd);
+ return -1;
+ }
+ return 0;
+ }
+ if (fd <= -1)
+ return -1;
+
+ rb = mmap(NULL, PERF_MMAP_TOTAL_SIZE, PROT_READ | PROT_WRITE,
+ MAP_SHARED, fd, 0);
+ if (rb == MAP_FAILED) {
+ pr_debug("mmap() failed. [%m]\n");
+ close(fd);
+ return -1;
+ }
+
+ ioctl(fd, PERF_EVENT_IOC_RESET, 0);
+ ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
+
+ if (period->freq) {
+ dummy_workload_1(100000);
+ ret = rb_drain_samples(rb, period->period, nr_samples,
+ period_higher);
+ } else {
+ dummy_workload_1(period->sample_freq * 10);
+ ret = rb_drain_samples(rb, period->period, nr_samples,
+ period_equal);
+ }
+
+ ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
+ munmap(rb, PERF_MMAP_TOTAL_SIZE);
+ close(fd);
+ return ret;
+}
+
+static int ibs_period_constraint_test(void)
+{
+ unsigned long i;
+ int nr_samples;
+ int ret = 0;
+ int r;
+
+ pr_debug("\nIBS sample period constraint tests:\n");
+ pr_debug("-----------------------------------\n");
+
+ pr_debug("Fetch PMU test:\n");
+ for (i = 0; i < ARRAY_SIZE(fetch_period); i++) {
+ nr_samples = 0;
+ r = __ibs_period_constraint_test(IBS_FETCH, &fetch_period[i],
+ &nr_samples);
+
+ if (fetch_period[i].ret == FD_ERROR) {
+ pr_debug("freq %d, sample_freq %7ld: %-4s\n",
+ fetch_period[i].freq, fetch_period[i].sample_freq,
+ !r ? "Ok" : "Fail");
+ } else {
+ /*
+ * Although nr_samples == 0 is reported as Fail here,
+ * the failure status is not cascaded up because, we
+ * can not decide whether test really failed or not
+ * without actual samples.
+ */
+ pr_debug("freq %d, sample_freq %7ld: %-4s (nr samples: %d)\n",
+ fetch_period[i].freq, fetch_period[i].sample_freq,
+ (!r && nr_samples != 0) ? "Ok" : "Fail", nr_samples);
+ }
+ ret |= r;
+ }
+
+ pr_debug("Op PMU test:\n");
+ for (i = 0; i < ARRAY_SIZE(op_period); i++) {
+ nr_samples = 0;
+ r = __ibs_period_constraint_test(IBS_OP, &op_period[i],
+ &nr_samples);
+
+ if (op_period[i].ret == FD_ERROR) {
+ pr_debug("freq %d, sample_freq %7ld: %-4s\n",
+ op_period[i].freq, op_period[i].sample_freq,
+ !r ? "Ok" : "Fail");
+ } else {
+ /*
+ * Although nr_samples == 0 is reported as Fail here,
+ * the failure status is not cascaded up because, we
+ * can not decide whether test really failed or not
+ * without actual samples.
+ */
+ pr_debug("freq %d, sample_freq %7ld: %-4s (nr samples: %d)\n",
+ op_period[i].freq, op_period[i].sample_freq,
+ (!r && nr_samples != 0) ? "Ok" : "Fail", nr_samples);
+ }
+ ret |= r;
+ }
+
+ return ret;
+}
+
+struct ibs_ioctl {
+ /* Input */
+ int freq;
+ unsigned long period;
+
+ /* Expected output */
+ int ret;
+};
+
+struct ibs_ioctl fetch_ioctl[] = {
+ { .freq = 0, .period = 0x0, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x1, .ret = FD_ERROR },
+ { .freq = 0, .period = 0xf, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x10, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x11, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x1f, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x20, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x80, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x8f, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x90, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x91, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x100, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0xfff0, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0xffff, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x10000, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x1fff0, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x1fff5, .ret = FD_ERROR },
+ { .freq = 1, .period = 0x0, .ret = FD_ERROR },
+ { .freq = 1, .period = 0x1, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0xf, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x10, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x11, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x1f, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x20, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x80, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x8f, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x90, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x91, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x100, .ret = FD_SUCCESS },
+};
+
+struct ibs_ioctl op_ioctl[] = {
+ { .freq = 0, .period = 0x0, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x1, .ret = FD_ERROR },
+ { .freq = 0, .period = 0xf, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x10, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x11, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x1f, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x20, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x80, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x8f, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x90, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x91, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x100, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0xfff0, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0xffff, .ret = FD_ERROR },
+ { .freq = 0, .period = 0x10000, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x1fff0, .ret = FD_SUCCESS },
+ { .freq = 0, .period = 0x1fff5, .ret = FD_ERROR },
+ { .freq = 1, .period = 0x0, .ret = FD_ERROR },
+ { .freq = 1, .period = 0x1, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0xf, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x10, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x11, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x1f, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x20, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x80, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x8f, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x90, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x91, .ret = FD_SUCCESS },
+ { .freq = 1, .period = 0x100, .ret = FD_SUCCESS },
+};
+
+static int __ibs_ioctl_test(int ibs_type, struct ibs_ioctl *ibs_ioctl)
+{
+ struct perf_event_attr attr;
+ int ret = 0;
+ int fd;
+ int r;
+
+ if (ibs_type == IBS_FETCH)
+ fetch_prepare_attr(&attr, 0, ibs_ioctl->freq, 1000);
+ else
+ op_prepare_attr(&attr, 0, ibs_ioctl->freq, 1000);
+
+ /* CPU0, All processes */
+ fd = perf_event_open(&attr, -1, 0, -1, 0);
+ if (fd <= -1) {
+ pr_debug("event_open() Failed\n");
+ return -1;
+ }
+
+ r = ioctl(fd, PERF_EVENT_IOC_PERIOD, &ibs_ioctl->period);
+ if ((ibs_ioctl->ret == FD_SUCCESS && r <= -1) ||
+ (ibs_ioctl->ret == FD_ERROR && r >= 0)) {
+ ret = -1;
+ }
+
+ close(fd);
+ return ret;
+}
+
+static int ibs_ioctl_test(void)
+{
+ unsigned long i;
+ int ret = 0;
+ int r;
+
+ pr_debug("\nIBS ioctl() tests:\n");
+ pr_debug("------------------\n");
+
+ pr_debug("Fetch PMU tests\n");
+ for (i = 0; i < ARRAY_SIZE(fetch_ioctl); i++) {
+ r = __ibs_ioctl_test(IBS_FETCH, &fetch_ioctl[i]);
+
+ pr_debug("ioctl(%s = 0x%-7lx): %s\n",
+ fetch_ioctl[i].freq ? "freq " : "period",
+ fetch_ioctl[i].period, r ? "Fail" : "Ok");
+ ret |= r;
+ }
+
+ pr_debug("Op PMU tests\n");
+ for (i = 0; i < ARRAY_SIZE(op_ioctl); i++) {
+ r = __ibs_ioctl_test(IBS_OP, &op_ioctl[i]);
+
+ pr_debug("ioctl(%s = 0x%-7lx): %s\n",
+ op_ioctl[i].freq ? "freq " : "period",
+ op_ioctl[i].period, r ? "Fail" : "Ok");
+ ret |= r;
+ }
+
+ return ret;
+}
+
+static int ibs_freq_neg_test(void)
+{
+ struct perf_event_attr attr;
+ int fd;
+
+ pr_debug("\nIBS freq (negative) tests:\n");
+ pr_debug("--------------------------\n");
+
+ /*
+ * Assuming perf_event_max_sample_rate <= 100000,
+ * config: 0x300D40 ==> MaxCnt: 200000
+ */
+ op_prepare_attr(&attr, 0x300D40, 1, 0);
+
+ /* CPU0, All processes */
+ fd = perf_event_open(&attr, -1, 0, -1, 0);
+ if (fd != -1) {
+ pr_debug("freq 1, sample_freq 200000: Fail\n");
+ close(fd);
+ return -1;
+ }
+
+ pr_debug("freq 1, sample_freq 200000: Ok\n");
+
+ return 0;
+}
+
+static int __ibs_l3missonly_test(char *perf, int ibs_type, int *nr_samples, int min_period)
+{
+ struct perf_event_attr attr;
+ int ret = 0;
+ void *rb;
+ int fd;
+
+ if (ibs_type == IBS_FETCH)
+ fetch_prepare_attr(&attr, 0x800000000000000UL, 1, 10000);
+ else
+ op_prepare_attr(&attr, 0x10000, 1, 10000);
+
+ /* CPU0, All processes */
+ fd = perf_event_open(&attr, -1, 0, -1, 0);
+ if (fd == -1) {
+ pr_debug("perf_event_open() failed. [%m]\n");
+ return -1;
+ }
+
+ rb = mmap(NULL, PERF_MMAP_TOTAL_SIZE, PROT_READ | PROT_WRITE,
+ MAP_SHARED, fd, 0);
+ if (rb == MAP_FAILED) {
+ pr_debug("mmap() failed. [%m]\n");
+ close(fd);
+ return -1;
+ }
+
+ ioctl(fd, PERF_EVENT_IOC_RESET, 0);
+ ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
+
+ dummy_workload_2(perf);
+
+ ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
+
+ ret = rb_drain_samples(rb, min_period, nr_samples, period_higher);
+
+ munmap(rb, PERF_MMAP_TOTAL_SIZE);
+ close(fd);
+ return ret;
+}
+
+static int ibs_l3missonly_test(char *perf)
+{
+ int nr_samples = 0;
+ int ret = 0;
+ int r = 0;
+
+ pr_debug("\nIBS L3MissOnly test: (takes a while)\n");
+ pr_debug("--------------------\n");
+
+ if (perf_pmu__has_format(fetch_pmu, "l3missonly")) {
+ nr_samples = 0;
+ r = __ibs_l3missonly_test(perf, IBS_FETCH, &nr_samples, 0x10);
+ /*
+ * Although nr_samples == 0 is reported as Fail here,
+ * the failure status is not cascaded up because, we
+ * can not decide whether test really failed or not
+ * without actual samples.
+ */
+ pr_debug("Fetch L3MissOnly: %-4s (nr_samples: %d)\n",
+ (!r && nr_samples != 0) ? "Ok" : "Fail", nr_samples);
+ ret |= r;
+ }
+
+ if (perf_pmu__has_format(op_pmu, "l3missonly")) {
+ nr_samples = 0;
+ r = __ibs_l3missonly_test(perf, IBS_OP, &nr_samples, 0x90);
+ /*
+ * Although nr_samples == 0 is reported as Fail here,
+ * the failure status is not cascaded up because, we
+ * can not decide whether test really failed or not
+ * without actual samples.
+ */
+ pr_debug("Op L3MissOnly: %-4s (nr_samples: %d)\n",
+ (!r && nr_samples != 0) ? "Ok" : "Fail", nr_samples);
+ ret |= r;
+ }
+
+ return ret;
+}
+
+static unsigned int get_perf_event_max_sample_rate(void)
+{
+ unsigned int max_sample_rate = 100000;
+ FILE *fp;
+ int ret;
+
+ fp = fopen("/proc/sys/kernel/perf_event_max_sample_rate", "r");
+ if (!fp) {
+ pr_debug("Can't open perf_event_max_sample_rate. Asssuming %d\n",
+ max_sample_rate);
+ goto out;
+ }
+
+ ret = fscanf(fp, "%d", &max_sample_rate);
+ if (ret == EOF) {
+ pr_debug("Can't read perf_event_max_sample_rate. Assuming 100000\n");
+ max_sample_rate = 100000;
+ }
+ fclose(fp);
+
+out:
+ return max_sample_rate;
+}
+
+int test__amd_ibs_period(struct test_suite *test __maybe_unused,
+ int subtest __maybe_unused)
+{
+ char perf[PATH_MAX] = {'\0'};
+ int ret = TEST_OK;
+
+ perf_event_max_sample_rate = get_perf_event_max_sample_rate();
+ fetch_pmu = perf_pmus__find("ibs_fetch");
+ op_pmu = perf_pmus__find("ibs_op");
+
+ if (!x86__is_amd_cpu() || !fetch_pmu || !op_pmu)
+ return TEST_SKIP;
+
+ perf_exe(perf, sizeof(perf));
+
+ if (sched_affine(0))
+ return TEST_FAIL;
+
+ /*
+ * Perf event can be opened in two modes:
+ * 1 Freq mode
+ * perf_event_attr->freq = 1, ->sample_freq = <frequency>
+ * 2 Sample period mode
+ * perf_event_attr->freq = 0, ->sample_period = <period>
+ *
+ * Instead of using above interface, IBS event in 'sample period mode'
+ * can also be opened by passing <period> value directly in a MaxCnt
+ * bitfields of perf_event_attr->config. Test this IBS specific special
+ * interface.
+ */
+ if (ibs_config_test())
+ ret = TEST_FAIL;
+
+ /*
+ * IBS Fetch and Op PMUs have HW constraints on minimum sample period.
+ * Also, sample period value must be in multiple of 0x10. Test that IBS
+ * driver honors HW constraints for various possible values in Freq as
+ * well as Sample Period mode IBS events.
+ */
+ if (ibs_period_constraint_test())
+ ret = TEST_FAIL;
+
+ /*
+ * Test ioctl() with various sample period values for IBS event.
+ */
+ if (ibs_ioctl_test())
+ ret = TEST_FAIL;
+
+ /*
+ * Test that opening of freq mode IBS event fails when the freq value
+ * is passed through ->config, not explicitly in ->sample_freq. Also
+ * use high freq value (beyond perf_event_max_sample_rate) to test IBS
+ * driver do not bypass perf_event_max_sample_rate checks.
+ */
+ if (ibs_freq_neg_test())
+ ret = TEST_FAIL;
+
+ /*
+ * L3MissOnly is a post-processing filter, i.e. IBS HW checks for L3
+ * Miss at the completion of the tagged uOp. The sample is discarded
+ * if the tagged uOp did not cause L3Miss. Also, IBS HW internally
+ * resets CurCnt to a small pseudo-random value and resumes counting.
+ * A new uOp is tagged once CurCnt reaches to MaxCnt. But the process
+ * repeats until the tagged uOp causes an L3 Miss.
+ *
+ * With the freq mode event, the next sample period is calculated by
+ * generic kernel on every sample to achieve desired freq of samples.
+ *
+ * Since the number of times HW internally reset CurCnt and the pseudo-
+ * random value of CurCnt for all those occurrences are not known to SW,
+ * the sample period adjustment by kernel goes for a toes for freq mode
+ * IBS events. Kernel will set very small period for the next sample if
+ * the window between current sample and prev sample is too high due to
+ * multiple samples being rejected by IBS internally.
+ *
+ * Test that IBS sample period constraints are honored when L3MissOnly
+ * is ON.
+ */
+ if (ibs_l3missonly_test(perf))
+ ret = TEST_FAIL;
+
+ return ret;
+}
diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c
index a216a5d172ed..bfee2432515b 100644
--- a/tools/perf/arch/x86/tests/arch-tests.c
+++ b/tools/perf/arch/x86/tests/arch-tests.c
@@ -25,6 +25,7 @@ DEFINE_SUITE("x86 bp modify", bp_modify);
#endif
DEFINE_SUITE("x86 Sample parsing", x86_sample_parsing);
DEFINE_SUITE("AMD IBS via core pmu", amd_ibs_via_core_pmu);
+DEFINE_SUITE_EXCLUSIVE("AMD IBS sample period", amd_ibs_period);
static struct test_case hybrid_tests[] = {
TEST_CASE_REASON("x86 hybrid event parsing", hybrid, "not hybrid"),
{ .name = NULL, }
@@ -50,6 +51,7 @@ struct test_suite *arch_tests[] = {
#endif
&suite__x86_sample_parsing,
&suite__amd_ibs_via_core_pmu,
+ &suite__amd_ibs_period,
&suite__hybrid,
NULL,
};
--
2.43.0
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v3 09/10] perf test: Introduce DEFINE_SUITE_EXCLUSIVE()
2024-12-10 9:34 ` [PATCH v3 09/10] perf test: Introduce DEFINE_SUITE_EXCLUSIVE() Ravi Bangoria
@ 2024-12-13 19:55 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 16+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-12-13 19:55 UTC (permalink / raw)
To: Ravi Bangoria
Cc: peterz, mingo, namhyung, eranian, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
tglx, bp, dave.hansen, x86, hpa, linux-perf-users, linux-kernel,
santosh.shukla, ananth.narayan, sandipan.das
On Tue, Dec 10, 2024 at 09:34:48AM +0000, Ravi Bangoria wrote:
> A variant of DEFINE_SUITE() but sets ->exclusive bit for the test so the
> test will be executed sequentially.
Cherry picking this one as I need it for some other work.
Ian Rogers pointed me while reviewing.
- Arnaldo
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
> tools/perf/tests/tests.h | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
> index cb58b43aa063..8aea344536b8 100644
> --- a/tools/perf/tests/tests.h
> +++ b/tools/perf/tests/tests.h
> @@ -81,6 +81,16 @@ struct test_suite {
> .test_cases = tests__##_name, \
> }
>
> +#define DEFINE_SUITE_EXCLUSIVE(description, _name) \
> + struct test_case tests__##_name[] = { \
> + TEST_CASE_EXCLUSIVE(description, _name),\
> + { .name = NULL, } \
> + }; \
> + struct test_suite suite__##_name = { \
> + .desc = description, \
> + .test_cases = tests__##_name, \
> + }
> +
> /* Tests */
> DECLARE_SUITE(vmlinux_matches_kallsyms);
> DECLARE_SUITE(openat_syscall_event);
> --
> 2.43.0
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
` (9 preceding siblings ...)
2024-12-10 9:34 ` [PATCH v3 10/10] perf test amd ibs: Add sample period unit test Ravi Bangoria
@ 2024-12-16 6:42 ` Ravi Bangoria
10 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2024-12-16 6:42 UTC (permalink / raw)
To: peterz, mingo
Cc: acme, eranian, mark.rutland, alexander.shishkin, jolsa, irogers,
adrian.hunter, kan.liang, tglx, bp, dave.hansen, x86, hpa,
linux-perf-users, linux-kernel, santosh.shukla, ananth.narayan,
sandipan.das, namhyung, Ravi Bangoria
On 10-Dec-24 3:04 PM, Ravi Bangoria wrote:
> IBS Fetch and IBS Op PMUs have constraints on supported sample period
> values. The IBS hw behavior could be undefined if they are not followed.
> Currently, IBS driver does not honor them correctly and thus a malicious
> event could cause issues to the system. Fortunately, the IBS hw is very
> resilient so far and IBS PMUs are restricted to root only, so the attack
> vector is minimal. In any case, these are genuine bugs and must be fixed.
>
> There were some conflicts with Namhyung's IBS patches[1] which are now
> in the tip tree, so rebased the series on tip/perf/core (02c56362a7d3).
>
> v2: https://lore.kernel.org/r/20241206051713.991-1-ravi.bangoria@amd.com
> v2->v3:
> - No code changes
> - s/pmu/PMU/g
> - Include unit test example in the commit description
>
> [1]: https://lore.kernel.org/r/20241203180441.1634709-1-namhyung@kernel.org
Peter, Ingo, gentle ping.
Thanks,
Ravi
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 02/10] perf/amd/ibs: Remove pointless sample period check
2024-12-10 9:34 ` [PATCH v3 02/10] perf/amd/ibs: Remove pointless sample period check Ravi Bangoria
@ 2024-12-18 13:35 ` Peter Zijlstra
0 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2024-12-18 13:35 UTC (permalink / raw)
To: Ravi Bangoria
Cc: mingo, namhyung, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
On Tue, Dec 10, 2024 at 09:34:41AM +0000, Ravi Bangoria wrote:
> Valid perf event sample period value for IBS PMUs (Fetch and Op both)
> is limited to multiple of 0x10. perf_ibs_init() has this check:
>
> if (!event->attr.sample_freq && hwc->sample_period & 0x0f)
> return -EINVAL;
>
> But it's broken since hwc->sample_period will always be 0 when
> event->attr.sample_freq is 0 (irrespective of event->attr.freq value.)
>
> One option to fix this is to change the condition:
>
> - if (!event->attr.sample_freq && hwc->sample_period & 0x0f)
> + if (!event->attr.freq && hwc->sample_period & 0x0f)
>
> However, that will break all userspace tools which have been using IBS
> event with sample_period not multiple of 0x10.
>
> Another option is to remove the condition altogether and mask lower
> nibble _silently_, same as what current code is inadvertently doing.
> I'm preferring this approach as it keeps the existing behavior.
Urgh... :/
> Acked-by: Namhyung Kim <namhyung@kernel.org>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
> arch/x86/events/amd/ibs.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
> index 4ca8006d2221..bd8919e7c3b1 100644
> --- a/arch/x86/events/amd/ibs.c
> +++ b/arch/x86/events/amd/ibs.c
> @@ -307,13 +307,8 @@ static int perf_ibs_init(struct perf_event *event)
> if (config & perf_ibs->cnt_mask)
> /* raw max_cnt may not be set */
> return -EINVAL;
> - if (!event->attr.sample_freq && hwc->sample_period & 0x0f)
> - /*
> - * lower 4 bits can not be set in ibs max cnt,
> - * but allowing it in case we adjust the
> - * sample period to set a frequency.
> - */
> - return -EINVAL;
> +
> + /* Silently mask off lower nibble. IBS hw mandates it. */
> hwc->sample_period &= ~0x0FULL;
> if (!hwc->sample_period)
> hwc->sample_period = 0x10;
> --
> 2.43.0
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 08/10] perf/core: Introduce pmu->adjust_period() callback
2024-12-10 9:34 ` [PATCH v3 08/10] perf/core: Introduce pmu->adjust_period() callback Ravi Bangoria
@ 2024-12-18 13:52 ` Peter Zijlstra
2025-01-08 11:16 ` Ravi Bangoria
0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2024-12-18 13:52 UTC (permalink / raw)
To: Ravi Bangoria
Cc: mingo, namhyung, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das
On Tue, Dec 10, 2024 at 09:34:47AM +0000, Ravi Bangoria wrote:
> Many hardware PMUs have constraints about sample period. For ex, minimum
> supported sample period for IBS Op PMU is 0x90, the sample period must
> be multiple of 0x10 for IBS Fetch and IBS Op.
>
> Add an optional callback adjust_period() to struct PMU to allow PMU
> specific drivers to adjust sample period calculated by generic code.
> This will ensure the sample_period value will always be valid and no
> additional code is required in PMU specific drivers to re-adjust the
> period.
And not a word about pmu::check_period() and x86_pmu::limit_period() :-(
Please explain why that can't be made to work nor adapted.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3 08/10] perf/core: Introduce pmu->adjust_period() callback
2024-12-18 13:52 ` Peter Zijlstra
@ 2025-01-08 11:16 ` Ravi Bangoria
0 siblings, 0 replies; 16+ messages in thread
From: Ravi Bangoria @ 2025-01-08 11:16 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, namhyung, acme, eranian, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, tglx, bp, dave.hansen,
x86, hpa, linux-perf-users, linux-kernel, santosh.shukla,
ananth.narayan, sandipan.das, Ravi Bangoria
On 18-Dec-24 7:22 PM, Peter Zijlstra wrote:
> On Tue, Dec 10, 2024 at 09:34:47AM +0000, Ravi Bangoria wrote:
>> Many hardware PMUs have constraints about sample period. For ex, minimum
>> supported sample period for IBS Op PMU is 0x90, the sample period must
>> be multiple of 0x10 for IBS Fetch and IBS Op.
>>
>> Add an optional callback adjust_period() to struct PMU to allow PMU
>> specific drivers to adjust sample period calculated by generic code.
>> This will ensure the sample_period value will always be valid and no
>> additional code is required in PMU specific drivers to re-adjust the
>> period.
>
> And not a word about pmu::check_period() and x86_pmu::limit_period() :-(
Let me collate them. Will respin.
Thanks for the feedback,
Ravi
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2025-01-08 11:16 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-10 9:34 [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 01/10] perf/amd/ibs: Remove IBS_{FETCH|OP}_CONFIG_MASK macros Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 02/10] perf/amd/ibs: Remove pointless sample period check Ravi Bangoria
2024-12-18 13:35 ` Peter Zijlstra
2024-12-10 9:34 ` [PATCH v3 03/10] perf/amd/ibs: Fix ->config to sample period calculation for OP PMU Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 04/10] perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 05/10] perf/amd/ibs: Don't allow freq mode event creation through ->config interface Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 06/10] perf/amd/ibs: Add PMU specific minimum period Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 07/10] perf/amd/ibs: Add ->check_period() callback Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 08/10] perf/core: Introduce pmu->adjust_period() callback Ravi Bangoria
2024-12-18 13:52 ` Peter Zijlstra
2025-01-08 11:16 ` Ravi Bangoria
2024-12-10 9:34 ` [PATCH v3 09/10] perf test: Introduce DEFINE_SUITE_EXCLUSIVE() Ravi Bangoria
2024-12-13 19:55 ` Arnaldo Carvalho de Melo
2024-12-10 9:34 ` [PATCH v3 10/10] perf test amd ibs: Add sample period unit test Ravi Bangoria
2024-12-16 6:42 ` [PATCH v3 00/10] perf/amd/ibs: Fix sample period computations Ravi Bangoria
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).