* [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts
@ 2013-05-30 17:43 Paolo Bonzini
2013-05-30 17:43 ` [PATCH] " Paolo Bonzini
2013-06-02 15:32 ` [PATCH kvm-unit-tests] " Gleb Natapov
0 siblings, 2 replies; 9+ messages in thread
From: Paolo Bonzini @ 2013-05-30 17:43 UTC (permalink / raw)
To: kvm
This patch includes two fixes for Sandy Bridge (SB) hosts:
* the 3rd fixed counter ("ref cpu cycles") can sometimes report
a count lower than the number of iterations
* there is an 8th counter, which causes out-of-bounds accesses
to gp_events or check_counters_many's cnt array
There is still a bug in KVM, because the "pmu all counters-0"
test fails. (It passes if you use any 6 of the 8 gp counters,
fails if you use 7 or 8).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
x86/pmu.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/x86/pmu.c b/x86/pmu.c
index 2c46f31..dca753a 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -88,9 +88,10 @@ struct pmu_event {
}, fixed_events[] = {
{"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
{"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
- {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 1*N, 30*N}
+ {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
};
+static int num_counters;
static int tests, failures;
char *buf;
@@ -237,7 +238,7 @@ static void check_gp_counter(struct pmu_event *evt)
};
int i;
- for (i = 0; i < eax.split.num_counters; i++, cnt.ctr++) {
+ for (i = 0; i < num_counters; i++, cnt.ctr++) {
cnt.count = 0;
measure(&cnt, 1);
report(evt->name, i, verify_event(cnt.count, evt));
@@ -276,7 +277,7 @@ static void check_counters_many(void)
pmu_counter_t cnt[10];
int i, n;
- for (i = 0, n = 0; n < eax.split.num_counters; i++) {
+ for (i = 0, n = 0; n < num_counters; i++) {
if (ebx.full & (1 << i))
continue;
@@ -316,10 +317,10 @@ static void check_counter_overflow(void)
/* clear status before test */
wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
- for (i = 0; i < eax.split.num_counters + 1; i++, cnt.ctr++) {
+ for (i = 0; i < num_counters + 1; i++, cnt.ctr++) {
uint64_t status;
int idx;
- if (i == eax.split.num_counters)
+ if (i == num_counters)
cnt.ctr = fixed_events[0].unit_sel;
if (i % 2)
cnt.config |= EVNTSEL_INT;
@@ -355,7 +356,7 @@ static void check_rdpmc(void)
uint64_t val = 0x1f3456789ull;
int i;
- for (i = 0; i < eax.split.num_counters; i++) {
+ for (i = 0; i < num_counters; i++) {
uint64_t x = (val & 0xffffffff) |
((1ull << (eax.split.bit_width - 32)) - 1) << 32;
wrmsr(MSR_IA32_PERFCTR0 + i, val);
@@ -395,6 +396,14 @@ int main(int ac, char **av)
printf("Fixed counters: %d\n", edx.split.num_counters_fixed);
printf("Fixed counter width: %d\n", edx.split.bit_width_fixed);
+ num_counters = eax.split.num_counters;
+ if (num_counters > ARRAY_SIZE(gp_events))
+ num_counters = ARRAY_SIZE(gp_events);
+
apic_write(APIC_LVTPC, PC_VECTOR);
check_gp_counters();
--
1.8.2.1
* [PATCH] pmu: fixes for Sandy Bridge hosts
2013-05-30 17:43 [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts Paolo Bonzini
@ 2013-05-30 17:43 ` Paolo Bonzini
2013-05-30 17:45 ` Paolo Bonzini
2013-06-02 15:32 ` [PATCH kvm-unit-tests] " Gleb Natapov
1 sibling, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2013-05-30 17:43 UTC (permalink / raw)
To: kvm
This patch includes two fixes for Sandy Bridge (SB) hosts:
* the 3rd fixed counter ("ref cpu cycles") can sometimes report
a count lower than the number of iterations
* there is an 8th counter, which causes out-of-bounds accesses
to gp_events or check_counters_many's cnt array
There is still a bug in KVM, because the "pmu all counters-0"
test fails. (It passes if you use any 6 of the 8 gp counters,
fails if you use 7 or 8).
---
x86/pmu.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/x86/pmu.c b/x86/pmu.c
index 2c46f31..dca753a 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -88,9 +88,10 @@ struct pmu_event {
}, fixed_events[] = {
{"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
{"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
- {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 1*N, 30*N}
+ {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
};
+static int num_counters;
static int tests, failures;
char *buf;
@@ -237,7 +238,7 @@ static void check_gp_counter(struct pmu_event *evt)
};
int i;
- for (i = 0; i < eax.split.num_counters; i++, cnt.ctr++) {
+ for (i = 0; i < num_counters; i++, cnt.ctr++) {
cnt.count = 0;
measure(&cnt, 1);
report(evt->name, i, verify_event(cnt.count, evt));
@@ -276,7 +277,7 @@ static void check_counters_many(void)
pmu_counter_t cnt[10];
int i, n;
- for (i = 0, n = 0; n < eax.split.num_counters; i++) {
+ for (i = 0, n = 0; n < num_counters; i++) {
if (ebx.full & (1 << i))
continue;
@@ -316,10 +317,10 @@ static void check_counter_overflow(void)
/* clear status before test */
wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
- for (i = 0; i < eax.split.num_counters + 1; i++, cnt.ctr++) {
+ for (i = 0; i < num_counters + 1; i++, cnt.ctr++) {
uint64_t status;
int idx;
- if (i == eax.split.num_counters)
+ if (i == num_counters)
cnt.ctr = fixed_events[0].unit_sel;
if (i % 2)
cnt.config |= EVNTSEL_INT;
@@ -355,7 +356,7 @@ static void check_rdpmc(void)
uint64_t val = 0x1f3456789ull;
int i;
- for (i = 0; i < eax.split.num_counters; i++) {
+ for (i = 0; i < num_counters; i++) {
uint64_t x = (val & 0xffffffff) |
((1ull << (eax.split.bit_width - 32)) - 1) << 32;
wrmsr(MSR_IA32_PERFCTR0 + i, val);
@@ -395,6 +396,14 @@ int main(int ac, char **av)
printf("Fixed counters: %d\n", edx.split.num_counters_fixed);
printf("Fixed counter width: %d\n", edx.split.bit_width_fixed);
+ num_counters = eax.split.num_counters;
+ if (num_counters > ARRAY_SIZE(gp_events))
+ num_counters = ARRAY_SIZE(gp_events);
+ while (id.b) {
+ num_counters--;
+ id.b &= id.b - 1;
+ }
+
apic_write(APIC_LVTPC, PC_VECTOR);
check_gp_counters();
--
1.8.2.1
* Re: [PATCH] pmu: fixes for Sandy Bridge hosts
2013-05-30 17:43 ` [PATCH] " Paolo Bonzini
@ 2013-05-30 17:45 ` Paolo Bonzini
0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2013-05-30 17:45 UTC (permalink / raw)
To: kvm
Il 30/05/2013 19:43, Paolo Bonzini ha scritto:
> @@ -395,6 +396,14 @@ int main(int ac, char **av)
> printf("Fixed counters: %d\n", edx.split.num_counters_fixed);
> printf("Fixed counter width: %d\n", edx.split.bit_width_fixed);
>
> + num_counters = eax.split.num_counters;
> + if (num_counters > ARRAY_SIZE(gp_events))
> + num_counters = ARRAY_SIZE(gp_events);
> + while (id.b) {
> + num_counters--;
> + id.b &= id.b - 1;
> + }
> +
> apic_write(APIC_LVTPC, PC_VECTOR);
>
> check_gp_counters();
>
Please ignore this patch. The parent one with "[PATCH kvm-unit-tests]"
subject is good though.
Paolo
* Re: [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts
2013-05-30 17:43 [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts Paolo Bonzini
2013-05-30 17:43 ` [PATCH] " Paolo Bonzini
@ 2013-06-02 15:32 ` Gleb Natapov
2013-06-03 6:33 ` Paolo Bonzini
1 sibling, 1 reply; 9+ messages in thread
From: Gleb Natapov @ 2013-06-02 15:32 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On Thu, May 30, 2013 at 07:43:07PM +0200, Paolo Bonzini wrote:
> This patch includes two fixes for SB:
>
> * the 3rd fixed counter ("ref cpu cycles") can sometimes report
> less than the number of iterations
>
Is it documented? It is strange for an "architectural" counter to behave
differently on different architectures.
> * there is an 8th counter which causes out of bounds accesses
> to gp_event or check_counters_many's cnt array
>
> There is still a bug in KVM, because the "pmu all counters-0"
> test fails. (It passes if you use any 6 of the 8 gp counters,
> fails if you use 7 or 8).
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> x86/pmu.c | 21 +++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/x86/pmu.c b/x86/pmu.c
> index 2c46f31..dca753a 100644
> --- a/x86/pmu.c
> +++ b/x86/pmu.c
> @@ -88,9 +88,10 @@ struct pmu_event {
> }, fixed_events[] = {
> {"fixed 1", MSR_CORE_PERF_FIXED_CTR0, 10*N, 10.2*N},
> {"fixed 2", MSR_CORE_PERF_FIXED_CTR0 + 1, 1*N, 30*N},
> - {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 1*N, 30*N}
> + {"fixed 3", MSR_CORE_PERF_FIXED_CTR0 + 2, 0.1*N, 30*N}
> };
>
> +static int num_counters;
> static int tests, failures;
>
> char *buf;
> @@ -237,7 +238,7 @@ static void check_gp_counter(struct pmu_event *evt)
> };
> int i;
>
> - for (i = 0; i < eax.split.num_counters; i++, cnt.ctr++) {
> + for (i = 0; i < num_counters; i++, cnt.ctr++) {
> cnt.count = 0;
> measure(&cnt, 1);
> report(evt->name, i, verify_event(cnt.count, evt));
> @@ -276,7 +277,7 @@ static void check_counters_many(void)
> pmu_counter_t cnt[10];
> int i, n;
>
> - for (i = 0, n = 0; n < eax.split.num_counters; i++) {
> + for (i = 0, n = 0; n < num_counters; i++) {
> if (ebx.full & (1 << i))
> continue;
>
> @@ -316,10 +317,10 @@ static void check_counter_overflow(void)
> /* clear status before test */
> wrmsr(MSR_CORE_PERF_GLOBAL_OVF_CTRL, rdmsr(MSR_CORE_PERF_GLOBAL_STATUS));
>
> - for (i = 0; i < eax.split.num_counters + 1; i++, cnt.ctr++) {
> + for (i = 0; i < num_counters + 1; i++, cnt.ctr++) {
> uint64_t status;
> int idx;
> - if (i == eax.split.num_counters)
> + if (i == num_counters)
> cnt.ctr = fixed_events[0].unit_sel;
> if (i % 2)
> cnt.config |= EVNTSEL_INT;
> @@ -355,7 +356,7 @@ static void check_rdpmc(void)
> uint64_t val = 0x1f3456789ull;
> int i;
>
> - for (i = 0; i < eax.split.num_counters; i++) {
> + for (i = 0; i < num_counters; i++) {
> uint64_t x = (val & 0xffffffff) |
> ((1ull << (eax.split.bit_width - 32)) - 1) << 32;
> wrmsr(MSR_IA32_PERFCTR0 + i, val);
> @@ -395,6 +396,14 @@ int main(int ac, char **av)
> printf("Fixed counters: %d\n", edx.split.num_counters_fixed);
> printf("Fixed counter width: %d\n", edx.split.bit_width_fixed);
>
> + num_counters = eax.split.num_counters;
> + if (num_counters > ARRAY_SIZE(gp_events))
> + num_counters = ARRAY_SIZE(gp_events);
> +
> apic_write(APIC_LVTPC, PC_VECTOR);
>
> check_gp_counters();
> --
> 1.8.2.1
>
--
Gleb.
* Re: [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts
2013-06-02 15:32 ` [PATCH kvm-unit-tests] " Gleb Natapov
@ 2013-06-03 6:33 ` Paolo Bonzini
2013-06-03 6:38 ` Gleb Natapov
0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2013-06-03 6:33 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
Il 02/06/2013 17:32, Gleb Natapov ha scritto:
> On Thu, May 30, 2013 at 07:43:07PM +0200, Paolo Bonzini wrote:
>> This patch includes two fixes for SB:
>>
>> * the 3rd fixed counter ("ref cpu cycles") can sometimes report
>> less than the number of iterations
>>
> Is it documented? It is strange for "architectural" counter to behave
> differently on different architectures.
It just counts the CPU cycles. If the CPU can optimize the loop better,
it will take fewer CPU cycles to execute it.
Paolo
* Re: [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts
2013-06-03 6:33 ` Paolo Bonzini
@ 2013-06-03 6:38 ` Gleb Natapov
2013-06-03 7:08 ` Paolo Bonzini
0 siblings, 1 reply; 9+ messages in thread
From: Gleb Natapov @ 2013-06-03 6:38 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On Mon, Jun 03, 2013 at 08:33:13AM +0200, Paolo Bonzini wrote:
> Il 02/06/2013 17:32, Gleb Natapov ha scritto:
> > On Thu, May 30, 2013 at 07:43:07PM +0200, Paolo Bonzini wrote:
> >> This patch includes two fixes for SB:
> >>
> >> * the 3rd fixed counter ("ref cpu cycles") can sometimes report
> >> less than the number of iterations
> >>
> > Is it documented? It is strange for "architectural" counter to behave
> > differently on different architectures.
>
> It just counts the CPU cycles. If the CPU can optimize the loop better,
> it will take less CPU cycles to execute it.
>
We should try to change the loop so that it will not be so easily optimized.
Making the test succeed if only 10% of the cycles were spent on the loop
may result in the test missing the case where the counter counts something
different.
--
Gleb.
* Re: [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts
2013-06-03 6:38 ` Gleb Natapov
@ 2013-06-03 7:08 ` Paolo Bonzini
2013-06-03 7:38 ` Gleb Natapov
0 siblings, 1 reply; 9+ messages in thread
From: Paolo Bonzini @ 2013-06-03 7:08 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
Il 03/06/2013 08:38, Gleb Natapov ha scritto:
> On Mon, Jun 03, 2013 at 08:33:13AM +0200, Paolo Bonzini wrote:
>> Il 02/06/2013 17:32, Gleb Natapov ha scritto:
>>> On Thu, May 30, 2013 at 07:43:07PM +0200, Paolo Bonzini wrote:
>>>> This patch includes two fixes for SB:
>>>>
>>>> * the 3rd fixed counter ("ref cpu cycles") can sometimes report
>>>> less than the number of iterations
>>>>
>>> Is it documented? It is strange for "architectural" counter to behave
>>> differently on different architectures.
>>
>> It just counts the CPU cycles. If the CPU can optimize the loop better,
>> it will take less CPU cycles to execute it.
>>
> We should try and change the loop so that it will not be so easily optimized.
> Making the test succeed if only 10% percent of cycles were spend on a loop
> may result in the test missing the case when counter counts something
> different.
Any hard-to-optimize loop risks becoming wrong in the other direction (e.g.
if something stalls the pipeline, a newer chip with a longer pipeline will
use more CPU cycles).
Turbo boost could also contribute to lowering the number of cycles; a
boosted processor has ref cpu cycles that are _longer_ than the regular
cycles (thus they count in smaller numbers). Maybe that's why "core
cycles" didn't go below N.
The real result was something like 0.8*N (780-830000). I used 0.1*N
because it is used for the "ref cpu cycles" gp counter, which is not the
same but similar. Should I change it to 0.5*N or so?
Paolo
* Re: [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts
2013-06-03 7:08 ` Paolo Bonzini
@ 2013-06-03 7:38 ` Gleb Natapov
2013-06-03 7:44 ` Paolo Bonzini
0 siblings, 1 reply; 9+ messages in thread
From: Gleb Natapov @ 2013-06-03 7:38 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kvm
On Mon, Jun 03, 2013 at 09:08:46AM +0200, Paolo Bonzini wrote:
> Il 03/06/2013 08:38, Gleb Natapov ha scritto:
> > On Mon, Jun 03, 2013 at 08:33:13AM +0200, Paolo Bonzini wrote:
> >> Il 02/06/2013 17:32, Gleb Natapov ha scritto:
> >>> On Thu, May 30, 2013 at 07:43:07PM +0200, Paolo Bonzini wrote:
> >>>> This patch includes two fixes for SB:
> >>>>
> >>>> * the 3rd fixed counter ("ref cpu cycles") can sometimes report
> >>>> less than the number of iterations
> >>>>
> >>> Is it documented? It is strange for "architectural" counter to behave
> >>> differently on different architectures.
> >>
> >> It just counts the CPU cycles. If the CPU can optimize the loop better,
> >> it will take less CPU cycles to execute it.
> >>
> > We should try and change the loop so that it will not be so easily optimized.
> > Making the test succeed if only 10% percent of cycles were spend on a loop
> > may result in the test missing the case when counter counts something
> > different.
>
> Any hard-to-optimize loop risks becoming wrong on the other side (e.g.
> if something stalls the pipeline, a newer chip with longer pipeline will
> use more CPU cycles).
>
> Turbo boost could also contribute to lowering the number of cycles; a
> boosted processor has ref cpu cycles that are _longer_ than the regular
> cycles (thus they count in smaller numbers). Maybe that's why "core
> cycles" didn't go below N.
>
"core cycles" are subject to Turbo boost changes, not ref cycles. Since
instruction are executed at core frequency ref cpu cycles count may be
indeed smaller.
> The real result was something like 0.8*N (780-830000). I used 0.1*N
> because it is used for the "ref cpu cycles" gp counter, which is not the
> same but similar. Should I change it to 0.5*N or so?
>
For CPUs with constant_tsc they should be the same. OK, let's make gp and
fixed use the same boundaries.
--
Gleb.
* Re: [PATCH kvm-unit-tests] pmu: fixes for Sandy Bridge hosts
2013-06-03 7:38 ` Gleb Natapov
@ 2013-06-03 7:44 ` Paolo Bonzini
0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2013-06-03 7:44 UTC (permalink / raw)
To: Gleb Natapov; +Cc: kvm
Il 03/06/2013 09:38, Gleb Natapov ha scritto:
>> > Turbo boost could also contribute to lowering the number of cycles; a
>> > boosted processor has ref cpu cycles that are _longer_ than the regular
>> > cycles (thus they count in smaller numbers). Maybe that's why "core
>> > cycles" didn't go below N.
>> >
> "core cycles" are subject to Turbo boost changes, not ref cycles. Since
> instruction are executed at core frequency ref cpu cycles count may be
> indeed smaller.
Yes, that's what I was trying to say. :)
Paolo