linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* perf_events: questions about cpu_has_ht_siblings() and offcore support
@ 2011-04-22 12:59 Stephane Eranian
  2011-04-22 13:26 ` Lin Ming
  0 siblings, 1 reply; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 12:59 UTC (permalink / raw)
  To: LKML; +Cc: mingo, Peter Zijlstra, Lin Ming, andi.kleen

Lin,

In arch/x86/include/asm/smp.h, you added:

static inline bool cpu_has_ht_siblings(void)
{
       bool has_siblings = false;
#ifdef CONFIG_SMP
       has_siblings = cpu_has_ht && smp_num_siblings > 1;
#endif
       return has_siblings;
}

I am wondering about the goal of this function.

Is it supposed to return whether or not HT is enabled?

Ht enabled != HT supported

On my systems (NHM or SNB), its value does not change
when I enable/disable HT.

Looking at Intel's AP-485 (CPUID documentation), they
clearly say that none of the Leaf functions which report
about HT or the number of logical cores, can be used to
detect HT enabled or disabled. Seems those leaf functions
are the basis for smp_num_siblings. The trick in Table-5.5
for bit 28 with CPUID(1).EBX[23:16] used in detect_ht()
is about HT supported and not HT enabled.

HT supported means multi-core or multi-thread supported.

Going back to the perf_event code, I wonder what is the
point of using this function in intel_pmu_cpu_prepare(), then.

I suspect you wanted to know whether or not HT was enabled.
But that's not going to work. If you want that functionality, then
I tried:

+static inline int is_ht_enabled(void)
+{
+       bool has_ht = false;
+#ifdef CONFIG_SMP
+       int w;
+       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
+       has_ht = cpu_has_ht && w > 1;
+#endif
+       return has_ht;
+}

But that cannot be used in the CPU hotplug callback for prepare(),
it is too early.

OTOH, you need some validation even in the case HT is off. No two events
scheduled together on the same PMU can have different values for the extra
reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
but then what's the point of it?

I am preparing a patch that builds on your patch and improves
support for those events or features which require an extra
(shared) register. They are differences between NHM/WSM
and SNB.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 12:59 perf_events: questions about cpu_has_ht_siblings() and offcore support Stephane Eranian
@ 2011-04-22 13:26 ` Lin Ming
  2011-04-22 13:46   ` Stephane Eranian
  0 siblings, 1 reply; 14+ messages in thread
From: Lin Ming @ 2011-04-22 13:26 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
> Lin,
> 
> In arch/x86/include/asm/smp.h, you added:
> 
> static inline bool cpu_has_ht_siblings(void)
> {
>        bool has_siblings = false;
> #ifdef CONFIG_SMP
>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
> #endif
>        return has_siblings;
> }
> 
> I am wondering about the goal of this function.
> 
> Is it supposed to return whether or not HT is enabled?
> 
> Ht enabled != HT supported

It's used to check if HT is supported.

We had some long discussions months ago about how to check if HT is
enabled. http://marc.info/?t=129346430400004&r=1&w=2

But unfortunately, we didn't find a way to check if HT is enabled.
So I just check if HT is supported.

> 
> On my systems (NHM or SNB), its value does not change
> when I enable/disable HT.
> 
> Looking at Intel's AP-485 (CPUID documentation), they
> clearly say that none of the Leaf functions which report
> about HT or the number of logical cores, can be used to
> detect HT enabled or disabled. Seems those leaf functions
> are the basis for smp_num_siblings. The trick in Table-5.5
> for bit 28 with CPUID(1).EBX[23:16] used in detect_ht()
> is about HT supported and not HT enabled.
> 
> HT supported means multi-core or multi-thread supported.
> 
> Going back to the perf_event code, I wonder what is the
> point of using this function in intel_pmu_cpu_prepare(), then.
> 
> I suspect you wanted to know whether or not HT was enabled.
> But that's not going to work. If you want that functionality, then
> I tried:
> 
> +static inline int is_ht_enabled(void)
> +{
> +       bool has_ht = false;
> +#ifdef CONFIG_SMP
> +       int w;
> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
> +       has_ht = cpu_has_ht && w > 1;
> +#endif
> +       return has_ht;
> +}
> 
> But that cannot be used in the CPU hotplug callback for prepare(),
> it is too early.
> 
> OTOH, you need some validation even in the case HT is off. No two events
> scheduled together on the same PMU can have different values for the extra
> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
> but then what's the point of it?

The points is to avoid the percore resource allocations(which are used
to sync between HTs) if HT is not supported.

> 
> I am preparing a patch that builds on your patch and improves
> support for those events or features which require an extra
> (shared) register. They are differences between NHM/WSM
> and SNB.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 13:26 ` Lin Ming
@ 2011-04-22 13:46   ` Stephane Eranian
  2011-04-22 14:31     ` Lin Ming
  2011-04-22 20:42     ` Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 13:46 UTC (permalink / raw)
  To: Lin Ming; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
>> Lin,
>>
>> In arch/x86/include/asm/smp.h, you added:
>>
>> static inline bool cpu_has_ht_siblings(void)
>> {
>>        bool has_siblings = false;
>> #ifdef CONFIG_SMP
>>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
>> #endif
>>        return has_siblings;
>> }
>>
>> I am wondering about the goal of this function.
>>
>> Is it supposed to return whether or not HT is enabled?
>>
>> Ht enabled != HT supported
>
> It's used to check if HT is supported.
>
Ok, that makes more sense.

> But unfortunately, we didn't find a way to check if HT is enabled.
> So I just check if HT is supported.
>
>>
>> +static inline int is_ht_enabled(void)
>> +{
>> +       bool has_ht = false;
>> +#ifdef CONFIG_SMP
>> +       int w;
>> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
>> +       has_ht = cpu_has_ht && w > 1;
>> +#endif
>> +       return has_ht;
>> +}
>>
>> OTOH, you need some validation even in the case HT is off. No two events
>> scheduled together on the same PMU can have different values for the extra
>> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
>> but then what's the point of it?
>
> The points is to avoid the percore resource allocations(which are used
> to sync between HTs) if HT is not supported.
>
But if you check x86_pmu.extra_regs, that should do it as well.

Suppose HT is disabled and I do:

perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......

This should still not be allowed.

I think in this case, HT supported will cause your code to still allocate the
per-core struct. There will be no matching of per-core structs in starting().
So I suspect things work.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 13:46   ` Stephane Eranian
@ 2011-04-22 14:31     ` Lin Ming
  2011-04-22 14:41       ` Stephane Eranian
  2011-04-22 14:53       ` Stephane Eranian
  2011-04-22 20:42     ` Peter Zijlstra
  1 sibling, 2 replies; 14+ messages in thread
From: Lin Ming @ 2011-04-22 14:31 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, 2011-04-22 at 21:46 +0800, Stephane Eranian wrote:
> On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> > On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
> >> Lin,
> >>
> >> In arch/x86/include/asm/smp.h, you added:
> >>
> >> static inline bool cpu_has_ht_siblings(void)
> >> {
> >>        bool has_siblings = false;
> >> #ifdef CONFIG_SMP
> >>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
> >> #endif
> >>        return has_siblings;
> >> }
> >>
> >> I am wondering about the goal of this function.
> >>
> >> Is it supposed to return whether or not HT is enabled?
> >>
> >> Ht enabled != HT supported
> >
> > It's used to check if HT is supported.
> >
> Ok, that makes more sense.
> 
> > But unfortunately, we didn't find a way to check if HT is enabled.
> > So I just check if HT is supported.
> >
> >>
> >> +static inline int is_ht_enabled(void)
> >> +{
> >> +       bool has_ht = false;
> >> +#ifdef CONFIG_SMP
> >> +       int w;
> >> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
> >> +       has_ht = cpu_has_ht && w > 1;
> >> +#endif
> >> +       return has_ht;
> >> +}
> >>
> >> OTOH, you need some validation even in the case HT is off. No two events
> >> scheduled together on the same PMU can have different values for the extra

I got it now.

> >> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
> >> but then what's the point of it?
> >
> > The points is to avoid the percore resource allocations(which are used
> > to sync between HTs) if HT is not supported.
> >
> But if you check x86_pmu.extra_regs, that should do it as well.

I don't understand here.
Did you mean we can avoid the percore resource allocations by just
checking x86_pmu.extra_regs? How?

> 
> Suppose HT is disabled and I do:
> 
> perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......
> 
> This should still not be allowed.

Ah, you are right.
We have to always check extra_config even HT is disabled and/or
supported.

> 
> I think in this case, HT supported will cause your code to still allocate the
> per-core struct. There will be no matching of per-core structs in starting().
> So I suspect things work.

This has no problem.
If "no matching" found, then below if(...) statement won't be executed.

intel_pmu_cpu_starting:

        for_each_cpu(i, topology_thread_cpumask(cpu)) {
                struct intel_percore *pc = per_cpu(cpu_hw_events, i).per_core;

                if (pc && pc->core_id == core_id) {
                        kfree(cpuc->per_core);
                        cpuc->per_core = pc;
                        break;
                }
        }

Or do you see other potential problem?



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 14:31     ` Lin Ming
@ 2011-04-22 14:41       ` Stephane Eranian
  2011-04-22 14:47         ` Peter Zijlstra
  2011-04-22 15:03         ` Lin Ming
  2011-04-22 14:53       ` Stephane Eranian
  1 sibling, 2 replies; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 14:41 UTC (permalink / raw)
  To: Lin Ming; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, Apr 22, 2011 at 4:31 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> On Fri, 2011-04-22 at 21:46 +0800, Stephane Eranian wrote:
>> On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
>> > On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
>> >> Lin,
>> >>
>> >> In arch/x86/include/asm/smp.h, you added:
>> >>
>> >> static inline bool cpu_has_ht_siblings(void)
>> >> {
>> >>        bool has_siblings = false;
>> >> #ifdef CONFIG_SMP
>> >>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
>> >> #endif
>> >>        return has_siblings;
>> >> }
>> >>
>> >> I am wondering about the goal of this function.
>> >>
>> >> Is it supposed to return whether or not HT is enabled?
>> >>
>> >> Ht enabled != HT supported
>> >
>> > It's used to check if HT is supported.
>> >
>> Ok, that makes more sense.
>>
>> > But unfortunately, we didn't find a way to check if HT is enabled.
>> > So I just check if HT is supported.
>> >
>> >>
>> >> +static inline int is_ht_enabled(void)
>> >> +{
>> >> +       bool has_ht = false;
>> >> +#ifdef CONFIG_SMP
>> >> +       int w;
>> >> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
>> >> +       has_ht = cpu_has_ht && w > 1;
>> >> +#endif
>> >> +       return has_ht;
>> >> +}
>> >>
>> >> OTOH, you need some validation even in the case HT is off. No two events
>> >> scheduled together on the same PMU can have different values for the extra
>
> I got it now.
>
>> >> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
>> >> but then what's the point of it?
>> >
>> > The points is to avoid the percore resource allocations(which are used
>> > to sync between HTs) if HT is not supported.
>> >
>> But if you check x86_pmu.extra_regs, that should do it as well.
>
> I don't understand here.
> Did you mean we can avoid the percore resource allocations by just
> checking x86_pmu.extra_regs? How?

Is you have not extra_regs, i.e., regs that are shared, then why would
you need the percore allocation?


>
>>
>> Suppose HT is disabled and I do:
>>
>> perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......
>>
>> This should still not be allowed.
>
> Ah, you are right.
> We have to always check extra_config even HT is disabled and/or
> supported.
>
Yes. You won't need the locking, though.

>>
>> I think in this case, HT supported will cause your code to still allocate the
>> per-core struct. There will be no matching of per-core structs in starting().
>> So I suspect things work.
>
> This has no problem.
> If "no matching" found, then below if(...) statement won't be executed.
>
> intel_pmu_cpu_starting:
>
>        for_each_cpu(i, topology_thread_cpumask(cpu)) {
>                struct intel_percore *pc = per_cpu(cpu_hw_events, i).per_core;
>
>                if (pc && pc->core_id == core_id) {
>                        kfree(cpuc->per_core);
>                        cpuc->per_core = pc;
>                        break;
>                }
>        }
>
> Or do you see other potential problem?
>
I think when HT is off, you will never execute the if statement, because
no core_id will ever match another.

Another thing that struck me when locking at the hotplug code for
per-core is the lack of locking. I assume that's because hotplug
cpu is inherently serialized. You cannot have a CPU going offline
and one going online at the same time. is that right? Otherwise
I wonder if you could simply do per_core->refcnt++ vs.
per_core->refcnt--

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 14:41       ` Stephane Eranian
@ 2011-04-22 14:47         ` Peter Zijlstra
  2011-04-22 14:48           ` Stephane Eranian
  2011-04-22 15:03         ` Lin Ming
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2011-04-22 14:47 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Lin Ming, LKML, mingo@elte.hu, Kleen, Andi

On Fri, 2011-04-22 at 16:41 +0200, Stephane Eranian wrote:
> I assume that's because hotplug
> cpu is inherently serialized. You cannot have a CPU going offline
> and one going online at the same time. is that right? 

Exactly, that's all serialized under the hotpug mutex


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 14:47         ` Peter Zijlstra
@ 2011-04-22 14:48           ` Stephane Eranian
  0 siblings, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 14:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Lin Ming, LKML, mingo@elte.hu, Kleen, Andi

On Fri, Apr 22, 2011 at 4:47 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, 2011-04-22 at 16:41 +0200, Stephane Eranian wrote:
>> I assume that's because hotplug
>> cpu is inherently serialized. You cannot have a CPU going offline
>> and one going online at the same time. is that right?
>
> Exactly, that's all serialized under the hotpug mutex
>
Ok, thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 14:31     ` Lin Ming
  2011-04-22 14:41       ` Stephane Eranian
@ 2011-04-22 14:53       ` Stephane Eranian
  1 sibling, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 14:53 UTC (permalink / raw)
  To: Lin Ming; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, Apr 22, 2011 at 4:31 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> On Fri, 2011-04-22 at 21:46 +0800, Stephane Eranian wrote:
>> On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
>> > On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
>> >> Lin,
>> >>
>> >> In arch/x86/include/asm/smp.h, you added:
>> >>
>> >> static inline bool cpu_has_ht_siblings(void)
>> >> {
>> >>        bool has_siblings = false;
>> >> #ifdef CONFIG_SMP
>> >>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
>> >> #endif
>> >>        return has_siblings;
>> >> }
>> >>
>> >> I am wondering about the goal of this function.
>> >>
>> >> Is it supposed to return whether or not HT is enabled?
>> >>
>> >> Ht enabled != HT supported
>> >
>> > It's used to check if HT is supported.
>> >
>> Ok, that makes more sense.
>>
>> > But unfortunately, we didn't find a way to check if HT is enabled.
>> > So I just check if HT is supported.
>> >
>> >>
>> >> +static inline int is_ht_enabled(void)
>> >> +{
>> >> +       bool has_ht = false;
>> >> +#ifdef CONFIG_SMP
>> >> +       int w;
>> >> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
>> >> +       has_ht = cpu_has_ht && w > 1;
>> >> +#endif
>> >> +       return has_ht;
>> >> +}
>> >>
>> >> OTOH, you need some validation even in the case HT is off. No two events
>> >> scheduled together on the same PMU can have different values for the extra
>
> I got it now.
>
>> >> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
>> >> but then what's the point of it?
>> >
>> > The points is to avoid the percore resource allocations(which are used
>> > to sync between HTs) if HT is not supported.
>> >
>> But if you check x86_pmu.extra_regs, that should do it as well.
>
> I don't understand here.
> Did you mean we can avoid the percore resource allocations by just
> checking x86_pmu.extra_regs? How?
>
>>
>> Suppose HT is disabled and I do:
>>
>> perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......
>>
>> This should still not be allowed.
>
> Ah, you are right.
> We have to always check extra_config even HT is disabled and/or
> supported.
>
>>
>> I think in this case, HT supported will cause your code to still allocate the
>> per-core struct. There will be no matching of per-core structs in starting().
>> So I suspect things work.
>
> This has no problem.
> If "no matching" found, then below if(...) statement won't be executed.
>
> intel_pmu_cpu_starting:
>
>        for_each_cpu(i, topology_thread_cpumask(cpu)) {
>                struct intel_percore *pc = per_cpu(cpu_hw_events, i).per_core;
>
>                if (pc && pc->core_id == core_id) {
>                        kfree(cpuc->per_core);
>                        cpuc->per_core = pc;
>                        break;
>                }
>        }
>
> Or do you see other potential problem?
>
The other problem I found while working on my patch is that the
current code, allows you to construct an event group which will
never be schedulable:

task -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ....

That's because validate_group() uses a fake_cpuc which does not implement
the per-core logic. I think I understand how why you had:

intel_percore_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
{
             .....
                pc = cpuc->per_core;
                if (!pc)
                        break;
             ....
}

I have that fixed in my patch.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 14:41       ` Stephane Eranian
  2011-04-22 14:47         ` Peter Zijlstra
@ 2011-04-22 15:03         ` Lin Ming
  2011-04-22 15:05           ` Stephane Eranian
  1 sibling, 1 reply; 14+ messages in thread
From: Lin Ming @ 2011-04-22 15:03 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, 2011-04-22 at 22:41 +0800, Stephane Eranian wrote:
> On Fri, Apr 22, 2011 at 4:31 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> > On Fri, 2011-04-22 at 21:46 +0800, Stephane Eranian wrote:
> >> On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> >> > On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
> >> >> Lin,
> >> >>
> >> >> In arch/x86/include/asm/smp.h, you added:
> >> >>
> >> >> static inline bool cpu_has_ht_siblings(void)
> >> >> {
> >> >>        bool has_siblings = false;
> >> >> #ifdef CONFIG_SMP
> >> >>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
> >> >> #endif
> >> >>        return has_siblings;
> >> >> }
> >> >>
> >> >> I am wondering about the goal of this function.
> >> >>
> >> >> Is it supposed to return whether or not HT is enabled?
> >> >>
> >> >> Ht enabled != HT supported
> >> >
> >> > It's used to check if HT is supported.
> >> >
> >> Ok, that makes more sense.
> >>
> >> > But unfortunately, we didn't find a way to check if HT is enabled.
> >> > So I just check if HT is supported.
> >> >
> >> >>
> >> >> +static inline int is_ht_enabled(void)
> >> >> +{
> >> >> +       bool has_ht = false;
> >> >> +#ifdef CONFIG_SMP
> >> >> +       int w;
> >> >> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
> >> >> +       has_ht = cpu_has_ht && w > 1;
> >> >> +#endif
> >> >> +       return has_ht;
> >> >> +}
> >> >>
> >> >> OTOH, you need some validation even in the case HT is off. No two events
> >> >> scheduled together on the same PMU can have different values for the extra
> >
> > I got it now.
> >
> >> >> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
> >> >> but then what's the point of it?
> >> >
> >> > The points is to avoid the percore resource allocations(which are used
> >> > to sync between HTs) if HT is not supported.
> >> >
> >> But if you check x86_pmu.extra_regs, that should do it as well.
> >
> > I don't understand here.
> > Did you mean we can avoid the percore resource allocations by just
> > checking x86_pmu.extra_regs? How?
> 
> Is you have not extra_regs, i.e., regs that are shared, then why would
> you need the percore allocation?

But "extra_regs" does not imply they are regs that are shared.
It only means some events need to set extra registers to work.

> 
> 
> >
> >>
> >> Suppose HT is disabled and I do:
> >>
> >> perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......
> >>
> >> This should still not be allowed.
> >
> > Ah, you are right.
> > We have to always check extra_config even HT is disabled and/or
> > supported.
> >
> Yes. You won't need the locking, though.
> 
> >>
> >> I think in this case, HT supported will cause your code to still allocate the
> >> per-core struct. There will be no matching of per-core structs in starting().
> >> So I suspect things work.
> >
> > This has no problem.
> > If "no matching" found, then below if(...) statement won't be executed.
> >
> > intel_pmu_cpu_starting:
> >
> >        for_each_cpu(i, topology_thread_cpumask(cpu)) {
> >                struct intel_percore *pc = per_cpu(cpu_hw_events, i).per_core;
> >
> >                if (pc && pc->core_id == core_id) {
> >                        kfree(cpuc->per_core);
> >                        cpuc->per_core = pc;
> >                        break;
> >                }
> >        }
> >
> > Or do you see other potential problem?
> >
> I think when HT is off, you will never execute the if statement, because
> no core_id will ever match another.

The "if" statement is not executed so the per-core structs allocated in
intel_pmu_cpu_prepare is not freed.

This is the intended behavior since we don't have a way to check if HT
is off.

> 
> Another thing that struck me when locking at the hotplug code for
> per-core is the lack of locking. I assume that's because hotplug
> cpu is inherently serialized. You cannot have a CPU going offline
> and one going online at the same time. is that right? Otherwise
> I wonder if you could simply do per_core->refcnt++ vs.
> per_core->refcnt--



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 15:03         ` Lin Ming
@ 2011-04-22 15:05           ` Stephane Eranian
  2011-04-22 15:30             ` Lin Ming
  0 siblings, 1 reply; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 15:05 UTC (permalink / raw)
  To: Lin Ming; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, Apr 22, 2011 at 5:03 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> On Fri, 2011-04-22 at 22:41 +0800, Stephane Eranian wrote:
>> On Fri, Apr 22, 2011 at 4:31 PM, Lin Ming <ming.m.lin@intel.com> wrote:
>> > On Fri, 2011-04-22 at 21:46 +0800, Stephane Eranian wrote:
>> >> On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
>> >> > On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
>> >> >> Lin,
>> >> >>
>> >> >> In arch/x86/include/asm/smp.h, you added:
>> >> >>
>> >> >> static inline bool cpu_has_ht_siblings(void)
>> >> >> {
>> >> >>        bool has_siblings = false;
>> >> >> #ifdef CONFIG_SMP
>> >> >>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
>> >> >> #endif
>> >> >>        return has_siblings;
>> >> >> }
>> >> >>
>> >> >> I am wondering about the goal of this function.
>> >> >>
>> >> >> Is it supposed to return whether or not HT is enabled?
>> >> >>
>> >> >> Ht enabled != HT supported
>> >> >
>> >> > It's used to check if HT is supported.
>> >> >
>> >> Ok, that makes more sense.
>> >>
>> >> > But unfortunately, we didn't find a way to check if HT is enabled.
>> >> > So I just check if HT is supported.
>> >> >
>> >> >>
>> >> >> +static inline int is_ht_enabled(void)
>> >> >> +{
>> >> >> +       bool has_ht = false;
>> >> >> +#ifdef CONFIG_SMP
>> >> >> +       int w;
>> >> >> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
>> >> >> +       has_ht = cpu_has_ht && w > 1;
>> >> >> +#endif
>> >> >> +       return has_ht;
>> >> >> +}
>> >> >>
>> >> >> OTOH, you need some validation even in the case HT is off. No two events
>> >> >> scheduled together on the same PMU can have different values for the extra
>> >
>> > I got it now.
>> >
>> >> >> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
>> >> >> but then what's the point of it?
>> >> >
>> >> > The points is to avoid the percore resource allocations(which are used
>> >> > to sync between HTs) if HT is not supported.
>> >> >
>> >> But if you check x86_pmu.extra_regs, that should do it as well.
>> >
>> > I don't understand here.
>> > Did you mean we can avoid the percore resource allocations by just
>> > checking x86_pmu.extra_regs? How?
>>
>> Is you have not extra_regs, i.e., regs that are shared, then why would
>> you need the percore allocation?
>
> But "extra_regs" does not imply they are regs that are shared.
> It only means some events need to set extra registers to work.
>
Do you have example of such register that would not require the
extra mutual exclusion either between HT threads or between
events on the same PMU?



>>
>>
>> >
>> >>
>> >> Suppose HT is disabled and I do:
>> >>
>> >> perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......
>> >>
>> >> This should still not be allowed.
>> >
>> > Ah, you are right.
>> > We have to always check extra_config even HT is disabled and/or
>> > supported.
>> >
>> Yes. You won't need the locking, though.
>>
>> >>
>> >> I think in this case, HT supported will cause your code to still allocate the
>> >> per-core struct. There will be no matching of per-core structs in starting().
>> >> So I suspect things work.
>> >
>> > This has no problem.
>> > If "no matching" found, then below if(...) statement won't be executed.
>> >
>> > intel_pmu_cpu_starting:
>> >
>> >        for_each_cpu(i, topology_thread_cpumask(cpu)) {
>> >                struct intel_percore *pc = per_cpu(cpu_hw_events, i).per_core;
>> >
>> >                if (pc && pc->core_id == core_id) {
>> >                        kfree(cpuc->per_core);
>> >                        cpuc->per_core = pc;
>> >                        break;
>> >                }
>> >        }
>> >
>> > Or do you see other potential problem?
>> >
>> I think when HT is off, you will never execute the if statement, because
>> no core_id will ever match another.
>
> The "if" statement is not executed so the per-core structs allocated in
> intel_pmu_cpu_prepare is not freed.
>
> This is the intended behavior since we don't have a way to check if HT
> is off.
>
>>
>> Another thing that struck me when locking at the hotplug code for
>> per-core is the lack of locking. I assume that's because hotplug
>> cpu is inherently serialized. You cannot have a CPU going offline
>> and one going online at the same time. is that right? Otherwise
>> I wonder if you could simply do per_core->refcnt++ vs.
>> per_core->refcnt--
>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 15:05           ` Stephane Eranian
@ 2011-04-22 15:30             ` Lin Ming
  2011-04-22 16:21               ` Stephane Eranian
  0 siblings, 1 reply; 14+ messages in thread
From: Lin Ming @ 2011-04-22 15:30 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, 2011-04-22 at 23:05 +0800, Stephane Eranian wrote:
> On Fri, Apr 22, 2011 at 5:03 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> > On Fri, 2011-04-22 at 22:41 +0800, Stephane Eranian wrote:
> >> On Fri, Apr 22, 2011 at 4:31 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> >> > On Fri, 2011-04-22 at 21:46 +0800, Stephane Eranian wrote:
> >> >> On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> >> >> > On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
> >> >> >> Lin,
> >> >> >>
> >> >> >> In arch/x86/include/asm/smp.h, you added:
> >> >> >>
> >> >> >> static inline bool cpu_has_ht_siblings(void)
> >> >> >> {
> >> >> >>        bool has_siblings = false;
> >> >> >> #ifdef CONFIG_SMP
> >> >> >>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
> >> >> >> #endif
> >> >> >>        return has_siblings;
> >> >> >> }
> >> >> >>
> >> >> >> I am wondering about the goal of this function.
> >> >> >>
> >> >> >> Is it supposed to return whether or not HT is enabled?
> >> >> >>
> >> >> >> Ht enabled != HT supported
> >> >> >
> >> >> > It's used to check if HT is supported.
> >> >> >
> >> >> Ok, that makes more sense.
> >> >>
> >> >> > But unfortunately, we didn't find a way to check if HT is enabled.
> >> >> > So I just check if HT is supported.
> >> >> >
> >> >> >>
> >> >> >> +static inline int is_ht_enabled(void)
> >> >> >> +{
> >> >> >> +       bool has_ht = false;
> >> >> >> +#ifdef CONFIG_SMP
> >> >> >> +       int w;
> >> >> >> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
> >> >> >> +       has_ht = cpu_has_ht && w > 1;
> >> >> >> +#endif
> >> >> >> +       return has_ht;
> >> >> >> +}
> >> >> >>
> >> >> >> OTOH, you need some validation even in the case HT is off. No two events
> >> >> >> scheduled together on the same PMU can have different values for the extra
> >> >
> >> > I got it now.
> >> >
> >> >> >> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
> >> >> >> but then what's the point of it?
> >> >> >
> >> >> > The points is to avoid the percore resource allocations(which are used
> >> >> > to sync between HTs) if HT is not supported.
> >> >> >
> >> >> But if you check x86_pmu.extra_regs, that should do it as well.
> >> >
> >> > I don't understand here.
> >> > Did you mean we can avoid the percore resource allocations by just
> >> > checking x86_pmu.extra_regs? How?
> >>
> >> Is you have not extra_regs, i.e., regs that are shared, then why would
> >> you need the percore allocation?
> >
> > But "extra_regs" does not imply they are regs that are shared.
> > It only means some events need to set extra registers to work.
> >
> Do you have example of such register that would not require the
> extra mutual exclusion either between HT threads or between
> events on the same PMU?

No.

I was thinking the case that "extra_regs" may be per-thread, instead of
pef-core.

So if there are "extra_regs" or not does not connected directly with if
locking is needed or not.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 15:30             ` Lin Ming
@ 2011-04-22 16:21               ` Stephane Eranian
  0 siblings, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 16:21 UTC (permalink / raw)
  To: Lin Ming; +Cc: LKML, mingo@elte.hu, Peter Zijlstra, Kleen, Andi

On Fri, Apr 22, 2011 at 5:30 PM, Lin Ming <ming.m.lin@intel.com> wrote:
> On Fri, 2011-04-22 at 23:05 +0800, Stephane Eranian wrote:
>> On Fri, Apr 22, 2011 at 5:03 PM, Lin Ming <ming.m.lin@intel.com> wrote:
>> > On Fri, 2011-04-22 at 22:41 +0800, Stephane Eranian wrote:
>> >> On Fri, Apr 22, 2011 at 4:31 PM, Lin Ming <ming.m.lin@intel.com> wrote:
>> >> > On Fri, 2011-04-22 at 21:46 +0800, Stephane Eranian wrote:
>> >> >> On Fri, Apr 22, 2011 at 3:26 PM, Lin Ming <ming.m.lin@intel.com> wrote:
>> >> >> > On Fri, 2011-04-22 at 20:59 +0800, Stephane Eranian wrote:
>> >> >> >> Lin,
>> >> >> >>
>> >> >> >> In arch/x86/include/asm/smp.h, you added:
>> >> >> >>
>> >> >> >> static inline bool cpu_has_ht_siblings(void)
>> >> >> >> {
>> >> >> >>        bool has_siblings = false;
>> >> >> >> #ifdef CONFIG_SMP
>> >> >> >>        has_siblings = cpu_has_ht && smp_num_siblings > 1;
>> >> >> >> #endif
>> >> >> >>        return has_siblings;
>> >> >> >> }
>> >> >> >>
>> >> >> >> I am wondering about the goal of this function.
>> >> >> >>
>> >> >> >> Is it supposed to return whether or not HT is enabled?
>> >> >> >>
>> >> >> >> Ht enabled != HT supported
>> >> >> >
>> >> >> > It's used to check if HT is supported.
>> >> >> >
>> >> >> Ok, that makes more sense.
>> >> >>
>> >> >> > But unfortunately, we didn't find a way to check if HT is enabled.
>> >> >> > So I just check if HT is supported.
>> >> >> >
>> >> >> >>
>> >> >> >> +static inline int is_ht_enabled(void)
>> >> >> >> +{
>> >> >> >> +       bool has_ht = false;
>> >> >> >> +#ifdef CONFIG_SMP
>> >> >> >> +       int w;
>> >> >> >> +       w = cpumask_weight(cpu_sibling_mask(smp_processor_id()));
>> >> >> >> +       has_ht = cpu_has_ht && w > 1;
>> >> >> >> +#endif
>> >> >> >> +       return has_ht;
>> >> >> >> +}
>> >> >> >>
>> >> >> >> OTOH, you need some validation even in the case HT is off. No two events
>> >> >> >> scheduled together on the same PMU can have different values for the extra
>> >> >
>> >> > I got it now.
>> >> >
>> >> >> >> reg. Thus, the fact that cpu_has_ht_siblings() is imune to HT state helps here,
>> >> >> >> but then what's the point of it?
>> >> >> >
>> >> >> > The points is to avoid the percore resource allocations(which are used
>> >> >> > to sync between HTs) if HT is not supported.
>> >> >> >
>> >> >> But if you check x86_pmu.extra_regs, that should do it as well.
>> >> >
>> >> > I don't understand here.
>> >> > Did you mean we can avoid the percore resource allocations by just
>> >> > checking x86_pmu.extra_regs? How?
>> >>
>> >> Is you have not extra_regs, i.e., regs that are shared, then why would
>> >> you need the percore allocation?
>> >
>> > But "extra_regs" does not imply they are regs that are shared.
>> > It only means some events need to set extra registers to work.
>> >
>> Do you have example of such register that would not require the
>> extra mutual exclusion either between HT threads or between
>> events on the same PMU?
>
> No.
>
> I was thinking the case that "extra_regs" may be per-thread, instead of
> pef-core.
>
Yes, I am looking into that in my patch. You need this on NHM/WSM and
SNB.

But you would still need a struct to track the locked in value for the
extra_reg to avoid conflict even on one thread.

> So if there are "extra_regs" or not does not connected directly with if
> locking is needed or not.
>
True but it is "connected" to the kernel having to maintain the current
value for that extra_reg and ensuring that no other events conflict with
it.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 13:46   ` Stephane Eranian
  2011-04-22 14:31     ` Lin Ming
@ 2011-04-22 20:42     ` Peter Zijlstra
  2011-04-22 22:15       ` Stephane Eranian
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2011-04-22 20:42 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Lin Ming, LKML, mingo@elte.hu, Kleen, Andi

On Fri, 2011-04-22 at 15:46 +0200, Stephane Eranian wrote:
> 
> perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......
> 
> This should still not be allowed.

Aside from the problem you mention, I don't like the whole
offcore_response_[01] thing. The offcore things are fully symmetric so
the kernel should automatically select the other one if one is taken.




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: perf_events: questions about cpu_has_ht_siblings() and offcore support
  2011-04-22 20:42     ` Peter Zijlstra
@ 2011-04-22 22:15       ` Stephane Eranian
  0 siblings, 0 replies; 14+ messages in thread
From: Stephane Eranian @ 2011-04-22 22:15 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Lin Ming, LKML, mingo@elte.hu, Kleen, Andi

On Fri, Apr 22, 2011 at 10:42 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, 2011-04-22 at 15:46 +0200, Stephane Eranian wrote:
>>
>> perf stat -e offcore_response_0:dmd_data_rd,offcore_response_0:dmnd_rfo ......
>>
>> This should still not be allowed.
>
> Aside from the problem you mention, I don't like the whole
> offcore_response_[01] thing. The offcore things are fully symmetric so
> the kernel should automatically select the other one if one is taken.
>
We could indeed do this. Though it would mean special casing
event codes 0xbb and 0xb7. You'd have to lookup the per-core struct
to decide whether it's worth it or not.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2011-04-22 22:15 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-04-22 12:59 perf_events: questions about cpu_has_ht_siblings() and offcore support Stephane Eranian
2011-04-22 13:26 ` Lin Ming
2011-04-22 13:46   ` Stephane Eranian
2011-04-22 14:31     ` Lin Ming
2011-04-22 14:41       ` Stephane Eranian
2011-04-22 14:47         ` Peter Zijlstra
2011-04-22 14:48           ` Stephane Eranian
2011-04-22 15:03         ` Lin Ming
2011-04-22 15:05           ` Stephane Eranian
2011-04-22 15:30             ` Lin Ming
2011-04-22 16:21               ` Stephane Eranian
2011-04-22 14:53       ` Stephane Eranian
2011-04-22 20:42     ` Peter Zijlstra
2011-04-22 22:15       ` Stephane Eranian

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).