* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
[not found] ` <CAM9d7cjwFp9JBqs1Ga9n1ojbez9chZLvmOgFv1EE4KDhAa9ryA@mail.gmail.com>
@ 2020-11-20 11:24 ` Namhyung Kim
2020-11-23 11:00 ` Michael Ellerman
0 siblings, 1 reply; 6+ messages in thread
From: Namhyung Kim @ 2020-11-20 11:24 UTC (permalink / raw)
To: Liang, Kan
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, linuxppc-dev,
linux-kernel, Stephane Eranian, Paul Mackerras,
Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Gabriel Marin
Hi Peter and Kan,
(Adding PPC folks)
On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
> >
> >
> >
> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
> > >
> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() should
> > >> be invoked to flush the PEBS buffer in each context switch. However, The
> > >> perf_sched_events in account_event() is not updated accordingly. The
> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only
> > >> per-task event works.
> > >> At that time, the perf_pmu_sched_task() is outside of
> > >> perf_event_context_sched_in/out. It means that perf has to double
> > >> perf_pmu_disable() for per-task event.
> > >
> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be
> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the
> > >> sched_cb_list. Yes, the code is very similar to the original codes, but it
> > >> is actually the new code for per-CPU events. The optimization for per-task
> > >> events is still kept.
> > >> For the case, which has both a CPU context and a task context, yes, the
> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the
> > >> sched_task() only need to be invoked once in a context switch. The
> > >> sched_task() will be eventually invoked in the task context.
> > >
> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and
> > > only set that for large pebs. Are you sure the other users (Intel LBR
> > > and PowerPC BHRB) don't need it?
> >
> > I didn't set it for LBR, because the perf_sched_events is always enabled
> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
> > for LBR.
> >
> > if (has_branch_stack(event))
> > inc = true;
> >
> > >
> > > If they indeed do not require the pmu::sched_task() callback for CPU
> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface
> >
> > No, LBR requires the pmu::sched_task() callback for CPU events.
> >
> > Now, The LBR registers have to be reset in sched in even for CPU events.
> >
> > To fix the shorter LBR callstack issue for CPU events, we also need to
> > save/restore LBRs in pmu::sched_task().
> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
> >
> > > is confusing at best.
> > >
> > > Can't we do something like this instead?
> > >
> > I think the below patch may have two issues.
> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now.
> > - We may disable the large PEBS later if not all PEBS events support
> > large PEBS. The PMU need a way to notify the generic code to decrease
> > the nr_sched_task.
>
> Any updates on this? I've reviewed and tested Kan's patches
> and they all look good.
>
> Maybe we can talk to PPC folks to confirm the BHRB case?
Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB
for PowerPC too. But it'd be nice if ppc folks can confirm the change.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
2020-11-20 11:24 ` [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events Namhyung Kim
@ 2020-11-23 11:00 ` Michael Ellerman
2020-11-24 4:51 ` Namhyung Kim
0 siblings, 1 reply; 6+ messages in thread
From: Michael Ellerman @ 2020-11-23 11:00 UTC (permalink / raw)
To: Namhyung Kim, Liang, Kan
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, linuxppc-dev,
linux-kernel, Stephane Eranian, Paul Mackerras,
Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Gabriel Marin
Namhyung Kim <namhyung@kernel.org> writes:
> Hi Peter and Kan,
>
> (Adding PPC folks)
>
> On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
>>
>> Hello,
>>
>> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>> >
>> >
>> >
>> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>> > >
>> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() should
>> > >> be invoked to flush the PEBS buffer in each context switch. However, The
>> > >> perf_sched_events in account_event() is not updated accordingly. The
>> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only
>> > >> per-task event works.
>> > >> At that time, the perf_pmu_sched_task() is outside of
>> > >> perf_event_context_sched_in/out. It means that perf has to double
>> > >> perf_pmu_disable() for per-task event.
>> > >
>> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be
>> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the
>> > >> sched_cb_list. Yes, the code is very similar to the original codes, but it
>> > >> is actually the new code for per-CPU events. The optimization for per-task
>> > >> events is still kept.
>> > >> For the case, which has both a CPU context and a task context, yes, the
>> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the
>> > >> sched_task() only need to be invoked once in a context switch. The
>> > >> sched_task() will be eventually invoked in the task context.
>> > >
>> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and
>> > > only set that for large pebs. Are you sure the other users (Intel LBR
>> > > and PowerPC BHRB) don't need it?
>> >
>> > I didn't set it for LBR, because the perf_sched_events is always enabled
>> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>> > for LBR.
>> >
>> > if (has_branch_stack(event))
>> > inc = true;
>> >
>> > >
>> > > If they indeed do not require the pmu::sched_task() callback for CPU
>> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface
>> >
>> > No, LBR requires the pmu::sched_task() callback for CPU events.
>> >
>> > Now, The LBR registers have to be reset in sched in even for CPU events.
>> >
>> > To fix the shorter LBR callstack issue for CPU events, we also need to
>> > save/restore LBRs in pmu::sched_task().
>> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
>> >
>> > > is confusing at best.
>> > >
>> > > Can't we do something like this instead?
>> > >
>> > I think the below patch may have two issues.
>> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now.
>> > - We may disable the large PEBS later if not all PEBS events support
>> > large PEBS. The PMU need a way to notify the generic code to decrease
>> > the nr_sched_task.
>>
>> Any updates on this? I've reviewed and tested Kan's patches
>> and they all look good.
>>
>> Maybe we can talk to PPC folks to confirm the BHRB case?
>
> Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB
> for PowerPC too. But it'd be nice if ppc folks can confirm the change.
Sorry I've read the whole thread, but I'm still not entirely sure I
understand the question.
cheers
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
2020-11-23 11:00 ` Michael Ellerman
@ 2020-11-24 4:51 ` Namhyung Kim
2020-11-24 5:42 ` Madhavan Srinivasan
2020-11-25 8:12 ` Michael Ellerman
0 siblings, 2 replies; 6+ messages in thread
From: Namhyung Kim @ 2020-11-24 4:51 UTC (permalink / raw)
To: Michael Ellerman
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, linuxppc-dev,
linux-kernel, Stephane Eranian, Paul Mackerras,
Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Gabriel Marin,
Liang, Kan
Hello,
On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Namhyung Kim <namhyung@kernel.org> writes:
> > Hi Peter and Kan,
> >
> > (Adding PPC folks)
> >
> > On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >>
> >> Hello,
> >>
> >> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
> >> >
> >> >
> >> >
> >> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
> >> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
> >> > >
> >> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() should
> >> > >> be invoked to flush the PEBS buffer in each context switch. However, The
> >> > >> perf_sched_events in account_event() is not updated accordingly. The
> >> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only
> >> > >> per-task event works.
> >> > >> At that time, the perf_pmu_sched_task() is outside of
> >> > >> perf_event_context_sched_in/out. It means that perf has to double
> >> > >> perf_pmu_disable() for per-task event.
> >> > >
> >> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be
> >> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the
> >> > >> sched_cb_list. Yes, the code is very similar to the original codes, but it
> >> > >> is actually the new code for per-CPU events. The optimization for per-task
> >> > >> events is still kept.
> >> > >> For the case, which has both a CPU context and a task context, yes, the
> >> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the
> >> > >> sched_task() only need to be invoked once in a context switch. The
> >> > >> sched_task() will be eventually invoked in the task context.
> >> > >
> >> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and
> >> > > only set that for large pebs. Are you sure the other users (Intel LBR
> >> > > and PowerPC BHRB) don't need it?
> >> >
> >> > I didn't set it for LBR, because the perf_sched_events is always enabled
> >> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
> >> > for LBR.
> >> >
> >> > if (has_branch_stack(event))
> >> > inc = true;
> >> >
> >> > >
> >> > > If they indeed do not require the pmu::sched_task() callback for CPU
> >> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface
> >> >
> >> > No, LBR requires the pmu::sched_task() callback for CPU events.
> >> >
> >> > Now, The LBR registers have to be reset in sched in even for CPU events.
> >> >
> >> > To fix the shorter LBR callstack issue for CPU events, we also need to
> >> > save/restore LBRs in pmu::sched_task().
> >> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
> >> >
> >> > > is confusing at best.
> >> > >
> >> > > Can't we do something like this instead?
> >> > >
> >> > I think the below patch may have two issues.
> >> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now.
> >> > - We may disable the large PEBS later if not all PEBS events support
> >> > large PEBS. The PMU need a way to notify the generic code to decrease
> >> > the nr_sched_task.
> >>
> >> Any updates on this? I've reviewed and tested Kan's patches
> >> and they all look good.
> >>
> >> Maybe we can talk to PPC folks to confirm the BHRB case?
> >
> > Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB
> > for PowerPC too. But it'd be nice if ppc folks can confirm the change.
>
> Sorry I've read the whole thread, but I'm still not entirely sure I
> understand the question.
Thanks for your time and sorry about not being clear enough.
We found per-cpu events are not calling pmu::sched_task()
on context switches. So PERF_ATTACH_SCHED_CB was
added to indicate the core logic that it needs to invoke the
callback.
The patch 3/3 added the flag to PPC (for BHRB) with other
changes (I think it should be split like in the patch 2/3) and
want to get ACKs from the PPC folks.
Thanks,
Namhyung
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
2020-11-24 4:51 ` Namhyung Kim
@ 2020-11-24 5:42 ` Madhavan Srinivasan
2020-11-24 16:04 ` Liang, Kan
2020-11-25 8:12 ` Michael Ellerman
1 sibling, 1 reply; 6+ messages in thread
From: Madhavan Srinivasan @ 2020-11-24 5:42 UTC (permalink / raw)
To: Namhyung Kim, Michael Ellerman
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, Jiri Olsa, linux-kernel,
Stephane Eranian, Paul Mackerras, Arnaldo Carvalho de Melo,
linuxppc-dev, Ingo Molnar, Gabriel Marin, Liang, Kan
On 11/24/20 10:21 AM, Namhyung Kim wrote:
> Hello,
>
> On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Namhyung Kim <namhyung@kernel.org> writes:
>>> Hi Peter and Kan,
>>>
>>> (Adding PPC folks)
>>>
>>> On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
>>>> Hello,
>>>>
>>>> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>>>>
>>>>>
>>>>> On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>>>>>> On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>>>>>>
>>>>>>> - When the large PEBS was introduced (9c964efa4330), the sched_task() should
>>>>>>> be invoked to flush the PEBS buffer in each context switch. However, The
>>>>>>> perf_sched_events in account_event() is not updated accordingly. The
>>>>>>> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only
>>>>>>> per-task event works.
>>>>>>> At that time, the perf_pmu_sched_task() is outside of
>>>>>>> perf_event_context_sched_in/out. It means that perf has to double
>>>>>>> perf_pmu_disable() for per-task event.
>>>>>>> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be
>>>>>>> retrieved from the task->perf_event_ctxp. So it has to be tracked in the
>>>>>>> sched_cb_list. Yes, the code is very similar to the original codes, but it
>>>>>>> is actually the new code for per-CPU events. The optimization for per-task
>>>>>>> events is still kept.
>>>>>>> For the case, which has both a CPU context and a task context, yes, the
>>>>>>> __perf_pmu_sched_task() in this patch is not invoked. Because the
>>>>>>> sched_task() only need to be invoked once in a context switch. The
>>>>>>> sched_task() will be eventually invoked in the task context.
>>>>>> The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and
>>>>>> only set that for large pebs. Are you sure the other users (Intel LBR
>>>>>> and PowerPC BHRB) don't need it?
>>>>> I didn't set it for LBR, because the perf_sched_events is always enabled
>>>>> for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>>>>> for LBR.
>>>>>
>>>>> if (has_branch_stack(event))
>>>>> inc = true;
>>>>>
>>>>>> If they indeed do not require the pmu::sched_task() callback for CPU
>>>>>> events, then I still think the whole perf_sched_cb_{inc,dec}() interface
>>>>> No, LBR requires the pmu::sched_task() callback for CPU events.
>>>>>
>>>>> Now, The LBR registers have to be reset in sched in even for CPU events.
>>>>>
>>>>> To fix the shorter LBR callstack issue for CPU events, we also need to
>>>>> save/restore LBRs in pmu::sched_task().
>>>>> https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
>>>>>
>>>>>> is confusing at best.
>>>>>>
>>>>>> Can't we do something like this instead?
>>>>>>
>>>>> I think the below patch may have two issues.
>>>>> - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now.
>>>>> - We may disable the large PEBS later if not all PEBS events support
>>>>> large PEBS. The PMU need a way to notify the generic code to decrease
>>>>> the nr_sched_task.
>>>> Any updates on this? I've reviewed and tested Kan's patches
>>>> and they all look good.
>>>>
>>>> Maybe we can talk to PPC folks to confirm the BHRB case?
>>> Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB
>>> for PowerPC too. But it'd be nice if ppc folks can confirm the change.
>> Sorry I've read the whole thread, but I'm still not entirely sure I
>> understand the question.
> Thanks for your time and sorry about not being clear enough.
>
> We found per-cpu events are not calling pmu::sched_task()
> on context switches. So PERF_ATTACH_SCHED_CB was
> added to indicate the core logic that it needs to invoke the
> callback.
>
> The patch 3/3 added the flag to PPC (for BHRB) with other
> changes (I think it should be split like in the patch 2/3) and
> want to get ACKs from the PPC folks.
Sorry for delay.
I guess first it will be better to split the ppc change to a separate patch,
secondly, we are missing the changes needed in the power_pmu_bhrb_disable()
where perf_sched_cb_dec() needs the "state" to be included.
Maddy
>
> Thanks,
> Namhyung
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
2020-11-24 5:42 ` Madhavan Srinivasan
@ 2020-11-24 16:04 ` Liang, Kan
0 siblings, 0 replies; 6+ messages in thread
From: Liang, Kan @ 2020-11-24 16:04 UTC (permalink / raw)
To: Madhavan Srinivasan, Namhyung Kim, Michael Ellerman
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, Jiri Olsa, linux-kernel,
Stephane Eranian, Paul Mackerras, Arnaldo Carvalho de Melo,
linuxppc-dev, Ingo Molnar, Gabriel Marin
On 11/24/2020 12:42 AM, Madhavan Srinivasan wrote:
>
> On 11/24/20 10:21 AM, Namhyung Kim wrote:
>> Hello,
>>
>> On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au>
>> wrote:
>>> Namhyung Kim <namhyung@kernel.org> writes:
>>>> Hi Peter and Kan,
>>>>
>>>> (Adding PPC folks)
>>>>
>>>> On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org>
>>>> wrote:
>>>>> Hello,
>>>>>
>>>>> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan
>>>>> <kan.liang@linux.intel.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>>>>>>> On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>>>>>>>
>>>>>>>> - When the large PEBS was introduced (9c964efa4330), the
>>>>>>>> sched_task() should
>>>>>>>> be invoked to flush the PEBS buffer in each context switch.
>>>>>>>> However, The
>>>>>>>> perf_sched_events in account_event() is not updated accordingly.
>>>>>>>> The
>>>>>>>> perf_event_task_sched_* never be invoked for a pure per-CPU
>>>>>>>> context. Only
>>>>>>>> per-task event works.
>>>>>>>> At that time, the perf_pmu_sched_task() is outside of
>>>>>>>> perf_event_context_sched_in/out. It means that perf has to double
>>>>>>>> perf_pmu_disable() for per-task event.
>>>>>>>> - The patch 1 tries to fix broken per-CPU events. The CPU
>>>>>>>> context cannot be
>>>>>>>> retrieved from the task->perf_event_ctxp. So it has to be
>>>>>>>> tracked in the
>>>>>>>> sched_cb_list. Yes, the code is very similar to the original
>>>>>>>> codes, but it
>>>>>>>> is actually the new code for per-CPU events. The optimization
>>>>>>>> for per-task
>>>>>>>> events is still kept.
>>>>>>>> For the case, which has both a CPU context and a task
>>>>>>>> context, yes, the
>>>>>>>> __perf_pmu_sched_task() in this patch is not invoked. Because the
>>>>>>>> sched_task() only need to be invoked once in a context switch. The
>>>>>>>> sched_task() will be eventually invoked in the task context.
>>>>>>> The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB
>>>>>>> and
>>>>>>> only set that for large pebs. Are you sure the other users (Intel
>>>>>>> LBR
>>>>>>> and PowerPC BHRB) don't need it?
>>>>>> I didn't set it for LBR, because the perf_sched_events is always
>>>>>> enabled
>>>>>> for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>>>>>> for LBR.
>>>>>>
>>>>>> if (has_branch_stack(event))
>>>>>> inc = true;
>>>>>>
>>>>>>> If they indeed do not require the pmu::sched_task() callback for CPU
>>>>>>> events, then I still think the whole perf_sched_cb_{inc,dec}()
>>>>>>> interface
>>>>>> No, LBR requires the pmu::sched_task() callback for CPU events.
>>>>>>
>>>>>> Now, The LBR registers have to be reset in sched in even for CPU
>>>>>> events.
>>>>>>
>>>>>> To fix the shorter LBR callstack issue for CPU events, we also
>>>>>> need to
>>>>>> save/restore LBRs in pmu::sched_task().
>>>>>> https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
>>>>>>
>>>>>>
>>>>>>> is confusing at best.
>>>>>>>
>>>>>>> Can't we do something like this instead?
>>>>>>>
>>>>>> I think the below patch may have two issues.
>>>>>> - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as
>>>>>> well) now.
>>>>>> - We may disable the large PEBS later if not all PEBS events support
>>>>>> large PEBS. The PMU need a way to notify the generic code to decrease
>>>>>> the nr_sched_task.
>>>>> Any updates on this? I've reviewed and tested Kan's patches
>>>>> and they all look good.
>>>>>
>>>>> Maybe we can talk to PPC folks to confirm the BHRB case?
>>>> Can we move this forward? I saw patch 3/3 also adds
>>>> PERF_ATTACH_SCHED_CB
>>>> for PowerPC too. But it'd be nice if ppc folks can confirm the change.
>>> Sorry I've read the whole thread, but I'm still not entirely sure I
>>> understand the question.
>> Thanks for your time and sorry about not being clear enough.
>>
>> We found per-cpu events are not calling pmu::sched_task()
>> on context switches. So PERF_ATTACH_SCHED_CB was
>> added to indicate the core logic that it needs to invoke the
>> callback.
>>
>> The patch 3/3 added the flag to PPC (for BHRB) with other
>> changes (I think it should be split like in the patch 2/3) and
>> want to get ACKs from the PPC folks.
>
> Sorry for delay.
>
> I guess first it will be better to split the ppc change to a separate
> patch,
Both PPC and X86 invokes the perf_sched_cb_inc() directly. The patch
changes the parameters of the perf_sched_cb_inc(). I think we have to
update the PPC and X86 codes together. Otherwise, there will be a
compile error, if someone may only applies the change for the
perf_sched_cb_inc() but forget to applies the changes in PPC or X86
specific codes.
>
> secondly, we are missing the changes needed in the power_pmu_bhrb_disable()
>
> where perf_sched_cb_dec() needs the "state" to be included.
>
Ah, right. The below patch should fix the issue.
diff --git a/arch/powerpc/perf/core-book3s.c
b/arch/powerpc/perf/core-book3s.c
index bced502f64a1..6756d1602a67 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -391,13 +391,18 @@ static void power_pmu_bhrb_enable(struct
perf_event *event)
static void power_pmu_bhrb_disable(struct perf_event *event)
{
struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
+ int state = PERF_SCHED_CB_SW_IN;
if (!ppmu->bhrb_nr)
return;
WARN_ON_ONCE(!cpuhw->bhrb_users);
cpuhw->bhrb_users--;
- perf_sched_cb_dec(event->ctx->pmu);
+
+ if (!(event->attach_state & PERF_ATTACH_TASK))
+ state |= PERF_SCHED_CB_CPU;
+
+ perf_sched_cb_dec(event->ctx->pmu, state);
if (!cpuhw->disabled && !cpuhw->bhrb_users) {
/* BHRB cannot be turned off when other
Thanks,
Kan
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events
2020-11-24 4:51 ` Namhyung Kim
2020-11-24 5:42 ` Madhavan Srinivasan
@ 2020-11-25 8:12 ` Michael Ellerman
1 sibling, 0 replies; 6+ messages in thread
From: Michael Ellerman @ 2020-11-25 8:12 UTC (permalink / raw)
To: Namhyung Kim
Cc: Ian Rogers, Andi Kleen, Peter Zijlstra, linuxppc-dev,
linux-kernel, Stephane Eranian, Paul Mackerras,
Arnaldo Carvalho de Melo, Jiri Olsa, Ingo Molnar, Gabriel Marin,
Liang, Kan
Namhyung Kim <namhyung@kernel.org> writes:
> Hello,
>
> On Mon, Nov 23, 2020 at 8:00 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Namhyung Kim <namhyung@kernel.org> writes:
>> > Hi Peter and Kan,
>> >
>> > (Adding PPC folks)
>> >
>> > On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
>> >>
>> >> Hello,
>> >>
>> >> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>> >> >
>> >> >
>> >> >
>> >> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote:
>> >> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote:
>> >> > >
>> >> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() should
>> >> > >> be invoked to flush the PEBS buffer in each context switch. However, The
>> >> > >> perf_sched_events in account_event() is not updated accordingly. The
>> >> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. Only
>> >> > >> per-task event works.
>> >> > >> At that time, the perf_pmu_sched_task() is outside of
>> >> > >> perf_event_context_sched_in/out. It means that perf has to double
>> >> > >> perf_pmu_disable() for per-task event.
>> >> > >
>> >> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context cannot be
>> >> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the
>> >> > >> sched_cb_list. Yes, the code is very similar to the original codes, but it
>> >> > >> is actually the new code for per-CPU events. The optimization for per-task
>> >> > >> events is still kept.
>> >> > >> For the case, which has both a CPU context and a task context, yes, the
>> >> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the
>> >> > >> sched_task() only need to be invoked once in a context switch. The
>> >> > >> sched_task() will be eventually invoked in the task context.
>> >> > >
>> >> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and
>> >> > > only set that for large pebs. Are you sure the other users (Intel LBR
>> >> > > and PowerPC BHRB) don't need it?
>> >> >
>> >> > I didn't set it for LBR, because the perf_sched_events is always enabled
>> >> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB
>> >> > for LBR.
>> >> >
>> >> > if (has_branch_stack(event))
>> >> > inc = true;
>> >> >
>> >> > >
>> >> > > If they indeed do not require the pmu::sched_task() callback for CPU
>> >> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface
>> >> >
>> >> > No, LBR requires the pmu::sched_task() callback for CPU events.
>> >> >
>> >> > Now, The LBR registers have to be reset in sched in even for CPU events.
>> >> >
>> >> > To fix the shorter LBR callstack issue for CPU events, we also need to
>> >> > save/restore LBRs in pmu::sched_task().
>> >> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.liang@linux.intel.com/
>> >> >
>> >> > > is confusing at best.
>> >> > >
>> >> > > Can't we do something like this instead?
>> >> > >
>> >> > I think the below patch may have two issues.
>> >> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) now.
>> >> > - We may disable the large PEBS later if not all PEBS events support
>> >> > large PEBS. The PMU need a way to notify the generic code to decrease
>> >> > the nr_sched_task.
>> >>
>> >> Any updates on this? I've reviewed and tested Kan's patches
>> >> and they all look good.
>> >>
>> >> Maybe we can talk to PPC folks to confirm the BHRB case?
>> >
>> > Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB
>> > for PowerPC too. But it'd be nice if ppc folks can confirm the change.
>>
>> Sorry I've read the whole thread, but I'm still not entirely sure I
>> understand the question.
>
> Thanks for your time and sorry about not being clear enough.
>
> We found per-cpu events are not calling pmu::sched_task()
> on context switches. So PERF_ATTACH_SCHED_CB was
> added to indicate the core logic that it needs to invoke the
> callback.
OK. TBH I've never thought of using branch stack with a per-cpu event,
but I guess you can do it.
I think the same logic applies as LBR, we need to read the BHRB entries
in the context of the task that they were recorded for.
> The patch 3/3 added the flag to PPC (for BHRB) with other
> changes (I think it should be split like in the patch 2/3) and
> want to get ACKs from the PPC folks.
If you post a new version with Maddy's comments addressed then he or I
can ack it.
cheers
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-11-25 8:14 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20201106212935.28943-1-kan.liang@linux.intel.com>
[not found] ` <20201109095235.GC2594@hirez.programming.kicks-ass.net>
[not found] ` <20201109110405.GN2651@hirez.programming.kicks-ass.net>
[not found] ` <0a1db246-c34a-22a3-160c-3e0c0a38119d@linux.intel.com>
[not found] ` <20201111162509.GW2611@hirez.programming.kicks-ass.net>
[not found] ` <2dc483f6-7b29-c42b-13a4-4c549d720aa2@linux.intel.com>
[not found] ` <CAM9d7cjwFp9JBqs1Ga9n1ojbez9chZLvmOgFv1EE4KDhAa9ryA@mail.gmail.com>
2020-11-20 11:24 ` [PATCH 1/3] perf/core: Flush PMU internal buffers for per-CPU events Namhyung Kim
2020-11-23 11:00 ` Michael Ellerman
2020-11-24 4:51 ` Namhyung Kim
2020-11-24 5:42 ` Madhavan Srinivasan
2020-11-24 16:04 ` Liang, Kan
2020-11-25 8:12 ` Michael Ellerman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).