From: Peter Zijlstra <peterz@infradead.org>
To: Eric Lin <eric.lin@sifive.com>
Cc: Stephane Eranian <eranian@google.com>,
mingo@redhat.com, acme@kernel.org, mark.rutland@arm.com,
alexander.shishkin@linux.intel.com, jolsa@kernel.org,
namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com,
palmer@dabbelt.com, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
greentime.hu@sifive.com, vincent.chen@sifive.com
Subject: Re: [PATCH] perf/core: Add pmu stop before unthrottling to prevent WARNING
Date: Tue, 27 Jun 2023 11:38:23 +0200
Message-ID: <20230627093823.GV83892@hirez.programming.kicks-ass.net>
In-Reply-To: <CAPqJEFpV8a8D7eA0sspjvThvBxdZhSLPTEbEzN7WiGCAzSnYYg@mail.gmail.com>
On Tue, Jun 27, 2023 at 05:08:07PM +0800, Eric Lin wrote:
> > Yeah, Changelog fails to explain how we got to the faulty state -- and
> > without that we can't judge if the proposed solution actually fixes the
> > problem or not.
> >
>
> Hi Stephane, Peter,
>
> Most PMU drivers call *_pmu_stop(event, 0) in their
> *_pmu_handle_irq() function, which sets the PERF_HES_STOPPED flag in
> hwc->state, as below:
>
> arch/alpha/kernel/perf_event.c:856:        if (perf_event_overflow(event, &data, regs)) {
> arch/alpha/kernel/perf_event.c-857-                /* Interrupts coming too quickly; "throttle" the
> arch/alpha/kernel/perf_event.c-858-                 * counter, i.e., disable it for a little while.
> arch/alpha/kernel/perf_event.c-859-                 */
> arch/alpha/kernel/perf_event.c-860-                alpha_pmu_stop(event, 0);
> arch/alpha/kernel/perf_event.c-861-        }
> -----
> arch/arc/kernel/perf_event.c:603:        if (perf_event_overflow(event, &data, regs))
> arch/arc/kernel/perf_event.c-604-                arc_pmu_stop(event, 0);
> arch/arc/kernel/perf_event.c-605-    }
> -----
> arch/x86/events/amd/core.c:935:        if (perf_event_overflow(event, &data, regs))
> arch/x86/events/amd/core.c-936-                x86_pmu_stop(event, 0);
> arch/x86/events/amd/core.c-937-    }
> -----
>
> However, some PMU drivers stop the event in *_pmu_handle_irq()
> without setting the PERF_HES_STOPPED flag in hwc->state, as below:
>
> arch/arm/kernel/perf_event_v7.c:994:        if (perf_event_overflow(event, &data, regs))
> arch/arm/kernel/perf_event_v7.c-995-                cpu_pmu->disable(event); // <== not update with PERF_HES_STOPPED
> arch/arm/kernel/perf_event_v7.c-996-    }
> -----
> arch/csky/kernel/perf_event.c:1142:        if (perf_event_overflow(event, &data, regs))
> arch/csky/kernel/perf_event.c-1143-                csky_pmu_stop_event(event); // <== not update with PERF_HES_STOPPED
> arch/csky/kernel/perf_event.c-1144-    }
> -----
> arch/loongarch/kernel/perf_event.c:492:    if (perf_event_overflow(event, data, regs))
> arch/loongarch/kernel/perf_event.c-493-            loongarch_pmu_disable_event(idx); // <== not update with PERF_HES_STOPPED
> arch/loongarch/kernel/perf_event.c-494-}
> -----
> arch/mips/kernel/perf_event_mipsxx.c:794:    if (perf_event_overflow(event, data, regs))
> arch/mips/kernel/perf_event_mipsxx.c-795-            mipsxx_pmu_disable_event(idx); // <== not update with PERF_HES_STOPPED
> arch/mips/kernel/perf_event_mipsxx.c-796-}
> ....
>
> Furthermore, these drivers do not check event->hw.state in
> *_pmu_start() before starting the event, as x86 does:
>
> static void x86_pmu_start(struct perf_event *event, int flags)
> {
>         struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>         int idx = event->hw.idx;
>
>         if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
>                 return;
>
> As a result, these drivers won't trigger the WARN_ON_ONCE warning as
> shown in this patch.
>
> However, a PMU driver such as the RISC-V one, which does not call
> *_pmu_stop(event, 0) in *_pmu_handle_irq() (and so never sets
> PERF_HES_STOPPED in hwc->state there) but does check event->hw.state
> in *_pmu_start(), can trigger the WARN_ON_ONCE warning shown in this
> patch.
>
> Therefore, I think we need to call pmu->stop() before unthrottling the
> event to prevent this warning.

How is that not a pmu driver problem? I'd think we should be fixing
those drivers. Mark, do you have any memories of how the ARM driver
came to be this way?