* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
@ 2016-04-19 17:08 Lorenzo Pieralisi
2016-04-19 17:23 ` Will Deacon
0 siblings, 1 reply; 9+ messages in thread
From: Lorenzo Pieralisi @ 2016-04-19 17:08 UTC (permalink / raw)
To: linux-arm-kernel
Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
added code in the arm perf infrastructure that allows the kernel to
save/restore perf counters whenever the CPU enters a low-power idle
state. The kernel saves/restores the counters for each active event
through the armpmu_{stop/start} ARM pmu API, so that the idle state
enter/exit power cycle is emulated through pmu start/stop operations
for each event in use.
However, calling armpmu_start() for each active event on power up
executes code that requires RCU locking (perf_event_update_userpage())
to be functional, so, given that the core may call the CPU_PM notifiers
while running the idle thread in an quiescent RCU state this is not
allowed as detected through the following splat when kernel is run with
CONFIG_PROVE_LOCKING enabled:
[ 49.293286]
[ 49.294761] ===============================
[ 49.298895] [ INFO: suspicious RCU usage. ]
[ 49.303031] 4.6.0-rc3+ #421 Not tainted
[ 49.306821] -------------------------------
[ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
illegally while idle!
[ 49.318530]
[ 49.318530] other info that might help us debug this:
[ 49.318530]
[ 49.326451]
[ 49.326451] RCU used illegally from idle CPU!
[ 49.326451] rcu_scheduler_active = 1, debug_locks = 0
[ 49.337209] RCU used illegally from extended quiescent state!
[ 49.342892] 2 locks held by swapper/2/0:
[ 49.346768] #0: (cpu_pm_notifier_lock){......}, at:
[<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
[ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>]
perf_event_update_userpage+0x0/0x260
This patch refactors the perf CPU_PM notifiers to add a boolean
flag to the function updating the counters event period, so that the
userpage update can be skipped when resuming from low-power whilst
keeping correct save/restore functionality for the running events.
As a side effect the kernel, while resuming from low-power with
perf events enabled, runs with a userspace view of active counters that
is not up-to-date with the kernel one, but since the core power down is
not really a PMU event start/stop this can be considered acceptable and
the userspace event snapshot will update the user view of counters
on subsequent perf event updates requested by either the perf API
or event counters overflow-triggered interrupts.
Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reported-by: James Morse <james.morse@arm.com>
Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
---
drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 32346b5..4343984 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -101,7 +101,8 @@ armpmu_map_event(struct perf_event *event,
return -ENOENT;
}
-int armpmu_event_set_period(struct perf_event *event)
+static int __armpmu_event_set_period(struct perf_event *event,
+ bool update_user)
{
struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
struct hw_perf_event *hwc = &event->hw;
@@ -136,11 +137,17 @@ int armpmu_event_set_period(struct perf_event *event)
armpmu->write_counter(event, (u64)(-left) & 0xffffffff);
- perf_event_update_userpage(event);
+ if (update_user)
+ perf_event_update_userpage(event);
return ret;
}
+int armpmu_event_set_period(struct perf_event *event)
+{
+ return __armpmu_event_set_period(event, true);
+}
+
u64 armpmu_event_update(struct perf_event *event)
{
struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
@@ -733,12 +740,21 @@ static void cpu_pm_pmu_setup(struct arm_pmu *armpmu, unsigned long cmd)
/*
* Stop and update the counter
*/
- armpmu_stop(event, PERF_EF_UPDATE);
+ armpmu->disable(event);
+ armpmu_event_update(event);
break;
case CPU_PM_EXIT:
case CPU_PM_ENTER_FAILED:
- /* Restore and enable the counter */
- armpmu_start(event, PERF_EF_RELOAD);
+ /*
+ * Restore and enable the counter;
+ * user space updates require RCU locking
+ * that is disallowed when a CPU is in an
+ * idle quiescent state, therefore set
+ * update_user bool flag to false so that the
+ * user space counters update is skipped.
+ */
+ __armpmu_event_set_period(event, false);
+ armpmu->enable(event);
break;
default:
break;
--
2.6.4
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-19 17:08 [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states Lorenzo Pieralisi
@ 2016-04-19 17:23 ` Will Deacon
2016-04-19 21:48 ` Kevin Hilman
2016-04-20 10:48 ` Catalin Marinas
0 siblings, 2 replies; 9+ messages in thread
From: Will Deacon @ 2016-04-19 17:23 UTC (permalink / raw)
To: linux-arm-kernel
Hi Lorenzo,
On Tue, Apr 19, 2016 at 06:08:09PM +0100, Lorenzo Pieralisi wrote:
> Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> added code in the arm perf infrastructure that allows the kernel to
> save/restore perf counters whenever the CPU enters a low-power idle
> state. The kernel saves/restores the counters for each active event
> through the armpmu_{stop/start} ARM pmu API, so that the idle state
> enter/exit power cycle is emulated through pmu start/stop operations
> for each event in use.
>
> However, calling armpmu_start() for each active event on power up
> executes code that requires RCU locking (perf_event_update_userpage())
> to be functional, so, given that the core may call the CPU_PM notifiers
> while running the idle thread in an quiescent RCU state this is not
> allowed as detected through the following splat when kernel is run with
> CONFIG_PROVE_LOCKING enabled:
>
> [ 49.293286]
> [ 49.294761] ===============================
> [ 49.298895] [ INFO: suspicious RCU usage. ]
> [ 49.303031] 4.6.0-rc3+ #421 Not tainted
> [ 49.306821] -------------------------------
> [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
> illegally while idle!
> [ 49.318530]
> [ 49.318530] other info that might help us debug this:
> [ 49.318530]
> [ 49.326451]
> [ 49.326451] RCU used illegally from idle CPU!
> [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0
> [ 49.337209] RCU used illegally from extended quiescent state!
> [ 49.342892] 2 locks held by swapper/2/0:
> [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at:
> [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
> [ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>]
> perf_event_update_userpage+0x0/0x260
>
> This patch refactors the perf CPU_PM notifiers to add a boolean
> flag to the function updating the counters event period, so that the
> userpage update can be skipped when resuming from low-power whilst
> keeping correct save/restore functionality for the running events.
>
> As a side effect the kernel, while resuming from low-power with
> perf events enabled, runs with a userspace view of active counters that
> is not up-to-date with the kernel one, but since the core power down is
> not really a PMU event start/stop this can be considered acceptable and
> the userspace event snapshot will update the user view of counters
> on subsequent perf event updates requested by either the perf API
> or event counters overflow-triggered interrupts.
>
> Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Reported-by: James Morse <james.morse@arm.com>
> Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Kevin Hilman <khilman@baylibre.com>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> ---
> drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)
This is horrible, but I think it's the best we can do without completely
redesigning the way in which we save/restore the PMU state. We should do
that, but not for 4.6!
Acked-by: Will Deacon <will.deacon@arm.com>
I assume Catalin will take this as a fix?
Will
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-19 17:23 ` Will Deacon
@ 2016-04-19 21:48 ` Kevin Hilman
2016-04-20 13:41 ` Lorenzo Pieralisi
2016-04-20 10:48 ` Catalin Marinas
1 sibling, 1 reply; 9+ messages in thread
From: Kevin Hilman @ 2016-04-19 21:48 UTC (permalink / raw)
To: linux-arm-kernel
Will Deacon <will.deacon@arm.com> writes:
> Hi Lorenzo,
>
> On Tue, Apr 19, 2016 at 06:08:09PM +0100, Lorenzo Pieralisi wrote:
>> Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
>> added code in the arm perf infrastructure that allows the kernel to
>> save/restore perf counters whenever the CPU enters a low-power idle
>> state. The kernel saves/restores the counters for each active event
>> through the armpmu_{stop/start} ARM pmu API, so that the idle state
>> enter/exit power cycle is emulated through pmu start/stop operations
>> for each event in use.
>>
>> However, calling armpmu_start() for each active event on power up
>> executes code that requires RCU locking (perf_event_update_userpage())
>> to be functional, so, given that the core may call the CPU_PM notifiers
>> while running the idle thread in an quiescent RCU state this is not
>> allowed as detected through the following splat when kernel is run with
>> CONFIG_PROVE_LOCKING enabled:
>>
>> [ 49.293286]
>> [ 49.294761] ===============================
>> [ 49.298895] [ INFO: suspicious RCU usage. ]
>> [ 49.303031] 4.6.0-rc3+ #421 Not tainted
>> [ 49.306821] -------------------------------
>> [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
>> illegally while idle!
>> [ 49.318530]
>> [ 49.318530] other info that might help us debug this:
>> [ 49.318530]
>> [ 49.326451]
>> [ 49.326451] RCU used illegally from idle CPU!
>> [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0
>> [ 49.337209] RCU used illegally from extended quiescent state!
>> [ 49.342892] 2 locks held by swapper/2/0:
>> [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at:
>> [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
>> [ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>]
>> perf_event_update_userpage+0x0/0x260
>>
>> This patch refactors the perf CPU_PM notifiers to add a boolean
>> flag to the function updating the counters event period, so that the
>> userpage update can be skipped when resuming from low-power whilst
>> keeping correct save/restore functionality for the running events.
>>
>> As a side effect the kernel, while resuming from low-power with
>> perf events enabled, runs with a userspace view of active counters that
>> is not up-to-date with the kernel one, but since the core power down is
>> not really a PMU event start/stop this can be considered acceptable and
>> the userspace event snapshot will update the user view of counters
>> on subsequent perf event updates requested by either the perf API
>> or event counters overflow-triggered interrupts.
>>
>> Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
>> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
>> Reported-by: James Morse <james.morse@arm.com>
>> Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
>> Cc: Will Deacon <will.deacon@arm.com>
>> Cc: Kevin Hilman <khilman@baylibre.com>
>> Cc: Sudeep Holla <sudeep.holla@arm.com>
>> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> ---
>> drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++-----
>> 1 file changed, 21 insertions(+), 5 deletions(-)
>
> This is horrible, but I think it's the best we can do without completely
> redesigning the way in which we save/restore the PMU state. We should do
> that, but not for 4.6!
>
> Acked-by: Will Deacon <will.deacon@arm.com>
Maybe RCU_NONIDLE() will help here?
Kevin
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-19 17:23 ` Will Deacon
2016-04-19 21:48 ` Kevin Hilman
@ 2016-04-20 10:48 ` Catalin Marinas
1 sibling, 0 replies; 9+ messages in thread
From: Catalin Marinas @ 2016-04-20 10:48 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Apr 19, 2016 at 06:23:20PM +0100, Will Deacon wrote:
> On Tue, Apr 19, 2016 at 06:08:09PM +0100, Lorenzo Pieralisi wrote:
> > Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> > added code in the arm perf infrastructure that allows the kernel to
> > save/restore perf counters whenever the CPU enters a low-power idle
> > state. The kernel saves/restores the counters for each active event
> > through the armpmu_{stop/start} ARM pmu API, so that the idle state
> > enter/exit power cycle is emulated through pmu start/stop operations
> > for each event in use.
> >
> > However, calling armpmu_start() for each active event on power up
> > executes code that requires RCU locking (perf_event_update_userpage())
> > to be functional, so, given that the core may call the CPU_PM notifiers
> > while running the idle thread in an quiescent RCU state this is not
> > allowed as detected through the following splat when kernel is run with
> > CONFIG_PROVE_LOCKING enabled:
> >
> > [ 49.293286]
> > [ 49.294761] ===============================
> > [ 49.298895] [ INFO: suspicious RCU usage. ]
> > [ 49.303031] 4.6.0-rc3+ #421 Not tainted
> > [ 49.306821] -------------------------------
> > [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
> > illegally while idle!
> > [ 49.318530]
> > [ 49.318530] other info that might help us debug this:
> > [ 49.318530]
> > [ 49.326451]
> > [ 49.326451] RCU used illegally from idle CPU!
> > [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0
> > [ 49.337209] RCU used illegally from extended quiescent state!
> > [ 49.342892] 2 locks held by swapper/2/0:
> > [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at:
> > [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
> > [ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>]
> > perf_event_update_userpage+0x0/0x260
> >
> > This patch refactors the perf CPU_PM notifiers to add a boolean
> > flag to the function updating the counters event period, so that the
> > userpage update can be skipped when resuming from low-power whilst
> > keeping correct save/restore functionality for the running events.
> >
> > As a side effect the kernel, while resuming from low-power with
> > perf events enabled, runs with a userspace view of active counters that
> > is not up-to-date with the kernel one, but since the core power down is
> > not really a PMU event start/stop this can be considered acceptable and
> > the userspace event snapshot will update the user view of counters
> > on subsequent perf event updates requested by either the perf API
> > or event counters overflow-triggered interrupts.
> >
> > Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > Reported-by: James Morse <james.morse@arm.com>
> > Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: Kevin Hilman <khilman@baylibre.com>
> > Cc: Sudeep Holla <sudeep.holla@arm.com>
> > Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > ---
> > drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++-----
> > 1 file changed, 21 insertions(+), 5 deletions(-)
>
> This is horrible, but I think it's the best we can do without completely
> redesigning the way in which we save/restore the PMU state. We should do
> that, but not for 4.6!
>
> Acked-by: Will Deacon <will.deacon@arm.com>
>
> I assume Catalin will take this as a fix?
Applied. Thanks.
--
Catalin
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-19 21:48 ` Kevin Hilman
@ 2016-04-20 13:41 ` Lorenzo Pieralisi
2016-04-20 15:45 ` Catalin Marinas
2016-04-20 16:23 ` Paul E. McKenney
0 siblings, 2 replies; 9+ messages in thread
From: Lorenzo Pieralisi @ 2016-04-20 13:41 UTC (permalink / raw)
To: linux-arm-kernel
[+ Paul, Nico]
On Tue, Apr 19, 2016 at 02:48:40PM -0700, Kevin Hilman wrote:
> Will Deacon <will.deacon@arm.com> writes:
>
> > Hi Lorenzo,
> >
> > On Tue, Apr 19, 2016 at 06:08:09PM +0100, Lorenzo Pieralisi wrote:
> >> Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> >> added code in the arm perf infrastructure that allows the kernel to
> >> save/restore perf counters whenever the CPU enters a low-power idle
> >> state. The kernel saves/restores the counters for each active event
> >> through the armpmu_{stop/start} ARM pmu API, so that the idle state
> >> enter/exit power cycle is emulated through pmu start/stop operations
> >> for each event in use.
> >>
> >> However, calling armpmu_start() for each active event on power up
> >> executes code that requires RCU locking (perf_event_update_userpage())
> >> to be functional, so, given that the core may call the CPU_PM notifiers
> >> while running the idle thread in an quiescent RCU state this is not
> >> allowed as detected through the following splat when kernel is run with
> >> CONFIG_PROVE_LOCKING enabled:
> >>
> >> [ 49.293286]
> >> [ 49.294761] ===============================
> >> [ 49.298895] [ INFO: suspicious RCU usage. ]
> >> [ 49.303031] 4.6.0-rc3+ #421 Not tainted
> >> [ 49.306821] -------------------------------
> >> [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
> >> illegally while idle!
> >> [ 49.318530]
> >> [ 49.318530] other info that might help us debug this:
> >> [ 49.318530]
> >> [ 49.326451]
> >> [ 49.326451] RCU used illegally from idle CPU!
> >> [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0
> >> [ 49.337209] RCU used illegally from extended quiescent state!
> >> [ 49.342892] 2 locks held by swapper/2/0:
> >> [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at:
> >> [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
> >> [ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>]
> >> perf_event_update_userpage+0x0/0x260
> >>
> >> This patch refactors the perf CPU_PM notifiers to add a boolean
> >> flag to the function updating the counters event period, so that the
> >> userpage update can be skipped when resuming from low-power whilst
> >> keeping correct save/restore functionality for the running events.
> >>
> >> As a side effect the kernel, while resuming from low-power with
> >> perf events enabled, runs with a userspace view of active counters that
> >> is not up-to-date with the kernel one, but since the core power down is
> >> not really a PMU event start/stop this can be considered acceptable and
> >> the userspace event snapshot will update the user view of counters
> >> on subsequent perf event updates requested by either the perf API
> >> or event counters overflow-triggered interrupts.
> >>
> >> Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> >> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> >> Reported-by: James Morse <james.morse@arm.com>
> >> Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> >> Cc: Will Deacon <will.deacon@arm.com>
> >> Cc: Kevin Hilman <khilman@baylibre.com>
> >> Cc: Sudeep Holla <sudeep.holla@arm.com>
> >> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> >> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> >> Cc: Mark Rutland <mark.rutland@arm.com>
> >> ---
> >> drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++-----
> >> 1 file changed, 21 insertions(+), 5 deletions(-)
> >
> > This is horrible, but I think it's the best we can do without completely
> > redesigning the way in which we save/restore the PMU state. We should do
> > that, but not for 4.6!
> >
> > Acked-by: Will Deacon <will.deacon@arm.com>
>
> Maybe RCU_NONIDLE() will help here?
Thanks for chiming in.
CPU_PM notifiers are called from process context (which is not necessarily
the idle thread) with IRQs disabled from:
- CPUidle drivers state enter calls
- syscore callbacks (ie suspend2RAM - suspend thread)
- bL switcher
- MCPM loopback
The questions I have are:
- Is it safe to wrap a call (in this case armpmu_start()) with RCU_NONIDLE
if the core is not actually executing the idle thread ? The function
requiring rcu locks/dereferences is perf_event_update_userpage().
- What are RCU_NONIDLE side-effects (ie what can be actually called from
within an RCU_NONIDLE wrapper ?)
It would be nice if we can use it instead of merging this patch, I need
more insights into RCU_NONIDLE usage though before proceeding.
Thanks a lot,
Lorenzo
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-20 13:41 ` Lorenzo Pieralisi
@ 2016-04-20 15:45 ` Catalin Marinas
2016-04-20 16:23 ` Paul E. McKenney
1 sibling, 0 replies; 9+ messages in thread
From: Catalin Marinas @ 2016-04-20 15:45 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Apr 20, 2016 at 02:41:16PM +0100, Lorenzo Pieralisi wrote:
> On Tue, Apr 19, 2016 at 02:48:40PM -0700, Kevin Hilman wrote:
> > Will Deacon <will.deacon@arm.com> writes:
> > > On Tue, Apr 19, 2016 at 06:08:09PM +0100, Lorenzo Pieralisi wrote:
> > >> Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> > >> added code in the arm perf infrastructure that allows the kernel to
> > >> save/restore perf counters whenever the CPU enters a low-power idle
> > >> state. The kernel saves/restores the counters for each active event
> > >> through the armpmu_{stop/start} ARM pmu API, so that the idle state
> > >> enter/exit power cycle is emulated through pmu start/stop operations
> > >> for each event in use.
> > >>
> > >> However, calling armpmu_start() for each active event on power up
> > >> executes code that requires RCU locking (perf_event_update_userpage())
> > >> to be functional, so, given that the core may call the CPU_PM notifiers
> > >> while running the idle thread in an quiescent RCU state this is not
> > >> allowed as detected through the following splat when kernel is run with
> > >> CONFIG_PROVE_LOCKING enabled:
> > >>
> > >> [ 49.293286]
> > >> [ 49.294761] ===============================
> > >> [ 49.298895] [ INFO: suspicious RCU usage. ]
> > >> [ 49.303031] 4.6.0-rc3+ #421 Not tainted
> > >> [ 49.306821] -------------------------------
> > >> [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
> > >> illegally while idle!
> > >> [ 49.318530]
> > >> [ 49.318530] other info that might help us debug this:
> > >> [ 49.318530]
> > >> [ 49.326451]
> > >> [ 49.326451] RCU used illegally from idle CPU!
> > >> [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0
> > >> [ 49.337209] RCU used illegally from extended quiescent state!
> > >> [ 49.342892] 2 locks held by swapper/2/0:
> > >> [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at:
> > >> [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
> > >> [ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>]
> > >> perf_event_update_userpage+0x0/0x260
> > >>
> > >> This patch refactors the perf CPU_PM notifiers to add a boolean
> > >> flag to the function updating the counters event period, so that the
> > >> userpage update can be skipped when resuming from low-power whilst
> > >> keeping correct save/restore functionality for the running events.
> > >>
> > >> As a side effect the kernel, while resuming from low-power with
> > >> perf events enabled, runs with a userspace view of active counters that
> > >> is not up-to-date with the kernel one, but since the core power down is
> > >> not really a PMU event start/stop this can be considered acceptable and
> > >> the userspace event snapshot will update the user view of counters
> > >> on subsequent perf event updates requested by either the perf API
> > >> or event counters overflow-triggered interrupts.
> > >>
> > >> Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> > >> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > >> Reported-by: James Morse <james.morse@arm.com>
> > >> Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> > >> Cc: Will Deacon <will.deacon@arm.com>
> > >> Cc: Kevin Hilman <khilman@baylibre.com>
> > >> Cc: Sudeep Holla <sudeep.holla@arm.com>
> > >> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > >> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> > >> Cc: Mark Rutland <mark.rutland@arm.com>
> > >> ---
> > >> drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++-----
> > >> 1 file changed, 21 insertions(+), 5 deletions(-)
> > >
> > > This is horrible, but I think it's the best we can do without completely
> > > redesigning the way in which we save/restore the PMU state. We should do
> > > that, but not for 4.6!
> > >
> > > Acked-by: Will Deacon <will.deacon@arm.com>
> >
> > Maybe RCU_NONIDLE() will help here?
>
> Thanks for chiming in.
>
> CPU_PM notifiers are called from process context (which is not necessarily
> the idle thread) with IRQs disabled from:
>
> - CPUidle drivers state enter calls
> - syscore callbacks (ie suspend2RAM - suspend thread)
> - bL switcher
> - MCPM loopback
I can see other places where RCU_NONIDLE is used with interrupts
disabled, e.g. enter_freeze_proper(). The rcu_irq_(enter|exit)_irqson
functions use a save/restore combination for the IRQ flags, so there is
no risk of enabling IRQs inadvertently.
--
Catalin
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-20 13:41 ` Lorenzo Pieralisi
2016-04-20 15:45 ` Catalin Marinas
@ 2016-04-20 16:23 ` Paul E. McKenney
2016-04-20 16:52 ` Lorenzo Pieralisi
1 sibling, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2016-04-20 16:23 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Apr 20, 2016 at 02:41:16PM +0100, Lorenzo Pieralisi wrote:
> [+ Paul, Nico]
>
> On Tue, Apr 19, 2016 at 02:48:40PM -0700, Kevin Hilman wrote:
> > Will Deacon <will.deacon@arm.com> writes:
> >
> > > Hi Lorenzo,
> > >
> > > On Tue, Apr 19, 2016 at 06:08:09PM +0100, Lorenzo Pieralisi wrote:
> > >> Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> > >> added code in the arm perf infrastructure that allows the kernel to
> > >> save/restore perf counters whenever the CPU enters a low-power idle
> > >> state. The kernel saves/restores the counters for each active event
> > >> through the armpmu_{stop/start} ARM pmu API, so that the idle state
> > >> enter/exit power cycle is emulated through pmu start/stop operations
> > >> for each event in use.
> > >>
> > >> However, calling armpmu_start() for each active event on power up
> > >> executes code that requires RCU locking (perf_event_update_userpage())
> > >> to be functional, so, given that the core may call the CPU_PM notifiers
> > >> while running the idle thread in an quiescent RCU state this is not
> > >> allowed as detected through the following splat when kernel is run with
> > >> CONFIG_PROVE_LOCKING enabled:
> > >>
> > >> [ 49.293286]
> > >> [ 49.294761] ===============================
> > >> [ 49.298895] [ INFO: suspicious RCU usage. ]
> > >> [ 49.303031] 4.6.0-rc3+ #421 Not tainted
> > >> [ 49.306821] -------------------------------
> > >> [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used
> > >> illegally while idle!
> > >> [ 49.318530]
> > >> [ 49.318530] other info that might help us debug this:
> > >> [ 49.318530]
> > >> [ 49.326451]
> > >> [ 49.326451] RCU used illegally from idle CPU!
> > >> [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0
> > >> [ 49.337209] RCU used illegally from extended quiescent state!
> > >> [ 49.342892] 2 locks held by swapper/2/0:
> > >> [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at:
> > >> [<ffffff8008163c28>] cpu_pm_exit+0x18/0x80
> > >> [ 49.355492] #1: (rcu_read_lock){......}, at: [<ffffff800816dc38>]
> > >> perf_event_update_userpage+0x0/0x260
> > >>
> > >> This patch refactors the perf CPU_PM notifiers to add a boolean
> > >> flag to the function updating the counters event period, so that the
> > >> userpage update can be skipped when resuming from low-power whilst
> > >> keeping correct save/restore functionality for the running events.
> > >>
> > >> As a side effect the kernel, while resuming from low-power with
> > >> perf events enabled, runs with a userspace view of active counters that
> > >> is not up-to-date with the kernel one, but since the core power down is
> > >> not really a PMU event start/stop this can be considered acceptable and
> > >> the userspace event snapshot will update the user view of counters
> > >> on subsequent perf event updates requested by either the perf API
> > >> or event counters overflow-triggered interrupts.
> > >>
> > >> Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier")
> > >> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > >> Reported-by: James Morse <james.morse@arm.com>
> > >> Cc: Ashwin Chaugule <ashwin.chaugule@linaro.org>
> > >> Cc: Will Deacon <will.deacon@arm.com>
> > >> Cc: Kevin Hilman <khilman@baylibre.com>
> > >> Cc: Sudeep Holla <sudeep.holla@arm.com>
> > >> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> > >> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> > >> Cc: Mark Rutland <mark.rutland@arm.com>
> > >> ---
> > >> drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++-----
> > >> 1 file changed, 21 insertions(+), 5 deletions(-)
> > >
> > > This is horrible, but I think it's the best we can do without completely
> > > redesigning the way in which we save/restore the PMU state. We should do
> > > that, but not for 4.6!
> > >
> > > Acked-by: Will Deacon <will.deacon@arm.com>
> >
> > Maybe RCU_NONIDLE() will help here?
>
> Thanks for chiming in.
>
> CPU_PM notifiers are called from process context (which is not necessarily
> the idle thread) with IRQs disabled from:
>
> - CPUidle drivers state enter calls
> - syscore callbacks (ie suspend2RAM - suspend thread)
> - bL switcher
> - MCPM loopback
>
> The questions I have are:
>
> - Is it safe to wrap a call (in this case armpmu_start()) with RCU_NONIDLE
> if the core is not actually executing the idle thread ? The function
> requiring rcu locks/dereferences is perf_event_update_userpage().
Yes it is.
> - What are RCU_NONIDLE side-effects (ie what can be actually called from
> within an RCU_NONIDLE wrapper ?)
There are a few restrictions:
1. Code within RCU_NONIDLE() cannot block. Then again, neither
can the idle task. ;-)
2. RCU_NONIDLE() can be nested, but not indefinitely. Then again,
given that the limit even on a 32-bit system is something like
a million, I bet you hit compiler or stack-size limits long
before you overflow RCU_NONIDLE()'s counter.
3. You can neither branch into the middle of RCU_NONIDLE()'s code
nor branch out from the middle of RCU_NONIDLE()'s code.
Calling functions is just fine, but things like this are not:
RCU_NONIDLE({
do_something();
goto bad_idea; /* BUG!!! */
do_something_else();});
do_yet_a_third_thing();
bad_idea:
Branching -within- the RCU_NONIDLE() code is just fine.
Yes, and I am adding this information to RCU_NONIDLE()'s header
comment, apologies for its not being there to begin with!
(See below for patches.)
> It would be nice if we can use it instead of merging this patch, I need
> more insights into RCU_NONIDLE usage though before proceeding.
Please let me know if any of the above restrictions cause you a problem.
Thanx, Paul
------------------------------------------------------------------------
diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html
index e7e24b3e86e2..ece410f40436 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.html
+++ b/Documentation/RCU/Design/Requirements/Requirements.html
@@ -2391,6 +2391,41 @@ and <tt>RCU_NONIDLE()</tt> on the other while inspecting
idle-loop code.
Steven Rostedt supplied <tt>_rcuidle</tt> event tracing,
which is used quite heavily in the idle loop.
+However, there are some restrictions on the code placed within
+<tt>RCU_NONIDLE()</tt>:
+
+<ol>
+<li> Blocking is prohibited.
+ In practice, this is not a serious restriction given that idle
+ tasks are prohibited from blocking to begin with.
+<li> Although nesting <tt>RCU_NONIDLE()</tt> is permited, they cannot
+ nest indefinitely deeply.
+ However, given that they can be nested on the order of a million
+ deep, even on 32-bit systems, this should not be a serious
+ restriction.
+ This nesting limit would probably be reached long after the
+ compiler OOMed or the stack overflowed.
+<li> Any code path that enters <tt>RCU_NONIDLE()</tt> must sequence
+ out of that same <tt>RCU_NONIDLE()</tt>.
+ For example, the following is grossly illegal:
+
+ <blockquote>
+ <pre>
+ 1 RCU_NONIDLE({
+ 2 do_something();
+ 3 goto bad_idea; /* BUG!!! */
+ 4 do_something_else();});
+ 5 bad_idea:
+ </pre>
+ </blockquote>
+
+ <p>
+ It is just as illegal to transfer control into the middle of
+ <tt>RCU_NONIDLE()</tt>'s argument.
+ Yes, in theory, you could transfer in as long as you also
+ transferred out, but in practice you could also expect to get sharply
+ worded review comments.
+</ol>
<p>
It is similarly socially unacceptable to interrupt an
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 5f1533e3d032..c61b6b9506e7 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -379,12 +379,13 @@ static inline void rcu_init_nohz(void)
* in the inner idle loop.
*
* This macro provides the way out: RCU_NONIDLE(do_something_with_RCU())
- * will tell RCU that it needs to pay attending, invoke its argument
- * (in this example, a call to the do_something_with_RCU() function),
+ * will tell RCU that it needs to pay attention, invoke its argument
+ * (in this example, calling the do_something_with_RCU() function),
* and then tell RCU to go back to ignoring this CPU. It is permissible
- * to nest RCU_NONIDLE() wrappers, but the nesting level is currently
- * quite limited. If deeper nesting is required, it will be necessary
- * to adjust DYNTICK_TASK_NESTING_VALUE accordingly.
+ * to nest RCU_NONIDLE() wrappers, but not indefinitely (but the limit is
+ * on the order of a million or so, even on 32-bit systems). It is
+ * not legal to block within RCU_NONIDLE(), nor is it permissible to
+ * transfer control either into or out of RCU_NONIDLE()'s statement.
*/
#define RCU_NONIDLE(a) \
do { \
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-20 16:23 ` Paul E. McKenney
@ 2016-04-20 16:52 ` Lorenzo Pieralisi
2016-04-20 17:19 ` Paul E. McKenney
0 siblings, 1 reply; 9+ messages in thread
From: Lorenzo Pieralisi @ 2016-04-20 16:52 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Apr 20, 2016 at 09:23:54AM -0700, Paul E. McKenney wrote:
[...]
> > > Maybe RCU_NONIDLE() will help here?
> >
> > Thanks for chiming in.
> >
> > CPU_PM notifiers are called from process context (which is not necessarily
> > the idle thread) with IRQs disabled from:
> >
> > - CPUidle drivers state enter calls
> > - syscore callbacks (ie suspend2RAM - suspend thread)
> > - bL switcher
> > - MCPM loopback
> >
> > The questions I have are:
> >
> > - Is it safe to wrap a call (in this case armpmu_start()) with RCU_NONIDLE
> > if the core is not actually executing the idle thread ? The function
> > requiring rcu locks/dereferences is perf_event_update_userpage().
>
> Yes it is.
>
> > - What are RCU_NONIDLE side-effects (ie what can be actually called from
> > within an RCU_NONIDLE wrapper ?)
>
> There are a few restrictions:
>
> 1. Code within RCU_NONIDLE() cannot block. Then again, neither
> can the idle task. ;-)
>
> 2. RCU_NONIDLE() can be nested, but not indefinitely. Then again,
> given that the limit even on a 32-bit system is something like
> a million, I bet you hit compiler or stack-size limits long
> before you overflow RCU_NONIDLE()'s counter.
>
> 3. You can neither branch into the middle of RCU_NONIDLE()'s code
> nor branch out from the middle of RCU_NONIDLE()'s code.
> Calling functions is just fine, but things like this are not:
>
> RCU_NONIDLE({
> do_something();
> goto bad_idea; /* BUG!!! */
> do_something_else();});
> do_yet_a_third_thing();
> bad_idea:
>
> Branching -within- the RCU_NONIDLE() code is just fine.
>
> Yes, and I am adding this information to RCU_NONIDLE()'s header
> comment, apologies for its not being there to begin with!
> (See below for patches.)
Thank you for the explanation, that's now clear. For my own understanding:
RCU_NONIDLE() is a way to inform the RCU subsystem that the CPU in question
should be temporarily *watched* (ie it is not idle from an RCU standpoint),
correct ?
> > It would be nice if we can use it instead of merging this patch, I need
> > more insights into RCU_NONIDLE usage though before proceeding.
>
> Please let me know if any of the above restrictions cause you a problem.
I can't think of any, perf_event_update_userpage(), that I will call
indirectly through:
RCU_NONIDLE(armpmu_start());
is not allowed to block anyway, so I think we have a much better
solution than this one, new patch coming, Catalin please drop this one.
Thank you all !
Lorenzo
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states
2016-04-20 16:52 ` Lorenzo Pieralisi
@ 2016-04-20 17:19 ` Paul E. McKenney
0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2016-04-20 17:19 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Apr 20, 2016 at 05:52:05PM +0100, Lorenzo Pieralisi wrote:
> On Wed, Apr 20, 2016 at 09:23:54AM -0700, Paul E. McKenney wrote:
>
> [...]
>
> > > > Maybe RCU_NONIDLE() will help here?
> > >
> > > Thanks for chiming in.
> > >
> > > CPU_PM notifiers are called from process context (which is not necessarily
> > > the idle thread) with IRQs disabled from:
> > >
> > > - CPUidle drivers state enter calls
> > > - syscore callbacks (ie suspend2RAM - suspend thread)
> > > - bL switcher
> > > - MCPM loopback
> > >
> > > The questions I have are:
> > >
> > > - Is it safe to wrap a call (in this case armpmu_start()) with RCU_NONIDLE
> > > if the core is not actually executing the idle thread ? The function
> > > requiring rcu locks/dereferences is perf_event_update_userpage().
> >
> > Yes it is.
> >
> > > - What are RCU_NONIDLE side-effects (ie what can be actually called from
> > > within an RCU_NONIDLE wrapper ?)
> >
> > There are a few restrictions:
> >
> > 1. Code within RCU_NONIDLE() cannot block. Then again, neither
> > can the idle task. ;-)
> >
> > 2. RCU_NONIDLE() can be nested, but not indefinitely. Then again,
> > given that the limit even on a 32-bit system is something like
> > a million, I bet you hit compiler or stack-size limits long
> > before you overflow RCU_NONIDLE()'s counter.
> >
> > 3. You can neither branch into the middle of RCU_NONIDLE()'s code
> > nor branch out from the middle of RCU_NONIDLE()'s code.
> > Calling functions is just fine, but things like this are not:
> >
> > RCU_NONIDLE({
> > do_something();
> > goto bad_idea; /* BUG!!! */
> > do_something_else();});
> > do_yet_a_third_thing();
> > bad_idea:
> >
> > Branching -within- the RCU_NONIDLE() code is just fine.
> >
> > Yes, and I am adding this information to RCU_NONIDLE()'s header
> > comment, apologies for its not being there to begin with!
> > (See below for patches.)
>
> Thank you for the explanation, that's now clear. For my own understanding:
> RCU_NONIDLE() is a way to inform the RCU subsystem that the CPU in question
> should be temporarily *watched* (ie it is not idle from an RCU standpoint),
> correct ?
Exactly!
> > > It would be nice if we can use it instead of merging this patch, I need
> > > more insights into RCU_NONIDLE usage though before proceeding.
> >
> > Please let me know if any of the above restrictions cause you a problem.
>
> I can't think of any, perf_event_update_userpage(), that I will call
> indirectly through:
>
> RCU_NONIDLE(armpmu_start());
>
> is not allowed to block anyway, so I think we have a much better
> solution than this one, new patch coming, Catalin please drop this one.
>
> Thank you all !
Please let me know how it goes!
Thanx, Paul
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-04-20 17:19 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-19 17:08 [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states Lorenzo Pieralisi
2016-04-19 17:23 ` Will Deacon
2016-04-19 21:48 ` Kevin Hilman
2016-04-20 13:41 ` Lorenzo Pieralisi
2016-04-20 15:45 ` Catalin Marinas
2016-04-20 16:23 ` Paul E. McKenney
2016-04-20 16:52 ` Lorenzo Pieralisi
2016-04-20 17:19 ` Paul E. McKenney
2016-04-20 10:48 ` Catalin Marinas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).