From mboxrd@z Thu Jan 1 00:00:00 1970 From: lorenzo.pieralisi@arm.com (Lorenzo Pieralisi) Date: Wed, 20 Apr 2016 14:41:16 +0100 Subject: [PATCH] drivers/perf: arm-pmu: fix RCU usage on resume from idle states In-Reply-To: <7hr3e1b77r.fsf@baylibre.com> References: <1461085689-2893-1-git-send-email-lorenzo.pieralisi@arm.com> <20160419172320.GD4509@arm.com> <7hr3e1b77r.fsf@baylibre.com> Message-ID: <20160420134116.GA8537@red-moon> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org [+ Paul, Nico] On Tue, Apr 19, 2016 at 02:48:40PM -0700, Kevin Hilman wrote: > Will Deacon writes: > > > Hi Lorenzo, > > > > On Tue, Apr 19, 2016 at 06:08:09PM +0100, Lorenzo Pieralisi wrote: > >> Commit da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier") > >> added code in the arm perf infrastructure that allows the kernel to > >> save/restore perf counters whenever the CPU enters a low-power idle > >> state. The kernel saves/restores the counters for each active event > >> through the armpmu_{stop/start} ARM pmu API, so that the idle state > >> enter/exit power cycle is emulated through pmu start/stop operations > >> for each event in use. > >> > >> However, calling armpmu_start() for each active event on power up > >> executes code that requires RCU locking (perf_event_update_userpage()) > >> to be functional, so, given that the core may call the CPU_PM notifiers > >> while running the idle thread in an quiescent RCU state this is not > >> allowed as detected through the following splat when kernel is run with > >> CONFIG_PROVE_LOCKING enabled: > >> > >> [ 49.293286] > >> [ 49.294761] =============================== > >> [ 49.298895] [ INFO: suspicious RCU usage. ] > >> [ 49.303031] 4.6.0-rc3+ #421 Not tainted > >> [ 49.306821] ------------------------------- > >> [ 49.310956] include/linux/rcupdate.h:872 rcu_read_lock() used > >> illegally while idle! > >> [ 49.318530] > >> [ 49.318530] other info that might help us debug this: > >> [ 49.318530] > >> [ 49.326451] > >> [ 49.326451] RCU used illegally from idle CPU! > >> [ 49.326451] rcu_scheduler_active = 1, debug_locks = 0 > >> [ 49.337209] RCU used illegally from extended quiescent state! > >> [ 49.342892] 2 locks held by swapper/2/0: > >> [ 49.346768] #0: (cpu_pm_notifier_lock){......}, at: > >> [] cpu_pm_exit+0x18/0x80 > >> [ 49.355492] #1: (rcu_read_lock){......}, at: [] > >> perf_event_update_userpage+0x0/0x260 > >> > >> This patch refactors the perf CPU_PM notifiers to add a boolean > >> flag to the function updating the counters event period, so that the > >> userpage update can be skipped when resuming from low-power whilst > >> keeping correct save/restore functionality for the running events. > >> > >> As a side effect the kernel, while resuming from low-power with > >> perf events enabled, runs with a userspace view of active counters that > >> is not up-to-date with the kernel one, but since the core power down is > >> not really a PMU event start/stop this can be considered acceptable and > >> the userspace event snapshot will update the user view of counters > >> on subsequent perf event updates requested by either the perf API > >> or event counters overflow-triggered interrupts. > >> > >> Fixes: da4e4f18afe0 ("drivers/perf: arm_pmu: implement CPU_PM notifier") > >> Signed-off-by: Lorenzo Pieralisi > >> Reported-by: James Morse > >> Cc: Ashwin Chaugule > >> Cc: Will Deacon > >> Cc: Kevin Hilman > >> Cc: Sudeep Holla > >> Cc: Daniel Lezcano > >> Cc: Mathieu Poirier > >> Cc: Mark Rutland > >> --- > >> drivers/perf/arm_pmu.c | 26 +++++++++++++++++++++----- > >> 1 file changed, 21 insertions(+), 5 deletions(-) > > > > This is horrible, but I think it's the best we can do without completely > > redesigning the way in which we save/restore the PMU state. We should do > > that, but not for 4.6! > > > > Acked-by: Will Deacon > > Maybe RCU_NONIDLE() will help here? Thanks for chiming in. CPU_PM notifiers are called from process context (which is not necessarily the idle thread) with IRQs disabled from: - CPUidle drivers state enter calls - syscore callbacks (ie suspend2RAM - suspend thread) - bL switcher - MCPM loopback The questions I have are: - Is it safe to wrap a call (in this case armpmu_start()) with RCU_NONIDLE if the core is not actually executing the idle thread ? The function requiring rcu locks/dereferences is perf_event_update_userpage(). - What are RCU_NONIDLE side-effects (ie what can be actually called from within an RCU_NONIDLE wrapper ?) It would be nice if we can use it instead of merging this patch, I need more insights into RCU_NONIDLE usage though before proceeding. Thanks a lot, Lorenzo