From mboxrd@z Thu Jan 1 00:00:00 1970 From: Lina Iyer Subject: Re: [PATCH] PM / Domains: Restore lock-less behaviour for the genpd syscore APIs Date: Tue, 7 Feb 2017 14:52:29 -0700 Message-ID: <20170207215229.GA2564@linaro.org> References: <1486482784-31734-1-git-send-email-ulf.hansson@linaro.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Return-path: Received: from mail-pf0-f171.google.com ([209.85.192.171]:33396 "EHLO mail-pf0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932101AbdBGWBq (ORCPT ); Tue, 7 Feb 2017 17:01:46 -0500 Received: by mail-pf0-f171.google.com with SMTP id y143so36174637pfb.0 for ; Tue, 07 Feb 2017 14:01:46 -0800 (PST) Content-Disposition: inline In-Reply-To: <1486482784-31734-1-git-send-email-ulf.hansson@linaro.org> Sender: linux-pm-owner@vger.kernel.org List-Id: linux-pm@vger.kernel.org To: Ulf Hansson Cc: "Rafael J. Wysocki" , linux-pm@vger.kernel.org, Len Brown , Pavel Machek , Kevin Hilman , Geert Uytterhoeven , Jon Hunter , Marek Szyprowski , Brian Norris Hi Ulf, On Tue, Feb 07 2017 at 08:53 -0700, Ulf Hansson wrote: >The commit ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of >*noirq() callbacks") went too far, as it not only changed to use locks for >the *noirq callbacks, but also for the genpd syscore APIs. > >This cause the following error, reported by Geert Uytterhoeven: > >"This causes the following BUG on all my Renesas arm32 boards, where the >system timer is an IRQ safe device: > >BUG: sleeping function called from invalid context at >kernel/locking/mutex.c:232 >in_atomic(): 0, irqs_disabled(): 128, pid: 1751, name: s2ram >CPU: 0 PID: 1751 Comm: s2ram Not tainted >4.10.0-rc7-koelsch-05643-g27f4c73972a614fe #3354 >Hardware name: Generic R8A7791 (Flattened Device Tree) >[] (unwind_backtrace) from [] (show_stack+0x10/0x14) >[] (show_stack) from [] (dump_stack+0x7c/0x9c) >[] (dump_stack) from [] (___might_sleep+0x124/0x160) >[] (___might_sleep) from [] (mutex_lock+0x18/0x60) >[] (mutex_lock) from [] >(genpd_syscore_switch+0x2c/0x7c) >[] (genpd_syscore_switch) from [] >(sh_cmt_clock_event_suspend+0x18/0x28) >[] (sh_cmt_clock_event_suspend) from [] >(clockevents_suspend+0x40/0x54) >[] (clockevents_suspend) from [] >(timekeeping_suspend+0x23c/0x278) >[] (timekeeping_suspend) from [] >(syscore_suspend+0x88/0x138) >[] (syscore_suspend) from [] >(suspend_devices_and_enter+0x290/0x470) >[] (suspend_devices_and_enter) from [] >(pm_suspend+0x228/0x280) >[] (pm_suspend) from [] (state_store+0xac/0xcc) >[] (state_store) from [] (kernfs_fop_write+0x160/0x19c) >[] (kernfs_fop_write) from [] (__vfs_write+0x20/0x108) >[] (__vfs_write) from [] (vfs_write+0xb8/0x144) >[] (vfs_write) from [] (SyS_write+0x40/0x80) >[] (SyS_write) from [] (ret_fast_syscall+0x0/0x34)" > >To fix this problem, restore the lock-less behaviour. However only for the >genpd syscore APIs. > >Reported-by: Geert Uytterhoeven >Fixes: ef4f7e2c8335 ("PM / Domains: Fix asynchronous execution of *noirq() callbacks") >Signed-off-by: Ulf Hansson >--- > drivers/base/power/domain.c | 48 +++++++++++++++++++++++++-------------------- > 1 file changed, 27 insertions(+), 21 deletions(-) > >diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c >index 271e208..3a75fb1 100644 >--- a/drivers/base/power/domain.c >+++ b/drivers/base/power/domain.c >@@ -729,16 +729,17 @@ static bool genpd_dev_active_wakeup(struct generic_pm_domain *genpd, > /** > * genpd_sync_power_off - Synchronously power off a PM domain and its masters. > * @genpd: PM domain to power off, if possible. >+ * @use_lock: use the lock. I am not too fond of this. It would rather be much cleaner if you could overload the genpd_lock/unlock() calls to make this genpd lockless. A flag for the genpd, perhaps that could turn the lock functions into a NOP? -- Lina > * @depth: nesting count for lockdep. > * > * Check if the given PM domain can be powered off (during system suspend or > * hibernation) and do that if so. Also, in that case propagate to its masters. > * > * This function is only called in "noirq" and "syscore" stages of system power >- * transitions, but since the "noirq" callbacks may be executed asynchronously, >- * the lock must be held. >+ * transitions. The "noirq" callbacks may be executed asynchronously, thus in >+ * these cases the lock must be held. > */ >-static void genpd_sync_power_off(struct generic_pm_domain *genpd, >+static void genpd_sync_power_off(struct generic_pm_domain *genpd, bool use_lock, > unsigned int depth) > { > struct gpd_link *link; >@@ -759,22 +760,27 @@ static void genpd_sync_power_off(struct generic_pm_domain *genpd, > list_for_each_entry(link, &genpd->slave_links, slave_node) { > genpd_sd_counter_dec(link->master); > >- genpd_lock_nested(link->master, depth + 1); >- genpd_sync_power_off(link->master, depth + 1); >- genpd_unlock(link->master); >+ if (use_lock) >+ genpd_lock_nested(link->master, depth + 1); >+ >+ genpd_sync_power_off(link->master, use_lock, depth + 1); >+ >+ if (use_lock) >+ genpd_unlock(link->master); > } > } > > /** > * genpd_sync_power_on - Synchronously power on a PM domain and its masters. > * @genpd: PM domain to power on. >+ * @use_lock: use the lock. > * @depth: nesting count for lockdep. > * > * This function is only called in "noirq" and "syscore" stages of system power >- * transitions, but since the "noirq" callbacks may be executed asynchronously, >- * the lock must be held. >+ * transitions. The "noirq" callbacks may be executed asynchronously, thus in >+ * these cases the lock must be held. > */ >-static void genpd_sync_power_on(struct generic_pm_domain *genpd, >+static void genpd_sync_power_on(struct generic_pm_domain *genpd, bool use_lock, > unsigned int depth) > { > struct gpd_link *link; >@@ -785,9 +791,13 @@ static void genpd_sync_power_on(struct generic_pm_domain *genpd, > list_for_each_entry(link, &genpd->slave_links, slave_node) { > genpd_sd_counter_inc(link->master); > >- genpd_lock_nested(link->master, depth + 1); >- genpd_sync_power_on(link->master, depth + 1); >- genpd_unlock(link->master); >+ if (use_lock) >+ genpd_lock_nested(link->master, depth + 1); >+ >+ genpd_sync_power_on(link->master, use_lock, depth + 1); >+ >+ if (use_lock) >+ genpd_unlock(link->master); > } > > _genpd_power_on(genpd, false); >@@ -898,7 +908,7 @@ static int pm_genpd_suspend_noirq(struct device *dev) > > genpd_lock(genpd); > genpd->suspended_count++; >- genpd_sync_power_off(genpd, 0); >+ genpd_sync_power_off(genpd, true, 0); > genpd_unlock(genpd); > > return 0; >@@ -925,7 +935,7 @@ static int pm_genpd_resume_noirq(struct device *dev) > return 0; > > genpd_lock(genpd); >- genpd_sync_power_on(genpd, 0); >+ genpd_sync_power_on(genpd, true, 0); > genpd->suspended_count--; > genpd_unlock(genpd); > >@@ -1016,7 +1026,7 @@ static int pm_genpd_restore_noirq(struct device *dev) > */ > genpd->status = GPD_STATE_POWER_OFF; > >- genpd_sync_power_on(genpd, 0); >+ genpd_sync_power_on(genpd, true, 0); > genpd_unlock(genpd); > > if (genpd->dev_ops.stop && genpd->dev_ops.start) >@@ -1070,17 +1080,13 @@ static void genpd_syscore_switch(struct device *dev, bool suspend) > if (!pm_genpd_present(genpd)) > return; > >- genpd_lock(genpd); >- > if (suspend) { > genpd->suspended_count++; >- genpd_sync_power_off(genpd, 0); >+ genpd_sync_power_off(genpd, false, 0); > } else { >- genpd_sync_power_on(genpd, 0); >+ genpd_sync_power_on(genpd, false, 0); > genpd->suspended_count--; > } >- >- genpd_unlock(genpd); > } > > void pm_genpd_syscore_poweroff(struct device *dev) >-- >1.9.1 >