From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marek Szyprowski Subject: Re: [PATCH] power: genpd: fix lockdep issue for all subdomains Date: Wed, 20 Jan 2016 07:40:45 +0100 Message-ID: <569F2BED.3060504@samsung.com> References: <1451903982-1598-1-git-send-email-m.szyprowski@samsung.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Return-path: Received: from mailout4.w1.samsung.com ([210.118.77.14]:33754 "EHLO mailout4.w1.samsung.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750861AbcATGku (ORCPT ); Wed, 20 Jan 2016 01:40:50 -0500 In-reply-to: Sender: linux-pm-owner@vger.kernel.org List-Id: linux-pm@vger.kernel.org To: Ulf Hansson Cc: linux-samsung-soc , "linux-kernel@vger.kernel.org" , Kevin Hilman , "Rafael J. Wysocki" , Anand Moon , Bartlomiej Zolnierkiewicz , "linux-pm@vger.kernel.org" , Daniel Kurtz , Nicolas Boichat Hello, On 2016-01-19 16:25, Ulf Hansson wrote: > +linux-pm, Daniel Kurtz, Nicolas Boichat > > On 4 January 2016 at 11:39, Marek Szyprowski wrote: >> During genpd_poweron, genpd->lock is acquired recursively for each >> parent (master) domain, which are separate obejcts. This confuses >> lockdep, which considers every operation on genpd->lock as being done on >> the same lock class. This leads to the following false positive warning: >> >> ============================================= >> [ INFO: possible recursive locking detected ] >> 4.4.0-rc4-xu3s #32 Not tainted >> --------------------------------------------- >> swapper/0/1 is trying to acquire lock: >> (&genpd->lock){+.+...}, at: [] __genpd_poweron+0x64/0x108 >> >> but task is already holding lock: >> (&genpd->lock){+.+...}, at: [] genpd_dev_pm_attach+0x168/0x1b8 >> >> other info that might help us debug this: >> Possible unsafe locking scenario: >> >> CPU0 >> ---- >> lock(&genpd->lock); >> lock(&genpd->lock); >> >> *** DEADLOCK *** >> >> May be due to missing lock nesting notation >> >> 3 locks held by swapper/0/1: >> #0: (&dev->mutex){......}, at: [] __driver_attach+0x48/0x98 >> #1: (&dev->mutex){......}, at: [] __driver_attach+0x58/0x98 >> #2: (&genpd->lock){+.+...}, at: [] genpd_dev_pm_attach+0x168/0x1b8 >> >> stack backtrace: >> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc4-xu3s #32 >> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >> [] (unwind_backtrace) from [] (show_stack+0x10/0x14) >> [] (show_stack) from [] (dump_stack+0x84/0xc4) >> [] (dump_stack) from [] (__lock_acquire+0x1f88/0x215c) >> [] (__lock_acquire) from [] (lock_acquire+0xa4/0xd0) >> [] (lock_acquire) from [] (mutex_lock_nested+0x70/0x4d4) >> [] (mutex_lock_nested) from [] (__genpd_poweron+0x64/0x108) >> [] (__genpd_poweron) from [] (genpd_dev_pm_attach+0x170/0x1b8) >> [] (genpd_dev_pm_attach) from [] (platform_drv_probe+0x2c/0xac) >> [] (platform_drv_probe) from [] (driver_probe_device+0x208/0x2fc) >> [] (driver_probe_device) from [] (__driver_attach+0x94/0x98) >> [] (__driver_attach) from [] (bus_for_each_dev+0x68/0x9c) >> [] (bus_for_each_dev) from [] (bus_add_driver+0x1a0/0x218) >> [] (bus_add_driver) from [] (driver_register+0x78/0xf8) >> [] (driver_register) from [] (exynos_drm_register_drivers+0x28/0x74) >> [] (exynos_drm_register_drivers) from [] (exynos_drm_init+0x6c/0xc4) >> [] (exynos_drm_init) from [] (do_one_initcall+0x90/0x1dc) >> [] (do_one_initcall) from [] (kernel_init_freeable+0x158/0x1f8) >> [] (kernel_init_freeable) from [] (kernel_init+0x8/0xe8) >> [] (kernel_init) from [] (ret_from_fork+0x14/0x24) >> >> This patch replaces mutex_lock with mutex_lock_nested() and uses >> recursion depth to annotate each genpd->lock operation with separate >> lockdep subclass. >> >> Reported-by: Anand Moon >> Signed-off-by: Marek Szyprowski > Thanks for looking into this! > > Daniel Kurtz, did also run into this issue [1] and reported it a while ago. > There where some discussions and Daniel also posted a patch trying to > solve the issue [2]. That approach didn't work out, and unfortunate I > haven't yet been able to look closer into the issue. Sorry about that! > >> --- >> drivers/base/power/domain.c | 21 +++++++++++++-------- >> 1 file changed, 13 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c >> index b803790..e02ddf6 100644 >> --- a/drivers/base/power/domain.c >> +++ b/drivers/base/power/domain.c >> @@ -170,16 +170,15 @@ static void genpd_queue_power_off_work(struct generic_pm_domain *genpd) >> queue_work(pm_wq, &genpd->power_off_work); >> } >> >> -static int genpd_poweron(struct generic_pm_domain *genpd); >> - >> /** >> * __genpd_poweron - Restore power to a given PM domain and its masters. >> * @genpd: PM domain to power up. >> + * @depth: nesting count for lockdep. >> * >> * Restore power to @genpd and all of its masters so that it is possible to >> * resume a device belonging to it. >> */ >> -static int __genpd_poweron(struct generic_pm_domain *genpd) >> +static int __genpd_poweron(struct generic_pm_domain *genpd, unsigned int depth) >> { >> struct gpd_link *link; >> int ret = 0; >> @@ -194,11 +193,16 @@ static int __genpd_poweron(struct generic_pm_domain *genpd) >> * with it. >> */ >> list_for_each_entry(link, &genpd->slave_links, slave_node) { >> - genpd_sd_counter_inc(link->master); >> + struct generic_pm_domain *master = link->master; >> + >> + genpd_sd_counter_inc(master); >> + >> + mutex_lock_nested(&master->lock, depth + 1); > Nested locks isn't a solution to a problem, but instead this tricks > lockdep about what goes on. Right? The only difference between mutex_lock_nested and mutex_lock is the way it is interpreted by deplock. The additional argument is deplock subclass of the lock. The name of this function is imho a bit misleading. > I am wondering whether there's another option available which better > can instruct lockdep to not treat this as an error. Similar solution is already applied in regulators and i2c cores, see regulator_lock_supply() and i2c_del_adapter() functions. >> + ret = __genpd_poweron(master, depth + 1); >> + mutex_unlock(&master->lock); >> >> - ret = genpd_poweron(link->master); >> if (ret) { >> - genpd_sd_counter_dec(link->master); >> + genpd_sd_counter_dec(master); >> goto err; >> } >> } >> @@ -230,11 +234,12 @@ static int genpd_poweron(struct generic_pm_domain *genpd) >> int ret; >> >> mutex_lock(&genpd->lock); >> - ret = __genpd_poweron(genpd); >> + ret = __genpd_poweron(genpd, 0); >> mutex_unlock(&genpd->lock); >> return ret; >> } >> >> + >> static int genpd_save_dev(struct generic_pm_domain *genpd, struct device *dev) >> { >> return GENPD_DEV_CALLBACK(genpd, int, save_state, dev); >> @@ -482,7 +487,7 @@ static int pm_genpd_runtime_resume(struct device *dev) >> } >> >> mutex_lock(&genpd->lock); >> - ret = __genpd_poweron(genpd); >> + ret = __genpd_poweron(genpd, 0); >> mutex_unlock(&genpd->lock); >> >> if (ret) >> -- >> 1.9.2 >> > Kind regards > Uffe > > [1] > http://www.spinics.net/lists/arm-kernel/msg471025.html > [2] > http://www.spinics.net/lists/stable/msg113650.html > > Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland