From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jonathan Richardson Subject: Re: [PATCH] pwm: Avoid double mutex lock on pwm_enable Date: Mon, 23 Nov 2015 12:09:03 -0800 Message-ID: <5653725F.3070209@broadcom.com> References: <1448038740-2811-1-git-send-email-linux.amoon@gmail.com> <564FF287.8000201@samsung.com> <20151123100742.GE31868@ulmo.nvidia.com> Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit Return-path: Received: from mail-gw3-out.broadcom.com ([216.31.210.64]:44304 "EHLO mail-gw3-out.broadcom.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752910AbbKWUJK (ORCPT ); Mon, 23 Nov 2015 15:09:10 -0500 In-Reply-To: <20151123100742.GE31868@ulmo.nvidia.com> Sender: linux-pwm-owner@vger.kernel.org List-Id: linux-pwm@vger.kernel.org To: Thierry Reding Cc: Krzysztof Kozlowski , Anand Moon , Guenter Roeck , Javier Martinez Canillas , linux-pwm@vger.kernel.org, Linux Kernel , "linux-samsung-soc@vger.kernel.org" On 15-11-23 02:07 AM, Thierry Reding wrote: > On Sun, Nov 22, 2015 at 09:13:17AM +0900, Krzysztof Kozlowski wrote: >> 2015-11-22 3:14 GMT+09:00 Anand Moon : >>> Hi Krzysztof, >>> >>> On 21 November 2015 at 18:37, Krzysztof Kozlowski >>> wrote: >>>> 2015-11-21 21:11 GMT+09:00 Anand Moon : >>>>> hi Krzysztof, >>>>> >>>>> On 21 November 2015 at 15:22, Krzysztof Kozlowski >>>>> wrote: >>>>>> 2015-11-21 18:40 GMT+09:00 Anand Moon : >>>>>>> hi Krzysztof, >>>>>>> >>>>>>> On 21 November 2015 at 09:56, Krzysztof Kozlowski >>>>>>> wrote: >>>>>>>> >>>>>>>> W dniu 21.11.2015 o 01:59, Anand Moon pisze: >>>>>>>>> Commit d1cd21427747f15920cd726f5f67a07880e7dee4 >>>>>>>>> (pwm: Set enable state properly on failed call to enable) >>>>>>>>> introduce double lock of mutex on pwm_enable >>>>>>>>> >>>>>>>>> Changes fix the following debug lock warning. >>>>>>>>> >>>>>>>>> [ 2.701720] WARNING: CPU: 3 PID: 0 at kernel/locking/mutex.c:526 >>>>>>>>> __mutex_lock_slowpath+0x3bc/0x404() >>>>>>>>> [ 2.701731] DEBUG_LOCKS_WARN_ON(in_interrupt()) >>>>>>>> >>>>>>>> Hi Anand! >>>>>>>> >>>>>>>> Yeah, I am hitting this as well. Annoying. However your description is >>>>>>>> inaccurate. Double lock of mutex does not cause warnings of sleeping or >>>>>>>> locking in atomic context (if you build with debug sleep atomic you will >>>>>>>> see different warning). More over you are partially reverting the commit >>>>>>>> d1cd21427747 ("pwm: Set enable state properly on failed call to enable") >>>>>>>> without proper explanation: >>>>>>>> 1. why this locking is inappropriate; >>>>>>>> 2. why it is safe to remove the mutex_lock(). >>>>>>>> >>>>>>>> Please find the real problem, fix the real problem and throughly explain >>>>>>>> the real issue. >>>>>>>> >>>>>>>> I was suspecting some weird changes in timers. For !irqsafe timer I >>>>>>>> think it is safe to use mutexes... but apparently not. Maybe a work >>>>>>>> should be scheduled from function called by timer? >>>>>>>> >>>>>>>> Warning with debug atomic sleep: >>>>>>>> [ 16.047517] BUG: sleeping function called from invalid context at >>>>>>>> ../kernel/locking/mutex.c:617 >>>>>>>> [ 16.054866] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0 >>>>>>>> [ 16.061402] INFO: lockdep is turned off. >>>>>>>> [ 16.065281] Preemption disabled at:[< (null)>] (null) >>>>>>>> [ 16.070524] >>>>>>>> [ 16.072002] CPU: 0 PID: 0 Comm: swapper/0 Not tainted >>>>>>>> 4.4.0-rc1-00040-g28c429565d4f #290 >>>>>>>> [ 16.080150] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >>>>>>>> [ 16.086215] [] (unwind_backtrace) from [] >>>>>>>> (show_stack+0x10/0x14) >>>>>>>> [ 16.093938] [] (show_stack) from [] >>>>>>>> (dump_stack+0x70/0xbc) >>>>>>>> [ 16.101122] [] (dump_stack) from [] >>>>>>>> (mutex_lock_nested+0x24/0x474) >>>>>>>> [ 16.109009] [] (mutex_lock_nested) from [] >>>>>>>> (pwm_enable+0x20/0x7c) >>>>>>>> [ 16.116799] [] (pwm_enable) from [] >>>>>>>> (led_heartbeat_function+0xdc/0x140) >>>>>>>> [ 16.125119] [] (led_heartbeat_function) from [] >>>>>>>> (call_timer_fn+0x6c/0xf4) >>>>>>>> [ 16.133606] [] (call_timer_fn) from [] >>>>>>>> (run_timer_softirq+0x1a8/0x230) >>>>>>>> [ 16.141846] [] (run_timer_softirq) from [] >>>>>>>> (__do_softirq+0x134/0x2c0) >>>>>>>> [ 16.149982] [] (__do_softirq) from [] >>>>>>>> (irq_exit+0xd0/0x114) >>>>>>>> [ 16.157255] [] (irq_exit) from [] >>>>>>>> (__handle_domain_irq+0x70/0xe4) >>>>>>>> [ 16.165056] [] (__handle_domain_irq) from [] >>>>>>>> (gic_handle_irq+0x4c/0x94) >>>>>>>> [ 16.173376] [] (gic_handle_irq) from [] >>>>>>>> (__irq_svc+0x58/0x98) >>>>>>>> [ 16.180817] Exception stack(0xc08cdf58 to 0xc08cdfa0) >>>>>>>> [ 16.185823] df40: >>>>>>>> c0010dcc 00000000 >>>>>>>> [ 16.193997] df60: c08cdfa8 00000000 c05f57c4 00000000 c08ca520 >>>>>>>> c08ce4bc c08c7364 c08ce4b4 >>>>>>>> [ 16.202141] df80: c09121ca 00000000 00000001 c08cdfa8 c0010dcc >>>>>>>> c0010dd0 600f0013 ffffffff >>>>>>>> [ 16.210291] [] (__irq_svc) from [] >>>>>>>> (arch_cpu_idle+0x20/0x3c) >>>>>>>> [ 16.217661] [] (arch_cpu_idle) from [] >>>>>>>> (cpu_startup_entry+0x17c/0x26c) >>>>>>>> [ 16.225899] [] (cpu_startup_entry) from [] >>>>>>>> (start_kernel+0x368/0x3d0) >>>>>>>> >>>>>>>> Best regards, >>>>>>>> Krzysztof >>>>>>>> >>>>>>>> >>>>>>>>> [ 2.701737] Modules linked in: >>>>>>>>> [ 2.701748] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.4.0-rc1-xu4f >>>>>>>>> #24 >>>>>>>>> [ 2.701753] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) >>>>>>>>> [ 2.701787] [] (unwind_backtrace) from [] >>>>>>>>> (show_stack+0x10/0x14) >>>>>>>>> [ 2.701808] [] (show_stack) from [] >>>>>>>>> (dump_stack+0x84/0xc4) >>>>>>>>> [ 2.701824] [] (dump_stack) from [] >>>>>>>>> (warn_slowpath_common+0x80/0xb0) >>>>>>>>> [ 2.701836] [] (warn_slowpath_common) from [] >>>>>>>>> (warn_slowpath_fmt+0x30/0x40) >>>>>>>>> [ 2.701849] [] (warn_slowpath_fmt) from [] >>>>>>>>> (__mutex_lock_slowpath+0x3bc/0x404) >>>>>>>>> [ 2.701864] [] (__mutex_lock_slowpath) from [] >>>>>>>>> (mutex_lock+0xc/0x24) >>>>>>>>> [ 2.701884] [] (mutex_lock) from [] >>>>>>>>> (pwm_enable+0x20/0x7c) >>>>>>>>> [ 2.701903] [] (pwm_enable) from [] >>>>>>>>> (led_heartbeat_function+0x74/0x144) >>>>>>>>> [ 2.701919] [] (led_heartbeat_function) from [] >>>>>>>>> (call_timer_fn+0x24/0x98) >>>>>>>>> [ 2.701932] [] (call_timer_fn) from [] >>>>>>>>> (run_timer_softirq+0x160/0x21c) >>>>>>>>> [ 2.701946] [] (run_timer_softirq) from [] >>>>>>>>> (__do_softirq+0x110/0x228) >>>>>>>>> [ 2.701959] [] (__do_softirq) from [] >>>>>>>>> (irq_exit+0xc0/0xfc) >>>>>>>>> [ 2.701976] [] (irq_exit) from [] >>>>>>>>> (__handle_domain_irq+0x80/0xec) >>>>>>>>> [ 2.701990] [] (__handle_domain_irq) from [] >>>>>>>>> (gic_handle_irq+0x54/0x94) >>>>>>>>> [ 2.702001] [] (gic_handle_irq) from [] >>>>>>>>> (__irq_svc+0x54/0x90) >>>>>>>>> [ 2.702008] Exception stack(0xee8bdf88 to 0xee8bdfd0) >>>>>>>>> [ 2.702019] df80: 00000001 00000000 00000000 >>>>>>>>> c001b720 ee8bc000 c0573354 >>>>>>>>> [ 2.702031] dfa0: 00000000 00000000 ee8bdfe0 c07f92e4 c08004b4 >>>>>>>>> c08004bc f0806640 ee8bdfd8 >>>>>>>>> [ 2.702039] dfc0: c0010180 c0010184 60000013 ffffffff >>>>>>>>> [ 2.702053] [] (__irq_svc) from [] >>>>>>>>> (arch_cpu_idle+0x38/0x3c) >>>>>>>>> [ 2.702073] [] (arch_cpu_idle) from [] >>>>>>>>> (cpu_startup_entry+0x1ec/0x270) >>>>>>>>> [ 2.702086] [] (cpu_startup_entry) from [<4000956c>] >>>>>>>>> (0x4000956c) >>>>>>>>> [ 2.702093] ---[ end trace 539af70491f4f1a9 ]--- >>>>>>>>> >>>>>>>>> Signed-off-by: Anand Moon >>>>>>>>> --- >>>>>>>>> drivers/pwm/core.c | 4 ---- >>>>>>>>> 1 file changed, 4 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c >>>>>>>>> index d24ca5f..b8f035a 100644 >>>>>>>>> --- a/drivers/pwm/core.c >>>>>>>>> +++ b/drivers/pwm/core.c >>>>>>>>> @@ -506,16 +506,12 @@ int pwm_enable(struct pwm_device *pwm) >>>>>>>>> if (!pwm) >>>>>>>>> return -EINVAL; >>>>>>>>> >>>>>>>>> - mutex_lock(&pwm->lock); >>>>>>>>> - >>>>>>>>> if (!test_and_set_bit(PWMF_ENABLED, &pwm->flags)) { >>>>>>>>> err = pwm->chip->ops->enable(pwm->chip, pwm); >>>>>>>>> if (err) >>>>>>>>> clear_bit(PWMF_ENABLED, &pwm->flags); >>>>>>>>> } >>>>>>>>> >>>>>>>>> - mutex_unlock(&pwm->lock); >>>>>>>>> - >>>>>>>>> return err; >>>>>>>>> } >>>>>>>>> EXPORT_SYMBOL_GPL(pwm_enable); >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> Adding Jonathan Richardson. >>>>>>> >>>>>>> Yes I am >>>>>>> >>>>>>> aware I am reverting some part of the d1cd21427747 ("pwm: Set enable state >>>>>>> properly on failed call to enable") >>>>>>> >>>>>>> Please take a look at this below commit. >>>>>>> >>>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/pwm/core.c?id=d1cd21427747f15920cd726f5f67a07880e7dee4 >>>>>>> >>>>>>> Actually reverting this change it work fine. >>>>>>> >>>>>>> This changes introduce the new mutex lock pwm->lock to protect enabled bit >>>>>>> by drivers while setting polarity. >>>>>>> >>>>>>> Well pwm_set_polarity already acquires the pwm->lock and calls >>>>>>> pwm_is_enabled function. >>>>>>> Again within pwm_is_enabled we are trying to acquire the same mutex lock. >>>>>> >>>>>> You are describing a lockdown by trying to acquire the same mutex twice. >>>>>> >>>>>> However pwm_is_enabled() does not acquire mutex. >>>>>> >>>>>> Again, please look at generated warnings: >>>>>> 1. BUG: sleeping function called from invalid context >>>>>> 2. DEBUG_LOCKS_WARN_ON(in_interrupt()) >>>>>> >>>>>> They are not related anyhow to what you described (acquiring already >>>>>> locked mutex). >>>>>> >>>>>> Best regards, >>>>>> Krzysztof >>>>> >>>>> My last reply mail went in HTML format so resend this. >>>>> >>>>> First it was a typo on my part. >>>>> It not pwm_is_enabled function its pwm_enabled. >>>> >>>> There is no such function as "pwm_enabled". >>>> >>>> Sorry, I don't get your point. >>>> >>>> Instead of pasting some commit use a descriptive way to show the calls >>>> leading lockdown. But please use real function names. >>>> >>>> Best regards, >>>> Krzysztof >>> >>> Earlier my assumption of double mutex lock up totally rubbish. >>> >>> After reverting my changes and building image with CONFIG_DEBUG_ATOMIC_SLEEP. >>> >>> [ 390.415370] BUG: sleeping function called from invalid context at >>> kernel/locking/mutex.c:97 >>> [ 390.422274] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1 >>> [ 390.428831] Preemption disabled at:[< (null)>] (null) >>> [ 391.970352] BUG: sleeping function called from invalid context at >>> kernel/locking/mutex.c:97 >>> [ 391.977251] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1 >>> [ 391.983814] Preemption disabled at:[< (null)>] (null) >>> [ 393.520376] BUG: sleeping function called from invalid context at >>> kernel/locking/mutex.c:97 >>> [ 393.527312] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1 >>> [ 393.533925] Preemption disabled at:[< (null)>] (null) >> >> Yes, now you pasted the same warning I did... >> >> This is still the same issue. I already wrote it: >>> 1. BUG: sleeping function called from invalid context >>> 2. DEBUG_LOCKS_WARN_ON(in_interrupt()) >> >> We can repeat it many times but that won't change anything... > > This looks like you're simply running the leds-pwm driver with a PWM > that isn't properly marked as potentially sleeping. Unfortunately the > introduction of the mutex in d1cd21427747 ("pwm: Set enable state > properly on failed call to enable") effectively makes all PWM drivers > potentially sleeping. That in turn makes the .can_sleep field obsolete > since all drivers can now sleep. > > Any objections to simply removing it and make all users use a workqueue > or some such if they need to control a PWM as a result of an interrupt > trigger? > > Thierry > No objections here. The function is somewhat new and I didn't notice it. It should have been removed in my last patch set. Thanks, Jon