From mboxrd@z Thu Jan 1 00:00:00 1970 From: Guenter Roeck Subject: Re: [PATCH v2 01/47] kernel: Add support for poweroff handler call chain Date: Wed, 22 Oct 2014 06:22:58 -0700 Message-ID: <5447AFB2.4010701@roeck-us.net> References: <1413864783-3271-1-git-send-email-linux@roeck-us.net> <1413864783-3271-2-git-send-email-linux@roeck-us.net> <54460140.50805@gmail.com> <54465FBA.8070007@roeck-us.net> <544764B1.9020303@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE Return-path: In-Reply-To: <544764B1.9020303@gmail.com> Sender: linux-kernel-owner@vger.kernel.org To: =?UTF-8?B?UGhpbGlwcGUgUsOpdG9ybmF6?= , linux-kernel@vger.kernel.org Cc: linux-pm@vger.kernel.org, Alan Cox , Alexander Graf , Andrew Morton , Geert Uytterhoeven , Heiko Stuebner , Lee Jones , Len Brown , Pavel Machek , "Rafael J. Wysocki" , Romain Perier List-Id: linux-pm@vger.kernel.org On 10/22/2014 01:02 AM, Philippe R=C3=A9tornaz wrote: > Le 21/10/2014 15:29, Guenter Roeck a =C3=A9crit : >> On 10/20/2014 11:46 PM, Philippe R=C3=A9tornaz wrote: >>> Hello >>> >>> [...] >>>> - Use raw notifiers protected by spinlocks instead of atomic notif= iers >>> [...] >>> >>>> +/** >>>> + * do_kernel_power_off - Execute kernel poweroff handler call = chain >>>> + * >>>> + * Calls functions registered with register_power_off_handler. >>>> + * >>>> + * Expected to be called from machine_power_off as last step o= f >>>> + * the poweroff sequence. >>>> + * >>>> + * Powers off the system immediately if a poweroff handler fun= ction >>>> + * has been registered. Otherwise does nothing. >>>> + */ >>>> +void do_kernel_power_off(void) >>>> +{ >>>> + spin_lock(&power_off_handler_lock); >>>> + raw_notifier_call_chain(&power_off_handler_list, 0, NULL); >>>> + spin_unlock(&power_off_handler_lock); >>>> +} >>> >>> I don't get it. You are still in atomic context inside the poweroff >>> callback >>> since you lock it with a spinlock. > [...] >>> >>> Why not using the blocking_notifier_* family ? >>> It will lock with a read-write semaphore under which you can sleep. >>> >>> For instance, twl4030_power_off will sleep, since it is doing I2C a= ccess. >>> So you cannot call it in atomic context. >>> >> >> Learning something new all the time. I assumed that spin_lock (unlik= e >> spin_lock_irqsafe) would not run in atomic context. >> >> I did not want to use a sleeping lock because I am not sure if that >> works for all architectures; some disable (local) interrupts before >> calling the function (eg arm and arm64), and I don't want to change >> that semantics. > > You're right and it even disable all others CPUs (if any). > I don't understand why it needs to disable local interrupts since the > code path to pm_power_off is simply doing: > > syscall -> migrate to reboot cpu -> disable local interrupt -> disabl= e others cpu -> pm_power_off(). > > I don't understand why we cannot re-enable interrupts right before pm= _power_off(). > And it looks like that all pm_power_off callbacks which can sleep are= broken. > I still wonder how i2c communication can works without local interrup= ts ... it looks > like somebody is re-enabling them (or the code was never run) > Good question. Or the code was never run under arm, or the drivers have= a polling fallback if interrupts are disabled. >> I have another idea how to get there without changing the lock situa= tion >> while executing the call chain - just set a flag indicating that it = is >> running and execute it without lock. Would that work ? > > I don't think inventing a new locking mechanism is a good solution. > We need first to know for sure if we can sleep or not in pm_power_off= =2E > If yes then we need to re-enable local interrupts and we can use a mu= tex. > If not then the atomic notifier is fine and a lots of drivers are wro= ng. > I had another look into the kernel, checking all the callers of machine= _power_off(). In some cases, the function is called directly from interrupt handlers,= meaning there is no guarantee that interrupts are enabled to start with when it= is called. So there is more to fix if the semantics of pm_power_off requires inter= rupts to be enabled. My current solution is to use a spinlock with disabled interrupts durin= g poweroff handler registration and unregistration, and no lock at all wh= ile executing the call chain. That is not perfect, but it retains the curre= nt situation and thus guarantees that existing code works as before. This disconnects the sleep vs. no sleep problem from the current patch serie= s, and we can solve that problem separately. Thanks, Guenter