* Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption
[not found] <CAA-nRyoO_NmrD8aRa3OyVXz4b133Fxi0=biGXTzSPDuSuNJJxQ@mail.gmail.com>
@ 2014-07-05 18:10 ` Russell King - ARM Linux
[not found] ` <CAA-nRyp5pPC2qZs7RhAXJdXaK_k4fKNZuq=yRTSTPLQsgPxqxQ@mail.gmail.com>
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2014-07-05 18:10 UTC (permalink / raw)
To: pawandeep oza
Cc: akpm, holt, mingo, sboyd, k.khlebnikov, u.kleine-koenig,
nicolas.pitre, linux-arm-kernel, linux-kernel
On Sat, Jul 05, 2014 at 10:44:41PM +0530, pawandeep oza wrote:
> process calls sys_reboot and that process then stops other CPUs while those
> CPUs are within a spin_lock() region we can potentially encounter a
> deadlock
> scenario like below.
>
> CPU 0 CPU 1
> ----- -----
> spin_lock(my_lock)
> smp_send_stop()
> <send IPI> handle_IPI()
> disable_preemption/irqs
> while(1);
> <PREEMPT>
> spin_lock(my_lock) <--- Waits forever
Please explain how that <PREEMPT> occurs with IRQs already disabled.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption
[not found] ` <CAA-nRyp5pPC2qZs7RhAXJdXaK_k4fKNZuq=yRTSTPLQsgPxqxQ@mail.gmail.com>
@ 2014-07-05 19:01 ` Russell King - ARM Linux
[not found] ` <CAA-nRyo2Ts3o_B1jMS8vvLJanf13u=u2WyoqnL8bCF5svOOCew@mail.gmail.com>
2014-07-05 19:25 ` Russell King - ARM Linux
1 sibling, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2014-07-05 19:01 UTC (permalink / raw)
To: pawandeep oza
Cc: akpm, holt, mingo, sboyd, k.khlebnikov, u.kleine-koenig,
Nicolas Pitre, linux-arm-kernel, linux-kernel
On Sun, Jul 06, 2014 at 12:20:03AM +0530, pawandeep oza wrote:
> Hi,
>
> I am referring to this version of spin lock apis.
>
> static inline void __raw_spin_lock(raw_spinlock_t *lock)
> {
> preempt_disable();
> spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
> }
>
>
> static inline void __raw_spin_unlock(raw_spinlock_t *lock)
> {
> spin_release(&lock->dep_map, 1, _RET_IP_);
> do_raw_spin_unlock(lock);
> preempt_enable();
> }
>
> poweroff path runs with irqs disabled, but what is some one (valid
> scenerio) try to have spin_lock and spin_unlock for its own reasons.
>
> spin_unlock doesn preempt_enable at the end...
> which in turn does following.
>
> #define preempt_enable() \
> do { \
> preempt_enable_no_resched(); \
> barrier(); \
> preempt_check_resched(); \
> } while (0)
>
>
> preempt_check_resched would check TIF_NEED_RESCHED
> #define preempt_check_resched() \
> do { \
> if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
> preempt_schedule(); \
> } while (0)
>
> there is a chance that just beofre we disabled irqs, somebody would have
> marked the flag to current, and
> later on, it might happen that, current gets replaced by the process which
> tries to hold a spin_lock which has already been previosuly held by CPU1
> when
> was being plugged out by smp_send_stop.
This seems to be a generic code bug - if interrupts are disabled (they
are) then we should not schedule at all.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption
[not found] ` <CAA-nRyp5pPC2qZs7RhAXJdXaK_k4fKNZuq=yRTSTPLQsgPxqxQ@mail.gmail.com>
2014-07-05 19:01 ` Russell King - ARM Linux
@ 2014-07-05 19:25 ` Russell King - ARM Linux
1 sibling, 0 replies; 7+ messages in thread
From: Russell King - ARM Linux @ 2014-07-05 19:25 UTC (permalink / raw)
To: pawandeep oza
Cc: akpm, holt, mingo, sboyd, k.khlebnikov, u.kleine-koenig,
Nicolas Pitre, linux-arm-kernel, linux-kernel
On Sun, Jul 06, 2014 at 12:20:03AM +0530, pawandeep oza wrote:
> preempt_check_resched would check TIF_NEED_RESCHED
> #define preempt_check_resched() \
> do { \
> if (unlikely(test_thread_flag(TIF_NEED_RESCHED))) \
> preempt_schedule(); \
> } while (0)
>
> there is a chance that just beofre we disabled irqs, somebody would have
> marked the flag to current, and
> later on, it might happen that, current gets replaced by the process which
> tries to hold a spin_lock which has already been previosuly held by CPU1
> when
> was being plugged out by smp_send_stop.
And preempt_schedule() contains:
/*
* If there is a non-zero preempt_count or interrupts are disabled,
* we do not want to preempt the current task. Just return..
*/
if (likely(!preemptible()))
return;
where preemptible() is defined as:
(preempt_count() == 0 && !irqs_disabled())
which means... if interrupts are disabled (as they are) we don't
return from preempt_schedule() without doing anything.
Scheduling with interrupts disabled is a bug. If you need to add
preempt_disable() before local_irq_disable() to prevent it, then
there's a bug somewhere else, and we don't "fix" that by adding
preempt_disable(). The real bug needs to be found and fixed.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption
[not found] ` <CAA-nRyo2Ts3o_B1jMS8vvLJanf13u=u2WyoqnL8bCF5svOOCew@mail.gmail.com>
@ 2014-07-05 19:26 ` Russell King - ARM Linux
[not found] ` <CAA-nRyqHXCBCzKcK83UnS64f_wKcWObzui2xamt=ayVDKX9zEA@mail.gmail.com>
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2014-07-05 19:26 UTC (permalink / raw)
To: pawandeep oza
Cc: akpm, holt, mingo, sboyd, k.khlebnikov, u.kleine-koenig,
Nicolas Pitre, linux-arm-kernel, linux-kernel
On Sun, Jul 06, 2014 at 12:34:14AM +0530, pawandeep oza wrote:
> no this is not a generic code bug, because only in this scenerio, problem
> might happen.
> because other core is plugged out and now there is no chance that it can
> release the spin_lock it would have held previously.
>
> so this core must run with preemption and interrupt disabled.
>
> this is only applicable for shutdown, not for any other scenerio.
No. See my reply I just sent. Preempting when local interrupts
disabled is a bug.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption
[not found] ` <CAA-nRyo3T+JLpfE6M260Ws+Ddu4z3TS2fQy5sv_dETMTcYFscQ@mail.gmail.com>
@ 2014-07-06 15:35 ` Russell King - ARM Linux
[not found] ` <CAA-nRyppv-LruJK0NySQ_VGORP1+M1w++=k5rqeiC15Wc6Ks9g@mail.gmail.com>
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2014-07-06 15:35 UTC (permalink / raw)
To: pawandeep oza
Cc: akpm, holt, mingo, sboyd, k.khlebnikov, u.kleine-koenig,
Nicolas Pitre, linux-arm-kernel, linux-kernel
On Sun, Jul 06, 2014 at 08:20:36PM +0530, pawandeep oza wrote:
> just to add to this,
> this scenerio happened on the platform which I am working on,
> where interrupts were being re-enabled in rpm_callback path...because the
Can you explain how you get to the runtime callback paths from the
system reboot/shutdown handlers?
In any case, calling the plain _irq (rather than _irqsave) variants of
spinlocks when IRQs are already disabled is (another) bug. The plain
_irq variants are not permitted when IRQs have been previously
disabled.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption
[not found] ` <CAA-nRyppv-LruJK0NySQ_VGORP1+M1w++=k5rqeiC15Wc6Ks9g@mail.gmail.com>
@ 2014-07-06 17:28 ` Russell King - ARM Linux
[not found] ` <CAA-nRyqaOoXKs9OfaLGObDAG3=Gey+7UoOYdjLUfv45kprqdFg@mail.gmail.com>
0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux @ 2014-07-06 17:28 UTC (permalink / raw)
To: pawandeep oza, Simon Horman, Magnus Damm
Cc: akpm, mingo, sboyd, k.khlebnikov, u.kleine-koenig, Nicolas Pitre,
linux-arm-kernel, linux-kernel, linux-sh
On Sun, Jul 06, 2014 at 09:22:56PM +0530, pawandeep oza wrote:
> this is the stack trace I was analyzing...please find it below.
>
> #0 [<c072df48>] (__schedule) from [<c072e3a0>]
>
> #1 [<c072e3a0>] (preempt_schedule) from [<c072f244>]
> #2 [<c072f244>] (_raw_spin_unlock_irq) from [<c0394468>]
> #3 [<c0394468>] (__rpm_callback) from [<c03944b0>]
> #4 [<c03944b0>] (rpm_callback) from [<c03957bc>]
>
> #5 [<c03957bc>] (rpm_resume) from [<c0395d18>]
> #6 [<c0395d18>] (__pm_runtime_resume) from [<c04675b4>]
> #7 [<c04675b4>] (sh_mobile_i2c_xfer) from [<c0460390>]
> #8 [<c0460390>] (__i2c_transfer) from [<c0462094>]
>
> #9 [<c0462094>] (i2c_transfer) from [<c03b2738>]
> #10 [<c03b2738>] (d2153_i2c_write_device) from [<c03b08b0>]
> #11 [<c03b08b0>] (d2153_write) from [<c03b1574>]
> #12 [<c03b1574>] (d2153_reg_write) from [<c03b2490>]
>
> #13 [<c03b2490>] (d2153_system_poweroff) from [<c0011278>]
> #14 [<c0011278>] (machine_power_off) from [<c0047d58>]
> #15 [<c0047d58>] (kernel_power_off) from [<c0048420>]
> #16 [<c0048420>] (sys_reboot) from [<c0010240>]
Thanks.
Right, so from the above, we can see that d2153_system_poweroff() is
hooked into pm_power_off hook, which is called with IRQs disabled.
So, the context which d2153_system_poweroff() is called from is
_fundamentally_ one where scheduling and preemption are _not_
permitted.
d2153_system_poweroff calls into the I2C code, which then calls the
shmobile I2C driver. The shmobile I2C driver then calls into the
PM runtime stuff, which ends up re-enabling interrupts. This is the
first bug with the code, and is the one that you're hitting. However,
this is not the only bug.
The second bug here is that i2c_transfer() is permitted to sleep,
waiting for the transaction to complete. As I've already stated, and
as I've said above, scheduling is not allowed with interrupts disabled.
What this means is that calling i2c_transfer() beneath pm_power_off()
is not permitted.
However, there's more problems here: what if one of the secondary CPUs
which received the IPI was in the middle of using that very same I2C
bus. This _can_ deadlock no matter how you try and sort out that
preemption and/or scheduling problem - because the I2C bus may already
be in use but the CPU which was using it has been halted off via a STOP
IPI, preventing it from ever completing the transaction.
So, the above code path has at least _three_ potentially very serious
bugs in it:
1. calling i2c_transfer() beneath pm_power_off() due to scheduling
possibility
2. calling i2c_transfer() beneath pm_power_off() due to deadlock on
bus access
3. calling the runtime PM callbacks beneath pm_power_off() which result
in interrupts being unexpectedly re-enabled.
I suspect that part of the problem here is that this "d2153" thing is a
PMIC, and you need to write to it via I2C to tell it to turn the power
off.
Some of these problems can be mitigated:
(2) can be worked around - the I2C driver receives a shutdown callback
when the system is being rebooted or powered off. The I2C driver can
ensure that the bus is idle (if not, wait for the bus to become idle)
before _safely_ disabling the I2C interface against further transactions.
(3) can also be solved by the mechanism given above - the shutdown
callback /can/ safely call the runtime PM helpers there, and, therefore,
can lock the I2C interfaces into an active state for the shutdown or
reboot operation.
That means that when we hit pm_power_off(), we can be _sure_ that, firstly,
no one else is using the I2C bus, and secondly, it is not in a runtime
suspended state.
However, that also means that /we/ can't use i2c_transfer() either - and
in fact, that's a /good/ thing because i2c_transfer() may sleep on us,
causing bug (1).
So what we need is an out-of-band method for this - we don't have that
though. This is a problem which would need to be solved to the I2C
maintainers satisfaction to make this kind of thing operate reliably.
PS, it would've been nice had the email addresses not had typos. I
finally remembered to fix them and delete the one which bounces. I've
also added the shmobile people.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption
[not found] ` <CAA-nRyqaOoXKs9OfaLGObDAG3=Gey+7UoOYdjLUfv45kprqdFg@mail.gmail.com>
@ 2014-07-06 18:42 ` Russell King - ARM Linux
0 siblings, 0 replies; 7+ messages in thread
From: Russell King - ARM Linux @ 2014-07-06 18:42 UTC (permalink / raw)
To: pawandeep oza
Cc: Simon Horman, Magnus Damm, akpm, mingo, sboyd, k.khlebnikov,
u.kleine-koenig, Nicolas Pitre, linux-arm-kernel, linux-kernel,
linux-sh
On Sun, Jul 06, 2014 at 11:59:31PM +0530, pawandeep oza wrote:
> Hi,
>
> thanks for your elaborate inputs.
>
> So, the above code path has at least _three_ potentially very serious
> bugs in it:
>
> 1. calling i2c_transfer() beneath pm_power_off() due to scheduling
> possibility
>
> oza: this I have already fixed in the code by making i2c in poll mode in
> shutdown case.
> so it doesnt do schedule_timeout(.....) and it doesnt sleep anymore.
> so bug 1 is no more there.
> I have a patch, Can I submit that ?
>
> 2. calling i2c_transfer() beneath pm_power_off() due to deadlock on
> bus access
>
> 3. calling the runtime PM callbacks beneath pm_power_off() which result
> in interrupts being unexpectedly re-enabled.
>
> if I understood right, you are suggesting to register driver shutdown
> method ?
> where syscore_shutdown will call them in order to solve 2 and 3 ?
> if thats the case, I believe it is too early to trigger poweroff
> at syscore_shutdown, because kernel_power_off calls it very early.
You didn't understand right.
Firstly, there's no need to get involved with the syscore stuff at all.
All drivers in the system are notified of a clean system shutdown via
a function in their driver structure.
For example, struct platform_driver has a "shutdown" function pointer
in it. This is called fairly early in the shutdown sequence where it
is still permitted to sleep, where preemption is possible, interrupts
are still enabled etc.
This is where you need to ensure that (2) and (3) are solved in the
I2C driver.
What I'm *not* saying is to use the driver shutdown function pointer
as a replacement for pm_power_off(), as it would be too early, before
the system has finished quiescing. You still need to do stuff in
pm_power_off().
What you need to do is to _prepare_ the system for pm_power_off() so
that it is in a sane state where you _can_ safely access the I2C bus.
> what I would still think is: even if we make the things right in driver,
> still the solution would not be generic until make changes in uppermost
> layer which is machine_power_off.
No changes are needed there.
> if we cut the preemption from there, the things will be taken care at the
> highest generic kernel layer rather then individual driver taking care of
> the same.
Please, get this idea of disabling preemption out of you head. It is
a /complete/ fallicy that the bug(s) you are seeing are caused by it.
It's merely a symptom of an incorrectly thought out process.
The steps required to power off may have been correctly identified,
but what has been totally missed by whoever started hitting that
keyboard is to think about the _context_ which the code gets run.
The answer is *not* to change the context in which the code is run.
As I've already said, disabling preemption does NOT fix the problems.
It papers over the problem you have found, but leaves other problems.
Fix code properly. Never paper over problems. Papering over problems
is not an acceptable bug fixing strategy.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-07-06 18:43 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <CAA-nRyoO_NmrD8aRa3OyVXz4b133Fxi0=biGXTzSPDuSuNJJxQ@mail.gmail.com>
2014-07-05 18:10 ` [PATCH] machine_power_off: not only local_irq_disable but also do disable preemption Russell King - ARM Linux
[not found] ` <CAA-nRyp5pPC2qZs7RhAXJdXaK_k4fKNZuq=yRTSTPLQsgPxqxQ@mail.gmail.com>
2014-07-05 19:01 ` Russell King - ARM Linux
[not found] ` <CAA-nRyo2Ts3o_B1jMS8vvLJanf13u=u2WyoqnL8bCF5svOOCew@mail.gmail.com>
2014-07-05 19:26 ` Russell King - ARM Linux
[not found] ` <CAA-nRyqHXCBCzKcK83UnS64f_wKcWObzui2xamt=ayVDKX9zEA@mail.gmail.com>
[not found] ` <CAA-nRyo3T+JLpfE6M260Ws+Ddu4z3TS2fQy5sv_dETMTcYFscQ@mail.gmail.com>
2014-07-06 15:35 ` Russell King - ARM Linux
[not found] ` <CAA-nRyppv-LruJK0NySQ_VGORP1+M1w++=k5rqeiC15Wc6Ks9g@mail.gmail.com>
2014-07-06 17:28 ` Russell King - ARM Linux
[not found] ` <CAA-nRyqaOoXKs9OfaLGObDAG3=Gey+7UoOYdjLUfv45kprqdFg@mail.gmail.com>
2014-07-06 18:42 ` Russell King - ARM Linux
2014-07-05 19:25 ` Russell King - ARM Linux
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox