* Deadlock in ia64_mca_cmc_int_caller
@ 2003-12-06 4:16 Keith Owens
2003-12-06 15:23 ` Alex Williamson
2003-12-06 22:50 ` Keith Owens
0 siblings, 2 replies; 3+ messages in thread
From: Keith Owens @ 2003-12-06 4:16 UTC (permalink / raw)
To: linux-ia64
ia64_mca_cmc_int_caller() calls smp_call_function(), which waits until
all cpus have taken the IPI before returning.  This interacts badly
with locks that are sometimes taken with interrupts disabled and
sometimes with interrupts enabled; smp_call_function() can deadlock.

cpu 3                                    cpu 0
-----                                    -----
Holds tasklist_lock with interrupts
enabled, it did read_lock() or
write_lock().
                                         Does read_lock_irq() or
                                         write_lock_irq().  Spinning
                                         disabled, waiting for
                                         tasklist_lock.
CMC interrupt occurs.

ia64_mca_cmc_int_caller() calls
smp_call_function().

smp_call_function() sends IPI to
the other cpus.
                                         IPI on cpu 0 is blocked, it is
                                         disabled, waiting for
                                         tasklist_lock.
smp_call_function() waits until the
IPI reaches all other cpus.

cpu 0 never responds, cpu 3 never releases tasklist_lock: deadlock.
AFAICT it is never safe to call smp_call_function() from an interrupt
handler.
The unsafe nature of smp_call_function is not ia64 specific. ix86 can
deadlock this way if any ix86 code calls smp_call_function from an
interrupt handler.
* Re: Deadlock in ia64_mca_cmc_int_caller
2003-12-06 4:16 Deadlock in ia64_mca_cmc_int_caller Keith Owens
@ 2003-12-06 15:23 ` Alex Williamson
2003-12-06 22:50 ` Keith Owens
1 sibling, 0 replies; 3+ messages in thread
From: Alex Williamson @ 2003-12-06 15:23 UTC (permalink / raw)
To: linux-ia64
Keith,
We debugged a similar problem with the old CMC/CPE code recently.
However, the latest version in 2.4/2.6 fixed that problem. So are you
actually hitting a deadlock when ia64_mca_cmc_int_caller() calls
smp_call_function(ia64_mca_cmc_vector_enable, NULL, 1, 0)?  I've
reached the same conclusion about smp_call_function: using it there
was my mistake in the first place, and it's far too dangerous.  We
need to enable/disable the CMC vector in a better context or use
another mechanism.
Alex
On Fri, 2003-12-05 at 21:16, Keith Owens wrote:
> ia64_mca_cmc_int_caller() calls smp_call_function(), which waits until
> all cpus have taken the IPI before returning.  This interacts badly
> with locks that are sometimes taken with interrupts disabled and
> sometimes with interrupts enabled; smp_call_function() can deadlock.
>
> cpu 3                                    cpu 0
> -----                                    -----
> Holds tasklist_lock with interrupts
> enabled, it did read_lock() or
> write_lock().
>                                          Does read_lock_irq() or
>                                          write_lock_irq().  Spinning
>                                          disabled, waiting for
>                                          tasklist_lock.
> CMC interrupt occurs.
>
> ia64_mca_cmc_int_caller() calls
> smp_call_function().
>
> smp_call_function() sends IPI to
> the other cpus.
>                                          IPI on cpu 0 is blocked, it is
>                                          disabled, waiting for
>                                          tasklist_lock.
> smp_call_function() waits until the
> IPI reaches all other cpus.
>
> cpu 0 never responds, cpu 3 never releases tasklist_lock: deadlock.
>
> AFAICT it is never safe to call smp_call_function() from an interrupt
> handler.
>
> The unsafe nature of smp_call_function is not ia64 specific. ix86 can
> deadlock this way if any ix86 code calls smp_call_function from an
> interrupt handler.
>
* Re: Deadlock in ia64_mca_cmc_int_caller
2003-12-06 4:16 Deadlock in ia64_mca_cmc_int_caller Keith Owens
2003-12-06 15:23 ` Alex Williamson
@ 2003-12-06 22:50 ` Keith Owens
1 sibling, 0 replies; 3+ messages in thread
From: Keith Owens @ 2003-12-06 22:50 UTC (permalink / raw)
To: linux-ia64
On Sat, 06 Dec 2003 08:23:50 -0700,
Alex Williamson <alex.williamson@hp.com> wrote:
> We debugged a similar problem with the old CMC/CPE code recently.
> However, the latest version in 2.4/2.6 fixed that problem. So are you
> actually hitting a deadlock when ia64_mca_cmc_int_caller() calls
> smp_call_function(ia64_mca_cmc_vector_enable, NULL, 1, 0)?
Yes, at the point where smp_call_function() is spinning on

	while (atomic_read(&data.started) != cpus)

The cpus that were not responding were spinning, disabled, waiting for
tasklist_lock; the assumption is that tasklist_lock was held by the
cpu that called smp_call_function().
> I've reached
> the same conclusion about smp_call_function, my mistake for using it in
> the first place, it's way too dangerous.
Using smp_call_function() in any interrupt context is unsafe; we
should add a badness check to smp_call_function() for that state.  I
think that bh context is bad as well, but I need to confirm that.  Of
course it is not interrupt/bh context per se that is bad, but the
interaction of those contexts with spinlocks that are sometimes taken
enabled and sometimes disabled, combined with synchronizing across
cpus.
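One possible shape for that badness check (a sketch, not a tested
patch -- in_interrupt() on kernels of this era is true in both hardirq
and bh/softirq context, so it covers both cases at once):

```c
	/* Sketch only: at the top of smp_call_function(), refuse to
	 * run from interrupt or bottom-half context, where the
	 * rendezvous with the other cpus can deadlock. */
	if (in_interrupt())
		BUG();
```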
> We need to enable/disable the
> CMC vector in a better context or use another mechanism.
Since the only safe time to use smp_call_function() is with no
spinlocks held on the current cpu, that restricts us to a user context
thread.  Create a kernel thread called smp_call_nowait that waits on a
semaphore which CMC/CPE does up() on.  Use a list of
kmalloc(GFP_ATOMIC) structures containing

	list_head
	void (*func) (void *info)
	void *info
	char info_data[variable]
When smp_call_nowait wakes up, it takes the first entry off the list,
calls smp_call_function with wait=1 then kfrees the list entry. The
'_nowait' part of the thread name indicates that the original caller
does not wait for the smp function to take effect.
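A rough sketch of that design, against 2.4/2.6-era kernel APIs.  This
is untested and all names other than the standard kernel primitives
are invented here:

```c
/* Sketch only -- untested. */
struct smp_call_entry {
	struct list_head list;
	void (*func)(void *info);
	void *info;
	char info_data[0];		/* variable-sized copy of *info */
};

static LIST_HEAD(smp_call_nowait_list);
static spinlock_t smp_call_nowait_lock = SPIN_LOCK_UNLOCKED;
static DECLARE_MUTEX_LOCKED(smp_call_nowait_sem);

/* Interrupt context (e.g. the CMC handler) queues a request and wakes
 * the thread; the caller does not wait for the function to run. */
void smp_call_nowait(void (*func)(void *), void *info, int info_size)
{
	struct smp_call_entry *e;
	unsigned long flags;

	e = kmalloc(sizeof(*e) + info_size, GFP_ATOMIC);
	if (!e)
		return;			/* callers must tolerate a lost request */
	e->func = func;
	if (info_size) {
		memcpy(e->info_data, info, info_size);
		e->info = e->info_data;
	} else
		e->info = info;
	spin_lock_irqsave(&smp_call_nowait_lock, flags);
	list_add_tail(&e->list, &smp_call_nowait_list);
	spin_unlock_irqrestore(&smp_call_nowait_lock, flags);
	up(&smp_call_nowait_sem);
}

/* Kernel thread: user context with no spinlocks held, so calling
 * smp_call_function() with wait=1 is safe here. */
static int smp_call_nowait_thread(void *unused)
{
	for (;;) {
		struct smp_call_entry *e;
		unsigned long flags;

		down(&smp_call_nowait_sem);
		spin_lock_irqsave(&smp_call_nowait_lock, flags);
		e = list_entry(smp_call_nowait_list.next,
			       struct smp_call_entry, list);
		list_del(&e->list);
		spin_unlock_irqrestore(&smp_call_nowait_lock, flags);
		smp_call_function(e->func, e->info, 1, 1);
		kfree(e);
	}
	return 0;
}
```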
I will code this up on Monday.