* Kernel gets deadlocked during smp booting
From: AYAN KUMAR HALDER @ 2015-06-16 11:24 UTC
To: kernelnewbies
Hi All,
I am booting kernel 4.0.4 on an ARM-based custom SoC with SMP enabled.
I see that the kernel gets deadlocked on a spinlock. To be precise:
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
    ...
    // loops in the following while loop indefinitely
    while (lockval.tickets.next != lockval.tickets.owner) {
        wfe();
        lockval.tickets.owner = ACCESS_ONCE(lock->tickets.owner);
    }
}
Can someone explain what exactly this loop does?
Any pointers on how to debug such issues would also help.
The backtrace shows the following:
arch_spin_lock
_raw_spin_lock
vprintk_emit ----> raw_spin_lock(&logbuf_lock);
vprintk_default
printk
smp_init
My guess is that the deadlock is on logbuf_lock, but I am unable to
proceed further.
Thanks and Regards,
Ayan kumar Halder
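For readers following the thread: the quoted loop is the waiting half of a ticket spinlock. Each locker takes the next ticket number and then spins until the "owner" counter reaches that ticket. The sketch below is a simplified user-space model using C11 atomics, not the kernel's ARM implementation. The names ticket_lock, ticket_lock_acquire and so on are hypothetical, and the busy-wait stands in for the wfe() call in the quoted code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model of a ticket lock; not the kernel's arch_spinlock_t. */
struct ticket_lock {
    atomic_ushort next;   /* next ticket to hand out */
    atomic_ushort owner;  /* ticket that currently holds the lock */
};

static void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* The lock is held (or contended) whenever tickets are outstanding. */
static int ticket_lock_is_held(struct ticket_lock *l)
{
    return atomic_load(&l->next) != atomic_load(&l->owner);
}

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Take a ticket; fetch_add returns the value before the increment. */
    unsigned short me = atomic_fetch_add(&l->next, 1);

    /* Spin until our ticket is being served. The quoted kernel loop
     * sleeps in wfe() between re-reads instead of busy-waiting. */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ;
}

static void ticket_lock_release(struct ticket_lock *l)
{
    assert(ticket_lock_is_held(l));
    /* Serve the next ticket; this is what lets a waiter leave its loop. */
    atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

In this model, a waiter spins forever if the current holder never calls ticket_lock_release(), because the owner counter never advances. That matches the symptom in the backtrace: some other context took logbuf_lock and never released it.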
* Kernel gets deadlocked during smp booting
From: AYAN KUMAR HALDER @ 2015-06-17 6:39 UTC
To: kernelnewbies
> Could you post the full backtrace (straight from the kernel log)?
The backtrace was obtained from a debugger, not from the kernel log
(the kernel is deadlocked, so nothing more is logged). It is as follows:
arch_spin_lock
_raw_spin_lock
vprintk_emit
vprintk_default
printk
smp_init
do_initcall_level
do_initcalls
do_basic_setup
kernel_init_freeable
kernel_init
ret_from_fork
* Kernel gets deadlocked during smp booting
From: Josh Cartwright @ 2015-06-17 15:25 UTC
To: kernelnewbies
On Wed, Jun 17, 2015 at 12:09:10PM +0530, AYAN KUMAR HALDER wrote:
> > Could you post the full backtrace (straight from the kernel log)?
> The backtrace has been obtained from a debugger, not from the kernel
> log (as the kernel has been deadlocked). The backtrace is as follows:-
Unfortunately, this trace shows you what you already know. You mention
that SMP is enabled; are you running on a system that supports SMP, or
are you relying on SMP_ON_UP to patch things up?
If another CPU has been brought up, then its backtrace is likely more
interesting to you.
Josh
* Kernel gets deadlocked during smp booting
From: AYAN KUMAR HALDER @ 2015-06-17 16:18 UTC
To: kernelnewbies
> Unfortunately, this trace shows you what you already know. You mention
> that SMP is enabled; are you running on a system that supports SMP, or
> are you relying on SMP_ON_UP to patch things up?
>
> If another CPU has been brought up, then its backtrace is likely more
> interesting to you.
Cores 1 and 2 are the secondary CPUs.
Core 1: in Abort mode
Core 2: in the 'cpu_v7_do_idle' function
* Kernel gets deadlocked during smp booting
From: AYAN KUMAR HALDER @ 2015-06-21 15:18 UTC
To: kernelnewbies
> Cores 1,2 are the secondary cpus.
> Core1 :- It is in Abort mode
> Core 2 :- It is in 'cpu_v7_do_idle' function
I was able to resolve the issue. I am not sure whether this is a gap in
my understanding or a potential bug in the kernel.
1. In gic_init_bases(), gic_cpu_map for each core is initialized to
0xff. So in my case, gic_cpu_map[core0] = 0xff, gic_cpu_map[core1] =
0xff, and gic_cpu_map[core2] = 0xff.
2. gic_cpu_init() gets called for the primary core (core0). After that,
gic_cpu_map[core0] = 0x1, gic_cpu_map[core1] = 0xfe, and
gic_cpu_map[core2] = 0xfe.
3. The primary core (core0) calls gic_raise_softirq() to send interrupts
to cores 1 and 2. For core1:
    map |= gic_cpu_map[core1];    (thus map = 0xfe)
This value is then written to the GIC distributor register in the
following statement:
    writel_relaxed(map << 16 | irq,
                   gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
With map = 0xfe, this sends the wake-up interrupt to cores 1 to 7 (in my
case cores 1 and 2) at the same time. That is not intended: core1 should
be sent the wake-up interrupt first and allowed to come online, and only
then should core2 be sent its wake-up interrupt. This would ensure that
they boot one at a time.
In my case, cores 1 and 2 were woken up at the same time and started
booting simultaneously. Then one of the cores acquired a spinlock and
aborted (I believe secondary_start_kernel() is non-reentrant).
This caused a deadlock for the other core.
I would be happy if someone could correct my misunderstanding or
confirm whether this is a potential issue in kernel 4.0.4.
For my fix, I did the following in gic_raise_softirq(). Instead of
    map |= gic_cpu_map[cpu];
I used the following statement:
    map |= (0x1 << cpu);
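The arithmetic in steps 1 to 3 can be checked with a small stand-alone model. This is a sketch under stated assumptions, not the driver code: cpu_map, model_init_bases, model_cpu_init and model_softint_value are hypothetical names, and the model assumes, as described above, that a core's entry is narrowed to a single bit only after that core has run its own CPU-interface init.

```c
#include <assert.h>
#include <stdint.h>

#define NR_GIC_CPU_IF 8  /* the 8-bit target list allows up to 8 CPU interfaces */

/* Hypothetical stand-in for gic_cpu_map: one target mask per logical CPU. */
static uint8_t cpu_map[NR_GIC_CPU_IF];

/* Step 1: every entry starts as "could be any CPU interface". */
static void model_init_bases(void)
{
    for (unsigned int i = 0; i < NR_GIC_CPU_IF; i++)
        cpu_map[i] = 0xff;
}

/* Step 2: a core that runs its init records its own mask and
 * clears that bit from every other core's entry. */
static void model_cpu_init(unsigned int cpu, uint8_t own_mask)
{
    assert(cpu < NR_GIC_CPU_IF);
    cpu_map[cpu] = own_mask;
    for (unsigned int i = 0; i < NR_GIC_CPU_IF; i++)
        if (i != cpu)
            cpu_map[i] &= (uint8_t)~own_mask;
}

/* Step 3: value written to the soft-interrupt register, with the
 * target list shifted to bit 16 and the interrupt number in the low bits. */
static uint32_t model_softint_value(uint8_t target_list, unsigned int irq)
{
    return ((uint32_t)target_list << 16) | irq;
}
```

After model_init_bases() and model_cpu_init(0, 0x01), cpu_map[1] is 0xfe, so model_softint_value(cpu_map[1], irq) targets every interface except core0. With the fix, the target list for core1 is 1 << 1 = 0x02, which targets core1 alone. Note that 1 << cpu presumably only works when the logical CPU number equals the GIC CPU interface number, which holds on this SoC but may not hold on others.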