* [Xenomai] xnarch_xchg infinite loop
@ 2014-01-20 8:31 Henri Roosen
2014-01-20 9:48 ` Gilles Chanteperdrix
0 siblings, 1 reply; 8+ messages in thread
From: Henri Roosen @ 2014-01-20 8:31 UTC (permalink / raw)
To: Xenomai@xenomai.org
Hi all,
We have the problem that (hot-)rebooting our Xenomai system fails about 1
out of 10 times.
The system is an ARM iMX6Solo (Cortex-A9) running Xenomai 2.6.2.1 and
kernel 3.0 (freescale branch).
When the system hangs at reboot, it is in an infinite loop in the Xenomai
atomic exchange implementation, with STREX always returning 1:
__xnarch_xchg
S:0xC0094B2C : ADD r3,r6,#0x890
S:0xC0094B30 : LDREX r2,[r3]
S:0xC0094B34 : STREX r1,r9,[r3]
S:0xC0094B38 : TEQ r1,#0
S:0xC0094B3C : BNE {pc}-0xc ; 0xc0094b30
Does anyone know what is causing the STREX to always return 1, and why it
might get into this state?
Is there a workaround for this problem?
Any help is appreciated!
Thanks,
Henri
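
For readers less familiar with the pattern in the disassembly above: LDREX/STREX is a load-exclusive / store-exclusive pair, and the exchange loops until the store reports success (0). What follows is a minimal userspace sketch of the same retry structure in C11, not the Xenomai source. The name xchg_u32 is hypothetical, and a weak compare-and-swap stands in for the LDREX/STREX pair (on ARMv7, compilers typically lower it to such a pair).

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for an atomic exchange such as __xnarch_xchg:
 * store new_val into *ptr and return the value that was there before.
 */
static uint32_t xchg_u32(_Atomic uint32_t *ptr, uint32_t new_val)
{
    /* Snapshot the current value (the "load" half of the pair). */
    uint32_t old = atomic_load_explicit(ptr, memory_order_relaxed);

    /*
     * Retry until the store succeeds. A weak compare-and-swap may fail
     * even when nothing changed, much like STREX returning 1; on failure
     * 'old' is refreshed with the value currently in *ptr.
     */
    while (!atomic_compare_exchange_weak_explicit(ptr, &old, new_val,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed))
        ;

    return old;
}
```

If the exclusive store can never succeed, as in the hang reported here, a loop of this shape never terminates.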

* Re: [Xenomai] xnarch_xchg infinite loop
From: Gilles Chanteperdrix @ 2014-01-20 9:48 UTC
To: Henri Roosen; +Cc: Xenomai@xenomai.org

On 01/20/2014 09:31 AM, Henri Roosen wrote:
> When the system hangs at reboot, it is in an infinite loop in the Xenomai
> atomic exchange implementation, with STREX always returning 1:
>
> __xnarch_xchg
> S:0xC0094B2C : ADD r3,r6,#0x890
> S:0xC0094B30 : LDREX r2,[r3]
> S:0xC0094B34 : STREX r1,r9,[r3]
> S:0xC0094B38 : TEQ r1,#0
> S:0xC0094B3C : BNE {pc}-0xc ; 0xc0094b30
>
> Does anyone know what is causing the STREX to always return 1, and why it
> might get into this state?

Normally, strex fails if "something else" stores data in between ldrex
and strex. Do you have the full stack trace?

--
Gilles.

* Re: [Xenomai] xnarch_xchg infinite loop
From: Henri Roosen @ 2014-01-20 10:02 UTC
To: Gilles Chanteperdrix; +Cc: Xenomai@xenomai.org

On Mon, Jan 20, 2014 at 10:48 AM, Gilles Chanteperdrix
<gilles.chanteperdrix@xenomai.org> wrote:
> Normally, strex fails if "something else" stores data in between ldrex
> and strex. Do you have the full stack trace?

Thank you for your reply Gilles. Please find the stacktrace below:

#0 __xnarch_xchg( size = 4, x = 3204479504, ptr = <Value optimised away by compiler> ) at atomic_asm.h:79
#1 xnintr_irq_handler( irq = 260, cookie = (void*) 0xBF0079E8 ) at atomic_asm.h:93
#2 __ipipe_sync_stage() at core.c:1301
#3 ipipe_suspend_domain() at core.c:856
#4 __ipipe_walk_pipeline( pos = (struct list_head*) 0xC0463884 ) at core.c:797
#5 __ipipe_handle_irq( irq = 98, flags = 0 ) at ipipe.c:564
#6 __ipipe_grab_irq( irq = <Value optimised away by compiler>, regs = (struct pt_regs*) 0xCC897DD8 ) at ipipe.c:618
#7 [__irq_svc+0x40]

Thanks,
Henri

* Re: [Xenomai] xnarch_xchg infinite loop
From: Gilles Chanteperdrix @ 2014-01-20 20:40 UTC
To: Henri Roosen; +Cc: Xenomai@xenomai.org

On 01/20/2014 11:02 AM, Henri Roosen wrote:
> Thank you for your reply Gilles. Please find the stacktrace below:
>
> #0 __xnarch_xchg( size = 4, x = 3204479504, ptr = <Value optimised away by compiler> ) at atomic_asm.h:79
> #1 xnintr_irq_handler( irq = 260, cookie = (void*) 0xBF0079E8 ) at atomic_asm.h:93
> #2 __ipipe_sync_stage() at core.c:1301
> #3 ipipe_suspend_domain() at core.c:856
> #4 __ipipe_walk_pipeline( pos = (struct list_head*) 0xC0463884 ) at core.c:797
> #5 __ipipe_handle_irq( irq = 98, flags = 0 ) at ipipe.c:564
> #6 __ipipe_grab_irq( irq = <Value optimised away by compiler>, regs = (struct pt_regs*) 0xCC897DD8 ) at ipipe.c:618
> #7 [__irq_svc+0x40]

I do not really see what is going on. However, how come you have a
driver still running while rebooting? Do you not remove the drivers
before rebooting? Also, __irq_svc means you received an interrupt while
in svc mode, so it would be interesting to know what is below
__irq_svc. I believe you can find that out by adding a call to show_stack
(and preferably building the kernel with frame pointers). The trick will be
to avoid getting a show_stack on every interrupt.

Regards.

--
Gilles.
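
One way to avoid a stack dump on every interrupt is a one-shot guard. The sketch below is a userspace model only: show_stack_stub() is a hypothetical stand-in for the kernel's show_stack(), and the rebooting argument is an assumed condition supplied by the caller. A plain flag is used rather than an atomic one, since the primitive that fails in this thread is the exclusive load/store pair itself; that choice assumes a single core (as on the i.MX6Solo) and no re-entry while the flag is being set.

```c
#include <stdbool.h>

static bool stack_dumped;   /* set once the dump has been emitted */
static int dump_calls;      /* counts calls to the stand-in below */

/* Hypothetical stand-in for the kernel's show_stack(). */
static void show_stack_stub(void)
{
    dump_calls++;
}

/*
 * Emit at most one stack dump, and only once the system is going down.
 * Returns true if this call produced the dump.
 */
static bool dump_stack_once(bool rebooting)
{
    if (!rebooting || stack_dumped)
        return false;

    stack_dumped = true;    /* mark first, so a nested call cannot dump again */
    show_stack_stub();
    return true;
}
```

Called from the interrupt entry path, this would print the interrupted context the first time an interrupt arrives during shutdown and stay silent afterwards.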

* Re: [Xenomai] xnarch_xchg infinite loop
From: Henri Roosen @ 2014-01-22 8:31 UTC
To: Gilles Chanteperdrix; +Cc: Xenomai@xenomai.org

On Mon, Jan 20, 2014 at 9:40 PM, Gilles Chanteperdrix
<gilles.chanteperdrix@xenomai.org> wrote:
> I do not really see what is going on. However, how come you have a
> driver still running while rebooting? Do you not remove the drivers
> before rebooting?

Removing the device driver before reboot doesn't help: we still get a
LDREX/STREX lockup in the Xenomai timertick path. I'm still not sure what
actually causes this lockup, but my guess is that somewhere late in the
shutdown sequence the memory (or something else) is put into a state that
makes LDREX/STREX fail all the time. Any interrupt from the Xenomai domain
then locks the system.

As a quick (and dirty!) workaround, skipping __ipipe_handle_irq when
system_state == SYSTEM_REBOOT solves the problem.

So, thinking towards a proper solution: is there, or should there be, any
shutdown/de-initialization of Xenomai and its services in the Linux
shutdown sequence?

> Also, __irq_svc means you received an interrupt while
> in svc mode, so, it would be interesting to know what is below
> __irq_svc. I believe you can know that by adding a call to show_stack
> (and preferably build the kernel with frame pointers). The trick will be
> to avoid getting a show_stack on every interrupt.
>
> Regards.
>
> --
> Gilles.

* Re: [Xenomai] xnarch_xchg infinite loop
From: Gilles Chanteperdrix @ 2014-01-22 10:47 UTC
To: Henri Roosen; +Cc: Xenomai@xenomai.org

On 01/22/2014 09:31 AM, Henri Roosen wrote:
> Removing the device driver before reboot doesn't help: we still get a
> LDREX/STREX lockup in the Xenomai timertick path. I'm still not sure what
> actually causes this lockup, but my guess is somewhere late in the
> shutdown-sequence the memory (or something else) is put into a state that
> makes LDREX/STREX fail all the time. Any interrupt from the Xenomai domain
> locks the system then.
>
> As a quick (and dirty!) workaround skipping __ipipe_handle_irq when
> system_state == SYSTEM_REBOOT solves the problem.
>
> So thinking towards a proper solution: is there or should there be any
> shutdown/de-initialization of Xenomai and its services in the Linux
> shutdown sequence?

The simplest solution seems to be to find the Linux code involved, which
probably contains a local_irq_disable() to avoid this situation, and
replace it with hard_local_irq_disable() in the I-pipe case so as to
also block Xenomai interrupts.

--
Gilles.
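
Why the distinction matters: under the I-pipe, local_irq_disable() called from Linux code only stalls the Linux (root) stage virtually, so the CPU still takes interrupts and the pipeline still dispatches them to the Xenomai (head) domain. The hard variant masks interrupts at the CPU. The toy model below illustrates that difference in plain C; every name in it (cpu_model, model_*) is hypothetical and none of it is I-pipe code.

```c
#include <stdbool.h>

/* Toy model of one CPU under an interrupt pipeline; all names are hypothetical. */
struct cpu_model {
    bool root_stalled;  /* virtual mask: what local_irq_disable() sets */
    bool hw_masked;     /* real CPU mask: what the hard variant sets */
    int head_irqs;      /* interrupts handled by the head (Xenomai) domain */
    int root_irqs;      /* interrupts handled by Linux */
    int root_pending;   /* Linux interrupts logged for later replay */
};

static void model_local_irq_disable(struct cpu_model *c)
{
    c->root_stalled = true;
}

static void model_hard_local_irq_disable(struct cpu_model *c)
{
    c->hw_masked = true;
}

/* Returns true if the interrupt reached a handler immediately. */
static bool model_raise_irq(struct cpu_model *c, bool for_head_domain)
{
    if (c->hw_masked)
        return false;           /* the CPU never takes the interrupt */

    if (for_head_domain) {
        c->head_irqs++;         /* head domain ignores the virtual mask */
        return true;
    }

    if (c->root_stalled) {
        c->root_pending++;      /* deferred until Linux unstalls */
        return false;
    }

    c->root_irqs++;
    return true;
}
```

In this model, after model_local_irq_disable() a head-domain interrupt is still handled, which matches the lockup being triggered from xnintr_irq_handler late in shutdown; only model_hard_local_irq_disable() keeps it out.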

* Re: [Xenomai] xnarch_xchg infinite loop
From: Henri Roosen @ 2014-01-22 14:49 UTC
To: Gilles Chanteperdrix; +Cc: Xenomai@xenomai.org

On Wed, Jan 22, 2014 at 11:47 AM, Gilles Chanteperdrix
<gilles.chanteperdrix@xenomai.org> wrote:
> The simplest solution seems to find the Linux code involved, which
> probably contains a local_irq_disable() to avoid this situation, and
> replace it with hard_local_irq_disable() in the I-pipe case so as to
> also block xenomai interrupts.

Great, this seems to work. Thanks Gilles!
I replaced it with local_irq_disable_hw(); that is what you meant, right?

The ARM reboot code also disables the FIQ with local_fiq_disable(). Is it
necessary to also replace this with local_fiq_disable_hw() for the I-pipe
case? Or is there no need for that?

Thanks,
Henri

* Re: [Xenomai] xnarch_xchg infinite loop
From: Gilles Chanteperdrix @ 2014-01-22 20:20 UTC
To: Henri Roosen; +Cc: Xenomai@xenomai.org

On 01/22/2014 03:49 PM, Henri Roosen wrote:
> Great, this seems to work. Thanks Gilles!
> I replaced it with local_irq_disable_hw(); that is what you meant, right?
>
> The ARM reboot code also disables the FIQ with local_fiq_disable(). Is it
> necessary to also replace this with local_fiq_disable_hw() for the I-pipe
> case? Or is there no need for that?

Right, hard_local_irq_disable is for I-pipe patches for 3.2 and later
kernels. Yes, if local_fiq_disable_hw exists, you should call it instead
of local_fiq_disable.

Regards.

--
Gilles.
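
The naming difference mentioned above (local_irq_disable_hw on the 3.0 kernel used here, hard_local_irq_disable with I-pipe patches for 3.2 and later) can be sketched as a version switch. This is a userspace illustration only: the *_stub functions and pick_hard_irq_disable() are hypothetical, and KERNEL_VERSION is redefined locally with the same encoding the kernel uses. In real kernel code the choice would more likely be a compile-time #if on LINUX_VERSION_CODE; a runtime function is used here only so that both branches can be exercised.

```c
/* Same encoding as the kernel's KERNEL_VERSION macro. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

typedef void (*irq_disable_fn)(void);

/* 0 = nothing called, 1 = local_irq_disable_hw, 2 = hard_local_irq_disable */
static int last_called;

/* Hypothetical stand-ins for the two I-pipe spellings of the hard disable. */
static void local_irq_disable_hw_stub(void)
{
    last_called = 1;
}

static void hard_local_irq_disable_stub(void)
{
    last_called = 2;
}

/* Pick the spelling that matches the kernel version, per the thread. */
static irq_disable_fn pick_hard_irq_disable(int version_code)
{
    if (version_code >= KERNEL_VERSION(3, 2, 0))
        return hard_local_irq_disable_stub;

    return local_irq_disable_hw_stub;
}
```

For the 3.0 kernel in this thread, pick_hard_irq_disable(KERNEL_VERSION(3, 0, 0)) selects the local_irq_disable_hw stand-in, which is the call Henri reported using.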