From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4F563F34.5030207@domain.hid>
Date: Tue, 06 Mar 2012 17:45:40 +0100
From: Gilles Chanteperdrix
References: <4F3AAA52.6060802@domain.hid> <4F46C7CD.1070509@domain.hid>
 <4F4920DE.40600@domain.hid> <4F509B3C.9030105@domain.hid>
 <4F509C26.7070308@domain.hid> <4F516540.5030801@domain.hid>
Subject: Re: [Xenomai-help] Freeze while running examples
List-Id: Help regarding installation and common use of Xenomai
To: Oscar Dávila
Cc: xenomai@xenomai.org

On 03/06/2012 04:14 AM, Oscar Dávila wrote:
> 2012/3/2 Gilles Chanteperdrix
>
>> On 03/03/2012 01:14 AM, Oscar Dávila wrote:
>>> 2012/3/2 Gilles Chanteperdrix
>>>
>>>> On 03/02/2012 11:04 AM, Gilles Chanteperdrix wrote:
>>>>> On 03/01/2012 05:23 AM, Oscar Dávila wrote:
>>>>>> Finally I could get the dump
>>>>>>
>>>>>>
>>>>>> post-prompt
>>>>>> No breakpoints or watchpoints.
>>>>>>
>>>>>> breakpoints-table-end
>>>>>>
>>>>>> post-prompt
>>>>>> Dump of assembler code for function __ipipe_sync_stage:
>>>>>> 0xc106d376 <__ipipe_sync_stage+0>:      push   %ebp
>>>>>> (...)
>>>>>> 0xc106d526 <__ipipe_sync_stage+432>:    ret
>>>>>> End of assembler dump.
>>>>>
>>>>> The address where the EIP is when the NMI watchdog triggers is
>>>>> 0xc106d5e1, so, outside this code.
>>>>>
>>>> And this dump does not seem to correspond to the kernel that was running
>>>> when the bug happened, because in that case we had
>>>>
>>>> 0xc106d5e1 == __ipipe_sync_stage + 0x21b
>>>>
>>>> whereas in your dump,
>>>>
>>>> __ipipe_sync_stage + 0x21b == 0xc106d591
>>>>
>>>> Sorry about that, I lost that image of the kernel.
>>>
>>> Here is a new complete test.
>>>
>>> Kernel Messages
>>>
>>>
>>> Kernel failure message 1:
>>> BUG: NMI Watchdog detected LOCKUP on CPU0, ip c10751d3, registers:
>>>
>>>         local_irq_disable_hw();
>>> c10751bf:  fa               cli
>>> c10751c0:  89 e0            mov    %esp,%eax
>>> c10751c2:  25 00 e0 ff ff   and    $0xffffe000,%eax
>>>         root_stall_after_handler();
>>>         while (__ipipe_check_root_resched())
>>> c10751c7:  83 78 14 00      cmpl   $0x0,0x14(%eax)
>>> c10751cb:  75 58            jne    c1075225 <__xirq_end+0x2>
>>> c10751cd:  f6 40 08 08      testb  $0x8,0x8(%eax)
>>> c10751d1:  74 52            je     c1075225 <__xirq_end+0x2>
>>> c10751d3:  eb f8            jmp    c10751cd <__ipipe_sync_stage+0x12b>
>>>                 __ipipe_preempt_schedule_irq();
>>
>> Looks like an infinite loop when CONFIG_PREEMPT is off. Try putting an
>> #ifdef CONFIG_PREEMPT around this code:
>>
>> #ifdef CONFIG_PREEMPT
>>         while (__ipipe_check_root_resched())
>>                 __ipipe_preempt_schedule_irq();
>> #endif
>>
>> To test that this is indeed the issue, you may try enabling
>> CONFIG_PREEMPT in the code.
>
>
>
> I recompiled the kernel enabling CONFIG_PREEMPT and it worked. I also tried
> the other option, where I add the #ifdef CONFIG_PREEMPT to the source of
> core.c, and it also worked.
>
> So it seems that was the problem. Now I can run trivial_periodic.
>
> But after some time with the kernel after running trivial_periodic, the
> machine still freezes. I will try to see where the failure is happening
> now.
>
> Which type of preemption model is preferred? I mean, using the
> CONFIG_PREEMPT enabled or without the:
> while (__ipipe_check_root_resched())
>         __ipipe_preempt_schedule_irq();

We should not need either workaround. From reading the code, I do not
understand why the compiler creates this infinite loop. It would be
interesting to generate the pre-processed file to understand how this
happens.

-- 
Gilles.
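
One possible reading of the disassembly quoted above, sketched in plain C:
the loop condition is compiled in (the cmpl on offset 0x14 and the testb
$0x8 on offset 0x8), but no call follows it, as if
__ipipe_preempt_schedule_irq() had expanded to nothing in this build. With
nothing in the body to clear the flag, the loop can only spin. This is an
assumption, not a confirmed diagnosis; the pre-processed file would show
what the macros really expand to. Everything prefixed demo_ or DEMO_ below
is hypothetical and is not I-pipe code, the field meanings are guessed from
the offsets, and the iteration cap exists only so that the sketch
terminates.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two fields tested in the disassembly:
 * assumed to be a flags word (offset 0x8) and a preempt count (0x14). */
struct demo_thread_info {
    unsigned long flags;
    int preempt_count;
};

#define DEMO_NEED_RESCHED 0x8UL /* assumed meaning of the bit tested by testb $0x8 */

static struct demo_thread_info demo_ti;

/* Mirrors the shape of the compiled condition: count is zero, flag is set. */
static bool demo_check_root_resched(void)
{
    return demo_ti.preempt_count == 0 &&
           (demo_ti.flags & DEMO_NEED_RESCHED) != 0;
}

#ifdef DEMO_CONFIG_PREEMPT
/* With the option on, the body does real work and clears the flag. */
static void demo_preempt_schedule_irq(void)
{
    demo_ti.flags &= ~DEMO_NEED_RESCHED;
}
#else
/* Assumption: with the option off, the body expands to nothing. */
#define demo_preempt_schedule_irq() do { } while (0)
#endif

/* Runs the loop and returns the number of passes, capped at 'limit'.
 * The cap is not in the kernel code; it only keeps the demo finite. */
unsigned long demo_sync_tail(unsigned long limit)
{
    unsigned long spins = 0;

    demo_ti.preempt_count = 0;
    demo_ti.flags |= DEMO_NEED_RESCHED;

    while (demo_check_root_resched()) {
        demo_preempt_schedule_irq();
        if (++spins >= limit)
            break; /* without the cap, the no-op body would spin forever */
    }
    return spins;
}
```

Built as is, demo_sync_tail(1000) runs into the cap, which corresponds to
the jmp back to the testb in the dump. Built with -DDEMO_CONFIG_PREEMPT, it
returns after a single pass because the body clears the flag.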