From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4F564103.6070500@domain.hid>
Date: Tue, 06 Mar 2012 17:53:23 +0100
From: Philippe Gerum
MIME-Version: 1.0
References: <4F3AAA52.6060802@domain.hid> <4F46C7CD.1070509@domain.hid> <4F4920DE.40600@domain.hid> <4F509B3C.9030105@domain.hid> <4F509C26.7070308@domain.hid> <4F516540.5030801@domain.hid> <4F563F34.5030207@domain.hid>
In-Reply-To: <4F563F34.5030207@domain.hid>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Subject: Re: [Xenomai-help] Freeze while running examples
List-Id: Help regarding installation and common use of Xenomai
To: Gilles Chanteperdrix
Cc: xenomai@xenomai.org

On 03/06/2012 05:45 PM, Gilles Chanteperdrix wrote:
> On 03/06/2012 04:14 AM, Oscar Dávila wrote:
>> 2012/3/2 Gilles Chanteperdrix
>>
>>> On 03/03/2012 01:14 AM, Oscar Dávila wrote:
>>>> 2012/3/2 Gilles Chanteperdrix
>>>>
>>>>> On 03/02/2012 11:04 AM, Gilles Chanteperdrix wrote:
>>>>>> On 03/01/2012 05:23 AM, Oscar Dávila wrote:
>>>>>>> Finally i could get the dump
>>>>>>>
>>>>>>>
>>>>>>> post-prompt
>>>>>>> No breakpoints or watchpoints.
>>>>>>>
>>>>>>> breakpoints-table-end
>>>>>>>
>>>>>>> post-prompt
>>>>>>> Dump of assembler code for function __ipipe_sync_stage:
>>>>>>> 0xc106d376<__ipipe_sync_stage+0>: push %ebp
>>>>>>> (...)
>>>>>>> 0xc106d526<__ipipe_sync_stage+432>: ret
>>>>>>> End of assembler dump.
>>>>>>
>>>>>> The address where the EIP is when the NMI watchdog triggers is
>>>>>> 0xc106d5e1, so, outside this code.
>>>>>>
>>>>> And this dump does not seem to correspond to the kernel that was running
>>>>> when the bug happened, because in that case we had
>>>>>
>>>>> 0xc106d5e1 == __ipipe_sync_stage + 0x21b
>>>>>
>>>>> whereas in your dump,
>>>>>
>>>>> __ipipe_sync_stage + 0x21b == 0xc106d591
>>>>>
>>>>> Sorry about that, i lost that image of the kernel.
>>>>
>>>> Here is a new complete test.
>>>>
>>>> Kernel Messages
>>>>
>>>>
>>>> Kernel failure message 1:
>>>> BUG: NMI Watchdog detected LOCKUP on CPU0, ip c10751d3, registers:
>>>>
>>>> local_irq_disable_hw();
>>>> c10751bf: fa                cli
>>>> c10751c0: 89 e0             mov %esp,%eax
>>>> c10751c2: 25 00 e0 ff ff    and $0xffffe000,%eax
>>>> root_stall_after_handler();
>>>> while (__ipipe_check_root_resched())
>>>> c10751c7: 83 78 14 00       cmpl $0x0,0x14(%eax)
>>>> c10751cb: 75 58             jne c1075225<__xirq_end+0x2>
>>>> c10751cd: f6 40 08 08       testb $0x8,0x8(%eax)
>>>> c10751d1: 74 52             je c1075225<__xirq_end+0x2>
>>>> c10751d3: eb f8             jmp c10751cd
>>> <__ipipe_sync_stage+0x12b>
>>>> __ipipe_preempt_schedule_irq();
>>>
>>> Looks like an infinite loop when CONFIG_PREEMPT is off. Try putting an
>>> #ifdef CONFIG_PREEMPT around this code:
>>>
>>> #ifdef CONFIG_PREEMPT
>>> while (__ipipe_check_root_resched())
>>>         __ipipe_preempt_schedule_irq();
>>> #endif
>>>
>>> To test that this is indeed the issue, you may try enabling
>>> CONFIG_PREEMPT in the code.
>>
>>
>>
>> I recompiled the kernel enabling CONFIG_PREEMPT and it worked, also i tried
>> the other option, where i add the #ifdef CONFIG_PREEMPT to the source of
>> core.c, and it also worked.
>>
>> So it seems that was the problem. Now i can run trivial_periodic.
>>
>> But after some time with the kernel after running trivial_periodic, the
>> machines still freezes, i will try to see where the failure is happening
>> now.
>>
>> Which type of preemption model its preferred? i mean, using the
>> CONFIG_PREEMPT enabled or without the:
>> while (__ipipe_check_root_resched())
>>         __ipipe_preempt_schedule_irq();
>
> We should not need either workaround. From reading the code, I do not
> understand why the compiler creates this infinite loop. It would be
> interesting to generate the pre-processed file to understand how this
> happens.
>
Because CONFIG_PREEMPT is disabled, but __ipipe_check_root_resched() is
instantiated. This can't fly.

-- 
Philippe.
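
For illustration only, here is a minimal user-space sketch of the failure
mode discussed in this thread: a loop whose condition stays live while its
body does nothing, so the pending-reschedule flag never clears. It is not
the I-pipe code. All names below (need_resched_flag, check_root_resched,
preempt_schedule_irq, run_resched_loop) and the max_spins cap are
hypothetical stand-ins, and the assumption that the loop body clears the
flag when preemption is enabled is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "reschedule pending" bit that the
 * disassembly tests with testb $0x8,0x8(%eax). */
static volatile bool need_resched_flag;

/* Model of the loop condition: true while a reschedule is pending. */
static bool check_root_resched(void)
{
    return need_resched_flag;
}

/* Model of the loop body. Assumption: with preemption enabled the call
 * reschedules, which clears the pending flag; with preemption disabled
 * it is an empty stub and leaves the flag untouched. */
static void preempt_schedule_irq(bool preempt_enabled)
{
    if (preempt_enabled)
        need_resched_flag = false;
}

/* Runs the loop with a reschedule pending. Returns the number of
 * iterations needed to leave the loop, or -1 if max_spins iterations
 * pass without the condition clearing (the cap replaces the real
 * lockup so the sketch terminates). */
long run_resched_loop(bool preempt_enabled, long max_spins)
{
    long spins = 0;

    assert(max_spins > 0);
    need_resched_flag = true;

    while (check_root_resched()) {
        if (spins == max_spins)
            return -1;              /* would spin forever without the cap */
        preempt_schedule_irq(preempt_enabled);
        spins++;
    }
    return spins;
}
```

In this model, run_resched_loop(true, 1000) leaves the loop after one
iteration, while run_resched_loop(false, 1000) hits the cap. That matches
the reported symptom of the CPU jumping between the test and the jmp with
interrupts off until the NMI watchdog fires.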