From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <53AA7F39.90706@axelsw.it>
Date: Wed, 25 Jun 2014 09:50:17 +0200
From: Marco Tessore
MIME-Version: 1.0
References: <53A3FAB0.4050100@axelsw.it> <53A4207F.9040801@xenomai.org> <53A9AA38.3090005@axelsw.it> <53A9B0FB.1070809@xenomai.org>
In-Reply-To: <53A9B0FB.1070809@xenomai.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai] Kernel freezes in __ipipe_sync_stage
List-Id: Discussions about the Xenomai project
To: Philippe Gerum, Gilles Chanteperdrix, xenomai@xenomai.org

On 24/06/2014 19:10, Philippe Gerum wrote:
> On 06/24/2014 06:41 PM, Marco Tessore wrote:
>> Hi,
>>
>> On 20/06/2014 13:52, Gilles Chanteperdrix wrote:
>>> On 06/20/2014 11:11 AM, Marco Tessore wrote:
>>>> The kernel is version 2.6.31 for ARM architecture - specifically a
>>> Do you have the same problem with a recent I-pipe patch, like one for
>>> a 3.8 or 3.10 kernel?
>>>
>>
>> I managed to run some tests on a 3.10 kernel, but on another board with
>> an imx28 CPU. That kernel freezes too,
>> but I haven't debugged it with the JTAG debugger.
>>
>> Instead, I have some information on the original problem, which is the
>> one that worries me more.
>>
>> In summary:
>> I have a board based on imx25, with kernel 2.6.31, Xenomai 2.5.6 and
>> ipipe patch 1.16-02.
>>
>> Rarely, but often enough to be a problem, the kernel freezes at boot.
>> Thanks to a JTAG debugger I'm able to observe the kernel in the
>> following situation:
>> I'm in an infinite loop with the following stack trace:
>> __ipipe_set_irqpending
>> xnintr_host_tick (__ipipe_propagate_irq)
>> xnintr_clock_handler
>> __ipipe_sync_stage <- (1)
>> ipipe_suspend_domain
>> __ipipe_walk_pipeline
>> __ipipe_restore_pipeline_head
>> xnarch_next_tick_shot
>> clockevents_program_event
>> tick_dev_program_event
>> hrtimer_interrupt
>> mxc_interrupt
>> handle_IRQ_event
>> handle_level_irq
>> asm_do_IRQ
>> __ipipe_sync_stage <- (2)
>> ipipe_suspend_domain
>> __ipipe_walk_pipeline
>> __ipipe_restore_pipeline_head
>> xnpod_enable_timesource
>> xnpod_init
>> __native_skin_init
>> ...
>> ...
>>
>> Specifically, the first call to __ipipe_sync_stage, the one marked
>> with the number (2), is working on a stage that I cannot determine.
>> For convenience let's call it stage S1; I think it is the Linux
>> secondary domain, but I'm not sure.
>> This call invokes the interrupt handler of the system timer.
>> Further up the stack trace there is a nested call to
>> __ipipe_sync_stage, marked with (1),
>> but this call works on another stage, which for convenience I'll call S2.
>> This nested call in turn invokes a handler for the timer irq, which at a
>> certain point calls __ipipe_propagate_irq, which raises the pending flags
>> for stage S1 again.
>> As a result, the first call to __ipipe_sync_stage (2) never gets out of
>> its while loop.
>>
>> I should add that I do not see a hardware interrupt for the timer in
>> the function __ipipe_grab_IRQ.
>> I have no idea how the cycle is triggered, but when the kernel is locked
>> up, it is stuck in the purely software infinite loop
>> described above.
>>
>>
>> In the hope that you can help me understand what is going on,
>> I would like to try a patch like this:
>> - Store, for each level of nesting of __ipipe_sync_stage, the irq number
>> currently being handled and the stage on whose behalf it is handled.
>> - Patch the function __ipipe_set_irqpending so that it does not set
>> the flags for the pair (irq, stage) if the pair is already present at
>> some level in the current stack trace, that is,
>> - if the function __ipipe_sync_stage is executing the handler for a
>> stage, and has therefore already cleared the flags in irqpend_himask and
>> irqpend_lomask, it does not expect the handler to raise the
>> same flag again for the same stage.
>>
>> What do you think about this?
>>
>> Thank you very much for any kind of advice you could give me
>>
>
> You mentioned random lockups during boot. Does your board ever lock up
> when passing xeno_hal.disable=1 on the kernel command line?
>

Yes, I mentioned random lockups, but every time the kernel ends up in the
infinite loop described above.
Following your suggestion I tried passing the parameter xeno_hal.disable=1,
but the kernel said
"Unknown boot option `xeno_hal.disable=1': ignoring"
What is this option supposed to do anyway? If it disables the HAL,
doesn't that inhibit the Xenomai real-time services?

What about the patch described above that I would like to apply?
That is, not allowing the interrupt handlers called in __ipipe_sync_stage
to raise a (stage, irq) pair that is already being handled in the current stack.

Thank you
Marco Tessore
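
To make the proposed patch more concrete, here is a minimal user-space sketch in C. It is not I-pipe code and does not use the real __ipipe_sync_stage or __ipipe_set_irqpending. It only models two stages (S1, S2) and one timer interrupt, and every name in it (sync_frame, in_flight, set_irqpending_guarded, sync_stage, run_model, TIMER_IRQ) is hypothetical, chosen for this example. It assumes, as in the stack trace, that the S1 handler ends up synchronizing S2 and that the S2 handler propagates the same irq back to S1.

```c
#include <assert.h>
#include <stdint.h>

#define NR_STAGES 2
#define MAX_DEPTH 8
#define TIMER_IRQ 5                 /* arbitrary irq number for this model */

enum { S1 = 0, S2 = 1 };            /* the two stages named in the mail */

/* One record per nesting level of sync_stage(): who is being served. */
struct sync_frame { int stage; int irq; };

static uint32_t pending[NR_STAGES];          /* one pending mask per stage */
static struct sync_frame frames[MAX_DEPTH];  /* stack of in-flight pairs */
static int depth;                            /* current nesting level */
static int handler_runs;                     /* number of handler calls */

/* Is (stage, irq) already being handled at some nesting level? */
static int in_flight(int stage, int irq)
{
    for (int i = 0; i < depth; i++)
        if (frames[i].stage == stage && frames[i].irq == irq)
            return 1;
    return 0;
}

/* Guarded "set pending": returns 0 when the request is dropped. */
static int set_irqpending_guarded(int stage, int irq)
{
    if (in_flight(stage, irq))
        return 0;
    pending[stage] |= UINT32_C(1) << irq;
    return 1;
}

static void sync_stage(int stage);

/* Model of the timer handlers on each stage (assumed behaviour). */
static void handler(int stage, int irq)
{
    handler_runs++;
    if (stage == S1) {
        /* S1: reprogramming the timer ends up syncing S2. */
        set_irqpending_guarded(S2, irq);
        sync_stage(S2);
    } else {
        /* S2: propagate the tick back to S1. */
        set_irqpending_guarded(S1, irq);
    }
}

/* Play every pending irq of one stage, recording each pair while it runs. */
static void sync_stage(int stage)
{
    while (pending[stage] != 0) {
        int irq = 0;
        while (!(pending[stage] & (UINT32_C(1) << irq)))
            irq++;                               /* lowest pending irq */
        pending[stage] &= ~(UINT32_C(1) << irq); /* clear before handling */

        assert(depth < MAX_DEPTH);
        frames[depth].stage = stage;
        frames[depth].irq = irq;
        depth++;
        handler(stage, irq);
        depth--;
    }
}

/* Start from "timer pending on S1"; return how many handlers ran. */
static int run_model(void)
{
    pending[S1] = pending[S2] = 0;
    depth = 0;
    handler_runs = 0;
    pending[S1] = UINT32_C(1) << TIMER_IRQ;
    sync_stage(S1);
    return handler_runs;
}
```

In this model, calling run_model() from a main() returns after two handler invocations, because the request to mark (S1, timer) pending again is dropped while that pair is still in flight. Note that the guard simply discards the re-raised interrupt, so in a real kernel it could hide a legitimately pending tick rather than remove the cause of the loop.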