From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5139DEA2.9050103@mitrol.it>
Date: Fri, 08 Mar 2013 13:50:42 +0100
From: Paolo Minazzi
MIME-Version: 1.0
References: <51372B12.2030400@mitrol.it> <51373149.4050700@xenomai.org>
 <5137370B.2050402@mitrol.it> <51373841.70704@xenomai.org>
 <51385910.80203@mitrol.it> <51388A3A.2090004@xenomai.org>
 <51388DD2.2020805@mitrol.it> <51388EB2.6000206@xenomai.org>
In-Reply-To: <51388EB2.6000206@xenomai.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai] Sporadic problem : rt_task_sleep locked after debugging
List-Id: Discussions about the Xenomai project
To: xenomai@xenomai.org

On 07/03/2013 13.57, Gilles Chanteperdrix wrote:
> On 03/07/2013 01:53 PM, Paolo Minazzi wrote:
>
>> On 07/03/2013 13.38, Gilles Chanteperdrix wrote:
>>> On 03/07/2013 10:08 AM, Paolo Minazzi wrote:
>>>
>>>> On 06/03/2013 13.36, Gilles Chanteperdrix wrote:
>>>>> On 03/06/2013 01:31 PM, Paolo Minazzi wrote:
>>>>>
>>>>>> On 06/03/2013 13.06, Gilles Chanteperdrix wrote:
>>>>>>> On 03/06/2013 12:40 PM, Paolo Minazzi wrote:
>>>>>>>
>>>>>>>> I can generate the problem only debugging with gdb, otherwise there is
>>>>>>>> no problem.
>>>>>>>>
>>>>>>>> Can you help me to undertand what happen ?
>>>>>>>> Have you got an idea ? do you need other information ?
>>>>>>> In case it is something which was fixed since Xenomai 2.5.6, could you
>>>>>>> try Xenomai 2.6.2.1?
>>>>>>>
>>>>>> I have not done the port.
>>>>>> This work is done by an external firm.
>>>>>> I know well enough the linux kernel, but very very little the xenomai
>>>>>> internals.
>>>>> You can use Xenomai 2.6.2.1 with the same version of the I-pipe kernel,
>>>>> and the I-pipe kernel is the only thing which needs to be ported.
>>>>>
>>>>>> I could try ... but it is not easy ....
>>>>> It should be as easy as:
>>>>> - keep your kernel patched with the I-pipe patch
>>>>> - download the newest version of Xenomai, that is 2.6.2.1
>>>>> - follow the installation instructions, here:
>>>>> http://www.xenomai.org/documentation/xenomai-2.6/html/README.INSTALL/
>>>>>
>>>>>> The problem appear only using gdb .... any ideas ?
>>>>> Could be the timer programmed for a too short delay, could be something
>>>>> we already fixed, could be a new bug... Really, testing rapidly the last
>>>>> version will make us win a lot of time if this is an issue already fixed.
>>>>>
>>>>>
>>>> I Gilles,
>>>> I have ported to 2.6.1 without problems.
>>>> To 2.6.2 and 2.6.2.1 I need to add a gcc built-in. My compiler is
>>> pass --with-atomic-ops=ad-hoc to configure script, this will avoid the
>>> builtins.
>>>
>>>> gcc-4.3.2 and does not have some built-in atomic function.
>>>> After this I need to change the switch.S because my assembler cannot
>>> switch.S has been compiling for ages, way before gcc 4.4. Could you show
>>> me the warning you get?
>>>
>>>> compile it. Maybe a newer compiler (gcc>= 4.4) could solve all these
>>>> problems, but for me this is not a valid solution because other
>>>> developers of us use a cygwin compiler. We should built a new cygwin
>>>> compiler ... the libs will be different and so I will have problem with
>>>> shared libraries .... too complex to solve a sporadic bug using gdb ....
>>>> I can try to see the 2.6.1.
>>> The idea of asking you to try 2.6.2.1 is not to ask you to switch to it,
>>> but simply to do a quick test to see if you can reproduce the issue.
>>>
>>>
>> CC drivers/xenomai/testing/switchtest.o
>> CC drivers/xenomai/testing/timerbench.o
>> LD drivers/xenomai/testing/xeno_timerbench.o
>> LD drivers/xenomai/testing/xeno_switchtest.o
>> LD drivers/xenomai/testing/built-in.o
>> LD drivers/xenomai/built-in.o
>> LD drivers/built-in.o
>> CC arch/arm/xenomai/hal.o
>> AS arch/arm/xenomai/switch.o
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S: Assembler messages:
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:156: Error: bad instruction `arm( stmia ip!,{r4-sl,fp,sp,lr})'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:157: Error: bad instruction `thumb( stmia ip!,{r4-sl,fp})'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:158: Error: bad instruction `thumb( str sp,[ip],#4)'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:159: Error: bad instruction `thumb( str lr,[ip],#4)'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:170: Error: bad instruction `arm( add r4,r2,#28)'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:171: Error: bad instruction `arm( ldmia r4,{r4-sl,fp,sp,pc})'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:172: Error: bad instruction `thumb( add ip,r2,#28)'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:173: Error: bad instruction `thumb( ldmia ip!,{r4-sl,fp})'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:174: Error: bad instruction `thumb( ldr sp,[ip],#4)'
>> /home/axel/MarvellEnv/BuildLinux/linux-2.6.31.8/arch/arm/xenomai/switch.S:175: Error: bad instruction `thumb( ldr pc,[ip])'
>> make[2]: *** [arch/arm/xenomai/switch.o] Error 1
>> make[1]: *** [arch/arm/xenomai] Error 2
>> make: *** [sub-make] Error 2
>
> The issue is not the compiler, the issue is with the linux
> kernel you use. Could you put me a source tarball on some ftp site?
>
> Please try adding:
> #define ARM(x...) x
> #define THUMB(x...)
>
> At the top of switch.S
>
>

Hi Gilles,

2.6.2.1 seems to work fine with a normal Xenomai application.
But while testing our complex application (to track down the gdb /
rt_task_sleep bug) I found another small problem.
Today I studied this new problem, which can be reproduced with a simple
example.

#include <sys/mman.h>
#include <native/task.h>
#include <native/timer.h>

// PRIO=0 triggers the fault! Other values are fine.
#define PRIO 0

RT_TASK tsk;

void fn(void *arg)
{
    while (1) {
        rt_task_sleep(1000000);
    }
}

int main(int argc, char *argv[])
{
    mlockall(MCL_CURRENT | MCL_FUTURE);
    rt_timer_set_mode(0);
    rt_task_set_mode(0, 0, /* T_WARNSW , */ NULL);

    while (1) {
        rt_task_create(&tsk, "demo", 0, PRIO, T_JOINABLE);
        // rt_task_start(&tsk, &fn, 0);
        rt_task_suspend(&tsk);
        rt_task_delete(&tsk);
        rt_task_join(&tsk);
    }
}

This is the log:

/ # /D/main
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 874d4000
[00000000] *pgd=01557031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] PREEMPT
Modules linked in: dp
CPU: 0    Not tainted  (2.6.31.8 #15)
PC is at losyscall_event+0x218/0x238
LR is at schedule+0x46c/0x50c
pc : [<8008d930>]    lr : [<8025b1ac>]    psr: a0000013
sp : 80d0def8  ip : fffffe00  fp : 80d0df1c
r10: 00000000  r9 : 803102a0  r8 : 00000018
r7 : 00000000  r6 : 80d0dfb0  r5 : 88031210  r4 : 00000001
r3 : 00000000  r2 : 00b00231  r1 : 87839360  r0 : 00000001
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005397f  Table: 074d4000  DAC: 00000015
Process demo (pid: 295, stack limit = 0x80d0c270)
Stack: (0x80d0def8 to 0x80d0e000)
dee0:                                                       00000228 80310310
df00: 80332c40 80332c40 80332c44 00000001 80d0df6c 80d0df20 8007c3cc 8008d728
df20: 00000200 00000000 80d0dfb0 00000009 fffffdff ffffffff 20000013 80332c40
df40: 80d0c000 80d0dfb0 00095018 000f0042 000f0042 800202ec 80d0c000 00000000
df60: 80d0df8c 80d0df70 80026540
8007c314 7edaad24 00095018 000931e4 000f0042
df80: 00000000 80d0df90 80020254 800264d0 000931e4 0300022b 2aad5ca0 2aad5c7c
dfa0: 2aad5c7c 00000000 7edaad24 00095018 fffffe00 2aad5ca0 2aad5c7c 2aad5c7c
dfc0: 7edaad24 00095018 000931e4 000f0042 00000400 00070754 00000000 00000001
dfe0: 2aad5ca0 2aad5c78 0000de6c 000089e8 20000010 0300022b 00000000 00000000
Backtrace:
[<8008d718>] (losyscall_event+0x0/0x238) from [<8007c3cc>] (__ipipe_dispatch_event+0xc8/0x1a8)
[<8007c304>] (__ipipe_dispatch_event+0x0/0x1a8) from [<80026540>] (__ipipe_syscall_root+0x80/0x128)
[<800264c0>] (__ipipe_syscall_root+0x0/0x128) from [<80020254>] (vector_swi+0x74/0xb4)
 r7:000f0042 r6:000931e4 r5:00095018 r4:7edaad24
Code: 159524a8 03a00001 159536f8 13a00001 (15832000)
---[ end trace 8d00a583486ebf82 ]---
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 874d4000
[00000000] *pgd=01557031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#2] PREEMPT
Modules linked in: dp
CPU: 0    Tainted: G      D     (2.6.31.8 #15)
PC is at losyscall_event+0x218/0x238
LR is at schedule+0x46c/0x50c
pc : [<8008d930>]    lr : [<8025b1ac>]    psr: a0000013
sp : 8790fef8  ip : fffffe00  fp : 8790ff1c
r10: 00000000  r9 : 803102a0  r8 : 00000018
r7 : 00000000  r6 : 8790ffb0  r5 : 88031210  r4 : 00000001
r3 : 00000000  r2 : 00b00231  r1 : 87839360  r0 : 00000001
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005397f  Table: 074d4000  DAC: 00000015
Process demo (pid: 297, stack limit = 0x8790e270)
Stack: (0x8790fef8 to 0x87910000)
fee0:                                                       00000228 80310310
ff00: 80332c40 80332c40 80332c44 00000001 8790ff6c 8790ff20 8007c3cc 8008d728
ff20: 00000200 00000000 8790ffb0 00000009 fffffdff ffffffff 20000013 80332c40
ff40: 8790e000 8790ffb0 00095028 000f0042 000f0042 800202ec 8790e000 00000000
ff60: 8790ff8c 8790ff70 80026540 8007c314 7edaad24 00095028 000931e4 000f0042
ff80: 00000000 8790ff90 80020254 800264d0 000931e4 0300022b 2aad5ca0 2aad5c7c
ffa0: 2aad5c7c 00000000 7edaad24 00095028
fffffe00 2aad5ca0 2aad5c7c 2aad5c7c
ffc0: 7edaad24 00095028 000931e4 000f0042 00000400 00070754 00000000 00000001
ffe0: 2aad5ca0 2aad5c78 0000de6c 000089e8 20000010 0300022b 00443031 00443431
Backtrace:
[<8008d718>] (losyscall_event+0x0/0x238) from [<8007c3cc>] (__ipipe_dispatch_event+0xc8/0x1a8)
[<8007c304>] (__ipipe_dispatch_event+0x0/0x1a8) from [<80026540>] (__ipipe_syscall_root+0x80/0x128)
[<800264c0>] (__ipipe_syscall_root+0x0/0x128) from [<80020254>] (vector_swi+0x74/0xb4)
 r7:000f0042 r6:000931e4 r5:00095028 r4:7edaad24
Code: 159524a8 03a00001 159536f8 13a00001 (15832000)
---[ end trace 8d00a583486ebf83 ]---
Unable to handle kernel NULL pointer dereference at virtual address 00000000
^Cpgd = 874d4000
[00000000] *pgd=01557031, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#3] PREEMPT
Modules linked in: dp
CPU: 0    Tainted: G      D     (2.6.31.8 #15)
PC is at losyscall_event+0x218/0x238
LR is at schedule+0x46c/0x50c
pc : [<8008d930>]    lr : [<8025b1ac>]    psr: a0000013
sp : 80dc5ef8  ip : fffffe00  fp : 80dc5f1c
r10: 00000000  r9 : 803102a0  r8 : 00000018
r7 : 00000000  r6 : 80dc5fb0  r5 : 88031210  r4 : 00000001
r3 : 00000000  r2 : 00b00231  r1 : 87839360  r0 : 00000001
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005397f  Table: 074d4000  DAC: 00000015
Process demo (pid: 299, stack limit = 0x80dc4270)
Stack: (0x80dc5ef8 to 0x80dc6000)
5ee0:                                                       00000228 80310310
5f00: 80332c40 80332c40 80332c44 00000001 80dc5f6c 80dc5f20 8007c3cc 8008d728
5f20: 00000200 00000000 80dc5fb0 00000009 fffffdff ffffffff 20000013 80332c40
5f40: 80dc4000 80dc5fb0 00095038 000f0042 000f0042 800202ec 80dc4000 00000000
5f60: 80dc5f8c 80dc5f70 80026540 8007c314 7edaad24 00095038 000931e4 000f0042
5f80: 00000000 80dc5f90 80020254 800264d0 000931e4 0300022b 2aad5ca0 2aad5c7c
5fa0: 2aad5c7c 00000000 7edaad24 00095038 fffffe00 2aad5ca0 2aad5c7c 2aad5c7c
5fc0: 7edaad24 00095038 000931e4 000f0042 00000400 00070754 00000000 00000001
5fe0: 2aad5ca0 2aad5c78 0000de6c 000089e8 20000010
0300022b 00000000 00000000
Backtrace:
[<8008d718>] (losyscall_event+0x0/0x238) from [<8007c3cc>] (__ipipe_dispatch_event+0xc8/0x1a8)
[<8007c304>] (__ipipe_dispatch_event+0x0/0x1a8) from [<80026540>] (__ipipe_syscall_root+0x80/0x128)
[<800264c0>] (__ipipe_syscall_root+0x0/0x128) from [<80020254>] (vector_swi+0x74/0xb4)
 r7:000f0042 r6:000931e4 r5:00095038 r4:7edaad24
Code: 159524a8 03a00001 159536f8 13a00001 (15832000)
---[ end trace 8d00a583486ebf84 ]---

This fault does not freeze the ARM; I can continue to work.
Any ideas?
If you can help me fix this problem, I can go back to looking into the gdb
bug.

Thanks,
Paolo