From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4656F037.4040704@domain.hid>
Date: Fri, 25 May 2007 16:18:31 +0200
From: Gilles Chanteperdrix
MIME-Version: 1.0
References: <4656E62F.3090603@domain.hid> <4656E887.7040406@domain.hid> <4656EABD.60601@domain.hid> <1180102373.20410.103.camel@domain.hid>
In-Reply-To: <1180102373.20410.103.camel@domain.hid>
Content-Type: text/plain; charset=ISO-8859-15
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] [BUG] recursive fault on cyclictest termination -- scalable sched?
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
To: rpm@xenomai.org
Cc: Jan Kiszka, xenomai-core

Philippe Gerum wrote:
> On Fri, 2007-05-25 at 15:55 +0200, Jan Kiszka wrote:
>
>>Gilles Chanteperdrix wrote:
>>
>>>Jan Kiszka wrote:
>>>
>>>>People, we have some new troubles:
>>>>
>>>>I'm reproducibly getting recursive faults on termination of cyclictest
>>>>via ^C. It's all standard here: ipipe 1.8-02, Xenomai trunk #2469, no
>>>>weird patches of mine.
>>>>
>>>>Something goes utterly wrong, the debugger currently points into
>>>>xnshadow_relax->rpi_push, and there into some queuing operation. Note
>>>>that I have XENO_OPT_SCALABLE_SCHED on in my config, also
>>>>XENO_OPT_PRIOCPL. Without XENO_OPT_SCALABLE_SCHED, things seem to work
>>>>fine. Who did last work on this? What was fixed?
>>>
>>>Looking at svn log, you did the last modification. What about activating
>>>queue debugging?
>>>
>>
>>Well, I remember that 64-bit issue now, but it was my patch IIRC.
>>
>>Anyway, good suggestion:
>>
>>T: 0 ( 824) P:99 I: 1000 C: 1483 Min: 3 Act: 349 Avg: 1602 Max: 3240
>>[ 474.912841] Xenomai: fatal: corrupted queue, qslot->elems=0, qslot=c0614a28 at include/xenomai/nucleus/queue.h:684
>>[ 474.912921] CPU PID PRI TIMEOUT STAT NAME
>>[ 474.912941] 0 0 -1 0 00500088 ROOT
>>[ 474.912960] 0 823 0 0 00300380 cyclictest
>>[ 474.912980] > 0 824 0 0 00300180 cyclictest
>>[ 474.913000] Master time base: clock=1043585105469
>>[ 474.913017]
>>[ 474.915241] c11e7eb4 00000000 00000000 c0614a28 c11e7ed8 c0104c6e c033f860 00000000
>>[ 474.915762] 00000103 c11e7f18 c0154f02 c033fd86 c115c000 c0614a28 c03428c4 000002ac
>>[ 474.916378] c015657d 00000000 c1180b30 ffffffff c1180d94 00000000 c1180b30 00000000
>>[ 474.916874] Call Trace:
>>[ 474.917034] [] show_trace_log_lvl+0x1a/0x30
>>[ 474.917422] [] show_stack_log_lvl+0xb1/0xe0
>>[ 474.917693] [] show_stack+0x2e/0x40
>>[ 474.917920] [] rpi_push+0x192/0x3a0
>>[ 474.918146] [] xnshadow_relax+0x50/0x1c0
>>[ 474.918394] [] hisyscall_event+0xd0/0x290
>>[ 474.918647] [] __ipipe_dispatch_event+0x8e/0x140
>>[ 474.918917] [] __ipipe_syscall_root+0x3e/0xf0
>>[ 474.919292] [] system_call+0x29/0x41
>>[ 474.919534] =======================
>>
>>Any comment?
>>
>
>
> Yeah. This is the exact bug I told you I was chasing some moons ago on
> qemu/x86_64 and which I can't reproduce anywhere else (on real hw for
> instance), glad to see I'm not alone in the twilight zone anymore. :o)
> (Btw, this issue predates any recent change; this is something I've seen
> popping up more than six weeks ago on my setup).

A recent change that might change the behaviour is the direct tsc access
by clock_gettime in user-space. This may make the bug win a race.

-- 
Gilles Chanteperdrix