From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <17467.54032.226274.987236@domain.hid>
Date: Tue, 11 Apr 2006 18:02:24 +0200
Subject: Re: [Xenomai-core] latency -t 1 crashes on SMP.
In-Reply-To: <4439791B.7060003@domain.hid>
References: <17465.8101.106079.634470@domain.hid> <4439791B.7060003@domain.hid>
From: Gilles Chanteperdrix
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
To: Philippe Gerum
Cc: xenomai@xenomai.org

Philippe Gerum wrote:
> Gilles Chanteperdrix wrote:
> > I tried latency -t 1 on an SMP machine, and observed a very
> > reproducible crash. Most of the time, the machine locks up
> > completely. The rest of the time, I get:
> >
> > Xenomai: suspending kernel thread e7559044 ('timerbench') at 0xb01051f2 after exception #14
> >
> > And the system remains runnable, but 0xb01051f2 is not a valid kernel
> > module text address.
>
> Looks like a kernel-based task stack address on x86.

Actually, this address really is a kernel text address, because I use
CONFIG_VMSPLIT_3G_OPT, which seems to be a way to configure a machine
with 1GB of RAM. The crash also happens with CONFIG_VMSPLIT_3G, but the
text address is then 0xc01051f2.
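
To make the comparison of the two addresses concrete, here is a small
sketch. It assumes the usual i386 PAGE_OFFSET values for these two
options (0xB0000000 for CONFIG_VMSPLIT_3G_OPT, 0xC0000000 for
CONFIG_VMSPLIT_3G); the helper name offset_in_kernel is made up for
illustration and is not a kernel or Xenomai function.

```python
# Assumed i386 PAGE_OFFSET values for the two splits discussed above.
PAGE_OFFSET = {
    "CONFIG_VMSPLIT_3G_OPT": 0xB0000000,
    "CONFIG_VMSPLIT_3G": 0xC0000000,
}


def offset_in_kernel(addr: int, split: str) -> int:
    """Return addr relative to the kernel base for the given split.

    Hypothetical helper: it only subtracts the assumed PAGE_OFFSET.
    """
    return addr - PAGE_OFFSET[split]


if __name__ == "__main__":
    # The two faulting addresses reported in this thread.
    reports = (
        ("CONFIG_VMSPLIT_3G_OPT", 0xB01051F2),
        ("CONFIG_VMSPLIT_3G", 0xC01051F2),
    )
    for split, addr in reports:
        off = offset_in_kernel(addr, split)
        print(f"{split}: {addr:#010x} -> offset {off:#x}")
```

Under these assumptions both addresses reduce to the same offset,
0x1051f2, which is what one would expect if the fault is reported at
the same kernel text location under either split.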

I tried adding a call to show_stack() in xnpod_fault_handler, and it
shows that the crash happens right after the call to rthal_nmi_arm in
xnarch_program_timer_shot:

Xenomai: suspending kernel thread eeed5044 ('timerbench') at 0xc01051f2 after exception #14
       ee8c1768 c02d24c1 ee8c1798 f8ff868c 00000000 00000000 eeed53b0 c01051f2
       0000000e bfa49f69 58454e4f 0000000e ee8c17a4 f900aca1 ee8c17b0 ee8c17c0
       f8ff57b3 ee8c17b0 0000000e ffffffff ee8c183c 00000082 ee8c17dc c0259df0
Call Trace:
 [] show_stack_log_lvl+0xaa/0xe0
 [] show_stack+0x21/0x30
 [] xnpod_fault_handler+0x8c/0x220 [xeno_nucleus]
 [] xnpod_trap_fault+0x61/0x70 [xeno_nucleus]
 [] xnarch_trap_fault+0x23/0x30 [xeno_nucleus]
 [] exception_event+0x70/0x80
 [] __ipipe_dispatch_event+0x103/0x130
 [] __ipipe_handle_exception+0x2a/0xf0
 [] error_code+0x54/0x64
 [] nmi_stack_correct+0x1d/0x22
 [] xntimer_do_start_aperiodic+0x441/0x560 [xeno_nucleus]
 [] xntimer_start+0x130/0x350 [xeno_nucleus]
 [] xnpod_suspend_thread+0x7ad/0xd40 [xeno_nucleus]
 [] rtdm_task_sleep_until+0x179/0x3d0 [xeno_rtdm]
 [] timer_task_proc+0x1c6/0x450 [xeno_timerbench]
 [] xnarch_thread_redirect+0x5d/0x80 [xeno_nucleus]
 [<00000000>] _stext+0x3feffd68/0x8

And now the most interesting part: when compiling Xenomai into the
kernel (which moves the Xenomai code out of vmalloc'ed memory), the bug
disappears.

--
Gilles Chanteperdrix.