* Re: [Xenomai] [Xenomai-help] Debugging oops in xnheap_init [not found] ` <4FA875BF.7040103@xenomai.org> @ 2012-05-30 23:31 ` Doug Brunner 2012-05-31 6:59 ` Gilles Chanteperdrix 0 siblings, 1 reply; 5+ messages in thread From: Doug Brunner @ 2012-05-30 23:31 UTC (permalink / raw) To: xenomai [-- Attachment #1: Type: text/plain, Size: 5489 bytes --] On 05/07/2012 06:24 PM, Gilles Chanteperdrix wrote: > On 05/08/2012 02:59 AM, Doug Brunner wrote: >> I just got an oops from running one of my POSIX skin RT applications: >> >> [183168.735823] BUG: unable to handle kernel paging request at 00700bf5 >> [183168.737436] IP: [<c10c091f>] xnheap_init+0x1cf/0x210 >> [183168.738604] *pde = 00000000 >> [183168.739406] Oops: 0002 [#1] PREEMPT >> [183168.740173] last sysfs file: /sys/devices/virtual/bdi/0:19/uevent >> [183168.740173] Modules linked in: e1000 xeno_rtipc lxfb cfbcopyarea >> cfbimgblt cfbfillrect binfmt_misc psmouse usbhid serio_raw hid ata_piix >> [last unloaded: e1000] >> [183168.740173] >> [183168.740173] Pid: 2557, comm: eve_dal Not tainted 2.6.37 #1 /Bochs >> [183168.740173] EIP: 0060:[<c10c091f>] EFLAGS: 00010246 CPU: 0 >> [183168.740173] EIP is at xnheap_init+0x1cf/0x210 >> [183168.740173] EAX: 00700bf1 EBX: eed0e210 ECX: eed0e730 EDX: eed0e2fc >> [183168.740173] ESI: 00000000 EDI: 00000000 EBP: eed27da4 ESP: eed27d7c >> [183168.740173] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 >> [183168.740173] Process eve_dal (pid: 2557, ti=eed26000 >> task=f5773280 task.ti=eed26000) >> [183168.740173] I-pipe domain Linux >> [183168.740173] Stack: >> [183168.740173] eed0e304 00000030 c157d4a9 eed0e210 eed0e2fc >> eed0e2fc 0000003e 00000000 >> [183168.740173] f85ea000 eed0e210 eed27dc8 c10c0c44 00001000 >> 00000000 f85aa000 00040000 >> [183168.740173] eed0e200 fffffff4 eed0e210 eed27df0 c10cf198 >> eed27de4 c1058b86 eed27f20 >> [183168.740173] Call Trace: >> [183168.740173] [<c10c0c44>] ? xnheap_init_mapped+0xd4/0x210 >> [183168.740173] [<c10cf198>] ? xnshadow_sys_event+0x68/0x210 >> [183168.740173] [<c1058b86>] ? commit_creds+0xe6/0x190 >> [183168.740173] [<c10cefa3>] ? xnshadow_sys_bind+0x293/0x420 >> [183168.740173] [<c114b22e>] ? __d_lookup+0x12e/0x160 >> [183168.740173] [<c114bf96>] ? dput+0x66/0x1b0 >> [183168.740173] [<c1141b9e>] ? path_to_nameidata+0x1e/0x50 >> [183168.740173] [<c1143e02>] ? link_path_walk+0x422/0x7c0 >> [183168.740173] [<c11425f5>] ? path_put+0x25/0x30 >> [183168.740173] [<c109559d>] ? __ipipe_restore_root+0x1d/0x30 >> [183168.740173] [<c112d537>] ? kmem_cache_free+0xa7/0x100 >> [183168.740173] [<c11423ea>] ? putname+0x2a/0x40 >> [183168.740173] [<c1144cfa>] ? user_path_at+0x4a/0x80 >> [183168.740173] [<c10cd82d>] ? losyscall_event+0xad/0x200 >> [183168.740173] [<c1095035>] ? __ipipe_dispatch_event+0xb5/0x170 >> [183168.740173] [<c10cd780>] ? losyscall_event+0x0/0x200 >> [183168.740173] [<c10166d5>] ? __ipipe_syscall_root+0x45/0xd0 >> [183168.740173] [<c145526d>] ? system_call+0x2d/0x53 >> [183168.740173] Code: 24 e8 a6 cc 19 00 fa 8b 0d 28 36 61 c1 0f ba >> 2d c0 1b 61 c1 00 19 f6 8b 55 e8 83 e6 01 89 8b f0 00 00 00 8b 01 89 83 >> ec 00 00 00<89> 50 04 31 c0 89 11 8b 15 c0 1b 61 c1 83 05 2c 36 61 c1 01 83 >> [183168.740173] EIP: [<c10c091f>] xnheap_init+0x1cf/0x210 SS:ESP >> 0068:eed27d7c >> [183168.740173] CR2: 0000000000700bf5 >> >> As you can see, this happened with kernel 2.6.37, and I built it with >> Xenomai 2.6.0. The offending instruction was at xnheap_init + 463: >> >> 0xc10c090b<xnheap_init+443>: mov -0x18(%ebp),%edx >> 0xc10c090e<xnheap_init+446>: and $0x1,%esi >> 0xc10c0911<xnheap_init+449>: mov %ecx,0xf0(%ebx) >> 0xc10c0917<xnheap_init+455>: mov (%ecx),%eax >> 0xc10c0919<xnheap_init+457>: mov %eax,0xec(%ebx) >> 0xc10c091f<xnheap_init+463>: mov %edx,0x4(%eax) >> 0xc10c0922<xnheap_init+466>: xor %eax,%eax >> 0xc10c0924<xnheap_init+468>: mov %edx,(%ecx) >> 0xc10c0926<xnheap_init+470>: mov 0xc1611bc0,%edx >> 0xc10c092c<xnheap_init+476>: addl $0x1,0xc161362c >> 0xc10c0933<xnheap_init+483>: addl $0x1,0xc17c83e4 >> >> This corresponds to ath(xnholder_t *, xnholder_t *) in >> include/xenomai/nucleus/queue.h, line 48: >> >> 43 static inline void ath(xnholder_t *head, xnholder_t *holder) >> 44 { >> 45 /* Inserts the new element right after the heading one */ >> 46 holder->last = head; >> 47 holder->next = head->next; >> 48 holder->next->last = holder; >> 49 head->next = holder; >> 50 } >> >> It's apparently the call to appendq() at >> kernel/xenomai/nucleus/heap.c:332 that does this, with a junk pointer >> dereference. So, heap->stat_link.next is not valid at the time of this >> call, yet it's initialized by the call to inith() on line 319. I don't >> know what would have changed that, unless it's a bad pointer elsewhere >> that caused overwriting of this data. Any ideas where to go from here? >> > If the bug is reproducible, two things you can try: > - enable CONFIG_XENO_OPT_DEBUG_QUEUES > - enable the I-pipe tracer and panic freezes, you should get a trace > when the bug happens. > Hi Gilles, I finally got a bit more information. The crash occurred again today on my testing hardware, so I installed a kernel with I-pipe trace and queue debugging and tried to reproduce. I didn't get the same error, and the kernel didn't oops, but I did get some interesting-looking information in the log. It looks like something bad was happening with XDDP, but I can't figure out what. Hopefully the attached log file will get through. Thanks, --Doug Brunner [-- Attachment #2: eve-syslog-20120530-01 --] [-- Type: text/plain, Size: 27356 bytes --] May 30 14:01:56 eve-0000 kernel: [ 98.147753] Xenomai: registered exported object dispsem-1288 (semaphores) May 30 14:02:03 eve-0000 kernel: [ 105.246528] Xenomai: unregistered exported object dispsem-1288 (semaphores) May 30 14:03:27 eve-0000 kernel: [ 189.106422] Xenomai: bug at kernel/xenomai/nucleus/bufd.c:374 (xnlock_is_owner(&nklock) || spltest()) May 30 14:03:27 eve-0000 kernel: [ 189.106930] May 30 14:03:27 eve-0000 kernel: [ 189.106938] CPU PID PRI TIMEOUT STAT NAME May 30 14:03:27 eve-0000 kernel: [ 189.107342] 0 0 -1 0 00500088 ROOT May 30 14:03:27 eve-0000 kernel: [ 189.107641] 0 1344 0 0 00b00380 eve_dal May 30 14:03:27 eve-0000 kernel: [ 189.107952] 0 1349 99 191183557 00300184 eve_dal May 30 14:03:27 eve-0000 kernel: [ 189.108268] 0 1351 99 29193155 00300184 eve_dal May 30 14:03:27 eve-0000 kernel: [ 189.108580] 0 1352 99 179201948 00300184 eve_dal May 30 14:03:27 eve-0000 kernel: [ 189.108895] 0 1353 99 334966274 00300184 eve_dal May 30 14:03:27 eve-0000 kernel: [ 189.109204] 0 1354 0 50434175 00b00184 eve_dal May 30 14:03:27 eve-0000 kernel: [ 189.109509] > 0 1355 50 0 00300180 eve_dal May 30 14:03:27 eve-0000 kernel: [ 189.109806] Master time base: clock=96812860797 May 30 14:03:27 eve-0000 kernel: [ 189.110052] f50abbbc f50abc34 00000040 b46f71d0 f50abbdc c10055f3 00000000 c15a33d2 May 30 14:03:27 eve-0000 kernel: [ 189.110143] f50abc00 c10c44cd c15a5aa2 c15a5a84 00000176 c15acf5c f50abcb0 00000040 May 30 14:03:27 eve-0000 kernel: [ 189.110234] 00000040 f50abc94 fa7f91dd 00000000 00000040 00000040 00000050 fa890200 May 30 14:03:27 eve-0000 kernel: [ 189.110325] Call Trace: May 30 14:03:27 eve-0000 kernel: [ 189.110350] I-pipe: Detected illicit call from domain 'Xenomai' May 30 14:03:27 eve-0000 kernel: [ 189.110367] <3> into a service reserved for domain 'Linux' and below. May 30 14:03:27 eve-0000 kernel: [ 189.111019] Pid: 1355, comm: eve_dal Not tainted 2.6.37 #4 May 30 14:03:27 eve-0000 kernel: [ 189.111045] Call Trace: May 30 14:03:27 eve-0000 kernel: [ 189.111103] [<c1095daf>] ipipe_check_context+0x8f/0xb0 May 30 14:03:27 eve-0000 kernel: [ 189.111158] [<c147ea85>] add_preempt_count+0x15/0xc0 May 30 14:03:27 eve-0000 kernel: [ 189.111201] [<c106a495>] is_module_text_address+0x15/0x50 May 30 14:03:27 eve-0000 kernel: [ 189.111250] [<c104f8df>] __kernel_text_address+0x4f/0x70 May 30 14:03:27 eve-0000 kernel: [ 189.111297] [<c10057b9>] print_context_stack+0x59/0x110 May 30 14:03:27 eve-0000 kernel: [ 189.111346] [<c10046f8>] dump_trace+0x68/0xd0 May 30 14:03:27 eve-0000 kernel: [ 189.111395] [<c100564c>] show_trace_log_lvl+0x4c/0x60 May 30 14:03:27 eve-0000 kernel: [ 189.111443] [<c10047eb>] show_stack_log_lvl+0x5b/0xe0 May 30 14:03:27 eve-0000 kernel: [ 189.111489] [<c10055f3>] show_stack+0x23/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.111531] [<c10c44cd>] xnbufd_copy_to_kmem+0xfd/0x110 May 30 14:03:27 eve-0000 kernel: [ 189.111593] [<fa7f91dd>] __xddp_sendmsg+0x20d/0x310 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.111656] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.111697] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:03:27 eve-0000 kernel: [ 189.111743] [<c1282eb6>] ? __copy_from_user_ll_nozero+0x16/0xf0 May 30 14:03:27 eve-0000 kernel: [ 189.111796] [<fa7f824f>] ? rtipc_get_arg+0x5f/0x70 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.111848] [<fa7f942a>] xddp_sendmsg+0xfa/0x150 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.111895] [<c1096deb>] ? __ipipe_walk_pipeline+0xb/0xb0 May 30 14:03:27 eve-0000 kernel: [ 189.111939] [<c10970bb>] ? __ipipe_dispatch_wired_nocheck+0x5b/0x90 May 30 14:03:27 eve-0000 kernel: [ 189.111987] [<c1016fba>] ? ipipe_trigger_irq+0xaa/0xb0 May 30 14:03:27 eve-0000 kernel: [ 189.112034] [<c10995fe>] ? ipipe_trace_end+0x2e/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.112084] [<c10c9b39>] ? __xnpod_schedule+0x9/0x8f0 May 30 14:03:27 eve-0000 kernel: [ 189.112127] [<c10937bf>] ? __rcu_read_unlock+0xf/0x2d0 May 30 14:03:27 eve-0000 kernel: [ 189.112174] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.112215] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:03:27 eve-0000 kernel: [ 189.112258] [<c10937c4>] ? __rcu_read_unlock+0x14/0x2d0 May 30 14:03:27 eve-0000 kernel: [ 189.112299] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:03:27 eve-0000 kernel: [ 189.112350] [<c1097142>] ? __ipipe_restore_pipeline_head+0x52/0xc0 May 30 14:03:27 eve-0000 kernel: [ 189.112400] [<c10995fe>] ? ipipe_trace_end+0x2e/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.112447] [<c10d6ac8>] ? xnsched_peek_rpi+0x8/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.112489] [<c10dde4c>] ? schedule_event+0x12c/0x530 May 30 14:03:27 eve-0000 kernel: [ 189.112539] [<c10d7474>] ? ipipe_restore_pipeline_head+0x4/0x60 May 30 14:03:27 eve-0000 kernel: [ 189.112588] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.112649] [<fa7f808c>] ? rtipc_sendmsg+0xc/0x30 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.112698] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.112750] [<fa7f80a2>] rtipc_sendmsg+0x22/0x30 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.112799] [<c1113d45>] __rt_dev_sendmsg+0x45/0x90 May 30 14:03:27 eve-0000 kernel: [ 189.112847] [<c111639b>] sys_rtdm_sendmsg+0x5b/0x70 May 30 14:03:27 eve-0000 kernel: [ 189.112896] [<c10dd183>] hisyscall_event+0x173/0x370 May 30 14:03:27 eve-0000 kernel: [ 189.112936] [<c1096bec>] ? __ipipe_dispatch_event+0x15c/0x200 May 30 14:03:27 eve-0000 kernel: [ 189.112987] [<c1096b1e>] __ipipe_dispatch_event+0x8e/0x200 May 30 14:03:27 eve-0000 kernel: [ 189.113034] [<c10dd010>] ? hisyscall_event+0x0/0x370 May 30 14:03:27 eve-0000 kernel: [ 189.113079] [<c1016bd3>] __ipipe_syscall_root+0x43/0x100 May 30 14:03:27 eve-0000 kernel: [ 189.113123] [<c1002e78>] sysenter_past_esp+0x55/0x6c May 30 14:03:27 eve-0000 kernel: [ 189.113166] I-pipe tracer log (100 points): May 30 14:03:27 eve-0000 kernel: [ 189.113188] + func 0 ipipe_trace_panic_freeze+0x4 (xnarch_begin_panic+0xd) May 30 14:03:27 eve-0000 kernel: [ 189.113317] + func 0 xnarch_begin_panic+0x3 (xnbufd_copy_to_kmem+0xd0) May 30 14:03:27 eve-0000 kernel: [ 189.113433] + func -2 xnbufd_copy_to_kmem+0x9 (__xddp_sendmsg+0x20d [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.113560] + func -3 xnbufd_map_umem+0x7 (__xddp_sendmsg+0x1fd [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.113690] | + end 0x80000000 -4 __ipipe_restore_pipeline_head+0x52 (xnheap_alloc+0x133) May 30 14:03:27 eve-0000 kernel: [ 189.113807] | # func -5 __ipipe_restore_pipeline_head+0x6 (xnheap_alloc+0x133) May 30 14:03:27 eve-0000 kernel: [ 189.113921] | # func -7 get_free_range+0x9 (xnheap_alloc+0x182) May 30 14:03:27 eve-0000 kernel: [ 189.114031] | + begin 0x80000000 -9 xnheap_alloc+0x76 (__xddp_sendmsg+0x1a8 [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.114159] + func -10 xnheap_alloc+0x9 (__xddp_sendmsg+0x1a8 [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.114285] | + end 0x80000000 -12 __ipipe_restore_pipeline_head+0x52 (rtdm_context_get+0x77) May 30 14:03:27 eve-0000 kernel: [ 189.114403] | # func -13 __ipipe_restore_pipeline_head+0x6 (rtdm_context_get+0x77) May 30 14:03:27 eve-0000 kernel: [ 189.114520] | + begin 0x80000000 -14 rtdm_context_get+0x42 (__xddp_sendmsg+0x5f [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.114649] + func -14 rtdm_context_get+0x11 (__xddp_sendmsg+0x5f [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.114777] + func -16 rtipc_get_iov_flatlen+0x5 [xeno_rtipc] (__xddp_sendmsg+0x29 [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.114918] + func -17 __xddp_sendmsg+0x12 [xeno_rtipc] (xddp_sendmsg+0xfa [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.115059] + func -18 __copy_from_user_ll_nozero+0x11 (rtipc_get_arg+0x5f [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.115189] + func -19 rtipc_get_arg+0x11 [xeno_rtipc] (xddp_sendmsg+0xcf [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.115350] + func -21 xddp_sendmsg+0x12 [xeno_rtipc] (rtipc_sendmsg+0x22 [xeno_rtipc]) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -22 rtipc_sendmsg+0xc [xeno_rtipc] (__rt_dev_sendmsg+0x45) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000000 -24 __ipipe_restore_pipeline_head+0x52 (rtdm_context_get+0x77) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -24 __ipipe_restore_pipeline_head+0x6 (rtdm_context_get+0x77) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000000 -26 rtdm_context_get+0x42 (__rt_dev_sendmsg+0x20) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -27 rtdm_context_get+0x11 (__rt_dev_sendmsg+0x20) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -27 __rt_dev_sendmsg+0xf (sys_rtdm_sendmsg+0x5b) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -28 __copy_from_user_ll_nozero+0x11 (sys_rtdm_sendmsg+0x46) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -29 sys_rtdm_sendmsg+0xf (hisyscall_event+0x173) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -31 hisyscall_event+0xf (__ipipe_dispatch_event+0x8e) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000001 -32 __ipipe_dispatch_event+0x15c (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -32 __ipipe_dispatch_event+0x1fb (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -33 __ipipe_dispatch_event+0x9 (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -34 __ipipe_syscall_root+0xf (sysenter_past_esp+0x55) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -57 __ipipe_syscall_root+0xbb (sysenter_past_esp+0x55) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000001 -57 __ipipe_dispatch_event+0x1cd (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -58 __ipipe_dispatch_event+0x176 (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -59 signal_pending+0x3 (hisyscall_event+0x25d) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -60 __copy_to_user_ll+0x11 (__rt_timer_read+0x4d) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -62 xnarch_tsc_to_ns+0xd (xnarch_get_cpu_time+0xf) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -63 xnarch_get_cpu_time+0x3 (__rt_timer_read+0x1a) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -64 __rt_timer_read+0x8 (hisyscall_event+0x173) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -65 hisyscall_event+0xf (__ipipe_dispatch_event+0x8e) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000001 -66 __ipipe_dispatch_event+0x15c (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -67 __ipipe_dispatch_event+0x1fb (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -67 __ipipe_dispatch_event+0x9 (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -68 __ipipe_syscall_root+0xf (sysenter_past_esp+0x55) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -90 __ipipe_syscall_root+0xbb (sysenter_past_esp+0x55) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000001 -91 __ipipe_dispatch_event+0x1cd (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -92 __ipipe_dispatch_event+0x176 (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -93 signal_pending+0x3 (hisyscall_event+0x25d) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000000 -94 __ipipe_restore_pipeline_head+0x52 (ipipe_restore_pipeline_head+0x24) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -95 __ipipe_restore_pipeline_head+0x6 (ipipe_restore_pipeline_head+0x24) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -96 ipipe_restore_pipeline_head+0x4 (pthread_cond_signal+0xd7) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -97 xnsynch_wakeup_one_sleeper+0x9 (pthread_cond_signal+0xba) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -99 ppd_lookup_inner+0x9 (xnshadow_ppd_get+0x48) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -100 xnshadow_ppd_get+0x6 (pse51_kqueues+0x22) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -102 pse51_kqueues+0x3 (pthread_cond_signal+0x9e) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000000 -103 pthread_cond_signal+0x2a (__pthread_cond_signal+0x40) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -104 pthread_cond_signal+0x11 (__pthread_cond_signal+0x40) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -105 __copy_from_user_ll_nozero+0x11 (__pthread_cond_signal+0x35) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -106 __pthread_cond_signal+0x7 (hisyscall_event+0x173) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -107 hisyscall_event+0xf (__ipipe_dispatch_event+0x8e) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000001 -108 __ipipe_dispatch_event+0x15c (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -109 __ipipe_dispatch_event+0x1fb (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -110 __ipipe_dispatch_event+0x9 (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -111 __ipipe_syscall_root+0xf (sysenter_past_esp+0x55) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -263 __ipipe_syscall_root+0xbb (sysenter_past_esp+0x55) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000001 -263 __ipipe_dispatch_event+0x1cd (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000001 -264 __ipipe_dispatch_event+0x176 (__ipipe_syscall_root+0x43) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -265 signal_pending+0x3 (losyscall_event+0x150) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000000 -267 __ipipe_restore_pipeline_head+0x52 (ipipe_restore_pipeline_head+0x24) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -267 __ipipe_restore_pipeline_head+0x6 (ipipe_restore_pipeline_head+0x24) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -268 ipipe_restore_pipeline_head+0x4 (pthread_setschedparam+0x11b) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -269 ipipe_restore_pipeline_head+0x4 (xnsched_pick_next+0x2ef) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -270 ipipe_test_and_stall_pipeline_head+0x3 (xnsched_pick_next+0x223) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -271 ipipe_restore_pipeline_head+0x4 (xnsched_pick_next+0x21e) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -272 xnsched_pick_next+0x9 (__xnpod_schedule+0x82) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # [ 1355] eve_dal 50 -272 __xnpod_schedule+0x63 (pthread_setschedparam+0x136) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -273 __xnpod_schedule+0x9 (pthread_setschedparam+0x136) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -275 ipipe_restore_pipeline_head+0x4 (xnsched_putback+0x187) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -276 ipipe_test_and_stall_pipeline_head+0x3 (xnsched_putback+0x12b) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -276 ipipe_restore_pipeline_head+0x4 (xnsched_putback+0x126) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -278 xnsched_putback+0x9 (__xnpod_set_thread_schedparam+0xdf) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -280 xnsched_set_policy+0x9 (__xnpod_set_thread_schedparam+0x51) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -281 __xnpod_set_thread_schedparam+0xf (xnpod_set_thread_schedparam+0x17) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -281 xnpod_set_thread_schedparam+0x6 (pthread_setschedparam+0x102) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -283 xnpod_set_thread_tslice+0xf (pthread_setschedparam+0xa0) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000000 -285 pthread_setschedparam+0x2d (__pthread_setschedparam+0x84) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -285 pthread_setschedparam+0xf (__pthread_setschedparam+0x84) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000000 -286 __ipipe_restore_pipeline_head+0x52 (__pthread_find+0xf4) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -287 __ipipe_restore_pipeline_head+0x6 (__pthread_find+0xf4) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + begin 0x80000000 -288 __pthread_find+0x8a (__pthread_setschedparam+0x67) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -289 __pthread_find+0x9 (__pthread_setschedparam+0x67) May 30 14:03:27 eve-0000 kernel: [ 189.115432] + func -291 __pthread_setschedparam+0xf (losyscall_event+0xad) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | + end 0x80000000 -292 ipipe_unstall_pipeline_head+0x3a (xnshadow_harden+0xd0) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -293 ipipe_unstall_pipeline_head+0x3 (xnshadow_harden+0xd0) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -294 xnpod_switch_fpu+0x11 (xnshadow_harden+0xc1) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -296 ipipe_restore_pipeline_head+0x4 (xnsched_pick_next+0x2ef) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -297 ipipe_test_and_stall_pipeline_head+0x3 (xnsched_pick_next+0x223) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -298 ipipe_restore_pipeline_head+0x4 (xnsched_pick_next+0x21e) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # func -299 xnsched_pick_next+0x9 (__xnpod_schedule+0x82) May 30 14:03:27 eve-0000 kernel: [ 189.115432] | # [ 19] gatekee -1 -299 __xnpod_schedule+0x63 (xnpod_schedule_handler+0x2a) May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10055f3>] ? show_stack+0x23/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10c44cd>] ? xnbufd_copy_to_kmem+0xfd/0x110 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<fa7f91dd>] ? __xddp_sendmsg+0x20d/0x310 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1282eb6>] ? __copy_from_user_ll_nozero+0x16/0xf0 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<fa7f824f>] ? rtipc_get_arg+0x5f/0x70 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<fa7f942a>] ? xddp_sendmsg+0xfa/0x150 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1096deb>] ? __ipipe_walk_pipeline+0xb/0xb0 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10970bb>] ? __ipipe_dispatch_wired_nocheck+0x5b/0x90 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1016fba>] ? ipipe_trigger_irq+0xaa/0xb0 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10995fe>] ? ipipe_trace_end+0x2e/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10c9b39>] ? __xnpod_schedule+0x9/0x8f0 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10937bf>] ? __rcu_read_unlock+0xf/0x2d0 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10937c4>] ? __rcu_read_unlock+0x14/0x2d0 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1097142>] ? __ipipe_restore_pipeline_head+0x52/0xc0 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10995fe>] ? ipipe_trace_end+0x2e/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10d6ac8>] ? xnsched_peek_rpi+0x8/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10dde4c>] ? schedule_event+0x12c/0x530 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10d7474>] ? ipipe_restore_pipeline_head+0x4/0x60 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<fa7f808c>] ? rtipc_sendmsg+0xc/0x30 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10994da>] ? ipipe_trace_function+0x2a/0x30 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<fa7f80a2>] ? rtipc_sendmsg+0x22/0x30 [xeno_rtipc] May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1113d45>] ? __rt_dev_sendmsg+0x45/0x90 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c111639b>] ? sys_rtdm_sendmsg+0x5b/0x70 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10dd183>] ? hisyscall_event+0x173/0x370 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1096bec>] ? __ipipe_dispatch_event+0x15c/0x200 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1096b1e>] ? __ipipe_dispatch_event+0x8e/0x200 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c10dd010>] ? hisyscall_event+0x0/0x370 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1016bd3>] ? __ipipe_syscall_root+0x43/0x100 May 30 14:03:27 eve-0000 kernel: [ 189.115432] [<c1002e78>] ? sysenter_past_esp+0x55/0x6c May 30 14:03:27 eve-0000 kernel: [ 189.115432] Xenomai: watchdog triggered -- signaling runaway thread 'eve_dal' May 30 14:03:27 eve-0000 kernel: [ 189.115432] Xenomai: watchdog triggered -- killing runaway thread 'eve_dal' May 30 14:03:27 eve-0000 kernel: [ 189.132824] Xenomai: registered exported object batt (iddp) May 30 14:03:29 eve-0000 kernel: [ 190.493310] Xenomai: registered exported object fc (iddp) May 30 14:07:00 eve-0000 kernel: [ 401.811795] Xenomai: bug at kernel/xenomai/nucleus/bufd.c:486 (xnlock_is_owner(&nklock) || spltest()) f50a9bc0 f50a9c34 00000040 fa890290 f50a9be0 c10055f3 00000000 c15a33d2 May 30 14:07:00 eve-0000 kernel: [ 401.815432] f50a9c04 c10c4240 c15a5aa2 c15a5a84 000001e6 c15acf5c 00000000 f50a9cac May 30 14:07:00 eve-0000 kernel: [ 401.815432] 00000040 f50a9c98 fa7f8965 05f5e100 00000000 c1972780 fa890280 fa890290 May 30 14:07:00 eve-0000 kernel: [ 401.815432] Call Trace: May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10055f3>] ? show_stack+0x23/0x30 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10c4240>] ? xnbufd_copy_from_kmem+0xc0/0x100 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<fa7f8965>] ? __xddp_recvmsg+0x105/0x240 [xeno_rtipc] May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10c8a9d>] ? xnpod_resume_thread+0x29d/0x550 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1282eb6>] ? __copy_from_user_ll_nozero+0x16/0xf0 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<fa7f8b7c>] ? xddp_recvmsg+0x9c/0xf0 [xeno_rtipc] May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10d818f>] ? xnsched_pick_next+0x2ef/0x3f0 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1090033>] ? irq_thread+0x163/0x2a0 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10c6e24>] ? xnintr_edge_shirq_handler+0x204/0x2a0 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10970a6>] ? __ipipe_dispatch_wired_nocheck+0x46/0x90 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10168f0>] ? __ipipe_ack_irq+0x0/0x10 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c109779d>] ? __ipipe_dispatch_wired+0x1d/0x30 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1016b0a>] ? __ipipe_handle_irq+0x18a/0x210 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1003440>] ? common_interrupt+0x40/0x50 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1282eb6>] ? __copy_from_user_ll_nozero+0x16/0xf0 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<fa7f824f>] ? rtipc_get_arg+0x5f/0x70 [xeno_rtipc] May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<fa7f97ba>] ? xddp_ioctl+0x33a/0x910 [xeno_rtipc] May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1090033>] ? irq_thread+0x163/0x2a0 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10c6e24>] ? xnintr_edge_shirq_handler+0x204/0x2a0 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1003483>] ? ftrace_call+0x5/0x8 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<fa7f8072>] ? rtipc_recvmsg+0x22/0x30 [xeno_rtipc] May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1113dd5>] ? __rt_dev_recvmsg+0x45/0x90 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c111640b>] ? sys_rtdm_recvmsg+0x5b/0x90 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10dd183>] ? hisyscall_event+0x173/0x370 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1096b1e>] ? __ipipe_dispatch_event+0x8e/0x200 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c10dd010>] ? hisyscall_event+0x0/0x370 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1016bd3>] ? __ipipe_syscall_root+0x43/0x100 May 30 14:07:00 eve-0000 kernel: [ 401.815432] [<c1002e78>] ? sysenter_past_esp+0x55/0x6c May 30 14:07:00 eve-0000 kernel: [ 401.815432] Xenomai: watchdog triggered -- signaling runaway thread 'eve_dal' May 30 14:07:00 eve-0000 kernel: [ 401.815432] Xenomai: watchdog triggered -- killing runaway thread 'eve_dal' May 30 14:09:18 eve-0000 eve_dal: [ 1338412179.965692] Aborted ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Xenomai] [Xenomai-help] Debugging oops in xnheap_init 2012-05-30 23:31 ` [Xenomai] [Xenomai-help] Debugging oops in xnheap_init Doug Brunner @ 2012-05-31 6:59 ` Gilles Chanteperdrix 2012-05-31 7:38 ` Philippe Gerum 0 siblings, 1 reply; 5+ messages in thread From: Gilles Chanteperdrix @ 2012-05-31 6:59 UTC (permalink / raw) To: Doug Brunner; +Cc: xenomai On 05/31/2012 01:31 AM, Doug Brunner wrote: > On 05/07/2012 06:24 PM, Gilles Chanteperdrix wrote: >> On 05/08/2012 02:59 AM, Doug Brunner wrote: >>> I just got an oops from running one of my POSIX skin RT applications: >>> >>> [183168.735823] BUG: unable to handle kernel paging request at 00700bf5 >>> [183168.737436] IP: [<c10c091f>] xnheap_init+0x1cf/0x210 >>> [183168.738604] *pde = 00000000 >>> [183168.739406] Oops: 0002 [#1] PREEMPT >>> [183168.740173] last sysfs file: /sys/devices/virtual/bdi/0:19/uevent >>> [183168.740173] Modules linked in: e1000 xeno_rtipc lxfb cfbcopyarea >>> cfbimgblt cfbfillrect binfmt_misc psmouse usbhid serio_raw hid ata_piix >>> [last unloaded: e1000] >>> [183168.740173] >>> [183168.740173] Pid: 2557, comm: eve_dal Not tainted 2.6.37 #1 /Bochs >>> [183168.740173] EIP: 0060:[<c10c091f>] EFLAGS: 00010246 CPU: 0 >>> [183168.740173] EIP is at xnheap_init+0x1cf/0x210 >>> [183168.740173] EAX: 00700bf1 EBX: eed0e210 ECX: eed0e730 EDX: eed0e2fc >>> [183168.740173] ESI: 00000000 EDI: 00000000 EBP: eed27da4 ESP: eed27d7c >>> [183168.740173] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 >>> [183168.740173] Process eve_dal (pid: 2557, ti=eed26000 >>> task=f5773280 task.ti=eed26000) >>> [183168.740173] I-pipe domain Linux >>> [183168.740173] Stack: >>> [183168.740173] eed0e304 00000030 c157d4a9 eed0e210 eed0e2fc >>> eed0e2fc 0000003e 00000000 >>> [183168.740173] f85ea000 eed0e210 eed27dc8 c10c0c44 00001000 >>> 00000000 f85aa000 00040000 >>> [183168.740173] eed0e200 fffffff4 eed0e210 eed27df0 c10cf198 >>> eed27de4 c1058b86 eed27f20 >>> [183168.740173] Call Trace: >>> [183168.740173] [<c10c0c44>] ? xnheap_init_mapped+0xd4/0x210 >>> [183168.740173] [<c10cf198>] ? xnshadow_sys_event+0x68/0x210 >>> [183168.740173] [<c1058b86>] ? commit_creds+0xe6/0x190 >>> [183168.740173] [<c10cefa3>] ? xnshadow_sys_bind+0x293/0x420 >>> [183168.740173] [<c114b22e>] ? __d_lookup+0x12e/0x160 >>> [183168.740173] [<c114bf96>] ? dput+0x66/0x1b0 >>> [183168.740173] [<c1141b9e>] ? path_to_nameidata+0x1e/0x50 >>> [183168.740173] [<c1143e02>] ? link_path_walk+0x422/0x7c0 >>> [183168.740173] [<c11425f5>] ? path_put+0x25/0x30 >>> [183168.740173] [<c109559d>] ? __ipipe_restore_root+0x1d/0x30 >>> [183168.740173] [<c112d537>] ? kmem_cache_free+0xa7/0x100 >>> [183168.740173] [<c11423ea>] ? putname+0x2a/0x40 >>> [183168.740173] [<c1144cfa>] ? user_path_at+0x4a/0x80 >>> [183168.740173] [<c10cd82d>] ? losyscall_event+0xad/0x200 >>> [183168.740173] [<c1095035>] ? __ipipe_dispatch_event+0xb5/0x170 >>> [183168.740173] [<c10cd780>] ? losyscall_event+0x0/0x200 >>> [183168.740173] [<c10166d5>] ? __ipipe_syscall_root+0x45/0xd0 >>> [183168.740173] [<c145526d>] ? system_call+0x2d/0x53 >>> [183168.740173] Code: 24 e8 a6 cc 19 00 fa 8b 0d 28 36 61 c1 0f ba >>> 2d c0 1b 61 c1 00 19 f6 8b 55 e8 83 e6 01 89 8b f0 00 00 00 8b 01 89 83 >>> ec 00 00 00<89> 50 04 31 c0 89 11 8b 15 c0 1b 61 c1 83 05 2c 36 61 c1 01 83 >>> [183168.740173] EIP: [<c10c091f>] xnheap_init+0x1cf/0x210 SS:ESP >>> 0068:eed27d7c >>> [183168.740173] CR2: 0000000000700bf5 >>> >>> As you can see, this happened with kernel 2.6.37, and I built it with >>> Xenomai 2.6.0. The offending instruction was at xnheap_init + 463: >>> >>> 0xc10c090b<xnheap_init+443>: mov -0x18(%ebp),%edx >>> 0xc10c090e<xnheap_init+446>: and $0x1,%esi >>> 0xc10c0911<xnheap_init+449>: mov %ecx,0xf0(%ebx) >>> 0xc10c0917<xnheap_init+455>: mov (%ecx),%eax >>> 0xc10c0919<xnheap_init+457>: mov %eax,0xec(%ebx) >>> 0xc10c091f<xnheap_init+463>: mov %edx,0x4(%eax) >>> 0xc10c0922<xnheap_init+466>: xor %eax,%eax >>> 0xc10c0924<xnheap_init+468>: mov %edx,(%ecx) >>> 0xc10c0926<xnheap_init+470>: mov 0xc1611bc0,%edx >>> 0xc10c092c<xnheap_init+476>: addl $0x1,0xc161362c >>> 0xc10c0933<xnheap_init+483>: addl $0x1,0xc17c83e4 >>> >>> This corresponds to ath(xnholder_t *, xnholder_t *) in >>> include/xenomai/nucleus/queue.h, line 48: >>> >>> 43 static inline void ath(xnholder_t *head, xnholder_t *holder) >>> 44 { >>> 45 /* Inserts the new element right after the heading one */ >>> 46 holder->last = head; >>> 47 holder->next = head->next; >>> 48 holder->next->last = holder; >>> 49 head->next = holder; >>> 50 } >>> >>> It's apparently the call to appendq() at >>> kernel/xenomai/nucleus/heap.c:332 that does this, with a junk pointer >>> dereference. So, heap->stat_link.next is not valid at the time of this >>> call, yet it's initialized by the call to inith() on line 319. I don't >>> know what would have changed that, unless it's a bad pointer elsewhere >>> that caused overwriting of this data. Any ideas where to go from here? >>> >> If the bug is reproducible, two things you can try: >> - enable CONFIG_XENO_OPT_DEBUG_QUEUES >> - enable the I-pipe tracer and panic freezes, you should get a trace >> when the bug happens. >> > Hi Gilles, > > I finally got a bit more information. The crash occurred again today on > my testing hardware, so I installed a kernel with I-pipe trace and queue > debugging and tried to reproduce. I didn't get the same error, and the > kernel didn't oops, but I did get some interesting-looking information > in the log. It looks like something bad was happening with XDDP, but I > can't figure out what. Hopefully the attached log file will get through. Nothing seems obvious from the trace. Also, you may want to increase the trace to 10000 points for instance, in order to have more history. But if the issue is a memory corruption, chances are that it may not be enough. What version of xenomai and the I-pipe patch are you using? -- Gilles. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Xenomai] [Xenomai-help] Debugging oops in xnheap_init 2012-05-31 6:59 ` Gilles Chanteperdrix @ 2012-05-31 7:38 ` Philippe Gerum 2012-06-14 1:50 ` Doug Brunner 0 siblings, 1 reply; 5+ messages in thread From: Philippe Gerum @ 2012-05-31 7:38 UTC (permalink / raw) To: Gilles Chanteperdrix; +Cc: Doug Brunner, xenomai On 05/31/2012 08:59 AM, Gilles Chanteperdrix wrote: > On 05/31/2012 01:31 AM, Doug Brunner wrote: >> On 05/07/2012 06:24 PM, Gilles Chanteperdrix wrote: >>> On 05/08/2012 02:59 AM, Doug Brunner wrote: >>>> I just got an oops from running one of my POSIX skin RT applications: >>>> >>>> [183168.735823] BUG: unable to handle kernel paging request at 00700bf5 >>>> [183168.737436] IP: [<c10c091f>] xnheap_init+0x1cf/0x210 >>>> [183168.738604] *pde = 00000000 >>>> [183168.739406] Oops: 0002 [#1] PREEMPT >>>> [183168.740173] last sysfs file: /sys/devices/virtual/bdi/0:19/uevent >>>> [183168.740173] Modules linked in: e1000 xeno_rtipc lxfb cfbcopyarea >>>> cfbimgblt cfbfillrect binfmt_misc psmouse usbhid serio_raw hid ata_piix >>>> [last unloaded: e1000] >>>> [183168.740173] >>>> [183168.740173] Pid: 2557, comm: eve_dal Not tainted 2.6.37 #1 /Bochs >>>> [183168.740173] EIP: 0060:[<c10c091f>] EFLAGS: 00010246 CPU: 0 >>>> [183168.740173] EIP is at xnheap_init+0x1cf/0x210 >>>> [183168.740173] EAX: 00700bf1 EBX: eed0e210 ECX: eed0e730 EDX: eed0e2fc >>>> [183168.740173] ESI: 00000000 EDI: 00000000 EBP: eed27da4 ESP: eed27d7c >>>> [183168.740173] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 >>>> [183168.740173] Process eve_dal (pid: 2557, ti=eed26000 >>>> task=f5773280 task.ti=eed26000) >>>> [183168.740173] I-pipe domain Linux >>>> [183168.740173] Stack: >>>> [183168.740173] eed0e304 00000030 c157d4a9 eed0e210 eed0e2fc >>>> eed0e2fc 0000003e 00000000 >>>> [183168.740173] f85ea000 eed0e210 eed27dc8 c10c0c44 00001000 >>>> 00000000 f85aa000 00040000 >>>> [183168.740173] eed0e200 fffffff4 eed0e210 eed27df0 c10cf198 >>>> eed27de4 c1058b86 eed27f20 >>>> [183168.740173] Call Trace: >>>> [183168.740173] [<c10c0c44>] ? xnheap_init_mapped+0xd4/0x210 >>>> [183168.740173] [<c10cf198>] ? xnshadow_sys_event+0x68/0x210 >>>> [183168.740173] [<c1058b86>] ? commit_creds+0xe6/0x190 >>>> [183168.740173] [<c10cefa3>] ? xnshadow_sys_bind+0x293/0x420 >>>> [183168.740173] [<c114b22e>] ? __d_lookup+0x12e/0x160 >>>> [183168.740173] [<c114bf96>] ? dput+0x66/0x1b0 >>>> [183168.740173] [<c1141b9e>] ? path_to_nameidata+0x1e/0x50 >>>> [183168.740173] [<c1143e02>] ? link_path_walk+0x422/0x7c0 >>>> [183168.740173] [<c11425f5>] ? path_put+0x25/0x30 >>>> [183168.740173] [<c109559d>] ? __ipipe_restore_root+0x1d/0x30 >>>> [183168.740173] [<c112d537>] ? kmem_cache_free+0xa7/0x100 >>>> [183168.740173] [<c11423ea>] ? putname+0x2a/0x40 >>>> [183168.740173] [<c1144cfa>] ? user_path_at+0x4a/0x80 >>>> [183168.740173] [<c10cd82d>] ? losyscall_event+0xad/0x200 >>>> [183168.740173] [<c1095035>] ? __ipipe_dispatch_event+0xb5/0x170 >>>> [183168.740173] [<c10cd780>] ? losyscall_event+0x0/0x200 >>>> [183168.740173] [<c10166d5>] ? __ipipe_syscall_root+0x45/0xd0 >>>> [183168.740173] [<c145526d>] ? system_call+0x2d/0x53 >>>> [183168.740173] Code: 24 e8 a6 cc 19 00 fa 8b 0d 28 36 61 c1 0f ba >>>> 2d c0 1b 61 c1 00 19 f6 8b 55 e8 83 e6 01 89 8b f0 00 00 00 8b 01 89 83 >>>> ec 00 00 00<89> 50 04 31 c0 89 11 8b 15 c0 1b 61 c1 83 05 2c 36 61 c1 01 83 >>>> [183168.740173] EIP: [<c10c091f>] xnheap_init+0x1cf/0x210 SS:ESP >>>> 0068:eed27d7c >>>> [183168.740173] CR2: 0000000000700bf5 >>>> >>>> As you can see, this happened with kernel 2.6.37, and I built it with >>>> Xenomai 2.6.0. The offending instruction was at xnheap_init + 463: >>>> >>>> 0xc10c090b<xnheap_init+443>: mov -0x18(%ebp),%edx >>>> 0xc10c090e<xnheap_init+446>: and $0x1,%esi >>>> 0xc10c0911<xnheap_init+449>: mov %ecx,0xf0(%ebx) >>>> 0xc10c0917<xnheap_init+455>: mov (%ecx),%eax >>>> 0xc10c0919<xnheap_init+457>: mov %eax,0xec(%ebx) >>>> 0xc10c091f<xnheap_init+463>: mov %edx,0x4(%eax) >>>> 0xc10c0922<xnheap_init+466>: xor %eax,%eax >>>> 0xc10c0924<xnheap_init+468>: mov %edx,(%ecx) >>>> 0xc10c0926<xnheap_init+470>: mov 0xc1611bc0,%edx >>>> 0xc10c092c<xnheap_init+476>: addl $0x1,0xc161362c >>>> 0xc10c0933<xnheap_init+483>: addl $0x1,0xc17c83e4 >>>> >>>> This corresponds to ath(xnholder_t *, xnholder_t *) in >>>> include/xenomai/nucleus/queue.h, line 48: >>>> >>>> 43 static inline void ath(xnholder_t *head, xnholder_t *holder) >>>> 44 { >>>> 45 /* Inserts the new element right after the heading one */ >>>> 46 holder->last = head; >>>> 47 holder->next = head->next; >>>> 48 holder->next->last = holder; >>>> 49 head->next = holder; >>>> 50 } >>>> >>>> It's apparently the call to appendq() at >>>> kernel/xenomai/nucleus/heap.c:332 that does this, with a junk pointer >>>> dereference. So, heap->stat_link.next is not valid at the time of this >>>> call, yet it's initialized by the call to inith() on line 319. I don't >>>> know what would have changed that, unless it's a bad pointer elsewhere >>>> that caused overwriting of this data. Any ideas where to go from here? >>>> >>> If the bug is reproducible, two things you can try: >>> - enable CONFIG_XENO_OPT_DEBUG_QUEUES >>> - enable the I-pipe tracer and panic freezes, you should get a trace >>> when the bug happens. >>> >> Hi Gilles, >> >> I finally got a bit more information. The crash occurred again today on >> my testing hardware, so I installed a kernel with I-pipe trace and queue >> debugging and tried to reproduce. I didn't get the same error, and the >> kernel didn't oops, but I did get some interesting-looking information >> in the log. It looks like something bad was happening with XDDP, but I >> can't figure out what. Hopefully the attached log file will get through. > > Nothing seems obvious from the trace. Also, you may want to increase the > trace to 10000 points for instance, in order to have more history. But > if the issue is a memory corruption, chances are that it may not be enough. > nklock is leaking in the traces, we should not be holding the superlock on entry to xnbufd_copy_to_kmem. Doug, please apply the patch below, with CONFIG_XENO_OPT_DEBUG_XNLOCK enabled. This may tell us which routine on the current code path grabbed that lock. diff --git a/ksrc/nucleus/bufd.c b/ksrc/nucleus/bufd.c index c482b9d..8fd2ec5 100644 --- a/ksrc/nucleus/bufd.c +++ b/ksrc/nucleus/bufd.c @@ -371,6 +371,9 @@ ssize_t xnbufd_copy_to_kmem(void *to, struct xnbufd *bufd, size_t len) if (xnpod_userspace_p() && !xnpod_asynch_p() && current->mm == bufd->b_mm) { + if (xnlock_is_owner(&nklock)) + printk(KERN_ERR "Xenomai: nklock held: %s:%u (%s())\n", + nklock.file, nklock.line, nklock.function); XENO_BUGON(NUCLEUS, xnlock_is_owner(&nklock) || spltest()); if (__xn_safe_copy_from_user(to, (void __user *)from, len)) return -EFAULT; -- Philippe. ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [Xenomai] [Xenomai-help] Debugging oops in xnheap_init 2012-05-31 7:38 ` Philippe Gerum @ 2012-06-14 1:50 ` Doug Brunner 2012-06-19 1:32 ` Doug Brunner 0 siblings, 1 reply; 5+ messages in thread From: Doug Brunner @ 2012-06-14 1:50 UTC (permalink / raw) To: xenomai On 05/31/2012 12:38 AM, Philippe Gerum wrote: > On 05/31/2012 08:59 AM, Gilles Chanteperdrix wrote: >> On 05/31/2012 01:31 AM, Doug Brunner wrote: >>> On 05/07/2012 06:24 PM, Gilles Chanteperdrix wrote: >>>> On 05/08/2012 02:59 AM, Doug Brunner wrote: >>>>> I just got an oops from running one of my POSIX skin RT applications: >>>>> >>>>> [183168.735823] BUG: unable to handle kernel paging request at 00700bf5 >>>>> [183168.737436] IP: [<c10c091f>] xnheap_init+0x1cf/0x210 >>>>> [183168.738604] *pde = 00000000 >>>>> [183168.739406] Oops: 0002 [#1] PREEMPT >>>>> [183168.740173] last sysfs file: /sys/devices/virtual/bdi/0:19/uevent >>>>> [183168.740173] Modules linked in: e1000 xeno_rtipc lxfb cfbcopyarea >>>>> cfbimgblt cfbfillrect binfmt_misc psmouse usbhid serio_raw hid ata_piix >>>>> [last unloaded: e1000] >>>>> [183168.740173] >>>>> [183168.740173] Pid: 2557, comm: eve_dal Not tainted 2.6.37 #1 /Bochs >>>>> [183168.740173] EIP: 0060:[<c10c091f>] EFLAGS: 00010246 CPU: 0 >>>>> [183168.740173] EIP is at xnheap_init+0x1cf/0x210 >>>>> [183168.740173] EAX: 00700bf1 EBX: eed0e210 ECX: eed0e730 EDX: eed0e2fc >>>>> [183168.740173] ESI: 00000000 EDI: 00000000 EBP: eed27da4 ESP: eed27d7c >>>>> [183168.740173] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 >>>>> [183168.740173] Process eve_dal (pid: 2557, ti=eed26000 >>>>> task=f5773280 task.ti=eed26000) >>>>> [183168.740173] I-pipe domain Linux >>>>> [183168.740173] Stack: >>>>> [183168.740173] eed0e304 00000030 c157d4a9 eed0e210 eed0e2fc >>>>> eed0e2fc 0000003e 00000000 >>>>> [183168.740173] f85ea000 eed0e210 eed27dc8 c10c0c44 00001000 >>>>> 00000000 f85aa000 00040000 >>>>> [183168.740173] eed0e200 fffffff4 eed0e210 eed27df0 c10cf198 >>>>> eed27de4 c1058b86 eed27f20 >>>>> [183168.740173] Call Trace: >>>>> [183168.740173] [<c10c0c44>] ? xnheap_init_mapped+0xd4/0x210 >>>>> [183168.740173] [<c10cf198>] ? xnshadow_sys_event+0x68/0x210 >>>>> [183168.740173] [<c1058b86>] ? commit_creds+0xe6/0x190 >>>>> [183168.740173] [<c10cefa3>] ? xnshadow_sys_bind+0x293/0x420 >>>>> [183168.740173] [<c114b22e>] ? __d_lookup+0x12e/0x160 >>>>> [183168.740173] [<c114bf96>] ? dput+0x66/0x1b0 >>>>> [183168.740173] [<c1141b9e>] ? path_to_nameidata+0x1e/0x50 >>>>> [183168.740173] [<c1143e02>] ? link_path_walk+0x422/0x7c0 >>>>> [183168.740173] [<c11425f5>] ? path_put+0x25/0x30 >>>>> [183168.740173] [<c109559d>] ? __ipipe_restore_root+0x1d/0x30 >>>>> [183168.740173] [<c112d537>] ? kmem_cache_free+0xa7/0x100 >>>>> [183168.740173] [<c11423ea>] ? putname+0x2a/0x40 >>>>> [183168.740173] [<c1144cfa>] ? user_path_at+0x4a/0x80 >>>>> [183168.740173] [<c10cd82d>] ? losyscall_event+0xad/0x200 >>>>> [183168.740173] [<c1095035>] ? __ipipe_dispatch_event+0xb5/0x170 >>>>> [183168.740173] [<c10cd780>] ? losyscall_event+0x0/0x200 >>>>> [183168.740173] [<c10166d5>] ? __ipipe_syscall_root+0x45/0xd0 >>>>> [183168.740173] [<c145526d>] ? system_call+0x2d/0x53 >>>>> [183168.740173] Code: 24 e8 a6 cc 19 00 fa 8b 0d 28 36 61 c1 0f ba >>>>> 2d c0 1b 61 c1 00 19 f6 8b 55 e8 83 e6 01 89 8b f0 00 00 00 8b 01 89 83 >>>>> ec 00 00 00<89> 50 04 31 c0 89 11 8b 15 c0 1b 61 c1 83 05 2c 36 61 c1 01 83 >>>>> [183168.740173] EIP: [<c10c091f>] xnheap_init+0x1cf/0x210 SS:ESP >>>>> 0068:eed27d7c >>>>> [183168.740173] CR2: 0000000000700bf5 >>>>> >>>>> As you can see, this happened with kernel 2.6.37, and I built it with >>>>> Xenomai 2.6.0. The offending instruction was at xnheap_init + 463: >>>>> >>>>> 0xc10c090b<xnheap_init+443>: mov -0x18(%ebp),%edx >>>>> 0xc10c090e<xnheap_init+446>: and $0x1,%esi >>>>> 0xc10c0911<xnheap_init+449>: mov %ecx,0xf0(%ebx) >>>>> 0xc10c0917<xnheap_init+455>: mov (%ecx),%eax >>>>> 0xc10c0919<xnheap_init+457>: mov %eax,0xec(%ebx) >>>>> 0xc10c091f<xnheap_init+463>: mov %edx,0x4(%eax) >>>>> 0xc10c0922<xnheap_init+466>: xor %eax,%eax >>>>> 0xc10c0924<xnheap_init+468>: mov %edx,(%ecx) >>>>> 0xc10c0926<xnheap_init+470>: mov 0xc1611bc0,%edx >>>>> 0xc10c092c<xnheap_init+476>: addl $0x1,0xc161362c >>>>> 0xc10c0933<xnheap_init+483>: addl $0x1,0xc17c83e4 >>>>> >>>>> This corresponds to ath(xnholder_t *, xnholder_t *) in >>>>> include/xenomai/nucleus/queue.h, line 48: >>>>> >>>>> 43 static inline void ath(xnholder_t *head, xnholder_t *holder) >>>>> 44 { >>>>> 45 /* Inserts the new element right after the heading one */ >>>>> 46 holder->last = head; >>>>> 47 holder->next = head->next; >>>>> 48 holder->next->last = holder; >>>>> 49 head->next = holder; >>>>> 50 } >>>>> >>>>> It's apparently the call to appendq() at >>>>> kernel/xenomai/nucleus/heap.c:332 that does this, with a junk pointer >>>>> dereference. So, heap->stat_link.next is not valid at the time of this >>>>> call, yet it's initialized by the call to inith() on line 319. I don't >>>>> know what would have changed that, unless it's a bad pointer elsewhere >>>>> that caused overwriting of this data. Any ideas where to go from here? >>>>> >>>> If the bug is reproducible, two things you can try: >>>> - enable CONFIG_XENO_OPT_DEBUG_QUEUES >>>> - enable the I-pipe tracer and panic freezes, you should get a trace >>>> when the bug happens. >>>> >>> Hi Gilles, >>> >>> I finally got a bit more information. The crash occurred again today on >>> my testing hardware, so I installed a kernel with I-pipe trace and queue >>> debugging and tried to reproduce. I didn't get the same error, and the >>> kernel didn't oops, but I did get some interesting-looking information >>> in the log. It looks like something bad was happening with XDDP, but I >>> can't figure out what. Hopefully the attached log file will get through. >> Nothing seems obvious from the trace. Also, you may want to increase the >> trace to 10000 points for instance, in order to have more history. But >> if the issue is a memory corruption, chances are that it may not be enough. >> > nklock is leaking in the traces, we should not be holding the superlock on > entry to xnbufd_copy_to_kmem. Doug, please apply the patch below, with > CONFIG_XENO_OPT_DEBUG_XNLOCK enabled. This may tell us which routine on the > current code path grabbed that lock. > > diff --git a/ksrc/nucleus/bufd.c b/ksrc/nucleus/bufd.c > index c482b9d..8fd2ec5 100644 > --- a/ksrc/nucleus/bufd.c > +++ b/ksrc/nucleus/bufd.c > @@ -371,6 +371,9 @@ ssize_t xnbufd_copy_to_kmem(void *to, struct xnbufd *bufd, size_t len) > > if (xnpod_userspace_p()&& !xnpod_asynch_p()&& > current->mm == bufd->b_mm) { > + if (xnlock_is_owner(&nklock)) > + printk(KERN_ERR "Xenomai: nklock held: %s:%u (%s())\n", > + nklock.file, nklock.line, nklock.function); > XENO_BUGON(NUCLEUS, xnlock_is_owner(&nklock) || spltest()); > if (__xn_safe_copy_from_user(to, (void __user *)from, len)) > return -EFAULT; More interesting results (sorry about the delay, my testing hardware was in pieces until this morning). I haven't been able to reproduce the second bug (the one which prompted Philippe to suggest the above diagnostic patch), but I was able to trigger something similar to the first bug. I also have figured out an apparently reliable way to reproduce it. I noticed a discussion on the list recently about the inadvisability of enabling KGDB, so I looked in my kernel configuration and found that it was turned on. I first tried triggering the bug with the kernel that had KGDB enabled; this produced the log attached as eve-log-20120613-01. EIP pointed at kernel/xenomai/nucleus/heap.c:332 (interestingly, not inside the appendq() call that was a problem before). The system was still somewhat responsive, but crippled; disk writes no longer seemed to work. Next, I turned off KGDB, and still got the crash, but a much shorter log, attached as eve-log-20120613-02. The system was completely locked up after this log was output. The scenario that triggers this is: 1. RT task creates XDDP socket and binds it to minor 0 2. Linux process opens /dev/rtp0 3. RT task is killed; Linux process appears to stay blocked on read() of /dev/rtp0 4. New RT task attempts to create XDDP socket and bind it to minor 0 I'm going to try to refine this into a small test case. Gilles, the kernel is 2.6.37, patched with adeos-ipipe-2.6.37.6-x86-2.9-02.patch. Xenomai version is 2.6.0 (since that is the latest stable version shown on the wiki). Thanks, --Doug Brunner -------------- next part -------------- [ 262.198260] I-pipe: Detected stalled topmost domain, probably caused by a bug. [ 262.198284] A critical section may have been left unterminated. [ 262.200668] Pid: 1364, comm: eve_dal Not tainted 2.6.37ipipedebug #11 [ 262.200668] Call Trace: [ 262.200668] [<c108b88f>] ipipe_check_context+0x8f/0xb0 [ 262.200668] [<c1003483>] ? ftrace_call+0x5/0x8 [ 262.200668] [<c101710d>] __ipipe_handle_exception+0x1dd/0x270 [ 262.200668] [<c10ba14f>] ? __xnlock_get_irqsave+0xf/0xc0 [ 262.200668] [<c14aa665>] error_code+0x5d/0x6c [ 262.200668] [<c10ba7cc>] ? xnheap_init+0x35c/0x530 [ 262.200668] [<fa82f576>] xddp_ioctl+0x3d6/0xaf0 [xeno_rtipc] [ 262.200668] [<fa82e11c>] ? rtipc_ioctl+0xc/0x30 [xeno_rtipc] [ 262.200668] [<c108efba>] ? ipipe_trace_function+0x2a/0x30 [ 262.200668] [<fa82e132>] rtipc_ioctl+0x22/0x30 [xeno_rtipc] [ 262.200668] [<c113f114>] __rt_dev_ioctl+0xd4/0xf0 [ 262.200668] [<c108efba>] ? ipipe_trace_function+0x2a/0x30 [ 262.200668] [<c114442d>] sys_rtdm_ioctl+0x2d/0x30 [ 262.200668] [<c10dd7ad>] losyscall_event+0xad/0x200 [ 262.200668] [<c108c5fe>] __ipipe_dispatch_event+0x8e/0x200 [ 262.200668] [<c10dd700>] ? losyscall_event+0x0/0x200 [ 262.200668] [<c1016bd3>] __ipipe_syscall_root+0x43/0x100 [ 262.200668] [<c1002e78>] sysenter_past_esp+0x55/0x6c [ 262.200668] I-pipe tracer log (100 points): [ 262.200668] | *+func 0 ipipe_trace_panic_freeze+0x4 (ipipe_check_context+0x4c) [ 262.200668] | *+func -1 ipipe_check_context+0x7 (__ipipe_handle_exception+0x1dd) [ 262.200668] | *+func -2 __ipipe_handle_exception+0x9 (error_code+0x5d) [ 262.200668] | *+func -4 __xnlock_get_irqsave+0xf (xnheap_init+0x340) [ 262.200668] | +begin 0x80000000 -6 __xnlock_get_irqsave+0xa3 (xnheap_init+0x322) [ 262.200668] +func -6 __xnlock_get_irqsave+0xf (xnheap_init+0x322) [ 262.200668] +func -13 memcpy+0x11 (vsnprintf+0x2da) [ 262.200668] | +end 0x80000000 -16 __ipipe_restore_pipeline_head+0x52 (xnlock_put_irqrestore+0xb0) [ 262.200668] | *+func -17 __ipipe_restore_pipeline_head+0x6 (xnlock_put_irqrestore+0xb0) [ 262.200668] | *+func -18 xnlock_put_irqrestore+0x8 (xnheap_init+0x2d0) [ 262.200668] | +begin 0x80000000 -19 __xnlock_get_irqsave+0xa3 (xnheap_init+0x29f) [ 262.200668] +func -20 __xnlock_get_irqsave+0xf (xnheap_init+0x29f) [ 262.200668] | +end 0x80000000 -20 __ipipe_restore_pipeline_head+0x52 (xnlock_put_irqrestore+0xb0) [ 262.200668] | *+func -21 __ipipe_restore_pipeline_head+0x6 (xnlock_put_irqrestore+0xb0) [ 262.200668] | *+func -23 xnlock_put_irqrestore+0x8 (xnheap_init+0x286) [ 262.200668] | +begin 0x80000000 -24 __xnlock_get_irqsave+0xa3 (xnheap_init+0x20e) [ 262.200668] +func -25 __xnlock_get_irqsave+0xf (xnheap_init+0x20e) [ 262.200668] +func -470 xnheap_init+0xf (xddp_ioctl+0x3d6 [xeno_rtipc]) [ 262.200668] +func -472 ipipe_check_context+0x7 (sub_preempt_count+0x15) [ 262.200668] | +end 0x80000000 -473 __ipipe_unstall_root+0x45 (__ipipe_restore_root+0x2d) [ 262.200668] | #func -474 ipipe_check_context+0x7 (__ipipe_unstall_root+0x19) [ 262.200668] | #begin 0x80000000 -475 __ipipe_unstall_root+0x53 (__ipipe_restore_root+0x2d) [ 262.200668] #func -475 __ipipe_unstall_root+0x3 (__ipipe_restore_root+0x2d) [ 262.200668] #func -476 ipipe_check_context+0x7 (__ipipe_restore_root+0x15) [ 262.200668] #func -477 __ipipe_restore_root+0x4 (__ipipe_pin_range_globally+0x14f) [ 262.200668] #func -478 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -479 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -480 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -481 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -482 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -483 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -484 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -485 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -486 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -487 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -488 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -489 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -490 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -491 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -492 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -493 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -494 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -495 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -496 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -497 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -498 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -499 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -500 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -501 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -502 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -503 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -504 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -505 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -506 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -507 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -508 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -509 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -510 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -511 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -512 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -513 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -514 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -515 ipipe_check_context+0x7 (add_preempt_count+0x15) [ 262.200668] +func -516 ipipe_check_context+0x7 (__ipipe_pin_range_globally+0x59) [ 262.200668] +func -517 ipipe_check_context+0x7 (sub_preempt_count+0x15) [ 262.200668] | +end 0x80000000 -518 __ipipe_unstall_root+0x45 (__ipipe_restore_root+0x2d) [ 262.200668] | #func -518 ipipe_check_context+0x7 (__ipipe_unstall_root+0x19) [ 262.200668] | #begin 0x80000000 -519 __ipipe_unstall_root+0x53 (__ipipe_restore_root+0x2d) [ 262.200668] #func -520 __ipipe_unstall_root+0x3 (__ipipe_restore_root+0x2d) [ 262.200668] #func -521 ipipe_check_context+0x7 (__ipipe_restore_root+0x15) [ 262.200668] #func -521 __ipipe_restore_root+0x4 (__ipipe_pin_range_globally+0x14f) [ 262.200668] #func -523 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -524 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -525 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -526 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -527 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -528 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -529 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -531 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -532 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -533 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -534 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -535 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -536 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -537 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -539 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -540 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -541 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -542 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -543 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -544 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -545 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -547 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -548 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -549 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -550 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -551 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -552 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -553 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] #func -554 page_address+0x9 (__ipipe_pin_range_globally+0xe7) [ 262.200668] BUG: unable to handle kernel NULL pointer dereference at 00000004 [ 262.200668] IP: [<c10ba7cc>] xnheap_init+0x35c/0x530 [ 262.200668] *pde = 00000000 [ 262.200668] Oops: 0000 [#1] PREEMPT [ 262.200668] last sysfs file: /sys/bus/pnp/drivers/xeno_16550A/uevent [ 262.200668] Modules linked in: xeno_rtipc xeno_16550A xeno_can_isa xeno_can_sja1000 lxfb cfbcopyarea cfbimgblt cfbfillrect binfmt_misc snd_cs5535audio snd_ac97_codec ac97_bus snd_pcm usbhid snd_timer e100 hid snd soundcore snd_page_alloc cs5535_mfgpt serio_raw cs5535_gpio pata_cs5536 [ 262.200668] [ 262.200668] Pid: 1364, comm: eve_dal Not tainted 2.6.37ipipedebug #11 / [ 262.200668] EIP: 0060:[<c10ba7cc>] EFLAGS: 00010293 CPU: 0 [ 262.200668] EIP is at xnheap_init+0x35c/0x530 [ 262.200668] EAX: 00000001 EBX: f5023644 ECX: 00000000 EDX: c14bc3f5 [ 262.200668] ESI: 00000002 EDI: 00000005 EBP: f5053e3c ESP: f5053e0c [ 262.200668] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 [ 262.200668] Process eve_dal (pid: 1364, ti=f5052000 task=f50ebf20 task.ti=f5052000) [ 262.200668] I-pipe domain Linux [ 262.200668] Stack: [ 262.200668] c14bc3f5 00000030 c15d4f93 f5023644 00000000 f503aae0 f5023674 00000000 [ 262.200668] f5023778 00000000 f503aae0 f5023600 f5053ee0 fa82f576 00000200 00000054 [ 262.200668] 00000000 00000082 00000082 00000000 f6802500 00000282 f5023644 faccf000 [ 262.200668] Call Trace: [ 262.200668] [<fa82f576>] ? xddp_ioctl+0x3d6/0xaf0 [xeno_rtipc] [ 262.200668] [<fa82e11c>] ? rtipc_ioctl+0xc/0x30 [xeno_rtipc] [ 262.200668] [<c108efba>] ? ipipe_trace_function+0x2a/0x30 [ 262.200668] [<fa82e132>] ? rtipc_ioctl+0x22/0x30 [xeno_rtipc] [ 262.200668] [<c113f114>] ? __rt_dev_ioctl+0xd4/0xf0 [ 262.200668] [<c108efba>] ? ipipe_trace_function+0x2a/0x30 [ 262.200668] [<c114442d>] ? sys_rtdm_ioctl+0x2d/0x30 [ 262.200668] [<c10dd7ad>] ? losyscall_event+0xad/0x200 [ 262.200668] [<c108c5fe>] ? __ipipe_dispatch_event+0x8e/0x200 [ 262.200668] [<c10dd700>] ? losyscall_event+0x0/0x200 [ 262.200668] [<c1016bd3>] ? __ipipe_syscall_root+0x43/0x100 [ 262.200668] [<c1002e78>] ? sysenter_past_esp+0x55/0x6c [ 262.200668] Code: b5 7e c1 c7 04 24 f5 c3 4b c1 e8 90 f9 ff ff 8b 0d a4 b5 7e c1 8b 3d a8 b5 7e c1 81 f9 a0 b5 7e c1 0f 84 92 00 00 00 39 fe 7d 46 <8b> 49 04 83 c6 01 eb e8 e8 27 fa ff ff 8b 4d ec c7 44 24 10 47 [ 262.200668] EIP: [<c10ba7cc>] xnheap_init+0x35c/0x530 SS:ESP 0068:f5053e0c [ 262.200668] CR2: 0000000000000004 [ 262.200668] ---[ end trace 42427ce7f2d030b2 ]--- -------------- next part -------------- [ 255.185332] Xenomai: corrupted queue, qslot->elems=5/5, qslot=c17eb5a0 at kernel/xenomai/nucleus/heap.c:332 [ 255.189731] [ 255.189731] CPU PID PRI TIMEOUT STAT NAME [ 255.189731] > 0 0 -1 0 00500080 ROOT [ 255.189731] 0 1351 0 0 00b00380 eve_dal [ 255.189731] 0 1353 99 75327489 00300184 eve_dal [ 255.189731] 0 1354 99 54990725 00300184 eve_dal [ 255.189731] 0 1355 99 205020388 00300184 eve_dal [ 255.189731] 0 1356 99 355059396 00300184 eve_dal [ 255.189731] Master time base: clock=132791887205 [ 255.189731] I-pipe: Detected stalled topmost domain, probably caused by a bug. [ 255.189731] A critical section may have been left unterminated. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [Xenomai] [Xenomai-help] Debugging oops in xnheap_init 2012-06-14 1:50 ` Doug Brunner @ 2012-06-19 1:32 ` Doug Brunner 0 siblings, 0 replies; 5+ messages in thread From: Doug Brunner @ 2012-06-19 1:32 UTC (permalink / raw) To: xenomai On 06/13/2012 06:50 PM, Doug Brunner wrote: > On 05/31/2012 12:38 AM, Philippe Gerum wrote: >> On 05/31/2012 08:59 AM, Gilles Chanteperdrix wrote: >>> On 05/31/2012 01:31 AM, Doug Brunner wrote: >>>> On 05/07/2012 06:24 PM, Gilles Chanteperdrix wrote: >>>>> On 05/08/2012 02:59 AM, Doug Brunner wrote: >>>>>> I just got an oops from running one of my POSIX skin RT >>>>>> applications: >>>>>> >>>>>> [183168.735823] BUG: unable to handle kernel paging >>>>>> request at 00700bf5 >>>>>> [183168.737436] IP: [<c10c091f>] xnheap_init+0x1cf/0x210 >>>>>> [183168.738604] *pde = 00000000 >>>>>> [183168.739406] Oops: 0002 [#1] PREEMPT >>>>>> [183168.740173] last sysfs file: >>>>>> /sys/devices/virtual/bdi/0:19/uevent >>>>>> [183168.740173] Modules linked in: e1000 xeno_rtipc lxfb >>>>>> cfbcopyarea >>>>>> cfbimgblt cfbfillrect binfmt_misc psmouse usbhid serio_raw hid >>>>>> ata_piix >>>>>> [last unloaded: e1000] >>>>>> [183168.740173] >>>>>> [183168.740173] Pid: 2557, comm: eve_dal Not tainted >>>>>> 2.6.37 #1 /Bochs >>>>>> [183168.740173] EIP: 0060:[<c10c091f>] EFLAGS: 00010246 >>>>>> CPU: 0 >>>>>> [183168.740173] EIP is at xnheap_init+0x1cf/0x210 >>>>>> [183168.740173] EAX: 00700bf1 EBX: eed0e210 ECX: eed0e730 >>>>>> EDX: eed0e2fc >>>>>> [183168.740173] ESI: 00000000 EDI: 00000000 EBP: eed27da4 >>>>>> ESP: eed27d7c >>>>>> [183168.740173] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 >>>>>> [183168.740173] Process eve_dal (pid: 2557, ti=eed26000 >>>>>> task=f5773280 task.ti=eed26000) >>>>>> [183168.740173] I-pipe domain Linux >>>>>> [183168.740173] Stack: >>>>>> [183168.740173] eed0e304 00000030 c157d4a9 eed0e210 eed0e2fc >>>>>> eed0e2fc 0000003e 00000000 >>>>>> [183168.740173] f85ea000 eed0e210 eed27dc8 c10c0c44 00001000 >>>>>> 00000000 f85aa000 00040000 >>>>>> [183168.740173] eed0e200 fffffff4 eed0e210 eed27df0 c10cf198 >>>>>> eed27de4 c1058b86 eed27f20 >>>>>> [183168.740173] Call Trace: >>>>>> [183168.740173] [<c10c0c44>] ? xnheap_init_mapped+0xd4/0x210 >>>>>> [183168.740173] [<c10cf198>] ? xnshadow_sys_event+0x68/0x210 >>>>>> [183168.740173] [<c1058b86>] ? commit_creds+0xe6/0x190 >>>>>> [183168.740173] [<c10cefa3>] ? xnshadow_sys_bind+0x293/0x420 >>>>>> [183168.740173] [<c114b22e>] ? __d_lookup+0x12e/0x160 >>>>>> [183168.740173] [<c114bf96>] ? dput+0x66/0x1b0 >>>>>> [183168.740173] [<c1141b9e>] ? path_to_nameidata+0x1e/0x50 >>>>>> [183168.740173] [<c1143e02>] ? link_path_walk+0x422/0x7c0 >>>>>> [183168.740173] [<c11425f5>] ? path_put+0x25/0x30 >>>>>> [183168.740173] [<c109559d>] ? >>>>>> __ipipe_restore_root+0x1d/0x30 >>>>>> [183168.740173] [<c112d537>] ? kmem_cache_free+0xa7/0x100 >>>>>> [183168.740173] [<c11423ea>] ? putname+0x2a/0x40 >>>>>> [183168.740173] [<c1144cfa>] ? user_path_at+0x4a/0x80 >>>>>> [183168.740173] [<c10cd82d>] ? losyscall_event+0xad/0x200 >>>>>> [183168.740173] [<c1095035>] ? >>>>>> __ipipe_dispatch_event+0xb5/0x170 >>>>>> [183168.740173] [<c10cd780>] ? losyscall_event+0x0/0x200 >>>>>> [183168.740173] [<c10166d5>] ? >>>>>> __ipipe_syscall_root+0x45/0xd0 >>>>>> [183168.740173] [<c145526d>] ? system_call+0x2d/0x53 >>>>>> [183168.740173] Code: 24 e8 a6 cc 19 00 fa 8b 0d 28 36 61 >>>>>> c1 0f ba >>>>>> 2d c0 1b 61 c1 00 19 f6 8b 55 e8 83 e6 01 89 8b f0 00 00 00 8b 01 >>>>>> 89 83 >>>>>> ec 00 00 00<89> 50 04 31 c0 89 11 8b 15 c0 1b 61 c1 83 05 2c >>>>>> 36 61 c1 01 83 >>>>>> [183168.740173] EIP: [<c10c091f>] xnheap_init+0x1cf/0x210 >>>>>> SS:ESP >>>>>> 0068:eed27d7c >>>>>> [183168.740173] CR2: 0000000000700bf5 >>>>>> >>>>>> As you can see, this happened with kernel 2.6.37, and I built it >>>>>> with >>>>>> Xenomai 2.6.0. The offending instruction was at xnheap_init + 463: >>>>>> >>>>>> 0xc10c090b<xnheap_init+443>: mov -0x18(%ebp),%edx >>>>>> 0xc10c090e<xnheap_init+446>: and $0x1,%esi >>>>>> 0xc10c0911<xnheap_init+449>: mov %ecx,0xf0(%ebx) >>>>>> 0xc10c0917<xnheap_init+455>: mov (%ecx),%eax >>>>>> 0xc10c0919<xnheap_init+457>: mov %eax,0xec(%ebx) >>>>>> 0xc10c091f<xnheap_init+463>: mov %edx,0x4(%eax) >>>>>> 0xc10c0922<xnheap_init+466>: xor %eax,%eax >>>>>> 0xc10c0924<xnheap_init+468>: mov %edx,(%ecx) >>>>>> 0xc10c0926<xnheap_init+470>: mov 0xc1611bc0,%edx >>>>>> 0xc10c092c<xnheap_init+476>: addl $0x1,0xc161362c >>>>>> 0xc10c0933<xnheap_init+483>: addl $0x1,0xc17c83e4 >>>>>> >>>>>> This corresponds to ath(xnholder_t *, xnholder_t *) in >>>>>> include/xenomai/nucleus/queue.h, line 48: >>>>>> >>>>>> 43 static inline void ath(xnholder_t *head, xnholder_t >>>>>> *holder) >>>>>> 44 { >>>>>> 45 /* Inserts the new element right after the >>>>>> heading one */ >>>>>> 46 holder->last = head; >>>>>> 47 holder->next = head->next; >>>>>> 48 holder->next->last = holder; >>>>>> 49 head->next = holder; >>>>>> 50 } >>>>>> >>>>>> It's apparently the call to appendq() at >>>>>> kernel/xenomai/nucleus/heap.c:332 that does this, with a junk >>>>>> pointer >>>>>> dereference. So, heap->stat_link.next is not valid at the time of >>>>>> this >>>>>> call, yet it's initialized by the call to inith() on line 319. I >>>>>> don't >>>>>> know what would have changed that, unless it's a bad pointer >>>>>> elsewhere >>>>>> that caused overwriting of this data. Any ideas where to go from >>>>>> here? >>>>>> >>>>> If the bug is reproducible, two things you can try: >>>>> - enable CONFIG_XENO_OPT_DEBUG_QUEUES >>>>> - enable the I-pipe tracer and panic freezes, you should get a trace >>>>> when the bug happens. >>>>> >>>> Hi Gilles, >>>> >>>> I finally got a bit more information. The crash occurred again >>>> today on >>>> my testing hardware, so I installed a kernel with I-pipe trace and >>>> queue >>>> debugging and tried to reproduce. I didn't get the same error, and the >>>> kernel didn't oops, but I did get some interesting-looking information >>>> in the log. It looks like something bad was happening with XDDP, but I >>>> can't figure out what. Hopefully the attached log file will get >>>> through. >>> Nothing seems obvious from the trace. Also, you may want to increase >>> the >>> trace to 10000 points for instance, in order to have more history. But >>> if the issue is a memory corruption, chances are that it may not be >>> enough. >>> >> nklock is leaking in the traces, we should not be holding the >> superlock on >> entry to xnbufd_copy_to_kmem. Doug, please apply the patch below, with >> CONFIG_XENO_OPT_DEBUG_XNLOCK enabled. This may tell us which routine >> on the >> current code path grabbed that lock. >> >> diff --git a/ksrc/nucleus/bufd.c b/ksrc/nucleus/bufd.c >> index c482b9d..8fd2ec5 100644 >> --- a/ksrc/nucleus/bufd.c >> +++ b/ksrc/nucleus/bufd.c >> @@ -371,6 +371,9 @@ ssize_t xnbufd_copy_to_kmem(void *to, struct >> xnbufd *bufd, size_t len) >> >> if (xnpod_userspace_p()&& !xnpod_asynch_p()&& >> current->mm == bufd->b_mm) { >> + if (xnlock_is_owner(&nklock)) >> + printk(KERN_ERR "Xenomai: nklock held: %s:%u (%s())\n", >> + nklock.file, nklock.line, nklock.function); >> XENO_BUGON(NUCLEUS, xnlock_is_owner(&nklock) || spltest()); >> if (__xn_safe_copy_from_user(to, (void __user *)from, len)) >> return -EFAULT; > More interesting results (sorry about the delay, my testing hardware > was in pieces until this morning). I haven't been able to reproduce > the second bug (the one which prompted Philippe to suggest the above > diagnostic patch), but I was able to trigger something similar to the > first bug. I also have figured out an apparently reliable way to > reproduce it. > > I noticed a discussion on the list recently about the inadvisability > of enabling KGDB, so I looked in my kernel configuration and found > that it was turned on. I first tried triggering the bug with the > kernel that had KGDB enabled; this produced the log attached as > eve-log-20120613-01. EIP pointed at kernel/xenomai/nucleus/heap.c:332 > (interestingly, not inside the appendq() call that was a problem > before). The system was still somewhat responsive, but crippled; disk > writes no longer seemed to work. Next, I turned off KGDB, and still > got the crash, but a much shorter log, attached as > eve-log-20120613-02. The system was completely locked up after this > log was output. > > The scenario that triggers this is: > 1. RT task creates XDDP socket and binds it to minor 0 > 2. Linux process opens /dev/rtp0 > 3. RT task is killed; Linux process appears to stay blocked on read() > of /dev/rtp0 > 4. New RT task attempts to create XDDP socket and bind it to minor 0 > I'm going to try to refine this into a small test case. > > Gilles, the kernel is 2.6.37, patched with > adeos-ipipe-2.6.37.6-x86-2.9-02.patch. Xenomai version is 2.6.0 (since > that is the latest stable version shown on the wiki). > > Thanks, > --Doug Brunner I've now got a test case working. This program is derived from xddp-echo.c. It will (fairly reliably) crash either my production kernel build or the I-pipe debug version. I attached the kernel config for the latter in case it's useful. It turns out that the first RT task must close the socket to trigger the bug. Otherwise, the second RT task just gets an error when it tries to bind to the same address. Thanks, --Doug Brunner -------------- next part -------------- A non-text attachment was scrubbed... Name: config-ipipedebug.gz Type: application/x-gzip Size: 18464 bytes Desc: not available URL: <http://www.xenomai.org/pipermail/xenomai/attachments/20120618/bcb0f8ca/attachment.bin> -------------- next part -------------- A non-text attachment was scrubbed... Name: xddp-trycrash.c Type: text/x-csrc Size: 6234 bytes Desc: not available URL: <http://www.xenomai.org/pipermail/xenomai/attachments/20120618/bcb0f8ca/attachment.c> ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2012-06-19 1:32 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <4FA87007.1040105@ebus.com>
[not found] ` <4FA875BF.7040103@xenomai.org>
2012-05-30 23:31 ` [Xenomai] [Xenomai-help] Debugging oops in xnheap_init Doug Brunner
2012-05-31 6:59 ` Gilles Chanteperdrix
2012-05-31 7:38 ` Philippe Gerum
2012-06-14 1:50 ` Doug Brunner
2012-06-19 1:32 ` Doug Brunner
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.