* Exception #14 in kernel space when using rttcp
From: Per Oberg @ 2020-06-12 15:26 UTC
To: xenomai

Hi list

I get a massive number of "switching ... to secondary mode after exception #14 in kernel-space ..." messages, followed by a WARNING as shown below.

Can someone enlighten me regarding the meaning of exception #14?

Is the "WARNING: CPU: 0 ..." the cause or the symptom? It comes from a macro in fd.c: "XENO_WARN_ON(COBALT, fd->refs <= 0);"

This is for Xenomai 3.1.

[ 133.458856] [Xenomai] switching RTTest to secondary mode after exception #14 in kernel-space at 0xffffffff8145386e (pid 498)
[ 133.461217] ------------[ cut here ]------------
[ 133.461218] WARNING: CPU: 0 PID: 199 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x242/0x290
[ 133.461218] Modules linked in: rttcp rtudp rtipv4 intel_powerclamp intel_rapl coretemp i915 e1000e rt_igb pcan(O) rtnet video fan thermal_sys
[ 133.461225] CPU: 0 PID: 199 Comm: rtnet-stack Tainted: G O 4.9.90-xeno-cobolt #1
[ 133.461226] Hardware name: Default string Default string/SKYBAY, BIOS 5.0.1.1 04/18/2016
[ 133.461226] I-pipe domain: Xenomai
[ 133.461227] ffffc90000947d20 ffffffff81446d18 0000000000000000 0000000000000000
[ 133.461229] ffffffff81bb13f0 ffffc90000947d60 ffffffff81078461 0000012b00000001
[ 133.461231] 0000000000000000 0000000000000000 ffff88026194d000 0000000000000000
[ 133.461233] Call Trace:
[ 133.461234] [<ffffffff81446d18>] dump_stack+0xbf/0xe7
[ 133.461234] [<ffffffff81078461>] __warn+0xe1/0x100
[ 133.461235] [<ffffffff8107854d>] warn_slowpath_null+0x1d/0x20
[ 133.461235] [<ffffffff811711e2>] __put_fd+0x242/0x290
[ 133.461236] [<ffffffffa00bae46>] ? rtskb_pool_queue_tail+0xa6/0xd0 [rtnet]
[ 133.461236] [<ffffffff81171ff6>] rtdm_fd_unlock+0x96/0xc0
[ 133.461237] [<ffffffffa02ed725>] rt_ip_rcv+0x135/0x170 [rtipv4]
[ 133.461237] [<ffffffffa00bc008>] rt_stack_deliver+0xf8/0x220 [rtnet]
[ 133.461238] [<ffffffff8115fd20>] ? xnthread_map+0x330/0x330
[ 133.461238] [<ffffffffa00bc1a0>] rt_stack_mgr_task+0x70/0xa0 [rtnet]
[ 133.461239] [<ffffffff8115fd93>] kthread_trampoline+0x73/0x120
[ 133.461240] [<ffffffff810967a9>] kthread+0xd9/0xf0
[ 133.461240] [<ffffffff810966d0>] ? kthread_park+0x60/0x60
[ 133.461241] [<ffffffff81001c92>] ? do_syscall_64+0x82/0xf0
[ 133.461241] [<ffffffff818e0255>] ret_from_fork+0x55/0x60
[ 133.461242] ---[ end trace c5197a4d8608bef3 ]---

Per Öberg
* Re: Exception #14 in kernel space when using rttcp
From: Per Oberg @ 2020-06-12 15:33 UTC
To: xenomai

----- On 12 Jun 2020, at 17:26, xenomai xenomai@xenomai.org wrote:

> Hi list
>
> I get a massive amount of "switching ... to secondary mode after exception #14
> in kernel-space ..." followed by a WARNING as shown below.
>
> Can someone enlighten me regarding the meaning of exception #14?

Is this an illegal page fault in kernel space? Are these numbers the same as for the standard kernel?

> Is the "WARNING: CPU: 0 ..." the cause or the symptom? It has a macro at fd.c
> calling "XENO_WARN_ON(COBALT, fd->refs <= 0);"
>
> This is for Xenomai 3.1
>
> [ 133.458856] [Xenomai] switching RTTest to secondary mode after exception #14
> in kernel-space at 0xffffffff8145386e (pid 498)
> [ 133.461217] ------------[ cut here ]------------
> [ 133.461218] WARNING: CPU: 0 PID: 199 at
> /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x242/0x290
> [...]
>
> Per Öberg

Thanks
Per Öberg
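On the question of what the number means: on x86, the "#N" in the Xenomai message appears to be the CPU exception vector, the same numbering the standard kernel uses. That is an assumption about the I-pipe x86 port and is not confirmed in this thread. Under that assumption, #14 is a page fault (#PF). The lookup below is only a sketch of the architectural vector table; `x86_exception_name` is a hypothetical helper, not a kernel or Xenomai function.

```c
#include <stddef.h>

/* Architectural x86 exception vectors (Intel SDM numbering).
 * Assumption: the "#N" in the Xenomai message is this vector number. */
static const char *const x86_exception_names[] = {
    [0]  = "#DE divide error",
    [1]  = "#DB debug",
    [2]  = "NMI",
    [3]  = "#BP breakpoint",
    [4]  = "#OF overflow",
    [5]  = "#BR bound range exceeded",
    [6]  = "#UD invalid opcode",
    [7]  = "#NM device not available",
    [8]  = "#DF double fault",
    [10] = "#TS invalid TSS",
    [11] = "#NP segment not present",
    [12] = "#SS stack-segment fault",
    [13] = "#GP general protection",
    [14] = "#PF page fault",
    [16] = "#MF x87 floating-point error",
    [17] = "#AC alignment check",
    [18] = "#MC machine check",
    [19] = "#XM SIMD floating-point exception",
};

/* Hypothetical helper: name for a vector, or "unknown/reserved". */
const char *x86_exception_name(unsigned int vector)
{
    size_t n = sizeof(x86_exception_names) / sizeof(x86_exception_names[0]);

    /* Unlisted slots (e.g. 9, 15) are NULL thanks to designated initializers. */
    if (vector >= n || x86_exception_names[vector] == NULL)
        return "unknown/reserved";
    return x86_exception_names[vector];
}
```

A page fault taken in kernel space while running in primary mode is what would force the switch to secondary mode reported in the log, if the vector interpretation holds.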
* Re: Exception #14 in kernel space when using rttcp
From: Jan Kiszka @ 2020-06-12 15:33 UTC
To: Per Oberg, xenomai

On 12.06.20 17:26, Per Oberg via Xenomai wrote:
> Hi list
>
> I get a massive amount of "switching ... to secondary mode after exception #14 in kernel-space ..." followed by a WARNING as shown below.
>
> Can someone enlighten me regarding the meaning of exception #14?
>
> Is the "WARNING: CPU: 0 ..." the cause or the symptom? It has a macro at fd.c calling "XENO_WARN_ON(COBALT, fd->refs <= 0);"
>

Likely related: The WARN_ON triggers a stack dump, and that may trigger fixable or ignorable faults. We may consider converting that XENO_WARN_ON into XENO_WARN_ON_ONCE.

What is actually interesting is the warning itself. Reference counting became imbalanced. How do you trigger that?

Jan

> This is for Xenomai 3.1
>
> [ 133.458856] [Xenomai] switching RTTest to secondary mode after exception #14 in kernel-space at 0xffffffff8145386e (pid 498)
> [ 133.461217] ------------[ cut here ]------------
> [ 133.461218] WARNING: CPU: 0 PID: 199 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x242/0x290
> [...]
>
> Per Öberg
>

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
* Re: Exception #14 in kernel space when using rttcp
From: Per Oberg @ 2020-06-12 15:47 UTC
To: xenomai

----- On 12 Jun 2020, at 17:33, Jan Kiszka jan.kiszka@siemens.com wrote:

> On 12.06.20 17:26, Per Oberg via Xenomai wrote:
>> Hi list
>>
>> I get a massive amount of "switching ... to secondary mode after exception #14
>> in kernel-space ..." followed by a WARNING as shown below.
>>
>> Can someone enlighten me regarding the meaning of exception #14?
>>
>> Is the "WARNING: CPU: 0 ..." the cause or the symptom? It has a macro at fd.c
>> calling "XENO_WARN_ON(COBALT, fd->refs <= 0);"
>
> Likely related: The WARN_ON triggers a stack dump and that may trigger
> fixable or ignorable faults. We may consider converting that
> XENO_WARN_ON into XENO_WARN_ON_ONCE.
>
> What is actually interesting is the warning itself. Reference counting
> became imbalanced. How do you trigger that?

Where do you see that? I can't figure out anything about what is going on from that warning...

I'm not sure what I am actually doing to trigger it, but I'd be glad to debug it if I knew where to start.

I'm working on compiling a network library for use in Xenomai. It uses a lot of extra stuff to get everything up and running, but in the end it will use UDP for the data exchange. So switching to secondary mode may be OK during startup.

> Jan
>
>> This is for Xenomai 3.1
>>
>> [ 133.458856] [Xenomai] switching RTTest to secondary mode after exception #14
>> in kernel-space at 0xffffffff8145386e (pid 498)
>> [ 133.461217] ------------[ cut here ]------------
>> [ 133.461218] WARNING: CPU: 0 PID: 199 at
>> /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x242/0x290
>> [...]
>>
>> Per Öberg
>
> --
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE
> Corporate Competence Center Embedded Linux

Per Öberg
* Re: Exception #14 in kernel space when using rttcp
From: Jan Kiszka @ 2020-06-12 15:54 UTC
To: Per Oberg, xenomai

On 12.06.20 17:47, Per Oberg via Xenomai wrote:
> ----- On 12 Jun 2020, at 17:33, Jan Kiszka jan.kiszka@siemens.com wrote:
>
>> On 12.06.20 17:26, Per Oberg via Xenomai wrote:
>>> Hi list
>>>
>>> I get a massive amount of "switching ... to secondary mode after exception #14
>>> in kernel-space ..." followed by a WARNING as shown below.
>>>
>>> Can someone enlighten me regarding the meaning of exception #14?
>>>
>>> Is the "WARNING: CPU: 0 ..." the cause or the symptom? It has a macro at fd.c
>>> calling "XENO_WARN_ON(COBALT, fd->refs <= 0);"
>>
>> Likely related: The WARN_ON triggers a stack dump and that may trigger
>> fixable or ignorable faults. We may consider converting that
>> XENO_WARN_ON into XENO_WARN_ON_ONCE.
>>
>> What is actually interesting is the warning itself. Reference counting
>> became imbalanced. How do you trigger that?
>
> Where do you see that? I can't figure out anything about what is going on from that warning...
>

Warning at .../rtdm/fd.c:299:

static void __put_fd(struct rtdm_fd *fd, spl_t s)
{
	...
	XENO_WARN_ON(COBALT, fd->refs <= 0);

So, the file descriptor is released although its internal reference counter says it's not held. That is a bug in the kernel, likely leading to use-after-release issues.

> Not sure what I am actually doing, but I'd be glad to debug it if I knew where to start.
>
> I'm working on compiling a network library for use in Xenomai. It uses a lot of extra stuff to get everything up and running, but in the end it will use UDP for the data exchange. So switching to secondary mode may be ok during the startup.
>

I suppose you are writing a userspace application that uses RT-TCP here. The usage pattern up to the point where you see the first warning would be interesting, ideally as a minimal test case. The configuration of the RTnet stack (compile-time and runtime) would help as well.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
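To illustrate the kind of imbalance the check at fd.c:299 guards against, here is a minimal userspace sketch of a reference-counted handle. All names (`toy_fd`, `toy_get`, `toy_put`) are hypothetical and this is not the Cobalt implementation; only the `refs <= 0` condition is taken from the snippet quoted in the thread. The sketch simply refuses the extra put, whereas what the kernel does after the warning is not shown in the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a reference-counted descriptor. */
struct toy_fd {
    int refs;       /* number of current holders */
    int released;   /* set once the last reference is dropped */
    int warnings;   /* how often the imbalance check fired */
};

/* Take a reference. */
void toy_get(struct toy_fd *fd)
{
    assert(fd != NULL);
    fd->refs++;
}

/* Drop a reference; mirrors the "refs <= 0" sanity check from the thread. */
void toy_put(struct toy_fd *fd)
{
    assert(fd != NULL);
    if (fd->refs <= 0) {
        /* More puts than gets: the object may already have been released. */
        fd->warnings++;
        fprintf(stderr, "WARNING: put on fd with refs=%d\n", fd->refs);
        return;
    }
    if (--fd->refs == 0)
        fd->released = 1;
}
```

One `toy_get()` followed by two `toy_put()` calls trips the check on the second put, which is the pattern a minimal test case would need to pin down in the real stack.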
* Re: Exception #14 in kernel space when using rttcp
From: Per Oberg @ 2020-06-13 8:05 UTC
To: xenomai

----- On 12 Jun 2020, at 17:54, Jan Kiszka jan.kiszka@siemens.com wrote:

> On 12.06.20 17:47, Per Oberg via Xenomai wrote:
>> ----- On 12 Jun 2020, at 17:33, Jan Kiszka jan.kiszka@siemens.com wrote:
>>
>>> On 12.06.20 17:26, Per Oberg via Xenomai wrote:
>>>> Hi list
>>>>
>>>> I get a massive amount of "switching ... to secondary mode after exception #14
>>>> in kernel-space ..." followed by a WARNING as shown below.
>>>>
>>>> Can someone enlighten me regarding the meaning of exception #14?
>>>>
>>>> Is the "WARNING: CPU: 0 ..." the cause or the symptom? It has a macro at fd.c
>>>> calling "XENO_WARN_ON(COBALT, fd->refs <= 0);"
>>>
>>> Likely related: The WARN_ON triggers a stack dump and that may trigger
>>> fixable or ignorable faults. We may consider converting that
>>> XENO_WARN_ON into XENO_WARN_ON_ONCE.
>>>
>>> What is actually interesting is the warning itself. Reference counting
>>> became imbalanced. How do you trigger that?
>>
>> Where do you see that? I can't figure out anything about what is going on from
>> that warning...
>
> Warning at .../rtdm/fd.c:299:
>
> static void __put_fd(struct rtdm_fd *fd, spl_t s)
> {
> 	...
> 	XENO_WARN_ON(COBALT, fd->refs <= 0);

Oh, yes of course. I didn't make the connection to the source. I thought you could see it directly in the kernel message. My bad.

> So, the file descriptor is released although its internal reference
> counter says it's not held. That is a bug in the kernel, likely leading
> to use-after-release issues.
>
>> Not sure what I am actually doing, but I'd be glad to debug it if I knew where
>> to start.
>>
>> I'm working on compiling a network library for use in Xenomai. It uses a lot of
>> extra stuff to get everything up and running, but in the end it will use UDP for
>> the data exchange. So switching to secondary mode may be ok during the startup.
>
> I suppose you are writing a userspace application that uses RT-TCP here.
> That usage pattern up to the point you see the first warning would be
> interesting, ideally as minimal testcase. Also the configuration of the
> RTnet stack (compile-time and runtime).

One thing that is noteworthy is that I was running Xenomai 3.1 with a 4.9.90 kernel (reporting itself as Xenomai 3.1), which seems like a big mix-up. When I cleaned up my build tree properly it wouldn't compile anymore, which gave me the idea that I had somehow managed to mix and match two Xenomai versions in the same kernel.

Anyway, I recompiled everything from scratch using:
Xenomai 3.1
Linux 4.19.114-cip24 with ipipe patch 12

Now I get other errors, but I'm not sure yet whether that is because I have turned on the watchdog. I did a quick-and-dirty config of the kernel to get it up and running, so I am not sure exactly how much it differs between this and my old setup. Here goes:

[ 1054.259075] [Xenomai] watchdog triggered on CPU #0 -- runaway thread 'RTTest' signaled
[ 1054.260509] ------------[ cut here ]------------
[ 1054.260510] [Xenomai] switching rtnet-stack to secondary mode after exception #6 in kernel-space at 0xffffffff8bd7064b (pid 1449)
[ 1054.260517] WARNING: CPU: 0 PID: 1449 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0
[ 1054.260517] Modules linked in: rttcp rtudp rtipv4 rt_igb rtnet x86_pkg_temp_thermal
[ 1054.260521] CPU: 0 PID: 1449 Comm: rtnet-stack Not tainted 4.19.114-cip24xeno-cobalt #1
[ 1054.260522] Hardware name: Default string Default string/SKYBAY, BIOS 5.0.1.1 04/18/2016
[ 1054.260522] I-pipe domain: Linux
[ 1054.260524] RIP: 0010:__put_fd+0x26b/0x2c0
[ 1054.260525] Code: 83 e0 01 49 39 c4 74 08 4c 89 e7 e8 8f 98 f9 ff 48 8d 7d b0 e8 36 99 f9 ff e9 81 fe ff ff 48 c7 c7 e0 b0 db 8c e8 1e 4b f3 ff <0f> 0b 41 8b 5d 18 e9 ca fd ff ff 48 8b 05 eb d0 2d 01 49 c7 45 30
[ 1054.260525] RSP: 0018:ffff94f9c0223dc0 EFLAGS: 00010282
[ 1054.260526] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000001
[ 1054.260527] RDX: 0000000000000000 RSI: 0000000000001140 RDI: ffffffff8d77d500
[ 1054.260528] RBP: ffff94f9c0223e20 R08: 0000000000000045 R09: 000000000002e7c0
[ 1054.260528] R10: ffff94f9c0223e38 R11: 0000000000000000 R12: 0000000000000000
[ 1054.260529] R13: ffff919461fb4800 R14: 0000000000000000 R15: ffffffffc011d1e0
[ 1054.260529] FS: 0000000000000000(0000) GS:ffff919465a00000(0000) knlGS:0000000000000000
[ 1054.260530] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1054.260531] CR2: 0000000000000000 CR3: 00000001dba0a001 CR4: 00000000003606f0
[ 1054.260531] Call Trace:
[ 1054.260534] ? rtdm_nrtsig_pend+0x43/0x70
[ 1054.260536] ? rtdm_cleanup+0x10/0x10
[ 1054.260537] ? rtdm_fd_unlock+0x9b/0xd0
[ 1054.260538] rtdm_fd_unlock+0x9b/0xd0
[ 1054.260540] rt_ip_rcv+0x129/0x180 [rtipv4]
[ 1054.260542] rt_stack_deliver+0x22c/0x3a0 [rtnet]
[ 1054.260544] ? xnthread_map+0x370/0x370
[ 1054.260545] rt_stack_mgr_task+0x66/0xa0 [rtnet]
[ 1054.260546] kthread_trampoline+0x77/0x133
[ 1054.260548] kthread+0x10e/0x130
[ 1054.260550] ? kthread_create_worker_on_cpu+0x70/0x70
[ 1054.260552] ret_from_fork+0x36/0x50
[ 1054.260554] ---[ end trace a6a10c1d0c5fd7df ]---
[ 1054.260555] ------------[ cut here ]------------
[ 1054.260571] WARNING: CPU: 0 PID: 1449 at /usr/src/kernel/kernel/xenomai/rtdm/drvlib.c:884 rtdm_event_timedwait+0x50/0x320
[ 1054.260572] Modules linked in: rttcp rtudp rtipv4 rt_igb rtnet x86_pkg_temp_thermal
[ 1054.260573] CPU: 0 PID: 1449 Comm: rtnet-stack Tainted: G W 4.19.114-cip24xeno-cobalt #1
[ 1054.260574] Hardware name: Default string Default string/SKYBAY, BIOS 5.0.1.1 04/18/2016
[ 1054.260574] I-pipe domain: Linux
[ 1054.260575] RIP: 0010:rtdm_event_timedwait+0x50/0x320
[ 1054.260575] Code: c0 48 85 f6 78 46 48 c7 c2 40 01 03 00 48 89 d0 65 48 03 05 da 17 2a 74 f6 40 09 40 74 19 48 c7 c7 e0 b0 db 8c e8 f9 77 f3 ff <0f> 0b 41 bc ff ff ff ff e9 24 01 00 00 65 48 03 15 b3 17 2a 74 48
[ 1054.260576] RSP: 0018:ffff94f9c0223e80 EFLAGS: 00010282
[ 1054.260576] RAX: 0000000000000024 RBX: ffffffffc0105e00 RCX: 0000000000000000
[ 1054.260577] RDX: 0000000000000000 RSI: ffffffff8cdd42b1 RDI: 00000000ffffffff
[ 1054.260577] RBP: ffffffffc0104300 R08: ffff919465a00000 R09: 0000000000000466
[ 1054.260578] R10: ffff94f9c0223e38 R11: 0000000000000000 R12: ffff94f9c02b3a90
[ 1054.260578] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffc0104300
[ 1054.260579] FS: 0000000000000000(0000) GS:ffff919465a00000(0000) knlGS:0000000000000000
[ 1054.260579] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1054.260580] CR2: 0000000000000000 CR3: 00000001dba0a001 CR4: 00000000003606f0
[ 1054.260580] Call Trace:
[ 1054.260580] ? rt_stack_deliver+0x28b/0x3a0 [rtnet]
[ 1054.260581] ? xnthread_map+0x370/0x370
[ 1054.260581] rt_stack_mgr_task+0x27/0xa0 [rtnet]
[ 1054.260582] kthread_trampoline+0x77/0x133
[ 1054.260582] kthread+0x10e/0x130
[ 1054.260583] ? kthread_create_worker_on_cpu+0x70/0x70
[ 1054.260583] ret_from_fork+0x36/0x50
[ 1054.260584] ---[ end trace a6a10c1d0c5fd7e0 ]---

I will try to put together a minimal version of my example, along with my current setup. Am I right in believing that there is now a "standard distro" for Xenomai that I can try this on with well-known settings? If so, how can I take it out for a spin?

> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE
> Corporate Competence Center Embedded Linux

Thanks
Per Öberg
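A note on reading the "Code:" lines in the log above: the byte in angle brackets marks where RIP points, and in both warnings it is `<0f> 0b`, which is the encoding of the x86 `ud2` instruction. That would be consistent with the "exception #6" (invalid opcode) message being raised by the WARN itself rather than by a separate fault, but this is an interpretation and not confirmed in the thread. The helper below is a hypothetical sketch for checking a "Code:" line for that pattern; it is not part of any kernel tooling.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the byte marked "<xx>" in an oops
 * "Code:" line, together with the byte after it, encodes ud2 (0f 0b),
 * 0 if it encodes something else, -1 if no marker could be parsed. */
int code_line_faults_on_ud2(const char *code_line)
{
    unsigned int first, second;
    const char *mark = strchr(code_line, '<');

    if (mark == NULL)
        return -1;
    /* Expect "<xx> yy": the marked byte followed by the next byte. */
    if (sscanf(mark, "<%2x> %2x", &first, &second) != 2)
        return -1;
    return first == 0x0f && second == 0x0b;
}
```

Feeding it the `Code:` text of either warning returns 1 under this sketch's parsing rules.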
* Re: Exception #14 in kernel space when using rttcp
From: Jan Kiszka @ 2020-06-13 15:11 UTC
To: Per Oberg, xenomai

On 13.06.20 10:05, Per Oberg wrote:
>
> ----- On 12 Jun 2020, at 17:54, Jan Kiszka jan.kiszka@siemens.com wrote:
>
>> On 12.06.20 17:47, Per Oberg via Xenomai wrote:
>>> ----- On 12 Jun 2020, at 17:33, Jan Kiszka jan.kiszka@siemens.com wrote:
>>>
>>>> On 12.06.20 17:26, Per Oberg via Xenomai wrote:
>>>>> Hi list
>>>>>
>>>>> I get a massive amount of "switching ... to secondary mode after exception #14
>>>>> in kernel-space ..." followed by a WARNING as shown below.
>>>>>
>>>>> Can someone enlighten me regarding the meaning of exception #14?
>>>>>
>>>>> Is the "WARNING: CPU: 0 ..." the cause or the symptom? It has a macro at fd.c
>>>>> calling "XENO_WARN_ON(COBALT, fd->refs <= 0);"
>>>>
>>>> Likely related: The WARN_ON triggers a stack dump and that may trigger
>>>> fixable or ignorable faults. We may consider converting that
>>>> XENO_WARN_ON into XENO_WARN_ON_ONCE.
>>>>
>>>> What is actually interesting is the warning itself. Reference counting
>>>> became imbalanced. How do you trigger that?
>>>
>>> Where do you see that? I can't figure out anything about what is going on from
>>> that warning...
>>
>> Warning at .../rtdm/fd.c:299:
>>
>> static void __put_fd(struct rtdm_fd *fd, spl_t s)
>> {
>> 	...
>> 	XENO_WARN_ON(COBALT, fd->refs <= 0);
>
> Oh, yes of course. Didn't make the connections to the source. Thought you could see it directly in the kernel message. My bad.
>
>> So, the file descriptor is released although its internal reference
>> counter says it's not held. That is a bug in the kernel, likely leading
>> to use-after-release issues.
>>
>>> Not sure what I am actually doing, but I'd be glad to debug it if I knew where
>>> to start.
>>>
>>> I'm working on compiling a network library for use in Xenomai. It uses a lot of
>>> extra stuff to get everything up and running, but in the end it will use UDP for
>>> the data exchange. So switching to secondary mode may be ok during the startup.
>>
>> I suppose you are writing a userspace application that uses RT-TCP here.
>> That usage pattern up to the point you see the first warning would be
>> interesting, ideally as minimal testcase. Also the configuration of the
>> RTnet stack (compile-time and runtime).
>
> One thing that is noteworthy is that I was running Xenomai 3.1 with a 4.9.90 kernel (reporting itself as Xenomai 3.1), which seems like a big mess up. When I cleaned up my build-tree properly it wouldn't compile anymore which gave me the idea that I somehow managed to mix and match two xenomai versions in the same kernel.
>
> Anyway, I recompiled everything from scratch using:
> Xenomai 3.1
> Linux 4.19.114-cip24 with ipipe patch 12
>
> Now I get other errors, but I'm not sure yet whether that is because I have turned on the watchdog. I did a little quick-and dirty config of the kernel to get it up and running so I am not sure exactly how much that differs between my this and my old setup. Here goes:
>
> [ 1054.259075] [Xenomai] watchdog triggered on CPU #0 -- runaway thread 'RTTest' signaled
> [ 1054.260509] ------------[ cut here ]------------
> [ 1054.260510] [Xenomai] switching rtnet-stack to secondary mode after exception #6 in kernel-space at 0xffffffff8bd7064b (pid 1449)
> [ 1054.260517] WARNING: CPU: 0 PID: 1449 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0
> [...]
> [ 1054.260571] WARNING: CPU: 0 PID: 1449 at /usr/src/kernel/kernel/xenomai/rtdm/drvlib.c:884 rtdm_event_timedwait+0x50/0x320
> [...]
>
> I will try to make a minimal example of my example and my current setup. Am I right in believing that there is now a "Standard distro" for xenomai that I can try this on with well known settings? If so, how can I take it out for a spin?
>

Yes, we are testing via xenomai-images [1] (though not all aspects). If you manage to reproduce the issue with its qemu-x86 configuration, that would be perfect.

Jan

[1] https://gitlab.denx.de/Xenomai/xenomai-images

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux