* RTNet unable to handle kernel NULL pointer dereference at 0000000000000068 @ 2023-10-04 8:45 Per Oberg 2023-10-04 9:42 ` Jan Kiszka 0 siblings, 1 reply; 7+ messages in thread From: Per Oberg @ 2023-10-04 8:45 UTC (permalink / raw) To: xenomai Hi list I experienced a crash with Xenomai-3.1 on Linux 4.19.114-cip24 It does not seem to be the end of the world though because things were not immediately broken for me and I can't reproduce it. Any thoughts ? [ 129.326872] RTnet: registered rteth1 [ 129.326874] rt_igb 0000:02:00.0: Intel(R) Gigabit Ethernet Network Connection [ 129.326875] rt_igb 0000:02:00.0: rteth1: (PCIe:2.5Gb/s:Width x1) 00:1b:21:e0:b6:31 [ 129.327002] rt_igb 0000:02:00.0: rteth1: PBA No: G69016-005 [ 129.327003] rt_igb 0000:02:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s) [ 321.182356] rt_igb: rteth1: igb: rteth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 361.783381] [Xenomai] switching rtnet-rtpc to secondary mode after exception #14 in kernel-space at 0xffffffffc03855a0 (pid 1445) [ 361.783386] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 [ 361.783415] PGD 0 P4D 0 [ 361.783428] Oops: 0000 [#1] PREEMPT SMP PTI [ 361.784109] CPU: 2 PID: 1445 Comm: rtnet-rtpc Tainted: G W O 4.19.114-cip24-xeno-cobolt #1 [ 361.784809] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016 [ 361.785504] I-pipe domain: Linux [ 361.786214] RIP: 0010:rtdev_reference+0x0/0x70 [rtnet] [ 361.786911] Code: 00 eb e0 48 89 ef e8 ff 25 00 00 89 de 48 c7 c7 19 cf 38 c0 e8 da fb 91 c1 eb c8 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 57 68 48 85 d2 74 52 53 8b 47 5c 85 c0 74 19 8d 50 01 48 8d [ 361.787708] RSP: 0018:ffff947dc01a7df0 EFLAGS: 00010206 [ 361.788516] RAX: 000000000000587e RBX: 0000000000000000 RCX: 0000000000000000 [ 361.789333] RDX: 0000000083ca0000 RSI: 0000000000001a00 RDI: 0000000000000000 [ 361.790153] RBP: ffff891e64a4c058 R08: ffff891e617e8138 R09: 0000000000000002 [ 361.790980] 
R10: 0000000000000000 R11: 00000000000005dc R12: ffff891e617e8110 [ 361.791812] R13: 0000000000025540 R14: ffff891e64a4c278 R15: ffff891e617e8000 [ 361.792646] FS: 0000000000000000(0000) GS:ffff891e65b00000(0000) knlGS:0000000000000000 [ 361.793490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 361.794334] CR2: 0000000000000068 CR3: 0000000051c0a005 CR4: 00000000003606e0 [ 361.795182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 361.796026] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 361.796869] Call Trace: [ 361.797721] rt_tcp_segment.constprop.21+0x1ca/0x510 [rttcp] [ 361.798557] rt_tcp_send+0x3e/0xd0 [rttcp] [ 361.799411] ? rtdm_event_timedwait+0x181/0x320 [ 361.800271] rtpc_dispatch_handler+0xd4/0x1f0 [rtnet] [ 361.801137] ? xnthread_map+0x370/0x370 [ 361.802002] kthread_trampoline+0x77/0x133 [ 361.802868] kthread+0x10e/0x130 [ 361.803716] ? kthread_create_worker_on_cpu+0x70/0x70 [ 361.804591] ret_from_fork+0x36/0x50 [ 361.805464] Modules linked in: rttcp rtpacket rtudp rtipv4 igb rt_igb rtnet x86_pkg_temp_thermal e1000e pcan(O) [ 361.806378] CR2: 0000000000000068 [ 361.807284] ---[ end trace a94628c3ed940012 ]--- [ 361.808201] RIP: 0010:rtdev_reference+0x0/0x70 [rtnet] [ 361.809144] Code: 00 eb e0 48 89 ef e8 ff 25 00 00 89 de 48 c7 c7 19 cf 38 c0 e8 da fb 91 c1 eb c8 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 57 68 48 85 d2 74 52 53 8b 47 5c 85 c0 74 19 8d 50 01 48 8d [ 361.810147] RSP: 0018:ffff947dc01a7df0 EFLAGS: 00010206 [ 361.811161] RAX: 000000000000587e RBX: 0000000000000000 RCX: 0000000000000000 [ 361.812184] RDX: 0000000083ca0000 RSI: 0000000000001a00 RDI: 0000000000000000 [ 361.813207] RBP: ffff891e64a4c058 R08: ffff891e617e8138 R09: 0000000000000002 [ 361.814222] R10: 0000000000000000 R11: 00000000000005dc R12: ffff891e617e8110 [ 361.815248] R13: 0000000000025540 R14: ffff891e64a4c278 R15: ffff891e617e8000 [ 361.816246] FS: 0000000000000000(0000) GS:ffff891e65b00000(0000) 
knlGS:0000000000000000 [ 361.817247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 361.818213] CR2: 0000000000000068 CR3: 0000000051c0a005 CR4: 00000000003606e0 [ 361.819152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 361.820047] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Regards Per Öberg ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: RTNet unable to handle kernel NULL pointer dereference at 0000000000000068 2023-10-04 8:45 RTNet unable to handle kernel NULL pointer dereference at 0000000000000068 Per Oberg @ 2023-10-04 9:42 ` Jan Kiszka 2023-10-04 13:01 ` Per Oberg 0 siblings, 1 reply; 7+ messages in thread From: Jan Kiszka @ 2023-10-04 9:42 UTC (permalink / raw) To: Per Oberg, xenomai On 04.10.23 10:45, Per Oberg wrote: > Hi list > I experienced a crash with Xenomai-3.1 on Linux 4.19.114-cip24 > > It does not seem to be the end of the world though because things were not immediately broken for me and I can't reproduce it. > > Any thoughts ? > > [ 129.326872] RTnet: registered rteth1 > [ 129.326874] rt_igb 0000:02:00.0: Intel(R) Gigabit Ethernet Network Connection > [ 129.326875] rt_igb 0000:02:00.0: rteth1: (PCIe:2.5Gb/s:Width x1) 00:1b:21:e0:b6:31 > [ 129.327002] rt_igb 0000:02:00.0: rteth1: PBA No: G69016-005 > [ 129.327003] rt_igb 0000:02:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s) > [ 321.182356] rt_igb: rteth1: igb: rteth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX > [ 361.783381] [Xenomai] switching rtnet-rtpc to secondary mode after exception #14 in kernel-space at 0xffffffffc03855a0 (pid 1445) > [ 361.783386] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 > [ 361.783415] PGD 0 P4D 0 > [ 361.783428] Oops: 0000 [#1] PREEMPT SMP PTI > [ 361.784109] CPU: 2 PID: 1445 Comm: rtnet-rtpc Tainted: G W O 4.19.114-cip24-xeno-cobolt #1 > [ 361.784809] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016 > [ 361.785504] I-pipe domain: Linux > [ 361.786214] RIP: 0010:rtdev_reference+0x0/0x70 [rtnet] Most likely ((struct dest_route *)rt)->rtdev was NULL. But I have no idea why that should have been the case. One commit - though not obviously related - that is missing from 3.1.x is https://source.denx.de/Xenomai/xenomai/-/commit/ac17dcebda74edb253922b8499eeb71bcd0c70ed. Are you facing larger frames? 
Keep an eye on it if you see it again and can reproduce it more reliably.

Jan

> [ 361.786911] Code: 00 eb e0 48 89 ef e8 ff 25 00 00 89 de 48 c7 c7 19 cf 38 c0 e8 da fb 91 c1 eb c8 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 57 68 48 85 d2 74 52 53 8b 47 5c 85 c0 74 19 8d 50 01 48 8d
> [ 361.787708] RSP: 0018:ffff947dc01a7df0 EFLAGS: 00010206
> [ 361.788516] RAX: 000000000000587e RBX: 0000000000000000 RCX: 0000000000000000
> [ 361.789333] RDX: 0000000083ca0000 RSI: 0000000000001a00 RDI: 0000000000000000
> [ 361.790153] RBP: ffff891e64a4c058 R08: ffff891e617e8138 R09: 0000000000000002
> [ 361.790980] R10: 0000000000000000 R11: 00000000000005dc R12: ffff891e617e8110
> [ 361.791812] R13: 0000000000025540 R14: ffff891e64a4c278 R15: ffff891e617e8000
> [ 361.792646] FS: 0000000000000000(0000) GS:ffff891e65b00000(0000) knlGS:0000000000000000
> [ 361.793490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 361.794334] CR2: 0000000000000068 CR3: 0000000051c0a005 CR4: 00000000003606e0
> [ 361.795182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 361.796026] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 361.796869] Call Trace:
> [ 361.797721] rt_tcp_segment.constprop.21+0x1ca/0x510 [rttcp]
> [ 361.798557] rt_tcp_send+0x3e/0xd0 [rttcp]
> [ 361.799411] ? rtdm_event_timedwait+0x181/0x320
> [ 361.800271] rtpc_dispatch_handler+0xd4/0x1f0 [rtnet]
> [ 361.801137] ? xnthread_map+0x370/0x370
> [ 361.802002] kthread_trampoline+0x77/0x133
> [ 361.802868] kthread+0x10e/0x130
> [ 361.803716] ?
kthread_create_worker_on_cpu+0x70/0x70 > [ 361.804591] ret_from_fork+0x36/0x50 > [ 361.805464] Modules linked in: rttcp rtpacket rtudp rtipv4 igb rt_igb rtnet x86_pkg_temp_thermal e1000e pcan(O) > [ 361.806378] CR2: 0000000000000068 > [ 361.807284] ---[ end trace a94628c3ed940012 ]--- > [ 361.808201] RIP: 0010:rtdev_reference+0x0/0x70 [rtnet] > [ 361.809144] Code: 00 eb e0 48 89 ef e8 ff 25 00 00 89 de 48 c7 c7 19 cf 38 c0 e8 da fb 91 c1 eb c8 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 57 68 48 85 d2 74 52 53 8b 47 5c 85 c0 74 19 8d 50 01 48 8d > [ 361.810147] RSP: 0018:ffff947dc01a7df0 EFLAGS: 00010206 > [ 361.811161] RAX: 000000000000587e RBX: 0000000000000000 RCX: 0000000000000000 > [ 361.812184] RDX: 0000000083ca0000 RSI: 0000000000001a00 RDI: 0000000000000000 > [ 361.813207] RBP: ffff891e64a4c058 R08: ffff891e617e8138 R09: 0000000000000002 > [ 361.814222] R10: 0000000000000000 R11: 00000000000005dc R12: ffff891e617e8110 > [ 361.815248] R13: 0000000000025540 R14: ffff891e64a4c278 R15: ffff891e617e8000 > [ 361.816246] FS: 0000000000000000(0000) GS:ffff891e65b00000(0000) knlGS:0000000000000000 > [ 361.817247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 361.818213] CR2: 0000000000000068 CR3: 0000000051c0a005 CR4: 00000000003606e0 > [ 361.819152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 361.820047] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > Regards > Per Öberg > -- Siemens AG, Technology Linux Expert Center ^ permalink raw reply [flat|nested] 7+ messages in thread
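For readers following the diagnosis above, here is a small sketch of how the oops itself supports the "NULL pointer passed in" reading. It takes the `Code:` bytes, `RDI` and `CR2` from the log in this thread, decodes only the single instruction form that appears at the marked byte, and compares the computed address with `CR2`. The helper names (`bytes_from_rip`, `decode_load_disp8`) are made up for this illustration, and the decoder is deliberately not a general x86 disassembler. It does not say which structure member lives at offset 0x68.

```python
# Values copied from the oops in this thread. The decoder is an illustrative
# sketch that handles only the one instruction form seen at the faulting RIP.
CODE_LINE = (
    "00 eb e0 48 89 ef e8 ff 25 00 00 89 de 48 c7 c7 19 cf 38 c0 "
    "e8 da fb 91 c1 eb c8 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 "
    "<48> 8b 57 68 48 85 d2 74 52 53 8b 47 5c 85 c0 74 19 8d 50 01 48 8d"
)
RDI = 0x0000000000000000  # first integer argument in the x86-64 SysV ABI
CR2 = 0x0000000000000068  # faulting address reported in the oops

# 64-bit register names indexed by their 3-bit encoding (no REX.R/REX.B set).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def bytes_from_rip(code_line):
    """Return the opcode bytes starting at the one the kernel marks with <..>."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_load_disp8(insn):
    """Decode 'REX.W 8B /r' (MOV r64, r/m64) with [base + disp8], no SIB byte."""
    rex, opcode, modrm, disp = insn[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a plain 64-bit MOV r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [base + disp8] addressing is handled")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REG64[reg], REG64[rm], disp


dst, base, disp = decode_load_disp8(bytes_from_rip(CODE_LINE))
print(f"mov {disp:#x}(%{base}),%{dst}")
print(f"effective address = {RDI + disp:#x}, CR2 = {CR2:#x}")
```

The very first instruction of rtdev_reference() (offset +0x0) loads from 0x68 bytes past its first argument, and with RDI equal to zero that address matches CR2, so the function appears to have been called with a NULL pointer.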
* Re: RTNet unable to handle kernel NULL pointer dereference at 0000000000000068 2023-10-04 9:42 ` Jan Kiszka @ 2023-10-04 13:01 ` Per Oberg 2023-10-04 13:57 ` Jan Kiszka 0 siblings, 1 reply; 7+ messages in thread From: Per Oberg @ 2023-10-04 13:01 UTC (permalink / raw) To: xenomai ----- Den 4 okt 2023, på kl 11:42, Jan Kiszka jan.kiszka@siemens.com skrev: > On 04.10.23 10:45, Per Oberg wrote: > > Hi list > > I experienced a crash with Xenomai-3.1 on Linux 4.19.114-cip24 >> It does not seem to be the end of the world though because things were not > > immediately broken for me and I can't reproduce it. > > Any thoughts ? > > [ 129.326872] RTnet: registered rteth1 > > [ 129.326874] rt_igb 0000:02:00.0: Intel(R) Gigabit Ethernet Network Connection >> [ 129.326875] rt_igb 0000:02:00.0: rteth1: (PCIe:2.5Gb/s:Width x1) > > 00:1b:21:e0:b6:31 > > [ 129.327002] rt_igb 0000:02:00.0: rteth1: PBA No: G69016-005 >> [ 129.327003] rt_igb 0000:02:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx > > queue(s) >> [ 321.182356] rt_igb: rteth1: igb: rteth1 NIC Link is Up 1000 Mbps Full Duplex, > > Flow Control: RX >> [ 361.783381] [Xenomai] switching rtnet-rtpc to secondary mode after exception > > #14 in kernel-space at 0xffffffffc03855a0 (pid 1445) >> [ 361.783386] BUG: unable to handle kernel NULL pointer dereference at > > 0000000000000068 > > [ 361.783415] PGD 0 P4D 0 > > [ 361.783428] Oops: 0000 [#1] PREEMPT SMP PTI >> [ 361.784109] CPU: 2 PID: 1445 Comm: rtnet-rtpc Tainted: G W O > > 4.19.114-cip24-xeno-cobolt #1 >> [ 361.784809] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 > > 09/22/2016 > > [ 361.785504] I-pipe domain: Linux > > [ 361.786214] RIP: 0010:rtdev_reference+0x0/0x70 [rtnet] > Most likely ((struct dest_route *)rt)->rtdev was NULL. But I have no > idea why that should have been the case. 
> One commit - though not obviously related - that is missing from 3.1.x > is > https://source.denx.de/Xenomai/xenomai/-/commit/ac17dcebda74edb253922b8499eeb71bcd0c70ed. > Are you facing larger frames? > Keep an eye on it if see it again and can reproduce it more reliably. > Jan >> [ 361.786911] Code: 00 eb e0 48 89 ef e8 ff 25 00 00 89 de 48 c7 c7 19 cf 38 c0 >> e8 da fb 91 c1 eb c8 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 57 68 > > 48 85 d2 74 52 53 8b 47 5c 85 c0 74 19 8d 50 01 48 8d > > [ 361.787708] RSP: 0018:ffff947dc01a7df0 EFLAGS: 00010206 > > [ 361.788516] RAX: 000000000000587e RBX: 0000000000000000 RCX: 0000000000000000 > > [ 361.789333] RDX: 0000000083ca0000 RSI: 0000000000001a00 RDI: 0000000000000000 > > [ 361.790153] RBP: ffff891e64a4c058 R08: ffff891e617e8138 R09: 0000000000000002 > > [ 361.790980] R10: 0000000000000000 R11: 00000000000005dc R12: ffff891e617e8110 > > [ 361.791812] R13: 0000000000025540 R14: ffff891e64a4c278 R15: ffff891e617e8000 >> [ 361.792646] FS: 0000000000000000(0000) GS:ffff891e65b00000(0000) > > knlGS:0000000000000000 > > [ 361.793490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 361.794334] CR2: 0000000000000068 CR3: 0000000051c0a005 CR4: 00000000003606e0 > > [ 361.795182] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > [ 361.796026] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > [ 361.796869] Call Trace: > > [ 361.797721] rt_tcp_segment.constprop.21+0x1ca/0x510 [rttcp] > > [ 361.798557] rt_tcp_send+0x3e/0xd0 [rttcp] > > [ 361.799411] ? rtdm_event_timedwait+0x181/0x320 > > [ 361.800271] rtpc_dispatch_handler+0xd4/0x1f0 [rtnet] > > [ 361.801137] ? xnthread_map+0x370/0x370 > > [ 361.802002] kthread_trampoline+0x77/0x133 > > [ 361.802868] kthread+0x10e/0x130 > > [ 361.803716] ? 
kthread_create_worker_on_cpu+0x70/0x70
> > [ 361.804591] ret_from_fork+0x36/0x50
> > [ 361.805464] Modules linked in: rttcp rtpacket rtudp rtipv4 igb rt_igb rtnet x86_pkg_temp_thermal e1000e pcan(O)
> > [ 361.806378] CR2: 0000000000000068
> > [ 361.807284] ---[ end trace a94628c3ed940012 ]---
> > [ 361.808201] RIP: 0010:rtdev_reference+0x0/0x70 [rtnet]
> > [ 361.809144] Code: 00 eb e0 48 89 ef e8 ff 25 00 00 89 de 48 c7 c7 19 cf 38 c0 e8 da fb 91 c1 eb c8 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 57 68 48 85 d2 74 52 53 8b 47 5c 85 c0 74 19 8d 50 01 48 8d
> > [ 361.810147] RSP: 0018:ffff947dc01a7df0 EFLAGS: 00010206
> > [ 361.811161] RAX: 000000000000587e RBX: 0000000000000000 RCX: 0000000000000000
> > [ 361.812184] RDX: 0000000083ca0000 RSI: 0000000000001a00 RDI: 0000000000000000
> > [ 361.813207] RBP: ffff891e64a4c058 R08: ffff891e617e8138 R09: 0000000000000002
> > [ 361.814222] R10: 0000000000000000 R11: 00000000000005dc R12: ffff891e617e8110
> > [ 361.815248] R13: 0000000000025540 R14: ffff891e64a4c278 R15: ffff891e617e8000
> > [ 361.816246] FS: 0000000000000000(0000) GS:ffff891e65b00000(0000) knlGS:0000000000000000
> > [ 361.817247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 361.818213] CR2: 0000000000000068 CR3: 0000000051c0a005 CR4: 00000000003606e0
> > [ 361.819152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 361.820047] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >
> > Regards
> > Per Öberg

> --
> Siemens AG, Technology
> Linux Expert Center

Thanks, the TCP part is a simple handshake and should not need any large messages, as far as I know.

I just had another hiccup. This is 100% reproducible.

To get it up and running again I had to remove rtpacket, rttcp, rtudp and rtipv4, then forcibly remove the rt_igb driver, and then rtnet. This gave me another set of errors, also listed below.

[15018.436248] ------------[ cut here ]------------ [15018.436249] [Xenomai] switching rtnet-stack to secondary mode after exception #6 in kernel-space at 0xffffffffb837064b (pid 1438) [15018.436258] WARNING: CPU: 0 PID: 1438 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0 [15018.436259] Modules linked in: rttcp rtpacket rtudp rtipv4 igb rt_igb x86_pkg_temp_thermal rtnet e1000e pcan(O) [15018.436265] CPU: 0 PID: 1438 Comm: rtnet-stack Tainted: G W O 4.19.114-cip24-xeno-cobolt #1 [15018.436266] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016 [15018.436267] I-pipe domain: Linux [15018.436268] RIP: 0010:__put_fd+0x26b/0x2c0 [15018.436270] Code: 83 e0 01 49 39 c4 74 08 4c 89 e7 e8 8f 98 f9 ff 48 8d 7d b0 e8 36 99 f9 ff e9 81 fe ff ff 48 c7 c7 f0 53 3b b9 e8 1e 4b f3 ff <0f> 0b 41 8b 5d 18 e9 ca fd ff ff 48 8b 05 eb d0 2d 01 49 c7 45 30 [15018.436270] RSP: 0018:ffffbb998017bdc0 EFLAGS: 00010282 [15018.436272] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000001 [15018.436276] RDX: 0000000000000000 RSI: 0000000000001140 RDI: ffffffffb9d7b500 [15018.436277] RBP: ffffbb998017be20 R08: 0000000000000045 R09: 000000000002e7c0 [15018.436278] R10: ffffbb998017be38 R11: 0000000000000000 R12: 0000000000000000 [15018.436279] R13: ffffa290e1e95800 R14: 0000000000000000 R15: ffffffffc01081e0 [15018.436280] FS: 0000000000000000(0000) GS:ffffa290e5a00000(0000) knlGS:0000000000000000 [15018.436281] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [15018.436282] CR2: 00007ffd7412c080 CR3: 0000000039c0a001 CR4: 00000000003606f0 [15018.436283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [15018.436284] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [15018.436284] Call Trace: [15018.436289] ? rt_tcp_retransmit_ack+0x205/0x260 [rttcp] [15018.436291] ? 
rtdm_fd_unlock+0x9b/0xd0 [15018.436292] rtdm_fd_unlock+0x9b/0xd0 [15018.436295] rt_ip_rcv+0x129/0x180 [rtipv4] [15018.436298] rt_stack_deliver+0x22c/0x3a0 [rtnet] [15018.436300] ? xnthread_map+0x370/0x370 [15018.436302] rt_stack_mgr_task+0x66/0xa0 [rtnet] [15018.436303] kthread_trampoline+0x77/0x133 [15018.436306] kthread+0x10e/0x130 [15018.436308] ? kthread_create_worker_on_cpu+0x70/0x70 [15018.436310] ret_from_fork+0x36/0x50 [15018.436313] ---[ end trace 1c6e468cf3ee6a54 ]--- [15018.436339] ------------[ cut here ]------------ [15018.436340] WARNING: CPU: 0 PID: 1438 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0 [15018.436341] Modules linked in: rttcp rtpacket rtudp rtipv4 igb rt_igb x86_pkg_temp_thermal rtnet e1000e pcan(O) [15018.436343] CPU: 0 PID: 1438 Comm: rtnet-stack Tainted: G W O 4.19.114-cip24-xeno-cobolt #1 [15018.436344] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016 [15018.436344] I-pipe domain: Linux [15018.436345] RIP: 0010:__put_fd+0x26b/0x2c0 [15018.436346] Code: 83 e0 01 49 39 c4 74 08 4c 89 e7 e8 8f 98 f9 ff 48 8d 7d b0 e8 36 99 f9 ff e9 81 fe ff ff 48 c7 c7 f0 53 3b b9 e8 1e 4b f3 ff <0f> 0b 41 8b 5d 18 e9 ca fd ff ff 48 8b 05 eb d0 2d 01 49 c7 45 30 [15018.436346] RSP: 0018:ffffbb998017bdc0 EFLAGS: 00010282 [15018.436347] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000001 [15018.436348] RDX: 0000000000000000 RSI: 0000000000001140 RDI: ffffffffb9d7f400 [15018.436348] RBP: ffffbb998017be20 R08: 0000000000000045 R09: 000000000002e7c0 [15018.436349] R10: ffffbb998017be38 R11: 0000000000000000 R12: 0000000000000000 [15018.436350] R13: ffffa290e1e95800 R14: 0000000000000000 R15: ffffffffc01081e0 [15018.436350] FS: 0000000000000000(0000) GS:ffffa290e5a00000(0000) knlGS:0000000000000000 [15018.436351] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [15018.436352] CR2: 00007ffd7412c080 CR3: 0000000039c0a001 CR4: 00000000003606f0 [15018.436352] DR0: 0000000000000000 DR1: 0000000000000000 
DR2: 0000000000000000 [15018.436353] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [15018.436353] Call Trace: [15018.436354] ? rtdm_fd_lock+0xd5/0x1c0 [15018.436355] ? rtdm_fd_unlock+0x9b/0xd0 [15018.436355] rtdm_fd_unlock+0x9b/0xd0 [15018.436356] rt_ip_rcv+0x129/0x180 [rtipv4] [15018.436356] rt_stack_deliver+0x22c/0x3a0 [rtnet] [15018.436357] ? xnthread_map+0x370/0x370 [15018.436357] rt_stack_mgr_task+0x66/0xa0 [rtnet] [15018.436358] kthread_trampoline+0x77/0x133 [15018.436359] kthread+0x10e/0x130 [15018.436359] ? kthread_create_worker_on_cpu+0x70/0x70 [15018.436360] ret_from_fork+0x36/0x50 [15018.436360] ---[ end trace 1c6e468cf3ee6a55 ]--- [15018.436361] ------------[ cut here ]------------ [15018.436382] WARNING: CPU: 0 PID: 1438 at /usr/src/kernel/kernel/xenomai/rtdm/drvlib.c:884 rtdm_event_timedwait+0x50/0x320 [15018.436383] Modules linked in: rttcp rtpacket rtudp rtipv4 igb rt_igb x86_pkg_temp_thermal rtnet e1000e pcan(O) [15018.436385] CPU: 0 PID: 1438 Comm: rtnet-stack Tainted: G W O 4.19.114-cip24-xeno-cobolt #1 [15018.436386] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016 [15018.436387] I-pipe domain: Linux [15018.436387] RIP: 0010:rtdm_event_timedwait+0x50/0x320 [15018.436388] Code: c0 48 85 f6 78 46 48 c7 c2 40 01 03 00 48 89 d0 65 48 03 05 da 17 ca 47 f6 40 09 40 74 19 48 c7 c7 f0 53 3b b9 e8 f9 77 f3 ff <0f> 0b 41 bc ff ff ff ff e9 24 01 00 00 65 48 03 15 b3 17 ca 47 48 [15018.436389] RSP: 0018:ffffbb998017be80 EFLAGS: 00010282 [15018.436390] RAX: 0000000000000024 RBX: ffffffffc011ce00 RCX: 0000000000000000 [15018.436390] RDX: 0000000000000000 RSI: ffffffffb93ce5b9 RDI: 00000000ffffffff [15018.436391] RBP: ffffffffc011b300 R08: ffffa290e5a00000 R09: 00000000000006c7 [15018.436391] R10: ffffbb998017be38 R11: 0000000000000000 R12: ffffbb998027ba90 [15018.436392] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffc011b300 [15018.436393] FS: 0000000000000000(0000) GS:ffffa290e5a00000(0000) 
knlGS:0000000000000000 [15018.436393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [15018.436394] CR2: 00007ffd7412c080 CR3: 0000000039c0a001 CR4: 00000000003606f0 [15018.436395] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [15018.436395] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [15018.436396] Call Trace: [15018.436396] ? rt_stack_deliver+0x28b/0x3a0 [rtnet] [15018.436397] ? xnthread_map+0x370/0x370 [15018.436397] rt_stack_mgr_task+0x27/0xa0 [rtnet] [15018.436398] kthread_trampoline+0x77/0x133 [15018.436399] kthread+0x10e/0x130 [15018.436399] ? kthread_create_worker_on_cpu+0x70/0x70 [15018.436400] ret_from_fork+0x36/0x50 [15018.436401] ---[ end trace 1c6e468cf3ee6a56 ]--- [15058.612235] RTnet: dropping packet in rtnetif_rx() [15059.612209] RTnet: dropping packet in rtnetif_rx() [15060.612166] RTnet: dropping packet in rtnetif_rx() [15083.961342] RTnet: dropping packet in rtnetif_rx() [15198.507222] RTnet: dropping packet in rtnetif_rx() [15254.669404] RTnet: dropping packet in rtnetif_rx() [15310.553224] RTnet: dropping packet in rtnetif_rx() At module unload [16042.887516] Disabling lock debugging due to kernel taint [16047.781092] ============================================================================= [16047.781093] BUG rtskb_slab_pool (Tainted: G R W O ): Objects remaining in rtskb_slab_pool on __kmem_cache_shutdo wn() [16047.781094] ----------------------------------------------------------------------------- [16047.781094] [16047.781095] INFO: Slab 0x0000000059a810bf objects=17 used=14 fp=0x00000000b5f0e913 flags=0x9300000008100 [16047.781097] CPU: 2 PID: 2085 Comm: rmmod Tainted: G R B W O 4.19.114-cip24-xeno-cobolt #1 [16047.781098] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016 [16047.781099] I-pipe domain: Linux [16047.781100] Call Trace: [16047.781105] dump_stack+0x98/0xb3 [16047.781108] slab_err+0xa8/0xd0 [16047.781111] ? on_each_cpu_mask+0x47/0x90 [16047.781113] ? 
on_each_cpu_cond+0x90/0xd0 [16047.781115] __kmem_cache_shutdown+0x1c1/0x3d0 [16047.781118] kmem_cache_destroy+0x40/0xf0 [16047.781121] cleanup_module+0x31/0x7fa [rtnet] [16047.781123] __x64_sys_delete_module+0x155/0x220 [16047.781125] ? __do_page_fault+0x2a4/0x570 [16047.781127] do_syscall_64+0x64/0x170 [16047.781129] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [16047.781131] RIP: 0033:0x30792ecf59 [16047.781132] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f ff 2b 00 f7 d8 64 89 01 48 [16047.781133] RSP: 002b:00007ffcb579a968 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 [16047.781134] RAX: ffffffffffffffda RBX: 00007f195face6c0 RCX: 00000030792ecf59 [16047.781135] RDX: 0000000000000000 RSI: 0000000000000880 RDI: 00007ffcb579a980 [16047.781136] RBP: 00007ffcb579ac18 R08: 0000000000000000 R09: 0000000000000880 [16047.781137] R10: 000000000000005f R11: 0000000000000246 R12: 00007ffcb579ac08 [16047.781138] R13: 00007ffcb579ac00 R14: 0000000000000000 R15: 0000000000000000 [16047.781140] INFO: Object 0x00000000fdde1d6b @offset=0 [16047.781141] INFO: Object 0x000000008c150ec2 @offset=1856 [16047.781142] INFO: Object 0x00000000c18de762 @offset=3712 [16047.781142] INFO: Object 0x00000000d1da0c03 @offset=5568 [16047.781143] INFO: Object 0x00000000c214b504 @offset=7424 [16047.781144] INFO: Object 0x0000000079c960b5 @offset=9280 [16047.781145] INFO: Object 0x000000008c56ca31 @offset=11136 [16047.781145] INFO: Object 0x0000000097ec8f45 @offset=12992 [16047.781146] INFO: Object 0x0000000015430fa3 @offset=14848 [16047.781147] INFO: Object 0x00000000b6afa3d5 @offset=16704 [16047.781147] INFO: Object 0x00000000600412d5 @offset=18560 [16047.781148] INFO: Object 0x0000000097312ec2 @offset=20416 [16047.781149] INFO: Object 0x000000009169b1f8 @offset=22272 [16047.781149] INFO: Object 0x000000000bfd769e @offset=24128 [16047.781152] kmem_cache_destroy 
rtskb_slab_pool: Slab cache still has objects [16047.781153] CPU: 2 PID: 2085 Comm: rmmod Tainted: G R B W O 4.19.114-cip24-xeno-cobolt #1 [16047.781154] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016 [16047.781155] I-pipe domain: Linux [16047.781155] Call Trace: [16047.781157] dump_stack+0x98/0xb3 [16047.781159] kmem_cache_destroy+0xe3/0xf0 [16047.781161] cleanup_module+0x31/0x7fa [rtnet] [16047.781163] __x64_sys_delete_module+0x155/0x220 [16047.781164] ? __do_page_fault+0x2a4/0x570 [16047.781166] do_syscall_64+0x64/0x170 [16047.781168] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [16047.781169] RIP: 0033:0x30792ecf59 [16047.781170] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0f ff 2b 00 f7 d8 64 89 01 48 [16047.781171] RSP: 002b:00007ffcb579a968 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0 [16047.781172] RAX: ffffffffffffffda RBX: 00007f195face6c0 RCX: 00000030792ecf59 [16047.781173] RDX: 0000000000000000 RSI: 0000000000000880 RDI: 00007ffcb579a980 [16047.781174] RBP: 00007ffcb579ac18 R08: 0000000000000000 R09: 0000000000000880 [16047.781174] R10: 000000000000005f R11: 0000000000000246 R12: 00007ffcb579ac08 [16047.781175] R13: 00007ffcb579ac00 R14: 0000000000000000 R15: 0000000000000000 [16047.781193] RTnet: unloaded Thanks Per Öberg ^ permalink raw reply [flat|nested] 7+ messages in thread
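As a side note on the SLUB report in the message above, the numbers can be cross-checked with a few lines of arithmetic. The sketch below uses only values copied from that report (the `objects=`/`used=` fields and the `@offset=` values of the listed objects). The variable names are made up for this illustration, and reading the constant spacing as the per-object slot size of rtskb_slab_pool is an assumption, not something the log states.

```python
import re

# Header line and object offsets copied from the rmmod log in this thread.
SLAB_LINE = (
    "INFO: Slab 0x0000000059a810bf objects=17 used=14 "
    "fp=0x00000000b5f0e913 flags=0x9300000008100"
)
OBJECT_OFFSETS = [
    0, 1856, 3712, 5568, 7424, 9280, 11136,
    12992, 14848, 16704, 18560, 20416, 22272, 24128,
]

# Pull the two counters out of the slab header line.
fields = {key: int(val) for key, val in re.findall(r"(objects|used)=(\d+)", SLAB_LINE)}

# Distance between consecutive reported objects; a single value means the
# objects sit back to back in the slab.
strides = {b - a for a, b in zip(OBJECT_OFFSETS, OBJECT_OFFSETS[1:])}

leaked = len(OBJECT_OFFSETS)
print(f"slab capacity: {fields['objects']}, in use: {fields['used']}, listed: {leaked}")
print(f"spacing between listed objects: {sorted(strides)} bytes")
```

The 14 listed objects match `used=14`, so every buffer still allocated in that slab was reported when kmem_cache_destroy() ran, which fits the "leaking buffers in the pool" description given later in the thread.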
* Re: RTNet unable to handle kernel NULL pointer dereference at 0000000000000068
  2023-10-04 13:01   ` Per Oberg
@ 2023-10-04 13:57     ` Jan Kiszka
  2023-10-04 14:15       ` Per Oberg
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2023-10-04 13:57 UTC (permalink / raw)
  To: Per Oberg, xenomai

On 04.10.23 15:01, Per Oberg wrote:
> I just had another hiccup. This is 100% reproducible.
>
> To get it up and running I had to remove rtpacket,rttcp,rtudp,rtipv4 and then forcibly remove rt_igb driver and then rtnet. This gave me another set of errors, also listed below.

What does "forcibly" mean here? Actually "rmmod --force"? Then you get
what you deserve ;).

>
> [15018.436248] ------------[ cut here ]------------
> [15018.436249] [Xenomai] switching rtnet-stack to secondary mode after exception #6 in kernel-space at 0xffffffffb837064b (pid 1438)
> [15018.436258] WARNING: CPU: 0 PID: 1438 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0

You are not using the latest release, not even of your stable series.
This is needlessly risky.

There is some imbalance in the reference counter for the socket, and
later on there are complaints about leaking buffers in the pool. The real
problem may rather be that you didn't shut down the interface properly -
or that we don't do that when pulling the plugs in the order you describe.

Can you reproduce this rather generic issue over latest master as well?

Jan

-- 
Siemens AG, Technology
Linux Expert Center

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: RTNet unable to handle kernel NULL pointer dereference at 0000000000000068
  2023-10-04 13:57     ` Jan Kiszka
@ 2023-10-04 14:15       ` Per Oberg
  2023-10-05 14:19         ` Per Oberg
  0 siblings, 1 reply; 7+ messages in thread
From: Per Oberg @ 2023-10-04 14:15 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai

----- On 4 Oct 2023, at 15:57, Jan Kiszka jan.kiszka@siemens.com wrote:

> On 04.10.23 15:01, Per Oberg wrote:
> > I just had another hiccup. This is 100% reproducible.
> >
> > To get it up and running I had to remove rtpacket,rttcp,rtudp,rtipv4 and then
> > forcibly remove rt_igb driver and then rtnet. This gave me another set of
> > errors, also listed below.

> What does "forcibly" mean here? Actually "rmmod --force"? Then you get
> what you deserve ;).

Yes, intentionally. I expected "rmmod --force" to wreak havoc, but I thought that the havoc it caused might give more pointers to the actual problem.

I wanted to see if I could recover from the first crash (not the one from my first email) without rebooting.

> > [15018.436248] ------------[ cut here ]------------
> > [15018.436249] [Xenomai] switching rtnet-stack to secondary mode after exception #6 in kernel-space at 0xffffffffb837064b (pid 1438)
> > [15018.436258] WARNING: CPU: 0 PID: 1438 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0

> You are not using the latest release, not even of your stable series.
> This is needlessly risky.

I am currently updating the kernel and the Xenomai libraries to see whether that helps. I could not, however, find any patches that looked relevant to the issue.

> There is some imbalance in the reference counter for the socket, and
> later on there are complaints about leaking buffers in the pool. The real
> problem may rather be that you didn't shut down the interface properly -
> or that we don't do that when pulling the plugs in the order you describe.

To be clear, the "[Xenomai] switching rtnet-stack to secondary mode after exception #6 in kernel-space at 0xffffffffb837064b (pid 1438)" message appears after a fresh reboot, not on a system that I kept alive with artificial life support.

> Can you reproduce this rather generic issue over latest master as well?

I'm working on it. Hopefully not.

> Jan

> --
> Siemens AG, Technology
> Linux Expert Center

Per Öberg

^ permalink raw reply	[flat|nested] 7+ messages in thread
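A note on the two "[Xenomai] switching ... after exception #N" lines quoted in this thread, as a hedged sketch rather than a diagnosis. The numbers are x86 exception vectors: 14 is a page fault (#PF) and 6 is an invalid opcode (#UD). The bytes marked at RIP in the __put_fd warning are `0f 0b`, which is the `ud2` instruction that x86 Linux uses to implement WARN() and BUG(), so the "exception #6" line and the WARNING at fd.c:299 look like one and the same event. The table and helper below (`X86_VECTORS`, `classify`) are illustrative names, not part of Xenomai or the kernel.

```python
import re

# Architecturally defined x86 exception vectors (only the ones needed here).
X86_VECTORS = {
    6: "#UD invalid opcode",
    13: "#GP general protection",
    14: "#PF page fault",
}

# Lines copied from the kernel logs in this thread.
LOG = [
    "[ 361.783381] [Xenomai] switching rtnet-rtpc to secondary mode after "
    "exception #14 in kernel-space at 0xffffffffc03855a0 (pid 1445)",
    "[15018.436249] [Xenomai] switching rtnet-stack to secondary mode after "
    "exception #6 in kernel-space at 0xffffffffb837064b (pid 1438)",
]
PATTERN = re.compile(r"switching (\S+) to secondary mode after exception #(\d+)")

# First bytes at RIP from the "Code:" line of the __put_fd warning.
WARN_CODE_AT_RIP = "<0f> 0b 41 8b 5d 18"
UD2 = bytes.fromhex("0f0b")


def classify(line):
    """Return (task, vector, description) for a Xenomai mode-switch line."""
    match = PATTERN.search(line)
    if match is None:
        return None
    vector = int(match.group(2))
    return match.group(1), vector, X86_VECTORS.get(vector, "unknown")


results = [classify(line) for line in LOG]
marked = bytes(int(tok.strip("<>"), 16) for tok in WARN_CODE_AT_RIP.split()[:2])

for task, vector, name in results:
    print(f"{task}: vector {vector} ({name})")
print("instruction at RIP is ud2:", marked == UD2)
```

Read this way, the first report is a genuine NULL dereference, while the second one is a kernel self-check firing inside the RTDM file descriptor code.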
* Re: RTNet unable to handle kernel NULL pointer dereference at 0000000000000068
  2023-10-04 14:15       ` Per Oberg
@ 2023-10-05 14:19         ` Per Oberg
  2023-10-05 14:21           ` Jan Kiszka
  0 siblings, 1 reply; 7+ messages in thread
From: Per Oberg @ 2023-10-05 14:19 UTC (permalink / raw)
  To: xenomai

----- On 4 Oct 2023, at 16:15, Per Öberg pero@wolfram.com wrote:

> ----- On 4 Oct 2023, at 15:57, Jan Kiszka jan.kiszka@siemens.com wrote:

> > On 04.10.23 15:01, Per Oberg wrote:
> > > I just had another hiccup. This is 100% reproducible.
> > >
> > > To get it up and running I had to remove rtpacket,rttcp,rtudp,rtipv4 and then
> > > forcibly remove rt_igb driver and then rtnet. This gave me another set of
> > > errors, also listed below.

> > What does "forcibly" mean here? Actually "rmmod --force"? Then you get
> > what you deserve ;).

> Yes, intentionally. I expected "rmmod --force" to wreak havoc, but I
> thought that the havoc it caused might give more pointers to the
> actual problem.

> I wanted to see if I could recover from the first crash (not the one from my
> first email) without rebooting.

> > > [15018.436248] ------------[ cut here ]------------
> > > [15018.436249] [Xenomai] switching rtnet-stack to secondary mode after exception #6 in kernel-space at 0xffffffffb837064b (pid 1438)
> > > [15018.436258] WARNING: CPU: 0 PID: 1438 at /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0

> > You are not using the latest release, not even of your stable series.
> > This is needlessly risky.

> I am currently updating the kernel and the Xenomai libraries to see whether that
> helps. I could not, however, find any patches that looked relevant to the issue.

> > There is some imbalance in the reference counter for the socket, and
> > later on there are complaints about leaking buffers in the pool. The real
> > problem may rather be that you didn't shut down the interface properly -
> > or that we don't do that when pulling the plugs in the order you describe.

> To be clear, the
> "[Xenomai] switching rtnet-stack to secondary mode after exception #6 in
> kernel-space at 0xffffffffb837064b (pid 1438)"
> message appears after a fresh reboot, not on a system that I kept alive with
> artificial life support.

> > Can you reproduce this rather generic issue over latest master as well?

> I'm working on it. Hopefully not.

> > Jan

> > --
> > Siemens AG, Technology
> > Linux Expert Center

> Per Öberg

Now tested on Linux 5.10.191-dovetail and Xenomai 3.2.4. I cannot reproduce the crash anymore.

Per Öberg

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: RTNet unable to handle kernel NULL pointer dereference at 0000000000000068
  2023-10-05 14:19           ` Per Oberg
@ 2023-10-05 14:21             ` Jan Kiszka
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Kiszka @ 2023-10-05 14:21 UTC
  To: Per Oberg, xenomai

On 05.10.23 16:19, Per Oberg wrote:
>
> ----- On 4 Oct 2023, at 16:15, Per Öberg pero@wolfram.com wrote:
>
>> ----- On 4 Oct 2023, at 15:57, Jan Kiszka jan.kiszka@siemens.com wrote:
>
>>> On 04.10.23 15:01, Per Oberg wrote:
>>>> I just had another hiccup. This is 100 pct reproducible
>
>>>> To get it up and running I had to remove rtpacket,rttcp,rtudp,rtipv4 and then
>>>> forcibly remove rt_igb driver and then rtnet. This gave me another set of
>>>> errors, also listed below.
>
>>> What does "forcibly" mean here? Actually "rmmod --force"? Then you get
>>> what you deserve ;).
>
>> Yes, intentionally. I expected that the "rmmod --force" would wreak havoc but I
>> thought that perhaps the actual havoc caused could give more pointers to the
>> actual problem.
>> I wanted to see if I could recover from the first crash (not the one from my
>> first email) without rebooting.
>
>>>> [15018.436248] ------------[ cut here ]------------
>>>> [15018.436249] [Xenomai] switching rtnet-stack to secondary mode after exception
>>>> #6 in kernel-space at 0xffffffffb837064b (pid 1438)
>>>> [15018.436258] WARNING: CPU: 0 PID: 1438 at
>>>> /usr/src/kernel/kernel/xenomai/rtdm/fd.c:299 __put_fd+0x26b/0x2c0
>
>>> You are not using the latest release, not even of your stable series.
>>> This is needlessly risky.
>
>> I am currently updating the kernel and xenomai libraries to see what that can
>> do. I could not, however, find any patches that looked relevant for the issue.
>
>>> There is some imbalance in the reference counter for the socket, and
>>> later on are complaints about leaking buffers in the pool. The real
>>> problem may rather be that you didn't shut down the interface properly -
>>> or that we don't do that when pulling the plugs in the order you describe.
>
>> To be clear, the
>> "[Xenomai] switching rtnet-stack to secondary mode after exception #6 in
>> kernel-space at 0xffffffffb837064b (pid 1438)"
>> happens after a fresh reboot, not something I kept alive with artificial life
>> support.
>
>>> Can you reproduce this rather generic issue over latest master as well?
>
>> I'm working on it. Hopefully not.
>
>>> Jan
>
>>> --
>>> Siemens AG, Technology
>>> Linux Expert Center
>
>> Per Öberg
>
> Now tested on Linux 5.10.191-dovetail and Xenomai 3.2.4.
> Cannot reproduce the crash anymore
>

Ok, that's good news. If you happen to identify missing patches in 3.1,
we can probably pick them for that version as well.

Jan

--
Siemens AG, Technology
Linux Expert Center

end of thread, other threads:[~2023-10-05 14:21 UTC]

Thread overview: 7+ messages
2023-10-04  8:45 RTNet unable to handle kernel NULL pointer dereference at 0000000000000068 Per Oberg
2023-10-04  9:42 ` Jan Kiszka
2023-10-04 13:01   ` Per Oberg
2023-10-04 13:57     ` Jan Kiszka
2023-10-04 14:15       ` Per Oberg
2023-10-05 14:19         ` Per Oberg
2023-10-05 14:21           ` Jan Kiszka