* Bad ioctl in rtnet
@ 2020-06-09 14:49 Per Oberg
  2020-06-09 16:16 ` Philippe Gerum
  0 siblings, 1 reply; 7+ messages in thread
From: Per Oberg @ 2020-06-09 14:49 UTC (permalink / raw)
  To: xenomai

Hello list!

I get this error when running a posix-wrapper-compiled software package
on rtnet. Could someone please help me pinpoint which ioctl is causing
this? (Does it say in the text below, or do I need to start spreading
breadcrumbs?)

[ 85.577201] I-pipe domain: Linux
[ 85.577624] task: ffff880262df6c00 task.stack: ffffc9000138c000
[ 85.578058] RIP: 0010:[<ffffffffa02b4787>] [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4]
[ 85.578512] RSP: 0018:ffffc9000138fda8 EFLAGS: 00010246
[ 85.578958] RAX: 000000000007ffff RBX: 0000000040180021 RCX: ffff88026dd00000
[ 85.579409] RDX: 00007ffcb6bd3470 RSI: 0000000040180021 RDI: ffff880262b33a00
[ 85.579858] RBP: ffffc9000138fdd0 R08: 0000000000000052 R09: ffff880262df6c00
[ 85.580310] R10: 00000000000000e6 R11: 0000000000000000 R12: ffff880262b33a00
[ 85.580763] R13: 0000000040180021 R14: 00007ffcb6bd3470 R15: 0000000062b33a00
[ 85.581217] FS: 00007fd21c07c480(0000) GS:ffff88026dd00000(0000) knlGS:0000000000000000
[ 85.581674] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 85.582133] CR2: 00007ffcb6bd3470 CR3: 0000000261a90000 CR4: 0000000000360630
[ 85.582602] Stack:
[ 85.583067] ffffffffa02bf6f7 0000000000000001 ffffffff81178cd0 ffff880262b33a00
[ 85.583554] 0000000000000004 ffffc9000138fe60 ffffffff811725be 0000000000000202
[ 85.584040] ffff880262df6c00 ffff880200000010 ffffc9000138fe70 ffffc9000138fe08
[ 85.584531] Call Trace:
[ 85.585015] [<ffffffffa02bf6f7>] ? rt_udp_ioctl+0x67/0x8c [rtudp]
[ 85.585511] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20
[ 85.586002] [<ffffffff811725be>] rtdm_fd_ioctl+0xee/0x280
[ 85.586488] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20
[ 85.586975] [<ffffffff810a0933>] ? __ipipe_migrate_head+0x73/0xf0
[ 85.587466] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20
[ 85.587957] [<ffffffff81178cde>] CoBaLt_ioctl+0xe/0x20
[ 85.588445] [<ffffffff81188472>] ipipe_syscall_hook+0x112/0x350
[ 85.588932] [<ffffffff8110acb8>] __ipipe_notify_syscall+0xc8/0x190
[ 85.589421] [<ffffffff8110adaa>] ipipe_handle_syscall+0x2a/0xb0
[ 85.589912] [<ffffffff81001c3d>] do_syscall_64+0x2d/0xf0
[ 85.590404] [<ffffffff818dffbe>] entry_SYSCALL_64_after_swapgs+0x58/0xc6
[ 85.590897] Code: 68 b8 eb b0 e8 ab d4 62 e1 81 fe 27 00 10 40 0f 84 c1 00 00 00 7e 73 81 fe 20 00 18 40 74 3c 81 fe 21 00 18 40 0f 85 a0 00 00 00 <8b> 02 8b 4a 10 4c 8b 42 08 8b 72 04 85 c0 0f 85 d6 00 00 00 83
[ 85.592061] RIP [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4]
[ 85.592592] RSP <ffffc9000138fda8>
[ 85.593120] CR2: 00007ffcb6bd3470

Best regards
Per Öberg

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: Bad ioctl in rtnet 2020-06-09 14:49 Bad ioctl in rtnet Per Oberg @ 2020-06-09 16:16 ` Philippe Gerum 2020-06-12 8:02 ` Per Oberg 0 siblings, 1 reply; 7+ messages in thread From: Philippe Gerum @ 2020-06-09 16:16 UTC (permalink / raw) To: Per Oberg, xenomai On 6/9/20 4:49 PM, Per Oberg via Xenomai wrote: > Hello list! > > I get this error when running a posix-wrapper-compiled software pacakge on rtnet. Could someone please help me pinpoint which ioctl is causing this? (Does it say in the text below or do I need to start spreading breadcrumbs ? ) > > [ 85.577201] I-pipe domain: Linux > [ 85.577624] task: ffff880262df6c00 task.stack: ffffc9000138c000 > [ 85.578058] RIP: 0010:[<ffffffffa02b4787>] [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4] > [ 85.578512] RSP: 0018:ffffc9000138fda8 EFLAGS: 00010246 > [ 85.578958] RAX: 000000000007ffff RBX: 0000000040180021 RCX: ffff88026dd00000 > [ 85.579409] RDX: 00007ffcb6bd3470 RSI: 0000000040180021 RDI: ffff880262b33a00 > [ 85.579858] RBP: ffffc9000138fdd0 R08: 0000000000000052 R09: ffff880262df6c00 > [ 85.580310] R10: 00000000000000e6 R11: 0000000000000000 R12: ffff880262b33a00 > [ 85.580763] R13: 0000000040180021 R14: 00007ffcb6bd3470 R15: 0000000062b33a00 > [ 85.581217] FS: 00007fd21c07c480(0000) GS:ffff88026dd00000(0000) knlGS:0000000000000000 > [ 85.581674] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 85.582133] CR2: 00007ffcb6bd3470 CR3: 0000000261a90000 CR4: 0000000000360630 > [ 85.582602] Stack: > [ 85.583067] ffffffffa02bf6f7 0000000000000001 ffffffff81178cd0 ffff880262b33a00 > [ 85.583554] 0000000000000004 ffffc9000138fe60 ffffffff811725be 0000000000000202 > [ 85.584040] ffff880262df6c00 ffff880200000010 ffffc9000138fe70 ffffc9000138fe08 > [ 85.584531] Call Trace: > [ 85.585015] [<ffffffffa02bf6f7>] ? rt_udp_ioctl+0x67/0x8c [rtudp] > [ 85.585511] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 > [ 85.586002] [<ffffffff811725be>] rtdm_fd_ioctl+0xee/0x280 > [ 85.586488] [<ffffffff81178cd0>] ? 
CoBaLt_fcntl+0x20/0x20 > [ 85.586975] [<ffffffff810a0933>] ? __ipipe_migrate_head+0x73/0xf0 > [ 85.587466] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 > [ 85.587957] [<ffffffff81178cde>] CoBaLt_ioctl+0xe/0x20 > [ 85.588445] [<ffffffff81188472>] ipipe_syscall_hook+0x112/0x350 > [ 85.588932] [<ffffffff8110acb8>] __ipipe_notify_syscall+0xc8/0x190 > [ 85.589421] [<ffffffff8110adaa>] ipipe_handle_syscall+0x2a/0xb0 > [ 85.589912] [<ffffffff81001c3d>] do_syscall_64+0x2d/0xf0 > [ 85.590404] [<ffffffff818dffbe>] entry_SYSCALL_64_after_swapgs+0x58/0xc6 > [ 85.590897] Code: 68 b8 eb b0 e8 ab d4 62 e1 81 fe 27 00 10 40 0f 84 c1 00 00 00 7e 73 81 fe 20 00 18 40 74 3c 81 fe 21 00 18 40 0f 85 a0 00 00 00 <8b> 02 8b 4a 10 4c 8b > 42 08 8b 72 04 85 c0 0f 85 d6 00 00 00 83 > [ 85.592061] RIP [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4] > [ 85.592592] RSP <ffffc9000138fda8> > [ 85.593120] CR2: 00007ffcb6bd3470 The header of this kernel splat - which should normally give you some hint about the code which triggers it - seems to be missing from the pasted text above. 
Anyway, quick and dirty trick to locate it:

$ $CROSS_COMPILE-objdump -dl $linux-build-tree/drivers/xenomai/net/stack/ipv4/rtipv4.o | grep -A 30 '<rt_ip_ioctl>:'

00000000000029f0 <rt_ip_ioctl>:
rt_ip_ioctl():
linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209
    29f0:  41 54                   push   %r12
    29f2:  4c 8d 27                lea    (%rdi),%r12
    29f5:  55                      push   %rbp
rtdm_fd_to_private():
linux/include/xenomai/rtdm/driver.h:163
    29f6:  48 8d 2f                lea    (%rdi),%rbp
rt_ip_ioctl():
linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209
    29f9:  48 8d 64 24 e0          lea    -0x20(%rsp),%rsp
rtdm_fd_to_private():
linux/include/xenomai/rtdm/driver.h:163
    29fe:  48 83 c5 58             add    $0x58,%rbp
rt_ip_ioctl():
linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209
    2a02:  65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
    2a09:  00 00
    2a0b:  48 89 44 24 18          mov    %rax,0x18(%rsp)
    2a10:  31 c0                   xor    %eax,%eax
linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:215
    2a12:  81 fe 20 00 18 40       cmp    $0x40180020,%esi
    2a18:  0f 84 d5 00 00 00       je     2af3 <rt_ip_ioctl+0x103>
    2a1e:  7f 5f                   jg     2a7f <rt_ip_ioctl+0x8f>
    2a20:  81 fe 26 00 10 40       cmp    $0x40100026,%esi
    2a26:  0f 84 98 00 00 00       je     2ac4 <rt_ip_ioctl+0xd4>
    2a2c:  81 fe 27 00 10 40       cmp    $0x40100027,%esi
    2a32:  0f 85 81 00 00 00       jne    2ab9 <rt_ip_ioctl+0xc9>
linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:243
    2a38:  b9 10 00 00 00          mov    $0x10,%ecx

rt_ip_ioctl+0x27 would then be 000029f0 + 0x27, i.e. 00002a17, which would
be somewhere after xenomai/net/stack/ipv4/ip_sock.c:215. This IP does not
seem to match anything sensible in my dump (v3.1), but you may be using a
different Xenomai code base, which may explain the mismatch. At any rate,
this seems to be one of the generic sockopt handlers (setopt, getopt,
getname, setname). Anyway, you get the point.

PS: with the static portion of the kernel, you would have used addr2line
directly on the vmlinux image, using the %RIP value as a hint to the
command.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 7+ messages in thread
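The two checks described in this reply, adding the symbol offset to the function's base address in the object file and reading the faulting instruction out of the oops "Code:" line, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names fault_offset and split_code_line are made up for this example, and the byte string is copied from the oops posted at the start of the thread. The <8b> marker in the "Code:" line is where the kernel flags the first byte of the faulting instruction.

```python
import re


def fault_offset(symbol_base: int, symbol_offset: int) -> int:
    """Address inside the object file for <symbol>+offset."""
    return symbol_base + symbol_offset


def split_code_line(code: str):
    """Split an oops 'Code:' byte string at the <xx> marker.

    Returns (bytes_before_fault, bytes_from_fault) as lists of ints.
    """
    before, from_fault = [], []
    seen_marker = False
    for tok in code.split():
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if marked:
            seen_marker = True
            from_fault.append(int(marked.group(1), 16))
        elif seen_marker:
            from_fault.append(int(tok, 16))
        else:
            before.append(int(tok, 16))
    return before, from_fault


# Byte string copied from the "Code:" line of the oops.
CODE = (
    "68 b8 eb b0 e8 ab d4 62 e1 81 fe 27 00 10 40 0f 84 c1 00 00 00 "
    "7e 73 81 fe 20 00 18 40 74 3c 81 fe 21 00 18 40 0f 85 a0 00 00 00 "
    "<8b> 02 8b 4a 10 4c 8b 42 08 8b 72 04 85 c0 0f 85 d6 00 00 00 83"
)

# Base address of rt_ip_ioctl in the v3.1 object dumped above.
print(hex(fault_offset(0x29F0, 0x27)))

before, after = split_code_line(CODE)
print("faulting bytes start with:", " ".join(f"{b:02x}" for b in after[:2]))
print("last compare before fault:", " ".join(f"{b:02x}" for b in before[-12:-6]))
```

The six bytes 81 fe 21 00 18 40 printed last use the same encoding as the "cmp $0x40180020,%esi" line in the dump above, only with 0x40180021 as the immediate. The faulting path is therefore the one taken when %esi equals 0x40180021, which agrees with RSI in the register dump.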
* Re: Bad ioctl in rtnet 2020-06-09 16:16 ` Philippe Gerum @ 2020-06-12 8:02 ` Per Oberg 2020-06-12 8:40 ` Philippe Gerum 0 siblings, 1 reply; 7+ messages in thread From: Per Oberg @ 2020-06-12 8:02 UTC (permalink / raw) To: xenomai ----- Den 9 jun 2020, på kl 18:16, Philippe Gerum rpm@xenomai.org skrev: > On 6/9/20 4:49 PM, Per Oberg via Xenomai wrote: > > Hello list! >> I get this error when running a posix-wrapper-compiled software pacakge on >> rtnet. Could someone please help me pinpoint which ioctl is causing this? (Does > > it say in the text below or do I need to start spreading breadcrumbs ? ) > > [ 85.577201] I-pipe domain: Linux > > [ 85.577624] task: ffff880262df6c00 task.stack: ffffc9000138c000 >> [ 85.578058] RIP: 0010:[<ffffffffa02b4787>] [<ffffffffa02b4787>] > > rt_ip_ioctl+0x27/0x120 [rtipv4] > > [ 85.578512] RSP: 0018:ffffc9000138fda8 EFLAGS: 00010246 > > [ 85.578958] RAX: 000000000007ffff RBX: 0000000040180021 RCX: ffff88026dd00000 > > [ 85.579409] RDX: 00007ffcb6bd3470 RSI: 0000000040180021 RDI: ffff880262b33a00 > > [ 85.579858] RBP: ffffc9000138fdd0 R08: 0000000000000052 R09: ffff880262df6c00 > > [ 85.580310] R10: 00000000000000e6 R11: 0000000000000000 R12: ffff880262b33a00 > > [ 85.580763] R13: 0000000040180021 R14: 00007ffcb6bd3470 R15: 0000000062b33a00 >> [ 85.581217] FS: 00007fd21c07c480(0000) GS:ffff88026dd00000(0000) > > knlGS:0000000000000000 > > [ 85.581674] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > [ 85.582133] CR2: 00007ffcb6bd3470 CR3: 0000000261a90000 CR4: 0000000000360630 > > [ 85.582602] Stack: > > [ 85.583067] ffffffffa02bf6f7 0000000000000001 ffffffff81178cd0 ffff880262b33a00 > > [ 85.583554] 0000000000000004 ffffc9000138fe60 ffffffff811725be 0000000000000202 > > [ 85.584040] ffff880262df6c00 ffff880200000010 ffffc9000138fe70 ffffc9000138fe08 > > [ 85.584531] Call Trace: > > [ 85.585015] [<ffffffffa02bf6f7>] ? rt_udp_ioctl+0x67/0x8c [rtudp] > > [ 85.585511] [<ffffffff81178cd0>] ? 
CoBaLt_fcntl+0x20/0x20 > > [ 85.586002] [<ffffffff811725be>] rtdm_fd_ioctl+0xee/0x280 > > [ 85.586488] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 > > [ 85.586975] [<ffffffff810a0933>] ? __ipipe_migrate_head+0x73/0xf0 > > [ 85.587466] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 > > [ 85.587957] [<ffffffff81178cde>] CoBaLt_ioctl+0xe/0x20 > > [ 85.588445] [<ffffffff81188472>] ipipe_syscall_hook+0x112/0x350 > > [ 85.588932] [<ffffffff8110acb8>] __ipipe_notify_syscall+0xc8/0x190 > > [ 85.589421] [<ffffffff8110adaa>] ipipe_handle_syscall+0x2a/0xb0 > > [ 85.589912] [<ffffffff81001c3d>] do_syscall_64+0x2d/0xf0 > > [ 85.590404] [<ffffffff818dffbe>] entry_SYSCALL_64_after_swapgs+0x58/0xc6 >> [ 85.590897] Code: 68 b8 eb b0 e8 ab d4 62 e1 81 fe 27 00 10 40 0f 84 c1 00 00 >> 00 7e 73 81 fe 20 00 18 40 74 3c 81 fe 21 00 18 40 0f 85 a0 00 00 00 <8b> 02 8b > > 4a 10 4c 8b > > 42 08 8b 72 04 85 c0 0f 85 d6 00 00 00 83 > > [ 85.592061] RIP [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4] > > [ 85.592592] RSP <ffffc9000138fda8> > > [ 85.593120] CR2: 00007ffcb6bd3470 > The header of this kernel splat - which should normally give you some hint > about the code which triggers it - seems to be missing from the pasted text > above. 
Sorry about that, what was missing was essentially this:

[174576.129988] [Xenomai] switching RTTest to secondary mode after exception #14 in kernel-space at 0xffffffffa02b4787 (pid 485)
[174576.129994] BUG: unable to handle kernel paging request at 00007ffc68617830
[174576.130379] IP: [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4]
[174576.130757] PGD 80000002633d5067
[174576.130765] PUD 2642a0067
[174576.131131] PMD 24e848067
[174576.131135] PTE 8000000262244067
[174576.131507]
[174576.131880] Oops: 0001 [#1] PREEMPT SMP
[174576.132257] Modules linked in: rtudp rtipv4 intel_powerclamp intel_rapl i915 coretemp rt_igb e1000e pcan(O) rtnet video fan thermal_sys
[174576.133071] CPU: 3 PID: 485 Comm: OpENer Tainted: G O 4.9.90-xeno-cobolt #1
[174576.133485] Hardware name: Default string Default string/SKYBAY, BIOS 5.0.1.1 04/18/2016

> Anyway, quick and dirty trick to locate it:
>
> $ $CROSS_COMPILE-objdump -dl $linux-build-tree/drivers/xenomai/net/stack/ipv4/rtipv4.o | grep -A 30 '<rt_ip_ioctl>:'
>
> 00000000000029f0 <rt_ip_ioctl>:
> rt_ip_ioctl():
> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209
>     29f0:  41 54                   push   %r12
>     29f2:  4c 8d 27                lea    (%rdi),%r12
>     29f5:  55                      push   %rbp
> rtdm_fd_to_private():
> linux/include/xenomai/rtdm/driver.h:163
>     29f6:  48 8d 2f                lea    (%rdi),%rbp
> rt_ip_ioctl():
> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209
>     29f9:  48 8d 64 24 e0          lea    -0x20(%rsp),%rsp
> rtdm_fd_to_private():
> linux/include/xenomai/rtdm/driver.h:163
>     29fe:  48 83 c5 58             add    $0x58,%rbp
> rt_ip_ioctl():
> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209
>     2a02:  65 48 8b 04 25 28 00    mov    %gs:0x28,%rax
>     2a09:  00 00
>     2a0b:  48 89 44 24 18          mov    %rax,0x18(%rsp)
>     2a10:  31 c0                   xor    %eax,%eax
> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:215
>     2a12:  81 fe 20 00 18 40       cmp    $0x40180020,%esi
>     2a18:  0f 84 d5 00 00 00       je     2af3 <rt_ip_ioctl+0x103>
>     2a1e:  7f 5f                   jg     2a7f <rt_ip_ioctl+0x8f>
>     2a20:  81 fe 26 00 10 40       cmp    $0x40100026,%esi
>     2a26:  0f 84 98 00 00 00       je     2ac4 <rt_ip_ioctl+0xd4>
>     2a2c:  81 fe 27 00 10 40       cmp    $0x40100027,%esi
>     2a32:  0f 85 81 00 00 00       jne    2ab9 <rt_ip_ioctl+0xc9>
> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:243
>     2a38:  b9 10 00 00 00          mov    $0x10,%ecx
>
> rt_ip_ioctl+0x27 would then be 000029f0 + 0x27, i.e. 00002a17, which would
> be somewhere after xenomai/net/stack/ipv4/ip_sock.c:215. This IP does not
> seem to match anything sensible in my dump (v3.1), but you may be using a
> different Xenomai code base, which may explain the mismatch. At any rate,
> this seems to be one of the generic sockopt handlers (setopt, getopt,
> getname, setname). Anyway, you get the point.

So, I get this (with 0x1760 + 0x27 = 0x1787):

0000000000001760 <rt_ip_ioctl>:
rt_ip_ioctl():
    1760:  e8 00 00 00 00          callq  1765 <rt_ip_ioctl+0x5>
    1765:  81 fe 27 00 10 40       cmp    $0x40100027,%esi
    176b:  0f 84 c1 00 00 00       je     1832 <rt_ip_ioctl+0xd2>
    1771:  7e 73                   jle    17e6 <rt_ip_ioctl+0x86>
    1773:  81 fe 20 00 18 40       cmp    $0x40180020,%esi
    1779:  74 3c                   je     17b7 <rt_ip_ioctl+0x57>
    177b:  81 fe 21 00 18 40       cmp    $0x40180021,%esi
    1781:  0f 85 a0 00 00 00       jne    1827 <rt_ip_ioctl+0xc7>
    1787:  8b 02                   mov    (%rdx),%eax
    1789:  8b 4a 10                mov    0x10(%rdx),%ecx
    178c:  4c 8b 42 08             mov    0x8(%rdx),%r8
    1790:  8b 72 04                mov    0x4(%rdx),%esi
    1793:  85 c0                   test   %eax,%eax
    1795:  0f 85 d6 00 00 00       jne    1871 <rt_ip_ioctl+0x111>
    179b:  83 f9 03                cmp    $0x3,%ecx
    179e:  0f 86 c7 00 00 00       jbe    186b <rt_ip_ioctl+0x10b>
    17a4:  83 fe 01                cmp    $0x1,%esi
    17a7:  0f 85 c4 00 00 00       jne    1871 <rt_ip_ioctl+0x111>
    17ad:  41 8b 10                mov    (%r8),%edx
    17b0:  88 97 60 01 00 00       mov    %dl,0x160(%rdi)
    17b6:  c3                      retq
    17b7:  48 8b 42 10             mov    0x10(%rdx),%rax
    17bb:  48 8b 4a 08             mov    0x8(%rdx),%rcx
    17bf:  8b 52 04                mov    0x4(%rdx),%edx
    17c2:  83 38 03                cmpl   $0x3,(%rax)
    17c5:  0f 86 a0 00 00 00       jbe    186b <rt_ip_ioctl+0x10b>
    17cb:  83 fa 01                cmp    $0x1,%edx
    17ce:  0f 85 9d 00 00 00       jne    1871 <rt_ip_ioctl+0x111>
    17d4:  0f b6 97 60 01 00 00    movzbl 0x160(%rdi),%edx

I have no code-line references to match it with (yet) because it's not
compiled with debug info. However, the "mov (%rdx),%eax" at 0x1787 looks
like a plausible offender.

I am on xenomai-3.0.8a (I don't remember if the 'a' is my own naming or a
real release; it was due to an issue with a file missing from the original
release, I believe...)

I'm not good enough at interpreting calling conventions to figure out where
the value in %rdx came from, so I'll likely have to enable the debugging
flags and recompile before I get any further.

> PS: with the static portion of the kernel, you would have used addr2line
> directly on the vmlinux image, using the %RIP value as a hint to the
> command.

Nice, good to know!

> -- 
> Philippe.

Per Öberg

^ permalink raw reply	[flat|nested] 7+ messages in thread
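On the question of where %rdx comes from: on x86-64 Linux, C functions (kernel code included) receive their first six integer or pointer arguments in %rdi, %rsi, %rdx, %rcx, %r8 and %r9, in that order, so %rdx at function entry holds the third argument. A minimal sketch of that mapping follows. It assumes the handler takes three arguments in the order (fd, request, arg); those parameter names are illustrative and not taken from the RTnet source. The register values are copied from the oops, and the user/kernel split assumes 48-bit canonical addressing.

```python
# Integer/pointer argument registers of the System V AMD64 calling
# convention, in argument order.
SYSV_INT_ARG_REGS = ("RDI", "RSI", "RDX", "RCX", "R8", "R9")

# Register values copied from the oops at the start of the thread.
OOPS_REGS = {
    "RDI": 0xFFFF880262B33A00,
    "RSI": 0x0000000040180021,
    "RDX": 0x00007FFCB6BD3470,
    "RCX": 0xFFFF88026DD00000,
    "R8": 0x0000000000000052,
    "R9": 0xFFFF880262DF6C00,
}


def bind_args(param_names, regs):
    """Pair each parameter name with the register that carries it."""
    return {name: regs[reg] for name, reg in zip(param_names, SYSV_INT_ARG_REGS)}


def address_half(addr):
    """Classify a 64-bit address, assuming 48-bit canonical addressing."""
    if addr <= 0x00007FFFFFFFFFFF:
        return "user"
    if addr >= 0xFFFF800000000000:
        return "kernel"
    return "non-canonical"


# Hypothetical parameter names for a three-argument ioctl handler.
args = bind_args(("fd", "request", "arg"), OOPS_REGS)
for name, value in args.items():
    print(f"{name:8s} = {value:#018x}")
print("fd  points into the", address_half(args["fd"]), "half")
print("arg points into the", address_half(args["arg"]), "half")
```

One caveat: the registers in an oops are captured at the faulting instruction, not at function entry. In the disassembly above only the entry call, compares and jumps run before +0x27, so the argument registers are most likely still intact here, but that has to be checked case by case.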
* Re: Bad ioctl in rtnet 2020-06-12 8:02 ` Per Oberg @ 2020-06-12 8:40 ` Philippe Gerum 2020-06-12 9:05 ` Per Oberg 0 siblings, 1 reply; 7+ messages in thread From: Philippe Gerum @ 2020-06-12 8:40 UTC (permalink / raw) To: Per Oberg, xenomai On 6/12/20 10:02 AM, Per Oberg wrote: > ----- Den 9 jun 2020, på kl 18:16, Philippe Gerum rpm@xenomai.org skrev: > >> On 6/9/20 4:49 PM, Per Oberg via Xenomai wrote: >>> Hello list! > >>> I get this error when running a posix-wrapper-compiled software pacakge on >>> rtnet. Could someone please help me pinpoint which ioctl is causing this? (Does >>> it say in the text below or do I need to start spreading breadcrumbs ? ) > >>> [ 85.577201] I-pipe domain: Linux >>> [ 85.577624] task: ffff880262df6c00 task.stack: ffffc9000138c000 >>> [ 85.578058] RIP: 0010:[<ffffffffa02b4787>] [<ffffffffa02b4787>] >>> rt_ip_ioctl+0x27/0x120 [rtipv4] >>> [ 85.578512] RSP: 0018:ffffc9000138fda8 EFLAGS: 00010246 >>> [ 85.578958] RAX: 000000000007ffff RBX: 0000000040180021 RCX: ffff88026dd00000 >>> [ 85.579409] RDX: 00007ffcb6bd3470 RSI: 0000000040180021 RDI: ffff880262b33a00 >>> [ 85.579858] RBP: ffffc9000138fdd0 R08: 0000000000000052 R09: ffff880262df6c00 >>> [ 85.580310] R10: 00000000000000e6 R11: 0000000000000000 R12: ffff880262b33a00 >>> [ 85.580763] R13: 0000000040180021 R14: 00007ffcb6bd3470 R15: 0000000062b33a00 >>> [ 85.581217] FS: 00007fd21c07c480(0000) GS:ffff88026dd00000(0000) >>> knlGS:0000000000000000 >>> [ 85.581674] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >>> [ 85.582133] CR2: 00007ffcb6bd3470 CR3: 0000000261a90000 CR4: 0000000000360630 >>> [ 85.582602] Stack: >>> [ 85.583067] ffffffffa02bf6f7 0000000000000001 ffffffff81178cd0 ffff880262b33a00 >>> [ 85.583554] 0000000000000004 ffffc9000138fe60 ffffffff811725be 0000000000000202 >>> [ 85.584040] ffff880262df6c00 ffff880200000010 ffffc9000138fe70 ffffc9000138fe08 >>> [ 85.584531] Call Trace: >>> [ 85.585015] [<ffffffffa02bf6f7>] ? 
rt_udp_ioctl+0x67/0x8c [rtudp] >>> [ 85.585511] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 >>> [ 85.586002] [<ffffffff811725be>] rtdm_fd_ioctl+0xee/0x280 >>> [ 85.586488] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 >>> [ 85.586975] [<ffffffff810a0933>] ? __ipipe_migrate_head+0x73/0xf0 >>> [ 85.587466] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 >>> [ 85.587957] [<ffffffff81178cde>] CoBaLt_ioctl+0xe/0x20 >>> [ 85.588445] [<ffffffff81188472>] ipipe_syscall_hook+0x112/0x350 >>> [ 85.588932] [<ffffffff8110acb8>] __ipipe_notify_syscall+0xc8/0x190 >>> [ 85.589421] [<ffffffff8110adaa>] ipipe_handle_syscall+0x2a/0xb0 >>> [ 85.589912] [<ffffffff81001c3d>] do_syscall_64+0x2d/0xf0 >>> [ 85.590404] [<ffffffff818dffbe>] entry_SYSCALL_64_after_swapgs+0x58/0xc6 >>> [ 85.590897] Code: 68 b8 eb b0 e8 ab d4 62 e1 81 fe 27 00 10 40 0f 84 c1 00 00 >>> 00 7e 73 81 fe 20 00 18 40 74 3c 81 fe 21 00 18 40 0f 85 a0 00 00 00 <8b> 02 8b >>> 4a 10 4c 8b >>> 42 08 8b 72 04 85 c0 0f 85 d6 00 00 00 83 >>> [ 85.592061] RIP [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4] >>> [ 85.592592] RSP <ffffc9000138fda8> >>> [ 85.593120] CR2: 00007ffcb6bd3470 > >> The header of this kernel splat - which should normally give you some hint >> about the code which triggers it - seems to be missing from the pasted text >> above. 
> > Sorry about that, what was missing was essentially this: > > [174576.129988] [Xenomai] switching RTTest to secondary mode after exception #14 in kernel-space at 0xffffffffa02b4787 (pid 485) > [174576.129994] BUG: unable to handle kernel paging request at 00007ffc68617830 > [174576.130379] IP: [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4] > [174576.130757] PGD 80000002633d5067 > [174576.130765] PUD 2642a0067 > [174576.131131] PMD 24e848067 > [174576.131135] PTE 8000000262244067 > [174576.131507] > [174576.131880] Oops: 0001 [#1] PREEMPT SMP > [174576.132257] Modules linked in: rtudp rtipv4 intel_powerclamp intel_rapl i915 coretemp rt_igb e1000e pcan(O) rtnet video fan thermal_sys > [174576.133071] CPU: 3 PID: 485 Comm: OpENer Tainted: G O 4.9.90-xeno-cobolt #1 > [174576.133485] Hardware name: Default string Default string/SKYBAY, BIOS 5.0.1.1 04/18/2016 > > >> Anyway, quick and dirty trick to locate it: > >> $ $CROSS_COMPILE-objdump -dl >> $linux-build-tree/drivers/xenomai/net/stack/ipv4/rtipv4.o | grep -A 30 >> '<rt_ip_ioctl>:' > >> 00000000000029f0 <rt_ip_ioctl>: >> rt_ip_ioctl(): >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209 >> 29f0: 41 54 push %r12 >> 29f2: 4c 8d 27 lea (%rdi),%r12 >> 29f5: 55 push %rbp >> rtdm_fd_to_private(): >> linux/include/xenomai/rtdm/driver.h:163 >> 29f6: 48 8d 2f lea (%rdi),%rbp >> rt_ip_ioctl(): >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209 >> 29f9: 48 8d 64 24 e0 lea -0x20(%rsp),%rsp >> rtdm_fd_to_private(): >> linux/include/xenomai/rtdm/driver.h:163 >> 29fe: 48 83 c5 58 add $0x58,%rbp >> rt_ip_ioctl(): >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209 >> 2a02: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax >> 2a09: 00 00 >> 2a0b: 48 89 44 24 18 mov %rax,0x18(%rsp) >> 2a10: 31 c0 xor %eax,%eax >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:215 >> 2a12: 81 fe 20 00 18 40 cmp $0x40180020,%esi >> 2a18: 0f 84 d5 00 00 00 je 2af3 <rt_ip_ioctl+0x103> >> 2a1e: 7f 5f jg 2a7f <rt_ip_ioctl+0x8f> >> 2a20: 81 fe 26 00 10 
40 cmp $0x40100026,%esi >> 2a26: 0f 84 98 00 00 00 je 2ac4 <rt_ip_ioctl+0xd4> >> 2a2c: 81 fe 27 00 10 40 cmp $0x40100027,%esi >> 2a32: 0f 85 81 00 00 00 jne 2ab9 <rt_ip_ioctl+0xc9> >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:243 >> 2a38: b9 10 00 00 00 mov $0x10,%ecx > >> rt_ip_ioctl+0x27 would then be 000029f0 + 0x27, i.e. 00002a17 which would be >> somewhere after xenomai/net/stack/ipv4/ip_sock.c:215. This IP does not seem to >> match anything sensible in my dump (v3.1), but you may be using a different >> Xenomai code base, so this may explain. At any rate, this seems to be one of >> the generic sockopt handlers (setopt, getopt, getname, setname). Anyway, you >> get the point. > > So, I get this: (With 0x1760 + 0x27 = 0x1787) > > 0000000000001760 <rt_ip_ioctl>: > rt_ip_ioctl(): > 1760: e8 00 00 00 00 callq 1765 <rt_ip_ioctl+0x5> > 1765: 81 fe 27 00 10 40 cmp $0x40100027,%esi > 176b: 0f 84 c1 00 00 00 je 1832 <rt_ip_ioctl+0xd2> > 1771: 7e 73 jle 17e6 <rt_ip_ioctl+0x86> > 1773: 81 fe 20 00 18 40 cmp $0x40180020,%esi > 1779: 74 3c je 17b7 <rt_ip_ioctl+0x57> > 177b: 81 fe 21 00 18 40 cmp $0x40180021,%esi > 1781: 0f 85 a0 00 00 00 jne 1827 <rt_ip_ioctl+0xc7> > 1787: 8b 02 mov (%rdx),%eax > 1789: 8b 4a 10 mov 0x10(%rdx),%ecx > 178c: 4c 8b 42 08 mov 0x8(%rdx),%r8 > 1790: 8b 72 04 mov 0x4(%rdx),%esi > 1793: 85 c0 test %eax,%eax > 1795: 0f 85 d6 00 00 00 jne 1871 <rt_ip_ioctl+0x111> > 179b: 83 f9 03 cmp $0x3,%ecx > 179e: 0f 86 c7 00 00 00 jbe 186b <rt_ip_ioctl+0x10b> > 17a4: 83 fe 01 cmp $0x1,%esi > 17a7: 0f 85 c4 00 00 00 jne 1871 <rt_ip_ioctl+0x111> > 17ad: 41 8b 10 mov (%r8),%edx > 17b0: 88 97 60 01 00 00 mov %dl,0x160(%rdi) > 17b6: c3 retq > 17b7: 48 8b 42 10 mov 0x10(%rdx),%rax > 17bb: 48 8b 4a 08 mov 0x8(%rdx),%rcx > 17bf: 8b 52 04 mov 0x4(%rdx),%edx > 17c2: 83 38 03 cmpl $0x3,(%rax) > 17c5: 0f 86 a0 00 00 00 jbe 186b <rt_ip_ioctl+0x10b> > 17cb: 83 fa 01 cmp $0x1,%edx > 17ce: 0f 85 9d 00 00 00 jne 1871 <rt_ip_ioctl+0x111> > 17d4: 0f b6 97 60 01 00 00 movzbl 
0x160(%rdi),%edx
>
> I have no code-line references to match it with (yet) because it's not
> compiled with debug info. However, the "mov (%rdx),%eax" at 0x1787 looks
> like a plausible offender.
>
> I am on xenomai-3.0.8a (I don't remember if the 'a' is my own naming or a
> real release; it was due to an issue with a file missing from the original
> release, I believe...)

IIRC, the project rather used 3.x.y.z for brown paper bag releases, so
3.0.8a may be your own tag.

>
> I'm not good enough at interpreting calling conventions to figure out where
> the value in %rdx came from, so I'll likely have to enable the debugging
> flags and recompile before I get any further.

Ok, since $0x40180021 should be the ioctl code for _RTIOC_SETSOCKOPT in
3.0.x, I believe that you are hitting a generalized bug in RTnet for 3.0.x
which has been gradually fixed in 3.1 by a (long) series of commits,
addressing spurious direct accesses to user memory from kernel space
instead of copy_to/from_user. This would be confirmed by the value of %RDX,
which very much looks like a user-space address. In other words, that
address is most likely perfectly valid, but RTnet should not have
dereferenced it directly; it should have used some form of copy_from_user()
helper instead.

On x86, you may want to try passing 'nosmap' in the kernel bootargs in
order to work around this the hard way, by disabling the access validation
done by the MMU. However, this would only paper over the issue, and any
unexpected minor fault occurring as a result of such access (i.e. page
table entry not present for an otherwise valid memory address) would cause
the kernel to take an uncontrolled exception and likely freak out. Those
minor faults should not happen; however, we have recently seen cases where
they do happen when userland performs specific actions, such as loading a
DSO.

The long-term solution would be to switch to 3.1 if the application system
depends on RTnet.

-- 
Philippe.

^ permalink raw reply	[flat|nested] 7+ messages in thread
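The request value found in %RSI can be taken apart using the generic Linux ioctl number layout as used on x86: an 8-bit number, an 8-bit type, a 14-bit argument size and a 2-bit direction, from least to most significant bits. A minimal sketch, with helper names invented for this example; the list of loads is read off the disassembly posted earlier in the thread (addresses 0x1787 to 0x1790).

```python
from typing import NamedTuple

# Field widths of the generic Linux ioctl number layout (as used on x86).
NR_BITS, TYPE_BITS, SIZE_BITS, DIR_BITS = 8, 8, 14, 2
NR_SHIFT = 0
TYPE_SHIFT = NR_SHIFT + NR_BITS
SIZE_SHIFT = TYPE_SHIFT + TYPE_BITS
DIR_SHIFT = SIZE_SHIFT + SIZE_BITS
DIR_NAMES = {0: "none", 1: "write", 2: "read", 3: "read|write"}


class IoctlFields(NamedTuple):
    direction: str
    size: int
    type: int
    nr: int


def decode_ioctl(request: int) -> IoctlFields:
    """Split a 32-bit ioctl request code into its four fields."""
    return IoctlFields(
        direction=DIR_NAMES[(request >> DIR_SHIFT) & ((1 << DIR_BITS) - 1)],
        size=(request >> SIZE_SHIFT) & ((1 << SIZE_BITS) - 1),
        type=(request >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1),
        nr=(request >> NR_SHIFT) & ((1 << NR_BITS) - 1),
    )


fields = decode_ioctl(0x40180021)
print(fields)

# (offset, width in bytes) of each load through %rdx at 0x1787..0x1790:
# mov (%rdx),%eax / mov 0x10(%rdx),%ecx / mov 0x8(%rdx),%r8 / mov 0x4(%rdx),%esi
LOADS = [(0x00, 4), (0x10, 4), (0x08, 8), (0x04, 4)]
highest_byte = max(offset + width for offset, width in LOADS)
print("loads stay inside the argument block:", highest_byte <= fields.size)
```

A "write" direction with a 24-byte size means userland hands the kernel a 24-byte argument block to read, and the four loads from +0x27 onward stay within those 24 bytes. That fits the picture of the handler reading the argument block straight through the user pointer rather than copying it in first.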
* Re: Bad ioctl in rtnet 2020-06-12 8:40 ` Philippe Gerum @ 2020-06-12 9:05 ` Per Oberg 2020-06-12 15:21 ` Per Oberg 0 siblings, 1 reply; 7+ messages in thread From: Per Oberg @ 2020-06-12 9:05 UTC (permalink / raw) To: xenomai ----- Den 12 jun 2020, på kl 10:40, Philippe Gerum rpm@xenomai.org skrev: > On 6/12/20 10:02 AM, Per Oberg wrote: > > ----- Den 9 jun 2020, på kl 18:16, Philippe Gerum rpm@xenomai.org skrev: > >> On 6/9/20 4:49 PM, Per Oberg via Xenomai wrote: > >>> Hello list! > >>> I get this error when running a posix-wrapper-compiled software pacakge on > >>> rtnet. Could someone please help me pinpoint which ioctl is causing this? (Does > >>> it say in the text below or do I need to start spreading breadcrumbs ? ) > >>> [ 85.577201] I-pipe domain: Linux > >>> [ 85.577624] task: ffff880262df6c00 task.stack: ffffc9000138c000 > >>> [ 85.578058] RIP: 0010:[<ffffffffa02b4787>] [<ffffffffa02b4787>] > >>> rt_ip_ioctl+0x27/0x120 [rtipv4] > >>> [ 85.578512] RSP: 0018:ffffc9000138fda8 EFLAGS: 00010246 > >>> [ 85.578958] RAX: 000000000007ffff RBX: 0000000040180021 RCX: ffff88026dd00000 > >>> [ 85.579409] RDX: 00007ffcb6bd3470 RSI: 0000000040180021 RDI: ffff880262b33a00 > >>> [ 85.579858] RBP: ffffc9000138fdd0 R08: 0000000000000052 R09: ffff880262df6c00 > >>> [ 85.580310] R10: 00000000000000e6 R11: 0000000000000000 R12: ffff880262b33a00 > >>> [ 85.580763] R13: 0000000040180021 R14: 00007ffcb6bd3470 R15: 0000000062b33a00 > >>> [ 85.581217] FS: 00007fd21c07c480(0000) GS:ffff88026dd00000(0000) > >>> knlGS:0000000000000000 > >>> [ 85.581674] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > >>> [ 85.582133] CR2: 00007ffcb6bd3470 CR3: 0000000261a90000 CR4: 0000000000360630 > >>> [ 85.582602] Stack: > >>> [ 85.583067] ffffffffa02bf6f7 0000000000000001 ffffffff81178cd0 ffff880262b33a00 > >>> [ 85.583554] 0000000000000004 ffffc9000138fe60 ffffffff811725be 0000000000000202 > >>> [ 85.584040] ffff880262df6c00 ffff880200000010 ffffc9000138fe70 ffffc9000138fe08 > >>> [ 
85.584531] Call Trace: > >>> [ 85.585015] [<ffffffffa02bf6f7>] ? rt_udp_ioctl+0x67/0x8c [rtudp] > >>> [ 85.585511] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 > >>> [ 85.586002] [<ffffffff811725be>] rtdm_fd_ioctl+0xee/0x280 > >>> [ 85.586488] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 > >>> [ 85.586975] [<ffffffff810a0933>] ? __ipipe_migrate_head+0x73/0xf0 > >>> [ 85.587466] [<ffffffff81178cd0>] ? CoBaLt_fcntl+0x20/0x20 > >>> [ 85.587957] [<ffffffff81178cde>] CoBaLt_ioctl+0xe/0x20 > >>> [ 85.588445] [<ffffffff81188472>] ipipe_syscall_hook+0x112/0x350 > >>> [ 85.588932] [<ffffffff8110acb8>] __ipipe_notify_syscall+0xc8/0x190 > >>> [ 85.589421] [<ffffffff8110adaa>] ipipe_handle_syscall+0x2a/0xb0 > >>> [ 85.589912] [<ffffffff81001c3d>] do_syscall_64+0x2d/0xf0 > >>> [ 85.590404] [<ffffffff818dffbe>] entry_SYSCALL_64_after_swapgs+0x58/0xc6 > >>> [ 85.590897] Code: 68 b8 eb b0 e8 ab d4 62 e1 81 fe 27 00 10 40 0f 84 c1 00 00 > >>> 00 7e 73 81 fe 20 00 18 40 74 3c 81 fe 21 00 18 40 0f 85 a0 00 00 00 <8b> 02 8b > >>> 4a 10 4c 8b > >>> 42 08 8b 72 04 85 c0 0f 85 d6 00 00 00 83 > >>> [ 85.592061] RIP [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4] > >>> [ 85.592592] RSP <ffffc9000138fda8> > >>> [ 85.593120] CR2: 00007ffcb6bd3470 > >> The header of this kernel splat - which should normally give you some hint > >> about the code which triggers it - seems to be missing from the pasted text > >> above. 
> > Sorry about that, what was missing was essentially this: >> [174576.129988] [Xenomai] switching RTTest to secondary mode after exception #14 > > in kernel-space at 0xffffffffa02b4787 (pid 485) > > [174576.129994] BUG: unable to handle kernel paging request at 00007ffc68617830 > > [174576.130379] IP: [<ffffffffa02b4787>] rt_ip_ioctl+0x27/0x120 [rtipv4] > > [174576.130757] PGD 80000002633d5067 > > [174576.130765] PUD 2642a0067 > > [174576.131131] PMD 24e848067 > > [174576.131135] PTE 8000000262244067 > > [174576.131507] > > [174576.131880] Oops: 0001 [#1] PREEMPT SMP >> [174576.132257] Modules linked in: rtudp rtipv4 intel_powerclamp intel_rapl i915 > > coretemp rt_igb e1000e pcan(O) rtnet video fan thermal_sys > > [174576.133071] CPU: 3 PID: 485 Comm: OpENer Tainted: G O 4.9.90-xeno-cobolt #1 >> [174576.133485] Hardware name: Default string Default string/SKYBAY, BIOS > > 5.0.1.1 04/18/2016 > >> Anyway, quick and dirty trick to locate it: > >> $ $CROSS_COMPILE-objdump -dl > >> $linux-build-tree/drivers/xenomai/net/stack/ipv4/rtipv4.o | grep -A 30 > >> '<rt_ip_ioctl>:' > >> 00000000000029f0 <rt_ip_ioctl>: > >> rt_ip_ioctl(): > >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209 > >> 29f0: 41 54 push %r12 > >> 29f2: 4c 8d 27 lea (%rdi),%r12 > >> 29f5: 55 push %rbp > >> rtdm_fd_to_private(): > >> linux/include/xenomai/rtdm/driver.h:163 > >> 29f6: 48 8d 2f lea (%rdi),%rbp > >> rt_ip_ioctl(): > >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209 > >> 29f9: 48 8d 64 24 e0 lea -0x20(%rsp),%rsp > >> rtdm_fd_to_private(): > >> linux/include/xenomai/rtdm/driver.h:163 > >> 29fe: 48 83 c5 58 add $0x58,%rbp > >> rt_ip_ioctl(): > >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:209 > >> 2a02: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax > >> 2a09: 00 00 > >> 2a0b: 48 89 44 24 18 mov %rax,0x18(%rsp) > >> 2a10: 31 c0 xor %eax,%eax > >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:215 > >> 2a12: 81 fe 20 00 18 40 cmp $0x40180020,%esi > >> 2a18: 0f 84 d5 00 00 00 je 2af3 
<rt_ip_ioctl+0x103>
> >> 2a1e: 7f 5f                 jg     2a7f <rt_ip_ioctl+0x8f>
> >> 2a20: 81 fe 26 00 10 40     cmp    $0x40100026,%esi
> >> 2a26: 0f 84 98 00 00 00     je     2ac4 <rt_ip_ioctl+0xd4>
> >> 2a2c: 81 fe 27 00 10 40     cmp    $0x40100027,%esi
> >> 2a32: 0f 85 81 00 00 00     jne    2ab9 <rt_ip_ioctl+0xc9>
> >> linux/drivers/xenomai/net/stack/ipv4/ip_sock.c:243
> >> 2a38: b9 10 00 00 00        mov    $0x10,%ecx
> >>
> >> rt_ip_ioctl+0x27 would then be 000029f0 + 0x27, i.e. 00002a17 which would be
> >> somewhere after xenomai/net/stack/ipv4/ip_sock.c:215. This IP does not seem to
> >> match anything sensible in my dump (v3.1), but you may be using a different
> >> Xenomai code base, so this may explain it. At any rate, this seems to be one of
> >> the generic sockopt handlers (setopt, getopt, getname, setname). Anyway, you
> >> get the point.
> >
> > So, I get this: (With 0x1760 + 0x27 = 0x1787)
> >
> > 0000000000001760 <rt_ip_ioctl>:
> > rt_ip_ioctl():
> > 1760: e8 00 00 00 00        callq  1765 <rt_ip_ioctl+0x5>
> > 1765: 81 fe 27 00 10 40     cmp    $0x40100027,%esi
> > 176b: 0f 84 c1 00 00 00     je     1832 <rt_ip_ioctl+0xd2>
> > 1771: 7e 73                 jle    17e6 <rt_ip_ioctl+0x86>
> > 1773: 81 fe 20 00 18 40     cmp    $0x40180020,%esi
> > 1779: 74 3c                 je     17b7 <rt_ip_ioctl+0x57>
> > 177b: 81 fe 21 00 18 40     cmp    $0x40180021,%esi
> > 1781: 0f 85 a0 00 00 00     jne    1827 <rt_ip_ioctl+0xc7>
> > 1787: 8b 02                 mov    (%rdx),%eax
> > 1789: 8b 4a 10              mov    0x10(%rdx),%ecx
> > 178c: 4c 8b 42 08           mov    0x8(%rdx),%r8
> > 1790: 8b 72 04              mov    0x4(%rdx),%esi
> > 1793: 85 c0                 test   %eax,%eax
> > 1795: 0f 85 d6 00 00 00     jne    1871 <rt_ip_ioctl+0x111>
> > 179b: 83 f9 03              cmp    $0x3,%ecx
> > 179e: 0f 86 c7 00 00 00     jbe    186b <rt_ip_ioctl+0x10b>
> > 17a4: 83 fe 01              cmp    $0x1,%esi
> > 17a7: 0f 85 c4 00 00 00     jne    1871 <rt_ip_ioctl+0x111>
> > 17ad: 41 8b 10              mov    (%r8),%edx
> > 17b0: 88 97 60 01 00 00     mov    %dl,0x160(%rdi)
> > 17b6: c3                    retq
> > 17b7: 48 8b 42 10           mov    0x10(%rdx),%rax
> > 17bb: 48 8b 4a 08           mov    0x8(%rdx),%rcx
> > 17bf: 8b 52 04              mov    0x4(%rdx),%edx
> > 17c2: 83 38 03              cmpl   $0x3,(%rax)
> > 17c5: 0f 86 a0 00 00 00     jbe    186b <rt_ip_ioctl+0x10b>
> > 17cb: 83 fa 01              cmp    $0x1,%edx
> > 17ce: 0f 85 9d 00 00 00     jne    1871 <rt_ip_ioctl+0x111>
> > 17d4: 0f b6 97 60 01 00 00  movzbl 0x160(%rdi),%edx
> >
> > I have no code-line references to match it with (yet) because it's not compiled
> > with debug info. However, the "mov (%rdx),%eax" does not seem like an
> > impossible offender.
> >
> > I am on xenomai-3.0.8a (I don't remember if the 'a' is my name or a real
> > release, it was due to an issue with a file missing in the original
> > release I believe...)
>
> IIRC, the project rather used 3.x.y.z for brown paper bag releases, so 3.0.8a
> may be your own tag.
>
> > I'm not good enough in calling convention interpretation to figure out where the
> > value in %rdx came from so I'll likely have to enable the debugging flags and
> > recompile before I'll get any further.
>
> Ok, since $0x40180021 should be the ioctl code for _RTIOC_SETSOCKOPT in 3.0.x,
> I believe that you are hitting a generalized bug in RTnet for 3.0.x which has
> been gradually fixed in 3.1 by a (long) series of commits, addressing spurious
> direct accesses to user memory from kernel space instead of copy_to/from_user.
> This would be confirmed by the value of %RDX which very much looks like a
> user-space address. In other words, that address is most likely perfectly
> valid, but RTnet should not have dereferenced it directly, but should have
> used some form of copy_from_user() helper instead.
>
> On x86, you may want to try passing 'nosmap' in the kernel bootargs in order
> to work around this the hard way, by disabling the access validation done by
> the MMU.

Ah, yes of course. Now that you mention it I've seen this before but forgot
about it.
It does not happen with "nosmap" enabled.

> However, this would only paper over the issue, and any unexpected
> minor fault occurring as a result of such access (i.e. page table entry not
> present for an otherwise valid memory) would cause the kernel to take an
> uncontrolled exception and likely freak out. Those minor faults should not
> happen, however we have just experienced cases where it may happen if userland
> does some specific actions, like loading a DSO.

Right now, I only need it for a proof of concept but I'll keep that in mind for
later.

> The long-term solution would be to switch to 3.1 if the application system
> depends on RTnet.

I will try to get 3.1 out for a spin before my summer holiday and report back
whether it's solved or not. Otherwise I'll put it on my todo-list for later.

> --
> Philippe.

Thank you very much!
Per Öberg

^ permalink raw reply [flat|nested] 7+ messages in thread
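A side note on the request code discussed in this message: the value 0x40180021 that rt_ip_ioctl() compares %esi against can be taken apart by hand. What follows is a minimal sketch in Python, not something posted in the thread, and it assumes the asm-generic ioctl bit layout that x86 uses. The name decode_ioctl is made up for illustration.

```python
# Decode a Linux ioctl request number, assuming the asm-generic bit layout
# used on x86: nr in bits 0-7, type in bits 8-15, size in bits 16-29,
# direction in bits 30-31. decode_ioctl is a hypothetical helper name.
IOC_NRBITS, IOC_TYPEBITS, IOC_SIZEBITS = 8, 8, 14
IOC_DIRS = {0: "none", 1: "write", 2: "read", 3: "read/write"}


def decode_ioctl(cmd):
    """Split a 32-bit ioctl request number into its four fields."""
    nr = cmd & ((1 << IOC_NRBITS) - 1)
    ioc_type = (cmd >> IOC_NRBITS) & ((1 << IOC_TYPEBITS) - 1)
    size = (cmd >> (IOC_NRBITS + IOC_TYPEBITS)) & ((1 << IOC_SIZEBITS) - 1)
    direction = (cmd >> (IOC_NRBITS + IOC_TYPEBITS + IOC_SIZEBITS)) & 0x3
    return {"dir": IOC_DIRS[direction], "size": size, "type": ioc_type, "nr": nr}


if __name__ == "__main__":
    # Request codes taken from the cmp instructions in the disassembly.
    for cmd in (0x40180020, 0x40180021, 0x40100026, 0x40100027):
        print(hex(cmd), decode_ioctl(cmd))
```

For 0x40180021 this yields direction "write" (user space hands data to the kernel), a 24-byte argument, type 0 and command number 0x21. A 24-byte argument block is at least consistent with the four loads at offsets 0x0, 0x4, 0x8 and 0x10 from %rdx right after the compare, i.e. with %rdx holding the user-supplied argument pointer that gets dereferenced directly.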
* Re: Bad ioctl in rtnet
  2020-06-12  9:05 ` Per Oberg
@ 2020-06-12 15:21 ` Per Oberg
  2020-06-12 15:27   ` Per Oberg
  0 siblings, 1 reply; 7+ messages in thread
From: Per Oberg @ 2020-06-12 15:21 UTC (permalink / raw)
  To: xenomai

----- On 12 Jun 2020, at 11:05, xenomai xenomai@xenomai.org wrote:

> ----- On 12 Jun 2020, at 10:40, Philippe Gerum rpm@xenomai.org wrote:

> > The long-term solution would be to switch to 3.1 if the application system
> > depends on RTnet.

> I will try to get 3.1 out for a spin before my summer holiday and report back
> whether it's solved or not. Otherwise I'll put it on my todo-list for later.

I have not tried this with xenomai-3.1 and can confirm that it solves this issue.

Per Öberg

^ permalink raw reply [flat|nested] 7+ messages in thread
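The reasoning earlier in the thread about %RDX looking like a user-space address, and about 'nosmap' hiding the problem, can be cross-checked against the numbers in the splat. The sketch below is an illustration only, assuming x86-64 with 4-level paging (47-bit user addresses) and the architectural page-fault error code bits; the helper names are hypothetical.

```python
# Cross-check of the oops values, assuming x86-64 with 4-level paging.
# Both helper names are hypothetical and exist only for this illustration.
USER_SPACE_LIMIT = 0x0000800000000000  # first address above the user half


def is_user_address(addr):
    """True if addr lies in the lower (user) canonical half."""
    return 0 <= addr < USER_SPACE_LIMIT


def decode_pf_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "present": bool(code & 0x01),            # 1: protection fault, page mapped
        "write": bool(code & 0x02),              # 0: the access was a read
        "user_mode": bool(code & 0x04),          # 0: fault raised in kernel mode
        "reserved_bit": bool(code & 0x08),
        "instruction_fetch": bool(code & 0x10),
    }


if __name__ == "__main__":
    cr2 = 0x00007FFCB6BD3470  # faulting address (CR2, same value as RDX)
    rip = 0xFFFFFFFFA02B4787  # rt_ip_ioctl+0x27
    print("CR2 in user space:", is_user_address(cr2))
    print("RIP in user space:", is_user_address(rip))
    print("Oops 0001:", decode_pf_error(0x0001))
    # Offset arithmetic from the two objdump listings in the thread.
    print(hex(0x1760 + 0x27), hex(0x29F0 + 0x27))
```

With these inputs the faulting address falls in the user half while the instruction pointer is a kernel address, and error code 0001 decodes to a kernel-mode read of a page that is present. That combination fits a mapped, valid user page that the kernel touched directly, which is the kind of access SMAP rejects and 'nosmap' lets through.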
* Re: Bad ioctl in rtnet
  2020-06-12 15:21 ` Per Oberg
@ 2020-06-12 15:27 ` Per Oberg
  0 siblings, 0 replies; 7+ messages in thread
From: Per Oberg @ 2020-06-12 15:27 UTC (permalink / raw)
  To: xenomai

----- On 12 Jun 2020, at 17:21, xenomai xenomai@xenomai.org wrote:

> I have not tried this with xenomai-3.1 and can confirm that it solves this
> issue.

Sorry about the ambiguity and the spamming. It should have read:
"I have NOW tried this with xenomai-3.1 and can confirm that it solves this issue."

Per Öberg

^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-06-12 15:27 UTC | newest]

Thread overview: 7+ messages
2020-06-09 14:49 Bad ioctl in rtnet Per Oberg
2020-06-09 16:16 ` Philippe Gerum
2020-06-12  8:02   ` Per Oberg
2020-06-12  8:40     ` Philippe Gerum
2020-06-12  9:05       ` Per Oberg
2020-06-12 15:21         ` Per Oberg
2020-06-12 15:27           ` Per Oberg