* Urgent Bug Report Kernel crash 6.5.2
@ 2023-09-15 4:05 Martin Zaharinov
  2023-09-15 6:45 ` Eric Dumazet
  2023-09-15 23:00 ` Martin Zaharinov
  0 siblings, 2 replies; 35+ messages in thread
From: Martin Zaharinov @ 2023-09-15 4:05 UTC
To: netdev
Cc: Eric Dumazet, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern

Hi All

This is a report from kernel 6.5.2. After 4 days of uptime the system hung and rebooted after this error:

Sep 15 04:32:29 205.254.184.12 [399661.971344][ C31] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
Sep 15 04:32:29 205.254.184.12 [399661.971470][ C31] BUG: unable to handle page fault for address: ffffa10c52d43058
Sep 15 04:32:29 205.254.184.12 [399661.971586][ C31] #PF: supervisor instruction fetch in kernel mode
Sep 15 04:32:29 205.254.184.12 [399661.971680][ C31] #PF: error_code(0x0011) - permissions violation
Sep 15 04:32:29 205.254.184.12 [399661.971775][ C31] PGD 12601067 P4D 12601067 PUD 80000002400001e3
Sep 15 04:32:29 205.254.184.12 [399661.971871][ C31] Oops: 0011 [#1] PREEMPT SMP
Sep 15 04:32:29 205.254.184.12 [399661.971963][ C31] CPU: 31 PID: 0 Comm: swapper/31 Tainted: G W O 6.5.2 #1
Sep 15 04:32:29 205.254.184.12 [399661.972079][ C31] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020
Sep 15 04:32:29 205.254.184.12 [399661.972197][ C31] RIP: 0010:0xffffa10c52d43058
Sep 15 04:32:29 205.254.184.12 [399661.972289][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00
Sep 15 04:32:29 205.254.184.12 [399661.972448][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282
Sep 15 04:32:29 205.254.184.12 [399661.972543][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000
Sep 15 04:32:29 205.254.184.12 [399661.972659][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000
Sep 15 04:32:29 205.254.184.12 [399661.972774][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c
Sep 15 04:32:29 205.254.184.12 [399661.972889][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80
Sep 15 04:32:29 205.254.184.12 [399661.973005][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600
Sep 15 04:32:29 205.254.184.12 [399661.973123][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000
Sep 15 04:32:29 205.254.184.12 [399661.973244][ C31] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 15 04:32:29 205.254.184.12 [399661.973338][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0
Sep 15 04:32:29 205.254.184.12 [399661.973454][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 15 04:32:29 205.254.184.12 [399661.973569][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 15 04:32:29 205.254.184.12 [399661.973684][ C31] Call Trace:
Sep 15 04:32:29 205.254.184.12 [399661.973773][ C31] <IRQ>
Sep 15 04:32:29 205.254.184.12 [399661.973859][ C31] ? __die+0xe4/0xf0
Sep 15 04:32:29 205.254.184.12 [399661.973949][ C31] ? page_fault_oops+0x144/0x3e0
Sep 15 04:32:29 205.254.184.12 [399661.974043][ C31] ? exc_page_fault+0x92/0xa0
Sep 15 04:32:29 205.254.184.12 [399661.974136][ C31] ? asm_exc_page_fault+0x22/0x30
Sep 15 04:32:29 205.254.184.12 [399661.974228][ C31] ? kfree_skb_reason+0x33/0xf0
Sep 15 04:32:29 205.254.184.12 [399661.974321][ C31] ? tcp_mtu_probe+0x3a6/0x7b0
Sep 15 04:32:29 205.254.184.12 [399661.974416][ C31] ? tcp_write_xmit+0x7fa/0x1410
Sep 15 04:32:29 205.254.184.12 [399661.974509][ C31] ? __tcp_push_pending_frames+0x2d/0xb0
Sep 15 04:32:29 205.254.184.12 [399661.974603][ C31] ? tcp_rcv_established+0x381/0x610
Sep 15 04:32:29 205.254.184.12 [399661.974695][ C31] ? sk_filter_trim_cap+0xc6/0x1c0
Sep 15 04:32:29 205.254.184.12 [399661.974787][ C31] ? tcp_v4_do_rcv+0x11f/0x1f0
Sep 15 04:32:29 205.254.184.12 [399661.974877][ C31] ? tcp_v4_rcv+0xfa1/0x1010
Sep 15 04:32:29 205.254.184.12 [399661.974968][ C31] ? ip_protocol_deliver_rcu+0x1b/0x270
Sep 15 04:32:29 205.254.184.12 [399661.975062][ C31] ? ip_local_deliver_finish+0x6d/0x90
Sep 15 04:32:29 205.254.184.12 [399661.976257][ C31] ? process_backlog+0x10c/0x230
Sep 15 04:32:29 205.254.184.12 [399661.976352][ C31] ? __napi_poll+0x20/0x180
Sep 15 04:32:29 205.254.184.12 [399661.976442][ C31] ? net_rx_action+0x2a4/0x390
Sep 15 04:32:29 205.254.184.12 [399661.976534][ C31] ? __do_softirq+0xd0/0x202
Sep 15 04:32:29 205.254.184.12 [399661.976626][ C31] ? do_softirq+0x3a/0x50
Sep 15 04:32:29 205.254.184.12 [399661.976718][ C31] </IRQ>
Sep 15 04:32:29 205.254.184.12 [399661.976805][ C31] <TASK>
Sep 15 04:32:29 205.254.184.12 [399661.976890][ C31] ? flush_smp_call_function_queue+0x3f/0x50
Sep 15 04:32:29 205.254.184.12 [399661.976988][ C31] ? do_idle+0x14d/0x210
Sep 15 04:32:29 205.254.184.12 [399661.977078][ C31] ? cpu_startup_entry+0x14/0x20
Sep 15 04:32:29 205.254.184.12 [399661.977168][ C31] ? start_secondary+0xe1/0xf0
Sep 15 04:32:29 205.254.184.12 [399661.977262][ C31] ? secondary_startup_64_no_verify+0x167/0x16b
Sep 15 04:32:29 205.254.184.12 [399661.977359][ C31] </TASK>
Sep 15 04:32:29 205.254.184.12 [399661.977448][ C31] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos
Sep 15 04:32:29 205.254.184.12 [399661.977720][ C31] CR2: ffffa10c52d43058
Sep 15 04:32:29 205.254.184.12 [399661.977809][ C31] ---[ end trace 0000000000000000 ]---
Sep 15 04:32:29 205.254.184.12 [399661.977901][ C31] RIP: 0010:0xffffa10c52d43058
Sep 15 04:32:29 205.254.184.12 [399661.977992][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00
Sep 15 04:32:29 205.254.184.12 [399661.978150][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282
Sep 15 04:32:29 205.254.184.12 [399661.978243][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000
Sep 15 04:32:29 205.254.184.12 [399661.978358][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000
Sep 15 04:32:29 205.254.184.12 [399661.978472][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c
Sep 15 04:32:29 205.254.184.12 [399661.978587][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80
Sep 15 04:32:29 205.254.184.12 [399661.978702][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600
Sep 15 04:32:29 205.254.184.12 [399661.978818][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000
Sep 15 04:32:29 205.254.184.12 [399661.978940][ C31] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 15 04:32:29 205.254.184.12 [399661.979036][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0
Sep 15 04:32:29 205.254.184.12 [399661.979150][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 15 04:32:29 205.254.184.12 [399661.979265][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 15 04:32:29 205.254.184.12 [399661.979381][ C31] Kernel panic - not syncing: Fatal exception in interrupt
Sep 15 04:32:29 205.254.184.12 [399662.084038][ C31] Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Sep 15 04:32:29 205.254.184.12 [399662.084162][ C31] Rebooting in 10 seconds..

Please update me if you find a fix.

m.
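For readers following the oops text above, here is a minimal sketch of how the `error_code(0x0011)` value and the faulting address can be read. The bit meanings are the architectural x86 page-fault error code bits; the helper names `decode_pf_error_code` and `in_relocation_range` are hypothetical and exist only for this illustration. It is a reading aid, not a diagnosis of the bug.

```python
# Architectural x86 page-fault error code bits.
# Each entry: (bit number, meaning if set, meaning if clear or None).
PF_ERROR_BITS = [
    (0, "protection violation on a present page", "page not present"),
    (1, "write access", "not a write"),
    (2, "user mode", "supervisor (kernel) mode"),
    (3, "reserved bit set in a paging entry", None),
    (4, "instruction fetch", None),
    (5, "protection-key violation", None),
]


def decode_pf_error_code(code):
    """Return human-readable reasons for a #PF error code (hypothetical helper)."""
    reasons = []
    for bit, if_set, if_clear in PF_ERROR_BITS:
        if code & (1 << bit):
            reasons.append(if_set)
        elif if_clear is not None:
            reasons.append(if_clear)
    return reasons


def in_relocation_range(addr, start=0xffffffff80000000, end=0xffffffffbfffffff):
    """Check an address against the relocation range printed in the panic message."""
    return start <= addr <= end


if __name__ == "__main__":
    print(decode_pf_error_code(0x0011))
    # RIP and CR2 are both 0xffffa10c52d43058 in the report.
    print(in_relocation_range(0xffffa10c52d43058))
```

Under these assumptions, 0x0011 reads as a supervisor-mode instruction fetch from a present page, and the RIP value lies outside the relocation range printed in the log. Both observations are consistent with the "kernel tried to execute NX-protected page" line, which suggests the CPU jumped to a data address rather than to kernel code.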
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-15 4:05 Urgent Bug Report Kernel crash 6.5.2 Martin Zaharinov
@ 2023-09-15 6:45 ` Eric Dumazet
  2023-09-15 22:23 ` Martin Zaharinov
  2023-11-16 14:17 ` Martin Zaharinov
  2023-09-15 23:00 ` Martin Zaharinov
  1 sibling, 2 replies; 35+ messages in thread
From: Eric Dumazet @ 2023-09-15 6:45 UTC
To: Martin Zaharinov
Cc: netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern

On Fri, Sep 15, 2023 at 6:05 AM Martin Zaharinov <micron10@gmail.com> wrote:
>
> [...]
>
> Sep 15 04:32:29 205.254.184.12 [399661.973684][ C31] Call Trace:
> Sep 15 04:32:29 205.254.184.12 [399661.973773][ C31] <IRQ>
> Sep 15 04:32:29 205.254.184.12 [399661.973859][ C31] ? __die+0xe4/0xf0
> Sep 15 04:32:29 205.254.184.12 [399661.973949][ C31] ? page_fault_oops+0x144/0x3e0
> Sep 15 04:32:29 205.254.184.12 [399661.974043][ C31] ? exc_page_fault+0x92/0xa0
> Sep 15 04:32:29 205.254.184.12 [399661.974136][ C31] ? asm_exc_page_fault+0x22/0x30
> Sep 15 04:32:29 205.254.184.12 [399661.974228][ C31] ? kfree_skb_reason+0x33/0xf0
> Sep 15 04:32:29 205.254.184.12 [399661.974321][ C31] ? tcp_mtu_probe+0x3a6/0x7b0
> Sep 15 04:32:29 205.254.184.12 [399661.974416][ C31] ? tcp_write_xmit+0x7fa/0x1410
> Sep 15 04:32:29 205.254.184.12 [399661.974509][ C31] ? __tcp_push_pending_frames+0x2d/0xb0
> Sep 15 04:32:29 205.254.184.12 [399661.974603][ C31] ? tcp_rcv_established+0x381/0x610
> Sep 15 04:32:29 205.254.184.12 [399661.974695][ C31] ? sk_filter_trim_cap+0xc6/0x1c0
> Sep 15 04:32:29 205.254.184.12 [399661.974787][ C31] ? tcp_v4_do_rcv+0x11f/0x1f0
> Sep 15 04:32:29 205.254.184.12 [399661.974877][ C31] ? tcp_v4_rcv+0xfa1/0x1010

Your reports are not usable. Please make sure to include symbols next time.

Please read these parts (and possibly complete files)

Documentation/admin-guide/bug-hunting.rst:55:quality of the stack
trace by using file:`scripts/decode_stacktrace.sh`.

Documentation/admin-guide/reporting-issues.rst:978:
[user@something ~]$ sudo dmesg |
./linux-5.10.5/scripts/decode_stacktrace.sh ./linux-5.10.5/vmlinux
Documentation/admin-guide/reporting-issues.rst:985:
[user@something ~]$ sudo dmesg |
./linux-5.10.5/scripts/decode_stacktrace.sh \

> [...]
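The log lines in the original report carry a syslog timestamp and a source IP in front of the kernel timestamp, which suggests they were collected by a remote syslog receiver (netconsole appears in the module list). A minimal sketch, assuming exactly that prefix layout, of stripping the prefix so the lines look like plain `dmesg` output before they are piped to `scripts/decode_stacktrace.sh` as in the documentation excerpt above. The function name `strip_syslog_prefix` is hypothetical, and whether the decode script needs this step is not established by the thread.

```python
import re
import sys

# Matches a prefix such as "Sep 15 04:32:29 205.254.184.12 " when it is
# directly followed by the kernel timestamp bracket "[".
SYSLOG_PREFIX = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\S+\s+(?=\[)"
)


def strip_syslog_prefix(line):
    """Drop the syslog date/host prefix; leave other lines unchanged."""
    return SYSLOG_PREFIX.sub("", line, count=1)


if __name__ == "__main__":
    # Filter usage: read the captured log on stdin, write dmesg-style lines.
    for raw in sys.stdin:
        sys.stdout.write(strip_syslog_prefix(raw))
```

The output of this filter could then be fed to the decode script together with the `vmlinux` that matches the running kernel.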
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-15 6:45 ` Eric Dumazet
@ 2023-09-15 22:23 ` Martin Zaharinov
  2023-11-16 14:17 ` Martin Zaharinov
  1 sibling, 0 replies; 35+ messages in thread
From: Martin Zaharinov @ 2023-09-15 22:23 UTC
To: Eric Dumazet
Cc: netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern

Hi Eric

I ran the decode script, but I think the function names are missing. Check the log:

  15: eb cf jmp 0xffffffffffffffe6
  17: 48 c7 c7 68 f6 e2 8e mov $0xffffffff8ee2f668,%rdi
  1e: c6 05 ac ae e6 00 01 movb $0x1,0xe6aeac(%rip) # 0xe6aed1
  25: e8 11 71 c7 ff call 0xffffffffffc7713b
  2a:* 0f 0b ud2 <-- trapping instruction
  2c: eb df jmp 0xd
  2e: cc int3
  2f: cc int3
  30: cc int3
  31: cc int3
  32: cc int3
  33: cc int3
  34: cc int3
  35: cc int3
  36: cc int3
  37: cc int3
  38: cc int3
  39: cc int3
  3a: cc int3
  3b: 48 89 fa mov %rdi,%rdx
  3e: 83 .byte 0x83
  3f: e2 .byte 0xe2

Code starting with the faulting instruction
===========================================
   0: 0f 0b ud2
   2: eb df jmp 0xffffffffffffffe3
   4: cc int3
   5: cc int3
   6: cc int3
   7: cc int3
   8: cc int3
   9: cc int3
   a: cc int3
   b: cc int3
   c: cc int3
   d: cc int3
   e: cc int3
   f: cc int3
  10: cc int3
  11: 48 89 fa mov %rdi,%rdx
  14: 83 .byte 0x83
  15: e2 .byte 0xe2
[40915.531389] RSP: 0018:ffffa62680318de8 EFLAGS: 00010296
[40915.531487] RAX: 0000000000000019 RBX: ffff982f02950c40 RCX: 00000000fffbffff
[40915.531605] RDX: 00000000fffbffff RSI: 0000000000000001 RDI: 00000000ffffffea
[40915.531721] RBP: ffff982e467d2000 R08: 0000000000000000 R09: 00000000fffbffff
[40915.531839] R10: ffff98359d600000 R11: 0000000000000003 R12: ffff982f044e16c0
[40915.531956] R13: 0000000000000000 R14: 0000000000000258 R15: ffffa62680318f60
[40915.532075] FS: 0000000000000000(0000) GS:ffff98359fbc0000(0000) knlGS:0000000000000000
[40915.532195] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40915.532291] CR2: 00005593eb3ff078 CR3: 0000000179f6e001 CR4: 00000000003706e0
[40915.532409] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[40915.532526] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[40915.532645] Call Trace:
[40915.532736] <IRQ>
[40915.532824] ? __warn (??:?)
[40915.532918] ? report_bug (??:?)
[40915.533011] ? handle_bug (traps.c:?)
[40915.533104] ? exc_invalid_op (??:?)
[40915.533198] ? asm_exc_invalid_op (??:?)
[40915.533294] ? rcuref_put_slowpath (??:?)
[40915.533389] ? rcuref_put_slowpath (??:?)
[40915.533482] dst_release (??:?)
[40915.533576] __dev_queue_xmit (??:?)
[40915.533671] ? eth_header (??:?)
[40915.533766] ip_finish_output2 (ip_output.c:?)
[40915.533863] process_backlog (dev.c:?)
[40915.533958] __napi_poll (dev.c:?)
[40915.534050] net_rx_action (dev.c:?)
[40915.534140] __do_softirq (??:?)
[40915.534233] do_softirq (??:?)
[40915.534326] </IRQ>
[40915.534413] <TASK>
[40915.534503] flush_smp_call_function_queue (??:?)
[40915.534597] do_idle (build_policy.c:?)
[40915.534687] cpu_startup_entry (??:?)
[40915.534778] start_secondary (smpboot.c:?)
[40915.534871] secondary_startup_64_no_verify (??:?)
[40915.534968] </TASK>
[40915.535057] ---[ end trace 0000000000000000 ]---

To me, the problem may be in this part:

[40915.533863] process_backlog (dev.c:?)
[40915.533958] __napi_poll (dev.c:?)
[40915.534050] net_rx_action (dev.c:?)

This started after upgrading to kernel 6.3.x; with 6.2.x I don't have this problem.

m.

> On 15 Sep 2023, at 9:45, Eric Dumazet <edumazet@google.com> wrote:
>
> [...]
>
> Your reports are not usable. Please make sure to include symbols next time.
>
> Please read these parts (and possibly complete files)
>
> Documentation/admin-guide/bug-hunting.rst:55:quality of the stack
> trace by using file:`scripts/decode_stacktrace.sh`.
>
> Documentation/admin-guide/reporting-issues.rst:978:
> [user@something ~]$ sudo dmesg |
> ./linux-5.10.5/scripts/decode_stacktrace.sh ./linux-5.10.5/vmlinux
> Documentation/admin-guide/reporting-issues.rst:985:
> [user@something ~]$ sudo dmesg |
> ./linux-5.10.5/scripts/decode_stacktrace.sh \
>
> [...]
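In the decoded trace above, the function names are present but most source locations read `(??:?)`. A minimal sketch of a checker that separates frames with a resolved `file:line` from frames without one, which makes it easy to see whether a decode run actually found debug information. The regular expression and the name `classify_frames` are assumptions made for this illustration; a common cause of `??:?` is a `vmlinux` built without debug info, but the thread does not confirm that for this system.

```python
import re

# "[timestamp] [? ]function (location...)" as printed by the decode script.
FRAME_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+(\?\s+)?(\S+)\s+\((.*)\)")


def classify_frames(lines):
    """Split decoded call-trace frames into (resolved, unresolved) function names.

    A frame counts as unresolved when its location still contains '?',
    for example "(??:?)" or "(dev.c:?)".
    """
    resolved, unresolved = [], []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # register dumps, "Call Trace:", <IRQ> markers, ...
        func, location = match.group(2), match.group(3)
        if "?" in location:
            unresolved.append(func)
        else:
            resolved.append(func)
    return resolved, unresolved


if __name__ == "__main__":
    sample = [
        "[40915.532645] Call Trace:",
        "[40915.533011] ? handle_bug (traps.c:?)",
        "[40915.533482] dst_release (??:?)",
        "[40915.533863] process_backlog (dev.c:?)",
    ]
    print(classify_frames(sample))
```

If every frame lands in the unresolved list, as it would for the trace in this message, the decode step has not yet produced the file and line information that was requested.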
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-15 6:45 ` Eric Dumazet
  2023-09-15 22:23 ` Martin Zaharinov
@ 2023-11-16 14:17 ` Martin Zaharinov
  2023-12-06 22:26 ` Martin Zaharinov
  1 sibling, 1 reply; 35+ messages in thread
From: Martin Zaharinov @ 2023-11-16 14:17 UTC
To: Eric Dumazet
Cc: netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern

Hi All

I am reporting the same problem with kernel 6.6.1. I think the problem is in RCU, so if possible please add people from RCU here.

See the report:

[141229.505339] ------------[ cut here ]------------
[141229.505492] rcuref - imbalanced put()
[141229.505504] WARNING: CPU: 8 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[141229.505821] Modules linked in: xsk_diag unix_diag iptable_filter xt_TCPMSS iptable_mangle xt_addrtype xt_nat xt_MASQUERADE iptable_nat ip_tables netconsole coretemp e1000 ixgbe mdio pppoe pppox sha1_ssse3 sha1_generic ppp_mppe libarc4 ppp_generic slhc nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
[141229.506349] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G O 6.6.1 #1
[141229.506527] Hardware name: Persy Super Server/X11DDW-L, BIOS 4.0 07/11/2023
[141229.506701] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[141229.506843] Code: 31 c0 eb e2 80 3d ef 4e e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 07 99 e3 97 c6 05 d5 4e e6 00 01 e8 d1 1f c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2
All code
========
   0: 31 c0 xor %eax,%eax
   2: eb e2 jmp 0xffffffffffffffe6
   4: 80 3d ef 4e e6 00 00 cmpb $0x0,0xe64eef(%rip) # 0xe64efa
   b: 74 0a je 0x17
   d: c7 03 00 00 00 e0 movl $0xe0000000,(%rbx)
  13: 31 c0 xor %eax,%eax
  15: eb cf jmp 0xffffffffffffffe6
  17: 48 c7 c7 07 99 e3 97 mov $0xffffffff97e39907,%rdi
  1e: c6 05 d5 4e e6 00 01 movb $0x1,0xe64ed5(%rip) # 0xe64efa
  25: e8 d1 1f c7 ff call 0xffffffffffc71ffb
  2a:* 0f 0b ud2 <-- trapping instruction
  2c: eb df jmp 0xd
  2e: cc int3
  2f: cc int3
  30: cc int3
  31: cc int3
  32: cc int3
  33: cc int3
  34: cc int3
  35: cc int3
  36: cc int3
  37: cc int3
  38: cc int3
  39: cc int3
  3a: cc int3
  3b: 48 89 fa mov %rdi,%rdx
  3e: 83 .byte 0x83
  3f: e2 .byte 0xe2

Code starting with the faulting instruction
===========================================
   0: 0f 0b ud2
   2: eb df jmp 0xffffffffffffffe3
   4: cc int3
   5: cc int3
   6: cc int3
   7: cc int3
   8: cc int3
   9: cc int3
   a: cc int3
   b: cc int3
   c: cc int3
   d: cc int3
   e: cc int3
   f: cc int3
  10: cc int3
  11: 48 89 fa mov %rdi,%rdx
  14: 83 .byte 0x83
  15: e2 .byte 0xe2
[141229.507086] RSP: 0018:ffffa444449e0978 EFLAGS: 00010296
[141229.507229] RAX: 0000000000000019 RBX: ffff9b54866a4100 RCX: 00000000fff7ffff
[141229.507404] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea
[141229.507577] RBP: ffff9b53e57b1ec0 R08: 0000000000000000 R09: 00000000fff7ffff
[141229.507751] R10: ffff9b62db200000 R11: 0000000000000003 R12: ffff9b5b0595c000
[141229.507929] R13: ffff9b5b09c32200 R14: ffff9b5b09e29a00 R15: ffff9b5b0557e080
[141229.508101] FS: 0000000000000000(0000) GS:ffff9b62dfa00000(0000) knlGS:0000000000000000
[141229.508279] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[141229.508425] CR2: 00007fbadced6a80 CR3: 000000096f014002 CR4: 00000000003706e0
[141229.508599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[141229.508773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[141229.508947] Call Trace:
[141229.509079] <IRQ>
[141229.509206] ? __warn (kernel/panic.c:235 kernel/panic.c:673)
[141229.509342] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[141229.509482] ? handle_bug (arch/x86/kernel/traps.c:237)
[141229.509617] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1))
[141229.509751] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[141229.509892] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[141229.510028] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[141229.510164] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166)
[141229.510302] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4324)
[141229.510441] vlan_dev_hard_start_xmit (net/8021q/vlan_dev.c:130)
[141229.510584] dev_hard_start_xmit (./include/linux/netdevice.h:4904 net/core/dev.c:3573 net/core/dev.c:3589)
[141229.510722] __dev_queue_xmit (./include/linux/netdevice.h:3278 (discriminator 25) net/core/dev.c:4370 (discriminator 25))
[141229.510862] ? eth_header (net/ethernet/eth.c:85)
[141229.510998] ip_finish_output2 (./include/net/neighbour.h:542 (discriminator 2) net/ipv4/ip_output.c:233 (discriminator 2))
[141229.511135] ip_sabotage_in (net/bridge/br_netfilter_hooks.c:881 net/bridge/br_netfilter_hooks.c:866)
[141229.511269] nf_hook_slow (./include/linux/netfilter.h:144 net/netfilter/core.c:626)
[141229.511406] ip_rcv (./include/linux/netfilter.h:259 ./include/linux/netfilter.h:302 net/ipv4/ip_input.c:569)
[141229.511540] ? ip_rcv_core.constprop.0 (net/ipv4/ip_input.c:436)
[141229.511678] netif_receive_skb (net/core/dev.c:5552 net/core/dev.c:5666 net/core/dev.c:5752 net/core/dev.c:5811)
[141229.511814] br_handle_frame_finish (net/bridge/br_input.c:216)
[141229.511954] ? br_pass_frame_up (net/bridge/br_input.c:75)
[141229.512092] br_nf_hook_thresh (net/bridge/br_netfilter_hooks.c:1051)
[141229.512227] ? br_pass_frame_up (net/bridge/br_input.c:75)
[141229.512363] br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:427)
[141229.512501] ? br_pass_frame_up (net/bridge/br_input.c:75)
[141229.512644] ? nf_nat_ipv4_pre_routing (net/netfilter/nf_nat_proto.c:656) nf_nat
[141229.512792] br_nf_pre_routing (net/bridge/br_netfilter_hooks.c:538)
[141229.512928] ? br_nf_hook_thresh (net/bridge/br_netfilter_hooks.c:354)
[141229.513061] br_handle_frame (./include/linux/netfilter.h:144 net/bridge/br_input.c:272 net/bridge/br_input.c:417)
[141229.513196] ? br_pass_frame_up (net/bridge/br_input.c:75)
[141229.513333] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5446 (discriminator 1))
[141229.513475] ? ip_finish_output2 (net/ipv4/ip_output.c:243)
[141229.513613] process_backlog (net/core/dev.c:5551 net/core/dev.c:5666 net/core/dev.c:5994)
[141229.513749] __napi_poll (net/core/dev.c:6556)
[141229.513887] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756)
[141229.514023] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564)
[141229.514158] do_softirq (kernel/softirq.c:463 (discriminator 32) kernel/softirq.c:450 (discriminator 32))
[141229.514292] </IRQ>
[141229.514420] <TASK>
[141229.514548] flush_smp_call_function_queue (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) kernel/smp.c:579 (discriminator 1))
[141229.514688] do_idle (kernel/sched/idle.c:314)
[141229.514822] cpu_startup_entry (kernel/sched/idle.c:379)
[141229.516148] start_secondary (arch/x86/kernel/smpboot.c:326)
[141229.516291] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
[141229.516435] </TASK>
[141229.516562] ---[ end trace 0000000000000000 ]---

Best regards,
Martin

> On 15 Sep 2023, at 9:45, Eric Dumazet <edumazet@google.com> wrote:
>
> scripts/decode_stacktrace.sh
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-11-16 14:17 ` Martin Zaharinov @ 2023-12-06 22:26 ` Martin Zaharinov [not found] ` <5E63894D-913B-416C-B901-F628BB6C00E0@gmail.com> 0 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-12-06 22:26 UTC (permalink / raw) To: Eric Dumazet Cc: netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern Hi all, it is strange: the same problem still occurs on 6.6.4, with the same debug log, on different hardware, with a different number of users, and …. The debug log again points to lib/rcuref.c; the code at that line is: /* * If the reference count was already in the dead zone, then this * put() operation is imbalanced. Warn, put the reference count back to * DEAD and tell the caller to not deconstruct the object. */ if (WARN_ONCE(cnt >= RCUREF_RELEASED, "rcuref - imbalanced put()")) { atomic_set(&ref->refcnt, RCUREF_DEAD); return false; } [529520.875413] CPU: 13 PID: 0 Comm: swapper/13 Tainted: G O 6.6.3 #1 [529520.875533] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 [529520.875653] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 [529520.875748] Code: 31 c0 eb e2 80 3d 9e d1 e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 d9 96 e3 8f c6 05 84 d1 e6 00 01 e8 41 9d c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 [529520.875908] RSP: 0018:ffffa823c052cde8 EFLAGS: 00010296 [529520.876003] RAX: 0000000000000019 RBX: ffffa0f049053180 RCX: 00000000fff7ffff [529520.876122] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea [529520.876244] RBP: ffffa0f0a8fffec0 R08: 0000000000000000 R09: 00000000fff7ffff [529520.876364] R10: ffffa0f79ae00000 R11: 0000000000000003 R12: ffffa0f04655f000 [529520.876482] R13: 0000000000000258 R14: ffffa0f16ade1000 R15: ffffa0f79f964bd0 [529520.876601] FS: 0000000000000000(0000) GS:ffffa0f79f940000(0000) knlGS:0000000000000000 [529520.876723] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [529520.876822] CR2: 00007fa9bd56b3c8 CR3: 
000000016e43e002 CR4: 00000000003706e0 [529520.877043] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [529520.877164] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [529520.877287] Call Trace: [529520.877382] <IRQ> [529520.877472] ? __warn+0x6c/0x130 [529520.877566] ? report_bug+0x1b8/0x200 [529520.877661] ? handle_bug+0x36/0x70 [529520.877753] ? exc_invalid_op+0x17/0x1a0 [529520.877849] ? asm_exc_invalid_op+0x16/0x20 [529520.877947] ? rcuref_put_slowpath+0x5f/0x70 [529520.878043] ? rcuref_put_slowpath+0x5f/0x70 [529520.878136] dst_release+0x1c/0x40 [529520.878229] __dev_queue_xmit+0x594/0xcd0 [529520.878324] ? eth_header+0x25/0xc0 [529520.878417] ip_finish_output2+0x1a0/0x530 [529520.878514] process_backlog+0x107/0x210 [529520.878610] __napi_poll+0x20/0x180 [529520.878702] net_rx_action+0x29f/0x380 [529520.878935] __do_softirq+0xd0/0x202 [529520.879033] do_softirq+0x3a/0x50 [529520.879127] </IRQ> [529520.879217] <TASK> [529520.879306] flush_smp_call_function_queue+0x3f/0x50 [529520.879407] do_idle+0x14d/0x210 [529520.879500] cpu_startup_entry+0x21/0x30 [529520.879597] start_secondary+0xe1/0xf0 [529520.879693] secondary_startup_64_no_verify+0x166/0x16b [529520.879793] </TASK> [529520.879884] ---[ end trace 0000000000000000 ]— m. > On 16 Nov 2023, at 16:17, Martin Zaharinov <micron10@gmail.com> wrote: > > Hi All > > report same problem with kernel 6.6.1 - i think problem is in rcu but … if have options to add people from RCU here. 
> > See report : > > > > [141229.505339] ------------[ cut here ]------------ > [141229.505492] rcuref - imbalanced put() > [141229.505504] WARNING: CPU: 8 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [141229.505821] Modules linked in: xsk_diag unix_diag iptable_filter xt_TCPMSS iptable_mangle xt_addrtype xt_nat xt_MASQUERADE iptable_nat ip_tables netconsole coretemp e1000 ixgbe mdio pppoe pppox sha1_ssse3 sha1_generic ppp_mppe libarc4 ppp_generic slhc nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 > [141229.506349] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G O 6.6.1 #1 > [141229.506527] Hardware name: Persy Super Server/X11DDW-L, BIOS 4.0 07/11/2023 > [141229.506701] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [141229.506843] Code: 31 c0 eb e2 80 3d ef 4e e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 07 99 e3 97 c6 05 d5 4e e6 00 01 e8 d1 1f c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 > All code > ======== > 0: 31 c0 xor %eax,%eax > 2: eb e2 jmp 0xffffffffffffffe6 > 4: 80 3d ef 4e e6 00 00 cmpb $0x0,0xe64eef(%rip) # 0xe64efa > b: 74 0a je 0x17 > d: c7 03 00 00 00 e0 movl $0xe0000000,(%rbx) > 13: 31 c0 xor %eax,%eax > 15: eb cf jmp 0xffffffffffffffe6 > 17: 48 c7 c7 07 99 e3 97 mov $0xffffffff97e39907,%rdi > 1e: c6 05 d5 4e e6 00 01 movb $0x1,0xe64ed5(%rip) # 0xe64efa > 25: e8 d1 1f c7 ff call 0xffffffffffc71ffb > 2a:* 0f 0b ud2 <-- trapping instruction > 2c: eb df jmp 0xd > 2e: cc int3 > 2f: cc int3 > 30: cc int3 > 31: cc int3 > 32: cc int3 > 33: cc int3 > 34: cc int3 > 35: cc int3 > 36: cc int3 > 37: cc int3 > 38: cc int3 > 39: cc int3 > 3a: cc int3 > 3b: 48 89 fa mov %rdi,%rdx > 3e: 83 .byte 0x83 > 3f: e2 .byte 0xe2 > > Code starting with the faulting instruction > =========================================== > 0: 0f 0b ud2 > 2: eb df jmp 
0xffffffffffffffe3 > 4: cc int3 > 5: cc int3 > 6: cc int3 > 7: cc int3 > 8: cc int3 > 9: cc int3 > a: cc int3 > b: cc int3 > c: cc int3 > d: cc int3 > e: cc int3 > f: cc int3 > 10: cc int3 > 11: 48 89 fa mov %rdi,%rdx > 14: 83 .byte 0x83 > 15: e2 .byte 0xe2 > [141229.507086] RSP: 0018:ffffa444449e0978 EFLAGS: 00010296 > [141229.507229] RAX: 0000000000000019 RBX: ffff9b54866a4100 RCX: 00000000fff7ffff > [141229.507404] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea > [141229.507577] RBP: ffff9b53e57b1ec0 R08: 0000000000000000 R09: 00000000fff7ffff > [141229.507751] R10: ffff9b62db200000 R11: 0000000000000003 R12: ffff9b5b0595c000 > [141229.507929] R13: ffff9b5b09c32200 R14: ffff9b5b09e29a00 R15: ffff9b5b0557e080 > [141229.508101] FS: 0000000000000000(0000) GS:ffff9b62dfa00000(0000) knlGS:0000000000000000 > [141229.508279] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [141229.508425] CR2: 00007fbadced6a80 CR3: 000000096f014002 CR4: 00000000003706e0 > [141229.508599] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [141229.508773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [141229.508947] Call Trace: > [141229.509079] <IRQ> > [141229.509206] ? __warn (kernel/panic.c:235 kernel/panic.c:673) > [141229.509342] ? report_bug (lib/bug.c:180 lib/bug.c:219) > [141229.509482] ? handle_bug (arch/x86/kernel/traps.c:237) > [141229.509617] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) > [141229.509751] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) > [141229.509892] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [141229.510028] ? 
rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [141229.510164] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166) > [141229.510302] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4324) > [141229.510441] vlan_dev_hard_start_xmit (net/8021q/vlan_dev.c:130) > [141229.510584] dev_hard_start_xmit (./include/linux/netdevice.h:4904 net/core/dev.c:3573 net/core/dev.c:3589) > [141229.510722] __dev_queue_xmit (./include/linux/netdevice.h:3278 (discriminator 25) net/core/dev.c:4370 (discriminator 25)) > [141229.510862] ? eth_header (net/ethernet/eth.c:85) > [141229.510998] ip_finish_output2 (./include/net/neighbour.h:542 (discriminator 2) net/ipv4/ip_output.c:233 (discriminator 2)) > [141229.511135] ip_sabotage_in (net/bridge/br_netfilter_hooks.c:881 net/bridge/br_netfilter_hooks.c:866) > [141229.511269] nf_hook_slow (./include/linux/netfilter.h:144 net/netfilter/core.c:626) > [141229.511406] ip_rcv (./include/linux/netfilter.h:259 ./include/linux/netfilter.h:302 net/ipv4/ip_input.c:569) > [141229.511540] ? ip_rcv_core.constprop.0 (net/ipv4/ip_input.c:436) > [141229.511678] netif_receive_skb (net/core/dev.c:5552 net/core/dev.c:5666 net/core/dev.c:5752 net/core/dev.c:5811) > [141229.511814] br_handle_frame_finish (net/bridge/br_input.c:216) > [141229.511954] ? br_pass_frame_up (net/bridge/br_input.c:75) > [141229.512092] br_nf_hook_thresh (net/bridge/br_netfilter_hooks.c:1051) > [141229.512227] ? br_pass_frame_up (net/bridge/br_input.c:75) > [141229.512363] br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:427) > [141229.512501] ? br_pass_frame_up (net/bridge/br_input.c:75) > [141229.512644] ? nf_nat_ipv4_pre_routing (net/netfilter/nf_nat_proto.c:656) nf_nat > [141229.512792] br_nf_pre_routing (net/bridge/br_netfilter_hooks.c:538) > [141229.512928] ? 
br_nf_hook_thresh (net/bridge/br_netfilter_hooks.c:354) > [141229.513061] br_handle_frame (./include/linux/netfilter.h:144 net/bridge/br_input.c:272 net/bridge/br_input.c:417) > [141229.513196] ? br_pass_frame_up (net/bridge/br_input.c:75) > [141229.513333] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5446 (discriminator 1)) > [141229.513475] ? ip_finish_output2 (net/ipv4/ip_output.c:243) > [141229.513613] process_backlog (net/core/dev.c:5551 net/core/dev.c:5666 net/core/dev.c:5994) > [141229.513749] __napi_poll (net/core/dev.c:6556) > [141229.513887] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756) > [141229.514023] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) > [141229.514158] do_softirq (kernel/softirq.c:463 (discriminator 32) kernel/softirq.c:450 (discriminator 32)) > [141229.514292] </IRQ> > [141229.514420] <TASK> > [141229.514548] flush_smp_call_function_queue (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) kernel/smp.c:579 (discriminator 1)) > [141229.514688] do_idle (kernel/sched/idle.c:314) > [141229.514822] cpu_startup_entry (kernel/sched/idle.c:379) > [141229.516148] start_secondary (arch/x86/kernel/smpboot.c:326) > [141229.516291] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) > [141229.516435] </TASK> > [141229.516562] ---[ end trace 0000000000000000 ]— > > > Best regards, > Martin > > > >> On 15 Sep 2023, at 9:45, Eric Dumazet <edumazet@google.com> wrote: >> >> scripts/decode_stacktrace.sh > > ^ permalink raw reply [flat|nested] 35+ messages in thread
[parent not found: <5E63894D-913B-416C-B901-F628BB6C00E0@gmail.com>]
* Re: Urgent Bug Report Kernel crash 6.5.2 [not found] ` <5E63894D-913B-416C-B901-F628BB6C00E0@gmail.com> @ 2023-12-08 22:20 ` Thomas Gleixner 2023-12-08 23:01 ` Martin Zaharinov 0 siblings, 1 reply; 35+ messages in thread From: Thomas Gleixner @ 2023-12-08 22:20 UTC (permalink / raw) To: Martin Zaharinov, peterz Cc: netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet On Thu, Dec 07 2023 at 00:38, Martin Zaharinov wrote: >> On 7 Dec 2023, at 0:26, Martin Zaharinov <micron10@gmail.com> wrote: >> >> in this line is : >> >> >> /* >> * If the reference count was already in the dead zone, then this >> * put() operation is imbalanced. Warn, put the reference count back to >> * DEAD and tell the caller to not deconstruct the object. >> */ >> if (WARN_ONCE(cnt >= RCUREF_RELEASED, "rcuref - imbalanced put()")) { >> atomic_set(&ref->refcnt, RCUREF_DEAD); >> return false; >> } So a rcuref_put() operation triggers the warning because the reference count is already dead, which means the rcuref_put() operation is imbalanced. >> [529520.875413] CPU: 13 PID: 0 Comm: swapper/13 Tainted: G O 6.6.3 #1 Can you reproduce this without the Out of Tree module? >> [529520.875653] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 >> [529520.878136] dst_release+0x1c/0x40 >> [529520.878229] __dev_queue_xmit+0x594/0xcd0 >> [529520.878324] ? eth_header+0x25/0xc0 >> [529520.878417] ip_finish_output2+0x1a0/0x530 >> [529520.878514] process_backlog+0x107/0x210 >> [529520.878610] __napi_poll+0x20/0x180 >> [529520.878702] net_rx_action+0x29f/0x380 >> [529520.878935] __do_softirq+0xd0/0x202 >> [529520.879033] do_softirq+0x3a/0x50 So this is one call chain triggering the issue... >>> report same problem with kernel 6.6.1 - i think problem is in rcu >>> but … if have options to add people from RCU here. That's definitely not a RCU problem. It's a simple refcount fail. Thanks, tglx ^ permalink raw reply [flat|nested] 35+ messages in thread
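The imbalanced put() described above can be modelled in userspace. What follows is only a minimal sketch, assuming C11 atomics and a 32-bit unsigned int; the `model_*` names are hypothetical and are not kernel API. The zone constants are assumed to mirror the kernel's rcuref layout. The sketch simplifies the real `rcuref_put_slowpath()`: it reuses the decremented value instead of re-reading the counter, and it counts imbalanced puts instead of calling WARN_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Zone boundaries, assumed to match the kernel's rcuref layout. */
#define MODEL_MAXREF    0x7FFFFFFFu	/* highest valid count */
#define MODEL_SATURATED 0xA0000000u	/* middle of the saturation zone */
#define MODEL_RELEASED  0xC0000000u	/* start of the dead zone */
#define MODEL_DEAD      0xE0000000u	/* middle of the dead zone */
#define MODEL_NOREF     0xFFFFFFFFu	/* last reference just dropped */

/* Hypothetical userspace stand-in for rcuref_t. */
struct model_ref {
	atomic_uint cnt;	/* stores (number of references - 1) */
	int imbalanced_puts;	/* replaces WARN_ONCE() in this model */
};

void model_init(struct model_ref *r, unsigned int refs)
{
	assert(refs >= 1);
	atomic_init(&r->cnt, refs - 1u);
	r->imbalanced_puts = 0;
}

/* Returns true only when the caller dropped the last reference. */
bool model_put(struct model_ref *r)
{
	/* Unconditional decrement; cnt is the value after the decrement. */
	unsigned int cnt = atomic_fetch_sub(&r->cnt, 1u) - 1u;

	/* Fast path: the count is still in the valid zone. */
	if (cnt <= MODEL_MAXREF)
		return false;

	/* Last reference dropped: mark the object dead. */
	if (cnt == MODEL_NOREF) {
		unsigned int expected = MODEL_NOREF;

		return atomic_compare_exchange_strong(&r->cnt, &expected,
						      MODEL_DEAD);
	}

	/* Already in the dead zone: this put() is imbalanced. */
	if (cnt >= MODEL_RELEASED) {
		r->imbalanced_puts++;
		atomic_store(&r->cnt, MODEL_DEAD);
		return false;
	}

	/* Saturation zone: pin the counter, never free the object. */
	atomic_store(&r->cnt, MODEL_SATURATED);
	return false;
}
```

Starting from a single reference, the first model_put() returns true, so the caller may free the object. A second model_put() on the same object lands in the dead zone and increments imbalanced_puts, which corresponds to the warning in the report.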
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-12-08 22:20 ` Thomas Gleixner @ 2023-12-08 23:01 ` Martin Zaharinov 2023-12-12 18:16 ` Thomas Gleixner 0 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-12-08 23:01 UTC (permalink / raw) To: Thomas Gleixner Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet Hi Thomas, > On 9 Dec 2023, at 0:20, Thomas Gleixner <tglx@linutronix.de> wrote: > > On Thu, Dec 07 2023 at 00:38, Martin Zaharinov wrote: >>> On 7 Dec 2023, at 0:26, Martin Zaharinov <micron10@gmail.com> wrote: >>> >>> in this line is : >>> >>> >>> /* >>> * If the reference count was already in the dead zone, then this >>> * put() operation is imbalanced. Warn, put the reference count back to >>> * DEAD and tell the caller to not deconstruct the object. >>> */ >>> if (WARN_ONCE(cnt >= RCUREF_RELEASED, "rcuref - imbalanced put()")) { >>> atomic_set(&ref->refcnt, RCUREF_DEAD); >>> return false; >>> } > > So a rcuref_put() operation triggers the warning because the reference > count is already dead, which means the rcuref_put() operation is > imbalanced. > >>> [529520.875413] CPU: 13 PID: 0 Comm: swapper/13 Tainted: G O 6.6.3 #1 > > Can you reproduce this without the Out of Tree module? I get the same error without out-of-tree modules. I have tried many times, from kernel 6.5.x up to now. > >>> [529520.875653] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 >>> [529520.878136] dst_release+0x1c/0x40 >>> [529520.878229] __dev_queue_xmit+0x594/0xcd0 >>> [529520.878324] ? eth_header+0x25/0xc0 >>> [529520.878417] ip_finish_output2+0x1a0/0x530 >>> [529520.878514] process_backlog+0x107/0x210 >>> [529520.878610] __napi_poll+0x20/0x180 >>> [529520.878702] net_rx_action+0x29f/0x380 >>> [529520.878935] __do_softirq+0xd0/0x202 >>> [529520.879033] do_softirq+0x3a/0x50 > > So this is one call chain triggering the issue... 
> >>>> report same problem with kernel 6.6.1 - i think problem is in rcu >>>> but … if have options to add people from RCU here. > > That's definitely not a RCU problem. It's a simple refcount fail. > > Thanks, > > tglx > Is this a real problem or only a harmless failure? And is it possible to catch what the problem is and fix it? m. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-12-08 23:01 ` Martin Zaharinov @ 2023-12-12 18:16 ` Thomas Gleixner 2023-12-19 9:25 ` Martin Zaharinov 0 siblings, 1 reply; 35+ messages in thread From: Thomas Gleixner @ 2023-12-12 18:16 UTC (permalink / raw) To: Martin Zaharinov Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet Martin! On Sat, Dec 09 2023 at 01:01, Martin Zaharinov wrote: >> On 9 Dec 2023, at 0:20, Thomas Gleixner <tglx@linutronix.de> wrote: >> That's definitely not a RCU problem. It's a simple refcount fail. >> > Is this a problem or only simple fail , and is it possible to catch > what is a problem and fix this fail. Underaccounting a reference count is potentially Use After Free. if (rcuref_put(ref)) call_rcu(ref....); So after the grace period is over @ref will be freed. Depending on the timing the context which does the extra put() might already operate on a freed object. How to catch that, that's a good question. There is no instrumentation so far for this. Below is a straight forward trace_printk() based tracking of rcurefs, which should help to narrow down the context. Btw, how easy is this to reproduce? Thanks, tglx --- --- a/include/linux/rcuref.h +++ b/include/linux/rcuref.h @@ -64,8 +64,10 @@ static inline __must_check bool rcuref_g * Unconditionally increase the reference count. The saturation and * dead zones provide enough tolerance for this. */ - if (likely(!atomic_add_negative_relaxed(1, &ref->refcnt))) + if (likely(!atomic_add_negative_relaxed(1, &ref->refcnt))) { + trace_printk("get(FASTPATH): %px\n", ref); return true; + } /* Handle the cases inside the saturation and dead zones */ return rcuref_get_slowpath(ref); @@ -84,8 +86,10 @@ static __always_inline __must_check bool * Unconditionally decrease the reference count. The saturation and * dead zones provide enough tolerance for this. 
*/ - if (likely(!atomic_add_negative_release(-1, &ref->refcnt))) + if (likely(!atomic_add_negative_release(-1, &ref->refcnt))) { + trace_printk("put(FASTPATH): %px\n", ref); return false; + } /* * Handle the last reference drop and cases inside the saturation --- a/lib/rcuref.c +++ b/lib/rcuref.c @@ -200,6 +200,7 @@ bool rcuref_get_slowpath(rcuref_t *ref) */ if (cnt >= RCUREF_RELEASED) { atomic_set(&ref->refcnt, RCUREF_DEAD); + trace_printk("get(DEAD): %px %pS\n", ref, __builtin_return_address(0)); return false; } @@ -211,8 +212,15 @@ bool rcuref_get_slowpath(rcuref_t *ref) * object memory, but prevents the obvious reference count overflow * damage. */ - if (WARN_ONCE(cnt > RCUREF_MAXREF, "rcuref saturated - leaking memory")) + if (cnt > RCUREF_MAXREF) { + trace_printk("get(SATURATED): %px %pS\n", ref, __builtin_return_address(0)); + WARN_ONCE(1, "rcuref saturated - leaking memory"); atomic_set(&ref->refcnt, RCUREF_SATURATED); + } else { + trace_printk("get(UNDEFINED): %px %pS\n", ref, __builtin_return_address(0)); + WARN_ON_ONCE(1); + } + return true; } EXPORT_SYMBOL_GPL(rcuref_get_slowpath); @@ -248,9 +256,12 @@ bool rcuref_put_slowpath(rcuref_t *ref) * require a retry. If this fails the caller is not * allowed to deconstruct the object. */ - if (!atomic_try_cmpxchg_release(&ref->refcnt, &cnt, RCUREF_DEAD)) + if (!atomic_try_cmpxchg_release(&ref->refcnt, &cnt, RCUREF_DEAD)) { + trace_printk("put(NOTDEAD): %px %pS\n", ref, __builtin_return_address(0)); return false; + } + trace_printk("put(NOWDEAD): %px %pS\n", ref, __builtin_return_address(0)); /* * The caller can safely schedule the object for * deconstruction. Provide acquire ordering. @@ -264,7 +275,9 @@ bool rcuref_put_slowpath(rcuref_t *ref) * put() operation is imbalanced. Warn, put the reference count back to * DEAD and tell the caller to not deconstruct the object. 
*/ - if (WARN_ONCE(cnt >= RCUREF_RELEASED, "rcuref - imbalanced put()")) { + if (cnt >= RCUREF_RELEASED) { + trace_printk("put(WASDEAD): %px %pS\n", ref, __builtin_return_address(0)); + WARN_ONCE(1, "rcuref - imbalanced put()"); atomic_set(&ref->refcnt, RCUREF_DEAD); return false; } @@ -274,8 +287,13 @@ bool rcuref_put_slowpath(rcuref_t *ref) * mean saturation value and tell the caller to not deconstruct the * object. */ - if (cnt > RCUREF_MAXREF) + if (cnt > RCUREF_MAXREF) { + trace_printk("put(SATURATED): %px %pS\n", ref, __builtin_return_address(0)); atomic_set(&ref->refcnt, RCUREF_SATURATED); + } else { + trace_printk("put(UNDEFINED): %px %pS\n", ref, __builtin_return_address(0)); + WARN_ON_ONCE(1); + } return false; } EXPORT_SYMBOL_GPL(rcuref_put_slowpath); ^ permalink raw reply [flat|nested] 35+ messages in thread
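Once the instrumented kernel has produced trace output, the get/put records for one object can be tallied offline. The following is a rough sketch, assuming only that each record contains the text emitted by the trace_printk() calls in the patch above, for example a `put(WASDEAD): <pointer>` fragment somewhere in the line. The struct and function names are hypothetical, and the prefix that ftrace puts in front of each record is deliberately not parsed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One decoded get(...)/put(...) record; names are illustrative only. */
struct trace_event {
	char op[4];		/* "get" or "put" */
	char tag[16];		/* e.g. "FASTPATH", "WASDEAD" */
	unsigned long long ref;	/* pointer value printed by %px */
};

/* Returns 1 if the line holds a get(...)/put(...) record, else 0. */
int parse_rcuref_line(const char *line, struct trace_event *ev)
{
	const char *get, *put, *rec;

	assert(line && ev);
	get = strstr(line, "get(");
	put = strstr(line, "put(");

	/* Use whichever marker appears first in the line. */
	rec = get;
	if (!rec || (put && put < rec))
		rec = put;
	if (!rec)
		return 0;

	/* Expected shape: op(TAG): hexpointer [symbol] */
	if (sscanf(rec, "%3[a-z](%15[A-Z]): %llx",
		   ev->op, ev->tag, &ev->ref) != 3)
		return 0;
	return 1;
}

/* Counts the records of one kind ("get" or "put") for a given object. */
size_t count_events(const char *const *lines, size_t n,
		    unsigned long long ref, const char *op)
{
	size_t hits = 0;

	for (size_t i = 0; i < n; i++) {
		struct trace_event ev;

		if (parse_rcuref_line(lines[i], &ev) &&
		    ev.ref == ref && strcmp(ev.op, op) == 0)
			hits++;
	}
	return hits;
}
```

Comparing the get and put counts for the pointer named in a `put(WASDEAD)` record may help narrow down which context issued the extra put(). The result is only indicative if the trace buffer has not wrapped since the object was initialized.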
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-12-12 18:16 ` Thomas Gleixner @ 2023-12-19 9:25 ` Martin Zaharinov 2023-12-19 14:26 ` Thomas Gleixner 0 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-12-19 9:25 UTC (permalink / raw) To: Thomas Gleixner Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet Hi Thomas, Thanks for your response! > On 12 Dec 2023, at 20:16, Thomas Gleixner <tglx@linutronix.de> wrote: > > Martin! > > On Sat, Dec 09 2023 at 01:01, Martin Zaharinov wrote: >>> On 9 Dec 2023, at 0:20, Thomas Gleixner <tglx@linutronix.de> wrote: >>> That's definitely not a RCU problem. It's a simple refcount fail. >>> >> Is this a problem or only simple fail , and is it possible to catch >> what is a problem and fix this fail. > > Underaccounting a reference count is potentially Use After Free. > > if (rcuref_put(ref)) > call_rcu(ref....); > > So after the grace period is over @ref will be freed. Depending on the > timing the context which does the extra put() might already operate on a > freed object. > > How to catch that, that's a good question. There is no instrumentation > so far for this. Below is a straight forward trace_printk() based > tracking of rcurefs, which should help to narrow down the context. > > Btw, how easy is this to reproduce? It is not easy: this report is generated on a machine with 5-6k users and live traffic. Sometimes it shows up after 1 day, other times after 4-5 days. > > Thanks, > > tglx > --- > --- a/include/linux/rcuref.h > +++ b/include/linux/rcuref.h > @@ -64,8 +64,10 @@ static inline __must_check bool rcuref_g > * Unconditionally increase the reference count. The saturation and > * dead zones provide enough tolerance for this. 
> */ > - if (likely(!atomic_add_negative_relaxed(1, &ref->refcnt))) > + if (likely(!atomic_add_negative_relaxed(1, &ref->refcnt))) { > + trace_printk("get(FASTPATH): %px\n", ref); > return true; > + } > > /* Handle the cases inside the saturation and dead zones */ > return rcuref_get_slowpath(ref); > @@ -84,8 +86,10 @@ static __always_inline __must_check bool > * Unconditionally decrease the reference count. The saturation and > * dead zones provide enough tolerance for this. > */ > - if (likely(!atomic_add_negative_release(-1, &ref->refcnt))) > + if (likely(!atomic_add_negative_release(-1, &ref->refcnt))) { > + trace_printk("put(FASTPATH): %px\n", ref); > return false; > + } > > /* > * Handle the last reference drop and cases inside the saturation > --- a/lib/rcuref.c > +++ b/lib/rcuref.c > @@ -200,6 +200,7 @@ bool rcuref_get_slowpath(rcuref_t *ref) > */ > if (cnt >= RCUREF_RELEASED) { > atomic_set(&ref->refcnt, RCUREF_DEAD); > + trace_printk("get(DEAD): %px %pS\n", ref, __builtin_return_address(0)); > return false; > } > > @@ -211,8 +212,15 @@ bool rcuref_get_slowpath(rcuref_t *ref) > * object memory, but prevents the obvious reference count overflow > * damage. > */ > - if (WARN_ONCE(cnt > RCUREF_MAXREF, "rcuref saturated - leaking memory")) > + if (cnt > RCUREF_MAXREF) { > + trace_printk("get(SATURATED): %px %pS\n", ref, __builtin_return_address(0)); > + WARN_ONCE(1, "rcuref saturated - leaking memory"); > atomic_set(&ref->refcnt, RCUREF_SATURATED); > + } else { > + trace_printk("get(UNDEFINED): %px %pS\n", ref, __builtin_return_address(0)); > + WARN_ON_ONCE(1); > + } > + > return true; > } > EXPORT_SYMBOL_GPL(rcuref_get_slowpath); > @@ -248,9 +256,12 @@ bool rcuref_put_slowpath(rcuref_t *ref) > * require a retry. If this fails the caller is not > * allowed to deconstruct the object. 
> */ > - if (!atomic_try_cmpxchg_release(&ref->refcnt, &cnt, RCUREF_DEAD)) > + if (!atomic_try_cmpxchg_release(&ref->refcnt, &cnt, RCUREF_DEAD)) { > + trace_printk("put(NOTDEAD): %px %pS\n", ref, __builtin_return_address(0)); > return false; > + } > > + trace_printk("put(NOWDEAD): %px %pS\n", ref, __builtin_return_address(0)); > /* > * The caller can safely schedule the object for > * deconstruction. Provide acquire ordering. > @@ -264,7 +275,9 @@ bool rcuref_put_slowpath(rcuref_t *ref) > * put() operation is imbalanced. Warn, put the reference count back to > * DEAD and tell the caller to not deconstruct the object. > */ > - if (WARN_ONCE(cnt >= RCUREF_RELEASED, "rcuref - imbalanced put()")) { > + if (cnt >= RCUREF_RELEASED) { > + trace_printk("put(WASDEAD): %px %pS\n", ref, __builtin_return_address(0)); > + WARN_ONCE(1, "rcuref - imbalanced put()"); > atomic_set(&ref->refcnt, RCUREF_DEAD); > return false; > } > @@ -274,8 +287,13 @@ bool rcuref_put_slowpath(rcuref_t *ref) > * mean saturation value and tell the caller to not deconstruct the > * object. > */ > - if (cnt > RCUREF_MAXREF) > + if (cnt > RCUREF_MAXREF) { > + trace_printk("put(SATURATED): %px %pS\n", ref, __builtin_return_address(0)); > atomic_set(&ref->refcnt, RCUREF_SATURATED); > + } else { > + trace_printk("put(UNDEFINED): %px %pS\n", ref, __builtin_return_address(0)); > + WARN_ON_ONCE(1); > + } > return false; > } > EXPORT_SYMBOL_GPL(rcuref_put_slowpath); I will apply this patch, upload the image to one machine as fast as possible, and send you any reports as soon as I get them. Best regards, Martin ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-12-19 9:25 ` Martin Zaharinov @ 2023-12-19 14:26 ` Thomas Gleixner 2023-12-22 17:26 ` Martin Zaharinov 0 siblings, 1 reply; 35+ messages in thread From: Thomas Gleixner @ 2023-12-19 14:26 UTC (permalink / raw) To: Martin Zaharinov Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet On Tue, Dec 19 2023 at 11:25, Martin Zaharinov wrote: >> On 12 Dec 2023, at 20:16, Thomas Gleixner <tglx@linutronix.de> wrote: >> Btw, how easy is this to reproduce? > > Its not easy this report is generate on machine with 5-6k users , with > traffic and one time is show on 1 day , other show after 4-5 days… I love those bugs ... > Apply this patch and will upload image on one machine as fast as > possible and when get any reports will send you. Let's see how that goes! Thanks, tglx ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-12-19 14:26 ` Thomas Gleixner @ 2023-12-22 17:26 ` Martin Zaharinov 2023-12-29 12:00 ` Martin Zaharinov 0 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-12-22 17:26 UTC (permalink / raw) To: Thomas Gleixner Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet Hi Thomas, this is with your patch applied. See the logs: [43040.198064] ------------[ cut here ]------------ [43040.198407] WARNING: CPU: 47 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath+0x2f/0x70 [43040.198685] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole tg3 igb i2c_algo_bit e1000e bnxt_en mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_devintf ipmi_msghandler rtc_cmos [43040.199478] CPU: 47 PID: 0 Comm: swapper/47 Tainted: G O 6.6.8 #1 [43040.199660] Hardware name: VMware, Inc. 
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [43040.199886] RIP: 0010:rcuref_put_slowpath+0x2f/0x70 [43040.200028] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 [43040.200387] RSP: 0018:ffffa39d83e88c30 EFLAGS: 00010246 [43040.200528] RAX: 0000000000000000 RBX: ffff9c58e966b840 RCX: ffff9c5bc4e35680 [43040.200700] RDX: ffff9c5fafde4f08 RSI: 00000000fffffe01 RDI: ffff9c58e966b840 [43040.200871] RBP: ffff9c5fafde4f08 R08: ffff9c5fafde4f08 R09: 0000000000000001 [43040.201044] R10: 00000000000286e0 R11: 0000000000000001 R12: ffff9c58e966b800 [43040.201255] R13: ffff9c58e966b868 R14: ffff9c5fafde4f08 R15: 000000008f5de42b [43040.201439] FS: 0000000000000000(0000) GS:ffff9c5fafdc0000(0000) knlGS:0000000000000000 [43040.201642] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [43040.201799] CR2: 00007f1401217714 CR3: 0000000464b94003 CR4: 00000000001706e0 [43040.201994] Call Trace: [43040.202095] <IRQ> [43040.202187] ? __warn+0x6c/0x130 [43040.202301] ? report_bug+0x1b8/0x200 [43040.202418] ? handle_bug+0x36/0x70 [43040.202534] ? exc_invalid_op+0x17/0x1a0 [43040.202652] ? asm_exc_invalid_op+0x16/0x20 [43040.202781] ? rcuref_put_slowpath+0x2f/0x70 [43040.202909] dst_release+0x1c/0x40 [43040.203026] rt_cache_route+0xbd/0xf0 [43040.203143] rt_set_nexthop.isra.0+0x1b6/0x450 [43040.203272] ip_route_input_slow+0x5d9/0xcc0 [43040.203401] ? nf_conntrack_udp_packet+0x17c/0x240 [nf_conntrack] [43040.203581] ip_route_input_noref+0xe0/0xf0 [43040.203704] ip_rcv_finish_core.isra.0+0xbb/0x440 [43040.203855] ip_rcv+0xd5/0x110 [43040.203962] ? 
ip_rcv_core+0x360/0x360 [43040.204079] process_backlog+0x107/0x210 [43040.204201] __napi_poll+0x20/0x180 [43040.204315] net_rx_action+0x29f/0x380 [43040.204432] __do_softirq+0xd0/0x202 [43040.204549] irq_exit_rcu+0x82/0xa0 [43040.204667] common_interrupt+0x7a/0xa0 [43040.204786] </IRQ> [43040.204876] <TASK> [43040.204965] asm_common_interrupt+0x22/0x40 [43040.205090] RIP: 0010:acpi_safe_halt+0x1b/0x20 [43040.205220] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f [43040.205578] RSP: 0018:ffffa39d8234fe80 EFLAGS: 00000246 [43040.205718] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f [43040.205890] RDX: ffff9c5fafdc0000 RSI: ffff9c5882e95800 RDI: ffff9c5882e95864 [43040.206063] RBP: ffffffffa9216ea0 R08: ffffffffa9216ea0 R09: 0000000000000003 [43040.206246] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 [43040.206419] R13: ffffffffa9216f08 R14: ffffffffa9216f20 R15: 0000000000000000 [43040.206593] acpi_idle_enter+0x77/0xc0 [43040.206711] cpuidle_enter_state+0x69/0x6a0 [43040.206835] cpuidle_enter+0x24/0x40 [43040.206954] do_idle+0x1a7/0x210 [43040.207066] cpu_startup_entry+0x21/0x30 [43040.207188] start_secondary+0xe1/0xf0 [43040.207310] secondary_startup_64_no_verify+0x166/0x16b [43040.207451] </TASK> [43040.207542] ---[ end trace 0000000000000000 ]--- [43040.198064] ------------[ cut here ]------------ [43040.198407] WARNING: CPU: 47 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [43040.198685] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole tg3 igb i2c_algo_bit e1000e bnxt_en mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 ipmi_devintf ipmi_msghandler rtc_cmos [43040.199478] CPU: 47 PID: 0 Comm: swapper/47 Tainted: G O 6.6.8 #1 [43040.199660] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 [43040.199886] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [43040.200028] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 All code ======== 0: 07 (bad) 1: 83 f8 ff cmp $0xffffffff,%eax 4: 75 19 jne 0x1f 6: ba 00 00 00 e0 mov $0xe0000000,%edx b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) f: 83 f8 ff cmp $0xffffffff,%eax 12: 74 04 je 0x18 14: 31 c0 xor %eax,%eax 16: 5b pop %rbx 17: c3 ret 18: b8 01 00 00 00 mov $0x1,%eax 1d: 5b pop %rbx 1e: c3 ret 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax 24: 77 14 ja 0x3a 26: 85 c0 test %eax,%eax 28: 78 06 js 0x30 2a:* 0f 0b ud2 <-- trapping instruction 2c: 31 c0 xor %eax,%eax 2e: eb e6 jmp 0x16 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) 36: 31 c0 xor %eax,%eax 38: eb dc jmp 0x16 3a: 80 .byte 0x80 3b: 3d e2 4e e3 00 cmp $0xe34ee2,%eax Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 31 c0 xor %eax,%eax 4: eb e6 jmp 0xffffffffffffffec 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) c: 31 c0 xor %eax,%eax e: eb dc jmp 0xffffffffffffffec 10: 80 .byte 0x80 11: 3d e2 4e e3 00 cmp $0xe34ee2,%eax [43040.200387] RSP: 0018:ffffa39d83e88c30 EFLAGS: 00010246 [43040.200528] RAX: 0000000000000000 RBX: ffff9c58e966b840 RCX: ffff9c5bc4e35680 [43040.200700] RDX: ffff9c5fafde4f08 RSI: 00000000fffffe01 RDI: ffff9c58e966b840 [43040.200871] RBP: ffff9c5fafde4f08 R08: ffff9c5fafde4f08 R09: 0000000000000001 [43040.201044] R10: 00000000000286e0 R11: 0000000000000001 R12: ffff9c58e966b800 [43040.201255] R13: ffff9c58e966b868 R14: ffff9c5fafde4f08 R15: 000000008f5de42b [43040.201439] FS: 
0000000000000000(0000) GS:ffff9c5fafdc0000(0000) knlGS:0000000000000000 [43040.201642] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [43040.201799] CR2: 00007f1401217714 CR3: 0000000464b94003 CR4: 00000000001706e0 [43040.201994] Call Trace: [43040.202095] <IRQ> [43040.202187] ? __warn (kernel/panic.c:235 kernel/panic.c:673) [43040.202301] ? report_bug (lib/bug.c:180 lib/bug.c:219) [43040.202418] ? handle_bug (arch/x86/kernel/traps.c:237) [43040.202534] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [43040.202652] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [43040.202781] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [43040.202909] dst_release (net/core/dst.c:166 (discriminator 1)) [43040.203026] rt_cache_route (net/ipv4/route.c:1499) [43040.203143] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) [43040.203272] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) [43040.203401] ? nf_conntrack_udp_packet (net/netfilter/nf_conntrack_proto_udp.c:124) nf_conntrack [43040.203581] ip_route_input_noref (net/ipv4/route.c:2499) [43040.203704] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) [43040.203855] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) [43040.203962] ? 
ip_rcv_core (net/ipv4/ip_input.c:436) [43040.204079] process_backlog (net/core/dev.c:5997) [43040.204201] __napi_poll (net/core/dev.c:6556) [43040.204315] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756) [43040.204432] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) [43040.204549] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) [43040.204667] common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 47)) [43040.204786] </IRQ> [43040.204876] <TASK> [43040.204965] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:640) [43040.205090] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) [43040.205220] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f All code ======== 0: ed in (%dx),%eax 1: c3 ret 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) 9: 00 00 00 00 d: 66 90 xchg %ax,%ax f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax 16: 02 00 18: 48 8b 00 mov (%rax),%rax 1b: a8 08 test $0x8,%al 1d: 75 0c jne 0x2b 1f: eb 07 jmp 0x28 21: 0f 00 2d 57 0f 2c 00 verw 0x2c0f57(%rip) # 0x2c0f7f 28: fb sti 29: f4 hlt 2a:* fa cli <-- trapping instruction 2b: c3 ret 2c: 0f 1f 00 nopl (%rax) 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax 33: 3c 01 cmp $0x1,%al 35: 74 0b je 0x42 37: 3c 02 cmp $0x2,%al 39: 74 05 je 0x40 3b: 8b 7f 04 mov 0x4(%rdi),%edi 3e: eb 9f jmp 0xffffffffffffffdf Code starting with the faulting instruction =========================================== 0: fa cli 1: c3 ret 2: 0f 1f 00 nopl (%rax) 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax 9: 3c 01 cmp $0x1,%al b: 74 0b je 0x18 d: 3c 02 cmp $0x2,%al f: 74 05 je 0x16 11: 8b 7f 04 mov 0x4(%rdi),%edi 14: eb 9f jmp 0xffffffffffffffb5 [43040.205578] RSP: 0018:ffffa39d8234fe80 EFLAGS: 00000246 [43040.205718] RAX: 0000000000004000 RBX: 
0000000000000001 RCX: 000000000000001f [43040.205890] RDX: ffff9c5fafdc0000 RSI: ffff9c5882e95800 RDI: ffff9c5882e95864 [43040.206063] RBP: ffffffffa9216ea0 R08: ffffffffa9216ea0 R09: 0000000000000003 [43040.206246] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 [43040.206419] R13: ffffffffa9216f08 R14: ffffffffa9216f20 R15: 0000000000000000 [43040.206593] acpi_idle_enter (drivers/acpi/processor_idle.c:709) [43040.206711] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) [43040.206835] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) [43040.206954] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) [43040.207066] cpu_startup_entry (kernel/sched/idle.c:379) [43040.207188] start_secondary (arch/x86/kernel/smpboot.c:326) [43040.207310] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) [43040.207451] </TASK> [43040.207542] ---[ end trace 0000000000000000 ]--- > On 19 Dec 2023, at 16:26, Thomas Gleixner <tglx@linutronix.de> wrote: > > On Tue, Dec 19 2023 at 11:25, Martin Zaharinov wrote: >>> On 12 Dec 2023, at 20:16, Thomas Gleixner <tglx@linutronix.de> wrote: >>> Btw, how easy is this to reproduce? >> >> It is not easy. This report was generated on a machine with 5-6k users and >> traffic; one time it showed up after 1 day, another time after 4-5 days… > > I love those bugs ... > >> I will apply this patch and upload the image to one machine as fast as >> possible, and when I get any reports I will send them to you. > > Let's see how that goes! > > Thanks, > > tglx
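The decoded disassembly in the report above shows three branches ahead of the trapping ud2, and they can be replayed against the logged RAX value. What follows is only a minimal sketch for reading the trace, not kernel code: the function name classify_put_slowpath and the returned labels are hypothetical, the two constants are the immediates printed in the disassembly, and it assumes that %eax at the trap still holds the value that was compared (nothing between the test and the ud2 writes it).

```python
# Replay of the branch sequence printed in the "All code" listing of
# rcuref_put_slowpath. Offsets in the comments refer to that listing.
NOREF_IMM = 0xFFFFFFFF   # immediate in "cmp $0xffffffff,%eax" (offset 0x1)
UPPER_IMM = 0xBFFFFFFF   # immediate in "cmp $0xbfffffff,%eax" (offset 0x1f)


def classify_put_slowpath(eax: int) -> str:
    """Return which path the listed code takes for a given 32-bit %eax."""
    eax &= 0xFFFFFFFF
    # 0x1: cmp $0xffffffff,%eax ; 0x4: jne 0x1f
    if eax == NOREF_IMM:
        return "cmpxchg path (offset 0x6)"
    # 0x1f: cmp $0xbfffffff,%eax ; 0x24: ja 0x3a (unsigned above)
    if eax > UPPER_IMM:
        return "branch to 0x3a"
    # 0x26: test %eax,%eax ; 0x28: js 0x30 (sign bit set)
    if eax & 0x80000000:
        return "store 0xa0000000 (offset 0x30)"
    # Fall through to 0x2a, the trapping instruction.
    return "ud2 at 0x2a (WARNING)"


if __name__ == "__main__":
    # RAX: 0000000000000000 is the value logged with the warning.
    print(classify_put_slowpath(0x0))
```

Under these assumptions the ud2 is reached only when the value is neither 0xffffffff nor has its top bit set, which matches the RAX of 0 logged in the register dump.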
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-12-22 17:26 ` Martin Zaharinov @ 2023-12-29 12:00 ` Martin Zaharinov 2024-01-04 20:51 ` Martin Zaharinov 0 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-12-29 12:00 UTC (permalink / raw) To: Thomas Gleixner Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet Hi Thomas, One more report from second machine: [21299.954952] ------------[ cut here ]------------ [21299.955047] WARNING: CPU: 15 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [21299.955153] Modules linked in: nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp virtio_net net_failover failover virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring e1000e e1000 vmxnet3 i40e ixgbe mdio bnxt_en nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rtc_cmos [21299.955378] CPU: 15 PID: 0 Comm: swapper/15 Tainted: G O 6.6.8 #1 [21299.955475] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 02/09/2023 [21299.955575] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [21299.955662] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 All code ======== 0: 07 (bad) 1: 83 f8 ff cmp $0xffffffff,%eax 4: 75 19 jne 0x1f 6: ba 00 00 00 e0 mov $0xe0000000,%edx b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) f: 83 f8 ff cmp $0xffffffff,%eax 12: 74 04 je 0x18 14: 31 c0 xor %eax,%eax 16: 5b pop %rbx 17: c3 ret 18: b8 01 00 00 00 mov $0x1,%eax 1d: 5b pop %rbx 1e: c3 ret 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax 24: 77 14 ja 0x3a 26: 85 c0 test %eax,%eax 28: 78 06 js 0x30 2a:* 0f 0b ud2 <-- trapping instruction 2c: 31 c0 xor 
%eax,%eax 2e: eb e6 jmp 0x16 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) 36: 31 c0 xor %eax,%eax 38: eb dc jmp 0x16 3a: 80 .byte 0x80 3b: 3d e2 4e e3 00 cmp $0xe34ee2,%eax Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 31 c0 xor %eax,%eax 4: eb e6 jmp 0xffffffffffffffec 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) c: 31 c0 xor %eax,%eax e: eb dc jmp 0xffffffffffffffec 10: 80 .byte 0x80 11: 3d e2 4e e3 00 cmp $0xe34ee2,%eax [21299.955793] RSP: 0018:ffff96a7c0578c30 EFLAGS: 00010246 [21299.955879] RAX: 0000000000000000 RBX: ffff8b75d1e49a80 RCX: ffff8b75c6667c80 [21299.955974] RDX: ffff8b84bfbe4f08 RSI: 00000000fffffe01 RDI: ffff8b75d1e49a80 [21299.956070] RBP: ffff8b84bfbe4f08 R08: ffff8b84bfbe4f08 R09: 0000000000000001 [21299.956167] R10: 0000000000028530 R11: 0000000000000001 R12: ffff8b75d1e49a40 [21299.956261] R13: ffff8b75d1e49aa8 R14: ffff8b84bfbe4f08 R15: 00000000c26ab667 [21299.956358] FS: 0000000000000000(0000) GS:ffff8b84bfbc0000(0000) knlGS:0000000000000000 [21299.956457] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [21299.956540] CR2: 00007f2e185c73c8 CR3: 0000000950014003 CR4: 00000000003706e0 [21299.956635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [21299.956730] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [21299.956826] Call Trace: [21299.956905] <IRQ> [21299.956983] ? __warn (kernel/panic.c:235 kernel/panic.c:673) [21299.957065] ? report_bug (lib/bug.c:180 lib/bug.c:219) [21299.957147] ? handle_bug (arch/x86/kernel/traps.c:237) [21299.957228] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [21299.957308] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [21299.957393] ? 
rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [21299.957476] dst_release (net/core/dst.c:166 (discriminator 1)) [21299.957559] rt_cache_route (net/ipv4/route.c:1499) [21299.957641] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) [21299.957722] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) [21299.957804] ? free_unref_page (./include/linux/list.h:150 (discriminator 1) ./include/linux/list.h:169 (discriminator 1) mm/page_alloc.c:2377 (discriminator 1) mm/page_alloc.c:2428 (discriminator 1)) [21299.957889] ip_route_input_noref (net/ipv4/route.c:2499) [21299.957972] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) [21299.958058] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) [21299.958139] ? ip_rcv_core (net/ipv4/ip_input.c:436) [21299.958220] process_backlog (net/core/dev.c:5997) [21299.958302] __napi_poll (net/core/dev.c:6556) [21299.958384] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756) [21299.958466] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) [21299.958549] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) [21299.958631] sysvec_call_function_single (arch/x86/kernel/smp.c:262 (discriminator 47)) [21299.958714] </IRQ> [21299.958792] <TASK> [21299.958869] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:656) [21299.958953] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) [21299.959038] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f All code ======== 0: ed in (%dx),%eax 1: c3 ret 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) 9: 00 00 00 00 d: 66 90 xchg 
%ax,%ax f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax 16: 02 00 18: 48 8b 00 mov (%rax),%rax 1b: a8 08 test $0x8,%al 1d: 75 0c jne 0x2b 1f: eb 07 jmp 0x28 21: 0f 00 2d 57 0f 2c 00 verw 0x2c0f57(%rip) # 0x2c0f7f 28: fb sti 29: f4 hlt 2a:* fa cli <-- trapping instruction 2b: c3 ret 2c: 0f 1f 00 nopl (%rax) 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax 33: 3c 01 cmp $0x1,%al 35: 74 0b je 0x42 37: 3c 02 cmp $0x2,%al 39: 74 05 je 0x40 3b: 8b 7f 04 mov 0x4(%rdi),%edi 3e: eb 9f jmp 0xffffffffffffffdf Code starting with the faulting instruction =========================================== 0: fa cli 1: c3 ret 2: 0f 1f 00 nopl (%rax) 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax 9: 3c 01 cmp $0x1,%al b: 74 0b je 0x18 d: 3c 02 cmp $0x2,%al f: 74 05 je 0x16 11: 8b 7f 04 mov 0x4(%rdi),%edi 14: eb 9f jmp 0xffffffffffffffb5 [21299.959162] RSP: 0018:ffff96a7c015be80 EFLAGS: 00000246 [21299.959247] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f [21299.959343] RDX: ffff8b84bfbc0000 RSI: ffff8b75c76ba000 RDI: ffff8b75c76ba064 [21299.959437] RBP: ffffffffae216ea0 R08: ffffffffae216ea0 R09: 0000000000000003 [21299.959533] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 [21299.959630] R13: ffffffffae216f08 R14: ffffffffae216f20 R15: 0000000000000000 [21299.959725] acpi_idle_enter (drivers/acpi/processor_idle.c:709) [21299.959807] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) [21299.959890] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) [21299.959975] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) [21299.960058] cpu_startup_entry (kernel/sched/idle.c:379) [21299.960140] start_secondary (arch/x86/kernel/smpboot.c:326) [21299.960223] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) [21299.960306] </TASK> [21299.960384] ---[ end trace 0000000000000000 ]---
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-12-29 12:00 ` Martin Zaharinov @ 2024-01-04 20:51 ` Martin Zaharinov 2024-01-07 11:03 ` Martin Zaharinov 0 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2024-01-04 20:51 UTC (permalink / raw) To: Thomas Gleixner Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet Hi Thomas, Happy New Year! Here are two debug reports from two newly installed machines with kernel 6.6.9: dmesg1: [ 2257.449125] ------------[ cut here ]------------ [ 2257.449245] WARNING: CPU: 1 PID: 40622 at lib/rcuref.c:294 rcuref_put_slowpath+0x2f/0x70 [ 2257.449373] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [ 2257.449642] CPU: 1 PID: 40622 Comm: nc Tainted: G O 6.6.9 #1 [ 2257.449761] Hardware name: Supermicro PIO-5038MR-H8TRF-NODE/X10SRD-F, BIOS 3.3 10/28/2020 [ 2257.449883] RIP: 0010:rcuref_put_slowpath+0x2f/0x70 [ 2257.449977] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 [ 2257.450135] RSP: 0000:ffffb455cef83b78 EFLAGS: 00010246 [ 2257.450227] RAX: 0000000000000000 RBX: ffff94873bb77dc0 RCX: ffff9486c0d46b80 [ 2257.450341] RDX: ffff948736578428 RSI: 00000000fffffe01 RDI: ffff94873bb77dc0 [ 2257.450456] RBP: ffff948736578428 R08: ffff948e1fa64f08 R09: 0000000000000001 [ 2257.450570] R10: 0000000000028530 R11: 0000000000000001 R12: ffff94873bb77d80 [ 2257.450685] R13: ffff94873bb77de8 R14: ffff948e1fa64f08 R15: 000000000266f59d [ 2257.450802] FS: 00007f0cdbc73800(0000)
GS:ffff948e1fa40000(0000) knlGS:0000000000000000 [ 2257.450918] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2257.451012] CR2: 00007f0cdc3f5c30 CR3: 0000000178ea0002 CR4: 00000000003706e0 [ 2257.451127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2257.451240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2257.451353] Call Trace: [ 2257.451441] <TASK> [ 2257.451526] ? __warn+0x6c/0x130 [ 2257.451616] ? report_bug+0x1b8/0x200 [ 2257.451707] ? handle_bug+0x36/0x70 [ 2257.451797] ? exc_invalid_op+0x17/0x1a0 [ 2257.451886] ? asm_exc_invalid_op+0x16/0x20 [ 2257.452038] ? rcuref_put_slowpath+0x2f/0x70 [ 2257.452129] dst_release+0x1c/0x40 [ 2257.452222] rt_cache_route+0xbd/0xf0 [ 2257.452313] ? kmem_cache_alloc+0x31/0x390 [ 2257.452404] rt_set_nexthop.isra.0+0x1b6/0x450 [ 2257.452495] ip_route_input_slow+0x5d9/0xcc0 [ 2257.452586] ? nft_nat_do_chain+0x7f/0xd0 [nft_chain_nat] [ 2257.452681] ? nf_conntrack_udp_packet+0xcf/0x240 [nf_conntrack] [ 2257.452784] ? nf_nat_inet_fn+0x36f/0x3f0 [nf_nat] [ 2257.452880] ip_route_input_noref+0xe0/0xf0 [ 2257.452970] ip_rcv_finish_core.isra.0+0xbb/0x440 [ 2257.453064] ip_rcv+0xd5/0x110 [ 2257.453151] ? 
ip_rcv_core+0x360/0x360 [ 2257.453240] process_backlog+0x107/0x210 [ 2257.453330] __napi_poll+0x20/0x180 [ 2257.453420] net_rx_action+0x29f/0x380 [ 2257.453510] __do_softirq+0xd0/0x202 [ 2257.453599] irq_exit_rcu+0x82/0xa0 [ 2257.453689] sysvec_call_function_single+0x32/0x80 [ 2257.453781] asm_sysvec_call_function_single+0x16/0x20 [ 2257.453874] RIP: 0033:0x7f0cdc5928b2 [ 2257.453963] Code: 06 00 00 4c 89 65 88 49 83 fd 08 0f 84 f7 06 00 00 49 83 fd 26 0f 84 05 07 00 00 4d 85 ed 0f 84 5f 01 00 00 41 0f b6 44 24 04 <89> c6 40 c0 ee 04 0f 84 72 06 00 00 41 0f b6 54 24 05 83 e2 03 ff [ 2257.454121] RSP: 002b:00007ffc04d3e890 EFLAGS: 00000206 [ 2257.454215] RAX: 0000000000000012 RBX: 00007f0cdc444db8 RCX: 00007f0cdc4e6e60 [ 2257.454329] RDX: 0000000000000009 RSI: 00007f0cdc57ef30 RDI: 00007f0cdc42c808 [ 2257.454442] RBP: 00007ffc04d3e9b0 R08: 00007f0cdc445028 R09: 00007ffc04d3e940 [ 2257.454555] R10: 00007f0cdbf00be8 R11: 0000000000000000 R12: 00007f0cdc42c898 [ 2257.454670] R13: 0000000000000006 R14: 0000000600000006 R15: 00007f0cdc581000 [ 2257.454784] </TASK> [ 2257.454869] ---[ end trace 0000000000000000 ]--- [ 2257.449125] ------------[ cut here ]------------ [ 2257.449245] WARNING: CPU: 1 PID: 40622 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [ 2257.449373] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [ 2257.449642] CPU: 1 PID: 40622 Comm: nc Tainted: G O 6.6.9 #1 [ 2257.449761] Hardware name: Supermicro PIO-5038MR-H8TRF-NODE/X10SRD-F, BIOS 3.3 10/28/2020 [ 2257.449883] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [ 2257.449977] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff
74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 All code ======== 0: 07 (bad) 1: 83 f8 ff cmp $0xffffffff,%eax 4: 75 19 jne 0x1f 6: ba 00 00 00 e0 mov $0xe0000000,%edx b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) f: 83 f8 ff cmp $0xffffffff,%eax 12: 74 04 je 0x18 14: 31 c0 xor %eax,%eax 16: 5b pop %rbx 17: c3 ret 18: b8 01 00 00 00 mov $0x1,%eax 1d: 5b pop %rbx 1e: c3 ret 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax 24: 77 14 ja 0x3a 26: 85 c0 test %eax,%eax 28: 78 06 js 0x30 2a:* 0f 0b ud2 <-- trapping instruction 2c: 31 c0 xor %eax,%eax 2e: eb e6 jmp 0x16 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) 36: 31 c0 xor %eax,%eax 38: eb dc jmp 0x16 3a: 80 .byte 0x80 3b: 3d e2 4c e3 00 cmp $0xe34ce2,%eax Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 31 c0 xor %eax,%eax 4: eb e6 jmp 0xffffffffffffffec 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) c: 31 c0 xor %eax,%eax e: eb dc jmp 0xffffffffffffffec 10: 80 .byte 0x80 11: 3d e2 4c e3 00 cmp $0xe34ce2,%eax [ 2257.450135] RSP: 0000:ffffb455cef83b78 EFLAGS: 00010246 [ 2257.450227] RAX: 0000000000000000 RBX: ffff94873bb77dc0 RCX: ffff9486c0d46b80 [ 2257.450341] RDX: ffff948736578428 RSI: 00000000fffffe01 RDI: ffff94873bb77dc0 [ 2257.450456] RBP: ffff948736578428 R08: ffff948e1fa64f08 R09: 0000000000000001 [ 2257.450570] R10: 0000000000028530 R11: 0000000000000001 R12: ffff94873bb77d80 [ 2257.450685] R13: ffff94873bb77de8 R14: ffff948e1fa64f08 R15: 000000000266f59d [ 2257.450802] FS: 00007f0cdbc73800(0000) GS:ffff948e1fa40000(0000) knlGS:0000000000000000 [ 2257.450918] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2257.451012] CR2: 00007f0cdc3f5c30 CR3: 0000000178ea0002 CR4: 00000000003706e0 [ 2257.451127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2257.451240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2257.451353] Call Trace: [ 
2257.451441] <TASK> [ 2257.451526] ? __warn (kernel/panic.c:235 kernel/panic.c:673) [ 2257.451616] ? report_bug (lib/bug.c:180 lib/bug.c:219) [ 2257.451707] ? handle_bug (arch/x86/kernel/traps.c:237) [ 2257.451797] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [ 2257.451886] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [ 2257.452038] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [ 2257.452129] dst_release (net/core/dst.c:166 (discriminator 1)) [ 2257.452222] rt_cache_route (net/ipv4/route.c:1499) [ 2257.452313] ? kmem_cache_alloc (mm/slab.h:711 (discriminator 1) mm/slub.c:3461 (discriminator 1) mm/slub.c:3487 (discriminator 1) mm/slub.c:3494 (discriminator 1) mm/slub.c:3503 (discriminator 1)) [ 2257.452404] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) [ 2257.452495] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) [ 2257.452586] ? nft_nat_do_chain (net/netfilter/nft_chain_nat.c:33) nft_chain_nat [ 2257.452681] ? nf_conntrack_udp_packet (net/netfilter/nf_conntrack_proto_udp.c:130) nf_conntrack [ 2257.452784] ? nf_nat_inet_fn (net/netfilter/nf_nat_core.c:844) nf_nat [ 2257.452880] ip_route_input_noref (net/ipv4/route.c:2499) [ 2257.452970] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) [ 2257.453064] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) [ 2257.453151] ? 
ip_rcv_core (net/ipv4/ip_input.c:436) [ 2257.453240] process_backlog (net/core/dev.c:6000) [ 2257.453330] __napi_poll (net/core/dev.c:6559) [ 2257.453420] net_rx_action (net/core/dev.c:6628 net/core/dev.c:6759) [ 2257.453510] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) [ 2257.453599] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) [ 2257.453689] sysvec_call_function_single (arch/x86/kernel/smp.c:262 (discriminator 69)) [ 2257.453781] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:656) [ 2257.453874] RIP: 0033:0x7f0cdc5928b2 [ 2257.453963] Code: 06 00 00 4c 89 65 88 49 83 fd 08 0f 84 f7 06 00 00 49 83 fd 26 0f 84 05 07 00 00 4d 85 ed 0f 84 5f 01 00 00 41 0f b6 44 24 04 <89> c6 40 c0 ee 04 0f 84 72 06 00 00 41 0f b6 54 24 05 83 e2 03 ff All code ======== 0: 06 (bad) 1: 00 00 add %al,(%rax) 3: 4c 89 65 88 mov %r12,-0x78(%rbp) 7: 49 83 fd 08 cmp $0x8,%r13 b: 0f 84 f7 06 00 00 je 0x708 11: 49 83 fd 26 cmp $0x26,%r13 15: 0f 84 05 07 00 00 je 0x720 1b: 4d 85 ed test %r13,%r13 1e: 0f 84 5f 01 00 00 je 0x183 24: 41 0f b6 44 24 04 movzbl 0x4(%r12),%eax 2a:* 89 c6 mov %eax,%esi <-- trapping instruction 2c: 40 c0 ee 04 shr $0x4,%sil 30: 0f 84 72 06 00 00 je 0x6a8 36: 41 0f b6 54 24 05 movzbl 0x5(%r12),%edx 3c: 83 e2 03 and $0x3,%edx 3f: ff .byte 0xff Code starting with the faulting instruction =========================================== 0: 89 c6 mov %eax,%esi 2: 40 c0 ee 04 shr $0x4,%sil 6: 0f 84 72 06 00 00 je 0x67e c: 41 0f b6 54 24 05 movzbl 0x5(%r12),%edx 12: 83 e2 03 and $0x3,%edx 15: ff .byte 0xff [ 2257.454121] RSP: 002b:00007ffc04d3e890 EFLAGS: 00000206 [ 2257.454215] RAX: 0000000000000012 RBX: 00007f0cdc444db8 RCX: 00007f0cdc4e6e60 [ 2257.454329] RDX: 0000000000000009 RSI: 00007f0cdc57ef30 RDI: 00007f0cdc42c808 [ 2257.454442] RBP: 00007ffc04d3e9b0 R08: 00007f0cdc445028 R09: 00007ffc04d3e940 [ 2257.454555] R10: 00007f0cdbf00be8 R11: 0000000000000000 R12: 00007f0cdc42c898 [ 2257.454670] R13: 
0000000000000006 R14: 0000000600000006 R15: 00007f0cdc581000 [ 2257.454784] </TASK> [ 2257.454869] ---[ end trace 0000000000000000 ]---

dmesg2 :

[ 2567.167952] ------------[ cut here ]------------ [ 2567.168053] WARNING: CPU: 11 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath+0x2f/0x70 [ 2567.168175] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [ 2567.168445] CPU: 11 PID: 0 Comm: swapper/11 Tainted: G O 6.6.9 #1 [ 2567.168561] Hardware name: Supermicro X10SRD-F/X10SRD-F, BIOS 3.4 06/05/2021 [ 2567.168675] RIP: 0010:rcuref_put_slowpath+0x2f/0x70 [ 2567.168767] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 [ 2567.168924] RSP: 0018:ffffaeaf80418d00 EFLAGS: 00010246 [ 2567.169017] RAX: 0000000000000000 RBX: ffff9fef84d6a940 RCX: 0000000000000074 [ 2567.169132] RDX: ffff9fefe2e30000 RSI: 0000000000000000 RDI: ffff9fef84d6a940 [ 2567.169246] RBP: ffff9fefe2e306c0 R08: 0000000000000000 R09: 0000000000029300 [ 2567.169359] R10: 0000000000029300 R11: ffffaeaf80418d90 R12: ffff9fef8aebe000 [ 2567.169473] R13: ffff9fef80896800 R14: ffff9fef85335200 R15: ffff9fef8ae07080 [ 2567.169586] FS: 0000000000000000(0000) GS:ffff9ff6dfcc0000(0000) knlGS:0000000000000000 [ 2567.169702] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2567.169795] CR2: 00007f4eaa7e6650 CR3: 0000000156dcd006 CR4: 00000000003706e0 [ 2567.169908] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2567.170022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2567.170137] Call Trace: [ 2567.170224]
<IRQ> [ 2567.170309] ? __warn+0x6c/0x130 [ 2567.170399] ? report_bug+0x1b8/0x200 [ 2567.170488] ? handle_bug+0x36/0x70 [ 2567.170577] ? exc_invalid_op+0x17/0x1a0 [ 2567.170667] ? asm_exc_invalid_op+0x16/0x20 [ 2567.170758] ? rcuref_put_slowpath+0x2f/0x70 [ 2567.170850] dst_release+0x1c/0x40 [ 2567.170939] __dev_queue_xmit+0x598/0xce0 [ 2567.171029] vlan_dev_hard_start_xmit+0x82/0xc0 [ 2567.171122] dev_hard_start_xmit+0x95/0xe0 [ 2567.171216] __dev_queue_xmit+0x863/0xce0 [ 2567.171305] ? eth_header+0x25/0xc0 [ 2567.171394] ip_finish_output2+0x1a0/0x530 [ 2567.171485] process_backlog+0x107/0x210 [ 2567.171575] __napi_poll+0x20/0x180 [ 2567.171663] net_rx_action+0x29f/0x380 [ 2567.171752] ? rebalance_domains+0x14c/0x300 [ 2567.171843] __do_softirq+0xd0/0x202 [ 2567.171932] irq_exit_rcu+0x82/0xa0 [ 2567.172022] common_interrupt+0x7a/0xa0 [ 2567.172111] </IRQ> [ 2567.172198] <TASK> [ 2567.172283] asm_common_interrupt+0x22/0x40 [ 2567.172374] RIP: 0010:cpuidle_enter_state+0xa3/0x6a0 [ 2567.172467] Code: 46 40 40 0f 84 02 01 00 00 e8 c9 a0 70 ff e8 d4 f6 ff ff 31 ff 49 89 c6 e8 0a b9 6f ff 45 84 ff 0f 85 d9 00 00 00 fb 45 85 ed <0f> 88 b8 00 00 00 49 63 cd 48 8b 04 24 48 6b f1 68 49 29 c6 48 8d [ 2567.172623] RSP: 0018:ffffaeaf80177e98 EFLAGS: 00000202 [ 2567.172715] RAX: ffff9ff6dfce3a80 RBX: ffff9fef81338000 RCX: 000000000000001f [ 2567.172828] RDX: 00000255b721ed84 RSI: 00000000238e3b7a RDI: 0000000000000000 [ 2567.172942] RBP: ffffffffba216ea0 R08: 0000000000000004 R09: ffff9ff6dfcdef00 [ 2567.173055] R10: ffff9ff6dfcdef00 R11: 0000000000000007 R12: 0000000000000001 [ 2567.173168] R13: 0000000000000001 R14: 00000255b721ed84 R15: 0000000000000000 [ 2567.173283] ? 
cpuidle_enter_state+0x96/0x6a0 [ 2567.173374] cpuidle_enter+0x24/0x40 [ 2567.173464] do_idle+0x1a7/0x210 [ 2567.173552] cpu_startup_entry+0x21/0x30 [ 2567.173642] start_secondary+0xe1/0xf0 [ 2567.173732] secondary_startup_64_no_verify+0x178/0x17b [ 2567.173825] </TASK> [ 2567.173910] ---[ end trace 0000000000000000 ]---
[ 2567.167952] ------------[ cut here ]------------ [ 2567.168053] WARNING: CPU: 11 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [ 2567.168175] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [ 2567.168445] CPU: 11 PID: 0 Comm: swapper/11 Tainted: G O 6.6.9 #1 [ 2567.168561] Hardware name: Supermicro X10SRD-F/X10SRD-F, BIOS 3.4 06/05/2021 [ 2567.168675] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [ 2567.168767] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 All code ======== 0: 07 (bad) 1: 83 f8 ff cmp $0xffffffff,%eax 4: 75 19 jne 0x1f 6: ba 00 00 00 e0 mov $0xe0000000,%edx b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) f: 83 f8 ff cmp $0xffffffff,%eax 12: 74 04 je 0x18 14: 31 c0 xor %eax,%eax 16: 5b pop %rbx 17: c3 ret 18: b8 01 00 00 00 mov $0x1,%eax 1d: 5b pop %rbx 1e: c3 ret 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax 24: 77 14 ja 0x3a 26: 85 c0 test %eax,%eax 28: 78 06 js 0x30 2a:* 0f 0b ud2 <-- trapping instruction 2c: 31 c0 xor %eax,%eax 2e: eb e6 jmp 0x16 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) 36: 31 c0 xor %eax,%eax 38: eb dc jmp 0x16 3a: 80 .byte 0x80 3b: 3d e2 4c e3 00 cmp $0xe34ce2,%eax Code starting with the
faulting instruction =========================================== 0: 0f 0b ud2 2: 31 c0 xor %eax,%eax 4: eb e6 jmp 0xffffffffffffffec 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) c: 31 c0 xor %eax,%eax e: eb dc jmp 0xffffffffffffffec 10: 80 .byte 0x80 11: 3d e2 4c e3 00 cmp $0xe34ce2,%eax [ 2567.168924] RSP: 0018:ffffaeaf80418d00 EFLAGS: 00010246 [ 2567.169017] RAX: 0000000000000000 RBX: ffff9fef84d6a940 RCX: 0000000000000074 [ 2567.169132] RDX: ffff9fefe2e30000 RSI: 0000000000000000 RDI: ffff9fef84d6a940 [ 2567.169246] RBP: ffff9fefe2e306c0 R08: 0000000000000000 R09: 0000000000029300 [ 2567.169359] R10: 0000000000029300 R11: ffffaeaf80418d90 R12: ffff9fef8aebe000 [ 2567.169473] R13: ffff9fef80896800 R14: ffff9fef85335200 R15: ffff9fef8ae07080 [ 2567.169586] FS: 0000000000000000(0000) GS:ffff9ff6dfcc0000(0000) knlGS:0000000000000000 [ 2567.169702] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2567.169795] CR2: 00007f4eaa7e6650 CR3: 0000000156dcd006 CR4: 00000000003706e0 [ 2567.169908] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 2567.170022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 2567.170137] Call Trace: [ 2567.170224] <IRQ> [ 2567.170309] ? __warn (kernel/panic.c:235 kernel/panic.c:673) [ 2567.170399] ? report_bug (lib/bug.c:180 lib/bug.c:219) [ 2567.170488] ? handle_bug (arch/x86/kernel/traps.c:237) [ 2567.170577] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [ 2567.170667] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [ 2567.170758] ? 
rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [ 2567.170850] dst_release (net/core/dst.c:166 (discriminator 1)) [ 2567.170939] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4327) [ 2567.171029] vlan_dev_hard_start_xmit (net/8021q/vlan_dev.c:130) [ 2567.171122] dev_hard_start_xmit (./include/linux/netdevice.h:4926 net/core/dev.c:3576 net/core/dev.c:3592) [ 2567.171216] __dev_queue_xmit (./include/linux/netdevice.h:3300 (discriminator 25) net/core/dev.c:4373 (discriminator 25)) [ 2567.171305] ? eth_header (net/ethernet/eth.c:85) [ 2567.171394] ip_finish_output2 (./include/net/neighbour.h:542 (discriminator 2) net/ipv4/ip_output.c:233 (discriminator 2)) [ 2567.171485] process_backlog (net/core/dev.c:6000) [ 2567.171575] __napi_poll (net/core/dev.c:6559) [ 2567.171663] net_rx_action (net/core/dev.c:6628 net/core/dev.c:6759) [ 2567.171752] ? rebalance_domains (kernel/sched/fair.c:11719 kernel/sched/fair.c:11895) [ 2567.171843] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) [ 2567.171932] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) [ 2567.172022] common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 47)) [ 2567.172111] </IRQ> [ 2567.172198] <TASK> [ 2567.172283] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:640) [ 2567.172374] RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291) [ 2567.172467] Code: 46 40 40 0f 84 02 01 00 00 e8 c9 a0 70 ff e8 d4 f6 ff ff 31 ff 49 89 c6 e8 0a b9 6f ff 45 84 ff 0f 85 d9 00 00 00 fb 45 85 ed <0f> 88 b8 00 00 00 49 63 cd 48 8b 04 24 48 6b f1 68 49 29 c6 48 8d All code ======== 0: 46 rex.RX 1: 40 rex 2: 40 0f 84 02 01 00 00 rex je 0x10b 9: e8 c9 a0 70 ff call 0xffffffffff70a0d7 e: e8 d4 f6 ff ff call 0xfffffffffffff6e7 13: 31 ff xor %edi,%edi 15: 49 89 c6 mov %rax,%r14 18: e8 0a b9 6f ff call 0xffffffffff6fb927 1d: 45 84 ff test %r15b,%r15b 20: 0f 85 d9 00 00 00 jne 0xff 26: fb sti 27: 45 85 ed test %r13d,%r13d 2a:* 0f 88 b8 00 00 00 js 
0xe8 <-- trapping instruction 30: 49 63 cd movslq %r13d,%rcx 33: 48 8b 04 24 mov (%rsp),%rax 37: 48 6b f1 68 imul $0x68,%rcx,%rsi 3b: 49 29 c6 sub %rax,%r14 3e: 48 rex.W 3f: 8d .byte 0x8d Code starting with the faulting instruction =========================================== 0: 0f 88 b8 00 00 00 js 0xbe 6: 49 63 cd movslq %r13d,%rcx 9: 48 8b 04 24 mov (%rsp),%rax d: 48 6b f1 68 imul $0x68,%rcx,%rsi 11: 49 29 c6 sub %rax,%r14 14: 48 rex.W 15: 8d .byte 0x8d [ 2567.172623] RSP: 0018:ffffaeaf80177e98 EFLAGS: 00000202 [ 2567.172715] RAX: ffff9ff6dfce3a80 RBX: ffff9fef81338000 RCX: 000000000000001f [ 2567.172828] RDX: 00000255b721ed84 RSI: 00000000238e3b7a RDI: 0000000000000000 [ 2567.172942] RBP: ffffffffba216ea0 R08: 0000000000000004 R09: ffff9ff6dfcdef00 [ 2567.173055] R10: ffff9ff6dfcdef00 R11: 0000000000000007 R12: 0000000000000001 [ 2567.173168] R13: 0000000000000001 R14: 00000255b721ed84 R15: 0000000000000000 [ 2567.173283] ? cpuidle_enter_state (drivers/cpuidle/cpuidle.c:285) [ 2567.173374] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) [ 2567.173464] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) [ 2567.173552] cpu_startup_entry (kernel/sched/idle.c:379) [ 2567.173642] start_secondary (arch/x86/kernel/smpboot.c:326) [ 2567.173732] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:449) [ 2567.173825] </TASK> [ 2567.173910] ---[ end trace 0000000000000000 ]---

best regards,
Martin

> On 29 Dec 2023, at 14:00, Martin Zaharinov <micron10@gmail.com> wrote:
>
> Hi Thomas,
>
> One more report from second machine:
>
> [21299.954952] ------------[ cut here ]------------ > [21299.955047] WARNING: CPU: 15 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [21299.955153] Modules linked in: nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp virtio_net net_failover failover virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring e1000e e1000 vmxnet3
i40e ixgbe mdio bnxt_en nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rtc_cmos > [21299.955378] CPU: 15 PID: 0 Comm: swapper/15 Tainted: G O 6.6.8 #1 > [21299.955475] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 02/09/2023 > [21299.955575] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [21299.955662] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 > All code > ======== > 0: 07 (bad) > 1: 83 f8 ff cmp $0xffffffff,%eax > 4: 75 19 jne 0x1f > 6: ba 00 00 00 e0 mov $0xe0000000,%edx > b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) > f: 83 f8 ff cmp $0xffffffff,%eax > 12: 74 04 je 0x18 > 14: 31 c0 xor %eax,%eax > 16: 5b pop %rbx > 17: c3 ret > 18: b8 01 00 00 00 mov $0x1,%eax > 1d: 5b pop %rbx > 1e: c3 ret > 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax > 24: 77 14 ja 0x3a > 26: 85 c0 test %eax,%eax > 28: 78 06 js 0x30 > 2a:* 0f 0b ud2 <-- trapping instruction > 2c: 31 c0 xor %eax,%eax > 2e: eb e6 jmp 0x16 > 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) > 36: 31 c0 xor %eax,%eax > 38: eb dc jmp 0x16 > 3a: 80 .byte 0x80 > 3b: 3d e2 4e e3 00 cmp $0xe34ee2,%eax > > Code starting with the faulting instruction > =========================================== > 0: 0f 0b ud2 > 2: 31 c0 xor %eax,%eax > 4: eb e6 jmp 0xffffffffffffffec > 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) > c: 31 c0 xor %eax,%eax > e: eb dc jmp 0xffffffffffffffec > 10: 80 .byte 0x80 > 11: 3d e2 4e e3 00 cmp $0xe34ee2,%eax > [21299.955793] RSP: 0018:ffff96a7c0578c30 EFLAGS: 00010246 > [21299.955879] RAX: 0000000000000000 RBX: ffff8b75d1e49a80 RCX: ffff8b75c6667c80 > [21299.955974] RDX: ffff8b84bfbe4f08 RSI: 00000000fffffe01 RDI: ffff8b75d1e49a80 > [21299.956070] RBP: ffff8b84bfbe4f08 R08: ffff8b84bfbe4f08 
R09: 0000000000000001 > [21299.956167] R10: 0000000000028530 R11: 0000000000000001 R12: ffff8b75d1e49a40 > [21299.956261] R13: ffff8b75d1e49aa8 R14: ffff8b84bfbe4f08 R15: 00000000c26ab667 > [21299.956358] FS: 0000000000000000(0000) GS:ffff8b84bfbc0000(0000) knlGS:0000000000000000 > [21299.956457] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [21299.956540] CR2: 00007f2e185c73c8 CR3: 0000000950014003 CR4: 00000000003706e0 > [21299.956635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [21299.956730] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [21299.956826] Call Trace: > [21299.956905] <IRQ> > [21299.956983] ? __warn (kernel/panic.c:235 kernel/panic.c:673) > [21299.957065] ? report_bug (lib/bug.c:180 lib/bug.c:219) > [21299.957147] ? handle_bug (arch/x86/kernel/traps.c:237) > [21299.957228] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) > [21299.957308] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) > [21299.957393] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [21299.957476] dst_release (net/core/dst.c:166 (discriminator 1)) > [21299.957559] rt_cache_route (net/ipv4/route.c:1499) > [21299.957641] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) > [21299.957722] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) > [21299.957804] ? free_unref_page (./include/linux/list.h:150 (discriminator 1) ./include/linux/list.h:169 (discriminator 1) mm/page_alloc.c:2377 (discriminator 1) mm/page_alloc.c:2428 (discriminator 1)) > [21299.957889] ip_route_input_noref (net/ipv4/route.c:2499) > [21299.957972] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) > [21299.958058] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) > [21299.958139] ? 
ip_rcv_core (net/ipv4/ip_input.c:436) > [21299.958220] process_backlog (net/core/dev.c:5997) > [21299.958302] __napi_poll (net/core/dev.c:6556) > [21299.958384] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756) > [21299.958466] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) > [21299.958549] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) > [21299.958631] sysvec_call_function_single (arch/x86/kernel/smp.c:262 (discriminator 47)) > [21299.958714] </IRQ> > [21299.958792] <TASK> > [21299.958869] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:656) > [21299.958953] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) > [21299.959038] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f > All code > ======== > 0: ed in (%dx),%eax > 1: c3 ret > 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) > 9: 00 00 00 00 > d: 66 90 xchg %ax,%ax > f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax > 16: 02 00 > 18: 48 8b 00 mov (%rax),%rax > 1b: a8 08 test $0x8,%al > 1d: 75 0c jne 0x2b > 1f: eb 07 jmp 0x28 > 21: 0f 00 2d 57 0f 2c 00 verw 0x2c0f57(%rip) # 0x2c0f7f > 28: fb sti > 29: f4 hlt > 2a:* fa cli <-- trapping instruction > 2b: c3 ret > 2c: 0f 1f 00 nopl (%rax) > 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax > 33: 3c 01 cmp $0x1,%al > 35: 74 0b je 0x42 > 37: 3c 02 cmp $0x2,%al > 39: 74 05 je 0x40 > 3b: 8b 7f 04 mov 0x4(%rdi),%edi > 3e: eb 9f jmp 0xffffffffffffffdf > > Code starting with the faulting instruction > =========================================== > 0: fa cli > 1: c3 ret > 2: 0f 1f 00 nopl (%rax) > 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax > 9: 3c 01 cmp $0x1,%al > b: 74 0b je 0x18 > d: 3c 02 cmp $0x2,%al > f: 74 05 je 0x16 > 11: 8b 7f 04 mov 0x4(%rdi),%edi > 14: eb 9f jmp 
0xffffffffffffffb5 > [21299.959162] RSP: 0018:ffff96a7c015be80 EFLAGS: 00000246 > [21299.959247] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f > [21299.959343] RDX: ffff8b84bfbc0000 RSI: ffff8b75c76ba000 RDI: ffff8b75c76ba064 > [21299.959437] RBP: ffffffffae216ea0 R08: ffffffffae216ea0 R09: 0000000000000003 > [21299.959533] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 > [21299.959630] R13: ffffffffae216f08 R14: ffffffffae216f20 R15: 0000000000000000 > [21299.959725] acpi_idle_enter (drivers/acpi/processor_idle.c:709) > [21299.959807] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) > [21299.959890] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) > [21299.959975] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) > [21299.960058] cpu_startup_entry (kernel/sched/idle.c:379) > [21299.960140] start_secondary (arch/x86/kernel/smpboot.c:326) > [21299.960223] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) > [21299.960306] </TASK> > [21299.960384] ---[ end trace 0000000000000000 ]---
>
>> On 22 Dec 2023, at 19:26, Martin Zaharinov <micron10@gmail.com> wrote:
>>
>> Hi Thomas,
>>
>> This is with your patch applied.
>> See the logs:
>>
>> [43040.198064] ------------[ cut here ]------------ >> [43040.198407] WARNING: CPU: 47 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath+0x2f/0x70 >> [43040.198685] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole tg3 igb i2c_algo_bit e1000e bnxt_en mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_devintf ipmi_msghandler rtc_cmos >> [43040.199478] CPU: 47 PID: 0 Comm: swapper/47 Tainted: G O 6.6.8 #1 >> [43040.199660] Hardware name: VMware, Inc.
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 >> [43040.199886] RIP: 0010:rcuref_put_slowpath+0x2f/0x70 >> [43040.200028] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 >> [43040.200387] RSP: 0018:ffffa39d83e88c30 EFLAGS: 00010246 >> [43040.200528] RAX: 0000000000000000 RBX: ffff9c58e966b840 RCX: ffff9c5bc4e35680 >> [43040.200700] RDX: ffff9c5fafde4f08 RSI: 00000000fffffe01 RDI: ffff9c58e966b840 >> [43040.200871] RBP: ffff9c5fafde4f08 R08: ffff9c5fafde4f08 R09: 0000000000000001 >> [43040.201044] R10: 00000000000286e0 R11: 0000000000000001 R12: ffff9c58e966b800 >> [43040.201255] R13: ffff9c58e966b868 R14: ffff9c5fafde4f08 R15: 000000008f5de42b >> [43040.201439] FS: 0000000000000000(0000) GS:ffff9c5fafdc0000(0000) knlGS:0000000000000000 >> [43040.201642] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [43040.201799] CR2: 00007f1401217714 CR3: 0000000464b94003 CR4: 00000000001706e0 >> [43040.201994] Call Trace: >> [43040.202095] <IRQ> >> [43040.202187] ? __warn+0x6c/0x130 >> [43040.202301] ? report_bug+0x1b8/0x200 >> [43040.202418] ? handle_bug+0x36/0x70 >> [43040.202534] ? exc_invalid_op+0x17/0x1a0 >> [43040.202652] ? asm_exc_invalid_op+0x16/0x20 >> [43040.202781] ? rcuref_put_slowpath+0x2f/0x70 >> [43040.202909] dst_release+0x1c/0x40 >> [43040.203026] rt_cache_route+0xbd/0xf0 >> [43040.203143] rt_set_nexthop.isra.0+0x1b6/0x450 >> [43040.203272] ip_route_input_slow+0x5d9/0xcc0 >> [43040.203401] ? nf_conntrack_udp_packet+0x17c/0x240 [nf_conntrack] >> [43040.203581] ip_route_input_noref+0xe0/0xf0 >> [43040.203704] ip_rcv_finish_core.isra.0+0xbb/0x440 >> [43040.203855] ip_rcv+0xd5/0x110 >> [43040.203962] ? 
ip_rcv_core+0x360/0x360 >> [43040.204079] process_backlog+0x107/0x210 >> [43040.204201] __napi_poll+0x20/0x180 >> [43040.204315] net_rx_action+0x29f/0x380 >> [43040.204432] __do_softirq+0xd0/0x202 >> [43040.204549] irq_exit_rcu+0x82/0xa0 >> [43040.204667] common_interrupt+0x7a/0xa0 >> [43040.204786] </IRQ> >> [43040.204876] <TASK> >> [43040.204965] asm_common_interrupt+0x22/0x40 >> [43040.205090] RIP: 0010:acpi_safe_halt+0x1b/0x20 >> [43040.205220] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f >> [43040.205578] RSP: 0018:ffffa39d8234fe80 EFLAGS: 00000246 >> [43040.205718] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f >> [43040.205890] RDX: ffff9c5fafdc0000 RSI: ffff9c5882e95800 RDI: ffff9c5882e95864 >> [43040.206063] RBP: ffffffffa9216ea0 R08: ffffffffa9216ea0 R09: 0000000000000003 >> [43040.206246] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 >> [43040.206419] R13: ffffffffa9216f08 R14: ffffffffa9216f20 R15: 0000000000000000 >> [43040.206593] acpi_idle_enter+0x77/0xc0 >> [43040.206711] cpuidle_enter_state+0x69/0x6a0 >> [43040.206835] cpuidle_enter+0x24/0x40 >> [43040.206954] do_idle+0x1a7/0x210 >> [43040.207066] cpu_startup_entry+0x21/0x30 >> [43040.207188] start_secondary+0xe1/0xf0 >> [43040.207310] secondary_startup_64_no_verify+0x166/0x16b >> [43040.207451] </TASK> >> [43040.207542] ---[ end trace 0000000000000000 ]--- >> >> >> >> [43040.198064] ------------[ cut here ]------------ >> [43040.198407] WARNING: CPU: 47 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >> [43040.198685] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole tg3 igb i2c_algo_bit e1000e bnxt_en mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp 
nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_devintf ipmi_msghandler rtc_cmos >> [43040.199478] CPU: 47 PID: 0 Comm: swapper/47 Tainted: G O 6.6.8 #1 >> [43040.199660] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 >> [43040.199886] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >> [43040.200028] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 >> All code >> ======== >> 0: 07 (bad) >> 1: 83 f8 ff cmp $0xffffffff,%eax >> 4: 75 19 jne 0x1f >> 6: ba 00 00 00 e0 mov $0xe0000000,%edx >> b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) >> f: 83 f8 ff cmp $0xffffffff,%eax >> 12: 74 04 je 0x18 >> 14: 31 c0 xor %eax,%eax >> 16: 5b pop %rbx >> 17: c3 ret >> 18: b8 01 00 00 00 mov $0x1,%eax >> 1d: 5b pop %rbx >> 1e: c3 ret >> 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax >> 24: 77 14 ja 0x3a >> 26: 85 c0 test %eax,%eax >> 28: 78 06 js 0x30 >> 2a:* 0f 0b ud2 <-- trapping instruction >> 2c: 31 c0 xor %eax,%eax >> 2e: eb e6 jmp 0x16 >> 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) >> 36: 31 c0 xor %eax,%eax >> 38: eb dc jmp 0x16 >> 3a: 80 .byte 0x80 >> 3b: 3d e2 4e e3 00 cmp $0xe34ee2,%eax >> >> Code starting with the faulting instruction >> =========================================== >> 0: 0f 0b ud2 >> 2: 31 c0 xor %eax,%eax >> 4: eb e6 jmp 0xffffffffffffffec >> 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) >> c: 31 c0 xor %eax,%eax >> e: eb dc jmp 0xffffffffffffffec >> 10: 80 .byte 0x80 >> 11: 3d e2 4e e3 00 cmp $0xe34ee2,%eax >> [43040.200387] RSP: 0018:ffffa39d83e88c30 EFLAGS: 00010246 >> [43040.200528] RAX: 0000000000000000 RBX: ffff9c58e966b840 RCX: ffff9c5bc4e35680 >> [43040.200700] RDX: ffff9c5fafde4f08 RSI: 00000000fffffe01 RDI: ffff9c58e966b840 >> [43040.200871] RBP: ffff9c5fafde4f08 R08: 
ffff9c5fafde4f08 R09: 0000000000000001 >> [43040.201044] R10: 00000000000286e0 R11: 0000000000000001 R12: ffff9c58e966b800 >> [43040.201255] R13: ffff9c58e966b868 R14: ffff9c5fafde4f08 R15: 000000008f5de42b >> [43040.201439] FS: 0000000000000000(0000) GS:ffff9c5fafdc0000(0000) knlGS:0000000000000000 >> [43040.201642] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [43040.201799] CR2: 00007f1401217714 CR3: 0000000464b94003 CR4: 00000000001706e0 >> [43040.201994] Call Trace: >> [43040.202095] <IRQ> >> [43040.202187] ? __warn (kernel/panic.c:235 kernel/panic.c:673) >> [43040.202301] ? report_bug (lib/bug.c:180 lib/bug.c:219) >> [43040.202418] ? handle_bug (arch/x86/kernel/traps.c:237) >> [43040.202534] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) >> [43040.202652] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) >> [43040.202781] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >> [43040.202909] dst_release (net/core/dst.c:166 (discriminator 1)) >> [43040.203026] rt_cache_route (net/ipv4/route.c:1499) >> [43040.203143] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) >> [43040.203272] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) >> [43040.203401] ? nf_conntrack_udp_packet (net/netfilter/nf_conntrack_proto_udp.c:124) nf_conntrack >> [43040.203581] ip_route_input_noref (net/ipv4/route.c:2499) >> [43040.203704] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) >> [43040.203855] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) >> [43040.203962] ? 
ip_rcv_core (net/ipv4/ip_input.c:436) >> [43040.204079] process_backlog (net/core/dev.c:5997) >> [43040.204201] __napi_poll (net/core/dev.c:6556) >> [43040.204315] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756) >> [43040.204432] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) >> [43040.204549] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) >> [43040.204667] common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 47)) >> [43040.204786] </IRQ> >> [43040.204876] <TASK> >> [43040.204965] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:640) >> [43040.205090] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) >> [43040.205220] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f >> All code >> ======== >> 0: ed in (%dx),%eax >> 1: c3 ret >> 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) >> 9: 00 00 00 00 >> d: 66 90 xchg %ax,%ax >> f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax >> 16: 02 00 >> 18: 48 8b 00 mov (%rax),%rax >> 1b: a8 08 test $0x8,%al >> 1d: 75 0c jne 0x2b >> 1f: eb 07 jmp 0x28 >> 21: 0f 00 2d 57 0f 2c 00 verw 0x2c0f57(%rip) # 0x2c0f7f >> 28: fb sti >> 29: f4 hlt >> 2a:* fa cli <-- trapping instruction >> 2b: c3 ret >> 2c: 0f 1f 00 nopl (%rax) >> 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax >> 33: 3c 01 cmp $0x1,%al >> 35: 74 0b je 0x42 >> 37: 3c 02 cmp $0x2,%al >> 39: 74 05 je 0x40 >> 3b: 8b 7f 04 mov 0x4(%rdi),%edi >> 3e: eb 9f jmp 0xffffffffffffffdf >> >> Code starting with the faulting instruction >> =========================================== >> 0: fa cli >> 1: c3 ret >> 2: 0f 1f 00 nopl (%rax) >> 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax >> 9: 3c 01 cmp $0x1,%al >> b: 74 0b je 0x18 >> d: 3c 02 cmp $0x2,%al >> f: 74 05 je 0x16 >> 11: 8b 7f 04 mov 
0x4(%rdi),%edi >> 14: eb 9f jmp 0xffffffffffffffb5 >> [43040.205578] RSP: 0018:ffffa39d8234fe80 EFLAGS: 00000246 >> [43040.205718] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f >> [43040.205890] RDX: ffff9c5fafdc0000 RSI: ffff9c5882e95800 RDI: ffff9c5882e95864 >> [43040.206063] RBP: ffffffffa9216ea0 R08: ffffffffa9216ea0 R09: 0000000000000003 >> [43040.206246] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 >> [43040.206419] R13: ffffffffa9216f08 R14: ffffffffa9216f20 R15: 0000000000000000 >> [43040.206593] acpi_idle_enter (drivers/acpi/processor_idle.c:709) >> [43040.206711] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) >> [43040.206835] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) >> [43040.206954] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) >> [43040.207066] cpu_startup_entry (kernel/sched/idle.c:379) >> [43040.207188] start_secondary (arch/x86/kernel/smpboot.c:326) >> [43040.207310] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) >> [43040.207451] </TASK> >> [43040.207542] ---[ end trace 0000000000000000 ]--- >> >>> On 19 Dec 2023, at 16:26, Thomas Gleixner <tglx@linutronix.de> wrote: >>> >>> On Tue, Dec 19 2023 at 11:25, Martin Zaharinov wrote: >>>>> On 12 Dec 2023, at 20:16, Thomas Gleixner <tglx@linutronix.de> wrote: >>>>> Btw, how easy is this to reproduce? >>>> >>>> Its not easy this report is generate on machine with 5-6k users , with >>>> traffic and one time is show on 1 day , other show after 4-5 days… >>> >>> I love those bugs ... >>> >>>> Apply this patch and will upload image on one machine as fast as >>>> possible and when get any reports will send you. >>> >>> Let's see how that goes! >>> >>> Thanks, >>> >>> tglx >> > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2024-01-04 20:51 ` Martin Zaharinov @ 2024-01-07 11:03 ` Martin Zaharinov 0 siblings, 0 replies; 35+ messages in thread From: Martin Zaharinov @ 2024-01-07 11:03 UTC (permalink / raw) To: Thomas Gleixner Cc: peterz, netdev, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Eric Dumazet Hi Thomas, this is one more report from one machine. Here you will see two bug reports from the same day: [Sat Jan 6 07:37:23 2024] ------------[ cut here ]------------ [Sat Jan 6 07:37:23 2024] WARNING: CPU: 12 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [Sat Jan 6 07:37:23 2024] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos megaraid_sas [Sat Jan 6 07:37:23 2024] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G O 6.6.10 #1 [Sat Jan 6 07:37:23 2024] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 7.80 10/28/2020 [Sat Jan 6 07:37:23 2024] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [Sat Jan 6 07:37:23 2024] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d b2 4a e3 00 All code ======== 0: 07 (bad) 1: 83 f8 ff cmp $0xffffffff,%eax 4: 75 19 jne 0x1f 6: ba 00 00 00 e0 mov $0xe0000000,%edx b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) f: 83 f8 ff cmp $0xffffffff,%eax 12: 74 04 je 0x18 14: 31 c0 xor %eax,%eax 16: 5b pop %rbx 17: c3 ret 18: b8 01 00 00 00 mov $0x1,%eax 1d: 5b pop %rbx 1e: c3 ret 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax 24: 77 14 ja 0x3a 26: 85 c0 test %eax,%eax 28: 78 06 js
0x30 2a:* 0f 0b ud2 <-- trapping instruction 2c: 31 c0 xor %eax,%eax 2e: eb e6 jmp 0x16 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) 36: 31 c0 xor %eax,%eax 38: eb dc jmp 0x16 3a: 80 .byte 0x80 3b: 3d b2 4a e3 00 cmp $0xe34ab2,%eax Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: 31 c0 xor %eax,%eax 4: eb e6 jmp 0xffffffffffffffec 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) c: 31 c0 xor %eax,%eax e: eb dc jmp 0xffffffffffffffec 10: 80 .byte 0x80 11: 3d b2 4a e3 00 cmp $0xe34ab2,%eax [Sat Jan 6 07:37:23 2024] RSP: 0018:ffffa773091ccdd8 EFLAGS: 00010246 [Sat Jan 6 07:37:23 2024] RAX: 0000000000000000 RBX: ffff8f458c192d00 RCX: 0000000000000042 [Sat Jan 6 07:37:23 2024] RDX: ffff8f455ad71800 RSI: 0000000000000000 RDI: ffff8f458c192d00 [Sat Jan 6 07:37:23 2024] RBP: ffff8f455ad71ec0 R08: 0000000000000000 R09: 0000000000000000 [Sat Jan 6 07:37:23 2024] R10: 0000000000000002 R11: ffffa773091ccd90 R12: ffff8f25c68df800 [Sat Jan 6 07:37:23 2024] R13: 000000000000000e R14: 0000000000000010 R15: ffff8f64bf8a4d10 [Sat Jan 6 07:37:23 2024] FS: 0000000000000000(0000) GS:ffff8f64bf880000(0000) knlGS:0000000000000000 [Sat Jan 6 07:37:23 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Sat Jan 6 07:37:23 2024] CR2: 00007fbd91318650 CR3: 000000177e014005 CR4: 00000000003706e0 [Sat Jan 6 07:37:23 2024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [Sat Jan 6 07:37:23 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [Sat Jan 6 07:37:23 2024] Call Trace: [Sat Jan 6 07:37:23 2024] <IRQ> [Sat Jan 6 07:37:23 2024] ? __warn (kernel/panic.c:235 kernel/panic.c:673) [Sat Jan 6 07:37:23 2024] ? report_bug (lib/bug.c:180 lib/bug.c:219) [Sat Jan 6 07:37:23 2024] ? handle_bug (arch/x86/kernel/traps.c:237) [Sat Jan 6 07:37:23 2024] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [Sat Jan 6 07:37:23 2024] ? 
asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [Sat Jan 6 07:37:23 2024] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) [Sat Jan 6 07:37:23 2024] dst_release (net/core/dst.c:166 (discriminator 1)) [Sat Jan 6 07:37:23 2024] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4327) [Sat Jan 6 07:37:23 2024] ? nf_hook_slow (./include/linux/netfilter.h:144 net/netfilter/core.c:626) [Sat Jan 6 07:37:23 2024] ip_finish_output2 (./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233) [Sat Jan 6 07:37:23 2024] process_backlog (net/core/dev.c:6000) [Sat Jan 6 07:37:23 2024] __napi_poll (net/core/dev.c:6559) [Sat Jan 6 07:37:23 2024] net_rx_action (net/core/dev.c:6628 net/core/dev.c:6759) [Sat Jan 6 07:37:23 2024] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) [Sat Jan 6 07:37:23 2024] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) [Sat Jan 6 07:37:23 2024] sysvec_call_function_single (arch/x86/kernel/smp.c:262 (discriminator 47)) [Sat Jan 6 07:37:23 2024] </IRQ> [Sat Jan 6 07:37:23 2024] <TASK> [Sat Jan 6 07:37:23 2024] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:656) [Sat Jan 6 07:37:23 2024] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) [Sat Jan 6 07:37:23 2024] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d c7 0c 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f All code ======== 0: ed in (%dx),%eax 1: c3 ret 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) 9: 00 00 00 00 d: 66 90 xchg %ax,%ax f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax 16: 02 00 18: 48 8b 00 mov (%rax),%rax 1b: a8 08 test $0x8,%al 1d: 75 0c jne 0x2b 1f: eb 07 jmp 0x28 21: 0f 00 2d c7 0c 2c 00 verw 0x2c0cc7(%rip) # 0x2c0cef 28: fb sti 29: f4 hlt 2a:* fa cli <-- trapping 
instruction 2b: c3 ret 2c: 0f 1f 00 nopl (%rax) 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax 33: 3c 01 cmp $0x1,%al 35: 74 0b je 0x42 37: 3c 02 cmp $0x2,%al 39: 74 05 je 0x40 3b: 8b 7f 04 mov 0x4(%rdi),%edi 3e: eb 9f jmp 0xffffffffffffffdf Code starting with the faulting instruction =========================================== 0: fa cli 1: c3 ret 2: 0f 1f 00 nopl (%rax) 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax 9: 3c 01 cmp $0x1,%al b: 74 0b je 0x18 d: 3c 02 cmp $0x2,%al f: 74 05 je 0x16 11: 8b 7f 04 mov 0x4(%rdi),%edi 14: eb 9f jmp 0xffffffffffffffb5 [Sat Jan 6 07:37:23 2024] RSP: 0018:ffffa773007fbe80 EFLAGS: 00000246 [Sat Jan 6 07:37:23 2024] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f [Sat Jan 6 07:37:23 2024] RDX: ffff8f64bf880000 RSI: ffff8f454c6f6800 RDI: ffff8f454c6f6864 [Sat Jan 6 07:37:23 2024] RBP: ffffffffaa216ea0 R08: ffffffffaa216ea0 R09: 0000000000000003 [Sat Jan 6 07:37:23 2024] R10: 0000000000000002 R11: 0000000000000007 R12: 0000000000000001 [Sat Jan 6 07:37:23 2024] R13: ffffffffaa216f08 R14: ffffffffaa216f20 R15: 0000000000000000 [Sat Jan 6 07:37:23 2024] acpi_idle_enter (drivers/acpi/processor_idle.c:709) [Sat Jan 6 07:37:23 2024] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) [Sat Jan 6 07:37:23 2024] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) [Sat Jan 6 07:37:23 2024] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) [Sat Jan 6 07:37:23 2024] cpu_startup_entry (kernel/sched/idle.c:379) [Sat Jan 6 07:37:23 2024] start_secondary (arch/x86/kernel/smpboot.c:326) [Sat Jan 6 07:37:23 2024] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:449) [Sat Jan 6 07:37:23 2024] </TASK> [Sat Jan 6 07:37:23 2024] ---[ end trace 0000000000000000 ]--- [Sat Jan 6 21:33:28 2024] ------------[ cut here ]------------ [Sat Jan 6 21:33:28 2024] rcuref - imbalanced put() [Sat Jan 6 21:33:28 2024] WARNING: CPU: 26 PID: 0 at lib/rcuref.c:279 rcuref_put_slowpath (lib/rcuref.c:279 (discriminator 
1)) [Sat Jan 6 21:33:28 2024] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos megaraid_sas [Sat Jan 6 21:33:28 2024] CPU: 26 PID: 0 Comm: swapper/26 Tainted: G W O 6.6.10 #1 [Sat Jan 6 21:33:28 2024] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 7.80 10/28/2020 [Sat Jan 6 21:33:28 2024] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:279 (discriminator 1)) [Sat Jan 6 21:33:28 2024] Code: 31 c0 eb dc 80 3d b2 4a e3 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb c9 48 c7 c7 54 29 e4 a9 c6 05 98 4a e3 00 01 e8 db 7c c3 ff <0f> 0b eb df cc cc cc cc cc cc cc 48 89 fa 83 e2 07 48 85 f6 74 7f All code ======== 0: 31 c0 xor %eax,%eax 2: eb dc jmp 0xffffffffffffffe0 4: 80 3d b2 4a e3 00 00 cmpb $0x0,0xe34ab2(%rip) # 0xe34abd b: 74 0a je 0x17 d: c7 03 00 00 00 e0 movl $0xe0000000,(%rbx) 13: 31 c0 xor %eax,%eax 15: eb c9 jmp 0xffffffffffffffe0 17: 48 c7 c7 54 29 e4 a9 mov $0xffffffffa9e42954,%rdi 1e: c6 05 98 4a e3 00 01 movb $0x1,0xe34a98(%rip) # 0xe34abd 25: e8 db 7c c3 ff call 0xffffffffffc37d05 2a:* 0f 0b ud2 <-- trapping instruction 2c: eb df jmp 0xd 2e: cc int3 2f: cc int3 30: cc int3 31: cc int3 32: cc int3 33: cc int3 34: cc int3 35: 48 89 fa mov %rdi,%rdx 38: 83 e2 07 and $0x7,%edx 3b: 48 85 f6 test %rsi,%rsi 3e: 74 7f je 0xbf Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: eb df jmp 0xffffffffffffffe3 4: cc int3 5: cc int3 6: cc int3 7: cc int3 8: cc int3 9: cc int3 a: cc int3 b: 48 89 fa mov %rdi,%rdx e: 83 e2 07 and $0x7,%edx 11: 48 85 f6 test %rsi,%rsi 14: 74 7f je 0x95 [Sat Jan 6 21:33:28 2024] RSP: 0018:ffffa7730d528dd8 EFLAGS: 00010292 [Sat Jan 6 21:33:28 2024] RAX: 0000000000000019 
RBX: ffff8f4573f7d000 RCX: 00000000ffefffff [Sat Jan 6 21:33:28 2024] RDX: 00000000ffefffff RSI: 0000000000000001 RDI: 00000000ffffffea [Sat Jan 6 21:33:28 2024] RBP: ffff8f265f02d6c0 R08: 0000000000000000 R09: 00000000ffefffff [Sat Jan 6 21:33:28 2024] R10: ffff8f64b6800000 R11: 0000000000000003 R12: ffff8f25c68df800 [Sat Jan 6 21:33:28 2024] R13: 000000000000000e R14: 0000000000000010 R15: ffff8f44c0024d10 [Sat Jan 6 21:33:28 2024] FS: 0000000000000000(0000) GS:ffff8f44c0000000(0000) knlGS:0000000000000000 [Sat Jan 6 21:33:28 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [Sat Jan 6 21:33:28 2024] CR2: 00007fd79aca5000 CR3: 000000015d226005 CR4: 00000000003706e0 [Sat Jan 6 21:33:28 2024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [Sat Jan 6 21:33:28 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [Sat Jan 6 21:33:28 2024] Call Trace: [Sat Jan 6 21:33:28 2024] <IRQ> [Sat Jan 6 21:33:28 2024] ? __warn (kernel/panic.c:235 kernel/panic.c:673) [Sat Jan 6 21:33:28 2024] ? report_bug (lib/bug.c:180 lib/bug.c:219) [Sat Jan 6 21:33:28 2024] ? handle_bug (arch/x86/kernel/traps.c:237) [Sat Jan 6 21:33:28 2024] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [Sat Jan 6 21:33:28 2024] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [Sat Jan 6 21:33:28 2024] ? rcuref_put_slowpath (lib/rcuref.c:279 (discriminator 1)) [Sat Jan 6 21:33:28 2024] ? rcuref_put_slowpath (lib/rcuref.c:279 (discriminator 1)) [Sat Jan 6 21:33:28 2024] dst_release (net/core/dst.c:166 (discriminator 1)) [Sat Jan 6 21:33:28 2024] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4327) [Sat Jan 6 21:33:28 2024] ? 
nf_hook_slow (./include/linux/netfilter.h:144 net/netfilter/core.c:626) [Sat Jan 6 21:33:28 2024] ip_finish_output2 (./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233) [Sat Jan 6 21:33:28 2024] process_backlog (net/core/dev.c:6000) [Sat Jan 6 21:33:28 2024] __napi_poll (net/core/dev.c:6559) [Sat Jan 6 21:33:28 2024] net_rx_action (net/core/dev.c:6628 net/core/dev.c:6759) [Sat Jan 6 21:33:28 2024] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) [Sat Jan 6 21:33:28 2024] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) [Sat Jan 6 21:33:28 2024] sysvec_call_function_single (arch/x86/kernel/smp.c:262 (discriminator 47)) [Sat Jan 6 21:33:28 2024] </IRQ> [Sat Jan 6 21:33:28 2024] <TASK> [Sat Jan 6 21:33:28 2024] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:656) [Sat Jan 6 21:33:28 2024] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) [Sat Jan 6 21:33:28 2024] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d c7 0c 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f All code ======== 0: ed in (%dx),%eax 1: c3 ret 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) 9: 00 00 00 00 d: 66 90 xchg %ax,%ax f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax 16: 02 00 18: 48 8b 00 mov (%rax),%rax 1b: a8 08 test $0x8,%al 1d: 75 0c jne 0x2b 1f: eb 07 jmp 0x28 21: 0f 00 2d c7 0c 2c 00 verw 0x2c0cc7(%rip) # 0x2c0cef 28: fb sti 29: f4 hlt 2a:* fa cli <-- trapping instruction 2b: c3 ret 2c: 0f 1f 00 nopl (%rax) 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax 33: 3c 01 cmp $0x1,%al 35: 74 0b je 0x42 37: 3c 02 cmp $0x2,%al 39: 74 05 je 0x40 3b: 8b 7f 04 mov 0x4(%rdi),%edi 3e: eb 9f jmp 0xffffffffffffffdf Code starting with the faulting instruction =========================================== 0: fa cli 1: c3 ret 2: 0f 
1f 00 nopl (%rax) 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax 9: 3c 01 cmp $0x1,%al b: 74 0b je 0x18 d: 3c 02 cmp $0x2,%al f: 74 05 je 0x16 11: 8b 7f 04 mov 0x4(%rdi),%edi 14: eb 9f jmp 0xffffffffffffffb5 [Sat Jan 6 21:33:28 2024] RSP: 0018:ffffa77300e7be80 EFLAGS: 00000246 [Sat Jan 6 21:33:28 2024] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f [Sat Jan 6 21:33:28 2024] RDX: ffff8f44c0000000 RSI: ffff8f454c6fc800 RDI: ffff8f454c6fc864 [Sat Jan 6 21:33:28 2024] RBP: ffffffffaa216ea0 R08: ffffffffaa216ea0 R09: 00003fd7b44cb0a0 [Sat Jan 6 21:33:28 2024] R10: 0000000000000002 R11: 0000000000000007 R12: 0000000000000001 [Sat Jan 6 21:33:28 2024] R13: ffffffffaa216f08 R14: ffffffffaa216f20 R15: 0000000000000000 [Sat Jan 6 21:33:28 2024] acpi_idle_enter (drivers/acpi/processor_idle.c:709) [Sat Jan 6 21:33:28 2024] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) [Sat Jan 6 21:33:28 2024] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) [Sat Jan 6 21:33:28 2024] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) [Sat Jan 6 21:33:28 2024] cpu_startup_entry (kernel/sched/idle.c:379) [Sat Jan 6 21:33:28 2024] start_secondary (arch/x86/kernel/smpboot.c:326) [Sat Jan 6 21:33:28 2024] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:449) [Sat Jan 6 21:33:28 2024] </TASK> [Sat Jan 6 21:33:28 2024] ---[ end trace 0000000000000000 ]--- > On 4 Jan 2024, at 22:51, Martin Zaharinov <micron10@gmail.com> wrote: > > Hi Thomas , > > Happy New Year! 
> > Here are two debug outputs from two newly installed machines with kernel 6.6.9: > > dmesg1 : > > [ 2257.449125] ------------[ cut here ]------------ > [ 2257.449245] WARNING: CPU: 1 PID: 40622 at lib/rcuref.c:294 rcuref_put_slowpath+0x2f/0x70 > [ 2257.449373] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > [ 2257.449642] CPU: 1 PID: 40622 Comm: nc Tainted: G O 6.6.9 #1 > [ 2257.449761] Hardware name: Supermicro PIO-5038MR-H8TRF-NODE/X10SRD-F, BIOS 3.3 10/28/2020 > [ 2257.449883] RIP: 0010:rcuref_put_slowpath+0x2f/0x70 > [ 2257.449977] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 > [ 2257.450135] RSP: 0000:ffffb455cef83b78 EFLAGS: 00010246 > [ 2257.450227] RAX: 0000000000000000 RBX: ffff94873bb77dc0 RCX: ffff9486c0d46b80 > [ 2257.450341] RDX: ffff948736578428 RSI: 00000000fffffe01 RDI: ffff94873bb77dc0 > [ 2257.450456] RBP: ffff948736578428 R08: ffff948e1fa64f08 R09: 0000000000000001 > [ 2257.450570] R10: 0000000000028530 R11: 0000000000000001 R12: ffff94873bb77d80 > [ 2257.450685] R13: ffff94873bb77de8 R14: ffff948e1fa64f08 R15: 000000000266f59d > [ 2257.450802] FS: 00007f0cdbc73800(0000) GS:ffff948e1fa40000(0000) knlGS:0000000000000000 > [ 2257.450918] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 2257.451012] CR2: 00007f0cdc3f5c30 CR3: 0000000178ea0002 CR4: 00000000003706e0 > [ 2257.451127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 2257.451240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 2257.451353] Call Trace: > [ 2257.451441] <TASK>
> [ 2257.451526] ? __warn+0x6c/0x130 > [ 2257.451616] ? report_bug+0x1b8/0x200 > [ 2257.451707] ? handle_bug+0x36/0x70 > [ 2257.451797] ? exc_invalid_op+0x17/0x1a0 > [ 2257.451886] ? asm_exc_invalid_op+0x16/0x20 > [ 2257.452038] ? rcuref_put_slowpath+0x2f/0x70 > [ 2257.452129] dst_release+0x1c/0x40 > [ 2257.452222] rt_cache_route+0xbd/0xf0 > [ 2257.452313] ? kmem_cache_alloc+0x31/0x390 > [ 2257.452404] rt_set_nexthop.isra.0+0x1b6/0x450 > [ 2257.452495] ip_route_input_slow+0x5d9/0xcc0 > [ 2257.452586] ? nft_nat_do_chain+0x7f/0xd0 [nft_chain_nat] > [ 2257.452681] ? nf_conntrack_udp_packet+0xcf/0x240 [nf_conntrack] > [ 2257.452784] ? nf_nat_inet_fn+0x36f/0x3f0 [nf_nat] > [ 2257.452880] ip_route_input_noref+0xe0/0xf0 > [ 2257.452970] ip_rcv_finish_core.isra.0+0xbb/0x440 > [ 2257.453064] ip_rcv+0xd5/0x110 > [ 2257.453151] ? ip_rcv_core+0x360/0x360 > [ 2257.453240] process_backlog+0x107/0x210 > [ 2257.453330] __napi_poll+0x20/0x180 > [ 2257.453420] net_rx_action+0x29f/0x380 > [ 2257.453510] __do_softirq+0xd0/0x202 > [ 2257.453599] irq_exit_rcu+0x82/0xa0 > [ 2257.453689] sysvec_call_function_single+0x32/0x80 > [ 2257.453781] asm_sysvec_call_function_single+0x16/0x20 > [ 2257.453874] RIP: 0033:0x7f0cdc5928b2 > [ 2257.453963] Code: 06 00 00 4c 89 65 88 49 83 fd 08 0f 84 f7 06 00 00 49 83 fd 26 0f 84 05 07 00 00 4d 85 ed 0f 84 5f 01 00 00 41 0f b6 44 24 04 <89> c6 40 c0 ee 04 0f 84 72 06 00 00 41 0f b6 54 24 05 83 e2 03 ff > [ 2257.454121] RSP: 002b:00007ffc04d3e890 EFLAGS: 00000206 > [ 2257.454215] RAX: 0000000000000012 RBX: 00007f0cdc444db8 RCX: 00007f0cdc4e6e60 > [ 2257.454329] RDX: 0000000000000009 RSI: 00007f0cdc57ef30 RDI: 00007f0cdc42c808 > [ 2257.454442] RBP: 00007ffc04d3e9b0 R08: 00007f0cdc445028 R09: 00007ffc04d3e940 > [ 2257.454555] R10: 00007f0cdbf00be8 R11: 0000000000000000 R12: 00007f0cdc42c898 > [ 2257.454670] R13: 0000000000000006 R14: 0000000600000006 R15: 00007f0cdc581000 > [ 2257.454784] </TASK> > [ 2257.454869] ---[ end trace 0000000000000000 ]--- > > > [
2257.449125] ------------[ cut here ]------------ > [ 2257.449245] WARNING: CPU: 1 PID: 40622 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [ 2257.449373] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > [ 2257.449642] CPU: 1 PID: 40622 Comm: nc Tainted: G O 6.6.9 #1 > [ 2257.449761] Hardware name: Supermicro PIO-5038MR-H8TRF-NODE/X10SRD-F, BIOS 3.3 10/28/2020 > [ 2257.449883] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [ 2257.449977] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 > All code > ======== > 0: 07 (bad) > 1: 83 f8 ff cmp $0xffffffff,%eax > 4: 75 19 jne 0x1f > 6: ba 00 00 00 e0 mov $0xe0000000,%edx > b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) > f: 83 f8 ff cmp $0xffffffff,%eax > 12: 74 04 je 0x18 > 14: 31 c0 xor %eax,%eax > 16: 5b pop %rbx > 17: c3 ret > 18: b8 01 00 00 00 mov $0x1,%eax > 1d: 5b pop %rbx > 1e: c3 ret > 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax > 24: 77 14 ja 0x3a > 26: 85 c0 test %eax,%eax > 28: 78 06 js 0x30 > 2a:* 0f 0b ud2 <-- trapping instruction > 2c: 31 c0 xor %eax,%eax > 2e: eb e6 jmp 0x16 > 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) > 36: 31 c0 xor %eax,%eax > 38: eb dc jmp 0x16 > 3a: 80 .byte 0x80 > 3b: 3d e2 4c e3 00 cmp $0xe34ce2,%eax > > Code starting with the faulting instruction > =========================================== > 0: 0f 0b ud2 > 2: 31 c0 xor %eax,%eax > 4: eb e6 jmp 0xffffffffffffffec > 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) > c: 31 c0 xor %eax,%eax > e: eb dc jmp 
0xffffffffffffffec > 10: 80 .byte 0x80 > 11: 3d e2 4c e3 00 cmp $0xe34ce2,%eax > [ 2257.450135] RSP: 0000:ffffb455cef83b78 EFLAGS: 00010246 > [ 2257.450227] RAX: 0000000000000000 RBX: ffff94873bb77dc0 RCX: ffff9486c0d46b80 > [ 2257.450341] RDX: ffff948736578428 RSI: 00000000fffffe01 RDI: ffff94873bb77dc0 > [ 2257.450456] RBP: ffff948736578428 R08: ffff948e1fa64f08 R09: 0000000000000001 > [ 2257.450570] R10: 0000000000028530 R11: 0000000000000001 R12: ffff94873bb77d80 > [ 2257.450685] R13: ffff94873bb77de8 R14: ffff948e1fa64f08 R15: 000000000266f59d > [ 2257.450802] FS: 00007f0cdbc73800(0000) GS:ffff948e1fa40000(0000) knlGS:0000000000000000 > [ 2257.450918] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 2257.451012] CR2: 00007f0cdc3f5c30 CR3: 0000000178ea0002 CR4: 00000000003706e0 > [ 2257.451127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 2257.451240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 2257.451353] Call Trace: > [ 2257.451441] <TASK> > [ 2257.451526] ? __warn (kernel/panic.c:235 kernel/panic.c:673) > [ 2257.451616] ? report_bug (lib/bug.c:180 lib/bug.c:219) > [ 2257.451707] ? handle_bug (arch/x86/kernel/traps.c:237) > [ 2257.451797] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) > [ 2257.451886] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) > [ 2257.452038] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [ 2257.452129] dst_release (net/core/dst.c:166 (discriminator 1)) > [ 2257.452222] rt_cache_route (net/ipv4/route.c:1499) > [ 2257.452313] ? kmem_cache_alloc (mm/slab.h:711 (discriminator 1) mm/slub.c:3461 (discriminator 1) mm/slub.c:3487 (discriminator 1) mm/slub.c:3494 (discriminator 1) mm/slub.c:3503 (discriminator 1)) > [ 2257.452404] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) > [ 2257.452495] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) > [ 2257.452586] ? 
nft_nat_do_chain (net/netfilter/nft_chain_nat.c:33) nft_chain_nat > [ 2257.452681] ? nf_conntrack_udp_packet (net/netfilter/nf_conntrack_proto_udp.c:130) nf_conntrack > [ 2257.452784] ? nf_nat_inet_fn (net/netfilter/nf_nat_core.c:844) nf_nat > [ 2257.452880] ip_route_input_noref (net/ipv4/route.c:2499) > [ 2257.452970] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) > [ 2257.453064] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) > [ 2257.453151] ? ip_rcv_core (net/ipv4/ip_input.c:436) > [ 2257.453240] process_backlog (net/core/dev.c:6000) > [ 2257.453330] __napi_poll (net/core/dev.c:6559) > [ 2257.453420] net_rx_action (net/core/dev.c:6628 net/core/dev.c:6759) > [ 2257.453510] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) > [ 2257.453599] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) > [ 2257.453689] sysvec_call_function_single (arch/x86/kernel/smp.c:262 (discriminator 69)) > [ 2257.453781] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:656) > [ 2257.453874] RIP: 0033:0x7f0cdc5928b2 > [ 2257.453963] Code: 06 00 00 4c 89 65 88 49 83 fd 08 0f 84 f7 06 00 00 49 83 fd 26 0f 84 05 07 00 00 4d 85 ed 0f 84 5f 01 00 00 41 0f b6 44 24 04 <89> c6 40 c0 ee 04 0f 84 72 06 00 00 41 0f b6 54 24 05 83 e2 03 ff > All code > ======== > 0: 06 (bad) > 1: 00 00 add %al,(%rax) > 3: 4c 89 65 88 mov %r12,-0x78(%rbp) > 7: 49 83 fd 08 cmp $0x8,%r13 > b: 0f 84 f7 06 00 00 je 0x708 > 11: 49 83 fd 26 cmp $0x26,%r13 > 15: 0f 84 05 07 00 00 je 0x720 > 1b: 4d 85 ed test %r13,%r13 > 1e: 0f 84 5f 01 00 00 je 0x183 > 24: 41 0f b6 44 24 04 movzbl 0x4(%r12),%eax > 2a:* 89 c6 mov %eax,%esi <-- trapping instruction > 2c: 40 c0 ee 04 shr $0x4,%sil > 30: 0f 84 72 06 00 00 je 0x6a8 > 36: 41 0f b6 54 24 05 movzbl 0x5(%r12),%edx > 3c: 83 e2 03 and $0x3,%edx > 3f: ff .byte 0xff > > Code starting with the faulting instruction > 
=========================================== > 0: 89 c6 mov %eax,%esi > 2: 40 c0 ee 04 shr $0x4,%sil > 6: 0f 84 72 06 00 00 je 0x67e > c: 41 0f b6 54 24 05 movzbl 0x5(%r12),%edx > 12: 83 e2 03 and $0x3,%edx > 15: ff .byte 0xff > [ 2257.454121] RSP: 002b:00007ffc04d3e890 EFLAGS: 00000206 > [ 2257.454215] RAX: 0000000000000012 RBX: 00007f0cdc444db8 RCX: 00007f0cdc4e6e60 > [ 2257.454329] RDX: 0000000000000009 RSI: 00007f0cdc57ef30 RDI: 00007f0cdc42c808 > [ 2257.454442] RBP: 00007ffc04d3e9b0 R08: 00007f0cdc445028 R09: 00007ffc04d3e940 > [ 2257.454555] R10: 00007f0cdbf00be8 R11: 0000000000000000 R12: 00007f0cdc42c898 > [ 2257.454670] R13: 0000000000000006 R14: 0000000600000006 R15: 00007f0cdc581000 > [ 2257.454784] </TASK> > [ 2257.454869] ---[ end trace 0000000000000000 ]--- > > > dmesg2 : > > [ 2567.167952] ------------[ cut here ]------------ > [ 2567.168053] WARNING: CPU: 11 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath+0x2f/0x70 > [ 2567.168175] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > [ 2567.168445] CPU: 11 PID: 0 Comm: swapper/11 Tainted: G O 6.6.9 #1 > [ 2567.168561] Hardware name: Supermicro X10SRD-F/X10SRD-F, BIOS 3.4 06/05/2021 > [ 2567.168675] RIP: 0010:rcuref_put_slowpath+0x2f/0x70 > [ 2567.168767] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 > [ 2567.168924] RSP: 0018:ffffaeaf80418d00 EFLAGS: 00010246 > [ 2567.169017] RAX: 0000000000000000 RBX: ffff9fef84d6a940 RCX: 0000000000000074 > [ 2567.169132] RDX: ffff9fefe2e30000 RSI: 0000000000000000 RDI: ffff9fef84d6a940 > [
2567.169246] RBP: ffff9fefe2e306c0 R08: 0000000000000000 R09: 0000000000029300 > [ 2567.169359] R10: 0000000000029300 R11: ffffaeaf80418d90 R12: ffff9fef8aebe000 > [ 2567.169473] R13: ffff9fef80896800 R14: ffff9fef85335200 R15: ffff9fef8ae07080 > [ 2567.169586] FS: 0000000000000000(0000) GS:ffff9ff6dfcc0000(0000) knlGS:0000000000000000 > [ 2567.169702] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 2567.169795] CR2: 00007f4eaa7e6650 CR3: 0000000156dcd006 CR4: 00000000003706e0 > [ 2567.169908] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 2567.170022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 2567.170137] Call Trace: > [ 2567.170224] <IRQ> > [ 2567.170309] ? __warn+0x6c/0x130 > [ 2567.170399] ? report_bug+0x1b8/0x200 > [ 2567.170488] ? handle_bug+0x36/0x70 > [ 2567.170577] ? exc_invalid_op+0x17/0x1a0 > [ 2567.170667] ? asm_exc_invalid_op+0x16/0x20 > [ 2567.170758] ? rcuref_put_slowpath+0x2f/0x70 > [ 2567.170850] dst_release+0x1c/0x40 > [ 2567.170939] __dev_queue_xmit+0x598/0xce0 > [ 2567.171029] vlan_dev_hard_start_xmit+0x82/0xc0 > [ 2567.171122] dev_hard_start_xmit+0x95/0xe0 > [ 2567.171216] __dev_queue_xmit+0x863/0xce0 > [ 2567.171305] ? eth_header+0x25/0xc0 > [ 2567.171394] ip_finish_output2+0x1a0/0x530 > [ 2567.171485] process_backlog+0x107/0x210 > [ 2567.171575] __napi_poll+0x20/0x180 > [ 2567.171663] net_rx_action+0x29f/0x380 > [ 2567.171752] ? 
rebalance_domains+0x14c/0x300 > [ 2567.171843] __do_softirq+0xd0/0x202 > [ 2567.171932] irq_exit_rcu+0x82/0xa0 > [ 2567.172022] common_interrupt+0x7a/0xa0 > [ 2567.172111] </IRQ> > [ 2567.172198] <TASK> > [ 2567.172283] asm_common_interrupt+0x22/0x40 > [ 2567.172374] RIP: 0010:cpuidle_enter_state+0xa3/0x6a0 > [ 2567.172467] Code: 46 40 40 0f 84 02 01 00 00 e8 c9 a0 70 ff e8 d4 f6 ff ff 31 ff 49 89 c6 e8 0a b9 6f ff 45 84 ff 0f 85 d9 00 00 00 fb 45 85 ed <0f> 88 b8 00 00 00 49 63 cd 48 8b 04 24 48 6b f1 68 49 29 c6 48 8d > [ 2567.172623] RSP: 0018:ffffaeaf80177e98 EFLAGS: 00000202 > [ 2567.172715] RAX: ffff9ff6dfce3a80 RBX: ffff9fef81338000 RCX: 000000000000001f > [ 2567.172828] RDX: 00000255b721ed84 RSI: 00000000238e3b7a RDI: 0000000000000000 > [ 2567.172942] RBP: ffffffffba216ea0 R08: 0000000000000004 R09: ffff9ff6dfcdef00 > [ 2567.173055] R10: ffff9ff6dfcdef00 R11: 0000000000000007 R12: 0000000000000001 > [ 2567.173168] R13: 0000000000000001 R14: 00000255b721ed84 R15: 0000000000000000 > [ 2567.173283] ? 
cpuidle_enter_state+0x96/0x6a0 > [ 2567.173374] cpuidle_enter+0x24/0x40 > [ 2567.173464] do_idle+0x1a7/0x210 > [ 2567.173552] cpu_startup_entry+0x21/0x30 > [ 2567.173642] start_secondary+0xe1/0xf0 > [ 2567.173732] secondary_startup_64_no_verify+0x178/0x17b > [ 2567.173825] </TASK> > [ 2567.173910] ---[ end trace 0000000000000000 ]--- > > > [ 2567.167952] ------------[ cut here ]------------ > [ 2567.168053] WARNING: CPU: 11 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [ 2567.168175] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > [ 2567.168445] CPU: 11 PID: 0 Comm: swapper/11 Tainted: G O 6.6.9 #1 > [ 2567.168561] Hardware name: Supermicro X10SRD-F/X10SRD-F, BIOS 3.4 06/05/2021 > [ 2567.168675] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [ 2567.168767] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4c e3 00 > All code > ======== > 0: 07 (bad) > 1: 83 f8 ff cmp $0xffffffff,%eax > 4: 75 19 jne 0x1f > 6: ba 00 00 00 e0 mov $0xe0000000,%edx > b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) > f: 83 f8 ff cmp $0xffffffff,%eax > 12: 74 04 je 0x18 > 14: 31 c0 xor %eax,%eax > 16: 5b pop %rbx > 17: c3 ret > 18: b8 01 00 00 00 mov $0x1,%eax > 1d: 5b pop %rbx > 1e: c3 ret > 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax > 24: 77 14 ja 0x3a > 26: 85 c0 test %eax,%eax > 28: 78 06 js 0x30 > 2a:* 0f 0b ud2 <-- trapping instruction > 2c: 31 c0 xor %eax,%eax > 2e: eb e6 jmp 0x16 > 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) > 36: 31 c0 xor %eax,%eax > 38: eb dc jmp 0x16
> 3a: 80 .byte 0x80 > 3b: 3d e2 4c e3 00 cmp $0xe34ce2,%eax > > Code starting with the faulting instruction > =========================================== > 0: 0f 0b ud2 > 2: 31 c0 xor %eax,%eax > 4: eb e6 jmp 0xffffffffffffffec > 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) > c: 31 c0 xor %eax,%eax > e: eb dc jmp 0xffffffffffffffec > 10: 80 .byte 0x80 > 11: 3d e2 4c e3 00 cmp $0xe34ce2,%eax > [ 2567.168924] RSP: 0018:ffffaeaf80418d00 EFLAGS: 00010246 > [ 2567.169017] RAX: 0000000000000000 RBX: ffff9fef84d6a940 RCX: 0000000000000074 > [ 2567.169132] RDX: ffff9fefe2e30000 RSI: 0000000000000000 RDI: ffff9fef84d6a940 > [ 2567.169246] RBP: ffff9fefe2e306c0 R08: 0000000000000000 R09: 0000000000029300 > [ 2567.169359] R10: 0000000000029300 R11: ffffaeaf80418d90 R12: ffff9fef8aebe000 > [ 2567.169473] R13: ffff9fef80896800 R14: ffff9fef85335200 R15: ffff9fef8ae07080 > [ 2567.169586] FS: 0000000000000000(0000) GS:ffff9ff6dfcc0000(0000) knlGS:0000000000000000 > [ 2567.169702] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 2567.169795] CR2: 00007f4eaa7e6650 CR3: 0000000156dcd006 CR4: 00000000003706e0 > [ 2567.169908] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 2567.170022] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [ 2567.170137] Call Trace: > [ 2567.170224] <IRQ> > [ 2567.170309] ? __warn (kernel/panic.c:235 kernel/panic.c:673) > [ 2567.170399] ? report_bug (lib/bug.c:180 lib/bug.c:219) > [ 2567.170488] ? handle_bug (arch/x86/kernel/traps.c:237) > [ 2567.170577] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) > [ 2567.170667] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) > [ 2567.170758] ? 
rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) > [ 2567.170850] dst_release (net/core/dst.c:166 (discriminator 1)) > [ 2567.170939] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4327) > [ 2567.171029] vlan_dev_hard_start_xmit (net/8021q/vlan_dev.c:130) > [ 2567.171122] dev_hard_start_xmit (./include/linux/netdevice.h:4926 net/core/dev.c:3576 net/core/dev.c:3592) > [ 2567.171216] __dev_queue_xmit (./include/linux/netdevice.h:3300 (discriminator 25) net/core/dev.c:4373 (discriminator 25)) > [ 2567.171305] ? eth_header (net/ethernet/eth.c:85) > [ 2567.171394] ip_finish_output2 (./include/net/neighbour.h:542 (discriminator 2) net/ipv4/ip_output.c:233 (discriminator 2)) > [ 2567.171485] process_backlog (net/core/dev.c:6000) > [ 2567.171575] __napi_poll (net/core/dev.c:6559) > [ 2567.171663] net_rx_action (net/core/dev.c:6628 net/core/dev.c:6759) > [ 2567.171752] ? rebalance_domains (kernel/sched/fair.c:11719 kernel/sched/fair.c:11895) > [ 2567.171843] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) > [ 2567.171932] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) > [ 2567.172022] common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 47)) > [ 2567.172111] </IRQ> > [ 2567.172198] <TASK> > [ 2567.172283] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:640) > [ 2567.172374] RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291) > [ 2567.172467] Code: 46 40 40 0f 84 02 01 00 00 e8 c9 a0 70 ff e8 d4 f6 ff ff 31 ff 49 89 c6 e8 0a b9 6f ff 45 84 ff 0f 85 d9 00 00 00 fb 45 85 ed <0f> 88 b8 00 00 00 49 63 cd 48 8b 04 24 48 6b f1 68 49 29 c6 48 8d > All code > ======== > 0: 46 rex.RX > 1: 40 rex > 2: 40 0f 84 02 01 00 00 rex je 0x10b > 9: e8 c9 a0 70 ff call 0xffffffffff70a0d7 > e: e8 d4 f6 ff ff call 0xfffffffffffff6e7 > 13: 31 ff xor %edi,%edi > 15: 49 89 c6 mov %rax,%r14 > 18: e8 0a b9 6f ff call 0xffffffffff6fb927 > 1d: 45 84 ff test %r15b,%r15b > 20: 0f 85 d9 00 00 00 jne 0xff > 26: 
fb sti > 27: 45 85 ed test %r13d,%r13d > 2a:* 0f 88 b8 00 00 00 js 0xe8 <-- trapping instruction > 30: 49 63 cd movslq %r13d,%rcx > 33: 48 8b 04 24 mov (%rsp),%rax > 37: 48 6b f1 68 imul $0x68,%rcx,%rsi > 3b: 49 29 c6 sub %rax,%r14 > 3e: 48 rex.W > 3f: 8d .byte 0x8d > > Code starting with the faulting instruction > =========================================== > 0: 0f 88 b8 00 00 00 js 0xbe > 6: 49 63 cd movslq %r13d,%rcx > 9: 48 8b 04 24 mov (%rsp),%rax > d: 48 6b f1 68 imul $0x68,%rcx,%rsi > 11: 49 29 c6 sub %rax,%r14 > 14: 48 rex.W > 15: 8d .byte 0x8d > [ 2567.172623] RSP: 0018:ffffaeaf80177e98 EFLAGS: 00000202 > [ 2567.172715] RAX: ffff9ff6dfce3a80 RBX: ffff9fef81338000 RCX: 000000000000001f > [ 2567.172828] RDX: 00000255b721ed84 RSI: 00000000238e3b7a RDI: 0000000000000000 > [ 2567.172942] RBP: ffffffffba216ea0 R08: 0000000000000004 R09: ffff9ff6dfcdef00 > [ 2567.173055] R10: ffff9ff6dfcdef00 R11: 0000000000000007 R12: 0000000000000001 > [ 2567.173168] R13: 0000000000000001 R14: 00000255b721ed84 R15: 0000000000000000 > [ 2567.173283] ? 
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:285) > [ 2567.173374] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) > [ 2567.173464] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) > [ 2567.173552] cpu_startup_entry (kernel/sched/idle.c:379) > [ 2567.173642] start_secondary (arch/x86/kernel/smpboot.c:326) > [ 2567.173732] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:449) > [ 2567.173825] </TASK> > [ 2567.173910] ---[ end trace 0000000000000000 ]--- > > best regards, > Martin > > >> On 29 Dec 2023, at 14:00, Martin Zaharinov <micron10@gmail.com> wrote: >> >> Hi Thomas, >> >> One more report, from a second machine: >> >> [21299.954952] ------------[ cut here ]------------ >> [21299.955047] WARNING: CPU: 15 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >> [21299.955153] Modules linked in: nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp virtio_net net_failover failover virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio virtio_ring e1000e e1000 vmxnet3 i40e ixgbe mdio bnxt_en nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rtc_cmos >> [21299.955378] CPU: 15 PID: 0 Comm: swapper/15 Tainted: G O 6.6.8 #1 >> [21299.955475] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 02/09/2023 >> [21299.955575] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >> [21299.955662] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 >> All code >> ======== >> 0: 07 (bad) >> 1: 83 f8 ff cmp $0xffffffff,%eax >> 4: 75 19 jne 0x1f >> 6: ba 00 00 00 e0 mov $0xe0000000,%edx >> b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) >> f: 83 f8 ff cmp $0xffffffff,%eax >> 12: 74 04 je
0x18 >> 14: 31 c0 xor %eax,%eax >> 16: 5b pop %rbx >> 17: c3 ret >> 18: b8 01 00 00 00 mov $0x1,%eax >> 1d: 5b pop %rbx >> 1e: c3 ret >> 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax >> 24: 77 14 ja 0x3a >> 26: 85 c0 test %eax,%eax >> 28: 78 06 js 0x30 >> 2a:* 0f 0b ud2 <-- trapping instruction >> 2c: 31 c0 xor %eax,%eax >> 2e: eb e6 jmp 0x16 >> 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) >> 36: 31 c0 xor %eax,%eax >> 38: eb dc jmp 0x16 >> 3a: 80 .byte 0x80 >> 3b: 3d e2 4e e3 00 cmp $0xe34ee2,%eax >> >> Code starting with the faulting instruction >> =========================================== >> 0: 0f 0b ud2 >> 2: 31 c0 xor %eax,%eax >> 4: eb e6 jmp 0xffffffffffffffec >> 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) >> c: 31 c0 xor %eax,%eax >> e: eb dc jmp 0xffffffffffffffec >> 10: 80 .byte 0x80 >> 11: 3d e2 4e e3 00 cmp $0xe34ee2,%eax >> [21299.955793] RSP: 0018:ffff96a7c0578c30 EFLAGS: 00010246 >> [21299.955879] RAX: 0000000000000000 RBX: ffff8b75d1e49a80 RCX: ffff8b75c6667c80 >> [21299.955974] RDX: ffff8b84bfbe4f08 RSI: 00000000fffffe01 RDI: ffff8b75d1e49a80 >> [21299.956070] RBP: ffff8b84bfbe4f08 R08: ffff8b84bfbe4f08 R09: 0000000000000001 >> [21299.956167] R10: 0000000000028530 R11: 0000000000000001 R12: ffff8b75d1e49a40 >> [21299.956261] R13: ffff8b75d1e49aa8 R14: ffff8b84bfbe4f08 R15: 00000000c26ab667 >> [21299.956358] FS: 0000000000000000(0000) GS:ffff8b84bfbc0000(0000) knlGS:0000000000000000 >> [21299.956457] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [21299.956540] CR2: 00007f2e185c73c8 CR3: 0000000950014003 CR4: 00000000003706e0 >> [21299.956635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> [21299.956730] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> [21299.956826] Call Trace: >> [21299.956905] <IRQ> >> [21299.956983] ? __warn (kernel/panic.c:235 kernel/panic.c:673) >> [21299.957065] ? report_bug (lib/bug.c:180 lib/bug.c:219) >> [21299.957147] ? 
handle_bug (arch/x86/kernel/traps.c:237) >> [21299.957228] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) >> [21299.957308] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) >> [21299.957393] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >> [21299.957476] dst_release (net/core/dst.c:166 (discriminator 1)) >> [21299.957559] rt_cache_route (net/ipv4/route.c:1499) >> [21299.957641] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) >> [21299.957722] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) >> [21299.957804] ? free_unref_page (./include/linux/list.h:150 (discriminator 1) ./include/linux/list.h:169 (discriminator 1) mm/page_alloc.c:2377 (discriminator 1) mm/page_alloc.c:2428 (discriminator 1)) >> [21299.957889] ip_route_input_noref (net/ipv4/route.c:2499) >> [21299.957972] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) >> [21299.958058] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) >> [21299.958139] ? 
ip_rcv_core (net/ipv4/ip_input.c:436) >> [21299.958220] process_backlog (net/core/dev.c:5997) >> [21299.958302] __napi_poll (net/core/dev.c:6556) >> [21299.958384] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756) >> [21299.958466] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) >> [21299.958549] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) >> [21299.958631] sysvec_call_function_single (arch/x86/kernel/smp.c:262 (discriminator 47)) >> [21299.958714] </IRQ> >> [21299.958792] <TASK> >> [21299.958869] asm_sysvec_call_function_single (./arch/x86/include/asm/idtentry.h:656) >> [21299.958953] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) >> [21299.959038] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f >> All code >> ======== >> 0: ed in (%dx),%eax >> 1: c3 ret >> 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) >> 9: 00 00 00 00 >> d: 66 90 xchg %ax,%ax >> f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax >> 16: 02 00 >> 18: 48 8b 00 mov (%rax),%rax >> 1b: a8 08 test $0x8,%al >> 1d: 75 0c jne 0x2b >> 1f: eb 07 jmp 0x28 >> 21: 0f 00 2d 57 0f 2c 00 verw 0x2c0f57(%rip) # 0x2c0f7f >> 28: fb sti >> 29: f4 hlt >> 2a:* fa cli <-- trapping instruction >> 2b: c3 ret >> 2c: 0f 1f 00 nopl (%rax) >> 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax >> 33: 3c 01 cmp $0x1,%al >> 35: 74 0b je 0x42 >> 37: 3c 02 cmp $0x2,%al >> 39: 74 05 je 0x40 >> 3b: 8b 7f 04 mov 0x4(%rdi),%edi >> 3e: eb 9f jmp 0xffffffffffffffdf >> >> Code starting with the faulting instruction >> =========================================== >> 0: fa cli >> 1: c3 ret >> 2: 0f 1f 00 nopl (%rax) >> 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax >> 9: 3c 01 cmp $0x1,%al >> b: 74 0b je 0x18 >> d: 3c 02 cmp $0x2,%al >> f: 74 05 je 0x16 >> 11: 
8b 7f 04 mov 0x4(%rdi),%edi >> 14: eb 9f jmp 0xffffffffffffffb5 >> [21299.959162] RSP: 0018:ffff96a7c015be80 EFLAGS: 00000246 >> [21299.959247] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f >> [21299.959343] RDX: ffff8b84bfbc0000 RSI: ffff8b75c76ba000 RDI: ffff8b75c76ba064 >> [21299.959437] RBP: ffffffffae216ea0 R08: ffffffffae216ea0 R09: 0000000000000003 >> [21299.959533] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 >> [21299.959630] R13: ffffffffae216f08 R14: ffffffffae216f20 R15: 0000000000000000 >> [21299.959725] acpi_idle_enter (drivers/acpi/processor_idle.c:709) >> [21299.959807] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) >> [21299.959890] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) >> [21299.959975] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) >> [21299.960058] cpu_startup_entry (kernel/sched/idle.c:379) >> [21299.960140] start_secondary (arch/x86/kernel/smpboot.c:326) >> [21299.960223] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) >> [21299.960306] </TASK> >> [21299.960384] ---[ end trace 0000000000000000 ]--- >> >>> On 22 Dec 2023, at 19:26, Martin Zaharinov <micron10@gmail.com> wrote: >>> >>> Hi Thomas, >>> >>> This is with your patch applied.
>>> See logs >>> >>> >>> [43040.198064] ------------[ cut here ]------------ >>> [43040.198407] WARNING: CPU: 47 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath+0x2f/0x70 >>> [43040.198685] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole tg3 igb i2c_algo_bit e1000e bnxt_en mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_devintf ipmi_msghandler rtc_cmos >>> [43040.199478] CPU: 47 PID: 0 Comm: swapper/47 Tainted: G O 6.6.8 #1 >>> [43040.199660] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 >>> [43040.199886] RIP: 0010:rcuref_put_slowpath+0x2f/0x70 >>> [43040.200028] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 >>> [43040.200387] RSP: 0018:ffffa39d83e88c30 EFLAGS: 00010246 >>> [43040.200528] RAX: 0000000000000000 RBX: ffff9c58e966b840 RCX: ffff9c5bc4e35680 >>> [43040.200700] RDX: ffff9c5fafde4f08 RSI: 00000000fffffe01 RDI: ffff9c58e966b840 >>> [43040.200871] RBP: ffff9c5fafde4f08 R08: ffff9c5fafde4f08 R09: 0000000000000001 >>> [43040.201044] R10: 00000000000286e0 R11: 0000000000000001 R12: ffff9c58e966b800 >>> [43040.201255] R13: ffff9c58e966b868 R14: ffff9c5fafde4f08 R15: 000000008f5de42b >>> [43040.201439] FS: 0000000000000000(0000) GS:ffff9c5fafdc0000(0000) knlGS:0000000000000000 >>> [43040.201642] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [43040.201799] CR2: 00007f1401217714 CR3: 0000000464b94003 CR4: 00000000001706e0 >>> [43040.201994] Call Trace: >>> [43040.202095] <IRQ> >>> [43040.202187] ? __warn+0x6c/0x130 >>> [43040.202301] ? report_bug+0x1b8/0x200 >>> [43040.202418] ? 
handle_bug+0x36/0x70 >>> [43040.202534] ? exc_invalid_op+0x17/0x1a0 >>> [43040.202652] ? asm_exc_invalid_op+0x16/0x20 >>> [43040.202781] ? rcuref_put_slowpath+0x2f/0x70 >>> [43040.202909] dst_release+0x1c/0x40 >>> [43040.203026] rt_cache_route+0xbd/0xf0 >>> [43040.203143] rt_set_nexthop.isra.0+0x1b6/0x450 >>> [43040.203272] ip_route_input_slow+0x5d9/0xcc0 >>> [43040.203401] ? nf_conntrack_udp_packet+0x17c/0x240 [nf_conntrack] >>> [43040.203581] ip_route_input_noref+0xe0/0xf0 >>> [43040.203704] ip_rcv_finish_core.isra.0+0xbb/0x440 >>> [43040.203855] ip_rcv+0xd5/0x110 >>> [43040.203962] ? ip_rcv_core+0x360/0x360 >>> [43040.204079] process_backlog+0x107/0x210 >>> [43040.204201] __napi_poll+0x20/0x180 >>> [43040.204315] net_rx_action+0x29f/0x380 >>> [43040.204432] __do_softirq+0xd0/0x202 >>> [43040.204549] irq_exit_rcu+0x82/0xa0 >>> [43040.204667] common_interrupt+0x7a/0xa0 >>> [43040.204786] </IRQ> >>> [43040.204876] <TASK> >>> [43040.204965] asm_common_interrupt+0x22/0x40 >>> [43040.205090] RIP: 0010:acpi_safe_halt+0x1b/0x20 >>> [43040.205220] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f >>> [43040.205578] RSP: 0018:ffffa39d8234fe80 EFLAGS: 00000246 >>> [43040.205718] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f >>> [43040.205890] RDX: ffff9c5fafdc0000 RSI: ffff9c5882e95800 RDI: ffff9c5882e95864 >>> [43040.206063] RBP: ffffffffa9216ea0 R08: ffffffffa9216ea0 R09: 0000000000000003 >>> [43040.206246] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 >>> [43040.206419] R13: ffffffffa9216f08 R14: ffffffffa9216f20 R15: 0000000000000000 >>> [43040.206593] acpi_idle_enter+0x77/0xc0 >>> [43040.206711] cpuidle_enter_state+0x69/0x6a0 >>> [43040.206835] cpuidle_enter+0x24/0x40 >>> [43040.206954] do_idle+0x1a7/0x210 >>> [43040.207066] cpu_startup_entry+0x21/0x30 >>> [43040.207188] 
start_secondary+0xe1/0xf0 >>> [43040.207310] secondary_startup_64_no_verify+0x166/0x16b >>> [43040.207451] </TASK> >>> [43040.207542] ---[ end trace 0000000000000000 ]--- >>> >>> >>> >>> [43040.198064] ------------[ cut here ]------------ >>> [43040.198407] WARNING: CPU: 47 PID: 0 at lib/rcuref.c:294 rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >>> [43040.198685] Modules linked in: pppoe pppox ppp_generic slhc nft_limit nft_ct nft_nat nft_chain_nat nf_tables netconsole tg3 igb i2c_algo_bit e1000e bnxt_en mlx5_core mlxfw mlx4_en mlx4_core i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_devintf ipmi_msghandler rtc_cmos >>> [43040.199478] CPU: 47 PID: 0 Comm: swapper/47 Tainted: G O 6.6.8 #1 >>> [43040.199660] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 >>> [43040.199886] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >>> [43040.200028] Code: 07 83 f8 ff 75 19 ba 00 00 00 e0 f0 0f b1 17 83 f8 ff 74 04 31 c0 5b c3 b8 01 00 00 00 5b c3 3d ff ff ff bf 77 14 85 c0 78 06 <0f> 0b 31 c0 eb e6 c7 07 00 00 00 a0 31 c0 eb dc 80 3d e2 4e e3 00 >>> All code >>> ======== >>> 0: 07 (bad) >>> 1: 83 f8 ff cmp $0xffffffff,%eax >>> 4: 75 19 jne 0x1f >>> 6: ba 00 00 00 e0 mov $0xe0000000,%edx >>> b: f0 0f b1 17 lock cmpxchg %edx,(%rdi) >>> f: 83 f8 ff cmp $0xffffffff,%eax >>> 12: 74 04 je 0x18 >>> 14: 31 c0 xor %eax,%eax >>> 16: 5b pop %rbx >>> 17: c3 ret >>> 18: b8 01 00 00 00 mov $0x1,%eax >>> 1d: 5b pop %rbx >>> 1e: c3 ret >>> 1f: 3d ff ff ff bf cmp $0xbfffffff,%eax >>> 24: 77 14 ja 0x3a >>> 26: 85 c0 test %eax,%eax >>> 28: 78 06 js 0x30 >>> 2a:* 0f 0b ud2 <-- trapping instruction >>> 2c: 31 c0 xor %eax,%eax >>> 2e: eb e6 jmp 0x16 >>> 30: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) >>> 36: 31 c0 xor %eax,%eax >>> 38: eb dc jmp 0x16 >>> 3a: 80 .byte 0x80 
>>> 3b: 3d e2 4e e3 00 cmp $0xe34ee2,%eax >>> >>> Code starting with the faulting instruction >>> =========================================== >>> 0: 0f 0b ud2 >>> 2: 31 c0 xor %eax,%eax >>> 4: eb e6 jmp 0xffffffffffffffec >>> 6: c7 07 00 00 00 a0 movl $0xa0000000,(%rdi) >>> c: 31 c0 xor %eax,%eax >>> e: eb dc jmp 0xffffffffffffffec >>> 10: 80 .byte 0x80 >>> 11: 3d e2 4e e3 00 cmp $0xe34ee2,%eax >>> [43040.200387] RSP: 0018:ffffa39d83e88c30 EFLAGS: 00010246 >>> [43040.200528] RAX: 0000000000000000 RBX: ffff9c58e966b840 RCX: ffff9c5bc4e35680 >>> [43040.200700] RDX: ffff9c5fafde4f08 RSI: 00000000fffffe01 RDI: ffff9c58e966b840 >>> [43040.200871] RBP: ffff9c5fafde4f08 R08: ffff9c5fafde4f08 R09: 0000000000000001 >>> [43040.201044] R10: 00000000000286e0 R11: 0000000000000001 R12: ffff9c58e966b800 >>> [43040.201255] R13: ffff9c58e966b868 R14: ffff9c5fafde4f08 R15: 000000008f5de42b >>> [43040.201439] FS: 0000000000000000(0000) GS:ffff9c5fafdc0000(0000) knlGS:0000000000000000 >>> [43040.201642] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> [43040.201799] CR2: 00007f1401217714 CR3: 0000000464b94003 CR4: 00000000001706e0 >>> [43040.201994] Call Trace: >>> [43040.202095] <IRQ> >>> [43040.202187] ? __warn (kernel/panic.c:235 kernel/panic.c:673) >>> [43040.202301] ? report_bug (lib/bug.c:180 lib/bug.c:219) >>> [43040.202418] ? handle_bug (arch/x86/kernel/traps.c:237) >>> [43040.202534] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) >>> [43040.202652] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) >>> [43040.202781] ? rcuref_put_slowpath (lib/rcuref.c:294 (discriminator 1)) >>> [43040.202909] dst_release (net/core/dst.c:166 (discriminator 1)) >>> [43040.203026] rt_cache_route (net/ipv4/route.c:1499) >>> [43040.203143] rt_set_nexthop.isra.0 (net/ipv4/route.c:1606 (discriminator 1)) >>> [43040.203272] ip_route_input_slow (./include/net/lwtunnel.h:140 net/ipv4/route.c:1875 net/ipv4/route.c:2154 net/ipv4/route.c:2337) >>> [43040.203401] ? 
nf_conntrack_udp_packet (net/netfilter/nf_conntrack_proto_udp.c:124) nf_conntrack >>> [43040.203581] ip_route_input_noref (net/ipv4/route.c:2499) >>> [43040.203704] ip_rcv_finish_core.isra.0 (net/ipv4/ip_input.c:367 (discriminator 1)) >>> [43040.203855] ip_rcv (net/ipv4/ip_input.c:448 ./include/linux/netfilter.h:304 ./include/linux/netfilter.h:298 net/ipv4/ip_input.c:569) >>> [43040.203962] ? ip_rcv_core (net/ipv4/ip_input.c:436) >>> [43040.204079] process_backlog (net/core/dev.c:5997) >>> [43040.204201] __napi_poll (net/core/dev.c:6556) >>> [43040.204315] net_rx_action (net/core/dev.c:6625 net/core/dev.c:6756) >>> [43040.204432] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564) >>> [43040.204549] irq_exit_rcu (kernel/softirq.c:436 kernel/softirq.c:641 kernel/softirq.c:653) >>> [43040.204667] common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 47)) >>> [43040.204786] </IRQ> >>> [43040.204876] <TASK> >>> [43040.204965] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:640) >>> [43040.205090] RIP: 0010:acpi_safe_halt (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:72 drivers/acpi/processor_idle.c:113) >>> [43040.205220] Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 40 32 02 00 48 8b 00 a8 08 75 0c eb 07 0f 00 2d 57 0f 2c 00 fb f4 <fa> c3 0f 1f 00 0f b6 47 08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f >>> All code >>> ======== >>> 0: ed in (%dx),%eax >>> 1: c3 ret >>> 2: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1) >>> 9: 00 00 00 00 >>> d: 66 90 xchg %ax,%ax >>> f: 65 48 8b 04 25 40 32 mov %gs:0x23240,%rax >>> 16: 02 00 >>> 18: 48 8b 00 mov (%rax),%rax >>> 1b: a8 08 test $0x8,%al >>> 1d: 75 0c jne 0x2b >>> 1f: eb 07 jmp 0x28 >>> 21: 0f 00 2d 57 0f 2c 00 verw 0x2c0f57(%rip) # 0x2c0f7f >>> 28: fb sti >>> 29: f4 hlt >>> 2a:* fa cli <-- trapping instruction >>> 2b: c3 ret >>> 2c: 0f 1f 00 nopl (%rax) >>> 2f: 0f b6 47 08 movzbl 0x8(%rdi),%eax >>> 33: 3c 01 cmp $0x1,%al >>> 35: 74 0b je 0x42 
>>> 37: 3c 02 cmp $0x2,%al >>> 39: 74 05 je 0x40 >>> 3b: 8b 7f 04 mov 0x4(%rdi),%edi >>> 3e: eb 9f jmp 0xffffffffffffffdf >>> >>> Code starting with the faulting instruction >>> =========================================== >>> 0: fa cli >>> 1: c3 ret >>> 2: 0f 1f 00 nopl (%rax) >>> 5: 0f b6 47 08 movzbl 0x8(%rdi),%eax >>> 9: 3c 01 cmp $0x1,%al >>> b: 74 0b je 0x18 >>> d: 3c 02 cmp $0x2,%al >>> f: 74 05 je 0x16 >>> 11: 8b 7f 04 mov 0x4(%rdi),%edi >>> 14: eb 9f jmp 0xffffffffffffffb5 >>> [43040.205578] RSP: 0018:ffffa39d8234fe80 EFLAGS: 00000246 >>> [43040.205718] RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f >>> [43040.205890] RDX: ffff9c5fafdc0000 RSI: ffff9c5882e95800 RDI: ffff9c5882e95864 >>> [43040.206063] RBP: ffffffffa9216ea0 R08: ffffffffa9216ea0 R09: 0000000000000003 >>> [43040.206246] R10: 0000000000000002 R11: 0000000000000008 R12: 0000000000000001 >>> [43040.206419] R13: ffffffffa9216f08 R14: ffffffffa9216f20 R15: 0000000000000000 >>> [43040.206593] acpi_idle_enter (drivers/acpi/processor_idle.c:709) >>> [43040.206711] cpuidle_enter_state (drivers/cpuidle/cpuidle.c:267) >>> [43040.206835] cpuidle_enter (drivers/cpuidle/cpuidle.c:390 (discriminator 2)) >>> [43040.206954] do_idle (kernel/sched/idle.c:134 kernel/sched/idle.c:215 kernel/sched/idle.c:282) >>> [43040.207066] cpu_startup_entry (kernel/sched/idle.c:379) >>> [43040.207188] start_secondary (arch/x86/kernel/smpboot.c:326) >>> [43040.207310] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433) >>> [43040.207451] </TASK> >>> [43040.207542] ---[ end trace 0000000000000000 ]--- >>> >>>> On 19 Dec 2023, at 16:26, Thomas Gleixner <tglx@linutronix.de> wrote: >>>> >>>> On Tue, Dec 19 2023 at 11:25, Martin Zaharinov wrote: >>>>>> On 12 Dec 2023, at 20:16, Thomas Gleixner <tglx@linutronix.de> wrote: >>>>>> Btw, how easy is this to reproduce? 
>>>>> >>>>> It's not easy: this report was generated on a machine with 5-6k users and >>>>> traffic. One time it showed up after 1 day, another time after 4-5 days… >>>> >>>> I love those bugs ... >>>> >>>>> I will apply this patch and upload the image to one machine as fast as >>>>> possible, and when I get any reports I will send them to you. >>>> >>>> Let's see how that goes! >>>> >>>> Thanks, >>>> >>>> tglx >>> >> >
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-15 4:05 Urgent Bug Report Kernel crash 6.5.2 Martin Zaharinov 2023-09-15 6:45 ` Eric Dumazet @ 2023-09-15 23:00 ` Martin Zaharinov 2023-09-15 23:11 ` Martin Zaharinov 1 sibling, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-09-15 23:00 UTC To: netdev Cc: Eric Dumazet, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern OK, one note: this is kernel 6.5.3. See the log now: [40915.530445] ------------[ cut here ]------------ [40915.530529] rcuref - imbalanced put() [40915.530540] WARNING: CPU: 7 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) [40915.530698] Modules linked in: nf_conntrack_netlink nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [40915.530899] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G O 6.5.3 #1 [40915.531018] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 [40915.531137] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) [40915.531230] Code: 31 c0 eb e2 80 3d c6 ae e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 68 f6 e2 8e c6 05 ac ae e6 00 01 e8 11 71 c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 All code ======== 0: 31 c0 xor %eax,%eax 2: eb e2 jmp 0xffffffffffffffe6 4: 80 3d c6 ae e6 00 00 cmpb $0x0,0xe6aec6(%rip) # 0xe6aed1 b: 74 0a je 0x17 d: c7 03 00 00 00 e0 movl $0xe0000000,(%rbx) 13: 31 c0 xor %eax,%eax 15: eb cf jmp 0xffffffffffffffe6 17: 48 c7 c7 68 f6 e2 8e mov $0xffffffff8ee2f668,%rdi 1e: c6 05 ac ae e6 00 01 movb $0x1,0xe6aeac(%rip) # 0xe6aed1 25: e8 11 71 c7 ff call 0xffffffffffc7713b 2a:* 0f 0b ud2 <-- trapping instruction 2c:
eb df jmp 0xd 2e: cc int3 2f: cc int3 30: cc int3 31: cc int3 32: cc int3 33: cc int3 34: cc int3 35: cc int3 36: cc int3 37: cc int3 38: cc int3 39: cc int3 3a: cc int3 3b: 48 89 fa mov %rdi,%rdx 3e: 83 .byte 0x83 3f: e2 .byte 0xe2 Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: eb df jmp 0xffffffffffffffe3 4: cc int3 5: cc int3 6: cc int3 7: cc int3 8: cc int3 9: cc int3 a: cc int3 b: cc int3 c: cc int3 d: cc int3 e: cc int3 f: cc int3 10: cc int3 11: 48 89 fa mov %rdi,%rdx 14: 83 .byte 0x83 15: e2 .byte 0xe2 [40915.531389] RSP: 0018:ffffa62680318de8 EFLAGS: 00010296 [40915.531487] RAX: 0000000000000019 RBX: ffff982f02950c40 RCX: 00000000fffbffff [40915.531605] RDX: 00000000fffbffff RSI: 0000000000000001 RDI: 00000000ffffffea [40915.531721] RBP: ffff982e467d2000 R08: 0000000000000000 R09: 00000000fffbffff [40915.531839] R10: ffff98359d600000 R11: 0000000000000003 R12: ffff982f044e16c0 [40915.531956] R13: 0000000000000000 R14: 0000000000000258 R15: ffffa62680318f60 [40915.532075] FS: 0000000000000000(0000) GS:ffff98359fbc0000(0000) knlGS:0000000000000000 [40915.532195] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [40915.532291] CR2: 00005593eb3ff078 CR3: 0000000179f6e001 CR4: 00000000003706e0 [40915.532409] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [40915.532526] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [40915.532645] Call Trace: [40915.532736] <IRQ> [40915.532824] ? __warn (kernel/panic.c:668) [40915.532918] ? report_bug (lib/bug.c:223) [40915.533011] ? handle_bug (arch/x86/kernel/traps.c:324) [40915.533104] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1)) [40915.533198] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [40915.533294] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) [40915.533389] ? 
rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) [40915.533482] dst_release (./include/linux/rcuref.h:151 net/core/dst.c:166) [40915.533576] __dev_queue_xmit (net/core/dev.c:4138) [40915.533671] ? eth_header (net/ethernet/eth.c:83) [40915.533766] ip_finish_output2 (./include/net/neighbour.h:544 net/ipv4/ip_output.c:230) [40915.533863] process_backlog (net/core/dev.c:5451 net/core/dev.c:5566 net/core/dev.c:5895) [40915.533958] __napi_poll+0x20/0x180 [40915.534050] net_rx_action (net/core/dev.c:5839 net/core/dev.c:5860 net/core/dev.c:6684) [40915.534140] __do_softirq (./arch/x86/include/asm/bitops.h:319 kernel/softirq.c:550) [40915.534233] do_softirq (kernel/softirq.c:463 (discriminator 32)) [40915.534326] </IRQ> [40915.534413] <TASK> [40915.534503] flush_smp_call_function_queue (kernel/smp.c:563 (discriminator 1)) [40915.534597] do_idle (kernel/sched/idle.c:295) [40915.534687] cpu_startup_entry (kernel/sched/idle.c:379 (discriminator 1)) [40915.534778] start_secondary (arch/x86/kernel/smpboot.c:326) [40915.534871] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441) [40915.534968] </TASK> [40915.535057] ---[ end trace 0000000000000000 ]--- > On 15 Sep 2023, at 7:05, Martin Zaharinov <micron10@gmail.com> wrote: > > Hi All > This is report from kernel 6.5.2 after 4 day up system hang and reboot after this error : > > > > Sep 15 04:32:29 205.254.184.12 [399661.971344][ C31] kernel tried to execute NX-protected page - exploit attempt? 
(uid: 0) > Sep 15 04:32:29 205.254.184.12 [399661.971470][ C31] BUG: unable to handle page fault for address: ffffa10c52d43058 > Sep 15 04:32:29 205.254.184.12 [399661.971586][ C31] #PF: supervisor instruction fetch in kernel mode > Sep 15 04:32:29 205.254.184.12 [399661.971680][ C31] #PF: error_code(0x0011) - permissions violation > Sep 15 04:32:29 205.254.184.12 [399661.971775][ C31] PGD 12601067 P4D 12601067 PUD 80000002400001e3 > Sep 15 04:32:29 205.254.184.12 [399661.971871][ C31] Oops: 0011 [#1] PREEMPT SMP > Sep 15 04:32:29 205.254.184.12 [399661.971963][ C31] CPU: 31 PID: 0 Comm: swapper/31 Tainted: G W O 6.5.2 #1 > Sep 15 04:32:29 205.254.184.12 [399661.972079][ C31] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 > Sep 15 04:32:29 205.254.184.12 [399661.972197][ C31] RIP: 0010:0xffffa10c52d43058 > Sep 15 04:32:29 205.254.184.12 [399661.972289][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00 > Sep 15 04:32:29 205.254.184.12 [399661.972448][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282 > Sep 15 04:32:29 205.254.184.12 [399661.972543][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000 > Sep 15 04:32:29 205.254.184.12 [399661.972659][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000 > Sep 15 04:32:29 205.254.184.12 [399661.972774][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c > Sep 15 04:32:29 205.254.184.12 [399661.972889][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80 > Sep 15 04:32:29 205.254.184.12 [399661.973005][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600 > Sep 15 04:32:29 205.254.184.12 [399661.973123][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000 > Sep 15 04:32:29 205.254.184.12 [399661.973244][ C31] CS: 0010 DS: 
0000 ES: 0000 CR0: 0000000080050033 > Sep 15 04:32:29 205.254.184.12 [399661.973338][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0 > Sep 15 04:32:29 205.254.184.12 [399661.973454][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > Sep 15 04:32:29 205.254.184.12 [399661.973569][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Sep 15 04:32:29 205.254.184.12 [399661.973684][ C31] Call Trace: > Sep 15 04:32:29 205.254.184.12 [399661.973773][ C31] <IRQ> > Sep 15 04:32:29 205.254.184.12 [399661.973859][ C31] ? __die+0xe4/0xf0 > Sep 15 04:32:29 205.254.184.12 [399661.973949][ C31] ? page_fault_oops+0x144/0x3e0 > Sep 15 04:32:29 205.254.184.12 [399661.974043][ C31] ? exc_page_fault+0x92/0xa0 > Sep 15 04:32:29 205.254.184.12 [399661.974136][ C31] ? asm_exc_page_fault+0x22/0x30 > Sep 15 04:32:29 205.254.184.12 [399661.974228][ C31] ? kfree_skb_reason+0x33/0xf0 > Sep 15 04:32:29 205.254.184.12 [399661.974321][ C31] ? tcp_mtu_probe+0x3a6/0x7b0 > Sep 15 04:32:29 205.254.184.12 [399661.974416][ C31] ? tcp_write_xmit+0x7fa/0x1410 > Sep 15 04:32:29 205.254.184.12 [399661.974509][ C31] ? __tcp_push_pending_frames+0x2d/0xb0 > Sep 15 04:32:29 205.254.184.12 [399661.974603][ C31] ? tcp_rcv_established+0x381/0x610 > Sep 15 04:32:29 205.254.184.12 [399661.974695][ C31] ? sk_filter_trim_cap+0xc6/0x1c0 > Sep 15 04:32:29 205.254.184.12 [399661.974787][ C31] ? tcp_v4_do_rcv+0x11f/0x1f0 > Sep 15 04:32:29 205.254.184.12 [399661.974877][ C31] ? tcp_v4_rcv+0xfa1/0x1010 > Sep 15 04:32:29 205.254.184.12 [399661.974968][ C31] ? ip_protocol_deliver_rcu+0x1b/0x270 > Sep 15 04:32:29 205.254.184.12 [399661.975062][ C31] ? ip_local_deliver_finish+0x6d/0x90 > Sep 15 04:32:29 205.254.184.12 [399661.976257][ C31] ? process_backlog+0x10c/0x230 > Sep 15 04:32:29 205.254.184.12 [399661.976352][ C31] ? __napi_poll+0x20/0x180 > Sep 15 04:32:29 205.254.184.12 [399661.976442][ C31] ? 
net_rx_action+0x2a4/0x390 > Sep 15 04:32:29 205.254.184.12 [399661.976534][ C31] ? __do_softirq+0xd0/0x202 > Sep 15 04:32:29 205.254.184.12 [399661.976626][ C31] ? do_softirq+0x3a/0x50 > Sep 15 04:32:29 205.254.184.12 [399661.976718][ C31] </IRQ> > Sep 15 04:32:29 205.254.184.12 [399661.976805][ C31] <TASK> > Sep 15 04:32:29 205.254.184.12 [399661.976890][ C31] ? flush_smp_call_function_queue+0x3f/0x50 > Sep 15 04:32:29 205.254.184.12 [399661.976988][ C31] ? do_idle+0x14d/0x210 > Sep 15 04:32:29 205.254.184.12 [399661.977078][ C31] ? cpu_startup_entry+0x14/0x20 > Sep 15 04:32:29 205.254.184.12 [399661.977168][ C31] ? start_secondary+0xe1/0xf0 > Sep 15 04:32:29 205.254.184.12 [399661.977262][ C31] ? secondary_startup_64_no_verify+0x167/0x16b > Sep 15 04:32:29 205.254.184.12 [399661.977359][ C31] </TASK> > Sep 15 04:32:29 205.254.184.12 [399661.977448][ C31] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > Sep 15 04:32:29 205.254.184.12 [399661.977720][ C31] CR2: ffffa10c52d43058 > Sep 15 04:32:29 205.254.184.12 [399661.977809][ C31] ---[ end trace 0000000000000000 ]--- > Sep 15 04:32:29 205.254.184.12 [399661.977901][ C31] RIP: 0010:0xffffa10c52d43058 > Sep 15 04:32:29 205.254.184.12 [399661.977992][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00 > Sep 15 04:32:29 205.254.184.12 [399661.978150][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282 > Sep 15 04:32:29 205.254.184.12 [399661.978243][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000 > Sep 15 04:32:29 205.254.184.12 
[399661.978358][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000 > Sep 15 04:32:29 205.254.184.12 [399661.978472][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c > Sep 15 04:32:29 205.254.184.12 [399661.978587][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80 > Sep 15 04:32:29 205.254.184.12 [399661.978702][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600 > Sep 15 04:32:29 205.254.184.12 [399661.978818][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000 > Sep 15 04:32:29 205.254.184.12 [399661.978940][ C31] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > Sep 15 04:32:29 205.254.184.12 [399661.979036][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0 > Sep 15 04:32:29 205.254.184.12 [399661.979150][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > Sep 15 04:32:29 205.254.184.12 [399661.979265][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Sep 15 04:32:29 205.254.184.12 [399661.979381][ C31] Kernel panic - not syncing: Fatal exception in interrupt > Sep 15 04:32:29 205.254.184.12 [399662.084038][ C31] Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) > Sep 15 04:32:29 205.254.184.12 [399662.084162][ C31] Rebooting in 10 seconds.. > > > Please if find fix update me . > > m. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-15 23:00 ` Martin Zaharinov @ 2023-09-15 23:11 ` Martin Zaharinov 2023-09-16 8:27 ` Paolo Abeni 0 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-09-15 23:11 UTC (permalink / raw) To: netdev Cc: Eric Dumazet, Paolo Abeni, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern one more log: Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 Sep 12 07:37:29 [151563.298975][ C5] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 Sep 12 07:37:29 [151563.299091][ C5] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) Sep 12 07:37:29 [151563.299185][ C5] Code: 31 c0 eb e2 80 3d c7 b8 e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 9b f5 e2 9f c6 05 ad b8 e6 00 01 e8 01 7b c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 All code ======== 0: 31 c0 xor %eax,%eax 2: eb e2 jmp 0xffffffffffffffe6 4: 80 3d c7 b8 e6 00 00 cmpb $0x0,0xe6b8c7(%rip) # 0xe6b8d2 b: 74 0a je 0x17 d: c7 03 00 00 00 e0 movl $0xe0000000,(%rbx) 13: 31 c0 xor %eax,%eax 15: eb cf jmp 0xffffffffffffffe6 17: 48 c7 c7 9b f5 e2 9f mov $0xffffffff9fe2f59b,%rdi 1e: c6 05 ad b8 e6 00 01 movb 
$0x1,0xe6b8ad(%rip) # 0xe6b8d2 25: e8 01 7b c7 ff call 0xffffffffffc77b2b 2a:* 0f 0b ud2 <-- trapping instruction 2c: eb df jmp 0xd 2e: cc int3 2f: cc int3 30: cc int3 31: cc int3 32: cc int3 33: cc int3 34: cc int3 35: cc int3 36: cc int3 37: cc int3 38: cc int3 39: cc int3 3a: cc int3 3b: 48 89 fa mov %rdi,%rdx 3e: 83 .byte 0x83 3f: e2 .byte 0xe2 Code starting with the faulting instruction =========================================== 0: 0f 0b ud2 2: eb df jmp 0xffffffffffffffe3 4: cc int3 5: cc int3 6: cc int3 7: cc int3 8: cc int3 9: cc int3 a: cc int3 b: cc int3 c: cc int3 d: cc int3 e: cc int3 f: cc int3 10: cc int3 11: 48 89 fa mov %rdi,%rdx 14: 83 .byte 0x83 15: e2 .byte 0xe2 Sep 12 07:37:29 [151563.299344][ C5] RSP: 0018:ffffad0e0033cde8 EFLAGS: 00010296 Sep 12 07:37:29 [151563.299440][ C5] RAX: 0000000000000019 RBX: ffffa10ba37ce100 RCX: 00000000fff7ffff Sep 12 07:37:29 [151563.299558][ C5] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea Sep 12 07:37:29 [151563.299677][ C5] RBP: ffffa10b05c76000 R08: 0000000000000000 R09: 00000000fff7ffff Sep 12 07:37:29 [151563.299796][ C5] R10: ffffa1125ae00000 R11: 0000000000000003 R12: ffffa10b5f1a4ec0 Sep 12 07:37:29 [151563.299914][ C5] R13: 0000000000000000 R14: 0000000000000258 R15: ffffad0e0033cf60 Sep 12 07:37:29 [151563.300030][ C5] FS: 0000000000000000(0000) GS:ffffa1125f740000(0000) knlGS:0000000000000000 Sep 12 07:37:29 [151563.300152][ C5] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 12 07:37:29 [151563.300248][ C5] CR2: 00007fade7f56d40 CR3: 000000010088e005 CR4: 00000000003706e0 Sep 12 07:37:29 [151563.300363][ C5] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 12 07:37:29 [151563.300478][ C5] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Sep 12 07:37:29 [151563.300593][ C5] Call Trace: Sep 12 07:37:29 [151563.300683][ C5] <IRQ> Sep 12 07:37:29 [151563.300769][ C5] ? __warn (kernel/panic.c:668) Sep 12 07:37:29 [151563.300861][ C5] ? 
report_bug (lib/bug.c:223) Sep 12 07:37:29 [151563.300952][ C5] ? handle_bug (arch/x86/kernel/traps.c:324) Sep 12 07:37:29 [151563.301043][ C5] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1)) Sep 12 07:37:29 [151563.301134][ C5] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) Sep 12 07:37:29 [151563.301225][ C5] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) Sep 12 07:37:29 [151563.301319][ C5] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) Sep 12 07:37:29 [151563.301412][ C5] dst_release (./include/linux/rcuref.h:151 net/core/dst.c:166) Sep 12 07:37:29 [151563.301502][ C5] __dev_queue_xmit (net/core/dev.c:4138) Sep 12 07:37:29 [151563.301595][ C5] ? eth_header (net/ethernet/eth.c:83) Sep 12 07:37:29 [151563.301686][ C5] ip_finish_output2 (./include/net/neighbour.h:327 ./include/net/sock.h:2251 net/ipv4/ip_output.c:228) Sep 12 07:37:29 [151563.301778][ C5] process_backlog (net/core/dev.c:5451 net/core/dev.c:5566 net/core/dev.c:5895) Sep 12 07:37:29 [151563.301871][ C5] __napi_poll+0x20/0x180 Sep 12 07:37:29 [151563.301964][ C5] net_rx_action (net/core/dev.c:5839 net/core/dev.c:5860 net/core/dev.c:6684) Sep 12 07:37:29 [151563.302057][ C5] __do_softirq (./arch/x86/include/asm/bitops.h:319 kernel/softirq.c:550) Sep 12 07:37:29 [151563.302150][ C5] do_softirq (kernel/softirq.c:463 (discriminator 32)) Sep 12 07:37:29 [151563.302240][ C5] </IRQ> Sep 12 07:37:29 [151563.302326][ C5] <TASK> Sep 12 07:37:29 [151563.302416][ C5] flush_smp_call_function_queue (kernel/smp.c:563 (discriminator 1)) Sep 12 07:37:29 [151563.302518][ C5] do_idle (kernel/sched/idle.c:295) Sep 12 07:37:29 [151563.302612][ C5] cpu_startup_entry (kernel/sched/idle.c:379 (discriminator 1)) Sep 12 07:37:29 [151563.302707][ C5] start_secondary (arch/x86/kernel/smpboot.c:326) Sep 12 07:37:29 [151563.302805][ C5] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441) Sep 12 07:37:29 [151563.302900][ C5] </TASK> Sep 12 07:37:29 [151563.302986][ 
C5] ---[ end trace 0000000000000000 ]--- Sep 15 04:32:29 [399661.971344][ C31] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) Sep 15 04:32:29 [399661.971470][ C31] BUG: unable to handle page fault for address: ffffa10c52d43058 Sep 15 04:32:29 [399661.971586][ C31] #PF: supervisor instruction fetch in kernel mode Sep 15 04:32:29 [399661.971680][ C31] #PF: error_code(0x0011) - permissions violation Sep 15 04:32:29 [399661.971775][ C31] PGD 12601067 P4D 12601067 PUD 80000002400001e3 Sep 15 04:32:29 [399661.971871][ C31] Oops: 0011 [#1] PREEMPT SMP Sep 15 04:32:29 [399661.971963][ C31] CPU: 31 PID: 0 Comm: swapper/31 Tainted: G W O 6.5.2 #1 Sep 15 04:32:29 [399661.972079][ C31] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 Sep 15 04:32:29 [399661.972197][ C31] RIP: 0010:0xffffa10c52d43058 Sep 15 04:32:29 [399661.972289][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00 All code ======== ... 30: 00 00 add %al,(%rax) 32: 58 pop %rax 33: 30 d4 xor %dl,%ah 35:* 52 push %rdx <-- trapping instruction 36: 0c a1 or $0xa1,%al 38: ff (bad) 39: ff 00 incl (%rax) 3b: 00 00 add %al,(%rax) 3d: 00 00 add %al,(%rax) ... Code starting with the faulting instruction =========================================== ... 8: 58 pop %rax 9: 30 d4 xor %dl,%ah b: 52 push %rdx c: 0c a1 or $0xa1,%al e: ff (bad) f: ff 00 incl (%rax) 11: 00 00 add %al,(%rax) 13: 00 00 add %al,(%rax) ... 
Sep 15 04:32:29 [399661.972448][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282 Sep 15 04:32:29 [399661.972543][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000 Sep 15 04:32:29 [399661.972659][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000 Sep 15 04:32:29 [399661.972774][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c Sep 15 04:32:29 [399661.972889][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80 Sep 15 04:32:29 [399661.973005][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600 Sep 15 04:32:29 [399661.973123][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000 Sep 15 04:32:29 [399661.973244][ C31] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 15 04:32:29 [399661.973338][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0 Sep 15 04:32:29 [399661.973454][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 15 04:32:29 [399661.973569][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Sep 15 04:32:29 [399661.973684][ C31] Call Trace: Sep 15 04:32:29 [399661.973773][ C31] <IRQ> Sep 15 04:32:29 [399661.973859][ C31] ? __die (arch/x86/kernel/dumpstack.c:478 (discriminator 1) arch/x86/kernel/dumpstack.c:465 (discriminator 1) arch/x86/kernel/dumpstack.c:420 (discriminator 1) arch/x86/kernel/dumpstack.c:434 (discriminator 1)) Sep 15 04:32:29 [399661.973949][ C31] ? page_fault_oops (arch/x86/mm/fault.c:703) Sep 15 04:32:29 [399661.974043][ C31] ? exc_page_fault (arch/x86/mm/fault.c:48 (discriminator 2) arch/x86/mm/fault.c:1479 (discriminator 2) arch/x86/mm/fault.c:1542 (discriminator 2)) Sep 15 04:32:29 [399661.974136][ C31] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570) Sep 15 04:32:29 [399661.974228][ C31] ? kfree_skb_reason (net/core/skbuff.c:1006 net/core/skbuff.c:1022 net/core/skbuff.c:1058) Sep 15 04:32:29 [399661.974321][ C31] ? 
tcp_mtu_probe (./include/net/sock.h:1627 (discriminator 1) net/ipv4/tcp_output.c:2338 (discriminator 1) net/ipv4/tcp_output.c:2463 (discriminator 1)) Sep 15 04:32:29 [399661.974416][ C31] ? tcp_write_xmit (net/ipv4/tcp_output.c:2678) Sep 15 04:32:29 [399661.974509][ C31] ? __tcp_push_pending_frames (net/ipv4/tcp_output.c:2940 (discriminator 1)) Sep 15 04:32:29 [399661.974603][ C31] ? tcp_rcv_established (net/ipv4/tcp_input.c:5626 net/ipv4/tcp_input.c:5620 net/ipv4/tcp_input.c:6066) Sep 15 04:32:29 [399661.974695][ C31] ? sk_filter_trim_cap (./include/linux/rcupdate.h:781 net/core/filter.c:157) Sep 15 04:32:29 [399661.974787][ C31] ? tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1728) Sep 15 04:32:29 [399661.974877][ C31] ? tcp_v4_rcv (./include/net/tcp.h:2342 (discriminator 1) net/ipv4/tcp_ipv4.c:2147 (discriminator 1)) Sep 15 04:32:29 [399661.974968][ C31] ? ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205) Sep 15 04:32:29 [399661.975062][ C31] ? ip_local_deliver_finish (net/ipv4/ip_input.c:233 (discriminator 1)) Sep 15 04:32:29 [399661.976257][ C31] ? process_backlog (net/core/dev.c:5451 net/core/dev.c:5566 net/core/dev.c:5895) Sep 15 04:32:29 [399661.976352][ C31] ? __napi_poll+0x20/0x180 Sep 15 04:32:29 [399661.976442][ C31] ? net_rx_action (net/core/dev.c:5839 net/core/dev.c:5860 net/core/dev.c:6684) Sep 15 04:32:29 [399661.976534][ C31] ? __do_softirq (./arch/x86/include/asm/bitops.h:319 kernel/softirq.c:550) Sep 15 04:32:29 [399661.976626][ C31] ? do_softirq (kernel/softirq.c:463 (discriminator 32)) Sep 15 04:32:29 [399661.976718][ C31] </IRQ> Sep 15 04:32:29 [399661.976805][ C31] <TASK> Sep 15 04:32:29 [399661.976890][ C31] ? flush_smp_call_function_queue (kernel/smp.c:563 (discriminator 1)) Sep 15 04:32:29 [399661.976988][ C31] ? do_idle (kernel/sched/idle.c:295) Sep 15 04:32:29 [399661.977078][ C31] ? cpu_startup_entry (kernel/sched/idle.c:379 (discriminator 1)) Sep 15 04:32:29 [399661.977168][ C31] ? 
start_secondary (arch/x86/kernel/smpboot.c:326) Sep 15 04:32:29 [399661.977262][ C31] ? secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441) Sep 15 04:32:29 [399661.977359][ C31] </TASK> Sep 15 04:32:29 [399661.977448][ C31] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] Sep 15 04:32:29 [399661.977720][ C31] CR2: ffffa10c52d43058 Sep 15 04:32:29 [399661.977809][ C31] ---[ end trace 0000000000000000 ]--- Sep 15 04:32:29 [399661.977901][ C31] RIP: 0010:0xffffa10c52d43058 Sep 15 04:32:29 [399661.977992][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00 All code ======== ... 30: 00 00 add %al,(%rax) 32: 58 pop %rax 33: 30 d4 xor %dl,%ah 35:* 52 push %rdx <-- trapping instruction 36: 0c a1 or $0xa1,%al 38: ff (bad) 39: ff 00 incl (%rax) 3b: 00 00 add %al,(%rax) 3d: 00 00 add %al,(%rax) ... Code starting with the faulting instruction =========================================== ... 8: 58 pop %rax 9: 30 d4 xor %dl,%ah b: 52 push %rdx c: 0c a1 or $0xa1,%al e: ff (bad) f: ff 00 incl (%rax) 11: 00 00 add %al,(%rax) 13: 00 00 add %al,(%rax) ... 
Sep 15 04:32:29 [399661.978150][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282 Sep 15 04:32:29 [399661.978243][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000 Sep 15 04:32:29 [399661.978358][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000 Sep 15 04:32:29 [399661.978472][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c Sep 15 04:32:29 [399661.978587][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80 Sep 15 04:32:29 [399661.978702][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600 Sep 15 04:32:29 [399661.978818][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000 Sep 15 04:32:29 [399661.978940][ C31] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 15 04:32:29 [399661.979036][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0 Sep 15 04:32:29 [399661.979150][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 15 04:32:29 [399661.979265][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Sep 15 04:32:29 [399661.979381][ C31] Kernel panic - not syncing: Fatal exception in interrupt Sep 15 04:32:29 [399662.084038][ C31] Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Sep 15 04:32:29 [399662.084162][ C31] Rebooting in 10 seconds.. 
> On 16 Sep 2023, at 2:00, Martin Zaharinov <micron10@gmail.com> wrote: > > Ok fix > one note this is kernel 6.5.3 … > > > see log now : > > > [40915.530445] ------------[ cut here ]------------ > [40915.530529] rcuref - imbalanced put() > [40915.530540] WARNING: CPU: 7 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [40915.530698] Modules linked in: nf_conntrack_netlink nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > [40915.530899] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G O 6.5.3 #1 > [40915.531018] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 > [40915.531137] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [40915.531230] Code: 31 c0 eb e2 80 3d c6 ae e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 68 f6 e2 8e c6 05 ac ae e6 00 01 e8 11 71 c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 > All code > ======== > 0: 31 c0 xor %eax,%eax > 2: eb e2 jmp 0xffffffffffffffe6 > 4: 80 3d c6 ae e6 00 00 cmpb $0x0,0xe6aec6(%rip) # 0xe6aed1 > b: 74 0a je 0x17 > d: c7 03 00 00 00 e0 movl $0xe0000000,(%rbx) > 13: 31 c0 xor %eax,%eax > 15: eb cf jmp 0xffffffffffffffe6 > 17: 48 c7 c7 68 f6 e2 8e mov $0xffffffff8ee2f668,%rdi > 1e: c6 05 ac ae e6 00 01 movb $0x1,0xe6aeac(%rip) # 0xe6aed1 > 25: e8 11 71 c7 ff call 0xffffffffffc7713b > 2a:* 0f 0b ud2 <-- trapping instruction > 2c: eb df jmp 0xd > 2e: cc int3 > 2f: cc int3 > 30: cc int3 > 31: cc int3 > 32: cc int3 > 33: cc int3 > 34: cc int3 > 35: cc int3 > 36: cc int3 > 37: cc int3 > 38: cc int3 > 39: cc int3 > 3a: cc int3 > 3b: 48 89 fa mov %rdi,%rdx > 3e: 83 .byte 0x83 > 3f: e2 .byte 0xe2 > > Code starting with the faulting instruction > 
=========================================== > 0: 0f 0b ud2 > 2: eb df jmp 0xffffffffffffffe3 > 4: cc int3 > 5: cc int3 > 6: cc int3 > 7: cc int3 > 8: cc int3 > 9: cc int3 > a: cc int3 > b: cc int3 > c: cc int3 > d: cc int3 > e: cc int3 > f: cc int3 > 10: cc int3 > 11: 48 89 fa mov %rdi,%rdx > 14: 83 .byte 0x83 > 15: e2 .byte 0xe2 > [40915.531389] RSP: 0018:ffffa62680318de8 EFLAGS: 00010296 > [40915.531487] RAX: 0000000000000019 RBX: ffff982f02950c40 RCX: 00000000fffbffff > [40915.531605] RDX: 00000000fffbffff RSI: 0000000000000001 RDI: 00000000ffffffea > [40915.531721] RBP: ffff982e467d2000 R08: 0000000000000000 R09: 00000000fffbffff > [40915.531839] R10: ffff98359d600000 R11: 0000000000000003 R12: ffff982f044e16c0 > [40915.531956] R13: 0000000000000000 R14: 0000000000000258 R15: ffffa62680318f60 > [40915.532075] FS: 0000000000000000(0000) GS:ffff98359fbc0000(0000) knlGS:0000000000000000 > [40915.532195] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [40915.532291] CR2: 00005593eb3ff078 CR3: 0000000179f6e001 CR4: 00000000003706e0 > [40915.532409] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [40915.532526] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [40915.532645] Call Trace: > [40915.532736] <IRQ> > [40915.532824] ? __warn (kernel/panic.c:668) > [40915.532918] ? report_bug (lib/bug.c:223) > [40915.533011] ? handle_bug (arch/x86/kernel/traps.c:324) > [40915.533104] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1)) > [40915.533198] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) > [40915.533294] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [40915.533389] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > [40915.533482] dst_release (./include/linux/rcuref.h:151 net/core/dst.c:166) > [40915.533576] __dev_queue_xmit (net/core/dev.c:4138) > [40915.533671] ? 
eth_header (net/ethernet/eth.c:83) > [40915.533766] ip_finish_output2 (./include/net/neighbour.h:544 net/ipv4/ip_output.c:230) > [40915.533863] process_backlog (net/core/dev.c:5451 net/core/dev.c:5566 net/core/dev.c:5895) > [40915.533958] __napi_poll+0x20/0x180 > [40915.534050] net_rx_action (net/core/dev.c:5839 net/core/dev.c:5860 net/core/dev.c:6684) > [40915.534140] __do_softirq (./arch/x86/include/asm/bitops.h:319 kernel/softirq.c:550) > [40915.534233] do_softirq (kernel/softirq.c:463 (discriminator 32)) > [40915.534326] </IRQ> > [40915.534413] <TASK> > [40915.534503] flush_smp_call_function_queue (kernel/smp.c:563 (discriminator 1)) > [40915.534597] do_idle (kernel/sched/idle.c:295) > [40915.534687] cpu_startup_entry (kernel/sched/idle.c:379 (discriminator 1)) > [40915.534778] start_secondary (arch/x86/kernel/smpboot.c:326) > [40915.534871] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441) > [40915.534968] </TASK> > [40915.535057] ---[ end trace 0000000000000000 ]--- > >> On 15 Sep 2023, at 7:05, Martin Zaharinov <micron10@gmail.com> wrote: >> >> Hi All >> This is report from kernel 6.5.2 after 4 day up system hang and reboot after this error : >> >> >> >> Sep 15 04:32:29 205.254.184.12 [399661.971344][ C31] kernel tried to execute NX-protected page - exploit attempt? 
(uid: 0) >> Sep 15 04:32:29 205.254.184.12 [399661.971470][ C31] BUG: unable to handle page fault for address: ffffa10c52d43058 >> Sep 15 04:32:29 205.254.184.12 [399661.971586][ C31] #PF: supervisor instruction fetch in kernel mode >> Sep 15 04:32:29 205.254.184.12 [399661.971680][ C31] #PF: error_code(0x0011) - permissions violation >> Sep 15 04:32:29 205.254.184.12 [399661.971775][ C31] PGD 12601067 P4D 12601067 PUD 80000002400001e3 >> Sep 15 04:32:29 205.254.184.12 [399661.971871][ C31] Oops: 0011 [#1] PREEMPT SMP >> Sep 15 04:32:29 205.254.184.12 [399661.971963][ C31] CPU: 31 PID: 0 Comm: swapper/31 Tainted: G W O 6.5.2 #1 >> Sep 15 04:32:29 205.254.184.12 [399661.972079][ C31] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 >> Sep 15 04:32:29 205.254.184.12 [399661.972197][ C31] RIP: 0010:0xffffa10c52d43058 >> Sep 15 04:32:29 205.254.184.12 [399661.972289][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00 >> Sep 15 04:32:29 205.254.184.12 [399661.972448][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282 >> Sep 15 04:32:29 205.254.184.12 [399661.972543][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000 >> Sep 15 04:32:29 205.254.184.12 [399661.972659][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000 >> Sep 15 04:32:29 205.254.184.12 [399661.972774][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c >> Sep 15 04:32:29 205.254.184.12 [399661.972889][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80 >> Sep 15 04:32:29 205.254.184.12 [399661.973005][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600 >> Sep 15 04:32:29 205.254.184.12 [399661.973123][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000 >> Sep 15 04:32:29 205.254.184.12 [399661.973244][ 
C31] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> Sep 15 04:32:29 205.254.184.12 [399661.973338][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0 >> Sep 15 04:32:29 205.254.184.12 [399661.973454][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> Sep 15 04:32:29 205.254.184.12 [399661.973569][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> Sep 15 04:32:29 205.254.184.12 [399661.973684][ C31] Call Trace: >> Sep 15 04:32:29 205.254.184.12 [399661.973773][ C31] <IRQ> >> Sep 15 04:32:29 205.254.184.12 [399661.973859][ C31] ? __die+0xe4/0xf0 >> Sep 15 04:32:29 205.254.184.12 [399661.973949][ C31] ? page_fault_oops+0x144/0x3e0 >> Sep 15 04:32:29 205.254.184.12 [399661.974043][ C31] ? exc_page_fault+0x92/0xa0 >> Sep 15 04:32:29 205.254.184.12 [399661.974136][ C31] ? asm_exc_page_fault+0x22/0x30 >> Sep 15 04:32:29 205.254.184.12 [399661.974228][ C31] ? kfree_skb_reason+0x33/0xf0 >> Sep 15 04:32:29 205.254.184.12 [399661.974321][ C31] ? tcp_mtu_probe+0x3a6/0x7b0 >> Sep 15 04:32:29 205.254.184.12 [399661.974416][ C31] ? tcp_write_xmit+0x7fa/0x1410 >> Sep 15 04:32:29 205.254.184.12 [399661.974509][ C31] ? __tcp_push_pending_frames+0x2d/0xb0 >> Sep 15 04:32:29 205.254.184.12 [399661.974603][ C31] ? tcp_rcv_established+0x381/0x610 >> Sep 15 04:32:29 205.254.184.12 [399661.974695][ C31] ? sk_filter_trim_cap+0xc6/0x1c0 >> Sep 15 04:32:29 205.254.184.12 [399661.974787][ C31] ? tcp_v4_do_rcv+0x11f/0x1f0 >> Sep 15 04:32:29 205.254.184.12 [399661.974877][ C31] ? tcp_v4_rcv+0xfa1/0x1010 >> Sep 15 04:32:29 205.254.184.12 [399661.974968][ C31] ? ip_protocol_deliver_rcu+0x1b/0x270 >> Sep 15 04:32:29 205.254.184.12 [399661.975062][ C31] ? ip_local_deliver_finish+0x6d/0x90 >> Sep 15 04:32:29 205.254.184.12 [399661.976257][ C31] ? process_backlog+0x10c/0x230 >> Sep 15 04:32:29 205.254.184.12 [399661.976352][ C31] ? __napi_poll+0x20/0x180 >> Sep 15 04:32:29 205.254.184.12 [399661.976442][ C31] ? 
net_rx_action+0x2a4/0x390 >> Sep 15 04:32:29 205.254.184.12 [399661.976534][ C31] ? __do_softirq+0xd0/0x202 >> Sep 15 04:32:29 205.254.184.12 [399661.976626][ C31] ? do_softirq+0x3a/0x50 >> Sep 15 04:32:29 205.254.184.12 [399661.976718][ C31] </IRQ> >> Sep 15 04:32:29 205.254.184.12 [399661.976805][ C31] <TASK> >> Sep 15 04:32:29 205.254.184.12 [399661.976890][ C31] ? flush_smp_call_function_queue+0x3f/0x50 >> Sep 15 04:32:29 205.254.184.12 [399661.976988][ C31] ? do_idle+0x14d/0x210 >> Sep 15 04:32:29 205.254.184.12 [399661.977078][ C31] ? cpu_startup_entry+0x14/0x20 >> Sep 15 04:32:29 205.254.184.12 [399661.977168][ C31] ? start_secondary+0xe1/0xf0 >> Sep 15 04:32:29 205.254.184.12 [399661.977262][ C31] ? secondary_startup_64_no_verify+0x167/0x16b >> Sep 15 04:32:29 205.254.184.12 [399661.977359][ C31] </TASK> >> Sep 15 04:32:29 205.254.184.12 [399661.977448][ C31] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos >> Sep 15 04:32:29 205.254.184.12 [399661.977720][ C31] CR2: ffffa10c52d43058 >> Sep 15 04:32:29 205.254.184.12 [399661.977809][ C31] ---[ end trace 0000000000000000 ]--- >> Sep 15 04:32:29 205.254.184.12 [399661.977901][ C31] RIP: 0010:0xffffa10c52d43058 >> Sep 15 04:32:29 205.254.184.12 [399661.977992][ C31] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 30 d4 52 0c a1 ff ff 00 00 00 00 00 00 >> Sep 15 04:32:29 205.254.184.12 [399661.978150][ C31] RSP: 0018:ffffad0e0097ccc8 EFLAGS: 00010282 >> Sep 15 04:32:29 205.254.184.12 [399661.978243][ C31] RAX: ffffa10c52d43058 RBX: ffffa10c52d43000 RCX: 0000000000000000 >> Sep 15 04:32:29 
205.254.184.12 [399661.978358][ C31] RDX: 0000000000002712 RSI: 0000000000000246 RDI: ffffa10c52d43000 >> Sep 15 04:32:29 205.254.184.12 [399661.978472][ C31] RBP: ffffa10c52d43000 R08: 0000000127a83c46 R09: 0000000000004d8c >> Sep 15 04:32:29 205.254.184.12 [399661.978587][ C31] R10: ffffe840ca0f7c00 R11: 0000000000000000 R12: ffffa10c8e764d80 >> Sep 15 04:32:29 205.254.184.12 [399661.978702][ C31] R13: ffffa10c92b4c760 R14: 0000000000000058 R15: ffffa10c92b4c600 >> Sep 15 04:32:29 205.254.184.12 [399661.978818][ C31] FS: 0000000000000000(0000) GS:ffffa1125fdc0000(0000) knlGS:0000000000000000 >> Sep 15 04:32:29 205.254.184.12 [399661.978940][ C31] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> Sep 15 04:32:29 205.254.184.12 [399661.979036][ C31] CR2: ffffa10c52d43058 CR3: 00000001059b8001 CR4: 00000000003706e0 >> Sep 15 04:32:29 205.254.184.12 [399661.979150][ C31] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> Sep 15 04:32:29 205.254.184.12 [399661.979265][ C31] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> Sep 15 04:32:29 205.254.184.12 [399661.979381][ C31] Kernel panic - not syncing: Fatal exception in interrupt >> Sep 15 04:32:29 205.254.184.12 [399662.084038][ C31] Kernel Offset: 0x1e000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) >> Sep 15 04:32:29 205.254.184.12 [399662.084162][ C31] Rebooting in 10 seconds.. >> >> >> Please if find fix update me . >> >> m. > > ^ permalink raw reply [flat|nested] 35+ messages in thread
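For readers following the oops quoted above, the `#PF: error_code(0x0011)` line can be decoded by hand from the architectural x86 page-fault error-code bits. The following is only a minimal sketch and is not part of the original thread; the helper name `decode_pf_error_code` is hypothetical, and only the five lowest bits are handled.

```python
# Illustrative helper, not from the thread: decode the low bits of an
# x86 page-fault error code as printed in "#PF: error_code(0x....)".
# Each entry is (bit mask, text if the bit is set, text if it is clear).
PF_BITS = (
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "not a write"),
    (0x04, "user mode", "supervisor mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
)


def decode_pf_error_code(code):
    """Return human-readable descriptions for the five lowest error-code bits."""
    out = []
    for mask, if_set, if_clear in PF_BITS:
        text = if_set if code & mask else if_clear
        if text is not None:
            out.append(text)
    return out


if __name__ == "__main__":
    # Value taken from the "#PF: error_code(0x0011)" line in the report.
    print(decode_pf_error_code(0x0011))
```

For 0x0011 the set bits are "protection violation" and "instruction fetch" with the user-mode bit clear, which agrees with the "supervisor instruction fetch in kernel mode" and "permissions violation" lines in the log.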
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-15 23:11 ` Martin Zaharinov @ 2023-09-16 8:27 ` Paolo Abeni [not found] ` <CALidq=UR=3rOHZczCnb1bEhbt9So60UZ5y60Cdh4aP41FkB5Tw@mail.gmail.com> 0 siblings, 1 reply; 35+ messages in thread From: Paolo Abeni @ 2023-09-16 8:27 UTC (permalink / raw) To: Martin Zaharinov, netdev Cc: Eric Dumazet, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern On Sat, 2023-09-16 at 02:11 +0300, Martin Zaharinov wrote: > one more log: > > Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ > Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() > Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] > Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 You have out-of-tree modules taint in all the reports you shared. Please try to reproduce the issue without such taint, thanks! Paolo ^ permalink raw reply [flat|nested] 35+ messages in thread
[parent not found: <CALidq=UR=3rOHZczCnb1bEhbt9So60UZ5y60Cdh4aP41FkB5Tw@mail.gmail.com>]
* Re: Urgent Bug Report Kernel crash 6.5.2 [not found] ` <CALidq=UR=3rOHZczCnb1bEhbt9So60UZ5y60Cdh4aP41FkB5Tw@mail.gmail.com> @ 2023-09-17 11:35 ` Martin Zaharinov 2023-09-17 11:40 ` Martin Zaharinov 1 sibling, 0 replies; 35+ messages in thread From: Martin Zaharinov @ 2023-09-17 11:35 UTC (permalink / raw) To: Paolo Abeni Cc: netdev, Eric Dumazet, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern Hi Paolo and Eric, This is the latest crash, from kernel 6.5.3 without external modules. The first is the raw crash report, the second is the same report decoded: Sep 17 11:43:11 [127675.391688][ C2] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) Sep 17 11:43:11 [127675.391780][ C2] BUG: unable to handle page fault for address: ffff9bd9ff20f858 Sep 17 11:43:11 [127675.391859][ C2] #PF: supervisor instruction fetch in kernel mode Sep 17 11:43:11 [127675.391937][ C2] #PF: error_code(0x0011) - permissions violation Sep 17 11:43:11 [127675.392014][ C2] PGD 1a601067 P4D 1a601067 PUD 147b05063 PMD 800000023f2001e3 Sep 17 11:43:11 [127675.392099][ C2] Oops: 0011 [#1] PREEMPT SMP Sep 17 11:43:11 [127675.392173][ C2] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G O 6.5.3 #1 Sep 17 11:43:11 [127675.392257][ C2] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 Sep 17 11:43:11 [127675.392338][ C2] RIP: 0010:0xffff9bd9ff20f858 Sep 17 11:43:11 [127675.392413][ C2] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 f8 20 ff d9 9b ff ff 00 00 00 00 00 00 Sep 17 11:43:11 [127675.392540][ C2] RSP: 0018:ffffadfe0007ccc8 EFLAGS: 00010282 Sep 17 11:43:11 [127675.392635][ C2] RAX: ffff9bd9ff20f858 RBX: ffff9bd9ff20f800 RCX: 0000000000000000 Sep 17 11:43:11 [127675.392753][ C2] RDX: 0000000000002711 RSI: 0000000000000246 RDI: ffff9bd9ff20f800 Sep 17 11:43:11 [127675.392871][ C2] RBP: ffff9bd9ff20f800 R08: 000000010ca6060f R09: 00000000000079f2 
Sep 17 11:43:11 [127675.392988][ C2] R10: ffffd88b47077c00 R11: 0000000000000000 R12: ffff9bd9bb6ca1c0 Sep 17 11:43:11 [127675.393107][ C2] R13: ffff9bd9b9013760 R14: 0000000000000053 R15: ffff9bd9b9013600 Sep 17 11:43:11 [127675.393226][ C2] FS: 0000000000000000(0000) GS:ffff9be01f680000(0000) knlGS:0000000000000000 Sep 17 11:43:11 [127675.393347][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 17 11:43:11 [127675.393445][ C2] CR2: ffff9bd9ff20f858 CR3: 0000000234668001 CR4: 00000000003706e0 Sep 17 11:43:11 [127675.393562][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 17 11:43:11 [127675.393677][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Sep 17 11:43:11 [127675.393794][ C2] Call Trace: Sep 17 11:43:11 [127675.393886][ C2] <IRQ> Sep 17 11:43:11 [127675.393973][ C2] ? __die+0xe4/0xf0 Sep 17 11:43:11 [127675.394065][ C2] ? page_fault_oops+0x144/0x3e0 Sep 17 11:43:11 [127675.394157][ C2] ? exc_page_fault+0x92/0xa0 Sep 17 11:43:11 [127675.394251][ C2] ? asm_exc_page_fault+0x22/0x30 Sep 17 11:43:11 [127675.394347][ C2] ? kfree_skb_reason+0x33/0xf0 Sep 17 11:43:11 [127675.394443][ C2] ? tcp_mtu_probe+0x3a6/0x7b0 Sep 17 11:43:11 [127675.394539][ C2] ? tcp_write_xmit+0x7fa/0x1410 Sep 17 11:43:11 [127675.394634][ C2] ? __tcp_push_pending_frames+0x2d/0xb0 Sep 17 11:43:11 [127675.394727][ C2] ? tcp_rcv_established+0x205/0x610 Sep 17 11:43:11 [127675.394822][ C2] ? sk_filter_trim_cap+0xc6/0x1c0 Sep 17 11:43:11 [127675.394914][ C2] ? tcp_v4_do_rcv+0x11f/0x1f0 Sep 17 11:43:11 [127675.395007][ C2] ? tcp_v4_rcv+0xfa1/0x1010 Sep 17 11:43:11 [127675.395100][ C2] ? ip_protocol_deliver_rcu+0x1b/0x270 Sep 17 11:43:11 [127675.395196][ C2] ? ip_local_deliver_finish+0x6d/0x90 Sep 17 11:43:11 [127675.395289][ C2] ? process_backlog+0x10c/0x230 Sep 17 11:43:11 [127675.395386][ C2] ? __napi_poll+0x20/0x180 Sep 17 11:43:11 [127675.395478][ C2] ? net_rx_action+0x2a4/0x390 Sep 17 11:43:11 [127675.395572][ C2] ? 
__do_softirq+0xd0/0x202 Sep 17 11:43:11 [127675.395666][ C2] ? do_softirq+0x3a/0x50 Sep 17 11:43:11 [127675.395760][ C2] </IRQ> Sep 17 11:43:11 [127675.395849][ C2] <TASK> Sep 17 11:43:11 [127675.395939][ C2] ? flush_smp_call_function_queue+0x3f/0x50 Sep 17 11:43:11 [127675.396039][ C2] ? do_idle+0x14d/0x210 Sep 17 11:43:11 [127675.396132][ C2] ? cpu_startup_entry+0x14/0x20 Sep 17 11:43:11 [127675.396224][ C2] ? start_secondary+0xe1/0xf0 Sep 17 11:43:11 [127675.396318][ C2] ? secondary_startup_64_no_verify+0x167/0x16b Sep 17 11:43:11 [127675.396417][ C2] </TASK> Sep 17 11:43:11 [127675.396504][ C2] Modules linked in: nf_conntrack_netlink nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos Sep 17 11:43:11 [127675.396775][ C2] CR2: ffff9bd9ff20f858 Sep 17 11:43:11 [127675.396868][ C2] ---[ end trace 0000000000000000 ]--- Sep 17 11:43:11 [127675.396961][ C2] RIP: 0010:0xffff9bd9ff20f858 Sep 17 11:43:11 [127675.397052][ C2] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 f8 20 ff d9 9b ff ff 00 00 00 00 00 00 Sep 17 11:43:11 [127675.397211][ C2] RSP: 0018:ffffadfe0007ccc8 EFLAGS: 00010282 Sep 17 11:43:11 [127675.397305][ C2] RAX: ffff9bd9ff20f858 RBX: ffff9bd9ff20f800 RCX: 0000000000000000 Sep 17 11:43:11 [127675.397419][ C2] RDX: 0000000000002711 RSI: 0000000000000246 RDI: ffff9bd9ff20f800 Sep 17 11:43:11 [127675.397535][ C2] RBP: ffff9bd9ff20f800 R08: 000000010ca6060f R09: 00000000000079f2 Sep 17 11:43:11 [127675.397651][ C2] R10: ffffd88b47077c00 R11: 0000000000000000 R12: ffff9bd9bb6ca1c0 Sep 17 11:43:11 [127675.397767][ C2] R13: ffff9bd9b9013760 R14: 0000000000000053 R15: 
ffff9bd9b9013600 Sep 17 11:43:11 [127675.397886][ C2] FS: 0000000000000000(0000) GS:ffff9be01f680000(0000) knlGS:0000000000000000 Sep 17 11:43:11 [127675.398006][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 17 11:43:11 [127675.398101][ C2] CR2: ffff9bd9ff20f858 CR3: 0000000234668001 CR4: 00000000003706e0 Sep 17 11:43:11 [127675.398217][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 17 11:43:11 [127675.398334][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Sep 17 11:43:11 [127675.398451][ C2] Kernel panic - not syncing: Fatal exception in interrupt Sep 17 11:43:11 [127675.503611][ C2] Kernel Offset: 0x20000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Sep 17 11:43:11 [127675.503734][ C2] Rebooting in 10 seconds.. Second with decode: Sep 17 11:43:11 [127675.391688][ C2] kernel tried to execute NX-protected page - exploit attempt? (uid: 0) Sep 17 11:43:11 [127675.391780][ C2] BUG: unable to handle page fault for address: ffff9bd9ff20f858 Sep 17 11:43:11 [127675.391859][ C2] #PF: supervisor instruction fetch in kernel mode Sep 17 11:43:11 [127675.391937][ C2] #PF: error_code(0x0011) - permissions violation Sep 17 11:43:11 [127675.392014][ C2] PGD 1a601067 P4D 1a601067 PUD 147b05063 PMD 800000023f2001e3 Sep 17 11:43:11 [127675.392099][ C2] Oops: 0011 [#1] PREEMPT SMP Sep 17 11:43:11 [127675.392173][ C2] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G O 6.5.3 #1 Sep 17 11:43:11 [127675.392257][ C2] Hardware name: Supermicro SYS-5038MR-H8TRF/X10SRD-F, BIOS 3.3 10/28/2020 Sep 17 11:43:11 [127675.392338][ C2] RIP: 0010:0xffff9bd9ff20f858 Sep 17 11:43:11 [127675.392413][ C2] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 f8 20 ff d9 9b ff ff 00 00 00 00 00 00 All code ======== ... 
30: 00 00 add %al,(%rax) 32: 58 pop %rax 33:* f8 clc <-- trapping instruction 34: 20 ff and %bh,%bh 36: d9 9b ff ff 00 00 fstps 0xffff(%rbx) 3c: 00 00 add %al,(%rax) ... Code starting with the faulting instruction =========================================== ... 8: 58 pop %rax 9: f8 clc a: 20 ff and %bh,%bh c: d9 9b ff ff 00 00 fstps 0xffff(%rbx) 12: 00 00 add %al,(%rax) ... Sep 17 11:43:11 [127675.392540][ C2] RSP: 0018:ffffadfe0007ccc8 EFLAGS: 00010282 Sep 17 11:43:11 [127675.392635][ C2] RAX: ffff9bd9ff20f858 RBX: ffff9bd9ff20f800 RCX: 0000000000000000 Sep 17 11:43:11 [127675.392753][ C2] RDX: 0000000000002711 RSI: 0000000000000246 RDI: ffff9bd9ff20f800 Sep 17 11:43:11 [127675.392871][ C2] RBP: ffff9bd9ff20f800 R08: 000000010ca6060f R09: 00000000000079f2 Sep 17 11:43:11 [127675.392988][ C2] R10: ffffd88b47077c00 R11: 0000000000000000 R12: ffff9bd9bb6ca1c0 Sep 17 11:43:11 [127675.393107][ C2] R13: ffff9bd9b9013760 R14: 0000000000000053 R15: ffff9bd9b9013600 Sep 17 11:43:11 [127675.393226][ C2] FS: 0000000000000000(0000) GS:ffff9be01f680000(0000) knlGS:0000000000000000 Sep 17 11:43:11 [127675.393347][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 17 11:43:11 [127675.393445][ C2] CR2: ffff9bd9ff20f858 CR3: 0000000234668001 CR4: 00000000003706e0 Sep 17 11:43:11 [127675.393562][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 17 11:43:11 [127675.393677][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Sep 17 11:43:11 [127675.393794][ C2] Call Trace: Sep 17 11:43:11 [127675.393886][ C2] <IRQ> Sep 17 11:43:11 [127675.393973][ C2] ? __die (arch/x86/kernel/dumpstack.c:478 (discriminator 1) arch/x86/kernel/dumpstack.c:465 (discriminator 1) arch/x86/kernel/dumpstack.c:420 (discriminator 1) arch/x86/kernel/dumpstack.c:434 (discriminator 1)) Sep 17 11:43:11 [127675.394065][ C2] ? page_fault_oops (arch/x86/mm/fault.c:703) Sep 17 11:43:11 [127675.394157][ C2] ? 
exc_page_fault (arch/x86/mm/fault.c:48 (discriminator 2) arch/x86/mm/fault.c:1479 (discriminator 2) arch/x86/mm/fault.c:1542 (discriminator 2)) Sep 17 11:43:11 [127675.394251][ C2] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570) Sep 17 11:43:11 [127675.394347][ C2] ? kfree_skb_reason (net/core/skbuff.c:1006 net/core/skbuff.c:1022 net/core/skbuff.c:1058) Sep 17 11:43:11 [127675.394443][ C2] ? tcp_mtu_probe (./include/net/sock.h:1627 (discriminator 1) net/ipv4/tcp_output.c:2338 (discriminator 1) net/ipv4/tcp_output.c:2463 (discriminator 1)) Sep 17 11:43:11 [127675.394539][ C2] ? tcp_write_xmit (net/ipv4/tcp_output.c:2678) Sep 17 11:43:11 [127675.394634][ C2] ? __tcp_push_pending_frames (net/ipv4/tcp_output.c:2940 (discriminator 1)) Sep 17 11:43:11 [127675.394727][ C2] ? tcp_rcv_established (./include/net/tcp.h:2033 net/ipv4/tcp_input.c:5545 net/ipv4/tcp_input.c:6065) Sep 17 11:43:11 [127675.394822][ C2] ? sk_filter_trim_cap (./include/linux/rcupdate.h:781 net/core/filter.c:157) Sep 17 11:43:11 [127675.394914][ C2] ? tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1728) Sep 17 11:43:11 [127675.395007][ C2] ? tcp_v4_rcv (./include/net/tcp.h:2342 (discriminator 1) net/ipv4/tcp_ipv4.c:2147 (discriminator 1)) Sep 17 11:43:11 [127675.395100][ C2] ? ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205) Sep 17 11:43:11 [127675.395196][ C2] ? ip_local_deliver_finish (net/ipv4/ip_input.c:233 (discriminator 1)) Sep 17 11:43:11 [127675.395289][ C2] ? process_backlog (net/core/dev.c:5451 net/core/dev.c:5566 net/core/dev.c:5895) Sep 17 11:43:11 [127675.395386][ C2] ? __napi_poll+0x20/0x180 Sep 17 11:43:11 [127675.395478][ C2] ? net_rx_action (net/core/dev.c:5839 net/core/dev.c:5860 net/core/dev.c:6684) Sep 17 11:43:11 [127675.395572][ C2] ? __do_softirq (./arch/x86/include/asm/bitops.h:319 kernel/softirq.c:550) Sep 17 11:43:11 [127675.395666][ C2] ? 
do_softirq (kernel/softirq.c:463 (discriminator 32)) Sep 17 11:43:11 [127675.395760][ C2] </IRQ> Sep 17 11:43:11 [127675.395849][ C2] <TASK> Sep 17 11:43:11 [127675.395939][ C2] ? flush_smp_call_function_queue (kernel/smp.c:563 (discriminator 1)) Sep 17 11:43:11 [127675.396039][ C2] ? do_idle (kernel/sched/idle.c:295) Sep 17 11:43:11 [127675.396132][ C2] ? cpu_startup_entry (kernel/sched/idle.c:379 (discriminator 1)) Sep 17 11:43:11 [127675.396224][ C2] ? start_secondary (arch/x86/kernel/smpboot.c:326) Sep 17 11:43:11 [127675.396318][ C2] ? secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441) Sep 17 11:43:11 [127675.396417][ C2] </TASK> Sep 17 11:43:11 [127675.396504][ C2] Modules linked in: nf_conntrack_netlink nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos Sep 17 11:43:11 [127675.396775][ C2] CR2: ffff9bd9ff20f858 Sep 17 11:43:11 [127675.396868][ C2] ---[ end trace 0000000000000000 ]--- Sep 17 11:43:11 [127675.396961][ C2] RIP: 0010:0xffff9bd9ff20f858 Sep 17 11:43:11 [127675.397052][ C2] Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 58 f8 20 ff d9 9b ff ff 00 00 00 00 00 00 All code ======== ... 30: 00 00 add %al,(%rax) 32: 58 pop %rax 33:* f8 clc <-- trapping instruction 34: 20 ff and %bh,%bh 36: d9 9b ff ff 00 00 fstps 0xffff(%rbx) 3c: 00 00 add %al,(%rax) ... Code starting with the faulting instruction =========================================== ... 8: 58 pop %rax 9: f8 clc a: 20 ff and %bh,%bh c: d9 9b ff ff 00 00 fstps 0xffff(%rbx) 12: 00 00 add %al,(%rax) ... 
Sep 17 11:43:11 [127675.397211][ C2] RSP: 0018:ffffadfe0007ccc8 EFLAGS: 00010282 Sep 17 11:43:11 [127675.397305][ C2] RAX: ffff9bd9ff20f858 RBX: ffff9bd9ff20f800 RCX: 0000000000000000 Sep 17 11:43:11 [127675.397419][ C2] RDX: 0000000000002711 RSI: 0000000000000246 RDI: ffff9bd9ff20f800 Sep 17 11:43:11 [127675.397535][ C2] RBP: ffff9bd9ff20f800 R08: 000000010ca6060f R09: 00000000000079f2 Sep 17 11:43:11 [127675.397651][ C2] R10: ffffd88b47077c00 R11: 0000000000000000 R12: ffff9bd9bb6ca1c0 Sep 17 11:43:11 [127675.397767][ C2] R13: ffff9bd9b9013760 R14: 0000000000000053 R15: ffff9bd9b9013600 Sep 17 11:43:11 [127675.397886][ C2] FS: 0000000000000000(0000) GS:ffff9be01f680000(0000) knlGS:0000000000000000 Sep 17 11:43:11 [127675.398006][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Sep 17 11:43:11 [127675.398101][ C2] CR2: ffff9bd9ff20f858 CR3: 0000000234668001 CR4: 00000000003706e0 Sep 17 11:43:11 [127675.398217][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 17 11:43:11 [127675.398334][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Sep 17 11:43:11 [127675.398451][ C2] Kernel panic - not syncing: Fatal exception in interrupt Sep 17 11:43:11 [127675.503611][ C2] Kernel Offset: 0x20000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Sep 17 11:43:11 [127675.503734][ C2] Rebooting in 10 seconds.. P.S. I installed this kernel on 5 machines with different hardware and the same happens on every one. Best regards, m. > On 16 Sep 2023, at 12:04, Martin Zaharinov <micron10@gmail.com> wrote: > > Hi Paolo > > in first report machine dont have out of tree module > > this bug is come after move from kernel 6.2 to 6.3 > > m. 
> > On Sat, Sep 16, 2023, 11:27 Paolo Abeni <pabeni@redhat.com> wrote: > On Sat, 2023-09-16 at 02:11 +0300, Martin Zaharinov wrote: > > one more log: > > > > Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ > > Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() > > Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > > Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] > > Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 > > > You have out-of-tree modules taint in all the reports you shared. Please > try to reproduce the issue without such taint, thanks! > > Paolo > ^ permalink raw reply [flat|nested] 35+ messages in thread
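Editorial aside on the decode in the 2023-09-17 11:35 message (not part of the original mails): the bytes that the decoder renders as "pop %rax / clc / and %bh,%bh / fstps" are unlikely to be real instructions. Read as one little-endian 64-bit value, the eight bytes that begin 8 bytes past the trapping byte equal the faulting address reported in RIP and CR2, which suggests the CPU jumped into a data object rather than into code. The following is a minimal Python sketch of that check; the helper name pointer_after_trap and the shortened CODE string are illustrative assumptions, with the byte values copied from the 6.5.3 report.

```python
import struct

def pointer_after_trap(code_line: str, skip: int = 8) -> int:
    """Return the little-endian u64 found `skip` bytes after the <xx> trap marker.

    Hypothetical helper for illustration; `code_line` is the hex dump that
    follows "Code:" in an oops, where <xx> marks the byte at RIP.
    """
    tokens = code_line.split()
    trap = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    raw = bytes(int(tok.strip("<>"), 16) for tok in tokens[trap + skip: trap + skip + 8])
    return struct.unpack("<Q", raw)[0]

# Tail of the "Code:" line from the 6.5.3 oops (leading zero bytes shortened).
CODE = "00 00 <00> 00 00 00 00 00 00 00 58 f8 20 ff d9 9b ff ff 00 00 00 00 00 00"
RIP = 0xFFFF9BD9FF20F858  # "RIP: 0010:0xffff9bd9ff20f858" in the same report

value = pointer_after_trap(CODE)
print(hex(value), value == RIP)  # prints: 0xffff9bd9ff20f858 True
```

The same report shows RAX equal to RIP and RBX equal to RIP minus 0x58, which fits the reading that the call target was loaded from inside a heap object; the sketch only checks the byte pattern and does not identify that object.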
* Re: Urgent Bug Report Kernel crash 6.5.2 [not found] ` <CALidq=UR=3rOHZczCnb1bEhbt9So60UZ5y60Cdh4aP41FkB5Tw@mail.gmail.com> 2023-09-17 11:35 ` Martin Zaharinov @ 2023-09-17 11:40 ` Martin Zaharinov 2023-09-17 11:55 ` Martin Zaharinov 1 sibling, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-09-17 11:40 UTC (permalink / raw) To: Paolo Abeni Cc: netdev, Eric Dumazet, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Paolo Abeni, Pablo Neira Ayuso One more thing: in the changelog for kernel 6.5 (https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.5) I see many bug reports containing: Sep 17 11:43:11 [127675.395289][ C2] ? process_backlog+0x10c/0x230 Sep 17 11:43:11 [127675.395386][ C2] ? __napi_poll+0x20/0x180 Sep 17 11:43:11 [127675.395478][ C2] ? net_rx_action+0x2a4/0x390 All servers have simple nftables rules; the Ethernet card is an Intel XL710 or 82599. It is a very simple config. m. > On 16 Sep 2023, at 12:04, Martin Zaharinov <micron10@gmail.com> wrote: > > Hi Paolo > > in first report machine dont have out of tree module > > this bug is come after move from kernel 6.2 to 6.3 > > m. 
> > On Sat, Sep 16, 2023, 11:27 Paolo Abeni <pabeni@redhat.com> wrote: > On Sat, 2023-09-16 at 02:11 +0300, Martin Zaharinov wrote: > > one more log: > > > > Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ > > Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() > > Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > > Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] > > Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 > > > You have out-of-tree modules taint in all the reports you shared. Please > try to reproduce the issue without such taint, thanks! > > Paolo > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-17 11:40 ` Martin Zaharinov @ 2023-09-17 11:55 ` Martin Zaharinov 2023-09-17 12:04 ` Holger Hoffstätte ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Martin Zaharinov @ 2023-09-17 11:55 UTC (permalink / raw) To: Paolo Abeni Cc: netdev, Eric Dumazet, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso Hi Eric, is it possible that the bug comes from this patch: https://patchwork.kernel.org/project/netdevbpf/cover/20230911170531.828100-1-edumazet@google.com/ m. > On 17 Sep 2023, at 14:40, Martin Zaharinov <micron10@gmail.com> wrote: > > One more in changelog for kernel 6.5 : https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.5 > > I see have many bug reports with : > > Sep 17 11:43:11 [127675.395289][ C2] ? process_backlog+0x10c/0x230 > Sep 17 11:43:11 [127675.395386][ C2] ? __napi_poll+0x20/0x180 > Sep 17 11:43:11 [127675.395478][ C2] ? net_rx_action+0x2a4/0x390 > > > In all server have simple nftables rulls , ethernet card is intel xl710 or 82599. its a very simple config. > > m. > > > > >> On 16 Sep 2023, at 12:04, Martin Zaharinov <micron10@gmail.com> wrote: >> >> Hi Paolo >> >> in first report machine dont have out of tree module >> >> this bug is come after move from kernel 6.2 to 6.3 >> >> m. 
>> >> On Sat, Sep 16, 2023, 11:27 Paolo Abeni <pabeni@redhat.com> wrote: >> On Sat, 2023-09-16 at 02:11 +0300, Martin Zaharinov wrote: >>> one more log: >>> >>> Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ >>> Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() >>> Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) >>> Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] >>> Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 >> >> >> You have out-of-tree modules taint in all the reports you shared. Please >> try to reproduce the issue without such taint, thanks! >> >> Paolo >> > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-17 11:55 ` Martin Zaharinov @ 2023-09-17 12:04 ` Holger Hoffstätte 2023-09-18 8:09 ` Eric Dumazet 2023-09-19 20:09 ` Martin Zaharinov 2 siblings, 0 replies; 35+ messages in thread From: Holger Hoffstätte @ 2023-09-17 12:04 UTC (permalink / raw) To: netdev On Sun, 17 Sep 2023 14:55:25 +0300, Martin Zaharinov wrote: > Hi Eric > is it possible bug to come from this patch : https://patchwork.kernel.org/project/netdevbpf/cover/20230911170531.828100-1-edumazet@google.com/ No, because 1) those patches are not in any released kernel 2) they work fine Holger ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-17 11:55 ` Martin Zaharinov 2023-09-17 12:04 ` Holger Hoffstätte @ 2023-09-18 8:09 ` Eric Dumazet 2023-09-19 20:09 ` Martin Zaharinov 2 siblings, 0 replies; 35+ messages in thread From: Eric Dumazet @ 2023-09-18 8:09 UTC (permalink / raw) To: Martin Zaharinov Cc: Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso On Sun, Sep 17, 2023 at 1:55 PM Martin Zaharinov <micron10@gmail.com> wrote: > > Hi Eric > is it possible bug to come from this patch : https://patchwork.kernel.org/project/netdevbpf/cover/20230911170531.828100-1-edumazet@google.com/ > > Everything is possible, but this is not in 6.5 kernels. I would suggest you start a bisection. > m. > > > On 17 Sep 2023, at 14:40, Martin Zaharinov <micron10@gmail.com> wrote: > > > > One more in changelog for kernel 6.5 : https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.5 > > > > I see have many bug reports with : > > > > Sep 17 11:43:11 [127675.395289][ C2] ? process_backlog+0x10c/0x230 > > Sep 17 11:43:11 [127675.395386][ C2] ? __napi_poll+0x20/0x180 > > Sep 17 11:43:11 [127675.395478][ C2] ? net_rx_action+0x2a4/0x390 > > > > > > In all server have simple nftables rulls , ethernet card is intel xl710 or 82599. its a very simple config. > > > > m. > > > > > > > > > >> On 16 Sep 2023, at 12:04, Martin Zaharinov <micron10@gmail.com> wrote: > >> > >> Hi Paolo > >> > >> in first report machine dont have out of tree module > >> > >> this bug is come after move from kernel 6.2 to 6.3 > >> > >> m. 
> >> > >> On Sat, Sep 16, 2023, 11:27 Paolo Abeni <pabeni@redhat.com> wrote: > >> On Sat, 2023-09-16 at 02:11 +0300, Martin Zaharinov wrote: > >>> one more log: > >>> > >>> Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ > >>> Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() > >>> Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > >>> Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] > >>> Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 > >> > >> > >> You have out-of-tree modules taint in all the reports you shared. Please > >> try to reproduce the issue without such taint, thanks! > >> > >> Paolo > >> > > > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-17 11:55 ` Martin Zaharinov 2023-09-17 12:04 ` Holger Hoffstätte 2023-09-18 8:09 ` Eric Dumazet @ 2023-09-19 20:09 ` Martin Zaharinov 2023-09-20 3:59 ` Eric Dumazet 2 siblings, 1 reply; 35+ messages in thread From: Martin Zaharinov @ 2023-09-19 20:09 UTC (permalink / raw) To: Paolo Abeni Cc: netdev, Eric Dumazet, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso Hi Eric, Yes, this patch is not in the 6.5 kernel and is queued for 6.6; I tested it, but it is not OK so far. One more thing: I found the same error on an older kernel, 6.4.8; I updated to kernel 6.5.4 and the same error occurs. This looks like a hard-to-catch bug, see the logs: [1462610.861373] ------------[ cut here ]------------ [1462610.861480] rcuref - imbalanced put() [1462610.861491] WARNING: CPU: 22 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath+0x5f/0x70 [1462610.861718] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [1462610.862004] CPU: 22 PID: 0 Comm: swapper/22 Tainted: G O 6.4.8 #1 [1462610.863244] Hardware name: Supermicro Super Server/X10SRW-F, BIOS 3.4 06/05/2021 [1462610.863368] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 [1462610.863469] Code: 31 c0 eb e2 80 3d 02 cd e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 7f 68 e5 a4 c6 05 e8 cc e6 00 01 e8 e1 ab c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 [1462610.863637] RSP: 0018:ffffaee60070cc38 EFLAGS: 00010292 [1462610.863736] RAX: 0000000000000019 RBX: ffffa1cdc35e5780 RCX: 00000000fff7ffff [1462610.863857] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea [1462610.864129] RBP: ffffa1cf6aeb8de8 R08: 
0000000000000000 R09: 00000000fff7ffff [1462610.864250] R10: ffffa1d51b000000 R11: 0000000000000003 R12: ffffa1cdc35e5740 [1462610.864370] R13: ffffa1cdc35e57a8 R14: ffffa1d51fda9008 R15: 00000000ade2eb6e [1462610.864489] FS: 0000000000000000(0000) GS:ffffa1d51fd80000(0000) knlGS:0000000000000000 [1462610.864615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1462610.864713] CR2: 00007f057b8ad000 CR3: 0000000141881003 CR4: 00000000001706e0 [1462610.864833] Call Trace: [1462610.864928] <IRQ> [1462610.865021] ? __warn+0x6c/0x130 [1462610.865124] ? report_bug+0x1e4/0x260 [1462610.865223] ? handle_bug+0x36/0x70 [1462610.865318] ? exc_invalid_op+0x17/0x1a0 [1462610.865414] ? asm_exc_invalid_op+0x16/0x20 [1462610.865517] ? rcuref_put_slowpath+0x5f/0x70 [1462610.865618] ? rcuref_put_slowpath+0x5f/0x70 [1462610.865719] dst_release+0x2c/0x60 [1462610.865817] rt_cache_route+0xbd/0xf0 [1462610.865913] rt_set_nexthop.isra.0+0x1b6/0x440 [1462610.866008] ip_route_input_slow+0x90e/0xc60 [1462610.866116] ? nf_conntrack_udp_packet+0x16c/0x230 [nf_conntrack] [1462610.866229] ip_route_input_noref+0xed/0x100 [1462610.866328] ip_rcv_finish_core.isra.0+0xb1/0x410 [1462610.866425] ip_rcv+0xed/0x130 [1462610.866522] ? 
ip_rcv_core.constprop.0+0x350/0x350 [1462610.866621] process_backlog+0x10c/0x230 [1462610.866719] __napi_poll+0x20/0x180 [1462610.866818] net_rx_action+0x2a4/0x390 [1462610.866921] __do_softirq+0xd0/0x202 [1462610.867020] do_softirq+0x58/0x80 [1462610.867116] </IRQ> [1462610.867206] <TASK> [1462610.867298] flush_smp_call_function_queue+0x3f/0x60 [1462610.867403] do_idle+0x14d/0x210 [1462610.867500] cpu_startup_entry+0x14/0x20 [1462610.867602] start_secondary+0xec/0xf0 [1462610.867701] secondary_startup_64_no_verify+0xf9/0xfb [1462610.867799] </TASK> [1462610.867891] ---[ end trace 0000000000000000 ]--- And this is 6.5.4: [39651.441371] ------------[ cut here ]------------ [39651.441455] rcuref - imbalanced put() [39651.441470] WARNING: CPU: 12 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath+0x5f/0x70 [39651.441633] Modules linked in: nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp igb i2c_algo_bit i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [39651.441805] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G O 6.5.3 #1 [39651.441911] Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./EP2C612D8, BIOS P2.30 04/30/2018 [39651.442035] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 [39651.442131] Code: 31 c0 eb e2 80 3d 86 ae e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 68 f6 e2 9a c6 05 6c ae e6 00 01 e8 11 71 c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 [39651.442294] RSP: 0018:ffffbb9a404b4de8 EFLAGS: 00010296 [39651.442390] RAX: 0000000000000019 RBX: ffffa13ac9a32640 RCX: 00000000fff7ffff [39651.442513] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea [39651.442630] RBP: ffffa13a44a04000 R08: 0000000000000000 R09: 00000000fff7ffff [39651.442748] R10: ffffa1419ae00000 R11: 0000000000000003 R12: ffffa13ab640bec0 [39651.442866] R13: 0000000000000000 R14: 0000000000000010 R15: ffffbb9a404b4f60 [39651.442985] FS: 0000000000000000(0000) GS:ffffa1419f900000(0000) knlGS:0000000000000000 [39651.443106] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [39651.443201] CR2: 0000564f9e23f6e0 CR3: 000000010bcea002 CR4: 00000000003706e0 [39651.443319] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [39651.443438] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [39651.443558] Call Trace: [39651.443647] <IRQ> [39651.443736] ? __warn+0x6c/0x130 [39651.443829] ? report_bug+0x1e4/0x260 [39651.443924] ? handle_bug+0x36/0x70 [39651.444016] ? exc_invalid_op+0x17/0x1a0 [39651.444109] ? asm_exc_invalid_op+0x16/0x20 [39651.444202] ? rcuref_put_slowpath+0x5f/0x70 [39651.444297] ? rcuref_put_slowpath+0x5f/0x70 [39651.444391] dst_release+0x2c/0x60 [39651.444487] __dev_queue_xmit+0x56c/0xbd0 [39651.444582] ? 
nf_hook_slow+0x36/0xa0 [39651.444675] ip_finish_output2+0x27b/0x520 [39651.444770] process_backlog+0x10c/0x230 [39651.444866] __napi_poll+0x20/0x180 [39651.444961] net_rx_action+0x2a4/0x390 [39651.445055] __do_softirq+0xd0/0x202 [39651.445148] do_softirq+0x3a/0x50 [39651.445241] </IRQ> [39651.445329] <TASK> [39651.445416] flush_smp_call_function_queue+0x3f/0x50 [39651.445516] do_idle+0x14d/0x210 [39651.445609] cpu_startup_entry+0x14/0x20 [39651.445702] start_secondary+0xe1/0xf0 [39651.445797] secondary_startup_64_no_verify+0x167/0x16b [39651.445893] </TASK> [39651.445982] ---[ end trace 0000000000000000 ]— best regards, Martin > On 17 Sep 2023, at 14:55, Martin Zaharinov <micron10@gmail.com> wrote: > > Hi Eric > is it possible bug to come from this patch : https://patchwork.kernel.org/project/netdevbpf/cover/20230911170531.828100-1-edumazet@google.com/ > > > m. > >> On 17 Sep 2023, at 14:40, Martin Zaharinov <micron10@gmail.com> wrote: >> >> One more in changelog for kernel 6.5 : https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.5 >> >> I see have many bug reports with : >> >> Sep 17 11:43:11 [127675.395289][ C2] ? process_backlog+0x10c/0x230 >> Sep 17 11:43:11 [127675.395386][ C2] ? __napi_poll+0x20/0x180 >> Sep 17 11:43:11 [127675.395478][ C2] ? net_rx_action+0x2a4/0x390 >> >> >> In all server have simple nftables rulls , ethernet card is intel xl710 or 82599. its a very simple config. >> >> m. >> >> >> >> >>> On 16 Sep 2023, at 12:04, Martin Zaharinov <micron10@gmail.com> wrote: >>> >>> Hi Paolo >>> >>> in first report machine dont have out of tree module >>> >>> this bug is come after move from kernel 6.2 to 6.3 >>> >>> m. 
>>> >>> On Sat, Sep 16, 2023, 11:27 Paolo Abeni <pabeni@redhat.com> wrote: >>> On Sat, 2023-09-16 at 02:11 +0300, Martin Zaharinov wrote: >>>> one more log: >>>> >>>> Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ >>>> Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() >>>> Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) >>>> Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] >>>> Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 >>> >>> >>> You have out-of-tree modules taint in all the report you shared. Please >>> try to reproduce the issue with such taint, thanks! >>> >>> Paolo >>> >> > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2 2023-09-19 20:09 ` Martin Zaharinov @ 2023-09-20 3:59 ` Eric Dumazet 2023-09-20 6:05 ` Martin Zaharinov 0 siblings, 1 reply; 35+ messages in thread From: Eric Dumazet @ 2023-09-20 3:59 UTC (permalink / raw) To: Martin Zaharinov Cc: Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso On Tue, Sep 19, 2023 at 10:09 PM Martin Zaharinov <micron10@gmail.com> wrote: > > Hi Eric > > Yes this patch is not come in 6.5 kernel and queue for 6.6 i test but not ok for now. "not ok for now" ? What does this mean? Pointing out patches that are not related to your issue is a waste of time. If this was to bring my attention, this is a bad strategy, because I will probably not read your future emails. > > One more i find same error have in old kernel 6.4.8 , update to kernel 6.5.4 and same error is come . > > Like this is hard to catch bug > > see logs : > > > [1462610.861373] ------------[ cut here ]------------ > [1462610.861480] rcuref - imbalanced put() > [1462610.861491] WARNING: CPU: 22 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath+0x5f/0x70 > [1462610.861718] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > [1462610.862004] CPU: 22 PID: 0 Comm: swapper/22 Tainted: G O 6.4.8 #1 > [1462610.863244] Hardware name: Supermicro Super Server/X10SRW-F, BIOS 3.4 06/05/2021 > [1462610.863368] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 > [1462610.863469] Code: 31 c0 eb e2 80 3d 02 cd e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 7f 68 e5 a4 c6 05 e8 cc e6 00 01 e8 e1 ab c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 
e2 > [1462610.863637] RSP: 0018:ffffaee60070cc38 EFLAGS: 00010292 > [1462610.863736] RAX: 0000000000000019 RBX: ffffa1cdc35e5780 RCX: 00000000fff7ffff > [1462610.863857] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea > [1462610.864129] RBP: ffffa1cf6aeb8de8 R08: 0000000000000000 R09: 00000000fff7ffff > [1462610.864250] R10: ffffa1d51b000000 R11: 0000000000000003 R12: ffffa1cdc35e5740 > [1462610.864370] R13: ffffa1cdc35e57a8 R14: ffffa1d51fda9008 R15: 00000000ade2eb6e > [1462610.864489] FS: 0000000000000000(0000) GS:ffffa1d51fd80000(0000) knlGS:0000000000000000 > [1462610.864615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [1462610.864713] CR2: 00007f057b8ad000 CR3: 0000000141881003 CR4: 00000000001706e0 > [1462610.864833] Call Trace: > [1462610.864928] <IRQ> > [1462610.865021] ? __warn+0x6c/0x130 > [1462610.865124] ? report_bug+0x1e4/0x260 > [1462610.865223] ? handle_bug+0x36/0x70 > [1462610.865318] ? exc_invalid_op+0x17/0x1a0 > [1462610.865414] ? asm_exc_invalid_op+0x16/0x20 > [1462610.865517] ? rcuref_put_slowpath+0x5f/0x70 > [1462610.865618] ? rcuref_put_slowpath+0x5f/0x70 > [1462610.865719] dst_release+0x2c/0x60 > [1462610.865817] rt_cache_route+0xbd/0xf0 > [1462610.865913] rt_set_nexthop.isra.0+0x1b6/0x440 > [1462610.866008] ip_route_input_slow+0x90e/0xc60 > [1462610.866116] ? nf_conntrack_udp_packet+0x16c/0x230 [nf_conntrack] > [1462610.866229] ip_route_input_noref+0xed/0x100 > [1462610.866328] ip_rcv_finish_core.isra.0+0xb1/0x410 > [1462610.866425] ip_rcv+0xed/0x130 > [1462610.866522] ? 
ip_rcv_core.constprop.0+0x350/0x350 > [1462610.866621] process_backlog+0x10c/0x230 > [1462610.866719] __napi_poll+0x20/0x180 > [1462610.866818] net_rx_action+0x2a4/0x390 > [1462610.866921] __do_softirq+0xd0/0x202 > [1462610.867020] do_softirq+0x58/0x80 > [1462610.867116] </IRQ> > [1462610.867206] <TASK> > [1462610.867298] flush_smp_call_function_queue+0x3f/0x60 > [1462610.867403] do_idle+0x14d/0x210 > [1462610.867500] cpu_startup_entry+0x14/0x20 > [1462610.867602] start_secondary+0xec/0xf0 > [1462610.867701] secondary_startup_64_no_verify+0xf9/0xfb > [1462610.867799] </TASK> > [1462610.867891] ---[ end trace 0000000000000000 ]— > > > And this si 6.5.4 : > > [39651.441371] ------------[ cut here ]------------ > [39651.441455] rcuref - imbalanced put() > [39651.441470] WARNING: CPU: 12 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath+0x5f/0x70 > [39651.441633] Modules linked in: nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp igb i2c_algo_bit i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos > [39651.441805] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G O 6.5.3 #1 > [39651.441911] Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./EP2C612D8, BIOS P2.30 04/30/2018 > [39651.442035] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 > [39651.442131] Code: 31 c0 eb e2 80 3d 86 ae e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 68 f6 e2 9a c6 05 6c ae e6 00 01 e8 11 71 c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 > [39651.442294] RSP: 0018:ffffbb9a404b4de8 EFLAGS: 00010296 > [39651.442390] RAX: 0000000000000019 RBX: ffffa13ac9a32640 RCX: 00000000fff7ffff > [39651.442513] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea > [39651.442630] RBP: ffffa13a44a04000 R08: 0000000000000000 R09: 00000000fff7ffff > [39651.442748] R10: ffffa1419ae00000 R11: 0000000000000003 R12: ffffa13ab640bec0 > [39651.442866] R13: 0000000000000000 R14: 0000000000000010 R15: ffffbb9a404b4f60 > [39651.442985] FS: 0000000000000000(0000) GS:ffffa1419f900000(0000) knlGS:0000000000000000 > [39651.443106] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [39651.443201] CR2: 0000564f9e23f6e0 CR3: 000000010bcea002 CR4: 00000000003706e0 > [39651.443319] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [39651.443438] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > [39651.443558] Call Trace: > [39651.443647] <IRQ> > [39651.443736] ? __warn+0x6c/0x130 > [39651.443829] ? report_bug+0x1e4/0x260 > [39651.443924] ? handle_bug+0x36/0x70 > [39651.444016] ? exc_invalid_op+0x17/0x1a0 > [39651.444109] ? asm_exc_invalid_op+0x16/0x20 > [39651.444202] ? rcuref_put_slowpath+0x5f/0x70 > [39651.444297] ? rcuref_put_slowpath+0x5f/0x70 > [39651.444391] dst_release+0x2c/0x60 > [39651.444487] __dev_queue_xmit+0x56c/0xbd0 > [39651.444582] ? 
nf_hook_slow+0x36/0xa0 > [39651.444675] ip_finish_output2+0x27b/0x520 > [39651.444770] process_backlog+0x10c/0x230 > [39651.444866] __napi_poll+0x20/0x180 > [39651.444961] net_rx_action+0x2a4/0x390 > [39651.445055] __do_softirq+0xd0/0x202 > [39651.445148] do_softirq+0x3a/0x50 > [39651.445241] </IRQ> > [39651.445329] <TASK> > [39651.445416] flush_smp_call_function_queue+0x3f/0x50 > [39651.445516] do_idle+0x14d/0x210 > [39651.445609] cpu_startup_entry+0x14/0x20 > [39651.445702] start_secondary+0xe1/0xf0 > [39651.445797] secondary_startup_64_no_verify+0x167/0x16b > [39651.445893] </TASK> > [39651.445982] ---[ end trace 0000000000000000 ]— > > > best regards, > Martin You keep sending traces without symbols, nobody here will even look at them. Again, your best route is a bisection. > > > On 17 Sep 2023, at 14:55, Martin Zaharinov <micron10@gmail.com> wrote: > > > > Hi Eric > > is it possible bug to come from this patch : https://patchwork.kernel.org/project/netdevbpf/cover/20230911170531.828100-1-edumazet@google.com/ > > > > > > m. > > > >> On 17 Sep 2023, at 14:40, Martin Zaharinov <micron10@gmail.com> wrote: > >> > >> One more in changelog for kernel 6.5 : https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.5 > >> > >> I see have many bug reports with : > >> > >> Sep 17 11:43:11 [127675.395289][ C2] ? process_backlog+0x10c/0x230 > >> Sep 17 11:43:11 [127675.395386][ C2] ? __napi_poll+0x20/0x180 > >> Sep 17 11:43:11 [127675.395478][ C2] ? net_rx_action+0x2a4/0x390 > >> > >> > >> In all server have simple nftables rulls , ethernet card is intel xl710 or 82599. its a very simple config. > >> > >> m. > >> > >> > >> > >> > >>> On 16 Sep 2023, at 12:04, Martin Zaharinov <micron10@gmail.com> wrote: > >>> > >>> Hi Paolo > >>> > >>> in first report machine dont have out of tree module > >>> > >>> this bug is come after move from kernel 6.2 to 6.3 > >>> > >>> m. 
> >>> > >>> On Sat, Sep 16, 2023, 11:27 Paolo Abeni <pabeni@redhat.com> wrote: > >>> On Sat, 2023-09-16 at 02:11 +0300, Martin Zaharinov wrote: > >>>> one more log: > >>>> > >>>> Sep 12 07:37:29 [151563.298466][ C5] ------------[ cut here ]------------ > >>>> Sep 12 07:37:29 [151563.298550][ C5] rcuref - imbalanced put() > >>>> Sep 12 07:37:29 [151563.298564][ C5] WARNING: CPU: 5 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) > >>>> Sep 12 07:37:29 [151563.298724][ C5] Modules linked in: nft_limit nf_conntrack_netlink vlan_mon(O) pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding i40e nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_xnatlog(O) ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [last unloaded: BNGBOOT(O)] > >>>> Sep 12 07:37:29 [151563.298894][ C5] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G O 6.5.2 #1 > >>> > >>> > >>> You have out-of-tree modules taint in all the report you shared. Please > >>> try to reproduce the issue with such taint, thanks! > >>> > >>> Paolo > >>> > >> > > > ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-20  3:59 ` Eric Dumazet
@ 2023-09-20  6:05 ` Martin Zaharinov
  2023-09-20  6:16 ` Bagas Sanjaya
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Zaharinov @ 2023-09-20  6:05 UTC
To: Eric Dumazet
Cc: Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

Hi Eric

> On 20 Sep 2023, at 6:59, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 19, 2023 at 10:09 PM Martin Zaharinov <micron10@gmail.com> wrote:
>>
>> Hi Eric
>>
>> Yes this patch is not come in 6.5 kernel and queue for 6.6 i test but not ok for now.
>
> "not ok for now" ? What does this mean?
> Pointing out patches that are not related to your issue is a waste of time.
> If this was to bring my attention, this is a bad strategy, because I
> will probably not read your future emails.
>

I'm sorry, I didn't express myself correctly. The patch is very good, but it is for kernel 6.6.
I enjoy your kernel improvements, and thanks for that!

>>
>> One more i find same error have in old kernel 6.4.8 , update to kernel 6.5.4 and same error is come .
>> >> Like this is hard to catch bug >> >> see logs : >> >> >> [1462610.861373] ------------[ cut here ]------------ >> [1462610.861480] rcuref - imbalanced put() >> [1462610.861491] WARNING: CPU: 22 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath+0x5f/0x70 >> [1462610.861718] Modules linked in: nft_limit nf_conntrack_netlink pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp bonding ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos >> [1462610.862004] CPU: 22 PID: 0 Comm: swapper/22 Tainted: G O 6.4.8 #1 >> [1462610.863244] Hardware name: Supermicro Super Server/X10SRW-F, BIOS 3.4 06/05/2021 >> [1462610.863368] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 >> [1462610.863469] Code: 31 c0 eb e2 80 3d 02 cd e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 7f 68 e5 a4 c6 05 e8 cc e6 00 01 e8 e1 ab c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 >> [1462610.863637] RSP: 0018:ffffaee60070cc38 EFLAGS: 00010292 >> [1462610.863736] RAX: 0000000000000019 RBX: ffffa1cdc35e5780 RCX: 00000000fff7ffff >> [1462610.863857] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea >> [1462610.864129] RBP: ffffa1cf6aeb8de8 R08: 0000000000000000 R09: 00000000fff7ffff >> [1462610.864250] R10: ffffa1d51b000000 R11: 0000000000000003 R12: ffffa1cdc35e5740 >> [1462610.864370] R13: ffffa1cdc35e57a8 R14: ffffa1d51fda9008 R15: 00000000ade2eb6e >> [1462610.864489] FS: 0000000000000000(0000) GS:ffffa1d51fd80000(0000) knlGS:0000000000000000 >> [1462610.864615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [1462610.864713] CR2: 00007f057b8ad000 CR3: 0000000141881003 CR4: 00000000001706e0 >> [1462610.864833] Call Trace: >> [1462610.864928] <IRQ> >> [1462610.865021] ? __warn+0x6c/0x130 >> [1462610.865124] ? report_bug+0x1e4/0x260 >> [1462610.865223] ? 
handle_bug+0x36/0x70 >> [1462610.865318] ? exc_invalid_op+0x17/0x1a0 >> [1462610.865414] ? asm_exc_invalid_op+0x16/0x20 >> [1462610.865517] ? rcuref_put_slowpath+0x5f/0x70 >> [1462610.865618] ? rcuref_put_slowpath+0x5f/0x70 >> [1462610.865719] dst_release+0x2c/0x60 >> [1462610.865817] rt_cache_route+0xbd/0xf0 >> [1462610.865913] rt_set_nexthop.isra.0+0x1b6/0x440 >> [1462610.866008] ip_route_input_slow+0x90e/0xc60 >> [1462610.866116] ? nf_conntrack_udp_packet+0x16c/0x230 [nf_conntrack] >> [1462610.866229] ip_route_input_noref+0xed/0x100 >> [1462610.866328] ip_rcv_finish_core.isra.0+0xb1/0x410 >> [1462610.866425] ip_rcv+0xed/0x130 >> [1462610.866522] ? ip_rcv_core.constprop.0+0x350/0x350 >> [1462610.866621] process_backlog+0x10c/0x230 >> [1462610.866719] __napi_poll+0x20/0x180 >> [1462610.866818] net_rx_action+0x2a4/0x390 >> [1462610.866921] __do_softirq+0xd0/0x202 >> [1462610.867020] do_softirq+0x58/0x80 >> [1462610.867116] </IRQ> >> [1462610.867206] <TASK> >> [1462610.867298] flush_smp_call_function_queue+0x3f/0x60 >> [1462610.867403] do_idle+0x14d/0x210 >> [1462610.867500] cpu_startup_entry+0x14/0x20 >> [1462610.867602] start_secondary+0xec/0xf0 >> [1462610.867701] secondary_startup_64_no_verify+0xf9/0xfb >> [1462610.867799] </TASK> >> [1462610.867891] ---[ end trace 0000000000000000 ]— >> >> >> And this si 6.5.4 : >> >> [39651.441371] ------------[ cut here ]------------ >> [39651.441455] rcuref - imbalanced put() >> [39651.441470] WARNING: CPU: 12 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath+0x5f/0x70 >> [39651.441633] Modules linked in: nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp igb i2c_algo_bit i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos >> [39651.441805] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G O 6.5.3 #1 >> 
[39651.441911] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./EP2C612D8, BIOS P2.30 04/30/2018 >> [39651.442035] RIP: 0010:rcuref_put_slowpath+0x5f/0x70 >> [39651.442131] Code: 31 c0 eb e2 80 3d 86 ae e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 68 f6 e2 9a c6 05 6c ae e6 00 01 e8 11 71 c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2 >> [39651.442294] RSP: 0018:ffffbb9a404b4de8 EFLAGS: 00010296 >> [39651.442390] RAX: 0000000000000019 RBX: ffffa13ac9a32640 RCX: 00000000fff7ffff >> [39651.442513] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea >> [39651.442630] RBP: ffffa13a44a04000 R08: 0000000000000000 R09: 00000000fff7ffff >> [39651.442748] R10: ffffa1419ae00000 R11: 0000000000000003 R12: ffffa13ab640bec0 >> [39651.442866] R13: 0000000000000000 R14: 0000000000000010 R15: ffffbb9a404b4f60 >> [39651.442985] FS: 0000000000000000(0000) GS:ffffa1419f900000(0000) knlGS:0000000000000000 >> [39651.443106] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [39651.443201] CR2: 0000564f9e23f6e0 CR3: 000000010bcea002 CR4: 00000000003706e0 >> [39651.443319] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >> [39651.443438] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >> [39651.443558] Call Trace: >> [39651.443647] <IRQ> >> [39651.443736] ? __warn+0x6c/0x130 >> [39651.443829] ? report_bug+0x1e4/0x260 >> [39651.443924] ? handle_bug+0x36/0x70 >> [39651.444016] ? exc_invalid_op+0x17/0x1a0 >> [39651.444109] ? asm_exc_invalid_op+0x16/0x20 >> [39651.444202] ? rcuref_put_slowpath+0x5f/0x70 >> [39651.444297] ? rcuref_put_slowpath+0x5f/0x70 >> [39651.444391] dst_release+0x2c/0x60 >> [39651.444487] __dev_queue_xmit+0x56c/0xbd0 >> [39651.444582] ? 
nf_hook_slow+0x36/0xa0 >> [39651.444675] ip_finish_output2+0x27b/0x520 >> [39651.444770] process_backlog+0x10c/0x230 >> [39651.444866] __napi_poll+0x20/0x180 >> [39651.444961] net_rx_action+0x2a4/0x390 >> [39651.445055] __do_softirq+0xd0/0x202 >> [39651.445148] do_softirq+0x3a/0x50 >> [39651.445241] </IRQ> >> [39651.445329] <TASK> >> [39651.445416] flush_smp_call_function_queue+0x3f/0x50 >> [39651.445516] do_idle+0x14d/0x210 >> [39651.445609] cpu_startup_entry+0x14/0x20 >> [39651.445702] start_secondary+0xe1/0xf0 >> [39651.445797] secondary_startup_64_no_verify+0x167/0x16b >> [39651.445893] </TASK> >> [39651.445982] ---[ end trace 0000000000000000 ]— >> >> >> best regards, >> Martin > > You keep sending traces without symbols, nobody here will even look at them. > Here is trace with symbols : [39651.441371] ------------[ cut here ]------------ [39651.441455] rcuref - imbalanced put() [39651.441470] WARNING: CPU: 12 PID: 0 at lib/rcuref.c:267 rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1)) [39651.441633] Modules linked in: nft_limit pppoe pppox ppp_generic slhc nft_ct nft_nat nft_chain_nat nf_tables netconsole coretemp igb i2c_algo_bit i40e ixgbe mdio nf_nat_sip nf_conntrack_sip nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipmi_si ipmi_devintf ipmi_msghandler rtc_cmos [39651.441805] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G O 6.5.3 #1 [39651.441911] Hardware name: To Be Filled By O.E.M. 
To Be Filled By O.E.M./EP2C612D8, BIOS P2.30 04/30/2018
[39651.442035] RIP: 0010:rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[39651.442131] Code: 31 c0 eb e2 80 3d 86 ae e6 00 00 74 0a c7 03 00 00 00 e0 31 c0 eb cf 48 c7 c7 68 f6 e2 9a c6 05 6c ae e6 00 01 e8 11 71 c7 ff <0f> 0b eb df cc cc cc cc cc cc cc cc cc cc cc cc cc 48 89 fa 83 e2
All code
========
   0:	31 c0                	xor    %eax,%eax
   2:	eb e2                	jmp    0xffffffffffffffe6
   4:	80 3d 86 ae e6 00 00 	cmpb   $0x0,0xe6ae86(%rip)        # 0xe6ae91
   b:	74 0a                	je     0x17
   d:	c7 03 00 00 00 e0    	movl   $0xe0000000,(%rbx)
  13:	31 c0                	xor    %eax,%eax
  15:	eb cf                	jmp    0xffffffffffffffe6
  17:	48 c7 c7 68 f6 e2 9a 	mov    $0xffffffff9ae2f668,%rdi
  1e:	c6 05 6c ae e6 00 01 	movb   $0x1,0xe6ae6c(%rip)        # 0xe6ae91
  25:	e8 11 71 c7 ff       	call   0xffffffffffc7713b
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	eb df                	jmp    0xd
  2e:	cc                   	int3
  2f:	cc                   	int3
  30:	cc                   	int3
  31:	cc                   	int3
  32:	cc                   	int3
  33:	cc                   	int3
  34:	cc                   	int3
  35:	cc                   	int3
  36:	cc                   	int3
  37:	cc                   	int3
  38:	cc                   	int3
  39:	cc                   	int3
  3a:	cc                   	int3
  3b:	48 89 fa             	mov    %rdi,%rdx
  3e:	83                   	.byte 0x83
  3f:	e2                   	.byte 0xe2

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	eb df                	jmp    0xffffffffffffffe3
   4:	cc                   	int3
   5:	cc                   	int3
   6:	cc                   	int3
   7:	cc                   	int3
   8:	cc                   	int3
   9:	cc                   	int3
   a:	cc                   	int3
   b:	cc                   	int3
   c:	cc                   	int3
   d:	cc                   	int3
   e:	cc                   	int3
   f:	cc                   	int3
  10:	cc                   	int3
  11:	48 89 fa             	mov    %rdi,%rdx
  14:	83                   	.byte 0x83
  15:	e2                   	.byte 0xe2
[39651.442294] RSP: 0018:ffffbb9a404b4de8 EFLAGS: 00010296
[39651.442390] RAX: 0000000000000019 RBX: ffffa13ac9a32640 RCX: 00000000fff7ffff
[39651.442513] RDX: 00000000fff7ffff RSI: 0000000000000001 RDI: 00000000ffffffea
[39651.442630] RBP: ffffa13a44a04000 R08: 0000000000000000 R09: 00000000fff7ffff
[39651.442748] R10: ffffa1419ae00000 R11: 0000000000000003 R12: ffffa13ab640bec0
[39651.442866] R13: 0000000000000000 R14: 0000000000000010 R15: ffffbb9a404b4f60
[39651.442985] FS: 0000000000000000(0000) GS:ffffa1419f900000(0000) knlGS:0000000000000000
[39651.443106] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[39651.443201] CR2: 0000564f9e23f6e0 CR3: 000000010bcea002 CR4: 00000000003706e0
[39651.443319] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[39651.443438] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[39651.443558] Call Trace:
[39651.443647] <IRQ>
[39651.443736] ? __warn (kernel/panic.c:235 kernel/panic.c:673)
[39651.443829] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[39651.443924] ? handle_bug (arch/x86/kernel/traps.c:324)
[39651.444016] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[39651.444109] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[39651.444202] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[39651.444297] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[39651.444391] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166)
[39651.444487] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4158)
[39651.444582] ? nf_hook_slow (./include/linux/netfilter.h:143 net/netfilter/core.c:626)
[39651.444675] ip_finish_output2 (./include/linux/netdevice.h:3088 ./include/net/neighbour.h:528 ./include/net/neighbour.h:542 net/ipv4/ip_output.c:230)
[39651.444770] process_backlog (./include/linux/rcupdate.h:781 net/core/dev.c:5896)
[39651.444866] __napi_poll (net/core/dev.c:6461)
[39651.444961] net_rx_action (net/core/dev.c:6530 net/core/dev.c:6661)
[39651.445055] __do_softirq (./arch/x86/include/asm/preempt.h:27 kernel/softirq.c:564)
[39651.445148] do_softirq (kernel/softirq.c:463 (discriminator 32) kernel/softirq.c:450 (discriminator 32))
[39651.445241] </IRQ>
[39651.445329] <TASK>
[39651.445416] flush_smp_call_function_queue (./arch/x86/include/asm/irqflags.h:134 (discriminator 1) kernel/smp.c:570 (discriminator 1))
[39651.445516] do_idle (kernel/sched/idle.c:314)
[39651.445609] cpu_startup_entry (kernel/sched/idle.c:378)
[39651.445702] start_secondary (arch/x86/kernel/smpboot.c:326)
[39651.445797] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:441)
[39651.445893] </TASK>
[39651.445982] ---[ end trace 0000000000000000 ]---

> Again, your best route is a bisection.

For now a bisection is not possible; it is hard to change the kernel on a running machine. Is there another way to find out where this warning comes from?

Best regards,
Martin
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-20  6:05 ` Martin Zaharinov
@ 2023-09-20  6:16 ` Bagas Sanjaya
  2023-09-20  7:03 ` Martin Zaharinov
  0 siblings, 1 reply; 35+ messages in thread
From: Bagas Sanjaya @ 2023-09-20  6:16 UTC
To: Martin Zaharinov, Eric Dumazet
Cc: Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

On Wed, Sep 20, 2023 at 09:05:10AM +0300, Martin Zaharinov wrote:
> > On 20 Sep 2023, at 6:59, Eric Dumazet <edumazet@google.com> wrote:
> > Again, your best route is a bisection.
>
> For now its not possible to make bisection , its hard to change kernel on running machine …
>

You have to do a bisection, unfortunately. There are many guides on the
Internet, or you can read Documentation/admin-guide/bug-bisect.rst.

Bye!

-- 
An old man doll... just what I always wanted! - Clara
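As a rough illustration of why a bisection is practical even over a long history, the following is a toy model of the binary search that `git bisect` performs. It is only a sketch: the commit names are placeholders, and `is_bad` is a hypothetical stand-in for "build that kernel, boot it, and wait to see whether the warning appears". It also assumes that once the bug is introduced it is present in every later commit.

```python
"""Toy model of the binary search behind a bisection (not git itself)."""
from typing import Callable, Sequence, Tuple


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Return (first bad commit, number of commits that had to be tested).

    Assumes commits[0] is known good and commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    tested = 0
    # Invariant: commits[good] is good, commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tested


if __name__ == "__main__":
    # Hypothetical history: 16001 placeholder commits, bug introduced at 9000.
    history = [f"commit-{i:05d}" for i in range(16001)]
    culprit, steps = first_bad_commit(
        history, lambda name: int(name.split("-")[1]) >= 9000
    )
    print(f"first bad: {culprit}, kernels tested: {steps}")
```

The number of kernels to test grows with the logarithm of the history length, so even tens of thousands of commits need only about fifteen test boots. With real git the same search is driven by `git bisect start`, `git bisect good <rev>` and `git bisect bad <rev>`.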
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-20  6:16 ` Bagas Sanjaya
@ 2023-09-20  7:03 ` Martin Zaharinov
  2023-09-20  7:25 ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Zaharinov @ 2023-09-20  7:03 UTC
To: Bagas Sanjaya
Cc: Eric Dumazet, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

Hi

OK, at first look everything seems to start after kernel 6.4 added: "atomics: Provide rcuref - scalable reference counting" ( https://www.spinics.net/lists/linux-tip-commits/msg62042.html )

I checked all running machines: kernel 6.4.2 is the oldest one in use, and it shows the same bug report.

I have a few machines with kernel 6.3.9 and do not see problems there.

The problem may be located in this part:

[39651.444202] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[39651.444297] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
[39651.444391] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166)
[39651.444487] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4158)
[39651.444582] ? nf_hook_slow (./include/linux/netfilter.h:143 net/netfilter/core.c:626)

Maybe the changes in dst.c cause the problem; I'm guessing at the moment.

But in practice, with kernel 6.3 everything is fine for now.

dst.c changes 6.3.9 > 6.5.4 :

--- linux-6.3.9/net/core/dst.c	2023-06-21 14:02:19.000000000 +0000
+++ linux-6.5.4/net/core/dst.c	2023-09-19 10:30:30.000000000 +0000
@@ -66,7 +66,8 @@ void dst_init(struct dst_entry *dst, str
 	dst->tclassid = 0;
 #endif
 	dst->lwtstate = NULL;
-	atomic_set(&dst->__refcnt, initial_ref);
+	rcuref_init(&dst->__rcuref, initial_ref);
+	INIT_LIST_HEAD(&dst->rt_uncached);
 	dst->__use = 0;
 	dst->lastuse = jiffies;
 	dst->flags = flags;
@@ -162,31 +163,15 @@ EXPORT_SYMBOL(dst_dev_put);
 
 void dst_release(struct dst_entry *dst)
 {
-	if (dst) {
-		int newrefcnt;
-
-		newrefcnt = atomic_dec_return(&dst->__refcnt);
-		if (WARN_ONCE(newrefcnt < 0, "dst_release underflow"))
-			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
-					     __func__, dst, newrefcnt);
-		if (!newrefcnt)
-			call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
-	}
+	if (dst && rcuref_put(&dst->__rcuref))
+		call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
 }
 EXPORT_SYMBOL(dst_release);
 
 void dst_release_immediate(struct dst_entry *dst)
 {
-	if (dst) {
-		int newrefcnt;
-
-		newrefcnt = atomic_dec_return(&dst->__refcnt);
-		if (WARN_ONCE(newrefcnt < 0, "dst_release_immediate underflow"))
-			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
-					     __func__, dst, newrefcnt);
-		if (!newrefcnt)
-			dst_destroy(dst);
-	}
+	if (dst && rcuref_put(&dst->__rcuref))
+		dst_destroy(dst);
 }
 EXPORT_SYMBOL(dst_release_immediate);
 

> On 20 Sep 2023, at 9:16, Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On Wed, Sep 20, 2023 at 09:05:10AM +0300, Martin Zaharinov wrote:
>>> On 20 Sep 2023, at 6:59, Eric Dumazet <edumazet@google.com> wrote:
>>> Again, your best route is a bisection.
>>
>> For now its not possible to make bisection , its hard to change kernel on running machine …
>>
>
> You have to do bisection, unfortunately. There is many guides there on
> Internet. Or you can read Documentation/admin-guide/bug-bisect.rst.
>
> Bye!
>
> --
> An old man doll... just what I always wanted! - Clara
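To make the removed lines of the dst.c diff in the message above easier to follow, here is a small userspace model of the old `dst_release()` logic: decrement the counter, warn if it goes negative, destroy the object when it reaches zero. This is a sketch under stated assumptions, not kernel code: the `Dst` class and its fields are hypothetical stand-ins, RCU deferral is reduced to a flag, and the rcuref implementation that replaced this logic is not modelled. The "rcuref - imbalanced put()" warning in the traces appears to report the same class of error, a release without a matching hold, but that reading is an assumption of this sketch.

```python
"""Userspace model of the pre-6.4 dst_release() refcount check (illustration only)."""
import warnings


class Dst:
    """Hypothetical stand-in for struct dst_entry; only the refcount is modelled."""

    def __init__(self, initial_ref: int) -> None:
        self.refcnt = initial_ref
        self.destroyed = False

    def hold(self) -> None:
        """Take one reference."""
        self.refcnt += 1

    def release(self) -> None:
        """Drop one reference, following the removed lines of the diff."""
        # Models: newrefcnt = atomic_dec_return(&dst->__refcnt);
        self.refcnt -= 1
        if self.refcnt < 0:
            # Models: WARN_ONCE(newrefcnt < 0, "dst_release underflow")
            warnings.warn("dst_release underflow", RuntimeWarning)
        if self.refcnt == 0:
            # Models: call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
            self.destroyed = True


if __name__ == "__main__":
    dst = Dst(initial_ref=1)
    dst.hold()      # a second user takes a reference
    dst.release()   # balanced: one reference remains
    dst.release()   # balanced: last reference dropped, object destroyed
    dst.release()   # imbalanced: one release too many, triggers the warning
    print(dst.refcnt, dst.destroyed)
```

The point of the model is that the warning fires at the extra release, which can be far from the code path that actually dropped a reference it did not own; that is one reason a stack trace alone may not identify the culprit.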
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-20  7:03 ` Martin Zaharinov
@ 2023-09-20  7:25 ` Eric Dumazet
  2023-09-20  7:29 ` Eric Dumazet
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2023-09-20  7:25 UTC
To: Martin Zaharinov
Cc: Bagas Sanjaya, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

On Wed, Sep 20, 2023 at 9:04 AM Martin Zaharinov <micron10@gmail.com> wrote:
>
> Hi
>
> Ok on first see all is look come after in kernel 6.4 add : atomics: Provide rcuref - scalable reference counting ( https://www.spinics.net/lists/linux-tip-commits/msg62042.html )
>
> I check all running machine with kernel 6.4.2 is minimal and have same bug report.
>
> i have fell machine with kernel 6.3.9 and not see problems there .
>
> and the problem may be is allocate in this part :
>
> [39651.444202] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> [39651.444297] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> [39651.444391] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166)
> [39651.444487] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4158)
> [39651.444582] ? nf_hook_slow (./include/linux/netfilter.h:143 net/netfilter/core.c:626)
>
> may be changes in dst.c make problem , I'm guessing at the moment.
>
> but in real with kernel 6.3 all is fine for now.
>
> dst.c changes 6.3.9 > 6.5.4 :

Then start a real bisection. This is going to be the last time I say it.
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-20  7:25 ` Eric Dumazet
@ 2023-09-20  7:29 ` Eric Dumazet
  2023-09-20  7:32 ` Martin Zaharinov
  0 siblings, 1 reply; 35+ messages in thread
From: Eric Dumazet @ 2023-09-20  7:29 UTC
To: Martin Zaharinov
Cc: Bagas Sanjaya, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

On Wed, Sep 20, 2023 at 9:25 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Sep 20, 2023 at 9:04 AM Martin Zaharinov <micron10@gmail.com> wrote:
> >
> > Hi
> >
> > Ok on first see all is look come after in kernel 6.4 add : atomics: Provide rcuref - scalable reference counting ( https://www.spinics.net/lists/linux-tip-commits/msg62042.html )
> >
> > I check all running machine with kernel 6.4.2 is minimal and have same bug report.
> >
> > i have fell machine with kernel 6.3.9 and not see problems there .
> >
> > and the problem may be is allocate in this part :
> >
> > [39651.444202] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> > [39651.444297] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
> > [39651.444391] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166)
> > [39651.444487] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4158)
> > [39651.444582] ? nf_hook_slow (./include/linux/netfilter.h:143 net/netfilter/core.c:626)
> >
> > may be changes in dst.c make problem , I'm guessing at the moment.
> >
> > but in real with kernel 6.3 all is fine for now.
> >
> > dst.c changes 6.3.9 > 6.5.4 :
>
> Then start a real bisection. This is going to be the last time I say it.

Or stick to an older kernel for your production, and wait for others
to find the issue.
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-20 7:29 ` Eric Dumazet
@ 2023-09-20 7:32 ` Martin Zaharinov
  2023-09-21 7:50 ` Bagas Sanjaya
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Zaharinov @ 2023-09-20 7:32 UTC
To: Eric Dumazet
Cc: Bagas Sanjaya, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

I will make this yes .

And will wait if any find fix in future release.

Thanks for your time Eric

m.

> On 20 Sep 2023, at 10:29, Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Sep 20, 2023 at 9:25 AM Eric Dumazet <edumazet@google.com> wrote:
>>
>> On Wed, Sep 20, 2023 at 9:04 AM Martin Zaharinov <micron10@gmail.com> wrote:
>>>
>>> Hi
>>>
>>> Ok on first see all is look come after in kernel 6.4 add : atomics: Provide rcuref - scalable reference counting ( https://www.spinics.net/lists/linux-tip-commits/msg62042.html )
>>>
>>> I check all running machine with kernel 6.4.2 is minimal and have same bug report.
>>>
>>> i have fell machine with kernel 6.3.9 and not see problems there .
>>>
>>> and the problem may be is allocate in this part :
>>>
>>> [39651.444202] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
>>> [39651.444297] ? rcuref_put_slowpath (lib/rcuref.c:267 (discriminator 1))
>>> [39651.444391] dst_release (./arch/x86/include/asm/preempt.h:95 ./include/linux/rcuref.h:151 net/core/dst.c:166)
>>> [39651.444487] __dev_queue_xmit (./include/net/dst.h:283 net/core/dev.c:4158)
>>> [39651.444582] ? nf_hook_slow (./include/linux/netfilter.h:143 net/netfilter/core.c:626)
>>>
>>> may be changes in dst.c make problem , I'm guessing at the moment.
>>>
>>> but in real with kernel 6.3 all is fine for now.
>>>
>>> dst.c changes 6.3.9 > 6.5.4 :
>>
>> Then start a real bisection. This is going to be the last time I say it.
>
> Or stick to an older kernel for your production, and wait for others
> to find the issue.
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-20 7:32 ` Martin Zaharinov
@ 2023-09-21 7:50 ` Bagas Sanjaya
  2023-09-21 8:13 ` Martin Zaharinov
  0 siblings, 1 reply; 35+ messages in thread
From: Bagas Sanjaya @ 2023-09-21 7:50 UTC
To: Martin Zaharinov, Eric Dumazet
Cc: Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

On 20/09/2023 14:32, Martin Zaharinov wrote:
> I will make this yes .
>
> And will wait if any find fix in future release.
>

Please don't top-post; reply inline with appropriate context instead.

Martin, what prevents you from doing bisection as Eric requested again?
If you only have production systems, why can't you afford to have
testing ones? Why not turning one of your prod machines to be testing
and bisect from there?

Sorry for inconvenience.

-- 
An old man doll... just what I always wanted! - Clara
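The bisection requested above is, at its core, a binary search between a known-good and a known-bad revision. A minimal userspace sketch in C follows, assuming the bug appears at one revision and persists in every later one. The names demo_bisect, demo_is_bad_fn and demo_bad_from are hypothetical and exist only for illustration; they are not part of git or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical test callback: returns true if revision `idx` shows the bug.
 * In practice this step is "build, boot, run the workload, watch for the oops". */
typedef bool (*demo_is_bad_fn)(size_t idx, void *ctx);

/*
 * Binary search for the first bad revision.
 * Preconditions (assumptions of this sketch):
 *   - revision `good` does not show the bug, revision `bad` does
 *   - good < bad
 *   - once a revision is bad, all later revisions are bad too
 * `steps`, if non-NULL, is incremented once per tested revision.
 */
static size_t demo_bisect(size_t good, size_t bad,
                          demo_is_bad_fn is_bad, void *ctx, size_t *steps)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;   /* first bad revision is at or before mid */
        else
            good = mid;  /* first bad revision is after mid */
        if (steps)
            (*steps)++;
    }
    return bad;
}

/* Example callback: every revision at or after *ctx is treated as bad. */
static bool demo_bad_from(size_t idx, void *ctx)
{
    return idx >= *(const size_t *)ctx;
}
```

Each loop iteration halves the remaining range, so the number of kernels that have to be built and tested grows with the logarithm of the number of candidate revisions, not with the number itself.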
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-21 7:50 ` Bagas Sanjaya
@ 2023-09-21 8:13 ` Martin Zaharinov
  2023-09-22 3:06 ` Bagas Sanjaya
  0 siblings, 1 reply; 35+ messages in thread
From: Martin Zaharinov @ 2023-09-21 8:13 UTC
To: Bagas Sanjaya
Cc: Eric Dumazet, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso

Hi Bagas,


Its not easy to make this on production, have too many users on it.

i make checks and find with kernel 6.3.12-6.5.13 all is fine.
on first machine that i have with kernel 6.4 and still work run kernel 6.4.2 and have problem.

in my investigation problem is start after migration to kernel 6.4.x

in 6.4 kernel is add rcuref :

https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.4

commit bc9d3a9f2afca189a6ae40225b6985e3c775375e
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Thu Mar 23 21:55:32 2023 +0100

    net: dst: Switch to rcuref_t reference counting

    Under high contention dst_entry::__refcnt becomes a significant bottleneck.

    atomic_inc_not_zero() is implemented with a cmpxchg() loop, which goes into
    high retry rates on contention.

    Switch the reference count to rcuref_t which results in a significant
    performance gain. Rename the reference count member to __rcuref to reflect
    the change.

    The gain depends on the micro-architecture and the number of concurrent
    operations and has been measured in the range of +25% to +130% with a
    localhost memtier/memcached benchmark which amplifies the problem
    massively.

    Running the memtier/memcached benchmark over a real (1Gb) network
    connection the conversion on top of the false sharing fix for struct
    dst_entry::__refcnt results in a total gain in the 2%-5% range over the
    upstream baseline.

    Reported-by: Wangyang Guo <wangyang.guo@intel.com>
    Reported-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/r/20230307125538.989175656@linutronix.de
    Link: https://lore.kernel.org/r/20230323102800.215027837@linutronix.de
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>


and i think problem is here :

--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -66,7 +66,7 @@ void dst_init(struct dst_entry *dst, str
 	dst->tclassid = 0;
 #endif
 	dst->lwtstate = NULL;
-	atomic_set(&dst->__refcnt, initial_ref);
+	rcuref_init(&dst->__refcnt, initial_ref);
 	dst->__use = 0;
 	dst->lastuse = jiffies;
 	dst->flags = flags;
@@ -162,31 +162,15 @@ EXPORT_SYMBOL(dst_dev_put);

 void dst_release(struct dst_entry *dst)
 {
-	if (dst) {
-		int newrefcnt;
-
-		newrefcnt = atomic_dec_return(&dst->__refcnt);
-		if (WARN_ONCE(newrefcnt < 0, "dst_release underflow"))
-			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
-					     __func__, dst, newrefcnt);
-		if (!newrefcnt)
-			call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
-	}
+	if (dst && rcuref_put(&dst->__refcnt))
+		call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
 }
 EXPORT_SYMBOL(dst_release);

 void dst_release_immediate(struct dst_entry *dst)
 {
-	if (dst) {
-		int newrefcnt;
-
-		newrefcnt = atomic_dec_return(&dst->__refcnt);
-		if (WARN_ONCE(newrefcnt < 0, "dst_release_immediate underflow"))
-			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
-					     __func__, dst, newrefcnt);
-		if (!newrefcnt)
-			dst_destroy(dst);
-	}
+	if (dst && rcuref_put(&dst->__refcnt))
+		dst_destroy(dst);
 }
 EXPORT_SYMBOL(dst_release_immediate);


but this is my thinking

Martin

> On 21 Sep 2023, at 10:50, Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On 20/09/2023 14:32, Martin Zaharinov wrote:
>> I will make this yes .
>>
>> And will wait if any find fix in future release.
>>
>
> Please don't top-post; reply inline with appropriate context instead.
>
> Martin, what prevents you from doing bisection as Eric requested again?
> If you only have production systems, why can't you afford to have
> testing ones? Why not turning one of your prod machines to be testing
> and bisect from there?
>
> Sorry for inconvenience.
>
> --
> An old man doll... just what I always wanted! - Clara
>
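The dst_release() hunk quoted in the message above replaces an explicit decrement-and-check with a put helper that reports whether the last reference was dropped. The following userspace sketch models only that control flow with C11 atomics. It is not the kernel's rcuref implementation, it says nothing about whether the commit is at fault, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct dst_entry; illustration only. */
struct demo_dst {
    atomic_int refcnt;  /* stands in for the dst reference count field */
    bool destroyed;     /* set where the kernel code would schedule destruction */
};

static void demo_dst_init(struct demo_dst *dst, int initial_ref)
{
    assert(initial_ref > 0);
    atomic_init(&dst->refcnt, initial_ref);
    dst->destroyed = false;
}

/* Old shape: decrement, warn on underflow, destroy when the count reaches zero. */
static void demo_release_old(struct demo_dst *dst)
{
    if (dst) {
        /* atomic_fetch_sub returns the previous value, so subtract 1 again */
        int newrefcnt = atomic_fetch_sub(&dst->refcnt, 1) - 1;

        if (newrefcnt < 0)
            fprintf(stderr, "%s: dst:%p refcnt:%d\n",
                    __func__, (void *)dst, newrefcnt);
        if (!newrefcnt)
            dst->destroyed = true;
    }
}

/* Simplified put: true only for the caller that drops the last reference.
 * The real rcuref_put() is more involved; this models the return value only. */
static bool demo_put(struct demo_dst *dst)
{
    return atomic_fetch_sub(&dst->refcnt, 1) == 1;
}

/* New shape: the caller destroys only when the put reports the last reference. */
static void demo_release_new(struct demo_dst *dst)
{
    if (dst && demo_put(dst))
        dst->destroyed = true;
}
```

In both shapes an object initialised with two references is destroyed on the second release and not on the first, so a balanced get/put sequence behaves the same way. The visible difference in the caller is that the underflow warning no longer lives in dst_release() itself.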
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-21 8:13 ` Martin Zaharinov
@ 2023-09-22 3:06 ` Bagas Sanjaya
  2023-09-22 9:50 ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 35+ messages in thread
From: Bagas Sanjaya @ 2023-09-22 3:06 UTC
To: Martin Zaharinov
Cc: Eric Dumazet, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso, Thorsten Leemhuis, Wangyang Guo, Arjan Van De Ven, Thomas Gleixner, Linux Regressions

On Thu, Sep 21, 2023 at 11:13:55AM +0300, Martin Zaharinov wrote:
> Hi Bagas,
>
>
> Its not easy to make this on production, have too many users on it.
>
> i make checks and find with kernel 6.3.12-6.5.13 all is fine.
> on first machine that i have with kernel 6.4 and still work run kernel 6.4.2 and have problem.
>
> in my investigation problem is start after migration to kernel 6.4.x
>
> in 6.4 kernel is add rcuref :
>
> https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.4
>
> commit bc9d3a9f2afca189a6ae40225b6985e3c775375e
> Author: Thomas Gleixner <tglx@linutronix.de>
> Date: Thu Mar 23 21:55:32 2023 +0100
>
>     net: dst: Switch to rcuref_t reference counting

Is it the culprit you look for? Had you done the bisection and it points
the culprit to that commit

>
>     Under high contention dst_entry::__refcnt becomes a significant bottleneck.
>
>     atomic_inc_not_zero() is implemented with a cmpxchg() loop, which goes into
>     high retry rates on contention.
>
>     Switch the reference count to rcuref_t which results in a significant
>     performance gain. Rename the reference count member to __rcuref to reflect
>     the change.
>
>     The gain depends on the micro-architecture and the number of concurrent
>     operations and has been measured in the range of +25% to +130% with a
>     localhost memtier/memcached benchmark which amplifies the problem
>     massively.
>
>     Running the memtier/memcached benchmark over a real (1Gb) network
>     connection the conversion on top of the false sharing fix for struct
>     dst_entry::__refcnt results in a total gain in the 2%-5% range over the
>     upstream baseline.
>
>     Reported-by: Wangyang Guo <wangyang.guo@intel.com>
>     Reported-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
>     Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>     Link: https://lore.kernel.org/r/20230307125538.989175656@linutronix.de
>     Link: https://lore.kernel.org/r/20230323102800.215027837@linutronix.de
>     Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>
>
> and i think problem is here :
>
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -66,7 +66,7 @@ void dst_init(struct dst_entry *dst, str
>  	dst->tclassid = 0;
>  #endif
>  	dst->lwtstate = NULL;
> -	atomic_set(&dst->__refcnt, initial_ref);
> +	rcuref_init(&dst->__refcnt, initial_ref);
>  	dst->__use = 0;
>  	dst->lastuse = jiffies;
>  	dst->flags = flags;
> @@ -162,31 +162,15 @@ EXPORT_SYMBOL(dst_dev_put);
>
>  void dst_release(struct dst_entry *dst)
>  {
> -	if (dst) {
> -		int newrefcnt;
> -
> -		newrefcnt = atomic_dec_return(&dst->__refcnt);
> -		if (WARN_ONCE(newrefcnt < 0, "dst_release underflow"))
> -			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
> -					     __func__, dst, newrefcnt);
> -		if (!newrefcnt)
> -			call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
> -	}
> +	if (dst && rcuref_put(&dst->__refcnt))
> +		call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
>  }
>  EXPORT_SYMBOL(dst_release);
>
>  void dst_release_immediate(struct dst_entry *dst)
>  {
> -	if (dst) {
> -		int newrefcnt;
> -
> -		newrefcnt = atomic_dec_return(&dst->__refcnt);
> -		if (WARN_ONCE(newrefcnt < 0, "dst_release_immediate underflow"))
> -			net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
> -					     __func__, dst, newrefcnt);
> -		if (!newrefcnt)
> -			dst_destroy(dst);
> -	}
> +	if (dst && rcuref_put(&dst->__refcnt))
> +		dst_destroy(dst);
>  }
>  EXPORT_SYMBOL(dst_release_immediate);
>
>
> but this is my thinking
>

What do you think that above causes your regression?

Confused...

[To Thorsten: I'm unsure if the reporter do the bisection and suddenly he found
the culprit commit. Should I add it to regzbot? I had dealt with this reporter
before when he reported nginx regression and he didn't respond with bisection
to the point that I had to mark it as inconclusive (see regzbot dashboard).
What advice can you provide to him?]

-- 
An old man doll... just what I always wanted! - Clara
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-22 3:06 ` Bagas Sanjaya
@ 2023-09-22 9:50 ` Linux regression tracking (Thorsten Leemhuis)
  2023-09-22 11:09 ` Bagas Sanjaya
  0 siblings, 1 reply; 35+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-09-22 9:50 UTC
To: Bagas Sanjaya, Martin Zaharinov
Cc: Eric Dumazet, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso, Wangyang Guo, Arjan Van De Ven, Thomas Gleixner, Linux Regressions

On 22.09.23 05:06, Bagas Sanjaya wrote:
> On Thu, Sep 21, 2023 at 11:13:55AM +0300, Martin Zaharinov wrote:
>>
>> Its not easy to make this on production, have too many users on it.
>>
>> i make checks and find with kernel 6.3.12-6.5.13 all is fine.
>> on first machine that i have with kernel 6.4 and still work run kernel 6.4.2 and have problem.

This is confusing and hard to follow. You want to describe more
carefully which kernels worked (avoid ranges, as I doubt you have tested
everything between 6.3.12-6.5.13) and try to avoid complexity (you seem
to have two machines? if everything works on one, don't even bring it up
except maybe as a side note)

>> in my investigation problem is start after migration to kernel 6.4.x
>>
>> in 6.4 kernel is add rcuref :
>>
>> https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.4
>>
>> commit bc9d3a9f2afca189a6ae40225b6985e3c775375e
>> Author: Thomas Gleixner <tglx@linutronix.de>
>> Date: Thu Mar 23 21:55:32 2023 +0100
>>
>> net: dst: Switch to rcuref_t reference counting
>
> Is it the culprit you look for? Had you done the bisection and it points
> the culprit to that commit

Martin, if you suspect this to be the culprit try to revert it on top of
the latest kernel; if the problem then goes away it likely is the cause.

> [...]
>> but this is my thinking
>
> What do you think that above causes your regression?
>
> Confused...
>
> [To Thorsten: I'm unsure if the reporter do the bisection and suddenly he found
> the culprit commit. Should I add it to regzbot?

For now: no, things are too confusing and without knowing the culprit I
guess nobody will look into this unless we are extremely lucky.

Ciao, Thorsten
* Re: Urgent Bug Report Kernel crash 6.5.2
  2023-09-22 9:50 ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-09-22 11:09 ` Bagas Sanjaya
  0 siblings, 0 replies; 35+ messages in thread
From: Bagas Sanjaya @ 2023-09-22 11:09 UTC
To: Linux regressions mailing list, Martin Zaharinov, Linux Kernel Mailing List
Cc: Eric Dumazet, Paolo Abeni, netdev, patchwork-bot+netdevbpf, Jakub Kicinski, Stephen Hemminger, kuba+netdrv, dsahern, Florian Westphal, Pablo Neira Ayuso, Wangyang Guo, Arjan Van De Ven, Thomas Gleixner

On 22/09/2023 16:50, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 22.09.23 05:06, Bagas Sanjaya wrote:
>> [To Thorsten: I'm unsure if the reporter do the bisection and suddenly he found
>> the culprit commit. Should I add it to regzbot?
>
> For now: no, things are too confusing and without knowing the culprit I
> guess nobody will look into this unless we are extremely lucky.
>

OK, thanks!

-- 
An old man doll... just what I always wanted! - Clara
Thread overview: 35+ messages
2023-09-15 4:05 Urgent Bug Report Kernel crash 6.5.2 Martin Zaharinov
2023-09-15 6:45 ` Eric Dumazet
2023-09-15 22:23 ` Martin Zaharinov
2023-11-16 14:17 ` Martin Zaharinov
2023-12-06 22:26 ` Martin Zaharinov
[not found] ` <5E63894D-913B-416C-B901-F628BB6C00E0@gmail.com>
2023-12-08 22:20 ` Thomas Gleixner
2023-12-08 23:01 ` Martin Zaharinov
2023-12-12 18:16 ` Thomas Gleixner
2023-12-19 9:25 ` Martin Zaharinov
2023-12-19 14:26 ` Thomas Gleixner
2023-12-22 17:26 ` Martin Zaharinov
2023-12-29 12:00 ` Martin Zaharinov
2024-01-04 20:51 ` Martin Zaharinov
2024-01-07 11:03 ` Martin Zaharinov
2023-09-15 23:00 ` Martin Zaharinov
2023-09-15 23:11 ` Martin Zaharinov
2023-09-16 8:27 ` Paolo Abeni
[not found] ` <CALidq=UR=3rOHZczCnb1bEhbt9So60UZ5y60Cdh4aP41FkB5Tw@mail.gmail.com>
2023-09-17 11:35 ` Martin Zaharinov
2023-09-17 11:40 ` Martin Zaharinov
2023-09-17 11:55 ` Martin Zaharinov
2023-09-17 12:04 ` Holger Hoffstätte
2023-09-18 8:09 ` Eric Dumazet
2023-09-19 20:09 ` Martin Zaharinov
2023-09-20 3:59 ` Eric Dumazet
2023-09-20 6:05 ` Martin Zaharinov
2023-09-20 6:16 ` Bagas Sanjaya
2023-09-20 7:03 ` Martin Zaharinov
2023-09-20 7:25 ` Eric Dumazet
2023-09-20 7:29 ` Eric Dumazet
2023-09-20 7:32 ` Martin Zaharinov
2023-09-21 7:50 ` Bagas Sanjaya
2023-09-21 8:13 ` Martin Zaharinov
2023-09-22 3:06 ` Bagas Sanjaya
2023-09-22 9:50 ` Linux regression tracking (Thorsten Leemhuis)
2023-09-22 11:09 ` Bagas Sanjaya